
On September 17, 2023, the launch event for the "Chinese Association for Artificial Intelligence (CAAI) Series White Papers" was held in Nanchang, Jiangxi. Nine white papers were unveiled together at the event: "Large Model Technology," "AI+Art," "Principles of Artificial Intelligence," "Risks, Challenges, and Governance of Digital Society," "Deep Learning," "Cognitive Computing," "Smart Grid," "Educational Applications of Large Language Models," and "Intelligent Collaborative Control & Artificial Intelligence." Qiu Zhijie, who led the compilation of "AI+Art" and is vice dean of the Central Academy of Fine Arts, shared his insights on site. Professor Wang Guoyin, Vice Chairman of CAAI, Vice President of Chongqing University of Posts and Telecommunications, and an IRSS/CAAI/CCF Fellow, chaired the press conference.
As one of the concurrent activities of the 12th China Intelligent Industry Summit in 2023, the release of these nine white papers brought together the achievements and practical experience of industry-academia-research collaboration. The white papers supplement and extend the summit's content and include a number of original contributions. They are intended to support policy formulation, theoretical research, discipline construction, technological innovation, and the application of artificial intelligence, and to provide a professional reference for decision-makers, practitioners, educators, researchers, and investors. Among them, "AI+Art," presented as the world's first white paper of its kind in an AI series, was led by the Central Academy of Fine Arts in collaboration with renowned international experts in the AI field. It offers insights into the intersection of AI and art and points the way for the integrated development of artificial intelligence and culture and the arts.

During his address, Qiu Zhijie, who led the compilation of the "AI+Art" white paper and is vice dean of the Central Academy of Fine Arts, elaborated on the background of AI in art, its significance in art history, and the related algorithmic and engineering challenges. He also shared creative examples from "AI+Art" and discussed the impact of AI on the artistic ecosystem.
Since the concept of artificial intelligence (AI) was introduced in 1956, AI technology and the AI industry have evolved continually. Large AI models have become foundational infrastructure for general-purpose AI algorithms. Since 2019, the general problem-solving capabilities of large models have improved significantly, and they have become the industry's mainstream technical approach. The term "large AI model" is short for "artificial intelligence pre-trained large model." It combines the two concepts of "pre-training" and "large model": such a model learns from one task and applies that knowledge to many others. The "large model + small model" combination is progressively becoming the industry's mainstream technical direction and is accelerating the global AI industry.
Digital art, recognized worldwide for its independent aesthetic value, has developed rapidly in recent years, and the maturation of large AI model technology gives it considerably more room to grow. This development fits within the strategic vision of national cultural digitization, in particular the goal of industrializing digital art. The report to the 20th National Congress of the Communist Party of China explicitly plans for building China into a cyber power and a digital China and for implementing a national cultural digitization strategy. The General Office of the Communist Party of China Central Committee and the State Council issued the opinions on "Promoting the Implementation of the National Cultural Digitization Strategy" and the "Overall Planning for Digital China Construction." These documents underscore that cultural digitization has become a strategic choice for building a socialist cultural powerhouse and achieving high-quality cultural development. It plays an irreplaceable role in China's cultural development, the international competitiveness of its cultural industries, and cultural security.
Key points of the white paper:
I. An Overview of AI Art Development
Since the AI concept was introduced in 1956, the field has gone through several peaks and troughs. After the early rule engines and knowledge-based systems, the 1980s and 1990s saw the rise of neural networks and machine learning, which showed AI's constant evolution and challenged its earlier definitions. Progress in the past five years has been swift. Massive growth in computational power and data volume has allowed deep learning techniques, especially Generative Adversarial Networks (GANs), to shine. A GAN trains a pair of neural networks against each other and can generate data that is almost indistinguishable from real data. The impact of this technology is evident in the portrait "Edmond de Belamy," which was generated with a GAN and stunned the art and tech communities with its high auction price. More recently, as the industry has turned its attention to the copyright of AI training datasets, such works have returned to the spotlight.
However, technology always comes with challenges. GANs have an element of unpredictability in art generation, and their output is often hard to align perfectly with the artist's initial directive. Also, because the training process is intricate and resource-intensive, any change in the creative direction necessitates retraining, reducing efficiency. By the second half of 2021, a series of advanced models, such as diffusion models, CLIP, and other pre-trained large models, emerged, significantly advancing AI's application in the art domain. The remarkable CLIP, which tightly links text descriptions to images, offers artists more precise generative capabilities. This advancement is not just technical but also provides artists with an upgrade in their creative tools.
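The way a CLIP-style model links text to images can be sketched as a similarity ranking between embedding vectors. The example below is only an illustration under stated assumptions: the three-dimensional vectors are made up by hand rather than produced by any real model, `match_probabilities` is a hypothetical helper name, and the temperature of 0.07 is an illustrative choice.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_probabilities(text_emb, image_embs, temperature=0.07):
    """Softmax over the cosine similarities of one text against several images."""
    logits = [cosine(text_emb, img) / temperature for img in image_embs]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hand-made toy embeddings (assumption): real models use hundreds of dimensions
text_emb = [1.0, 0.0, 0.0]
image_embs = [
    [0.9, 0.1, 0.0],   # points in nearly the same direction as the text
    [0.0, 1.0, 0.0],   # orthogonal to the text
    [-1.0, 0.0, 0.0],  # opposite to the text
]
probs = match_probabilities(text_emb, image_embs)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, [round(p, 4) for p in probs])
```

In a text-guided generation workflow, a score of this kind indicates how well a candidate image matches the artist's description, which is what makes more precise steering possible.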
Today's AI tools have begun to play a supporting role for artists, for example by using GANs to generate sketches quickly and then selecting the most creative ones for further refinement. Given this trend, AI is no longer just an experimental tool for artistic creation; it is becoming systematically and deeply integrated into every stage of the creative process.
II. The Historical Significance of AI Art
Within the artistic tapestry of the 20th century, artists were deeply inspired by rules and structures, using predetermined systems to guide, restrict, or amplify creativity. Renowned artists like Bridget Riley and Victor Vasarely employed geometric patterns to present visual illusions to the audience. In contrast, artists like Richard Long and Michael Barnsley drew inspiration from fractal theory, leveraging mathematical logic to showcase infinite details in their art pieces.
This exploration of rules is evident not only in traditional art forms but has also deepened where technology and art meet. Process Art focuses on the act of creation itself, displaying an intriguing blend of systematic structure and randomness. Artists like Harold Cohen went further and used programming languages, with computers as aids, to investigate the generative rules of art. Manfred Mohr, in his "P-511/D" series, explored the possible facets of the cube through predefined algorithms.
Conceptual art took the idea of rules and instructions to a new level. Sol LeWitt is one example: his wall drawings were executed according to specific written instructions. These instructions are open-ended, allowing different executors to bring varied interpretations and experiences. The interplay of rules with freedom, and of predictability with uncertainty, forms a rich and intricate web that laid a profound foundation for the subsequent development of AI+Art.
III. Algorithm and Engineering Issues of AI Art
1. Overview of AIGC Algorithm
Generative technologies play a pivotal role in the AIGC domain. They contribute to data generation and augmentation, unsupervised learning, visual and linguistic generation, reinforcement learning and policy generation, and creative and artistic generation, advancing the creativity, understanding, and interaction of intelligent systems. Generative models produce new data samples by learning the distribution of the training data. Typical examples include Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). GANs in particular achieved significant breakthroughs in image generation, producing lifelike image samples. Later, Tero Karras and his team proposed StyleGAN, designed for generating realistic facial images. By introducing a style vector into the generator network and using a progressive growing training method, it produces high-resolution, diverse, and stylistically controllable facial images. Here, "style" typically refers to attributes such as head pose, facial expression, and hairstyle. StyleGAN captures these nuances and produces high-quality images consistently across resolutions.
The GPT models proposed by OpenAI, built on a deep autoregressive Transformer, have achieved groundbreaking success in natural language processing tasks, showing strong language generation ability and broad application potential. More recently, Stable Diffusion has offered an efficient way to generate high-quality images. Starting from random noise, it removes noise over many steps, so the generated image becomes gradually clearer as the noise level is reduced. Stable Diffusion has made notable progress in image quality and diversity and is widely applied to image generation tasks.
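The step-by-step noise reduction described for diffusion models can be made concrete with a small sketch. It follows the generic denoising-diffusion formulation, not Stable Diffusion's actual code, and it makes several assumptions: a linear noise schedule with illustrative endpoints, a three-value list standing in for an image, and the true noise passed in where a trained network's prediction would normally be used. All function names are hypothetical.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of signal variance kept after each step of a linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Forward process: blend the clean values with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps

def estimate_clean(xt, eps_pred, alpha_bar):
    """Invert the forward blend given a noise estimate.

    In a real system eps_pred comes from a trained network (assumption:
    here the true noise is reused, so the recovery is exact).
    """
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps_pred)]

rng = random.Random(0)
alpha_bars = alpha_bar_schedule(1000)
x0 = [0.5, -0.25, 1.0]             # toy "image" with three values
step = 499                          # a point midway through the schedule
xt, eps = add_noise(x0, alpha_bars[step], rng)
recovered = estimate_clean(xt, eps, alpha_bars[step])
print(alpha_bars[0], alpha_bars[-1])
```

The schedule shows why generation is gradual: almost all of the signal survives the first step, while almost none survives the last, so sampling has to walk back through many intermediate noise levels.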
2. Engineering Issues in AI+Art Generation Tasks
With advances in diffusion models and fine-tuning techniques, generative AI technologies, exemplified by products such as ChatGPT and Midjourney, have been widely integrated into mobile applications and commercial products since the end of 2022. This widespread adoption marks a shift toward a more user-friendly approach driven by practical business scenarios. The main engineering considerations are:
(1) Offering superior user experience, which includes visually appealing, intuitive user interfaces, and efficient construction and maintenance methodologies;
(2) Acquiring necessary computational resources flexibly, especially in resource-constrained situations;
(3) Ensuring the stability of generation outcomes, particularly when dealing with generative AI techniques characterized by randomness and uncertainty.
Beyond these focuses, as engineering practices deepen, emerging challenges such as model management and search, and media resource storage and management become evident. Solutions are sought to guarantee the long-term, stable, and efficient application of generative AI.
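One common way to address the stability concern in item (3) is to treat the random seed as an explicit request parameter, so that a result can be reproduced later. The sketch below assumes a hypothetical `GenerationRequest` structure and a `toy_generate` function that merely stands in for a real generative model; neither comes from the white paper or from any particular platform.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical request record: everything needed to reproduce a result."""
    prompt: str
    seed: int
    steps: int = 20

def toy_generate(request):
    """Stand-in for a generative model: returns pseudo-random 'pixel' values.

    The generator is seeded from the prompt and the seed, so the same
    request always yields the same output.
    """
    rng = random.Random(f"{request.prompt}|{request.seed}")
    return [round(rng.random(), 4) for _ in range(request.steps)]

first = GenerationRequest(prompt="ink landscape", seed=7)
second = GenerationRequest(prompt="ink landscape", seed=8)

same_again = toy_generate(first) == toy_generate(first)
differs = toy_generate(first) != toy_generate(second)
print(same_again, differs)
```

Storing the full request alongside each output also helps with the model management and media storage challenges mentioned above, since every stored image can be traced back to the parameters that produced it.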
Users invoke generative AI models in application platforms, set parameters, and receive feedback. The efficiency of this process hinges on the platform's design, ensuring it's both user-friendly and feature-rich. Such platforms mainly cater to algorithm engineers, producers, and designers. On these platforms, algorithm engineers prioritize AI model training and tuning. A visualized user interface aids them in swiftly adjusting parameters, reducing code volume, thus enhancing model performance. Producers, who traditionally had to communicate multiple times with designers to realize their design concepts, can now directly translate their ideas into image forms via the AI platform, simplifying interactions with designers. Meanwhile, designers exploit the platform to efficiently produce numerous drafts, which they can subsequently refine and perfect, enabling a smooth design workflow.
IV. AI + Art Case Studies
In the pre-AI era, "digital genesis" was seen as an extension of data visualization, focused mainly on constructing digital landscapes in virtual spaces, with interactivity at its core. Artists and game developers used game engines such as Unity and Unreal Engine to merge art with code and give audiences immersive experiences. Because of technical constraints, however, digital creations of that era were typically linear and single-path. AI technology has given artists a broader creative space, enabling more detailed and realistic virtual environments and richer interactivity. This has opened up diverse creative possibilities for artists and offered audiences a new way of experiencing art. The blend of technology and art has ignited creativity and sparked discussions on innovation and ethics. By studying recent AI art cases, the writing team sought a deeper understanding of this ongoing fusion of culture and technology. The team grouped the cases into three broad types: AI-generated works focused on results, AI-driven interactive works, and works that combine multiple intelligent agents across virtual and real spaces.
1. Generative Art
In the early stages of AIGC creation, the writing team noticed no significant innovation in the relations of production. Researchers and creators focused mainly on algorithmic innovation and tuning, with the aim of achieving a satisfying "image." As AI generation has shifted from algorithmic innovation to productization, however, artists should become more actively involved in training the models. Running inference on an existing model is closer to consumption than to creation; for creators, the focus should be on training AI models rather than merely obtaining the final generated result.
2. AIGC Standard
The urgency of establishing standards for AIGC can be understood by analogy with the ImageNet project and Li Fei-Fei's work in image recognition. When ImageNet was launched, it introduced a large-scale labeled image dataset that gave the image recognition field a common benchmark and evaluation standard. This standardization accelerated technological progress, because researchers could compare and share methods and results on the same dataset. The AIGC domain faces similar diversity and complexity. Although AI art is widely applied internationally, challenges remain in specific vertical applications such as the digital production of Chinese cultural arts, and without a common standard or benchmark, research and applications may become fragmented. By establishing one or more standardized benchmarks, the writing team hopes to promote collaboration across technology, equipment, content, and industry, and so better meet the public's aesthetic and practical needs. Just as ImageNet propelled advances in image recognition, the AIGC domain needs relevant standards to guide and integrate its development. Evaluating generative art is complex because it involves deep professional knowledge and subjective aesthetics, so it urgently requires expert input from the art community, for example through reinforcement learning from human feedback (RLHF), to set evaluation criteria. A collaboratively developed evaluation model (a reward model) can exist independently of the generative models and provide expert-level scores for generated artworks, offering a more precise and systematic evaluation mechanism and strengthening the connection between artists and technology researchers.
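As an illustration of how expert judgments might be turned into scores, the sketch below fits a Bradley-Terry style preference model, the kind of pairwise objective commonly used when training reward models for RLHF. It is a deliberate simplification: it learns one scalar score per artwork rather than a model that generalizes to unseen works, and the preference pairs, learning rate, and function name `fit_scores` are all hypothetical.

```python
import math

def fit_scores(num_items, preferences, lr=0.1, epochs=500):
    """Fit one score per item from (preferred, rejected) index pairs."""
    scores = [0.0] * num_items
    for _ in range(epochs):
        for winner, loser in preferences:
            # Probability the current scores assign to the expert's choice
            p = 1.0 / (1.0 + math.exp(scores[loser] - scores[winner]))
            # Gradient ascent on log p: raise the winner, lower the loser
            scores[winner] += lr * (1.0 - p)
            scores[loser] -= lr * (1.0 - p)
    return scores

# Hypothetical expert judgments: each pair means "first is better than second"
preferences = [(0, 1), (0, 2), (1, 2), (0, 1)]
scores = fit_scores(3, preferences)
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because the scores are learned only from comparisons, experts never have to assign absolute numbers to artworks, which suits a domain where aesthetic judgment is subjective.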
Despite the extensive application of AI in art internationally, there remains a gap in meeting the demands of Chinese cultural arts. To better serve the cultural and art industries, aside from the above standard construction, the writing team needs to reinforce collaboration and research in various aspects like technology, equipment, content creation, and management. The aim is to break down barriers between the tech and art sectors and address practical application issues of AI in digital art creation.
Generative art has a profound connection with data visualization and the digital genesis. Data visualization began with transforming complex datasets into visual forms, making them more digestible and understandable, while the digital genesis involves constructing and shaping virtual environments in digital spaces. Generative art, as an extension of these two, explores how to employ algorithms and mathematical models to create new, unprecedented art forms.
With technological advancements, especially the rise of artificial intelligence, the domain of generative art has expanded further. AI-driven interactions have introduced new possibilities to generative art, enabling artists to craft more intricate and dynamic ecosystems. These systems are no longer static, predefined structures but can evolve and change in real-time based on internal rules and external interactions.
3. AI-Driven Interactivity
Interactive art emphasizes audience participation. Compared with traditional art, it encourages real-time interaction and feedback: viewers can often engage with the artwork through touch, listening, walking, and so on. With the advancement of information technology, human-computer interfaces have evolved from the early command-line, graphical, and multimedia interfaces to more intelligent, complex hybrid forms. AI-centric interactive pieces have progressed from simple linear interactions to multi-dimensional, multi-outcome, multi-sensory interactions.
The writing team observed that AI technology has added a new dimension to storytelling. Audiences might be invited into a universe composed of AI-driven robots, playing roles of world creators, holding the power to decide the fate of creatures on various planets. Through interactions with the system, audiences can choose to foster harmony and cooperation among beings or induce conflicts or even obliterate them entirely. This narrative style provides audiences with opportunities to influence the story's progression, creating limitless narrative possibilities.
4. Multi-Agent System Art
Multi-agent systems (MAS) are a subfield of artificial intelligence that studies how multiple autonomous agents interact and cooperate. Each agent has its own perception, decision-making, and action capabilities and cooperates or competes with others according to its objectives. The behavior of the whole system emerges from these interactions and can simulate complex real-world scenarios. Artists like Ian Cheng have explored virtual multi-agent ecosystems through works like "BOB". In the art domain, MAS can not only simulate human creative processes, such as collective painting, but also demonstrate how machines shape and optimize their "consciousness" or "concepts" through interaction. This gives artists a new perspective from which to explore how machines perceive and understand the external world.
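The perceive-decide-act cycle of such agents can be sketched with a toy collective painting, in which two competing agents repaint cells of a shared canvas. Everything here is hypothetical and chosen for brevity: the `PainterAgent` class, the one-dimensional canvas, and the random decision rule are illustrative assumptions, not a description of any artwork mentioned above.

```python
import random

class PainterAgent:
    """A hypothetical agent that perceives a shared canvas, decides, and acts."""

    def __init__(self, color, rng):
        self.color = color
        self.rng = rng

    def perceive(self, canvas):
        # Perception: find the cells that are not yet in this agent's color
        return [i for i, cell in enumerate(canvas) if cell != self.color]

    def decide(self, candidates):
        # Decision: pick one candidate at random, or nothing if none remain
        return self.rng.choice(candidates) if candidates else None

    def act(self, canvas, index):
        # Action: paint the chosen cell, possibly overwriting a rival's color
        if index is not None:
            canvas[index] = self.color

def run_collective_painting(agents, canvas_size, rounds):
    """Let every agent take one perceive-decide-act turn per round."""
    canvas = [None] * canvas_size
    for _ in range(rounds):
        for agent in agents:
            agent.act(canvas, agent.decide(agent.perceive(canvas)))
    return canvas

rng = random.Random(42)
agents = [PainterAgent("red", rng), PainterAgent("blue", rng)]
canvas = run_collective_painting(agents, canvas_size=8, rounds=20)
print(canvas)
```

No agent controls the final picture: the result is a product of the agents' interactions, which is the system-level behavior the paragraph above describes.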
V. Impact on the Artistic Ecosystem
Applying AI to artistic creation can promote research to improve AI algorithms. By studying and analyzing the performance of algorithms in different application scenarios, people can continuously reflect on the limits, principles, and future development of AI algorithms. The application of large AI models will facilitate communication and cooperation between artists and technology researchers, further expanding the possibilities of digital creation. Artistic creation can not only expand the application scenarios of AI but also provide experimental data and practical foundation for the improvement of AI algorithms. Cross-disciplinary collaboration can simultaneously promote technological advancement and the digital transformation of culture and art, further advancing China's modernization process. Therefore, the deep integration of art and AI will become an important direction for the digital construction of culture. This is also the significance of studying AI in the digital construction of national culture.
The "AI+Art" white paper emphasizes an interdisciplinary spirit and the integration of the humanities and sciences. It brings cultural development and AI together within the framework of the humanistic spirit, reflecting on the impact of technological development on human spirituality and social psychology while maintaining critical thinking. It also stresses how artistic thinking can stimulate technological innovation, integrating lateral, divergent, and reverse creative thinking into discipline construction and harnessing the energy of artistic exploration to inspire innovation in both directions. Within the field of art, it emphasizes the history of AI technology: by understanding the history of technology, scientific thinking, and experimental methods, artists can explore unknown territory.
Chinese cultural traditions have a long history and have made unique contributions. Under globalization and the influence of AI technology, Western culture has gradually entered China and has served as a reference and inspiration for China's modernization and cultural innovation. Currently, most research results in AI art creation come from the West, which points to the research challenges involved and shows why such studies are necessary. The white paper's approach of being rooted in China while learning from Western perspectives emphasizes the protection and inheritance of traditional Chinese culture. At the same time, it focuses on absorbing and integrating valuable elements of Western culture, aiming to promote cultural exchange between China and the world while enhancing international competitiveness and cultural soft power.
Acknowledgement
On this occasion, the writing committee wishes to extend its profound gratitude to the Central Academy of Fine Arts, JD AI Research, Amazon Web Services, and the anonymous mentors and organizations who provided invaluable support. We are deeply touched by the trust and patience shown to us by members of the Artificial Intelligence Association. Our most heartfelt thanks, however, are reserved for the dedicated members of our writing team who, during the scorching summer of 2023, worked on this white paper with the meticulousness of sculptors. It is through your selfless dedication that this white paper exists. Although the task arrived at short notice, the entire team refined the white paper in their spare time and without any additional financial support, driven solely by enthusiasm for bringing AI and art together and by the wish to contribute to society and the academic community.
During its inception, the committee drew from a vast trove of resources, artist websites, and prior research. We wish to express our deepest respect and appreciation to the original authors and the artists who continuously practice their craft. Their invaluable works and research provided a wealth of inspiration and backing, enabling us to bring this project to fruition. This white paper was orchestrated by the Vice Dean of the Central Academy of Fine Arts, Qiu Zhijie, and primarily penned by Chen Baoyang. However, given time constraints and limited resources, there may inevitably be oversights and shortcomings within. We genuinely apologize for any such lapses and eagerly anticipate feedback and suggestions from our esteemed readership, aiding us in ceaselessly enhancing our endeavors.
Writing Committee for the "Chinese Artificial Intelligence Series White Paper - AI+Art": Qiu Zhijie, Chen Baoyang, Ba Raiyun, Zhang Xiaolin, Zong Yuqi, Zhang Bingwan, Wang Mengyao, He Xiaodong, Wang Jun, Wang Naiyan, Huang Muqi, Yang Xingyu, Chen Yang, Liu Daqing, Xu Hai, Zhang Aoyu, Zhang Pu, Song Hongtao, Dai Yan, Chen Haiyun.
Committee for the "Chinese Artificial Intelligence Series White Paper":
Chairman: Dai Qionghai
Executive Director: Wang Guoyin
Vice-Chairmen: Chen Jie, He You, Liu Chenglin, Liu Hong, Sun Fuchun, Wang Endong, Wang Wenbo, Zhao Chunjiang, Zhou Zhihua
Members: Ban Xiaojuan, Cao Peng, Chen Chun, Chen Songcan, Deng Weiwen, Dong Zhenjiang, Du Junping, Fu Yili, Gu Tianlong, Gui Weihua, He Qing, Hu Guoping, Huang Heyan, Ji Xiangyang, Jia Yingmin, Jiao Licheng, Li Bin, Liu Min, Liu Qingfeng, Liu Zengliang, Lu Huaxiang, Ma Huadong, Miao Duoqian, Pan Gang, Park Songhao, Qian Feng, Qiao Junfei, Sun Changyin, Sun Maosong, Tao Jianhua, Wang Weining, Wang Xizhao, Wang Xuan, Wang Yunhong, Wushouer·Silamu, Wu Xiaobei, Yang Fangchun, Yu Jian, Yue Dong, Zhang Xiaochuan, Zhang Xuegong, Zhang Yi, Zhang Yi, Zhou Guodong, Zhou Hongyi, Zhou Jianshe, Zhou Jie, Zhu Liehuang, Zhuang Yueting
