AIGC Visual Content: Progress & Traceability Review

Beijing Zhongke Journal Publishing Co. Ltd.

This review is led by Prof. An-an Liu (School of Electrical and Information Engineering, Tianjin University), Yuting Su (School of Electrical and Information Engineering, Tianjin University), Prof. Lanjun Wang (School of New Media and Communication, Tianjin University), et al., and focuses on AIGC visual content generation and traceability. In the contemporary digital era, rapid technological advances have made multimedia content creation, particularly visual content generation, integral to modern societal development. The growth of digital media and the creative industries has put a spotlight on artificial intelligence-generated content (AIGC) technology. AIGC gives multimedia creators new tools, benefiting fields such as cinema, gaming, and virtual reality.

This review introduces advances in AIGC technology, focusing on visual content generation and on traceability as a critical facet of it. It first traces the evolution of image generation technology from generative adversarial networks (GANs) to transformer-based auto-regressive models and diffusion probabilistic models, a progression marked by a leap in image quality and capability.

GANs pioneered text-to-image generation, evolving from text-conditioned methods to sophisticated style control and large-scale models that improve performance by scaling up network parameters and dataset sizes. Transformer-based auto-regressive models such as DALL-E and CogView mark a new epoch in image generation: they use transformer structures to predict feature sequences and decode them into complete images. Diffusion probabilistic models simulate the gradual transformation of data into noise and then reconstruct the data from that noise; they are known for stable training and for impressive results in both quality and diversity.
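To make the "data into noise" half of a diffusion model concrete, the following is a minimal sketch of the forward noising step only. It assumes a DDPM-style formulation with a linear noise schedule, which is one common choice and not necessarily the one used by any model covered in the review. The function names, the schedule values, and the four-value toy "image" are all illustrative assumptions. The learned reverse (denoising) network, which is the part that actually generates images, is not shown.

```python
import math
import random

def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced (assumed DDPM-style values)."""
    if steps == 1:
        return [beta_start]
    return [beta_start + (beta_end - beta_start) * t / (steps - 1)
            for t in range(steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from x_0 in one shot:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1)."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

pixels = [0.8, -0.2, 0.5, 0.1]  # hypothetical toy image, values in [-1, 1]
slightly_noisy = add_noise(pixels, 10, alpha_bars, rng)      # mostly signal
almost_pure_noise = add_noise(pixels, 999, alpha_bars, rng)  # mostly noise
```

Under these assumptions the signal weight `alpha_bars[t]` shrinks monotonically with `t`, so early steps barely perturb the image while the final step is close to pure Gaussian noise. A trained model would then learn to undo these steps one at a time.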

As it matures, AIGC technology faces challenges such as improving content quality and providing precise control. Controllable image generation aims to give creators fine-grained control over content by integrating conditions such as layouts, sketches, and visual references, so that they can maintain artistic autonomy and quality standards.

The review addresses the critical issue of image authenticity and potential misuse, exemplified by deepfakes and fake news, extending to risks related to privacy, security, and societal implications. In response, watermark-related image traceability technology has emerged as a solution, using watermarking techniques to authenticate and verify AI-generated images.

Watermarking techniques are categorized into watermark-free embedding, pre-embedding, post-embedding, and joint generation methods. Watermark-free embedding uses fingerprint information for model attribution. Pre-embedding integrates watermarks into training data, while post-embedding divides image generation and watermark embedding into two stages. Joint generation methods aim for adaptive watermark embedding during image generation. Each approach plays a pivotal role in verifying traceability across diverse scenarios, offering a robust defense against potential misuses of AI-generated imagery. The development of these technologies introduces new horizons in digital creativity but also presents significant challenges, particularly in image authenticity and potential misuse.
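The two-stage structure of post-embedding can be illustrated with a toy sketch: an image is produced first, and a watermark is written into it afterwards as a separate step. The example below uses least-significant-bit (LSB) embedding purely as a stand-in. It is a deliberately simple classical technique, not one of the methods surveyed in the review, and it would not survive compression or editing. `generate_image`, `embed_watermark`, and `extract_watermark` are hypothetical names, and the random pixel list merely stands in for a generative model's output.

```python
import random

def generate_image(size, seed):
    """Stage 1 stand-in: a real system would call a generative model here."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(size)]  # 8-bit grayscale pixels

def embed_watermark(pixels, bits):
    """Stage 2: write each watermark bit into the lowest bit of one pixel."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the image")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return marked

def extract_watermark(pixels, length):
    """Read the watermark back from the lowest bit of the first pixels."""
    return [p & 1 for p in pixels[:length]]

watermark = [1, 0, 1, 1, 0, 0, 1, 0]             # hypothetical model/owner ID bits
image = generate_image(64, seed=42)              # stage 1: generation
marked_image = embed_watermark(image, watermark) # stage 2: embedding
recovered = extract_watermark(marked_image, len(watermark))
```

Because the two stages are independent, the watermarking step can be swapped out without retraining the generator. Joint generation methods give up that independence in exchange for embedding that adapts to the image as it is being generated.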

In conclusion, while AIGC technology offers promising opportunities in visual content creation, it also presents challenges regarding controllability and security. This review provides an overview of image generation technologies, controllable image generation, and watermark-related traceability techniques. The aim is to offer researchers a systematic perspective on advances in AIGC visual content generation and traceability, helping them understand current research trends, challenges, and future directions in this rapidly evolving field.

See the article:

https://doi.org/10.11834/jig.240003
