ChatGPT's Studio Ghibli-style Images Show Its Creative Power - But Raise New Copyright Problems

Social media has recently been flooded with images that look like they belong in a Studio Ghibli film. Selfies, family photos and even memes have been re-imagined with the soft pastel palette characteristic of the Japanese animation company co-founded by Hayao Miyazaki.

Authors

Kai Riemer
Professor of Information Technology and Organisation, University of Sydney
Sandra Peter
Director of Sydney Executive Plus, University of Sydney

This followed OpenAI's latest update to ChatGPT. The update significantly improved ChatGPT's image generation capabilities, allowing users to create convincing Ghibli-style images in mere seconds. It has been enormously popular - so much so, in fact, that the system crashed due to user demand.

Generative artificial intelligence (AI) systems such as ChatGPT are best understood as "style engines". And what we are seeing now is these systems offering users more precision and control than ever before.

But this is also raising entirely new questions about copyright and creative ownership.

How the new ChatGPT makes images

Generative AI programs work by producing outputs in response to user prompts, including prompts to create an image.

Previous generations of AI image generators used diffusion models. These models gradually refine random, noisy data into a coherent image. But the latest update to ChatGPT uses what's known as an "autoregressive algorithm".

This algorithm treats images more like language, breaking them down into "tokens". Just as ChatGPT predicts the most likely words in a sentence, it can now predict different visual elements in an image separately.

This tokenisation enables the algorithm to better separate certain features of an image - and their relationship with words in a prompt. As a result, ChatGPT can more accurately create images from precise user prompts than previous generations of image generators. It can replace or change specific features while preserving the rest of the image, and it improves on the longstanding issue of generating correct text in images.
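To make the idea of predicting an image token by token more concrete, here is a minimal sketch in Python. It is a toy illustration only, not OpenAI's method: the vocabulary, the probability table and the function names are all assumptions invented for this example, and a real model learns its probabilities from data and conditions on the prompt and on every earlier token rather than just the previous one.

```python
import random

# Hypothetical toy vocabulary of "visual tokens" (an assumption for illustration).
VOCAB = ["sky", "cloud", "grass", "tree"]

# Hypothetical probabilities of the next token given the previous one.
# A real model learns these from data and looks at far more context.
NEXT_TOKEN_PROBS = {
    "<start>": {"sky": 0.7, "cloud": 0.3},
    "sky": {"sky": 0.5, "cloud": 0.3, "tree": 0.2},
    "cloud": {"sky": 0.6, "cloud": 0.2, "tree": 0.2},
    "tree": {"tree": 0.3, "grass": 0.7},
    "grass": {"grass": 0.8, "tree": 0.2},
}


def generate_tokens(count, seed=0):
    """Sample `count` tokens one at a time, each conditioned on the previous token."""
    rng = random.Random(seed)  # seeded so the same seed gives the same sequence
    tokens = []
    previous = "<start>"
    for _ in range(count):
        options = NEXT_TOKEN_PROBS[previous]
        token = rng.choices(list(options), weights=list(options.values()))[0]
        tokens.append(token)
        previous = token
    return tokens


def to_grid(tokens, width):
    """Arrange a flat token sequence into rows, like patches of an image."""
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]


if __name__ == "__main__":
    for row in to_grid(generate_tokens(16), width=4):
        print(" ".join(row))
```

Each token is chosen in light of what has already been generated, which is the sense in which this kind of algorithm treats an image "like language".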

A particularly powerful advantage of generating images inside a large language model is the ability to draw on all the knowledge already encoded in the system. This means users don't need to describe every aspect of an image in painstaking detail. They can simply refer to concepts such as Studio Ghibli and the AI understands the reference.

The recent Studio Ghibli trend began with OpenAI itself, before spreading among Silicon Valley software engineers and then even to governments and politicians. Seemingly unlikely uses include the White House creating a Ghiblified image of a crying woman being deported and the Indian government promoting Prime Minister Narendra Modi's narrative of a "New India".

Understanding AI as 'style engines'

Generative AI systems don't store information in any traditional sense. Instead they encode text, facts, or image fragments as patterns - or "styles" - within their neural networks.

Trained on vast amounts of data, AI models learn to recognise patterns at multiple levels. Lower network layers might capture basic features such as word relationships or visual textures. Higher layers encode more complex concepts or visual elements.

This means everything - objects, properties, writing genres, professional voices - gets transformed into styles. When AI learns about Miyazaki's work, it's not storing actual Studio Ghibli frames (though image generators may sometimes produce close imitations of input images). Instead, it's encoding "Ghibli-ness" as a mathematical pattern - a style that can be applied to new images.

The same happens with bananas, cats or corporate emails. The AI learns "banana-ness", "cat-ness" or "corporate email-ness" - patterns that define what makes something recognisably a banana, cat or a professional communication.

The encoding and transfer of styles has long been an explicit goal in visual AI. Now we have an image generator that achieves this with unprecedented scale and control.

This approach unlocks remarkable creative possibilities across both text and images. If everything is a style, then these styles can be freely combined and transferred. That's why we refer to these systems as "style engines". Try creating an armchair in the style of a cat, or in elvish style.
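The notion of a style as a mathematical pattern that can be combined with other patterns can be illustrated with a very small Python sketch. Everything in it is an assumption made for illustration: the three-number vectors, the names and the simple blending rule are hypothetical, and real models use learned representations with many more dimensions and far more elaborate ways of mixing them.

```python
import math

# Hypothetical three-number "patterns" (assumptions for illustration only).
ARMCHAIR = [0.2, 0.0, 0.9]
CAT = [0.0, 0.8, 0.3]


def blend(content, style, strength=0.5):
    """Mix two vectors: strength 0 keeps the content, strength 1 gives the style."""
    return [(1 - strength) * c + strength * s for c, s in zip(content, style)]


def cosine_similarity(a, b):
    """Measure how closely two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    cat_armchair = blend(ARMCHAIR, CAT, strength=0.5)
    print("armchair vs cat:    ", round(cosine_similarity(ARMCHAIR, CAT), 3))
    print("cat-armchair vs cat:", round(cosine_similarity(cat_armchair, CAT), 3))
```

In this toy setting the blended "cat armchair" vector sits closer to the cat pattern than the plain armchair does, which is a rough picture of what transferring a style means.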

The copyright controversy: when styles become identity

While the ability to work with styles is what makes generative AI so powerful, it's also at the heart of growing controversy. For many artists, there's something deeply unsettling about seeing their distinctive artistic approaches reduced to just another "style" that anyone can apply with a simple text prompt.

Hayao Miyazaki has not publicly commented on the recent trend of people using ChatGPT to generate images in his world-famous animation style. But he has been critical of AI previously.

All of this also raises entirely new questions about copyright and creative ownership.

Traditionally, copyright law doesn't protect styles - only specific expressions. You can't copyright a music genre such as "ska" or an art movement such as "impressionism".

This limitation exists for good reason. If someone could monopolise an entire style, it would stifle creative expression for everyone else.

But there's a difference between general styles and highly distinctive ones that become almost synonymous with someone's identity. When an AI can generate work "in the style of Greg Rutkowski" - a Polish artist whose name was reportedly used in more than 93,000 prompts in the AI image generator Stable Diffusion - it potentially threatens both his livelihood and artistic legacy.

Some creators have already taken legal action.

In a case filed in late 2022, three artists formed a class to sue multiple AI companies, arguing that the companies' image generators were trained on the artists' original works without permission, and now allow users to generate derivative works mimicking their distinctive styles.

As technology evolves faster than the law, work is under way on new legislation to try to balance technological innovation with protecting artists' creative identities.

Whatever the outcome, these debates highlight the transformative nature of AI style engines - and the need to consider both their untapped creative potential and more nuanced protections of distinctive artistic styles.

The authors do not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and have disclosed no relevant affiliations beyond their academic appointment.

/Courtesy of The Conversation. This material from the originating organization/author(s) may be of a point-in-time nature and may have been edited for clarity, style and length. Mirage.News does not take institutional positions or sides, and all views, positions, and conclusions expressed herein are solely those of the author(s).
