GenAI Images: When Everything Is Possible and Nothing Is Real

People started generating GenAI images with the emergence of DALL-E and other models in the early 2020s. They grabbed our imagination; we’d never been able to produce anything like them before, let alone so easily. But now we’re seeing a flood of them. According to one report, AI generated 15 billion images in the year to August 2023; that’s as many photographs as were taken in the prior 150 years.

So, do they have a future, or will they go the way of the metaverse?

Now that the novelty and fascination have worn off, it takes something exceptional to catch our attention. My friend Mark Schaefer has one suggestion: be the best fake possible.

He produced this image in Midjourney with the prompt, “/imagine most dazzling image imaginable, insane detail, awe, surprise, beautiful, dazzling, gorgeous.” And it is, though I can’t help feeling it’s vacuous: pretty but empty, dense but dead. I don’t think we can maintain the awe and surprise of the first genAI images by continually cranking up the volume on the detail, the gorgeousness, and the dazzle.

But genAI images are good for illustrating articles, and it’s easier to produce an illustration relevant to a story than to find a stock image, which can often take hours. However, like the one at the head of this post, many look artificial and unmistakably AI-generated, and they frequently feature robots.

Creating images for commercial purposes isn’t new. For years, IKEA has used computer-generated images (CGI) like the one below in their catalogs. One reason is that it saves staging a whole room with furniture and knick-knacks for a single photo.

But for images to be high quality, interesting, and relevant for illustration, we have to work at them. Nothing good was ever easily produced, and generating professional images with genAI has some challenges.

Some of these are well known, like blurry or distorted parts, nonsensical objects, hands with six fingers, and perpetuated stereotypes. It is easier to create images than to check their quality, but if we’re vigilant, we can filter the bad ones out.

A much bigger challenge is getting the output you want. GenAI has a significant element of randomness – after all, we’re asking it to create something new from what it has learned from a vast collection of existing images. If we ask it, for instance, to edit an image by giving a woman in it an Apple Watch, we’re more likely to see a watch on her computer screen or a shiny green apple on her desk than an Apple Watch on her wrist; I know because I’ve tried. That’s why IKEA uses CGI, not genAI. The designer specifies the objects and their placement, and the computer renders the surfaces and applies light and shadow. (IKEA has a genAI assistant to answer customer queries and make product suggestions, but that’s another story.)

Notwithstanding the challenges, here are some alternatives for this post’s hero image (the one at the top), which I made with Midjourney. These look less like the somewhat surreal AI images we’ve grown used to, and they illustrate the topic.

I used Midjourney’s default style for these images. But if you want to create and maintain a more distinctive one, that takes some work, too. One way is to change how strongly the app “stylizes” the output; there’s a parameter for that (--stylize). Another is to add words such as “detailed” or “photorealistic” to the prompt, as I did for a couple of those above.

Midjourney and similar tools can also produce images in a wide variety of named styles, from Art Deco to Zenga (a Japanese art form). An easy way to get one of those is to name it in the prompt: “in the style of Salvador Dali,” for instance. More powerful, though, is the tool’s style tuner, which shows you up to 128 pairs of images in different styles and asks you to pick your favorite from each pair. Midjourney then generates a code reflecting the combination of styles you preferred, which you can add to prompts to create images in that style. If you like it and want consistency, you can set that style as the default for subsequent illustrations.
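
The stylize setting and the style code both end up as short suffixes on the prompt text, so it can help to assemble prompts the same way every time. Here is a rough sketch in Python of how that might look. It assumes Midjourney's --stylize and --style parameters and a stylize range of 0 to 1000, which may differ between model versions; the helper name build_prompt and the placeholder style code are hypothetical and not part of any Midjourney tool.

```python
from typing import Optional, Sequence


def build_prompt(subject: str,
                 descriptors: Sequence[str] = (),
                 stylize: Optional[int] = None,
                 style_code: Optional[str] = None) -> str:
    """Assemble a prompt string to paste after /imagine (hypothetical helper)."""
    # Start with the subject, then add descriptive words such as "photorealistic".
    parts = [subject.strip()]
    parts.extend(d.strip() for d in descriptors if d.strip())
    prompt = ", ".join(parts)

    if stylize is not None:
        # Assumption: --stylize accepts values from 0 to 1000.
        if not 0 <= stylize <= 1000:
            raise ValueError("stylize is expected to be between 0 and 1000")
        prompt += f" --stylize {stylize}"

    if style_code:
        # Assumption: a style tuner code is applied with --style <code>.
        prompt += f" --style {style_code}"

    return prompt


print(build_prompt("a woman at a desk wearing a smartwatch",
                   ("photorealistic", "detailed"),
                   stylize=250))
```

This only builds the text; you would still paste the result into Midjourney yourself. Keeping the descriptors and the style code in one place is what gives a series of illustrations a consistent look.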

As the volume of AI images grows, roughly 90% will be banal and 10% good, the proportions Sturgeon’s Law predicts for just about everything. I’d like to be in the 10%. I am not a designer, but if I can generate decent images, so can most people.
