AI-generated art is already transforming creative work

Only a few months old, apps like DALL-E 2, Midjourney and Stable Diffusion are changing how filmmakers, interior designers and other creative professionals do their jobs




By Kevin Roose

Published: Sat 5 Nov 2022, 11:20 PM

For years, the conventional wisdom among Silicon Valley futurists was that artificial intelligence and automation spelled doom for blue-collar workers whose jobs involved repetitive manual labour. Truck drivers, retail cashiers and warehouse workers would all lose their jobs to robots, they said, while workers in creative fields like art, entertainment and media would be safe.

Well, an unexpected thing happened recently: AI entered the creative class.

In the past few months, AI-based image generators like DALL-E 2, Midjourney and Stable Diffusion have made it possible for anyone to create unique, hyper-realistic images just by typing a few words into a text box.

These apps, though new, are already astoundingly popular. DALL-E 2, for example, has more than 1.5 million users generating more than 2 million images every day, while Midjourney’s official Discord server has more than 3 million members.

These programmes use what’s known as “generative AI,” a type of AI that was popularised several years ago with the release of text-generating tools like GPT-3 but has since expanded into images, audio and video.

It’s still too early to tell whether this new wave of apps will end up costing artists and illustrators their jobs. What seems clear, though, is that these tools are already being put to use in creative industries.

Recently, I spoke to five creative-class professionals about how they’re using AI-generated art in their jobs.

It spit back a perfect image


Collin Waldoch, 29, a game designer in the New York City borough of Brooklyn, recently started using generative AI to create custom art for his online game, Twofer Goofer, which works a bit like a rhyming version of Wordle. Every day, players are given a clue — like “a set of rhythmic moves while in a half-conscious state” — and are tasked with coming up with a pair of rhyming words that matches the clue. (In this case, “trance dance.”)

Initially, Waldoch planned to hire human artists through gig-work platform Upwork to illustrate each day’s rhyming word pair. But when he saw the cost — between $50 and $60 per image, plus time for rounds of feedback and edits — he decided to try using AI instead. He plugged word pairs into Midjourney and DreamStudio, an app based on Stable Diffusion, and tweaked the results until they looked right. Total cost: a few minutes of work, plus a few cents. (DreamStudio charges about 1 cent per image; Midjourney’s standard membership costs $30 per month for unlimited images.)

“I typed in ‘carrot parrot,’ and it spit back a perfect image of a parrot made of carrots,” he said. “That was the immediate ‘aha’ moment.”

Waldoch said he didn’t feel guilty about using AI instead of hiring human artists, because human artists were too expensive to make the game worthwhile.

“We wouldn’t have done this” if not for AI, he said.

I don’t feel like it will take my job away

Isabella Orsi, 24, an interior designer in San Francisco, recently used a generative AI app called InteriorAI to create a mock-up for a client.

The client, a tech startup, was looking to spruce up its office. Orsi uploaded photos of the client’s office to InteriorAI, then applied a “cyberpunk” filter. The app produced new renderings in seconds — showing what the office’s entryway would look like with coloured lights, contoured furniture and a new set of shelves.

Orsi thinks that rather than replacing interior designers entirely, generative AI will help them come up with ideas during the initial phase of a project.

“I think there’s an element of good design that requires the empathetic touch of a human,” she said. “So I don’t feel like it will take my job away. Somebody has to discern between the different renderings, and at the end of the day, I think that needs a designer.”

It’s like working with a really willful concept artist

Patrick Clair, 40, a filmmaker in Sydney, started using AI-generated art this year to help him prepare for a presentation to a film studio.

Clair, who has worked on hit shows including “Westworld,” was looking for an image of a certain type of marble statue. But when he went looking on Getty Images — his usual source for concept art — he came up empty. Instead, he turned to DALL-E 2.

“I put ‘marble statue’ into DALL-E, and it was closer than what I could get on Getty in five minutes,” Clair said.

Since then, he has used DALL-E 2 to help him generate imagery, such as an image of a Melbourne tram in a dust storm, that isn’t readily available from online sources.

He predicted that rather than replacing concept artists or putting Hollywood special effects wizards out of a job, AI image generators would simply become part of every filmmaker’s tool kit.

“It’s like working with a really willful concept artist,” he said.

“Photoshop can do things that you can’t do with your hands, in the same way a calculator can crunch numbers in a way that you can’t in your brain, but Photoshop never surprises you,” he continued. “Whereas DALL-E surprises you and comes back with things that are genuinely creative.”

What if we could show what the dogs playing poker looked like?

During a recent creative brainstorm, Jason Carmel, 49, an executive at New York advertising agency Wunderman Thompson, found himself wondering if AI could help.

“We had three and a half good ideas,” he said of his team. “And the fourth one was just missing a visual way of describing it.”

The image they wanted — a group of dogs playing poker, for an ad being pitched to a pet medicine company — would have taken an artist all day to sketch. Instead, they asked DALL-E 2 to generate it.

“We were like, what if we could show what the dogs playing poker looked like?” Carmel said.

The resulting image didn’t end up going into an ad, but Carmel predicts that generative AI will become part of every ad agency’s creative process. He doesn’t, however, think that using AI will meaningfully speed up the agencies’ work or replace their art departments. He said many of the images generated by AI weren’t good enough to be shown to clients and that people who weren’t experienced with these apps would probably waste a lot of time trying to formulate the right prompts.

“When I see people write about how it’s going to destroy creativity, they talk about it as if it’s an efficiency play,” Carmel said. “And then I know that they maybe haven’t played around with it that much themselves, because it’s a time suck.”

This is a sketch tool

Sarah Drummond, a service designer in London, started using AI-generated images a few months ago to replace the black-and-white sketches she did for her job. These were usually basic drawings that visually represented processes she was trying to design improvements for, like a group of customers lining up at a store’s cash register.

Instead of spending hours creating what she called “blob drawings” by hand, Drummond, 36, now types what she wants into DALL-E 2 or Midjourney.

“All of a sudden, I can take like 15 seconds and go, ‘Woman at till, standing at kiosk, black-and-white illustration,’ and get something back that’s really professional looking,” she said.

Drummond acknowledged that AI image generators had limitations. They aren’t good at more complex sketches, for example, or creating multiple images with the same character. And like the other creative professionals, she said she didn’t think AI designers would replace human illustrators outright.

“Would I use it for final output? No. I would hire someone to fully make what we wanted to realise,” she said. “But the throwaway work that you do when you’re any kind of designer, whether it’s visual, architectural, urban planner — you’re sketching, sketching, sketching. And so this is a sketch tool.”

This article originally appeared in The New York Times.

Generative AI, Silicon Valley’s new craze


That generative AI is Silicon Valley’s new craze became clear at the San Francisco Exploratorium, where Stability AI, the startup behind the popular Stable Diffusion image-generating algorithm, gave a party that felt a lot like a return to pre-pandemic exuberance.

The event — which lured tech luminaries including Google co-founder Sergey Brin, AngelList founder Naval Ravikant and venture capitalist Ron Conway out of their Zoom rooms — was billed as a launch party for Stability AI and a celebration of the company’s recent $101 million fundraising round, which reportedly valued the company at $1 billion.

But it doubled as a coming-out bash for the entire field of generative AI — the wonky umbrella term for artificial intelligence that doesn’t just analyse existing data but creates new text, images, videos, code snippets and more.

It’s been a banner year, in particular, for generative AI apps that turn text prompts into images — which, unlike NFTs or virtual reality metaverses, actually have the numbers to justify the hype they’ve received. DALL-E 2, the image generator that OpenAI released this spring, has more than 1.5 million users creating more than 2 million images every day, according to the company. Midjourney, another popular AI image generator released this year, has more than 3 million users in its official Discord server. (Google and Meta have built their own image generators but have not released them to the public.)

That kind of growth has set off a feeding frenzy among investors hoping to get in early on the next big thing. Jasper, a year-old AI copywriting app for marketers, recently raised $125 million at a $1.5 billion valuation. Startups have raised millions more to apply generative AI to areas like gaming, programming and advertising. Sequoia Capital, the venture capital firm, recently said in a blog post that it thought generative AI could create “trillions of dollars of economic value.”

But no generative AI project has created as much buzz — or as much controversy — as Stable Diffusion.

Partly, that’s because, unlike the many generative AI projects that are carefully guarded by their makers, Stable Diffusion is open-source and free to use, meaning that anyone can view the code or download it and run a modified version on a personal computer. More than 200,000 people have downloaded the code since it was released in August, according to the company, and millions of images have been created using tools built on top of Stable Diffusion’s algorithm.

That hands-off approach extends to the images themselves. In contrast to other AI image generators, which have strict rules in place to prevent users from creating violent or copyright-infringing images, Stable Diffusion comes with only a basic safety filter, which can be easily disabled by any users creating their own versions of the app.

That freedom has made Stable Diffusion a hit with underground artists and meme makers. But it has also led to widespread concern that the company’s lax rules could lead to a flood of violent imagery, AI-generated propaganda and misinformation.

Rep. Anna Eshoo, D-Calif., recently sent a letter to federal regulators warning that people had created graphic images of “violently beaten Asian women” using Stable Diffusion. Eshoo urged regulators to crack down against “unsafe” open-source AI models.

Emad Mostaque, the founder and chief executive of Stability AI, has pushed back on the idea of content restrictions. He argues that radical freedom is necessary to achieve his vision of a democratised AI that is untethered from corporate influence.

He reiterated that view in an interview with me last week, contrasting his view with what he described as the heavy-handed, paternalistic approach to AI taken by tech giants.

“We trust people, and we trust the community,” he said, “as opposed to having a centralised, unelected entity controlling the most powerful technology in the world.”

Mostaque, 39, is an odd frontman for the generative AI industry.

He has no doctorate in artificial intelligence, nor has he worked at any of the big tech companies from which AI projects typically emerge, like Google or OpenAI. He is a British former hedge fund manager who spent much of the past decade trading oil and advising companies and governments on Middle East strategy and the threat of Islamic extremism. More recently, he organised an alliance of think tanks and technology groups that tried to use big data to help governments make better decisions about Covid-19.

Mostaque, who initially funded Stability AI himself, has quickly become a polarising figure within the AI community. Researchers and executives at larger and more conventional AI organisations characterise his open-source approach as either naive or reckless. Some worry that releasing open-source generative AI models without guardrails could provoke a backlash among regulators and the general public that could damage the entire industry.

But Mostaque got a hero’s welcome from a crowd of several hundred AI researchers, social media executives and tech Twitter personalities.

He took plenty of veiled shots at tech giants like Google and OpenAI, which has received funding from Microsoft. He denounced targeted advertising, the core of Google’s and Facebook’s business models, as “manipulative technology,” and he said that, unlike those companies, Stability AI would not build a “panopticon” that spied on its users. (That one drew a groan from Brin.)

He also got cheers by announcing that the computer the company uses to train its AI models, which has more than 5,000 high-powered graphics cards and is already one of the largest supercomputers in the world, would grow to five or 10 times its current size within the next year. That firepower would allow the company to expand beyond AI-generated images into video, audio and other formats, as well as make it easy for users around the world to operate their own, localised versions of its algorithms.

Unlike some AI critics, who worry that the technology could cost artists and other creative workers their jobs, Mostaque believes that putting generative AI into the hands of billions of people will lead to an explosion of new opportunities.

“So much of the world is creatively constipated, and we’re going to make it so that they can poop rainbows,” he said.

If this all sounds eerily familiar, it’s because Mostaque’s pitch echoes the utopian dreams of an earlier generation of tech founders, like Mark Zuckerberg of Facebook and Jack Dorsey of Twitter. Those men also raced to put powerful new technology into the hands of billions of people, barely pausing to consider what harm might result.

When I asked Mostaque if he worried about unleashing generative AI on the world before it was safe, he said he didn’t. AI is progressing so quickly, he said, that the safest thing to do is to make it publicly available, so that communities — not big tech companies — can decide how it should be governed.

Ultimately, he said, transparency, not top-down control, is what will keep generative AI from becoming a dangerous force.

“You can interrogate the data sets. You can interrogate the model. You can interrogate the code of Stable Diffusion and the other things we’re doing,” he said. “And we’re seeing it being improved all the time.”

His vision of an open-source AI utopia might seem fantastical, but he found plenty of people who wanted to make it real.

“You can’t put the genie back in the bottle,” said Peter Wang, an Austin, Texas-based tech executive who was in town for the party. “But you can at least have everyone look at the genie.”

This article originally appeared in The New York Times.

