Inside DALL-E Mini, the Internet's Favorite AI Meme Machine

On June 6, Hugging Face, a company that hosts open source artificial intelligence projects, saw traffic to an AI image-generation tool called DALL-E Mini skyrocket.

The outwardly simple app, which generates nine images in response to any typed text prompt, was launched nearly a year ago by an independent developer. But after some recent improvements and a few viral tweets, its ability to crudely sketch all manner of surreal, hilarious, and even nightmarish visions suddenly became meme magic. Behold its renditions of “Thanos looking for his mom at Walmart,” “drunk shirtless guys wandering around Mordor,” “CCTV camera footage of Darth Vader breakdancing,” and “a hamster Godzilla in a sombrero attacking Tokyo.”

As more people created and shared DALL-E Mini images on Twitter and Reddit, and more new users arrived, Hugging Face saw its servers overwhelmed with traffic. “Our engineers didn’t sleep for the first night,” says Clément Delangue, CEO of Hugging Face, on a video call from his home in Miami. “It’s really hard to serve these models at scale; they had to fix everything.” In recent weeks, DALL-E Mini has been serving up around 50,000 images a day.

DALL-E Mini’s viral moment doesn’t just herald a new way to make memes. It also provides an early look at what can happen when AI tools that make imagery to order become widely available, and a reminder of the uncertainties about their possible impact. Algorithms that generate custom photography and artwork might transform art and help businesses with marketing, but they could also have the power to manipulate and mislead. A notice on the DALL-E Mini web page warns that it may “reinforce or exacerbate societal biases” or “generate images that contain stereotypes against minority groups.”

DALL-E Mini was inspired by a more powerful AI image-making tool called DALL-E (a portmanteau of Salvador Dalí and WALL-E), revealed by AI research company OpenAI in January 2021. DALL-E is not openly available, due to concerns that it will be misused.

It has become common for breakthroughs in AI research to be quickly replicated elsewhere, often within months, and DALL-E was no exception. Boris Dayma, a machine learning consultant based in Houston, Texas, says he was fascinated by the original DALL-E research paper. Although OpenAI did not release any code, he was able to knock together the first version of DALL-E Mini at a hackathon organized by Hugging Face and Google in July 2021. The first version produced low-quality images that were often difficult to recognize, but Dayma has continued to improve on it since. Last week he rebranded his project as Craiyon, after OpenAI requested he change the name to avoid confusion with the original DALL-E project. The new site displays ads, and Dayma is also planning a premium version of his image generator.

DALL-E Mini images have a distinctively alien look. Objects are often distorted and smudged, and people appear with faces or body parts missing or mangled. But it’s usually possible to recognize what it is attempting to depict, and comparing the AI’s sometimes unhinged output with the original prompt is often fun.

The AI model behind DALL-E Mini makes images by drawing on statistical patterns it gleaned from analyzing about 30 million labeled images to extract connections between words and pixels. Dayma compiled that training data from several public image collections gathered from the web, including one released by OpenAI. The system can make mistakes partly because it lacks a real understanding of how objects should behave in the physical world. Small snippets of text are often ambiguous, and AI models do not grasp their meaning in the way that people do. Still, Dayma has been amazed by what people have coaxed out of his creation in the past few weeks. “My most creative prompt was the ‘Eiffel Tower on the moon’,” he says. “Now people do crazy things—and it works.”

Some of those creative prompts have taken DALL-E Mini in questionable directions, however. The system was not trained on explicit content, and it is designed to block certain keywords. Even so, users have shared images from prompts that include war crimes, school shootings, and the World Trade Center attack.

AI-powered image manipulation, including spoof imagery of real people termed deepfakes, has become a concern for AI researchers, lawmakers, and nonprofits that work on online harassment. Advances in machine learning could enable many valuable uses for AI-generated imagery, but also malicious use cases such as spreading lies or hate.

This April, OpenAI revealed DALL-E 2. This successor to the original is capable of producing images that resemble photographs and illustrations that look as if they were made by a professional artist. OpenAI has said that DALL-E 2 could be more problematic than the original system because it can generate much more convincing images. The company says it mitigates the risk of misuse by filtering the system’s training data and restricting keywords that could produce undesirable output.

OpenAI has only provided access to DALL-E and DALL-E 2 to selected users, including artists and computer scientists who are asked to abide by strict rules, an approach the company says will allow it to “learn about the technology’s capabilities and limitations.” Other companies are building their own image-generating tools at a striking pace. This May, Google announced a research system called Imagen that it said is capable of generating images of a quality level similar to DALL-E 2; last week it announced another called Parti, which uses a different technical approach. Neither is publicly available.

Don Allen Stevenson III, one artist with access to OpenAI’s more powerful DALL-E 2, has been using it to riff on ideas and speed up the creation of new artwork, including augmented reality content such as Snapchat filters that turn a person into a cartoon lobster or a Bored Ape-style illustration. “I feel like I’m learning a whole new way of creating,” he says. “It allows for you to take more risks with your ideas and try out more complicated designs because it supports many iterations.”

Stevenson says he has run into restrictions programmed in by OpenAI to prevent creation of certain content. “Sometimes I forget that there are guardrails, and I have to be reminded with warnings from the app” that state his access could be revoked. But he does not see this as limiting his creativity because DALL-E 2 is still a research project.

Delangue of Hugging Face says it’s good that DALL-E Mini’s creations are much cruder than those made with DALL-E 2 because their glitches make clear the imagery is not real and was generated by AI. He argues that this has allowed DALL-E Mini to help people learn firsthand about the emerging image-manipulation capabilities of AI, which have mostly been kept locked away from the public. “Machine learning is becoming the new default way of building technology, but there’s this disconnect with companies building these tools behind closed doors,” he says.

The constant flow of DALL-E Mini content also helped the company iron out technical issues, Delangue says, with users flagging problems such as sexually explicit results or biases in the output. A system trained on images from the web may, for instance, be more likely to show one gender over another in particular roles, reflecting deep-seated social biases. When DALL-E Mini is asked to render a “doctor,” it will show figures that look like men; if asked to draw a “nurse,” the images appear to show women.

Sasha Luccioni, a research scientist who works on AI ethics at Hugging Face, says the influx of DALL-E Mini memes made her realize the importance of developing tools capable of detecting or measuring social bias in these new kinds of AI models. “I definitely see ways in which they can be both harmful and useful,” she says.

It may become increasingly difficult to rein in some of those harms. Dayma, the creator of DALL-E Mini, admits that it’s only a matter of time before tools like his, which are more widely available, are also capable of creating more photorealistic imagery. But he thinks the AI-made memes that have circulated over the past few weeks may have helped prepare us for that eventuality. “You know, it’s coming,” Dayma says. “But I hope DALL-E Mini brings awareness to people that when they see an image they should know that it isn’t necessarily true.”

Updated 6/27/2022 11:30 am ET: A previous version of this story misspelled the name of Sasha Luccioni.
