DALL-E: Revolutionising the creative abilities of AI
Mar 8, 2023, by Pooja S Kumar
The human mind is often regarded as the best artist of all time. Could AI be more creative than humans? Let’s take a look.
Recently, plenty of text-to-image generative models have been mushrooming in the AI space. These include OpenAI’s DALL-E 2, Midjourney (from the independent research lab of the same name), Craiyon (formerly DALL-E mini), Meta’s Make-A-Scene, Google’s Imagen, and many more. Some are freely accessible to the public, while others are available on an invite-only basis. DALL-E is one of the latter. OpenAI recently made DALL-E 2 available in beta, but access is still by invitation only, and the company is granting it to people in phases. It has provided access to over 1 lakh (100,000) users globally and looks to expand to one million users in the coming weeks.
DALL-E is a text-to-image generator that has been designed with creativity in mind. It was introduced in 2021 by OpenAI, an artificial intelligence lab that has spent the past seven years building systems that mimic human capabilities. With four times the resolution of the original DALL-E, DALL-E 2 can now produce high-quality images that look like they’ve been professionally produced.
Here is an example.
Text: An astronaut riding a horse in space.
DALL-E derives its name from two radically different influences: the Spanish painter Salvador Dalí and WALL-E, the lovable robotic protagonist of the Pixar film of the same name. It has earned a dedicated online fanbase for its ability to interpret complex phrases and create unique, original computer-generated graphics based on written sentences.
Since its inception, OpenAI’s DALL-E has made significant progress in its ability to generate beautiful images from scratch. Early in its development, DALL-E demonstrated a simple method for assembling an image by transforming a foreground object into the shape found in the background of the image. This process is often used in Photoshop to create a seamless background, but the technique has been difficult for AI programs to perfect. DALL-E has since evolved into a much more complex program whose models allow for more realistic renderings of individual objects and more natural-looking transformations between objects. At their best, its results can rival work produced by a human artist. Whether you need a quick avatar for your Twitter page or some attractive images for your blog post, DALL-E can help.
DALL-E is not only better at generating photorealistic images from text but can also realistically edit and retouch photos, and its higher resolution helps keep the results sharp. Based on a simple natural-language description, it can fill in or replace part of an image with AI-generated imagery that blends seamlessly with the original. This is called “inpainting”. It can also take an image as input and create variations of it with different angles and styles.
DALL-E was created by training a neural network on images and their text descriptions. Through deep learning, it not only understands individual objects, like koalas and motorcycles, but also learns the relationships between objects.
The DALL-E research has three main outcomes.
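To make the idea of replacing part of an image more concrete, here is a minimal sketch of the final blending step only. It assumes a mask that marks which pixels to replace and a stand-in "generated" image; the function name composite_inpaint and the tiny grayscale grids are hypothetical, and this is not how OpenAI implements the feature. The hard part, generating pixels that match the description and the surrounding picture, is not shown.

```python
def composite_inpaint(original, generated, mask):
    """Blend generated pixels into the original wherever the mask is 1.

    original, generated: 2D lists of grayscale values (0-255), same shape.
    mask: 2D list of 0/1 flags; 1 marks a pixel to be replaced.
    """
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        result.append([
            gen if m else orig
            for orig, gen, m in zip(orig_row, gen_row, mask_row)
        ])
    return result


# Hypothetical 2x3 images: the "generated" grid stands in for model output.
original = [[10, 10, 10], [10, 10, 10]]
generated = [[200, 200, 200], [200, 200, 200]]
mask = [[0, 1, 0], [0, 1, 1]]  # only these pixels are replaced

edited = composite_inpaint(original, generated, mask)
print(edited)  # [[10, 200, 10], [10, 200, 200]]
```

Pixels outside the mask are copied unchanged, which is why an edit can leave the rest of the photo untouched.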
- It can help people express themselves visually in ways they may not have been able to before.
- An AI-generated image can tell us a lot about whether the system understands us, or is just repeating what it has been taught.
- DALL-E helps humans understand how advanced AI systems see and understand our world.
DALL-E 2 is not yet available for general public use; it is being beta tested and has been made available to select online celebrities to promote its features. It’s unclear if or when it will be released for general use, but a public version would likely sit behind a paywall. Many people are left wondering about the application’s code. General knowledge of the program’s complete functionality and technical foundation is limited, although OpenAI’s website has a wealth of information about certain aspects of its inner workings.
Technology is constantly evolving, and DALL-E has its limitations. If it’s taught with objects that are incorrectly labelled, like a plane labelled “car”, and a user tries to generate a car, DALL-E may create a plane. DALL-E can also be limited by gaps in its training. For example, if you type “baboon” and DALL-E has learned what a baboon is through images and accurate labels, it will generate a lot of great baboons. But if you type “howler monkey” and it hasn’t learned what a howler monkey is, DALL-E will give you its best idea of what it could be, like a howling monkey.
Image 1: Howler Monkey
Image 2: Howling Monkey
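The two limitations above can be illustrated with a deliberately tiny toy. It assumes a "model" that simply remembers which image description came with which label, and that falls back to any known label sharing a word with the prompt. The functions train and generate and the training pairs are hypothetical; DALL-E itself is a neural network and works nothing like a lookup table, so treat this only as an analogy.

```python
from collections import defaultdict


def train(pairs):
    """Build a toy 'model': map each label to the image contents seen with it."""
    model = defaultdict(list)
    for label, image_content in pairs:
        model[label.lower()].append(image_content)
    return dict(model)


def generate(model, prompt):
    """Return what the toy model associates with the prompt.

    An exact label match is tried first. Otherwise the toy falls back to the
    first known label that shares a word with the prompt, a crude stand-in
    for a 'best guess'. Returns None if nothing matches at all.
    """
    prompt = prompt.lower()
    if prompt in model:
        return model[prompt][0]
    prompt_words = set(prompt.split())
    for label, contents in model.items():
        if prompt_words & set(label.split()):
            return contents[0]
    return None


training_pairs = [
    ("baboon", "photo of a baboon"),
    ("car", "photo of a plane"),  # mislabelled on purpose
    ("monkey", "photo of a generic monkey"),
]
model = train(training_pairs)

baboon_result = generate(model, "baboon")         # accurate label, good result
car_result = generate(model, "car")               # bad label, wrong object
howler_result = generate(model, "howler monkey")  # gap in training, best guess
```

Here the prompt “car” yields a plane because of the bad label, and “howler monkey” yields a generic monkey because the toy has never seen that label.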
What’s exciting about the approach used to train DALL-E is that it can take what it learned from a variety of other labelled images and apply it to a new image. DALL-E is an example of how imaginative humans and clever systems can work together to make new things, amplifying our creative potential.
So far, DALL-E has created stunningly realistic images from nothing more than a text prompt. As the system improves and gains more experience, its ability to generate convincing and realistic images from text will only increase. So what are you waiting for? Try DALL-E today and see the difference for yourself. You won’t be disappointed. Have a similar project in mind? Contact us!
Disclaimer: The opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Dexlock.