
How to Know if Your Images Trained an AI Model (and How to Opt Out) – MUO – MakeUseOf

Were your images used to train an AI art generator without your permission? Here’s how to find out, and opt out!
To many people's disbelief, living artists are discovering that their art has been used to train AI models without their consent. Using a web tool called "Have I Been Trained?", you can find out in a matter of minutes whether your images were fed to Midjourney, NightCafe, and other popular AI image generators.
If you find your image in one of the datasets used to train these AI systems, don't despair. Some organizations have developed ways to opt out of this practice, keeping your images from being scraped from the internet and passed on to AI companies.
When you ask an AI system like DALL-E to generate an image of a "dog wearing a birthday hat", it first needs to know what a dog looks like and what a birthday hat looks like too. It gets this information from enormous datasets that collate billions of links to images across the internet.
As we all know, the internet contains just about any kind of image you can imagine, including, in all likelihood, tons of images of a "dog wearing a birthday hat". With enough data like this, an AI model can work out how to reproduce an image in the likeness of the ones it's been trained on.
But what if those images were originally copyrighted? And what if those images belonged to artists, photographers, or regular people who weren't aware that their images were feeding an AI system?
Many AI image generators have a paid tier where users can buy credits to create more images, earning the companies behind them a profit. But that profit is earned off the backs of uncredited people whose images were used to train the AI system in the first place.
As more artists find out that their images were used to develop AI systems, it's clear that not everyone is okay with it. At the very least, they want AI companies to gain consent before using their images.
Especially if you are a popular, well-known artist, having images generated in your style can crowd your market, leaving fans or potential patrons unsure whether a piece was created by you or replicated in your likeness by AI. Worse still, people can create artwork in your style to support values you don't believe in.
This isn't a new problem: deepfakes have been around for years and are potentially about to get worse with the rise of AI. Nowadays, reproducing "fake" art is quick, cheap, and easy. There are only a few ways to identify an AI-generated image, making it difficult to distinguish original art from its AI-generated counterpart.
As we mentioned earlier, image datasets are used by AI companies to train their models. These datasets look like a giant Excel spreadsheet with one column containing a link to an image on the internet, while another has the image caption.
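To make the spreadsheet analogy concrete, here is a toy sketch in Python. The sample rows and the `search_captions` helper are illustrative inventions, not the actual LAION files (those are distributed as parquet shards with URL and caption columns, at a scale of billions of rows), but the shape of the data is the same: one image link paired with one caption.

```python
import csv
import io

# Toy stand-in for a dataset shard: each row pairs a link to an image
# on the internet with that image's caption. Real training datasets
# follow this same two-column shape, just with billions of rows.
SAMPLE = """url,caption
https://example.com/dog1.jpg,a dog wearing a birthday hat
https://example.com/cat2.jpg,a cat sleeping on a windowsill
https://example.com/dog3.jpg,golden retriever puppy in a party hat
"""

def search_captions(raw_csv, keyword):
    """Return URLs whose caption contains the keyword (case-insensitive)."""
    rows = csv.DictReader(io.StringIO(raw_csv))
    return [r["url"] for r in rows if keyword.lower() in r["caption"].lower()]

# A keyword search over captions is essentially what tools like
# Have I Been Trained? do, only against the full LAION-5B index.
print(search_captions(SAMPLE, "hat"))
```

Searching captions like this is also why an artist's name turning up in a dataset matters: captions scraped from the web often include the creator's name, which is what lets a model associate a style with that name.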
Not all AI companies disclose the datasets they use, DALL-E being one example. This makes it difficult to know what the model is referencing when it generates an image and adds to the general mystique of AI systems.
On the other hand, Stable Diffusion, a model developed by Stability AI, has made it clear that it was built on the LAION-5B dataset, which features a colossal 5.85 billion CLIP-filtered image-text pairs. Since this dataset is open-source, anyone is free to view the images it indexes, and for that reason it has drawn heavy criticism.
In early 2023, Getty Images sued Stability AI for scraping images from its website to train its AI image generator, Stable Diffusion. If you're wondering who, in turn, uses Stable Diffusion, that would be NightCafe, Midjourney, and DreamStudio, some of the biggest players in the field.
Set up by a group of artists, Spawning is a collective whose aim is to help people find out whether their images are in datasets like LAION-5B that are used to train AI models. Its web search engine, Have I Been Trained?, lets you easily search keywords such as your artist name.
Have I Been Trained? works a lot like a Google image search, except your search is matched to results in the LAION-5B dataset. You have the option to search either by keyword or by image; the latter is helpful if you want to see whether an exact image has been used.
We used the name of the artist Frida Kahlo (1907-1954) to test it out and found a mix of historical photographs and what looks like fan art in the form of doodles, paintings, cross-stitch, crochet, and illustrations.
If you are one of these creators, you are one of the many uncredited humans whose creativity made it possible for AI image generators to exist. And with that power, now anyone can create Frida images like this bizarre portrait of "Frida Kahlo eating ice cream".
Try typing your own artist name into the search bar to see if your work has been used to train an AI model.
The same team behind the website Have I Been Trained has created a tool for people to opt into or out of AI art systems. It's one way for artists to maintain control and permissions over who uses their art and for what purpose.
Other art platforms are beginning to follow suit; currently, DeviantArt offers an option to exclude users' images from being scraped into image datasets.
Alongside searching for your image, you can also select images to opt out of the LAION-5B training data using the Have I Been Trained? site.
You will have to create an account first; after that, right-click on an image and choose Opt-out this image.
Selecting this option will add that image to your opt-out list which you can access by clicking on your account symbol in the top right corner of the page, then selecting My Lists. To remove it from your list, right-click on the image and select Remove From Opt-Out List.
If you are a prolific artist, this method is tedious and won't let you opt out all of your images efficiently. Unfortunately, there isn't a better alternative at the time of writing, but it's likely that improvements will be made to this system in the future.
These opt-out lists are then passed on to the organization behind LAION-5B, which has agreed to remove those images from its dataset.
DeviantArt has so far led the way for art hosting platforms by giving users the option to opt their art out. Initially, you had to find the preference and select the opt-out checkbox, but following strong feedback from the DeviantArt community, this option is now turned on by default.
That means that no image posted to DeviantArt is made available to image datasets unless its owner has opted in. While not entirely foolproof, the mechanism it uses involves flagging an image with a "noai" HTML tag. This tells AI datasets that the image isn't allowed to be used; if a company uses it anyway, it is violating DeviantArt's Terms of Service.
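The "noai" mechanism boils down to a meta directive in a page's HTML that a well-behaved scraper is expected to honor. As a rough sketch (the parser class and helper function below are our own illustration, not DeviantArt's code), here is how a compliant crawler might check a page for the "noai" and "noimageai" directives before downloading its images:

```python
from html.parser import HTMLParser

class NoAIMetaParser(HTMLParser):
    """Scans a page's HTML for a robots meta tag carrying AI opt-out directives."""

    def __init__(self):
        super().__init__()
        self.noai = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if attrs.get("name", "").lower() == "robots":
            # The content attribute holds comma-separated directives,
            # e.g. content="noai, noimageai".
            directives = {d.strip().lower() for d in attrs.get("content", "").split(",")}
            if directives & {"noai", "noimageai"}:
                self.noai = True

def page_opts_out(html_text):
    """Return True if the page asks AI scrapers not to use its content."""
    parser = NoAIMetaParser()
    parser.feed(html_text)
    return parser.noai

print(page_opts_out('<html><head><meta name="robots" content="noai, noimageai"></head></html>'))
```

The catch, as the article notes, is that nothing technically forces a scraper to run a check like this; the tag is a request backed by DeviantArt's Terms of Service, not an enforcement mechanism.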
You can find the opt-out preference by hovering your mouse over your personal account icon and clicking Account Settings. Then click General from the left-hand menu and scroll down until you see the heading Tell AI Datasets They Can't Use Your Content.
Finding a good compromise between AI systems and artists whose work helps to train them will take time. If you are a creator, don't feel powerless. With strong responses from the communities who use art platforms like DeviantArt, you can have control over who uses your art.
Not everyone will want to opt out, either; some people don't have an issue with their images training AI models. But the most important thing is for AI companies to gain consent and work out a fair and respectful space where AI models and artists can exist together.
Garling has a Master's degree in Music and over a decade of experience using creative technologies. In particular, she loves writing about music production, film, and DIY electronics. Outside of writing, you will find her taking photos or editing audio.

