Is using real people’s photos to train AI a copyright violation?
Using data to train AI models is becoming a global trend, especially in fields like facial recognition, image processing, and deepfake content creation. However, a crucial question arises: is using real-life photos to train AI a copyright violation? Let’s explore this in the article below!
What is using real-life photos to train AI?
“Training AI” is the process of providing data to a machine learning model so that it learns to recognize, analyze, or create new content. In this context, real-life photos are often used as input data for AI to learn characteristics such as:
- Face, expression
- Gender, age
- Behavior, gestures
- Image style
Examples:
- AI facial recognition requires millions of photos to distinguish between person A and person B
- AI image generation (such as deepfakes) requires source photos to recreate the face of a specific person
It’s undeniable that the “smarter” the AI, the more diverse and authentic the input data needs to be. Therefore, real-life photos offer many advantages such as high accuracy, richness in shooting angles, lighting, context, and expression. However, it is precisely because of this value that many individuals and businesses have collected and used photos in bulk without fully assessing the potential legal consequences.

Does using real-life photos to train AI violate copyright?
The use of real-life images to train AI cannot be definitively judged as right or wrong; it depends heavily on the data source and the purpose of exploitation. To accurately assess this issue from a legal perspective, two important aspects must be considered simultaneously: copyright over the photograph and the rights to the personal image of the person appearing in the photograph.
Are real-life photos protected by copyright?
According to the Vietnamese Intellectual Property Law, photographs can be protected by copyright if they meet the conditions of being a creative work.
Specifically, Clause 1, Article 14 of the Intellectual Property Law lists photographic works among the types of works protected by copyright.
This means that most photographs, including real-life photos, can be considered photographic works, and the photographer or the legal owner of the photograph will hold copyright over that work.
Furthermore, Article 20 of the Intellectual Property Law also stipulates the property rights of the owner of a work, including important rights such as:
- The right to copy the work in any form
- The right to distribute and import the original or copies of the work
- The right to communicate the work to the public
In this context, using images to train AI can essentially be considered “data copying,” because you are feeding images into the system for processing, storage, and analysis. If this is done without the owner’s permission, it can be considered copyright infringement.
Rights to personal images
Besides copyright, another important legal aspect to consider is the right to personal images, as stipulated in the Vietnamese Civil Code.
According to Article 32 of the 2015 Civil Code, the law clearly states: “The use of an individual’s image must be with their consent.”
This regulation shows that even if you have the right to use the image (for example, you have purchased the stock image or have been granted copyright), using the image of the person appearing in the image still requires their consent, especially in cases of commercial use or where it may affect their honor or dignity.
Therefore, if you use real people’s images to train AI without their consent, you may be violating their personal rights, even if you are not violating copyright in the traditional sense.
Cases where using real-person photos to train AI does not violate copyright
Not all cases of using real-person photos to train AI violate the law. In fact, you can absolutely use the data legally if it falls under one of the following cases:
First, the image is in the public domain
These are works whose copyright protection has expired or whose owners have relinquished their rights. In this case, you can freely use them without permission.
Second, the image has a clear usage license
Many platforms provide images with licenses such as Creative Commons (CC0, CC BY…), allowing use for various purposes, including AI training, as long as the license conditions are complied with.
Third, with the owner’s permission
This is the safest legal way, when you sign a contract or have written consent from the image owner or the person in the image to use the data for AI training purposes.
Fourth, use within the scope of research and teaching (with limitations)
According to Article 25 of the Intellectual Property Law, the law allows the use of works without permission in certain special cases such as scientific research or teaching, provided it is not for commercial purposes and the source is clearly stated.
However, it is important to note that if you initially use the data for research purposes but then commercialize the AI product (e.g., selling software, providing services), the exception no longer applies, and the use may be considered copyright infringement unless you have obtained permission.
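For teams assembling a training set, the four cases above, together with the consent requirement for the person pictured, can be expressed as a simple screening step. The following is only a minimal sketch and not legal advice: the `PhotoRecord` fields, the `ALLOWED_LICENSES` set, and both function names are hypothetical, and the rule that subject consent is always required is a conservative assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of license labels treated as acceptable in this sketch.
# Whether a given license really covers AI training depends on its terms.
ALLOWED_LICENSES = {"CC0", "CC BY"}


@dataclass
class PhotoRecord:
    """Hypothetical metadata kept for one candidate training photo."""
    path: str
    public_domain: bool = False          # case 1: protection expired or waived
    license_name: Optional[str] = None   # case 2: explicit usage license
    owner_permission: bool = False       # case 3: written consent of the owner
    research_only: bool = False          # case 4: non-commercial, source cited
    subject_consent: bool = False        # consent of the person in the photo


def copyright_basis(photo: PhotoRecord, commercial: bool) -> Optional[str]:
    """Return the first of the four cases that applies, or None if none does."""
    if photo.public_domain:
        return "public domain"
    if photo.license_name in ALLOWED_LICENSES:
        return "license"
    if photo.owner_permission:
        return "owner permission"
    # The research/teaching case stops applying once the use is commercial.
    if photo.research_only and not commercial:
        return "research/teaching"
    return None


def usable_for_training(photo: PhotoRecord, commercial: bool) -> bool:
    """Require both a copyright basis and the subject's consent (assumption)."""
    return copyright_basis(photo, commercial) is not None and photo.subject_consent


# Example: a licensed photo without the subject's consent is rejected.
licensed = PhotoRecord("a.jpg", license_name="CC0")
print(usable_for_training(licensed, commercial=True))
```

Keeping the copyright check and the consent check separate mirrors the two legal aspects discussed above: a photo can pass one and still fail the other.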
As shown above, using images to train AI is a complex issue that involves not only technology but also legal regulations on copyright and personal image rights. As data becomes an increasingly important “digital asset,” every action involving the collection, exploitation, and use of images needs careful consideration to avoid unnecessary legal risks.
FAQ
An image without attribution is not free of copyright. Unless the image is in the public domain or carries a clear license, you still need to ask for permission before using it.
Using a person’s image requires their consent. Under the regulations on personal image rights, this applies especially when the use is for commercial purposes or may affect their honor or reputation.