Rebuttal: The Rise and Demise of Facial Recognition in DAM
I read with interest Martin Wilson’s recent article, “The Rise and Demise of Facial Recognition in DAM,” and would like to present an opposing view.
Every new technology has always had upsides and downsides, uses and abuses. This was true for electricity, the automobile, nuclear energy, and even social media. Facial Recognition (FR) is just one of the first AI applications to have its good and bad sides considered.
FR is already mainstream and it is likely everyone reading this uses FR many times a day: how else do you unlock your cell phone?
Perhaps the main reason FR has been in the news is a single use case: law enforcement’s use of a database from a company called Clearview AI. Clearview AI scraped vast numbers of photos of people from social media, associated the faces with the accompanying text and tags, and built a huge database which it then licensed to law enforcement (among others). As we have all seen in television dramas, a surveillance camera captures a suspected bad actor, the authorities hit a button, and that person is instantly recognized with absolute certainty (the actors never say, “We think it might be Bob Smith?”).
It is unreasonable to expect AI to ever be 100% accurate: that is not going to happen (except in trivial cases). And claims of >90% accuracy are frequently true only for a narrow test set (see the sidebar about Racial and Gender Bias in Facial Recognition). When law enforcement first used Clearview, this limitation was not appreciated: people were deemed suspects based solely on its results, and in many cases those results were wrong.
What are the damages in a situation like that? Severe. Someone could be accused of a crime or even incarcerated because an investigation incorrectly focused on the wrong person due to a FR error. FR can assist investigators, sure, but it cannot be considered definitive.
In the US, essentially all laws limiting the use of AI apply only to law enforcement (such as this one from the state of Virginia: https://lis.virginia.gov/cgi-bin/legp604.exe?221+sum+SB741).
Very recently, some early legislation banning the use of FR by law enforcement has begun to be overturned in the US.
In the EU and UK, there are concerns about “live” (real time) applications of FR, however:
“Nevertheless, the use of facial recognition technologies and other biometric applications will become more widespread across the EU in the near future. Technology is marching on, with many potentially lucrative niches to be filled…Until the AI Act comes into force, regulators and companies will be feeling out what is acceptable. Some uses of facial recognition are already becoming established, such as in airport security where there’s a strong case for public interest, or in unlocking a smartphone that requires its owner’s consent — both legitimate justifications for processing data under the GDPR.”
And in one interesting biometrics application:
“Tesco, Co-op, Asda, Aldi and Morrisons are to trial facial age estimation technology which accurately guesses a customer’s age when purchasing Challenge 25 products. The technology is being tested to see whether it is able to facilitate alcohol sales faster and more efficiently than manual checks.”
It would appear that guidelines for the use of FR (as a proxy for other forms of AI) are evolving, and will do so for some time.
In the DAM World
For Facial Recognition (FR) let’s look at two things: the use case, and user controls.
Pretty much no one using FR within a DAM is doing so in a law enforcement capacity. It is also likely that essentially everyone in your photos or videos is connected to your organization: you are not gathering pictures of random people on the street. Let’s take a specific but representative use case: you are a marketing professional, your hospital held a fundraiser last night, and 1,000 photos were shot for you over the course of the evening. Early the next morning your CEO calls and asks for the best photo of your star oncologist with your new biggest donor, and they need it in the next 10 minutes.
Absent FR, you would need to go through all 1,000 images one at a time, and there is no way that happens in 10 minutes. But if you can find one image with the oncologist and tag their face, FR will find them almost instantly in the rest of the photos and add their name to the metadata. Do the same for the big donor. Then search for both names, and in under a minute you are looking at every candidate for the mission your CEO assigned you. Pick the best one, deliver it, and get on with your other duties. FR just saved you hours and made you a hero to your CEO.
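The workflow above boils down to propagating a confirmed face tag and then intersecting two name searches. A minimal sketch of that search step, assuming a simple in-memory index (all names and data here are hypothetical, not any particular DAM’s API):

```python
# Hypothetical photo index: asset id -> set of person tags,
# as produced by an (assumed) FR tag-propagation step.
photos = {
    "IMG_0001.jpg": {"Dr. Lee", "CEO"},
    "IMG_0417.jpg": {"Dr. Lee", "Big Donor"},
    "IMG_0733.jpg": {"Big Donor"},
    "IMG_0901.jpg": {"Dr. Lee", "Big Donor", "CEO"},
}

def find_with_all(index, *people):
    """Return asset ids whose tags contain every requested person."""
    wanted = set(people)
    return sorted(a for a, tags in index.items() if wanted <= tags)

# The search your CEO's request maps to: both people in one frame.
candidates = find_with_all(photos, "Dr. Lee", "Big Donor")
print(candidates)  # only the photos containing both people
```

Instead of reviewing 1,000 images, you review the handful the intersection returns.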
What is the risk level, or put another way, what are the potential damages? Worst case, you misidentify one of the two people, such as your star oncologist. I am sure everyone reading this has at some point seen a newspaper retraction of something printed in error the preceding day: mistakes happen. It is unlikely there was any tangible damage, although your star oncologist might be pretty disappointed!
Similarly, as Martin points out, there is an FR use case that eases the pain of identifying, and making unavailable for publication, photos of students who have graduated or employees who have left your company. Given these use cases of FR in DAM, is there really any measurable risk? And on the other side of the ledger, will many hours per year of painful work be saved?
HOW FR is implemented can matter a great deal: is it left solely to its own algorithmic devices (presumably as in the Clearview AI situation), or is a human kept in the loop?
Let’s say you have a DAM system with FR capabilities and you come across a photo of someone you want the system to “know”: you click on their face and type in their name, John Smith. A responsible FR implementation then goes through all your images, finds the headshots it believes show the same person, and displays them for you to review and confirm that they actually DO show John Smith. This takes seconds even for thousands of hits: it is not burdensome, and it gives you the ability to oversee FR’s quality control.
That takes care of the images already in the system. But what happens when new images flow in days later, and some of them contain headshots the FR model believes also show John Smith? Again, there is a responsible approach: the system can tag them BUT visually indicate that each is a “Predicted” tag, and offer a simple command to review all the “Predicted” tags from the last X days, confirm them, and convert them to verified tags.
This makes a world of difference: for a tiny additional workload you are not at the mercy of an algorithm. This FR implementation acts more like an assistant, gathering results in under a second instead of costing you hours or days of manual work.
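The predicted-versus-verified distinction described above can be modeled very simply. This is only a sketch of the idea; the class and field names are my own, not any particular DAM vendor’s API:

```python
from dataclasses import dataclass, field

@dataclass
class FaceTag:
    person: str
    confidence: float
    verified: bool = False  # False -> displayed as a "Predicted" tag

@dataclass
class Asset:
    asset_id: str
    tags: list = field(default_factory=list)

def pending_review(assets):
    """All predicted (unverified) tags awaiting human confirmation."""
    return [(a.asset_id, t) for a in assets for t in a.tags if not t.verified]

def confirm(tag):
    """The human-in-the-loop step: a reviewer promotes a predicted tag."""
    tag.verified = True

# New images flow in; the FR model attaches predicted tags.
a = Asset("IMG_1201.jpg", [FaceTag("John Smith", 0.93)])
b = Asset("IMG_1202.jpg", [FaceTag("John Smith", 0.88)])

# The reviewer scans the "Predicted" queue and confirms each hit.
for _, tag in pending_review([a, b]):
    confirm(tag)
```

The point is that nothing becomes a verified tag without a person clicking through the queue, yet that click-through costs seconds, not hours.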
Of course there are other important safeguards: enforcing minimum criteria for image size and sharpness, below which an FR model’s output should not be considered valid, and ensuring that the vendor you select to do the computational “heavy lifting” cannot commingle your data with anyone else’s.
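Those minimum-quality criteria can be enforced as a simple gate before a match is ever surfaced. The thresholds below are purely illustrative assumptions, not recommendations from any vendor:

```python
# Illustrative quality gate: reject FR output on face crops that are
# too small or too blurry to be trustworthy. Both thresholds are
# hypothetical values chosen for this sketch.
MIN_FACE_PIXELS = 80     # assumed minimum width/height of the face crop
MIN_SHARPNESS = 100.0    # assumed blur score (e.g. variance-of-Laplacian style)

def usable_for_fr(width, height, sharpness):
    """Return True only if the face crop meets the minimum quality criteria."""
    return min(width, height) >= MIN_FACE_PIXELS and sharpness >= MIN_SHARPNESS

print(usable_for_fr(120, 140, 250.0))  # large, sharp crop -> accepted
print(usable_for_fr(40, 60, 250.0))    # too small -> rejected
print(usable_for_fr(120, 140, 12.0))   # too blurry -> rejected
```

Crops that fail the gate are simply never tagged, rather than entering the review queue with a low-quality guess attached.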
Radioactive materials are dangerous, but does that mean we should ban them from any use, even the tightly controlled (and essentially zero-risk) systems that use radiation to combat cancer and save lives? We think the same applies to FR in DAM, especially in this time of scarce metadata and heavy workloads. We need to responsibly employ every tool we can get to help people manage and put to use their exploding collections of content!
Paraphrasing Mark Twain: Perhaps the reports of FR’s death are greatly exaggerated!