AI For DAM: Reliability, Liability And Risk Management
This feature article was contributed by DAM News editor, Ralph Windsor.
As we have discussed on DAM News recently, Artificial Intelligence (AI) is currently getting a lot of attention in Digital Asset Management, as it is elsewhere. Two issues which seem to be rapidly increasing in significance (but which have so far attracted little comment) are the reliability of the technologies involved and who holds liability for them in the event of a fault occurring.
On the reliability question, last week there was news coverage of a case involving Tesla (the electric vehicle manufacturer). In the story, a car which was under the control of the on-board software collided with a lorry, tragically killing the driver of the car. The preliminary finding was that the accident was caused not by a mechanical issue with the car itself, but by a failure of its AI visual recognition capabilities. This quote attributed to Tesla was noteworthy:
“In a statement, Tesla said it appeared the Model S car was unable to recognise “the white side of the tractor trailer against a brightly lit sky” that had driven across the car’s path.” [Read More]
This is the same class of technologies as those which are being incorporated into DAM solutions by some vendors. The ‘blind spot’ problem, where the software fails to make a detection, is a recurring issue, as I discussed when I reviewed the Google Vision API a few months ago. For reasons I will elaborate on later in this article, I do not believe these reliability issues will ever be fully resolved. While I acknowledge there is considerable scope to improve their accuracy, the question of who is liable for them when they fail still remains.
I have talked about the Tesla news with a number of people in the last few days and one point that has been advanced is that driverless cars are a special case and that most DAM solutions are used in far less mission-critical environments. I accept that argument, however, if at the same time we are collectively seeking to get DAM solutions integrated with all kinds of other business systems, there is still potential for quite a lot of negative risk to get transmitted further afield than was initially envisaged. Further, the range and depth of that risk transfer will potentially grow as systems become more interconnected. This has some important implications for managing risks with both Digital Asset Management initiatives and the enterprise Digital Transformation programmes which they are frequently a component of.
Software Bugs And Probability Fallacies
With AI technology, reliability risks are harder to quantify than with more conventional software applications. Most non-AI software faults can be divided into two categories: exceptions and semantic issues. Exceptions generally produce some kind of error that stops further processing of requests. To anyone other than software developers, these look very bad as they are generally accompanied by hard-to-decipher messages that clearly indicate something has gone wrong. If you are writing code, however, they are usually preferable because you get specific system-generated information to help diagnose a fault.
Semantic bugs by contrast are where the program does something different to what it was designed to do. These issues are harder to resolve but it is usually possible to enumerate a range of potential outcomes which determine whether the operation was successful or not and develop tests for them.
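To illustrate the distinction in concrete terms, here is a minimal hypothetical Python sketch (not drawn from any real application): the first function fails loudly with an exception, while the second contains a semantic bug that runs to completion and quietly returns a wrong answer.

```python
# Exception: the fault announces itself and stops further processing.
def average(values):
    return sum(values) / len(values)  # raises ZeroDivisionError if the list is empty


# Semantic bug: the program runs, but does something different to what was intended.
def average_excluding_outliers(values, limit=100):
    filtered = [v for v in values if v < limit]  # intended: abs(v) < limit
    return sum(filtered) / len(filtered)  # large negative outliers silently skew the result
```

The first failure is easy to detect and diagnose; the second can only be caught by enumerating the expected outcomes and testing against them, which is the point made above.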
Although AI tools get given descriptions which seek to partially anthropomorphise them, like ‘computer vision’, ‘machine learning’ etc., the methods used have mechanised, industrial origins and are far less sophisticated than many might imagine. The concepts employed are relatively basic, by design, principally because the more complex they get, the less universally applicable they become, which makes them less reliable when used outside a predetermined range of samples. A number of AI products I have reviewed use strategies which aim to reduce images into lines and shapes. These are then compared with an internal database of examples and each candidate match is assigned a score or rank. Variations on statistical techniques based on Standard Deviation are used to check whether a score is above a given threshold and so predict a match with a degree of confidence.
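As a rough illustration of the kind of scoring approach described above (a simplified, hypothetical sketch rather than any specific vendor’s implementation, with the similarity measure and data structures invented for the example), the matching logic often amounts to comparing similarity scores against a statistically derived threshold:

```python
import statistics

def similarity(a, b):
    """Toy similarity measure: the closer two equal-length feature vectors
    are, the higher the score (1.0 for identical vectors)."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))

def suggest_tags(image_features, examples, num_std_devs=3):
    """Suggest tags whose examples score well above the average similarity.
    `examples` maps a candidate tag to a pre-computed feature vector,
    standing in for the internal database of samples described above."""
    scores = {tag: similarity(image_features, features)
              for tag, features in examples.items()}
    mean = statistics.mean(scores.values())
    spread = statistics.stdev(scores.values())
    threshold = mean + num_std_devs * spread
    return [tag for tag, score in scores.items() if score >= threshold]
```

Note that a low score in a sketch like this could mean either a genuinely poor match or simply a gap in the examples database; the code has no way of telling the difference, which is the limitation discussed next.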
An important conceptual difference with AI is that unlike regular software, it is impossible to get either a 0% or 100% outcome. AI aims to get a binary device (a computer) to process inputs in a non-binary manner that simulates what human beings do. Faults from AI tools are, therefore, harder to predict, diagnose and resolve in a rational manner because the system does not know the difference between a low scoring result and a limitation in its own algorithm (or possibly even a coding mistake by the developer).
The most practical method that can be used to help limit this effect is to have human beings check the results and to then optimise the algorithm. This can also introduce further unforeseen issues. It is usually impossible to define the full range of outcomes without limiting the potential usefulness of the algorithm. Further, the subjective opinions of the people involved in the review exercise can skew the results and cause ‘trained’ systems to still miss glaring mistakes that most human beings would instantly pick up on.
The scoring method used by many AI tools, where outliers that fall outside a given boundary are discounted, has flaws resulting from the mathematical theory which underpins it. Even within sectors such as mechanical engineering and financial services that make extensive use of methods which rely on these techniques, ‘tail risk’ (i.e. something completely unexpected happening) is acknowledged to be hard to mitigate against and more common than is widely appreciated. Three Standard Deviations is the equivalent of 99.73%. That sounds like quite a high threshold, but if an AI system is given one million images to extrapolate tags from, it means around 2,700 of them will be wrong. Further, that is only an average: the errors are not spread evenly, so you might get 13,500 incorrect identifications in a row and then five million that are perfect. From what I have seen of most computer vision algorithms, the level of accuracy is far lower than 99.7% (even taking only the top scoring suggestions generated by most systems into account).
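To make the arithmetic concrete, here is a hypothetical back-of-envelope calculation (the figures are assumptions taken from the paragraph above, not measurements of any real system), together with a simple simulation of how unevenly even purely random errors can fall:

```python
import random

accuracy = 0.9973                 # three standard deviations, as above
num_assets = 1_000_000

expected_errors = round(num_assets * (1 - accuracy))
print(f"Expected incorrect tags: {expected_errors}")  # roughly 2,700

# Errors do not arrive evenly spaced: simulate random tagging mistakes and
# find the worst 10,000-asset window.
random.seed(1)
errors = [random.random() > accuracy for _ in range(num_assets)]
window = 10_000
worst = max(sum(errors[i:i + window]) for i in range(0, num_assets, window))
print(f"Worst {window}-asset window: {worst} errors "
      f"(the average per window is {(1 - accuracy) * window:.0f})")
```

This only models random, independent mistakes; systematic blind spots of the kind described in this article cluster far more heavily, since every asset sharing the problem characteristic will be mis-tagged together.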
As we have described on DAM News when taking quantitative DAM ROI studies to task, it is not advisable to talk in terms of averages with digital assets because what matters is highly context dependent – i.e. what you (or another system) needs to find at a precise moment in time. What this means is that you cannot isolate a given series of assets and say definitively whether they were useful or not, since their value could fluctuate over time based on events. To a greater or lesser extent, AI technology is fundamentally unreliable and always will be.
Whose Liability Is It Anyway?
A point frequently raised about AI reliability is that human beings do not have a 100% reliability record either. That is an accurate observation; however, the issue then moves to responsibility and, by implication, liability.
Referring to the Tesla piece mentioned at the start of the article, their fault description sounds a lot like the aforementioned odds playing out:
“The high ride height of the trailer combined with its positioning across the road and the extremely rare circumstances of the impact caused the Model S to pass under the trailer, with the bottom of the trailer impacting the windshield of the Model S.” [Read More]
To summarise, they are saying: ‘we didn’t think of that’. This will be a familiar refrain for anyone who has experience of managing teams of software developers. It is a telling point that even the description of software faults as ‘bugs’ is an ‘abdication of responsibility’ (to borrow a phrase from Elon Musk, the CEO of Tesla) since, unlike the original use of the term attributed to Grace Hopper, the root cause of most software faults is human error and has nothing to do with intrusions from other life-forms (artificial ones or otherwise).
Tesla also offer recommendations for reducing risk:
“It is important to note that Tesla disables Autopilot by default and requires explicit acknowledgement that the system is new technology and still in a public beta phase before it can be enabled. The system also makes frequent checks to ensure that the driver’s hands remain on the wheel and provides visual and audible alerts if hands-on is not detected.” [Read More]
As with this example, the recommended method of risk mitigation for many AI tools is for human beings not to depend on them; at the same time, however, the marketing of these technologies emphasises exactly those benefits – that you can leave it all with the computer and go off and do something else. This is a contradiction which hasn’t been resolved: as a user you are sold the benefits, but if something goes wrong, it’s your fault – read the small print.
I don’t believe this dichotomy is going to go unchallenged and where these sort of issues tend to get taken to task is in a court of law. If a human being drives a vehicle into another car or catalogues an image in a way that could be perceived as a form of racist abuse, they have responsibility for it and they may subsequently pay the price for their actions in one form or another. If some automated software does the same, the computer executing its instructions cannot be held to account because it is an inanimate object, so the liability then transfers to the integrators, developers and designers of the systems themselves.
This puts anyone who develops AI and related technologies into a whole new ball game (and, by association, the channel partners who integrate them). The Tesla incident is one of a few I have been hearing about recently. I have read about a number of cases, especially in algorithmic trading of financial instruments, which, while less well-publicised and far less tragic, have certainly been very expensive for those involved.
Foreseeability is an important factor in insurance since it is a key determinant of whether one party was negligent or not. This seems very difficult to pin down with AI technologies, so I would expect it to be hotly contested, with precedent established through test cases rather than legislation being used to set the benchmark (for the next few years, at least). That implies some lengthy and expensive legal disputes in prospect with AI technologies as a key factor. On that basis, I would expect insurers to want to increase premiums for those who both produce and use AI before long to cover their increased costs. With some notable exceptions, this is an aspect of AI which is not currently being widely discussed, but I would imagine a number of legal interests are now paying closer attention to it than they were before.
Advice About AI For Managers Of Digital Asset Management Initiatives
I think there might be something of use in the research behind AI, but I believe it is being over-promoted currently, based on very limited evidence of successful and reliable real-world implementations (certainly any that have been properly tested rather than taken from marketing literature offered by software vendors). I suspect that in part this is because the technology sector (as a whole) is running out of new ideas that are robust enough to support solutions that can scale up to achieve critical mass. As I have mentioned in other articles about AI, the subject seems to be the bookend to other more significant innovation trends. The increased private equity interest in AI might be an indicator of a forthcoming downturn in the tech sector, as has happened in the past.
Being the first to implement some new technology concept is rarely as much of an advantage as it first appears. With that in mind, I recommend the following strategies and tactics for those contemplating using AI for DAM initiatives:
- Be aware that the risks inherent in AI technologies are possibly being discounted far more than they should be.
- Unless you have millions of digital assets, try to avoid any automated metadata suggestions being inserted by default.
- Introduce random testing of selected (and non-contiguous) ranges of assets where these methods have been used, preferably approved by different groups of people; a minimal sketch of this kind of spot check appears after this list. This is particularly important for those with very large asset repositories where only automated tools have generated metadata.
- Ensure that awareness of the limitations of the AI tools is a part of your change management and user adoption plans. Encourage users to check the automated suggestions and not to accept them without any kind of critical review.
- Make it clear that independent QA will be carried out and anyone who has relied exclusively on the AI tools may have to re-catalogue assets if they are not meeting a minimum quality standard.
- Be particularly careful if your DAM solution is the source repository for other systems, especially those where metadata can be publicly exposed on websites etc (e.g. captions for photos or automated transcriptions).
- Ask the supplier of your system who has liability if the auto-suggestions produce unexpected results and whether they will assume, or at least share, any liability arising from them. If they can’t agree to this, they don’t fully trust the technology they are selling and you should weight your own level of confidence accordingly.
- Insist that the vendor implements some kind of feedback loop that allows the automated suggestions to be refined and improved.
- Hybrid AI, where automated decisions are verified by human judgement, is lower risk than pure AI methods. Consider this as an alternative to using AI exclusively on its own.
- AI technology (indeed, all technology in general, in my opinion) tends to be more successful if the problem domain (or subject) is restricted and the definition kept as tight as possible; this involves fewer variables and shortens the odds of a successful outcome. Consider restricting the scope of AI metadata suggestions to a more subject-specific range of assets.
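Referring back to the random testing recommendation above, the following is a minimal hypothetical sketch of that kind of spot check (the function names, asset ID structure and 95% pass rate are assumptions for illustration, not features of any particular DAM platform):

```python
import random

def draw_qa_sample(asset_ids, sample_size=200, seed=None):
    """Pick a random, non-contiguous sample of auto-tagged assets for human review.
    `asset_ids` is assumed to be a list of IDs whose metadata was generated
    solely by automated tools."""
    rng = random.Random(seed)
    return sorted(rng.sample(asset_ids, min(sample_size, len(asset_ids))))

def batch_passes_qa(verdicts, minimum_pass_rate=0.95):
    """`verdicts` maps a sampled asset ID to True (acceptable metadata) or
    False (needs re-cataloguing) as judged by a reviewer. Returns whether
    the batch meets the agreed quality standard, plus the observed rate."""
    if not verdicts:
        return False, 0.0
    pass_rate = sum(verdicts.values()) / len(verdicts)
    return pass_rate >= minimum_pass_rate, pass_rate
```

Run with different reviewers on different samples each time, this gives an independent check on the automated suggestions without having to re-examine the whole repository; if a batch falls below the threshold, the corresponding assets can be queued for re-cataloguing as recommended above.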
Conclusion
If you have an executive role managing a DAM implementation project, risk management should be up there as one of the most important considerations of the entire project (some would argue above all others). This is especially true in Digital Asset Management where there are compounding effects that increase risk exponentially as more data gets introduced.
AI toolsets present an unquantifiable risk with unclear liability so they need to be handled with great care. Managers need to critically evaluate whether or not the benefits offered do genuinely stack up when measured against the time and effort required to babysit them in addition to the potential costs if they fail when put to real-world use.
For anyone interested in the wider topic of risk management and DAM, the chapter I contributed to Elizabeth Keathley’s book: Digital Asset Management: Content Architectures, Project Management, and Creating Order out of Media Chaos covers the subject in some detail.
The Tesla situation is an interesting one. In fairness, virtually all early cars could be considered death traps by comparison to those built in accordance with today’s safety standards. By removing seat belts alone, we’d probably see deaths increase at a rate that outpaces the failures of the Tesla autopilot. In fact, we’ve seen deaths result from the malfunction (or, more typically, the “we never thought of that” programming) of aircraft autopilots too. Yet, on the whole, pilots believe the assistance offered by an aircraft’s autopilot contributes to flight safety far more than it does the potential for accident. (Speaking as a pilot who willingly admits to making many more stupid decisions than the autopilot could ever dream of, I can confirm this.)
So what about the management of content?
The problem that I see is that the failures that result from an autopilot malfunction are immediate and finite. By comparison, the “failures” that are introduced into a content management system via this type of AI are so insidious that they go virtually unnoticed. Someone throws a few thousand files onto the system and some dialog box indicates that “Tagging is Complete!” No one wants to find more work to do, so we tend to accept that “complete” means “correct, final, done, go home, you’re free!”
The result will be a system that grows increasingly unreliable over time. Unlike those that come from the autopilot failures, these “deaths” of system integrity will be slow and virtually invisible. Compounding the problem will be search engines that rely on the DAM to provide accurate results. And there’s no doubt in my mind that we’re headed in a direction of databases educating one another. So, if Google sees a photo of a wolf inside your DAM that’s tagged “Siberian Husky,” how long will it be before other image recognition engines start thinking the same thing?
A problem is that for some purposes, the difference between “wolf” and “Siberian Husky” isn’t relevant. If you’re making a brochure, and you need a photo for some background treatment, “dog” might be all the tag you need. On the other hand, if you’re a vet, you need more.
I fear that content managers are growing increasingly intoxicated by the potential of AI, without thinking about the ramifications of the potentially massive errors that could be introduced and never found, or propagated to other systems before they’re found.
I recently got to drive a Tesla Model S. I was told it was faster than my Audi, so I’d love it. I did. It was remarkable in how it performed. But, speaking as one who has been involved with technology for more than half my life, there is no way I would ever engage that autopilot unless I was driving on a virtually empty road and I was wide awake.
David Diamond
Director of Global Marketing
Picturepark
I take your point about conventional vehicles being unsafe, but the two issues are firstly the liability question and secondly the nature of the fault. If a conventional motorist drives badly and crashes, it’s their fault and they will probably have to face responsibility for it – so there’s a built-in bias towards avoiding accidents (even if that isn’t always sufficient to prevent them). On the nature of the fault, the real concern is that some major flaw comes to light after an update and you get lots of vehicles crashing within a short space of time. Computer software basically industrialises data processing – including any unanticipated faults. AI is still an industrial process (despite all the ‘cute’ anthropomorphic names given to the components); it’s just one based on a probability of success that is lower than would be deemed acceptable for regular software.
Even saying that, the software industry has a shocking reputation for reliability anyway, before anything allegedly ‘intelligent’ gets added into the mix. The whole reason concepts like ‘agile’ exist is because virtually nothing ever goes quite to plan and there needs to be a built-in acceptance of a very high risk of failure to get through a typical implementation. The engineering skills of the human race currently just aren’t good enough to deliver AI that can be trusted for anything other than some research projects (imo). One day that might change, but we’re not there yet.
I think the real issue with this stuff is less the technology and more the marketing of it and the fact that there is a lot of pressure from those who have invested in it to get their capital returned with a profit. This is encouraging over-promotion leading to unreasonable expectations. It does rather contain all the plot elements for a disaster movie.
The latest news from Tesla isn’t ideal and from my perspective, it confirms the points raised in this article: http://www.bbc.co.uk/news/technology-36783345