Martin Wilson, a director of Bright Interactive (the developers of the Asset Bank DAM solution), has contributed an article for DAM News: AI In DAM: The Challenges And Opportunities. It relates to the Google VISION API review piece I wrote recently, as well as to an item on the Asset Bank blog (which was written independently and shortly before mine). Martin reaches some similar conclusions but offers more detail on how the problems might be resolved, especially in relation to feedback loops and learning:
“DAM applications typically contain hundreds of thousands of images that have been keyworded manually – why can’t the APIs learn from all this data (a bit like a human would, but more quickly) and then use it to provide keywords that are specific to an organisation? For example if my DAM application already contains hundreds of images of my company’s head office building, why can’t the API learn what my head office building looks like for the next time someone uploads an image of it?” [Read More]
This is a point I’ve made in the past too, and yet it remains something of a blind spot for all AI (Artificial Intelligence) visual recognition technologies. In the case of Google, I originally thought it might be a policy decision because they are mistrustful of human-generated metadata in general (noting that their image search discounts embedded metadata too). Based on Martin’s observations of Clarifai’s system, it seems most of their competitors have the same problem. This is really the crux of it. Just as you can’t hand most five-year-old children who are learning to read a complex academic text and hope that doing so will speed up the educational process, so it is with these systems: they can’t deal with any data you throw at them, and you really need human beings to train them. This is where the rapid progress will start to plateau and the cost/benefit case becomes more marginal.
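To make the feedback-loop idea concrete, here is a minimal, purely illustrative sketch of how a DAM could bootstrap organisation-specific tag suggestions from its existing, manually keyworded library. All names and the approach are assumptions of mine, not anything the vendors' APIs actually offer: it presumes each image has already been reduced to a feature vector (for instance by a generic vision model) and uses a simple nearest-centroid lookup over those vectors rather than real deep learning.

```python
# Hypothetical sketch (not a real vendor API): suggesting org-specific tags
# for a new upload by comparing it against centroids built from the DAM's
# existing, manually keyworded assets. Feature vectors are assumed to come
# from some generic image model; here they are just lists of floats.
from collections import defaultdict
import math

def centroid(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_tag_centroids(keyworded_assets):
    """keyworded_assets: list of (feature_vector, [keywords]) pairs drawn
    from the organisation's existing, human-keyworded image library."""
    by_tag = defaultdict(list)
    for vec, tags in keyworded_assets:
        for tag in tags:
            by_tag[tag].append(vec)
    return {tag: centroid(vecs) for tag, vecs in by_tag.items()}

def suggest_tags(vec, tag_centroids, threshold=0.9):
    """Suggest existing org-specific tags whose centroid is close to the
    new image's feature vector."""
    return sorted(tag for tag, c in tag_centroids.items()
                  if cosine(vec, c) >= threshold)
```

For example, if the library already holds several images keyworded "head office", a new upload whose feature vector lands near that centroid would get "head office" suggested automatically; the human's accept/reject decision could then feed back into the centroids. This is, of course, a toy stand-in for the deep-learning step the article argues is missing.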
Martin makes an interesting observation that not many DAM vendors will want to implement the necessary deep-learning system which would provide the missing link, because they don’t want to risk entering that sector only to find that someone already operating in it decides to do the same (and considering one of those firms is Google, one of the richest companies in the world, that’s a very reasonable point). This is the kind of impasse that thwarts many other innovations: the integrator (which is essentially what most DAM vendors are these days) and the core technology providers mistrust each other on a commercially strategic level, and the resulting lack of collaboration between the two groups prevents progress being made as quickly as it could be.
Primarily for this reason, I would take issue with Martin’s assessment that this problem will be solved in 25 years – he might be right, but I wouldn’t bet your place of residence on it. I believe the complexity has been radically underestimated by many; over-optimism about the pace of progress is something of an occupational hazard for those with a software background (and one I have been guilty of myself in the past). The other point I read a lot – that AI is a fast-moving area – doesn’t, in my view, stand up to closer examination. Artificial Intelligence (as a discipline) has been around for decades, yet the tangible results you can base a saleable product around are quite thin, and many examples end up being mocked or quietly dropped. Anyone remember that irritating paperclip character in Microsoft Word 97? Usually those that do survive are assimilated into more conventional software engineering techniques. This kind of ‘hybrid AI’ is typically more robust, and therefore more usable, because it is implemented incrementally in a way that can also draw on existing human-supplied knowledge and expertise.
I note that interest in AI tends to spike shortly after some other innovation has gained traction (the same happened twenty years ago after the internet became mainstream, and before that with the introduction of the first personal microcomputers). In this case, it’s probably mobile devices. After the wave of excitement has passed, technologists start looking around for something else to sustain the buzz and, finding little inspiration elsewhere, AI (with all of its romanticised, science-fiction associations of computers taking over the earth, etc.) provides a convenient outlet, even though the chances of many of the predictions becoming reality are quite low. I am not exactly eager to put those fateful ‘this time it’s different’ words into the semi-permanent form that is the electronic pages of DAM News, but the existence of a large-scale public communications network to which most computers are connected, and the ability to warehouse massive amounts of data, might give AI greater potential to get further on this occasion than it has in the past. That could also include some potentially interesting and even useful products for the Digital Asset Management field.
Martin’s piece is thought-provoking, considered, well-written and exactly the sort of item which many clients of DAM system vendors would welcome reading (and which we encourage on DAM News). It contrasts starkly with some of the re-heated marketing materials I have read from other firms who have implemented this technology (as well as the ‘free hit’ that a number of the providers of the recognition technology have been given where this topic has been addressed elsewhere). I highly recommend that you read it, along with their own blog post on the same subject.
- Blockchains As Emerging DAM Interoperability Activity Registers
- Clarifai vs Google Vision: Two Visual Recognition APIs Compared
- What Are The Emerging Digital Asset Management Market Trends To Be Aware Of?
- The Mediachain Digital Asset Provenance And Interoperability Protocol
- Google VISION API: Not Yet A Game Changer