Big Data And The Integrated Future For Small DAM Systems
This special feature article has been written by Ralph Windsor, a senior partner in Digital Asset Management implementation consultants, Daydream.
One of the fashionable tech trends of the moment is undoubtedly ‘Big Data’. Compared with some other terms doing the rounds right now, it has slightly more merit as an addition to the jargon that the IT sector seems to have a limitless capacity to generate.
Many still don’t really grasp what ‘Big Data’ means in relation to them, however, and often derive all kinds of invalid associations as a result. In the minds of a number of less well-informed prospective DAM end users I encounter, assets = data, and they think they have a lot of assets, so their DAM system must be ‘Big Data’ too, right? Hopefully this article will clear up why that is incorrect in the majority of cases; as I will discuss later, however, although their primary conclusion is inaccurate, they might be on to something nonetheless.
If you are not up to speed with the term, Big Data refers to very large data sets (typically measured in terms of petabytes) and the associated processes of managing information at that kind of scale – which is quite different to how it might conventionally have been handled in the past and probably still is in most DAM systems you will encounter. As the volume of data we have all collectively generated has increased at an exponential rate over recent years, the conventional techniques that IT personnel have previously depended on have started to become unfit for purpose (but only at a comparatively large scale, it must be emphasised).
Initially, those with high volume data management requirements tried to use existing technologies by scaling up data management systems like relational databases and deploying bigger and more powerful servers. That strategy reaches a tipping point where it becomes unsustainable, and a new discipline has since evolved specifically to address those needs – along with a variety of software tools and revisions to previously accepted best practice recommendations. Unusually for a technology term, Big Data is more or less what it says. The main issue is a lack of either appreciation of, or agreement on, what exactly ‘Big’ is and at what point a data repository has acquired a scale that warrants the description. Big Data has become big news recently for a variety of reasons, not least because very large websites such as Facebook have used it to implement platforms that need to support millions of users and a correspondingly large range of configuration/customisation options.
For many DAM users, the volume of assets they plan to manage does not take them anywhere near the kind of scale where Big Data should be on the agenda. Even after a number of years, many enterprise DAMs might not make it north of six figures in terms of asset volumes, and repository storage sizes are often in the single or double figure terabyte range – typically even less if there is little or no HD video content involved. Obviously there are exceptions, and I am only considering average numbers here. Over time, most repositories will probably grow towards a scale where Big Data techniques become necessary, but not at a speed that makes it operationally significant for decision makers right now on the strength of DAM alone.
Where DAM solutions do start to fall within the scope of Big Data initiatives, however, is when you begin to consider integrating them with other systems that a larger organisation might also employ, especially transactional Business Intelligence (BI) and web analytics data. Things get more interesting if you can work backwards from something like sales receipts for a given product where one photo has been used in a marketing campaign compared to an alternative where a different one might be used – and be able to see that data in your DAM when deciding which asset to use. Scenarios like this present opportunities for DAM that are somewhat different to the currently accepted productivity case for investing in one (i.e. being able to find digital media more quickly).
Those involved in the formation of customer acquisition programmes are probably well aware of the various split testing techniques that can be applied to help transform a marketer’s hunch into an actionable strategy that can be objectively supported with detailed facts and figures. While your DAM system might contain a limited set of data about assets, when you integrate it with other systems of the type described, the volumes increase by several orders of magnitude. This might be where DAM really does begin to segue into Big Data.
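To make the split testing idea above concrete, here is a minimal sketch of the kind of comparison involved: joining hypothetical campaign figures (all asset names and numbers here are invented for illustration) to pick the better-performing of two assets. A real integration would of course pull this data from BI and analytics systems rather than a hard-coded dictionary.

```python
# Hypothetical sketch: comparing how two DAM assets performed in a split test.
# All data, asset names and figures below are invented for illustration.

def conversion_rate(impressions, conversions):
    """Return the fraction of impressions that led to a sale."""
    return conversions / impressions if impressions else 0.0

# Imagined campaign figures, as if joined from DAM usage logs and sales receipts.
campaign_results = {
    "photo_A.jpg": {"impressions": 12000, "conversions": 340},
    "photo_B.jpg": {"impressions": 11800, "conversions": 512},
}

for asset, stats in campaign_results.items():
    rate = conversion_rate(stats["impressions"], stats["conversions"])
    print(f"{asset}: {rate:.2%} conversion rate")

# Surface the better-performing asset inside the DAM interface.
best = max(campaign_results,
           key=lambda a: conversion_rate(**campaign_results[a]))
print(f"Recommended asset: {best}")
```

The point is not the arithmetic, which is trivial, but the volume: multiply this by every asset, every campaign variant and every sales channel, and the data quickly outgrows what a DAM alone would hold.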
Some might argue that you can already do this, and there are systems that connect DAM with product data (e.g. Product Information Management, or PIM, applications). Over the medium term, however, the growth of data and the range of analysis requirements will become too large for them to handle as single-vendor solutions.
The implication of this trend is that integration will become increasingly important: whether you are a vendor or a technology buyer, you will need to call on the services of specialist solutions that can work with all your other applications in order to continue to offer everything end users need. The current primitive tactics used for most integration features lack the robustness, scale and sophistication required to be effective. As discussed elsewhere on DAM News, there will be vocal demand from a new generation of marketers for this kind of integrated data and for the ability to automate many of the analysis tasks that are currently carried out manually via a spreadsheet. Single sourcing one ‘mega solution’ product suite that tries to do the whole lot (and duplicates costs as a result) won’t be well received by most end users, especially in the cash-strapped business environment that we are likely to remain in for the foreseeable future.
DAM is just one of many tools employed by marketers and others with an interest in digital assets. For that reason, it seems untenable that DAM systems can remain independent islands of ‘small data’ for very long. While there might be movement towards integration standards, these all depend on a wide range of independent vendors being willing (and able) to adhere to them without adding their own proprietary enhancements or reducing the scope of what is offered to suit the limitations of their own resources.
In an ideal world, an open source standard would be the best way to implement this, but I don’t think it would have the leverage required to ensure that everyone who uses it keeps to the rules. In addition, open source does not provide much of an answer to the capital expenditure challenge facing anyone who wants to implement DAM integrated with a Big Data infrastructure. In many of the examples I have seen of the technology being applied, some existing platform (e.g. the Amazon Cloud) has been utilised, and from the reports that Naresh covers about their movements, they also seem quite keen on further market integration directly into the application stack – not just as the ‘virtualised hardware supermarket’ that many IT personnel currently perceive them to be.
I can, however, foresee open source technologies being used as the logic layer and protocol for exchanging data (i.e. what the applications that run on it are written in and talk to each other using) but the owners of the infrastructure (the platform operators) will have ultimate control over the nature of the applications and how they work. To use an analogy, if you own the railway track, you control the type of trains that can travel on it.
Via one means or another, a small range of platform operators will increasingly be able to call the shots and require vendors to make their products work with them, or those products just won’t be accessible to their target market. Therefore, to support this assimilation of DAM into the ‘Big Data’ trend currently in progress, DAM systems will need to gravitate towards these platforms, becoming less visible as independent solutions and more like software cogs in an overall marketing technology mechanism.
Although it is apparently a misquote, when Thomas Watson (CEO of IBM in the 1940s) said “I think there is a world market for maybe five computers”, if you were to substitute ‘computer platforms’ for ‘computers’ and speculate that they might have names like Amazon, Google, Apple, Facebook or Microsoft, then the prediction seems to have gone full circle and, once more, looks plausible again.
In a world of mobile thin clients where, before long, very little may be stored within a corporation’s own network (let alone on a PC it owns), those who think their DAM is a ‘Big Data’ system might have accidentally made a more astute observation than they initially realised.
About the author
Ralph Windsor is a senior partner in Digital Asset Management implementation consultants, Daydream, and has worked in the DAM industry for 18 years as a developer, project manager and now consultant.