One of the changes in DAM system technology over the last five years has been the increased use of cloud storage providers to hold asset files. Previously, most DAM solutions relied on a file storage method that worked more or less like a big hard disk, even if it was actually a separate network attached device dedicated to that task. While this change is a lower-level technical one, its implications are more far-reaching than is generally acknowledged.
Non-technical readers might not understand how all this works and why they need to be interested. For their benefit, I will attempt to explain how cloud storage operates and why it is significant. If you use a cloud hosting provider, there are two storage methods that can be employed. One of them is something called a ‘block storage’ device, which to all intents and purposes works like the aforementioned hard disk or network drive. This is faster, but more expensive in cloud environments, and tends to get used to hold core facilities the DAM system relies on just to function (e.g. databases, operating systems etc). The other method is usually called ‘object storage’ (although there are various alternative terms for it) and this is a service that is entirely separate from the DAM system and exists purely to hold data like files. Some examples of well-known object-based cloud storage providers include Amazon S3, Google Cloud Storage and Microsoft Azure Storage, and there are a number of other options on the market now also (which I will discuss later).
When a DAM user downloads an asset from a DAM system that is fully integrated with a cloud storage provider, the file is served to them in a manner that appears virtually indistinguishable from the ‘big hard disk’ approach. The difference is that the system does not hold the asset file locally, nor even on the same network; instead, it redirects the user’s browser to the cloud storage location where the asset’s file resides. There are three separate entities involved: the DAM system, the user (or ‘client’ to use the technical term) and the storage provider, each of which is independent of the others. All can (and usually do) exist in completely separate locations that might be thousands of miles apart. The DAM system acts as a broker or intermediary, connecting the user with the file they require, which is held in the cloud storage repository. This characteristic reduces the costs for the vendor and also enables them to scale up storage a lot more easily – in fact to an almost unlimited extent that is only restricted by the total capacity of the storage service provider they use.
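The brokering step described above can be sketched in a few lines. This is a minimal illustration, not any particular vendor’s or provider’s implementation: the host name `storage.example.com`, the secret key, the query parameter names and the function names are all hypothetical, and real providers each define their own URL-signing scheme. It shows the DAM looking up which stored object an asset maps to, then issuing a short-lived signed link and an HTTP redirect so that the browser fetches the file directly from the storage provider.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical values, for illustration only.
STORAGE_HOST = "https://storage.example.com"
SECRET_KEY = b"shared-secret-between-dam-and-storage"

# The DAM database holds pointers to objects, not the files themselves.
ASSET_LOCATIONS = {"asset-1001": "brand/logo-master.tif"}


def sign(object_key: str, expires_at: int) -> str:
    """Return an HMAC-SHA256 signature over the object key and expiry time."""
    message = f"{object_key}:{expires_at}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def build_redirect(asset_id: str, now: int, lifetime_seconds: int = 300):
    """Return the (status, headers) pair the DAM would send to the browser."""
    object_key = ASSET_LOCATIONS.get(asset_id)
    if object_key is None:
        return 404, {}
    expires_at = now + lifetime_seconds
    query = urlencode(
        {"expires": expires_at, "signature": sign(object_key, expires_at)}
    )
    return 302, {"Location": f"{STORAGE_HOST}/{object_key}?{query}"}


def storage_accepts(object_key: str, expires_at: int, signature: str, now: int) -> bool:
    """The storage side's check: link not expired and signature matches."""
    if now > expires_at:
        return False
    return hmac.compare_digest(signature, sign(object_key, expires_at))
```

Note that the asset’s bytes never pass through the DAM in this arrangement; it only hands out the pointer, which is why the pointer data becomes so important later in this discussion.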
As a general rule, cloud storage services are usually more reliable than stand-alone storage hardware devices you can buy yourself from hardware vendors, since the providers will replicate the data across several different locations and devices, so even if one is lost, the others will have a copy. Note that it isn’t, ipso facto, the case that because the storage is cloud-based, it is more reliable, only that most providers have deemed it worth their while to offer that benefit to encourage users to deposit data with them (and you can choose some lower-grade options which are less durable if you want to save money). Anyhow, so far, so good: we’re all getting cheaper and more scalable storage to hold our digital assets, so what can be wrong with that? Essentially, nothing, but as with most things in IT (and life in general), where there are benefits, there are some risks to trade off as well, and these might increase as the cloud storage market opens up to a wider range of competing vendors.
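A rough back-of-the-envelope calculation shows why replication helps. The sketch below assumes, purely for illustration, that each copy is lost independently and with the same probability over some period. Real failures are often correlated (for example, a fault affecting a whole data centre), so the result should be read as a best case rather than a provider guarantee. The function name and the 1% figure are my own assumptions.

```python
def probability_all_copies_lost(per_copy_loss: float, copies: int) -> float:
    """Chance that every replica is lost, assuming independent failures."""
    if not 0.0 <= per_copy_loss <= 1.0:
        raise ValueError("per_copy_loss must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return per_copy_loss ** copies


# Illustrative figure only: a 1% chance of losing any single copy in a year.
for copies in (1, 2, 3):
    risk = probability_all_copies_lost(0.01, copies)
    print(f"{copies} copies: {risk:.6%} chance of losing the asset")
```

Under these assumptions, each extra copy cuts the risk a hundredfold, which is also why a lower-grade option with fewer replicas can be noticeably less durable.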
To fully understand the risks, it is necessary to consider what the term ‘cloud’ really means. As a description, it is simultaneously highly unscientific, in that it invokes images of wispy cotton wool shapes gliding across clear blue skies, and also accurate, because clouds are opaque and you can’t see what is either inside or behind them. ‘The Cloud’ where your digital assets may reside is likely to be one or more buildings that resemble aircraft hangars in relatively nondescript locations and are full of servers, with maybe a couple of bored security guards hanging around. Most people usually grasp this (if asked to seriously consider what the cloud actually is) but the term permits them to sidestep the true nature of the service they are consuming, so expressions like ‘the files are up in the cloud’ (and similar) will often get used, even though the phrase bears no relationship to reality. The Free Software Foundation Europe have run a marketing campaign featuring a poster with the words: ‘There is no Cloud, just other people’s computers’. This is based on articles and speeches by Richard Stallman, and it is a better description of the nature of the arrangement when you use a cloud service.
I should emphasise that although I think his description is correct, I am not opposed to the cloud and I don’t necessarily agree with everything Richard says (in general or in relation to the cloud, specifically). Cloud services offer a considerable opportunity to solve a whole slew of IT challenges that until recently were expensive and complicated to deal with if you were required to handle all this yourself. I agree with him, however, that the poetic description of the service and its fashionable status in IT circles currently can encourage some to throw away the due diligence textbook and fail to apply some fairly basic, best-practice principles to ensure their digital assets are not more at risk than they should be. To put this in simple terms: if you use a cloud storage service, you place your assets in someone else’s custody and you need to be reasonably sure you can get them back when you need to.
One further development in the storage market which I have been reading about recently is cloud storage being increasingly regarded as a commodity. For most end users, this appears entirely reasonable because one gigabyte of storage is much like any other, right? If all the providers are gradually standardising on common protocols, data storage superficially has what economists call ‘fungible’ characteristics, i.e. storage with one provider can be easily exchanged for storage someone else is offering. I am not sure if this already exists now, but it seems likely that before long someone will come up with an exchange to buy/sell data storage like other commodities, and orders could get filled by a variety of different service providers. Just as in other kinds of commodity exchanges for metals, grain, oil etc, there are likely to be third parties who earn a living buying and selling in volume to service this requirement also; in other words, there will be a supply chain.
If this kind of asynchronous transaction model becomes the norm, I would expect more DAM system vendors to want to connect into it since an exchange provides access to more favourable pricing and the potential to gain a competitive advantage as a result. The trade-off is that as the owner of digital assets who is a de-facto customer of one of these upstream service providers (albeit one re-sold through your DAM vendor) you may not know with any certainty exactly where in the world your digital assets are.
As discussed earlier, there are some developments in cloud computing that might open up the field to a wider range of prospective suppliers. An open source cloud platform called OpenStack now exists, and its object storage can be accessed using conventions compatible with those of Amazon S3 (the most ubiquitous cloud storage provider). That allows more or less anyone with the available server resources to set themselves up in the cloud business with the knowledge that their users can utilise these conventions to gain access to data and use services. This is a double-edged sword. On the one hand, the increased competition helps keep prices low; on the other, it makes it simpler for the unscrupulous to participate, to the detriment of everyone else (as well as anyone unfortunate enough to become one of their customers).
Previously in this article, I referred to storage having superficially fungible characteristics. This is because there is more to the provision of storage facilities than capacity alone: for example, the reliability of the devices used, or any special geographical characteristics of the location where they are held (e.g. a propensity for hurricanes or other extreme weather), to name just two. While what you get in terms of perceivable value as a user seems to be the same (i.e. some space where you can upload your files), there are a variety of other aspects which are not fully transparent, and these make comparison more difficult unless you are prepared to carry out a much deeper and more prolonged investigation. I gather there is another economics concept called value engineering, the essence of which is that you manufacture a cheaper product that, to the untrained eye, appears identical to a more expensive option but uses lower cost materials etc, creating an opportunity either to generate greater profit or to discount without affecting margins. As it stands now, the cloud storage market appears to offer some scope for over-aggressive value engineering at the expense of reliability. While Amazon, Microsoft, Google etc might not want to risk the PR damage this strategy could inflict upon them, a more buccaneering provider who was eager to acquire customers might be more inclined to take greater risks. At present, you won’t easily know if your DAM vendor has been enticed to use such a provider by the lower costs they offer.
A further issue for DAM solutions using this kind of heterogeneous storage environment is the range of data storage locations used. As I described earlier, a fully cloud-based DAM is already like a broker for a cloud storage provider because the two are de-coupled from each other: the DAM only points to a location where the asset binary data is held. In the near future, DAM systems might dynamically use a multiplicity of storage suppliers selected automatically on a more or less ad-hoc basis from this (as yet) fictional data storage exchange I have just invented. There could be potentially hundreds of providers, depending on what rules were employed to filter them. This might present some challenges when migrating from one system to another, especially if that migration has been forced upon the end user because of the commercial failure of the DAM vendor who used to service their digital asset management needs.
I can see how the process for doing this might be achieved if everything was working as it should: an export file would be created with a list of the locations where each asset is stored, and then scripts etc written to incrementally copy each asset based on where it was currently located. This seems straightforward enough, providing the DAM database still exists, but if that is lost or becomes irretrievably corrupted, the pointers to the locations where asset files are stored might be difficult to work out. In other words, the asset files would be stored on computers somewhere in the world, but no one would know exactly where. It is somewhat fatalistic to suggest the asset files would then be lost forever, but we are clearly in uncharted territory in that sort of scenario.
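Assuming the DAM database is intact, the export-and-copy process just described might look something like the following sketch. The CSV column names (`asset_id`, `provider`, `location`), the `fetchers` and `store` callables and the function names are all hypothetical; a real migration would substitute each provider’s own client library for the stand-in fetch functions.

```python
import csv
import io
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Fetch = Callable[[str], bytes]
Store = Callable[[str, bytes], None]


def read_manifest(export_text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Group (asset_id, location) pairs by the provider holding them."""
    by_provider: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(export_text)):
        by_provider[row["provider"]].append((row["asset_id"], row["location"]))
    return dict(by_provider)


def migrate(export_text: str, fetchers: Dict[str, Fetch], store: Store) -> List[str]:
    """Copy every asset we can reach; return the IDs that could not be copied."""
    failed: List[str] = []
    for provider, assets in read_manifest(export_text).items():
        fetch = fetchers.get(provider)
        for asset_id, location in assets:
            if fetch is None:
                failed.append(asset_id)  # no way to talk to this provider
                continue
            try:
                store(asset_id, fetch(location))
            except Exception:
                failed.append(asset_id)  # record it and carry on with the rest
    return failed
```

Returning the list of failures rather than stopping at the first error is a deliberate choice here: with potentially hundreds of providers, the assets that cannot be reached are exactly the ones you most need to know about.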
One factor which might help to make the cloud storage market a lower risk proposition for DAM users is greater transparency. DAM software already operates within a supply chain itself; not only does data traverse DAM solutions, but the applications themselves are becoming infrastructures to support enterprise digital asset logistics. If that is the case, then DAM users need full transparency about where their assets are held and what third party supplier nodes they pass through. Achieving this goal, while not entirely straightforward, is feasible. In practical terms, this means that if you use a cloud DAM solution, you should be able to get regular reports about the exact location of all of your assets, in the same way as you can with something like web site log files. These should be records that users can access on demand and audit, if they wish. That provides some assurance that asset files can be retrieved even if the DAM itself is lost.
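As a sketch of what such a report could contain, the following summarises where assets are held and flags any that sit with a supplier the customer has not approved. The record fields (`asset_id`, `provider`, `region`), the approved-provider list and the function name are hypothetical and are not taken from any existing DAM product.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def location_report(
    records: Iterable[Dict[str, str]], approved_providers: Set[str]
) -> Dict[str, object]:
    """Summarise asset locations and list assets held by unapproved providers."""
    per_location: Counter = Counter()
    unapproved: List[str] = []
    for record in records:
        per_location[(record["provider"], record["region"])] += 1
        if record["provider"] not in approved_providers:
            unapproved.append(record["asset_id"])
    return {
        "assets_per_location": dict(per_location),
        "unapproved_assets": sorted(unapproved),
    }
```

A report along these lines, kept outside the DAM itself, would also double as a fallback record of the storage pointers if the DAM database were ever lost.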
When cloud-related discussions have come up in the past on DAM News (and elsewhere) they tend to turn into boxing matches between cloud fans and those opposed to the whole idea. I don’t find either camp a very appealing one to join. The facts are that if you want to build out IT infrastructures at an economic cost these days, you need to use cloud technologies and providers of one kind or another to do it. On the other hand, to not properly acknowledge the risks and just bank on safety in user numbers seems like an ill-considered and highly flawed strategy also. As someone who buys cloud services for my own firm’s use and also advises my clients on what to purchase, I want an optimum balance of all the benefits on offer but with as many of the risks mitigated as is feasible. You can only begin to do that by having a complete understanding of what can go wrong now and in the future, as well as solid and rehearsed plans of how you will get out of trouble if it ever occurs.