Improving DAM Interoperability In 2017
DAM products have improved a lot in the last few years: They are now cloud-enabled and more user-friendly, they support video and let us share content better than ever.
Integrating the DAM with other systems has also gotten easier because most systems have APIs by now. But point-to-point integrations are still the rule, where each integration between any two systems requires a software developer to write code tailored to them.
Let’s look at the importance of interoperability for Digital Asset Management, what interoperability means in practice, and how we can improve it.
DAM Systems As Content Hubs
Business processes involving creative content and other digital assets are crossing IT system boundaries all the time. To help the organization derive value from its digital assets, the DAM needs to enable automated, effortless data flows between systems.
Last year, Theresa Regli predicted that 2016 would “see the term ‘content hub’ emerging”. She was right: Many DAM vendors are now using it to market their products. (Examples: ADAM, Stylelabs, Widen, WoodWing.) It’s a term that makes sense: DAM systems have evolved from almost dead-end “image archives” into central services which gather digital assets from many sources, make them findable, and then route the assets to other people through various connected systems. By definition, interoperability is a content hub’s most important feature.
Jeff Lawrence wrote about a conversation with Bynder CEO Chris Hall: “My understanding […] is that traditional DAM was an archive, but today DAM is an important component of a larger ecosystem of connected tools that must have the ability to work together. Without an integrated solution, a siloed DAM is essentially a lost opportunity for the business.”
Interoperability Is A Two-Way Street
DAM interoperability is a two-way street: In addition to DAM-specific data being routed to other systems, or between DAM systems, “foreign” data – product data, customer data, Web analytics data – is also commonly held within the DAM to provide context for its assets. (This is one of the points discussed in Ralph Windsor’s DAM And The Politics Of Metadata Integration.) We need to consider this in our discussions.
Web-Connected By Default
Another quick detour before we dive into the specifics of DAM interoperability:
Our understanding of information system integration has evolved in the last decade. The idea of the Web as a globally interconnected space has become true: Employees and collaborators work from anywhere, and software is moving into the cloud. An increasing number of digital assets originate online – on mobile, connected devices – and are managed exclusively for the purpose of sharing them online. Intranet-only, on-premises setups are becoming the exception, while the norm is a Web-connected DAM system that has the full potential to connect digital assets with any number of people and systems.
This has implications for the technology our systems are using to communicate: Copying files between network shares doesn’t work on the Web, and FTP is not a good choice either.
Use Cases And Operations
Now let’s look at some real-life DAM interoperability use cases.
The most common ones involve content stored in the DAM system being handed over to a publishing system: to a Web CMS or social media sites (Facebook, Twitter, YouTube) for Web publishing, or to InDesign, editorial systems or catalog production software for print production.
Then there are scenarios where the DAM and other internal systems need to integrate data for search, reporting or other business processes: The DAM system may need product data from a PIM (Product Information Management) system to allow searching assets by products. A museum’s CMS (Collection Management System) or a company’s CRM (Customer Relationship Management) system may want to display images stored in the DAM.
Often, digital assets are automatically uploaded into the DAM system: Other systems use the DAM for archiving their content, or external content suppliers (like news and photo agencies) feed assets into the DAM for production usage.
In other cases, the DAM system connects to external services to outsource or enhance its functionality, calling a cloud service to transcode video files or generate preview images, or integrating image recognition “AI” products.
In each case, systems communicate with each other, performing one or several operations. While these operations can be complex and special, there are typical patterns. Let’s assign names to them to help us talk about interoperability in a specific and systematic way:
- “Referencing an item”: A system stores a reference to an item maintained within another system. Example: A DAM system stores a product ID (which comes from a PIM system) along with an asset.
- “Linking to an item”: A system’s UI displays a link to an item in another system’s UI. Example: A DAM system links to a Facebook post where a DAM-hosted image is used.
- “Embedding a file”: A system displays a file by embedding a URL from another system which points to a media file, possibly with parameters for a tailor-made, dynamically generated file variant (see IIIF).
- “Listing items”: A system retrieves a list of items from another system, along with a subset of item metadata. Example: A Web CMS fetches an RSS feed from a DAM, with asset metadata and thumbnail preview URLs.
- “Searching items”: Like “listing items”, but with the ability to pass search terms, sort order etc. as parameters to request a tailor-made, dynamically generated list.
- “Reading item data”: A system retrieves a single item’s data (including metadata and file references) from another system.
- “Updating an item”: A system appends or replaces item data in another system. Example: A newspaper production system updates photo usage data within a DAM system.
- “Creating an item”: A system creates a new item within another system. Example: A news agency system creates an asset within a DAM system by FTPing it a JPEG image file with embedded metadata.
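To make the operation names above a bit more concrete, here’s a minimal Python sketch of how some of them might look as an interface. Everything in it is hypothetical: the `Item` record, the `InMemoryDam` class and its method names are illustrative assumptions, not any vendor’s actual API, and the example.com URL is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Item:
    """A hypothetical asset record: an ID, a file URL, and free-form metadata."""
    item_id: str
    file_url: str
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryDam:
    """Toy stand-in for a DAM system that keeps its items in a dictionary."""

    def __init__(self) -> None:
        self._items: Dict[str, Item] = {}

    def create_item(self, item: Item) -> None:
        # "Creating an item": refuse to overwrite an existing ID
        if item.item_id in self._items:
            raise ValueError(f"item {item.item_id!r} already exists")
        self._items[item.item_id] = item

    def read_item(self, item_id: str) -> Optional[Item]:
        # "Reading item data": return one item, or None if the ID is unknown
        return self._items.get(item_id)

    def update_item(self, item_id: str, changes: Dict[str, str]) -> None:
        # "Updating an item": merge the changed fields into the metadata
        self._items[item_id].metadata.update(changes)

    def list_items(self) -> List[Item]:
        # "Listing items": all items, in a stable order
        return sorted(self._items.values(), key=lambda item: item.item_id)

    def search_items(self, term: str) -> List[Item]:
        # "Searching items": case-insensitive match on any metadata value
        needle = term.lower()
        return [
            item for item in self.list_items()
            if any(needle in value.lower() for value in item.metadata.values())
        ]


dam = InMemoryDam()
dam.create_item(
    Item("a1", "https://dam.example.com/files/a1.jpg", {"caption": "Harbor at dawn"})
)
# "Referencing an item": store a product ID that comes from a PIM system
dam.update_item("a1", {"product_id": "PIM-4711"})
```

In a real integration, each of these methods would be a call across a system boundary, which is where the questions in the next section come in.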
The Software Developer’s Perspective
References, links, and file embeds might be possible without having to write integration code – see David Diamond’s Integrating DAM with CMS without an Integration. But in many cases, you’ll need help from a software developer to connect your systems.
The scenarios and operations described above should be possible to implement, assuming the intended process and data flow is specified, and the systems involved either provide an API (Application Programming Interface) out of the box, or can be extended to communicate with other systems.
But making everything work together smoothly is easier said than done. The developer has to figure out a lot of details. Here are some of the technical aspects to consider when connecting software systems:
- Which direction is it going to be: Will software on the side of the DAM system actively push data to another, “passive” system, or will the DAM system remain passive, with the other side pulling data from it? Communication could also take place in both directions, or a third party (like Zapier) could play the active part.
- Is there a live, synchronous connection between both systems using an HTTP API, or will data be transferred by means of files copied back and forth (possibly via FTP)?
- Which syntax (XML, JSON) and data format (SOAP, RDF, RSS, NewsML, XHTML, JSON-LD) is used for data exchange?
- Which data structures and field names (also known as the “schema”) are used? For example, does the API or file format have the notion of an “image”, and is the image caption in a field named “CAPTION”, “Headline”, “title”, or “h1”?
- How are assets and records identified – is there an “ID” or URL field, and is the identifier persistent and globally unique?
In fact, there are dozens more questions that need to be answered before the software developer can craft a connection between any two systems.
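As a rough illustration of the schema question alone, here’s a small Python sketch that looks for an image caption under the differently named fields mentioned above. The function name and the two sample records are assumptions made up for this example.

```python
from typing import Dict, Optional

# Field names under which different systems might deliver an image caption
CAPTION_ALIASES = ("CAPTION", "Headline", "title", "h1")


def normalize_caption(record: Dict[str, str]) -> Optional[str]:
    """Return the first non-empty caption found under any known alias."""
    for name in CAPTION_ALIASES:
        value = record.get(name, "").strip()
        if value:
            return value
    return None


# Two hypothetical records describing the same image in different schemas
from_system_a = {"CAPTION": "Harbor at dawn", "ID": "42"}
from_system_b = {"title": "  Harbor at dawn ", "url": "https://dam.example.com/assets/42"}

print(normalize_caption(from_system_a))
print(normalize_caption(from_system_b))
```

Without a shared schema, a mapping like this has to be written and maintained for every pair of systems, and for every field, not just the caption.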
The Cambrian Explosion Of DAM Integrations
The sad thing is that when the implementation is done, the developer will have to start almost from scratch when asked to integrate the same DAM with another system. Integrations often aren’t reusable. That’s because all DAM systems, and most systems the DAM needs to work with, have different answers to the questions listed above.
So each and every DAM vendor re-implements integrations with the same systems. Let’s say that a mainstream DAM product should work with Sitecore, WordPress, Clarifai, SharePoint, Salesforce, and YouTube. That’s just six different integrations. But there are lots of DAM vendors out there. If just ten DAM vendors each connect their products to these six systems, that’s sixty custom-implemented integrations already!
Coding each integration from scratch is a terrible waste of time, money, and developer motivation. What we need is, in the words of Mike Amundsen, (generic) interoperability, not (point-to-point) integration.
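The arithmetic can be sketched in a few lines of Python. The point-to-point figure is the one from the example above. The second figure rests on a simplifying assumption of mine: that with a shared interoperability standard, each product would only need to implement that standard once.

```python
def point_to_point(dam_vendors: int, target_systems: int) -> int:
    """Every DAM vendor builds a custom integration for every target system."""
    return dam_vendors * target_systems


def shared_standard(dam_vendors: int, target_systems: int) -> int:
    """Assumption: every product implements one shared standard exactly once."""
    return dam_vendors + target_systems


print(point_to_point(10, 6))   # 60
print(shared_standard(10, 6))  # 16
```

The gap widens with every additional vendor or system, since one count grows as a product and the other as a sum.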
The Role Of Current DAM Standards
That’s not to say there aren’t any standards in DAM: The DAM Directory lists quite a few of them. Why is interoperability still so hard?
One problem is that these are mostly metadata standards, focusing on the representation of a single digital asset’s metadata. API operations – how to search the DAM and get a list of assets back, and how to retrieve that metadata over the Internet – are out of scope for standards like IPTC Photo Metadata. Most DAM vendors support “IPTC metadata”, and still, each DAM API looks and behaves differently.
The traditional focus on metadata embedded within image files isn’t a good match for many typical use cases (though it’s still amazing to receive an image file which carries lots of useful metadata). Consider a DAM/WCMS integration: From within the Web CMS, the user searches the DAM, sees a list of assets with a thumbnail image and a few metadata fields, and then picks one from the list which is going to be served directly from the DAM system. None of these operations require, or benefit from, transferring (potentially large) image files with embedded metadata.
Current DAM standards also don’t cover all of the Ten Core Characteristics of a DAM – for example, workflows, collections and custom metadata fields aren’t well-standardized yet.
In 2014, the OASIS standards body initiated the development of a new DAM standard called CMIS4DAM, responding to “the ongoing DAM interoperability crisis” (see Ralph Windsor’s Introduction to CMIS4DAM). Andreas Mockenhaupt described it as “something along the lines of a universal API, allowing those DAM systems that are compliant to easily gain access to the metadata that they need without performing integration work.” But judging from Ray Gauss II’s call for help a year ago, and the silence on the mailing list, work on CMIS4DAM seems to have ceased. (Disclaimer: The DAM vendor I work for considered joining the committee, but the high cost of an OASIS membership stopped us from participating.)
Considering CMIS4DAM’s lack of success so far, and the recent closing down of the DAM Foundation, it seems most vendors are not deeming it important to work together for the greater good of the DAM ecosystem. Our products’ interoperability shortcomings simply mirror this fact (see Conway’s law). To quote Ralph Windsor’s harsh critique from 2014: “The DAM industry is guilty of self-obsessed and narcissistic behavior or (at best) an apathetic and fatalistic attitude that assumes interoperability is someone else’s problem which might never get solved anyway.”
Semantic Web Technology As A Possible Solution
It’s not just that better DAM standards are unlikely to arrive soon: Even the best DAM-specific standard would address only half the problem because interoperability is a two-way street, and “foreign” data needs to be exchanged as well. When connecting a DAM to a PIM system, would it help to have rivaling DAM interoperability and PIM interoperability standards?
Generic, standardized mechanisms for exchanging structured data, more helpful than “let’s use (any) XML” but less rigorous than an industry-specific standard, could help us bridge diverse systems. That’s what Semantic Web technologies were meant to do: replicate the human-readable Web’s success for structured, machine-readable data by providing a generic language for structured data (RDF), using URIs/URLs as identifiers and links, and making data access as simple as visiting a URL the way a Web browser does. For more details, see the DAM Guru webinar on DAM and the Semantic Web by Margaret Warren, Demian Hess and myself.
The term “Linked Data”, which is often used in the Semantic Web context, highlights the special mindset required for Web-scale interoperability: You don’t start with integrating systems. Instead, you invest in data quality and interconnectedness, and publish that data on the Web – not through a custom-built API, but in standard formats. To illustrate the Linked Data approach, think of all the software that lets you add hyperlinks when writing text: You can link Web pages to Wiki pages to Google Docs documents to JIRA issue tracker tickets without “integrating” any systems because the links live in the HTML data. That’s possible with structured data, too.
While the initial vision of a Semantic Web full of autonomous software agents, doing our shopping and booking our flights, has not (yet?) been fully realized, the core standards and technology are stable and usable. They would provide clear answers to some of the developer’s questions we looked at earlier: Assets are identified by URLs, data is accessed using HTTP connections, and RDF offers several standardized formats. And we could even connect asset descriptions in the DAM to public datasets like DBpedia using the same technology. To get an impression of how DAM functions can be mixed with Semantic Web concepts, take a look at my co-contributor’s ImageSnippets product.
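To give a feel for what this looks like in practice, here’s a minimal Python sketch that describes one asset as RDF triples and writes them out in N-Triples, one of the standardized RDF formats. The asset URL under dam.example.com is made up, the choice of Schema.org properties is just one possibility, and the hand-rolled serializer only handles the simple cases needed here (a real project would more likely use an RDF library).

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str                  # URI of the thing being described
    predicate: str                # URI of the property
    obj: str                      # URI, or literal text if obj_is_literal is True
    obj_is_literal: bool = False


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples as N-Triples, one statement per line."""
    lines = []
    for t in triples:
        obj = f'"{escape_literal(t.obj)}"' if t.obj_is_literal else f"<{t.obj}>"
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines) + "\n"


ASSET = "https://dam.example.com/assets/42"  # hypothetical asset URL

triples = [
    # The asset is an image ...
    Triple(ASSET, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
           "http://schema.org/ImageObject"),
    # ... with a human-readable name ...
    Triple(ASSET, "http://schema.org/name", "Harbor at dawn", obj_is_literal=True),
    # ... and it is about a city identified by a public DBpedia URI.
    Triple(ASSET, "http://schema.org/about", "http://dbpedia.org/resource/Hamburg"),
]

print(to_ntriples(triples))
```

Note that the link to DBpedia is nothing more than a URL in the data. No integration with DBpedia had to be written to create it.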
The Schema.org Vocabulary
Choosing Semantic Web technology doesn’t answer the question of which data structures and field names we could standardize on – there are lots of Linked Data-compatible vocabularies, including IPTC and XMP (Adobe’s XMP is built on top of RDF). But let’s look at a particular one:
The Schema.org vocabulary is a Semantic Web success story. The large Web search engine vendors (Google, Bing, Yahoo, Yandex) are increasingly interested in indexing not just HTML text, but also structured data, so they can better search for and list structured information like product offers and recipes. These vendors agreed on a shared vocabulary (which is constantly being extended, in an open, W3C-assisted community process), and on standardized ways to embed structured data in Web pages so the search engine crawlers can find it. The data model is fully conformant to RDF, and the Linked Data formats RDFa and JSON-LD can be used for embedding. (It’s pretty cool that the SEO guys who add Schema.org data to their Web sites unknowingly contribute to building the Semantic Web.)
Schema.org data on Web sites is often a dumbed-down version of the original data. While the vocabulary is pretty extensive, it doesn’t cover all the complexities of each industry’s data model. But it doesn’t have to: It just needs to be good enough for search and result display purposes. Which, coming back to our original topic, is exactly what we need for many DAM interoperability use cases (referencing, linking, reading, listing, searching items). A DAM is a search engine, after all (plus a few extra features, of course). In my opinion, this congruence makes Schema.org (and RDF or JSON-LD) a good starting point for Linked Data-based DAM interop.
The Schema.org vocabulary could be an answer to the question of which data model and field names to use. We could work with ready-made standards and technology, be part of a wider ecosystem and contribute to a living, well-known vocabulary (which even has an extension mechanism for, say, DAM-specific stuff like file variants and renditions).
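Here’s a minimal Python sketch of what a “listing items” or “searching items” response could look like when expressed with Schema.org terms in JSON-LD. The Schema.org types and properties (`ItemList`, `ListItem`, `ImageObject`, `contentUrl`, `thumbnailUrl` and so on) are real; the shape of the internal asset record, the function names and the example.com URLs are assumptions for the sake of illustration.

```python
import json
from typing import Dict, List


def asset_to_jsonld(asset: Dict[str, str]) -> Dict[str, str]:
    """Map a hypothetical internal asset record to a Schema.org ImageObject."""
    return {
        "@type": "ImageObject",
        "@id": asset["url"],                      # the asset's own URL is its identifier
        "name": asset["title"],
        "caption": asset["caption"],
        "contentUrl": asset["file_url"],
        "thumbnailUrl": asset["thumbnail_url"],
        "encodingFormat": asset["mime_type"],
    }


def search_result_to_jsonld(assets: List[Dict[str, str]]) -> Dict[str, object]:
    """Wrap a list of assets in a Schema.org ItemList."""
    return {
        "@context": "https://schema.org",
        "@type": "ItemList",
        "numberOfItems": len(assets),
        "itemListElement": [
            {"@type": "ListItem", "position": n, "item": asset_to_jsonld(a)}
            for n, a in enumerate(assets, start=1)
        ],
    }


assets = [
    {
        "url": "https://dam.example.com/assets/42",
        "title": "Harbor at dawn",
        "caption": "Container ships in the harbor at sunrise",
        "file_url": "https://dam.example.com/files/42.jpg",
        "thumbnail_url": "https://dam.example.com/thumbs/42.jpg",
        "mime_type": "image/jpeg",
    }
]

print(json.dumps(search_result_to_jsonld(assets), indent=2))
```

A Web CMS that understands these few Schema.org terms could display such a list without knowing anything else about the DAM that produced it.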
Let’s Work This Out Together
I might be wrong about the suitability of Semantic Web technology for DAM interoperability. Maybe we should reboot the CMIS4DAM efforts, or use some other approach not mentioned here. Ralph Windsor’s 2013 article on The Building Blocks Of Digital Asset Management Interoperability provides a good overview of the options.
But I’m sure we can improve on the rather miserable state of DAM interoperability if we join forces. We’re all going to benefit. How are you willing to contribute? Let’s talk; I’m looking forward to your comments!