Standards and Metadata
by Lisa Grimm, MA, MS-LIS
While librarians love global standards and useful metadata, even within the traditional library, we can be confronted with a less-than-consistent institutional approach to those standards, and even wider variation in the tools used to maintain good metadata. That’s especially true in the DAM world, where things like Dublin Core or MeSH can sound like mysterious codes, or even foreign languages, to someone who didn’t attend library school. And if you’ve come into the field from a ‘straight tech’ or marketing route, you may feel you already know everything you need to about metadata – it’s always been important for search and SEO, and few people knew or cared about standards there, right? On the flip side, degreed librarians may throw their hands up in dismay at how different DAM vendors approach metadata management – those global standards can be difficult to implement, even with the best of intentions. Let’s try to clear up the picture.
Types of Metadata
As a DAM professional, you already know the value of good metadata – you can’t find or properly manage your assets without it. You’re already using a variety of flavors of metadata within your DAM – some of it is descriptive, to help power your DAM’s search capability; some is administrative, so you can track an asset’s usage and rights, while those free-text fields may serve as a catch-all for everything that didn’t quite fit – or as a workaround for something your system doesn’t do without considerable tinkering. NISO likes to add a third broad category for structural metadata, but that’s (usually) less relevant to a DAM – it may be called upon to drive display or layout of a page, printed or otherwise. If you’ve spent a lot of time working with XML or ePub files, you’ll know what structural metadata looks like, but it’s less generally applicable to your images, illustrations and videos, at least within the DAM itself – they may certainly end up being called or described in those files in the wild. Whether you knew it or not, you have an in-house metadata model, and you may want to refine it or change it altogether.
Controlled Vocabularies & Benefits
Librarians love controlled vocabularies, and tend to wax lyrical about their favorites, like the Library of Congress Subject Headings (LCSH). But for our purposes here, a controlled vocabulary can be as simple as a picklist in a dropdown menu.
If you can pre-populate your DAM’s metadata fields with commonly-used terms and names that make sense for your DAM, you can reduce the scope for user error, thus ensuring that you keep your assets easily findable – no typos or three different names for the same agency or product (though more on related terms in a moment). Of course, that may not be as easy as it should be with your system – but we’ll look at some strategies there in a moment as well.
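To make the picklist idea concrete, here is a minimal sketch in Python of how a controlled vocabulary can catch stray spellings before they reach your assets. The agency names and the function names are hypothetical, and a real DAM would enforce this through its own dropdown configuration rather than through a script.

```python
from typing import Optional

# Hypothetical picklist of approved agency names (illustrative values only).
APPROVED_AGENCIES = ["Acme Creative", "Northlight Studio", "Blue Harbor Media"]


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(value.lower().split())


# Map each normalized form back to the one approved spelling.
_LOOKUP = {normalize(term): term for term in APPROVED_AGENCIES}


def to_controlled_term(value: str) -> Optional[str]:
    """Return the approved spelling for a value, or None if it is not on the list."""
    return _LOOKUP.get(normalize(value))


print(to_controlled_term("  acme   creative "))  # matches despite case and spacing
print(to_controlled_term("Acme Creativ"))        # a typo is rejected, not stored
```

Values that come back as None are the ones worth a human look: they are either typos or candidates for a new approved term.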
Another benefit of going with an existing standard is interoperability with other systems: if your DAM ties into other systems, be they for rights management, HR, licensing or translation, using the same internationally-recognized standards for your metadata model may make everyone’s lives easier as digital objects travel across your technology ecosystem.
First off, there are a lot of standards out there. Committees have spent unpaid months and years creating and refining them, and most of the time, they’ve ended up with a pretty sensible set of terms for their given brief – no matter how specialized your assets are, one of the existing standards is probably a good fit, at least as a baseline, so there’s no need to start from scratch. We’ll look closely at the more generally applicable ones, and then mention a few specialized options.
Dublin Core or, more fully, the Dublin Core Metadata Initiative (DCMI), has been around in one form or another since 1995, when it was first mooted to help give more structure to web resources to make them findable – something any DAM professional can empathize with. The ‘Dublin’ in question here isn’t the one in Ireland, but rather Dublin, Ohio, where the initial workshop, sponsored by OCLC and NCSA, was held. OCLC, known by its initials to any library professional, maintains (among other things) WorldCat, the global catalog that stores data from libraries in more than 170 countries. NCSA produced the first widely-adopted browser, Mosaic, which would eventually be reborn, phoenix-like, as Mozilla Firefox – but we digress.
Getting up to speed on Dublin Core is easy. (There are regular webinars on the DCMI site, but they may be more in-depth than what you need if you’re just beginning to implement some basic metadata standards.) You can learn a lot just by looking at some Dublin Core in action, whether it’s expressed in XML or in the metadata fields your DAM has already.
The beauty of Dublin Core is that it’s nearly endlessly extensible, though its core of 15 top-level elements, now known as the Dublin Core Metadata Element Set, is broadly applicable to almost any digital object. They will look like (at least vaguely) familiar metadata fields to most DAM users. Indeed, some systems have nothing much further than free text fields with these labels when they first arrive, out of the box:
Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights
But these fields, and the large variety of other Dublin Core descriptive terms available, may be used differently in different DAM solutions. And not every field, even of the core fields, is relevant to your particular assets. So it’s all about customization; we’ll dig into that below.
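If you’d like to see what Dublin Core looks like when expressed in XML, the sketch below builds a small record using Python’s standard library. The namespace URI is the one DCMI publishes for the 15-element set; the wrapper element name `metadata`, the function name and the sample values are assumptions made for illustration, and your DAM’s own export format will almost certainly differ.

```python
import xml.etree.ElementTree as ET

# Namespace published by DCMI for the 15-element Dublin Core set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dublin_core_record(fields: dict) -> str:
    """Serialize a dict of Dublin Core fields as a simple XML record.

    The <metadata> wrapper is a hypothetical container, not part of the standard.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"{name!r} is not one of the 15 core elements")
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; not every element needs to be used.
record = dublin_core_record({
    "title": "Spring campaign hero image",
    "creator": "Example Agency",
    "date": "2014-03-01",
    "format": "image/jpeg",
    "rights": "Licensed for web use only",
})
print(record)
```

Note that the record uses only five of the fifteen elements, which is perfectly legitimate: every element in the core set is optional.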
On the face of it, XMP sounds fantastic – you can embed (much of) the metadata you need right into your digital object! You can even use Dublin Core or another existing standard as the starting point. But actually implementing XMP as a standard for your DAM can be tricky, unless you have total control over the creative process from start to finish, since XMP is generally embedded via Adobe Photoshop or Bridge (XMP began life at Adobe, after all). Getting agencies to understand and follow your ‘rules’ isn’t always as straightforward as it should be, and while some DAMs do let you add XMP to assets, often, you need to rely on whoever created the file – and even then, it may not apply itself to every file type.
Another question to consider is whether your DAM’s search can index XMP – is that information being used by your system, or is it lost in the ether? That said, XMP can still be useful, even if it’s not powering your DAM’s search results. Licensing and other rights information can be built in and tracked throughout the asset’s life cycle, provided, of course, the XMP actually travels along with the file as advertised. As of this writing, there is certainly potential, but it may be more trouble than it’s worth for most time-crunched DAM administrators.
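One rough way to check whether XMP is actually travelling with a file is to look for the packet yourself. The sketch below, in Python, scans raw bytes for the `x:xmpmeta` wrapper and pulls out any Dublin Core rights statement. It assumes the packet is stored as uncompressed UTF-8 text, which is common but not guaranteed for every file type, and the function names and sample bytes are invented for illustration; a dedicated XMP library would be the more robust choice in production.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_xmp_packet(data: bytes) -> Optional[str]:
    """Return the first <x:xmpmeta> block found in the raw bytes, or None.

    Naive byte scan: assumes the packet is plain, uncompressed UTF-8.
    """
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_tag = b"</x:xmpmeta>"
    end = data.find(end_tag, start)
    if end == -1:
        return None
    return data[start:end + len(end_tag)].decode("utf-8", errors="replace")


def dc_values(xmp: str, element: str) -> List[str]:
    """Collect the rdf:li values stored under a Dublin Core element."""
    root = ET.fromstring(xmp)
    items = root.findall(f".//dc:{element}//rdf:li", NS)
    return [li.text for li in items if li.text]


# Synthetic stand-in for a file: binary noise around a tiny XMP packet.
sample = (
    b"\xff\xd8\xff\xe1 binary image data "
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b'<dc:rights><rdf:Alt><rdf:li xml:lang="x-default">'
    b"(c) Example Agency</rdf:li></rdf:Alt></dc:rights>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b" more binary image data"
)

packet = extract_xmp_packet(sample)
if packet is not None:
    print(dc_values(packet, "rights"))
```

Running a check like this on files before and after they pass through an agency or another system is a quick test of whether the rights information survives the trip.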
Other Standards – MeSH, ULAN, AAT, etc…
Even if you do opt for Dublin Core (or a Dublin Core-light) approach, you may want to seek out some of the more specialized options that exist. If your DAM supports medical or pharmaceutical assets, MeSH may be useful. For art-related collections, ULAN and AAT are incredibly thorough. There are many other unique standards, and in most cases, you can use them as a sort of ‘bolt on’ to your main underlying metadata model.
Customization and Implementation – the ‘How’
Once you have (you think) settled on a model, the real work begins – figuring out how to actually get your chosen model into the system, and how you want to approach applying it to your assets, whether they are newly-imported or legacy files. And while some DAMs will let you test and preview changes within the system, that’s more the exception than the rule, so we’ll assume for our purposes here that much of the upfront work will need to be done outside the system – then we’ll move on to implementation.
1. Analyze existing data:
- Does your DAM store user search terms, abandoned searches and user journeys through the system? This is wildly useful in refining your model, especially if you want to use, say, Dublin Core, but you notice that your users don’t seem to employ terms like Creator or Contributor. If collapsing those two fields into something more like ‘Agency’ or ‘Photographer’ works better, that’s great information.
- Are there metadata fields that are left consistently blank? It may be that you don’t need them, or that their purpose isn’t understood and that they need to be re-labeled.
- Do you have free-text fields that would be better served with drop-downs (e.g. list of agency names, products, countries)? Make note of them before you move on to the next stage.
2. Avoid metadata overkill:
- More isn’t always better. Not only do you need to make sure your fields are properly filled out, but if you have too many search terms, you may not get granular enough results.
- Just because a field exists in Dublin Core (or another existing standard) doesn’t mean you need to use it, or to use it in the ‘preferred’ way. If something else works better for your organization, feel free to make changes; just be consistent in your approach.
- Consider the maintenance ramifications if you do use a large number of fields – what happens if you need to modify them? This may be only a minor consideration for some DAMs, and a huge lift for others.
3. Plot out your proposed changes:
- Hit the spreadsheets! Before doing anything else, list your current metadata fields and any controlled vocabularies (whether they are in a dropdown or maintained elsewhere).
- On another tab, list your would-be changes, and note how they map to, or replace, existing fields. Color-coding can be very helpful.
- On a third tab, list any fields you want to remove entirely. If you know how many assets they may apply to, add that information. Also list net new fields. You may have this listed on your second tab, but it can be helpful to see it at a glance, especially when you move on to the next phase.
4. Get feedback:
- Talk to your users! Take time to walk through your proposed changes with some key users, and modify your spreadsheets accordingly.
- Card sorting exercise. You can do this in person with some of your users, or conduct a virtual card sort if your team is spread out geographically. There are a number of sites that offer free trials to their card sorting tools, or, if you have the budget, it can be well worth exploring in more depth. Knowing how your users categorize your assets – at least in very high-level groups – can tell you what you need to improve about your model. It will also highlight areas of confusion, and is a great way to test whether a particular field is of any use at all, or if it needs to be re-named. You can use your spreadsheets as a starting point.
5. Test & Implement:
- Get your new fields and drop-downs into your DAM, but keep it to a staging environment at first. Again, this step may be minor, or a very complex exercise, depending on your software and configuration.
- Perform user acceptance testing (UAT): ask users to test drive the modifications to the system to see if your hunches about useful terms and fields were correct.
- If UAT went well, and the metadata mapped to your existing assets as expected in your testing environment, you’re ready to push those changes live!
- Let your users know that change is afoot – give them a heads-up in advance, and as the changes roll out. Whether that’s with a notification in your system, an email alert or a personal communication, let them know that you’re working to make use of the DAM easier for them.
- Ensure it’s a two-way street – do they have an easy way to let you know they need help, or if they have suggestions for your next round of changes?
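Two of the steps above lend themselves to a quick script if you can export your asset records: spotting the consistently blank fields from step 1, and trying out the field mapping from step 3 before it goes anywhere near staging. The Python sketch below assumes each asset is a plain dictionary of field names to values; the field names, sample records and function names are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def fill_rates(assets: List[dict], fields: List[str]) -> Dict[str, float]:
    """Fraction of assets that have a non-empty value for each field."""
    total = len(assets)
    filled = Counter()
    for asset in assets:
        for field in fields:
            if str(asset.get(field) or "").strip():
                filled[field] += 1
    return {f: (filled[f] / total if total else 0.0) for f in fields}


def apply_field_map(asset: dict, field_map: Dict[str, Optional[str]]) -> dict:
    """Rename fields per field_map; a target of None drops the field.

    When two old fields collapse into one, the first non-empty value wins.
    """
    migrated = {}
    for old_name, value in asset.items():
        new_name = field_map.get(old_name, old_name)
        if new_name is None:
            continue
        if not migrated.get(new_name):
            migrated[new_name] = value
    return migrated


# Illustrative export: two assets using Dublin Core-style labels.
assets = [
    {"Title": "Logo", "Creator": "Example Agency", "Contributor": "", "Coverage": ""},
    {"Title": "Banner", "Creator": "", "Contributor": "Example Agency", "Coverage": ""},
]

rates = fill_rates(assets, ["Title", "Creator", "Contributor", "Coverage"])
print(rates)  # fields near 0.0 are candidates for removal or re-labeling

# Proposed change: collapse Creator and Contributor into Agency, drop Coverage.
field_map = {"Creator": "Agency", "Contributor": "Agency", "Coverage": None}
migrated = [apply_field_map(a, field_map) for a in assets]
print(migrated)
```

A dry run like this is only as good as the export behind it, but it can surface surprises (two fields holding conflicting values, say) while they are still cheap to fix in the spreadsheet.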
But My DAM Won’t Let Me Change It (Easily)!
It’s all well and good to think about how your metadata model will work in an ideal world, but you may have a DAM that makes such changes hugely cumbersome. You are not alone. While some DAMs have been thoughtfully designed with the user – administrative or otherwise – in mind, others make changing your metadata model extremely difficult.
If you’re one of the lucky ones, adding or modifying metadata fields can be done through your user interface – you’ll just want to ensure you have a governance process in place so that only administrators (or other trusted users) can make changes to your fields. You may even have a handy taxonomy management tool built in that will let you create related terms, ensuring that your users who search for ‘soccer’ also find ‘football’ if that’s what they were expecting. Many systems even let your users add their own tags to assets, and you can ensure good metadata hygiene by regularly reconciling these crowdsourced tags with ‘approved’ terms.
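If your DAM lacks a taxonomy tool, the two ideas in this paragraph, related terms and tag reconciliation, are simple enough to prototype. The Python sketch below is an illustration only: the related-terms table, the approved tag list, the asset records and the function names are all made up, and a real system would keep these in its own taxonomy store.

```python
# Hypothetical related-terms table; searching either term should find both.
RELATED_TERMS = {
    "soccer": {"football"},
    "football": {"soccer"},
}

# Hypothetical list of approved tags.
APPROVED_TAGS = {"soccer", "football", "stadium"}


def expand_query(term):
    """Return the search term plus any related terms, all lowercased."""
    key = term.strip().lower()
    return {key} | RELATED_TERMS.get(key, set())


def search(assets, term):
    """Return ids of assets tagged with the term or one of its related terms."""
    wanted = expand_query(term)
    return [a["id"] for a in assets if wanted & {t.lower() for t in a["tags"]}]


def reconcile_tags(user_tags):
    """Split user-supplied tags into (approved, needs_review), both sorted."""
    cleaned = {t.strip().lower() for t in user_tags if t.strip()}
    return sorted(cleaned & APPROVED_TAGS), sorted(cleaned - APPROVED_TAGS)


assets = [
    {"id": "A1", "tags": ["Football", "stadium"]},
    {"id": "A2", "tags": ["tennis"]},
]

print(search(assets, "soccer"))                          # finds the 'Football' asset
print(reconcile_tags(["Soccer", "footy", " stadium "]))  # 'footy' needs review
```

The tags that land in the review pile are useful either way: some are typos to clean up, and others are terms your users clearly want added to the approved list.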
Other forward-thinking DAM vendors let you edit metadata in bulk. While it seems that this should be a standard feature, it’s noticeably absent in quite a few solutions, so it adds to your slate of maintenance projects when you need to do it manually (or if you need to write a script to make it happen). Adding a field that needs to be applied to thousands of assets, or modifying one that’s already in use with an equally large number, is very straightforward in some DAMs, but can be a huge project requiring considerable IT support in others.
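Where a script is the only route to a bulk edit, it often boils down to something like the Python sketch below, which works on a CSV export rather than on any particular DAM’s API. The column names, the sample export and the function name are assumptions for illustration; whether your system can export and re-import CSV at all is something to confirm with your vendor.

```python
import csv
import io


def bulk_set_field(csv_text, field, value, only_if_blank=True):
    """Set `field` to `value` on every row of a CSV export.

    Adds the column if it does not exist. With only_if_blank=True,
    rows that already have a value for the field are left untouched.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if field not in fieldnames:
        fieldnames.append(field)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if not only_if_blank or not (row.get(field) or "").strip():
            row[field] = value
        writer.writerow(row)
    return out.getvalue()


# Hypothetical export with one blank Region value.
export = "id,Title,Region\n1,Logo,\n2,Banner,EMEA\n"

print(bulk_set_field(export, "Region", "Global"))   # fills only the blank row
print(bulk_set_field(export, "Rights", "Licensed"))  # adds a net new column
```

Defaulting to only_if_blank=True is a deliberate design choice here: overwriting values that someone has already curated is the kind of mistake that is hard to undo across thousands of assets.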
Most seem to sit somewhere in the middle: in many DAM solutions, it may require a bit of front-end scripting to make those changes, or even a full-blown dive into back-end programming. If you’re managing one of the more cumbersome systems out there, and making changes is something that needs to be its own project, you’ll quickly run into an even-more-pressing need for governance. Which leads us to the next potential problem (or opportunity).
But I’m Always Making Changes!
Regular maintenance is the key. You’ll find all manner of best practices, but you’ll need to decide what works best for your DAM. Do you have quarterly reviews of your metadata model? Are you constantly adding new keyword terms to keep up with new content types or products? Could you group those more efficiently in a standard field? Are they not easily findable as they are tagged now? Most importantly, what terms do your users actually employ?
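That last question can be answered with data if your DAM stores search queries. As a minimal sketch, assuming you can export the log as a simple list of query strings, the Python below counts what users type and surfaces the popular terms that don’t appear in your current vocabulary. The log contents, vocabulary and function name are hypothetical.

```python
from collections import Counter


def unmatched_search_terms(search_log, vocabulary, top_n=10):
    """Return the most frequent search terms that are not in the vocabulary.

    Comparison is case-insensitive and ignores surrounding whitespace.
    """
    vocab = {v.strip().lower() for v in vocabulary}
    counts = Counter(q.strip().lower() for q in search_log if q.strip())
    misses = [(term, n) for term, n in counts.most_common() if term not in vocab]
    return misses[:top_n]


# Illustrative log: users keep searching for terms the model doesn't use.
search_log = ["Photographer", "photographer", "Creator", "agency", "Agency ", "agency"]
vocabulary = ["Creator", "Contributor"]

print(unmatched_search_terms(search_log, vocabulary))
```

A report like this makes a handy standing agenda item for a review meeting, since it shows where your users’ language and your model have drifted apart.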
In short, you’ll want to come up with a variation on the following steps:
- Create a metadata governance team – build in a regular cadence to meet with key users and stakeholders, and keep communication lines open.
- Stick to your review schedule – don’t let maintenance become eclipsed by other projects.
- Determine technical challenges – if changes to your model are always going to be a high level of effort, can they be coupled with other technical projects (e.g. upgrades, UI changes)?
- Test and re-test with your users: yes, it takes time, but it’s always a worthwhile exercise.
- Communicate: let your users know beforehand if you’re making major changes, and make sure you help them navigate them when they go live.
The perfect metadata model is always a moving target. But even as an ongoing work-in-progress, using existing standards can help simplify the process of determining your core fields, and how you want to use them in your DAM. Still, never be afraid to deviate from a standard if it simply doesn’t make sense for your organization, as long as you maintain a consistent approach. You can create your own in-house standards when no others fit the bill, but you can avoid reinventing the wheel for a goodly portion, simply by exploring the metadata standards landscape. It’s partially a well-signposted journey, but certainly requires some traveling off the path!
About Lisa Grimm
While in grad school for archaeology, Lisa Grimm fell into a career as a web developer (back before HTML had tables), and bounced from London to Silicon Valley, then on to NYC and Philadelphia, focusing ever-more on content and digital assets as she worked in tech, government and publishing. Midway through her career, she went to library school to obtain an MS-LIS degree, and left ‘straight’ tech to work in DAM for a number of libraries, archives and museums. She’s back on the corporate side now, serving as Content Librarian for GSK, where she oversees the company’s DAM ecosystem, taxonomy and metadata standards.
Lisa has been a DAM Guru Program member since February of 2014. Connect with her on LinkedIn.
DAM Guru Program recognizes this article as worthy of the #LearnDAM designation for materials that provide genuine digital asset management education without sales agendas. Search #LearnDAM on Google for more materials. This post originally appeared on the DAM Guru Blog.