Digital files contain a wealth of information in the embedded metadata, information that some people share unknowingly. PHOTO: Markus Spiske on Unsplash

Stripping embedded metadata from digital files may be the most misunderstood technical search engine optimization (SEO) “best practice” in IT today. Embedded metadata is metadata stored inside the file itself that can provide important historical, usage rights or reference information about a digital file.

But are there valid reasons for wanting to strip embedded metadata? And why might you not want to strip embedded metadata?

6 Reasons to Retain Embedded Metadata

Stripping embedded metadata is a hot topic, but the best practice may vary depending on your field. Librarians, archivists, digital asset managers and information professionals are keenly aware of the importance of embedded metadata. Embedded metadata does many things, but let’s break it down to understand the full picture.

Here are six arguments for why you would not want to strip embedded metadata from your digital files:

  1. It can offer insight into a file's history and provenance. Embedded metadata helps you understand the history and provenance of digital files. It has many other uses as well, such as discovery and retrieval, rights management, integrity and longevity.
  2. It's a tool for copyrighting intellectual works. Embedded metadata can signal key copyright or usage rights information. Embedded metadata may be fragile, but you can use it to mark your creative work.
  3. It can help preserve digital works for the future. Embedded metadata provides context for digital files. It’s also evidence of a company's intellectual property (IP) and can be used to aid in digital preservation efforts.
  4. Preserving it helps you avoid legal repercussions. You can get sued for removing embedded metadata. In 2016, a German photographer successfully sued Facebook, arguing that the company violated German copyright law when it removed embedded metadata from his photographs upon upload to the social network's platform.
  5. It may be a myth that stripping embedded metadata improves load times. Stripping embedded metadata from digital files on your websites has little effect on load times. (See myth number three here.)
  6. It can help optimize SEO in the future. Search engines could potentially index embedded metadata in the future, as Matt Cutts suggested in 2014. EXIF (exchangeable image file format) embedded metadata was briefly added to Google Images, although it does not appear there today.
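
The load-time claim in item 5 can be checked on your own files. The sketch below walks the header segments of a JPEG and totals the bytes taken up by segments that commonly hold embedded metadata. It is a simplified illustration rather than a full parser: the function name `metadata_footprint` and the marker list are assumptions made for this example, and it ignores rarer cases such as padding bytes between segments.

```python
import struct

# Marker bytes that commonly carry embedded metadata in a JPEG.
# This list is an assumption for the example and is not exhaustive.
METADATA_MARKERS = {
    0xE1: "APP1 (Exif or XMP)",
    0xED: "APP13 (Photoshop/IPTC)",
    0xFE: "COM (comment)",
}


def metadata_footprint(data: bytes) -> dict:
    """Return the number of bytes used by each metadata segment type."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    sizes = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in METADATA_MARKERS:
            name = METADATA_MARKERS[marker]
            sizes[name] = sizes.get(name, 0) + 2 + length
        pos += 2 + length
    return sizes
```

Calling `metadata_footprint(open("photo.jpg", "rb").read())` (with `photo.jpg` standing in for one of your own files) and comparing the totals with the file's overall size shows how much a page would actually save by stripping.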

6 Reasons to Strip Embedded Metadata

But there are also reasons for methodically stripping this data from files. Technical SEO experts, marketers and IT risk management experts all want to optimize their websites for "findability," but they also want to close any holes that could be exploited to compromise private data or launch malicious attacks. Browse the blog posts and whitepapers on optimizing digital files for SEO and you’ll find the same advice repeated over and over: strip embedded metadata. Why is that?

Here are six arguments why you would want to strip embedded metadata from your digital files:

  1. It may help improve load times. You want digital files to load fast on your website because it may help you rank higher in search results. We know that higher search rankings mean greater visibility, more visitors and potentially more sales/conversions/leads. This is a primary reason why website administrators and technical SEO experts are stripping metadata from digital files.
  2. It helps protect sensitive information. You probably don’t want everyone to know where you live or where you took certain photographs, or possibly even which tools you used to create a specific digital file. The University of Michigan reminds us that you share metadata when you share files and offers tips for removing metadata. An example of how metadata can reveal sensitive information occurred when the online magazine Vice accidentally gave away the location of a wanted man, John McAfee, when it didn't remove metadata from a photo.
  3. It can help safeguard you against hacks. Cross-site scripting through EXIF embedded metadata once caused potential vulnerabilities, but that threat appears to be resolved now. However, with the growing occurrences of vulnerabilities and hacks, it’s important to stay up to date on what puts you at risk.
  4. It can aid in quality control. Embedded metadata is not always high-quality metadata. There may be times when you want to remove incorrect or erroneous metadata from your digital files.
  5. It may help protect anonymous sources. Whistleblowers who want to hand in confidential documents or journalists who want to protect their sources should remove embedded metadata from digital files if they are concerned about preserving the secrecy of their identities, their organizations' identities and other information.
  6. It can protect you if your files are subject to a digital forensic investigation. There are also other less scrupulous, possibly more uncommon reasons for stripping embedded metadata. For example, people may want to avoid being identified as part of a digital forensic investigation. Consider the Russian troll farm implicated in efforts to influence the 2016 U.S. election, and the fact that an embedded metadata breadcrumb on a document from the International Press Foundation (IPF) helped cybersecurity experts determine where the document had originated.
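
To make "stripping" concrete, here is a minimal sketch of what it means for a JPEG: copy the file segment by segment and leave out the segments that usually hold embedded metadata. The function name `strip_jpeg_metadata` and the set of markers to drop are assumptions chosen for illustration. A dedicated metadata tool will handle more formats and edge cases than this does.

```python
import struct

# Segments dropped by this example: APP1 (Exif/XMP), APP13 (Photoshop/IPTC)
# and COM (comment). The choice of markers is an assumption, not a standard.
STRIP_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG without its common metadata segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything from here on is image data,
            # so copy the rest of the file unchanged.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in STRIP_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found; file may be truncated")
```

Note the trade-off this makes visible. Dropping the APP1 segment removes GPS coordinates and camera details, but it also removes any copyright notice and the orientation tag stored there, so the picture may display rotated and lose its rights information.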

To Strip or Not to Strip? It Depends

Deciding whether or not to strip embedded metadata from digital files is not as cut and dried as some may think. You must consider important factors, such as the need to preserve context and rights information, while simultaneously taking into account the need for privacy protections and security risk mitigation.

Your perspective will depend greatly on what discipline or field you are in. If you’re still on the fence about the importance of embedded metadata, check out the Embedded Metadata Manifesto. As part of that initiative, researchers surveyed major social media websites to see whether they retain or strip embedded metadata from users' digital files. The sites audited include giants such as Facebook and Twitter, and popular photo-sharing websites such as Flickr and Instagram.
