Cleaning Up SharePoint Legacy Content
The idea of cleaning up legacy SharePoint content is daunting. Organizations often place cleanup under the “Nice to Do” column as opposed to the “Must Do” column.

Why not leverage in-house resources? Legacy SharePoint cleanup is a perfect task for the Records Management (RM) department. Reviewing data and applying retention to it are two of our key responsibilities. 

Why Records Management is Right for this Job

This project is an opportunity for records management to demonstrate its worth to the organization. We live in an age of extreme expectations, and records management faces a tough, uphill battle because its return on investment is long-term. This project will result in a significant drop in storage costs and is the right thing for the company – especially from a compliance perspective. Even if the records director is competing against departments that provide a faster and higher rate of return, a compelling argument can be made for legacy SharePoint content cleanup. The records director should gain positive exposure from the project’s results.

Success of this project depends partially on whether or not the records director has an “abandoned records” policy. While records management’s direct ownership of content is eroding in today’s corporate landscape, organizations still expect us to manage data in the traditional sense. Authorizing the records director to shoulder ownership of legacy content (from any storage platform) frees the original content owner or the department surrogate and Information Technology (IT) staff from the decision-making process. An abandoned records policy may extend the authority of the records director, but this approach finally identifies a responsible party and expedites the rate of progress. We want our CIO happy.

How to Tackle the Task Ahead

The typical duration of a legacy SharePoint content cleanup project is two to three months. Make no mistake: because the records management department has too much to do, the records director should hire a consultant to assist with the legacy SharePoint content review. At most, 25 percent of one full-time records staff member’s time should be devoted to this project. IT staff can provide records (and therefore, the consultant) with reports on SharePoint objects that typically contain the following metadata:

  • Object Title
  • Site Address
  • Owner User Name
  • Object File Extension
  • Object File Type
  • Size       
  • Creation Time
  • Last Access Date/Time
  • Last Modified Date/Time
  • Days since Last Accessed
  • Days since Last Modified
  • Days since File Creation 

Because the metadata is so descriptive, the consultant needn’t have access to the original objects (corporate information secured: check!). The records consultant should start with the easy work: objects that have current owners and paths should be reviewed first. This will afford the records director and the consultant an opportunity to adjust their project approach should any nasty surprises arise. The consultant should determine whether these sites have migrated recently, because a migration changes the date/time stamps attached to the objects. The records consultant will compare the subject content of each object (often inferred from the site address) to the records retention schedule. Additional records-related columns (for example, “trigger event,” “retention to be changed” or “retention met”) will further illustrate the fate of each object.
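The review order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report is assumed to be a CSV export whose column names match the metadata list, and the retention schedule, site paths and retention periods shown here are hypothetical placeholders, not a real schedule.

```python
import csv
import io

# Hypothetical retention schedule (an assumption for illustration):
# site path segment -> (records series, retention period in years).
RETENTION_SCHEDULE = {
    "accounting": ("Accounting Records", 7),
    "hr": ("Personnel Records", 6),
}

# Hypothetical sample of the IT report, using column names from the metadata list.
SAMPLE_REPORT = """Object Title,Site Address,Owner User Name,Days since Last Modified
Expense Form v1,https://intranet.example.com/sites/accounting/forms,jdoe,3000
Old Memo,https://intranet.example.com/sites/hr,,900
Team Photo,https://intranet.example.com/sites/social,asmith,100
"""


def classify(row, schedule=RETENTION_SCHEDULE):
    """Return a copy of a report row with records-related columns added."""
    # Infer the subject from the site address by matching whole path segments.
    segments = row["Site Address"].lower().split("/")
    series, years = next(
        (schedule[seg] for seg in segments if seg in schedule),
        ("Unclassified", None),
    )
    # Caveat: a recent site migration can reset these date stamps.
    days_idle = int(row["Days since Last Modified"])
    out = dict(row)
    out["Records Series"] = series
    out["Has Owner"] = bool(row["Owner User Name"].strip())
    out["Retention Met"] = years is not None and days_idle >= years * 365
    return out


def triage(rows):
    """Put objects with a current owner first, then the longest-idle objects."""
    classified = [classify(r) for r in rows]
    return sorted(
        classified,
        key=lambda r: (not r["Has Owner"], -int(r["Days since Last Modified"])),
    )


def load_report(text):
    """Parse the CSV report text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    for r in triage(load_report(SAMPLE_REPORT)):
        print(r["Object Title"], r["Records Series"], r["Retention Met"])
```

Only metadata is read here, which matches the point that the consultant does not need access to the original objects. A "Retention Met" value produced this way is a starting point for review, not a disposition decision.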

Note: this project is an opportunity for records to identify department relationships and content dependencies ahead of the work breakdown structure. For example, the forms posted on accounting’s “old” site may still be used by select departments and each department may request a copy prior to deletion. Storing multiple versions of old forms in multiple places is not the best choice for the organization; the records director should have the right conversations with the right audiences to propose better forms management. The results should be documented in the content and project management tools.

Learning Opportunities

The project baseline will answer many questions:

  1. Objects per site
  2. Site deltas and growth
  3. Total percent and number counts/extension per department site - deltas as well as growth
  4. Duplicate (or more) object choices:
    • Choose a location
    • Retain both
    • Titled similarly, but different
    • Flagged for upload to SharePoint Next
  5. SharePoint site growth:
    • Recent
    • Annual
    • Five years
  6. Typical name per file type, numbers and percentage
  7. Records series counts per department site
  8. Number and percentage of duplicates, and their locations
  9. Days since last accessed and last modified: numbers and percentages, with times
  10. Unapproved extensions in SharePoint based on policy (e.g., “.msg”)
  11. High volume corporate records owners

These responses should be tracked at regular intervals and reported up to senior leadership.
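A few of the baseline figures above (objects per site, percentages per extension, duplicates and their locations, unapproved extensions) could be computed from the same metadata report along these lines. This is a minimal sketch, not a prescribed method: the sample rows are invented, the ".msg" entry comes from the example above while treating the policy as a simple set is an assumption, and matching duplicates on title plus size is only a heuristic that cannot confirm identical content.

```python
from collections import Counter, defaultdict

# Extensions disallowed by policy; ".msg" is the article's example,
# the set itself is a hypothetical stand-in for the real policy.
UNAPPROVED_EXTENSIONS = {".msg"}

# Invented sample rows using column names from the metadata report.
SAMPLE_ROWS = [
    {"Object Title": "Budget.xlsx", "Site Address": "/sites/accounting",
     "Object File Extension": ".xlsx", "Size": "2048"},
    {"Object Title": "Budget.xlsx", "Site Address": "/sites/finance-archive",
     "Object File Extension": ".xlsx", "Size": "2048"},
    {"Object Title": "Re: invoice.msg", "Site Address": "/sites/accounting",
     "Object File Extension": ".msg", "Size": "512"},
    {"Object Title": "Policy.docx", "Site Address": "/sites/hr",
     "Object File Extension": ".docx", "Size": "1024"},
]


def baseline(rows):
    """Compute a handful of baseline metrics from metadata rows."""
    total = len(rows)
    per_site = Counter(r["Site Address"] for r in rows)
    per_ext = Counter(r["Object File Extension"].lower() for r in rows)
    ext_pct = {ext: round(100 * n / total, 1) for ext, n in per_ext.items()}

    # Heuristic: same title and same size in more than one place is a
    # *possible* duplicate that a person still needs to review.
    locations = defaultdict(list)
    for r in rows:
        key = (r["Object Title"].lower(), r["Size"])
        locations[key].append(r["Site Address"])
    duplicates = {k: v for k, v in locations.items() if len(v) > 1}
    dup_count = sum(len(v) for v in duplicates.values())

    unapproved = [
        r["Object Title"] for r in rows
        if r["Object File Extension"].lower() in UNAPPROVED_EXTENSIONS
    ]
    return {
        "objects_per_site": dict(per_site),
        "extension_percent": ext_pct,
        "duplicate_locations": duplicates,
        "duplicate_count": dup_count,
        "duplicate_percent": round(100 * dup_count / total, 1) if total else 0.0,
        "unapproved": unapproved,
    }


if __name__ == "__main__":
    for name, value in baseline(SAMPLE_ROWS).items():
        print(name, value)
```

Running the same calculation against each periodic report would give the deltas and growth figures that are tracked at regular intervals.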

Your organization may have terabytes of content in SharePoint. Send the objects without clear ownership over to records management. Cleaning up SharePoint should be a task listed within the SharePoint governance plan. Leverage your in-house resources – let records management help.


Editor's Note: Read more of Mimi's thoughts on SharePoint in Metadata Solves Your SharePoint Content Management Problems