Image by: BrianAJackson, ©2016 Getty Images
Classification is a key requirement for effective IG and FACR—unstructured content, once classified, becomes structured and, therefore, findable, useable and manageable throughout its life cycle. Structuring shared drives using classification will move the organization a long way toward IG, but the use of content management systems, including properly deployed SharePoint, brings the greatest degree of operational effectiveness and life cycle control to achieve formal enterprise IG.
The goal of shared drive remediation is to migrate clean content to a system or standard classification so that it can be found, used and managed through its life cycle. FACR solutions are many and varied; the kinds of content, the ultimate outcomes desired, volume of content and cost will help determine the options available to your organization.
FACR systems have varying capabilities:
- Metadata analysis looks only at the file system (and/or SharePoint) metadata (properties)
- Text analytics further refines categorization of content and also targets personally identifiable information (PII) and identifies high-value content
- Image analysis groups like-images using graphical pattern matching; it does not require optical character recognition (OCR)
- Archive solutions perform the above analytics but also ingest target content into their repository for ongoing classification, analysis, discovery, hold and disposition
- Some solutions are tightly integrated with SharePoint information architecture for bi-directional updates of taxonomies and metadata
- Some solutions offer e-discovery, email migration and classification term-extraction
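To make the first capability in the list concrete, here is a minimal Python sketch of metadata-only analysis. It is an illustration under stated assumptions, not how any particular FACR product works: the function names and record fields are hypothetical, and only standard file system properties are read.

```python
import os
from collections import Counter
from datetime import datetime, timezone


def scan_metadata(root):
    """Collect file system properties only; file contents are never opened."""
    records = []
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            info = os.stat(path)
            records.append({
                "path": path,
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            })
    return records


def count_by_extension(records):
    """Summarize how many files of each type the scanned share holds."""
    return Counter(record["extension"] for record in records)
```

Calling count_by_extension(scan_metadata(root)) on a shared drive root gives a quick profile of file types before any content-level analysis is attempted.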
While manual analysis or Excel spreadsheets can be useful for a high-level analysis of content, acting on content is a much greater challenge without a FACR solution. There are five main purposes for FACR solutions:
1. Discover and cleanse content
Analyze content, group it within classification schemes, remediate redundant, outdated or trivial (ROT) content and purge or quarantine content. This task can be completed across very high volumes of content and across multiple repositories for broad normalization of content. In addition, workflow is used for human identification of groups of content that cannot be automatically classified. Most FACR solutions use artificial intelligence (AI) to constantly improve classification accuracy; others require a "document corpus" to train the engine on the concepts it must recognize. Extracted metadata can be rationalized and validated.
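A simplified way to picture ROT detection is sketched below in Python. The thresholds and extension list are assumptions chosen for illustration, and exact-duplicate hashing stands in for the far richer matching that commercial tools apply; the function names are hypothetical.

```python
import hashlib
import os
import time

# Assumed rules for illustration; real values come from your IG policies.
TRIVIAL_EXTENSIONS = {".tmp", ".bak", ".log"}
OUTDATED_AFTER_DAYS = 7 * 365


def sha256_of(path, chunk_size=65536):
    """Hash file contents in chunks so large files do not exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flag_rot(paths, now=None):
    """Return {path: reason} for files that look redundant, outdated or trivial."""
    now = time.time() if now is None else now
    first_seen = {}
    flagged = {}
    for path in sorted(paths):
        info = os.stat(path)
        extension = os.path.splitext(path)[1].lower()
        if info.st_size == 0 or extension in TRIVIAL_EXTENSIONS:
            flagged[path] = "trivial"
        elif now - info.st_mtime > OUTDATED_AFTER_DAYS * 86400:
            flagged[path] = "outdated"
        else:
            content_hash = sha256_of(path)
            if content_hash in first_seen:
                # Same bytes as an earlier file: an exact duplicate.
                flagged[path] = "redundant"
            else:
                first_seen[content_hash] = path
    return flagged
```

Flagged files would normally go to a quarantine or review queue rather than being deleted outright, so that a person can confirm the disposition.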
Another valuable analysis task identifies migration issues dependent on the target system; for example, file names or document types not supported by SharePoint, encrypted files, password protected files, undocumented file extensions, etc. These anomalies can be queued in workflow for review or quarantined prior to initiating migration activities.
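A pre-migration check of this kind might look like the following Python sketch. The invalid-character set, path length limit and list of known extensions are assumptions for illustration and should be confirmed against the target system's current documentation; detecting encrypted or password-protected files requires format-specific inspection and is left out.

```python
import os

# Assumed rules for illustration; confirm against the target system's documentation.
INVALID_NAME_CHARS = set('"*:<>?/\\|')
KNOWN_EXTENSIONS = {".docx", ".xlsx", ".pptx", ".pdf", ".txt", ".msg"}
MAX_PATH_LENGTH = 400


def migration_issues(relative_path):
    """Return a list of reasons the file needs review before migration."""
    issues = []
    name = relative_path.rsplit("/", 1)[-1]
    bad_chars = sorted(INVALID_NAME_CHARS.intersection(name))
    if bad_chars:
        issues.append("invalid characters in name: " + " ".join(bad_chars))
    extension = os.path.splitext(name)[1].lower()
    if not extension:
        issues.append("no file extension")
    elif extension not in KNOWN_EXTENSIONS:
        issues.append("undocumented extension: " + extension)
    if len(relative_path) > MAX_PATH_LENGTH:
        issues.append("path longer than %d characters" % MAX_PATH_LENGTH)
    return issues


def triage(paths):
    """Split paths into those ready to migrate and those queued for review."""
    ready, review = [], {}
    for path in paths:
        issues = migration_issues(path)
        if issues:
            review[path] = issues
        else:
            ready.append(path)
    return ready, review
```

The review dictionary maps each problem file to its list of issues, which is the sort of information a workflow queue would present to a reviewer.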
2. Identify sensitive data or business-critical data
Products that leverage text analytics use regular expressions (regex) to find Social Security numbers, credit card numbers and other PII, or to locate tags that are critical to a business, such as contracts and intellectual property.
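As a rough illustration of the regex approach, the Python sketch below looks for U.S. Social Security numbers in the common 123-45-6789 layout and for candidate credit card numbers, which are then filtered with the Luhn checksum to cut false positives. The patterns and function names are simplified assumptions; production tools use broader patterns plus contextual validation.

```python
import re

# Simplified patterns, assumed for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def passes_luhn(digits):
    """Luhn checksum: double every second digit from the right, then sum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_pii(text):
    """Return (kind, matched_text) pairs for likely SSNs and card numbers."""
    hits = [("ssn", match.group()) for match in SSN_PATTERN.finditer(text)]
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if passes_luhn(digits):
            hits.append(("credit_card", match.group()))
    return hits
```

A 16-digit order number that fails the Luhn check is ignored, while a well-formed card number is reported, which is why a checksum step is worth adding on top of the raw pattern match.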
3. Migration of content
Once you have clean, classified content, it can be migrated, using business rules and considering IG policies, to a new, properly classified shared drive, an enterprise content management (ECM) solution, SharePoint or another repository. Content that is questionable can be queued in workflow for human analysis, and content with sensitive data can be migrated to quarantine, waiting for further analysis and action.
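The routing described above can be sketched as a small set of business rules in Python. Everything here is hypothetical: the record fields, the 0.80 confidence threshold and the target folder names are assumptions chosen to show the shape of the logic, not settings from any real product.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration; real rules come from your IG policies.
REVIEW_THRESHOLD = 0.80
TARGETS = {
    "contract": "Legal/Contracts",
    "invoice": "Finance/Invoices",
}


@dataclass
class ClassifiedFile:
    path: str
    category: Optional[str]    # None when the engine could not classify the file
    confidence: float          # classifier confidence between 0 and 1
    has_sensitive_data: bool


def route(item):
    """Decide where a classified file goes: quarantine, review or a target."""
    if item.has_sensitive_data:
        return "quarantine"
    if item.category not in TARGETS or item.confidence < REVIEW_THRESHOLD:
        return "review"
    return TARGETS[item.category]
```

Checking for sensitive data first means such files are quarantined even when their classification is confident, which mirrors the order of concerns in the paragraph above.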
4. Content rationalization
Now that content is clean, categorized and has validated metadata, it can be further analyzed to extract business data or be reorganized to meet business needs (mergers and acquisitions, divestiture, discovery, etc.).
5. Ongoing governance
It is critical to monitor and maintain IG rules going forward to avoid facing the same mess a year or two down the road. FACR systems offer various ways of automating classification tasks or monitoring repositories for compliance with the new taxonomies.
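One simple form of ongoing monitoring is a periodic audit that lists files sitting outside the approved taxonomy. The Python sketch below assumes a hypothetical two-level taxonomy (business function, then record class) and paths written with forward slashes; real FACR tools monitor far more than folder placement.

```python
# Hypothetical two-level taxonomy: business function -> approved record classes.
TAXONOMY = {
    "Finance": {"Invoices", "Budgets"},
    "Legal": {"Contracts"},
}


def is_compliant(relative_path, taxonomy=TAXONOMY):
    """A file complies when it sits under function/record-class/ in the taxonomy."""
    parts = relative_path.split("/")
    if len(parts) < 3:
        # Files stored directly at the root or function level are not classified.
        return False
    function, record_class = parts[0], parts[1]
    return record_class in taxonomy.get(function, set())


def audit(paths, taxonomy=TAXONOMY):
    """Return the paths that violate the taxonomy, in their original order."""
    return [path for path in paths if not is_compliant(path, taxonomy)]
```

Running such an audit on a schedule and reviewing the resulting list is one way to catch drift early, before the share returns to its pre-remediation state.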
Following are examples of a FACR system user interface and analysis output, courtesy of Active Navigation, Inc.
It has no doubt become obvious that there are many considerations when cleaning up content and migrating it. FACR tools, fortunately, formalize and automate the application of most business rules and IG policies. They manage content anomaly work processes to effect proper content groupings within a formal classification structure.
For more information on FACR issues, efficient information organization, information governance, life cycle management and ongoing control, visit www.imergeconsult.com.
Jim Just is a partner with IMERGE Consulting, Inc., with over 20 years of experience in business process redesign, document management technologies, business process management and records and information management. Contact him at james.just@imergeconsult.com or follow him on Twitter @jamesjust10.