New Century and Integrity Plus Blog

The Importance of Data Cleanup in Effective Integrity Management Programs

Posted by Shelly on May 25, 2016 10:30:00 AM

At New Century Software, we take integrity management very seriously, knowing full well what happens when data is inaccurate or not up-to-date. So today we’re going to explore why data cleanup is so important for your integrity management program, and how to do it through data integration, HCA identification, risk assessment, data management and data verification.

Data Integration

The importance of data cleanup begins with data integration, a process of gathering relevant pipeline information and putting it into a GIS and data storage repository. Such storage is vital, allowing you to monitor and assess the performance and progress of your integrity management program.

Be aware that challenges can arise during data integration, including:

  • Accuracy of gathered data
  • Amount of data available
  • Timely availability of data
  • Variety of data sources
  • Formatting of data
  • Resolution of gaps and unknowns
  • Effective utilization of data
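
To make the formatting and variety-of-sources challenges concrete, here is a minimal Python sketch of normalizing records from different sources into one common layout before loading them. The field names (segment_id, diameter_in, diameter_mm, install_date), the source labels and the date formats are all assumptions made up for illustration; your own sources will differ.

```python
from datetime import date, datetime

# Hypothetical date formats seen across source systems (assumed for illustration).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def normalize_record(raw, source):
    """Map one raw source record onto a common layout (field names are illustrative)."""
    diameter = raw.get("diameter_in")
    if diameter is None and raw.get("diameter_mm") is not None:
        # Convert millimeters to inches so every record uses one unit.
        diameter = round(float(raw["diameter_mm"]) / 25.4, 2)
    return {
        "segment_id": str(raw.get("segment_id", "")).strip().upper(),
        "diameter_in": float(diameter) if diameter is not None else None,
        # Unparseable dates become None so they surface as unknowns, not bad values.
        "install_date": parse_date(raw.get("install_date") or ""),
        "source": source,
    }


records = [
    normalize_record({"segment_id": " ab-101 ", "diameter_in": "12.75", "install_date": "03/14/1987"}, "as_built"),
    normalize_record({"segment_id": "AB-102", "diameter_mm": 323.85, "install_date": "1987-03-14"}, "survey"),
    normalize_record({"segment_id": "ab-103", "install_date": "unknown"}, "legacy"),
]
```

Keeping the source label on every record and leaving unknowns as empty values (rather than guessing) makes it easier to resolve gaps later and to trace a questionable value back to where it came from.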

HCA Identification

HCA (High Consequence Area) identification is another important aspect of data cleanup. It improves accuracy and builds confidence in the pipeline centerline information. Existing HCAs need periodic verification, and new HCAs need to be identified and evaluated.

Risk Assessment

Data quality is a critical component of risk assessment, and improving it is another part of data cleanup. To improve quality:

  • Identify the appropriate risk model based on available data sources.
  • Explore the uncertainty and limitations in the positional precision and accuracy of GIS data.
  • Recognize missing data and erroneous assumptions.

Data Management

Identify the data model or storage platform you want to use, since a defined structure makes cleanup easier. Examples include:

  • Pipeline Open Data Standard (PODS)
  • Utility Pipeline Data Model (UPDM)
  • ArcGIS Online (AGOL)
  • ESRI geodatabase
  • Access

Data Verification

Data verification is critical to identifying potential issues during data cleanup. Centralize your data into as few data silos as possible so that it is easy to query. Quality assurance checks can then help locate potential problem areas, including:

  • Duplicates
  • Missing data
  • Gaps/Overlaps
  • Formatting issues


So, what’s the take-away from all of this?

Clean up your data, improve its quality, and you will have a stronger integrity management program.

Once your data has been cleaned up, think about implementing the following suggestions in order to maintain the high quality of your data going forward. You don’t want to lose all of your hard work!

How to Maintain Data Going Forward

  • Provide high standards and repeatable processes.
  • Define how data will be captured in the database.
  • Limit domain types to ensure that users do not capture data differently.
  • Run QA/QC analysis to identify gaps in the data.
  • Comb through data from new acquisitions to ensure that data fits into your standards when it’s merged.
  • Create performance metrics for your data.
  • Review and update data periodically.
  • Improve data quality through continued data acquisition.
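
Two of these suggestions, limiting domain types and creating performance metrics, lend themselves to a quick sketch. The Python below checks values against a list of allowed entries and reports how complete each field is. The domain lists and field names are invented for illustration. In practice the allowed values would come from your own data model.

```python
# Hypothetical coded-value domains; a real list would come from your data model.
DOMAINS = {
    "material": {"STEEL", "PLASTIC", "CAST IRON"},
    "coating": {"FBE", "COAL TAR", "TAPE", "NONE"},
}


def domain_violations(records):
    """Return (row_index, field, value) for every value outside its domain."""
    violations = []
    for i, rec in enumerate(records):
        for field, allowed in DOMAINS.items():
            value = rec.get(field)
            # Empty values are left to the completeness metric, not flagged here.
            if value is not None and value not in allowed:
                violations.append((i, field, value))
    return violations


def completeness(records, fields):
    """Return the fraction of populated values per field (0.0 to 1.0)."""
    total = len(records)
    return {
        f: (sum(1 for r in records if r.get(f) not in (None, "")) / total if total else 0.0)
        for f in fields
    }


records = [
    {"material": "STEEL", "coating": "FBE"},
    {"material": "Steel", "coating": None},
    {"material": "PLASTIC", "coating": "NONE"},
    {"material": None, "coating": "Epoxy"},
]
violations = domain_violations(records)
metrics = completeness(records, ["material", "coating"])
```

Tracking a number like completeness over time gives you a simple way to see whether data quality is holding steady or slipping after each update or acquisition.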

Cleaning up data and keeping it that way can be a lot of work. Trust us, we know! We are always here to help, whether it's through a consultation or completing a service. Check out all the options we offer here.


Topics: Data Management, data
