Solutions - Data Cleansing
Data migration work often includes a data cleansing element to validate and enrich data before loading it into a target system. We have therefore built up expertise in data cleansing.
DataLynx has implemented a suite of in-house ETL tools which allow us to Extract, Transform and Load data between almost any systems.
Our ETL software allows us to cleanse data without having to write bespoke programs, which means that we can focus on the business issues rather than software development.
Data Audit
To determine the data quality within the systems we perform two types of data auditing: Data Profiling and Business Rule Application.
By examining the data we build up a data profile which identifies:
- Record counts
- Empty tables
- Empty fields
- Redundant tables
- Irregular data
- Sparsely populated data
- Orphaned data
- Constrained values and value lists
- Duplication
- Missing data
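As an illustration only, a few of the profile measures listed above can be sketched in Python. This is not the in-house tooling; the function names (`profile_table`, `find_orphans`), the list-of-dicts table representation and the 50% sparsity threshold are all assumptions made for the example.

```python
from collections import Counter


def is_blank(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def profile_table(rows, sparse_threshold=0.5):
    """Profile a table supplied as a list of dicts sharing the same keys."""
    record_count = len(rows)
    columns = list(rows[0].keys()) if rows else []

    # Fraction of records in which each field holds a value.
    fill_rate = {
        col: sum(not is_blank(row.get(col)) for row in rows) / record_count
        for col in columns
    }

    # Count surplus copies of records that are identical in every field.
    copies = Counter(tuple(row.get(col) for col in columns) for row in rows)
    duplicate_records = sum(n - 1 for n in copies.values())

    return {
        "record_count": record_count,
        "empty_fields": [c for c in columns if fill_rate[c] == 0],
        "sparse_fields": [c for c in columns if 0 < fill_rate[c] < sparse_threshold],
        "duplicate_records": duplicate_records,
    }


def find_orphans(child_rows, foreign_key, parent_rows, primary_key):
    """Return child records whose foreign key matches no parent record."""
    parent_keys = {row[primary_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_keys]
```

The sketch only detects exact duplicates; near-duplicate matching, redundant tables and value-list discovery would need further checks.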
Business Rule Application identifies data which does not adhere to business rules. We work together with the stakeholders to identify meaningful business rules and then use the rules to identify invalid data.
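One way to picture Business Rule Application is as a set of named checks run against every record. The sketch below assumes records are Python dicts and each rule is a function returning True when the record is valid; the two rules and their field names are hypothetical examples, not rules taken from any real engagement.

```python
def apply_rules(rows, rules):
    """Return (record index, rule name) for every rule a record fails."""
    violations = []
    for index, row in enumerate(rows):
        for name, is_valid in rules.items():
            if not is_valid(row):
                violations.append((index, name))
    return violations


# Hypothetical rules of the kind that might be agreed with stakeholders.
example_rules = {
    "total_non_negative": lambda r: r["total"] >= 0,
    "shipped_requires_ship_date": lambda r: (
        r["status"] != "shipped" or r["ship_date"] is not None
    ),
}
```

Keeping each rule as a small named function means the list of violations can be reported against rule names that the stakeholders recognise.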
The results of these two exercises are delivered in a data audit report where we make recommendations about what data should be cleansed and how the cleansing should be performed.
Data Cleansing
Our in-house ETL tools enable us to automate most data cleansing activities. Using the results of the Data Audit exercise we enhance the data by:
- Applying rules
- Standardising values
- De-duplicating data
- Merging data
- Address cleansing using PAF files
- Assigning default values
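Three of the activities above (standardising values, assigning default values and de-duplicating) can be sketched as a single pass over the records. This is a simplified illustration under stated assumptions, not the in-house ETL tooling: the `cleanse` function, its parameters and the keep-the-first-record de-duplication policy are all invented for the example, and merging and PAF-based address cleansing are not covered.

```python
def cleanse(rows, value_map, defaults, key_fields):
    """Standardise values, fill defaults, then drop duplicate records.

    value_map:  {field: {lower-cased raw value: standard value}}
    defaults:   {field: value used when the field is missing or empty}
    key_fields: fields that together identify a duplicate record
    """
    cleaned, seen = [], set()
    for row in rows:
        row = dict(row)  # work on a copy so the input is left untouched

        # Standardise: trim whitespace and map known variants to one value.
        for field, mapping in value_map.items():
            raw = row.get(field)
            if isinstance(raw, str):
                row[field] = mapping.get(raw.strip().lower(), raw.strip())

        # Default: fill fields that are missing or empty.
        for field, default in defaults.items():
            if row.get(field) in (None, ""):
                row[field] = default

        # De-duplicate: keep the first record seen for each key.
        key = tuple(row.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned
```

Standardising before de-duplicating matters here: records that differ only in how a value is written collapse to the same key once the value has been standardised.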
Occasionally it is not possible to programmatically correct bad data. In these cases we recommend considering the following options:
- Manual data cleansing
- Manual data enrichment (data capture)
- Data purging
We have experience of data cleansing, data purging and running large scale manual data capture exercises.
Cleansing Safety
To avoid making irrevocable changes to live systems we perform our data cleansing in a staging area. This approach also allows a full user acceptance test to take place before any data is re-imported into live systems.
If the data cleansing operation is part of a data migration, the staging area can be based on the database schema of the eventual target system.
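The staging idea can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The `customer` table, its `postcode` column and the `staging_customer` name are hypothetical, and a real staging area would normally be a separate database or schema rather than a second table in the same one.

```python
import sqlite3


def stage_and_cleanse(conn):
    """Copy the live table into a staging table and cleanse only the copy."""
    conn.execute("DROP TABLE IF EXISTS staging_customer")
    # Take a full copy of the live data; the live table is only read from.
    conn.execute("CREATE TABLE staging_customer AS SELECT * FROM customer")
    # Example cleansing step: trim and upper-case postcodes in the copy.
    conn.execute("UPDATE staging_customer SET postcode = UPPER(TRIM(postcode))")
    conn.commit()
```

Because every change is made to the copy, the cleansed data can be reviewed and tested before a separate, deliberate step writes anything back.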
If you would like to discuss your data cleansing requirements, please contact us on 01923 677052 or use the DataLynx contact page to have one of our consultants call you.