Data validation is one of the steps involved in data cleansing, also called data scrubbing. Before you can "cleanse" or "scrub" data, you have to figure out which data is good and which is incorrect or poorly formatted. After you find bad data, you can also do "data removal" to eliminate it.
In my personal experience, I need programs to check how well information is formatted. My databases hold information that is mostly entered by clients. The clients do not want to format data the way I want them to, which leads to non-standardized output.
The programming on some of my sites forces clients to enter data in a particular format, so the information looks similar from listing to listing. Can you imagine a site where one phone number is formatted (333)333-3333, another is (333)-333-3333, a third is (333) 333-3333, and others are 333-333-3333 and 333.333.3333, not to mention numbers with missing digits such as 33.333-3333? It's nicer when data looks the same, and I have programs to check for data that doesn't meet the standards assigned to the particular field that it's in.
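A format check like the one described above can be sketched in a few lines of Python. This is only a minimal illustration, not the program I actually run: the function name `normalize_phone` and the target format `(333) 333-3333` are assumptions chosen for the example, and it only handles ten-digit numbers.

```python
import re

def normalize_phone(raw):
    """Return the number as (AAA) PPP-LLLL, or None if it is not 10 digits.

    Hypothetical helper: the name and the target format are assumptions.
    """
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) != 10:
        return None                  # too few or too many digits: flag it
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

samples = [
    "(333)333-3333",
    "(333)-333-3333",
    "(333) 333-3333",
    "333-333-3333",
    "333.333.3333",
    "33.333-3333",   # a digit is missing
]

for sample in samples:
    print(sample, "->", normalize_phone(sample))
```

The first five samples all come out in the same format, while the last one returns `None` so that it can be sent back to the client or reviewed by hand.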
Another form of data validation is to check whether cities, states, and zips match. If someone says they are in Assam but has a zip code (PIN code) from Maharashtra, it's time to call that company up and see where they really are. It's common for companies with multiple offices to mix the information from more than one office, which causes a lot of confusion for the end user.
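The state and PIN code check could look something like the sketch below. The lookup table `PIN_PREFIXES` and the function `pin_matches_state` are hypothetical names made up for this example, and the prefixes are a rough, incomplete illustration. A real check would need an authoritative postal data set.

```python
# Illustrative two-digit PIN prefixes only. This table is an assumption for
# the example and is neither complete nor authoritative (for instance, some
# three-digit ranges inside these prefixes belong to other states).
PIN_PREFIXES = {
    "Assam": {"78"},
    "Maharashtra": {"40", "41", "42", "43", "44"},
}

def pin_matches_state(state, pin):
    """Return True or False for a known state, or None if the state is not in the table."""
    prefixes = PIN_PREFIXES.get(state)
    if prefixes is None:
        return None                      # cannot check a state we have no data for
    pin = pin.strip()
    if len(pin) != 6 or not pin.isdigit():
        return False                     # not a well-formed six-digit PIN
    return pin[:2] in prefixes

# A listing that claims Assam but carries a Maharashtra-style PIN gets flagged.
print(pin_matches_state("Assam", "411001"))
print(pin_matches_state("Assam", "781001"))
```

A `False` result does not prove the listing is wrong. It only tells me which companies to call.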