Data cleaning can be draining. Sitting in front of the computer and checking data can be numbing. But after many years of cleaning data, I have gained a sixth sense. It is weird, but sometimes the data talks to me. No, I am not crazy. When I open a file and glance through it, I get a feeling. I can sometimes estimate how many hours it will take me to clean the data. I also get a feel for how poorly or how well the data was put together. It's kind of interesting. The good thing is that I also learn how to clean the data better. After some cleaning, I see a pattern that speeds up the rest of the work. How rewarding! Yeah, after many years, I have developed some animal instinct. Ahh, the brain works in mysterious ways. Thank you, God, for giving me this ability 😀
Finally, some tips for data cleaning that I wish I had known when I first started cleaning data.
- Look at random samples. Take good notes. You may find a pattern.
- Many data sets have some sort of structure. Try to figure it out. Sometimes a data manual is available. If so, get it.
- Don’t clean data for long hours. Your mistakes will cost you more time. I work in 40-60 minute chunks. Clean data. Go take a walk, stretch. Come back and start again.
- Listen to music or a podcast to ease the boredom.
- Don’t rush. It will make things worse.
- Always double-check your cleaned data.
- Cleaning data can become automatic. Do it when you don’t have much creative energy and can go on autopilot.
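If you like to script part of this, here is a rough sketch of two of the tips above: looking at random samples and double-checking the cleaned data. It is only an illustration, not the way I actually work. The helper names `sample_rows` and `check_cleaned`, the `id` and `name` fields, and the tiny example data are all made up for this sketch, and it uses only the Python standard library.

```python
import random


def sample_rows(rows, k=5, seed=0):
    """Return up to k randomly chosen rows; a fixed seed keeps the sample repeatable."""
    rng = random.Random(seed)
    k = min(k, len(rows))
    return rng.sample(rows, k)


def check_cleaned(raw_rows, cleaned_rows, required_fields):
    """Return a list of problems found in the cleaned rows (empty list means none found)."""
    problems = []
    # Cleaning should not invent rows that were never in the raw data.
    if len(cleaned_rows) > len(raw_rows):
        problems.append("cleaned data has more rows than raw data")
    # Every required field should have a non-blank value.
    for i, row in enumerate(cleaned_rows):
        for field in required_fields:
            if str(row.get(field, "")).strip() == "":
                problems.append(f"row {i}: missing {field}")
    return problems


# Hypothetical example data, invented for this sketch.
raw = [
    {"id": "1", "name": " Ann "},
    {"id": "2", "name": "bob"},
    {"id": "3", "name": ""},
]

# A toy cleaning step: trim whitespace and fix capitalization.
cleaned = [{"id": r["id"], "name": r["name"].strip().title()} for r in raw]

# Eyeball a couple of random raw rows and take notes on what you see.
for row in sample_rows(raw, k=2):
    print(row)

# Double-check the cleaned data before trusting it.
print(check_cleaned(raw, cleaned, ["id", "name"]))
```

The checks here are deliberately simple. In real work you would swap in whatever rules the data manual describes.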
Happy data cleaning 😀