Data cleaning can be draining. Sitting in front of a computer checking data can be numbing. But after many years of cleaning data, I have gained a sixth sense. It is weird, but sometimes the data talks to me. No, I am not crazy. When I open a file and glance through it, I get a feeling. I can sometimes estimate how many hours it will take me to clean the data. I also get a feel for how poorly or well the data was put together. It’s kind of interesting. The good thing is that I also learn how to clean the data better. After some cleaning, I see a pattern that speeds up the work. How rewarding! Yeah, after many years, I have developed some animal instinct. Ahh, the brain works in mysterious ways. Thank you God for giving me this ability 😀

Finally, some tips for data cleaning that I wish I had known when I first started cleaning data.

  1. Look at random samples. Take good notes. You may find a pattern.
  2. Many data sets have some sort of structure. Try to figure it out. Sometimes a data manual is available. Get it.
  3. Don’t clean data for long hours. Your mistakes will cost you more time. I work in 40-60 min chunks. Clean data. Go take a walk, stretch. Come back and start again.
  4. Listen to music or a podcast to ease the boredom.
  5. Don’t rush. It will make things worse.
  6. Always double check your cleaned data.
  7. Cleaning data can become automatic. Do it when you don’t have much creative energy and can go on autopilot.
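
If you clean data with a script, tips 1 and 6 can be sketched in a few lines of Python. This is only a rough illustration, not the way I actually work: the toy data, the column names, and the helper functions `sample_rows` and `check_cleaned` are all made up for the example, and "double checking" here just means looking for empty required fields.

```python
import csv
import io
import random


def sample_rows(rows, k, seed=0):
    """Return k randomly chosen rows (or all rows if there are fewer than k)."""
    rng = random.Random(seed)  # fixed seed so the same sample can be looked at again
    return rng.sample(rows, min(k, len(rows)))


def check_cleaned(rows, required_fields):
    """Return (row_index, field) pairs where a required field is empty."""
    problems = []
    for i, row in enumerate(rows):
        for field in required_fields:
            if not (row.get(field) or "").strip():
                problems.append((i, field))
    return problems


# Hypothetical toy data standing in for a real file.
raw = io.StringIO("id,name,age\n1,Ann,34\n2,,29\n3,Bo,\n4,Cy,41\n")
rows = list(csv.DictReader(raw))

# Tip 1: look at a random sample and take notes on what you see.
sample = sample_rows(rows, k=2)
print(sample)

# Tip 6: double check the data before calling it clean.
problems = check_cleaned(rows, ["name", "age"])
print(problems)
```

With a real file, you would replace the `io.StringIO` object with `open("your_file.csv", newline="")` and adjust the required fields to whatever the data manual says must be filled in.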

Happy data cleaning 😀
