First, I am thankful that we have financial databases. But my goodness, there are so many errors, especially in the older data. Wow: confusion between the target and acquirer identifiers, typos in firm names, typos in identifiers!

Of course, we could treat all these errors as random, but I don’t think they are random. Data quality gets noticeably worse the older the record and the smaller the firm. I am not sure how the errors in the data will affect the study’s results, but as a researcher, I want the most pristine data possible. I like to think of myself as a chef, using the best ingredients (data).

At least if there are issues, I will know that it is not the data. Whew…

Well, it has been a grueling three months, plus another six months before that, but I am almost done.

Can’t wait to run the data 😀

Also, I can’t wait to share the data so that other poor researchers do not have to reinvent the wheel every time.

Let’s thank God for giving me the ability and the opportunity to do research. Thank You Jesus 😀


Cleaning data is a skill, and I am proud to be a “cleaner,” as my friend always called me.