It is a good idea to check your data early in the research process, especially if the data set is new and has not been well verified.

Whenever I get a new data set, I make sure to check it by looking through the observations manually.

Yes, this sounds boring and painful, but it is a must. Otherwise you will see weird things in your results and think it is some kind of magic or paranormal activity.

So how do you check your data effectively?

I suggest using Stata's sample command, which draws a random sample from the data in memory.

I work with panel data covering many industries and years, so I often draw random samples by year or industry.

The code for drawing 10 random observations from each industry-year would be:

sample 10, count by(industry year)

After that, you can inspect the sampled observations with the edit or list command.
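
One thing to keep in mind is that sample drops every observation it does not draw from the data in memory. Below is a minimal sketch of how the whole check could look in a do-file, assuming you want to get the full data back afterwards and want the same draw each time you rerun it. The seed value 12345 is an arbitrary choice of mine, and industry and year are the same variable names as above.

// keep a copy of the full data so it can be brought back later
preserve
// fix the random-number seed so the same sample is drawn on every run
set seed 12345
// keep 10 randomly chosen observations within each industry-year
sample 10, count by(industry year)
// order the sample and print it with a separator line between groups
sort industry year
list, sepby(industry year)
// bring back the full data set
restore

If you prefer to browse with edit instead of list, do that before the restore line, since restore puts the full data back in memory.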

I strongly encourage you to always check your data, because you will find mistakes and errors. You can always say that they are unavoidable and probably just noise, BUT as rigorous researchers, we want to know our data as well as we can.

After all, no expert chef says that he doesn’t know much or care about the ingredients~