Limitations and Caveats

I spent the week counting how often the words ‘the’, ‘of’ and ‘and’ appeared in tech trades and the mainstream media (MSM) from 2004 to 2008 (and then made some projections for 2009). While I certainly feel that this analysis has some meaning, I can’t escape the conclusion that it also has some severe limitations. Here are the ones I see:

  • The words I chose may be the most frequently used words in the English language, but I have no proof that they appear in enough articles to be statistically relevant. For all I know, journalists could be writing the same number of articles as always while minimizing the use of the words I selected.
  • I can’t vouch for Factiva. Do they pick up everything? Is their database up to date? Might there be a large number of articles waiting to be loaded into the system? I don’t know.
  • Is this just evidence of journalism moving to new publishing platforms? The source lists I searched included some online content, but surely not all. I may have evidence of traditional media shrinking, but that doesn’t mean journalism is in any way diminished.

Any others? Let me know. I believe my inferences are the correct ones, but clearly my analysis is flawed.


