“[…] While we do not argue that deriving measurement concepts from data rather than theory is problematic, per se, researchers should be aware that the most easily available measure may not be the most valid one, and they should discuss to what degree its validity converges with that of established instruments. For example, both communication research and linguistics have a long tradition of content-analytic techniques that are, at least in principle, easily applicable to digital media content. Of course, it is not possible to manually annotate millions of comments, tweets, or blog posts. However, any scholar who analyzes digital media can and should provide evidence for the validity of measures used, especially if they rely on previously unavailable or untested methods. The use of shallow, “available” measures often coincides with an implicit preference for automatic coding instruments over human judgment. There are several explanations for this phenomenon: First, many Big Data analyses are conducted by scholars who have a computer science or engineering background and may simply be unfamiliar with standard social science methods such as content analysis (but some are discussing the benefits of more qualitative manual analyses; Parker et al., 2011). Moreover, these researchers often have easier access to advanced computing machinery than trained research assistants who are traditionally employed as coders or raters […]” (from “The Value of Big Data in Digital Media Research”, by Merja Mahrt & Michael Scharkow, 2013)