More from Uri Simonsohn:
"Just post it: The lesson from two cases of fabricated data detected by statistics alone."
The abstract reads:
"I argue that requiring authors to post the raw data supporting their published results has, among many other benefits, that of making fraud much less likely to go undetected. I illustrate this point by describing two cases of fraud I identified exclusively through statistical analysis of reported means and standard deviations. Analyses of the raw data behind these provided invaluable confirmation of the initial suspicions, ruling out benign explanations (e.g., reporting errors, unusual distributions), identifying additional signs of fabrication, and also ruling out one of the suspected fraudster’s explanations for his anomalous results."
In the introduction he writes:
"I illustrate how raw data can be analyzed for such purposes through two case studies. Each began by noting that summary statistics reported in a published paper were too similar across conditions to have originated in random samples, an approach to identifying problematic data that has been employed before (Carlisle, 2012; Fisher, 1936; Gaffan & Gaffan, 1992; Kalai, McKay, & Bar-Hillel, 1998; Roberts, 1987; Sternberg & Roberts, 2006). These preliminary analyses of excessive similarity motivated me to contact the authors and request the raw data behind their results. Only when the raw data were analyzed did these suspicions rise to a level of confidence that could trigger the investigations of possible misconduct that were eventually followed by the resignation of the researchers in question."
I really like the figure in the paper: I guess making up data isn't as easy as it sounds...