Unilaterally Raising the Scientific Standard

Published by Anonymous (not verified) on Sun, 03/02/2013 - 9:51pm in

For years, I and others have been arguing that the current system of publishing science is broken. Publishing and peer-reviewing work only after the study's been conducted and the data analysed allows bad practices - such as selective publication of desirable findings, and running multiple statistical tests to find positive results - to run rampant.

So I was extremely interested when I received an email from Jona Sassenhagen, of the University of Marburg, with subject line: Unilaterally raising the standard.

Sassenhagen explained that he had chosen to pre-register a neuroscience study on a public database, the German Clinical Trials Register (DRKS).

His project, Alignment of Late Positive ERP Components to Linguistic Deviations ("P600"), is designed to use EEG to test whether the brain generates a distinct electrical response - the P600 - in response to seeing grammatical errors. The background here is that the P600 certainly exists, but people disagree on whether it's specific to language; Sassenhagen hopes to find out.

By publicly announcing the methods he'll use before collecting any data, Sassenhagen has, in my view, taken a brave and important step towards a better kind of science.

Already, most journals require trials of medical treatments to be publicly pre-registered, and the DRKS is one such registry. This study, however, is 'pure' neuroscience with nothing clinical about it, so it doesn't need to be registered - Sassenhagen just did it voluntarily.

Further, I should point out that he offered to pre-register his data analysis pipeline too by sending it to me. Unfortunately, I didn't reply to the email in time... but that was purely my fault.

I very much hope and expect that others will follow in his footsteps. Unilaterally adopting preregistration is one of the ways that I've argued reform could get started. As I said:

This would, at least at first, place these adopters at an objective disadvantage. However, by voluntarily accepting such a disadvantage, it might be hoped that such actors would gain acclaim as more trustworthy than non-adopters.

Pre-registration puts you at a disadvantage - insofar as it limits your ability to use bad practice to fish for positive results. It means you can't cheat, essentially, which is a handicap if everyone else can.

I don't know if this is the first time anyone's opted in to registering a pure neuroscience study, but it's certainly the first case I know of it being done for an entirely new experiment.

There have, however, recently been many pre-registered attempts to replicate previously published results, e.g. the Reproducibility of Psychological Science; the 'Precognition' Replications; and an upcoming special issue of Frontiers in Cognition.

Replications are good, and registered ones doubly so - but they're not enough to fix bad practice on their own. To do that, we need to work on the source: original scientific research.

Is This How Memory Works?

Published by Anonymous (not verified) on Sun, 27/01/2013 - 8:46pm in

We know quite a bit about how long-term memory is formed in the brain - it's all about strengthening of synaptic connections between neurons. But what about remembering something over the course of just a few seconds? Like how you (hopefully) still recall what that last sentence was about?

Short-term memory is formed and lost far too quickly for it to be explained by any (known) kind of synaptic plasticity. So how does it work? British mathematician Samuel Johnson and colleagues say they have the answer: Robust Short-Term Memory without Synaptic Learning.

They write:

The mechanism, which we call Cluster Reverberation (CR), is very simple. If neurons in a group are more densely connected to each other than to the rest of the network, either because they form a module or because the network is significantly clustered, they will tend to retain the activity of the group: when they are all initially firing, they each continue to receive many action potentials and so go on firing.

The idea is that a neural network will naturally exhibit short-term memory - i.e. a pattern of electrical activity will tend to be maintained over time - so long as neurons are wired up in the form of clusters of cells mostly connected to their neighbours:

The cells within a cluster (or module) are all connected to each other, so once a module becomes active, it will stay active as the cells stimulate each other.

Why, you might ask, are the clusters necessary? Couldn't each individual cell have a memory - a tendency for its activity level to be 'sticky' over time, so that it kept firing even after it had stopped receiving input?

The authors say that even 'sticky' cells couldn't store memory effectively, because we know that the firing pattern of any individual cell is subject to a lot of random variation. If all of the cells were interconnected, this noise would quickly erase the signal. Clustering overcomes this problem.
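The basic dynamics are easy to play with. Here's a minimal toy simulation of my own (not the authors' code, and with made-up parameters): binary neurons updated by majority rule, with a small chance of random misfiring at every step. The clustered network holds onto a stimulated pattern despite the noise; a random network of similar density loses it almost immediately:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_network(n_neurons=100, n_clusters=10, clustered=True):
    """Adjacency matrix: all-to-all wiring inside each cluster plus a few
    random long-range links, or a roughly density-matched random graph."""
    if clustered:
        labels = np.repeat(np.arange(n_clusters), n_neurons // n_clusters)
        adj = (labels[:, None] == labels[None, :]).astype(float)
        longrange = rng.random((n_neurons, n_neurons)) < 0.01
        adj = np.maximum(adj, (longrange | longrange.T).astype(float))
    else:
        adj = (rng.random((n_neurons, n_neurons)) < 0.10).astype(float)
        adj = np.maximum(adj, adj.T)
    np.fill_diagonal(adj, 0)
    return adj

def run(adj, steps=50, noise=0.05, n_active=10):
    """Stimulate the first n_active neurons, then update by majority rule,
    with each neuron having a small chance of misfiring at every step.
    Returns the fraction of the stimulated group still firing at the end."""
    n = adj.shape[0]
    state = np.zeros(n)
    state[:n_active] = 1
    degree = adj.sum(axis=1)
    for _ in range(steps):
        state = (adj @ state > 0.5 * degree).astype(float)  # majority rule
        flips = rng.random(n) < noise                       # random misfires
        state[flips] = 1 - state[flips]
    return state[:n_active].mean()

kept_clustered = run(make_network(clustered=True))
kept_random = run(make_network(clustered=False))
print("pattern retained, clustered network:", kept_clustered)
print("pattern retained, random network:   ", kept_random)
```

In the clustered case, a misfiring cell is pulled back into line at the next step because most of its inputs come from its own (still active) cluster; in the random case, the ten active cells are a small minority of every neuron's inputs, so the pattern is voted out of existence in one step.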

But how could a neural clustering system develop in the first place? And how would the brain ensure that the clusters were 'useful' groups, rather than just being a bunch of different neurons doing entirely different things? Here's the clever bit:

If an initially homogeneous (i.e., neither modular nor clustered) area of brain tissue were repeatedly stimulated with different patterns... then synaptic plasticity mechanisms might be expected to alter the network structure in such a way that synapses within each of the imposed modules would all tend to become strengthened.

In other words, even if the brain started out life with a random pattern of connections, everyday experience (e.g. sensory input) could create a modular structure of just the right kind to allow short-term memory. Incidentally, such a 'modular' network would also be one of those famous small-world networks.
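As a cartoon of that plasticity story - my own illustration, assuming idealized, non-overlapping stimulation patterns - repeated Hebbian updates on an initially unstructured sheet of cells concentrate the strong synapses inside the stimulated groups:

```python
import numpy as np

rng = np.random.default_rng(1)

n_cells, n_patterns, cluster_size = 60, 3, 20

# Three fixed stimulation patterns, each activating a different third
# of the (initially unstructured) population.
patterns = np.zeros((n_patterns, n_cells))
for i in range(n_patterns):
    patterns[i, i * cluster_size:(i + 1) * cluster_size] = 1

weights = np.zeros((n_cells, n_cells))
for _ in range(500):
    active = patterns[rng.integers(n_patterns)]      # a random everyday 'experience'
    weights += 0.01 * np.outer(active, active)       # Hebbian: fire together, wire together
np.fill_diagonal(weights, 0)

off_diag = ~np.eye(cluster_size, dtype=bool)
within = weights[:cluster_size, :cluster_size][off_diag].mean()
between = weights[:cluster_size, cluster_size:].mean()
print("mean weight within an imposed module: ", within)
print("mean weight between modules:          ", between)
```

With perfectly disjoint patterns the between-module weights never grow at all; in a more realistic version with overlapping, noisy patterns they would merely stay much weaker than the within-module ones.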

It strikes me as a very elegant model. But it is just a model, and neuroscience has a lot of those; as always, it awaits experimental proof.

One possible implication of this idea, it seems to me, is that short-term memory ought to be pretty conservative, in the sense that it could only store reactivations of existing neural circuits, rather than entirely new patterns of activity. Might it be possible to test that...?

Johnson S, Marro J, and Torres JJ (2013). Robust Short-Term Memory without Synaptic Learning. PLoS ONE, 8 (1) PMID: 23349664

Is Medical Science Really 86% True?

Published by Anonymous (not verified) on Fri, 25/01/2013 - 5:39am in

The idea that Most Published Research Findings Are False rocked the world of science when it was proposed in 2005. Since then, however, it's become widely accepted - at least with respect to many kinds of studies in biology, genetics, medicine and psychology.

Now, however, a new analysis from Jager and Leek says things are nowhere near that bad after all: only 14% of the medical literature is wrong, not half of it. Phew!

But is this conclusion... falsely positive?

I'm skeptical of this result for two separate reasons. First off, I have problems with the sample of the literature they used: it seems likely to contain only the 'best' results. This is because the authors:

  • only considered the creme-de-la-creme of top-ranked medical journals, which may be more reliable than others.
  • only looked at the Abstracts of the papers, which generally contain the best results in the paper.
  • only included the just over 5000 statistically significant p-values present in the 75,000 Abstracts published. Those papers that put their p-values up front might be more reliable than those that bury them deep in the Results.

In other words, even if it's true that only 14% of the results in these Abstracts were false, the proportion in the medical literature as a whole might be much higher.

Secondly, I have doubts about the statistics. Jager and Leek estimated the proportion of false-positive p-values by assuming that true p-values tend to be low: not just below the arbitrary 0.05 cutoff, but well below it.

It turns out that p-values in these Abstracts strongly cluster around 0, and the conclusion is that most of them are real.

But this depends on the crucial assumption that false-positive p-values behave differently from real ones - specifically, that a false positive is equally likely to fall anywhere from 0 to 0.05:

"if we consider only the P-values that are less than 0.05, the P-values for false positives must be distributed uniformly between 0 and 0.05."

The statement is true in theory - by definition, p values should behave in that way assuming the null hypothesis is true. In theory.
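You can see the textbook behaviour in a quick simulation (my own illustration, not Jager and Leek's code): when the null hypothesis is true by construction, p-values come out uniform, so the significant ones are spread evenly across the 0 - 0.05 window:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)

def null_p_value(n=30):
    """One simulated study in which the null hypothesis is true: n
    observations from N(0,1), tested against a true mean of zero."""
    z = rng.normal(0.0, 1.0, n).mean() * sqrt(n)  # z ~ N(0,1) under the null
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))       # standard normal CDF
    return 2 * (1 - phi)                          # two-sided p-value

p = np.array([null_p_value() for _ in range(20_000)])
sig = p[p < 0.05]

print(f"fraction significant (expect ~0.05): {(p < 0.05).mean():.3f}")
for lo in (0.00, 0.01, 0.02, 0.03, 0.04):
    share = ((sig >= lo) & (sig < lo + 0.01)).mean()
    print(f"share of significant p's in {lo:.2f}-{lo + 0.01:.2f}: {share:.2f}")  # each ~0.20
```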

But... we have no way of knowing if it's true in practice. It might well not be.

For example, authors tend to put their best p-values in the Abstract. If they have several significant findings below 0.05, they'll likely put the lowest one up front. This works for both true and false positives: if you get p=0.01 and p=0.05, you'll probably highlight the 0.01. Therefore, false positive p values in Abstracts might cluster low, just like true positives.
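That selection effect is easy to demonstrate. In this toy simulation of my own (with an arbitrary 20 tests per paper), every result is a pure false positive, yet the best-per-Abstract p-values pile up near zero rather than spreading uniformly over 0 - 0.05:

```python
import numpy as np

rng = np.random.default_rng(2)

# 100,000 'papers', each running 20 tests on pure noise. Under the null
# each individual p-value is uniform on (0,1), so we can draw them directly.
n_papers, tests_per_paper = 100_000, 20
p = rng.random((n_papers, tests_per_paper))

best = p.min(axis=1)          # the p-value the Abstract would highlight
reported = best[best < 0.05]  # only papers with a significant best result

low_share = (reported < 0.01).mean()
high_share = (reported >= 0.04).mean()
print(f"share of reported false positives in 0.00-0.01: {low_share:.2f}")   # above the uniform 0.20
print(f"share of reported false positives in 0.04-0.05: {high_share:.2f}")  # below the uniform 0.20
```

The more tests each paper runs, the more its reported false positives cluster low - exactly the signature the model attributes to true findings.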

Alternatively, false p's could also cluster the other way, just below 0.05. This is because running lots of independent comparisons is not the only way to generate false positives. You can also take almost-significant p's and fudge them downwards, for example by excluding 'outliers', or running slightly different statistical tests. You won't get p=0.06 down to p=0.001 by doing that, but you can get it down to p=0.04.

In this dataset, there's no evidence that p's just below 0.05 were more common. However, in many other sets of scientific papers, clear evidence of such "p hacking" has been found. That reinforces my suspicion that this is an especially 'good' sample.

Anyway, those are just two examples of why false p's might be unevenly distributed; there are plenty of others: 'there are more bad scientific practices in heaven and earth, Horatio, than are dreamt of in your model...'

In summary, although I think the idea of modelling the distribution of true and false findings, and using these models to estimate the proportions of each in a sample, is promising, I think a lot more work is needed before we can be confident in the results of the approach.

How (Not) To Fix Social Psychology

Published by Anonymous (not verified) on Fri, 18/01/2013 - 8:37pm in

British psychologist David Shanks has commented on the Diederik Stapel affair and other recent scandals that have rocked the field of social psychology: Unconscious track to disciplinary train wreck.

Lots of people are chipping in on this debate for the first time at the moment, but people's initial reactions often fall prey to misunderstandings that can stand in the way of meaningful reform - misunderstandings that more considered analysis has exposed.

For example, Shanks writes:

[despite claims that] social psychology is no more prone to fraud than any other discipline, but outright fraud is not the major problem: the biggest concern is sloppy research practice, such as running several experiments and only reporting the ones that work.

It's true that fraud is not the major issue, as I and many others have said. But bad practice, such as p-value fishing, is in no way "sloppy" as Shanks says. Running multiple experiments and reporting only the ones that work is a sensible and effective strategy for getting positive results; that's why so many people do it. And so long as scientists are required to get such findings to get publications and grants, it will continue.

Behavior is the product of rewards and punishments, as a great psychologist said. We need to change the reinforcement schedule, not berate the rats for pressing the lever.

Earlier, Shanks writes that evidence of unconscious influences on human behaviour - a popular topic in Stapel's work and in social psychology generally -

is easily obtained because it usually rests on null results, namely finding that people's reports about (and hence awareness of) the causes of their behaviour fail to acknowledge the relevant cues. Null results are easily obtained if one's methods are poor.

Thus journals have in recent years published extraordinary reports of unconscious social influences on behaviour, including claims that people are more likely to take a cleansing wipe at the end of an experiment in which they are induced to recall an immoral act [etc]...

...failures to replicate the effects described above have been reported, though often papers reporting such failures are rejected out of hand by the journals that published the initial studies. I await with interest the outcome of efforts to replicate the recent claim that touching a teddy bear makes lonely people more sociable.

Here Shanks first says that null results can easily result from poorly-conducted experiments, and then criticizes journals for not publishing null results that represent failures to replicate prior claims! But null replications are very often rejected because a reviewer says, like Shanks, "This replication was just poorly-conducted, it doesn't count." Shanks (unconsciously no doubt) replicates the problem in his article.

So what to do? Again, it's a systemic problem. So long as we have peer-reviewed scientific journals, and the peer-review takes place after the data are collected, it will be open to reviewers to spike results they don't like - generally although not always null ones. If reviewers had to judge the quality of a study before they knew what it was going to find, as I've suggested, this problem would be solved.

Other people have great ideas for fixing science of their own. The problem is structural, not a failing on the part of individual scientists, and not limited to social psychology.

My Breakfast With "Scientism"

Published by Anonymous (not verified) on Mon, 17/12/2012 - 8:16pm in

One morning, I awoke convinced that science was the only source of knowledge. I had developed a case of spontaneous scientism.

The first challenge I faced was deciding what to eat for breakfast. Muesli, or cornflakes? Which would be the more scientific choice? I decided to go on the internet to look up the nutritional value of the different cereals, to see which one would be healthiest.

My computer was off. So first I'd need to turn it on - but how? From past experience, I suspected that pressing the big green power button on the front would do it - but then I remembered, that's merely anecdotal evidence. I needed scientific proof.

So I made a mental note to run a double-blind, randomized controlled trial of "turning my computer on" tomorrow.

Lacking nutritional data, I decided to pick a cereal by taste. I like muesli more than cornflakes. At last, a choice! Muesli it is, I thought - until I realized that I didn't actually know which one I preferred more. I had a gut feeling I liked muesli, but that's not science. What if, in fact, I hated muesli? Science couldn't tell me, at least not yet.

Another mental note: conduct cereal taste preference study, day after tomorrow. No breakfast for me, today.

By now, I was hungry, confused and annoyed. "This is getting ridiculous!", I tried to exclaim to no-one in particular - but then I realized - I could not even speak because I knew next to nothing scientific about the English language.

Sure, I had vague intuitions about how to put words together to express meaning, but that's just unscientific hearsay that I'd picked up as a child (no better than a religion, really!) In order to communicate, I'd need to study some proper science about semantics and grammar... but, oh no, how could I even read that literature?

Faced with the impossibility of doing anything whatsoever purely guided by science, I decided to go back to bed... yet with no scientific basis for controlling my own muscles, I collapsed where I stood, bashing my head on the breakfast table as I fell. 

Luckily, the bump on the noggin cured me of my strange obsession, and I lived to tell the tale.

Many people will tell you that "scientism", the belief that science is the only way to know anything, is a serious problem, a misunderstanding that threatens all kinds of nasty consequences.

It's not, because it doesn't exist - no-one believes that. If they did, they would end up like the unfortunate narrator in my story.

Every day, we make use of many sources of information, from personal experience and learning to simply looking at things, whether they're right in front of our eyes or on TV. This is knowledge, and no-one thinks that we ought to replace it with "science", if that were even possible.

"Scientism" is a fundamentally unhelpful concept. Scientists are often wrong, and sometimes they're wrong about things that other non-scientists are right about. But each such case is different and must be judged on its own merits.