Manufactured Doubt — The Arnold 2021 Neurofeedback Trial and the Tobacco Playbook

01 The charge

My argument

A study can carry every mark of trust — federal money, a registered protocol, a double blind, a prestige journal — and still be built to deliver a foregone conclusion. I believe the Arnold 2021 trial is such a study, and that its real function is to supply a citable verdict against neurofeedback as a whole. Its conclusion is now used to deny people reimbursement. In my opinion, it does not survive contact with its own methods.

02 The number that was reframed

Both arms improved. About equally.

Start with the trial's own headline statistic. From baseline to treatment end, the primary outcome improved significantly (p<.0001) in both arms — neurofeedback (d=1.51) and the so-called "sham" (d=1.47) — with no significant difference between them. A within-group result like this cannot, on its own, prove that neurofeedback works; both arms could be improving through expectation, attention, or natural course. But it equally cannot support the headline that was sold — "neurofeedback fails" — because, as the critics show, the comparator was never inert.

Two within-group effect sizes, almost the same height. The data say "no difference between two active arms." It was published as "neurofeedback doesn't work." That gap is where my argument begins.
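A small sketch of the arithmetic behind that gap. It assumes a common definition of a within-group effect size (pre-to-post change divided by the baseline standard deviation) and assumes both arms share a similar baseline spread; the trial's exact formula may differ. The function names are hypothetical, and the only trial figures used are the two effect sizes quoted above.

```python
def within_group_d(baseline_mean: float, end_mean: float, baseline_sd: float) -> float:
    """Standardised pre-to-post improvement within ONE arm (lower score = better).

    One common definition, used here as an assumption.
    """
    return (baseline_mean - end_mean) / baseline_sd


def between_arm_gap(d_treatment: float, d_comparator: float) -> float:
    """Rough between-arm difference in SD units.

    ASSUMES both arms are standardised against a similar baseline SD.
    """
    return d_treatment - d_comparator


# Within-group effect sizes quoted in the text above.
d_neurofeedback, d_control = 1.51, 1.47

print(round(between_arm_gap(d_neurofeedback, d_control), 2))  # 0.04
```

Under those assumptions the two arms differ by roughly 0.04 standard deviations. Each arm's own d says how much that arm changed; only the gap between them speaks to the comparison, and that gap says the two arms were indistinguishable, not which of them was inert.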

03 The playbook

Where I have seen this before

In 1969, an executive at the tobacco company Brown & Williamson wrote a sentence that became the template for every industry that needed inconvenient science to go away. The strategy is not to disprove your opponent; it is to manufacture enough doubt that the public, regulators, and insurers stop trusting the evidence. Historians of science Naomi Oreskes and Erik Conway later named the practice in their 2010 book Merchants of Doubt.

"Doubt is our product, since it is the best means of competing with the 'body of fact' that exists in the mind of the general public. It is also the means of establishing a controversy."

— Brown & Williamson internal memo, 1969

I am not claiming the people involved here smoke cigars in a back room. I am claiming the pattern is the same — and that, in my opinion, this trial is one move in a coordinated effort to discredit a whole field, not an honest test that happened to come out negative. Here is the playbook, and how I think this case maps onto it.

Move 1

Design the study so the answer is fixed

Choose a comparator, an outcome measure, and a reward protocol that cannot detect the effect you claim to test.

Here: an active "sham," an outcome already known to be insensitive, no proof anyone learned.

Move 2

Reframe the result after the fact

When the data don't cooperate, change the wording so a null becomes a defeat.

Here: the pre-registered "placebo sham" quietly became "control treatment."

Move 3

Use credentialed, conflicted experts

Let people with industry ties deliver the verdict, lending it authority.

Here: a lead author and an editorialist tied to multiple stimulant makers.

Move 4

Turn the verdict into policy

Feed the conclusion to regulators, guideline bodies, and insurers so it becomes a wall.

Here: the study is now cited by carriers to deny coverage.

04 The respectable façade

Everything you're trained to trust — flip it over

By the standard academic heuristics this should be a high-trust study. That is exactly what makes it effective: the credibility markers stay intact while, in my reading, the experiment beneath them is hollow.

05 The documented criticisms

The critique lists nineteen. Here are the load-bearing ones.

The criticisms are documented in the peer-reviewed critical review by Schummer & Sguigna (NeuroRegulation, 2024), which formally calls for the study's retraction, and in the methodological analysis by Pigott, Cannon & Trullinger (2021). These are their findings and arguments; below are the ones I find most damning.

Criticism 01 · HARKing

The hypothesis was reworded after the result was known

The registered primary comparison was neurofeedback versus "placebo sham." When the so-called sham produced a large, durable improvement, the published conclusion swapped the wording to "control treatment." The relabelling, the critics argue, buries the actual finding: the comparator was never a placebo. (The trial randomised 144 children; 142 entered the primary analysis, 84 neurofeedback / 58 control.)

Hypothesizing After Results Known · registry vs. publication mismatch

Criticism 02 · The comparator

The "sham" delivered EMG biofeedback — an active intervention

Barth et al. (2017) reported that EMG biofeedback alone reduces ADHD hyperactivity. Comparing neurofeedback against a second active intervention and finding "no difference" is not evidence of failure — in the critics' view it dismantles the sham-controlled inference the trial rests on. (Some trialists do accept EMG biofeedback as a legitimate sham; I address that objection below.)

non-inert control · active vs. active

Criticism 03 · No proof of learning

It claimed to test operant conditioning — but never checked if anyone learned

The trial set out to test whether children could be conditioned to shift their theta-beta ratio, yet recorded no learning curves. There is no evidence in the data that a single child in the neurofeedback arm learned to modify the signal. You cannot conclude a treatment failed if you never confirmed it was delivered.

no learning curves · intervention not verified
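To make concrete what a learning-curve check could look like: one simple version asks whether the trained measure trends in the target direction across sessions. The sketch below is illustrative only. The helper `learning_slope` is hypothetical, the per-session values are invented for the example and are not trial data, and it assumes the training goal is a lower theta-beta ratio, so a clearly negative slope would count as evidence of learning.

```python
def learning_slope(session_means):
    """Ordinary least-squares slope of a per-session measure against session index."""
    n = len(session_means)
    if n < 2:
        raise ValueError("need at least two sessions")
    xs = range(n)
    x_bar = sum(xs) / n
    y_bar = sum(session_means) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, session_means))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical theta-beta ratios, one mean per session (illustrative only).
learner = [5.0, 4.8, 4.7, 4.4, 4.3, 4.0]
non_learner = [5.0, 5.1, 4.9, 5.0, 5.1, 4.9]

print(learning_slope(learner))      # negative: the ratio falls across sessions
print(learning_slope(non_learner))  # close to zero: no trend
```

A real analysis would need per-child curves, within-session trends, and a comparison against the control arm. The point is only that the check is cheap to run once the session data are recorded.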

Criticism 04 · The reward rule

Auto-thresholding, Pigott et al. argue, punished success and rewarded failure

Thresholds were auto-adjusted to hold a ~80% reward rate. Pigott et al. argue that this means a child who was learning had reinforcement withdrawn, while a child whose signal worsened was rewarded, inverting operant conditioning. (Adaptive thresholding is common practice; the dispute, addressed below, is whether this specific implementation defeats the learning it measures.) Section 06 illustrates what Pigott's reading implies.

auto-thresholding · contested mechanism

Criticism 05 · The yardstick

An outcome measure already flagged as unreliable

Janssen et al. (2017) and Ogrim & Hestad (2013) had reported the theta-beta ratio is "remarkably stable" and does not reliably move with training. Choosing a metric the literature had already flagged as insensitive, the critics argue, all but guarantees a null result.

known-insensitive outcome

Criticism 06 · Delivery

Technicians weren't certified; supervision fell below clinical standard

Authors interviewed for the critique disclosed that technicians lacked basic neurofeedback skill and that oversight — annual site visits plus weekly calls from one expert — sat below the clinical standard of care. A treatment delivered badly reads as a treatment that does not work.

delivery competence

Criticism 07 · The sample

Medicated and dual-diagnosis children were left in

Stimulant medication alters the very EEG measures used as outcomes. Including medicated and comorbid participants without stratification injects noise precisely where the signal was supposed to appear.

unstratified confounds

06 See it happen

The reward that, on Pigott's reading, punished learning

This makes Criticism 04 tangible. Suppose a child genuinely learns to raise the targeted brain signal. Consider what auto-thresholding, on Pigott et al.'s interpretation, does the instant they succeed.

Operant conditioning, inverted

In neurofeedback as intended, crossing the threshold earns a reward and the child learns to stay there. Under the protocol Pigott et al. describe, every time the signal rises the threshold is pushed up to keep rewards near 80% — so the moment learning shows, it is taken away.
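The mechanism described above can be sketched as a toy simulation. This is not the trial's algorithm, which is not reproduced here. It assumes a simulated learner whose mean signal rises steadily, treats a higher signal as success, and compares a threshold fixed at baseline with one recomputed from the most recent trials so that about 80% of them would be rewarded. The function name, window size, drift, and seed are all illustrative assumptions.

```python
import random


def simulate_reward_rates(n_trials=1000, window=50, target_rate=0.8,
                          drift=0.004, seed=0):
    """Compare reward rates under a fixed vs. an auto-adjusted threshold.

    Toy model: higher signal = success, and the learner's mean signal
    rises by `drift` per trial. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    signal = [rng.gauss(drift * t, 1.0) for t in range(n_trials)]

    # Index of the order statistic that roughly `target_rate` of a window exceeds.
    k = round((1 - target_rate) * window)

    # Fixed threshold: set once from the first window, never moved.
    fixed_thr = sorted(signal[:window])[k]

    fixed_hits, adaptive_hits = [], []
    for t in range(window, n_trials):
        # Auto-threshold: recomputed from the most recent `window` trials.
        adaptive_thr = sorted(signal[t - window:t])[k]
        fixed_hits.append(signal[t] > fixed_thr)
        adaptive_hits.append(signal[t] > adaptive_thr)

    half = len(fixed_hits) // 2

    def rate(hits):
        return sum(hits) / len(hits)

    return {
        "fixed_early": rate(fixed_hits[:half]),
        "fixed_late": rate(fixed_hits[half:]),
        "adaptive_early": rate(adaptive_hits[:half]),
        "adaptive_late": rate(adaptive_hits[half:]),
    }


rates = simulate_reward_rates()
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

In this toy, the fixed threshold pays out more and more as the learner improves, while the auto-adjusted threshold keeps the reward rate pinned near the target however much the signal has risen. Each individual reward still goes to an above-threshold moment; what disappears is any rise in overall reward as the learner gets better. Whether that flattening actually blunts learning in children is the contested question, and the simulation cannot settle it.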


07 The most serious allegation

A conclusion allegedly required before publication

One author described a surreptitious communication between the study's authors and a journal editor, indicating that JAACAP would publish the study only if its conclusion stated that neurofeedback was no better than a placebo — after which, the critique reports, the manuscript was modified to conform.

— Schummer & Sguigna (2024), reporting a single, uncorroborated account from one author (the co-author Schummer sat on the committee that received it)

I want to be precise: this is one uncorroborated account, reported by a critic who is not a neutral party, naming "a journal editor" in the singular. It is an allegation, not an adjudicated fact, and I present it as such. But if it is true, it is the keystone — it would mean the verdict was decided before the data were written up, and the methodological problems above are not accidents but the mechanism that delivered it.

08 Follow the money

Who benefits when neurofeedback "fails"?

Neurofeedback competes with stimulant medication. By their own published disclosures, two of the people who shaped this study's reach are tied to the makers of that competing product line.

Investigator

L. E. Arnold

Disclosed funding / advisory ties to multiple pharma firms

Interest

Stimulant makers

Shire→Takeda, Supernus, Otsuka, Roche/Genentech, Pfizer, Novartis, Ironshore

Output

"NF fails"

A negative verdict on the non-drug rival therapy

Per the disclosures cited in the critique: Arnold received research funding from Shire (which became Takeda, maker of a leading ADHD stimulant), Supernus, Otsuka, Roche/Genentech and Young Living, and sat on advisory boards for six pharmaceutical companies. He earlier led the $17.7M NIMH MTA study, whose follow-ups were criticised for downplaying stimulants' diminishing efficacy. The scathing American Journal of Psychiatry editorial that amplified this trial — "Neurofeedback for ADHD: Time to Call It Quits?" — was written by James McGough, who per the critique served on the board of Sunovion and consulted for Eli Lilly, Takeda and Tris Pharma. A disclosed conflict does not by itself invalidate research — but, in my view, it sets the prior on which way the "errors" were likely to lean.

09 The other side

The strongest objections — and why I'm not persuaded

A signed opinion owes you the counterarguments. Here are the best defences of the trial, and my honest answer to each.

"EMG biofeedback is a standard, legitimate sham for neurofeedback."

It is true that many trialists use a non-EEG feedback condition as a control, precisely because it matches the training ritual. My answer: a control is only valid if it is plausibly inert on the outcome. Once there is published evidence that EMG biofeedback itself reduces ADHD symptoms, "no difference" stops meaning "neurofeedback is inert" and starts meaning "two active treatments performed similarly." The label "placebo" then does real rhetorical work the data don't support.

"Adaptive thresholding is routine — it isn't sabotage."

Correct: keeping reward rates near 80% is common, to keep children engaged. My answer: common is not the same as harmless. Pigott et al.'s point is specific — that this particular auto-thresholding can withdraw reinforcement exactly when a child self-regulates. That is a genuine methodological dispute, not settled fact, which is why I attribute it to them rather than assert it. But "everyone does it" is not a defence if the practice can blunt the very learning under test.

"It's a large, NIH-funded, double-blind RCT — that beats clinic anecdote."

Yes — and I hold my own field to that bar too; much pro-neurofeedback evidence is small and sometimes conflicted. My answer: design quality is not conferred by funding or blinding. A rigorous-looking trial that uses an active control, an insensitive outcome, and unverified delivery is rigorous in form only. The fix is not "trust the clinic instead" — it is CRED-nf-compliant trials that monitor learning and use truly inert controls.

"You sell neurofeedback — of course you'd say this."

Fair, and I disclosed it at the top. My answer: my interest does not make the documented facts disappear — the reworded hypothesis, the active comparator, the absent learning curves, the disclosed pharma ties are all in the public record. Weigh my interpretation sceptically; check the sources yourself. That is exactly what I am asking you to do with the original trial.

10 The damage

A contested study with real-world teeth

A flawed paper is a footnote. A flawed paper that becomes policy is a harm, and the harm falls not on one study but on every clinician and family relying on the field.

Denied

Cited by insurance carriers to refuse neurofeedback reimbursement — turning a contested result into a coverage wall for families.

"Quit"

Amplified by McGough's 2022 AJP editorial "Time to Call It Quits?" — by an author with disclosed stimulant-industry ties.

No fix

No original author has asked for retraction. JAACAP rejected the letter documenting the criticisms, saying it did not meet the journal's standards.

11 My verdict on the verdict

It did not prove neurofeedback fails. It proved the experiment did.

Here is the narrow, honest reading I stand behind: a trial whose control was active, whose outcome was insensitive, whose reward scheme may have inverted learning, and whose hypothesis was reworded after the fact was never structurally capable of testing what it claimed to test. The strongest claim its data support is not "neurofeedback failed." It is "this study cannot tell us." That a study this weak is being used to shut a whole field out of reimbursement is, in my opinion, not an accident of science. It is manufactured doubt, doing exactly what it was built to do.

How to read this. This is a signed opinion piece by François Altwies, who has a disclosed commercial interest in neurofeedback (see the panel at the top). The factual claims — effect sizes, the rewording of the hypothesis, the nature of the control, the disclosed conflicts of interest, the use of the study by insurers — are drawn from the peer-reviewed critiques cited below; I encourage you to read those sources, and the original Arnold et al. (2021) paper, directly. The editorial-pressure account is a single, uncorroborated allegation reported by a non-neutral source and has not been adjudicated; no regulatory or institutional finding of misconduct against any named individual is asserted. The framing — "manufactured doubt," the tobacco-playbook parallel, and the claim that this trial functions as an attack on the field — is my interpretation and opinion, offered for debate, not a statement of proven fact about anyone's private intentions.