Wiki: Verification of counterfactuals, interventions, and perturbations

Shortest conclusion

High held-out accuracy is important, but on its own it does not show that the internal mechanisms are the same. The current primary literature supports at least six separate walls: held-out decode, online human-in-the-loop control, bidirectional or local intervention, state-dependent intervention, temporal durability and deployment, and branch-structure or perturbation-pattern testing.

Main weakness this pass had to fix

The older version correctly separated held-out accuracy from intervention, but it still let readers treat causal verification as one monotonic ladder. The recent primary literature does not support that shortcut. Wairagkar et al. (2025) demonstrated raw-neural closed-loop voice synthesis in less than 10 ms with explicit silence outside speech, yet the same paper reported a noticeable decline in fixed-decoder performance after about 15 days. Wilson et al. (2025) then achieved one month of unsupervised closed-loop cursor control while showing, with recordings spanning five years, that pairwise recalibration and chained long-term use are different questions. Oehrn et al. (2024) pushed adaptive DBS into blinded randomized blocks in home life over one month per condition, while Cascino et al. (2026) showed that even when chronic adaptive DBS was offered to 20 consecutive Parkinson's disease patients, eligibility and programming constraints still narrowed who could actually continue.

A second remaining weakness was subtler: this page still let readers treat adaptive DBS as one generic state-dependent intervention. Mathiopoulou et al. (2024), Stanslaski et al. (2024), Cascino et al. (2026), and the newer biomarker papers cited below show that biomarker family, controller mode, state dependence, sensing viability, and programming burden are different causal-verification burdens. Therefore this page now separates same-session causal gain, state-dependent controller family, temporal durability, bridge validity, and deployment burden instead of hiding them inside one phrase such as closed-loop success.

2026-03-31 deepening: burst-driven neuromodulation is controller-limited, not one state-dependent intervention

This page now blocks one more shortcut. Mathiopoulou et al. (2024) showed that subthalamic beta changes with movement, dopamine, and DBS state; Stanslaski et al. (2024) showed that adaptive DBS already splits across single-threshold, dual-threshold, and different onset-duration policies; Olaru et al. (2024) and Mathiopoulou et al. (2025) advanced distinct gamma-linked biomarker families; Dixon et al. (2026) added a remotely optimized neural-decoder route; and Cascino et al. (2026) showed that eligibility and programming constraints can still block chronic use. Therefore, on this site, a state-dependent neuromodulation paper is not read only by trigger timing or symptom change. It must also disclose which biomarker family, which controller policy, which operating regime, which sensing exclusions, and which comparator made the result possible.

First, classify causal evidence into six levels

Stage	What is actually changing	Minimum log wanted on this site	What still cannot be said
1. Held-out decode	Check whether the model still predicts on unused data.	Split unit, leakage audit, calibration error, uncertainty, and abstention if probabilities are output.	It still does not show agreement under changed conditions or agreement in causal structure.
2. Online human-in-the-loop control	A participant continuously operates while seeing or hearing the output.	End-to-end latency distribution, jitter, dropout, abstention or silence policy, and recalibration events.	Even if it works online, it still does not show compatibility with explicit perturbation, long-term durability, or same-state continuity.
3. Bidirectional feedback / local intervention	Feedback or stimulation changes the next biological input or behavior.	Stimulus timing, intensity, artifact window, effect size, failure cases, and local safety conditions.	Local causal gain still does not show whole-brain generative equivalence.
4. State-dependent intervention	Stimulation or control policy switches as a function of the detected state.	State-estimation error, biomarker family / symptom axis, controller family, comparator policy, duty cycle, stop conditions, abstention or fallback policy, and real-life block structure when relevant.	A symptom-linked controller or personalized biomarker route is still different from state completeness or branch-equivalence.
5. Temporal durability / deployment	The controller is expected to survive days, weeks, or home deployment rather than a single research session.	Fixed-decoder interval, supervised versus unsupervised recalibration route, performance-decay curve, recovery time, clinic versus home context, continuation or eligibility counts, signal-availability exclusions, and manual programming burden.	Cross-day operation still does not by itself show same-state continuity or maintenance-consistent causal equivalence.
6. Perturbation-structure / branch test	Compare multiple branches or perturbation-response patterns under fixed comparison rules.	Explicit branch variables, preregistered comparison rule, fixed failure criterion, repeatable perturbation set, temporal scope, and bridge status.	Even here, whole-brain identity, complete maintenance-state coverage, and social deployment are not automatically determined.
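Stage 1 asks for calibration error and an abstention policy whenever the decoder outputs probabilities. As a minimal sketch, assuming equal-width confidence bins and a simple confidence cutoff for abstention (the function name, bin count, and cutoff are illustrative assumptions, not values taken from any cited paper), expected calibration error could be logged alongside the abstention rate like this:

```python
import math
from typing import List, Sequence, Tuple


def ece_with_abstention(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
    abstain_below: float = 0.0,
) -> Tuple[float, float]:
    """Return (expected calibration error, abstention rate).

    Trials with confidence below `abstain_below` count as abstentions
    and are excluded from the calibration estimate (an assumed policy).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one trial is required")

    kept = [(c, ok) for c, ok in zip(confidences, correct) if c >= abstain_below]
    abstention_rate = 1.0 - len(kept) / len(confidences)
    if not kept:
        # Nothing was decoded, so calibration is undefined rather than perfect.
        return math.nan, abstention_rate

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for c, ok in kept:
        bins[min(int(c * n_bins), n_bins - 1)].append((c, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / len(kept)) * abs(mean_conf - accuracy)
    return ece, abstention_rate
```

Reporting the two numbers together matters because a decoder can lower its calibration error simply by abstaining more often; neither value says anything about agreement under changed conditions.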

Recent literature forces five extra stop lines

The site's newer cards are not editorial decoration. They are forced by concrete gaps that recent primary literature leaves visible. This page now treats them as part of causal-verification reading rather than as optional follow-up bureaucracy.

Stop line	Why recent literature forces it	What this site now requires
Temporal Validity	Wairagkar et al. (2025) show that an impressive same-session voice loop can still lose fixed-decoder performance after about 15 days, while Wilson et al. (2025) show that one month of unsupervised use must be read separately from pairwise recalibration.	Attach the Temporal Validity Card whenever the claim reaches beyond same-session performance.
Burst-Controller Log	Mathiopoulou et al. (2024) show that beta feedback is state-dependent, Stanslaski et al. (2024) show that controller law and onset-duration policy are part of the object, Olaru et al. (2024) plus Mathiopoulou et al. (2025) show that gamma-linked routes are not the same biomarker family as beta-guided bradykinesia control, and Cascino et al. (2026) show that eligibility and programming constraints remain part of the result.	Attach the Burst-Controller Log whenever burst-driven neuromodulation or adaptive DBS is promoted above exploratory timing or personalized-controller feasibility.
State-Continuity Bridge	Lu et al. (2023) show that preservation route and fixation schedule alter extracellular-space retention, MICrONS Consortium et al. (2025) remains a sequential in vivo-to-EM local pipeline rather than simultaneous same-state capture, and Attardo et al. (2015) show that adult CA1 spines themselves turn over on the scale of weeks.	Attach the State-Continuity Bridge Card whenever a result is read as one same-state sample across days, regimes, or live-to-fix bridges.
Maintenance-State Error Budget	Hengen et al. (2016), Schreiner et al. (2024), and Deng et al. (2025) show that sleep history, replay coupling, and intracellular timing windows remain active controllers of persistence and recovery rather than background context.	Attach the Maintenance-State Error Budget whenever the claim concerns persistence, forgetting, reconsolidation, or recovery after perturbation.
Body / Environment Boundary	Flesher et al. (2021) strengthen the case for local bidirectional feedback, but the closed-loop literature more broadly still depends on retained or omitted tactile, proprioceptive, respiratory, arousal, and other organism-level routes.	Attach the Body / Environment Boundary Card whenever a fast loop is being promoted toward naturalistic or embodied equivalence rather than local controller performance.
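The attachment rules in the table above can be read as a simple lookup from what a claim asserts to which cards it must carry. The following is a minimal sketch under that reading; the `Claim` fields and the `required_cards` function are hypothetical names introduced here for illustration, and only the card names come from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Claim:
    """Hypothetical flags describing how far a result is being read."""
    beyond_same_session: bool = False
    adaptive_dbs_above_exploratory: bool = False
    read_as_one_same_state_sample: bool = False
    concerns_persistence_or_recovery: bool = False
    promoted_to_embodied_equivalence: bool = False


def required_cards(claim: Claim) -> List[str]:
    """Return the cards the table requires for this claim, in table order."""
    rules = [
        (claim.beyond_same_session, "Temporal Validity Card"),
        (claim.adaptive_dbs_above_exploratory, "Burst-Controller Log"),
        (claim.read_as_one_same_state_sample, "State-Continuity Bridge Card"),
        (claim.concerns_persistence_or_recovery, "Maintenance-State Error Budget"),
        (claim.promoted_to_embodied_equivalence, "Body / Environment Boundary Card"),
    ]
    return [card for needed, card in rules if needed]
```

For example, a cross-day claim about recovery after perturbation, `Claim(beyond_same_session=True, concerns_persistence_or_recovery=True)`, would need both the Temporal Validity Card and the Maintenance-State Error Budget.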

Boundary cases seen in primary literature

Papers	What actually happened	How to read on this site	What still cannot be said
Forenzo et al. (2024)	A non-invasive EEG continuous-tracking task was run online with a deep-learning decoder inside the human loop.	This is an online-control result. It is stronger than offline accuracy and should be read through online metrics.	It is not counterfactual equivalence or whole-brain generative equivalence.
Littlejohn et al. (2025)	A speech neuroprosthesis streamed brain-to-voice updates in 80-ms increments for naturalistic communication.	This is a communication-subsystem online-control advance. It raises the bar for tail-latency and output-path logging.	It still does not show same-state continuity, long-term fixed-decoder durability, or branch-equivalence.
Wairagkar et al. (2025)	Raw neural activity was converted into synthesized voice in less than 10 ms, with silence returned in non-speech segments.	This is a strong same-session loop result and a strong abstention example.	The same paper still leaves long-term durability open because fixed-decoder performance declines after about 15 days.
Wilson et al. (2025)	One month of unsupervised closed-loop cursor control was obtained, with offline characterization of neural nonstationarity across five years.	This is the right reading model for Temporal Validity and recalibration burden.	Cross-day usability still does not by itself show same-state continuity or maintenance-complete control.
Flesher et al. (2021)	ICMS tactile feedback improved robotic-arm control behavior in a bidirectional BCI.	This is a classic example that bidirectional feedback can causally improve a local sensorimotor loop.	It remains a subsystem-limited causal gain rather than whole-brain WBE evidence.
Oehrn et al. (2024), Cascino et al. (2026)	Adaptive DBS was pushed into blinded randomized home-life blocks, and later chronic programming-principle work showed that eligibility, signal quality, and continuation remain practical constraints.	This is the correct reading model for deployment burden and controller-feasibility screening rather than laboratory-only success.	Symptom-control benefit still does not equal complete state reconfiguration, same-state equivalence, or one universally valid biomarker/controller pair.
Mathiopoulou et al. (2024), Stanslaski et al. (2024), Olaru et al. (2024), Mathiopoulou et al. (2025), Dixon et al. (2026)	Adaptive neuromodulation split across beta-guided bradykinesia control, multi-timescale threshold policies, dyskinesia-linked gamma, DBS-entrained prokinetic gamma, and movement-responsive decoder control.	This is the correct reading model for state-dependent controller family, not one generic `adaptive DBS` result.	It still does not show that one trigger family, one controller law, or one operating regime solves symptom-linked control across tasks, medication cycles, or home use.
Casali et al. (2013), Comolatti et al. (2019)	Perturbation-complexity metrics were formalized for TMS or intracranial stimulation responses.	Perturbation-based verification can be implemented, but only if stimulation conditions and artifact handling are fixed explicitly.	A single complexity index is still not enough for WBE pass or fail.
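Casali et al. (2013) build their index on Lempel-Ziv compression of a binarized stimulation response. The sketch below illustrates only that compression idea on a one-dimensional binary string, using the Lempel-Ziv (1976) exhaustive-history parsing. It omits source modelling, statistical thresholding, and normalization, so its output is not a perturbational complexity value, and the function name is an assumption made for this example.

```python
def lz76_complexity(sequence: str) -> int:
    """Count phrases in the Lempel-Ziv (1976) parsing of `sequence`.

    Each new phrase is the shortest substring starting at the current
    position that has not already occurred in the text preceding its
    final symbol. A trailing phrase that never becomes novel still counts.
    """
    n = len(sequence)
    position = 0
    phrases = 0
    while position < n:
        length = 1
        # Grow the candidate phrase while it can be copied from earlier text.
        while (
            position + length <= n
            and sequence[position:position + length]
            in sequence[:position + length - 1]
        ):
            length += 1
        phrases += 1
        position += length
    return phrases
```

A constant string parses into very few phrases while an irregular one parses into more, which is why the artifact window and stimulation conditions have to be fixed before such a count is compared across recordings: anything that changes the binarized response changes the count.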

State-dependent neuromodulation is controller-limited, not just state-triggered

One more split is needed for causal-verification reading. The older version of this page already separated online control from temporal durability, which was necessary. It still remained too weak for burst-driven neuromodulation, because it let readers learn adaptive DBS as if the main question were only whether stimulation responded to a detected signal. The current primary literature does not support that shortcut. On this site, a burst-driven or adaptive-neuromodulation result now has to separate the biomarker family, the symptom axis, the controller mode and timescale, the sensing-compatibility burden, the comparator policy, and the programming / continuation burden.

Layer to separate	What recent literature now supports	What must be logged on this site
Biomarker family / symptom axis	Little et al. (2013) and Tinkhauser et al. (2017) support beta-guided antikinetic control, Olaru et al. (2024) supports dyskinesia-linked narrowband gamma, Mathiopoulou et al. (2025) supports DBS-entrained prokinetic gamma, and Dixon et al. (2026) supports a movement-responsive neural-decoder route.	Name the biomarker family, symptom target, and why that pairing is being read as the relevant controller object.
State dependence / controllability	Mathiopoulou et al. (2024) showed that subthalamic beta changes with movement, dopamine, and DBS itself, and Busch et al. (2025) showed that chronic thresholds and controllability can drift across real-life use.	Name the operating regime: rest versus movement, medication state, stimulation state, and any state slices where the controller stopped being reliable.
Controller mode / timescale	Stanslaski et al. (2024) showed that adaptive DBS already splits across single-threshold, dual-threshold, and different onset-duration policies rather than one universal timing law.	Name the controller family, update interval, onset duration, ramp or smoothing policy, floor / ceiling amplitude, and fallback rule.
Sensing compatibility / artifact burden	Stanslaski et al. (2024) and Cascino et al. (2026) show that inadequate signal, artifacts, absent peaks, and incompatible settings remain major bottlenecks rather than background implementation detail.	Name sensing contacts, signal-to-noise or peak criteria, excluded hemispheres or participants, artifact resets, and any unilateral or surrogate sensing policy.
Comparator and deployability burden	Oehrn et al. (2024) showed blinded randomized symptom benefit for a personalized signal-selection route, while Cascino et al. (2026) showed that chronic continuation still depends on eligibility and repeated optimization.	Name the comparator condition, any matching rule for duty cycle or energy when relevant, the programming workflow, continuation counts, and whether the benefit survives outside the tuning context.
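To make the controller mode / timescale row concrete, here is a toy dual-threshold amplitude controller. It is a sketch of what the requested disclosure pins down (thresholds, step size, floor and ceiling amplitude, fallback rule), not the ADAPT-PD algorithm or any device's firmware. All parameter names, the hold-on-dropout fallback, and the one-step-per-update ramp are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DualThresholdController:
    lower: float         # biomarker level below which amplitude steps down
    upper: float         # biomarker level above which amplitude steps up
    floor_ma: float      # minimum allowed stimulation amplitude
    ceiling_ma: float    # maximum allowed stimulation amplitude
    step_ma: float       # amplitude change per update interval
    amplitude_ma: float  # current amplitude

    def __post_init__(self) -> None:
        if self.lower > self.upper:
            raise ValueError("lower threshold must not exceed upper threshold")
        if not self.floor_ma <= self.amplitude_ma <= self.ceiling_ma:
            raise ValueError("starting amplitude must lie within floor and ceiling")

    def update(self, biomarker: Optional[float]) -> float:
        """Advance one update interval and return the new amplitude."""
        if biomarker is None:
            # Assumed fallback rule: hold the last amplitude on sensing dropout.
            return self.amplitude_ma
        if biomarker > self.upper:
            self.amplitude_ma = min(self.ceiling_ma, self.amplitude_ma + self.step_ma)
        elif biomarker < self.lower:
            self.amplitude_ma = max(self.floor_ma, self.amplitude_ma - self.step_ma)
        # Between the two thresholds the amplitude is held.
        return self.amplitude_ma
```

Setting `lower` equal to `upper` collapses the hold band and gives a single-threshold policy. Even in this toy, every one of the six fields changes the stimulation trace, which is why two results that both say "adaptive" cannot be compared until those fields are logged.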

Fast communication BCI is still not counterfactual equivalence

Willett et al. (2023), Littlejohn et al. (2025), and Wairagkar et al. (2025) all move speech neuroprosthetics forward. However, what they establish is online decoding and closed-loop communication in a subsystem. On this site that is not upgraded to branch-equivalence, whole-brain causal equivalence, or same-state continuity without the extra card bundle above.

What this site calls a counterfactual test

On this site, we do not call something a counterfactual merely because conditions were changed. If the bundle below is incomplete, the result stays at a weaker label such as intervention response test, state-dependent controller result, or perturbation generalization test.

Condition	Why it is necessary
Branch variables are explicit	If it is unclear what was changed, it is impossible to distinguish branch comparison from noise or drift.
Comparison rules are pre-registered	If convenient branches are chosen after seeing the result, the test only looks counterfactual in retrospect.
Artifact windows and safety conditions are published	Without this, stimulation-induced artifacts can be misread as neural response.
Temporal scope is fixed explicitly	Readers need to know whether the test is same-trial, same-session, same-day, or cross-day, and whether the decoder was fixed or recalibrated.
Controller family is disclosed for state-dependent neuromodulation	Without biomarker family, controller law, sensing exclusions, and comparator policy, a burst-trigger result can be overread as generic adaptive control.
Bridge status is disclosed when same-subject language is used	A same-subject claim still needs acquisition order, elapsed time, regime continuity, and omitted drift processes before it can be read as one same-state sample.
Failure conditions are fixed in advance	The result is only falsifiable when the threshold for branch mismatch is declared before the outcome is known.
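The bundle above works as a gate: a result keeps the counterfactual label only when no required condition is missing. A minimal sketch of that gate follows, assuming a report is represented as a dictionary of booleans. The key names and the function are hypothetical, and the choice of which weaker label to apply is left to the reader, since the table does not fix it.

```python
from typing import Dict, List, Tuple

# Conditions that apply to every claimed counterfactual test.
ALWAYS_REQUIRED = [
    "branch_variables_explicit",
    "comparison_rules_preregistered",
    "artifact_windows_and_safety_published",
    "temporal_scope_fixed",
    "failure_conditions_fixed_in_advance",
]

# Conditions that apply only when the named trigger is true.
REQUIRED_IF = {
    "controller_family_disclosed": "is_state_dependent_neuromodulation",
    "bridge_status_disclosed": "uses_same_subject_language",
}


def check_counterfactual_bundle(report: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Return (bundle_complete, missing_conditions) for one report."""
    missing = [key for key in ALWAYS_REQUIRED if not report.get(key, False)]
    for condition, trigger in REQUIRED_IF.items():
        if report.get(trigger, False) and not report.get(condition, False):
            missing.append(condition)
    return (not missing, missing)
```

An absent key is treated as not satisfied, so silence in a paper counts against the label rather than for it.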

Minimum log bundle now required

Log family	What to keep	What overread it blocks
Intervention definition	Site, intensity, timing, duration, branch variable, task condition, and explicit control or sham policy.	Blocks vague claims that "something was perturbed" without a reproducible branch definition.
Artifact and safety handling	Artifact window, interpolation or masking policy, excluded trials, stop conditions, and hard-stop versus soft-fallback behavior.	Blocks device-induced changes from being misread as neural response.
Online timing and abstention	P50/P95/P99 latency, jitter, dropout, output-path delay, and abstention, silence, or hold-last-output policy.	Blocks average latency alone from standing in for actual loop behavior.
Temporal validity	Fixed-decoder interval, supervised versus unsupervised recalibration route, performance-decay curve, recovery time, and clinic-versus-home block structure.	Blocks same-session success from being silently promoted to cross-day durability.
Burst-controller disclosure	Biomarker family / symptom axis, controller family, state slice, sensing exclusions, comparator policy, and programming or rescue burden.	Blocks exploratory trigger timing from being promoted to validated symptom-linked control.
Bridge status	Same-session or cross-day status, live-to-fix or live-to-live ordering, elapsed time, regime change, and coordinate-transfer burden.	Blocks same-subject wording from being misread as one same-state sample.
Maintenance-state disclosure	If persistence, forgetting, reconsolidation, or recovery is claimed, name the relevant maintenance families or attach the Maintenance-State Error Budget.	Blocks intervention response from being upgraded to long-horizon maintenance evidence.
Boundary disclosure	Retained, substituted, and omitted sensory, motor, interoceptive, and feedback routes.	Blocks a fast local controller from being promoted to naturalistic or embodied equivalence.
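The online timing row asks for tail percentiles rather than a mean. As a minimal sketch, assuming per-update end-to-end latencies in milliseconds with `None` marking a dropped update, nearest-rank percentiles, and jitter taken as the population standard deviation of delivered latencies (all of these are assumptions, as are the function names), the summary could be computed as follows:

```python
import statistics
from typing import Dict, List, Optional, Sequence


def nearest_rank(sorted_values: List[float], percentile: int) -> float:
    """Nearest-rank percentile of an ascending list, for an integer percentile 1..100."""
    rank = max(1, (percentile * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]


def latency_summary(latencies_ms: Sequence[Optional[float]]) -> Dict[str, float]:
    """Summarize loop latency; None entries are counted as dropouts."""
    delivered = sorted(x for x in latencies_ms if x is not None)
    if not delivered:
        raise ValueError("no delivered updates to summarize")
    return {
        "p50_ms": nearest_rank(delivered, 50),
        "p95_ms": nearest_rank(delivered, 95),
        "p99_ms": nearest_rank(delivered, 99),
        "jitter_ms": statistics.pstdev(delivered),
        "dropout_rate": 1.0 - len(delivered) / len(latencies_ms),
    }
```

With eighteen updates at 10 ms, one at 50 ms, and one dropout, the median stays at 10 ms while P95 and P99 both report 50 ms, which is the behavior an average alone would hide.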

Eight questions when reading causal-verification papers

1. What was changed physically or computationally? Distinguish branch variable, decoder update, feedback path, and task manipulation.
2. Is the result same-session, cross-day, or home-life? Online success in one session is different from temporal durability.
3. Was the decoder fixed, supervised, or unsupervised? Hidden recalibration changes what the claimed causal evidence means.
4. If the result is state-dependent neuromodulation, which biomarker family and controller law were actually used? `Adaptive` is too coarse unless biomarker, timescale, comparator, and sensing exclusions are explicit.
5. Was the result compared against a named controller or only against a weak baseline? Symptom benefit and controller feasibility are different readings.
6. If the paper says same-subject or same-brain, is it really one same-state sample? Check bridge order, elapsed time, and regime continuity.
7. If persistence or recovery is claimed, where is the maintenance-state disclosure? Intervention logs alone are not enough for long-horizon claims.
8. Are we jumping from subsystem-limited causal gain to whole-brain equivalence? This remains the main overread to block.

References

Forenzo D, Zhu H, Shanahan J, Lim J, He B. Continuous tracking using deep learning-based decoding for noninvasive brain-computer interface. PNAS Nexus. 2024. doi:10.1093/pnasnexus/pgae145
Willett FR, Kunz EM, Fan C, et al. A high-performance speech neuroprosthesis. Nature. 2023. doi:10.1038/s41586-023-06377-x
Littlejohn KT, Dabagia M, Ladwig A, et al. A streaming brain-to-voice neuroprosthesis to restore naturalistic communication. Nature Neuroscience. 2025. doi:10.1038/s41593-025-01905-6
Wairagkar M, Moses DA, Metzger SL, et al. An instantaneous voice-synthesis neuroprosthesis. Nature. 2025. doi:10.1038/s41586-025-09127-3
Flesher SN, Downey JE, Weiss JM, et al. A brain-computer interface that evokes tactile sensations improves robotic arm control. Science. 2021. doi:10.1126/science.abd0380
Wilson GH, Bray N, Franken M, et al. Long-term unsupervised recalibration of cursor-based intracortical brain-computer interfaces using a hidden Markov model. Nature Biomedical Engineering. 2025. doi:10.1038/s41551-025-01536-z
Little S, Pogosyan A, Neal S, et al. Adaptive deep brain stimulation in advanced Parkinson disease. Annals of Neurology. 2013. doi:10.1002/ana.23951
Tinkhauser G, Pogosyan A, Little S, et al. The modulatory effect of adaptive deep brain stimulation on beta bursts in Parkinson's disease. Brain. 2017. doi:10.1093/brain/awx010
Mathiopoulou V, Lofredi R, Feldmann LK, et al. Modulation of subthalamic beta oscillations by movement, dopamine, and deep brain stimulation in Parkinson's disease. npj Parkinson's Disease. 2024. doi:10.1038/s41531-024-00693-3
Stanslaski S, Summers RLS, Tonder L, et al. Sensing data and methodology from the Adaptive DBS Algorithm for Personalized Therapy in Parkinson's Disease (ADAPT-PD) clinical trial. npj Parkinson's Disease. 2024. doi:10.1038/s41531-024-00772-5
Oehrn CR, Roediger J, Diehl A, et al. Chronic adaptive deep brain stimulation versus conventional stimulation in Parkinson's disease: a blinded randomized feasibility trial. Nature Medicine. 2024. doi:10.1038/s41591-024-03196-z
Olaru M, Cernera S, Hahn A, et al. Motor network gamma oscillations in chronic home recordings predict dyskinesia in Parkinson's disease. Brain. 2024. doi:10.1093/brain/awae004
Busch JL, Kaplan J, Behnke JK, et al. Chronic adaptive deep brain stimulation for Parkinson's disease: clinical outcomes and programming strategies. npj Parkinson's Disease. 2025. PMC:PMC12397205
Mathiopoulou V, Habets J, Feldmann LK, et al. Gamma entrainment induced by deep brain stimulation as a biomarker for motor improvement with neuromodulation. Nature Communications. 2025. doi:10.1038/s41467-025-58132-7
Dixon S, Oehrn C, Remple M, et al. Movement-responsive deep brain stimulation for Parkinson's disease using a remotely optimized neural decoder. Nature Biomedical Engineering. 2026. doi:10.1038/s41551-025-01438-0
Cascino S, Roediger J, Oehrn C, et al. Chronic adaptive deep brain stimulation in Parkinson's disease: ADAPT-START findings and programming principles. npj Parkinson's Disease. 2026. doi:10.1038/s41531-026-01269-z
Wilkins KB, Melbourne JA, Akella P, et al. Beta burst-driven adaptive deep brain stimulation for gait impairment and freezing of gait in Parkinson's disease. Brain Communications. 2025. PMC:PMC12268161
Lu Z, Chmielowiec J, Himes B, et al. Fixation-dependent changes in the preservation of extracellular space in the neuro-glio-vascular unit. Cell Reports Methods. 2023. doi:10.1016/j.crmeth.2023.100520
MICrONS Consortium, et al. Functional connectomics spanning multiple areas of mouse visual cortex. Nature. 2025. doi:10.1038/s41586-025-08790-w
Attardo A, Fitzgerald JE, Schnitzer MJ. Impermanence of dendritic spines in live adult CA1 hippocampus. Nature. 2015. doi:10.1038/nature14467
Hengen KB, Torrado Pacheco A, McGregor JN, Van Hooser SD, Turrigiano GG. Neuronal firing rate homeostasis is inhibited by sleep and promoted by wake. Cell. 2016. doi:10.1016/j.cell.2016.01.046
Schreiner T, Petzka M, Staudigl T, et al. Spindle-locked ripples mediate memory reactivation during human NREM sleep. Nature Communications. 2024. doi:10.1038/s41467-024-49572-8
Deng Z, Fei X, Zhang S, Xu M. A time window for memory consolidation during NREM sleep revealed by cAMP oscillation. Neuron. 2025. doi:10.1016/j.neuron.2025.03.020
Casali AG, Gosseries O, Rosanova M, et al. A theoretically based index of consciousness independent of sensory processing and behavior. Science Translational Medicine. 2013. doi:10.1126/scitranslmed.3006294
Comolatti R, Pigorini A, Casarotto S, et al. A fast and general method to empirically estimate the complexity of brain responses to transcranial and intracranial stimulations. Brain Stimulation. 2019. doi:10.1016/j.brs.2019.05.013

Where to go back next

To return to the difference between translation and generation, use Introduction to WBE. To return to the site-wide rule bundle, use Verification platform. To return to timing-side deployment detail, use Wiki: Closed Loop, Delay, Jitter, Safe Stop. To return to bridge validity itself, use Wiki: State-Continuity Bridge.
