## What this route now fixes first
The short path from EEG to L0 still exists, but it now starts with a stricter question: what exactly will your score mean? Current primary literature and official standards do not support the shortcut that says "pick a public dataset, run preprocessing, train a model, report accuracy." Before any score can matter, this site now fixes the benchmark object, the temporal regime, event semantics and clock domain, artifact lineage, and the stopped claim.
The old route was too permissive. Pernet et al. (2019), the current BIDS specification, and Pernet et al. (2020) show why raw identity, derivatives, and reporting provenance must be explicit. Hermes et al. (2025) and Kothe et al. (2025) show why event semantics and synchronization still need separate audits. Chaibub Neto et al. (2019), Melnik et al. (2017), Xu et al. (2020), and Di et al. (2021) show why subject and acquisition shortcuts can survive loose evaluation. Egger et al. (2024) show that even within roughly half a day, EEG decoding conditions can shift enough to matter for robustness. The official EEG Challenge (2025) pages then show that benchmark governance itself can change what a leaderboard means. Therefore, this page no longer treats "dataset -> preprocessing -> score" as a sufficient beginner route.
This page stays on the technical and natural science side. It does not argue about philosophy, law, or identity. It only fixes what must be observable, logged, and audited before an EEG result can count as reproducible L0 work on this site.
## Six gates from EEG to L0
| Order | Page to open | What is fixed here | What must exist before moving on |
|---|---|---|---|
| 1 | EEG 101 | Fix the measurement ceiling: what EEG directly observes, what remains latent, and what kind of claim it cannot support on its own. | A one-line stopped claim such as "this route aims at reproducible macro-state analysis, not source-complete or WBE-complete recovery." |
| 2 | Datasets and Baseline / Benchmark / Pre-registration / Model Card | Choose a benchmark object, not just a file bundle: task, target, independent hold-out unit, metric bundle, version, extra-data policy, and benchmark-governance status. | A short benchmark card naming dataset/version, task, target, split unit, main metric bundle, and whether official rules or postmortems changed the benchmark meaning (a card sketch follows this table). |
| 3 | Dataset splits and data leakage and State, trait, and drift | Freeze the temporal regime: subject, session, and time disjointness; same-session versus cross-day scope; fixed versus recalibrated decoder interval. | A split manifest plus a temporal-validity note stating whether the result is same-session, same-day, cross-day, or longer-horizon, and whether the decoder stays fixed. |
| 4 | Event synchronization and observation logs | Freeze the observation contract: event times, event semantics, label provenance, clock domain, delay/jitter/drift notes, and report-usage flags. | An observation log that separates time anchor, semantics, and synchronization layer instead of mixing them into one generic metadata note. |
| 5 | Hands-on and L0 minimum artifact pack | Produce the first reproducible artifact bundle: raw identity, derivative identity, run identity, QC, baseline output, and failure registry. | A rerunnable derivative package with dataset provenance, command or pipeline provenance, environment pin, QC report, baseline output, and at least one named failure mode. |
| 6 | Verification | Convert the artifact bundle into a bounded claim: L0 ceiling, observability ceiling, shortcut ceiling, and temporal-validity ceiling. | A submission-ready stopped claim plus the required companion cards if the result starts to imply target specificity or temporal durability. |
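The gate-2 exit artifact can be made concrete as code. The sketch below records a benchmark card as a small Python dataclass; every field name and example value here is a hypothetical illustration by this page, not a key from BIDS or the EEG Challenge rules. The point is that each benchmark property becomes a named, checkable field instead of a prose sentence.

```python
# A minimal benchmark-card sketch (gate 2); all field names and values are
# hypothetical illustrations, not keys from any external standard.
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchmarkCard:
    dataset: str            # raw dataset identity, e.g. a public accession ID
    version: str            # dataset version or snapshot tag
    task: str               # task as named by the dataset
    target: str             # the variable the model is claimed to predict
    holdout_unit: str       # independent hold-out unit: "subject", "session", ...
    metrics: tuple          # the full metric bundle, not one headline number
    extra_data_policy: str  # whether external data or pretraining is allowed
    governance_note: str    # rule changes or postmortems that alter meaning

card = BenchmarkCard(
    dataset="ds-example",   # hypothetical accession, not a real dataset
    version="1.0.0",
    task="motor-imagery",
    target="left-vs-right hand imagery",
    holdout_unit="subject",
    metrics=("balanced_accuracy", "per-class recall"),
    extra_data_policy="no external EEG data",
    governance_note="none as of card date",
)
print(card)
```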
## Why these gates must stay separate
| What older beginner routes tended to compress | What current sources actually support | How this site now reads the route |
|---|---|---|
| "Dataset choice" as only a file download step | Saito & Rehmsmeier (2015) show why metric choice changes what a binary score means, and the official EEG Challenge (2025) rules plus final leaderboard show that governance changes can alter benchmark meaning after launch. | Choosing data now means choosing the benchmark object: task, target, split/randomization rule, metric bundle, version, and governance status. |
| "Clean split" as the whole leakage solution | Chaibub Neto et al. (2019), Melnik et al. (2017), Xu et al. (2020), and Di et al. (2021) show that subject/session and acquisition-distribution structure can remain highly predictive. | The route now fixes both split hygiene and shortcut resistance. A clean split is necessary, but it is not treated as proof that the target neural variable was isolated. |
| "Same-session score" as temporal generalization | Egger et al. (2024) show that EEG decoding conditions can change materially across a day-night window, and this site's Temporal Validity Card plus state-trait-drift rule now separate fixed-decoder interval, fast labels, slow internal-milieu disclosure, and recalibration burden. | The route now asks the reader to decide same-session, same-day, cross-day, or longer-horizon scope before training, and to log whether the regime changed through movement / arousal alone or through slower circadian / endocrine-metabolic state as well. |
| "Events are in BIDS" as if semantics and timing were solved together | The current BIDS specification and Hermes et al. (2025) support structured events and machine-readable semantics, while Kothe et al. (2025) makes clear that synchronization middleware does not by itself measure device-side delay. | This route now separates time anchor, event semantics, and clock/synchronization audit into distinct observation artifacts. |
| "Pipeline ran" as if provenance were sufficient | Gorgolewski et al. (2016), Pernet et al. (2019), Pernet et al. (2020), and the current BIDS specification separate raw datasets, derivatives, and generated-by provenance. | The route now requires raw identity, derivative identity, and run identity to be visible as different objects before L0 is called complete. |
## Minimum artifact bundle before one score matters
| Artifact | What it must disclose | What goes wrong if it is missing |
|---|---|---|
| Benchmark object | Dataset/version, task, target, independent hold-out unit, metric bundle, extra-data policy, and benchmark-governance status. | A score can be overread as if it applied to a different task, split regime, or official rule set. |
| Split manifest | Which subject/session/time units are disjoint, how folds were frozen, and which grouping variables were respected. | The evaluation can drift silently as folds or grouping assumptions change. |
| Temporal-validity note | Same-session versus cross-day scope, fixed versus recalibrated decoder, fast state labels, and any relevant slow internal-milieu disclosure. | Same-day success can be silently promoted to longitudinal stability or deployability. |
| Observation log | Event times, semantics, scorer or report provenance, clock domain, delay/jitter/drift notes, and bad-segment annotations. | The route cannot distinguish a signal problem from a label or timing problem. |
| Derivative lineage | Source dataset, generated-by pipeline, version or commit, environment pin, command provenance, and output locations. | Reanalysis and audit become impossible even if the main score is reproducible once. |
| Stopped claim | What the result supports and what it still does not support, in one or two sentences. | L0 can be overread as source localization truth, stable biomarker evidence, or WBE-relevant state capture. |
Not every field above is a single mandatory key in one external standard. The stronger requirement on this site is an operational inference from current standards, primary literature, and challenge practice: if a result is to count as comparable L0 progress, the benchmark object, temporal regime, observation contract, and derivative lineage all need their own artifacts rather than one mixed prose paragraph. A minimal audit of that rule is sketched below.
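The file names and the required-artifact list in this sketch are hypothetical operational choices of this page, not BIDS mandates; only the GeneratedBy check mirrors an actual key from the BIDS dataset_description.json convention.

```python
# A minimal bundle audit; file names and the required list are this site's
# hypothetical operational choices, not BIDS requirements. Only GeneratedBy
# mirrors an actual BIDS dataset_description.json key.
import json
from pathlib import Path

REQUIRED_ARTIFACTS = [
    "dataset_description.json",  # derivative identity and pipeline provenance
    "split_manifest.json",       # frozen folds and grouping variables
    "temporal_validity.json",    # same-session vs cross-day scope, decoder policy
    "observation_log.json",      # event anchors, semantics, clock-domain notes
    "qc_report.html",            # quality-control output
    "failure_registry.md",       # at least one named failure mode
]

def audit_bundle(root: str) -> list[str]:
    """Return the missing artifacts for one derivative bundle."""
    root_path = Path(root)
    missing = [name for name in REQUIRED_ARTIFACTS
               if not (root_path / name).exists()]
    desc = root_path / "dataset_description.json"
    if desc.exists() and "GeneratedBy" not in json.loads(desc.read_text()):
        missing.append("dataset_description.json lacks GeneratedBy")
    return missing

print(audit_bundle("derivatives/baseline-run-01"))  # hypothetical path
```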
## Five accidents this route now tries to stop early
| Common accident | Why it is scientifically weak | Where to return |
|---|---|---|
| Choosing a starter dataset without naming the benchmark object | The data may be fine, but the score will still be uninterpretable if task, metric bundle, and governance status are missing. | Baseline / Benchmark / Pre-registration / Model Card |
| Using a subject/session split but not naming temporal scope | The result can still be same-session or same-day only, even if the split sounds clean. | State, trait, and drift |
| Treating `events.tsv` as if it fully solved label meaning | Time anchors, condition semantics, and report-derived labels are different objects and can fail independently. | Event synchronization and observation logs |
| Treating LSL or trigger lines as hardware ground truth | Network synchronization does not automatically measure display, audio, amplifier, or device-internal delays; a timing-audit sketch follows this table. | Event synchronization and observation logs |
| Reporting one score without lineage and failure disclosure | The route becomes impossible to audit, extend, or compare even if the run once looked successful. | L0 minimum artifact pack and Verification |
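As an illustration of why the time anchor, the semantics, and the synchronization audit are three objects: the sketch below keeps BIDS-style onset anchors and trial_type labels in separate variables, then estimates clock drift and residual jitter against a second, simulated clock domain. The events table and the second clock are synthetic assumptions, and the linear-fit audit is illustrative only; it does not measure display, audio, amplifier, or device-internal delays.

```python
# A minimal timing-audit sketch; the events table and the second clock domain
# are simulated, and this fit does not replace measuring device-side delay.
import numpy as np
import pandas as pd

# In practice this would come from a BIDS events file:
# events = pd.read_csv("sub-01_task-x_events.tsv", sep="\t")
events = pd.DataFrame({
    "onset": np.arange(0.0, 60.0, 2.0),    # time anchors on the recording clock
    "trial_type": ["left", "right"] * 15,  # semantics: a separate object
})
anchors = events["onset"].to_numpy()

# The same markers as seen by a second clock domain (e.g. a stimulus PC log),
# simulated here with 50 ppm drift, 12 ms offset, and 1 ms jitter.
rng = np.random.default_rng(0)
stim_clock = anchors * (1 + 50e-6) + 0.012 + rng.normal(0.0, 0.001, anchors.size)

slope, offset = np.polyfit(anchors, stim_clock, 1)
jitter_sd = np.std(stim_clock - (slope * anchors + offset))
print(f"drift: {(slope - 1) * 1e6:.1f} ppm, offset: {offset * 1e3:.1f} ms, "
      f"jitter SD: {jitter_sd * 1e3:.2f} ms")
```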
## Where to go next
If the measurement ceiling of EEG is still not clear, return to EEG 101. If the main uncertainty is split hygiene or benchmark provenance, go to Dataset splits and data leakage. If time anchors, label provenance, or synchronization are the problem, go to Event synchronization and observation logs. Once the first artifact bundle exists, route it through Verification and attach the Temporal Validity Card or Specificity & Shortcut Card whenever the claim starts to reach beyond plain reproducible analysis.
## References
- Gorgolewski KJ, Auer T, Calhoun VD, et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data. 2016. https://doi.org/10.1038/sdata.2016.44
- Pernet CR, Appelhoff S, Gorgolewski KJ, et al. EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Sci Data. 2019. https://doi.org/10.1038/s41597-019-0104-8
- Pernet C, Garrido MI, Gramfort A, et al. Issues and recommendations from the OHBM COBIDAS MEEG committee for reproducible EEG and MEG research. Nat Neurosci. 2020. https://doi.org/10.1038/s41593-020-00709-0
- Brain Imaging Data Structure (stable): dataset_description.json, derived dataset and pipeline description. https://bids-specification.readthedocs.io/en/stable/modality-agnostic-files/dataset-description.html
- Hermes D, Bigdely-Shamlo N, Niso G, et al. HED library schema for EEG data annotation. Sci Data. 2025. https://doi.org/10.1038/s41597-025-05791-2
- Kothe C, Grivich M, Stenner T, et al. The lab streaming layer for synchronized multimodal recording. Imaging Neurosci. 2025. https://doi.org/10.1162/IMAG.a.136
- Chaibub Neto E, Pratap A, Perumal TM, et al. Detecting the impact of subject characteristics on machine learning-based diagnostic applications. npj Digit Med. 2019. https://doi.org/10.1038/s41746-019-0178-x
- Melnik A, Legkov P, Izdebski K, et al. Systems, subjects, sessions: to what extent do these factors influence EEG data? Front Hum Neurosci. 2017. https://doi.org/10.3389/fnhum.2017.00150
- Xu M, Han J, Wang Y, et al. Cross-dataset variability problem in EEG decoding with deep learning. Front Hum Neurosci. 2020. https://doi.org/10.3389/fnhum.2020.00103
- Di M, Han J, Wang Y, et al. The time-robustness analysis of individual identification based on resting-state EEG. Front Hum Neurosci. 2021. https://doi.org/10.3389/fnhum.2021.672946
- Egger M, Haden B, Bernarding J, et al. Chrono-EEG dynamics influencing hand gesture decoding: a 10-hour study. Sci Rep. 2024. https://doi.org/10.1038/s41598-024-70609-x
- Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015. https://doi.org/10.1371/journal.pone.0118432
- EEG Challenge (2025) official website. https://eeg2025.github.io/
- EEG Challenge (2025) official rules. https://eeg2025.github.io/rules/
- EEG Challenge (2025) final leaderboard and organizer postmortem. https://eeg2025.github.io/leaderboard/