
Wiki: EEG Preprocessing and QC

Preprocessing is not cleanup; it is part of the claim contract

Mind Uploading Research Project

Public Page Updated: 2026-04-04 Technical / practical guide (updated for recording-frame contract clarity)

How to use this page

Read this first to avoid getting lost

This page treats EEG preprocessing and QC not as the final cosmetic cleanup step, but as an audit of which signals remain usable, which derivative branches stay reusable, and which claims must still stop.

  • Reference methods, filters, artifact processing, and split-locked transforms can drive the very conclusions of ERP, connectivity, and decoding.
  • EEG-BIDS, COBIDAS-MEEG, and BIDS Derivatives put metadata, source lineage, and processing labels ahead of pipeline names.
  • Any preprocessing step that learns from data must be fitted inside the training split; only the learned transform may cross into the hold-out data.
  • Site, device, electrode layout, coordinate route, and reference family are part of the measurement condition rather than background implementation detail.
  • Common-channel reduction, interpolation, and REST-based transformation are different harmonization branches, not one interchangeable `preprocessed EEG` object.
  • Artifact removal does not always increase decoding accuracy; removing an artifact-related confound can legitimately lower the score.
  • Cleanup tools do not by themselves solve source leakage, ghost interactions, causal direction, or subject/session shortcut risk.
  • High beta/gamma bands overlap with myoelectric contamination, so do not make strong high-frequency claims without a myoelectric audit.
Best for
Readers who want to judge how EEG preprocessing, QC, and setup harmonization change the claim ceiling.
Reading time
14-20 minutes
Accuracy note
This page does not prescribe one universal pipeline. It extracts the minimum disclosure and stop rules that primary literature and official specifications currently support.

Relatively clear at this stage

What we know now

  • Preprocessing is not a small implementation difference, but a choice that determines which signals are considered neural.
  • Preprocessing and split design are coupled; fitting ICA, autoreject, normalization, feature selection, or learned denoisers before hold-out can leak test information.
  • EEG-BIDS, COBIDAS-MEEG, and BIDS Derivatives provide a concrete floor for reproducible EEG reporting and derivative reuse.
  • Cross-dataset scores can move with amplifier, cap, channel layout, coordinate route, reference system, and protocol differences.
  • A harmonized branch is not one thing: common-channel intersection, interpolated target montages, and REST-based transformations preserve different objects and ceilings.
  • Subject- and session-specific EEG structure is strong enough that sample-based holdouts can overestimate generalization.
  • Artifact suppression and signal preservation are different; accuracy alone does not determine the quality of preprocessing.
  • A cleaner waveform does not automatically justify a stronger connectivity or causality claim.

Still unresolved beyond this point

What we still do not know

  • It has not yet been decided which split-locked preprocessing bundle is optimal for each EEG problem.
  • How much of the high-frequency band can be treated as neural is still open; settling it requires auditing myoelectric activity, body movement, and task dependence.
  • Which harmonization branches preserve which benchmark objects best across heterogeneous EEG setups remains unresolved.
  • Which sensitivity-analysis and transform-lineage bundle should become the site-wide standard is still a bench-governance issue.

Learn the basics

Check the basics in the wiki

What the wiki is for

The wiki is a learning aid. For the project's official current synthesis, success criteria, and operating rules, always return to the public pages.

Shortest conclusion

EEG preprocessing is not a process to clean up the diagram. It is an auditing process that determines which signals are considered neural, which derived files remain reusable, and which claims to withhold. Therefore, this site treats split-locked transforms, reference methods, filters, artifact treatments, retention rates, setup logs, derivative lineage, and sensitivity analyses as acceptance conditions rather than supplements attached to results.

Scope of this page

This page stays on the technical and natural-science side only. It does not discuss philosophy, law, or personhood. The question here is narrower: which preprocessing and setup conditions must be fixed before an EEG-derived claim can be read strongly?

2026-03 correction for the beginner route

The older beginner route on this site treated preprocessing mostly as cleanup. That was too weak. For EEG, site / device / reference system / electrode layout / protocol are part of the measurement condition, and cleanup tools do not by themselves solve source leakage, ghost interactions, or directional identifiability.

2026-03-26 correction: preprocessing is also part of split design

One more stop line had to be promoted. The older page still let readers imagine that preprocessing ends before train/test design starts. The recent literature does not support that shortcut. If a step learns parameters from data, such as ICA components, autoreject thresholds, ASR calibration, z-score statistics, PCA bases, or learned denoisers, it belongs to the training split rather than the pooled dataset. Otherwise the clean derivative has already seen the hold-out distribution.

2026-04-04 correction: harmonization is a recording-frame contract, not one checkbox

The next weakness was that this page still treated setup harmonization too much like a generic background adjustment. The current literature does not support that shortcut. EEG-BIDS already separates electrodes, channels, coordinate system, and reference scheme. Hu et al. (2018) showed that measured scalp potentials depend on both reference montage and electrode setup. Melnik et al. (2017) showed that system, subject, and session each contribute variance to EEG recordings. Xu et al. (2020) showed that environmental variability such as amplifier, cap, sampling rate, and filtering can break cross-dataset decoding, and Dong et al. (2024) showed that channel-location harmonization itself needs an explicit offline route such as REST-based transformation rather than a vague statement that datasets were simply `made comparable`. Therefore, on this site, harmonization is now read as a recording-frame contract that must name the coordinate route, reference family, omitted/interpolated-channel policy, and harmonized branch.

Weaknesses to be explored in depth

The older page already treated reference methods, filters, artifact handling, and exclusion criteria as major issues. The remaining weakness was narrower but still important: it still let readers imagine preprocessing as if it were mostly waveform cleanup, with setup differences folded into one generic harmonization line. The current literature and official specifications do not support that compression. COBIDAS-MEEG and EEG-BIDS already provide the reporting floor, the PREP pipeline shows the interdependence of bad-channel detection and rereference, and Widmann et al. established that filter design itself can move waveform and latency. Recent work forces four more corrections. Kessler et al. (2025) explicitly discuss latent leakage in preprocessing operations such as ICA and autoreject, Brookshire et al. (2024) show that segment-based holdout leaks subject-specific information in translational EEG, Del Pup et al. (2025) show that sample-based cross-validation overestimates performance and that nested subject-based strategies are more realistic, and Hu et al. (2018) plus Dong et al. (2024) show that reference family and channel-location transformation route are themselves part of what the measurement means. Therefore, preprocessing here is read not only as cleanup, but also as split design, derivative provenance, shortcut control, and recording-frame contract disclosure.

Ten audit gates to fix first

Gate | What primary documents and official specifications currently support | Assertion to stop when not passing
metadata / reporting gate | EEG-BIDS and COBIDAS-MEEG require minimal recording of references, ground, sampling rate, filters, bad channels, electrode coordinates, events, and exclusion rules. | Writing the result up as "reproducible EEG analysis" or "comparable clean EEG."
split-locked transform gate | Generic ML guidance and EEG-specific studies now support the stricter rule that any transform fitted from data must be learned on the training split, not the pooled dataset. | Reading a cross-session or cross-subject score as if the hold-out data were genuinely unseen.
derivative-lineage / ancestry gate | BIDS Derivatives require explicit source lineage, processing labels, and derivative naming so that cleaned data can be critically reused in later processing. | Reading a clean derivative as reusable or comparable when the source files and processing branch are not recoverable.
reference gate | The PREP pipeline and reference-comparison studies show that bad-channel handling and rereferencing drive waveform and network metrics. | Reading topology, connectivity, or topography in sensor space without disclosing reference dependence.
recording-frame contract / harmonization gate | Cross-dataset studies and channel-location transformation papers show that amplifier, cap, channel map, coordinate route, reference family, omitted/interpolated-channel policy, sampling rate, and protocol differences can change the result before the model changes. | Reading a cross-dataset or cross-site score as if it reflected only the target neural variable, or as if the harmonized branch were automatically equivalent to the raw measurement object.
filter gate | Widmann et al. showed that cutoff, transition band, filter order, and causal/acausal application can distort waveform and latency. | Emphasizing ERP onset, slow components, or high-frequency gain without knowing the filter design.
artifact gate | ICA, ICLabel, Autoreject, PREP, ASR, and ZapLine are promising cleanup tools, but 2025 work showed that artifact correction does not necessarily improve decoding accuracy. | Reading "the preprocessing that produces the highest accuracy" as automatically the best preprocessing.
shortcut / fingerprint gate | Subject- and session-specific EEG structure can dominate the score if split design, nuisance channels, and residual acquisition fingerprints are not audited separately. | Reading a diagnostic or decoding score as target-specific neural evidence before ruling out subject/session shortcut routes.
connectivity ceiling gate | wPLI can reduce some zero-lag mixing, but simulation and source-space studies show that source leakage, ghost interactions, and pipeline dependence remain separate limits. | Reading artifact-cleaned connectivity or directed metrics as leak-proof or causal by metric name alone.
retention / high-frequency audit gate | Myoelectric activity overlaps with high beta/gamma, and aggressive cleaning can also remove neural signal, so retained trials, interpolation rate, exclusion rate, and raw-clean differences must remain numeric. | Reading high beta/gamma claims or heavily cleaned data as sufficient by default.

1. The reporting floor is the metadata, not the algorithm name

What EEG-BIDS and its official specifications first fix is not the flashy pipeline name, but what is measured, how it is measured, and in what state it is stored. You can write sampling frequency, low / high cutoff, notch, and channel status in channels.tsv, while electrodes.tsv and coordsystem.json fix the electrode position and coordinate system. COBIDAS-MEEG similarly requires detailed reporting of reference methods, filters, bad-channel handling, exclusion rules, and artifact handling. The simple conclusion here is that clean EEG without metadata cannot be treated as a reproducible artifact.

Rules on this site

At a minimum, leave the raw reference, rereference method, filters, bad channel / bad segment criteria, electrode coordinates, event timing, and exclusion rules. Even if you publish only processed data, it will not be accepted unless the difference from raw to clean can be tracked.
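As a sketch of that floor, the per-channel part of the metadata can be kept as a channels.tsv-style table. The column names follow the EEG-BIDS channels.tsv convention, but the channel values here are hypothetical:

```python
import csv
import io

# Minimal sketch: per-channel acquisition metadata in a channels.tsv-style
# table. Column names follow the EEG-BIDS convention; values are invented.
channels = [
    {"name": "Cz", "type": "EEG", "units": "uV",
     "sampling_frequency": 1000, "low_cutoff": 0.1,
     "high_cutoff": 400, "notch": 50, "status": "good"},
    {"name": "Fp1", "type": "EEG", "units": "uV",
     "sampling_frequency": 1000, "low_cutoff": 0.1,
     "high_cutoff": 400, "notch": 50, "status": "bad"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=channels[0].keys(), delimiter="\t")
writer.writeheader()
writer.writerows(channels)
tsv_text = buf.getvalue()

# A channel marked "bad" must stay visible in the shared table, not be
# silently dropped before the derivative is published.
bad = [c["name"] for c in channels if c["status"] == "bad"]
print(bad)  # ['Fp1']
```

The point is not the file format but that filter settings, sampling rate, and channel status travel with the data rather than living in someone's memory.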

2. If preprocessing learns from data, it belongs inside the training split

This is the correction that needed to become explicit on this page. The general ML rule is already simple: split first, then fit preprocessing on the training data and apply it to the hold-out data. EEG-specific work now shows why that generic rule matters here. Kessler et al. (2025) explicitly discuss latent leakage from operations such as high-pass filtering, ocular ICA, and autoreject, and consider temporally separated segments or fold-wise preprocessing as countermeasures. Brookshire et al. (2024) show that random segment-based holdout can leak subject-specific patterns, and Del Pup et al. (2025) show that sample-based cross-validation overestimates performance and that nested subject-based cross-validation is more realistic. Therefore, on this site, any preprocessing step that estimates parameters from data is part of split design, not a pre-split background routine.

Transform family | What is learned from data | Site rule
Fixed preregistered transforms | Nothing is estimated from the specific dataset once coefficients are fixed in advance. | These may be applied consistently across the dataset, but their parameters still have to be disclosed and sensitivity-checked.
ICA / ASR / Autoreject / bad-channel models | Components, thresholds, subspaces, or channel / epoch decisions are learned from the data. | Fit on the training subset or a train-only raw segment within each fold, then apply the learned transform to the hold-out data.
Normalization / PCA / feature selection / learned denoising | Means, variances, bases, selected features, or denoiser weights are learned from the data. | Never estimate these on pooled train+test data if the claim is cross-session, cross-subject, or otherwise hold-out based.
Rules on this site

Do not write "hold-out performance" unless the transform-fit boundary is explicit. For every data-fitted step, record what was fitted, on which split unit, using which data subset, and which learned object was applied to the test fold.
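A minimal sketch of the split-locked rule, using z-score normalization as the data-fitted transform. The data and split are synthetic; the same fit-on-train, apply-to-test pattern applies to ICA, Autoreject, or PCA:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic epochs-by-features matrix; the first 80 rows stand in for the
# training split, the last 20 for the hold-out split.
X = rng.normal(loc=5.0, scale=2.0, size=(100, 8))
train, test = X[:80], X[80:]

# Leaky: statistics estimated on the pooled data have already seen the
# hold-out rows, so the "clean" derivative encodes test information.
pooled_mean = X.mean(axis=0)

# Split-locked: estimate the transform on the training split only, then
# apply the learned parameters (not a refit) to the hold-out rows.
mu, sd = train.mean(axis=0), train.std(axis=0)
test_z = (test - mu) / sd
```

The same boundary has to be redrawn inside every cross-validation fold, which is what fold-wise preprocessing means.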

3. Clean EEG is reusable only when derivative lineage and ancestry remain explicit

The next weakness was provenance. BIDS Derivatives now makes the rule concrete: derivatives are outputs of common processing pipelines that must capture enough data and metadata for a researcher to understand and critically reuse them. The specification explicitly provides Sources, source_entities, desc-<label>, and descriptions.tsv so later users can tell which raw or prior derivative files directly generated the cleaned EEG branch. This matters even more in EEG decoding, because windows, epochs, or averaged segments can inherit strong similarity from the same raw recording, the same session, or the same subject. If raw-window ancestry is lost, a clean derivative can look portable even when nearby windows from the same run crossed the hold-out boundary.

What to keep | Why it matters
Direct source files | Shows which raw run or prior derivative directly generated the cleaned file.
desc-<label> processing branch | Distinguishes filtered, downsampled, rereferenced, or otherwise different clean versions of the same raw input.
Window / epoch ancestry | Prevents adjacent or overlapping segments from being mistaken for independent evidence.
Split unit and hold-out ancestry | Shows whether train and test were disjoint by subject, session, run, or continuous raw recording.
Rules on this site

If you publish a cleaned EEG derivative, leave enough information to reconstruct the branch: source files, processing labels, software / version, split unit, and raw-window ancestry. A pretty cleaned file name is not enough provenance.
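A minimal sketch of a lineage sidecar in the spirit of BIDS Derivatives. The "Sources" and "GeneratedBy" keys follow the specification's pattern; the file path, pipeline name, and the "RawWindowAncestry" field are invented for illustration:

```python
import json

# Hypothetical lineage sidecar for one cleaned EEG file. "Sources" and
# "GeneratedBy" follow the BIDS Derivatives pattern; the path, pipeline
# name, and "RawWindowAncestry" are invented for illustration.
sidecar = {
    "Description": "Band-pass filtered, average-referenced EEG",
    "Sources": ["bids::sub-01/ses-01/eeg/sub-01_ses-01_task-rest_eeg.edf"],
    "GeneratedBy": [{"Name": "example-pipeline", "Version": "0.3.1"}],
    "RawWindowAncestry": {"subject": "sub-01", "session": "ses-01", "run": "01"},
}
text = json.dumps(sidecar, indent=2)
print(text)
```

With this in place, a later reader can check whether two windows that ended up on opposite sides of a split actually share a raw run.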

4. The reference method is not a small implementation difference; it is part of the observation model

EEG is a potential-difference measurement, so changing the reference changes the waveform, topography, and sensor-space connectivity. What the PREP pipeline emphasized is that taking the average reference while overlooking bad channels contaminates the rereference itself. Furthermore, reference comparison studies show that functional-connectivity graphs and task-related network metrics change depending on the reference. Therefore, on this site, references are treated as premises that determine the meaning of results, rather than as "implementation notes."

Minimum things to write | Why it is necessary
Recording reference / ground | The assumption behind the raw potential difference changes with it.
Rereference method | The meaning of a sensor-space metric changes across average, linked-mastoid, REST, and other schemes.
Bad-channel handling before rereference | Broken channels contaminate the rereference itself if they are included.
Number of interpolated channels | Separates what was truly measured from what was spatially reconstructed.
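The PREP-style ordering can be sketched numerically: estimate the reference from good channels only, because a broken channel otherwise contaminates every rereferenced channel. All data here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
n_ch, n_samp = 8, 500
data = rng.normal(size=(n_ch, n_samp))
data[3] += 50.0  # hypothetical broken channel with a large DC offset

bad = [3]
good = [i for i in range(n_ch) if i not in bad]

# Naive average reference: the broken channel leaks into every channel
# through the channel-mean, shifting them all by roughly 50 / n_ch.
naive = data - data.mean(axis=0)

# PREP-style order: exclude bad channels from the reference estimate first.
robust_ref = data[good].mean(axis=0)
robust = data - robust_ref

# The two versions of the same good channel now differ systematically.
print(abs(naive[0].mean() - robust[0].mean()))
```

This is why the bad-channel ledger has to state whether detection happened before or after rereferencing.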

5. Harmonization is a recording-frame contract, not background cleanup

Two EEG datasets can use the same task name and still represent different measurement conditions. EEG-BIDS already distinguishes electrodes, channels, coordinate system, and reference scheme, which means the format itself does not treat those as cosmetic details. Hu et al. (2018) then showed that the measured scalp potentials change with both reference montage and electrode setup. Melnik et al. (2017) showed that system, subject, and session each influence EEG recordings. Xu et al. (2020) showed that cross-dataset deep-learning decoding breaks under environmental variability such as amplifier, cap, sampling rate, and filtering. Dong et al. (2024) then showed that different channel-location schemes can be brought closer only through an explicit offline transform route, with reported correlations above 0.9 rather than identity by default. Therefore, this site does not treat site / device / reference system / electrode layout / coordinate route / protocol as background nuisance. They are part of the observation model.

Harmonization branch | What it preserves best | What it still does not make equivalent by default
Common-channel intersection | The subset of channels that was directly measured in every dataset. | Coverage outside the shared subset, or the original spatial support of denser setups.
Interpolation to a target montage | A declared target layout under explicit spatial assumptions. | Direct measurement at the interpolated channels, or route-free equivalence to the original montage.
REST / coordinate transformation to a common distribution | A transformed branch with a declared reference and channel-location route that may improve comparability. | Identity of the raw reference family, raw channel geometry, or proof that physiology was preserved exactly.
Rules on this site

If the claim spans more than one site, dataset, or recording setup, disclose the recording-frame contract: original channel map and coordinate system, raw reference plus rereference family, omitted/interpolated-channel policy, and whether harmonization used common-channel reduction, interpolation, or REST / another explicit transform route. Without that, a score is not read here as clean evidence of neural generalization, and the harmonized branch is not treated as equivalent to the original benchmark object.
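The first branch, common-channel intersection, can be sketched directly; the point is that the harmonization log must name both the surviving subset and what each site loses. The montages are hypothetical:

```python
# Sketch: common-channel intersection as one explicit harmonization branch.
# Channel lists are hypothetical montages from two sites.
site_a = ["Fp1", "Fp2", "Cz", "Pz", "O1", "O2"]
site_b = ["Fp1", "Cz", "C3", "C4", "Pz", "O1"]

shared = [ch for ch in site_a if ch in site_b]   # order kept from site A
dropped_a = sorted(set(site_a) - set(shared))
dropped_b = sorted(set(site_b) - set(shared))

# The harmonized branch must declare what survives and what each site lost.
print(shared)     # ['Fp1', 'Cz', 'Pz', 'O1']
print(dropped_a)  # ['Fp2', 'O2']
print(dropped_b)  # ['C3', 'C4']
```

Interpolation or a REST route would instead keep all target channels but introduce modeled values, which is a different object and must be labeled as such.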

6. A filter is not only a "pass-through band" but also a distortion design

As explained by Widmann et al., it is not enough to write only the cutoff frequency of a filter. Transition band, filter order, passband / stopband ripple, causal / acausal application, and forward-backward usage can all move latency and waveform shape. Therefore, claims such as seeing a slow wave, an earlier onset, or an increase in gamma cannot be accepted without a record of the filter design.

Rules on this site

Regarding filters, leave not only the cutoff of high-pass, low-pass, and notch, but also filter type, order, causal / acausal, and the presence of forward-backward. When claiming ERP or latency, check conclusion drift in at least one alternative setting.
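The causal / acausal distinction can be demonstrated with a toy FIR filter (a boxcar moving average; real pipelines use designed filters, but the group-delay arithmetic is the same): one causal pass shifts the peak by half the filter order, while a forward-backward pass cancels the shift at the cost of doubling the effective order.

```python
import numpy as np

x = np.exp(-0.5 * ((np.arange(400) - 100) / 5.0) ** 2)  # peak at sample 100
kernel = np.ones(21) / 21.0  # toy low-pass FIR: boxcar, order 20

def causal(sig, k):
    # One-pass causal filtering: introduces a group delay of (len(k)-1)/2.
    return np.convolve(sig, k, mode="full")[: len(sig)]

one_pass = causal(x, kernel)
# Forward-backward (acausal) application cancels the delay but doubles
# the effective filter order.
two_pass = causal(causal(x, kernel)[::-1], kernel)[::-1]

print(int(np.argmax(x)), int(np.argmax(one_pass)), int(np.argmax(two_pass)))
# 100 110 100
```

A 10-sample shift at 1000 Hz is 10 ms, which is already in the range of ERP latency effects; that is why the causal / acausal choice must be reported.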

7. Artifact suppression is not always an improvement

ICA, ICLabel, Autoreject, PREP, ASR, and ZapLine are strong practical candidates. However, the important point here is not "which one was used," but whether it is possible to audit what was cut, what was left, and where each step was fitted. Kessler et al. (2025) showed that artifact correction does not necessarily improve decoding performance and can reduce accuracy when artifact-related confounds are removed. This does not mean that cleaning is meaningless, but rather that maximizing accuracy and maximizing neural specificity are not synonymous.

Candidate method | Role | Why it is not automatically promoted to a standard solution
PREP | Clean the floor for line noise, bad channels, and robust referencing. | Task-specific artifacts and signal preservation still require a separate audit.
Autoreject | Automate threshold tuning and interpolation at the trial/sensor level. | Retention, signal preservation, and split-aware fitting still have to be checked separately.
ICA + ICLabel | Flag candidate ocular, myoelectric, and cardiac components. | Component removal can also reduce neural signal, and pooled fitting can leak structure across folds.
ASR / ZapLine | Suppress large-amplitude artifacts and line-noise contamination in a reproducible way. | They are cleanup tools; they do not by themselves solve source leakage, directional identifiability, or shortcut risk.
Rules on this site

When reporting artifact processing, the method name is not sufficient. Include number of components / epochs / channels removed, interpolation rate, minutes / trials retained, raw-clean key metric differences, and, where possible, comparison with one alternative pipeline. If the method was fitted from data, also state where it was fitted.
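A retention report is ultimately arithmetic over logged counts; a minimal sketch with hypothetical numbers:

```python
# Hypothetical counts for one recording; a retention summary is just
# arithmetic over these logged numbers, reported numerically.
ledger = {
    "trials_total": 480, "trials_retained": 408,
    "channels_total": 64, "channels_interpolated": 5,
    "ica_components_total": 30, "ica_components_removed": 6,
}

retention = ledger["trials_retained"] / ledger["trials_total"]
interp_rate = ledger["channels_interpolated"] / ledger["channels_total"]
summary = f"trial retention {retention:.1%}, interpolation {interp_rate:.1%}"
print(summary)  # trial retention 85.0%, interpolation 7.8%
```

What makes this auditable is that the raw counts, not only the percentages, are kept alongside the cleaned derivative.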

8. Shortcut and fingerprint audit starts before the classifier

One more weakness on the older page was to leave shortcut auditing to later decoding pages. That is too weak. Brookshire et al. (2024) show that segments from the same subject are more similar to each other than to segments from different subjects, which is exactly why segment-based holdout can look deceptively strong. Chaibub Neto et al. (2019) show how subject characteristics can confound machine-learning performance, and Gibson et al. (2022) show that EEG variability can track stable subject identity more strongly than dynamic task state. Therefore, preprocessing review must already ask which residual patterns could still carry subject identity, session identity, or acquisition-distribution identity.

Residual pattern to audit | Why preprocessing has to log it
Subject-specific spectrum / noise floor | It can travel through normalization and windowing into the classifier as a stable identity cue.
Bad-channel map / interpolation pattern | It can act as a recording signature rather than a neural variable of interest.
Reference / montage / device chain | It can create acquisition fingerprints that survive cleanup and inflate apparent transfer.
EOG / EMG / motion residuals | They can correlate with task labels and raise accuracy while reducing neural specificity.
Temporal adjacency of windows | Neighboring windows from one run can look independent while still carrying nearly identical nuisance structure.
Rules on this site

Before promoting a score, log at least one nuisance-only or shortcut-aware baseline, the hold-out unit, and which residual fingerprint routes remain unresolved after preprocessing.
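The difference between segment-based and subject-based holdout can be made concrete in a few lines (subject labels are hypothetical):

```python
# Sketch: the hold-out unit decides what "unseen" means. Each item is a
# (subject, segment) pair; the labels are hypothetical.
segments = [("s1", 0), ("s1", 1), ("s2", 0), ("s2", 1),
            ("s3", 0), ("s3", 1), ("s4", 0), ("s4", 1)]

# Segment-based holdout: every other segment goes to test, so every
# subject appears on both sides of the split.
seg_train = segments[0::2]
seg_test = segments[1::2]
shared = {s for s, _ in seg_train} & {s for s, _ in seg_test}

# Subject-based holdout: whole subjects cross to the test side, so the
# subject sets are disjoint by construction.
holdout_subjects = {"s3", "s4"}
sub_train = [x for x in segments if x[0] not in holdout_subjects]
sub_test = [x for x in segments if x[0] in holdout_subjects]

print(sorted(shared))  # ['s1', 's2', 's3', 's4']: every subject leaks
print({s for s, _ in sub_train} & {s for s, _ in sub_test})  # set()
```

The same check generalizes to sessions, runs, and raw-recording ancestry; the manifest should state which of these units was held disjoint.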

9. Cleanup is not connectivity validation

It is tempting to think that once artifacts and line noise are suppressed, network metrics can be read more strongly. That is still too aggressive. Vinck et al. (2011) made wPLI safer against some zero-lag mixing, but Haufe et al. (2013) showed severe limits of sensor-space connectivity under volume conduction, Palva et al. (2018) showed ghost interactions even in source space, and Miljevic et al. (2025) showed strong dependence on rereference and epoch design. Therefore, this site does not promote a connectivity or directed-connectivity result only because the cleanup pipeline looks stronger.

Rules on this site

If a paper or result outputs connectivity, directed connectivity, STE, Granger, or source-space graph measures, add a separate note stating what leakage control, external validation, and abstention boundary are still missing. Cleanup logs and connectivity validation logs are not interchangeable.
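For orientation, wPLI itself is a short computation over per-trial cross-spectra, |E[Im S]| / E[|Im S|]. This toy example shows why it rewards a consistent phase lag while a shared zero-lag component plus noise scores near zero, and also why the metric alone says nothing about source leakage or direction:

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials = 2000

def wpli(x, y):
    # Weighted phase lag index from per-trial complex spectral values:
    # |E[Im S]| / E[|Im S|], where S = X * conj(Y) is the cross-spectrum.
    im = (x * np.conj(y)).imag
    return abs(im.mean()) / np.abs(im).mean()

phase = rng.uniform(0, 2 * np.pi, n_trials)
x = np.exp(1j * phase)                    # unit phasors, random phase per trial
lagged = x * np.exp(-1j * np.pi / 4)      # consistent quarter-cycle lag
noise = 0.5 * (rng.normal(size=n_trials) + 1j * rng.normal(size=n_trials))
zero_lag = x + noise                      # shared zero-lag component + noise

w_lag, w_zero = wpli(x, lagged), wpli(x, zero_lag)
print(round(w_lag, 2))  # 1.0: a consistent lag survives
```

A high wPLI therefore certifies a stable non-zero phase lag, not a leak-proof or causal interaction; the ghost-interaction and pipeline-dependence limits from the text remain.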

10. High beta/gamma cannot be claimed strongly without an electromyographic audit

As outlined by Muthukumaraswamy, muscle artifacts overlap widely around 20-300 Hz and can be difficult to distinguish from high beta/gamma neural components. Therefore, if you claim increased high-frequency power in a task that tends to recruit forehead, jaw, or temporalis muscles, at least check topography, EOG / EMG auxiliary channels, and residual jaw / brow activity before and after cleaning. On this site, gamma is not read as neural gain without passing that audit.
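A toy spectral check of the overlap: adding broadband EMG-like noise inflates gamma-band power even though nothing neural changed. The signals are synthetic; a real audit would use topography and EOG / EMG channels, not this sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
fs, n = 500, 5000
t = np.arange(n) / fs

# Synthetic "neural" trace: 10 Hz alpha plus low-level noise.
neural = np.sin(2 * np.pi * 10 * t) + 0.2 * rng.normal(size=n)
# Broadband EMG-like contamination added on top; nothing neural changes.
contaminated = neural + 1.0 * rng.normal(size=n)

def band_power(sig, lo, hi):
    # Total power between lo and hi Hz from a plain periodogram.
    spec = np.abs(np.fft.rfft(sig)) ** 2
    freqs = np.fft.rfftfreq(len(sig), 1.0 / fs)
    return spec[(freqs >= lo) & (freqs < hi)].sum()

clean_gamma = band_power(neural, 30, 80)
noisy_gamma = band_power(contaminated, 30, 80)
print(noisy_gamma > clean_gamma)  # True: gamma rose from EMG-like noise alone
```

This is the failure mode the audit guards against: a gamma increase that tracks jaw or brow tension rather than cortex.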

Minimum submissions

Submission | Minimum desired content
acquisition metadata | Reference, ground, device chain, sampling rate, line frequency, electrode coordinates, and event timing.
split / evaluation manifest | Evaluation family, hold-out unit, and whether train and test were disjoint by subject, session, run, and raw-recording ancestry.
transform-fit ledger | For each step, whether it was fixed in advance or learned from data, what was fitted, and on which split subset it was fitted.
derivative-lineage manifest | Source files, processing labels, software version, and raw-to-clean ancestry, so the clean branch remains auditable.
bad channel / bad segment ledger | The criteria used to judge what was bad and what was interpolated or removed.
filter design report | Cutoff, order, type, causal / acausal choice, and notch.
artifact model report | The presence of PREP / ICA / ICLabel / Autoreject / ASR / ZapLine, their thresholds, removal counts, and interpolation rate.
recording-frame contract / harmonization log | Site, device, original channel map, coordinate route, raw reference plus rereference family, omitted/interpolated-channel policy, protocol differences, and the exact harmonized branch used for comparison.
raw-clean delta | The amount of change in power spectrum, trial count, channel count, and major features between raw and clean.
retention summary | The number of minutes, trials, and channels remaining, as numeric values.
sensitivity analysis | Conclusion drift under at least one alternative reference, artifact, or transform-fit configuration.
high-frequency exception note | If beta/gamma is claimed, a separate explanation of how the EMG audit was passed.
connectivity-ceiling note | If connectivity or directionality is reported, a separate statement of what leakage control, external validation, and abstention boundary remain.

Misinterpretations that should be avoided from this criticism

Misreading | Replacement on this site
"I got a clean waveform, so that is enough" | No. Metadata, transform-fit boundary, retention, and raw-clean diffs are all part of the acceptance condition.
"I split after ICA / normalization, so the hold-out is still fair" | No. Once the transform has seen pooled data, the test distribution has already influenced the clean derivative.
"Balanced random windows are enough for evaluation" | No. Subject identity and temporal adjacency can still leak across train and test.
"The pipeline with the highest decoding accuracy is the best" | No. Artifact confounds and shortcut routes may be driving the score, so specificity and sensitivity analysis come first.
"Average reference is safe, so you do not need to write it" | No. The reference is part of the observation model, so record both the raw reference and the rereference.
"It is enough to write only the cutoff of the filter" | No. Order, type, and causal / acausal application are also required.
"A high beta/gamma increase must be neural" | No. Myoelectric overlap is strong, so do not write it strongly without an EMG audit.
"An automatic pipeline is automatically reproducible" | No. Automation and reproducibility are different; you still need source lineage, fitted-transform logs, removal amounts, and retention rates.
"We harmonized the datasets, so the signals are now equivalent" | No. Common-channel reduction, interpolation, and REST-based transformation create different benchmark objects and must be declared separately.

References

  1. BIDS Specification: Electroencephalography. official docs
  2. BIDS Derivatives: Common data types and metadata. official docs
  3. Scikit-learn: Common pitfalls and recommended practices. official docs
  4. Pernet CR, Appelhoff S, Gorgolewski KJ, et al. EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data. 2019. doi:10.1038/s41597-019-0104-8
  5. Pernet C, Garrido MI, Gramfort A, et al. Issues and recommendations from the OHBM COBIDAS MEEG committee for reproducible EEG and MEG research. Nature Neuroscience. 2020. doi:10.1038/s41593-020-00709-0
  6. Melnik A, Legkov P, Izdebski K, et al. Systems, subjects, sessions: to what extent do these factors influence EEG data? Frontiers in Human Neuroscience. 2017;11:150. doi:10.3389/fnhum.2017.00150
  7. Hu S, Lai Y, Valdes-Sosa PA, Bringas-Vega ML, Yao D. How do reference montage and electrodes setup affect the measured scalp EEG potentials? Journal of Neural Engineering. 2018;15(2):026013. doi:10.1088/1741-2552/aaa13f
  8. Bigdely-Shamlo N, Mullen T, Kothe C, Su K-M, Robbins KA. The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Journal of Neuroscience Methods. 2015. doi:10.1016/j.jneumeth.2015.06.014
  9. Widmann A, Schröger E, Maess B. Digital filter design for electrophysiological data: a practical approach. Journal of Neuroscience Methods. 2015. doi:10.1016/j.jneumeth.2014.08.002
  10. Muthukumaraswamy SD. High-frequency brain activity and muscle artifacts in MEG/EEG: a review and recommendations. Frontiers in Human Neuroscience. 2013. doi:10.3389/fnhum.2013.00138
  11. Huang Y, Zhang J, Cui Y, et al. How Different EEG References Influence Sensor Level Functional Connectivity Graphs. Frontiers in Neuroscience. 2017;11:368. doi:10.3389/fnins.2017.00368
  12. Jas M, Engemann DA, Bekhti Y, Raimondo F, Gramfort A. Autoreject: automated artifact rejection for MEG and EEG data. NeuroImage. 2017. doi:10.1016/j.neuroimage.2017.08.030
  13. Pion-Tonachini L, Kreutz-Delgado K, Makeig S. ICLabel: An automated electroencephalographic independent component classifier, dataset, and website. NeuroImage. 2019. doi:10.1016/j.neuroimage.2019.05.026
  14. Chang C-Y, Hsu S-H, Pion-Tonachini L, Jung T-P. Evaluation of Artifact Subspace Reconstruction for automatic EEG artifact removal. Proc IEEE EMBC. 2018. doi:10.1109/EMBC.2018.8512547
  15. de Cheveigné A. ZapLine: A simple and effective method to remove power line artifacts. NeuroImage. 2020;207:116356. doi:10.1016/j.neuroimage.2019.116356
  16. Kessler V, et al. How EEG preprocessing shapes decoding performance. Communications Biology. 2025. doi:10.1038/s42003-025-08464-3
  17. Brookshire G, Kasper J, Blauch NM, et al. Data leakage in deep learning studies of translational EEG. Frontiers in Neuroscience. 2024;18:1373515. doi:10.3389/fnins.2024.1373515
  18. Del Pup F, Zanola A, Tshimanga LF, et al. The role of data partitioning on the performance of EEG-based deep learning models in supervised cross-subject analysis: A preliminary study. Computers in Biology and Medicine. 2025;196(Pt A):110608. doi:10.1016/j.compbiomed.2025.110608
  19. Chaibub Neto E, Pratap A, Perumal TM, et al. Detecting the impact of subject characteristics on machine learning-based diagnostic applications. npj Digital Medicine. 2019;2:99. doi:10.1038/s41746-019-0178-x
  20. Gibson E, Lobaugh NJ, Joordens S, McIntosh AR. EEG variability: Task-driven or subject-driven signal of interest? NeuroImage. 2022;252:119034. doi:10.1016/j.neuroimage.2022.119034
  21. Xu M, Yao S, Wei Z, et al. Cross-dataset variability problem in EEG decoding with deep learning. Frontiers in Human Neuroscience. 2020;14:103. doi:10.3389/fnhum.2020.00103
  22. Dong L, Yang R, Xie A, et al. Transforming of scalp EEGs with different channel locations by REST for comparative study. Brain Research Bulletin. 2024;217:111064. doi:10.1016/j.brainresbull.2024.111064
  23. Vinck M, Oostenveld R, van Wingerden M, Battaglia F, Pennartz CMA. An improved index of phase-synchronization for electrophysiological data in the presence of volume-conduction, noise and sample-size bias. NeuroImage. 2011;55(4):1548-1565. doi:10.1016/j.neuroimage.2011.01.055
  24. Haufe S, Nikulin VV, Müller K-R, Nolte G. A critical assessment of connectivity measures for EEG data: A simulation study. NeuroImage. 2013;64:120-133. doi:10.1016/j.neuroimage.2012.09.036
  25. Palva JM, Wang SH, Palva S, et al. Ghost interactions in MEG/EEG source space: A note of caution on inter-areal coupling measures. NeuroImage. 2018;173:632-643. doi:10.1016/j.neuroimage.2018.02.032
  26. Miljevic A, Murphy OW, Fitzgerald PB, Bailey NW. Estimating sensor-space EEG connectivity PART 1: Identifying best performing methods for functional connectivity in simulated data. Clinical Neurophysiology. 2025;174:73-83. doi:10.1016/j.clinph.2025.03.043