Authors: Couto Pintos, Manuel; Parapar, Javier; Losada, David E.
Date accessioned: 2026-02-16
Date available: 2026-02-16
Date issued: 2026-02
Citation: Couto, M., Parapar, J., & Losada, D. E. (2026). LabChain: Enabling reproducible and modular scientific experiments in Python. SoftwareX, 33(102543). https://doi.org/10.1016/j.softx.2026.102543
ISSN: 2352-7110
Handle: https://hdl.handle.net/2183/47431

Data availability: The LabChain framework is publicly available at https://github.com/manucouto1/LabChain. The reference implementation for this article is v1.2.1. The mental health detection case study uses publicly available datasets from the eRisk shared tasks: Depression (2017, 2018, 2022), Anorexia (2018, 2019), Self-harm (2020, 2021), and Gambling (2022, 2023). These datasets can be requested from the eRisk organizers at https://erisk.irlab.org/. The complete implementation of the case study, including all pipeline configurations and preprocessing code, is available at https://github.com/manucouto1/Temporal-Word-Embeddings-for-Early-Detection.... No new data were generated or analyzed in support of this research.

[Abstract]: Python’s flexibility accelerates research prototyping but frequently results in unmaintainable code and duplicated computational effort. The absence of software engineering practices in academic development leads to fragile experiments in which even minor modifications require rerunning expensive computations from scratch. LabChain addresses this through a pipeline-and-filter architecture with hash-based caching that automatically identifies and reuses intermediate results. When evaluating multiple classifiers on the same embeddings, the framework computes the embeddings only once, regardless of how many classifiers are tested. This automatic reuse extends across research teams: if another researcher applies different models to the same preprocessed data, LabChain detects the existing results and eliminates redundant computation. Beyond efficiency, the framework’s modular structure reduces the technical debt that obscures experimental logic.
Pipelines serialize to JSON for reproducibility and for distributed execution across computational clusters. A mental health detection case study demonstrates a dual impact: computational savings exceeding 12 hours per task, with correspondingly reduced CO2 emissions, alongside substantial scientific improvements, including performance gains of up to 192.3% on some tasks. These improvements emerged from a clearer experimental organization that exposed a critical preprocessing bug hidden in the original monolithic implementation. LabChain proves that software engineering discipline amplifies scientific discovery.

Language: eng
License: Attribution-NonCommercial 4.0 International (http://creativecommons.org/licenses/by-nc/4.0/)
Keywords: Scientific workflows; Pipeline architecture; Hash-based caching; Reproducible research; Software engineering practices
Title: LabChain: Enabling reproducible and modular scientific experiments in Python
Type: journal article
Access: open access
DOI: 10.1016/j.softx.2026.102543
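The abstract's central mechanism, reusing intermediate results via hashes of the pipeline structure, can be sketched in a few lines of Python. This is an illustrative toy, not LabChain's actual API: the names `Filter` and `Pipeline`, the in-memory cache, and the decision to hash only step names and configurations (ignoring the input data, which a real system would also have to hash) are all simplifying assumptions made here for brevity.

```python
# Hypothetical sketch of hash-based pipeline caching; not LabChain's real API.
import hashlib
import json


class Filter:
    """One pipeline step, identified by its name and configuration."""

    def __init__(self, name, config, fn):
        self.name = name
        self.config = config
        self.fn = fn

    def key(self, upstream_key):
        # Hash this step's identity together with everything upstream.
        # Pipelines that share an identical prefix of steps therefore share
        # cache keys for that prefix, which is what enables automatic reuse.
        payload = json.dumps(
            {"name": self.name, "config": self.config, "upstream": upstream_key},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


class Pipeline:
    def __init__(self, steps, cache):
        self.steps = steps
        self.cache = cache  # shared dict here; disk-backed in a real system

    def run(self, data):
        key = "input"
        for step in self.steps:
            key = step.key(key)
            if key in self.cache:          # hit: skip recomputation
                data = self.cache[key]
            else:                          # miss: compute and store
                data = step.fn(data)
                self.cache[key] = data
        return data

    def to_json(self):
        # Serialize the pipeline structure (not the data) for sharing.
        return json.dumps(
            [{"name": s.name, "config": s.config} for s in self.steps], indent=2
        )


# Two pipelines share the same (expensive) embedding step but end in
# different classifiers; the shared cache means the embedding runs once.
cache = {}
embed_calls = []

def embed(texts):
    embed_calls.append(1)           # count how often the expensive step runs
    return [len(t) for t in texts]  # stand-in for a real embedding model

docs = ["some post", "another longer post"]
p1 = Pipeline([Filter("embed", {"dim": 300}, embed),
               Filter("clf_max", {}, max)], cache)
p2 = Pipeline([Filter("embed", {"dim": 300}, embed),
               Filter("clf_min", {}, min)], cache)
result1 = p1.run(docs)
result2 = p2.run(docs)
```

Because both pipelines begin with an identical `embed` step, that step's cache key is the same in both, so the second pipeline reuses the stored embeddings instead of recomputing them; this is the cross-classifier and cross-researcher reuse the abstract describes.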