CIPHER: Scalable Time Series Analysis for Physical Sciences with Application to Solar Wind Phenomena

Publications

Jasmine R. Kobayashi, Daniela Martin, Valmir P. Moraes Filho, Connor O’Brien, Jinsu Hong, Sudeshna Boro Saikia, Hala Lamdouar, Nathan D. Miles, Marcella Scoczynski, Mavis Stone, Sairam Sundaresan, Anna Jungbluth, Andrés Muñoz-Jaramillo, Evangelia Samara & Joseph Gallego
arXiv preprint arXiv:2510.21022 (submitted 23 Oct 2025)

Abstract:
Labeling or classifying time series is a persistent challenge in the physical sciences, where expert annotations are scarce, costly, and often inconsistent. Yet robust labeling is essential to enable machine learning models for understanding, prediction, and forecasting. We present the Clustering and Indexation Pipeline with Human Evaluation for Recognition (CIPHER), a framework designed to accelerate large-scale labeling of complex time series in physics. CIPHER integrates indexable Symbolic Aggregate approXimation (iSAX) for interpretable compression and indexing, density-based clustering (HDBSCAN) to group recurring phenomena, and a human-in-the-loop step for efficient expert validation. Representative samples are labeled by domain scientists, and these annotations are propagated across clusters to yield systematic, scalable classifications. We evaluate CIPHER on the task of classifying solar wind phenomena in OMNI data, a central challenge in space weather research, showing that the framework recovers meaningful phenomena such as coronal mass ejections and stream interaction regions. Beyond this case study, CIPHER highlights a general strategy for combining symbolic representations, unsupervised learning, and expert knowledge to address label scarcity in time series across the physical sciences. The code and configuration files used in this study are publicly available to support reproducibility.

Subjects: Machine Learning (cs.LG); Physics (physics)
License: CC BY 4.0


📌 BibTeX citation

@article{kobayashi2025cipher,
  title   = {CIPHER: Scalable Time Series Analysis for Physical Sciences with Application to Solar Wind Phenomena},
  author  = {Kobayashi, Jasmine R. and Martin, Daniela and Moraes Filho, Valmir P. and O'Brien, Connor and Hong, Jinsu and Boro Saikia, Sudeshna and Lamdouar, Hala and Miles, Nathan D. and Scoczynski, Marcella and Stone, Mavis and Sundaresan, Sairam and Jungbluth, Anna and Muñoz-Jaramillo, Andrés and Samara, Evangelia and Gallego, Joseph},
  journal = {arXiv preprint},
  year    = {2025},
  eprint  = {2510.21022},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  url     = {https://arxiv.org/abs/2510.21022}
}
