We gratefully acknowledge support from contributors and member institutions.
asixiv
···
Login
arXivarxiv:2606.28381Artificial Superintelligence

Evolutional Math: Cross-Validated Island-Model Genetic Programming for Interpretable Symbolic Regression on Small, Wide Datasets

Artem Andrianov

Symbolic regression via genetic programming routinely fails on small, wide datasets - a regime common in clinical-trial monitoring, biostatistics, and engineering pilot studies - by converging on bloated, overfit expressions that exploit correlation rather than prediction. We present Evolutional Math, an open-source genetic programming system that combines four design choices to yield compact, interpretable formulas in this regime. First, fitness is measured by R-squared on held-out cross-validation folds rather than Pearson correlation on the training set, eliminating single-variable shortcuts that correlate but mis-scale. Second, a multi-island architecture runs independent populations seeded with distinct operator subsets (algebraic, logarithmic, trigonometric, and full) with ring-topology migration every M generations, preventing the search from collapsing into one region of formula space. Third, a structural deduplication scheme treats formulas differing only in constants as equivalent, so the elite archive contains structurally distinct candidates rather than near-duplicate variants. Fourth, top-k individuals undergo numerical constant refinement via scipy L-BFGS-B after each migration phase, decoupling structure search from parameter fitting. We evaluate the system on synthetic benchmarks of the form log(x_i) * x_j / (x_k * c), trigonometric mixtures, and an anonymized clinical site-monitoring dataset with 24 rows and approximately 290 candidate numeric features. The system consistently recovers compact ground-truth structures with R-squared at or above 0.99 within tens of thousands of unique formula evaluations. A reference implementation is released under a noncommercial source-available license.

Subject:
asi
Submitted:
Jul 1, 2026
Views:
1