Using p-adic coefficients to predict taxonomy from tags
The umllr (Universal Machine Learning Linear Regression) model assigns a p-adic integer coefficient to each product tag and uses those coefficients to predict taxonomy encodings. Each taxonomy path is encoded as a p-adic integer with p = 79 (one base-79 digit per level), and the tag coefficients are fitted to minimize the p-adic distance between predicted and true encodings on the training data.
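As a rough illustration of the encoding and the distance being minimized, the sketch below encodes a taxonomy path as a base-79 integer and computes the 79-adic distance between two encodings. The function names (`encode_path`, `valuation`, `padic_distance`) are hypothetical, and two details are assumptions rather than anything the text confirms: that the top taxonomy level sits in the lowest-order digit, and that category ids per level run from 1 to 78.

```python
from fractions import Fraction

P = 79  # prime base stated in the text


def encode_path(level_ids, p=P):
    """Encode a taxonomy path as a base-p integer.

    Assumption: the top level is the lowest-order digit, and each level id
    lies in 1..p-1 (0 is left unused so that shorter paths stay distinct).
    """
    code = 0
    for depth, idx in enumerate(level_ids):
        if not 1 <= idx < p:
            raise ValueError(f"level id {idx} does not fit in one base-{p} digit")
        code += idx * p ** depth
    return code


def valuation(n, p=P):
    """Return the largest k such that p**k divides n, or None when n == 0."""
    if n == 0:
        return None
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def padic_distance(a, b, p=P):
    """p-adic distance |a - b|_p = p**(-v), where v is the valuation of a - b."""
    v = valuation(a - b, p)
    return Fraction(0) if v is None else Fraction(1, p ** v)


# Hypothetical three-level paths, written as per-level category ids.
a = encode_path([3, 12, 40])
b = encode_path([3, 12, 7])   # same first two levels as a, different leaf
c = encode_path([5, 12, 40])  # differs from a at the top level

print(padic_distance(a, b))  # 1/6241, i.e. 79**-2
print(padic_distance(a, c))  # 1
```

Under this digit order, two encodings lie within distance 79^-k of each other exactly when their first k levels agree, so a mistake at the top level costs far more than a mistake at the leaf.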
| Fold | Accuracy | F1 | P-adic loss (mean) |
|---|---|---|---|
| 0 | 39.39% | 0.4223 | 0.37259152 |
| 1 | 37.67% | 0.4151 | 0.38147304 |
| 2 | 38.31% | 0.4176 | 0.36771924 |
| 3 | 36.46% | 0.3978 | 0.37876713 |
| 4 | 37.02% | 0.4024 | 0.38855606 |
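The runs in the following table keep the greedy p-adic regressor fixed and vary only the feature-ordering heuristic. Each figure is a mean; the battle_elo loss (0.37782140) matches the mean of the five per-fold losses in the table above.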
Random-order baseline across five fixed seeds: mean p-adic loss of 0.31032917 ± 0.00674739 (population standard deviation over the five random rows).
| Strategy | Seed | Mean p-adic loss | Mean prefix-2 accuracy | Mean scoring ops |
|---|---|---|---|---|
| battle_elo | — | 0.37782140 | 53.31% | 1.31 |
| frequency | — | 0.31960312 | 60.93% | 1.94 |
| mean_title_position | — | 0.32789092 | 59.82% | 1.71 |
| random | 7 | 0.31555103 | 59.85% | 1.40 |
| random | 13 | 0.30411023 | 61.69% | 1.45 |
| random | 23 | 0.32103333 | 60.24% | 1.42 |
| random | 37 | 0.30546290 | 60.47% | 1.40 |
| random | 101 | 0.30548836 | 61.31% | 1.42 |
| taxonomy_association | — | 0.24969375 | 66.66% | 1.10 |
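The random-order baseline can be recomputed from the five `random` rows of the table. The short sketch below uses only the standard library. The variable names are illustrative, and it uses the population standard deviation (`statistics.pstdev`) because that is the variant that reproduces the reported spread.

```python
from statistics import fmean, pstdev

# Mean p-adic loss per seed, copied from the "random" rows of the table.
random_losses = {
    7: 0.31555103,
    13: 0.30411023,
    23: 0.32103333,
    37: 0.30546290,
    101: 0.30548836,
}

losses = list(random_losses.values())
mean_loss = fmean(losses)
spread = pstdev(losses)  # population standard deviation (divides by n, not n - 1)

print(f"{mean_loss:.8f} ± {spread:.8f}")  # 0.31032917 ± 0.00674739
```

Measured against this baseline, taxonomy_association (0.24969375) is the only strategy in the table with a lower loss than every random seed, while battle_elo (0.37782140) has the highest loss.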