umllr P-adic Tag Regression

Using p-adic coefficients to predict taxonomy from tags


Overview

The umllr (Universal Machine Learning Linear Regression) model assigns a p-adic integer coefficient to each product tag and uses these coefficients to predict taxonomy encodings. Each taxonomy path is encoded as a p-adic integer with prime base p = 79, and the tag coefficients are fitted to minimize the p-adic distance between predicted and true encodings on the training data.
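To make the encoding concrete, here is a minimal sketch of one way a taxonomy path could be mapped to a base-79 integer and compared under the p-adic metric. The digit layout (root category in the least significant digit), the per-level category indices, and the helper names `encode_path`, `padic_valuation`, and `padic_distance` are assumptions for illustration; the page does not specify how umllr builds its encodings.

```python
P = 79  # prime base stated in the overview


def encode_path(digits, p=P):
    """Encode per-level category indices (root first) as an integer.

    Assumption: the root level is the least significant base-p digit,
    so paths that share a longer prefix are p-adically closer.
    """
    value = 0
    for level, digit in enumerate(digits):
        if not 0 <= digit < p:
            raise ValueError(f"digit {digit} at level {level} is outside 0..{p - 1}")
        value += digit * p**level
    return value


def padic_valuation(n, p=P):
    """Return the largest k such that p**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def padic_distance(a, b, p=P):
    """Return |a - b|_p = p ** (-v_p(a - b)), with distance 0 when a == b."""
    if a == b:
        return 0.0
    return float(p) ** -padic_valuation(a - b, p)


# Hypothetical three-level paths, written as category indices per level.
same_two_levels = padic_distance(encode_path([3, 5, 7]), encode_path([3, 5, 9]))
different_root = padic_distance(encode_path([3, 5, 7]), encode_path([4, 5, 7]))
print(same_two_levels, different_root)
```

Under this layout, two paths that agree on their first k levels differ by a multiple of 79^k, so their distance is at most 79^(-k), while paths that already differ at the root are at distance 1.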

The dedicated benchmark pages now live in the shared benchmark overview, the latest comparison page, and the paper comparison page.

CV folds: 5
Average p-adic loss: 0.3398
Mean accuracy: 42.53%
Mean F1: 0.4545
Mean Prefix-2 Accuracy: 57.08%
Prime base: 79
Mean scoring ops: 0.99

Cross-validation results

| Fold | Accuracy | F1 | P-adic loss (mean) |
|------|----------|--------|--------------------|
| 0 | 42.99% | 0.4556 | 0.33639218 |
| 1 | 43.01% | 0.4633 | 0.34399701 |
| 2 | 43.12% | 0.4593 | 0.34446616 |
| 3 | 42.50% | 0.4502 | 0.33526108 |
| 4 | 41.02% | 0.4439 | 0.33899431 |
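The summary figures near the top of the page can be re-derived from the per-fold rows. The following sketch assumes they are plain unweighted means over the five folds, which the page does not state explicitly; the variable names are illustrative.

```python
# Per-fold results copied from the cross-validation table:
# (fold, accuracy in percent, F1, mean p-adic loss)
folds = [
    (0, 42.99, 0.4556, 0.33639218),
    (1, 43.01, 0.4633, 0.34399701),
    (2, 43.12, 0.4593, 0.34446616),
    (3, 42.50, 0.4502, 0.33526108),
    (4, 41.02, 0.4439, 0.33899431),
]

n_folds = len(folds)
mean_accuracy = sum(row[1] for row in folds) / n_folds
mean_f1 = sum(row[2] for row in folds) / n_folds
mean_loss = sum(row[3] for row in folds) / n_folds

print(f"accuracy={mean_accuracy:.2f}%  F1={mean_f1:.4f}  loss={mean_loss:.4f}")
# prints: accuracy=42.53%  F1=0.4545  loss=0.3398
```

The unweighted means reproduce the reported 42.53% accuracy, 0.4545 F1, and 0.3398 average p-adic loss.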

Tag-order ablations

These runs keep the greedy p-adic regressor fixed and vary only the feature ordering heuristic.

These ablations are loaded from the fixed paper snapshot so the comparison stays stable even as the live catalog changes.

Random order baseline across five fixed seeds: 0.31032917 ± 0.00674739 mean p-adic loss.
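As a quick check, the baseline figure can be recomputed from the five `random` rows in the table below. The sketch assumes the ± term is the population standard deviation (dividing by n rather than n - 1), since that is the variant that reproduces the reported value; the sample standard deviation of the same five losses is about 0.0075.

```python
import statistics

# Mean p-adic loss of the random-order runs, keyed by seed (from the ablation table).
random_losses = {
    7: 0.31555103,
    13: 0.30411023,
    23: 0.32103333,
    37: 0.30546290,
    101: 0.30548836,
}

losses = list(random_losses.values())
baseline_mean = statistics.mean(losses)
baseline_std = statistics.pstdev(losses)  # population standard deviation

print(f"{baseline_mean:.8f} ± {baseline_std:.8f}")
# prints: 0.31032917 ± 0.00674739
```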

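| Strategy | Seed | Mean p-adic loss | Mean Prefix-2 Accuracy | Mean scoring ops |
|----------|------|------------------|------------------------|------------------|
| battle_elo | | 0.31274437 | 61.01% | 1.41 |
| frequency | | 0.31960312 | 60.93% | 1.94 |
| mean_title_position | | 0.32789092 | 59.82% | 1.71 |
| random | 7 | 0.31555103 | 59.85% | 1.40 |
| random | 13 | 0.30411023 | 61.69% | 1.45 |
| random | 23 | 0.32103333 | 60.24% | 1.42 |
| random | 37 | 0.30546290 | 60.47% | 1.40 |
| random | 101 | 0.30548836 | 61.31% | 1.42 |
| taxonomy_association | | 0.24969375 | 66.66% | 1.10 |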
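To illustrate what "varying only the feature ordering" can look like, here is a minimal sketch of two of the simpler strategies named in the table, `frequency` and seeded `random`. It is not the page's implementation: the function `order_tags`, the input shape (one tag list per product), and the choice to rank by descending document frequency are all assumptions. The other strategies (`battle_elo`, `mean_title_position`, `taxonomy_association`) depend on signals the page does not describe, so they are left out.

```python
import random
from collections import Counter


def order_tags(tag_lists, strategy, seed=None):
    """Return all distinct tags in the order a greedy fitter would visit them.

    tag_lists: iterable of per-product tag lists (hypothetical input shape).
    strategy:  "frequency" or "random"; names taken from the ablation table.
    seed:      seed for the "random" strategy, so runs are repeatable.
    """
    # Count each tag once per product (document frequency, an assumption).
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    tags = sorted(counts)  # deterministic starting order

    if strategy == "frequency":
        # Most frequent first; ties broken alphabetically.
        return sorted(tags, key=lambda tag: (-counts[tag], tag))
    if strategy == "random":
        rng = random.Random(seed)
        rng.shuffle(tags)
        return tags
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical toy catalog.
products = [["red", "shoe"], ["shoe", "leather"], ["shoe", "red"]]
by_frequency = order_tags(products, "frequency")
by_seed_7 = order_tags(products, "random", seed=7)
print(by_frequency)
print(by_seed_7)
```

Fixing the seed makes each random ordering repeatable, which is what allows the five `random` rows to serve as a stable baseline for the other heuristics.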