umllr P-adic Tag Regression

Using p-adic coefficients to predict taxonomy from tags

Overview

The umllr (Universal Machine Learning Linear Regression) model assigns p-adic integer coefficients to product tags and uses them to predict taxonomy encodings. Each taxonomy path is encoded as a p-adic integer with prime base p = 79, and the tag coefficients are fitted to minimize the p-adic distance between predicted and true encodings on the training data.
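As a rough illustration of the encoding and distance described above, the sketch below packs a taxonomy path into a base-79 integer and computes the standard p-adic distance |x - y|_p = p^(-v_p(x - y)). The function names, the per-level digit values, and the choice to store the root level in the lowest-order digit are assumptions made for this example; the page does not specify how umllr maps paths to digits.

```python
from fractions import Fraction

P = 79  # prime base stated on this page


def encode_path(digits, p=P):
    """Pack per-level digits (root first) into one integer.

    Assumption: the root level goes in the lowest-order digit, so paths
    that share a long prefix differ only in high powers of p.
    """
    if any(not 0 <= d < p for d in digits):
        raise ValueError("each digit must lie in [0, p)")
    return sum(d * p**i for i, d in enumerate(digits))


def valuation(n, p=P):
    """Largest k such that p**k divides n (infinite for n == 0)."""
    if n == 0:
        return float("inf")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def padic_distance(x, y, p=P):
    """Standard p-adic distance: p**(-v_p(x - y)), and 0 when x == y."""
    if x == y:
        return Fraction(0)
    return Fraction(1, p ** valuation(x - y, p))


# Hypothetical three-level paths, written as digit lists.
a = encode_path([3, 12, 40])
b = encode_path([3, 12, 41])  # differs from a only at the deepest level
c = encode_path([4, 12, 40])  # differs from a at the root level

print(padic_distance(a, b))  # small: the first two levels agree
print(padic_distance(a, c))  # large: the root already disagrees
```

Under this digit layout, two paths that agree on their first k levels are at distance at most 79^(-k), which is why a p-adic loss rewards getting the upper taxonomy levels right.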

The dedicated benchmark pages now live in the shared benchmark overview, latest comparison page and paper comparison page.

CV folds: 5
Average p-adic loss: 0.3231
Mean accuracy: 44.50%
Mean F1: 0.4723
Mean Prefix-2 accuracy: 58.76%
Prime base: 79
Mean scoring ops: 0.99

Cross-validation results

| Fold | Accuracy | F1 | P-adic loss (mean) |
|------|----------|--------|--------------------|
| 0 | 44.46% | 0.4716 | 0.32124954 |
| 1 | 45.80% | 0.4808 | 0.31267813 |
| 2 | 41.53% | 0.4450 | 0.35999207 |
| 3 | 45.67% | 0.4895 | 0.31556355 |
| 4 | 45.05% | 0.4748 | 0.30584226 |
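The headline accuracy, F1, and p-adic loss figures near the top of the page appear to be unweighted means over these five folds. A minimal check, using only the values from the table (the dictionary keys are names chosen for this example):

```python
from statistics import mean

# Per-fold values copied from the cross-validation table.
folds = [
    {"accuracy": 44.46, "f1": 0.4716, "padic_loss": 0.32124954},
    {"accuracy": 45.80, "f1": 0.4808, "padic_loss": 0.31267813},
    {"accuracy": 41.53, "f1": 0.4450, "padic_loss": 0.35999207},
    {"accuracy": 45.67, "f1": 0.4895, "padic_loss": 0.31556355},
    {"accuracy": 45.05, "f1": 0.4748, "padic_loss": 0.30584226},
]

# Unweighted mean of each metric across folds.
summary = {key: mean(row[key] for row in folds) for key in folds[0]}

for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

Rounded to the precision shown in the summary, the results agree with the reported 44.50% accuracy, 0.4723 F1, and 0.3231 p-adic loss.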

Tag-order ablations

These runs keep the greedy p-adic regressor fixed and vary only the feature ordering heuristic.

These ablations are loaded from the fixed paper snapshot so the comparison stays stable even as the live catalog changes.
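To make "vary only the feature ordering heuristic" concrete, here is a hedged sketch of what two of the strategies named in the table could look like. The function names, the input shape (a list of tag lists), and the descending sort for `frequency` are assumptions for illustration; the page does not describe how umllr implements them.

```python
import random
from collections import Counter


def order_by_frequency(tagged_products):
    """Most frequent tags first; ties broken alphabetically for determinism.

    Assumption: 'frequency' means descending corpus frequency.
    """
    counts = Counter(tag for tags in tagged_products for tag in tags)
    return sorted(counts, key=lambda tag: (-counts[tag], tag))


def order_randomly(tagged_products, seed):
    """Seeded shuffle of the distinct tags, so each seed gives a repeatable order."""
    tags = sorted({tag for tags in tagged_products for tag in tags})
    random.Random(seed).shuffle(tags)
    return tags


# Hypothetical toy catalog: one tag list per product.
products = [["red", "shoe"], ["shoe", "leather"], ["shoe", "red"]]

print(order_by_frequency(products))
print(order_randomly(products, seed=7))
```

Whatever the real heuristics are, the regressor would consume the returned tag order unchanged, so differences between rows in the ablation table can be attributed to ordering alone.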

Random-order baseline across five fixed seeds: mean p-adic loss of 0.31032917 ± 0.00674739.

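| Strategy | Seed | Mean p-adic loss | Mean Prefix-2 Accuracy | Mean scoring ops |
|----------|------|------------------|------------------------|------------------|
| battle_elo | - | 0.31274437 | 61.01% | 1.41 |
| frequency | - | 0.31960312 | 60.93% | 1.94 |
| mean_title_position | - | 0.32789092 | 59.82% | 1.71 |
| random | 7 | 0.31555103 | 59.85% | 1.40 |
| random | 13 | 0.30411023 | 61.69% | 1.45 |
| random | 23 | 0.32103333 | 60.24% | 1.42 |
| random | 37 | 0.30546290 | 60.47% | 1.40 |
| random | 101 | 0.30548836 | 61.31% | 1.42 |
| taxonomy_association | - | 0.24969375 | 66.66% | 1.10 |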
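The random-order baseline quoted before the table can be recomputed from the five `random` rows. The sketch below suggests that the ± term is a population standard deviation (dividing by n, not n - 1); that reading is inferred from the numbers, not stated on the page.

```python
from statistics import mean, pstdev

# Mean p-adic loss per seed, copied from the five `random` rows.
random_losses = {
    7: 0.31555103,
    13: 0.30411023,
    23: 0.32103333,
    37: 0.30546290,
    101: 0.30548836,
}

values = list(random_losses.values())
baseline_mean = mean(values)
baseline_spread = pstdev(values)  # population standard deviation

print(f"{baseline_mean:.8f} ± {baseline_spread:.8f}")
```

Using the sample standard deviation (`statistics.stdev`) instead would give a noticeably larger spread, so the choice matters when comparing a strategy against this baseline.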