umllr P-adic Tag Regression

Using p-adic coefficients to predict taxonomy from tags

Overview

The umllr (Universal Machine Learning Linear Regression) model assigns a p-adic integer coefficient to each product tag and uses those coefficients to predict taxonomy encodings. Each taxonomy path is encoded as a p-adic integer with prime base p = 79, and the tag coefficients are fitted to minimize the p-adic distance between predicted and true encodings on the training data.
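To make the encoding and the loss concrete, here is a minimal sketch of a base-79 path encoding and the p-adic distance between two encodings. The digit layout (top-level category as the least significant digit) and the helper names `encode_path`, `valuation`, and `padic_distance` are assumptions made for illustration; the source does not spell out how umllr lays out its digits.

```python
P = 79  # prime base reported above


def encode_path(level_ids, p=P):
    """Encode a taxonomy path as an integer whose base-p digits are level ids.

    Assumption: level_ids[0] is the top-level category and becomes the least
    significant digit, so paths that share a prefix agree in their low digits.
    """
    if any(not 0 <= d < p for d in level_ids):
        raise ValueError("each level id must lie in [0, p)")
    return sum(d * p**i for i, d in enumerate(level_ids))


def valuation(n, p=P):
    """Return the largest k such that p**k divides n (n must be nonzero)."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def padic_distance(a, b, p=P):
    """Return |a - b|_p = p**(-v_p(a - b)), or 0.0 when a == b."""
    if a == b:
        return 0.0
    return float(p) ** -valuation(a - b, p)


a = encode_path([3, 12, 40])
b = encode_path([3, 12, 41])  # differs from a only at the third level
c = encode_path([3, 5, 40])   # differs from a from the second level on
d = encode_path([4, 12, 40])  # differs from a at the top level

print(padic_distance(a, b))  # 79**-2
print(padic_distance(a, c))  # 79**-1
print(padic_distance(a, d))  # 1.0
```

Under this layout, a prediction that gets the first k levels right is at distance 79^(-k) from the target, so errors near the root of the taxonomy dominate the loss.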

CV folds: 5
Average p-adic loss: 0.3778
Mean accuracy: 37.77%
Mean F1: 0.4110
Mean Prefix-2 Accuracy: 53.31%
Prime base: 79
Mean scoring ops: 1.31

Cross-validation results

Fold  Accuracy  F1      P-adic loss (mean)
0     39.39%    0.4223  0.37259152
1     37.67%    0.4151  0.38147304
2     38.31%    0.4176  0.36771924
3     36.46%    0.3978  0.37876713
4     37.02%    0.4024  0.38855606
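The summary figures in the overview can be cross-checked against the per-fold rows. The short script below averages the five folds, assuming the summary values are unweighted means over folds.

```python
from statistics import mean

# (fold, accuracy in %, F1, mean p-adic loss), copied from the table above
folds = [
    (0, 39.39, 0.4223, 0.37259152),
    (1, 37.67, 0.4151, 0.38147304),
    (2, 38.31, 0.4176, 0.36771924),
    (3, 36.46, 0.3978, 0.37876713),
    (4, 37.02, 0.4024, 0.38855606),
]

mean_acc = mean(row[1] for row in folds)
mean_f1 = mean(row[2] for row in folds)
mean_loss = mean(row[3] for row in folds)

print(f"accuracy {mean_acc:.2f}%  F1 {mean_f1:.4f}  loss {mean_loss:.4f}")
# accuracy 37.77%  F1 0.4110  loss 0.3778
```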

Tag-order ablations

These runs keep the greedy p-adic regressor fixed and vary only the feature ordering heuristic.
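The report names the ordering strategies but does not define them. As a rough illustration only, the sketch below shows one plausible reading of three of the names (`frequency`, `mean_title_position`, `random`) on hypothetical toy data. The function names, the sort directions, and the tie-breaking rule are all assumptions. `battle_elo` and `taxonomy_association` are left out because their names alone do not pin down a computation.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data: each product is its list of tags in title order.
products = [
    ["red", "cotton", "shirt"],
    ["blue", "shirt"],
    ["red", "wool", "scarf"],
    ["cotton", "scarf"],
]


def order_by_frequency(products):
    """Most frequent tags first (assumed direction); ties broken alphabetically."""
    counts = Counter(tag for tags in products for tag in tags)
    return sorted(counts, key=lambda t: (-counts[t], t))


def order_by_mean_title_position(products):
    """Tags that tend to appear earlier in titles first (assumed direction)."""
    positions = defaultdict(list)
    for tags in products:
        for index, tag in enumerate(tags):
            positions[tag].append(index)
    return sorted(
        positions,
        key=lambda t: (sum(positions[t]) / len(positions[t]), t),
    )


def order_randomly(products, seed):
    """Seeded shuffle of the tag vocabulary, reproducible for a fixed seed."""
    tags = sorted({tag for tags in products for tag in tags})
    random.Random(seed).shuffle(tags)
    return tags


print(order_by_frequency(products))
# ['cotton', 'red', 'scarf', 'shirt', 'blue', 'wool']
print(order_by_mean_title_position(products))
# ['blue', 'red', 'cotton', 'wool', 'scarf', 'shirt']
print(order_randomly(products, seed=7))
```

In an ablation of this kind, each ordering would be handed to the same fitting routine, so any difference in loss is attributable to the order alone.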

Random-order baseline across five fixed seeds (7, 13, 23, 37, 101): mean p-adic loss 0.31032917 ± 0.00674739 (population standard deviation over the five seeds).

Strategy              Seed  Mean p-adic loss  Mean Prefix-2 Accuracy  Mean scoring ops
battle_elo            -     0.37782140        53.31%                  1.31
frequency             -     0.31960312        60.93%                  1.94
mean_title_position   -     0.32789092        59.82%                  1.71
random                7     0.31555103        59.85%                  1.40
random                13    0.30411023        61.69%                  1.45
random                23    0.32103333        60.24%                  1.42
random                37    0.30546290        60.47%                  1.40
random                101   0.30548836        61.31%                  1.42
taxonomy_association  -     0.24969375        66.66%                  1.10
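The random-order baseline can be recomputed from the five random rows, and the other strategies can then be compared against it. In the sketch below, the reported ± value is reproduced by the population standard deviation (`statistics.pstdev`); the sample standard deviation would be larger. The variable names are illustrative.

```python
from statistics import mean, pstdev

# Mean p-adic loss of each random-order run, keyed by seed (from the table)
random_losses = {
    7: 0.31555103,
    13: 0.30411023,
    23: 0.32103333,
    37: 0.30546290,
    101: 0.30548836,
}

losses = list(random_losses.values())
baseline_mean = mean(losses)
baseline_std = pstdev(losses)  # population standard deviation (divides by n)
print(f"{baseline_mean:.8f} ± {baseline_std:.8f}")
# 0.31032917 ± 0.00674739

# Heuristic orderings from the same table, compared against the baseline
heuristics = {
    "battle_elo": 0.37782140,
    "frequency": 0.31960312,
    "mean_title_position": 0.32789092,
    "taxonomy_association": 0.24969375,
}
deltas = {name: loss - baseline_mean for name, loss in heuristics.items()}
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"{name:22s} {delta:+.8f}")
```

On these numbers, taxonomy_association is the only heuristic whose loss falls below the random-order mean. battle_elo, whose row matches the headline cross-validation figures, has the highest loss of all the orderings.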