Using p-adic coefficients to predict taxonomy from tags
The umllr (Universal Machine Learning Linear Regression) model assigns p-adic integer coefficients to product tags and uses them to predict taxonomy encodings. Each taxonomy path is encoded as a p-adic integer (base 79), and the tag coefficients are fitted to minimize the p-adic distance between predicted and true encodings on the training data.
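To make the objective concrete, here is a minimal sketch of the base-79 encoding and distance. The encoding convention (root category as the lowest-order digit) and the function names are assumptions for illustration, not the benchmarked implementation:

```python
P = 79  # prime base used by the taxonomy encoding

def encode_taxonomy(path_ids):
    # Hypothetical convention: path [c0, c1, ...] -> c0 + c1*79 + c2*79^2 + ...
    # with the root category c0 in the lowest-order digit.
    return sum(c * P**i for i, c in enumerate(path_ids))

def padic_valuation(n):
    # Largest v such that 79^v divides n; conventionally infinite for n = 0.
    if n == 0:
        return float("inf")
    v = 0
    while n % P == 0:
        n //= P
        v += 1
    return v

def padic_distance(x, y):
    # |x - y|_79 = 79^(-v): the more leading path levels agree,
    # the smaller the distance.
    v = padic_valuation(x - y)
    return 0.0 if v == float("inf") else P ** (-v)
```

Under this convention, two paths that agree on their first k levels have distance at most 79^(-k), so a small mean p-adic loss rewards getting the top of the taxonomy right; this is also what a prefix-2 accuracy metric captures.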
The dedicated benchmark pages now live in the shared benchmark overview, the latest comparison page, and the paper comparison page.
| Fold | Accuracy | F1 | P-adic loss (mean) |
|---|---|---|---|
| 0 | 44.46% | 0.4716 | 0.32124954 |
| 1 | 45.80% | 0.4808 | 0.31267813 |
| 2 | 41.53% | 0.4450 | 0.35999207 |
| 3 | 45.67% | 0.4895 | 0.31556355 |
| 4 | 45.05% | 0.4748 | 0.30584226 |
These ablation runs keep the greedy p-adic regressor fixed and vary only the feature-ordering heuristic. They are loaded from the fixed paper snapshot, so the comparison stays stable even as the live catalog changes.
Random-order baseline across five fixed seeds: 0.31032917 ± 0.00674739 mean p-adic loss.
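For intuition about what "greedy" means here, the following is a hypothetical sketch of a greedy p-adic regressor: tags are visited in the order chosen by the heuristic, and each tag's coefficient is picked to minimize the p-adic loss on the training rows containing that tag. The candidate set, tie-breaking, and the correspondence to "scoring ops" are all assumptions, not the benchmarked implementation:

```python
P = 79  # prime base from the encoding

def padic_loss(pred, target):
    # 79-adic magnitude of the error: 79^(-v), v = valuation of (pred - target).
    diff = pred - target
    if diff == 0:
        return 0.0
    v = 0
    while diff % P == 0:
        diff //= P
        v += 1
    return P ** (-v)

def greedy_fit(rows, tag_order):
    """Greedily assign one integer coefficient per tag, in the given order.

    rows: list of (tags, target_code) pairs; a row's prediction is the sum of
    its tags' coefficients. Candidates for each tag are the residuals
    (target - current prediction) on its supporting rows, plus 0 (an assumed
    candidate set, chosen so the sketch stays small).
    """
    coeffs = {}
    predict = lambda tags: sum(coeffs.get(t, 0) for t in tags)
    for tag in tag_order:
        support = [(tags, y) for tags, y in rows if tag in tags]
        if not support:
            continue
        candidates = sorted({y - predict(tags) for tags, y in support} | {0})
        # Each candidate evaluation below is loosely one "scoring op"
        # in the sense of the table (again, an assumption).
        coeffs[tag] = min(candidates, key=lambda c: sum(
            padic_loss(predict(tags) + c, y) for tags, y in support))
    return coeffs
```

Because the regressor is fixed across runs, only the visit order in `tag_order` changes between the strategies in the table below, which is why the ordering heuristic alone can move the mean loss and the scoring-op count.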
| Strategy | Seed | Mean p-adic loss | Mean prefix-2 accuracy | Mean scoring ops |
|---|---|---|---|---|
| battle_elo | — | 0.31274437 | 61.01% | 1.41 |
| frequency | — | 0.31960312 | 60.93% | 1.94 |
| mean_title_position | — | 0.32789092 | 59.82% | 1.71 |
| random | 7 | 0.31555103 | 59.85% | 1.40 |
| random | 13 | 0.30411023 | 61.69% | 1.45 |
| random | 23 | 0.32103333 | 60.24% | 1.42 |
| random | 37 | 0.30546290 | 60.47% | 1.40 |
| random | 101 | 0.30548836 | 61.31% | 1.42 |
| taxonomy_association | — | 0.24969375 | 66.66% | 1.10 |
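The random-order baseline quoted above can be reproduced from the five per-seed rows in this table; the ± term is the population standard deviation (ddof = 0):

```python
import statistics

# Per-seed mean p-adic losses for the random ordering (from the table above).
random_losses = {7: 0.31555103, 13: 0.30411023, 23: 0.32103333,
                 37: 0.30546290, 101: 0.30548836}

mean = statistics.fmean(random_losses.values())
std = statistics.pstdev(random_losses.values())  # population std, ddof = 0
print(f"{mean:.8f} ± {std:.8f}")  # matches the quoted baseline
```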