Padjective Tag Hierarchy

Dummy Baseline

Always predicts most common taxonomy (baseline for comparison)

0.6244 Avg p-adic loss

1 Parameter

View model →

Importance-Optimised p-adic Linear Regression

P-adic coefficients assigned to tags to predict taxonomy

0.3408 Avg p-adic loss

1,404 Avg non-zero coefficients

View model →

Level-wise Logistic Regression

Hierarchy-aware top-down classifier that always emits a valid taxonomy path

0.1008 Avg p-adic loss

83.01% Prefix-2 accuracy

132,415 Non-zero params

View model →

Zubarev Regression (UMLLR init)

Stochastic p-adic optimization starting from UMLLR (arXiv:2503.23488)

0.4202 Avg p-adic loss

3,697 Non-zero coefficients

View fold details →

Zubarev Regression (Zeros init)

Stochastic p-adic optimization starting from zeros (arXiv:2503.23488)

0.4926 Avg p-adic loss

3,888 Non-zero coefficients

View fold details →

Zubarev Mahler-1 (UMLLR init)

Mahler affine basis (degree 1) with UMLLR initialization

0.4259 Avg p-adic loss

3,462 Non-zero coefficients

View fold details →

Zubarev Mahler-2 (UMLLR init)

Mahler quadratic basis (degree 2) with UMLLR initialization

0.4254 Avg p-adic loss

3,435 Non-zero coefficients

View fold details →

Unconstrained Logistic Regression

L1-regularized model using ALL tags

0.2592 Avg p-adic loss

6,802 Non-zero params

View model →

Decision Tree

Unconstrained tree using ALL tags

0.2215 Avg p-adic loss

66,642 Effective params

View model →

Unconstrained Neural Network

L1-regularized NN with weight pruning

0.2324 Avg p-adic loss

24,965 Non-zero params

View model →

Parameter Constrained Neural Network

Neural network predicting taxonomy from tags

0.5662 Avg p-adic loss

864 Avg input weights

View model →

Parameter Constrained Logistic Regression

Logistic regression model predicting Shopify taxonomy from tags

0.7432 Avg p-adic loss

21,920 Avg parameters

View model →

ELO-Inspired Rankings

Battle-tested tag hierarchy from product title positions

5,352 Tag battles

View rankings →

Benchmark Comparisons

Dedicated `latest` and `paper` benchmark pages, including the average active parameters touched per classification for the importance-optimised p-adic linear regressor.

1.10 Latest active params / classification

1.11 Paper active params / classification

Open benchmark pages →

Taxonomy ID	Name	Path	Samples	Share
gid://shopify/TaxonomyCategory/bt	Baby & Toddler	4	132	0.9%
gid://shopify/TaxonomyCategory/lb	Luggage & Bags	15	93	0.7%
gid://shopify/TaxonomyCategory/bu	Bundles	5	66	0.5%
gid://shopify/TaxonomyCategory/pa	Product Add-Ons	19	24	0.2%
gid://shopify/TaxonomyCategory/na	Uncategorized	25	19	0.1%
gid://shopify/TaxonomyCategory/gc	Gift Cards	11	18	0.1%
gid://shopify/TaxonomyCategory/os	Office Supplies	18	17	0.1%
gid://shopify/TaxonomyCategory/sg	Sporting Goods	23	13	0.1%
gid://shopify/TaxonomyCategory/me	Media	17	10	0.1%
gid://shopify/TaxonomyCategory/hg	Home & Garden	14	9	0.1%

Tag	Top taxonomy	Weight	Max \|weight\|
No tag signal data available

Historical Performance Trends

Tracking model performance and dataset growth over time. Lower p-adic loss indicates better predictions.

Model	Slope (per product)	Intercept	R²	p-value
Importance-Optimised p-adic LR	-0.000000	0.3460	0.0000	0.9984
PCLR	0.000026	0.4805	0.6193	1.68e-41
PCNN	0.000009	0.5169	0.0996	8.69e-06
ULR	0.000005	0.1913	0.7488	7.34e-50
UNN	0.000005	0.1690	0.5261	2.07e-27
Decision Tree	0.000005	0.1578	0.7572	4.02e-50
Zubarev (UMLLR)	0.000003	0.3889	0.3885	1.63e-17
Zubarev (zeros)	0.000008	0.3864	0.8046	2.44e-54
Zubarev (M1)	0.000001	0.4068	0.1516	8.48e-07
Zubarev (M2)	0.000002	0.4009	0.2532	5.29e-11
Dummy Baseline	-0.000026	0.8746	0.5536	2.71e-32

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Based on current regression trends, we can extrapolate when Importance-Optimised p-adic LR will achieve better performance (lower p-adic loss) than other models as the dataset grows. The confidence intervals are calculated using bootstrap resampling (n=1000).

Model	Crossover Point (products)	95% Confidence Interval	Probability	Estimated Date
UNN (Unconstrained Neural Networks)	34,055	28,483 - 44,406 (95% CI, σ=4,040)	>95%	2027-04-10 (±uncertain, R²=0.998, growth=63.8/product/day)
ULR (Unconstrained Logistic Regression)	30,056	26,003 - 37,337 (95% CI, σ=2,926)	>95%	2027-02-07 (±uncertain, R²=0.998, growth=63.8/product/day)
Decision Tree	40,140	33,735 - 51,156 (95% CI, σ=4,572)	>95%	2027-07-15 (±uncertain, R²=0.998, growth=63.8/product/day)

Statistical Notes: The crossover points are calculated by finding where the regression lines intersect. The 95% confidence intervals are derived from bootstrap resampling of the regression parameters. The probability estimates indicate the likelihood that the crossover will occur given the current trends. Date predictions are based on linear extrapolation of dataset growth and should be interpreted with caution.

Model performance vs number of distinct tags

Model	Slope (per tag)	Intercept	R²	p-value
Importance-Optimised p-adic LR	0.000001	0.3376	0.0065	0.2659
PCLR	0.000038	0.2983	0.6676	4.38e-47
PCNN	0.000016	0.4279	0.1583	1.20e-08
ULR	0.000008	0.1540	0.7584	3.19e-51
UNN	0.000008	0.1250	0.6120	2.66e-34
Decision Tree	0.000007	0.1236	0.7580	3.08e-50
Zubarev (UMLLR)	0.000005	0.3619	0.4741	2.11e-22
Zubarev (zeros)	0.000013	0.3190	0.8559	3.90e-64
Zubarev (M1)	0.000002	0.3959	0.2076	4.67e-09
Zubarev (M2)	0.000003	0.3848	0.3244	2.82e-14
Dummy Baseline	-0.000039	1.0707	0.6282	3.15e-39

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Model	Crossover Point (tags)	95% Confidence Interval	Probability	Estimated Date
UNN (Unconstrained Neural Networks)	28,795	24,782 - 36,146 (95% CI, σ=2,970)	>95%	2027-04-10 (±uncertain, R²=0.992, growth=45.3/tag/day)
ULR (Unconstrained Logistic Regression)	27,346	23,825 - 33,954 (95% CI, σ=2,630)	>95%	2027-03-09 (±uncertain, R²=0.992, growth=45.3/tag/day)
Decision Tree	35,336	29,748 - 46,001 (95% CI, σ=4,174)	>95%	2027-09-02 (±uncertain, R²=0.992, growth=45.3/tag/day)

Model complexity vs performance (parameter count vs p-adic loss) — Both axes use log scale. The red line is the fixed parsimoniousness baseline rather than a fitted regression.

Why parsimony matters. The question here is not just which model has the lowest loss, but which model gets good p-adic loss with the fewest effective parameters. That is exactly where the smaller p-adic models are interesting.

Where this baseline came from. The original score came from a log-log regression on model size versus loss, rounded to -0.1 × log₁₀(params) - 0.2. Looking across historical snapshots, those scores drifted as the dataset covered more taxonomies, so the current baseline adds + 0.3 × log₁₀(taxonomies / 1,000) to keep comparisons stable as the benchmark grows. For readability, we also re-centre the displayed score by dropping the old constant offset; that keeps the current tables mostly positive without changing the relative comparisons.

Parsimoniousness baseline: log₁₀(loss) = -0.1 × log₁₀(params) + 0.3 × log₁₀(taxonomies / 1,000)
Current snapshot taxonomies: 695
Parsimony score = baseline log₁₀(loss) − observed log₁₀(loss). Positive means better than baseline.

Model	Params	Loss	log₁₀(params)	log₁₀(loss)	Baseline log₁₀(loss)	Parsimony score
Level-wise Logistic	132,415	0.1008	5.1219	-0.9966	-0.5596	+0.4370
Dummy	1	0.6244	0.0000	-0.2045	-0.0474	+0.1571
ULR	6,802	0.2592	3.8326	-0.5864	-0.4307	+0.1558
UNN	24,965	0.2324	4.3973	-0.6338	-0.4871	+0.1467
Decision Tree	66,642	0.2215	4.8238	-0.6547	-0.5298	+0.1249
Importance-Optimised	1,404	0.3408	3.1472	-0.4675	-0.3621	+0.1054
Zubarev (UMLLR)	3,697	0.4202	3.5679	-0.3765	-0.4042	-0.0277
Zubarev (M2)	3,435	0.4254	3.5360	-0.3712	-0.4010	-0.0298
Zubarev (M1)	3,462	0.4259	3.5394	-0.3707	-0.4013	-0.0307
PCNN	864	0.5662	2.9365	-0.2470	-0.3411	-0.0940
Zubarev (zeros)	3,888	0.4926	3.5898	-0.3075	-0.4064	-0.0989
PCLR	21,920	0.7432	4.3408	-0.1289	-0.4815	-0.3526

Historical parsimony score stability — Left: parsimony score versus dataset size. Right: score distribution across historical snapshots. Positive means better than the taxonomy-adjusted baseline.

Model	Snapshots	Mean score	Std dev	Span	Latest score	Latest products
Dummy Baseline	176	+0.0535	0.1212	0.3347	+0.1571	13,972
Unconstrained Logistic Regression with L1	162	+0.1648	0.0191	0.1083	+0.1558	13,972
Unconstrained Neural Network with L1	160	+0.1184	0.0318	0.1631	+0.1467	13,972
Decision Tree	129	+0.1448	0.0182	0.0782	+0.1249	13,972
Importance-Optimised $p$-adic Linear Regression	129	+0.0599	0.0484	0.1273	+0.1054	13,972
Zubarev (UMLLR init)	150	-0.0625	0.0207	0.0810	-0.0277	13,972
PCNN	160	-0.1942	0.0717	0.1992	-0.0940	13,972
PCLR	160	-0.3780	0.0232	0.1880	-0.3526	13,972

Smaller standard deviation and span mean a model’s parsimoniousness is more stable as the dataset grows.

Unconstrained models: complexity vs performance (log-log scale) — Unconstrained models only (no PCLR/PCNN). Both axes on log scale.

Regression: log₁₀(loss) = slope × log₁₀(params) + intercept

Slope	Intercept	R²	p-value	Significant?	n
-0.0960	-0.1984	0.9872	0.0006	Yes	5

Model performance trajectory over time — Arrows show how each model's complexity and performance have changed over time.

Padjective Tag Hierarchy

Dataset coverage

Dummy Baseline

Importance-Optimised p-adic Linear Regression

Level-wise Logistic Regression

Zubarev Regression (UMLLR init)

Zubarev Regression (Zeros init)

Zubarev Mahler-1 (UMLLR init)

Zubarev Mahler-2 (UMLLR init)

Unconstrained Logistic Regression

Decision Tree

Unconstrained Neural Network

Parameter Constrained Neural Network

Parameter Constrained Logistic Regression

ELO-Inspired Rankings

Benchmark Comparisons

Taxonomy distribution

Top 10 taxonomy classes

Tags with strongest signal

Historical Performance Trends

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?