Padjective Tag Hierarchy

Machine learning insights into Shopify product tag organization

Data is sourced from cantbuymelove.industrial-linguistics.com, which powers Shopify taxonomy classification, and is filtered to taxonomies with at least five products.

Last updated 2026-04-13 21:30 UTC

10,983 Products used
563 Taxonomies covered
12,132 Tags used
31,631 Total tags
3,876 Tag battles

Dataset coverage

Training data spans 10,983 products across 563 taxonomies. Of 31,631 total tags in the dataset, 12,132 tags were used (tags appearing fewer than 5 times were filtered out). 6,144 products were discarded due to missing or sparse taxonomy labels. Explore the full dataset → | View defective taxonomy labels →

Dummy Baseline

Always predicts most common taxonomy (baseline for comparison)

0.6100 Avg p-adic loss
1 Parameter
View model →
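The dummy baseline can be sketched in a few lines. This is a generic majority-class predictor; the toy data and tie-breaking below are illustrative assumptions, not this page's exact implementation.

```python
from collections import Counter

def fit_dummy(taxonomies):
    """Majority-class baseline: remember the single most common taxonomy."""
    # Counter.most_common(1) returns [(label, count)] for the top label
    (label, _count), = Counter(taxonomies).most_common(1)
    return label

def predict_dummy(label, products):
    """Predict the same taxonomy for every product, ignoring its tags."""
    return [label for _ in products]

# Hypothetical toy data: the baseline always answers "Baby & Toddler"
train = ["Baby & Toddler", "Electronics", "Baby & Toddler"]
majority = fit_dummy(train)
```

The single stored label is the model's one parameter, which is why the card above reports a parameter count of 1.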

Importance-Optimised p-adic Linear Regression

P-adic coefficients assigned to tags to predict taxonomy

0.3231 Avg p-adic loss
1,083 Avg non-zero coefficients
View model →

Level-wise Logistic Regression

Hierarchy-aware top-down classifier that always emits a valid taxonomy path

0.1008 Avg p-adic loss
83.01% Prefix-2 accuracy
132,415 Non-zero params
View model →
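For readers unfamiliar with the metric: one common convention (assumed here, not stated on this page) scores a predicted taxonomy path against the true path as p⁻ᵏ, where k is the length of their shared prefix, with an exact match scoring 0. Prefix-2 accuracy then simply checks agreement on the first two path levels:

```python
def padic_loss(pred, true, p=2):
    """p-adic-style loss between two taxonomy paths (lists of levels).

    Convention assumed here: exact match -> 0; otherwise p**-k where k is
    the number of leading levels the paths agree on. Being wrong at the
    top level gives the maximum loss of 1.
    """
    if pred == true:
        return 0.0
    k = 0
    for a, b in zip(pred, true):
        if a != b:
            break
        k += 1
    return p ** -k

def prefix_accuracy(pairs, depth=2):
    """Fraction of (pred, true) pairs agreeing on the first `depth` levels."""
    hits = sum(1 for pred, true in pairs if pred[:depth] == true[:depth])
    return hits / len(pairs)

# Hypothetical paths: right top level, wrong second level -> loss 2**-1
loss = padic_loss(["Home & Garden", "Decor"], ["Home & Garden", "Lighting"])
```

Under this convention a deeper shared prefix shrinks the loss geometrically, which is what makes hierarchy-aware models like the level-wise classifier score so well on average.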

Zubarev Regression (UMLLR init)

Stochastic p-adic optimization starting from UMLLR (arXiv:2503.23488)

0.4147 Avg p-adic loss
3,124 Non-zero coefficients
View fold details →

Zubarev Regression (Zeros init)

Stochastic p-adic optimization starting from zeros (arXiv:2503.23488)

0.4624 Avg p-adic loss
3,309 Non-zero coefficients
View fold details →

Zubarev Mahler-1 (UMLLR init)

Mahler affine basis (degree 1) with UMLLR initialization

0.4072 Avg p-adic loss
2,925 Non-zero coefficients
View fold details →

Zubarev Mahler-2 (UMLLR init)

Mahler quadratic basis (degree 2) with UMLLR initialization

0.4089 Avg p-adic loss
2,950 Non-zero coefficients
View fold details →
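As background on the Mahler cards above: Mahler's theorem expands continuous p-adic functions in the binomial-coefficient basis C(x, k), with coefficients given by forward differences at 0. The sketch below illustrates the basis itself; how the benchmark attaches it to regression features is not specified on this page.

```python
from math import comb

def mahler_coefficients(f, degree):
    """Coefficients a_k of f in the Mahler basis C(x, k), k = 0..degree.

    a_k is the k-th forward difference of f at 0:
    a_k = sum_j (-1)**(k - j) * C(k, j) * f(j).
    """
    return [
        sum((-1) ** (k - j) * comb(k, j) * f(j) for j in range(k + 1))
        for k in range(degree + 1)
    ]

def mahler_eval(coeffs, x):
    """Evaluate sum_k a_k * C(x, k) at a non-negative integer x."""
    return sum(a * comb(x, k) for k, a in enumerate(coeffs))

# x**2 has the exact degree-2 Mahler expansion C(x,1) + 2*C(x,2)
coeffs = mahler_coefficients(lambda x: x * x, 2)
```

A degree-1 ("affine") basis thus spans constants and C(x, 1) = x, while degree 2 adds the quadratic term C(x, 2) = x(x-1)/2.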

Unconstrained Logistic Regression

L1-regularized model using ALL tags

0.2355 Avg p-adic loss
5,130 Non-zero params
View model →

Decision Tree

Unconstrained tree using ALL tags

0.2055 Avg p-adic loss
48,072 Effective params
View model →

Unconstrained Neural Network

L1-regularized NN with weight pruning

0.2184 Avg p-adic loss
25,164 Non-zero params
View model →

Parameter Constrained Neural Network

Neural network predicting taxonomy from tags

0.5254 Avg p-adic loss
864 Avg input weights
View model →

Parameter Constrained Logistic Regression

Logistic regression model predicting Shopify taxonomy from tags

0.7584 Avg p-adic loss
17,696 Avg parameters
View model →

ELO-Inspired Rankings

Battle-tested tag hierarchy from product title positions

3,876 Tag battles
View rankings →
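The rankings are built from pairwise "battles" in which the tag appearing earlier in a product title is treated as the winner; that reading, and the K-factor and rating scale below, are assumptions, since the page does not state its exact update rule. A standard Elo update looks like:

```python
def elo_update(r_winner, r_loser, k=32):
    """Standard Elo update: winner gains k * (1 - expected win prob), loser loses it."""
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1.0 - expected)
    return r_winner + delta, r_loser - delta

def run_battles(battles, initial=1500.0, k=32):
    """Fold a sequence of (winner_tag, loser_tag) battles into ratings."""
    ratings = {}
    for winner, loser in battles:
        rw = ratings.get(winner, initial)
        rl = ratings.get(loser, initial)
        ratings[winner], ratings[loser] = elo_update(rw, rl, k)
    return ratings

# Hypothetical battles: "organic" beats "cotton" twice, so it ranks higher
ratings = run_battles([("organic", "cotton"), ("organic", "cotton")])
```

Because each battle transfers rating symmetrically, the total rating mass is conserved; only the ordering between tags changes.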

Benchmark Comparisons

Dedicated `latest` and `paper` benchmark pages, including the average active parameters touched per classification for the importance-optimised p-adic linear regressor.

1.09 Latest active params / classification
1.11 Paper active params / classification
Open benchmark pages →

Taxonomy distribution

Taxonomy class distribution
Distribution of products across the most common taxonomy classes

Top 10 taxonomy classes

| Taxonomy ID | Name (path) | Samples | Share |
|---|---|---|---|
| gid://shopify/TaxonomyCategory/bt | Baby & Toddler | 4106 | 1.0% |
| gid://shopify/TaxonomyCategory/lb | Luggage & Bags | 1588 | 0.8% |
| gid://shopify/TaxonomyCategory/bu | Bundles | 547 | 0.4% |
| gid://shopify/TaxonomyCategory/pa | Product Add-Ons | 1920 | 0.2% |
| gid://shopify/TaxonomyCategory/na | Uncategorized | 2518 | 0.2% |
| gid://shopify/TaxonomyCategory/sg | Sporting Goods | 2313 | 0.1% |
| gid://shopify/TaxonomyCategory/os | Office Supplies | 1812 | 0.1% |
| gid://shopify/TaxonomyCategory/gc | Gift Cards | 1111 | 0.1% |
| gid://shopify/TaxonomyCategory/hg | Home & Garden | 148 | 0.1% |
| gid://shopify/TaxonomyCategory/el | Electronics | 87 | 0.1% |

Tags with strongest signal

| Tag | Top taxonomy | Weight | Max \|weight\| |
|---|---|---|---|
| *No tag signal data available* | | | |

Historical Performance Trends

Tracking model performance and dataset growth over time. Lower p-adic loss indicates better predictions.

Historical model performance trends
Model performance vs number of products
| Model | Slope (per product) | Intercept | R² | p-value |
|---|---|---|---|---|
| Importance-Optimised p-adic LR | 0.000003 | 0.3327 | 0.0750 | 6.65e-04 |
| PCLR | 0.000035 | 0.4396 | 0.6220 | 2.77e-33 |
| PCNN | 0.000026 | 0.4367 | 0.4134 | 5.56e-19 |
| ULR | 0.000005 | 0.1908 | 0.5690 | 1.13e-23 |
| UNN | 0.000007 | 0.1556 | 0.5136 | 3.45e-20 |
| Decision Tree | 0.000004 | 0.1590 | 0.5495 | 5.40e-22 |
| Zubarev (UMLLR) | 0.000004 | 0.3803 | 0.3861 | 4.40e-13 |
| Zubarev (zeros) | 0.000009 | 0.3785 | 0.7127 | 5.07e-31 |
| Zubarev (M1) | 0.000001 | 0.4049 | 0.1228 | 1.74e-04 |
| Zubarev (M2) | 0.000002 | 0.3956 | 0.2626 | 1.05e-08 |
| Dummy Baseline | -0.000042 | 0.9623 | 0.7086 | 1.08e-37 |

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Based on current regression trends, we can extrapolate when Importance-Optimised p-adic LR will achieve better performance (lower p-adic loss) than other models as the dataset grows. The confidence intervals are calculated using bootstrap resampling (n=1000).

| Model | Crossover point (products) | 95% confidence interval | Probability | Estimated date |
|---|---|---|---|---|
| UNN (Unconstrained Neural Networks) | 38,945 | 26,990 – 82,965 (σ = 18,101) | >95% | 2027-07-12 (±uncertain, R² = 0.997, growth = 62.0 products/day) |
| ULR (Unconstrained Logistic Regression) | 61,594 | 33,382 – 479,451 (σ = 1,412,463) | >95% | 2028-07-11 (±uncertain, R² = 0.997, growth = 62.0 products/day) |
| Decision Tree | 111,547 | 49,916 – 1,045,870 (σ = 7,188,610) | >95% | 2030-09-25 (±uncertain, R² = 0.997, growth = 62.0 products/day) |

Statistical Notes: The crossover points are calculated by finding where the regression lines intersect. The 95% confidence intervals are derived from bootstrap resampling of the regression parameters. The probability estimates indicate the likelihood that the crossover will occur given the current trends. Date predictions are based on linear extrapolation of dataset growth and should be interpreted with caution.
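Using the slopes and intercepts from the trend table, the point-estimate crossover is just the intersection of the two fitted lines, and the intervals come from refitting on bootstrap resamples. A minimal sketch (the exact resampling scheme used for this page is not specified, and the toy data below is illustrative):

```python
import random

def ols(points):
    """Least-squares (slope, intercept) for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def crossover(line_a, line_b):
    """x at which line A (slope, intercept) meets line B."""
    (ma, ba), (mb, bb) = line_a, line_b
    return (ba - bb) / (mb - ma)

def bootstrap_crossovers(pts_a, pts_b, n_boot=1000, seed=0):
    """Refit both lines on resampled points; return sorted crossover draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        ra = [rng.choice(pts_a) for _ in pts_a]
        rb = [rng.choice(pts_b) for _ in pts_b]
        draws.append(crossover(ols(ra), ols(rb)))
    return sorted(draws)  # 95% CI: draws[int(0.025 * n)], draws[int(0.975 * n)]

# Point estimate from the trend table above: Importance-Optimised p-adic LR
# (slope 0.000003, intercept 0.3327) vs UNN (0.000007, 0.1556)
x_cross = crossover((0.000003, 0.3327), (0.000007, 0.1556))  # ~44,275 products
```

Note that this plain intersection (~44,275 products for UNN) need not match the table's 38,945, which is a bootstrap-based estimate rather than the raw point estimate.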

Model performance vs number of distinct tags
| Model | Slope (per tag) | Intercept | R² | p-value |
|---|---|---|---|---|
| Importance-Optimised p-adic LR | 0.000005 | 0.3064 | 0.1218 | 1.12e-05 |
| PCLR | 0.000047 | 0.2227 | 0.6726 | 5.94e-38 |
| PCNN | 0.000038 | 0.2554 | 0.4954 | 6.83e-24 |
| ULR | 0.000007 | 0.1556 | 0.6065 | 4.61e-26 |
| UNN | 0.000011 | 0.0973 | 0.6262 | 5.60e-27 |
| Decision Tree | 0.000006 | 0.1283 | 0.5817 | 6.95e-24 |
| Zubarev (UMLLR) | 0.000007 | 0.3414 | 0.5038 | 3.94e-18 |
| Zubarev (zeros) | 0.000014 | 0.3039 | 0.7970 | 3.44e-39 |
| Zubarev (M1) | 0.000002 | 0.3903 | 0.1903 | 1.90e-06 |
| Zubarev (M2) | 0.000004 | 0.3720 | 0.3551 | 6.52e-12 |
| Dummy Baseline | -0.000059 | 1.2376 | 0.7787 | 1.02e-45 |

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Based on current regression trends, we can extrapolate when Importance-Optimised p-adic LR will achieve better performance (lower p-adic loss) than other models as the dataset grows. The confidence intervals are calculated using bootstrap resampling (n=1000).

| Model | Crossover point (tags) | 95% confidence interval | Probability | Estimated date |
|---|---|---|---|---|
| UNN (Unconstrained Neural Networks) | 32,465 | 24,661 – 57,226 (σ = 8,675) | >95% | 2027-06-10 (±uncertain, R² = 0.993, growth = 47.1 tags/day) |
| ULR (Unconstrained Logistic Regression) | 61,230 | 31,900 – 472,989 (σ = 318,162) | >95% | 2029-02-10 (±uncertain, R² = 0.993, growth = 47.1 tags/day) |
| Decision Tree | 121,921 | 46,409 – 1,240,786 (σ = 7,811,809) | >95% | 2032-08-23 (±uncertain, R² = 0.993, growth = 47.1 tags/day) |

Statistical Notes: The crossover points are calculated by finding where the regression lines intersect. The 95% confidence intervals are derived from bootstrap resampling of the regression parameters. The probability estimates indicate the likelihood that the crossover will occur given the current trends. Date predictions are based on linear extrapolation of dataset growth and should be interpreted with caution.

Model complexity vs performance (parameter count vs p-adic loss)
Both axes use log scale. The red line is the fixed parsimoniousness baseline rather than a fitted regression.

Why parsimony matters. The question here is not just which model has the lowest loss, but which model gets good p-adic loss with the fewest effective parameters. That is exactly where the smaller p-adic models are interesting.

Where this baseline came from. The original score came from a log-log regression on model size versus loss, rounded to -0.1 × log₁₀(params) - 0.2. Looking across historical snapshots, those scores drifted as the dataset covered more taxonomies, so the current baseline adds + 0.3 × log₁₀(taxonomies / 1,000) to keep comparisons stable as the benchmark grows. For readability, we also re-centre the displayed score by dropping the old constant offset; that keeps the current tables mostly positive without changing the relative comparisons.

Parsimoniousness baseline: log₁₀(loss) = -0.1 × log₁₀(params) + 0.3 × log₁₀(taxonomies / 1,000)
Current snapshot taxonomies: 563
Parsimony score = baseline log₁₀(loss) − observed log₁₀(loss). Positive means better than baseline.
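The score can be reproduced directly from the formula above; the sketch below recomputes the Level-wise Logistic row as a check.

```python
from math import log10

def parsimony_score(params, loss, taxonomies):
    """Baseline log10(loss) minus observed log10(loss); positive beats baseline."""
    baseline = -0.1 * log10(params) + 0.3 * log10(taxonomies / 1000)
    return baseline - log10(loss)

# Level-wise Logistic row from the table below: 132,415 params, 0.1008 loss,
# 563 taxonomies in the current snapshot -> score of roughly +0.4096
score = parsimony_score(132_415, 0.1008, 563)
```

The Dummy row is a useful sanity check: with a single parameter, log₁₀(params) = 0, so its baseline reduces to the taxonomy term alone.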

| Model | Params | Loss | log₁₀(params) | log₁₀(loss) | Baseline log₁₀(loss) | Parsimony score |
|---|---|---|---|---|---|---|
| Level-wise Logistic | 132,415 | 0.1008 | 5.1219 | -0.9966 | -0.5870 | +0.4096 |
| ULR | 5,130 | 0.2355 | 3.7101 | -0.6280 | -0.4459 | +0.1822 |
| UNN | 25,164 | 0.2184 | 4.4008 | -0.6607 | -0.5149 | +0.1457 |
| Decision Tree | 48,072 | 0.2055 | 4.6819 | -0.6871 | -0.5430 | +0.1441 |
| Dummy | 1 | 0.6100 | 0.0000 | -0.2147 | -0.0748 | +0.1398 |
| Importance-Optimised | 1,083 | 0.3232 | 3.0345 | -0.4905 | -0.3783 | +0.1123 |
| Zubarev (M1) | 2,925 | 0.4072 | 3.4662 | -0.3902 | -0.4215 | -0.0313 |
| Zubarev (M2) | 2,950 | 0.4089 | 3.4698 | -0.3884 | -0.4218 | -0.0334 |
| Zubarev (UMLLR) | 3,124 | 0.4147 | 3.4947 | -0.3823 | -0.4243 | -0.0420 |
| PCNN | 864 | 0.5254 | 2.9365 | -0.2795 | -0.3685 | -0.0890 |
| Zubarev (zeros) | 3,309 | 0.4624 | 3.5197 | -0.3350 | -0.4268 | -0.0919 |
| PCLR | 17,696 | 0.7584 | 4.2479 | -0.1201 | -0.4996 | -0.3795 |
Historical parsimony score stability
Left: parsimony score versus dataset size. Right: score distribution across historical snapshots. Positive means better than the taxonomy-adjusted baseline.
| Model | Snapshots | Mean score | Std dev | Span | Latest score | Latest products |
|---|---|---|---|---|---|---|
| Unconstrained Logistic Regression with L1 | 122 | +0.1680 | 0.0205 | 0.1083 | +0.1817 | 10,968 |
| Unconstrained Neural Network with L1 | 120 | +0.1102 | 0.0327 | 0.1631 | +0.1453 | 10,968 |
| Decision Tree | 89 | +0.1523 | 0.0158 | 0.0720 | +0.1436 | 10,968 |
| Dummy Baseline | 136 | +0.0241 | 0.1233 | 0.3268 | +0.1394 | 10,968 |
| Importance-Optimised p-adic Linear Regression | 89 | +0.0374 | 0.0418 | 0.1273 | +0.1118 | 10,968 |
| Zubarev (UMLLR init) | 110 | -0.0719 | 0.0158 | 0.0810 | -0.0425 | 10,968 |
| PCNN | 120 | -0.2281 | 0.0473 | 0.1992 | -0.0895 | 10,968 |
| PCLR | 120 | -0.3788 | 0.0248 | 0.1880 | -0.3800 | 10,968 |

Smaller standard deviation and span mean a model’s parsimoniousness is more stable as the dataset grows.

Unconstrained models: complexity vs performance (log-log scale)
Unconstrained models only (no PCLR/PCNN). Both axes on log scale.

Regression: log₁₀(loss) = slope × log₁₀(params) + intercept

| Slope | Intercept | R² | p-value | Significant? | n |
|---|---|---|---|---|---|
| -0.1029 | -0.2106 | 0.9844 | 0.0008 | Yes | 5 |
Model performance trajectory over time
Arrows show how each model's complexity and performance have changed over time.