Padjective Tag Hierarchy

Machine learning insights into Shopify product tag organization

Data is sourced from cantbuymelove.industrial-linguistics.com, which powers Shopify taxonomy classification, and is filtered to taxonomies with at least five products.

Last updated 2026-02-01 19:07 UTC

6,181 Products used
369 Taxonomies covered
9,385 Tags used
24,589 Total tags
2,554 Tag battles

Dataset coverage

Training data spans 6,181 products across 369 taxonomies. Of 24,589 total tags in the dataset, 9,385 tags were used (tags appearing fewer than 5 times were filtered out). 4,211 products were discarded due to missing or sparse taxonomy labels. Explore the full dataset → | View defective taxonomy labels →
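The tag-frequency filter described above (tags appearing fewer than five times are dropped) can be sketched as follows. This is a minimal illustration, not the site's actual pipeline; the function name and data shapes are assumptions.

```python
from collections import Counter

def filter_rare_tags(products, min_count=5):
    """Drop tags that appear on fewer than `min_count` products.

    `products` is a list of (taxonomy, [tags]) pairs. Returns the
    pruned product list and the set of tags that survived the cut.
    """
    counts = Counter(tag for _, tags in products for tag in tags)
    kept = {tag for tag, n in counts.items() if n >= min_count}
    pruned = [(tax, [t for t in tags if t in kept]) for tax, tags in products]
    return pruned, kept
```

A product whose tags are all rare keeps its taxonomy label but ends up with an empty tag list, which is one way such products become unusable for training.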

Dummy Baseline

Always predicts the most common taxonomy (baseline for comparison)

0.6001 Avg p-adic loss
1 Parameter
View model →
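The "avg p-adic loss" metric is not defined on this page. One plausible reading, given that taxonomy paths such as 1.1.13.8 are hierarchical, is a p-adic-style distance of p^(-k), where k is the length of the shared path prefix: loss 1 for a top-level miss, shrinking as predictions agree deeper, and 0 for an exact match. This is my reconstruction, not the site's documented formula.

```python
def padic_loss(pred_path, true_path, p=2):
    """Hypothetical p-adic-style loss between two taxonomy paths.

    Paths are tuples like (1, 1, 13, 8). An exact match scores 0;
    otherwise the loss is p**(-k), where k is the shared prefix length.
    """
    if pred_path == true_path:
        return 0.0
    k = 0
    for a, b in zip(pred_path, true_path):
        if a != b:
            break
        k += 1
    return p ** -k
```

Under this reading, per-example losses lie in [0, 1], which is at least consistent with most of the averages reported on this page.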

Importance-Optimised p-adic Linear Regression

P-adic coefficients assigned to tags to predict taxonomy

0.3737 Avg p-adic loss
786 Avg non-zero coefficients
View model →

Zubarev Regression (UMLLR init)

Stochastic p-adic optimization starting from UMLLR (arXiv:2503.23488)

0.4155 Avg p-adic loss
2,311 Non-zero coefficients
View fold details →

Zubarev Regression (Zeros init)

Stochastic p-adic optimization starting from zeros (arXiv:2503.23488)

0.4374 Avg p-adic loss
2,415 Non-zero coefficients
View fold details →

Zubarev Mahler-1 (UMLLR init)

Mahler affine basis (degree 1) with UMLLR initialization

0.4145 Avg p-adic loss
2,229 Non-zero coefficients
View fold details →

Zubarev Mahler-2 (UMLLR init)

Mahler quadratic basis (degree 2) with UMLLR initialization

0.4126 Avg p-adic loss
2,207 Non-zero coefficients
View fold details →
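Mahler's theorem expands continuous p-adic functions in the binomial-coefficient basis C(x, k), so the "degree 1" and "degree 2" labels above presumably truncate that expansion at the affine and quadratic terms. A hedged sketch of the feature map (my reconstruction, not the arXiv paper's code):

```python
from math import comb

def mahler_features(x, degree):
    """Mahler (binomial-coefficient) basis up to `degree`:
    [C(x,0), C(x,1), ..., C(x,degree)] = [1, x, x(x-1)/2, ...]."""
    return [comb(x, k) for k in range(degree + 1)]
```

Degree 1 gives the affine basis [1, x]; degree 2 adds the quadratic term x(x-1)/2.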

Unconstrained Logistic Regression

L1-regularized model using ALL tags

0.2135 Avg p-adic loss
3,146 Non-zero params
View model →

Decision Tree

Unconstrained tree using ALL tags

0.1874 Avg p-adic loss
26,232 Effective params
View model →

Unconstrained Neural Network

L1-regularized NN with weight pruning

0.2104 Avg p-adic loss
31,883 Non-zero params
View model →

Parameter Constrained Neural Network

Neural network predicting taxonomy from tags

0.6584 Avg p-adic loss
864 Avg input weights
View model →

Parameter Constrained Logistic Regression

Logistic regression model predicting Shopify taxonomy from tags

0.6540 Avg p-adic loss
11,725 Avg parameters
View model →

ELO-Inspired Rankings

Battle-tested tag hierarchy from product title positions

2,554 Tag battles
View rankings →
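A "tag battle" pits two tags against each other based on their positions in a product title, with the earlier tag treated as the winner. The site's exact variant is not specified; a standard Elo update with an assumed K-factor of 32 would look like this:

```python
def elo_update(winner_rating, loser_rating, k=32):
    """Standard Elo update after one battle.

    The winner's expected score comes from the logistic curve on the
    rating gap; both ratings move by the same amount in opposite
    directions, so the total rating pool is conserved.
    """
    expected_win = 1 / (1 + 10 ** ((loser_rating - winner_rating) / 400))
    delta = k * (1 - expected_win)
    return winner_rating + delta, loser_rating - delta
```

Two equally rated tags trade exactly k/2 points, and an upset win by a low-rated tag moves the ratings more than an expected win does.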

Taxonomy distribution

Taxonomy class distribution
Distribution of products across the most common taxonomy classes

Top 10 taxonomy classes

Taxonomy ID | Name | Path | Samples | Share
gid://shopify/TaxonomyCategory/aa-1-13-8 | Apparel & Accessories > Clothing > Clothing Tops > T-Shirts | 1.1.13.8 | 304 | 4.9%
gid://shopify/TaxonomyCategory/fb-2-3-2 | Food, Beverages & Tobacco > Food Items > Candy & Chocolate > Chocolate | 9.2.3.2 | 249 | 4.0%
gid://shopify/TaxonomyCategory/aa-6-8 | Apparel & Accessories > Jewelry > Necklaces | 1.6.8 | 144 | 2.3%
gid://shopify/TaxonomyCategory/aa-1-4 | Apparel & Accessories > Clothing > Dresses | 1.1.4 | 142 | 2.3%
gid://shopify/TaxonomyCategory/ae-2-1 | Arts & Entertainment > Hobbies & Creative Arts > Arts & Crafts | 3.2.1 | 130 | 2.1%
gid://shopify/TaxonomyCategory/aa-6-6 | Apparel & Accessories > Jewelry > Earrings | 1.6.6 | 118 | 1.9%
gid://shopify/TaxonomyCategory/hg-9 | Home & Garden > Household Appliances | 14.9 | 105 | 1.7%
gid://shopify/TaxonomyCategory/ha-6-2-5 | Hardware > Hardware Accessories > Cabinet Hardware > Cabinet Knobs & Handles | 12.6.2.5 | 89 | 1.4%
gid://shopify/TaxonomyCategory/lb | Luggage & Bags | 15 | 81 | 1.3%
gid://shopify/TaxonomyCategory/ae-2-2 | Arts & Entertainment > Hobbies & Creative Arts > Collectibles | 3.2.2 | 79 | 1.3%

Tags with strongest signal

Tag | Top taxonomy | Weight | Max abs. weight
FRAMED ARTWORK | 3.2.2 | 5.6879 | 5.6879
WOMENS | 1.8.7 | 5.5699 | 5.5699
BLUE | 14.11.10.4.3 | 5.4956 | 5.4956
ACCESSORIES | 23.4.4.1.7.4 | 5.2672 | 5.2672
GIFT | 14.15.1.9 | 5.1121 | 5.1121
WHOLESALE | 14.11.10.7.9 | 5.0652 | 5.0652
VEGAN | 13.3.5.2 | 5.0436 | 5.0436
KIDS | 4.2 | 5.0249 | 5.0249
NEW ARRIVALS | 13.3.2.8.4 | 4.8520 | 4.8520
PLUS SIZE | 1.1.1.1.5 | 4.8185 | 4.8185

Historical Performance Trends

Tracking model performance and dataset growth over time. Lower p-adic loss indicates better predictions.

Historical model performance trends
Model performance vs number of products
Model | Slope (per product) | Intercept | R² | p-value
Importance-Optimised p-adic LR | 0.000012 | 0.2990 | 0.3100 | 3.81e-08
PCLR | 0.000080 | 0.2892 | 0.7407 | 9.44e-26
PCNN | 0.000081 | 0.2498 | 0.8422 | 1.27e-34
ULR | 0.000008 | 0.1789 | 0.2186 | 3.19e-04
UNN | 0.000026 | 0.0709 | 0.7360 | 2.29e-16
Decision Tree | 0.000009 | 0.1414 | 0.2947 | 3.21e-05
Zubarev (UMLLR) | 0.000019 | 0.3064 | 0.8079 | 2.81e-16
Zubarev (zeros) | 0.000026 | 0.2990 | 0.8131 | 1.60e-16
Zubarev (M1) | 0.000007 | 0.3759 | 0.3582 | 2.24e-05
Zubarev (M2) | 0.000010 | 0.3566 | 0.5340 | 2.65e-08
Dummy Baseline | -0.000074 | 1.0973 | 0.5976 | 7.09e-15

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Based on current regression trends, we can extrapolate when Importance-Optimised p-adic LR will achieve better performance (lower p-adic loss) than other models as the dataset grows. The confidence intervals are calculated using bootstrap resampling (n=1000).

Model | Crossover point (products) | 95% confidence interval | Probability | Estimated date
UNN (Unconstrained Neural Network) | 15,733 | 11,892 – 23,242 (σ = 2,948) | >95% | 2026-07-22 (uncertain; R² = 0.997, growth ≈ 56.4 products/day)

Statistical Notes: The crossover points are calculated by finding where the regression lines intersect. The 95% confidence intervals are derived from bootstrap resampling of the regression parameters. The probability estimates indicate the likelihood that the crossover will occur given the current trends. Date predictions are based on linear extrapolation of dataset growth and should be interpreted with caution.
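A crossover point is simply where the two fitted lines meet: solving slope_a·x + b_a = slope_b·x + b_b for x. Plugging in the rounded per-product coefficients from the trends table for Importance-Optimised p-adic LR and UNN lands near 16,300 products; the 15,733 reported above presumably comes from the unrounded fits, with the bootstrap CI capturing the remaining uncertainty.

```python
def crossover(slope_a, intercept_a, slope_b, intercept_b):
    """Product count at which two fitted loss lines intersect."""
    return (intercept_b - intercept_a) / (slope_a - slope_b)

# Rounded coefficients from the per-product trends table:
# Importance-Optimised p-adic LR (a) vs UNN (b).
x = crossover(0.000012, 0.2990, 0.000026, 0.0709)
```

Because the slopes differ by only 1.4e-05, the crossover estimate is highly sensitive to small slope errors, which is why the bootstrap interval above is so wide.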

Model performance vs number of distinct tags
Model | Slope (per tag) | Intercept | R² | p-value
Importance-Optimised p-adic LR | 0.000014 | 0.2422 | 0.3422 | 5.13e-09
PCLR | 0.000088 | -0.0568 | 0.7266 | 8.28e-25
PCNN | 0.000089 | -0.0978 | 0.8203 | 2.67e-32
ULR | 0.000009 | 0.1474 | 0.2519 | 9.48e-05
UNN | 0.000026 | -0.0213 | 0.7613 | 1.74e-17
Decision Tree | 0.000009 | 0.1085 | 0.3267 | 9.61e-06
Zubarev (UMLLR) | 0.000020 | 0.2301 | 0.8453 | 3.26e-18
Zubarev (zeros) | 0.000027 | 0.1989 | 0.8409 | 5.81e-18
Zubarev (M1) | 0.000007 | 0.3481 | 0.3701 | 1.51e-05
Zubarev (M2) | 0.000011 | 0.3180 | 0.5379 | 2.22e-08
Dummy Baseline | -0.000079 | 1.3929 | 0.6217 | 8.77e-16

Extrapolation Analysis: When Will Importance-Optimised p-adic LR Outperform Other Models?

Based on current regression trends, we can extrapolate when Importance-Optimised p-adic LR will achieve better performance (lower p-adic loss) than other models as the dataset grows. The confidence intervals are calculated using bootstrap resampling (n=1000).

Model | Crossover point (tags) | 95% confidence interval | Probability | Estimated date
UNN (Unconstrained Neural Network) | 20,839 | 16,368 – 31,408 (σ = 3,999) | >95% | 2026-09-16 (uncertain; R² = 0.993, growth ≈ 50.4 tags/day)


Model complexity vs performance (parameter count vs p-adic loss)
Parameter count (log scale) vs p-adic loss. Sparse models use fewer non-zero parameters.

Regression: p-adic loss = slope × log₁₀(params) + intercept

Line | Slope | Intercept | R² | p-value | Significant? | n
With Dummy | -0.0715 | 0.6486 | 0.2583 | 0.1104 | No | 11
Without Dummy | -0.1316 | 0.8685 | 0.1998 | 0.1953 | No | 10
Unconstrained models: complexity vs performance (log-log scale)
Unconstrained models only (no PCLR/PCNN). Both axes on log scale.

Regression: log₁₀(loss) = slope × log₁₀(params) + intercept

Slope | Intercept | R² | p-value | Significant? | n
-0.1111 | -0.2044 | 0.9036 | 0.0131 | Yes | 5
Model performance trajectory over time
Arrows show how each model's complexity and performance have changed over time.