Neural network with L1 regularization and weight pruning for sparse predictions
An unconstrained neural network classifier that applies L1 regularization during training, followed by post-training weight pruning, to achieve sparsity. Unlike the parameter-constrained models, this classifier uses ALL available tags as input features, relying on the combination of L1 regularization and pruning to eliminate unimportant connections.
The network uses a single hidden layer with 256 neurons.
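A minimal sketch of the architecture and the L1-regularized training objective, assuming PyTorch; `num_tags`, `num_classes`, and `l1_lambda` are illustrative placeholders, since only the 256-unit hidden layer is specified above.

```python
import torch
import torch.nn as nn

num_tags, num_classes = 4096, 10   # hypothetical dimensions, not from the text
l1_lambda = 1e-4                   # assumed L1 penalty strength

model = nn.Sequential(
    nn.Linear(num_tags, 256),      # single hidden layer with 256 neurons
    nn.ReLU(),
    nn.Linear(256, num_classes),
)
criterion = nn.CrossEntropyLoss()

def loss_fn(logits, targets):
    # Task loss plus an L1 penalty over all weights; the penalty drives
    # unimportant connections toward zero, which the later pruning step removes.
    l1 = sum(p.abs().sum() for p in model.parameters())
    return criterion(logits, targets) + l1_lambda * l1
```

Per-fold cross-validation results: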
| Fold | Accuracy | F1 | p-adic loss (mean) | Non-zero params | Sparsity |
|---|---|---|---|---|---|
| 0 | 64.19% | 0.6070 | 0.146505 | 37,289 | 92.1% |
| 1 | 63.20% | 0.6096 | 0.126863 | 38,105 | 91.9% |
| 2 | 60.77% | 0.5673 | 0.168868 | 51,346 | 89.1% |
| 3 | 62.91% | 0.6002 | 0.152178 | 37,979 | 91.9% |
| 4 | 61.62% | 0.5650 | 0.190346 | 57,070 | 87.9% |
The unconstrained neural network achieves the lowest mean p-adic loss among all models by retaining more non-zero parameters after pruning than the constrained models (roughly 37,000-57,000, at 88-92% sparsity), while L1 regularization and pruning ensure that only the most important connections are kept. This illustrates the tradeoff between model complexity and prediction quality: spending more parameters buys a lower p-adic loss.
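For concreteness, a magnitude-pruning sketch (again assuming PyTorch) that produces the kind of non-zero-parameter and sparsity figures reported in the table; the threshold is an assumed hyperparameter, not the value used for these runs.

```python
import torch

@torch.no_grad()
def prune_and_report(model: torch.nn.Module, threshold: float = 1e-3) -> None:
    # Zero out every weight whose magnitude falls below the threshold,
    # then report the non-zero parameter count and sparsity, matching
    # the last two columns of the results table.
    for p in model.parameters():
        p[p.abs() < threshold] = 0.0
    total = sum(p.numel() for p in model.parameters())
    nonzero = sum(int(p.count_nonzero()) for p in model.parameters())
    print(f"Non-zero params: {nonzero:,}  Sparsity: {100 * (1 - nonzero / total):.1f}%")
```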