Analytics Notebook

Blog

The Bottleneck Was Never the Model

Switching from a 32B local model to a 14B model produced a 4.7x speed improvement. The larger model was not more capable in practice. It was bottlenecked by a hardware constraint that had nothing to do with intelligence.

5 min read