Switching from a 32B local model to a 14B model produced a 4.7x speed improvement. The larger model was not more capable in practice. It was bottlenecked by a hardware constraint that had nothing to do with intelligence.