1-bit LLMs Could Solve AI’s Energy Demands

floofloof@lemmy.ca · 1 month ago

kromem@lemmy.world · 29 days ago

There’s actually a parameter-for-parameter perplexity improvement for BitNet-1.58, and it increases as the model scales up.

So yes, the perplexity issues with post-training quantization are apparent, but if you train the quantization in from the start, it does better than full precision (FP).

Which makes sense through the lens of the superposition hypothesis, where the weights are actually representing a hyperdimensional virtual vector space. If the weights have too much precision, competing features might compromise on fuzzier representations instead of restructuring the virtual network into better-matching nodes.

Looking at the data so far, constrained weight precision is probably going to be the future of pretraining within a generation or two.
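
For anyone wondering what "training quantization in" looks like at the weight level, here is a rough sketch of the absmean ternary rounding that BitNet-1.58 is usually described as using: divide by the mean absolute weight, round, and clip to {-1, 0, +1}. The function name `absmean_ternary`, the `eps` value, and the sample weights are made up for illustration. This only shows the forward-pass rounding, not the training loop (which, as I understand it, keeps full-precision latent weights and quantizes them on each forward pass).

```python
import math

def absmean_ternary(weights, eps=1e-8):
    """Quantize a list of floats to {-1, 0, +1} with one shared scale.

    Hypothetical helper for illustration; eps only guards against
    division by zero when all weights are zero.
    """
    # Scale is the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clip into the ternary range.
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

# Made-up example weights, not taken from any real model.
weights = [0.9, -0.05, 0.4, -1.3, 0.02, -0.6]
q, scale = absmean_ternary(weights)

# Three possible states per weight is where the "1.58" comes from.
bits_per_weight = math.log2(3)

print(q)                          # [1, 0, 1, -1, 0, -1]
print(round(bits_per_weight, 2))  # 1.58
```

Small weights collapse to 0 and large ones saturate at ±1, so each weight can only say "off", "positive", or "negative", which is the kind of constraint the comment above is talking about.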
