Yes, but 200 GB is probably already with 4-bit quantization; the weights in fp16 would be more like 800 GB.
IDK if it's even possible to quantize more. If it is, you're probably better off going with a smaller model anyway.
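For anyone who wants to check the 200 GB vs. 800 GB arithmetic, here is a rough sketch. It assumes the 200 GB figure is weights only at a uniform 4 bits per parameter, and it ignores the small overhead real quantization formats add (scales, zero points). The function names are made up for illustration.

```python
def rescale_weights_gb(size_gb, from_bits, to_bits):
    """Estimate weights-only size after changing bits per parameter.

    Assumes every parameter uses the same bit width and ignores
    quantization metadata such as scales and zero points.
    """
    return size_gb * to_bits / from_bits


def params_in_billions(size_gb, bits):
    """Back out the parameter count (in billions) from size and bit width.

    Uses 1 GB = 1e9 bytes, so bytes per parameter is bits / 8.
    """
    return size_gb * 8 / bits


# 200 GB at 4-bit scaled up to fp16 (16 bits per parameter)
fp16_gb = rescale_weights_gb(200, from_bits=4, to_bits=16)
# Implied parameter count under the same assumption
n_params = params_in_billions(200, bits=4)

print(f"fp16 weights: ~{fp16_gb:.0f} GB")
print(f"implied size: ~{n_params:.0f}B parameters")
```

Under the same assumptions, going from 4-bit down to 2-bit would only halve it again to about 100 GB, which is why a smaller model is usually the better trade.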