InFeeo
Language

Does it make sense to use alternative quantizations of QAT models? [D](reddit.com)

×
Link preview Does it make sense to use alternative quantizations of QAT models? [D] From TF's website: Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models. So is it designed to work with a very specific quantization method (for Gemma-4, presumably, Google's own)? Or would it make sense to use alternative quantization methods? According to the benchmarks unsloth released, its (alternative) quantizations of Gemma-4-QAT are closer to the QAT fine-tunes, but is it a good thing, or does it defeat the purpose of QAT? submitted by /u/we_are_mammals [link] [Kommentare] reddit.com · reddit.com
From TF's website: Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models. So is it designed to work with a very specific quantization method (for Gemma-4, presumably, Google's own)? Or would it make sense to use alternative quantization methods? According to the benchmarks unsloth released, its (alternative) quantizations of Gemma-4-QAT are closer to the QAT fine-tunes, but is it a good thing, or does it defeat the purpose of QAT? submitted by /u/we_are_mammals [link] [Kommentare]

Log in Log in to comment.

No comments yet.