I have a classification model trained on top of Qwen2.5-0.5B:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    device_map="auto",
    num_labels=2,
    torch_dtype=torch.bfloat16,
)
```
How do I quantize it with AWQ and calibrate it?
I load the classification model with transformers and save it:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    device_map="auto",
    num_labels=2,
    torch_dtype=torch.bfloat16,
)

model_path = "./qwen2.5-0.5b-cls"  # placeholder; my local save directory
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)
```
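For calibration I prepared a small list of raw text strings from my task data, since AutoAWQ's `calib_data` accepts a list of texts (the samples below are made-up placeholders):

```python
# Calibration set for AWQ: a list of raw text strings.
# AutoAWQ tokenizes these internally during quantization.
data = [
    "This product arrived quickly and works as described.",
    "Terrible experience, the item broke after one day.",
    # ... more samples representative of the classification task
]
```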
I then converted the model with AutoAWQ, but found that after quantization the head layer had changed from `score` to `lm_head` (the model architecture was changed):
```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# 4-bit AWQ settings
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoAWQForCausalLM.from_pretrained(model_path, device_map="auto", safetensors=True)

# calibrate on the text samples defined above
model.quantize(tokenizer, quant_config=quant_config, calib_data=data)

quant_path = "./qwen2.5-0.5b-cls-awq"  # placeholder; output directory
model.save_quantized(quant_path, safetensors=True, shard_size="4GB")
tokenizer.save_pretrained(quant_path)
```
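For reference, this is how the change shows up; a minimal sketch that just inspects the saved configs (it assumes the `model_path` and `quant_path` from above):

```python
from transformers import AutoConfig

# Compare the declared architecture before and after quantization.
print(AutoConfig.from_pretrained(model_path).architectures)
# ['Qwen2ForSequenceClassification']
print(AutoConfig.from_pretrained(quant_path).architectures)
# ['Qwen2ForCausalLM'] -- the `score` head was replaced by `lm_head`
```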