
python - How to quantize a classification model with AWQ? - Stack Overflow


Suppose I have a classification model fine-tuned from Qwen2.5-0.5B:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    device_map="auto",
    num_labels=2,
    torch_dtype=torch.bfloat16,
)
```
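For reference, a model loaded this way is a `Qwen2ForSequenceClassification` whose head is a `Linear` layer named `score`. A minimal sketch to confirm this (the printed shape assumes Qwen2.5-0.5B's hidden size of 896):

```python
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    num_labels=2,
    torch_dtype=torch.bfloat16,
)

# The classification head sits on top of the causal-LM backbone.
print(type(model).__name__)  # Qwen2ForSequenceClassification
print(model.score)           # Linear(in_features=896, out_features=2, bias=False)
```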

How do I quantize it with AWQ, and how do I calibrate it?

I first load the classification model with transformers and save it:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = "./qwen2.5-0.5b-cls"  # placeholder save directory

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    device_map="auto",
    num_labels=2,
    torch_dtype=torch.bfloat16,
)
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)
```
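At this point the saved checkpoint still records the classification architecture, which can be confirmed from its config (a sketch, using the placeholder `model_path` from above):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("./qwen2.5-0.5b-cls")  # placeholder model_path
print(config.architectures)  # ['Qwen2ForSequenceClassification']
print(config.num_labels)     # 2
```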

I then ran the conversion with AutoAWQ, but after quantization the head layer had changed from `score` to `lm_head`, i.e. the model architecture itself changed (a check confirming this is sketched after the code below):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

quant_path = "./qwen2.5-0.5b-cls-awq"  # placeholder output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoAWQForCausalLM.from_pretrained(model_path, device_map="auto", safetensors=True)

# Calibration data: a list of raw text strings (placeholder samples here;
# in practice, texts drawn from the classification task's domain).
data = ["example calibration sentence one.", "example calibration sentence two."]

model.quantize(tokenizer, quant_config=quant_config, calib_data=data)
model.save_quantized(quant_path, safetensors=True, shard_size="4GB")
tokenizer.save_pretrained(quant_path)
```
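As far as I can tell, AutoAWQ picks its wrapper class from `config.model_type` and always builds the causal-LM variant of the architecture, which would explain the head swap. A minimal sketch to confirm the change in the saved checkpoints (paths are the same placeholders as above; the single-shard filename is an assumption):

```python
from safetensors.torch import load_file
from transformers import AutoConfig

model_path = "./qwen2.5-0.5b-cls"       # placeholders, as above
quant_path = "./qwen2.5-0.5b-cls-awq"

# Before: a classification architecture; after: a causal-LM one.
print(AutoConfig.from_pretrained(model_path).architectures)  # ['Qwen2ForSequenceClassification']
print(AutoConfig.from_pretrained(quant_path).architectures)  # ['Qwen2ForCausalLM']

# The quantized weights no longer contain the `score.*` head.
state = load_file(f"{quant_path}/model.safetensors")  # assumes a single shard
print([k for k in state if k.startswith(("score", "lm_head"))])
```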