
python - TypeError: SFTTrainer.__init__() got an unexpected keyword argument 'dataset_text_field' - Stack Overflow


I am trying to fine-tune a language model using SFTTrainer from the trl library in Google Colab. However, I am encountering the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-3-3a32b942f05f> in <cell line: 0>()
     53
     54
---> 55 trainer = SFTTrainer(
     56         model=model,
     57         train_dataset=data,

/usr/local/lib/python3.11/dist-packages/transformers/utils/deprecation.py in wrapped_func(*args, **kwargs)
    170                 warnings.warn(message, FutureWarning, stacklevel=2)
    171
--> 172             return func(*args, **kwargs)
    173
    174         return wrapped_func

TypeError: SFTTrainer.__init__() got an unexpected keyword argument 'dataset_text_field'

Code:

import torch
from datasets import load_dataset, Dataset
from peft import LoraConfig, AutoPeftModelForCausalLM, prepare_model_for_kbit_training, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig, TrainingArguments
from trl import SFTTrainer
import os

# Load dataset
data = load_dataset("tatsu-lab/alpaca", split="train")
data_df = data.to_pandas()
data_df = data_df[:5000]
data_df["text"] = data_df[["input", "instruction", "output"]].apply(lambda x: "###Human: " + x["instruction"] + " " + x["input"] + " ###Assistant: "+ x["output"], axis=1)
data = Dataset.from_pandas(data_df)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.1-GPTQ")
tokenizer.pad_token = tokenizer.eos_token

# Load model
quantization_config_loading = GPTQConfig(bits=4, disable_exllama=True, tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
                            "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ",
                            quantization_config=quantization_config_loading,
                            device_map="auto"
                        )

model.config.use_cache = False
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

# Apply LoRA configuration
peft_config = LoraConfig(
    r=16, lora_alpha=16, lora_dropout=0.05, bias="none", task_type="CAUSAL_LM", target_modules=["q_proj", "v_proj"]
)
model = get_peft_model(model, peft_config)

# Training arguments
training_arguments = TrainingArguments(
        output_dir="mistral-finetuned-alpaca",
        per_device_train_batch_size=8,
        gradient_accumulation_steps=1,
        optim="paged_adamw_32bit",
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        save_strategy="epoch",
        logging_steps=100,
        num_train_epochs=1,
        max_steps=250,
        fp16=True,
        push_to_hub=True
)

# Initialize Trainer
trainer = SFTTrainer(
        model=model,
        train_dataset=data,
        peft_config=peft_config,
        dataset_text_field="text",  # This argument is causing the error
        args=training_arguments,
        tokenizer=tokenizer,
        packing=False,
        max_seq_length=512
)

trainer.train()

What I've tried:

  1. Checked the SFTTrainer documentation to verify if dataset_text_field is a valid argument.
  2. Ensured that trl is updated using pip install -U trl.
  3. Verified that dataset_text_field is used correctly for SFTTrainer.

Question:

  • Is dataset_text_field deprecated or no longer needed in SFTTrainer?
  • If so, how should I modify my code to correctly train the model using SFTTrainer?

asked Mar 14 at 17:18 by User
  • What version of trl are you on? – Starship Remembers Shadow, Mar 14 at 18:22
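That comment points at the root cause: whether SFTTrainer.__init__() still accepts dataset_text_field depends entirely on which trl release is installed. A minimal check (assuming only that trl is importable):

import trl

# Recent trl releases removed dataset_text_field, packing and
# max_seq_length from SFTTrainer.__init__() in favor of SFTConfig,
# so a freshly updated install plus old-style kwargs raises exactly
# this TypeError.
print(trl.__version__)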

1 Answer


Based on this documentation, the code below should work (I added "##" where changes were made):

# Training arguments
from trl import SFTConfig  ##
training_arguments = SFTConfig(  ##
        output_dir="mistral-finetuned-alpaca",
        per_device_train_batch_size=8,
        gradient_accumulation_steps=1,
        optim="paged_adamw_32bit",
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        save_strategy="epoch",
        logging_steps=100,
        num_train_epochs=1,
        max_steps=250,
        fp16=True,
        packing=False,  ##
        max_seq_length=512,  ##
        dataset_text_field="text",  ##
        push_to_hub=True
)

# Initialize Trainer
trainer = SFTTrainer(
        model=model,
        train_dataset=data,
        peft_config=peft_config,
        args=training_arguments,
        tokenizer=tokenizer,
        # packing, max_seq_length and dataset_text_field were removed  ##
        # from this call; they now live in SFTConfig above  ##
)

trainer.train()
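An alternative worth knowing about: instead of building a "text" column through pandas and pointing dataset_text_field at it, SFTTrainer also accepts a formatting_func. The sketch below assumes data still holds the raw Alpaca instruction/input/output columns; note that the expected signature has shifted across trl versions (older releases wanted a batched function returning a list of strings), so check the docs for your installed version:

def formatting_func(example):
    # Rebuild the same ###Human/###Assistant prompt used in the question
    return ("###Human: " + example["instruction"] + " " + example["input"]
            + " ###Assistant: " + example["output"])

trainer = SFTTrainer(
        model=model,
        train_dataset=data,  # raw Alpaca split; no pre-built "text" column needed
        peft_config=peft_config,
        args=training_arguments,
        tokenizer=tokenizer,
        formatting_func=formatting_func,
)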