I am a beginner with Large Language Models and the Hugging Face API. As practice, I am trying to fine-tune the Llama 3.1 8B model on the wikitext dataset.
When I try to run the script below, trainer.train() fails with an unknown CUDA error:

```
  File "/root/mesh_LLM.py", line 84, in <module>
    trainer.train()
RuntimeError: CUDA error: unknown error
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
My machine is a Dell Precision with an NVIDIA RTX A5000 (16 GB VRAM), so I hope it is not a memory issue, since I am loading and fine-tuning the model in 8-bit precision.
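(Back-of-the-envelope: 8B parameters at one byte each is roughly 8 GB for the weights in 8-bit, before activations, LoRA gradients and optimizer state.) As a sanity check, a minimal sketch like the one below, assuming the same model ID and a working bitsandbytes install, should print the quantized model's actual footprint via get_memory_footprint():

```python
# Sketch only, not part of my training script: load the base model in 8-bit
# and report how much memory the weights take.
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

check_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",                                   # assumed: same model ID as below
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
print(f"8-bit footprint: {check_model.get_memory_footprint() / 1e9:.1f} GB")
```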
Here is the code:

```python
import torch,os
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import Trainer, TrainingArguments, BitsAndBytesConfig
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # force synchronous kernel launches so the error points at the real call site
model_name = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name, token='hf-****')
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token, so reuse EOS
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map='auto',
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit weights via bitsandbytes
)
model.resize_token_embeddings(len(tokenizer))  # vocab size is unchanged, so this should be a no-op
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, bias='none', task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# Load a dataset
dataset = load_dataset("wikitext", "wikitext-2-raw-v1")
# Tokenize the dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)
tokenized_datasets = dataset.map(tokenize_function, batched=True)
training_args = TrainingArguments(
    output_dir="./llama3_finetuned",        # Where to save the model
    evaluation_strategy="steps",            # Evaluate during training
    save_strategy="steps",                  # Save checkpoints
    learning_rate=2e-5,                     # A good starting point for fine-tuning
    per_device_train_batch_size=4,          # Adjust based on GPU memory
    gradient_accumulation_steps=8,          # Simulates a larger batch size
    num_train_epochs=3,                     # Experiment with more epochs for small datasets
    logging_steps=100,                      # Log training progress
    save_steps=500,                         # Save model every 500 steps
    push_to_hub=False                       # Skip pushing to Hugging Face Hub for now
)
trainer = Trainer(
    model=model,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['validation'],
    tokenizer=tokenizer,
    args=training_args
)
trainer.train()
trainer.save_model('model_ft/fine_tuned_llama3-8B')
```
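For context, the simplest thing I know to rule out a completely broken CUDA/driver setup is a bare-bones smoke test like this sketch (plain PyTorch, arbitrary shapes, nothing model-specific):

```python
# Illustrative smoke test, separate from the training script: a tiny matmul on
# the GPU; if even this fails, the problem is the CUDA/driver install, not the model.
import torch

print(torch.cuda.is_available(), torch.cuda.get_device_name(0))
x = torch.randn(1024, 1024, device="cuda")
y = x @ x
torch.cuda.synchronize()
print("matmul OK:", y.shape)
```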
Any leads would be very helpful!