I’m trying to train a language model with `google/gemma-2-2b` using the Hugging Face Transformers `Trainer`. The same training script works fine for other models such as `gpt2` and `meta-llama/Meta-Llama-3-8B`, but with Gemma-2-2B it fails during evaluation with:

RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and BFloat16 for the source.
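For context, here is a minimal sketch of the kind of setup I’m running (simplified; the actual code excerpt is at the end of this post, and the dataset variables, batch sizes, and output directory below are placeholders):

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "google/gemma-2-2b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # model weights loaded in bf16
)

training_args = TrainingArguments(
    output_dir="./out",              # placeholder
    num_train_epochs=1,
    per_device_train_batch_size=2,   # placeholder
    per_device_eval_batch_size=2,    # placeholder
    bf16=True,                       # mixed-precision bf16 training
    evaluation_strategy="steps",     # source of the FutureWarning in the log below
    eval_steps=100,                  # placeholder
    optim="paged_adamw_32bit",
)

trainer = Trainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer,
    train_dataset=ds_train,  # placeholder: my tokenized dataset, block_size=1024
    eval_dataset=ds_eval,    # placeholder: my tokenized eval split
)

trainer.train()  # training runs; the error is raised inside the evaluation loop
```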
Below is the full console output (and the relevant code excerpt at the end). Note that I already attempted the following:
- Setting `attn_implementation='eager'` for Gemma-2-2B.
- Switching away from `paged_adamw_32bit`.
- (Un)commenting `gradient_checkpointing`.
I still get the same dtype mismatch error at eval time (a sketch of how I applied these settings follows). Any ideas on how to resolve or work around this?
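Concretely, the three attempts above looked roughly like this (flag values are illustrative rather than copied verbatim from my script):

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments

# 1) Force eager attention for Gemma-2 instead of the default attention backend
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
)

training_args = TrainingArguments(
    output_dir="./out",              # placeholder
    # 2) Swap the paged optimizer for plain AdamW
    optim="adamw_torch",             # instead of "paged_adamw_32bit"
    # 3) Toggled gradient checkpointing on/off between runs
    gradient_checkpointing=True,     # also tried with this line commented out
    bf16=True,
    evaluation_strategy="steps",
)
```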
Full console output:
Kwargs to run:
{'mode': 'dryrun', 'project': 'self-opt-train-uncompiled-py-2-gsm8k', 'num_train_epochs': 1, 'model_name': 'google/gemma-2-2b', 'today': '2025_m02_d07_t07h_20m_14s', 'tmux_sess_num': None, 'hostname': 'skampere1'}
Setting random seed = 42
vLLM not installed or vllm set seed has a bug, skipping vLLM seed setting.
Currently logged in as: brando
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 5.63it/s]
block_size=1024
len(ds_train)=18612
len(ds_train)=2740
/lfs/skampere1/0/brando9/miniconda/envs/zip_fit/lib/python3.11/site-packages/transformers/training_args.py:1575: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of