# create a PyTorch optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
for iter in range(max_iters):
    if iter % eval_iters == 0:
        losses = estimate_loss()
        print(f"step: {iter}, loss {losses}")
    # sample a batch of data
    xb, yb = get_batch("train")
    # evaluate the loss
    logits, loss = model.forward(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
print(loss.item())
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
I tried adding model = model.to(device) and calling .cuda() on the inputs, but neither worked. I'm struggling to get it fixed.
Asked Mar 17 at 4:36 by user29981993; edited Mar 22 at 9:37 by Innat.
1 Answer
Your calculation failed because PyTorch detected a mix of GPU and CPU tensors: some of your tensors are on the graphics card (cuda:0), while others are still on the CPU. PyTorch cannot run an operation whose tensor arguments sit on different devices. The index_select named in the message suggests the failure happens in an indexing operation, such as an embedding lookup, where the index tensor and the weights are on different devices.
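To find out which tensors are on which device, a small diagnostic can help. The following is a minimal sketch: devices_in_use is a hypothetical helper (not part of PyTorch), and nn.Embedding merely stands in for the real model, which the question does not show.

```python
import torch
import torch.nn as nn

def devices_in_use(model: nn.Module, *tensors: torch.Tensor) -> set[str]:
    """Collect the device of every parameter, buffer, and extra tensor."""
    found = {str(p.device) for p in model.parameters()}
    found |= {str(b.device) for b in model.buffers()}
    found |= {str(t.device) for t in tensors}
    return found

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Embedding(10, 4)  # stand-in model, deliberately left on the CPU
xb = torch.randint(0, 10, (2, 3), device=device)

# More than one entry in this set means a device mismatch is waiting to happen.
print(devices_in_use(model, xb))
```

Calling it just before the failing forward pass, with xb and yb as the extra tensors, should show whether the model or the batch is the one left behind.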
When you use a GPU with PyTorch, it is identified as cuda:0 if it is your only GPU or if you haven't specified otherwise. The first visible GPU always has index 0.
To use the GPU, set your device to cuda. If you have multiple GPUs, you can select one by changing the index (e.g., cuda:1, cuda:2).
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")
First, transfer your model to the GPU using model = model.to(device). Make sure to do this before setting up your optimizer.
Next, ensure your input data (xb and yb) also resides on the GPU by using xb = xb.to(device) and yb = yb.to(device).
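Put together, the two steps above might look like the sketch below. The question does not show the model or get_batch, so TinyLM, the random data, and the sizes are all assumptions made for illustration. The torch.arange call with device=idx.device covers one common cause of this error (a helper tensor created on the CPU inside forward), which may or may not apply to your model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
vocab_size, block_size, batch_size = 16, 8, 4  # illustrative values
data = torch.randint(0, vocab_size, (256,))    # stand-in for the training data

class TinyLM(nn.Module):
    """Hypothetical stand-in for the model in the question."""
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, 32)
        self.pos_emb = nn.Embedding(block_size, 32)
        self.head = nn.Linear(32, vocab_size)

    def forward(self, idx, targets=None):
        B, T = idx.shape
        # Create helper tensors on the same device as the input.
        pos = torch.arange(T, device=idx.device)
        logits = self.head(self.tok_emb(idx) + self.pos_emb(pos))
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.view(B * T, -1), targets.view(B * T))
        return logits, loss

def get_batch(split):
    # split is ignored in this sketch; the batch is moved to the device here,
    # so every caller receives tensors that match the model.
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x.to(device), y.to(device)

model = TinyLM().to(device)  # move the model first
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # then build the optimizer

for step in range(5):
    xb, yb = get_batch("train")
    logits, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
print(loss.item())
```

Moving the batch inside get_batch rather than in the training loop is a design choice: any other caller of get_batch then receives tensors on the right device as well.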
Maintain device consistency within estimate_loss by ensuring all operations and loaded data reside on the same device as the model.
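Since the question does not include estimate_loss, the following is only a sketch of what a device-consistent version could look like. The Bigram model, the random data, and the name eval_batches are assumptions made for illustration, not the asker's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
vocab_size, block_size, batch_size, eval_batches = 16, 8, 4, 10  # illustrative
data = torch.randint(0, vocab_size, (256,))  # stand-in data, kept on the CPU

class Bigram(nn.Module):
    """Hypothetical stand-in model that returns (logits, loss)."""
    def __init__(self):
        super().__init__()
        self.table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets):
        logits = self.table(idx)
        loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))
        return logits, loss

def get_batch(split):
    # split is ignored in this sketch; batches are moved to the model's device.
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x.to(device), y.to(device)

model = Bigram().to(device)

@torch.no_grad()  # no gradients are needed for evaluation
def estimate_loss():
    model.eval()
    out = {}
    for split in ("train", "val"):
        # Allocate the accumulator on the same device as the losses.
        losses = torch.zeros(eval_batches, device=device)
        for k in range(eval_batches):
            xb, yb = get_batch(split)
            _, loss = model(xb, yb)
            losses[k] = loss
        out[split] = losses.mean().item()  # .item() returns a plain Python float
    model.train()
    return out

print(estimate_loss())
```

The points to check in your own version are the same: every batch it draws is moved to device, and any tensor it creates itself is created there too.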