
python - RuntimeError: Pytorch Optimizer - Stack Overflow

# create a Pytorch optimizer 
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)

for iter in range(max_iters):
    if iter % eval_iters == 0:
        losses = estimate_loss()
        print(f"step: {iter}, loss {losses}")
        
    # sample a batch of data
    xb, yb = get_batch("train")
    
    #evaluate the loss
    logits, loss = model.forward(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    
print(loss.item())

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

I tried adding model = model.to(device) and calling .cuda() on the inputs, but neither worked. I'm struggling to get this fixed.

Asked Mar 17 at 4:36 by user29981993; edited Mar 22 at 9:37 by Innat.

1 Answer

The call failed because one operation received both GPU and CPU tensors. Some of your tensors live on the graphics card (cuda:0) while others are still in main memory (cpu), and PyTorch does not move tensors between devices implicitly, so any operation that mixes the two raises this error. The index_select in the message points to an indexing operation, most likely an embedding lookup, where the index tensor and the table being indexed are on different devices.
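
To find out which tensor is on the wrong device, it can help to print the device of every parameter, buffer, and input before the failing call. The following is only a minimal sketch: report_devices is a hypothetical helper, and toy_model is a stand-in for the model in the question, whose code is not shown.

```python
import torch
import torch.nn as nn


def report_devices(model: nn.Module, **tensors: torch.Tensor) -> dict:
    """Map each parameter, buffer, and named tensor to the device it lives on."""
    devices = {}
    for name, param in model.named_parameters():
        devices[f"param:{name}"] = str(param.device)
    for name, buf in model.named_buffers():
        devices[f"buffer:{name}"] = str(buf.device)
    for name, tensor in tensors.items():
        devices[f"tensor:{name}"] = str(tensor.device)
    return devices


# Hypothetical stand-in for the asker's model: a single embedding table.
toy_model = nn.Embedding(10, 4)
xb = torch.randint(0, 10, (2, 3))  # batch of token indices, created on the CPU

report = report_devices(toy_model, xb=xb)
for name, dev in report.items():
    print(f"{name}: {dev}")
```

If the report lists more than one distinct device, the entries that disagree with the model's parameters are the ones to move.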

PyTorch numbers GPUs starting from 0, so your first (or only) GPU is cuda:0. A plain cuda device refers to the current CUDA device, which is cuda:0 unless you change it.
To use the GPU, set your device to cuda. If you have multiple GPUs, you can select one by changing the index (e.g., cuda:1, cuda:2).

import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")
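
If you want to pick a specific GPU by index, a guarded version of the same idea could look like the sketch below. pick_device is a hypothetical helper name, not part of PyTorch; it falls back to the CPU when the requested GPU does not exist.

```python
import torch


def pick_device(gpu_index: int = 0) -> torch.device:
    """Return cuda:<gpu_index> if that GPU exists, otherwise fall back to the CPU."""
    if torch.cuda.is_available() and gpu_index < torch.cuda.device_count():
        return torch.device(f"cuda:{gpu_index}")
    return torch.device("cpu")


device = pick_device(0)

# Tensors can be created directly on the chosen device.
x = torch.zeros(2, 2, device=device)
print(f"Using device: {device}, tensor lives on: {x.device}")
```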

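First, transfer your model to the GPU using model = model.to(device). Make sure to do this before setting up your optimizer.
Next, ensure your input data (xb and yb) also resides on the GPU by using xb = xb.to(device) and yb = yb.to(device). Unlike a module, a tensor is not moved in place: .to() and .cuda() return a new tensor, so the result must be assigned back.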
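
Put together, the two steps above might look like the following sketch. TinyLM and this get_batch are hypothetical stand-ins for the asker's model and data loader, which are not shown in the question; only the ordering (move the model, build the optimizer, move each batch) is the point.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"


class TinyLM(nn.Module):
    """Hypothetical stand-in that returns (logits, loss) like the model in the question."""

    def __init__(self, vocab_size=16, block_size=8, n_embd=8):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        B, T = idx.shape
        # Index tensors created inside forward must be placed on the input's device.
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        logits = self.head(x)
        loss = None
        if targets is not None:
            loss = nn.functional.cross_entropy(
                logits.view(B * T, -1), targets.view(B * T)
            )
        return logits, loss


def get_batch(split):
    # Placeholder data: random token ids, created on the CPU.
    xb = torch.randint(0, 16, (4, 8))
    yb = torch.randint(0, 16, (4, 8))
    return xb, yb


model = TinyLM().to(device)  # 1. move the model first
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # 2. then build the optimizer

for step in range(3):
    xb, yb = get_batch("train")
    xb, yb = xb.to(device), yb.to(device)  # 3. move every batch, assigning the result back

    logits, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()

print(loss.item())
```

Calling model(xb, yb) rather than model.forward(xb, yb) is the usual idiom, since it also runs any registered hooks.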

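Keep estimate_loss consistent as well: every batch it loads and every tensor it creates must be on the same device as the model. The same applies to tensors created inside the model's forward method (for example with torch.arange or torch.zeros), which default to the CPU unless you pass device=. If the error persists after moving the model and the inputs, that is a likely place to look.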
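
A device-consistent estimate_loss could be sketched as follows. The question does not show the original function, so the signature (taking the model as an argument), the eval_batches count, the placeholder get_batch, and the use of a small regression model with mse_loss are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
eval_batches = 5  # hypothetical number of batches to average over


def get_batch(split):
    # Placeholder data loader: builds a batch and moves it to the target device.
    x = torch.randn(4, 3)
    y = torch.randn(4, 1)
    return x.to(device), y.to(device)


@torch.no_grad()
def estimate_loss(model):
    """Average the loss over a few batches per split, keeping everything on `device`."""
    out = {}
    was_training = model.training
    model.eval()
    for split in ("train", "val"):
        # The accumulator is created on the same device as the per-batch losses.
        losses = torch.zeros(eval_batches, device=device)
        for k in range(eval_batches):
            xb, yb = get_batch(split)
            pred = model(xb)
            losses[k] = nn.functional.mse_loss(pred, yb)
        out[split] = losses.mean().item()  # .item() copies the scalar back to Python
    model.train(was_training)  # restore the previous train/eval mode
    return out


model = nn.Linear(3, 1).to(device)
losses = estimate_loss(model)
print(losses)
```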
