I have a function to measure the RAM allocated by Python, in megabytes:

def getram():
    print(psutil.Process(os.getpid()).memory_info().rss / 1024**2)
I also have:

device = "cuda"

My problem is that the following code allocates RAM, and it's driving me crazy. Is there an actual solution, or do I have to accept my fate and switch to C++ or something?
The code:
getram()

def load_dataset(dir, filenames):
    dataset = torch.zeros((len(filenames), 3, 256, 256), device=device)
    getram()
    for i, filename in enumerate(filenames):
        f = read_image(f"{dir}/{filename}")
        if f.shape[0] != 3:
            print(filename)
        dataset[i] = f.to(device)
    getram()
    return dataset

dataset = load_dataset(dataset_dir, dataset_filenames)
getram()
The code printed out the following:
533.28125
661.2890625
678.27734375
678.27734375
As you can see, as soon as I create the empty tensor with torch.zeros(), it takes up RAM for no reason. I tried gc.collect(), but it didn't help at all.
1 Answer
The RAM usage you are seeing is caused by loading the various CUDA libraries, not by the tensor itself. When you first use CUDA, PyTorch lazily loads its CUDA libraries into RAM. You can verify this with the code below (the RAM numbers are what I got on my system; you will probably get different values, but the overall pattern should be the same):
import os
import psutil
import torch
import time

def getram():
    print(psutil.Process(os.getpid()).memory_info().rss / 1024**2)

device = 'cuda:0'

# Get the baseline RAM usage
getram()
> 331.7734375

# Create the first CUDA tensor.
# This causes a large RAM increase due to loading the CUDA libraries.
tmp = torch.zeros(1, device=device)
time.sleep(0.1)
getram()
> 1251.25

# Create the dataset on the GPU.
dataset = torch.zeros((128, 3, 256, 256), device=device)

# Slight RAM increase, but mostly unchanged
getram()
> 1252.203125
Note that the time.sleep(0.1) is there because I found that running getram right after allocating tmp would sometimes return a value while the CUDA libs were still loading (i.e. running getram again immediately, without allocating anything else, would yield a different result). The sleep ensures the libs are fully loaded before measuring.
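
If you want to avoid the sleep altogether, an alternative is to initialize CUDA eagerly before taking the baseline, so the library load cost is already paid when you start measuring. A minimal sketch, assuming a CUDA-enabled PyTorch build (torch.cuda.init() and torch.cuda.memory_allocated() are standard PyTorch calls, but the exact numbers will vary by system):

import os
import psutil
import torch

def getram():
    print(psutil.Process(os.getpid()).memory_info().rss / 1024**2)

# Initialize PyTorch's CUDA state up front so the baseline RSS
# already includes the CUDA libraries.
torch.cuda.init()
getram()

# Allocating the dataset should now barely move RSS...
dataset = torch.zeros((128, 3, 256, 256), device='cuda:0')
getram()

# ...because the tensor lives in GPU memory, which you can track with:
print(torch.cuda.memory_allocated() / 1024**2)  # ~96 MB for this float32 tensor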