I have a test script with an infinite loop creating Python objects. Initially I was trying to understand what happens to A() instances that have no references (in terms of memory allocation) - it is clear now that reference counting frees them immediately. But how does the system deal with it?
Eventually I spotted another issue, namely that calling gc.get_objects()
causes a memory leak in this situation:
from itertools import count
import gc

class A:
    def __init__(self):
        self.name = "a"

print("Initial GC count:", gc.get_count())
gc_objects_init = [id(obj) for obj in gc.get_objects()]

for number in count():
    A()  # unreferenced instance: reference counting frees it immediately
    if number % 5000000 == 0:
        gc_objects = gc.get_objects()  # each call adds one (+1) GC-tracked object - causing the memory leak
        print(len(gc_objects))
        # print(len(gc.get_objects()))
        print([id(obj) for obj in gc.get_objects() if id(obj) not in gc_objects_init])  # new objects
        print("GC counter:", gc.get_count())
        # del gc_objects  # uncommenting this line fixes the issue
Output:
Initial GC count: (172, 10, 0)
4965
[124362267579712, 124362267585728, 124362267337664]
GC counter: (173, 10, 0)
4966
[124362267579712, 124362267585728, 124362267337664, 124362267579520]
GC counter: (173, 10, 0)
4967
[124362267579712, 124362267585728, 124362267337664, 124362267579520, 124362267579456
...
4977
[128496873002624, 128496873008576, 128496872760192, 128496873002432, 128496873002368, 128496872759872, 128496873000448, 128496873000192, 128496873000256, 128496872999936, 128496873000000, 128496873002752, 128496873008640, 128496873008768, 128496873008704]
GC counter: (177, 10, 0)
4978
[128496873002624, 128496873008576, 128496872760192, 128496873002432, 128496873002368, 128496872759872, 128496873000448, 128496873000192, 128496873000256, 128496872999936, 128496873000000, 128496873002752, 128496873008640, 128496873008768, 128496873008704, 128496873008832]
GC counter: (178, 10, 0)
4979
While running this script I watched the process's memory usage and the number of objects tracked by the garbage collector:
- the virtual memory used by the process keeps increasing -> memory leak
- the number of GC-tracked objects increases, but the new objects are not immediately added to gen0 (that happens later)
The memory leak is caused by gc_objects = gc.get_objects(), because when I add del gc_objects the issue disappears. When I look at the newly tracked GC objects, each is a huge list - the one returned by gc.get_objects() (from the calling line gc_objects = gc.get_objects()). But neither the GC nor any other mechanism deletes it on each iteration. I am simply curious why the GC does not handle this correctly. Does anybody know?
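One way to inspect what those newly tracked objects actually are (a small diagnostic sketch; the variable names here are mine, not from the script above) is to print their types and sizes instead of just their ids:

```python
import gc

# Baseline of ids, then one snapshot, then list what is newly tracked.
baseline_ids = {id(obj) for obj in gc.get_objects()}
snapshot = gc.get_objects()  # the suspect extra object: one big list

for obj in gc.get_objects():
    if id(obj) not in baseline_ids:
        # among the newcomers: `snapshot` itself (a huge list) and the
        # baseline_ids set, which could not contain its own id
        size = len(obj) if isinstance(obj, (list, set, dict)) else ""
        print(type(obj).__name__, size)
```

As long as `snapshot` stays referenced, every later gc.get_objects() call will include it, which is exactly the growth pattern the script shows.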
- 1 "virtual memory used by the process increases --> Memory Leak" I don't think that is the definition of a memory leak. – topsail Commented Feb 4 at 13:45
- No, I was going to ask the same - what do you consider to be a memory leak? – roganjosh Commented Feb 4 at 13:45
- 1 I have no idea why this was graduated from staging ground stackoverflow/staging-ground/79402933 – Andras Deak -- Слава Україні Commented Feb 4 at 13:48
- 1 You might have good reason to be playing with the GC but I can say, in 10 years of python, I don't have a single program that makes any use of that module. I have tried to use it in the past, and it was just totally misguided. The GC is pretty complex but robust and memory leaks take some serious effort to create – roganjosh Commented Feb 4 at 13:48
- Yes, it took effort to create this situation. I played with some edge cases for learning, thinking it might help me understand Python memory management better. And by "memory leak" I mean a slow increase of allocated memory that is not needed. – Daniel Commented Feb 4 at 17:11
1 Answer
It is just that the list returned by gc.get_objects() contains a reference to the previous list returned by gc.get_objects(), since that previous list is still around in memory, held by a hard reference.
(The reference count of the previous list is only decreased when the assignment part of the line gc_objects = gc.get_objects() takes place - but by then, the new call has already counted the object from the previous run.)
So there is indeed a memory leak in this code - but it is not caused by the Python interpreter; it is explicitly caused by this implementation itself.
As you can see, clearing the previous cycle's gc_objects instance before the call in the new cycle prevents that.
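A minimal sketch of that chain (the helper name is mine, just for illustration): each snapshot is taken while the previous one is still referenced, so the new list contains the old one and keeps it alive; releasing the old snapshot first keeps the tracked-object count flat.

```python
import gc

def snapshot_sizes(n, drop_previous):
    """Take n gc.get_objects() snapshots, optionally releasing each
    snapshot before taking the next one (like the `del gc_objects` fix)."""
    sizes = []
    objs = None
    for _ in range(n):
        if drop_previous:
            objs = None  # the old snapshot dies here, before the next call
        # If the old snapshot is still referenced, the new list includes it;
        # the name `objs` is only rebound *after* the call has returned.
        objs = gc.get_objects()
        sizes.append(len(objs))
    return sizes

leaky = snapshot_sizes(5, drop_previous=False)
clean = snapshot_sizes(5, drop_previous=True)
print(leaky)  # grows: each snapshot keeps its predecessor alive
print(clean)  # stays flat: predecessors are freed before each call
```

This is the same fix as the commented-out del gc_objects line in the question: drop the reference before the next gc.get_objects() call, and nothing accumulates.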