
python - What exactly happened for objects without references? - Stack Overflow


I have a test script with an infinite loop creating Python objects. Initially I was trying to understand what happens to A() objects that have no references (in terms of memory allocation) - it is now clear that reference counting reclaims them, so Python simply discards them. But how does the system deal with that?

Eventually I spotted another issue: calling gc.get_objects() causes a memory leak in this situation:

from itertools import count
import gc

class A:
    def __init__(self):
        self.name = "a"

print("Initial GC count:", gc.get_count())
gc_objects_init = [id(obj) for obj in gc.get_objects()]

for number in count():

    A() # no reference is kept - reference counting reclaims it immediately; but how does the system deal with that?

    if number % 5000000 == 0:
        gc_objects = gc.get_objects() # each call adds one more GC-tracked object - causing the memory leak
        print(len(gc_objects))
        # print(len(gc.get_objects()))

        print([id(obj) for obj in gc.get_objects() if id(obj) not in gc_objects_init]) # new objects
        print("GC counter:", gc.get_count())
        # del gc_objects # uncommenting this solves the issue

Output:

Initial GC count: (172, 10, 0)

4965
[124362267579712, 124362267585728, 124362267337664]
GC counter: (173, 10, 0)
4966
[124362267579712, 124362267585728, 124362267337664, 124362267579520]
GC counter: (173, 10, 0)
4967
[124362267579712, 124362267585728, 124362267337664, 124362267579520, 124362267579456

...

4977
[128496873002624, 128496873008576, 128496872760192, 128496873002432, 128496873002368, 128496872759872, 128496873000448, 128496873000192, 128496873000256, 128496872999936, 128496873000000, 128496873002752, 128496873008640, 128496873008768, 128496873008704]
GC counter: (177, 10, 0)
4978
[128496873002624, 128496873008576, 128496872760192, 128496873002432, 128496873002368, 128496872759872, 128496873000448, 128496873000192, 128496873000256, 128496872999936, 128496873000000, 128496873002752, 128496873008640, 128496873008768, 128496873008704, 128496873008832]
GC counter: (178, 10, 0)
4979

When I run this script and watch the process memory usage and the objects tracked by the garbage collector, I see that:

  • the virtual memory used by the process keeps increasing -> memory leak
  • the number of GC-tracked objects increases, but the new objects are not immediately added to gen0 (that happens later)

The memory leak is caused by gc_objects = gc.get_objects(): when I add del gc_objects at the end of the block, the issue disappears. Each newly tracked GC object is a huge list - the one returned by gc.get_objects() in the line gc_objects = gc.get_objects() - yet neither the GC nor any other mechanism deletes it on each iteration. I am simply curious why the GC does not handle this correctly. Does anybody know?
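
As a side note, here is a minimal sketch (using a throwaway class that is not part of the script above) illustrating the first point as I understand it: an instance that nothing references is reclaimed by reference counting as soon as the statement finishes, even with the cyclic collector disabled.

import gc

class Throwaway:
    # hypothetical class, only used to observe when an instance is reclaimed
    def __del__(self):
        print("instance reclaimed")

gc.disable()             # the cyclic collector is not involved here
Throwaway()              # no name is bound, so the refcount drops to 0 right away
print("next statement")  # "instance reclaimed" is printed before this line
gc.enable()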


asked Feb 4 at 13:19 by Daniel
  • 1 "virtual memory used by the process increases --> Memory Leak" I don't think that is the definition of a memory leak. – topsail Commented Feb 4 at 13:45
  • No, I was going to ask the same - what do you consider to be a memory leak? – roganjosh Commented Feb 4 at 13:45
  • 1 I have no idea why this was graduated from staging ground stackoverflow/staging-ground/79402933 – Andras Deak -- Слава Україні Commented Feb 4 at 13:48
  • 1 You might have good reason to be playing with the GC but I can say, in 10 years of python, I don't have a single program that makes any use of that module. I have tried to use it in the past, and it was just totally misguided. The GC is pretty complex but robust and memory leaks take some serious effort to create – roganjosh Commented Feb 4 at 13:48
  • Yes, it took effort to create this situation. I played with some edge cases for learning, as I thought it might help me better understand Python memory management. And by a memory leak I mean a slow increase in memory allocation that is not needed. – Daniel Commented Feb 4 at 17:11

1 Answer


It is simply that the list returned by gc.get_objects() contains a reference to the list returned by the previous gc.get_objects() call, since that previous list is still alive in memory, held by a hard reference.

(The reference count of the previous list is only decreased when the assignment part of the line gc_objects = gc.get_objects() takes place - but by then, the new call has already included the previous list in its result.)

So there is indeed a memory leak in this code - but it is not caused by the Python interpreter; it is caused by this implementation itself.

As you can see, clearing the previous iteration's gc_objects list (del gc_objects) before the next call prevents that.
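
A minimal sketch of the mechanism (variable names are mine, chosen for illustration): each list returned by gc.get_objects() is itself a GC-tracked object, so a snapshot taken while the previous one is still referenced will contain that previous list - and, through it, every older snapshot - which is exactly the growth observed in the question.

import gc

snapshot = gc.get_objects()      # first snapshot, kept alive by the name `snapshot`
new_snapshot = gc.get_objects()  # taken while the first list is still referenced

# The first list is itself a GC-tracked object, so the second snapshot contains it.
print(any(item is snapshot for item in new_snapshot))  # True

# Dropping the previous snapshot before taking the next one breaks the chain,
# which is why the commented-out `del gc_objects` in the question removes the growth.
del snapshot, new_snapshot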
