
Why hashtable (array implementation) compaction does not kick in in PHP? - Stack Overflow


I am building a big PHP array and then removing most of the items from its beginning. As we know, this won't free the memory; it only marks the removed items as UNDEF, as explained by Nikita here. However, when I put more items into the array, it should reach the compaction point, and we should see the memory usage go down, but that is not what is observed.

Please help me understand why compaction is not happening.

<?php

$items = [];

// Hashtable compaction should kick in at every 2^n size.
$size = pow(2, 18);

for ($i = 0; $i < $size; $i++) {
    $items[$i] = "string $i";
}
// At this point we have reached the maximum table size before compaction.
// 2^18 is used to make the memory consumption visible enough.
printf("mem: %d Mb \n", memory_get_usage(true) / 1024 / 1024);

// Remove all items except the last one.
for ($i = 0; $i < $size - 1; $i++) {
    unset($items[$i]);
}

// Here we expect the same memory footprint, as nothing is freed,
// only marked as UNDEF.
printf("mem: %d Mb \n", memory_get_usage(true) / 1024 / 1024);

// However, here we expect to hit the compaction point and see $size-1
// elements removed while 1 is added, so memory should go down.
$items[] = $i;
printf("mem: %d Mb \n", memory_get_usage(true) / 1024 / 1024);

Live example: https://onlinephp.io/c/98970

asked Nov 28, 2024 at 10:20 by Dmitry Lezhnev, edited Nov 28, 2024 at 10:50 by Olivier
  • You should use memory_get_usage(false) to view the actual memory usage. You created a packed hashtable; appending an element can lead to memory growth instead. And the most crucial thing is that "arData never shrinks". – shingo Commented Nov 28, 2024 at 11:18 (see the sketch after this comment thread)
  • Good point about the memory usage! So you mean packed hashtables won't benefit from compaction no matter what? – Dmitry Lezhnev Commented Nov 28, 2024 at 11:20
  • That's not exactly what we observe at the last step. If that were the case, memory wouldn't spike. I'd say it is clear that it expanded the original arData to 2^19 and did not just reorganize and append to the existing arData. No? – Dmitry Lezhnev Commented Nov 28, 2024 at 11:36
  • Because arData never shrinks, packed hashtables will never be compacted. – shingo Commented Nov 28, 2024 at 11:39
  • That seems to be true. Can I mark this comment as the answer? – Dmitry Lezhnev Commented Nov 28, 2024 at 11:44
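A quick way to see what the first comment means: memory_get_usage(true) reports memory reserved from the system, which the allocator grabs in large chunks and does not hand back, while memory_get_usage(false) reports what PHP's allocator is actually using. A minimal sketch (exact numbers vary by PHP version):

<?php

$items = [];
$size = pow(2, 18);
for ($i = 0; $i < $size; $i++) {
    $items[$i] = "string $i";
}

// memory_get_usage(false): memory actually in use by PHP's allocator.
// memory_get_usage(true):  memory reserved from the system; it grows in
// large chunks and hides smaller internal changes.
printf("used: %.1f Mb, reserved: %.1f Mb\n",
    memory_get_usage(false) / 1024 / 1024,
    memory_get_usage(true) / 1024 / 1024);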

1 Answer


Because, as Nikic said in the article:

One problem with the current implementation is that arData never shrinks (unless explicitly told to).

This indicates that the so-called compaction only reorganizes the data in place, without reducing memory usage.
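This can be checked from userland with a regular (string-keyed) hashtable: delete most entries, then keep adding until the table is full and a resize is triggered. Per the article, PHP then compacts in place instead of doubling arData, so memory stays roughly flat rather than dropping. A minimal sketch (the resize thresholds and exact numbers are version-dependent):

<?php

// Build a regular hashtable, then delete almost everything,
// leaving most slots marked as deleted (tombstones).
$items = [];
$size = pow(2, 18);
for ($i = 0; $i < $size; $i++) {
    $items["k$i"] = $i;
}
for ($i = 0; $i < $size - 1; $i++) {
    unset($items["k$i"]);
}
printf("after unset: %.1f Mb\n", memory_get_usage(false) / 1024 / 1024);

// New insertions eventually fill the table; since almost all slots are
// tombstones, the resize compacts in place instead of doubling arData,
// so memory does not go down (and barely goes up).
for ($i = 0; $i < 1000; $i++) {
    $items["new$i"] = $i;
}
printf("after adds:  %.1f Mb\n", memory_get_usage(false) / 1024 / 1024);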

For packed hashtables, the author also said:

In packed hashtables the arHash array is NULL and lookups will directly index into arData. If you’re looking for the key 5 then the element will be located at arData[5] or it doesn’t exist at all.

I think that PHP uses arData directly as a plain array, so packed hashtables will not be compacted.

In the current source code, you can see that a packed hashtable is now declared as zval *arPacked;, but it still never shrinks.

However, because a Bucket is larger than a zval, there is an exception: when a regular hashtable is converted to a packed hashtable, memory usage is reduced (by about half). Some functions such as sort can trigger this conversion (surprisingly, shuffle can too).
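A minimal sketch of that exception (exact numbers depend on the PHP version): inserting keys in descending order forces a regular hashtable of Buckets, and sort() then renumbers the keys 0..n-1, which lets PHP rebuild the table in packed form:

<?php

// Inserting a large index into an empty array forces a regular
// (non-packed) hashtable, where each element occupies a 32-byte Bucket.
$items = [];
for ($i = pow(2, 18) - 1; $i >= 0; $i--) {
    $items[$i] = "string $i";
}
printf("as hashtable: %.1f Mb\n", memory_get_usage(false) / 1024 / 1024);

// sort() renumbers the keys 0..n-1, so the table can be stored packed
// (16-byte zvals, no arHash), roughly halving its footprint.
sort($items);
printf("as packed:    %.1f Mb\n", memory_get_usage(false) / 1024 / 1024);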
