Using GIST index with PostGIS makes query slower than without it

I have 2 sample tables with 1m rows:

create table deliveries
(
    id                 serial
        primary key,
    order_id           integer               not null
        references orders,
    point              geometry(Point, 4326) not null,
    delivery_timestamp timestamp             not null,
    year               integer               not null
);

One of them has index:

create index idx_deliveries_geom
    on deliveries using gist (point);

Sample query:

SELECT ST_AsText(point), order_id, delivery_timestamp
FROM deliveries
WHERE ST_Intersects(
              point,
              ST_MakeEnvelope(32.085065712094298, 54.87186750531864, 32.238874305844298, 54.895072980740217, 4326)
      )
ORDER BY delivery_timestamp
LIMIT 100;

Query with index works about 2.5 times slower than without index. Plan with index:

Limit  (cost=20647022.44..20647086.19 rows=100 width=44)
  ->  Result  (cost=20647022.44..21672876.60 rows=1609183 width=44)
        ->  Sort  (cost=20647022.44..20651045.39 rows=1609183 width=44)
              Sort Key: delivery_timestamp
              ->  Index Scan using idx_deliveries_geom on deliveries  (cost=0.42..20585520.62 rows=1609183 width=44)
                    Index Cond: (point && '0103000020E610000001000000050000005DBAE96EE30A4040F3E4BA5A996F4B405DBAE96EE30A4040FFE75DC091724B405DBAE96E931E4040FFE75DC091724B405DBAE96E931E4040F3E4BA5A996F4B405DBAE96EE30A4040F3E4BA5A996F4B40'::geometry)
"                    Filter: st_intersects(point, '0103000020E610000001000000050000005DBAE96EE30A4040F3E4BA5A996F4B405DBAE96EE30A4040FFE75DC091724B405DBAE96E931E4040FFE75DC091724B405DBAE96E931E4040F3E4BA5A996F4B405DBAE96EE30A4040F3E4BA5A996F4B40'::geometry)"

Plan without:

Limit  (cost=51975155.40..51975229.56 rows=100 width=44)
  ->  Gather Merge  (cost=51975155.40..53282318.69 rows=1762448 width=44)
        Workers Planned: 2
        ->  Result  (cost=51974155.37..52535935.67 rows=881224 width=44)
              ->  Sort  (cost=51974155.37..51976358.43 rows=881224 width=44)
                    Sort Key: delivery_timestamp
                    ->  Parallel Seq Scan on deliveries  (cost=0.00..51940475.62 rows=881224 width=44)
"                          Filter: st_intersects(point, '0103000020E610000001000000050000005DBAE96EE30A4040F3E4BA5A996F4B405DBAE96EE30A4040FFE75DC091724B405DBAE96E931E4040FFE75DC091724B405DBAE96E931E4040F3E4BA5A996F4B405DBAE96EE30A4040F3E4BA5A996F4B40'::geometry)"
JIT:
  Functions: 7
"  Options: Inlining true, Optimization true, Expressions true, Deforming true"

I dont understand 2 things:

why postgres uses same filter twice? Index Cond and st_intersects? Looks like it just doubles the work, checking itself one more time.
parallel scan is faster than index work, but with deliveries table growth index should become faster?

I have another bigger query that only seconds this question, parallel scan works faster than index at least on my sample data. Data is generated randomly around points 32.0401, 54.7818 (the logic for generation is slightly harder, but all points are in about 10x10 km square).

科技改变生活-雨落星辰 - 所有的伟大,都源于一个勇敢的开始

Using GIST index with PostGIS makes query slower than without it - Stack Overflow

与本文相关的文章

评论列表(0)