Postgres Jsonb Partial Index For Not Null Querying

25 minute read

It takes extra steps to make the partial index work

Before

\d webhooks
                                        Table "public.webhooks"
   Column   |            Type             | Collation | Nullable |               Default                
------------+-----------------------------+-----------+----------+--------------------------------------
 id         | bigint                      |           | not null | nextval('webhooks_id_seq'::regclass)
 payload    | jsonb                       |           |          | 
 created_at | timestamp without time zone |           | not null | 
 updated_at | timestamp without time zone |           | not null | 
Indexes:
    "webhooks_pkey" PRIMARY KEY, btree (id)
    "index_webhooks_on_payload" gin (payload) WHERE (payload ->> 'kind'::text) IS NOT NULL AND (payload ->> 'recording'::text) IS NOT NULL
EXPLAIN ANALYZE SELECT "webhooks".* FROM "webhooks"
  WHERE (payload ->> 'kind' IS NOT NULL)
  AND (payload ->> 'recording' IS NOT NULL);
                                                  QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------
 Seq Scan on webhooks  (cost=0.00..428.85 rows=1706 width=1676) (actual time=0.119..82.476 rows=1722 loops=1)
   Filter: (((payload ->> 'kind'::text) IS NOT NULL) AND ((payload ->> 'recording'::text) IS NOT NULL))
   Rows Removed by Filter: 1
 Planning Time: 0.433 ms
 Execution Time: 82.681 ms
(5 rows)

It’s still Seq Scan. The index_webhooks_on_payload index is useless😒

After

Turns out, the ? field_name and field_name IS NOT NULL are both required, if we want the field_name IS NOT NULL condition to use indexes.

CREATE INDEX idx ON webhooks ((payload ->> 'kind'))
  WHERE (payload ? 'kind' AND payload -> 'kind' IS NOT NULL);
CREATE INDEX idx2 ON webhooks ((payload ->> 'recording'))
  WHERE (payload ? 'recording' AND payload -> 'recording' IS NOT NULL);
EXPLAIN ANALYZE SELECT "webhooks".* FROM "webhooks"
  WHERE (payload ? 'kind' AND payload -> 'kind' IS NOT NULL)
  AND (payload ? 'recording' AND payload -> 'recording' IS NOT NULL);
                                                                                QUERY PLAN                                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on webhooks  (cost=25.35..29.37 rows=1 width=1676) (actual time=1.432..2.572 rows=1722 loops=1)
   Recheck Cond: ((payload ? 'kind'::text) AND ((payload -> 'kind'::text) IS NOT NULL) AND (payload ? 'recording'::text) AND ((payload -> 'recording'::text) IS NOT NULL))
   Heap Blocks: exact=403
   ->  BitmapAnd  (cost=25.35..25.35 rows=1 width=0) (actual time=1.224..1.226 rows=0 loops=1)
         ->  Bitmap Index Scan on idx  (cost=0.00..4.36 rows=17 width=0) (actual time=0.343..0.344 rows=1722 loops=1)
         ->  Bitmap Index Scan on idx2  (cost=0.00..20.74 rows=17 width=0) (actual time=0.820..0.821 rows=1722 loops=1)
 Planning Time: 0.232 ms
 Execution Time: 2.780 ms
(8 rows)

It uses Bitmap Heap Scan instead. Yay🎉

Let’s remove the testing indexes idx and idx2, and add a formal migration later.

DROP INDEX idx;
DROP INDEX idx2;

Migration and Deploy

class AddIndexToWebhooks < ActiveRecord::Migration[5.1]
  def change
    add_index :webhooks, "((payload -> 'kind'))", {
      name: 'index_webhooks_on_payload_kind',
      where: "payload ? 'kind' AND payload -> 'kind' IS NOT NULL",
    }

    add_index :webhooks, "((payload -> 'recording'))", {
      name: 'index_webhooks_on_payload_recording',
      where: "payload ? 'recording' AND payload -> 'recording' IS NOT NULL",
    }
  end
end
heroku run rake --trace db:migrate
heroku restart

And check if work on production.

heroku rails c
Webhook
  .where("payload ? 'kind' AND payload -> 'kind' IS NOT NULL")
  .where("payload ? 'recording' AND payload -> 'recording' IS NOT NULL")
  .explain
                                                                                QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on webhooks  (cost=14.27..16.28 rows=1 width=1679)
   Recheck Cond: ((payload ? 'kind'::text) AND ((payload -> 'kind'::text) IS NOT NULL) AND (payload ? 'recording'::text) AND ((payload -> 'recording'::text) IS NOT NULL))
   ->  BitmapAnd  (cost=14.27..14.27 rows=1 width=0)
         ->  Bitmap Index Scan on index_webhooks_on_payload_kind  (cost=0.00..2.07 rows=17 width=0)
         ->  Bitmap Index Scan on index_webhooks_on_payload_recording  (cost=0.00..12.15 rows=17 width=0)
(5 rows)

Production works, too.

Reference