The Cost of PostgreSQL Foreign Keys

Acknowledgments: shout out to Steven Jones at Syncro for helping me better understand how Foreign Keys work in PostgreSQL.

Foreign keys are great for maintaining data integrity. However, they are not free.

As mentioned in this blog post by Shaun Thomas:

In PostgreSQL, every foreign key is maintained with an invisible system-level trigger added to the source table in the reference. At least one trigger must go here, as operations that modify the source data must be checked that they do not violate the constraint.

https://bonesmoses.org/2014/05/14/foreign-keys-are-not-free/

What this means is that when you modify your parent table, even if you don’t touch any of the referred keys, these triggers are still fired.

In the original post from 2014, the overhead made the updates up to 95% slower with 20 foreign keys. Let’s see if things changed since then.

These are slightly updated scripts we’ll use for testing

CREATE OR REPLACE FUNCTION fnc_create_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  CREATE TABLE test_fk
  (
    id   BIGINT PRIMARY KEY,
    junk VARCHAR
  );

  INSERT INTO test_fk
  SELECT generate_series(1, 100000), repeat(' ', 20);

  CLUSTER test_fk_pkey ON test_fk;

  FOR i IN 1..key_count LOOP
    EXECUTE 'CREATE TABLE test_fk_ref_' || i || 
            ' (test_fk_id BIGINT REFERENCES test_fk (id))';
						
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;


CREATE OR REPLACE FUNCTION fnc_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  FOR i IN 1..100000 LOOP
    UPDATE test_fk SET junk = '    blah                '
     WHERE id = i;
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;


CREATE OR REPLACE FUNCTION clean_up_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  DROP TABLE test_fk CASCADE;

  FOR i IN 1..key_count LOOP
    EXECUTE 'DROP TABLE test_fk_ref_' || i;
  END LOOP;
END;
$$ LANGUAGE plpgsql VOLATILE;

To validate that the overhead is caused strictly by the presence of the foreign keys, and not from the cost of looking up the child records, after the first benchmark, we’ll modify the first function and add indexes on each foreign key:

CREATE OR REPLACE FUNCTION fnc_create_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  CREATE TABLE test_fk
  (
    id   BIGINT PRIMARY KEY,
    junk VARCHAR
  );

  INSERT INTO test_fk
  SELECT generate_series(1, 100000), repeat(' ', 20);

  CLUSTER test_fk_pkey ON test_fk;

  FOR i IN 1..key_count LOOP
    EXECUTE 'CREATE TABLE test_fk_ref_' || i || 
            ' (test_fk_id BIGINT REFERENCES test_fk (id))';
						
		EXECUTE 'CREATE index test_fk_ref_index_' || i ||
            ' on test_fk_ref_' || i || '(test_fk_id)';
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;

We’ll run on on Mac i9, 2.3 GHz 8-Core, 64 GB Ram

PostgreSQL version 12

-- without an index

select fnc_create_check_fk_overhead(0);
SELECT fnc_check_fk_overhead(0); -- 2.6-2.829
select clean_up_overhead(0)


select fnc_create_check_fk_overhead(20);
SELECT fnc_check_fk_overhead(20); -- 3.186-3.5. ~20% drop
select clean_up_overhead(20)


-- after updating our initial function to add an index for each foreign key:
select fnc_create_check_fk_overhead(0);
SELECT fnc_check_fk_overhead(0); -- 2.6-2.8
select clean_up_overhead(0)

select fnc_create_check_fk_overhead(20);
SELECT fnc_check_fk_overhead(20); -- 3.1 same ~20% drop
select clean_up_overhead(20)


As we see from the benchmark, the drop in update performance on a parent table is about 20% after adding 20 tables with a foreign key to the parent. It’s not quite as bad as 95% in the original post, but the overhead is still clearly there.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.