The Cost of PostgreSQL Foreign Keys

Acknowledgments: shout out to Steven Jones at Syncro for helping me better understand how Foreign Keys work in PostgreSQL.

Foreign keys are great for maintaining data integrity. However, they are not free.

As mentioned in this blog post by Shaun Thomas:

In PostgreSQL, every foreign key is maintained with an invisible system-level trigger added to the source table in the reference. At least one trigger must go here, as operations that modify the source data must be checked that they do not violate the constraint.
https://bonesmoses.org/2014/05/14/foreign-keys-are-not-free/

This means that whenever you modify a row in the parent table, even if you don't touch any of the referenced keys, these triggers are still fired.
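
To see these system-level triggers for yourself, you can query the system catalogs. The query below is a minimal sketch: it assumes the parent table is called `test_fk`, as in the test scripts in this post, so substitute your own table name. The column aliases are only illustrative.

```sql
-- List the internal triggers PostgreSQL created for every foreign key
-- that references the parent table (assumed here to be test_fk).
SELECT t.tgrelid::regclass AS on_table,        -- table the trigger is attached to
       t.tgname            AS trigger_name,    -- internally generated trigger name
       c.conname           AS constraint_name  -- foreign key constraint it enforces
FROM   pg_trigger t
JOIN   pg_constraint c ON c.oid = t.tgconstraint
WHERE  t.tgisinternal                          -- system-generated triggers only
  AND  c.contype = 'f'                         -- foreign key constraints
  AND  c.confrelid = 'test_fk'::regclass       -- keys that reference the parent
ORDER  BY on_table, trigger_name;
```

The result should include rows for the parent table as well as for each child table, since both sides of a foreign key are guarded by triggers.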

In the original post from 2014, the overhead made updates up to 95% slower with 20 foreign keys. Let's see whether things have changed since then.

Here are slightly updated versions of the scripts, which we'll use for testing:

CREATE OR REPLACE FUNCTION fnc_create_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  CREATE TABLE test_fk
  (
    id   BIGINT PRIMARY KEY,
    junk VARCHAR
  );

  INSERT INTO test_fk
  SELECT generate_series(1, 100000), repeat(' ', 20);

  CLUSTER test_fk_pkey ON test_fk;

  FOR i IN 1..key_count LOOP
    EXECUTE 'CREATE TABLE test_fk_ref_' || i ||
            ' (test_fk_id BIGINT REFERENCES test_fk (id))';
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;


CREATE OR REPLACE FUNCTION fnc_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  FOR i IN 1..100000 LOOP
    UPDATE test_fk SET junk = '    blah                '
     WHERE id = i;
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;


CREATE OR REPLACE FUNCTION clean_up_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  DROP TABLE test_fk CASCADE;

  FOR i IN 1..key_count LOOP
    EXECUTE 'DROP TABLE test_fk_ref_' || i;
  END LOOP;
END;
$$ LANGUAGE plpgsql VOLATILE;

We also want to confirm that the overhead comes strictly from the presence of the foreign keys, and not from the cost of looking up the child records. So after the first benchmark, we'll modify the first function to add an index on each foreign key:

CREATE OR REPLACE FUNCTION fnc_create_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  CREATE TABLE test_fk
  (
    id   BIGINT PRIMARY KEY,
    junk VARCHAR
  );

  INSERT INTO test_fk
  SELECT generate_series(1, 100000), repeat(' ', 20);

  CLUSTER test_fk_pkey ON test_fk;

  FOR i IN 1..key_count LOOP
    EXECUTE 'CREATE TABLE test_fk_ref_' || i ||
            ' (test_fk_id BIGINT REFERENCES test_fk (id))';

    EXECUTE 'CREATE INDEX test_fk_ref_index_' || i ||
            ' ON test_fk_ref_' || i || ' (test_fk_id)';
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;

We'll run it on a Mac with a 2.3 GHz 8-core i9 and 64 GB of RAM, using PostgreSQL version 12.

-- without an index

SELECT fnc_create_check_fk_overhead(0);
SELECT fnc_check_fk_overhead(0); -- 2.6-2.829
SELECT clean_up_overhead(0);


SELECT fnc_create_check_fk_overhead(20);
SELECT fnc_check_fk_overhead(20); -- 3.186-3.5, ~20% slower
SELECT clean_up_overhead(20);


-- after updating our initial function to add an index for each foreign key

SELECT fnc_create_check_fk_overhead(0);
SELECT fnc_check_fk_overhead(0); -- 2.6-2.8
SELECT clean_up_overhead(0);


SELECT fnc_create_check_fk_overhead(20);
SELECT fnc_check_fk_overhead(20); -- 3.1, same ~20% slower
SELECT clean_up_overhead(20);

As the benchmark shows, updates on the parent table get about 20% slower once 20 tables reference it with a foreign key, and indexing the foreign key columns does not change that. It's not quite as bad as the 95% in the original post, but the overhead is still clearly there.
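
If you want to repeat the measurement without relying on your client's timer, one option is to time the call inside the database. This is only a sketch of one possible approach, not necessarily how the numbers above were taken. It assumes the functions from this post already exist and that `fnc_create_check_fk_overhead(20)` has been run; the variable name `started` is made up for the example.

```sql
-- Time a single benchmark run using the server's wall clock.
DO $$
DECLARE
  started TIMESTAMPTZ := clock_timestamp();  -- wall-clock time before the run
BEGIN
  PERFORM fnc_check_fk_overhead(20);         -- the 100,000 single-row updates
  RAISE NOTICE 'elapsed: %', clock_timestamp() - started;
END;
$$;

-- Relative slowdown from the lower ends of the two measured ranges above.
SELECT round(100.0 * (3.186 - 2.6) / 2.6, 1) AS pct_slower;  -- 22.5
```

`clock_timestamp()` is used instead of `now()` because `now()` stays fixed for the duration of a transaction and would always report zero elapsed time here.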

Improving Visual Studio Performance with RAM disk and Second SSD

One of the best ways to improve Visual Studio performance is to replace your HDD with an SSD. However, for various reasons that may not always be an option. Here are a few other hardware-related suggestions that may improve your Visual Studio experience:

If you have enough RAM, create a RAM disk (I use freeware from SoftPerfect). Set your web.config to use it as the temporary compilation directory:
```
 <compilation tempDirectory="R:\TempAspFolderOnRamDisk">
```
While you are there, you can experiment with the batch attribute. It controls whether the initial page request continues to compile additional files in the same directory before completing the original request. I prefer to set it to false.
If you can add a second SSD, you may want to move all your source code to it. However, that may mean changing the configuration on your machine to make everything work from a different drive or folder. To avoid this configuration headache, create a symbolic link from your original source code directory to the new one on the SSD instead:
```
mklink /D "C:\originalSourceCodeDirectory" R:\DirectoryOnSSD
```
You have to delete or rename the originalSourceCodeDirectory before running the mklink command.
Move your bin/obj folders to the RAM disk. Use symbolic links to avoid configuration problems. This did not have any noticeable effect on my configuration, but your mileage may vary.
Use caching software such as PrimoCache. It lets you cache data either in RAM or on an additional SSD.

The most beneficial of these techniques for me was moving source code to SSD.
