<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SQL &#8211; Other Things</title>
	<atom:link href="https://blog.adamzolo.com/category/sql/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.adamzolo.com</link>
	<description>Blog about Things by Adam Zolotarev</description>
	<lastBuildDate>Fri, 19 Jan 2024 20:48:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>
	<item>
		<title>Rails Migration to Change string to boolean (PostgreSQL)</title>
		<link>https://blog.adamzolo.com/rails-migration-to-change-string-to-boolean-postgresql/</link>
					<comments>https://blog.adamzolo.com/rails-migration-to-change-string-to-boolean-postgresql/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Fri, 19 Jan 2024 20:47:21 +0000</pubDate>
				<category><![CDATA[Rails]]></category>
		<category><![CDATA[SQL]]></category>
		<guid isPermaLink="false">https://blog.adamzolo.com/?p=1057</guid>

					<description><![CDATA[When you run the migration to change the column type from string to boolean, you may encounter this kind of error: This just tells you that you need a rule to convert your string to boolean. You can fix it with the using syntax. For example, if you want all existing values to change to false: Or&#8230;<p><a class="more-link" href="https://blog.adamzolo.com/rails-migration-to-change-string-to-boolean-postgresql/" title="Continue reading &#8216;Rails Migration to Change string to boolean (PostgreSQL)&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
										<content:encoded><![CDATA[
<p>When you run the migration to change the column type from string to boolean, you may encounter this kind of error:</p>



<pre class="wp-block-code"><code>PG::DatatypeMismatch: ERROR:  column "blah" cannot be cast automatically to type boolean
HINT:  You might need to specify "USING blah::boolean".</code></pre>



<p>This just tells you that you need a rule to convert your string to boolean. You can fix it with the <code>using</code> option. For example, if you want every existing value in the column to become false:</p>



<pre class="wp-block-code"><code>change_table :table_name do |t|
  t.change :column_name, :boolean, using: 'false', default: false, null: false
end</code></pre>



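<p>Or, if you want to convert the existing values in the column, you could do something like this (the cast only succeeds for strings PostgreSQL recognizes as booleans, such as 'true', 'f', 'yes' or '0'):</p>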



<pre class="wp-block-code"><code>change_table :table_name do |t|
  t.change :column_name, :boolean, using: 'cast(column_name as boolean)', default: false, null: false
end</code></pre>
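


<p>If the column holds strings that PostgreSQL cannot cast on its own, the same <code>USING</code> clause can carry an explicit mapping. Here is a minimal sketch in plain SQL, assuming (hypothetically) that the listed values should become true and everything else, including NULL, should become false, and that the column has no string default. The <code>CASE</code> expression could equally be passed as the <code>using:</code> string in the migration:</p>



<pre class="wp-block-code"><code>-- table_name and column_name are placeholders, as in the migrations above
ALTER TABLE table_name
  ALTER COLUMN column_name TYPE boolean
  USING CASE
          WHEN lower(column_name) IN ('true', 't', 'yes', 'y', '1') THEN true
          ELSE false  -- NULL and any unrecognized string end up here
        END;</code></pre>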
]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/rails-migration-to-change-string-to-boolean-postgresql/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Comparing PostgreSQL timestamps without timezones with dates with a timezone</title>
		<link>https://blog.adamzolo.com/comparing-postgresql-timestamps-without-timezones-with-dates-with-a-timezone/</link>
					<comments>https://blog.adamzolo.com/comparing-postgresql-timestamps-without-timezones-with-dates-with-a-timezone/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Thu, 12 Jan 2023 23:04:33 +0000</pubDate>
				<category><![CDATA[SQL]]></category>
		<guid isPermaLink="false">https://blog.adamzolo.com/?p=1041</guid>

					<description><![CDATA[Problem: you have a timestamp field and you need to compare it with something that has a timezone. Let&#8217;s assume your database default timezone is UTC (can check it with show timezone;). Thus, we are making the assumption that your dates are stored in UTC. Since timestamp does not have any timezone data, we first&#8230;<p><a class="more-link" href="https://blog.adamzolo.com/comparing-postgresql-timestamps-without-timezones-with-dates-with-a-timezone/" title="Continue reading &#8216;Comparing PostgreSQL timestamps without timezones with dates with a timezone&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
										<content:encoded><![CDATA[
<p>Problem: you have a <code>timestamp</code> field and you need to compare it with something that has a timezone.</p>



<p>Let&#8217;s assume <strong>your database default timezone is UTC</strong> (you can check it with <code>show timezone;</code>). Thus, we are making the assumption that your dates are stored in UTC.</p>



<p>Since <code>timestamp</code> does not have any timezone data, we first need to interpret it as UTC, which turns it into a <code>timestamp with time zone</code>. Then we can convert it to the desired timezone:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
select timestamp_field AT TIME ZONE &#039;UTC&#039; AT TIME ZONE &#039;America/New_York&#039;
</pre></div>


<p>Now we have the local time in the desired timezone, with the proper daylight-saving offset applied.</p>
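

<p>To see the daylight-saving offset at work, here is a small sketch that applies the same double conversion to two literal timestamps (the dates are arbitrary examples, one in winter and one in summer):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
select
  timestamp &#039;2023-01-12 15:00:00&#039; AT TIME ZONE &#039;UTC&#039; AT TIME ZONE &#039;America/New_York&#039; as winter_local,
  timestamp &#039;2023-07-12 15:00:00&#039; AT TIME ZONE &#039;UTC&#039; AT TIME ZONE &#039;America/New_York&#039; as summer_local;
-- winter_local: 2023-01-12 10:00:00 (EST, UTC-5)
-- summer_local: 2023-07-12 11:00:00 (EDT, UTC-4)
</pre></div>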



<p>Just for fun, let&#8217;s check if the current hour matches the hour from the saved timestamp field in the &#8216;America/New_York&#8217; timezone:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
EXTRACT ( HOUR FROM timestamp_field at time zone &#039;UTC&#039; at time zone &#039;America/New_York&#039;) = EXTRACT ( HOUR FROM now() at time zone &#039;America/New_York&#039;)
</pre></div>


<p>Notice that we do not read now() in the UTC timezone first: now() returns a <code>timestamp with time zone</code>, so it already has all of the timezone data.</p>
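

<p>The same idea covers the original problem of comparing the field with a value that carries a timezone: once the field is interpreted as UTC, both sides are <code>timestamp with time zone</code> and can be compared directly. This is a sketch only; the table <code>events</code> and the columns <code>id</code> and <code>created_at</code> are hypothetical names:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
-- events, id and created_at are hypothetical; created_at is a timestamp stored in UTC
select id, created_at
from events
where created_at AT TIME ZONE &#039;UTC&#039; &gt;= timestamptz &#039;2023-01-12 00:00:00 America/New_York&#039;;
</pre></div>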
]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/comparing-postgresql-timestamps-without-timezones-with-dates-with-a-timezone/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>The Cost of PostgreSQL Foreign Keys</title>
		<link>https://blog.adamzolo.com/the-cost-of-postgresql-foreign-keys/</link>
					<comments>https://blog.adamzolo.com/the-cost-of-postgresql-foreign-keys/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Thu, 10 Feb 2022 22:38:00 +0000</pubDate>
				<category><![CDATA[Performance]]></category>
		<category><![CDATA[SQL]]></category>
		<guid isPermaLink="false">https://blog.adamzolo.com/?p=998</guid>

					<description><![CDATA[Acknowledgments: shout out to Steven Jones at Syncro for helping me better understand how Foreign Keys work in PostgreSQL. Foreign keys are great for maintaining data integrity. However, they are not free. As mentioned in this blog post by Shaun Thomas: In PostgreSQL, every foreign key is maintained with an invisible system-level trigger added to&#8230;<p><a class="more-link" href="https://blog.adamzolo.com/the-cost-of-postgresql-foreign-keys/" title="Continue reading &#8216;The Cost of PostgreSQL Foreign Keys&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
										<content:encoded><![CDATA[



<p><strong>Acknowledgments</strong>: shout out to <a rel="noreferrer noopener" href="https://www.linkedin.com/in/steven-jones-a732567/" target="_blank">Steven Jones</a> at <a href="https://syncromsp.com/" target="_blank" rel="noreferrer noopener">Syncro</a> for helping me better understand how Foreign Keys work in PostgreSQL.</p>






<p>Foreign keys are great for maintaining data integrity. However, they are not free.</p>



<p>As mentioned in this blog post by Shaun Thomas:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><p>In PostgreSQL, every foreign key is maintained with an invisible system-level trigger added to the&nbsp;<em>source</em>&nbsp;table in the reference. At least one trigger must go here, as operations that modify the source data must be checked that they do not violate the constraint.</p><cite>https://bonesmoses.org/2014/05/14/foreign-keys-are-not-free/</cite></blockquote>



<p>What this means is that when you modify a row in your parent table, these triggers still fire, even if you don&#8217;t touch any of the referenced keys.</p>



<p>In the original post from 2014, the overhead made the updates up to 95% slower with 20 foreign keys. Let&#8217;s see if things have changed since then.</p>






<p>These are the slightly updated scripts we&#8217;ll use for testing:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
CREATE OR REPLACE FUNCTION fnc_create_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  CREATE TABLE test_fk
  (
    id   BIGINT PRIMARY KEY,
    junk VARCHAR
  );

  INSERT INTO test_fk
  SELECT generate_series(1, 100000), repeat(&#039; &#039;, 20);

  CLUSTER test_fk_pkey ON test_fk;

  FOR i IN 1..key_count LOOP
    EXECUTE &#039;CREATE TABLE test_fk_ref_&#039; || i || 
            &#039; (test_fk_id BIGINT REFERENCES test_fk (id))&#039;;
						
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;


CREATE OR REPLACE FUNCTION fnc_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  FOR i IN 1..100000 LOOP
    UPDATE test_fk SET junk = &#039;    blah                &#039;
     WHERE id = i;
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;


CREATE OR REPLACE FUNCTION clean_up_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  DROP TABLE test_fk CASCADE;

  FOR i IN 1..key_count LOOP
    EXECUTE &#039;DROP TABLE test_fk_ref_&#039; || i;
  END LOOP;
END;
$$ LANGUAGE plpgsql VOLATILE;
</pre></div>
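

<p>To look at the system-level triggers themselves, one option is to query the catalog once the test tables exist. This is a minimal sketch, assuming <code>fnc_create_check_fk_overhead(20)</code> has already been run; the column aliases are only illustrative:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
-- Internal triggers that back every foreign key pointing at test_fk
SELECT t.tgname,
       t.tgrelid::regclass AS on_table,
       t.tgfoid::regproc   AS trigger_function,
       c.conname           AS constraint_name
  FROM pg_trigger t
  JOIN pg_constraint c ON c.oid = t.tgconstraint
 WHERE t.tgisinternal
   AND c.contype = &#039;f&#039;
   AND c.confrelid = &#039;test_fk&#039;::regclass
 ORDER BY on_table, t.tgname;
</pre></div>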


<p>To validate that the overhead is caused strictly by the presence of the foreign keys, and not by the cost of looking up the child records, we&#8217;ll modify the first function after the first benchmark and add an index on each foreign key:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
CREATE OR REPLACE FUNCTION fnc_create_check_fk_overhead(key_count INT)
RETURNS VOID AS
$$
DECLARE
  i INT;
BEGIN
  CREATE TABLE test_fk
  (
    id   BIGINT PRIMARY KEY,
    junk VARCHAR
  );

  INSERT INTO test_fk
  SELECT generate_series(1, 100000), repeat(&#039; &#039;, 20);

  CLUSTER test_fk_pkey ON test_fk;

  FOR i IN 1..key_count LOOP
    EXECUTE &#039;CREATE TABLE test_fk_ref_&#039; || i || 
            &#039; (test_fk_id BIGINT REFERENCES test_fk (id))&#039;;
						
    EXECUTE &#039;CREATE INDEX test_fk_ref_index_&#039; || i ||
            &#039; ON test_fk_ref_&#039; || i || &#039; (test_fk_id)&#039;;
  END LOOP;

END;
$$ LANGUAGE plpgsql VOLATILE;
</pre></div>





<p>We&#8217;ll run on a Mac with a 2.3 GHz 8-core i9 and 64 GB of RAM.</p>



<p>PostgreSQL version 12</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
-- without an index

SELECT fnc_create_check_fk_overhead(0);
SELECT fnc_check_fk_overhead(0); -- 2.6-2.829
SELECT clean_up_overhead(0);


SELECT fnc_create_check_fk_overhead(20);
SELECT fnc_check_fk_overhead(20); -- 3.186-3.5. ~20% drop
SELECT clean_up_overhead(20);


-- after updating our initial function to add an index for each foreign key:
SELECT fnc_create_check_fk_overhead(0);
SELECT fnc_check_fk_overhead(0); -- 2.6-2.8
SELECT clean_up_overhead(0);

SELECT fnc_create_check_fk_overhead(20);
SELECT fnc_check_fk_overhead(20); -- 3.1 same ~20% drop
SELECT clean_up_overhead(20);
</pre></div>


<p>As the benchmark shows, updates on the parent table become about 20% slower once 20 tables reference it through a foreign key, with or without an index on the referencing column. It&#8217;s not quite as bad as the 95% in the original post, but the overhead is still clearly there.</p>



]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/the-cost-of-postgresql-foreign-keys/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Docker MySQL with a Custom SQL Script for Development</title>
		<link>https://blog.adamzolo.com/docker-mysql-with-a-custom-sql-script-for-development/</link>
					<comments>https://blog.adamzolo.com/docker-mysql-with-a-custom-sql-script-for-development/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Tue, 04 Jan 2022 15:07:10 +0000</pubDate>
				<category><![CDATA[docker]]></category>
		<category><![CDATA[SQL]]></category>
		<guid isPermaLink="false">https://blog.adamzolo.com/?p=988</guid>

					<description><![CDATA[The setup is similar to setting up MariaDB. Start with standard docker-compose file. If using custom SQL mode, specify the necessary options in the command options: Add dev.dockerfile: Finally, add your init.sql file. Let&#8217;s give all privileges to our dev_user and switch the default caching_sha2_password to mysql_native_password (don&#8217;t do it unless you rely on older&#8230;<p><a class="more-link" href="https://blog.adamzolo.com/docker-mysql-with-a-custom-sql-script-for-development/" title="Continue reading &#8216;Docker MySQL with a Custom SQL Script for Development&#8217;">Continue reading <span class="meta-nav">&#8594;</span></a></p>]]></description>
										<content:encoded><![CDATA[
<p>The setup is similar to <a rel="noreferrer noopener" href="https://blog.adamzolo.com/dockerizing-mariadb-with-a-custom-sql-script-in-development/" target="_blank">setting up MariaDB</a>.</p>



<p>Start with a standard docker-compose file. If you need a custom SQL mode, specify it in the <code>command</code> option:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; title: ; notranslate">
version: &quot;3.7&quot;
services:
    mysql:
        build:
            context: .
            dockerfile: dev.dockerfile
        restart: always
        command: --sql_mode=&quot;STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE&quot;
        environment:
            MYSQL_ROOT_PASSWORD: root_password
            MYSQL_DATABASE: dev
            MYSQL_USER: dev_user
            MYSQL_PASSWORD: dev_password
        ports:
            - 3306:3306

</pre></div>


<p>Add <code>dev.dockerfile</code>:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; title: ; notranslate">
FROM mysql:8.0.17

ADD init.sql /docker-entrypoint-initdb.d/ddl.sql


</pre></div>


<p>Finally, add your <code>init.sql</code> file. Let&#8217;s give all privileges to our <code>dev_user</code> and switch the default <code>caching_sha2_password</code> to <code>mysql_native_password</code> (don&#8217;t do it unless you rely on older packages that require the less secure <code>mysql_native_password</code> authentication method):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; title: ; notranslate">
GRANT ALL PRIVILEGES ON *.* TO &#039;dev_user&#039;@&#039;%&#039;;
ALTER USER &#039;dev_user&#039;@&#039;%&#039; IDENTIFIED WITH mysql_native_password BY &#039;dev_password&#039;;
</pre></div>
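

<p>Once the container is up, a quick way to confirm that the settings took effect is to connect as root and run a couple of checks. This is a sketch rather than part of the setup itself. Keep in mind that scripts in <code>/docker-entrypoint-initdb.d</code> only run when the data directory is initialized for the first time:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
-- Active SQL mode; should list the modes passed through the compose command option
SELECT @@GLOBAL.sql_mode;

-- Authentication plugin of dev_user; expected to be mysql_native_password after init.sql has run
SELECT user, host, plugin FROM mysql.user WHERE user = &#039;dev_user&#039;;
</pre></div>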


<p>If you want to access the database container from other containers, while running them separately, you can specify <code>host.docker.internal</code> as the host address of your database.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/docker-mysql-with-a-custom-sql-script-for-development/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Getting Min Max ids after splitting SQL table into n equal parts using ntile</title>
		<link>https://blog.adamzolo.com/getting-min-max-ids-after-splitting-sql-table-into-n-equal-parts-using-ntile/</link>
					<comments>https://blog.adamzolo.com/getting-min-max-ids-after-splitting-sql-table-into-n-equal-parts-using-ntile/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Wed, 10 Jun 2020 13:11:10 +0000</pubDate>
				<category><![CDATA[SQL]]></category>
		<guid isPermaLink="false">http://blog.adamzolo.com/?p=929</guid>

					<description><![CDATA[This will split your_table into 10 parts of nearly equal size, and give you the minimum and maximum id for each part:]]></description>
										<content:encoded><![CDATA[
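<p>This will split your_table into 10 parts of nearly equal size (when the row count is not divisible by 10, the first buckets get one extra row each), and give you the minimum and maximum id for each part:</p>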


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: sql; title: ; notranslate">
with cte as
(
    select
        a.id,
        ntile(10) over (order by a.id) as bucket_id
    from your_table as a
    group by a.id
)

select bucket_id, min(id), max(id)
from cte
group by bucket_id
order by bucket_id
</pre></div>]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/getting-min-max-ids-after-splitting-sql-table-into-n-equal-parts-using-ntile/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Removing Duplicate Data from SQL Query Caused by New Line Character</title>
		<link>https://blog.adamzolo.com/removing-duplicate-data-sql-newline-character/</link>
					<comments>https://blog.adamzolo.com/removing-duplicate-data-sql-newline-character/#respond</comments>
		
		<dc:creator><![CDATA[Adam Zolo]]></dc:creator>
		<pubDate>Mon, 16 Sep 2013 21:08:03 +0000</pubDate>
				<category><![CDATA[SQL]]></category>
		<guid isPermaLink="false">http://eazolo.com/blog/?p=24</guid>

					<description><![CDATA[We had a query retrieving data from a linked Oracle server. We needed unique rows only. This is the original query: However, this still returned some duplicates despite using DISTINCT. As it turned out, some rows had a new line character in them. The solution:]]></description>
										<content:encoded><![CDATA[<p>We had a query retrieving data from a linked Oracle server. We needed unique rows only. This is the original query: </p>
<pre class="brush: sql; title: ; notranslate">
SELECT Column
     FROM OPENQUERY( LinkedServer, 'SELECT DISTINCT Column from TABLE;' );
</pre>
<p>However, this still returned some duplicates despite using DISTINCT. As it turned out, some values contained a newline character (carriage return or line feed), so rows that looked identical were in fact different. The solution:</p>
<pre class="brush: sql; title: ; notranslate">
SELECT distinct replace(replace(Column,CHAR(13),''),CHAR(10),'')
     FROM OPENQUERY( LinkedServer, 'SELECT DISTINCT Column from TABLE;' )
</pre>
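<p>To find the offending rows before stripping anything, a query along these lines may help. It is a sketch that reuses the placeholder names from above (<code>Column</code>, <code>TABLE</code>, <code>LinkedServer</code>):</p>
<pre class="brush: sql; title: ; notranslate">
-- Rows whose value contains a carriage return (CHAR(13)) or a line feed (CHAR(10))
SELECT Column
     FROM OPENQUERY( LinkedServer, 'SELECT DISTINCT Column from TABLE;' )
     WHERE CHARINDEX(CHAR(13), Column) &gt; 0
        OR CHARINDEX(CHAR(10), Column) &gt; 0;
</pre>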
]]></content:encoded>
					
					<wfw:commentRss>https://blog.adamzolo.com/removing-duplicate-data-sql-newline-character/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
