{"data":{"allMarkdownRemark":{"edges":[{"node":{"id":"df094608-f958-5bcd-b406-ea359067fbd8","frontmatter":{"category":"Coding","title":"PostgreSQL 12 - a precious release","date":"2019-10-29","summary":"Improved Common Table Expressions in recent release of Postgres","thumbnail":{"relativePath":"pages/postgres12-a-precious-release/elephant_cropped.jpg","childImageSharp":{"resolutions":{"base64":"data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAWABQDASIAAhEBAxEB/8QAGQABAQADAQAAAAAAAAAAAAAAAAMBAgQF/8QAFgEBAQEAAAAAAAAAAAAAAAAAAQIA/9oADAMBAAIQAxAAAAHuzwyh9NJUx1M1Cf/EABoQAAIDAQEAAAAAAAAAAAAAAAEDABESAhD/2gAIAQEAAQUC9ZwwnDAsaq5ZgM//xAAYEQADAQEAAAAAAAAAAAAAAAAAARICE//aAAgBAwEBPwGDmxaZbP/EABcRAAMBAAAAAAAAAAAAAAAAAAACEwH/2gAIAQIBAT8BoVU1MJqf/8QAGxAAAgEFAAAAAAAAAAAAAAAAADEBECAhQWH/2gAIAQEABj8CrPIwNM1Z/8QAHBABAAICAwEAAAAAAAAAAAAAAQARITEQQVFx/9oACAEBAAE/IVDbLPeEWs5BNbatDthqyfsUeMAiBon/2gAMAwEAAgADAAAAEL/Xfv/EABkRAQEAAwEAAAAAAAAAAAAAAAEAMUFRYf/aAAgBAwEBPxAOy4kkzet//8QAGREAAwADAAAAAAAAAAAAAAAAAAExEVFh/9oACAECAQE/EHnGb2IuHE//xAAdEAEAAgICAwAAAAAAAAAAAAABABEhMRBRYXGh/9oACAEBAAE/EFAAVogmhwCxZXVnre/cLABvFBlrzTMCB1e/kSbKO4RYmd3m4ZIf/9k=","width":292,"height":325,"src":"/static/5076cf686ef8d2a21a3a6da9bf016970/a2998/elephant_cropped.jpg","srcSet":"/static/5076cf686ef8d2a21a3a6da9bf016970/a2998/elephant_cropped.jpg 1x,\n/static/5076cf686ef8d2a21a3a6da9bf016970/99cce/elephant_cropped.jpg 1.5x,\n/static/5076cf686ef8d2a21a3a6da9bf016970/6e995/elephant_cropped.jpg 2x,\n/static/5076cf686ef8d2a21a3a6da9bf016970/968b1/elephant_cropped.jpg 3x"}}},"authorName":"Mariusz Nowak","authorDescription":"Mariusz is a Senior Software Engineer at AUTO1 Group.","authorAvatar":null,"headerImage":null},"html":"<p>The PostgreSQL team <a href=\"https://www.postgresql.org/about/news/1976/\">announced recently</a> a new release of the most advanced\nopen source relational database - PostgreSQL 12. As usual it comes with an impressive list of improvements\n(generated columns ♥️), one of them being long awaited by dozens of developers: <strong>improved Common Table Expressions</strong>.</p>\n<p>I should first explain what are the Common Table Expressions for those who are unfamiliar with them:\nthe CTE’s, often called <code class=\"language-text\">“WITH queries”</code>, are SQL constructs giving a possibility of creating <strong>temporal data views</strong> for a sake of a query execution.</p>\n<p>Essentially CTE is an additional query which results can be referenced in the subsequent CTE’s or the main query before which\nit is being placed. It should be clear enough with an example - here’s a sample query with two CTE’s taken from Postgres docs:</p>\n<div class=\"gatsby-highlight\" data-language=\"sql\"><pre class=\"language-sql\"><code class=\"language-sql\"><span class=\"token keyword\">WITH</span> regional_sales <span class=\"token keyword\">AS</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> region<span class=\"token punctuation\">,</span> <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>amount<span class=\"token punctuation\">)</span> <span class=\"token keyword\">AS</span> total_sales\n    <span class=\"token keyword\">FROM</span> orders\n    <span class=\"token keyword\">GROUP</span> <span class=\"token keyword\">BY</span> region\n<span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> top_regions <span class=\"token keyword\">AS</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> region\n    <span class=\"token keyword\">FROM</span> regional_sales\n    <span class=\"token keyword\">WHERE</span> total_sales <span class=\"token operator\">></span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">SELECT</span> <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>total_sales<span class=\"token punctuation\">)</span><span class=\"token operator\">/</span><span class=\"token number\">10</span> <span class=\"token keyword\">FROM</span> regional_sales<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">)</span>\n<span class=\"token keyword\">SELECT</span> region<span class=\"token punctuation\">,</span>\n       product<span class=\"token punctuation\">,</span>\n       <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>quantity<span class=\"token punctuation\">)</span> <span class=\"token keyword\">AS</span> product_units<span class=\"token punctuation\">,</span>\n       <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>amount<span class=\"token punctuation\">)</span> <span class=\"token keyword\">AS</span> product_sales\n<span class=\"token keyword\">FROM</span> orders\n<span class=\"token keyword\">WHERE</span> region <span class=\"token operator\">IN</span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">SELECT</span> region <span class=\"token keyword\">FROM</span> top_regions<span class=\"token punctuation\">)</span>\n<span class=\"token keyword\">GROUP</span> <span class=\"token keyword\">BY</span> region<span class=\"token punctuation\">,</span> product<span class=\"token punctuation\">;</span></code></pre></div>\n<p>Now we know what they are, but what purpose can they serve us? Well - we could parry here and say: for the same purpose as ordinary database views serve.\nThat’s of course a dramatic simplification - CTE’s are much more powerful and should not be treated as a simple database views.\nNevertheless I would like to keep the collation for the sake of this article.</p>\n<p>Database views are absolutely optional - one can simply substitute them with a subquery and achieve identical results.\nIndeed that’s what modern database engines do these days - once a database view is being used they inline it’s query as a subquery.\nWhy to bother then? We are able to deal completely without database views and even if we did use one the database engine would get rid of it anyway.</p>\n<p>What benefits do views give us then? And why the heck are they inlined?</p>\n<h3>The beauty of database views</h3>\n<p>The most appropriate explanation here is that database views help us achieve better readability.\nThey offer an elegant way of abstracting some parts of a database into a meaningful object, often matching closely with the domain.</p>\n<p>Instead of creating giant and ugly looking queries it is possible to extract some of its parts into an appealing view which is easier to browse and select data from. <strong>\"Divide and conquer\"</strong> rule in it’s true form.</p>\n<p>Still, we didn’t answer the fact that the underlying view’s query is most of the times <strong>inlined</strong> while it is being referenced. The reason is <strong>performance</strong> of course. Smart guys found out that lazy evaluation helps the optimizer a lot - by delaying the execution we could take advantage of a context of the actual query.</p>\n<p>This in turn allows many clever optimization techniques, like: pushing down predicates (<em>WHERE filters</em>), eliminating unnecessary <em>JOINS</em>, accessing only subset of columns etc.\nIn other words - a database is smart enough to do as little work as possible when evaluating database views, in the context of the issued query.</p>\n<p>Personally I love this pattern: aggregating all the data into views and letting a database to optimize my queries - these folks are really good in it and my queries are dead simple too.</p>\n<p>I have mentioned the CTE’s at the beginning, saying they are able to create temporal data views. I still conform to the comparison with database views - they both are in many cases similar.\nThe main difference is that CTE results are temporary and are reachable only in the context of a query which CTE is being part of.</p>\n<p>It may make sense to use a CTE in a place where a regular database view is not justified (e.g. it makes sense only in the context of a query and not in the whole domain), expecting similar behavior.</p>\n<h3>An ugly brother</h3>\n<p>Besides many remarkable advantages there’s at least one disadvantage which disqualifies CTE's for most use cases - before PostgreSQL 12 it was implemented as an <strong>optimization fence</strong>. What does it mean?</p>\n<p>Easy to imagine an example with a generic data view aggregating lots of data. If the view is then used to select just few records it could mean a tremendous waste of computation, if the aggregation is executed immediately.\nInstead the aggregation should be executed only on a small subset of data, which can be deduced from the outer query.</p>\n<p>Unfortunately such counter intuitive behavior was true for CTE’s for a long time - their results were <strong>materialized</strong> only to be accessible for the rest of the query afterwards.</p>\n<p>Not to blame anybody - the creators had quite good reasons to implement such behavior (i.e. guarantee of exactly one evaluation, possibility of a recursive CTE’s and more) but this still feels like focusing on corner cases instead of optimizing the happy path.\nThat’s why the community <a href=\"https://www.postgresql.org/message-id/flat/201209191305.44674.db%40kavod.com\">insisted for a long time</a> for changing the status quo by giving the possibility of disabling the fence and unlocking the full potential of CTE’s.</p>\n<h3>Game changer</h3>\n<p>The SQL gods listen to their prayers and here it is - PostgreSQL 12 with updated <a href=\"https://www.postgresql.org/docs/12/queries-with.html\">Common Table Expressions</a>.\nBy default, when few constraints are met, the queries will be inlined allowing joint optimizations.</p>\n<p>It is still possible to force the old behavior - by defining the CTE <code class=\"language-text\">AS MATERIALIZED</code> the engine would execute it immediately.\nIt is also possible to hint the optimizer that we definitely want the CTE to be inlined, e.g. when the CTE is being referenced twice the engine won't inline it by default.</p>\n<p>This is truly a game changer for many developers who care about their queries’ readability, allowing them to substitute not very well liked subqueries with elegant <code class=\"language-text\">“WITH queries”</code>.</p>\n<p>Don’t get me wrong - not every subquery should be immediately replaced, they still have their strengths and in some cases they should be chosen over CTE’s. It is just convenient to have two distinct tools in a toolbox, isn’t it?</p>","fields":{"slug":"/postgres12-a-precious-release/","tags":["postgres","postgres12","release","sql","cte"]}}}]}},"pageContext":{"slug":"/tags/release","tag":"release","categories":["Architecture","Coding","DevOps","Engineering","ProjectManagement","QA","Social","TechRadar"]}}