{"data":{"allMarkdownRemark":{"edges":[{"node":{"id":"df094608-f958-5bcd-b406-ea359067fbd8","frontmatter":{"category":"Coding","title":"PostgreSQL 12 - a precious release","date":"2019-10-29","summary":"Improved Common Table Expressions in recent release of Postgres","thumbnail":{"relativePath":"pages/postgres12-a-precious-release/elephant_cropped.jpg","childImageSharp":{"resolutions":{"base64":"data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAWABQDASIAAhEBAxEB/8QAGQABAQADAQAAAAAAAAAAAAAAAAMBAgQF/8QAFgEBAQEAAAAAAAAAAAAAAAAAAQIA/9oADAMBAAIQAxAAAAHuzwyh9NJUx1M1Cf/EABoQAAIDAQEAAAAAAAAAAAAAAAEDABESAhD/2gAIAQEAAQUC9ZwwnDAsaq5ZgM//xAAYEQADAQEAAAAAAAAAAAAAAAAAARICE//aAAgBAwEBPwGDmxaZbP/EABcRAAMBAAAAAAAAAAAAAAAAAAACEwH/2gAIAQIBAT8BoVU1MJqf/8QAGxAAAgEFAAAAAAAAAAAAAAAAADEBECAhQWH/2gAIAQEABj8CrPIwNM1Z/8QAHBABAAICAwEAAAAAAAAAAAAAAQARITEQQVFx/9oACAEBAAE/IVDbLPeEWs5BNbatDthqyfsUeMAiBon/2gAMAwEAAgADAAAAEL/Xfv/EABkRAQEAAwEAAAAAAAAAAAAAAAEAMUFRYf/aAAgBAwEBPxAOy4kkzet//8QAGREAAwADAAAAAAAAAAAAAAAAAAExEVFh/9oACAECAQE/EHnGb2IuHE//xAAdEAEAAgICAwAAAAAAAAAAAAABABEhMRBRYXGh/9oACAEBAAE/EFAAVogmhwCxZXVnre/cLABvFBlrzTMCB1e/kSbKO4RYmd3m4ZIf/9k=","width":292,"height":325,"src":"/static/5076cf686ef8d2a21a3a6da9bf016970/a2998/elephant_cropped.jpg","srcSet":"/static/5076cf686ef8d2a21a3a6da9bf016970/a2998/elephant_cropped.jpg 1x,\n/static/5076cf686ef8d2a21a3a6da9bf016970/99cce/elephant_cropped.jpg 1.5x,\n/static/5076cf686ef8d2a21a3a6da9bf016970/6e995/elephant_cropped.jpg 2x,\n/static/5076cf686ef8d2a21a3a6da9bf016970/968b1/elephant_cropped.jpg 3x"}}},"authorName":"Mariusz Nowak","authorDescription":"Mariusz is a Senior Software Engineer at AUTO1 Group.","authorAvatar":null,"headerImage":null},"html":"<p>The PostgreSQL team <a href=\"https://www.postgresql.org/about/news/1976/\">announced recently</a> a new release of the most 
advanced\nopen source relational database - PostgreSQL 12. As usual it comes with an impressive list of improvements\n(generated columns ♥️), one of them long awaited by dozens of developers: <strong>improved Common Table Expressions</strong>.</p>\n<p>I should first explain what Common Table Expressions are for those who are unfamiliar with them:\nCTE’s, often called <code class=\"language-text\">“WITH queries”</code>, are SQL constructs that make it possible to create <strong>temporary data views</strong> for the sake of a single query execution.</p>\n<p>Essentially a CTE is an additional query whose results can be referenced in the subsequent CTE’s or in the main query before which\nit is placed. It should be clear enough with an example - here’s a sample query with two CTE’s taken from the Postgres docs:</p>\n<div class=\"gatsby-highlight\" data-language=\"sql\"><pre class=\"language-sql\"><code class=\"language-sql\"><span class=\"token keyword\">WITH</span> regional_sales <span class=\"token keyword\">AS</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> region<span class=\"token punctuation\">,</span> <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>amount<span class=\"token punctuation\">)</span> <span class=\"token keyword\">AS</span> total_sales\n    <span class=\"token keyword\">FROM</span> orders\n    <span class=\"token keyword\">GROUP</span> <span class=\"token keyword\">BY</span> region\n<span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> top_regions <span class=\"token keyword\">AS</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> region\n    <span class=\"token keyword\">FROM</span> regional_sales\n    <span class=\"token keyword\">WHERE</span> total_sales <span class=\"token operator\">></span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">SELECT</span> <span 
class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>total_sales<span class=\"token punctuation\">)</span><span class=\"token operator\">/</span><span class=\"token number\">10</span> <span class=\"token keyword\">FROM</span> regional_sales<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">)</span>\n<span class=\"token keyword\">SELECT</span> region<span class=\"token punctuation\">,</span>\n       product<span class=\"token punctuation\">,</span>\n       <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>quantity<span class=\"token punctuation\">)</span> <span class=\"token keyword\">AS</span> product_units<span class=\"token punctuation\">,</span>\n       <span class=\"token function\">SUM</span><span class=\"token punctuation\">(</span>amount<span class=\"token punctuation\">)</span> <span class=\"token keyword\">AS</span> product_sales\n<span class=\"token keyword\">FROM</span> orders\n<span class=\"token keyword\">WHERE</span> region <span class=\"token operator\">IN</span> <span class=\"token punctuation\">(</span><span class=\"token keyword\">SELECT</span> region <span class=\"token keyword\">FROM</span> top_regions<span class=\"token punctuation\">)</span>\n<span class=\"token keyword\">GROUP</span> <span class=\"token keyword\">BY</span> region<span class=\"token punctuation\">,</span> product<span class=\"token punctuation\">;</span></code></pre></div>\n<p>Now we know what they are, but what purpose can they serve us? 
Well - we could dodge the question and say: the same purpose that ordinary database views serve.\nThat’s of course a dramatic simplification - CTE’s are much more powerful and should not be treated as simple database views.\nNevertheless I would like to keep the comparison for the sake of this article.</p>\n<p>Database views are absolutely optional - one can simply substitute them with a subquery and achieve identical results.\nIndeed that’s what modern database engines do these days - once a database view is used they inline its query as a subquery.\nWhy bother then? We can manage entirely without database views, and even if we did use one the database engine would get rid of it anyway.</p>\n<p>What benefits do views give us then? And why the heck are they inlined?</p>\n<h3>The beauty of database views</h3>\n<p>The most appropriate explanation here is that database views help us achieve better readability.\nThey offer an elegant way of abstracting some parts of a database into a meaningful object, often closely matching the domain.</p>\n<p>Instead of creating giant and ugly looking queries it is possible to extract some of their parts into an appealing view which is easier to browse and select data from. The <strong>\"Divide and conquer\"</strong> rule in its true form.</p>\n<p>Still, we haven’t explained why the underlying view’s query is most of the time <strong>inlined</strong> when it is referenced. The reason is <strong>performance</strong> of course. 
Smart guys found out that lazy evaluation helps the optimizer a lot - by delaying execution, the engine can take advantage of the context of the actual query.</p>\n<p>This in turn allows many clever optimization techniques, such as pushing down predicates (<em>WHERE filters</em>), eliminating unnecessary <em>JOINs</em>, accessing only a subset of columns etc.\nIn other words - a database is smart enough to do as little work as possible when evaluating database views, in the context of the issued query.</p>\n<p>Personally I love this pattern: aggregating all the data into views and letting the database optimize my queries - these folks are really good at it and my queries are dead simple too.</p>\n<p>I mentioned CTE’s at the beginning, saying they are able to create temporary data views. I still stand by the comparison with database views - the two are similar in many cases.\nThe main difference is that CTE results are temporary and are reachable only in the context of the query which the CTE is part of.</p>\n<p>It may make sense to use a CTE in a place where a regular database view is not justified (e.g. it makes sense only in the context of a query and not in the whole domain), expecting similar behavior.</p>\n<h3>An ugly brother</h3>\n<p>Besides many remarkable advantages there was at least one disadvantage which disqualified CTE’s for most use cases - before PostgreSQL 12 a CTE was implemented as an <strong>optimization fence</strong>. What does that mean?</p>\n<p>It is easy to imagine an example with a generic data view aggregating lots of data. 
If the view is then used to select just a few records, executing the aggregation immediately could mean a tremendous waste of computation.\nInstead the aggregation should be executed only on a small subset of data, which can be deduced from the outer query.</p>\n<p>Unfortunately such counterintuitive behavior was true for CTE’s for a long time - their results were always <strong>materialized</strong> first and only then made accessible to the rest of the query.</p>\n<p>Not to blame anybody - the creators had quite good reasons to implement such behavior (e.g. a guarantee of exactly one evaluation, the possibility of recursive CTE’s and more) but this still feels like focusing on corner cases instead of optimizing the happy path.\nThat’s why the community <a href=\"https://www.postgresql.org/message-id/flat/201209191305.44674.db%40kavod.com\">insisted for a long time</a> on changing the status quo by making it possible to disable the fence and unlock the full potential of CTE’s.</p>\n<h3>Game changer</h3>\n<p>The SQL gods listened to their prayers and here it is - PostgreSQL 12 with updated <a href=\"https://www.postgresql.org/docs/12/queries-with.html\">Common Table Expressions</a>.\nBy default, when a few constraints are met (e.g. the CTE is not recursive and has no side effects), the queries will be inlined, allowing joint optimizations.</p>\n<p>It is still possible to force the old behavior - by defining the CTE <code class=\"language-text\">AS MATERIALIZED</code> the engine will compute it separately, as before.\nIt is also possible to tell the optimizer (with <code class=\"language-text\">AS NOT MATERIALIZED</code>) that we definitely want the CTE to be inlined, e.g. 
when the CTE is being referenced twice the engine won't inline it by default.</p>\n<p>This is truly a game changer for many developers who care about their queries’ readability, allowing them to substitute not very well liked subqueries with elegant <code class=\"language-text\">“WITH queries”</code>.</p>\n<p>Don’t get me wrong - not every subquery should be immediately replaced, they still have their strengths and in some cases they should be chosen over CTE’s. It is just convenient to have two distinct tools in a toolbox, isn’t it?</p>","fields":{"slug":"/postgres12-a-precious-release/","tags":["postgres","postgres12","release","sql","cte"]}}},{"node":{"id":"cd170267-3e38-5f47-8461-d6f6583c5bed","frontmatter":{"category":"Coding","title":"Our love story with Deadlocks (PostgreSQL Edition)","date":"2019-03-15","summary":"How to get rid of deadlocks","thumbnail":{"relativePath":"pages/event-store-deadlock/thumbnail.png","childImageSharp":{"resolutions":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAVCAYAAABG1c6oAAAACXBIWXMAAAsSAAALEgHS3X78AAAFFklEQVQ4y5WUeUzTZxjHi+AWnTuYc1mAKaiAIgoI4tR5henqgSBEoOOmCINxKAIiRPFCpK304CyUUikUtIisHC0UhHoy8ZoaY2Lcodv8w3+WxcyMtu937/ubOLNliXuSN2/yvu/zeb/P8b483t8WvjlkjTopTmCMiQzrXBm4uIyu8T94d0ZGdERYf3Ji/JWIUL4hThB1QZiW8dvuvD0kJzv7aSh/Qy89F8p71QKW+Krr1VrUdF3A4RYzxB0jqNYPQ9Oqg+a0AeJTFhxoNqNCN4zx+49x+f4TCKU9yKruQ233JVTVNcBz7hwlB5s21SFd096FhMo+69xY0YR3gsTmFlVuDS9tmzDd+IGs3aO2zYsVWb3ixXbPeIltUXKlLb+uh6h6x8iCBDFxF1RYhXKjtUVvwEezZpbxYmN2XKnoOI/Z0cdsyzOqEJxRRXyTT6DZOE6iDmqJd7yILM+sQlC6nCz7UkEC0+RwjylHUUM/aeq/ShanVOLjqDL7oVYLkhLifuRlZ2f/mlXdD+YYlK7gDmzIb4SyZwzugnJQZfClawwUsFPGzUvp+HD7YdR0X4ZQpCdUDMlXGiEUpjzh5eTkPs/kgBUkiCpYmCjBF0d1OHP+Dg5qzJxTaHEzfJMrqUoF/IRSrN1VD/mZixi99RBKwxiZtf2wvVw3gpgdEdd5aanC7/aqBuAZJ7KzkGhekCruxND1B8iQnsUIdWofvgX/VBnohVSxCKKOUbQN3YS08wIkp0bJbIHI2thtQbD/okZe2OaN3dLTFtCEW4MzFPBmQEknDp00w21HGVbn1EHVdxUbC1TwS5WyiznVm4vUEIr1pETVTz7JVdlqmlpAayzgec1xKVGeNiIwo9YamCb7SyEFitpHOThTxPK5rUQDll9vmtNd1QbsVw8g8fgpUqTss2dW9WFX
TtZTCpzFOmedRF6DrQfaaUFOEAohu2t6UKwygcFZqE3945xCBmSFYd3QbLpGMqVdEOkvWRXtJvj7eKom+/qt3OysR4UqM1Nj90mSEJZDRddFzNxWiq371FAbx7GEFoNVlxWGqf5K3g3T1fvEdPMR8gqLnlOO+8uXsio4oLHhrAU+KTIrKwztQ9I+cgdFzaM43m5Bfl0vPATHQfc4aGC6nKoXEcvdx7bbt29j2htOJorZREfQJHN9pbwa0cfO2OZEHSHLMmuJXPs1KS0uxIOH3xPd8C0SkqdkKrnG9t8pY00O7fC3RNs9aC8+Ino2ZLmIrVu2WHkAHDji6pXmjqFrWLe37Q/d0A20tWhsRaVHJ3bX9k60nrtrL240sgpTUBVYrwqO6IiFttQh7Qj6xx+i6WQrPOfPa2KsKS9UekVGRv6k1XcjYnv47/UNKlju/QJJSw8GBs3IrTawgnFhe9Nw02UGyGWyZ1s+/+zKquVBndQ/5tUPZxLqSkciHd5sDvBdoJk5480TwpSkewUNA1zRgmmOPePFthK1GSlJiZfpOYdJCI2Wx/Px8fkn9F+2v6S4a59qEF4JYittGcJ+nQrdOURHhvW+AE1539nZ8aWDq6srN0+fPt3Bzc3N0cnJycHFxcWxp880la0X5OW0HmgepK9JbF1Bf5758ZKJ+i4LVgQuVrB9Pp/vyHsdCwsPd+LmTSF7taZvsCBFYaXtQ2LFPVBqdOyp8dm+h4fH6wGdnZ0n8zM7v6Dw55PGMSj0o7bGlg6sX/Np/YtwHXj/x9575+3J21P9Fi1E8FI/pkzxX+f/BBhq9luuZDT4AAAAAElFTkSuQmCC","width":315,"height":325,"src":"/static/fff962ac6bfdb5c2156f04c8855b288b/b3029/thumbnail.png","srcSet":"/static/fff962ac6bfdb5c2156f04c8855b288b/b3029/thumbnail.png 1x,\n/static/fff962ac6bfdb5c2156f04c8855b288b/8d141/thumbnail.png 1.5x"}}},"authorName":"Core Services Team","authorDescription":"Core Services team is responsible for infrastructure and platform development at 
AUTO1","authorAvatar":null,"headerImage":{"relativePath":"pages/event-store-deadlock/header.png","childImageSharp":{"resolutions":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAHCAYAAAAIy204AAAACXBIWXMAAAsSAAALEgHS3X78AAABoUlEQVQoz2MIjM+pDy/sfhxb3HkyNL/zXlbD9O+h2Y0XPD3c1RmAwM07gElKSooBhiUlxBlklDQYsAGDhIkMDP45Xf+t8hb9L5y66792/NT/wbUr/hTOOPDfN76whIEAMIjv5zRMmsJkmDiJH8QH0gwMftmdD21zZv8vmrT5l0v+7L/pXev/5XSv+hMUneLFzMCgr6ikrCApKckGdJ0QEAtKyynx8jEwcOiG1csZJk62B7rKDojVgIa5gm2JKWyf5V29+n9G19r/zkADg2pX/p+2bNtNDSVZM1ExcRNpaWkzoEFuQBwKxL5SUtJBEqLCrspmni7G6XMcDeInhAMNlDJMnJhpkDhRlCE6Jcc4pWrio/krNr4Pqlzw1zZz2r8J0+dddffwdAJZaGJqxg51HTcQqwANlBfhYmBTNvcWNEqdJQX0tiDQQFXDhEnyQFfKwoKDvTLJ28A9u/+jR/6M7xmemtIgwczMLCYGUkFxTSsjjB2bXbk3pahhE4i98/oXsLiysgoDcixLScswiAtwMQBdyGCUOpMR6EJQZDACXQnGAOs+imBvfQhTAAAAAElFTkSuQmCC","width":500,"height":168,"src":"/static/2d78359581a2f1861a243566e527e27a/7d852/header.png","srcSet":"/static/2d78359581a2f1861a243566e527e27a/7d852/header.png 1x"}}}},"html":"<h2>TL;DR</h2>\n<p>Always try to keep your transactions short and write your queries to deal with records in a deterministic order with respect to other transactions.</p>\n<h2>Full Story</h2>\n<p>Almost every non-trivial system needs some kind of task (job, event, whatever you call it) scheduling functionality at some point. Having more than 250 micro-services, ours was no exception.\nAt some point, it made sense to us to provide such functionality as a service for our developers. With <code class=\"language-text\">event-store</code> service, one simply can schedule a job to be executed at a specific time in future. There are already several ready to use tools for this specific purpose (e.g. 
BigBen, Quartz); however, for several reasons (which are beyond the scope of this post) we decided to implement our own from scratch.</p>\n<p>Since the subject of this post is <code class=\"language-text\">Deadlocks</code>, we need a brief introduction to <code class=\"language-text\">event-store</code>'s implementation.</p>\n<p>An event on the server side is simply a record in the database. A simplified view of the <code class=\"language-text\">events</code> table is as follows:</p>\n<table>\n<thead>\n<tr>\n<th>id</th>\n<th>event_type</th>\n<th>due_time</th>\n<th>start_process_time</th>\n<th>end_process_time</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>'5c55ea79-4774-49ae-96e2-975af7a41ce6'</td>\n<td>'user-cache-flush'</td>\n<td>'2019-01-01 00:00:00'</td>\n<td>null</td>\n<td>null</td>\n</tr>\n</tbody>\n</table>\n<p>The following is a bird's-eye view of what happens on the service:</p>\n<p><strong>Thread #1</strong><br>\na. The service periodically fetches a batch of unprocessed events (those that are due and have no process time, neither start nor end) from the database (and marks them as <em>progressing</em> by filling the <code class=\"language-text\">start_process_time</code> column with the current timestamp). Let's call this the <em>fetch-unprocessed</em> query.</p>\n<p>b. The service processes the events and marks them as <em>processed</em> (by filling the <code class=\"language-text\">end_process_time</code> column with the current timestamp). Let's call this the <em>save-processed</em> query.</p>\n<p><strong>Thread #2</strong><br>\na. The service periodically checks the table and looks for stale events (those that are due and have <code class=\"language-text\">start_process_time</code> but no <code class=\"language-text\">end_process_time</code>). Let's call this the <em>fetch-stale</em> query.</p>\n<p>b. 
The service marks them back as <em>unprocessed</em> (by setting <code class=\"language-text\">start_process_time</code> to <code class=\"language-text\">null</code>). Let's call this the <em>release-stale</em> query.</p>\n<p>The very first MVP was a simple read-modify-write (anti-)pattern (with the help of Spring, JPA, and Hibernate). It is not hard to guess which issue this implementation is vulnerable to: deadlocks.</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">ERROR: deadlock detected\nDetail: Process 5234 waits for ShareLock on transaction 3465; blocked by process 467845.\n        Process 467845 waits for ShareLock on transaction 96575; blocked by process 5234.\nHint: See server log for query details.\nWhere: while updating tuple (14954,4) in relation \"events\"</code></pre></div>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">ERROR: deadlock detected\nDetail: Process 10438 waits for ExclusiveLock on tuple (14954,4) of relation 19118 of database 19113; blocked by process 31501.\n        Process 31501 waits for ShareLock on transaction 763124271; blocked by process 28450.\n        Process 28450 waits for ShareLock on transaction 763124277; blocked by process 28873.\n        Process 28873 waits for ExclusiveLock on tuple (14954,4) of relation 19118 of database 19113; blocked by process 10438.\nHint: See server log for query details.\nWhere: while locking tuple (6984,19) in relation \"events\"</code></pre></div>\n<p>In the beginning it was not a big deal, because deadlock errors were not frequent and we could live with them. However, very soon, after getting more clients, the deadlock issue (among other problems, e.g. lost updates) started hurting the quality of the service (e.g. 
firing events twice).</p>\n<p>For the first attempt, we refactored the code to get rid of read-modify-write antipattern:</p>\n<div class=\"gatsby-highlight\" data-language=\"sql\"><pre class=\"language-sql\"><code class=\"language-sql\"><span class=\"token comment\">-- fetch-unprocessed query</span>\n<span class=\"token keyword\">UPDATE</span> events event_to_update\n<span class=\"token keyword\">SET</span> start_process_time <span class=\"token operator\">=</span> <span class=\"token function\">now</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n<span class=\"token keyword\">FROM</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> id\n    <span class=\"token keyword\">FROM</span> events      \n    <span class=\"token keyword\">WHERE</span> end_process_time <span class=\"token operator\">is</span> <span class=\"token boolean\">NULL</span>\n            <span class=\"token operator\">AND</span> start_process_time <span class=\"token operator\">IS</span> <span class=\"token boolean\">NULL</span>\n            <span class=\"token operator\">AND</span> <span class=\"token punctuation\">(</span>due_time <span class=\"token operator\">BETWEEN</span> ? 
<span class=\"token operator\">AND</span> ?<span class=\"token punctuation\">)</span>\n<span class=\"token punctuation\">)</span> matched_event_for_update\n<span class=\"token keyword\">WHERE</span> event_to_update<span class=\"token punctuation\">.</span>id <span class=\"token operator\">=</span> matched_event_for_update<span class=\"token punctuation\">.</span>id\n<span class=\"token keyword\">RETURNING</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token comment\">-- release-stale query (including fetch-stale query)</span>\n<span class=\"token keyword\">UPDATE</span> events event_to_release\n<span class=\"token keyword\">SET</span> start_process_time <span class=\"token operator\">=</span> <span class=\"token boolean\">NULL</span>\n<span class=\"token keyword\">FROM</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> id\n    <span class=\"token keyword\">FROM</span> events\n    <span class=\"token keyword\">WHERE</span> end_process_time <span class=\"token operator\">IS</span> <span class=\"token boolean\">NULL</span> <span class=\"token operator\">AND</span> start_process_time <span class=\"token operator\">&lt;</span> ?\n<span class=\"token punctuation\">)</span> stuck_event\n<span class=\"token keyword\">WHERE</span> event_to_release<span class=\"token punctuation\">.</span>id <span class=\"token operator\">=</span> stuck_event<span class=\"token punctuation\">.</span>id\n<span class=\"token keyword\">RETURNING</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>Obviously this is an improvement, though reduced the number of deadlocks, it didn't solve the issue completely.\nThe quickest solution to eliminate deadlocks was to change the isolation level to <code class=\"language-text\">Serializable</code>. 
However, this was the last solution we wanted, since it would hurt the performance and the quality of the service (think what would happen if one service instance started a transaction and fetched the events to process, but for some reason became stale).</p>\n<p>The better way to deal with deadlocks is to analyze transactions and queries to prevent transaction interference by:</p>\n<ol>\n<li>making transactions short in terms of time</li>\n<li>writing queries to deal with records in a deterministic order</li>\n</ol>\n<p>With the first refactoring, we made transactions a bit shorter, but there was still some room for improvement. To apply the second best practice from the list above, we refactored the queries as follows (essentially ordering and acquiring a lock):</p>\n<div class=\"gatsby-highlight\" data-language=\"sql\"><pre class=\"language-sql\"><code class=\"language-sql\"><span class=\"token comment\">-- fetch-unprocessed query</span>\n<span class=\"token keyword\">UPDATE</span> events event_to_update\n<span class=\"token keyword\">SET</span> start_process_time <span class=\"token operator\">=</span> <span class=\"token function\">now</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n<span class=\"token keyword\">FROM</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> id\n    <span class=\"token keyword\">FROM</span> events      \n    <span class=\"token keyword\">WHERE</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span> <span class=\"token comment\">-- as before</span>\n    <span class=\"token keyword\">ORDER</span> <span class=\"token keyword\">BY</span> ID\n    <span class=\"token keyword\">FOR</span> <span class=\"token keyword\">NO</span> <span class=\"token keyword\">KEY</span> <span class=\"token keyword\">UPDATE</span>\n<span class=\"token punctuation\">)</span> matched_event_for_update\n<span class=\"token 
keyword\">WHERE</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span> <span class=\"token comment\">-- as before</span>\n<span class=\"token keyword\">RETURNING</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">;</span>\n\n<span class=\"token comment\">-- release-stale query (including fetch-stale query)</span>\n<span class=\"token keyword\">UPDATE</span> events event_to_release\n<span class=\"token keyword\">SET</span> start_process_time <span class=\"token operator\">=</span> <span class=\"token boolean\">NULL</span>\n<span class=\"token keyword\">FROM</span> <span class=\"token punctuation\">(</span>\n    <span class=\"token keyword\">SELECT</span> id\n    <span class=\"token keyword\">FROM</span> events\n    <span class=\"token keyword\">WHERE</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span> <span class=\"token comment\">-- as before</span>\n    <span class=\"token keyword\">ORDER</span> <span class=\"token keyword\">BY</span> ID\n    <span class=\"token keyword\">FOR</span> <span class=\"token keyword\">NO</span> <span class=\"token keyword\">KEY</span> <span class=\"token keyword\">UPDATE</span>\n<span class=\"token punctuation\">)</span> stuck_event\n<span class=\"token keyword\">WHERE</span> <span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span><span class=\"token punctuation\">.</span> <span class=\"token comment\">-- as before</span>\n<span class=\"token keyword\">RETURNING</span> <span class=\"token operator\">*</span><span class=\"token punctuation\">;</span></code></pre></div>\n<p>However, the deadlocks were still appearing intermittently:</p>\n<div class=\"gatsby-highlight\" data-language=\"text\"><pre class=\"language-text\"><code class=\"language-text\">ERROR: deadlock detected\nDetail: Process 7534 waits for ShareLock on 
transaction 93897000; blocked by process 13376.\n        Process 13376 waits for ShareLock on transaction 93896994; blocked by process 7534.\nHint: See server log for query details.\nWhere: while rechecking updated tuple (134131,3) in relation \"events\"</code></pre></div>\n<p>Of course! We still had one transaction/query dealing with records in a non-deterministic way: <code class=\"language-text\">JpaRepository.save(Iterable&lt;S> entities)</code>. In the code, after fetching events to be processed, we flag (<code class=\"language-text\">UPDATE</code>) the processed events as <code class=\"language-text\">processed</code>, and also save (<code class=\"language-text\">INSERT</code>) new events spawned by the processed events. Here's the catch: the event processing is done in <em>parallel</em>, so the resulting list of events (both processed and newly created ones) is not ordered consistently with respect to other transactions. The third refactoring was therefore to restore the order of processed events:</p>\n<div class=\"gatsby-highlight\" data-language=\"java\"><pre class=\"language-java\"><code class=\"language-java\"><span class=\"token comment\">/**\n * Processes events.\n * The result list will be ordered with respect to the given list.\n * All new events will be added at the end of the result list.\n * @param events events to be processed\n * @return list of processed and created events\n */</span>\n<span class=\"token keyword\">private</span> <span class=\"token class-name\">List</span><span class=\"token generics\"><span class=\"token punctuation\">&lt;</span><span class=\"token class-name\">Event</span><span class=\"token punctuation\">></span></span> <span class=\"token function\">doProcessEvents</span><span class=\"token punctuation\">(</span><span class=\"token class-name\">List</span><span class=\"token generics\"><span class=\"token punctuation\">&lt;</span><span class=\"token class-name\">Event</span><span class=\"token punctuation\">></span></span> events<span 
class=\"token punctuation\">)</span> <span class=\"token punctuation\">{</span>\n    <span class=\"token class-name\">ProcessingResult</span> result <span class=\"token operator\">=</span> eventProcessor<span class=\"token punctuation\">.</span><span class=\"token function\">processInParallel</span><span class=\"token punctuation\">(</span>events<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n    <span class=\"token class-name\">List</span><span class=\"token generics\"><span class=\"token punctuation\">&lt;</span><span class=\"token class-name\">Event</span><span class=\"token punctuation\">></span></span> successfulEvents <span class=\"token operator\">=</span> result<span class=\"token punctuation\">.</span><span class=\"token function\">getSucceeded</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span> <span class=\"token comment\">// may contain new events;</span>\n    <span class=\"token class-name\">List</span><span class=\"token generics\"><span class=\"token punctuation\">&lt;</span><span class=\"token class-name\">Event</span><span class=\"token punctuation\">></span></span> failedEvents <span class=\"token operator\">=</span> result<span class=\"token punctuation\">.</span><span class=\"token function\">getFailed</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n    <span class=\"token class-name\">List</span><span class=\"token generics\"><span class=\"token punctuation\">&lt;</span><span class=\"token class-name\">Event</span><span class=\"token punctuation\">></span></span> allEvents <span class=\"token operator\">=</span> <span class=\"token class-name\">ListUtils</span><span class=\"token punctuation\">.</span><span class=\"token function\">union</span><span class=\"token punctuation\">(</span>successfulEvents<span class=\"token 
punctuation\">,</span> failedEvents<span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n    <span class=\"token class-name\">Map</span><span class=\"token generics\"><span class=\"token punctuation\">&lt;</span>UUID<span class=\"token punctuation\">,</span> <span class=\"token class-name\">Integer</span><span class=\"token punctuation\">></span></span> eventIndices <span class=\"token operator\">=</span> <span class=\"token function\">range</span><span class=\"token punctuation\">(</span><span class=\"token number\">0</span><span class=\"token punctuation\">,</span> events<span class=\"token punctuation\">.</span><span class=\"token function\">size</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">.</span><span class=\"token function\">boxed</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span>\n            <span class=\"token punctuation\">.</span><span class=\"token function\">collect</span><span class=\"token punctuation\">(</span><span class=\"token class-name\">CollectorUtils</span><span class=\"token punctuation\">.</span><span class=\"token function\">toMap</span><span class=\"token punctuation\">(</span>i <span class=\"token operator\">-></span> events<span class=\"token punctuation\">.</span><span class=\"token function\">get</span><span class=\"token punctuation\">(</span>i<span class=\"token punctuation\">)</span><span class=\"token punctuation\">.</span><span class=\"token function\">getId</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> i <span class=\"token operator\">-></span> i<span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n    <span class=\"token comment\">// This should sort according to the original 
order and put all new events at the end.</span>\n    allEvents<span class=\"token punctuation\">.</span><span class=\"token function\">sort</span><span class=\"token punctuation\">(</span><span class=\"token function\">comparingInt</span><span class=\"token punctuation\">(</span>event <span class=\"token operator\">-></span> eventIndices<span class=\"token punctuation\">.</span><span class=\"token function\">getOrDefault</span><span class=\"token punctuation\">(</span>event<span class=\"token punctuation\">.</span><span class=\"token function\">getId</span><span class=\"token punctuation\">(</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">,</span> <span class=\"token class-name\">Integer</span><span class=\"token punctuation\">.</span>MAX_VALUE<span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">)</span><span class=\"token punctuation\">;</span>\n\n    <span class=\"token keyword\">return</span> allEvents<span class=\"token punctuation\">;</span>\n<span class=\"token punctuation\">}</span></code></pre></div>\n<p>After this last change, we saw the game for deadlocks was over: <em>\"Goodbye Deadlock!\"</em></p>","fields":{"slug":"/event-store-deadlock/","tags":["deadlock","sql","transaction","postgresql"]}}},{"node":{"id":"f1483025-8f93-596d-89a4-e459c4718414","frontmatter":{"category":"Coding","title":"Database Transaction Isolation","date":"2018-06-20","summary":"What is transaction isolation level? What are differences? 
What are the default values for RDBMSs?","thumbnail":{"relativePath":"pages/transactionisolation/thumbnail.png","childImageSharp":{"resolutions":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAKCAYAAAC0VX7mAAAACXBIWXMAAAsSAAALEgHS3X78AAABIUlEQVQoz5VS7U7DMAzsa6A1cRynTT/WrTBNiB8gJN7/mczFVWHTWiF+nM750OXsS8XMGjhqbDplaRZuMupWKSZ1zqn3/g5EtIsqBNIAoTRdNeajpuPFIMOs0p+0Pjypqw9AvbADQ7Q8tAUIBg1wwk0PwdGEuO2VUTPcynAGnrE3GlNsFnHnH5wXVGvLQbLmy4ehe/3SfP20OnWTpmHQbh41n9BB32IvquS4OQ4TpLIgj7mhDh6OyyPB1t47c1TYu4UJYyr3dwXtgGpNmSFCgDdOHcbBZO3dw/2I3da/Dlm0nd8ww0ljf14CAsv4gofCppNV7MGhoD1uB8zrHa2wfZUSRuGSvve0K7gZSkrpJvZ6wfo9LE2362ZTUET+vPwfh9/1JBQjnS3wvwAAAABJRU5ErkJggg==","width":677,"height":325,"src":"/static/2691083d1efbcb8a271d80bf9ff6940f/b3029/thumbnail.png","srcSet":"/static/2691083d1efbcb8a271d80bf9ff6940f/b3029/thumbnail.png 1x"}}},"authorName":"Hayk Jhangiryan","authorDescription":"Hayk is Senior Java Engineer at AUTO1 Group.","authorAvatar":{"relativePath":"pages/transactionisolation/avatar.png","childImageSharp":{"resolutions":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAACXBIWXMAAAsSAAALEgHS3X78AAACyElEQVQ4y4VUO2gUURR9mZmN8YcItoKVjeXuzsxu9jOzm92NkBBTqDEmjZAEgtqIpSgI1gp+uhS6kEI7wc7GQsFCLERiISGCaGHhF02yu+M5s/cujyC4cJY379573r3n3XdNvV51xsYiJ4rKbqlUyBj8wjAYyWazp4BVYB3YAraBDeAhMBcE/l76lkrFTLVachuNyCGXEbJMrVbx6ADnaWAtn88nvu8nWBMdQcJ9Auv3wCxjWq2aC9IMSF3DzCyyG1YAs9pSYkL2melmLpfT71tKGkUlz2iZJAuCgA5dEjGAwPoV0AbuAy+tfR7YkZjb5CiXR1OutEw5jWQdCfgCnEiSxNg/7LWAj+KTSiHSzBvrAtascrriPCmnDkP4AYQ0Em3TGPHf8P38ARpP/4PsHQ7aU69Hqdi4OJeI47JXLg9IX1tVbUuWCzSsygcJO+L0goGNRpySoB1colarerhEvcCn4tuxCB/TsC5ZdS3Ct2Ho75uZmRoigWbIbgCxt7Q0O8TLsgg17o2R2+JHzzJcV/1IoBlybel42SLsCcc3I6UqoaZ+lUHFYjjCzJSQa+4J4UVLKiX8ScOGVXJXDN+xd7SfZXFYS65U+n2Wz+cOw+ezHadSkfCRdVJiZdlmMG9ZtHMpgWR3Z0eMSvWExjmrbXoCbZ951dIim7Yk6u1IYtlwavChW51Pp01tXOwf01eC9RHs/Rbbnx2Hf4IUh/Q5zdol8H3KG22Pj9edOK54LF1876pd37TELhqdZ+J4MwxDHUvXgDOFQrB7YqLpjI4WdhG8GLYNbCeBK/1+TWNWRG+Pfy5Hj5Dek3IXzX9+8DkL/AIeLC/PDU1NH
XdSQk5altNsDkjPAT+ArzKyFnK5bB2IxbYik4hanmfM5GTL6Q/Y2DUc25y0HLToubR8TI39cL4APJOxr/35AXgOXEJnHKQvepP6eiQj119kDrZjqLo8HAAAAABJRU5ErkJggg==","width":50,"height":50,"src":"/static/26a7a327ccc4b335712e5a7086f2b26d/45876/avatar.png","srcSet":"/static/26a7a327ccc4b335712e5a7086f2b26d/45876/avatar.png 1x,\n/static/26a7a327ccc4b335712e5a7086f2b26d/eb85b/avatar.png 1.5x,\n/static/26a7a327ccc4b335712e5a7086f2b26d/4f71c/avatar.png 2x,\n/static/26a7a327ccc4b335712e5a7086f2b26d/9ec3e/avatar.png 3x"}}},"headerImage":{"relativePath":"pages/transactionisolation/header-image.png","childImageSharp":{"resolutions":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAKCAYAAAC0VX7mAAAACXBIWXMAAAsSAAALEgHS3X78AAABIUlEQVQoz5VS7U7DMAzsa6A1cRynTT/WrTBNiB8gJN7/mczFVWHTWiF+nM750OXsS8XMGjhqbDplaRZuMupWKSZ1zqn3/g5EtIsqBNIAoTRdNeajpuPFIMOs0p+0Pjypqw9AvbADQ7Q8tAUIBg1wwk0PwdGEuO2VUTPcynAGnrE3GlNsFnHnH5wXVGvLQbLmy4ehe/3SfP20OnWTpmHQbh41n9BB32IvquS4OQ4TpLIgj7mhDh6OyyPB1t47c1TYu4UJYyr3dwXtgGpNmSFCgDdOHcbBZO3dw/2I3da/Dlm0nd8ww0ljf14CAsv4gofCppNV7MGhoD1uB8zrHa2wfZUSRuGSvve0K7gZSkrpJvZ6wfo9LE2362ZTUET+vPwfh9/1JBQjnS3wvwAAAABJRU5ErkJggg==","width":838,"height":402,"src":"/static/2691083d1efbcb8a271d80bf9ff6940f/d5ccc/header-image.png","srcSet":"/static/2691083d1efbcb8a271d80bf9ff6940f/d5ccc/header-image.png 1x"}}}},"html":"<h1>Database Transaction Isolation</h1>\n<h2>What is transaction isolation level?</h2>\n<p>Transaction isolation level defines the degree to which one transaction must be isolated from resource or data modifications made by other transactions. </p>\n<p>Let's have a look at phenomena which can occur during the execution of concurrent transactions:</p>\n<ul>\n<li>\n<p>Dirty read - A transaction may read data written but not yet committed by other transactions.\n           <p>example: Transaction T1 modifies a row. Transaction T2 then reads that row before T1 performs a COMMIT. 
If T1 then performs a ROLLBACK, T2 has read a row that was never committed and that may be considered to have never existed.</p></p>\n</li>\n<li>\n<p>Non-repeatable read - A transaction re-reads data it had previously read and finds that data has been modified by another transaction (that committed since the initial read).\n           <p>Example: Transaction T1 reads a row. Transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.</p></p>\n</li>\n<li>\n<p>Phantom read - A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.\n<p>Example: Transaction T1 reads the set of rows N that satisfy some <em>search condition</em>. Transaction T2 then executes SQL statements that generate one or more rows that satisfy the <em>search condition</em> used by transaction T1. If transaction T1 then repeats the initial read with the same <em>search condition</em>, it obtains a different set of rows.</p></p>\n</li>\n</ul>\n<p>Isolation is the <em>I</em> in the acronym <a href=\"https://en.wikipedia.org/wiki/ACID\"><em>ACID</em></a>. In ACID, atomicity and durability are strict requirements, whereas consistency and isolation are, to a degree, configurable. 
Consistency and isolation are closely related, so much so that the SQL standard defines four levels of transaction isolation based on the consistency they provide (from least to most consistent):</p>\n<table>\n<thead>\n<tr>\n<th>ISOLATION LEVEL</th>\n<th align=\"center\">DIRTY READ</th>\n<th align=\"center\">NON-REPEATABLE READ</th>\n<th align=\"center\">PHANTOM READ</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>READ UNCOMMITTED</td>\n<td align=\"center\">possible</td>\n<td align=\"center\">possible</td>\n<td align=\"center\">possible</td>\n</tr>\n<tr>\n<td>READ COMMITTED</td>\n<td align=\"center\">--</td>\n<td align=\"center\">possible</td>\n<td align=\"center\">possible</td>\n</tr>\n<tr>\n<td>REPEATABLE READ</td>\n<td align=\"center\">--</td>\n<td align=\"center\">--</td>\n<td align=\"center\">possible</td>\n</tr>\n<tr>\n<td>SERIALIZABLE</td>\n<td align=\"center\">--</td>\n<td align=\"center\">--</td>\n<td align=\"center\">--</td>\n</tr>\n</tbody>\n</table>\n<p>The isolation levels are defined by which of the above phenomena they allow or prevent (as described by the SQL standard). Note that transaction isolation does not affect how a transaction sees its own changes: a transaction always sees all the changes it has made itself.</p>\n<h2>Now what?</h2>\n<p>Inside Java services, the transaction isolation level can be controlled with Spring's <a href=\"https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/transaction/annotation/Transactional.html\"><code class=\"language-text\">@Transactional</code></a> annotation, which supports an <a href=\"https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/transaction/annotation/Transactional.html#isolation--\"><code class=\"language-text\">isolation</code></a> attribute. 
If you do not specify the isolation level explicitly in the code, the default setting applies, which means that the default isolation level of the underlying datastore is used.</p>\n<p>It is worth mentioning that default isolation levels differ between systems. For example, the default isolation levels for <em>MySQL 5.7 (InnoDB)</em> and <em>PostgreSQL 9.5</em> are <em>REPEATABLE READ</em> and <em>READ COMMITTED</em>, respectively. To check the default isolation level, query:</p>\n<ul>\n<li><code class=\"language-text\">SHOW VARIABLES LIKE 'tx_isolation'; -- on MySQL</code></li>\n<li><code class=\"language-text\">SHOW default_transaction_isolation; -- on PostgreSQL</code></li>\n</ul>\n<h3>References</h3>\n<ol>\n<li><a href=\"https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html\">MySQL Transaction Isolation Levels</a></li>\n<li><a href=\"https://www.postgresql.org/docs/9.6/static/transaction-iso.html\">Transaction Isolation in PostgreSQL</a></li>\n<li><a href=\"https://dzone.com/articles/spring-transaction-management\">Useful article</a></li>\n</ol>","fields":{"slug":"/transactionisolation/","tags":["coding","sql","transactions"]}}}]}},"pageContext":{"slug":"/tags/sql","tag":"sql","categories":["Architecture","Coding","DevOps","Engineering","ProjectManagement","QA","Social","TechRadar"]}}