<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title></title>
    <description>A peek into our tech stack and how we use it</description>
    <link>http://tech.forzafootball.com/</link>
    <atom:link href="http://tech.forzafootball.com/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Mon, 11 Jul 2022 18:50:09 +0000</pubDate>
    <lastBuildDate>Mon, 11 Jul 2022 18:50:09 +0000</lastBuildDate>
    <generator>Jekyll v3.9.2</generator>
    
        <item>
            <title>Our Test automation journey</title>
            <description>&lt;p&gt;We have been doing test automation for a year and a half, and we are really proud of where we are today. This is the story of how we got here and where we are right now.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Angle grinder&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/stormtrooper.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Keep in mind that this is a description of our journey and it should not be used as a template of how to do it. We hope that our story can help or inspire others on their journey; if it does, then our purpose for writing this will have been fulfilled.&lt;/p&gt;

&lt;h2 id=&quot;selecting-the-framework&quot;&gt;Selecting the framework&lt;/h2&gt;

&lt;p&gt;Choosing a framework was a big challenge. You always need to keep many things in mind: what kind of tests will be written, what language you want to use, will developers be involved, who will write tests, who will support them, how you want to get the results, how much you want to cover and so on.&lt;/p&gt;

&lt;h4 id=&quot;cross-platform&quot;&gt;Cross platform&lt;/h4&gt;

&lt;p&gt;Since our application is cross-platform (iOS and Android), the obvious choice was a cross-platform framework. The biggest advantage is that there will be a common part and that you, as a test automation (TA) engineer, do not need to learn several languages and support several systems. Also, there will only be one reporting system, which is the most important part of testing. Of course, it is not that simple: you are working with two completely different applications despite their similar designs, and even with one framework you will, in most cases, need to write different tests for each. So what is the point? The most complicated part is not writing tests but working with data and generating the right reports. With a cross-platform framework, those parts stay the same on both platforms.&lt;/p&gt;

&lt;h5 id=&quot;make-sure-there-is-an-active-community-working-with-the-framework-you-choose&quot;&gt;Make sure there is an active community working with the framework you choose!&lt;/h5&gt;

&lt;p&gt;With that in mind, the task was not that hard anymore. There are not that many frameworks that support both platforms. Frankly, if you want to choose something that will last for several years and will be easy to work with then you need something that is widely used and documented well. The last part is important. When you use the framework there will be problems you will not be able to solve yourself. You will have to google, A LOT, and ask the developers of the framework, and google again. This is unavoidable. We had to be sure the framework community regularly answers questions and that there are a lot of answers on websites such as StackOverflow, or Google groups. Also, the language… it will be much easier if you know it.&lt;/p&gt;

&lt;h4 id=&quot;test-the-framework&quot;&gt;Test the framework&lt;/h4&gt;

&lt;p&gt;So the search began. The final candidates were &lt;a href=&quot;http://appium.io/&quot;&gt;Appium&lt;/a&gt; and &lt;a href=&quot;https://calaba.sh/&quot;&gt;Calabash&lt;/a&gt;. As a backup, we had &lt;a href=&quot;https://developer.android.com/training/testing/ui-testing/espresso-testing&quot;&gt;Espresso&lt;/a&gt; for Android tests and &lt;a href=&quot;https://gopekannan.wordpress.com/2018/02/16/ios-ui-testing-using-apple-xcuitest/&quot;&gt;XCUITest&lt;/a&gt; for iOS. Appium looked great: it was quite easy to install everything and write the first test, and it supports several languages to choose from. Java suited us really well. However, after a successful first test on Android came a huge disappointment: iOS was slow. Not the kind of slow you can deal with, but REALLY slow. Whatever solution we tried, nothing worked; days were wasted and Appium was still too slow. Time to move on.&lt;/p&gt;

&lt;h4 id=&quot;justworks&quot;&gt;#justworks&lt;/h4&gt;

&lt;p&gt;The second on the list was Calabash. Calabash only supports Ruby, which is fine, because part of our legacy backend is written in Ruby. This means that developers will be able to help out and educate. Calabash did not have a fancy UI compared to Appium and it seemed to be less used. Nonetheless, it was considered one of the best options out there. Surprisingly, after an hour of reading docs and preparing the environment, everything just worked. It worked equally well on both platforms. After years of fighting with different kinds of software, I understood the golden rule: if it works without much effort, stick with it. That was it.&lt;/p&gt;

&lt;h4 id=&quot;architecture&quot;&gt;Architecture&lt;/h4&gt;

&lt;p&gt;Whenever you read about test automation, you always run into warnings that it is not worth the effort, because it always becomes so hard to maintain. So, how did we tackle this?&lt;/p&gt;

&lt;h4 id=&quot;given-when-then&quot;&gt;&lt;a href=&quot;https://martinfowler.com/bliki/GivenWhenThen.html&quot;&gt;Given When Then&lt;/a&gt;&lt;/h4&gt;

&lt;p&gt;One great thing you can do is to write scenarios using &lt;a href=&quot;https://cucumber.io/&quot;&gt;Cucumber&lt;/a&gt;. Using this pattern when writing tests really helps in making the tests easy to read, not just for those of us working with test automation but for everyone within the company. It gives you a great error report format while at the same time describing the behaviour of the system, and it actually gives you a &lt;a href=&quot;https://gojko.net/books/specification-by-example/&quot;&gt;living documentation&lt;/a&gt; of the system.&lt;/p&gt;

&lt;p&gt;It is also very easy for the quality assurance engineers to see what is covered by test automation and what they should verify manually. We have even used these scenarios to teach new engineers the behaviour of the app. When we rebuilt parts of the app in a project, we often got stuck in discussions on how a particular feature was supposed to work but then we could simply look at the test cases to sort out the behaviour.&lt;/p&gt;

&lt;h4 id=&quot;page-object-pattern&quot;&gt;Page object pattern&lt;/h4&gt;

&lt;p&gt;The page object pattern is widely used in test engineering, and it gives your code a nice structure, making it reusable and maintainable. Adopting it turned out to be a great idea: it became easier to navigate between tests and to see the common parts and differences between Android and iOS. We have implemented it so that there is a page object class for each view (or sometimes part of a view), and all of the IDs and methods for interacting with the controls of that view are gathered in one place. Some useful links:
&lt;a href=&quot;https://martinfowler.com/bliki/PageObject.html&quot;&gt;Martin Fowler - Page Object&lt;/a&gt;,
&lt;a href=&quot;https://medium.com/tech-tajawal/page-object-model-pom-design-pattern-f9588630800b&quot;&gt;Page Object Model (POM) | Design Pattern&lt;/a&gt;&lt;/p&gt;

&lt;h4 id=&quot;even-more-maintainable&quot;&gt;Even more maintainable&lt;/h4&gt;

&lt;p&gt;We also tried to keep our structure modular. The reason is simple: when you decide on a test architecture, you should think about what happens if the framework you have chosen dies. Should you just throw away years of work and start all over again? That is not the best way of doing it. All of our tests have separate layers which, in case of emergency, can be moved to a different framework or easily rewritten.&lt;/p&gt;

&lt;p&gt;This recently happened to us with a module that reads all of the data for ads since we are switching to a different system. Now we have to rewrite only that part and nothing else.&lt;/p&gt;

&lt;p&gt;We also have a helper class with common methods, a separate class for our constants (keeping them in one place helps a lot when you want to change timeouts or something similar), several data classes for getting data and filtering it, and a couple more classes divided according to their purpose.&lt;/p&gt;

&lt;h2 id=&quot;continuous-integration-ci&quot;&gt;Continuous Integration (CI)&lt;/h2&gt;

&lt;p&gt;After the framework was set in place and we covered some parts of our app with tests, we needed to make sure everything worked as expected, so it was now time to set up the way we run everything on a regular basis. So we needed CI.&lt;/p&gt;

&lt;h4 id=&quot;start-with-hosting-the-ci-locally&quot;&gt;Start with hosting the CI locally&lt;/h4&gt;

&lt;p&gt;Throughout the company there was no common way to run builds and unit tests for clients (except for backend, they were doing great), so we decided that we needed something for that as well, not only for end-to-end (E2E) tests. The idea was to find something that can be run locally, at least from the start. We had to use emulators or real devices for E2E tests, and a local setup is easier to maintain yourself, gives you fewer flaky tests, and is more stable. Should that work well and we need to grow bigger, there will always be an option to move everything to the cloud. We had our Mac mini as a server where we could start testing and trying different tools.&lt;/p&gt;

&lt;h4 id=&quot;team-city-vs-jenkins&quot;&gt;Team City vs. Jenkins&lt;/h4&gt;

&lt;p&gt;After googling and trying different ones, the two best candidates were &lt;a href=&quot;https://www.jetbrains.com/teamcity/&quot;&gt;Team City&lt;/a&gt; and &lt;a href=&quot;https://jenkins.io/&quot;&gt;Jenkins&lt;/a&gt;. Team City did not really work as we had hoped, and we struggled to set everything up. Nothing worked, so we decided to try the same thing with Jenkins and, if that did not work, come back to Team City. However, we never needed to. Jenkins just worked. It has thousands of plugins for anything you need, which is nice. The downside is that plugins are quite complex to deal with, because they are written by different people and there is not always an obvious way of using them. I would say it is quite hard and time-consuming to set everything up, but once it is done you barely need to do anything (only when something changes in the plugins).&lt;/p&gt;

&lt;p&gt;This is how it looks now:&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Jenkins setup&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/jenkins-setup.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;choose-what-to-automate-with-e2e-tests&quot;&gt;Choose what to automate with E2E tests&lt;/h2&gt;

&lt;p&gt;Another big question within test automation is “What should we automate?”. This is the approach we took to answering it.&lt;/p&gt;

&lt;h4 id=&quot;based-on-analyticsour-most-used-views&quot;&gt;Based on analytics—our most used views&lt;/h4&gt;

&lt;p&gt;We started by gathering data about the usage of the different views and the more views a page had, the higher priority it got. We already had analytics in place, so it was quite easy to get this data. With some help from our colleagues, we also learned some &lt;a href=&quot;https://cloud.google.com/bigquery/&quot;&gt;BigQuery&lt;/a&gt;, which is very useful to know as a test engineer. Backing up an argument with data, as to why we should spend time fixing a bug, will definitely increase your credibility.&lt;/p&gt;

&lt;h4 id=&quot;learn-along-the-way&quot;&gt;Learn along the way&lt;/h4&gt;

&lt;p&gt;We also made sure to cover critical bugs that we found either during release testing or that got out to our users. We held a post-mortem for critical bugs where we discussed how we could prevent them from happening again. Not all bugs were suited to being covered by E2E tests; some led to a unit test instead, or in some cases even a manual test.&lt;/p&gt;

&lt;h4 id=&quot;trust-the-tests&quot;&gt;Trust the tests&lt;/h4&gt;

&lt;p&gt;Sometimes it is just not worth the effort of creating an automated test. If a test is very hard to build, or if you are building it knowing that it might get flaky, it is probably better not to build it. If you still think it is worth the effort, make sure to run it in a separate build where you can check it manually. A test suite that is constantly red will only make you ignore it, and you will miss important bugs.&lt;/p&gt;

&lt;h4 id=&quot;test-data-is-tricky&quot;&gt;Test data is tricky&lt;/h4&gt;

&lt;p&gt;Test data is always an issue, and we tried to avoid building tests that were dependent on data, but eventually we couldn’t avoid it anymore. We are working on a solution where we can start the client with a predefined state, but until that is in place, we are using our API to get IDs of entities that match our needs. This actually turned out to be a really good solution, but we’ve decided to move on with the more complex solution since that will give even more stability to the tests and fulfil some needs that we don’t cover today.&lt;/p&gt;

&lt;h2 id=&quot;reporting&quot;&gt;Reporting&lt;/h2&gt;

&lt;p&gt;Test runs without good reporting do not make any sense. This part is really worth putting some extra effort into. If you have readable reports, people will get engaged. It also makes your daily work with the tests so much easier, if you can get a good overview of their state.&lt;/p&gt;

&lt;h4 id=&quot;good-default-reports&quot;&gt;Good default reports&lt;/h4&gt;

&lt;p&gt;For E2E test reports we used a standard &lt;a href=&quot;https://wiki.jenkins.io/display/JENKINS/Cucumber+Reports+Plugin&quot;&gt;Cucumber plugin&lt;/a&gt; which generates a nice web page integrated in Jenkins.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Standard Cucumber report&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/standard-cucumber-report.png&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;customized-test-result-reports&quot;&gt;Customized test result reports&lt;/h4&gt;

&lt;p&gt;After a while we changed the concept of how we treat our results. The idea was to write all the scenarios we want to cover for all features of the app, with the ones not yet automated left in a pending state. That is how we measure the coverage of the system. The issue was that the standard report treated pending tests as failures, and we did not want that. This forced us to write our own report. Actually, that is what I really like about Jenkins: it is amazingly customizable. After our report generator was ready, it was really easy to integrate it with Jenkins.&lt;/p&gt;

&lt;p&gt;So now it looks like this (much more modern and stylish, we like it much better):&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Custom Cucumber report&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/custom-cucumber-report.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We also use the &lt;a href=&quot;https://wiki.jenkins.io/display/JENKINS/JUnit+Plugin&quot;&gt;JUnit plugin&lt;/a&gt; for unit tests and the Slack notification plugin for Slack reports.&lt;/p&gt;

&lt;h4 id=&quot;slack&quot;&gt;Slack&lt;/h4&gt;

&lt;p&gt;We have our custom Slack reports for E2E tests as well: a simple Ruby app that generates exactly what we need with the right data and logo (luckily Slack has a very nice open API that everyone can use). It was very easy to call it from a Jenkins job as a post-build task.&lt;/p&gt;

&lt;p&gt;Most of our builds report to different Slack channels. There is one &lt;em&gt;test-automation&lt;/em&gt; channel where most E2E test reports are displayed, while some builds send reports to the iOS, Android, or even squad channels. Each report contains a link to a more detailed test report where you can see exactly which step is failing. You can even see screenshots of the state the client was in when the test failed.&lt;/p&gt;

&lt;p&gt;These reports are used by people in all different roles. It can be an iOS developer checking what failed in the latest nightly build or a QA engineer checking the state of the client before releasing.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Slack message&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/slack-message.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;CI report&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/ci-report.jpg&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;feature-coverage-report&quot;&gt;Feature coverage report&lt;/h4&gt;

&lt;p&gt;This report is very useful when investigating whether a failing test has failed for the first time in a long while, or whether there is a pattern to the failures. For example, we found that one of our tests was failing every Sunday because of a bug in the test code that only showed up on Sundays. This would have been very hard to investigate without this report. Since you get the dates on which a test failed, it is really easy to go back to Jenkins and see if it fails on the same thing. If you have flaky tests (which you, of course, should not have, but let’s face it, sometimes we just do), this is very valuable for making sure you are fixing the flakiness.&lt;/p&gt;

&lt;p&gt;We also display the feature coverage percentage here, which tells us how many of the features we intend to cover with E2E tests are actually covered.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Coverage report&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/coverage-report.jpg&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;nanoleaf&quot;&gt;&lt;a href=&quot;https://nanoleaf.me/en/&quot;&gt;Nanoleaf&lt;/a&gt;&lt;/h4&gt;

&lt;p&gt;Finally, as our most high-level report, we have a triangle made of Nanoleaf panels, where each panel represents a selected build. This is the result of a hack week we had in one of the squads.&lt;/p&gt;

&lt;p&gt;Here, the bottom row represents client builds and unit test runs and the middle layer shows if E2E tests on iOS and Android have passed or failed.&lt;/p&gt;

&lt;p&gt;It also keeps track of other essential things, like if it is Friday and Beer O’clock :)&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Nanoleafs&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/nanoleafs.jpg&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;it-is-all-about-visualization&quot;&gt;It is all about visualization&lt;/h4&gt;

&lt;p&gt;We have put them in a place where almost everyone passes by, so when tests are red people ask us why they are failing. It has been a good and fun way to visualize test results.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Nanoleafs in the office&quot; src=&quot;/assets/posts/2019-02-26-our-test-automation-journey-at-forza/nanoleafs-in-office.jpg&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;Our path was long and complicated but also fun and exciting! And it is far from over. We hope that our journey will bring some value and inspiration. Good luck and remember: the more challenges you face, the stronger you get.&lt;/p&gt;
</description>
            <pubDate>Tue, 26 Feb 2019 13:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/our-test-automation-journey-at-forza</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/our-test-automation-journey-at-forza</guid>
            
            
                <category>Test automation</category>
            
                <category>Jenkins</category>
            
                <category>end-to-end tests for mobile application</category>
            
        </item>
    
        <item>
            <title>How I sped up my XML parser by 15 times</title>
            <description>&lt;p&gt;In &lt;a href=&quot;/blog/binary-parsing-optimizations-in-elixir&quot;&gt;the previous blog post&lt;/a&gt; we have showcased how possible optimizations can be made to binary parsing implementation in Elixir. This article aims to delineate how I have applied these techniques, in order to drastically speed up XML parsing.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Son Doong Cave&quot; src=&quot;/assets/posts/2018-04-19-how-i-sped-up-my-xml-parser-by-15-times/son-doong.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;XML is one of the most widely used formats for data interchange on the web today. At Forza Football, in addition to JSON and MessagePack, we use XML in several facets of our work, such as receiving football data from providers and obtaining headlines from RSS feeds. Ensuring that we have an efficient method to parse the data we receive will boost the performance of the application as a whole and thus improve user experience. With speed and usability as my two ultimate goals, I started writing the first implementation of &lt;a href=&quot;https://github.com/qcam/saxy&quot;&gt;the Saxy project&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;not-so-saxy&quot;&gt;Not so “saxy”&lt;/h2&gt;

&lt;p&gt;After two months of work and two aborted proofs of concept, the first version was published to the world. Since speed was the goal of this project from the beginning, I decided to benchmark Saxy against other XML parsing libraries.&lt;/p&gt;

&lt;p&gt;Benchmarking with XML is hard because the result highly depends on the complexity of the document, so a relatively simple XML file was picked. The chosen contestants were &lt;a href=&quot;http://erlang.org/doc/man/xmerl_sax_parser.html&quot;&gt;xmerl&lt;/a&gt; (the standard XML parser in Erlang/OTP), &lt;a href=&quot;https://hex.pm/packages/erlsom&quot;&gt;Erlsom&lt;/a&gt;, and &lt;a href=&quot;https://github.com/processone/fast_xml&quot;&gt;fast_xml&lt;/a&gt;. But in the end I could not get fast_xml to work as intended, so only xmerl and Erlsom went into &lt;a href=&quot;https://gist.github.com/qcam/6edc7f8a92340492b2eba73f5f12f0fa&quot;&gt;the benchmark script&lt;/a&gt;.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Microseconds per run&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Saxy 0.3.0&lt;/td&gt;
      &lt;td&gt;24.4232&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;xmerl 1.3.16&lt;/td&gt;
      &lt;td&gt;299.242&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Erlsom 1.4.1&lt;/td&gt;
      &lt;td&gt;65.8912&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;“Eureka! Eureka! Saxy is 2.7 times faster than Erlsom and 12 times faster than xmerl 🎉🎉🎉” I excitedly screamed out.&lt;/p&gt;

&lt;p&gt;But after taking some time to cool down and reflect, I realized that I had made a fatal mistake right in the benchmark script. When the script was fixed, reality slapped me really hard in the face.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Microseconds per run&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Saxy 0.3.0&lt;/td&gt;
      &lt;td&gt;943.401&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;xmerl 1.3.16&lt;/td&gt;
      &lt;td&gt;285.5068&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Erlsom 1.4.1&lt;/td&gt;
      &lt;td&gt;66.8736&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;“Focus on speed … 😏, good that now you know how ignorant you are!”, I told my disappointed self.&lt;/p&gt;
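
&lt;p&gt;To put a number on that slap, the gap can be worked out with a few lines of Elixir. This is only a sketch over the timings from the corrected table above; the variable names are arbitrary.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Microseconds per run, copied from the corrected benchmark table.
timings = [{&quot;xmerl 1.3.16&quot;, 285.5068}, {&quot;Erlsom 1.4.1&quot;, 66.8736}]
saxy = 943.401

# Divide Saxy's time by each competitor's time and round to one decimal.
for {name, time} &amp;lt;- timings do
  IO.puts(&quot;Saxy 0.3.0 takes #{Float.round(saxy / time, 1)} times as long as #{name}&quot;)
end
# prints:
# Saxy 0.3.0 takes 3.3 times as long as xmerl 1.3.16
# Saxy 0.3.0 takes 14.1 times as long as Erlsom 1.4.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In other words, on this document the first release took roughly 14 times as long as Erlsom and a bit more than 3 times as long as xmerl.&lt;/p&gt;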

&lt;h2 id=&quot;shortcomings&quot;&gt;Shortcomings&lt;/h2&gt;

&lt;p&gt;I decided to give it another try, but with a different strategy this time. Instead of throwing away yet another proof of concept and writing a new implementation, I spent a few days pondering why the previous parser was so unacceptably slow. What I did was: 1) study how Erlsom does it (thus learning from the winner), 2) read &lt;a href=&quot;http://erlang.org/doc/efficiency_guide/&quot;&gt;the Erlang Efficiency Guide&lt;/a&gt;, and 3) use &lt;a href=&quot;/blog/binary-parsing-optimizations-in-elixir&quot;&gt;our previous blog post&lt;/a&gt; as a guideline.&lt;/p&gt;

&lt;p&gt;It became clear that there were some shortcomings in my original approach:&lt;/p&gt;

&lt;h3 id=&quot;sub-binary-creation&quot;&gt;Sub-binary creation&lt;/h3&gt;

&lt;p&gt;Let’s look at a common rule matching function in Saxy.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:&amp;lt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_token_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:&amp;lt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:Name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:Name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:SAttribute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;attributes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;zero_or_more&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:SAttribute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]),&lt;/span&gt;
         &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:S&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_s_char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;zero_or_one&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:S&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A quick summary of what the code does: it tries to match the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;&lt;/code&gt; token, then the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:Name&lt;/code&gt; rule, then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:SAttribute&lt;/code&gt;, and finally &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:S&lt;/code&gt;. For each token/rule that is matched, it returns a four-element tuple: the status, the matched token/rule, the buffer with its current position, and the parsing state.&lt;/p&gt;

&lt;p&gt;But why does this make the whole parsing flow slow? Because every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match&lt;/code&gt; function returns a sub-binary, the compiler cannot keep reusing the initial match context. This topic has been extensively covered in the &lt;a href=&quot;/blog/binary-parsing-optimizations-in-elixir&quot;&gt;binary parsing optimization post&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;pattern-matching-mania&quot;&gt;Pattern Matching Mania&lt;/h3&gt;

&lt;p&gt;In Erlang, pattern matching in function heads, as well as in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;case&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;receive&lt;/code&gt;, is usually optimized.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Foo&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The generated Erlang VM instructions below show how the compiler rearranges the clauses to produce more optimized code: both list clauses are grouped under a single non-empty-list test, and the tuple clause is checked only when that test fails. See &lt;a href=&quot;/blog/binary-parsing-optimizations-in-elixir#diving-deeper-for-more-optimizations&quot;&gt;here&lt;/a&gt; for more information on how to generate this assembly code.&lt;/p&gt;

&lt;div class=&quot;language-erlang highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;line&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;location&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;'lib/foo.ex'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}]},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;Foo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;atom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_nonempty_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;% &amp;lt;—————————————
&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;% &amp;lt;—————————————
&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;is_tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;% &amp;lt;—————————————
&lt;/span&gt;   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;test_arity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_tuple_element&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}},&lt;/span&gt;
   &lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;return&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Unfortunately, binary matching is one of the exceptions: for binary patterns, the Erlang compiler &lt;strong&gt;does not rearrange clauses yet&lt;/strong&gt;.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;--&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:&quot;--&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;--&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:&quot;--&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;:error&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;--&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:CommentChar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;:error&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:CommentChar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;byte_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)}}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# 50 more token-matching clauses.&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:HexChar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hex_char?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;byte_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)}}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:error&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Pattern matching is absolutely one of the coolest features in Elixir/Erlang and makes writing conditional logic easier than ever, especially for someone with a background in other languages. However, like everything else in the Universe, using it excessively will eventually come back to bite us.&lt;/p&gt;

&lt;p&gt;In the code above, the token type is passed as the second argument of every &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match_token&lt;/code&gt; clause. In order to match a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:HexChar&lt;/code&gt; token, the Erlang VM has to try &lt;strong&gt;every clause&lt;/strong&gt; of the function from top to bottom (because the first argument is a binary pattern) before it reaches the clause that genuinely does the matching work. Working this way costs a great deal of overhead and gains nothing in return.&lt;/p&gt;
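
&lt;p&gt;One way to avoid this scan, sketched here under assumptions, is to give each token type its own function, so that the token type is resolved by the call itself rather than by a second argument. The module and function names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TokenSketch&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match_hex_char/1&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;match_comment_end/1&lt;/code&gt;) are hypothetical and only illustrate the idea; they are not the actual Saxy API.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;defmodule TokenSketch do
  # Hypothetical names for illustration; not part of Saxy.
  # Each token type has its own function, so matching a hex character
  # never walks through binary clauses that belong to other tokens.
  def match_hex_char(&amp;lt;&amp;lt;charcode::utf8, _rest::bits&amp;gt;&amp;gt;)
      when charcode in ?0..?9 or charcode in ?a..?f or charcode in ?A..?F do
    char = &amp;lt;&amp;lt;charcode::utf8&amp;gt;&amp;gt;
    {:ok, {char, byte_size(char)}}
  end

  def match_hex_char(_buffer), do: :error

  # The comment terminator gets its own two-clause function as well.
  def match_comment_end(&amp;lt;&amp;lt;&quot;--&amp;gt;&quot;, _rest::bits&amp;gt;&amp;gt;), do: {:ok, {&quot;--&amp;gt;&quot;, 3}}
  def match_comment_end(_buffer), do: :error
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this layout, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TokenSketch.match_hex_char(&quot;fa&quot;)&lt;/code&gt; returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:ok, {&quot;f&quot;, 1}}&lt;/code&gt; after trying a single clause, and a non-hex input falls straight through to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:error&lt;/code&gt;.&lt;/p&gt;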

&lt;h3 id=&quot;binary-construction&quot;&gt;Binary construction&lt;/h3&gt;

&lt;p&gt;For better usability, Saxy was designed to emit SAX event data as binaries instead of character lists, as other libraries do. This makes all subsequent operations easier for the library users.&lt;/p&gt;

&lt;p&gt;I am going to demonstrate how Saxy previously operated:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero_or_more&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;position&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;zero_or_more&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;current_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mismatch_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rule&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new_buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mismatch_pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_binary&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;match_token&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:NameChar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name_char?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;char&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;byte_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)}}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
    &lt;span class=&quot;ss&quot;&gt;:error&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;At first, I suspected that appending to binaries was creating a bottleneck through repeated allocations, but the Erlang VM turned out to be way smarter than I expected. Unlike binary matching, which is optimized by the compiler, binary appending is optimized (primarily to avoid copying) by the &lt;strong&gt;runtime system&lt;/strong&gt;, so the optimization fails in only a few circumstances.&lt;/p&gt;

&lt;p&gt;The good thing about open source is that you can learn from any random person on the Internet by looking at their code. I peeked at how other libraries implemented this and learned that &lt;a href=&quot;https://hex.pm/packages/jason&quot;&gt;Jason&lt;/a&gt; took a fairly smart approach. As mentioned, binaries in Erlang are designed so that they can be referenced internally: a sub-binary is only a reference into part of another binary. So instead of constructing the result binary piece by piece, we can track the offset and length of the matched part and slice it out of the original binary with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Kernel.binary_part/3&lt;/code&gt;.&lt;/p&gt;
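
&lt;p&gt;A minimal sketch of the slicing idea, assuming for simplicity that a name consists of ASCII letters only: the hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SliceSketch&lt;/code&gt; module counts the matching bytes while walking the binary and only then cuts the name out of the original with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;binary_part/3&lt;/code&gt;, instead of appending one character at a time. This illustrates the technique and is not code taken from Jason or Saxy.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;defmodule SliceSketch do
  # Hypothetical module for illustration; not part of Jason or Saxy.
  # Returns {name, rest}, where name is the leading run of ASCII letters.
  def take_name(original) when is_binary(original) do
    len = count_name_bytes(original, 0)
    # Both results are slices of the original binary; nothing is appended.
    name = binary_part(original, 0, len)
    rest = binary_part(original, len, byte_size(original) - len)
    {name, rest}
  end

  # Only the length is accumulated while walking the binary.
  defp count_name_bytes(&amp;lt;&amp;lt;char, rest::bits&amp;gt;&amp;gt;, len) when char in ?a..?z or char in ?A..?Z do
    count_name_bytes(rest, len + 1)
  end

  defp count_name_bytes(_rest, len), do: len
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SliceSketch.take_name(&quot;foo bar&quot;)&lt;/code&gt; returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{&quot;foo&quot;, &quot; bar&quot;}&lt;/code&gt;.&lt;/p&gt;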

&lt;h2 id=&quot;make-saxy-great-again&quot;&gt;Make Saxy great again&lt;/h2&gt;

&lt;p&gt;Taking all of these lessons into account (continuous binary matching, reasonable pattern matching, and sub-binaries instead of newly constructed binaries), I implemented the next version of Saxy.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parse_open_tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cont&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;ow&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_name_char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;parse_open_tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cont&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parse_open_tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;utf8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cont&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;ow&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_name_char&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;parse_open_tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cont&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compute_unicode_len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;charcode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;parse_open_tag_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cont&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;binary_part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;%{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;stack:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;parse_sattribute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cont&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;original&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pos&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[])&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This implementation has many improvements in comparison to the previous one:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Sub-binaries in each parsing function are now continuously passed to the next without being returned or used. This prevents new ones from being created and keeps the original match context.&lt;/li&gt;
  &lt;li&gt;Every rule now gets its own function, instead of passing the rule name as the second argument and leaving the VM to do crazy clause evaluations.&lt;/li&gt;
  &lt;li&gt;Data can be sliced out from the original binary using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;binary_part/3&lt;/code&gt; without having to be constructed. Sometimes binary construction cannot be completely avoided, for example when character references are mixed into XML content (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;Tom &amp;amp;#x26; Jerry&quot;&lt;/code&gt;). In that case we can avoid binary appending by building iodata (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[&quot;Tom &quot;, 0x26, &quot; Jerry&quot;]&lt;/code&gt;).&lt;/li&gt;
  &lt;li&gt;It also improves error handling. For example, if the look-ahead code point is not an expected token, we can immediately return a parsing error, instead of relying on a throw/catch block as in the previous approach.&lt;/li&gt;
&lt;/ol&gt;
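
&lt;p&gt;As a minimal sketch of the third point, slicing with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;binary_part/3&lt;/code&gt; and collecting mixed content as iodata might look like this. The input string and the offsets are made up for illustration and are not taken from Saxy:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical input, for illustration only.
original = &quot;&amp;lt;title&amp;gt;Tom &amp;amp;#x26; Jerry&amp;lt;/title&amp;gt;&quot;

# Slice the tag name out of the original binary: 5 bytes starting at offset 1.
# No new binary is built, the result refers to the original data.
name = binary_part(original, 1, 5)

# Collect the decoded content as iodata instead of appending binaries.
# 0x26 is the code point that the character reference stands for.
content = [&quot;Tom &quot;, 0x26, &quot; Jerry&quot;]

IO.inspect(name)                         # &quot;title&quot;
IO.inspect(IO.iodata_to_binary(content)) # &quot;Tom &amp;amp; Jerry&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The iodata only needs to be flattened into a single binary once, at the point where the content is handed over to the caller.&lt;/p&gt;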

&lt;h3 id=&quot;and-a-happy-ending&quot;&gt;And a happy ending&lt;/h3&gt;

&lt;p&gt;After applying all the optimizations above, I ran the benchmark again (with the correct script 🙈). And the result started to look very promising:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Microseconds per run&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Saxy 0.4.0&lt;/td&gt;
      &lt;td&gt;64.401&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;xmerl 1.3.16&lt;/td&gt;
      &lt;td&gt;285.5068&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Erlsom 1.4.1&lt;/td&gt;
      &lt;td&gt;66.8736&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;THAT IS 14.7x SPEED UP (yeah I rounded it up a bit in the blog title).&lt;/p&gt;

&lt;p&gt;Here are the results with other samples and a proper benchmark library.&lt;/p&gt;

&lt;p&gt;Benchmarking against &lt;a href=&quot;https://hackernoon.com/feed&quot;&gt;Hackernoon RSS&lt;/a&gt;, Saxy is &lt;strong&gt;1.59 times&lt;/strong&gt; faster than Erlsom.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;IPS&lt;/th&gt;
      &lt;th&gt;99th %&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Saxy 0.4.0&lt;/td&gt;
      &lt;td&gt;437.79&lt;/td&gt;
      &lt;td&gt;3.21 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Erlsom 1.4.1&lt;/td&gt;
      &lt;td&gt;275.23&lt;/td&gt;
      &lt;td&gt;5.03 ms&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The speed-up is particularly noticeable with &lt;a href=&quot;https://github.com/qcam/saxy-bench/blob/master/samples/soccer.xml&quot;&gt;deeply nested XML&lt;/a&gt;, where Saxy is &lt;strong&gt;4.35 times&lt;/strong&gt; faster.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;IPS&lt;/th&gt;
      &lt;th&gt;99th %&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Saxy 0.4.0&lt;/td&gt;
      &lt;td&gt;15.37&lt;/td&gt;
      &lt;td&gt;76.18 ms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Erlsom 1.4.1&lt;/td&gt;
      &lt;td&gt;3.53&lt;/td&gt;
      &lt;td&gt;294.30 ms&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
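
&lt;p&gt;For reference, the two ratios quoted above are simply the quotients of the IPS columns. A quick check, with the figures copied from the tables and variable names chosen only for this sketch:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# IPS figures copied from the two tables above.
hackernoon = %{saxy: 437.79, erlsom: 275.23}
soccer = %{saxy: 15.37, erlsom: 3.53}

# Ratio of iterations per second, rounded to two decimals.
speedup = fn %{saxy: saxy, erlsom: erlsom} -&amp;gt; Float.round(saxy / erlsom, 2) end

IO.inspect(speedup.(hackernoon)) # 1.59
IO.inspect(speedup.(soccer))     # 4.35
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;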

&lt;p&gt;More detailed benchmark results can be found &lt;a href=&quot;https://github.com/qcam/saxy-bench&quot;&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;I had many ups and downs, and learned a bunch of things while building the library. I hope that this blog post is a fun read and provides a real-world example of optimizing binary parsing.&lt;/p&gt;

</description>
            <pubDate>Thu, 19 Apr 2018 13:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/how-i-sped-up-my-xml-parser-by-15-times</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/how-i-sped-up-my-xml-parser-by-15-times</guid>
            
            
                <category>Elixir</category>
            
                <category>XML parser</category>
            
                <category>Saxy</category>
            
        </item>
    
        <item>
            <title>Binary parsing optimizations in Elixir</title>
            <description>&lt;p&gt;Binary parsing is a significant part of almost every application: interactions with databases, data-interchange formats, integers and datetimes in string representation, etc.
Making optimizations to those domains might result in substantial speed-up across the whole application.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Angle grinder&quot; src=&quot;/assets/posts/2018-01-25-binary-parsing-optimizations-in-elixir/angle-grinder.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;/blog/the-pursuit-of-instant-pushes&quot;&gt;the previous blog post&lt;/a&gt; I mentioned that we use the MessagePack format extensively where possible (soon our mobile clients will start using it too).
Compared to JSON, for example, it is much faster, has a smaller footprint, and shows better compression speed and ratio.&lt;/p&gt;
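
&lt;p&gt;To give a rough feeling for the footprint difference, here is a tiny sketch that compares the size of one small value in both formats. The MessagePack bytes are written by hand following the MessagePack specification rather than produced by a library, and the value itself is an arbitrary example:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# %{&quot;a&quot; =&amp;gt; 1} encoded by hand following the MessagePack specification:
# 0x81 = map with one pair, 0xA1 = string of length 1, 0x01 = integer 1.
msgpack = &amp;lt;&amp;lt;0x81, 0xA1, &quot;a&quot;, 0x01&amp;gt;&amp;gt;

# The same value as a compact JSON document.
json = ~s({&quot;a&quot;:1})

IO.inspect(byte_size(msgpack)) # 4
IO.inspect(byte_size(json))    # 7
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The exact savings depend heavily on the shape of the data, so this should be read as an illustration and not as a measurement.&lt;/p&gt;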

&lt;p&gt;In this article I will be using the MessagePack library of our choice—&lt;a href=&quot;https://hex.pm/packages/msgpax&quot;&gt;Msgpax&lt;/a&gt;—to showcase optimizations you could apply to your implementations of binary parsing.&lt;/p&gt;

&lt;h3 id=&quot;measuring-the-impact&quot;&gt;Measuring the impact&lt;/h3&gt;

&lt;p&gt;The actual outcome of any optimizations to binary parsing really depends on the data being parsed.
For our showcase we will be using the following, somewhat close to reality, data:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;inner_map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;129&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Stream&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Float&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;take&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;%{&lt;/span&gt;
    &lt;span class=&quot;s2&quot;&gt;&quot;foobar&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;s2&quot;&gt;&quot;baz&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duplicate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9001&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;s2&quot;&gt;&quot;plugh&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inner_map&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duplicate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Also we are going to use a rather simple and comprehensible benchmarking script:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;payload&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Msgpax&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pack!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;content&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;iodata:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;run_count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10_000&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;Msgpax&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;total_spent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;run_spent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:timer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Msgpax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;payload&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run_spent&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;IO&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;puts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_spent&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run_count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# microseconds per run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Before getting to the applied optimizations it is worth checking the &lt;em&gt;pre-optimized&lt;/em&gt; version of
Msgpax. For completeness, let’s include our current go-to JSON library, &lt;a href=&quot;https://hex.pm/packages/jason&quot;&gt;Jason&lt;/a&gt;, and what is still the most popular JSON library on Hex, &lt;a href=&quot;https://hex.pm/packages/poison&quot;&gt;Poison&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Elixir 1.5.2 and OTP 20.1.5 yield:&lt;/p&gt;

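&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Microseconds per run&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;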
    &lt;tr&gt;
      &lt;td&gt;Msgpax 1.1.0&lt;/td&gt;
      &lt;td&gt;417.8994&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Jason 1.0.0&lt;/td&gt;
      &lt;td&gt;564.4407&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Poison 3.1.0&lt;/td&gt;
      &lt;td&gt;671.3923&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;And the &lt;em&gt;optimized&lt;/em&gt; Msgpax gives the powerful &lt;strong&gt;228.1887 μs&lt;/strong&gt;. It is a &lt;strong&gt;45%&lt;/strong&gt; speed-up.&lt;/p&gt;
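
&lt;p&gt;The percentage can be read in two ways, so here is the arithmetic spelled out. The variable names are ours and the inputs are the two Msgpax timings quoted above:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;before_us = 417.8994 # Msgpax 1.1.0, microseconds per run
after_us = 228.1887  # optimized Msgpax, microseconds per run

# Share of the time per run that was saved, in percent.
time_saved = Float.round((before_us - after_us) / before_us * 100, 1)

# The same change expressed as a throughput factor.
throughput_gain = Float.round(before_us / after_us, 2)

IO.inspect(time_saved)      # 45.4
IO.inspect(throughput_gain) # 1.83
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In other words, each run takes about 45% less time, which corresponds to roughly 1.83 times as many unpack calls per second.&lt;/p&gt;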

&lt;p&gt;How did we get there?&lt;/p&gt;

&lt;h3 id=&quot;the-rocky-road-of-the-single-match-context-optimization&quot;&gt;The rocky road of the single match context optimization&lt;/h3&gt;

&lt;p&gt;In general, every binary match (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;&amp;lt;...&amp;gt;&amp;gt;&lt;/code&gt;) generates a &lt;strong&gt;match context&lt;/strong&gt;.
Also, for each part that is extracted from a binary, a &lt;strong&gt;sub binary&lt;/strong&gt; is created, which is a reference to that part of the binary. This makes binary matching relatively cheap because the actual binary data is never copied.&lt;/p&gt;

&lt;p&gt;Nevertheless, the Erlang compiler avoids generating code that creates a sub binary if the compiler sees that shortly afterwards a new match context is created and the sub binary is discarded. Instead of creating a sub binary, the initial match context is kept.&lt;/p&gt;

&lt;p&gt;The compiler can point out the places where such an optimization could be applied (or has already been applied) when the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bin_opt_info&lt;/code&gt; option is used:&lt;/p&gt;

&lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;ERL_COMPILER_OPTIONS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;bin_opt_info mix run lib/msgpax/unpacker.ex
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It will produce dozens of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT OPTIMIZED&lt;/code&gt; warnings, which can be split into two types:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;blockquote&gt;
      &lt;p&gt;sub binary is used or returned&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;blockquote&gt;
      &lt;p&gt;called function … does not begin with a suitable binary matching instruction&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
&lt;/ol&gt;
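
&lt;p&gt;As an alternative to the environment variable, the same compiler option can usually be enabled for a single module with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@compile&lt;/code&gt; attribute. The module below is a hypothetical toy and not part of Msgpax, and the exact wording of the reported messages depends on the OTP version:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;defmodule BinOptDemo do
  # Ask the Erlang compiler to report binary matching optimizations
  # for this module only.
  @compile :bin_opt_info

  # Sums the bytes of a binary. `rest` is only ever matched again
  # by the next call, so it never needs to become a sub binary.
  def sum(&amp;lt;&amp;lt;byte, rest::bits&amp;gt;&amp;gt;, acc), do: sum(rest, acc + byte)
  def sum(&amp;lt;&amp;lt;&amp;gt;&amp;gt;, acc), do: acc
end

IO.inspect(BinOptDemo.sum(&amp;lt;&amp;lt;1, 2, 3&amp;gt;&amp;gt;, 0)) # 6
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;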

&lt;p&gt;The first type means literally what the warning says: the sub binary is used or returned, so creating it cannot be avoided.
This is the kind of code that generates such a warning:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mh&quot;&gt;0xC3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the code above, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rest&lt;/code&gt; variable is the returned sub binary.
In some cases this is indeed the desired behaviour, but not when such a function is called recursively:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;unpack_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For example, when we unpack a 5-element list, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unpack_list/3&lt;/code&gt; function is called 5 times,
each time creating a new match context and a new sub binary, even though the sub binary is only used for binary matching in the next call (so its creation could be avoided and the match context reused).&lt;/p&gt;

&lt;p&gt;Without our help the compiler cannot infer that intention.
Continuous parsing is the way to get the code optimized:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buffer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;mh&quot;&gt;0xC3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;unpack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rest&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this code the optimization works, but only if the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unpack/3&lt;/code&gt; clauses do not produce the second type of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NOT OPTIMIZED&lt;/code&gt; warning:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;blockquote&gt;
      &lt;p&gt;called function … does not begin with a suitable binary matching instruction&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To resolve the warning, each clause of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unpack/2&lt;/code&gt; function must start with a binary match, that is, its first argument must be a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;&amp;lt;...&amp;gt;&amp;gt;&lt;/code&gt; pattern.
Note that with continuous parsing we were also able to avoid creating 2-element tuples.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Starting with OTP 21, the compiler might reorder function arguments to fix the warning and make the optimization happen (&lt;a href=&quot;https://github.com/erlang/otp/pull/1687&quot;&gt;see the pull request&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;diving-deeper-for-more-optimizations&quot;&gt;Diving deeper for more optimizations&lt;/h3&gt;

&lt;p&gt;Keeping a single match context when parsing certainly increases its speed.
Yet there are more things to consider.
&lt;strong&gt;Unnecessary check elimination&lt;/strong&gt; is one of them.&lt;/p&gt;

&lt;p&gt;For example, in the previous code snippet the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;::bytes&lt;/code&gt; part of the binary matching is not strictly necessary.
It adds a check to ensure that the size of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rest&lt;/code&gt; in bits is divisible by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;8&lt;/code&gt;.
Changing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;::bytes&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;::bits&lt;/code&gt; eliminates the check (&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/28947f9&quot;&gt;28947f9&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;To observe the result of this change let’s inspect the produced assembler code. The following command should output it to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Elixir.Msgpax.Unpacker.S&lt;/code&gt; file:&lt;/p&gt;

&lt;div class=&quot;language-shell highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mix run &lt;span class=&quot;nt&quot;&gt;--eval&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;beam = :code.which(Msgpax.Unpacker); {:ok, {_, [abstract_code: {:raw_abstract_v1, abstract_code}]}} = :beam_lib.chunks(beam, [:abstract_code]); :compile.forms(abstract_code, [:S])&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;::bits&lt;/code&gt;, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bs_test_unit&lt;/code&gt; bytecode instructions disappear from the assembly.&lt;/p&gt;
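
&lt;p&gt;As a minimal sketch of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;::bits&lt;/code&gt; variant, consider the hypothetical module below. It is not part of Msgpax; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BitsSketch&lt;/code&gt; module and its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count_true&lt;/code&gt; functions are made up for illustration. It counts the leading &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0xC3&lt;/code&gt; bytes of a binary, and each private clause starts with a binary match on its first argument:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;defmodule BitsSketch do
  # Hypothetical example: counts the 0xC3 bytes at the head of a binary.
  def count_true(binary), do: count_true(binary, 0)

  # `rest::bits` accepts any bitstring, so no byte-alignment check is needed.
  defp count_true(&amp;lt;&amp;lt;0xC3, rest::bits&amp;gt;&amp;gt;, count), do: count_true(rest, count + 1)
  # Any other input (including the empty binary) ends the count.
  defp count_true(&amp;lt;&amp;lt;_rest::bits&amp;gt;&amp;gt;, count), do: count
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Whether the compiler actually drops the alignment check for a given function is best confirmed by inspecting the generated assembly on your own OTP version.&lt;/p&gt;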

&lt;p&gt;&lt;strong&gt;Keeping the argument order intact&lt;/strong&gt; also helps with the instruction elimination.
The &lt;a href=&quot;https://github.com/lexmag/msgpax/commit/69cd8df&quot;&gt;69cd8df&lt;/a&gt; commit ensures that the argument order in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unpack_continue/6&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unpack_continue/5&lt;/code&gt; is the same.
This makes the following difference to the produced assembly:&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Angle grinder&quot; src=&quot;/assets/posts/2018-01-25-binary-parsing-optimizations-in-elixir/argument-order-assembly-diff.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Another type of optimization is &lt;strong&gt;function clause reordering&lt;/strong&gt;.
In fact, the compiler reorders function clauses itself when possible.
But it often takes manual work to get faster code.&lt;/p&gt;

&lt;p&gt;Putting higher-probability clauses at the top (or, conversely, lower-probability clauses at the bottom) can make the code faster.
Changes like that, I believe, warrant regular benchmarking and inspection of the produced assembler code.&lt;/p&gt;

&lt;p&gt;For Msgpax there were 2 attempts at function clause reordering (&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/be19258&quot;&gt;be19258&lt;/a&gt; and &lt;a href=&quot;https://github.com/lexmag/msgpax/commit/1955c5e&quot;&gt;1955c5e&lt;/a&gt;), both giving a noticeable boost.
But ultimately &lt;strong&gt;code inlining&lt;/strong&gt; gave a much better outcome.
The top function clause was integrated into the several places where it actually belongs (&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/81330c3&quot;&gt;81330c3&lt;/a&gt;).
After that, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:inline&lt;/code&gt; directive helped to get rid of the code duplication (&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/6eacf81&quot;&gt;6eacf81&lt;/a&gt;).&lt;/p&gt;
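
&lt;p&gt;To illustrate the idea, here is a rough sketch of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:inline&lt;/code&gt; directive in a toy parser. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;InlineSketch&lt;/code&gt; module and its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;emit/3&lt;/code&gt; helper are hypothetical and do not mirror the actual Msgpax code. The shared step lives in one small function, and the compiler is asked to inline it at each call site:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;defmodule InlineSketch do
  # Ask the compiler to inline the private helper at its call sites,
  # so the source stays free of duplication.
  @compile {:inline, emit: 3}

  def unpack(binary), do: unpack(binary, [])

  # 0xC2 and 0xC3 are the MessagePack encodings of false and true.
  defp unpack(&amp;lt;&amp;lt;0xC2, rest::bits&amp;gt;&amp;gt;, acc), do: emit(false, rest, acc)
  defp unpack(&amp;lt;&amp;lt;0xC3, rest::bits&amp;gt;&amp;gt;, acc), do: emit(true, rest, acc)
  defp unpack(&amp;lt;&amp;lt;&amp;gt;&amp;gt;, acc), do: Enum.reverse(acc)

  # Hypothetical shared step: store the value and continue parsing.
  defp emit(value, rest, acc), do: unpack(rest, [value | acc])
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As with clause reordering, it is worth benchmarking before and after, since inlining does not guarantee a speedup.&lt;/p&gt;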

&lt;h3 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h3&gt;

&lt;p&gt;All the aforementioned optimizations were bundled into a single &lt;a href=&quot;https://github.com/lexmag/msgpax/pull/30&quot;&gt;pull request to Msgpax&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I think many kinds of applications would benefit from binary parsing optimizations.
And the compiler has all the necessary tools to assist you.&lt;/p&gt;

&lt;p&gt;There are also more examples of these optimizations in the wild:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/elixir-lang/elixir/pull/5859&quot;&gt;Integer.parse/2 in Elixir&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/lexhide/xandra/pull/85&quot;&gt;Decoding in Xandra&lt;/a&gt;—our Cassandra driver for Elixir&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m going to close the article with the step-by-step parsing time progress:&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/4ac16cf&quot;&gt;4ac16cf&lt;/a&gt; (pre-optimized)&lt;/td&gt;
      &lt;td&gt;417.8994&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/compare/4ac16cf…3d3f5b5&quot;&gt;3ad47e2…3d3f5b5&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;321.7947&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/171faab&quot;&gt;171faab&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;311.3362&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/4b2dba7&quot;&gt;4b2dba7&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;304.8067&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/5e3a213&quot;&gt;5e3a213&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;313.7435&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/08549bf&quot;&gt;08549bf&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;334.0085&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/1955c5e&quot;&gt;1955c5e&lt;/a&gt; (function clauses reordering)&lt;/td&gt;
      &lt;td&gt;266.7469&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/f22686b&quot;&gt;f22686b&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;262.2508&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/81330c3&quot;&gt;81330c3&lt;/a&gt; (function clause inlining)&lt;/td&gt;
      &lt;td&gt;239.2266&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/2eb1a9f&quot;&gt;2eb1a9f&lt;/a&gt; (unnecessary check elimination)&lt;/td&gt;
      &lt;td&gt;229.2112&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://github.com/lexmag/msgpax/commit/69cd8df&quot;&gt;69cd8df&lt;/a&gt; (keeping the argument order)&lt;/td&gt;
      &lt;td&gt;228.1887&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Until the next one. 🚀&lt;/p&gt;

</description>
            <pubDate>Thu, 25 Jan 2018 14:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/binary-parsing-optimizations-in-elixir</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/binary-parsing-optimizations-in-elixir</guid>
            
            
                <category>Elixir</category>
            
                <category>Erlang compiler</category>
            
                <category>MessagePack</category>
            
        </item>
    
        <item>
            <title>Maximizing HTTP/2 performance with GenStage</title>
            <description>&lt;p&gt;A core feature of our Forza Football app is push notifications about live match events. With Apple moving their push notifications services to HTTP/2, we wanted to take advantage of the functionality that their new API provides and at the same time maximize performance and improve resource usage with the new platform.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Train&quot; src=&quot;/assets/posts/2017-11-07-maximizing-http2-performance-with-genstage/train.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;At the end of 2015, Apple announced an update to APNs (Apple Push Notification service) API. Part of the update was to move communication over HTTP/2 in order to improve performance and be able to deliver instant feedback to clients sending push notifications. A few months ago, since we were rewriting our push notifications system (we wrote about it &lt;a href=&quot;/blog/the-pursuit-of-instant-pushes&quot;&gt;in the past&lt;/a&gt;), we decided to use the new HTTP/2 API so that we would be prepared for the future and the eventual deprecation of the original API.&lt;/p&gt;

&lt;h2 id=&quot;a-very-quick-http2-primer&quot;&gt;A (very) quick HTTP/2 primer&lt;/h2&gt;

&lt;p&gt;HTTP/2 is the new version of HTTP. It brings many improvements while keeping the same semantics of HTTP/1.1. One of these improvements is &lt;a href=&quot;https://http2.github.io/faq/#why-is-http2-multiplexed&quot;&gt;multiplexing&lt;/a&gt;: all communication that happens between client and server through TCP happens on one HTTP/2 &lt;em&gt;connection&lt;/em&gt;. However, on one connection there can be multiple &lt;em&gt;streams&lt;/em&gt;. A stream is similar to an HTTP/1.1 request but can be initiated - &lt;em&gt;opened&lt;/em&gt;, in HTTP/2 lingo - by either the server or the client. For example, if a browser (client) wants to load a website from a server using HTTP/2, it will first open a connection (which is a direct mapping to a TCP connection) with the server. After that, it will initiate one stream for each resource it needs to load, so for example one stream for the HTML content, one stream per image, one stream for the CSS documents, and one for the JavaScript code. What makes HTTP/2 great is that different streams on a single connection are &lt;strong&gt;concurrent&lt;/strong&gt;: at any given time, many streams can be open on one connection. The exact number of streams that can be open at the same time is negotiated between the client and server during the life of the connection.&lt;/p&gt;

&lt;p&gt;HTTP/2 has many more features and improvements over HTTP/1.1, but concurrent streams on a single connection are the one you need to know for the rest of this article. If you want more information about HTTP/2, the &lt;a href=&quot;https://http2.github.io&quot;&gt;HTTP/2 website&lt;/a&gt; and the &lt;a href=&quot;https://http2.github.io/faq/&quot;&gt;HTTP/2 FAQs&lt;/a&gt; are great resources to start with.&lt;/p&gt;

&lt;h2 id=&quot;http2-and-apns&quot;&gt;HTTP/2 and APNs&lt;/h2&gt;

&lt;p&gt;The new HTTP/2 API that Apple released uses a single HTTP/2 connection to send multiple push notifications. First, a client (for example, a server where your application lives) opens the HTTP/2 connection with an APN server. This connection can be used to send a (theoretically) unlimited number of notifications, so it’s advised that you keep this connection open, unless you send notifications in bursts only a few times each day. When the client needs to send a notification, it &lt;em&gt;opens a stream&lt;/em&gt; on the connection and sends a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;POST&lt;/code&gt; request on that stream. The APN server replies on that same stream right away and it closes the stream. This interaction is similar to having one request per notification in HTTP/1.1, but performance is much better because of the single shared underlying connection. When one connection is not enough, Apple advises simply opening many connections and distributing the notification load across them.&lt;/p&gt;

&lt;h2 id=&quot;maximizing-performance&quot;&gt;Maximizing performance&lt;/h2&gt;

&lt;p&gt;We send millions of notifications each day but they mostly happen in big bursts, for example when a goal is scored in a match between two big teams. Since the notifications we send are time-sensitive, we need to send as many as we can in the smallest possible amount of time.&lt;/p&gt;

&lt;p&gt;The first quick-and-dirty approach we took was to simply use one APNs connection to send all of our notifications. We kept this connection open at all times. Problems appeared soon in this prototype implementation: putting the terrible performance aside (caused by having a single sink in the single HTTP/2 connection), we were not handling a fundamental characteristic of HTTP/2, that is, the limit on streams that can be open at the same time (we’ll call this limit &lt;em&gt;max concurrent streams&lt;/em&gt;). The current max concurrent streams limit Apple negotiates is 1000 concurrent streams. When sending more than 1000 notifications, say 1001, we would open 1001 streams and if no streams had freed up by the time we tried to open the 1001st stream, we would get an HTTP/2 error from Apple’s server telling us we were opening too many streams.&lt;/p&gt;

&lt;p&gt;So, one requirement was to never go over the max concurrent streams limit. At the same time, however, we wanted to maximize the performance of each connection by using it as much as possible. In this case, it means that we should strive to have as many streams open at the same time as we can. Ideally, the number of concurrent open streams should be always close to 1000. We tried to achieve this (and improve the single-connection approach) by keeping many connections alive at the same time and distributing notifications over these connections. The problem with this approach was that we were distributing notifications to connections in a “fair” way, but connections were processing notifications at different speeds (which can be caused by many reasons, like scheduling, connection usage, and so on). We ended up in a situation where some connections were struggling to keep up with the notifications to send while other connections would be idle. We traced this down to a design problem: we were using a &lt;em&gt;push&lt;/em&gt; pattern where we were pushing notifications to connections without considering the state of each connection. The first solution to this that came to mind was starting to use a &lt;em&gt;pull&lt;/em&gt; approach where each connection would express its availability so that all connections would get a fair amount of notifications to send, based on their state and usage.&lt;/p&gt;

&lt;p&gt;The more we looked at the problem, the more we recognized a pattern:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;many events come into our system (the notifications to send to each user)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;we need to “forward” these events to another system (push the notifications to Apple’s platform)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;we need to rate-limit these events (keeping the number of open streams under the limit)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;we need “consumers” (APNS connections) to &lt;em&gt;pull&lt;/em&gt; data (pending notifications) based on their availability&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These requirements naturally drew us to &lt;a href=&quot;https://github.com/elixir-lang/gen_stage&quot;&gt;GenStage&lt;/a&gt;, which is an Elixir library that provides abstractions for “stages” (producers and consumers) that exchange data between each other in a rate-limited, pulling-oriented way. We decided we wanted to rewrite the handling of notifications on top of GenStage to be able to take advantage of its features.&lt;/p&gt;

&lt;p&gt;Before moving our implementation to GenStage, we felt that none of the existing HTTP/2 clients for Erlang or Elixir would satisfy our performance and reliability requirements. So before doing anything else, we wrote a custom, optimized HTTP/2 client to use just in this project.&lt;/p&gt;

&lt;h2 id=&quot;the-current-implementation&quot;&gt;The current implementation&lt;/h2&gt;

&lt;p&gt;In our current implementation, we have a separate service (an HTTP API) which receives notifications to send to different providers (like APNs for Apple or GCM for Google). This HTTP API keeps several connections to APNs open (around 10) and is responsible for all the rate-limiting of open streams and the spreading of events (notifications) over all the open connections. This service is where GenStage lives. Its architecture looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Chart of the architecture&quot; src=&quot;/assets/posts/2017-11-07-maximizing-http2-performance-with-genstage/architecture.png&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;the-notifications-lake&quot;&gt;The notifications “lake”&lt;/h3&gt;

&lt;p&gt;Requests that contain multiple notifications to send come into the HTTP API. The first step that we take is to temporarily store these notifications in what we call a “lake” (play on words around a big “pool”). &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;APNS.Lake&lt;/code&gt; is a GenStage &lt;strong&gt;producer&lt;/strong&gt; that holds a &lt;em&gt;bounded queue&lt;/em&gt; of notifications to send. We use a bounded queue so that we have a back-pressure mechanism if more notifications come in than we can handle. All processes that handle HTTP requests queue notifications in a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;APNS.Lake&lt;/code&gt; process: while this was thought of as a first step before having multiple lakes as well, we found performance was great this way so we are holding off before adding more lake processes.&lt;/p&gt;

&lt;p&gt;A simplified version of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;APNS.Lake&lt;/code&gt; code looks something like this:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;APNS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Lake&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GenStage&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bounded_queue_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;%{&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;pending_demand:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;queue:&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;BoundedQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bounded_queue_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:producer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;c1&quot;&gt;# This is triggered manually from the HTTP API.&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;handle_call&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:send_notifs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;notifs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;BoundedQueue&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in_many&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;notifs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dispatch_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;queue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pending_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:reply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

      &lt;span class=&quot;ss&quot;&gt;:full&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:reply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:error&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:full&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dispatch_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pending_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Dispatches events until either the pending demand&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# is satisfied (goes to 0) or we run out of events.&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the code above, we could let GenStage handle the buffering of events and demand using GenStage’s internal buffer instead of an explicit bounded queue. However, GenStage’s internal buffer drops events when it gets filled up, and we can’t afford to lose events (that is, not send notifications). With a manual implementation through a bounded queue, we have tighter control over the behaviour when the queue becomes full. In our case, we reply &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:error, :full}&lt;/code&gt; to the HTTP API which can then act accordingly: for example, it can retry later or it can reply with an error to the HTTP client so that that client can retry later (pushing the responsibility of retrying further and further).&lt;/p&gt;

&lt;h3 id=&quot;the-apns-connections&quot;&gt;The APNS connections&lt;/h3&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;APNS.Lake&lt;/code&gt;, the GenStage producer, dispatches events to APNs connection processes (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;APNS.Connection&lt;/code&gt;), which are GenStage &lt;strong&gt;consumers&lt;/strong&gt;. Each connection has a maximum demand of events that is exactly the same as the maximum number of concurrent streams. Conceptually, the logical thing to do would be to initially ask for 1000 events (one notification for each stream) and then to ask for one event each time a stream is closed. In practice, this didn’t work optimally because asking for one event at a time would mean that producer and consumer would exchange many messages, which ended up degrading performance. Instead, we buffer demand: the connection waits until a small number of streams (50 in our case) frees up and then asks the producer for this number of events. A consumer never asks for more than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;max_concurrent_streams - currently_open_streams&lt;/code&gt; events, so that the total number of concurrently open streams can’t go over the max concurrent streams limit.&lt;/p&gt;
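
&lt;p&gt;The arithmetic behind this demand buffering can be sketched as a small pure function. This is only an illustration under our own assumptions: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DemandSketch&lt;/code&gt; module, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;next_demand/3&lt;/code&gt; function, and its argument names are hypothetical and are not taken from the production code.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;defmodule DemandSketch do
  # Hypothetical threshold, mirroring the batch of 50 freed streams.
  @min_batch 50

  # `pending` is the number of freed streams we have not asked events for yet,
  # `open` is the number of currently open streams, `max` is the stream limit.
  # Returns {events_to_ask_now, new_pending}.
  def next_demand(pending, open, max) when pending &amp;gt;= @min_batch do
    # Never ask for more than the number of streams that can still be opened.
    to_ask = min(pending, max - open)
    {to_ask, pending - to_ask}
  end

  # Not enough freed streams yet: keep buffering and ask for nothing.
  def next_demand(pending, _open, _max), do: {0, pending}
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For instance, with 50 freed streams and 950 open streams out of 1000, this sketch asks for 50 events; with only 10 freed streams it asks for none and keeps waiting.&lt;/p&gt;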

&lt;p&gt;Buffering events also helped us improve the performance of our HTTP/2 client. Each sending of a notification corresponds to a new HTTP/2 request, which boils down to a write on the HTTP/2 connection’s TCP socket. This means that by not buffering events we would do many TCP writes close to each other. Now that we buffer events, our HTTP/2 client is designed to send requests issued at the same time as a single blob of bytes down the TCP socket. If we buffer 50 events at a time and send them all at once, we only do one TCP write for all those corresponding HTTP/2 requests.&lt;/p&gt;

&lt;p&gt;The work each connection does to process events – that is, send notifications – is asynchronous because it sends a request to Apple and then asynchronously waits for the response. For this reason, we can’t use GenStage’s internal demand mechanism but we have to use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:manual&lt;/code&gt; demand mode and send demand to the producer ourselves.&lt;/p&gt;

&lt;p&gt;Simplified code for the connections looks like this:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;APNS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Connection&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GenStage&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;init&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mcs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;HTTP2Client&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;connect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;host_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;%{&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;client:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;client&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;events_to_ask:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mcs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;ss&quot;&gt;max_concurrent_streams:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mcs&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:consumer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;subscribe_to:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;APNS&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Lake&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;handle_subscribe&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:producer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
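    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maybe_ask_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:subscription&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;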
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:manual&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# we set the demand to :manual&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;handle_events&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;notifications&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Work is async so we send notifications without&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# sending any demand upstream.&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;send_notifications&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;notifications&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:noreply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;handle_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stream_response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;process_stream_response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stream_response&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;events_to_ask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;events_to_ask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maybe_ask_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(%{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;events_to_ask:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;events_to_ask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:noreply&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;maybe_ask_demand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;events_to_ask&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;GenStage&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;subscription&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;events_to_ask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;%{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;events_to_ask:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
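
&lt;p&gt;The demand bookkeeping in the consumer above can also be looked at in isolation. The following is a minimal sketch, not our production code: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DemandSketch&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ack/2&lt;/code&gt; are hypothetical names, and the function returns the demand it would ask for instead of calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GenStage.ask/2&lt;/code&gt;, so that it runs without any dependency. The threshold of 50 is the one used in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;maybe_ask_demand/1&lt;/code&gt;.&lt;/p&gt;

```elixir
defmodule DemandSketch do
  # Hypothetical, dependency-free model of the batching done by
  # maybe_ask_demand/1 above: completed responses are counted, and demand
  # is only released upstream in batches of at least @threshold events.
  @threshold 50

  # Records `responses` completed streams on top of `counter`.
  # Returns {demand_to_ask, new_counter}; demand_to_ask is 0 while the
  # counter is still below the threshold.
  def ack(counter, responses) do
    pending = counter + responses

    if pending >= @threshold do
      {pending, 0}
    else
      {0, pending}
    end
  end
end

IO.inspect(DemandSketch.ack(0, 49))  # prints {0, 49}: nothing is asked yet
IO.inspect(DemandSketch.ack(49, 1))  # prints {50, 0}: a batch of 50 is asked
```

&lt;p&gt;Asking in batches like this keeps the number of demand messages sent upstream low, at the cost of the consumer briefly running below its maximum number of concurrent streams.&lt;/p&gt;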

&lt;p&gt;The number of connections we have today (around 10) is the result of experimenting to find the optimal number of connections for a single machine.&lt;/p&gt;

&lt;h2 id=&quot;performance-and-robustness&quot;&gt;Performance and robustness&lt;/h2&gt;

&lt;p&gt;We’re very happy with the performance we’re getting out of this system. The graph below shows what a day with many big matches looks like in our metrics dashboards:&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Metrics dashboard&quot; src=&quot;/assets/posts/2017-11-07-maximizing-http2-performance-with-genstage/dashboard.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In our system, two servers handle Apple notifications, and the graph above shows their combined performance. At peak times, we are able to send more than 1.3 million notifications in around 10 seconds (roughly 130,000 notifications per second across both servers), which is a number we’re satisfied with for the time being.&lt;/p&gt;

&lt;p&gt;While we aimed mostly for performance, rearchitecting this system also improved its robustness. Thanks to the work being distributed over many HTTP/2 connections, the bounded queue that ensures we don’t get overflows, and the isolation that Elixir and the Erlang VM provide, we are confident that any problems that occur will only affect small parts of the system without significantly degrading overall performance.&lt;/p&gt;

&lt;h2 id=&quot;conclusions-and-next-steps&quot;&gt;Conclusions and next steps&lt;/h2&gt;

&lt;p&gt;In this post, we explored how we took advantage of GenStage to build a system around the new HTTP/2 Apple API that is able to reliably deliver millions of push notifications to our users. While we’re happy with the current numbers, there are still improvements we can make to the system if we want to squeeze out even more performance. One of them is possible thanks to how we distribute the workload in parallel and over multiple servers (albeit only two for now): this means that the system is &lt;strong&gt;scalable&lt;/strong&gt;, which in turn means that if we throw more servers at it, we’ll be able to increase the throughput of notifications we’re able to send.&lt;/p&gt;

</description>
            <pubDate>Tue, 07 Nov 2017 13:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/maximizing-http2-performance-with-genstage</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/maximizing-http2-performance-with-genstage</guid>
            
            
                <category>Elixir</category>
            
                <category>HTTP/2</category>
            
                <category>GenStage</category>
            
        </item>
    
        <item>
            <title>Internationalization of Elixir applications with Gettext and Transifex</title>
            <description>&lt;p&gt;Our Forza Football app is translated into many languages, and therefore our push notifications have to be translated as well. This becomes an interesting problem to solve once there are more target languages than you can handle by yourself.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Flags&quot; src=&quot;/assets/posts/2016-12-19-internationalization-of-elixir-applications-with-gettext-and-transifex/flags.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;One of the most important features of our Forza Football app is sending notifications to subscribed users when events such as goals happen in football matches (we &lt;a href=&quot;/blog/the-pursuit-of-instant-pushes&quot;&gt;wrote about our notifications&lt;/a&gt; before). Since our users are based in many different countries, we need to translate these notifications into many different languages. In this post, I will talk about how we tackle this problem in Elixir (that’s what we wrote our push system in) and about the tools we use to do so.&lt;/p&gt;

&lt;h2 id=&quot;gettext-for-elixir&quot;&gt;Gettext for Elixir&lt;/h2&gt;

&lt;p&gt;The main tool we’re using for translation is &lt;a href=&quot;https://github.com/elixir-lang/gettext&quot;&gt;Gettext for Elixir&lt;/a&gt;. This library is an implementation of &lt;a href=&quot;https://www.gnu.org/software/gettext/&quot;&gt;GNU Gettext&lt;/a&gt; for Elixir applications; in short, Gettext is a system for internationalization of software based on having source strings in the source code that are extracted to translation files where the translations for different languages live. The Gettext for Elixir &lt;a href=&quot;https://github.com/elixir-lang/gettext&quot;&gt;README&lt;/a&gt; explains this in more depth if you’re interested.&lt;/p&gt;

&lt;p&gt;When we’re building a notification payload to send to a user, we have code that looks like this:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Pushboy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;only:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;gettext:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;gettext&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;%{minute}′ Red Card - %{player_name} (%{team_name})&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;ss&quot;&gt;minute:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;ss&quot;&gt;player_name:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;player_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;ss&quot;&gt;team_name:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;team_name&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Don’t worry about the name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pushboy&lt;/code&gt;; that’s just what our “pusher” app is called. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gettext/2&lt;/code&gt; in the code above is a macro that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Gettext&lt;/code&gt; automatically defines in our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Pushboy.Gettext&lt;/code&gt; module:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Pushboy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;otp_app:&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:pushboy&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The source string (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;%{minute}′ Red Card ...&quot;&lt;/code&gt;) is how we will identify this translation in the future, and Gettext will find the translation for the right language at runtime based on that source string.&lt;/p&gt;

&lt;h3 id=&quot;extracting-translations&quot;&gt;Extracting translations&lt;/h3&gt;

&lt;p&gt;Once we’re ready for internationalizing our code, the first thing we do is extract the strings that need translation out of the source code. Gettext provides a Mix task for this:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;mix gettext.extract
Extracted priv/gettext/default.pot
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This task reads all the calls to Gettext macros (such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gettext/2&lt;/code&gt; above) at compile time and extracts their strings to POT files (with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.pot&lt;/code&gt; extension) that look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-pot&quot;&gt;# This would go in priv/gettext/default.pot

#: lib/pushboy/event/red_card.ex:86
msgid &quot;%{minute}′ Red Card - %{player_name} (%{team_name})&quot;
msgstr &quot;&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;POT files are &lt;em&gt;template&lt;/em&gt; files that are only meant to hold a list of all the strings to translate. In the file shown above, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;msgid&lt;/code&gt; is the identifier of the string to translate (which is the string itself) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;msgstr&lt;/code&gt; is where the translation goes. However, translations are not stored in POT files, because a POT file is not specific to any language. Instead, translated strings are stored in PO (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.po&lt;/code&gt;) files; each PO file is stored in a directory specific to a language. For example, if we wanted to translate our string into Italian, we would have a file that looks like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-pot&quot;&gt;# This would go in priv/gettext/it/LC_MESSAGES/default.po

#: lib/pushboy/event/red_card.ex:86
msgid &quot;%{minute}′ Red Card - %{player_name} (%{team_name})&quot;
msgstr &quot;%{minute}′ Cartellino rosso - %{player_name} (%{team_name})&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Gettext reads these PO files at compile time to make the lookup of translations as fast as possible. As you can see, the names between &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%{&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;}&lt;/code&gt; are &lt;em&gt;interpolation variables&lt;/em&gt;: they’re not meant to be translated, and they will be replaced at runtime with some dynamic value.&lt;/p&gt;

&lt;p&gt;With the POT file and the PO file above, our translation would look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-iex&quot;&gt;iex&amp;gt; import Pushboy.Gettext, only: [gettext: 2]
iex&amp;gt; Gettext.put_locale(Pushboy.Gettext, &quot;it&quot;)
iex&amp;gt; gettext &quot;%{minute}′ Red Card - %{player_name} (%{team_name})&quot;,
...&amp;gt;         minute: 38,
...&amp;gt;         player_name: &quot;Cristiano Ronaldo&quot;,
...&amp;gt;         team_name: &quot;Real Madrid&quot;
&quot;38′ Cartellino rosso - Cristiano Ronaldo (Real Madrid)&quot;
&lt;/code&gt;&lt;/pre&gt;
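
&lt;p&gt;To make the interpolation step more concrete, here is a minimal sketch of how placeholders such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;%{minute}&lt;/code&gt; can be replaced with bindings. This is not Gettext’s actual implementation; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;InterpolationSketch&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interpolate/2&lt;/code&gt; are hypothetical names used only for illustration.&lt;/p&gt;

```elixir
defmodule InterpolationSketch do
  # Hypothetical helper, for illustration only: replaces every %{key} in
  # `string` with the value found under `key` in the `bindings` keyword list.
  # Placeholders without a matching binding are left untouched.
  def interpolate(string, bindings) do
    Regex.replace(~r/%\{(\w+)\}/, string, fn placeholder, key ->
      case Enum.find(bindings, fn {name, _value} -> Atom.to_string(name) == key end) do
        {_name, value} -> to_string(value)
        nil -> placeholder
      end
    end)
  end
end

translated = "%{minute}′ Cartellino rosso - %{player_name} (%{team_name})"

IO.puts(
  InterpolationSketch.interpolate(translated,
    minute: 38,
    player_name: "Cristiano Ronaldo",
    team_name: "Real Madrid"
  )
)
# prints: 38′ Cartellino rosso - Cristiano Ronaldo (Real Madrid)
```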

&lt;h2 id=&quot;translating-into-different-languages&quot;&gt;Translating into different languages&lt;/h2&gt;

&lt;p&gt;While we solved the problem of how we internationalize our application, we still haven’t solved the problem of how we &lt;em&gt;translate&lt;/em&gt; our strings. Our app supports many more languages than our team members speak, so translating in-house is not an option.&lt;/p&gt;

&lt;p&gt;To solve this problem, we use &lt;a href=&quot;https://www.transifex.com&quot;&gt;Transifex&lt;/a&gt;. Transifex is a website that provides both an easy-to-use interface for organizing and editing translations and support for outsourcing them. Basically, they partnered up with &lt;a href=&quot;https://www.transifex.com/features/translation-partners/&quot;&gt;a few translation services&lt;/a&gt; so that you can “order” translations from these services and get strings translated into the languages of your choice by people who speak those languages.&lt;/p&gt;

&lt;p&gt;While we use external translation services for most languages, we also have native speakers for around six languages in the company; with Transifex, we can easily let these native speakers contribute to the translations. This is possible since Transifex’s interface is straightforward and less tech-savvy users can use it intuitively.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Transifex interface&quot; src=&quot;/assets/posts/2016-12-19-internationalization-of-elixir-applications-with-gettext-and-transifex/transifex-interface.png&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;gettext-integration&quot;&gt;Gettext integration&lt;/h3&gt;

&lt;p&gt;Transifex integrates with many translation platforms, and Gettext is one of them. You can upload PO/POT files to Transifex and download translations as PO files. Transifex has the concept of “resources”, which are different “domains” of translations (for example, one resource could be notifications, while another one could be error messages, and so on). Luckily, Transifex resources map exactly to Gettext domains: in Gettext, you can use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dgettext/3&lt;/code&gt; macro to extract a translation to a different domain (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;gettext/2&lt;/code&gt; uses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;default&quot;&lt;/code&gt; domain), and each domain ends up in a different PO(T) file (we had &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;default.pot&lt;/code&gt; in the examples above).&lt;/p&gt;
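
&lt;p&gt;The way a domain maps to files follows the directory layout seen in the examples above. As a minimal sketch (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DomainPathSketch&lt;/code&gt; module and its functions are hypothetical, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;errors&quot;&lt;/code&gt; domain is only an example, not necessarily one we use):&lt;/p&gt;

```elixir
defmodule DomainPathSketch do
  # Hypothetical helper: builds the path of the PO file that holds the
  # translations of `domain` for `locale`, following the
  # priv/gettext/LOCALE/LC_MESSAGES/DOMAIN.po layout shown earlier.
  def po_path(locale, domain \\ "default") do
    Path.join(["priv", "gettext", locale, "LC_MESSAGES", "#{domain}.po"])
  end

  # The POT template of a domain is not tied to any locale.
  def pot_path(domain \\ "default") do
    Path.join(["priv", "gettext", "#{domain}.pot"])
  end
end

IO.puts(DomainPathSketch.pot_path())             # priv/gettext/default.pot
IO.puts(DomainPathSketch.po_path("it"))          # priv/gettext/it/LC_MESSAGES/default.po
IO.puts(DomainPathSketch.po_path("it", "errors")) # priv/gettext/it/LC_MESSAGES/errors.po
```

&lt;p&gt;Under this layout, a string passed to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dgettext/3&lt;/code&gt; with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;errors&quot;&lt;/code&gt; domain would be extracted to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;errors.pot&lt;/code&gt; and would correspond to a separate Transifex resource.&lt;/p&gt;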

&lt;p&gt;The workflow is roughly this: first, we extract strings to translate from our source code into a POT file with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix gettext.extract&lt;/code&gt; task, as shown above. Then, we upload this POT file to Transifex in order to update, add, or remove the strings to translate.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Uploading POT files to Transifex&quot; src=&quot;/assets/posts/2016-12-19-internationalization-of-elixir-applications-with-gettext-and-transifex/uploading-pot-to-transifex.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;After that, we wait for our coworkers to translate what they can and for the translation services to take care of translating. Once all the translations are ready, we download the PO files for all the languages we need from Transifex.&lt;/p&gt;

&lt;h3 id=&quot;command-line-tools&quot;&gt;Command-line tools&lt;/h3&gt;

&lt;p&gt;The workflow described above works but is pretty tedious, involves a lot of manual interactions, and becomes slower the more languages we need to translate to. Lucky for us, Transifex provides a nifty command-line tool called &lt;a href=&quot;https://docs.transifex.com/client/introduction&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx&lt;/code&gt;&lt;/a&gt;: this tool allows us to “push” source strings to translate when we run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix gettext.extract&lt;/code&gt; and to “pull” translated strings into PO files once they’re available.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx&lt;/code&gt; can be configured via the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.tx/config&lt;/code&gt; file in the root of your project. Ours looks like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[main]
host = https://www.transifex.com

[pushboy.default]
type = PO
source_file = priv/gettext/default.pot
source_lang = en
file_filter = priv/gettext/&amp;lt;lang&amp;gt;/LC_MESSAGES/default.po
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this file, we configure the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pushboy.default&lt;/code&gt; resource (which maps to our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;default&quot;&lt;/code&gt; Gettext domain) and instruct the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx&lt;/code&gt; tool that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;we want to use the Gettext format (PO and POT)&lt;/li&gt;
  &lt;li&gt;our source strings to translate live in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;priv/gettext/default.pot&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;our source language is English&lt;/li&gt;
  &lt;li&gt;the translations that we’ll download should end up inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;priv/gettext&lt;/code&gt;, under the language they’re translated to (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;lang&amp;gt;&lt;/code&gt; is replaced by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the configuration file above, we can now push new, edited, or removed source strings to translate with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx push --source&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--source&lt;/code&gt; ensures we only push the updated POT) and pull updated translations with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx pull&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&quot;cleaning-up-transifex-po-files&quot;&gt;Cleaning up Transifex PO files&lt;/h3&gt;

&lt;p&gt;We’re pretty strict when it comes to style in our codebase, and we found that the PO files that Transifex generated for us were not perfect for our taste. Luckily, Gettext provides tools to easily parse and modify PO/POT files. What we ended up having is a new Mix task, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix translations.pull&lt;/code&gt;, that abstracts the interaction with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx&lt;/code&gt; away.&lt;/p&gt;

&lt;p&gt;In this task, we first call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx pull&lt;/code&gt;, then we iterate over all pulled PO files and “reformat” them the way we want:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reformat_po_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;reformatted_po&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PO&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;parse_file!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reformat_headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;remove_top_of_the_file_comments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;

  &lt;span class=&quot;no&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PO&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dump&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reformatted_po&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# We get rid of the Last-Translator header&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reformat_headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(%&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;headers:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;po&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;new_headers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reject&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;starts_with?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;&amp;amp;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Last-Translator&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;po&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;headers:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_headers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# We get rid of comments that Transifex leaves at the top of the PO file&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;defp&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;remove_top_of_the_file_comments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(%&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;po&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;%&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Gettext&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PO&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;po&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;top_of_the_file_comments:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
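
&lt;p&gt;Putting the two steps together, the body of such a task could be organised roughly as follows. This is a sketch under assumptions rather than our actual task: the module name, the wildcard used to find PO files, and passing the pull and reformat steps in as functions are all choices made for this example.&lt;/p&gt;

```elixir
defmodule TranslationsPullSketch do
  # Hypothetical outline of a task like `mix translations.pull`.
  # `pull` and `reformat` are passed in as functions so that the sketch
  # can run without `tx` or Gettext being installed.
  def run(root, pull, reformat) do
    # Step 1: fetch the translations. The real task would shell out to
    # `tx pull`, for example with System.cmd("tx", ["pull"]).
    pull.()

    # Step 2: reformat every PO file found under ROOT/LOCALE/LC_MESSAGES/.
    root
    |> Path.join("*/LC_MESSAGES/*.po")
    |> Path.wildcard()
    |> Enum.sort()
    |> Enum.each(reformat)
  end
end

# Example run that only prints the files it would reformat.
TranslationsPullSketch.run(
  "priv/gettext",
  fn -> IO.puts("pulling translations") end,
  fn path -> IO.puts("reformatting #{path}") end
)
```

&lt;p&gt;In a real task, the second function would be something like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reformat_po_file/1&lt;/code&gt; from the code above.&lt;/p&gt;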

&lt;h2 id=&quot;the-complete-workflow&quot;&gt;The complete workflow&lt;/h2&gt;

&lt;p&gt;Our complete workflow is currently the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;after modifying our source code, we run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix gettext.extract&lt;/code&gt; and get a bunch of updated POT files&lt;/li&gt;
  &lt;li&gt;we run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tx push --source&lt;/code&gt; to update the strings to translate on Transifex&lt;/li&gt;
  &lt;li&gt;we wait for our coworkers and for the translation services to translate the updated strings&lt;/li&gt;
  &lt;li&gt;we run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mix translations.pull&lt;/code&gt; and get updated translations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We really enjoy this workflow as it allows us to programmatically update, push, and pull translations, and at the same time it allows us to scale really easily when adding new languages (as nothing changes in the workflow, we just need to order translations for those languages as well on Transifex).&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;We took a look at how we take care of translating push notifications for our Forza Football app into several languages in a way that is easy to use, fast, and scales well with the number of translations and languages we have.&lt;/p&gt;

</description>
            <pubDate>Mon, 19 Dec 2016 13:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/internationalization-of-elixir-applications-with-gettext-and-transifex</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/internationalization-of-elixir-applications-with-gettext-and-transifex</guid>
            
            
                <category>Elixir</category>
            
                <category>Internationalization</category>
            
                <category>Gettext</category>
            
                <category>Transifex</category>
            
        </item>
    
        <item>
            <title>Gathering metrics in Elixir applications</title>
            <description>&lt;p&gt;Metrics are a fundamental part of most pieces of software. They provide great insight into how a system behaves and performs, and can be used for different purposes, such as performance monitoring or alerting in case of anomalies.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Dashboard cover image&quot; src=&quot;/assets/posts/2016-08-26-gathering-metrics-in-elixir-applications/dashboard-cover-image.jpg&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Elixir is often praised as a fast language, especially for distributed, concurrent applications (like &lt;a href=&quot;/blog/the-pursuit-of-instant-pushes&quot;&gt;the ones we build at Forza Football&lt;/a&gt;). However, “fast” can be tricky to define without enough measurements available; without metrics, refactoring (especially aimed at improving speed) also becomes a risky thing to do, as it can be hard to determine if the performance is improving or not.&lt;/p&gt;

&lt;p&gt;Let’s have a look at how, and with what tools, we gather metrics in our Elixir applications.&lt;/p&gt;

&lt;h2 id=&quot;overview-of-the-stack&quot;&gt;Overview of the stack&lt;/h2&gt;

&lt;p&gt;Our metrics architecture is roughly shaped like this:&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Metrics architecture&quot; src=&quot;/assets/posts/2016-08-26-gathering-metrics-in-elixir-applications/architecture.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Let’s explore this architecture and the components that form it in more depth.&lt;/p&gt;

&lt;h2 id=&quot;storing-metrics&quot;&gt;Storing metrics&lt;/h2&gt;

&lt;p&gt;Let’s start from the core of the architecture: the database we use to store metrics. We’re using &lt;a href=&quot;https://influxdata.com/time-series-platform/influxdb/&quot;&gt;InfluxDB&lt;/a&gt;, an open-source time-series database which suits storing metrics perfectly; it performs really well (it may be interesting to know it’s written in &lt;a href=&quot;https://golang.org&quot;&gt;Go&lt;/a&gt;) and provides a bunch of very useful functionalities.&lt;/p&gt;

&lt;p&gt;InfluxDB databases are made up of a set of &lt;em&gt;measurements&lt;/em&gt;. Each measurement is a collection of data points; each data &lt;em&gt;point&lt;/em&gt; (that’s the official InfluxDB term) consists of a timestamp, a set of tags, and a set of values. Tags are for “tagging” the measurements with metadata about them: for example, a common tag attached to measurements is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;host&lt;/code&gt;, which holds the name of the host reporting the measurement. Points in the same measurement typically share the same tag keys, while the tag values can differ from point to point. Values, instead, are the subject of the measurement: what is being measured. For example, a value could represent the time it takes to complete an HTTP request, and be called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;request_time&lt;/code&gt;. A data point can have multiple values; when a point has a single value, we often call that value just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;value&lt;/code&gt;, since the measurement name will be enough to determine what that value represents. In the rest of the post, we will often use “measurements” to refer to both measurements and points, to make the reading experience more fluid.&lt;/p&gt;

&lt;p&gt;InfluxDB also provides a rich SQL-like language to query the data it stores, and ways to resample data for optimal storing. We won’t look at either of those in this post; if you’re interested, check out &lt;a href=&quot;https://docs.influxdata.com/influxdb/latest/&quot;&gt;their documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;InfluxDB supports a couple of ways to store data in it: measurements can be reported via an HTTP API or via a UDP interface, and can be encoded either as JSON or using InfluxDB’s own “line protocol”.
The UDP interface is naturally faster since it lacks the overhead of something like the HTTP protocol and the TCP transport; however, as always, it provides close to no guarantees of successful delivery of data. Choosing between the HTTP API and the UDP interface can be hard, but we will see how it’s possible to avoid this choice completely and get the best of both worlds.&lt;/p&gt;
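
&lt;p&gt;As a rough illustration of the UDP interface and the line protocol, the sketch below hand-encodes a single point and sends it with Erlang’s built-in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:gen_udp&lt;/code&gt; module. The measurement name, the tag, the value, and port 8089 are assumptions made up for this example, and the escaping rules of the real protocol are skipped:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# One point in line protocol: measurement,tag_key=tag_value field_key=field_value
# The timestamp is left out, so the receiving server assigns one.
line = &quot;request_time,host=web-1 value=42.5&quot;

# Port 0 asks the operating system for any free local port.
{:ok, socket} = :gen_udp.open(0, [:binary])

# Fire and forget: UDP does not confirm that the point was received.
:gen_udp.send(socket, {127, 0, 0, 1}, 8089, line)
:ok = :gen_udp.close(socket)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;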

&lt;p&gt;Last but not least: to improve the availability of InfluxDB and be more conservative with our metrics, we use &lt;a href=&quot;https://github.com/influxdata/influxdb-relay&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;influxdb-relay&lt;/code&gt;&lt;/a&gt;, which sets up a simple load-balancer-like architecture to add some replication to the metrics storage. This only works for writing metrics though, so we point directly to an InfluxDB instance when we need to read metrics.&lt;/p&gt;

&lt;h2 id=&quot;reporting-metrics&quot;&gt;Reporting metrics&lt;/h2&gt;

&lt;p&gt;The most straightforward way to report metrics to InfluxDB would be to send them directly from our Elixir applications to InfluxDB, using either the HTTP API or the UDP interface mentioned in the previous section.&lt;/p&gt;

&lt;p&gt;This choice, however, is a tough one for us: we have a high load of metrics, and we often need to report hundreds of thousands of metrics in a matter of seconds. This brings out disadvantages in both reporting methods: using the HTTP API would mean slowing down our applications, which is not something you want metrics reporting to do, but using the UDP interface would mean risking dropped metrics, especially because of the substantial number of metrics being reported.&lt;/p&gt;

&lt;h3 id=&quot;telegraf&quot;&gt;Telegraf&lt;/h3&gt;

&lt;p&gt;Luckily, the same folks behind InfluxDB provide another nifty open-source tool called &lt;a href=&quot;https://influxdata.com/time-series-platform/telegraf/&quot;&gt;Telegraf&lt;/a&gt;. Telegraf is basically a daemon that collects inputs from different sources and delivers them to different outputs every &lt;em&gt;n&lt;/em&gt; seconds; inputs and outputs are managed via “input plugins” and “output plugins”. We primarily use two input sources and one output.&lt;/p&gt;

&lt;h4 id=&quot;inputs&quot;&gt;Inputs&lt;/h4&gt;

&lt;p&gt;Telegraf gathers two inputs for us. The first one is “system metrics”, meaning system data such as CPU usage, memory usage, disk usage, network usage, and so on. Telegraf takes care of reading these measurements out of the box. The other input we’re using is the UDP listener built into Telegraf: this plugin provides a “mirror” of InfluxDB’s UDP interface, exposed locally through Telegraf. This is useful: for example, it means InfluxDB bindings in different languages can be built to work with the UDP interface, and automatically work with running instances of both InfluxDB and Telegraf. Fun fact: &lt;a href=&quot;https://github.com/lexmag&quot;&gt;@lexmag&lt;/a&gt; and I actually contributed this plugin to Telegraf because it was just perfect for our needs!&lt;/p&gt;

&lt;h4 id=&quot;outputs&quot;&gt;Outputs&lt;/h4&gt;

&lt;p&gt;We’re using only one output plugin: the InfluxDB plugin. This plugin is straightforward: it reports measurements gathered via input plugins to InfluxDB. It also provides some niceties such as global tags that can be applied to all measurements from every input plugin (we use this to set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;host&lt;/code&gt; tag for both application and system metrics) and measurement filtering. But for us the most useful feature is that this plugin talks to InfluxDB’s HTTP API instead of its UDP interface: this means that we can rely on the guarantees that HTTP (and the underlying TCP transport) provide to ensure metrics arrive at the InfluxDB server.&lt;/p&gt;

&lt;h3 id=&quot;aggregation&quot;&gt;Aggregation&lt;/h3&gt;

&lt;p&gt;For us, Telegraf works as a middleman between our applications reporting metrics and InfluxDB, and this has multiple advantages:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We can use the UDP interface to send metrics from our Elixir applications to Telegraf; this greatly benefits the speed at which we can report these metrics, and, since Telegraf is running locally, the risk of losing packets because of UDP is substantially reduced; since Telegraf reports these metrics to InfluxDB via HTTP, the chance of losing measurements is very low.&lt;/li&gt;
  &lt;li&gt;We can throttle the number of measurements that are reported to InfluxDB: Telegraf aggregates metrics for &lt;em&gt;n&lt;/em&gt; seconds (usually 5 or 10 seconds is a good number, depending on the application), before sending them to InfluxDB. This means that Telegraf will only send &lt;em&gt;bulks&lt;/em&gt; of metrics to InfluxDB, once every &lt;em&gt;n&lt;/em&gt; seconds, reducing the number of hits to InfluxDB and the network traffic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;telegrafinfluxdb-driver&quot;&gt;Telegraf/InfluxDB driver&lt;/h3&gt;

&lt;p&gt;When we started using InfluxDB and Telegraf, there were no Elixir drivers for InfluxDB’s UDP interface (we could only find one for InfluxDB’s HTTP API). But fear not: we built one! &lt;a href=&quot;https://github.com/lexmag/fluxter&quot;&gt;Fluxter&lt;/a&gt; is a straightforward Elixir library that provides a pool of UDP connections to InfluxDB/Telegraf (it doesn’t know the distinction, since the UDP protocol is the same) and a simple API to report metrics.&lt;/p&gt;

&lt;p&gt;Each Elixir application has its own Fluxter module:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The code above turns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MyApp.Fluxter&lt;/code&gt; into a supervised pool of UDP connections. To make things fault-tolerant, we make sure to start this pool under the application’s supervision tree:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;children&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
    &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

  &lt;span class=&quot;no&quot;&gt;Supervisor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_link&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;children&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;strategy:&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:one_for_one&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finally, we configure the pool via the application’s configuration:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# Assuming Telegraf is running on localhost:8086, and has the UDP input running:&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;config&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:fluxter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;host:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;localhost&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;ss&quot;&gt;port:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8086&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, once the Fluxter pool is started, we can easily report metrics through it:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;my_operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Perform my operation&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;something_done&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;my_tag:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;foo&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For more information on Fluxter, you can consult &lt;a href=&quot;https://hexdocs.pm/fluxter/&quot;&gt;its documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;erlang-vm-metrics&quot;&gt;Erlang VM metrics&lt;/h3&gt;

&lt;p&gt;Given we report metrics for everything, we couldn’t skip metrics from the Erlang VM: we gather detailed information about number of running processes, memory usage, process mailboxes, garbage collection, and more. To do that, we use &lt;a href=&quot;https://github.com/ferd/vmstats&quot;&gt;vmstats&lt;/a&gt;, a tiny Erlang application that collects measurements from the Erlang VM and reports them to a “sink”: a sink is just an Erlang/Elixir module implementing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:vmstats_sink&lt;/code&gt; behaviour. We use the Fluxter pool as the sink, and we have an identical setup for it in each of our Elixir applications:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;defmodule&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;kn&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;

  &lt;span class=&quot;nv&quot;&gt;@behaviour&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:vmstats_sink&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;value:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Have a look at vmstats’ &lt;a href=&quot;https://github.com/ferd/vmstats&quot;&gt;README&lt;/a&gt; for more information on what it can do.&lt;/p&gt;

&lt;h4 id=&quot;counters&quot;&gt;Counters&lt;/h4&gt;

&lt;p&gt;Since we report a very high number of metrics in a very short span of time, we experienced trouble with this setup: each reported measurement means a UDP packet sent to Telegraf, which makes the setup prone to overflowing, so packets easily get lost. For this reason we introduced &lt;strong&gt;counters&lt;/strong&gt; in Fluxter: a counter is simply another level of aggregation of &lt;em&gt;numeric&lt;/em&gt; measurements, at the Elixir level this time. Counters add up every measurement reported to them and only send the final sum to Telegraf/InfluxDB when “flushed”.&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;start_counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;my_operation_success&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;host:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;eu-west&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;Enum&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;each&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1_000_000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;my_operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
  &lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;increment_counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;MyApp&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Fluxter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flush_counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;counter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
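
&lt;p&gt;Conceptually, a counter behaves like a small piece of state that sums increments locally and emits a single value when flushed. The sketch below illustrates that idea with a plain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Agent&lt;/code&gt;; it is not Fluxter’s actual implementation, only a minimal model of the behaviour described above:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Start a process that holds the running sum.
{:ok, counter} = Agent.start_link(fn -&amp;gt; 0 end)

# A thousand increments only touch local state; nothing goes over the network.
Enum.each(1..1_000, fn _ -&amp;gt;
  Agent.update(counter, fn sum -&amp;gt; sum + 1 end)
end)

# &quot;Flushing&quot; reads the sum and resets it to zero;
# only this single value would be reported.
total = Agent.get_and_update(counter, fn sum -&amp;gt; {sum, 0} end)
IO.inspect(total)
# prints 1000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;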

&lt;h2 id=&quot;visualizing-metrics&quot;&gt;Visualizing metrics&lt;/h2&gt;

&lt;p&gt;All the metrics we report and store would be close to useless without a way of visualizing them. As our visualization tool, we use &lt;a href=&quot;http://grafana.org&quot;&gt;Grafana&lt;/a&gt;, which has native support for InfluxDB (meaning it can query InfluxDB out of the box) and which packs a ton of useful features.&lt;/p&gt;

&lt;p&gt;Our dashboards look roughly like this one:&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Grafana dashboard&quot; src=&quot;/assets/posts/2016-08-26-gathering-metrics-in-elixir-applications/grafana-dashboard.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The dashboard above is the one that we use to monitor the health of all our applications: it shows metrics from the system and from the Erlang VM. The select box on the top left corner (where it says “pushboy”) lets us choose which application to visualize metrics of, so that we can have an overview of the system and Erlang VM metrics for each application with just a couple of clicks.&lt;/p&gt;

&lt;p&gt;Grafana provides several useful features: it can group measurements by tag (useful, for example, for showing each host in a cluster as a different line on a graph), it can perform all sorts of aggregations on the data, and it is very customizable in its appearance.&lt;/p&gt;

&lt;p&gt;Grafana is probably the simplest part of the whole architecture, but a crucial one nonetheless.&lt;/p&gt;

&lt;h2 id=&quot;alerting&quot;&gt;Alerting&lt;/h2&gt;

&lt;p&gt;We mainly use metrics to gain insight into how our applications perform, but we also take advantage of them for another thing: alerting. Metrics can often be bent to communicate the health of an application: for example, if the number of Erlang processes in one of our applications goes way above the average value, we know it’s something we should look into.&lt;/p&gt;

&lt;p&gt;The lovely people behind InfluxDB and Telegraf have this covered as well with another open-source tool called &lt;a href=&quot;https://influxdata.com/time-series-platform/kapacitor/&quot;&gt;Kapacitor&lt;/a&gt;. Kapacitor is a daemon that can repeatedly perform queries on InfluxDB and act on the results of such queries. Kapacitor works by interpreting “Kapacitor scripts”, which are scripts written in a Kapacitor-specific language. A lot can be achieved with these scripts, but let’s keep it simple for this post: the script below is taken directly out of our system and shows how we use the Erlang VM metrics gathered by vmstats to determine the health of the application.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
stream
  |from()
    .measurement('pushboy_vm_modules')
  |groupBy('host')
  |deadman(15.0, 1m) //
    .id('Pushboy [{{ index .Tags &quot;host&quot; }}]')
    .message('{{ .ID }} is {{ if eq .Level &quot;OK&quot; }}up{{ else }}down{{ end }}')
    .stateChangesOnly()
    .slack()

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This small script checks for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vm_modules&lt;/code&gt; measurement and triggers an alert if the throughput drops below 15 points per minute (a healthy application with our configuration reports around 60 points per minute). As you may guess by the last line, this script will notify us by posting on a Slack channel (which is specified in Kapacitor’s configuration). Kapacitor supports several ways of notifying about alerts other than Slack, such as &lt;a href=&quot;https://www.pagerduty.com&quot;&gt;PagerDuty&lt;/a&gt; or email.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;This post showed how we gather as many metrics as we can from our Elixir applications (and the servers they run on) and how we use them to monitor the health and performance of such applications and to get alerts when problems arise. We’re quite happy with this system, as it works almost flawlessly and has made us much more scientific in our approach to software development: we now tend to experiment, gather data, and draw conclusions based on data instead of relying a bit too much on our gut.&lt;/p&gt;

</description>
            <pubDate>Fri, 26 Aug 2016 14:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/gathering-metrics-in-elixir-applications</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/gathering-metrics-in-elixir-applications</guid>
            
            
                <category>Elixir</category>
            
                <category>Metrics</category>
            
        </item>
    
        <item>
            <title>The Pursuit of Instant Pushes</title>
            <description>&lt;p&gt;At Forza Football we strive to provide the users of &lt;a href=&quot;http://www.forzafootball.com&quot;&gt;our apps&lt;/a&gt; with the best possible experience, and lightning-fast push notifications are one aspect of that.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Cherenkov radiation&quot; src=&quot;/assets/posts/2016-08-01-the-pursuit-of-instant-pushes/Cherenkov-radiation.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This photo (by Argonne National Laboratory) shows Cherenkov radiation glowing in the core of the Advanced Test Reactor. It’s emitted when a charged particle passes through a dielectric medium at a speed greater than the phase velocity of light in that medium.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;My name is &lt;a href=&quot;https://twitter.com/lexmag&quot;&gt;Aleksei&lt;/a&gt; and I spent most of last year building a new push notification system.&lt;/p&gt;

&lt;p&gt;This blog post tells an introductory story of why it was needed, how it operates today, the migration path to the new system, and what the plans for the future are.&lt;/p&gt;

&lt;p&gt;From a very high-level view, every push system consists of two logical parts: subscription tracking and notification sending.&lt;/p&gt;

&lt;h2 id=&quot;overview-of-our-legacy-system&quot;&gt;Overview of our legacy system&lt;/h2&gt;

&lt;p&gt;The legacy push system is an integral part of a much bigger &lt;a href=&quot;http://rubyonrails.org&quot;&gt;Ruby on Rails&lt;/a&gt; application, which is responsible for many more things besides handling push notifications.&lt;/p&gt;

&lt;p&gt;Let’s take a closer look at it.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;Legacy&quot; src=&quot;/assets/posts/2016-08-01-the-pursuit-of-instant-pushes/legacy-system.png&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;subscription-tracking&quot;&gt;Subscription tracking&lt;/h3&gt;

&lt;p&gt;There is an &lt;strong&gt;API&lt;/strong&gt; for handling subscribe/unsubscribe HTTP requests.
Each request goes through a validation step and then subscriptions get inserted/deleted.&lt;/p&gt;

&lt;p&gt;Before going further, I should explain what subscriptions look like.
Every subscription contains:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;device_token&lt;/code&gt; – an identifier in the specific vendor service&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;subject&lt;/code&gt; – match, team, or league&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;topic&lt;/code&gt; – the matter of interest: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;match_reminder&quot;&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&quot;goal&quot;&lt;/code&gt;, etc.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;language&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;country&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For instance, having the following subscription:&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;device_token&lt;/td&gt;
      &lt;td&gt;xfL3k6QxHagFNf8Y01a&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;subject&lt;/td&gt;
      &lt;td&gt;team, 42&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;topic&lt;/td&gt;
      &lt;td&gt;goal&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;language&lt;/td&gt;
      &lt;td&gt;en&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;country&lt;/td&gt;
      &lt;td&gt;SE&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;We know that someone from Sweden, who speaks English, wants to receive a notification every time their favourite team scores a goal.&lt;/p&gt;

&lt;h3 id=&quot;notification-sending&quot;&gt;Notification sending&lt;/h3&gt;

&lt;p&gt;The process of sending notifications starts when a match reminder is generated, an event has happened during the football match, or a transfer rumour is received.
&lt;strong&gt;Notifier&lt;/strong&gt; loads all the relevant subscriptions, filters out token duplicates, removes muted devices, and as the last step builds translated messages (sometimes limiting selection to specific countries). Yes, we have to deal with token duplicates, since an individual user could be subscribed to both teams in a single match (also to a league).&lt;/p&gt;
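
&lt;p&gt;The filtering steps above can be sketched in a few lines. This is an illustration written in Elixir with made-up data, not the actual &lt;strong&gt;Notifier&lt;/strong&gt; code (which is part of the Rails application); the field names follow the subscription attributes listed earlier, and the set of muted devices is hypothetical:&lt;/p&gt;

&lt;div class=&quot;language-elixir highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# The device &quot;a1&quot; follows both teams playing in the same match.
subscriptions = [
  %{device_token: &quot;a1&quot;, subject: {:team, 42}, topic: &quot;goal&quot;, language: &quot;en&quot;},
  %{device_token: &quot;a1&quot;, subject: {:team, 7}, topic: &quot;goal&quot;, language: &quot;en&quot;},
  %{device_token: &quot;b2&quot;, subject: {:team, 42}, topic: &quot;goal&quot;, language: &quot;sv&quot;}
]

muted = MapSet.new([&quot;b2&quot;])

recipients =
  subscriptions
  # Keep one subscription per device so nobody is notified twice.
  |&amp;gt; Enum.uniq_by(fn sub -&amp;gt; sub.device_token end)
  # Drop devices that have muted notifications.
  |&amp;gt; Enum.reject(fn sub -&amp;gt; MapSet.member?(muted, sub.device_token) end)
  # Group tokens by language so each message is translated only once.
  |&amp;gt; Enum.group_by(fn sub -&amp;gt; sub.language end, fn sub -&amp;gt; sub.device_token end)

IO.inspect(recipients)
# prints %{&quot;en&quot; =&amp;gt; [&quot;a1&quot;]}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;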

&lt;p&gt;After all these steps, the generated payload is sent to &lt;strong&gt;Dispatcher&lt;/strong&gt;, which does the actual sending to the particular vendor service (Apple Push Notification, Google Cloud Messaging, or Windows Push Notification), and that’s the moment when users start to receive their notifications.&lt;/p&gt;

&lt;h2 id=&quot;slow-pushes-are-slow&quot;&gt;Slow pushes are slow&lt;/h2&gt;

&lt;p&gt;After successfully using this setup for a long time, we reached a point where notification sending time had increased unacceptably.
For instance, it could take a few minutes, depending on application load, to send a push notification to 1 million subscribers.
A quick investigation showed that this was all due to &lt;strong&gt;Notifier&lt;/strong&gt; being extremely slow in loading the relevant subscriptions.
This was caused by the nature of the query (which included many &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JOIN&lt;/code&gt; expressions) and by the big size of the whole data set.&lt;/p&gt;

&lt;p&gt;An attempt to fix the problem had already been made: subscriber sharding was introduced and subscriber querying was moved to a slave machine with a fancy index.
Still, there was a strong need to replace MySQL with something more appropriate (and to change the data model).
Ideally, the replacement would allow us to parallelize subscription handling (translation, country filtering, etc.) in some way, instead of doing it sequentially.&lt;/p&gt;

&lt;p&gt;There was only one problem: too many data stores to choose from. Of course, we had to do research to pick the right one, so I limited the choice to a handful of databases (some I’ve worked with previously, some I selected based on their storage model and plenty of articles).
I usually say “let’s think about the future in the future”, but this wasn’t such a case, so I benchmarked the candidates on a sample data set of 200 million subscriptions, measuring the extraction of 4 million subscriptions.
The process was conceptually simple but rather time-consuming: it took about two months in total to pick a suitable database, &lt;a href=&quot;http://cassandra.apache.org&quot;&gt;Cassandra&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;cassandra-data-modeling&quot;&gt;Cassandra data modeling&lt;/h2&gt;

&lt;p&gt;There are many articles written on this topic, but one rule is worth repeating on every occasion: model around your queries.
This leads to data duplication and a heavier write load, which is a preferred tradeoff in Cassandra.&lt;/p&gt;

&lt;p&gt;The new push system operates with two tables in Cassandra: one to handle subscription tracking and another one for notification sending.&lt;/p&gt;

&lt;p&gt;Both tables leverage a compound primary key to store subscriptions that belong to the same topic and subject sequentially on disk on a single node (plus replicas), which makes reads extremely efficient.&lt;/p&gt;

&lt;p&gt;The role of the second table is to store normalized subscriptions.
As I mentioned previously, a user can have overlapping subscriptions, so this table effectively removes the burden of handling token duplicates during notification sending.&lt;/p&gt;

&lt;h2 id=&quot;moving-to-the-new-system&quot;&gt;Moving to the new system&lt;/h2&gt;

&lt;p&gt;Our new push system now has three specialized components built around the Cassandra database. These components are written in &lt;a href=&quot;http://elixir-lang.org&quot;&gt;Elixir&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For inter-system data exchange we use the &lt;a href=&quot;http://msgpack.org&quot;&gt;MessagePack&lt;/a&gt; format (with the &lt;a href=&quot;https://github.com/lexmag/msgpax&quot;&gt;Msgpax&lt;/a&gt; library) throughout the system. It is much faster than JSON and has a smaller footprint.&lt;/p&gt;

&lt;p&gt;&lt;img alt=&quot;New system&quot; src=&quot;/assets/posts/2016-08-01-the-pursuit-of-instant-pushes/new-system.png&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;subscription-tracking-1&quot;&gt;Subscription tracking&lt;/h3&gt;

&lt;p&gt;Besides the old &lt;strong&gt;API&lt;/strong&gt; we have the &lt;strong&gt;Pushkeeper&lt;/strong&gt; application, which is responsible for managing subscriptions; it has also taken over feedback handling, which isn’t depicted.
&lt;strong&gt;API&lt;/strong&gt; operates as before to support the legacy system, and also duplicates incoming requests to &lt;strong&gt;Pushkeeper&lt;/strong&gt; to allow a gradual migration to the new system.&lt;/p&gt;
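
&lt;p&gt;The request duplication can be pictured roughly as below. This is a simplified sketch, not the real &lt;strong&gt;API&lt;/strong&gt; code: &lt;code&gt;RecordingBackend&lt;/code&gt;, &lt;code&gt;handle_subscription_request&lt;/code&gt;, and the request fields are hypothetical names, and error handling is left out.&lt;/p&gt;

```python
class RecordingBackend:
    """Hypothetical backend that only records the requests it receives."""

    def __init__(self):
        self.received = []

    def handle(self, request):
        self.received.append(request)


def handle_subscription_request(request, legacy, shadow):
    # The legacy system keeps handling every request as before.
    legacy.handle(request)
    # A copy goes to the new system so that it builds up the same state.
    shadow.handle(dict(request))


legacy = RecordingBackend()
pushkeeper = RecordingBackend()
handle_subscription_request(
    {"token": "token-a", "subject": "team:42"}, legacy, pushkeeper
)
print(legacy.received == pushkeeper.received)  # True
```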

&lt;h3 id=&quot;notification-sending-1&quot;&gt;Notification sending&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Pushboy&lt;/strong&gt; application replaces &lt;strong&gt;Notifier&lt;/strong&gt;: it does the same tasks (loading subscriptions, translating messages, etc.).
However, there is one important difference: subscriptions are already normalized, and &lt;strong&gt;Pushboy&lt;/strong&gt; effectively streams them out of the Cassandra table. This means that users start receiving their notifications right away.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Pushpass&lt;/strong&gt; application is middleware between the former two; it is responsible for data normalization, that is, filtering out token duplicates and populating the specialized table.&lt;/p&gt;
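
&lt;p&gt;As a minimal sketch of what normalization means here, assume each raw subscription is a (subject, token) pair and that a device subscribed to both a team and one of its matches shows up twice. The function name and sample data are illustrative only; the real work happens ahead of time against Cassandra, not on an in-memory list.&lt;/p&gt;

```python
def normalize(subscriptions):
    """Collapse (subject, token) pairs into one entry per token, first seen first."""
    seen = set()
    unique_tokens = []
    for _subject, token in subscriptions:
        if token not in seen:
            seen.add(token)
            unique_tokens.append(token)
    return unique_tokens


raw = [
    ("team:42", "token-a"),
    ("match:1001", "token-a"),  # same device, intersecting subscription
    ("match:1001", "token-b"),
]
print(normalize(raw))  # ['token-a', 'token-b']
```

&lt;p&gt;Because this step runs before a notification is triggered, the sending side can stream tokens without checking for duplicates.&lt;/p&gt;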

&lt;h2 id=&quot;measure-all-the-things&quot;&gt;Measure all the things&lt;/h2&gt;

&lt;p&gt;Migration to new systems is hard. To verify that the new system works as expected, we ran it in parallel with the legacy system for about half a year, performing all operations except the very last step: sending data to vendor services. Only a limited set of test devices was whitelisted.
This helped us refine the deployment process and the failure recovery mechanism and, of course, measure all the things.&lt;/p&gt;
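
&lt;p&gt;The parallel run can be sketched as a gate in front of the final send. This is an illustration under assumptions rather than our production code: &lt;code&gt;shadow_send&lt;/code&gt;, &lt;code&gt;vendor_send&lt;/code&gt;, and the token values are hypothetical.&lt;/p&gt;

```python
def shadow_send(tokens, payload, whitelist, vendor_send):
    """Process every token, but hand only whitelisted ones to the vendor."""
    sent, skipped = [], []
    for token in tokens:
        if token in whitelist:
            vendor_send(token, payload)
            sent.append(token)
        else:
            # All earlier steps still ran for this token; only the send is skipped.
            skipped.append(token)
    return sent, skipped


outbox = []
sent, skipped = shadow_send(
    ["token-a", "token-b", "token-c"],
    {"title": "Goal!"},
    whitelist={"token-b"},
    vendor_send=lambda token, payload: outbox.append((token, payload)),
)
print(sent, skipped)  # ['token-b'] ['token-a', 'token-c']
```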

&lt;p&gt;Our metrics system (which deserves &lt;a href=&quot;/blog/gathering-metrics-in-elixir-applications&quot;&gt;its own blog post&lt;/a&gt;) was actually built from scratch during the migration to the new push system.
I’ll just briefly mention some tools, libraries, and technologies we use for metrics: applications report metrics via the &lt;a href=&quot;https://github.com/lexmag/fluxter&quot;&gt;Fluxter&lt;/a&gt; library to a locally installed &lt;a href=&quot;https://influxdata.com/time-series-platform/telegraf&quot;&gt;Telegraf&lt;/a&gt; agent, which forwards them to the &lt;a href=&quot;https://influxdata.com/time-series-platform/influxdb/&quot;&gt;InfluxDB&lt;/a&gt; store.&lt;/p&gt;

&lt;p&gt;According to a vast range of metrics, the new system performs great. Let’s see how long it takes to send a notification to 1.3 million subscribers.&lt;/p&gt;

&lt;p&gt;The legacy system requires 40 seconds to prepare the payload, and an additional 5 seconds to dispatch it.&lt;/p&gt;

&lt;p&gt;The new system finishes sending in 16 seconds, and it does so in a streaming manner: half of the subscribers receive their notification within 8 seconds.&lt;/p&gt;
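
&lt;p&gt;The numbers above work out as follows. The sketch assumes that the new system sends at a constant rate over its 16 seconds (which is what “half within 8 seconds” implies); the variable and function names are just for illustration.&lt;/p&gt;

```python
SUBSCRIBERS = 1_300_000

# Legacy system: preparation followed by dispatch.
legacy_prepare_s = 40
legacy_dispatch_s = 5
legacy_total_s = legacy_prepare_s + legacy_dispatch_s

# New system: streaming, assumed to run at a constant rate.
new_total_s = 16
new_rate = SUBSCRIBERS / new_total_s  # notifications per second


def streamed_after(seconds):
    """Notifications already sent by the new system after `seconds`."""
    return int(min(seconds, new_total_s) * new_rate)


print(legacy_total_s)     # 45
print(new_rate)           # 81250.0
print(streamed_after(8))  # 650000
```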

&lt;h2 id=&quot;bottleneck-shifts-to-other-place&quot;&gt;The bottleneck shifts elsewhere&lt;/h2&gt;

&lt;p&gt;Components of the new system reside in one datacenter while the legacy system is in another (yes, one more migration: we’re moving to the Amazon Web Services cloud). This badly affects the request time for sending the generated payload to &lt;strong&gt;Dispatcher&lt;/strong&gt;. On top of that, &lt;strong&gt;Pushboy&lt;/strong&gt; produces hundreds of such requests per second.&lt;/p&gt;

&lt;p&gt;In fact, &lt;strong&gt;Pushboy&lt;/strong&gt; finishes processing only as fast as the slowest subscription chunk is processed by &lt;strong&gt;Dispatcher&lt;/strong&gt;.&lt;/p&gt;
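
&lt;p&gt;Put as a small calculation: if all chunk requests are in flight at the same time, the overall finish time is the maximum of the per-chunk request times, not their sum. The latencies below are made up, and the sketch ignores any limit on the number of concurrent requests.&lt;/p&gt;

```python
def finish_time_ms(chunk_latencies_ms, concurrent=True):
    """Time until the last chunk is acknowledged, in milliseconds."""
    if not chunk_latencies_ms:
        return 0
    if concurrent:
        # All requests overlap, so the slowest one decides.
        return max(chunk_latencies_ms)
    # One request after another, so the times add up.
    return sum(chunk_latencies_ms)


latencies = [120, 95, 430, 110]  # made-up per-chunk request times
print(finish_time_ms(latencies))                    # 430
print(finish_time_ms(latencies, concurrent=False))  # 755
```

&lt;p&gt;A single slow cross-datacenter request is therefore enough to hold back the whole send.&lt;/p&gt;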

&lt;h2 id=&quot;long-live-the-new-system&quot;&gt;Long live the new system!&lt;/h2&gt;

&lt;p&gt;Since last month, we have been running the new push system in production for some particular notification topics.&lt;/p&gt;

&lt;p&gt;Besides the pure performance improvement, a number of other things have improved:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;No more “MySQL as a queue” anti-pattern, which brought locking issues and made data growth hard to control.&lt;/li&gt;
  &lt;li&gt;Notification sending executes in parallel, in a streaming manner.&lt;/li&gt;
  &lt;li&gt;The main Ruby on Rails application no longer affects push system operation (for example, CPU stealing was a noticeable problem before), and its complexity has decreased.&lt;/li&gt;
  &lt;li&gt;Tiny targeted components make maintenance easier.&lt;/li&gt;
  &lt;li&gt;We have started to practice canary deployments.&lt;/li&gt;
  &lt;li&gt;We simply need fewer resources than before.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;elixir-matters-a-lot-for-the-system&quot;&gt;Elixir matters a lot for the system&lt;/h3&gt;

&lt;p&gt;Programmer happiness is only the start: there are many, much more pragmatic reasons to choose Elixir.&lt;/p&gt;

&lt;p&gt;Elixir’s compiler is wonderful. It is smart, fast, and incredibly helpful. It actively helps eliminate bugs early in the development cycle.&lt;/p&gt;

&lt;p&gt;The Erlang Runtime System enabled us to build a highly concurrent system with strong fault-tolerance characteristics.
It also provides better introspection capabilities and live system debugging.&lt;/p&gt;

&lt;p&gt;Furthermore, we use a distributed setup for several things, for example, to reduce the probability of duplicated delivery or to coordinate subscription normalization.&lt;/p&gt;

&lt;h2 id=&quot;beyond-the-speed-of-light&quot;&gt;Beyond the speed of light&lt;/h2&gt;

&lt;p&gt;Things have indeed improved, but there is still much left to do:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Complete the switch to the new system in the coming weeks.&lt;/li&gt;
  &lt;li&gt;Replace &lt;strong&gt;Dispatcher&lt;/strong&gt; with &lt;a href=&quot;/blog/maximizing-http2-performance-with-genstage&quot;&gt;the &lt;strong&gt;Pushgate&lt;/strong&gt; application&lt;/a&gt;, which is located in the same datacenter.&lt;/li&gt;
  &lt;li&gt;Introduce &lt;strong&gt;API&lt;/strong&gt; v2, revised and enhanced.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition, we have quite an ambitious plan to release several Elixir libraries. Stay tuned.&lt;/p&gt;

</description>
            <pubDate>Tue, 02 Aug 2016 13:00:00 +0000</pubDate>
            <link>http://tech.forzafootball.com/blog/the-pursuit-of-instant-pushes</link>
            <guid isPermaLink="true">http://tech.forzafootball.com/blog/the-pursuit-of-instant-pushes</guid>
            
            
                <category>Elixir</category>
            
                <category>Cassandra</category>
            
                <category>Push notifications</category>
            
        </item>
    
  </channel>
</rss>
