{"id":95,"date":"2019-04-27T21:25:09","date_gmt":"2019-04-27T20:25:09","guid":{"rendered":"http:\/\/pranavmani.com\/?p=95"},"modified":"2019-07-28T01:04:03","modified_gmt":"2019-07-28T00:04:03","slug":"compile-time-computation-in-c","status":"publish","type":"post","link":"https:\/\/pranavmani.com\/?p=95","title":{"rendered":"Compile Time computation in C++"},"content":{"rendered":"\n<p style=\"font-size:16.5px\">One of the topics that interest me in C++ is compile-time computation. In this article, we will look at how compile-time computation is done in C++. Compile-time computation can be done in three ways I know of. One is using macros to compute constants, the second way is using template metaprogramming and the other way is using the keyword <code>constexpr<\/code> to define computations. Since macros are so commonly used, let&#8217;s dive into doing compile-time computation using template meta-programming. Let us start with the simplest computation, which is the addition of two numbers. The template should take two arguments, namely x and y and compute the output. This can be done as shown in the code snippet below:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\n#include &lt;iostream&gt;\ntemplate&lt;int x, int y&gt;\nstruct sum\n{\n    static const int out = x + y;\n};\n\nint main()\n{\n    std::cout&lt;&lt;&quot;Sum of 3 and 5 = &quot;&lt;&lt; sum&lt;3,5&gt;::out&lt;&lt;std::endl;\n    return 0;\n}\n<\/pre><\/div>\n\n\n<p style=\"font-size:16.5px\">If you want to ensure that the compiler does the job for you, you can look at the assembly output of the compiler through <a href=\"https:\/\/godbolt.org\/\">compiler explorer<\/a> (<code>gcc 9.1 -O3<\/code>). The result of the computation is on line 6 which is generated by the compiler. Please note that most of the other generated assembly is for printing the string on to the standard output. <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; highlight: [6]; title: ; notranslate\" title=\"\">\nmain:\n        sub     rsp, 8\n        mov     esi, OFFSET FLAT:.LC0\n        mov     edi, OFFSET FLAT:_ZSt4cout\n        call    std::basic_ostream&lt;char, std::char_traits&lt;char&gt; &gt;&amp; std::operator&lt;&lt; &lt;std::char_traits&lt;char&gt; &gt;(std::basic_ostream&lt;char, std::char_traits&lt;char&gt; &gt;&amp;, char const*)\n        mov     esi, 8\n        mov     rdi, rax\n        call    std::basic_ostream&lt;char, std::char_traits&lt;char&gt; &gt;::operator&lt;&lt;(int)\n        mov     rdi, rax\n        call    std::basic_ostream&lt;char, std::char_traits&lt;char&gt; &gt;&amp; std::endl&lt;char, std::char_traits&lt;char&gt; &gt;(std::basic_ostream&lt;char, std::char_traits&lt;char&gt; &gt;&amp;)\n        xor     eax, eax\n        add     rsp, 8\n        ret\n<\/pre><\/div>\n\n\n<p style=\"font-size:16.5px\">Now, let us write such a function for a little-bit more compute-intensive operation, say factorial. A factorial of a number n can be written recursively as  \\(n \times factorial(n-1)\\) . Going recursively inside a function, you eventually reach a point where the factorial(1) needs to be returned. So, for this, a specialization for a value 1 needs to be explicitly mentioned for the compiler to calculate the recursive factorial. See the implementation of the factorial function below.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\n#include &lt;iostream&gt;\n\ntemplate&lt;int n&gt;\nstruct factorial\n{\n    static const int out = n * factorial&lt;n-1&gt;::out;\n};\n\ntemplate&lt;&gt;\nstruct factorial&lt;1&gt;\n{\n    static const int out = 1; \n};\n\nint main()\n{\n    std::cout&lt;&lt;&quot;Factorial(5)= &quot;&lt;&lt; factorial&lt;5&gt;::out&lt;&lt;std::endl;\n    return 0;\n}\n<\/pre><\/div>\n\n\n<p style=\"font-size:16.5px\">You can also verify this with compiler explorer and check if the generated assembly has the value of \\(5!=120\\) without explicitly calling any function. Let us do an even more special operation, say \\(a^x\\). <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\n#include &lt;iostream&gt;\n\ntemplate&lt;int x, int a&gt;\nstruct power\n{\n    static const int out = x * power&lt;x, a-1&gt;::out;\n};\n\ntemplate&lt;int x&gt;\nstruct power&lt;x, 0&gt;\n{\n    static const int out = 1; \n};\n\nint main()\n{\n    std::cout&lt;&lt;&quot;5 ^ 3 = &quot;&lt;&lt; power&lt;5, 3&gt;::out&lt;&lt;std::endl;\n    return 0;\n}\n<\/pre><\/div>\n\n\n<p style=\"font-size:16.5px\">However, using templates for compile-time computation has its own limits. Another interesting function, which I would like to be evaluated is the square root of a number. However, templated arguments can be non-type if it is an integer constant or a pointer to an object with external linkage.<\/p>\n\n\n\n<p style=\"font-size:16.5px\">And since square root of any number would evaluate to a float\/double, we cannot use templates for such computation. This is one of the situations where I would go for constexpr to evaluate expressions. The square root of a number can be computed recursively using the <a href=\"http:\/\/ https:\/\/helloacm.com\/newton-iterative-sqrt-method\/ \">newton raphson method<\/a>. There are two implementations below, one in which we compute the square root of a number, in run time(without the use of <code>constexpr<\/code>) and one in compile-time. This is a very quick and dirty implementation using Newton Raphson, just to illustrate the usage of <code>constexpr<\/code>. There are more efficient and safe ways to compute the square root of a number. <\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\n#include&lt;iostream&gt;\n\nconstexpr double EPSILON = 1E-12;\n\n\/*computes sqrt during run time, not the most of efficient implementation of\n  sqrt*\/\ndouble sqrt_runtime(double x, double guess = 1)\n{   \n   double x_approx = 0.5 * (guess + x\/guess);    \n   if((x_approx*x_approx) - x &lt; EPSILON) return x_approx;\n   else return sqrt_runtime(x, x_approx) ;    \n}\n\n\/*computes sqrt during compile time*\/\nconstexpr double sqrt_compile_time(double x, double guess = 1)\n{\n    return (0.25 * (guess + x\/guess) * (guess + x\/guess) - x &lt; EPSILON)?(0.5 * (guess + x\/guess)):\n                                                                        sqrt_compile_time(x, 0.5 * (guess + x\/guess));\n    \n}\nint main()\n{\n   const double rt_val =  sqrt_runtime(200);  \n   std::cout&lt;&lt;&quot;Computed during runtime: &quot;&lt;&lt;rt_val&lt;&lt;std::endl;\n   constexpr double compile_val =  sqrt_compile_time(200);\n   std::cout&lt;&lt;&quot;Computed during compile time: &quot;&lt;&lt;compile_val&lt;&lt;std::endl;\n}\n<\/pre><\/div>\n\n\n<p style=\"font-size:16.5px\">Now, let us benchmark this code quickly using <a href=\"http:\/\/www.quick-bench.com\">quick-bench<\/a>.  This is a tool which can be used to quickly benchmark your C++ implementations. This is not a very comprehensive benchmark and uses only a single number. This is just to give the reader a idea of how much difference compile time computation can make. The snippet below was benchmarked on quick-bench with the <code>gcc 8.2 (C++11 -O3)<\/code>compiler. I also included the standard library implementation to the benchmark to check how it compares against our implementation.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\n#include&lt;iostream&gt;\n#include&lt;cmath&gt;\n\nconstexpr double EPSILON = 1E-12;\n\ndouble sqrt_runtime(double x, double guess = 1)\n{   \n   double x_approx = 0.5 * (guess + x\/guess);    \n   if((x_approx*x_approx) - x &lt; EPSILON) return x_approx;\n   else return sqrt_runtime(x, x_approx) ;    \n}\n\n\/*computes sqrt during compile time*\/\nconstexpr double sqrt_compile_time(double x, double guess = 1)\n{\n    return (0.25 * (guess + x\/guess) * (guess + x\/guess) - x &lt; EPSILON)?(0.5 * (guess + x\/guess)):\n                                                                        sqrt_compile_time(x, 0.5 * (guess + x\/guess));\n    \n}\n\nstatic void RuntimeComputation(benchmark::State&amp; state) {\n  \/\/ Code inside this loop is measured repeatedly\n  for (auto _ : state) {\n    constexpr double x = 1000;\n    double root = sqrt_runtime(x);\n  }\n}\n\/\/ Register the function as a benchmark\nBENCHMARK(RuntimeComputation);\n\nstatic void CompileTimeComputation(benchmark::State&amp; state) {\n  \/\/ Code before the loop is not measured\n  for (auto _ : state) {\n    constexpr double x = 1000;\n    constexpr double root = sqrt_compile_time(x);\n  }\n}\nBENCHMARK(CompileTimeComputation);\n\n\nstatic void CPlusPlusStdLib(benchmark::State&amp; state) {\n  \/\/ Code before the loop is not measured\n  for (auto _ : state) {\n    constexpr double x = 1000;\n    constexpr double root = std::sqrt(x);\n  }\n}\n\nBENCHMARK(CPlusPlusStdLib);\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pranavmani.com\/wp-content\/uploads\/2019\/07\/IsVd06K62MNivNwjjLJbgtCiucc-1024x512.png\" alt=\"\" class=\"wp-image-90\" width=\"1377\" height=\"689\" srcset=\"https:\/\/pranavmani.com\/wp-content\/uploads\/2019\/07\/IsVd06K62MNivNwjjLJbgtCiucc-1024x512.png 1024w, https:\/\/pranavmani.com\/wp-content\/uploads\/2019\/07\/IsVd06K62MNivNwjjLJbgtCiucc-300x150.png 300w, https:\/\/pranavmani.com\/wp-content\/uploads\/2019\/07\/IsVd06K62MNivNwjjLJbgtCiucc-768x384.png 768w, https:\/\/pranavmani.com\/wp-content\/uploads\/2019\/07\/IsVd06K62MNivNwjjLJbgtCiucc.png 1209w\" sizes=\"(max-width: 1377px) 100vw, 1377px\" \/><figcaption>Results of Benchmarking<\/figcaption><\/figure>\n\n\n\n<p style=\"font-size:16.5px\">Looking at the results of the <a href=\"http:\/\/quick-bench.com\/IsVd06K62MNivNwjjLJbgtCiucc\">benchmark<\/a>, we can see that compile time computation based implementation is much faster than the run-time computation(~27,000 times faster). It is also interesting to note that  our implementation has equivalent performance to the C++ standard library implementation. It would be interesting to see how the C++ standard implements square root, but that discussion is for another day. In summary, use the C++ compiler to do the computations, if you know the value to be computed in advance, either using templates or <code>constexpr<\/code>.<\/p>\n\n\n\n<p style=\"font-size:16px\">Thanks for the reading. If you have any questions, please post a comment below.<\/p>\n\n\n\n<p>Some readings on the topic:<br> <a href=\"https:\/\/binary-studio.com\/2015\/09\/18\/calculation-at-compile-time-by-help-of-constant-expression\/\">https:\/\/binary-studio.com\/2015\/09\/18\/calculation-at-compile-time-by-help-of-constant-expression\/<\/a> <\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the topics that interest me in C++ is compile-time computation. In this article, we will look at how compile-time computation is done in C++. Compile-time computation can be done in three ways I know of. One is using macros to compute constants, the second way is using template metaprogramming and the other way [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_oct_exclude_from_cache":false,"footnotes":""},"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/pranavmani.com\/index.php?rest_route=\/wp\/v2\/posts\/95"}],"collection":[{"href":"https:\/\/pranavmani.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pranavmani.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pranavmani.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pranavmani.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=95"}],"version-history":[{"count":5,"href":"https:\/\/pranavmani.com\/index.php?rest_route=\/wp\/v2\/posts\/95\/revisions"}],"predecessor-version":[{"id":106,"href":"https:\/\/pranavmani.com\/index.php?rest_route=\/wp\/v2\/posts\/95\/revisions\/106"}],"wp:attachment":[{"href":"https:\/\/pranavmani.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=95"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pranavmani.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=95"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pranavmani.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=95"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}