From 07839b29fcdc77a79af7a62c91851e979767f251 Mon Sep 17 00:00:00 2001
From: Lilian Besson <lbesson@ens-cachan.fr>
Date: Fri, 30 Jun 2017 14:09:23 +0200
Subject: [PATCH] Added a smarter version of Romberg integration, without
 dictionary...

- Julia wins against Python :snake:, but not much
- Julia smart version was about 2.5 faster than Python smart version executed by Pypy
- and Julia wasn't 10x faster than pure Python...
- Done for now :cry:, but I will try again Julia soon, and hopefully some tweaking :wrench: make it efficient
---
 Benchmark_between_Python_and_Julia.ipynb | 611 +++++++++++++++++++----
 1 file changed, 509 insertions(+), 102 deletions(-)
diff --git a/Benchmark_between_Python_and_Julia.ipynb b/Benchmark_between_Python_and_Julia.ipynb
index f1f4fd2..f0dccb0 100644
--- a/Benchmark_between_Python_and_Julia.ipynb
+++ b/Benchmark_between_Python_and_Julia.ipynb
@@ -7,7 +7,7 @@
    },
    "source": [
     "# Table of Contents\n",
-    " <p><div class=\"lev1 toc-item\"><a href=\"#Benchmark-between-Python-and-Julia\" data-toc-modified-id=\"Benchmark-between-Python-and-Julia-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Benchmark between Python and Julia</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method\" data-toc-modified-id=\"The-Romberg-method-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>The Romberg method</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method,-naive-recursive-version-in-Python\" data-toc-modified-id=\"The-Romberg-method,-naive-recursive-version-in-Python-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>The Romberg method, naive recursive version in Python</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method,-dynamic-programming-version-in-Python\" data-toc-modified-id=\"The-Romberg-method,-dynamic-programming-version-in-Python-13\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>The Romberg method, dynamic programming version in Python</a></div><div class=\"lev2 toc-item\"><a href=\"#First-benchmark\" data-toc-modified-id=\"First-benchmark-14\"><span class=\"toc-item-num\">1.4&nbsp;&nbsp;</span>First benchmark</a></div><div class=\"lev2 toc-item\"><a href=\"#Using-Pypy-for-speedup\" data-toc-modified-id=\"Using-Pypy-for-speedup-15\"><span class=\"toc-item-num\">1.5&nbsp;&nbsp;</span>Using Pypy for speedup</a></div><div class=\"lev2 toc-item\"><a href=\"#Numba-version-for-Python\" data-toc-modified-id=\"Numba-version-for-Python-16\"><span class=\"toc-item-num\">1.6&nbsp;&nbsp;</span>Numba version for Python</a></div><div class=\"lev2 toc-item\"><a href=\"#Naive-Julia-version\" data-toc-modified-id=\"Naive-Julia-version-17\"><span class=\"toc-item-num\">1.7&nbsp;&nbsp;</span>Naive Julia version</a></div><div class=\"lev2 toc-item\"><a href=\"#Benchmark-between-Python,-Pypy-and-Julia\" data-toc-modified-id=\"Benchmark-between-Python,-Pypy-and-Julia-18\"><span class=\"toc-item-num\">1.8&nbsp;&nbsp;</span>Benchmark between Python, Pypy and Julia</a></div><div class=\"lev2 toc-item\"><a href=\"#Second-benchmark\" data-toc-modified-id=\"Second-benchmark-19\"><span class=\"toc-item-num\">1.9&nbsp;&nbsp;</span>Second benchmark</a></div><div class=\"lev2 toc-item\"><a href=\"#Conclusion\" data-toc-modified-id=\"Conclusion-110\"><span class=\"toc-item-num\">1.10&nbsp;&nbsp;</span>Conclusion</a></div>"
+    " <p><div class=\"lev1 toc-item\"><a href=\"#Benchmark-between-Python-and-Julia\" data-toc-modified-id=\"Benchmark-between-Python-and-Julia-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Benchmark between Python and Julia</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method\" data-toc-modified-id=\"The-Romberg-method-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>The Romberg method</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method,-naive-recursive-version-in-Python\" data-toc-modified-id=\"The-Romberg-method,-naive-recursive-version-in-Python-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>The Romberg method, naive recursive version in Python</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method,-dynamic-programming-version-in-Python\" data-toc-modified-id=\"The-Romberg-method,-dynamic-programming-version-in-Python-13\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>The Romberg method, dynamic programming version in Python</a></div><div class=\"lev2 toc-item\"><a href=\"#The-Romberg-method,-better-dynamic-programming-version-in-Python\" data-toc-modified-id=\"The-Romberg-method,-better-dynamic-programming-version-in-Python-14\"><span class=\"toc-item-num\">1.4&nbsp;&nbsp;</span>The Romberg method, better dynamic programming version in Python</a></div><div class=\"lev2 toc-item\"><a href=\"#First-benchmark\" data-toc-modified-id=\"First-benchmark-15\"><span class=\"toc-item-num\">1.5&nbsp;&nbsp;</span>First benchmark</a></div><div class=\"lev2 toc-item\"><a href=\"#Using-Pypy-for-speedup\" data-toc-modified-id=\"Using-Pypy-for-speedup-16\"><span class=\"toc-item-num\">1.6&nbsp;&nbsp;</span>Using Pypy for speedup</a></div><div class=\"lev2 toc-item\"><a href=\"#Numba-version-for-Python\" data-toc-modified-id=\"Numba-version-for-Python-17\"><span class=\"toc-item-num\">1.7&nbsp;&nbsp;</span>Numba version for Python</a></div><div class=\"lev2 toc-item\"><a href=\"#Naive-Julia-version\" data-toc-modified-id=\"Naive-Julia-version-18\"><span class=\"toc-item-num\">1.8&nbsp;&nbsp;</span>Naive Julia version</a></div><div class=\"lev2 toc-item\"><a href=\"#Benchmark-between-Python,-Pypy-and-Julia\" data-toc-modified-id=\"Benchmark-between-Python,-Pypy-and-Julia-19\"><span class=\"toc-item-num\">1.9&nbsp;&nbsp;</span>Benchmark between Python, Pypy and Julia</a></div><div class=\"lev2 toc-item\"><a href=\"#Second-benchmark\" data-toc-modified-id=\"Second-benchmark-110\"><span class=\"toc-item-num\">1.10&nbsp;&nbsp;</span>Second benchmark</a></div><div class=\"lev2 toc-item\"><a href=\"#Conclusion\" data-toc-modified-id=\"Conclusion-111\"><span class=\"toc-item-num\">1.11&nbsp;&nbsp;</span>Conclusion</a></div><div class=\"lev3 toc-item\"><a href=\"#Remark\" data-toc-modified-id=\"Remark-1111\"><span class=\"toc-item-num\">1.11.1&nbsp;&nbsp;</span>Remark</a></div>"
    ]
   },
   {
@@ -385,19 +385,160 @@
    "metadata": {},
    "source": [
     "----\n",
-    "## First benchmark"
+    "## The Romberg method, better dynamic programming version in Python\n",
+    "\n",
+    "Instead of using a dictionary, which gets filled up dynamically (and so, slowly), let us use a numpy arrays, as we already know the size of the array we need ($n+1 \\times m+1$).\n",
+    "\n",
+    "Note that only half of the array is used, so we could try to use [sparse matrices](https://docs.scipy.org/doc/scipy/reference/sparse.html) maybe, for triangular matrices? From what I know, it is not worth it, both in term of memory information (if the sparsity measure is $\\simeq 1/2$, you don't win anything from [LIL](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html#scipy.sparse.lil_matrix) or other sparse matrices representation...\n",
+    "We could use [`numpy.tri`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.tri.html) but this uses a dense array so nope."
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 8,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "\n",
+    "def romberg_better(f, xmin, xmax, n=8, m=None):\n",
+    "    assert xmin <= xmax\n",
+    "    if m is None:\n",
+    "        m = n\n",
+    "    assert n >= m >= 0\n",
+    "    # First value:\n",
+    "    r = np.zeros((n+1, m+1))\n",
+    "    r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
+    "\n",
+    "    # One side of the triangle:\n",
+    "    for i in range(1, n + 1):\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
+    "\n",
+    "    # All the other values:\n",
+    "    for j in range(1, m + 1):\n",
+    "        for i in range(j, n + 1):\n",
+    "            r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n",
+    "\n",
+    "    return r[n, m]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "404122.62720728031"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "372300.32065714896"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "373621.99733807985"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "373650.47843482986"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "409937.61052426015"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "409937.5786981525"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "text/plain": [
+       "409937.5786979791"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "romberg_better(f, a, b, n=0)  # really not accurate!\n",
+    "romberg_better(f, a, b, n=1)  # alreay pretty good!\n",
+    "romberg_better(f, a, b, n=2)\n",
+    "romberg_better(f, a, b, n=3)\n",
+    "romberg_better(f, a, b, n=8)  # Almost the exact value.\n",
+    "romberg_better(f, a, b, n=10)  # Almost the exact value.\n",
+    "romberg_better(f, a, b, n=12)  # Almost the exact value."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "It converges quite quickly to the \"true\" value as given by `scipy.integrate.quad`."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "----\n",
+    "## First benchmark"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "372 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
+      "390 µs ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
      ]
     }
    ],
@@ -407,14 +548,14 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 11,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "51.9 ms ± 803 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
+      "52.8 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
      ]
     }
    ],
@@ -424,14 +565,14 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 12,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "801 µs ± 8.32 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
+      "819 µs ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
      ]
     }
    ],
@@ -439,11 +580,30 @@
     "%timeit romberg(f, a, b, n=10)"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "844 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit romberg_better(f, a, b, n=10)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We already see that the recursive version is *much* slower than the dynamic programming one!"
+    "We already see that the recursive version is *much* slower than the dynamic programming one!\n",
+    "\n",
+    "But there is not much difference between the one using dictionary (`romberg()`) and the one using a numpy array of a known size (`romberg_better()`)."
    ]
   },
   {
@@ -456,10 +616,10 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 14,
    "metadata": {
     "code_folding": [
-     7
+     9
     ]
    },
    "outputs": [
@@ -467,14 +627,15 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 27.8 s, sys: 4 ms, total: 27.8 s\n",
-      "Wall time: 27.8 s\n"
+      "CPU times: user 24.1 s, sys: 0 ns, total: 24.1 s\n",
+      "Wall time: 24.1 s\n"
      ]
     }
    ],
    "source": [
     "%%time\n",
     "\n",
+    "import numpy as np\n",
     "import math\n",
     "import random\n",
     "\n",
@@ -487,23 +648,23 @@
     "        m = n\n",
     "    assert n >= m >= 0\n",
     "    # First value:\n",
-    "    r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n",
+    "    r = np.zeros((n+1, m+1))\n",
+    "    r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
     "\n",
     "    # One side of the triangle:\n",
     "    for i in range(1, n + 1):\n",
-    "        h_i = (xmax - xmin) / float(2**i)\n",
-    "        xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n",
-    "        r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
     "\n",
     "    # All the other values:\n",
     "    for j in range(1, m + 1):\n",
     "        for i in range(j, n + 1):\n",
-    "            try:\n",
-    "                r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n",
-    "            except:\n",
-    "                raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n",
+    "            r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n",
     "\n",
-    "    return r[(n, m)]\n",
+    "    return r[n, m]\n",
     "\n",
     "for _ in range(100000):\n",
     "    a = random.randint(-2000, 2000)\n",
@@ -520,7 +681,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 15,
    "metadata": {
     "code_folding": [
      9
@@ -531,8 +692,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 0 ns, sys: 8 ms, total: 8 ms\n",
-      "Wall time: 7.84 s\n"
+      "CPU times: user 4 ms, sys: 0 ns, total: 4 ms\n",
+      "Wall time: 7.09 s\n"
      ]
     }
    ],
@@ -552,23 +713,23 @@
     "        m = n\n",
     "    assert n >= m >= 0\n",
     "    # First value:\n",
-    "    r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n",
+    "    r = [[0 for _ in range(n+1)] for _ in range(m+1)]\n",
+    "    r[0][0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
     "\n",
     "    # One side of the triangle:\n",
     "    for i in range(1, n + 1):\n",
-    "        h_i = (xmax - xmin) / float(2**i)\n",
-    "        xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n",
-    "        r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i][0] = (0.5 * r[i - 1][0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
     "\n",
     "    # All the other values:\n",
     "    for j in range(1, m + 1):\n",
     "        for i in range(j, n + 1):\n",
-    "            try:\n",
-    "                r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n",
-    "            except:\n",
-    "                raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n",
+    "            r[i][j] = (((4.**j) * r[i][j - 1]) - r[i - 1][j - 1]) / ((4.**j) - 1.)\n",
     "\n",
-    "    return r[(n, m)]\n",
+    "    return r[n][m]\n",
     "\n",
     "for _ in range(100000):\n",
     "    a = random.randint(-2000, 2000)\n",
@@ -576,6 +737,13 @@
     "    romberg_pypy(f, a, b)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> This version uses the improved memoization trick (no dictionary), but uses nested lists and not numpy arrays, I didn't bother to install numpy on my Pypy installation (even thought [it should be possible](https://bitbucket.org/pypy/numpy.git))."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -586,7 +754,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 16,
    "metadata": {
     "collapsed": true
    },
@@ -597,7 +765,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 17,
    "metadata": {
     "collapsed": true
    },
@@ -629,7 +797,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 18,
    "metadata": {},
    "outputs": [
     {
@@ -639,7 +807,7 @@
      "traceback": [
       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
       "\u001b[0;31mAssertionError\u001b[0m                            Traceback (most recent call last)",
-      "\u001b[0;32m<ipython-input-15-b4426dd668a3>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mromberg_numba\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m8\u001b[0m\u001b[0;34m)\u001b[0m  \u001b[0;31m# Almost the exact value.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+      "\u001b[0;32m<ipython-input-18-b4426dd668a3>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mromberg_numba\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m8\u001b[0m\u001b[0;34m)\u001b[0m  \u001b[0;31m# Almost the exact value.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
       "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36m_compile_for_args\u001b[0;34m(self, *args, **kws)\u001b[0m\n\u001b[1;32m    284\u001b[0m                 \u001b[0margtypes\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtypeof_pyval\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    285\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 286\u001b[0;31m             \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcompile\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margtypes\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    287\u001b[0m         \u001b[0;32mexcept\u001b[0m \u001b[0merrors\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTypingError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    288\u001b[0m             \u001b[0;31m# Intercept typing error that may be due to an argument\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36mcompile\u001b[0;34m(self, sig)\u001b[0m\n\u001b[1;32m    530\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    531\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_cache_misses\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0msig\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 532\u001b[0;31m             \u001b[0mcres\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_compiler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcompile\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreturn_type\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    533\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0madd_overload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcres\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    534\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_cache\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msave_overload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msig\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcres\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
       "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36mcompile\u001b[0;34m(self, args, return_type)\u001b[0m\n\u001b[1;32m     79\u001b[0m                                       \u001b[0mimpl\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     80\u001b[0m                                       \u001b[0margs\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreturn_type\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mreturn_type\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 81\u001b[0;31m                                       flags=flags, locals=self.locals)\n\u001b[0m\u001b[1;32m     82\u001b[0m         \u001b[0;31m# Check typing error if object mode is used\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     83\u001b[0m         \u001b[0;32mif\u001b[0m \u001b[0mcres\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtyping_error\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mflags\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0menable_pyobject\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
@@ -665,7 +833,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "> It fails!"
+    "> It fails! Almost as always when trying Numba, it fails cryptically, too bad. I don't want to spend time debugging this."
    ]
   },
   {
@@ -680,13 +848,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "> Thanks to [this page](https://learnxinyminutes.com/docs/julia/)."
+    "> Thanks to [this page](https://learnxinyminutes.com/docs/julia/) for a nice and short introduction to Julia."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
-   "metadata": {},
+   "execution_count": 19,
+   "metadata": {
+    "code_folding": []
+   },
    "outputs": [
     {
      "name": "stdout",
@@ -761,7 +931,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 20,
    "metadata": {},
    "outputs": [
     {
@@ -770,7 +940,7 @@
        "(409937.57869797916, 8.482168070998719e-05)"
       ]
      },
-     "execution_count": 17,
+     "execution_count": 20,
      "metadata": {},
      "output_type": "execute_result"
     },
@@ -780,7 +950,7 @@
        "409937.5786979791"
       ]
      },
-     "execution_count": 17,
+     "execution_count": 20,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -793,6 +963,84 @@
     "romberg(f, a, b, n=12)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let try a less naive version using a fixed-size array instead of a dictionary. (as we did before for the Python version)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {
+    "code_folding": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "f (generic function with 1 method)\n",
+      "1993\n",
+      "2017\n",
+      "romberg_julia_better (generic function with 1 method)\n",
+      "404122.6272072803\n",
+      "372300.32065714896\n",
+      "373621.99733807985\n",
+      "373650.47843482986\n",
+      "409937.6105242601\n",
+      "409937.57869815256\n",
+      "409937.5786979804\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%script julia\n",
+    "\n",
+    "function f(x)\n",
+    "    (12*x + 1) / (1 + cos(x)^2)\n",
+    "end\n",
+    "\n",
+    "a = 1993\n",
+    "b = 2017\n",
+    "\n",
+    "function romberg_julia_better(f, xmin, xmax; n=8)\n",
+    "    m = n\n",
+    "    # First value:\n",
+    "    r = zeros((n+1, m+1))  # https://docs.julialang.org/en/latest/stdlib/arrays/#Base.zeros\n",
+    "    r[1, 1] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
+    "\n",
+    "    # One side of the triangle:\n",
+    "    for i in 1 : n\n",
+    "        h_i = (xmax - xmin) / (2^i)\n",
+    "        sum_f_x = 0\n",
+    "        for k in 1 : 2^(i - 1)\n",
+    "            sum_f_x += f(xmin + ((2 * k - 1) * h_i))\n",
+    "        end\n",
+    "        r[i + 1, 1] = (r[i, 1] / 2.) + (h_i * sum_f_x)\n",
+    "    end\n",
+    "\n",
+    "    # All the other values:\n",
+    "    for j in 1 : m\n",
+    "        for i in j : n\n",
+    "            r[i + 1, j + 1] = (((4.^j) * r[i + 1, j]) - r[i, j]) / (4.^j - 1.)\n",
+    "        end\n",
+    "    end\n",
+    "\n",
+    "    r[n + 1, m + 1]\n",
+    "end\n",
+    "\n",
+    "\n",
+    "println(romberg_julia_better(f, a, b, n=0))  # really not accurate!\n",
+    "println(romberg_julia_better(f, a, b, n=1))  # alreay pretty good!\n",
+    "println(romberg_julia_better(f, a, b, n=2))\n",
+    "println(romberg_julia_better(f, a, b, n=3))\n",
+    "println(romberg_julia_better(f, a, b, n=8))  # Almost the exact value.\n",
+    "println(romberg_julia_better(f, a, b, n=10))  # Almost the exact value.\n",
+    "println(romberg_julia_better(f, a, b, n=12))  # Almost the exact value."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -810,10 +1058,10 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 22,
    "metadata": {
     "code_folding": [
-     8
+     9
     ]
    },
    "outputs": [
@@ -821,14 +1069,15 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 26.1 s, sys: 0 ns, total: 26.1 s\n",
-      "Wall time: 26.1 s\n"
+      "CPU times: user 25.6 s, sys: 0 ns, total: 25.6 s\n",
+      "Wall time: 25.6 s\n"
      ]
     }
    ],
    "source": [
     "%%time\n",
     "\n",
+    "import numpy as np\n",
     "import math\n",
     "import random\n",
     "\n",
@@ -841,23 +1090,23 @@
     "        m = n\n",
     "    assert n >= m >= 0\n",
     "    # First value:\n",
-    "    r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n",
+    "    r = np.zeros((n+1, m+1))\n",
+    "    r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
     "\n",
     "    # One side of the triangle:\n",
     "    for i in range(1, n + 1):\n",
-    "        h_i = (xmax - xmin) / float(2**i)\n",
-    "        xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n",
-    "        r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
     "\n",
     "    # All the other values:\n",
     "    for j in range(1, m + 1):\n",
     "        for i in range(j, n + 1):\n",
-    "            try:\n",
-    "                r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n",
-    "            except:\n",
-    "                raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n",
+    "            r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n",
     "\n",
-    "    return r[(n, m)]\n",
+    "    return r[n, m]\n",
     "\n",
     "for _ in range(100000):\n",
     "    a = random.randint(-2000, 2000)\n",
@@ -874,7 +1123,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 23,
    "metadata": {
     "code_folding": [
      9
@@ -886,8 +1135,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 0 ns, sys: 4 ms, total: 4 ms\n",
-      "Wall time: 7.06 s\n"
+      "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n",
+      "Wall time: 7.12 s\n"
      ]
     }
    ],
@@ -907,23 +1156,23 @@
     "        m = n\n",
     "    assert n >= m >= 0\n",
     "    # First value:\n",
-    "    r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n",
+    "    r = [[0 for _ in range(n+1)] for _ in range(m+1)]\n",
+    "    r[0][0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
     "\n",
     "    # One side of the triangle:\n",
     "    for i in range(1, n + 1):\n",
-    "        h_i = (xmax - xmin) / float(2**i)\n",
-    "        xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n",
-    "        r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i][0] = (0.5 * r[i - 1][0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
     "\n",
     "    # All the other values:\n",
     "    for j in range(1, m + 1):\n",
     "        for i in range(j, n + 1):\n",
-    "            try:\n",
-    "                r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n",
-    "            except:\n",
-    "                raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n",
+    "            r[i][j] = (((4.**j) * r[i][j - 1]) - r[i - 1][j - 1]) / ((4.**j) - 1.)\n",
     "\n",
-    "    return r[(n, m)]\n",
+    "    return r[n][m]\n",
     "\n",
     "for _ in range(100000):\n",
     "    a = random.randint(-2000, 2000)\n",
@@ -940,10 +1189,10 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 24,
    "metadata": {
     "code_folding": [
-     9
+     7
     ]
    },
    "outputs": [
@@ -954,7 +1203,7 @@
       "f (generic function with 1 method)\n",
       "romberg_julia (generic function with 1 method)\n",
       "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n",
-      "Wall time: 8.32 s\n"
+      "Wall time: 8.1 s\n"
      ]
     }
    ],
@@ -1003,7 +1252,79 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "On this first test, it doesn't look faster than Pypy..."
+    "On this first test, it doesn't look faster than Pypy...\n",
+    "But what if we use the improved version, with an array instead of dictionary?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "metadata": {
+    "code_folding": [
+     7
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "f (generic function with 1 method)\n",
+      "romberg_julia_better (generic function with 1 method)\n",
+      "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n",
+      "Wall time: 2.7 s\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "%%script julia\n",
+    "\n",
+    "function f(x)\n",
+    "    (12*x + 1) / (1 + cos(x)^2)\n",
+    "end\n",
+    "\n",
+    "function romberg_julia_better(f, xmin, xmax; n=8)\n",
+    "    m = n\n",
+    "    # First value:\n",
+    "    r = zeros((n+1, m+1))  # https://docs.julialang.org/en/latest/stdlib/arrays/#Base.zeros\n",
+    "    r[1, 1] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
+    "\n",
+    "    # One side of the triangle:\n",
+    "    for i in 1 : n\n",
+    "        h_i = (xmax - xmin) / (2^i)\n",
+    "        sum_f_x = 0\n",
+    "        for k in 1 : 2^(i - 1)\n",
+    "            sum_f_x += f(xmin + ((2 * k - 1) * h_i))\n",
+    "        end\n",
+    "        r[i + 1, 1] = (r[i, 1] / 2.) + (h_i * sum_f_x)\n",
+    "    end\n",
+    "\n",
+    "    # All the other values:\n",
+    "    for j in 1 : m\n",
+    "        for i in j : n\n",
+    "            r[i + 1, j + 1] = (((4.^j) * r[i + 1, j]) - r[i, j]) / (4.^j - 1.)\n",
+    "        end\n",
+    "    end\n",
+    "\n",
+    "    r[n + 1, m + 1]\n",
+    "end\n",
+    "\n",
+    "for _ in 1:100000\n",
+    "    a = rand(-2000:2000)\n",
+    "    b = a + rand(0:100)\n",
+    "    romberg_julia_better(f, a, b)\n",
+    "end"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Oh, this time it finally seems faster. Really faster? Yes, about 3 to 4 time faster than Pypy.\n",
+    "\n",
+    "Remark also that this last cells compared by using the magic `%%pypy` and `%%script julia`, so they both need a warm-up time (opening the pipe, the sub-process, initializing the JIT compiler etc).\n",
+    "But it is fair to compare Pypy to Julia this way."
    ]
   },
   {
@@ -1032,10 +1353,10 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 26,
    "metadata": {
     "code_folding": [
-     8
+     9
     ]
    },
    "outputs": [
@@ -1043,15 +1364,16 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "0.842700792949715\n",
-      "CPU times: user 22.1 s, sys: 4 ms, total: 22.1 s\n",
-      "Wall time: 22.1 s\n"
+      "0.84270079295\n",
+      "CPU times: user 20.7 s, sys: 8 ms, total: 20.8 s\n",
+      "Wall time: 20.8 s\n"
      ]
     }
    ],
    "source": [
     "%%time\n",
     "\n",
+    "import numpy as np\n",
     "import math\n",
     "import random\n",
     "\n",
@@ -1064,23 +1386,23 @@
     "        m = n\n",
     "    assert n >= m >= 0\n",
     "    # First value:\n",
-    "    r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n",
+    "    r = np.zeros((n+1, m+1))\n",
+    "    r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
     "\n",
     "    # One side of the triangle:\n",
     "    for i in range(1, n + 1):\n",
-    "        h_i = (xmax - xmin) / float(2**i)\n",
-    "        xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n",
-    "        r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
     "\n",
     "    # All the other values:\n",
     "    for j in range(1, m + 1):\n",
     "        for i in range(j, n + 1):\n",
-    "            try:\n",
-    "                r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n",
-    "            except:\n",
-    "                raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n",
+    "            r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n",
     "\n",
-    "    return r[(n, m)]\n",
+    "    return r[n, m]\n",
     "\n",
     "for _ in range(100000):\n",
     "    a = 0\n",
@@ -1098,7 +1420,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 27,
    "metadata": {
     "code_folding": [
      9
@@ -1112,7 +1434,7 @@
      "text": [
       "0.84270079295\n",
       "CPU times: user 8 ms, sys: 0 ns, total: 8 ms\n",
-      "Wall time: 6.34 s\n"
+      "Wall time: 6.35 s\n"
      ]
     }
    ],
@@ -1132,23 +1454,23 @@
     "        m = n\n",
     "    assert n >= m >= 0\n",
     "    # First value:\n",
-    "    r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n",
+    "    r = [[0 for _ in range(n+1)] for _ in range(m+1)]\n",
+    "    r[0][0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
     "\n",
     "    # One side of the triangle:\n",
     "    for i in range(1, n + 1):\n",
-    "        h_i = (xmax - xmin) / float(2**i)\n",
-    "        xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n",
-    "        r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n",
+    "        h_i = (xmax - xmin) / 2.**i\n",
+    "        r[i][0] = (0.5 * r[i - 1][0]) + h_i * math.fsum(\n",
+    "            f(xmin + ((2 * k - 1) * h_i))\n",
+    "            for k in range(1, 1 + 2**(i - 1))\n",
+    "        )\n",
     "\n",
     "    # All the other values:\n",
     "    for j in range(1, m + 1):\n",
     "        for i in range(j, n + 1):\n",
-    "            try:\n",
-    "                r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n",
-    "            except:\n",
-    "                raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n",
+    "            r[i][j] = (((4.**j) * r[i][j - 1]) - r[i - 1][j - 1]) / ((4.**j) - 1.)\n",
     "\n",
-    "    return r[(n, m)]\n",
+    "    return r[n][m]\n",
     "\n",
     "for _ in range(100000):\n",
     "    a = 0\n",
@@ -1166,7 +1488,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 28,
    "metadata": {
     "code_folding": [
      7
@@ -1180,8 +1502,8 @@
       "f (generic function with 1 method)\n",
       "romberg_julia (generic function with 1 method)\n",
       "0.8427007929497148\n",
-      "CPU times: user 4 ms, sys: 8 ms, total: 12 ms\n",
-      "Wall time: 7.07 s\n"
+      "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n",
+      "Wall time: 7.19 s\n"
      ]
     }
    ],
@@ -1234,6 +1556,77 @@
     "Still not faster than Pypy... So what is the goal of Julia?"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {
+    "code_folding": [
+     8
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "f (generic function with 1 method)\n",
+      "romberg_julia_better (generic function with 1 method)\n",
+      "0.8427007929497148\n",
+      "CPU times: user 8 ms, sys: 0 ns, total: 8 ms\n",
+      "Wall time: 2.42 s\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "%%script julia\n",
+    "\n",
+    "function f(x)\n",
+    "    (2.0 / sqrt(pi)) * exp(-x^2)\n",
+    "end\n",
+    "\n",
+    "\n",
+    "function romberg_julia_better(f, xmin, xmax; n=8)\n",
+    "    m = n\n",
+    "    # First value:\n",
+    "    r = zeros((n+1, m+1))  # https://docs.julialang.org/en/latest/stdlib/arrays/#Base.zeros\n",
+    "    r[1, 1] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n",
+    "\n",
+    "    # One side of the triangle:\n",
+    "    for i in 1 : n\n",
+    "        h_i = (xmax - xmin) / (2^i)\n",
+    "        sum_f_x = 0\n",
+    "        for k in 1 : 2^(i - 1)\n",
+    "            sum_f_x += f(xmin + ((2 * k - 1) * h_i))\n",
+    "        end\n",
+    "        r[i + 1, 1] = (r[i, 1] / 2.) + (h_i * sum_f_x)\n",
+    "    end\n",
+    "\n",
+    "    # All the other values:\n",
+    "    for j in 1 : m\n",
+    "        for i in j : n\n",
+    "            r[i + 1, j + 1] = (((4.^j) * r[i + 1, j]) - r[i, j]) / (4.^j - 1.)\n",
+    "        end\n",
+    "    end\n",
+    "\n",
+    "    r[n + 1, m + 1]\n",
+    "end\n",
+    "\n",
+    "for _ in 1:100000\n",
+    "    a = 0\n",
+    "    b = 1\n",
+    "    romberg_julia_better(f, a, b)\n",
+    "end\n",
+    "println(romberg_julia_better(f, 0, 1))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This is also faster than Pypy, but not that much..."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -1241,9 +1634,16 @@
     "----\n",
     "## Conclusion\n",
     "\n",
+    "$\\implies$\n",
+    "On this (baby) example of a real world numerical algorithms, tested on thousands of random inputs or on thousands time the same input, the speed-up is in favor of Julia, but it doesn't seem impressive enough to make me want to use it (at least for now).\n",
+    "\n",
+    "If I have to use 1-based indexing and a slightly different language, just to maybe gain a speed-up of 2 to 3 (compared to Pypy) or even a 10x speed-up compare to naive Python, why bother?\n",
+    "\n",
+    "### Remark\n",
     "Of course, this was a *baby* benchmark, on a small algorithm, and probably wrongly implemented in both Python and Julia.\n",
     "\n",
-    "But still, I am surprised to see that the naive Julia version was *slower* than the naive Python version executed with Pypy..."
+    "But still, I am surprised to see that the naive Julia version was *slower* than the naive Python version executed with Pypy...\n",
+    "For the less naive version (without dictionary), the Julia version was about *2 to 3* times faster than the Python version with Pypy."
    ]
   }
  ],
@@ -1281,6 +1681,13 @@
    "sideBar": true,
    "threshold": 4,
    "toc_cell": true,
+   "toc_position": {
+    "height": "506px",
+    "left": "0px",
+    "right": "1068px",
+    "top": "116px",
+    "width": "212px"
+   },
    "toc_section_display": "block",
    "toc_window_display": true
   }