From 07839b29fcdc77a79af7a62c91851e979767f251 Mon Sep 17 00:00:00 2001 From: Lilian Besson Date: Fri, 30 Jun 2017 14:09:23 +0200 Subject: [PATCH] Added a smarter version of Romberg integration, without dictionary... - Julia wins against Python :snake:, but not much - Julia smart version was about 2.5 faster than Python smart version executed by Pypy - and Julia wasn't 10x faster than pure Python... - Done for now :cry:, but I will try again Julia soon, and hopefully some tweaking :wrench: make it efficient --- Benchmark_between_Python_and_Julia.ipynb | 611 +++++++++++++++++++---- 1 file changed, 509 insertions(+), 102 deletions(-) diff --git a/Benchmark_between_Python_and_Julia.ipynb b/Benchmark_between_Python_and_Julia.ipynb index f1f4fd2..f0dccb0 100644 --- a/Benchmark_between_Python_and_Julia.ipynb +++ b/Benchmark_between_Python_and_Julia.ipynb @@ -7,7 +7,7 @@ }, "source": [ "# Table of Contents\n", - "

1  Benchmark between Python and Julia
1.1  The Romberg method
1.2  The Romberg method, naive recursive version in Python
1.3  The Romberg method, dynamic programming version in Python
1.4  First benchmark
1.5  Using Pypy for speedup
1.6  Numba version for Python
1.7  Naive Julia version
1.8  Benchmark between Python, Pypy and Julia
1.9  Second benchmark
1.10  Conclusion
" + "

1  Benchmark between Python and Julia
1.1  The Romberg method
1.2  The Romberg method, naive recursive version in Python
1.3  The Romberg method, dynamic programming version in Python
1.4  The Romberg method, better dynamic programming version in Python
1.5  First benchmark
1.6  Using Pypy for speedup
1.7  Numba version for Python
1.8  Naive Julia version
1.9  Benchmark between Python, Pypy and Julia
1.10  Second benchmark
1.11  Conclusion
1.11.1  Remark
" ] }, { @@ -385,19 +385,160 @@ "metadata": {}, "source": [ "----\n", - "## First benchmark" + "## The Romberg method, better dynamic programming version in Python\n", + "\n", + "Instead of using a dictionary, which gets filled up dynamically (and so, slowly), let us use a numpy arrays, as we already know the size of the array we need ($n+1 \\times m+1$).\n", + "\n", + "Note that only half of the array is used, so we could try to use [sparse matrices](https://docs.scipy.org/doc/scipy/reference/sparse.html) maybe, for triangular matrices? From what I know, it is not worth it, both in term of memory information (if the sparsity measure is $\\simeq 1/2$, you don't win anything from [LIL](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html#scipy.sparse.lil_matrix) or other sparse matrices representation...\n", + "We could use [`numpy.tri`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.tri.html) but this uses a dense array so nope." ] }, { "cell_type": "code", "execution_count": 8, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "def romberg_better(f, xmin, xmax, n=8, m=None):\n", + " assert xmin <= xmax\n", + " if m is None:\n", + " m = n\n", + " assert n >= m >= 0\n", + " # First value:\n", + " r = np.zeros((n+1, m+1))\n", + " r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", + "\n", + " # One side of the triangle:\n", + " for i in range(1, n + 1):\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", + "\n", + " # All the other values:\n", + " for j in range(1, m + 1):\n", + " for i in range(j, n + 1):\n", + " r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n", + "\n", + " return r[n, m]" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "404122.62720728031" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "text/plain": [ + "372300.32065714896" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "text/plain": [ + "373621.99733807985" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "text/plain": [ + "373650.47843482986" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "text/plain": [ + "409937.61052426015" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "text/plain": [ + "409937.5786981525" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "text/plain": [ + "409937.5786979791" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "romberg_better(f, a, b, n=0) # really not accurate!\n", + "romberg_better(f, a, b, n=1) # alreay pretty good!\n", + "romberg_better(f, a, b, n=2)\n", + "romberg_better(f, a, b, n=3)\n", + "romberg_better(f, a, b, n=8) # Almost the exact value.\n", + "romberg_better(f, a, b, n=10) # Almost the exact value.\n", + "romberg_better(f, a, b, n=12) # Almost the exact value." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "It converges quite quickly to the \"true\" value as given by `scipy.integrate.quad`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "----\n", + "## First benchmark" + ] + }, + { + "cell_type": "code", + "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "372 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" + "390 µs ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" ] } ], @@ -407,14 +548,14 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "51.9 ms ± 803 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" + "52.8 ms ± 1.21 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n" ] } ], @@ -424,14 +565,14 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "801 µs ± 8.32 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" + "819 µs ± 19.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" ] } ], @@ -439,11 +580,30 @@ "%timeit romberg(f, a, b, n=10)" ] }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "844 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n" + ] + } + ], + "source": [ + "%timeit romberg_better(f, a, b, n=10)" + ] + }, { "cell_type": "markdown", "metadata": {}, "source": [ - "We already see that the recursive version is *much* slower than the dynamic programming one!" + "We already see that the recursive version is *much* slower than the dynamic programming one!\n", + "\n", + "But there is not much difference between the one using dictionary (`romberg()`) and the one using a numpy array of a known size (`romberg_better()`)." ] }, { @@ -456,10 +616,10 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 14, "metadata": { "code_folding": [ - 7 + 9 ] }, "outputs": [ @@ -467,14 +627,15 @@ "name": "stdout", "output_type": "stream", "text": [ - "CPU times: user 27.8 s, sys: 4 ms, total: 27.8 s\n", - "Wall time: 27.8 s\n" + "CPU times: user 24.1 s, sys: 0 ns, total: 24.1 s\n", + "Wall time: 24.1 s\n" ] } ], "source": [ "%%time\n", "\n", + "import numpy as np\n", "import math\n", "import random\n", "\n", @@ -487,23 +648,23 @@ " m = n\n", " assert n >= m >= 0\n", " # First value:\n", - " r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n", + " r = np.zeros((n+1, m+1))\n", + " r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", "\n", " # One side of the triangle:\n", " for i in range(1, n + 1):\n", - " h_i = (xmax - xmin) / float(2**i)\n", - " xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n", - " r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", "\n", " # All the other values:\n", " for j in range(1, m + 1):\n", " for i in range(j, n + 1):\n", - " try:\n", - " r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n", - " except:\n", - " raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n", + " r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n", "\n", - " return r[(n, m)]\n", + " return r[n, m]\n", "\n", "for _ in range(100000):\n", " a = random.randint(-2000, 2000)\n", @@ -520,7 +681,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 15, "metadata": { "code_folding": [ 9 @@ -531,8 +692,8 @@ "name": "stdout", "output_type": "stream", "text": [ - "CPU times: user 0 ns, sys: 8 ms, total: 8 ms\n", - "Wall time: 7.84 s\n" + "CPU times: user 4 ms, sys: 0 ns, total: 4 ms\n", + "Wall time: 7.09 s\n" ] } ], @@ -552,23 +713,23 @@ " m = n\n", " assert n >= m >= 0\n", " # First value:\n", - " r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n", + " r = [[0 for _ in range(n+1)] for _ in range(m+1)]\n", + " r[0][0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", "\n", " # One side of the triangle:\n", " for i in range(1, n + 1):\n", - " h_i = (xmax - xmin) / float(2**i)\n", - " xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n", - " r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i][0] = (0.5 * r[i - 1][0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", "\n", " # All the other values:\n", " for j in range(1, m + 1):\n", " for i in range(j, n + 1):\n", - " try:\n", - " r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n", - " except:\n", - " raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n", + " r[i][j] = (((4.**j) * r[i][j - 1]) - r[i - 1][j - 1]) / ((4.**j) - 1.)\n", "\n", - " return r[(n, m)]\n", + " return r[n][m]\n", "\n", "for _ in range(100000):\n", " a = random.randint(-2000, 2000)\n", @@ -576,6 +737,13 @@ " romberg_pypy(f, a, b)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> This version uses the improved memoization trick (no dictionary), but uses nested lists and not numpy arrays, I didn't bother to install numpy on my Pypy installation (even thought [it should be possible](https://bitbucket.org/pypy/numpy.git))." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -586,7 +754,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 16, "metadata": { "collapsed": true }, @@ -597,7 +765,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 17, "metadata": { "collapsed": true }, @@ -629,7 +797,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 18, "metadata": {}, "outputs": [ { @@ -639,7 +807,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAssertionError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mromberg_numba\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m8\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Almost the exact value.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mromberg_numba\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m8\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Almost the exact value.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36m_compile_for_args\u001b[0;34m(self, *args, **kws)\u001b[0m\n\u001b[1;32m 284\u001b[0m \u001b[0margtypes\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtypeof_pyval\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 285\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 286\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcompile\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margtypes\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 287\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0merrors\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTypingError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 288\u001b[0m \u001b[0;31m# Intercept typing error that may be due to an argument\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36mcompile\u001b[0;34m(self, sig)\u001b[0m\n\u001b[1;32m 530\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 531\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_cache_misses\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0msig\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 532\u001b[0;31m \u001b[0mcres\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_compiler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcompile\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreturn_type\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 533\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0madd_overload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcres\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 534\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_cache\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msave_overload\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msig\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mcres\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.5/dist-packages/numba/dispatcher.py\u001b[0m in \u001b[0;36mcompile\u001b[0;34m(self, args, return_type)\u001b[0m\n\u001b[1;32m 79\u001b[0m \u001b[0mimpl\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 80\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mreturn_type\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mreturn_type\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 81\u001b[0;31m flags=flags, locals=self.locals)\n\u001b[0m\u001b[1;32m 82\u001b[0m \u001b[0;31m# Check typing error if object mode is used\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 83\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mcres\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtyping_error\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mflags\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0menable_pyobject\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", @@ -665,7 +833,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "> It fails!" + "> It fails! Almost as always when trying Numba, it fails cryptically, too bad. I don't want to spend time debugging this." ] }, { @@ -680,13 +848,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "> Thanks to [this page](https://learnxinyminutes.com/docs/julia/)." + "> Thanks to [this page](https://learnxinyminutes.com/docs/julia/) for a nice and short introduction to Julia." ] }, { "cell_type": "code", - "execution_count": 16, - "metadata": {}, + "execution_count": 19, + "metadata": { + "code_folding": [] + }, "outputs": [ { "name": "stdout", @@ -761,7 +931,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 20, "metadata": {}, "outputs": [ { @@ -770,7 +940,7 @@ "(409937.57869797916, 8.482168070998719e-05)" ] }, - "execution_count": 17, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" }, @@ -780,7 +950,7 @@ "409937.5786979791" ] }, - "execution_count": 17, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } @@ -793,6 +963,84 @@ "romberg(f, a, b, n=12)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let try a less naive version using a fixed-size array instead of a dictionary. (as we did before for the Python version)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "code_folding": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "f (generic function with 1 method)\n", + "1993\n", + "2017\n", + "romberg_julia_better (generic function with 1 method)\n", + "404122.6272072803\n", + "372300.32065714896\n", + "373621.99733807985\n", + "373650.47843482986\n", + "409937.6105242601\n", + "409937.57869815256\n", + "409937.5786979804\n" + ] + } + ], + "source": [ + "%%script julia\n", + "\n", + "function f(x)\n", + " (12*x + 1) / (1 + cos(x)^2)\n", + "end\n", + "\n", + "a = 1993\n", + "b = 2017\n", + "\n", + "function romberg_julia_better(f, xmin, xmax; n=8)\n", + " m = n\n", + " # First value:\n", + " r = zeros((n+1, m+1)) # https://docs.julialang.org/en/latest/stdlib/arrays/#Base.zeros\n", + " r[1, 1] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", + "\n", + " # One side of the triangle:\n", + " for i in 1 : n\n", + " h_i = (xmax - xmin) / (2^i)\n", + " sum_f_x = 0\n", + " for k in 1 : 2^(i - 1)\n", + " sum_f_x += f(xmin + ((2 * k - 1) * h_i))\n", + " end\n", + " r[i + 1, 1] = (r[i, 1] / 2.) + (h_i * sum_f_x)\n", + " end\n", + "\n", + " # All the other values:\n", + " for j in 1 : m\n", + " for i in j : n\n", + " r[i + 1, j + 1] = (((4.^j) * r[i + 1, j]) - r[i, j]) / (4.^j - 1.)\n", + " end\n", + " end\n", + "\n", + " r[n + 1, m + 1]\n", + "end\n", + "\n", + "\n", + "println(romberg_julia_better(f, a, b, n=0)) # really not accurate!\n", + "println(romberg_julia_better(f, a, b, n=1)) # alreay pretty good!\n", + "println(romberg_julia_better(f, a, b, n=2))\n", + "println(romberg_julia_better(f, a, b, n=3))\n", + "println(romberg_julia_better(f, a, b, n=8)) # Almost the exact value.\n", + "println(romberg_julia_better(f, a, b, n=10)) # Almost the exact value.\n", + "println(romberg_julia_better(f, a, b, n=12)) # Almost the exact value." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -810,10 +1058,10 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 22, "metadata": { "code_folding": [ - 8 + 9 ] }, "outputs": [ @@ -821,14 +1069,15 @@ "name": "stdout", "output_type": "stream", "text": [ - "CPU times: user 26.1 s, sys: 0 ns, total: 26.1 s\n", - "Wall time: 26.1 s\n" + "CPU times: user 25.6 s, sys: 0 ns, total: 25.6 s\n", + "Wall time: 25.6 s\n" ] } ], "source": [ "%%time\n", "\n", + "import numpy as np\n", "import math\n", "import random\n", "\n", @@ -841,23 +1090,23 @@ " m = n\n", " assert n >= m >= 0\n", " # First value:\n", - " r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n", + " r = np.zeros((n+1, m+1))\n", + " r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", "\n", " # One side of the triangle:\n", " for i in range(1, n + 1):\n", - " h_i = (xmax - xmin) / float(2**i)\n", - " xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n", - " r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", "\n", " # All the other values:\n", " for j in range(1, m + 1):\n", " for i in range(j, n + 1):\n", - " try:\n", - " r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n", - " except:\n", - " raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n", + " r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n", "\n", - " return r[(n, m)]\n", + " return r[n, m]\n", "\n", "for _ in range(100000):\n", " a = random.randint(-2000, 2000)\n", @@ -874,7 +1123,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 23, "metadata": { "code_folding": [ 9 @@ -886,8 +1135,8 @@ "name": "stdout", "output_type": "stream", "text": [ - "CPU times: user 0 ns, sys: 4 ms, total: 4 ms\n", - "Wall time: 7.06 s\n" + "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n", + "Wall time: 7.12 s\n" ] } ], @@ -907,23 +1156,23 @@ " m = n\n", " assert n >= m >= 0\n", " # First value:\n", - " r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n", + " r = [[0 for _ in range(n+1)] for _ in range(m+1)]\n", + " r[0][0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", "\n", " # One side of the triangle:\n", " for i in range(1, n + 1):\n", - " h_i = (xmax - xmin) / float(2**i)\n", - " xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n", - " r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i][0] = (0.5 * r[i - 1][0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", "\n", " # All the other values:\n", " for j in range(1, m + 1):\n", " for i in range(j, n + 1):\n", - " try:\n", - " r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n", - " except:\n", - " raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n", + " r[i][j] = (((4.**j) * r[i][j - 1]) - r[i - 1][j - 1]) / ((4.**j) - 1.)\n", "\n", - " return r[(n, m)]\n", + " return r[n][m]\n", "\n", "for _ in range(100000):\n", " a = random.randint(-2000, 2000)\n", @@ -940,10 +1189,10 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 24, "metadata": { "code_folding": [ - 9 + 7 ] }, "outputs": [ @@ -954,7 +1203,7 @@ "f (generic function with 1 method)\n", "romberg_julia (generic function with 1 method)\n", "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n", - "Wall time: 8.32 s\n" + "Wall time: 8.1 s\n" ] } ], @@ -1003,7 +1252,79 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "On this first test, it doesn't look faster than Pypy..." + "On this first test, it doesn't look faster than Pypy...\n", + "But what if we use the improved version, with an array instead of dictionary?" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "code_folding": [ + 7 + ] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "f (generic function with 1 method)\n", + "romberg_julia_better (generic function with 1 method)\n", + "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n", + "Wall time: 2.7 s\n" + ] + } + ], + "source": [ + "%%time\n", + "%%script julia\n", + "\n", + "function f(x)\n", + " (12*x + 1) / (1 + cos(x)^2)\n", + "end\n", + "\n", + "function romberg_julia_better(f, xmin, xmax; n=8)\n", + " m = n\n", + " # First value:\n", + " r = zeros((n+1, m+1)) # https://docs.julialang.org/en/latest/stdlib/arrays/#Base.zeros\n", + " r[1, 1] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", + "\n", + " # One side of the triangle:\n", + " for i in 1 : n\n", + " h_i = (xmax - xmin) / (2^i)\n", + " sum_f_x = 0\n", + " for k in 1 : 2^(i - 1)\n", + " sum_f_x += f(xmin + ((2 * k - 1) * h_i))\n", + " end\n", + " r[i + 1, 1] = (r[i, 1] / 2.) + (h_i * sum_f_x)\n", + " end\n", + "\n", + " # All the other values:\n", + " for j in 1 : m\n", + " for i in j : n\n", + " r[i + 1, j + 1] = (((4.^j) * r[i + 1, j]) - r[i, j]) / (4.^j - 1.)\n", + " end\n", + " end\n", + "\n", + " r[n + 1, m + 1]\n", + "end\n", + "\n", + "for _ in 1:100000\n", + " a = rand(-2000:2000)\n", + " b = a + rand(0:100)\n", + " romberg_julia_better(f, a, b)\n", + "end" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Oh, this time it finally seems faster. Really faster? Yes, about 3 to 4 time faster than Pypy.\n", + "\n", + "Remark also that this last cells compared by using the magic `%%pypy` and `%%script julia`, so they both need a warm-up time (opening the pipe, the sub-process, initializing the JIT compiler etc).\n", + "But it is fair to compare Pypy to Julia this way." ] }, { @@ -1032,10 +1353,10 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 26, "metadata": { "code_folding": [ - 8 + 9 ] }, "outputs": [ @@ -1043,15 +1364,16 @@ "name": "stdout", "output_type": "stream", "text": [ - "0.842700792949715\n", - "CPU times: user 22.1 s, sys: 4 ms, total: 22.1 s\n", - "Wall time: 22.1 s\n" + "0.84270079295\n", + "CPU times: user 20.7 s, sys: 8 ms, total: 20.8 s\n", + "Wall time: 20.8 s\n" ] } ], "source": [ "%%time\n", "\n", + "import numpy as np\n", "import math\n", "import random\n", "\n", @@ -1064,23 +1386,23 @@ " m = n\n", " assert n >= m >= 0\n", " # First value:\n", - " r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n", + " r = np.zeros((n+1, m+1))\n", + " r[0, 0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", "\n", " # One side of the triangle:\n", " for i in range(1, n + 1):\n", - " h_i = (xmax - xmin) / float(2**i)\n", - " xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n", - " r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i, 0] = (0.5 * r[i - 1, 0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", "\n", " # All the other values:\n", " for j in range(1, m + 1):\n", " for i in range(j, n + 1):\n", - " try:\n", - " r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n", - " except:\n", - " raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n", + " r[i, j] = (((4.**j) * r[i, j - 1]) - r[i - 1, j - 1]) / ((4.**j) - 1.)\n", "\n", - " return r[(n, m)]\n", + " return r[n, m]\n", "\n", "for _ in range(100000):\n", " a = 0\n", @@ -1098,7 +1420,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 27, "metadata": { "code_folding": [ 9 @@ -1112,7 +1434,7 @@ "text": [ "0.84270079295\n", "CPU times: user 8 ms, sys: 0 ns, total: 8 ms\n", - "Wall time: 6.34 s\n" + "Wall time: 6.35 s\n" ] } ], @@ -1132,23 +1454,23 @@ " m = n\n", " assert n >= m >= 0\n", " # First value:\n", - " r = {(0, 0): 0.5 * (xmax - xmin) * (f(xmax) + f(xmin))}\n", + " r = [[0 for _ in range(n+1)] for _ in range(m+1)]\n", + " r[0][0] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", "\n", " # One side of the triangle:\n", " for i in range(1, n + 1):\n", - " h_i = (xmax - xmin) / float(2**i)\n", - " xsamples = [xmin + ((2 * k - 1) * h_i) for k in range(1, 1 + 2**(i - 1))]\n", - " r[(i, 0)] = (0.5 * r[(i - 1, 0)]) + h_i * math.fsum(f(x) for x in xsamples)\n", + " h_i = (xmax - xmin) / 2.**i\n", + " r[i][0] = (0.5 * r[i - 1][0]) + h_i * math.fsum(\n", + " f(xmin + ((2 * k - 1) * h_i))\n", + " for k in range(1, 1 + 2**(i - 1))\n", + " )\n", "\n", " # All the other values:\n", " for j in range(1, m + 1):\n", " for i in range(j, n + 1):\n", - " try:\n", - " r[(i, j)] = (((4**j) * r[(i, j - 1)]) - r[(i - 1, j - 1)]) / float((4**j) - 1)\n", - " except:\n", - " raise ValueError(\"romberg() with n = {}, m = {} and i = {}, j = {} was an error.\".format(n, m, i, j))\n", + " r[i][j] = (((4.**j) * r[i][j - 1]) - r[i - 1][j - 1]) / ((4.**j) - 1.)\n", "\n", - " return r[(n, m)]\n", + " return r[n][m]\n", "\n", "for _ in range(100000):\n", " a = 0\n", @@ -1166,7 +1488,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 28, "metadata": { "code_folding": [ 7 @@ -1180,8 +1502,8 @@ "f (generic function with 1 method)\n", "romberg_julia (generic function with 1 method)\n", "0.8427007929497148\n", - "CPU times: user 4 ms, sys: 8 ms, total: 12 ms\n", - "Wall time: 7.07 s\n" + "CPU times: user 4 ms, sys: 4 ms, total: 8 ms\n", + "Wall time: 7.19 s\n" ] } ], @@ -1234,6 +1556,77 @@ "Still not faster than Pypy... So what is the goal of Julia?" ] }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "code_folding": [ + 8 + ] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "f (generic function with 1 method)\n", + "romberg_julia_better (generic function with 1 method)\n", + "0.8427007929497148\n", + "CPU times: user 8 ms, sys: 0 ns, total: 8 ms\n", + "Wall time: 2.42 s\n" + ] + } + ], + "source": [ + "%%time\n", + "%%script julia\n", + "\n", + "function f(x)\n", + " (2.0 / sqrt(pi)) * exp(-x^2)\n", + "end\n", + "\n", + "\n", + "function romberg_julia_better(f, xmin, xmax; n=8)\n", + " m = n\n", + " # First value:\n", + " r = zeros((n+1, m+1)) # https://docs.julialang.org/en/latest/stdlib/arrays/#Base.zeros\n", + " r[1, 1] = (xmax - xmin) * (f(xmax) + f(xmin)) / 2.\n", + "\n", + " # One side of the triangle:\n", + " for i in 1 : n\n", + " h_i = (xmax - xmin) / (2^i)\n", + " sum_f_x = 0\n", + " for k in 1 : 2^(i - 1)\n", + " sum_f_x += f(xmin + ((2 * k - 1) * h_i))\n", + " end\n", + " r[i + 1, 1] = (r[i, 1] / 2.) + (h_i * sum_f_x)\n", + " end\n", + "\n", + " # All the other values:\n", + " for j in 1 : m\n", + " for i in j : n\n", + " r[i + 1, j + 1] = (((4.^j) * r[i + 1, j]) - r[i, j]) / (4.^j - 1.)\n", + " end\n", + " end\n", + "\n", + " r[n + 1, m + 1]\n", + "end\n", + "\n", + "for _ in 1:100000\n", + " a = 0\n", + " b = 1\n", + " romberg_julia_better(f, a, b)\n", + "end\n", + "println(romberg_julia_better(f, 0, 1))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is also faster than Pypy, but not that much..." + ] + }, { "cell_type": "markdown", "metadata": {}, @@ -1241,9 +1634,16 @@ "----\n", "## Conclusion\n", "\n", + "$\\implies$\n", + "On this (baby) example of a real world numerical algorithms, tested on thousands of random inputs or on thousands time the same input, the speed-up is in favor of Julia, but it doesn't seem impressive enough to make me want to use it (at least for now).\n", + "\n", + "If I have to use 1-based indexing and a slightly different language, just to maybe gain a speed-up of 2 to 3 (compared to Pypy) or even a 10x speed-up compare to naive Python, why bother?\n", + "\n", + "### Remark\n", "Of course, this was a *baby* benchmark, on a small algorithm, and probably wrongly implemented in both Python and Julia.\n", "\n", - "But still, I am surprised to see that the naive Julia version was *slower* than the naive Python version executed with Pypy..." + "But still, I am surprised to see that the naive Julia version was *slower* than the naive Python version executed with Pypy...\n", + "For the less naive version (without dictionary), the Julia version was about *2 to 3* times faster than the Python version with Pypy." ] } ], @@ -1281,6 +1681,13 @@ "sideBar": true, "threshold": 4, "toc_cell": true, + "toc_position": { + "height": "506px", + "left": "0px", + "right": "1068px", + "top": "116px", + "width": "212px" + }, "toc_section_display": "block", "toc_window_display": true }