diff --git a/notebooks/jacobi_method.ipynb b/notebooks/jacobi_method.ipynb
index 1c91c5f..08c01da 100644
--- a/notebooks/jacobi_method.ipynb
+++ b/notebooks/jacobi_method.ipynb
@@ -27,9 +27,9 @@
     "\n",
     "In this notebook, we will learn\n",
     "\n",
-    "- How to paralleize a Jacobi method\n",
+    "- How to parallelize the Jacobi method\n",
     "- How the data partition can impact the performance of a distributed algorithm\n",
-    "- How to use latency hiding\n",
+    "- How to use latency hiding to improve parallel performance\n",
     "\n"
    ]
   },
@@ -93,9 +93,12 @@
    "id": "93e84ff8",
    "metadata": {},
    "source": [
-    "When solving a Laplace equation in 1D, the Jacobi method leads to the following iterative scheme: The entry $i$ of vector $u$ at iteration $t+1$ is computed as:\n",
+    "When solving a [Laplace equation](https://en.wikipedia.org/wiki/Laplace%27s_equation) in 1D, the Jacobi method leads to the following iterative scheme: The entry $i$ of vector $u$ at iteration $t+1$ is computed as:\n",
     "\n",
-    "$u^{t+1}_i = \\dfrac{u^t_{i-1}+u^t_{i+1}}{2}$"
+    "$u^{t+1}_i = \\dfrac{u^t_{i-1}+u^t_{i+1}}{2}$\n",
+    "\n",
+    "\n",
+    "This iterative scheme is simple, yet it shares fundamental challenges with many other algorithms used in scientific computing, which is why we study it here.\n"
    ]
   },
   {
@@ -130,6 +133,14 @@
     "end"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "432bd862",
+   "metadata": {},
+   "source": [
+    "If you run it for zero iterations, you will see the initial condition."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -140,14 +151,78 @@
     "jacobi(5,0)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c75cb9a6",
+   "metadata": {},
+   "source": [
+    "If you run it for enough iterations, you will see the expected solution of the Laplace equation: values that vary linearly from -1 to 1."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b52be374",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "jacobi(5,100)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22fda724",
+   "metadata": {},
+   "source": [
+    "In our version of the Jacobi method, we return after a given number of iterations. Other stopping criteria are possible. For instance, iterate until the difference between `u` and `u_new` is below a tolerance:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "15de7bf5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "using LinearAlgebra: norm\n",
+    "function jacobi_with_tol(n,tol)\n",
+    "    u = zeros(n+2)\n",
+    "    u[1] = -1\n",
+    "    u[end] = 1\n",
+    "    u_new = copy(u)\n",
+    "    increment = similar(u)\n",
+    "    while true\n",
+    "        for i in 2:(n+1)\n",
+    "            u_new[i] = 0.5*(u[i-1]+u[i+1])\n",
+    "        end\n",
+    "        increment .= u_new .- u\n",
+    "        if norm(increment)/norm(u_new) < tol\n",
+    "            return u_new\n",
+    "        end\n",
+    "        u, u_new = u_new, u\n",
+    "    end\n",
+    "    u\n",
+    "end"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "697ad307",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "n = 5\n",
+    "tol = 1e-9\n",
+    "jacobi_with_tol(n,tol)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "6e085701",
    "metadata": {},
    "source": [
-    "<div class=\"alert alert-block alert-info\">\n",
-    "Note: In our version of the jacobi method, we return after a given number of iterations. Other stopping criteria are possible. For instance, iterate until the difference between u and u_new is below a tolerance.\n",
-    "</div>"
+    "However, we are not going to parallelize this more complex version in this notebook (we will consider it later in this course)."
    ]
   },
   {
@@ -156,7 +231,7 @@
    "metadata": {},
    "source": [
     "\n",
-    "### Where can we exploit parallelism?\n",
+    "## Where can we exploit parallelism?\n",
     "\n",
     "Look at the two nested loops in the sequential implementation:\n",
     "\n",
@@ -169,8 +244,8 @@
     "end\n",
     "```\n",
     "\n",
-    "- The outer loop cannot be parallelized. The value of `u` at step `t+1` depends on the value at the previous step `t`.\n",
-    "- The inner loop can be parallelized.\n",
+    "- The outer loop over `t` cannot be parallelized. The value of `u` at step `t+1` depends on the value at the previous step `t`.\n",
+    "- The inner loop is trivially parallel. The loop iterations are independent (any order is possible).\n",
     "\n"
    ]
   },
@@ -386,19 +461,11 @@
    "source": [
     "### Communication overhead\n",
     "- We update $N/P$ entries in each process at each iteration, where $N$ is the total length of the vector and $P$ the number of processes\n",
+    "- Thus, the computation cost per process and iteration is $O(N/P)$\n",
     "- We need to get remote entries from 2 neighbors (2 messages per iteration)\n",
     "- We need to communicate 1 entry per message\n",
-    "- Communication/computation ration is $O(P/N)$ (potentially scalable if $P<