XM_40017/notebooks/jacobi_method.ipynb
2023-08-14 13:04:06 +02:00

733 lines
68 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "2e15ced8",
"metadata": {},
"source": [
"<img src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/3/39/VU_logo.png/800px-VU_logo.png?20161029201021\" width=\"350\">\n",
"\n",
"### Programming large-scale parallel systems"
]
},
{
"cell_type": "markdown",
"id": "e8549215",
"metadata": {},
"source": [
"# Jacobi method"
]
},
{
"cell_type": "markdown",
"id": "6e0ef563",
"metadata": {},
"source": [
"## Contents\n",
"\n",
"In this notebook, we will learn\n",
"\n",
"- How to paralleize a Jacobi method\n",
"- How the data partition can impact the performance of a distributed algorithm\n",
"- How to use latency hiding\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1dc78750",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"jacobi_4_check (generic function with 1 method)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using Printf\n",
"\n",
"function answer_checker(answer,solution)\n",
" if answer == solution\n",
" \"🥳 Well done! \"\n",
" else\n",
" \"It's not correct. Keep trying! 💪\"\n",
" end |> println\n",
"end\n",
"gauss_seidel_1_check(answer) = answer_checker(answer,\"c\")\n",
"jacobi_1_check(answer) = answer_checker(answer, \"d\")\n",
"jacobi_2_check(answer) = answer_checker(answer, \"b\")\n",
"jacobi_3_check(answer) = answer_checker(answer, \"c\")\n",
"jacobi_4_check(anwswer) = answer_checker(answer, \"d\")"
]
},
{
"cell_type": "markdown",
"id": "d4cb59d5",
"metadata": {},
"source": [
"## The Jacobi method\n",
"\n",
"\n",
"The [Jacobi method](https://en.wikipedia.org/wiki/Jacobi_method) is a numerical tool to solve systems of linear algebraic equations. One of the main applications of the method is to solve boundary value problems (BVPs). I.e., given the values at the boundary (of a grid), the Jacobi method will find the interior values that fulfill a certain equation.\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "6a2bdbc6",
"metadata": {},
"source": [
"<div>\n",
"<img src=\"\" align=\"left\" width=\"300\"/>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"id": "e63a5792",
"metadata": {},
"source": [
"### Serial implementation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14a58308",
"metadata": {},
"outputs": [],
"source": [
"function jacobi(n,niters)\n",
" u = zeros(n+2)\n",
" u[1] = -1\n",
" u[end] = 1\n",
" u_new = copy(u)\n",
" for t in 1:niters\n",
" for i in 2:(n+1)\n",
" u_new[i] = 0.5*(u[i-1]+u[i+1])\n",
" end\n",
" u, u_new = u_new, u\n",
" end\n",
" u\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "76e1eba1",
"metadata": {},
"outputs": [],
"source": [
"jacobi(5,1000)"
]
},
{
"cell_type": "markdown",
"id": "6e085701",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-info\">\n",
"<b>Note:</b> The values computed by the Jacobi method are linearly increasing from -1 to 1. It is possible to show mathematically that the method we implemented in the function above approximates a 1D Laplace equation via a finite difference method and the solution of this equation is a linear function.\n",
"</div>\n",
"\n",
"<div class=\"alert alert-block alert-info\">\n",
"<b>Note:</b> In our version of the jacobi method, we return after a given number of iterations. Other stopping criteria are possible. For instance, iterate until the difference between u and u_new is below a tolerance. \n",
"</div>"
]
},
{
"cell_type": "markdown",
"id": "d6918c31",
"metadata": {},
"source": [
"\n",
"### Where can we exploit parallelism?\n",
"\n",
"Look at the two nested loops in the sequential implementation:\n",
"\n",
"```julia\n",
"for t in 1:nsteps\n",
" for i in 2:(n+1)\n",
" u_new[i] = 0.5*(u[i-1]+u[i+1])\n",
" end\n",
" u, u_new = u_new, u\n",
"end\n",
"```\n",
"\n",
"- The outer loop cannot be parallelized. The value of `u` at step `t+1` depends on the value at the previous step `t`.\n",
"- The inner loop can be parallelized.\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "798968b1",
"metadata": {},
"source": [
"### The Gauss-Seidel method\n",
"\n",
"The usage of `u_new` seems a bit unnecessary at first sight, right?. If we remove it, we get another method called Gauss-Seidel.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f77b547",
"metadata": {},
"outputs": [],
"source": [
"function gauss_seidel(n,niters)\n",
" u = zeros(n+2)\n",
" u[1] = -1\n",
" u[end] = 1\n",
" for t in 1:niters\n",
" for i in 2:(n+1)\n",
" u[i] = 0.5*(u[i-1]+u[i+1])\n",
" end\n",
" end\n",
" u\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "0dbc5358",
"metadata": {},
"source": [
"Note that the final solution is the same (up to machine precision)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ca31518b",
"metadata": {},
"outputs": [],
"source": [
"gauss_seidel(5,1000)"
]
},
{
"cell_type": "markdown",
"id": "c92e9c73",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> Which of the two loops in the Gauss-Seidel method are trivially parallelizable?\n",
"</div>\n",
"\n",
"```julia\n",
"for t in 1:niters\n",
" for i in 2:(n+1)\n",
" u[i] = 0.5*(u[i-1]+u[i+1])\n",
" end\n",
"end\n",
"```\n",
"\n",
" a) Both of them\n",
" b) The outer, but not the inner\n",
" c) None of them\n",
" d) The inner, but not the outer\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4edad93f",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"answer = \"x\" # replace x with a, b, c or d\n",
"gauss_seidel_1_check(answer)"
]
},
{
"cell_type": "markdown",
"id": "9df06442",
"metadata": {},
"source": [
"## Parallelization of the Jacobi method\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "97a2d7d5",
"metadata": {},
"source": [
"### Parallelization strategy\n",
"\n",
"- Each worker updates a consecutive section of the array `u_new` \n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "c4e2d000",
"metadata": {},
"source": [
"### Data dependencies\n",
"\n",
"Recall:\n",
"\n",
"`u_new[i] = 0.5*(u[i-1]+u[i+1])`\n",
"\n",
"\n",
"Thus, each process will need values from the neighboring processes to perform the update of its boundary values."
]
},
{
"cell_type": "markdown",
"id": "428bce86",
"metadata": {},
"source": [
"<div>\n",
"<img src=\"\" align=\"left\" width=\"450\"/>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"id": "1b3c8c05",
"metadata": {},
"source": [
"### Ghost (aka halo) cells\n",
"\n",
"A usual way of handling this type of data dependencies is using so-called ghost cells. Ghost cells represent the missing data dependencies in the data owned by each process. After importing the appropriate values from the neighbor processes one can perform the usual sequential Jacobi update locally in the processes."
]
},
{
"cell_type": "markdown",
"id": "cd1e3760",
"metadata": {},
"source": [
"<div>\n",
"<img src=\"\" align=\"left\" width=\"450\"/>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"id": "5c397005",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> Which is the communication and computation complexity in each process? N is the length of the vector and P the number of processes. \n",
"</div>\n",
"\n",
" a) Communication: O(P), computation: O(N/P)\n",
" b) Communication: O(1), computation: O(N)\n",
" c) Communication: O(P), computation: O(N)\n",
" d) Communication: O(1), computation: O(N/P)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a03fc4c",
"metadata": {},
"outputs": [],
"source": [
"answer = \"x\" # replace x with a, b, c or d\n",
"jacobi_1_check(answer)"
]
},
{
"cell_type": "markdown",
"id": "8ed4129c",
"metadata": {},
"source": [
"## Implementation\n",
"\n",
"We consider the implementation using MPI. The programming model of MPI is generally better suited for data-parallel algorithms like this one than the task-based model provided by Distributed.jl. In any case, one can also implement it using Distributed, but it requires some extra effort to setup remote channels right for the communication between neighbor processes.\n",
"\n",
"Take a look at the implementation below and try to understand it. Note that we have used MPIClustermanagers and Distributed just to run the MPI code on the notebook. When running it on a cluster, MPIClustermanagers and Distributed are not needed.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e15082fb",
"metadata": {},
"outputs": [],
"source": [
"] add MPI MPIClusterManagers"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a66cbf9a",
"metadata": {},
"outputs": [],
"source": [
"using MPIClusterManagers \n",
"using Distributed"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0d63c6b",
"metadata": {},
"outputs": [],
"source": [
"if procs() == workers()\n",
" nw = 3\n",
" manager = MPIWorkerManager(nw)\n",
" addprocs(manager)\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a0923606",
"metadata": {},
"outputs": [],
"source": [
"# Test cell, remove me\n",
"u = [-1, 0, 0, 0, 0, 1]\n",
"view(u, 6:6)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "68851107",
"metadata": {
"code_folding": []
},
"outputs": [],
"source": [
"@mpi_do manager begin\n",
" using MPI\n",
" comm = MPI.Comm_dup(MPI.COMM_WORLD)\n",
" nw = MPI.Comm_size(comm)\n",
" iw = MPI.Comm_rank(comm)+1\n",
" function jacobi_mpi(n,niters)\n",
" if mod(n,nw) != 0\n",
" println(\"n must be a multiple of nw\")\n",
" MPI.Abort(comm,1)\n",
" end\n",
" n_own = div(n,nw)\n",
" u = zeros(n_own+2)\n",
" u[1] = -1\n",
" u[end] = 1\n",
" u_new = copy(u)\n",
" for t in 1:niters\n",
" reqs = MPI.Request[]\n",
" # Exchange cell values with neighbors\n",
" if iw != 1\n",
" neig_rank = (iw-1)-1\n",
" req = MPI.Isend(view(u,2:2),comm,dest=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" req = MPI.Irecv!(view(u,1:1),comm,source=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" end\n",
" if iw != nw\n",
" neig_rank = (iw+1)-1\n",
" s = n_own+1\n",
" r = n_own+2\n",
" req = MPI.Isend(view(u,s:s),comm,dest=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" req = MPI.Irecv!(view(u,r:r),comm,source=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" end\n",
" MPI.Waitall(reqs)\n",
" for i in 2:(n_own+1)\n",
" u_new[i] = 0.5*(u[i-1]+u[i+1])\n",
" end\n",
" u, u_new = u_new, u\n",
" end\n",
" u\n",
" @show u\n",
" # Gather results in root process\n",
" results = zeros(n+2)\n",
" results[1] = -1\n",
" results[n+2] = 1\n",
" MPI.Gather!(view(u,2:n_own+1), view(results, 2:n+1), root=0, comm)\n",
" if iw == 1\n",
" @show results\n",
" end \n",
" end\n",
" niters = 100\n",
" load = 4\n",
" n = load*nw\n",
" jacobi_mpi(n,niters)\n",
"end"
]
},
{
"cell_type": "markdown",
"id": "eff25246",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> How many messages per iteration are sent from a process away from the boundary?\n",
"</div>\n",
"\n",
" a) 1\n",
" b) 2\n",
" c) 3\n",
" d) 4\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "98bd9b5e",
"metadata": {},
"outputs": [],
"source": [
"answer = \"x\" # replace x with a, b, c or d\n",
"jacobi_2_check(answer)"
]
},
{
"cell_type": "markdown",
"id": "075dd6d8",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> After the end of the for-loop (line 43), ...\n",
"</div>\n",
"\n",
" a) each worker holds the complete solution.\n",
" b) the root process holds the solution. \n",
" c) the ghost cells contain redundant values. \n",
" d) all ghost cells contain the initial values -1 and 1. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3b58002",
"metadata": {},
"outputs": [],
"source": [
"answer = \"x\" # replace x with a, b, c or d\n",
"jacobi_3_check(answer)"
]
},
{
"cell_type": "markdown",
"id": "4537661d",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> In line 35 of the code, we wait for all receive and send requests. Is it possible to instead wait for just the receive requests?\n",
"</div>\n",
"\n",
" \n",
" a) No, because the send buffer might be overwritten if we don't wait for send requests.\n",
" b) No, because MPI does not allow an asynchronous send without a Wait().\n",
" c) Yes, because each send has a matching receive, so all requests are done when the receive requests return. \n",
" d) Yes, because there are no writes to the send buffer in this iteration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e16ea5eb",
"metadata": {},
"outputs": [],
"source": [
"answer = \"x\" # replace x with a, b, c or d.\n",
"jacobi_4_check(answer)"
]
},
{
"cell_type": "markdown",
"id": "c9aa2901",
"metadata": {},
"source": [
"### Latency hiding\n",
"\n",
"Note that we only need communications to update the values at the boundary of the portion owned by each process. The other values (the one in green in the figure below) can be updated without communications. This provides the opportunity of overlapping the computation of the interior values (green cells in the figure) with the communication of the ghost values. This technique is called latency hiding, since we are hiding communication latency by overlapping it with communications that we need to do anyway.\n",
"\n",
"The modification of the implementation above to include latency hiding is leaved as an exercise (see below).\n"
]
},
{
"attachments": {
"fig.png": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA6gAAADyCAYAAABAvOgkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAewgAAHsIBbtB1PgAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPBoAACAASURBVHic7d17fFxlgf/xz0nTO71ZKKWFUrByKbctlB+2ogJVi0q5rCCLrhVd7bqC6y6yK+yq4OruqoDu/lxXWryh+/uhFERIqSDIRS2lVKwCcoe2SIG29ELb9Jpk9o9n0pxMZ5I0JPM8ST7v12tePU/mzMy3STrNN+ec58nmzp1bQJIkSZKkyGpiB5AkSZIkCSyokiRJkqRE1JaMZ2ZZtiJGEEmSJElS31IoFCYCdzaPWxXULMtWzJkz5+lqh5IkSZIk9T3z5s2jUGiZFslTfCVJkiRJSbCgSpIkSZKSYEGVJEmSJCXBgipJkiRJSoIFVZIkSZKUBAuqJEmSJCkJFlRJkiRJUhIsqJIkSZKkJFhQJUmSJElJsKBKkiRJkpJgQZUkSZIkJcGCKkmSJElKggVVkiRJkpQEC6okSZIkKQkWVEmSJElSEiyokiRJkqQkWFAlSZIkSUmwoEqSJEmSkmBBlSRJkiQlwYIqSZIkSUqCBVWSJEmSlAQLqiRJkiQpCRZUSZIkSVISLKiSJEmSpCRYUCVJkiRJSbCgSpIkSZKSYEGVJEmSJCXBgipJkiRJSkJt7ACSJCk5Q4D3FrcXAvURs1TboOJtF33r7y1JSbCgSpKkUqOBG4vbbwSe7+Tz1ABvAqYCU4BhxY9fQrrl75+AzwP3ADMiZ5GkPseCKkmSusN/Ax8Ehpe573LSLagxvBc4E3gGuDpyFkmKyoIqSZJKbQQuK26v7+RzTKalnK4qPudRrzNXNfwJeBD4YxVf83hgDvArqldQDwJOyN0OAzLgJuCzVcogSXuwoEqSpFKbga++zue4DrgKeBh4Bfgw8IPX+ZzVcF3x1ptNBZZWuG+/agaRpFIWVEmS1B3+X+wAPch3CJNRbdnLxx0GHEe4RvjhTrzuBuC3xcd+CBjfieeQpC7lMjOSJKnUQUCheDs0cpZquxC4F7imiq/5MqEkPrWXj5tFmMxqzl4+7mnC5FdvAN5FuCZ43V4+hyR1C4+gSpIktTgUOAVoquJrTiZcn7sWuK8Kr7epeJOk5HgEVZIkKa73EY6EfjF2EEmKzYIqSZIkSUqCBVWSJEmSlASvQZUkSR01AphU4b5n8LpGSdLrZEGVJEkd9Tbgtgr3nQ7cWcUsfcWhVJ6l98TinycBX6mwzy3Akq4OJUndxYIqSZI6aiOV19v06Gn3mAB8tp19jiveylmOBVVSD2JBlSRJHfVrYGrsEH3MC8BXK9x3InAa8Afgjgr7/L47QklSd7GgSpIkpet54LIK932GUFCXtLGPJPUozuIrSZIkSUqCR1AlSVJ3OI4wcVKz43Pbnwa25sbfArZUI5QkKW0WVEmS1B3amln2CyXjH2JBlSRhQZUkSXvaAswrbnd2dt4ncs/Rnq3t71I1W4C1wIbYQbrZ94EDc+NDi3/OBO7Kffxe4N+qFUqSLKiSJKnUBuCvX+dz/Lp462m+Vrz1dtOAw8t8fFzx1mxtdeJIUmBBlSRJiutm4En2vgzWEZaheb4Tr/l3wLAO7PdCJ55bkjrNgipJkhTX48Xb3nq6eOuMSuumSlJULjMjSZIkSUqCBVWSJEmSlAQLqiRJkiQpCRZUSZIkSVISLKiSJEmSpCRYUCVJkiRJSXCZGSWvUChkdXV1D8XO0ZZrrrnm4OXLlw8B2Lx586sbNmxIbWHzfYH9itvbgBXxopQ1GJiYGz8FNMWJUt6ECRMOp/hLvQsvvPDFqVOnbo4cqaIsy26cNWvWVdV4rbq6uh8UCoWjqvFanTF//vwx999//2iAHTt21K9evTq1NR2HAhNy4ydiBWnDkbntF4D6WEEqmED4PAKsA9ZEzLKHkSNHjh4+fPgYgPHjx2+//PLLl8fO1JaBAweeMnPmzG7/Gt9+++1va2xsvKa7X6eztm/fXnPppZce3jxevXr1ih07dmyLmamMiYT/PyGsoftqvChl7Uf4+QNgK7AyYpY9DBo0aMiYMWMObh5/85vffCLLspiR2vOpM88888HYIarBgqrkffGLX8xOOOGEqbFztGX9+vW88MLun3uHAge3sXtsQ2n5DyNVx8cOUCr39aWxsfHwxP8TW1KtFyoUCkdlWZbsv89t27aV/tscEzFORyT7uSw6sv1doiot/NFt3LiRjRs3AjBgwIChWZaNjhypTTU1NVX52bCxsXFUyu8dWZa1et8Hkv1FXNFQWv+iNzVDaflFeRK2b99e+jWemvL/7VmWjYidoVo8xVeSJEmSlASPoKonui7Lsudih8jbtWvXFRRPsxk/fvzKVatWfTtypFJ/BbypuL0T+ELELOVMBc7Njb8MbImUpaza2tqvNDQ0ALB+/fp7syy7M3KkUrMLhcLkmAEKhcIdNTU198XMUGrz5s2fpHhEbfTo0dvWrVv3xciRSr0NeE9ufFmsIG34Ssvm+Adg/2XxopTz+49B08CwPeIVeOPNcfO0NmZM/aw1a56aANDYWGh4/vmRc2NnyhsyZOeYsWO3nhc5xq4syz4fOUMrO3bs2Af4XPN47NixN73yyiu/jRipnH8BBhS3nwW+EzFLOZ+g5ajuduDKaEnKGDdu3MkvvfTSGc3jhoaGfx44cGBjzEylCoXCvwPpHtbtJhZU9TiFQuHHs2bNuid2jrxJkyZdRrGgTpgwYdWqVau+GjlSqXfQUlB3Aanl+xitC+p/AasjZSmrX79+uwvq8uXLl5xxxhlJfQ4XLFgwDYhaULMs+1Vqn5fp06efTbGgjho1avu6deuSyke41jpfUL8GFCJlqSRXUCf/Dn7xo3hRyun/oVxBfRUe/mHcPK3tv//7TmgpqI0Nn/70yUnl++hHH598zjnPRy+oqb13XHDBBfuTK6gnnnjinXV1dakVwM/RUlBXkt7/7bNoXVCTynfCCSfszBfUH//4x1fPnz9/Z8xMperq6v6NPlhQPcVXkiRJkpQEC6okSZIkKQkWVEmSJElSEiyokiRJkqQkWFAlSZIkSUmwoEqSJEmSkmBBlSRJkiQlwYIqSZIkSUqCBVWSJEmSlAQLqiRJkiQpCRZUSZIkSVISLKiSJEmSpCRYUCVJkiRJSbCgSpIkSZKSYEGVJEmSJCXBgipJkiRJSoIFVZIkSZKUhNrYAdRtBgOTc+NlQFM7j6kBpuTGjwPbujiXJEmSJJVlQe29DgV+mxsPof2yOaDkMccAj3VxLkmSJEkqy1N8JUmSJElJsKBKkiRJkpJgQZUkSZIkJcGCKkmSJElKggVVkiRJkpQEC6okSZIkKQkWVEmSJElSEiyokiRJkqQkWFAlSZIkSUmwoEqSJEmSkmBB7b0KJeOsA4/ZpzuCSJIkSVJHWFB7ry0l446Uz4O7I4gkSZIkdYQFtfdaT+ujqJM68Jh3dFMWSZIkSWqXBbX32gIsz41Pb2f/YcBF3RdHkiRJktpmQe3d7s5tXwyMr7DfQOB64KBuTyRJkiRJFVhQe7d5QFNxexTwa+BcWq5H3Rc4H3gIOAf4fbUDSpIkSVKz2tgB1K0eBr4JfLo4PgSYX9xuovUvKF4ilNdnq5au8y6qq6s7K3aIvH/8x38c3Lz97LPPHgL8Z8Q45RyW2x5AevmOKhl/GdgaI0glDQ0Nu7cnTZr0jrq6uiER4+yhUCgcEzsDcEZdXd3Y2CHyvv3tb09s3l67du1g0vveP75k/B9RUnTY798Kk/aLnaK1pkEt2+sPgEmXxMuypxdfbJjYvN2vX03/6667N6l8/fs3vCF2BmBAXV1dUv82N23aNOSGG27YPV60aNFfACm8z+YNzG0fRnrvb4fktpN7/128ePGU/PgDH/jANbNnz26qtH8kHVmFo9exoPZ+nyn++SlaF9L89m+AvwDWVSvU65Fl2Z/HzlCqtrbln9LatWsPAP42Xpp29SftfAAfix2gVGNj4+7tESNGTAWmxkuTrOnFWzKGDGn5PcJrr702iPS/9xPPt/a4cEvVltGw5fzYKfI2bGjZ7tevpt/YsfVJ5UtELYl97w8YMKDVeP369TOAGXHSdMhBJPY5LDGQxPK9+uqrrcY1NTUXR4qiEhbU3q8R+Dvgu4QSejywH2GW3+cIR1TvIxxR7Qe8P/fYP1UzqCRJkqS+zYLadzxavLWlkZZTgJNxxRVXFBYsWHBb7BxtGTdu3BGNjY3DATZs2PDSyy+//GLsTCXGAQcWt7cAj0fMUs4+wOTc+Le0XD+dhCOOOOKEmpqafgD9+/d/BtjQzkOiybLskSq+1v2ESwSSNHz48AMnT548DmDbtm2bli9f/mTsTCWGA0fkxg/FCtKG/5PbfhLYFCtIBYcDI4rbL5PYL1f333//A0aPHn1Qcbse+GPkSG3avn37rmq8TpZlLwMp/99eM3ny5N1nyqxcufLx+vr60jXmY5tMy7wiL5Lee/GBhJ8/ILxvJPX+u88++wybMGHCkc3jQqGwlNZLNCalsbFxTewM1ZLNnTt39xciy7LD58yZ83TMQJIkSZKkvmHevHmHFQqFp5rHzuIrSZIkSUqCBVWSJCl9A9rfRZJ6PguqJElS+j5PmHV/39hBJKk7WVAlSZLS9xvgBOAR4LTIWSSp21hQJUmS0rcYeA04ALidsHxc/6iJJKkbWFAlSZLStwloKG4PAj4ALAMOjZZInTUodgApZRZUSZKknuGe3PYgwjqYi4G/jBNHnTCIcLr2XJz4SirLgipJktQz3AbU58YZMAb4r+J9Q2KE0l65lnAt8dnAfpGzSEmyoEqSJPUM9wNbynx8BDATeAI4tqqJtDc+CXwYaCQc9V4VN46UJguqJElSz7AW2FnhvgHABMJpwJdVLZE6ahrwjeL2Z4G7ImaRkmZBlSRJ6jkebuf+NwBfAm6pQhZ1zP7AjYRfItwCfD1uHClttbEDSJIkqcNuBd4NDCz5+FbgWeAm4HfA+irnUnm1wE+AA4GngAuBQsxAUuosqJIkST3HfcBGwlE5CKf8DiDMDnspnjqamq8Bbwc2A+cQlguS1AZP8ZUkSeo5VhAm2YEwo+/Pge8Sfqb7EXBAnFgq43zg7wlHTD9CmMRKUjssqJIkST3L44TS8wTwPuBTwKOEo6rX4893KZhK+MUBwL8DN0fMIvUovoFJkiT1LLcRjqKeW/xzG+FoXT3wTuAL8aIJmAjUAUOBhfj1kPaKBVWSJKlnWQj8X2Bl7mNPAJ8obn8BOLvaoQTAKMLXZyxhsqrzaTklW1IHWFAlSZJ6lueAz5T5+P8Q1trMCNejHl3NUGIAMB84ElgFnAVsiZpI6oEsqJIkSb3HPwB3APsAPwVGxo3TZ2TAd4AZhJl63wO8GDWR1ENZUCVJknqPRuAvgeXAm4D/jz/vVcOXgQ8Rlv05B3gkbhyp5/INS5IkqXdZR7gGtR54N/CtuHF6vb8C/okws/Ic4J64caSezYIqSZLU+zwCXAg0ESZPuiRqmt7rvcC1xe0rCMv8SHodLKiSJEm9002Ea1IBrgLOi5ilNzqN8DmuBb4HfCluHKl3sKBKkiT1Xl8H/pPwM98PgZPjxuk1pgG3AoOAnwF/HTeO1HtYUCVJknq3Swgz+jaXKZefeX1OBH5OmCl5IWGt04aoiaRexIIqSZLUuzURZvZ9ABgN3AUcHjVRz3US8AtgBGEypHMJM/dK6iIWVEmSpN5vG2FCn98BY4FfAm+MmqjnOZlQTkcCvwLOInxeJXUhC6okSVLfsBGYASwDxgP3AhNjBupB3kY4nXc4cD+h7G+JmkjqpSyokiRJfcdGwtqoTwIHAXdjSW3PmcAdwLDin+/Gcip1GwuqJElS37IaeAfwLOE0318DR0ZNlK6PEyaYGkyYtfdsPK1X6lYWVEmSpL5nFfBW4A/AgcAi4M1RE6Xns8A8oB/wA8KESDtiBpL6AguqJElS3/QKcBqwBBhFmADo1KiJ0jAQ+D7wleL4X4CP4FIyUlVYUCVJkvqu9YTTfe+m5RrLj0ZNFNf+hOVjLiQU0k8AV8QMJPU1FlRJkqS+bQtwBvATYADwXeAawqmtfckU4CFgOqG4vxuYGzWR1AdZUCVJkrQDuAC4DGgCLgFuJ6z52RfMBn4DTACeAd5COKosqcosqJIkSQIoAF8F3g/UAzOBpcDUmKG62UhgPnA9MAS4DTiRsAyPpAgsqJIkScq7mTDD7wpgEmGG388AWcRM3eEU4HeE2Xl3An9PWEbmtYiZpD7PgipJktTirwjrgn4jdpDIlhGuybyRcF3q1cBC4ICYobrIPsB/ESZDOgR4jnBK738QjiJLisiCKkmS1OJg4GTg2NhBErAROB/4OLAVOB14AvgkPfdnyNOBR4CLiuNrCUX8t9ESSWqlp765SJIkqTq+Q7gO9UFgBPAtwoRCx8QMtZfeRLi+9OeEo6bLCcvr/A2wOWIuSSUsqJIkSS2+AuwHnBM7SGKeIJwGexHhGs1phOs3rwMOjJirPfsRTk9+DJhFuNb0asIR8nsi5pJUgQVVkiSpxVbgVWBT7CAJagL+G5hMmPm2FvgYYVmWq4F940XbwxjgKsKR0s8QrqO9nXDU9x8Ia79KSpAFVZIkqcU0whqg58YOkrCXCEvRTAPuBQYRSuBKYB5xT/2dUsywHLgUGAosISyZcwbwdLxokjrCgipJktTi3cA1hGsT1bYHgdOAdxFK4BDChEqPEIrrbMI6o91tv+LrPkg47fjjxSwPAu8B3gz8ogo5JHWB2tgBJEmS1KPdVbxNB/4W+HPCGqOnEK75vAv4KXA/YUmX16sGOAo4lXCt8FuBfsX7dhDWcb2WsFyQpB7GgipJkqSu8EDxNo6wnuy5hMmI3lu8AbwCLCIc6XyueFsJrCNc45o3iDBr8KHAYYSZeKcQinD+yGwBWEpYs/V6YG3X/rUkVZMFVZIkSV3pJeBLxdvhhKJ6OmGpmrHA+4q3UrtombxoGG3/nLoZWExYNuanwAtdEVxSfBZUSZIkdZengH8t3gYSSup04EjgjYSjo+OBDOgPjCp5fBOhfD5LmC34ccIR2EeAxu6PL6naLKiSJEmqhh2Ecrmo5OP9gOHAYMJpvRlhuZ96XO5H6nMsqJIkSYqpEdhQvEnq41xmRpIkSZKUBAuqJEmSJCkJFlRJkiRJUhIsqJIkSZKkJFhQJUmSJElJsKBKkiRJkpLgMjOSJEkt5gK3A6/FDiJJfZEFVZIkqcWq4k2SFIGn+EqSJEmSkmBBlSRJkiQlwYIqSZIkSUqCBVWSJEmSlAQLqiRJkiQpCRZUSZIkSVISXGZGPcItt9wyMXaGttTV1e27YcOGQQAvvfTSpiVLlmyKnanEMGBEcXsnsCZilnIGAGNy41VAIVKWss4666xxNTU1NQDHH3/8uqOPPnpb7EyVDB48eNPpp5++vhqvdccddxywbdu2gdV4rc5YtGjR8Oeee244wJYtW7bfddddr8bOVGIgsF9u/GKsIG04MLe9FtgRK0gF+wKDitubSWz90mnTpg0bO3bsCIBhw4btPOecc1J7/23l7LPPXpllWbe//954442D+/fvv393v05n7dq1K7vhhhvGN48feuihNatWrdoZM1MZYwj/f0L4vt8cMUs5w4s3CO8bayNm2cMhhxwy8M/+7M92v//Onj07xfff3UaOHPnKqaeeuj12jmqwoCp5V155ZU1tbe3y2DnasnTpUh599NHYMdSNbr311t3bkydPprY23bfPXbt2fQu4uBqvtXPnzttqa2unVuO1OmPFihXccsstsWOoD1u8ePHu7UmTJnHeeedFTNO+u+++eyRVKPmDBg16V5ZlP+vu1+mshoYG3zt6ueXLl7N8ecuPlxdeeCHF30Mnqb6+/nTgztg5qiHdr4IkSZIkqU+xoEqSJEmSkpDuOWpSZX+ZZdni9nernu3bty+jeJ3FlClTHl62bNn7I0cqdT1wcnF7K3BMxCzlvB/499z4JCCpawUHDhz43I4d4dK7J5988tpp06ZdFTlSqWsLhcI7I2e4KsuyayNnaGXNmjU3AVMAJk6c+NqKFSuOjxyp1MeBy3aPvsEn40Wp4O/5793bb+cWzuauiGn29A9cTQNDwmDc0/BPl7X9gOo67LBbP//003dNAdi1a9e2LMuOjp0pr6mp6ZgETrVN7vPypz/9aV9gSfP4pJNOunzJkiU3RoxUzqPQ/L3PImB2xCzl/ARovgRkE8X34lRMnz79ow888MA/N483bdp0xKhRo3bFzFSqUCg8Qx88oGhBVY9TKBRenjVr1vOxc+RNmjSpqXl70KBBO4Ck8gH5i+oLpJevtIyuBFbHCNIR9fX1G88444ykPocLFizYGjsDsCG1z8v06dN3T+hTW1vbRHrf+60nszqNNdSkNUFYKyPZwjsS+7eZ0dQyqN0JF62KF2ZPAwfes/t7sFAoNKX2b+S2224bHTsDUEjt83LBBRfU58djxox5lfTeP3Lf+2wnvXz5CdWSe/8dPXr0uvz4tttuWz5//vykJsKqq6uLHSGKPtfIJUmSJElpsqBKkiRJkpJgQZUkSZIkJcGCKkmSJElKggVVkiRJkpQEC6okSZIkKQkWVEmSJElSEiyokiRJkqQkWFAlSZIkSUmojR2gl8uANwLjgUHAS8DjQOPrfN5JwFhgGLAReB5Y/TqfU5IkSZKiSv0I6kTgudxtWAceU1PymKO7K1zRJ3Ov9T+5DJ8CngSeAe4D7gAeIRTJLwFD9vJ1DgK+BbxYfM5fAwuBB4CXgd8BH6Htr+nluaz/2c7r/TOtP48faWf/m3L7fridfSVJkiRpD6kfQe0PHJobd7RQ5x8zsOvilDUq93orCCX6JuBdFfYfDXwOeAcwE9jUgde4CLiacBS2nAyYAnyPUCTPBtaX2e+xXNbzgL8DChWe8xxafx7PBL5fYd99gFnAgOL49xX2kyRJkqSKUi+oPU0N4Sjquwin8d4PPAxsBg4GzgL2Le77ZkLpnNPOc14JXJEb1wM/B5YBrwFjCEX3pOL9bwXuAt4CbC95rvuBBsLX/QBgMvDHMq/5BkLhzTsF6Ef505PfSks5XUs4UixJkiRJe8WC2rVOJnxO/wicz57l71LgZuC04vijwJeBFyo83yzgC7nxTwhHU9eV7HcF4YjoDwinDh8P/CvwmZL9NgFLgWnF8YwyGQFOpeVo9Q7CUeiRwAnAQ2X2n5HbvofKR2UlSZIkqaLUr0HtaWqBVYSCV674bQQ+AGwpjvsRimU5/QnXnGbF8Y3ABexZTpvNBz6WG3+ScHS11C9z2zPK3F/68Wv3cv97KuwjSZIkSW2yoHa9zxFOc61kNbAgN55aYb/zCBMjQSi0F9P+kckbCEdIIVyvekGZffIF8u2UP4reXDg3A18Fmko+nrcvcGxu/Msy+0iSJElSuyyoXWsX4Uhnex7ObU+ssM+Zue1babv05v00t/3WMvc/AGwrbo8ATiy5/yDgsOL2rwgzBP+hOH4LMLhk/9No+T5aQZjFV5IkSZL2mgW1az0FbO3Afmty2yMq7JMvl4v3IsOTue0jy9y/A/hNblx6VDQ/vrv4Z/NR0UHA9Db29+ipJEmSpE5zkqSuVW5pl3J25rYHlLl/IDAuN76E9tchhbDETX6t2DdU2O+XwDuL2zMIEzWRG+f3a/7z0tz9+SJ6Wpn9JUmSJGmvWVC7VkMXPc+okvGhZfdq39AKH89fhzqNMPNv85Hf5sL5CmHdVIBfE0r1AFoX2IOBScXtAnBvJ3NKkiRJkgU1Uf1KxvdQefbetmyr8PHfARsIRXggYXmcXxDWRW0+cptfLqYeeBB4G2GpmVHFx+fL6h8JpVaSJEmSOqU3FtRyp8z2NKVl9JvAz7rw+RuB+4BziuMZhILa1um6vyQU1H7AKcAteP2pJEmSpC6U+iRJ20vGgzrwmHJrf/Y022l9NPLYSju+DuXWQy03QVKl/TO8/lSSJElSF0q9oG4qGR/Qgcec1B1BIshfz3lmxb06L18opxCK/SnF8TPACyX7P0RYFxVCQT0KGFscNwD3d0NGSZIkSX1I6gX1NWB1bvyWDjzm492Updpuzm2fALy7i5//SWBVcbuGMEvvyOK43NHQXYR1UQGOAGbn7vste/4yQZIkSZL2SuoFFVqvAfoJ2r5u9iO0LJ/S091CmHio2feAQ/bi8fvRUjgryc/me3Fuu/T03mb54npxhY9LkiRJUqf0hIL6o9z20cD1wPCSfYYAXwCuA7ZUKVd3ayIU7ubrcMcCS4GPUnkiqH7AqcA8YCXwxnZeI18sB+det9JyMeX2L/24JEmSJHVKT5jF92fAIlpO7/0AcEbxY68RjhS+mbDmZxPh1NOfVj9mt1hK+Pv8kDBB1Gjgu8DXCUeWXyQU2JGEo6vHAfvsxfOXK5bLgPUV9n8UWEPriai20footyRJkiR1Sk8oqE3A+cBdwJHFjw1nz2sytwJ/A9xavWhVMR9YAfyAsE4pwAjg9HYe9zKwsZ19XgSeAg7PfazS6b0Q1kW9B/iL3McWsedsy5IkSZK013pCQYUwmc+JwCXAB2ldqDYQStx/AE8Qlj+ZX3J/d3oi93qPdfAxL+Qes7qtHYuWAscAfw6cSziNt3Q5nS2EiY9+BdxJODra2IHn/ibw9tz45ko7Fv2IcCpxR/eXJEmSpA7pKQUVoB74UvE2jHBq70b2PB21ALy/irl+yt6fUvxA8bY3moCbijcIpzSPJnwNy30eOupbxVtHLSzeJEmSJKlL9aSCmreZljU5+6r64k2SJEmSeoWeMIuvJEmSJKkPsKBKkiRJkpJgQZUkSZIkJaGnXoPaWR8CBnfB8zwN3NcFz6NOyLLsxrq6up2xc+Rdfvnlw5u3ly5dOhV4KWKcct6Q2x5CevlK/13+gTAxWDJ27mz5lpsyZcrFF1100YcjxtlDoVAYFTsDcHldXd2nYofI+/73vz+6eXv58uUjSO97f2ir0UlcFylHxyzkXO7kjNgxWmnIfw5XHQEDbo8XZk+PP56NaN4eMGDAEI2W2AAAAwZJREFUkLq6utS+B/vHDgAMTu3zsnXr1pobbrhh93jhwoVXAf8SL1FZ+fePk0nv/W10bju599+FCxe2ev/94Ac/uGL27Nmx4lTSJw8m9rWCehWwfxc8z/VYUGMa3f4u1ZVl2e7thoaGAcAB8dK0KyPtfNA1/067VKFQ2L1dU1OzD7BPvDTJGla8JaOmpuX/9sbGxhpS/97fTgq/aKhsF4PZ1SW/6O0mjbXQuG/sFHmNrRd86wnvvzEk93nJv3cANDY2jgRGxknTIQNJ7HNYIrmvcWPpP84sSypfX9YnW7kkSZIkKT197Qjq4XRNKU/q9NLe7sorr2xasGDBh2LnaMuxxx571Lhx40YCrFq16k+PPfbYC7EzlTgQOLi4vRl4JGKWcoYBx+bGDwKNFfaNYsaMGW+ura3tBzB06NAnsizr7NrD3a6pqempar1Wv379Pl8oFJI6YpU3bty4g2fOnHkgwKZNmzYuXrz4j7EzlRgBHJ0bL4oVpA1vyW0/BrwWK0gFR9FyZOtFYGXELHs48sgjx0+YMGEiwMiRIzdnWZba+28ro0aN2lqN16mtrX24qakp2f/bsyzrN3PmzDc3j5ctW/bImjVrUlvi8Big+RKjlYTv/5RMAA4qbr9GeP9IxtixY0ccd9xxu99/syx7IMuyQluPiam2tjbp946ulM2dO3f3FyLLssPnzJnzdMxAkiRJkqS+Yd68eYcVCoXdv1z3FF9JkiRJUhIsqJIkSZKkJFhQJUmSJElJsKBKkiRJkpJgQZUkSZIkJcGCKkmSJElKggVVkiRJkpQEC6okSZIkKQkWVEmSJElSEiyokiRJkqQkWFAlSZIkSUmwoEqSJEmSkmBBlSRJkiQlwYIqSZIkSUqCBVWSJEmSlAQLqiRJkiQpCRZUSZIkSVISLKiSJEmSpCRYUCVJkiRJSbCgSpIkSZKSYEGVJEmSJCWhNj8oFAoT582bFyuLJEmSJKkPKRQKE/Pj2pL77ywUCtVLI0mSJElSkaf4SpIkSZKSYEGVJEmSJCXhfwG8dZWmdb83nAAAAABJRU5ErkJggg=="
}
},
"cell_type": "markdown",
"id": "5ae8701f",
"metadata": {},
"source": [
"<div>\n",
"<img src=\"attachment:fig.png\" align=\"left\" width=\"450\"/>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"id": "9d4de5a9",
"metadata": {},
"source": [
"## Extension to 2D"
]
},
{
"cell_type": "markdown",
"id": "6f5d2895",
"metadata": {},
"source": [
"### Serial implementation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ab59b2f",
"metadata": {},
"outputs": [],
"source": [
"function jacobi_2d(n,niters)\n",
" u = zeros(n+2,n+2)\n",
" u[1,:] = u[end,:] = u[:,1] = u[:,end] .= 1\n",
" heater = 1/n^2\n",
" u_new = copy(u)\n",
" for t in 1:niters\n",
" for j in 2:(n+1)\n",
" for i in 2:(n+1)\n",
" north = u[i,j+1]\n",
" south = u[i,j-1]\n",
" east = u[i+1,j]\n",
" west = u[i-1,j]\n",
" u_new[i,j] = 0.25*(north+south+east+west) + heater\n",
" end\n",
" end\n",
" u, u_new = u_new, u\n",
" end\n",
" u\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6da0aa54",
"metadata": {},
"outputs": [],
"source": [
"u = jacobi_2d(100,1000)"
]
},
{
"cell_type": "markdown",
"id": "47643bf6",
"metadata": {},
"source": [
"## Exercises"
]
},
{
"cell_type": "markdown",
"id": "0edeee84",
"metadata": {},
"source": [
"### Exercise 1\n",
"\n",
"Transform the following parallel implementation of the 1d Jacobi method (it is copied from above) to use latency hiding (overlap between computation of interior values and communication)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "66db180d",
"metadata": {},
"outputs": [],
"source": [
"@mpi_do manager begin\n",
" using MPI\n",
" comm = MPI.Comm_dup(MPI.COMM_WORLD)\n",
" nw = MPI.Comm_size(comm)\n",
" iw = MPI.Comm_rank(comm)+1\n",
" function jacobi_mpi(n,niters)\n",
" if mod(n,nw) != 0\n",
" println(\"n must be a multiple of nw\")\n",
" MPI.Abort(comm,1)\n",
" end\n",
" n_own = div(n,nw)\n",
" u = zeros(n_own+2)\n",
" u[1] = -1\n",
" u[end] = 1\n",
" u_new = copy(u)\n",
" for t in 1:niters\n",
" reqs = MPI.Request[]\n",
" # Exchange cell values with neighbors\n",
" if iw != 1\n",
" neig_rank = (iw-1)-1\n",
" req = MPI.Isend(view(u,2:2),comm,dest=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" req = MPI.Irecv!(view(u,1:1),comm,source=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" end\n",
" if iw != nw\n",
" neig_rank = (iw+1)-1\n",
" s = n_own+1\n",
" r = n_own+2\n",
" req = MPI.Isend(view(u,s:s),comm,dest=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" req = MPI.Irecv!(view(u,r:r),comm,source=neig_rank,tag=0)\n",
" push!(reqs,req)\n",
" end\n",
" MPI.Waitall(reqs)\n",
" for i in 2:(n_own+1)\n",
" u_new[i] = 0.5*(u[i-1]+u[i+1])\n",
" end\n",
" u, u_new = u_new, u\n",
" end\n",
" u\n",
" @show u\n",
" # Gather results in root process\n",
" results = zeros(n+2)\n",
" results[1] = -1\n",
" results[n+2] = 1\n",
" MPI.Gather!(view(u,2:n_own+1), view(results, 2:n+1), root=0, comm)\n",
" if iw == 1\n",
" @show results\n",
" end \n",
" end\n",
" niters = 100\n",
" load = 4\n",
" n = load*nw\n",
" jacobi_mpi(n,niters)\n",
"end"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.1",
"language": "julia",
"name": "julia-1.9"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}