More work in notebooks
@@ -285,9 +285,14 @@
 "id": "0eedd28a",
 "metadata": {},
 "source": [
-"### Where can we exploit parallelism?\n",
+"## Where can we exploit parallelism?\n",
 "\n",
-"Look at the three nested loops in the sequential implementation:\n",
+"The matrix-matrix multiplication is an example of an [embarrassingly parallel algorithm](https://en.wikipedia.org/wiki/Embarrassingly_parallel). An embarrassingly parallel (also known as trivially parallel) algorithm is an algorithm that can be split into parallel tasks with no (or very few) dependences between them. Such algorithms are typically easy to parallelize.\n",
 "\n",
+"Which parts of an algorithm are completely independent and thus trivially parallel? To answer this question, it is useful to inspect the for loops, which are potential sources of parallelism. If the iterations are independent of each other, then they are trivial to parallelize. An easy check to find out whether the iterations are dependent is to change their order (for instance, changing `for j in 1:n` to `for j in n:-1:1`, i.e., running the loop in reverse). If the result changes, then the iterations are not independent.\n",
+"\n",
+"Look at the three nested loops in the sequential implementation of the matrix-matrix product:\n",
+"\n",
 "```julia\n",
 "for j in 1:n\n",
@@ -301,12 +306,10 @@
 "end\n",
 "```\n",
 "\n",
-"To find out which parts of an algorithm can be parallelized it is useful to start by looking into the for loops. We can run the iterations of the for loop in parallel if the iterations are independent of each other and do not cause any side effect. An easy check to find out if the iterations are independent is checking what happens if we change their order (for instance changing `for j in 1:n` by `for j in n:-1:1`, i.e. doing the loop in reverse). Is the result independent of the loop order? Then one says that the iteration order is *overspecified* and the iterations are parallelizable (if there are not side effects).\n",
+"Note that:\n",
 "\n",
-"In our case:\n",
-"\n",
-"- Loops over `i` and `j` are parallelizable.\n",
-"- The loop over `k` can be parallelized but it requires a reduction. Note that this loop causes a side effect on the outer variable `Cij`. This is why parallelizing this loop is not as easy as the other cases. We are not going to parallelize this loop in this notebook.\n",
+"- Loops over `i` and `j` are trivially parallel.\n",
+"- The loop over `k` is not trivially parallel. The accumulation into the reduction variable `Cij` introduces extra dependences. In addition, remember that the addition of floating-point numbers is not strictly associative due to rounding errors. Thus, the result of this loop may change with the loop order when using floating-point numbers. In any case, this loop can also be parallelized, but it requires a parallel *fold* or a parallel *reduction*.\n",
 "\n"
 ]
 },
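
The loop-reversal check described in the hunk above is easy to try out. Below is a minimal sketch, not taken from the commit, that runs the `i` and `j` loops of the triple-loop kernel in reverse and compares against `A*B`; the helper name `matmul_reversed!` and the square-matrix setup are assumptions for illustration.

```julia
# Minimal sketch of the loop-reversal check (illustrative, not from
# the original notebook).
function matmul_reversed!(C, A, B)
    n = size(C, 1)
    for j in n:-1:1              # j loop run in reverse
        for i in n:-1:1          # i loop run in reverse as well
            Cij = zero(eltype(C))
            for k in 1:n         # k loop kept in its original order
                Cij += A[i, k] * B[k, j]
            end
            C[i, j] = Cij
        end
    end
    C
end

n = 4
A = rand(n, n); B = rand(n, n); C = zeros(n, n)
matmul_reversed!(C, A, B)
@assert C ≈ A * B  # same result: the i and j iterations are order-independent
```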
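Since the `i` and `j` loops are trivially parallel, one way to exploit this with shared-memory threads is to distribute the outer `j` loop across threads, so each thread writes to disjoint columns of `C`. `Threads.@threads` is standard Julia, but this variant is a sketch added here, not part of the notebook.

```julia
using Base.Threads

# Illustrative sketch only: parallelize the trivially parallel outer
# j loop; each thread fills disjoint columns of C.
function matmul_threads!(C, A, B)
    m, n = size(C)
    l = size(A, 2)
    @threads for j in 1:n
        for i in 1:m
            Cij = zero(eltype(C))
            for k in 1:l
                Cij += A[i, k] * B[k, j]
            end
            C[i, j] = Cij
        end
    end
    C
end

A = rand(300, 200); B = rand(200, 100); C = zeros(300, 100)
matmul_threads!(C, A, B)
@assert C ≈ A * B
```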
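The `k` loop, by contrast, needs a parallel reduction. Here is a minimal sketch under the assumption of per-thread partial sums combined at the end; the function `dot_entry_parallel` is hypothetical, and, as the notebook notes, the result may differ from the sequential sum in the last bits because floating-point addition is not associative.

```julia
using Base.Threads

# Hypothetical sketch of a parallel reduction over k for one entry
# C[i,j]. Each thread t accumulates a private partial sum over a
# strided chunk of the k range; the partials are combined afterwards.
function dot_entry_parallel(A, B, i, j)
    l = size(A, 2)
    nt = nthreads()
    partial = zeros(eltype(A), nt)
    @threads for t in 1:nt
        s = zero(eltype(A))
        for k in t:nt:l          # thread t handles k = t, t+nt, t+2nt, ...
            s += A[i, k] * B[k, j]
        end
        partial[t] = s           # each t writes its own slot: no race
    end
    sum(partial)                 # combine per-thread partial sums
end

A = rand(5, 1000); B = rand(1000, 5)
@assert dot_entry_parallel(A, B, 2, 3) ≈ (A * B)[2, 3]
```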