From 82cfa1d44b9e2ac6304a6dcb66695429a2bb06fd Mon Sep 17 00:00:00 2001
From: VictorianHues
Date: Mon, 30 Sep 2024 23:14:53 +0200
Subject: [PATCH] Miscellaneous typos fixed

---
 notebooks/LEQ.ipynb               |  6 +++---
 notebooks/asp.ipynb               |  4 ++--
 notebooks/jacobi_method.ipynb     | 20 ++++++++++----------
 notebooks/julia_async.ipynb       |  8 ++++----
 notebooks/julia_basics.ipynb      |  2 +-
 notebooks/julia_distributed.ipynb |  8 ++++----
 notebooks/julia_mpi.ipynb         | 12 ++++++------
 notebooks/matrix_matrix.ipynb     |  4 ++--
 8 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/notebooks/LEQ.ipynb b/notebooks/LEQ.ipynb
index a393385..492c085 100644
--- a/notebooks/LEQ.ipynb
+++ b/notebooks/LEQ.ipynb
@@ -103,7 +103,7 @@
     "### Problem statement\n",
     "\n",
     "Let us consider a system of linear equations written in matrix form $Ax=b$, where $A$ is a nonsingular square matrix, and $x$ and $b$ are vectors. $A$ and $b$ are given, and $x$ is unknown. The goal of Gaussian elimination is to transform the system $Ax=b$, into a new system $Ux=c$ such that\n",
-    "- both system have the same solution vector $x$,\n",
+    "- both systems have the same solution vector $x$,\n",
     "- the matrix $U$ of the new system is *upper triangular* with unit diagonal, namely $U_{ii} = 1$ and $U_{ij} = 0$ for $i>j$.\n",
     "\n",
     "\n",
@@ -398,7 +398,7 @@
     "source": [
     "### Data partition\n",
     "\n",
-    "Let start considering a row-wise block partition, as we did in previous algorithms.\n",
+    "Let's start considering a row-wise block partition, as we did in previous algorithms.\n",
     "\n",
     "In the figure below, we use different colors to illustrate which entries are assigned to a CPU. All entries with the same color are assigned to the same CPU."
    ]
   },
@@ -454,7 +454,7 @@
     "Definition: *Load imbalance*: is the problem when work is not equally distributed over all processes and consequently some processes do more work than others.\n",
     "\n",
     "\n",
-    "Having processors waiting for others is a waist of computational resources and affects negatively parallel speedups. The optimal speedup (speedup equal to the number of processors) assumes that the work is perfectly parallel and that it is evenly distributed. If there is load imbalance, the last assumption is not true anymore and the speedup will be suboptimal.\n"
+    "Having processors waiting for others is a waste of computational resources and affects negatively parallel speedups. The optimal speedup (speedup equal to the number of processors) assumes that the work is perfectly parallel and that it is evenly distributed. If there is load imbalance, the last assumption is not true anymore and the speedup will be suboptimal.\n"
    ]
   },
   {
diff --git a/notebooks/asp.ipynb b/notebooks/asp.ipynb
index 9188920..ae3dbe1 100644
--- a/notebooks/asp.ipynb
+++ b/notebooks/asp.ipynb
@@ -57,7 +57,7 @@
     "function q1_answer(bool)\n",
     " bool || return\n",
     " msg = \"\"\"\n",
-    " The we can change the loop order over i and j without changing the result. Rememeber:\n",
+    " Then we can change the loop order over i and j without changing the result. Remember:\n",
     " \n",
     " C[i,j] = min(C[i,j],C[i,k]+C[k,j])\n",
     " \n",
@@ -788,7 +788,7 @@
     " if rank == 0\n",
     " N = size(C,1)\n",
     " if mod(N,P) !=0\n",
-    " println(\"N not multplie of P\")\n",
+    " println(\"N not multiple of P\")\n",
     " MPI.Abort(comm,-1)\n",
     " end\n",
     " Nref = Ref(N)\n",
diff --git a/notebooks/jacobi_method.ipynb b/notebooks/jacobi_method.ipynb
index cfd67f9..381ae88 100644
--- a/notebooks/jacobi_method.ipynb
+++ b/notebooks/jacobi_method.ipynb
@@ -27,7 +27,7 @@
     "\n",
     "In this notebook, we will learn\n",
     "\n",
-    "- How to paralleize the Jacobi method\n",
+    "- How to parallelize the Jacobi method\n",
     "- How the data partition can impact the performance of a distributed algorithm\n",
     "- How to use latency hiding to improve parallel performance\n",
     "\n"
    ]
   },
@@ -452,7 +452,7 @@
     "- We need to get remote entries from 2 neighbors (2 messages per iteration)\n",
     "- We need to communicate 1 entry per message\n",
     "- Thus, communication complexity is $O(1)$\n",
-    "- Communication/computation ration is $O(P/N)$, making the algorithm potentially scalable if $P<<N$\n",
+    "- Communication/computation ratio is $O(P/N)$, making the algorithm potentially scalable if $P<<N$\n",
@@ ... @@
     "| Partition | Messages<br> per iteration | Communication<br> per worker | Computation<br> per worker | Ratio communication/<br> computation |\n",
     "|---|---|---|---|---|\n",
-    "| 1d block | 2 | O(N) | N²/P | O(P/N) |\n",
-    "| 2d block | 4 | O(N/√P) | N²/P | O(√P/N) |\n",
-    "| 2d cyclic | 4 |O(N²/P) | N²/P | O(1) |"
+    "| 1D block | 2 | O(N) | N²/P | O(P/N) |\n",
+    "| 2D block | 4 | O(N/√P) | N²/P | O(√P/N) |\n",
+    "| 2D cyclic | 4 |O(N²/P) | N²/P | O(1) |"
    ]
   },
   {
@@ -862,9 +862,9 @@
     "\n",
     "\n",
     "\n",
-    "- Both 1d and 2d block partitions are potentially scalable if $P<<N$\n",
diff --git a/notebooks/julia_basics.ipynb b/notebooks/julia_basics.ipynb
     "Tip: Did you know that Jupyter stands for Julia, Python and R?\n",
diff --git a/notebooks/julia_distributed.ipynb b/notebooks/julia_distributed.ipynb
index fa7b1a6..8569e20 100644
--- a/notebooks/julia_distributed.ipynb
+++ b/notebooks/julia_distributed.ipynb
@@ -137,7 +137,7 @@
     "\n",
     "\n",
     "<div>\n",
-    "Tip: We can also start new processes when launching Julia from the command line by suing the `-p` command-line argument. E.g., `$ julia -p 3 ` would launch Julia with 3 extra processes.\n",
+    "Tip: We can also start new processes when launching Julia from the command line by using the `-p` command-line argument. E.g., `$ julia -p 3 ` would launch Julia with 3 extra processes.\n",
     "</div>\n"
    ]
   },
@@ -251,7 +251,7 @@
     "source": [
     "### Creating workers in other machines\n",
     "\n",
-    "For large parallel computations, one typically needs to use different computers in parallel. Function `addprocs` also provides a low-level method to start workers in other machines. Next code example would create 3 workers in `server1` and 4 new workers in server `server2` (see figure below). Under the hood, Julia connects via ssh to the other machines and starts the new processes there. In order this to work, the local computer and the remote servers need to be properly configured (see the Julia manual for details). \n",
+    "For large parallel computations, one typically needs to use different computers in parallel. Function `addprocs` also provides a low-level method to start workers in other machines. Next code example would create 3 workers in `server1` and 4 new workers in `server2` (see figure below). Under the hood, Julia connects via ssh to the other machines and starts the new processes there. In order this to work, the local computer and the remote servers need to be properly configured (see the Julia manual for details). \n",
     "\n",
     "\n",
@@ -514,7 +514,7 @@
  "id": "10899cd4",
  "metadata": {},
  "source": [
-    "### Another usefull macro: `@fetchfrom`\n",
+    "### Another useful macro: `@fetchfrom`\n",
     "\n",
     "Macro `@fetchfrom` is the blocking version of `@spawnat`. It blocks and returns the corresponding result instead of a `Future` object. "
    ]
   },
@@ -552,7 +552,7 @@
     "source": [
     "### Explicit data movement in `remotecall` / `fetch`\n",
     "\n",
-    "When usig `remotecall` we send to the remote process a function and its arguments. In this example, we send function name `+` and matrices `a` and `b` to proc 4. When fetching the result we receive a copy of the matrix from proc 4."
+    "When using `remotecall` we send to the remote process a function and its arguments. In this example, we send function name `+` and matrices `a` and `b` to proc 4. When fetching the result we receive a copy of the matrix from proc 4."
    ]
   },
   {
diff --git a/notebooks/julia_mpi.ipynb b/notebooks/julia_mpi.ipynb
index 3011772..05caecd 100644
--- a/notebooks/julia_mpi.ipynb
+++ b/notebooks/julia_mpi.ipynb
@@ -167,7 +167,7 @@
     "```julia\n",
     "using MPI\n",
     "MPI.Init()\n",
-    "# Your MPI programm here\n",
+    "# Your MPI program here\n",
     "MPI.Finalize() # Optional\n",
     "```\n",
     "\n",
@@ -176,7 +176,7 @@
     "```julia\n",
     "using MPI\n",
     "MPI.Init(finalize_atexit=false)\n",
-    "# Your MPI programm here\n",
+    "# Your MPI program here\n",
    "MPI.Finalize() # Mandatory\n",
     "```\n",
     "\n",
@@ -186,7 +186,7 @@
     "#include <mpi.h>\n",
     "int main(int argc, char** argv) {\n",
     " MPI_Init(NULL, NULL);\n",
-    " /* Your MPI Programm here */\n",
+    " /* Your MPI Program here */\n",
     " MPI_Finalize();\n",
     "}\n",
     "```\n",
@@ -612,7 +612,7 @@
  "id": "4b455f98",
  "metadata": {},
  "source": [
-    "So, the full MPI program needs to be in the source file passed to Julia or the quote block. In practice, long MPI programms are written as Julia packages using several files, which are then loaded by each MPI process. For our simple example, we just need to include the definition of `foo` inside the quote block."
+    "So, the full MPI program needs to be in the source file passed to Julia or the quote block. In practice, long MPI programs are written as Julia packages using several files, which are then loaded by each MPI process. For our simple example, we just need to include the definition of `foo` inside the quote block."
    ]
   },
   {
@@ -920,7 +920,7 @@
     " source = MPI.ANY_SOURCE\n",
     " tag = MPI.ANY_TAG\n",
     " status = MPI.Probe(comm,MPI.Status; source, tag)\n",
-    " count = MPI.Get_count(status,Int) # Get incomming message length\n",
+    " count = MPI.Get_count(status,Int) # Get incoming message length\n",
     " println(\"I am about to receive $count integers.\")\n",
     " rcvbuf = zeros(Int,count) # Allocate \n",
     " MPI.Recv!(rcvbuf, comm, MPI.Status; source, tag)\n",
@@ -973,7 +973,7 @@
     " if rank == 3\n",
     " rcvbuf = zeros(Int,5)\n",
     " MPI.Recv!(rcvbuf, comm, MPI.Status; source=2, tag=0)\n",
-    " # recvbuf will have the incomming message fore sure. Recv! has returned.\n",
+    " # rcvbuf will have the incoming message for sure. Recv! has returned.\n",
     " @show rcvbuf\n",
     " end\n",
     "end\n",
diff --git a/notebooks/matrix_matrix.ipynb b/notebooks/matrix_matrix.ipynb
index 955abd9..1448e87 100644
--- a/notebooks/matrix_matrix.ipynb
+++ b/notebooks/matrix_matrix.ipynb
@@ -293,7 +293,7 @@
     "## Where can we exploit parallelism?\n",
     "\n",
     "\n",
-    "The matrix-matrix multiplication is an example of [embarrassingly parallel algorithm](https://en.wikipedia.org/wiki/Embarrassingly_parallel). An embarrassingly parallel (also known as trivially parallel) algorithm is an algorithm that can be split in parallel tasks with no (or very few) dependences between them. Such algorithms are typically easy to parallelize.\n",
+    "The matrix-matrix multiplication is an example of [embarrassingly parallel algorithm](https://en.wikipedia.org/wiki/Embarrassingly_parallel). An embarrassingly parallel (also known as trivially parallel) algorithm is an algorithm that can be split in parallel tasks with no (or very few) dependencies between them. Such algorithms are typically easy to parallelize.\n",
     "\n",
     "Which parts of an algorithm are completely independent and thus trivially parallel? To answer this question, it is useful to inspect the for loops, which are potential sources of parallelism. If the iterations are independent of each other, then they are trivial to parallelize. An easy check to find out if the iterations are dependent or not is to change their order (for instance changing `for j in 1:n` by `for j in n:-1:1`, i.e. doing the loop in reverse). If the result changes, then the iterations are not independent.\n",
     "\n",
@@ -314,7 +314,7 @@
     "Note that:\n",
     "\n",
     "- Loops over `i` and `j` are trivially parallel.\n",
-    "- The loop over `k` is not trivially parallel. The accumulation into the reduction variable `Cij` introduces extra dependences. In addition, remember that the addition of floating point numbers is not strictly associative due to rounding errors. Thus, the result of this loop may change with the loop order when using floating point numbers. In any case, this loop can also be parallelized, but it requires a parallel *fold* or a parallel *reduction*.\n",
+    "- The loop over `k` is not trivially parallel. The accumulation into the reduction variable `Cij` introduces extra dependencies. In addition, remember that the addition of floating point numbers is not strictly associative due to rounding errors. Thus, the result of this loop may change with the loop order when using floating point numbers. In any case, this loop can also be parallelized, but it requires a parallel *fold* or a parallel *reduction*.\n",
     "\n"
    ]
   },