build based on 5e44a19

This commit is contained in:
Documenter.jl
2024-08-27 07:47:13 +00:00
parent dad32dba25
commit bba38eb351
24 changed files with 40 additions and 59 deletions


@@ -7578,12 +7578,12 @@ a.anchor-link {
</div>
</div>
</div>
</div><div class="jp-Cell jp-CodeCell jp-Notebook-cell" id="cell-id=2f8ba040">
</div><div class="jp-Cell jp-CodeCell jp-Notebook-cell jp-mod-noOutputs" id="cell-id=2f8ba040">
<div class="jp-Cell-inputWrapper" tabindex="0">
<div class="jp-Collapser jp-InputCollapser jp-Cell-inputCollapser">
</div>
<div class="jp-InputArea jp-Cell-inputArea">
<div class="jp-InputPrompt jp-InputArea-prompt">In [8]:</div>
<div class="jp-InputPrompt jp-InputArea-prompt">In [ ]:</div>
<div class="jp-CodeMirrorEditor jp-Editor jp-InputArea-editor" data-type="inline">
<div class="cm-editor cm-s-jupyter">
<div class="highlight hl-julia"><pre><span></span><span class="k">using</span><span class="w"> </span><span class="n">Distributed</span>
@@ -7613,19 +7613,6 @@ a.anchor-link {
</div>
</div>
</div>
<div class="jp-Cell-outputWrapper">
<div class="jp-Collapser jp-OutputCollapser jp-Cell-outputCollapser">
</div>
<div class="jp-OutputArea jp-Cell-outputArea">
<div class="jp-OutputArea-child">
<div class="jp-OutputPrompt jp-OutputArea-prompt"></div>
<div class="jp-RenderedText jp-OutputArea-output" data-mime-type="text/plain" tabindex="0">
<pre>🥳 Well done!
</pre>
</div>
</div>
</div>
</div>
</div>
<div class="jp-Cell jp-MarkdownCell jp-Notebook-cell" id="cell-id=96d2693d">
<div class="jp-Cell-inputWrapper" tabindex="0">
@@ -7859,7 +7846,7 @@ d) O(N³)</code></pre>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h2 id="Where-can-we-exploit-parallelism?">Where can we exploit parallelism?<a class="anchor-link" href="#Where-can-we-exploit-parallelism?"></a></h2><p>The matrix-matrix multiplication is an example of an <a href="https://en.wikipedia.org/wiki/Embarrassingly_parallel">embarrassingly parallel algorithm</a>. An embarrassingly parallel (also known as trivially parallel) algorithm is an algorithm that can be split into parallel tasks with no (or very few) dependencies between them. Such algorithms are typically easy to parallelize.</p>
<p>Which parts of an algorithm are completely independent and thus trivially parallel? To answer this question, it is useful to inspect the for loops, which are potential sources parallelism. If the iterations are independent of each other, then they are trivial to parallelize. An easy check to find out if the iterations are dependent or not is to change their order (for instance changing <code>for j in 1:n</code> by <code>for j in n:-1:1</code>, i.e. doing the loop in reverse). If the result changes, then the iterations are not independent.</p>
<p>Which parts of an algorithm are completely independent and thus trivially parallel? To answer this question, it is useful to inspect the for loops, which are potential sources of parallelism. If the iterations are independent of each other, then they are trivial to parallelize. An easy check to find out whether the iterations are dependent is to change their order (for instance, changing <code>for j in 1:n</code> to <code>for j in n:-1:1</code>, i.e. doing the loop in reverse). If the result changes, then the iterations are not independent.</p>
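The reverse-the-loop check described above can be tried directly. A minimal sketch (the function and variable names here are illustrative, not from the notebook): a plain sum is iteration-order independent, so running it forward and in reverse gives the same answer.

```julia
# Independence check sketch (illustrative names, not from the notebook):
# run the same loop forward and in reverse and compare the results.
function sum_forward(x)
    acc = 0.0
    for j in 1:length(x)
        acc += x[j]
    end
    acc
end

function sum_reverse(x)
    acc = 0.0
    for j in length(x):-1:1  # same iterations, opposite order
        acc += x[j]
    end
    acc
end

x = collect(1.0:10.0)
sum_forward(x) == sum_reverse(x)  # true: the iterations are independent
```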
<p>Look at the three nested loops in the sequential implementation of the matrix-matrix product:</p>
<div class="highlight"><pre><span></span><span class="k">for</span><span class="w"> </span><span class="n">j</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">1</span><span class="o">:</span><span class="n">n</span>
<span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">1</span><span class="o">:</span><span class="n">m</span>
@@ -7886,7 +7873,7 @@ d) O(N³)</code></pre>
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Parallel-algorithms">Parallel algorithms<a class="anchor-link" href="#Parallel-algorithms"></a></h3><p>Parallelizing the loops over <code>i</code> and <code>j</code> means that all the entries of matrix C can be potentially computed in parallel. However, <em>which it the most efficient solution to solve all these entries in parallel in a distributed system?</em> To find this we will consider different parallelization strategies:</p>
<h3 id="Parallel-algorithms">Parallel algorithms<a class="anchor-link" href="#Parallel-algorithms"></a></h3><p>That the loops over <code>i</code> and <code>j</code> are trivially parallel implies that all the entries of matrix C can potentially be computed in parallel. However, <em>which is the most efficient way to compute all these entries in parallel on a distributed system?</em> To find out, we will consider different parallelization strategies:</p>
<ul>
<li>Algorithm 1: each worker computes a single entry of C</li>
<li>Algorithm 2: each worker computes a single row of C</li>
@@ -7926,7 +7913,7 @@ d) O(N³)</code></pre>
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<h3 id="Data-dependencies">Data dependencies<a class="anchor-link" href="#Data-dependencies"></a></h3><p>Moving data through the network is expensive and reducing data movement is one of the key points in a distributed algorithm. To this end, we need to determine which is the minimum data needed by a worker to perform its computations. These are called the <em>data dependencies</em>. This will give us later information about the performance of the parallel algorithm.</p>
<h3 id="Data-dependencies">Data dependencies<a class="anchor-link" href="#Data-dependencies"></a></h3><p>Moving data through the network is expensive, and reducing data movement is one of the key points in designing efficient distributed algorithms. To this end, we need to determine the minimum data needed by a worker to perform its computations. These are called the <em>data dependencies</em>. They will later give us information about the performance of the parallel algorithm.</p>
<p>In algorithm 1, each worker computes a single entry of the result matrix C.</p>
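The data dependencies of algorithm 1 can be made concrete with a small sketch (the names <code>Arow</code>, <code>Bcol</code>, <code>Cij</code> are illustrative): entry C[i,j] depends only on row i of A and column j of B, so that is all a worker needs to receive.

```julia
using LinearAlgebra  # provides dot

# Sketch: the only data C[i,j] depends on is one row of A and one
# column of B, so that is the minimum a worker must receive.
A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
i, j = 1, 2

Arow = A[i, :]   # data dependency 1: row i of A
Bcol = B[:, j]   # data dependency 2: column j of B

Cij = dot(Arow, Bcol)  # what the worker computes from its dependencies
Cij == (A * B)[i, j]   # true: matches the corresponding entry of A*B
```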
</div>
</div>
@@ -7986,7 +7973,7 @@ d) row A[i,:] and the whole matrix B</code></pre>
<h3 id="Implementation">Implementation<a class="anchor-link" href="#Implementation"></a></h3><p>Taking the data dependencies into account, parallel algorithm 1 can be implemented efficiently with the following steps, from the worker's perspective:</p>
<ol>
<li>The worker receives the data dependencies, i.e., the corresponding row A[i,:] and column B[:,j] from the master process</li>
<li>The worker computes the dot product of A[i,:] and B[:,j]</li>
<li>The worker computes the dot product of A[i,:] and B[:,j] locally</li>
<li>The worker sends back the result of C[i,j] to the master process</li>
</ol>
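The three steps above can be sketched for a single entry as follows (a hedged sketch with illustrative names; the notebook's actual implementation appears further down). <code>remotecall_fetch</code> ships the row and column to the worker, runs the dot product there, and returns the scalar result to the master.

```julia
using Distributed, LinearAlgebra

# Worker-side steps for one entry (i,j), as a sketch:
#   1. the dependencies A[i,:] and B[:,j] are sent to worker w,
#   2. the worker computes their dot product locally,
#   3. the scalar result is fetched back to the master.
function entry_on_worker(w, Arow, Bcol)
    # remotecall_fetch sends the arguments, runs dot on worker w,
    # and blocks until the result comes back
    remotecall_fetch(dot, w, Arow, Bcol)
end

A = rand(3, 3); B = rand(3, 3)
w = workers()[1]  # with no extra processes this is the master itself
C11 = entry_on_worker(w, A[1, :], B[:, 1])
C11 ≈ (A * B)[1, 1]  # true
```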
</div>
@@ -8012,7 +7999,7 @@ d) row A[i,:] and the whole matrix B</code></pre>
</div>
<div class="jp-InputArea jp-Cell-inputArea"><div class="jp-InputPrompt jp-InputArea-prompt">
</div><div class="jp-RenderedHTMLCommon jp-RenderedMarkdown jp-MarkdownOutput" data-mime-type="text/markdown">
<p>A possible implementation of this algorithm in Julia is as follows:</p>
<p>A possible implementation of this algorithm in Julia is as follows. Try to understand why <code>@sync</code> and <code>@async</code> are needed here.</p>
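Before reading it, a minimal illustration (not the notebook's code) of why both macros are needed: <code>remotecall_fetch</code> blocks the calling task until the result arrives, so each call is wrapped in <code>@async</code> to issue all requests concurrently, and <code>@sync</code> makes the loop wait until every spawned task has finished before the results are used.

```julia
using Distributed

# Sketch: without @async the remotecall_fetch calls would run one
# after another; without @sync we could read `results` before the
# asynchronous tasks have filled it in.
results = zeros(Int, 4)
@sync for i in 1:4
    @async begin
        # blocks this task only; the other tasks keep running
        results[i] = remotecall_fetch(x -> x^2, 1, i)
    end
end
results  # [1, 4, 9, 16]
```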
</div>
</div>
</div>
@@ -8081,7 +8068,8 @@ d) row A[i,:] and the whole matrix B</code></pre>
<span class="n">A</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rand</span><span class="p">(</span><span class="n">N</span><span class="p">,</span><span class="n">N</span><span class="p">)</span>
<span class="n">B</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rand</span><span class="p">(</span><span class="n">N</span><span class="p">,</span><span class="n">N</span><span class="p">)</span>
<span class="n">C</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">similar</span><span class="p">(</span><span class="n">A</span><span class="p">)</span>
<span class="nd">@test</span><span class="w"> </span><span class="n">matmul_dist_1!</span><span class="p">(</span><span class="n">C</span><span class="p">,</span><span class="n">A</span><span class="p">,</span><span class="n">B</span><span class="p">)</span><span class="w"> </span><span class="o">≈</span><span class="w"> </span><span class="n">A</span><span class="o">*</span><span class="n">B</span>
<span class="n">matmul_dist_1!</span><span class="p">(</span><span class="n">C</span><span class="p">,</span><span class="n">A</span><span class="p">,</span><span class="n">B</span><span class="p">)</span>
<span class="nd">@test</span><span class="w"> </span><span class="n">C</span><span class="w"> </span><span class="o">≈</span><span class="w"> </span><span class="n">A</span><span class="o">*</span><span class="n">B</span>
</pre></div>
</div>
</div>