mirror of https://github.com/fverdugo/XM_40017.git (synced 2025-11-08 22:24:25 +01:00)
Updating matrix-matrix notebook
parent 3d7fcb4d8e
commit a9692802de
@@ -72,9 +72,10 @@
 " \"It's not correct. Keep trying! 💪\"\n",
 " end |> println\n",
 "end\n",
+"alg_0_comp_check(answer) = answer_checker(answer, \"d\")\n",
 "alg_1_deps_check(answer) = answer_checker(answer,\"b\")\n",
-"alg_1_comm_overhead_check(answer) = answer_checker(answer, \"c\")\n",
-"alg_1_comp_check(answer) = answer_checker(answer, \"a\")\n",
+"alg_1_comm_overhead_check(answer) = answer_checker(answer, \"b\")\n",
+"alg_1_comp_check(answer) = answer_checker(answer, \"b\")\n",
 "alg_2_complex_check(answer) = answer_checker(answer, \"b\")\n",
 "alg_2_deps_check(answer) = answer_checker(answer,\"d\")\n",
 "alg_3_deps_check(answer) = answer_checker(answer, \"c\")\n",
@@ -88,7 +89,7 @@
 "source": [
 "## Problem Statement\n",
 "\n",
-"Let us consider the (dense) matrix-matrix product `C=A*B`."
+"Given two $N$-by-$N$ matrices $A$ and $B$, compute the matrix-matrix product $C=AB$ in parallel and efficiently."
 ]
 },
 {
@@ -157,7 +158,7 @@
 "source": [
 "## Serial implementation\n",
 "\n",
-"We start by considering the (naive) sequential algorithm:"
+"We start by considering the (naive) sequential algorithm, which is based on the mathematical definition of the matrix-matrix product: $C_{ij} = \\sum_k A_{ik} B_{kj}$."
 ]
 },
 {
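The definition $C_{ij} = \sum_k A_{ik} B_{kj}$ maps directly onto three nested loops. The following is a minimal sketch of that idea, not the notebook's own `matmul_seq!` (whose body is not shown in this diff); the name `matmul_naive!` is a hypothetical helper introduced only for illustration.

```julia
# Hypothetical helper (not from the notebook): naive triple-loop product.
# Overwrites C with A*B, following C[i,j] = sum over k of A[i,k]*B[k,j].
function matmul_naive!(C, A, B)
    m, n = size(C)
    l = size(A, 2)
    @assert size(A, 1) == m
    @assert size(B) == (l, n)
    for j in 1:n          # columns of C (outer loop, since Julia is column-major)
        for i in 1:m      # rows of C
            Cij = zero(eltype(C))
            for k in 1:l  # inner product of row i of A and column j of B
                Cij += A[i, k] * B[k, j]
            end
            C[i, j] = Cij
        end
    end
    return C
end

N = 4
A = rand(N, N)
B = rand(N, N)
C = similar(A)
matmul_naive!(C, A, B)
```

The result can be compared against Julia's built-in product with `C ≈ A*B`, in the same way the test cell added in this commit checks `matmul_seq!`.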
@@ -188,6 +189,30 @@
 "end"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "e3b86457",
+"metadata": {},
+"source": [
+"Run the next cell to test the implementation."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "c5caf799",
+"metadata": {},
+"outputs": [],
+"source": [
+"using Test\n",
+"N = 10\n",
+"A = rand(N,N)\n",
+"B = rand(N,N)\n",
+"C = similar(A)\n",
+"matmul_seq!(C,A,B)\n",
+"@test C ≈ A*B"
+]
+},
 {
 "cell_type": "markdown",
 "id": "f967d2ea",
@@ -216,6 +241,32 @@
 "@btime mul!(C,A,B);"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "0ca2fbd4",
+"metadata": {},
+"source": [
+"<div class=\"alert alert-block alert-success\">\n",
+"<b>Question:</b> What is the complexity (number of operations) of the serial algorithm? Assume that all matrices are $N$-by-$N$ matrices.\n",
+"</div>\n",
+"\n",
+" a) O(1)\n",
+" b) O(N)\n",
+" c) O(N²)\n",
+" d) O(N³)"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "078e974e",
+"metadata": {},
+"outputs": [],
+"source": [
+"answer = \"x\" # replace x with a, b, c, or d \n",
+"alg_0_comp_check(answer)"
+]
+},
 {
 "cell_type": "markdown",
 "id": "0eedd28a",
@@ -489,10 +540,10 @@
 "<b>Question:</b> How many scalars are communicated from and to a worker? Assume that matrices A, B, and C are N by N matrices.\n",
 "</div>\n",
 "\n",
-" a) 3N\n",
-" b) 2N + 2\n",
-" c) 2N + 1\n",
-" d) N² + 1"
+" a) O(1)\n",
+" b) O(N)\n",
+" c) O(N²)\n",
+" d) O(N³)"
 ]
 },
 {
@@ -515,9 +566,10 @@
 "<b>Question:</b> How many operations are done in a worker? \n",
 "</div>\n",
 "\n",
-" a) O(N)\n",
-" b) O(N²)\n",
-" c) O(N³)"
+" a) O(1)\n",
+" b) O(N)\n",
+" c) O(N²)\n",
+" d) O(N³)"
 ]
 },
 {
@@ -905,9 +957,9 @@
 "\n",
 "| Algorithm | Parallelism <br>(#workers) | Communication <br>per worker | Computation <br>per worker | Ratio communication/<br>computation |\n",
 "|---|---|---|---|---|\n",
-"| 1 | N² | 2N + 1 | N | O(1) |\n",
-"| 2 | N | 2N + N² | N² | O(1) |\n",
-"| 3 | P | N² + 2N²/P | N³/P | O(P/N) |\n",
+"| 1 | N² | O(N) | O(N) | O(1) |\n",
+"| 2 | N | O(N²) | O(N²) | O(1) |\n",
+"| 3 | P | O(N²) | O(N³/P) | O(P/N) |\n",
 "\n",
 "\n",
 "- Matrix-matrix multiplication is trivially parallelizable (all entries in the result matrix can be computed in parallel, at least in theory)\n",
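As a worked check of the last column of the summary table, the ratios can be obtained by dividing the per-worker communication by the per-worker computation. The sketch below uses the exact counts listed in the earlier version of the table rows (before they were replaced by their asymptotic orders):

```latex
\begin{aligned}
\text{Algorithm 1:}\quad & \frac{2N + 1}{N} = 2 + \frac{1}{N} = O(1) \\
\text{Algorithm 2:}\quad & \frac{2N + N^2}{N^2} = 1 + \frac{2}{N} = O(1) \\
\text{Algorithm 3:}\quad & \frac{N^2 + 2N^2/P}{N^3/P} = \frac{P + 2}{N} = O(P/N)
\end{aligned}
```

Of the three, only Algorithm 3 has a ratio that shrinks as $N$ grows with $P$ fixed, so its communication cost becomes relatively less significant for large matrices.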
@@ -1086,7 +1138,7 @@
 "id": "ab609c18",
 "metadata": {},
 "source": [
-"Run the next cell to check the performance of this implementation. Note that we are far away from the optimal speed up. Why? To answer this question compute the theoretical communication over computation ratio for this implementation and reason about the obtained result. Hint: the number of times a worker is spawned in this implementation is N^3/P on average."
+"Run the next cell to check the performance of this implementation. Note that we are far from the optimal speedup. Why? To answer this question, compute the theoretical communication-to-computation ratio for this implementation and reason about the result. Hint: the number of times a worker is spawned in this implementation is N^2/P on average."
 ]
 },
 {