mirror of https://github.com/fverdugo/XM_40017.git (synced 2025-11-11 21:04:23 +01:00)
Polish notebooks ASP, LEQ
parent bf45227674
commit e34d92008b
@ -25,6 +25,46 @@
|
|||||||
"- How to fix static load imbalance"
|
"- How to fix static load imbalance"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "480af594",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<div class=\"alert alert-block alert-info\">\n",
|
||||||
|
"<b>Note:</b> Do not forget to execute the cell below before starting this notebook! \n",
|
||||||
|
"</div>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 1,
|
||||||
|
"id": "7e93809a",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"ge_dep_check (generic function with 1 method)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 1,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"using Printf\n",
|
||||||
|
"function answer_checker(answer,solution)\n",
|
||||||
|
" if answer == solution\n",
|
||||||
|
" \"🥳 Well done! \"\n",
|
||||||
|
" else\n",
|
||||||
|
" \"It's not correct. Keep trying! 💪\"\n",
|
||||||
|
" end |> println\n",
|
||||||
|
"end\n",
|
||||||
|
"ge_par_check(answer) = answer_checker(answer, \"a\")\n",
|
||||||
|
"ge_dep_check(answer) = answer_checker(answer, \"b\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"id": "8dcee319",
|
"id": "8dcee319",
|
||||||
@ -33,7 +73,7 @@
|
|||||||
"## Gaussian elimination\n",
|
"## Gaussian elimination\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"System of linear algebraic equations\n",
|
"[Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) is a method to solve systems of linear equations, e.g.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"$$\n",
|
"$$\n",
|
||||||
"\\left[\n",
|
"\\left[\n",
|
||||||
@ -60,7 +100,7 @@
|
|||||||
"\\right]\n",
|
"\\right]\n",
|
||||||
"$$\n",
|
"$$\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Elimination steps\n",
|
"The steps of Gaussian elimination transform the system into an upper triangular one, which can then easily be solved by backward substitution. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"$$\n",
|
"$$\n",
|
||||||
@ -112,15 +152,29 @@
|
|||||||
"id": "94c10106",
|
"id": "94c10106",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Serial implementation\n"
|
"### Serial implementation\n",
|
||||||
|
"The following algorithm performs Gaussian elimination on a matrix that represents a system of linear equations.\n",
|
||||||
|
"- The first inner loop in line 4 divides the current row by the value of the diagonal entry, thus transforming the diagonal to contain only ones. \n",
|
||||||
|
"- The second inner loop beginning in line 8 subtracts a multiple of the current row from each row below it, such that all entries below the diagonal become zero. "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 1,
|
||||||
"id": "e4070214",
|
"id": "e4070214",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"gaussian_elimination! (generic function with 1 method)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 1,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"function gaussian_elimination!(B)\n",
|
"function gaussian_elimination!(B)\n",
|
||||||
" n,m = size(B)\n",
|
" n,m = size(B)\n",
|
||||||
@ -140,12 +194,36 @@
|
|||||||
"end"
|
"end"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "3763b000",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<div class=\"alert alert-block alert-info\">\n",
|
||||||
|
"<b>Note:</b> This algorithm is not correct for all matrices: if any diagonal element <code>B[k,k]</code> is zero, the computation in the first inner loop fails. To get around this problem, another step can be added to the algorithm that swaps the rows until the diagonal entry of the current row is not zero. This process of finding a nonzero value is called <b>pivoting</b>. \n",
|
||||||
|
"</div>"
|
||||||
|
]
|
||||||
|
},
|
||||||
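The pivoting step described in the note can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper name `swap_to_nonzero_pivot!` is hypothetical, and it only searches downwards in column `k` for the first nonzero entry (practical implementations usually pick the entry with the largest absolute value instead).

```julia
# Hypothetical helper (not part of the notebook): make B[k,k] nonzero by
# swapping row k with the first row i >= k that has a nonzero entry in column k.
# Returns true if a pivot was found, false if the whole column is zero.
function swap_to_nonzero_pivot!(B, k)
    n = size(B, 1)
    for i in k:n
        if B[i, k] != 0
            if i != k
                # The right-hand side is copied before assignment,
                # so the two rows are exchanged safely.
                B[[k, i], :] = B[[i, k], :]
            end
            return true
        end
    end
    return false
end

B = Float64[0 1 2; 3 4 5]
found = swap_to_nonzero_pivot!(B, 1)
```

Calling such a helper at the start of each outer iteration `k` would avoid the division by zero, at the cost of an extra search per iteration.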
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 2,
|
||||||
"id": "eb30df0d",
|
"id": "eb30df0d",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"3×4 Matrix{Float64}:\n",
|
||||||
|
" 1.0 3.0 1.0 9.0\n",
|
||||||
|
" 0.0 1.0 2.0 8.0\n",
|
||||||
|
" 0.0 0.0 1.0 4.0"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 2,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"A = Float64[1 3 1; 1 2 -1; 3 11 5]\n",
|
"A = Float64[1 3 1; 1 2 -1; 3 11 5]\n",
|
||||||
"b = Float64[9,1,35]\n",
|
"b = Float64[9,1,35]\n",
|
||||||
@ -153,6 +231,14 @@
|
|||||||
"gaussian_elimination!(B)"
|
"gaussian_elimination!(B)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "8d941741",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"The result is an upper triangular matrix which can be used to solve the system by backward substitution. "
|
||||||
|
]
|
||||||
|
},
|
||||||
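The backward substitution mentioned in this cell can be sketched as below. This is an illustrative sketch under the assumption that the diagonal already contains only ones, as in the output shown above; the function name `backward_substitution` is hypothetical and does not appear in the notebook.

```julia
# Hypothetical helper (not part of the notebook): solve U*x = c, where
# B = [U c] is the augmented matrix and U is upper triangular with unit diagonal.
function backward_substitution(B)
    n = size(B, 1)
    x = zeros(n)
    for i in n:-1:1
        s = B[i, n+1]              # right-hand side of equation i
        for j in i+1:n
            s -= B[i, j] * x[j]    # subtract the already known unknowns
        end
        x[i] = s                   # diagonal entry is 1, so no division is needed
    end
    x
end

# Result of gaussian_elimination!(B) as printed in the cell output above
B = Float64[1 3 1 9; 0 1 2 8; 0 0 1 4]
x = backward_substitution(B)       # → [5.0, 0.0, 4.0]
```

Substituting `x` back into the original system `A = [1 3 1; 1 2 -1; 3 11 5]`, `b = [9, 1, 35]` reproduces the right-hand side.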
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"id": "39f2e8ef",
|
"id": "39f2e8ef",
|
||||||
@ -185,6 +271,40 @@
|
|||||||
"```"
|
"```"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "e52c4b38",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<div class=\"alert alert-block alert-success\">\n",
|
||||||
|
"<b>Question:</b> Which of the loops can be parallelized?\n",
|
||||||
|
"</div>\n",
|
||||||
|
"\n",
|
||||||
|
" a) the inner loops, but not the outer loop\n",
|
||||||
|
" b) the outer loop, but not the inner loops\n",
|
||||||
|
" c) all loops\n",
|
||||||
|
" d) only the first inner loop"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 5,
|
||||||
|
"id": "078e974e",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"It's not correct. Keep trying! 💪\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"answer = \"x\" # replace x with a, b, c, or d \n",
|
||||||
|
"ge_par_check(answer)"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"id": "14d57c52",
|
"id": "14d57c52",
|
||||||
@ -193,6 +313,17 @@
|
|||||||
"### Two possible data partitions"
|
"### Two possible data partitions"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "c518f863",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"The outer loop of the algorithm is not parallelizable, since the iterations depend on the results of the previous iterations. However, we can extract parallelism from the inner loops. Let's have a look at two different parallelization schemes. \n",
|
||||||
|
"\n",
|
||||||
|
"1. **Block-wise partitioning**: Each processor gets a block of consecutive rows. \n",
|
||||||
|
"2. **Cyclic partitioning**: The rows are distributed cyclically among the processors. "
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"attachments": {
|
"attachments": {
|
||||||
"g23933.png": {
|
"g23933.png": {
|
||||||
@ -213,8 +344,9 @@
|
|||||||
"id": "a67e0aad",
|
"id": "a67e0aad",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Which is the work per process at iteration k ?\n",
|
"## What is the work per process at iteration k?\n",
|
||||||
"\n"
|
"To evaluate the efficiency of both partitioning schemes, consider how much work the processors do in the following example. \n",
|
||||||
|
"In any iteration k, which part of the matrix is updated in the inner loops? "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -237,12 +369,7 @@
|
|||||||
"id": "d083cd53",
|
"id": "d083cd53",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"<div>\n",
|
"It is clear from the code that at a given iteration `k`, the matrix is updated from row `k` to `n` and from column `k` to `m`. If we look at how that reflects the distribution of work over the processes, we can see that CPU 1 does not have any work, whereas CPU 2 does a little work and CPUs 3 and 4 do a lot of work. Thus, the workload is _imbalanced_ across the different processes. "
|
||||||
" <br>\n",
|
|
||||||
" <br>\n",
|
|
||||||
" <br>\n",
|
|
||||||
" <br>\n",
|
|
||||||
"</div>"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -269,7 +396,37 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"- CPUs with rows <k are idle during iteration k\n",
|
"- CPUs with rows <k are idle during iteration k\n",
|
||||||
"- Bad load balance means bad speedups, as some CPUs are waiting instead of doing useful work\n",
|
"- Bad load balance means bad speedups, as some CPUs are waiting instead of doing useful work\n",
|
||||||
"- Solution: cyclic partition "
|
"- Solution: cyclic partition \n",
|
||||||
|
" \n",
|
||||||
|
"### Data dependencies\n",
|
||||||
|
" \n",
|
||||||
|
"<div class=\"alert alert-block alert-success\">\n",
|
||||||
|
"<b>Question:</b> What are the data dependencies of this partitioning?\n",
|
||||||
|
"</div>\n",
|
||||||
|
"\n",
|
||||||
|
" a) CPUs with rows >k need all rows <=k\n",
|
||||||
|
" b) CPUs with rows >k need part of row k\n",
|
||||||
|
" c) All CPUs need row k \n",
|
||||||
|
" d) The CPU with row k needs all rows >k \n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 4,
|
||||||
|
"id": "e0565e92",
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"It's not correct. Keep trying! 💪\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"answer = \"x\" # replace x with a, b, c, or d \n",
|
||||||
|
"ge_dep_check(answer)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -279,11 +436,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"### Cyclic partition\n",
|
"### Cyclic partition\n",
|
||||||
"\n",
|
"\n",
|
||||||
"- Less load imbalance\n",
|
"In contrast, if we look at how the work is balanced for the same example with cyclic partitioning, we find that the processes have similar workloads. "
|
||||||
"- Same data dependencies as 1d block partition\n",
|
|
||||||
"- Useful for some problems with predictable load imbalance\n",
|
|
||||||
"- A form of static load balancing\n",
|
|
||||||
"- Not suitable for all communication patterns (e.g. Jacobi)"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
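The difference in load balance can be checked by counting, for each process, how many of the rows `k` to `n` it owns at iteration `k`. The following is a small sketch with assumed values (`n = 12` rows, `P = 4` processes, iteration `k = 6`); the helper names are hypothetical, and the block owner formula assumes that `n` is divisible by `P`.

```julia
# Hypothetical helpers (not part of the notebook).
# Owner of row i (1-based) when each process holds n/P consecutive rows.
block_owner(i, n, P) = div(i - 1, div(n, P)) + 1
# Owner of row i when rows are dealt out round-robin.
cyclic_owner(i, P) = mod(i - 1, P) + 1

# Number of rows in k:n owned by each process, i.e. the rows it updates at iteration k.
function rows_per_process(owner, n, P, k)
    work = zeros(Int, P)
    for i in k:n
        work[owner(i)] += 1
    end
    work
end

n, P, k = 12, 4, 6   # assumed example sizes
block_work  = rows_per_process(i -> block_owner(i, n, P), n, P, k)   # → [0, 1, 3, 3]
cyclic_work = rows_per_process(i -> cyclic_owner(i, P), n, P, k)     # → [1, 2, 2, 2]
```

With the block partition the first process is already idle and the second has almost nothing left to do, while the cyclic partition keeps the row counts within one of each other.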
{
|
{
|
||||||
@ -301,25 +454,43 @@
|
|||||||
"</div>"
|
"</div>"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"id": "866824c6",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Conclusion\n",
|
||||||
|
"Cyclic partitioning tends to work well in problems with predictable load imbalance. It is a form of **static load balancing**, which means using a pre-defined load schedule based on prior information about the algorithm (as opposed to **dynamic load balancing**, which can schedule loads more flexibly at runtime). The data dependencies are the same as for the 1d block partitioning.\n",
|
||||||
|
"\n",
|
||||||
|
"At the same time, cyclic partitioning is not suitable for all communication patterns. For example, it can lead to a large communication overhead in the parallel Jacobi method, since the computation of each value depends on its neighbouring elements."
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"id": "20982b04",
|
"id": "20982b04",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Exercise\n",
|
"## Exercise\n",
|
||||||
"\n",
|
"The actual implementation of the parallel algorithm is left as an exercise. Implement both 1d block and 1d cyclic partitioning and compare their performance. The implementation is closely related to that of Floyd's algorithm. To test your algorithms, generate input matrices with the function below (a random matrix is not enough; we need a nonsingular matrix that does not require pivoting). "
|
||||||
"- The actual parallel implementation is let as an exercise\n",
|
|
||||||
"- Implement both 1d block and 1d cyclic partitions and compare performance\n",
|
|
||||||
"- Closely related with Floyd's algorithm\n",
|
|
||||||
"- Generate input matrix with function below (a random matrix is not enough, we need a non singular matrix that does not require pivoting)"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 5,
|
||||||
"id": "a65cf8e6",
|
"id": "a65cf8e6",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"tridiagonal_matrix (generic function with 1 method)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 5,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"function tridiagonal_matrix(n)\n",
|
"function tridiagonal_matrix(n)\n",
|
||||||
" C = zeros(n,n)\n",
|
" C = zeros(n,n)\n",
|
||||||
@ -352,7 +523,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Julia 1.9.0",
|
"display_name": "Julia 1.9.1",
|
||||||
"language": "julia",
|
"language": "julia",
|
||||||
"name": "julia-1.9"
|
"name": "julia-1.9"
|
||||||
},
|
},
|
||||||
@ -360,7 +531,7 @@
|
|||||||
"file_extension": ".jl",
|
"file_extension": ".jl",
|
||||||
"mimetype": "application/julia",
|
"mimetype": "application/julia",
|
||||||
"name": "julia",
|
"name": "julia",
|
||||||
"version": "1.9.0"
|
"version": "1.9.1"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
File diff suppressed because one or more lines are too long
|
Before Width: | Height: | Size: 6.2 MiB After Width: | Height: | Size: 6.4 MiB |
@ -39,7 +39,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"<div class=\"alert alert-block alert-info\">\n",
|
"<div class=\"alert alert-block alert-info\">\n",
|
||||||
"<b>Note:</b> Do not forget to run the next cell before starting studying this notebook. \n",
|
"<b>Note:</b> Do not forget to run the next cell before you start studying this notebook. \n",
|
||||||
"</div>"
|
"</div>"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@ -977,7 +977,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Julia 1.9.0",
|
"display_name": "Julia 1.9.1",
|
||||||
"language": "julia",
|
"language": "julia",
|
||||||
"name": "julia-1.9"
|
"name": "julia-1.9"
|
||||||
},
|
},
|
||||||
@ -985,7 +985,7 @@
|
|||||||
"file_extension": ".jl",
|
"file_extension": ".jl",
|
||||||
"mimetype": "application/julia",
|
"mimetype": "application/julia",
|
||||||
"name": "julia",
|
"name": "julia",
|
||||||
"version": "1.9.0"
|
"version": "1.9.1"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@ -1175,7 +1175,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Julia 1.9.0",
|
"display_name": "Julia 1.9.1",
|
||||||
"language": "julia",
|
"language": "julia",
|
||||||
"name": "julia-1.9"
|
"name": "julia-1.9"
|
||||||
},
|
},
|
||||||
@ -1183,7 +1183,7 @@
|
|||||||
"file_extension": ".jl",
|
"file_extension": ".jl",
|
||||||
"mimetype": "application/julia",
|
"mimetype": "application/julia",
|
||||||
"name": "julia",
|
"name": "julia",
|
||||||
"version": "1.9.0"
|
"version": "1.9.1"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@ -1322,7 +1322,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Julia 1.9.0",
|
"display_name": "Julia 1.9.1",
|
||||||
"language": "julia",
|
"language": "julia",
|
||||||
"name": "julia-1.9"
|
"name": "julia-1.9"
|
||||||
},
|
},
|
||||||
@ -1330,7 +1330,7 @@
|
|||||||
"file_extension": ".jl",
|
"file_extension": ".jl",
|
||||||
"mimetype": "application/julia",
|
"mimetype": "application/julia",
|
||||||
"name": "julia",
|
"name": "julia",
|
||||||
"version": "1.9.0"
|
"version": "1.9.1"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@ -823,7 +823,7 @@
|
|||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Julia 1.9.0",
|
"display_name": "Julia 1.9.1",
|
||||||
"language": "julia",
|
"language": "julia",
|
||||||
"name": "julia-1.9"
|
"name": "julia-1.9"
|
||||||
},
|
},
|
||||||
@ -831,7 +831,7 @@
|
|||||||
"file_extension": ".jl",
|
"file_extension": ".jl",
|
||||||
"mimetype": "application/julia",
|
"mimetype": "application/julia",
|
||||||
"name": "julia",
|
"name": "julia",
|
||||||
"version": "1.9.0"
|
"version": "1.9.1"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
File diff suppressed because one or more lines are too long