Polish notebooks ASP, LEQ

Gelieza K 2023-10-30 11:23:00 +01:00
parent bf45227674
commit e34d92008b
8 changed files with 20487 additions and 28374 deletions

View File

@ -25,6 +25,46 @@
"- How to fix static load imbalance"
]
},
{
"cell_type": "markdown",
"id": "480af594",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-info\">\n",
"<b>Note:</b> Do not forget to execute the cell below before starting this notebook! \n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7e93809a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ge_dep_check (generic function with 1 method)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using Printf\n",
"function answer_checker(answer,solution)\n",
" if answer == solution\n",
" \"🥳 Well done! \"\n",
" else\n",
" \"It's not correct. Keep trying! 💪\"\n",
" end |> println\n",
"end\n",
"ge_par_check(answer) = answer_checker(answer, \"a\")\n",
"ge_dep_check(answer) = answer_checker(answer, \"b\")"
]
},
{
"cell_type": "markdown",
"id": "8dcee319",
@ -33,7 +73,7 @@
"## Gaussian elimination\n",
"\n",
"\n",
"System of linear algebraic equations\n",
"[Gaussian elimination](https://en.wikipedia.org/wiki/Gaussian_elimination) is a method to solve systems of linear equations, e.g.\n",
"\n",
"$$\n",
"\\left[\n",
@ -60,7 +100,7 @@
"\\right]\n",
"$$\n",
"\n",
"Elimination steps\n",
"The steps of Gaussian elimination transform the system into upper triangular form, which can then be solved easily by backward substitution. \n",
"\n",
"\n",
"$$\n",
@ -112,15 +152,29 @@
"id": "94c10106",
"metadata": {},
"source": [
"### Serial implementation\n"
"### Serial implementation\n",
"The following algorithm computes the Gaussian elimination on a matrix which represents a system of linear equations.\n",
"- The first inner loop in line 4 divides the current row by the value of the diagonal entry, thus transforming the diagonal to contain only ones. \n",
"- The second inner loop beginning in line 8 subtracts multiples of the current row from the rows below it, such that all entries below the diagonal become zero. "
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "e4070214",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"gaussian_elimination! (generic function with 1 method)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function gaussian_elimination!(B)\n",
" n,m = size(B)\n",
@ -140,12 +194,36 @@
"end"
]
},
{
"cell_type": "markdown",
"id": "3763b000",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-info\">\n",
"<b>Note:</b> This algorithm is not correct for all matrices: if any diagonal element <code>B[k,k]</code> is zero, the computation in the first inner loop fails. To get around this problem, another step can be added to the algorithm that swaps the rows until the diagonal entry of the current row is not zero. This process of finding a nonzero value is called <b>pivoting</b>. \n",
"</div>"
]
},
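The pivoting step described in the note above can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the name `gaussian_elimination_pivot!` is hypothetical, the elimination loops are assumed to follow the textual description of the serial algorithm (normalize row `k`, then eliminate the rows below), and the row is chosen by partial pivoting (largest absolute value in the column) rather than by taking the first nonzero entry.

```julia
# Hypothetical sketch: Gaussian elimination with partial pivoting on an
# augmented matrix B. The loop structure is an assumption based on the
# description of the serial algorithm, not a copy of the notebook's code.
function gaussian_elimination_pivot!(B)
    n, m = size(B)
    for k in 1:n
        # Pick the row at or below k with the largest |entry| in column k.
        p = k - 1 + argmax(abs.(view(B, k:n, k)))
        B[p, k] == 0 && error("matrix is singular")
        if p != k
            # Swap rows k and p so that the diagonal entry is nonzero.
            for j in 1:m
                B[k, j], B[p, j] = B[p, j], B[k, j]
            end
        end
        # Normalize row k so that the diagonal entry becomes one.
        for t in (k+1):m
            B[k, t] /= B[k, k]
        end
        B[k, k] = 1
        # Eliminate the entries below the diagonal in column k.
        for i in (k+1):n
            for j in (k+1):m
                B[i, j] -= B[i, k] * B[k, j]
            end
            B[i, k] = 0
        end
    end
    B
end

# The first diagonal entry is zero, so the unpivoted algorithm would divide by zero.
B = Float64[0 1 2; 1 1 3]
gaussian_elimination_pivot!(B)
```

Choosing the largest entry instead of any nonzero one is a common choice because it also limits the growth of rounding errors, but either rule avoids the division by zero.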
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"id": "eb30df0d",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"3×4 Matrix{Float64}:\n",
" 1.0 3.0 1.0 9.0\n",
" 0.0 1.0 2.0 8.0\n",
" 0.0 0.0 1.0 4.0"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"A = Float64[1 3 1; 1 2 -1; 3 11 5]\n",
"b = Float64[9,1,35]\n",
@ -153,6 +231,14 @@
"gaussian_elimination!(B)"
]
},
{
"cell_type": "markdown",
"id": "8d941741",
"metadata": {},
"source": [
"The result is an upper triangular matrix which can be used to solve the system by backward substitution. "
]
},
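A minimal sketch of the backward substitution step, assuming the augmented matrix is already upper triangular as in the output above. The function name `backward_substitution` is hypothetical and not part of the notebook.

```julia
# Hypothetical helper: solve the system encoded in an n×(n+1) augmented
# matrix B whose left n×n block is upper triangular with a nonzero diagonal.
function backward_substitution(B)
    n = size(B, 1)
    x = zeros(eltype(B), n)
    # Start at the last row, where only one unknown remains, and move upwards.
    for i in n:-1:1
        s = B[i, n+1]
        for j in (i+1):n
            s -= B[i, j] * x[j]
        end
        x[i] = s / B[i, i]
    end
    x
end

# Upper triangular result of the elimination shown above.
B = Float64[1 3 1 9; 0 1 2 8; 0 0 1 4]
x = backward_substitution(B)  # [5.0, 0.0, 4.0]
```

Substituting `x` into the original system `A = Float64[1 3 1; 1 2 -1; 3 11 5]`, `b = Float64[9,1,35]` reproduces `b`.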
{
"cell_type": "markdown",
"id": "39f2e8ef",
@ -185,6 +271,40 @@
"```"
]
},
{
"cell_type": "markdown",
"id": "e52c4b38",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> Which of the loops can be parallelized?\n",
"</div>\n",
"\n",
" a) the inner loops, but not the outer loop\n",
" b) the outer loop, but not the inner loops\n",
" c) all loops\n",
" d) only the first inner loop"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "078e974e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"It's not correct. Keep trying! 💪\n"
]
}
],
"source": [
"answer = \"x\" # replace x with a, b, c, or d \n",
"ge_par_check(answer)"
]
},
{
"cell_type": "markdown",
"id": "14d57c52",
@ -193,6 +313,17 @@
"### Two possible data partitions"
]
},
{
"cell_type": "markdown",
"id": "c518f863",
"metadata": {},
"source": [
"The outer loop of the algorithm is not parallelizable, since the iterations depend on the results of the previous iterations. However, we can extract parallelism from the inner loops. Let's have a look at two different parallelization schemes. \n",
"\n",
"1. **Block-wise partitioning**: Each processor gets a block of consecutive rows. \n",
"2. **Cyclic partitioning**: The rows are cyclically distributed among the processors. "
]
},
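The two partitioning schemes can be illustrated by listing the rows each processor owns. This is a sketch under stated assumptions: the helper names `block_rows` and `cyclic_rows` are hypothetical, processors are numbered `1:P`, and the number of rows `n` is assumed to be a multiple of `P`.

```julia
# Hypothetical helpers: rows owned by processor p (1-based) out of P
# processors for a matrix with n rows. Assumes n is a multiple of P.

# Block-wise partitioning: processor p owns one contiguous chunk of n/P rows.
function block_rows(p, n, P)
    chunk = div(n, P)
    ((p - 1) * chunk + 1):(p * chunk)
end

# Cyclic partitioning: processor p owns rows p, p+P, p+2P, ...
cyclic_rows(p, n, P) = p:P:n

n, P = 8, 4
for p in 1:P
    println("CPU $p: block rows ", collect(block_rows(p, n, P)),
            ", cyclic rows ", collect(cyclic_rows(p, n, P)))
end
```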
{
"attachments": {
"g23933.png": {
@ -213,8 +344,9 @@
"id": "a67e0aad",
"metadata": {},
"source": [
"### Which is the work per process at iteration k ?\n",
"\n"
"## What is the work per process at iteration k?\n",
"To evaluate the efficiency of both partitioning schemes, consider how much work the processors do in the following example. \n",
"In any iteration k, which part of the matrix is updated in the inner loops? "
]
},
{
@ -237,12 +369,7 @@
"id": "d083cd53",
"metadata": {},
"source": [
"<div>\n",
" <br>\n",
" <br>\n",
" <br>\n",
" <br>\n",
"</div>"
"The code shows that at a given iteration `k`, the matrix is updated from row `k` to `n` and from column `k` to `m`. Mapping this region onto the block partition, we can see that CPU 1 does not have any work, whereas CPU 2 does a little work and CPUs 3 and 4 do a lot of work. Thus, the workload is _imbalanced_ across the different processes. "
]
},
{
@ -269,7 +396,37 @@
"\n",
"- CPUs with rows <k are idle during iteration k\n",
"- Bad load balance means bad speedups, as some CPUs are waiting instead of doing useful work\n",
"- Solution: cyclic partition "
"- Solution: cyclic partition \n",
" \n",
"### Data dependencies\n",
" \n",
"<div class=\"alert alert-block alert-success\">\n",
"<b>Question:</b> What are the data dependencies of this partitioning?\n",
"</div>\n",
"\n",
" a) CPUs with rows >k need all rows <=k\n",
" b) CPUs with rows >k need part of row k\n",
" c) All CPUs need row k \n",
"    d) CPUs with row k need all rows >k \n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e0565e92",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"It's not correct. Keep trying! 💪\n"
]
}
],
"source": [
"answer = \"x\" # replace x with a, b, c, or d \n",
"ge_dep_check(answer)"
]
},
{
@ -279,11 +436,7 @@
"source": [
"### Cyclic partition\n",
"\n",
"- Less load imbalance\n",
"- Same data dependencies as 1d block partition\n",
"- Useful for some problems with predictable load imbalance\n",
"- A form of static load balancing\n",
"- Not suitable for all communication patterns (e.g. Jacobi)"
"In contrast, if we look at how the work is balanced for the same example with cyclic partitioning, we find that the processes have a similar workload. "
]
},
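The difference in load balance can be made concrete by counting, for one iteration `k`, how many of the updated rows (`k` to `n`) each processor owns. The sketch below is illustrative only: the helper names and the sizes `n = 8`, `P = 4`, `k = 4` are assumptions and do not necessarily match the example in the figures.

```julia
# Hypothetical helpers: owner of row i under each scheme (processors 1:P).
# block_owner assumes that n is a multiple of P.
block_owner(i, n, P) = cld(i, div(n, P))
cyclic_owner(i, P) = mod1(i, P)

# Count the rows in k:n (the rows touched at iteration k) owned by each processor.
function rows_per_process(owner, n, P, k)
    work = zeros(Int, P)
    for i in k:n
        work[owner(i)] += 1
    end
    work
end

n, P, k = 8, 4, 4
block_work = rows_per_process(i -> block_owner(i, n, P), n, P, k)   # [0, 1, 2, 2]
cyclic_work = rows_per_process(i -> cyclic_owner(i, P), n, P, k)    # [1, 1, 1, 2]
```

With the block partition the first processor is idle at this iteration, whereas with the cyclic partition the row counts differ by at most one.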
{
@ -301,25 +454,43 @@
"</div>"
]
},
{
"cell_type": "markdown",
"id": "866824c6",
"metadata": {},
"source": [
"## Conclusion\n",
"Cyclic partitioning tends to work well in problems with predictable load imbalance. It is a form of **static load balancing** which means using a pre-defined load schedule based on prior information about the algorithm (as opposed to **dynamic load balancing** which can schedule loads more flexibly during runtime). The data dependencies are the same as for the 1d block partitioning.\n",
"\n",
"At the same time, cyclic partitioning is not suitable for all communication patterns. For example, it can lead to a large communication overhead in the parallel Jacobi method, since the computation of each value depends on its neighbouring elements."
]
},
{
"cell_type": "markdown",
"id": "20982b04",
"metadata": {},
"source": [
"## Exercise\n",
"\n",
"- The actual parallel implementation is let as an exercise\n",
"- Implement both 1d block and 1d cyclic partitions and compare performance\n",
"- Closely related with Floyd's algorithm\n",
"- Generate input matrix with function below (a random matrix is not enough, we need a non singular matrix that does not require pivoting)"
"The actual implementation of the parallel algorithm is left as an exercise. Implement both 1d block and 1d cyclic partitioning and compare their performance. The implementation is closely related to that of Floyd's algorithm. To test your algorithms, generate input matrices with the function below. A random matrix is not enough, since we need a nonsingular matrix that does not require pivoting. "
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"id": "a65cf8e6",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"tridiagonal_matrix (generic function with 1 method)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function tridiagonal_matrix(n)\n",
" C = zeros(n,n)\n",
@ -352,7 +523,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.0",
"display_name": "Julia 1.9.1",
"language": "julia",
"name": "julia-1.9"
},
@ -360,7 +531,7 @@
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.0"
"version": "1.9.1"
}
},
"nbformat": 4,

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

Image file changed (size before: 6.2 MiB, after: 6.4 MiB)

View File

@ -39,7 +39,7 @@
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-info\">\n",
"<b>Note:</b> Do not forget to run the next cell before starting studying this notebook. \n",
"<b>Note:</b> Do not forget to run the next cell before you start studying this notebook. \n",
"</div>"
]
},
@ -977,7 +977,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.0",
"display_name": "Julia 1.9.1",
"language": "julia",
"name": "julia-1.9"
},
@ -985,7 +985,7 @@
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.0"
"version": "1.9.1"
}
},
"nbformat": 4,

View File

@ -1175,7 +1175,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.0",
"display_name": "Julia 1.9.1",
"language": "julia",
"name": "julia-1.9"
},
@ -1183,7 +1183,7 @@
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.0"
"version": "1.9.1"
}
},
"nbformat": 4,

View File

@ -1322,7 +1322,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.0",
"display_name": "Julia 1.9.1",
"language": "julia",
"name": "julia-1.9"
},
@ -1330,7 +1330,7 @@
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.0"
"version": "1.9.1"
}
},
"nbformat": 4,

View File

@ -823,7 +823,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.9.0",
"display_name": "Julia 1.9.1",
"language": "julia",
"name": "julia-1.9"
},
@ -831,7 +831,7 @@
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.9.0"
"version": "1.9.1"
}
},
"nbformat": 4,

File diff suppressed because one or more lines are too long