diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index c0b9997..749f819 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-09-23T05:44:47","documenter_version":"1.7.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-09-23T11:58:25","documenter_version":"1.7.0"}}
\ No newline at end of file
diff --git a/dev/LEQ/index.html b/dev/LEQ/index.html
index 27c5e9c..4bae58d 100644
--- a/dev/LEQ/index.html
+++ b/dev/LEQ/index.html
@@ -1,5 +1,5 @@
-
The programming of this course will be done using the Julia programming language. Thus, we start by explaining how to get up and running with Julia. After studying this page, you will be able to:
Courses related to high-performance computing (HPC) often use languages such as C, C++, or Fortran. We use Julia instead to make the course accessible to a wider set of students, including those who have no experience with C/C++ or Fortran but are willing to learn parallel programming. Julia is a relatively new programming language specifically designed for scientific computing. It combines a high-level syntax close to interpreted languages like Python with the performance of compiled languages like C, C++, or Fortran. Thus, Julia will allow us to write efficient parallel algorithms with a syntax that is convenient in a teaching setting. In addition, Julia provides easy access to different programming models for writing distributed algorithms, which will be useful for learning and experimenting with them.
Tip
You can run the code in this link to learn how Julia compares to other languages (C and Python) in terms of performance.
There are several ways of opening Julia depending on your operating system and your IDE, but it is usually as simple as launching the Julia app. With VSCode, open a folder (File > Open Folder). Then, press Ctrl+Shift+P to open the command bar, and execute Julia: Start REPL. If this does not work, make sure you have the Julia extension for VSCode installed. Regardless of the method you use, opening Julia results in a window with some text ending with:
julia>
You have just opened the Julia read-evaluate-print loop, or simply the Julia REPL. Congrats! You will spend most of your time using the REPL when working in Julia. The REPL is a console waiting for user input. Just as in other consoles, the string of text right before the input area (julia> in this case) is called the command prompt or simply the prompt.
Curious about what the function println does? Enter help mode to look into the documentation. This is done by typing a question mark (?) into the input field:
julia> ?
After typing ?, the command prompt changes to help?>. It means we are in help mode. Now, we can type a function name to see its documentation.
The REPL comes with two more modes, namely package and shell modes. To enter package mode type
julia> ]
Package mode is used to install and manage packages. We are going to discuss package mode in greater detail later. To return to normal mode, press the backspace key several times.
To enter shell mode, type a semicolon (;):
julia> ;
The prompt should have changed to shell>, indicating that we are in shell mode. Now you can type commands that you would normally run on your system command line. For instance,
shell> ls
will display the contents of the current folder in Mac or Linux. Using shell mode in Windows is not straightforward, and thus not recommended for beginners.
Real-world Julia programs are not typed in the REPL in practice. They are written in one or more files and included in the REPL. To try this, create a new file called hello.jl, write the code of the "Hello world" example above, and save it. If you are using VSCode, you can create the file using File > New File > Julia File. Once the file is saved with the name hello.jl, execute it as follows
julia> include("hello.jl")
Warning
Make sure that the file "hello.jl" is located in the current working directory of your Julia session. You can query the current directory with the function pwd(). You can change to another directory with the function cd() if needed. Also, make sure that the file extension is .jl.
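The checks described in this warning can also be done from code. The following is a minimal sketch and not part of the original tutorial: the helper name include_if_present and its dir keyword are hypothetical, while pwd, joinpath, isfile, and include are standard Julia functions.

```julia
# Hypothetical helper: include a script only if it exists in the given folder.
function include_if_present(filename; dir = pwd())
    path = joinpath(dir, filename)
    if isfile(path)
        include(path)   # runs the file in the current session
        return true
    else
        @warn "File not found" path
        return false
    end
end

# Looks for hello.jl in the current working directory.
include_if_present("hello.jl")
```

If the call returns false, change to the folder that contains the file with cd and try again.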
The recommended way of running Julia code is using the REPL as we did. But it is also possible to run code directly from the system command line. To this end, open a terminal and call Julia followed by the path to the file containing the code you want to execute.
$ julia hello.jl
The previous line assumes that you have Julia properly installed in the system and that it's usable from the terminal. In UNIX systems (Linux and Mac), the Julia binary needs to be in one of the directories listed in the PATH environment variable. To check that Julia is properly installed, you can use
$ julia --version
If this runs without error and you see a version number, you are good to go!
You can also run Julia code from the terminal using the -e flag:
$ julia -e 'println("Hello, world!")'
Note
In this tutorial, when a code snippet starts with $, it should be run in the terminal. Otherwise, the code is to be run in the Julia REPL.
Tip
Avoid calling Julia code from the terminal; use the Julia REPL instead! Each time you call Julia from the terminal, you start a fresh Julia session and Julia will need to compile your code from scratch. This can be time-consuming for large projects. In contrast, if you execute code in the REPL, Julia will compile code incrementally, which is much faster. Running code in a cluster (like in DAS-5 for the Julia assignment) is among the few situations in which you need to run Julia code from the terminal. Visit this link (Julia workflow tips) from the official Julia documentation for further information about how to develop Julia code effectively.
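To get a feeling for the compilation cost mentioned in this tip, one can time the same call twice in one session. This is only an illustrative sketch: the function sum_of_squares and the variable names are ours, and the measured times depend on the machine.

```julia
# A small function to time; any function would do.
sum_of_squares(x) = sum(abs2, x)

x = rand(1000)

# First call in a fresh session: the measured time includes compilation.
t_first = @elapsed sum_of_squares(x)

# Second call in the same session: the compiled code is reused.
t_second = @elapsed sum_of_squares(x)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

Running this as a script from the terminal pays the cost of the first call on every launch, whereas a REPL session pays it once.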
Since we are in a parallel computing course, let's run a parallel "Hello world" example in Julia. Open a Julia REPL and write
julia> using Distributed
julia> @everywhere println("Hello, world! I am proc $(myid()) from $(nprocs())")
Here, we are using the Distributed package, a part of the Julia standard library that provides support for distributed-memory parallelism. The code prints the process id and the number of processes in the current Julia session.
You will probably only see output from 1 process. We need to add more processes to run the example in parallel. This is done with the addprocs function.
julia> addprocs(3)
We have added 3 new processes. Plus the old one, we have 4 processes. Run the code again.
julia> @everywhere println("Hello, world! I am proc $(myid()) from $(nprocs())")
Now, you should see output from 4 processes.
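The set of processes can also be inspected from code. The sketch below is one possible way to do it, using procs, workers, and remotecall_fetch from Distributed; the variable name ids is ours, and the guard on nprocs() is only there so that running the snippet twice does not keep adding processes.

```julia
using Distributed

# Add worker processes only if the session does not have any yet.
if nprocs() == 1
    addprocs(3)
end

# Ask every process (including the main one) for its id.
ids = [remotecall_fetch(myid, p) for p in procs()]

println("Process ids: ", ids)
println("Workers:     ", workers())
```

Here procs() lists all processes in the session, while workers() leaves out the main process when workers are available.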
It is possible to specify the number of processes when starting Julia from the terminal with the -p argument (useful, e.g., when running in a cluster). If you launch Julia from the terminal as
$ julia -p 3
and then run
julia> @everywhere println("Hello, world! I am proc $(myid()) from $(nprocs())")
You should get output from 4 processes as before.
One of the most useful features of Julia is its package manager. It allows one to install Julia packages in a straightforward and platform-independent way. To illustrate this, let us consider the following parallel "Hello world" example. This example uses the Message Passing Interface (MPI). We will learn more about MPI later in the course.
Copy the following block of code into a new file named "hello_mpi.jl"
Copy the contents of the previous code block into a file called Project.toml and place it in an empty folder named newproject. It is important that the file is named Project.toml. You can create a new folder from the REPL with
julia> mkdir("newproject")
To install all the packages registered in this file, you need to activate the folder containing your Project.toml file
(@v1.10) pkg> activate newproject
and then instantiate it
(newproject) pkg> instantiate
The instantiate command will download and install all listed packages and their dependencies in a single step.
In some situations it is required to use package commands in Julia code, e.g., to automate the installation and deployment of Julia applications. This can be done using the Pkg package. For instance
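A minimal sketch of such a script is given next. It assumes a folder called newproject as in the package-mode example; Pkg.activate, Pkg.status, and Pkg.instantiate are functions of the Pkg standard library, and Base.active_project returns the path of the project file currently in use. The call to Pkg.instantiate is left commented out because it may need internet access.

```julia
using Pkg

# Same effect as `pkg> activate newproject` in package mode.
Pkg.activate("newproject")

# Path of the Project.toml that package commands now act on.
println(Base.active_project())

# Same effect as `pkg> status`: list the packages of the active environment.
Pkg.status()

# Same effect as `pkg> instantiate` (may download packages, so it is not run here).
# Pkg.instantiate()
```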
In many situations, it is useful to create your own package, for instance, when working with a large code base, when you want to reduce compilation latency using Revise.jl, or if you want to eventually register your package and share it with others.
The simplest way of generating a package (called MyPackage) is as follows. Open Julia, go to package mode, and type
(@v1.10) pkg> generate MyPackage
This will create a minimal package consisting of a new folder MyPackage with two files:
MyPackage/Project.toml: Project file defining the direct dependencies of your package.
MyPackage/src/MyPackage.jl: Main source file of your package. You can split your code into several files if needed, and include them in the package main file using the function include.
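To illustrate how a main file can pull in other source files with include, the following sketch builds a package-like folder layout in a temporary directory and loads it. This is only an assumption-laden illustration: the extra file utils.jl and the function double are hypothetical, the greet definition is a stand-in for the example function, and the layout is written by hand here rather than by the generate command.

```julia
# Build a hypothetical MyPackage/src layout in a temporary folder.
dir = mktempdir()
src = mkpath(joinpath(dir, "MyPackage", "src"))

# Hypothetical extra source file with one helper function.
write(joinpath(src, "utils.jl"), "double(x) = 2x\n")

# Main file of the package: it includes utils.jl, found relative to this file.
write(joinpath(src, "MyPackage.jl"), """
module MyPackage
include("utils.jl")
greet() = print("Hello World!")
end
""")

# Load the module directly from its main file, just for this demonstration.
include(joinpath(src, "MyPackage.jl"))

println(MyPackage.double(21))
```

In a real package one would not call include on the main file by hand; the package is loaded with using once it has been added to an environment.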
Tip
This approach only generates a very minimal package. To create a more sophisticated package skeleton (including unit testing, code coverage, readme file, licence, etc.) use PkgTemplates.jl or BestieTemplate.jl. The latter is developed in Amsterdam at the Netherlands eScience Center.
You can add dependencies to the package by activating the MyPackage folder in package mode and adding new dependencies as always:
To use your package you first need to add it to a package environment of your choice. This is done by changing to package mode and typing develop followed by the path to the folder containing the package. For instance:
(@v1.10) pkg> develop MyPackage
Note
You do not need to "develop" your package if you activated the package folder MyPackage.
Now, we can go back to standard Julia mode and use it as any other package:
using MyPackage
-MyPackage.greet()
Here, we just called the example function defined in MyPackage/src/MyPackage.jl.
We have learned the basics of how to work with Julia, including how to run serial and parallel code, and how to manage, create, and use Julia packages. This knowledge will allow you to follow the course effectively! If you want to dig further into the topics we have covered here, you can take a look at the following links:
This page contains part of the course material of the Programming Large-Scale Parallel Systems course at VU Amsterdam. We provide several lecture notes in Jupyter notebook format, which will help you to learn how to design, analyze, and program parallel algorithms on multi-node computing systems. Further information about the course is found in the study guide (click here) and our Canvas page (for registered students).
Note
Material will be added incrementally to the website as the course advances.
Warning
This page will eventually contain only a part of the course material. The rest will be available on Canvas. In particular, the material in this public webpage does not fully cover all topics in the final exam.
Download the notebooks and run them locally on your computer (recommended). On each notebook page you will find a green box with links to download the notebook.
You also have the static version of the notebooks displayed in this webpage for quick reference.
This page was created with the support of the Faculty of Science of Vrije Universiteit Amsterdam in the framework of the project "Interactive lecture notes and exercises for the Programming Large-Scale Parallel Systems course" funded by the "Innovation budget BETA 2023 Studievoorschotmiddelen (SVM) towards Activated Blended Learning".
Settings
This document was generated with Documenter.jl version 1.7.0 on Monday 23 September 2024. Using Julia version 1.10.5.
+julia> notebook()
These commands will open Jupyter in your web browser. In Jupyter, navigate to the notebook file you have downloaded and open it.
+
diff --git a/dev/search_index.js b/dev/search_index.js
index 93fc0b2..149ec2e 100644
--- a/dev/search_index.js
+++ b/dev/search_index.js
@@ -1,3 +1,3 @@
var documenterSearchIndex = {"docs":
-[{"location":"getting_started_with_julia/#Getting-started","page":"Getting started","title":"Getting started","text":"","category":"section"},{"location":"getting_started_with_julia/#Introduction","page":"Getting started","title":"Introduction","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The programming of this course will be done using the Julia programming language. Thus, we start by explaining how to get up and running with Julia. After studying this page, you will be able to:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Use the Julia REPL,\nRun serial and parallel code,\nInstall and manage Julia packages.","category":"page"},{"location":"getting_started_with_julia/#Why-Julia?","page":"Getting started","title":"Why Julia?","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Courses related with high-performance computing (HPC) often use languages such as C, C++, or Fortran. We use Julia instead to make the course accessible to a wider set of students, including the ones that have no experience with C/C++ or Fortran, but are willing to learn parallel programming. Julia is a relatively new programming language specifically designed for scientific computing. It combines a high-level syntax close to interpreted languages like Python with the performance of compiled languages like C, C++, or Fortran. Thus, Julia will allow us to write efficient parallel algorithms with a syntax that is convenient in a teaching setting. 
In addition, Julia provides easy access to different programming models to write distributed algorithms, which will be useful to learn and experiment with them.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nYou can run the code in this link to learn how Julia compares to other languages (C and Python) in terms of performance.","category":"page"},{"location":"getting_started_with_julia/#Installing-Julia","page":"Getting started","title":"Installing Julia","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"This is a tutorial-like page. Follow these steps before you continue reading the document.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Download and install Julia from julialang.org;\nFollow the specific instructions for your operating system: Windows, MacOS, or Linux\nDownload and install VSCode and its Julia extension;","category":"page"},{"location":"getting_started_with_julia/#The-Julia-REPL","page":"Getting started","title":"The Julia REPL","text":"","category":"section"},{"location":"getting_started_with_julia/#Starting-Julia","page":"Getting started","title":"Starting Julia","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"There are several ways of opening Julia depending on your operating system and your IDE, but it is usually as simple as launching the Julia app. With VSCode, open a folder (File > Open Folder). Then, press Ctrl+Shift+P to open the command bar, and execute Julia: Start REPL. If this does not work, make sure you have the Julia extension for VSCode installed. 
Independently of the method you use, opening Julia results in a window with some text ending with:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia>","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You have just opened the Julia read-evaluate-print loop, or simply the Julia REPL. Congrats! You will spend most of time using the REPL, when working in Julia. The REPL is a console waiting for user input. Just as in other consoles, the string of text right before the input area (julia> in the case) is called the command prompt or simply the prompt.","category":"page"},{"location":"getting_started_with_julia/#Basic-usage","page":"Getting started","title":"Basic usage","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The usage of the REPL is as follows:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You write some input\npress enter\nyou get the output","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"For instance, try this","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> 1 + 1","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"A \"Hello world\" example looks like this in Julia","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> println(\"Hello, world!\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Try to run it in the 
REPL.","category":"page"},{"location":"getting_started_with_julia/#Help-mode","page":"Getting started","title":"Help mode","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Curious about what the function println does? Enter into help mode to look into the documentation. This is done by typing a question mark (?) into the input field:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ?","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"After typing ?, the command prompt changes to help?>. It means we are in help mode. Now, we can type a function name to see its documentation.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"help?> println","category":"page"},{"location":"getting_started_with_julia/#Package-and-shell-modes","page":"Getting started","title":"Package and shell modes","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The REPL comes with two more modes, namely package and shell modes. To enter package mode type","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ]","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Package mode is used to install and manage packages. We are going to discuss the package mode in greater detail later. 
To return back to normal mode press the backspace key several times.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To enter shell mode type semicolon (;)","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ;","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The prompt should have changed to shell> indicating that we are in shell mode. Now you can type commands that you would normally do on your system command line. For instance,","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"shell> ls","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"will display the contents of the current folder in Mac or Linux. Using shell mode in Windows is not straightforward, and thus not recommended for beginners.","category":"page"},{"location":"getting_started_with_julia/#Running-Julia-code","page":"Getting started","title":"Running Julia code","text":"","category":"section"},{"location":"getting_started_with_julia/#Running-more-complex-code","page":"Getting started","title":"Running more complex code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Real-world Julia programs are not typed in the REPL in practice. They are written in one or more files and included in the REPL. To try this, create a new file called hello.jl, write the code of the \"Hello world\" example above, and save it. If you are using VSCode, you can create the file using File > New File > Julia File. 
Once the file is saved with the name hello.jl, execute it as follows","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> include(\"hello.jl\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"warning: Warning\nMake sure that the file \"hello.jl\" is located in the current working directory of your Julia session. You can query the current directory with function pwd(). You can change to another directory with function cd() if needed. Also, make sure that the file extension is .jl.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The recommended way of running Julia code is using the REPL as we did. But it is also possible to run code directly from the system command line. To this end, open a terminal and call Julia followed by the path to the file containing the code you want to execute.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia hello.jl","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The previous line assumes that you have Julia properly installed in the system and that it's usable from the terminal. In UNIX systems (Linux and Mac), the Julia binary needs to be in one of the directories listed in the PATH environment variable. 
To check that Julia is properly installed, you can use","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia --version","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"If this runs without error and you see a version number, you are good to go!","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can also run julia code from the terminal using the -e flag:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia -e 'println(\"Hello, world!\")'","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nIn this tutorial, when a code snipped starts with $, it should be run in the terminal. Otherwise, the code is to be run in the Julia REPL.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nAvoid calling Julia code from the terminal, use the Julia REPL instead! Each time you call Julia from the terminal, you start a fresh Julia session and Julia will need to compile your code from scratch. This can be time consuming for large projects. In contrast, if you execute code in the REPL, Julia will compile code incrementally, which is much faster. Running code in a cluster (like in DAS-5 for the Julia assignment) is among the few situations you need to run Julia code from the terminal. 
Visit this link (Julia workflow tips) from the official Julia documentation for further information about how to develop Julia code effectivelly.","category":"page"},{"location":"getting_started_with_julia/#Running-parallel-code","page":"Getting started","title":"Running parallel code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Since we are in a parallel computing course, let's run a parallel \"Hello world\" example in Julia. Open a Julia REPL and write","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using Distributed\njulia> @everywhere println(\"Hello, world! I am proc $(myid()) from $(nprocs())\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Here, we are using the Distributed package, which is part of the Julia standard library that provides distributed memory parallel support. The code prints the process id and the number of processes in the current Julia session.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You will probably only see output from 1 process. We need to add more processes to run the example in parallel. This is done with the addprocs function.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> addprocs(3)","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"We have added 3 new processes. Plus the old one, we have 4 processes. Run the code again.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> @everywhere println(\"Hello, world! 
I am proc $(myid()) from $(nprocs())\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, you should see output from 4 processes.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"It is possible to specify the number of processes when starting Julia from the terminal with the -p argument (useful, e.g., when running in a cluster). If you launch Julia from the terminal as","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia -p 3","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"and then run","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> @everywhere println(\"Hello, world! I am proc $(myid()) from $(nprocs())\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You should get output from 4 processes as before.","category":"page"},{"location":"getting_started_with_julia/#Installing-packages","page":"Getting started","title":"Installing packages","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"One of the most useful features of Julia is its package manager. It allows one to install Julia packages in a straightforward and platform independent way. To illustrate this, let us consider the following parallel \"Hello world\" example. This example uses the Message Passing Interface (MPI). 
We will learn more about MPI later in the course.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Copy the following block of code into a new file named \"hello_mpi.jl\"","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"# file hello_mpi.jl\nusing MPI\nMPI.Init()\ncomm = MPI.COMM_WORLD\nrank = MPI.Comm_rank(comm)\nnranks = MPI.Comm_size(comm)\nprintln(\"Hello world, I am rank $rank of $nranks\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"As you can see from this example, one can access MPI from Julia in a clean way, without type annotations and other complexities of C/C++ code.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, run the file from the REPL","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> include(\"hello_mpi.jl\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"It probably didn't work, right? Read the error message and note that the MPI package needs to be installed to run this code.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To install a package, we need to enter package mode. Remember that we entered into help mode by typing ?. Package mode is activated by typing ] : ","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ]","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"At this point, the prompt should have changed to (@v1.10) pkg> indicating that we are in package mode. 
The text between the parentheses indicates the active project, i.e., where packages are going to be installed. In this case, we are working with the global project associated with our Julia installation (which is Julia 1.10 in this example, but it can be another version in your case).","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To install the MPI package, type","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> add MPI","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Congrats, you have installed MPI!","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nMany Julia package names end with .jl. This is just a way of signaling that a package is written in Julia. When using such packages, the .jl needs to be omitted. In this case, we have installed the MPI.jl package even though we have only typed MPI in the REPL.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nThe package you have installed is the Julia interface to MPI, called MPI.jl. Note that it is not an MPI library by itself. It is just a thin wrapper between MPI and Julia. To use this interface, you need an actual MPI library installed on your system such as OpenMPI or MPICH. Julia downloads and installs an MPI library for you, but it is also possible to use an MPI library already available on your system. This is useful, e.g., when running on HPC clusters. 
See the documentation of MPI.jl for further details.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To check that the package was installed properly, exit package mode by pressing the backspace key several times, and run it again","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> include(\"hello_mpi.jl\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, it should work, but you probably get output from a single MPI rank only.","category":"page"},{"location":"getting_started_with_julia/#Running-MPI-code","page":"Getting started","title":"Running MPI code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To run MPI applications in parallel, you need a launcher like mpiexec. MPI codes written in Julia are not an exception to this rule. From the system terminal, you can run","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ mpiexec -np 4 julia hello_mpi.jl","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"But it will probably not work since the version of mpiexec needs to match with the MPI version we are using from Julia. Don't worry if you could not make it work! 
A more elegant way to run MPI code is from the Julia REPL directly, by using these commands:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using MPI\njulia> run(`$(mpiexec()) -np 4 julia hello_mpi.jl`)","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, you should see output from 4 ranks.","category":"page"},{"location":"getting_started_with_julia/#Package-manager","page":"Getting started","title":"Package manager","text":"","category":"section"},{"location":"getting_started_with_julia/#Installing-packages-locally","page":"Getting started","title":"Installing packages locally","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"We have installed the MPI package globally and it will be available in all Julia sessions. However, in some situations, we want to work with different versions of the same package or to install packages in an isolated way to avoid potential conflicts with other packages. This can be done by using local projects.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"A project is simply a folder in your file system. To use a particular folder as your project, you need to activate it. This is done by entering package mode and using the activate command followed by the path to the folder you want to activate.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> activate .","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The previous command will activate the current working directory. Note that the dot . 
is indeed the path to the current folder.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The prompt has changed to (lessons) pkg> indicating that we are in the project within the lessons folder. The particular folder name can be different in your case.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nYou can activate a project directly when opening Julia from the terminal using the --project flag. The command $ julia --project=. will open Julia and activate a project in the current directory. You can also achieve the same effect by setting the environment variable JULIA_PROJECT with the path of the folder you want to activate.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nThe active project folder and the current working directory are two independent concepts! For instance, (@v1.10) pkg> activate folderB and then julia> cd(\"folderA\"), will activate the project in folderB and change the current working directory to folderA.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"At this point all package-related operations will be local to the new project. 
For instance, install the DataFrames package.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(lessons) pkg> add DataFrames","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Use the package to check that it is installed","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using DataFrames\njulia> DataFrame(a=[1,2],b=[3,4])","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, we can return to the global project to check that DataFrames has not been installed there. To return to the global environment, use activate without a folder name.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(lessons) pkg> activate","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The prompt is again (@v1.10) pkg>","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, try to use DataFrames.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using DataFrames\njulia> DataFrame(a=[1,2],b=[3,4])","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You should get an error or a warning unless you already had DataFrames installed globally.","category":"page"},{"location":"getting_started_with_julia/#Project-and-Manifest-files","page":"Getting started","title":"Project and Manifest files","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The information about a project is 
stored in two files, Project.toml and Manifest.toml.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Project.toml contains the packages explicitly installed (the direct dependencies)\nManifest.toml contains direct and indirect dependencies along with the concrete version of each package.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"In other words, Project.toml contains the packages relevant for the user, whereas Manifest.toml is the detailed snapshot of all dependencies. The Manifest.toml can be used to reproduce the same environment on another machine.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can see the path to the current Project.toml file by using the status command (or st in its short form) while in package mode","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> status","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The information about the Manifest.toml can be inspected by passing the -m flag.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> status -m","category":"page"},{"location":"getting_started_with_julia/#Installing-packages-from-a-project-file","page":"Getting started","title":"Installing packages from a project file","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Project files can be used to install lists of packages defined by others. 
E.g., to install all the dependencies of a Julia application.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Assume that a colleague has sent you a Project.toml file with this content:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"[deps]\nBenchmarkTools = \"6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf\"\nDataFrames = \"a93c6f00-e57d-5684-b7b6-d8193f3e46c0\"\nMPI = \"da04e1cc-30fd-572f-bb4f-1f8673147195\"","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Copy the contents of the previous code block into a file called Project.toml and place it in an empty folder named newproject. It is important that the file is named Project.toml. You can create a new folder from the REPL with","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> mkdir(\"newproject\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To install all the packages registered in this file you need to activate the folder containing your Project.toml file","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> activate newproject","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"and then instantiate it","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(newproject) pkg> instantiate","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The instantiate command will download and install all listed packages and their dependencies in just one 
click.","category":"page"},{"location":"getting_started_with_julia/#Getting-help-in-package-mode","page":"Getting started","title":"Getting help in package mode","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can get help about a particular package command by writing help in front of it","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> help activate","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can get an overview of all package commands by typing help alone","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> help","category":"page"},{"location":"getting_started_with_julia/#Package-operations-in-Julia-code","page":"Getting started","title":"Package operations in Julia code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"In some situations, you need to use package commands in Julia code, e.g., to automate installation and deployment of Julia applications. This can be done using the Pkg package. 
For instance","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using Pkg\njulia> Pkg.status()","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"is equivalent to calling status in package mode.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> status","category":"page"},{"location":"getting_started_with_julia/#Creating-you-own-package","page":"Getting started","title":"Creating your own package","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"In many situations, it is useful to create your own package, for instance, when working with a large code base, when you want to reduce compilation latency using Revise.jl, or if you want to eventually register your package and share it with others.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The simplest way of generating a package (called MyPackage) is as follows. Open Julia, go to package mode, and type","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> generate MyPackage","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"This will create a minimal package consisting of a new folder MyPackage with two files:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"MyPackage/Project.toml: Project file defining the direct dependencies of your package.\nMyPackage/src/MyPackage.jl: Main source file of your package. 
You can split your code into several files if needed, and include them in the package main file using the include function.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nThis approach only generates a very minimal package. To create a more sophisticated package skeleton (including unit testing, code coverage, readme file, licence, etc.) use PkgTemplates.jl or BestieTemplate.jl. The latter is developed in Amsterdam at the Netherlands eScience Center.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can add dependencies to the package by activating the MyPackage folder in package mode and adding new dependencies as usual:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> activate MyPackage\n(MyPackage) pkg> add MPI","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"This will add MPI to your package dependencies.","category":"page"},{"location":"getting_started_with_julia/#Using-your-own-package","page":"Getting started","title":"Using your own package","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To use your package you first need to add it to a package environment of your choice. This is done by changing to package mode and typing develop followed by the path to the folder containing the package. 
For instance:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> develop MyPackage","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nYou do not need to \"develop\" your package if you activated the package folder MyPackage.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, we can go back to standard Julia mode and use it as any other package:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"using MyPackage\nMyPackage.greet()","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Here, we just called the example function defined in MyPackage/src/MyPackage.jl.","category":"page"},{"location":"getting_started_with_julia/#Conclusion","page":"Getting started","title":"Conclusion","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"We have learned the basics of how to work with Julia, including how to run serial and parallel code, and how to manage, create, and use Julia packages. This knowledge will allow you to follow the course effectively! If you want to further dig into the topics we have covered here, you can take a look at the following links:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Julia Manual\nPackage manager","category":"page"},{"location":"tsp/","page":"-","title":"-","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/tsp.ipynb\"","category":"page"},{"location":"tsp/","page":"-","title":"-","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"jacobi_2D/","page":"-","title":"-","text":"\n","category":"page"},{"location":"julia_async/","page":"Asynchronous programming in Julia","title":"Asynchronous programming in Julia","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/julia_async.ipynb\"","category":"page"},{"location":"julia_async/","page":"Asynchronous programming in Julia","title":"Asynchronous programming in Julia","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"julia_async/","page":"Asynchronous programming in Julia","title":"Asynchronous programming in Julia","text":"\n","category":"page"},{"location":"solutions/","page":"-","title":"-","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/solutions.ipynb\"","category":"page"},{"location":"solutions/","page":"-","title":"-","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"LEQ/","page":"Gaussian elimination","title":"Gaussian elimination","text":"\n","category":"page"},{"location":"julia_distributed/","page":"Distributed computing in Julia","title":"Distributed computing in Julia","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/julia_distributed.ipynb\"","category":"page"},{"location":"julia_distributed/","page":"Distributed computing in Julia","title":"Distributed computing in Julia","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"julia_distributed/","page":"Distributed computing in Julia","title":"Distributed computing in Julia","text":"\n","category":"page"},{"location":"julia_jacobi/","page":"-","title":"-","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/julia_jacobi.ipynb\"","category":"page"},{"location":"julia_jacobi/","page":"-","title":"-","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"matrix_matrix/","page":"Matrix-matrix multiplication","title":"Matrix-matrix multiplication","text":"\n","category":"page"},{"location":"asp/","page":"All pairs of shortest paths","title":"All pairs of shortest paths","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/asp.ipynb\"","category":"page"},{"location":"asp/","page":"All pairs of shortest paths","title":"All pairs of shortest paths","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"asp/","page":"All pairs of shortest paths","title":"All pairs of shortest paths","text":"\n","category":"page"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = XM_40017","category":"page"},{"location":"#Programming-Large-Scale-Parallel-Systems-(XM_40017)","page":"Home","title":"Programming Large-Scale Parallel Systems (XM_40017)","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Welcome to the interactive lecture notes of the Programming Large-Scale Parallel Systems course at VU Amsterdam!","category":"page"},{"location":"#What","page":"Home","title":"What","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This page contains part of the course material of the Programming Large-Scale Parallel Systems course at VU Amsterdam. We provide several lecture notes in jupyter notebook format, which will help you to learn how to design, analyze, and program parallel algorithms on multi-node computing systems. Further information about the course is found in the study guide (click here) and our Canvas page (for registered students).","category":"page"},{"location":"","page":"Home","title":"Home","text":"note: Note\nMaterial will be added incrementally to the website as the course advances.","category":"page"},{"location":"","page":"Home","title":"Home","text":"warning: Warning\nThis page will eventually contain only a part of the course material. The rest will be available on Canvas. 
In particular, the material in this public webpage does not fully cover all topics in the final exam.","category":"page"},{"location":"#How-to-use-this-page","page":"Home","title":"How to use this page","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"You have two main ways of studying the notebooks:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Download the notebooks and run them locally on your computer (recommended). On each notebook page you will find a green box with links to download the notebook.\nYou also have the static version of the notebooks displayed in this webpage for quick reference.","category":"page"},{"location":"#How-to-run-the-notebooks-locally","page":"Home","title":"How to run the notebooks locally","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"To run a notebook locally follow these steps:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Install Julia (if not done already). More information in Getting started.\nDownload the notebook.\nLaunch Julia. More information in Getting started.\nExecute these commands in the Julia command line:","category":"page"},{"location":"","page":"Home","title":"Home","text":"julia> using Pkg\njulia> Pkg.add(\"IJulia\")\njulia> using IJulia\njulia> notebook()","category":"page"},{"location":"","page":"Home","title":"Home","text":"These commands will open Jupyter in your web browser. In Jupyter, navigate to the notebook file you have downloaded and open it.","category":"page"},{"location":"#Authors","page":"Home","title":"Authors","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This material is created by Francesc Verdugo with the help of Gelieza Kötterheinrich. 
Some of the notebooks are based on the course slides by Henri Bal.","category":"page"},{"location":"#License","page":"Home","title":"License","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"All material on this page that is original to this course may be used under a CC BY 4.0 license.","category":"page"},{"location":"#Acknowledgment","page":"Home","title":"Acknowledgment","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This page was created with the support of the Faculty of Science of Vrije Universiteit Amsterdam in the framework of the project \"Interactive lecture notes and exercises for the Programming Large-Scale Parallel Systems course\" funded by the \"Innovation budget BETA 2023 Studievoorschotmiddelen (SVM) towards Activated Blended Learning\".","category":"page"}]
+[{"location":"getting_started_with_julia/#Getting-started","page":"Getting started","title":"Getting started","text":"","category":"section"},{"location":"getting_started_with_julia/#Introduction","page":"Getting started","title":"Introduction","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The programming of this course will be done using the Julia programming language. Thus, we start by explaining how to get up and running with Julia. After studying this page, you will be able to:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Use the Julia REPL,\nRun serial and parallel code,\nInstall and manage Julia packages.","category":"page"},{"location":"getting_started_with_julia/#Why-Julia?","page":"Getting started","title":"Why Julia?","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Courses related to high-performance computing (HPC) often use languages such as C, C++, or Fortran. We use Julia instead to make the course accessible to a wider set of students, including those who have no experience with C/C++ or Fortran, but are willing to learn parallel programming. Julia is a relatively new programming language specifically designed for scientific computing. It combines a high-level syntax close to interpreted languages like Python with the performance of compiled languages like C, C++, or Fortran. Thus, Julia will allow us to write efficient parallel algorithms with a syntax that is convenient in a teaching setting. 
In addition, Julia provides easy access to different programming models to write distributed algorithms, which will be useful to learn and experiment with them.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nYou can run the code in this link to learn how Julia compares to other languages (C and Python) in terms of performance.","category":"page"},{"location":"getting_started_with_julia/#Installing-Julia","page":"Getting started","title":"Installing Julia","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"This is a tutorial-like page. Follow these steps before you continue reading the document.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Download and install Julia from julialang.org;\nFollow the specific instructions for your operating system: Windows, MacOS, or Linux\nDownload and install VSCode and its Julia extension;","category":"page"},{"location":"getting_started_with_julia/#The-Julia-REPL","page":"Getting started","title":"The Julia REPL","text":"","category":"section"},{"location":"getting_started_with_julia/#Starting-Julia","page":"Getting started","title":"Starting Julia","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"There are several ways of opening Julia depending on your operating system and your IDE, but it is usually as simple as launching the Julia app. With VSCode, open a folder (File > Open Folder). Then, press Ctrl+Shift+P to open the command bar, and execute Julia: Start REPL. If this does not work, make sure you have the Julia extension for VSCode installed. 
Regardless of the method you use, opening Julia results in a window with some text ending with:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia>","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You have just opened the Julia read-evaluate-print loop, or simply the Julia REPL. Congrats! You will spend most of your time in the REPL when working in Julia. The REPL is a console waiting for user input. Just as in other consoles, the string of text right before the input area (julia> in this case) is called the command prompt or simply the prompt.","category":"page"},{"location":"getting_started_with_julia/#Basic-usage","page":"Getting started","title":"Basic usage","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The usage of the REPL is as follows:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You write some input\npress enter\nyou get the output","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"For instance, try this","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> 1 + 1","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"A \"Hello world\" example looks like this in Julia","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> println(\"Hello, world!\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Try to run it in the 
REPL.","category":"page"},{"location":"getting_started_with_julia/#Help-mode","page":"Getting started","title":"Help mode","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Curious about what the function println does? Enter into help mode to look into the documentation. This is done by typing a question mark (?) into the input field:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ?","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"After typing ?, the command prompt changes to help?>. It means we are in help mode. Now, we can type a function name to see its documentation.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"help?> println","category":"page"},{"location":"getting_started_with_julia/#Package-and-shell-modes","page":"Getting started","title":"Package and shell modes","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The REPL comes with two more modes, namely package and shell modes. To enter package mode type","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ]","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Package mode is used to install and manage packages. We are going to discuss the package mode in greater detail later. 
To return to normal mode, press the backspace key several times.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To enter shell mode, type a semicolon (;)","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ;","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The prompt should have changed to shell> indicating that we are in shell mode. Now you can type commands that you would normally run on your system command line. For instance,","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"shell> ls","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"will display the contents of the current folder on Mac or Linux. Using shell mode on Windows is not straightforward, and thus not recommended for beginners.","category":"page"},{"location":"getting_started_with_julia/#Running-Julia-code","page":"Getting started","title":"Running Julia code","text":"","category":"section"},{"location":"getting_started_with_julia/#Running-more-complex-code","page":"Getting started","title":"Running more complex code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Real-world Julia programs are not typed in the REPL in practice. They are written in one or more files and included in the REPL. To try this, create a new file called hello.jl, write the code of the \"Hello world\" example above, and save it. If you are using VSCode, you can create the file using File > New File > Julia File. 
Once the file is saved with the name hello.jl, execute it as follows","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> include(\"hello.jl\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"warning: Warning\nMake sure that the file \"hello.jl\" is located in the current working directory of your Julia session. You can query the current directory with function pwd(). You can change to another directory with function cd() if needed. Also, make sure that the file extension is .jl.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The recommended way of running Julia code is using the REPL as we did. But it is also possible to run code directly from the system command line. To this end, open a terminal and call Julia followed by the path to the file containing the code you want to execute.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia hello.jl","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The previous line assumes that you have Julia properly installed in the system and that it's usable from the terminal. In UNIX systems (Linux and Mac), the Julia binary needs to be in one of the directories listed in the PATH environment variable. 
To check that Julia is properly installed, you can use","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia --version","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"If this runs without error and you see a version number, you are good to go!","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can also run Julia code from the terminal using the -e flag:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia -e 'println(\"Hello, world!\")'","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nIn this tutorial, when a code snippet starts with $, it should be run in the terminal. Otherwise, the code is to be run in the Julia REPL.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nAvoid running Julia code from the terminal; use the Julia REPL instead! Each time you call Julia from the terminal, you start a fresh Julia session and Julia will need to compile your code from scratch. This can be time-consuming for large projects. In contrast, if you execute code in the REPL, Julia will compile code incrementally, which is much faster. Running code in a cluster (like in DAS-5 for the Julia assignment) is among the few situations in which you need to run Julia code from the terminal. 
Visit this link (Julia workflow tips) from the official Julia documentation for further information about how to develop Julia code effectively.","category":"page"},{"location":"getting_started_with_julia/#Running-parallel-code","page":"Getting started","title":"Running parallel code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Since we are in a parallel computing course, let's run a parallel \"Hello world\" example in Julia. Open a Julia REPL and write","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using Distributed\njulia> @everywhere println(\"Hello, world! I am proc $(myid()) from $(nprocs())\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Here, we are using the Distributed package, which is part of the Julia standard library that provides distributed memory parallel support. The code prints the process id and the number of processes in the current Julia session.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You will probably only see output from 1 process. We need to add more processes to run the example in parallel. This is done with the addprocs function.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> addprocs(3)","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"We have added 3 new processes. Plus the old one, we have 4 processes. Run the code again.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> @everywhere println(\"Hello, world! 
I am proc $(myid()) from $(nprocs())\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, you should see output from 4 processes.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"It is possible to specify the number of processes when starting Julia from the terminal with the -p argument (useful, e.g., when running in a cluster). If you launch Julia from the terminal as","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ julia -p 3","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"and then run","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> @everywhere println(\"Hello, world! I am proc $(myid()) from $(nprocs())\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You should get output from 4 processes as before.","category":"page"},{"location":"getting_started_with_julia/#Installing-packages","page":"Getting started","title":"Installing packages","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"One of the most useful features of Julia is its package manager. It allows one to install Julia packages in a straightforward and platform independent way. To illustrate this, let us consider the following parallel \"Hello world\" example. This example uses the Message Passing Interface (MPI). 
We will learn more about MPI later in the course.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Copy the following block of code into a new file named \"hello_mpi.jl\"","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"# file hello_mpi.jl\nusing MPI\nMPI.Init()\ncomm = MPI.COMM_WORLD\nrank = MPI.Comm_rank(comm)\nnranks = MPI.Comm_size(comm)\nprintln(\"Hello world, I am rank $rank of $nranks\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"As you can see from this example, one can access MPI from Julia in a clean way, without type annotations and other complexities of C/C++ code.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, run the file from the REPL","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> include(\"hello_mpi.jl\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"It probably didn't work, right? Read the error message and note that the MPI package needs to be installed to run this code.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To install a package, we need to enter package mode. Remember that we entered into help mode by typing ?. Package mode is activated by typing ] : ","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> ]","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"At this point, the prompt should have changed to (@v1.10) pkg> indicating that we are in package mode. 
The text between the parentheses indicates the active project, i.e., where packages are going to be installed. In this case, we are working with the global project associated with our Julia installation (which is Julia 1.10 in this example, but it can be another version in your case).","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To install the MPI package, type","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> add MPI","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Congrats, you have installed MPI!","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nMany Julia package names end with .jl. This is just a way of signaling that a package is written in Julia. When using such packages, the .jl needs to be omitted. In this case, we have installed the MPI.jl package even though we have only typed MPI in the REPL.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nThe package you have installed is the Julia interface to MPI, called MPI.jl. Note that it is not an MPI library by itself. It is just a thin wrapper between MPI and Julia. To use this interface, you need an actual MPI library installed in your system such as OpenMPI or MPICH. Julia downloads and installs an MPI library for you, but it is also possible to use an MPI library already available in your system. This is useful, e.g., when running on HPC clusters. 
See the documentation of MPI.jl for further details.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To check that the package was installed properly, exit package mode by pressing the backspace key several times, and run it again","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> include(\"hello_mpi.jl\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, it should work, but you probably get output from a single MPI rank only.","category":"page"},{"location":"getting_started_with_julia/#Running-MPI-code","page":"Getting started","title":"Running MPI code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To run MPI applications in parallel, you need a launcher like mpiexec. MPI codes written in Julia are not an exception to this rule. From the system terminal, you can run","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"$ mpiexec -np 4 julia hello_mpi.jl","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"But it will probably not work since the version of mpiexec needs to match with the MPI version we are using from Julia. Don't worry if you could not make it work! 
A more elegant way to run MPI code is from the Julia REPL directly, by using these commands:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using MPI\njulia> run(`$(mpiexec()) -np 4 julia hello_mpi.jl`)","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, you should see output from 4 ranks.","category":"page"},{"location":"getting_started_with_julia/#Package-manager","page":"Getting started","title":"Package manager","text":"","category":"section"},{"location":"getting_started_with_julia/#Installing-packages-locally","page":"Getting started","title":"Installing packages locally","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"We have installed the MPI package globally and it will be available in all Julia sessions. However, in some situations, we want to work with different versions of the same package or to install packages in an isolated way to avoid potential conflicts with other packages. This can be done by using local projects.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"A project is simply a folder in your file system. To use a particular folder as your project, you need to activate it. This is done by entering package mode and using the activate command followed by the path to the folder you want to activate.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> activate .","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The previous command will activate the current working directory. Note that the dot . 
is indeed the path to the current folder.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The prompt has changed to (lessons) pkg> indicating that we are in the project within the lessons folder. The particular folder name can be different in your case.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nYou can activate a project directly when opening Julia from the terminal using the --project flag. The command $ julia --project=. will open Julia and activate a project in the current directory. You can also achieve the same effect by setting the environment variable JULIA_PROJECT with the path of the folder you want to activate.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nThe active project folder and the current working directory are two independent concepts! For instance, (@v1.10) pkg> activate folderB and then julia> cd(\"folderA\"), will activate the project in folderB and change the current working directory to folderA.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"At this point all package-related operations will be local to the new project. 
For instance, install the DataFrames package.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(lessons) pkg> add DataFrames","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Use the package to check that it is installed","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using DataFrames\njulia> DataFrame(a=[1,2],b=[3,4])","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, we can return to the global project to check that DataFrames has not been installed there. To return to the global environment, use activate without a folder name.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(lessons) pkg> activate","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The prompt is again (@v1.10) pkg>","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, try to use DataFrames.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using DataFrames\njulia> DataFrame(a=[1,2],b=[3,4])","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You should get an error or a warning unless you already had DataFrames installed globally.","category":"page"},{"location":"getting_started_with_julia/#Project-and-Manifest-files","page":"Getting started","title":"Project and Manifest files","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The information about a project is 
stored in two files Project.toml and Manifest.toml.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Project.toml contains the packages explicitly installed (the direct dependencies)\nManifest.toml contains direct and indirect dependencies along with the concrete version of each package.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"In other words, Project.toml contains the packages relevant for the user, whereas Manifest.toml is the detailed snapshot of all dependencies. The Manifest.toml can be used to reproduce the same environment in another machine.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can see the path to the current Project.toml file by using the status operator (or st in its short form) while in package mode","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> status","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The information about the Manifest.toml can be inspected by passing the -m flag.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> status -m","category":"page"},{"location":"getting_started_with_julia/#Installing-packages-from-a-project-file","page":"Getting started","title":"Installing packages from a project file","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Project files can be used to install lists of packages defined by others. 
For example, this lets you install all the dependencies of a Julia application.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Assume that a colleague has sent you a Project.toml file with this content:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"[deps]\nBenchmarkTools = \"6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf\"\nDataFrames = \"a93c6f00-e57d-5684-b7b6-d8193f3e46c0\"\nMPI = \"da04e1cc-30fd-572f-bb4f-1f8673147195\"","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Copy the contents of the previous code block into a file called Project.toml and place it in an empty folder named newproject. It is important that the file is named Project.toml. You can create a new folder from the REPL with","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> mkdir(\"newproject\")","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To install all the packages registered in this file you need to activate the folder containing your Project.toml file","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> activate newproject","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"and then instantiate it","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(newproject) pkg> instantiate","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The instantiate command will download and install all listed packages and their dependencies in just one 
click.","category":"page"},{"location":"getting_started_with_julia/#Getting-help-in-package-mode","page":"Getting started","title":"Getting help in package mode","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can get help about a particular package operator by writing help in front of it","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> help activate","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can get an overview of all package commands by typing help alone","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> help","category":"page"},{"location":"getting_started_with_julia/#Package-operations-in-Julia-code","page":"Getting started","title":"Package operations in Julia code","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"In some situations, you need to use package commands in Julia code, e.g., to automate installation and deployment of Julia applications. This can be done using the Pkg package. 
For instance","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"julia> using Pkg\njulia> Pkg.status()","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"is equivalent to calling status in package mode.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> status","category":"page"},{"location":"getting_started_with_julia/#Creating-you-own-package","page":"Getting started","title":"Creating your own package","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"In many situations, it is useful to create your own package, for instance, when working with a large code base, when you want to reduce compilation latency using Revise.jl, or if you want to eventually register your package and share it with others.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"The simplest way of generating a package (called MyPackage) is as follows. Open Julia, go to package mode, and type","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> generate MyPackage","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"This will create a minimal package consisting of a new folder MyPackage with two files:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"MyPackage/Project.toml: Project file defining the direct dependencies of your package.\nMyPackage/src/MyPackage.jl: Main source file of your package. 
You can split your code into several files if needed, and include them in the package main file using the include function.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"tip: Tip\nThis approach only generates a very minimal package. To create a more sophisticated package skeleton (including unit testing, code coverage, readme file, licence, etc.) use PkgTemplates.jl or BestieTemplate.jl. The latter is developed in Amsterdam at the Netherlands eScience Center.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"You can add dependencies to the package by activating the MyPackage folder in package mode and adding new dependencies as always:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> activate MyPackage\n(MyPackage) pkg> add MPI","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"This will add MPI to your package dependencies.","category":"page"},{"location":"getting_started_with_julia/#Using-your-own-package","page":"Getting started","title":"Using your own package","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"To use your package you first need to add it to a package environment of your choice. This is done by changing to package mode and typing develop followed by the path to the folder containing the package. 
For instance:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"(@v1.10) pkg> develop MyPackage","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"note: Note\nYou do not need to \"develop\" your package if you activated the package folder MyPackage.","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Now, we can go back to standard Julia mode and use it as any other package:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"using MyPackage\nMyPackage.greet()","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Here, we just called the example function defined in MyPackage/src/MyPackage.jl.","category":"page"},{"location":"getting_started_with_julia/#Conclusion","page":"Getting started","title":"Conclusion","text":"","category":"section"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"We have learned the basics of how to work with Julia, including how to run serial and parallel code, and how to manage, create, and use Julia packages. This knowledge will allow you to follow the course effectively! 
If you want to further dig into the topics we have covered here, you can take a look at the following links:","category":"page"},{"location":"getting_started_with_julia/","page":"Getting started","title":"Getting started","text":"Julia Manual\nPackage manager","category":"page"},{"location":"tsp/","page":"Traveling salesperson problem","title":"Traveling salesperson problem","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/tsp.ipynb\"","category":"page"},{"location":"tsp/","page":"Traveling salesperson problem","title":"Traveling salesperson problem","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"jacobi_2D/","page":"-","title":"-","text":"\n","category":"page"},{"location":"julia_async/","page":"Asynchronous programming in Julia","title":"Asynchronous programming in Julia","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/julia_async.ipynb\"","category":"page"},{"location":"julia_async/","page":"Asynchronous programming in Julia","title":"Asynchronous programming in Julia","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"julia_async/","page":"Asynchronous programming in Julia","title":"Asynchronous programming in Julia","text":"\n","category":"page"},{"location":"solutions/","page":"-","title":"-","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/solutions.ipynb\"","category":"page"},{"location":"solutions/","page":"-","title":"-","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"LEQ/","page":"Gaussian elimination","title":"Gaussian elimination","text":"\n","category":"page"},{"location":"julia_distributed/","page":"Distributed computing in Julia","title":"Distributed computing in Julia","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/julia_distributed.ipynb\"","category":"page"},{"location":"julia_distributed/","page":"Distributed computing in Julia","title":"Distributed computing in Julia","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"julia_distributed/","page":"Distributed computing in Julia","title":"Distributed computing in Julia","text":"\n","category":"page"},{"location":"julia_jacobi/","page":"-","title":"-","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/julia_jacobi.ipynb\"","category":"page"},{"location":"julia_jacobi/","page":"-","title":"-","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"matrix_matrix/","page":"Matrix-matrix multiplication","title":"Matrix-matrix multiplication","text":"\n","category":"page"},{"location":"asp/","page":"All pairs of shortest paths","title":"All pairs of shortest paths","text":"EditURL = \"https://github.com/fverdugo/XM_40017/blob/main/notebooks/asp.ipynb\"","category":"page"},{"location":"asp/","page":"All pairs of shortest paths","title":"All pairs of shortest paths","text":"
\n Tip\n
\n
\n
\n Download this notebook and run it locally on your machine [highly recommended]. Click here.\n
\n
\n
\n
","category":"page"},{"location":"asp/","page":"All pairs of shortest paths","title":"All pairs of shortest paths","text":"\n","category":"page"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = XM_40017","category":"page"},{"location":"#Programming-Large-Scale-Parallel-Systems-(XM_40017)","page":"Home","title":"Programming Large-Scale Parallel Systems (XM_40017)","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Welcome to the interactive lecture notes of the Programming Large-Scale Parallel Systems course at VU Amsterdam!","category":"page"},{"location":"#What","page":"Home","title":"What","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This page contains part of the course material of the Programming Large-Scale Parallel Systems course at VU Amsterdam. We provide several lecture notes in jupyter notebook format, which will help you to learn how to design, analyze, and program parallel algorithms on multi-node computing systems. Further information about the course is found in the study guide (click here) and our Canvas page (for registered students).","category":"page"},{"location":"","page":"Home","title":"Home","text":"note: Note\nMaterial will be added incrementally to the website as the course advances.","category":"page"},{"location":"","page":"Home","title":"Home","text":"warning: Warning\nThis page will eventually contain only a part of the course material. The rest will be available on Canvas. 
In particular, the material in this public webpage does not fully cover all topics in the final exam.","category":"page"},{"location":"#How-to-use-this-page","page":"Home","title":"How to use this page","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"You have two main ways of studying the notebooks:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Download the notebooks and run them locally on your computer (recommended). At each notebook page you will find a green box with links to download the notebook.\nYou also have the static version of the notebooks displayed in this webpage for quick reference.","category":"page"},{"location":"#How-to-run-the-notebooks-locally","page":"Home","title":"How to run the notebooks locally","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"To run a notebook locally follow these steps:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Install Julia (if not done already). More information in Getting started.\nDownload the notebook.\nLaunch Julia. More information in Getting started.\nExecute these commands in the Julia command line:","category":"page"},{"location":"","page":"Home","title":"Home","text":"julia> using Pkg\njulia> Pkg.add(\"IJulia\")\njulia> using IJulia\njulia> notebook()","category":"page"},{"location":"","page":"Home","title":"Home","text":"These commands will open a jupyter in your web browser. Navigate in jupyter to the notebook file you have downloaded and open it.","category":"page"},{"location":"#Authors","page":"Home","title":"Authors","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This material is created by Francesc Verdugo with the help of Gelieza Kötterheinrich. 
Part of the notebooks are based on the course slides by Henri Bal.","category":"page"},{"location":"#License","page":"Home","title":"License","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"All material on this page that is original to this course may be used under a CC BY 4.0 license.","category":"page"},{"location":"#Acknowledgment","page":"Home","title":"Acknowledgment","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This page was created with the support of the Faculty of Science of Vrije Universiteit Amsterdam in the framework of the project \"Interactive lecture notes and exercises for the Programming Large-Scale Parallel Systems course\" funded by the \"Innovation budget BETA 2023 Studievoorschotmiddelen (SVM) towards Activated Blended Learning\".","category":"page"}]
}
diff --git a/dev/solutions/index.html b/dev/solutions/index.html
index 9b9603d..2b6fe45 100644
--- a/dev/solutions/index.html
+++ b/dev/solutions/index.html
@@ -1,5 +1,5 @@
-- · XM_40017
This document was generated with Documenter.jl version 1.7.0 on Monday 23 September 2024. Using Julia version 1.10.5.
+end
Settings
This document was generated with Documenter.jl version 1.7.0 on Monday 23 September 2024. Using Julia version 1.10.5.
diff --git a/dev/tsp.ipynb b/dev/tsp.ipynb
index 1ad09d8..259bd0f 100644
--- a/dev/tsp.ipynb
+++ b/dev/tsp.ipynb
@@ -22,8 +22,8 @@
"In this notebook, we will learn\n",
"\n",
"- How to parallelize the solution of the traveling sales person problem\n",
- "- How to fix dynamic load imbalance\n",
- "- The concept of search overhead\n"
+ "- The concept of search overhead\n",
+ "- A dynamic load balancing method\n"
]
},
{
@@ -51,9 +51,17 @@
" \"It's not correct. Keep trying! 💪\"\n",
" end |> println\n",
"end\n",
- "tsp_check_2(answer) = answer_checker(answer, 2)\n",
+ "tsp_check_2(answer) = answer_checker(answer, 4)\n",
"tsp_check_3(answer) = answer_checker(answer, \"d\")\n",
- "tsp_check_4(answer) = answer_checker(answer, \"a\")"
+ "tsp_check_4(answer) = answer_checker(answer, \"a\")\n",
+ "function q_superlinear_answer(bool)\n",
+ " bool || return\n",
+ " msg = \"\"\"\n",
+ " Negative search overhead can explain the superlinear speedup in this algorithm. The optimal speedup (speedup equal to the number of processors) assumes that the work done in the sequential and parallel algorithms is the same. If the parallel code does less work, it is possible to go beyond the optimal speedup. Cache effects are not likely to have a positive impact here. Even large search spaces can be represented with rather small distance matrices. Moreover, we are not partitioning the distance matrix.\n",
+ " \"\"\"\n",
+ " println(msg)\n",
+ "end\n",
+ "println(\"🥳 Well done!\")"
]
},
{
@@ -64,9 +72,16 @@
"## The traveling sales person (TSP) problem\n",
"\n",
"\n",
+ "In this notebook we will study another problem defined on graphs, the [traveling sales person (TSP) problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem). The classical formulation of this problem is as follows (quoted from Wikipedia): \"Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once?\" This problem has applications in combinatorial optimization, theoretical computer science, and operations research. It is a very expensive problem to solve (it is NP-hard), which often calls for parallel computing.\n",
+ "\n",
+ "
\n",
+ "Note: There are two key variations of this problem. One in which the sales person returns to the initial city, and another in which the sales person does not return to the initial city. We will consider the second variant for simplicity.\n",
+ "
\n",
+ "\n",
+ "\n",
"### Problem statement\n",
"\n",
- "Given a graph $G$ with a distance table $C$ and an initial node (i.e. a city) in the graph, compute the shortest route that visits all cities exactly once, without returning to the initial city."
+ "Our version of the TSP problem can be formalized as follows. Given a graph $G$ with a distance table $C$ and an initial node in the graph, compute the shortest route that visits all nodes exactly once, without returning to the initial node. The nodes on the graph can be interpreted as the \"cities\", and the solution is the optimal route for the traveling salesperson to visit all cities. The following figure shows a simple TSP problem and its solution."
]
},
{
@@ -86,14 +101,12 @@
},
{
"cell_type": "markdown",
- "id": "c303dddf",
+ "id": "fd4f87fc",
"metadata": {},
"source": [
"### Sequential algorithm (branch and bound)\n",
"\n",
- "The sequential algorithm finds a shortest path by traversing the paths tree of the problem. The root of this tree is the initial city. The children of each node in the graph are the neighbour cities that have not been visited on the path so far. When all neighbour cities are already visited, the city becomes a leaf node in the tree.\n",
- "\n",
- "The possile solutions of the problem are the paths from the root of the tree to a leaf node. Note that we assume the children are sorted using the **nearest city first heuristic**. This allows to quickly find a minimum bound for the distance which will be used to prune the remaining paths (see next section). "
+ "A well-known method to solve this problem is based on a [branch and bound](https://en.wikipedia.org/wiki/Branch_and_bound) strategy. It consists in organizing all possible routes in a tree-like structure (this is the \"branch\" part). The root of this tree is the initial city. The children of each node in the tree are the neighbor cities that have not been visited in the path so far. When all neighbor cities are already visited, the city becomes a leaf node in the tree. See the figure below for the tree associated with our TSP problem example. The TSP problem now consists in finding the \"shortest\" branch in this tree. The tree data structure is just a convenient way of organizing all possible routes in order to search for the shortest one. We refer to it as the *search tree* or the *search space*."
]
},
{
@@ -113,10 +126,12 @@
},
{
"cell_type": "markdown",
- "id": "9da5f5ae",
+ "id": "c303dddf",
"metadata": {},
"source": [
- "Of course, visiting all paths in the tree is impractical for moderate and large numbers of cities. The number of possible paths might be up to $O(N!)$. Therefore, an essential part of the algorithm is to bound the search by remembering the current minimum distance. "
+ "### Nearest city first heuristic\n",
+ "\n",
+ "When building the search tree we are free to choose any order when defining the children of a node. A clever order is the *nearest city first heuristic*, i.e., we sort the children according to how far they are from the current node, in ascending order. This allows us to quickly find a minimum bound for the distance, which will be used to prune the remaining paths (see next section). The figure above used the nearest city first heuristic. In blue you can see the distance between cities. The first child is always the one with the shortest distance."
]
},
{
@@ -126,9 +141,9 @@
"source": [
"### Pruning the search tree\n",
"\n",
- "The algorithm keeps track of the best solution of all paths visited so far. This allows to skip searching paths that already exceed this value. \n",
+ "The basic idea of the algorithm is to loop over all possible routes (all branches in the search tree) and find the one with the shortest distance. One can optimize this process by \"pruning\" the search tree. We keep track of the best solution of all paths visited so far, which allows us to skip searching paths that already exceed this value. This is the \"bound\" part of the branch and bound strategy. \n",
"\n",
- "For example, in the following graph only 3 out of 6 possible routes need to be visted when we cut off the search after the minimum distance is exceeded. (The grey nodes are the ones we don't visit because the minimum distance had been exceeded at the previous node already.)"
+ "For example, in the following graph only 3 out of 6 possible routes need to be fully traversed to find the shortest route. In particular, we do not need to fully traverse the second branch/route (figure below, left). When visiting the third city in this branch, the current distance is already equal to the distance of the full previous route. This means that the solution cannot be in this part of the tree. In the figure below (right), the gray nodes are the ones we do not visit because the minimum distance had been exceeded before completing the route."
]
},
{
@@ -148,37 +163,14 @@
},
{
"cell_type": "markdown",
- "id": "f3c78a1d",
+ "id": "9da5f5ae",
"metadata": {},
"source": [
- "Note that it is not necessary that the graph be fully connected. Variations of this algorithm work for sparse graphs or directed graphs as well. \n",
+ "### Computational complexity\n",
"\n",
- "In the previous example, the shortest route was also the leftmost path in the graph. Although it is more likely that the shortest route be found in the left part of the graph when using the nearest city first heuristic, the solution can be anywhere in the search tree."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "85d771de",
- "metadata": {},
- "source": [
- "
\n",
- " Example: Look at the following graph and its corresponding search tree. If $x\\leq 15$, the shortest route is the leftmost branch of the search tree. If $x > 16$, the route is situated on the right side of the search tree. \n",
- "
"
+ "The total number of routes we need to traverse is $O(N!)$, where $N$ is the number of cities. This comes from the fact that the number of possible routes is equal to the number of possible permutations of $N$ cities. Thus the cost of the algorithm is $O(N!)$, which becomes expensive very quickly when $N$ grows.\n",
+ "\n",
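+ "To get a feeling for this growth, here is a minimal sketch (an illustration only, not part of the implementation in this notebook) that prints $N!$ for a few arbitrarily chosen values of $N$ using Julia's built-in `factorial` function.\n",
+ "\n",
+ "```julia\n",
+ "# Print N => N! for a few arbitrarily chosen numbers of cities N\n",
+ "for N in (4, 8, 12, 16)\n",
+ "    println(N => factorial(N))\n",
+ "end\n",
+ "```\n",
+ "\n",
+ "For $N=12$ the value is already 479001600, which gives an idea of why traversing all routes quickly becomes impractical.\n",
+ "\n",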
+ "In practice, however, we will not need to traverse all $O(N!)$ possible routes to find the shortest one since we consider pruning. The nearest city first heuristic also makes it more likely that the shortest route is among the first routes to be traversed (left part of the tree), thus speeding up the process. However, the solution can be anywhere in the search tree, and the number of routes to be traversed is $O(N!)$ in the worst-case scenario."
]
},
{
@@ -188,7 +180,12 @@
"source": [
"## Serial implementation\n",
"\n",
- "Let's implement the serial algorithm. First, we sort the neighbours according to their distance. "
+ "Let's implement the serial algorithm.\n",
+ "\n",
+ "\n",
+ "
\n",
+ "Note: The implementation of this algorithm is rather challenging. Try to understand the key ideas (the explanations) instead of all code details. Having the complete functional implementation is useful to analyze the actual performance of our parallel implementation at the end of the notebook, and to check if it is consistent with the theory.\n",
+ "
"
]
},
{
@@ -196,7 +193,9 @@
"id": "f2b70f85",
"metadata": {},
"source": [
- "### Nearest-city first heuristic"
+ "### Nearest-city first heuristic\n",
+ "\n",
+ "The first step is preprocessing the distance table to create a new data structure that takes into account the nearest city first heuristic. This is done in the following function."
]
},
{
@@ -217,27 +216,12 @@
"end"
]
},
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "2eeecdd6",
- "metadata": {},
- "outputs": [],
- "source": [
- "C = [\n",
- " 0 2 3 2\n",
- " 2 0 4 1\n",
- " 3 4 0 3\n",
- " 2 1 3 0 \n",
- "]"
- ]
- },
{
"cell_type": "markdown",
- "id": "f769627c",
+ "id": "9fc1a398",
"metadata": {},
"source": [
- "The data structure we will use for the connections table is a matrix of tuples of the form (destination, distance). The tuples are sorted by their distance in ascending order (per start city). "
+ "Execute the next cell to understand the output of `sort_neighbors`."
]
},
{
@@ -247,15 +231,21 @@
"metadata": {},
"outputs": [],
"source": [
+ "C = [\n",
+ " 0 2 3 2\n",
+ " 2 0 4 1\n",
+ " 3 4 0 3\n",
+ " 2 1 3 0 \n",
+ "]\n",
"C_sorted = sort_neighbors(C)"
]
},
{
"cell_type": "markdown",
- "id": "51ed8312",
+ "id": "a63dc266",
"metadata": {},
"source": [
- "The connections matrix can be indexed by a city. This returns a `Vector{Tuple}}` of all the destinations and their corresponding distances. "
+ "The output is a vector of vectors of tuples. The outer vector is indexed by a city id, for instance:"
]
},
{
@@ -271,6 +261,14 @@
"C_sorted[city]"
]
},
+ {
+ "cell_type": "markdown",
+ "id": "f769627c",
+ "metadata": {},
+ "source": [
+ " This returns a vector of tuples that contains information about the connections to this city of the form (destination, distance). In this case, city 3 is connected to city 3 at distance 0 (itself), then with city 1 at distance 3, then with city 4 at distance 3, and finally with city 2 at distance 4. Note that the connections are sorted by their distance in ascending order (here is where the nearest city first heuristic is used). "
+ ]
+ },
{
"cell_type": "markdown",
"id": "44025bd5",
@@ -284,7 +282,7 @@
"id": "6c91a99f",
"metadata": {},
"source": [
- "Next, we write an algorithm that traverses the whole search tree and prints all the possible paths. The tree is traversed in depth-first order. Before we go to a neighbouring city, we also have to verify that it has not been visited on this path yet. If we reach a leaf node, we print the complete path and continue searching. "
+ "Next, we write an algorithm that traverses the whole search tree and prints all the possible paths. To this end, the tree is traversed in [depth-first order](https://en.wikipedia.org/wiki/Depth-first_search) using a recursive function call. Before we go to a neighbouring city, we also have to verify that it has not been visited on this path yet. If we reach a leaf node, we print the complete path and continue searching. "
]
},
{
@@ -333,7 +331,7 @@
" end\n",
" return nothing\n",
" else\n",
- " println(path)\n",
+ " println(\"I just completed route $path\")\n",
" return nothing\n",
" end\n",
"end"
@@ -357,7 +355,7 @@
"source": [
"### Serial implementation without pruning\n",
"\n",
- "Now, we add the computation of the minimum distance. At each leaf node, we update the minimum distance. Furthermore, as we add another node to our path, we update the distance of the current path. That makes it necessary to include two more parameters in our recursive algorithm: `distance`, the distance of the current path, and `min_distance`, the best minimum distance found so far. "
+ "Now, we know how to traverse all possible routes. We just need a minor modification of the code above to solve the TSP problem (without pruning). We add a new variable called `min_distance` that keeps track of the distance of the shortest route so far. This variable is updated at the end of each route, i.e., when a leaf node is visited. After traversing all routes, `min_distance` will contain the distance of the shortest route (the solution of the TSP problem)."
]
},
{
@@ -375,6 +373,16 @@
"
\n",
+ "Note: We could further modify the function so that we also return a vector containing the cities in the shortest route. However, in this notebook, we will only return the distance of the shortest route (a single value) for simplicity.\n",
+ "
"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -382,6 +390,7 @@
"metadata": {},
"outputs": [],
"source": [
+ "verbose::Bool = true\n",
"function tsp_serial_no_prune(C_sorted,city)\n",
" num_cities = length(C_sorted)\n",
" path=zeros(Int,num_cities)\n",
@@ -411,7 +420,7 @@
" else\n",
" # Set new minimum distance in leaf nodes\n",
" min_distance = min(distance,min_distance)\n",
- " #@show path, distance, min_distance\n",
+ " verbose && println(\"I just completed route $path. Min distance so far is $min_distance\")\n",
" return min_distance\n",
" end\n",
"end"
@@ -425,6 +434,7 @@
"outputs": [],
"source": [
"city = 1\n",
+ "verbose = true\n",
"min_distance = tsp_serial_no_prune(C_sorted,city)"
]
},
@@ -435,7 +445,7 @@
"source": [
"### Final serial implementation\n",
"\n",
- "Finally, we add the pruning to our algorithm. Anytime the current distance exceeds the minimum distance, the search in this path is aborted and continued with another path. "
+ "Finally, we add the pruning to our algorithm. Any time the current distance reaches or exceeds the minimum distance found so far, the search in this path is aborted and continued with another path. By running the function below, you will see that only three routes are fully traversed thanks to pruning, as shown in the next figure."
]
},
{
@@ -472,6 +482,7 @@
"function tsp_serial_recursive!(C_sorted,hops,path,distance,min_distance)\n",
" # Prune this path if its distance is too high already\n",
" if distance >= min_distance\n",
+ " verbose && println(\"I am pruning at $(view(path,1:hops))\")\n",
" return min_distance\n",
" end\n",
" num_cities = length(C_sorted)\n",
@@ -493,7 +504,7 @@
" else\n",
" # Set new minimum distance in leaf nodes\n",
" min_distance = min(distance,min_distance)\n",
- " #@show path, distance, min_distance\n",
+ " verbose && println(\"I just completed route $path. Min distance so far is $min_distance\")\n",
" return min_distance\n",
" end\n",
"end"
@@ -507,6 +518,7 @@
"outputs": [],
"source": [
"city = 1\n",
+ "verbose = true\n",
"min_distance = tsp_serial(C_sorted,city)"
]
},
@@ -526,13 +538,14 @@
"metadata": {},
"outputs": [],
"source": [
- "n = 12 # It is safe to test up to n=12\n",
+ "n = 11 # It is safe to test up to n=11 on a laptop\n",
"using Random\n",
"using Test\n",
"Random.seed!(1)\n",
"C = rand(1:10,n,n)\n",
"C_sorted = sort_neighbors(C)\n",
"city = 1\n",
+ "verbose = false\n",
"@time min_no_prune = tsp_serial_no_prune(C_sorted,city)\n",
"@time min_prune = tsp_serial(C_sorted,city)\n",
"@test min_no_prune == min_prune"
@@ -543,7 +556,7 @@
"id": "6088ddc9",
"metadata": {},
"source": [
- "You can observe that, especially for larger numbers of cities (n=11 or n=12), the performance of the algorithm with pruning is much better than the performance of the algorithm without pruning. "
+ "You can observe that, especially for larger numbers of cities (n=11), the performance of the algorithm with pruning is much better than the performance of the algorithm without pruning. "
]
},
{
@@ -556,11 +569,13 @@
},
{
"cell_type": "markdown",
- "id": "c6375465",
+ "id": "732a3ffb",
"metadata": {},
"source": [
"### Where can we extract parallelism ?\n",
- "Unlike the previous algorithms we studied, in this problem we don't know beforehand how much work is performed since we don't know where the pruning cuts off part of the search tree. Still, we want to divide the workload among multiple processes to enhance the performance. "
+ "\n",
+ "All branches of the search tree can be traversed in parallel. Let us discuss how we can distribute these branches over several processes.\n",
+ "\n"
]
},
{
@@ -585,7 +600,7 @@
"source": [
"### Option 1\n",
"\n",
- "The first idea how to parallelize the TSP algorithm is to assign a branch of our search tree to each process. However, as mentioned in an earlier section, the number of branches in the search tree can be up to $O(N!)$. This would require an unfeasibly large amount of proecesses which each do only very little work. "
+ "We can (at least in theory) assign a branch of our search tree to each process. However, as mentioned in an earlier section, the number of branches in the search tree can be up to $O(N!)$. This would require an infeasibly large number of processors, each doing only very little work. Thus, we skip this option as it is impractical."
]
},
{
@@ -609,7 +624,8 @@
"metadata": {},
"source": [
"### Option 2\n",
- "Instead of assigning one branch per worker, we can assign a fixed number of branches to each worker. This way, each worker can perform the pruning within their own subtree and less workers are needed. \n"
+ "\n",
+ "Instead of assigning one branch per worker, we can assign a fixed number of branches to each worker. This would be a good strategy if we do not consider pruning. However, it is not efficient if we include pruning (which is essential in this algorithm). "
]
},
{
@@ -632,13 +648,14 @@
"id": "e3af7def",
"metadata": {},
"source": [
- "### Performance issues\n",
+ "### Performance issues: Load balance\n",
"\n",
- "#### Load balancing\n",
- "However, this approach has a problem with load balancing. Since we don't know beforehand how much pruning can be done in each subtree, some workers might end up doing less work than others. This uneven distribution of workload leads to some workers being idle, which impairs the speedup. \n",
+ "Pruning is essential in this algorithm but makes it challenging to evenly distribute the work over the available processors. Imagine that we assign the same number of branches per worker and that the workers use pruning locally to speed up the solution process. It is not possible to know in advance how many branches will be fully traversed by each worker since pruning depends on the actual values in the input distance matrix (runtime values). It might happen that a worker can prune many branches and finishes fast, whereas other workers are not able to prune as many branches and need more time to finish. This is a clear example of bad load balance. We will explain a strategy to fix it later.\n",
"\n",
- "#### Search overhead \n",
- "Another disadvantage of this kind of parallel search is that the pruning is now less effective. The workers each run their own version of the search algorithm and keep track of their local minimum distances. This means that less nodes will be pruned in the parallel version than in the serial version. This is called **search overhead**."
+ "\n",
+ "### Performance issues: Search overhead \n",
+ "\n",
+ "Another disadvantage of this kind of parallel search is that the pruning is now less effective. The workers each run their own version of the search algorithm and keep track of their local minimum distances. This means that fewer nodes will be pruned in the parallel version than in the serial version. The parallel code might search more routes than the sequential one. This is called *search overhead*."
]
},
{
@@ -647,7 +664,7 @@
"metadata": {},
"source": [
"
\n",
- "Question: How many nodes are pruned in total when we assign two branches to each worker? Look at the illustration below.\n",
+ "Question: How many routes are fully traversed in total when we assign two branches to each worker? Look at the illustration below. Assume that each worker does pruning locally and independently of the other workers.\n",
"
"
]
},
@@ -682,33 +699,42 @@
"id": "d0bc4fdd",
"metadata": {},
"source": [
- "In this example, the parallel algorithm prunes less nodes than the serial version because not all workers are able to use the global minimum distance as a pruning bound."
+ "In this example, the parallel algorithm traverses more routes (1 more) than the serial version because not all workers are able to use the global minimum distance as a pruning bound. Remember that the sequential code only traverses 3 routes completely. See figure:"
+ ]
+ },
+ {
+ "attachments": {
+ "g26375.png": {
+ "image/png": ""
+ }
+ },
+ "cell_type": "markdown",
+ "id": "f6329f5d",
+ "metadata": {},
+ "source": [
+ "
\n",
- "Question: The previous example described positive search overhead. There is also negative search overhead, resulting in superlinear speedups. Can there be negative search overhead in this parallel TSP algorithm? (Provided the workers communicate the minimum distance with each other)\n",
- "
\n",
- "\n",
- " a) No, because we use the nearest city first heuristic. \n",
- " b) No, because each worker has to search the whole subtree before the algorithm completes. \n",
- " c) Yes, because the parallel algorithm does not need to search the whole search tree.\n",
- " d) Yes, because the global minimum distance can be found more quickly, enabling the parallel version to do more pruning."
+ "In order to minimize search overhead, workers need to collaboratively keep track of a global minimum distance. However, this needs to be done carefully to avoid race conditions. We show how to do this later in the notebook."
]
},
{
- "cell_type": "code",
- "execution_count": null,
- "id": "fff58498",
+ "cell_type": "markdown",
+ "id": "5e4cee1a",
"metadata": {},
- "outputs": [],
"source": [
- "answer = \"x\" # Replace x with a,b,c or d\n",
- "tsp_check_3(answer)"
+ "### Negative search overhead\n",
+ "\n",
+ "The parallel algorithm might search more branches than the sequential one when we parallelize the pruning process. However, it is also possible that the parallel algorithm searches fewer branches than the sequential one in particular cases. Imagine that the optimal route is on the right side of the tree (or the last route in the tree in the limit case). The parallel algorithm will need less work than the sequential one in this case. The last workers might find the optimal route very quickly and inform the other workers about the optimal minimum, which can then prune branches very effectively. In contrast, the sequential algorithm will need to traverse many branches in order to reach the optimal one. If the parallel code does fewer searches than the sequential one, we say that the search overhead is negative. \n",
+ "\n",
+ "Negative search overhead is very good for parallel speedups, but it depends on the input values. We cannot rely on it to speed up the parallel execution of the algorithm. \n"
]
},
{
@@ -717,7 +743,16 @@
"metadata": {},
"source": [
"### Option 3: Dynamic load balancing with replicated workers model\n",
- "In our parallel implementation, we will use a coordinator process and several worker processes. The coordinator process (or _master_) searches the tree up to a certain maximum depth _maxhops_. When _maxhops_ is reached, the coordinator creates a job and delegates it to a worker. The workers repeatedly get work from the master and execute it. This is an example of **dynamic load balancing**: the load is distributed among the workers during runtime."
+ "\n",
+ "In this third option, we explain a strategy to improve load balance based on the [*replicated workers model*](https://en.wikipedia.org/wiki/Thread_pool), also known as *worker pool* or *thread pool*. In this model, the main process (aka master or coordinator process) sends jobs to a job queue. Then, workers take one available job from the queue, run it, and take a new job when they are done. In this process, workers never wait for other workers, thus fixing the load balance problem. It does not matter if some jobs are larger than others as long as there are enough jobs to keep the workers busy. The main limiting factor of this model is the number of jobs and the speed at which the main process is able to generate jobs and send them to the queue. This is an example of **dynamic load balancing**: the load is distributed among the workers at runtime.\n",
+ "\n",
+ "\n",
+ "In our parallel implementation, we will use a coordinator process and several worker processes. The coordinator process will search the tree up to a certain maximum depth given by a number of hops/levels _maxhops_. When _maxhops_ is reached, the coordinator will stop searching the tree and will let any available worker continue searching in the subtree. In the figure below, the master process will only visit the nodes in the top green box. The worker processes will search the subtrees below in parallel.\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n"
]
},
{
@@ -745,28 +780,19 @@
},
{
"cell_type": "markdown",
- "id": "81219a27",
+ "id": "9e345393",
"metadata": {},
"source": [
- "
\n",
- " Question: To find the right maxhops level is a tradeoff between...\n",
- "
\n",
- " \n",
- " a) Communication overhead (large maxhops) and load imbalance (small maxhops). \n",
- " b) Search overhead (large maxhops) and load imbalance (small maxhops). \n",
- " c) the number of workers (large maxhops) and the job size (small maxhops). \n",
- " d) buffer for the job queue (large maxhops) and idle time of the coordinator process (small maxhops)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "0cf1ec88",
- "metadata": {},
- "outputs": [],
- "source": [
- "answer=\"x\" #Replace x with a,b,c, or d\n",
- "tsp_check_4(answer)"
+ "### Performance impact of maxhops\n",
+ "\n",
+ "We introduced a new parameter `maxhops`. What is the optimal value for it? When choosing `maxhops`, there is a trade-off between load balance and communication overhead.\n",
+ "\n",
+ "- A small `maxhops` will reduce the number of jobs communicated to the workers (less communication), but reducing the number of jobs is bad for load balance. In the limit, we might generate even fewer jobs than the number of workers.\n",
+ "\n",
+ "- A large `maxhops` will increase the number of parallel jobs, improving dynamic load balance, but it will lead to more communication.\n",
+ "\n",
+ "\n",
+ "The optimal value of `maxhops` will depend on the given system, the number of workers, problem size, and also the particular input values. It is not possible to determine it in advance."
]
},
{
@@ -776,6 +802,8 @@
"source": [
"## Implementation of the parallel algorithm \n",
"\n",
+ "We will implement this algorithm using the task-based programming model provided by Distributed.jl as it is convenient to implement the replicated workers model.\n",
+ "\n",
"First, let's add our worker processes. "
]
},
@@ -937,7 +965,7 @@
"\n",
"### Simplified example\n",
"\n",
- "We will demonstrate how the workers communicate the minimum distance with each other with a short example. Each worker generates a random value and updates a globally shared minimum. The variable for the global minimum is stored in a `RemoteChannel`. The buffer size of the channel is 1, such that only one channel can take and put new values to the channel at a time."
+ "We will demonstrate how the workers communicate the minimum distance with each other with a short example. Each worker generates a random value and updates a globally shared minimum. The variable for the global minimum is stored in a `RemoteChannel`. The buffer size of the channel is 1, such that only one worker can take and put new values to the channel at a time, thus solving the race condition problem."
]
},
{
@@ -1087,7 +1115,7 @@
"metadata": {},
"source": [
"## Testing the parallel implementation\n",
- "Next, we will test the correctness and performance of our parallel implementation by comparing the results of the parallel algorithm to the results of the serial algorithm for multiple problem instances. "
+ "Next, we will test the correctness and performance of our parallel implementation by comparing the results of the parallel algorithm to the results of the serial algorithm for multiple problem instances. Run it for different values of `n` and `max_hops`. Try to explain the impact of these values on the parallel efficiency."
]
},
{
@@ -1097,12 +1125,13 @@
"metadata": {},
"outputs": [],
"source": [
- "n = 18 # Safe to run up to 18\n",
+ "n = 18 # Safe to run up to 18 on a laptop\n",
"using Random\n",
"Random.seed!(1)\n",
"C = rand(1:10,n,n)\n",
"C_sorted = sort_neighbors(C)\n",
"city = 1\n",
+ "verbose = false\n",
"T1 = @elapsed min_serial = tsp_serial(C_sorted,city)\n",
"max_hops = 2\n",
"P = nworkers()\n",
@@ -1115,6 +1144,50 @@
"@test min_serial == min_dist"
]
},
+ {
+ "cell_type": "markdown",
+ "id": "92e68978",
+ "metadata": {},
+ "source": [
+ "### Super-linear speedup\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9a724509",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "Question: For some values of `n` and `max_hops` the parallel efficiency can be above 100% (super-linear speedup). For example with `n=18` and `max_hops=2`, I get super-linear speedup on my laptop for some runs. Explain a possible cause for super-linear speedup in this algorithm."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "eebe7e9a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "uncover = false\n",
+ "q_superlinear_answer(uncover)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "19835531",
+ "metadata": {},
+ "source": [
+ "## Summary\n",
+ "\n",
+ "- We studied the solution of the TSP problem using a branch and bound strategy.\n",
+ "- The problem has $O(N!)$ complexity in the worst-case scenario, where $N$ is the number of cities.\n",
+ "- Luckily, the compute time can be drastically reduced in practice using the nearest city first heuristic and branch pruning.\n",
+ "- Pruning, however, introduces load imbalance in the parallel code. To fix this, one needs a dynamic load balancing strategy, as the actual work per worker depends on the input matrix (runtime values).\n",
+ "- A replicated workers model is useful to distribute work dynamically. However, it introduces a trade-off between load balance and communication depending on the value of `maxhops`.\n",
+ "- The parallel code might suffer from positive search overhead (if the optimal route is on the left of the tree) or it can benefit from negative search overhead (if the optimal route is on the right of the tree).\n",
+ "- In some cases, it is possible to observe super-linear speedup thanks to negative search overhead.\n"
+ ]
+ },
{
"cell_type": "markdown",
"id": "c789dc7a",
@@ -1130,15 +1203,15 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Julia 1.9.1",
+ "display_name": "Julia 1.10.0",
"language": "julia",
- "name": "julia-1.9"
+ "name": "julia-1.10"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
- "version": "1.9.1"
+ "version": "1.10.0"
}
},
"nbformat": 4,
diff --git a/dev/tsp/index.html b/dev/tsp/index.html
index ec79fe1..0ec9318 100644
--- a/dev/tsp/index.html
+++ b/dev/tsp/index.html
@@ -1,5 +1,5 @@
-- · XM_40017
How to parallelize the solution of the traveling sales person problem
-
How to fix dynamic load imbalance
The concept of search overhead
+
A dynamic load balancing method
@@ -7568,9 +7568,17 @@ a.anchor-link {
"It's not correct. Keep trying! 💪"end|>printlnend
-tsp_check_2(answer)=answer_checker(answer,2)
+tsp_check_2(answer)=answer_checker(answer,4)
 tsp_check_3(answer)=answer_checker(answer,"d")
 tsp_check_4(answer)=answer_checker(answer,"a")
+function q_superlinear_answer(bool)
+    bool || return
+msg="""
+ Negative search overhead can explain the superlinear speedup in this algorithm. The optimal speedup (speedup equal to the number of processors) assumes that the work done in the sequential and parallel algorithm is the same. If the parallel code does less work, it is possible to go beyond the optimal speedup. Cache effects are not likely to have a positive impact here. Even large search spaces can be represented with rather small distance matrices. Moreover, we are not partitioning the distance matrix.
+ """
+println(msg)
+end
+println("🥳 Well done!")
Given a graph $G$ with a distance table $C$ and an initial node (i.e. a city) in the graph, compute the shortest route that visits all cities exactly once, without returning to the initial city.
In this notebook we will study another algorithm that works with graphs, the traveling sales person (TSP) problem. The classical formulation of this problem is as follows (quoted from Wikipedia): "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once?" This problem has applications in combinatorial optimization, theoretical computer science, and operations research. It is a very expensive problem to solve (NP-hard), which often requires parallel computing.
+
+Note: There are two key variations of this problem. One in which the sales person returns to the initial city, and another in which the sales person does not return to the initial city. We will consider the second variant for simplicity.
+
Our version of the TSP problem can be formalized as follows. Given a graph $G$ with a distance table $C$ and an initial node in the graph, compute the shortest route that visits all nodes exactly once, without returning to the initial node. The nodes on the graph can be interpreted as the "cities", and the solution is the optimal route for the traveling salesperson to visit all cities. The following figure shows a simple TSP problem and its solution.
The sequential algorithm finds a shortest path by traversing the paths tree of the problem. The root of this tree is the initial city. The children of each node in the graph are the neighbour cities that have not been visited on the path so far. When all neighbour cities are already visited, the city becomes a leaf node in the tree.
-
The possile solutions of the problem are the paths from the root of the tree to a leaf node. Note that we assume the children are sorted using the nearest city first heuristic. This allows to quickly find a minimum bound for the distance which will be used to prune the remaining paths (see next section).
A well-known method to solve this problem is based on a branch and bound strategy. It consists of organizing all possible routes in a tree-like structure (this is the "branch" part). The root of this tree is the initial city. The children of each node in the tree are the neighbor cities that have not been visited in the path so far. When all neighbor cities are already visited, the city becomes a leaf node in the tree. See the figure below for the tree associated with our TSP problem example. The TSP problem now consists of finding the "shortest" branch in this tree. The tree data structure is just a convenient way of organizing all possible routes in order to search for the shortest one. We refer to it as the search tree or the search space.
@@ -7626,13 +7637,13 @@ a.anchor-link {
-
+
-
Of course, visiting all paths in the tree is impractical for moderate and large numbers of cities. The number of possible paths might be up to $O(N!)$. Therefore, an essential part of the algorithm is to bound the search by remembering the current minimum distance.
When building the search tree we are free to choose any order when defining the children of a node. A clever order is the nearest city first heuristic, i.e., we sort the children according to how far they are from the current node, in ascending order. This allows us to quickly find a bound for the minimum distance, which will be used to prune the remaining paths (see next section). The figure above uses the nearest city first heuristic. In blue you can see the distance between cities. The first child is always the one with the shortest distance.
The algorithm keeps track of the best solution of all paths visited so far. This allows to skip searching paths that already exceed this value.
-
For example, in the following graph only 3 out of 6 possible routes need to be visted when we cut off the search after the minimum distance is exceeded. (The grey nodes are the ones we don't visit because the minimum distance had been exceeded at the previous node already.)
The basic idea of the algorithm is to loop over all possible routes (all branches in the search tree) and find the one with the shortest distance. One can optimize this process by "pruning" the search tree. We keep track of the best solution of all paths visited so far, which allows us to skip searching paths that already exceed this value. This is the "bound" part of the branch and bound strategy.
+
For example, in the following graph only 3 out of 6 possible routes need to be fully traversed to find the shortest route. In particular, we do not need to fully traverse the second branch/route (figure below, left). When visiting the third city in this branch, the current distance is already equal to the distance of the full previous route. This means that the solution cannot be in this part of the tree. In the figure below (right), the gray nodes are the ones we do not visit because the minimum distance had been exceeded before completing the route.
@@ -7662,40 +7673,14 @@ a.anchor-link {
-
+
-
Note that it is not necessary that the graph be fully connected. Variations of this algorithm work for sparse graphs or directed graphs as well.
-
In the previous example, the shortest route was also the leftmost path in the graph. Although it is more likely that the shortest route be found in the left part of the graph when using the nearest city first heuristic, the solution can be anywhere in the search tree.
-
-
-
-
-
-
-
-
-
-
-
- Example: Look at the following graph and its corresponding search tree. If $x\leq 15$, the shortest route is the leftmost branch of the search tree. If $x > 16$, the route is situated on the right side of the search tree.
-
The total number of routes we need to traverse is $O(N!)$, where $N$ is the number of cities. This comes from the fact that the number of possible routes is equal to the number of possible permutations of $N$ cities. Thus the cost of the algorithm is $O(N!)$, which becomes expensive very quickly when $N$ grows.
+
In practice, however, we will not need to traverse all $O(N!)$ possible routes to find the shortest one since we consider pruning. The nearest city first heuristic also makes it more likely that the shortest route is among the first routes to be traversed (left part of the tree), thus speeding up the process. However, the solution can be anywhere in the search tree, and the number of routes to be traversed is $O(N!)$ in the worst-case scenario.
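To get a feeling for this growth, the following minimal sketch counts the complete routes of a fully connected graph when the initial city is fixed. Under that assumption there are $(N-1)!$ routes, which is still factorial growth and consistent with the $O(N!)$ bound. The helper name `num_routes` is hypothetical and not part of the notebook.

```julia
# Hypothetical helper (not part of the notebook): number of complete routes
# in a fully connected graph with n cities when the initial city is fixed.
num_routes(n) = factorial(n - 1)

# Print the route count for a few problem sizes used in this notebook.
for n in (4, 11, 18)
    println("n = $n: $(num_routes(n)) routes")
end
```

For four cities this gives 6 routes, which matches the 6 possible routes of the pruning example.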
+Note: The implementation of this algorithm is rather challenging. Try to understand the key ideas (the explanations) instead of all code details. Having the complete functional implementation is useful to analyze the actual performance of our parallel implementation at the end of the notebook, and to check if it is consistent with the theory.
+
The first step is preprocessing the distance table to create a new data structure that takes into account the nearest city first heuristic. This is done in the following function.
@@ -7743,7 +7731,18 @@ a.anchor-link {
-
+
+
+
+
+
+
+
+
Execute the next cell to understand the output of sort_neighbors.
The data structure we will use for the connections table is a matrix of tuples of the form (destination, distance). The tuples are sorted by their distance in ascending order (per start city).
-
-
-
-
-
-
-
-
-
In [ ]:
-
-
-
C_sorted=sort_neighbors(C)
-
-
-
-
-
-
-
-
-
-
-
-
-
The connections matrix can be indexed by a city. This returns a Vector{Tuple}} of all the destinations and their corresponding distances.
+
The output is a vector of vectors of tuples. The outer vector is indexed by a city id, for instance:
@@ -7814,6 +7789,17 @@ a.anchor-link {
+
+
+
+
+
+
+
This returns a vector of tuples that contains information about the connections to this city of the form (destination, distance). In this case, city 3 is connected to city 3 at distance 0 (itself), then with city 1 at distance 3, then with city 4 at distance 3, and finally with city 2 at distance 4. Note that the connections are sorted by their distance in ascending order (here is where the nearest city first heuristic is used).
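As an illustration of this data structure, here is a minimal sketch of how such a sorted connection table could be built. The function name `sort_neighbors_sketch` and the distance table `C_example` are assumptions made up for this example (row 3 is chosen to reproduce the connections described above); the notebook's own `sort_neighbors` may be implemented differently.

```julia
# Illustrative sketch (assumption, not the notebook's sort_neighbors):
# for each start city, list all (destination, distance) pairs sorted by
# ascending distance, i.e. nearest city first.
function sort_neighbors_sketch(C)
    n = size(C, 1)
    return [sort([(j, C[i, j]) for j in 1:n]; by = last) for i in 1:n]
end

# Made-up symmetric distance table; row 3 matches the description above.
C_example = [0 2 3 2
             2 0 4 1
             3 4 0 3
             2 1 3 0]
C_sorted_example = sort_neighbors_sketch(C_example)

# Connections of city 3, nearest first.
println(C_sorted_example[3])
```

Because Julia's default `sort` is stable, cities at the same distance keep their original order (city 1 before city 4 in this example).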
+
+
+
+
@@ -7831,7 +7817,7 @@ a.anchor-link {
-
Next, we write an algorithm that traverses the whole search tree and prints all the possible paths. The tree is traversed in depth-first order. Before we go to a neighbouring city, we also have to verify that it has not been visited on this path yet. If we reach a leaf node, we print the complete path and continue searching.
+
Next, we write an algorithm that traverses the whole search tree and prints all the possible paths. To this end, the tree is traversed in depth-first order using a recursive function call. Before we go to a neighbouring city, we also have to verify that it has not been visited on this path yet. If we reach a leaf node, we print the complete path and continue searching.
Now, we add the computation of the minimum distance. At each leaf node, we update the minimum distance. Furthermore, as we add another node to our path, we update the distance of the current path. That makes it necessary to include two more parameters in our recursive algorithm: distance, the distance of the current path, and min_distance, the best minimum distance found so far.
Now, we know how to traverse all possible routes. We just need a minor modification of the code below to solve the TSP problem (without pruning). We add a new variable called min_distance that keeps track of the distance of the shortest route so far. This variable is updated at the end of each route, i.e., when a leaf node is visited. After traversing all routes, min_distance will contain the distance of the shortest route (the solution of the TSP problem).
@@ -7928,6 +7914,19 @@ a.anchor-link {
+
+
+
+
+
+
+
+
+Note: We could further modify the function so that we also return a vector containing the cities in the shortest route. However, in this notebook, we will only return the distance of the shortest route (a single value) for simplicity.
+
+
+
+
@@ -7936,7 +7935,8 @@ a.anchor-link {
In [ ]:
-
functiontsp_serial_no_prune(C_sorted,city)
+
verbose::Bool = true
+function tsp_serial_no_prune(C_sorted,city)
    num_cities = length(C_sorted)
    path = zeros(Int,num_cities)
    hops = 1
@@ -7965,7 +7965,7 @@ a.anchor-link {
    else
        # Set new minimum distance in leaf nodes
        min_distance = min(distance,min_distance)
-#@show path, distance, min_distance
+        verbose && println("I just completed route $path. Min distance so far is $min_distance")
        return min_distance
    end
end
@@ -7983,6 +7983,7 @@ a.anchor-link {
Finally, we add the pruning to our algorithm. Anytime the current distance exceeds the minimum distance, the search in this path is aborted and continued with another path.
Finally, we add the pruning to our algorithm. Anytime the current distance exceeds the minimum distance, the search in this path is aborted and continued with another path. By running the function below, you will see that only three routes will be traversed thanks to pruning, as shown in the next figure.
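The pruning idea can be condensed into the following self-contained sketch. It is not the notebook's `tsp_serial`; the name `tsp_sketch`, the closure-based recursion, and the made-up distance table `C_example` are assumptions chosen to keep the example short. The bound check mirrors the condition `distance >= min_distance` used in the notebook.

```julia
# Compact branch and bound sketch (illustrative names, not the notebook's code).
function tsp_sketch(C, city)
    n = size(C, 1)
    visited = falses(n)
    visited[city] = true
    best = Ref(typemax(Int))      # shortest complete route found so far
    function visit(current, hops, distance)
        # Bound: abandon this path as soon as it cannot beat the best route.
        distance >= best[] && return
        if hops == n              # leaf node: all cities are on the path
            best[] = distance
            return
        end
        # Branch: try the unvisited cities, nearest city first.
        for dest in sortperm(C[current, :])
            visited[dest] && continue
            visited[dest] = true
            visit(dest, hops + 1, distance + C[current, dest])
            visited[dest] = false
        end
        return
    end
    visit(city, 1, 0)
    return best[]
end

# Made-up symmetric distance table for four cities.
C_example = [0 2 3 2
             2 0 4 1
             3 4 0 3
             2 1 3 0]
println(tsp_sketch(C_example, 1))
```

For this table, the shortest route starting at city 1 is 1, 2, 4, 3 with distance 6.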
@@ -8033,6 +8034,7 @@ a.anchor-link {
function tsp_serial_recursive!(C_sorted,hops,path,distance,min_distance)
    # Prune this path if its distance is too high already
    if distance >= min_distance
+        verbose && println("I am pruning at $(view(path,1:hops))")
        return min_distance
    end
    num_cities = length(C_sorted)
@@ -8054,7 +8056,7 @@ a.anchor-link {
    else
        # Set new minimum distance in leaf nodes
        min_distance = min(distance,min_distance)
-#@show path, distance, min_distance
+        verbose && println("I just completed route $path. Min distance so far is $min_distance")
        return min_distance
    end
end
@@ -8072,6 +8074,7 @@ a.anchor-link {
n = 11 # It is safe to test up to n=11 on a laptop
using Random
using Test
Random.seed!(1)
C = rand(1:10,n,n)
C_sorted = sort_neighbors(C)
city = 1
+verbose = false
@time min_no_prune = tsp_serial_no_prune(C_sorted,city)
@time min_prune = tsp_serial(C_sorted,city)
@test min_no_prune == min_prune
@@ -8119,7 +8123,7 @@ a.anchor-link {
-
You can observe that, especially for larger numbers of cities (n=11 or n=12), the performance of the algorithm with pruning is much better than the performance of the algorithm without pruning.
+
You can observe that, especially for larger numbers of cities (n=11), the performance of the algorithm with pruning is much better than the performance of the algorithm without pruning.
Unlike the previous algorithms we studied, in this problem we don't know beforehand how much work is performed since we don't know where the pruning cuts off part of the search tree. Still, we want to divide the workload among multiple processes to enhance the performance.
The first idea how to parallelize the TSP algorithm is to assign a branch of our search tree to each process. However, as mentioned in an earlier section, the number of branches in the search tree can be up to $O(N!)$. This would require an unfeasibly large amount of proecesses which each do only very little work.
We can (at least in theory) assign a branch of our search tree to each process. However, as mentioned in an earlier section, the number of branches in the search tree can be up to $O(N!)$. This would require an infeasibly large number of processors, each doing very little work. Thus, we skip this option as it is impractical.
Instead of assigning one branch per worker, we can assign a fixed number of branches to each worker. This way, each worker can perform the pruning within their own subtree and less workers are needed.
Instead of assigning one branch per worker, we can assign a fixed number of branches to each worker. This would be a good strategy if we do not consider pruning. However, it is not efficient if we include pruning (which is essential in this algorithm).
However, this approach has a problem with load balancing. Since we don't know beforehand how much pruning can be done in each subtree, some workers might end up doing less work than others. This uneven distribution of workload leads to some workers being idle, which impairs the speedup.
Another disadvantage of this kind of parallel search is that the pruning is now less effective. The workers each run their own version of the search algorithm and keep track of their local minimum distances. This means that less nodes will be pruned in the parallel version than in the serial version. This is called search overhead.
Pruning is essential in this algorithm but makes it challenging to evenly distribute the work over the available processors. Imagine that we assign the same number of branches per worker and that the workers use pruning locally to speed up the solution process. It is not possible to know in advance how many branches will be fully traversed by each worker since pruning depends on the actual values in the input distance matrix (runtime values). It might happen that one worker can prune many branches and finishes fast, whereas other workers are not able to prune as many branches and need more time to finish. This is a clear example of bad load balance. We will explain a strategy to fix it later.
Another disadvantage of this kind of parallel search is that the pruning is now less effective. The workers each run their own version of the search algorithm and keep track of their local minimum distances. This means that fewer nodes will be pruned in the parallel version than in the serial version. The parallel code might search more routes than the sequential one. This is called search overhead.
@@ -8226,7 +8230,7 @@ a.anchor-link {
-Question: How many nodes are pruned in total when we assign two branches to each worker? Look at the illustration below.
+Question: How many routes are fully traversed in total when we assign two branches to each worker? Look at the illustration below. Assume that each worker does pruning locally and independently of the other workers.
@@ -8266,42 +8270,46 @@ a.anchor-link {
-
In this example, the parallel algorithm prunes less nodes than the serial version because not all workers are able to use the global minimum distance as a pruning bound.
+
In this example, the parallel algorithm traverses more routes (1 more) than the serial version because not all workers are able to use the global minimum distance as a pruning bound. Remember that the sequential code only traverses 3 routes completely. See figure:
-
+
-
-Question: The previous example described positive search overhead. There is also negative search overhead, resulting in superlinear speedups. Can there be negative search overhead in this parallel TSP algorithm? (Provided the workers communicate the minimum distance with each other)
-
-
a) No, because we use the nearest city first heuristic.
-b) No, because each worker has to search the whole subtree before the algorithm completes.
-c) Yes, because the parallel algorithm does not need to search the whole search tree.
-d) Yes, because the global minimum distance can be found more quickly, enabling the parallel version to do more pruning.
+
+
-
+
+
+
-
-
In [ ]:
-
-
-
answer="x"# Replace x with a,b,c or d
-tsp_check_3(answer)
-
+
+
+
In order to minimize search overhead, workers need to collaboratively keep track of a global minimum distance. However, this needs to be done carefully to avoid race conditions. We show how to do this later in the notebook.
The parallel algorithm might search more branches than the sequential one when we parallelize the pruning process. However, it is also possible that the parallel algorithm searches fewer branches than the sequential one in particular cases. Imagine that the optimal route is on the right side of the tree (or the last route in the tree in the limit case). The parallel algorithm will need less work than the sequential one in this case. The last workers might find the optimal route very quickly and inform the other workers about the optimal minimum; those workers can then prune branches very effectively. The sequential algorithm, in contrast, will need to traverse many branches in order to reach the optimal one. If the parallel code does fewer searches than the sequential one, we say that the search overhead is negative.
+
Negative search overhead is very good for parallel speedups, but it depends on the input values. We cannot rely on it to speed up the parallel execution of the algorithm.
+
+
+
@@ -8309,7 +8317,8 @@ d) Yes, because the global minimum distance can be found more quickly, enabling
-
Option 3: Dynamic load balancing with replicated workers model¶
In our parallel implementation, we will use a coordinator process and several worker processes. The coordinator process (or master) searches the tree up to a certain maximum depth maxhops. When maxhops is reached, the coordinator creates a job and delegates it to a worker. The workers repeatedly get work from the master and execute it. This is an example of dynamic load balancing: the load is distributed among the workers during runtime.
+
Option 3: Dynamic load balancing with replicated workers model¶
In this third option, we explain a strategy to improve load balance based on the replicated workers model, also known as worker pool or thread pool. In this model, the main process (aka master or coordinator process) sends jobs to a job queue. Then, workers take one available job from the queue, run it, and take a new job when they are done. In this process, workers never wait for other workers, thus fixing the load balance problem. It does not matter if some jobs are larger than others as long as there are enough jobs to keep the workers busy. The main limiting factor of this model is the number of jobs and the speed at which the main process is able to generate jobs and send them to the queue. This is an example of dynamic load balancing: the load is distributed among the workers at runtime.
+
In our parallel implementation, we will use a coordinator process and several worker processes. The coordinator process will search the tree up to a certain maximum depth given by a number of hops/levels maxhops. When maxhops is reached, the coordinator will stop searching the tree and will let any available worker continue searching in the subtree. In the figure below, the master process will only visit the nodes in the top green box. The worker processes will search the subtrees below in parallel.
@@ -8338,34 +8347,20 @@ d) Yes, because the global minimum distance can be found more quickly, enabling
-
+
-
-Question: To find the right maxhops level is a tradeoff between...
-
-
a) Communication overhead (large maxhops) and load imbalance (small maxhops).
-b) Search overhead (large maxhops) and load imbalance (small maxhops).
-c) the number of workers (large maxhops) and the job size (small maxhops).
-d) buffer for the job queue (large maxhops) and idle time of the coordinator process (small maxhops).
-
-
-
-
-
-
-
-
-
In [ ]:
-
-
-
answer="x"#Replace x with a,b,c, or d
-tsp_check_4(answer)
-
We introduced a new parameter maxhops. What is the optimal value for it? When choosing maxhops, there is a trade-off between load balance and communication overhead.
+
+
A small maxhops will reduce the number of jobs communicated to the workers (less communication), but reducing the number of jobs is bad for load balance. In the limit, we might generate even fewer jobs than the number of workers.
+
+
A large maxhops will increase the number of parallel jobs, improving dynamic load balance, but it will lead to more communication.
+
+
+
The optimal value of maxhops will depend on the given system, the number of workers, problem size, and also the particular input values. It is not possible to determine it in advance.
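To make this trade-off more concrete, the sketch below estimates how many jobs the coordinator would generate for a given maxhops. It rests on several assumptions that are not taken from the notebook's parallel code: the graph is fully connected, the coordinator does not prune, and the initial city counts as the first hop. The helper name `num_jobs` is hypothetical.

```julia
# Hypothetical helper: number of jobs the coordinator would generate for n
# cities, assuming a fully connected graph, no pruning by the coordinator,
# and that the initial city counts as the first hop.
num_jobs(n, maxhops) = prod((n - maxhops + 1):(n - 1))

n = 18
for maxhops in 1:4
    println("maxhops = $maxhops: $(num_jobs(n, maxhops)) jobs")
end
```

Under these assumptions, `n = 18` with `maxhops = 2` yields 17 jobs, while `maxhops = 3` already yields 272. With only a few workers, the former may be too coarse for good load balance, whereas larger values quickly multiply the number of messages.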
@@ -8376,7 +8371,8 @@ d) buffer for the job queue (large maxhops) and idle time of the coordinator pro
We will implement this algorithm using the task-based programming model provided by Distributed.jl as it is convenient to implement the replicated workers model.
+
First, let's add our worker processes.
@@ -8555,7 +8551,7 @@ d) buffer for the job queue (large maxhops) and idle time of the coordinator pro
We will demonstrate how the workers communicate the minimum distance with each other with a short example. Each worker generates a random value and updates a globally shared minimum. The variable for the global minimum is stored in a RemoteChannel. The buffer size of the channel is 1, such that only one channel can take and put new values to the channel at a time.
We will demonstrate how the workers communicate the minimum distance with each other with a short example. Each worker generates a random value and updates a globally shared minimum. The variable for the global minimum is stored in a RemoteChannel. The buffer size of the channel is 1, such that only one worker can take and put new values to the channel at a time, thus solving the race condition problem.
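The idea can be sketched as follows. This is only an illustration under stated assumptions, not necessarily the notebook's own cell: the names `min_chnl`, `candidate`, and `global_min` are chosen here, and the random integer stands in for a route length found by a worker. If no worker processes have been added, `workers()` returns the main process and the sketch still runs.

```julia
using Distributed

# Channel with buffer size 1 that stores the current global minimum.
min_chnl = RemoteChannel(() -> Channel{Int}(1))
put!(min_chnl, typemax(Int))

# Each worker draws a random candidate and updates the shared minimum.
futures = [
    @spawnat w begin
        candidate = rand(1:100)     # stand-in for a route length found by this worker
        current = take!(min_chnl)   # channel is now empty, so other workers block in take!
        put!(min_chnl, min(current, candidate))
        candidate                   # returned so the result can be checked afterwards
    end
    for w in workers()
]

candidates = fetch.(futures)        # waits until all workers have finished
global_min = take!(min_chnl)
println("candidates = ", candidates, ", global minimum = ", global_min)
```

Because a worker must `take!` the value before it can `put!` an updated one, the read-modify-write sequence cannot interleave with another worker's update.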
@@ -8721,7 +8717,7 @@ d) buffer for the job queue (large maxhops) and idle time of the coordinator pro
Next, we will test the correctness and performance of our parallel implementation by comparing the results of the parallel algorithm to the results of the serial algorithm for multiple problem instances.
Next, we will test the correctness and performance of our parallel implementation by comparing the results of the parallel algorithm to the results of the serial algorithm for multiple problem instances. Run it for different values of n and max_hops. Try to explain the impact of these values on the parallel efficiency.
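As a reminder of the quantities involved, the sketch below computes speedup and parallel efficiency from measured times. The helper names `speedup` and `efficiency` and the variable `TP` (parallel run time) are assumptions made for this illustration, and the timings are made up.

```julia
# Hypothetical helpers: T1 = serial time, TP = parallel time, P = number of workers.
speedup(T1, TP) = T1 / TP
efficiency(T1, TP, P) = speedup(T1, TP) / P

T1, TP, P = 8.0, 2.5, 4   # made-up timings, for illustration only
println("speedup = ", speedup(T1, TP), ", efficiency = ", efficiency(T1, TP, P))
```

An efficiency of 1 (100%) corresponds to ideal speedup on `P` workers. Values above 1 indicate super-linear speedup.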
@@ -8733,12 +8729,13 @@ d) buffer for the job queue (large maxhops) and idle time of the coordinator pro
In [ ]:
-
n=18# Safe to run up to 18
+
n = 18 # Safe to run up to 18 on a laptop
using Random
Random.seed!(1)
C = rand(1:10,n,n)
C_sorted = sort_neighbors(C)
city = 1
+verbose = false
T1 = @elapsed min_serial = tsp_serial(C_sorted,city)
max_hops = 2
P = nworkers()
@@ -8755,6 +8752,65 @@ d) buffer for the job queue (large maxhops) and idle time of the coordinator pro
+Question: For some values of `n` and `max_hops` the parallel efficiency can be above 100% (super-linear speedup). For example with `n=18` and `max_hops=2`, I get super-linear speedup on my laptop for some runs. Explain a possible cause for super-linear speedup in this algorithm.
+
+
+
We studied the solution of the TSP using a branch-and-bound strategy.
+
The problem has $O(N!)$ complexity in the worst-case scenario, where $N$ is the number of cities.
+
Luckily, the compute time can be drastically reduced in practice using the nearest city first heuristic and branch pruning.
+
Pruning, however, introduces load imbalance in the parallel code. To fix this, one needs a dynamic load balancing strategy, as the actual work per worker depends on the input matrix (runtime values).
+
A replicated workers model is useful to distribute work dynamically. However, it introduces a trade-off between load balance and communication depending on the value of maxhops.
+
The parallel code might suffer from positive search overhead (if the optimal route is on the left of the tree) or it can benefit from negative search overhead (if the optimal route is on the right of the tree).
+
In some cases, it is possible to observe super-linear speedup thanks to negative search overhead.
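The effect of search overhead can be illustrated with a toy model that is much simpler than the TSP code. It assumes a linear search over positions `1:n`, split into equal contiguous chunks among `P` workers that advance in lockstep and stop as soon as one of them reaches the target. The functions `serial_steps` and `parallel_steps` are hypothetical names for this sketch.

```julia
# Toy model (not the TSP code): steps needed to reach `target_pos` when scanning 1:n.
serial_steps(target_pos) = target_pos

# P workers scan contiguous chunks of equal size in lockstep;
# the search stops as soon as one worker reaches the target.
function parallel_steps(n, P, target_pos)
    chunk = cld(n, P)
    return mod1(target_pos, chunk)   # offset of the target inside its chunk
end

n, P = 1000, 4
for target_pos in (250, 751)
    S = serial_steps(target_pos) / parallel_steps(n, P, target_pos)
    println("target at $target_pos: speedup = $S, efficiency = $(S / P)")
end
```

When the target sits at the end of the first chunk, the parallel search takes as many steps as the serial one and the other workers do useless work (positive search overhead). When it sits at the start of the last chunk, the parallel search finds it almost immediately, which yields a speedup far above `P` (negative search overhead).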
+
+
+
+
+
@@ -8766,6 +8822,6 @@ d) buffer for the job queue (large maxhops) and idle time of the coordinator pro