Lecture 11: Odyssey!!!
Date: 10/05/2017, Thursday
You are expected to finish 8+1 tiny tasks. They will help you get prepared for the final project!
Related resources:
- Ryan’s Linux tutorial
- Intro-to-Odssey-S17_am111.pdf on Canvas.
- Odyssey quickstart guide
- MATLAB on Odyssey
- Parallel MATLAB on Odyssey
Task 1: Command line on your laptop
Preparation
Read the Session 4 notes, especially Ryan’s tutorial, if you missed Monday’s session.
After reading Chapter 1 through Chapter 5, you should know at least the following Linux commands:
- ls
- pwd
- mkdir
- cd
- mv
- rm and rm -rf
- cp and cp -r
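The commands above can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original tutorial; the file and directory names (a.txt, notes, notes_backup) are made up for illustration, and everything happens inside a temporary directory that is deleted at the end.

```shell
workdir=$(mktemp -d)       # scratch directory, so nothing real is touched
cd "$workdir"
pwd                        # print the current directory
mkdir notes                # make a directory
echo "draft" > a.txt       # create a small file to play with
cp a.txt b.txt             # copy a file
mv b.txt notes/            # move the copy into the directory
cp -r notes notes_backup   # copy a whole directory
rm a.txt                   # remove a single file
rm -rf notes               # remove a directory and everything in it
listing=$(ls)              # list what is left
echo "$listing"
cd - > /dev/null           # go back to where we started
rm -rf "$workdir"          # clean up the scratch directory
```

Be careful with rm -rf outside a scratch directory: it deletes without asking and there is no trash can.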
If you choose vi/vim as your text editor, read Chapter 6. Then you should know at least the following vim commands:
- i
- esc
- :wq
- :q!
Find your own tutorial if you choose other text editors.
Writing code in terminal
Task: Use vim or another command-line text editor to create a MATLAB file hello.m with the content disp('hello world!')
We use vim as an example.
First, create a text file by
vim hello.m
(If hello.m already exists, vim will simply open that file.)
Inside vim, type i to enter Insert Mode.
Then type the code as usual. For example
disp('hello world!')
After writing the content, press esc to go back to Command Mode.
Finally, type :wq to save and quit vim.
Again, read Chapter 6 for more on using vim!
Tip: You can check the content of hello.m with a graphical editor. On
Mac, you can run open ./ to open Finder in the current directory, and then open
the hello.m that you’ve just created. On Odyssey (see Task 2) there is no
graphical editor, so you will also use vim to check the file content.
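If you only want to create or inspect a tiny file, the shell itself is enough. This is a sketch of an alternative to the vim steps above, not a replacement for learning an editor; it writes hello.m into a temporary directory (an assumption made here so that no existing file is overwritten) and prints it back with cat.

```shell
tmpdir=$(mktemp -d)

# The quoted 'EOF' delimiter tells the shell to copy the text literally.
cat > "$tmpdir/hello.m" <<'EOF'
disp('hello world!')
EOF

# cat prints a file to the terminal, which also works on a machine without a GUI.
content=$(cat "$tmpdir/hello.m")
echo "$content"

rm -r "$tmpdir"
```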
Running MATLAB interactively in terminal
Windows users can skip to Task 2, because the following steps may not work on Windows.
Find the MATLAB executable path on your laptop. On Mac it should be something like
/Applications/MATLAB_R2017a.app/bin/matlab
Running the above command opens the usual graphical version of MATLAB.
To use only the command line, add 3 options:
/Applications/MATLAB_R2017a.app/bin/matlab -nojvm -nosplash -nodesktop
Play with this command-line version of MATLAB for a while. Type exit to quit.
Set shortcut
If you are tired of typing this long command, you can set an alias:
alias matlab='/Applications/MATLAB_R2017a.app/bin/matlab'
Then you can simply type matlab to launch the program. However, this
shortcut goes away when you close the terminal. To make it permanent,
add the above command to a configuration file in your home directory called
~/.bash_profile. You can edit it with vim, for example:
vim ~/.bash_profile
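The same change can be made without opening an editor. The sketch below appends the alias line only if it is not already there, so running it twice does not create duplicates. It deliberately writes to a temporary stand-in file rather than your real ~/.bash_profile (that substitution, and the helper name add_alias, are assumptions for illustration); point profile at ~/.bash_profile once you are comfortable with it.

```shell
# Stand-in for ~/.bash_profile so this sketch does not modify your real file.
profile=$(mktemp)
line="alias matlab='/Applications/MATLAB_R2017a.app/bin/matlab'"

add_alias() {
    # -F: fixed string, -x: match the whole line, -q: quiet.
    # Append the line only when grep does not find it.
    grep -Fxq "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}

add_alias
add_alias   # the second call changes nothing

count=$(grep -c '^alias matlab=' "$profile")
echo "alias lines in profile: $count"
rm -f "$profile"
```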
Running MATLAB scripts in terminal
cd to the directory where you saved the hello.m file. You can
execute it by starting MATLAB and then typing the script name at its prompt:
matlab -nojvm -nosplash -nodesktop
hello
Or you can use -r to combine the two commands:
matlab -nojvm -nosplash -nodesktop -r hello
If you didn’t set the shortcut, the full command would be
/Applications/MATLAB_R2017a.app/bin/matlab -nojvm -nosplash -nodesktop -r hello
(I actually prefer this command line version to the complicated graphic version!)
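With -r hello, MATLAB stays at its prompt after the script finishes. A common variation is to append exit to the statement so that MATLAB quits on its own. The sketch below only assembles and prints that command, and runs it solely when a matlab executable is found on the PATH (an alias defined in your interactive shell is not visible inside a script, so this is an assumption about your setup).

```shell
script=hello

# Quote the -r argument so the shell passes "hello; exit" as a single word.
set -- -nojvm -nosplash -nodesktop -r "$script; exit"
echo "matlab $*"

# Run it only when MATLAB is actually available.
if command -v matlab > /dev/null 2>&1; then
    matlab "$@"
fi
```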
Task 2: Command line on Odyssey
Login
Log in to Odyssey with
ssh am111uXXXX@login.rc.fas.harvard.edu
Check the Odyssey website if you have any trouble.
Tip: You can open multiple terminals and log in to Odyssey from each one, if one is not enough for you.
File transfer
Use scp
You can transfer files with the built-in scp (secure copy) command.
Make sure you run this command on your laptop, not on Odyssey.
From your laptop to Odyssey (first find your Odyssey home directory path by running pwd on Odyssey):
scp local_file_path username@login.rc.fas.harvard.edu:/path_shown_by_pwd_on_Odyssey
Try transferring the hello.m that you wrote in Task 1 to Odyssey! You will be asked to enter your password again.
Copying from Odyssey to your laptop just reverses the arguments:
scp username@login.rc.fas.harvard.edu:/file_path_on_odyssey local_file_path
Use scp -r to transfer a directory (similar to cp -r).
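To keep the three scp patterns straight, it can help to write them out with variables first. This sketch only prints the commands, because running them needs a real account. The user name and remote path are the placeholders used above, and my_project and hello_copy.m are made-up names.

```shell
# Placeholders: replace these with your own account and the path shown by pwd.
user="am111uXXXX"
host="login.rc.fas.harvard.edu"
remote_dir="/path_shown_by_pwd_on_Odyssey"

upload="scp hello.m $user@$host:$remote_dir/"
download="scp $user@$host:$remote_dir/hello.m ./hello_copy.m"
upload_dir="scp -r my_project $user@$host:$remote_dir/"

# The source always comes first and the destination second, as with cp.
printf '%s\n' "$upload" "$download" "$upload_dir"
```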
Task 3: MATLAB on Odyssey
Load MATLAB
Load MATLAB by
module load matlab
(If you get an error, run source new-modules.sh and try again.)
This loads the latest version by default. You can check the version with which:
[username]$ which matlab
alias matlab='matlab -singleCompThread'
/n/sw/matlab-R2017a/bin/matlab
Or you can load a specific version
module load matlab/R2017a-fasrc01
Use the RC portal to find available software and the corresponding load command. Search for MATLAB. How many different versions do you see?
Run MATLAB
After loading MATLAB, you can run it by: (same as on your laptop)
matlab -nojvm -nosplash -nodesktop
The 3 options are crucial because there’s no graphical user interface on Odyssey.
Play with it, and type exit to quit.
Run hello.m with
matlab -nojvm -nosplash -nodesktop -r hello
Task 4: Interactive Job on Odyssey
After logging in to Odyssey, you are on a login node with very few computational resources. For any serious computing work you need to switch to a compute node. The easiest way is to do this interactively (more about interactive mode):
srun -t 0-0:30 -c 4 -N 1 --pty -p interact /bin/bash
Here we request 30 minutes of computing time (-t 0-0:30) on 4 CPUs (-c 4), on a single computer (-N 1), using interactive mode (--pty and /bin/bash).
Warning: Don’t request too many CPUs! This will make you wait much longer.
-p interact only means you are requesting CPUs on the interactive partition; it doesn’t by itself make the job interactive. The following command starts interactive mode on the general partition (more about partition).
srun -t 0-0:30 -c 4 -N 1 --pty -p general /bin/bash
Then repeat what you’ve done in Task 3.
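The -t value uses Slurm's days-hours:minutes format, which is easy to mistype. As a small sketch (the variable names are made up here), the following converts a number of minutes into that format and prints the matching srun command without running it.

```shell
minutes=30   # how long you want the session to last

days=$((minutes / 1440))
hours=$(((minutes % 1440) / 60))
mins=$((minutes % 60))

# days-hours:minutes, for example 0-00:30 for half an hour
timelimit=$(printf '%d-%02d:%02d' "$days" "$hours" "$mins")

echo "srun -t $timelimit -c 4 -N 1 --pty -p interact /bin/bash"
```

Request only as much time as you expect to need; shorter requests generally start sooner.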
Task 5: Batch Job on Odyssey
If your job runs for hours or even days, you can submit it as a batch job, so you don’t need to keep your terminal open all the time. You are allowed to log out and go away while the job is running.
Create a file called runscript.sh with the following content. (You can use vim to create such a text file.)
#!/bin/bash
#SBATCH -J Matlabjob1
#SBATCH -p general
#SBATCH -c 1 # single CPU
#SBATCH -t 00:05:00
#SBATCH --mem=400M # memory
#SBATCH -o %j.o # output filename
#SBATCH -e %j.e # error filename
## LOAD SOFTWARE ENV ##
source new-modules.sh
module purge
module load matlab/R2017a-fasrc01
## EXECUTE CODE ##
matlab -nojvm -nodisplay -nosplash -r hello
It just puts the options you’ve used in Task 4 into a text file.
Make sure runscript.sh is in the same directory as hello.m, then execute
sbatch runscript.sh
Use sacct to check the job status. You should get some output files once
the job is finished (more about submitting and monitoring jobs).
Tip: Always test your code in interactive mode before submitting a batch job!
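On success, sbatch prints a line of the form "Submitted batch job" followed by the job ID, and that ID is what %j expands to in the output file names above. The sketch below extracts the ID from a sample line (the number 12345 is made up) and shows which files and which sacct call go with it. It does not contact the scheduler.

```shell
# Sample of what sbatch prints on success; the job ID here is invented.
sbatch_output="Submitted batch job 12345"

# Strip everything up to the last space, which leaves the job ID.
jobid=${sbatch_output##* }

echo "job id:       $jobid"
echo "output file:  $jobid.o"
echo "error file:   $jobid.e"
echo "check status: sacct -j $jobid"
```

On Odyssey you would capture the real line with sbatch_output=$(sbatch runscript.sh).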
Task 6: Use MATLAB-parallel on your laptop
Make sure you’ve installed the Parallel Computing Toolbox. To start the
command-line version in parallel mode, remove the -nojvm option.
(The original graphical version works as usual.)
matlab -nosplash -nodesktop
Initialize parallel mode by
parpool('local', 2)
Starting parallel pool (parpool) using the 'local' profile ...
connected to 2 workers.
ans =
Pool with properties:
Connected: true
NumWorkers: 2
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
Then run this script several times to make sure you get a speed-up from
the parallel for-loop (parfor):
n = 1e9;
% serial loop
X = 0;
tic
for i = 1:n
    X = X + 1;
end
T = toc;
fprintf('serial time: %f; result: %d \n',T,X)
% parallel loop: X is a reduction variable, so parfor can sum it across workers
X = 0;
tic
parfor i = 1:n
    X = X + 1;
end
T = toc;
fprintf('parallel time: %f; result: %d \n',T,X)
serial time: 2.724932; result: 1000000000
parallel time: 1.748450; result: 1000000000
Tip: For the command-line version of MATLAB, save the code as
parallel_timing.m, and then execute parallel_timing inside MATLAB.
Finally, shut down the parallel pool:
delete(gcp)
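To judge whether the speed-up is reasonable, divide the serial time by the parallel time and compare the result with the number of workers. Here is a quick sketch using the sample timings printed above and awk for the floating-point arithmetic; your own numbers will differ from run to run.

```shell
serial=2.724932     # serial time from the sample output above
parallel=1.748450   # parallel time from the sample output above
workers=2           # pool size passed to parpool

speedup=$(awk -v s="$serial" -v p="$parallel" 'BEGIN { printf "%.2f", s / p }')
efficiency=$(awk -v s="$serial" -v p="$parallel" -v w="$workers" \
    'BEGIN { printf "%.2f", s / p / w }')

echo "speed-up: ${speedup}x on $workers workers (efficiency $efficiency)"
```

An efficiency of 1.00 would mean that each worker contributes a full serial-speed share of the work.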
Task 7: Use MATLAB-parallel on Odyssey interactive mode
Repeat what you’ve done in Task 6, but on Odyssey. This might not be as straightforward as you expect!
You need to request enough memory for the Parallel Computing Toolbox:
srun -t 0-0:30 -c 4 -N 1 --mem-per-cpu 4000 --pty -p interact /bin/bash
The environment variable SLURM_CPUS_PER_TASK tells you how many CPUs are available:
echo $SLURM_CPUS_PER_TASK
4
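SLURM_CPUS_PER_TASK is typically not defined outside a job (for example on your laptop), so a script that reads it may want a fallback. A minimal sketch, assuming 1 is an acceptable default; the function name cpu_count is made up for this example.

```shell
cpu_count() {
    # Use the Slurm value when it is set, otherwise fall back to 1.
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

echo "CPUs available to this session: $(cpu_count)"
```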
For parallel support, you need to launch the program with matlab-default instead of matlab, as described here.
module load matlab
matlab-default -nosplash -nodesktop
Inside MATLAB, you can again check the number of CPUs by
getenv('SLURM_CPUS_PER_TASK')
ans = '4'
Initialize parallel mode by (this code works for any number of CPUs)
parpool('local', str2num(getenv('SLURM_CPUS_PER_TASK')) )
The initialization might take several minutes on Odyssey. Eventually you should see something like
ans =
Pool with properties:
Connected: true
NumWorkers: 4
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
Then, execute the parallel_timing.m script from Task 6. You should see a speed-up like this:
>> parallel_timing
serial time: 12.228084; result: 1000000000
parallel time: 2.667366; result: 1000000000
Task 8: MATLAB-parallel as batch Job
Slightly modify the script parallel_timing.m from Task 6. Call it parallel_timing_batch.m this time.
% start a pool with as many workers as Slurm gave us CPUs
parpool('local', str2num(getenv('SLURM_CPUS_PER_TASK')))
n = 1e9;
% serial loop
X = 0;
tic
for i = 1:n
    X = X + 1;
end
T = toc;
fprintf('serial time: %f; result: %d \n',T,X)
% first parallel loop
X = 0;
tic
parfor i = 1:n
    X = X + 1;
end
T = toc;
fprintf('parallel time: %f; result: %d \n',T,X)
% the same parallel loop, run a second time
X = 0;
tic
parfor i = 1:n
    X = X + 1;
end
T = toc;
fprintf('parallel time: %f; result: %d \n',T,X)
% shut down the pool
delete(gcp)
Then, change the runscript.sh from Task 5 accordingly:
#!/bin/bash
#SBATCH -J timing
#SBATCH -o timing.out
#SBATCH -e timing.err
#SBATCH -N 1
#SBATCH -c 4
#SBATCH -t 0-00:20
#SBATCH -p general
#SBATCH --mem-per-cpu 8000
source new-modules.sh
module load matlab
srun -n 1 -c 4 matlab-default -nosplash -nodesktop -r parallel_timing_batch
Submit this job. It will take many minutes to finish. Do you get the expected speed-up?
In timing.out, you should see something like
ans =
Pool with properties:
Connected: true
NumWorkers: 4
Cluster: local
AttachedFiles: {}
IdleTimeout: 30 minutes (30 minutes remaining)
SpmdEnabled: true
serial time: 7.635188; result: 1000000000
parallel time: 5.901599; result: 1000000000
parallel time: 3.516169; result: 1000000000
Parallel pool using the 'local' profile is shutting down.
Explain why the second parfor is faster than the first parfor.
Tip: Using a batch job for this kind of small computation is definitely overkill, as queuing and initializing take much longer than the actual computation. You will probably use the interactive mode much more often in this class.
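Once timing.out exists, the timings can be pulled out with awk instead of being read by eye. The sketch below recreates the three timing lines from the sample output above in a temporary file (so it runs anywhere) and reports the speed-up of each parfor run relative to the serial loop. On Odyssey you would point awk at your real timing.out instead.

```shell
tmpdir=$(mktemp -d)

# Sample lines copied from the expected output above.
cat > "$tmpdir/timing.out" <<'EOF'
serial time: 7.635188; result: 1000000000
parallel time: 5.901599; result: 1000000000
parallel time: 3.516169; result: 1000000000
EOF

# Field 3 is the time with a trailing semicolon, e.g. "7.635188;".
# Adding 0 makes awk convert the leading numeric part to a number.
report=$(awk '
    /^serial time:/   { serial = $3 + 0 }
    /^parallel time:/ { n++; printf "parfor run %d: %.2fx speed-up\n", n, serial / ($3 + 0) }
' "$tmpdir/timing.out")

echo "$report"
rm -r "$tmpdir"
```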
Bonus task: make your terminal prettier
Open ~/.bash_profile (for example, vim ~/.bash_profile) and add the
following lines.
For Mac
export CLICOLOR=1
export LSCOLORS=ExFxBxDxCxegedabagacad
For Linux (Odyssey)
alias ls="ls --color=auto"
Type source ~/.bash_profile or relaunch the terminal. Notice any
difference?
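If you keep one ~/.bash_profile for both your Mac and Odyssey, the two variants can live in the same file and be selected by operating system. This is a sketch under the assumption that uname -s reports Darwin on macOS and Linux on Odyssey; the settings themselves are the ones listed above.

```shell
os=$(uname -s)

if [ "$os" = "Darwin" ]; then
    # macOS: the built-in ls reads these two variables
    export CLICOLOR=1
    export LSCOLORS=ExFxBxDxCxegedabagacad
elif [ "$os" = "Linux" ]; then
    # Linux: GNU ls needs the --color option
    alias ls="ls --color=auto"
fi

echo "configured ls colors for: $os"
```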