A reference for numerical techniques
This is a brief review of basic numerical techniques, along with commands for using Linux and computational clusters.
Linux cheatsheet
Manipulate files and directories
- ls (List files in the current directory)
- cd dir (Enter a directory/folder)
- cd (Go back to the home directory)
- pwd (Show the current directory)
- mkdir dir (Create a directory/folder)
- rm file (Delete a file)
- rm -r dir (Delete a directory)
- cp file1 file_or_dir (Copy file1 as/to file_or_dir)
- mv file1 file_or_dir (Move/rename file1 to file_or_dir)
- more file (Show some contents of file)
- head file (Show the first 10 lines of file)
- tail file (Show the last 10 lines of file)
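As a quick illustration, a session combining several of these commands might look like the following sketch (the file and directory names are made up):

```bash
mkdir project            # create a working directory
cd project               # enter it
cp ~/data.txt .          # copy a file into it (the path is illustrative)
mv data.txt input.txt    # rename the file
head input.txt           # show its first 10 lines
cd                       # go back to the home directory
pwd                      # print the current directory
```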
Compression
- gzip file (Compress file to file.gz)
- gzip -d file.gz (Uncompress file.gz to file)
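For example, with a made-up file name:

```bash
gzip results.txt         # produces results.txt.gz and removes the original
gzip -d results.txt.gz   # restores results.txt
```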
Manage software
- sudo apt update (Refresh the apt package index)
- sudo apt upgrade (Upgrade all installed packages)
- sudo apt install build-essential (Install basic build tools and dependencies, almost always a good choice)
- sudo apt install pkg (Install pkg)
- sudo apt install -f (Fix the last installation)
- sudo apt upgrade pkg (Upgrade pkg)
- sudo apt autoremove and sudo apt clean (Clean up unused dependencies and cached package files)
- sudo apt-get --purge autoremove pkg (Remove pkg and its dependencies)
- dpkg -i pkg.deb (Install pkg through a .deb file)
- bash xxx.sh (Execute the shell script xxx.sh)
- wget website_address (Download files from a website)
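For example, installing a single package (gfortran here, purely as an illustration) typically goes:

```bash
sudo apt update                        # refresh the package index first
sudo apt install build-essential       # compilers and common build tools
sudo apt install gfortran              # the package itself
sudo apt autoremove && sudo apt clean  # tidy up unused dependencies and caches
```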
Vim shortcuts (See more commands on Vim Cheatsheet)
- :w (Save file)
- :q (Quit file; fails if there are unsaved changes)
- :q! (Force quit without saving changes)
- :wq (Save and quit)
Bash shortcuts
- ctrl+c (Stop current command)
- ctrl+a (Go to start of line)
- ctrl+e (Go to end of line)
Workflow for computational clusters
Log in and copy files
- ssh username@nots.rice.edu (Log in to the cluster)
- scp local_file.zip username@nots.rice.edu: (Copy a local file to the cluster; don't forget the colon)
- Alternatively, transfer files interactively with sftp: (1) sftp username@nots.rice.edu to open a session, (2) put local_file.zip to upload a file, (3) get remote_file.zip to download a file, (4) put -r local_directory to upload a directory, (5) get -r remote_directory to download a directory.
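A sketch of a typical transfer round trip with scp (the file and directory names are placeholders):

```bash
scp local_file.zip username@nots.rice.edu:      # upload to your cluster home directory
scp username@nots.rice.edu:remote_file.zip .    # download into the current local directory
scp -r local_directory username@nots.rice.edu:  # upload an entire directory
```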
Check and load modules
- module list (Show the currently loaded modules in my environment)
- module spider GCC (Search available modules related to GCC)
- module load GCC/9.3.0 OpenMPI/4.0.3 FFTW/3.3.8 (Load related modules, e.g. GCC, OpenMPI and FFTW)
- module save (Save the current environment)
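For instance, a typical session before building or running code might look like this (the FFTW query is just an example):

```bash
module spider FFTW                              # search for FFTW modules and how to load them
module load GCC/9.3.0 OpenMPI/4.0.3 FFTW/3.3.8  # load a consistent toolchain
module list                                     # verify what is now loaded
module save                                     # save this set as the default collection
```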
Submit and monitor jobs
- Write a slurm file; a sketch of one is shown after this list.
- sbatch myslurm (Submit the job)
- watch squeue -u username (Check the job status)
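A minimal slurm file sketch, assuming a partition named commons and the module versions loaded above; the job name, resource numbers and program name are placeholders to adapt for your own account:

```bash
#!/bin/bash
#SBATCH --job-name=myjob          # job name shown by squeue
#SBATCH --partition=commons       # assumed partition name; check your cluster's docs
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=1500M
#SBATCH --time=01:00:00           # walltime limit hh:mm:ss

# Load the same modules used to build the code (versions are assumptions)
module load GCC/9.3.0 OpenMPI/4.0.3 FFTW/3.3.8

# Do job I/O from scratch storage, not $HOME (see "Data and quotas" below)
cd $SHARED_SCRATCH/$USER

srun ./my_program input.txt > output.log
```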
How to run graphical applications, e.g. Mathematica? (For reference only)
- Download and run Xming.
- To connect the current bash terminal to the X11 server, type export DISPLAY=localhost:0.0.
- Connect to the cluster through ssh -X username@nots.rice.edu.
- Type xterm to see if a window opens. If yes, it works!
- In order to run Mathematica, we need to reserve a compute node, say on the "scavenge" partition, by using srun --pty --x11 --export=ALL --partition scavenge --nodes=1 --ntasks-per-node=1 --cpus-per-task=20 --mem-per-cpu=1500M --time=00:40:00 $SHELL. Here --export=ALL means exporting the same environment to the compute node.
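Once the interactive shell opens on the compute node, a session might look like the sketch below; the Mathematica module name is an assumption, so check module spider first:

```bash
# On the compute node, after the srun shell opens:
module load Mathematica   # module name/version is an assumption
mathematica &             # launch the front end; the window is forwarded over X11
```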
Data and quotas (For reference only)
- \$HOME: The home directory is where your sessions begin by default. Its intended use is for personal storage of data and files. Since it’s not designed for I/O, jobs should not be submitted from here.
- \$PROJECTS: The project storage is intended for storing datasets, files and software that are accessible by all members of the PI group.
- \$WORK: This directory is similar to \$PROJECTS, but is only accessible from login nodes. Hence jobs should not be submitted from here.
- \$SHARED_SCRATCH: This is intended for reading and writing data (job I/O) on the cluster. It may also be used as a temporary location to hold files if you are over quota. To use it, create a directory for yourself and work with files inside that directory. Files here are purged regularly, so always copy your data to other directories.
- \$TMPDIR: This is similar to \$SHARED_SCRATCH, but is designed for compute nodes. Data and files here will be purged at the end of each job.
- In short, submit jobs from and output data to \$SHARED_SCRATCH or \$PROJECTS, then copy data to \$HOME, \$PROJECTS or \$WORK at the end of each job, as in the sketch after this list.
- NOTE: The physical paths for the above file systems are subject to change. You should always access the filesystems using environment variables, especially in job scripts.
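A sketch of that end-of-job pattern inside a slurm file; the directory layout and program name are illustrative assumptions:

```bash
#!/bin/bash
# Do job I/O in scratch, addressed via environment variables as noted above
WORKDIR=$SHARED_SCRATCH/$USER/job_$SLURM_JOB_ID
mkdir -p "$WORKDIR"
cd "$WORKDIR"

srun ./my_program > results.dat   # the actual computation, writing output to scratch

# Copy results to permanent storage before scratch is purged
mkdir -p "$PROJECTS/results"
cp results.dat "$PROJECTS/results/"
```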