Linux Basics¶
Feng Li¶
Guanghua School of Management¶
Peking University¶
feng.li@gsm.pku.edu.cn¶
Course home page: https://feng.li/bdcf¶
Why Linux?¶
Linux is a free, open-source operating system.
Almost all distributed software is available for Linux distributions.
How do I try Linux on my machine?¶
Windows 10+ user:
- Enable the Windows Subsystem for Linux feature.
- Install a Linux distribution (Ubuntu or Debian) from the Microsoft Store.
Mac user: Terminal commands are very similar to Linux commands.
Or install a real Linux system, as I did.
How much do we need to know about Linux?¶
Log in to a Linux server
Navigation with basic Linux shell commands
File Manipulation and transformation
Use an editor within a Linux server
Log in to a Linux server¶
Windows users: you need Windows Terminal
Mac or Linux users: use the system Terminal to log in to the server
ssh -p 26506 teacher01@hz-t3.matpool.com
where
- 26506 is your remote server's SSH port (the -p option can be omitted if the port is the default, 22),
- teacher01 is your username on the remote server,
- hz-t3.matpool.com is your remote server's address.
Using PKU cluster¶
Apply for an account in the PKU HPC teaching cluster online
Log in to the cluster
- Using the SCOW web interface at https://scow-jx2.pku.edu.cn/
- Using SSH with a one-time password (OTP), see https://hpc.pku.edu.cn/ug/guide/access/
More information is available at https://hpc.pku.edu.cn/ug/guide/
Run a remote Jupyter Notebook or use an SSH tunnel¶
- Use the terminal to log in to a remote server
ssh 2412345678@wmjx2-login.pku.edu.cn
Start a Jupyter Application on the PKU server
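A minimal sketch of what an SSH tunnel to a remote Jupyter server could look like. The user and host are taken from the login example above. The two port numbers are assumptions (8888 is Jupyter's usual default), and whether Jupyter runs on the login node or elsewhere depends on the cluster, so treat this as an illustration rather than the cluster's documented procedure. The command is only printed here because running it needs a valid account.

```shell
# User and host come from the login example; both ports are assumptions.
remote_user="2412345678"
remote_host="wmjx2-login.pku.edu.cn"
local_port=8888     # assumed free port on your own machine
remote_port=8888    # assumed port where Jupyter listens on the server

# -N: open the connection without running a remote command
# -L: forward local_port on this machine to localhost:remote_port on the server
tunnel_cmd="ssh -N -L ${local_port}:localhost:${remote_port} ${remote_user}@${remote_host}"

# Print the command instead of running it.
echo "$tunnel_cmd"
```

With such a tunnel open, a browser pointed at http://localhost:8888 on your own machine would reach the remote notebook.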
Navigation with basic Linux shell commands¶
Once you have logged in to the server, here are a few things you can do with shell commands:
echo $HOME
/home/fli
whoami
fli
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing
ls
figures google-first-computer.jpeg L01.1-Introduction-to-Distributed-Computing.ipynb L01.1-Introduction-to-Distributed-Computing.slides.html L01.2-Linux-Basics.ipynb L01.2-Linux-Basics.slides.html L01.3-Introduction-to-Hadoop.ipynb L01.3-Introduction-to-Hadoop.slides.html
ls -htla
total 3.6M
drwxr-xr-x 4 fli fli 4.0K Feb 20 18:42 .
-rw-r--r-- 1 fli fli 15K Feb 20 18:42 L01.2-Linux-Basics.ipynb
drwxr-xr-x 2 fli fli 4.0K Feb 20 17:36 .ipynb_checkpoints
drwxr-xr-x 17 fli fli 4.0K Feb 20 17:32 ..
-rw-r--r-- 1 fli fli 28K Nov 21 12:46 L01.3-Introduction-to-Hadoop.ipynb
-rw-r--r-- 1 fli fli 586K Nov 21 11:04 L01.3-Introduction-to-Hadoop.slides.html
-rw-r--r-- 1 fli fli 815K Nov 21 11:04 L01.2-Linux-Basics.slides.html
-rw-r--r-- 1 fli fli 560K Nov 21 11:04 L01.1-Introduction-to-Distributed-Computing.slides.html
-rw-r--r-- 1 fli fli 1.7M Sep 30 22:50 google-first-computer.jpeg
-rw-r--r-- 1 fli fli 7.3K Mar 3 2020 L01.1-Introduction-to-Distributed-Computing.ipynb
drwxr-xr-x 2 fli fli 4.0K Feb 15 2020 figures
cat /etc/passwd
root:x:0:0:root:/root:/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
systemd-timesync:x:101:102:systemd Time Synchronization,,,:/run/systemd:/usr/sbin/nologin
systemd-network:x:102:103:systemd Network Management,,,:/run/systemd:/usr/sbin/nologin
systemd-resolve:x:103:104:systemd Resolver,,,:/run/systemd:/usr/sbin/nologin
messagebus:x:105:110::/nonexistent:/usr/sbin/nologin
tss:x:106:111:TPM2 software stack,,,:/var/lib/tpm:/bin/false
avahi-autoipd:x:107:114:Avahi autoip daemon,,,:/var/lib/avahi-autoipd:/usr/sbin/nologin
usbmux:x:108:46:usbmux daemon,,,:/var/lib/usbmux:/usr/sbin/nologin
rtkit:x:109:115:RealtimeKit,,,:/proc:/usr/sbin/nologin
sshd:x:110:65534::/run/sshd:/usr/sbin/nologin
pulse:x:111:119:PulseAudio daemon,,,:/var/run/pulse:/usr/sbin/nologin
avahi:x:113:121:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/usr/sbin/nologin
saned:x:114:122::/var/lib/saned:/usr/sbin/nologin
colord:x:115:123:colord colour management daemon,,,:/var/lib/colord:/usr/sbin/nologin
geoclue:x:116:124::/var/lib/geoclue:/usr/sbin/nologin
hplip:x:117:7:HPLIP system user,,,:/var/run/hplip:/bin/false
fli:x:1000:1000:Feng Li,,,:/home/fli:/bin/zsh
systemd-coredump:x:999:999:systemd Core Dumper:/:/usr/sbin/nologin
Debian-exim:x:119:126::/var/spool/exim4:/usr/sbin/nologin
Debian-gdm:x:118:125:Gnome Display Manager:/var/lib/gdm3:/bin/false
nm-openconnect:x:104:129:NetworkManager OpenConnect plugin,,,:/var/lib/NetworkManager:/usr/sbin/nologin
uuidd:x:121:130::/run/uuidd:/usr/sbin/nologin
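Each line of /etc/passwd holds seven colon-separated fields: user name, password placeholder, user ID, group ID, comment, home directory, and login shell. As a small sketch, the entry for fli from the listing above can be split into those fields with the shell's read builtin. The variable names are chosen here for illustration only.

```shell
# Sample entry copied from the /etc/passwd listing above.
entry='fli:x:1000:1000:Feng Li,,,:/home/fli:/bin/zsh'

# Split the line on ":" into its seven fields.
IFS=: read -r user pass uid gid gecos home shell <<EOF
$entry
EOF

echo "user=$user uid=$uid home=$home shell=$shell"
```

Note that the home directory field matches the value printed by echo $HOME for the same user.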
File Manipulation¶
touch lifeng.txt # create a file
ls -htla lifeng.txt
-rw-r--r-- 1 fli fli 0 Feb 20 18:44 lifeng.txt
mkdir hello_linux
ls -htla .
total 3.6M
drwxr-xr-x 5 fli fli 4.0K Feb 20 18:44 .
drwxr-xr-x 2 fli fli 4.0K Feb 20 18:44 hello_linux
-rw-r--r-- 1 fli fli 0 Feb 20 18:44 lifeng.txt
-rw-r--r-- 1 fli fli 15K Feb 20 18:42 L01.2-Linux-Basics.ipynb
drwxr-xr-x 2 fli fli 4.0K Feb 20 17:36 .ipynb_checkpoints
drwxr-xr-x 17 fli fli 4.0K Feb 20 17:32 ..
-rw-r--r-- 1 fli fli 28K Nov 21 12:46 L01.3-Introduction-to-Hadoop.ipynb
-rw-r--r-- 1 fli fli 586K Nov 21 11:04 L01.3-Introduction-to-Hadoop.slides.html
-rw-r--r-- 1 fli fli 815K Nov 21 11:04 L01.2-Linux-Basics.slides.html
-rw-r--r-- 1 fli fli 560K Nov 21 11:04 L01.1-Introduction-to-Distributed-Computing.slides.html
-rw-r--r-- 1 fli fli 1.7M Sep 30 22:50 google-first-computer.jpeg
-rw-r--r-- 1 fli fli 7.3K Mar 3 2020 L01.1-Introduction-to-Distributed-Computing.ipynb
drwxr-xr-x 2 fli fli 4.0K Feb 15 2020 figures
cd hello_linux # change directory to hello_linux
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing/hello_linux
touch hello.py
ls
hello.py
rm hello.py
cd .. # switch to parent directory
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing
ls
figures google-first-computer.jpeg hello_linux L01.1-Introduction-to-Distributed-Computing.ipynb L01.1-Introduction-to-Distributed-Computing.slides.html L01.2-Linux-Basics.ipynb L01.2-Linux-Basics.slides.html L01.3-Introduction-to-Hadoop.ipynb L01.3-Introduction-to-Hadoop.slides.html lifeng.txt
cd L1-Introduction
bash: cd: L1-Introduction: No such file or directory
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing
mv hello_linux goodbye_linux
ls
figures goodbye_linux google-first-computer.jpeg L01.1-Introduction-to-Distributed-Computing.ipynb L01.1-Introduction-to-Distributed-Computing.slides.html L01.2-Linux-Basics.ipynb L01.2-Linux-Basics.slides.html L01.3-Introduction-to-Hadoop.ipynb L01.3-Introduction-to-Hadoop.slides.html lifeng.txt
rm -rf goodbye_linux
ls
figures google-first-computer.jpeg L01.1-Introduction-to-Distributed-Computing.ipynb L01.1-Introduction-to-Distributed-Computing.slides.html L01.2-Linux-Basics.ipynb L01.2-Linux-Basics.slides.html L01.3-Introduction-to-Hadoop.ipynb L01.3-Introduction-to-Hadoop.slides.html lifeng.txt
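The commands above can be replayed safely in a throwaway directory, which is a reasonable habit while learning because rm -rf deletes without asking and cannot be undone. The sketch below assumes mktemp is available (it is on typical Linux and macOS systems) and uses rmdir, which is not shown in the slides and only removes empty directories.

```shell
# Work in a throwaway directory so that nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir hello_linux              # create a directory
touch hello_linux/hello.py     # create an empty file inside it
mv hello_linux goodbye_linux   # rename the directory
after_mv=$(ls)
echo "after mv: $after_mv"

rm -r goodbye_linux            # remove the directory and its contents
after_rm=$(ls)
echo "after rm: ${after_rm:-nothing left}"

cd - > /dev/null               # return to the previous directory
rmdir "$workdir"               # succeeds only because the directory is empty
```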
Use an editor within a Linux server¶
- nano: simple to use
- vim: takes a little time to learn
- emacs: steady learning curve
Need help with a command?¶
We take mkdir as an example:
- man mkdir shows the standard manual page for the command
- mkdir --help prints short help for the command
Linux Pipelines¶
- cat – Concatenate files
- sort – Sort lines of text
- uniq – Report or omit repeated lines
- grep – Print lines matching a pattern
- wc – Print newline, word, and byte counts for each file
- head – Output the first part of a file
- tail – Output the last part of a file
- tee – Read from standard input and write to standard output and files
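A short sketch of how these tools combine, with each command's output fed to the next through the | operator. The file names and the sample words are made up for this illustration.

```shell
tmp=$(mktemp -d)   # scratch directory for the demo files

# Hypothetical sample data: one word per line, with repeats.
printf 'pear\napple\npear\nfig\napple\npear\n' > "$tmp/words.txt"

# sort groups identical lines, uniq drops the repeats,
# tee saves a copy to a file while passing the data on to wc.
distinct=$(cat "$tmp/words.txt" | sort | uniq | tee "$tmp/unique.txt" | wc -l)
echo "distinct words: $((distinct))"

# grep -c counts the lines that match a pattern.
pears=$(grep -c pear "$tmp/words.txt")
echo "lines containing pear: $((pears))"

# head and tail pick the first and last line of the saved copy.
first=$(head -n 1 "$tmp/unique.txt")
last=$(tail -n 1 "$tmp/unique.txt")
echo "first: $first, last: $last"

rm -r "$tmp"
```

uniq only collapses adjacent duplicates, which is why sort comes before it.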
Standard Input, Output, and Error¶
Linux theme: "everything is a file."
The output of a shell program often consists of two types:
- The program's results, that is, the data the program is designed to produce.
- Status and error messages that tell us how the program is getting along.
Programs such as ls actually send their results to a special file called standard output (often expressed as stdout) and their status messages to another file called standard error (stderr).
By default, both standard output and standard error are linked to the screen and not saved into a disk file.
Many programs take input from a facility called standard input (stdin), which is, by default, attached to the keyboard.
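To make the two output streams concrete, here is a small sketch with a made-up function, report, that writes one line to each stream. Command substitution captures only standard output, so the warning never reaches the variable.

```shell
# Hypothetical function that writes to both streams.
report() {
  echo "result: 42"          # goes to standard output (file descriptor 1)
  echo "warning: demo" >&2   # goes to standard error (file descriptor 2)
}

# $(...) captures stdout only; stderr is discarded here via /dev/null.
captured=$(report 2>/dev/null)
echo "captured: $captured"
```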
Redirecting Standard Output and Errors¶
ls -l /usr/bin > ls-output.txt
ls -l ls-output.txt
-rw-r--r-- 1 fli fli 154955 Feb 20 18:44 ls-output.txt
head ls-output.txt
total 422588
-rwxr-xr-x 1 root root 60224 Oct 24 03:29 [
-rwxr-xr-x 1 root root 96 Jan 27 10:05 2to3-2.7
-rwxr-xr-x 1 root root 39 Aug 15 2020 7z
-rwxr-xr-x 1 root root 40 Aug 15 2020 7za
-rwxr-xr-x 1 root root 40 Aug 15 2020 7zr
lrwxrwxrwx 1 root root 52 Feb 5 06:04 a2ping -> ../share/texlive/texmf-dist/scripts/a2ping/a2ping.pl
-rwxr-xr-x 1 root root 39 Dec 21 23:28 a2x
lrwxrwxrwx 1 root root 54 Feb 5 06:04 a5toa4 -> ../share/texlive/texmf-dist/scripts/pfarrei/a5toa4.tlu
lrwxrwxrwx 1 root root 25 Sep 22 2019 aclocal -> /etc/alternatives/aclocal
Append redirected output to a file¶
ls -l /usr/bin >> ls-output.txt
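The difference between > and >> is easiest to see on a tiny file. In this sketch, log.txt is a hypothetical scratch file created in a temporary directory.

```shell
tmp=$(mktemp -d)
log="$tmp/log.txt"        # hypothetical scratch file

echo "first"  > "$log"    # > creates the file, or empties it if it already exists
echo "second" > "$log"    # > again: "first" is overwritten
echo "third"  >> "$log"   # >> keeps the existing content and appends

content=$(cat "$log")
echo "$content"           # prints "second", then "third"
rm -r "$tmp"
```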
Redirecting Standard Error¶
ls -l /bin/usr 2> ls-error.txt
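A sketch of how the two streams can be sent to separate files, or merged into one with 2>&1. The path no-such-dir is deliberately missing so that ls produces an error message. The trailing || true only keeps the script going after that expected failure.

```shell
tmp=$(mktemp -d)
missing="$tmp/no-such-dir"   # hypothetical path that does not exist

# Results go to out.txt, error messages go to err.txt.
ls -l "$missing" > "$tmp/out.txt" 2> "$tmp/err.txt" || true

out_state="empty"; if [ -s "$tmp/out.txt" ]; then out_state="non-empty"; fi
err_state="empty"; if [ -s "$tmp/err.txt" ]; then err_state="non-empty"; fi
echo "out.txt is $out_state, err.txt is $err_state"

# Merge both streams: redirect stdout first, then point stderr (2) at stdout (1).
ls -l "$tmp" "$missing" > "$tmp/all.txt" 2>&1 || true
all_state="empty"; if [ -s "$tmp/all.txt" ]; then all_state="non-empty"; fi
echo "all.txt is $all_state"

rm -r "$tmp"
```

The order matters in the merged form: writing 2>&1 before > all.txt would leave the error messages on the screen.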
Redirecting Standard Input¶
head ls-output.txt
total 422588
-rwxr-xr-x 1 root root 60224 Oct 24 03:29 [
-rwxr-xr-x 1 root root 96 Jan 27 10:05 2to3-2.7
-rwxr-xr-x 1 root root 39 Aug 15 2020 7z
-rwxr-xr-x 1 root root 40 Aug 15 2020 7za
-rwxr-xr-x 1 root root 40 Aug 15 2020 7zr
lrwxrwxrwx 1 root root 52 Feb 5 06:04 a2ping -> ../share/texlive/texmf-dist/scripts/a2ping/a2ping.pl
-rwxr-xr-x 1 root root 39 Dec 21 23:28 a2x
lrwxrwxrwx 1 root root 54 Feb 5 06:04 a5toa4 -> ../share/texlive/texmf-dist/scripts/pfarrei/a5toa4.tlu
lrwxrwxrwx 1 root root 25 Sep 22 2019 aclocal -> /etc/alternatives/aclocal
cat ls-output.txt ls-error.txt > ls_all.txt
ls -htla ls_all.txt
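The commands above open their input files by name. To attach a file to a command's standard input instead, the shell provides the < operator. A minimal sketch, using a made-up file fruits.txt:

```shell
tmp=$(mktemp -d)
printf 'banana\napple\ncherry\n' > "$tmp/fruits.txt"   # hypothetical input file

# The shell opens the file and connects it to the command's standard input.
sorted=$(sort < "$tmp/fruits.txt")
lines=$(wc -l < "$tmp/fruits.txt")

echo "$sorted"
echo "line count: $((lines))"
rm -r "$tmp"
```

When wc reads from standard input it prints only the count, without a file name, which makes the result easy to reuse in a script.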
Pipelines¶
The ability of commands to read data from standard input and send results to standard output is used by a shell feature called pipelines, which have the form
command1 | command2
ls -l /usr/bin | wc
2328 22201 154955
ls /bin /usr/bin | sort | head
[
[
2to3-2.7
2to3-2.7
7z
7z
7za
7za
7zr
sort: write failed: 'standard output': Broken pipe
sort: write error
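The "Broken pipe" message is expected here rather than a sign of a mistake: head exits once it has printed its first ten lines, so whatever sort still tries to write has no reader. The same effect can be sketched with yes, a standard utility not covered in these slides that prints y endlessly until its output is closed.

```shell
# On its own, yes would run forever; head ends the pipeline after three lines.
first=$(yes | head -n 3)
echo "$first"

count=$(printf '%s\n' "$first" | wc -l)
echo "lines read: $((count))"
```

Whether the writer reports an error, as sort did above, or stops silently depends on how the environment handles the pipe signal.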