Linux Basics¶

Feng Li¶

Central University of Finance and Economics¶

feng.li@cufe.edu.cn¶

Course home page: https://feng.li/distcomp¶

Why Linux?¶

  • Linux is a free, open-source operating system.

  • Almost all distributed computing software is available for Linux distributions.

How do I try a Linux computer?¶

  • Windows 10+ user:

    • Enable the Windows Subsystem for Linux feature.
    • Install a Linux distribution (Ubuntu or Debian) from the Microsoft Store.
  • Mac user: Terminal commands are very similar to Linux commands.

  • Install a real Linux system, as I did.

How much do we need to know about Linux?¶

  • Login to a Linux server

  • Navigation with basic Linux shell commands

  • File Manipulation and transformation

  • Use an editor within a Linux server

Login to a Linux server¶

  • Windows users need an SSH client

    • Windows 10+: Windows Terminal

    • Other Windows systems: PuTTY or Xshell

  • Mac or Linux users can use the system Terminal to log in to the server

    ssh -p 26506 teacher01@hz-t3.matpool.com
    
    

    where 26506 is your remote server's SSH port (-p can be omitted if the port is 22, the default), teacher01 is your username on the remote server, and hz-t3.matpool.com is the remote server's address.
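
    Typing the port, username, and address every time gets tedious. One common convenience, sketched below, is a host alias in an OpenSSH client configuration file. The alias name matpool and the scratch file are assumptions for illustration; in everyday use the same entry would normally live in ~/.ssh/config.

```shell
# Write a host alias to a scratch config file (normally this entry lives in ~/.ssh/config).
# "matpool" is a hypothetical alias; port, user, and address are the ones from the login command.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
Host matpool
    HostName hz-t3.matpool.com
    User teacher01
    Port 26506
EOF

# Show the entry that was written.
cat "$cfg"

# With this entry in place, the login command would shrink to:
#   ssh -F "$cfg" matpool      # or simply "ssh matpool" if the entry is in ~/.ssh/config
```

    The -F option tells ssh which configuration file to read, which makes it possible to try an entry before adding it to your real configuration.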

Run a remote Jupyter Notebook using the SSH tunnel¶

  • Login to a remote server

     ssh -p 28596 teacher01@hz-t3.matpool.com
    
    
  • Start a Jupyter notebook on the server

     jupyter notebook --port 8888
    
    

    Copy the URL http://localhost:8888/?token=a373718460df0437c443eeadb7250e7793d6524f80f129cc that Jupyter prints in the terminal; you will need it in the last step.

  • Start a terminal on your local machine and run:

      ssh  -p 28596 -L 8888:127.0.0.1:8888 -N teacher01@hz-t3.matpool.com
    
    

    where -L 8888:127.0.0.1:8888 forwards port 8888 on your local machine to port 8888 on the remote server (127.0.0.1 is resolved on the remote side), and -N tells ssh not to run a remote command, so no shell is opened on the remote.

  • Open your browser and connect to the remote Jupyter Notebook server through the local end of the tunnel with the link

    http://localhost:8888/?token=a373718460df0437c443eeadb7250e7793d6524f80f129cc
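
    The two port numbers given to -L do not have to match. As a sketch, suppose local port 8888 is already in use and you pick 9999 instead (an arbitrary choice for illustration). The snippet only assembles and prints the command, so it can be inspected before it is run.

```shell
# Assemble the tunnel command without running it (a dry run).
ssh_port=28596        # SSH port of the remote server, as in the steps above
local_port=9999       # hypothetical free port on your own machine
remote_port=8888      # port Jupyter listens on at the remote server
target="teacher01@hz-t3.matpool.com"

tunnel_cmd="ssh -p $ssh_port -L $local_port:127.0.0.1:$remote_port -N $target"
echo "$tunnel_cmd"

# The notebook would then be reachable at this local address
# (append the ?token=... part printed by Jupyter).
echo "http://localhost:$local_port/"
```

    The first number after -L always belongs to your own machine and the last one to the remote side.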

Navigation with basic Linux shell commands¶

  • Once you have logged in to the server, here are a few things you can do with shell commands:
In [2]:
echo $HOME
/home/fli

In [1]:
whoami
fli
In [4]:
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing

In [5]:
ls
figures
google-first-computer.jpeg
L01.1-Introduction-to-Distributed-Computing.ipynb
L01.1-Introduction-to-Distributed-Computing.slides.html
L01.2-Linux-Basics.ipynb
L01.2-Linux-Basics.slides.html
L01.3-Introduction-to-Hadoop.ipynb
L01.3-Introduction-to-Hadoop.slides.html

In [6]:
ls -htla
total 3.6M
drwxr-xr-x  4 fli fli 4.0K Feb 20 18:42 .
-rw-r--r--  1 fli fli  15K Feb 20 18:42 L01.2-Linux-Basics.ipynb
drwxr-xr-x  2 fli fli 4.0K Feb 20 17:36 .ipynb_checkpoints
drwxr-xr-x 17 fli fli 4.0K Feb 20 17:32 ..
-rw-r--r--  1 fli fli  28K Nov 21 12:46 L01.3-Introduction-to-Hadoop.ipynb
-rw-r--r--  1 fli fli 586K Nov 21 11:04 L01.3-Introduction-to-Hadoop.slides.html
-rw-r--r--  1 fli fli 815K Nov 21 11:04 L01.2-Linux-Basics.slides.html
-rw-r--r--  1 fli fli 560K Nov 21 11:04 L01.1-Introduction-to-Distributed-Computing.slides.html
-rw-r--r--  1 fli fli 1.7M Sep 30 22:50 google-first-computer.jpeg
-rw-r--r--  1 fli fli 7.3K Mar  3  2020 L01.1-Introduction-to-Distributed-Computing.ipynb
drwxr-xr-x  2 fli fli 4.0K Feb 15  2020 figures

In [7]:
cat /etc/passwd
root:x:0:0:root:/root:/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
systemd-timesync:x:101:102:systemd Time Synchronization,,,:/run/systemd:/usr/sbin/nologin
systemd-network:x:102:103:systemd Network Management,,,:/run/systemd:/usr/sbin/nologin
systemd-resolve:x:103:104:systemd Resolver,,,:/run/systemd:/usr/sbin/nologin
messagebus:x:105:110::/nonexistent:/usr/sbin/nologin
tss:x:106:111:TPM2 software stack,,,:/var/lib/tpm:/bin/false
avahi-autoipd:x:107:114:Avahi autoip daemon,,,:/var/lib/avahi-autoipd:/usr/sbin/nologin
usbmux:x:108:46:usbmux daemon,,,:/var/lib/usbmux:/usr/sbin/nologin
rtkit:x:109:115:RealtimeKit,,,:/proc:/usr/sbin/nologin
sshd:x:110:65534::/run/sshd:/usr/sbin/nologin
pulse:x:111:119:PulseAudio daemon,,,:/var/run/pulse:/usr/sbin/nologin
avahi:x:113:121:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/usr/sbin/nologin
saned:x:114:122::/var/lib/saned:/usr/sbin/nologin
colord:x:115:123:colord colour management daemon,,,:/var/lib/colord:/usr/sbin/nologin
geoclue:x:116:124::/var/lib/geoclue:/usr/sbin/nologin
hplip:x:117:7:HPLIP system user,,,:/var/run/hplip:/bin/false
fli:x:1000:1000:Feng Li,,,:/home/fli:/bin/zsh
systemd-coredump:x:999:999:systemd Core Dumper:/:/usr/sbin/nologin
Debian-exim:x:119:126::/var/spool/exim4:/usr/sbin/nologin
Debian-gdm:x:118:125:Gnome Display Manager:/var/lib/gdm3:/bin/false
nm-openconnect:x:104:129:NetworkManager OpenConnect plugin,,,:/var/lib/NetworkManager:/usr/sbin/nologin
uuidd:x:121:130::/run/uuidd:/usr/sbin/nologin
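
Each line of /etc/passwd holds seven colon-separated fields: login name, password placeholder, user ID, group ID, comment, home directory, and login shell. A small sketch of pulling out individual fields with cut follows; it uses two lines copied from the output above as sample data, so it does not depend on the accounts on your own system.

```shell
# Two sample lines taken from the /etc/passwd listing.
sample='root:x:0:0:root:/root:/bin/zsh
fli:x:1000:1000:Feng Li,,,:/home/fli:/bin/zsh'

# -d: sets the field delimiter, -f selects fields (1 = login name, 6 = home directory).
homes="$(printf '%s\n' "$sample" | cut -d: -f1,6)"
echo "$homes"
# root:/root
# fli:/home/fli
```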

File Manipulation¶

In [8]:
touch lifeng.txt # create a file


In [9]:
ls -htla lifeng.txt
-rw-r--r-- 1 fli fli 0 Feb 20 18:44 lifeng.txt

In [10]:
mkdir hello_linux


In [11]:
ls -htla .
total 3.6M
drwxr-xr-x  5 fli fli 4.0K Feb 20 18:44 .
drwxr-xr-x  2 fli fli 4.0K Feb 20 18:44 hello_linux
-rw-r--r--  1 fli fli    0 Feb 20 18:44 lifeng.txt
-rw-r--r--  1 fli fli  15K Feb 20 18:42 L01.2-Linux-Basics.ipynb
drwxr-xr-x  2 fli fli 4.0K Feb 20 17:36 .ipynb_checkpoints
drwxr-xr-x 17 fli fli 4.0K Feb 20 17:32 ..
-rw-r--r--  1 fli fli  28K Nov 21 12:46 L01.3-Introduction-to-Hadoop.ipynb
-rw-r--r--  1 fli fli 586K Nov 21 11:04 L01.3-Introduction-to-Hadoop.slides.html
-rw-r--r--  1 fli fli 815K Nov 21 11:04 L01.2-Linux-Basics.slides.html
-rw-r--r--  1 fli fli 560K Nov 21 11:04 L01.1-Introduction-to-Distributed-Computing.slides.html
-rw-r--r--  1 fli fli 1.7M Sep 30 22:50 google-first-computer.jpeg
-rw-r--r--  1 fli fli 7.3K Mar  3  2020 L01.1-Introduction-to-Distributed-Computing.ipynb
drwxr-xr-x  2 fli fli 4.0K Feb 15  2020 figures

In [12]:
cd hello_linux # change directory to hello_linux


In [13]:
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing/hello_linux

In [14]:
touch hello.py


In [15]:
ls
hello.py

In [16]:
rm hello.py


In [17]:
cd .. # switch to parent directory


In [18]:
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing

In [19]:
ls
figures
google-first-computer.jpeg
hello_linux
L01.1-Introduction-to-Distributed-Computing.ipynb
L01.1-Introduction-to-Distributed-Computing.slides.html
L01.2-Linux-Basics.ipynb
L01.2-Linux-Basics.slides.html
L01.3-Introduction-to-Hadoop.ipynb
L01.3-Introduction-to-Hadoop.slides.html
lifeng.txt

In [20]:
cd L1-Introduction
bash: cd: L1-Introduction: No such file or directory

In [21]:
pwd
/home/fli/nextcloud/teaching/distributed-computing/distcomp-slides/L01-Introduction-to-Distributed-Computing

In [22]:
mv hello_linux goodbye_linux


In [23]:
ls
figures
goodbye_linux
google-first-computer.jpeg
L01.1-Introduction-to-Distributed-Computing.ipynb
L01.1-Introduction-to-Distributed-Computing.slides.html
L01.2-Linux-Basics.ipynb
L01.2-Linux-Basics.slides.html
L01.3-Introduction-to-Hadoop.ipynb
L01.3-Introduction-to-Hadoop.slides.html
lifeng.txt

In [24]:
rm -rf goodbye_linux


In [25]:
ls
figures
google-first-computer.jpeg
L01.1-Introduction-to-Distributed-Computing.ipynb
L01.1-Introduction-to-Distributed-Computing.slides.html
L01.2-Linux-Basics.ipynb
L01.2-Linux-Basics.slides.html
L01.3-Introduction-to-Hadoop.ipynb
L01.3-Introduction-to-Hadoop.slides.html
lifeng.txt
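
The commands in this section can be practised safely in a throwaway directory. The sketch below repeats them in one go and adds cp, which copies a file; the directory comes from mktemp, so the file names cannot clash with anything you care about.

```shell
# Work inside a throwaway directory so nothing important can be touched.
workdir="$(mktemp -d)"
cd "$workdir"

mkdir hello_linux                 # create a directory
touch hello_linux/hello.py        # create an empty file inside it
cp hello_linux/hello.py copy.py   # copy a file
mv hello_linux goodbye_linux      # rename (move) the directory
rm copy.py                        # remove a single file
listing="$(ls -R)"                # -R lists subdirectories recursively
echo "$listing"

# Clean up: -r removes a directory and everything in it. There is no undo.
cd - > /dev/null
rm -r "$workdir"
```

Note that rm does not move files to a trash folder, so it is worth checking the path twice before using -r or -rf.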

Use an editor within a Linux server¶

  • nano : simple to use
  • vim : takes a little time to learn
  • emacs : steady learning curve

Need help with a command?¶

  • We take mkdir as an example

    • man mkdir shows the full manual page for the command

    • mkdir --help prints short help for the command
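
Some commands, such as cd, are built into the shell itself and may have no manual page of their own. A quick way to find out what kind of command you are dealing with is the type builtin. The sketch below assumes a Bourne-style shell such as bash; the exact wording of the messages can differ between shells.

```shell
# "type" reports how the shell would interpret a name.
kind_cd="$(type cd)"          # cd is part of the shell itself
kind_mkdir="$(type mkdir)"    # mkdir is a separate program on disk
echo "$kind_cd"
echo "$kind_mkdir"

# "command -v" prints just the path of an external program.
command -v mkdir
```

In bash, help cd prints the documentation for a builtin, which is the counterpart of man for external programs.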

Linux Pipelines¶

  • cat – Concatenate files
  • sort – Sort lines of text
  • uniq – Report or omit repeated lines
  • grep – Print lines matching a pattern
  • wc – Print newline, word, and byte counts for each file
  • head – Output the first part of a file
  • tail – Output the last part of a file
  • tee – Read from standard input and write to standard output and files
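
These small tools become useful when they are chained together. As a minimal sketch, the following counts how often each word occurs in a short, made-up list and reports the most frequent one.

```shell
# Count how often each word occurs and report the most frequent one.
words='apple
banana
apple
cherry
banana
apple'

# sort groups identical lines together, uniq -c counts each group,
# sort -rn orders the counts from largest to smallest, head keeps the first line.
top="$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1)"
echo "$top"

# grep -c counts matching lines; wc -l counts all lines.
n_apple="$(printf '%s\n' "$words" | grep -c '^apple$')"
n_total="$(printf '%s\n' "$words" | wc -l)"
echo "apple appears $n_apple times out of $n_total lines"
```

uniq only collapses neighbouring duplicates, which is why sort comes first.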

Standard Input, Output, and Error¶

  • Linux theme: "everything is a file."

  • The output of a shell program often consists of two types:

    • The program's results, that is, the data the program is designed to produce.
    • Status and error messages that tell us how the program is getting along.
  • Programs such as ls send

    • their results to a special file called standard output (often written stdout) and

    • their status messages to another file called standard error (stderr).

    • By default, both standard output and standard error are connected to the screen and are not saved to a disk file.

  • Many programs take input from a facility called standard input (stdin), which is, by default, attached to the keyboard.
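
That the two output streams really are separate can be seen with a small experiment. The function report below is a made-up stand-in for any program that prints both a result and a status message; the shell refers to standard output as file descriptor 1 and to standard error as file descriptor 2.

```shell
# A tiny function that writes to both streams.
report() {
    echo "result: 42"                 # file descriptor 1, standard output
    echo "warning: low battery" >&2   # file descriptor 2, standard error
}

# Command substitution captures standard output only.
out="$(report 2>/dev/null)"

# Point stderr at the capture first, then discard stdout, to capture only the messages.
err="$(report 2>&1 >/dev/null)"

echo "stdout held: $out"
echo "stderr held: $err"
```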

Redirecting Standard Output and Errors¶

In [26]:
ls -l /usr/bin > ls-output.txt


In [27]:
ls -l ls-output.txt
-rw-r--r-- 1 fli fli 154955 Feb 20 18:44 ls-output.txt

In [34]:
head ls-output.txt
total 422588
-rwxr-xr-x 1 root root       60224 Oct 24 03:29 [
-rwxr-xr-x 1 root root          96 Jan 27 10:05 2to3-2.7
-rwxr-xr-x 1 root root          39 Aug 15  2020 7z
-rwxr-xr-x 1 root root          40 Aug 15  2020 7za
-rwxr-xr-x 1 root root          40 Aug 15  2020 7zr
lrwxrwxrwx 1 root root          52 Feb  5 06:04 a2ping -> ../share/texlive/texmf-dist/scripts/a2ping/a2ping.pl
-rwxr-xr-x 1 root root          39 Dec 21 23:28 a2x
lrwxrwxrwx 1 root root          54 Feb  5 06:04 a5toa4 -> ../share/texlive/texmf-dist/scripts/pfarrei/a5toa4.tlu
lrwxrwxrwx 1 root root          25 Sep 22  2019 aclocal -> /etc/alternatives/aclocal

Append redirected output to a file¶

In [ ]:
ls -l /usr/bin >> ls-output.txt
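
The difference between > and >> is easiest to see on a small file. In this sketch a scratch file from mktemp stands in for ls-output.txt.

```shell
log="$(mktemp)"               # scratch file standing in for ls-output.txt

echo "first run"  > "$log"    # > creates the file or empties it, then writes
echo "second run" > "$log"    # the first line is gone now
echo "third run" >> "$log"    # >> keeps what is there and adds to the end

contents="$(cat "$log")"
echo "$contents"
# second run
# third run
rm "$log"
```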

Redirecting Standard Error¶

In [29]:
ls -l /bin/usr 2> ls-error.txt
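
The 2 in 2> is the file descriptor number of standard error. The following sketch, which uses a temporary directory and a path that is assumed not to exist, shows that the error message lands in the file given to 2> while standard output stays empty, and then how 2>&1 collects both streams in one file.

```shell
tmp="$(mktemp -d)"

# The path below does not exist, so ls writes only an error message.
# "|| true" ignores the non-zero exit status so a script keeps going.
ls -l "$tmp/no-such-dir" > "$tmp/out.txt" 2> "$tmp/err.txt" || true

out_size="$(wc -c < "$tmp/out.txt")"   # 0 bytes: nothing reached standard output
err_text="$(cat "$tmp/err.txt")"       # the error message that was kept off the screen
echo "stdout bytes: $out_size"
echo "stderr said: $err_text"

# 2>&1 sends standard error to wherever standard output currently points,
# so both streams end up in one file. The order matters: "> file" must come first.
ls -l "$tmp" "$tmp/no-such-dir" > "$tmp/all.txt" 2>&1 || true
all_lines="$(wc -l < "$tmp/all.txt")"
rm -r "$tmp"
```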


Redirecting Standard Input¶

In [33]:
head ls-output.txt
total 422588
-rwxr-xr-x 1 root root       60224 Oct 24 03:29 [
-rwxr-xr-x 1 root root          96 Jan 27 10:05 2to3-2.7
-rwxr-xr-x 1 root root          39 Aug 15  2020 7z
-rwxr-xr-x 1 root root          40 Aug 15  2020 7za
-rwxr-xr-x 1 root root          40 Aug 15  2020 7zr
lrwxrwxrwx 1 root root          52 Feb  5 06:04 a2ping -> ../share/texlive/texmf-dist/scripts/a2ping/a2ping.pl
-rwxr-xr-x 1 root root          39 Dec 21 23:28 a2x
lrwxrwxrwx 1 root root          54 Feb  5 06:04 a5toa4 -> ../share/texlive/texmf-dist/scripts/pfarrei/a5toa4.tlu
lrwxrwxrwx 1 root root          25 Sep 22  2019 aclocal -> /etc/alternatives/aclocal

In [ ]:
cat ls-output.txt ls-error.txt > ls_all.txt
In [ ]:
ls -htla ls_all.txt
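
Standard input can be redirected as well: < file makes a command read from a file instead of the keyboard. A minimal sketch with a scratch file (the contents are made up for illustration):

```shell
tmp="$(mktemp)"
printf '%s\n' pear apple mango > "$tmp"

# sort is given no file name, so it reads standard input, which < connects to the file.
sorted="$(sort < "$tmp")"
echo "$sorted"
# apple
# mango
# pear

# With <, wc never learns the file name, so it prints only the count.
count="$(wc -l < "$tmp")"
echo "lines: $count"
rm "$tmp"
```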

Pipelines¶

The ability of commands to read data from standard input and send results to standard output is used by a shell feature called pipelines, which have the form

command1 | command2 
In [31]:
ls -l /usr/bin | wc
   2328   22201  154955

In [32]:
ls /bin /usr/bin | sort | head
[
[
2to3-2.7
2to3-2.7
7z
7z
7za
7za
7zr
sort: write failed: 'standard output': Broken pipe
sort: write error
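
The Broken pipe message above is most likely harmless: head stops reading after ten lines, so sort loses its reader. The repeated names appear because both directories contribute the same entries. As a sketch on made-up sample data, adding uniq removes the repeats, and tee keeps a copy of the stream at that point in the pipeline.

```shell
tmp="$(mktemp -d)"

# Sample data with repeats, standing in for the output of ls /bin /usr/bin.
names='7z
[
7z
2to3-2.7
['

# sort puts equal lines next to each other, uniq drops the repeats,
# tee saves a copy of the stream to a file while passing it on, wc -l counts it.
n_unique="$(printf '%s\n' "$names" | sort | uniq | tee "$tmp/unique.txt" | wc -l)"
echo "unique names: $n_unique"

saved="$(wc -l < "$tmp/unique.txt")"   # the file holds the same three lines
rm -r "$tmp"
```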

Upload and download data¶

  • The simplest way is to use a graphical tool such as FileZilla.

  • When you are more comfortable with command-line tools, you can try scp or rsync.
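
As a rough sketch of what such transfers look like, the snippet below assembles typical scp and rsync commands and prints them rather than running them. The connection details are the ones from the login example; the file and directory names (data.csv, results, project) are hypothetical.

```shell
# Connection details from the login example; the file and directory names are hypothetical.
port=26506
remote="teacher01@hz-t3.matpool.com"

# scp uses an upper-case -P for the port (lower-case -p means "preserve times and modes").
upload_cmd="scp -P $port data.csv $remote:~/data/"
download_cmd="scp -P $port $remote:~/results/output.csv ."

# rsync copies only what changed; -a keeps permissions and times, -v is verbose,
# -z compresses in transit, and -e names the remote shell to use.
sync_cmd="rsync -avz -e 'ssh -p $port' project/ $remote:~/project/"

# Print the commands instead of running them, so they can be checked first.
printf '%s\n' "$upload_cmd" "$download_cmd" "$sync_cmd"
```

For both tools the argument order is source first, destination last, and a remote location is written as user@host:path.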