Linux Systems Administration

Author: C. Sean Burns
Date: 2022-03-09
Email: sean.burns@uky.edu
Twitter: @cseanburns Website: cseanburns.net

This short book is based on a series of lectures for my course on Linux Systems Administration.

This is a very rough draft that covers the Fedora distribution. I plan to change to Ubuntu in the next release during the Fall 2022 semester.

Introduction

This section is an intro to Linux, its history, and its basic usage.

History of Unix and Linux

Location: Bell Labs, part of AT&T (New Jersey), late 1960s through early 1970s

  • Multics, a time sharing system (that is, more than one person could use it at once)
  • Multics had issues and was slowly abandoned
  • Ken Thompson found an old PDP-7. Started to write UNIX.
  • Around this time, the ed text editor was also written. Its name is pronounced as two separate letters: "e-d".
  • This version of UNIX would later be referred to as Research Unix
  • Dennis Ritchie joined him and created the C language (In October 2011, Steve Jobs passed away a week before Dennis Ritchie, but the world mourned Jobs and Ritchie's death went largely unnoticed).

Location: Berkeley, CA (University of California, Berkeley), early to mid 1970s

  • The code for UNIX was not 'free' but low cost and easily shared.
  • Ken Thompson visited Berkeley and helped install Version 6 of UNIX https://en.wikipedia.org/wiki/Berkeley_Software_Distribution.
  • Bill Joy and others contributed heavily (Joy created vi, which Vim descends from).
  • This installation of UNIX would eventually become known as the Berkeley Software Distribution, or BSD.

AT&T

  • Until its breakup in 1984, AT&T was not allowed to profit off of patents that were not directly related to its telecommunications businesses.
  • This agreement with the US government helped protect the company from charges of monopoly, and as a result, they could not commercialize UNIX.
  • This changed after the breakup. System V UNIX became the standard bearer of commercial UNIX.

Location: Boston, MA (MIT), early 1980s through early 1990s

  • In the late 1970s, Richard Stallman began to notice that software was becoming commoditized and that, as a result, hardware vendors were no longer sharing the code they developed to make their hardware work. During much of his education, software code was not eligible for copyright protection (this changed under the Copyright Act of 1976).
  • Stallman, who thrived in a hacker culture (Wikipedia page on Stallman), began to wage battles against this.
  • Stallman created the [GNU project][gnuproject] and its philosophy (he is also the creator of GNU Emacs). The project is an attempt to create a completely free, Unix-like operating system called GNU.
  • By the early 1990s, Stallman and others had developed all the utilities needed to have a full operating system, except for a kernel.
  • This includes the Bash shell, written by Brian Fox.
  • The philosophy includes several propositions that define free software:

The four freedoms, per GNU Project

[https://www.gnu.org/philosophy/free-sw.html][fourfreedoms]

  1. The freedom to run the program as you wish, for any purpose (freedom 0).
  2. The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  3. The freedom to redistribute copies so you can help others (freedom 2).
  4. The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

The Unix wars and the lawsuit

  • Differences in AT&T Unix and BSD Unix arose. The former was aimed at commercialization and the latter aimed at researchers and academics.
  • UNIX Systems Laboratories, Inc. (USL, part of AT&T) sued Berkeley Software Design, Inc. (BSDi, part of the University of California, Berkeley) for copyright and trademark violations.
  • USL ultimately lost the case.

The Rise of Linux, Linus Torvalds, University of Helsinki, Finland

  • On August 25, 1991, [Linus Torvalds][linustorvalds] announced that he had started working on a free operating system kernel for the 386 CPU architecture and for the specific hard drives that he had. This kernel would later be named Linux.
  • Linux technically refers only to the kernel. An operating system kernel handles startup, devices, memory, resources, etc.
  • His motivation was to learn about OS development but also to have access to a Unix-like system. He already had access to a Unix-like system called MINIX, but MINIX had some technical and copyright restrictions.
  • Torvalds has stated that if 386BSD or the GNU Hurd had been available at the time, he might not have created the Linux kernel.
  • But Torvalds and others took the GNU utilities and created what is now called Linux, or GNU/Linux.

Distributions

  • Soon after the Linux kernel appeared, people began to create their own Linux and GNU based operating systems and to distribute them.
  • As such, these operating systems came to be referred to as distributions.
  • The two oldest distributions that are still in active development include:
    • Slackware
    • Debian

Short History of BSD

  • Unix version numbers 1-6 eventually led to BSD 1-4.
  • Through BSD 4.3, all versions contained some AT&T code. The desire to remove this code led to BSD Net/1.
  • All AT&T code was removed by BSD Net/2.
  • BSD Net/2 was ported to the Intel 386 processor. This became 386BSD and was made available a year after the Linux kernel was released, in 1992.
  • 386BSD split into two projects:
    • NetBSD
    • FreeBSD
  • NetBSD split into another project: OpenBSD.
  • All three of these BSDs are still in active development. From a bird's eye point of view, they each have different foci:
    • NetBSD focuses on portability (macOS, NASA)
    • FreeBSD focuses on wide applicability (WhatsApp, Netflix, PlayStation 4, macOS)
    • OpenBSD focuses on security (has contributed a number of very important applications)

Note: MacOS is based on [Darwin][puredarwin], is technically UNIX, and is partly based on FreeBSD with some code coming from the other BSDs.

Short History of GNU

  • The GNU Hurd is still being developed, but it is only in a pre-production state. The last release was version 0.9, in December 2016. A complete OS based on the GNU Hurd can be downloaded and run.

Free and Open Source Licenses

  • [GNU General Public License (GPL)][gnugpl]
  • [BSD License][bsdlicense]

Quick SSH Connect

Later in the semester, each of you will install virtual machine software so that you can install a Linux operating system and manage it on your own computer. Until then, we will need to connect to a remote server in order to acquire some basic Linux command line skills. To do that, we will use what is called SSH, or Secure Shell.

How you use SSH will depend on which operating system or OS version you are using. If you are using a macOS computer, then everything I do in this video will be exactly the same for you. Use Spotlight to search for Terminal.app and open it. You will see a command line prompt similar to mine (although, by default, I think macOS uses black text on a white background, but you can configure this in the app preferences).

If you are on a Windows machine, then I hope you are using Windows 10. If so, then you can install an SSH client. I don't use Windows and can't create a video for you, but I found one that looks pretty helpful. Follow the instructions in that video and see if it works for you. Alternatively, below is a link to some other instructions. I am not sure which one is more current, but between the two, you should be able to figure it out:

If neither of those work for you, then you should install PuTTY. This is a great app, and since it's available on Linux, I will show you how to use it in this video.

Linux and macOS

  • Open terminal
  • type: ssh username@ip-address
  • Enter your password
  • type less README
  • press q
  • type exit to exit the session

Windows with PuTTY

  • Open PuTTY
  • Enter IP Address in field marked Host Name (or IP address)
  • Click Open at the bottom
  • Enter your username at the prompt: login as:
  • Enter your password
  • type less README
  • press q
  • type exit to exit the session

PuTTY Configuration

If you use PuTTY, then you can configure font size and screen colors, if you want. The default font size is too small for me, and so that's something I'd change right away. To change the font size, in the PuTTY window:

  • Click on Fonts under the Window category
  • Click on Change... next to the item labeled Font used for ordinary text.
  • Select the font and size that you prefer

That's all for today. Keep these notes and next week we start to connect to our remote Linux server in order to learn some command line basics.

The Linux/Unix File System and File Types

In this demo, we will cover:

  • the Linux file system and how it is organized, and
  • the basic commands to work with directories and files

Throughout this demonstration, I encourage you to ssh into our remote server and follow along with the commands that I use.

Visualizing the Tree Structure

First, the term file system may refer to different concepts. In some cases, it refers to how data is stored and retrieved on a device like a hard drive, USB drive, etc. For example, macOS uses the Apple File System (APFS) by default, and Windows uses the New Technology File System (NTFS). Linux and other unix-like operating systems use a variety of file systems. Presently, the two major ones are ext4 and btrfs. The former is the default file system on distributions like Debian and Ubuntu, but the Fedora distribution recently switched to the latter. Opensource.com has a nice overview of file systems under this concept, and we will learn how to use some of them later in the semester when we create partitions, manage disk volumes, and learn about backups.

The other way the term file system might be used is to refer to the directory structure of a system. This concept is not always directly related to the prior concept of a file system. For example, on Windows, the root file system is identified by a letter, like the C: drive, regardless of whether the disk has an NTFS file system or a FAT file system. macOS adheres to a root file system like Linux and other unix-like operating systems. In these operating systems, we have a root, top-level directory identified by a /, and then subdirectories (or folders in GUI-speak) under that root directory. Linux.com has a nice overview of the most common directory structure that Linux distributions use, along with an explanation of the major base-level directories.

On Linux, we can visualize the filesystem with the tree command.

  • tree : list contents of directories in a tree-like format
    • tree -dfL 1 : directories only, full path, one level
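The tree command is not installed by default on every distribution. As a sketch, using a small example directory named demo (the names here are arbitrary), a similar one-level, directories-only view can also be approximated with find, which is available almost everywhere:

```shell
# Create a small example directory tree to inspect.
mkdir -p demo/bin demo/etc demo/home

# tree -dfL 1 demo would show the directories one level below demo;
# find gives a comparable listing: the directory itself plus its
# immediate subdirectories, printed with their relative paths.
find demo -maxdepth 1 -type d | sort
```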

The root Directory and its Base Level Directories

As explained on the Linux.com page, here are the major subdirectories under /:

  • /bin : binary files needed to use the system
  • /boot : files needed to boot the system
  • /dev : device files -- all hardware has a file
  • /etc : system configuration files
  • /home : user directories
  • /lib : libraries/programs needed for other programs
  • /media : where external storage is mounted
  • /mnt : where other file systems may be mounted
  • /opt : optional or add-on software
  • /proc : files containing info about your computer
  • /root : home directory of superuser
  • /run : used by system processes
  • /sbin : like /bin, binary files that require superuser privileges
  • /usr : user binaries, etc that might be installed by users
  • /srv : contains data for servers
  • /sys : contains info about devices
  • /tmp : temp files used by applications
  • /var : variable files, used often for system logs

Although there are 18 directories listed here and that stem from the root directory, there are some that we'll use much more often than others. For example, since the /etc directory contains system configuration files, we will use the contents of this directory, along with the /var directory, quite a bit when we set up our web servers, relational database servers, and more. The /home directory is where our default home directories are stored, and thus if you manage a multi-user system, like I do for this class, then this will be an important directory.

Source: [Linux Filesystem Explained][8]

Basic Directory and File commands

In order to explore the above directories but also to create new ones and work with files, we need to know some basic commands. A lot of these commands are GNU Coreutils, and in this demo, we will specifically cover some of the following:

Directory Listing

We have a few options to list directories, but the most common command is the ls command, and we use it like so:

ls

However, most commands can be used with options. In order to see what options are available for the ls command, we look at its man(ual) page:

man ls

From the ls man page, we learn that we can use the -l option to format the output of the ls command as a long-list, or a list that provides more information about the files and directories in the working directory.

We can use the -a option to list hidden files. In Linux, files are hidden from the default ls output if their names begin with a period. We have a number of those files in our $HOME directories, and we can see them like so:

ls -a

We can also combine options. For example, to view all files, including hidden ones, in the long-list format, we can use:

ls -al
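As a quick sketch of the difference these options make (the lsdemo directory and file names are just examples):

```shell
# Set up a directory containing one visible file and one hidden file.
mkdir -p lsdemo
touch lsdemo/visible.txt lsdemo/.hidden.txt

ls lsdemo       # shows only visible.txt
ls -a lsdemo    # also shows . and .. and .hidden.txt
ls -al lsdemo   # long-list format, including the hidden file
```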

Basic Operations

Some basic operation commands include:

  • cp for copying files and directories
  • mv for moving (or renaming) files and directories
  • rm for removing (or deleting) files and directories

These commands also have various options that can be viewed in their respective man pages. See:

man cp
man mv
man rm

Here are some ways to use these commands:

copy

To copy an existing file to a new file:

cp file.txt newfile.txt

move

To move an existing file in our $HOME directory to a subdirectory, like into our public_html directory:

mv file.html public_html/file.html

rename

To rename a file, we also use the mv command:

mv file.html newfile.html

move and rename

To move and rename a file:

mv file.html public_html/newfile.html

remove or delete

Finally, to delete a file:

rm file.html
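Putting the three commands together, here is a short self-contained sketch of the whole cycle (the file names are just examples):

```shell
echo "hello" > file.txt          # create a small file to work with
cp file.txt newfile.txt          # copy: file.txt and newfile.txt both exist
mv newfile.txt renamed.txt       # rename: newfile.txt is gone
rm file.txt                      # delete: file.txt is gone
ls renamed.txt                   # renamed.txt is the only one left
```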

Special File Types

For now, let's only cover two commands here:

  • mkdir for creating a new directory
  • rmdir for deleting a directory

Like the above commands, these commands also have their own set of options that can be viewed in their respective man pages:

man mkdir
man rmdir

make or create a new directory

We use these commands like we do the ones above. If we are in our $HOME directory and we want to create a new directory, we do:

mkdir documents

And if we run ls, we can see that it was successful.

delete a directory

The rmdir command is a bit weird because it only removes empty directories. To remove the directory we just created, we use it like so:

rmdir documents

However, if you want to remove a directory that contains files or other sub-directories, then you will have to use the rm command along with the -r option:

rm -r directory-with-content
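A short sketch showing why rmdir refuses a non-empty directory, and how rm -r handles it (the directory and file names are examples):

```shell
mkdir documents
rmdir documents                   # succeeds: the directory is empty

mkdir documents
touch documents/notes.txt
rmdir documents 2>/dev/null \
  || echo "rmdir refused: directory not empty"
rm -r documents                   # removes the directory and its contents
```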

Printing Text

There are a number of ways to print text to standard output, which is our screen by default in the terminal. We could also redirect standard output to a file, to a printer, or to a remote shell. We'll get to examples like that later in the semester. Here, let's cover two commands:

  • echo to print a line of text to standard output
  • cat to concatenate and write files

To use echo:

echo "hello world"
echo "Today is a good day."

cat is listed elsewhere in the GNU Coreutils page. The primary use of the cat command is to join, combine, or concatenate files, but if used on a single file, it has this nice side effect of printing the content of the file to the screen:

cat file.html
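Since concatenation is cat's primary purpose, here is a small sketch of that use (the file names are examples):

```shell
echo "first line" > a.txt
echo "second line" > b.txt
cat a.txt b.txt > combined.txt   # join the two files into a third
cat combined.txt                 # prints both lines in order
```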

If the file is very long, we might want to use a pager. There are a few options, but the less command is very useful:

less file.html

Conclusion

In this demo, we learned about the file system or directory structure of Linux, and we also learned some basic commands to work with directories and files. You should practice using these commands as much as possible. The more you use them, the easier it'll get. Also, be sure to review the man pages for each of the commands, especially to see what options are available for each of them.

Basic commands covered in this demo include:

  • ls : list
  • man : manual pages
  • cp : copy
  • mv : move or rename
  • rm : remove or delete a file or directory
  • mkdir : create a directory
  • rmdir : delete an empty directory
  • echo : print a line of text
  • cat : display contents of a file
  • less : display contents of a file by page
  • tree : list contents of directories in a tree-like format

File Attributes

Let's take a look at file attributes. Often we'll have to change the file permissions and owners of a file. This will become really important when we create our web servers.

  1. chmod for changing file permissions (or file mode bits)
  2. chown for changing file owner and group

We use chmod to change the file permissions:

# Make a file readable and writable for the user:
chmod u+rw file.txt

# Make a file executable for user and group:
chmod ug+x file.sh

# Make a file readable by the world:
chmod o+r file.html

Let's change the ownership of a file so that it's owned by a group we're in:

chown sean:sis_fac_staff file.sh

# Let's make it read only for the group:
chmod g-wx+r file.sh
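To see the effect of a chmod change, we can start from a known mode and then inspect the result. This sketch uses the GNU coreutils form of stat, and the file name is just an example:

```shell
touch file.sh
chmod 644 file.sh           # known starting point: rw-r--r--
chmod ug+x file.sh          # add execute for user and group
stat -c '%a %n' file.sh     # prints: 754 file.sh
ls -l file.sh               # the mode string reads -rwxr-xr--
```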

Conclusion

In this demo, we looked at two ways to change the attributes of a file. This includes changing the ownership of a file and the read, write, and execute permissions of a file.

The commands we used to change these attributes include:

  • chmod : for changing file permissions (or file mode bits)
  • chown : for changing file ownership

Processing Data: GNU Coreutils (Part 1)

Text Processing

We've touched on a few of these commands already, such as:

  1. touch
  2. cat
  3. echo
  4. pwd
  5. mkdir
  6. rmdir
  7. head
  8. wc

We also have commands for getting data on users:

  1. who
  2. w

Or the local time:

  1. date

Today I want to cover some file related commands for processing data in a file; specifically:

  1. sort for sorting lines of text files
  2. uniq for reporting or omitting repeated lines
  3. cut for removing sections from each line of files
  4. head for outputting the first part of files
  5. tail for outputting the last part of files

Let's look at a toy sample file that contains structured data as a CSV (comma-separated values) file:

cat operating-systems.csv

Chrome OS, Proprietary, 2009
FreeBSD, BSD, 1993
Linux, GPL, 1991
iOS, Proprietary, 2007
macOS, Proprietary, 2001
Windows NT, Proprietary, 1993
Android, Apache, 2008

To get data from the file:

# get the second field, where the fields are separated by a comma ","
cut -d"," -f2 operating-systems.csv

# get the third field
cut -d"," -f3 operating-systems.csv

# sort it, de-duplicate it, and save the result in a separate file
cut -d"," -f3 operating-systems.csv | sort | uniq > os-years.csv
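Here is the same pipeline as a self-contained sketch, recreating the sample file first with a here-document. Note that because the sample data has a space after each comma, every extracted field keeps a leading space:

```shell
# Recreate the sample data file.
cat > operating-systems.csv <<'EOF'
Chrome OS, Proprietary, 2009
FreeBSD, BSD, 1993
Linux, GPL, 1991
iOS, Proprietary, 2007
macOS, Proprietary, 2001
Windows NT, Proprietary, 1993
Android, Apache, 2008
EOF

# Year field, sorted and de-duplicated: 1993 appears twice
# in the data but only once in the output.
cut -d"," -f3 operating-systems.csv | sort | uniq
```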

If that CSV file has a header line, then we may want to remove it from the output. First, let's look at the file:

cat operating-systems.csv

OS, License, Year
Chrome OS, Proprietary, 2009
FreeBSD, BSD, 1993
Linux, GPL, 1991
iOS, Proprietary, 2007
macOS, Proprietary, 2001
Windows NT, Proprietary, 1993
Android, Apache, 2008

Say we want the license field data but need to remove that first header line; for that, we can use the tail command:

tail -n +2 operating-systems.csv | cut -d, -f2 | sort | uniq > license-data.csv
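As a self-contained sketch, this time with the header line included in the data:

```shell
# Recreate the sample data file, header included.
cat > operating-systems.csv <<'EOF'
OS, License, Year
Chrome OS, Proprietary, 2009
FreeBSD, BSD, 1993
Linux, GPL, 1991
iOS, Proprietary, 2007
macOS, Proprietary, 2001
Windows NT, Proprietary, 1993
Android, Apache, 2008
EOF

# Drop the header with tail, take the license field, then sort
# and de-duplicate; four distinct licenses remain.
tail -n +2 operating-systems.csv | cut -d"," -f2 | sort | uniq
```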

Conclusion

In this lesson, we learned how to process and make sense of data held in a text file. We drew upon some commands we learned in prior lessons that help us navigate the command line and create files and directories. We also added commands that let us sort and view data in different ways. The commands we used in this lesson include:

  • sort : for sorting lines of text files
  • uniq : for reporting or omitting repeated lines
  • cut : for removing sections from each line of files
  • head : for outputting the first part of files
  • tail : for outputting the last part of files
  • who : show who is logged on
  • w : show who is logged on and what they are doing.
  • touch : change file timestamps
  • cat : concatenate files and print on the standard output
  • echo : display a line of text
  • pwd : print name of current/working directory
  • mkdir : make directories
  • rmdir : remove empty directories
  • head : output the first part of files
  • wc : print newline, word, and byte counts for each file

We also used two types of operators, the pipe and the redirect:

  • | : redirect standard output of command1 to standard input of command2
  • > : redirect standard output to a file, overwriting
  • >> : redirect standard output to a file, appending

Processing Data: Grep, Sed, Awk (Part 2)

Introduction

Hi Class -- in this demo, I will cover three additional utilities for processing text: grep, sed, and awk. This page contains the entire transcript for the three programs, but I will break the video up into the three respective parts.

Thus far in class, we have learned about commands like wc, cat, cut, head, tail, sort, and uniq. There are other utilities that help us process data, and these include:

  • join for joining lines of two files on a common field
  • paste for merging lines of files

We have learned about the | pipe operator, which we use to redirect standard output to a different command so that the second (or third) command can process the output. An example is: sort file | uniq, which sorts a file first and then identifies the unique lines (by the way, files must be sorted before being piped to uniq).

We have learned about the > and >> redirect operators. They work like the pipe operator, but instead of directing output to a new command for that command to process, they direct output to a file. As a reminder, the single redirect > will overwrite a file or create a file if it does not exist. The double redirect >> will append to an existing file or create a file if it does not exist. It is thus safer to use the double redirect, but if you are processing large amounts of data, it could also mean creating really large files really quickly. If that gets out of hand, then you might crash your system.
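A quick sketch of the difference between the two redirect operators (log.txt is just an example name):

```shell
echo "one" > log.txt      # > creates log.txt (or overwrites it)
echo "two" >> log.txt     # >> appends a second line
echo "three" > log.txt    # > overwrites again: "one" and "two" are gone
cat log.txt               # prints: three
```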

The real magic of the Linux command line (and other Unix-like OSes) is this ability to use the pipe and redirect operators to string together multiple commands like the ones that we have covered and to redirect output to files.

Grep, Sed, and Awk

In addition to the utilities described above, we have additional, and very powerful, programs available to us for processing data. In this demo, I will introduce these: grep, sed, and awk. I use them more than I use some of the utilities we have covered so far.

Grep

The grep command is one of my most often used commands. Basically, grep "prints lines that match patterns" (see man grep). In other words, it's search, and it's super powerful.

grep works line by line. So when we use it to search a file for a string of text, it will return the whole line that includes the match. Remember, this line by line idea is part of the history of Unix-like operating systems, and it's super important to remember that most utilities and programs that we use on the command line have this as the basis of their approach.

Let's consider the file operating-systems.csv, as seen below:

OS, License, Year
Chrome OS, Proprietary, 2009
FreeBSD, BSD, 1993
Linux, GPL, 1991
macOS, Proprietary, 2001
Windows NT, Proprietary, 1993
Android, Apache, 2008

Quick note: In the code snippets below, and like in many of my examples, lines starting with a pound sign # signal a comment that explains the purpose of the command that follows on the next line.

# to search for a string; here we search for the string "Chrome"
grep "Chrome" operating-systems.csv
# repeat the search, but case insensitively; the default is case sensitive
grep -i "chrome" operating-systems.csv
# return the lines that do not match, case insensitively
grep -vi "chrome" operating-systems.csv

I used the tail command in class to show how we might use that to remove the header (1st line) in a file, but I don't use tail very often because I have grep. Part of the power with grep is that we can use what are called regular expressions (regex for short). Regex is a method used to identify patterns in text via abstractions. They can get complicated, but we can use some easy regex methods.

# remove the first line from the output; the caret (^) matches the start of a line
# thus, this drops lines that begin with "os", case insensitively
grep -vi "^os" operating-systems.csv
# remove the first line from the output; the dollar sign ($) matches the end of a line
# thus, this drops lines that end with "year", case insensitively
grep -vi "year$" operating-systems.csv

Other grep options:

# returns lines that have the string "proprietary"
grep -i "proprietary" operating-systems.csv
# get a count of those lines
grep -ic "proprietary" operating-systems.csv
# print only the match and not the whole line
grep -io "proprietary" operating-systems.csv
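These options can be tried on a small self-contained sample (the file name and its contents are just an example that mirrors the data above):

```shell
# Create a small sample file.
cat > os-sample.csv <<'EOF'
OS, License, Year
Linux, GPL, 1991
FreeBSD, BSD, 1993
EOF

grep -vi "^os" os-sample.csv   # drops the header line
grep -c "199" os-sample.csv    # counts matching lines: prints 2
grep -o "BSD" os-sample.csv    # prints each match alone; "BSD" matches
                               # twice on the FreeBSD line
```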

Sed

I spoke about the ed command. That and editors like vi or vim all belong to the same family of programs. sed belongs to this family, too. Specifically, sed is a "stream editor ... used to perform basic text transformations on an input stream (a file or input from a pipeline)" (see man sed). By default, sed writes its results to standard output, but it can also be used to edit files in place.

Like ed, vim, and grep, sed works line by line. Unlike grep, sed uses addresses to specify lines or ranges of lines, and these addresses are followed by a command.

Line numbering in text files starts at 1. So another way to remove the header line of our operating-systems.csv file is to simply delete the first line:

# delete (d command) line one from the output
sed '1d' operating-systems.csv

If I wanted to make that a permanent deletion, then I would use the -i option, which means that I would edit the file in place (see man sed).

# Let's work on a copy and not the original file just for the example
cp operating-systems.csv os.csv
# now let's delete the first line from standard output
sed -i '1d' os.csv

To refer to line ranges, then I add a comma between addresses:

# delete lines one through three; add the -i option to edit the file in place
sed '1,3d' operating-systems.csv

I can use sed to find and replace strings:

# find the string "Linux" and replace it with "GNU/Linux"
# the 's' command means substitute
# the text after the first forward slash is the search pattern
# the text after the second forward slash is the replacement
# the \ escapes the forward slash in "GNU/Linux" so sed treats it literally
# the trailing 'g' flag means "global", or all instances on a line;
# without it, sed would stop after the first instance on each line

sed 's/Linux/GNU\/Linux/g' operating-systems.csv
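sed also accepts other characters as the s command's delimiter, which avoids escaping the forward slash altogether. A sketch using | as the delimiter (the file here is a one-line example):

```shell
printf 'Linux, GPL, 1991\n' > os-line.csv
# With | as the delimiter, the / in "GNU/Linux" needs no escape.
sed 's|Linux|GNU/Linux|g' os-line.csv   # prints: GNU/Linux, GPL, 1991
```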

I can append (a) or insert (i) text:

# append line after line 3 with the 'a' command
sed '3a iOS, Proprietary, 2007' operating-systems.csv
# insert line at line 3 with the 'i' command
sed '3i iOS, Proprietary, 2007' operating-systems.csv

For example, say I forgot to add the shebang at the top of a script named test.sh. I could use sed like so (note: this command will only work on a non-empty file):

sed -i '1i #!/usr/bin/bash' test.sh

Awk

awk or gawk is a complete scripting language for "pattern scanning and processing" text (see man awk). It's a powerful language, and its focus is on columns of structured data.

In awk, columns of a file are identified by a dollar sign and then the number of the column. So, $1 indicates column 1, and $2 indicates column 2. $0 refers to the entire line (the whole record).

The syntax for awk is a little different. Basically, awk uses the following syntax, where pattern is optional.

awk pattern { action statements }

To print the first column of our file, then:

# print column one
awk '{ print $1 }' operating-systems.csv
# print column one that includes the term 'Linux'
awk '/Linux/ { print $1 }' operating-systems.csv

awk by default treats whitespace as the field delimiter. That's why, when I printed the first column above, only the term Windows appeared in the results even though it should be Windows NT. To specify that we want awk to treat the comma as the field delimiter, we use the -F option and surround the comma with single quotes:

# use -F to tell awk that the comma is the separator or delimiter
awk -F',' '{ print $1 }' operating-systems.csv

Now we can do a bit more with columns:

# print select columns, like column 1 and 3
awk -F',' '{ print $1 $3 }' operating-systems.csv
# make a report by adding some text
awk -F',' '{ print $1 " was founded in" $3 }' operating-systems.csv

Since awk is a full-fledged programming language, it supports variables, arithmetic, and string operations, which means it can do math or work on strings of text. Let's return lines where the year is greater than some number.

# print all of column 3
awk -F',' '{ print $3 }' operating-systems.csv
# print only the parts of column 3 that are greater than 2005
awk -F',' '$3 > 2005 { print $3 }' operating-systems.csv
# print only the parts of column 3 that are equal to 2007
awk -F',' '$3 == 2007 { print $3 }' operating-systems.csv
# print only the parts of columns 1 and 3 where column 3 equals 2007
awk -F',' '$3 == 2007 { print $1 $3 }' operating-systems.csv
# print the entire line where column three equals 2007
awk -F',' '$3 == 2007 { print $0 }' operating-systems.csv
# print only those lines where column 3 is greater than 2000 and less than 2008
awk -F',' '$3 > 2000 && $3 < 2008 { print $0 }' operating-systems.csv
# even though we wouldn't normally add years up, let's print a running total of column 3
awk -F',' 'sum += $3 { print sum }' operating-systems.csv
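The running-sum one-liner above prints the cumulative total on every line. To print only the final total once, the addition can go in the action block and the printing in an END block, which awk runs after the last input line. A self-contained sketch with example data:

```shell
# A small file with one year per line.
printf '2009\n1993\n1991\n' > years.txt

# Accumulate on every line; print once at the end.
awk '{ sum += $1 } END { print sum }' years.txt   # prints: 5993
```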

Here are a few basic string operations:

awk -F',' '{ print toupper($1) }' operating-systems.csv
awk -F',' '{ print tolower($1) }' operating-systems.csv
awk -F',' '{ print length($1) }' operating-systems.csv

We can add additional logic. The double ampersands && indicate a Boolean/Logical AND. The exclamation point ! indicates a Boolean/Logical NOT.

# print only those lines where column three is greater than 1990
# and the line has the string "BSD" in it
awk -F',' '$3 > 1990 && /BSD/ { print $0 }' operating-systems.csv
# print only those lines where column three is greater than 1990
# and the line DOES NOT have the string "BSD" in it; here we use the 
# exclamation point to signal a Boolean NOT
awk -F',' '$3 > 1990 && !/BSD/ { print $0 }' operating-systems.csv

And now, the double vertical bar || indicates a Boolean/Logical OR:

# print only those lines that contain the string "Proprietary" or the string "Apache" (prints both)
awk -F',' '/Proprietary/ || /Apache/ { print $0 }' operating-systems.csv
# match either case of the first letter; the square brackets indicate alternate characters
awk -F',' '/[pP]roprietary/ || /[aA]pache/ { print $0 }' operating-systems.csv

awk is a full-fledged programming language. It provides all sorts of conditionals, control structures, variables, etc. Feel free to explore it further.

Here's an example of how I've used awk recently:

#!/bin/bash

# Check how much time left to sync email
# Sean Burns

systemctl status --user mbsync.timer |\
    awk -F";" \
    '/Trigger:/ && $2 == "" { print "Syncing..." }
     /Trigger:/ && $2 != "" { print "Time left to sync: " $2 }'

Conclusion

The Linux (and other Unix-like OSes) command line offers a lot of utilities to examine data. Prior to this lesson, we covered a few of them that help us get parts of a file and then pipe those parts through other commands or redirect output to files. We can also use pipes and redirects with grep, sed, and awk, if needed, but we may be able to avoid using the basic utilities like cut, wc, etc. if we learn more powerful programs like grep, sed, and awk.

It's fun to learn and practice these. Despite this, you do not have to become a sed or an awk programmer. Like the utilities that we've discussed in prior lectures, the power of programs like these is that they're on hand and easy to use as one-liners. If you want to get started, the resources below offer nice lists of pre-made one-liners:

The commands we used in this session included:

  • join : join lines of two files on a common field
  • paste : merge lines of files
  • grep : print lines that match patterns
  • sed : stream editor for filtering and transforming text
  • awk : pattern scanning and text processing language

We also used two types of operators, the pipe and the redirect:

  • | : redirect standard output of command1 to standard input of command2
  • > : redirect standard output to a file, overwriting
  • >> : redirect standard output to a file, appending

Please explore and try these out yourselves!

Text editors

In addition to writing commands at the command prompt, we can also write commands (and other kinds of text) in text editors, save those files as scripts, and then execute the files so that the commands in the script run. This process can save lots of time, especially when we find ourselves writing longer commands. We can enter long commands at the command prompt, but it is difficult to fix errors when writing multi-line commands there.

It's important to have some familiarity with command line text editors not just because we may need to write out longer scripts, but also because command line text editors may be the primary way we have of modifying configuration files on a server. We'll learn more about configuration files later in the semester, but these are files, often in the /etc/ directory, which define the parameters of various services on a server, such as how a web server delivers web sites or how a firewall blocks certain kinds of internet connections. Graphical text editors can be nice, but graphical user environments are not always available on servers: the computer resources they require are better used for the purposes of the server, and the security bugs that come with graphical software unnecessarily increase the attack surface of the server.

In this demo, I am going to address three text editor programs and I will start with the most difficult one to learn (ed) and end with the easiest one to learn (nano). For this course, I suggest that you use nano, but the first two text editors, ed and vim, are very much a part of Unix and Linux culture. I want you to be familiar with them for that reason but also because their basic designs have influenced a lot of other common technologies.

ed

ed is specifically a line editor, and it's the most likely text editor to be installed on a Linux distribution (nano is installed by default on Fedora now). ed, or an early version of it, is also the first text editor for the Unix operating system and was developed by Ken Thompson, one of the Unix creators, in the late 1960s. It was written without computer monitors in mind, because those were still uncommon, and instead for teletypewriters (TTYs) and printers. If you visit that second link, what you will essentially see is the terminal interface from those earlier days, and it is the same basic interface you're using when you use your terminal applications, which are now virtualized versions of those old teletypewriters. It's a testament to the power of the terminal that advanced computer users still use the same basic technology today.

In practice, when we use a line editor like ed, the main process of entering text is like any other editor. The big difference is how we manipulate the text. In a graphical text editor, if we want to delete a word or edit some text, we might just backspace over the text or highlight a word and delete it. In a line editor, we manipulate text by referring to a line, or a range of lines, and then running commands on the text in those lines. Concretely, this means that each line has an address: the address for line 7 is 7, and so forth. Line editors like ed are command driven. There is no menu to select from at the top of the window. Instead, if a user wants to delete a word, the user first directs the line editor to the relevant line by its address and then commands the line editor to delete the word on that line. Line editors can also work on ranges of lines, including all the lines in the file.

Commands are cryptic in line editors and usually are called by a single letter. For example, to delete a line in ed, we use the d command. To print a line, we use the p command. To substitute (or replace) a word in a line, we use the s command. If I wanted to delete line 7 in ed, I'd first enter the address and then the d command at the ed prompt:

7d

And to print it, I would enter the address and then the p command:

7p

Substituting a word is a tiny bit more complicated because it involves first addressing the line and then searching the line for the word to substitute. This process means that there is a bit more to the command. Here the syntax is:

[ADDRESS[,ADDRESS]]COMMAND[PARAMETERS]

The square brackets in this format example are not used in the actual commands but only to demarcate the parts of the command. The first ADDRESS is required, but the second ADDRESS is optional and would be used only if I wanted to indicate a range of lines. The COMMAND is stated, and if additional parameters are needed, then those are optional. Imagine then that on line 7, I had the following sentence:

I use linux.

Now imagine that I am editing the file and realize that the word linux is a proper noun and should be capitalized. The substitute command requires that I first address line 7, then state the command s, and then search for the word linux, and then replace it with the word Linux.

7s/linux/Linux/

You can see that the operation we're doing is familiar to you already because it's like finding and replacing text in a word processor:

7s/find/replace/

If I wanted to substitute or replace the word linux with Linux on lines 5-7, then I'd indicate that on the address range by typing 5,7, which ed interprets as lines 5, 6, 7 (like 5-7):

5,7s/linux/Linux/
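The same commands can also be scripted. Here is a minimal sketch that drives ed non-interactively by feeding it the commands we would otherwise type at its prompt (the file name demo.txt is made up for the demo):

```shell
# Create a small file, then run an ed session from a heredoc:
# substitute on lines 1-2, write (w), and quit (q).
# The -s flag suppresses ed's byte-count diagnostics.
printf 'I use linux.\nWe teach linux.\n' > demo.txt
ed -s demo.txt <<'EOF'
1,2s/linux/Linux/
w
q
EOF
cat demo.txt
```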

Line editors like ed are also modal, which means they have separate modes for inputting text and for issuing commands on text. Text editors in graphical user environments, like Notepad on Microsoft Windows or TextEdit on macOS, are modeless---the user can switch between writing text and editing text without changing the environment and by using keyboard shortcuts (like Ctrl-c or Cmd-c to copy) or using the menu bar. But in modal text editors, vim included, the user has to enter one mode to write text (the input mode) and another mode to run commands on the text (the command mode) (vim has many modes but three main ones).

To start with ed, I simply type ed on the Bash prompt followed by an optional file name to work on, which could be a new file or an existing file. Let's say I wanted to create a file called letter.txt, then I'd type:

ed letter.txt
letter.txt: No such file or directory

Here we see that when ed opens, it's not very forthcoming or user friendly: if the file letter.txt does not already exist, it simply returns an error message informing us of that. Also, by default we are in command mode. To enter input mode so that I can start writing my letter, I can type an i for insert or an a for append:

a
Dear Students,

I hope you find this really interesting and that you feel free to practice and play on the command line, as well as use tools like ed, the standard editor.

Sincerely,
Dr. Burns
.

After I've typed that letter, my goal is to leave input mode and enter command mode, which I need to do even if I just want to save and exit ed. I do that at the end by pressing return to get a new line and then typing a period by itself. This is how we tell ed to exit input mode and enter command mode.

The first line is line 1, but it's not obvious that this is always so, and it would be hard to know a line number in ed when the file is long. To see line numbers, I can tell ed to show them to me with the following command:

,n

That will display all the line numbers in the file because the comma is shorthand for the entire address space. Or I could tell ed to show me line numbers for a range:

2,3n

And that will print out lines 2 and 3 with line numbers. If I wanted to print those two lines without line numbers, then I switch n for p:

2,3p

The non-shorthand way to refer to the entire address space, range 1 to the last line, would be to write 1,$n, where 1 marks the first line and $ marks the last line.

There's a lot more we can do with this text editor because it's extremely powerful, but for now, let me just show you how to exit. To do so, make sure you're in command mode (press Enter to get a new line, then type a period by itself). If you want to save the file and then quit, then type:

w
q

If you want to quit without saving, then type:

Q

It's good to know something about ed not just for cultural reasons, but also because the line editing technology developed for it is still in use today, and is a basic part of the vim text editor, which is a very widely used application.

vim

The vim text editor is a take on the vi text editor and is in fact called Vi IMproved. Although vim is not a line editor but a screen-oriented editor, it is multi-modal like ed and is in fact its direct descendant through vi. Due to this genealogy, vim can use many of the same commands as ed when vim is in command mode. Like ed, we can start vim at the Bash prompt with or without a file name. Here I will open the letter.txt file with vim, and will automatically be in command mode:

vim letter.txt

To enter insert mode, I can type i or a for insert or append mode. The difference is that i will start insert mode where the cursor lies, and a will start insert mode right-adjacent to the cursor. Once in insert mode, you can type text as you normally would and use the arrow keys to navigate around the file.

To get into command mode in vim, you can press the Esc key. And then you can enter commands like you would with ed, using the same syntax:

[ADDRESS[,ADDRESS]]COMMAND[PARAMETERS]

Unlike ed, when in command mode, the commands we type are not wherever the cursor is, but at the bottom of the screen. Let's first turn on line numbers so we know which address is which, and then we'll replace ed with Ed. Note that I precede these commands with a colon:

:set number
:3s/ed/Ed/

One of the more powerful things about both ed and vim is that I can call Bash shell commands from the editors. Let's say that I wanted to add the date to my letter file. To do that, Linux has a command called date that, when executed, returns today's date. To call the date command within vim and insert its output into the file, I press Esc to enter command mode (if I'm not already in it), enter a colon, type r for the read-into-buffer command, a space, then the shell escape, which is an exclamation point !, immediately followed by the date command:

:r !date

Vim/vi users also love the navigation keystrokes to move around vim, which are the j,k,l,h keystrokes:

  • j moves down line by line
  • k moves up line by line
  • l moves right letter by letter
  • h moves left letter by letter

Like the other commands, you can precede these keystrokes with a count. To move 2 lines down in vim, you type 2j, and so forth. Vi/Vim have had such a powerful impact on software development that you can in fact use these same keystrokes to navigate a number of sites, such as Gmail and Facebook.

To save the file and exit vim, we go into command mode by pressing the Esc key, and then:

:wq

The above only barely scratches the surface and there are whole books on these editors as well as websites, videos, etc that explore them, and especially vim in more detail. But now that you have some familiarity with them, you might find this hilarious: Ed, man! !man ed.

nano

The nano text editor is the user-friendliest of these text editors but still requires some adjustment as a new command line user. The friendliest thing about nano is that it is modeless, which is what you're already accustomed to using, because it can be used to enter text and manipulate the text without changing to insert or command mode. It is also friendly because, like many graphical text editors and other graphical software, it uses control keys to perform its operations. The tricky part is that the control keys are assigned to different keystroke combinations than what many graphical editors (or word processors) use. For example, instead of Ctrl-c or Cmd-c to copy, in nano you press the Alt-6 keys (press Alt and 6). Then to paste, you press Ctrl-u instead of the more common Ctrl-v. Fortunately, nano lists the shortcuts at the bottom of the screen.

The shortcuts listed need some explanation. The caret mark ^ is shorthand for the keyboard's Control (Ctrl) key. Therefore, to save a file, we write it out by pressing Ctrl-o (Ctrl-s works but will not return any prompts). The Alt key is also important, and the shorthand for that is M-. To mark (highlight) text, you press Alt-a, which is listed as M-A in the shortcut list at the bottom of the screen, and then move the cursor over the text that you want to highlight. If your goal is to copy, then press Alt-6 to copy the marked (highlighted) text. Move to where you want to paste the text, and press Ctrl-u to paste.

For the purposes of this class, that's all you really need to know about nano. Use it and get comfortable writing in it. Some quick tips:

  1. nano file.txt will open the file named file.txt
  2. nano will open to an empty page
  3. In nano, save a file by pressing Ctrl-o or Ctrl-s
  4. Be sure to follow the prompts at the bottom of the screen
  5. Use the arrow keys to navigate around the page/file

Conclusion

In prior lessons, we learned how to use the Bash interactive shell and how to view, manipulate, and edit files from that shell. In this lesson, we learned how to use several command line text editors. This allows us to save our commands and create scripts. The commands we used in this lesson include:

  • ed : line-oriented text editor
  • vim : Vi IMproved, a programmer's text editor
  • nano : Nano's ANOther editor, inspired by Pico

Revisiting Paths

Here's a video on using the command line to navigate the file system or the paths. This is not a transcript, but in this video, I use commands like:

  • cd to change directories
    • cd - (cd dash) to change to the previous working directory
    • cd ~ (cd tilde) to change to the home directory
    • cd .. (cd dot dot) to change to the directory up one level
  • mkdir -p to make multiple directories
  • rm to remove files
    • rm ../file.txt to remove a file up one level
    • rm ~/file.txt to remove a file in my home directory
    • rm ~/test0/test1/test2/file.txt to remove a file nested in multiple sub-directories
  • cp to copy files
    • cp file.txt ../../../ to copy a file multiple directories up from current position
    • cp bin/file.txt documents/file.txt to copy a file in my bin directory to my documents directory
  • mv to move or rename a file
    • mv file.txt test0/test1/test2/ to move a file in my home directory to the nested directories
    • mv bin/file.txt documents/file.txt to move a file from my bin directory to my documents directory
  • tree -d to visualize the directory structure from my current directory
  • find . -name "file.txt" to search for a file in my directories that's named "file.txt"
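A short sketch tying these commands together in a throwaway directory (the directory and file names are made up for the demo):

```shell
demo="$(mktemp -d)"                 # scratch directory to play in
mkdir -p "$demo"/test0/test1/test2  # nested directories in one command
touch "$demo"/file.txt              # create an empty file
cp "$demo"/file.txt "$demo"/test0/  # copy it down one level
mv "$demo"/file.txt "$demo"/test0/test1/test2/  # move the original deeper
find "$demo" -name "file.txt"       # locate both copies
rm -r "$demo"                       # clean up
```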

I hope this video helps. Please let me know if you have any questions.

Bash Scripting

Resources

Here are some useful guides and cheat sheets on Bash scripting:

Variables

Declare a variable with the name of the variable, an equal sign, and then the value of the variable within double quotes. Do not insert spaces around the equal sign:

name="Sean"
backup="/media/"
# test it out:
echo "$name"
echo "$backup"
cd "$backup"
cd

Variables may include values that may change given some context. For example, if we want a variable to refer to today's day of week, we can use command substitution, which "allows the output of a command to replace the command name" (see man bash). Thus, the output at the time this variable is set will differ if it is set on a different day.

today="$(date +%A)"
echo "$today"

We can print the variable a number of ways. We can call it with a dollar sign and the name of the variable:

echo $today

But it's safer to wrap the variable in double quotes:

echo "$today"

A useful and safe way to print or refer to a variable is to use the dollar sign plus the variable name and surround these in curly braces and quotations:

echo "${name}"
echo "${backup}"
echo "${today}"

The curly braces are not strictly necessary, but they offer benefits when we start to use things like array variables. See:

For example, let's look at basic brace expansion, which can be used to generate arbitrary strings:

echo {1..5}
echo {5..1}
echo {a..l}
echo {l..a}

Another example: generating two sub-directories. Start off in your home directory, and:

mkdir -p documents/{drafts,notes}

But more than that, they allow us to deal with arrays (or lists):

seasons=(winter spring summer fall)
echo "${seasons[1]}"  # spring (Bash arrays are zero-indexed)
echo "${seasons[2]}"  # summer
echo "${seasons[-1]}" # fall (negative indexes count back from the end)

See Parameter expansions for more advanced techniques.

Conditional Expression

We can include a list of commands on one line in Bash with a semicolon:

cd ; ls -lt

But we can use conditional expressions and apply logic with && (Logical AND) or || (Logical OR).

Here, command2 is executed if and only if command1 is successful:

command1 && command2

Here, command2 is executed if and only if command1 fails:

command1 || command2

Example:

cd documents && echo "success"
cd documents || echo "failed"
# combine them:
cd test && pwd || echo "no such directory"
mkdir test
cd test && pwd || echo "no such directory"

Shebang or Hashbang

When we start to write scripts, the first thing we add is a shebang (or hashbang) at line one. We can do so a couple of ways:

#!/usr/bin/env bash

The first one should be more portable, but alternatively, you could put the direct path to Bash:

#!/usr/bin/bash
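Putting it together, here's a minimal sketch of creating a script file, making it executable, and running it (the file name hello.sh is made up):

```shell
# Write a two-line script: a shebang plus one command.
printf '#!/usr/bin/env bash\necho "hello from a script"\n' > hello.sh
# Make it executable, then run it.
chmod +x hello.sh
./hello.sh
# Clean up.
rm hello.sh
```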

Looping

There are several looping methods in Bash, including: for, while, until, and select. The for loop is often very useful.

for i in {1..5} ; do
  echo "$i"
done

With that, we can create a rudimentary timer:

for i in {1..5} ; do
  echo "$i" ; sleep 1
done

We can loop through our seasons variable:

for i in "${seasons[@]}" ; do
  echo "$i"
done
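The same countdown can be written with while, one of the other looping methods mentioned above; a minimal sketch:

```shell
# Count from 1 to 5 with a while loop instead of for.
i=1
while [[ "$i" -le 5 ]] ; do
  echo "$i"
  i=$((i + 1))
done
```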

Testing

Sometimes we will want to test certain conditions. There are two parts to this: we can use the if ; then ; else keywords, and we can also use the double square brackets [[. There are a few ways to get documentation on these. See the following:

man test
help test
help [
help [[
help if

We can test integers:

if [[ 5 -ge 3 ]] ; then
  echo "true"
else
  echo "false"
fi

Reverse it to return the else statement:

if [[ 3 -ge 5 ]] ; then
  echo "true"
else
  echo "false"
fi

We can test strings:

if [[ "$HOME" = "$PWD" ]] ; then
 echo "you are home"
else
 echo "you are not home, but I will take you there"
 cd $HOME
 pwd
fi

We can test files. Let's first create a file called paper.txt and a file called paper.bak. We will add some trivial content to paper.txt but not to the second file. The following if statement will test if paper.txt has a more recent modification date, and if so, it'll back up the file with the cp command and echo its success:

if [[ "$HOME/paper.txt" -nt "$HOME/paper.bak" ]] ; then
  cp "$HOME/paper.txt" "$HOME/paper.bak" && echo "Paper is backed up."
fi

Here's a script that prints info depending on which day of the week it is:

day1="Tue"
day2="Thu"
day3="$(date +%a)"

if [[ "$day3" = "$day1" ]] ; then
  printf "\nIf %s is %s, then class is at 9:30am.\n" "$day3" "$day1"
elif [[ "$day3" = "$day2" ]] ; then
  printf "\nIf %s is %s, then class is at 9:30am.\n" "$day3" "$day2"
else
  printf "\nThere is no class today."
fi

Summary

In this demo, we learned about:

  • creating and referring to variables
  • conditional expressions with && and ||
  • adding the shebang or hashbang at the beginning of a script
  • looping with the for statement
  • testing with the if statement

These are the basics. I'll cover more practical examples in upcoming demos, but note that mastering the basics requires understanding a lot of the commands that we have covered so far in class. So keep practicing.

Grep and Regular Expressions

Oftentimes, as systems administrators, we will need to search the contents of a file, like a log. One of the commands that we use to do that is the grep command. We have already discussed using the grep command, which is not unlike doing any kind of search, such as in Google. The command simply involves running grep along with the search string and against a file. So if I wanted to search a file called cities.txt for the search string lexington, then I can do this:

grep "lexington" cities.txt

Whole words, case sensitive by default

However, grep can employ stricter and more powerful syntax than a Google search. Since the contents of cities.txt are all in lowercase, if I run the above command with the city name capitalized, then grep will return nothing:

grep "Lexington" cities.txt

In order to tell grep to ignore case, I need to use the -i option. This is a reminder for you to run man grep and to read through the documentation to see what various options exist for this command.

grep -i "Lexington" cities.txt

Multiword strings

If we want to search for multiword strings, then we enclose them in quotes:

grep "lexington, ky" cities.txt

Character Classes and Bracket Expressions

In conjunction with the grep command, we can also use regular expressions to search the content of text files. For example, we can use what are called character classes and bracket expressions to search for patterns in the text. Here again man grep is very important.

Note that the regular expression that marks the beginning of a line is the caret ^, but the caret serves a different function in bracket expressions. Specifically, there it functions like a Logical NOT. Here are examples of bracket expressions and character class searches:

# bracket expressions (quoted to keep the shell from expanding them)
grep "[a-d]" cities.txt # matches any characters in the range a,b,c,d
grep "[^a-d]" cities.txt # matches any characters not in the range a,b,c,d
grep "[1-3]" cities.txt # matches any digits in the range 1,2,3
grep "[^1-3]" cities.txt # matches any digits not in the range 1,2,3

# character classes
grep "[[:alpha:]]" cities.txt
grep "[^[:alpha:]]" cities.txt
grep "[[:lower:]]" cities.txt
grep "[^[:lower:]]" cities.txt
grep "[[:upper:]]" cities.txt
grep "[^[:upper:]]" cities.txt
grep "[[:digit:]]" cities.txt
grep "[^[:digit:]]" cities.txt

Anchoring

Outside of bracket expressions and character classes, we use the caret ^ to mark the beginning of a line. We can also use the $ to match the end of a line:

grep "^l" cities.txt # all lines beginning with a "l"
grep "9$" cities.txt # all lines ending with the number "9"

Repetition

If we want to use regular expressions to identify repetitive patterns, then we can use repetition operators. The most useful one is the * asterisk, but there are other options:

grep "l*" cities.txt # the preceding item "l" matched zero or more times

# In some cases, we need to add the -E option to extend grep's basic functionality:
grep -E "l?" cities.txt    # the preceding item "l" is matched at most once
grep -E "l+" cities.txt    # the preceding item "l" is matched one or more times
grep -E "l{2}" cities.txt  # the preceding item "l" is matched exactly 2 times
grep -E "l{2,}" cities.txt # the preceding item "l" is matched 2 or more times

OR searches

Here we search for either lexington or lansing. Since they both appear in the file, both lines that contain them are returned:

grep "lexington\|lansing" cities.txt

This works like a Boolean OR statement: it returns the lines that match either pattern or both. If we repeat this line with one city name that is not in the file and one that is, then it'll return the line with the city name that is in the file, since at least that part is True:

grep "lexington\|london" cities.txt

Real World Example

The log file at /var/log/auth.log records all attempts to authenticate and login to the server, including "invalid" attempts. Invalid attempts will include actual invalid, malicious attempts but may also capture real user mistakes when logging into the server via ssh. We're interested in the malicious attempts, made by automated bots, on the system. Let's use less to take a quick look at that file:

less /var/log/auth.log
Sep 12 00:00:08 sised-summer2020 sshd[78312]: Invalid user uu from 152.231.25.170 port 61921
Sep 12 00:00:08 sised-summer2020 sshd[78312]: pam_unix(sshd:auth): check pass; user unknown
Sep 12 00:00:08 sised-summer2020 sshd[78312]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=152.231.25.170 
Sep 12 00:00:10 sised-summer2020 sshd[78312]: Failed password for invalid user uu from 152.231.25.170 port 61921 ssh2
Sep 12 00:00:11 sised-summer2020 sshd[78312]: Received disconnect from 152.231.25.170 port 61921:11: Bye Bye [preauth]
Sep 12 00:00:11 sised-summer2020 sshd[78312]: Disconnected from invalid user uu 152.231.25.170 port 61921 [preauth]
Sep 12 00:00:40 sised-summer2020 sshd[78314]: User root from 107.170.153.57 not allowed because none of user's groups are listed in AllowGroups
Sep 12 00:00:40 sised-summer2020 sshd[78314]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=107.170.153.57  user=root
Sep 12 00:00:42 sised-summer2020 sshd[78314]: Failed password for invalid user root from 107.170.153.57 port 60118 ssh2
Sep 12 00:00:44 sised-summer2020 sshd[78314]: Received disconnect from 107.170.153.57 port 60118:11: Bye Bye [preauth]

If we continue to peruse that log file, we can also see valid, successful authentications. They follow the pattern below.

Sep 12 02:20:05 sised-summer2020 sshd[79540]: Accepted password for user1 from 192.128.0.1 port 56000 ssh2
Sep 12 15:08:10 sised-summer2020 sshd[87434]: Accepted password for user2 from 192.128.0.2 port 43501 ssh2
Sep 12 17:02:55 sised-summer2020 sshd[89235]: Accepted password for user3 from 192.128.0.3 port 57996 ssh2
Sep 12 17:03:36 sised-summer2020 sshd[89341]: Accepted password for user4 from 192.128.0.4 port 56188 ssh2

Note: I've obfuscated the usernames and IP addresses in the above snippet. Where it reports user1, user2, and so on, the real log reports our actual usernames, and the IP addresses are made up.

As we examine the file, we should look for patterns in the text. For example, we see the following kind of line over and over again:

Sep 12 00:00:08 sised-summer2020 sshd[78312]: Invalid user uu from 152.231.25.170 port 61921

The user listed here is uu. That's a made up user. So let's grep for one of the fixed parts of that string, which is Invalid user. We can pipe the output through the head command to look at the first ten results:

grep "Invalid user" /var/log/auth.log | head
Sep 12 00:00:08 sised-summer2020 sshd[78312]: Invalid user uu from 152.231.25.170 port 61921
Sep 12 00:00:52 sised-summer2020 sshd[78316]: Invalid user reshma from 42.200.109.74 port 33028
Sep 12 00:02:16 sised-summer2020 sshd[78320]: Invalid user julie from 95.85.43.241 port 50010
Sep 12 00:02:39 sised-summer2020 sshd[78322]: Invalid user sasha from 113.161.37.216 port 46615
Sep 12 00:02:49 sised-summer2020 sshd[78324]: Invalid user a from 177.83.39.253 port 39699
Sep 12 00:05:39 sised-summer2020 sshd[78330]: Invalid user scan from 152.231.25.170 port 64587
Sep 12 00:06:45 sised-summer2020 sshd[78334]: Invalid user postgres from 95.85.43.241 port 37766
Sep 12 00:08:48 sised-summer2020 sshd[78341]: Invalid user apidoc from 113.161.37.216 port 32963
Sep 12 00:09:10 sised-summer2020 sshd[78407]: Invalid user sentry from 124.89.83.117 port 42690
Sep 12 00:09:15 sised-summer2020 sshd[78414]: Invalid user apidoc from 177.83.39.253 port 10693

Let's apply a bracket expression and an asterisk to extract the list of invalid users from that output. I want everything from this log file, but I'll pipe the output through the head command to examine the validity of the results:

# note: we could also leave the trailing string "from" out of the pattern
grep -o "Invalid user [a-zA-Z]* from" auth.log | head
Invalid user uu from
Invalid user reshma from
Invalid user julie from
Invalid user sasha from
Invalid user a from
Invalid user scan from
Invalid user postgres from
Invalid user apidoc from
Invalid user sentry from
Invalid user apidoc from

This is good progress, but I really just want the list of usernames. Let's modify that to remove the text "Invalid user " and "from ". We can use the cut or the awk command. I'll show both and just a bit of the output. The output is the same with both commands:

# Using cut
grep -o "Invalid user [a-zA-Z]* from" auth.log | cut -d' ' -f3

# Using awk
grep -o "Invalid user [a-zA-Z]* from" auth.log | awk '{ print $3 }'
uu
reshma
julie
sasha
a
scan
postgres
apidoc
sentry
apidoc

Goal accomplished, but we can do better. Right now we have an unordered list of users. Let's sort them, get a count of the usernames that are most commonly used, and then sort the usernames by their counts. We'll continue to use the | pipe operator, along with an assortment of utilities, to create a powerful command:

grep -o "Invalid user [a-zA-Z]* from" auth.log | awk '{ print $3 }' | sort | uniq -c | sort

Once we have that, we may want to save the data to a file:

grep -o "Invalid user [a-zA-Z]* from" auth.log | awk '{ print $3 }' | sort | uniq -c | sort > invalid-user-attempts.txt

Or, we may want to save our command as a script in order to automate the process. In the following code snippet, I use the backslash to split our single command over multiple lines. This makes the command more readable. I also save the output to a file with the current date as part of the file name. This makes the file name unique, which is good since I only use a single > redirect.

#!/usr/bin/env bash

# Get sorted list of invalid attempts to ssh into server
grep -o "Invalid user [a-zA-Z]* from" auth.log |\
  awk '{ print $3 }' |\
  sort |\
  uniq -c |\
  sort > invalid-user-attempts-"$(date +%Y-%m-%d)".txt

And from there we can investigate whether any of these users truly exist on the system and conduct other security checks.

Addendum

The goal in this lesson is to learn a bit about regular expressions, but there are a number of ways to get a list of users from the auth.log file while avoiding complicated regular expressions. We can grep for a multiword string and take more advantage of awk. For example:

grep "Invalid user " auth.log | awk '{ print $8 }' | sort | uniq -c | sort

We can also skip grep altogether and use only awk, because awk can search. To search in awk, we place the search string within forward slashes, i.e., /search string/:

awk '/Invalid user / { print $8 }' auth.log | sort | uniq -c | sort

In both examples above, we depend on the default behavior of awk to treat spaces as the field separator. Hence, in order to know that the usernames are listed in field 8 ($8), I had to count the columns.
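Instead of counting columns by hand, we can have awk number them for us using its built-in NF variable, which holds the number of fields on each line. A sketch with a made-up log line:

```shell
# Print each field with its position; field 8 turns out to be the username.
echo "Sep 12 00:00:08 host sshd[1]: Invalid user uu from 1.2.3.4 port 1" |
  awk '{ for (i = 1; i <= NF; i++) print i, $i }'
```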

Managing Services and Software

This section covers installing Linux, using Linux with VirtualBox, systemd, creating backups, and using DNF.

Installing and Configuring Fedora 34 Linux Server

Download and Install VirtualBox && download fedora

  1. Download fedora Server Edition. We'll install the x86_64 netinstall ISO image because it's a smaller download, and then we can install other items once Fedora is running.
  2. Download VirtualBox. This will vary depending on whether you run macOS or Windows. If you have issues here, please seek tutorials elsewhere, like YouTube.
  3. Install VirtualBox however you install software for your operating system.
    • Instructions for macOS Big Sur. Note: instructions from this video are only valid up to the 4:20 mark. Do not use these instructions after that point of the video.
    • Instructions for Windows 10. Note: instructions from this video are only valid up to the 2:30 mark. Do not use these instructions after that point.
  4. Be aware of the VirtualBox user manual. I'm not asking you to read it, but if you have any issues, you should search this documentation.

Host machine set up

  1. Create a directory on your own Windows or macOS computer, and call it iso. It's fine to create this directory in your home directory, or in your documents directory, or wherever.
  2. Move the fedora ISO file that you downloaded to your new iso directory.
  3. In your iso directory, create a new directory called virtualbox.

Set up fedora in VirtualBox

  1. Click on the New button
  2. In the Name: field, type in Fedora-Base-34. As you type this in, VirtualBox should automatically recognize the Type: and Version. Click Next >.
  3. Memory size: Depending on your laptop's memory capabilities, you can set this higher, but it's not necessary. The default 1024 MB should be fine.
  4. Hard disk: The default is 8.00 GB. We will change this to 100 GB on an upcoming screen. For now, make sure you select the default Create a virtual hard disk now, and then click on Create.
  5. Hard disk file type: Accept the default, which is VDI.
  6. Storage on physical hard disk: Accept the default, which is Dynamically allocated
  7. File location and size: Here you adjust the disk size to 100 GB. Note that you will not use all of this space on your hard drive. It's the maximum amount that you can use. Click on Create.

The settings box should close. Now highlight the virtual machine in the left pane of the VirtualBox menu, and click on Start.

Install fedora in VirtualBox

  1. When you start, a window will pop up and ask you to select a virtual optical disk. This is the fedora 34 ISO that you downloaded and that you saved to your new iso folder. We have to select the file icon in the window and then find and select the ISO that we downloaded. Once you do that, click on Start.

  2. A terminal will open up. Press Enter on the Installing Fedora 34 option. Some setup text will scroll by and soon a graphical installer will launch.

  3. In the graphical installer, you'll get a message: "WELCOME TO FEDORA 34." US English should be the default options in both panes. Click on Continue.

  4. We only have to configure a few things. Click on Software Selection. In the right pane, under Additional software for Selected Environment, select Headless Management and then click on Done.

  5. Now we have to partition our hard drive. We could accept automatic partitioning, but we want to fine tune this. So click on Installation Destination.

  6. At the bottom of the window it says Storage Configuration. Select Custom and then Done.

  7. Manual Partitioning: In the next screen, the LVM option should already be selected. This stands for Logical Volume Management. Make sure that stays selected.

  8. Now we create some partitions. Partitioning a drive is a way to slice up a drive into parts so that we better manage the use of the drive. As we create these partitions, note the two pieces of information at the bottom of the screen: the Available Space and the Total Space. The Available Space will change as we add partitions.

  9. Click on the + icon. In the Mount Point box, we'll first create the root partition, which is indicated with the forward slash: /. For Desired Capacity, we should input 20 GB. In the next screen, change the File System to ext4.

  10. Next we create our other partitions. These include the /boot, /home, /var, and /tmp partitions and the swap space (note the missing slash). Click on the plus sign and repeat the process for these partitions. We'll leave about 34 GB of free disk space available for later use. Altogether, our partition map should look like this, but fedora will adjust the sizes a bit:

    | Mount Point/Partition | Size  | File System |
    |-----------------------|-------|-------------|
    | /home                 | 30 GB | ext4        |
    | /                     | 20 GB | ext4        |
    | /tmp                  | 6 GB  | ext4        |
    | /var                  | 10 GB | ext4        |
    | /boot                 | 1 GB  | ext4        |
    | swap                  | 4 GB  | swap        |

    Note that the /boot partition is mapped to sda1. This is handy to remember.

    Note also that the Available Space listed at the bottom should be about 33.86 GiB.

    After you've inputted the above info, click on Done. Then click on Accept Changes in the next window.

  11. Set up Root Password. Save this password!

  12. Click on Begin Installation. This might take up to 15 minutes depending on the strength of your internet connection and what kind of hardware you have.

  13. Wait until installation is complete and you are able to Reboot System. HOWEVER, do not reboot yet. Instead, choose Close from the File menu, and then click Power off the machine.

  14. In VirtualBox settings, click on the settings for this machine.

  15. In the window that pops up, click on the System tab (this may look different if you're on macOS, especially).

  16. Deselect the Floppy and Optical options next to Boot Order, and then click Okay.

  17. Click on Start to boot your system back.

Boot fedora; update system; and create regular user

  1. Once your system is running, you will get a login prompt. Login as the user root.

  2. Update your system right away with the following commands:

    dnf updateinfo
    dnf upgrade
    

    The first command will sync your local repository with the remote repositories that manage software updates. The second command will update the software on your system if updates are available.

  3. We do not normally want to work in the root account. So next we need to create a regular user. To create a new user, run the next command. Instead of sean, create a username that you like. Make sure it is one word and that it is all lowercase:

    useradd -m -U -s /usr/bin/bash -G wheel sean
    

    The above command will create a new user named sean and put that user in group of the same name, create a home directory for that user, make bash the default shell for that user, and assign that user to the wheel group, which will make that user an administrator of the system. Read man useradd for more details.

  4. Next create a password for your new user:

    passwd sean
    

    Once you have completed these steps, you can power off the machine:

    poweroff
    

Clone the machine

In the VirtualBox Manager, right click on our installation, and then select Clone. Accept the default name or rename it as you prefer. Be sure to choose Full clone.

That's good for now. Congratulations! You have just completed your first installation of a Linux server.

Logical Volume Management

Background Reading and Documentation

Please read the following two articles before proceeding:

Additionally, you should review / skim some helpful man or info pages before proceeding:

  • man filesystems
  • man ext4
  • man btrfs
  • man vgdisplay
  • man pvdisplay
  • man lvdisplay
  • man pvcreate
  • man lvcreate
  • man vgextend
  • man fdisk
  • man lsblk
  • man fstab
  • man parted
  • man mount
  • man cfdisk

The second link above demonstrates some other logical volume commands that we are not using here. Give the man or info pages for those a read, too.

Our motivation

When we installed Fedora in VirtualBox, we told VirtualBox that our hard drive would be 100 GB in size. However, when we partitioned our hard drive, we only partitioned around 67 GB of that space. In this lesson, we are going to create a new partition on the sda drive, and then expand our logical volume to include this partition. We'll allocate 15 GB of the remaining space to sda3.

Briefly:

  • Physical volumes (the pv commands) are related to the physical devices.
  • Volume groups (the vg commands) organize the physical and logical volumes.
  • Logical volumes (the lv commands) are about partitions.

Procedure

Administrative commands

We need to use the sudo command to run many of the commands below or login as root user. Either way, be careful about running commands with sudo or as root. Mistyping a command may harm your system beyond repair. If you do harm your system beyond repair, delete your clone in VirtualBox and reclone the original install. In the demonstration that follows, I will login as the root user.

If you would like to login as root, you can login as root at the initial prompt after your system boots up; or, if you are already logged in as a regular user, you can type the su root command and then enter the root password that you set when you installed Fedora, or you can use the sudo su command if you are in the wheel group:

su root

Or:

sudo su

Gather information

First we need to take a look at what we have before we start. Pay some attention to the details:

lsblk
fdisk -l | less

Create a Partition

We can use a program called parted or a slightly more user friendly program called cfdisk. We'll use cfdisk:

cfdisk

In cfdisk, complete the following steps:

  • Arrow down to the Free space section, and press Enter on New.
  • Next to Partition size, backspace over the value and then type 15GB
  • Set to primary
  • Use right arrow key to select Type
  • Arrow down to the 8e Linux LVM selection and press enter
  • Use the right arrow key to select Write, and then type yes at the prompt to write the partition to the virtual disk

Creating a Physical Volume

Next let's create a new physical volume to refer to our new partition. It's important to read the man pages for pvdisplay and pvcreate before you start so that you get a better idea of what you're doing, above and beyond what I'm detailing here or the links above describe. Here we'll use pvdisplay before and after we use pvcreate to note the difference after we use the latter command:

pvdisplay
pvcreate /dev/sda3
pvdisplay

Add a Physical Volume to a Volume Group

Now we add our new physical volume to an existing volume group, which was created when we installed Fedora. Usage note: vgextend VG PV. (See man page, of course :)

Below, the volume group name is fedora_fedora and the physical volume name is, per the last set of commands, /dev/sda3. By extending the volume group, we'll have extended its size to encompass the new partition:

vgdisplay
vgextend fedora_fedora /dev/sda3
vgdisplay

Creating a Logical Volume

We create a logical volume to allow us to mount the partition and make it accessible to the other parts of the file system. Note: man pages!!!

lvdisplay | less
lsblk
vgdisplay # note Free PE / Size space
lvcreate -L 15G --name projects fedora_fedora
lvdisplay

Creating a File System for the LV

Now we need to format the logical volume: read the man pages for the commands below, including man fstab.

mkfs.ext4 /dev/fedora_fedora/projects
mkdir /projects
mount /dev/fedora_fedora/projects /projects

In nano, enter this in the /etc/fstab file:

/dev/mapper/fedora_fedora-projects  /projects ext4   defaults    1 2

Per the instructions in that file, run the following command:

systemctl daemon-reload

Now reboot the machine. When you reboot, your new partition should be recognized and mounted automatically.

reboot now
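After the reboot, a couple of read-only commands confirm that the new logical volume was formatted and mounted automatically:

```shell
# Inspect block devices and the file systems on them;
# the projects logical volume should appear under the fedora_fedora group
lsblk -f
# Show disk usage for all mounted file systems; /projects should be listed
df -h
```

If /projects does not appear, recheck the /etc/fstab entry before troubleshooting further.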

NAT (Network Address Translation)

Set up NAT

We will want to SSH into our machines without having to use the VirtualBox GUI, and later connect to these virtual machines using other protocols. We can do so by setting up NAT in VirtualBox.

To set up NAT:

In VirtualBox, go to Settings, Network, Advanced, Port Forwarding, and enter the info in the table below. Be sure to replace the Host IP address with the IP address of your laptop or desktop. You can find your Host IP in your system settings on your Windows or Mac computers, or by opening up a terminal session and typing ifconfig or the equivalent for your operating system.

| Name | Protocol | Host IP      | Host Port | Guest IP  | Guest Port |
|------|----------|--------------|-----------|-----------|------------|
| SSH  | TCP      | 10.163.36.88 | 2222      | 10.0.2.15 | 22         |

Once you have that IP, and have made the above changes, start your Fedora clone in headless mode.

Now you can SSH into your virtual machine using the terminal of your choice (e.g., the one you used to connect to the remote server).

From a command line, we are going to SSH through port 2222 via our Host IP address. For me, that looks like this:

ssh -p 2222 user@10.163.36.88
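Typing the port and IP every time gets tedious. An entry in ~/.ssh/config lets us give the connection a short alias. This is a sketch: the alias fedora-vm, the IP address, and the username are examples to replace with your own values:

```shell
# Create ~/.ssh if needed and append a host alias (values are examples)
mkdir -p ~/.ssh
cat >> ~/.ssh/config <<'EOF'
Host fedora-vm
    HostName 10.163.36.88
    Port 2222
    User sean
EOF
```

Afterward, ssh fedora-vm is equivalent to the longer command above.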

Web Console and NAT

We set NAT to allow SSH traffic between the Host computer (our computers) and the Guest computer (the virtual machines).

[ Insert video demo ]

In VirtualBox, go to Settings, Network, Advanced, Port Forwarding, and add the info in the table below. Be sure to replace the [ YOUR IP ] address with the IP address of your laptop or desktop.

| Name    | Protocol | Host IP     | Host Port | Guest IP  | Guest Port |
|---------|----------|-------------|-----------|-----------|------------|
| Console | TCP      | [ YOUR IP ] | 9091      | 10.0.2.15 | 9090       |

Now you can access the console from your regular web browser by visiting the following URL:

https://YOURIP:9091

Managing Users

The passwd file

On my Fedora 34 virtual machine, I can see the following information about my user account in the passwd file:

cat /etc/passwd

Or, I can also grep or awk for specific accounts:

grep "$(whoami)" /etc/passwd
sean:x:1000:1000:sean::/home/sean:/usr/bin/bash

grep "sean" /etc/passwd
sean:x:1000:1000:sean::/home/sean:/usr/bin/bash

awk '/sean/ { print $0 }' /etc/passwd
sean:x:1000:1000:sean::/home/sean:/usr/bin/bash

Any of those commands can be piped through sed to look at the individual fields, one line at a time:

grep "sean" /etc/passwd | sed 's/:/\n/g'
sean
x
1000
1000

/home/sean
/usr/bin/bash

The fields represent the following information:

  • username
  • password indicator
  • user id
  • group id
  • user name or comment
  • home directory
  • default shell

You can read about these fields via man 5 passwd. The 5 refers to section 5 of the manual, which covers file formats and configuration files; without it, man passwd would show the passwd command from section 1 instead.
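Because the fields are colon-delimited, we can also pull out individual fields directly with awk's -F option, which sets the field separator:

```shell
# Print the username, UID, and default shell for every account
awk -F: '{ print $1, $3, $7 }' /etc/passwd
```

This is often quicker than piping through sed when we only care about a few fields.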

Note that the user name or comment line is blank. We can add a comment using the chfn, and there are multiple options to use. If I use the -f option, I can set my full name to appear here. See man chfn for more options to set:

sudo chfn -f "Sean Burns" sean

The /etc/passwd file is a pretty standard Linux file, but some things will change depending on the distribution. For example, the user id may start at a different point depending on the system. However, nowadays both Ubuntu and Fedora set the starting UID and group ID for new users at 1000.

The shadow file

The /etc/passwd file does not contain any passwords; instead, a simple x marks the password field. Passwords on Linux are stored in /etc/shadow and are hashed with sha512, which is indicated by the $6$ prefix. You need to be root, or use sudo, to examine the shadow file:

sudo su
grep "sean" /etc/shadow
sean:ENCRYPTED_PASSWORD::0:99999:7:::
grep "sean" /etc/shadow | sed 's/:/\n/g'
sean
ENCRYPTED_PASSWORD

0
99999
7

The fields are (see man 5 shadow):

  • login name (username)
  • encrypted password
  • date of the last password change (in days since Jan 1, 1970)
  • minimum password age
  • maximum password age
  • password warning period
  • password inactivity period
  • account expiration date
  • a reserved field
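The third field stores the date of the last password change as a count of days since Jan 1, 1970. GNU date can convert such a count back into a calendar date; the value 19060 below is an example taken from a hypothetical shadow entry, not from a real system:

```shell
# Convert a shadow-style day count into a calendar date
days=19060
date -d "1970-01-01 +${days} days" +%Y-%m-%d
```

The same conversion works for the account expiration field, which uses the same day-count format.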

The group file

The /etc/group file holds group information about the entire system (see man group). In the following command, you can see that I'm a member of the wheel group (which allows my account to use the sudo command) and that there's a group name that is also the name of my user account. The sean at the end of the wheel line indicates that I am a member of the wheel group. Although user sean is a member of group sean, users do not have to be listed as members of their own group.

grep -E 'wheel|^sean' /etc/group
wheel:x:10:sean
sean:x:1000:

The fields are:

  • group name
  • group password
  • group ID (GID)
  • group members (user list)
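The id command summarizes the same information from the user's perspective: it prints the UID, the primary group, and all supplementary groups on one line:

```shell
# Show the UID, primary GID, and supplementary groups for the current user
id
# The same information for a named account (here, whichever user we are)
id "$(whoami)"
```

This saves grepping through /etc/passwd and /etc/group separately.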

Management Tools

Other user and group utilities include:

  • /usr/sbin/useradd
  • /usr/sbin/usermod
  • /usr/sbin/userdel
  • /usr/sbin/groupadd
  • /usr/sbin/groupdel
  • /usr/sbin/groupmod
  • /usr/sbin/gpasswd

Practice

Modify default new user settings

In today's demo, we will modify some default user account settings for new users, and then we'll create a new user account.

Before we proceed, let's review several important configuration files that establish some default settings:

  • /etc/login.defs : see man login.defs
  • /etc/skel
  • /etc/default/useradd

Let's change some defaults. We can either use sudo or become root with su. Here I use sudo to become root:

sudo su

Let's edit the default .bashrc file:

nano /etc/skel/.bashrc

We want to add these lines at the end of the file:

# Dear New User,
#
# I have made the following settings to make your life a bit easier:
#
# make "c" a shortcut for "clear"
alias c='clear'
#
# make vi the default command line keybinding
set -o vi

Now use nano again to create a README file. This file will be added to the home directories of all new users. Add any welcome message you want to add, plus any guidelines for using the system.

nano /etc/skel/README

Add new user account

After writing (saving) and exiting nano, we can go ahead and create a new user named linus. The -m option creates the user's home directory, the -U option creates a group with the same name as the user, and the -s option sets the default shell to /usr/bin/bash.

useradd -m -U -s /usr/bin/bash linus
grep "linus" /etc/passwd

Let's add the user's full name:

chfn -f "Linus Torvalds" linus

The user does not yet have a password set. Let's create a password for linus:

grep "linus" /etc/shadow
passwd linus
grep "linus" /etc/shadow

Let's modify the minimum days and maximum days of the password's lifetime:

passwd -n 90 linus
passwd -x 180 linus
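To confirm the change took, we can inspect fields 4 (minimum age) and 5 (maximum age) of the user's entry in /etc/shadow. Reading the real file requires root; the echoed line below is a made-up sample that mirrors what the entry would look like after the passwd commands above:

```shell
# As root on the server you would run:
#   grep "linus" /etc/shadow | awk -F: '{ print "min:", $4, "max:", $5 }'
# Here the same awk program runs on an illustrative shadow line:
echo "linus:!:19060:90:180:7:::" | awk -F: '{ print "min:", $4, "max:", $5 }'
```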

Create a new group; add users to the group

Let's now create a new group, and then I will add my account and my new user's account to the group:

grep "linus" /etc/group
groupadd developers
grep "developers" /etc/group
gpasswd -a linus developers
gpasswd -a sean developers
grep "developers" /etc/group

Exit out of root if logged in as root.

Now login as user linus and examine the user's group memberships:

su linus
groups

Great! Let's exit out and become root again:

exit
sudo su

Let's make the /projects directory/logical volume a shared directory:

ls -ld /projects
# change ownership of the directory to the group developers
chown :developers /projects
# allow all group users to add and delete from the folder and read/write to each other's files
# See this post for various options:
# https://ubuntuforums.org/showthread.php?t=2138476&p=12616640#post12616640
chmod 2770 /projects
exit

Log all the way out and then login again:

exit # from root
exit # from regular user

And then relogin so that the group modification will take effect. Check with the groups command:

groups
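The 2 in the 2770 mode above is the setgid bit, which makes new files in the directory inherit the directory's group rather than the creator's primary group. Here is a small self-contained demonstration using a scratch directory and our own primary group as a stand-in for developers; on the server, /projects behaves the same way:

```shell
# Create a scratch directory and set the setgid bit on it
demo=$(mktemp -d)
chgrp "$(id -gn)" "$demo"   # our primary group stands in for developers
chmod 2770 "$demo"
# Files created inside inherit the directory's group
touch "$demo/test-file"
ls -l "$demo/test-file"
rm -r "$demo"
```

Without the setgid bit, a file created by one developer could end up owned by that user's primary group, locking other group members out.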

User account and group deletion

If we want to delete the new user's account:

userdel -r linus
grep "linus" /etc/passwd
grep "linus" /etc/shadow
cd /home ; ls -l

And then delete the new group:

grep "developers" /etc/group
groupdel developers
grep "developers" /etc/group

systemd

  • systemd is an init system that aims to provide better boot time and a better way to manage services and processes.
  • systemd is a replacement for the System V-style init systems that most Linux distributions used.
  • systemd includes additional utilities to help manage services on a Linux system; essentially, it's much more than an init system

There are only two basic aspects of systemd that I want to cover in this lesson, but know that systemd is a big, complicated suite of software that provides a lot of functions. In this lesson, though, we will cover using systemd to:

  1. manage services
  2. examine logs

Manage Services

When we install a complicated piece of software like a web server (e.g., Apache2), an SSH server (e.g., openssh-server), or a database server (e.g., MySQL), then it's helpful if we have some commands that will help us manage that service (the web service/server, the SSH service/server, etc).

For example, after installing an SSH server, we might like to know if it's running, or we might want to stop it if it's running, or start it if it's not. Let's see what that looks like. In the following commands, I will use the dnf utility to install openssh-server. Then I will check the status of the server using the systemctl status command. I will enable it (with the --now flag, so that it also starts immediately) so that it starts automatically when the operating system is rebooted, using the systemctl enable command. Finally, I will make sure the firewall allows outside access to the operating system via ssh. I use the sudo command to run the relevant commands as administrator:

dnf search openssh
sudo dnf install openssh-server
systemctl status sshd.service
sudo systemctl enable --now sshd.service
sudo firewall-cmd --add-service=ssh --permanent
sudo firewall-cmd --reload

There are similar commands to stop a service or to reload a service if a service configuration file has changed. As an example of the latter, let's say that I wanted to present a message to anyone who logs into my system remotely using ssh. In order to do that, I need to edit the main ssh configuration file, which is located in /etc/ssh/sshd_config:

cd /etc/ssh
sudo nano sshd_config

Then I will remove the beginning pound sign and thus un-comment the following line:

#Banner none

And replace it with a path a file that will contain my message:

Banner /etc/ssh/ssh-banner

After saving and closing /etc/ssh/sshd_config, I will create and open the banner file using nano:

sudo nano /etc/ssh/ssh-banner

And add the following:

Unauthorized access to this system is not permitted and will be reported to the authorities.

Since we have changed a configuration for the sshd.service, we need to reload the service so that sshd.service becomes aware of the new configuration. To do that, I use systemctl like so:

sudo systemctl reload sshd.service

Now, when you log into your Fedora system, you will see that new banner displayed.

Examine Logs

The journalctl command is also part of the systemd software suite and is used to monitor logs on the system.

If we just type journalctl at the command prompt, we will be presented with the logs for the entire system. These logs can be paged through by pressing the space bar, the page up/page down keys, or the up/down arrow keys, and they can also be searched by pressing the forward slash /.

journalctl

However, it's much better to use various options. If you press Tab twice after typing journalctl and a space, command line completion will list additional fields (see the man page: man 7 systemd.journal-fields, and see man man for an explanation of manual section numbers) that we can use to filter the logs. There are many, but as an example, there is a field called _UID=, which allows us to examine the logs for a user with a specific user id. On our independent Fedora systems, our user ID numbers are 1000. That means I can see the logs for my account by:

journalctl _UID=1000

The above shows journal entries related to user ID 1000, which is my user id. We can see other user IDs by concatenating (cat) the passwd file. Not only do real humans who have accounts on the system have user IDs; many services do, too. Here I look at journal entries for chronyd, a service that manages the system's time, which has a user ID of 984 on my system:

cat /etc/passwd
journalctl _UID=984

I can more specifically look at the logs files for a service by using the -u option with journalctl:

journalctl -u sshd.service

I can examine logs since last boot:

journalctl -b

Or I can follow the logs in real-time (press ctrl-c to quit the real-time view):

journalctl -f
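These options can also be combined or narrowed by time. The following sketch assumes the sshd service from earlier; the "1 hour ago" window is just an example value:

```shell
# sshd entries since the last boot only
journalctl -u sshd.service -b
# All entries from the last hour
journalctl --since "1 hour ago"
```

Combining a unit filter with -b or --since is usually the fastest way to narrow down a problem.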

Useful Systemd Commands

You can see more of what systemctl or journalctl can do by reading through their documentation:

man systemctl
man journalctl

You can get the status of, start, stop, reload, or restart a service; e.g., sshd:

systemctl status sshd.service
systemctl start sshd.service
systemctl stop sshd.service
systemctl reload sshd.service
systemctl restart sshd.service
systemctl reload-or-restart sshd.service

To enable, disable sshd (or some service):

systemctl enable sshd.service
systemctl disable sshd.service

You can check if a service is enabled:

systemctl is-enabled sshd.service

You can reboot, poweroff, or suspend a system:

systemctl reboot
systemctl poweroff
systemctl suspend

To show configuration file changes to the system:

systemd-delta

To list real-time control group process, resource usage, and memory usage:

systemd-cgtop
To search failed processes/services:

systemctl --state failed

To list services:

systemctl list-unit-files -t service

To examine boot time:

systemd-analyze

Backing up and Managing Software

Backup with rsync

Using our /backups directory

To complete this assignment, you must have successfully completed Assignment 5: Logical Volumes.

In Assignment 5: Logical Volumes, we created an additional volume at /backups. In a real production scenario, we'd use separate drives and remote machines to store backups in order to fully prepare for and recover from data loss. But for us, this /backups partition will work just fine.

Backup with rsync

In the Managing Users and Groups forum, we created a new user for our system. This means that we have at least two user accounts to back up: ours and the new user's.

Now we'll use the rsync command to backup these users' home directories to the /backups volume/partition. See man rsync for documentation. The basic syntax is:

rsync option source-directory destination-directory

Syntax matters here. Specifically, there are two ways to backup the home directories using rsync depending on a small detail in our commands. Consider the next two examples:

rsync behaves differently depending on whether I include or leave out a trailing slash after the source directory. In the first example, rsync will back up the /home directory itself, along with all of its subdirectories, to the /backups directory.

Thus, if I run rsync this way:

rsync -ahv --delete /home /backups/

The results will look like this:

ls /backups
home
ls /backups/home
linus sean

If I include the trailing slash after the /home/ directory, like so:

rsync -ahv --delete /home/ /backups/

Then rsync will sync the contents of the /home/ directory, and the results will include not /home but the specific directories contained in /home:

ls /backups
linus sean

Delete Option

The --delete option is important. Without it, rsync will add new files to the destination directory when it backs up the source directory. With it, rsync truly syncs. Thus, if a file that was previously backed up to the destination directory and later deleted in the source directory (e.g., because it was no longer needed), then it will be deleted from the destination directory when the --delete option is used. This is how services like Dropbox work.
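Because --delete can remove files from the destination, it's worth previewing a sync before running it for real. rsync's -n (--dry-run) option reports what would change without touching anything. On the server that would be rsync -ahvn --delete /home /backups/; the sketch below demonstrates the same idea with throwaway directories so nothing real is at risk:

```shell
# Preview a sync with scratch directories; -n makes it a dry run
src=$(mktemp -d) && dst=$(mktemp -d)
touch "$src/example-file"
rsync -ahvn --delete "$src/" "$dst/"   # reports example-file, copies nothing
ls "$dst"                              # destination is still empty
rm -r "$src" "$dst"
```

Once the dry-run output looks right, rerun the same command without -n.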

See other options and functionality for rsync here: https://www.linux.com/tutorials/how-backup-files-linux-rsync-command-line/. One of the most important features is the ability to back up to remote machines over a network, which rsync can also do.

Managing Software

Many modern Linux distros offer some kind of package management for installing, managing, and removing software. On RedHat based systems, package management is based on rpm (the RedHat Package Manager). On Debian based systems, package management is based on dpkg.

There are some advanced things you can do with these base package management systems (rpm or dpkg), but most of the time it will be easier to use their front ends. For RedHat systems, the current front end is called dnf, and for Debian systems, it's apt or apt-get. Since the Fedora distribution is part of the RedHat universe, we'll use the dnf command to manage software. As always, read the man dnf manual for more information. See also the online documentation on dnf.

Let's look at a few of the basic dnf commands.

dnf info and search commands

To see a history of how dnf has been used on the system:

dnf history

We can get the history of a specific package on our system. Since we haven't installed anything yet, there's nothing to look at, but the basic syntax looks like this:

dnf history package_name

To search for a package, we can use the following command to search for the bash package:

dnf search bash

If the output is more than one page, we can pipe it through less.

To get technical information on a specific package, which we might want to do before we install it:

dnf info bash

We can use dnf to search by tag in order to get information about a package. To get a list of possible tags to search by, we can use the following command:

dnf repoquery --querytags

Then to search by tag, we use the following format, which will show us tag-related information for the bash package:

dnf repoquery --queryformat "%{arch}" bash
dnf repoquery --queryformat "%{name}" bash
dnf repoquery --queryformat "%{release}" bash
dnf repoquery --queryformat "%{reponame}" bash

Software managed by dnf is organized into groups, which contain multiple software packages. We can see a list of groups with this command:

dnf group list

You should see a category called Installed Groups and listed under that is Headless Management. You might recognize that from when we installed Fedora.

To see what packages would be installed with a group, such as the System Tools group, we can do:

dnf group info "System Tools"

If we want to install the default packages with a group, then:

sudo dnf group install "System Tools"

dnf install process and commands

It's pretty simple to install a software package. The hard part will involve configuring a package after it's installed, if it's a complicated piece of software. For now, let's install tmux, which is a terminal multiplexer that we can use to open and manage multiple terminals in a single window.

dnf search tmux
dnf info tmux
sudo dnf install tmux
dnf history tmux

To use tmux, I like to use the ctrl-a keybinding to control it. By default, it's set to use ctrl-b. Let's configure the new keybinding like so. Here we redirect the configuration to the configuration file in our home directory. Since I'm using a single redirect >, this file gets created. Remember to use a double redirect >> if appending to the file.

echo "set-option -g prefix C-a" > $HOME/.tmux.conf

And then start tmux like so:

tmux

When done, just type exit.
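One refinement, assuming the same ~/.tmux.conf as above: the earlier echo sets C-a as a prefix but leaves the default C-b binding in place. A common convention is to also unbind C-b and let a doubled C-a pass through to programs running inside tmux:

```shell
# Append two more settings to the tmux configuration file
echo "unbind-key C-b" >> $HOME/.tmux.conf
echo "bind-key C-a send-prefix" >> $HOME/.tmux.conf
```

Restart tmux (or run tmux source-file ~/.tmux.conf inside a session) for the new settings to apply.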

Updating the system

It's easy to update the entire system:

sudo dnf update
sudo dnf clean all

The dnf clean all command removes the downloaded files, thereby freeing up storage space, used to update the system. It does not reverse the update.

dnf basics

Here are the basic dnf commands. See man dnf for details:

  • dnf search [name]
  • dnf install [name]
  • dnf remove [name]
  • dnf repolist
  • dnf list installed
  • dnf list available
  • dnf provides /usr/bin/bash
  • dnf info [name]
  • dnf update [name]
  • dnf check-update
  • dnf update OR dnf upgrade
  • dnf autoremove
  • dnf clean all
  • dnf help clean
  • dnf help
  • dnf history
  • dnf group list 'Python Science'
  • dnf group info 'Python Science'
  • dnf group install 'Python Science'
  • dnf group install --with-optional 'Python Science'
  • dnf group upgrade 'Python Science'
  • dnf group remove 'Python Science'

Introduction

This section covers basic networking, DNS, chroot, and firewalls.

Networking

Wikipedia has a good primer on the Internet protocol suite.

ARP (Address Resolution Protocol)

ARP or Address Resolution Protocol is a protocol used to map a network address, like the IP address, to the ethernet address (aka, the MAC or Media Access Control address, or the hardware address). Routers use MAC addresses to enable communication inside networks (w/in subnets) so that computers within a local network can talk to each other. Networks are designed so that IP addresses must be associated with MAC addresses before systems can communicate over a network.

To get ARP info for a system, we can use the ip command, which takes regular options (like -br for brief output) but also various objects (see man ip for details). Here are the IP info, the ARP output, and the routing table on my Fedora virtual machine (10.0.2.15) running on my desktop via a NAT connection:

ip a
ip neigh show
ip route show

Where:

  • 10.0.2.15 is the IP address of my Fedora server
  • 10.0.2.2 is the first usable address on the subnet, and is likely the virtual router; likewise, 52:54:00:12:35:02 is the MAC/hardware address for that virtual router
  • 10.0.2.0 is called the network address (signified by the /24 part), which is a unique IP address that identifies the subnet

The above information is used or created in the following way: a router is configured to use a specific network address. When it's brought online, it searches the network for connected MAC addresses assigned to wireless or ethernet cards, and it assigns each of those MAC addresses an available IP address based on the network address.

Internet Layer

IP (Internet Protocol)

The Internet Protocol, or IP, address is used to uniquely identify a host on a network and to locate that host on the network. If that network is subnetted (i.e., routed), then a host on a private subnet will have a private IP address that is not directly exposed to the Internet.

The following IP address ranges are reserved, private address ranges, which means no public internet device will have an IP address within these ranges. The private address ranges include:

Start Address    End Address
10.0.0.0         10.255.255.255
172.16.0.0       172.31.255.255
192.168.0.0      192.168.255.255

If you have a router at home and look at the IP address for any of your devices connected to that router, like your phone or computer, you will see that it has an address within one of the ranges above. For example, it might have an IP address beginning with 192.168.X.X. This is a standard IP address range for a home router. The 10.X.X.X private range can assign many more IP addresses on its network. This is why you'll see that IP range on bigger networks, like UK's. We'll talk more about subnetwork sizes shortly.
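As a small illustration, a shell function can pattern-match an address against the private ranges in the table above. The function name is_private is made up for this sketch, and this is a rough string match, not a full CIDR calculation:

```shell
# is_private: rough check whether an IPv4 address falls in one of the
# reserved private ranges (10/8, 172.16/12, 192.168/16)
is_private() {
  case "$1" in
    10.*)                                  return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    192.168.*)                             return 0 ;;
    *)                                     return 1 ;;
  esac
}

is_private 192.168.1.6 && echo "private"   # prints: private
```

A public address like 128.163.8.25 would not match any of the patterns, so the function would return a non-zero exit status.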

Example Private IP Usage

At work, at one time, the IP address on my desktop was 10.163.34.59/24 via a wired connection. I checked with my office neighbor and found that their desktop reported an IP address of 10.163.34.65/24. These are on the same subnet, and later I will show you how this works.

At the time, if we both, using our respective wired-connected computers, searched Google for [ what's my IP address ], we would see that we shared the same public IP address of 128.163.8.25.

Without any additional information, this tells us that all traffic coming from our computers and going out to the internet looks like it's coming from the same IP address (128.163.8.25). And in reverse, all traffic coming from outside our network first goes to 128.163.8.25 before it's routed to our respective computers via the router.

My laptop tells a different story because it is connected to UK wireless (eduroam). At the time of this writing, the laptop had the IP address 10.47.34.150/16. You can see there's a different pattern with this IP address. The reason is that this laptop is on a different subnet. This wireless subnet was configured to allow more hosts to connect to it since it must accommodate more devices (i.e., laptops, phones, etc.). When I searched Google for my IP address from this laptop, it reported 128.163.238.148, indicating that UK owns a range of public IP address spaces.

Here's a visual diagram of what this network looks like:

network diagram
Fig. 1. This figure contains a network switch, which is used to route traffic within a subnet. It relies solely on MAC addresses and not IP addresses to determine the location of devices on its subnet. The router is capable of transferring data across networks.

Using the ip Command

The ip command can do more than provide us information about our network. We can also use it to turn a connection to the network on or off (and more). Here is how to disable and then enable a connection on a machine. Note that enp0s3 is the name of my network card/device. Yours might have a different name.

sudo ip link set enp0s3 down
sudo ip link set enp0s3 up

Transport Layer

UDP, User Datagram Protocol

UDP, or User Datagram Protocol, performs a similar function as TCP, but it does not error check and data may get lost. UDP is useful for conducting voice over internet calls or for streaming video, such as through YouTube, which uses a type of UDP transmission called QUIC that has built-in encryption.

TCP, Transmission Control Protocol

TCP or Transmission Control Protocol is responsible for the transmission of data and for making sure the data arrives at its destination w/o errors. If there are errors, the data is re-transmitted or halted in case of some failure. Much of the data sent over the internet is sent using TCP.

TCP and UDP Headers

The above protocols send data in data packets (TCP) or datagrams (UDP), but these terms may be used interchangeably. Packets for both protocols include header information to help route the data across the internet. TCP includes ten fields of header data, and UDP includes four fields.

We can see this header data using the tcpdump command, which requires sudo or being root to use. The first part of the IP header contains the source address, then comes the destination address, and so forth. Aside from a few other parts, this is the primary information in an IP header.

To use tcpdump, we first identify the IP number of a host, which we can do with the ping command, and then run tcpdump:

ping -c1 www.uky.edu
sudo tcpdump host 128.163.35.46

While that's running, we can type that IP address in our web browser, or enter www.uky.edu, and watch the output of tcpdump.

TCP headers include port information and other mandatory fields for both source and destination servers. The SYN, or synchronize, message is sent when a source or client requests a connection. The ACK, or acknowledgment, message is sent in response, along with a SYN message, to acknowledge the request for a connection. Then the client responds with an additional ACK message. This is referred to as the TCP three-way handshake. In addition to the header info, TCP and UDP packets include the data that's being sent (e.g., a webpage) and error checking if it's TCP.

Ports

TCP and UDP connections use ports to bind internet traffic to specific IP addresses. Specifically, a port associates a process with an application, such as a web service or outgoing email. That is, ports provide a way to distinguish and filter internet traffic through an IP address. E.g., all traffic going to IP address 10.0.5.33:80 means that this is http traffic for the http web service, since http is commonly associated with port 80. Note that the port info is attached to the end of the IP address via a colon.
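The colon notation mentioned above can be pulled apart with shell parameter expansion. This is purely an illustration of the address:port format, not part of any networking tool:

```shell
# split an address:port pair into its two components
addr="10.0.5.33:80"
ip="${addr%:*}"     # strip the shortest ':*' suffix, leaving the IP
port="${addr##*:}"  # strip the longest '*:' prefix, leaving the port
echo "$ip serves http on port $port"
```

Tools like ss and netstat display listening sockets in this same address:port form.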

Common ports include:

  • 21: FTP
  • 22: SSH
  • 25: SMTP
  • 53: DNS
  • 143: IMAP
  • 443: HTTPS
  • 587: SMTP Secure
  • 993: IMAP Secure

There's a complete list of the default ports recognized on your Linux system. It's located in the following file:

less /etc/services

And to get a count of the ports, we can invert grep for lines starting with a pound sign or empty lines:

grep -Ev "^#|^$" /etc/services | wc -l

See also the Wikipedia page: List of TCP and UDP port numbers

IP Subnetting

Private IP Ranges

When subnetting, we generally work with private IP ranges:

Start Address    End Address
10.0.0.0         10.255.255.255
172.16.0.0       172.31.255.255
192.168.0.0      192.168.255.255

IP Meaning

An IP address is 32 bits (4 x 8), or four bytes, in size. In human-readable form, it's usually expressed in the following decimal-based notation style:

  • 192.168.1.6
  • 172.16.3.44

Each set of numbers separated by a dot is referred to as an octet. An octet is a group of 8 bits. Eight bits equal a single byte. By implication, 8 gigabits equals 1 gigabyte, and 8 megabits equals 1 megabyte. We use these symbols to note the terms:

Term     Symbol
bit      b
byte     B
octet    o

Each bit is represented by either a 1 or a 0. For example, the first address above in binary is:

  • 11000000.10101000.00000001.00000110
  • 192.168.1.6

Or:

  • 11000000 = 192
  • 10101000 = 168
  • 00000001 = 1
  • 00000110 = 6
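We can check these conversions directly at the command line. Bash evaluates base-2 literals with its $((base#number)) arithmetic syntax:

```shell
# bash arithmetic accepts a base prefix: 2# means "interpret as binary"
echo $(( 2#11000000 ))   # 192
echo $(( 2#10101000 ))   # 168
echo $(( 2#00000001 ))   # 1
echo $(( 2#00000110 ))   # 6
```

This is a handy way to verify your hand-calculated octets while practicing IP math.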

IP Math

When doing IP math, one easy way to do it is to simply remember that each bit in each of the above bytes is a placeholder for the following values:

128 64 32 16 8 4 2 1

Alternatively, from low to high:

base-2Output
2^01
2^12
2^24
2^38
2^416
2^532
2^664
2^7128

In binary, 192 is equal to 11000000. It's helpful to work backward. For IP addresses, all octets are 255 or less (256 total, from 0 to 255) and therefore do not exceed 8 bits or places. To convert the integer 192 to binary:

1 * 2^7 = 128
1 * 2^6 =  64 (128 + 64 = 192)

STOP: There are no values left, and so the rest are zeroes. So: 11000000

Our everyday counting system is base-10, but binary is base-2, and thus another way to convert binary to decimal is to multiply each bit (1 or 0) by the power of two of its placeholder:

(0 * 2^0) = 0 +
(0 * 2^1) = 0 +
(0 * 2^2) = 0 +
(0 * 2^3) = 0 +
(0 * 2^4) = 0 +
(0 * 2^5) = 0 +
(1 * 2^6) = 64 +
(1 * 2^7) = 128 = 192

Another way to convert to binary: working from 128 down, subtract each place value from what remains of the number. If the place value fits into the remainder, the bit equals 1 and we subtract; otherwise the bit equals 0. So:

  • 192 - 128 = 64 -- therefore the first bit is equal to 1.
  • Now take the leftover and subtract it:
  • 64 - 64 = 0 -- therefore the second bit is equal to 1.

Since there is nothing remaining, the rest of the bits equal 0.

Subnetting Examples

Subnetting involves dividing a network into two or more subnets. When we subnet, we first identify the number of hosts we will require on the subnet. For starters, let's assume that we need a subnet that can assign at most 254 IP addresses to the devices attached to it via the router.

We need two additional IP addresses: the subnet mask and the network address/ID. The network address identifies the network and the subnet mask marks the boundary between the network and the hosts. Knowing or determining the subnet mask will allow us to determine how many hosts can exist on a network. Both the network address and the subnet mask can be written as IP addresses, but they cannot be assigned to computers on a network.

When we have determined these IPs, we will know the broadcast address. This is the last IP address in a subnet range, and it cannot be assigned to a connected device. The broadcast address is used by a router or other devices to communicate to all connected devices on the subnet.

For our sake, let's work backwards. We want to identify and describe a network that we are connected to. Let's work with two example private IP addresses that exist on two separate subnets.

Example 1: 192.168.1.6

Let's derive the network mask and the network address (or ID) from this IP address.

11000000.10101000.00000001.00000110 IP              192.168.1.6
11111111.11111111.11111111.00000000 Mask            255.255.255.0
-----------------------------------
11000000.10101000.00000001.00000000 Network Address 192.168.1.0

Note the mask has 24 ones followed by 8 zeroes. That 24 is used as CIDR notation:

192.168.1.6/24

For Example 1, we have the following subnet information:

Type          IP
Netmask/Mask  255.255.255.0
Network ID    192.168.1.0
Start Range   192.168.1.1
End Range     192.168.1.254
Broadcast     192.168.1.255
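The derivation above is a bitwise AND of the IP and the mask, octet by octet, and we can compute it in bash. The helper name netaddr is made up for this sketch:

```shell
# netaddr: AND each octet of an IPv4 address with the matching mask octet
# to produce the network address
netaddr() {
  local i1 i2 i3 i4 m1 m2 m3 m4
  IFS=. read -r i1 i2 i3 i4 <<< "$1"
  IFS=. read -r m1 m2 m3 m4 <<< "$2"
  echo "$(( i1 & m1 )).$(( i2 & m2 )).$(( i3 & m3 )).$(( i4 & m4 ))"
}

netaddr 192.168.1.6 255.255.255.0   # prints 192.168.1.0
```

This reproduces in code exactly what the binary worksheet does by hand: where the mask bit is 1 the IP bit survives, and where it is 0 the result is 0.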

Example 2: 10.160.38.75

For example 2:

00001010.10100000.00100110.01001011 IP               10.160.38.75
11111111.11111111.11111111.00000000 Mask            255.255.255.0
-----------------------------------
00001010.10100000.00100110.00000000 Network Address   10.160.38.0

Type          IP
Netmask/Mask  255.255.255.0
Network ID    10.160.38.0
Start Range   10.160.38.1
End Range     10.160.38.254
Broadcast     10.160.38.255

Example 3: 172.16.1.62/24

For example 3:

10101100.00010000.00000001.00111110 IP                172.16.1.62
11111111.11111111.11111111.00000000 Mask            255.255.255.0
-----------------------------------
10101100.00010000.00000001.00000000 Network Address    172.16.1.0

Type          IP
Netmask/Mask  255.255.255.0
Network ID    172.16.1.0
Start Range   172.16.1.1
End Range     172.16.1.254
Broadcast     172.16.1.255

To determine the number of hosts on a CIDR /24 subnet, we look at the start and end ranges. In all three of the above examples, the start range begins with X.X.X.1 and ends with X.X.X.254. Therefore, there are 254 maximum hosts allowed on these subnets.

Example 4: 10.0.5.23/16

The first three examples show instances where the CIDR is set to /24. This only allows 254 maximum hosts on a subnet. If the CIDR is set to /16, then we can theoretically allow 65,534 hosts on a subnet.

For example 4, then: 10.0.5.23/16

00001010.00000000.00000101.00010111 IP Address: 10.0.5.23
11111111.11111111.00000000.00000000 Mask:       255.255.0.0
-----------------------------------------------------------
00001010.00000000.00000000.00000000 Network ID: 10.0.0.0

Type          IP
IP Address    10.0.5.23
Netmask/Mask  255.255.0.0
Network ID    10.0.0.0
Start Range   10.0.0.1
End Range     10.0.255.254
Broadcast     10.0.255.255

Hosts:

IPs                        Count
10.0.0.0 - 10.0.0.255      = 256
10.0.1.0 - 10.0.1.255      = 256
...
10.0.255.0 - 10.0.255.255  = 256
  • Number of Hosts = 256 x 256 = 65536
  • Subtract Network ID (1) and Broadcast (1) = 2 IP addresses
  • Number of Usable Hosts = 256 x 256 - 2 = 65534
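This arithmetic generalizes to any CIDR prefix. The helper name usable_hosts is made up for this sketch:

```shell
# usable_hosts: number of assignable addresses on a subnet, given its
# CIDR prefix length; subtract 2 for the network ID and broadcast
usable_hosts() {
  echo $(( 2 ** (32 - $1) - 2 ))
}

usable_hosts 24   # 254
usable_hosts 16   # 65534
```

A /24 leaves 8 host bits (2^8 - 2 = 254) and a /16 leaves 16 host bits (2^16 - 2 = 65534), matching the examples above.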

IPv6 subnetting

We're not going to cover IPv6 subnetting, but if you're interested, this is a nice article: IPv6 subnetting overview

Introduction to DNS, the Domain Name System

DNS Intro Videos

Two helpful YouTube videos. The first one provides an overview of the DNS system:

How a DNS Server (Domain Name System) works

The second video illustrates how to use a GUI to create and manage DNS records.

DNS Records

Here is a nice intro to recursive DNS:

https://www.cloudflare.com/learning/dns/what-is-recursive-dns/

FQDN: The Fully Qualified Domain Name

The structure of the domain name system is like the structure of the UNIX/Linux file hierarchy; that is, it is like an inverted tree.

The fully qualified domain name includes a period at the end of the top-level domain. Your browser supplies that final dot for us, since we typically omit it when typing website addresses.

Thus, for Google's main page, the FQDN is:

FQDN: www.google.com.

And the parts include:

.           root domain
com         top-level domain
google.     second-level domain
www.        third-level domain
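These labels can be peeled off a FQDN with shell parameter expansion, working from the root toward the host, just as the DNS hierarchy does. This is an illustrative sketch only:

```shell
# split a fully qualified domain name into its labels
fqdn="www.google.com."
name="${fqdn%.}"      # drop the trailing root dot -> www.google.com
tld="${name##*.}"     # last label  -> com    (top-level domain)
rest="${name%.*}"     # remainder   -> www.google
sld="${rest##*.}"     # next label  -> google (second-level domain)
sub="${rest%.*}"      # remainder   -> www    (third-level domain)
echo "$tld / $sld / $sub"
```

Reading the labels right to left mirrors how a DNS query is resolved: root, then TLD servers, then the domain's own name servers.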

This is important to know so that you understand how the Domain Name System works and which DNS servers are responsible for their part of the network.

Root domain

The root domain is managed by root name servers. These servers are listed on the IANA, the Internet Assigned Numbers Authority, website, but are managed by multiple operators. The root servers manage the root zone, that is, the . at the end of .com., .edu., etc.

Alternative DNS root systems

Aside: It's possible to have alternate internets by using outside root name servers. This is not common, but it happens.

Russia, as an example, has threatened to use its own alternate internet based on a different DNS root system. This would essentially create a large, second internet. You can read about it in this IEEE Spectrum article.

Top level domain (TLD)

We are all familiar with top level domains. Specific examples include:

  • generic names: .org, .com, .net, .mil, .gov
  • and country code-based: .us, .uk, .ca

We can download a list of those top level names from IANA, and get a total count:

wget https://data.iana.org/TLD/tlds-alpha-by-domain.txt
sed '1d' tlds-alpha-by-domain.txt | wc -l
1495

Second-level domain names

In the Google example, the second level domain is google. Other examples include: redhat in redhat.com and debian in debian.org. Soyinka (2016) refers to this part of the FQDN as that which makes up the "organizational boundary of the namespace" (p. 425).

Third-level domain names / hostnames / subdomains

When you've purchased (leased) a top- and second-level domain like getfedora.org, you can choose whether to employ third-level domains. For example: www is a third-level domain. If you owned example.org, you could have www.example.org resolve to a different location, or www.example.org could resolve to the second-level domain itself. That is:

  • www.debian.org can point to debian.org

But it could also point to a separate server, such that debian.org and www.debian.org would be two separate servers with two separate websites or services. Although this is not common with third-level domains that start with www, it is common with others.

For example, with hostnames that are not www:

  • google.com resolves to www.google.com
  • google.com does not resolve to:
    • drive.google.com, or
    • maps.google.com, or
    • mail.google.com

This is because those other three provide different, but specific services.

DNS Paths

The recursive DNS server is the first DNS server to be queried in the DNS system. This is the resolver server in the first video above. This server queries itself (hence, recursive) to check whether the domain to IP mapping has been cached in its system.

If it hasn't been cached, then the DNS query is forwarded to a root server and so forth down the line.

We can use the dig command to query the non-cached DNS paths. Let's say we want to follow the DNS path for google.com; then we can start by querying any root server. In the output, we want to pay attention to the QUERY field, the ANSWER field, and the Authority Section:

dig @198.41.0.4 google.com 

The root servers only know about top level domains, and in this case, that's the com. domain. Fortunately, it lists some authoritative TLD servers and their IP addresses, and we can query one of those next:

dig @192.12.94.30 google.com

Now we know something about the TLD. Here we still don't know the full path, but it does tell us that for google.com, we need to query one of Google's name servers:

dig @216.239.34.10 google.com

And now we finally get our answer, which is that google.com resolves to 142.250.190.78, at least for me and at this instant.

DNS Record types

  • SOA: Start of Authority: describes the site's DNS entries
    • IN: Internet Record
  • NS: Name Server: state which name server provides DNS resolution
  • A: Address record: maps a hostname to an IPv4 address
  • AAAA: Address record: maps a hostname to an IPv6 address

dig google.com
google.com.     IN      A       142.251.32.78

  • PTR: Pointer Record: maps an IP address to a hostname
  • MX: Mail exchanger: the MX record maps your email server.
  • CNAME: Canonical name: used so that a domain name may act as an alias for another domain name. Thus, say someone visits www.example.org, but if no subdomain is set up for www, then the CNAME can point to example.org.

DNS Toolbox

It's important to be able to troubleshoot DNS issues. To do that, we have a few utilities available. Here are examples and you should read the man pages for each one:

host: resolve hostnames to IP Address; or IP addresses to hostnames

man -f host
host (1) - DNS lookup utility
host uky.edu
host 128.163.35.46
host -t MX uky.edu
host -t MX dropbox.com
host -t MX netflix.com
host -t MX wikipedia.org

dig: domain information groper -- get info on DNS servers

man -f dig
dig (1) - DNS lookup utility
dig uky.edu
dig uky.edu MX
dig www.uky.edu CNAME

nslookup: query internet name servers

man -f nslookup
nslookup (1) - query Internet name servers interactively
nslookup
> uky.edu
> yahoo.com
> exit

whois: determine ownership of a domain

man -f whois
whois (1) - client for the whois directory services
whois uky.edu | less

resolv.conf: local resolver info; what's your DNS info

man -f resolv.conf
resolv.conf (5) - resolver configuration file
cat /etc/resolv.conf
resolvectl status

Local Security: chroot

A chroot jail is a technology used to change the apparent root directory (/) for a user or a process and confine that user to that location on the system. A user or process confined to a chroot jail cannot easily see or access the rest of the file system and has limited access to the binaries (executables/apps/utilities) on the system. From its man page:

chroot (8) - run command or interactive shell with special root directory

Although it is not a complete security solution, it does have some useful security use cases, such as protecting parts of a system from tampering. Some have used chroot to contain DNS servers, for example.

chroot is also the conceptual basis for some kinds of virtualization technologies that are common today, like Docker.

chroot a current user

In this tutorial, we are going to create a chroot for a human user account.

Step 1: Let's create a new user. After we create the new user, we will chroot that user going forward.

sudo su
useradd -m -U -s /usr/bin/bash vader
passwd vader

Step 2: Next, we chroot vader into a new directory. That directory will be located at /var/chroot. Note that the root directory for our regular users is /, but user vader's root directory will be different: /var/chroot, even if they can't tell.

mkdir /var/chroot

Step 3: Now we set up available binaries for the user. We'll only allow bash for now. To do that, we'll create a bin/ directory, and copy bash to that directory.

which bash
mkdir -p /var/chroot/usr/bin
mkdir -p /var/chroot/bin
cp /usr/bin/bash /var/chroot/usr/bin/
cp /usr/bin/bash /var/chroot/bin/

Step 4: Large software applications have dependencies (aka, libraries). Thus, next we copy the libraries that bash needs to run.

To identify libraries needed by bash:

ldd /usr/bin/bash

Use the locate command to be sure you identify the exact locations of the libraries:

locate libtinfo.so.6
...

Create a library directory. We'll name the library directory lib64 since these are all lib64 libraries.

mkdir /var/chroot/lib64
cp /usr/lib64/libtinfo.so.6 /var/chroot/lib64/
cp /usr/lib64/libdl.so.2 /var/chroot/lib64/
cp /usr/lib64/libc.so.6 /var/chroot/lib64/
cp /usr/lib64/ld-linux-x86-64.so.2 /var/chroot/lib64/
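The repeated cp commands above can be automated with a small loop that parses ldd output. The function name copy_libs is made up for this sketch, and the demonstration below uses a temporary directory instead of /var/chroot so it doesn't require root:

```shell
# copy_libs: copy every shared library a binary needs into a jail directory,
# preserving each library's path under the jail
copy_libs() {
  local bin="$1" jail="$2" lib
  # ldd prints lines like "libc.so.6 => /lib64/libc.so.6 (0x...)";
  # grep -o extracts just the absolute paths
  for lib in $(ldd "$bin" | grep -o '/[^ ]*'); do
    mkdir -p "$jail$(dirname "$lib")"
    cp -n "$lib" "$jail$lib"
  done
}

jail=$(mktemp -d)                    # stand-in for /var/chroot
copy_libs "$(command -v bash)" "$jail"
find "$jail" -type f
```

When building a real jail as root, you would pass /var/chroot as the second argument; still double-check the copied paths with locate, as noted above.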

Step 5: Create and test the chroot

chroot /var/chroot/
bash-5.1# ls
bash: ls: command not found
bash-5.1# help
bash-5.1# dirs
bash-5.1# cd bin/
bash-5.1# dirs
bash-5.1# cd ../lib64/
bash-5.1# dirs
bash-5.1# cd ..
bash-5.1# exit

Step 6: Create a new group called chrootjail. We can add users to this group that we want to jail. Instructions are based on linuxconfig.org.

groupadd chrootjail
usermod -a -G chrootjail vader
groups vader

Step 7: Edit /etc/ssh/sshd_config to direct users in the chrootjail group to the chroot directory. Add the following lines at the end of the file. Then restart the ssh server.

# nano /etc/ssh/sshd_config
Match group chrootjail
            ChrootDirectory /var/chroot/

Exit nano, and restart ssh:

systemctl restart sshd

Then log out of the server altogether:

exit

Step 8: Test ssh.

Connect to the Fedora server via ssh as the user vader:

ssh vader@relevant_ip_address
-bash-5.1$ ls
-bash: ls: command not found
exit

That works as expected. The user vader is now restricted to a special directory and has limited access to the system or to any utilities on that system.

Exercise

By using the ldd command, you can add additional binaries for this user. As an exercise, use the ldd command to locate the libraries for the nano editor, and make nano available to the user vader in the chrooted directory.

Nano in chroot

After making Bash available in chroot:

(Side note: Unlike previous instances where I use the # sign to indicate a comment, below I'm using it to indicate the root prompt.)

# which nano
/usr/bin/nano
# cp /usr/bin/nano /var/chroot/bin/
# ldd /usr/bin/nano
linux-vdso.so.1 (0x00007fff5bdd5000)
  libmagic.so.1 => /lib64/libmagic.so.1 (0x00007f0ce11a7000)
  libncursesw.so.6 => /lib64/libncursesw.so.6 (0x00007f0ce1167000)
  libtinfo.so.6 => /lib64/libtinfo.so.6 (0x00007f0ce1138000)
  libc.so.6 => /lib64/libc.so.6 (0x00007f0ce0f6e000)
  libz.so.1 => /lib64/libz.so.1 (0x00007f0ce0f54000)
  libdl.so.2 => /lib64/libdl.so.2 (0x00007f0ce0f4d000)
  /lib64/ld-linux-x86-64.so.2 (0x00007f0ce1232000)
# cp /usr/lib64/libmagic.so.1 /var/chroot/lib64/
# cp /usr/lib64/libncursesw.so.6 /var/chroot/lib64/
# cp /usr/lib64/libtinfo.so.6 /var/chroot/lib64/
# cp /usr/lib64/libc.so.6 /var/chroot/lib64/
# cp /usr/lib64/libz.so.1 /var/chroot/lib64/
# cp /usr/lib64/libdl.so.2 /var/chroot/lib64/
# cp /usr/lib64/ld-linux-x86-64.so.2 /var/chroot/lib64/
# chroot /var/chroot/
bash-5.1# nano
Error opening terminal: xterm-256color.
bash-5.1# exit

To fix this, install ncurses-term and copy over additional files:

# dnf install -y ncurses-term
# locate xterm-256color
/usr/share/terminfo/s/screen.xterm-256color
/usr/share/terminfo/x/xterm-256color
# mkdir -p /var/chroot/etc/terminfo/x/
# cp /usr/share/terminfo/x/* /var/chroot/etc/terminfo/x/
# chroot /var/chroot
# nano

Security: Firewalls iptables and firewall-cmd

Netfilter is a part of the kernel that includes a suite of applications that help manage how packets flow in and out of a server or internet connected device. iptables is the command line user interface to netfilter and is one of the main firewall applications on many Linux distributions.

Fedora/RedHat offers a more user friendly interface to iptables called firewall-cmd that I'll discuss below. Ubuntu offers its own user friendly interface called ufw. In this lecture, I'll discuss iptables and firewall-cmd.

iptables

There are five predefined tables (operations) and five chains that come with iptables. Tables define the kinds of operations that you use to control the firewall and the packets that come in and out of the system or through a system. The filter and nat tables are the most commonly used ones.

The five chains are lists of rules that act on packets flowing through the system. Finally, there are also targets. These are the actions to be performed on packets. The main targets include ACCEPT, DROP, and RETURN.

From man iptables, the tables and respective chains include:

  • filter (the default table)
    • INPUT: for packets destined to local sockets
    • FORWARD: for packets being routed through the box
    • OUTPUT: for locally-generated packets
  • nat
    • PREROUTING: for altering packets as soon as they come in
    • INPUT: for altering packets destined for local sockets
    • OUTPUT: for altering locally-generated packets
    • POSTROUTING: for altering packets as they are about to go out
  • mangle
    • PREROUTING
    • OUTPUT
    • INPUT
    • FORWARD
    • POSTROUTING
  • raw
    • PREROUTING
    • OUTPUT
  • security
    • INPUT
    • OUTPUT
    • FORWARD

We'll cover the filter and nat tables.

Usage

First, we'll look at the default parameters for the filter table. You need to be root to run these commands, or use sudo:

sudo su
# -L: List all rules in the selected chain;
# if no chain is selected, then list all chains; and,
# -v: be verbose
iptables -L -v | less
iptables -L | grep policy
# specify a table
iptables -t filter -L -v

Allow connections only from subnet

We can change the firewall to only allow communication on a subnet. Of course, in order to do this, we need the subnet Network ID and the CIDR number:

ip a

Now that we have the Network ID and the CIDR number, we can set the firewall (don't follow along here if you're connecting via SSH because this will end your connection):

# comment: first, set the policy to drop all incoming, forwarding,
# and outgoing packets; this means we're starting from a baseline
iptables --policy INPUT DROP
iptables --policy FORWARD DROP
iptables --policy OUTPUT DROP

# comment: review new policies for the above chains
iptables -L | grep policy

# comment: now accept input, forwarding, and output only for the
# following network range:

iptables -A INPUT -s 10.163.36.0/24 -j ACCEPT
iptables -A FORWARD -s 10.163.36.0/24 -j ACCEPT
iptables -A OUTPUT -s 10.163.36.0/24 -j ACCEPT

To test this, we can try to connect to a remote server:

w3m https://www.google.com

There are lots of examples on the web. Here are a couple:

PREROUTING

Here is an example where we redirect all traffic destined for port 25 to port 2525. We use the nat table, since NAT is responsible for network address translation:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 25 -j REDIRECT --to-port 2525

OUTPUT

Here is an example where we disable outgoing email by blocking the ports that are commonly associated with email. Of course, this could be bypassed by using non-standard ports for email, but for our case, it's a decent demonstration:

iptables -A OUTPUT -p tcp -m multiport --dports 25,465,587 -j REJECT

firewall-cmd

firewall-cmd documentation

firewalld is a more user friendly interface to netfilter/iptables in Red Hat based distros.

In firewalld, we use zones to manage internet traffic. It is possible to define custom zones with firewalld, but it does come included with the following predefined zones, and in many cases, these might be all that we need:

  • DROP : strictest. All incoming network packets are dropped
  • BLOCK : also very strict; incoming connections are rejected
  • PUBLIC : only selected incoming connections are accepted. Good zone for web server, email server, etc.
  • EXTERNAL : external networks (useful for NAT)
  • DMZ : computers located in DMZ
  • work : trust most computers in network and accept some services
  • home : trust most computers in network and accept some services
  • trusted : trust all machines in network

Check if running:

firewall-cmd --state

Get active zones and interfaces attached to them:

firewall-cmd --get-zones
firewall-cmd --get-default-zone
firewall-cmd --get-active-zones
FedoraServer
  interfaces: enp0s3

Allow port 22, list ports, remove services by name, add services by name:

firewall-cmd --zone=FedoraServer --add-port=22/tcp
firewall-cmd --zone=FedoraServer --list-ports
firewall-cmd --zone=FedoraServer --remove-service=ssh --permanent
firewall-cmd --zone=FedoraServer --add-service=smtp --permanent
firewall-cmd --reload

Go into panic mode (drop all incoming/outgoing packets):

firewall-cmd --panic-on
firewall-cmd --panic-off

To change default zone:

firewall-cmd --permanent --set-default-zone=public

Introduction

This section shows how to create a LAMP (Linux, Apache2, MySQL, and PHP) server.

Installing Apache2

Let's install our first web server.

First, switch to Bridged mode in VirtualBox's network settings and refresh the MAC address in VirtualBox.

Update system first

Make sure your machine is up to date before installing Apache2.

Login as root, or switch to the root user, and update the machine:

sudo su
dnf update

Install httpd

Now that the machine is updated, install Apache2. On distributions that use a package management system, such as dnf on Fedora and apt on Ubuntu, we can use those systems to install the relevant software and dependencies. However, different distributions use different names for the packages. Fedora refers to the Apache2 package as httpd while Ubuntu refers to it as Apache2. We can use dnf to search for the appropriate package name:

dnf search apache | grep "httpd"

Apache2 is not the only web server available. nginx is another popular web server, and you should explore or learn about other options on your own. For now, let's get some basic info on the httpd package:

dnf info httpd

Based on the output, and at the time of this writing, it looks like the httpd package refers to the Apache HTTP Server, version 2.4.51. I want to highlight this because it's important to know which versions of software we're installing, for at least a couple of reasons:

  1. First, although Apache2 has its own dependencies, other packages will also depend on it. For example, if we wanted to install Drupal or WordPress, we would first have to install a web server, like Apache2, and it might be the case that Drupal or WordPress requires a certain minimum version of Apache2.

  2. Second, some Linux distributions focus on stability and thus do not ship the most recent version of a package, instead opting for the most stable version of the software. The latest stable release of Apache2 is 2.4.51, but Fedora or another distribution may not pick up that or a newer version until the next distribution upgrade, for example, from Fedora 33 to Fedora 34. For now, this is fine, and we can proceed with the install:

dnf -y install httpd
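
As an aside, the kind of minimum-version comparison described above can be sketched in plain shell with sort -V, which orders version strings numerically. This is only an illustration, not how dnf actually resolves dependencies, and the version numbers are examples:

```shell
#!/bin/sh
# Sketch: check whether an installed version meets a required minimum
# (illustrative only; version numbers are examples).
installed="2.4.51"
required="2.4.46"

# sort -V orders version strings numerically; if the smaller of the
# two is the required version, the installed version is new enough.
lowest=$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
    echo "OK: $installed >= $required"
else
    echo "Too old: $installed < $required"
fi
```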

Basic checks

One of the things that makes Apache2, and some other web servers, powerful is the library of modules that extend Apache's functionality. We'll come back to modules soon. For now, we're going to make sure the server is up and running, configure some basic things, and then create a basic web site.

To start, let's get some info about Apache2 and make sure it is enabled and running:

systemctl list-unit-files httpd.service
systemctl enable httpd.service
systemctl list-unit-files httpd.service
systemctl status httpd.service
systemctl start httpd.service
systemctl status httpd.service

Creating a web page

Now that we have it up and running, let's look at the default web page. We can use our loopback IP address (aka, localhost) and the w3m text web browser to view the default page:

dnf install -y w3m
w3m http://127.0.0.1
w3m http://localhost/

The w3m text-mode browser shows the Fedora Test Page. That's a sign that the default install was successful.

Let's now create our first web page. To do so, we need to know which directory httpd uses to serve websites. This directory is called the DocumentRoot directory. If we read through that Fedora Test Page, it tells us that the default directory is /var/www/html/. Let's go there and create a webpage with our text editor of choice:

cd /var/www/html/
nano index.html

Create a simple HTML page, something like this. Of course, modify the content to suit your own interests:

<html>
<head>
<title>My first web page using Apache2</title>
</head>
<body>

<h1>Welcome</h1>

<p>Welcome to my web site. It is the first site I have ever created using
Apache2 and Fedora Linux.</p>

<p>Thanks!<br/>
Dr. Burns</p>

</body>
</html>

After you're done, save and close the document. Let's visit our website again with w3m to see if it works:

w3m http://127.0.0.1

Let's open the firewall so that outside systems can access this page:

firewall-cmd --list-all
firewall-cmd --get-active-zones
firewall-cmd --zone=FedoraServer --add-service=http
firewall-cmd --zone=FedoraServer --add-service=https
firewall-cmd --runtime-to-permanent

Changing the hostname

The hostname of a system is the label it uses to identify itself to others (humans) on a (sub)network. If the host is on the web (or the internet), the hostname may also be part of its fully qualified domain name (FQDN), which we studied during the DNS and networking weeks. For example, on a server identified as enterprise.example.net, enterprise is the hostname, example.net is the domain name, and enterprise.example.net is the fully qualified domain name. If two computers are on the same subnet, they can reach each other by hostname alone, but the domain name is part of the DNS system and is required for two computers on the broader internet to reach each other.
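
The split between hostname and domain name can be illustrated with shell parameter expansion (the FQDN here is the example from above):

```shell
#!/bin/sh
# Split an example FQDN into its hostname and domain-name parts.
fqdn="enterprise.example.net"

host="${fqdn%%.*}"   # strip everything after the first dot
domain="${fqdn#*.}"  # strip everything up to and including the first dot

echo "hostname: $host"    # prints: hostname: enterprise
echo "domain:   $domain"  # prints: domain:   example.net
```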

We're going to check and set the system hostname on our Fedora (virtual) machines using the hostname and hostnamectl commands:

Check the default hostname:

# hostname
localhost.localdomain

To change the default hostname from localhost, use the hostnamectl command to update the system's hostname, which is stored in /etc/hostname. My new hostname will be enterprise. You can choose whatever hostname you want, but be sure it's a single word with no punctuation.

hostnamectl set-hostname enterprise
hostname
cat /etc/hostname

We can access our site by hostname rather than by IP:

w3m http://enterprise

Optional

After you've completed the above steps, do the following:

  1. On your host machine, find your OS's version of /etc/hosts.

  2. Map your guest IP address (your Fedora IP) to your new hostname:

    192.168.4.31 enterprise
    

Then, in your Firefox, Chrome, or whatever browser, visit your new website and replace enterprise with the hostname that you chose for your guest OS:

http://enterprise

Apache2 User Directories

We can enable Apache2 so that users on our systems can run websites from their home directories; that is, sites located at:

  • $HOME/public_html

Enable userdir

Edit the userdir.conf file.

cd /etc/httpd/conf.d/
nano userdir.conf

Make the following changes:

  • UserDir disabled to UserDir enabled
  • Uncomment line UserDir public_html

After saving and exiting, restart httpd.service:

systemctl restart httpd.service

Tasks

  1. Exit out of root account
  2. Go to your regular user's home directory
  3. Make a directory titled public_html if it doesn't already exist
  4. Set public_html directory permissions to 755:
    • chmod 755 public_html
  5. Change the user's directory permissions to 701:
    • chmod 701 /home/your_user

SELinux needs to be configured to allow web access to our home directories. Specifically, we need to set some SELinux switches. Using sudo or logging in as root, run the following commands, making sure to replace sean with your username:

setsebool -P httpd_enable_homedirs true
chcon -R -t httpd_sys_content_t /home/sean/public_html

Exit out of root if you need to.

Test

Now test to see if your public_html site is operational by simply visiting it. For me, that means running the following commands:

cd ~/public_html/
echo "<p>Hello world</p>" >> index.html
w3m http://127.0.0.1/~sean
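
Apache's userdir module maps a URL path like /~sean to that user's public_html directory. The mapping can be sketched in shell; userdir_path is a made-up helper name for illustration, not an Apache command:

```shell
#!/bin/sh
# Sketch of the URL-to-path rewrite that mod_userdir performs.
# userdir_path is a hypothetical helper, not part of Apache.
userdir_path() {
    # Turn a request path like /~sean/index.html into
    # /home/sean/public_html/index.html
    user="${1#/~}"; user="${user%%/*}"   # extract the username
    rest="${1#/~$user}"                  # keep the remainder of the path
    echo "/home/$user/public_html$rest"
}

userdir_path "/~sean/index.html"
# prints: /home/sean/public_html/index.html
```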

Installing and Setting PHP for Apache

PHP is a server-side programming language. Client-side programming languages, like JavaScript, are handled by the browser, but the PHP software must be installed on the server and made to work with the Apache httpd server.

In order for Apache and PHP to work together, we therefore need to install PHP. Then we'll make some modifications to Apache so that it defaults to PHP files rather than to HTML files.

To get started, let's work with last week's HTTP machine that we used to set up Apache and user directories (userdir). You can use that machine, or you can clone it to be sure that you have a good backup in case you need to start over.

Let's find the relevant packages to install. Again, make sure your system is up to date first.

sudo su
dnf -y upgrade
dnf search php | less
dnf info php
dnf info php-common
dnf install php php-common

Since we are altering how the Apache httpd service functions, we need to restart the service. To check and restart services:

systemctl status httpd.service
systemctl restart httpd.service
systemctl status httpd.service

If all is well, our next task is to see if the Apache httpd service recognizes PHP. We will proceed to the base HTTP directory, and use nano to create and open a file called info.php.

cd /var/www/html/
nano info.php

To make sure that the Apache web server can recognize that PHP is installed and usable, we can add test code to the info.php file. The test code will give us information about the version of PHP that we just installed:

<?php
phpinfo();
?>

Next, update file ownership for all files in this directory. They should be owned by the Apache user:

ls -l
chown apache:apache *
ls -l
w3m http://localhost/info.php

By default, if both an index.html file and an index.php file exist in the same directory, the Apache web server will display the index.html file if a user visits the directory (e.g., http://example.com/ or http://localhost/). So we need to configure Apache to display index.php files before displaying index.html files in case both files exist in the same directory:

cd /etc/httpd/conf/
nano httpd.conf

Change this line:

DirectoryIndex index.html

To the following; that is, add index.php to the line and make sure that it comes before index.html:

DirectoryIndex index.php index.html
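
The behavior of DirectoryIndex can be sketched in shell: Apache serves the first candidate in the list that actually exists in the directory. A minimal illustration, using a temporary directory rather than the real DocumentRoot:

```shell
#!/bin/sh
# Sketch of DirectoryIndex: serve the first index file that exists,
# in the configured order (temporary directory stands in for DocumentRoot).
docroot=$(mktemp -d)
touch "$docroot/index.html" "$docroot/index.php"

for candidate in index.php index.html; do
    if [ -f "$docroot/$candidate" ]; then
        echo "would serve: $candidate"
        break
    fi
done
# prints: would serve: index.php

rm -r "$docroot"
```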

Since we have modified an Apache configuration file, we should check that we haven't made a syntax mistake:

apachectl configtest

If we get a Syntax OK message (you can ignore the FQDN warning), we can tell Apache to reload its config files:

systemctl reload httpd.service
systemctl restart httpd.service

Now create a basic PHP page. cd back to the base HTTP directory and use nano to create and open an index.php file:

cd /var/www/html/
nano index.php

Add some HTML and PHP that will detect our browser to the index.php file:

<p>You are using the following browser to view this site:</p>

<?php
echo $_SERVER['HTTP_USER_AGENT'] . "\n\n";

$browser = get_browser(null, true);
print_r($browser);
?>

Next, save and exit nano, change ownership of the file to Apache, and view with w3m:

chown apache:apache index.php
w3m http://localhost/

Of course, since we set up our hostname last week, we can use our hostname in our URL:

w3m http://enterprise/

Your goal:

  • Create an index.php file in your userdir
  • Add some PHP and submit screenshots, like last week, showing both the code and the output.

Test some sample PHP code from here: https://www.w3schools.com/php/php_syntax.asp

Apache2 VirtualHosts

In this tutorial, we use VirtualHosts so that our server may support multiple domain names.

We do this by configuring Apache2 to recognize new DocumentRoot directories for the additional domain names.

This allows us to serve multiple websites based on the same IP address.

Update OS

As always, we need to keep our Fedora installation updated:

dnf upgrade

Create new configurations

So far we have learned how to create a main website at the following document root:

/var/www/html

We have also learned how to enable Apache2 to serve websites from user directories:

/home/USER/public_html/.

Websites that are stored at /var/www/html can eventually have a domain name like example.org or biguniversity.edu. And then websites at /home/USER/public_html/ would have URLs like http://biguniversity.edu/~USER.

The problem with creating a website at the /var/www/html DocumentRoot is that, by default, we can only create the one main site; so either example.org or biguniversity.edu but not both.

VirtualHosts solve this problem. They allow a single server, with a single IP address, to host websites linked to multiple domain names, where each site has its own DocumentRoot directory under /var/www/html.
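
The selection logic behind name-based virtual hosting can be sketched in shell: the server picks a DocumentRoot based on the Host header of the incoming request. This is only a conceptual sketch using the example domains from above, not how Apache is implemented:

```shell
#!/bin/sh
# Sketch: one IP, many sites; DocumentRoot chosen by the requested
# hostname (Host header). Domains and paths are examples.
serve() {
    case "$1" in
        example.org|www.example.org)
            echo "/var/www/html/example" ;;
        biguniversity.edu|www.biguniversity.edu)
            echo "/var/www/html/biguniversity" ;;
        *)  # unmatched hosts fall back to the default site
            echo "/var/www/html" ;;
    esac
}

serve "example.org"        # prints: /var/www/html/example
serve "unknown.test.net"   # prints: /var/www/html
```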

To start, we need to revisit the Apache configuration files and add information about the VirtualHosts that we want to create.

We begin by taking a look at the main Apache configuration file:

less /etc/httpd/conf/httpd.conf

That file includes the following line:

  • IncludeOptional conf.d/*.conf

That option tells the Apache2 service to look for additional configuration files in the conf.d/ directory. Per the above line, the configuration files that we add need to end with .conf.
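
The *.conf part of that directive is an ordinary filename glob. A quick sketch in a temporary directory (the file names are made up) shows that only files ending in .conf match:

```shell
#!/bin/sh
# Sketch: a *.conf glob picks up only files ending in .conf
# (temporary directory stands in for /etc/httpd/conf.d/).
confd=$(mktemp -d)
touch "$confd/site1.conf" "$confd/site2.conf" "$confd/readme.txt"

# Only the .conf files match the glob; readme.txt is ignored:
for f in "$confd"/*.conf; do
    basename "$f"
done
# prints: site1.conf
#         site2.conf

rm -r "$confd"
```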

To get started, we'll name the files after some pretend domain names. I'll create a domain called linuxonenterprise.com and another one called websysadmins.com:

cd /etc/httpd/conf.d/
touch linuxonenterprise.conf

In the linuxonenterprise.conf file, I'll add the following info:

<VirtualHost *:80>
ServerAdmin webmaster@linuxonenterprise.com
DocumentRoot "/var/www/html/linuxonenterprise/"
ServerName linuxonenterprise.com
ServerAlias www.linuxonenterprise.com
ErrorLog "/var/log/httpd/linuxonenterprise.com-error_log"
CustomLog "/var/log/httpd/linuxonenterprise.com-access_log" combined

<Directory "/var/www/html/linuxonenterprise/">
DirectoryIndex index.php index.html
Options FollowSymLinks
AllowOverride All
Require all granted
</Directory>
</VirtualHost>

Then I'll repeat the process with a new file called websysadmins.conf. To make life easier, I can copy the linuxonenterprise.conf to websysadmins.conf.

cp linuxonenterprise.conf websysadmins.conf

And edit the websysadmins.conf file accordingly by replacing the names of the site:

<VirtualHost *:80>
ServerAdmin webmaster@websysadmins.com
DocumentRoot "/var/www/html/websysadmins/"
ServerName websysadmins.com
ServerAlias www.websysadmins.com
ErrorLog "/var/log/httpd/websysadmins.com-error_log"
CustomLog "/var/log/httpd/websysadmins.com-access_log" combined

<Directory "/var/www/html/websysadmins/">
DirectoryIndex index.php index.html
Options FollowSymLinks
AllowOverride All
Require all granted
</Directory>
</VirtualHost>

When done, I'll exit out of my text editor and check the configuration syntax with one of the following two commands:

httpd -t
apachectl configtest

You may get a warning stating that the DocumentRoot directories don't exist; we'll fix that in a second. The important thing is to get a Syntax OK message.

Creating the sites

The above two files tell Apache2 to look for the two websites in:

  • /var/www/html/linuxonenterprise
  • /var/www/html/websysadmins

These are the DocumentRoot directories, i.e., the base directories for our websites. We need to create those locations. I'll do that now for my two domains, using Bash brace expansion to create both at the same time:

mkdir /var/www/html/{linuxonenterprise,websysadmins}

The above command creates two directories:

  • /var/www/html/linuxonenterprise
  • /var/www/html/websysadmins
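
Brace expansion is performed by bash before the command runs; prefixing the command with echo shows exactly what mkdir receives:

```shell
#!/bin/bash
# Brace expansion is a bash feature: the shell expands the braces
# first, so mkdir receives two separate directory arguments.
echo mkdir /var/www/html/{linuxonenterprise,websysadmins}
# prints: mkdir /var/www/html/linuxonenterprise /var/www/html/websysadmins
```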

Now create some basic web pages in each domain directory:

cd /var/www/html/linuxonenterprise
echo "<h1>Linux on the Enterprise</h1>" >> index.html

Then cd to websysadmins from the linuxonenterprise directory:

cd ../websysadmins
echo "<h1>Web Sys Admins</h1>" >> index.html

And now we have to make sure that the user apache owns those two directories and all future files in them. We use the user apache because the main Apache2 configuration file (/etc/httpd/conf/httpd.conf) has two directives that state that the names of the User/Group should be apache:

cd .. # to return to the parent directory
chgrp -R apache /var/www/html/
chmod 2775 -R /var/www/html

By adding our account to the apache group, we can edit these and all future files without using sudo or becoming root. Here I make user 'sean' part of the apache group:

usermod -a -G apache sean

This group addition will not go into effect until the user logs out and logs back in.

You can run ls -ld and ls -l on those directories and files to confirm that the apache owner owns them. You can also run httpd -t or apachectl configtest again to confirm that all the syntax is good.
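
The 2 in the chmod 2775 mode above is the setgid bit; on a directory, it makes new files inherit the directory's group. A small demo in a temporary directory shows how it appears in the permission string:

```shell
#!/bin/sh
# Sketch: chmod 2775 sets the setgid bit on a directory
# (temporary directory used here, not /var/www/html).
d=$(mktemp -d)
chmod 2775 "$d"

# The group execute slot shows "s" instead of "x" when setgid is set:
stat -c '%A' "$d"
# prints: drwxrwsr-x

rm -r "$d"
```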

The Hosts File: /etc/hosts

In order to resolve IP addresses to domain names, we need some kind of system that maps these two identifiers to each other. We have already covered DNS more extensively, but since we're not really creating new websites for the web, we'll repeat what we did in previous weeks with the /etc/hosts file.

The /etc/hosts file is like a basic DNS system, and we can use it as a "static table lookup for hostnames" (from man hosts). Let's modify it so that our IP address is mapped to our domain names. First, find your IP address (mine is 192.168.4.32; use yours, not mine), then open /etc/hosts:

ip a
sudo nano /etc/hosts

Then let's map the IP address to the hostnames that we'll use for the new websites. Add the following to /etc/hosts, but replace my IP with yours and my hostname with one of your own creation:

192.168.4.32 linuxonenterprise.com
192.168.4.32 websysadmins.com

This is one way to create a kind of intranet that uses actual names instead of just IP addresses. Say that you have a home network and one of the computers on your network is running a web server. If you assign a static IP to this computer using the software on your home router, and then modify the /etc/hosts files on each of the other computers on the network to point a domain name at that static IP, then you have a basic DNS system for your subnet.
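
The lookup that /etc/hosts provides can be sketched with awk over a hosts-style table. The file here is a temporary stand-in, using the example IP and names from above:

```shell
#!/bin/sh
# Sketch: /etc/hosts resolution is a simple table lookup -- first
# column is the IP, later columns are names (file contents are examples).
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
127.0.0.1 localhost
192.168.4.32 linuxonenterprise.com
192.168.4.32 websysadmins.com
EOF

# Print the IP for a given hostname, the way a resolver would:
awk -v name="websysadmins.com" '$2 == name { print $1 }' "$hosts"
# prints: 192.168.4.32

rm "$hosts"
```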

Now, let's restart Apache2 and see if we can visit our sites.

systemctl reload httpd.service
systemctl restart httpd.service
w3m http://linuxonenterprise.com
w3m http://websysadmins.com

Success!

If you change the /etc/hosts file on your host machine (i.e., your laptop) per the instructions in the last lecture, then you should be able to visit http://linuxonenterprise.com and http://websysadmins.com in your browser. Here is a snippet of what my /etc/hosts file looks like on my desktop machine (i.e., my host machine):

127.0.0.1      localhost
127.0.1.1      desktop
192.168.4.32    linuxonenterprise.com
192.168.4.32    websysadmins.com

MySQL Server Administration

Install and Set Up MySQL

This week we'll learn how to install, set up, secure, and configure the MySQL relational database so that it works with the Apache2 web server. First, a point on terms. This week we will be working as

  • the Linux root user and as
  • the MySQL root user.

These are two different users and accounts. To revisit, in Linux, there is the root user, which has a home directory at /root, and there is also the root directory at /.

In MySQL and other relational database software, there is also a root user and this user is not the same as the Linux root user. It's important to keep these concepts separate in our heads, and for most of this transcript, I will refer to the MySQL root user or to the Linux root user when I'm referring to either one.

First, let's install MySQL Community Server, and then log into the MySQL shell under the MySQL root account.

sudo su
dnf upgrade
dnf search mysql server
dnf info community-mysql-server
dnf install community-mysql-server
systemctl list-unit-files mysqld.service
systemctl status mysqld.service
systemctl start mysqld.service
systemctl enable mysqld.service
systemctl status mysqld.service
systemctl list-unit-files mysqld.service
mysql -u root

After we have logged in, we need to create a secure password for the MySQL root account. Again, do not confuse the Linux root with the MySQL root account. That is, these are two different accounts: Linux root and MySQL root. Once we have created the password, we will exit MySQL.

In MySQL, we will create the following root password: "aNewPassword4!" (without the quotes), and then log out. In a production environment, I would not use a basic password like this, but for our purposes, we can keep things simple.

mysql> alter user 'root'@'localhost' identified by 'aNewPassword4!';
mysql> \q

Secure MySQL Server

Now we use a MySQL program called mysql_secure_installation to help secure the MySQL installation. From the Bash shell and while logged in as Linux root, run the following command, and respond to the command line prompts as follows:

mysql_secure_installation
Enter password for user root: aNewPassword4!
Validate Password: Y
Password Strength: 0
Change the password for root: N
Remove anonymous users: y
Disallow root login remotely: y
Remove test database: y
Reload privilege tables now: y

Create and Set Up a Regular User Account

Now, log back into the MySQL shell as the MySQL root user. Here the command is a bit different from the first one we used to log in to MySQL, because we now have to enter our password:

mysql -u root -p

In MySQL, we need to create and set up a new account that is not root and therefore does not have root privileges:

mysql> create user 'sean'@'localhost' identified by 'an0ldP4ssPhrase!';

Create a Practice Database

Now let's create a Linux-themed database for user 'sean'. This user will be granted all privileges on this database, including all its tables. Instead of granting all privileges, we could grant only specific privileges, such as CREATE, DROP, DELETE, INSERT, SELECT, UPDATE, and GRANT OPTION. Such privileges may be called operations or functions. They allow MySQL users to use and modify the database:

Don't use this exact command, but the syntax of the grant command below is this:

grant PRIVILEGE_OPTION on DATABASE.TABLE to 'USER'@'LOCALHOST';

In practice, we do this:

mysql> create database linuxdb;
mysql> grant all privileges on linuxdb.* to 'sean'@'localhost';
mysql> show databases;
mysql> \q

Logging in as Regular User and Creating Tables

Now we can start doing MySQL work. We've created a new MySQL user named sean and a new database for sean called linuxdb. Let's log out of the Linux root account, log back in under our regular Linux account (for me, that's sean), and create tables and data for our database:

$ mysql -u sean -p
mysql> show databases;
mysql> use linuxdb;
mysql> create table distributions
    -> (
    -> id int unsigned not null auto_increment,
    -> name varchar(150) not null,
    -> developer varchar(150) not null,
    -> founded date not null,
    -> primary key (id)
    -> );
Query OK, 0 rows affected (0.07 sec)

mysql> show tables;
mysql> describe distributions;

Congratulations! Now create some records for that table.

Adding records into the table

We'll use the INSERT command to add records:

mysql> insert into distributions (name, developer, founded) values
    -> ('Debian', 'The Debian Project', '1993-09-15'),
    -> ('Ubuntu', 'Canonical Ltd.', '2004-10-20'),
    -> ('Fedora', 'Fedora Project', '2003-11-06');
Query OK, 3 rows affected (0.06 sec)
Records: 3  Duplicates: 0  Warnings: 0
mysql> select * from distributions;

Success! Now let's test our table. We will complete the following tasks to refresh our MySQL knowledge:

  • retrieve some records or parts of records,
  • delete a record,
  • alter the table structure so that it will hold more data, and
  • add a record:

mysql> select name from distributions;
mysql> select founded from distributions;
mysql> select name, developer from distributions;
mysql> select name from distributions where name='Debian';
mysql> select developer from distributions where name='Ubuntu';
mysql> delete from distributions where name='Debian';
mysql> select * from distributions;
mysql> alter table distributions add packagemanager char(3) after name;
mysql> describe distributions;
mysql> select * from distributions;
mysql> update distributions set packagemanager="APT" where id="1";
mysql> update distributions set packagemanager="APT" where id="2";
mysql> update distributions set packagemanager="DNF" where id="3";
mysql> select * from distributions;
mysql> insert into distributions (name, packagemanager, developer, founded) values
    -> ('CentOS', 'YUM', 'The CentOS Project', '2004-05-14');
mysql> select * from distributions;
mysql> select name, packagemanager from distributions where founded < '2004-01-01';
mysql> select name from distributions order by founded;
mysql> \q

References and Read More

  1. MySQL: Getting Started with MySQL
  2. How to Create a New User and Grant Permissions in MySQL
  3. MySQL: MySQL 5.7 Reference Manual: 13 SQL Statement Syntax

Install PHP and MySQL Support

The next goal is to complete the connection between PHP and MySQL so that we can use both for our websites.

First install MySQL support for PHP. We're installing some modules alongside the basic support. These may or may not be needed, but I'm installing them to demonstrate some basics. Use dnf info <packagename> to get information about each package before installing.

sudo su
dnf install php-mysqlnd php-cli php-mbstring php-fpm
systemctl restart mysqld.service
systemctl restart httpd.service

Create PHP Scripts

Let's move to the base web directory and create our login file, which will contain the credentials for our MySQL regular user account. In the previous week, I demonstrated VirtualHosts. We'll use one of our virtual domains to connect to our MySQL server with PHP.

cd /var/www/html/linuxonenterprise/
touch login.php
chmod 640 login.php
ls -l login.php
nano login.php

In the file, add the following credentials, substituting your credentials where necessary:

<?php // login.php
$db_hostname = "localhost";
$db_database = "linuxdb";
$db_username = "sean";
$db_password = "an0ldP4ssPhrase!";
?>

Now, in a separate file, which will be index.php, add the following PHP to test our database connection and return some results:

<html>
<head><title>MySQL Server Example</title></head>
<body>

<?php

// Load MySQL credentials
require_once 'login.php';

// Establish connection
$conn = mysqli_connect($db_hostname, $db_username, $db_password) or
  die("Unable to connect");

// Open database
mysqli_select_db($conn, $db_database) or
  die("Could not open database '$db_database'");

// QUERY 1
$query1 = "show tables from $db_database";
$result1 = mysqli_query($conn, $query1);

$tblcnt = 0;
while($tbl = mysqli_fetch_array($result1)) {
  $tblcnt++;
}

if (!$tblcnt) {
  echo "<p>There are no tables</p>\n";
}
else {
  echo "<p>There are $tblcnt tables</p>\n";
}

// Free result1 set
mysqli_free_result($result1);

// QUERY 2
$query2 = "select name, developer from distributions";
$result2 = mysqli_query($conn, $query2);

$row = mysqli_fetch_array($result2, MYSQLI_NUM);
printf ("%s (%s)\n", $row[0], $row[1]);
echo "<br/>";

$row = mysqli_fetch_array($result2, MYSQLI_ASSOC);
printf ("%s (%s)\n", $row["name"], $row["developer"]);

// Free result2 set
mysqli_free_result($result2);

// Query 3
$query3 = "select * from distributions";
$result3 = mysqli_query($conn, $query3);

while($row = $result3->fetch_assoc()) {
  echo "<p>Owner " . $row["developer"] . " manages distribution " . $row["name"] . ".</p>";
}

mysqli_free_result($result3);

$result4 = mysqli_query($conn, $query3);
while($row = $result4->fetch_assoc()) {
  echo "<p>Distribution " . $row["name"] . " was released on " . $row["founded"] . ".</p>";
}

// Free result4 set
mysqli_free_result($result4);

/* Close connection */
mysqli_close($conn);

?>

</body>
</html>

After you save the file and exit the text editor, we need to test the PHP syntax. If there are any errors in our PHP, these commands will show the line numbers that are causing errors or leading up to them. If all is well with the first command, nothing will be output. If all is well with the second command, HTML should be output:

php -f login.php
php -f index.php
chown :apache *php

Check IP and Hostname

We want to make sure that /etc/hosts has the correct IP address for linuxonenterprise:

ip a
nano /etc/hosts # update IP address if changed

Tasks

Copy the login.php and index.php to your public_html directory (you should still have userdir enabled). Figure out what you need to change in order to get your script to work there.

References

Note: this doesn't seem to be a problem now, but previously there was an error with authentication due to an upgrade in MySQL that PHP hadn't caught up with yet. If you hit it, you might need to log in to MySQL as root and run the following command, replacing the relevant information with your non-root user info:

ALTER USER 'mysqlUsername'@'localhost' IDENTIFIED WITH mysql_native_password BY 'mysqlUsernamePassword';