D3 Tips and Tricks v4

Wednesday, 30 March 2016

Linux Directory Structure

The following post is a section of the book 'Just Enough Linux'.  The entire book can be downloaded in pdf format for free from Leanpub or you can read it online here.
Since this post is a snapshot in time. I recommend that you download a copy of the book which is updated frequently to improve and expand the content.
---------------------------------------

Linux Directory Structure

To a new user of Linux, the file structure may feel like something at best arcane and in some cases arbitrary. Of course this isn’t entirely the case and in spite of some distribution specific differences, there is a fairly well laid out hierarchy of directories and files with a good reason for being where they are.
We are frequently comfortable with the concept of navigating this structure using a graphical interface similar to that shown below, but to operate effectively at the command line we need to have a working knowledge of what goes where.
Linux Directories
The directories we are going to describe form a hierarchy similar to the following;
Directory Hierarchy
For a concise description of the directory functions check out the cheat sheet. Alternatively their function and descriptions are as follows;

/

The / or ‘root’ directory contains all other files and directories. It is important to note that this is not the root users home directory (although it used to be many years ago). The root user’s home directory is /root. Only the root user has write privileges for this directory.

/bin

The /bin directory contains common essential binary executables / commands for use by all users. For example: the commands cdcpls,pingps. These are commands that may be used by both the system administrator and by users, but which are required when no other filesystems are mounted.

/boot

The /boot directory contains the files needed to successfully start the computer during the boot process. As such the /boot directory contains information that is accessed before the Linux kernel begins running the programs and process that allow the operating system to function.

/dev

The /dev directory holds device files that represent physical devices attached to the computer such as hard drives, sound devices and communication ports as well as ‘logical’ devices such as a random number generator and /dev/null which will essentially discard any information sent to it. This directory holds a range of files that strongly reinforces the Linux precept that Everything is a file.

/etc

The /etc directory contains configuration files that control the operation of programs. It also contains scripts used to startup and shutdown individual programs.

/etc/cron.d

The /etc/cron.d/etc/cron.hourly/etc/cron.daily/etc/cron.weekly/etc/cron.monthly directories contain scripts which are executed on a regular schedule by the crontab process.

/etc/rc?.d

The /rc0.d/rc1.d/rc2.d/rc3.d/rc4.d/rc5.d/rc6.d/rcS.d directories contain the files required to control system services and configure the mode of operation (runlevel) for the computer.

/home

Because Linux is an operating system that is a ‘multi-user’ environment, each user requires a space to store information specific to them. This is done via the /home directory. For example, the user ‘pi’ would have /home/pi as their home directory.

/lib

The /lib directory contains shared library files that supports the executable files located under /bin and /sbin. It also holds the kernel modules (drivers) responsible for giving Linux a great deal of versatility to add or remove functionality as needs dictate.

/lost+found

The /lost+found directory will contain potentially recoverable data that might be produced if the file system undergoes an improper shut-down due to a crash or power failure. The data recovered is unlikely to be complete or undamaged, but in some circumstances it may hold useful information or pointers to the reason for the improper shut-down.

/media

The /media directory is used as a directory to temporarily mount removable devices (for example, /media/cdrom or /media/cdrecorder). This is a relatively new development for Linux and comes as a result of a degree of historical confusion over where was best to mount these types of devices (/cdrom/mnt or /mnt/cdrom for example).

/mnt

The /mnt directory is used as a generic mount point for filesystems or devices. Recent use of the directory is directing it towards it being used as a temporary mount point for system administrators, but there is a degree of historical variation that has resulted in different distributions doing things different ways (for example, Debian allocates /floppy and /cdrom as mount points while Redhat places them in/mnt/floppy and /mnt/cdrom respectively).

/opt

The /opt directory is used for the installation of third party or additional optional software that is not part of the default installation. Any applications installed in this area should be installed in such a way that it conforms to a reasonable structure and should not install files outside the /opt directory.

/proc

The /proc directory holds files that contain information about running processes and system resources. It can be described as a pseudo filesystem in the sense that it contains runtime system information, but not ‘real’ files in the normal sense of the word. For example the/proc/cpuinfo file which contains information about the computers cpus is listed as 0 bytes in length and yet if it is listed it will produce a description of the cpus in use.

/root

The /root directory is the home directory of the System Administrator, or the ‘root’ user. This could be viewed as slightly confusing as all other users home directories are in the /home directory and there is already a directory referred to as the ‘root’ directory (/). However, rest assured that there is good reason for doing this (sometimes the /home directory could be mounted on a separate file system that has to be accessed as a remote share).

/sbin

The /sbin directory is similar to the /bin directory in the sense that it holds binary executables / commands, but the ones in /sbin are essential to the working of the operating system and are identified as being those that the system administrator would use in maintaining the system. Examples of these commands are fdiskshutdownifconfig and modprobe.

/srv

The /srv directory is set aside to provide a location for storing data for specific services. The rationale behind using this directory is that processes or services which require a single location and directory hierarchy for data and scripts can have a consistent placement across systems.

/tmp

The /tmp directory is set aside as a location where programs or users that require a temporary location for storing files or data can do so on the understanding that when a system is rebooted or shut down, this location is cleared and the contents deleted.

/usr

The /usr directory serves as a directory where user programs and data are stored and shared. This potential wide range of files and information can make the /usr directory fairly large and complex, so it contains several subdirectories that mirror those in the root (/) directory to make organisation more consistent.

/usr/bin

The /usr/bin directory contains binary executable files for users. The distinction between /bin and /usr/bin is that /bin contains the essential commands required to operate the system even if no other file system is mounted and /usr/bin contains the programs that users will require to do normal tasks. For example; awkcurlphppython. If you can’t find a user binary under /bin, look under /usr/bin.

/usr/lib

The /usr/lib directory is the equivalent of the /lib directory in that it contains shared library files that supports the executable files for users located under /usr/bin and /usr/sbin.

/usr/local

The /usr/local directory contains users programs that are installed locally from source code. It is placed here specifically to avoid being inadvertently overwritten if the system software is upgraded.

/usr/sbin

The /usr/sbin directory contains non-essential binary executables which are used by the system administrator. For example cron anduseradd. If you can’t locate a system binary in /usr/sbin, try /sbin.

/var

The /var directory contains variable data files. These are files that are expected to grow under normal circumstances For example, log files or spool directories for printer queues.

/var/lib

The /var/lib directory holds dynamic state information that programs typically modify while they run. This can be used to preserve the state of an application between reboots or even to share state information between different instances of the same application.

/var/log

The /var/log directory holds log files from a range of programs and services. Files in /var/log can often grow quite large and care should be taken to ensure that the size of the directory is managed appropriately. This can be done with the logrotate program.

/var/spool

The /var/spool directory contains what are called ‘spool’ files that contain data stored for later processing. For example, printers which will queue print jobs in a spool file for eventual printing and then deletion when the resource (the printer) becomes available.

/var/tmp

The /var/tmp directory is a temporary store for data that needs to be held between reboots (unlike /tmp).


The post above (and heaps of other stuff) is in the book 'Just Enough Linux' that can be downloaded for free (or donate if you really want to :-)).

Monday, 28 March 2016

Using Pipes with Linux Commands

The following post is a section of the book 'Just Enough Linux'.  The entire book can be downloaded in pdf format for free from Leanpub or you can read it online here.
Since this post is a snapshot in time. I recommend that you download a copy of the book which is updated frequently to improve and expand the content.
---------------------------------------

Pipes are used in Linux to combine commands to allow them to have the output from a command feed directly into another command and so on. The idea is to use multiple commands to create a sequence of processing information from one command to another.
The commands are separated by the vertical line (|) that is normally found above the backslash key (\). This is the character that denotes the ‘pipe’ function. To combine three functions together with pipe symbol we would have something like the following;
As the name suggests, we can think of the pipe function as representing commands linked by a pipe where the command runs a function that is output in a pipe to flow to the second command and on to the eventual final output. To help with the visual association it might be useful to think of the connections similar to the following;
Piping one command after another
To demonstrate by example we can consider a set of commands lined as follows;
Piping Example
Here we have the ls command that is listing the contents of the /var/log directory with the listings sorted by time. This is then feeding into the head command that will only display the first 10 of those lines. We then feed those 10 lines into a grep command that filters the result to show only those lines that have the word ‘root’ in them.
The command as it would be run from the command line would be as follows;
… and the output would appear something like the following;
pi@raspberrypi ~ $ ls -lt /var/log | head | grep root
-rw-r----- 1 root  adm  133637 Sep 13 04:01 auth.log
-rw-r----- 1 root  adm   16794 Sep 13 04:01 syslog
-rw-rw-r-- 1 root  utmp 292292 Sep 12 18:58 lastlog
-rw-rw-r-- 1 root  utmp   2688 Sep 12 18:58 wtmp
-rw-r----- 1 root  adm    1050 Sep 12 06:25 messages
-rw-r----- 1 root  adm   18986 Sep 12 06:25 syslog.1
-rw-r----- 1 root  adm    1357 Sep 11 06:25 syslog.2.gz


The post above (and heaps of other stuff) is in the book 'Just Enough Linux' that can be downloaded for free (or donate if you really want to :-)).

Thursday, 24 March 2016

Regular Expressions in Linux

The following post is a section of the book 'Just Enough Linux'.  The entire book can be downloaded in pdf format for free from Leanpub or you can read it online here.
Since this post is a snapshot in time. I recommend that you download a copy of the book which is updated frequently to improve and expand the content.
---------------------------------------

Regular expressions (also called ‘regex’) are a pattern matching system that uses sequences of characters constructed according to pre-defined syntax rules to find desired strings in text. The topic of regular expressions is a book in itself and I heartily recommend further reading for those who find the need to use them in anger.
The command grep (where the re in grep stands for regular expression) is an essential tool for any one using Linux, allowing regular expressions to be used in file searches or command outputs. Although the use of regular expressions is widespread in multiple facets of computing operations.
For example, if we wanted to search the file dmesg which is in the /var/log directory and wanted to show each line that contained the string of characters CPU. we would use the grep command as follows;
The output from the command will appear similar to the following;
pi@raspberrypi ~ $ grep CPU /var/log/dmesg
[    0.000000] Booting Linux on physical CPU 0xf00
[    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.004116] CPU: Testing write buffer coherency: ok
[    0.053503] CPU0: update cpu_capacity 1024
[    0.053577] CPU0: thread -1, cpu 0, socket 15, mpidr 80000f00
[    0.113791] CPU1: Booted secondary processor
[    0.113851] CPU1: update cpu_capacity 1024
[    0.113860] CPU1: thread -1, cpu 1, socket 15, mpidr 80000f01
[    0.133710] CPU2: Booted secondary processor
[    0.133746] CPU2: update cpu_capacity 1024
[    0.133755] CPU2: thread -1, cpu 2, socket 15, mpidr 80000f02
[    0.153750] CPU3: Booted secondary processor
[    0.153788] CPU3: update cpu_capacity 1024
[    0.153797] CPU3: thread -1, cpu 3, socket 15, mpidr 80000f03
[    0.153891] Brought up 4 CPUs
[    0.154045] CPU: All CPU(s) started in SVC mode.
[    2.406902] ledtrig-cpu: registered to indicate activity on CPUs
This is a basic example that utilises a simple string to match on and shouldn’t necessarily be regarded as a great use of regular expressions. However, if we wanted to limit the returned results to instances where the string was the text CPU followed by the number 012 or 3, we could use a regular expression with a facility that included a range of options. This is accomplished by using the square brackets [] with the specified range inside.
In our case we want the text CPU and it must be immediately followed by a number in the range 0 to 3. This can be designated by the regular expression CPU[0-3].
Which means that our search as follows;
… will result in;
pi@raspberrypi ~ $ grep CPU[0-3] /var/log/dmesg
[    0.053503] CPU0: update cpu_capacity 1024
[    0.053577] CPU0: thread -1, cpu 0, socket 15, mpidr 80000f00
[    0.113791] CPU1: Booted secondary processor
[    0.113851] CPU1: update cpu_capacity 1024
[    0.113860] CPU1: thread -1, cpu 1, socket 15, mpidr 80000f01
[    0.133710] CPU2: Booted secondary processor
[    0.133746] CPU2: update cpu_capacity 1024
[    0.133755] CPU2: thread -1, cpu 2, socket 15, mpidr 80000f02
[    0.153750] CPU3: Booted secondary processor
[    0.153788] CPU3: update cpu_capacity 1024
[    0.153797] CPU3: thread -1, cpu 3, socket 15, mpidr 80000f03
The square brackets are ‘metacharacters’ and it is the use of these metacharacters that provide regular expressions with the foundation of their strength.
The following are some of the most commonly used metacharacters and a very short description of their effect (we will show examples further on);
[ ]Match anything inside the square brackets for ONE character
^(circumflex or caret) Matches only at the beginning of the target string (when not used inside square brackets (where it has a different meaning))
$Matches only at the end of the target string
.(period or full-stop) Matches any single character
?Matches when the preceding character occurs 0 or 1 times only
*Matches when the preceding character occurs 0 or more times
+Matches when the preceding character occurs 1 or more times
( )Can be used to group parts of our search expression together
|(vertical bar or pipe) Allows us to find the left hand or right values

Match a defined single character with square brackets ([])

As demonstrated at the start of this section, the use of square brackets will allow us to match any single character. The example we used below employed the use of the dash (or minus) character as a range signifier to signify that the possible characters were 012, or3.
We could also have simply put each character in the square brackets as follows;
In either case it should be noted that only a single character is matched for the entries in the square brackets.
We can specify more than one range and we can also distinguish between upper case and lower case characters. Therefore the following ranges will have the corresponding results;
  • [a-z] : Match any single character between a to z.
  • [A-Z] : Match any single character between A to Z.
  • [0-9] : Match any single character between 0 to 9.
  • [a-zA-Z0-9] : Match any single character either a to z or A to Z or 0 to 9
Within square brackets we can also use the circumflex or caret character (^) to negate the characters selection. I.e. with a caret we can say search for lines with the text CPU and it must be immediately followed by a character that is not in the range 0 to 3. This is done as follows;
Which would result in an output similar to the following;
pi@raspberrypi ~ $ grep CPU[^0-3] /var/log/dmesg
[    0.000000] Booting Linux on physical CPU 0xf00
[    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing
[    0.000000] PERCPU: Embedded 11 pages/cpu @ba05d000 s12864 r8192 d24000
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.004116] CPU: Testing write buffer coherency: ok
[    0.153891] Brought up 4 CPUs
[    0.154045] CPU: All CPU(s) started in SVC mode.
[    2.406902] ledtrig-cpu: registered to indicate activity on CPUs
Note that none of the previous lines with CPU0CPU1CPU2 or CPU3 have been listed.

Match at the beginning of a string (^)

We can use the circumflex or caret character (^) to match lines of text that begin with a specific set of characters.
Given a text file names foo.txt with the following contents;
First line with something
Second line with something else
Third line still going
Fourth Line but Second last
Last line. Goodbye!
If we run the grep command looking for the string ‘Second’ as follows;
We should have two lines returned as below;
pi@raspberrypi ~ $ grep Second foo.txt
Second line with something else
Fourth Line but Second last
But if we use the caret character to designate that we are only looking for lines that start with our string as follows;
… we will get the following output where only the second line is returned;
pi@raspberrypi ~ $ grep Second foo.txt
Second line with something else

Match at the end of a string ($)

We can use the dollar sign character ($) to match lines of text that finish with a specific character or set of characters.
For example, given a text file names foo.txt with the following contents;
First line with something
Second line with something else
Third line still going
Fourth Line but Second last
Last line. Goodbye!
If we use the dollar sign character to search for all lines that end in ‘ing’ as follows;
… we will get the following output where only the second line is returned;
pi@raspberrypi ~ $ grep ing$ foo.txt
First line with something
Third line still going

Match any single character (.)

The . (period) character will allow us to match any single character in this position.
For example, given a text file names foo.txt with the following contents;
First line with something
Second line with something else
Third line still going
Fourth Line but Second last
Last line. Goodbye!
… if we wanted to return all lines where the characters ing were in the middle of the line (not at the end) we could run the following grep command;
This would produce an output similar to the following;
pi@raspberrypi ~ $ grep ing. foo.txt
Second line with something else
While there are two other lines with ing in them, (the first and third lines), both of them end with ‘ing’ and as a result there is no character after them. The only one where there is ‘ing’ with a character following it is in the second line.

Match when the preceding character occurs 0 or 1 times only (?)

It may be difficult to think of a situation where we would want to match against something that occurs 0 or 1 time, but the best example comes from the world of language. In American spelling the word ‘color’ differs from the British spelling by the omission of the letter ‘u’ (‘colour’). We can write a regular expression that will match either spelling as follows;
This way the question mark denotes that for a match to occur, the preceding character must either not be present or must occur once. The additional characters (‘colo’ and the ‘r’) are literals in the sense that they must be present exactly as stated. The only variable in the expression is the ‘u’.

Match when the preceding character occurs 0 or more times (*)

The asterisk metacharacter in regular expressions can be one of the most confusing options to use, but this is mainly because its real strength is applied when matched with other metacharacters.
For example it could be argued that a regular expression such as q*w will match wqw and qqqw, however if we use a period and an asterisk together (.*) we gain a function that will match zero or more of any series of characters.
In this case we can use a regular expression such as …
… to find any combination of characters that start with pa and end with y and have any number of characters (including none) in between. These would include the following;
pacify
painfully
paisley
palmistry
palpably
pay

Match when the preceding character occurs 1 or more times (+)

The use of the + character to allow one or more instances of a character is similar to that of the asterisk. Where the * metacharacter might return the following matches from the regular expression fe*d;
fd
fed
feed
The use of fe+d would result in;
fed
feed

Group parts of a search expression together (())

Regular expressions can be combined into subgroups that can be operated on as separate entities by enclosing those entities in parenthesis. For example, if we wanted to return a match if we saw the word ‘monkey’ or ‘banana’ we would use the or metacharacter |(the pipe) to try to match one string or another as follows;

Find one group of values or another (|)

The pipe metacharacter allows us to apply a logical ‘or’ operator to our pattern matching. For example if we wanted to return a match if we saw the word ‘monkey’ or ‘banana’ we would use the words encapsulated in parenthesis and the pipe metacharacter to try to match one string or another as follows;

Extended Regular Expressions

In basic regular expressions the meta-characters ?+{|(, and ) are not regarded as special and instead we need to use the backslashed versions \?\+\{\|\(, and \).


The post above (and heaps of other stuff) is in the book 'Just Enough Linux' that can be downloaded for free (or donate if you really want to :-)).