CIS

COMPUTER INSTRUCTIONS

FALL, 1994

UNIX CONCEPTS

==== ========

The purpose of this document is to provide an introduction to some basic concepts of the UNIX operating system. Two major topics are discussed: the file system and the command interpreter. It is recommended that this document be read before using the UNIX system.

The contents of this document are as follows:

FILE SYSTEM

Directories

Path Names (Notation)

Current Working Directory

Home Directory

SHELL (COMMAND INTERPRETER)

Shell

Commands

Redirection

Pipes

Shell Variables

.profile

FILE SYSTEM

---- ------

Directories

-----------

UNIX uses directory trees to organize the file system. Any directory of files fits somewhere into the tree, and can have directories underneath it (called "sub-directories") comprising a "sub-tree" of directories. Ultimately, all directories in the UNIX file system fit somewhere in one large directory tree. At the top of the tree is the "root" directory, symbolized in the UNIX syntax by a "/" character. Within this directory are additional directories (sub-directories), any of which may have more sub-directories, and so on.

Below is a graphic description of a sample directory tree.

                         (root)
                            /
                            |
        ---------------------------------------------
        |                   |                       |
       etc                 usr                    users
        |                   |                       |
    ---------         -------------       ---------------------
    |                 |           |       |         |         |
  passwd ...         lib ...     src     grad     staff    faculty
                      |           |       |         |         |
                     ---         ---     ---     -------     ---
                     | |         | |     | |     |     |     | |
                      .           .       .          snoopy   .
                      .           .       .            |      .
                      .           .       .         --------  .
                                                    |      |
                                                   work  .profile
                                                    |
                                                  ------
                                                  |    |

Using this example, we see that all files and directories are in one tree which ultimately begins with the "/" directory (the root directory). This "/" directory has directories in it. These directories, in turn, may have directories in them. For example, the directory "etc" within "/" contains a file called "passwd" but no directories.

The directory called "users" contains no files, but it does have directories in it ("grad", "staff", and "faculty"). These directories also contain directories. A directory may contain both files and sub-directories. The directory "snoopy" contains a file called ".profile" and a sub-directory called "work". There is no limit to how far the tree may extend.

Path Names (Notation)

---- ----- ----------

The notation for a file (or directory) name describes the file's location in the directory tree. In UNIX, it is usually more correct to refer to a file (or directory) name as a "path name". This is because the name describes the "path" to be taken to locate the file, or directory, in the tree.

The path name consists of all the directories from the top down to the file (or directory) itself, separating each directory with a "/". A "/" at the beginning of the path name signifies that the name is being specified from the very top of the directory tree (the root directory).

This is a fully qualified path name, or absolute name. No defaults are involved; there is no question as to where in the tree the path leads.

If the path name does not begin with a "/", then it is assumed to be relative to the "current working directory" (see the commands "cd" and "pwd"). This is effectively the "default" directory. Without the leading "/", the current working directory is treated as the top of the tree (or, really, of a sub-tree).

For example, the directory "etc" in the root ("/") directory is specified by the path name "/etc". The file "passwd" in "etc" is known by the path name "/etc/passwd". The directory "snoopy" in the tree above has the path name "/users/staff/snoopy" and the file ".profile" in that directory has the full name of "/users/staff/snoopy/.profile".
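As a rough illustration of this notation, the sketch below rebuilds part of the sample tree under a scratch directory (a real "/users" tree cannot be assumed to exist on every system) and then names a file by its full path. The scratch directory, the use of "mktemp", and the variable names "top", "full_name" and "contents" are assumptions made for this example only.

```shell
# Build part of the sample tree under a scratch directory.
# "top" stands in for the root directory "/" of the sample tree.
top=$(mktemp -d)
mkdir -p "$top/users/staff/snoopy/work"
echo "sample" > "$top/users/staff/snoopy/.profile"

# A path name lists every directory on the way down, separated by "/".
full_name="$top/users/staff/snoopy/.profile"
contents=$(cat "$full_name")
echo "$full_name contains: $contents"

# On most systems the real file /etc/passwd can be named the same way.
if [ -f /etc/passwd ]; then
    ls -l /etc/passwd
fi

# Remove the scratch tree.
rm -rf "$top"
```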

Current Working Directory

------- ------- ---------

Clearly, it would be rather tedious to have to refer to all files by such long path names. So the current working directory allows us to abbreviate our references. If the current working directory is "/users/staff/snoopy", then the file ".profile" can simply be referred to by that name: ".profile". A file called "abc" within the "work" sub-directory of "snoopy" could be referred to as "work/abc".
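The effect of the current working directory can be tried out with the commands "cd" and "pwd". The following is only a sketch: it recreates the "snoopy" directory under a scratch location (an assumption, since the sample tree may not exist), and the file contents and variable names are made up for the example.

```shell
# Recreate the "snoopy" directory under a scratch location.
top=$(mktemp -d)
mkdir -p "$top/users/staff/snoopy/work"
echo "profile text" > "$top/users/staff/snoopy/.profile"
echo "abc text" > "$top/users/staff/snoopy/work/abc"

# Make "snoopy" the current working directory, then display it.
cd "$top/users/staff/snoopy"
pwd

# Names without a leading "/" are now resolved from this directory.
short_profile=$(cat .profile)
short_abc=$(cat work/abc)
echo "$short_profile / $short_abc"

# Leave the scratch tree before removing it.
cd /
rm -rf "$top"
```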

Home Directory

---- ---------

When a user logs into the system, the current working directory is set to a particular directory which belongs to that user. For a user named "snoopy", for example, the directory "/users/staff/snoopy" is set as the current working directory as soon as he logs in. This directory is known as his "home" directory. A user can always return to the home directory (from anywhere in the directory tree) by entering the command "cd" (Change Directory) with no parameters.
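The behaviour of "cd" with no parameters can be sketched as follows. So that the example behaves the same everywhere, it runs inside a sub-shell in which a scratch directory is made to stand in for the home directory; that substitution, and the names "scratch" and "after", are assumptions of the sketch. At a real login the system sets the home directory itself.

```shell
# A scratch directory stands in for the home directory (assumption).
scratch=$(mktemp -d)

# Everything between $( and ) runs in a sub-shell, so the changes made
# there do not affect the shell that runs this example.
after=$(
    HOME=$scratch   # pretend this is the home directory
    cd /            # move somewhere else in the tree
    cd              # no parameters: return to the home directory
    pwd             # display where we ended up
)
echo "cd with no parameters led to: $after"

rm -rf "$scratch"
```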

SHELL (COMMAND INTERPRETER)

----- -------- ------------

Shell

-----

The "shell" is the command interpreter for the UNIX operating system. It accepts input from the terminal, and executes the appropriate programs (or takes the appropriate action) in response.

The shell is a very powerful tool which can do many things to customize the overall environment and to add power to the existing programs in the UNIX system.

Commands

--------

Nearly every command in UNIX corresponds to a file (a few, such as "cd", are carried out by the shell itself). The file is either an executable program or a shell script (much like a .COM file in VAX/VMS, or .BAT in MS-DOS). File names in UNIX are case sensitive (a distinction is made between an uppercase letter and its corresponding lowercase letter). By convention, nearly all command names (and corresponding files) in UNIX are in lowercase.

All parameters to a command are passed on the same line following the command name. Each parameter is separated by a space. The command name is separated from the parameters also by a space. So, for example, to use the "cp" command to copy "file_1" to "file_2", the following would be entered on the command line:

cp file_1 file_2

All commands available are placed in designated directories. Any executable file (or shell script) in one of these directories can be called as a command by simply entering its name and any associated parameters. The "cat" command, for example, corresponds to a file in the directory "/bin". When the command "cat" is entered, a set of directories is searched for a file named "cat". When this file is found, the program is then executed.

In UNIX, there is no difference between a command and a program. They are one and the same. A command is simply the name of a program.

Creating new commands is as simple as creating a program and placing it where the shell can find it (in one of the directories searched for commands).

The set of directories searched for commands is defined by a "shell variable" named "PATH". (A shell variable is a named character string known to the shell.) The PATH variable contains a list of directories, separated by colons, which are searched in order to find a given command.
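One way to look at this search list, offered here only as a sketch, is to print each directory in PATH on its own line and then ask the shell which file it would run for a given command. The "|" character used below is a pipe, described later in this document; "cat_file" is a name made up for the example.

```shell
# Show each directory in PATH on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# Ask the shell which file it would run for the command "cat".
cat_file=$(command -v cat)
echo "cat is $cat_file"
```

The directory shown for "cat" varies from system to system ("/bin" and "/usr/bin" are both common).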

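The PATH variable is often set up to include the current working directory (written as "." or as an empty entry in the list); the current working directory is searched only when it is included. In that case, entering the name of an executable file (program) will cause it to run if it is in the current working directory.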

Directory names can be added to the PATH variable to allow additional directories to be searched for a command. This way, individual users can have directories where they can keep programs that they wish to be able to execute at any time. The following sequence will add a directory called "/users/staff/snoopy/bin" to the PATH variable:

PATH=$PATH:/users/staff/snoopy/bin

export PATH
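Putting these pieces together, a new command might be created along the following lines. This is a minimal sketch under stated assumptions: a scratch directory stands in for the user's own "bin" directory, and the command name "hello_demo" is hypothetical.

```shell
# Assumption: a scratch directory stands in for the user's own "bin".
mybin=$(mktemp -d)

# Create a tiny shell script to serve as the new command "hello_demo".
# The lines up to EOF are written into the file exactly as shown.
cat > "$mybin/hello_demo" <<'EOF'
#!/bin/sh
echo "hello from a new command"
EOF

# Mark the file as executable so it can be run as a command.
chmod +x "$mybin/hello_demo"

# Add the directory to the search list, as in the sequence above.
PATH=$PATH:$mybin
export PATH

# The script can now be called by name alone.
greeting=$(hello_demo)
echo "$greeting"

# Remove the scratch directory.
rm -rf "$mybin"
```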

Redirection

-----------

Perhaps one of the UNIX system's biggest claims to fame is the ability to "redirect" input or output easily. That is, rather than having input come from a terminal or output sent to a terminal, files can be used.

To redirect output from a command to go to a file rather than a terminal, the ">" character is used after the command (on the command line), followed by the name of the file to send the output to.

For example, to place a list of the files in the current working directory into a file, the "ls" command will be used to generate the list. If called in the following form,

ls > file_list

the contents of the directory will be placed into the file "file_list".

In other words, the output of the "ls" command will be sent to that specific file.

One thing to note about the ">" character is that it overwrites the contents of a file. If there was anything in the file "file_list" before, it would be lost after the above command was executed. If, rather than overwriting a file, a user would prefer to append to the end of a file, the double redirection character ">>" could be used instead.
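The difference between ">" and ">>" can be seen in a short experiment such as the one sketched below. It works in a scratch directory so that no existing file is disturbed; the scratch directory and the names "work" and "result" are assumptions of the example.

```shell
# Work in a scratch directory so no real file is overwritten.
work=$(mktemp -d)
cd "$work"

echo "first"  > file_list     # creates file_list
echo "second" > file_list     # overwrites it: "first" is lost
echo "third" >> file_list     # appends after "second"

# Display what is left in the file.
result=$(cat file_list)
echo "$result"

cd /
rm -rf "$work"
```

The file ends up holding the two lines "second" and "third".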

To redirect input from a file rather than from a terminal, the "<" character would be used. For example, if a program called "prog1" reads input from a terminal and processes it, a file called "sample_data" could be created with sample input for the program.

Then, to feed this input into the program, simply run the program as follows:

prog1 < sample_data

Both input and output can be redirected at the same time. If the above example were modified as follows,

prog1 < sample_data > sample_output

the input would be read from the file "sample_data" and the output would be written to the file "sample_output".
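Since "prog1" is only a hypothetical program, the sketch below lets the standard "sort" command stand in for it; like "prog1", it reads from the terminal when no file name is given. The scratch directory and the sample data are likewise assumptions of the example.

```shell
# Work in a scratch directory.
work=$(mktemp -d)
cd "$work"

# Prepare the sample input file, one word per line.
printf 'pear\napple\nfig\n' > sample_data

# "sort" stands in for the hypothetical "prog1": input comes from
# sample_data and output goes to sample_output.
sort < sample_data > sample_output

# Display the output file.
sorted=$(cat sample_output)
echo "$sorted"

cd /
rm -rf "$work"
```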

Pipes

-----

Another claim to fame for UNIX is "pipes". Pipes are used to take the output of one program and feed it into another one as input. By placing calls to two programs (or commands) on the same command line separated by a "|" character, the output of the first program will become the input of the second. This is also not limited to two programs, but can be continued to a third, or fourth, etc.

For example, to generate manual pages describing the C compiler ("cc") the command "man cc" would be used. To print a file, the command "lp" is used. To print a copy of the "cc" manual pages, both commands could be made to work together. There are two ways it could be done.

One way would be to redirect the output of "man" to a file and then print the file. Another would be to "pipe" the output of "man" into the "lp" command (which reads its input, here supplied by the pipe, if no file name is passed).

The first solution would be as follows:

man cc > cc.doc

lp cc.doc

rm cc.doc

The second solution, which is the cleaner solution, is with a pipe:

man cc | lp

Note that there is nothing special about the way the commands and programs are written that enables them to work with either redirection or pipes. They simply use input and output as normal, assuming that the terminal will serve both purposes. It is the shell which creates the "illusion" that the terminal is involved, so that the programs run normally while, in fact, files or pipes are used.
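Because a printer may not be available for experiments, the sketch below shows pipes with commands that need nothing but the terminal. It joins three programs rather than two; the sample words and the names "count" and "first" are made up for the example.

```shell
# Three programs joined by pipes: produce lines, sort them, count them.
count=$(printf 'src\nlib\netc\n' | sort | wc -l)
echo "lines counted: $count"

# The same data again, keeping only the first line after sorting.
first=$(printf 'src\nlib\netc\n' | sort | head -n 1)
echo "first after sorting: $first"
```

None of "sort", "wc" or "head" was written with this particular combination in mind; each simply reads its input and writes its output.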

Shell Variables

----- ---------

The shell maintains a number of variables describing the user's environment. One such variable, already introduced, is the PATH variable. This names the directories to be searched to find commands to be executed.

By convention, shell variables tend to be all uppercase letters.

This helps to distinguish them from commands and file names which usually tend to be all lowercase (or mixed case).

To translate a variable name to its value, precede the variable by a "$" character. To see the value of a variable, use the command "echo" which displays a string passed to it. So to see the value of a shell variable, pass the variable name preceded by a "$" to the "echo" command.

For example, to see the value of the variable PATH, enter the command:

echo $PATH
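The same approach works for a variable of one's own. In the sketch below the variable name GREETING and its value are made up for the example.

```shell
# Set a variable. There must be no spaces around the "=".
GREETING="good morning"

# A "$" in front of the name is replaced by the value.
echo $GREETING
echo "The value is: $GREETING"

# Without the "$", the name is just ordinary text.
plain=$(echo GREETING)
echo "$plain"
```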

Another useful shell variable is the HOME variable. This variable contains the path name of the user's home directory (the working directory set at login). An example of where this would be useful is if a user has a directory called "bin" under his home directory which has useful commands he'd like to execute. He would probably want to add that directory to his PATH variable. He could do this by providing the full path name, but this is longer, and if his home directory should change, the entry would no longer be correct. So the HOME variable could be used instead. The PATH variable could be modified in the following way:

PATH=$PATH:$HOME/bin

This appends the user's "bin" directory to the end of the PATH variable (separated by a ":").

Also useful is the shell variable TERM. This identifies to UNIX software (especially the text editor) what kind of terminal the user has. If, for example, the user has a DEC VT200 series terminal, the TERM variable should be set as "TERM=vt200". If he has a Wyse 50 terminal, "TERM=wyse50" would be appropriate.

After modifying a shell variable, the "export" command should be used to ensure that the new value is passed on to the programs run from the shell.

To identify a terminal as a DEC VT100 type terminal, the following sequence should be used:

TERM=vt100

export TERM
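What "export" changes can be seen by starting a second shell and asking it which variables it knows. This is only a sketch; the variable names LOCAL_ONLY and SHARED are made up for the example.

```shell
# Two variables with the same value; only SHARED is exported.
LOCAL_ONLY=here
SHARED=here
export SHARED

# Start a child shell and ask it what it can see. The single quotes
# keep the outer shell from filling in the values itself.
seen=$(sh -c 'echo "LOCAL_ONLY=[$LOCAL_ONLY] SHARED=[$SHARED]"')
echo "$seen"
```

The child shell receives a value for SHARED only; LOCAL_ONLY is empty there because it was never exported.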

.profile

--------

A file named ".profile" found in a user's home directory is always executed at login. The contents of this file are simply shell commands that would otherwise be typed in. Typically, this file will contain commands to set the terminal definition and anything else to do with the user's environment. So the above example for setting the TERM shell variable (perhaps along with the PATH variable example) could be placed into the ".profile" file to be executed at every login. The result could be a ".profile" file containing the following:

TERM=vt200

export TERM

PATH=$PATH:$HOME/bin

export PATH
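The effect of such a file can be tried without logging out. The sketch below writes the example ".profile" into a scratch directory, which stands in for the home directory, and has a child shell read it with the "." command, roughly as the login shell would. The scratch directory and the names "demo_home" and "result" are assumptions of the example.

```shell
# Assumption: a scratch directory stands in for the home directory.
demo_home=$(mktemp -d)

# Write the example ".profile" shown above. The quoted 'EOF' keeps
# $PATH and $HOME from being filled in while the file is written.
cat > "$demo_home/.profile" <<'EOF'
TERM=vt200
export TERM
PATH=$PATH:$HOME/bin
export PATH
EOF

# Mimic a login in a child shell: set HOME, read the file with ".",
# then display the two variables it sets.
result=$(HOME=$demo_home sh -c '. "$HOME/.profile"; echo "$TERM"; echo "$PATH"')
echo "$result"

rm -rf "$demo_home"
```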