
Julio Neves

Rio de Janeiro – Shell Script Blog


Pub Talk Part I

  • The Linux Environment
  • The Shell Environment
    • A Little Bang in the Main Shell Flavors
      • Bourne Shell (sh)
      • Korn Shell (ksh)
      • Bourne Again Shell (Bash)
      • C Shell (csh)
    • Explaining Shell Work
      • Examination of the Command Line
        • Assignment
        • Command
          • Redirection Resolution
          • Variable Substitutions
          • Metacharacters Substitutions
          • Sending Command Line to the Kernel
    • Decrypting the Rosetta Stone
      • Escape Characters
        • Single Quotes (')
        • Backslash (\)
        • Double Quotes (")
      • Redirection Characters
        • Redirecting Standard Output
        • Redirecting Standard Error Output
        • Redirecting Standard Input
        • Redirecting Commands (pipes)
      • Environment Characters

Dialog overheard between a Linuxer and a “Mouseoholic”:

- Who’s Bash?

- Bash is the youngest son of the Shell family.

- Hey dude! You’re gonna drive me crazy. I had one question… Now I have two!

- No, man. You’ve been crazy for a long time. Ever since you decided to use that operating system, you have had to reboot ten times a day and you have no idea what is happening inside your computer. But never mind. I’m gonna explain what a Shell is and what the Shell families are, and in the end you’ll exclaim: “Holy God of Shell! Why didn’t I choose Linux before?”

The Linux Environment

- To understand the Shell and how it works, first of all I’ll show you how the layers of the Linux environment work. Take a look at the graph:

[Figure: the Shell in relation to the Linux kernel]

In this graph, we can see that the Hardware Layer is at the center and is made up of the physical components of your computer. Surrounding it we have the Linux Kernel Layer, which is the core of the system. This layer communicates with the hardware, managing and controlling it. The kernel sends programs and commands to the central processing unit (CPU) for execution. Enclosing all this, we have the Shell. It gets this name because it is a wrapper between the user and the operating system (OS). All user interaction with the OS is managed by the Shell.

The Shell Environment

Well… to get to the Linux core, which is the ambition of every application, you have to pass through the Shell’s filtering. Let’s understand how it works, so we can make the most of the numerous tools the Shell provides.

Linux, by definition, is a multi-user operating system (we can never forget this), and to allow access to specific users and deny it to others, there is a file called /etc/passwd. The /etc/passwd file plays the role of “host”: it holds the information that controls the login of every user of the system. The last field of each user record in /etc/passwd tells the system which Shell the user will get at login.

Tip: Remember I said that the last field of /etc/passwd tells the system the user’s default Shell at login? This means that if this field holds prog, the user will land directly in the prog program at login, and when prog finishes, the system will log the user out. Imagine how much security we can implement with this simple tool.
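As a small illustration of that last field, here is a sketch that picks it out of a record. The record itself is hypothetical (it is not read from a real /etc/passwd), and the variable names login_name and login_shell are made up for the example.

```shell
# Hypothetical record in /etc/passwd format: seven fields separated by colons.
record='mary:x:1001:1001:Mary Smith:/home/mary:/bin/bash'

# Remove everything up to the last colon; what is left is the login Shell.
login_shell=${record##*:}

# Remove everything from the first colon on; what is left is the login name.
login_name=${record%%:*}

echo "$login_name logs in with $login_shell"
```

If the last field named some other program instead of /bin/bash, that program is what the user would get at login.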

Remember I talked about Shell, family, brothers? Let’s sort this out: Shell is the concept of a shell wrapped around the operating system, and it is also the generic name for the children of this idea that were born over the years of Unix’s existence. Nowadays there are lots of Shell flavors. Among them are sh (Bourne Shell), ksh (Korn Shell), Bash (Bourne Again Shell) and csh (C Shell).

A Little Bang in the Main Shell Flavors

Bourne Shell (sh)

Developed by Stephen Bourne at AT&T Bell Labs (where Unix was also developed), this was for many years the default Shell of the Unix operating system. It is also called the Standard Shell, because for years it was the only Shell, and it is still the most widely used, since it was ported to every Unix environment and Linux distro.

Korn Shell (ksh)

Developed by David Korn of Bell Labs, it is a superset of sh: it has all the facilities of sh and adds many others. Its full compatibility with sh attracts many users and Shell programmers to this environment.

Bourne Again Shell (Bash)

This is the most modern Shell (Bash 2 aside), and its number of followers keeps growing all over the world, either because it is the default Linux Shell or because of its wide range of commands, which also incorporates many C Shell commands.

C Shell (csh)

Developed by Bill Joy at the University of California, Berkeley, it is the most used Shell on BSD and Xenix. Its command structure is very similar to that of the C language. Its biggest sin was to ignore compatibility with sh and go its own way.

There are some other Shells, but we will only talk about the first three, treating them generically as “Shell” and pointing out the particular characteristics of each one, where there are any.

Explaining Shell Work

The Shell is the first program you get when you log in to Linux. It resolves lots of things so as not to burden the kernel with repetitive tasks, freeing it to take care of nobler ones. As each user has their own Shell standing between them and Linux, it is the Shell that interprets the commands that are typed and checks their syntax, passing them on clean for execution.

- Yo! This business of interpreting commands, doesn’t it have something to do with an interpreter?

- Yes, it does. In fact, the Shell is an interpreter that comes with a powerful language of high-level commands, allowing loops, decision structures and the storage of values in variables, as I’ll show you.
Let me explain the main tasks the Shell performs, in the order it performs them. Pay attention to this order, because it is fundamental to understanding the rest of our talk.

Examination of the Command Line

In this examination, the Shell identifies the special (reserved) characters that have a meaning for the interpretation of the line, and checks whether the line is an assignment or a command.

Assignment

If the Shell finds two fields separated by an equals sign (=) with no blank spaces between them, it identifies this sequence as an assignment.

$ ls linux
linux

In this example, the Shell identified ls as a program and linux as a parameter passed to the ls program.

$ value=1000

In this case, because there are no blank spaces (and note that the blank space is one of those reserved characters), the Shell identified an assignment and stored 1000 in the variable value.

Attention: Never do:

$ value = 1000
bash: value: command not found

Bash found the word value set off by blank spaces and “guessed” that you were trying to execute a program called value, passing it two parameters: = and 1000.
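The difference can be checked with a short sketch like the one below. It assumes that no program called value is installed on your machine; the variable status is just a helper for the example, and the error message is thrown away with a redirection that is explained further on.

```shell
# No spaces around "=": the Shell sees an assignment.
value=1000

# With spaces, the Shell looks for a program called "value" and passes it
# "=" and "2000" as parameters. The error message is discarded here
# (2> /dev/null) and only the exit status of the failed line is kept.
status=0
value = 2000 2> /dev/null || status=$?

echo "value is still $value; the spaced line failed with status $status"
```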

Command

When a line is typed at the Linux prompt, it is split into pieces separated by blank spaces: the first piece is the name of a program, whose existence is checked; the following pieces are identified, in this order, as options/parameters, redirections and variables.
When the identified program exists, the Shell checks the permissions of the files involved (including the program itself) and raises an error if you are not allowed to run the job.

Redirection Resolution

After identifying the components of the command line you typed, the Shell moves on to resolving redirections. The Shell has something up its sleeve that we call redirection, which can apply to input (stdin), output (stdout) or errors (stderr), as I’ll explain soon.

Variable Substitutions

At this point, the Shell checks whether the variables (parameters starting with $) found in the command are defined, and replaces them with their current values.

Metacharacters Substitutions

If any metacharacter (*, ? or []) is found on the command line, it is replaced by its possible values at this point.

Suppose that the only entry in your current directory starting with T is a directory named ThisIsAVeryHugeNameForADirectoryButIsMyDirectoryName. You can do:

$ cd T*

Since up to this point it is the Shell that is working on your command line, and the command (program) cd has not been executed yet, the Shell turns T* into ThisIsAVeryHugeNameForADirectoryButIsMyDirectoryName and the cd command will run successfully.
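To see that it is the Shell, and not the command, that does the substitution, you can try a sketch like this one. It works in a scratch directory created with mktemp so that nothing else starts with T; the variable names are only illustrative.

```shell
# Work in a scratch directory so that only one entry starts with T.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir ThisIsAVeryHugeNameForADirectoryButIsMyDirectoryName

# echo receives the expanded name; it never sees the asterisk.
expanded=$(echo T*)
echo "T* became: $expanded"

# cd also receives the full name, already expanded by the Shell.
cd T* || exit 1
arrived=$(basename "$PWD")
echo "Now in: $arrived"
```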

Sending Command Line to the Kernel

Once the previous jobs are done, the Shell assembles the command line, now with all the substitutions made, and calls the kernel to execute it in a new Shell (child Shell), which receives a process number called PID (Process IDentification). The parent Shell stays inactive, taking a little nap, while the program runs. Once this process (together with the child Shell) finishes, the parent takes control again and, by showing the prompt, tells you it is ready to execute new commands.

Decrypting the Rosetta Stone

To get rid of that feeling you have when you see a Shell script, that it looks like alphabet soup or hieroglyphs, I’ll show you the main special characters, so you can walk like Jean-François Champollion (do a little research on Google to find out who this man is) deciphering the Rosetta Stone.

Escape Characters

That’s it. When we want the Shell not to interpret a special character, we must “hide” it from the Shell. This can be done in several ways:

Single Quotes (')

When the Shell sees a string of characters between single quotes, it takes the single quotes off and does not interpret the contents.

$ ls linux*
linuxmagazine

$ ls 'linux*'
ls: linux*: No such file or directory

In the first case, the Shell “expanded” the asterisk (*) and found the file linuxmagazine to list. In the second case, the single quotes inhibited the Shell’s interpretation and we got the answer that there is no file called linux*. That is, the asterisk (*) was expanded in the first case, but was treated as a literal asterisk (*) character in the second.

Backslash (\)

In the same way that single quotes work, the backslash (\) inhibits interpretation, but only of the character that follows it.

Imagine you had accidentally created a file named * (asterisk), which some Unix flavors allow, and want to remove it. If you do:

$ rm *

You will make a big mess, because rm will erase all the files in the current directory. The right way to do this is:

$ rm \*

This way, the Shell does not interpret the asterisk and does not expand it.

Try the following scientific experiment:

$ cd /etc
$ echo '*'
$ echo \*
$ echo *

Did you see the differences? So I don’t need to explain anything more.
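If you prefer an experiment with a predictable result, here is a sketch of the same test in a scratch directory that holds only two files, a and b (both names, like the variable names, are just for illustration).

```shell
# A scratch directory holding exactly two files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a b

quoted=$(echo '*')    # single quotes: echo receives a literal asterisk
escaped=$(echo \*)    # backslash: same effect, for one character only
bare=$(echo *)        # unprotected: the Shell swaps * for the file names

echo "quoted:  $quoted"
echo "escaped: $escaped"
echo "bare:    $bare"
```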

Double Quotes (")

They work like single quotes, except when the string between the double quotes contains a dollar sign ($), a backquote (`) or a backslash (\). No need to get stressed: I haven’t given examples of double quotes because you don’t know the dollar sign ($) or the backquote (`) yet. From here on, we’ll see these special characters in use many times. The important thing is to understand what each one means.

Redirection Characters

Most commands have an input and an output, and can generate errors. The input is called Standard Input, or stdin, and its default is the terminal keyboard. The output of a command is called Standard Output, or stdout, and its default is the terminal screen. The error messages a command may generate, called Standard Error or stderr, are also sent to the terminal screen by default. Let’s now see how to change this state of affairs. Let’s make a “parrot program”. Do the following:

$ cat

cat is an instruction that lists the contents of a specified file on standard output (stdout). If no input file is given, it waits for data from standard input (stdin). As I didn’t specify the input, it is waiting for it from the keyboard (standard input) and, as I didn’t specify the output either, what I type goes to the screen (standard output), giving us, as promised, a “parrot program”. Try it!

Redirecting Standard Output

To specify the output of a program we can use > (greater than) or >> (greater than, greater than) followed by the name of the file to which we want to send the output.

Let’s turn the “parrot program” into a “text editor” (how pretentious, huh?).

$ cat > Arq

cat still has no input specified, so it is waiting for typed data, but its output is redirected to the file Arq. This way, everything typed goes into the Arq file, which means we have just made the shortest and worst text editor on the entire planet.

If I do again:

$ cat > Arq

The data in Arq will be lost, because before redirecting, the Shell creates an empty Arq file. To put information at the end of the file, I should do:

$ cat >> Arq

Attention: As I already told you, the Shell resolves the line first and only then sends the command for execution. So if you redirect the output of a command to a file that the same command is reading, the Shell first empties the file and then sends the command for execution. That way, you have just lost your dear file.

With this, we can see that >> is used to append data to the end of a file.
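A short sketch of the difference between > and >>, using echo instead of cat so that nothing has to be typed. The file is a scratch file created with mktemp, and the variable names are illustrative.

```shell
# A scratch file to play with.
out=$(mktemp)

echo "first"  > "$out"     # creates (or empties) the file, then writes
echo "second" > "$out"     # empties it again: "first" is lost
echo "third"  >> "$out"    # appends to the end: "second" is kept

content=$(cat "$out")
echo "$content"
```

Only second and third survive, one per line.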

Redirecting Standard Error Output

Just as the Shell by default receives data from the keyboard and sends output to the screen, errors are also sent to the screen if you don’t specify another destination. To redirect errors, use 2> error_file. Note that there is no blank space between the number 2 and the greater than sign (>).

Attention: Don’t confuse >> with 2>. The first one appends data to the end of a file and the second one redirects the standard error output (stderr) to the specified file. This is important!

Suppose that during the execution of a script you may or may not (depending on the path the execution takes) have created a file named /tmp/IsThisExisting$$. To avoid leaving trash on your disk, at the end of the script you could put a line like:

rm /tmp/IsThisExisting$$

If the file doesn’t exist, an error message will be sent to the screen. To prevent this, you should do:

rm /tmp/IsThisExisting$$ 2> /dev/null

About the example we just saw, I have two tips:

Tip #1: $$ holds the PID, that is, the Process IDentification. As Linux is a multi-user system, it is always a good idea to add $$ to the names of files that will be used by many people, to avoid ownership problems. If you named your file just IsThisExisting, the first user (its creator) would be its owner, and everyone else would get a permission error when trying to write to the file.
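Putting the two ideas together, a cleanup at the end of a script might look like the sketch below. The variable names tmpfile and second_try are made up for the example.

```shell
# $$ makes the name unique to this process.
tmpfile="/tmp/IsThisExisting$$"
echo "scratch data" > "$tmpfile"

# First removal: the file exists, so nothing is complained about.
rm "$tmpfile" 2> /dev/null

# Second removal: the file is gone, rm fails, but its message is discarded.
second_try=ok
rm "$tmpfile" 2> /dev/null || second_try=failed

echo "second rm: $second_try (and the screen stayed clean)"
```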

To test the Standard Error Output at the Shell prompt, I’ll give you an example. Do:

$ ls donotexist
ls: donotexist: No such file or directory

$ ls donotexist 2> errorfile
$ cat errorfile
ls: donotexist: No such file or directory

In this case, we saw that when we ran ls on donotexist, we got an error message. After redirecting the standard error output to errorfile and running the same command, we got only the Shell prompt back. When we listed errorfile, we saw that the error message had been stored in it. Do the same test.

Tip #2:

- What the hell is /dev/null?

- In Unix, there is a ghost file called /dev/null. Everything sent to this file disappears. It’s like a black hole. In my example, as I was not interested in keeping a possible error message from the rm command, I just redirected it to this file.

It is worth noting that these redirection characters can be cumulative. That is, if in the previous example we did:

$ ls donotexist 2>> errorfile

the error message from ls would be appended to the end of errorfile.
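The behaviour of 2> and 2>> can be checked by counting lines, as in this sketch. The file name in missing is hypothetical and assumed not to exist; the $(( )) around wc only strips the padding that some versions of wc add.

```shell
errfile=$(mktemp)
missing="/donotexist.$$"    # hypothetical name, assumed not to exist

ls "$missing" 2>  "$errfile" || true   # the message replaces the file contents
ls "$missing" 2>> "$errfile" || true   # the message is added at the end
lines=$(( $(wc -l < "$errfile") ))
echo "after 2> and 2>>: $lines lines"

ls "$missing" 2>  "$errfile" || true   # 2> starts over from an empty file
after_truncate=$(( $(wc -l < "$errfile") ))
echo "after another 2>: $after_truncate line"
```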

Redirecting Standard Input

To redirect the Standard Input we use < (less than).

- And what is this used for, you’ll ask me.

- You’ll understand in no time. Let me show you an example.

Suppose that you want to send an e-mail to your boss. For the boss we always take extra care, right? So, instead of typing the e-mail at the prompt, where correcting a previous sentence is impossible, you write a file with the message and, after checking it ten times without finding any error, you decide to send it, and do:

$ mail boss < filewithmailtotheboss

Your boss will receive the text in filewithmailtotheboss.

Another, rather crazy, type of redirection the Shell allows is called a here document. It is represented by << (less than, less than) and tells the Shell that the scope of the command begins on the next line and ends when it finds a line that contains only the label that follows the << sign.

See the following script, with a ftp routine:

ftp -ivn  remotehost <<endftp
    user $USER $PASSWD
    binary
    get remotefile
endftp

This little piece of code has lots of interesting details:

  1. The options I passed to ftp (-ivn) make it list everything that is happening (-v for verbose), not ask whether you really want to get the file (-i for interactive) and, last but not least, not ask for user and password (-n), because these will be supplied by the user instruction;
  2. When I used << endftp, I was saying the following:
    “Listen to me, Shell. Don’t mess with anything from here until you find the label endftp. You wouldn’t understand any of it, as these are ftp-specific instructions”.
    If that were all, it would be simple, but following the example we can see that there are two variables ($USER and $PASSWD), which the Shell will resolve before the redirection. In fact, the great advantage of this kind of construction is that it allows commands to be interpreted inside the scope of the here document, which contradicts what I just said. Soon I’ll explain how this works. I can’t yet, because you don’t know all the tools;
  3. The user command is an ftp command and is used to pass the user and password, which were read in an earlier routine and stored in our two variables: $USER and $PASSWD;
  4. binary is another ftp instruction, used to indicate that the transfer of remotefile will be done in binary mode, that is, the file data will not be interpreted to find out whether it is ASCII, EBCDIC, etc.;
  5. get remotefile tells ftp to download this file from remotehost to our local host. If we wanted to send a file, we would use the put command.

Attention: A very frequent error in the use of labels (like endftp in our previous example) is caused by blank spaces before or after them. Pay attention to this, because this kind of error tends to kick the programmer’s ass until it is found. Remember: a good label has a whole line to itself.
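Since an ftp session needs a remote host, here is a sketch of the same mechanism with cat reading the here document instead. The names FTP_USER and FTP_PASSWD and their values are hypothetical stand-ins for the two variables of the example. The second half shows a detail not covered in the text above: if the label is written between quotes, the Shell leaves the text alone.

```shell
# Hypothetical values standing in for the variables of the ftp example.
FTP_USER=mary
FTP_PASSWD=secret

# Plain label: the Shell resolves the variables before cat sees the text.
expanded=$(cat <<endftp
user $FTP_USER $FTP_PASSWD
binary
endftp
)

# Quoted label: the text reaches cat exactly as written.
literal=$(cat <<'endftp'
user $FTP_USER $FTP_PASSWD
endftp
)

echo "$expanded"
echo "$literal"
```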

- All right… all right… I know I was babbling and wandered off into ftp commands, straying from our main subject, but it is always good to learn, and it is very rare to find people who love to teach…

Redirecting Commands (pipes)

The redirections we have talked about so far always referred to files: they sent things to a file, they got things from a file, they simulated local files. What we’ll see now redirects the output of one command to the input of another.

This is very useful and makes lots of things easy. Its name is pipe, and it acts as a pipe between two commands or, in other words, it pipes information from one command to another. It is represented by a vertical bar (|).

$ ls | wc -l
21

The ls command passed the file list to the wc command, which, when given the -l option, counts the lines it receives. Using this, we can tell how many files we have in our directory (21 in this case).

$ cat /etc/passwd | sort | lp

This command line sends the contents of the /etc/passwd file to the input of the sort command, which sorts them and sends them on to lp, our print spooler.
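A sketch of both pipelines with predictable data: a scratch directory with three files, and three names fed to sort by printf. Since a printer may not be at hand, head -n 1 takes the place of lp here; that swap, like the variable names, is only for illustration.

```shell
# Three files in a scratch directory, counted through a pipe.
workdir=$(mktemp -d)
touch "$workdir/one" "$workdir/two" "$workdir/three"
count=$(( $(ls "$workdir" | wc -l) ))   # $(( )) strips any padding from wc
echo "$count files"

# A three-stage pipeline: produce, sort, keep the first line.
first=$(printf 'carol\nalice\nbob\n' | sort | head -n 1)
echo "first after sorting: $first"
```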

Environment Characters

When you want to prioritize an expression, you put it between parentheses, right? Because of arithmetic, that seems the normal thing to do. But in Shell what really prioritizes expressions is the backquote (`), not the parenthesis. I’ll give you examples of backquote use, to make this clearer.

I want to know how many users are logged in to my computer. I can do:

$ who | wc -l

The who command sends the list of connected users to the command wc -l, which counts how many lines it received and shows the answer on the terminal. But I want more than a lonely number on the screen: I want it in the middle of a sentence.

To send sentences to the screen I use the echo command. So let’s see how it works:

$ echo "There is who | wc -l connected users"
There is who | wc -l connected users

What? Look at that! It didn’t work. It really didn’t, and not because of the quotation marks I used, but because the who | wc -l command has to be executed before the echo command. To solve this problem, I need to prioritize this second command with backquotes, like this:

$ echo "There is `who | wc -l ` connected users"
There is       8 connected users

To remove those blank spaces before the 8 that wc -l produced, we just need to remove the quotation marks. Like this:

$ echo There is `who | wc -l ` connected users
There is 8 connected users

As I said, the quotation marks protect everything inside them from Shell interpretation. As a single blank space is enough of a separator for the Shell, the extra spaces are replaced by just one once we remove the quotation marks.
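Because the output of who changes from machine to machine, the sketch below imitates it: echo 8 plays the role of who | wc -l, and the variable padded holds a number with leading spaces like the ones some versions of wc produce. All variable names are illustrative.

```shell
# Single quotes: the backquotes are plain characters, nothing is executed.
literal=$(echo 'Logged in: `echo 8`')

# Double quotes: the command between backquotes is executed first.
substituted=$(echo "Logged in: `echo 8`")

# A padded number, protected and unprotected.
padded='       8'
kept=$(echo "There is $padded connected users")
squeezed=$(echo There is $padded connected users)

echo "$literal"
echo "$substituted"
echo "$kept"
echo "$squeezed"
```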

Before talking about parentheses, let me say a word about the semicolon (;). When you are in the Shell, you normally give only one command per line. To group commands on the same line, we need to separate them with semicolons (;). So:

$ pwd ; cd /etc; pwd; cd -; pwd
/home/mydir
/etc
/home/mydir
/home/mydir

In this example, I listed the name of the current directory with the pwd command, changed to the /etc directory, listed the directory name again and finally went back to the previous directory (cd -), listing its name. The cd - command itself prints the directory it returns to, which is why /home/mydir shows up twice at the end. Note that I placed the semicolons (;) in every possible way, to show that it doesn’t matter whether there are blank spaces before or after this character.

Finally, let’s look at parentheses. Take a look at the following case, similar to the previous example:

$ (pwd ; cd /etc ; pwd;)
/home/mydir
/etc

$ pwd
/home/mydir

- What the heck? I was in /home/mydir, changed to /etc, checked with pwd that I really was in that directory and, when the command group finished, I saw I was in /home/mydir, as if I had never left!

- Oh crap. It’s a kind of magic!

- Are you crazy, man? Of course not! The interesting thing about parentheses is that they call a new Shell to execute the commands inside them. So we really went to the /etc directory, but when all the commands in the parentheses had been executed, the new Shell that was in /etc died and we came back to the previous Shell, whose current directory was /home/mydir. Do other tests using cd and ls to fix the concepts.
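Here is a sketch of one such test. It shows that a directory change made inside the parentheses dies with the new Shell, and that the same goes for a variable changed in there. The names start, marker and color are invented for the example.

```shell
start=$PWD
marker=$(mktemp)     # scratch file used to bring the answer out of the parentheses
color=blue

# Everything between the parentheses runs in a new Shell.
( cd / ; color=red ; pwd > "$marker" )

inside=$(cat "$marker")
echo "inside the parentheses: $inside"
echo "after the parentheses:  $PWD (color is still $color)"
```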

Now that we know these concepts, take a look at the following example:

$ mail support << END
> Hi support, today at `date "+%H:%M"`
> we had that problem again
> that I reported by
> phone. As you asked,
> here goes a file list from
> the directory:
> `ls -l`
> Best regards.
> END

Finally, we now have the knowledge to show what we were saying about here documents. The commands between backquotes (`) are prioritized, so the Shell executes them before the mail instruction. When support receives the e-mail, they will see that the date and ls commands were executed before the mail command, and so they get a snapshot of the environment at the moment the e-mail was sent.

The default primary Shell prompt, as we saw, is the dollar sign ($), but the Shell also has the concept of a secondary prompt, or command continuation prompt, which is sent to the screen when we break the line and the instruction has not ended yet. This prompt is shown as a greater than sign (>), which we see at the beginning of the second and following lines.

To finish, and to mess everything up, I need to say that there is a newer, more modern construction that is used to prioritize command execution, just like the backquotes. It is the construction $(cmd), where cmd is one or more commands that will be executed with priority in their context.

So backquotes and constructions like $(cmd) serve the same purpose, but for those who work with multi-platform operating systems, I advise using backquotes, as $(cmd) has not been ported to all Shell flavors. Here in the pub, I’ll use both, without distinction.

Let’s look again at the backquote example from this new point of view:

$ echo There is $(who | wc -l) connected users
There is 8 connected users

Take a look at this case:

$ Arqs=ls
$ echo $Arqs
ls

In this example, I made an assignment (=) and named an instruction. What I wanted was for the variable $Arqs to receive the output of the ls command. As the instructions of a script are interpreted from top to bottom and from left to right, the assignment was made before ls could be executed, so the variable received just the two letters ls. To get what we want, we need to prioritize the execution of this command over the assignment, and this can be done in either of the following ways:

$ Arqs=`ls`

or:

$ Arqs=$(ls)
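The three assignments can be compared side by side with a sketch like this, run in a scratch directory holding two files. The helper variables as_text, with_backquotes and with_dollar_parens exist only for the comparison.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch arq1 arq2

Arqs=ls                  # plain assignment: the variable holds the letters "ls"
as_text=$Arqs

Arqs=`ls`                # backquotes: ls runs first, its output is assigned
with_backquotes=$Arqs

Arqs=$(ls)               # $(cmd): same result as the backquotes
with_dollar_parens=$Arqs

echo "plain:      $as_text"
echo "backquotes: $with_backquotes"
echo "\$(cmd):     $with_dollar_parens"
```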

To finish this topic, let’s see just one more example. Say I want to put into the variable $Arqs the long listing (ls -l) of all the files whose names start with arq followed by a single character (?). I should do:

$ Arqs=$(ls -l arq?)

or:

$ Arqs=`ls -l arq?`

But, look at this:

$ echo $Arqs
-rw-r--r-- 1 jneves jneves 19 May 24 19:41 arq1 -rw-r--r-- 1 jneves
jneves 23 May 24 19:43 arq2 -rw-r--r-- 1 jneves jneves 1866 Jan 22 2003
arql

- Wow! It’s all messed up!

- As I told you, man, if you let the Shell “see” the blank spaces, whenever there are several of them together they are replaced by just one, and the line breaks get the same treatment. To see a neat listing, we need to protect the variable from Shell interpretation, like this:

$ echo "$Arqs"
-rw-r--r-- 1 jneves jneves 19 May 24 19:41 arq1
-rw-r--r-- 1 jneves jneves 23 May 24 19:43 arq2
-rw-r--r-- 1 jneves jneves 1866 Jan 22 2003 arql
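The same effect can be reproduced without depending on the files in your directory. In this sketch the variable listing is filled by printf with two made-up lines, and wc -l counts how many lines echo sends out in each case.

```shell
# Two lines, with extra spaces in the second one.
listing=$(printf 'arq1 first\narq2    second\n')

flat=$(echo $listing)        # unprotected: spaces and line breaks collapse
kept=$(echo "$listing")      # protected: everything is preserved

flat_lines=$(( $(echo $listing | wc -l) ))
kept_lines=$(( $(echo "$listing" | wc -l) ))

echo "unprotected ($flat_lines line):  $flat"
echo "protected ($kept_lines lines):"
echo "$kept"
```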

- Look pal, go practice these examples because, when we meet again, I’ll explain a series of typical Shell programming instructions. Bye! Oh, just one little thing I was forgetting: in Shell, the hash (#) is used to start a comment.

$ exit # Ask for the check

Pub Talk Part II

  • The great grep
    • The grep family
  • Building a CD Library
    • Informing the Parameters
    • Parametric Hints

‘Waiter, get me a pint and don’t worry about my lad over here, he’s finally getting to meet a real operating system and he’s got a lot to learn!’

‘So my friend, did you get anything of what I’ve said so far?’

‘Well, I can get what you mean, but I actually don’t see what the point of it is.’

‘Take it easy pal! We’re just beginning… What I’ve said so far is a taste of what lies ahead. As soon as we start developing structured programs, you’ll see how useful those tools can be. After learning that, you’ll see how easy it is to reach the top shellves. Now, tell me: how do you like the grep family?’

‘Pardon me! I don’t know any grep family.’

‘Sure, sure… grep stands for “global regular expression print”. The name is usually traced to ed (a text editor that is vim’s grandpa), in which the search command was g/_regular expression_/p, or g/_re_/p.’

‘Well, this grep command takes regular expressions and matches them against the lines of an “input”. By the way, there is this guy, Aurélio Marinho Jargas, who maintains a webpage that can give you all the hints, clues and even tutorials you want about regular expressions (regexp). If you feel like learning to program in Shell, Perl, Python, etc., you’d better see what he’s got!’

The great grep

‘Waiter, this time I’ll try a caipirinha, the Brazilian national drink!’

‘So, I told you that grep matches regular expressions against the lines of an “input”. But what are those “inputs”? Well, there are different ways of defining them. Let’s see!’

Searching a file:

$ grep mary /etc/passwd

Searching more than one file:

$ grep grep *.sh

Searching the output of a command

$ who | grep pelegrino

In the 1st example, which is the simplest one, I searched for occurrences of the word mary in any position of the file /etc/passwd. If I wanted to search for it as a login name or, in other words, only at the beginning of the records of that file, I should execute:

$ grep '^mary' /etc/passwd

‘Hold on, hold on… what are that caret (circumflex ^) and those apostrophes for?’

‘The caret (^), as you’d know if you had read the articles on regular expressions I told you about, constrains the matches to the beginning of the lines, and the apostrophes (') tell the Shell not to interpret that circumflex, so that it reaches grep untouched.’

The 2nd example will list all the lines of all the files with the extension .sh that contain the word grep. Since I use this extension for my Shell scripts, what I’ve done is look for a good grep example in all my scripts.

And look!! grep accepts the output of another command as input, as long as it arrives through a pipe symbol (|). This is very common in Shell and it speeds up the execution of commands enormously, since grep takes the output of a command and reads it as if it were a file.

So, looking at the 3rd example, the command who lists the users who are logged in to the same machine as you (remember: Linux is a multi-user system) and the command grep checks whether the user pelegrino is working or not.
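The three kinds of input can be tried without touching the real /etc/passwd. In this sketch the two records and the who-like lines are made up, and the -c option is used only to make grep print the number of matching lines instead of the lines themselves.

```shell
# Two hypothetical records in /etc/passwd format.
sample=$(mktemp)
printf '%s\n' \
  'mary:x:1001:1001::/home/mary:/bin/bash' \
  'rosemary:x:1002:1002::/home/rosemary:/bin/bash' > "$sample"

anywhere=$(grep -c mary "$sample")       # mary in any position
at_start=$(grep -c '^mary' "$sample")    # mary only at the beginning of a line

# Input arriving through a pipe instead of a file.
piped=$(printf 'pelegrino tty1\nmary tty2\n' | grep pelegrino)

echo "anywhere: $anywhere, at the start: $at_start, piped: $piped"
```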

The grep family

You know, the command grep is widely known, because it is frequently used, but what most people don’t know is that there are three commands in the grep family. They are:

  • grep
  • egrep
  • fgrep

Their main features are:

  • grep
    Can (or cannot) use simple regular expressions, but when you are not using them, it is better to run fgrep (it is faster);
  • egrep ('e' standing for extended)
    A very powerful tool for regular expressions. It is usually regarded as the slowest brother of the grep family, so it is best kept for when you need a regular expression that grep does not accept;
  • fgrep ('f' standing for fast, or file)
    As its name suggests, it is the fast brother of the family. It runs fast (about 30% faster than grep and 50% faster than egrep), but it does not allow the use of regular expressions.

Attention: The considerations above on speed are valid for the Unix grep family. On Linux, grep is the fastest, because the other two (fgrep and egrep) are shell scripts that execute grep. And I must say: I don’t like that solution.

‘Now that you know the differences among the three, tell me: what do you think about the examples I gave before the explanation?’

‘I thought fgrep would solve your problem a lot faster than grep.’

‘Perfect!! I see you got what I said! Let’s see some other examples to make their differences even clearer.’

  • Examples

I know that there is a text talking about Linux, but I’m not quite sure whether the word Linux is written with a capital L or a small one. What should I do?

There are two options in that case:

$ egrep '(Linux|linux)' arquivo.txt

or

$ grep '[Ll]inux' arquivo.txt

In the first case, the complex regular expression (Linux|linux) uses the parentheses to group the options and the pipe (|) as a logical “or”, which means that you are searching for Linux or linux. The apostrophes keep the Shell from interpreting the parentheses and the pipe itself.

In the second case, on the other hand, the regular expression [Ll]inux means that you are searching for a word that starts with L or l followed by inux. Since this expression is simpler, grep itself can handle it, so I think it is the more advisable one (remember: egrep is slower).
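A sketch comparing the three brothers on the same made-up text. It uses grep -E and grep -F, which are the option forms of egrep and fgrep on current systems; the file contents and variable names are only illustrative.

```shell
text=$(mktemp)
printf '%s\n' 'I use Linux at home' 'linux is fun' 'LINUX in capitals' > "$text"

alternation=$(grep -E -c '(Linux|linux)' "$text")   # egrep style: grouping and "or"
bracket=$(grep -c '[Ll]inux' "$text")               # plain grep: a list of characters
fixed=$(grep -F -c 'linux' "$text")                 # fgrep style: no regular expressions

echo "alternation: $alternation, bracket: $bracket, fixed: $fixed"
```

The first two find the same two lines; the third finds only the line written entirely in small letters.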

Another example. If you want to list the subdirectories of a directory, you should run:

$ ls -l | grep '^d'
drwxr-xr-x 3 root root 4096 Dec 18 2000 doc
drwxr-xr-x 11 root root 4096 Jul 13 18:58 freeciv
drwxr-xr-x 3 root root 4096 Oct 17 2000 gimp
drwxr-xr-x 3 root root 4096 Aug 8 2000 gnome
drwxr-xr-x 2 root root 4096 Aug 8 2000 idl
drwxrwxr-x 14 root root 4096 Jul 13 18:58 locale
drwxrwxr-x 12 root root 4096 Jan 14 2000 lyx
drwxrwxr-x 3 root root 4096 Jan 17 2000 pixmaps
drwxr-xr-x 3 root root 4096 Jul 2 20:30 scribus
drwxrwxr-x 3 root root 4096 Jan 17 2000 sounds
drwxr-xr-x 3 root root 4096 Dec 18 2000 xine

As you can see above, the circumflex (^) limits the search to the first position of the long output of the ls command. The apostrophes tell the shell not to ‘understand’ the circumflex (^).

Let’s take another example. You know what the first four positions of the output of an ls -l command for an ordinary file (not a directory, nor a link, nor anything else) should be:

Position         1st  2nd  3rd  4th
Possible values  -    r    w    x
                      -    -    s (suid)
                                -

Thus, in order to find out what are the executable files in a directory, you should:

$ ls -la | egrep '^-..(x|s)'
-rwxr-xr-x 1 root root 2875 Jun 18 19:38 rc
-rwxr-xr-x 1 root root 857 Aug 9 22:03 rc.local
-rwxr-xr-x 1 root root 18453 Jul 6 17:28 rc.sysinit

Once again the caret (^) limits the search to the beginning of each line; hence, the listed occurrences are the ones that start with a -, followed by any character (the full stop, a dot, in a regular expression denotes any character), once again followed by any character, followed by an x or an s.

The same result would be found with the command:

$ ls -la | grep '^-..[xs]'

and the search would be faster.

Building a CD Library

‘Let me use a nice and didactic example: the process of building a CD Library. Keep in mind that it is just as possible to develop software to organize audio CDs as it is for data CDs (including those you get when you buy magazines, those you burn for yourself, etc.).’

‘Hold on a sec. Where am I taking the CD data from?’

‘Firstly I’ll show you how your software can obtain data from those who are using it, afterwards I’ll show you how to get data from the screen or from a file.’

Informing the Parameters

‘In our case, the layout of a music file will be:’

    name of the album^artist~name of the song:...:artist~name of the song

As you can see above, a circumflex (^) separates the name of the album from the rest of the record (which contains information on each song and on its singer). Within each song, the artist and the name of the song are separated by a tilde (~), and a colon (:) separates one song from the next.

The program I intend to develop is called musinc, and it will add records to my music file. I will pass the content of each album as a parameter whenever I run the program, this way:

$ musinc "album^musician~music:musician~music:…"

That way, the program musinc will receive the data of each album as if it were a variable. The only difference between a received parameter and a variable is that parameters have numerical names (I know it sounds strange; what I mean is that their names are single digits), such as $1, $2, $3, ..., $9. Let’s make a test:

$ cat teste
#!/bin/bash
# Program to test how to inform the parameters
echo "1o. parm -> $1"
echo "2o. parm -> $2"
echo "3o. parm -> $3"

Let’s run it now:

$ teste informing parameters to test
bash: teste: cannot execute

OOPS, there is a detail I’ve forgotten: we have to make the file executable before running it:

$ chmod 755 teste
$ teste informing parameters to test
1o. parm -> informing
2o. parm -> parameters
3o. parm -> to

Interestingly, the last word, test, was not considered by our program. That is because the program only looks at the first three parameters. Let’s execute it another way:

$ teste "informing parameters" to test
1o. parm -> informing parameters
2o. parm -> to
3o. parm -> test

With the inverted commas, the shell did not treat the blank space between the first two words as a separator, so it took them as a single parameter.
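
The effect can be checked with a small sketch. The function count_args is a hypothetical helper, made up only for this illustration; it prints how many parameters it received:

```shell
# count_args is a hypothetical helper: it prints how many parameters it got
count_args() {
    echo "$#"
}

unquoted=$(count_args informing parameters to test)    # four words, four parameters
quoted=$(count_args "informing parameters" to test)    # the quoted pair counts as one
echo "$unquoted $quoted"
```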

Parametric Hints

Since we are talking about parameters, let me give you some hints:

Meaning of the main variables
Variable  Meaning
$0        Name of the program
$#        Number of informed parameters
$*        Set of all parameters (similar to $@)
  • Examples

Making changes on the program teste, in order to use the variables we have just seen. Let’s do it this way:

$ cat teste
#!/bin/bash
# Program to test how to inform the parameters (2nd Version)
echo The program $0 received $# parameters
echo "1o. parm -> $1"
echo "2o. parm -> $2"
echo "3o. parm -> $3"
echo All in one \"shot\": $*

Note that I inserted a backslash before each of the inverted commas, in order to tell the shell not to interpret them. Let’s run the program.

$ teste informing parameters to test
The program teste received 4 parameters
1o. parm -> informing
2o. parm -> parameters
3o. parm -> to
All in one "shot": informing parameters to test

As I’ve said before, the parameters are numbered from 1 to 9, but that does not mean that it is not possible to use more than 9 parameters. Let’s test it:

  • Example:
$ cat teste
#!/bin/bash
# Program to test how to inform the parameters (3rd Version)
echo The program $0 received $# parameters
echo "11th parm -> $11"
shift
echo "2nd parm -> $1"
shift 2
# Three parameters have been shifted out so far, so the 4th one is now $1
echo "4th parm -> $1"

Let’s run it now:

$ teste informing parameters to test
The program teste received 4 parameters
11th parm -> informing1
2nd parm -> parameters
4th parm -> test

There are two remarkable points about this script:

  1. In order to show that the parameters range from $1 to $9, I wrote an echo $11 and what happened? It was interpreted as a $1 followed by the character 1, and the result was informing1;
  2. The command shift, whose syntax is shift n (in which n is any number, with 1 as its default), discards the first n parameters, so that the parameter formerly numbered n+1 becomes $1.
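
A minimal sketch of both points. The function show_params is hypothetical, and the braces in ${11} are the shell’s standard way of reaching parameters beyond the ninth, something the script above does not use:

```shell
# show_params is a hypothetical function used only for this illustration
show_params() {
    echo "wrong: $11"          # read as $1 followed by the character 1
    echo "right: ${11}"        # braces delimit the parameter number
    shift 10                   # discard the first ten parameters
    echo "after shift 10: $1"  # the former 11th parameter is now $1
}

result=$(show_params a b c d e f g h i j k l)
echo "$result"
```

With the twelve letters above as parameters, the first line shows a1 and the other two show k.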

Well, now that you know a little bit more about informing parameters, let’s return to our CD Library and create our script for including CDs in the database called musics. It is a very simple script (as simple as everything else in Shell) and I’ll list it so that you can see:

  • Examples:
$ cat musinc
#!/bin/bash
# Registers CDs (version 1)
#
echo "$1" >> musics

Since it is a very simple script, it just appends the received parameter to the end of the file musics. Let’s include 3 albums and see if it works (in order to simplify, I’ll suppose each album contains just 2 songs):

$ musinc "album 3^Musician5~Music5:Musician6~Music5"
$ musinc "album 1^Musician1~Music1:Musician2~Music2"
$ musinc "album 2^Musician3~Music3:Musician4~Music4"

Listing the content of musics:

$ cat musics
album 3^Musician5~Music5:Musician6~Music5
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4

It is not as functional as it was supposed to be… it could be a lot better. The albums are out of order, which makes searching harder. Let’s change the script and test it again:

$ cat musinc
#!/bin/bash
# Registers CDs (version 2)
#
echo "$1" >> musics
sort musics -o musics

Including another one

$ musinc "album 4^Musician7~Music7:Musician8~Music8"

Now let’s see what happened to the file musics:

$ cat musics
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4
album 3^Musician5~Music5:Musician6~Music5
album 4^Musician7~Music7:Musician8~Music8

I simply inserted a line that sorts the file musics after appending each album, sending the output to the same file (that’s how the option -o works).

WOW! Now it is nice and almost functional. But attention and don’t panic! That is not the final version. The next version of the program will be a lot better and more friendly! We’ll develop it as soon as we learn how to get data from the screen and how to format the input.

  • Examples

Listing with the cat command is totally out, let’s make a program called muslist that lists the album whose name is given as parameter:

$ cat muslist
#!/bin/bash
# Search for CDs (version 1)
#
grep $1 musics

Let’s run it looking for album 2. As we have previously seen, when informing the sequence of characters album 2, it is necessary to prevent Shell from interpreting it (otherwise it would read two parameters). Let’s try the following:

$ muslist "album 2"
grep: can’t open 2
musics:album 1^Musician1~Music1:Musician2~Music2
musics:album 2^Musician3~Music3:Musician4~Music4
musics:album 3^Musician5~Music5:Musician6~Music5
musics:album 4^Musician7~Music7:Musician8~Music8

‘What a mess!! Where is the mistake? I put the parameter between inverted commas so that shell would not split it into two…’

‘Yeap, but pay attention to how grep is running:

    grep $1 musics

Even though album 2 was put between inverted commas on the command line, inside the script the shell sees an unquoted $1 and, after substituting its content, splits it into two arguments. So, the final content of the line that grep has executed is:

    grep album 2 musics

As the grep syntax is:

    grep <chain of characters> [file1, file2, ..., filen]

grep has understood that it was supposed to look for the chain of characters album in the files 2 and musics. But, since there is no file 2, an error occurred. Moreover, since the word album was found in every record of musics, all records were listed.

Tip: use inverted commas whenever there is a blank space or a <TAB> in the chain of characters that grep will search for. That prevents the words after the blank space or <TAB> from being interpreted as file names.
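
A small sketch of what happened inside muslist. The variable param stands in for $1 and count_words is a hypothetical helper that only reports how many arguments a command would receive:

```shell
# count_words is a hypothetical helper that reports how many arguments it received
count_words() {
    echo "$#"
}

param="album 2"                         # stands in for $1 inside muslist
split=$(count_words $param musics)      # unquoted: album, 2 and musics
kept=$(count_words "$param" musics)     # quoted: "album 2" and musics
echo "$split $kept"
```

The unquoted expansion hands three arguments to the command, which is exactly why grep took 2 for a file name.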

On the other hand, it is better to ignore the case of the letters in the search. The following program would solve both problems at the same time:

$ cat muslist
#!/bin/bash
# Search for CDs (version 2)
#
grep -i "$1" musics

In that case, the option -i tells grep not to consider the case of the letters. Another point is the parameter $1 that was inserted between inverted commas so that grep would understand the chain of characters as a single argument.

$ muslist "album 2"
album 2^Musician3~Music3:Musician4~Music4

Pay attention too to the fact that grep locates the chain of characters in any position of the record, so this way we can search for album, song, singer or even for pieces of them. As soon as we get started with conditional commands, we’ll get a new version of muslist that asks us in which of the fields the search will be performed.’

‘Hold on pal! That putting between inverted commas thing is not really a friendly way of doing that…’

‘You are right! Let me show you another way, then:

$ cat muslist
#!/bin/bash
# Search for CDs (version 3)
#
grep -i "$*" musics

$ muslist album 2
album 2^Musician3~Music3:Musician4~Music4

The variable $* stands for all parameters, and in that program it will be replaced by the chain album 2 (according to the previous example), and it will do what you wanted it to.
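
Since the table of variables says that $* is similar to $@, here is a hedged sketch of where they differ when placed between inverted commas. The function names count, join_all and keep_all are made up for this illustration:

```shell
# count prints how many arguments it received
count() { echo "$#"; }

# join_all and keep_all are hypothetical names for this illustration
join_all() { count "$*"; }    # "$*" becomes one single word: album 2
keep_all() { count "$@"; }    # "$@" keeps each parameter as its own word

joined=$(join_all album 2)
kept=$(keep_all album 2)
echo "$joined $kept"
```

For a search pattern, a single word is what we want, so "$*" fits muslist; "$@" would be the choice when each parameter must stay separate.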

You should have realized by now that the problem with Shell is not whether it can do something or not, but what is the best way of doing it (as you’ve seen, the range of options is huge!).’

‘But what if I have to exclude a CD? Once I forgot a CD of mine under the sun and when I looked at it again… it was lost. What if that happened again?’

‘Well, let’s make another script called musexc, in order to solve that kind of problem.’

Before developing it, I’d like to introduce you to a very useful option of the grep family. Meet the option -v. This option lists every input record except the ones that match the search. Let’s see the example:

$ grep -v "album 2" musics
album 1^Musician1~Music1:Musician2~Music2
album 3^Musician5~Music5:Musician6~Music5
album 4^Musician7~Music7:Musician8~Music8

As I’ve mentioned, the grep from the example lists all the records but the ones that refer to album 2, because those are the ones that match the pattern given to the command. Now we are ready to develop the script that will remove the lost CD from your CD Library. It looks like this:

$ cat musexc
#!/bin/bash
# Delete CDs from Library (version 1)
#
grep -v "$1" musics > /tmp/mus$$
mv -f /tmp/mus$$ musics

The first line sends the content of the file musics to /tmp/mus$$, leaving out the records that match the grep’s search. Afterwards, it moves (or renames, if you prefer this word) /tmp/mus$$ to musics.

I used the file /tmp/mus$$ as a work copy because, as I’ve mentioned previously, $$ contains the PID (Process IDentification). Because of that, when someone else runs the program on the file musics at the same time, a different work copy will be made, which avoids running over other people’s files.
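
As a variation, not the method used in musexc, the work copy can also be created with the mktemp command, assuming it is available on your system. The function remove_album and the sample library below are made up for this sketch:

```shell
# remove_album is a hypothetical function; file and pattern are its parameters
remove_album() {
    file=$1
    pattern=$2
    tmp=$(mktemp) || return 1             # unique work copy, created by mktemp
    grep -v "$pattern" "$file" > "$tmp"   # keep every record except the matches
    mv -f "$tmp" "$file"                  # replace the original with the work copy
}

# Made-up library with three albums
lib=$(mktemp)
printf '%s\n' 'album 1^A~a' 'album 2^B~b' 'album 3^C~c' > "$lib"
remove_album "$lib" "album 2"
remaining=$(cat "$lib")
rm -f "$lib"
echo "$remaining"
```

One design note: mktemp picks a name that does not exist yet, while /tmp/mus$$ only depends on the PID being different.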

‘And that’s it?’

‘Yeah, man! Well, those programs we’ve made are quite basic, because we still lack knowledge about some tools. But, while I have another pint, you can practice using the examples, and I promise you will develop a nice control system for your CDs.

Next time we meet, I’ll show you how conditional commands work and we’ll improve those scripts.’

‘That’s it for now… but before:

Waiter, another round for me and my pal, please!’

Pub Talk Part III

  • Working on chains
    • The cut command
      • The cut command and the option -c
      • The cut command and the option -f
    • If you cut, you paste
      • Laying down
      • Using separators
    • The tr command
      • Changing characters with tr
      • Removing characters with tr
      • Shrinking with tr
  • Conditional Commands
    • The if command

‘Get me two pints, waiter, I’ve got a lot to talk about!’

Working on chains

‘No, I’m not going to put you to hard labor! I’m just going to talk about chains of characters.’

The cut command

‘Let me show you first, a very practical instruction: the cut command. This instruction may be used to cut a certain piece of a file and it may be used in two different ways’

The cut command and the option -c

‘With the option -c, the syntax of the command is the following:’

    cut -c PosIni-PosFim [file]

‘In which:’

    PosIni = Initial position
    PosFim = Final position

$ cat numbers
1234567890
0987654321
1234554321
9876556789

$ cut -c1-5 numbers
12345
09876
12345
98765

$ cut -c-6 numbers
123456
098765
123455
987655

$ cut -c4- numbers
4567890
7654321
4554321
6556789

$ cut -c1,3,5,7,9 numbers
13579
08642
13542
97568

$ cut -c -3,5,8- numbers
1235890
0986321
1235321
9875789

‘As you can see, there are four different syntaxes: in the first one (-c1-5), I specified a range; in the second one (-c-6), everything up to a position; in the third (-c4-), everything from a certain position onward; and in the fourth (-c1,3,5,7,9), certain positions. The last one (-c -3,5,8-) was just to show that they can all be combined.’

The cut command and the option -f

‘Don’t think that it’s finished. As you may have realized, this syntax is handy for files with fixed-size fields, but in fact there are more files with variable-size fields, in which each field ends with a delimiting character. Let’s take a look at the file musics that we started in our last meeting.’

$ cat musics
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4
album 3^Musician5~Music5:Musician6~Music5
album 4^Musician7~Music7:Musician8~Music8

‘So, the layout is the following one:’

    album^Musician1~music1:...:Musician~musicn

‘That means that the name of the album is separated by a circumflex (or caret, ^) from the rest of the record. The rest is made of several groups (each composed of the singer of a song and the song). Within each group, the artist and the name of the song are separated by a tilde (~), and a colon (:) separates one group from the next.’

‘So, in order to cut the piece of information that refers to the second song of each album in the file musics, we must do the following:’

$ cut -f2 -d: musics
Musician2~Music2
Musician4~Music4
Musician6~Music5
Musician8~Music8

‘That means that we have cut the second field (-f) delimited (-d) by a colon (:). But if we just wanted the names of the performers, we should have used a different syntax:’

$ cut -f2 -d: musics | cut -f1 -d~
Musician2
Musician4
Musician6
Musician8

‘In order to understand it, let’s use the first line of musics:’

$ head -1 musics
album 1^Musician1~Music1:Musician2~Music2

‘Watch me now:’

Delimitating the first cut (:)

album 1^Musician1~Music1:Musician2~Music2

‘This way, in the first cut, the first field delimited (-d) by a colon (:) is album 1^Musician1~Music1 and the second one, which is of our interest, is Musician2~Music2.’

‘Let’s see then, what’s happened to the second cut:’

New delimitating character (~)

Musician2~Music2

‘Now the first field delimited (-d) by a tilde (~), Musician2, is the one of our interest, and the second field is Music2.’

‘Since the same reasoning applies to the rest of the file, we get the answer shown before.’
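
The chained cuts can be wrapped in a small function. This is only a sketch: nth_artist is a hypothetical helper, and it adds one extra cut on the circumflex (^) so that the album name does not stay glued to the first group:

```shell
# nth_artist is a hypothetical helper: prints the artist of song number $2
# in the record $1 (same layout as the file musics)
nth_artist() {
    printf '%s\n' "$1" |
        cut -f2 -d'^' |      # drop the album name
        cut -f"$2" -d: |     # pick the n-th artist~song group
        cut -f1 -d'~'        # keep only the artist
}

record='album 1^Musician1~Music1:Musician2~Music2'
first=$(nth_artist "$record" 1)
second=$(nth_artist "$record" 2)
echo "$first $second"
```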

If you cut, you paste

‘As you may guess, the paste command pastes things. When we are dealing with shell, nevertheless, we talk about pasting files. In order to understand it, let’s see:’

    paste file1 file2

‘This way, the command will send the records from file1 and from file2 to the standard output (stdout). The records of both files will be arranged side by side and, if you do not define a delimiting character, the default one (<TAB>) will be used.’

‘Paste is a rarely used command because of its syntax (not that it is hard; it is just not well known). Let’s play with two files:’

$ seq 10 > integer
$ seq 2 2 10 > even

‘In order to check the content of the files, let’s use the paste command in its more conventional way:’

$ paste integer even
1 2
2 4
3 6
4 8
5 10
6
7
8
9
10

Laying down

‘Let’s convert the column into a line:’

$ paste -s even
2 4 6 8 10

Using separators

‘As we have said, <TAB> is the default separator, but it can be changed with the option -d. So, in order to calculate the sum, we would perform the following operation:’

$ paste -s -d'+' even # could be: -sd'+'
2+4+6+8+10

‘Afterwards, we would pipe that line to the calculator (bc), and it would be like this:’

$ paste -sd'+' even | bc
30

‘So, the factorial of the number defined by $Num would be:’

$ seq $Num | paste -sd'*' | bc
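
A sketch of the same idea as a function. Two assumptions here: the name factorial is made up, and the shell’s own arithmetic $(( )) is swapped in for bc so that the sketch does not depend on bc being installed:

```shell
# factorial is a hypothetical function name; it assumes seq is available
factorial() {
    expr_text=$(seq "$1" | paste -sd'*' -)   # e.g. 1*2*3*4*5
    echo "$(( $expr_text ))"                  # let the shell do the arithmetic
}

five=$(factorial 5)
echo "$five"
```

Keep in mind that shell arithmetic works with fixed-size integers, so for large numbers bc remains the safer choice.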

‘With the paste command, it is also possible to employ some ‘exotic’ formats, like the following:’

$ ls | paste -s -d'\t\t\n'
file1 file2 file3
file4 file5 file6

‘What has just happened was: with the option -s, the paste command converts lines into columns. The separators (oh, yeah! There might be more than one separator, and they are used in turn after each column that has been created) were a <TAB>, another <TAB> and an <ENTER>, so the output was presented in three columns.’

‘Now that you got it, check how the same thing can be done, but in an easier (and less strange) way, using the same command, but with a different syntax:’

$ ls | paste - - -
file1 file2 file3
file4 file5 file6

‘That happens because when we use a minus (-) in place of a file name, the paste command reads the standard input instead of a file. In our last example, the pipe character (|) sent the output of the ls command to the standard input (stdin) of the paste command, and each of the three dashes took one line at a time, which produced the three columns. But take a look at the following example:’

$ cat file1
precedence
privilegious
proportional

$ cat file2
position
mary
motion

$ cut -c-3 file1 | paste -d "" - file2
preposition
primary
promotion

‘In that case, the cut command has returned the first three letters of each record of file1. The paste command was set not to employ a separator (-d "") and to receive its first input from the standard input (redirected by the pipe (|)) at the dash (-), joining each line with the corresponding line of file2.’
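
To see how each dash consumes one line of the standard input at a time, here is a minimal sketch that uses seq instead of ls, so that the result does not depend on the files of your directory. The blank space as separator is just a choice for this example:

```shell
# Three dashes: three lines of the standard input per output line
rows=$(seq 6 | paste -d' ' - - -)
echo "$rows"
```

The six numbers come out as two lines of three.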

The tr command

‘Another very interesting command is tr. It substitutes, compresses or removes characters. Its syntax follows the pattern:’

    tr [options] string1 [string2]

‘The tr command copies the text from the standard input and replaces the occurrences of the characters of string1 with their correspondents in string2, or squeezes multiple consecutive occurrences of the characters of string1 into just one, or removes the characters of string1.’

The main options are:

Main options of the tr command
Option  Meaning
-s      compresses consecutive occurrences of a character of string1 into one
-d      removes the characters of string1

Changing characters with tr

‘Let me show you a silly example first:’

$ echo silly | tr i a
sally

‘That means, I have changed the occurrences of i for a.’

‘Suppose that at a certain point of my script, the operator is asked to press y or n (Yes or No), and the answer is stored in the variable $Resp. The content of that variable could be capitalized or not, so, in order to avoid many tests to find out whether it is N, n, Y or y, I simply do the following:’

$ Resp=$(echo $Resp | tr YN yn)

‘and afterwards, you can be sure that the content of the variable will be either an n or a y.’
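
The same conversion as a tiny function, in case the answer has to be normalized in several places. The name normalize_answer is hypothetical:

```shell
# normalize_answer is a hypothetical helper name
normalize_answer() {
    printf '%s\n' "$1" | tr YN yn    # Y becomes y, N becomes n, the rest is untouched
}

Resp=$(normalize_answer Y)
echo "$Resp"
```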

‘If my file FileIn is all written in small letters and I wish to convert them into capital letters, what can I do?’

$ tr a-z A-Z < FileIn > /tmp/$$
$ mv -f /tmp/$$ FileIn

‘Take a look: I have used the notation a-z so that I would not need to write abcdef....yz. Other notations that could be used are those we call escape sequences (which are common to other languages, like C), whose meaning you’ll see below:’

Escape Sequences
Sequence  Meaning                Octal
\t        Tab                    11
\n        New line               12
\v        Vertical Tab           13
\f        Form Feed              14
\r        Carriage Return <^M>   15
\\        Backslash              134

Removing characters with tr

‘Let me tell you a tale: a student was quite mad at me, so he decided to make things harder for me and, in a practical exercise, he handed in a script in which all the commands were separated by semicolons (do you remember I said that a semicolon is used to write many commands on the same line?).’

‘I’ll show you an example of such an aberration:’

$ cat confusion
echo Read an online shell book at http://www.julioneves.com > book;cat book;pwd;ls;rm -f trash 2>/dev/null;cd ~

‘When the script was run, the answer was:’

$ confusion
Read an online shell book at http://www.julioneves.com
/home/jneves/LM
confusion book musexc musics musinc muslist number

‘But, since I was meant to grade the script, I had to evaluate it seriously, so, to understand what he has done, I called him and in front of him, I ran the following command:’

$ tr ";" "\n" < confusion
echo Read an online shell book at http://www.julioneves.com > book
cat book
pwd
ls
rm -f trash 2>/dev/null
cd ~

‘I don’t have to tell you how disappointed he got when I, in a few seconds, undid the joke he had spent hours doing.’

‘But, pay attention! If I were using a Unix system (with ksh or sh), the command should be:’

$ tr ";" "\012" < confusion

Shrinking with tr

‘See the difference between two executions of the date command (one I ran today and the other I ran two weeks ago):’

$ date # Today
Sun Sep 19 14:59:54 2004

$ date # Two weeks ago
Sun Sep  5 10:12:33 2004

‘If I wanted to isolate the hour, I should do the following:’

$ date | cut -f 4 -d ' '
14:59:54

‘On the other hand, two weeks ago, the answer would be:’

$ date | cut -f 4 -d ' '
5
5

‘But pay attention to the following detail:’

$ date # Two weeks ago
Sun Sep  5 10:12:33 2004

‘As you can see, there are two blank spaces before the number 5 (day). That ruins it all, because the third field is empty and the fourth is the day (5). The ideal would be to compress the successive blank spaces into just one, so that both strings can be handled the same way. See how you can do it:’

$ date | tr -s " "
Sun Sep 5 10:12:33 2004

‘You can see there are no longer two spaces. Now I can cut it:’

$ date | tr -s " " | cut -f 4 -d " "
10:12:33
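
Here is a sketch that replays both situations with fixed strings instead of the real date command, so that it gives the same result on any day. The helper hour_of is a hypothetical name:

```shell
# Fixed sample lines stand in for the output of date on two different days
today='Sun Sep 19 14:59:54 2004'
earlier='Sun Sep  5 10:12:33 2004'     # note the two spaces before the 5

# hour_of is a hypothetical helper: squeeze the spaces, then cut the 4th field
hour_of() {
    printf '%s\n' "$1" | tr -s ' ' | cut -f4 -d' '
}

naive=$(printf '%s\n' "$earlier" | cut -f4 -d' ')   # picks the day, not the hour
fixed=$(hour_of "$earlier")
echo "$naive $fixed"
```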

‘See how handy the shell can be? Now, take a look at the following file, which originally came from that operating system that is vulnerable to all sorts of viruses.’

$ cat -ve FileFromDOS.txt
This file^M$
was recorded by^M$
the Windows and^M$
downloaded by^M$
a badly done ftp.^M$

‘Let me give you two tips:’

Tip #1: The option -v of the cat command shows the invisible control characters with the notation ^L, in which ^ stands for the control key and L stands for the corresponding letter. The option -e shows the end of the line with a dollar sign ($).
Tip #2: That happens because in Windows (or DOS) formatted files, there is a carriage return (\r) and a line feed (\n) at the end of each record. In Linux formatted files, on the other hand, there is only a line feed at the end of each record.

‘Let’s clean the file now’

$ tr -d '\r' < FileFromDOS.txt > /tmp/$$
$ mv -f /tmp/$$ FileFromDOS.txt

‘Check now what’s happened:’

$ cat -ve FileFromDOS.txt
This file$
was recorded by$
the Windows and$
downloaded by$
a badly done ftp.$

‘The option -d of the tr command removes a character (the one that has been specified) from the whole file. Thus, I have removed the unwanted characters, saving the text in a temporary file (that afterwards became the substitute of the original file).’

If I were using a Unix machine (with ksh or sh), the command should be:

$ tr -d '\015' < FileFromDOS.txt > /tmp/$$
$ mv -f /tmp/$$ FileFromDOS.txt

Attention: that has happened because ftp was run in binary (or image) mode, which means no text interpretation. If the ascii option had been set before the file transmission, that wouldn’t have happened.

‘Well, those hints are making me enjoy this shell stuff, but there are many things I still can’t do.’

‘Never mind! There are still many things for you to learn about shell programming. But you are ready to solve a lot of problems using what you’ve learned, as long as you adopt a “shell way of thinking”. Are you able to make a script that tells me who’s been logged in for more than one day at your server?’

‘Surely not!! I would have to use conditional commands that I still don’t know.’

‘Why don’t you change a little bit your way of thinking and come to the shell side of the force? Waiter, my pal, bring us some pints before we proceed…

‘Now, that we have our pints, let’s solve that problem. Pay attention to the who command:’

$ who
jneves pts/1 Sep 18 13:40
rtorres pts/0 Sep 20 07:01
rlegaria pts/1 Sep 20 08:19
lcarlos pts/3 Sep 20 10:01

‘And also to the date command ‘

$ date
Mon Sep 20 10:47:19 BRT 2004

‘Now look: month and day are presented in the same format by both commands.’

Tip: sometimes different commands present outputs in different languages. When that happens, you can do the following:

$ date
Mon Sep 20 10:47:19 BRT 2004

$ LANG=pt_BR date
Seg Set 20 10:47:19 BRT 2004

This way, you can bring some uniformity to the languages employed (the original language of this text is Brazilian Portuguese).

‘Well, if there is a register of who in which we don’t find today’s date, that means the user has been logged in for more than one day (considering that the user can’t be logged in since tomorrow)… So, let’s save the piece of data that is of our interest.’

$ Data=$(date | cut -c 5-10)

‘I have used the construction $(...) in order to prioritize the execution of the commands inside it, before assigning their output to the variable $Data. Let’s see how it works:’

$ echo $Data
Sep 20

‘Sweet! Now, what we should do, is to look for the registers that do not present that day in the output of the who command.’

‘I see! Well, and since you’ve mentioned the action of searching, I’m thinking of grep. Am I right?’

‘That is RIGHT! Very good! But I need to use grep with that option that makes it list just the registers in which there is not the string. Any idea?’

‘Well, yeah… hummm… is it -v?’

‘In fact, it IS! You’re getting good at it! So let’s see:’

$ who | grep -v "$Data"
jneves pts/1 Sep 18 13:40

‘And if I wanted something a little prettier, I would do the following:’

$ who | grep -v "$Data" | cut -f1 -d ' '
jneves

‘See? No conditional was necessary. Especially when we consider that our conditional (if) does not test conditions, but instructions, as we shall see.’

Conditional Commands

‘Check the lines below:’

$ ls musics
musics

$ echo $?
0

$ ls FileThatDontExists
ls: FileThatDontExists: No such file or directory

$ echo $?
1

$ who | grep jneves
jneves pts/1 Sep 18 13:40 (10.2.4.144)

$ echo $?
0

$ who | grep juliana
$ echo $?
1

‘What does that $? do? It looks like a variable; is it?’

‘Yes, it is a variable that contains the return code of the last instruction run. I can assure you that if this instruction succeeded, $? equals zero; otherwise, it will be different from zero.’
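
A sketch to play with return codes without depending on who is logged in. The helper return_code and the sample line are made up for this illustration:

```shell
# return_code is a hypothetical helper: it runs a command quietly
# and prints the return code that the command left in $?
return_code() {
    status=0
    "$@" > /dev/null 2>&1 || status=$?
    echo "$status"
}

list=$(mktemp)
echo 'jneves pts/1 Sep 18 13:40' > "$list"     # made-up line in the format of who

found=$(return_code grep jneves "$list")       # grep finds the string
missing=$(return_code grep juliana "$list")    # grep finds nothing
rm -f "$list"
echo "$found $missing"
```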

The if command

‘The if command tests the variable $?. Its syntax is:’

    if cmd
    then
        cmd1
        cmd2
        ...
        cmdn
    else
        cmd3
        cmd4
        ...
        cmdm
    fi

‘That means: considering that the cmd command has been successfully executed, the commands that compose the set called then (cmd1, cmd2, ... & cmdn) will be executed. Otherwise, the optional set of commands else (composed by the cmd3, cmd4, ... & cmdm commands), will be executed. The execution will be finished with a fi.’

‘Let’s see how it works, using a small script that includes users at /etc/passwd:’

$ cat incusu
#!/bin/bash
# Version 1
if grep "^$1:" /etc/passwd    # the ':' avoids matching longer names that start with $1
then
    echo "User '$1' already exists"
else
    if useradd "$1"
    then
        echo "User '$1' included at /etc/passwd"
    else
        echo "We have problems. Are you root?"
    fi
fi

‘Notice that the if command tests the grep command directly (that’s its purpose). If grep succeeds (in the case above, that means: if the user whose name is in $1 is found in /etc/passwd), the then set of commands is executed (in this example, only the echo command). Otherwise, the instructions of the else set are executed, where a new if tests whether the useradd command manages to include the user $1 in /etc/passwd. If it does not, an error message is shown asking whether the guy is root.’

‘Let’s check the command, firstly trying to execute it with a pre existent user:’

$ incusu jneves
jneves:x:54002:1001:Julio Neves:/home/jneves:/bin/bash
User 'jneves' already exists

‘As we’ve seen a few times before, an undesirable line was added: the output of the grep command. In order to avoid that problem, we should redirect the output of that command to /dev/null, like this:’

$ cat incusu
#!/bin/bash
# Version 2
if grep "^$1:" /etc/passwd > /dev/null    # or: if grep -q "^$1:" /etc/passwd
then
    echo "User '$1' already exists"
else
    if useradd "$1"
    then
        echo "User '$1' included at /etc/passwd"
    else
        echo "We have problems. Are you root?"
    fi
fi

‘Now, let a normal user (not the root) test it:’

$ incusu JohnNobody
./incusu[6]: useradd: not found
We have problems. Are you root?

‘Wow… that error was not supposed to occur! In order to avoid it, let’s redirect the useradd error output to /dev/null, like this:’

$ cat incusu
#!/bin/bash
# Version 3
if grep "^$1:" /etc/passwd > /dev/null    # or: if grep -q "^$1:" /etc/passwd
then
    echo "User '$1' already exists"
else
    if useradd "$1" 2> /dev/null
    then
        echo "User '$1' included at /etc/passwd"
    else
        echo "We have problems. Are you root?"
    fi
fi

‘After doing those changes, and executing a su - (becoming root), let’s see how it works:’

$ incusu xalaskero
User 'xalaskero' included at /etc/passwd

‘Once again:’

$ incusu xalaskero
User 'xalaskero' already exists
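
Since incusu needs root to really include someone, here is a sketch to try the same if logic against a scratch file. Everything in it is made up for the exercise: the file in $sample_db stands in for /etc/passwd, and check_user is a hypothetical function:

```shell
# sample_db stands in for /etc/passwd; remove it when you are done
sample_db=$(mktemp)
printf '%s\n' 'jneves:x:54002:1001' 'rtorres:x:54003:1001' > "$sample_db"

# check_user is a hypothetical function: if tests the return code of grep
check_user() {
    if grep -q "^$1:" "$sample_db"
    then
        echo "User '$1' already exists"
    else
        echo "User '$1' not found"
    fi
}

check_user jneves
check_user jn
```

The second call shows why the colon matters in the pattern: jn is the beginning of jneves, but it is not a registered user.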

‘See? As I told you, as long as we talk and drink, our programming skills are improving. Let’s see how we can enhance our music software:’

$ cat musinc
#!/bin/bash
# Include Musics (version 3)
#
if grep "^$1$" musics > /dev/null
then
    echo This album is already registered
else
    echo "$1" >> musics
    sort musics -o musics
fi

‘It’s an evolution of the previous version, see? Instead of simply including a record (which could be duplicated in the previous version), we now test whether a whole record, from its beginning (^) to its end ($), matches the informed parameter ($1). A ^ is used at the beginning of the string and a $ at the end of it in order to test whether the informed parameter is exactly equal to some previously registered record.’

‘Let’s run it now, informing a previously registered album’

$ musinc "album 4^Musician7~Music7:Musician8~Music8"
This album is already registered

‘And now a non registered one: ‘

$ musinc "album 5^Musician9~Music9:Musician10~Music10"
$ cat musics
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4
album 3^Musician5~Music5:Musician6~Music5
album 4^Musician7~Music7:Musician8~Music8
album 5^Musician9~Music9:Musician10~Music10
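
One caveat of the ^...$ test: characters such as a dot or an asterisk inside the album data are still read by grep as regular expression characters. A hedged alternative, not the one used in musinc, is to ask grep for a fixed string (-F) that matches the whole line (-x). The function already_there and the scratch file are made up for this sketch:

```shell
# Scratch library with a single made-up record
lib=$(mktemp)
printf '%s\n' 'album 4^Musician7~Music7:Musician8~Music8' > "$lib"

# already_there is a hypothetical helper: succeeds only if some record
# is exactly equal to the informed parameter
already_there() {
    grep -q -x -F -e "$1" "$lib"
}

if already_there 'album 4^Musician7~Music7:Musician8~Music8'
then
    echo This album is already registered
fi
```

With -F, a parameter like album .^Musician7~Music7:Musician8~Music8 is not taken as a match, because the dot is compared literally.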

‘As you’ve seen, our software is slowly improving, and it will get even better as we go through these shell lessons.’

‘I got everything you said, but I still don’t see how to use an if to test conditions, which I thought was the main purpose of the command.’

‘Dude, that’s what the test command is for: to test conditions. The if command, in turn, checks whether test succeeded or failed. But talking about it now would be way too complicated and, besides, I’m really thirsty. Let’s have some beer, and next time I’ll tell you about test and the other if syntaxes.’

‘Deal! Especially because I’m getting dizzy with that amount of information, and it will give me some time to practice.’

‘Why don’t you write a little script that tells whether a user is logged in or not? Meanwhile: WAITER! Two more pints, please…’

– JulioNeves – 01 Aug 2006

Pub Talk Part IV

  • The test Command
  • Please help me to finish or to correct this translation (from Portuguese)

‘My fellow!!! How have you been, mr. bin? Have you done the exercise I asked you to?’

‘I surely did! When programming is the topic, if you don’t practice, you don’t learn. You asked me for a simple script that tells whether a user is logged in or not. I wrote the following script:’

$ cat logado
#!/bin/bash
# searches whether a user is logged in or not
if who | grep $1
then
    echo $1 is logged
else
    echo "$1 can't be seen anywhere around"
fi

‘Easy boy!! You look quite excited, but let’s just ask for some beer first. John, two beers, please! And please, pour mine with no foam…’

‘OK! Now that we’ve had our drinks, let’s take a look:’

$ logado jneves
jneves pts/0 Oct 18 12:02 (10.2.4.144)
jneves is logged

‘Well, it really works! I passed my login as the parameter and it told me I was logged in. But it also showed something I didn’t ask for: a line from the who command. To avoid that, we just need to send that line to the black hole called /dev/null. Check it out:’

$ cat logado
#!/bin/bash
# searches whether a user is logged in or not (version 2)
if who | grep $1 > /dev/null
then
    echo $1 is logged
else
    echo "$1 can't be seen anywhere around"
fi

‘Now, let’s test:’

$ logado jneves
jneves is logged
$ logado chico
chico can't be seen anywhere around

Attention: Remember the trick: most commands have a standard output and an error output, and we must pay attention to which of them (or both) has to be sent to the black hole. With grep, failing to find a string produces no error message at all, only a non-zero exit status.
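One thing to watch in logado: a bare grep for the parameter also succeeds when it is only part of a name, or when it shows up anywhere else in the line. A possible refinement, as a sketch only: the helper is_listed and the canned sample line are invented here so the example runs without calling who.

```shell
#!/bin/bash
# is_listed is a hypothetical helper. It reads who-style lines from
# standard input and succeeds only if some line starts with the given
# name followed by a space. Both outputs go to the black hole:
# > /dev/null takes standard output, 2>&1 sends the errors after it.
is_listed() {
    grep "^$1 " > /dev/null 2>&1
}

# Canned line standing in for the output of who.
sample='jneves pts/0 Oct 18 12:02 (10.2.4.144)'

if echo "$sample" | is_listed jneves
then
    echo jneves is logged
fi

if echo "$sample" | is_listed jne
then
    echo jne is logged
else
    echo "jne can't be seen anywhere around"
fi
```

In a real script the function would be fed by who itself: who | is_listed "$1".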

‘But let’s change the subject. The last time we met, I was showing you some conditional commands and, when I was thirsty as hell, you asked me how one can test conditions. Let’s see the test command, then.’

The test Command

‘Well, we are all used to an if that tests conditions, and those are always the same: greater than, less than, greater than or equal to, less than or equal to, equal to and not equal to. In Shell, conditions are tested with the test command, and it is a lot more powerful than you might imagine. Let me first show you the main options for testing files on disk:’

Options of the test command for files
Option     True if:
-e file    file exists
-s file    file exists and its size is greater than zero
-f file    file exists and is a regular file
-d file    file exists and is a directory
-r file    file exists and you have read permission
-w file    file exists and you have write permission
-x file    file exists and you have execute permission
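A quick way to get a feel for these options is to try them on a few scratch files. The sketch below does that; the helper describe and all the file names are invented for the occasion.

```shell
#!/bin/bash
# describe is a hypothetical helper; it only reports what test finds.
describe() {
    if test ! -e "$1"
    then
        echo "$1: does not exist"
    elif test -d "$1"
    then
        echo "$1: directory"
    elif test -s "$1"
    then
        echo "$1: file with something in it"
    else
        echo "$1: empty file"
    fi
}

# Scratch directory with invented names, removed again at the end.
scratch=$(mktemp -d)
touch "$scratch/empty"
echo hello > "$scratch/full"

describe "$scratch"
describe "$scratch/empty"
describe "$scratch/full"
describe "$scratch/ghost"

rm -rf "$scratch"
```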

‘Now, the main options for testing character strings:’

Options of the test command for character strings
Option       True if:
-z string    the size of string is zero
-n string    the size of string is greater than zero
string       the size of string is greater than zero
c1 = c2      strings c1 and c2 are identical
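A short sketch of the string tests at work, using -z and =. The helper classify is invented. Note the double quotes around $1: with an empty, unquoted variable, test would receive one argument less than it expects and complain.

```shell
#!/bin/bash
# classify is a hypothetical helper for trying the string options.
classify() {
    if test -z "$1"
    then
        echo "empty"
    elif test "$1" = beer
    then
        echo "beer"
    else
        echo "something else"
    fi
}

classify ""
classify beer
classify wine
```

The three calls print "empty", "beer" and "something else", in that order.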

‘Think it’s over? Sorry!!! Now comes the part you are not used to: comparisons with numbers. Check out the table below:’

Options of the test command for numbers
Option       True if:
n1 -eq n2    n1 is equal to n2
n1 -ne n2    n1 and n2 are different
n1 -gt n2    n1 is greater than n2
n1 -ge n2    n1 is greater than or equal to n2
n1 -lt n2    n1 is less than n2
n1 -le n2    n1 is less than or equal to n2
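To see the numeric operators in action, here is a small sketch with an invented helper called compare. It assumes both arguments are whole numbers; test does not handle decimals.

```shell
#!/bin/bash
# compare is a hypothetical helper that compares two whole numbers.
compare() {
    if test "$1" -lt "$2"
    then
        echo "$1 is less than $2"
    elif test "$1" -gt "$2"
    then
        echo "$1 is greater than $2"
    else
        echo "$1 equals $2"
    fi
}

compare 2 10
compare 10 2
compare 7 7
```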

‘Furthermore, consider the following operators:’

Operators
Operator           Purpose
Parentheses ( )    Grouping
Exclamation !      Negation
-a                 Logical AND
-o                 Logical OR

‘Wow! As you’ve seen, there is a lot of stuff here and, as I told you, our if is much more powerful than the others’. Let’s see some examples of how it works. First, testing whether a directory exists:’

Example:

    if  test -d lmb
    then
        cd lmb
    else
        mkdir lmb
        cd lmb
    fi

‘That example tests whether there is a directory called lmb and, if there is not, creates it. I know you will probably question this logic and say the script is not optimized. I know, but I wanted you to understand it the way it is, so that now we can use an exclamation point (!) to negate the test command. Check it:’

    if  test ! -d lmb
    then
        mkdir lmb
    fi
    cd lmb

‘This way, the lmb directory is created if (and only if) it does not exist yet. The negation comes from the exclamation point (!) that precedes the -d option. At the end of this piece of script, you will certainly be in the lmb directory.
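The same idea can be written even shorter. What follows is only a sketch: the helper enter_dir is invented, and it relies on the shell's || operator, which runs the command on its right only when the one on its left fails.

```shell
#!/bin/bash
# enter_dir is a hypothetical helper: make sure the directory exists,
# then move into it.
enter_dir() {
    # mkdir runs only when the test fails, that is, when the directory
    # is missing. A plain mkdir -p "$1" would also do, since -p does not
    # complain about a directory that is already there.
    test -d "$1" || mkdir "$1"
    cd "$1"
}

scratch=$(mktemp -d)
cd "$scratch"
enter_dir lmb     # the first call creates lmb
cd ..
enter_dir lmb     # the second call finds it and creates nothing
pwd
```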

Let’s see two more examples, to check the difference between comparing numbers and comparing strings.’

    str1=1
    str2=01
    if  test $str1 = $str2
    then
        echo The variables are equal.
    else
        echo The variables are not equal.
    fi

‘Running the piece of software above, the answer would be:’

    The variables are not equal.

‘Now, let’s change it in order to have a numerical comparison:’

    str1=1
    str2=01
    if  test $str1 -eq $str2
    then
        echo The variables are equal.
    else
        echo The variables are not equal.
    fi

‘And let’s run it again:’

     The variables are equal.

‘As you have seen above, we got two different results: as strings, 01 is quite different from 1, but numerically they are the same, since the number 1 is equal to the number 01.’

Examples:

‘To show the use of the connectors -o (OR) and -a (AND), check this example typed at the prompt:’

$ Familia=felinae
$ Genero=cat
$ if test $Familia = canidea -a $Genero = lobo -o $Familia = felino -a $Genero = lion
> then
> echo Be aware
> else
> echo Can wave it
> fi
Can wave it

Tip: The greater-than signs (>) at the beginning of the inner lines of the if are continuation prompts (defined in the variable PS2). When our friend Shell notices that a command continues on the next line, it automatically shows this prompt on every line until the command is complete.

‘Let’s change the example to check if it still works:’

$ Familia=felino
$ Genero=cat
$ if test $Familia = felino -o $Familia = canideo -a $Genero = lion -o $Genero = lobo
> then
> echo Be aware
> else
> echo Can wave it
> fi
Be aware

‘The result is obviously wrong. That happened because the -a operator takes precedence over -o. So what was evaluated first was the expression:’

    $Familia = canideo -a $Genero = lion

‘That expression was evaluated as false, which left us with:’

    $Familia = felino -o FALSE -o $Genero = lobo

‘Solving the rest, we get:’

    TRUE -o FALSE -o FALSE

‘Since all the remaining connectors are -o, the final result was TRUE (in a series of expressions connected by a logical OR, a single true expression is enough to make the whole result true), and the then was wrongly executed. To make it work properly again, let’s do the following:’

$ if test \( $Familia = felino -o $Familia = canideo \) -a \( $Genero = lion -o $Genero = lobo \)
> then
> echo Be aware
> else
> echo Can wave it
> fi
Can wave it

‘This way, the parentheses group the expressions joined by -o and give them priority, which results in:’

    TRUE -a FALSE

‘The result of expressions connected by the operator -a is true when all the expressions that it connects are true (which is not the case above). This way, the result was FALSE and the else was correctly executed.’
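There is another way to get the same grouping without escaped parentheses: give each comparison its own test and let the shell join them with && (AND) and || (OR), using braces to group. This is offered as a sketch, not as the method of this talk. The function name warn is invented. For what it is worth, the POSIX specification marks -a, -o and the parentheses of test as obsolescent.

```shell
#!/bin/bash
# warn is a hypothetical helper: $1 is the family, $2 is the genus.
# In the shell, && and || have the same precedence and are read from
# left to right, so the braces are what keeps each OR together.
warn() {
    if { test "$1" = felino || test "$1" = canideo; } &&
       { test "$2" = lion || test "$2" = lobo; }
    then
        echo Be aware
    else
        echo Can wave it
    fi
}

warn felino cat
warn felino lion
```

The first call prints "Can wave it" and the second prints "Be aware".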

‘If we want to pick a CD that has songs by two different musicians, we may be tempted to use an if with the -a connector, but it is good to bear in mind that bash gives us many resources, and this can be done much more simply with grep alone, as in the following example:’

$ grep Musician1 musics | grep Musician2

‘Similarly, to pick a CD that has songs by Musician1 or by Musician2, there is no need for an if with the -o connector. The egrep command (or grep -E, which is the more advisable form) also solves this problem. Check it out:’

$ egrep '(Musician1|Musician2)' musics

‘Or (specifically for that case) grep itself could help us out:’

$ grep 'Musician[12]' musics

‘In the egrep above, a regular expression was used: the vertical bar (|) works as a logical OR and the parentheses limit the scope of that OR. In the last grep, on the other hand, the word Musician must be followed by one of the characters listed between the square brackets ([ ]), that is, 1 or 2.’
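A sketch of both searches on a throwaway copy of a few sample records. It assumes the record layout used in this talk, where each musician's name is followed by a ~. Including that ~ in the pattern keeps Musician1 from also matching Musician10.

```shell
#!/bin/bash
# Throwaway copy of three sample records.
albums=$(mktemp)
cat > "$albums" <<'EOF'
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4
album 3^Musician5~Music5:Musician6~Music5
EOF

# AND: the second grep only sees the lines the first one let through.
both=$(grep 'Musician1~' "$albums" | grep -c 'Musician2~')

# OR: one extended regular expression with two alternatives.
either=$(grep -E -c '(Musician1|Musician3)~' "$albums")

echo "both: $both, either: $either"
rm -f "$albums"
```

With these three records it prints "both: 1, either: 2".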

‘OK! I accept it when you tell me that the shell’s if is much more powerful than the others’. But let me tell you one thing: that if test... syntax is quite weird, huh?’

‘Yeah, I think you’re right. I don’t like it either… no one does, I guess. I think that’s why Shell has another syntax to substitute the test command.’

Examples

‘In order to do so, we’ll use that example to switch directories. It was like that:’

    if  test ! -d lmb
    then
        mkdir lmb
    fi
    cd lmb

‘Using a new syntax, it will be:’

    if  [ ! -d lmb ]
    then
        mkdir lmb
    fi
    cd lmb

‘That means the test command can be replaced by a pair of square brackets ([ ]), separated from their arguments by blank spaces. This makes the command much more readable, since the if now looks the way it does in other languages, and that is why this is the form of test we will use from now on.’
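Those blank spaces are not optional, and forgetting them can fail silently. A small sketch with two invented helpers, one written correctly and one broken on purpose:

```shell
#!/bin/bash
# Correct: every argument is separated by spaces, so [ compares the
# value of $1 with the word no.
says_no() {
    [ "$1" = no ]
}

# Broken on purpose: without spaces around =, the shell hands [ the
# single word "yes=no" (or whatever $1 holds). One non-empty word
# counts as true, so this version always succeeds.
says_no_broken() {
    [ "$1"=no ]
}

if says_no yes
then
    echo "correct version: yes is no"
else
    echo "correct version: yes is not no"
fi

if says_no_broken yes
then
    echo "broken version: yes is no"
fi
```

The correct version reports that yes is not no, while the broken one claims the opposite without a single error message.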

Please help me to finish or to correct this translation (from Portuguese)

Send me an e-mail: julio.neves@gmail.com – JulioNeves
