White shell - whitespace in bash for C programmers
I will try to explain how to use whitespace characters and quoting in shells like bash. I program in Python, C, Java, Perl and other normal computer languages, and many times I have found myself in big trouble just doing simple things in the shell. Why? I hope it's not because I'm a bad programmer :)
Rule no. 1
The basic things we programmers need to know about shell languages can be very different from the languages we use on a daily basis. The most important one is this:
In shell everything is a string!
It's not like C, for example, where a string is only the text inside "" (double quotes) or something stored under a char* variable name. In the shell everything is a string.
Let's check this out on a variable assignment example:
X="hello"
Variable X now stores a string, but the line below does exactly the same:
X=hello
Frankly speaking, X itself is also a string. You can try:
echo X
Only when you add $ (dollar sign) will the shell fetch the string stored in memory before.
echo $X
X=hello
hello=X
echo $X $hello   # :)
So what do we need "" quoting for? In this case, for nothing. But typing
X="hello world"
is different from typing
X=hello world
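The shell needs to know where every command argument starts (the main duty of a shell is invoking commands), so by default whitespace characters separate the command name and its arguments. In X=hello world the assignment ends at the space, and world is taken as the name of a command to run. In conclusion, we need quoting to suppress this behavior.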
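A minimal sketch of how whitespace and quotes decide where arguments start. The helper count_args is my own invention for illustration, not something from the shell itself:

```shell
# count_args is a hypothetical helper (not a standard command):
# it prints how many arguments it was called with.
count_args() { echo "$#"; }

unquoted=$(count_args hello world)     # whitespace separates: 2 arguments
quoted=$(count_args "hello world")     # quotes glue the words: 1 argument
echo "unquoted=$unquoted quoted=$quoted"
```

It should print unquoted=2 quoted=1, so the quotes change the number of arguments and nothing else.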
Another proof that everything is a string is here:
X="hello"" ""world"
Because everything is a string, string concatenation is as basic as the letter itself (in words you put letter after letter, you concatenate them, don't you?). In our example we have three quoted strings joined together.
But here something unacceptable in normal programming languages is legal:
X=hello" ""world"
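To convince yourself that these spellings really produce the same string, here is a small sketch. The variable names A, B and C are just placeholders I picked:

```shell
A="hello"" ""world"     # three quoted pieces
B=hello" ""world"       # one unquoted piece + two quoted pieces
C=hello\ world          # no quotes at all, the space is escaped

if [ "$A" = "$B" ] && [ "$B" = "$C" ]; then
    echo "all three hold: $A"
fi
```

Adjacent pieces are simply glued together, no matter how each piece was quoted.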
BTW, also try this funny-looking command:
"hello world"
Yes, this line is an ordinary command. Another proof that everything is a string.
Furthermore, the most common source of problems is misunderstanding how quotes are treated.
When we write:
X="hello world"
the string hello world is copied into memory under the name X. The quotes are not stored in memory!
So the command
echo $X
works OK, but not the way people think it works. The variable is expanded to one string, which is then split again at the positions of the spaces. So the command line consists of three words, the command name and two arguments:
- echo
- hello
- world
To test that, try this:
X="hello     world"
echo $X
There is no difference in the output because the single space between the words is added by the echo command; it was not taken from the string.
So how do we save those extra spaces? Quotes are not stored, so we need to use them again when we fetch the variable from memory:
X="hello     world"
echo "$X"
I know, that invocation looks silly.
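Since echo hides what happens, a sketch that prints every argument in its own pair of brackets may help. The helper show_args is hypothetical, and the example assumes the default IFS (space, tab, newline):

```shell
X="hello     world"

# show_args is a hypothetical helper: printf reuses the format
# '<%s>' once per argument, so each argument gets its own brackets.
show_args() { printf '<%s>' "$@"; echo; }

split=$(show_args $X)      # unquoted: <hello><world>
kept=$(show_args "$X")     # quoted:   <hello     world>
echo "$split"
echo "$kept"
```

The unquoted expansion arrives as two arguments and the extra spaces are gone; the quoted one arrives as a single argument with the spaces intact.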
Actually, that extra quoting is the most common problem when working with file names containing spaces. For example, try executing this command in a directory with files that have spaces in their names:
for f in *; do ls "$f"; done
And now try without the double quotes:
for f in *; do ls $f; done
Many gurus say that we should always double quote variables
"$variable"
because we never know whether there are spaces inside. Sounds logical. It's just those two extra characters to type...
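If you don't have files with spaces at hand, the following sketch creates one in a scratch directory and checks both variants. It assumes the mktemp -d command is available (it is on most systems, but it is not guaranteed everywhere), and the file name My Song.mp3 is made up:

```shell
tmp=$(mktemp -d)              # scratch directory, removed at the end
touch "$tmp/My Song.mp3"      # made-up file name with a space

quoted_ok=0
unquoted_ok=0
for f in "$tmp"/*; do
    # quoted: the whole file name stays one word
    if [ -e "$f" ]; then quoted_ok=$((quoted_ok + 1)); fi
    # unquoted: $f falls apart into ".../My" and "Song.mp3"
    for word in $f; do
        if [ -e "$word" ]; then unquoted_ok=$((unquoted_ok + 1)); fi
    done
done
rm -r "$tmp"
echo "quoted: $quoted_ok found, unquoted: $unquoted_ok found"
```

With the quotes the file is found once; without them the shell looks for two files that don't exist.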
Quoting
We proved earlier that everything in the shell is a string, so the quoting mechanism has to be different than in normal languages.
- First, we have single quotes:
echo 'I "have" 10$!'
Everything inside is allowed and treated as an ordinary character. The only exception is the single quote itself, which ends the string.
- The escape character \ works pretty much like in other languages:
echo I\ \"have\"\ 10\$!
- Double quotes:
echo "I 'have' 10$!"
Spaces are preserved, but variables, like that special $! here, are expanded. Single quotes inside double quotes are ordinary characters too.
echo "I \"have\" 10\$!"
As you can see, the escape character \ works here as well, although inside double quotes it only escapes $, `, ", \ and newline.
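The three mechanisms side by side, as a rough sketch. I use an ordinary variable called price (my own placeholder) instead of $! so that the result is the same on every run:

```shell
price=10

single='I "have" $price'        # nothing expands inside single quotes
double="I \"have\" $price"      # $price expands inside double quotes
escaped=I\ \"have\"\ \$price    # backslash protects one character at a time

printf '%s\n' "$single" "$double" "$escaped"
```

Only the double-quoted line ends in 10; the other two keep the literal text $price.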
Can you repeat/eval, please?
The shell execution procedure is pretty naive. Not only is everything a string, but these strings are also read by the shell just once.
For example, if we want to store the name of another variable in a variable and then reach for its value, that's too much for the shell. This case is like pointers in C or references in other languages.
USER_A=tom
USER_B=geoff
USER_C=anna
CURRENT='$USER_B' # without single quotes it would be just a value copy
echo $CURRENT
We can notice that the expansion of variables occurred only once. How about telling the shell to redo the code interpretation one more time:
eval echo $CURRENT
This is equivalent to two steps. First (in pseudocode):
cmd = shell_reparse_line("echo $CURRENT");
Then the content of cmd is executed as a normal command:
echo $USER_B
Exactly what we need.
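A slightly more defensive variant of the same trick, as a sketch: store only the variable name (without the dollar sign) and let eval build the assignment. The variable value is a name I introduced for the example:

```shell
USER_A=tom
USER_B=geoff
USER_C=anna
CURRENT=USER_B                 # only the name, no dollar sign

# After the first pass the line reads: value=${USER_B}
# eval then parses and runs that line.
eval "value=\${$CURRENT}"
echo "$value"
```

This prints geoff. In bash specifically, the indirect expansion ${!CURRENT} gives the same result without eval, but that form is not portable to every shell.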
A more useful example:
SILENT_MODE='> /dev/null 2>&1'
eval ls $SILENT_MODE
which is equivalent to:
ls > /dev/null 2>&1
Without eval we get errors. Special characters are parsed only once, just like variable expansion.
SILENT_MODE='> /dev/null 2>&1'
ls $SILENT_MODE
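A sketch of why the version without eval fails, plus one possible way to avoid eval here: keep the redirection in code instead of in data. Both helpers, count_args and run_silently, are hypothetical names of mine:

```shell
SILENT_MODE='> /dev/null 2>&1'

# count_args is a hypothetical helper: prints its argument count.
count_args() { echo "$#"; }

# Without eval the variable expands to three plain words
# (>, /dev/null, 2>&1); none of them acts as a redirection.
literal_words=$(count_args $SILENT_MODE)

# run_silently is a hypothetical wrapper: the redirection is written
# in the function body, so it is parsed as a real redirection.
run_silently() { "$@" > /dev/null 2>&1; }
visible=$(run_silently echo "you should not see this")

echo "words: $literal_words, visible output: '$visible'"
```

The wrapper keeps the command's arguments as separate words ("$@"), so it also survives arguments with spaces.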
A very nasty and common case is one which needs both techniques: proper quoting of strings with spaces and late evaluation.
For example:
DIR="My Music"
COMMAND="ls $DIR"
sudo $COMMAND
This doesn't work because we failed to quote the content of $DIR. Let's try again:
DIR="My Music"
COMMAND="ls '$DIR'"
sudo $COMMAND
Do you see the slight difference in the outputs? Quoting is parsed just once, at the beginning, so quotes stored in a variable don't work. How should it look?
DIR="My Music"
COMMAND="ls '$DIR'"
eval sudo $COMMAND
In conclusion, we need eval to parse again:
- variables
- shell control characters
- the quotes themselves (but they are kind of control characters)
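The My Music case can be checked without sudo or a real directory by counting words. This is only a sketch; count_args is again a hypothetical helper, and the last variant (positional parameters instead of a command string) is my suggestion, not part of the recipe above:

```shell
DIR="My Music"
COMMAND="ls '$DIR'"

# count_args is a hypothetical helper: prints its argument count.
count_args() { echo "$#"; }

# Plain expansion: the stored quotes are ordinary characters,
# so we get three words: ls, 'My and Music'
broken=$(count_args $COMMAND)

# eval parses the line again, so the stored quotes work:
# two words, ls and My Music
fixed=$(eval count_args $COMMAND)

# Alternative without eval: keep the words separate from the start.
# set -- runs inside the $(...) subshell, so the caller's own
# positional parameters are not touched.
positional=$(set -- ls "$DIR"; count_args "$@")

echo "broken=$broken fixed=$fixed positional=$positional"
```

It should print broken=3 fixed=2 positional=2. Keeping the command as separate words avoids a second parsing pass altogether, which matters when the directory name comes from an untrusted source.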
Conclusions
The first requirement for a shell was to interpret human commands/needs and communicate with the system (OS) kernel. As systems became more complex, shell functionality needed to grow. In their current versions most shells are fully functional programming languages... But they still need to interpret human commands/behavior. So maybe this is the reason why such a simple platform (I mean the shell) is so complicated in the nuances of its work/execution. Not to mention that every shell is slightly different from the others!
I hope we've cleared something up... a little... Unfortunately many aspects of shell code interpretation will still not be easy to understand. This is because every shell is a little different from the others. And interactive mode follows different rules than shell scripts. And of course the documentation is too long and too complicated... But I believe we've discussed the most misleading and fundamental things.