Understanding Quoting and Escaping in Bash Scripts
In my experience, quoting and escaping have long been among the biggest sources of confusion in shell scripting. Sometimes even veterans are caught by their pitfalls. However hard it might seem, it is actually easy to grasp once you understand the rule. The key is to keep in mind that the commands you write are interpreted in stages: first by the shell, then by the command itself. In this long post, I explain that and also summarize some frequently encountered escaping rules.
Questions
Let's start with a couple of questions.
How do you replace the two-character string \A with the single character - using 'sed'?
Why do all the commands in the first group work while none in the second group does?
# Works
echo 'hi \A there' | sed 's/\\A/-/'
echo 'hi \A there' | sed "s/\\\A/-/"
echo 'hi \A there' | sed "s/\\\\A/-/"
echo 'hi \A there' | sed s/\\\\A/-/

# Oops
echo 'hi \A there' | sed 's/\A/-/'
echo 'hi \A there' | sed "s/\\A/-/"
echo 'hi \A there' | sed s/\\\A/-/
Why do things change when the string being processed changes? Say, if the string to be replaced is not \A but \$, none of the above commands works, even those which worked for \A. We have to add more backslashes.
echo 'hi \$ there' | sed 's/\\\$/-/'
echo 'hi \$ there' | sed "s/\\\\\\$/-/"
echo 'hi \$ there' | sed "s/\\\\\\\$/-/"
Why does adding one more backslash sometimes not matter, while other times it does?
There are three steps
As I always say, whenever you type a command and press <enter>, the command line is processed in the following steps:
- The shell gets the command line and, if needed, does some "magic" such as expansion;
- The shell invokes the command and passes the cooked string(s) to it as argument(s);
- The command runs.
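One way to see what the shell actually hands to a command is to put a tiny helper in the command's place that prints each argument it receives. The following is a minimal sketch assuming bash; `show_args` is a hypothetical helper name made up for this illustration, not a standard utility.

```shell
#!/usr/bin/env bash
# show_args (hypothetical helper): print every argument on its own line,
# wrapped in <> so that spaces and backslashes are easy to see.
show_args() {
    printf '<%s>\n' "$@"
}

show_args a\nb      # unquoted: bash removes the backslash       -> <anb>
show_args "a\nb"    # double quotes: \n is not special to bash   -> <a\nb>
show_args 'a\\nb'   # single quotes: nothing is touched          -> <a\\nb>
show_args "a\\nb"   # double quotes: \\ collapses to a single \  -> <a\nb>
```

Because `printf '%s'` does not interpret backslashes in its arguments, what you see between the angle brackets is the "cooked" string after step one and before the command does anything of its own.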
Consequently, it is crucial to be very clear about what the shell will do with the input string and what input the command expects. For instance:
find . -name *.txt     # WRONG!
find . -name '*.txt'   # Correct
ls *.txt               # Correct
ls '*.txt'             # WRONG!
Why should we quote the operand of 'find' but not that of 'ls'? Because we want to pass '*.txt' to 'find' literally, while having the shell perform file name globbing on '*.txt' before passing the result to 'ls'. In other words,
- 'find' expects a pattern string (for -name this is a shell-style glob, not a regexp), which it matches by itself.
- 'ls' expects a list of file names.
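The difference can be tried out safely in a throwaway directory. This is only a sketch: the directory layout and file names are made up for the demo, and the exact error message of the last command depends on which 'find' implementation is installed.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the demo does not depend on existing files.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/sub/c.txt"
cd "$demo_dir" || exit 1

# Quoted: find receives the literal pattern *.txt and does the matching
# itself, in the current directory and everything below it.
find . -name '*.txt' | sort

# Unquoted: the shell expands *.txt first, so ls receives "a.txt b.txt".
ls *.txt

# Unquoted with find: the shell rewrites this to `find . -name a.txt b.txt`,
# which find rejects because b.txt is not a valid expression.
find . -name *.txt 2>/dev/null || echo "find failed"
```

Note that the unquoted 'find' would silently "work" if only one .txt file existed in the current directory, which is what makes this mistake so easy to miss.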
Character escaping can happen at both steps
When it comes to character escaping, it can happen at both steps as well. Take 'echo' for example: when we type the following line, the four-character string (a \ n b) is first processed by the shell and then passed to 'echo', which processes it again.
echo -e 'a\nb'
Another example: both the shell and 'sed' handle character escaping.
sed 's/\A/nul/' input.txt
Escaping rules of 'bash'
Now that we have seen that character escaping can happen at different stages of command processing, the key to correctly writing commands that involve character escaping is to understand the escaping rules for each stage. In this section, let's take a look at bash.
Enclosing characters in single quotes (`'') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
Enclosing characters in double quotes (`"') preserves the literal value of all characters within the quotes, with the exception of `$', ``', `\'
…
The backslash retains its special meaning only when followed by one of the following characters: `$', ``', `"', `\', or `newline'
…
NOTE:
- NOTHING within single quotes will be escaped.
$ echo It\'s hard       # <--- works
It's hard
$ echo "It's hard"      # <--- looks better
It's hard
$ echo 'It\'s hard''    # <--- OOPS (also note the trailing quote)
It\s hard
$ echo 'It'\''s hard'   # <--- if you do need a "nested" single quote
It's hard
- newline here means NOT \n but the newline itself (i.e. what you get when you press the <enter> key).
Escaping rules of 'echo'
`echo'
… If the `-e' option is given, interpretation of the following backslash-escaped characters is enabled….
`\n' newline
`\\' backslash
`\a' alert (bell)
NOTE:
- Escaping only happens when called with -e.
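A quick way to check this rule is to capture the output both ways and count the characters. The sketch below assumes bash's builtin 'echo' with default settings (other shells' 'echo' may behave differently); the variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Both commands hand the same four characters (a, \, n, b) to echo,
# because bash leaves everything inside single quotes untouched.
plain=$(echo 'a\nb')            # no -e: echo prints the backslash and the n
interpreted=$(echo -e 'a\nb')   # -e: echo turns \n into a real newline

# ${#var} is the length of the string stored in var.
printf 'without -e: %s characters\n' "${#plain}"        # 4: a \ n b
printf 'with -e:    %s characters\n' "${#interpreted}"  # 3: a newline b
```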
Some examples
This section presents some examples to deepen your understanding of the aforementioned rules. If you are already familiar with them, feel free to skip this section.
How bash escapes characters
Note that since we are focused on examining how backslash works in bash, we are running 'echo' WITHOUT -e to avoid the confusion that might be caused by escapes done by 'echo'.
$ echo a\
> b
ab
$ echo "a\
> b"
ab
$ echo 'a\
> b'
a\
b
$ echo a\\
a\
$ echo a\nb
anb
$ echo "a\nb"
a\nb
$ echo 'a\nb'
a\nb
$ echo a\\nb
a\nb
$ echo "a\\nb"
a\nb
$ echo 'a\\nb'
a\\nb
How echo escape works
Since, for bash, nothing within single quotes is escaped, to examine how escaping works in echo, we enclose the strings within SINGLE QUOTES to avoid confusion.
$ echo -e 'a\'
a\
$ echo -e 'a\\'
a\
$ echo -e 'a\nb'
a
b
$ echo -e 'a\\nb'
a\nb
$ echo -e 'a\\\nb'
a\
b
$ echo -e 'a\\\\nb'
a\\nb
A case study
Now, let's do an exercise. What should I do if I want to write a string something like this into a file?
cmd 1 \
cmd 2
Before looking at the answer, check the output of the commands below and explain why.
$ echo -e "1- a \\ b"; echo -e "2- a \n b"; echo -e "3- a \\\n b"
1- a \ b
2- a 
 b
3- a \n b
More confusing things:
$ echo -e "1: a \\\n b"; echo -e "2: a \\\\n b"; echo -e "3: a \\\\\n b"
1: a \n b
2: a \n b
3: a \
 b
Here is the answer:
$ cmd1=hi; cmd2=there
$ echo -e "$cmd1\\\\\n$cmd2"
hi\
there
# bash:    \\ \\ \n -> \\\n
# echo -e: \\ \n    -> \ newline
$ echo -e "$cmd1\\\\n$cmd2"
hi\nthere
# bash:    \\ \\ n -> \\n
# echo -e: \\ n    -> \ n
$ echo -e "$cmd1\\\n$cmd2"
hi\nthere
# bash:    \\ \n -> \\n
# echo -e: \\ n  -> \ n
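To finish the exercise as it was posed, the same five-backslash form can be redirected into a file and read back. This is a small sketch assuming bash's builtin 'echo'; the scratch file created by `mktemp` and the variable name `out_file` are only for the demo.

```shell
#!/usr/bin/env bash
cmd1='cmd 1'; cmd2='cmd 2'
out_file=$(mktemp)   # scratch file for the demo

# Step 1, bash:    \\ \\ \n -> \\\n  (inside double quotes)
# Step 2, echo -e: \\ \n    -> a backslash followed by a real newline
echo -e "$cmd1 \\\\\n$cmd2" > "$out_file"

# Show what ended up in the file: two lines, the first ending in a backslash.
cat "$out_file"
```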
Answer to the opening questions
The key is to understand the escaping rules of both 'bash' and 'sed'. That way, we can split the task into two steps and figure out the expected input for each step. Here we go!
- For \A to -
  - The command (string) passed to 'sed'
    It should be s/\\A/-/. The command has the form s/from/to/, but s/\A/-/ is incorrect since 'sed' will interpret the \ as an escape character. Hence, we need an extra backslash to escape it.
  - The string passed to 'bash', i.e. what to type on the command line.
    This depends on the quotes you choose.
    - With single quotes, bash escapes nothing inside them, so we can simply type
      sed 's/\\A/-/'
    - With double quotes, each backslash needs one more backslash to escape it. Consequently, the number of backslashes is doubled.
      echo 'hi \A there' | sed "s/\\\\A/-/"
- For \$ to -
  - The command (string) passed to 'sed'
    It should be s/\\\$/-/. The "from" string is \$, but since both \ and $ should be escaped, the expected string becomes \\\$.
  - The string passed to 'bash', i.e. what to type on the command line.
    Again, the type of quotes we choose decides whether we need to escape the backslashes and the dollar sign. Hence either of the following is correct:
    sed 's/\\\$/-/'
    sed "s/\\\\\\\$/-/"
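The two-step reasoning can be checked mechanically by printing the script exactly as 'sed' receives it before running it. The following is a sketch assuming bash and a POSIX-style 'sed'; `try_sed` is a hypothetical helper written for this illustration.

```shell
#!/usr/bin/env bash
# try_sed (hypothetical helper): $1 is the input text, $2 is the sed script.
# By the time the function runs, bash has already processed the quoting,
# so $2 holds exactly the string that sed is going to see.
try_sed() {
    printf 'sed receives: %s\n' "$2"
    printf '%s\n' "$1" | sed "$2"
}

try_sed 'hi \A there' 's/\\A/-/'          # single quotes: typed as sed wants it
try_sed 'hi \A there' "s/\\\\A/-/"        # double quotes: backslashes doubled
try_sed 'hi \$ there' 's/\\\$/-/'         # single quotes again
try_sed 'hi \$ there' "s/\\\\\\\$/-/"     # double quotes: \\ for each \, \$ for $
```

Each pair prints the same "sed receives" line, which is the point: different spellings on the command line, identical input to 'sed'.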
Exercises:
- Explain why the unquoted commands listed at the beginning of this post work or do not work.
- Why does using one fewer backslash sometimes also work?
- Notice that both double-quoted commands for \$ in the opening questions (six and seven backslashes) work. But if $ is followed by other character(s), using six backslashes results in an error. Why?
  echo 'hi \$name there' | sed "s/\\\\\\\$name/-/"
Bash ANSI-C quoting
Just FYI: in addition to single quotes and double quotes, bash also supports ANSI-C quoting, which can be really handy sometimes.
Words of the form `$'STRING'' are treated specially. The word expands to STRING, with backslash-escaped characters replaced as specified by the ANSI C standard.
$ cmd1=hi; cmd2=there; escaped_newline=$'\\\n'
$ echo "$cmd1 $escaped_newline $cmd2"
hi \
 there
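A few more uses of the same quoting form, as a sketch assuming bash (the variable names are only for illustration). With $'...' it is bash, not the command, that interprets the backslash escapes, so no 'echo -e' is needed.

```shell
#!/usr/bin/env bash
# Inside $'...', bash itself replaces the backslash escapes.
tab_separated=$'name\tvalue'   # \t becomes a real tab character
with_quote=$'It\'s hard'       # \' is allowed here, unlike inside plain '...'
two_lines=$'cmd 1 \\\ncmd 2'   # \\ -> one backslash, \n -> a real newline

# printf '%s\n' prints each argument as is, one per line.
printf '%s\n' "$tab_separated" "$with_quote" "$two_lines"
```

The last variable holds the string from the case study, built without having to think about a second round of escaping.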