Using sed: A Comprehensive Guide
`sed` (stream editor) is a powerful, non-interactive text editor in Unix-like operating systems (including Linux and macOS). It’s a cornerstone of command-line text manipulation, allowing you to perform complex text transformations on files or streams of text without needing to open a full-fledged editor. Unlike interactive editors like `vi` or `nano`, `sed` operates on a stream of text, line by line, applying a set of commands that you specify. This makes it ideal for scripting, automating repetitive tasks, and processing large files efficiently.
This guide focuses primarily on the most frequently used command within `sed`: the substitution command, `s`. While `sed` offers a wide array of commands for text manipulation, the `s` command is arguably its most versatile and commonly employed feature. We’ll delve into every aspect of the `s` command, from basic substitutions to advanced techniques using regular expressions, backreferences, and flags.
1. Basic Syntax and Structure
The fundamental syntax of `sed` when using the `s` command is:

```bash
sed 's/pattern/replacement/flags' input_file
```
Let’s break down each part:
- `sed`: This invokes the stream editor program.
- `'...'`: The single quotes enclose the `sed` script. Using single quotes prevents the shell from interpreting special characters within the script, ensuring that `sed` receives them unchanged. This is crucial for avoiding unintended behavior, especially when working with regular expressions.
- `s`: This is the substitution command. It tells `sed` to find instances of a pattern and replace them with something else.
- `/`: This is the default delimiter. `sed` uses the character immediately following the `s` as the delimiter separating the pattern, replacement, and flags sections. While `/` is conventional, almost any character (anything except a backslash or a newline) can be used as the delimiter, provided it’s used consistently within a single `s` command. This is incredibly useful when your pattern or replacement contains forward slashes (more on this later).
- `pattern`: This is the text or regular expression that `sed` will search for.
- `replacement`: This is the text that will replace the matched `pattern`.
- `flags`: These are optional modifiers that control how the substitution is performed. We’ll explore these in detail below.
- `input_file`: This is the file that `sed` will read and process. If no input file is specified, `sed` reads from standard input (stdin), which allows it to be used in pipelines.
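The quoting rule above matters most when part of the script comes from a shell variable. The following is a minimal sketch, assuming a made-up variable named `animal`; it contrasts single quotes, where the shell leaves `$animal` untouched, with double quotes, where the shell expands it before `sed` runs.

```shell
# Hypothetical variable, used only for this illustration.
animal="dog"

# Single quotes: sed receives the literal text $animal as the replacement.
literal=$(echo "The quick brown fox" | sed 's/fox/$animal/')

# Double quotes: the shell expands $animal before sed sees the script.
expanded=$(echo "The quick brown fox" | sed "s/fox/$animal/")

echo "$literal"
echo "$expanded"
```

Double quotes are convenient for this, but they also let the shell interpret backslashes, backticks, and `$` inside the script, so single quotes remain the safer default when no expansion is needed.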
Example 1: Simple Substitution
```bash
echo "The quick brown fox" | sed 's/fox/dog/'
```

Output:

```
The quick brown dog
```

In this example:

- `echo "The quick brown fox"` sends the string to standard output.
- The pipe (`|`) redirects this output to `sed`’s standard input.
- `sed 's/fox/dog/'` replaces the first occurrence of “fox” with “dog”.
Example 2: Using a Different Delimiter
```bash
echo "/path/to/file" | sed 's#/path/to/#/new/path/to/#'
```

Output:

```
/new/path/to/file
```

Here, we used `#` as the delimiter instead of `/`. This avoids the need to escape the forward slashes within the `/path/to/` string. Choosing a delimiter that doesn’t appear in your pattern or replacement makes your `sed` commands much more readable. Common alternatives to `/` include `#`, `@`, `|`, and `:`.
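To make the delimiter choice concrete, the sketch below rewrites the same path twice: once with the default `/` delimiter and escaped slashes, and once with `|` as the delimiter. The directory names and the variables `old_root` and `new_root` are invented for the example; both commands should produce the same result.

```shell
# Hypothetical directory names, used only for illustration.
old_root="/var/www"
new_root="/srv/http"

# Default delimiter: every slash in the pattern and replacement needs a backslash.
escaped=$(echo "/var/www/index.html" | sed 's/\/var\/www/\/srv\/http/')

# Alternative delimiter: with | the paths can be written (or expanded) as-is.
readable=$(echo "/var/www/index.html" | sed "s|$old_root|$new_root|")

echo "$escaped"
echo "$readable"
```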
2. Flags: Controlling the Substitution
The `flags` section of the `s` command provides fine-grained control over how the substitution is performed. Here are the most important flags:

- `g` (global): By default, `sed` only replaces the first occurrence of the `pattern` on each line. The `g` flag tells `sed` to replace all occurrences of the `pattern` on each line.

  ```bash
  echo "apple apple apple" | sed 's/apple/banana/'   # Output: banana apple apple
  echo "apple apple apple" | sed 's/apple/banana/g'  # Output: banana banana banana
  ```

- `[number]` (nth occurrence): You can specify a number to replace only the nth occurrence of the `pattern` on each line.

  ```bash
  echo "apple apple apple" | sed 's/apple/banana/2'  # Output: apple banana apple
  ```

- `i` (case-insensitive): The `i` flag makes the `pattern` matching case-insensitive.

  ```bash
  echo "Apple apple APPLE" | sed 's/apple/banana/gi'  # Output: banana banana banana
  ```

  This replaces all occurrences of “apple”, regardless of case, due to the combination of `g` and `i`.

- `p` (print): If a substitution is made, the `p` flag causes `sed` to print the modified line. This is often used in conjunction with the `-n` option (see below), which suppresses the default printing of all lines.

- `w filename` (write): If a substitution is made, the `w` flag writes the modified line to the specified `filename`. This allows you to selectively write lines to a different file. By convention the filename is separated from the `w` by a space, and because the rest of the line is taken as the filename, `w` must be the last flag.

  ```bash
  echo "apple apple apple" | sed 's/apple/banana/w output.txt'
  cat output.txt  # Output: banana apple apple
  ```

  This replaces the first instance of “apple” with “banana”, and only that modified line is written to `output.txt`. The modified line is also printed to standard output because we didn’t use `-n`.

- `I` (case-insensitive): Equivalent to `i`. Both spellings are GNU extensions rather than POSIX flags, so they might not be available on all systems.

- `m` / `M` (multi-line, GNU extensions): These flags affect how the `^` (beginning of line) and `$` (end of line) anchors work when the pattern space contains embedded newline characters, for example after the `N` command has joined two lines. Normally, `^` and `$` match only the beginning and end of the whole pattern space. With `m` or `M`, they also match just after and just before each embedded newline, that is, at the beginning and end of each line within the pattern space.
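The flags above can be tried side by side on a small input. This is a minimal sketch that sticks to the POSIX flags (a number, `g`, `p`, and `w`); the sample text and the temporary log file are made up for the illustration.

```shell
# Three sample lines; only the first two contain "apple".
input=$(printf 'apple apple apple\none apple\ncherry\n')

# Number flag: replace only the second occurrence on each line.
second_only=$(printf '%s\n' "$input" | sed 's/apple/banana/2')

# g and p together, with -n: print only changed lines, all occurrences replaced.
changed=$(printf '%s\n' "$input" | sed -n 's/apple/banana/gp')

# w flag: changed lines also go to a file (a temporary file here).
log=$(mktemp)
printf '%s\n' "$input" | sed -n "s/apple/banana/w $log"
written=$(cat "$log")
rm -f "$log"

printf '%s\n' "$second_only"
echo "---"
printf '%s\n' "$changed"
echo "---"
printf '%s\n' "$written"
```

Only the first line has a second “apple”, so the number flag leaves the other two lines unchanged, while the `w` run writes both changed lines to the file and prints nothing because of `-n`.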
3. The -n Option: Suppressing Default Output
By default, `sed` prints every line of the input, whether or not a substitution was made. The `-n` option (also `--quiet` or `--silent` in GNU `sed`) suppresses this default output. When used with the `p` flag, only lines where a substitution was made are printed.

```bash
# Without -n, every line is printed, and modified lines are printed twice
sed 's/apple/banana/p' input.txt
# With -n, only modified lines are printed
sed -n 's/apple/banana/p' input.txt
```
This combination is very useful for extracting lines that match a pattern and undergo a transformation.
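The difference is easiest to see on a two-line input. The sketch below uses made-up sample text and captures the output of both forms so they can be compared.

```shell
# Two-line sample input; only the first line contains "apple".
sample=$(printf 'apple pie\ncherry tart\n')

# Without -n: the changed line appears twice (once from p, once from the
# default output), and the unchanged line appears once.
without_n=$(printf '%s\n' "$sample" | sed 's/apple/banana/p')

# With -n: only the line where a substitution happened is printed.
with_n=$(printf '%s\n' "$sample" | sed -n 's/apple/banana/p')

printf '%s\n' "$without_n"
echo "---"
printf '%s\n' "$with_n"
```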
4. Regular Expressions: Unleashing the Power of sed
The real power of `sed` comes from its ability to use regular expressions (regex) in the `pattern` part of the `s` command. Regular expressions are a powerful way to describe patterns of text, going far beyond simple string matching.
Here’s a breakdown of commonly used regular expression elements within `sed`:
- `.` (dot): Matches any single character. `sed` normally holds one line at a time without its trailing newline, so a newline can only be matched after a command such as `N` has joined lines in the pattern space; in GNU `sed`, `.` matches that embedded newline too.

  ```bash
  echo "cat bat mat" | sed 's/.at/XXX/'  # Output: XXX bat mat
  ```

- `*` (asterisk): Matches the preceding character zero or more times. Because “zero times” is allowed, a pattern such as `a*` can match the empty string.

  ```bash
  echo "caat ct" | sed 's/a*/X/'     # Output: Xcaat ct (a* matches the empty string at the start of the line)
  echo "caat ct" | sed 's/ca*t/X/g'  # Output: X X
  ```

- `+` (plus): Matches the preceding character one or more times. Note: By default, `sed` uses basic regular expressions (BRE), where `+` is a literal character. To use it as a quantifier, you need to escape it (`\+`, a GNU extension) or use the `-r` or `-E` option (extended regular expressions, ERE).

  ```bash
  echo "caat ct" | sed 's/a+/X/'     # Output: caat ct (in BRE, + is literal, and "a+" does not occur)
  echo "caat ct" | sed -E 's/a+/X/'  # Output: cXt ct (using extended regex)
  echo "caat ct" | sed 's/a\+/X/'    # Output: cXt ct (escaping + in BRE, GNU sed)
  ```

- `?` (question mark): Matches the preceding character zero or one time. Similar to `+`, you need to escape it (`\?`) in BRE or use `-r` or `-E` for ERE.

  ```bash
  echo "caat cat ct" | sed -E 's/a?/X/'     # Output: Xcaat cat ct (a? matches the empty string at the start of the line)
  echo "caat cat ct" | sed -E 's/ca?t/X/g'  # Output: caat X X
  ```

- `[...]` (character class): Matches any single character within the brackets.

  ```bash
  echo "cat bat mat" | sed 's/[cb]at/XXX/g'  # Output: XXX XXX mat
  ```

- `[^...]` (negated character class): Matches any single character not within the brackets.

  ```bash
  echo "cat bat mat" | sed 's/[^cb]at/XXX/'  # Output: cat bat XXX
  ```

- `^` (caret): Matches the beginning of the pattern space, which is normally the current line. As mentioned earlier, the `m` or `M` flags can modify this behavior.

  ```bash
  printf 'cat\nbat\n' | sed 's/^b/XXX/'  # Output: cat, then XXXat on the next line
  ```

- `$` (dollar sign): Matches the end of the pattern space, which is normally the end of the current line. Also affected by `m` or `M`.

  ```bash
  printf 'cat\nbat\n' | sed 's/t$/XXX/'  # Output: caXXX, then baXXX on the next line
  ```

- `\{n\}` (exactly n times): Matches the preceding character exactly `n` times. (The braces must be escaped in BRE; with `-E`, write `{n}`.)

  ```bash
  echo "caaat" | sed 's/a\{3\}/X/'  # Output: cXt
  ```

- `\{n,\}` (n or more times): Matches the preceding character `n` or more times. (Requires escaping in BRE.)

  ```bash
  echo "caaat" | sed 's/a\{2,\}/X/'  # Output: cXt
  ```

- `\{n,m\}` (between n and m times): Matches the preceding character between `n` and `m` times (inclusive). (Requires escaping in BRE.)

  ```bash
  echo "caaaat" | sed 's/a\{2,3\}/X/'  # Output: cXat
  ```

- `(...)` (capturing group): Groups parts of the regular expression. These groups can be referenced in the `replacement` section using backreferences (see below). Requires escaping in BRE: `\(...\)`.

- `|` (alternation): Matches either the expression before or the expression after the `|`. Requires escaping in BRE (`\|`, a GNU extension) or using `-r` or `-E`.

  ```bash
  echo "cat dog" | sed -E 's/cat|dog/animal/g'  # Output: animal animal
  ```

  This replaces every “cat” or “dog” with “animal”; without the `g` flag only the first match on the line (“cat”) would be replaced.

- Character classes (POSIX): POSIX defines some useful character classes that can be used within square brackets, for example `[[:digit:]]`:

  - `[:alnum:]`: Alphanumeric characters (a-z, A-Z, 0-9).
  - `[:alpha:]`: Alphabetic characters (a-z, A-Z).
  - `[:blank:]`: Space and tab characters.
  - `[:cntrl:]`: Control characters.
  - `[:digit:]`: Digits (0-9).
  - `[:graph:]`: Printable and visible characters (excluding space).
  - `[:lower:]`: Lowercase letters (a-z).
  - `[:print:]`: Printable characters (including space).
  - `[:punct:]`: Punctuation characters.
  - `[:space:]`: Whitespace characters (space, tab, newline, carriage return, form feed, vertical tab).
  - `[:upper:]`: Uppercase letters (A-Z).
  - `[:xdigit:]`: Hexadecimal digits (0-9, a-f, A-F).

  ```bash
  echo "a1B2c3D4" | sed 's/[[:digit:]]/X/g'  # Output: aXBXcXDX
  ```
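Several of these elements tend to show up together in real commands. The following sketch, which assumes a made-up log line, combines POSIX character classes with an interval and an anchor; it uses only BRE syntax, so it should not depend on `-E`.

```shell
# Made-up log line with irregular spacing.
line="user42   logged  in at 09:15"

# Interval plus class: collapse runs of two or more whitespace characters.
collapsed=$(echo "$line" | sed 's/[[:space:]]\{2,\}/ /g')

# Class with g: mask every digit, leaving everything else alone.
masked=$(echo "$line" | sed 's/[[:digit:]]/#/g')

# Anchor plus classes: replace the leading letters-then-digits token only.
renamed=$(echo "$line" | sed 's/^[[:alpha:]]*[[:digit:]]*/USER/')

echo "$collapsed"
echo "$masked"
echo "$renamed"
```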
5. Backreferences: Using Captured Groups
Capturing groups, defined by `(...)` (or `\(...\)` in BRE), allow you to refer back to the text matched by the group in the `replacement` part of the `s` command. This is done using backreferences:

- `\1`: Refers to the first capturing group.
- `\2`: Refers to the second capturing group.
- `\3`: Refers to the third capturing group, and so on.
- `&`: Refers to the entire matched text. This is useful even without capturing groups.
Example 1: Swapping Words
```bash
echo "John Doe" | sed -E 's/(\w+) (\w+)/\2, \1/'  # Output: Doe, John
```

- `-E`: Enables extended regular expressions (so we don’t need to escape the parentheses).
- `(\w+)`: Matches one or more word characters and captures them as group 1. `\w` is a GNU extension equivalent to `[[:alnum:]_]`.
- The space between the groups matches a literal space.
- `(\w+)`: Matches another word and captures it as group 2.
- `\2, \1`: Replaces the entire match with group 2, a comma, a space, and then group 1.
Example 2: Repeating a Number

```bash
echo "The price is 123 dollars." | sed -E 's/([0-9]+)/& &/g'
```

Output:

```
The price is 123 123 dollars.
```

- `([0-9]+)`: Matches one or more digits and captures them as group 1. The group is not strictly needed here, because `&` already stands for the whole match.
- `& &`: Replaces the matched number with itself (using `&`) followed by a space and itself again.
Example 3: Inserting Text Around a Match
```bash
echo "Hello world" | sed 's/world/<b>&<\/b>/'  # Output: Hello <b>world</b>
```

- `&` represents the entire matched text (“world”). We wrap it in HTML bold tags.
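As a further illustration of numbered groups, the sketch below reorders ISO-style dates into day/month/year form, once in BRE and once in ERE, and then uses `&` to wrap each run of digits. The sample sentence is invented for the example.

```shell
# Made-up input containing two ISO-style dates.
text="Released 2024-03-15, patched 2024-04-02."

# BRE form: groups are written \( \) and intervals \{ \}.
bre=$(echo "$text" | sed 's/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)/\3\/\2\/\1/g')

# ERE form (-E): the same groups without backslashes, using # as the delimiter
# so the slashes in the replacement need no escaping.
ere=$(echo "$text" | sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#g')

# & stands for the whole match: wrap each run of digits in angle brackets.
wrapped=$(echo "$text" | sed 's/[0-9][0-9]*/<&>/g')

echo "$bre"
echo "$ere"
echo "$wrapped"
```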
6. Extended Regular Expressions (-r or -E)
As mentioned earlier, `sed` uses basic regular expressions (BRE) by default. This means that some characters, like `+`, `?`, `|`, `(`, and `)`, are treated as literal characters unless they are escaped with a backslash (`\`). Note that `\+`, `\?`, and `\|` are GNU extensions to BRE.
The `-r` and `-E` options tell `sed` to use extended regular expressions (ERE). In GNU `sed` the two are equivalent; `-E` is the more portable spelling (BSD and macOS `sed` accept it too), while `-r` originated as the GNU spelling. With ERE, these characters have their special regex meaning without needing to be escaped. This often makes your `sed` commands more readable and easier to write.

```bash
# BRE (escaping required)
echo "123" | sed 's/[0-9]+/number/'     # Output: 123 (no match, + is literal)
echo "123" | sed 's/[0-9]\+/number/'    # Output: number
# ERE (no escaping needed)
echo "123" | sed -E 's/[0-9]+/number/'  # Output: number
echo "123" | sed -r 's/[0-9]+/number/'  # Output: number
```

Recommendation: For most cases, it’s highly recommended to use `-E` (or `-r`) to enable extended regular expressions. It significantly improves readability and reduces the chance of errors due to forgotten backslashes.
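When a script has to stay in BRE without relying on the GNU `\?` and `\+` escapes, intervals can express the same quantifiers. The sketch below, using made-up sample words, pairs each ERE form with an interval-based BRE form that should behave identically.

```shell
# Made-up sample words.
words="color colour colouur"

# Zero or one "u": ERE ? versus the BRE interval \{0,1\}.
ere_optional=$(echo "$words" | sed -E 's/colou?r/X/g')
bre_optional=$(echo "$words" | sed 's/colou\{0,1\}r/X/g')

# One or more "u": ERE + versus the BRE interval \{1,\}.
ere_repeat=$(echo "$words" | sed -E 's/colou+r/X/g')
bre_repeat=$(echo "$words" | sed 's/colou\{1,\}r/X/g')

echo "$ere_optional"
echo "$bre_optional"
echo "$ere_repeat"
echo "$bre_repeat"
```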
7. In-Place Editing (-i)
By default, `sed` writes its output to standard output. It does not modify the original input file. The `-i` option (in-place editing) changes this behavior, allowing `sed` to modify the file directly.
Important: Use `-i` with extreme caution. It modifies the original file, and there’s no undo. Always make a backup of your file before using `-i`.

```bash
# Make a backup first!
cp myfile.txt myfile.txt.bak
# Now, modify the file in-place (GNU sed syntax; see 7.2 for macOS)
sed -i 's/old/new/g' myfile.txt
```
7.1. In-Place Editing with Backup (-i.bak)
Many versions of `sed` (particularly GNU `sed`) support creating a backup file automatically when using `-i`. You can specify a suffix to be appended to the original filename to create the backup.

```bash
sed -i.bak 's/old/new/g' myfile.txt
```

This will:

- Create a backup file named `myfile.txt.bak`.
- Modify `myfile.txt` in-place, replacing all occurrences of “old” with “new”.

This is a much safer way to use in-place editing, as you always have a copy of the original file.
7.2. macOS (BSD) sed and -i
On macOS, the `sed` command is based on BSD `sed`, which has a slightly different behavior with `-i`. On macOS, `-i` requires a backup extension, even if it’s an empty string.

- To create a backup: `sed -i.bak 's/old/new/g' myfile.txt` (same as GNU `sed`).
- To modify in-place without a backup (very dangerous): `sed -i '' 's/old/new/g' myfile.txt` (note the empty string after `-i`). This form is specific to BSD `sed`; GNU `sed` expects any suffix to be attached directly to `-i`, so it would not read the separate empty argument as a suffix.
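Because the attached-suffix form `-i.bak` is the one spelling shared by GNU and BSD `sed`, it is a reasonable default for scripts that must run on both. The sketch below tries it in a throwaway directory; the file name `demo.txt` and its contents are made up for the illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf 'old value\nanother old line\n' > "$workdir/demo.txt"

# The suffix is attached directly to -i, which GNU and BSD sed both accept.
sed -i.bak 's/old/new/g' "$workdir/demo.txt"

edited=$(cat "$workdir/demo.txt")
backup=$(cat "$workdir/demo.txt.bak")

# Undoing the edit is just a matter of moving the backup back into place.
mv "$workdir/demo.txt.bak" "$workdir/demo.txt"
restored=$(cat "$workdir/demo.txt")

printf 'edited:\n%s\n' "$edited"
printf 'restored:\n%s\n' "$restored"

rm -rf "$workdir"
```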
8. Addressing: Specifying Lines to Operate On
So far, we’ve been applying the `s` command to every line of the input. `sed` allows you to specify a range of lines or lines matching a specific pattern on which to apply the command. This is called “addressing.”
- No address: The command is applied to every line.

- Single line number: The command is applied only to that specific line.

  ```bash
  sed '2s/old/new/' myfile.txt  # Replace "old" with "new" only on line 2
  ```

- Line range (start,end): The command is applied to lines from `start` to `end` (inclusive).

  ```bash
  sed '2,5s/old/new/' myfile.txt  # Replace on lines 2 through 5
  ```

- `$` (last line): Represents the last line of the input.

  ```bash
  sed '$s/old/new/' myfile.txt    # Replace on the last line
  sed '2,$s/old/new/' myfile.txt  # Replace from line 2 to the last line
  ```

- `/pattern/` (address by pattern): The command is applied to lines that match the given regular expression `pattern`.

  ```bash
  sed '/error/s/error/WARNING/' myfile.txt  # Replace "error" with "WARNING" on lines containing "error"
  ```

- `/pattern1/,/pattern2/` (range by patterns): The command is applied to lines starting from the first line matching `pattern1` up to and including the next line matching `pattern2`. If `pattern2` is not found, the command is applied through the end of the file.

  ```bash
  sed '/START/,/END/s/old/new/' myfile.txt  # Replace between lines containing "START" and "END"
  ```

- `first~step` (GNU extension): This address selects line `first`, and then every `step`-th line after that.

  ```bash
  sed '1~2s/old/new/' myfile.txt  # Replace on odd-numbered lines
  sed '2~2s/old/new/' myfile.txt  # Replace on even-numbered lines
  ```

- `!` (negation): The `!` character negates the address. It applies the command to lines that do not match the address.

  ```bash
  sed '2!s/old/new/' myfile.txt        # Replace on all lines *except* line 2
  sed '/error/!s/old/new/' myfile.txt  # Replace on lines *not* containing "error"
  ```

  Negation can be combined with a range:

  ```bash
  sed '2,4!s/old/new/' myfile.txt  # Replace on all lines except lines 2 through 4
  ```
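The address forms above can be compared on a single five-line input. This is a small sketch with made-up sample lines; each command applies the same substitution and differs only in its address.

```shell
# Five sample lines; every line contains the word "item".
lines=$(printf 'item 1\nSTART item 2\nitem 3\nEND item 4\nitem 5\n')

# Line-number range: lines 2 and 3 only.
by_number=$(printf '%s\n' "$lines" | sed '2,3s/item/ITEM/')

# Pattern range: from the START line through the END line (lines 2 to 4).
by_pattern=$(printf '%s\n' "$lines" | sed '/START/,/END/s/item/ITEM/')

# Negated pattern range: every line outside START..END (lines 1 and 5).
negated=$(printf '%s\n' "$lines" | sed '/START/,/END/!s/item/ITEM/')

# Last line only.
last_only=$(printf '%s\n' "$lines" | sed '$s/item/ITEM/')

printf '%s\n---\n' "$by_number" "$by_pattern" "$negated" "$last_only"
```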
9. Multiple sed Commands
You can combine multiple `sed` commands in several ways:

- Semicolon (`;`): Separate commands with semicolons.

  ```bash
  sed 's/foo/bar/;s/baz/qux/' myfile.txt
  ```

  This first replaces “foo” with “bar”, and then replaces “baz” with “qux” on each line. For each line, the second command operates on the result of the first.

- `-e` option: Use multiple `-e` options to specify separate commands.

  ```bash
  sed -e 's/foo/bar/' -e 's/baz/qux/' myfile.txt
  ```

  This is functionally equivalent to using semicolons.

- `sed` script file (`-f` option): Create a text file containing your `sed` commands, one command per line. Then, use the `-f` option to tell `sed` to read its commands from this file.

  ```bash
  # Create a file named 'commands.sed' with the following content:
  #   s/foo/bar/
  #   s/baz/qux/
  sed -f commands.sed myfile.txt
  ```

  This is particularly useful for complex or frequently used `sed` scripts. It makes your commands more organized and reusable.
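The three forms can be checked against each other directly. The sketch below runs the same two substitutions each way on a made-up input string, writing the script file to a temporary location, and adds one chained example to show that a later command sees the result of an earlier one.

```shell
input="foo and baz"

# 1. Semicolon-separated commands in a single script.
with_semicolon=$(echo "$input" | sed 's/foo/bar/;s/baz/qux/')

# 2. One -e option per command.
with_e=$(echo "$input" | sed -e 's/foo/bar/' -e 's/baz/qux/')

# 3. A script file passed with -f (a temporary file here).
script=$(mktemp)
printf 's/foo/bar/\ns/baz/qux/\n' > "$script"
with_f=$(echo "$input" | sed -f "$script")
rm -f "$script"

# Chaining: the second command operates on the output of the first,
# so "foo" becomes "baz" and then "qux".
chained=$(echo "foo" | sed 's/foo/baz/;s/baz/qux/')

echo "$with_semicolon"
echo "$with_e"
echo "$with_f"
echo "$chained"
```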
10. Other Useful sed Commands (Beyond s)
While the `s` command is the focus of this guide, it’s worth briefly mentioning some other useful `sed` commands:
- `d` (delete): Deletes the current line.

  ```bash
  sed '2d' myfile.txt          # Delete line 2
  sed '/pattern/d' myfile.txt  # Delete lines matching "pattern"
  ```

- `p` (print): Prints the current line (usually used with `-n`). We’ve already seen this as a flag for the `s` command, but it can be used independently.

  ```bash
  sed -n '2p' myfile.txt  # Print only line 2
  ```

- `a\` (append): Appends text after the current line.

  ```bash
  sed '2a\This is a new line.' myfile.txt  # Append text after line 2
  ```

  The one-line form shown here is a GNU extension; POSIX `sed` expects `a\` to be followed by a newline, with the text on the next line. The same applies to `i\` and `c\`.

- `i\` (insert): Inserts text before the current line.

  ```bash
  sed '2i\This is a new line.' myfile.txt  # Insert text before line 2
  ```

- `c\` (change/replace): Replaces entire lines with new text.

  ```bash
  sed '2c\This is the replacement for line 2.' myfile.txt
  ```

- `q` (quit): Exits `sed` immediately after the addressed line. This is useful for processing only the beginning of a file.

  ```bash
  sed '10q' myfile.txt  # Process only the first 10 lines
  ```

- `r filename` (read): Reads the contents of `filename` and appends them to the output after the current line.

  ```bash
  sed '2r insert.txt' myfile.txt  # Insert the contents of insert.txt after line 2 of myfile.txt
  ```

- `w filename` (write): Writes the current line to `filename`. We’ve seen this as a flag for `s`, but it can also be used as a standalone command.

  ```bash
  sed -n '/pattern/w output.txt' myfile.txt  # Write lines matching "pattern" to output.txt
  ```

- `y/source/dest/` (translate or transliterate): Transforms individual characters, mapping each character in `source` to the character at the same position in `dest`.

  ```bash
  sed 'y/abc/xyz/' myfile.txt  # Translate every "a" to "x", "b" to "y", and "c" to "z"
  ```

- `=` (print line number): Prints the current line number.

  ```bash
  sed '=' myfile.txt  # Output each line number, followed by the line itself, on separate lines
  ```
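A few of these commands can be exercised on a short input to see what each one emits. The sketch below uses four made-up lines and only commands that are written the same way in GNU and BSD `sed`.

```shell
# Four sample lines.
doc=$(printf 'alpha\nbeta\ngamma\ndelta\n')

# d: drop line 2.
deleted=$(printf '%s\n' "$doc" | sed '2d')

# p with -n: print only lines 2 and 3.
printed=$(printf '%s\n' "$doc" | sed -n '2,3p')

# q: stop after line 2, so only the first two lines are output.
first_two=$(printf '%s\n' "$doc" | sed '2q')

# y: map a->A, b->B, g->G character by character.
translated=$(printf '%s\n' "$doc" | sed 'y/abg/ABG/')

# = with -n on the last line: prints the number of lines in the input.
line_count=$(printf '%s\n' "$doc" | sed -n '$=')

printf '%s\n---\n' "$deleted" "$printed" "$first_two" "$translated" "$line_count"
```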
11. Common sed Use Cases and Examples
Here are some practical examples demonstrating how `sed` can be used to solve common text manipulation tasks:
- Extracting specific lines:

  ```bash
  # Extract lines containing the word "error"
  sed -n '/error/p' logfile.txt
  # Extract lines between two patterns (inclusive); the opening tag name here is illustrative
  sed -n '/<start>/,/<\/end>/p' data.xml
  # Extract the first 10 lines
  sed '10q' large_file.txt
  # or
  head -n 10 large_file.txt  # (head is often faster for this)
  ```
- Replacing text in a configuration file:

  ```bash
  # Change the database host in a config file (using -i.bak for safety)
  sed -i.bak 's/db_host = localhost/db_host = 192.168.1.100/' config.ini
  ```

- Removing blank lines:

  ```bash
  sed '/^$/d' myfile.txt
  ```

- Removing comments from code (lines starting with `#`):

  ```bash
  sed '/^#/d' script.sh  # Note: this also deletes a leading #! line
  ```

- Adding a prefix to each line:

  ```bash
  sed 's/^/prefix: /' myfile.txt
  ```

- Adding a suffix to each line:

  ```bash
  sed 's/$/: suffix/' myfile.txt
  ```

- Converting CSV to a different delimiter:

  ```bash
  sed 's/,/|/g' data.csv > data.psv  # Convert commas to pipes
  ```

- Extracting data from structured text:

  ```bash
  # Extract the username from lines like "User: jdoe"
  sed -n -E 's/^User: (.*)/\1/p' users.txt
  ```

- Double-spacing a file:

  ```bash
  sed 'G' myfile.txt  # G appends a newline plus the (empty) hold space to every line
  ```

- Numbering lines:

  ```bash
  sed = myfile.txt | sed 'N;s/\n/ /'  # Put each line number in front of its line
  ```

- Removing leading and trailing whitespace:

  ```bash
  sed 's/^[[:space:]]*//;s/[[:space:]]*$//' myfile.txt
  ```
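Several of these recipes are often chained into one small filter. The sketch below wraps the whitespace-trimming, comment-removal, and blank-line recipes in a shell function; the name `clean_config` and the sample configuration text are hypothetical, chosen only for this illustration.

```shell
# clean_config is a hypothetical helper name, not a standard tool.
# It reads text on standard input and applies four commands in order:
# trim leading whitespace, trim trailing whitespace, drop comment lines,
# and drop lines that are empty after trimming.
clean_config() {
  sed -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^#/d' \
      -e '/^$/d'
}

# Made-up configuration text with indentation, a comment, and a blank line.
cleaned=$(printf '  # a comment\nhost = localhost  \n\n   port = 5432\n' | clean_config)

printf '%s\n' "$cleaned"
```

The order of the commands matters here: trimming first means an indented comment such as `  # a comment` starts with `#` by the time the delete command runs.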
12. Conclusion
`sed` is an incredibly powerful and versatile tool for text manipulation. This guide has covered the `s` command in detail, including its syntax, flags, regular expressions, backreferences, addressing, and various use cases. While `sed` might seem daunting at first, mastering it will significantly enhance your command-line productivity. The ability to perform complex text transformations with a few concise commands is invaluable for system administrators, developers, and anyone who works with text files. Remember to use `-i` with caution and always back up your files before making in-place edits. Experiment with the examples provided, and gradually build your `sed` expertise. The more you practice, the more comfortable you’ll become with its power and flexibility.