Using `command s`: a guide.

Using sed: A Comprehensive Guide

sed (stream editor) is a powerful, non-interactive text editor in Unix-like operating systems (including Linux and macOS). It’s a cornerstone of command-line text manipulation, allowing you to perform complex text transformations on files or streams of text without needing to open a full-fledged editor. Unlike interactive editors like vi or nano, sed operates on a stream of text, line by line, applying a set of commands that you specify. This makes it ideal for scripting, automating repetitive tasks, and processing large files efficiently.

This guide focuses primarily on the most frequently used command within sed: the substitution command, s. While sed offers a wide array of commands for text manipulation, the s command is arguably its most versatile and commonly employed feature. We’ll delve into every aspect of the s command, from basic substitutions to advanced techniques using regular expressions, backreferences, and flags.

1. Basic Syntax and Structure

The fundamental syntax of sed when using the s command is:

bash
sed 's/pattern/replacement/flags' input_file

Let’s break down each part:

  • sed: This invokes the stream editor program.
  • '...': The single quotes enclose the sed script. Using single quotes prevents the shell from interpreting special characters within the script, ensuring that sed receives them correctly. This is crucially important for avoiding unintended behavior, especially when working with regular expressions.
  • s: This is the substitution command. It tells sed to find instances of a pattern and replace them with something else.
  • /: This is the default delimiter. sed uses the character immediately following the s as the delimiter separating the pattern, replacement, and flags sections. While / is conventional, any character other than a backslash or a newline can be used as the delimiter, provided it’s used consistently within a single s command. This is incredibly useful when your pattern or replacement contains forward slashes (more on this later).
  • pattern: This is the text or regular expression that sed will search for.
  • replacement: This is the text that will replace the matched pattern.
  • flags: These are optional modifiers that control how the substitution is performed. We’ll explore these in detail below.
  • input_file: This is the file that sed will read and process. If no input file is specified, sed reads from standard input (stdin), which allows it to be used in pipelines.
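
The effect of the quoting rule can be seen in a small sketch. The variable name and sample text below are made up for illustration; the only point is the difference between single and double quotes around the script.

```shell
# Hypothetical variable holding the replacement text.
name="world"

# Single quotes: the shell passes $name to sed unchanged, so the literal
# text "$name" ends up in the output.
literal=$(echo "hello there" | sed 's/there/$name/')

# Double quotes: the shell expands $name before sed sees the script.
expanded=$(echo "hello there" | sed "s/there/$name/")

echo "$literal"    # hello $name
echo "$expanded"   # hello world
```

Double quotes are convenient when a shell variable has to reach the script, but the value is pasted in verbatim, so a value containing the delimiter, & or a backslash would change the meaning of the command.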

Example 1: Simple Substitution

bash
echo "The quick brown fox" | sed 's/fox/dog/'

Output:

The quick brown dog

In this example:

  • echo "The quick brown fox" sends the string to standard output.
  • The pipe (|) passes this output to sed’s standard input.
  • sed 's/fox/dog/' replaces the first occurrence of “fox” with “dog”.

Example 2: Using a Different Delimiter

bash
echo "/path/to/file" | sed 's#/path/to/#/new/path/to/#'

Output:

/new/path/to/file

Here, we used # as the delimiter instead of /. This avoids the need to escape the forward slashes within the /path/to/ string. Choosing a delimiter that doesn’t appear in your pattern or replacement makes your sed commands much more readable. Common alternatives to / include #, @, |, and :.
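
As a rough comparison, the same substitution can be written with the default delimiter and with an alternative one. The path below is only sample data.

```shell
path="/usr/local/bin/tool"

# Default delimiter: every slash in the pattern and replacement is escaped.
escaped=$(echo "$path" | sed 's/\/usr\/local\//\/opt\//')

# Alternative delimiter: the same substitution without any escaping.
plain=$(echo "$path" | sed 's|/usr/local/|/opt/|')

echo "$escaped"   # /opt/bin/tool
echo "$plain"     # /opt/bin/tool
```

Both commands produce the same result; the second is simply easier to read and to get right.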

2. Flags: Controlling the Substitution

The flags section of the s command provides fine-grained control over how the substitution is performed. Here are the most important flags:

  • g (global): By default, sed only replaces the first occurrence of the pattern on each line. The g flag tells sed to replace all occurrences of the pattern on each line.

    bash
    echo "apple apple apple" | sed 's/apple/banana/' # Output: banana apple apple
    echo "apple apple apple" | sed 's/apple/banana/g' # Output: banana banana banana

  • [number] (nth occurrence): You can specify a number to replace only the nth occurrence of the pattern on each line.

    bash
    echo "apple apple apple" | sed 's/apple/banana/2' # Output: apple banana apple

  • i (case-insensitive): The i flag makes the pattern matching case-insensitive. It is a GNU extension (GNU sed accepts both i and I) and is not part of POSIX sed.

    bash
    echo "Apple apple APPLE" | sed 's/apple/banana/gi' # Output: banana banana banana

    This replaces all occurrences of “apple”, regardless of case, due to the combination of g and i.

  • p (print): If a substitution is made, the p flag causes sed to print the modified line. This is often used in conjunction with the -n option (see below), which suppresses the default printing of all lines.

  • w filename (write): If a substitution is made, the w flag writes the modified line to the specified filename. This allows you to selectively write lines to a different file. Crucially, the filename must be separated from the w by a space.

    bash
    echo "apple apple apple" | sed 's/apple/banana/w output.txt'
    cat output.txt # Output: banana apple apple

    This replaces the first instance of “apple” with “banana” and writes the modified line to output.txt. The modified line is also printed to standard output because we didn’t use -n.

  • I (case-insensitive, GNU extension): Equivalent to i in GNU sed. Neither spelling is defined by POSIX, so it might not be available on all systems.

  • m / M (multi-line, GNU extensions): These flags affect how the ^ (beginning of line) and $ (end of line) anchors work when the pattern space contains embedded newline characters, for example after the N command has appended another input line. Normally, ^ and $ match only at the beginning and end of the whole pattern space. With m or M, they also match just after and just before each embedded newline. These are GNU extensions.
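
The multi-line flag is easiest to see with a pattern space that actually contains a newline. The following sketch assumes GNU sed, since M is a GNU extension, and uses the N command to join two input lines before substituting.

```shell
# N joins the two input lines into one pattern space: "one\ntwo".
# Without M, ^ matches only at the very start of the pattern space.
without_m=$(printf 'one\ntwo\n' | sed 'N;s/^/> /g')

# With M (GNU sed), ^ also matches just after the embedded newline.
with_m=$(printf 'one\ntwo\n' | sed 'N;s/^/> /Mg')

printf '%s\n' "$without_m"
# > one
# two
printf '%s\n' "$with_m"
# > one
# > two
```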

3. The -n Option: Suppressing Default Output

By default, sed prints every line of the input, whether or not a substitution was made. The -n option (also --quiet or --silent) suppresses this default output. When used with the p flag, only lines where a substitution was made are printed.

bash
# Without -n, all lines are printed, and modified lines are printed twice
sed 's/apple/banana/p' input.txt

# With -n, only modified lines are printed
sed -n 's/apple/banana/p' input.txt

This combination is very useful for extracting lines that match a pattern and undergo a transformation.
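
A self-contained sketch of the difference, using three made-up input lines instead of a file:

```shell
# Sample input: three lines, two of which contain "apple".
input=$(printf 'apple pie\ncherry tart\napple cake\n')

# Without -n: every line is printed, and changed lines appear twice.
all=$(printf '%s\n' "$input" | sed 's/apple/banana/p')

# With -n: only the lines where the substitution succeeded are printed.
changed=$(printf '%s\n' "$input" | sed -n 's/apple/banana/p')

printf '%s\n' "$changed"
# banana pie
# banana cake
```

Here the first command produces five lines of output (two of them duplicates), while the second produces only the two transformed lines.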

4. Regular Expressions: Unleashing the Power of sed

The real power of sed comes from its ability to use regular expressions (regex) in the pattern part of the s command. Regular expressions are a powerful way to describe patterns of text, going far beyond simple string matching.

Here’s a breakdown of commonly used regular expression elements within sed:

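  • . (dot): Matches any single character. In GNU sed this includes a newline embedded in the pattern space (for example after the N command), although with ordinary line-by-line processing the pattern space contains no newline to match.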

    bash
    echo "cat bat mat" | sed 's/.at/XXX/' # Output: XXX bat mat

  • * (asterisk): Matches the preceding character zero or more times.

    bash
    echo "caat ct" | sed 's/a*/X/' # Output: Xcaat ct (a* first matches the empty string before "c")

  • + (plus): Matches the preceding character one or more times. Note: By default, sed uses basic regular expressions (BRE), where + is a literal character. To use it as a quantifier, escape it (\+, a GNU extension) or enable extended regular expressions (ERE) with -E (or -r in GNU sed).

    bash
    echo "caat ct" | sed 's/a+/X/' # Output: caat ct (no match, because + is literal in BRE)
    echo "caat ct" | sed -E 's/a+/X/' # Output: cXt ct (using extended regex)
    echo "caat ct" | sed 's/a\+/X/' # Output: cXt ct (escaping + in BRE)

  • ? (question mark): Matches the preceding character zero or one time. Similar to +, you need to escape it (\?, a GNU extension) in BRE or use -E (or -r in GNU sed) for ERE.

    bash
    echo "caat cat ct" | sed -E 's/ca?t/X/g' # Output: caat X X

  • [...] (character class): Matches any single character within the brackets.

    bash
    echo "cat bat mat" | sed 's/[cb]at/XXX/g' # Output: XXX XXX mat

  • [^...] (negated character class): Matches any single character not within the brackets.

    bash
    echo "cat bat mat" | sed 's/[^cb]at/XXX/' # Output: cat bat XXX

  • ^ (caret): Matches the beginning of the line (or string, in the absence of newlines). As mentioned earlier, the m or M flags can modify this behavior.

    bash
    echo -e "cat\nbat" | sed 's/^b/XXX/' # Output: cat\nXXXat

  • $ (dollar sign): Matches the end of the line (or string). Also affected by m or M.

    bash
    echo -e "cat\nbat" | sed 's/t$/XXX/' # Output: caXXX\nbaXXX

  • \{n\} (exactly n times): Matches the preceding character exactly n times. (Requires escaping in BRE)
    bash
    echo "caaat" | sed 's/a\{3\}/X/' # Output: cXt

  • \{n,\} (n or more times): Matches the preceding character n or more times. (Requires escaping in BRE)
    bash
    echo "caaat" | sed 's/a\{2,\}/X/' # Output: cXt

  • \{n,m\} (between n and m times): Matches the preceding character between n and m times (inclusive). (Requires escaping in BRE)
    bash
    echo "caaaat" | sed 's/a\{2,3\}/X/' # Output: cXat

  • (...) (capturing group): Groups parts of the regular expression. These groups can be referenced in the replacement section using backreferences (see below). Requires escaping in BRE: \(...\).

  • | (alternation): Matches either the expression before or the expression after the |. Requires escaping in BRE (\|, a GNU extension) or using -r or -E.

    bash
    echo "cat dog" | sed -E 's/cat|dog/animal/g' # Output: animal animal

    This replaces every “cat” or “dog” with “animal”; without the g flag, only the first match (“cat”) would be replaced.

  • Character Classes (POSIX): POSIX defines some useful character classes that can be used within square brackets:

  • [:alnum:]: Alphanumeric characters (a-z, A-Z, 0-9).
  • [:alpha:]: Alphabetic characters (a-z, A-Z).
  • [:blank:]: Space and tab characters.
  • [:cntrl:]: Control characters.
  • [:digit:]: Digits (0-9).
  • [:graph:]: Printable and visible characters (excluding space).
  • [:lower:]: Lowercase letters (a-z).
  • [:print:]: Printable characters (including space).
  • [:punct:]: Punctuation characters.
  • [:space:]: Whitespace characters (space, tab, newline, carriage return, form feed, vertical tab).
  • [:upper:]: Uppercase letters (A-Z).
  • [:xdigit:]: Hexadecimal digits (0-9, a-f, A-F).

    bash
    echo "a1B2c3D4" | sed 's/[[:digit:]]/X/g' # Output: aXBXcXDX
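
Two points from the list above are worth trying out: a quantifier that allows zero repetitions can match the empty string, and POSIX classes combine naturally with intervals. The sample strings below are invented for the demonstration.

```shell
# a* can match the empty string, so the first match is the empty string
# at the very start of the line, before the "c".
first=$(echo "caat ct" | sed 's/a*/X/')

# Requiring at least one "a" with an interval avoids the empty match.
at_least_one=$(echo "caat ct" | sed 's/a\{1,\}/X/')

# A POSIX class inside a bracket expression, with an interval: collapse
# each run of spaces and tabs into a single space.
squeezed=$(printf 'a  b\t\tc\n' | sed 's/[[:blank:]]\{1,\}/ /g')

echo "$first"         # Xcaat ct
echo "$at_least_one"  # cXt ct
echo "$squeezed"      # a b c
```

All three commands use only basic regular expression syntax, so they should behave the same in GNU and BSD sed.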

5. Backreferences: Using Captured Groups

Capturing groups, defined by (...) (or \(...\) in BRE), allow you to refer back to the text matched by the group in the replacement part of the s command. This is done using backreferences:

  • \1: Refers to the first capturing group.
  • \2: Refers to the second capturing group.
  • \3: Refers to the third capturing group, and so on.
  • &: Refers to the entire matched text. This is useful even without capturing groups.

Example 1: Swapping Words

bash
echo "John Doe" | sed -E 's/(\w+) (\w+)/\2, \1/' # Output: Doe, John

  • -E: Enables extended regular expressions (so we don’t need to escape the parentheses).
  • (\w+): Matches one or more word characters (\w is a GNU extension equivalent to [[:alnum:]_]) and captures it as group 1.
  • (a single space): Matches a literal space.
  • (\w+): Matches another word and captures it as group 2.
  • \2, \1: Replaces the entire match with group 2, a comma, a space, and then group 1.

Example 2: Doubling a Number

bash
echo "The price is 123 dollars." | sed -E 's/([0-9]+)/& &/g'

Output:

The price is 123 123 dollars.

  • ([0-9]+): Matches one or more digits and captures them as group 1. The group is not actually needed here, because the replacement only uses &.
  • & &: Replaces the matched number with itself (using &) followed by a space and itself again.

Example 3: Inserting Text Around a Match

bash
echo "Hello world" | sed 's/world/<b>&<\/b>/' # Output: Hello <b>world</b>

  • &: Represents the entire matched text (“world”). We wrap it in HTML bold tags.
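
As a further sketch, numbered backreferences can reorder several captured pieces at once. The date and the sentence around it are sample data, and the comma delimiter is just one convenient choice.

```shell
# Reorder an ISO date (YYYY-MM-DD) into DD/MM/YYYY using three groups.
# A comma is used as the delimiter so the slashes need no escaping.
reordered=$(echo "Released on 2024-03-15." |
  sed -E 's,([0-9]{4})-([0-9]{2})-([0-9]{2}),\3/\2/\1,')

# & stands for the whole match: wrap every number in square brackets.
bracketed=$(echo "3 apples and 12 pears" | sed -E 's/[0-9]+/[&]/g')

echo "$reordered"   # Released on 15/03/2024.
echo "$bracketed"   # [3] apples and [12] pears
```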

6. Extended Regular Expressions (-r or -E)

As mentioned earlier, sed uses basic regular expressions (BRE) by default. This means that some characters, like +, ?, |, (, and ), are treated as literal characters unless they are escaped with a backslash (\).

The -E option tells sed to use extended regular expressions (ERE); GNU sed also accepts -r as an equivalent spelling. With ERE, these characters have their special regex meaning without needing to be escaped. This often makes your sed commands more readable and easier to write.

bash
# BRE (escaping required)
echo "123" | sed 's/[0-9]+/number/' # Output: 123 (no match, + is literal)
echo "123" | sed 's/[0-9]\+/number/' # Output: number

# ERE (no escaping needed)
echo "123" | sed -E 's/[0-9]+/number/' # Output: number
echo "123" | sed -r 's/[0-9]+/number/' # Output: number

Recommendation: For most cases, it’s highly recommended to use -E (or -r) to enable extended regular expressions. It significantly improves the readability and reduces the chance of errors due to forgotten backslashes.
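
To make the readability argument concrete, here is one substitution written both ways. The log line is invented; the two commands are meant to be equivalent.

```shell
line="error: code 404 at 10:32"

# BRE: grouping and intervals need backslashes.
bre=$(echo "$line" | sed 's/code \([0-9]\{3\}\)/status=\1/')

# ERE: the same expression without the backslashes.
ere=$(echo "$line" | sed -E 's/code ([0-9]{3})/status=\1/')

echo "$bre"   # error: status=404 at 10:32
echo "$ere"   # error: status=404 at 10:32
```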

7. In-Place Editing (-i)

By default, sed writes its output to standard output. It does not modify the original input file. The -i option (in-place editing) changes this behavior, allowing sed to modify the file directly.

Important: Use -i with extreme caution. It modifies the original file, and there’s no undo. Always make a backup of your file before using -i.

bash
# Make a backup first!
cp myfile.txt myfile.txt.bak

# Now, modify the file in-place
sed -i 's/old/new/g' myfile.txt

7.1. In-Place Editing with Backup (-i.bak)

Many versions of sed (particularly GNU sed) support creating a backup file automatically when using -i. You can specify a suffix to be appended to the original filename to create the backup.

bash
sed -i.bak 's/old/new/g' myfile.txt

This will:

  1. Create a backup file named myfile.txt.bak.
  2. Modify myfile.txt in-place, replacing all occurrences of “old” with “new”.

This is a much safer way to use in-place editing, as you always have a copy of the original file.

7.2. macOS (BSD) sed and -i

On macOS, the sed command is based on BSD sed, which has a slightly different behavior with -i. On macOS, -i requires a backup extension, even if it’s an empty string.

  • To create a backup: sed -i.bak 's/old/new/g' myfile.txt (same as GNU sed).
  • To modify in-place without a backup (very dangerous): sed -i '' 's/old/new/g' myfile.txt (note the empty string after -i).
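
Because the attached-suffix form is accepted by both GNU and BSD sed, it is a reasonable default for scripts that have to run on Linux and macOS. The sketch below works on a temporary file so that nothing real is touched; the file contents are made up.

```shell
# Work on a temporary file so nothing real is modified.
f=$(mktemp)
printf 'old value\nanother old value\n' > "$f"

# -i with an attached suffix is accepted by both GNU and BSD sed.
sed -i.bak 's/old/new/g' "$f"

result=$(cat "$f")       # the edited file
backup=$(cat "$f.bak")   # the untouched original

printf '%s\n' "$result"
# new value
# another new value

# Remove the temporary files once the result has been checked.
rm -f "$f" "$f.bak"
```

If the backup is not wanted, deleting it after a successful edit keeps the command portable while still avoiding the GNU/BSD difference in how a bare -i is parsed.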

8. Addressing: Specifying Lines to Operate On

So far, we’ve been applying the s command to every line of the input. sed allows you to specify a range of lines or lines matching a specific pattern on which to apply the command. This is called “addressing.”

  • No address: The command is applied to every line.

  • Single line number: The command is applied only to that specific line.

    bash
    sed '2s/old/new/' myfile.txt # Replace "old" with "new" only on line 2

  • Line range (start,end): The command is applied to lines from start to end (inclusive).

    bash
    sed '2,5s/old/new/' myfile.txt # Replace on lines 2 through 5

  • $ (last line): Represents the last line of the input.

    bash
    sed '$s/old/new/' myfile.txt # Replace on the last line
    sed '2,$s/old/new/' myfile.txt # Replace from line 2 to the last line

  • /pattern/ (address by pattern): The command is applied to lines that match the given regular expression pattern.

    bash
    sed '/error/s/error/WARNING/' myfile.txt # Replace "error" with "WARNING" on lines containing "error"

  • /pattern1/,/pattern2/ (range by patterns): The command is applied to lines starting from the first line matching pattern1 up to and including the next line matching pattern2. If pattern2 is not found, the command is applied through the end of the file. If a later line matches pattern1 again, a new range begins there.

    bash
    sed '/START/,/END/s/old/new/' myfile.txt # Replace between lines containing "START" and "END"

  • first~step (GNU extension): This address selects line first, and then every step-th line after that.

    bash
    sed '1~2s/old/new/' myfile.txt # Replace on odd-numbered lines
    sed '2~2s/old/new/' myfile.txt # Replace on even-numbered lines

  • ! (negation): The ! character negates the address. It applies the command to lines that do not match the address.

    bash
    sed '2!s/old/new/' myfile.txt # Replace on all lines *except* line 2
    sed '/error/!s/old/new/' myfile.txt # Replace on lines *not* containing "error"

    This can be combined:

    bash
    sed '2,4!s/old/new/' myfile.txt # Replace on all lines except lines 2 through 4
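
The address forms can be compared side by side on a small, made-up input of five lines:

```shell
# Five numbered sample lines, each containing the word "item".
sample=$(printf 'item 1\nitem 2\nitem 3\nitem 4\nitem 5\n')

# Lines 2 through 4 only.
range=$(printf '%s\n' "$sample" | sed '2,4s/item/ITEM/')

# Every line except those matching /3/.
negated=$(printf '%s\n' "$sample" | sed '/3/!s/item/ITEM/')

# From the first line matching /2/ through the last line.
to_end=$(printf '%s\n' "$sample" | sed '/2/,$s/item/ITEM/')

printf '%s\n' "$range"
# item 1
# ITEM 2
# ITEM 3
# ITEM 4
# item 5
```

Only portable address forms are used here; the first~step form is left out because it is specific to GNU sed.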

9. Multiple sed Commands

You can combine multiple sed commands in several ways:

  • Semicolon (;): Separate commands with semicolons.

    bash
    sed 's/foo/bar/;s/baz/qux/' myfile.txt

    This first replaces “foo” with “bar”, and then replaces “baz” with “qux” on each line. The output of the first command is fed as input into the second command, per line.

  • -e option: Use multiple -e options to specify separate commands.

    bash
    sed -e 's/foo/bar/' -e 's/baz/qux/' myfile.txt

    This is functionally equivalent to using semicolons.

  • sed script file (-f option): Create a text file containing your sed commands, one command per line. Then, use the -f option to tell sed to read its commands from this file.

    bash
    # Create a file named 'commands.sed' with the following content:
    #   s/foo/bar/
    #   s/baz/qux/
    sed -f commands.sed myfile.txt

    This is particularly useful for complex or frequently used sed scripts. It makes your commands more organized and reusable.
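
The fact that each command sees the result of the previous one is easy to check. In this sketch the script file is a temporary file with hypothetical contents, and the same two commands are also given inline for comparison.

```shell
# Write a small sed script to a temporary file (hypothetical contents).
script=$(mktemp)
cat > "$script" <<'EOF'
# Comments are allowed in sed script files.
s/foo/bar/
s/bar/baz/
EOF

# Commands run in order on each line, so the second command sees the
# result of the first: foo -> bar -> baz.
chained=$(echo "foo and bar" | sed -f "$script")

# The same two commands given inline with -e.
inline=$(echo "foo and bar" | sed -e 's/foo/bar/' -e 's/bar/baz/')

echo "$chained"   # baz and bar
echo "$inline"    # baz and bar
rm -f "$script"
```

Note that the original "bar" at the end of the line is left alone, because neither command uses the g flag and the second command has already replaced the first "bar" it found.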

10. Other Useful sed Commands (Beyond s)

While the s command is the focus of this guide, it’s worth briefly mentioning some other useful sed commands:

  • d (delete): Deletes the current line.

    bash
    sed '2d' myfile.txt # Delete line 2
    sed '/pattern/d' myfile.txt # Delete lines matching "pattern"

  • p (print): Prints the current line (usually used with -n). We’ve already seen this as a flag for the s command, but it can be used independently.

    bash
    sed -n '2p' myfile.txt # Print only line 2

  • a\ (append): Appends text after the current line.

    bash
    sed '2a\This is a new line.' myfile.txt # Append text after line 2

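    Note that this one-line form is a GNU extension. POSIX sed expects a\ to be followed by a newline, with the text to append on the next line; the same applies to i\ and c\.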

  • i\ (insert): Inserts text before the current line.

    bash
    sed '2i\This is a new line.' myfile.txt # Insert text before line 2

  • c\ (change/replace): Replaces entire lines with new text.

    bash
    sed '2c\This is the replacement for line 2.' file.txt

  • q (quit): Exits sed immediately. This is useful for processing only the beginning of a file.

    bash
    sed '10q' myfile.txt # Process only the first 10 lines

  • r filename (read): Reads the contents of filename and appends them to the output after the current line.

    bash
    sed '2r insert.txt' myfile.txt # Insert the contents of insert.txt after line 2 of myfile.txt

  • w filename (write): Writes the current line to filename. We’ve seen this as a flag for s, but it can also be used as a standalone command.

    bash
    sed -n '/pattern/w output.txt' myfile.txt # Write lines matching "pattern" to output.txt

  • y/source/dest/ (translate or transliterate): Transforms individual characters.

    bash
    sed 'y/abc/xyz/' myfile.txt # Translate every "a" to "x", "b" to "y", and "c" to "z"

  • = (Print line number): prints the current line number.
    bash
    sed '=' file.txt #Outputs each line number, followed by the line itself, on separate lines.
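
A few of these commands can be tried on a small in-memory sample rather than a real file. The log text below is invented for the purpose.

```shell
log=$(printf '# header\nstart\n\nerror: disk full\nok\nerror: timeout\n')

# d: drop comment lines and blank lines.
cleaned=$(printf '%s\n' "$log" | sed -e '/^#/d' -e '/^$/d')

# p with -n: print only the lines that match, like a simple grep.
errors=$(printf '%s\n' "$log" | sed -n '/^error/p')

# q: stop after the second line instead of reading the whole input.
first_two=$(printf '%s\n' "$log" | sed '2q')

# y: map characters one-to-one (here a, b, c to A, B, C).
upper_abc=$(echo "a cab" | sed 'y/abc/ABC/')

printf '%s\n' "$errors"
# error: disk full
# error: timeout
echo "$upper_abc"   # A CAB
```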

11. Common sed Use Cases and Examples

Here are some practical examples demonstrating how sed can be used to solve common text manipulation tasks:

  • Extracting specific lines:

    bash
    # Extract lines containing the word "error"
    sed -n '/error/p' logfile.txt

    # Extract lines between two patterns (inclusive);
    # <start> and <end> are placeholder tag names
    sed -n '/<start>/,/<\/end>/p' data.xml

    # Extract the first 10 lines
    sed '10q' large_file.txt
    # or
    head -n 10 large_file.txt # (head is often faster for this)

  • Replacing text in a configuration file:

    bash
    # Change the database host in a config file (using -i.bak for safety)
    sed -i.bak 's/db_host = localhost/db_host = 192.168.1.100/' config.ini

  • Removing blank lines:

    bash
    sed '/^$/d' myfile.txt

  • Removing comments from code (lines starting with #; this also removes a #! shebang line):

    bash
    sed '/^#/d' script.sh

  • Adding a prefix to each line:

    bash
    sed 's/^/prefix: /' myfile.txt

  • Adding a suffix to each line:

    bash
    sed 's/$/: suffix/' myfile.txt

  • Converting CSV to a different delimiter:

    bash
    sed 's/,/|/g' data.csv > data.psv # Convert commas to pipes

  • Extracting data from structured text:

    bash
    # Extract the username from lines like "User: jdoe"
    sed -n -E 's/^User: (.*)/\1/p' users.txt

  • Double-spacing a file:

    bash
    sed 'G' myfile.txt # G appends a newline and the (empty) hold space to each line.

  • Numbering lines:

    bash
    sed = myfile.txt | sed 'N;s/\n/ /' # Adds line numbers

  • Removing leading and trailing whitespace:

    bash
    sed 's/^[[:space:]]*//;s/[[:space:]]*$//' myfile.txt
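
Several of these one-liners can be combined into a single pass. The sketch below wraps them in a shell function; clean_config is a hypothetical helper name, and the input is a made-up configuration snippet. It assumes that # never appears inside a value, since everything from the first # to the end of the line is removed.

```shell
# clean_config is a hypothetical helper, not a standard tool.
# It strips comments, trims whitespace, and drops empty lines.
clean_config() {
  sed -e 's/#.*$//' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^$/d' "$@"
}

cleaned=$(printf '  # settings\n  host = localhost   \n\nport = 5432 # default\n' |
  clean_config)

printf '%s\n' "$cleaned"
# host = localhost
# port = 5432
```

Called with a filename (for example clean_config config.ini) the function reads that file; called with no arguments it reads standard input, as in the sketch.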

12. Conclusion

sed is an incredibly powerful and versatile tool for text manipulation. This guide has covered the s command in detail, including its syntax, flags, regular expressions, backreferences, addressing, and various use cases. While sed might seem daunting at first, mastering it will significantly enhance your command-line productivity. The ability to perform complex text transformations with a few concise commands is invaluable for system administrators, developers, and anyone who works with text files. Remember to use -i with caution and always back up your files before making in-place edits. Experiment with the examples provided, and gradually build your sed expertise. The more you practice, the more comfortable you’ll become with its power and flexibility.
