Chapter 7. Sed Multi-Line Commands and Loops

By default, sed handles one line at a time, unless we use the H, G, or N command to build a buffer that holds multiple lines separated by newlines. This chapter describes the sed commands that apply to such multi-line buffers. Note: When the buffer holds multiple lines, keep in mind that ^ matches only at the very beginning of the buffer (before the 1st character of all the lines combined), and $ matches only at the very end of the buffer (after the last character of the last line), not at the newlines embedded in between.
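As a quick check of how ^ and $ behave on a multi-line buffer, the following sketch joins two lines with N and then substitutes at both anchors. The input lines "a" and "b" are made up for illustration.

```shell
# Join two input lines into one buffer, then mark where ^ and $ match.
# The sample lines "a" and "b" are illustrative only.
anchors=$(printf 'a\nb\n' | sed -e 'N' -e 's/^/>/' -e 's/$/</')
printf '%s\n' "$anchors"
# The > lands before "a" and the < after "b"; nothing is added
# around the newline in the middle of the buffer.
```

The output is ">a" on the first line and "b<" on the second, which shows that neither anchor matched next to the embedded newline.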

46. Append Next Line to Pattern Space (N command)

Just as upper case H and G append rather than replace, the N command appends the next line from the input-file to the pattern buffer, rather than replacing the current line. As we discussed earlier, the lower case n command prints the current pattern space (unless -n is given), replaces it with the next line from the input-file, and resumes command execution where it left off. The upper case N command does not print the current pattern space and does not clear it. Instead, it adds a newline (\n) at the end of the current pattern space, appends the next line from the input-file, and continues with the standard sed flow by executing the rest of the sed commands.
Print employee names and titles separated by a colon:
$ sed -e '{N;s/\n/:/}' empnametitle.txt
John Doe:CEO
Jason Smith:IT Manager
Raj Reddy:Sysadmin
Anand Ram:Developer
Jane Miller:Sales Manager
In the above example:
  • N appends a newline to the current pattern space (which holds the employee name) and then appends the next line from the input-file. So, the pattern space will contain "Employee Name\nTitle".
  • s/\n/:/ This replaces the \n that separates the "Employee Name\nTitle" with a colon :
The following example demonstrates the use of the N command to print the line number on the same line as the text, while printing each line from employee.txt.
Print line numbers:
$ sed -e '=' employee.txt | sed -e '{N;s/\n/ /}'
1 101,John Doe,CEO
2 102,Jason Smith,IT Manager
3 103,Raj Reddy,Sysadmin
4 104,Anand Ram,Developer
5 105,Jane Miller,Sales Manager
As we saw in our previous examples, the sed = command prints the line number first, and the original line next. In this example, the N command adds \n to the current pattern space (which contains the line number), then reads the next line and appends it. So, the pattern space will contain "line-number\nOriginalline-content". Then we execute s/\n/ / to change the newline (\n) to a space.
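Both examples above feed sed an even number of lines. When the input has an odd number of lines, N finds no next line for the last one: GNU sed prints the pattern space and exits, while POSIX specifies that sed quits without printing it. A common way to sidestep the difference, sketched here on made-up input, is to run N on every line except the last ($!N).

```shell
# Pair up lines, but skip N on the last line so an unpaired
# final line is still printed. Sample input is illustrative only.
pairs=$(printf 'one\ntwo\nthree\n' | sed -e '$!N' -e 's/\n/:/')
printf '%s\n' "$pairs"
```

This prints "one:two" followed by "three": the substitution simply fails on the last, unpaired line, and the line is printed as-is.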

47. Print 1st Line in MultiLine (P command)

We have seen three upper case commands so far, each of which appends to rather than replaces the content of a buffer. We will now see that upper case P and D operate in a fashion similar to their lower case equivalents, but that they also do something special related to multi-line buffers. As we discussed earlier, the lower case p command prints the pattern space. The upper case P command also prints the pattern space, but only up to the first newline (\n).
The following example prints all the managers' names from the empnametitle.txt file. N joins each name with its title, and when the combined buffer contains "Manager", P prints only its first line, i.e. the name:
$ sed -n -e 'N' -e '/Manager/P' empnametitle.txt
Jason Smith
Jane Miller
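To make the difference between p and P concrete, the following sketch builds the same two-line buffer twice and prints it once with each command. The two input lines mirror one name/title pair from empnametitle.txt.

```shell
# Build a two-line buffer with N, then compare p and P.
# The sample lines mirror one name/title pair from empnametitle.txt.
with_p=$(printf 'Jason Smith\nIT Manager\n' | sed -n -e 'N' -e 'p')
with_P=$(printf 'Jason Smith\nIT Manager\n' | sed -n -e 'N' -e 'P')
printf 'p prints:\n%s\n' "$with_p"
printf 'P prints:\n%s\n' "$with_P"
```

Here p prints both lines of the buffer, while P prints only "Jason Smith".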

48. Delete 1st Line in MultiLine (D command)

As we discussed earlier, the lower case d command deletes the current pattern space, skips the rest of the sed commands, and starts the next cycle by reading the next line from the input-file into the pattern space. The upper case D command behaves exactly like d when the pattern space holds only one line. When the pattern space holds multiple lines, D neither clears the whole pattern buffer nor reads the next line. Instead, it does the following:
  • Deletes the part of the pattern space up to and including the first newline (\n).
  • Skips the rest of the sed commands and restarts command execution from the beginning on the remaining content of the pattern buffer, without reading a new line of input.
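The restart behavior of D can be observed with the l command, which prints the pattern space with the newline shown as \n and the end marked by $. The following is a minimal sketch on made-up input; it is not one of the book's sample files.

```shell
# N appends the next line, l shows the buffer, D drops the first line
# and restarts the script on what is left (without reading new input).
# -n suppresses automatic printing, so only the l output is shown.
window=$(printf 'a\nb\nc\n' | sed -n -e 'N' -e 'l' -e 'D')
printf '%s\n' "$window"
```

The output is a\nb$ followed by b\nc$. After the first D, the buffer still holds "b", and the next N appends "c" to it, so the buffer acts as a two-line window sliding over the input.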
Consider the following file, which has comments enclosed between @ and @ for every title. Note that the comment spans two lines in some cases. For example, @Information Technology Officer@ spans two lines. Create the following sample file.
$ vi empnametitle-with-comment.txt
John Doe
CEO @Chief Executive Officer@
Jason Smith
IT Manager @Information Technology
Officer@
Raj Reddy
Sysadmin @System Administrator@
Anand Ram
Developer @Senior
Programmer@
Jane Miller
Sales Manager @Sales
Manager@
Our goal is to remove these comments from this file. This can be done as shown below.
$ sed -e '/@/{N;/@.*@/{s/@.*@//;P;D}}' empnametitle-with-comment.txt
John Doe
CEO
Jason Smith
IT Manager
Raj Reddy
Sysadmin
Anand Ram
Developer
Jane Miller
Sales Manager
You can also save this in a sed script file and execute it as shown below.
$ vi D-upper.sed
#!/bin/sed -f
/@/ {
N
/@.*@/ {s/@.*@//;P;D }
}
$ chmod u+x D-upper.sed
$ ./D-upper.sed empnametitle-with-comment.txt
In the above example:
  • /@/ { - This is the outer block. Sed looks for any line that contains the @ symbol. If it finds one, it executes the commands inside the braces. If not, it prints the line and reads the next one. For example, let us take line 4, which is "IT Manager @Information Technology" (the comment spans multiple lines and continues on line 5). There is an @ symbol on line 4, so the commands inside the block are executed.
  • N - Get the next line from the input file and append it to the pattern space. For example, this will read line 5 "Officer@", and append it to the pattern space. So, the pattern space will contain "IT Manager @Information Technology\nOfficer@".
  • /@.*@/ - Checks whether the pattern space matches "@.*@", which means anything enclosed between @ and @. The expression is true for the current pattern space, so sed goes to the next step.
  • s/@.*@//;P;D - The s command replaces the whole text "@Information Technology\nOfficer@" with nothing (basically it deletes the text), leaving "IT Manager ". P prints the pattern space up to the first newline (here, all of it). D deletes the pattern space up to the first newline. If text remains after that newline (for example, a name that N pulled in after a single-line comment), sed restarts the script on the remaining text without reading new input; otherwise it starts the next cycle with the next input line.
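One detail of the command above is that s/@.*@// leaves the space that preceded the comment, so a line such as "CEO " is printed with a trailing space. The following sketch is a small variation, not the original command: it adds " *" to the pattern so the spaces are removed too. It assumes GNU sed and recreates the sample file in a temporary location (the variable names are made up for illustration).

```shell
# Recreate the sample file from this section in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
John Doe
CEO @Chief Executive Officer@
Jason Smith
IT Manager @Information Technology
Officer@
Raj Reddy
Sysadmin @System Administrator@
Anand Ram
Developer @Senior
Programmer@
Jane Miller
Sales Manager @Sales
Manager@
EOF

# Same logic as before, but " *" also consumes the spaces in front of
# the comment, so no trailing space is left on the title lines.
cleaned=$(sed -e '/@/{N;/@.*@/{s/ *@.*@//;P;D}}' "$sample")
printf '%s\n' "$cleaned"
rm -f "$sample"
```

The printed lines are the same ten names and titles as before, without the trailing spaces.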

49. Loop and Branch (b command and :label)

You can change the execution flow of the sed commands by using label and branch (b command).
  • :label defines the label.
  • b label branches the execution flow to the label. Sed jumps to the line marked by the label and continues executing the rest of the commands from there.
  • Note: You can also execute just the b command (without any label name). In this case, sed jumps to the end of the sed script file.
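The b command without a label can be sketched as follows. The three title lines are made-up input; lines that match /Manager/ jump straight to the end of the script and are printed unchanged, while all other lines reach the substitution.

```shell
# Lines matching /Manager/ hit the bare b and skip the rest of the script;
# every other line continues to the substitution that adds "- " in front.
titles=$(printf 'CEO\nIT Manager\nSysadmin\n' | sed -e '/Manager/b' -e 's/^/- /')
printf '%s\n' "$titles"
```

This prints "- CEO", "IT Manager", and "- Sysadmin".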
The following example combines the employee name and title (from the empnametitle.txt file) to a single line separated by : between the fields, and also adds a "*" in front of the employee name, when that employee's title contains the keyword "Manager".
$ vi label.sed
#!/bin/sed -nf
h;n;H;x
s/\n/:/
/Manager/!b end
s/^/*/
:end
p
In the above example, you already know what "h;n;H;x" and "s/\n/:/" do, as we discussed those in our previous examples. Following are the branching related lines in this file.
  • /Manager/!b end - If the line doesn't contain the keyword "Manager", sed jumps to the label called "end", skipping the substitution. So "s/^/*/" (add a * in the front) is executed only for the Managers. Please note that the name of the label can be anything you want.
  • :end - This is the label.
Execute the above label.sed script:
$ chmod u+x label.sed

$ ./label.sed empnametitle.txt
John Doe:CEO
*Jason Smith:IT Manager
Raj Reddy:Sysadmin
Anand Ram:Developer
*Jane Miller:Sales Manager
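The label.sed script uses b only to skip a command. Branching backward to a label that appears earlier in the script gives an actual loop. A commonly used sketch, shown here on made-up input, keeps appending lines with N until the last line is reached and then replaces every newline with a comma.

```shell
# :a      marks the top of the loop
# N       appends the next input line to the pattern space
# $!ba    branches back to :a unless the last line has been read
# s///g   then replaces all embedded newlines in one go
joined=$(printf 'a\nb\nc\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')
printf '%s\n' "$joined"
```

This prints "a,b,c". The script is split across several -e options because a label name runs to the end of its line, which keeps the sketch independent of how a particular sed treats a semicolon after a label.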

50. Loop Using t command

The sed command t label branches the execution flow to the label only if a substitute command has succeeded since the last input line was read or since the last t branch was taken. That is, when such a substitution was successful, sed jumps to the line marked by the label and continues executing the rest of the commands from there; otherwise it continues with the normal execution flow.
The following example combines the employee name and title (from the empnametitle.txt file) to a single line separated by : between the fields, and also adds three "*" in front of the employee name, when that employee's title contains the keyword "Manager".
Note: We could've just changed the substitute command in the previous example to "s/^/***/" (instead of s/^/*/) to achieve the same result. This example is given only to explain how the sed t command works.
$ vi label-t.sed
#!/bin/sed -nf
h;n;H;x
s/\n/:/
:repeat
/Manager/s/^/*/
/\*\*\*/!t repeat
p

$ chmod u+x label-t.sed

$ ./label-t.sed empnametitle.txt
John Doe:CEO
***Jason Smith:IT Manager
Raj Reddy:Sysadmin
Anand Ram:Developer
***Jane Miller:Sales Manager
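In the above example, the following lines do the looping:
  • :repeat - This is just the label that marks the top of the loop.
  • /Manager/s/^/*/ - If it is a Manager, it adds a single * in front of the line.
  • /\*\*\*/!t repeat - If the line doesn't yet contain three *s (represented by /\*\*\*/!), and a substitution has succeeded since the last input line was read or the last t branch was taken, sed jumps to the label called repeat (this is represented by t repeat). Once the line contains three *s, the t command is skipped and the loop ends. Note that the earlier s/\n/:/ also counts as a successful substitution, so even a non-Manager line takes this branch once; nothing changes on that pass, the branch clears the flag, and the loop ends.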
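In label-t.sed the loop is stopped by the /\*\*\*/ address. More often, a t loop simply repeats one substitution until it no longer matches. The following sketch, which is not from the sample files in this book, uses that pattern to insert thousands separators into a made-up number.

```shell
# Each pass splits the leading run of digits so that its last three
# digits get a comma in front of them. t branches back to :a as long
# as the substitution keeps succeeding.
grouped=$(echo 1234567 | sed -e ':a' -e 's/^\([0-9]\{1,\}\)\([0-9]\{3\}\)/\1,\2/' -e 'ta')
printf '%s\n' "$grouped"
```

The buffer goes from 1234567 to 1234,567 and then to 1,234,567. On the third pass the substitution fails, so t does not branch and sed prints the result.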