Chapter 10. Awk Variables and Operators

62. Awk Variables

An awk variable name must begin with a letter or an underscore; the remaining characters can be letters, digits, or underscores. Keywords cannot be used as variable names.

Unlike in many other programming languages, you don't need to declare a variable before using it. If you want to initialize an awk variable, it is best to do it in the BEGIN block, which is executed only once.

Awk has no explicit data types. Whether an awk variable is treated as a number or a string depends on the context in which it is used.
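
As a minimal sketch of this context-dependent typing (the variable names x, u, is_zero and is_empty are illustrative and not part of the sample files), the same value can act as a number under + and as a string when concatenated. An uninitialized variable compares equal to both 0 and the empty string:

```shell
# x holds the string "3"; the operator decides how it is interpreted.
result=$(awk 'BEGIN {
  x = "3"
  print x + 4              # numeric context: 3 + 4
  print x 4                # string context: "3" joined with "4"
  is_zero  = (u == 0)      # u was never assigned
  is_empty = (u == "")
  print is_zero, is_empty  # 1 means the comparison was true
}')
echo "$result"
```

Run as written, this should print 7, then 34, then "1 1".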

employee-sal.txt sample file

employee-sal.txt is a comma delimited file that contains 5 employee records in the following format:

employee-number,employee-name,employee-title,salary

Create the file:

$ vi employee-sal.txt
101,John Doe,CEO,10000
102,Jason Smith,IT Manager,5000
103,Raj Reddy,Sysadmin,4500
104,Anand Ram,Developer,4500
105,Jane Miller,Sales Manager,3000

The following example shows how to create and use your own variable inside an awk script. In this example, "total" is the user defined Awk variable that is used to calculate the total salary of all the employees in the company.

$ cat total-company-salary.awk
BEGIN {
  FS=",";
  total=0;
}
{
  print $2 "'s salary is: " $4;
  total=total+$4
}
END {
  print "---\nTotal company salary = $"total;
}
$ awk -f total-company-salary.awk employee-sal.txt
John Doe's salary is: 10000
Jason Smith's salary is: 5000
Raj Reddy's salary is: 4500
Anand Ram's salary is: 4500
Jane Miller's salary is: 3000
---
Total company salary = $27000

63. Unary Operators

An operator which accepts a single operand is called a unary operator.

+ -The number (returns the number itself)
- -Negate the number
++ -Auto Increment
-- -Auto Decrement

The following example negates the number using unary operator minus:

$ awk -F, '{print -$4}' employee-sal.txt
-10000
-5000
-4500
-4500
-3000

The following example demonstrates how plus and minus unary operators affect negative numbers stored in a text file:

$ vi negative.txt
-1
-2
-3
$ awk '{print +$1}' negative.txt
-1
-2
-3
$ awk '{print -$1}' negative.txt
1
2
3

Auto Increment and Auto Decrement

Auto increment and auto decrement operators change the value of the associated variable. When one is used inside an expression, the value the expression sees is taken either after the change ('pre') or before it ('post').

Pre means you'll add ++ (or --) before the variable name. This will first increase (or decrease) the value of the variable by one, and then execute the rest of the statement in which it is used.

Post means you'll add ++ (or --) after the variable name. This will first execute the containing statement and then increase (or decrease) the value of the variable by one.
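
The pre/post distinction can be sketched with plain variables before looking at fields. The names a, b and c below are illustrative only:

```shell
# Compare what an assignment receives from post- and pre-increment.
result=$(awk 'BEGIN {
  a = 5
  b = a++    # post: b gets the old value 5, then a becomes 6
  c = ++a    # pre: a becomes 7 first, then c gets 7
  print a, b, c
}')
echo "$result"
```

This should print "7 5 7": a was incremented twice, b captured the value before the first increment, and c captured the value after the second.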

Example of pre-auto-increment:

$ awk -F, '{print ++$4}' employee-sal.txt
10001
5001
4501
4501
3001

Example of pre-auto-decrement:

$ awk -F, '{print --$4}' employee-sal.txt
9999
4999
4499
4499
2999

Example of post-auto-increment:

(since ++ is in the print statement the original value is printed):

$ awk -F ',' '{print $4++}' employee-sal.txt
10000
5000
4500
4500
3000

Example of post-auto-increment:

(since ++ is in a separate statement the resulting value is printed):

$ awk -F ',' '{$4++; print $4}' employee-sal.txt
10001
5001
4501
4501
3001

Example of post-auto-decrement:

(since -- is in the print statement the original value is printed):

$ awk -F ',' '{print $4--}' employee-sal.txt
10000
5000
4500
4500
3000

Example of post-auto-decrement:

(since -- is in a separate statement the resulting value is printed):

$ awk -F ',' '{$4--; print $4}' employee-sal.txt
9999
4999
4499
4499
2999

The following useful example displays the total number of users who have a login shell, i.e. who can log in to the system and reach a command prompt.

• This uses the post-increment unary operator (although since the variable is not printed till the END block pre-increment would produce the same result).
• The body block of this script includes a pattern match so that the contained code executes only if the last field of the line contains the pattern /bin/bash.
• Note: A regular expression is enclosed between slashes (/.../), so a forward slash (/) inside the expression must be escaped as \/ to keep it from being interpreted as the end of the expression.
• When a line matches, variable ‘n’ gets incremented by one. The final value is printed from the END block.

Example: Print number of shell users.

$ awk -F ':' '$NF ~ /\/bin\/bash/ { n++ }; END { print n }' /etc/passwd
2
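
The same count can be tried without touching the real /etc/passwd. The sketch below feeds three made-up passwd-style lines (the accounts in passwd_sample are hypothetical) to awk. It also assumes two small variations on the command above: the pattern is written as a string, which awk treats as a regular expression on the right-hand side of ~ and which needs no escaped slashes, and the END block prints n + 0 so that a file with no matches reports 0 instead of an empty line.

```shell
# Hypothetical sample data in /etc/passwd format; not real accounts.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Count the lines whose last field contains /bin/bash.
count=$(printf '%s\n' "$passwd_sample" |
  awk -F ':' '$NF ~ "/bin/bash" { n++ } END { print n + 0 }')
echo "$count"
```

With this sample data the result is 2.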

64. Awk Arithmetic Operators

An operator that accepts two operands is called a binary operator. Binary operators are classified by usage (arithmetic, string, assignment, and so on).

The following operators are used for performing arithmetic calculations.

+ -Addition
- -Subtraction
* -Multiplication
/ -Division
% -Modulo Division
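
As a quick sketch of all five operators on two illustrative values (a and b are not taken from the sample files): awk performs floating-point arithmetic, so / does not truncate. The built-in int() function can be used when a whole-number quotient is wanted.

```shell
# Apply each arithmetic operator to the same pair of numbers.
result=$(awk 'BEGIN {
  a = 7; b = 2
  print a + b, a - b, a * b, a / b, a % b
  print int(a / b)   # int() drops the fractional part
}')
echo "$result"
```

This should print "9 5 14 3.5 1" followed by 3.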

The following example uses the binary operators -, * and /.

This example does two things:

  1. Reduces the price of every single item by 20%
  2. Reduces the quantity of every single item by 1.

Create and run awk arithmetic example:

$ vi arithmetic.awk
BEGIN {
FS=",";
OFS=",";
item_discount=0;
}
{
item_discount=$4*20/100;
print $1,$2,$3,$4-item_discount,$5-1
}
$ awk -f arithmetic.awk items.txt
101,HD Camcorder,Video,168,9
102,Refrigerator,Appliance,680,1
103,MP3 Player,Audio,216,14
104,Tennis Racket,Sports,152,19
105,Laser Printer,Office,380,4

The following example prints all the even numbered lines from the input file. The row number of each line is checked to see if it is a multiple of 2, and if so the default operation (print the whole line) is executed.

Demonstrate modulo division:

$ awk 'NR % 2 == 0' items.txt
102,Refrigerator,Appliance,850,2
104,Tennis Racket,Sports,190,20

65. Awk String Operator

(space) is a string operator that does string concatenation.

In the following example, string concatenation happens at three locations. In the statement "string3=string1 string2", string3 contains the concatenated value of string1 and string2. Each print statement does a string concatenation with a static string and an awk variable.

Note: This operator is why you must separate the values in a print statement with a comma if you want to print the OFS in between. If you do not include a comma to separate the values, the values are concatenated instead.

$ cat string.awk
BEGIN {
FS=",";
OFS=",";
string1="Audio";
string2="Video";
numberstring="100";
string3=string1 string2;
print "Concatenate string is:" string3;
numberstring=numberstring+1;
print "String to number:" numberstring;
}
$ awk -f string.awk items.txt
Concatenate string is:AudioVideo
String to number:101
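
The difference between a space and a comma in a print statement can be sketched in isolation. OFS is set to a dash here purely to make the separator visible:

```shell
# Space concatenates; comma inserts the output field separator (OFS).
result=$(awk 'BEGIN {
  OFS = "-"
  print "Audio" "Video"    # space: values are concatenated
  print "Audio", "Video"   # comma: values are joined with OFS
}')
echo "$result"
```

This should print AudioVideo on the first line and Audio-Video on the second.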

66. Assignment Operators

Just like most other programming languages, awk uses = as the assignment operator. Like C, awk also supports shortcut assignment operators that modify a variable rather than replacing its value.

  • "=" -Assignment
  • "+=" -Shortcut addition assignment
  • "-=" -Shortcut subtraction assignment
  • "*=" -Shortcut multiplication assignment
  • "/=" -Shortcut division assignment
  • "%=" -Shortcut modulo division assignment

The following example shows how to use the assignment operators:

$ cat assignment.awk
BEGIN {
FS=",";
OFS=",";
total1 = total2 = total3 = total4 = total5 = 10;
total1 += 5; print total1;
total2 -= 5; print total2;
total3 *= 5; print total3;
total4 /= 5; print total4;
total5 %= 5; print total5;
}
$ awk -f assignment.awk
15
5
50
2
0

The following example uses the += shortcut assignment operator.

Display the total amount of inventory available across all items:

$ awk -F ',' 'BEGIN { total=0 } { total+=$5 } END { print "Total Quantity: " total }' items.txt
Total Quantity: 52

The next example counts the total number of fields in a file. The awk script matches all lines and keeps adding the number of fields in each line using the shortcut addition assignment operator. The number of fields seen so far is kept in a variable named ‘total’. Once the input file is processed, the END block is executed, which prints the total number of fields.

Count total number of fields in items.txt:

$ awk -F ',' 'BEGIN { total=0 } { total += NF }; END { print total }' items.txt
25

67. Awk Comparison Operators

Awk supports the standard comparison operators that are listed below.

> -Is greater than
>= -Is greater than or equal to
< -Is less than
<= -Is less than or equal to
== -Is equal to
!= -Is not equal to
&& -Both the conditional expressions are true
|| -Either one of the conditional expressions is true

A note on the following examples: If you don't specify any action, awk will print the whole record if it matches the conditional comparison.

The following example uses <= condition. This displays all the items that are at or below the critical inventory level of 5:

$ awk -F "," '$5 <= 5' items.txt
102,Refrigerator,Appliance,850,2
105,Laser Printer,Office,475,5

The following example uses == condition. This displays the record with the item number 103:

$ awk -F "," '$1 == 103' items.txt
103,MP3 Player,Audio,270,15

Note: Don't confuse the == (equality comparison) operator with = (assignment).
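
A small sketch of what goes wrong when = is typed in place of ==, using two records in the items.txt format (the shell variable names sample, compared and assigned are illustrative). As a pattern, $1 = 103 assigns 103 to the first field of every record; the assigned value is non-zero and therefore true, so every record is printed. Because a field was modified, awk rebuilds the record with the default OFS, a single space.

```shell
sample='101,HD Camcorder
103,MP3 Player'

# Comparison: only the record whose first field equals 103 is printed.
compared=$(printf '%s\n' "$sample" | awk -F ',' '$1 == 103')

# Assignment: field 1 is overwritten in every record, and all are printed.
assigned=$(printf '%s\n' "$sample" | awk -F ',' '$1 = 103')

echo "$compared"
echo "$assigned"
```

The comparison prints only "103,MP3 Player", while the assignment prints "103 HD Camcorder" and "103 MP3 Player".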

Print only the description of the item with number 103:

$ awk -F "," '$1 == 103 {print $2}' items.txt
MP3 Player

The following example uses != condition. This prints all items except those in the category Video:

$ awk -F "," '$3 != "Video"' items.txt
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5

Same as above, but prints only the item description:

$ awk -F "," '$3 != "Video" {print $2}' items.txt
Refrigerator
MP3 Player
Tennis Racket
Laser Printer

The following example uses && (AND operator) to check two conditions. This prints the record where the cost is under 900 AND the quantity is less than or equal to the critical inventory level of 5.

$ awk -F "," '$4 < 900 && $5 <= 5' items.txt
102,Refrigerator,Appliance,850,2
105,Laser Printer,Office,475,5

Same as above, but prints only the item description:

$ awk -F "," '$4 < 900 && $5 <= 5 {print $2}' items.txt
Refrigerator
Laser Printer

The following example uses || (OR operator) to check two conditions. This prints records where the cost is less than 900 OR the quantity is at or under the critical inventory level of 5.

$ awk -F "," '$4 < 900 || $5 <= 5' items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5

Same as above, but prints only the item description:

$ awk -F "," '$4 < 900 || $5 <= 5 {print $2}' items.txt
HD Camcorder
Refrigerator
MP3 Player
Tennis Racket
Laser Printer

The following example uses > (Greater than) condition. This example displays the uid (and the full line) from /etc/passwd that has the highest USER ID value. This awk script keeps track of the largest number (of field 3) in the variable ‘maxuid’ and keeps a copy of the corresponding line in the variable ‘maxline’. Once it has looped over all the lines, it prints the uid and the line.

$ awk -F ':' '$3 > maxuid { maxuid=$3; maxline=$0 }; END { print maxuid, maxline }' /etc/passwd
112 gdm:x:112:119:Gnome Display Manager:/var/lib/gdm:/bin/false

The following example uses == condition. This example prints every line from the /etc/passwd file that has the same USER ID and GROUP ID. This awk script prints the line only if $3 (USER ID) and $4 (GROUP ID) are equal.

$ awk -F ':' '$3==$4' /etc/passwd
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh

The following example uses >= and && conditions. This example prints any line from /etc/passwd where the USER ID >= 100 AND the user's shell is /bin/sh.

$ awk -F ':' '$3>=100 && $NF ~ /\/bin\/sh/' /etc/passwd
libuuid:x:100:101::/var/lib/libuuid:/bin/sh

The following example uses == condition. This example prints all the lines from /etc/passwd that don't have a comment (field 5).

$ awk -F ':' '$5 == "" ' /etc/passwd
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:102::/home/syslog:/bin/false
saned:x:110:116::/home/saned:/bin/false

68. Regular Expression Operators

~ -Match operator
!~ -No Match operator

When you use the == condition, awk looks for a full match. The following example doesn't print anything, as none of the 2nd fields in the items.txt file exactly matches the keyword "Tennis". "Tennis Racket" is not a full match.

Print lines where field two is “Tennis”:

$ awk -F "," '$2 == "Tennis"' items.txt

When you use the match operator ~, awk looks for a partial match, i.e. it looks for a field that “contains” the match string.

Print lines where field two contains “Tennis”:

$ awk -F "," '$2 ~ "Tennis"' items.txt
104,Tennis Racket,Sports,190,20

The !~ operator is the opposite of ~, i.e. “does not contain”.

Print lines where field two does not contain “Tennis”:

$ awk -F "," '$2 !~ "Tennis"' items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
105,Laser Printer,Office,475,5
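
Between "contains" and "exactly equals" there is a middle ground: the regular expression anchors ^ (start) and $ (end) restrict where the match may occur. The sketch below is an illustration under assumptions, not part of items.txt: record 106 is a hypothetical item added so that "Tennis" appears in the middle of a field.

```shell
# Record 104 is from items.txt; record 106 is hypothetical.
items='104,Tennis Racket,Sports,190,20
106,Table Tennis Set,Sports,60,8'

# Contains "Tennis" anywhere in field 2.
contains=$(printf '%s\n' "$items" | awk -F ',' '$2 ~ /Tennis/ { print $1 }')

# Field 2 starts with "Tennis".
starts=$(printf '%s\n' "$items" | awk -F ',' '$2 ~ /^Tennis/ { print $1 }')

# Field 2 is exactly "Tennis" (behaves like == "Tennis" here).
exact=$(printf '%s\n' "$items" | awk -F ',' '$2 ~ /^Tennis$/ { print $1 }')

echo "contains: $contains"
echo "starts: $starts"
echo "exact: $exact"
```

With this data, the unanchored pattern matches items 104 and 106, the ^ version matches only 104, and the fully anchored version matches nothing.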

The next example prints the total number of users who use /bin/bash as their shell. In this awk script, when the last field of a line contains the pattern "/bin/bash", the awk variable ‘n’ gets incremented by one.

$ awk -F ':' '$NF ~ /\/bin\/bash/ { n++ }; END { print n }' /etc/passwd
2