AWK is an interpreted programming language. It is very powerful and specially designed for text processing.
AWK – Basic Examples
This chapter describes several useful AWK commands and their appropriate examples. Consider a text file marks.txt to be processed with the following content −
1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89
Printing Column or Field
You can instruct AWK to print only certain columns from the input file. The following example demonstrates this −
Example
[jerry]$ awk '{print $3 "\t" $4}' marks.txt
On executing this code, you get the following result −
Output
Physics 80
Maths 90
Biology 87
English 85
History 89
In the file marks.txt, the third column contains the subject name and the fourth column contains the marks obtained in a particular subject. Let us print these two columns using AWK print command. In the above example, $3 and $4 represent the third and the fourth fields respectively from the input record.
Printing All Lines
By default, AWK prints all the lines that match a pattern.
Example
[jerry]$ awk '/a/ {print $0}' marks.txt
On executing this code, you get the following result −
Output
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89
In the above example, we are searching for the pattern a. When a pattern match succeeds, AWK executes the command in the body block. In the absence of a body block, the default action is taken, which is to print the record. Hence, the following command produces the same result −
Example
[jerry]$ awk '/a/' marks.txt
Printing Columns by Pattern
When a pattern match succeeds, AWK prints the entire record by default. But you can instruct AWK to print only certain fields. For instance, the following example prints the third and fourth field when a pattern match succeeds.
Example
[jerry]$ awk '/a/ {print $3 "\t" $4}' marks.txt
On executing this code, you get the following result −
Output
Maths 90
Biology 87
English 85
History 89
Printing Column in Any Order
You can print columns in any order. For instance, the following example prints the fourth column followed by the third column.
Example
[jerry]$ awk '/a/ {print $4 "\t" $3}' marks.txt
On executing the above code, you get the following result −
Output
90 Maths
87 Biology
85 English
89 History
Counting and Printing Matched Pattern
Let us see an example where you can count and print the number of lines for which a pattern match succeeded.
Example
[jerry]$ awk '/a/{++cnt} END {print "Count =", cnt}' marks.txt
On executing this code, you get the following result −
Output
Count = 4
In this example, we increment the value of counter when a pattern match succeeds and we print this value in the END block. Note that unlike other programming languages, there is no need to declare a variable before using it.
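The same undeclared-variable idiom extends to running totals. The following is a minimal sketch, assuming the marks.txt records are supplied on standard input through a shell variable (the variable names marks, summary, total and n are chosen here for illustration):

```shell
# Sample data mirrors marks.txt; it is fed on stdin so no file is needed.
marks='1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89'

# total and n are never declared; AWK creates them on first use, starting at 0.
summary=$(printf '%s\n' "$marks" | awk '{ total += $4; ++n } END { print "Total =", total, "Average =", total / n }')
echo "$summary"
```

Because the END block runs after the last record, it is the natural place to report values accumulated while reading the input.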
Printing Lines with More than 18 Characters
Let us print only those lines that contain more than 18 characters.
Example
[jerry]$ awk 'length($0) > 18' marks.txt
On executing this code, you get the following result −
Output
3) Shyam Biology 87
4) Kedar English 85
AWK provides a built-in length function that returns the length of the string. $0 variable stores the entire line and in the absence of a body block, default action is taken, i.e., the print action. Hence, if a line has more than 18 characters, then the comparison results true and the line gets printed.
AWK – Built-in Variables
AWK provides several built-in variables. They play an important role while writing AWK scripts. This chapter demonstrates the usage of built-in variables.
Standard AWK variables
The standard AWK variables are discussed below.
ARGC
It holds the number of arguments provided at the command line.
Example
[jerry]$ awk 'BEGIN {print "Arguments =", ARGC}' One Two Three Four
On executing this code, you get the following result −
Output
Arguments = 5
But why does AWK show 5 when you passed only 4 arguments? The program name awk itself is counted as the first argument, as the following ARGV example shows.
ARGV
It is an array that stores the command-line arguments. The array’s valid index ranges from 0 to ARGC-1.
Example
[jerry]$ awk 'BEGIN { for (i = 0; i < ARGC; ++i) { printf "ARGV[%d] = %s\n", i, ARGV[i] } }' one two three four
On executing this code, you get the following result −
Output
ARGV[0] = awk
ARGV[1] = one
ARGV[2] = two
ARGV[3] = three
ARGV[4] = four
CONVFMT
It represents the conversion format for numbers. Its default value is %.6g.
Example
[jerry]$ awk 'BEGIN { print "Conversion Format =", CONVFMT }'
On executing this code, you get the following result −
Output
Conversion Format = %.6g
ENVIRON
It is an associative array of environment variables.
Example
[jerry]$ awk 'BEGIN { print ENVIRON["USER"] }'
On executing this code, you get the following result −
Output
jerry
To find names of other environment variables, use env command.
FILENAME
It represents the current file name.
Example
[jerry]$ awk 'END {print FILENAME}' marks.txt
On executing this code, you get the following result −
Output
marks.txt
Please note that FILENAME is undefined in the BEGIN block.
FS
It represents the (input) field separator and its default value is space. You can also change this by using -F command line option.
Example
[jerry]$ awk 'BEGIN {print "FS = " FS}' | cat -vte
On executing this code, you get the following result −
Output
FS = $
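As a brief illustration of the -F option mentioned above, the sketch below splits colon-separated records. The sample records and the variable name users are hypothetical and not part of the original text.

```shell
# Hypothetical colon-separated records; -F: sets FS to ":" for this run.
users=$(printf 'jerry:1000:/home/jerry\ntom:1001:/home/tom\n' | awk -F: '{ print $1, $3 }')
echo "$users"
```

Setting FS in a BEGIN block, as in BEGIN { FS = ":" }, should have the same effect as -F: here.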
NF
It represents the number of fields in the current record. For instance, the following example prints only those lines that contain more than two fields.
Example
[jerry]$ echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NF > 2'
On executing this code, you get the following result −
Output
One Two Three
One Two Three Four
NR
It represents the number of the current record. For instance, the following example prints a record only if its record number is less than three, that is, the first two records.
Example
[jerry]$ echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NR < 3'
On executing this code, you get the following result −
Output
One Two
One Two Three
FNR
It is similar to NR, but relative to the current file. It is useful when AWK is operating on multiple files. Value of FNR resets with new file.
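A small sketch can make the difference between NR and FNR visible. It assumes two throwaway files created in a temporary directory; the file names first.txt and second.txt are hypothetical.

```shell
# Two throwaway files in a temporary directory (names are illustrative only).
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\nd\n' > "$dir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each new file.
counts=$(awk '{ print NR, FNR, $0 }' "$dir/first.txt" "$dir/second.txt")
echo "$counts"
rm -r "$dir"
```

The first column keeps growing from 1 to 4, while the second column goes 1, 2 and then starts again at 1 for the second file.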
OFMT
It represents the output format for numbers and its default value is %.6g.
Example
[jerry]$ awk 'BEGIN {print "OFMT = " OFMT}'
On executing this code, you get the following result −
Output
OFMT = %.6g
OFS
It represents the output field separator and its default value is space.
Example
[jerry]$ awk 'BEGIN {print "OFS = " OFS}' | cat -vte
On executing this code, you get the following result −
Output
OFS = $
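To see OFS take effect, it helps to change it. The sketch below is one possible demonstration; the variable names csv and pair are introduced here for illustration.

```shell
# Assigning to a field ($1 = $1) makes AWK rebuild $0 with OFS between fields.
csv=$(echo "One Two Three" | awk 'BEGIN { OFS = "," } { $1 = $1; print }')
echo "$csv"

# A comma between print arguments also inserts OFS.
pair=$(echo "One Two Three" | awk 'BEGIN { OFS = "-" } { print $1, $3 }')
echo "$pair"
```

Note that a plain print $0 without any field assignment would leave the record unchanged, because $0 is only rebuilt when a field is modified.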
ORS
It represents the output record separator and its default value is newline.
Example
[jerry]$ awk 'BEGIN {print "ORS = " ORS}' | cat -vte
On executing the above code, you get the following result −
Output
ORS = $
$
RLENGTH
It represents the length of the string matched by match function. AWK’s match function searches for a given string in the input-string.
Example
[jerry]$ awk 'BEGIN { if (match("One Two Three", "re")) { print RLENGTH } }'
On executing this code, you get the following result −
Output
2
RS
It represents (input) record separator and its default value is newline.
Example
[jerry]$ awk 'BEGIN {print "RS = " RS}' | cat -vte
On executing this code, you get the following result −
Output
RS = $
$
RSTART
It represents the first position in the string matched by match function.
Example
[jerry]$ awk 'BEGIN { if (match("One Two Three", "Thre")) { print RSTART } }'
On executing this code, you get the following result −
Output
9
SUBSEP
It represents the separator character for array subscripts and its default value is \034.
Example
[jerry]$ awk 'BEGIN { print "SUBSEP = " SUBSEP }' | cat -vte
On executing this code, you get the following result −
Output
SUBSEP = ^\$
$0
It represents the entire input record.
Example
[jerry]$ awk '{print $0}' marks.txt
On executing this code, you get the following result −
Output
1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89
$n
It represents the nth field in the current record where the fields are separated by FS.
Example
[jerry]$ awk '{print $3 "\t" $4}' marks.txt
On executing this code, you get the following result −
Output
Physics 80
Maths 90
Biology 87
English 85
History 89
GNU AWK Specific Variables
GNU AWK specific variables are as follows −
ARGIND
It represents the index in ARGV of the current file being processed.
Example
[jerry]$ awk '{ print "ARGIND = ", ARGIND; print "Filename = ", ARGV[ARGIND] }' junk1 junk2 junk3
On executing this code, you get the following result −
Output
ARGIND = 1
Filename = junk1
ARGIND = 2
Filename = junk2
ARGIND = 3
Filename = junk3
BINMODE
It is used to specify binary mode for all file I/O on non-POSIX systems. Numeric values of 1, 2, or 3 specify that input files, output files, or all files, respectively, should use binary I/O. String values of r or w specify that input files or output files, respectively, should use binary I/O. String values of rw or wr specify that all files should use binary I/O.
ERRNO
It is a string that describes the error when a redirection fails for getline or when a close call fails.
Example
[jerry]$ awk 'BEGIN { ret = getline < "junk.txt"; if (ret == -1) print "Error:", ERRNO }'
On executing this code, you get the following result −
Output
Error: No such file or directory
FIELDWIDTHS
It holds a space-separated list of field widths. When this variable is set, GAWK parses the input into fields of fixed width, instead of using the value of the FS variable as the field separator.
IGNORECASE
When this variable is set, GAWK becomes case-insensitive. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN{IGNORECASE = 1} /amit/' marks.txt
On executing this code, you get the following result −
Output
1) Amit Physics 80
LINT
It provides dynamic control of the --lint option from within the GAWK program. When this variable is set, GAWK prints lint warnings. When assigned the string value fatal, lint warnings become fatal errors, exactly like --lint=fatal.
Example
[jerry]$ awk 'BEGIN {LINT = 1; a}'
On executing this code, you get the following result −
Output
awk: cmd. line:1: warning: reference to uninitialized variable `a'
awk: cmd. line:1: warning: statement has no effect
PROCINFO
This is an associative array containing information about the process, such as real and effective UID numbers, process ID number, and so on.
Example
[jerry]$ awk 'BEGIN { print PROCINFO["pid"] }'
On executing this code, you get the following result −
Output
4316
TEXTDOMAIN
It represents the text domain of the AWK program. It is used to find the localized translations for the program’s strings.
Example
[jerry]$ awk 'BEGIN { print TEXTDOMAIN }'
On executing this code, you get the following result −
Output
messages
The value messages is the default text domain, which GAWK uses unless the program assigns a different value to TEXTDOMAIN.
AWK – Operators
Like other programming languages, AWK also provides a large set of operators. This chapter explains AWK operators with suitable examples.
S.No. | Operators & Description |
---|---|
1 | Arithmetic Operators: AWK supports the following arithmetic operators. |
2 | Increment and Decrement Operators: AWK supports the following increment and decrement operators. |
3 | Assignment Operators: AWK supports the following assignment operators. |
4 | Relational Operators: AWK supports the following relational operators. |
5 | Logical Operators: AWK supports the following logical operators. |
6 | Ternary Operator: We can easily implement a condition expression using ternary operator. |
7 | Unary Operators: AWK supports the following unary operators. |
8 | Exponential Operators: There are two formats of exponential operators. |
9 | String Concatenation Operator: Space is a string concatenation operator that merges two strings. |
10 | Array Membership Operator: It is represented by in. It is used while accessing array elements. |
11 | Regular Expression Operators: This example explains the two forms of regular expressions operators. |
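Since the table only names the operator groups, a compact sketch that exercises several of them may help. The values of a and b, the array arr, and the shell variable ops are hypothetical choices for this illustration; only the ^ form of exponentiation is used.

```shell
ops=$(awk 'BEGIN {
    a = 7; b = 2
    print a + b, a - b, a * b, a % b                  # arithmetic
    print a ^ b                                       # exponentiation
    print (a > b ? "a" : "b")                         # ternary
    s = "foo" "bar"; print s                          # concatenation by juxtaposition
    arr["x"] = 1
    print (("x" in arr) ? "present" : "absent")       # array membership
    print (("hello" ~ /^h/) ? "match" : "no match")   # regular expression match
}')
echo "$ops"
```

The extra parentheses around the comparison and match expressions keep print from interpreting > as an output redirection.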
AWK – Regular Expressions
AWK is very powerful and efficient in handling regular expressions. A number of complex tasks can be solved with simple regular expressions. Any command-line expert knows the power of regular expressions.
This chapter covers standard regular expressions with suitable examples.
Dot
It matches any single character except the end of line character. For instance, the following example matches fin, fun, fan etc.
Example
[jerry]$ echo -e "cat\nbat\nfun\nfin\nfan" | awk '/f.n/'
On executing the above code, you get the following result −
Output
fun
fin
fan
Start of line
It matches the start of line. For instance, the following example prints all the lines that start with pattern The.
Example
[jerry]$ echo -e "This\nThat\nThere\nTheir\nthese" | awk '/^The/'
On executing this code, you get the following result −
Output
There
Their
End of line
It matches the end of line. For instance, the following example prints the lines that end with the letter n.
Example
[jerry]$ echo -e "knife\nknow\nfun\nfin\nfan\nnine" | awk '/n$/'
On executing this code, you get the following result −
Output
fun
fin
fan
Match character set
It is used to match only one out of several characters. For instance, the following example matches pattern Call and Tall but not Ball.
Example
[jerry]$ echo -e "Call\nTall\nBall" | awk '/[CT]all/'
On executing this code, you get the following result −
Output
Call
Tall
Exclusive set
In exclusive set, the caret negates the set of characters in the square brackets. For instance, the following example prints only Ball.
Example
[jerry]$ echo -e "Call\nTall\nBall" | awk '/[^CT]all/'
On executing this code, you get the following result −
Output
Ball
Alternation
A vertical bar allows regular expressions to be logically ORed. For instance, the following example prints Ball and Call.
Example
[jerry]$ echo -e "Call\nTall\nBall\nSmall\nShall" | awk '/Call|Ball/'
On executing this code, you get the following result −
Output
Call
Ball
Zero or One Occurrence
It matches zero or one occurrence of the preceding character. For instance, the following example matches Colour as well as Color. We have made u as an optional character by using ?.
Example
[jerry]$ echo -e "Colour\nColor" | awk '/Colou?r/'
On executing this code, you get the following result −
Output
Colour
Color
Zero or More Occurrence
It matches zero or more occurrences of the preceding character. For instance, the following example matches ca, cat, catt, and so on.
Example
[jerry]$ echo -e "ca\ncat\ncatt" | awk '/cat*/'
On executing this code, you get the following result −
Output
ca
cat
catt
One or More Occurrence
It matches one or more occurrences of the preceding character. For instance, the following example matches one or more occurrences of 2.
Example
[jerry]$ echo -e "111\n22\n123\n234\n456\n222" | awk '/2+/'
On executing the above code, you get the following result −
Output
22
123
234
222
Grouping
Parentheses () are used for grouping and the character | is used for alternatives. For instance, the following regular expression matches the lines containing either Apple Juice or Apple Cake.
Example
[jerry]$ echo -e "Apple Juice\nApple Pie\nApple Tart\nApple Cake" | awk '/Apple (Juice|Cake)/'
On executing this code, you get the following result −
Output
Apple Juice
Apple Cake
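The examples in this chapter match against the whole record. A regular expression can also be tested against a single field with the ~ operator. The sketch below assumes the marks.txt records are supplied on standard input; the shell variable names are illustrative.

```shell
# Sample data mirrors marks.txt from the Basic Examples chapter.
marks='1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89'

# Anchors plus grouping: the third field must be exactly Physics or Biology.
names=$(printf '%s\n' "$marks" | awk '$3 ~ /^(Physics|Biology)$/ { print $2 }')
echo "$names"
```

Using !~ in place of ~ would select the records whose third field does not match.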
AWK – Arrays
AWK has associative arrays. One of their best features is that the indexes need not be a continuous set of numbers; you can use either a string or a number as an array index. Also, there is no need to declare the size of an array in advance: arrays can grow and shrink at runtime.
Its syntax is as follows −
Syntax
array_name[index] = value
Where array_name is the name of the array, index is the array index, and value is the value assigned to that element of the array.
Creating Array
To gain more insight on array, let us create and access the elements of an array.
Example
[jerry]$ awk 'BEGIN { fruits["mango"] = "yellow"; fruits["orange"] = "orange"; print fruits["orange"] "\n" fruits["mango"] }'
On executing this code, you get the following result −
Output
orange
yellow
In the above example, we declare the array as fruits whose index is fruit name and the value is the color of the fruit. To access array elements, we use array_name[index] format.
Deleting Array Elements
For insertion, we used assignment operator. Similarly, we can use delete statement to remove an element from the array. The syntax of delete statement is as follows −
Syntax
delete array_name[index]
The following example deletes the element orange. Hence the print statement finds no value for that index and prints only an empty line.
Example
[jerry]$ awk 'BEGIN { fruits["mango"] = "yellow"; fruits["orange"] = "orange"; delete fruits["orange"]; print fruits["orange"] }'
Multi-Dimensional arrays
AWK only supports one-dimensional arrays. But you can easily simulate a multi-dimensional array using the one-dimensional array itself.
For instance, given below is a 3×3 two-dimensional array −
100 200 300
400 500 600
700 800 900
In the above example, array[0][0] stores 100, array[0][1] stores 200, and so on. To store 100 at array location [0][0], we can use the following syntax −
Syntax
array["0,0"] = 100
Though we gave 0,0 as index, these are not two indexes. In reality, it is just one index with the string 0,0.
The following example simulates a 2-D array −
Example
[jerry]$ awk 'BEGIN {
   array["0,0"] = 100;
   array["0,1"] = 200;
   array["0,2"] = 300;
   array["1,0"] = 400;
   array["1,1"] = 500;
   array["1,2"] = 600;

   # print array elements
   print "array[0,0] = " array["0,0"];
   print "array[0,1] = " array["0,1"];
   print "array[0,2] = " array["0,2"];
   print "array[1,0] = " array["1,0"];
   print "array[1,1] = " array["1,1"];
   print "array[1,2] = " array["1,2"];
}'
On executing this code, you get the following result −
Output
array[0,0] = 100
array[0,1] = 200
array[0,2] = 300
array[1,0] = 400
array[1,1] = 500
array[1,2] = 600
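AWK also accepts comma-separated subscripts directly. In that form the subscripts are joined with the SUBSEP character into a single key, so array[0, 0] is not the same element as array["0,0"]. The sketch below is an illustration under that assumption, using a hypothetical 2×2 grid and an illustrative shell variable named grid.

```shell
grid=$(awk 'BEGIN {
    # array[i, j] is stored under the single key  i SUBSEP j
    array[0, 0] = 100; array[0, 1] = 200
    array[1, 0] = 400; array[1, 1] = 500

    for (i = 0; i < 2; ++i)
        for (j = 0; j < 2; ++j)
            if ((i, j) in array)          # membership test for a compound subscript
                print "array[" i "," j "] = " array[i, j]
}')
echo "$grid"
```

This form lets the subscripts be computed in loops, which the literal string index "0,0" does not allow as conveniently.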
You can also perform a variety of operations on an array, such as sorting its elements or indexes. For that purpose, GAWK provides the asort and asorti functions.
AWK – Control Flow
Like other programming languages, AWK provides conditional statements to control the flow of a program. This chapter explains AWK’s control statements with suitable examples.
If statement
It simply tests the condition and performs certain actions depending upon the condition. Given below is the syntax of if statement −
Syntax
if (condition) action
We can also use a pair of curly braces as given below to execute multiple actions −
Syntax
if (condition) {
   action-1
   action-2
   .
   .
   action-n
}
For instance, the following example checks whether a number is even or not −
Example
[jerry]$ awk 'BEGIN {num = 10; if (num % 2 == 0) printf "%d is even number.\n", num }'
On executing the above code, you get the following result −
Output
10 is even number.
If Else Statement
In if-else syntax, we can provide a list of actions to be performed when a condition becomes false.
The syntax of if-else statement is as follows −
Syntax
if (condition) action-1 else action-2
In the above syntax, action-1 is performed when the condition evaluates to true and action-2 is performed when the condition evaluates to false. For instance, the following example checks whether a number is even or not −
Example
[jerry]$ awk 'BEGIN { num = 11; if (num % 2 == 0) printf "%d is even number.\n", num; else printf "%d is odd number.\n", num }'
On executing this code, you get the following result −
Output
11 is odd number.
If-Else-If Ladder
We can easily create an if-else-if ladder by using multiple if-else statements. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { a = 30; if (a==10) print "a = 10"; else if (a == 20) print "a = 20"; else if (a == 30) print "a = 30"; }'
On executing this code, you get the following result −
Output
a = 30
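The same ladder can be driven by input records rather than a fixed value. The following sketch assigns a letter grade to each record of marks.txt; the grade thresholds (90 and 85) are hypothetical and chosen only for illustration.

```shell
# Sample data mirrors marks.txt from the Basic Examples chapter.
marks='1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89'

# The first condition that holds decides the grade; the final else is the fallback.
grades=$(printf '%s\n' "$marks" | awk '{
    if ($4 >= 90) grade = "A"
    else if ($4 >= 85) grade = "B"
    else grade = "C"
    print $2, grade
}')
echo "$grades"
```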
AWK – Loops
This chapter explains AWK’s loops with suitable examples. Loops are used to execute a set of actions in a repeated manner. The loop execution continues as long as the loop condition is true.
For Loop
The syntax of for loop is −
Syntax
for (initialisation; condition; increment/decrement) action
Initially, the for statement performs initialization action, then it checks the condition. If the condition is true, it executes actions, thereafter it performs increment or decrement operation. The loop execution continues as long as the condition is true. For instance, the following example prints 1 to 5 using for loop −
Example
[jerry]$ awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }'
On executing this code, you get the following result −
Output
1
2
3
4
5
While Loop
The while loop keeps executing the action as long as a particular logical condition evaluates to true. Here is the syntax of the while loop −
Syntax
while (condition) action
AWK first checks the condition; if the condition is true, it executes the action. This process repeats as long as the loop condition evaluates to true. For instance, the following example prints 1 to 5 using while loop −
Example
[jerry]$ awk 'BEGIN {i = 1; while (i < 6) { print i; ++i } }'
On executing this code, you get the following result −
Output
1
2
3
4
5
Do-While Loop
The do-while loop is similar to the while loop, except that the test condition is evaluated at the end of the loop. Here is the syntax of the do-while loop −
Syntax
do
   action
while (condition)
In a do-while loop, the action statement gets executed at least once even when the condition statement evaluates to false. For instance, the following example prints 1 to 5 numbers using do-while loop −
Example
[jerry]$ awk 'BEGIN {i = 1; do { print i; ++i } while (i < 6) }'
On executing this code, you get the following result −
Output
1
2
3
4
5
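The claim that a do-while body runs at least once is easiest to see when the condition is false from the start. The sketch below runs a while loop and a do-while loop with the same false condition; the starting value 10 and the variable name once are arbitrary choices for this illustration.

```shell
once=$(awk 'BEGIN {
    i = 10
    while (i < 6) { print "while:", i; ++i }         # condition false: body never runs
    do { print "do-while:", i; ++i } while (i < 6)   # body runs once before the test
}')
echo "$once"
```

Only the do-while line appears in the output, because the while loop tests its condition before the first iteration.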
Break Statement
As its name suggests, it is used to end the loop execution. Here is an example which ends the loop when the sum becomes greater than 50.
Example
[jerry]$ awk 'BEGIN { sum = 0; for (i = 0; i < 20; ++i) { sum += i; if (sum > 50) break; else print "Sum =", sum } }'
On executing this code, you get the following result −
Output
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45
Continue Statement
The continue statement is used inside a loop to skip to the next iteration of the loop. It is useful when you wish to skip the processing of some data inside the loop. For instance, the following example uses continue statement to print the even numbers between 1 and 20.
Example
[jerry]$ awk 'BEGIN { for (i = 1; i <= 20; ++i) { if (i % 2 == 0) print i ; else continue } }'
On executing this code, you get the following result −
Output
2
4
6
8
10
12
14
16
18
20
Exit Statement
It is used to stop the execution of the script. It accepts an integer as an argument which is the exit status code for AWK process. If no argument is supplied, exit returns status zero. Here is an example that stops the execution when the sum becomes greater than 50.
Example
[jerry]$ awk 'BEGIN { sum = 0; for (i = 0; i < 20; ++i) { sum += i; if (sum > 50) exit(10); else print "Sum =", sum } }'
On executing this code, you get the following result −
Output
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45
Let us check the return status of the script.
Example
[jerry]$ echo $?
On executing this code, you get the following result −
Output
10
AWK – Built-in Functions
AWK has a number of functions built into it that are always available to the programmer. This chapter describes Arithmetic, String, Time, Bit manipulation, and other miscellaneous functions with suitable examples.
S.No. | Built in functions & Description |
---|---|
1 | Arithmetic Functions: AWK has the following built-in arithmetic functions. |
2 | String Functions: AWK has the following built-in String functions. |
3 | Time Functions: AWK has the following built-in time functions. |
4 | Bit Manipulation Functions: AWK has the following built-in bit manipulation functions. |
5 | Miscellaneous Functions: AWK has the following miscellaneous functions. |
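As a quick taste of the arithmetic and string groups, the sketch below calls a handful of functions that standard AWK provides. The sample string and the variable names funcs, s, n and parts are illustrative; the time and bit manipulation functions are left out of this sketch.

```shell
funcs=$(awk 'BEGIN {
    s = "Sherlock Holmes"
    print length(s)               # number of characters
    print index(s, "Holmes")      # 1-based position of a substring
    print substr(s, 1, 8)         # 8 characters starting at position 1
    print toupper(s)              # upper-case copy
    n = split("a:b:c", parts, ":")
    print n, parts[1], parts[3]   # element count and two of the pieces
    print int(7.9), sqrt(16)      # truncation and square root
}')
echo "$funcs"
```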
AWK – User Defined Functions
Functions are basic building blocks of a program. AWK allows us to define our own functions. A large program can be divided into functions and each function can be written/tested independently. It provides re-usability of code.
Given below is the general format of a user-defined function −
Syntax
function function_name(argument1, argument2, ...) { function body }
In this syntax, the function_name is the name of the user-defined function. Function name should begin with a letter and the rest of the characters can be any combination of numbers, alphabetic characters, or underscore. AWK’s reserved words cannot be used as function names.
Functions can accept multiple arguments separated by comma. Arguments are not mandatory. You can also create a user-defined function without any argument.
The function body consists of one or more AWK statements.
Let us write two functions that calculate the minimum and the maximum number and call these functions from another function called main. The functions.awk file contains −
Example
# Returns minimum number
function find_min(num1, num2){
   if (num1 < num2)
      return num1
   return num2
}
# Returns maximum number
function find_max(num1, num2){
   if (num1 > num2)
      return num1
   return num2
}
# Main function
function main(num1, num2){
   # Find minimum number
   result = find_min(num1, num2)
   print "Minimum =", result
   # Find maximum number
   result = find_max(num1, num2)
   print "Maximum =", result
}
# Script execution starts here
BEGIN {
   main(10, 20)
}
On executing this code, you get the following result −
Output
Minimum = 10
Maximum = 20
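Variables assigned inside an AWK function are global unless they appear in the parameter list. A common convention, sketched below, is to list extra parameters that callers never pass and use them as local variables. The function name factorial and its parameters are hypothetical examples, not part of functions.awk.

```shell
fact=$(awk '
# Extra parameters (here: i, result) act as local variables by convention.
function factorial(n,    i, result) {
    result = 1
    for (i = 2; i <= n; ++i)
        result *= i
    return result
}
BEGIN { print factorial(5) }')
echo "$fact"
```

The extra spaces before i in the parameter list are only a visual hint that separates real arguments from locals.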
AWK – Output Redirection
So far, we displayed data on standard output stream. We can also redirect data to a file. A redirection appears after the print or printf statement. Redirections in AWK are written just like redirection in shell commands, except that they are written inside the AWK program. This chapter explains redirection with suitable examples.
Redirection Operator
The syntax of the redirection operator is −
Syntax
print DATA > output-file
It writes the data into the output-file. If the output-file does not exist, then it creates one. When this type of redirection is used, the output-file is erased before the first output is written to it. Subsequent write operations to the same output-file do not erase the output-file, but append to it. For instance, the following example writes Hello, World !!! to the file.
Let us create a file with some text data.
Example
[jerry]$ echo "Old data" > /tmp/message.txt
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result −
Output
Old data
Now let us redirect some contents into it using AWK’s redirection operator.
Example
[jerry]$ awk 'BEGIN { print "Hello, World !!!" > "/tmp/message.txt" }'
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result −
Output
Hello, World !!!
Append Operator
The syntax of append operator is as follows −
Syntax
print DATA >> output-file
It appends the data into the output-file. If the output-file does not exist, then it creates one. When this type of redirection is used, new contents are appended at the end of file. For instance, the following example appends Hello, World !!! to the file.
Let us create a file with some text data.
Example
[jerry]$ echo "Old data" > /tmp/message.txt
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result −
Output
Old data
Now let us append some contents to it using AWK’s append operator.
Example
[jerry]$ awk 'BEGIN { print "Hello, World !!!" >> "/tmp/message.txt" }'
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result −
Output
Old data
Hello, World !!!
Pipe
It is possible to send output to another program through a pipe instead of using a file. This redirection opens a pipe to command, and writes the values of items through this pipe to another process to execute the command. The redirection argument command is actually an AWK expression. Here is the syntax of pipe −
Syntax
print items | command
Let us use tr command to convert lowercase letters to uppercase.
Example
[jerry]$ awk 'BEGIN { print "hello, world !!!" | "tr [a-z] [A-Z]" }'
On executing this code, you get the following result −
Output
HELLO, WORLD !!!
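A pipe is also handy when every record should pass through the same external command. The sketch below sends all records of marks.txt to a single sort process and closes the pipe in the END block; the choice of sort -n and the shell variable names are assumptions made for this illustration.

```shell
# Sample data mirrors marks.txt from the Basic Examples chapter.
marks='1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89'

# Every print writes to the same sort process; close() ends its input so it can emit the result.
sorted=$(printf '%s\n' "$marks" | awk '{ print $4, $2 | "sort -n" } END { close("sort -n") }')
echo "$sorted"
```

The string passed to close must be identical to the command string used in the pipe, since AWK identifies the open pipe by that string.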
Two way communication
GAWK can communicate with an external process using |&, which provides two-way communication. For instance, the following example uses tr command to convert lowercase letters to uppercase. Our command.awk file contains −
Example
BEGIN {
   cmd = "tr [a-z] [A-Z]"
   print "hello, world !!!" |& cmd
   close(cmd, "to")
   cmd |& getline out
   print out;
   close(cmd);
}
On executing this code, you get the following result −
Output
HELLO, WORLD !!!
Does the script look cryptic? Let us demystify it.
- The first statement, cmd = "tr [a-z] [A-Z]", is the command to which we establish the two-way communication from AWK.
- The next statement, i.e., the print command provides input to the tr command. Here |& indicates two-way communication.
- The third statement, i.e., close(cmd, "to"), closes the "to" end of the two-way pipe, the end that sends data to the process, so that tr sees the end of its input and can finish.
- The next statement cmd |& getline out stores the output into out variable with the aid of getline function.
- The next print statement prints the output and finally the close function closes the command.
AWK – Pretty Printing
So far we have used AWK’s print and printf functions to display data on standard output. But printf is much more powerful than what we have seen before. This function is borrowed from the C language and is very helpful while producing formatted output. Below is the syntax of the printf statement −
Syntax
printf fmt, expr-list
In the above syntax fmt is a string of format specifications and constants. expr-list is a list of arguments corresponding to format specifiers.
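Format specifiers can also carry a width and an alignment flag, which is what makes printf useful for tabular output. The sketch below is one possible layout for the marks.txt records; the column widths 8 and 5 and the | delimiters are arbitrary choices made for illustration.

```shell
# Sample data mirrors marks.txt from the Basic Examples chapter.
marks='1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89'

# %-8s left-aligns the name in 8 columns; %5d right-aligns the marks in 5 columns.
table=$(printf '%s\n' "$marks" | awk '{ printf "%-8s|%5d|\n", $2, $4 }')
echo "$table"
```

Unlike print, printf adds no newline on its own, so the format string ends with \n.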
Escape Sequences
Similar to any string, format can contain embedded escape sequences. Discussed below are the escape sequences supported by AWK −
New Line
The following example prints Hello and World in separate lines using newline character −
Example
[jerry]$ awk 'BEGIN { printf "Hello\nWorld\n" }'
On executing this code, you get the following result −
Output
Hello
World
Horizontal Tab
The following example uses horizontal tab to display different fields −
Example
[jerry]$ awk 'BEGIN { printf "Sr No\tName\tSub\tMarks\n" }'
On executing the above code, you get the following result −
Output
Sr No Name Sub Marks
Vertical Tab
The following example uses vertical tab after each field −
Example
[jerry]$ awk 'BEGIN { printf "Sr No\vName\vSub\vMarks\n" }'
On executing this code, you get the following result −
Output
Sr No
     Name
         Sub
            Marks
Backspace
The following example prints a backspace after every field except the last one. It erases the last number from the first three fields. For instance, Field 1 is displayed as Field, because the last character is erased with backspace. However, the last field Field 4 is displayed as it is, as we did not have a \b after Field 4.
Example
[jerry]$ awk 'BEGIN { printf "Field 1\bField 2\bField 3\bField 4\n" }'
On executing this code, you get the following result −
Output
Field Field Field Field 4
Carriage Return
In the following example, after printing every field, we do a Carriage Return and print the next value on top of the current printed value. It means, in the final output, you can see only Field 4, as it was the last thing to be printed on top of all the previous fields.
Example
[jerry]$ awk 'BEGIN { printf "Field 1\rField 2\rField 3\rField 4\n" }'
On executing this code, you get the following result −
Output
Field 4
Form Feed
The following example uses form feed after printing each field.
Example
[jerry]$ awk 'BEGIN { printf "Sr No\fName\fSub\fMarks\n" }'
On executing this code, you get the following result −
Output
Sr No
     Name
         Sub
            Marks
Format Specifier
As in C-language, AWK also has format specifiers. The AWK version of the printf statement accepts the following conversion specification formats −
%c
It prints a single character. If the argument used for %c is numeric, it is treated as a character code and the corresponding character is printed. Otherwise, the argument is assumed to be a string, and only the first character of that string is printed.
Example
[jerry]$ awk 'BEGIN { printf "ASCII value 65 = character %c\n", 65 }'
On executing this code, you get the following result −
Output
ASCII value 65 = character A
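Both cases described above can be checked side by side. The following sketch passes a number and then a string to %c; the string Hello is an arbitrary value chosen for illustration.

```shell
# A numeric argument is treated as a character code;
# a string argument contributes only its first character.
awk 'BEGIN { printf "%c %c\n", 65, "Hello" }'
```

This prints A for the number 65 and H for the string Hello.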
%d and %i
It prints only the integer part of a decimal number.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %d\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 80
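Note that %d drops the fractional part rather than rounding. A minimal sketch with a few illustrative values (not taken from the examples above) shows that the result is truncated toward zero for both positive and negative numbers:

```shell
# 80.99 is not rounded up to 81, and -80.66 becomes -80, not -81.
awk 'BEGIN { printf "%d %d %d\n", 80.66, 80.99, -80.66 }'
```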
%e and %E
It prints a floating point number of the form [-]d.dddddde[+-]dd.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %e\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 8.066000e+01
The %E format uses E instead of e.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %E\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 8.066000E+01
%f
It prints a floating point number of the form [-]ddd.dddddd.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %f\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 80.660000
%g and %G
Uses %e or %f conversion, whichever is shorter, with non-significant zeros suppressed.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %g\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 80.66
The %G format uses %E instead of %e.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %G\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 80.66
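The choice between the two styles depends on the magnitude of the value. The sketch below uses illustrative values, not taken from the examples above, to show %g keeping fixed notation for a moderate number and switching to exponential notation for a very large and a very small one:

```shell
# One value per line: a moderate number, a large number, and a small number.
awk 'BEGIN { printf "%g\n%g\n%g\n", 80.66, 1234567, 0.00001 }'
```

The first value is printed as 80.66, while the other two are printed as 1.23457e+06 and 1e-05.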
%o
It prints an unsigned octal number.
Example
[jerry]$ awk 'BEGIN { printf "Octal representation of decimal number 10 = %o\n", 10}'
On executing this code, you get the following result −
Output
Octal representation of decimal number 10 = 12
%u
It prints an unsigned decimal number.
Example
[jerry]$ awk 'BEGIN { printf "Unsigned 10 = %u\n", 10 }'
On executing this code, you get the following result −
Output
Unsigned 10 = 10
%s
It prints a character string.
Example
[jerry]$ awk 'BEGIN { printf "Name = %s\n", "Sherlock Holmes" }'
On executing this code, you get the following result −
Output
Name = Sherlock Holmes
%x and %X
It prints an unsigned hexadecimal number. The %X format uses uppercase letters instead of lowercase.
Example
[jerry]$ awk 'BEGIN { printf "Hexadecimal representation of decimal number 15 = %x\n", 15 }'
On executing this code, you get the following result −
Output
Hexadecimal representation of decimal number 15 = f
Now let us use %X and observe the result −
Example
[jerry]$ awk 'BEGIN { printf "Hexadecimal representation of decimal number 15 = %X\n", 15 }'
On executing this code, you get the following result −
Output
Hexadecimal representation of decimal number 15 = F
%%
It prints a single % character and no argument is converted.
Example
[jerry]$ awk 'BEGIN { printf "Percentage = %d%%\n", 80.66 }'
On executing this code, you get the following result −
Output
Percentage = 80%
Optional Parameters with %
With %, we can use the following optional parameters −
Width
The output is padded to the given width. By default, the field is padded with spaces, but when the 0 flag is used, it is padded with zeroes.
Example
[jerry]$ awk 'BEGIN { num1 = 10; num2 = 20; printf "Num1 = %10d\nNum2 = %10d\n", num1, num2 }'
On executing this code, you get the following result −
Output
Num1 =         10
Num2 =         20
Leading Zeros
A leading zero acts as a flag, which indicates that the output should be padded with zeroes instead of spaces. Please note that this flag only has an effect when the field is wider than the value to be printed. The following example describes this −
Example
[jerry]$ awk 'BEGIN { num1 = -10; num2 = 20; printf "Num1 = %05d\nNum2 = %05d\n", num1, num2 }'
On executing this code, you get the following result −
Output
Num1 = -0010
Num2 = 00020
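To see when the flag takes effect, the following sketch formats one value that is narrower than the field and one that is wider. Both numbers are arbitrary values chosen for illustration.

```shell
# 42 is narrower than 5 characters, so it is zero-padded to 00042;
# 1234567 is already wider than the field, so it is printed unchanged.
awk 'BEGIN { printf "%05d\n%05d\n", 42, 1234567 }'
```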
Left Justification
The value is left-justified within its field. When the value has fewer characters than the specified width and you want it left-justified, i.e., padded with spaces on the right, use a minus sign (-) immediately after the % and before the width.
In the following example, the output of the AWK command is piped to the cat command to display the end-of-line ($) character.
Example
[jerry]$ awk 'BEGIN { num = 10; printf "Num = %-5d\n", num }' | cat -vte
On executing this code, you get the following result −
Output
Num = 10   $
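Width and left justification are often combined to line up columns. As a sketch, assuming input in the same layout as marks.txt (two sample records are supplied inline here), the following command left-justifies the name and subject in 10-character fields and right-justifies the marks in a 5-character field. The widths and the | separators are arbitrary choices made to keep the padding visible.

```shell
# Fields: $2 = name, $3 = subject, $4 = marks (same layout as marks.txt).
printf '1) Amit Physics 80\n2) Rahul Maths 90\n' |
awk '{ printf "%-10s|%-10s|%5d\n", $2, $3, $4 }'
```

Replacing the printf pipeline with the file name marks.txt as an argument to awk would format every record of the file in the same way.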
Prefix Sign
The + flag always prefixes numeric values with a sign, even if the value is positive.
Example
[jerry]$ awk 'BEGIN { num1 = -10; num2 = 20; printf "Num1 = %+d\nNum2 = %+d\n", num1, num2 }'
On executing this code, you get the following result −
Output
Num1 = -10
Num2 = +20
Hash
For %o, it supplies a leading zero. For %x and %X, it supplies a leading 0x or 0X respectively, only if the result is non-zero. For %e, %E, %f, and %F, the result always contains a decimal point. For %g and %G, trailing zeros are not removed from the result. The following example describes this −
Example
[jerry]$ awk 'BEGIN { printf "Octal representation = %#o\nHexadecimal representation = %#X\n", 10, 10 }'
On executing this code, you get the following result −
Output
Octal representation = 012
Hexadecimal representation = 0XA
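Two of the cases described above are not covered by that example: a zero value with %x, and %g. The following sketch, using illustrative values, prints each with and without the # flag for comparison:

```shell
# First line: zero gets no 0x prefix even with the # flag.
# Second line: with the # flag, %g keeps its trailing zeros.
awk 'BEGIN { printf "%x %#x\n%g %#g\n", 0, 0, 80.66, 80.66 }'
```

The first line shows 0 in both forms, while the second shows 80.66 next to 80.6600.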