How to Use `sed` and `awk` for Text Processing on Arch Linux
Text processing is a fundamental part of Linux system administration, scripting, and data analysis. Two of the most powerful tools for handling text streams and files are sed (stream editor) and awk (pattern scanning and processing language). On Arch Linux, these utilities are available by default in the base system, and mastering them can significantly boost your productivity.
In this article, we’ll explore how to effectively use sed and awk for various text processing tasks on Arch Linux. Whether you’re editing configuration files, extracting data, or automating system reports, these tools will be your best allies.
Introduction to sed and awk
What is sed?
sed stands for Stream EDitor. It reads input line by line, applies the specified operation(s), and outputs the result. It’s great for simple substitutions, deletions, insertions, and complex multi-line edits.
What is awk?
awk is both a command-line utility and a programming language designed for pattern scanning and text processing. It allows you to filter and format text using conditions and expressions, making it ideal for parsing structured data like CSV or logs.
Installing sed and awk on Arch Linux
In most cases, both tools are already available in a fresh Arch Linux installation:
sed --version
awk --version
sed is provided by the sed package, and awk is implemented by gawk (GNU Awk). Both are dependencies of the base meta package.
If for some reason they’re missing, you can install them with:
sudo pacman -S gawk
sudo pacman -S sed
Basic sed Usage
Syntax
sed [options] 'script' file
Common Examples
1. Replacing text
To replace “foo” with “bar” in a file:
sed 's/foo/bar/' file.txt
To replace all occurrences in each line:
sed 's/foo/bar/g' file.txt
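The difference between the two forms is easiest to see on a line that contains the pattern more than once. The sketch below runs both commands on a made-up sample line; the variable names are only for illustration.

```shell
# Sample input with two occurrences of "foo" on one line (made-up data).
line='foo and foo'

# Without the g flag, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/foo/bar/')

# With the g flag, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/foo/bar/g')

printf '%s\n' "$first_only"    # bar and foo
printf '%s\n' "$all_matches"   # bar and bar
```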
2. In-place editing
To modify the file directly:
sed -i 's/foo/bar/g' file.txt
You can create a backup with:
sed -i.bak 's/foo/bar/g' file.txt
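To see what the backup form leaves behind, the following sketch runs it in a throwaway directory. The directory and file contents are made up for the example; nothing outside the temporary directory is touched.

```shell
# Work in a temporary directory so no real file is modified.
workdir=$(mktemp -d)
printf 'foo one\nfoo two\n' > "$workdir/file.txt"

# The suffix must be attached directly to -i (no space in between).
sed -i.bak 's/foo/bar/g' "$workdir/file.txt"

edited=$(cat "$workdir/file.txt")        # modified content
backup=$(cat "$workdir/file.txt.bak")    # original content, preserved by sed

printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"
```

The original content stays in file.txt.bak, so a bad substitution can be undone by moving the backup over the edited file.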
3. Deleting lines
Delete line 5:
sed '5d' file.txt
Delete lines matching a pattern:
sed '/^#/d' file.txt # Remove comments
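A common variation is to strip empty lines along with comment lines, for example when reading a configuration file. This is a minimal sketch on made-up input that passes two delete expressions with -e:

```shell
# Made-up configuration snippet with comment lines and an empty line.
config='# main settings
port=8080

# logging
level=info'

# Each -e adds one expression; a line is deleted if either pattern matches.
cleaned=$(printf '%s\n' "$config" | sed -e '/^#/d' -e '/^$/d')

printf '%s\n' "$cleaned"
```

Only lines that begin with # are treated as comments here; indented comments or trailing comments after a setting would need a different pattern.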
4. Print specific lines
Print only line 3:
sed -n '3p' file.txt
Print lines 2 to 4:
sed -n '2,4p' file.txt
Basic awk Usage
Syntax
awk 'pattern { action }' file
Default Behavior
By default, awk splits each line into fields based on whitespace and lets you reference them using $1, $2, etc.
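As a quick illustration of field splitting, the sketch below uses one made-up record with irregular spacing. Besides $1, it uses the built-in NF (number of fields), $NF (the last field), and $0 (the whole line).

```shell
# One made-up record with several spaces between the fields.
record='alice   42    admin'

first=$(printf '%s\n' "$record" | awk '{ print $1 }')    # first field
count=$(printf '%s\n' "$record" | awk '{ print NF }')    # number of fields
last=$(printf '%s\n' "$record" | awk '{ print $NF }')    # last field
whole=$(printf '%s\n' "$record" | awk '{ print $0 }')    # entire line, unchanged

printf '%s | %s | %s | %s\n' "$first" "$count" "$last" "$whole"
```

Runs of spaces or tabs count as a single separator in the default mode, which is why the record splits into exactly three fields.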
1. Print specific columns
Print the first column:
awk '{ print $1 }' file.txt
Print first and third columns:
awk '{ print $1, $3 }' file.txt
2. Use a custom delimiter
For comma-separated values:
awk -F',' '{ print $1, $2 }' data.csv
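Note that the comma in print separates output fields with OFS, which defaults to a single space, so the result is no longer comma-separated. The sketch below, on made-up CSV data, shows both the default and a version that sets OFS to keep the commas.

```shell
# Made-up CSV sample with a header row.
csv='id,name,city
1,Ana,Lisbon'

# Default OFS is a single space, so the commas disappear from the output.
spaced=$(printf '%s\n' "$csv" | awk -F',' '{ print $1, $2 }')

# Setting OFS in a BEGIN block keeps the output comma-separated.
commas=$(printf '%s\n' "$csv" | awk -F',' 'BEGIN { OFS = "," } { print $1, $2 }')

printf '%s\n---\n%s\n' "$spaced" "$commas"
```

Splitting on a plain comma does not understand quoted fields that contain commas, so this approach suits only simple CSV files.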
3. Filter with conditions
Print lines where the second column is greater than 100:
awk '$2 > 100' data.txt
Print lines where the first column matches “john”:
awk '$1 == "john"' data.txt
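Conditions can be combined with && and ||, and a field can be tested against a regular expression with the ~ operator. A small sketch on made-up data:

```shell
# Made-up sample: a name followed by a number.
data='john 150
mary 90
john 80
johnny 200'

# Both conditions must hold: exact name match and a value above 100.
both=$(printf '%s\n' "$data" | awk '$1 == "john" && $2 > 100')

# ~ matches a field against a regular expression instead of an exact string.
regex=$(printf '%s\n' "$data" | awk '$1 ~ /^john/ { print $1 }')

printf '%s\n---\n%s\n' "$both" "$regex"
```

The exact comparison selects only john, while the regular expression also matches johnny.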
4. Begin and End Blocks
awk 'BEGIN { print "Start" } { print $0 } END { print "End" }' file.txt
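The order in which the three blocks run can be made visible with the built-in NR (the number of records read so far). The input below is made up for the example.

```shell
report=$(printf 'a\nb\nc\n' | awk '
    BEGIN { print "Start" }          # runs once, before any input is read
          { print NR ": " $0 }       # runs for every input line
    END   { print "Lines:", NR }     # runs once, after the last line
')

printf '%s\n' "$report"
```

In the END block, NR still holds the number of the last line read, so it doubles as a line count.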
Real-World Examples
Example 1: Extracting IP addresses from logs
awk '{ print $1 }' /var/log/nginx/access.log | sort | uniq -c | sort -nr
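This command:
- Extracts the first field of each line (the client IP address in nginx's default log format)
- Sorts the addresses so that identical ones are adjacent, which uniq requires
- Counts how many times each address occurs
- Sorts the counts in descending numerical order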
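The pipeline can be tried without a real log file. The sketch below applies it to three made-up log lines and also shows an alternative that does the counting inside awk with an associative array; the array name hits is arbitrary.

```shell
# Three made-up access-log lines; only the first field matters here.
log='10.0.0.1 - - "GET / HTTP/1.1" 200
10.0.0.2 - - "GET /a HTTP/1.1" 200
10.0.0.1 - - "GET /b HTTP/1.1" 404'

# Same pipeline as above, reading the sample instead of a log file.
counts=$(printf '%s\n' "$log" | awk '{ print $1 }' | sort | uniq -c | sort -nr)

# Alternative: count inside awk, then sort only the summary lines.
awk_counts=$(printf '%s\n' "$log" |
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' |
    sort -nr)

printf '%s\n---\n%s\n' "$counts" "$awk_counts"
```

Both variants report the same counts. The uniq -c output is padded with leading spaces, while the awk version prints exactly one space between count and address.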
Example 2: Mass renaming using sed
If you have files like image01.jpg, image02.jpg, …, and want to rename them to pic01.jpg, etc.:
for f in image*.jpg; do
    mv -- "$f" "$(echo "$f" | sed 's/^image/pic/')"
done
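Before renaming anything, it can help to print the mv commands instead of running them. The sketch below is such a dry run in a temporary directory with two made-up files; the sed substitution is anchored to the start of the name so that only the prefix changes.

```shell
# Create two empty sample files in a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/image01.jpg" "$workdir/image02.jpg"

planned=$(
    cd "$workdir" || exit 1
    for f in image*.jpg; do
        new=$(printf '%s\n' "$f" | sed 's/^image/pic/')
        # Dry run: print the command instead of executing it.
        printf 'mv %s %s\n' "$f" "$new"
    done
)

printf '%s\n' "$planned"
```

Once the printed commands look right, the printf line can be replaced by the actual mv call.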
Example 3: Summing values with awk
Assume data.txt contains:
Item1 25
Item2 40
Item3 35
To calculate the total:
awk '{ sum += $2 } END { print "Total:", sum }' data.txt
Example 4: Find and replace in multiple files
find . -type f -name "*.conf" -exec sed -i 's/localhost/127.0.0.1/g' {} +
This finds all .conf files under the current directory and replaces every occurrence of localhost with 127.0.0.1, editing the files in place.
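Because this edits many files at once, one cautious approach is to list the files that would change first. The sketch below does that with grep -l in a temporary directory and then applies the substitution; all file names and contents are made up.

```shell
# Sample tree: two .conf files and one unrelated file.
workdir=$(mktemp -d)
printf 'host=localhost\n'   > "$workdir/a.conf"
printf 'host=example.org\n' > "$workdir/b.conf"
printf 'host=localhost\n'   > "$workdir/notes.txt"

# Preview: list the .conf files that contain the pattern, changing nothing.
matches=$(find "$workdir" -type f -name '*.conf' -exec grep -l 'localhost' {} +)

# Apply the substitution to the same set of files (GNU sed syntax for -i).
find "$workdir" -type f -name '*.conf' -exec sed -i 's/localhost/127.0.0.1/g' {} +

printf 'would change: %s\n' "$matches"
cat "$workdir/a.conf"
```

Only a.conf is listed and rewritten; b.conf has no match and notes.txt is excluded by the -name filter.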
Combining sed and awk
Both tools can complement each other. For example:
cat data.txt | sed 's/foo/bar/' | awk '{ print $1, $3 }'
The cat is unnecessary, because sed can read the file directly:
sed 's/foo/bar/' data.txt | awk '{ print $1, $3 }'
Tips and Best Practices
1. Test before using -i
Always test your sed command before using the -i (in-place) option to avoid accidental data loss.
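One way to test is to run the command without -i and compare its output with the original file. This is a minimal sketch using diff on a made-up file in a temporary directory:

```shell
workdir=$(mktemp -d)
printf 'foo=1\nbaz=2\n' > "$workdir/file.txt"

# Run sed without -i and diff its output against the untouched file.
# diff exits with status 1 when the inputs differ, hence the "|| true".
preview=$(sed 's/foo/bar/g' "$workdir/file.txt" | diff "$workdir/file.txt" - || true)

printf '%s\n' "$preview"
```

Lines starting with < come from the original file and lines starting with > show what sed would write. The file itself is unchanged until the command is rerun with -i.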
2. Use comments in awk scripts
When writing complex awk scripts, use comments and line breaks for clarity:
awk '
# Print rows where column 2 is > 100
$2 > 100 {
print $1, $2
}
' file.txt
3. Use awk over cut or grep for complex tasks
While tools like cut, grep, and head are great for simple jobs, awk shines when you need conditional logic, math, or formatting.
Creating Reusable awk and sed Scripts
awk Script File
You can write an awk script in a file, e.g., script.awk:
BEGIN { FS=":"; OFS=" | " }
$3 > 1000 { print $1, $3 }
Run it with:
awk -f script.awk /etc/passwd
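To try the script without depending on the local user database, the sketch below writes it to a temporary directory and runs it on three made-up passwd-style entries. The file name passwd.sample and the accounts are assumptions for the example.

```shell
workdir=$(mktemp -d)

# The script from above, written to a file.
cat > "$workdir/script.awk" <<'EOF'
BEGIN { FS=":"; OFS=" | " }
$3 > 1000 { print $1, $3 }
EOF

# Made-up entries in passwd format (name:password:UID:GID:comment:home:shell).
cat > "$workdir/passwd.sample" <<'EOF'
root:x:0:0::/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash
bob:x:1001:1001::/home/bob:/bin/bash
EOF

result=$(awk -f "$workdir/script.awk" "$workdir/passwd.sample")
printf '%s\n' "$result"
```

Only bob is printed, because the comparison is strict. If the first regular account on the system has UID 1000, changing the condition to $3 >= 1000 would include it.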
sed Script File
You can also save multiple sed commands in a file:
s/foo/bar/g
s/baz/qux/g
Run it with:
sed -f script.sed file.txt
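A self-contained way to try this is to create the script file in a temporary directory and pipe a sample line through it. The input line is made up for the example.

```shell
workdir=$(mktemp -d)

# Write the two substitutions to a script file; # starts a comment.
cat > "$workdir/script.sed" <<'EOF'
# Commands run in order on every input line.
s/foo/bar/g
s/baz/qux/g
EOF

result=$(printf 'foo baz foo\n' | sed -f "$workdir/script.sed")
printf '%s\n' "$result"
```

Each command in the file runs in order on every line, so later substitutions see the result of earlier ones.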
Advanced Examples
Replace only on specific lines using sed
Replace apple with orange only on line 2:
sed '2s/apple/orange/' file.txt
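The address in front of the s command can also be a range, written the same way as for the p command. The sketch below compares a single-line address with a range on made-up input.

```shell
# Four identical sample lines, so the effect of the address is visible.
text='apple
apple
apple
apple'

# Substitute on line 2 only.
line2=$(printf '%s\n' "$text" | sed '2s/apple/orange/')

# Substitute on lines 2 through 3.
range=$(printf '%s\n' "$text" | sed '2,3s/apple/orange/')

printf '%s\n---\n%s\n' "$line2" "$range"
```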
Print average from a column using awk
awk '{ sum += $2; count++ } END { if (count > 0) print "Average:", sum / count }' data.txt
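Applied to the three sample items from the summing example, the average works out to 100 / 3. The sketch below formats the result with printf and guards against empty input, where dividing by a zero count would be an error; the variable names are only for illustration.

```shell
# Sample data from the summing example.
data='Item1 25
Item2 40
Item3 35'

# Print the average with two decimals; skip the division on empty input.
prog='{ sum += $2; count++ }
END { if (count > 0) printf "Average: %.2f\n", sum / count }'

average=$(printf '%s\n' "$data" | awk "$prog")
empty=$(printf '' | awk "$prog")

printf '%s\n' "$average"    # Average: 33.33
```

With no input lines the END block prints nothing, so empty stays an empty string.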
Update /etc/hosts programmatically
Add an entry if it doesn’t exist:
grep -qF 'example.com' /etc/hosts || echo '127.0.0.1 example.com' | sudo tee -a /etc/hosts
You could also use sed to modify an existing entry:
sudo sed -i '/example\.com/ s/127\.0\.0\.1/127.0.1.1/' /etc/hosts
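The "add only if missing" logic can be wrapped in a small function and rehearsed on a copy before touching the real file. The sketch below is one possible approach, not a standard tool: add_host is a hypothetical helper, and it operates on a temporary file rather than /etc/hosts.

```shell
# Hypothetical helper: append "ip name" to a hosts-style file unless the
# name already appears as a host field on a non-comment line.
add_host() {
    file=$1 ip=$2 name=$3
    if ! awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Rehearse on a temporary file with one made-up entry.
hosts=$(mktemp)
printf '127.0.0.1 localhost\n' > "$hosts"

add_host "$hosts" 127.0.0.1 example.com
add_host "$hosts" 127.0.0.1 example.com   # second call changes nothing

cat "$hosts"
```

Comparing whole fields avoids false matches on names that merely contain example.com, such as myexample.com.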
Conclusion
Both sed and awk are indispensable tools for any Linux user, especially those managing systems or automating tasks. On Arch Linux, they are lightweight, fast, and available by default, making them perfect for quick fixes, data transformation, and script-based automation.
By learning to use sed for quick text substitution and editing, and awk for powerful data extraction and processing, you can handle nearly any text manipulation task from the command line.
Don’t be afraid to experiment: build small commands, chain them together, and start crafting your own command-line magic.
Further Reading:
- man sed
- man awk (or man gawk)
- GNU awk manual
- Arch Wiki pages on scripting and shell tools