
CIS120
Linux Fundamentals
Linux Pipelines
Think of Linux pipelines like a factory assembly line. Just as raw materials move from one station to another, getting transformed at each step, data flows through a series of commands, getting processed and transformed along the way. The pipe symbol (`|`) is like the conveyor belt that moves the data from one command to the next.
When to Use Pipelines
Use pipelines when you want to:
- Combine multiple commands to perform complex tasks
- Process data step by step without creating temporary files (see the example after this list)
- Filter and transform data in a single command
- Create powerful one-liners for data analysis
- Build custom data processing workflows
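To see what the no-temporary-files point buys you, compare the two approaches below (a minimal sketch; `words.txt` is a placeholder filename):
# Without a pipeline: an extra step and a temporary file to clean up
sort words.txt > sorted.tmp
uniq -c sorted.tmp
rm sorted.tmp
# With a pipeline: one line, no temporary file
sort words.txt | uniq -c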
Basic Pipeline Concepts
A pipeline connects commands using the pipe symbol (`|`). The output of the command on the left becomes the input for the command on the right. Think of it like a chain of commands, where each link processes the data in some way.
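For example, this minimal two-command pipeline counts how many entries are in the current directory:
# ls writes one name per line when its output is piped; wc -l counts those lines
ls | wc -l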
Common Pipeline Patterns
Pattern | What It Does | When to Use It |
---|---|---|
`command1 \| command2` | Basic two-command pipeline | Simple data processing |
`command1 \| command2 \| command3` | Three-command pipeline | Complex data processing |
`command1 \| tee file.txt` | Save and display output | When you need both output and a file |
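The `tee` pattern deserves a quick demonstration, since it is the standard way to watch output on screen while also keeping a copy (the filename `processes.txt` is just illustrative):
# Display the process list on screen AND save a copy to processes.txt
ps aux | tee processes.txt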
Practical Examples
File Operations
# Find which text files contain "error" (xargs passes the filenames to grep)
ls *.txt | xargs grep -l "error"
# Count how many Python files are in a directory
ls *.py | wc -l
# Sort files by size
ls -l | sort -k5 -n
Text Processing
# Convert text to uppercase and count lines
cat file.txt | tr '[:lower:]' '[:upper:]' | wc -l
# Extract usernames from /etc/passwd
cat /etc/passwd | cut -d: -f1 | sort
# Find unique IP addresses in a log file
grep "IP:" access.log | cut -d' ' -f2 | sort -u
System Monitoring
# Show top 5 memory-consuming processes
ps aux | sort -k4 -nr | head -n 5
# Show disk usage for /dev/sd* filesystems, sorted by use percentage
df -h | grep "/dev/sd" | sort -k5 -nr
# Count running processes for a user; the brackets keep grep from matching its own command line
ps aux | grep "[u]sername" | wc -l
Tips for Success
- Start simple: Begin with two commands and add more as needed
- Test each step: Run each command separately first
- Use comments: Add comments to explain complex pipelines
- Break it down: Build pipelines step by step, as shown below
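For example, here is how you might build up the "top 5 memory-consuming processes" pipeline from the System Monitoring section one stage at a time:
# Step 1: check the raw output
ps aux
# Step 2: add the sort and confirm the ordering looks right
ps aux | sort -k4 -nr
# Step 3: trim the result to the top five lines
ps aux | sort -k4 -nr | head -n 5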
Common Mistakes to Avoid
- Forgetting that some commands don't work well in pipelines (interactive programs such as `top` expect a terminal; use batch flags like `top -b` instead)
- Not checking if the first command produces the expected output
- Creating overly complex pipelines when a simpler solution exists
- Not using proper quoting for special characters (see the example below)
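The quoting mistake is worth a quick example. Without quotes, the shell splits a multi-word pattern before grep ever sees it (`server.log` is a placeholder filename):
# Unquoted: grep searches for just "Connection" and treats "refused" as a filename
grep Connection refused server.log
# Quoted: grep receives the whole phrase as a single pattern
grep "Connection refused" server.log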
Best Practices
- Keep pipelines readable by using line breaks for long commands
- Use meaningful variable names when storing pipeline output
- Add comments to explain complex pipelines
- Test each part of the pipeline separately
- Use proper error handling (see the sketch below)
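Here is a sketch that combines several of these practices, assuming a Bash shell (`access.log` is a placeholder filename):
#!/bin/bash
# pipefail makes the pipeline report failure if ANY stage fails, not just the last
set -o pipefail
# A line ending in | continues onto the next line, which keeps long pipelines readable
unique_ip_counts=$(cut -d' ' -f1 access.log |
    sort |
    uniq -c |
    sort -nr)
echo "$unique_ip_counts"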
Advanced Pipeline Techniques
Combining Multiple Commands
# Find and process specific files
find . -name "*.log" | xargs grep "error" | sort | uniq -c
# Monitor system resources
top -b -n 1 | head -n 10 | tee system_stats.txt
# Process CSV data
cat data.csv | cut -d',' -f1,3 | sort | uniq -c | sort -nr