General Idea for Pipes in UNIX:
Bad/Naive Approach: Using Files
Design Philosophy of Pipes:
- Connect the stdout of one command to the stdin of another.
Examples: Using pipes
    # Count the number of lines the who command returned
    who | wc -l
    # Search the world file for all net-im package entries
    cat /var/lib/portage/world | grep "net-im/"
    # Sort the list of users
    who | sort
Beginner Mistake:
| vs. >: Don’t confuse pipes and redirection!
- Pipes are a UNIX feature for creating data pipelines between processes,
- Redirection is a shell feature for writing the output of processes into files.
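The difference can be sketched with a few lines of shell. This is a minimal illustration, assuming a POSIX shell; the scratch file created with mktemp is a stand-in introduced only for this demo:

```shell
# Pipe: the output of printf is handed directly to another process (wc).
piped_count=$(printf 'alice\nbob\n' | wc -l)

# Redirection: the shell writes the output of printf into a file instead.
scratch=$(mktemp)                 # hypothetical scratch file for this demo
printf 'alice\nbob\n' > "$scratch"
file_count=$(wc -l < "$scratch")

echo "lines counted through the pipe: $piped_count"
echo "lines counted from the file:    $file_count"
rm -f "$scratch"
```

Both counts are the same; what differs is where the data travels: between two processes in the first case, into a file in the second.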
Filters: A class of UNIX utilities that read from standard input, transform the data, and write to standard output.
- Filters use stdin and stdout how they want.
- Filters write their results to stdout: sort file.txt doesn’t modify the contents of file.txt.
- cat: Read lines from stdin (and more files), and concatenate them to stdout.
- more/less: Read lines from stdin, and provide a paginated view to stdout.
- head: Read the first few lines from stdin (and more files) and print them to stdout.
- tail: Read the last few lines from stdin (and more files) and print them to stdout.
- tee: Copy stdin to stdout and to one or more files.
- cut: Cut specified bytes, characters, or fields from each line of stdin and print them to stdout.
- paste: Read lines from stdin (and more files), and paste them together line-by-line to stdout.
- wc: Read from stdin, and print the number of newlines, words, and bytes to stdout.
- tr: Translate or delete characters read from stdin and print to stdout.
- sort: Sort the lines in stdin, and print the result to stdout.
- uniq: Read from stdin and print unique lines (lines that differ from the adjacent line) to stdout.
- grep: Find lines in stdin that match a pattern and print them to stdout.
- sed: Stream editor used to edit text in non-interactive mode.
- awk: Powerful editing tool for files, with programming features.
More on cat
cat copies its input to output unchanged (identity filter). When supplied a list of file names, it concatenates them onto stdout.
Some Options (Flags):
- -n: Number output lines starting from 1.
- -v: Display control characters in visible form (e.g., ^M)
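A short sketch of both flags, assuming a cat that supports -n and -v (GNU and BSD versions do). The input text is made up for the demo:

```shell
# A DOS-style line ends in carriage return + newline.
# cat -v makes the otherwise invisible carriage return show up as ^M.
visible=$(printf 'dos line\r\n' | cat -v)
echo "$visible"

# cat -n prefixes each output line with its line number, starting from 1.
printf 'first\nsecond\n' | cat -n
```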
More on more
more is like cat, except it displays content page-by-page.
- Example: more +4 foo.txt displays file content starting at line 4.
More on less
less lets you view the contents of a file similarly to more, except it doesn’t load the entire file at once.
less is faster than more and has a larger number of features.
More on head and tail
head displays the first few lines of a file.
Format:
    head [-n number] [filename...]
- -n number: Number of lines to display (default: 10)
- filename: List of filenames to display
  - When more than one filename is given, each listing begins with ==> filename <==.
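tail displays the last few lines of a file.
Format:
    tail -number [rbc] [f] [filename]
- -number: Number of lines to display (default: 10)
- b, c: Count in blocks or characters instead of lines
- r: Print in reverse order (lines only)
Example: Using head and tail to get lines 4-10 of a file
    tail -n +4 patch.sh | head -n 7
- tail -n +4 starts the output at line 4, and head -n 7 keeps the seven lines from 4 through 10 (both ends included).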
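The line count passed to head is easy to get wrong by one, so here is a minimal sketch that computes it. The helper name line_range is hypothetical, and seq is used only to produce predictable input:

```shell
# Hypothetical helper: print lines first..last of stdin (both ends included).
# tail -n +first starts the output at line `first`;
# head then keeps (last - first + 1) lines.
line_range() {
    first=$1
    last=$2
    tail -n "+$first" | head -n "$((last - first + 1))"
}

# seq prints 1..20, one number per line, so each line's content is its line number.
result=$(seq 1 20 | line_range 4 10)
echo "$result"
```

For lines 4 through 10 the helper asks head for 10 - 4 + 1 = 7 lines.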
More on wc
wc counts the number of lines, words, or bytes.
- By default, prints all three.
Options:
- -l: Count lines
- -w: Count words
- -c: Count bytes (use -m to count characters)
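A small sketch of the three counters on known input. The sample function and its text are made up for the demo; note that -c counts bytes, which equals the character count for plain ASCII like this:

```shell
# Hypothetical sample input: 2 lines, 5 words, 24 bytes (newlines included).
sample() { printf 'one two three\nfour five\n'; }

lines=$(sample | wc -l)
words=$(sample | wc -w)
bytes=$(sample | wc -c)

echo "lines=$lines words=$words bytes=$bytes"
```

Some wc implementations pad the numbers with leading spaces, so compare them numerically rather than as strings.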
More on tee
tee copies stdin to stdout and to one or more files.
- Captures intermediate results from a filter in the pipeline.
- Remember, pipes can have multiple readers and writers!
Format:
    tee [ -ai ] file-list
- -a: Append to the output file rather than overwrite it (the default is to overwrite)
- -i: Ignore interrupts
- file-list: One or more file names for capturing output
Example: Using tee to store the state of a pipe
    ls | head -10 | tee first_10.txt | tail -5
- This line prints to stdout the last five lines (tail -5) of the first 10 lines (head -10) of the ls output.
- It also captures the result of the pipe after head -10 and stores it in first_10.txt.
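The same pipeline can be tried on predictable input. In this sketch seq stands in for ls, and the scratch directory is an assumption made for the demo:

```shell
workdir=$(mktemp -d)              # hypothetical scratch directory

# seq replaces ls so the input is always the numbers 1..20.
final=$(seq 1 20 | head -n 10 | tee "$workdir/first_10.txt" | tail -n 5)
echo "$final"

# tee saved the intermediate result (the first 10 lines) on the way through.
saved=$(wc -l < "$workdir/first_10.txt")
echo "lines captured by tee: $saved"

rm -rf "$workdir"
```

The pipeline's stdout holds only lines 6 through 10, while the file written by tee holds all ten lines that passed that stage.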
More on cut
Note: Delimited data
- Data can be delimited by a variety of symbols (e.g., tabs, bars, colons). When using cut, make sure you’re using the right delimiter!
cut prints selected parts of input lines.
- Can select columns
- Can select a range of character positions
Options:
- -f listOfCols: Print only the specified columns
  - Can be given as a range (e.g., 1-5) or comma-separated (e.g., 1,3,4,5)
- -c listOfPos: Print only the characters in the specified positions
  - Can be given as a range (e.g., 1-5) or comma-separated (e.g., 1,3,4,5)
- -d c: Use character c as the column separator
  - Defaults to <tab>
Example: Using cut
    $ cat /etc/passwd | cut -d: -f1
- Print the first column of the /etc/passwd file (only usernames)
    $ cat /etc/passwd | cut -d: -f1,7
- Print the first and last columns of the /etc/passwd file (username and default shell)
- Note how there’s no way to refer to the last column without counting the columns.
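When counting columns is inconvenient, awk (one of the filters listed earlier) offers one possible workaround. This is a sketch on a made-up record, not a real account:

```shell
# Made-up record in /etc/passwd layout.
record='demo:x:1000:1000:Demo User:/home/demo:/bin/sh'

# cut needs the field number spelled out: the login shell is field 7.
by_cut=$(printf '%s\n' "$record" | cut -d: -f7)

# awk keeps the field count in NF, so $NF is always the last field,
# however many columns the line has.
by_awk=$(printf '%s\n' "$record" | awk -F: '{ print $NF }')

echo "cut -f7: $by_cut"
echo "awk \$NF: $by_awk"
```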
More on paste
paste displays several text files “in parallel” on output.
- If the inputs are files a, b, and c:
  - Line 1 will be composed of the first line of a, first line of b, and first line of c.
  - Line 2 will be composed of the second line of a, second line of b, and second line of c.
  - etc.
- Lines from each file are separated by a tab character.
- If the files have different lengths, the output has as many lines as the longest file; missing lines are filled with empty strings.
Example: Using paste
Suppose we have the following files:
    a.txt
    ---
    1
    2
    b.txt
    ---
    3
    4
    c.txt
    ---
    5
    6
    $ paste a.txt b.txt c.txt
    1    3    5
    2    4    6
Example: Using paste to reverse a cut
    (cut -f1 -d: < /etc/passwd) > 0_usernames.txt
    (cut -f2 -d: < /etc/passwd) > 1_encrypted_password.txt
    (cut -f3 -d: < /etc/passwd) > 2_uid.txt
    (cut -f4 -d: < /etc/passwd) > 3_gid.txt
    (cut -f5 -d: < /etc/passwd) > 4_fullname.txt
    (cut -f6 -d: < /etc/passwd) > 5_homedir.txt
    (cut -f7 -d: < /etc/passwd) > 6_loginshell.txt
- Cut the /etc/passwd file into seven files.
    paste -d: 0_usernames.txt 1_encrypted_password.txt 2_uid.txt 3_gid.txt 4_fullname.txt 5_homedir.txt 6_loginshell.txt
- Perfectly reconstruct the /etc/passwd file from its cut components.
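The round trip can be checked without touching /etc/passwd. The sketch below uses a made-up three-field file in a scratch directory (both are assumptions for the demo) and compares the rebuilt file with the original:

```shell
workdir=$(mktemp -d)              # hypothetical scratch directory

# Made-up colon-delimited records; every line has the same three fields.
printf 'alice:1000:/bin/sh\nbob:1001:/bin/bash\n' > "$workdir/original.txt"

# Split each field into its own file...
for n in 1 2 3; do
    cut -d: -f"$n" < "$workdir/original.txt" > "$workdir/field_$n.txt"
done

# ...and glue the files back together line by line with the same delimiter.
paste -d: "$workdir/field_1.txt" "$workdir/field_2.txt" "$workdir/field_3.txt" \
    > "$workdir/rebuilt.txt"

# cmp -s is silent and reports the result through its exit status.
if cmp -s "$workdir/original.txt" "$workdir/rebuilt.txt"; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "round trip: $roundtrip"

rm -rf "$workdir"
```

The reconstruction is exact here because every line contains the delimiter and the same number of fields.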
More on tr (translating strings)
tr copies stdin to stdout with substitution or deletion of selected characters.
- Reads only from stdin; it takes no file name arguments.
Format:
    tr [-cds] [string1] [string2]
- -d: Delete all input characters contained in string1
- -c: Complement the characters in string1 with respect to the entire ASCII character set
- -s: Squeeze each run of repeated output characters listed in the last operand into a single character
- [string1] and [string2]: Any character that matches a character in string1 is translated into the corresponding character in string2.
  - Any character that doesn’t match a character in string1 is passed to stdout unchanged.
Examples:
    # Replace all instances of s with z
    tr s z
    # Replace all instances of s with z and o with x
    tr so zx
    # Replace all lower-case characters with upper-case characters
    tr a-z A-Z
    # Delete all a-c characters
    tr -d a-c
    # Change the delimiter of /etc/passwd
    tr ':' '|' < /etc/passwd
    # Change text from upper to lower case
    cat uppercase.txt | tr '[A-Z]' '[a-z]' > lowercase.txt
    # Change text from upper to lower case using named character classes
    cat uppercase.txt | tr '[:upper:]' '[:lower:]' > lowercase.txt
    # Import DOS files
    tr -d '\r' < dos_file.txt
Remember: tr translates strings character-by-character; it doesn’t substitute string-by-string.
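The character-by-character behaviour is easiest to see next to a tool that does replace whole strings. This sketch contrasts tr with sed (listed among the filters earlier) on a made-up sentence:

```shell
# tr maps single characters: every s becomes z and every o becomes x,
# wherever they occur and in whatever order.
by_tr=$(echo 'so sorry, boss' | tr 'so' 'zx')

# sed replaces the two-character string "so" as a unit,
# so the "os" and "ss" in "boss" are left alone.
by_sed=$(echo 'so sorry, boss' | sed 's/so/zx/g')

echo "tr:  $by_tr"
echo "sed: $by_sed"
```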
More on sort
Format:
    sort [-dfnr] [-t char] [-k column] [-o filename] [filename(s)]
- -d: Dictionary order; only letters, digits, and whitespace are significant in determining sort order
- -f: Ignore case (fold into lower case)
- -n: Numeric order; sort by arithmetic value instead of first digit
- -r: Sort in reverse order
- -t: Delimiter character
- -k: Column to sort by
- -o filename: Write output to filename; filename can be the same as one of the input files
Examples:
    sort -t: -nk3 /etc/passwd
- Sort the /etc/passwd file numerically (-n), by uid (-k3), using the “:” character as the field separator (-t:).
    sort -t: -nrk4 /etc/passwd
- Sort the /etc/passwd file numerically (-n), by gid (-k4), using the “:” character as the field separator (-t:), in reverse order (-r)
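A sketch of numeric versus textual ordering on made-up records in /etc/passwd layout, where the uid is the third field and the gid the fourth (matching the field numbering of the cut example earlier). The key is written as -k3,3 to limit it to one field:

```shell
# Made-up records: name:password:uid:gid
records='carol:x:1002:100
alice:x:30:200
bob:x:1000:50'

# Field 3 (uid), numeric, ascending.
by_uid=$(printf '%s\n' "$records" | sort -t: -k3,3n | cut -d: -f1)

# Field 4 (gid), numeric, descending.
by_gid_desc=$(printf '%s\n' "$records" | sort -t: -k4,4nr | cut -d: -f1)

# Without -n the uids are compared as text, so "1000" sorts before "30".
by_text=$(printf '%s\n' "$records" | sort -t: -k3,3 | cut -d: -f1)

echo "by uid:        $(echo $by_uid)"
echo "by gid (desc): $(echo $by_gid_desc)"
echo "uid as text:   $(echo $by_text)"
```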
More on uniq (list unique items):
uniq removes or reports adjacent duplicate lines.
- Tip: Use sort to make all duplicate lines adjacent.
Format:
    uniq [-cduif] [input-file] [output-file]
- -c: Precede each output line with the number of times the line occurred in the input.
- -d: Only output lines that were repeated.
- -u: Only output lines that were not repeated.
- -i: Case-insensitive comparison
- -f [num]: Ignore the first [num] fields in each input line.
  - Field: A string of non-blank characters separated from adjacent fields by blanks.
- -s [chars]: Ignore the first [chars] characters in each input line.
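The sketch below shows why sorting first matters. The fruits function and its contents are made up; the list is arranged so that no two equal lines are adjacent:

```shell
# Hypothetical input: duplicates exist, but none are adjacent.
fruits() { printf 'apple\npear\napple\nfig\npear\napple\n'; }

# Without sorting, uniq finds no adjacent duplicates and removes nothing.
unsorted_count=$(fruits | uniq | wc -l)

# After sorting, equal lines sit next to each other and collapse.
distinct=$(fruits | sort | uniq)
repeated=$(fruits | sort | uniq -d)     # lines that occur more than once
singles=$(fruits | sort | uniq -u)      # lines that occur exactly once

echo "lines left without sort: $unsorted_count"
echo "distinct: $(echo $distinct)"
echo "repeated: $(echo $repeated)"
echo "singles:  $(echo $singles)"

# -c prefixes each line with its count; a second, numeric sort ranks by frequency.
fruits | sort | uniq -c | sort -rn
```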
More on find (apply expressions to files):
Format:
    find [pathlist] [expression]
- [pathlist]: Recursively descends through this path, applying [expression] to every file.
- [expression]: Expression that gets applied to every file
- Expressions:
  - -name pattern
    - Find files whose name matches pattern.
    - e.g., -name '*.c'
  - -type ch
    - Find files of type ch (c: character, b: block, f: plain file, etc.)
    - e.g., find ~ -type f
  - -perm [+-]mode
    - Find files with the given access mode (given in octal)
    - e.g., find . -perm 755
  - -user uid/username
    - Find files by owner uid or username
  - -group gid/groupname
    - Find files by group gid or groupname
  - -size size
    - Find files by size.
  - etc…
- Logical Operations:
  - !: Returns the logical negation of the expression
  - op1 -a op2: Matches both op1 and op2
  - op1 -o op2: Matches op1 or op2
  - ( ): Groups expressions together
- Actions:
  - -exec cmd: Executes cmd (must be terminated by an escaped semicolon).
    - If you specify {} as an argument, it will be replaced by the name of the current file.
    - Executes once per file
    - e.g., find /tmp -name "*.pdf" -exec rm "{}" ";"
Examples:
    # Print all png files in my pictures folder
    find ~/Pictures -name '*.png'
    # Delete all pdf files in my /tmp folder
    find /tmp -name "*.pdf" -exec rm "{}" ";"
    # Print all files in my videos folder larger than 500MiB
    find ~/Videos -size +500M -print
    # Print all config files modified in the last day
    find ~/.config -mtime -1
    # Count words of all config files modified in the last day
    find ~/.config -type f -mtime -1 -exec wc -w {} \;
- Note how the * is quoted to suppress shell interpretation.
- Note that -mtime -1 matches files modified less than one day ago, whereas -mtime 1 matches files modified between one and two days ago.
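To experiment with find safely, the expressions can be tried in a throwaway directory. Everything in this sketch (the scratch directory and the file names) is made up for the demo:

```shell
workdir=$(mktemp -d)              # hypothetical scratch directory
mkdir "$workdir/sub"
touch "$workdir/a.pdf" "$workdir/b.txt" "$workdir/sub/c.pdf"

# Quoting '*.pdf' keeps the shell from expanding it; find does the matching
# itself and descends into sub/ as well.
pdf_count=$(find "$workdir" -type f -name '*.pdf' | wc -l)

# Escaped parentheses group the two -name tests joined by -o (or).
either_count=$(find "$workdir" -type f \( -name '*.pdf' -o -name '*.txt' \) | wc -l)

# -exec runs rm once per matching file; {} is replaced by the file name.
find "$workdir" -type f -name '*.pdf' -exec rm {} \;
remaining=$(find "$workdir" -type f | wc -l)

echo "pdf files: $pdf_count, pdf or txt: $either_count, left after delete: $remaining"
rm -rf "$workdir"
```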