More Shell Patterns I Use Weekly
Covers string parameter expansion, copy-pastable printf colors, stashing noisy output in /tmp with mktemp, inlining commands with $(), and filtering with awk instead of grep.
I’m not a particularly strong shell programmer. I’m incapable of writing a for loop in Bash that handles input with spaces in it. Or writing a trap. Shell arrays? Not a chance without checking the docs. Whenever I need to do anything more advanced, I normally switch to python or zx.
But I’m still strong enough on the command line that I’m able to accomplish the things that I want to accomplish day-to-day and build quick little utilities that serve my needs well. Here are a few quick notes on patterns I find myself using regularly.
String parameter expansion
for file in **/*.md; do
  pandoc "$file" -o "${file/.md/.html}"
done
I’m terrible with shell string manipulation, but I can still remember ${variable/pattern/replacement}. There are two ways of referring to a variable in the shell: $x and ${x}. ${x} mode gives you a few useful things:
- ${x}y will print the variable stored in x followed by the character y; $xy will print nothing unless xy is set.
- ${x:-string_fallback} lets you create fallbacks if a variable is unset: echo ${x:-hello, world}
- x=page.html; ${x%%.html} will remove .html from the end of the string (but I just do ${x/.html/} because I can’t remember % and #)
- And the ${x/pattern/replacement} syntax I already mentioned.
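Putting those together in one snippet (this should behave the same in bash and zsh):

x=page.html
echo "${x}y"           # page.htmly: the braces mark where the variable name ends
echo "$xy"             # prints nothing, unless a variable named xy happens to be set
echo "${x:-fallback}"  # page.html; prints fallback only when x is unset
echo "${x%%.html}"     # page: %% trims a matching suffix
echo "${x/.html/.txt}" # page.txt: replaces the first match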
The **/*.md syntax is supported in zsh (and in Bash with shopt -s globstar): you probably don’t need find.
In addition to pandoc, I’ll often write similar for loops to rename and convert files with tools like magick and ffmpeg.
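For example, a sketch of the same loop for images, assuming ImageMagick 7’s magick command is installed:

for file in **/*.png; do
  magick "$file" "${file/.png/.webp}"  # convert each PNG to WebP alongside it
done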
printf / chalk
Urjit introduced me to the pattern of including a copy-pastable block of print colors in scripts.
# ----- Colours (easy defs). Usage: printf "%s\n" "${red}foobar${end}"
red=$'\e[1;31m'
green=$'\e[1;32m'
yellow=$'\e[1;33m'
blue=$'\e[1;34m'
mag=$'\e[1;35m'
cyn=$'\e[1;36m'
end=$'\e[0m'
printf "\n\t%s\n" "${red}Helpful error message that we want to highlight.${end}"
A teeny bit of effort spent making script output more legible can make hacked-together shell scripts nicer to work with.
In general though, once I start adding error messages to a shell script, that’s about the point where I want to switch it over to a language like JS (or zx) or python. In JS, chalk makes it quick to augment console output.
Using mktemp and /tmp heavily
This isn’t a command per se, but I’ve started using the /tmp directory more. Having a command that spits out a ton of output that I’ll only care about if it fails is a bummer, so it’s nice to have a spot to stash that data that will automatically get cleaned up.
Your operating system will (generally) take care of keeping /tmp clean of old files.
output_log_file=$(mktemp)
if ! some_build_command_with_tons_of_output &> "$output_log_file"; then
printf "\n%s\n" "${red}some_build_command_with_tons_of_output FAILED${end}"
printf "\t%s\n" "${red}Full log output available at ${output_log_file}${end}"
printf "\t%s\n\n" "${red}Last ten lines of output:${end}"
tail -n10 "$output_log_file"
exit 1
fi
This is a helpful pattern to use with LLMs because it avoids polluting the context window with random success messages.1 Humans have a context window too, and if something is working properly, I generally don’t care to see its logs.
This particular script isn’t great (it mixes stdout and stderr together, and it loses the exact exit code from the failing command), but it illustrates the general pattern.
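If those details ever matter, a variant of the same (hypothetical) build command that keeps the streams and the exit code separate might look like:

stdout_log=$(mktemp)
stderr_log=$(mktemp)
some_build_command_with_tons_of_output > "$stdout_log" 2> "$stderr_log"
status=$?  # capture before anything else overwrites it
if [ "$status" -ne 0 ]; then
  printf "\n%s\n" "${red}Build failed with exit code ${status}; last ten lines of stderr:${end}"
  tail -n10 "$stderr_log"
  exit "$status"  # propagate the original exit code
fi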
In the past, I’ve used scripts similar to allelify in JS to parallelize command runs while limiting output. I don’t recommend using this setup today (it’s not DAG-aware and isn’t built for an LLM-heavy world), but the general approach of storing logs in /tmp unless things fail is useful.
I’ll also occasionally add things like | tee /tmp/just-in-case to commands I run. Being able to use less to navigate around failed commands feels more ergonomic than having to run a heavy command again to get its output.
It’s also worth mentioning that mktemp can create directories as well as files if you need a temporary directory to do some quick work in.
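Something like this (the clone URL and script name here are just placeholders):

workdir=$(mktemp -d)
git clone --depth 1 https://example.com/some/repo.git "$workdir"
(cd "$workdir" && ./run-experiment.sh)  # subshell, so your own cwd is untouched
rm -rf "$workdir"                       # optional; the OS sweeps /tmp eventually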
Using $() in simple commands rather than assigning to an intermediate variable
As I’ve gotten more familiar with the shell, I’ve stopped using intermediate variables when writing quick scripts. I think this is a less readable style than having explicit named variables, but I find it more ergonomic, especially because it keeps the command as a one-liner that I can edit and replay:
x=$(command | grep foo)
do_something_with_x "$x"
# vs.
do_something_with_x "$(command | grep foo)"
- mocha $(rg -l pattern): run the tests that match the search. (On a large codebase, this will be much faster than mocha -g, because mocha -g needs to load all of the test files to decide which ones to run.)
- hx $(which command) or hx $(rg -l pattern | fzf): edit the shell script behind command (for stuff I’ve written), or edit a particular file after tracking it down.
If you’re not yet comfortable with $() and <(), it probably makes sense to continue using intermediate variables.
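For reference, <() is process substitution: it hands a command’s output to another command as if it were a file. The classic example, with placeholder filenames:

diff <(sort a.txt) <(sort b.txt)  # compare two files’ sorted contents, no temp files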
Filtering with awk rather than grep
- awk '!seen[$0]++': removes duplicates. This is an alternative to | sort | uniq. awk is 1-indexed, and $0 refers to the full line. seen (or any other variable) will be created when referenced, and awk prints the full line when the expression is truthy.
- awk '$5 == "keyword"': avoids false positives from grep keyword because it only looks at the fifth field.
- awk '$2 ~ "pattern"': is similarly more focused than the equivalent grep.
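A quick way to see the dedup behavior, plus the field filters against a hypothetical app.log whose fifth column is a log level:

printf '%s\n' a b a c b | awk '!seen[$0]++'  # prints a, b, c
awk '$5 == "ERROR"' app.log                  # only lines whose fifth field is exactly ERROR
awk '$2 ~ "2024-"' app.log                   # lines whose second field matches the regex 2024-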
I’ve occasionally started using grep for a filter, made a complex search, and then realized that I should have just started with awk.
-
1. Another useful technique is setting up a skill that uses a subagent with a cheaper model and has that subagent give a summary of the output. ↩︎