author Christine Dodrill <me@christine.website> 2021-09-22 22:01:12 -0400
committer Christine Dodrill <me@christine.website> 2021-09-22 22:01:12 -0400
commit dc07e6d9385a9972f520a84e7f4341e542944216
tree 53c7a2cdf88e760a8cae01bafe2c37a07542bcad
parent 0c6eb0474c328935a1d1d24551d210ca6c663f7c
Fun with redirection
Signed-off-by: Christine Dodrill <me@christine.website>
-rw-r--r-- blog/fun-with-redirection-2021-09-22.markdown 350
-rw-r--r-- config.dhall 8
2 files changed, 358 insertions, 0 deletions
diff --git a/blog/fun-with-redirection-2021-09-22.markdown b/blog/fun-with-redirection-2021-09-22.markdown
new file mode 100644
index 0000000..d0ad618
--- /dev/null
+++ b/blog/fun-with-redirection-2021-09-22.markdown
@@ -0,0 +1,350 @@
+---
+title: Fun with Redirection
+date: 2021-09-22
+author: Twi
+tags:
+ - shell
+ - redirection
+ - osdev
+---
+
+When you're hacking in the shell or in a script, sometimes you want to change
+how the output of a command is routed. Today I'm gonna cover common shell
+redirection tips and tricks that I use every day at work and how it all works
+under the hood.
+
+Let's say you're trying to capture the output of a command to a file, such as
+`uname -av`:
+
+```console
+$ uname -av
+Linux shachi 5.13.15 #1-NixOS SMP Wed Sep 8 06:50:21 UTC 2021 x86_64 GNU/Linux
+```
+
+You could copy that to the clipboard and paste it into a file, but there is a
+better way thanks to the `>` operator:
+
+```console
+$ uname -av > uname.txt
+$ cat uname.txt
+Linux shachi 5.13.15 #1-NixOS SMP Wed Sep 8 06:50:21 UTC 2021 x86_64 GNU/Linux
+```
+
+Let's say you want to run this on a few machines and put all of the output into
+`uname.txt`. You could write a shell script loop like this:
+
+```sh
+# make sure the file doesn't already exist
+rm -f uname.txt
+
+for host in shachi chrysalis kos-mos ontos pneuma
+do
+  ssh "$host" -- uname -av >> uname.txt
+done
+```
+
+Then `uname.txt` should look like this:
+
+```
+Linux shachi 5.13.15 #1-NixOS SMP Wed Sep 8 06:50:21 UTC 2021 x86_64 GNU/Linux
+Linux chrysalis 5.10.63 #1-NixOS SMP Wed Sep 8 06:49:02 UTC 2021 x86_64 GNU/Linux
+Linux kos-mos 5.10.45 #1-NixOS SMP Fri Jun 18 08:00:06 UTC 2021 x86_64 GNU/Linux
+Linux ontos 5.10.52 #1-NixOS SMP Tue Jul 20 14:05:59 UTC 2021 x86_64 GNU/Linux
+Linux pneuma 5.10.57 #1-NixOS SMP Sun Aug 8 07:05:24 UTC 2021 x86_64 GNU/Linux
+```
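+
+If you want to convince yourself of the difference between `>` and `>>`, here is
+a minimal sketch you can run in a scratch directory. The file name `log.txt` is
+made up for illustration. `>` starts the file over every time, while `>>` adds
+to the end of whatever is already there:
+
+```sh
+# Work in a throwaway directory so nothing important gets overwritten.
+# (log.txt is a hypothetical file name, not something from the loop above.)
+cd "$(mktemp -d)" || exit 1
+
+echo first > log.txt    # creates log.txt containing "first"
+echo second > log.txt   # truncates the file, so "first" is gone
+echo third >> log.txt   # appends a line after "second"
+
+cat log.txt
+```
+
+That is why the loop above removes `uname.txt` once up front and then appends
+with `>>` on every iteration.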
+
+Now let's say you want to extract all of the hostnames from that `uname.txt`.
+The pattern of the file seems to specify that fields are separated by spaces and
+the hostname seems to be the second space-separated field in each line. You can
+use the `cut` command to select that small subset from each line, and you can
+feed the `cut` command's standard input using the `<` operator:
+
+```console
+$ cut -d' ' -f2 < uname.txt
+shachi
+chrysalis
+kos-mos
+ontos
+pneuma
+```
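+
+`cut` will also accept the file name as an argument. The difference with `<` is
+that the shell opens the file and `cut` only ever sees standard input. Here is a
+small sketch with made-up data (the two-line `sample.txt` is a stand-in for the
+real `uname.txt`):
+
+```sh
+dir=$(mktemp -d)
+printf 'Linux shachi 5.13.15\nLinux ontos 5.10.52\n' > "$dir/sample.txt"
+
+# cut opens the file itself
+from_arg=$(cut -d' ' -f2 "$dir/sample.txt")
+
+# the shell opens the file and hands it to cut as standard input
+from_stdin=$(cut -d' ' -f2 < "$dir/sample.txt")
+
+[ "$from_arg" = "$from_stdin" ] && echo "same output either way"
+printf '%s\n' "$from_stdin"
+```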
+
+[It's worth noting that a lot of these core CLI utilities are built on the idea
+that they are _filters_, or things that take one infinite stream of text in on
+one end and then return another stream of text out the other
+end. This is done through a channel called "standard input/output", where
+standard input refers to input to the command and standard output refers to the
+output of the command.](conversation://Mara/hacker)
+
+[That's a great metaphor, let's build onto it using the `|` (pipe)
+operator. The pipe operator lets you pipe the standard output of one command to
+the standard input of another.](conversation://Cadey/enby)
+
+[You mentioned that you can pass files as input and output for commands, does
+this mean that standard input and standard output are
+files?](conversation://Mara/happy)
+
+[Precisely! They are just files that are automatically open for every process.
+Usually commands will output to standard out and some will also accept input via
+standard in.](conversation://Cadey/enby)
+
+[Doesn't piping have some level of overhead though? Isn't it expensive to spin
+up a whole heckin' `cat` process just to feed a file into another
+command?](conversation://Mara/hmm)
+
+[Not on any decent system made in the last 20 years. This may have some impact
+on Windows (because it has core architectural mistakes that make processes
+take up to 100 milliseconds to spin up), but this post is about Unix/Linux. I
+think these should work on Windows too if you use Cygwin, and if you're using
+WSL you shouldn't have any real issues there.](conversation://Cadey/coffee)
+
+Let's say we want to rewrite that `cut` command above to use pipes. You could
+write it like this:
+
+```sh
+cat uname.txt | cut -d' ' -f2
+```
+
+[The mnemonic we use for remembering the `cut` command is that fields are
+separated by the `d`elimiter and you cut out the nth
+`f`ield/s.](conversation://Mara/hacker)
+
+This will get you the exact same output:
+
+```console
+$ cat uname.txt | cut -d' ' -f2
+shachi
+chrysalis
+kos-mos
+ontos
+pneuma
+```
+
+Personally I prefer writing shell pipelines like that as it makes it a bit
+easier to tack on more specific selectors or operations as you go along. For
+example, if you wanted to sort them you could pipe the result to `sort`:
+
+```console
+$ cat uname.txt | cut -d' ' -f2 | sort
+chrysalis
+kos-mos
+ontos
+pneuma
+shachi
+```
+
+This lets you gradually build up a shell pipeline as you drill down to the data
+you want in the format you want.
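+
+As a sketch of how that build-up can go, here is one more pipeline that counts
+how many machines run each kernel series. The sample file below is a trimmed
+stand-in for `uname.txt` that keeps only the first three fields, and the
+`counts` variable is just there for illustration:
+
+```sh
+dir=$(mktemp -d)
+
+# A trimmed-down copy of uname.txt: kernel name, hostname, kernel version.
+cat > "$dir/uname.txt" <<'EOF'
+Linux shachi 5.13.15
+Linux chrysalis 5.10.63
+Linux kos-mos 5.10.45
+Linux ontos 5.10.52
+Linux pneuma 5.10.57
+EOF
+
+# Field 3 is the kernel version. Keep only major.minor, group identical
+# values with sort, count them with uniq -c, then put the biggest count first.
+counts=$(cat "$dir/uname.txt" | cut -d' ' -f3 | cut -d. -f1,2 | sort | uniq -c | sort -rn)
+printf '%s\n' "$counts"
+```
+
+Each stage was added one at a time while looking at the output of the previous
+one, which is the main reason to start the pipeline with `cat`.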
+
+[I wanted to save this compiler error to a file but it didn't work. I tried
+doing this:](conversation://Mara/hmm)
+
+```console
+$ rustc foo.rs > foo.log
+```
+
+But the output printed to the screen instead of the file:
+
+```console
+$ rustc foo.rs > foo.log
+error: expected one of `!` or `::`, found `main`
+ --> foo.rs:1:5
+ |
+1 | fun main() {}
+ | ^^^^ expected one of `!` or `::`
+
+error: aborting due to previous error
+
+$ cat foo.log
+$
+```
+
+This happens because there are actually _two_ output streams per program. There
+is the standard out stream and there is also a standard error stream. The reason
+that standard error exists is so that you can see if any errors have happened if
+you redirect standard out.
+
+Sometimes standard out may not be a stream of text. Say you have a compressed
+file you want to analyze and there's an issue with the decompression. If the
+decompressor wrote its errors to the standard output stream, it could confuse or
+corrupt your analysis.
+
+However, we can redirect standard error in particular by modifying how we
+redirect to the file:
+
+```console
+$ rustc foo.rs 2> foo.log
+$ cat foo.log
+error: expected one of `!` or `::`, found `main`
+ --> foo.rs:1:5
+ |
+1 | fun main() {}
+ | ^^^^ expected one of `!` or `::`
+
+error: aborting due to previous error
+```
+
+[Where did the `2` come from?](conversation://Mara/wat)
+
+So I mentioned earlier that redirection modifies the standard input and output
+of programs. This is not entirely true, but it was a convenient half-truth to
+help build this part of the explanation.
+
+For every process on a Unix-like system (such as Linux and macOS), the kernel
+stores a list of active file-like objects. This includes real files on the
+filesystem, pipes between processes, network sockets, and more. When a program
+reads or writes a file, it tells the kernel which file it wants to use by
+giving it a number that indexes into that list, starting at zero. Standard
+in/out/error are just the conventional names for the first three open files in
+the list, like this:
+
+| File Descriptor | Purpose |
+| :------ | :------- |
+| 0 | Standard input |
+| 1 | Standard output |
+| 2 | Standard error |
+
+Shell redirection simply changes which files are in that list of open files when
+the program starts running.
+
+That is why you use a `2` there, because you are telling the shell to change
+file descriptor number `2` of the `rustc` process to point to the filesystem
+file `foo.log`, which in turn makes the standard error of `rustc` get written to
+that file for you.
+
+In turn, this also means that `cat foo.txt > foo2.txt` is actually a shortcut
+for saying `cat foo.txt 1> foo2.txt`, but the `1` can be omitted there because
+standard out is usually the "default" output that most of these kinds of
+pipelines care about.
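+
+Here is a minimal sketch of those numbers in action. `talk` is a hypothetical
+function standing in for a program that writes to both streams, and the file
+names are made up. The last part shows that descriptors above `2` work the same
+way; they just don't have conventional names:
+
+```sh
+cd "$(mktemp -d)" || exit 1
+
+# talk is a made-up stand-in for a program that uses both output streams
+talk() {
+  echo "to standard out"
+  echo "to standard error" >&2
+}
+
+# 1> and 2> point descriptors 1 and 2 at two different files
+talk 1> out.txt 2> err.txt
+
+# Open descriptor 3 for writing, write one line to it, then close it again.
+exec 3> extra.txt
+echo "to descriptor 3" >&3
+exec 3>&-
+
+cat out.txt err.txt extra.txt
+```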
+
+[How would I get both standard output and standard error in the same
+file?](conversation://Mara/hmm)
+
+The cool part about the `>` operator is that it doesn't just stop with output to
+files on the disk. You can actually have one file descriptor get pointed to
+another. Let's say you have a need for both standard out and standard error to
+go to the same file. You can do this with a command like this:
+
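+```console
+$ rustc foo.rs > foo.log 2>&1
+```
+
+This tells the shell to point standard out at `foo.log` and then point standard
+error at wherever standard out currently goes, which is now `foo.log`. The order
+matters because the shell processes redirections from left to right: writing
+`2>&1 > foo.log` instead would point standard error at the terminal (where
+standard out was still going at that moment) and only then move standard out
+to the file. There's a short form of this too: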
+
+```console
+$ rustc foo.rs &> foo.log
+```
+
+[Where can I expect to use that?](conversation://Mara/hmm)
+
+[It's a `bash` extension rather than something from the original Bourne shell
+or POSIX `sh`, but I've tested it in `zsh` and `fish` too. In `bash` and `zsh`
+you can also do `|&` to pipe both standard out and standard error at the same
+time in the same way you'd do `2>&1 | whatever` (`fish` spells this
+`&|`).](conversation://Cadey/enby)
+
+That will put standard out and standard error into `foo.log` the same way that
+`> foo.log 2>&1` will. You can also use this with `>>`:
+
+```console
+$ rustc foo.rs &>> foo.log
+$ cat foo.log
+error: expected one of `!` or `::`, found `main`
+ --> foo.rs:1:5
+ |
+1 | fun main() {}
+ | ^^^^ expected one of `!` or `::`
+
+error: aborting due to previous error
+
+error: expected one of `!` or `::`, found `main`
+ --> foo.rs:1:5
+ |
+1 | fun main() {}
+ | ^^^^ expected one of `!` or `::`
+
+error: aborting due to previous error
+```
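+
+One detail that is easy to get backwards with the long form is the order of the
+redirections, because the shell applies them from left to right. Here is a small
+sketch to check this for yourself. `talk` is a hypothetical function that writes
+one line to each stream, and the log file names are made up:
+
+```sh
+cd "$(mktemp -d)" || exit 1
+
+talk() {
+  echo "out"
+  echo "err" >&2
+}
+
+# Standard out goes to the file first, then standard error follows it there.
+talk > both.log 2>&1
+
+# Here standard error is pointed at the *old* standard out first (inside $(...)
+# that is the capture pipe), and only then does standard out move to the file.
+leaked=$(talk 2>&1 > only-out.log)
+
+cat both.log
+cat only-out.log
+echo "escaped the file: $leaked"
+```
+
+In an interactive shell the "old" standard out is your terminal, which is why
+errors still show up on screen when the `2>&1` comes first.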
+
+[How do I redirect standard in to a file?](conversation://Mara/hmm)
+
+The answer there is not directly! There is a workaround in the form of a tool
+called `tee` which outputs its standard in to both standard out and a file. For
+example:
+
+```console
+$ dmesg | tee dmesg.txt | grep 'msedge'
+[ 70.585463] traps: msedge[4715] trap invalid opcode ip:5630ddcedc4c sp:7ffd41f67700 error:0 in msedge[5630d8fc2000+952d000]
+[ 70.702544] traps: msedge[4745] trap invalid opcode ip:5630ddcedc4c sp:7ffd41f67700 error:0 in msedge[5630d8fc2000+952d000]
+[ 70.806296] traps: msedge[4781] trap invalid opcode ip:5630ddcedc4c sp:7ffd41f67700 error:0 in msedge[5630d8fc2000+952d000]
+[ 70.918095] traps: msedge[4889] trap invalid opcode ip:5630ddcedc4c sp:7ffd41f67700 error:0 in msedge[5630d8fc2000+952d000]
+[ 71.031938] traps: msedge[4926] trap invalid opcode ip:5630ddcedc4c sp:7ffd41f67700 error:0 in msedge[5630d8fc2000+952d000]
+[ 71.138974] traps: msedge[4935] trap invalid opcode ip:5630ddcedc4c sp:7ffd41f67700 error:0 in msedge[5630d8fc2000+952d000]
+[ 1169.163603] traps: msedge[35719] trap invalid opcode ip:556a93951c4c sp:7ffc533f35c0 error:0 in msedge[556a8ec26000+952d000]
+[ 1213.301722] traps: msedge[36054] trap invalid opcode ip:55a245960c4c sp:7ffe6d169b40 error:0 in msedge[55a240c35000+952d000]
+[10963.234459] traps: msedge[104732] trap invalid opcode ip:55fdb864fc4c sp:7ffc996dfee0 error:0 in msedge[55fdb3924000+952d000]
+```
+
+This would put the output of the `dmesg` command (read from kernel logs) into
+`dmesg.txt`, as well as sending it into the grep command. You might want to do
+this when debugging long command pipelines to see exactly what is going into a
+program that isn't doing what you expect.
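+
+If you don't have a noisy kernel log handy, the same idea can be sketched with
+made-up input. The three lines fed in by `printf` and the `everything.txt` file
+name are invented for this example:
+
+```sh
+cd "$(mktemp -d)" || exit 1
+
+# Three made-up log lines standing in for real dmesg output.
+matches=$(printf 'usb: new device\ntraps: msedge crashed\nusb: device removed\n' \
+  | tee everything.txt \
+  | grep 'msedge')
+
+# grep only passed along the matching line...
+echo "grep saw: $matches"
+
+# ...but tee kept a copy of everything that flowed through it.
+cat everything.txt
+```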
+
+Redirections work in scripts too. You can also set "default" redirects for
+every command in a script using the `exec` command:
+
+```sh
+exec > out.log 2> error.log
+
+ls
+rustc foo.rs
+```
+
+This will have the file listing from `ls` written to `out.log` and any errors
+from `rustc` written to `error.log`.
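+
+Keep in mind that redirects set with `exec` stay in effect for the rest of the
+script. If you only want them for a few commands, one option is to wrap those
+commands in a subshell. This is a minimal sketch with made-up file names, and it
+uses `ls` on a file that doesn't exist to produce an error without needing
+`rustc`:
+
+```sh
+cd "$(mktemp -d)" || exit 1
+
+(
+  # Everything inside the parentheses runs in a subshell, so these redirects
+  # go away when the subshell exits.
+  exec > out.log 2> error.log
+
+  echo "this line goes to out.log"
+  ls ./no-such-file || true   # expected to fail; the complaint lands in error.log
+)
+
+echo "back on the original standard out"
+```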
+
+A lot of other shell tricks and fun is built on top of these fundamentals. For
+example you can take a folder, zip it up and then unzip it over on another
+machine using a command like this:
+
+```console
+$ tar cz ./blog | ssh pneuma tar xz -C ~/code/christine.website/blog
+```
+
+This will run `tar` to create a compressed copy of the `./blog` folder and then
+pipe that to tar on another computer to extract that into
+`~/code/christine.website/blog`. It's just pipes and redirection all the way
+down! Deep inside `ssh` it's really just piping output of commands back and
+forth over an encrypted network socket. Connecting to an IRC server is just
+piping in and out data to the chat server, even more so if you use TLS to
+connect there. In a way you can model just about everything in Unix with pipes
+and file descriptors because that is the cornerstone of its design: Everything
+is a file.
+
+[This doesn't mean it's literally a file on the disk, it means you can _interact
+with_ just about everything using the same system interface as you do with
+files. Even things like hard disks and video cards.](conversation://Mara/hacker)
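+
+`/dev/null` is probably the most commonly used example of this. It isn't stored
+on any disk, but you can redirect to it and from it like any other file. A small
+sketch (the echoed strings and variable names are made up):
+
+```sh
+# Writes to /dev/null are accepted and thrown away.
+echo "nobody will ever read this" > /dev/null
+
+# Reads from /dev/null hit end-of-file right away, so wc counts zero lines.
+# The $(( )) wrapper strips any padding that wc adds around the number.
+empty_lines=$(( $(wc -l < /dev/null) ))
+echo "lines read from /dev/null: $empty_lines"
+
+# Handy for silencing one stream while keeping the other.
+quiet=$( { echo kept; echo dropped >&2; } 2> /dev/null )
+echo "$quiet"
+```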
+
+Here's a fun thing to do. Using [`curl`](https://curl.se/) to read the contents
+of a URL and [`jq`](https://stedolan.github.io/jq/) to select out bits from a
+JSON stream, you can make a script that lets you read the most recent title from
+my blog's [JSONFeed](/blog.json):
+
+```sh
+#!/usr/bin/env bash
+# xeblog-post.sh
+
+curl -s https://christine.website/blog.json | jq -r '.items[0] | "\(.title) \(.url)"'
+```
+
+At the time of writing this post, here is the output I get from this command:
+
+```console
+$ ./xeblog-post.sh
+Anbernic RG280M Review https://christine.website/blog/rg280m-review
+```
+
+What else could you do with pipes and redirection? The cloud's the limit!
+
+---
+
+Thanks to violet spark for looking over this post and fact-checking as well as
+helping mend some of the brain dump and awkward wording into more polished
+sentences.
diff --git a/config.dhall b/config.dhall
index 70d4a86..827a00a 100644
--- a/config.dhall
+++ b/config.dhall
@@ -70,6 +70,14 @@ let Config =
, twitter = Some "BeJustFine"
, inSystem = True
}
+ , Author::{
+ , name = "Nicole"
+ , handle = "Twi"
+ , picUrl = None Text
+ , link = None Text
+ , twitter = None Text
+ , inSystem = True
+ }
]
, port = defaultPort
, clackSet = [ "Ashlynn" ]