---
title: "The carcinization of Go programs"
date: 2022-11-22
author: Heartmender
tags:
- cursed
- wasm
- go
- rust
hero:
  ai: "Waifu Diffusion v1.3 (float16)"
  file: "crab-invasion"
  prompt: "crabs, invasion, beach, palm trees, green hill zone, studio ghibli, xenoblade chronicles 2, pokemon, ken sugimori, thick outlines, ink"
---
Sometimes you just need to embed this one library written in another language
into your program. This has been a common thread amongst programmers since time
immemorial. This has always been a process fraught with peril, fear, torment, and
lemon-scented moist towelettes for some reason.

Normally if you want to call a Rust function from Go, you have to go through
some middleman like [cgo](https://pkg.go.dev/cmd/cgo). This works and is
somewhat elegant for how utterly terrible a hack cgo is.

However, the main problem is that when you use cgo to link a Rust function into
a Go program, you need to copy around the shared object that Rust generates. You
can't check this shared object into your source tree (it needs to be unique per
OS distribution per OS per CPU architecture, much like normal dynamically linked
binaries are). It does work, but overall the developer experience is poor. Your
build is no longer one simple `go build`. Now you have to remember to run `cargo
build --release` and ensure that the resulting `.so`, `.dll`, or `.dylib` is in
the right path for the OS' dynamic linker to read from. It's a mess.

This is such a big problem at a generic level that it is why [Nix and
NixOS](https://nixos.org) exist. Imagine how complicated this is when you get
general-purpose OS components into the mix. It's astounding that anything
works at all.

So what if I told you there was a way that we could ship _one_ binary from Rust,
have that work on _every_ platform Go supports, and not have to modify the build
process beyond a simple `go build`? Imagine how much easier that would be. It's
easy to imagine that such a thing would let users _not even know that Rust was
involved at all_, even if they consume a package or program that uses it.

I've done this with a package I call
[mastosan](https://github.com/Xe/x/tree/master/web/mastosan) and here's why it
exists as well as how I made it.
## Why
Mastodon stores toots in HTML and presents that HTML to API consumers. HTML is
very nice for a browser to display, but this is not as useful for a bot.
Especially if your goal is to send toots to a Slack webhook.

When you look at a toot in the API, its content looks something like this:
```html
test mention
@xe
so I can see what HTML mastodon makes
```
Ideally we'd like it to look semantically identical in Slack, maybe something
like this:
```
test mention so I can see what HTML mastodon makes
```
This will display the link in Slack like any other hyperlink. As things get more
elaborate, Mastodon will do more semantic weirdness like invisible spans and
other things that make displaying things on Slack annoying. Imagine the
difference between these two things:
```
https:// tailscale.com/blog/introducing -tailscale-funnel/
https://tailscale.com/blog/introducing-tailscale-funnel/
```
One is much easier for humans to understand than the other.
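To make the target format a bit more concrete, here is a minimal Go sketch of the two cleanups described above: gluing a URL back together from the fragments Mastodon spreads across spans, and wrapping a link in Slack's `<url|label>` syntax. The function names `joinFragments` and `slackLink` are made up for illustration and are not part of mastosan, which does this work in Rust. The sketch also skips the escaping of `&`, `<`, and `>` that real link labels would need.

```go
package main

import (
	"fmt"
	"strings"
)

// joinFragments (hypothetical helper) glues together the pieces of a URL
// that arrive as separate text fragments, one per span.
func joinFragments(parts ...string) string {
	return strings.Join(parts, "")
}

// slackLink (hypothetical helper) formats a hyperlink in Slack's
// angle-bracket syntax. With an empty label it emits the bare URL form.
// It does not escape special characters in the label.
func slackLink(url, label string) string {
	if label == "" {
		return "<" + url + ">"
	}
	return fmt.Sprintf("<%s|%s>", url, label)
}
```

Calling `joinFragments("https://", "tailscale.com/blog/introducing", "-tailscale-funnel/")` yields the readable single URL from the example above, which can then be passed to `slackLink`.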
## How
One of the core features of the UNIX philosophy is the idea that programs are
simple filters that do _one thing well_ and then allow you to compose them into
new and interesting ways. If you've ever used `curl` and `jq` together to do
things like read data from a JSONFeed, you know how this is in practice:
```
$ curl https://xeiaso.net/blog.json -qsSL | jq .items[0].title -r
The birdsong persists
```
I made a little program in Rust that uses
[lol_html](https://docs.rs/lol_html/latest/lol_html/) to take incoming
Mastodon-flavored HTML and emit slack-flavored markdown. Usage is simple:
```
$ echo 'test mention @xe so I can see what HTML mastodon makes
' | ./testdata/mastosan.wasm
test mention so I can see what HTML mastodon makes
```
That's it. It takes input on standard input and returns the result on standard
output. This doesn't cleanly map to the WebAssembly flow, except if you use
[WASI](https://wasi.dev/) to bridge the gap. WASI gives WebAssembly programs
enough of a POSIX-like environment that most basic things can work, but here we
are only really using two major parts of it: standard input and standard output.

In Go, if you were running this as a normal OS subprocess, you'd probably write
some code like this:
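```go
package foo

import (
	"bytes"
	"os/exec"
	"strings"
)

// HTML2Slackdown pipes input through the mastosan binary found in $PATH
// and returns its trimmed standard output.
func HTML2Slackdown(input string) (string, error) {
	// find the mastosan binary in $PATH
	loc, err := exec.LookPath("mastosan")
	if err != nil {
		return "", err
	}

	// set up standard input and standard output
	fout := &bytes.Buffer{}
	cmd := exec.Command(loc)
	cmd.Stdin = bytes.NewBufferString(input)
	cmd.Stdout = fout

	// run the subprocess and wait for it to exit
	if err := cmd.Run(); err != nil {
		return "", err
	}

	return strings.TrimSpace(fout.String()), nil
}
```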
However this still depends on the program being compiled for your native OS and
distribution as well as present in a folder in your `$PATH`. This works, but
this is not ideal in the slightest.

Rust lets you build a binary that targets WASI by passing `--target wasm32-wasi` to cargo:
```
cargo build --target wasm32-wasi --release --bin mastosan
```
This will emit a several megabyte binary file in
`./target/wasm32-wasi/release/mastosan.wasm`. When you run it, it will do what
you want.

Now you need to use it from Go. There are many choices for this, but I chose to
use [wazero](https://wazero.io/). The overall flow of using this is similar to
using a subprocess with `os/exec`, but slightly different because we're
embedding WebAssembly. It will look like this:
```go
package mastosan

import (
	"bytes"
	"context"
	_ "embed"
	"strings"

	"github.com/tetratelabs/wazero"
	"github.com/tetratelabs/wazero/imports/wasi_snapshot_preview1"
)

//go:embed testdata/mastosan.wasm
var mastosanWasm []byte

func HTML2Slackdown(ctx context.Context, text string) (string, error) {
	// create wazero runtime
	r := wazero.NewRuntime(ctx)
	defer r.Close(ctx)

	// load wasi environment into runtime
	wasi_snapshot_preview1.MustInstantiate(ctx, r)

	// set up standard output and standard input
	fout := &bytes.Buffer{}
	fin := bytes.NewBufferString(text)

	// create runtime configuration
	config := wazero.NewModuleConfig().WithStdout(fout).WithStdin(fin).WithArgs("mastosan")

	// compile the WASM module
	code, err := r.CompileModule(ctx, mastosanWasm)
	if err != nil {
		return "", err
	}

	// run the WASM module
	if _, err = r.InstantiateModule(ctx, code, config); err != nil {
		return "", err
	}

	return strings.TrimSpace(fout.String()), nil
}
```
This is mostly the same thing. You set up the environment, load the WASM module
and then run it. The main difference is that instead of loading the binary as
machine code from the disk, I use
[go:embed](https://pkg.go.dev/embed#hdr-Strings_and_Bytes) to embed the
precompiled WebAssembly module into the binary. This means that the resulting Go
program will Just Work as long as the WebAssembly module is present in the place
`go:embed` expects it at build time.
## Moar faster
One main drawback to this implementation is that it's a bit slow. It has to
compile the WebAssembly module _every time_ the function is called.

The wazero runtime and compiled WebAssembly module code can be lifted into a
package-level variable, like with [this
patch](https://github.com/Xe/x/commit/b61b59318be6544632ac1f64b1237bb17b2e7a32).
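The general shape of that change can be sketched without wazero at all. This is a minimal illustration of the compile-once pattern, not the code from the patch: `compiledModule` and `compile` are hypothetical stand-ins for the wazero compiled module and the expensive compile call, and the sketch uses `sync.Once` to compile lazily on first use, which is an assumption of mine rather than how the patch does it.

```go
package main

import (
	"strings"
	"sync"
)

// compiledModule is a hypothetical stand-in for a compiled WebAssembly
// module. It is not a wazero type.
type compiledModule struct {
	transform func(string) string
}

var (
	compileOnce  sync.Once
	module       *compiledModule
	compileCount int // counts how often the expensive step actually ran
)

// compile stands in for the expensive compilation step.
func compile() *compiledModule {
	compileCount++
	return &compiledModule{transform: strings.TrimSpace}
}

// getModule compiles on first use and returns the cached module afterwards.
// sync.Once makes this safe to call from multiple goroutines.
func getModule() *compiledModule {
	compileOnce.Do(func() { module = compile() })
	return module
}

// HTML2Slackdown reuses the package-level module on every call instead of
// compiling a fresh one each time.
func HTML2Slackdown(text string) string {
	return getModule().transform(text)
}
```

No matter how many times `HTML2Slackdown` is called, `compile` only runs once, so every call after the first skips the compilation cost.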
The main advantage this gives you is _speed_. After this patch, the WebAssembly
module is only ever compiled _once_, on application boot. Before this patch,
each invocation took about 0.2 seconds. Here are the benchmark results
after this patch:
```
BenchmarkHTML2Slackdown 1221 938774 ns/op
BenchmarkHTML2Slackdown-2 2293 488032 ns/op
BenchmarkHTML2Slackdown-6 3555 305505 ns/op
BenchmarkHTML2Slackdown-12 3897 297974 ns/op
```
It's gone down from 0.2 seconds in the best case to _0.3 milliseconds_ in the
best case. This is more than a 600x increase in performance, with most of the
time probably being spent in the HTML parser rather than being spent in anything
else.
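For anyone who wants to take similar measurements, a Go benchmark along these lines would produce output in the same format. This is a sketch, not the benchmark from the repository: `html2Slackdown` is a hypothetical stand-in for the real function, and the input string is made up.

```go
package main

import (
	"strings"
	"testing"
)

// html2Slackdown is a hypothetical stand-in for the function under test.
func html2Slackdown(text string) string {
	return strings.TrimSpace(text)
}

// BenchmarkHTML2Slackdown calls the function b.N times. The testing package
// chooses b.N and reports the average time per call as ns/op.
func BenchmarkHTML2Slackdown(b *testing.B) {
	const input = "<p>test mention</p>\n"
	for i := 0; i < b.N; i++ {
		if html2Slackdown(input) == "" {
			b.Fatal("unexpected empty output")
		}
	}
}
```

Placed in a `_test.go` file, this can be run with something like `go test -bench=HTML2Slackdown -cpu=1,2,6,12`. Each `-cpu` value produces one result line, and the `-2`, `-6`, and `-12` suffixes in the output are the `GOMAXPROCS` value used for that run.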
I think this is going to more than meet my needs both personally and at work.
I'm going to have to try this a bit more against random Mastodon messages to see
if it does what I want. It's cool to be able to merge two incompatible worlds
together and I'm excited to see what I can do in the future with this.

Darn you shitposting coworkers nerd sniping the poor DevRel on their
well-earned week off!