---
title: The Go WebAssembly ABI at a Low Level
date: 2022-10-17
slides_link: "https://drive.google.com/file/d/1RKitNYC77AYnsstNsYvJcBNnaKT06stb/view?usp=sharing"
tags:
 - wasm
 - golang
 - go
 - ieee754
---

<xeblog-talk-warning></xeblog-talk-warning>

# The Go WebAssembly ABI at a Low Level

<xeblog-video path="talks/golab-wasm"></xeblog-video>

<xeblog-conv name="Mara" mood="hacker">If that doesn't load, try viewing the
version on [YouTube](https://youtu.be/y-RYxMB4xFE).</xeblog-conv>

This talk was presented at [GoLab 2022](https://golab.io/) in Florence, Italy as
a remote talk. It was fully scripted using a conversational style and
prerecorded ahead of time.

## Talk

<xeblog-slide name="golab-wasm/slides/001" essential></xeblog-slide>

For over a decade, the dominant language for developing applications in browsers
has been JavaScript. Everyone in this room either deals with or touches
something in the world that deals with JavaScript. Recently the World Wide Web
Consortium published a standard called WebAssembly that is one of the first
steps towards JavaScript no longer being a requirement for developing things
targeting web browsers.
  
The Go 1.11 release in mid-2018 added support for compiling Go to WebAssembly.
It allows you to take a reasonable subset of Go programs and run them in
browsers alongside JavaScript.
  
I'm Xe Iaso and today I'm going to help you understand how this works and the
amazingly terrible hacks that power the core of this. This talk is aimed at
intermediate to expert audiences. It will likely hit best if you have *some*
familiarity with JavaScript and WebAssembly, and especially if you are a fan of
amazingly terrible ideas. To make sure everyone is on the same page, I'm going
to give background and context for everything as it is needed.
  
So come on and join me on this magical journey through calling conventions,
I-triple-E seven fifty-four floating point numbers and more as we learn the
nitty gritty of how Go's WebAssembly support works!

<xeblog-slide name="golab-wasm/slides/002"></xeblog-slide>

As it says on the tin, I'm Xe Iaso. I'm the Archmage of Infrastructure at
Tailscale and I am regularly accused of being an expert in both Go and
WebAssembly. I have an extensive background in development and site reliability
things, but I've been doing developer relations recently.
  
So you are aware, this talk is going to contain opinions about many topics.
These opinions are my own and are not the opinions of my employer.

<xeblog-slide name="golab-wasm/slides/003"></xeblog-slide>

WebAssembly is a specification that defines a bunch of semantics about how a
computer that doesn't exist should work. It defines the virtual machine, how the
stack works, the instructions the machine can run, the format that compilers
should target, and other fiddly details like that. In practice it is somewhere
between native code and a scripting language (much like Java class files), but
at a high level we can think about it like this:

<xeblog-slide name="golab-wasm/slides/004"></xeblog-slide>

WebAssembly itself is a computer that takes your code and executes it.
Inherently it will run any functions you want, store the results in its linear
memory or return values from the stack; but otherwise it's a glorified
reverse-polish-notation calculator that runs very fast.
  
With this you can do simple math all day, but that's not overly useful in the
real world by itself.
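
To make the calculator comparison a bit more concrete, here is a minimal sketch of a reverse Polish notation evaluator in Go. The `evalRPN` function and its space-separated token format are made up for illustration; a real WebAssembly engine works on typed binary instructions rather than strings, but the push-operands-then-apply-operator shape is the same idea.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// evalRPN evaluates a space-separated reverse Polish notation expression,
// such as "2 3 + 4 *", using an explicit operand stack. It is a toy model,
// not how any real WebAssembly engine is implemented.
func evalRPN(expr string) (int, error) {
	var stack []int
	for _, tok := range strings.Fields(expr) {
		switch tok {
		case "+", "-", "*":
			if len(stack) < 2 {
				return 0, fmt.Errorf("operator %q needs two operands", tok)
			}
			// Take the top two values off the stack and push the result.
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = stack[:len(stack)-2]
			switch tok {
			case "+":
				stack = append(stack, a+b)
			case "-":
				stack = append(stack, a-b)
			case "*":
				stack = append(stack, a*b)
			}
		default:
			// Anything that is not an operator must be a number to push.
			n, err := strconv.Atoi(tok)
			if err != nil {
				return 0, fmt.Errorf("bad token %q: %w", tok, err)
			}
			stack = append(stack, n)
		}
	}
	if len(stack) != 1 {
		return 0, fmt.Errorf("expected one result, have %d values", len(stack))
	}
	return stack[0], nil
}

func main() {
	result, err := evalRPN("2 3 + 4 *")
	fmt.Println(result, err) // 20 <nil>
}
```

Everything the machine knows lives on that one stack, which is why it needs help from the outside to do anything more interesting than math.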

<xeblog-slide name="golab-wasm/slides/005"></xeblog-slide>

The real magic for how WebAssembly gets useful comes from external functions
that get imported into WebAssembly-land. These functions can do just about
anything you want from "making an HTTP request with the JavaScript fetch()
function" to "read from and write to local storage".

<xeblog-slide name="golab-wasm/slides/006"></xeblog-slide>

WebAssembly was designed to run very fast on consumer hardware. Its binary format
was designed to be easy to parse, and WebAssembly instructions can easily
compile down to machine code with very little effort in real time.
  
Overall, this lets your WebAssembly program do whatever you want and access
whatever it needs, so it can be as powerful as JavaScript. There are
only a few caveats related to performance and translation between the two
worlds, much like the caveats with system calls on Unix. It's really nice in
practice.

<xeblog-slide name="golab-wasm/slides/007"></xeblog-slide>

However, let's take a look at that "functions imported from the environment"
thing a bit closer. The WebAssembly specification only defines the *virtual
machine*, how code is stored into and loaded from dot WASM files, and the
semantics of how all that works. It doesn't specify an API that programs written
to target WebAssembly can use to talk to the outside world.
  
Given the constraints of the WebAssembly team at the time, it's very reasonable
that they didn't try to also shove a stable interoperability API into the mix
when trying to get the minimum viable product out of the door. That could have
taken years. But, as a side effect of this, everyone has had to invent their own
one-off APIs for gluing the two sides together.

<xeblog-slide name="golab-wasm/slides/008"></xeblog-slide>

Oh and to make things even more fun, WebAssembly has no native string type, just
like C! All you get are contiguous blocks of RAM terminated by null characters.
Just like our friend, the PDP-11.

<xeblog-slide name="golab-wasm/slides/009"></xeblog-slide>

WebAssembly was originally intended for use in browsers, but there have been
efforts to standardize on an API for WebAssembly programs to function on server
environments. This allows operators to run arbitrary code from users and also
take advantage of the inherent isolation features WebAssembly brings to the
table. WASI (short for WebAssembly System Interface) is an independent standard
that gives WebAssembly programs some Unix-y calls, but overall we are talking
about a browser here, not a Unix system.
  
Go's WebAssembly support also was made before WASI even came out, so at the time
it wasn't a viable option. WASI also doesn't support system calls like "open
network socket", which makes it logistically annoying for writing many
real-world applications. As far as I am aware Go doesn't support WASI at all.

<xeblog-slide name="golab-wasm/slides/010"></xeblog-slide>

There is more than one Go compiler though. TinyGo is a Go compiler built on top
of LLVM that can compile a subset of Go programs to WebAssembly with WASI. I
want to reiterate that the Go WebAssembly port mostly targets browsers, not Unix
systems. Those two are different beasts entirely. For the sake of keeping things
simple in this talk, I'm going to focus on how Google's Go compiler does all of
this.

<xeblog-slide name="golab-wasm/slides/011" essential></xeblog-slide>

So with all of those caveats in mind, it's reasonable to wonder something like
"Why would I even use this in the first place? It's a brand new compiler port
with brand new platform semantics that I have to invent myself."
  
That's a reasonable thing to conclude, however I counter with these points:

<xeblog-slide name="golab-wasm/slides/012"></xeblog-slide>

Sometimes the one library call you need exists in Go but not in JavaScript, and
you really don't want to have to make an API call for it. WebAssembly is the
only officially sanctioned way to do this.
  
Previously, there was a community effort to compile Go to JavaScript called
GopherJS, but that has fallen out of favour as the WebAssembly port for Go gets
more and more mature.

<xeblog-slide name="golab-wasm/slides/013"></xeblog-slide>

Doing this also lets you run the same code in the same language on both your
browser and servers, which can help reduce cognitive complexity as you switch
between issues on the frontend and backend.

<xeblog-slide name="golab-wasm/slides/014"></xeblog-slide>

It's also new and fun! You're all programmers, right? You know just as well as I do that we have a hard time resisting the siren song of new things.

<xeblog-slide name="golab-wasm/slides/015"></xeblog-slide>

Here are some notable places where you can use Go's WebAssembly port in order
to get things done.

<xeblog-slide name="golab-wasm/slides/016"></xeblog-slide>

You can embed your already existing peer to peer VPN engine into a browser so
you can SSH into production from a webpage.

<xeblog-slide name="golab-wasm/slides/017"></xeblog-slide>

You can embed the new netip package into your JavaScript applications so you can
do advanced subnet calculations in order to make setting up networks faster.

<xeblog-slide name="golab-wasm/slides/018"></xeblog-slide>

You can write full featured web applications without having to write a lick of
JavaScript.

<xeblog-slide name="golab-wasm/slides/019"></xeblog-slide>

Another place Go's WebAssembly port has been used is as part of the process of
porting over the game "Bear's Restaurant" to the Nintendo Switch. The team made
it work in the WebAssembly port, then had a bunch of custom scripts recompile
that blob of WebAssembly to C++ and then wrapped the input and output layers to
the proprietary APIs that the Nintendo Switch uses.
  
As far as I know, "Bear's Restaurant" is the first commercially released game
that uses Go in any way on actual game console hardware.

<xeblog-slide name="golab-wasm/slides/020"></xeblog-slide>

With all this in mind, you can see how it would be hard to write an API that
would let you do anything you want in the browser with JavaScript like you were
writing native JavaScript code. It's a lot to consider because there is frankly
a lot going on. Computers are surprisingly complicated.

<xeblog-slide name="golab-wasm/slides/021" essential></xeblog-slide>

The Go standard library has a package called `syscall/js`. This defines the
system call API to a bunch of JavaScript code (included with every release of
Go) that helps bridge the gap between WebAssembly and JavaScript. You can focus
on writing your code in Go and let the system call layer handle the rest.

<xeblog-slide name="golab-wasm/slides/022"></xeblog-slide>

This works by giving you references to JavaScript objects along with a set of
calls to manipulate them however you want. This will let you do most of what you
can do to JavaScript objects from your Go code.
  
These references are opaque handles to objects outside of the program, just like
file descriptors are opaque handles to kernel objects in Unix.
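
If the file descriptor comparison feels abstract, here is a small sketch of the idea in plain Go. The `handleTable` type and its methods are hypothetical names I made up for this example; they are not how the JavaScript glue code is actually written. The point is only that the program holds a number while the other side holds the real object.

```go
package main

import "fmt"

// handleTable is a hypothetical stand-in for the bookkeeping the host side
// does: it owns the real objects and only ever hands out numeric IDs.
type handleTable struct {
	objects map[uint32]any
	nextID  uint32
}

func newHandleTable() *handleTable {
	return &handleTable{objects: make(map[uint32]any), nextID: 1}
}

// store files obj under a fresh ID and returns that ID as the opaque handle.
func (t *handleTable) store(obj any) uint32 {
	id := t.nextID
	t.nextID++
	t.objects[id] = obj
	return id
}

// load resolves a handle back to the object it refers to, if there is one.
func (t *handleTable) load(id uint32) (any, bool) {
	obj, ok := t.objects[id]
	return obj, ok
}

func main() {
	table := newHandleTable()
	h := table.store("pretend this is a DOM node")

	// The "guest" side only ever sees h. The table owner does the lookup.
	obj, ok := table.load(h)
	fmt.Println(h, obj, ok) // 1 pretend this is a DOM node true
}
```

The holder of a handle can pass it around and hand it back, but it can't look inside without asking the owner of the table.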

<xeblog-slide name="golab-wasm/slides/023"></xeblog-slide>

Oh and for extra fun, all of the object references are NaN values.

<xeblog-slide name="golab-wasm/slides/024"></xeblog-slide>

Yes, really. There is more than one NaN (not-a-number) value in floating point
logic. There are actually many more than you'd think possible. You know what,
let's take a moment to learn about how numbers work in computers so we all can
understand how utterly elegant this hack is.

<xeblog-slide name="golab-wasm/slides/025"></xeblog-slide>

As humans, we usually deal with numbers in what we call "base 10" or "decimal".
There are ten options for each digit. As digits go farther to the left on
numbers, those digits signify bigger and bigger values. Let's think about the
number four-hundred twenty-six:

<xeblog-slide name="golab-wasm/slides/026" essential></xeblog-slide>

This number is broken up into digits that correspond to different values. There
are four hundreds, two tens and six ones. Four-hundred twenty-six (426).

<xeblog-slide name="golab-wasm/slides/027"></xeblog-slide>

However, this only covers whole numbers. Many times we will deal with fractional
parts of a whole, such as with making exact change to two decimal places with
coins. Our number system expands to handle this too by adding columns for
tenths, hundredths and so on.

<xeblog-slide name="golab-wasm/slides/028" essential></xeblog-slide>

If we think about the number four-hundred twenty-six point three five, we can
also break it down like we did before. There are four hundreds, two tens and six
ones, and the three tenths and five hundredths come after the decimal point. 

<xeblog-slide name="golab-wasm/slides/029"></xeblog-slide>

That's how us humans deal with numbers. One of the weird things about the sand
we cursed into thinking is that sand deals with numbers in completely different
ways to humans. Our current computers deal with states that are either
completely on or completely off. With some conversion, you can use this to
express all the same mathematical operations as with decimal arithmetic, but
with two digit options instead of ten. We call this "base 2" or "binary"
mathematics.

<details class="warning">
  <summary>This slide was cut from the recording for time constraints</summary>

<xeblog-slide name="golab-wasm/slides/030"></xeblog-slide>

We call this "binary" because it's actually two words smashed together. "Bi"
means two and "ary" is short for the word "arity", which refers to the number
of arguments. Two-arguments, binary.

</details>

<xeblog-slide name="golab-wasm/slides/031" essential></xeblog-slide>

Instead of going by tens, each binary digit goes up by twos. The first digit is
the ones digit, the second is the twos digit, the third is the fours digit, the
fourth is the eights digit, et-cetera.
  
As a cheeky example, consider the base 10 number two-hundred and fifty-five. As
the diagram shows it's got 8 bits set. One for the 1's, the 2's, the 4's, the
8's, the 16's, the 32's, the 64's and the 128's. You can add all those
components up and get the total, 255.
  
Math operations work the same as you'd expect in binary. You just deal with twos
instead of tens.
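
As a quick sanity check on that arithmetic, here is a small Go sketch that lists the place value of every set bit in a number and adds them back up. The `placeValues` helper is a name I invented for this example.

```go
package main

import "fmt"

// placeValues returns the value contributed by every set bit of n,
// starting from the ones place and working up.
func placeValues(n uint) []uint {
	var values []uint
	// place walks 1, 2, 4, 8, ... and stops once it passes n
	// (or wraps around to zero on overflow).
	for place := uint(1); place != 0 && place <= n; place <<= 1 {
		if n&place != 0 {
			values = append(values, place)
		}
	}
	return values
}

func main() {
	values := placeValues(255)

	var total uint
	for _, v := range values {
		total += v
	}

	fmt.Printf("%08b = %v = %d\n", 255, values, total)
	// 11111111 = [1 2 4 8 16 32 64 128] = 255
}
```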

<xeblog-slide name="golab-wasm/slides/032"></xeblog-slide>

But then we get back to the problem of fractional components in numbers. The
system I just described works great for whole numbers, but fractional components
get a bit messy. You could imagine just slapping on a binary point somewhere and
doing some hacks to call it a day (and I imagine that older computers did just
that to save time in development), but it's the future and we have a standard
for this called I-triple-E 754.

<xeblog-slide name="golab-wasm/slides/033"></xeblog-slide>

IEEE-754 is the de-facto standard for expressing numbers with fractional
components, or floating-point numbers. It defines the binary form of these
numbers for use in computers. It was first defined in 1985 by the Institute of
Electrical and Electronics Engineers, or I-triple-E. This standard was designed
to help make it easier to implement and use code that uses floating-point
numbers by defining the semantics so that electrical engineers could implement
them in hardware.
  
Every major programming language, CPU and GPU made in the last thirty years or
more supports I-triple-E 754 floating point. It's also notably used by Go,
WebAssembly, and JavaScript. This means that you can pass floating point numbers
from JavaScript into your Go functions compiled into WebAssembly.

<xeblog-slide name="golab-wasm/slides/034"></xeblog-slide>

As an aside, for the rest of this bit I'm going to be using the 16 bit encoding
for floating point numbers to make my diagrams easier to understand. Natively,
JavaScript uses 64 bit floating point numbers. Just imagine that there's more
bits.

<xeblog-slide name="golab-wasm/slides/035" essential></xeblog-slide>

One of the cool parts about how this all was implemented is that floating point
numbers are essentially scientific notation. You have a sign bit to tell if the
number is positive or negative, an exponent of two, and the mantissa that you
multiply. This lets you express numbers like two point one two five as the
scientific notation form of two to the power of one times 1.0625. The exponent
is one and the mantissa is 1.0625.
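
You can check that breakdown yourself. The sketch below pulls the three fields out of a Go `float64`, so it uses the 64 bit layout (1 sign bit, 11 exponent bits, 52 mantissa bits) rather than the 16 bit one in my diagrams. The `decompose` helper is my own invention for this example and only handles ordinary numbers, not zero, subnormals, infinities or NaN.

```go
package main

import (
	"fmt"
	"math"
)

// decompose splits an ordinary float64 into its sign, its power-of-two
// exponent and its significand, so that the value equals
// significand * 2^exponent (negated if negative is true).
func decompose(f float64) (negative bool, exponent int, significand float64) {
	bits := math.Float64bits(f)

	negative = bits>>63 == 1
	// The exponent is stored with a bias of 1023 added to it.
	exponent = int((bits>>52)&0x7FF) - 1023
	// The stored mantissa is the fractional part after an implied leading 1.
	fraction := bits & ((1 << 52) - 1)
	significand = 1 + float64(fraction)/(1<<52)
	return negative, exponent, significand
}

func main() {
	fmt.Println(decompose(2.125)) // false 1 1.0625
}
```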

<xeblog-slide name="golab-wasm/slides/036"></xeblog-slide>

So with all this in mind, you'd probably wonder what the result of zero point
three minus zero point two is. First we need to convert these to floating point
numbers:

<xeblog-slide name="golab-wasm/slides/037" essential></xeblog-slide>

One of the first gotchas we will run into is the fact that we can't get an exact
replica of zero point three and zero point two in floating point numbers.
  
This is scientific notation, scientific notation gives you an *approximation* of
what the number is. The approximations add up, and the end result is that zero
point three minus zero point two is NOT zero point one in JavaScript, Go or most
other computer programming languages. You get zero point zero nine nine nine
et-cetera.

<xeblog-slide name="golab-wasm/slides/038"></xeblog-slide>

However, if you round this up to two decimal places, you do actually get zero
point one. So there is that.
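
Here is a small Go sketch of that subtraction and the rounding trick. One gotcha I'm working around: if you write `0.3 - 0.2` directly, Go evaluates it as an exact constant at compile time and hides the problem, so the made-up `subtract` helper forces the math to happen on real `float64` values.

```go
package main

import (
	"fmt"
	"math"
)

// subtract does the arithmetic on float64 values. Written as the constant
// expression 0.3 - 0.2, Go would compute the answer exactly at compile time.
func subtract(a, b float64) float64 {
	return a - b
}

// roundTo2 rounds f to two decimal places.
func roundTo2(f float64) float64 {
	return math.Round(f*100) / 100
}

func main() {
	diff := subtract(0.3, 0.2)

	fmt.Println(diff)                  // 0.09999999999999998
	fmt.Println(diff == 0.1)           // false
	fmt.Println(roundTo2(diff) == 0.1) // true
}
```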

<xeblog-slide name="golab-wasm/slides/039"></xeblog-slide>

One of the other things in I-triple-E 754 floating point numbers is an explicit
encoding for things that *are not* numbers, like infinity.

<xeblog-slide name="golab-wasm/slides/040" essential></xeblog-slide>

All you have to do is set all of the exponent bits and leave none of the
mantissa bits set. That gets you positive infinity.

<xeblog-slide name="golab-wasm/slides/041" essential></xeblog-slide>

If you flip the sign bit, you get negative infinity.

<xeblog-slide name="golab-wasm/slides/042" essential></xeblog-slide>

And if you set any of the mantissa bits as well, you get a Not-a-Number value,
also known as NaN. The Go to JavaScript interoperability uses NaN-space numbers
to encode object ids in the same way that Unix uses numerical file descriptors
to encode kernel objects.
  
With a 64 bit floating point number, this gives the Go to JavaScript bridge
something hilarious like 4.5 quadrillion (ten to the power of fifteen) possible
object IDs.
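
To see those bit patterns outside of a diagram, here is a sketch that builds them by hand for 64 bit floats. The mask names and the `classify` helper are mine, chosen for this example; the masks just follow the 1 sign bit, 11 exponent bits and 52 mantissa bits of the 64 bit layout.

```go
package main

import (
	"fmt"
	"math"
)

const (
	signBit      uint64 = 0x8000_0000_0000_0000 // 1 bit
	exponentMask uint64 = 0x7FF0_0000_0000_0000 // 11 bits
	mantissaMask uint64 = 0x000F_FFFF_FFFF_FFFF // 52 bits
)

// classify reports what a raw 64 bit pattern means when it is read as a
// floating point number, looking only at the special encodings.
func classify(bits uint64) string {
	if bits&exponentMask != exponentMask {
		return "ordinary number"
	}
	if bits&mantissaMask != 0 {
		return "NaN"
	}
	if bits&signBit != 0 {
		return "-Inf"
	}
	return "+Inf"
}

func main() {
	patterns := []uint64{
		exponentMask,           // all exponent bits, no mantissa bits
		signBit | exponentMask, // same, with the sign bit flipped
		exponentMask | 1,       // any mantissa bit set
	}
	for _, bits := range patterns {
		fmt.Printf("%016x %-4s %v\n", bits, classify(bits), math.Float64frombits(bits))
	}

	// 2^52 mantissa patterns share the all-ones exponent. One of them is
	// infinity, the rest are NaN values.
	fmt.Println(mantissaMask + 1) // 4503599627370496
}
```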

<xeblog-slide name="golab-wasm/slides/043" essential></xeblog-slide>

With a simple bitwise exclusive or (xor) on the exponent bits, you can extract
the NaN space number into a normal integer that the JavaScript side uses to
address objects it knows about.
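
Here is a simplified sketch of that trick in Go. To be clear about what I'm assuming: the `nanHead` constant, the helper names and the choice to keep a 32 bit ID in the lowest bits are picked for this example and are not copied from the real `syscall/js` code, which also has to track things like the type of each value. It also relies on the NaN payload surviving the trip through a `float64`, which I'd expect on mainstream 64 bit platforms.

```go
package main

import (
	"fmt"
	"math"
)

// nanHead sets every exponent bit plus the top mantissa bit, so anything
// carrying it reads as a NaN. The exact constant is an assumption made
// for this sketch.
const nanHead uint64 = 0x7FF8_0000_0000_0000

// boxID hides a 32 bit object ID in the low mantissa bits of a NaN.
func boxID(id uint32) float64 {
	return math.Float64frombits(nanHead | uint64(id))
}

// unboxID clears the NaN marker with an xor, leaving only the object ID.
func unboxID(f float64) uint32 {
	return uint32(math.Float64bits(f) ^ nanHead)
}

// isBoxed reports whether f is a NaN and so could be carrying an ID.
func isBoxed(f float64) bool {
	return math.IsNaN(f)
}

func main() {
	ref := boxID(42)
	fmt.Println(isBoxed(ref), unboxID(ref)) // true 42
}
```

Ordinary numbers pass through untouched, and anything that is a NaN can be treated as an ID to look up.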

<xeblog-slide name="golab-wasm/slides/044"></xeblog-slide>

The reason why you'd want to do this has to do with an absurdly ugly hack that
has been baked into the core of nearly every JavaScript engine.
  
They use NaN values as object IDs because then the object IDs can fit in a
machine register. This means that you can pass JavaScript object IDs as register
values to functions and then the function can look up things on it if it
actually needs to care. If it doesn't, the only thing that's copied around is
the very small object ID. NaN values also have a fast path in most CPU floating
point units, making this faster than you'd expect. Computers are very fast at
copying things, but it adds up when you do it a lot.
  
As above, so below, eh?

<xeblog-slide name="golab-wasm/slides/045"></xeblog-slide>

The main thing to take away is that the numbers encoded into NaN values are used
as object IDs. It's a horrifying wrapper that is faster in practice because CPUs
are lazy. A NaN value is not a number, but it can contain a number.

<xeblog-slide name="golab-wasm/slides/046"></xeblog-slide>

If you want to learn more about this, I really do suggest checking out jan
Misali's video ["how floating point works"](https://youtu.be/dQhj5RGtag0). It
covers all of this in so much more detail, including how you would go about
deriving the entire floating point number system from scratch.

Numbers are weird, eh?
  
With all that light thinking out of the way, let's focus on something more exciting. Like calling conventions.

<xeblog-slide name="golab-wasm/slides/047"></xeblog-slide>

When you are writing programs in machine language, sometimes you want to take
common bits of code and reuse them. We can call these bits of code "functions".
At a high level they need to get arguments somehow, return a result somehow, and
figure out how to go back to where the function was called so that the program
continues to work like normal.

<xeblog-slide name="golab-wasm/slides/048"></xeblog-slide>

A famous example of this is in the game Super Mario Brothers for the Nintendo
Entertainment System. Every 21 frames the game will call a function that checks
to see if the level is cleared or not. When that function gets called, if it
doesn't explicitly return to where it was called from then the NES will continue
to execute code after that function. This will probably not do what the
developers of the game intended. It will most likely make the NES crash, which
is not good.

<xeblog-slide name="golab-wasm/slides/049" essential></xeblog-slide>

So to work around that, there are some semantics described in long, boring
documents that specify the conventions of how you call functions. These
conventions spell out how arguments and return values work, the assumptions you
should make about CPU registers and other intermediate state like stack hygiene,
and how data is stored in memory.

<xeblog-slide name="golab-wasm/slides/050"></xeblog-slide>

As a fun aside, there are cases when a function is called that has more
parameters than the CPU has registers. In that case the remaining arguments
would be pushed to the CPU stack (or somewhere else in memory that the function
assumes it should read from). Sometimes you have to make things a little more
complicated to cope with edge cases.

<xeblog-slide name="golab-wasm/slides/051"></xeblog-slide>

Before Go 1.17, Go had a stack-based calling convention for most of its targets,
modelled after Plan 9 from Bell Labs. When you called Go functions, it put those
arguments on the stack, made room for the return parameters, and then told the
CPU to jump to the function in question. That function would pop the things it
needed off of the stack, do what it needs to and then return any results on the
stack.
  
This is technically a bit slow because the stack is stored in system memory, but
realistically computers are pretty darn fast so it mostly works out, mostly.
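
Here is a toy model of that convention in Go. A slice plays the part of the stack in memory, and the frame layout (one result slot followed by the arguments) is something I picked for this sketch; it is not Go's actual frame layout. The names `add` and `callAdd` are likewise just for illustration.

```go
package main

import "fmt"

// add is the callee. It never sees Go-level parameters: it reads its two
// arguments out of the frame and writes its result into the slot the
// caller reserved. The assumed layout is [result, a, b].
func add(frame []int) {
	frame[0] = frame[1] + frame[2]
}

// callAdd is the caller. It makes room for the result, puts the arguments
// on the stack, "jumps" to the function, then reads the result back.
func callAdd(a, b int) int {
	var stack []int

	base := len(stack)
	stack = append(stack, 0, a, b) // room for the result, then the arguments

	add(stack[base:])

	result := stack[base]
	stack = stack[:base] // the caller cleans up its own frame
	return result
}

func main() {
	fmt.Println(callAdd(2, 3)) // 5
}
```

Every argument and result takes a round trip through memory here, which is the cost being paid in exchange for a convention that is simple and works the same everywhere.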

<xeblog-slide name="golab-wasm/slides/052"></xeblog-slide>

WebAssembly is a stack-based virtual machine on the inside. This means that the
calling convention for WebAssembly functions is a bit similar to reverse Polish
notation:

```lisp
(