WebAssembly Overview

1. Introduction

What is WebAssembly (Wasm)?

WebAssembly is a low-level, assembly-like code format that is safe, portable, and optimized for efficient execution and compact representation. It acts as a compilation target for many programming languages, such as:

  • C/C++ (using compilers like Emscripten or Clang).

  • Rust (using Rust's Wasm target, wasm32-unknown-unknown).

  • Go (using Go’s WebAssembly support).

  • Haskell (using tools like GHC/Asterius).

The output of these compilers is a WebAssembly binary (a .wasm file). These binaries can run in web browsers (e.g., Firefox, Chrome, Safari) or in stand-alone environments (e.g., Node.js, WAVM).


The WebAssembly Workflow:

Source Code (C++, Rust, Go, Haskell, etc.)
      ↓
  Compiler (e.g., Emscripten, Rust compiler)
      ↓
WebAssembly Binary (.wasm file)
      ↓
   Runtime (Browser, Node.js, etc.)

Once compiled, the WebAssembly binary is loaded into a runtime environment, where it is validated for safety and instantiated for execution. WebAssembly is a virtual instruction set architecture (ISA): the runtime either interprets the code or compiles it, ahead of time or just in time, to native machine code for the host CPU (such as x86 or ARM).

Key Benefit: WebAssembly is sandboxed. A module can reach only its own linear memory and whatever the host explicitly passes in as imports, and an out-of-bounds memory access traps instead of touching memory that belongs to the host.


2. WebAssembly Abstract Machine

At runtime, WebAssembly binaries are executed within a WebAssembly abstract machine, a conceptual model that defines how code runs and interacts with memory, functions, and the environment.


Key Components of the Abstract Machine:

  1. Store:
    Holds all runtime resources:

    • Functions (compiled code),

    • Tables (function references),

    • Linear memory (byte arrays),

    • Globals (global variables).

  2. Stacks:

    • Operand Stack: A stack for intermediate values used by instructions. For example, i32.add pops two values, adds them, and pushes the result.

    • Control Stack: Tracks labels for control flow (e.g., blocks, loops, conditionals).

    • Call Stack: Tracks function calls and their local variables.


2.1. Store

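The store holds every function, memory, table, and global that has been allocated while the abstract machine runs. A module instance does not own these resources directly; it refers to them by their address in the store. This lets instances share a memory or table through imports and exports while staying isolated from anything they were not given.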


2.2. Stacks Explained

  • Operand Stack:
    Instructions like i32.const push values onto this stack, while operations like i32.add pop operands and push the result.

  • Control Stack:
    Handles control flow, such as blocks and branches. Instructions like br (branch) refer to labels based on their nesting depth in the control stack.

  • Call Stack:
    Tracks active function calls. When a function is called (call #n), a frame is pushed to the call stack, holding:

    • Local variables (including parameters),

    • The return arity of the function.

    When the function returns, the frame is popped off.
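
The interplay between the operand stack and call frames can be sketched with a toy evaluator. This is a minimal illustration, not a real engine: the tuple-based instruction encoding, the FUNCTIONS table, and the run helper are all hypothetical, and Python's own recursion stands in for the call stack.

```python
MASK32 = 0xFFFFFFFF

def run(functions, index, args):
    """Execute functions[index] with args and return its results as a list."""
    n_params, n_results, body = functions[index]
    assert len(args) == n_params
    # One call-stack frame: the locals (parameters first) and the return arity.
    frame = {"locals": list(args), "arity": n_results}
    stack = []  # operand stack for this activation
    for op, *imm in body:
        if op == "i32.const":
            stack.append(imm[0] & MASK32)
        elif op == "local.get":
            stack.append(frame["locals"][imm[0]])
        elif op == "i32.add":
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & MASK32)
        elif op == "call":
            callee_params = functions[imm[0]][0]
            # Arguments are popped in reverse order, then restored.
            call_args = [stack.pop() for _ in range(callee_params)][::-1]
            stack.extend(run(functions, imm[0], call_args))
        else:
            raise ValueError(f"unsupported instruction: {op}")
    # The arity says how many values are handed back when the frame is popped.
    return stack[len(stack) - frame["arity"]:]

# Each entry is (parameter count, result count, body).
FUNCTIONS = [
    # func 0: (a, b) -> a + b
    (2, 1, [("local.get", 0), ("local.get", 1), ("i32.add",)]),
    # func 1: () -> func0(40, 2)
    (0, 1, [("i32.const", 40), ("i32.const", 2), ("call", 0)]),
]

result = run(FUNCTIONS, 1, [])
```

Calling run(FUNCTIONS, 1, []) pushes two constants, calls function 0 with them as its locals, and receives a single result back on the caller's operand stack.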


2.3. Traps

A trap is an error that immediately halts execution. Common causes include:

  • Accessing memory out of bounds,

  • Dividing an integer by zero (floating-point division does not trap; it yields infinity or NaN),

  • Making an indirect call (call_indirect) through a null or out-of-range table entry, or with a mismatched signature,

  • Executing the unreachable instruction.

Traps cannot be handled within Wasm but are reported to the host (e.g., a browser or Node.js) for further action.
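
As a rough model of this division of labour, the sketch below raises a hypothetical Trap exception from "guest" operations and catches it only on the "host" side. The names Trap, i32_div_u, i32_load8_u, and host_call are illustrative assumptions; in the JavaScript API, a trap surfaces as a WebAssembly.RuntimeError.

```python
class Trap(Exception):
    """Hypothetical stand-in for a Wasm trap surfaced to the host."""

def i32_div_u(a, b):
    # Unsigned integer division traps when the divisor is zero.
    if b == 0:
        raise Trap("integer divide by zero")
    return a // b

def i32_load8_u(memory, addr):
    # Every memory access is bounds-checked against the current memory size.
    if addr < 0 or addr >= len(memory):
        raise Trap("out of bounds memory access")
    return memory[addr]

def unreachable():
    # The unreachable instruction traps unconditionally.
    raise Trap("unreachable")

def host_call(fn, *args):
    """Host side: run guest code and report a trap instead of crashing."""
    try:
        return ("ok", fn(*args))
    except Trap as trap:
        return ("trap", str(trap))
```

Note that the guest functions never catch Trap themselves; only host_call does, which mirrors the rule that traps are not handled inside Wasm.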


2.4. External Interface

WebAssembly interacts with its environment through imports and exports:

  • Imports: Functions, tables, memories, or globals provided by the host environment (e.g., a JavaScript function that wraps console.log).

  • Exports: Functions, tables, memories, or globals that the module makes available to the host.
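
The import/export handshake can be modelled in a few lines. The sketch below assumes a two-level import namespace (module name, then field name), which is how the JavaScript API's import object is organized; everything else (link, make_exports, the env.log import, and the add_and_log export) is hypothetical and exists only for illustration.

```python
def link(required_imports, make_exports, import_object):
    """Resolve each (module, field) import, then let the module build its exports."""
    resolved = {}
    for module_name, field in required_imports:
        try:
            resolved[(module_name, field)] = import_object[module_name][field]
        except KeyError:
            raise LookupError(f"missing import {module_name}.{field}") from None
    return make_exports(resolved)

# Hypothetical module: imports env.log, exports add_and_log.
def make_exports(imports):
    log = imports[("env", "log")]

    def add_and_log(a, b):
        result = (a + b) & 0xFFFFFFFF
        log(result)  # call out to the host through the import
        return result

    return {"add_and_log": add_and_log}

logged = []  # the host's "console"
exports = link([("env", "log")], make_exports, {"env": {"log": logged.append}})
total = exports["add_and_log"](2, 3)
```

The module never sees the host directly: it can call only what was placed in the import object, and the host can call only what appears in the exports.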


3. The WebAssembly Module

A WebAssembly module is the basic unit of code distribution. It contains definitions for:

  • Types: Function signatures (parameter and result types), referenced by functions, imports, and call_indirect.

  • Functions: Compiled code for execution.

  • Tables: Arrays of function references.

  • Memory: Contiguous blocks of bytes (linear memory).

  • Globals: Variables, which can be mutable or immutable.

  • Imports/Exports: Interfacing with the host environment.


3.1. Module Instantiation

When a module is loaded into a runtime:

  1. It is validated to ensure safety (e.g., type correctness).

  2. Its resources (memory, functions, etc.) are allocated in the store.

  3. Memories and tables are initialized from the module's data and element segments.

  4. The optional start function is run, if present.

At this point, the module instance is ready to execute.
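
The four steps can be sketched against a dict-based toy module. This is only a model under stated assumptions: the module layout, the store dictionary, instantiate_module, and start_fn are hypothetical, and validation is reduced to a single check (that the start index names a defined function).

```python
PAGE_SIZE = 65536  # bytes per Wasm page (64 KiB)

class ValidationError(Exception):
    pass

def instantiate_module(module, store):
    """Follow the four instantiation steps for a dict-based toy module."""
    # 1. Validate: here, only that the start index names a defined function.
    start = module.get("start")
    if start is not None and not 0 <= start < len(module["funcs"]):
        raise ValidationError("unknown start function")

    # 2. Allocate: append resources to the store and remember their addresses.
    instance = {"func_addrs": [], "mem_addr": len(store["mems"])}
    store["mems"].append(bytearray(module["memory_pages"] * PAGE_SIZE))
    for fn in module["funcs"]:
        instance["func_addrs"].append(len(store["funcs"]))
        store["funcs"].append(fn)

    # 3. Initialize: copy data segments into linear memory.
    #    (A real engine bounds-checks this copy; the sketch does not.)
    memory = store["mems"][instance["mem_addr"]]
    for offset, data in module["data"]:
        memory[offset:offset + len(data)] = data

    # 4. Run the optional start function.
    if start is not None:
        store["funcs"][instance["func_addrs"][start]](memory)
    return instance

def start_fn(memory):
    memory[0] += 1  # hypothetical start function: bump byte 0

store = {"funcs": [], "mems": []}
module = {
    "memory_pages": 1,
    "funcs": [start_fn],
    "data": [(0, b"\x29"), (8, b"hi")],
    "start": 0,
}
instance = instantiate_module(module, store)
memory = store["mems"][instance["mem_addr"]]
```

The instance holds only addresses into the store, so the start function sees the memory after the data segments have been copied in.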


3.2. Binary Encoding

A WebAssembly binary is divided into sections, such as:

  • Type Section (ID = 1): Describes function signatures.

  • Import Section (ID = 2): Lists imported resources.

  • Function Section (ID = 3): Declares each function by giving the index of its signature in the type section; the bodies themselves live in the code section.

  • Code Section (ID = 10): Contains the actual function bodies.

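Most integers in a WebAssembly binary (section sizes, indices, integer constants) are encoded using LEB128, a compact, variable-length integer format. The exceptions are the fixed 4-byte magic number and 4-byte version at the start of the file, and floating-point constants, which are stored as little-endian IEEE 754 values.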
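
To make the layout concrete, here is a small sketch that encodes and decodes unsigned LEB128 and walks the sections of a hand-assembled module. The helper names (encode_uleb128, decode_uleb128, list_sections) are illustrative, and MINIMAL_MODULE is a tiny module written out by hand that defines one empty function of type () -> ().

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_uleb128(data, pos=0):
    """Return (value, next_pos) for the unsigned LEB128 integer at data[pos]."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def list_sections(binary):
    """Yield (section_id, payload_size) for each section in a module."""
    if binary[:4] != b"\x00asm":
        raise ValueError("missing Wasm magic number")
    if binary[4:8] != b"\x01\x00\x00\x00":
        raise ValueError("unsupported version")
    pos = 8
    while pos < len(binary):
        section_id = binary[pos]                     # one-byte section ID
        size, pos = decode_uleb128(binary, pos + 1)  # LEB128 payload size
        yield section_id, size
        pos += size                                  # skip the payload

MINIMAL_MODULE = (
    b"\x00asm\x01\x00\x00\x00"   # magic number and version 1
    b"\x01\x04\x01\x60\x00\x00"  # type section: one type, () -> ()
    b"\x03\x02\x01\x00"          # function section: one function of type 0
    b"\x0a\x04\x01\x02\x00\x0b"  # code section: one body, no locals, just `end`
)

sections = list(list_sections(MINIMAL_MODULE))
```

Because every section starts with its ID and byte length, a tool can skip sections it does not understand without parsing their contents.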


4. WebAssembly Instructions

WebAssembly instructions are compact, stack-based operations. They fall into two main categories:

  1. Simple Instructions:

    • Numeric: Perform arithmetic or logical operations (e.g., i32.add).

    • Memory: Access or modify memory (e.g., i32.load).

    • Local/Global: Read/write variables (e.g., local.get, global.set; older tools spell these get_local and set_global).

  2. Control Instructions:

    • Manage structured flow using block, loop, and if/else.

    • Use branching (br, br_if) to jump to the label of an enclosing block or loop.

    • Call functions using call or call_indirect.


4.1. Numeric Instructions

  • Example: i32.add pops two values from the operand stack, adds them, and pushes the result.

      ;; Example: i32.add
      ;; Stack before: [..., a, b]
      ;; Instruction: i32.add
      ;; Stack after: [..., (a + b)]
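
One detail the stack picture hides is that i32.add works modulo 2^32, so overflow wraps around rather than trapping. A minimal sketch, with the operand stack modelled as a Python list and the helper names i32_add and as_signed chosen only for illustration:

```python
MASK32 = 0xFFFFFFFF

def i32_add(stack):
    """Pop b, then a, and push (a + b) mod 2**32, mirroring i32.add."""
    b = stack.pop()
    a = stack.pop()
    stack.append((a + b) & MASK32)

def as_signed(value):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value

stack = [7, 2, 3]          # [..., a, b] with a = 2 and b = 3
i32_add(stack)             # stack is now [7, 5]

overflow = [0x7FFFFFFF, 1]
i32_add(overflow)          # wraps to the bit pattern 0x80000000
```

The same bit pattern serves both signed and unsigned readings; only instructions such as comparisons and division come in separate signed and unsigned variants.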
    

4.2. Memory Instructions

  • load/store (e.g., i32.load, i32.store): Read or write linear memory at a byte address taken from the operand stack.

  • memory.grow: Increase memory size in 64 KiB pages.
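
A linear memory can be approximated with a bytearray. The sketch below is an illustration under assumptions: the LinearMemory class and its method names are hypothetical, and an IndexError stands in for a trap. It does reflect three facts about Wasm: values are stored little-endian, accesses are bounds-checked, and memory.grow returns the previous size in pages, or -1 on failure.

```python
import struct

PAGE_SIZE = 65536  # 64 KiB

class LinearMemory:
    def __init__(self, initial_pages, max_pages):
        self.data = bytearray(initial_pages * PAGE_SIZE)
        self.max_pages = max_pages

    def _check(self, addr, size):
        # Every access must lie entirely inside the current memory.
        if addr < 0 or addr + size > len(self.data):
            raise IndexError("out of bounds memory access")  # a trap in Wasm

    def i32_store(self, addr, value):
        self._check(addr, 4)
        struct.pack_into("<I", self.data, addr, value & 0xFFFFFFFF)

    def i32_load(self, addr):
        self._check(addr, 4)
        return struct.unpack_from("<I", self.data, addr)[0]

    def grow(self, delta_pages):
        """Return the old size in pages, or -1 if the request cannot be met."""
        old_pages = len(self.data) // PAGE_SIZE
        if old_pages + delta_pages > self.max_pages:
            return -1
        self.data.extend(bytes(delta_pages * PAGE_SIZE))  # new pages are zeroed
        return old_pages

mem = LinearMemory(initial_pages=1, max_pages=2)
mem.i32_store(0, 0x11223344)
first_bytes = bytes(mem.data[:4])   # least significant byte comes first
```

Growing never moves or invalidates existing addresses; it only raises the upper bound that the bounds check compares against.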


4.3. Control Flow

  • block, loop, if/else: Define structured control flow.

  • br, br_if: Branch unconditionally or conditionally to the label of an enclosing block or loop.

  • call: Invoke a function.

  • return: Exit the current function.
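
Branch targets are relative: br 0 names the innermost enclosing label, br 1 the next one out, and so on. Branching to a loop label jumps back to the start of the loop, while branching to a block or if label jumps to its end. The sketch below models only that lookup; the (kind, start, end) tuples, the instruction positions, and branch_target are hypothetical.

```python
def branch_target(control_stack, depth):
    """Resolve `br depth` against a control stack of (kind, start, end) labels.

    The innermost label is last in the list, so depth 0 selects it.
    """
    if depth >= len(control_stack):
        raise IndexError("label depth exceeds nesting")  # rejected by validation
    kind, start, end = control_stack[-1 - depth]
    # Branching to a loop label repeats the loop; any other label is exited.
    return start if kind == "loop" else end

# Positions in a small, hand-numbered instruction sequence:
#   0: block
#   1:   loop
#   2:     br_if 1   ;; leave the block when the condition holds
#   3:     br 0      ;; otherwise repeat the loop
#   4:   end
#   5: end
control_stack = [("block", 0, 5), ("loop", 1, 4)]

repeat_at = branch_target(control_stack, 0)  # innermost label is the loop
exit_at = branch_target(control_stack, 1)    # next label out is the block
```

Because labels are always resolved against the enclosing structure, a branch can never jump into the middle of another block.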