24 October 2025 · 11 min read compiler-design linear-types ckb risc-v

CellScript: A Language Where Linear Types Meet the Cell Model

A design note on a domain-specific language that compiles typed, linear-resource-aware smart contracts to RISC-V for the CKB virtual machine. The language is in alpha, the compiler has rough edges, and several important features remain metadata-only scaffolding. But the core idea — that the type system should know about Cell lifecycle, not just about data types — seems to us worth writing down.

1. The Starting Point

CKB, for those unfamiliar with it, is a proof-of-work blockchain whose virtual machine is a standard RISC-V core (RV64IMAC). Smart contracts on CKB are called “scripts,” and they operate over a Cell model rather than an account model. A Cell is a data container with some capacity, a lock script (who owns it), and a type script (what kind of data it holds). Transactions consume input Cells and create output Cells. The type script of each output is invoked to verify that the transformation is legitimate.

This is, we think, a rather elegant model. But the developer experience of writing CKB scripts by hand is punishing. One works close to the wire format: parsing witness bytes manually, tracking inputs and outputs by index, encoding typed state into raw byte arrays, and writing C or assembly against syscall numbers. The Cell model’s structural guarantees — linear ownership, explicit consumption, schema-matched creation — are preserved only by convention. The compiler does not know about them. The programmer must remember them.

CellScript is an attempt to raise the programming model so that the language’s type system encodes the Cell model directly. A resource in CellScript is linear: it cannot be copied, silently dropped, or forgotten. A consume operation is a first-class language construct, not a convention about transaction indices. A create operation names its output and constrains its fields, and the generated RISC-V code reads back the proposed output from the transaction to verify it matches the intent — verifying, not trusting.

2. Four Kinds of Cell-Backed Data

The Cell model, in practice, supports several distinct patterns of use. CellScript gives each pattern its own type declaration, because the lifecycle rules differ fundamentally.

flowchart TD
    classDef res fill:diagramSurface,stroke:diagramBorder,stroke-width:1.5px,color:diagramText
    classDef shr fill:diagramSurfaceMuted,stroke:diagramBorderMuted,stroke-width:1px,color:diagramText
    classDef rcpt fill:diagramSurfaceMuted,stroke:diagramBorderMuted,stroke-width:1px,color:diagramText
    classDef data fill:diagramSurfaceMuted,stroke:diagramBorderMuted,stroke-dasharray:4 2,color:diagramTextMuted

    R["resource Token<br/>Linear, owned<br/>Consumed + Created on transfer"]:::res
    S["shared Pool<br/>Mutable global state<br/>Preserved in-place"]:::shr
    RC["receipt VestingGrant<br/>Single-use proof<br/>Consumed on claim"]:::rcpt
    D["struct Config<br/>Pure data, not Cell-backed<br/>No on-chain lifecycle"]:::data

    R -->|"consume + create"| TX["Transaction"]
    S -->|"preserve fields"| TX
    RC -->|"consume, no recreation"| TX

    class TX res

A resource is a linear, owned asset — a token, an NFT, a mint authority. It maps to a Cell that appears as a transaction input when consumed and as an output when created. The compiler tracks every resource binding through a state machine: Available, Consumed, Transferred, or Destroyed. At the end of an action body, every resource must have left the Available state. If one remains, compilation fails. If one is used twice, compilation fails. This is linear typing, and it prevents both resource leaks and double-spends at compile time.

A shared declaration marks contention-sensitive protocol state — an AMM pool, a registry, a configuration Cell. Shared Cells are mutated in place within a type group: the action specifies which fields are preserved unchanged and which are updated. The codegen emits byte-level comparison instructions that verify the preserved fields match between input and output.

A receipt is a single-use proof Cell — a vesting grant, a liquidity position, a marketplace listing. It is consumed when claimed and typically produces a new resource as output. The compiler enforces that receipts are consumed exactly once.

A struct is pure data — a wallet address, a configuration record. Structs are not Cell-backed. They exist in local computation and witness data. They have no on-chain lifecycle.

The distinction matters because the type checker enforces fundamentally different rules for each. A resource that is not consumed is a compile error. A shared Cell that is not preserved is a compile error. A receipt that is created but never consumed is flagged. A struct that tries to use consume is a type error. The programmer cannot accidentally treat a shared pool as a transferable asset, because the language does not allow it.

3. Linear Checking in Practice

The linear type system is, in our view, the single most important feature of CellScript. It is also the feature most likely to surprise programmers accustomed to account-model languages where state is just storage slots.

Every Cell-backed binding enters the type environment with a linear state of Available. The checker tracks transitions:

stateDiagram-v2
    [*] --> Available : binding declared
    Available --> Consumed : consume
    Available --> Transferred : std.lifecycle.transfer
    Available --> Destroyed : destroy
    Consumed --> [*]
    Transferred --> [*]
    Destroyed --> [*]

Three rules are enforced strictly:

Completeness. At the end of every action body, all Cell-backed bindings must have left the Available state. A forgotten resource is a compilation failure, not a runtime bug.

No double-use. Attempting to consume a resource that is already in the Consumed state is a compilation failure. This prevents double-spends statically.

Branch consistency. After a conditional (if / match), both branches must agree on the linear state of every binding. If one branch consumes a resource but the other does not, compilation fails. Loops cannot change linear state at all — a resource consumed inside a loop body is a type error.

These rules are not advisory. They are enforced by the compiler, and the generated RISC-V code is structured so that the CKB VM verifies the Cell effects at runtime. The linear checking is a compile-time guarantee that the runtime effects will be well-formed.

4. Effects and Verification

CellScript compiles to RISC-V ELF artifacts that run on CKB’s VM. The codegen does not trust the transaction builder. It verifies.

When an action creates an output, the generated code does not simply write field values to memory and hope they end up in the transaction. It constructs the intended values, then uses CKB syscalls to read back the proposed output from the transaction, and compares the two. If they differ, the script aborts. This is the “verify, don’t trust” pattern, and it runs against a natural temptation in smart contract development: assuming that the in-memory representation of a proposed output matches what actually appears on-chain.

sequenceDiagram
    autonumber
    participant Action as Action Body
    participant IR as IR Layer
    participant CG as RISC-V Codegen
    participant VM as CKB VM

    Action->>IR: create token = Token { amount: 100, symbol: "TEST" }
    IR->>IR: Record CreatePattern with field values
    IR->>CG: Emit create instruction
    CG->>VM: Store intended field values to stack buffer
    CG->>VM: syscall_load_cell_data(OUTPUT, index)
    VM-->>CG: Proposed output data from transaction
    CG->>VM: Compare field-by-field at computed offsets
    alt Fields match
        CG->>VM: Continue execution
    else Fields differ
        CG->>VM: Abort (verification failed)
    end

When an action destroys a resource, the generated code emits an output-absence scan: it loads the consumed input’s type hash, then iterates all transaction outputs checking that none share the same type hash. The scan terminates when it encounters CKB_INDEX_OUT_OF_BOUND. This proves the resource no longer exists in the output set.

The destruction verification is not uniform. CellScript supports multiple destruction policies — SingletonType (the type appears exactly once in inputs and zero times in outputs), Unique (instance-level tracking), BurnAmount (the burned amount is accounted for in a separate field), and others. The choice of policy is declared in the resource definition and affects the generated verification code.

5. The Stdlib as Macro Expansion

A design decision that we think deserves attention is how the standard library works.

Security-critical operations like std::lifecycle::transfer, std::receipt::claim, and std::lifecycle::settle are not runtime function calls. They are AST-to-AST macro expansions at the IR layer. When the compiler encounters std::lifecycle::transfer(input, output, lock), it expands it into:

consume input
create output = <type> { fields... } with_lock(lock)
require output.type_hash() == input.type_hash()
preserve output from input { fields... }

These expanded operations are then tracked individually by the effect system and the ProofPlan metadata generator. Every obligation — the consume, the create, the type-hash preservation, the field-level preservation — appears as a distinct entry in the audit metadata.

The alternative — implementing transfer as an opaque runtime call — would mean that the compiler cannot reason about what the transfer does internally. It could not verify field preservation or generate ProofPlan coverage. By expanding at the IR layer, the compiler retains full visibility into the security-relevant operations.

The actual runtime stdlib is deliberately thin: min, max, isqrt, abs_diff, current_timepoint, epoch_number, remaining_cycles. These are safe to be opaque because they are pure functions or environment queries with no Cell effects.

6. Effect Classification

Every action is classified into one of five effect classes: Pure, ReadOnly, Mutating, Creating, or Destroying. This classification is not decorative — it is a security property.

The compiler infers the effect class from the action’s Cell operations. An action that contains a consume is at least Mutating. An action that contains a create is at least Creating. An action that contains a destroy is Destroying. If the programmer declares a weaker effect class than the inferred one, compilation fails.

This matters because effect classification determines what the action is permitted to do. A ReadOnly action should not be able to consume resources or create outputs. The type system enforces this at compile time, and the CKB target profile can reject actions whose declared effects do not match the inferred ones.

7. ProofPlan: Audit Metadata as a First-Class Output

One of the more unusual features of CellScript is the ProofPlan metadata system. Every compiled action and invariant generates a structured record capturing:

What the action claims to do (trigger, scope, reads).
What the action actually does (Cell effects, field-level access patterns).
Whether the generated code covers each declared invariant.
Whether any gaps exist between declared invariants and enforced obligations.

An invariant like assert_conserved(Token.amount, scope = group) generates a ProofPlan entry. The system then searches for action-level obligations that cover resource conservation for Token.amount. If the action’s codegen emits runtime verification for token conservation, the ProofPlan records “checked-runtime.” If the obligation exists but no code covers it, the record shows “runtime-required” — a gap. If the codegen aborts rather than accept an unprovable obligation, it shows “fail-closed.”

flowchart LR
    classDef inv fill:diagramSurface,stroke:diagramBorder,stroke-width:1.5px,color:diagramText
    classDef act fill:diagramSurfaceMuted,stroke:diagramBorderMuted,stroke-width:1px,color:diagramText
    classDef gap fill:diagramDangerBg,stroke:diagramDanger,stroke-width:1px,color:diagramDangerText
    classDef ok fill:diagramSuccessBg,stroke:diagramSuccess,stroke-width:1px,color:diagramSuccessText

    INV["invariant:<br/>token_conservation"]:::inv
    A1["action: transfer_token<br/>consume + create Token"]:::act
    A2["action: burn<br/>destroy Token"]:::act

    INV -->|"searches for"| A1
    INV -->|"searches for"| A2

    A1 -->|"codegen verifies<br/>amount conservation"| OK1["checked-runtime"]:::ok
    A2 -->|"burn is intentional<br/>conservation does not apply"| OK2["not-applicable"]:::ok

    A3["action: rogue_mint<br/>create Token without consume"]:::gap
    INV -->|"no matching obligation"| GAP["runtime-required<br/>(gap!)"]:::gap

The motivation is straightforward. For a smart contract language targeting a verification-centric VM, the most important question is not “does the code compile?” but “what security properties does the compiled code actually enforce?” ProofPlan makes this question answerable by a machine rather than requiring manual audit of generated assembly.

We should be honest about the current state: ProofPlan is metadata-only in several areas. It records what should be checked and whether the codegen claims to check it, but it does not independently verify the generated RISC-V code. A sound ProofPlan-to-assembly verification pass would be a significant and welcome improvement, but we have not built one.

8. Flows and State Machines

CellScript supports declarative state machines over resource fields. A flow declaration specifies the allowed transitions for a field:

flow Order.status {
    Open -> Filled;
    Open -> Cancelled;
    Filled -> Settled;
}

The action that triggers a transition declares it explicitly:

transition order.status: Open -> Filled

The type checker validates that the transition is declared in the flow graph. The codegen emits a runtime check verifying the output’s status field matches the expected target value. Multiple transitions may be grouped in a single action, but each state field has exactly one flow declaration.

This is a modest feature, as language features go, but it addresses a real problem: smart contract protocols often have implicit state machines that are enforced only by careful manual coding. Making the state machine explicit in the type system means the compiler can reject impossible transitions at compile time, and the ProofPlan can record which transitions an action covers.

9. Capability Gates

CellScript resource declarations include explicit capability gates:

resource Token has store, create, consume, replace, burn, relock { ... }

These gates declare what operations are permitted on the resource. A resource that does not declare burn cannot be destroyed. A resource that does not declare relock cannot have its lock script changed. The type checker rejects actions that attempt a capability not declared on the resource type.

The capability system uses kernel-effect terms — store, create, consume, replace, burn, relock — rather than protocol-level verbs. This is deliberate. The capabilities map to the verification operations that the CKB VM actually performs: checking type hashes, comparing field values, scanning for output absence. A higher-level capability like “transfer” is a composite of replace and relock, and the compiler expands it accordingly.

10. What We Are Unsure About

Several aspects of the design deserve scrutiny and may change.

The straight-line lifecycle restriction. Currently, lifecycle operations (consume, create, destroy, transfer) are only permitted in the top-level statement list of an action body, not inside conditional branches. This is a deliberate simplification — handling Cell effects in branches requires tracking which effects occurred in which branch, merging state across branches, and emitting appropriate compensating logic. We have not yet implemented this correctly, and the restriction is a placeholder until we do. It is, we acknowledge, a significant limitation for complex protocols.

ProofPlan coverage is metadata-only. As noted above, the ProofPlan records what the codegen claims to verify, but does not independently verify the claims. A malicious or buggy codegen could emit incorrect ProofPlan entries. Trusting the codegen’s self-reporting is a gap that we recognise but have not yet closed.

The CKB target profile is the only profile. CellScript currently targets CKB exclusively. The language’s Cell model is a reasonable fit for other UTXO-based systems, but the codegen, syscall conventions, and schema encoding are all CKB-specific. Multi-target support is a future possibility but not a current priority.

The package and registry workflow is local-first. Package management works well for local path dependencies and lockfile-based reproducibility. Registry publishing and registry dependency resolution are command-shaped but fail-closed. We are not yet confident in the trust model for a package registry, and we prefer to leave the commands unimplemented rather than ship something we do not trust.

Concluding Remarks

CellScript is, at its core, an experiment in aligning a programming language’s type system with the execution model of its target platform. The Cell model on CKB provides structural guarantees — linear ownership, explicit consumption, schema-matched creation — that most smart contract languages ignore or leave to convention. CellScript attempts to make these guarantees first-class in the language, so that the compiler can enforce them at compile time and the generated code can verify them at runtime.

Whether this approach yields meaningfully safer smart contracts is an empirical question that we cannot yet answer. The language is young, the compiler has limitations, and several important features remain works in progress. But we think the direction is right, and we share this work in the hope that others working on Cell-based protocols might find something useful in it.