WebRTC signaling architecture

Canvas Chat uses a WebRTC-based peer-to-peer sync system built on top of Yjs CRDTs. This document explains how the signaling server works and why we chose this approach.

The signaling problem

WebRTC enables direct peer-to-peer connections between browsers, but peers need to discover each other first. This is the "signaling" problem: how do two browsers find each other on the internet and agree on how to connect?

The solution is a lightweight relay server that:

Accepts WebSocket connections from peers
Groups peers by "room" (session ID in our case)
Relays connection metadata (SDP offers/answers, ICE candidates)
Never sees or stores the actual sync data

Once peers exchange connection metadata, they establish a direct WebRTC connection and sync data peer-to-peer without the server.

Why build our own signaling server?

The y-webrtc package includes a Node.js signaling server, but we chose to implement a compatible server in FastAPI/Python for several reasons:

Single deployment: We already deploy a FastAPI app to Modal. Adding a separate Node.js service would complicate deployment and monitoring.
Protocol simplicity: The y-webrtc signaling protocol is just four message types (subscribe, unsubscribe, publish, ping). Implementing it in Python is straightforward.
Consistent stack: Using Python/FastAPI for the entire backend makes the codebase easier to maintain.

Protocol details

The signaling protocol uses JSON messages over WebSocket:

subscribe   { "type": "subscribe", "topics": ["room-id-1", "room-id-2"] }
unsubscribe { "type": "unsubscribe", "topics": ["room-id-1"] }
publish     { "type": "publish", "topic": "room-id", ...payload... }
ping        { "type": "ping" }  ->  { "type": "pong" }

When a peer sends a publish message, the server broadcasts it to all other peers subscribed to that topic. The server adds a clients field indicating how many peers received the message.
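
The relay logic described above can be sketched in a few lines of transport-agnostic Python. This is a minimal illustration, not the actual Canvas Chat implementation: the SignalingManager class, its handle and disconnect methods, the string connection IDs, and the "data" payload field are all assumptions made for the example. The topics and subscriptions maps borrow their names from the architecture diagram.

```python
class SignalingManager:
    """In-memory pub/sub relay; connections are identified by opaque ids."""

    def __init__(self):
        self.topics = {}         # topic -> set of connection ids subscribed to it
        self.subscriptions = {}  # connection id -> set of topics it joined

    def handle(self, conn_id, message):
        """Process one JSON message; return (recipient_id, message) pairs to send."""
        kind = message.get("type")
        if kind == "subscribe":
            for topic in message.get("topics", []):
                self.topics.setdefault(topic, set()).add(conn_id)
                self.subscriptions.setdefault(conn_id, set()).add(topic)
        elif kind == "unsubscribe":
            for topic in message.get("topics", []):
                self._leave(conn_id, topic)
        elif kind == "publish":
            # Relay to every *other* subscriber and report how many there were.
            receivers = self.topics.get(message.get("topic"), set()) - {conn_id}
            outgoing = {**message, "clients": len(receivers)}
            return [(peer, outgoing) for peer in sorted(receivers)]
        elif kind == "ping":
            return [(conn_id, {"type": "pong"})]
        return []

    def disconnect(self, conn_id):
        """Forget a closed connection; nothing else is persisted."""
        for topic in self.subscriptions.pop(conn_id, set()):
            self._leave(conn_id, topic)

    def _leave(self, conn_id, topic):
        peers = self.topics.get(topic)
        if peers is not None:
            peers.discard(conn_id)
            if not peers:
                del self.topics[topic]  # drop empty rooms
        joined = self.subscriptions.get(conn_id)
        if joined is not None:
            joined.discard(topic)


manager = SignalingManager()
manager.handle("a", {"type": "subscribe", "topics": ["room-id"]})
manager.handle("b", {"type": "subscribe", "topics": ["room-id"]})
relayed = manager.handle("a", {"type": "publish", "topic": "room-id", "data": "offer"})
```

Here relayed holds a single delivery addressed to "b", with "clients" set to 1. Returning the deliveries instead of sending them keeps the sketch independent of any WebSocket framework; a real endpoint would write each message to the matching socket.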

Statelessness

The signaling server keeps no persistent state:

No database: The only state (topic subscriptions) lives in memory
No user data: Only relays opaque connection metadata
Restart-safe: Peers automatically reconnect and re-subscribe
Horizontally scalable: Each server instance is independent, provided all peers in a room connect to the same instance

If the server restarts, peers will reconnect within seconds. The CRDTs ensure eventual consistency even if some sync messages are lost during reconnection.
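
Restart safety depends on the client remembering which rooms it joined, since the server forgets everything when it restarts. The following is a hedged sketch of that idea only: RoomMembership and its methods are hypothetical names for illustration, and y-webrtc's own reconnection code is structured differently.

```python
class RoomMembership:
    """Client-side memory of joined rooms (hypothetical helper)."""

    def __init__(self):
        self.rooms = set()

    def join(self, room_id):
        """Remember the room and return the message to send to the server."""
        self.rooms.add(room_id)
        return {"type": "subscribe", "topics": [room_id]}

    def leave(self, room_id):
        """Forget the room and return the matching unsubscribe message."""
        self.rooms.discard(room_id)
        return {"type": "unsubscribe", "topics": [room_id]}

    def on_reconnect(self):
        """Message to send after the WebSocket reopens, or None if no rooms."""
        if not self.rooms:
            return None
        return {"type": "subscribe", "topics": sorted(self.rooms)}


membership = RoomMembership()
membership.join("room-id-1")
membership.join("room-id-2")
membership.leave("room-id-1")
resubscribe = membership.on_reconnect()
```

Because a single subscribe message rebuilds the server's view of the client, a freshly restarted server needs no stored state to resume relaying.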

Privacy guarantees

The signaling server provides strong privacy guarantees:

Encrypted signaling: y-webrtc supports optional password-based encryption of signaling messages, so a relay without the password cannot read them, which helps guard against man-in-the-middle attacks.
No content visibility: The server only sees room IDs (random UUIDs) and connection metadata, which is encrypted when a room password is set. It never sees node content, chat messages, or any user data.
Self-hosting option: Users can run their own signaling server for maximum privacy.

Architecture diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                           Canvas Chat Server                             │
│                                                                          │
│   ┌─────────────────┐                    ┌─────────────────────────┐    │
│   │  FastAPI App    │                    │  Signaling Manager      │    │
│   │  (HTTP/REST)    │                    │  (WebSocket)            │    │
│   │                 │                    │                         │    │
│   │  /api/chat      │                    │  /signal                │    │
│   │  /api/models    │                    │  ├── topics: Map        │    │
│   │  /api/...       │                    │  └── subscriptions: Map │    │
│   └─────────────────┘                    └─────────────────────────┘    │
│                                                    │                     │
└────────────────────────────────────────────────────│─────────────────────┘
                                                     │
        ┌────────────────────────────────────────────┼────────────────────┐
        │                                            │                    │
        ▼                                            ▼                    ▼
   ┌─────────────────┐                       ┌─────────────────────────────┐
   │     Browser     │◄──── WebRTC P2P ─────►│         Browser             │
   │     (Peer A)    │                       │         (Peer B)            │
   │                 │                       │                             │
   │ ┌─────────────┐ │                       │ ┌─────────────┐             │
   │ │ CRDTGraph   │ │                       │ │ CRDTGraph   │             │
   │ │ + WebRTC    │ │                       │ │ + WebRTC    │             │
   │ │ + IndexedDB │ │                       │ │ + IndexedDB │             │
   │ └─────────────┘ │                       │ └─────────────┘             │
   └─────────────────┘                       └─────────────────────────────┘

Connection flow

User A opens a session, CRDTGraph creates a WebrtcProvider
Provider connects to /signal and subscribes to the session's room ID
User B opens the same session (via a shared link or another browser tab)
User B's provider connects to /signal and subscribes to the same room ID
Signaling server relays SDP offer from A to B
B responds with SDP answer, relayed back to A
Peers exchange ICE candidates through signaling
Direct WebRTC connection established
Yjs syncs CRDT state over WebRTC
Both browsers now see the same canvas in real-time
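
The signaling part of this flow (subscribing, then exchanging the offer, answer, and ICE candidates) can be simulated without any network. The sketch below is illustrative only: the relay function, the "session-123" room ID, and the "signal" and "kind" fields are hypothetical and do not reproduce y-webrtc's actual wire format.

```python
def relay(rooms, sender, message):
    """Deliver a publish message to every other subscriber of its topic."""
    others = sorted(rooms.get(message["topic"], set()) - {sender})
    return [(peer, message) for peer in others]


rooms = {}
room_id = "session-123"  # hypothetical room ID

# Both providers connect and subscribe to the same room.
for peer in ("A", "B"):
    rooms.setdefault(room_id, set()).add(peer)

# The offer, the answer, and the ICE candidates all travel as publish messages.
exchange = [
    ("A", {"kind": "offer", "sdp": "<offer from A>"}),
    ("B", {"kind": "answer", "sdp": "<answer from B>"}),
    ("A", {"kind": "candidate", "candidate": "<ICE candidate from A>"}),
    ("B", {"kind": "candidate", "candidate": "<ICE candidate from B>"}),
]

log = []
for sender, signal in exchange:
    message = {"type": "publish", "topic": room_id, "signal": signal}
    for recipient, delivered in relay(rooms, sender, message):
        log.append((sender, recipient, delivered["signal"]["kind"]))
```

After the loop, log records four hops: the offer from A to B, the answer from B to A, and one candidate in each direction. Everything that follows (the direct connection and the Yjs sync) happens between the browsers and never touches the relay.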

Failure handling

The system handles various failure modes gracefully:

Failure	Recovery
Signaling server down	Peers retry connection automatically
WebRTC connection lost	Yjs awareness triggers reconnection
NAT traversal fails	No direct connection today; TURN relay fallback is planned
Browser tab closed	Other peers continue, state persists locally

Future enhancements

TURN server support: For peers behind restrictive NATs, we may add TURN relay support so they can still connect when a direct path cannot be established.
Presence awareness: Show which users are viewing or editing the canvas using the Yjs awareness protocol.
Selective sync: Only sync visible portions of large canvases.