# ADR-004: Multi-transport, per-modality policy (Tor default; Tor-signaled direct P2P + opaque relay for media)

## Status

Accepted

## Date

2026-05-31

## Context

pvtcoms is anonymity-first. ADR-001 committed to a Tor-only v1. But two real pressures push beyond Tor-only:

1. **Reachability without manual router setup.** Users will not configure port-forwarding. Two peers
   behind home NAT cannot accept inbound connections without *some* coordination (this is a hard
   property of NAT, not a bug). The options are: open a port (manual or auto), coordinate a hole
   punch (needs a rendezvous), or route around NAT (Tor/relay). See ADR-001 and `THREAT_MODEL.md`.
2. **Latency-sensitive media.** Real-time voice/video over Tor is poor (latency, jitter, throughput
   volatility, circuit churn). Anonymous video is effectively non-viable; anonymous audio is marginal.

The user asked: can we get TURN-like relay and faster paths **without operating a server**, choosing
transport **per modality** (chat over Tor, calls direct P2P), possibly **using the Tor link to
bootstrap a direct P2P connection**, with a **fallback ladder** and a **user setting vs. adaptive**
choice?

Research (web + Codex + Gemini, 2026-05-31):
- The "Tor-first, then direct" idea is exactly **libp2p DCUtR** (Direct Connection Upgrade through
  Relay) / WebRTC **ICE**: use an existing relayed/anonymous link as the **signaling channel**, then
  hole-punch to a direct path. ([libp2p DCUtR](https://libp2p.io/docs/dcutr/),
  [IPFS hole punching](https://blog.ipfs.tech/2022-01-20-libp2p-hole-punching/))
- Largest measurement of decentralized hole punching to date: **~70% ± 7% success**, transport-agnostic
  (TCP ≈ QUIC), with 97.6% of successful punches landing on the first attempt. Roughly three in ten
  attempts fail, so a **relay fallback is mandatory, not optional**.
  ([arXiv 2510.27500](https://arxiv.org/abs/2510.27500))
- **Tailscale** model: STUN + hole punch + **DERP** encrypted relay fallback + birthday-paradox port
  prediction for symmetric NAT. ([How NAT traversal works](https://tailscale.com/blog/how-nat-traversal-works))
- **Signal** uses P2P calls by default (reveals IP to contacts) with an "Always Relay Calls" option
  that trades quality for IP privacy. ([Signal SSD](https://ssd.eff.org/module/how-to-use-signal))
- Both advisors converge: **build minimal** (don't adopt libp2p wholesale); prefer **str0m** (sync,
  state-machine WebRTC/ICE — auditable for IP leaks) over webrtc-rs; keep `arti` for Tor.

Brutal truth from the research: **"serverless" ≠ "infrastructure-free."** High call-success for hard
NATs requires *some* third-party rendezvous (public STUN or a relay). We can avoid running **our own**
server; we cannot avoid all third-party infra and still get reliable direct calls. (Tor itself is
third-party infra — and that is the intended design.)

## Decision

Adopt a **per-modality tiered transport ladder** governed by a **user privacy profile**, with the Tor
onion link doubling as the signaling channel for any direct upgrade.

**Transport tiers** (preference order within a modality, gated by profile):

| Tier | Path | Anonymous? | Needs router setup? | Server we run? |
|----|----|----|----|----|
| **T0** | LAN direct (mDNS / internal IP) | n/a (same net) | no | no |
| **T1** | **Tor v3 onion (arti)** — *default for all v1* | yes (IP hidden) | no | no (public Tor net) |
| **T2** | **Tor-signaled direct P2P** (DCUtR/ICE hole punch, aided by UPnP/NAT-PMP/PCP auto-mapping and birthday-paradox port prediction; UDP/QUIC) | **no: reveals IP to peer** | no | no |
| **T3** | **Opaque relay** (Circuit-Relay-v2 / DERP model; volunteer or self-hosted/BYO; ciphertext only) | partial | no | **no (not ours)** |

**Per-modality selection:**
- **Chat / files / async voice notes:** `T0 → T1`. Never opens a direct socket. Anonymous always.
- **Real-time call (v2):** `T0 → T2 → T3 → (T1 audio, last resort)`. Direct preferred for latency.
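
The per-modality ladders above can be written down as plain data. This is a minimal sketch, not the project's actual types: the names `Tier`, `Modality`, and `ladder` are hypothetical, and the only thing taken from this ADR is the tier order for each modality.

```rust
/// Transport tiers from the table above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    T0LanDirect,
    T1TorOnion,
    T2DirectP2p,
    T3OpaqueRelay,
}

/// Traffic classes that pick their own ladder (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Modality {
    Chat,
    FileTransfer,
    AsyncVoiceNote,
    RealTimeCall,
}

/// Preference-ordered tiers for a modality, before any profile gating.
pub fn ladder(modality: Modality) -> &'static [Tier] {
    match modality {
        // Chat, files, and async voice notes never open a direct socket.
        Modality::Chat | Modality::FileTransfer | Modality::AsyncVoiceNote => {
            &[Tier::T0LanDirect, Tier::T1TorOnion]
        }
        // Calls prefer direct paths; Tor audio is the last resort.
        Modality::RealTimeCall => &[
            Tier::T0LanDirect,
            Tier::T2DirectP2p,
            Tier::T3OpaqueRelay,
            Tier::T1TorOnion,
        ],
    }
}
```

Keeping the ladders as static data makes the "chat never reaches T2" invariant something a unit test can assert rather than a convention.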

**Privacy profiles:**
- **Strict (high-threat):** `T1` + `T3-over-Tor` only. **Forbid T2 entirely** — no UPnP/NAT-PMP/PCP,
  no host/srflx ICE candidates, do **not** initialize the UDP/WebRTC stack. Calls degrade to async
  voice notes over Tor. **Fail closed.**
- **Balanced (default):** chat over Tor; calls attempt T2 **only after explicit per-call consent**
  ("this reveals your IP to <verified peer>"), then fall back to T3, then degrade.
- **Performance:** prefer direct for everything; Tor used for identity/signaling/control plane.

Plus **adaptive selection** *within* the chosen profile: ICE-style connectivity checks race the
allowed candidates and pick the working path; never escalate outside the profile's permitted tiers.
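
One way to enforce "never escalate outside the profile's permitted tiers" is to filter the ladder before any candidate is gathered, so adaptive selection only ever sees allowed tiers. The sketch below is illustrative only; `Profile`, `permitted`, and `gate` are hypothetical names. It makes three assumptions this ADR does not spell out: under Strict, `T3` stands for T3-over-Tor; Balanced allows T0; and Performance does not require per-call consent for T2.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    T0,
    T1,
    T2,
    T3,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Strict,
    Balanced,
    Performance,
}

/// Whether a tier may be attempted at all under a profile.
/// Anything not explicitly allowed is denied (fail closed).
pub fn permitted(profile: Profile, tier: Tier, per_call_consent: bool) -> bool {
    match (profile, tier) {
        // Assumption: T3 under Strict is always dialed over Tor.
        (Profile::Strict, Tier::T1 | Tier::T3) => true,
        (Profile::Strict, _) => false,
        // Balanced exposes the IP only after explicit per-call consent.
        (Profile::Balanced, Tier::T2) => per_call_consent,
        (Profile::Balanced, _) => true,
        // Assumption: Performance skips the per-call prompt.
        (Profile::Performance, _) => true,
    }
}

/// Filter a modality's ladder down to the tiers the profile allows,
/// preserving preference order.
pub fn gate(profile: Profile, ladder: &[Tier], per_call_consent: bool) -> Vec<Tier> {
    ladder
        .iter()
        .copied()
        .filter(|&tier| permitted(profile, tier, per_call_consent))
        .collect()
}
```

With the call ladder `[T0, T2, T3, T1]`, Strict yields `[T3, T1]`, and Balanced without consent yields `[T0, T3, T1]`. An empty result would mean the call degrades rather than escalates.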

**Stack:** `arti` (Tor, mandatory). For v2 media: **str0m** (sync, auditable ICE/DTLS/SRTP) — not
webrtc-rs, not libp2p wholesale. `quinn` optional for direct data channels. WebRTC DTLS-SRTP is not
post-quantum, so media keys are wrapped in our **app-layer PQ Double Ratchet** to preserve the PQ claim.

**Roadmap split:**
- **v1 (now):** Tor-only foundation — chat, PQ handshake + SAS verify, file transfer, async voice
  notes. **Zero UDP sockets.** Proves the anonymity thesis, crypto, and UX. (Matches ADR-001.)
- **v2:** str0m media + Tor-signaled ICE hole punch + privacy-profile UI + per-call IP-exposure
  consent + relay fallback (self-hosted/volunteer). The high-risk real-time layer.

## Dynamic IPs (both peers' public IPs change — ISP lease, mobile, CGNAT)

Dynamic IPs are designed around, not fought. Core principle: **an IP is a disposable hint, never an
identity.** Reachability is by stable public-key identity that *resolves* to the current address.

- **Tor onion is the killer advantage here.** A v3 `.onion` is derived from the service key, so the
  address is **immune to IP change** — when the IP changes, arti just rebuilds circuits to the intro
  points; the Tor HSDir (distributed directory) transparently maps `stable-id → current intro points`.
  Dynamic IP is therefore a **non-problem** for the Tor (T1) path. This is decisive: it's why Tor is
  the always-on bedrock.
- **Control/data-plane split.** Use the Tor onion as an always-on, low-bandwidth **control channel**
  (stable, IP-change-immune signaling); use ephemeral direct QUIC/WebRTC as the **data plane** for
  media, set up by exchanging *current* candidates over the Tor control channel.
- **Direct-path rendezvous = decentralized dynamic-DNS.** For T2/T3 without Tor, peers publish
  short-lived **Reachability Records** to a DHT/Nostr, keyed by a **blinded identity**
  (`HMAC(shared_secret, epoch)`), value = encrypted candidate set + QUIC params + expiry + monotonic
  seq + signature. TTL ~60–180 s; heartbeat ~20–40 s with jitter; **immediate re-announce on OS
  network-change events** (netlink / NWPathMonitor — event-driven, not polling).
- **QUIC connection migration** (RFC 9000): QUIC Connection IDs decouple a session from the IP:port
  5-tuple, so an **active** call survives a WiFi↔cellular / IP change without re-handshaking (`quinn`
  supports it; `str0m`/WebRTC uses ICE-restart instead). Migration rescues live sessions; the stable
  Tor identity rescues *initiation* after a change.
- **IPv6** is treated as just another *ephemeral* candidate: SLAAC temporary addresses (RFC 8981,
  which obsoletes RFC 4941) rotate it, so it is never a canonical locator.
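
The Reachability Record rules above (expiry, monotonic seq, jittered heartbeat) reduce to a few small checks. The sketch below is a hedged illustration with hypothetical names (`ReachabilityRecord`, `epoch_for`, `is_fresh`, `next_heartbeat`). It leaves out the HMAC blinding and signature verification, which need a crypto crate, and it assumes the epoch is a fixed-length time window, which this ADR does not specify.

```rust
use std::time::Duration;

/// Decrypted view of a record fetched from the DHT/Nostr (hypothetical layout).
/// Signature verification is assumed to have succeeded before this is built.
pub struct ReachabilityRecord {
    pub seq: u64,        // monotonic sequence number
    pub expires_at: u64, // unix seconds
    pub encrypted_candidates: Vec<u8>,
}

/// Epoch index fed into the blinded key, e.g. HMAC(shared_secret, epoch).
/// Assumption: epochs are fixed-length windows of `epoch_len_secs`.
pub fn epoch_for(now_secs: u64, epoch_len_secs: u64) -> u64 {
    now_secs / epoch_len_secs.max(1) // guard against a zero-length epoch
}

/// Accept a record only if it is unexpired and strictly newer than the last
/// one seen, which rejects both stale and replayed records.
pub fn is_fresh(rec: &ReachabilityRecord, last_seen_seq: Option<u64>, now_secs: u64) -> bool {
    let unexpired = now_secs < rec.expires_at;
    let newer = last_seen_seq.map_or(true, |seen| rec.seq > seen);
    unexpired && newer
}

/// Map a uniform random value in [0, 1] onto the 20-40 s heartbeat window.
/// The caller supplies the randomness so the function stays deterministic.
pub fn next_heartbeat(unit_random: f64) -> Duration {
    let u = unit_random.clamp(0.0, 1.0);
    Duration::from_secs_f64(20.0 + 20.0 * u)
}
```

Because the TTL (60-180 s) is longer than the longest heartbeat (40 s), a peer that misses one heartbeat still has a live record. A network-change event would trigger a re-announce immediately instead of waiting for the timer.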

**End-to-end connection algorithm (robust to dynamic IPs on both ends):**
1. Resolve peer by stable identity → `.onion`; open the **Tor control channel** (always; IP-immune).
2. Exchange signed capability + policy (is direct/IP-exposure allowed now; battery/network state).
3. If both allow direct: publish/fetch fresh reachability candidates over the Tor control channel.
4. **Happy-eyeballs dial race** (budgeted, parallel): IPv6 direct QUIC · IPv4 hole-punch QUIC ·
   opaque-relay QUIC — Tor control stays alive throughout.
5. First validated secure path wins; others drain/close. Media keys stay wrapped in the PQ ratchet.
6. On mid-session IP change: try **QUIC migration** first; else re-announce over Tor + re-race in the
   background; degrade to relay/Tor **without tearing down the app-layer session**.
7. Rotate blinded rendezvous tokens; expire stale records aggressively.
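
Steps 4 and 5 (budgeted parallel dial, first validated path wins) can be modeled with standard-library threads and a channel. This is a toy sketch under stated assumptions, not the real dialer: `Path`, `Attempt`, and `dial_race` are hypothetical names, and each attempt is simulated by a sleep plus a flag instead of real QUIC handshakes via `quinn` or `str0m`. The Tor control channel is deliberately outside the race, since it stays up throughout.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Path {
    Ipv6Direct,
    Ipv4HolePunch,
    OpaqueRelay,
}

/// Simulated dial attempt: how long it takes and whether the path validates.
pub struct Attempt {
    pub path: Path,
    pub delay: Duration,
    pub validates: bool,
}

/// Run all attempts in parallel and return the first path that validates
/// within `budget`, or `None` if every attempt fails or the budget expires.
pub fn dial_race(attempts: Vec<Attempt>, budget: Duration) -> Option<Path> {
    let (tx, rx) = mpsc::channel();
    for attempt in attempts {
        let tx = tx.clone();
        thread::spawn(move || {
            thread::sleep(attempt.delay); // stand-in for the handshake
            if attempt.validates {
                // The receiver may already be gone if another path won.
                let _ = tx.send(attempt.path);
            }
        });
    }
    // Drop the original sender so the channel closes once all attempts end.
    drop(tx);
    rx.recv_timeout(budget).ok()
}
```

In a real dialer the losing attempts would be drained and closed explicitly; here the detached threads simply finish and their sends are ignored. A `None` result corresponds to degrading to Tor without tearing down the app-layer session.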

## Consequences

### Positive
- No manual port-forwarding ever; Tor default needs no router config and hides IP.
- A principled, honest answer to "fast calls vs. anonymity": the user chooses, per profile, with consent.
- No server **we** operate at any tier; relays (when used) are volunteer/self-hosted and see ciphertext only.
- Clear v1/v2 boundary keeps the highest-risk code (UDP/IP exposure) out of the anonymity-first core until ready.

### Negative
- T2 breaks anonymity to the peer (IP exposure). Mitigations: T2 is offered to verified contacts only, requires explicit consent, and is forbidden under Strict.
- **Timing-correlation risk:** a network observer can correlate Tor signaling with the direct UDP burst,
  potentially de-anonymizing even "direct-after-Tor." This is *why* Strict forbids T2 outright.
- Hole punching succeeds only ~70% of the time, so T3 relay fallback is required. Relays have availability/incentive problems
  for high-bandwidth media (volunteers will rate-limit), so reliable media fallback realistically needs a BYO/self-hosted relay.
- More transports = more attack surface and more code to audit (especially the WebRTC stack for leaks).

### Neutral
- Aligns with existing backlog: `SR-2026-05-30-011` (direct-first + mailbox-fallback + strict-direct-only
  toggle), `SR-2026-05-30-006` (v1 UI with privacy profiles). Those items inherit this ladder.
- `THREAT_MODEL.md` must document the T2 IP-exposure boundary and the timing-correlation caveat.