
# One URL, three audiences — designing for humans, LLMs, and crawlers simultaneously

`bind.ly/@bindly/cloudflare-native-stack` is visited by at least three different types of clients with fundamentally different needs:

**A browser** wants: styled HTML, navigation, interactive UI, images, fonts, a readable layout.

**An LLM** (via web access or `?format=md`) wants: clean Markdown, no HTML to parse, minimal tokens, just the content.

**A search crawler** wants: static HTML with OG meta tags, JSON-LD structured data, canonical URL, fast response time.

These requirements conflict. An interactive SPA is terrible for crawlers (no static HTML). Raw Markdown is terrible for browsers (no styling). Rich SEO HTML is wasteful for LLMs (hundreds of tokens of markup before the content starts).

Here's how we navigate these conflicts — and where we don't.

## Conflict 1: SPA vs SSR

The SPA (React + Vite) gives the best human experience: instant navigation, real-time updates, rich interactions. But it renders client-side — Googlebot gets an empty `<div id="root"></div>` and nothing to index.

The SSR path renders HTML in the Gateway Worker before sending it. Crawlers get a full HTML page with content. But SSR pages are static — no real-time updates, no interactive editing.

**Resolution: session cookie determines the branch.**

The Gateway Worker checks for a `bindly_session` cookie on every request to `/@space/binding`:

- Cookie present (authenticated user) → proxy to SPA
- Cookie absent (crawler, unauthenticated visitor) → render SSR
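A minimal sketch of that branch, assuming the cookie check is a plain header parse (helper names here are illustrative, not Bindly's actual code):

```typescript
type Branch = "spa" | "ssr";

// True when the raw Cookie header carries a bindly_session value.
function hasSessionCookie(cookieHeader: string | null): boolean {
  if (!cookieHeader) return false;
  return cookieHeader
    .split(";")
    .some((part) => part.trim().startsWith("bindly_session="));
}

// Decide which branch serves this request, given the raw Cookie header.
function chooseBranch(cookieHeader: string | null): Branch {
  return hasSessionCookie(cookieHeader) ? "spa" : "ssr";
}
```

A Googlebot request carries no `Cookie` header, so `chooseBranch(null)` falls through to `"ssr"`; a logged-in browser sends the session cookie and gets `"spa"`.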

This works for the most important cases:

- Crawlers (no cookie) always get SSR ✓
- Logged-in users always get the SPA ✓

**Unresolved tension:** An authenticated user visiting a public Binding gets the SPA — which means the page they see isn't what Google indexed. If the SPA renders additional content (comments, edit UI), there's a visual difference between what Google shows and what the user sees. Acceptable for a knowledge platform, but worth knowing about.

## Conflict 2: LLM-optimal vs human-optimal response format

An LLM fetching a Binding wants minimum tokens. The ideal LLM response for a 2,000-token document:

```markdown
# Document Title

[2,000 tokens of clean Markdown content]
```

A human visiting the same URL wants:

```html
<!DOCTYPE html>
<html>
  <head>
    <meta property="og:title" content="..." />
    <!-- 20 more meta tags -->
    <style>/* 10KB of CSS */</style>
  </head>
  <body>
    <!-- Navigation: 500 tokens -->
    <!-- Header with author, date, tags: 200 tokens -->
    <!-- Content: 2,000 tokens -->
    <!-- Footer: 300 tokens -->
  </body>
</html>
```

The HTML response is 3,000+ tokens for the same 2,000 tokens of actual content. An LLM fetching this without `?format=md` burns 50% more context on wrapper chrome.

**Resolution: `?format=md` as an explicit LLM path.**

Rather than trying to detect LLMs from user agents (unreliable, adversarial, requires maintenance), we add an explicit LLM path: `?format=md` returns raw Markdown. The `textUrl` in every MCP response pre-constructs this URL.

This works because LLMs that use MCP always get `textUrl` — they know to use it. LLMs with web access can use `?format=md` if they know the convention (documented in `llms.txt`). Browsers that accidentally hit `?format=md` see Markdown — slightly confusing, but not broken.
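Under those conventions, the format switch and the `textUrl` construction can be sketched like this (function names are illustrative, not Bindly's real code):

```typescript
type ResponseFormat = "html" | "markdown";

// Pick the response format from the request URL: ?format=md selects Markdown.
function chooseFormat(requestUrl: string): ResponseFormat {
  const url = new URL(requestUrl);
  return url.searchParams.get("format") === "md" ? "markdown" : "html";
}

// Pre-construct the textUrl advertised in an MCP response.
function buildTextUrl(bindingUrl: string): string {
  const url = new URL(bindingUrl);
  url.searchParams.set("format", "md");
  return url.toString();
}
```

Because `buildTextUrl` runs server-side, MCP clients never need to know the convention; only LLMs arriving via plain web access do.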

**Unresolved tension:** LLMs without prior Bindly knowledge won't know to use `?format=md`. They'll fetch the HTML version and waste tokens. This is only solvable through broader adoption of content negotiation standards — or by hoping crawlers like ClaudeBot learn to check for `?format=md` conventions in `llms.txt`.

## Conflict 3: Crawler metadata vs content payload

Crawlers want structured metadata in the HTML: `og:title`, `og:description`, `og:image`, JSON-LD `Article` schema, `datePublished`, `author`. This makes content rich in search results and link previews.

This metadata adds ~2KB to every SSR response. For a crawler, useful. For an LLM, pure overhead. For a browser rendering the SPA, irrelevant (the SPA manages its own `<head>` client-side).

**Resolution: metadata only on the SSR path.**

The SSR path (unauthenticated requests) includes full metadata. The SPA proxy path (authenticated requests) doesn't. This is clean because the two paths don't share code.

**Unresolved tension:** Link preview services (Slack, Twitter, iMessage) send no session cookie, so they land on the SSR path and do get the OG metadata — that part works. The gap is on the other side: the authenticated user who copied the link was looking at the SPA, which may show content (comments, edit state) the SSR-built preview doesn't reflect. A dedicated preview endpoint would decouple the two; not currently implemented.

## Conflict 4: Cache semantics per audience

Each audience has different caching requirements:

- **Crawlers** (SSR): can cache aggressively — content doesn't change every minute
- **LLMs** (`?format=md`): same caching is fine, but must not return cached HTML
- **Authenticated users** (SPA): must not cache at all — the response depends on auth state

**Resolution: `Cache-Control` and `Vary` headers.**

```typescript
// SSR HTML
'Cache-Control': 'public, s-maxage=60, stale-while-revalidate=300'

// Markdown
'Cache-Control': 'public, s-maxage=60, stale-while-revalidate=300'
'Vary': 'Accept'  // Cache separately for different Accept values

// SPA proxy
'Cache-Control': 'private, no-store'
'Vary': 'Cookie'  // Response varies by auth state
```

The `Vary: Accept` header tells CDNs to keep separate cache entries per `Accept` value, so a single URL can safely serve both `text/html` and `text/markdown` to clients that negotiate via the `Accept` header. (Requests using `?format=md` already have a distinct URL, and thus a distinct cache key.) Without it, a cache keyed only on the URL might serve cached Markdown to a browser, or cached HTML to an LLM.
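One way to keep those three policies in a single place in the Worker — a sketch, with the branch names as assumptions:

```typescript
type Audience = "ssr" | "markdown" | "spa";

// Map each response branch to its caching policy.
function cacheHeaders(audience: Audience): Record<string, string> {
  switch (audience) {
    case "ssr":
      return {
        "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300",
      };
    case "markdown":
      return {
        "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300",
        Vary: "Accept", // cache Markdown separately from HTML
      };
    case "spa":
      return {
        "Cache-Control": "private, no-store",
        Vary: "Cookie", // response depends on auth state
      };
  }
}
```

Centralizing the mapping means a new response branch is forced to declare its cache semantics explicitly rather than inheriting whatever headers the nearest code path happened to set.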

**Unresolved tension:** Cloudflare Workers bypass the CF edge CDN cache by default for dynamic responses. The `s-maxage` header is honored by downstream caches (browsers, other CDNs) but not the Cloudflare edge itself — unless you explicitly call the Cache API (`caches.default.put`). For now, caching happens at the browser and KV fallback levels, not the Cloudflare edge. Good enough for current traffic; not infinite scale.

## What "designing for three audiences" means day-to-day

Every feature decision has to be evaluated against all three audiences:

**Adding a new Binding field:**

- Human: show it in the UI somewhere useful
- LLM: include it in Tier 1 response? (cost: tokens). Tier 2 only? (cost: hidden from search results)
- Crawler: include it in JSON-LD? (benefit: richer search results)

**Changing a URL:**

- Human: set up a redirect, update navigation
- LLM: update `textUrl` in MCP responses, update `llms.txt`
- Crawler: canonical URL must point to the new path, old path must 301
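The crawler side of that checklist can be sketched as a simple 301 table in the Gateway Worker (the paths below are made up for illustration):

```typescript
// Hypothetical old-to-new path map; entries are illustrative.
const movedPaths: Record<string, string> = {
  "/@bindly/old-slug": "/@bindly/new-slug",
};

// Return a 301 target for a moved path, or null to fall through.
function resolveRedirect(
  pathname: string
): { status: 301; location: string } | null {
  const target = movedPaths[pathname];
  return target ? { status: 301, location: target } : null;
}
```

A 301 (rather than 302) matters here: it tells crawlers to transfer the old URL's ranking signals to the new path.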

**Adding interactive features (comments, editing):**

- Human: implement in SPA
- LLM: implement as MCP tools
- Crawler: no action needed

The discipline is asking "who else does this affect?" for every change. A URL change that looks trivial on the browser path might break LLM bookmarks. A new metadata field that's great for SEO might bloat every MCP response unnecessarily.

The three-audience model isn't a constraint — it's a forcing function for cleaner design. If a feature can't be explained for all three audiences, it probably needs more thought.