The Content Lake is Sanity’s hosted backend infrastructure that stores and serves your content. It provides real-time capabilities, GROQ querying, and a powerful mutation system for collaborative editing.

What is the Content Lake?

The Content Lake is:
  • Hosted backend - Fully managed infrastructure by Sanity
  • Real-time - Listeners stream live updates (the listen API uses Server-Sent Events)
  • Transactional - Mutations grouped in a transaction are applied atomically
  • Query-optimized - Indexed for GROQ queries
  • Collaborative - Built for simultaneous editing
  • Version-controlled - Revision history for every document (retention period depends on your plan)

Architecture overview

┌─────────────────────────────────────────────────────────────┐
│                      SANITY STUDIO                          │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │    Forms     │  │   Queries    │  │   Listeners  │     │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘     │
│         │                 │                 │             │
│         │                 │                 │             │
└─────────┼─────────────────┼─────────────────┼─────────────┘
          │                 │                 │
          │                 │                 │
      Mutations          GROQ Queries     Listeners
       (HTTP)              (HTTP)           (SSE)
          │                 │                 │
          └────────────┬────┴─────┬───────────┘
                       │          │
                       ▼          ▼
            ┌──────────────────────────────┐
            │      CONTENT LAKE            │
            │  ┌────────────────────────┐  │
            │  │   Document Store       │  │
            │  ├────────────────────────┤  │
            │  │   Mutation Engine      │  │
            │  ├────────────────────────┤  │
            │  │   Query Engine (GROQ)  │  │
            │  ├────────────────────────┤  │
            │  │   Real-time Sync       │  │
            │  ├────────────────────────┤  │
            │  │   History & Versions   │  │
            │  └────────────────────────┘  │
            └──────────────────────────────┘

Document lifecycle

Documents in the Content Lake follow a clear lifecycle:
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Draft     │────▶│  Published  │────▶│  Historical │
│  (working)  │     │   (live)    │     │  (versions) │
└─────────────┘     └─────────────┘     └─────────────┘

Draft documents

  • Created when you start editing
  • Stored under a document ID prefixed with "drafts."
  • Not visible to unauthenticated queries, because document IDs that contain a dot are private
  • Can be edited by multiple users simultaneously
// Draft document ID
"drafts.abc-123"

// Published document ID
"abc-123"

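The ID convention above can be captured in a few small helpers. This is a minimal sketch: the function names are hypothetical and chosen for illustration, not taken from @sanity/client.

```typescript
const DRAFTS_PREFIX = 'drafts.'

/** True when the ID belongs to a draft document. */
function isDraftId(id: string): boolean {
  return id.startsWith(DRAFTS_PREFIX)
}

/** Strip the draft prefix, if present, to get the published ID. */
function toPublishedId(id: string): string {
  return isDraftId(id) ? id.slice(DRAFTS_PREFIX.length) : id
}

/** Add the draft prefix, unless the ID already has it. */
function toDraftId(id: string): string {
  return isDraftId(id) ? id : `${DRAFTS_PREFIX}${id}`
}
```

Both conversions are idempotent, so they are safe to call on an ID whose state you don't know.
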
Publishing

When you publish:
  1. Draft content is copied to the published document
  2. Draft document is deleted
  3. Published document becomes visible to queries
  4. Previous version is stored in history
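
The first two steps can be sketched as a pair of mutations in the shape the HTTP mutation API accepts. `buildPublishMutations` and the `SanityDoc` type are hypothetical names for this example, and the Studio's own publish operation performs additional checks that are not shown here.

```typescript
// Minimal document shape for this sketch; real documents carry more system fields.
interface SanityDoc {
  _id: string
  _type: string
  [field: string]: unknown
}

type Mutation = {createOrReplace: SanityDoc} | {delete: {id: string}}

/**
 * Build the mutations that publish a draft: copy its content to the
 * published ID, then delete the draft.
 */
function buildPublishMutations(draft: SanityDoc): Mutation[] {
  if (!draft._id.startsWith('drafts.')) {
    throw new Error(`Expected a draft ID, got "${draft._id}"`)
  }
  const publishedId = draft._id.slice('drafts.'.length)
  return [
    {createOrReplace: {...draft, _id: publishedId}},
    {delete: {id: draft._id}},
  ]
}
```

Sent together in one transaction, both changes apply or neither does, so readers never see a state where the content exists twice or not at all.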

Unpublishing

  • Removes the published document
  • Creates or keeps the draft version
  • Previous versions remain in history
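
Unpublishing can be sketched the same way, in reverse. `buildUnpublishMutations` is a hypothetical helper; it relies on the `createIfNotExists` mutation so that an existing draft is kept as-is and a new one is only seeded when none exists.

```typescript
interface SanityDoc {
  _id: string
  _type: string
  [field: string]: unknown
}

type Mutation = {createIfNotExists: SanityDoc} | {delete: {id: string}}

/** Build the mutations that unpublish a document while preserving a draft. */
function buildUnpublishMutations(published: SanityDoc): Mutation[] {
  if (published._id.startsWith('drafts.')) {
    throw new Error(`Expected a published ID, got "${published._id}"`)
  }
  return [
    // No-op when a draft already exists; otherwise the draft starts from the published content.
    {createIfNotExists: {...published, _id: `drafts.${published._id}`}},
    {delete: {id: published._id}},
  ]
}
```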

Connecting to the Content Lake

Using @sanity/client

The official Sanity client handles all communication:
import {createClient} from '@sanity/client'

const client = createClient({
  projectId: 'your-project-id',
  dataset: 'production',
  apiVersion: '2024-01-01',
  useCdn: false, // Use false for fresh data
})

Querying with GROQ

GROQ (Graph-Relational Object Queries) is Sanity’s query language:
// Fetch all published posts
const posts = await client.fetch(
  `*[_type == "post" && !(_id in path("drafts.**"))]{
    _id,
    title,
    slug,
    author->{
      name,
      image
    }
  }`
)

// Fetch a single document
const post = await client.fetch(
  `*[_type == "post" && slug.current == $slug][0]`,
  {slug: 'my-post'}
)

// Fetch with joins
const postsWithAuthors = await client.fetch(
  `*[_type == "post"]{
    title,
    "authorName": author->name,
    "categories": categories[]->title
  }`
)
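
To make the dereference operator (`->`) in the last query concrete, here is a rough in-memory equivalent in TypeScript. It only illustrates what the projection returns; it is not how the Content Lake evaluates GROQ, and the sample documents are made up.

```typescript
interface Reference {
  _type: 'reference'
  _ref: string
}
interface Author {
  _id: string
  _type: 'author'
  name: string
}
interface Post {
  _id: string
  _type: 'post'
  title: string
  author?: Reference
}
type Doc = Author | Post

// Rough equivalent of: *[_type == "post"]{title, "authorName": author->name}
function postsWithAuthorNames(docs: Doc[]): {title: string; authorName: string | null}[] {
  const byId = new Map(docs.map((doc) => [doc._id, doc] as const))
  return docs
    .filter((doc): doc is Post => doc._type === 'post')
    .map((post) => {
      // Follow the reference by looking up the document whose _id matches _ref.
      const target = post.author ? byId.get(post.author._ref) : undefined
      return {
        title: post.title,
        // A missing or unresolvable reference yields null rather than an error.
        authorName: target?._type === 'author' ? target.name : null,
      }
    })
}

const sampleDocs: Doc[] = [
  {_id: 'author-1', _type: 'author', name: 'John Doe'},
  {_id: 'post-1', _type: 'post', title: 'My Post', author: {_type: 'reference', _ref: 'author-1'}},
  {_id: 'post-2', _type: 'post', title: 'No Author'},
]
const joined = postsWithAuthorNames(sampleDocs)
```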

Real-time listeners

Listen to document changes in real time:
import {createClient} from '@sanity/client'

const client = createClient({
  projectId: 'your-project-id',
  dataset: 'production',
  apiVersion: '2024-01-01',
  useCdn: false,
})

// Listen to all changes for a document type
const subscription = client
  .listen('*[_type == "post"]')
  .subscribe((update) => {
    console.log('Document changed:', update)
  })

// Clean up when you no longer need updates (for example, on component unmount)
subscription.unsubscribe()
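
Inside the subscriber you typically fold each event into local state. The sketch below assumes mutation events expose `documentId`, `transition`, and `result`; the `ListenEvent` type is a simplified stand-in for the client's event type, and `applyListenEvent` is a hypothetical helper.

```typescript
// Simplified subset of a mutation event; the real event carries more fields.
interface ListenEvent<T> {
  documentId: string
  transition: 'appear' | 'update' | 'disappear'
  result?: T
}

/** Return a new cache with the event applied, leaving the input untouched. */
function applyListenEvent<T>(cache: Map<string, T>, event: ListenEvent<T>): Map<string, T> {
  const next = new Map(cache)
  if (event.transition === 'disappear') {
    // The document was deleted or no longer matches the query.
    next.delete(event.documentId)
  } else if (event.result !== undefined) {
    // 'appear' and 'update' both carry the latest version of the document.
    next.set(event.documentId, event.result)
  }
  return next
}
```

Returning a new `Map` instead of mutating the old one keeps the helper easy to use with React state or a reducer.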

Mutations

Mutations are transactional changes to documents.

Create

const result = await client.create({
  _type: 'post',
  title: 'My New Post',
  slug: {current: 'my-new-post'},
  publishedAt: new Date().toISOString(),
})

Create or replace

const result = await client.createOrReplace({
  _id: 'my-post-id',
  _type: 'post',
  title: 'Updated Title',
})

Patch

Update specific fields without replacing the entire document:
const result = await client
  .patch('my-post-id')
  .set({title: 'New Title'})
  .inc({views: 1})
  .commit()

// More patch operations
await client
  .patch('my-post-id')
  .setIfMissing({tags: []})
  .append('tags', ['javascript'])
  .unset(['oldField'])
  .commit()
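
To show what these operations do to a document, here is a simplified in-memory model. It handles top-level fields only (real patches also accept paths such as `slug.current`), and both the order in which operations are applied and the error on non-numeric `inc` targets are assumptions of this sketch rather than documented server behavior.

```typescript
type Doc = Record<string, unknown>

interface PatchOps {
  set?: Record<string, unknown>
  setIfMissing?: Record<string, unknown>
  unset?: string[]
  inc?: Record<string, number>
}

/** Apply patch operations to a copy of the document and return the copy. */
function applyPatch(doc: Doc, ops: PatchOps): Doc {
  const next: Doc = {...doc}
  // set: overwrite the field unconditionally
  for (const [key, value] of Object.entries(ops.set ?? {})) {
    next[key] = value
  }
  // setIfMissing: only fill in fields that have no value yet
  for (const [key, value] of Object.entries(ops.setIfMissing ?? {})) {
    if (next[key] === undefined) next[key] = value
  }
  // unset: remove the field entirely
  for (const key of ops.unset ?? []) {
    delete next[key]
  }
  // inc: add to an existing numeric field
  for (const [key, amount] of Object.entries(ops.inc ?? {})) {
    const current = next[key]
    if (typeof current !== 'number') {
      throw new Error(`Cannot increment non-numeric field "${key}"`)
    }
    next[key] = current + amount
  }
  return next
}
```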

Delete

// Delete a single document by ID
const result = await client.delete('my-post-id')

// Delete every document matching a query
const deleted = await client.delete({
  query: '*[_type == "post" && publishedAt < $date]',
  params: {date: '2020-01-01'},
})

Transactions

Group multiple mutations together:
const transaction = client.transaction()

transaction
  .create({
    _type: 'author',
    _id: 'author-1',
    name: 'John Doe',
  })
  .create({
    _type: 'post',
    title: 'My Post',
    author: {_type: 'reference', _ref: 'author-1'},
  })
  .patch('other-post-id', (patch) => 
    patch.set({featured: true})
  )

const result = await transaction.commit()
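
The key property of a transaction is that it is all-or-nothing. The in-memory model below is only an illustration of that property, with hypothetical names and a reduced set of mutations; it says nothing about how the Content Lake implements it.

```typescript
interface Doc {
  _id: string
  [field: string]: unknown
}

type Mutation =
  | {create: Doc}
  | {patch: {id: string; set: Record<string, unknown>}}
  | {delete: {id: string}}

/** Apply every mutation or none: any failure leaves `store` unchanged. */
function commit(store: Map<string, Doc>, mutations: Mutation[]): Map<string, Doc> {
  // Work on a copy so a failure part-way through never touches the original.
  const working = new Map(store)
  for (const mutation of mutations) {
    if ('create' in mutation) {
      const doc = mutation.create
      if (working.has(doc._id)) throw new Error(`Document "${doc._id}" already exists`)
      working.set(doc._id, doc)
    } else if ('patch' in mutation) {
      const {id, set} = mutation.patch
      const existing = working.get(id)
      if (!existing) throw new Error(`Document "${id}" not found`)
      const updated: Doc = {...existing, ...set, _id: id}
      working.set(id, updated)
    } else {
      working.delete(mutation.delete.id)
    }
  }
  return working
}
```

If the patch in the example above targeted a document that does not exist, neither the author nor the post would be created.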

Real-time collaboration

The Content Lake enables real-time collaboration through:

Presence

Shows who is viewing or editing documents:
import {useDocumentPresence, UserAvatar} from 'sanity'

function MyComponent({documentId}) {
  // Each entry describes one active session: {user, path, sessionId, lastActiveAt}
  const presence = useDocumentPresence(documentId)

  return (
    <div>
      {presence.map((item) => (
        <UserAvatar key={item.sessionId} user={item.user} />
      ))}
    </div>
  )
}

Optimistic updates

Changes appear immediately in the UI before being confirmed by the server:
  1. User makes an edit
  2. UI updates immediately (optimistic)
  3. Mutation sent to Content Lake
  4. Server confirms or rejects
  5. UI reconciles with server state
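
The steps above can be sketched as a small store that keeps confirmed server state separate from pending local edits. `OptimisticStore` is a hypothetical class written for this example; it is not the Studio's implementation.

```typescript
interface PendingEdit<T> {
  id: number
  apply: (state: T) => T
}

class OptimisticStore<T> {
  private confirmed: T
  private pending: PendingEdit<T>[] = []
  private nextId = 1

  constructor(initial: T) {
    this.confirmed = initial
  }

  /** What the UI renders: confirmed state with every unconfirmed edit replayed on top. */
  get view(): T {
    return this.pending.reduce((state, edit) => edit.apply(state), this.confirmed)
  }

  /** Steps 1-2: record the edit locally so the view changes immediately. */
  edit(apply: (state: T) => T): number {
    const id = this.nextId++
    this.pending.push({id, apply})
    return id
  }

  /** Steps 4-5, accepted: adopt the server's state and drop the pending edit. */
  confirm(id: number, serverState: T): void {
    this.confirmed = serverState
    this.pending = this.pending.filter((edit) => edit.id !== id)
  }

  /** Steps 4-5, rejected: drop the edit so the view falls back to server state. */
  reject(id: number): void {
    this.pending = this.pending.filter((edit) => edit.id !== id)
  }
}
```

Keeping edits as functions instead of snapshots means that edits still pending are replayed on top of whatever state the server confirms.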

Conflict resolution

When multiple users edit the same document at the same time:
  • Edits to different fields are merged automatically
  • Edits to the same field resolve as last write wins (default)
  • Conflicts that cannot be merged require manual resolution
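
Field-level merging and last write wins can be illustrated with a few lines of TypeScript. This is a simplified model with hypothetical types; it orders edits by a client-supplied timestamp for clarity, which is an assumption of the sketch and not a description of how the Content Lake orders writes.

```typescript
type Fields = Record<string, unknown>

interface Edit {
  author: string
  timestamp: number
  changes: Fields
}

/** Merge concurrent edits into the base document, field by field. */
function mergeEdits(base: Fields, edits: Edit[]): Fields {
  // Apply in timestamp order, so when two edits touch the same field the later one wins.
  const ordered = [...edits].sort((a, b) => a.timestamp - b.timestamp)
  return ordered.reduce<Fields>((doc, edit) => ({...doc, ...edit.changes}), {...base})
}

const merged = mergeEdits(
  {title: 'Draft title', body: 'Draft body'},
  [
    {author: 'alice', timestamp: 1, changes: {title: 'Alice title'}},
    {author: 'bob', timestamp: 2, changes: {body: 'Bob body'}},
    {author: 'carol', timestamp: 3, changes: {title: 'Carol title'}},
  ],
)
```

Here Bob's change to `body` survives because nobody else touched that field, while Carol's later change to `title` replaces Alice's.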

Document store hooks

Sanity Studio provides hooks to interact with the document store:
import {
  useDocumentStore,
  useDocumentOperation,
  useValidationStatus,
} from 'sanity'

function MyComponent({documentId, documentType}) {
  // Access the document store
  const documentStore = useDocumentStore()
  
  // Get document operations
  const {publish, unpublish, del} = useDocumentOperation(
    documentId,
    documentType
  )
  
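  // Check validation status
  const {isValidating, validation} = useValidationStatus(documentId, documentType)
  const hasErrors = validation.some((marker) => marker.level === 'error')

  const handlePublish = () => {
    // Don't publish while validating, when there are errors, or when publishing is disabled
    if (isValidating || hasErrors || publish.disabled) {
      return
    }
    publish.execute()
  }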
  
  return (
    <button onClick={handlePublish}>
      Publish
    </button>
  )
}

Observables and RxJS

The document store uses RxJS observables for reactive data:
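import {useMemo} from 'react'
import {map} from 'rxjs/operators'
import {useObservable} from 'react-rx'
import {useDocumentStore} from 'sanity'

function useDocumentTitle(documentId: string, documentType: string) {
  const documentStore = useDocumentStore()

  // Memoize so the observable is not recreated and resubscribed on every render
  const title$ = useMemo(
    () =>
      documentStore.pair
        .documentEvents(documentId, documentType)
        .pipe(map((event) => event.document?.title)),
    [documentStore, documentId, documentType]
  )

  return useObservable(title$)
}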

Content delivery

CDN vs. Direct API

const client = createClient({
  projectId: 'your-project-id',
  dataset: 'production',
  apiVersion: '2024-01-01',
  useCdn: true, // Use CDN for public queries
})

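// Use the direct API for authenticated requests.
// Read the token from the environment on the server; never ship it to the browser.
const authenticatedClient = client.withConfig({
  useCdn: false,
  token: process.env.SANITY_API_TOKEN,
})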
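
At the HTTP level, the `useCdn` flag chiefly decides which hostname a query is sent to. The sketch below builds a query URL by hand, assuming the public HTTP API's URL pattern; `buildQueryUrl` and `QueryConfig` are hypothetical names, and in practice the client does this for you.

```typescript
interface QueryConfig {
  projectId: string
  dataset: string
  apiVersion: string
  useCdn: boolean
}

function buildQueryUrl(
  config: QueryConfig,
  query: string,
  params: Record<string, unknown> = {},
): string {
  // The CDN and the live API differ only in hostname.
  const host = config.useCdn ? 'apicdn.sanity.io' : 'api.sanity.io'
  const search = [`query=${encodeURIComponent(query)}`]
  for (const [name, value] of Object.entries(params)) {
    // Query parameters travel as JSON-encoded values under a "$"-prefixed key.
    const key = encodeURIComponent('$' + name)
    search.push(`${key}=${encodeURIComponent(JSON.stringify(value))}`)
  }
  const {projectId, apiVersion, dataset} = config
  return `https://${projectId}.${host}/v${apiVersion}/data/query/${dataset}?${search.join('&')}`
}
```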

Caching strategies

  • CDN caching - Automatic caching for public queries
  • Stale-while-revalidate - Serve cached content while fetching fresh data
  • On-demand revalidation - Trigger cache invalidation via webhooks
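
If you add your own cache in front of the API, stale-while-revalidate comes down to one decision per lookup. The sketch below models that decision; the names and time windows are hypothetical and do not reflect the settings of Sanity's CDN.

```typescript
interface CacheEntry<T> {
  value: T
  storedAt: number // epoch milliseconds
}

type CacheDecision = 'miss' | 'fresh' | 'stale-revalidate' | 'expired'

/** Decide how to answer a lookup, given the entry's age. */
function decide<T>(
  entry: CacheEntry<T> | undefined,
  now: number,
  maxAgeMs: number,
  staleWindowMs: number,
): CacheDecision {
  if (!entry) return 'miss' // nothing cached: fetch and wait
  const age = now - entry.storedAt
  if (age <= maxAgeMs) return 'fresh' // serve from cache
  // Serve the cached value right away and refresh it in the background.
  if (age <= maxAgeMs + staleWindowMs) return 'stale-revalidate'
  return 'expired' // too old to serve: fetch and wait
}
```

On-demand revalidation fits the same model: a webhook handler deletes the entry, so the next lookup is a `'miss'`.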

Datasets

Organize content into separate datasets:
// Production dataset
const prodClient = createClient({
  projectId: 'abc123',
  dataset: 'production',
  apiVersion: '2024-01-01',
})

// Staging dataset
const stagingClient = createClient({
  projectId: 'abc123',
  dataset: 'staging',
  apiVersion: '2024-01-01',
})
Datasets are isolated from each other. They don’t share documents or history.

API versioning

Specify an API version for consistent behavior:
const client = createClient({
  projectId: 'your-project-id',
  dataset: 'production',
  apiVersion: '2024-01-01', // Use a specific date
})
Always use a specific API version in production to avoid breaking changes.

Rate limits and quotas

  • Free tier - Generous limits for development
  • Paid tiers - Higher limits and SLAs
  • Request batching - Combine multiple queries
  • Caching - Use CDN to reduce API calls
Avoid polling the API frequently. Use listeners for real-time updates instead.
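
Request batching can be as simple as wrapping several GROQ expressions in one object, so a single request returns all results under named keys. `combineQueries` is a hypothetical helper for building such a query string.

```typescript
/** Combine named GROQ expressions into one query that returns an object. */
function combineQueries(queries: Record<string, string>): string {
  const entries = Object.entries(queries).map(
    // JSON.stringify quotes the key so it is a valid GROQ string literal.
    ([key, groq]) => `${JSON.stringify(key)}: ${groq}`,
  )
  return `{${entries.join(', ')}}`
}

const batched = combineQueries({
  posts: '*[_type == "post"]{title}',
  postCount: 'count(*[_type == "post"])',
})
```

Passing `batched` to `client.fetch` would then resolve to an object with `posts` and `postCount` properties, using one API request instead of two.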
