SheetDB - Reanimating Spreadsheets 🎃

A Kiroween-themed project that brings Google Spreadsheets back to life as a functional relational database backend.

Universal Backend Solution - Use Google Sheets as a database with Python, REST API, or JavaScript client. No hosting, no bandwidth limits, no setup time.

Why SheetDB?

SheetDB transforms Google Sheets into a powerful relational database with the performance of a traditional backend. It's perfect for:

Rapid Prototyping - Build MVPs in hours, not days
👥 Collaborative Apps - Users can edit data directly in sheets
📊 Internal Tools - Admin dashboards without hosting costs
🌐 Public APIs - Serve data from published sheets (no auth!)
🎓 Learning Projects - Database concepts without infrastructure

Key Features

🚀 500-1000x Faster - DuckDB caching makes queries blazing fast (vs direct API calls)
🔗 Relational Queries - Joins, filters, aggregations, subqueries - full SQL power
🔒 Type Safety - Schema validation and TypeScript support
🌐 Multiple Interfaces - Python library, REST API, JavaScript client
🎯 Framework Agnostic - Works with React, Django, FastAPI, Flask, and more
🔓 Three Operating Modes - Public (no auth), Authenticated (full power), Hybrid (best of both)
📊 Built-in UI - Google Sheets provides the admin interface for free
⚡ Zero Setup - Public mode works instantly, no credentials needed

Project Structure

⚠️ Note: This is a monorepo containing multiple components. When you pip install sheetdb, you get only the core library. See REPOSITORY_STRUCTURE.md for details.

.
├── sheetdb/              # Core Python library (pip package)
├── sheetdb_server/       # REST API server (separate component)
├── sheetdb-js/           # JavaScript/TypeScript client (npm package)
├── django_app/           # Monster Hunt Tracker demo
├── examples/             # Example applications
├── docs/                 # Documentation
└── tests/                # Test suite

What gets installed with pip install sheetdb:

✅ Core library (sheetdb/)
✅ CLI tool

What's separate:

❌ REST API server (install FastAPI/uvicorn separately)
❌ JavaScript client (separate npm package)
❌ Examples (available in source only)

Setup

1. Create Virtual Environment

py -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

2. Install Dependencies

pip install -r requirements.txt

3. Configure Environment

cp .env.example .env
# Edit .env with your spreadsheet ID and credentials path

4. Set Up Google Sheets API

Go to Google Cloud Console
Create a new project or select existing one
Enable Google Sheets API
Create a service account
Download the JSON credentials file
Share your spreadsheet with the service account email

Three Operating Modes

SheetDB supports three modes to fit different use cases:

1. Public Mode (No Authentication) 🔓

Perfect for prototypes, public data, and learning projects. Zero setup required!

from sheetdb import SheetDB

# Just publish your sheet to web and use the URL
db = SheetDB.from_public_url("https://docs.google.com/spreadsheets/d/YOUR_ID/")
db.sync()  # Reads from public CSV export

# Query with full SQL power
monsters = db.query("SELECT * FROM monsters WHERE danger_level >= 7")

Benefits:

✅ No OAuth setup or credentials needed
✅ No rate limits on reads (unlimited!)
✅ Perfect for prototypes and public data
✅ Still get 500x performance boost from caching

How to enable:

In Google Sheets: File → Share → Publish to web
Select "Entire Document" and "Web page"
Copy the URL and use it with SheetDB

2. Authenticated Mode (Full Power) 🔐

Full CRUD operations with Google Sheets API. Best for private data and production apps.

from sheetdb import SheetDB

# Use service account credentials
db = SheetDB(
    credentials="credentials.json",
    spreadsheet_id="YOUR_SPREADSHEET_ID",
    cache_mode="aggressive"  # Load once, query many times
)

db.sync()  # One-time sync from Sheets API

# Unlimited fast queries (cached in DuckDB)
for i in range(1000):
    data = db.query("SELECT * FROM monsters")  # ~0.001s each!

# Write operations
db.insert('monsters', {'name': 'Vampire', 'danger_level': 8})
db.update('monsters', id=1, data={'danger_level': 9})
db.delete('monsters', id=1)

Benefits:

✅ Full CRUD operations
✅ Private data support
✅ 500x faster reads via caching
✅ Controlled write access

Setup:

Create Google Cloud project
Enable Google Sheets API
Create service account and download JSON key
Share spreadsheet with service account email

3. Hybrid Mode (Best of Both) ⚡

Combine public reads (unlimited) with authenticated writes (controlled). Recommended for production!

from sheetdb import SheetDB

# Public reads + authenticated writes
db = SheetDB(
    sheet_url="https://docs.google.com/spreadsheets/d/YOUR_ID/",
    credentials="credentials.json",
    mode="hybrid",
    cache_mode="smart",
    cache_ttl=300  # Auto-refresh every 5 minutes
)

# Unlimited fast reads from public URL
data = db.query("SELECT * FROM monsters")

# Controlled writes via API
db.insert('monsters', {'name': 'Vampire', 'danger_level': 8})

Benefits:

✅ Unlimited reads (no rate limits)
✅ Controlled writes (authenticated)
✅ Best performance
✅ Auto-sync for collaborative editing

Architecture

SheetDB uses a layered architecture that works with any framework:

┌─────────────────────────────────────────────────────┐
│  Application Layer (React, Django, FastAPI, etc.)  │
└─────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────┐
│  Interface Layer (JS Client, REST API, ORM)         │
└─────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────┐
│  Core SheetDB Library (Unified API)                 │
│  ├─ Sheets Gateway (API + Public URLs)              │
│  ├─ DuckDB Engine (Relational Cache)                │
│  └─ Schema Config (Validation)                      │
└─────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────┐
│  Data Sources (Google Sheets, Public URLs, Forms)   │
└─────────────────────────────────────────────────────┘

Key Principles:

Google Sheets = Source of truth for all data
DuckDB = In-memory relational cache (500x faster queries)
Schema Config = Type safety and validation
Framework Agnostic = Works with any Python/JS framework

For a detailed explanation of how SheetDB works, see ARCHITECTURE.md.

Quick Start

5-Minute Setup (Public Mode)

The fastest way to get started - no authentication required!

from sheetdb import SheetDB

# 1. Publish your Google Sheet to web (File → Share → Publish to web)
# 2. Use the URL with SheetDB
db = SheetDB.from_public_url("https://docs.google.com/spreadsheets/d/YOUR_ID/")

# 3. Sync data (one-time load into cache)
db.sync()

# 4. Query with full SQL power
monsters = db.query("""
    SELECT m.name, m.danger_level, COUNT(h.id) as hunt_count
    FROM monsters m
    LEFT JOIN hunts h ON m.id = h.monster_id
    GROUP BY m.id, m.name, m.danger_level
    ORDER BY hunt_count DESC
""")

# 5. That's it! No credentials, no setup, just works.

Python Library (All Modes)

from sheetdb import SheetDB

# Public mode (no auth)
db = SheetDB.from_public_url("https://docs.google.com/spreadsheets/d/YOUR_ID/")

# Authenticated mode (full CRUD)
db = SheetDB(
    credentials="credentials.json",
    spreadsheet_id="YOUR_ID",
    cache_mode="aggressive"  # Default: load once, query many times
)

# Hybrid mode (public reads + auth writes)
db = SheetDB(
    sheet_url="https://docs.google.com/spreadsheets/d/YOUR_ID/",
    credentials="credentials.json",
    mode="hybrid"
)

# Cache modes
db = SheetDB(..., cache_mode="aggressive")  # Load once (default)
db = SheetDB(..., cache_mode="smart", cache_ttl=300)  # Auto-refresh every 5 min
db = SheetDB(..., cache_mode="lazy")  # Cache on-demand
db = SheetDB(..., cache_mode="none")  # No cache (like Retrosheet)

# Operations
db.sync()  # Refresh cache from sheets
data = db.query("SELECT * FROM monsters WHERE danger_level > 5")
db.insert('monsters', {'name': 'Vampire', 'danger_level': 8})
db.update('monsters', id=1, data={'danger_level': 9})
db.delete('monsters', id=1)

REST API Server

Start a universal backend that any frontend can use:

# Start the server
cd sheetdb_server
python server.py

# API is now available at http://localhost:8000
# Docs at http://localhost:8000/docs

API Endpoints:

# List all tables
GET /api/tables

# Query rows with filters
GET /api/tables/monsters/rows?danger_level__gte=7&limit=10&order_by=name

# Create row
POST /api/tables/monsters/rows
{"name": "Vampire", "danger_level": 8}

# Update row
PUT /api/tables/monsters/rows/1
{"danger_level": 9}

# Delete row
DELETE /api/tables/monsters/rows/1

# Trigger sync
POST /api/sync

# Check sync status
GET /api/sync/status

JavaScript Client

Use SheetDB from any JavaScript/TypeScript frontend:

npm install sheetdb-js

Vanilla JavaScript:

import { SheetDB } from "sheetdb-js";

const db = new SheetDB("http://localhost:8000/api");

// Fluent query API
const monsters = await db
  .table("monsters")
  .filter({ danger_level__gte: 7 })
  .orderBy("name")
  .limit(10)
  .get();

// CRUD operations
await db.table("monsters").create({ name: "Vampire", danger_level: 8 });
await db.table("monsters").update(1, { danger_level: 9 });
await db.table("monsters").delete(1);

// Trigger sync
await db.sync();

React Hooks:

import { SheetDBProvider, useQuery, useMutation } from "sheetdb-js/react";

function App() {
  return (
    <SheetDBProvider apiUrl="http://localhost:8000/api">
      <MonsterList />
    </SheetDBProvider>
  );
}

function MonsterList() {
  const { data, loading, error, refetch } = useQuery("monsters", {
    filter: { danger_level__gte: 7 },
    orderBy: "name",
  });

  const { mutate: createMonster } = useMutation("monsters", "create");

  if (loading) return <div>Loading monsters...</div>;
  if (error) return <div>Error: {error.message}</div>;

  return (
    <div>
      <h1>Dangerous Monsters ({data.length})</h1>
      {data.map((monster) => (
        <div key={monster.id}>
          {monster.name} - Danger Level: {monster.danger_level}
        </div>
      ))}
      <button
        onClick={() => createMonster({ name: "Vampire", danger_level: 8 })}
      >
        Add Monster
      </button>
      <button onClick={refetch}>Refresh</button>
    </div>
  );
}

CLI Tool

SheetDB includes a command-line tool for quick setup and management:

# Install SheetDB with CLI
pip install -e .

# Initialize from public URL (no auth required)
sheetdb init --url "https://docs.google.com/spreadsheets/d/YOUR_ID/"

# Initialize with authentication
sheetdb init --credentials creds.json --spreadsheet-id YOUR_ID

# Sync data
sheetdb sync --url "https://docs.google.com/spreadsheets/d/YOUR_ID/"

# Generate schema
sheetdb schema generate --credentials creds.json --spreadsheet-id YOUR_ID

# Get help
sheetdb --help

For complete CLI documentation, see docs/CLI_USAGE.md.

Using the ORM

SheetDB provides a Django-like ORM for working with your spreadsheet data. The ORM infrastructure is provided by sheetdb, while model definitions live in your application code (e.g., Django app).

Important: SheetDB does not support Django migrations. The schema is defined in sheets_schema.yaml and Google Sheets is the source of truth. Models simply map to existing sheets - they don't create or modify the sheet structure.

Defining Models

from sheetdb import Model

class Monster(Model):
    class Meta:
        table_name = "monsters"
        primary_key = "id"

class Hunter(Model):
    class Meta:
        table_name = "hunters"
        primary_key = "id"

class Hunt(Model):
    class Meta:
        table_name = "hunts"
        primary_key = "id"

    def get_monster(self):
        """Get related monster."""
        if self.monster_id:
            return Monster.objects.get(id=self.monster_id)
        return None

Configuration

from sheetdb import (
    SheetsGateway,
    SchemaConfig,
    DuckDBEngine,
    configure_orm
)

# Initialize components
gateway = SheetsGateway('credentials.json', 'your-spreadsheet-id')
schema = SchemaConfig('.kiro/specs/sheetdb/sheets_schema.yaml')
engine = DuckDBEngine(":memory:")

# Configure ORM
configure_orm(engine, gateway, schema)

# Sync data from sheets
engine.refresh_all_tables(gateway, schema)

Querying Data

# Get all records
monsters = Monster.objects.all()

# Filter records
dangerous = Monster.objects.filter(danger_level__gte=7)

# Get single record
dracula = Monster.objects.get(id=1)

# Chain filters
forest_monsters = Monster.objects.filter(habitat="Forest").filter(danger_level__gte=5)

# Order results
ordered = Monster.objects.all().order_by('-danger_level')

# Count records
count = Monster.objects.count()

Filter Operators

field: Exact match
field__gte: Greater than or equal
field__lte: Less than or equal
field__contains: String contains

Creating and Updating Records

# Create new record
monster = Monster.create(
    id=10,
    name='Vampire',
    danger_level=8,
    habitat='Crypt'
)

# Update existing record
monster.danger_level = 9
monster.save()

Relationships

# Get related objects
hunt = Hunt.objects.get(id=1)
monster = hunt.get_monster()
hunter = hunt.get_hunter()

Example Script

See examples/orm_example.py for a complete working example.

Schema Management (No Migrations)

SheetDB uses a different approach than traditional Django ORMs:

Create your Google Sheet with the desired structure (columns, headers)
Define the schema in sheets_schema.yaml to map sheets to tables
Define models in your Django app that match the schema
Validate using python validate_schema.py

To add a new table:

Create a new sheet in Google Sheets
Add the table definition to sheets_schema.yaml
Create a model class in your Django app
Run validation to ensure everything is configured correctly

Note: Django's makemigrations and migrate commands are not used with SheetDB. The schema is managed through the YAML configuration file and Google Sheets itself.

For detailed Django setup instructions, see docs/django_setup.md.

Development

This project follows spec-driven development. See .kiro/specs/sheetdb/ for:

requirements.md - Feature requirements
design.md - Architecture and design decisions
tasks.md - Implementation task list

Schema Validation

Before deploying or running the application, validate your schema configuration:

python validate_schema.py <config_path> <credentials_path>

Example:

python validate_schema.py .kiro/specs/sheetdb/sheets_schema.yaml path/to/credentials.json

The validation script checks:

All referenced sheets exist in the spreadsheet
All referenced columns exist in sheets
Primary key columns exist
Foreign key columns exist in both source and target tables

Exit codes:

0: Validation successful
1: Validation failed with errors
2: Script error (missing files, invalid arguments, etc.)

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=sheetdb

# Run property-based tests
pytest -k property

Documentation

Architecture Overview - System design and components
Vision & Roadmap - Project goals and future plans
Unified API - Core SheetDB class documentation
ORM Design Guide - When to use ORM vs raw SQL
CLI Usage - Command-line tool guide
Django Setup - Django integration guide
Phase 2 Complete - Public mode and forms
Phase 3 Complete - REST API server
Phase 4 Complete - JavaScript client
JavaScript Examples - JS client examples

Examples

examples/sheetdb_example.py - Core SheetDB usage
examples/orm_example.py - ORM usage
examples/cli_example.sh - CLI commands
examples/rest_api_example.sh - REST API usage
examples/forms_writer_example.py - Forms integration
examples/js-client-vanilla.html - Vanilla JavaScript example
examples/js-client-react.jsx - React example

Performance Comparison

SheetDB vs Retrosheet

The Problem with Direct API Calls:

Retrosheet and similar libraries query Google Sheets API directly for every request. This is slow and hits rate limits quickly.

# Retrosheet approach (direct API calls)
import retrosheet

sheet = retrosheet.Retrosheet("YOUR_ID", "credentials.json")

# Every query hits the API (slow!)
for i in range(100):
    data = sheet.fetch("monsters")  # ~0.5s each = 50 seconds total!

SheetDB's Solution: Smart Caching

SheetDB syncs data once into DuckDB, then queries run in-memory at blazing speed:

# SheetDB approach (cached queries)
from sheetdb import SheetDB

db = SheetDB.from_public_url("https://docs.google.com/spreadsheets/d/YOUR_ID/")
db.sync()  # One-time sync: ~2 seconds

# Unlimited fast queries (in-memory!)
for i in range(100):
    data = db.query("SELECT * FROM monsters")  # ~0.001s each = 0.1 seconds total!

# Result: 500x faster! (50s → 0.1s)

Feature Comparison:

Feature	SheetDB	Retrosheet	Traditional DB
Query Speed	⚡ 0.001s (cached)	🐌 0.5s (API call)	⚡ 0.001s
Relational Queries	✅ Full SQL (joins, aggregations)	❌ No joins	✅ Full SQL
Rate Limits	✅ Unlimited reads (public mode)	❌ 100 req/100s	✅ Unlimited
Setup Time	✅ 5 minutes (public mode)	⚠️ 30 minutes (OAuth)	❌ Hours (hosting)
Hosting Cost	✅ $0 (Google's infrastructure)	✅ $0	❌ $5-50/month
Built-in UI	✅ Google Sheets	✅ Google Sheets	❌ Build your own
Type Safety	✅ Schema validation	❌ No validation	✅ Schema enforcement
Multiple Interfaces	✅ Python, REST API, JS	❌ Python only	⚠️ Varies
Collaborative Editing	✅ Real-time in Sheets	✅ Real-time in Sheets	❌ Build your own

When to Use Each:

SheetDB: Prototypes, internal tools, collaborative apps, public APIs
Retrosheet: Simple scripts, one-off data fetches
Traditional DB: High-scale production apps (1M+ rows, complex transactions)

vs Traditional Databases (PostgreSQL, MySQL)

Advantages of SheetDB:

✅ No Hosting - Leverage Google's infrastructure (free!)
✅ Unlimited Bandwidth - Google handles all traffic
✅ Built-in Admin UI - Google Sheets is your admin panel
✅ Collaboration - Multiple users can edit simultaneously
✅ Version History - Built-in audit trail
✅ Zero Setup - Public mode works instantly
✅ Familiar Interface - Everyone knows spreadsheets

When Traditional DB is Better:

❌ Very large datasets (1M+ rows)
❌ Complex transactions (ACID guarantees)
❌ High write throughput (1000s writes/sec)
❌ Advanced features (triggers, stored procedures)

vs Firebase/Supabase

Advantages of SheetDB:

✅ No Vendor Lock-in - Your data lives in Google Sheets
✅ Familiar Interface - Spreadsheets vs custom admin panels
✅ No Learning Curve - Everyone knows how to use sheets
✅ Free Tier - Google Sheets is free for most use cases
✅ Offline Editing - Edit sheets offline, sync later
✅ Relational Queries - Full SQL power with joins

When Firebase/Supabase is Better:

❌ Real-time subscriptions (live updates)
❌ Authentication/authorization (built-in)
❌ File storage (built-in)
❌ Serverless functions (built-in)

Use Cases

Perfect For:

🚀 Rapid Prototyping - Build MVPs in hours
👥 Internal Tools - Admin dashboards, data entry forms
📊 Public APIs - Serve data from published sheets
🎓 Learning Projects - Database concepts without infrastructure
🤝 Collaborative Apps - Users edit data directly in sheets
📈 Data Analysis - SQL queries on spreadsheet data

Not Recommended For:

❌ High-scale production (1M+ rows)
❌ Real-time gaming (sub-100ms latency)
❌ Financial transactions (ACID guarantees)
❌ High write throughput (1000s writes/sec)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.kiro		.kiro
django_app		django_app
documentation		documentation
examples		examples
sheetdb		sheetdb
tests		tests
.gitignore		.gitignore
DEPENDENCIES.md		DEPENDENCIES.md
README.md		README.md
requirements-dev.txt		requirements-dev.txt
requirements-django.txt		requirements-django.txt
requirements.txt		requirements.txt
schema.yml		schema.yml
setup.py		setup.py
test_dependencies.sh		test_dependencies.sh

whitebumblebee/sheetdb

Folders and files

Latest commit

History

Repository files navigation

SheetDB - Reanimating Spreadsheets 🎃

Why SheetDB?

Key Features

Project Structure

Setup

1. Create Virtual Environment

2. Install Dependencies

3. Configure Environment

4. Set Up Google Sheets API

Three Operating Modes

1. Public Mode (No Authentication) 🔓

2. Authenticated Mode (Full Power) 🔐

3. Hybrid Mode (Best of Both) ⚡

Architecture

Quick Start

5-Minute Setup (Public Mode)

Python Library (All Modes)

REST API Server

JavaScript Client

CLI Tool

Using the ORM

Defining Models

Configuration

Querying Data

Filter Operators

Creating and Updating Records

Relationships

Example Script

Schema Management (No Migrations)

Development

Schema Validation

Testing

Documentation

Examples

Performance Comparison

SheetDB vs Retrosheet

vs Traditional Databases (PostgreSQL, MySQL)

vs Firebase/Supabase

Use Cases

Perfect For:

Not Recommended For:

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages