Schema

Complete reference for the database schemas.

The database is organized into two main schemas:

  • opinions - Court opinions and caselaw data
  • laws - Statutes and regulations

Opinions Schema

The opinions schema contains normalized court opinion data. Its tables can be joined through their foreign key relationships.

Entity Relationship

courts
  └── cases
        └── docket_entries
              ├── citations
              └── content

opinion_relations (cross-references between opinions)

Opinions Tables

opinions.courts

Court definitions for 2000+ federal and state courts (US, Canada, and other jurisdictions).

Column | Type | Description
id 🔑 | VARCHAR(50) | Court identifier (e.g., 'ca9', 'nysd', 'scc')
full_name | TEXT | Full court name
court_abbreviation | VARCHAR(100) | Citation abbreviation
jurisdiction | VARCHAR(50) | Federal Appellate, Federal District, Federal Bankruptcy, Federal Special, State Supreme, State Appellate, State Trial, State Special, State Attorney General, Military, Military Appellate, Tribal, U.S. Territory, Committee, International
state | VARCHAR(50) | State/province name, 'Federal', 'Tribal', or territory name
country | VARCHAR(2) | ISO 3166-1 alpha-2 country code (US, CA, etc.)
in_use | BOOLEAN | Whether court is actively scraped

opinions.cases

Case metadata including docket information and dates.

Column | Type | Description
id 🔑 | TEXT | Internal case ID
court_id | VARCHAR(50) | Court identifier
docket_number | TEXT | Court docket number
case_name | TEXT | Case name/caption
date_filed | DATE | Date case was filed
date_terminated | DATE | Date case was terminated
data_origin | VARCHAR(25) | Source of the data

opinions.docket_entries

Docket entries for opinions (rows where is_opinion = true). Each such row represents one court opinion.

Column | Type | Description
id 🔑 | TEXT | Opinion identifier
case_id | TEXT | Parent case
description | TEXT | Opinion description
date_filed | DATE | Opinion filing date
judge_name | TEXT | Authoring judge name
is_opinion | BOOLEAN | Whether this is an opinion (true for opinions)
opinion_role | ENUM | Role of opinion (lead, concurrence, dissent, etc.)
publish_status | ENUM | Publication status
data_origin | VARCHAR(25) | Source of the data
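
The courts, cases, and docket_entries tables chain together through court_id and case_id. The sketch below shows one way to list the opinions of a single court. It is only an illustration: it uses Python's sqlite3 module with an in-memory database attached under the name opinions as a stand-in for the real database, the rows are made up, and build_demo_db and opinions_for_court are hypothetical helper names. It also assumes that cases.court_id refers to courts.id and docket_entries.case_id refers to cases.id, as the column descriptions suggest.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for three opinions tables (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    # Attaching a second in-memory database named "opinions" lets the
    # schema-qualified names (opinions.courts, ...) work in SQLite.
    conn.execute("ATTACH DATABASE ':memory:' AS opinions")
    conn.executescript(
        """
        CREATE TABLE opinions.courts (id TEXT PRIMARY KEY, full_name TEXT);
        CREATE TABLE opinions.cases (
            id TEXT PRIMARY KEY, court_id TEXT,
            docket_number TEXT, case_name TEXT
        );
        CREATE TABLE opinions.docket_entries (
            id TEXT PRIMARY KEY, case_id TEXT, date_filed TEXT,
            is_opinion INTEGER, opinion_role TEXT
        );
        INSERT INTO opinions.courts VALUES ('ca9', 'Demo court name');
        INSERT INTO opinions.cases VALUES
            ('case-1', 'ca9', 'demo-0001', 'Example v. Sample');
        INSERT INTO opinions.docket_entries VALUES
            ('op-1', 'case-1', '2020-01-15', 1, 'lead'),
            ('op-2', 'case-1', '2020-01-15', 1, 'dissent'),
            ('de-3', 'case-1', '2019-11-02', 0, NULL);
        """
    )
    return conn


def opinions_for_court(conn: sqlite3.Connection, court_id: str) -> List[Tuple]:
    """Return (case_name, opinion_id, opinion_role, date_filed) for one court."""
    return conn.execute(
        """
        SELECT c.case_name, d.id, d.opinion_role, d.date_filed
        FROM opinions.docket_entries AS d
        JOIN opinions.cases  AS c  ON c.id  = d.case_id
        JOIN opinions.courts AS ct ON ct.id = c.court_id
        WHERE ct.id = ? AND d.is_opinion = 1
        ORDER BY d.date_filed, d.id
        """,
        (court_id,),
    ).fetchall()


if __name__ == "__main__":
    for row in opinions_for_court(build_demo_db(), "ca9"):
        print(row)
```

The SELECT statement itself uses only column names from the tables above, so it should carry over to the real database with little change apart from the boolean literal (SQLite stores is_opinion as 0/1 here).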

opinions.citations

Reporter citations for opinions.

Column | Type | Description
id 🔑 | UUID | Internal citation ID
opinion_id | TEXT | Parent opinion
court_id | VARCHAR(255) | Court identifier
volume | VARCHAR(20) | Reporter volume number
reporter | VARCHAR(50) | Reporter abbreviation (e.g., 'U.S.', 'F.3d')
page | VARCHAR(20) | Starting page number
cited_as | TEXT | Full citation string
citation_type | VARCHAR(50) | Type of citation
year | INTEGER | Year of citation
normalized | VARCHAR(150) | Normalized citation format
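
Because volume, reporter, and page are stored as separate columns, a citation string can be split into those three parts and matched directly. The following is a rough sketch of that idea, not a full citation parser: the regular expression only handles the simple "volume reporter page" shape, the sample row is made up, and parse_citation, find_opinions, and build_demo_db are hypothetical names. SQLite again stands in for the real database.

```python
import re
import sqlite3
from typing import List, Optional, Tuple

# Simple "volume reporter page" shape, e.g. "123 F.3d 456".
CITATION_RE = re.compile(r"\s*(\d+)\s+(.+?)\s+(\d+)\s*")


def parse_citation(text: str) -> Optional[Tuple[str, str, str]]:
    """Split a citation into (volume, reporter, page), or None if it does not fit."""
    match = CITATION_RE.fullmatch(text)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


def build_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for opinions.citations with one made-up row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS opinions")
    conn.executescript(
        """
        CREATE TABLE opinions.citations (
            id TEXT PRIMARY KEY, opinion_id TEXT, volume TEXT,
            reporter TEXT, page TEXT, cited_as TEXT
        );
        INSERT INTO opinions.citations VALUES
            ('cit-1', 'op-1', '123', 'F.3d', '456', '123 F.3d 456');
        """
    )
    return conn


def find_opinions(conn: sqlite3.Connection, citation_text: str) -> List[str]:
    """Return opinion_id values whose citation matches the given string."""
    parsed = parse_citation(citation_text)
    if parsed is None:
        return []
    rows = conn.execute(
        """
        SELECT opinion_id
        FROM opinions.citations
        WHERE volume = ? AND reporter = ? AND page = ?
        ORDER BY opinion_id
        """,
        parsed,  # volume and page are compared as strings, matching VARCHAR columns
    ).fetchall()
    return [row[0] for row in rows]
```

Matching on the normalized column may be more robust where reporter spellings vary, but its exact format is not described here, so this sketch avoids it.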

opinions.content

Full text content for opinions.

Column | Type | Description
id 🔑 | TEXT | Opinion ID (same as docket_entries.id)
html_content | TEXT | Opinion text in HTML format
source | VARCHAR(50) | Source of content
created_at | TIMESTAMP | When content was added
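
Since html_content holds HTML rather than plain text, consumers that need plain text (for search or analysis) have to strip the markup themselves. One possible approach, using only Python's standard html.parser module, is sketched below. The set of tags treated as line breaks is an assumption about typical opinion markup, and html_to_text is a hypothetical helper name.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect text nodes, starting a new line at block-level tags."""

    # Assumed set of tags that should begin a new line in the output.
    BLOCK_TAGS = {"p", "div", "br", "li", "blockquote",
                  "h1", "h2", "h3", "h4", "h5", "h6", "tr"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)  # decode &amp; and friends
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html_content: str) -> str:
    """Convert an html_content value to plain text, one block per line."""
    parser = _TextExtractor()
    parser.feed(html_content)
    parser.close()  # flush any buffered text
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

This sketch keeps the text of every element, including any script or style blocks, so it may need extra filtering depending on what the stored HTML contains.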

opinions.opinion_relations

Citation relationships and treatment analysis between opinions.

Column | Type | Description
cited_id | TEXT | The opinion being cited
citing_id | TEXT | The opinion doing the citing
citator_version | TEXT | Version of citator analysis
is_authoritative | BOOLEAN | Whether this is an authoritative citation
treatment_category | TEXT | How the citing opinion treats the cited opinion
treatment_description | TEXT | Detailed treatment explanation
supporting_quote | TEXT | Quote from citing opinion supporting the treatment
confidence_score | DOUBLE | Confidence score of the treatment analysis
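
A common use of this table is to ask which opinions cite a given opinion and how they treat it. The sketch below filters by confidence_score to keep only the more reliable treatment rows. Several things here are assumptions: the treatment_category values 'negative' and 'positive' are placeholders (the actual vocabulary is not listed above), the score is assumed to run from 0 to 1, the rows are made up, and citing_opinions and build_demo_db are hypothetical names. SQLite stands in for the real database.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for opinions.opinion_relations (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS opinions")
    conn.executescript(
        """
        CREATE TABLE opinions.opinion_relations (
            cited_id TEXT, citing_id TEXT,
            treatment_category TEXT, confidence_score REAL
        );
        -- Placeholder treatment values; the real vocabulary may differ.
        INSERT INTO opinions.opinion_relations VALUES
            ('op-1', 'op-2', 'negative', 0.95),
            ('op-1', 'op-3', 'positive', 0.60),
            ('op-9', 'op-2', 'positive', 0.99);
        """
    )
    return conn


def citing_opinions(
    conn: sqlite3.Connection, cited_id: str, min_confidence: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return (citing_id, treatment_category, confidence_score) rows.

    Only relations at or above min_confidence are kept, highest first.
    """
    return conn.execute(
        """
        SELECT citing_id, treatment_category, confidence_score
        FROM opinions.opinion_relations
        WHERE cited_id = ? AND confidence_score >= ?
        ORDER BY confidence_score DESC, citing_id
        """,
        (cited_id, min_confidence),
    ).fetchall()
```

Lowering min_confidence trades precision for recall; a suitable threshold depends on how the citator scores are calibrated, which this reference does not specify.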

🔑 = Primary Key, → = Foreign Key


Laws Schema

The laws schema contains statutes (codified laws), regulations (administrative rules), and constitutions. This schema uses a unified document structure with hierarchical support and versioning.

Entity Relationship

collections (uscode, cfr, state codes)
  └── documents (hierarchical: title > chapter > section)

citation_relations (cross-references between any document types)

Laws Tables

laws.collections

Content collections (e.g., U.S. Code, CFR, state statutes and regulations).

Column | Type | Description
id 🔑 | VARCHAR(50) | Collection identifier (e.g., 'uscode', 'cfr')
name | VARCHAR(200) | Full name (e.g., 'United States Code')
type | VARCHAR(30) | Content type: 'statute', 'regulation', 'constitution'
authority_name | VARCHAR(200) | Publishing authority (e.g., 'United States', 'California')
jurisdiction | VARCHAR(50) | Jurisdiction name (e.g., 'Federal', 'California', 'New York')

Available collections:

  • uscode - United States Code (federal statutes)
  • cfr - Code of Federal Regulations (federal regulations)
  • fed-bills - Congressional Bills
  • fed-plaw - Public Laws
  • State collections follow the pattern {abbrev}-codes (statutes) and {abbrev}-ccr or {abbrev}-regs (regulations)
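
The state naming pattern above can be turned into a small helper that produces candidate collection ids to look up in laws.collections. This is a sketch under two assumptions that the list does not confirm: that {abbrev} is a lowercase state abbreviation, and that it is not known in advance whether a state uses the -ccr or the -regs suffix, so both are returned. The function name state_collection_candidates is hypothetical.

```python
from __future__ import annotations


def state_collection_candidates(abbrev: str) -> dict[str, list[str]]:
    """Build candidate collection ids for a state abbreviation.

    The ids are only candidates: each should be checked against
    laws.collections.id before use.
    """
    key = abbrev.strip().lower()  # assumption: ids use lowercase abbreviations
    if not key:
        raise ValueError("abbrev must not be empty")
    return {
        "statute": [f"{key}-codes"],
        "regulation": [f"{key}-ccr", f"{key}-regs"],
    }


if __name__ == "__main__":
    print(state_collection_candidates("CA"))
```

Returning both regulation suffixes keeps the helper honest about what the pattern leaves open; a lookup against laws.collections then decides which id actually exists.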

laws.documents

Unified table for hierarchical legal content (statutes, regulations, constitutions).

Column | Type | Description
id 🔑 | UUID | Document identifier
collection_id | VARCHAR(50) | Parent collection (e.g., 'uscode', 'cfr')
parent_id | UUID | Parent document node
path | LTREE | Hierarchical path for tree queries (e.g., 'uscode.5.i.101')
depth | INTEGER | Depth in hierarchy (0 = root)
sort_order | INTEGER | Order within parent
node_type | VARCHAR(30) | Type: 'title', 'chapter', 'part', 'section', 'article', 'amendment'
number | VARCHAR(50) | Section/chapter number (e.g., '5', 'I', '101', '(a)')
heading | TEXT | Section heading/title
citation_canonical | VARCHAR(200) | Canonical citation (e.g., '5 U.S.C. § 552', '26 C.F.R. § 1.61-1')
effective_date | DATE | When this version became effective
end_date | DATE | When this version was superseded (null if current)
status | VARCHAR(20) | Status: 'active', 'repealed', 'amended'
supersedes_id | UUID | Previous version this supersedes
is_current | BOOLEAN | Whether this is the current version
html_content | TEXT | HTML formatted content
source_law | VARCHAR(500) | Public law reference for statutes (e.g., 'Pub. L. 118-100')
agency_name | VARCHAR(200) | Agency name for regulations (e.g., 'Environmental Protection Agency')
agency_code | VARCHAR(20) | Agency code for regulations (e.g., 'EPA')
fr_citation | VARCHAR(100) | Federal Register citation for regulations (e.g., '89 FR 12345')
source_id | VARCHAR(200) | External source identifier
gcs_path | VARCHAR(500) | GCS path to original source file

Type-specific fields:

  • Statutes use: source_law
  • Regulations use: agency_name, agency_code, fr_citation
  • Constitutions use: standard hierarchy fields only
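
Two query patterns follow from the hierarchy and versioning columns of laws.documents: walking a subtree, and finding the version of a provision in force on a given date. The sketch below illustrates both with SQLite as a stand-in. Because SQLite has no LTREE type, the subtree walk uses a recursive query over parent_id instead of the path column (in PostgreSQL, the ltree descendant operator <@ on path would be the more direct route). The rows, the '99 Demo Code' citation, and the function names are made up, and the date logic assumes end_date is exclusive, which the table does not state.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for laws.documents (made-up rows, no path column)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS laws")
    conn.executescript(
        """
        CREATE TABLE laws.documents (
            id TEXT PRIMARY KEY, parent_id TEXT, depth INTEGER,
            sort_order INTEGER, node_type TEXT, number TEXT,
            citation_canonical TEXT, effective_date TEXT,
            end_date TEXT, is_current INTEGER
        );
        INSERT INTO laws.documents VALUES
            ('t1', NULL, 0, 1, 'title', '99', NULL, '2000-01-01', NULL, 1),
            ('c1', 't1', 1, 1, 'chapter', 'I', NULL, '2000-01-01', NULL, 1),
            ('s1-old', 'c1', 2, 1, 'section', '101', '99 Demo Code § 101',
             '2000-01-01', '2020-01-01', 0),
            ('s1-new', 'c1', 2, 1, 'section', '101', '99 Demo Code § 101',
             '2020-01-01', NULL, 1);
        """
    )
    return conn


def current_subtree(conn: sqlite3.Connection, root_id: str) -> List[Tuple]:
    """Return (id, node_type, number) for current nodes at or under root_id."""
    return conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM laws.documents WHERE id = ?
            UNION ALL
            SELECT d.id
            FROM laws.documents AS d
            JOIN subtree AS s ON d.parent_id = s.id
        )
        SELECT d.id, d.node_type, d.number
        FROM laws.documents AS d
        JOIN subtree AS s ON s.id = d.id
        WHERE d.is_current = 1
        ORDER BY d.depth, d.sort_order
        """,
        (root_id,),
    ).fetchall()


def version_as_of(
    conn: sqlite3.Connection, citation: str, as_of: str
) -> Optional[str]:
    """Return the id of the version in force on as_of (ISO date), if any."""
    row = conn.execute(
        """
        SELECT id
        FROM laws.documents
        WHERE citation_canonical = ?
          AND effective_date <= ?
          AND (end_date IS NULL OR end_date > ?)  -- assumes exclusive end_date
        ORDER BY effective_date DESC
        LIMIT 1
        """,
        (citation, as_of, as_of),
    ).fetchone()
    return row[0] if row else None
```

Dates are stored as ISO 8601 strings in this stand-in, so plain string comparison orders them correctly; with a real DATE column the same predicates apply unchanged.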

laws.citation_relations

Cross-references between any document types (statutes citing statutes, opinions citing statutes, etc.).

Column | Type | Description
id 🔑 | UUID | Relation identifier
citing_id | UUID | Document doing the citing
citing_type | VARCHAR(30) | Type: 'document', 'opinion'
cited_id | UUID | Document being cited
cited_type | VARCHAR(30) | Type: 'document', 'opinion'
relation_type | VARCHAR(50) | Relation: 'cites', 'implements', 'amends', 'authority'
treatment | VARCHAR(50) | Treatment: 'Negative', 'Caution', 'Neutral' (for caselaw)
pin_cite | VARCHAR(100) | Specific location within cited document
source | VARCHAR(50) | Source: 'extraction', 'manual'
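
Because citing_type and cited_type record what kind of record sits on each side of a relation, a single query can answer questions such as "which opinions cite this statute section?". The following sketch shows one way to write it, with SQLite standing in for the real database. The rows and ids are made up, and citations_to_document and build_demo_db are hypothetical names.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for laws.citation_relations (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS laws")
    conn.executescript(
        """
        CREATE TABLE laws.citation_relations (
            id TEXT PRIMARY KEY, citing_id TEXT, citing_type TEXT,
            cited_id TEXT, cited_type TEXT, relation_type TEXT,
            treatment TEXT, pin_cite TEXT
        );
        INSERT INTO laws.citation_relations VALUES
            ('r1', 'op-1', 'opinion', 'doc-1', 'document',
             'cites', 'Neutral', '(a)(1)'),
            ('r2', 'doc-2', 'document', 'doc-1', 'document',
             'implements', NULL, NULL),
            ('r3', 'op-2', 'opinion', 'doc-1', 'document',
             'cites', 'Negative', NULL);
        """
    )
    return conn


def citations_to_document(
    conn: sqlite3.Connection,
    document_id: str,
    citing_type: Optional[str] = None,
) -> List[Tuple]:
    """Return (citing_id, citing_type, relation_type, treatment, pin_cite).

    Pass citing_type='opinion' or 'document' to narrow the result;
    None returns relations from both kinds of source.
    """
    return conn.execute(
        """
        SELECT citing_id, citing_type, relation_type, treatment, pin_cite
        FROM laws.citation_relations
        WHERE cited_id = ?
          AND cited_type = 'document'
          AND (? IS NULL OR citing_type = ?)
        ORDER BY id
        """,
        (document_id, citing_type, citing_type),
    ).fetchall()
```

Filtering with citing_type='opinion' and treatment = 'Negative' would be a natural next step for flagging provisions that courts have questioned.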

🔑 = Primary Key, → = Foreign Key