@midpage-ai/scrapers

REST API

HTTP endpoints for searching and retrieving opinions.

Base URL

https://api.midpage.ai/api/v1

Authentication

API requests require an API key, passed as a bearer token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Contact us to obtain API credentials.
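
As a rough sketch of the authentication scheme above, the following Python builds an authenticated POST request using only the standard library. The helper names build_request and post_json are hypothetical, and the Content-Type header is an assumption: the documentation shows JSON request bodies but does not name that header.

```python
import json
import urllib.request

BASE_URL = "https://api.midpage.ai/api/v1"


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build an authenticated POST request without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer scheme as shown in the Authentication section.
            "Authorization": f"Bearer {api_key}",
            # Assumption: JSON bodies are sent with this content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post_json(path: str, payload: dict, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without network access.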

Endpoints

Endpoint       Method  Description
/search        POST    Search for opinions using semantic, keyword, or hybrid search
/opinions/get  POST    Retrieve full opinion data by ID, citation, or docket

Search Endpoint

The /search endpoint supports three search modes:

Semantic Search (default)

Vector similarity search using AI embeddings. Best for conceptual queries.

{
  "query": "breach of fiduciary duty",
  "mode": "semantic",
  "page": 1,
  "page_size": 10
}
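
A request body like the one above could be assembled with a small helper. This is a minimal sketch: the function name build_search_payload is hypothetical, and the client-side checks (non-empty query, page and page_size of at least 1) are assumptions rather than documented server rules.

```python
VALID_MODES = {"semantic", "keyword", "hybrid"}


def build_search_payload(
    query: str,
    mode: str = "semantic",  # semantic is the documented default
    page: int = 1,
    page_size: int = 10,
) -> dict:
    """Return a /search request body, rejecting obviously invalid input."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    # Assumption: pages are 1-based, as in the documented examples.
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {"query": query, "mode": mode, "page": page, "page_size": page_size}
```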

Keyword Search

Traditional full-text search. Best for exact phrases and known terms. Supports Lexis/Westlaw-style boolean operators.

{
  "query": "summary judgment standard",
  "mode": "keyword",
  "page": 1,
  "page_size": 20,
  "filters": {
    "jurisdictions": ["Federal Appellate"],
    "date_filed": {
      "start": "2020-01-01",
      "end": "2024-12-31"
    }
  }
}

Boolean Operators

Keyword mode supports Lexis/Westlaw-style boolean operators:

Operator  Description                 Example
AND       Both terms must appear      negligence AND damages
OR        Either term may appear      breach OR default
NOT       Exclude term                contract NOT employment
"..."     Exact phrase                "summary judgment"
*         Wildcard (any characters)   negligen*
?         Single-character wildcard   wom?n
W/n       Within n words (proximity)  negligence W/5 damages
()        Grouping                    (contract OR agreement) AND breach

Important: Boolean operators must be UPPERCASE (e.g., AND not and).
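
Because a lowercase operator is easy to miss, a client might check queries before sending them. The sketch below flags operator-like words that are not fully uppercase and ignores anything inside quoted phrases. The function name find_lowercase_operators is hypothetical, and the check is a client-side convenience, not part of the API. It only reports candidates; it does not rewrite the query, since a word such as "and" may be intended as ordinary text.

```python
import re

# Quoted phrases are exact matches, so operator-like words inside them are ignored.
_QUOTED = re.compile(r'"[^"]*"')
# Operator words from the table above, matched in any letter case.
_OPERATOR = re.compile(r"\b(?:and|or|not|w/\d+)\b", re.IGNORECASE)


def find_lowercase_operators(query: str) -> list:
    """Return operator-like words in the query that are not fully uppercase."""
    unquoted = _QUOTED.sub(" ", query)
    return [word for word in _OPERATOR.findall(unquoted) if word != word.upper()]
```

An empty list means no suspicious words were found outside quoted phrases.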

Example: Boolean search

{
  "query": "(contract OR agreement) AND breach NOT employment",
  "mode": "keyword",
  "page": 1,
  "page_size": 20
}

Example: Proximity search

{
  "query": "negligence W/5 damages",
  "mode": "keyword",
  "page": 1,
  "page_size": 20
}

When boolean operators are detected, the response includes a boolean_query field in metadata:

{
  "metadata": {
    "mode": "keyword",
    "query": "negligence AND damages",
    "processing_time_ms": 459,
    "boolean_query": {
      "detected": true,
      "operators": ["AND"]
    }
  }
}
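
A client can use this field to confirm that the query was parsed as intended. The following sketch reads the operators from a decoded response; the function name detected_operators is hypothetical, and treating a missing boolean_query field as "no operators detected" is an assumption.

```python
def detected_operators(response: dict) -> list:
    """Return the boolean operators reported in metadata, or an empty list."""
    boolean_query = (response.get("metadata") or {}).get("boolean_query")
    # Assumption: the field is absent (or detected is false) for plain queries.
    if not boolean_query or not boolean_query.get("detected"):
        return []
    return list(boolean_query.get("operators", []))
```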

Hybrid Search

Combines semantic and keyword results 50/50 with deduplication. Best for comprehensive coverage.

{
  "query": "fourth amendment search and seizure",
  "mode": "hybrid",
  "page": 1,
  "page_size": 10,
  "filters": {
    "court_ids": ["ca9", "scotus"]
  }
}
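
To illustrate what an even merge with deduplication can look like, the sketch below alternates between two result lists and drops repeated opinions. This is only one plausible reading of "50/50 with deduplication" and is not the service's actual algorithm, which this page does not describe. The function name interleave_dedup is hypothetical.

```python
from itertools import zip_longest


def interleave_dedup(semantic: list, keyword: list) -> list:
    """Alternate between two result lists, keeping each opinion_id once."""
    seen = set()
    merged = []
    for pair in zip_longest(semantic, keyword):
        for hit in pair:
            # zip_longest pads the shorter list with None.
            if hit is None or hit["opinion_id"] in seen:
                continue
            seen.add(hit["opinion_id"])
            merged.append(hit)
    return merged
```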

Available Filters

Filter            Type      Description
court_ids         string[]  Court identifiers (e.g., ["ca9", "scotus"])
jurisdictions     string[]  Jurisdiction types (e.g., ["Federal Appellate"])
states            string[]  State names (e.g., ["California"])
date_filed.start  string    Start date (YYYY-MM-DD)
date_filed.end    string    End date (YYYY-MM-DD)
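
The filters object can be built programmatically so that dates are always formatted as YYYY-MM-DD. This is a minimal sketch: the function name build_filters is hypothetical, and omitting unused filters and rejecting a start date later than the end date are assumptions about sensible client behavior, not documented requirements.

```python
from datetime import date
from typing import Optional, Sequence


def build_filters(
    court_ids: Optional[Sequence[str]] = None,
    jurisdictions: Optional[Sequence[str]] = None,
    states: Optional[Sequence[str]] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> dict:
    """Return a filters object containing only the filters that were supplied."""
    if start and end and start > end:
        raise ValueError("start date must not be after end date")
    filters = {}
    if court_ids:
        filters["court_ids"] = list(court_ids)
    if jurisdictions:
        filters["jurisdictions"] = list(jurisdictions)
    if states:
        filters["states"] = list(states)
    date_filed = {}
    if start:
        date_filed["start"] = start.isoformat()  # date.isoformat() yields YYYY-MM-DD
    if end:
        date_filed["end"] = end.isoformat()
    if date_filed:
        filters["date_filed"] = date_filed
    return filters
```

The result can be placed under the "filters" key of a /search request body.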

Facets (Result Breakdowns)

For keyword and hybrid modes, the response automatically includes facet breakdowns showing result counts by court, jurisdiction, state, and year. This is useful for building filter UIs or understanding result distribution.

To disable facets, set include_facets to false in the request body:

{
  "query": "copyright infringement",
  "mode": "keyword",
  "include_facets": false
}

Response with facets (default for keyword/hybrid):

{
  "results": [...],
  "pagination": {...},
  "facets": {
    "court_abbreviation": [
      { "key": "9th Cir.", "count": 1523 },
      { "key": "S.D.N.Y.", "count": 892 }
    ],
    "jurisdiction": [
      { "key": "Federal Appellate", "count": 3200 },
      { "key": "Federal District", "count": 4100 }
    ],
    "state": [
      { "key": "California", "count": 1800 },
      { "key": "New York", "count": 1650 }
    ],
    "year_filed": [
      { "key": "2024", "count": 1200 },
      { "key": "2023", "count": 2100 }
    ]
  }
}

Important notes:

  • Facets are enabled by default for keyword and hybrid modes
  • Facets are not available in semantic mode (Pinecone doesn't support aggregations)
  • In hybrid mode, facet counts reflect keyword search results only; semantic (Pinecone) results are not included in these counts. A note field will be present explaining this limitation.
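
For a filter UI, each facet list is often easier to work with as a mapping from key to count. The sketch below converts one facet from a decoded response; the function name facet_counts is hypothetical, and returning an empty mapping when facets are absent (for example in semantic mode) is an assumption about how a client might handle that case.

```python
def facet_counts(response: dict, facet: str) -> dict:
    """Map each bucket key to its count for one facet, or {} if unavailable."""
    buckets = (response.get("facets") or {}).get(facet, [])
    return {bucket["key"]: bucket["count"] for bucket in buckets}
```

For example, calling facet_counts(response, "jurisdiction") on the facet response shown earlier would give the counts for "Federal Appellate" and "Federal District".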

Response

{
  "results": [
    {
      "opinion_id": "8623588",
      "score": 0.85,
      "case_name": "Smith v. Jones",
      "court_id": "ca9",
      "court_name": "United States Court of Appeals for the Ninth Circuit",
      "jurisdiction": "Federal Appellate",
      "state": null,
      "date_filed": "2023-05-15",
      "snippet": "The court found that...",
      "source": "semantic"
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 10,
    "total_results": 150,
    "total_pages": 15,
    "has_next": true,
    "has_prev": false
  },
  "metadata": {
    "mode": "semantic",
    "query": "breach of fiduciary duty",
    "processing_time_ms": 542
  }
}
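
The pagination object makes it straightforward to walk through every page of results. The sketch below is transport-agnostic: fetch_page is a hypothetical callable that takes a page number and returns a decoded response, and max_pages is an assumed safety cap, not an API limit.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[int], dict], max_pages: int = 100) -> Iterator[dict]:
    """Yield results page by page until pagination.has_next is false."""
    page = 1
    while page <= max_pages:
        response = fetch_page(page)
        yield from response.get("results", [])
        # Stop when the server reports there is no further page.
        if not (response.get("pagination") or {}).get("has_next"):
            break
        page += 1
```

Passing the fetch function in as an argument keeps the loop independent of the HTTP client and easy to test with canned responses.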

See the API Reference tab for complete endpoint documentation, request/response schemas, and interactive examples.