How to Automate Legal Contract Review with Claude: A Practical Step-by-Step Guide
Build a Claude-powered contract analyzer using the Batch API to extract liability clauses, auto-renewals, and payment terms across hundreds of documents — with ready-to-use prompts for NDAs, employment agreements, and vendor contracts.
Contract review is one of those tasks that sounds simple until you’re staring down 300 vendor agreements, each 40 pages long, with a deadline on Friday. Someone has to find every auto-renewal clause, every indemnification trap, every payment term buried in Section 14(b)(ii). Traditionally, that someone bills $400 an hour and drinks a lot of coffee.
Claude changes the math. With a 200,000-token context window — enough to hold roughly 150,000 words, or a very thorough master services agreement plus its exhibits — and a Batch API designed for exactly this kind of high-volume, non-urgent work, you can build a contract analysis pipeline that processes hundreds of documents systematically, flags the clauses that matter, and hands your lawyers a clean summary instead of a stack of PDFs. This guide shows you exactly how to do that.
One honest caveat before we start: Claude is a powerful first-pass reviewer, not a replacement for attorney judgment. The output of this pipeline should feed into legal review, not bypass it. With that said, getting 80% of the extraction work done before a human reads a single page is the actual value here. Let’s build it.
What You’ll Actually Build
By the end of this guide you’ll have a working pipeline that takes a batch of contract documents (PDFs or plain text), sends them to Claude via the Anthropic Batch API, and returns structured JSON for each document containing: identified contract type, key parties, payment terms, termination clauses, auto-renewal dates, liability caps, and flagged risk clauses. You’ll also have ready-to-use prompt templates for three contract types: NDAs, employment agreements, and vendor/SaaS contracts.
What You’ll Need
An Anthropic API key with access to Claude Opus 4.6 (Anthropic’s current flagship, optimized for complex reasoning tasks like this). Python 3.9 or newer. The anthropic Python SDK, installable via pip install anthropic. Your contracts in text-extractable PDF format or plain text. A basic comfort level with Python — you don’t need to be a developer, but you do need to be able to run a script.
Note 💡
Claude Opus 4.6 has a 200,000-token context window. A dense 50-page contract typically runs 25,000–35,000 tokens. You can fit one substantial contract per API call comfortably, or multiple shorter agreements in a single call if you structure them carefully.
Step 1: Extract Text from Your Contracts
Before Claude sees anything, your PDFs need to become readable text. The pdfplumber library handles this cleanly for most standard contracts. Install it with pip install pdfplumber, then extract your documents in bulk. Keep the filename associated with each text block — you’ll need it for your output mapping later.
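A minimal sketch of that bulk extraction step, assuming all your contracts sit in one folder as .pdf or .txt files. The helper names extract_pdf_text and load_contracts are illustrative, not part of any library. pdfplumber.open and page.extract_text() are the pdfplumber calls doing the work.

```python
from pathlib import Path


def extract_pdf_text(pdf_path: Path) -> str:
    """Extract text from every page of a text-based PDF with pdfplumber."""
    import pdfplumber  # imported lazily so plain-text folders work without it

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None or "" for pages with no text layer
        pages = [page.extract_text() or "" for page in pdf.pages]
    return "\n\n".join(pages)


def load_contracts(folder: str) -> dict:
    """Return {filename: contract_text} for every .pdf and .txt in folder."""
    contracts = {}
    for path in sorted(Path(folder).iterdir()):
        suffix = path.suffix.lower()
        if suffix == ".pdf":
            contracts[path.name] = extract_pdf_text(path)
        elif suffix == ".txt":
            contracts[path.name] = path.read_text(encoding="utf-8")
    return contracts
```

The returned dict is keyed by filename, which is the shape the batch submission code in Step 3 expects for its contracts argument.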
For contracts with scanned pages (common with older agreements), you’ll need an OCR step first. pytesseract paired with pdf2image handles this, though OCR quality varies. If your firm regularly works with scanned documents, a dedicated OCR service like AWS Textract or Google Document AI will give you cleaner text as input — which directly improves Claude’s extraction accuracy downstream.
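One way to decide which files need OCR is to check how much text the normal extraction produced. The sketch below assumes a page with fewer than 50 characters has no usable text layer and treats a PDF as scanned when most pages look like that. Both thresholds are guesses to tune against your own documents. The ocr_pdf helper is hypothetical; it relies on pdf2image (which needs Poppler installed) and pytesseract (which needs the Tesseract binary).

```python
def needs_ocr(page_texts: list, min_chars_per_page: int = 50) -> bool:
    """Guess whether a PDF is scanned: most pages yield almost no text."""
    if not page_texts:
        return True
    sparse = sum(
        1 for text in page_texts
        if len((text or "").strip()) < min_chars_per_page
    )
    return sparse > len(page_texts) / 2


def ocr_pdf(pdf_path: str, dpi: int = 300) -> str:
    """Render each page to an image and run Tesseract OCR over it."""
    from pdf2image import convert_from_path  # requires Poppler
    import pytesseract  # requires the Tesseract binary

    images = convert_from_path(pdf_path, dpi=dpi)
    return "\n\n".join(pytesseract.image_to_string(image) for image in images)
```

Running needs_ocr on the per-page text first keeps the slow OCR path for the documents that actually need it.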
Pro tip ✅
Always strip headers, footers, and page numbers from extracted text before sending to Claude. Repeated boilerplate (“Page 12 of 47 | CONFIDENTIAL | Acme Corp”) inflates your token count without adding any information. A simple regex pass can trim thousands of tokens per document across a large batch.
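The regex pass can be as small as the sketch below. The two patterns are examples only, written for the footer style quoted above. You will need to adapt them to whatever boilerplate your own documents repeat.

```python
import re

# Hypothetical boilerplate patterns; adjust them to your own documents.
BOILERPLATE_PATTERNS = [
    # Footers such as "Page 12 of 47 | CONFIDENTIAL | Acme Corp"
    re.compile(r"^\s*Page \d+ of \d+.*$", re.IGNORECASE),
    # A bare page number sitting alone on a line
    re.compile(r"^\s*\d+\s*$"),
]


def strip_boilerplate(text: str) -> str:
    """Drop lines that match any boilerplate pattern; keep everything else."""
    kept = [
        line for line in text.splitlines()
        if not any(pattern.match(line) for pattern in BOILERPLATE_PATTERNS)
    ]
    return "\n".join(kept)
```

The bare-number pattern is aggressive: it would also remove a legitimate line that contains only a number, such as a table cell. Spot-check a few cleaned documents before running it across a full batch.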
Step 2: Build Your Extraction Prompts
This is where the quality of your pipeline lives or dies. Generic prompts get generic output. The prompts below are specific, structured, and designed to return consistent JSON that you can parse programmatically. Use them as-is or adapt them to your firm’s specific risk priorities.
Master extraction prompt for any contract type:
You are a legal document analyst. Review the contract below and extract the following information. Return your response as valid JSON only, with no additional text before or after the JSON object.
Extract:
- contract_type: The type of agreement (NDA, Employment, Vendor, SaaS, MSA, etc.)
- effective_date: When the contract takes effect (ISO 8601 format if determinable, else the text as written)
- parties: Array of objects with "name" and "role" for each party
- term_length: Duration of the agreement
- termination_clauses: Array of objects, each with "clause_text" (verbatim, max 200 words) and "termination_type" (e.g., "for cause", "for convenience", "automatic expiry")
- auto_renewal: Object with "exists" (boolean), "notice_period_days" (integer or null), and "clause_text" (verbatim if exists)
- payment_terms: Object with "amounts" (array of described amounts), "schedule" (text), and "late_payment_penalties" (text or null)
- liability_cap: Object with "exists" (boolean), "cap_amount" (text), and "clause_text" (verbatim if exists)
- indemnification: Array of key indemnification obligations with "party" and "obligation_summary"
- governing_law: Jurisdiction
- risk_flags: Array of objects with "flag_type" and "description" for any unusual, one-sided, or potentially problematic clauses
- missing_standard_clauses: Array of clause types typically expected in this contract type that appear absent
CONTRACT TEXT:
{{CONTRACT_TEXT}}
This prompt usually returns clean, parseable JSON when the contract text is readable, but treat that as the common case rather than a guarantee, and make sure your parser handles the occasional malformed reply. The risk_flags field is where Claude earns its keep: it can surface unlimited liability provisions, unusually broad IP assignment clauses, and lopsided termination rights that a rushed human reviewer might miss at 11 PM.
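Because the downstream code depends on these field names, it can help to check each parsed reply against the list in the master prompt. This is a sketch of one such check. The REQUIRED_FIELDS set simply mirrors the fields listed above, and missing_fields is a hypothetical helper name.

```python
# Field names copied from the master extraction prompt.
REQUIRED_FIELDS = {
    "contract_type", "effective_date", "parties", "term_length",
    "termination_clauses", "auto_renewal", "payment_terms", "liability_cap",
    "indemnification", "governing_law", "risk_flags",
    "missing_standard_clauses",
}


def missing_fields(extraction: dict) -> list:
    """Return the master-prompt fields that are absent from one extraction."""
    return sorted(REQUIRED_FIELDS - set(extraction))
```

An empty list means every expected key is present. It says nothing about whether the values are correct, which is what the QA step later in this guide is for.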
NDA-specific prompt with additional fields:
You are a legal document analyst specializing in confidentiality agreements. Review the NDA below and return a JSON object only.
Extract all fields from a standard contract review, plus these NDA-specific fields:
- confidentiality_scope: What information is covered and what is excluded
- disclosure_permitted_to: Who the receiving party may share confidential information with
- obligations_on_breach: What the disclosing party can seek if the NDA is breached
- return_or_destroy: Whether there is a clause requiring return or destruction of confidential materials, and the timeline
- residuals_clause: Boolean — does a residuals clause exist? If yes, quote it verbatim. (A residuals clause allows the receiving party to use information retained in unaided memory — this is a significant carve-out.)
- one_way_or_mutual: Whether confidentiality obligations flow one-way or both directions
- survival_period: How long confidentiality obligations survive termination
Return valid JSON only. No text outside the JSON object.
NDA TEXT:
{{CONTRACT_TEXT}}
Warning ⚠️
Residuals clauses are one of the most commonly missed high-risk provisions in NDAs. They effectively allow a recipient to use anything they can “remember” from confidential materials, gutting the agreement’s protection in practice. The prompt above explicitly asks for this field, so Claude checks for it in every NDA. Still, spot-check the results during QA rather than assuming nothing slipped through.
Employment agreement prompt:
You are a legal document analyst reviewing an employment agreement. Extract the following and return as JSON only.
Standard fields (contract_type, effective_date, parties, governing_law, risk_flags, missing_standard_clauses) plus:
- compensation: Object with "base_salary", "bonus_structure", "equity" (if any), and "benefits_summary"
- at_will_employment: Boolean — is this an at-will arrangement?
- notice_period: Required notice for termination by either party
- non_compete: Object with "exists" (boolean), "geographic_scope", "duration", and "clause_text" (verbatim)
- non_solicitation: Object with "exists" (boolean), "covers_employees" (boolean), "covers_customers" (boolean), "duration"
- ip_assignment: Summary of what intellectual property the employee assigns to the employer — flag if scope is unusually broad (e.g., covers work done on personal time)
- arbitration_clause: Object with "exists" (boolean), "class_action_waiver" (boolean), "clause_text" (verbatim if exists)
- severance: Described severance terms or null if none
Return valid JSON only.
EMPLOYMENT AGREEMENT TEXT:
{{CONTRACT_TEXT}}
Vendor/SaaS contract prompt:
You are a legal document analyst reviewing a vendor or SaaS service agreement. Extract the following and return as JSON only.
Standard fields plus:
- services_description: What the vendor is contracted to provide
- sla_commitments: Array of service level commitments with "metric" and "target"
- sla_remedies: What remedies exist if SLAs are breached (credits, termination rights, etc.)
- data_handling: How customer data is handled, stored, and what rights the vendor has to use it
- data_breach_notification: Timeline and process for breach notification, or null if absent
- price_escalation: Any provisions allowing the vendor to increase pricing, and under what conditions
- most_favored_customer: Boolean — does an MFC clause exist?
- source_code_escrow: Boolean
- limitation_of_liability: Object with "cap_basis" (e.g., "fees paid in prior 12 months"), "excluded_damages" (e.g., consequential, lost profits), and "clause_text" (verbatim)
- termination_for_convenience: Object with "customer_right" (boolean), "vendor_right" (boolean), and notice periods for each
- renewal_price_lock: Whether pricing is locked at renewal or subject to change
Return valid JSON only.
VENDOR CONTRACT TEXT:
{{CONTRACT_TEXT}}
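With four templates in play, a small registry keeps the choice of prompt in one place. The sketch below uses short placeholder strings where the full prompt texts from this section would go. The keys ("master", "nda", "employment", "vendor") and the fill_prompt helper are illustrative names.

```python
# Placeholder strings; paste the full prompt texts from this section here.
PROMPT_TEMPLATES = {
    "master": "MASTER PROMPT GOES HERE\nCONTRACT TEXT:\n{{CONTRACT_TEXT}}",
    "nda": "NDA PROMPT GOES HERE\nNDA TEXT:\n{{CONTRACT_TEXT}}",
    "employment": "EMPLOYMENT PROMPT GOES HERE\nEMPLOYMENT AGREEMENT TEXT:\n{{CONTRACT_TEXT}}",
    "vendor": "VENDOR PROMPT GOES HERE\nVENDOR CONTRACT TEXT:\n{{CONTRACT_TEXT}}",
}


def fill_prompt(kind: str, contract_text: str) -> str:
    """Pick the template for a contract kind, falling back to the master prompt."""
    template = PROMPT_TEMPLATES.get(kind.lower(), PROMPT_TEMPLATES["master"])
    return template.replace("{{CONTRACT_TEXT}}", contract_text)
```

Falling back to the master prompt means an unrecognized contract type still gets a general extraction instead of failing outright.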
Step 3: Set Up Batch Processing
The Anthropic Batch API is the right tool here. It’s designed for exactly this: you submit a batch of requests, Anthropic processes them asynchronously, and you collect results when they’re ready. Batch requests are billed at a discount to synchronous calls (50% at the time of writing; check current pricing), and they don’t eat into your rate limits the same way rapid-fire requests do.
Here’s the core structure for submitting a contract batch in Python:
import json
import os
import re
import time

import anthropic

# Read the key from the ANTHROPIC_API_KEY environment variable instead of hard-coding it.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])


def make_custom_id(filename: str) -> str:
    # At the time of writing, custom_id accepts only letters, digits, hyphens,
    # and underscores (max 64 characters), so "acme_msa.pdf" becomes "acme_msa_pdf".
    # Make sure distinct filenames still map to distinct IDs.
    return re.sub(r"[^A-Za-z0-9_-]", "_", filename)[:64]


def build_batch_requests(contracts: dict, prompt_template: str) -> list:
    """
    contracts: dict of {filename: contract_text}
    prompt_template: string with {{CONTRACT_TEXT}} placeholder
    """
    requests = []
    for filename, contract_text in contracts.items():
        filled_prompt = prompt_template.replace("{{CONTRACT_TEXT}}", contract_text)
        requests.append({
            "custom_id": make_custom_id(filename),
            "params": {
                "model": "claude-opus-4-6",
                "max_tokens": 2048,
                "messages": [
                    {"role": "user", "content": filled_prompt}
                ]
            }
        })
    return requests


# Submit the batch
def submit_contract_batch(contracts: dict, prompt_template: str):
    requests = build_batch_requests(contracts, prompt_template)
    batch = client.messages.batches.create(requests=requests)
    print(f"Batch submitted. ID: {batch.id}")
    print(f"Processing {len(requests)} contracts.")
    return batch.id


# Poll for results
def collect_batch_results(batch_id: str) -> dict:
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        status = batch.processing_status
        print(f"Status: {status} | {batch.request_counts}")
        if status == "ended":
            break
        time.sleep(60)  # Check every minute

    # Results are keyed by custom_id (the sanitized filename).
    results = {}
    for result in client.messages.batches.results(batch_id):
        if result.result.type == "succeeded":
            raw_text = result.result.message.content[0].text
            try:
                results[result.custom_id] = json.loads(raw_text)
            except json.JSONDecodeError:
                results[result.custom_id] = {"error": "JSON parse failed", "raw": raw_text}
        else:
            # Only "errored" results carry an error object; "canceled" and "expired" do not.
            detail = getattr(result.result, "error", None)
            results[result.custom_id] = {"error": result.result.type, "detail": str(detail)}
    return results
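Even with a "JSON only" instruction, a reply sometimes arrives wrapped in a Markdown code fence or with a sentence of preamble, and a plain json.loads call rejects it. The sketch below is one tolerant parser you could use in its place. parse_json_response is a hypothetical helper, and its fallback of taking the outermost braces is a heuristic, not a guarantee.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence marker


def parse_json_response(raw_text: str) -> dict:
    """Parse a model reply as JSON, tolerating fences or stray prose around it."""
    text = raw_text.strip()
    # Remove a leading fence (optionally tagged "json") and a trailing fence.
    text = re.sub(r"^" + FENCE + r"(?:json)?\s*", "", text)
    text = re.sub(r"\s*" + FENCE + r"$", "", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span, if there is one.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise
```

If the fallback also fails, the JSONDecodeError still propagates, so the existing error branch that stores the raw text keeps working.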
Pro tip ✅
Set max_tokens to at least 2048 for contract extraction. Shorter limits will cause Claude to truncate JSON mid-object, which breaks your parser. For very complex contracts with many flagged clauses, bump it to 4096. The extra cost is negligible compared to a broken pipeline you discover at 2 AM.
Step 4: Post-Process and Surface Insights
Once your batch results come back, the raw JSON is useful but not immediately actionable for most lawyers. A simple post-processing step converts it into something more usable: a master spreadsheet flagging every document with active risk flags, a summary report sorted by risk severity, and individual contract summaries for attorney review.
import pandas as pd


def build_risk_summary(results: dict) -> pd.DataFrame:
    rows = []
    for filename, data in results.items():
        if "error" in data:
            continue  # Failed extractions are skipped here; review them separately.
        # "or {}" and "or []" guard against fields Claude returned as null.
        auto_renewal = data.get("auto_renewal") or {}
        liability_cap = data.get("liability_cap") or {}
        risk_flags = data.get("risk_flags") or []
        row = {
            "file": filename,
            "contract_type": data.get("contract_type", "Unknown"),
            "parties": ", ".join(p.get("name", "") for p in data.get("parties") or []),
            "governing_law": data.get("governing_law", ""),
            "auto_renewal": auto_renewal.get("exists", False),
            "auto_renewal_notice_days": auto_renewal.get("notice_period_days"),
            "liability_cap_exists": liability_cap.get("exists", False),
            "risk_flag_count": len(risk_flags),
            "risk_flags_summary": " | ".join(f.get("description", "") for f in risk_flags),
            "missing_clauses": ", ".join(data.get("missing_standard_clauses") or []),
        }
        rows.append(row)
    df = pd.DataFrame(rows)
    if df.empty:
        return df  # Nothing to sort if every extraction failed.
    df = df.sort_values("risk_flag_count", ascending=False)
    return df
Sort by risk_flag_count descending and your legal team immediately knows which contracts need the most attention. Auto-renewal contracts with short notice periods bubble up. Agreements missing limitation-of-liability clauses get flagged. The 20 contracts that need a real lawyer’s eyes get identified from a pile of 200.
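Sorting by flag count alone does not put short-notice auto-renewals at the top, so a separate filter is useful. This sketch works directly on the results dict using only the standard library. The 60-day cutoff is an arbitrary assumption, and short_notice_renewals is a hypothetical helper name.

```python
def short_notice_renewals(results: dict, max_notice_days: int = 60) -> list:
    """List (filename, notice_days) for auto-renewing contracts with short notice."""
    hits = []
    for filename, data in results.items():
        renewal = data.get("auto_renewal") or {}
        days = renewal.get("notice_period_days")
        if renewal.get("exists") and isinstance(days, int) and days <= max_notice_days:
            hits.append((filename, days))
    # Shortest notice period first, since those deadlines bite soonest.
    return sorted(hits, key=lambda item: item[1])
```

Contracts where Claude found an auto-renewal but could not determine the notice period (a null value) are excluded here. They are worth listing separately, since a missing notice period is itself a reason for a closer look.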
Step 5: Run a Quality Check Prompt
Before you trust the pipeline on live client work, run it against a handful of contracts you’ve already reviewed manually. This catches prompt edge cases — contracts in unusual formats, agreements with non-standard clause numbering, documents where key terms appear in schedules rather than the main body.
Use this verification prompt to cross-check Claude’s extraction against your known ground truth:
I have already reviewed this contract manually and know the following facts:
- Auto-renewal clause exists with 30-day notice period
- Liability is capped at fees paid in prior 6 months
- Governing law is New York
Review the contract below and tell me:
1. Did you correctly identify these three facts? Quote the exact clause text for each.
2. Did you identify any risk flags beyond what I noted above?
3. What (if anything) do you think you might have missed or misclassified, and why?
Be direct about any uncertainty.
CONTRACT TEXT:
{{CONTRACT_TEXT}}
Pro tip ✅
Claude will tell you when it’s uncertain if you ask directly. The default extraction prompts above return structured data confidently — which is what you want for bulk processing. The verification prompt above is specifically for your QA phase, where you want Claude to surface its own doubts. Use different prompts for different purposes.
Avoid 🚫
Don’t skip the QA phase because your first batch “looked right.” Contract language is deliberately varied — the same concept appears as “termination for convenience,” “cancellation at will,” “right to terminate without cause,” and a dozen other phrasings depending on which firm drafted the agreement. Test against your actual document corpus before relying on this pipeline for anything consequential.
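Alongside the verification prompt, you can compare the structured output against your manual review in code. The sketch below assumes you record known facts as dotted field paths such as "auto_renewal.notice_period_days". That convention and the compare_to_ground_truth name are illustrative, not something the pipeline above defines.

```python
def compare_to_ground_truth(extracted: dict, expected: dict) -> dict:
    """Return {dotted_field: (expected, actual)} for every mismatch."""
    mismatches = {}
    for dotted_field, expected_value in expected.items():
        actual = extracted
        # Walk down nested objects one key at a time.
        for key in dotted_field.split("."):
            actual = actual.get(key) if isinstance(actual, dict) else None
        if actual != expected_value:
            mismatches[dotted_field] = (expected_value, actual)
    return mismatches
```

Exact matching suits booleans and integers. For free-text fields such as governing_law it is brittle ("New York" versus "State of New York"), so expect to review those mismatches by eye.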
Prompt Variations That Change the Output
Small changes to your extraction prompts produce meaningfully different results. Here are three worth knowing about:
Risk severity scoring: Adding a "risk_severity": "high/medium/low" field to your risk_flags array makes triage much faster. Add this line to any prompt: “For each risk flag, include a ‘risk_severity’ field rated ‘high’ (potential financial exposure or loss of key rights), ‘medium’ (suboptimal but manageable), or ‘low’ (minor drafting issue).”
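Once the severity field is in the output, ordering flags for triage takes a few lines. This is a sketch that assumes the three labels from the prompt line above. Flags with a missing or unexpected label sort last.

```python
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def sort_flags_by_severity(risk_flags: list) -> list:
    """Order risk flags high, medium, low; unknown or missing severities last."""
    return sorted(
        risk_flags,
        key=lambda flag: SEVERITY_ORDER.get(
            str(flag.get("risk_severity", "")).lower(), len(SEVERITY_ORDER)
        ),
    )
```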
Comparative analysis across a contract series: If you’re reviewing a stack of agreements with the same counterparty — say, an MSA plus several SOWs — a synthesis prompt works better than individual extractions:
The following documents are related agreements between the same parties. After extracting structured data from each document individually, provide a "cross_document_analysis" section identifying: (1) any conflicts or inconsistencies between documents, (2) obligations in one document not adequately addressed in another, (3) your overall assessment of risk concentration across the full agreement set.
DOCUMENT 1 — MASTER SERVICES AGREEMENT:
{{MSA_TEXT}}
DOCUMENT 2 — STATEMENT OF WORK #1:
{{SOW1_TEXT}}
Return all output as a single valid JSON object.
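For a series with more than two documents, you can assemble the prompt from a list instead of maintaining numbered placeholders by hand. This is a sketch under that assumption. build_series_prompt is a hypothetical helper, and the combined text still has to fit inside the context window.

```python
def build_series_prompt(instructions: str, documents: list) -> str:
    """Build one prompt from (label, text) tuples, in the order given."""
    parts = [instructions.strip(), ""]
    for number, (label, text) in enumerate(documents, start=1):
        parts.append(f"DOCUMENT {number} - {label.upper()}:")
        parts.append(text.strip())
        parts.append("")
    parts.append("Return all output as a single valid JSON object.")
    return "\n".join(parts)
```

Put the MSA first so that later statements of work are read against it.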
Jurisdiction-specific flagging: If your firm focuses on a specific jurisdiction, add a line like: “Flag any clauses that may be unenforceable or require modification under California law, particularly regarding non-compete provisions, arbitration requirements, and IP assignment.” Claude’s training includes substantial legal knowledge, and jurisdiction-specific prompting surfaces relevant issues that generic prompts miss.
Pro tip ✅
If you’re processing NDAs from technology companies, add this line to your NDA prompt: “Pay particular attention to any definitions of ‘Confidential Information’ that include metadata, usage data, or aggregated analytics derived from the receiving party’s use of confidential materials — this is increasingly common in SaaS-era NDAs and expands the scope significantly.” Generic prompts miss this.
What Realistic Performance Looks Like
The Batch API processes requests asynchronously, and turnaround time depends on current API load, so treat any estimate as rough. For a batch of 100 standard-length contracts (20–30 pages each), one to three hours is a reasonable planning figure during normal usage periods, and many batches finish sooner. Longer contracts or larger batches take more time. The hard limit is 24 hours: according to the API documentation, requests still unprocessed at that point expire rather than complete, so check the request counts for expired items before you assume a batch is whole.
Cost scales with token usage. A 30-page vendor contract might run 18,000 input tokens plus ~1,500 output tokens for the structured extraction. At Opus 4.6 Batch API pricing, processing 100 such contracts costs a fraction of what a single hour of attorney review runs. The economics are not subtle.
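The arithmetic is easy to run yourself. The sketch below takes the per-contract token figures from the paragraph above and leaves the prices as parameters. The prices in the usage line are placeholders, not Anthropic's actual rates, so substitute the current batch pricing for your model.

```python
def estimate_batch_cost(num_contracts: int, input_tokens_each: int,
                        output_tokens_each: int, input_price_per_mtok: float,
                        output_price_per_mtok: float) -> tuple:
    """Return (total_input_tokens, total_output_tokens, estimated_cost)."""
    total_in = num_contracts * input_tokens_each
    total_out = num_contracts * output_tokens_each
    cost = (total_in / 1_000_000 * input_price_per_mtok
            + total_out / 1_000_000 * output_price_per_mtok)
    return total_in, total_out, cost


# 100 contracts at 18,000 input and 1,500 output tokens each.
# The 1.00 and 5.00 prices per million tokens are placeholders.
total_in, total_out, cost = estimate_batch_cost(100, 18_000, 1_500, 1.00, 5.00)
print(total_in, total_out)  # 1800000 150000
```

That is 1.8 million input tokens and 150,000 output tokens for the whole batch, before counting the prompt template itself, which adds a few hundred tokens to each request.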
Warning ⚠️
The Batch API caps both the number of requests and the total size of a batch. At the time of writing, Anthropic’s documentation lists 100,000 requests or 256 MB per batch, whichever is reached first, so confirm the current figures before you size your batches. For most law firm workloads, you’ll hit neither limit. But if you’re processing a full M&A due diligence set with thousands of long documents, plan to split your processing into multiple batches and build a queue manager to handle the sequencing.
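Splitting a large request list into limit-sized batches can look like the sketch below. The limits are passed in as parameters instead of being hard-coded, and size is measured as serialized JSON bytes by default. If the limit you are working to is expressed in tokens, pass a token-estimating function as size_of. All names here are illustrative.

```python
import json


def serialized_bytes(request: dict) -> int:
    """Default size measure: length of the request as UTF-8 encoded JSON."""
    return len(json.dumps(request).encode("utf-8"))


def split_into_batches(requests: list, max_requests: int, max_size: int,
                       size_of=serialized_bytes) -> list:
    """Group requests into consecutive batches that respect both limits."""
    batches, current, current_size = [], [], 0
    for request in requests:
        size = size_of(request)
        # Start a new batch when adding this request would break either limit.
        if current and (len(current) >= max_requests
                        or current_size + size > max_size):
            batches.append(current)
            current, current_size = [], 0
        current.append(request)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each inner list can then be submitted as its own batch. A single request larger than max_size still ends up in a batch by itself, so oversized documents need handling upstream.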
The Part Where a Human Still Has to Read the Contract
This pipeline does not replace attorney review. It organizes and accelerates it. The value is that your lawyers start each review already knowing: which contracts have auto-renewal traps, which ones lack liability caps, which ones have non-standard indemnification terms, and which ones have risk flags worth examining first. Instead of reading every contract top-to-bottom to find out whether it matters, they read the five contracts that actually matter and skim the summaries for the rest.
That shift — from full document review to targeted review — is where the real time savings come from. The AI handles the extraction. The lawyer handles the judgment. Neither substitutes for the other.
Get It Running This Week
The full stack here is: pdfplumber for extraction, the Anthropic Python SDK for batch submission, the prompts above dropped directly into your prompt_template variable, and pandas for the output spreadsheet. That’s three libraries and maybe 200 lines of Python to go from a folder of contracts to a prioritized review list.
Start with 10 contracts you already know well. Run the batch, compare the output against your existing knowledge of those documents, and tune the prompts where Claude misses something. Two rounds of prompt iteration is usually enough to get extraction accuracy to a level where the output genuinely speeds up rather than complicates your workflow. Then scale up. The Batch API doesn’t mind.


