Convergence Study Design
Status
This is an open research design, not a finished protocol. It is intended as a starting point for collaboration. The framework, phases, and validation criteria below are all open to revision by a research partner with expertise in empirical study design.
1. The Problem
A single independent researcher built a dataset of 264 confirmed Section 1328(f) violations across 7 federal judicial districts. The dataset was produced using open-source tools applied to public federal court records.
The central question any reviewer will ask:
How do we know these results aren't an artifact of one person's methodology, selection bias, or errors?
The convergence study answers this by separating the person who built the tools from the people validating the results.
2. The Core Idea
Two independent data streams are being generated right now:
| Stream | Source | Controller | Status |
|---|---|---|---|
| Ground truth | 264 verified cases, 7 districts | Dataset builder (private) | Complete |
| Independent verification | GitHub cloners running the screener | Self-selected public users | In progress (198 unique cloners) |
If independent users running the same tool against the same public records in overlapping districts produce the same results, the findings are externally validated without requiring trust in any single researcher.
If results diverge, the study identifies exactly where and why, which is equally valuable.
3. Why This Is Novel
- To our knowledge, crowdsourced legal data verification has not been done before. Legal empirical studies typically rely on one research team collecting, coding, and analyzing data. This design distributes the collection phase across independent actors.
- The tool is deterministic. Given the same PACER case record, the screener will always return the same result. There is no subjective coding and no human judgment in the verification step, which makes convergence testing clean. (A minimal sketch of the date logic appears after this list.)
- The ground truth is blinded. Independent users do not know the expected result for any given case. The screener returns "eligible" or "barred" based on dates. Users cannot steer their results toward a known answer because no known answer is published.
- The methodology is a standalone contribution. Independent of the 1328(f) findings, a validated framework for crowdsourced court record verification is publishable on its own as a methods paper.
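To make the determinism claim concrete, here is a minimal sketch of the § 1328(f) date logic in Python. The function name `screen_1328f`, its signature, and the use of `python-dateutil` for year arithmetic are illustrative assumptions, not the actual screener's API. The statute's lookback runs from the filing date of the prior case to the order for relief in the current case.

```python
from datetime import date
from dateutil.relativedelta import relativedelta  # pip install python-dateutil

# Lookback windows from 11 U.S.C. § 1328(f): 4 years for a prior discharge in
# a chapter 7, 11, or 12 case; 2 years for a prior chapter 13 case. Measured
# from the FILING date of the prior case to the order for relief here.
LOOKBACK_YEARS = {7: 4, 11: 4, 12: 4, 13: 2}

def screen_1328f(current_filed: date, prior_chapter: int,
                 prior_filed: date, prior_discharged: bool) -> str:
    """Hypothetical screener: 'barred' or 'eligible' for a Chapter 13 case."""
    if not prior_discharged:
        return "eligible"  # § 1328(f) applies only if a discharge was received
    lookback = relativedelta(years=LOOKBACK_YEARS[prior_chapter])
    return "barred" if prior_filed >= current_filed - lookback else "eligible"

# Prior Chapter 7 filed about 3 years before the current Chapter 13: barred.
print(screen_1328f(date(2024, 6, 1), 7, date(2021, 6, 15), True))
```

Because the function is a pure mapping from dates to a label, two users screening the same record must get the same output, which is the property the convergence test relies on.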
4. Proposed Phases
Phase 1: Characterize existing data streams
Determine which districts the 198 GitHub cloners have run the screener against. Identify overlap with the 7-district ground truth set. Measure: how many independent verification data points exist today, without any additional collection?
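As a sketch of what this characterization could look like, assuming cloner activity can be reduced to `(user_id, district, case_id)` records. The district codes below are placeholders, not the actual seven ground-truth districts:

```python
from collections import Counter

# Placeholder district codes and screening records (user_id, district, case_id).
GROUND_TRUTH_DISTRICTS = {"D01", "D02", "D03", "D04", "D05", "D06", "D07"}
screenings = [
    ("u1", "D01", "24-30111"),
    ("u2", "D01", "24-30111"),
    ("u2", "D42", "24-10455"),
]

per_district = Counter(district for _, district, _ in screenings)
overlap = {d: n for d, n in per_district.items() if d in GROUND_TRUTH_DISTRICTS}
print(f"districts screened: {len(per_district)}, in overlap: {len(overlap)}")
print(f"verification data points inside the overlap: {sum(overlap.values())}")
```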
Phase 2: Compare results in shared districts
For districts covered by both the ground-truth set and independent users, compare results case by case. Metrics: agreement rate, false positive rate, false negative rate. If agreement exceeds a pre-specified threshold (to be set during research design), the tool is validated.
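A minimal sketch of the case-by-case comparison, assuming both streams can be reduced to `case_id -> result` maps. The case IDs and results below are invented for illustration:

```python
# Invented results; "barred" marks a flagged § 1328(f) violation.
ground_truth = {"24-1": "barred", "24-2": "eligible", "24-3": "barred"}
independent  = {"24-1": "barred", "24-2": "barred",   "24-3": "barred"}

shared = ground_truth.keys() & independent.keys()
agree      = sum(ground_truth[c] == independent[c] for c in shared)
false_pos  = sum(independent[c] == "barred"   and ground_truth[c] == "eligible" for c in shared)
false_neg  = sum(independent[c] == "eligible" and ground_truth[c] == "barred"   for c in shared)
actual_pos = sum(ground_truth[c] == "barred"   for c in shared)
actual_neg = sum(ground_truth[c] == "eligible" for c in shared)

print(f"agreement rate: {agree / len(shared):.1%}")
if actual_neg:
    print(f"false positive rate: {false_pos / actual_neg:.1%}")
if actual_pos:
    print(f"false negative rate: {false_neg / actual_pos:.1%}")
```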
Phase 3: Extend to new districts with institutional PACER access
Using fee-exempt access through a university affiliation, run the screener against a stratified random sample of the 391,951-case verification universe. Sample design: stratify by district, filing year, and prior-filer discharge rate. Target: statistically representative coverage across all 94 districts.
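A sketch of proportional stratified sampling, under the simplifying assumption that each record in the verification universe carries a district and filing year. The real design would add the prior-filer discharge-rate stratum, and a power analysis should set the sampling rate:

```python
import random
from collections import defaultdict

random.seed(42)  # fixed seed so the sample is reproducible and auditable

# Invented stand-ins for records from the 391,951-case verification universe.
universe = [("24-1", "D01", 2022), ("24-2", "D01", 2023), ("24-3", "D42", 2022)]

strata = defaultdict(list)
for case_id, district, year in universe:
    strata[(district, year)].append(case_id)

SAMPLING_RATE = 0.05  # placeholder; a power analysis should set this
sample = []
for cases in strata.values():
    k = max(1, round(len(cases) * SAMPLING_RATE))  # at least one per stratum
    sample.extend(random.sample(cases, k))
```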
Outputs: two papers, not one
Paper 1 (methods): The convergence framework itself. Can crowdsourced verification of court records produce reliable results? What are the conditions for validity? Applicable beyond 1328(f) to any court-record verification task.
Paper 2 (findings): The national 1328(f) violation rate, estimated from the expanded sample. Geographic variation analysis. Policy implications for Rule 4004 and discharge eligibility verification.
5. Validation Criteria (Open for Discussion)
What constitutes "convergence"? Proposed thresholds, subject to revision:
| Metric | Proposed threshold | Notes |
|---|---|---|
| Case-level agreement rate | ≥ 95% | Independent result matches ground truth for same case |
| District-level violation rate | Within 5 percentage points | Independent district estimate vs. ground truth district estimate |
| False positive rate | ≤ 2% | Cases flagged as violations that are not (dates outside bar window) |
| False negative rate | To be measured | Cases missed by the screener that are actual violations |
These thresholds are placeholders. A research partner with experience in validation study design should set the actual criteria before data collection begins.
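For illustration, one way to operationalize the case-level agreement threshold is a one-sided exact binomial test via `scipy.stats.binomtest`. The counts below are invented, not Phase 2 results:

```python
from scipy.stats import binomtest

# Invented tally: 472 of 490 shared cases agreed.
agreed, shared = 472, 490

# One-sided exact test of H0: true agreement rate <= 0.95.
result = binomtest(agreed, shared, p=0.95, alternative="greater")
print(f"observed agreement: {agreed / shared:.1%}")
print(f"p-value against the 95% floor: {result.pvalue:.3f}")
print(f"one-sided 95% lower bound: {result.proportion_ci(confidence_level=0.95).low:.3f}")
```

Whether an exact test, a confidence-interval floor, or something else is the right criterion is exactly the kind of decision a research partner should make.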
6. Known Threats to Validity
| Threat | Description | Mitigation |
|---|---|---|
| Selection bias | GitHub cloners are self-selected. They found the tool through Reddit, search, or word of mouth. They may not be representative. | Phase 3 uses stratified random sampling with institutional access, removing self-selection entirely. |
| Tool error | The screener could have bugs that produce systematic errors. | The 264-case ground truth was manually verified against PACER dockets. Any screener error that contradicts manual verification would surface in Phase 2. |
| PACER data quality | PACER records may contain errors (incorrect dates, missing cases, miscoded chapters). | This affects all PACER-based research equally. The convergence design tests whether independent users encounter the same data quality, not whether PACER itself is perfect. |
| Temporal drift | PACER records can be amended. A case screened in March may show different data than the same case screened in June. | Timestamp all screenings. Compare only results generated within the same time window. |
| Non-independence | Some GitHub cloners may share results with each other, compromising independence. | The screener output is deterministic. If two users get the same result, it is because the data is the same, not because they compared notes. Non-independence does not affect a deterministic tool. |
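The temporal-drift mitigation could be operationalized as a pairing rule over timestamped screenings. A sketch, with an invented log and a placeholder 30-day window:

```python
from datetime import datetime, timedelta

# Invented screening log: (case_id, result, timestamp).
log = [
    ("24-30111", "barred",   datetime(2026, 3, 1, 10, 0)),
    ("24-30111", "barred",   datetime(2026, 3, 5, 9, 30)),
    ("24-30111", "eligible", datetime(2026, 6, 2, 14, 0)),  # possible amendment
]

WINDOW = timedelta(days=30)  # placeholder comparison window

def comparable(a, b):
    """Same case, and both screenings generated within the same time window."""
    return a[0] == b[0] and abs(a[2] - b[2]) <= WINDOW

pairs = [(a, b) for i, a in enumerate(log) for b in log[i + 1:] if comparable(a, b)]
print(f"{len(pairs)} comparable pair(s)")  # the June screening is excluded
```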
7. What a Research Partner Brings
The dataset builder supplies the infrastructure; the research partner designs the study. Specifically:
- Sample design: How to stratify the Phase 3 sample. Which districts, how many cases per district, what filing years.
- Statistical framework: What tests to use for convergence. How to handle partial overlap. Power analysis for minimum sample sizes (a worked example follows this list).
- Validation criteria: Setting the actual thresholds in Section 5 based on precedent in empirical legal studies or adjacent fields.
- Publication strategy: Whether to publish one paper or two. Which journal. How to frame the methods contribution vs. the substantive findings.
- IRB assessment: Whether this study requires institutional review, given that all data is from public federal court records with no human subjects contact.
- PACER fee exemption: Institutional affiliation that qualifies for the AO's researcher fee exemption, enabling Phase 3 at zero marginal cost.
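As one concrete example of the power-analysis work, here is the standard normal-approximation sample size for estimating a proportion. The 3% violation rate and the ±1 percentage point margin below are placeholders, not findings:

```python
from math import ceil
from statistics import NormalDist

def n_for_proportion(p_hat: float, margin: float, confidence: float = 0.95) -> int:
    """Minimum n to estimate a proportion near p_hat to within +/- margin
    at the given confidence level (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p_hat * (1 - p_hat) / margin**2)

# Placeholder: a ~3% violation rate estimated to within +/- 1 point
print(n_for_proportion(0.03, 0.01))  # 1118 cases
```

The real analysis would be run per stratum and would also account for finite-population correction within small districts, a choice best left to the research partner.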
8. What Exists Today
| Component | Status |
|---|---|
| Ground truth dataset (264 cases, 7 districts) | Complete |
| Screening tool (open-source, deterministic) | Live, ranking #1 nationally |
| FJC national dataset (4.9M Ch. 13 cases) | Loaded, queryable |
| 391,951 verification universe identified | Complete |
| RSS real-time monitoring (all 94 districts) | Running |
| RECAP enrichment pipeline (16,000+ cases) | Running |
| Independent GitHub cloners | 198 unique as of March 25, 2026 |
| Research design for convergence test | This document (draft) |
| Institutional PACER access | Not yet available |
| Formal study protocol | Awaiting research partner |
This design is open
Every element on this page - the phases, the thresholds, the threats, the publication strategy - is a proposal, not a decision. The purpose of this document is to show that a rigorous validation framework is possible and that the infrastructure to execute it already exists. The research design itself should be shaped by someone with the expertise to do it right.