Convergence Study Design
Status
This is an open research design, not a finished protocol. It is intended as a starting point for collaboration. The framework, phases, and validation criteria below are all open to revision by a research partner with expertise in empirical study design.
1. The Problem
A single independent researcher built a dataset of 264 confirmed Section 1328(f) violations across 7 federal judicial districts. The dataset was produced using open-source tools applied to public federal court records.
The central question any reviewer will ask:
How do we know these results aren't an artifact of one person's methodology, selection bias, or errors?
The convergence study answers this by separating the person who built the tools from the people validating the results.
2. The Core Idea
Two independent data streams are being generated right now:
| Stream | Source | Controller | Status |
|---|---|---|---|
| Ground truth | 264 verified cases, 7 districts | Dataset builder (private) | Complete |
| Independent verification | GitHub cloners running the screener | Self-selected public users | In progress (198 unique cloners) |
If independent users running the same tool against the same public records in overlapping districts produce the same results, the findings are externally validated without requiring trust in any single researcher.
If results diverge, the study identifies exactly where and why, which is equally valuable.
3. Why This Is Novel
- To our knowledge, crowdsourced legal data verification has not been done before. Legal empirical studies typically rely on one research team collecting, coding, and analyzing data. This design distributes the collection phase across independent actors.
- The tool is deterministic. Given the same PACER case record, the screener will always return the same result. There is no subjective coding and no human judgment in the verification step, which makes convergence testing clean. (A minimal sketch of the date logic appears after this list.)
- The ground truth is blinded. Independent users do not know the expected result for any given case. The screener returns "eligible" or "barred" based on dates. Users cannot steer their results toward a known answer because no known answer is published.
- The methodology is a standalone contribution. Independent of the 1328(f) findings, a validated framework for crowdsourced court record verification is publishable on its own as a methods paper.
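To make the determinism claim concrete, here is a minimal sketch of the § 1328(f) date logic in Python. The function name `screen_1328f`, its signature, and the use of `python-dateutil` for year arithmetic are illustrative assumptions, not the actual screener's API. The statute's lookback runs from the filing date of the prior case to the order for relief in the current case.

```python
from datetime import date
from dateutil.relativedelta import relativedelta  # pip install python-dateutil

# Lookback windows from 11 U.S.C. § 1328(f): 4 years for a prior discharge in
# a chapter 7, 11, or 12 case; 2 years for a prior chapter 13 case. Measured
# from the FILING date of the prior case to the order for relief here.
LOOKBACK_YEARS = {7: 4, 11: 4, 12: 4, 13: 2}

def screen_1328f(current_filed: date, prior_chapter: int,
                 prior_filed: date, prior_discharged: bool) -> str:
    """Hypothetical screener: 'barred' or 'eligible' for a Chapter 13 case."""
    if not prior_discharged:
        return "eligible"  # § 1328(f) applies only if a discharge was received
    lookback = relativedelta(years=LOOKBACK_YEARS[prior_chapter])
    return "barred" if prior_filed >= current_filed - lookback else "eligible"

# Prior Chapter 7 filed about 3 years before the current Chapter 13: barred.
print(screen_1328f(date(2024, 6, 1), 7, date(2021, 6, 15), True))
```

Because the function is a pure mapping from dates to a label, two users screening the same record must get the same output, which is the property the convergence test relies on.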
4. Proposed Phases
Phase 1: Characterize existing data streams
Determine which districts the 198 GitHub cloners have run the screener against. Identify overlap with the 7-district ground truth set. Measure: how many independent verification data points exist today, without any additional collection?
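As a sketch of what this characterization could look like, assuming cloner activity can be reduced to `(user_id, district, case_id)` records. The district codes below are placeholders, not the actual seven ground-truth districts:

```python
from collections import Counter

# Placeholder district codes and screening records (user_id, district, case_id).
GROUND_TRUTH_DISTRICTS = {"D01", "D02", "D03", "D04", "D05", "D06", "D07"}
screenings = [
    ("u1", "D01", "24-30111"),
    ("u2", "D01", "24-30111"),
    ("u2", "D42", "24-10455"),
]

per_district = Counter(district for _, district, _ in screenings)
overlap = {d: n for d, n in per_district.items() if d in GROUND_TRUTH_DISTRICTS}
print(f"districts screened: {len(per_district)}, in overlap: {len(overlap)}")
print(f"verification data points inside the overlap: {sum(overlap.values())}")
```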
Phase 2: Compare results in shared districts
For districts covered by both the ground-truth set and independent users, compare results case by case. Metrics: agreement rate, false positive rate, false negative rate. If agreement exceeds a pre-specified threshold (to be set during research design), the tool is validated.
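A minimal sketch of the case-by-case comparison, assuming both streams can be reduced to `case_id -> result` maps. The case IDs and results below are invented for illustration:

```python
# Invented results; "barred" marks a flagged § 1328(f) violation.
ground_truth = {"24-1": "barred", "24-2": "eligible", "24-3": "barred"}
independent  = {"24-1": "barred", "24-2": "barred",   "24-3": "barred"}

shared = ground_truth.keys() & independent.keys()
agree      = sum(ground_truth[c] == independent[c] for c in shared)
false_pos  = sum(independent[c] == "barred"   and ground_truth[c] == "eligible" for c in shared)
false_neg  = sum(independent[c] == "eligible" and ground_truth[c] == "barred"   for c in shared)
actual_pos = sum(ground_truth[c] == "barred"   for c in shared)
actual_neg = sum(ground_truth[c] == "eligible" for c in shared)

print(f"agreement rate: {agree / len(shared):.1%}")
if actual_neg:
    print(f"false positive rate: {false_pos / actual_neg:.1%}")
if actual_pos:
    print(f"false negative rate: {false_neg / actual_pos:.1%}")
```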
Phase 3: Extend to new districts with institutional PACER access
Using fee-exempt access through a university affiliation, run the screener against a stratified random sample of the 391,951-case verification universe. Sample design: stratify by district, filing year, and prior-filer discharge rate. Target: statistically representative coverage across all 94 districts.
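A sketch of proportional stratified sampling, under the simplifying assumption that each record in the verification universe carries a district and filing year. The real design would add the prior-filer discharge-rate stratum, and a power analysis should set the sampling rate:

```python
import random
from collections import defaultdict

random.seed(42)  # fixed seed so the sample is reproducible and auditable

# Invented stand-ins for records from the 391,951-case verification universe.
universe = [("24-1", "D01", 2022), ("24-2", "D01", 2023), ("24-3", "D42", 2022)]

strata = defaultdict(list)
for case_id, district, year in universe:
    strata[(district, year)].append(case_id)

SAMPLING_RATE = 0.05  # placeholder; a power analysis should set this
sample = []
for cases in strata.values():
    k = max(1, round(len(cases) * SAMPLING_RATE))  # at least one per stratum
    sample.extend(random.sample(cases, k))
```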
Outputs: two papers, not one
Paper 1 (methods): The convergence framework itself. Can crowdsourced verification of court records produce reliable results? What are the conditions for validity? Applicable beyond 1328(f) to any court-record verification task.
Paper 2 (findings): The national 1328(f) violation rate, estimated from the expanded sample. Geographic variation analysis. Policy implications for Rule 4004 and discharge eligibility verification.
5. Validation Criteria (Open for Discussion)
What constitutes "convergence"? Proposed thresholds, subject to revision:
| Metric | Proposed threshold | Notes |
|---|---|---|
| Case-level agreement rate | ≥ 95% | Independent result matches ground truth for same case |
| District-level violation rate | Within 5 percentage points | Independent district estimate vs. ground truth district estimate |
| False positive rate | ≤ 2% | Cases flagged as violations that are not (dates outside bar window) |
| False negative rate | To be measured | Cases missed by the screener that are actual violations |
These thresholds are placeholders. A research partner with experience in validation study design should set the actual criteria before data collection begins.
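For illustration, one way to operationalize the case-level agreement threshold is a one-sided exact binomial test via `scipy.stats.binomtest`. The counts below are invented, not Phase 2 results:

```python
from scipy.stats import binomtest

# Invented tally: 472 of 490 shared cases agreed.
agreed, shared = 472, 490

# One-sided exact test of H0: true agreement rate <= 0.95.
result = binomtest(agreed, shared, p=0.95, alternative="greater")
print(f"observed agreement: {agreed / shared:.1%}")
print(f"p-value against the 95% floor: {result.pvalue:.3f}")
print(f"one-sided 95% lower bound: {result.proportion_ci(confidence_level=0.95).low:.3f}")
```

Whether an exact test, a confidence-interval floor, or something else is the right criterion is exactly the kind of decision a research partner should make.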
6. Known Threats to Validity
| Threat | Description | Mitigation |
|---|---|---|
| Selection bias | GitHub cloners are self-selected. They found the tool through Reddit, search, or word of mouth. They may not be representative. | Phase 3 uses stratified random sampling with institutional access, removing self-selection entirely. |
| Tool error | The screener could have bugs that produce systematic errors. | The 264-case ground truth was manually verified against PACER dockets. Any screener error that contradicts manual verification would surface in Phase 2. |
| PACER data quality | PACER records may contain errors (incorrect dates, missing cases, miscoded chapters). | This affects all PACER-based research equally. The convergence design tests whether independent users encounter the same data quality, not whether PACER itself is perfect. |
| Temporal drift | PACER records can be amended. A case screened in March may show different data than the same case screened in June. | Timestamp all screenings. Compare only results generated within the same time window. |
| Non-independence | Some GitHub cloners may share results with each other, compromising independence. | The screener output is deterministic. If two users get the same result, it is because the data is the same, not because they compared notes. Non-independence does not affect a deterministic tool. |
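The temporal-drift mitigation could be operationalized as a pairing rule over timestamped screenings. A sketch, with an invented log and a placeholder 30-day window:

```python
from datetime import datetime, timedelta

# Invented screening log: (case_id, result, timestamp).
log = [
    ("24-30111", "barred",   datetime(2026, 3, 1, 10, 0)),
    ("24-30111", "barred",   datetime(2026, 3, 5, 9, 30)),
    ("24-30111", "eligible", datetime(2026, 6, 2, 14, 0)),  # possible amendment
]

WINDOW = timedelta(days=30)  # placeholder comparison window

def comparable(a, b):
    """Same case, and both screenings generated within the same time window."""
    return a[0] == b[0] and abs(a[2] - b[2]) <= WINDOW

pairs = [(a, b) for i, a in enumerate(log) for b in log[i + 1:] if comparable(a, b)]
print(f"{len(pairs)} comparable pair(s)")  # the June screening is excluded
```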
7. What a Research Partner Brings
The dataset builder supplies the infrastructure; the research partner designs the study. Specifically:
- Sample design: How to stratify the Phase 3 sample. Which districts, how many cases per district, what filing years.
- Statistical framework: What tests to use for convergence. How to handle partial overlap. Power analysis for minimum sample sizes (a worked example follows this list).
- Validation criteria: Setting the actual thresholds in Section 5 based on precedent in empirical legal studies or adjacent fields.
- Publication strategy: Whether to publish one paper or two. Which journal. How to frame the methods contribution vs. the substantive findings.
- IRB assessment: Whether this study requires institutional review, given that all data is from public federal court records with no human subjects contact.
- PACER fee exemption: Institutional affiliation that qualifies for the AO's researcher fee exemption, enabling Phase 3 at zero marginal cost.
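As one concrete example of the power-analysis work, here is the standard normal-approximation sample size for estimating a proportion. The 3% violation rate and the ±1 percentage point margin below are placeholders, not findings:

```python
from math import ceil
from statistics import NormalDist

def n_for_proportion(p_hat: float, margin: float, confidence: float = 0.95) -> int:
    """Minimum n to estimate a proportion near p_hat to within +/- margin
    at the given confidence level (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p_hat * (1 - p_hat) / margin**2)

# Placeholder: a ~3% violation rate estimated to within +/- 1 point
print(n_for_proportion(0.03, 0.01))  # 1118 cases
```

The real analysis would be run per stratum and would also account for finite-population correction within small districts, a choice best left to the research partner.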
8. What Exists Today
| Component | Status |
|---|---|
| Ground truth dataset (264 cases, 7 districts) | Complete |
| Screening tool (open-source, deterministic) | Live, ranking #1 nationally |
| FJC national dataset (4.9M Ch. 13 cases) | Loaded, queryable |
| 391,951 verification universe identified | Complete |
| RSS real-time monitoring (all 94 districts) | Running |
| RECAP enrichment pipeline (16,000+ cases) | Running |
| Independent GitHub cloners | 198 unique as of March 25, 2026 |
| Research design for convergence test | This document (draft) |
| Institutional PACER access | Not yet available |
| Formal study protocol | Awaiting research partner |
This design is open
Every element on this page - the phases, the thresholds, the threats, the publication strategy - is a proposal, not a decision. The purpose of this document is to show that a rigorous validation framework is possible and that the infrastructure to execute it already exists. The research design itself should be shaped by someone with the expertise to do it right.