# A Product Requirements Document for the Algorithmic Fairness Authority
## Why a Private Standards Body Beats the Federal Patchwork
*Redwin Tursor*
---
## Summary
Colorado Regulation 10-1-1 — originally a life insurance rule on algorithmic discrimination, extended in October 2025 to auto and health benefit plans — is the opening move. Other states are drafting. The federal government will not ship a unified rule in the operative window. The plaintiff's bar has live cases past motion to dismiss on algorithmic denial-of-care theories. State attorneys general have recent coalition experience on algorithmic harm and are looking for the next joint action. The insurance industry is about to discover that the cheapest version of this problem is the one it funds itself.
This document specifies the product that should exist: a private standards body with de facto regulatory force, modeled structurally on Underwriters Laboratories (1894) and the Comics Code Authority (1954), designed to avoid the failure modes of both, and scoped to algorithmic fairness in insurance underwriting, claims adjudication, and rating.
The institution being specified is not a rulebook. It is a living system of decisions, disputes, and measurements that has to be kept functional under adversarial conditions for decades. The specification has to be read in that register or several of its load-bearing choices will look like fussy overhead rather than what they actually are, which is the difference between a standards body that stays alive and one that ossifies.
The document is structured in two parts. Part One is the thesis: why the Authority should exist and what historical templates it learns from. Part Two is the specification: users, workflows, certification lifecycle, governance decisions, failure states, MVP, and registry design.
---
# Part One: Thesis
## The Problem, Stated Plainly
Insurers now run algorithmic models across underwriting, pricing, claims triage, and fraud detection. When those models produce disparate outcomes along protected characteristics, three things happen in sequence.
First, a state regulator notices. Colorado noticed first and wrote Reg 10-1-1. Connecticut, New York, California, Washington, and New Jersey are at varying stages of drafting. The National Association of Insurance Commissioners adopted an AI Model Bulletin in 2023, and state adoption has been uneven, which is the problem. Every state that writes its own rule writes it differently, and each rule requires its own audit format, its own testing methodology, its own acceptable-disparity threshold.
Second, the plaintiff's bar files. The UnitedHealth nH Predict case on Medicare Advantage claims denials and the Cigna PXDX case on algorithmic claim adjudication are the templates. Both have survived early dispositive challenges. The legal theory — that an algorithm producing systematically biased outcomes constitutes a breach of the insurer's duty — is now operative. The firms building expertise here include Motley Rice, Lieff Cabraser, and Cohen Milstein, whose institutional memory runs through tobacco and opioids.
Third, state attorneys general coordinate. The social media addiction suits against Meta filed by dozens of state AGs in 2023 and 2024 are the warm-up. The coalition infrastructure, outside counsel arrangements, and political incentives carry over to algorithmic insurance harm. When the coalition forms, the industry faces either a fifty-state patchwork of consent decrees or a single Master Settlement Agreement successor.
The industry's internal compliance function cannot solve this. Chief Compliance Officers at individual carriers can build internal governance but cannot set cross-industry standards. Trade associations — AHIP, APCIA, ACLI — can lobby, but a trade association self-regulating its own members carries no credibility with regulators, plaintiffs, or juries. The Big Four consultancies publish frameworks, but frameworks without enforcement are marketing documents.
What is missing is an institution.
## Structural Precedent: Two Templates, One Lesson
### Underwriters Laboratories (1894 → present)
William Henry Merrill founded UL to solve an actuarial problem. Fire insurers in the 1890s could not price electrical-installation risk because they could not assess the safety of the underlying equipment. UL was funded initially by the insurance industry itself, specifically by the Western Insurance Union. Its product was not a rule. Its product was a *mark*. Equipment carrying the UL mark was insurable. Equipment that failed UL testing was not.
The mechanism is the point. UL had no statutory authority. It never needed any. The insurance industry's refusal to underwrite uncertified equipment made UL certification a de facto legal requirement everywhere insurance was required. Municipalities wrote UL into building codes later, decades after the mark had already become unavoidable.
UL's governance model — industry-funded, operationally independent, technically rigorous, with outcomes-based testing rather than prescriptive process rules — is the structural template. The failure modes UL exhibited in later decades — conflicts of interest around its largest funders, opacity in certification decisions, slow adaptation to new technology categories — are the design constraints the Authority must solve at inception rather than at century-three.
### Comics Code Authority (1954 → 2011)
Fredric Wertham's *Seduction of the Innocent* and the 1954 Senate subcommittee hearings created an existential threat to the comics publishing industry: federal content regulation. The industry's response was to build the Comics Code Authority, a private standards body that certified comics as code-compliant. Retailers refused to stock uncertified comics. The Code preempted federal regulation by making federal regulation unnecessary.
The Code worked structurally. It failed substantively. It ossified around content rules that reflected 1954's moral panic rather than any defensible theory of harm. By the 1970s its restrictions were dated. By the 1990s major publishers bypassed it. By 2011 it was dissolved.
The Code's lesson for the Authority is not "don't build one." The lesson is that a standards body written around process rules that capture one cultural moment will ossify and die. A standards body written around *outcomes* — measurable, testable, updatable outcomes — will not. The MPAA film rating system (1968) and the ESRB video game rating system (1994), both of which also preempted federal regulation, have survived precisely because their substrate is outcome-based and their ratings update with the content landscape.
### The Synthesis
From UL: industry funding, operational independence, a mark that becomes unavoidable through private refusal-to-underwrite rather than public enforcement.
From the Comics Code: the move to preempt federal regulation by building an institution that already exists when regulators arrive, so that they can ratify it rather than replace it.
From the Comics Code's failure: do not ossify around process. Build around outcomes.
From MPAA and ESRB: the standards update as the field updates. A 2026 bias-testing methodology will be obsolete in 2030. The Authority's core competence is keeping the methodology current, not freezing a specific test.
The deeper pattern across the templates that lived: each one behaves less like a rulebook and more like an ongoing practice. Its threshold committee updates rather than legislates. Its review cycle tends the standards rather than freezing them. Its adversarial-testing function exists specifically to catch the most dangerous pattern in this domain — the institution or the model that performs virtue without producing it. A standards body that does not understand itself as a live practice will be written around process rules on day one and will be ossified by day three thousand. The Authority has to be designed to stay functional, not merely to launch.
## The UL Analogy Has One Crack, and It Matters
The UL story is the right structural template, but the mechanism that made UL unavoidable does not transfer cleanly to algorithmic fairness, and the honest version of this document has to say so.
UL worked because fire insurers had a direct underwriting need to know equipment safety. They could not price fire risk without it. The refusal to underwrite uncertified equipment was not a collective-action commitment the industry had to organize around — it was a pricing necessity that each carrier arrived at independently and converged on collectively. The coordination was emergent.
Algorithmic fairness is different. No carrier needs Authority certification in order to price its own models. The pressure is reputational and litigation-defensive, which is weaker and more diffuse than actuarial necessity. A carrier that declines certification saves the dues and fees, relies on internal compliance, and takes its chances individually rather than funding a body that might publicly reveal its models as non-compliant.
The Authority therefore cannot rely on UL's emergent-convergence mechanism. It has to solve for collective action at the front of the building process, not hope it appears at the back.
Three mechanisms are available, and the founding coalition should adopt all three.
The first is a multi-year funding commitment from participating carriers, structured as a four-year pledge with graduated withdrawal penalties, so that an individual carrier cannot defect cheaply once the Authority's standards become inconvenient. The American Petroleum Institute's shared-research funding model is the operative precedent: pledges are binding for the full term, with withdrawal before year four triggering forfeiture of prior-year contributions plus a penalty equal to the remaining-year commitment. A carrier that joins in year one and attempts to exit in year two after an unfavorable certification owes three years of dues plus the penalty multiplier. The mechanism has to bite or it is decorative. API's version bites.
The second is tying Authority participation to trade association membership in at least one of AHIP, APCIA, or ACLI — not as a formal requirement but as a political norm that the trade associations themselves enforce. This converts the coordination-among-competitors problem into a governance-of-members problem, which trade associations know how to do.
The third, and most important, is getting the first state insurance commissioner to formally recognize Authority certification as satisfying the state's own audit requirement *before* the Authority's first certification is issued. Once one regulator is on record, the Authority has an external anchor. Carriers evaluating participation are no longer weighing it against their own risk appetite — they are weighing it against a regulator's standing position. That shift changes the default.
None of the three mechanisms eliminates the collective-action problem. Together they change its shape from "will insurers coordinate voluntarily" to "will insurers defect from an already-running institution their trade association sponsors and their regulator recognizes." That second question answers more favorably than the first.
---
# Part Two: Specification
## Mission
To establish, maintain, and certify outcome-based fairness standards for algorithmic decision systems in insurance, such that a carrier's models carrying the Authority's certification are presumptively compliant in the eyes of state insurance regulators, defensible in plaintiff litigation, and credible to state attorneys general.
## Users
The Authority has five user classes. Each interacts with a distinct surface of the institution.
- **Carrier compliance leadership** (CCO, CRO, General Counsel). Submits models, signs membership, responds to registry status changes, coordinates internal response to adverse determinations.
- **Carrier model-risk and actuarial teams.** Prepare submissions, configure the enclave environment, respond to technical committee queries, implement remediation, report trigger events.
- **State regulators.** Review methodology during public comment, accept certification as satisfying state audit requirements where ratified, query the registry, participate through alumni seats on the regulatory board pool.
- **Plaintiff-side and outside reviewers.** Litigation experts, academic researchers, investigative journalists, civil society analysts. Access the public registry, reproduce the open-source testing protocol against independently obtained data, challenge methodology in public comment, cite registry data in pleadings and amicus briefs.
- **Public-interest board participants.** Civil rights organizations, consumer advocacy groups, academic fairness researchers. Governance participation, threshold-setting review, public-interest positioning during committee deliberation, surfacing observed gaming patterns.
## Certification Lifecycle
A single certification moves through seven stages; a state-machine sketch in code follows the list.
1. **Submission.** Carrier identifies a model and use context, signs the certification-specific access agreement, configures the enclave environment.
2. **Testing.** Authority's open-source testing code runs against the carrier's deployment data inside the enclave. Results returned with cryptographic attestation.
3. **Review.** Technical committee staff evaluate results against current thresholds. Edge cases escalate to the committee itself.
4. **Determination.** One of five outcomes: full certification, conditional certification, warning, failed, or withdrawn at carrier request.
5. **Publication.** Status and determination metadata enter the public registry within 30 days.
6. **Monitoring.** The certified model enters ongoing drift surveillance until renewal or trigger event.
7. **Renewal or revocation.** Annual renewal cycle, or mid-cycle revocation if trigger events and re-review produce a failing result.
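A minimal sketch of the lifecycle as a state machine, in Python. The stage names follow the list above; the transition table, including the monitoring-to-review loop on a trigger event, is an illustrative assumption rather than the Authority's rulebook.

```python
from enum import Enum, auto

class Stage(Enum):
    SUBMISSION = auto()
    TESTING = auto()
    REVIEW = auto()
    DETERMINATION = auto()
    PUBLICATION = auto()
    MONITORING = auto()
    RENEWAL_OR_REVOCATION = auto()

# Illustrative transitions. Monitoring can loop back to Review on a
# trigger event (the compressed re-review described below); annual
# renewal restarts the cycle at Submission.
TRANSITIONS = {
    Stage.SUBMISSION: {Stage.TESTING},
    Stage.TESTING: {Stage.REVIEW},
    Stage.REVIEW: {Stage.DETERMINATION},
    Stage.DETERMINATION: {Stage.PUBLICATION},
    Stage.PUBLICATION: {Stage.MONITORING},
    Stage.MONITORING: {Stage.REVIEW, Stage.RENEWAL_OR_REVOCATION},
    Stage.RENEWAL_OR_REVOCATION: {Stage.SUBMISSION},
}

def advance(current: Stage, target: Stage) -> Stage:
    """Move a certification to its next stage, rejecting illegal jumps."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```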
## Outcome-Based Standards, Not Process Rules
The Authority certifies results. It does not certify methodology.
A carrier may use whatever internal process it wishes to achieve compliance — adversarial debiasing, counterfactual testing, BIFSG-based race inference for outcome measurement, or approaches not yet invented. The Authority's test is whether the deployed model's outcomes, measured on the Authority's protocol, fall within the Authority's thresholds.
The thresholds themselves descend from existing law where possible. The four-fifths rule from EEOC adverse impact doctrine is the starting substrate. The Authority's technical committee adapts the substrate to insurance-specific contexts — specifying, for instance, how the four-fifths framework applies to an auto pricing model using telematics data where relevant comparison groups are defined differently than in an employment context.
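As a concrete illustration, a minimal sketch of the four-fifths computation applied to an insurance outcome such as underwriting approval rates. The group labels, counts, and the 0.8 line are placeholders, not Authority thresholds; the committee's insurance-specific comparison-group definitions are the hard part this sketch omits.

```python
def adverse_impact_ratio(outcomes: dict[str, tuple[int, int]]) -> dict[str, float]:
    """outcomes maps group label -> (favorable decisions, total decisions).

    Returns each group's favorable-outcome rate divided by the
    highest-rate group's rate. Under the classic four-fifths rule a
    ratio below 0.8 flags the disparity for review.
    """
    rates = {g: fav / total for g, (fav, total) in outcomes.items()}
    benchmark = max(rates.values())
    return {g: rate / benchmark for g, rate in rates.items()}

# Hypothetical approval counts for an underwriting model.
ratios = adverse_impact_ratio({
    "group_a": (840, 1000),  # 84% approval rate
    "group_b": (630, 1000),  # 63% approval rate
})
flagged = [g for g, r in ratios.items() if r < 0.8]
print(ratios, flagged)  # group_b at 0.75 falls below the four-fifths line
```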
The thresholds update on a defined cadence — every eighteen months, with a public comment period preceding each update, modeled on FDA patient-focused drug development and CFPB plain-language rulemaking conventions.
## The Measurement Problem Is the Whole Problem
Outcome-based certification is the right substrate. It is also the place where the Authority's credibility will be won or lost, and the hardest political and technical work the institution will ever do.
The difficulty is structural. Employment law under the four-fifths rule can treat most disparities as suspect because employment decisions are not, by law, permitted to use most protected characteristics as inputs. Insurance is different. Risk-based pricing is legal. It is actuarially required. Some legitimate rating factors correlate with protected characteristics for reasons unrelated to bias — age correlates with driving risk, geography correlates with weather exposure, occupation correlates with injury rates — and a naive application of the four-fifths rule would forbid accurate underwriting in ways that would collapse the product.
The Authority's technical committee therefore cannot hide behind a single threshold. It has to make value-laden, publicly defended decisions about which disparities are unfair and which reflect legitimate rating. Those decisions will be contested by civil rights organizations from one side and by carriers from the other. The threshold-setting process is not a technical computation. It is a governance function performed by a technical committee, and the governance part is not optional.
Three design principles follow.
First, the Authority must publish its threshold reasoning, not just its thresholds. Every disparity bar the technical committee sets must come with a written rationale that a civil rights attorney and an actuary can both argue with. Thresholds without reasoning invite attack. Reasoning invites debate, which is what the Authority wants.
Second, the Authority has to guard against a specific pattern: the model that passes the measurement while preserving the underlying harm. A carrier that adds noise to its outputs, swaps in proxy variables that statistically reduce measured disparity without changing underlying outcomes, or games the Authority's measurement protocol at the margin has produced a *performance* of fairness — a polished surface, hollow within — rather than the thing itself. Left undefended, this pattern self-propagates through the certified population: a successful game by one carrier is a template for every other carrier. The technical committee's adversarial-testing function — continuous, not once every eighteen months — is the defense. The frame matters because it tells the committee what it is looking for: not a rule violation but a specific kind of institutional mimicry, identifiable by the signature of passing numbers alongside unchanged complaint patterns and unchanged consumer outcomes.
Third, the Authority has to concede in advance that some disparities will pass through its standards for legitimate actuarial reasons, and it has to explain those concessions in terms civil rights organizations can engage with rather than feel excluded from. A standards body that pretends it can eliminate all measurable disparity will lose the civil rights seats on its board. A standards body that pretends disparate outcomes are purely technical will lose the regulator seats. The only survivable position is explicit, public, ongoing argument about where the lines are and why.
## Certification Event Plus Drift Surveillance
The Authority is a two-part product, not one. Annual certification is the event. Continuous drift surveillance is the ongoing function. Either one alone is gameable.
The split is not administrative convenience. It reflects what models actually are. A model certified in January is not the same model in August — it has retrained on new data, its input distributions have shifted, its downstream effects on claimant populations have accumulated. Certifying the January model and walking away treats the model as an artifact. It is not an artifact. It is a live process, and a live process needs monitoring, not stamping.
Drift surveillance runs against models in the certified population and triggers re-review when any of the following conditions fire (a sketch of the trigger check follows the list):
- Material model retraining, including scheduled retraining cycles beyond a defined cadence threshold
- Major feature changes, including addition, removal, or redefinition of model inputs
- Distribution shift in input data exceeding defined statistical thresholds
- Claims denial or approval rate spikes outside historical bands
- Regulator inquiry or formal complaint
- Consumer complaint volume crossing a defined threshold in any measured population
- Reported outcome disparity from any registered third party (including public-interest board members, academic researchers, or state regulators) that passes initial triage
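A sketch of the trigger check, under stated assumptions: the field names and numeric thresholds below are invented placeholders, and the Authority's actual values would come out of the technical committee's published reasoning.

```python
from dataclasses import dataclass

@dataclass
class ModelSnapshot:
    """One monitoring-period observation of a certified model."""
    retrained: bool                       # material or beyond-cadence retraining
    features_changed: bool                # inputs added, removed, or redefined
    input_drift_score: float              # e.g., a population stability index
    denial_rate: float
    denial_rate_hist_band: tuple[float, float]
    regulator_inquiry: bool               # inquiry or formal complaint
    complaint_volume: int
    complaint_threshold: int              # per measured population
    triaged_third_party_report: bool      # passed initial triage

def trigger_events(s: ModelSnapshot) -> list[str]:
    """Return the fired triggers; any non-empty list starts the
    fourteen-day compressed re-review."""
    fired = []
    if s.retrained:
        fired.append("material_retraining")
    if s.features_changed:
        fired.append("feature_change")
    if s.input_drift_score > 0.25:  # placeholder statistical threshold
        fired.append("input_distribution_shift")
    lo, hi = s.denial_rate_hist_band
    if not (lo <= s.denial_rate <= hi):
        fired.append("denial_rate_spike")
    if s.regulator_inquiry:
        fired.append("regulator_inquiry_or_complaint")
    if s.complaint_volume > s.complaint_threshold:
        fired.append("consumer_complaint_volume")
    if s.triaged_third_party_report:
        fired.append("third_party_disparity_report")
    return fired
```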
Trigger events initiate a compressed re-review — fourteen-day technical committee review, not the full submission-to-determination cycle. Outcomes of re-review publish to the registry with the trigger reason attached. This is how the Authority catches models that were certified correctly at T=0 but drifted into non-compliance at T=8, and how it makes certification a live status rather than a stamp that fades.
Consumer complaint volume crossing a threshold is the most important trigger and the one most at risk of being downweighted. Complaint patterns are the early-warning signal for everything the formal measurement misses — the disparities that escaped the test, the sub-populations the test was not configured for, the performances of fairness that look clean on the protocol and ugly in the mailroom. The committee that treats complaint-volume triggers as noise will, predictably, certify its way into the next class-action template. The committee that treats them as signal will catch what its own tests cannot.
## Consequence Tiers
The determination stage produces one of five outcomes. All five publish to the registry.
- **Full certification.** Model passes all thresholds. Mark issued for twelve months, subject to drift surveillance.
- **Conditional certification.** Core thresholds passed, issues in specific sub-populations or edge cases. Issued with documented remediation requirement and 90- or 180-day follow-up window. Registry reflects conditional status explicitly.
- **Warning.** Does not pass but within remediation distance. No certification issued. 180-day remediation window before resubmission. Registry reflects warning status and deadline.
- **Failed.** Materially fails thresholds, not within remediation distance. No certification issued. 180-day moratorium on resubmission. Registry reflects failed status.
- **Post-certification revocation.** A certified model later fails drift surveillance or trigger-event re-review. Certification revoked as of the re-review determination date. Registry reflects revocation and originating trigger.
Refusal to submit a model in the Authority's scope is its own registry status, separate from the five determination outcomes. A member carrier that declines to submit a model is publicly identified as such. Non-member carriers are implicitly in this category for all in-scope models.
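A minimal sketch of the outcome vocabulary as code. The enum values mirror the tiers above and the windows just stated; the registry-only statuses are the non-determination states described in the preceding paragraph. All of it is illustrative rather than normative.

```python
from enum import Enum

class Determination(Enum):
    """The five determination outcomes."""
    FULL = "certified"
    CONDITIONAL = "conditional"
    WARNING = "warning"
    FAILED = "failed"
    WITHDRAWN = "withdrawn"

# Illustrative remediation and resubmission windows, in days.
WINDOW_DAYS = {
    Determination.CONDITIONAL: (90, 180),  # documented follow-up window
    Determination.WARNING: 180,            # remediation before resubmission
    Determination.FAILED: 180,             # moratorium on resubmission
}

# Registry statuses extend the determinations with post-certification
# and non-submission states.
REGISTRY_ONLY_STATUSES = {"revoked", "refused", "not_submitted"}
```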
## Appeals and Dispute Resolution
Any value-laden determination will be contested. An Authority that cannot absorb contest will lose governance integrity within three cycles.
Three appeal channels exist.
**Carrier methodology appeals.** A carrier whose model receives a conditional, warning, or failed determination may appeal on grounds that the testing protocol was misapplied, that the comparison population was misconstructed, or that the threshold was set outside the Authority's published reasoning. Appeals go to an Independent Review Panel composed of three members — one technical committee alternate not involved in the original determination, one former regulator, and one independent academic fairness researcher. Decisions are rendered within 60 days and are public.
**Public-interest methodology challenges.** Civil rights organizations and public-interest board members may challenge specific thresholds, testing protocols, or determinations that they assess to be insufficiently protective. Challenges trigger a formal technical committee response during the next quarterly committee meeting and, if the challenge advances, feed into the eighteen-month threshold update cycle. Challenge dispositions are public.
**Regulator reconciliation.** State insurance departments that find the Authority's methodology diverges from their own audit requirements may request reconciliation. Regulator requests move through the Authority's executive director, with a written response and, where applicable, a technical committee methodology note within 90 days. Reconciliation outcomes are public and inform the threshold update cycle.
Ground rule: the board ratifies or remands technical committee findings but does not edit them line-by-line. Technical methodology is the committee's mandate. Governance of the committee is the board's mandate. Certification decisions are staff-executed under committee rules. This separation is the protection against politicization running in either direction.
## Governance
A board of directors drawn from three pools, with no pool permitted to hold a voting majority. The pools are: industry (carriers and trade associations), public interest (civil rights organizations, academic researchers, consumer advocacy groups), and regulatory (former state insurance commissioners, former federal agency officials, state attorneys general emeriti). A technical committee reporting to the board, composed of practitioners with published work in algorithmic fairness. An operational staff led by an executive director who is a Chief Technical Officer of fairness, not a CEO of an industry association.
The operational independence test is a bright line: no single funder contributes more than fifteen percent of the Authority's annual budget, and no funder may have any direct relationship with any specific certification decision. This is the UL conflict-of-interest failure mode, addressed at inception.
The public-interest seats require their own funding protection. Civil rights organizations that accept board seats cannot be asked to fund their own participation, and they cannot be funded directly by the insurers. A dedicated philanthropic pool — foundation commitments with multi-year horizons — is the vehicle, with the executive director responsible for maintaining it. Without this, the public-interest seats become either performative or captured, and the governance tripartite collapses into industry-plus-regulator, which is a structure no one credits.
## Funding Model
The Authority runs on three revenue streams: carrier dues, certification fees, and a dedicated philanthropic pool for public-interest participation. Each stream has a distinct function, and none substitutes for the others.
**Dues.** Scaled to premium volume, structured in four tiers. Tier 1 (top-ten national carriers by premium) pays roughly forty percent of total dues revenue across the tier. Tier 2 (regional and specialty carriers above a defined threshold) pays roughly thirty-five percent. Tier 3 (smaller regional and mono-line carriers) pays roughly twenty percent. Tier 4 (observer status for carriers below the certification-submission threshold) pays the remaining five percent and does not receive voting rights on trade-association-sponsored seats.
The fifteen-percent single-funder cap overrides the tier math. If a top-ten carrier's tier-one dues would exceed fifteen percent of total budget, that carrier's dues are capped and the shortfall is redistributed across the remaining Tier 1 members. This is the anti-capture instrument and it is non-negotiable.
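A sketch of the cap-and-redistribute arithmetic, with invented carrier names and dollar figures. Any residue left once every Tier 1 member sits at the cap is not modeled here and would have to spill to the other tiers.

```python
def apply_funder_cap(tier1_dues: dict[str, float], budget: float,
                     cap: float = 0.15) -> dict[str, float]:
    """Cap any single funder at cap * budget and spread each shortfall
    across the uncapped Tier 1 members, pro rata to their dues. Loops
    because a redistribution can push another member over the cap."""
    dues = dict(tier1_dues)
    ceiling = cap * budget
    while True:
        over = [c for c, amt in dues.items() if amt > ceiling]
        if not over:
            return dues
        shortfall = sum(dues[c] - ceiling for c in over)
        for c in over:
            dues[c] = ceiling
        under = [c for c, amt in dues.items() if amt < ceiling]
        base = sum(dues[c] for c in under)
        for c in under:
            dues[c] += shortfall * (dues[c] / base)

# Hypothetical Tier 1 pool against a $20M budget: the $4.5M member is
# capped at $3M (15%) and its $1.5M excess spreads across the others.
print(apply_funder_cap({"carrier_a": 4.5e6, "carrier_b": 2.0e6,
                        "carrier_c": 1.5e6}, budget=20e6))
```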
**Dues commitment.** Four-year pledge with graduated withdrawal penalties, modeled on the American Petroleum Institute's shared-research funding structure. A carrier that withdraws before year four forfeits prior-year contributions and owes a penalty equal to the remaining-year commitment. A carrier withdrawing in year two after an adverse certification owes three years of dues plus the penalty multiplier. This converts the exit option from cheap to expensive.
**Certification fees.** Per-model, scaled to model complexity and population size. Baseline fee covers testing-protocol execution, technical committee review time, and registry publication. Remediation-window resubmissions carry a reduced fee. Fees for failed determinations are not refunded. The fee structure ensures that certifications are treated as consequential submissions, not administrative formalities.
**Envelope.** Year-one operating budget in the range of $18M-$25M, with Tier 1 dues funding roughly half, certification fees funding approximately a quarter, and the philanthropic pool funding the public-interest seats and their associated staff support. Year-three steady-state in the $30M-$40M range as certification volume scales and member count grows. These envelopes are not estimates of final budget. They are the order-of-magnitude scoping a founding coalition needs to decide whether the institution is fundable at all.
**Philanthropic pool.** A dedicated line funded by foundations — Ford, MacArthur, Russell Sage, the Kapor Center, and the Omidyar Network are the natural candidates given their existing work on algorithmic accountability. The pool covers stipends for public-interest board members (because civil rights organizations cannot afford to donate senior staff time indefinitely), dedicated staff support to public-interest seats (because a board seat without staff is ornamental), and a small research budget for adversarial-testing pilots the technical committee cannot fund through carrier dues without conflict concerns. Target pool size at steady state: $4M-$6M annually. An executive director who cannot keep this pool funded will watch the governance tripartite collapse within three years.
The trade associations — AHIP for health, APCIA for property and casualty, ACLI for life — are the collection mechanism and political sponsor, not the governors. Their role is to deliver their memberships to the table and enforce the participation norm described in Part One. The Authority's independence from them is the condition under which they deliver value to their memberships.
## Data Access and Model Transparency
The Authority cannot certify outcomes without access to data. Outcome testing requires deployment data — the model's inputs, outputs, and outcomes across a defined measurement population — and in some cases the model logic itself. Carriers will resist this for legitimate reasons (proprietary model internals are genuine trade secrets, deployment data contains protected health information and other regulated data) and for self-protective reasons (a model the carrier suspects will fail is a model the carrier does not want examined).
The operational pattern that works across regulated industries — pharmaceutical clinical trial audits, banking stress tests, SOC 2 attestations at scale — is a secure enclave model. The carrier does not transfer data to the Authority. The Authority's testing code runs inside the carrier's environment, against data that never leaves the carrier's systems, with cryptographic attestation that the test was run correctly and the outputs were not tampered with. The Authority sees the test results. It does not see the underlying data.
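A minimal sketch of the attestation step, using an HMAC tag over the result payload as a stand-in for a real enclave attestation (an SGX or SEV quote, for instance). Key provisioning and the enclave runtime itself are assumed away here.

```python
import hashlib, hmac, json

def attest_results(results: dict, attestation_key: bytes) -> dict:
    """Run inside the enclave: bind the test outputs to a keyed digest
    so the Authority can verify the payload was not altered in transit."""
    payload = json.dumps(results, sort_keys=True).encode()
    tag = hmac.new(attestation_key, payload, hashlib.sha256).hexdigest()
    return {"results": results, "attestation": tag}

def verify_results(envelope: dict, attestation_key: bytes) -> bool:
    """Run at the Authority: recompute the tag over the received results."""
    payload = json.dumps(envelope["results"], sort_keys=True).encode()
    expected = hmac.new(attestation_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["attestation"])
```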
This protocol satisfies carrier confidentiality. It does not automatically satisfy civil rights advocates, who reasonably ask how they are supposed to trust results they cannot independently reproduce. Open-source testing code is a necessary condition but not a sufficient one — open code against sealed data is reproducibility in theory, not in practice, because no independent party has standing to demand data access on which to run the code.
The sufficient condition requires three additional instruments, all of which the Authority must commit to at inception.
The first is that state regulators with statutory audit authority can exercise that authority against carrier data and run the Authority's testing code against that data, with the regulator's findings published in the regulator-only registry layer and, where the regulator chooses, publicly. This is not a novel power — state insurance departments already have data-access authority. What is novel is that the regulator's audit runs the same protocol the carrier was certified under, making the results directly comparable.
The second is a standing academic research partnership, modeled on the FDA's Sentinel Initiative and the CFPB's research partnerships with academic institutions. Pre-qualified academic researchers operating under data-use agreements can access de-identified carrier deployment data for the purpose of independent testing. Findings that diverge from Authority certifications feed directly into the public-interest challenge channel and the eighteen-month threshold update cycle.
The third is that plaintiff-side experts in litigation with established data-discovery rights can run the Authority's open-source protocol against data obtained through discovery, and the Authority will cooperate with motions to use its testing code in expert witness work. The Authority does not become a plaintiff. It also does not become an obstacle to plaintiffs whose data access comes through a different legal authority than the Authority's own.
The combination of these three instruments converts the reproducibility promise from theoretical to operational. The Authority holds the certified reference implementation. Regulators, academic researchers, and plaintiff experts each have an independent path to data on which to run that implementation. The results are comparable because the protocol is shared.
## The Public Registry
The registry is the enforcement instrument. Specification matters.
Every record in the public registry contains the following fields (a schema sketch follows the list):
- Carrier name and Authority member status
- Model identifier (non-sensitive label, carrier-assigned)
- Model category (underwriting, claims adjudication, pricing/rating, fraud detection)
- Use context (line of business, jurisdiction)
- Current status (certified, conditional, warning, failed, revoked, refused, not submitted)
- Determination date
- Next review date
- Active trigger events, if any, with trigger category
- Active appeals, if any, with appeal category and date filed
- Revocation reason, if applicable
- Narrative summary (one paragraph, technical committee authorship, plain language)
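A schema sketch for the record as a Python dataclass; the field names track the list above, while the types and optionality are illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class RegistryRecord:
    """One public registry entry."""
    carrier_name: str
    member_status: str                    # e.g., "member", "non-member"
    model_id: str                         # carrier-assigned, non-sensitive
    model_category: str                   # underwriting | claims | pricing | fraud
    line_of_business: str
    jurisdiction: str
    status: str                           # certified | conditional | warning |
                                          # failed | revoked | refused | not_submitted
    determination_date: Optional[date] = None
    next_review_date: Optional[date] = None
    active_triggers: list[str] = field(default_factory=list)   # trigger categories
    active_appeals: list[str] = field(default_factory=list)    # category + date filed
    revocation_reason: Optional[str] = None
    narrative_summary: str = ""           # one paragraph, plain language
```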
A separate regulator-only layer contains certification methodology details, test result data, remediation plans, and trigger event specifics that the Authority judges inappropriate for public release but necessary for state regulator supervision. Access to the regulator layer is gated by formal state insurance department credentials and logged.
Registry updates publish quarterly at a minimum, with status-change events published within 30 days.
## Institutional Failure Modes
Each mode below is a specific pathology the institution has to be built to resist, not a risk to be flagged in a memo and forgotten.
| Failure Mode | Why It Happens | Early Signal | Countermeasure |
|---|---|---|---|
| Funder capture | Largest carriers apply pressure via dues threats | Threshold-setting delays; technical committee turnover | 15% funding cap; four-year pledge penalties; board pool voting rules |
| Performance of fairness | Carriers optimize to pass measurement, not to reduce harm | Passing certifications alongside stable or rising complaint volume | Continuous adversarial testing; trigger-event re-review; open-source test protocol |
| Opacity in certification decisions | Committee deference to industry in ambiguous cases | Sparse narrative summaries; low appeal volume | Published threshold reasoning; appeals channels; public-interest challenge right |
| Methodology ossification | Technical committee resists updates that invalidate prior work | Unchanged thresholds across update cycles despite field advances | 18-month mandatory update cycle; external academic seats on committee; public comment |
| Performative public-interest participation | Civil rights seats exist but lack resources to engage meaningfully | Low challenge volume; formulaic board comments | Philanthropic funding pool for public-interest seats; staff support to board members |
| Regulator withdrawal | State commissioner concludes the Authority duplicates or undermines state authority | Withdrawn ratification; methodology divergence complaints | Regulator reconciliation channel; early commissioner recognition; regulatory pool board seats |
| Registry weaponization | Bad-faith parties use registry data to harass carriers or plaintiffs | Repeated challenges from identifiable non-serious actors | Triage process for third-party reports; standing requirements for challenges |
The two most contagious failure modes — performance of fairness and methodology ossification — propagate for the same underlying reason. A successful pattern in one carrier or one committee cycle becomes the default template for the next. Neither is a discrete event. Both are slow colonizations that look normal in any given month and are visible only in the time series. The Authority's monitoring function has to watch its own institutional state with the same seriousness it watches member carriers' models.
## What the Authority Is Not
It is not a lobbying organization. Its credibility with regulators and plaintiffs is the asset it must not trade.
It is not an industry association. Trade associations exist. They do not and cannot do this work.
It is not a consulting firm. Consultants perform audits to a standard. The Authority *sets* the standard.
It is not a certifying body in the sense of ISO 9001 or SOC 2. Those certifications are process attestations. This one is an outcome attestation, tested on deployed systems against defined disparity thresholds, not on the carrier's documented procedures for avoiding disparity.
## The MVP
The smallest credible version of the Authority is deliberately narrow. The temptation is to launch with scope matching the ambition. The discipline is to launch with scope matching the credibility the institution can defend in its first eighteen months.
Year-one MVP boundaries:
- **One line of business.** Health benefit plan underwriting, specifically. Colorado Reg 10-1-1's amended coverage maps here directly, and the plaintiff's bar cases on algorithmic denial of care concentrate the regulatory and litigation signal in this line.
- **One use case family.** Underwriting and prior authorization models. Pricing, fraud, and downstream claims models follow in years two and three.
- **Member carriers only.** No certification available to non-members in MVP. This keeps the coalition intact and the dues model coherent.
- **Annual certification plus trigger-event re-review.** Continuous drift surveillance at full fidelity is a year-two capability. MVP drift surveillance is limited to the trigger conditions listed above.
- **Limited protected-class stack.** Race and disability at MVP. Age, gender, and additional protected classes layer in during the first threshold update cycle.
- **One regulator recognition pilot.** Colorado is the target. A formal recognition before MVP first certification is the political objective.
- **Registry v1.** All seven registry statuses, but without the regulator-only layer (follows in year two) and without real-time trigger event publishing (batched quarterly in MVP).
MVP success criteria at month eighteen: minimum eight member carriers across at least two trade association memberships; at least twenty determinations issued across the full, conditional, and warning tiers; one state regulator formally recognizing Authority certification; no governance resignations from the public-interest pool; technical committee operational with published threshold reasoning for the initial rule set.
## The Architect
The Authority's first-year work is building the technical methodology, the governance instrument, and the political ratification from at least one state insurance department. This requires an architect who meets four criteria simultaneously: published peer-reviewed methodology on algorithmic auditing at scale; existing relationships with state insurance regulators; existing relationships with civil society organizations that will sit on the public-interest side of the board; and operational experience running an actual auditing practice on paying clients.
The population of individuals and institutions meeting all four criteria is small. The founding coalition should specify the capability profile in the charter and let the search process identify the architect. The natural candidate pool includes academic auditing practices with regulatory relationships (the O'Neil Risk Consulting & Algorithmic Auditing model), civil society organizations with demonstrated auditing capacity (the Algorithmic Justice League model, though its mission scope differs), and law-firm/academic hybrids working on algorithmic accountability. The coalition should screen for institutional conflicts — an architect with existing carrier engagements cannot credibly hold this position — and for the ability to scale from founding engagement to sustained operational partnership over three to five years.
## The Window
Colorado amended Reg 10-1-1 in October 2025, effective staged through 2026. Connecticut, New York, California, Washington, and New Jersey are at varying stages of drafting their own rules. State adoption of the NAIC AI Model Bulletin remains uneven, which means fragmentation is growing faster than unification.
The UnitedHealth nH Predict case and the Cigna PXDX case will reach dispositive motions or settle during 2026. Whichever outcome occurs will define the plaintiff's-bar template for the next wave.
State attorneys general coordinate on these issues in two-to-three-year cycles. The social media addiction coalition formed in 2022 and 2023 and filed in 2023 and 2024. The algorithmic insurance harm coalition, if it forms on the same tempo, is a 2026-2027 formation and 2027-2028 filing.
The window to build the Authority before it is built for the industry, by other parties, is roughly eighteen to thirty months. This is fast. Two years from founding-coalition convening to first certification is aggressive for an institution that needs technical credibility and multi-stakeholder governance, and trade-association coordination alone will consume more of that window than the principals are likely to anticipate. An Authority built on a five-year timeline is a post-mortem, not a preemption.
After the window closes, the structural question is no longer whether a standards body emerges. It is whether the standards body that emerges is industry-funded and outcome-based, or externally imposed and process-rigid.
Both are possible. Only one is survivable for the industry in its current form.
## What Happens Next
The coalition that commissions the Authority consists, at minimum, of one willing state insurance department (Colorado is the obvious candidate, having already partnered with outside auditing expertise), two of the three major trade associations (AHIP and APCIA are most likely; ACLI follows), and one architect institution with the capability profile described above.
The coalition convenes. The architect institution is engaged on a two-year building contract. The technical methodology is drafted during year one, with public comment during months nine through twelve. The governance instrument — articles, bylaws, board composition — is finalized during months six through nine. The first certifications are issued in month eighteen. The first state regulator recognizes Authority certification as satisfying its own audit requirement in month twenty-four, though the design goal is to get that recognition on record earlier, before the first certification issues, so that the collective-action mechanism has its external anchor in place from the start.
None of this requires federal legislation. None of it requires new statutory authority. All of it has been done before, by UL, by MPAA, by ESRB, by every private standards body that has successfully preempted the federal action it feared. The template exists. The question is whether the industry commissions the Authority in time, or whether the industry discovers, as the tobacco industry discovered in 1998, that the conversation about standards has moved to a room it is not invited to.
---
*This is an open document. It is written to be read, argued with, rebuilt, and, if it turns out to be wrong in important places, refuted. The author has no affiliation with any auditing firm, trade association, or insurer, and has not been compensated by any party to write it.*
*Contact: through Redwin Tursor at the redanvilcreative at the gmail address intuitively derived from the domain.*