Sunday, 19 April 2026 · Vol. II · No. 287
Johannesburg Edition
Established 2026 · Independent · Indexed

The Meridian

Neutral record · Multi-source · Cited
A Southern African journal of record, reassembled from many voices.
Transparency

How we work

A Southern African journal of record built on an open pipeline - every score traceable, every claim reversible.

The Meridian · Pipeline v1 · Scorer heuristic-v1 · Rubric synthesis-v1

The problem we're solving

Southern Africa has dozens of quality news outlets - but a reader following any single story has to visit four or five of them, reconcile their contradictions, and work out which statements are factual and which are editorial framing. The Meridian automates that reconciliation and publishes a single neutral synthesis with per-fact citations back to the original sources.

The pipeline

Every 15 minutes, a cron trigger fans out fetch jobs to all indexed outlets. Six queue-driven workers process each article through the full pipeline:

  1. Fetch - conditional HTTP, robots-respecting, per-host rate limits. Raw HTML → R2 as a permanent receipt.
  2. Extract - Mozilla Readability + linkedom parses the article body. Language detected via franc (English, Afrikaans, isiZulu, isiXhosa).
  3. Embed & cluster - Workers AI bge-m3 produces a 1024-dim embedding. Articles with cosine similarity >0.82 within an 18 h window form a cluster.
  4. Entity extraction - Fast gazetteer regex covers ~600 SA politicians, parties, places. Claude handles out-of-gazetteer entities via structured tool-use.
  5. Scoring - Pure TypeScript heuristic scorer (heuristic-v1). Three outputs: neutrality, fact density, political drift. Append-only; old scores are never overwritten.
  6. Synthesis + validation - Claude synthesises all cluster articles into a single neutral piece. A second Claude call validates that every claim has a source sentence. Pass → publish. Fail → retry up to twice, then hold silently for manual review.
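As an illustration of step 3, the clustering rule can be sketched as below. This is a minimal sketch, not the production worker: the 0.82 threshold and 18 h window come from the description above, while the EmbeddedArticle shape, the function names and the greedy grouping strategy are assumptions made for the example.

```typescript
// Hypothetical article shape; field names are illustrative only.
interface EmbeddedArticle {
  id: string;
  publishedAt: number; // Unix epoch in milliseconds
  embedding: number[]; // 1024 values from bge-m3 in the real pipeline
}

const SIMILARITY_THRESHOLD = 0.82; // from the text: cosine similarity > 0.82
const WINDOW_MS = 18 * 60 * 60 * 1000; // from the text: 18 h window

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Two articles match when they are close in both time and meaning.
function belongTogether(a: EmbeddedArticle, b: EmbeddedArticle): boolean {
  const withinWindow = Math.abs(a.publishedAt - b.publishedAt) <= WINDOW_MS;
  return withinWindow && cosineSimilarity(a.embedding, b.embedding) > SIMILARITY_THRESHOLD;
}

// Assumption: greedy grouping. An article joins the first existing cluster
// that contains any matching member; otherwise it starts a new cluster.
function clusterArticles(articles: EmbeddedArticle[]): EmbeddedArticle[][] {
  const clusters: EmbeddedArticle[][] = [];
  for (const article of articles) {
    const home = clusters.find((c) => c.some((m) => belongTogether(m, article)));
    if (home) {
      home.push(article);
    } else {
      clusters.push([article]);
    }
  }
  return clusters;
}
```

Greedy grouping keeps the example short; the real clustering strategy (and how it handles an article that matches two clusters) is not specified on this page.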

Scoring

Every score is produced by scorers/heuristic-v1.ts - a deterministic TypeScript function that reads the article text and the per-language lexicons. No black-box model decides the score.

Neutrality (0–100) - starts at 100. Deductions for: loaded-language density (from the lexicon), evaluative adjectives, stance markers, attribution gaps. Offset by the source's editorial prior from sources.toml.
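The shape of the neutrality calculation can be sketched as follows. This is an illustration only: the four deduction categories and the starting value of 100 come from the description above, but the weights, the units of each signal and the way the editorial prior is applied (as signed points added to the total) are assumptions, not the values in heuristic-v1.

```typescript
// Hypothetical signal shape; units are assumed for the example.
interface NeutralitySignals {
  loadedLanguageDensity: number; // loaded-term hits per 100 words (assumed unit)
  evaluativeAdjectives: number; // count
  stanceMarkers: number; // count
  attributionGaps: number; // count of claims lacking attribution
}

// Hypothetical weights: points deducted per unit of each signal.
const WEIGHTS: NeutralitySignals = {
  loadedLanguageDensity: 4,
  evaluativeAdjectives: 1.5,
  stanceMarkers: 2,
  attributionGaps: 3,
};

// editorialPrior: signed points from the source registry (assumed semantics).
function neutralityScore(signals: NeutralitySignals, editorialPrior: number): number {
  const deductions =
    signals.loadedLanguageDensity * WEIGHTS.loadedLanguageDensity +
    signals.evaluativeAdjectives * WEIGHTS.evaluativeAdjectives +
    signals.stanceMarkers * WEIGHTS.stanceMarkers +
    signals.attributionGaps * WEIGHTS.attributionGaps;
  const raw = 100 - deductions + editorialPrior;
  // Clamp to the published 0-100 range and round to a whole score.
  return Math.round(Math.min(100, Math.max(0, raw)));
}
```

Because the function is deterministic, the same article text and lexicons always yield the same score, which is what makes each deduction traceable.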

Fact density (0–100) - ratio of attributed claims to total claims. Attribution markers ("according to", "sê", "utshilo", "i-… ithe") counted across all four languages.
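A rough sketch of the ratio, under simplifying assumptions: each sentence is treated as one claim, a sentence counts as attributed if it contains any marker, and matching is a plain substring test. Only three of the markers quoted above are listed; the fourth is a pattern with an elided middle and is left out rather than guessed. The function names are hypothetical.

```typescript
// Markers quoted in the text; the real per-language lexicons are larger.
const ATTRIBUTION_MARKERS: string[] = ["according to", "sê", "utshilo"];

// Assumption: one sentence = one claim. Splits on ., ! and ?.
function splitSentences(text: string): string[] {
  const pieces = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return pieces.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Returns attributed claims / total claims, scaled to 0-100.
function factDensity(text: string): number {
  const sentences = splitSentences(text);
  if (sentences.length === 0) return 0;
  const attributed = sentences.filter((sentence) => {
    const lower = sentence.toLowerCase();
    // Naive substring match; a real scorer would need word boundaries.
    return ATTRIBUTION_MARKERS.some((marker) => lower.includes(marker));
  }).length;
  return Math.round((attributed / sentences.length) * 100);
}
```

Substring matching will over-count short markers that also occur inside longer words, so the example should be read as the shape of the calculation rather than a usable scorer.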

Political drift (−1 to +1) - signed score from bias-right lexicon hits minus bias-left hits, normalised by article length. Negative = left-leaning, positive = right-leaning, zero = centre.
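In code, the drift formula might look like the sketch below. It assumes that "article length" means word count and that lexicon entries are single lower-case words; both are guesses, and the lexicons are passed in as parameters so no actual lexicon terms are implied.

```typescript
// Signed drift: (right hits - left hits) / word count, clamped to [-1, +1].
// Assumptions: length = word count; lexicon entries are single lower-case words.
function politicalDrift(
  text: string,
  rightLexicon: Set<string>,
  leftLexicon: Set<string>,
): number {
  const words = text
    .toLowerCase()
    .split(/[\s.,;:!?"()]+/)
    .filter((w) => w.length > 0);
  if (words.length === 0) return 0; // no text, no drift
  let rightHits = 0;
  let leftHits = 0;
  for (const word of words) {
    if (rightLexicon.has(word)) rightHits++;
    if (leftLexicon.has(word)) leftHits++;
  }
  const drift = (rightHits - leftHits) / words.length;
  return Math.min(1, Math.max(-1, drift));
}
```

With this normalisation a long article needs proportionally more lexicon hits than a short one to reach the same drift, which is the point of dividing by length.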

Every public score links back to the exact sentences and lexicon entries that contributed to it. "Why 87?" is a question you can answer by clicking the score.

Synthesis rubric

The synthesis prompt (synthesis-v1) instructs Claude to: report only verified facts with attribution, present all sides where sources disagree, use past-tense declarative broadsheet style, avoid loaded language, and produce a claim-map linking every asserted claim to the source article(s) that supplied it.

The validator prompt (validator-v1) checks: does every claim in the body have a supporting source sentence? Are framing-loaded words present? Is the disagreement report complete?
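The claim-map and the publish, retry or hold rule lend themselves to a small structural check that could run alongside the validator call. The sketch below is hypothetical: the Claim shape, the function names and the idea of a local pre-check are assumptions; only the "retry up to twice, then hold" rule comes from the pipeline description.

```typescript
// Hypothetical claim-map entry: one asserted claim and the articles backing it.
interface Claim {
  id: string;
  text: string;
  sourceArticleIds: string[];
}

type Decision = "publish" | "retry" | "hold";

const MAX_RETRIES = 2; // from the text: retry up to twice

// A claim is unsupported when none of its cited articles is in the cluster.
function unsupportedClaims(claims: Claim[], clusterArticleIds: Set<string>): Claim[] {
  return claims.filter(
    (claim) => !claim.sourceArticleIds.some((id) => clusterArticleIds.has(id)),
  );
}

// Pass -> publish; fail -> retry until the retries are used up, then hold.
function decide(validationPassed: boolean, retriesUsed: number): Decision {
  if (validationPassed) return "publish";
  return retriesUsed < MAX_RETRIES ? "retry" : "hold";
}
```

A check like this only confirms that each claim points at an article in the cluster; whether the cited sentence actually supports the claim is the part left to the validator prompt.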

Source registry

All indexed outlets are listed in config/sources.toml in the repository. Each entry has a lean prior (set by editorial review, updated quarterly), a neutrality ceiling, and reviewer notes.
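Once parsed, a registry entry might be typed and sanity-checked as in the sketch below. The three fields mirror the description above, but the field names, the value ranges (lean prior in −1 to +1, ceiling in 0 to 100) and the way the ceiling caps a score are assumptions; the actual schema is whatever config/sources.toml defines.

```typescript
// Hypothetical parsed form of one sources.toml entry.
interface SourceEntry {
  name: string;
  leanPrior: number; // assumed range: -1 (left) to +1 (right)
  neutralityCeiling: number; // assumed range: 0 to 100
  reviewerNotes: string;
}

// Returns a list of problems; an empty list means the entry looks sane.
function validateSourceEntry(entry: SourceEntry): string[] {
  const problems: string[] = [];
  if (entry.name.trim().length === 0) problems.push("name is empty");
  if (entry.leanPrior < -1 || entry.leanPrior > 1) {
    problems.push("leanPrior outside -1..+1");
  }
  if (entry.neutralityCeiling < 0 || entry.neutralityCeiling > 100) {
    problems.push("neutralityCeiling outside 0..100");
  }
  return problems;
}

// Assumed semantics: an article can never score above its source's ceiling.
function applyCeiling(score: number, entry: SourceEntry): number {
  return Math.min(score, entry.neutralityCeiling);
}
```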

What we don't do

We don't rank stories by popularity or clicks. We don't write opinions. We don't have a comments section. We don't use affiliate links or native advertising. The pipeline is the editorial desk.

Open decisions

Model selection (Sonnet vs Opus for synthesis) - bake-off pending once 50 clusters exist. Per-source priors are public now; we may refine methodology after a full quarter of data. Paywalled outlets (Netwerk24, parts of Financial Mail) - RSS-stub only for now.
