Coronial

How it works

Coronial is a free, searchable database of Australian coroners' findings built for clinicians, trainees, and researchers. This page explains how findings are collected, what information is extracted from them, how search works, and how privacy is protected.

Step 1 — Collecting findings from official sources

Every Australian state and territory maintains a public website where coroners' findings are published after they are finalised. An automated scraper runs weekly, visiting each of these official websites and checking for new findings. When a new finding is detected, the scraper downloads a copy of the PDF document and records the original source URL so users can always verify against the official version.
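As a rough illustration of the "checking for new findings" step, the sketch below compares the PDF links listed on a court website against the links already archived. The function name, the record fields, and the example.org URLs are hypothetical and chosen for illustration only; the actual scraper may detect new findings differently.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ArchivedFinding:
    """Hypothetical record kept for each downloaded finding."""
    source_url: str  # official URL, stored so users can verify against the source
    pdf_path: str    # assumed local path of the archived PDF copy


def detect_new(listed_urls: Iterable[str], known_urls: Iterable[str]) -> List[str]:
    """Return URLs on the listing that are not yet archived, in listing order."""
    seen = set(known_urls)
    new_urls = []
    for url in listed_urls:
        if url not in seen:
            seen.add(url)  # also guards against duplicates within one listing
            new_urls.append(url)
    return new_urls


if __name__ == "__main__":
    listing = ["https://example.org/a.pdf", "https://example.org/b.pdf"]
    print(detect_new(listing, known_urls=["https://example.org/a.pdf"]))
```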

Only publicly accessible findings are collected — nothing behind a login, paywall, or court restriction. The scraper is rate-limited and respectful of each site's terms of access, pausing between requests so as not to overload government servers.
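Pausing between requests can be sketched as a small throttle that enforces a minimum gap between consecutive downloads. The class name and the two-second interval in the usage note are assumptions for illustration, not the scraper's actual settings.

```python
import time
from typing import Callable, Optional


class PoliteThrottle:
    """Enforces a minimum delay between consecutive requests (illustrative only)."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request, if any

    def wait(self) -> float:
        """Sleep until min_interval has passed since the last call; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

A scraper loop would call `throttle.wait()` immediately before each download, for example with `throttle = PoliteThrottle(2.0)`. The clock and sleep functions are injectable so the behaviour can be checked without real waiting.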

Step 2 — Extracting text from PDFs

Coroners' findings are published as PDF documents. These vary enormously in quality: some are modern, digitally typeset documents; others are scanned images of typed or even handwritten pages from older proceedings.

The pipeline first attempts to extract text directly from the PDF using a digital extraction library. For most modern documents this works well. However, if the extracted text is too short or appears garbled (a sign that the PDF is a scanned image with no embedded text layer), the pipeline automatically falls back to Optical Character Recognition (OCR). OCR analyses the visual image of each page and converts it to machine-readable text, much as a phone camera can read a printed menu.
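The fallback decision can be sketched as a simple heuristic. The thresholds below (200 characters, 60% letters) and the function names are assumptions for illustration; the pipeline's real test for "too short or garbled" is not documented here. The two extractor callables stand in for whichever PDF and OCR libraries are used.

```python
from typing import Callable, Tuple


def looks_unusable(text: str, min_chars: int = 200, min_alpha_ratio: float = 0.6) -> bool:
    """Heuristic: True if the text is too short or mostly non-letters."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return True
    non_space = [c for c in stripped if not c.isspace()]
    if not non_space:
        return True
    alpha = sum(c.isalpha() for c in non_space)
    return alpha / len(non_space) < min_alpha_ratio


def extract_text(
    pdf_path: str,
    digital_extract: Callable[[str], str],
    ocr_extract: Callable[[str], str],
) -> Tuple[str, str]:
    """Try digital extraction first; fall back to OCR if the result looks unusable."""
    text = digital_extract(pdf_path)
    if looks_unusable(text):
        return ocr_extract(pdf_path), "ocr"
    return text, "digital"
```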

The extracted text is cached so that it does not need to be re-processed on every pipeline run. The original archived PDF is also stored and linked from each case page, so you can always read the source document in full.
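One common way to cache extracted text is to key it on a hash of the PDF's bytes, so an unchanged document is never processed twice. This is a minimal sketch under that assumption; the cache layout and the function name are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_text(pdf_bytes: bytes, cache_dir: Path, extract: Callable[[bytes], str]) -> str:
    """Return extracted text, reusing a cached copy keyed on the PDF's SHA-256 hash."""
    key = hashlib.sha256(pdf_bytes).hexdigest()
    cache_file = cache_dir / f"{key}.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = extract(pdf_bytes)  # expensive step: digital extraction or OCR
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```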

Step 3 — AI-powered field extraction

Once the text of a finding has been extracted, it is sent to a large language model (Claude, made by Anthropic) with a detailed prompt asking it to read the finding and extract specific structured information. This is the core of what makes the database searchable and filterable.

The AI reads each finding in full — which can be anywhere from a few pages to over a hundred — and returns a structured summary covering the fields below. This step is analogous to a trained researcher reading the document and filling out a standardised case report form.

AI summary

A concise plain-English summary of what happened, the clinical context, and the key findings.

Cause of death

The cause or mechanism of death as recorded by the coroner.

Specialty involved

The medical or surgical specialty most relevant to the case (e.g. emergency medicine, cardiology).

Type of error

The category of clinical error, if any — such as diagnostic error, medication error, or failure to escalate.

Clinical setting

Where the primary clinical event occurred: hospital, aged care, mental health, community, home, or other. This reflects the location where the fatal chain of events began, not necessarily where death was certified.

Drugs involved

Any medications that played a material role in the case.

Clinical conditions

Diagnoses and medical conditions relevant to the circumstances of death.

Procedures

Surgical or medical procedures that were performed or should have been.

Contributing factors

Systemic, organisational, or individual factors the coroner identified as contributing to the death.

Recommendations

Formal recommendations the coroner made to prevent similar deaths in future.

Preventability

The coroner's assessment of whether the death was preventable, possibly preventable, or not preventable.

Escalation failure

Whether failure to escalate care (calling for help or upgrading care level) was identified as a factor.

Hospital name

The hospital or health service where care was delivered, normalised to a canonical name.

Demographics

Age, sex, and age group of the deceased, used to enable demographic filtering.
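The fields above can be pictured as a single structured record per finding. The sketch below assumes the model returns its answer as JSON with these key names; both the key names and the validation rules are illustrative assumptions, not the database's actual schema.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed values taken from the field descriptions above.
CLINICAL_SETTINGS = {"hospital", "aged care", "mental health", "community", "home", "other"}
PREVENTABILITY = {"preventable", "possibly preventable", "not preventable"}


@dataclass
class ExtractedCase:
    summary: str
    cause_of_death: str
    specialty: Optional[str] = None
    error_type: Optional[str] = None
    clinical_setting: str = "other"
    drugs: List[str] = field(default_factory=list)
    conditions: List[str] = field(default_factory=list)
    procedures: List[str] = field(default_factory=list)
    contributing_factors: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)
    preventability: Optional[str] = None
    escalation_failure: Optional[bool] = None
    hospital_name: Optional[str] = None
    age: Optional[int] = None
    sex: Optional[str] = None
    age_group: Optional[str] = None


def parse_case(raw_json: str) -> ExtractedCase:
    """Parse a JSON response into a record, rejecting values outside the fixed lists."""
    case = ExtractedCase(**json.loads(raw_json))  # unknown keys raise TypeError
    if case.clinical_setting not in CLINICAL_SETTINGS:
        raise ValueError(f"unexpected clinical setting: {case.clinical_setting!r}")
    if case.preventability is not None and case.preventability not in PREVENTABILITY:
        raise ValueError(f"unexpected preventability: {case.preventability!r}")
    return case
```

Validating the fixed-choice fields at this point means an unexpected value is caught before it reaches the search filters.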

AI extraction is not perfect. The model occasionally miscategorises a specialty, misses a drug name, or generates a summary that does not fully reflect the finding. All AI-generated content is labelled as such on case pages, and users should always refer to the original document when accuracy matters. If you spot an error, please let us know.

Step 4 — Anonymising clinician names

Coroners' findings frequently name the treating clinicians involved in a case — doctors, nurses, paramedics, and others. While these findings are public records, republishing full clinician names in a searchable database creates a meaningful privacy risk: a clinician's name could become permanently and prominently associated with an adverse outcome in a way that goes beyond what the original court record warrants.

To address this, the pipeline automatically detects doctor name patterns in the AI-generated summary, contributing factors, recommendations, and other text fields, and replaces each name with the title and surname initial. For example, “Dr James Smith” becomes “Dr S.” The substitution preserves clinical context while removing the identifying information.
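A pattern-based substitution of this kind can be sketched with a regular expression. The pattern below is an assumption for illustration: it only handles the titles "Dr" and "Prof" followed by capitalised name words, and it will miss or mangle edge cases (possessives such as "Dr Smith's", or a capitalised word directly after a name) that a production pipeline would need to handle.

```python
import re

# One capitalised name word, allowing apostrophes and hyphens (O'Brien, Lloyd-Jones).
_NAME = r"[A-Z][A-Za-z'\-]+"
# Title, optional full stop, one or more name words, optional trailing full stop.
_PATTERN = re.compile(rf"\b(Dr|Prof)\.?\s+({_NAME}(?:\s+{_NAME})*)\.?")


def anonymise(text: str) -> str:
    """Replace 'Dr James Smith' style names with the title and surname initial."""

    def _to_initial(match: "re.Match[str]") -> str:
        surname = match.group(2).split()[-1]
        return f"{match.group(1)} {surname[0]}."

    return _PATTERN.sub(_to_initial, text)


if __name__ == "__main__":
    print(anonymise("Dr James Smith reviewed the patient."))
```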

This applies only to the structured fields indexed in the database. The original PDF — which is a public court document — is linked unmodified from each case page.

Step 5 — Standardising terminology

Medical terminology is not consistent. Different findings may refer to the same condition as “sepsis”, “septicaemia”, or “blood poisoning”. British and American spelling variants are also common — “anaemia” vs “anemia”, “haemorrhage” vs “hemorrhage”.

A vocabulary standardisation step runs over each new case. It checks every clinical condition and procedure tag against a curated master vocabulary list and normalises it to the canonical form (Australian English spellings throughout). Terms that do not match any existing entry are flagged in a review queue for potential addition to the vocabulary.
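In outline, the standardisation step is a lookup against the master vocabulary with a review queue for misses. The tiny vocabulary below is illustrative only, and the choice of “sepsis” as the canonical form for its synonyms is an assumption.

```python
from typing import Dict, Iterable, List, Tuple

# Illustrative fragment of a master vocabulary: variant -> canonical form.
VOCABULARY: Dict[str, str] = {
    "sepsis": "sepsis",
    "septicaemia": "sepsis",
    "blood poisoning": "sepsis",
    "anaemia": "anaemia",
    "anemia": "anaemia",
    "haemorrhage": "haemorrhage",
    "hemorrhage": "haemorrhage",
}


def standardise(tags: Iterable[str], vocabulary: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (canonical tags without duplicates, unmatched tags for review)."""
    normalised: List[str] = []
    review_queue: List[str] = []
    for tag in tags:
        key = " ".join(tag.lower().split())  # lower-case and collapse whitespace
        canonical = vocabulary.get(key)
        if canonical is None:
            review_queue.append(tag)
        elif canonical not in normalised:
            normalised.append(canonical)
    return normalised, review_queue
```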

Hospital names go through a similar process. A fuzzy-matching step compares each hospital name extracted by the AI against a master list of known Australian hospitals and snaps it to the canonical name if the match is close enough (for example, treating “The Royal Melbourne Hospital” and “Royal Melbourne Hospital” as the same hospital).
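Fuzzy matching of this kind can be sketched with Python's standard `difflib` module. The three-entry hospital list, the canonical spellings, and the 0.85 similarity cutoff are assumptions for illustration; the pipeline's actual matcher and threshold may differ.

```python
import difflib
from typing import List, Optional

# Illustrative fragment of a master list; the canonical spellings are assumed.
KNOWN_HOSPITALS = [
    "Royal Melbourne Hospital",
    "Royal Prince Alfred Hospital",
    "Royal Hobart Hospital",
]


def canonical_hospital(name: str, known: List[str], cutoff: float = 0.85) -> Optional[str]:
    """Return the closest canonical name, or None if nothing is similar enough."""
    lookup = {k.lower(): k for k in known}  # compare case-insensitively
    matches = difflib.get_close_matches(
        name.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    print(canonical_hospital("The Royal Melbourne Hospital", KNOWN_HOSPITALS))
```

Returning `None` for weak matches, rather than forcing the nearest name, avoids attaching a case to the wrong hospital.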

How search works

Search queries run against a full-text index built over all structured fields — including the AI summary, specialty, drug names, clinical conditions, procedures, contributing factors, and recommendations. The underlying search engine is SQLite's FTS5, which uses the BM25 ranking algorithm to sort results by relevance when a query is present.
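A minimal sketch of an FTS5 index ranked with BM25 is shown below, assuming the Python build's SQLite includes the FTS5 extension (most do). The table name, column names, and sample rows are invented for illustration and are not the database's real schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical index over a few of the structured fields.
conn.execute("CREATE VIRTUAL TABLE cases_fts USING fts5(summary, drugs, conditions)")
conn.executemany(
    "INSERT INTO cases_fts (rowid, summary, drugs, conditions) VALUES (?, ?, ?, ?)",
    [
        (1, "Insulin overdose after a prescribing error", "insulin", "hypoglycaemia"),
        (2, "Delayed recognition of sepsis in the emergency department", "", "sepsis"),
        (3, "Missed hypoglycaemia in a patient on an insulin infusion", "insulin", "hypoglycaemia"),
    ],
)


def search(connection: sqlite3.Connection, query: str, limit: int = 10) -> list:
    """Return (rowid, summary) pairs for matching cases, most relevant first."""
    sql = (
        "SELECT rowid, summary FROM cases_fts "
        "WHERE cases_fts MATCH ? "
        "ORDER BY bm25(cases_fts) "  # bm25() gives better matches lower scores
        "LIMIT ?"
    )
    return connection.execute(sql, (query, limit)).fetchall()


if __name__ == "__main__":
    for rowid, summary in search(conn, "insulin"):
        print(rowid, summary)
```

Because `bm25()` returns lower values for better matches, the default ascending sort order puts the most relevant results first.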

Filters (state, specialty, error type, etc.) work as exact-match refinements on top of the full-text search. You can combine a text query with multiple filters — for example, searching for “insulin” and filtering to ICU cases in Victoria.
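Combining a text query with exact-match filters can be sketched as a join between the full-text index and an ordinary table holding the filterable columns. The schema, column names, and sample rows here are hypothetical, and the sketch again assumes an SQLite build with FTS5.

```python
import sqlite3
from typing import Dict, List

ALLOWED_FILTERS = {"state", "specialty", "error_type"}  # assumed filter columns

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cases (id INTEGER PRIMARY KEY, state TEXT, specialty TEXT, error_type TEXT);
    CREATE VIRTUAL TABLE cases_fts USING fts5(summary);
    INSERT INTO cases VALUES (1, 'VIC', 'intensive care', 'medication error');
    INSERT INTO cases VALUES (2, 'NSW', 'intensive care', 'medication error');
    INSERT INTO cases VALUES (3, 'VIC', 'cardiology', 'diagnostic error');
    INSERT INTO cases_fts (rowid, summary) VALUES (1, 'Insulin infusion error');
    INSERT INTO cases_fts (rowid, summary) VALUES (2, 'Insulin overdose');
    INSERT INTO cases_fts (rowid, summary) VALUES (3, 'Missed diagnosis in a patient on insulin');
    """
)


def search_with_filters(
    connection: sqlite3.Connection, query: str, filters: Dict[str, str]
) -> List[int]:
    """Full-text search narrowed by exact-match filters; returns case ids by relevance."""
    clauses, params = ["cases_fts MATCH ?"], [query]
    for column, value in filters.items():
        if column not in ALLOWED_FILTERS:  # never interpolate unchecked column names
            raise ValueError(f"unknown filter: {column}")
        clauses.append(f"c.{column} = ?")
        params.append(value)
    sql = (
        "SELECT c.id FROM cases_fts JOIN cases AS c ON c.id = cases_fts.rowid "
        "WHERE " + " AND ".join(clauses) + " ORDER BY bm25(cases_fts)"
    )
    return [row[0] for row in connection.execute(sql, params)]
```

With this sample data, `search_with_filters(conn, "insulin", {"state": "VIC", "specialty": "intensive care"})` narrows three text matches down to the single case that satisfies both filters.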

Note that the full text of the original PDF documents is not part of the search index. Search runs on the structured, AI-extracted fields only. This is an intentional design choice: full-document search tends to surface findings that mention a condition only tangentially, producing less relevant results for clinical learning purposes. See the Research page for more detail on this.

Limitations and important caveats

  • Coverage is not complete. Some findings are not published online by courts. Some older findings may not have been captured. The database should be used as a learning and discovery tool, not as an exhaustive archive.
  • AI tagging contains errors. Specialty classification, error typing, and drug identification are all AI-generated and imperfect. Treat them as a useful approximation, not a verified classification.
  • Findings are legal, not clinical, documents. A coroner's finding reflects the coronial process and the evidence presented in those proceedings. Clinical details may be incomplete, stated at a lay level, or filtered through expert testimony.
  • Context matters. No summary captures the full nuance of a multi-day inquest. Always read the original document before drawing any clinical or educational conclusions from a case.
  • Some documents have been excluded due to redaction or non-publication orders. Every effort is made to honour court orders suppressing identifying information. A small number of older PDF documents have been found to have technically defective redaction: the text is visually obscured by a black overlay, but the underlying text is still present and copyable. To avoid surfacing protected names in search results, those documents have been omitted from the database entirely.

How often is the database updated?

The scraper runs on a weekly schedule. New findings published by coronial courts are typically added within a few days of appearing on the official court websites. Because findings must be scraped, extracted, processed by the AI, and then indexed, there is always some lag between a finding's publication and its appearance here.

If you are looking for a very recent finding that you believe has been published, check the relevant court's website directly. Links to all official coronial court websites are included on each case page under “Official source”.