Methodology

How MedicOath discovers, scores, and verifies cure hypotheses. Full transparency — no black box.

Neglect Score Calculation

The MedicOath Neglect Score (0-100) measures how underinvestigated a disease is relative to its burden. It combines:

- **Disease burden** (DALYs, mortality, affected population) from the WHO Global Burden of Disease study
- **Research investment** (NIH funding, clinical trial count, publications) from NIH RePORTER and PubMed
- **Treatment availability** (approved therapies, pipeline drugs) from FDA and EMA databases
- **Market incentive** (estimated revenue potential vs development cost)

Formula: Neglect = Burden_normalised - Investment_normalised, scaled 0-100.

A score of 100 means maximum neglect: high burden, near-zero investment. A score of 0 means the disease receives proportional attention.
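As a rough illustration, the formula can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not MedicOath's actual implementation: the function names are hypothetical, the inputs are taken to be already normalised to [0, 1], and negative differences are clamped to zero so that a score of 0 corresponds to proportional attention. How treatment availability and market incentive fold into the two normalised terms is not specified here, so the sketch leaves that out.

```python
def normalise(value: float, lo: float, hi: float) -> float:
    """Min-max normalise a raw value into [0, 1].

    lo and hi are assumed bounds across the disease cohort (hypothetical).
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def neglect_score(burden_norm: float, investment_norm: float) -> float:
    """Burden minus investment, clamped at zero and scaled to 0-100.

    Clamping is an assumption made so that 0 means proportional attention.
    """
    for name, x in (("burden_norm", burden_norm), ("investment_norm", investment_norm)):
        if not 0.0 <= x <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {x}")
    return 100.0 * max(0.0, burden_norm - investment_norm)
```

Under these assumptions, `neglect_score(1.0, 0.0)` gives 100.0 (maximum neglect) and `neglect_score(0.5, 0.5)` gives 0.0 (proportional attention).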

Hypothesis Generation

Hypotheses are generated by Claude Sonnet (Anthropic's reasoning model) operating on evidence gathered from 100+ biomedical data sources including:

- PubMed (biomedical literature)
- ClinicalTrials.gov (trial registry)
- ChEMBL (compound bioactivity)
- UniProt (protein data)
- Traditional medicine databases (Ayurveda, TCM, African pharmacopoeia)
- Historical archives (Soviet medical literature, WHO archives)
- Negative results repositories

The AI reads the evidence corpus for a given disease, identifies potential drug repurposing opportunities, and generates structured hypotheses with mechanistic explanations.

7-Dimension Scoring

Every hypothesis is scored independently across 7 dimensions, each 0.0-1.0:

1. **Confidence**: Overall likelihood this hypothesis is correct
2. **Mechanistic Plausibility**: How well the proposed mechanism is supported by known biology
3. **Evidence Convergence**: Do multiple independent lines of evidence agree?
4. **Novelty**: Has this connection been proposed before?
5. **Tractability**: Can this be tested with existing compounds and methods?
6. **Impact**: If validated, how many patients would benefit?
7. **Cost Efficiency**: Cost to validate relative to potential impact

Scoring uses a separate Claude Sonnet call with a dedicated rubric prompt. Each dimension is scored with explicit reasoning.
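One way to picture the result of a scoring call is as a small record with one field per dimension. The sketch below is illustrative only: the class name `HypothesisScores` and the field names are assumptions derived from the dimension names above, not the project's actual schema. It only enforces the stated 0.0-1.0 range.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HypothesisScores:
    """Hypothetical container for the seven per-dimension scores."""
    confidence: float
    mechanistic_plausibility: float
    evidence_convergence: float
    novelty: float
    tractability: float
    impact: float
    cost_efficiency: float

    def __post_init__(self) -> None:
        # Reject any dimension outside the documented 0.0-1.0 range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must be in [0.0, 1.0], got {value}")
```

Keeping the dimensions as separate fields, rather than collapsing them into one number, mirrors the statement that each is scored independently.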

Safety Gates

The Layer-6 cure engine includes two mandatory safety gates:

**SafetyGateAgent**: Queries ChEMBL for toxicity data (LD50, NOAEL, adverse events) on every proposed compound. If safety concerns exceed a threshold, the hypothesis is flagged and requires human review.

**EthicsReviewAgent**: Screens every hypothesis against the Belmont Report principles (respect for persons, beneficence, justice). Hypotheses that raise ethical concerns are flagged.

Both gates must pass for a hypothesis to be published. Neither gate can be bypassed.
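The "both gates must pass" rule can be sketched as a simple conjunction over gate results. This is a hedged illustration, not the Layer-6 code: `GateResult` and `can_publish` are hypothetical names, and the sketch treats a missing gate result as a failure so that skipping a gate can never lead to publication.

```python
from dataclasses import dataclass
from typing import List

# Gate names taken from the text; everything else here is an assumption.
REQUIRED_GATES = ("SafetyGateAgent", "EthicsReviewAgent")


@dataclass(frozen=True)
class GateResult:
    gate: str
    passed: bool
    reason: str = ""


def can_publish(results: List[GateResult]) -> bool:
    """Return True only if every required gate reported a pass."""
    by_gate = {r.gate: r for r in results}
    # A gate with no recorded result counts as not passed.
    return all(g in by_gate and by_gate[g].passed for g in REQUIRED_GATES)
```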

ProofChain Timestamping

Every published hypothesis is:

1. **Content-hashed**: SHA-256 of the canonical JSON
2. **Pinned to IPFS**: Permanent, content-addressed storage via Kubo
3. **Registered on Ethereum Sepolia**: The ResearchProvenance smart contract records the content hash, IPFS CID, and author address

This creates an immutable timestamp proving the hypothesis existed at a specific point in time. Under U.S. patent law (35 U.S.C. § 102), public disclosure may establish prior art.

Anyone can verify: check the IPFS CID, check the blockchain transaction, compare the content hash.
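The content-hash step, and the "compare the content hash" part of verification, can be sketched with the Python standard library. The canonicalisation used here (sorted keys, compact separators, UTF-8) is an assumption for illustration; MedicOath's actual canonical JSON rules may differ, in which case the digests would not match. The function names are hypothetical.

```python
import hashlib
import json


def content_hash(hypothesis: dict) -> str:
    """SHA-256 hex digest of an assumed canonical JSON form."""
    # Sorted keys and fixed separators make the serialisation independent
    # of dict insertion order and whitespace.
    canonical = json.dumps(
        hypothesis, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(hypothesis: dict, recorded_hash: str) -> bool:
    """Recompute the digest and compare it with the recorded one."""
    return content_hash(hypothesis) == recorded_hash.lower()
```

Because the digest depends only on the canonical bytes, two copies of a hypothesis with the same fields in a different order hash identically, while any change to a value produces a different digest.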

Versioning Policy

Hypotheses are living documents. When new evidence is incorporated:

- A new version is created (never overwritten)
- The change is classified: evidence_added, challenged, safety_updated, confidence_recalculated, forked
- The old version remains accessible
- Each version gets its own ProofChain timestamp
- Confidence scores are recalculated with the new evidence

This creates a full audit trail of how scientific understanding evolves.
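The never-overwrite rule amounts to an append-only version list. The sketch below shows one possible shape, assuming hypothetical `Version` and `HypothesisHistory` types; only the five change classifications come from the policy above. Timestamping and confidence recalculation are left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ChangeType(Enum):
    EVIDENCE_ADDED = "evidence_added"
    CHALLENGED = "challenged"
    SAFETY_UPDATED = "safety_updated"
    CONFIDENCE_RECALCULATED = "confidence_recalculated"
    FORKED = "forked"


@dataclass(frozen=True)
class Version:
    number: int
    change: Optional[ChangeType]  # None only for the initial version
    content: dict


@dataclass
class HypothesisHistory:
    versions: List[Version] = field(default_factory=list)

    def add(self, content: dict, change: Optional[ChangeType] = None) -> Version:
        """Append a new version; earlier versions are never modified."""
        if self.versions and change is None:
            raise ValueError("every revision needs a change classification")
        # Copy the content so later edits by the caller cannot alter history.
        version = Version(len(self.versions) + 1, change, dict(content))
        self.versions.append(version)
        return version
```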

Autonomous Discovery Pipeline

MedicOath runs continuously, with no human intervention until the approval checkpoint in step 5:

1. **Omnivore** ingests papers from 100+ sources on scheduled cron jobs
2. **Prometheus Engine** assesses each paper with Claude Haiku for significance
3. **Hypothesis Engine** generates repurposing hypotheses every 4 hours
4. **Layer-6** runs full cure investigations when high-confidence signals are detected
5. **Human-in-the-Loop** checkpoint pauses the pipeline for human approval
6. **ProofChain** timestamps every result

Budget controls cap API usage at ~£5/day to prevent runaway costs.
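A daily spending cap of this kind can be sketched as a small guard that each pipeline stage consults before making an API call. This is an assumption-laden illustration, not the project's budget code: the `DailyBudget` class, its method name, and the per-call cost estimates are hypothetical; only the roughly £5/day figure comes from the text.

```python
from datetime import date
from typing import Optional


class DailyBudget:
    """Hypothetical guard that refuses spending beyond a daily cap."""

    def __init__(self, cap_gbp: float = 5.0) -> None:
        self.cap_gbp = cap_gbp
        self._day: Optional[date] = None
        self._spent = 0.0

    def try_spend(self, cost_gbp: float, today: date) -> bool:
        """Record the cost and return True if it fits under today's cap."""
        if today != self._day:
            # First call on a new day: reset the running total.
            self._day, self._spent = today, 0.0
        if self._spent + cost_gbp > self.cap_gbp:
            return False
        self._spent += cost_gbp
        return True
```

A stage that receives `False` would skip or defer its call until the next day, which is one simple way to keep a scheduled pipeline from running up costs.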

MedicOath is open source (AGPL-3.0). Every line of code is public.

View source on GitHub