Neglect Score Calculation
The MedicOath Neglect Score (0-100) measures how underinvestigated a disease is relative to its burden. It combines:
- **Disease burden** (DALYs, mortality, affected population) from the WHO Global Burden of Disease study
- **Research investment** (NIH funding, clinical trial count, publications) from NIH RePORTER and PubMed
- **Treatment availability** (approved therapies, pipeline drugs) from FDA and EMA databases
- **Market incentive** (estimated revenue potential vs development cost)
Formula: Neglect = Burden_normalised - Investment_normalised, scaled 0-100.
A score of 100 means maximum neglect: high burden, near-zero investment. A score of 0 means the disease receives proportional attention.
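As a rough illustration of the formula above, the sketch below computes a score from two inputs that are already normalised. The function name `neglect_score`, the requirement that both inputs lie in [0, 1], and the clamping of negative differences to 0 (so a disease whose investment exceeds its burden scores 0 rather than a negative number) are assumptions made for this example. The text does not specify how normalisation is performed or how treatment availability and market incentive enter the score, so those are left out.

```python
def neglect_score(burden_normalised: float, investment_normalised: float) -> float:
    """Return a 0-100 neglect score from two inputs normalised to [0, 1].

    Assumption: a negative difference (investment exceeds burden) is
    clamped to 0, matching "0 means proportional attention".
    """
    inputs = (("burden", burden_normalised), ("investment", investment_normalised))
    for name, value in inputs:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    difference = burden_normalised - investment_normalised
    return 100.0 * max(0.0, difference)


if __name__ == "__main__":
    print(neglect_score(1.0, 0.0))    # high burden, no investment -> 100.0
    print(neglect_score(0.75, 0.25))  # partial neglect -> 50.0
    print(neglect_score(0.5, 0.5))    # proportional attention -> 0.0
```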
Hypothesis Generation
Hypotheses are generated by Claude Sonnet (Anthropic's reasoning model) operating on evidence gathered from 100+ biomedical data sources including:
- PubMed (biomedical literature)
- ClinicalTrials.gov (trial registry)
- ChEMBL (compound bioactivity)
- UniProt (protein data)
- Traditional medicine databases (Ayurveda, TCM, African pharmacopoeia)
- Historical archives (Soviet medical literature, WHO archives)
- Negative results repositories
The AI reads the evidence corpus for a given disease, identifies potential drug repurposing opportunities, and generates structured hypotheses with mechanistic explanations.
7-Dimension Scoring
Every hypothesis is scored independently across 7 dimensions, each 0.0-1.0:
1. **Confidence**: Overall likelihood this hypothesis is correct
2. **Mechanistic Plausibility**: How well the proposed mechanism is supported by known biology
3. **Evidence Convergence**: Do multiple independent lines of evidence agree?
4. **Novelty**: Has this connection been proposed before?
5. **Tractability**: Can this be tested with existing compounds and methods?
6. **Impact**: If validated, how many patients would benefit?
7. **Cost Efficiency**: Cost to validate relative to potential impact
Scoring uses a separate Claude Sonnet call with a dedicated rubric prompt. Each dimension is scored with explicit reasoning.
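One way to picture the result of a scoring call is as a record with seven bounded fields. The sketch below is an illustration only: the class name `HypothesisScores` and the field names are hypothetical, chosen to mirror the dimensions listed above, and the range check is an assumption about how a 0.0-1.0 constraint might be enforced.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HypothesisScores:
    """Seven independent scores, each expected to lie in [0.0, 1.0]."""

    confidence: float
    mechanistic_plausibility: float
    evidence_convergence: float
    novelty: float
    tractability: float
    impact: float
    cost_efficiency: float

    def __post_init__(self) -> None:
        # Reject any dimension outside the documented 0.0-1.0 range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must be in [0.0, 1.0], got {value}")


if __name__ == "__main__":
    scores = HypothesisScores(0.6, 0.7, 0.5, 0.9, 0.8, 0.4, 0.3)
    print(scores)
```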
Safety Gates
The Layer-6 cure engine includes two mandatory safety gates:
**SafetyGateAgent**: Queries ChEMBL for toxicity data (LD50, NOAEL, adverse events) on every proposed compound. If safety concerns exceed a threshold, the hypothesis is flagged and requires human review.
**EthicsReviewAgent**: Screens every hypothesis against the Belmont Report principles (respect for persons, beneficence, justice). Hypotheses that raise ethical concerns are flagged.
Both gates must pass for a hypothesis to be published. Neither gate can be bypassed.
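The "both gates must pass" rule can be sketched as a simple check over gate results. The `GateResult` type and `can_publish` function are hypothetical names for this illustration, not the project's actual API. The sketch assumes that a missing gate result counts as a failure, which is one way to make a gate impossible to skip.

```python
from dataclasses import dataclass
from typing import List

# Gate names taken from the text; every one of them must report a pass.
REQUIRED_GATES = frozenset({"SafetyGateAgent", "EthicsReviewAgent"})


@dataclass(frozen=True)
class GateResult:
    gate: str
    passed: bool
    reason: str = ""


def can_publish(results: List[GateResult]) -> bool:
    """Return True only if every required gate ran and none of them failed."""
    passed = {r.gate for r in results if r.passed}
    failed = {r.gate for r in results if not r.passed}
    # A gate that never ran is absent from `passed`, so it blocks publication.
    return REQUIRED_GATES <= passed and not (REQUIRED_GATES & failed)


if __name__ == "__main__":
    results = [
        GateResult("SafetyGateAgent", True),
        GateResult("EthicsReviewAgent", False, "flagged for human review"),
    ]
    print(can_publish(results))  # False: the ethics gate did not pass
```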
ProofChain Timestamping
Every published hypothesis is:
1. **Content-hashed**: SHA-256 of the canonical JSON
2. **Pinned to IPFS**: Permanent, content-addressed storage via Kubo
3. **Registered on Ethereum Sepolia**: The ResearchProvenance smart contract records the content hash, IPFS CID, and author address
This creates an immutable timestamp proving the hypothesis existed at a specific point in time. Under U.S. patent law (35 U.S.C. § 102), public disclosure may establish prior art.
Anyone can verify: check the IPFS CID, check the blockchain transaction, compare the content hash.
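The content-hash comparison can be sketched with the Python standard library. The text does not define "canonical JSON", so the canonicalisation used here (sorted keys, compact separators, UTF-8 encoding) is an assumption, and the function names are hypothetical. Checking the IPFS CID and the on-chain record is outside the scope of this sketch.

```python
import hashlib
import json


def canonical_json(document: dict) -> bytes:
    """Serialise deterministically (assumed rule: sorted keys, no extra whitespace)."""
    text = json.dumps(document, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_hash(document: dict) -> str:
    """Return the SHA-256 hex digest of the canonical serialisation."""
    return hashlib.sha256(canonical_json(document)).hexdigest()


def verify(document: dict, expected_hash: str) -> bool:
    """Recompute the hash and compare it with the recorded one."""
    return content_hash(document) == expected_hash.lower()


if __name__ == "__main__":
    hypothesis = {"title": "example hypothesis", "version": 1}
    recorded = content_hash(hypothesis)
    print(recorded)
    print(verify(hypothesis, recorded))
```

Because keys are sorted before hashing, two documents with the same content but different key order produce the same digest, while any change to a value produces a different one.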
Versioning Policy
Hypotheses are living documents. When new evidence is incorporated:
- A new version is created (never overwritten)
- The change is classified: evidence_added, challenged, safety_updated, confidence_recalculated, forked
- The old version remains accessible
- Each version gets its own ProofChain timestamp
- Confidence scores are recalculated with the new evidence
This creates a full audit trail of how scientific understanding evolves.
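The versioning rules above amount to an append-only history. The sketch below shows one possible shape for it; the class names `HypothesisVersion` and `HypothesisHistory` are hypothetical, the change types are taken from the list above, and the ProofChain timestamping of each version is omitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ChangeType(Enum):
    EVIDENCE_ADDED = "evidence_added"
    CHALLENGED = "challenged"
    SAFETY_UPDATED = "safety_updated"
    CONFIDENCE_RECALCULATED = "confidence_recalculated"
    FORKED = "forked"


@dataclass(frozen=True)
class HypothesisVersion:
    number: int
    content: str
    confidence: float
    change: Optional[ChangeType]  # None for the initial version


class HypothesisHistory:
    """Append-only list of versions; existing versions are never modified."""

    def __init__(self, content: str, confidence: float) -> None:
        self._versions: List[HypothesisVersion] = [
            HypothesisVersion(1, content, confidence, None)
        ]

    def add_version(self, content: str, confidence: float, change: ChangeType) -> HypothesisVersion:
        version = HypothesisVersion(len(self._versions) + 1, content, confidence, change)
        self._versions.append(version)
        return version

    def get(self, number: int) -> HypothesisVersion:
        if not 1 <= number <= len(self._versions):
            raise KeyError(f"no version {number}")
        return self._versions[number - 1]

    @property
    def latest(self) -> HypothesisVersion:
        return self._versions[-1]


if __name__ == "__main__":
    history = HypothesisHistory("initial hypothesis text", confidence=0.5)
    history.add_version("revised hypothesis text", 0.6, ChangeType.EVIDENCE_ADDED)
    print(history.latest.number, history.get(1).content)
```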
Autonomous Discovery Pipeline
MedicOath operates continuously, with human intervention required only at a single approval checkpoint:
1. **Omnivore** ingests papers from 100+ sources on scheduled cron jobs
2. **Prometheus Engine** assesses each paper with Claude Haiku for significance
3. **Hypothesis Engine** generates repurposing hypotheses every 4 hours
4. **Layer-6** runs full cure investigations when high-confidence signals are detected
5. **Human-in-the-Loop** checkpoint pauses the pipeline for human approval
6. **ProofChain** timestamps every result
Budget controls cap API usage at ~£5/day to prevent runaway costs.
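A daily spending cap of this kind can be sketched as a small guard that each stage consults before making an API call. The `DailyBudget` class, its method name, and the exact reset-at-date-change behaviour are assumptions for illustration; the text states only the approximate daily figure.

```python
from datetime import date
from typing import Optional


class DailyBudget:
    """Tracks spend per calendar day and refuses calls that would exceed the cap."""

    def __init__(self, daily_cap: float = 5.0) -> None:
        self.daily_cap = daily_cap  # in pounds, per the ~£5/day figure
        self._day: Optional[date] = None
        self._spent = 0.0

    def try_spend(self, cost: float, today: date) -> bool:
        """Record the cost and return True if it fits in today's remaining budget."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if today != self._day:
            # A new day starts with a fresh allowance.
            self._day = today
            self._spent = 0.0
        if self._spent + cost > self.daily_cap:
            return False
        self._spent += cost
        return True


if __name__ == "__main__":
    budget = DailyBudget(daily_cap=5.0)
    today = date.today()
    print(budget.try_spend(3.0, today))  # True: within the cap
    print(budget.try_spend(2.5, today))  # False: would exceed the cap
```

A caller would skip or defer the API request whenever `try_spend` returns `False`.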
MedicOath is open source (AGPL-3.0). Every line of code is public.