Open-source infrastructure for literature-based drug discovery.
{
  "subject": "disulfiram",
  "subject_type": "drug",
  "predicate": "inhibits",
  "object": "ALDH1A3",
  "object_type": "gene",
  "context": "glioblastoma stem cells",
  "polarity": "positive",
  "confidence": 0.85
}
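A record like the one above could be loaded into a typed structure before it enters the graph. The following is a minimal sketch, not Robertium's actual code: the `Claim` class, the `parse_claim` helper, and the assumption that `polarity` is either "positive" or "negative" are illustrative choices. Only the field names and the example values come from the record shown.

```python
import json
from dataclasses import dataclass

# Assumption: only "positive" appears in the example record; "negative" is a guess
# at the opposite value and may not match the real schema.
ALLOWED_POLARITIES = {"positive", "negative"}


@dataclass(frozen=True)
class Claim:
    """Hypothetical typed view of one extracted claim."""
    subject: str
    subject_type: str
    predicate: str
    object: str
    object_type: str
    context: str
    polarity: str
    confidence: float

    def __post_init__(self):
        # Reject records whose confidence is not a probability-like score.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")
        if self.polarity not in ALLOWED_POLARITIES:
            raise ValueError(f"unknown polarity: {self.polarity}")


def parse_claim(raw: str) -> Claim:
    """Parse one JSON claim; missing or extra keys raise TypeError."""
    return Claim(**json.loads(raw))


EXAMPLE = """
{
  "subject": "disulfiram",
  "subject_type": "drug",
  "predicate": "inhibits",
  "object": "ALDH1A3",
  "object_type": "gene",
  "context": "glioblastoma stem cells",
  "polarity": "positive",
  "confidence": 0.85
}
"""

claim = parse_claim(EXAMPLE)
print(claim.subject, claim.predicate, claim.object)
```

Validating at load time keeps malformed extractions out of the graph, so later steps can rely on every claim having the same eight fields.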
Over 4,000 biomedical papers are published every day. Knowledge stays fragmented across journals, disciplines, and languages. The connections that would lead to new treatments often already exist in the published literature, but no human team can find them systematically.
Robertium reads open biomedical literature, extracts structured claims about drugs, genes, and diseases, and connects them into a knowledge graph. From this graph, it surfaces contradictions, gaps, and reasoning chains that point to new drug repurposing hypotheses.
The pipeline is designed to be domain-agnostic. The first domain is glioblastoma. The second will be epilepsy.
Read: open biomedical literature from OpenAlex, PubMed, and bioRxiv.
Extract: structured claims about drugs, genes, and diseases, using language models.
Connect: a knowledge graph that reveals contradictions, gaps, and reasoning chains.
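To make the graph step concrete, here is a minimal sketch of how contradictions and reasoning chains might be found once claims are connected. It is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, the entities (`drug_A`, `gene_X`, `disease_Y`) are placeholders rather than real findings, and a contradiction is assumed to mean the same subject, predicate, and object asserted with opposite polarity. Gap detection is not covered.

```python
from collections import defaultdict

# Each claim is (subject, predicate, object, polarity).
# Placeholder entities only; these are not extracted results.
claims = [
    ("drug_A", "inhibits", "gene_X", "positive"),
    ("drug_A", "inhibits", "gene_X", "negative"),
    ("gene_X", "drives", "disease_Y", "positive"),
    ("drug_B", "inhibits", "gene_Z", "positive"),
]


def build_graph(claims):
    """Adjacency list: subject -> list of (predicate, object, polarity)."""
    graph = defaultdict(list)
    for subject, predicate, obj, polarity in claims:
        graph[subject].append((predicate, obj, polarity))
    return graph


def find_contradictions(claims):
    """Return triples that were asserted with more than one polarity."""
    polarities = defaultdict(set)
    for subject, predicate, obj, polarity in claims:
        polarities[(subject, predicate, obj)].add(polarity)
    return sorted(key for key, seen in polarities.items() if len(seen) > 1)


def two_hop_chains(graph):
    """Return A -> B -> C paths that use only positive-polarity edges."""
    chains = []
    for start, edges in list(graph.items()):
        for pred1, mid, pol1 in edges:
            if pol1 != "positive":
                continue
            for pred2, end, pol2 in graph.get(mid, []):
                if pol2 == "positive":
                    chains.append((start, pred1, mid, pred2, end))
    return chains


graph = build_graph(claims)
contradictions = find_contradictions(claims)
chains = two_hop_chains(graph)
print(contradictions)
print(chains)
```

In this toy data, the chain from `drug_A` to `disease_Y` runs through the same edge that is flagged as contradicted, which is the kind of case a reviewer would want surfaced before treating the chain as a repurposing hypothesis.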
Robertium is open-source under the MIT license. The code, the extracted claims, and the knowledge graph are all freely available.
This is intentional. Drug discovery infrastructure should belong to the scientific community, not to private platforms.
The first domain, glioblastoma, is being processed. This page is updated as work progresses.
Glioblastoma corpus processed. Knowledge graph operational. First contradictions identified.
Epilepsy added. Pipeline validated as domain-agnostic.
Preprint on bioRxiv describing methodology and findings.
Additional therapeutic areas where treatment is incomplete.