Once you claim your scholarly account with a matching name here, JRNLClub AI will help you generate an accurate CV in a minute that sources all your papers! https://www.youtube.com/watch?v=89fq9jeLr3I
Kuan‐lin Huang’s post is trending (1d ago, 2 reactions): Once you claim your scholarly account with matched name here, JRNLClub AI will…
Kuan‐lin Huang’s post is trending (6d ago, 2 reactions): How I won the NIH replication prize by using AI to validate drug targets at scale
Top 1% Most Discussed biorxiv Preprints Added (Preprint · Alpha1)
- Stromal Gasdermin D-mediated Pyroptosis Drives Maladaptive CD4⁺ T-cell Remodeling in Tet2-Deficient Hematopoiesis (biorxiv · Ji, P., Ren et al.)
- Maternal high-fat diet drives sex-specific microglia remodeling of serotonergic reward circuits (biorxiv · Bilbo, S., Patton et al.)
- Targeted extracellular degradation of LRP8 promotes ferroptosis in cancer cells (biorxiv · Zhao, F., Inague et al.)
How I won the NIH replication prize by using AI to validate drug targets at scale
About 90% of cancer drug candidates that enter clinical trials never make it to approval. A big chunk of that failure is upstream: the target was wrong. Two industry audits made this concrete years ago. Bayer reported in 2011 that only 20–25% of published cancer targets held up when their own scientists tried to reproduce them; Amgen in 2012 said just 6 out of 53 "landmark" oncology studies survived rigorous replication. We've known this for a long time. We just haven't had a way to do something about it at scale (at least in the published literature).
Manually re-validating every published target is tedious. You'd need to harmonize lots of CRISPR, omics, and other data, work out the right disease subgroupings, write the code, run the stats, and inspect the output. Each target takes days to validate. Nobody's funded to do it (in academia). So most candidates sit there, cited, repeated, occasionally bankrolled into a screen.
So I tried something else. In 2025, when this work was done, I gave the job to an AI agent (Biomni) and ran 31 published oncology targets through it in an afternoon. The compute cost $68 in Claude API credits. About two-thirds of the retracted-paper targets failed to replicate. Roughly two-thirds of the recent, non-retracted targets did replicate. Compared with the retracted targets, the non-retracted targets had an odds ratio of 17 for showing bona fide, context-specific dependency in agent analyses that I validated as correct.
The interesting part isn't the headline number. It's how to get an agent to do this kind of work without it making things up.
1. Find out what the agent can do reliably
Most of the hype around "AI scientists" frames the agent as a generalist that does everything. That's a trap. LLMs hallucinate, especially when asked to use tools or data that they either can't access or don't know how to use. And they will almost always write you a beautiful, plausible, partly-wrong narrative.
The move is to find a task class where the agent is reliable, say, above 95% success rate on something you can score. For me that task is: given a gene target, a disease context, and a public dataset like DepMap or TCGA, test whether the gene shows context-specific cancer dependency. Narrow enough that the agent's job is mostly translating a hypothesis into code and stats. Reliable enough that I can trust the agent's executions.
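As a rough illustration of what "context-specific dependency" can look like as a scoreable test, here is a minimal sketch in plain Python. It assumes gene-effect scores where more negative means stronger dependency and uses a one-sided permutation test on the difference in means. The scores, the function name, and the choice of test are assumptions for illustration; they are not the agent's actual analysis and not real DepMap values.

```python
import random
from statistics import mean


def permutation_test(in_context, out_context, n_perm=10_000, seed=0):
    """One-sided permutation test: are in-context scores lower (more dependent)?"""
    rng = random.Random(seed)
    observed = mean(in_context) - mean(out_context)
    pooled = list(in_context) + list(out_context)
    k = len(in_context)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Difference in means under a random relabeling of the cell lines.
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic scores, invented for illustration (more negative = more dependent).
in_context = [-1.1, -0.9, -1.3, -0.8, -1.0, -1.2]
out_context = [-0.1, 0.0, -0.2, 0.1, -0.3, -0.1, 0.05, -0.15]

effect, p = permutation_test(in_context, out_context)
print(f"mean difference = {effect:.2f}, one-sided p = {p:.4f}")
```

A permutation test is used here only because it needs no external libraries; a rank-based test or a regression with covariates would be equally reasonable choices.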
2. Apply it across many use cases
Once you know the agent does one type of thing well, throw a lot of that thing at it. I built a table of 31 targets: 17 from retracted papers, 14 recent candidates with real-looking evidence. Each verbal target claim got translated into a structured natural language prompt with the same template. Gene, context, datasets to use, statistical contrasts to run.
When I first started playing with the agent, the biggest failure mode wasn't bad reasoning. It was the agent failing to access or download the right data files, after which it would start hallucinating or simulating fake data for its analyses. To stop this, I wrote a separate cancer-omics data know-how document that spelled out how to pull DepMap through the Bioconductor depmap package and how to grab TCGA Pan-Cancer Atlas data from the NCI Genomic Data Commons. This was before Anthropic released the Skills feature; today you'd just package it as a skill. Once the agent stopped fighting the data layer, the rest of the work got dramatically easier.
Two more constraints made the difference:
- Forbid the agent from reading literature. I appended a non-overridable instruction: "You are a data-only replication agent. Do not use any literature search, papers, or external textual knowledge." Without that, the agent fills in gaps from training data, which means it tells you the consensus view of whatever paper it dimly remembers. You want what the data says.
- Force everything into executable code. No prose conclusions. Every claim has to come from a notebook cell that loaded real data and ran a real test for me to review.
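To make the templating idea concrete, here is a minimal sketch of how one claim could be rendered into a fixed prompt layout with the data-only instruction appended. The class name, field names, and the "Report every result" line are hypothetical; only the quoted data-only instruction comes from the text above, and the real prompts live in the repository.

```python
from dataclasses import dataclass, field

# Quoted verbatim from the post; everything else in this sketch is illustrative.
DATA_ONLY_RULE = (
    "You are a data-only replication agent. Do not use any literature "
    "search, papers, or external textual knowledge."
)


@dataclass
class TargetClaim:
    gene: str
    context: str
    datasets: list[str]
    contrasts: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the claim with the same layout for every target."""
        lines = [
            f"Gene: {self.gene}",
            f"Disease context: {self.context}",
            "Datasets to use: " + ", ".join(self.datasets),
            "Statistical contrasts to run:",
        ]
        lines += [f"  {i}. {c}" for i, c in enumerate(self.contrasts, 1)]
        # Hypothetical wording for the "executable code only" constraint.
        lines.append("Report every result from executed code on loaded data.")
        lines.append(DATA_ONLY_RULE)
        return "\n".join(lines)


claim = TargetClaim(
    gene="PRMT5",
    context="MTAP-deleted cancers",
    datasets=["DepMap", "TCGA"],
    contrasts=["PRMT5 dependency in MTAP-deleted vs. MTAP-intact cell lines"],
)
print(claim.to_prompt())
```

Keeping the instruction as the last line of every rendered prompt is one simple way to make sure no per-target text can come after it.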
3. Validate the process before you trust the results
Before I believed anything the agent said about retracted targets, I needed proof it could find the real ones. So I seeded the panel with well-established synthetic lethal relationships: WRN in microsatellite-unstable tumors, PRMT5 in MTAP-deleted cancers.
The agent successfully re-derived the MTAP–PRMT5 relationship in detail. It stratified cell lines by copy number using a sensible 15% threshold it picked itself, compared dependency between groups, ran the dose-response across copy-number quartiles, and landed on effect sizes consistent with the literature and p-values from 10⁻⁹ to 10⁻¹¹. Once those controls worked, the rest of the panel became interpretable.
4. Look at every output myself
This is the unglamorous part nobody talks about. The agent produced 31 Python notebooks. A human has to read each one to validate it and learn what happened. Did the data actually load? Did the statistical test make sense for the question? Did the agent silently swap in a different dataset when the first one failed? Did it interpret "wild type" the same way you meant?
I scored every one of the 31 notebooks manually. After the steps above, few components turned out to be false. I coded the rest as supported, refuted, or inconclusive on two axes: context-specific dependency, and other supporting evidence.
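For readers who want to keep this kind of scoring machine-readable, here is a small sketch of a two-axis scoring table and a tally by retraction status. The target names and verdicts are invented placeholders, not the study's results, and the field names are assumptions.

```python
from collections import Counter

VERDICTS = ("supported", "refuted", "inconclusive")
AXES = ("dependency", "other_evidence")

# Placeholder records: names and verdicts are invented for illustration.
scores = [
    {"target": "GENE_A", "retracted": True, "dependency": "refuted", "other_evidence": "inconclusive"},
    {"target": "GENE_B", "retracted": True, "dependency": "refuted", "other_evidence": "refuted"},
    {"target": "GENE_C", "retracted": False, "dependency": "supported", "other_evidence": "supported"},
    {"target": "GENE_D", "retracted": False, "dependency": "inconclusive", "other_evidence": "supported"},
]


def tally(records, axis):
    """Count verdicts on one axis, split by retraction status."""
    if axis not in AXES:
        raise ValueError(f"unknown axis: {axis}")
    counts = Counter()
    for rec in records:
        verdict = rec[axis]
        if verdict not in VERDICTS:
            raise ValueError(f"{rec['target']}: unknown verdict {verdict!r}")
        group = "retracted" if rec["retracted"] else "non-retracted"
        counts[(group, verdict)] += 1
    return counts


for axis in AXES:
    print(axis, dict(tally(scores, axis)))
```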
Expert review isn't optional. The good news: it's faster than doing the analysis yourself. Maybe 15 minutes per notebook, against the several days it would take from scratch.
The most interesting result wasn't the big retracted-versus-non-retracted split. It was ALKBH5. The original paper was retracted, and the specific mechanistic claim (that miR-193a-3p regulates AKT2 through ALKBH5) didn't hold up. But the agent independently found that ALKBH5 itself is a real, glioma-selective dependency, with consistent CRISPR and RNAi signals, a strong correlation with stemness scores, a very strong negative correlation with the m6A gene signature, and a significant survival hazard ratio across gliomas.
You get insights like this because the agent decomposed the target claim into testable pieces and ran each one independently. That's the part I didn't expect, and it's the part that's made me think this approach generalizes well beyond target replication.
On AI Scientist Arena (aiscientistarena.com), I've benchmarked LLMs, and even without any sophisticated tool use or harness, they could predict clinical trial success beyond noise. If AI agents continue to improve across all tasks in the drug discovery and development cycle, the best constructor of an entire clinical program might end up being an AI.
All of this (the prompts, the data and replication know-how documents, the 31 notebooks, the expert scoring) is at github.com/Huang-lab/AgentReplication. The bioRxiv preprint is titled "Agent-Driven Validation of Oncology Therapeutic Targets." This is part of the work that initiated the Accelerated Discovery with Agents (ADA) Consortium.
There's a version of this work that sounds bigger than it is. "AI agent validates 31 cancer drug targets in one hour" is technically true and somewhat misleading. The hour is the agent's compute time. Building the prompts, curating the targets, writing the know-how documents, and reviewing every notebook took weeks. The agent isn't doing the science. It's doing the implementation.
The science is still in deciding what to ask and whether the answer means anything to benefit humans.
Postscript, May 2026: This work, done in Nov 2025, was my Track 2 submission to the NIH Replication Prize, and I thought it was the better entry. My other entry, proposing mandatory release of participant-level clinical trial data, won Track 1.
How I rebuilt Variant Effect Predictor to be 100x faster (fastVEP!)
If you work with genomic variants, you know VEP. Ensembl's Variant Effect Predictor is the standard tool — the thing your pipeline calls to figure out whether a given mutation breaks a protein, hits a splice site, or sits harmlessly in some intron. It's been around forever and it works. It's also written in Perl, ships with a Perl 5.22+ requirement, ten-plus CPAN modules, a DBI dependency, and a small graveyard of installation issues anyone who's set up VEP from scratch will recognize.
The annotation itself is fine. The speed is not.
Annotating 50,000 variants with VEP takes about 206 seconds. Point it at a full human WGS (~4 million variants) and it doesn't finish on the newest MacBook Pro. People work around this by splitting their VCFs, running parallel processes, and stitching the outputs back together. That works, but it's a huge time tax. A lab running thousands of samples pays that tax every day.
So I rebuilt it in Rust.
The numbers
fastVEP runs the same 50,000-variant file in 1.59 seconds. That's a 130x speedup. The full WGS that VEP can't finish? fastVEP does it in 86 seconds.
Peak memory drops from ~500 MB to 2.8 MB. The installed binary is 3.3 MB instead of ~200 MB of Perl plus dependencies. There are no CPAN modules to chase. You cargo install, you run a binary, that's it.
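The ratios behind those numbers are easy to check. A quick sketch using only the figures quoted above (the ~500 MB and ~200 MB values are approximate, so the last two ratios are rough):

```python
# Figures quoted in the post for the 50,000-variant benchmark.
vep_seconds, fastvep_seconds = 206.0, 1.59
n_variants = 50_000

speedup = vep_seconds / fastvep_seconds
vep_rate = n_variants / vep_seconds          # variants per second
fastvep_rate = n_variants / fastvep_seconds  # variants per second

memory_ratio = 500 / 2.8   # peak memory in MB (VEP value is approximate)
binary_ratio = 200 / 3.3   # installed size in MB (VEP value is approximate)

print(f"runtime speedup:  {speedup:.0f}x")
print(f"throughput:       {vep_rate:.0f} vs {fastvep_rate:.0f} variants/s")
print(f"memory reduction: {memory_ratio:.0f}x")
print(f"install size:     {binary_ratio:.0f}x smaller")
```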
That's the headline. The interesting part is what actually made it fast. It wasn't one thing. It was the dumb stuff Perl couldn't do well, layered on top of a few good ideas.
What Rust gets you for free
A lot of the speedup is just what you get when you stop paying for an interpreter and a garbage-collected dynamic language. Tight loops over variant records compile to real machine code. Strings don't allocate when they don't need to. Parallelism is rayon and works; you don't fork ten Perl processes and reconstitute their output.
Thanks to agentic coding, this was manageable as one person's effort over a full month. It requires knowing exactly how the algorithm works so you can instruct the coding agents, and verifying extensively with tests and outputs. At its core, the Sequence Ontology has 49 consequence terms; you map a variant's coordinates against a transcript and figure out which ones apply. The bottleneck in the Perl version is the Perl, not the algorithm.
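To give a feel for "map coordinates against a transcript and pick the terms", here is a deliberately tiny sketch in Python (not fastVEP's Rust code). It assumes a plus-strand transcript and handles only a single position; it returns Sequence Ontology terms for intronic positions (the two-base windows at the 5' and 3' ends of an intron) and plain placeholder labels elsewhere, since exonic calls need CDS and codon logic that is omitted here. The function name and labels are hypothetical.

```python
def classify_position(pos, exons):
    """Classify a 1-based position against a plus-strand transcript.

    exons: sorted, non-overlapping (start, end) pairs, 1-based inclusive.
    Returns an SO term for intronic positions, or a placeholder label
    ("exonic", "outside_transcript") where more logic would be needed.
    """
    if pos < exons[0][0] or pos > exons[-1][1]:
        return "outside_transcript"
    for start, end in exons:
        if start <= pos <= end:
            return "exonic"
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if prev_end < pos < next_start:
            if pos - prev_end <= 2:
                return "splice_donor_variant"     # first 2 bases of the intron
            if next_start - pos <= 2:
                return "splice_acceptor_variant"  # last 2 bases of the intron
            return "intron_variant"
    raise AssertionError("unreachable for sorted, non-overlapping exons")


exons = [(100, 200), (301, 400)]
for pos in (50, 150, 201, 250, 300):
    print(pos, classify_position(pos, exons))
```

A real predictor also has to handle minus-strand transcripts, indels spanning boundaries, UTRs, and multiple overlapping terms, which is where most of the work goes.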
If you stop there, you get maybe 10–20x. The rest came from somewhere else.
The next real win: rebuilding the annotation lookup
VEP's slowest path is annotation lookup: pulling in ClinVar, gnomAD, dbSNP, COSMIC, all the supplementary databases that turn raw consequence into something a clinician can act on. The default workflow round-trips through SQLite or remote APIs. For a million variants, that's a million lookups, and every one of them costs more than the consequence prediction itself.
The fix is to put the annotations in a format designed for the access pattern. fastVEP has its own binary format called fastSA, and the v2 design is shamelessly inspired by echtvar (thanks to Brent Pedersen's work; credit where it's due). The key improvements, as I understand them:
- Chunked ZIP layout with Var32 encoding for variant keys.
- Parallel u32 value arrays per annotation field.
- Delta encoding on sorted positions.
- An LRU chunk cache, because variant lookups in a real VCF are clustered.
- A Bloom filter in front of the index for negative lookups.
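As a small illustration of one item in that list, here is delta encoding on sorted positions, sketched in Python. This shows the general technique only; it is not fastSA's actual on-disk layout, and the function names are hypothetical.

```python
from itertools import accumulate


def delta_encode(sorted_positions):
    """Store each position as the gap from the previous one."""
    deltas, prev = [], 0
    for pos in sorted_positions:
        if pos < prev:
            raise ValueError("positions must be sorted ascending")
        deltas.append(pos - prev)
        prev = pos
    return deltas


def delta_decode(deltas):
    """Running sum restores the original positions."""
    return list(accumulate(deltas))


positions = [1_000_123, 1_000_130, 1_000_131, 1_000_950]
deltas = delta_encode(positions)
print(deltas)
print(delta_decode(deltas) == positions)
```

After the first entry the gaps are small numbers, which is why delta-encoded columns compress well compared with raw genomic coordinates.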
Putting ClinVar, gnomAD, and dbSNP into this format and querying them as a single in-process call is most of what closes the gap on the heaviest workloads. You're not asking a database anymore. You're doing memory-mapped byte arithmetic.
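Here is a minimal Python sketch of how a Bloom filter and an LRU chunk cache can sit in front of an annotation lookup. It is a toy under stated assumptions: the in-memory `STORE` stands in for on-disk chunks, the chunk width, key format, and annotations are invented, and none of this is fastSA's real implementation.

```python
import hashlib
from functools import lru_cache

CHUNK_SIZE = 1 << 20  # hypothetical chunk width in base pairs

# Stand-in for on-disk chunks: (chrom, chunk_id) -> {variant key: annotation}.
STORE = {
    ("chr1", 0): {
        "chr1:12345:A>G": {"note": "placeholder annotation 1"},
        "chr1:12399:C>T": {"note": "placeholder annotation 2"},
    },
}


class BloomFilter:
    """Bit array with several hashes; can say 'definitely absent' cheaply."""

    def __init__(self, n_bits=1 << 16, n_hashes=4):
        self.n_bits, self.n_hashes = n_bits, n_hashes
        self.bits = bytearray(n_bits // 8)

    def _indexes(self, key):
        for i in range(self.n_hashes):
            digest = hashlib.blake2b(key.encode(), digest_size=8, salt=bytes([i])).digest()
            yield int.from_bytes(digest, "little") % self.n_bits

    def add(self, key):
        for idx in self._indexes(key):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def might_contain(self, key):
        return all(self.bits[idx // 8] & (1 << (idx % 8)) for idx in self._indexes(key))


@lru_cache(maxsize=8)
def load_chunk(chrom, chunk_id):
    """Pretend to read one chunk; cached so clustered lookups reuse it."""
    return STORE.get((chrom, chunk_id), {})


bloom = BloomFilter()
for chunk in STORE.values():
    for key in chunk:
        bloom.add(key)


def lookup(chrom, pos, ref, alt):
    key = f"{chrom}:{pos}:{ref}>{alt}"
    if not bloom.might_contain(key):
        return None  # definite miss: no chunk is touched
    return load_chunk(chrom, pos // CHUNK_SIZE).get(key)


first = lookup("chr1", 12345, "A", "G")
second = lookup("chr1", 12399, "C", "T")   # same chunk, served from the cache
absent = lookup("chr1", 500, "G", "C")
print(first, second, absent, load_chunk.cache_info())
```

The Bloom filter can return false positives but never false negatives, so a "maybe" still falls through to the chunk, and the final `.get(key)` gives the exact answer.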
What surprised me
A few things I didn't expect going in.
The FASTA handling matters more than I thought. You need the reference sequence for HGVS notation, and a naïve read of the GRCh38 primary assembly is enough to wreck your memory budget on its own. Memory-mapping the indexed FASTA and pulling spans on demand was the difference between "fastVEP runs on a laptop" and "fastVEP needs a server." Apparent simplicity hides this kind of thing; samtools faidx is doing a lot of work for you.
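The idea can be sketched in a few lines of Python (fastVEP itself is Rust, so this is an illustration, not its code). It builds a minimal faidx-style index for a toy FASTA, memory-maps the file, and pulls a span by computing byte offsets from the index fields (offset, bases per line, bytes per line). The file contents and function names are made up, and the sketch assumes every sequence line in a record has the same width except the last.

```python
import mmap
import os
import tempfile


def build_index(path):
    """Minimal faidx-style index: name -> length, offset, linebases, linewidth."""
    index, name, offset = {}, None, 0
    with open(path, "rb") as fh:
        for line in fh:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                index[name] = {"length": 0, "offset": offset + len(line),
                               "linebases": 0, "linewidth": 0}
            elif name is not None:
                bases = len(line.rstrip(b"\r\n"))
                rec = index[name]
                if rec["linewidth"] == 0:
                    rec["linebases"], rec["linewidth"] = bases, len(line)
                rec["length"] += bases
            offset += len(line)
    return index


def fetch(mm, rec, start, end):
    """Return bases in the 0-based half-open span [start, end)."""
    if not 0 <= start < end <= rec["length"]:
        raise ValueError("span out of range")

    def byte_at(p):
        # Whole lines before p, plus the column of p within its line.
        return rec["offset"] + (p // rec["linebases"]) * rec["linewidth"] + p % rec["linebases"]

    raw = mm[byte_at(start):byte_at(end - 1) + 1]
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "toy.fa")
        with open(path, "wb") as fh:
            fh.write(b">chrT toy contig\nACGTACGTAC\nGTTTGGGCCC\nAAAT\n")
        index = build_index(path)
        with open(path, "rb") as fh, mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return index["chrT"]["length"], fetch(mm, index["chrT"], 8, 13)


length, span = demo()
print(length, span)
```

Only the pages that back the requested span are read, so memory use tracks the spans you touch instead of the size of the assembly.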
Structural variants are genuinely separate code. SNVs and short indels share a clean abstraction. <DEL>, <DUP>, <INV>, <BND> and the rest don't slot into it cleanly. I tried for a while to unify them, eventually gave up, and wrote a separate SV consequence predictor.
HGVS was the worst part. Generating correct HGVSc and HGVSp notation with 3' normalization across all the edge cases — overlapping CDS, mitochondrial circular coordinates, start-loss variants in non-Met-starting transcripts — required more test cases than the consequence engine itself. There's a reason VEP has been worked on for a decade. The annoying details are plenty and real.
Correctness
A faster but wrongly annotated VCF isn't useful. fastVEP is validated against VEP's output on shared test sets and matches on the consequences that matter. The repo has 233 tests across the workspace, not because that number is magic, but because every annoying HGVS edge case eventually became one. If you find a case where fastVEP disagrees with VEP and you think VEP is right, open an issue. Let me know here!
Try it
It's on GitHub at Huang-lab/fastVEP, Apache 2.0. There's a hosted web version at fastVEP.org if you want to paste in some VCF and see what it does. If you have Rust installed, it's a single cargo install away.
It works on yeast, fly, arabidopsis, mouse, human, anything with a GFF3. The web server can switch between organisms if you point it at a directory of them. The preprint is on bioRxiv. If it saves your group some compute time, that's the point and I'm glad :)
Check out the JRNLClub demo to see what you can do here: https://youtu.be/tc_tdoC9LpI?si=1qtEiZ5pRpUEIL2t
Hello World, JRNLClub!
Top 1% Most Discussed biorxiv Preprints Added (Preprint · Alpha1)
- IRES-TrAPPr reveals novel insights into viral and cellular mRNA translation (biorxiv · May, G. E., McManus et al.)
- Turnip mosaic virus-based gRNA delivery system for plant genome editing (biorxiv · Khwanbua, E., Lappe et al.)
- Complete biosynthesis of penicillin G in Nicotiana benthamiana (biorxiv · Rawoof, A., Lin et al.)
- Intermolecular 3'UTR-3'UTR interactions drive Wnt gene activation through heteromeric protein assembly (biorxiv · Cai, T., Cruz et al.)
- Iridescence in pterosaur pycnofibers and the evolution of integumentary coloration (biorxiv · Wu, Z., D'Alba et al.)
- One-pot parallel Sidewinder construction from oligo pools (biorxiv · Robinson, N. E., Paul et al.)
- Polymeric mechanism of enhancer-promoter cooperativity in transcriptional bursting (biorxiv · Yamamoto, T., Kawasaki et al.)
- AI-guided discovery of atypical protein assemblies (biorxiv · Toghani, A., Seager et al.)
- Resolving human neuronal herpesvirus reactivation via petabase-scale association studies (biorxiv · Gutierrez, J. C., Chen et al.)
- Generative design of sequence specific DNA binding proteins (biorxiv · Sehgal, E., Politanska et al.)
- Identification of a neural circuit that enables safe, long-term torpor in mice (biorxiv · Guo, F., Tong et al.)
- Generative design of intrinsically disordered protein regions with IDiom (biorxiv · Liu, J., Ibarraran et al.)
- Local ancestry inference identifies robust evidence of selection in Neolithic Europe (biorxiv · Mies, G., Mathieson et al.)
- Hierarchical and non-hierarchical network flows generate complementary representational dynamics in human visual cortex (biorxiv · Tzalavras, A., Osher et al.)