Automating Animal Research

In the fourth year of my PhD, I ran a longevity study that required testing about 450 mice. I measured a panel of 10 health metrics - activity, food intake, grip strength, treadmill endurance, cognition, sleep, weight, etc. - a basic battery that tells you whether animals are healthy or not.

That took eight months of continuous work. All day, every day.

I would pull mice from their cages, habituate them, train them, run them through an assay, clean the equipment, pull the next mouse and repeat. Each assay had to be repeated across multiple days and trials - roughly 10x per assay to get an accurate average value. Between animals, you have to scrub everything down because mice communicate by scent—residual odor from the previous animal is a confounder. You habituate them to the apparatus for days before you can collect a single data point. A grip strength test requires you to physically pull on a mouse until it lets go, and the result depends on how hard you pull. A treadmill test requires you to decide when the animal looks “too tired” to continue.

Eight months. For one study.

I looked into outsourcing. The best quotes from contract research organizations came to roughly $2 million. I looked into existing “smart cage” systems that could automate some of the work. The most widely used one - designed in 1986 - could test only one mouse at a time and cost $30,000 per cage. Mice are social animals; they have to be housed together. Isolating them for more than a few days elevates stress hormones and confounds results. Males become aggressive when reintroduced to cage-mates after separation. So these systems were limited to short tests, female mice, or terminal measurements only.

None of this was a solvable problem within my lab. It was a structural feature of how animal research works—and had been since the 1960s. I couldn’t do my research because the infrastructure to do it didn’t exist. And I gradually understood that nobody else could do theirs either. They just didn’t talk about it. It was the norm. And the incentives to improve animal tech just weren’t there, because it doesn’t get you sexy papers or big grants.

Here are the numbers. The United States houses roughly 120 million mice across about 3,000 research facilities. Globally, the figure is double that, and animal use grew 40% between 2017 and 2022. Despite all the excitement about organoids and organ-on-a-chip, animal testing remains required for 95% of drugs passing through the FDA. This is a $30-$100 billion ecosystem, operated almost entirely by hand.

A typical preclinical mouse study costs $300,000–$500,000 and takes 6–12 months. It is rarely replicated, because few can afford to do it twice. The data it produces are sparse—months of biology squeezed into a handful of data points at one or two time points. And the results are notoriously irreproducible, not only because conditions differ between labs, but because the humans do too. Even the gender of the experimenter alters pain responses in rodents. Handling technique, time of day, ambient noise, perfume - all of these affect results. Experimenter effects are frequently larger than drug effects. Finally, genetic modification and humanization are difficult, so disease models are often chosen for convenience and historical precedent, not for maximum human relevance.

As a result, nine out of ten drugs that enter human clinical trials fail, even after passing animal testing. Each approved drug costs an average of $2.3 billion and takes 12 years to reach patients. Only about 50 new drugs get FDA approval in a given year, across the entire country. Drug development costs double every nine years - Eroom’s Law, Moore’s Law in reverse. This is not a law of nature. It’s a consequence of how we test.
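To make the doubling claim concrete, here is a minimal sketch of what a nine-year doubling time implies. It assumes a clean exponential trend, which is a simplification of the real data, and the function name is purely illustrative.

```python
def eroom_cost_multiplier(years: float, doubling_time_years: float = 9.0) -> float:
    """Factor by which cost per approved drug grows after `years`,
    assuming (as a simplification) a constant doubling time."""
    return 2.0 ** (years / doubling_time_years)


# Three doubling periods (27 years) multiply the cost eightfold.
multiplier_27_years = eroom_cost_multiplier(27)

# The equivalent compound annual growth rate of a nine-year doubling time.
annual_growth = eroom_cost_multiplier(1) - 1.0

print(f"27-year multiplier: {multiplier_27_years:.0f}x")
print(f"Implied annual growth: {annual_growth:.1%}")
```

Under that assumption, a nine-year doubling time corresponds to costs compounding at roughly 8% per year.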

And that’s just the experimentation side. There’s an entire other half of the problem that gets even less attention: animal care.

Every cage in a facility is visually checked by a technician once per day. The average check lasts about 1.2 seconds. Mice are nocturnal, so they’re asleep when the technician walks by. Food is topped up manually. Water bottles are refilled once a week. Cages are swapped out every one to two weeks across the entire facility regardless of whether they need it, because tracking which of your 10,000 cages is actually dirty is impossible with manual methods. Sick animals are often not discovered until the weekly cage exchange.

As a result, an average facility needs 10–15 technicians, at a ratio of roughly one per 800 cages. The work is physically unpleasant - animals smell, bite, and produce allergens that cause chronic respiratory problems. Repetitive tasks lead to strain injuries. Unsurprisingly, facilities see 30–70% annual turnover. This means a constantly rotating, under-trained, inconsistent workforce responsible for the daily welfare of millions of animals. This is the backbone of the $2 trillion pharmaceutical industry.

You might excuse this state of affairs by saying that this is how things have always been, that animals are more complex, that the tech is difficult to build, and so on - but set against the capabilities of molecular biology, these excuses fall flat. The gap is roughly 10,000-fold.

A single genomics run produces terabytes of standardized, machine-readable data. Transcriptomics, proteomics, metabolomics - each field built instruments that industrialized measurement. This is the foundation of modern drug discovery.

But what we actually care about is the impact of the molecular intervention on the organism itself. Does the animal live longer? Move better? Eat normally? Sleep well? These are the measurements that determine whether a drug works. And they are still collected by hand, one mouse at a time, by a technician with a stopwatch and a notebook.

There is no omics for phenotypes. There isn’t even a commonly used word for “large-scale, standardized, organism-level phenotypic data generation.”

In short, “phenomics” is the great missing omics.

So, I designed the tech that would solve this - a monitoring device that would sit on a standard mouse cage, record continuous video and audio, and use computer vision to track every animal inside. The v1 of an automated, continuous, comprehensive health measurement device for animal phenomics.

The hard part was multi-animal tracking – which, to my surprise, was an unsolved problem. Research mice are genetically identical clones who look almost exactly alike. It is crucial to keep track of an individual animal’s identity – but also very difficult. A good analogy is self-driving cars. Tracking a single mouse in a clean test setup is like autonomous driving on a racetrack. Tracking multiple mice in their home cage - with bedding, nesting material, food hoppers, water bottles, animals climbing over each other - is like autonomous driving in a city. Messier, harder, and absolutely necessary.
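To see why identity is the hard part, consider a deliberately naive tracker that matches each known animal to the nearest new detection. This is an illustrative toy, not the method used in the device described here; the function name, IDs, and coordinates are all made up for the example.

```python
from itertools import permutations
from math import dist

Point = tuple[float, float]


def match_identities(previous: dict[str, Point], detections: list[Point]) -> dict[str, Point]:
    """Assign each known animal ID to one detection, minimizing total movement.

    Brute-force over all assignments; fine for the handful of animals in a cage.
    """
    ids = list(previous)
    if len(detections) < len(ids):
        raise ValueError("fewer detections than animals (e.g. one mouse is hidden)")
    best_cost, best_order = float("inf"), None
    for order in permutations(range(len(detections)), len(ids)):
        cost = sum(dist(previous[i], detections[j]) for i, j in zip(ids, order))
        if cost < best_cost:
            best_cost, best_order = cost, order
    return {i: detections[j] for i, j in zip(ids, best_order)}


previous = {"mouse_a": (10.0, 10.0), "mouse_b": (30.0, 10.0)}

# Animals far apart: position alone resolves identity.
far_apart = match_identities(previous, [(31.0, 11.0), (11.0, 10.0)])

# Animals that may have crossed paths between frames: position alone always
# picks the "nobody swapped" answer, whether or not that is what happened.
ambiguous = match_identities(previous, [(29.0, 10.0), (11.0, 10.0)])
print(far_apart, ambiguous)
```

When look-alike animals climb over each other, a position-only matcher like this has no way to notice that two of them traded places, and the error then persists for the rest of the recording. That is the failure mode that makes home-cage tracking so much harder than single-animal tracking.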

I knew it had to be built as a company. Otherwise, it would never get manufactured and made accessible beyond my own lab. But I’d spent nearly 15 years preparing to be a research scientist. I had never planned to start a company. So I spent a full year trying to find an entrepreneur to build it with me. I talked to close to a thousand people to find a founder (most chats informal, averaging about 3-4 per day). The few who had the right combination of skills were already running much larger companies. But through that search I found my first co-founder—and I came to accept that I was the best person to do it, because the solution demanded someone who understood the problem intimately, while having the technical ability to design the system and interface with engineers and coders.

So, in the last year of my PhD, I converted my Harvard studio into a small prototyping space, complete with a 3D printer, a laser cutter, a soldering station – and a colony of mice bought from Petco down the street. My bed was a few feet away from them – and mice don’t smell nice. It was not glamorous.

But over the following 9 months, it allowed my co-founder and me to build the first 6 versions of the smart cage alongside our full-time jobs. The prototypes were rough, but we knew we could solve the multi-animal tracking and get genuinely useful data. That work won us 3 entrepreneurship awards and 2 grants, and right after my graduation, our first investment. From there, we went full-time, recruited more engineering co-founders, and exactly one year later launched the first system publicly.

The system we built is called the Smart Lid. It mounts on existing cages, inside existing ventilated racks (individual mouse cages are hooked up to ventilation in specialized racks to ensure breathable air in high-density housing), and analyzes continuous video using a pipeline of 8 custom neural networks trained on tens of thousands of hours of proprietary data. It tracks all co-housed animals simultaneously at 97% average accuracy, with individual identification at 99.5% concordance with human scoring, at a compute cost of under $100 per cage per month. It measures over 20 health metrics per animal: locomotion, sleep, eating, drinking, fighting, rearing, climbing, anxiety, social behavior, and spatial positioning, at a resolution of 10 times per second, continuously. That’s 100 to 10,000 times more data per animal per study than manual methods produce. The best open-source alternatives, DeepLabCut and SLEAP, are excellent software, but they are designed for short videos of single mice with many key-points on the body, so they still inevitably mix up animals, while costing ~$1,000 per cage per month to run.
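For a sense of scale, here is a back-of-the-envelope sketch of the raw sample volume those numbers imply. It assumes, as a simplification the text does not state outright, that every one of 20 metrics is sampled at the full 10 times per second around the clock; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def values_per_animal_per_day(rate_hz: int = 10, metrics: int = 20) -> int:
    """Raw values recorded per animal per day under the stated assumptions:
    every metric sampled `rate_hz` times per second, continuously."""
    return rate_hz * SECONDS_PER_DAY * metrics


samples_per_metric = values_per_animal_per_day(metrics=1)
total_values = values_per_animal_per_day()

print(f"{samples_per_metric:,} samples per metric per animal per day")
print(f"{total_values:,} values per animal per day across 20 metrics")
```

That works out to 864,000 samples per metric and about 17 million values per animal per day, compared with the handful of data points per animal that a manual assay yields.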

At launch, we booked $180,000 of orders in the first month, and since then, we’ve shipped to close to 100 institutions across five continents and all sectors of research (pharma, biotech, academia, CROs).

The potential leverage of this is massive.

Currently, the entire $2T US pharmaceutical industry combined outputs about 50 new drugs per year. If better animal data improved clinical success rates by 2 percentage points - from 90% failure to 88% - that’s roughly 10 additional drugs approved per year. This is more than the average output of the world’s three largest pharmaceutical companies combined.
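The arithmetic behind that estimate can be spelled out. The sketch below uses the figures from the text (about 50 approvals per year, a 90% failure rate) and assumes that the number of candidates entering clinical trials stays constant; the function name is illustrative.

```python
def additional_approvals(approvals_per_year: float,
                         base_success: float,
                         improved_success: float) -> float:
    """Extra approvals per year if the same number of candidates enters trials
    but a larger fraction of them succeeds."""
    candidates_entering_trials = approvals_per_year / base_success
    return candidates_entering_trials * improved_success - approvals_per_year


# ~50 approvals at a 10% success rate implies ~500 candidates entering trials.
# At a 12% success rate, the same 500 candidates yield ~60 approvals.
extra = additional_approvals(approvals_per_year=50, base_success=0.10, improved_success=0.12)
print(f"Additional approvals per year: {extra:.0f}")
```

Because the baseline success rate is so low, a shift of two percentage points is a 20% relative increase in output.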

As with other types of biological data, the real endgame happens once phenotypic data like this exists at scale. With enough structured data published by different labs and gathered into one database, you can start training AI models on whole-organism biology and infer what outcome a given drug or intervention would have on whole-body health. As with other fields of biology, it would become digitized, then simulated or inferred. Arguably, that’s the ultimate goal of biology as a science - to understand living systems well enough to simulate everything.

When this happens, biology transitions from a science discipline into an engineering discipline.

And there is a lot to engineer. Let’s make it happen.

Company: Olden Labs

Paper: Smart Lids for deep multi-animal phenotyping in standard home cages