A photograph has a story inside it.
Most tools make you decode it yourself.
Clouseau reads it for you.
The Answer to "What Does It Do?"
It tells you exactly what it sees —
Chapter 01
The back of your eye is the only place the body shows you itself.
Beneath the surface of the eye lies the retina — a thin sheet of living nerve tissue roughly twice the size of a postage stamp. It is the only place in the human body where a doctor can look directly at living blood vessels and nerve fibres without making a single incision.
A single photograph of the retina contains enough information to detect early-stage diabetic retinopathy, signs of glaucoma, macular degeneration, evidence of a past stroke, and more. The information has always been there. The challenge has always been reading it fast enough, consistently enough, at scale.
every pixel is a number
Chapter 02
A photograph is just numbers. Computers are very good at numbers.
Every digital image is a grid of pixels. Each pixel is three numbers: one for red, one for green, one for blue — each between 0 and 255. A 500×500 retinal image is simply 750,000 numbers, arranged in a specific order.
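To make the arithmetic concrete, here is a minimal sketch in plain Python. It builds a stand-in 500×500 image as a grid of (red, green, blue) triples and unrolls it into one flat list. The helper names `blank_image` and `flatten` are illustrative only and are not part of Clouseau.

```python
# A stand-in for a digital image: rows of pixels, each pixel an (R, G, B) triple.
WIDTH, HEIGHT = 500, 500

def blank_image(width, height, colour=(0, 0, 0)):
    """Build a height x width grid where every pixel is one (R, G, B) tuple."""
    return [[colour for _ in range(width)] for _ in range(height)]

def flatten(image):
    """Unroll the grid row by row into one flat list of numbers."""
    return [value for row in image for pixel in row for value in pixel]

# A uniform reddish image; a real photograph would have a different triple per pixel.
image = blank_image(WIDTH, HEIGHT, colour=(180, 60, 40))
numbers = flatten(image)

print(len(numbers))   # 500 * 500 * 3 = 750000
print(numbers[:3])    # the first pixel: [180, 60, 40]
```

The order matters as much as the values: shuffle the list and the same 750,000 numbers no longer describe the same picture.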
A computer vision model learns to find meaningful patterns inside those numbers. Not by following rules a programmer wrote, but by being shown millions of examples and taught to recognize what healthy looks like versus what disease looks like — the same way you learned to recognize a face without ever being told the definition of a nose.
Chapter 03
Context is everything. Not all parts of an image matter equally.
Consider the sentence: "The angry pitcher threw the ball at the empty pitcher." You instantly know the first "pitcher" is a person and the second is a container — because of context. You didn't look at each word in isolation; you looked at each word relative to every other word.
A Transformer is a model architecture built around this idea, and a vision Transformer applies it to images. It divides the photograph into small patches and asks: which patches should I pay the most attention to, given what all the other patches are telling me? The glowing spotlight shows the model's focus shifting in real time: pausing at the optic disc, sweeping to the vessels, settling on the macula.
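The patch-and-attention idea can be sketched in a few lines of plain Python. This is a simplified illustration, not Clouseau's model: a real Transformer first passes each patch through learned projections, while this sketch compares the raw patch values directly. The function names, the tiny 4×4 greyscale image, and the assumption that the image divides evenly into patches are all choices made for the example.

```python
import math

def split_into_patches(image, patch):
    """Cut a 2D grid into non-overlapping patch x patch tiles,
    each flattened into one vector. Assumes the sides divide evenly."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(vectors):
    """Score every patch against every other patch (scaled dot product),
    then normalise each row of scores into attention weights."""
    scale = math.sqrt(len(vectors[0]))
    weights = []
    for query in vectors:
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in vectors]
        weights.append(softmax(scores))
    return weights

# A 4x4 greyscale image whose top-right corner is bright.
image = [
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
patches = split_into_patches(image, patch=2)   # 4 patches of 4 values each
weights = attention_weights(patches)

# Which patch does the first patch attend to most?
print(weights[0].index(max(weights[0])))       # 1, the bright top-right patch
```

Each row of `weights` answers the question in the text for one patch: given everything else in the image, where should I look?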
Chapter 04
It didn't learn from a textbook. It learned from 1.6 million photographs.
Training is the process of showing a model example after example — and correcting it each time it's wrong. On its first attempt, the model guesses randomly. It fails. A small adjustment is made. It fails slightly less. This happens millions of times, automatically.
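The guess, fail, adjust cycle can be shown with a deliberately tiny example. The sketch below trains a "model" with a single adjustable number to learn that the right answer is always twice the input. It uses plain gradient descent on made-up data; the function name `train`, the learning rate, and the step count are assumptions for illustration, and a real vision model adjusts many millions of numbers instead of one.

```python
import random

def train(examples, steps=200, learning_rate=0.05, seed=0):
    """Fit prediction = weight * x by repeated small corrections."""
    rng = random.Random(seed)
    weight = rng.uniform(-1.0, 1.0)      # the first guess is random
    losses = []
    for _ in range(steps):
        loss = 0.0
        gradient = 0.0
        for x, target in examples:
            error = weight * x - target  # how wrong was this guess?
            loss += error ** 2
            gradient += 2 * error * x    # which direction reduces the error?
        loss /= len(examples)
        gradient /= len(examples)
        losses.append(loss)
        weight -= learning_rate * gradient   # one small adjustment
    return weight, losses

# Made-up training data: the target is always 2 * x.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight, losses = train(examples)

print(round(weight, 3))         # 2.0
print(losses[0] > losses[-1])   # True: the error shrank over training
```

Nobody told the program that the rule was "multiply by two". It found the number by being wrong a little less each time.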
Clouseau builds on RETFound, a model developed at Moorfields Eye Hospital and published in Nature (2023). RETFound was trained on 1.6 million unlabelled retinal images, which makes it one of the largest ophthalmic AI models ever released. Clouseau adapts it with task-specific training for five separate diagnostic targets.
Chapter 05
From photograph to findings: the pipeline.
Every retinal image Clouseau receives passes through five stages. Each stage happens on your device — nothing leaves the room. The entire process, start to finish, takes less than eight seconds.
Cup-to-Disc Ratio anomaly detected in superior quadrant.
Clear vascular architecture. No exudates identified.
Signal noise in macula. Manual inspection required.
Chapter 06
A number you can trust — because it shows its work.
Every finding comes with a confidence score. It is not a guess. It is the model's estimate of how certain it is about each result, calibrated against thousands of expert-labelled cases.
A high confidence score (above ~0.75) means the model has seen many cases like this one and is giving you a reliable signal. A lower score is not a failure — it is the system being honest with you, telling you to look closer.
The clinician always has the final word. Each finding can be agreed with, disputed, or flagged as uncertain — and that feedback is logged against the case record.
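As a rough sketch of how a score and a clinician's verdict might sit side by side, the example below uses a small record type and a threshold check. Everything here is hypothetical: the `Finding` and `Review` names, the example labels and scores, and the exact 0.75 cut-off (the text only gives an approximate figure) are assumptions for illustration, not Clouseau's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

REVIEW_THRESHOLD = 0.75   # approximate cut-off from the text; the real value may differ

class Review(Enum):
    """The three verdicts a clinician can record against a finding."""
    AGREED = "agreed"
    DISPUTED = "disputed"
    UNCERTAIN = "uncertain"

@dataclass
class Finding:
    label: str
    confidence: float                 # 0.0 to 1.0
    review: Optional[Review] = None   # filled in by the clinician, never by the model

def needs_closer_look(finding, threshold=REVIEW_THRESHOLD):
    """A score at or below the threshold is a prompt to inspect manually."""
    return finding.confidence <= threshold

# Made-up findings and scores, purely for illustration.
findings = [
    Finding("cup-to-disc ratio anomaly", 0.91),
    Finding("macular signal noise", 0.42),
]

flagged = [f.label for f in findings if needs_closer_look(f)]
print(flagged)   # ['macular signal noise']

# The clinician has the final word, whatever the score says.
findings[0].review = Review.DISPUTED
print(findings[0].review.value)   # disputed
```

Keeping the review separate from the score reflects the point above: the score informs the decision, and the clinician makes it.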
Chapter 07
Others are already doing this. Most of them require a data center.
AI retinal screening exists. FDA-cleared products are already deployed in clinics. The question is not whether this technology works — it has been validated. The question is who can access it, and at what cost.
End to end, fully in RAM
Chapter 08
It thinks. It answers. It forgets.
Every star you have ever seen has already moved on; the light is only a memory of where it was. Clouseau works something like that. The photograph enters working memory. The model examines it. The findings arrive.
And then, without instruction, without ceremony, everything dissolves. Nothing is written to disk. Nothing is transmitted. The analysis exists for the span of a deep breath, and then it is gone. What persists is only what you choose to keep.
Medicine Advances as Far as Its Tools Allow
The eye has always held the answer. We simply lacked the means to hear it quickly enough. Every tool medicine has ever trusted — the stethoscope, the X-ray, the MRI — began as an idea that the body was telling us something we hadn't yet learned to read. We believe this is one of those moments. Not a conclusion, but a beginning.
It's only critical.