Normalising a Short AGI Horizon

AI 2027 was recently released and is another great forecast of how the next few years could play out, similar to another favourite of mine, Situational Awareness. While listening to the authors on the Dwarkesh Podcast, I tried to look a bit deeper into my own thoughts. Why do I also believe AGI is coming soon? Why does it still seem so fantastical, despite that belief? To help myself, I decided to jot down the key points that have influenced my thinking. It’s far from an exhaustive list, but it covers some of the latest facts that help me stay grounded in my predictions.

Read More

Think it Faster in AI

Ever since reading “How could I have thought that faster?” this year, I have been trying to put it into practice. Working with AI models, I found there are plenty of opportunities. One can spend hours on some buggy code only to find the bug they fixed wasn’t the real problem after all, or that someone had already solved it five years ago on Stack Overflow. One can invest hours into modelling only to realise there was a far simpler approach if you just thought about the problem from another angle…

Read More

Test-Time Compute Matters: From LLMs to Search

Test-time compute has recently been popularised with LLMs (e.g. o1/o3, DeepSeek-R1 and other reasoning models), as its application allows LLMs to perform significantly better on complex reasoning problems (e.g. mathematics). This article explores why it works, why it’s not new, and how it has been employed across different AI paradigms.

Read More

How to Run Cheap LLM Experiments

As a researcher, it’s common to have far more ideas than you have time to experiment with. This is especially true in the world of language modelling when you consider the cost of running such experiments. In this post I touch on some of the methods I’ve seen for running experiments with language models without breaking the bank.

Read More