Exploring how AI is reshaping our world
Daily analysis of AI tools, research, and industry shifts — written for engineers and decision-makers.
-
AI Agents Now Handle 12-Hour Tasks. Here’s the Data.
The length of coding tasks that frontier AI agents can complete with 50% reliability has been doubling every 7 months, and that pace has recently accelerated to every 3 months. METR’s rigorous multi-year study puts hard numbers on a trend most teams only vaguely track. Here’s what…
-
How to Use Cursor’s Parallel Agents for Large Refactors
Cursor 2.5 introduced parallel cloud agents — up to eight at once in isolated VMs with git worktrees — that…
-
Visa AI Agents Can Pay for You—But Should They?
Visa’s Agentic Ready programme launched in March 2026 with 21 European banks, enabling AI agents to initiate real payments on…
-
ICML Catches 497 Papers Cheating on AI Peer Review
ICML 2026 desk-rejected 497 papers after detecting that 398 reviewers used language models in violation of Policy A — a…
-
GPT-5.4 Review: Accuracy Gains and Context Window Limits
OpenAI’s GPT-5.4 cuts individual factual errors by 33% and raises the context ceiling to 1.05M tokens—but roughly 1 in 12…
-
Gemini 3.1 Pro vs GPT-5.2: The Context Window War
Google’s Gemini 3.1 Pro has a 1M-token context window. OpenAI’s GPT-5.2 caps at 400K. The raw numbers favor Gemini —…
-
Cursor BugBot Autofix: Parallel Agents Fix PRs
Cursor’s BugBot Autofix — now generally available — uses isolated cloud agents to propose fixes directly on pull requests. Over…
-
AI Writes Its Own Paper—And Passes Peer Review
Sakana AI’s AI Scientist-v2 produced the first fully AI-generated paper to pass human peer review—published in Nature today. The result…
-
AI Coding Agents in 2026: 90% Adoption, Zero DORA Gain
90 to 95 percent of developers now use AI coding tools, and individual velocity metrics are clearly up. But DORA…
