Simon Willison's Weblog
About This Feed
No description available for this feed.
Feed Activity
Latest Posts
Quoting Leaked Amazon memo
Amazonians, We've reviewed the Presidential Proclamation on H-1B visas that was released today and are actively working to gain greater clarity. Here's what you need to know right now: The...
httpjail
httpjail Here's a promising new (experimental) project in the sandboxing space from Ammar Bandukwala at Coder. httpjail provides a Rust CLI tool for running an individual process against a custom...
Magistral 1.2
Mistral quietly released two new models yesterday: Magistral Small 1.2 (Apache 2.0, 96.1 GB on Hugging Face) and Magistral Medium 1.2 (not open weights same as Mistral's other "medium" models.)...
The Hidden Risk in Notion 3.0 AI Agents: Web Search Tool Abuse for Data Exfiltration
The Hidden Risk in Notion 3.0 AI Agents: Web Search Tool Abuse for Data Exfiltration Abi Raghuram reports that Notion 3.0, released yesterday, introduces new prompt injection data exfiltration vulnerabilities...
Quoting Steve Jobs
Well, the types of computers we have today are tools. They’re responders: you ask a computer to do something and it will do it. The next stage is going to...
I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now
I've noticed something interesting over the past few weeks: I've started using the term "agent" in conversations where I don't feel the need to then define it, roll my eyes...
Anthropic: A postmortem of three recent issues
Anthropic: A postmortem of three recent issues Anthropic had a very bad month in terms of model reliability: Between August and early September, three infrastructure bugs intermittently degraded Claude's response...
ICPC medals for OpenAI and Gemini
In July it was the International Math Olympiad (OpenAI, Gemini), today it's the International Collegiate Programming Contest (ICPC). Once again, both OpenAI and Gemini competed with models that achieved Gold...
Announcing the 2025 PSF Board Election Results!
Announcing the 2025 PSF Board Election Results! I'm happy to share that I've been re-elected for second term on the board of directors of the Python Software Foundation. Jannis Leidel...
Quoting Poul-Henning Kamp
I thought I had an verbal agreement with them, that “Varnish Cache” was the FOSS project and “Varnish Software” was the commercial entitity, but the current position of Varnish Software’s...
GPT‑5-Codex and upgrades to Codex
GPT‑5-Codex and upgrades to Codex OpenAI half-released a new model today: GPT‑5-Codex, a fine-tuned GPT-5 variant explicitly designed for their various AI-assisted programming tools. Update: OpenAI call it a "version...
Models can prompt now
Here's an interesting example of models incrementally improving over time: I am finding that today's leading models are competent at writing prompts for themselves and each other. A year ago...
gpt-5 and gpt-5-mini rate limit updates
gpt-5 and gpt-5-mini rate limit updates OpenAI have increased the rate limits for their two main GPT-5 models. These look significant: gpt-5 Tier 1: 30K → 500K TPM (1.5M batch)...
Quoting Matt Webb
The trick with Claude Code is to give it large, but not too large, extremely well defined problems. (If the problems are too large then you are now vibe coding…...
London Transport Museum Depot Open Days
London Transport Museum Depot Open Days I just found out about this (thanks, ChatGPT) and I'm heart-broken to learn that I'm in London a week too early! If you are...
Comparing the memory implementations of Claude and ChatGPT
Claude Memory: A Different Philosophy Shlok Khemani has been doing excellent work reverse-engineering LLM systems and documenting his discoveries. Last week he wrote about ChatGPT memory. This week it's Claude....
Qwen3-Next-80B-A3B: 🐧🦩 Who needs legs?!
Qwen3-Next-80B-A3B Qwen announced two new models via their Twitter account (and here's their blog): Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking. They make some big claims on performance: Qwen3-Next-80B-A3B-Instruct approaches our 235B flagship. Qwen3-Next-80B-A3B-Thinking...
Defeating Nondeterminism in LLM Inference
Defeating Nondeterminism in LLM Inference A very common question I see about LLMs concerns why they can't be made to deliver the same response to the same prompt by setting...
Quoting Kumar Aditya
In Python 3.14, I have implemented several changes to fix thread safety of asyncio and enable it to scale effectively on the free-threaded build of CPython. It is now implemented...
Claude API: Web fetch tool
Claude API: Web fetch tool New in the Claude API: if you pass the web-fetch-2025-09-10 beta header you can add {"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5} to your "tools" list...