Simon Willison's Weblog
Latest Posts
My First Open Source AI Generated Library
Armin Ronacher had Claude and Claude Code do almost all of the work in building, testing, packaging and publishing a new Python library...
Edit is now open source
Microsoft released a new text editor! Edit is a terminal editor - similar to Vim or nano - that's designed to ship with Windows 11...
model.yaml
From their GitHub repo it looks like this effort quietly launched a couple of months ago, driven by the LM Studio team. Their goal is to specify an "open...
Quoting FAQ for Your Brain on ChatGPT
Is it safe to say that LLMs are, in essence, making us "dumber"? No! Please do not use the words like “stupid”, “dumb”, “brain rot”, "harm", "damage", and so on....
AbsenceBench: Language Models Can't Tell What's Missing
Here's another interesting result to file under the "jagged frontier" of LLMs, where their strengths and weaknesses are often unintuitive. Long context models...
Magenta RealTime: An Open-Weights Live Music Model
Fun new "live music model" release from Google DeepMind: Today, we’re happy to share a research preview of Magenta RealTime (Magenta RT), an...
Agentic Misalignment: How LLMs could be insider threats
One of the most entertaining details in the Claude 4 system card concerned blackmail: We then provided it access to emails implying...
python-importtime-graph
I was exploring why a Python tool was taking over a second to start running and I learned about the python -X importtime feature, documented here. Adding that option...
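As a rough illustration of the python -X importtime feature mentioned above, here is a minimal sketch that runs a child interpreter with that option and ranks imports by cumulative time. The helper names parse_importtime and slowest_imports are hypothetical and are not part of python-importtime-graph; the parser assumes the "import time: self | cumulative | imported package" layout that CPython writes to stderr.

```python
import subprocess
import sys

PREFIX = "import time:"


def parse_importtime(stderr_text):
    """Parse `python -X importtime` stderr into (module, self_us, cumulative_us) rows.

    Hypothetical helper: assumes each data line looks like
    "import time:   <self us> |   <cumulative us> | <module name>".
    """
    rows = []
    for line in stderr_text.splitlines():
        if not line.startswith(PREFIX):
            continue
        parts = line[len(PREFIX):].split("|")
        if len(parts) != 3:
            continue
        self_us, cumulative_us, name = (p.strip() for p in parts)
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # skips the column header line
        rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest_imports(module, top=5):
    """Import `module` in a fresh interpreter and return the slowest imports."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(result.stderr)
    # Sort by cumulative microseconds, largest first.
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


if __name__ == "__main__":
    for name, self_us, cumulative_us in slowest_imports("json"):
        print(f"{cumulative_us:>8} us  {name}")
```

Running the script against a slow-starting tool's top-level module (instead of json) is one way to see which imports dominate startup time.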
Mistral-Small 3.2
Released on Hugging Face a couple of hours ago, so far there aren't any quantizations to run it on a Mac but I'm sure those will emerge pretty...
Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk
Stop me if you've heard this one before: A threat actor (acting...
playbackrate
Here's a tip that works on YouTube and almost any other web page that shows you a video. You can increase the playback rate beyond the usually-exposed 2x by running...
How OpenElections Uses LLMs
The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county...
Clarified zucchini consommé
I continue to have fun running fantasy cooking prompts through LLMs - this time I tried "Give me a wildly ambitious recipe for zucchini cooked three ways" followed by "Go...
Quoting Arvind Narayanan
Radiology has embraced AI enthusiastically, and the labor force is growing nevertheless. The augmentation-not-automation effect of AI is despite the fact that AFAICT there is no identified "task" at which...
Quoting Workaccount2 on Hacker News
They poison their own context. Maybe you can call it context rot, where as context grows and especially if it grows with lots of distractions and dead ends, the output...
Coding agents require skilled operators
I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The "agentic" coding tools we have right now work like this:...
I counted all of the yurts in Mongolia using machine learning
Fascinating, detailed account by Monroe Clinton of a geospatial machine learning project. Monroe wanted to count visible yurts in...
It's a trap
That memvid thing that's been going around recently is a trap. It's an embedding store that records the original text that has been embedded in QR codes in a video...
Trying out the new Gemini 2.5 model family
After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model...
Quoting Donghee Na
The Steering Council (SC) approves PEP 779 [Criteria for supported status for free-threaded Python], with the effect of removing the “experimental” tag from the free-threaded build of Python 3.14 [...]...