Description
A community blog devoted to refining the art of rationality
Feed Activity
Latest Posts
To what extent is AI safety work trying to get AI to reliably and safely do what the user asks vs. do what is best in some ultimate sense?
Published on May 23, 2025 9:05 PM GMT Trying to get a rough estimate for some related research I’m doing. Specifically, I’m wondering if anyone could give a rough percentage of...
Default history is dead wrong
Published on May 23, 2025 4:31 PM GMT There is a default historic grand narrative that goes something like "humanity in the past was worse than the humanity of the present,"...
Notes on Claude 4 System Card
Published on May 23, 2025 3:23 PM GMT Anthropic released Claude 4. I've read the accompanying system card, and noted down some of my remarks. Alignment assessment: system prompt mix-ups. There's a worrying...
What is emptiness?
Published on May 23, 2025 12:06 PM GMT "The value of philosophy is that no one needs it." -- Alexander Piatigorsky[1] I'll start with a disclaimer. I'm neither a Buddhist nor a...
Idiohobbies
Published on May 23, 2025 6:38 AM GMT When you get to know someone, you might ask about their interests or hobbies. From that, you can better decide what activity to...
Learning (more) from horse employment history
Published on May 23, 2025 2:11 AM GMT The economist Wassily Leontief, writing in 1966, used the then-recent decline of horses to make vivid what he foresaw as the coming impact...
Qualitative Fit Testing
Published on May 23, 2025 2:50 AM GMT As I wrote about last week, it's worth it for everyone to have an elastomeric respirator in case of emergencies: the chance...
Anthropic is Quietly Backpedalling on its Safety Commitments
Published on May 23, 2025 2:26 AM GMT
Schizobench: Documenting Magical-Thinking Behavior in Claude 4 Opus
Published on May 23, 2025 1:31 AM GMT With today's release of the new Claude models, we've seen a relatively predictable jump in performance. However, we've also seen something that I...
Post-Manifest coworking at Mox
Published on May 23, 2025 12:20 AM GMT Mox (https://moxsf.com) is fully open to the public in the leadup to LessOnline and after Manifest! Wanted to check out Mox? Need a...
Art Is Art: AI Is the Next Erotica
Published on May 22, 2025 6:04 PM GMT As AI generates more of our cultural lives, good art will be much harder to find. Fortunately, I am here to help you...
Claude 4, Opportunistic Blackmail, and "Pleas"
Published on May 22, 2025 7:59 PM GMT In the recently published Claude 4 model card: Notably, Claude Opus 4 (as well as previous models) has a strong preference to advocate for...
Reward button alignment
Published on May 22, 2025 5:36 PM GMT In the context of actor-critic model-based RL agents in general, and brain-like AGI in particular, part of the source code is a reward...
We're Not Advertising Enough (Post 3 of 6 on AI Governance)
Published on May 22, 2025 5:05 PM GMT In my previous post in this series, I explained why we urgently need to change AI developers’ incentives: if we allow the status...
What we can learn from afterlife myths
Published on May 22, 2025 3:49 PM GMT Overview: The "Modal Rationalist Anti-Death Stance" goes something like this: Since time immemorial, people have told comforting stories about the afterlife to avoid confronting the...
Policy recommendations regarding reproductive technology
Published on May 22, 2025 2:49 PM GMT PDF version. berkeleygenomics.org. X.com. Bluesky. Introduction: Here we list six policies that would help accelerate the development of novel assisted reproductive technologies. Such...
Does BPC-157 work for healing and tissue repair?
Published on May 22, 2025 1:18 PM GMT BPC-157, a peptide frequently marketed as a breakthrough for healing and tissue repair, has attracted substantial attention in wellness and performance communities. It’s...
How load-bearing is KL divergence from a known-good base model in modern RL?
Published on May 22, 2025 12:08 PM GMT Motivation: One major risk from powerful optimizers is that they can find "unexpected" solutions to the objective function, which score very well on...
Christianity vs. Tantra vs. Sex – one spiritual path?
Published on May 22, 2025 11:15 AM GMT [Cross-posted from my blog https://www.pchvykov.com/blog] This conversation is inspired by the common narrative in western spiritual (especially rationalist or new-age) circles that Christianity...