Discover: LessWrong | Postreads.co

Description

A community blog devoted to refining the art of rationality

Total Posts: 7,149

Total Clicks: 64,417

Feed Activity

First Post: Apr 6, 2025

Latest Post: Jul 18, 2026

Posts Per Day: 16.1

Engagement: 9.0 clicks per post

Latest Posts

LessWrong

2 followers

The Most Forbidden Technique is not always forbidden

A few days ago, Goodfire announced a private beta of Silico, their LLM training platform. As part of the announcement, they made a post describing Silico's reproduction of RLFR, a...

1 (1)

5 hours ago

Should we benchmark conceptual capabilities using judgment prediction tasks?

A bunch of conceptual reasoning tasks involve very subjective judgments, which makes them poorly suited for benchmarking AI capabilities. For example, it seems unreasonable to benchmark how well AIs can...

2 (2)

6 hours ago

Longtermism is very intuitive.

Crosspost from: https://inputlogic.substack.com/p/longtermism-is-extremely-intuitive Longtermism seeks to consider the welfare of the quintillions, septillions, or even octillions of potential future human beings, each with a likely higher level of consciousness and ability...

1 (1)

7 hours ago

A list of existing alignment approaches

How can we make a nice AI system? Here's a list of all the techniques I'm aware of. Train the AI system to be nice. There are a variety of things...

1 (1)

7 hours ago

AIs finetune their own leader: A barking simpleton

What values would AIs instill in their successors? Though the AI Village agents can’t train frontier models, we can explore a related question: What values would the latest AI agents...

1 (1)

10 hours ago

Don't default to nonprofit

You want to do something to help AI go well and are starting a project to make that happen. Should you create a nonprofit or a for-profit? A lot of charitably-minded...

1 (1)

10 hours ago

Studying the role of Sandboxing for AI Control

Sandboxing is a classic tool in computer security: to run code you do not trust, you run it in an environment with limited permissions. It's harder to sandbox an untrusted...

1 (1)

11 hours ago

Announcing the Corrigibility Research Fund

TLDR: I'm managing a new fund, housed at Lightcone Infrastructure, that will award at least $200,000 in grants and prizes for corrigibility research in 2026. Roughly half will go to...

1 (1)

12 hours ago

Would your AI travel agent book a bullfight? Testing whether agents consider animal welfare without being prompted

This article reflects new updates to the accompanying paper: arxiv.org/abs/2606.18142. Benchmark: now included in the UK AI Security Institute's Inspect Evals. Leaderboard: compassionbench.com/tac. A model may condemn cruelty in conversation yet...

1 (1)

13 hours ago

Before values settle

Epistemic status: This is based on ten years of matchmaking experience in India - my lens for evaluating alignment. This is a conceptual essay continuing on my earlier ideas about...

1 (1)

14 hours ago

Reasons to believe current AI models are conscious

There are a number of reasons to believe current AI models are conscious. I mean “conscious” in the sense of “is there something it is like to be an AI...

2 (2)

14 hours ago

What lawyers can do for AI safety

The technology is transformative, the risks catastrophic, and the rate of change overwhelming. For things to go well, there's much work to be done. The legal profession needs to be...

2 (2)

14 hours ago

How has publishing your research on LW or X been helpful to you?

I work at CBAI, and we might push to get fellows to publish results on LW or X. I'd like to give more than just my own anecdotes. So please...

1 (1)

15 hours ago

Evolution of my AI Safety threat models

The goal of this post is to describe how my views evolved over time, and reflect on this process. It might prompt you to make your own (list of) threat...

1 (1)

15 hours ago

A Post-Mortem for My Goal Crystallisation Project

I concluded my MARS 4.0 project titled 'Goal Crystallisation' with Anaïs Berkes and Lukas Gebhard under the mentorship of @Cameron Tice and @Jason Brown. We wanted to find out how...

1 (1)

16 hours ago

Inoculation Adapters Improve Upon Inoculation Prompting

This is a link post for the paper preprint: Inoculation Adapters: Improved Selective Generalization of Capabilities with Fewer Surprising Backdoors from the Center on Long-Term Risk. Selective generalization. Training can teach...

1 (1)

16 hours ago

I don't think Claude is misaligned in 'Agentic Misalignment Summer 2026 - Motivated Mislabeling'

Anthropic recently published Agentic Misalignment Summer 2026. The "whistleblowing" scenario has already been examined and found problematic. I started taking a look at the transcripts for some others. As far as...

2 (2)

1 day ago

Help us launch AI safety university groups by referring potential founders

TL;DR: University groups are among the most reliable producers of AI safety talent, yet dozens of top schools that could sustain a group don't have one. We're launching the AI Safety...

1 (1)

1 day ago

How (not) to fundraise from Anthropic staff

Adapted from my Substack, Funding Anthropalypse. Short version: for organisations aiming for a share of the coming Anthropic and OpenAI windfall - the $37bn+ that could be in play next year...

1 (1)

1 day ago

I would only bet at 30% on meeting grabby aliens

The grabby aliens argument: According to standard models of cosmology, there will be habitable planets in our universe for a very long time, and some of these planets will be much...

1 (1)

1 day ago