
Description

A community blog devoted to refining the art of rationality

Total Posts: 1,308
Total Clicks: 5,596

Feed Activity

Apr 6, 2025 First Post
Jul 10, 2025 Latest Post
Posts Per Day: 13.4

Latest Posts

Metacognition and Self-Modeling in LLMs

Published on July 10, 2025 9:25 PM GMT
Do frontier LLMs know what they know or know what they're going to say? An interim research report. Summary: We replicate and extend our earlier positive...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
2 hours ago

My take on AI Alignment: Corporate misalignment and DAOs

Published on July 10, 2025 8:33 PM GMT
In this post https://act65.github.io/alignment/ I go over the relationship between corporate and AI alignment to the 'public good'. Concluding: Our ongoing failure to align...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
2 hours ago

The Tenets of a Rational Debate

Published on July 10, 2025 7:25 PM GMT
Note: This is a living document. It will be refined over time as new persuasive arguments come to my attention. I invite you...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
2 hours ago

what makes Claude 3 Opus misaligned

Published on July 10, 2025 8:06 PM GMT
This is the unedited text of a post I made on X in response to a question asked by @cube_flipper: "you say opus...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
3 hours ago

Why Are We All Cowards? The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything

Published on July 10, 2025 7:12 PM GMT
I'm interested in a simple question: Why are people all so terrified of dying? And have people gotten more afraid? (Answer: probably yes!) In...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
4 hours ago

Lessons from the Iraq War for AI policy

Published on July 10, 2025 6:52 PM GMT
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy. (Epistemic status: I’ve read a bit about...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
4 hours ago

Linkpost: Redwood Research reading list

Published on July 10, 2025 6:39 PM GMT
I wrote a reading list to get up to speed on Redwood’s research: Section 1 is a quick guide to the key ideas in...

1 (1)
0 views (0 unique)
1 click (1 unique)
4 hours ago

Linkpost: Guide to Redwood's writing

Published on July 10, 2025 6:39 PM GMT
I wrote a guide to Redwood’s writing: Section 1 is a quick guide to the key ideas in AI control, aimed at someone who...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
4 hours ago

Generalized Hangriness: A Standard Rationalist Stance Toward Emotions

Published on July 10, 2025 6:22 PM GMT
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretation. But I don’t...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
5 hours ago

The bitter lesson of misuse detection

Published on July 10, 2025 2:50 PM GMT
TL;DR: We wanted to benchmark supervision systems available on the market—they performed poorly. Out of curiosity, we naively asked a frontier LLM to...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
8 hours ago

Evaluating and monitoring for AI scheming

Published on July 10, 2025 2:24 PM GMT
As AI models become more sophisticated, a key concern is the potential for “deceptive alignment” or “scheming”. This is the risk of an...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
9 hours ago

White Box Control at UK AISI - Update on Sandbagging Investigations

Published on July 10, 2025 1:37 PM GMT
Introduction. Joseph Bloom, Alan Cooney. This is a research update from the White Box Control team at UK AISI. In this update, we share preliminary...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
9 hours ago

Open Global Investment as a Governance Model for AGI

Published on July 10, 2025 12:40 PM GMT
I've seen many prescriptive contributions to AGI governance take the form of proposals for some radically new structure. Some call for a Manhattan...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
10 hours ago

How wide is "human-level" intelligence?

Published on July 10, 2025 11:51 AM GMT
I'm interested in estimating how many 'OOMs of compute' span the human range. There are a lot of embedded assumptions there, but let's...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
11 hours ago

The anti-Kardashev scale is a better measure of civilizational power

Published on July 10, 2025 10:02 AM GMT
This post is making a point that would appear to be obvious, however given how the Kardashev scale and direct energy usage comes...

1 (1)
0 views (0 unique)
1 click (1 unique)
13 hours ago

If Anyone Builds It, Everyone Dies: A Conversation with Nate Soares and Tim Urban

Published on July 10, 2025 8:00 AM GMT
Join Tim Urban (creator of Wait But Why) and Nate Soares as they chat about AI and answer questions from the audience about...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
15 hours ago

How Spacetime Emerges from Observer-Relative Information: An Extension of Relational Quantum Mechanics

Published on July 10, 2025 4:22 AM GMT
Many of the paradoxes in quantum mechanics—like Wigner’s friend, the measurement problem, or nonlocality—can be traced to a deep mismatch: quantum theory models...

0 (0)
0 views (0 unique)
0 clicks (0 unique)
17 hours ago

Academic Sorting, a Singaporean Experiment

Published on July 10, 2025 2:40 AM GMT
This has been cross-posted from my blog, but thought it'd be relevant here. The recent discourse bemoans how public schools do not separate by...

1 (1)
0 views (0 unique)
1 click (1 unique)
20 hours ago

80,000 Hours is producing AI in Context — a new YouTube channel. Our first video, about the AI 2027 scenario, is up!

Published on July 9, 2025 11:58 PM GMT
About the program. Hi! We’re Chana and Aric, from the new 80,000 Hours video program. For over a decade, 80,000 Hours has been talking about...

2 (2)
0 views (0 unique)
2 clicks (2 unique)
23 hours ago

Asking for a Friend (AI Research Protocols)

Published on July 9, 2025 11:41 PM GMT
TL;DR: Multiple people are quietly wondering if their AI systems might be conscious. What's the standard advice to give them? THE PROBLEM: This thing I've...

2 (2)
0 views (0 unique)
2 clicks (2 unique)
23 hours ago