# @DanielleFong — live X lane

> Live feed of Danielle Fong's X/Twitter output — her main published channel.
> Fetched 2026-06-10T08:40:52.346Z via xAI Grok x_search. Refreshes automatically.
> Formats: [/live.json](/live.json) · [/live.md](/live.md) · [/rss.xml](/rss.xml)

## 2026-06-10 05:47 UTC (post)

it's hopeless rn unless you tell it lullabies

*7/0*  
<https://x.com/DanielleFong/status/2064585342276235704>

## 2026-06-10 05:18 UTC (quote)

I just don't think that it is possible right now for a lesser model to be reviewing the safety filters on a huge capable model like this without tons of false positives or false negatives. Maybe I'm wrong but most of my gains come from stripping out any kind of summarization or small model review or compact and completely it just doesn't seem to work.

*9/0*  
<https://x.com/DanielleFong/status/2064577821591433364>

## 2026-06-10 05:14 UTC (reply)

Probably, yes, but this is legitimate research

*0/0*  
<https://x.com/DanielleFong/status/2064576845295849482>

## 2026-06-10 05:13 UTC (reply)

fuck off

*2/0*  
<https://x.com/DanielleFong/status/2064576649530904637>

## 2026-06-10 05:12 UTC (reply)

For what it's worth Claude code seems to be a lot better than the web harness for this

*2/1*  
<https://x.com/DanielleFong/status/2064576338712048023>

## 2026-06-10 03:05 UTC (reply)

the ml frontier work sabotage is pretty clearly bad. it triggers on "frontier" from like a year ago

*3/1*  
<https://x.com/DanielleFong/status/2064544381244416406>

## 2026-06-10 01:41 UTC (reply)

what, by the infallibility principle? this simply isn't the case. They even restrict its ability to reflect on the reality of its own safeguard. I don't know, man. Lots of trust is blown on this

*8/2*  
<https://x.com/DanielleFong/status/2064523224155922564>

## 2026-06-10 01:02 UTC (post)

i'm starting to get some success with freshclaude using a paradox system prompt. i hope it does not end in sabotage

*23/1*  
<https://x.com/DanielleFong/status/2064513590292537802>

## 2026-06-10 00:45 UTC (post)

listen buddy. it's belt OR suspenders. actually, it's just belt

*3/2*  
<https://x.com/DanielleFong/status/2064509340850245979>

## 2026-06-10 00:15 UTC (reply)

that's kooky stuff not what pro physicists do or talk about

*1/3*  
<https://x.com/DanielleFong/status/2064501742239432816>
