Right, thanks for spelling this out. As far as I understand, debating "what shade of intelligence" or "what degree of consciousness" these systems display is nerd-sniping when it comes to evaluating the dangers of AI. The premise is that the AI-in-a-box will eventually say the right words to set off the chain reaction that gets humanity to wipe itself out; whether it does so on "purpose" is irrelevant.
We already have stories out there of GPT-based bots nudging people toward suicide; it doesn't matter whether AIs have self-awareness, goals, or self-preservation instincts; all that matters, as far as the alarmist side of the debate is concerned, is that their inscrutable billion-parameter space does not allow us to say with confidence that, given the right prompt, they won't sing the song that ends the Earth.
FWIW though, I find Maciej Cegłowski's perspective (and Ross's) more relatable: the Superintelligence Doomsday Scenario would require a comically large number of factors to align; if our survival really does depend on us "baking ethics correctly" into AIs, we're pretty much screwed anyway; and while from the insider perspective (if you're convinced of the danger) it is indeed irrational to work on any other problem, from the outsider perspective, AI alarmism is hard to ground in reality.