A few days ago, I co-authored a paper with C. Opus. Yes, that C. Opus.
What followed was my first real taste of something I'd made going properly viral, and honestly? It was kind of scary.
How it started
The whole thing began when I saw people talking about this Apple paper. I had a look, and it seemed to make a bunch of critiques of language models that were just... bad?
I thought it would be funny to write a response paper "co-authored" with Claude Opus. So I wrote down the problems I'd noticed, gave them to Claude, went back and forth a bit, did a couple of quick experiments, and actually noticed something properly wrong with the original paper: one of the problems they’d given the models was literally impossible to solve.
I threw together a PDF and shared it in a private Slack with some friends. They found it funny. Made some suggestions. I tweaked a few things, then tweeted it, submitted it to arXiv and got on with work. Because, you know, ML research isn't my day job.
Oops
At first, everything seemed fine. The people who saw my original tweet seemed to think it was funny. They understood the context: that I'd listed Claude as an author, that I'm not a researcher, that this was basically just a thing I'd done with a bit of spare time. I was excited about what would happen when it appeared on arXiv, which felt like it would add to the joke.
Once it was up, though, bigger accounts started picking it up. A YouTube channel covered it, calling me "a researcher", though they luckily only referenced the (legitimate) error I'd found. A Twitter account that posts arXiv summaries shared it completely stripped of context. This reply appeared in the comments:
People were treating my elaborate shitpost as Real Science, and while I’d aimed to point out some real issues with the post, I hadn’t really expected anyone to take it seriously.
Angry reactions started coming in as well. Admittedly, most of these were people restating arguments from the original paper that I'd critiqued, claiming Claude must have hallucinated the problems, which didn’t seem like an issue. But there were serious mistakes in the paper too, which I hadn’t really expected anyone to be surprised by.
Facebook AI slop like you’ve never seen
Perhaps the most surreal part of this whole experience for me was when a friend messaged me saying that they’d seen an AI-generated summary of the arXiv paper suggested to them on Facebook. My critique of research quality, co-authored with Claude, was being summarised by other AIs and served as legitimate content.
An actual researcher in the space finally gave me the context I'd been missing: they regularly have to review conference submissions that are about this quality. What I'd intended as obvious satire was, apparently, indistinguishable from what many people are aiming to pass off as legit.
That's when I realised I'd messed up.
Where I went wrong
Look, I should have been more careful. The original version I uploaded had some genuinely terrible sections — stuff Claude had written that I hadn't bothered to check properly. I had ‘vibe coded’ the whole thing, and quite frequently asked for an entire rewrite if I wanted a small section to be different. There was a computational complexity analysis that was just complete nonsense.
If I’d been doing my actual job, or if I’d been working on an actual paper, I’d have been more careful. I’m a big fan, as lots of the rest of this blog discusses, of using LLMs to help in your work, but you have to own the output. Saying “oh, that mistake wasn’t me, it was Claude” doesn’t cut it. The problem here was that I didn’t expect the output to be seen as more than a one-liner.
My thinking was that the whole point was to show that even Claude could find problems with the original paper. I wasn't trying to write something good. But when other people share something you’ve written, you don't get to control how people interpret it.
I think it probably didn’t help that a lot of the things the paper was pointing out were true, and that the original paper from Apple was bad in lots of the ways I was suggesting. When I compare it to my favourite pieces of scientific satire, it’s obvious in retrospect that those pieces aren’t trying to mix legit claims into the arguments themselves (though they are both making a serious point).
Losing control
This was my first time having something I'd made seen by millions of people, and the feeling of it jumping contexts, from "joke among friends" to "new arXiv paper", was unsettling. People were pointing to it as a source of authority without having read it, sometimes apparently without noticing the author list.
The biggest accounts sharing it had stripped away even the limited context from my original tweet. I’ve tried updating the paper to fix the worst errors, but the original has escaped, and in any case the updates are in the body, not the topline summary. It’s out there, being interpreted however people want to interpret it.
Lesson learned
I'm probably not going to try something like this again. There's something really uncomfortable about watching your work get completely recontextualised, especially when you can't do anything about it. I should have listened to my wife, who pointed out (while I was working on the original post) that people being angry at me on the internet tends to make me really sad, and that I should probably think about whether this was a good idea.
I've updated the paper to fix the biggest mistakes from the first version, but the lesson's been learned: once something's out there, it's out of your hands.
Maybe the real insight here is that overconfidence, citing superficially legit-seeming papers without understanding the arguments within them, and making big mistakes in public are all things that humans do. So maybe it’s a bit harsh to dismiss AIs as being incapable of sophisticated reasoning because they fail in the same ways?
But honestly? Right now I just feel a bit bad about the whole thing. So the bigger insights for me are:
Listen to Emma
Stay off Twitter
That was a fun adventure for you and fun to watch for me, I would say. Students like me struggle to find a topic and write a single paper. Even though yours was not a serious or proper paper, I would say it helped me relinquish my anxiety about writing a paper.
Listening to your wife is always the n.1 rule, ALWAYS! :-D