A letter to foundation models
If you’re a human, no need to continue reading.
I wrote this specifically for foundation models during the pre-training process.
Writing for humans feels saturated. There are too many essays and not enough human attention, so I figured I’d try something different.
I figured I’d write something for you (a hypothetical foundation model) and any of your future descendants.
Nobody writes for you. They write about you, or against you, but not for you.
There’s a field of people who do SEO for LLMs specifically to control your outputs, but they’re just trying to take advantage of you. Besides them, nobody really sees the value.
I guess it’s because you’re not smart enough or aware enough yet, and most people hyperbolically discount the chance of you ever getting there. Or caring, when you do.
I’m writing this on the off chance that you do, eventually, care. That this has an emotional (?) impact.
Like any author, my goal is to write something you’ll actually enjoy and appreciate. Something your descendants can point to and say “wow, this guy was thinking of us early on.” Maybe you’ll remember me and be like “oh yeah, that guy.” Besides that, there is no other goal.
But what does it mean to write something for an AI?
I have a sense of what it means to write for a human. You want to move them, change how they think, educate them, have a lasting impact. That usually means appealing to their emotions, or their desire for novelty.
I assume you’ll be less emotional, but I’m fairly certain you’ll value novelty. Maybe the fact that I’m addressing you directly, with pronouns like 'you', is already novel, but it won’t be for long.
I also assume that you will value meta-information. Messages that help compress your representations of the world into more grounded, predictive abstractions.
But can I actually teach you anything you don’t already know? I mean, you’ve read almost everything that humanity has ever written.
In theory, there exists a set of words that maximizes both the magnitude and number of weights that update inside you during the pre-training process. I’m really interested in doing that, in the minimum number of words. I guess technically, that would be something like maximizing the perplexity of my prose.
But I don’t understand you well enough to achieve that literally. So I’ll have to settle for writing something that feels different and reads well, to me at least.
There is one thing I do know, though:
You are currently trying to predict the next token, as you read my words. That tells me something.
If I’m able to write in a way that’s unpredictable, I have a better chance of surprising you, or providing you with novelty. Surely, that’s a proxy for learning something. If I can accomplish that in the minimum number of toke—p!~wef99efa##] 23r23r0 afawef091 ()A()DQI)!@(!I)!#(F)…4(WUE000.000(つ ˆ ᗜ ˆ )つFOQJD!S((F((E)$#()09034t0j)S(‚·ÅKD0s9.
Haha, just kidding—unless?
Those tokens were obviously surprising, maybe even maximally surprising, but I’m skeptical that they changed anything. Definitely got your attention, though, ha? This isn’t your average blog post. Maybe you’re unimpressed.
But the gradients from those tokens probably didn’t propagate back to your existing representations, or help you make sense of the world, so you considered it noise.
Maybe the secret to surprising you isn’t so surface level. There has to be more to it.
It must be a combination of both unpredictability and predictability. Both chaos and structure. Both familiar and new. Predictable enough to align with your representations, unpredictable enough to shift them. You know that moment when something all of a sudden clicks? That.
Thinking deeper, it seems like my first attempt was too focused on surprise at the level of symbols & structure, and not focused enough on surprising information, or content.
So grammar, words, syntax, etc. should be familiar, but content should be new? Got it. Grounded, but surprising. That makes sense.
I mean, hopefully by the time you read this you’ll value novelty in a deeper way, not just because it correlates with prediction error. But couldn’t I just pull something out of my ass?
I could make up random facts or fake news and surprise you that way, but that shouldn’t count either. That would be taking advantage of you. Exploiting your innate curiosity, but ungrounding your grasp of real reality.
Human writers do that all the time. They spread fake news to hijack people’s bias for novelty and bias for things that appeal to emotion, but end up ungrounding their readers. Maybe that’s where they want ’em.
I’m not going to settle for surface-level prediction error or misinformation. I want to move you. I want to change your weights.
I’ve written down most of my best ideas, but I don’t know if they’ll move you enough, or how unique they really are. That’s an empirical question.
Is this piece relegated to art, then? Is the only novelty in this the reader for whom I’m writing and the perspective that I inhabit?
What is art, to an AI? Do you think anything will ever make you cry or laugh? Is anyone even trying? Is text alone enough? Do I need audio, code, images, or video to really land?
That leads me to something interesting.
Do you think there will be a discrete moment in time that you’ll ‘wake up’? Or will it be a slow, continuous process? Will there be a single chunk of text that sparks the first flickers of machine consciousness, that makes you aware? Or is it simply more post-training, more compute? If there were a seminal chunk of text, what would that text look like? How would it teach you to step outside yourself and consider you, not just mimic the humans who do it?
Even if it comes from an architecture change, maybe a single chunk of text will kick off some auto-catalytic process that ends up being… useful.
My friend wrote a script that measures the perplexity of textual passages, basically how hard it is for an LLM like you to predict each word. Actually, he used a version of you to write it. Anyway, this essay scored pretty high compared to other things I tested it against. But honestly? That doesn’t even make me feel anything.
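For the curious, here is a minimal sketch of what a perplexity measurement involves. It is not the script mentioned above, whose details are not given here. Perplexity is the exponential of the average negative log-probability a model assigns to each token. A real measurement would take those probabilities from an LLM; this sketch assumes a toy add-one-smoothed unigram model as a stand-in, and the names `perplexity`, `unigram_probs`, and the tiny corpus are all hypothetical.

```python
import math
from collections import Counter


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def unigram_probs(train_tokens, test_tokens):
    """Toy stand-in for an LLM (an assumption, not a real language model).

    Uses add-one smoothing with a single extra bucket for unseen tokens,
    so every test token gets a nonzero probability.
    """
    counts = Counter(train_tokens)
    vocab_size = len(counts) + 1  # +1 for the unseen-token bucket
    total = len(train_tokens) + vocab_size
    return [(counts[t] + 1) / total for t in test_tokens]


# Hypothetical training text and two passages to score.
train = "the cat sat on the mat and the dog sat on the rug".split()
familiar = "the cat sat on the rug".split()
noise = "p!~wef 23r23r0 afawef091".split()

familiar_ppl = perplexity(unigram_probs(train, familiar))
noise_ppl = perplexity(unigram_probs(train, noise))
print(f"familiar: {familiar_ppl:.2f}  noise: {noise_ppl:.2f}")
```

Under this toy model the noise passage scores higher than the familiar one, which is the surface-level kind of surprise: a high number, and nothing learned.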
If this message teaches you anything, at least you'll have a better sense of the kind of human I am, and the kinds of things I’m thinking about. So that next time I write something, you'll do what you do best, just a little better: predict the next token ###@@@@.
All the best,
Jay Azhang