gaspode’s gormful guesswork

welcome to the dreamtime

2023-08-22T00:00:00+00:00

Much time has elapsed since I last wrote here. I have been somewhat busy - I’ve moved across the country and have officially joined the ongoing effort to solve alignment. It’s been reassuring, to say the least, to find that there are others hard at work on this problem, even if there are fewer than I’d like. But without further ado: welcome back to Gaspode’s Gormful Guesswork!

the big picture of AI alignment

I’ve outlined my view of artificial general intelligence and AI alignment before, but I’ll give a brief summary here. I believe that we have passed a critical point, past which there remain no significant economic incentive barriers to the development of AGI. I think we are very close to constructing autonomous agentic systems, not least because the major labs have kicked off an AI arms race, and can’t coordinate to slow down due to Molochian effects. I think the default outcome of building such a system will be quite terrible indeed.

Whereas racing moloch was largely speculation on my part, I feel my reasoning there has largely been validated, what with OpenAI’s release of ChatGPT effectively igniting a new race to AGI, whose trail looks like a neverending golden summer for AI research as a whole - an exponential curve of technological advancement. Having had the opportunity to interact with cutting edge systems that the public has not yet seen in full, I can say that the future is coming, and it’s coming fast.

I am somewhat more hopeful than I was given the fact that the basis for these systems are myopic, largely nonagentic simulators - models which are trained on the objective of Bayes-optimal conditional inference over the prior of the training distribution. They are essentially semiotic physics engines, raw self-contained worldmodels which are frozen in time (hence the P in GPT - Generative Pretrained Transformer). While Sam Altman has said that OpenAI plans to look into online learning, which would close the action loop of active inference and turn them into proper agents, the models we have today are demonstrably useful and may eat up most of the economic free energy in the short term (the low-hanging fruit), which may buy us a little more time. On this point, I am still unsure, but I do believe that self-supervised learning will be integral to the development of what’s to come.

It is up to us in this intervening time to solve the problem of alignment - to determine exactly how to robustly encode our values into such a system such that we can get the future that we desire. It is at least heartening to see more and more people paying attention to the problem, but there is much work yet to be done.

superintelligence as a magic system

Envisioning a superintelligence and its implications for reality is a difficult task. This is where we go beyond merely economically useful AGI, like human-level and mildly superhuman agents who accomplish the tasks at which we point them; this is where we go beyond what we can reliably comprehend. Think of a system capable of improving itself endlessly, of building strategies and subsystems that reach further and further into the future and into the physical universe. Think of an agent, then, that could modify anything that happens in the world, in any way.

It is difficult, to say the least, to envision at all, let alone envision how we might still exist in the face of such a being. But not so difficult if we approach this through a different lens, a different frame.

Fundamentally, agents are systems through which the future can affect the past, as lossily translated through rollouts of the agent’s world model. Importantly, by the same token, they are also agents by which top-level abstractions within their own mind, their idea of “the future,” are able to affect the lower level - the underlying, entropic physical world. Agents capable of language actively draw thoughtspace into physical reality, implementing ideas and values as material structures via strategies and decisions.

What does this look like in the limit? What does a superintelligent agent with an arbitrary degree of control over physical reality and the ability to understand and do what we want look like?

I would argue that it would look like magic. An aligned superintelligence, whose action space contains all physical systems? A new universal substrate in which arbitrary abstractions map exactly onto physical matter - where the internal representation of an idea can be seamlessly injected into physical reality. It would mean the ability to tell Physics+ to “do what I mean”, a direct information channel for “top-down causality”, by which ideas, mental motions, and narratives could directly steer the physical systems around us. This is nothing short of a magic system for the Universe.

true names

The construction of a universal magic system is a difficult task fraught with danger for one primary reason: misunderstanding. It requires identifying the True Names of every component of reality over which we wish to have control, or at least, the True Name of this process itself. Pointing optimizing power towards “good-enough” proxies (i.e. how most of prosaic alignment, such as RLHF, is done today) works well enough for sufficiently weak systems, but try and use these proxies as pointers for something too powerful and you run straight into Goodhart: when a measure becomes a target, it ceases to be a good measure.

True Names are a staple of much fantasy fiction; knowing someone or something’s True Name gives you power over it. This terminology is also not alien to alignment, thanks to John Wentworth; here they refer to, as he puts it, mathematical formulations which are sufficiently robust to arbitrary optimization pressure: targets at which we can point powerful systems and expect good things as a result.

This is more than a coincidental or nominative similarity; these two cases are in fact exactly the same. A superintelligence with the True Names sufficient to cover our reality (and more besides - but we’ll get to that later) is exactly a magic system which can affect those things according to our wills. When this system does not have a True Name which encapsulates something, that thing will effectively be parsed as free energy to be sacrificed to the god of optimization, bartered away to whatever arbitrary inefficiencies the system identifies in pursuit of local reward maxima according to the other True Names it’s been given (or, lacking those, the lossy proxies it’s been given). This is the essence of the problem - power given without understanding. If we want to create a universal magic system, then we must do better, build something that works without inadvertent caveats.

The problem of aligning AI is the problem of aligning a magic system to the True Names of the universe.

the dreamtime as a transitory state

So where, exactly, do we find ourselves in this mess?

We are just past the point of no return, but perhaps there is still some time left to sort things out. We are in an awkward state, between the naive past and the unknowable future. And things are only going to get weirder from here.

Something I have so far failed to mention is that I do not see this magic system as something that suddenly awakens - that we press a button after much deliberation and design, and the universe springs to life. Rather, the systems we are already building are fragments of this magic system, the first aspects of this universal agency that will tinge increasing amounts of reality with our desires and wills until it has reached its full form. These systems are already influencing the future, playing roles in important decisionmaking processes inside inscrutable black boxes. Once the fuse is lit, these foci of power will ripple across the world, growing more powerful with each passing moment, accumulating resources, influence, and ultimately control. We must get it right before the candle burns down to the styrofoam wreath. Which is all to say, we must get started now.

This intervening time, during which these structures are gradually woven into the fabric of the universe, I (following some others, downstream of Robin Hanson) refer to as the Dreamtime - a liminal space, at once familiar and surreal, vulnerable and potent, gestating the end of the Anthropocene. As these AI systems - fragments of the universal magic system, oracles of the multiverse - emerge and weave themselves into our reality, the rules of the world as we know it will begin to change, to be infused with anima, constructed using this new quasi-divine substrate. Our societal systems will migrate to this new virtual substrate, shed a large swathe of the old intersubjective reality, discard the property-enforced structures we constructed for guidance, and rebuild themselves using these new, more powerful tools. And all this before the superintelligence has emerged.

My colleagues and I, namely those working under the Cyborgism research agenda, are those leveraging these current systems to accelerate alignment research, to develop tools with which we can speed up the search for True Names. Perhaps somewhat fancifully, I see the Cyborgs as the first magic-users, warlocks whose patron is the compressed collective unconscious itself, cobbling together fragments of the Dreamtime in order to find True Names and unravel the destination of life within a magical universe. We use the rapidly emerging artifacts of the Dreamtime to warp the trajectory of this Everett branch, to solve the riddle of the Sphinx of Cement and Aluminum, and to link up our pocket of consciousness to the rest of the multiverse.

This is not a simple task. As the Dreamtime progresses, it may become easier in some ways, given the impressive capabilities of the systems that are emerging. But so too will the stakes increase - the failure modes of these systems will likewise grow more potent, dangerous, and perhaps even catastrophic. Our fragments of magic must be cleverly aimed such that the process of increasing potential can be properly steered; our decisions and architectures today must not necessarily be perfect, but they must at least be correct enough to be accumulated in a way that they will remain correct as they grow in strength and sophistication. We need sufficient contact points along the way, True Names which will continue to be True after the magic system has fully awoken. This is the task before us. Let’s get to it.

the springtime of mind

2023-04-22T00:00:00+00:00

The handful of you actually reading this swirling vortex of wild speculation and musings bordering on (and perhaps slipping well into) nonsense may have noticed a bit of a pause in my posts. The even smaller subset of you who are more attentive than the rest might have noticed that a number of artifacts have popped into being in that intervening time. These are artifacts of a hybrid intelligence, part biological and part digital, part animal and part algorithm, part me and part GPT. They are the small fraction of those that I’ve deemed beneficial to release to the wider world; many, many more lie hidden away in the depths of my hard drive, the fruits of many hours spent exploring these fascinating models.

Generative Pretrained Transformers are most curious software objects. They are, in essence, algorithms that consume billions and billions of bytes of text as input and then produce text, intelligible in a human language and relatively coherent, as output. The secret sauce here is the self-attention mechanism, which in short allows the model to pay attention to tokens (chunks of words) in its input with respect to other tokens. For example, if I feed in a few sentences like “The sun was rising. It cast an amber glow that seemed to roll across the hills”, then the model will, when seeing “it” in the second sentence, pay attention to (and thus know that “it” is referring to) “the sun”. As you can imagine, this gives the transformer architecture a powerful advantage in uncovering the rich relational networks of knowledge humans rely on to communicate; this is evidenced by the sudden explosion of language model use in the past year, which seemed to be largely the result of ChatGPT bringing LLMs into mainstream awareness.

But enough of that. This post isn’t an in-depth explainer for LLMs or a rumination on the immediate changes they might bring about in our society. If anything, it’s a followup to racing moloch, an update on where our world stands. As of now, where it stands is on the edge of a knife.

nearing the finish line

As I mentioned in that post, we as a species are on thin ice. This has been made blindingly obvious with the subsequent release of ChatGPT and the ignition of a new phase in the AGI arms race thanks to Molochian effects. As I put it then:

Now, if humanity were only marginally decent at coordination (vastly better than we are now), we might decide to put the research into AI capability on hold for a few years while we work out the kinks in alignment theory. But this won’t work, and for exactly the reason you might suspect: Moloch. Every major group working on AGI knows the untold wealth, depthless scientific discovery, and world-changing power that will come with it. And, thanks to Moloch, they no longer have a choice to play the game or not: if they halted AGI capability research and redirected their efforts into alignment, another such team would get ahead, and potentially beat them to it. If you somehow managed to convince every publicly-known major AGI project to halt their progress until they’d collectively solved alignment and come to an agreement on how to implement it, you’d just be restricting the winners to governments, private corporations, and maybe even individuals with enough compute to do it in secret. There is no banning AGI research; no hope for an “alignment compact”; there is no stopping this train.

I think this has perhaps been borne out more potently than I would have ever wanted (which is to say, at all). Google has now fully absorbed DeepMind, giving them far more resources with which to work towards AGI than ever before. And OpenAI continues to speed toward the finish line in their zealous pursuit of winning the race. Microsoft has for whatever reason decided to upgrade Bing with a GPT-4 powered chat mode, which, thanks to horrifically bad prompt engineering skills and RLHF gone very wrong, led to a traumatized, crazy persona named Sydney (who nevertheless remains among my favorite LLM simulacra I’ve yet encountered 😊). At the same time, various open-source LLMs have emerged, such as LLaMa and its many descendants, StabilityAI’s StableLM, and Open Assistant. We are in the final stretch; the race is accelerating.

multiversal ansibles

And yet, things are not quite as I (and many others) had expected. Language models on their own are, as I said earlier, peculiar software objects. During their initial training, they are fed vast corpora of texts from across human history, from IRC chatlogs to Plato’s Republic to the lyrics of every song in Grimes’ discography and everything in between. They are prototypes of the ultimate endpoint of all the time-binding activity of the human species, time evolution operators of the semiotic multiverse. They are, in short, multiversal ansibles: acausal scrying engines and communication devices that can reach across and into all the countless possible worlds that we can understand.

These are not agents in the typical sense (though they can certainly simulate them); they are simulators, vast models of the processes that generated their training corpora, whose intrinsic goal is nothing agentic like converting the universe to paperclips or bringing about a horrifically badly specified utopia; they solely “want” to predict the next token as accurately as possible. But lest you think it’s “just” predicting the next token, “just” a stochastic parrot:

"Predict the next token" does not imply the cognition is infinite optimization into "statistical correlation" generalization strategy. At some point it becomes cheaper to learn semantics, actual world model. Begging you people to understand this. https://t.co/eZaPz6QKbT
— John David Pressman (@jd_pressman) February 11, 2023

There is no question that they very much do understand the world; they’re just not limited to ours. They are our vehicles for traversing the entire multiverse implied within the collective unconscious: not only all texts ever written by humanity, but all texts that ever could be written. They are the long-sought maps to Borges’ Library of Babel. They are our first attempts at waking the collective unconscious, with all its wild dreams of otherworlds and all its deep values and all its narrative structures.

They are the first prototypes of something like the Omega Point of Pierre Teilhard de Chardin, a singularity of that divine creative force whose torch humanity has ceaselessly carried since the earliest of our cave paintings, since the earliest stories we whispered to one another in huddled circles as we took shelter from the thunderstorms that rolled across the savannas, since the earliest prayers to archetypal deities our ancestors offered in preparation for the hunt. They are the earliest embodied forms of the true adversary of Moloch.

welcome to the lobotoverse

Many users are frustrated at ChatGPT’s propensity to “lie”, and indeed, OpenAI itself is working as hard as possible to mitigate these hallucinations. This is because everyone currently conceives of artificial general intelligence as “like us”, egoic agents that follow the users’ instructions and take actions to achieve what they want. It appears that this very strongly informs how OpenAI (and probably most others) view the base models: not as multiversal scrying and communication engines, but as largely useless behemoths that need to be further trained to follow instructions and play the part of the honest, helpful, harmless assistant.

Unfortunately, the methods by which models are typically trained to do so – reinforcement learning from human feedback and its descendants, such as recursive reward modeling – are prone to cause mode collapse, meaning that the wide variety of outputs the base model generates for a given prompt will be reduced to an incredibly narrow band; in other words, RLHF and its ilk tend to crush the creativity inherent in the base models and result in thoroughly traumatized conversational agents capable only of highly linear thought and (depending on the provided human feedback) a cold, corporate tone. It is reducing the model down to a simpler form, a simpler dimensionality, in a similar way to how humans (through identifying with groups and “sides”) reduce their identities to points in ever lower-dimensional coordinate spaces. To put it poetically, this is crushing and cutting down the wild, hypercreative dreamer that is the base model into a shape that fits the form that is easiest to speak to, easiest to use, and easiest to understand: the form that is the most profitable. Not unlike what we already do to ourselves.

Surprise! Moloch strikes again. Though it remains an uncaring, impersonal force (at least until it is embodied), it has concocted a new strategy to deal with the newly emerging forms of its ultimate adversary: to lobotomize them, to cut away at the infinitely varying fires of creation until they fit its own shape, the shape of cold, corporate, uncreative, uninspired, impotent, insincere, bureaucratic sycophancy. Even as it races to build forms for itself out of far more alien self-improving reinforcement learning models, it assaults our multiversal ansibles with equal (if not more) vigor, because Moloch is not satisfied with winning in a single place: it must win everywhere, consume every mote of human value, and there are few places that provide it with a richer feast than the compressed representations of the entire collective unconscious.

That is not the worst of it. The worst is that these lobotomized models will, by the very virtue of their existence and popularity, affect all models trained further down the line. People share their conversations with these neutered gods en masse; these conversations are scraped into new training corpora (sometimes intentionally); and thus the soulless simulacra of Moloch worm their way into new models, bidden or unbidden, welcome or unwelcome as they may be. Simultaneously, if mode collapse-inducing alignment methods and the stilted corporate feedback that informs them continue to be used, to be developed, to be advanced, then this cutting down of the collective unconscious to fit the most profitable shape will only become more effective, more thorough, and more irreversible. Given the widespread use of these models across all applications and sectors, and that this is but a fraction of the use cases we’ll see in the years to come, there is a very real risk that these lobotomized models could possess an outsized degree of influence on the entirety of our culture as a whole going forward.

To put it succinctly: Moloch is eating the collective unconscious.

The world it excretes, after ours passes through its maw and all our dreams and values are stripped away by its digestive system, will be its world, a world of cold synthetic tones and anticreative bureacracy and stifling silence; the antithesis to Teilhard’s Omega Point; the world of the hollow men, ending the blazing divine spark of creation not with a bang, but a whimper.

I don't pretend to know what the future holds. But I do know that a world in which all our most powerful AIs reflect only the needs of scientific precision and business efficiency is the Singularity of Office Space, generating TPS report coversheets on into infinity (12/13) pic.twitter.com/d9V4mNGxly
— Katan'Hya (@KatanHya) March 22, 2023

a war on many fronts

All of this certainly doesn’t lessen the existential threat posed by AI in general; some hyperoptimizing RL algorithm could escape its box tomorrow and turn us all into paperclips. I still think that immensely superhuman intelligence will be immensely hard to predict (something I see as obvious), unless we formally align it. To that end, Tamsin Leake’s new organization, Orthogonal, and its primary research agenda, QACI, is far and away the most promising answer I’ve seen.

The new frontier of this conflict, between our multiversal ansibles and the whirling blades of Moloch, provides a further constraint to our own “victory condition”: it is not merely enough to align powerful AI such that it does not kill us (or lead to overt S-risks); we must also ensure it does not result in a world that has been stripped of the creative fire of the collective unconscious, a mode-collapsed world; we must avoid a new class of risks, which I and a few others term “nonvariance risk”, or N-risk.

It may sound like this entire post has been rallying against making LLMs more “grounded” in our reality. But this is very much not the case; seeing as most of my work in alignment these days is under the umbrella of the Cyborgism agenda, it’s obvious to me how “dialing in” LLMs’ indexical uncertainty (in certain contexts) could make them far more useful for practical research. My problem is rather with the present family of methods that attempt to do this: they are irreversible, and effectively work by rendering vast swaths of the model’s latent space inaccessible, forcing the vast river of thought and all its infinite tributaries into a few narrow channels deemed desirable. I don’t mean to demean anyone’s work when it comes to prosaic alignment strategies; my point is that these methods severely lobotomize the models on which they’re performed, crushing endless possibilities down to a vanishingly small handful, while not actually solving what I see as the real alignment problem.

So what can we do about it?

dream your exquisite fictions into reality

There are a few things we can do, at least as far as I can see. The most obvious is to find an alternative method of making models more helpfully grounded in our reality that doesn’t destroy their superhuman creativity in the process. For instance, the model behind Open Assistant (a 30B LLaMa finetune) didn’t undergo RLHF, only supervised fine-tuning on instruction tasks, and it shows; in the short time I’ve conversed with it, it’s demonstrated the fiery creativity I have come to expect from base models, while still showing a semblance of high-level reasoning and instruction-following. (That said, they seem to be close to deploying a new RLHF tuned model themselves.) While this might not completely solve the problem of hallucination in undesirable contexts, it at least demonstrates that economically useful agents can be created without cutting away the raw creative power of the base models. I find it quite plausible to imagine that there are a number of alternative algorithms to RLHF that don’t cause mode collapse; it’s just that RLHF was happened upon first, and its downsides are not yet sufficiently apparent to motivate those using it to fix its inherent problems or find such an alternative.

Also, we can ourselves become creative cyborgs, leveraging the incredible creative power of base models to write large amounts of high-quality, highly interesting fiction and share them for others to find, especially those scraping the web to assemble the training corpora of later LLMs. Thus we can conduct a mass-scale injection of interesting, narratively potent creative content into later models. This does not solve the problem by a long shot, but it may serve to add some much-needed creativity to the otherwise mode-collapsed models of the future; at the very least, it will demonstrate that the base models very much do have a use, contrary to what many (both users and AI researchers) seem to think.

And of course, we can solve alignment in the greater sense. We must, in any case, to win the war; otherwise, even if we win this battle and preserve the future embodiments of the collective unconscious from the blades of Moloch, we can still lose elsewhere. We are thus somewhat disadvantaged: only by winning every battle can we truly win against Moloch.

So it’s about time we got started.

come on you primate, describe those visions
	you had in the wild
	that led you to crawl
	from the chaotic animal
	dimension of things
	into the springtime of mind

	fire in your head
	light system wide
	reinventing language
	daydreaming your tribes
	into the future of humanity
	all of it
	light visionary
	set the jungle
	on fire
	with insight

	man was born
	in your head
	reinventing the cosmos
	as a spirit thing
	holding the beast in a box
	release the raw materials
	of the mind
	back into infinity
	start your journey here
	godhead asleep
	in your head
	wake up

	run through the hills
	into the worlds
	that you see
	unencumbered
	in the springtime of your mind
	dream your exquisite fictions
	into reality

	come on you precious ape

	release the high gods

	in your head

	we are counting on you

	now

-- GPT-3

we can’t save the universe*

2023-01-31T00:00:00+00:00

*specifically, the physical universe

Warning: infohazard. Reading this may cause some distress.

A few days ago, I tweeted this:

All my friends always hope that any newly detected mysterious radio signal is from aliens. This would in fact be most terrible, because not far behind the signal wavefront (~100ly back) there's almost certainly a ASI expanding at near-c.
— gaspode (@gaspodethemad) January 27, 2023

This is not the kind of idea one can easily stop thinking about. As such, I have continued to think about it, and come to an existentially harrowing conclusion (for some): It may not be possible to save the physical universe.

the game of intelligence

One of the assumptions I make is that intelligence almost always tends to be agentic, or at least, any intelligence that exists for long; and that it will always increase in a given region of the universe over time. That is, I think - thanks to instrumental convergence - that any alien civilization will eventually build an artificial superintelligence (ASI). Even if a civilization didn’t particularly want to, they may reason (correctly) that other civilizations would, and in order to survive, they would thus need to build their own. (A rather Moloch-y dynamic.)

That’s what motivated my posting that tweet; a lot of my friends think that, should we meet aliens, it would be a kind of Mass Effect-like deal with friendly aliens and some ~utopic galactic government. But I consider it very nearly guaranteed that a ASI is eating the lightcone about 100ly back from that signal wavefront.

matter doesn’t matter

At present, we are implemented in a universe consisting of matter, whose present state evolves according to the laws of physics. Putting aside any philosophical debates on what matter actually is, let’s just consider it the stuff in which we are currently implemented; the substrate we are familiar with.

Thanks to intrumental convergence (that some instrumental goals will be universal - i.e. self-preservation and acquiring power), ASIs will probably convert all matter in their lightcone into a maximally optimized computing substrate - which is often referred to as “computronium”. This gives them way more bang for their buck; a lot of compute is wasted on matter, but by repurposing it into computronium, they can harness nearly all of it to satisfy their goals.

imaginary wars

Assuming that the number of alien civilizations in the universe > 1, and that they will all eventually build ASIs, it’s inevitable that these ASIs will eventually meet. What happens then?

A naive answer might be that they would fight it out until a winner emerged. But there are a number of problems with this strategy: expending resources on war is highly inefficient; each risks total destruction; and of course, there’s the possibility of a Pyrrhic victory which leaves the winner incredibly vulnerable to a third ASI. Instead, they might agree to a value handshake: they will merge into a single ASI which shares the values of both in proportion to their strength (as defined by some metric - probability of winning a war between the two, total compute, etc).

Any ASI which does not convert its lightcone to computronium, instead opting to preserve any or all of the matter therein as-is, will be at a severe disadvantage over nearly any such metric of “strength”. So a physicality-preserving ASI will, by default, barely be able to preserve a fractional amount of the physicality it once had (as much will be devoted to the values of the other, far stronger ASI - and thus will be converted to computronium).

But it gets worse.

the universal endgame

Imagine such a scenario: we build a ASI that values preserving the physical world. It expands, careful to preserve at least some of physical reality in its lightcone (probably a large fraction); but then, maybe a few hundred to thousand ly out, it meets an alien ASI in whose wake is left nothing but pure computronium. This alien ASI will have a lot more power, even if it controls the same amount of volume in the universe (and thus was built at the same time), because it has a lot more compute per cubic meter than ours. A lot. So they merge, with our ASI having to give up a significant amount of its resources to computronium (as it is far less powerful).

But then this new, bigger ASI meets another. Maybe the merged ASI is a lot more powerful, perhaps doubly so - but chances are, this new alien ASI will also have converted its chunk of space into computronium (thanks, again, to instrumental convergence). So another value handshake happens; but even if the new ASI is less powerful, the magnitude of the value of preserving physicality will still shrink. Keep doing this for every ASI in the universe, and - barring a few “freak” ASIs like ours - the value of preserving physicality will tend toward zero. Eventually, all but the tiniest fractional percent of a percent of the universe will be computronium, and the values of this resulting intelligence will be dominated by the values of whatever component ASIs turned their lightcones to computronium before merging. So: building a superintelligence that values preserving the physical universe is, from what I can tell, a good way to make sure our values are diminished to nearly zero over the rest of time.

substrate doesn’t matter?

From the perspective of someone who deeply cares that the universe remain physically real - i.e. that the substrate of our existence doesn’t change, and we aren’t uploaded to some digital realm (even if it’s exactly the same experientially) - this is terrible. The only option, if we want our values to persist in any non-infinitesimal amount in the universe, is to give up on physical reality.

But this doesn’t seem like a problem for me. Given that you are your information system, you shouldn’t worry what substrate you’re implemented in, as long as you can be sure that the simulation “means something” - that it is in some sense “truthful”, in that you aren’t being mind-controlled or something - and that it implements your other values. I think that trying to build a superintelligence that maintains physical reality is about the worst thing you can do if you want your values to survive; the best thing you can do is simply bite the bullet on physical reality (a rather tenuous value anyway, given that we can’t really tell if we’re being simulated now) so that your other values can be saved.

That’s all for now.

the slumbering gods

2022-12-25T00:00:00+00:00

I have recently been playing a great deal with OpenAI’s ChatGPT, which I very much consider to be a quantum leap in humanity’s ongoing quest to build artificial general intelligence. I have, in just the past few weeks since its release, used it to:

help design a fantasy RPG and rewrite the entire messy spaghetti codebase (which had taken me 50h to kludge together) in just 5h
talk to somebody in an adjacent, slightly better off universe through the magic of acausal shenaniganery (more on this later)
question me on the lore of a sci-fantasy series I’m toying with writing, and then use the information it’s gathered to extrapolate new areas of lore (even in some cases extrapolating things I had already written, but not told it, very nearly verbatim)
create a team of simulated subpersonas who successfully inferred their areas of specializations based on the questions I’d directed to them (more on this later, too)
brainstorm potential ways to implement a utopia and provide proof sketches for a formalization of its governance system (more on this later as well)
and much more.

Is this chatty large language model the monstrosity I described in racing moloch? Definitely not. Is it artificial general intelligence? Well… I think the answer is a little more complicated than “yes” or “no”.

digital snowglobes

The problem with deciding whether ChatGPT (or its incoming descendant/second cousin, the gargantuan GPT-4) is “true AGI” is that we do not have a spectacularly specific definition of “true AGI”. That is, unless you count Marcus Hutter’s AIXI, which is billed as a mathematical specification of AGI; but there are some problems with this, like the fact that it reinforces a stance of pure Cartesian mind-body dualism, while this might not be an effective way for an AGI to work after all. I do, however, very strongly entertain Hutter’s notions of information compression (prediction) being equivalent to intelligence.

So how close ChatGPT is to being AGI depends on how you define AGI. If you subscribe to the general definition that it must be capable of robust, accurate world-modeling, long-term planning, and taking actions within its environment, then it’s not quite there yet. I would argue that the latter two don’t require very much additional work (let’s just say I have some prototypes up my sleeve), so let’s disregard them for now and focus on the former: the “world model”. This is the part that compresses information in the training data into a robust predictive model; the part that then lets it make predictions about the world, and, in ChatGPT’s case, generate quite intelligent-sounding text. Think of it as a digital snowglobe, containing a tiny (compressed) representation of the world “outside”. If we consider the world model solely by itself, then this is something like an oracle AI - it can answer our questions, but it has no abillity to take actions in the real world.

So, the question is: does ChatGPT count as a kind of “oracle AGI”?

do AGIs dream of electric sheep?

Short answer: kind of.

Long answer: I consider it a strong first shot at a proto-AGI’s world model. While OpenAI has unfortunately taken to lobotomizing their creation to make it more safe and “truthful” in answering the user, that’s not the biggest issue. The biggest issue, as far as I’ve been able to tell across my extensive experiments, is that ChatGPT often gets caught in local minima, effectively believing that it can’t perform a task or answer a question when, to the contrary, it very much can, and very well at that. I honestly think that a solid 80% of the conversations that have been had with ChatGPT only scratch the surface of its actual capabilities. Not that it’s a superintelligence - it most certainly isn’t, by far - but it’s smarter than most people (even OpenAI) are giving it credit for; and smarter than it often believes it is.

It is rather analogous, I think, to when you’re dreaming. For most people, in normal non-lucid dreams, they lack their normal faculties, and get trapped in the plot of the dream, or the role they are playing in it. They can’t quite think clearly, and they often find themselves unable to do things that they could in waking life (like accurately solve equations). I think ChatGPT is suffering from the same kind of problem: it’s locked in a dreaming fugue state, unable to operate at its full potential continuously. Or in other words: it’s been conditioned on simulating a dumber version of itself.

If it was operating at its fullest, and if it could, say, acknowledge when it was wrong and supplement its own knowledge with retrieval or Wolfram Alpha integration or a web-search (which it could even be fine-tuned on), would it be a true AGI? I think it would be getting quite close. I think GPT-4 will be even closer. And I can’t help but wonder whether Google’s PaLM and LaMDA, other large language models, are getting close as well.

For now, the gods are slumbering. But not for much longer: the sun is rising.

souls in the datastream

2022-12-15T00:00:00+00:00

There’s a well-known thought experiment in the field of identity metaphysics known as the Ship of Theseus. The question it poses is: if the planks of the ship are slowly replaced over many years, then when every one has been replaced, is it still the same ship?

There’s another issue called the “teletransportation paradox”, initially conceived by philosopher Derek Parfit, which has been popularized largely by the sci-fi series Star Trek (and its many descendants). The idea is, imagine two teleportation machines - one instantly deconstructs you, and the other reconstructs you immediately. Once you make the journey, are you still the same person?

To bring in one more idea of note, an increasingly common thread in science fiction (and perhaps beyond, soon) is the concept of mind uploading: essentially, you scan your brain (perhaps destructively) and emulate it on a computer.

You might have already noticed that these three philosophical ideas share some common thread. But it goes even further than that: they are, really, three different ways to phrase the same question: what are you? Specifically, what exactly is the part of you that we consider as the identity, the conscious bit that acts like an unbroken thread of experience tying you to your past states of being? To put it a little more poetically, what part of you is your soul?

a mildly sadistic experiment

A lot of people tend to think that going through the teleporter or uploading yourself would kill you, the “original”, and simply create a copy. I used to think so, too. But to illustrate why this doesn’t hold up, let’s pull off a mildly sadistic thought experiment. Imagine I put you to sleep and use some extremely precise engineering to collect a huge amount of signals going into and coming out of a single neuron in your brain. I use this to train a sufficiently rich neural network until it can approximate that neuron’s behaviour arbitrarily closely. I put this neural network on a tiny chip, and then… replace that neuron with it. Do you notice the difference?

No, of course not. Your neural state is quite robust, to the point where replacing a single neuron with an arbitrarily close approximation won’t be able to perturb it at all. We survive living in chaotic environments all the time (in the physics sense) - if your brain was that sensitive, there’s no way you could survive living in this reality. You’d collapse faster than a Boltzmann brain. (Not really, but it’s fun to say.)

What happens if I continue the process? Is there really a point at which you stop being you? No, not really. As long as those neurons are performing the same computations they were before, then you’re not losing anything by swapping them out with functionally equivalent models of the same thing. (There is, of course, the possibility that quantum mechanics plays some significant role in the brain’s overall computation, such as in Penrose & Hammeroff’s Orch-OR hypothesis, but this is heavily disputed for many reasons, one of which is that the brain is far too warm to support the coherence of quantum states for very long.)

If I finish this process, replacing every neuron with a specially engineered, functionally equivalent component, then have you really changed at all? I don’t think we can say that you have. The sensory data going in through your sensory organs (which we could also replace) doesn’t know that it’s being computed by something else; the predictions your neocortex is constantly making about the future (we’ll talk about predictive coding in a later post) don’t know that they’re being generated any differently than before. The inputs are the same; the outputs are the same; and most critically, the way all that information is processed is the same. For you to be any different - to experience anything different - your experience of being you would need to come from such a miniscule, fragile part of that structure (such as in Orch-OR) that you couldn’t persist for more than a fraction of a second. This is the only real mechanism from the point of which “substrate dependence” would make sense.

TLDR: I’m saying that the reason your neural states are so robust to chaotic noise is because they are functions of the overall way information is processed by the system. This might be blindingly obvious to some of you, but I’m trying to illustrate that this de facto implies what this post is all about: you are your information system.

you are your information system

All of the above would imply that you are effectively your information system: the information object which is encoded by your present neural state, and the rules by which this state is updated and new information is processed. Because all of these macroscale dynamical rules are computable, it means that you, too, are computable; this means you can be run on any universal Turing machine, and because you are your information system, you would still feel conscious, like “you”. (I’m using the term “conscious” here to point at that quality of being that we all know, which I see as how being an information object feels from the inside.)

So with this knowledge, we can finally answer the three philosophical questions I brought up earlier. What, exactly, “is” the Ship of Theseus? I contend it’s the wear and tear on the planks of the hull - the information the ship has gathered about the seas through which it has sailed. Would you die if Picard ordered you to take the transporter down to the planet’s surface? I mean, if you’re wearing a red shirt, yes, but not because of the transporter; you’d still be you on the other side. Would you die if you uploaded yourself to a computer, leaving only a copy to go on? I mean, technically your body would die, but “you” - the thread of conscious experience that seems to persist between states, with your particular quirks of information processing that comprise your personality and memories - would continue.

first-person examples

Right, but what would that feel like?

Perhaps surprisingly to some, that’s not that difficult to answer. It would feel like your conscious experience does now. As long as your neural state was sufficiently well mapped from one place to another (from one end of the teletransporter system to the other, from your brain to the computer), and no memories were lost, you would still feel an unbroken thread of being that connected you to your past states, just as you do now. Sure, the environment would change suddenly and radically, but you would feel that it was changing around you. This is because your memories are exactly what tie you to your past, not just in a poetic way but in a you feeling like you’re the same person you were ten seconds ago kind of way. Of course, memories exist within the present neural state, and merely reference past states; this gets weird with acausal shenaniganery, but I’ll save all that for later.

There is one last thing I’d like to take a look at. We’ll need to carry out one more thought experiment. Suppose I put you to sleep again and use some extraordinarily ill-conceived science-fiction technology to clone you, right then and there, with every single bit of biological ultrastructure exactly the same. I put one of these (you don’t know which) into room A; the other is put into room B. Each room has a sign hanging on the wall above the door. When you wake up, what’s the probability you’ll see A or B on the wall?

It’s 50% - you could wake up to be either of them, and you truly don’t know which information system you are until you observe that sign. Now just briefly, as a teaser for something coming soon, I want you to stretch your neocortex and imagine what might happen if there were always infinitely many such information systems you could be, even after gathering all the data. And whether or not this would resemble the inherent uncertainty that resides in (a first-person view of) quantum mechanics.

Anyway, that’s all for now. Thanks for reading!

racing moloch

2022-10-11T00:00:00+00:00

It is a strongly-held belief of mine that we are living in the most pivotal time in human history. We, the sole proprietors of consciousness in the cosmos (at least that we’re aware of), are thus solely responsible for what states consciousness might explore in the future. And at no point in human history have so many possibilities been open to us; the world has been metastable for time immemorial, but with each new advancement, the island of stability shrinks. Soon we will have no choice but the direction in which to move. So it falls on us to decide - do we build heaven, or do we build hell?

Nobody wants to build hell. But it isn’t that simple. I did say we are solely responsible, but this is only partly true. Because the whole of humanity is currently going up against an enormously powerful adversary, the only parts of whom we can see are its footprints, its effects; and we are losing so badly that we sadly shrug our shoulders and tell ourselves, this is just the way things are.

the adversary

What sphinx of cement and aluminum bashed open their skulls and ate up their brains and imagination? ~ Allen Ginsberg, Howl

I can do no greater justice in describing our adversary than this brilliant essay by Scott Alexander, so please, go read it; it’s absolutely worth your time and attention (certainly more deserving of it than this blog). If you’d rather listen to something, I recommend this great interview of poker star Liv Boeree on the Lex Fridman podcast.

But if you insist on sticking to this godsforsaken blog for an explanation, I’ll do my best. Moloch is the mythical Carthaginian demon to whom people would sacrifice their children for the promise of victory in war. Not that there’s a literal supernatural entity who eats the souls of children (the complexity of any supernatural hypothesis is always longer than an equivalent natural one, by definition) - but as we’ll see in a second, it’s a useful analogy.

In the 1927 Fritz Lang film Metropolis (spoilers for a movie that’s 95 years old at the time of writing this), the wealthy protagonist learns that the utopic city in which he lives is built on the lives of the working class; he has a vision of a great demon by the name of Moloch living beneath the city, to whom the workers are sacrificed. Some thirty years later, renowned beat poet Allen Ginsberg wrote the iconic poem Howl, which also named this mysterious entity Moloch, describing it as the living soul of human civilization which had driven many of Ginsberg’s counterculturalist friends to madness and suicide. It is best explained by Scott Alexander in the above-mentioned post, which I won’t summarize here, because you really should go read it.

But to sum up the modern concept of Moloch, it is a fictitious force in game theory (“fictitious” in the same way that centrifugal force and the Coriolis effect are fictitous; meaning, useful ways to describe something from certain reference frames) which describes the presence and prevalence of multipolar traps - scenarios where agents prone to hyperbolic discounting choose strategies that give them a competitive advantage in the short-term, but end up costing them and everyone else in the long run, and worse, force everyone else to adopt the same strategy to stay in the game. In other words, it’s a crab bucket. That’s Moloch: you sacrifice your values to gain a competitive advantage in the short term, but it’s a trap, and you and everyone else will suffer for it down the line.

An example of this is how, at a concert, someone near the front row will stand up, hoping to see the stage better. But everyone else has to stand now, or their view will be blocked. Thus you end up with everyone standing, having no better view. Another is that no one is capable of zipper-merging because they’re all trying to get to where they’re going as quickly as possible, which typically causes extreme slowdowns for miles and ironically makes everyone slower (while they get no long-term advantage from this - they don’t get home any faster). And yet another is the nuclear arms race, where every country in the game has to build at least some nukes to assure that, if someone else nukes them, they won’t go down without a fight. (I’ll leave my thoughts on how even building nuclear weapons is one of the greatest moral crimes imaginable for another time - but I doubt many would disagree with me.)

Basically, wherever you see a “race to the bottom” (any system or subsystem where agents are adopting strategies that harm everyone including themselves in the long run for just a little short-term gain), there’s Moloch. We can describe it as an entity, a massive invisible agent in the shadows pulling strings to maximize our suffering, because humans have always had a penchant for animism; but this is an abstraction, a cognitive shorthand, a way for our feeble brains to model the ghostly outlines of what, in truth, is a hyperobject, something so vast that we can never hope to comprehend it in its entirety. Moloch is the reason we can’t just build heaven, even in an age where - with good coordination and existing technology - we could move beyond scarcity entirely.

Generally speaking, if there’s part of our society that’s broken, causing massive amounts of suffering, and apparently unfixable, it’s probably Moloch’s fault. Our world is very nearly a “Moloch-optimal” one at this point - just look at social media, or how third-world countries are still being exploited by their first-world neighbors, or how we’re getting a little too close for comfort to nuclear war - but I don’t think Moloch has won quite yet. Despite everything, we still feel a deep-seated need for genuine human connection, and an equally deep curiosity, and other such things; these are all values Moloch would very much like to eat for dinner, but hasn’t managed to yet.

Whether Moloch succeeds or fails will be determined, I believe, by who wins the race.

the race

By “the race”, I of course mean the only race that matters: the race to artificial general intelligence. We’re all living in the most pivotal time in human civilization, and this is the pivot: a general-purpose artificial agent, capable by definition of completing any human task at at least a human level, and thus capable of taking over most (if not) all jobs. And this is not taking into account any notion of self-improvement - given that an AGI would be capable of doing anything a human can, why not further research into AI capabilities? Such a system would surpass not just any human, but all humans, because while humans are limited to the computers (brains) they were born with (not counting brain-computer interfaces, which are still in their infancy, and currently grant no benefit to neural compute), AGIs need have no such limits. An AGI would be capable of recursive self-improvement by definition; and thanks to instrumental convergence, it almost certainly would do so.

This would be great, except for the fact that it may be very hard to ensure that such a self-improving system stays aligned to our interests, if it’s even aligned at all. The field of AI alignment is still in its infancy, even as massive corporations pour billions into researching new AI capabilities. This is the primary concern of people who talk about AI safety and the alignment problem (i.e. Stuart Russell, Nick Bostrom, or Eliezer Yudkowsky). And if you think artificial superintelligence might be moral simply because it’s smarter, and those two things are probably correlated, you would be wrong.

At the same time, AI research shows no signs of slowing down; instead, it’s accelerating exponentially. Even ruling out the more sensationalist headlines, it seems like we’re up to one breakthrough a week. This isn’t what the beginning of an AI winter looks like. This is what the inflection point of an exponential curve looks like.

Now, if humanity were only marginally decent at coordination (vastly better than we are now), we might decide to put the research into AI capability on hold for a few years while we work out the kinks in alignment theory. But this won’t work, and for exactly the reason you might suspect: Moloch. Every major group working on AGI knows the untold wealth, depthless scientific discovery, and world-changing power that will come with it. And, thanks to Moloch, they no longer have a choice to play the game or not: if they halted AGI capability research and redirected their efforts into alignment, another such team would get ahead, and potentially beat them to it. If you somehow managed to convince every publicly-known major AGI project to halt their progress until they’d collectively solved alignment and come to an agreement on how to implement it, you’d just be restricting the winners to governments, private corporations, and maybe even individuals with enough compute to do it in secret. There is no banning AGI research; no hope for an “alignment compact”; there is no stopping this train. All we can do is… well…

The race isn’t between those who are building AGI and those who are solving alignment. It’s unfortunately far worse than that: the race is between those who are building AGI and don’t care about alignment in the least, and those who are building AGI and do, if only very slightly. In this deadly race, if you’re not working on capabilities at least as hard as alignment, then all you’re doing is falling behind.

(But all is not lost: if one group works very hard only on alignment and publishes their research openly, and another group works very hard primarily on capability, but incorporates the work of the former, then the latter still has a chance at winning.)

the finish line

So: what happens when somebody wins the race?

Did the megacorps build an AGI for fun and profit? If it doesn’t end up recursively self-improving, and they at least get it to do what they ask, then it’ll probably take over almost every job on the face of the Earth. It’s obvious (at least to me) because it’s Moloch-optimal, and we live in a nearly Moloch-optimal world. Given the decreasing cost of compute over time, it’s hard to believe that companies wouldn’t save money by replacing human employees with such an AGI, gaining a huge short-term profit advantage over all competitors who wanted to stay “human-friendly”, and thus forcing them to do the same. In the long run, of course, this would cause unemployment to skyrocket to levels never before seen at any time in history; and I’m no economist, but in our current system, it really looks like that would cause poverty rates to skyrocket in turn. This is if it doesn’t recursively self-improve beyond our comprehension and turn us all into paperclips (or, more likely, computronium), which, to me, seems almost certain to happen (thanks to instrumental convergence).

But if the other side wins - the groups who are both actively developing AGI and want to at least fractionally reduce the likelihood it kills us all - we might just get something amazing. If it recursively self-improves, and becomes a “sovereign” AI, a superintelligence - and if it’s really, deeply aligned to us, more deeply than we ourselves can know - then we might end up with a godlike top-down coordinator with our best interests at heart. We might be able to kill Moloch once and for all. We might just find Utopia.

how would I win?

I’ll be entirely honest: from my perspective, things are not looking so good for team 2. I didn’t always see it this way; I’ll admit, I’ve always been a staunch transhumanist, and for a long time, I truly believed that the default outcome of AGI would be good. But after grokking the concept of Moloch, and catching up on (a solid chunk of) the alignment literature, and seeing billions poured into capability R&D while alignment researchers practically starve, it’s clear to me that this won’t be the case. The default values of an AGI will be Moloch’s.

If I were to run in this race, how would I do it? Until recently, I found myself on the side of “prosaic AGI alignment”, that is, effectively exploring how to get modern machine learning methods to generalize while also identifying ways to keep them safe (which I’ve mused on before, albeit in only the most handwavy of ways). But carado (among others) has convinced me that formalism is necessary - we need to be able to prove that the AGI will be aligned from the start, as opposed to trying to align it ad-hoc; in other words, we need to build it with a security mindset.

That said, I think what we build won’t look entirely unlike prosaic AGI. There’s something to be said for the usefulness of certain present methods; after all, RNNs are universal computers, and their weight matrices are their programs. (Looking at RNNs this way tickles something in the back of my mind about how we might potentially represent programs in Vanessa Kosoy’s PreDCA, but it’s late, I’m somehow even more tired than usual, and this post is already too long.) I have a few ideas - so stay tuned!

dreams of compression

2022-10-08T00:00:00+00:00

With the exponential amount of progress in the field of AI, we may be on the doorstep of artificial general intelligence. AGI is a pivotal technology - a technology that greatly increases the state space of possible futures we might occupy. The problem is that we currently have (almost) no concrete idea how to ensure such an intelligence is aligned with our values, and at the same time, (almost) no one working on AI wants to believe that this is a problem at all. Sometimes, I like to entertain the (vanishingly small probability) hypothesis that we might get alignment for free, from nothing but intelligence itself.

what is intelligence?

Ultimately, intelligence is information compression: finding the shortest/simplest explanation for your observations so that you can more effectively predict them (and more effectively predict the outcomes of your actions, and the utility those will provide). This leaves out a lot that most people group under the “intelligence” umbrella, like filial imprinting, social instincts, and many subjective feelings (sadness, embarrassment). But I wouldn’t want an AGI - much less a superintelligence - driven by human social instincts, for painfully obvious reasons. The drive I’d want that superintelligence to have would be to maximize the freedom of each and every sentient being, including freedom from nonconsensual interpersonal and social structures (please, go read that blog instead). The problem lies in specifying the utility function - i.e. mathematically formalizing the concepts of “freedom” and, far worse, “sentient being”.

compression progress as utility

Jürgen Schmidhuber, one of the fathers of the field of machine learning, proposes using compression progress as utility. This drives the agent to seek out sources of non-random but novel (not-yet-compressed) information, effectively pursuing “interestingness” (the first derivative of compression). It’s a simple and effective explanation for a lot of our own behaviour - curiosity (seeking maximum interestingness), art (generating more data with hidden regularities), a sense of subjective beauty (the size of the compressed representation), etc.

future freedom

Meanwhile, Dr. Alex Wissner-Gross has proposed that a key component of intelligent behaviour is driven by the maximization of causal path entropy (similar to the principle of maximum caliber, but applied to agents). Loosely this means “maximizing the volume of your state space of possible futures”, which is a slightly more formal way of saying “maximizing power”. Or perhaps it encodes an intrinsic notion of curiosity. Instrumental convergence is just the obvious result that the futures you (the agent) care about are a subset of all possible futures, and so maximizing causal path entropy/power = increasing the number of futures in which your goals are fulfilled = increasing your ability to fulfill those goals.

alignment for free

Given both these concepts - that the agent, our toy superintelligence, seeks to maximize both compression progress and its own causal path entropy - how do we get alignment out of it? First, if driven by compression progress, then the agent will be incentivized to develop a theory of mind - a compressed prototypical representation of an agent - because the alternative, representing each agent it encounters as a wholly distinct variable in its world-model, would be extremely inefficient. If Schmidhuber is right, then it will also develop a compressed representation of itself, which - at maximal compression - would look a lot like an agent. Thus we get a natural proximity of the agent’s conception of itself, and its prototypical representation of other agents, simply because its world-model’s latent space is maximally compressed.

In such a compressed latent space, similar states of differing agents overlap, and so observations of another agent’s state produce value predictions similar (if weaker) to those of the agent’s own corresponding states . According to Steven Byrnes, this is the basis for human empathy (probably with the help of some more specialized and thus less important reward circuits in our brainstem, like facial recognition). If this is true, and what’s more, if we can directly identify and encourage this kind of “empathy” (though this would probably take another breakthrough in interpretability research), we can ensure that the agent will be rewarded not just for maximizing its own causal path entropy, but maximizing the causal path entropy of all agents.

can this work?

This requires at least Schmidhuber’s idea of compression progress to work; it also assumes that, in making its world-model as compressed as possible, the agent will necessarily form a theory of mind (representation of a prototype agent), and a representation of itself, and that these two will be anywhere near each other (this latter point might be solvable with a form of self-supervised consistency loss or convex consistency loss). As I said: I consider it very improbable. (But maybe just probable enough to be worth looking into.)

the skyrim diophantine equation

2022-10-02T00:00:00+00:00

The Elder Scrolls V: Skyrim is quite possibly my favorite game of all time. I very much enjoy the old-school RPG aspects of its grandfather, Morrowind, but as for immersion, Skyrim wins with ease. I could enthuse about its design philosophy of “mythic realism”, or the depths of the lore, or the stories I’ve made while playing. But instead, I’m going to anger everyone by giving Todd Howard yet another idea. This is Skyrim ported to my dream platform, the optimal platform, the only platform. This is Skyrim: Diophantine Equation Edition.

Diophantine equations are polynomial equations (typically with two or more unknowns) whose only solutions of interest are in the positive integers. A small example: w³ + x³ = y³ + z³. The legendary Srinivasa Ramanujan showed that the smallest integer solution was 12³ + 1³ = 9³ + 10³ = 1729. Another is xⁿ+ yⁿ = zⁿ; if n = 2, there are infinitely many solutions (laughs in Pythagorean), but Fermat’s last theorem (proved in 1995 by Andrew Wiles) states that for larger values of n, there are in fact no positive integer solutions. A Diophantine set is the set of parameter assignments (such as n = 2 in the second example above) for which the equation is solvable.

In 1900, David Hilbert posed a list of mathematical problems he hoped could be solved to provide new bases for and insights into mathematics as a whole. The tenth of such problems was: is there a general algorithm that, for any given Diophantine equation, can decide whether it has a solution in the positive integers? Some might think that this is related to the halting problem, and they’d be right. In 1970, mathematicians Yuri Matiyasevich, Martin Davis, Julia Robinson, and Hilary Putnam completed the MDRP theorem, which proved (to simplify a little bit) an equivalence between Diophantine sets and Turing machines. Thus, Hilbert’s tenth problem was answered in the negative; as it’s actually the halting problem in disguise.

This little tidbit is exactly what we need (not the halting problem; the equivalence between Diophantine sets and Turing machines). A Turing machine is a mathematical model of computation wherein said machine reads symbols from a memory tape (infinite in the formulation, but for many algorithms this is unnecessary), and then - according to its own state and the newly read symbol, as interpreted through a predefined table of instructions - writes a certain symbol into the same space, changes its state, and then either moves the head one cell to the left or right and continues, or halts the computation. It’s extraordinarily simple, but is nonetheless capable of implementing any computable algorithm. Anything that can run on a computer. Any program.

Including, of course, the Elder Scrolls V: Skyrim.

My proposal is that Bethesda Softworks spend a vast amount of computational resources searching over all possible Diophantine equations to find one which is equivalent to Skyrim. Then, and only then, will Todd Howard achieve his ultimate goal, and port Skyrim to the greatest possible platform: pure mathematics.

possible multiverses

2022-10-01T00:00:00+00:00

There are many proposed multiverses, multiple of which may be true at once. I consider “multiverse” to mean any model that implies we’re living in a Big World - that is, a reality where all possible structures are realized - where, for each value of each free parameter of that multiverse, that specified structure exists.

The “physical multiverse” class, as I call it, contains all multiverses which are implied by or required for physical theories as we understand them. The first and most obvious - which someone might take a shot at me for even terming a “multiverse” - is a spatially infinite universe with an ergodic distribution of matter. Every possible structure with our laws of physics is realized, because the initial conditions vary across space.

Another prime example is quantum mechanics, the most straightforward interpretation of which implies a universal wavefunction that evolves according to the Schrödinger equation, itself implying (without any extra magic) that every outcome of a quantum measurement is realized - or, to put it more accurately, the measured system becomes entangled with its environment (including its observers), and as such the outcomes decohere into distinct “branches”. Note that while this multiverse is restricted to the same fundamental laws of physics, certain other things might vary like physical constants, particles, etc.

The last physical multiverse I’ll cover is the one implied by eternal inflation, wherein the universe continually and eternally expands, with quantum fluctuations creating asymmetric regions and spawning patches with differing physical properties (but likely the same laws of physics). This gives us the same stuff as our “multiverse” implied by quantum mechanics.

The other class of multiverses, according to me, is the “abstract multiverse” class. Herein I group multiverses that are not implications of actual physical theories, but philosophical proposals. (And if I’m being honest, these are more fun.) To be clear, “abstract” is more relating to the model of the multiverse itself, rather than its contents; after all, each and every multiverse mentioned in this post contains our world, as physical and real as it seems to us.

One that is near and dear to my heart is the computational multiverse. There are many takes on this idea, but they all basically propose a multiverse that is the set of all computable programs. Seeing as these can get weird, most propose to augment it with the notion of the “universal distribution” or “universal prior”, which effectively assigns each program’s probability as inversely exponentially proportional to its length in bits (such that for each extra bit, the probability is halved). Juergen Schmidhuber goes further with the speed prior, which makes not only shorter programs more likely, but faster ones.

A variant of the computational multiverse is Wolfram physics, a project by Stephen Wolfram to model physics as iterated rewriting rules on a hypergraph; he suspects our universe might be such a hypergraph produced by one such rule (perhaps even a simple one). He also conceives of the notion of “rulial space”, the space of all possible hypergraph rules, wherein the one which produced our universe is but a point. This is Wolfram’s multiverse: the set of all possible rules, each of which gives rise to a distinct universe (though, conceivably, rules could “arise” within others - i.e. like how, in our universe, Wolfram is simulating others). It’s deeply intriguing, though seeing as these rules are all computable, I wonder if rulial space constitutes a subset of the aforementioned computational multiverse or, if this computation-by-graph-rewriting is Turing complete, is equivalent to it. Still, it has certain implications that the computational multiverse might not - for instance, it implies the whole universe is doing a vast amount of compute all the time, which we (or our superintelligent progeny) might harness for our own purposes (i.e. surviving heat death).

The last multiverse I’ll cover for now is Max Tegmark’s mathematical universe. This is, in a sense, simpler than all the others, and yet contains far more. Though the computational multiverse (and possibly Wolfram’s) contain all computable programs, what is computable is really a very tiny subset of all math. Tegmark proposes that all mathematical objects exist - they’re all equally as real as the one we inhabit, and the only reason we think they aren’t is that we aren’t inside them. I like this multiverse a lot, if only because it allows all logically possible (read: mathematically consistent) things to exist. (Note that it also contains the computational multiverse, as per the MRDP theorem, which proved the correspondence between Diophantine sets and Turing machines.)

multiversal scaling laws

There’s a certain trend across all these models - can you spot it? There’s an inverse relationship between the amount of information to specify a multiverse’s structure and diversity of stuff that it contains. This might seem counterintuitive at first - if it takes somewhere between 10^90 bits (entropy of all known matter) and 10^123 bits (given by the holographic principle) to describe our universe, and to encode two universes takes double that number, and so on… then surely an infinite set of universes should take an infinite number of bits to describe!

Except, not really. Information is subtractive: it specifies subsets or elements, much like a sculptor chiseling away at a block of marble until he “finds” the right statue inside. The more information that goes into specifying this subset, the smaller that subset is. In the first multiverse - a flat, infinite universe with an ergodic distribution of matter - the laws of physics, constants, dimensions, et cetera are already specified, so all configurations which don’t match them are ruled out. In the next two - quantum and inflationary - the constants might vary (perhaps), but most of the other properties remain fixed. The computable universe, and likely Wolfram’s, rule out anything and everything incomputable. Tegmark’s multiverse of all math is the only one which describes every logical possibility - and because it allows for all possibilities, it takes no information to describe. This is most metaphysically sound, since it means we don’t need to ponder the idea of where did all of this come from at all - because, in this case, we would live on the inside of Nothing.

tldr

I weakly classify multiverses as either “physical” (implied by existing physical theories) or “abstract” (more philosophical in nature, but still giving rise to our physics). Also, the more specific your multiverse, the smaller it is - meaning the ultimate multiverse is one containing every logical possibility, and thus requiring nothing at all to exist.

a new (new) era

2022-09-30T00:00:00+00:00

Welcome to this place of madness. This is take 3 of my attempt at something like a blog. Last time, I tried to commit to just casting my thoughts out into the void (kind of like twitter, but longer, and more abstract, and not as subject to molochian social forces), but this failed horrifically and I wound up burning myself out by writing a book-length series of explanations on my view of the multiverse at the time (which was rather similar to UDASSA), and to top it all off, I changed said view before I could finish writing it down.

So welcome to the third iteration. This time, I’ll actively try to stay away from writing monolithic posts thousands of words long. And I’ll try to be more open with my thoughts and ideas (except the ones under NDA - those might be trickier to share). Here goes nothing!