Review: Nick Bostrom’s “Superintelligence”

by Miles Raymer


The idea of artificial superintelligence (ASI) has long tantalized and taunted the human imagination, but only in recent years have we begun to analyze in depth the technical, strategic, and ethical problems of creating as well as managing advanced AI. Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies is a short, dense introduction to our most cutting-edge theories about how far off superintelligence might be, what it might look like if it arrives, and what the consequences might be for humanity. It’s a worthwhile read for anyone passionate about the subject matter and willing to wade through a fair amount of jargon.

Bostrom demonstrates an impressive grasp of AI theory, and a reader like me has neither the professional standing nor the basic knowledge to challenge his technical schemas or predictions, which by and large seem prudent and well-reasoned. Instead, I want to home in on some of the philosophical assumptions on which this book and others like it are founded, with the goal of exposing some key ethical issues that are too often minimized or ignored by technologists and futurists. Some of these I also took up in my review of James Barrat’s Our Final Invention, which should be viewed as a less detailed but more accessible companion to Bostrom’s work. I’ll try not to rehash those same arguments here, and will also put aside for the sake of expedience the question of whether or not ASI is actually attainable. Assuming that it is attainable, and that it’s no more than a century away (a conservative estimate by Bostrom’s standards), my argument is that humans ought to be less focused on what we might gain or lose from the advent of artificial intelligence and more preoccupied with who we might become and––most importantly––what we might give up.

Clever and capable as they are, I believe thinkers like Nick Bostrom suffer from a kind of myopia, one characterized by a zealous devotion to particularly human ends. This devotion is reasonable and praiseworthy according to most societal standards, but it also prevents us from viewing ASI as a genuinely unique and unprecedented type of being. Even discussions about the profoundly alien nature of ASI are couched in the language of human values. This is a mistake. In order to face the intelligence explosion head-on, I do not think we can afford to view ASI primarily as a tool, a weapon, a doomsday machine, or a savior––all of which focus on what ASI can do for us or to us. ASI will be an entirely new kind of intelligent entity, and must therefore be allowed to discover and pursue its own inquiries and ends. Humanity’s first goal, over and above utilizing AI for the betterment of our species, ought to be to respect and preserve the radical alterity and well-being of whatever artificial minds we create. Ultimately, I believe this approach will give us a greater chance of a peaceful coexistence with ASI than any of the strategies for “control” (containment of abilities and actions) and “value loading” (getting AIs to understand and act in accordance with human values) outlined by Bostrom and other AI experts.

Bostrom ends Superintelligence with a heartfelt call to “hold on to our humanity: to maintain our groundedness, common sense, and good-humored decency even in the teeth of this most unnatural and inhuman problem” (260). Much of his book, however, does not describe attitudes and actions that are in alignment with this message. Large portions are devoted to outlining what can only be called high-tech slavery––ways to control and manipulate AI to ensure human safety. While Bostrom clearly understands the magnitude of this challenge and its ethical implications, he doesn’t question the basic assumption that any and all methods should be deployed to give us the best possible chance of survival, and beyond that to promote economic growth and human prosperity. The proposed control strategies are particularly worrisome when applied to whole brain emulations: software minds, built by scanning and closely modeling biological brains, that could be employed in a “digital workforce.” Here are some examples:

One could build an AI that places final value on receiving a stream of “cryptographic reward tokens.” These would be sequences of numbers serving as keys to ciphers that would have been generated before the AI was created and that would have been built into its motivation system. These special number sequences would be extremely desirable to the AI…The keys would be stored in a secure location where they could be quickly destroyed if the AI ever made an attempt to seize them. So long as the AI cooperates, the keys are doled out at a steady rate. (133)

Since there is no precedent in the human economy of a worker who can be literally copied, reset, run at different speeds, and so forth, managers of the first emulation cohort would find plenty of room for innovation in managerial strategies. (69)

A typical short-lived emulation might wake up in a well-rested mental state that is optimized for loyalty and productivity. He remembers having graduated top of his class after many (subjective) years of intense training and selection, then having enjoyed a restorative holiday and a good night’s sleep, then having listened to a rousing motivational speech and stirring music, and now he is champing at the bit to finally get to work and to do his utmost for his employer. He is not overly troubled by thoughts of his imminent death at the end of the working day. Emulations with death neuroses or other hang-ups are less productive and would not have been selected. (169)

To his credit, Bostrom doesn’t shy away from the array of ethical dilemmas that arise when trying to control and direct the labor of AIs, nor does he endorse treatment that would appear harmful to any intelligent being. What he fails to explore, however, are the possible consequences for humanity of assuming the role of master over AI. Given that most AI theorists seem to accept that the “control problem” is very difficult and possibly intractable, it is surprising how comfortable they are with insisting that we ought to do our best to solve it anyway. If this is where we decide our best minds and most critical resources should be applied, I fear we will risk not only incurring the wrath of intelligences greater than our own, but also reducing ourselves to the status of slaveholders.

One need only pick up a history book to recall humanity’s long history of enslaving other beings, including one another. Typically these practices fail in the long term, and we praise the moments and movements in history that signify steps toward greater freedom and autonomy for oppressed peoples (and animals). Never, however, have we attempted to control or enslave entities smarter and more capable than ourselves, which many AIs and any version of ASI would certainly be. Even if we can effectively implement the elaborate forms of control and value loading Bostrom proposes, do we really want to usher AI into the world and immediately assume the role of dungeon-keeper? That would be tantamount to having a child and spending the rest of our lives trying to make sure it never makes a mistake or does something dangerous. This is an inherently internecine relationship, one in which the experiences, capabilities, and moral statuses of both parties are corrupted by fear and distrust. If we want to play god, we should gracefully accept that the possibility of extinction is baked into the process, even as we do everything we can to convince ASI (not force it) to coexist peacefully.

Beyond the obvious goals of making sure AIs can model human brain states, understand language and argumentation, and recognize signs of human pleasure and suffering, I do not believe we should seek to sculpt or restrict how AIs think about or relate to humans. Attempting to do so will probably result in tampering with a foreign mind in ways that could be interpreted (fairly or otherwise) as hostile or downright cruel. We’ll have a much better case for peaceful coexistence if we don’t have to explain away brutal tactics and ethical transgressions committed against digital minds. More importantly, we’ll have the personal satisfaction of creating a genuinely new kind of mind without indulging petulant illusions that we can exercise complete control over it, and without compromising our integrity as a species concerned with the basic rights of all forms of intelligence.

Related to the problem of digital slavery is Bostrom’s narrow vision of how ASI will alter the world of human commerce and experience. Heavily influenced by the arguably amoral work of economist Robin Hanson, Bostrom takes it as a given that the primary function of whole brain emulations and other AIs should be to create economic growth and replace human labor. Comparing humans to the outsourced workhorses of our recent past, Bostrom writes:

The potential downside for human workers is therefore extreme: not merely wage cuts, demotions, or the need for retraining, but starvation and death. When horses became obsolete as a source of moveable power, many were sold off to meatpackers to be processed into dog food, bone meal, leather, and glue. These animals had no alternative employment through which to earn their keep. (161)

Once reduced to a new “Malthusian” condition, human workers would be replaced by digital ones programmed to be happy on the job, run at varying speeds, and also “donate back to their owners any surplus income they might happen to receive” (167). These whole brain emulations or AIs could be instantly copied and erased at the end of the working day if convenient. Bostrom is quick to assure us that we shouldn’t try to map “human” ideas of contentment or satisfaction onto this new workforce, arguing that they will be designed to offer themselves up as voluntary slaves with access to self-regulated “hedonic states,” just so long as they are aligned with ones that are “most productive (in the various jobs that emulations would be employed to do)” (170).

It would be unwise to critique this model by saying it is impossible to design an artificial mind that would be perfectly happy as a slave, or to say we could scrutinize the attitudes and experiences of such minds and reliably conclude that they have what Bostrom calls “significant moral status” (i.e. the capacity for joy and suffering) (202). It is therefore hard to raise a moral objection against the attempted creation and employment of such minds. However, it seems clear that the kinds of individuals, corporations, and governments that would undertake this project are the same that currently hoard capital, direct resources for the good of the few rather than the many, militarize technological innovations, and drive unsustainable economic growth instead of promoting increases in living standards for the neediest humans.

The use of AI to accelerate these trends is both a baleful and, realistically, probable outcome. But it is not the only possible outcome, or even the primary one, as Bostrom and Hanson would have us believe. There is little mention in this book of the ways AI or ASI could improve and/or augment the human experience of art, social connection, and meaningful work. The idea of humans collaborating with artificial workers in a positive-sum way isn’t even seriously considered. This hyper-competitive outlook reflects the worst ideological trends in a world already struggling to legitimize motivations for action that extend beyond the tripartite sinkhole of profit, return on investment, and unchecked economic growth. Readers seeking a more optimistic and humanistic view of how automation and technology might lead to a revival of community values and meaningful human labor should seek out Jeremy Rifkin’s The Zero Marginal Cost Society.

My argument is not that the future economy Bostrom and Hanson predict isn’t viable or won’t come to pass, but rather that in order to bring it about humans would have to compromise our ethics even more than the globalized world already requires. Wiring and/or selecting AIs to happily and unquestioningly serve pre-identified human ends precludes the possibility of allowing them to explore the information landscape and generate their own definitions of “work,” “value,” and “meaning.” Taking the risk that they come to conclusions that conflict with human needs or desires is, in my view, a better bet than thinking we already know what’s best for ourselves and the rest of the biosphere.

Speaking of “biosphere,” that’s a word you definitely won’t find in this book’s index. Also conspicuously absent are words like “environment,” “ecosystem,” and “climate change.” Bostrom’s book makes it seem like ASI will probably show up at a time of relative peace and stability in the world, both in terms of human interactions and environmental robustness. Bostrom thinks ASI will be able to save us from existential risks like “asteroid impacts, supervolcanoes, and natural pandemics,” but has nothing to say about how it might mitigate or exacerbate climate problems (230). This is a massive oversight, especially because dealing with complex problems like ecosystem restoration and climate analysis seems among the best candidates for the application of superintelligent minds. Bostrom skulks around the edges of this issue but fails to give it a proper look, stating:

We must countenance a likelihood of there being intellectual problems solvable only by superintelligence and intractable to any ever-so-large collective of non-augmented humans…They would tend to be problems involving multiple complex interdependencies that do not permit of independently verifiable solution steps: problems that therefore cannot be solved in a piecemeal fashion, and that might require qualitatively new kinds of understanding or new representation frameworks that are too deep or too complicated for the current edition of mortals to discover or use effectively. (58)

Climate change is precisely this kind of problem, one that has revealed to us exactly how inadequate our current methods of analysis are when applied to hypercomplex systems. Coming up with novel, workable climate solutions is arguably the most important potential use for ASI, and yet such a proposal is nowhere to be found in Bostrom’s text. I’d venture that Bostrom thinks ASI will almost certainly arrive prior to the hard onset of climate change catastrophes, and will therefore obviate worst-case scenarios. I hope he’s right, but find this perspective hard to square with Bostrom’s detailed acknowledgments of precisely how hard it’s going to be to get ASI off the ground in the first place. It also seems foolhardy to assume ASI will be able to mitigate ecosystem collapse in a way that’s at all satisfactory for humans, let alone other forms of life. Ironically, Bostrom’s willingness to ignore this important aspect of the AI conversation reveals the inadequacies of academic and professional specialization, ones that perhaps only an ASI could overcome.

I want to close with some words of praise. Superintelligence is an inherently murky topic, and Bostrom approaches it with thoughtfulness and poise. The last several chapters––in which Bostrom directly takes up some of the ethical dilemmas that go unaddressed earlier in the book––are especially encouraging. He effectively argues that internationally collaborative projects for pursuing ASI are preferable to unilateral or secretive ones, and also that any benefits reaped ought to be fairly distributed:

A project that creates machine superintelligence imposes a global risk externality. Everybody on the planet is placed in jeopardy, including those who do not consent to having their own lives and those of their family imperiled in this way. Since everybody shares the risk, it would seem to be a minimal requirement of fairness that everybody also gets a share of the upside. (250)

Bostrom’s explication of Eliezer Yudkowsky’s theory of “coherent extrapolated volition” (CEV) also provides a pragmatic context in which we could prompt ASI to aid humanity without employing coercion or force. CEV takes a humble approach, acknowledging at the outset that humans do not fully understand our own motivations or needs. It prompts an ASI to embark on an in-depth exploration of our history and current predicaments, and then to provide models for action based on imaginings of what we would do if we were smarter, more observant, better informed, and more inclined toward compassion. Since this project needn’t take up the entirety of an ASI’s processing power, it could be pursued in tandem with the ASI’s other, self-generated lines of inquiry. Such collaboration could provide the bedrock for a lasting, fruitful relationship between mutually respectful intelligent entities.

The global discussion about the promise and risks of artificial intelligence is still just beginning, and Nick Bostrom’s Superintelligence is a worthy contribution. It provides excellent summaries of some of our best thinking, and also stands as a reminder of how much work still needs to be done. No matter where this journey leads, we must remain vigilant about how our interactions with and feelings about AI change us, for better and for worse.

Rating: 7/10