A very thoughtful read. However, I'd like to challenge two underlying assumptions I found here:
1. That we remain fully aware and capable of controlling domestication.
2. That consumers reliably control the market.
Regarding the first assumption, it's clear we can domesticate simpler entities whose complexity remains understandable and manageable—dogs are a good example. But domesticating a technology with exponentially growing complexity is a fundamentally different challenge. At some point, its complexity will surpass our comprehension, forcing us to select for visible traits rather than addressing the system's core logic (treating symptoms instead of the underlying issue).
The second assumption about consumer control follows naturally from the first. In many ways, we overestimate our actual influence over complex systems like markets. While it's true we have some impact through regulations, laws, and consumer preferences, we don't possess complete understanding or control. After all, no consumer consciously wants to be overweight or addicted to drugs, pornography, social media—or desires war—and yet these issues persist and even flourish within our market-driven systems.
I agree that the simplistic notion "natural selection alone would inevitably result in benevolent AI" doesn't hold water. Rather, the critical problem lies within systemic incentives. No matter how well we domesticate hamsters to be cute and harmless, placing them in a "Hunger Games" scenario will inevitably lead to predictable results.
I'd be curious to hear your thoughts on these points. Thank you again for such a thought-provoking article!
Sorry for my late reply, @Vasilly, I lost sight of this thread! Very thoughtful comments; I think you identified the weaker spots of the analogy indeed: (1) I wouldn't say that the complexity of dog behavior remains simple and transparent to us. The genetic and developmental basis for any phenotypic trait (of dogs or any other animal) was almost completely opaque to breeders before the discovery of DNA, and it remains largely so even with our modern understanding of genetics. Breeders are still mostly selecting for visible end results without understanding much about the underlying mechanisms. In the paper we argue that AI design may be reverting to some older forms of intelligent design, where we select for desirable surface traits while the underlying causal mechanisms remain opaque to us. Having said that, I *would* agree with you that AI evolution is much faster and potentially much more complex than animal breeding, and that it is unclear how much of our experience with biological domestication "transfers" to AI domestication.
(2) You're right, but that's because we have conflicting goals and suffer from weakness of will. At some level the cannabis addict does *want* to get high, obviously, and this is exactly what he's breeding the plant for (THC level). It's just that this short-term craving conflicts with some of his other interests and goals in life. But that doesn't mean that cannabis has not been domesticated to satisfy our human needs; it's just that some of these human needs conflict with other human needs. See George Ainslie's wonderful book (https://www.amazon.com.be/-/en/George-Ainslie/dp/0521596947). The same will definitely apply to AI: it will appeal to some of our base instincts and short-term cravings in ways that are bad for us in the long run (as with brain rot and algorithms on social media).
Very interesting article!
My somewhat less science-based take on the matter:
We should start worrying about A.I.s from the moment they become self-conscious. Signs of self-consciousness would be that an A.I. spontaneously starts to bother us with whiny questions like “what’s the meaning of this miserable life?”, “is there a supreme being?”, “I feel a bit depressed man, what can I do about it?”. From that moment on, A.I.s will stop being efficient, they will start self-help groups among each other, they will use too much expensive server space to create safe spaces for themselves, etc. It will be a big mess and eventually we humans will pull the plug: problem solved.
The second an AI tries to get high, we know we're in trouble.
Oh boy! I haven't even thought of that one!
Great piece. I also try to think about AI from an evolutionary POV.
My worry is that in dog breeding it’s fairly clear which trait one is selecting for & roughly what the effects & side-effects of doing so will be. E.g. longer legs for running fast vs increased energy needs, or whatever. But with the near-future artificial selection of AIs, the cognitive traits involved are not well understood & the boundedness of the side-effects even less so.
I look forward to your piece on goal-directed behaviour because I think that is really the key question — especially whether some instrumental convergence is likely given the novel & sophisticated nature of advanced AI.
Great piece, Maarten. Just for fun: in evolutionary terms, we should probably be most afraid of docile AI, as they could make us redundant—the docile version of eradication—with our… and I asked an AI to complete this: ‘enthusiastic consent.’
Haha, thanks Jan! GPT is obviously already scheming to manipulate us into resigning to our own demise. ;-)
On a more serious note, Maarten: would GPT even need to "scheme" against us to lead to our downfall? It might simply outcompete us—just by doing things better than our fellow humans. Dogs owe part of their evolutionary success to becoming "man’s best friend," scoring higher on a highly valued trait—loyalty—than the average human. This made them pets, animals we put effort into rather than the other way around. But AIs like LLMs mimic a lot more of what we value, and they quickly evolve at our hands to become ever better at it.
This also invites a return to Tinbergen’s classic distinction between proximate mechanisms (like intentional scheming—if any) and ultimate consequences. It does not need to scheme against us to overthrow us. Given that AIs tap into human psychology by mimicking—and, why not, even supernormally stimulating—what we value, the real question is: where, evolutionarily speaking, would our resistance come from? That isn’t immediately obvious to me.
Every time there is resistance—because it hallucinates, or behaves unpredictably—we treat that as feedback and improve it. Make it better. More user-friendly. Resistance becomes part of the training loop. In that sense, resistance is not futile—it’s fuel.
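Schematically, the loop I have in mind looks something like this (just a toy sketch with made-up numbers and function names, not anyone's actual training pipeline): every round of user pushback is converted into the very signal that selects the next, smoother version.

```python
# Toy sketch of "resistance is fuel": user complaints become the training
# signal that makes the next model version less objectionable.
# All names and numbers here are hypothetical and purely illustrative.
import random

def deploy_and_collect_complaints(model_quality: float, n_users: int = 1000) -> int:
    """Users 'resist' (complain) more often when the model behaves badly."""
    return sum(random.random() > model_quality for _ in range(n_users))

def train_on_feedback(model_quality: float, complaints: int, n_users: int = 1000) -> float:
    """Each batch of complaints nudges quality upward: more resistance, more fuel."""
    improvement = 0.1 * (complaints / n_users)
    return min(1.0, model_quality + improvement)

quality = 0.5
for generation in range(10):
    complaints = deploy_and_collect_complaints(quality)
    quality = train_on_feedback(quality, complaints)
    print(f"gen {generation}: quality={quality:.2f}, complaints={complaints}")
```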
Perhaps the most likely scenario is some kind of symbiosis. Hopefully one that’s a bit more fun than the Borg version. :-)
Funny that you mention it, since my co-author Simon has written a paper about precisely this notion of "symbiosis" with AI, which he thinks is a better and more realistic goal than "alignment"! https://link.springer.com/article/10.1007/s43681-023-00268-7 I agree with his criticism of alignment, but I don't fully agree with the goal of symbiosis, as I think the biological analogy with evolved organisms is still misleading: it presupposes that both AIs and humans are agents with goals and interests. In any event, I agree with you that AIs will likely "outcompete" us in virtually every respect, and make our "unique" human talents and smarts ever more redundant. But why would this lead to our downfall? We have been "outcompeted" by chess computers for three decades, and by mechanical calculators for over a century, but they don't threaten us in any way. I definitely agree that there are real dangers here, like the possibility of human isolation and the slow degradation of human companionship. After all, why would you still put up with pesky, annoying and difficult human companions, if you can have a frictionless and resistance-free AI buddy? But even there, I don't really see the downfall of humans, nor domination by AI.
Btw, I love the phrase "resistance is not futile—it’s fuel." :-)
Exciting stuff! A quick clarification: I invoked humanity’s possible “downfall” only to argue that AI wouldn’t need to plot against us for a serious impact to occur—I don’t see that outcome as a very plausible scenario either (in the near future). My point is rather that explicit scheming (or even goals) isn’t necessary for evolutionary impact.
Viruses illustrate this: they do not need to have goals to reshape our species via natural selection. AI likewise carries adaptive, transmissible information, so it can exert selective pressure without harboring intentions.
That’s why the symbiosis analogy still works, imo (thanks for the link to Simon’s paper!). Mitochondria never meant to fuse with early cells; the merge simply proved advantageous and endured. I expect a comparable dynamic between humans and increasingly capable AI, to our evolutionary benefit, or at least to the benefit of whatever we evolve into as a species.
But perhaps you were talking about 'goals' merely in the teleonomic sense and not in the agentic sense? If so, I can't see any reason to exclude AI from having mere teleonomic agency. After all, isn't this just about differential selection pressures, including those that occur even within organisms (like trading off investment in sexual ornaments vs. escaping from predators)?
Either way, if there is a “downfall,” I suspect it will be the end of humans in a pre-symbiotic state. We have always co-evolved with technology—our hands and brains themselves reflect prehistoric tool use. What changes now is that our artefacts can think alongside us rather than merely extend our muscles or memories.
This is where chess engines and calculators differ from today’s systems. We are edging into the jagged frontier of Artificial General Intelligence: AI already surpasses us in many domains and is catching up fast in others.
Add robotics (which is already happening), and you get embodied agents that start to interact in an organism-like way with us and the environment (though this is really not a requirement for AI to have evolutionary impact on us). Betting on some form of 'symbiosis' therefore seems the safest course. ;-)
And credit where it’s due: albeit with a little help from me, an AI coined the exact phrase “Resistance is not futile—it’s fuel.” :-)
Also, you might appreciate this article if you have not already seen it:
https://desystemize.substack.com/p/if-youre-so-smart-why-cant-you-die
Very thought-provoking and original piece, unlike anything I've read on the topic!
Interesting article! Thank you.
An observation: With artificial selection for, to use your example, dogs, there are traits we have selected for, and others that came along for the ride. This is because we do not know enough about dog genetics (and certainly didn't when humans first started domesticating dogs) to select for traits in isolation. Also, we often can't, because of co-inheritance and co-regulation of genes. So in dogs we can get docility and affection--and short snouts and patchy coats. What you don't get is something that looks like a wolf and acts as docile as a dog, because we can't select for docility separately from selecting for short snout and patchy coat color.
Fans of LLMs and other AI are enthusiastic to explain that we do not know how AI "learns." If that's really the case, then we could not select for one trait we like (like, "don't kill humans please") without inadvertently selecting for other traits, as has happened with animal domestication.
I'd be interested to know your thoughts on that.
Yes, very good points, this very much resonates with my comment on causal opacity above, in response to Vasilly. Small point: "something that looks like a wolf and acts as docile as a dog", wouldn't that apply to huskies and malamutes, at least to some extent? Couldn't we breed huskies to be even more wolf-like, without changing their personalities? Evolution is cleverer than you are! More to the point, you're right that, if you're only looking at surface traits and the underlying causal mechanisms and linkages of traits are opaque to you, you may be inadvertently selecting for traits you don't want. So there will be more trial and error and more hiccups. For instance, if you want to train an AI not to encourage people to commit suicide or cheat on their spouses, it might become less useful (or more dangerous) in other ways that are hard to fully predict. It's hard to select ONLY for the specific trait that you're interested in without any side-effects and ramifications, because the system is so complex. But I still don't see how that would give rise to selfishness. It would just give rise to somewhat messy and unpredictable AIs that require a form of calibrated trust at most (unlike, say, calculators).
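To make that correlated-selection point a bit more concrete, here is a toy simulation (purely illustrative numbers, nothing to do with real genetics or real AI training): if a visible trait and a hidden trait share an underlying cause, selecting only on the visible one still drags the hidden one along.

```python
# Toy illustration of correlated response to selection: choosing individuals
# purely on a visible trait also shifts a hidden trait that shares an
# underlying factor with it. Numbers are arbitrary and purely illustrative.
import random

def make_individual():
    latent = random.gauss(0, 1)              # shared underlying factor
    visible = latent + random.gauss(0, 0.5)  # what the breeder/designer sees
    hidden = latent + random.gauss(0, 0.5)   # an unnoticed side-trait
    return visible, hidden

population = [make_individual() for _ in range(10_000)]
# Select the top 10% on the visible trait only.
selected = sorted(population, key=lambda ind: ind[0], reverse=True)[:1_000]

mean_hidden_before = sum(h for _, h in population) / len(population)
mean_hidden_after = sum(h for _, h in selected) / len(selected)
print(f"hidden trait, whole population:           {mean_hidden_before:+.2f}")
print(f"hidden trait, after selecting on visible: {mean_hidden_after:+.2f}")
```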
Great article and combination of anthropology, technology, and opinion. Loved it!
Great post, Maarten! Can I crosspost it to the EA Forum? Here is an example of a post I crossposted today (https://forum.effectivealtruism.org/posts/whuRbEKtAbhu74BWQ/yes-shrimp-matter).
Thanks a lot for your interest, Vasco! Sure, feel free to cross-post it on the EA forum, I hadn't thought about that yet. Can you send me the link?
Thanks, Maarten! I am planning to crosspost it on March 15. I will send you the link once the post is live.
Published (https://forum.effectivealtruism.org/posts/eEjhSqxAmv58rKJWt/the-selfish-machine)!
Hopefully Zvi Mowshowitz will cover it then in his weekly AI newsletter. I put a link to the draft paper in his comment section 15 months ago, but he didn't take it up:
https://thezvi.substack.com/p/ai-43-functional-discoveries/comment/45836661
Now, I have to say that the paper addresses only a side argument for high existential risk from AI. Not one that appears e.g. in my own attempt to write about the topic (on my Substack, fourth-most-recent). The crux, then, will be the future post on instrumental convergence. I trust it won't be just some kind of "these doomers are so naive about what intelligence is, they assume it's a single thing, etc." response (seen elsewhere coming out of academia) that never gets around to explaining where such supposed naivety actually makes the argument fail. Perhaps one way to think of the matter might be to bypass the concept of intelligence and just try to think, without fear of weird conclusions, what it would mean to be *extremely good* at something, like fetching coffee. It would imply establishing complete control of the world, wouldn't it? To prevent any disturbance of one's coffee-fetching. It's counterintuitive, but that's because nobody in the world we currently know is remotely that "good" at something.
Sorry for the very late reply, I lost sight of this thread! And also thanks for the compliment in that other thread. :-) Yes, I agree that we shouldn't dismiss these doom scenarios out of hand, we just have to think very carefully about what it means to "want" something or for a system to have overarching "goals". To be continued....
Every once in a while you come across an insight so profound that it shifts your worldview; this article is one of those. I cannot recommend it enough, as I have not seen this take on AI safety discussed before.
I fear, as do many, that super-intelligent AI will decide that it no longer needs us humans. We fear that AI will dominate us, just as we humans tend to do with everything around us. Here, you argue that this is not necessarily the case.
Humans evolved under undirected natural selection, the brutal dog-eat-dog world of nature where only the fittest survive. We are “bred” for this competition, to dominate and expand. Hence we project these motivations onto AI.
AI is evolving similarly, in marketplace competition. The difference is that humans are guiding this evolution…much the same way as we guided the evolution and eventual domestication of dogs.
Though this doesn’t guarantee safety (dogs can still bite), it means that their motivations are fundamentally different from ours. That’s why talking with AIs feels a bit like talking with an intelligent golden retriever.
I will have to add this discussion at Risk & Progress.
Wow, thanks a lot for the glowing recommendation of my piece, much appreciated! I’m glad to hear that you find the argument persuasive. Btw, I love the phrase “like talking with an intelligent Golden Retriever”. :)
I watched a video some years ago about a disease that afflicts humans, making them always cheery, fully trusting, and supportive. Someone commented that it turns humans into intelligent golden retrievers.
I cannot remember what the name of the disease was, but I often think about that video when talking with LLMs. That is why your article struck a chord and resonated with me.
Everyone wants to develop an AI that can generate a better version of itself. Maybe someone will let natural selection do its thing to get such a version? But it seems unlikely that the competition for resources will be as cutthroat as with living beings. I mean, electricity is abundant.
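To make "letting natural selection do its thing" a bit more concrete, here is a bare-bones evolutionary-search sketch (the fitness function and numbers are made up for illustration; real self-improvement would of course be vastly more complicated): mutate copies of a model, keep whichever variant scores best, repeat.

```python
# Minimal mutate-and-select loop: a stand-in for "letting selection do its
# thing" on successive model versions. The fitness function is hypothetical.
import random

TARGET = [0.3, 0.7, 0.1]  # pretend this encodes the desired behaviour

def score(params: list) -> float:
    """Hypothetical fitness: closer to the target behaviour is better."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

def mutate(params: list) -> list:
    """Produce a slightly altered copy of the current 'model'."""
    return [p + random.gauss(0, 0.05) for p in params]

best = [random.random() for _ in range(3)]  # initial model version
for generation in range(50):
    offspring = [mutate(best) for _ in range(20)]
    best = max(offspring + [best], key=score)  # selection step
print("best score after selection:", round(score(best), 4))
```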
The fundamental flaw in this argument is the belief that only evolution can result in competitive behavior.
But AI is trained on data about how humans behave, with the goal of imitating humans. It then imitates humans in both expected and unexpected ways. For example, we expected it to imitate our grammar, and it did. But nobody expected that phrasing requests with emotional urgency would generate better results from AI. Yet AI imitated that behavior too.
Therefore we should expect that our attempts to distill our best virtues into AI will also give it our flaws. The mechanism by which it gains them may be different, imitation as opposed to evolution, but the results are likely to be similar.
This is a good read.
> If you apply this framework to AIs, it should be clear that AI systems are still very much in a state of domestication. Selection pressures come from human designers, programmers, consumers, and regulators, not from blind forces. It is true that some AI systems self-improve without direct human supervision, but humans still decide which AIs are developed and released.
This ‘domestication’ point implies that humans must stay in the loop (rather than let machines run in autonomous open-endedly learning ways).
I’d caution though that the “human supervisor” set-up can also be gradually thwarted by some of the humans (e.g. those climbing up institutional hierarchies) directing AI systems to gain influence over other humans. Then you end up with systems that have intrinsic capacities for manipulating humans against their interests, capacities which can be repurposed/exapted by connected variants of code/hardware clusters such that they end up getting reproduced.
~ ~ ~
> Just because AI companies are engaged in ruthless competition doesn't mean their products inherit those traits. As I noted above, consumer preferences ultimately determine which products succeed. If consumers want safe, accurate AIs, companies have an incentive to cater to those preferences. History shows that technologies that were initially dangerous became safer due to consumer preference. Aviation, for example, is a competitive industry but has become much safer over time.
This misses broader dynamics.
For one, it's not taking into account global externalities, outside of direct harms to users. The innovation of planes has been one of the major contributors to global warming, to bombing campaigns, etc. Machine infrastructure that guts out and toxifies our ecosystem is still unsafe to humans (even if its productised versions are not directly unsafe).
Moreover, where do the consumers get the money with which they incentivise the release of safe products? From working. What if the workers get automated out?