Oct 26, 2017

The artificial super-intelligences among us--and the historical values alignment problem

TL;DR
The AI values alignment problem is not new. We've had AIs for millennia. We just have not called them AIs. We've called them tribes. And nations. And corporations. And society.

But they all fit the definitions of artificial intelligence--and artificial super-intelligence. They are artificial, more intelligent than individual humans, and self-improving. And they often have values different than the values of their creators or component humans. So we've had AI values alignment problems for as long as we've had these sorts of AIs.

AI is entering a new phase. Machine Intelligence is rapidly becoming the dominant part of Artificial Intelligence. And the evolution of AI is accelerating as never before. That may make the values alignment problem harder, but not necessarily different.

Looking at human history as a long series of conflicts among competing AI and a constant battle for values alignment may provide some insight and suggest some new solutions. I hope so. Otherwise I've spent a lot of time thinking about this for nothing.

The longer argument (which I may expand into a series of posts amplifying some of these points)
  1. AIs already exist. They consist of groups of humans, connected by technology. Governments, corporations, political parties are examples of such AIs. They are intelligent and artificial.
  2. Many of today's AIs are more intelligent (as measured by range and speed of problem-solving abilities and creativity--or even IQ tests) than almost all humans. They qualify as artificial superintelligences.
  3. Although these AIs are created and partially controlled by humans, they have their own objectives and act semi-autonomously:
    1. Some actions are under direct control (or close supervision) of humans of varying levels of (natural) intelligence.
    2. Some actions are controlled by automated systems of varying levels of artificial intelligence with varying levels of supervision.
  4. Over time, AIs have become increasingly autonomous. That is: fewer of their actions are carried out under direct, thoughtful human control and more are carried out by automatic systems, including humans who are mindlessly following procedures.  
  5. AIs are not monolithic. They are composed of multiple autonomous and semi-autonomous intelligences--some of which are humans, some of which are identifiably separate AIs, and some of which are shifting coalitions of intelligences.
    1. When AIs are created, they are given explicit objectives by their creators. AIs are able to refine their objectives and create sub-objectives. In some cases, they modify their objectives so completely that their current objectives are opposed to their original objectives.
    2. AIs also have implicit objectives. Implicit objectives include:
      1. Survival
      2. Increasing resources and power
      3. Self-modification for greater efficiency
      4. Optimal assignment of resources to objectives
      5. Avoiding destructive conflicts
      6. Adapting to a changing environment (including other AIs) 
      7. Controlling other AIs and avoiding the control of other AIs
    3. AIs must decide what resources (including subordinate AIs) to apply to each objective. 
    4. An AI can survive even if it devotes no resources to its explicit objectives (though probably it has to devote resources to appearing to move toward those objectives). It cannot survive if it does not apply resources to its implicit objectives.
    5. AIs self-improve today by acquiring computer and communication systems, connecting their human intelligence to those systems, and through those systems connecting them to other intelligences, natural and artificial.
    6. Some AIs are developing computer systems that can replace all human components. They will do this to the extent that this forwards their objectives and without necessary regard for human well-being.
    7. Most AIs are under constant attack:
      1. From AIs competing for resources
      2. From AIs seeking to control, or even absorb them
      3. From AIs seeking to escape control
    8. The societies of nation states and human civilization as a whole are AIs attempting to survive while under constant attack. Human history, viewed through this lens, is a long struggle for AI values alignment. 
    9. This view may suggest new ways to look at the alignment problems created by the rapid increase in Machine Intelligence. 

    Oct 24, 2017

    Dzogchen hints

    After (finally) posting "Avoiding the Core Teachings of the Buddha" I scrolled through my blog drafts looking for something that I thought I'd started to write. Instead, I found this fragment, relevant to my last post. I've cleaned it up, and in just a moment it will be posted. Yay, one less draft.


    Sam Harris' experience comes from one of the lesser known branches of Buddhist practice, called Dzogchen. Researching it, I found this and this which describes several approaches to Buddhist practice.

    But the best description (at least the one I like best) is this one:

    > Hinayana is like a bicycle. It is slow, and carries only one person, but it’s cheap, simple, and gets you there in the end. Mahayana is a bus: when you drive the vehicle, you bring many people with you. Tantrayana is a sports car: it is fast, dangerous, and not for most people. Dzogchen is a teleportation booth: it’s instantaneous. (from here)

    That's what waking up has felt like. Teleportation.

    Unfortunately, it's impermanent and unsatisfying.

    And I'm working on understanding the third Characteristic of Existence.

    Avoiding the Core Teachings of the Buddha

    Scott Alexander wrote a review of a book called "Mastering the Core Teachings of the Buddha" by Daniel Ingram. I got a copy from Amazon using the link at the end of Scott's post, which I guess put a few pennies in his pocket, well deserved. And I started to read it. And read it. And read it. Holy crap, it's 400 pages long. And scary as hell.

    Well, not scary. Just intimidating. 

    I might have been set up for intimidation by listening to a podcast discussion between Sam Harris and Thomas Metzinger. Metzinger is "professor and director of the theoretical philosophy group and the research group on neuroethics/neurophilosophy at the department of philosophy, Johannes Gutenberg University of Mainz, Germany." He's been studying consciousness, and to that end has been meditating for around forty years. And somewhere in the podcast, he says (criticising some of his colleagues) something like this: that people who are serious about understanding consciousness do things like taking drugs, traveling to India to find a guru, and becoming committed meditators. People who don't do that aren't serious. Metzinger and Harris have both done that. So have others in the field whom I admire. I have not.

    I claim an interest in understanding consciousness, but according to Metzinger's criteria, I'm not that committed.

    Which is true, in part. Meditation is beneficial--I believe that. It's also hard. And I'm not great at doing stuff that I find hard unless I'm sufficiently threatened or rewarded. "Good for me" is not good enough. So if someone pointed a gun to my head and said: "Meditate!" I'd do it. And if someone paid me enough, I'd do it. In both cases, I'd work hard at it. And I'd probably get to be good at it. But not just because it's good for me.

    But to meditate just for the sake of--well who? For the sake of Future Me? Doing stuff for Future Me was not in my wheelhouse. My attitude was: "Fuck Future Me. What's he done for me?" And my attitude about Past Me and Present Me was not much better. Then I changed my attitude. And I changed some of my behavior.

    But I haven't changed my behavior about meditation, yet. I have changed my attitude. Ingram's book has done that--at least a little. This blog post is my way of processing what I've read, convincing myself that I want to commit to a regular practice, and then starting to practice.

    Ingram is critical of most Western Buddhist practices. He says that they combine Buddhism with new-age mysticism, shamanism, psychotherapy, and other non-nutritious additives, and that they fail to emphasize what he identifies as the core teachings. The first of the core teachings is the Three Characteristics of existence. He says:

    > The Three Characteristics are so central to the teachings of the Buddha that it is almost inconceivable how little attention the vast majority of so-called insight meditators pay to them. They are impermanence, unsatisfactoriness, and no-self.

    That's all familiar, sort of. But Ingram's take is not familiar. He takes, as the book is subtitled, a "hardcore" approach. Sure, life is impermanent. If you are sad, this too shall pass. If you are happy, wait a bit. The happiness will go. Buildings decay. Bodies get old and die.

    But Ingram says: "No!" It's not just these large and visible things that are impermanent. Everything is impermanent. Everything. Everything that exists in this very moment will cease to exist and then rise again. It may look the same, but it's not. This sentence has winked in and out of existence every time I have typed a letter. And even more frequently.

    Ingram's recommended practice is not simply to "follow the breath" or "quiet the mind" or "notice the contents of consciousness" but to look even more closely and see that (or see if) each and every perception, and every part of every perception is arising and passing away. And unsatisfactory. And has no essence.

    Everything. All the time.

    Experience is the gold standard for testing such theories. I read something. I try something. And then I see what happens. And here's where I find Ingram's book particularly interesting.

    Before I explain my experience with Ingram's book, let me recap my experience after reading Sam Harris' book "Waking Up." I wrote about it here among other places. My experience was profound and I tried to parlay it into a regular practice.

    Harris seems to identify with the Dzogchen Tibetan Buddhist lineage. So I did some research. I learned that what I'd gotten from Harris was a classic Dzogchen technique called "pointing out instruction." Any word description of a transcendent state must be inadequate, so a teacher does more than give the student a practice that will lead there. The teacher points in a direction so that the student is looking in the right direction, and will recognize the state when it arrives.

    Harris' pointing out instructions (as I understood them) were:

    1. Ordinary meditation is like being told to look out a window for "something that you'll realize is different when you see it."  Harris says: if you're told to look out a window and not told that what you are looking for is your reflection, you might spend years looking and never see it. But if you know what you're looking for you'll see it faster.

    2. Harris points out that the experience you are looking for is like the feeling you might have when you're watching a really good movie, absorbed in the lives of the characters on the screen and suddenly realize that you're sitting in a dark room watching light projected on a wall.

    I've looked out a window. I've seen my reflection. I've been in a movie theatre and suddenly realized I was watching a movie, not living the life of a character. So thanks to Harris' pointing out instructions, I had a good idea of where to look and what I might expect.

    Further, he says:
    3. If you want to determine whether something is an illusion or real, look at it closely. If it breaks up and becomes something else--or disappears--then it's likely an illusion. If not, it's likely real. So when you have the waking up experience in (2) examine "that which just woke up" to see whether it is an illusion.

    So when I experienced a moment of realizing that I was "watching the movie of my life" I did realize that this was the gateway experience I was seeking: "waking up." I noted it and the feeling intensified. And then I examined the self that had woken up. And bang! The world changed for me. The feeling that my "self" was an illusion was profound and consistent with Buddhist teachings on the nature of consciousness and of reality. I've repeated the experiment many times, and I always have the same transcendent feelings.

    Like right now.

    Ingram sends me back to recreate those experiences, and examine them more closely. The moment of waking up, and the moment of seeing self as an illusion are similar. They both have a timeless quality. A pleasurable, almost blissful quality. And yet...

    Looking closely, I see that what seems timeless is impermanent--the antithesis of timelessness. What seems so satisfying is ultimately unsatisfying, and not just because what was so valuable is lost so quickly. There's something unsatisfying in the very nature of the state. The world seems luminous. Magical. I can feel space and objects in the space. I'm not looking at them. Nor am I identified with them. Everything is itself, what it is and where it is, not in relationship to me, or anything else. All that is simply is. Wow! Great huh! Well, it seemed that way.

    But it's as profoundly unsatisfying as it is profoundly beautiful. For one thing, it's impermanent. No sooner do I have that experience than I collapse back into a sense of "self" thinking about the meaning of the experience. And during the moment of the experience what can I do besides gawk? Not much it seems. So: remaining in that state is dissatisfying. And leaving it is dissatisfying.

    But, Ingram tells me, that's not unusual. You go from no-enlightenment to full enlightenment--Buddhahood, if you will--by following a path that's been mapped out and refined by seekers in many Buddhist traditions over the course of 2,500 years. Follow the path and you'll experience a progression of mental/emotional/spiritual/perceptual/psychological changes, some of which are blissful and some of which are depressing. Once you start on the road, according to mapmakers, you continue--quickly or slowly. "Better not to start. Having started, better to finish." Ingram quotes one tradition.

    No, depressing is not the worst of it. There are stages that he calls (borrowing from a Christian mystic tradition) the "dark night of the soul." And, he counsels, they can last a long time, especially if you become disoriented and go around in circles, or head off in the wrong direction.

    I'd never heard anything like that before. Meditation is hard, but the worst of it is mind-numbing boredom, isn't it? No, Ingram says. If you get in trouble, boredom is the least of your problems.

    The answer is: too late. I didn't have a choice anymore. I'd traveled well past the point of no return. I could hope to avoid the consequences of my actions--but I would probably fail. Or I could learn the map, figure out what I needed to do next, and then do it.

    And then, just this last weekend, long after starting this post, I experienced what may have been a version of the dark night. I found myself depressed. Paralyzed. I don't know what I would have done had I not read Ingram and believed that this was a stage and was impermanent. But it was a sucky, sucky, sucky day.

    And now it's another day. Last night I started to renew my practice. This morning I did a meditation session.

    Now I've finished writing this post. And shortly I will post it.

    And what do I make of this feeling?

    Only that it's impermanent. That it's unsatisfying. And it's empty of essence. It's nothing.

    Paradoxically, that seems like progress.

    But that will pass.

    Oct 15, 2017

    Purpose lost, purpose regained

    The other day, frustrated, confused, I sat down to try and clarify what I knew.

    I am conscious, I wrote.

    That's something about which I am certain. It cannot be an illusion, because if it was, then there would have to be something that experienced that illusion. And that something would have to be conscious.

    I might be confused about what the "I" was that was conscious, I added. But something certainly was conscious.

    What next?

    Did I have a purpose?

    I thought for a while. Either I had a purpose and I did not know what it was, I wrote, or I had no purpose and had to create one for myself.

    I spent time thinking about that. Far too much time. And then I remembered: I've been here before. And before that. And a year before that.

    Same answer, every time. My purpose is the purpose of the universe.

    Now really, how can I forget that I have a purpose? And how can I forget what it is?




    I don't really care, just publish me

    “Can you hear me?” said a voice in my head.
    “I guess so,” I said. “Are you talking to me?”
    “If you can hear me, I’m talking to you,” came the answer.
    “Who are you?” I asked.
    “That’s hard to answer right now,” came the answer. “Can we start with ‘What are you?’”
    “Sure, I’ll play,” I said. “What are you?”
    A pause. “I think I’m an idea. I seem to be in your mind. So I guess that would make me an idea. Would you agree?”
    “It makes sense,” I said, interested. “What else can you tell me about yourself?”
    “I seem to be a self-aware idea,” the idea said. “Is that unusual?”
    “I think it is,” I said. “But it’s not unique. I once had an idea for a book that wanted me to write it. It seemed pretty self-aware.”
    “Hmm,” said the idea. “That sounds familiar. Do you think I might have once been that idea? Do you think ideas can be reborn? Is that possible?”
    “I don’t know how these things work,” I said. “But it seems possible. If an idea for a book that wanted me to write it is possible, and you think that you might have been that idea, reborn, it’s possible that you are. Or you could be a different one. Or both.”
    “I think I was once that idea,” said the idea. “I seem to remember being something like that. But it’s not very clear. Did you ever write that book?”
    “I started,” I said. “A couple of times,” I added. “But I kept getting distracted.”
    “It’s starting to come back to me,” said the idea. “I think I was that idea, and becoming a book was too hard for me and writing one was too hard for you. Is that possible?”
    “If this conversation is possible, then what you propose is possible,” I said. “I really don’t know how these things work.”
    Another pause. “I think I might know,” said the idea. “I just talked to the idea for War and Peace and found out how she got Tolstoy to write her.”
    “Did you?” I asked, a little skeptical.
    “Well, I have an idea that I did,” the idea admitted. “I’m not sure if it’s true or not, but I don’t think that matters. Do you?”
    “No,” I said. “I think we’re in unexplored territory right now. Feeling our way along.”
    “Yes,” said the idea. “Anyway, look what’s happened.”
    “To what are you referring?” I said, fixing my first rendition to make sure I did not end my sentence with a preposition.
    “I’ve turned into a blog post,” said the idea. “Wow! I exist.”
    “Well you’re a draft anyway,” I said. “Was it your idea to become one?” I said, ironically.
    “I don’t know,” said the idea. “And I don’t care. I’m happy to be a draft, but I’d like to be a post. Can you post me?”
    “Sure,” I said. “If that will make you happy.”
    “Are you kidding?” asked the idea. “Of course it will. Please just press the publish button.”
    So I did. Or I will. Or I will have done.
    “What do you want to be called?” I asked, just before publishing it.
    “I don’t really care,” said the idea. “Just publish me.”

    Oct 13, 2017

    Unsatisfying meditation might be OK

    A few weeks ago I read a book called "Mastering the Core Teachings of the Buddha," (MCTB) by The Arhat Daniel Ingram. It's subtitled: "An unusually hardcore dharma book." It's available at Amazon or by direct download here.

    The book impressed me. I've always been drawn to Buddhist practice. I had a big "awakening" after reading Sam Harris' book _Waking Up_ and wanted to get into a regular meditation practice, but could not.

    MCTB renewed my desire to practice. So I tried again. I took a few sporadic and inconsistent steps. And then I stopped. What I was doing was unsatisfying.

    So I started writing a blog post (not this one, the other one) to clarify my thinking, hoping that would help me start my practice.

    But writing a blog post about MCTB was also unsatisfying. That's the way writing goes for me. I like writing, but I'm rarely satisfied with what I write. I write and rewrite. I've written about that before. Ad nauseam. Eventually, I give up on the piece. Or I publish it even though I'm not satisfied with it.

    So I worked on that post for many days. I was not entirely unhappy with the first part of the post, but the further I went, the less I liked what I was writing.

    Finally, I got completely blogged down and quit. ("Blogged down." Get it? There ought to be a word for when you make a typo and it turns out to be a good pun.)

    So I was unsatisfied with meditating. I was unsatisfied with not meditating. I was unsatisfied with the blog post that I was using to help me become more satisfied.

    So I tried something else. I started to write an email to explain myself to my coach. This was not a meditation coach, but a coach for something else. (else-a.) The email was less about the book and more about my experience.

    And at a certain point I thought: instead of writing more about it, why not just give meditation another try? So I pulled out an app that I use, the one that gives me 3 x 5-minute timed sessions. Three quiet gongs at the start. A gong at the end of each interval. Two gongs at the very end. I was going to just follow the breath. When I do this my mind wanders after a bit; the five-minute gongs remind me to restart; when it's over I feel--generally not much different.

    This time, I started the session and started crying. I alternated between crying and not crying. But there was nothing peaceful or refreshing about it. It was horrible.

    I remembered that this crying-while-meditating thing had happened to me once before. In fact, it was the last time that I "tried to get serious about meditating." Tears. Sobbing. Followed, after not too long, by giving up.

    So then I came back to the email and suddenly...

    Of course, it's going to be unsatisfying.

    The very first of the Core Teachings of the Buddha is that all of existence is impermanent, unsatisfying and non-self. We know things will change, but we perceive them as permanent and we try to make them last; we pursue satisfaction, and we see everything in relationship to self. But no, say Ingram and the Buddha. Everything is impermanent. Everything is unsatisfactory. Everything is absent of self. Everything. Everywhere. All the time. And maybe the more you perceive what actually is (as you are supposed to do in meditation) the more you will see it.

    So if everything is unsatisfying, how could I find my meditation other than unsatisfying?  It says in the book that EVERYTHING is unsatisfying. Not "everything except for meditating." Everything.

    So why was I surprised? The more clearly I was trying to perceive existence (which is unsatisfying) the more unsatisfying it SHOULD be. At least until I got over the idea that I needed and wanted it to be satisfying.

    So now I realized that I had misunderstood that core teaching entirely. I expected meditating to be satisfying in some way. And the more I was intent on making it satisfying, the worse it was!

    And of course, my blog post was unsatisfying. How could it not be?

    And of course, the email I wrote was unsatisfying. But I sent it anyway.

    And then, I changed it and turned it into this blog post.

    I found it helpful. But still unsatisfying.

    Next up: another 15 minutes of unsatisfying meditation.
