Make Sure You’re Kinect-ed: Talking With Microsoft’s Alex Kipman

First we knew it as Xbox’s response to the Wii, with mysterious rumors pegging it as everything from a wand-like input device to a revolutionary motion-sensing camera. Then, the latter of those two prospects got introduced as Project Natal, the head-scratching codename for Microsoft’s controller-free initiative. A media circus (literally) during this year’s E3 revealed the actual product name this summer. Since then, silence and anticipation built steadily until Kinect’s debut in stores today.

Microsoft claims that Kinect will change everything. Much of that line of thought comes from Alex Kipman, the company’s Director of Incubation and the gentleman parting the waters into the promised land of super-immersion. Kipman’s been barnstorming for the last few weeks preaching the Kinect gospel and spoke to IFC News about the philosophy and practical considerations behind Microsoft’s next big bet.

I know a lot of research has gone into Kinect and it’s finally out of development. Now that you guys can send the product off to launch, can you talk about how long the journey’s been to get here? Has it been a two-year cycle? Longer than that?

One could say it was a million years in the making. This was about making you into the controller, right?

Well, sure.

I would say that what you see here is a combination of us understanding a moment in time. Of us understanding that computers, as a whole, are transitioning from this world where all of us had to understand technology, into this world where technology fundamentally understands us.

But I see Kinect as the peak of that journey, of that transitioning moment, of the catalyst that brings us from this old world, to this new world that will be.

From that perspective, why hadn’t we had this before today? And the answer is because we haven’t been able to get the algorithms to a level of sophistication around the various elements–computer vision, machine learning or voice recognition–to a point where we could transition science fiction into science fact.

So, to give you the idea of time, which is what you’re asking, I need to mention that we have a huge branch of research at Microsoft. And if I were to add the many years of people with domain expertise in these fields, it’s decades’ worth of work.

Right. So you were already doing biometrics research and stuff like that?

Any number of things like that. Generally, we pick the key experts in the world in all of these fields and fuse them together to really make sure we can get a very strong platform that really lives up to, “Hey, simply step in front of the sensor, and it recognizes you.” It knows the difference between you and I, you and your family, you and your friends. Start moving, and the sensor understands fundamentally your human movement. Knows when you kick a soccer ball, or gesture to move between UI [user interface] screens. Then, it can tell when you move around to do tai chi poses as in “Your Shape: Fitness Evolved.”

And, finally, when you use your voice, you’ll have voice recognition work in a natural way. So, if you’re watching entertainment, you can simply say “Xbox, pause,” or “Xbox, why don’t you suggest a movie for me?” Voice commands and things along those lines.

Those three pillars create the palette, the paint colors and the paint brushes that allow us to create these unique experiences that land us in this new world, where technology disappears and Kinect fundamentally understands you. Now that’s half of the story.

The other half of the story is how the combination of research and technology–paint colors and paint brushes, if you will–lets players be painters and paint pictures. We think the seventeen experiences that we bring to market at launch will let you do that. All the launch games were really created from the beginning to get you up on your feet, get everybody collaborating, cheering each other on, playing together, having fun and laughing together.

So what’s the nature of the challenges in the various Kinect games?

These experiences were designed to be simple, fun, and approachable for everyone. Now, that doesn’t mean that they are simple in every way. They’re simple to start, but they’re still skill-based. They take forever to get good at. It’s like golf. You and I can go to the golf course today. I know the rules. I can swing my arms and I can hit a ball. I never played before. To beat Tiger Woods, I’m going to have to spend a little bit more time to get to that level. Same thing here.

All of our experiences can be described as simple and approachable. Super-easy to get into and go. But it takes work and skill before you can get really, really good at it.

Can you talk a little bit about integrating the research? Like you said, you’ve got three pillars here that are all combining to create essentially a seamless experience. Can you talk about the different directions that you guys could have gone, for things like voice recognition or the body scanning?

Yes and no. The reason nobody has been able to crack this problem before is because everybody goes down a route, after trying to figure out a pre-set path. That’s the very engineering way as opposed to the artistic way of approaching the problem. I always say to people, “I’m the Kirk in a world of Spocks.” The world of Spocks requires you to choose something. It’s zeros and ones. It’s true and false. It’s black and white. It’s yes and no.

The answer to your question, which is the Kirk answer, is the more emotional and artistic answer…

[overlapping] You got some of the William Shatner body language going on, too.

[laughs] The point I’m trying to make is that the human body and human expression represent a system that’s analog. It’s not yes and no. It’s maybe. It’s not black and white. It’s gray. You’re moving to a world that’s not “what you know” but to a world of probabilities where all of these probabilities exist all the time. Your brain’s job is to create a language that allows you to know what to choose out of all these probabilities and when to choose it.

That was a whole bunch of philosophical blah-blah-blah. Let me give you some concrete examples. Take identity recognition. I can reduce that entire space to a signal-to-noise problem. Why haven’t I had identity recognition that works in the past? It’s because people choose a way of doing it: either a face, a voice or a fingerprint gets added. It turns out that if you and I get in front of a camera sensor right now, we’re very different people. Kinect is going to use that facial recognition data to lock us in.

Now that facial recognition is signal, everything else is noise. Still, it turns out that in the living room, Darwin is against us. You are genetically similar to your family. At that point, facial recognition sucks. So, then, facial recognition just became noise, and I need something else to be signal.

So it’s really about trying to create hardware and software that look at the world in terms of “ands” instead of “ors.” It’s not about choosing a path. It’s about realizing that no one path will get you to the Promised Land, and you need to create a language that tells you that everything’s probable. You need to have some language around confidence. You need to know when you know something. You need to know when you don’t know something.

All of that sounds ridiculously difficult to program a computer to do…

It gets better, because there’s a second derivative to it, which is that you need to be confident about your confidence. Because if my [computer vision] system says “Hey, I’m really 100% sure that this is a head,” and it’s really a foot, well, it’s not really confident about its confidence. So, the entirety of Kinect is designed to be this probabilistic, statistical-based system that really looks at everything–identity, motion, and voice–in terms of a signal-to-noise world. And it knows when to focus on the signal, when to throw away the noise, much like your brain works.

Our minds are essentially massive signal-to-noise machines that are way more complicated, complex and sophisticated than Kinect. Like, right now, your attention is focused on me and my voice, relegating all the voices in the other rooms to the background. All of our efforts for what we want to do on the console have been to basically replicate a similar means of judging and filtering multiple streams of data, to figure out the most probable conclusion for which user you are, what you might be saying and how you might be moving.

So that goes back to what I said about no single path for decision-making. It’s about all possible paths. And it’s about being confident about your confidence so we can believe in the choice. Traditionally, it’s super-simple to create an artificial intelligence system that knows something. Now, to have the artificial intelligence system know when it’s stupid and when it doesn’t know something, that’s the hard problem. And Kinect does that with something that’s uniquely ours, something that we invented, which is this language to be able to describe these very analog concepts in a robust way.
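Kipman doesn't spell out the algorithms behind any of this, so the following is only a toy illustration of the general idea, not Kinect's method. It is a minimal Python sketch that assumes each cue (face, voice, body size) can be summarized as a likelihood ratio, and that the cues are independent. It combines them into a single probability and declines to answer when that probability is not decisive. The names `fuse_cues` and `decide`, the 0.9 threshold, and every number here are hypothetical.

```python
import math


def fuse_cues(prior, likelihood_ratios):
    """Combine independent cues into one probability using log-odds.

    prior: probability that the person is a given user before any cue is seen.
    likelihood_ratios: one ratio per cue; a value above 1 supports the match,
    a value below 1 argues against it, and a value near 1 is just noise.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


def decide(posterior, threshold=0.9):
    """Answer only when the evidence is decisive; otherwise admit doubt."""
    if posterior >= threshold:
        return "match"
    if posterior <= 1.0 - threshold:
        return "no match"
    return "unsure"


# Two siblings look alike, so the face cue alone is close to noise (ratio 1.2).
face_only = fuse_cues(0.5, [1.2])
print(decide(face_only))  # unsure

# Adding voice (4.0) and body size (6.0) supplies the missing signal.
all_cues = fuse_cues(0.5, [1.2, 4.0, 6.0])
print(decide(all_cues))  # match
```

The "unsure" branch is the part that corresponds to a system knowing when it doesn't know: a weak cue is neither trusted nor discarded, it simply contributes little until other cues push the total one way or the other.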

One last question. This conceptual framework, this architecture for the algorithms that you’re talking about, is this something that we can expect to see rolled out on Xbox in different ways, or even onto the PC platform? Because it sounds flexible enough to kind of reinvent user interfaces altogether.

I meant what I said. The entire computer world is changing. And when we look at Kinect, it’s the beginning of the journey. It’s not an end of a journey. And we begin the journey very focused in the living room, and in gaming and entertainment as a whole, but it would be silly of us to not be looking at this in a broader sense.

We don’t have the time or the wish to think about that broader space right now. We need to have an amazing consumer launch on November 4th and have an amazing device for everyone in the living room, but, as you say, we believe fundamentally in ushering in this new era of computers, and we see Kinect as the pinnacle of that transition right now.
