By David Rowan, 7 Oct. 2010
By mid-2007, Don Mattrick, who runs Microsoft’s interactive-entertainment business division in Redmond, Washington state, was demanding a new direction for the Xbox 360. “There has to be a fundamental reimagining of the way we interact,” explained Marc Whitten, Xbox Live VP, in a strategy meeting. To be fair, Whitten -- along with the rest of the senior executive team -- wasn’t certain what this meant in terms of hardware. Still, they compiled a cursory list of desired features: motion-tracking controls, facial recognition, speech recognition and backwards compatibility.
The problem wasn’t vision. It was the task’s sheer impossibility. Finding cameras that could map a living-room in 3D was easy. Getting one reliably to decode the flailing limbs and shouts of 40 million Xbox users was a whole other dream. To pull this off, the hardware would require a software “brain” capable of interpreting what the team calculated was a crushing 10²³ spatial and aural variables at any given moment. And it would have to do this on the fly, with no perceptible on-screen lag.
Still, fragments of the solution did already seem in place. Microsoft Research’s Beijing bureau had collected tomes on the successes (and failings) of facial-recognition technology. Redmond’s speech-recognition software -- which now ships in Windows 7 and the Ford Focus -- had already been in development for decades. And Redmond’s hardware engineers were busy creating exotic gyroscopic and accelerometer-based controller prototypes in anticipation of the coming shift. But it would take exhaustive research and testing before anyone could guess whether such a shift would be workable -- or profitable.
As 2008 began, Mattrick made it known that he wanted to transform and expand Xbox 360 and allow users to experience gaming and entertainment in a different, more social way. Could his teams take depth sensors, multi-array microphones and RGB cameras and turn them into a consumer experience? The magic box would need to track people at 30 frames per second, recognise them, understand how they move, and incorporate voice recognition -- and all in ways that could enhance the game experience.
There was just one problem: this hadn’t yet been done anywhere in the world.