NTT DoCoMo comes up with hands-free videophone
Going hands-free right now means one of three things: turning on your handset’s speakerphone and letting everyone and their dog listen in on your conversation; using a wired hands-free kit that will definitely end up in a tangled mess when you stash it away, and be extremely frustrating to untangle when the situation calls for it; or, last but not least, scaring people into thinking that you are talking to yourself or an imaginary friend as you gesticulate all over the place during an animated conversation over a Bluetooth headset. NTT DoCoMo might have something right up your alley with a new, futuristic-looking, glasses-type head-mounted device that it calls the Hands-Free Videophone. How blasé, but I guess there is plenty of time to think up a cool name later on.
NTT DoCoMo came up with this glasses-type device because it feels that there is definitely a market for it. How does the Hands-Free Videophone work exactly? For starters, it captures the user’s face using all three cameras, which are located at the left and right sides of the frames. The pictures are then combined with a pre-rendered 3D model of the user’s face, and the resulting video is sent to the other person.
NTT DoCoMo explained, “Each camera has 720p resolution, and a fish-eye lens, with a 180-degree field of view. This is the High Definition picture currently being captured in real time. If you look at the face, you can see it’s really distorted, because the fish-eye lens is so close. The distortion is compensated, and the picture is combined with a 3D model of the person in the computer. Currently, priority is given to the part around the eyes. As you can see when the man closes his eyes, the eyelids and the corners of the eyes appear quite realistic. Such a level of realism is hard to achieve with models like CG-based avatars, where parts are overlaid on the face.”
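For a rough idea of what compensating fish-eye distortion can involve, here is a minimal sketch in Python. It is not NTT DoCoMo's method, which has not been published in this detail. It assumes an equidistant fish-eye lens (image radius proportional to the angle off the lens axis) and assumes the 180-degree image circle spans the 720-pixel frame height, giving a radius of 360 pixels. The function names are made up for illustration.

```python
import math


def focal_length_px(image_radius_px, fov_deg=180.0):
    """Equidistant fish-eye model (an assumption): r = f * theta.

    The edge of the image circle corresponds to half the field of view,
    so f = r_max / (fov / 2), with the angle in radians.
    """
    return image_radius_px / math.radians(fov_deg / 2.0)


def undistort_radius(r_fisheye_px, f_px):
    """Map a fish-eye radius to the radius an ideal pinhole lens
    with the same focal length would produce: r = f * tan(theta)."""
    theta = r_fisheye_px / f_px  # angle off the lens axis, in radians
    if theta >= math.pi / 2:
        # Rays at 90 degrees or more never hit a flat image plane.
        raise ValueError("no pinhole projection for rays 90 degrees off-axis")
    return f_px * math.tan(theta)


# Assumed image circle radius: half of the 720-pixel frame height.
f = focal_length_px(360.0)
halfway = undistort_radius(180.0, f)  # a point 45 degrees off-axis
```

Under these assumptions a point halfway to the edge of the fish-eye image sits 45 degrees off-axis, and the pinhole equivalent pushes it outward. Points near the rim blow up towards infinity, which is one reason a face captured from a few centimetres away looks so stretched.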
That sounds like some serious bandwidth is required. As of press time, though, the resolution is not quite high enough to handle the mouth and upper body parts of the image, so what we see there is based on computer graphics. The face’s orientation is based on six-axis sensor data, and the motion of the mouth is based on audio data from the microphone. The ultimate aim of the project? To recreate the whole face without the help of any computer rendering. That ought to be still some time down the road, we think. How about you?
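For the curious, driving a computer-generated mouth from microphone audio can be as simple as following the loudness of the signal. The Python sketch below is purely illustrative and is not how NTT DoCoMo describes its system. It assumes audio samples normalised to the range -1.0 to 1.0, and the function names, the `full_scale` level and the `smoothing` factor are all hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square loudness of one short audio frame."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_openness(samples, prev=0.0, full_scale=0.5, smoothing=0.6):
    """Return a mouth-open value between 0.0 (closed) and 1.0 (wide open).

    full_scale is the assumed loudness at which the mouth is fully open;
    smoothing blends in the previous value so the mouth does not jitter.
    """
    target = min(rms(samples) / full_scale, 1.0)
    return smoothing * prev + (1.0 - smoothing) * target


silence = [0.0] * 160
loud = [1.0, -1.0] * 80

closed = mouth_openness(silence)
opening = mouth_openness(loud)              # first loud frame
wider = mouth_openness(loud, prev=opening)  # second loud frame
```

With these made-up settings the mouth stays shut during silence and opens a little further on each consecutive loud frame. A real system would presumably look at what is being said and not just how loudly.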