Technology Horizon for the innovation generation

Hand crafted

Bjorn Stenger was invited to join a team of three researchers at Toshiba.

A fascination for computer vision and machine learning led not only to a PhD at Cambridge but a top research role at Toshiba for Bjorn Stenger.

Developing vision systems that can capture, process and display images is one of the most challenging areas of mathematics and computer science. Yet it can also be highly rewarding. Just ask Dr Bjorn Stenger, a researcher in computer vision and machine learning at the Computer Vision Group (CVG) of Toshiba Research Europe in Cambridge.

Pursuing his passion for this demanding area has taken him to the leading edge of his field and set him on his way to a great career with Toshiba, one of the biggest technology groups.

Stenger became intrigued by the subject while studying for a diploma in computer science at Bonn University. In his final year, he was offered the opportunity to work at Siemens’ Princeton Research Laboratory in the US. He spent six months there developing a system that could take a video source and extract moving objects from backgrounds whose appearance changed over time.
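The article does not say how the Siemens system modelled its changing backgrounds. As a rough illustration of the general idea only, the sketch below uses a running-average background model, a common textbook approach that is assumed here rather than taken from Stenger's work. The function names, the `alpha` blending rate, the threshold and the toy frames are all hypothetical.

```python
def update_background(background, frame, alpha=0.1):
    """Blend the new frame into the background model (exponential running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background model by more than the threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 1x4 greyscale frames (assumed data): the scene slowly brightens,
# then a bright object appears at index 2 in the last frame.
frames = [
    [[100, 100, 100, 100]],
    [[104, 104, 104, 104]],
    [[108, 108, 108, 108]],
    [[112, 112, 200, 112]],
]

# Start the background model from the first frame.
background = [[float(p) for p in row] for row in frames[0]]
masks = []
for frame in frames[1:]:
    # Classify against the current model first, then let the model adapt.
    masks.append(foreground_mask(background, frame))
    background = update_background(background, frame)

print(masks[-1])
```

Because the model keeps absorbing a little of each new frame, the gradual brightening is treated as background, while the sudden change at one pixel is flagged as a moving object.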

Siemens planned to use the software in a networked CCTV system to help security teams assess surveillance videos for suspicious behaviour. Stenger used the experience of developing the software as the basis for the thesis of his undergraduate degree.

By then, he was bitten by the image processing bug and after being awarded his diploma, he enrolled on a PhD at Cambridge University. For his thesis he developed a system that allows a computer to interpret the motion of hand gestures captured from a video stream and subsequently attach certain actions to those motions. If you have seen Tom Cruise controlling a video screen in the film Minority Report, you will know what Stenger was trying to perfect.

Stenger realised that to build such a system he would need to compare the image of a hand captured from a video stream with a model he would need to create on the computer. First, however, he had to develop a technique to extract the essential information from the image of the hand, using edge-detection and colour-matching techniques.

He constructed a 3D model of a hand from cones, cylinders and ellipsoids, which looked ‘much like a piece of Cubist art’. The model was used to generate contours, which were then compared with the edge contours and skin colour in the images.
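One simple way to compare model contours with image edges is to measure how far each contour point lies from the nearest edge pixel. The sketch below assumes that kind of average nearest-edge distance (often called a chamfer distance) purely for illustration. The article does not specify the cost function used, and the skin-colour term it mentions is left out. The point sets and pose names are hypothetical.

```python
import math


def chamfer_distance(model_points, edge_points):
    """Mean distance from each model contour point to its nearest image edge point."""
    total = 0.0
    for mx, my in model_points:
        total += min(math.hypot(mx - ex, my - ey) for ex, ey in edge_points)
    return total / len(model_points)


# Hypothetical edge pixels extracted from one video frame.
image_edges = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]

# Hypothetical contours projected from two candidate poses of the hand model.
candidates = {
    "pose_a": [(0, 0), (1, 0), (2, 0), (2, 1)],
    "pose_b": [(0, 3), (1, 3), (2, 3), (3, 3)],
}

# A lower score means the projected contour lies closer to the image edges.
scores = {name: chamfer_distance(pts, image_edges) for name, pts in candidates.items()}
best_pose = min(scores, key=scores.get)
print(best_pose, scores)
```

A real system would evaluate many thousands of candidate poses in this way, which is why the speed of the matching step matters so much.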

Stenger admitted: ‘It was a shamefully simplistic representation that did not take into account shading and lighting or motion blur.’ The hardest part of the tracking was the sheer number of possible hand gestures in the video that the computer needed to compare against his model.

He said: ‘The real trick was to develop a way of quickly matching the image data to the model.’

To resolve that, he used a data glove with sensors in the joints to capture a variety of hand poses that a user might employ to interface with the system in real-life scenarios. By doing so, he was able to reduce the search space dramatically and therefore the computational complexity of the task.

He also developed a hierarchical processing scheme to make the computer matching more efficient. In Stenger’s system, the computer identifies a close match to the hand gesture presented to it, then explores only others that are similar. It tracks down a hierarchy of possibilities, finally selecting the most promising match from a distribution of potential solutions. The result is then propagated to the next frame of the image as the most likely match. Then, the procedure is carried out again and again as each new image is presented.
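The coarse-to-fine idea can be sketched as a search over a tree of pose templates, where only the best-scoring branch at each level is explored further. This is a simplified illustration under stated assumptions, not Stenger's algorithm: the pose is reduced to a single hypothetical joint angle, the tree is built by splitting a range evenly, and the propagation of a distribution of solutions to the next frame is not modelled.

```python
def build_tree(low, high, depth, branching=3):
    """Recursively split a 1-D pose range; each node's pose is the centre of its range."""
    node = {"pose": (low + high) / 2, "children": []}
    if depth > 0:
        step = (high - low) / branching
        for i in range(branching):
            node["children"].append(
                build_tree(low + i * step, low + (i + 1) * step, depth - 1, branching)
            )
    return node


def hierarchical_search(root, cost, keep=1):
    """Descend the tree, expanding only the `keep` best-scoring nodes at each level."""
    frontier = [root]
    evaluated = 0
    while True:
        children = [c for n in frontier for c in n["children"]]
        if not children:
            break
        evaluated += len(children)
        children.sort(key=lambda c: cost(c["pose"]))
        frontier = children[:keep]
    best = min(frontier, key=lambda n: cost(n["pose"]))
    return best["pose"], evaluated


# Hypothetical setup: joint angles from 0 to 81 degrees, 27 leaf templates.
tree = build_tree(0, 81, depth=3)
observed_angle = 50  # assumed measurement taken from the current frame

best, evaluated = hierarchical_search(tree, cost=lambda pose: abs(pose - observed_angle))
print(best, evaluated)
```

In this toy case the search scores 9 templates instead of all 27 leaves, and the saving grows quickly as the tree gets deeper. Raising `keep` trades speed for a lower risk of discarding the correct branch early.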

The program Stenger wrote for his doctorate won him the British Machine Vision Association Sullivan Thesis Prize, awarded annually to the best thesis in the UK in the field of vision.

During his final year at Cambridge, he applied for a position on the Toshiba research fellowship programme which, supported by the EPSRC, offered the opportunity to live and work in Japan. Stenger was delighted when he was accepted and flew to Japan to join a team of researchers at the Toshiba R&D Centre in Kawasaki, near Tokyo.

While there, he extended the techniques he had developed for his PhD to track the human body. ‘The hand-tracking algorithm could easily be employed to analyse images of a human walking, turning and dancing,’ said Stenger. He and his colleagues were asked to build a system to do that. ‘We developed a “virtual fashion show” that would display a CGI image of a model on a screen that would move in tandem with a real fashion model walking on a catwalk.’

The system Stenger developed tracks the movement of a model using two cameras and captures a 3D image while she is walking down the catwalk. A simplified version of the captured image is then compared with a database of postures created from laser scans of the body. A computer algorithm then estimates the posture of the model, again using a hierarchical scheme, from which it is able to display the virtual equivalent on a big display in real time.

In Japan, Stenger was also asked to look again at developing a hand recognition system, this time for the Toshiba Qosmio laptop. Toshiba wanted to demonstrate that the computer, with a built-in camera, could recognise a few simple hand gestures to enable a user to control simple functions such as audio and video.

Stenger realised his earlier work would be inappropriate in a commercial system. ‘For a commercial system we needed to develop a better system that was less complex, meaning fewer gestures, while at the same time being much more robust,’ he said.

He used cascaded classifiers to detect a number of hand poses in each video frame independently. But that did not prove fast enough to recognise a hand gesture in real time. To achieve that, the detection algorithm was optimised for multi-core processors by distributing the operations to multiple cores and minimising the data transmission between them.
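A cascaded classifier runs a sequence of tests, ordered from cheap to expensive, and discards an image window as soon as one test fails, so most windows cost very little to reject. The sketch below shows only that early-rejection structure. The three stage functions, their thresholds and the toy windows are hypothetical stand-ins, not the trained classifiers used at Toshiba, and the multi-core distribution described above is not shown.

```python
def run_cascade(window, stages):
    """Return (accepted, stages_run), stopping at the first stage that rejects."""
    for stage_number, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, stage_number
    return True, len(stages)


# Hypothetical stages, ordered from cheapest to most selective.
def bright_enough(window):
    return sum(window) / len(window) > 50


def has_contrast(window):
    return max(window) - min(window) > 40


def centre_brighter_than_border(window):
    mid = len(window) // 2
    return window[mid] > window[0] and window[mid] > window[-1]


stages = [bright_enough, has_contrast, centre_brighter_than_border]

# Toy 1-D "windows" of pixel intensities (assumed data).
windows = {
    "dark": [10, 12, 11, 9, 10],
    "flat": [90, 92, 91, 90, 93],
    "hand_like": [60, 90, 140, 95, 62],
}

results = {name: run_cascade(w, stages) for name, w in windows.items()}
print(results)
```

Here the dark window is rejected after one test and the flat one after two, so only the promising window pays for the full cascade.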

Toshiba was planning to open a computer vision group at the research laboratory it had formed in Cambridge and Stenger was invited to join a team of three.

Back in the UK, he has been working on refining the hand recognition system and implementing it on a standard laptop.

‘I have been investigating ways of improving the algorithm and making it run fast on standard low-cost hardware,’ he said.

He believes one day such systems will be all-pervasive — on TVs, remote controls and public displays.

Source: Technology Horizons
Date Published: October 27, 2008