Grand Theft Auto and AI help Surrey team turn dog pics into 3D models
Photographs of dogs could soon be used to help generate 3D models more accurately than ever before – thanks to an award-winning study from the University of Surrey and the famous video game, Grand Theft Auto.
The researchers trained an artificial intelligence (AI) system on images they created using Grand Theft Auto V, teaching it to predict a dog's 3D pose from a single 2D image.
Moira Shooter, a postgraduate researcher at the University of Surrey, said:
“Our model was trained on CGI dogs – but we were able to use it to make 3D skeletal models from photographs of real animals. That could let conservationists spot injured wildlife, or help artists create more realistic animals in the metaverse.”
One way to teach AI to get 3D information from 2D images is to show it photos while giving it information about 3D ‘ground truth’ – where the objects actually are in 3D space. For humans, that means wearing motion capture suits.
Getting dozens of dogs to wear motion capture suits can prove challenging, even when they are on their best behaviour.
Instead, researchers created a myriad of virtual dogs to study.
They altered the code of Grand Theft Auto, switching the main character for one of eight kinds of dog – a process known as “modding”. They generated 118 videos of the dogs sitting, walking, barking and running in a range of different weather and lighting conditions.
The team called their new database ‘DigiDogs’. It is made up of 27,900 frames. They will now fine-tune their AI system using Meta’s DINOv2 model to make sure it can predict a 3D pose just as well from real dog pictures.
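To make the idea of 3D ‘ground truth’ concrete, here is a minimal sketch of how a predicted pose might be scored against the known joint positions from a virtual dog. It uses the mean per-joint distance, a common measure in pose estimation. The article does not say which measure the Surrey team used, so the function name, the three-joint skeleton and the coordinates are all hypothetical.

```python
import math


def mean_joint_error(predicted, ground_truth):
    """Average Euclidean distance between matching 3D joints.

    Illustrative metric only; not taken from the DigiDogs study.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("poses must have the same number of joints")
    if not predicted:
        raise ValueError("poses must contain at least one joint")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical three-joint skeleton as (x, y, z), in arbitrary units.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0), (1.0, 1.0, 1.0)]

error = mean_joint_error(predicted, ground_truth)
print(f"mean joint error: {error:.2f}")  # prints "mean joint error: 1.00"
```

With synthetic footage, the game engine supplies the ground-truth joint positions for every frame, so a score like this can be computed without any motion capture.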
Moira added, “3D poses contain so much more information than 2D photographs. From ecology to animation – this neat solution has so many possible uses.”
Here comes the science bit…
The research won the prize for Best Paper at the IEEE/CVF’s Winter Conference on Applications of Computer Vision.
It helps promote the UN Sustainable Development Goals (SDGs) 9 (industry, innovation and infrastructure) and 15 (life on land).
All images were produced by Moira Shooter of the University of Surrey, using the engine of GTA 5 (Rockstar Games).