Machines can now rank photos according to their aesthetic appeal, in much the same way humans do, thanks to a new artificial intelligence (AI) model created by researchers at Google.
Neural Image Assessment, or NIMA, is a convolutional neural network (CNN), a type of machine learning model, trained on images that have been rated and labeled by humans.
Unlike existing models, which typically categorize images only by technical quality, NIMA can be trained to predict which images a human would rate as both technically good and aesthetically attractive.
NIMA achieves this by recognizing and classifying images based on a variety of characteristics humans associate with emotions and beauty, scoring each on a scale of 1 to 10, the researchers said in a blog post announcing the technology.
“This is more directly in line with how training data is typically captured, and it turns out to be a better predictor of human preferences when measured against other approaches,” the researchers claim.
To test the model’s capabilities, NIMA was pitted against photos from the large-scale Aesthetic Visual Analysis (AVA) database. Each AVA photo was scored by an average of 200 people in response to photography contests.
After training, NIMA’s aesthetic rankings of many of these photos closely matched the mean scores given by the human raters.
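The scoring idea described above can be sketched in a few lines. Note this is an illustrative simplification, not Google's code: the ten-bucket rating distribution and the example probabilities below are assumptions for demonstration.

```python
# Sketch of NIMA-style scoring: the model outputs a probability
# distribution over the ratings 1..10, and the final aesthetic
# score is the mean of that distribution.

def mean_score(distribution):
    """Compute the mean rating from a probability distribution over 1..10."""
    assert len(distribution) == 10
    assert abs(sum(distribution) - 1.0) < 1e-6
    return sum(rating * p for rating, p in enumerate(distribution, start=1))

# Illustrative distribution: most raters gave this photo a 7 or 8.
probs = [0.0, 0.0, 0.02, 0.05, 0.08, 0.15, 0.30, 0.25, 0.10, 0.05]
print(mean_score(probs))  # -> 7.06
```

Because the model predicts the whole distribution of ratings rather than a single label, its mean score can be compared directly against the mean of the human votes.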
According to Google, NIMA scores can be used to compare different versions of the same image that have been distorted in various ways. They can also guide image enhancement, finding “aesthetically near-optimal” settings for brightness, highlights and shadows.
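One way such score-guided enhancement could work is a simple search over a tuning parameter, keeping whichever setting the model rates highest. The `aesthetic_score` function below is a hypothetical stand-in for a trained NIMA model, not Google's actual method:

```python
# Hedged sketch of score-guided enhancement: try several brightness
# settings and keep the one a (hypothetical) aesthetic model rates best.

def aesthetic_score(brightness):
    # Stand-in for a trained model: we simply assume it prefers
    # brightness near 1.1 on a 1-10 quality scale.
    return 10.0 - 20.0 * (brightness - 1.1) ** 2

def best_brightness(candidates):
    """Return the candidate setting with the highest predicted score."""
    return max(candidates, key=aesthetic_score)

settings = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
print(best_brightness(settings))  # -> 1.1
```

In a real pipeline, the scoring function would run the edited image through the trained network rather than evaluate a formula, but the search structure is the same.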
NIMA could also be used to improve photography in real-time and assist in the editing process, the researchers say.
Speaking with authority
Interestingly, NIMA is just one of a number of machine learning tools with near-human capabilities that Google has developed.
For example, a research paper published by the tech giant in January details a text-to-speech system called Tacotron 2, which the company claims generates speech from text with near-human accuracy.
The system can also handle hard-to-pronounce words and names, and alters its enunciation based on punctuation. For instance, capitalized words are stressed, as a speaker would do to signal that a word is an important part of the sentence.
Meanwhile, researchers from Google’s DeepMind AI lab recently developed AlphaZero, which absorbed all of humanity’s chess knowledge in around four hours.
After being given only the rules of chess, AlphaZero mastered the game well enough to beat the highest-rated chess engine, Stockfish.
David Kramaley, CEO of chess science website Chessable, said: “It will no doubt revolutionize the game, but think about how this could be applied outside chess. This algorithm could run cities, continents, universes.”