In the examples for Some OAUTH2 walkthroughs, I used the Google Cloud Vision API to demonstrate various authentication scenarios. The result was an evolving webapp which could do some cool things applying that API to images. I thought it might be fun to expand it a little to do the same thing using Microsoft's Project Oxford Vision API, which has many similar capabilities.
All the code for this is on Github and is a development of the examples shown in Some OAUTH2 walkthroughs. I'm not going to dig into the code too much in this post. You are welcome to take a look and play around with it yourself.
Both APIs need authorization, and both are paid services, though each has a free tier.
Both have similar capabilities such as image classification, face detection and emotion detection. I'm going to focus on emotion detection in this post, since it builds on the other capabilities - you can't analyze an emotion until you've found a face.
There's a video version of this post here.
The Google one allows you to do more things at once, but the result is that it's more complicated to use. It also returns categorized emotional scores, whereas Microsoft sticks its neck out and gives you an actual value. Microsoft is also more ambitious in its emotional analysis range, attempting to detect more subtle emotions. Although neither is perfect, I think the Microsoft one gives more balanced and subtle results, and it's easier to use. So for once, it's Microsoft for me. Let me know what you think from the examples below.
Before I get into the details, here are the side by side results of emotion analysis on a couple of images.
The emotions detected (they are called slightly different things, but I've normalized them for comparison) are a little different. Both detect joy, sorrow, anger and surprise, but Microsoft go a little further and try to look for contempt, disgust, fear and neutrality. Google offer some functional measures instead, such as headwear detection and flags for underexposed or blurred images.
Whereas Microsoft returns values between 0 and 1 (actually there were a couple of results that were very tiny negative numbers), Google returns a likelihood classification.
So that I could compare them as a chart, I assigned these weights to the categorizations.
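To show the idea, here's a minimal sketch of that normalization. Google's face annotations come back with likelihood enums (VERY_UNLIKELY through VERY_LIKELY), so charting them against Microsoft's numeric scores means picking a weight for each category - the weight values below are illustrative assumptions, not necessarily the ones I used for the charts.

```javascript
// Map Google Vision likelihood classifications to numeric weights
// so they can be charted alongside Microsoft's 0-1 scores.
// These particular weights are an illustrative choice - use whatever
// scale suits your chart.
const LIKELIHOOD_WEIGHTS = {
  VERY_UNLIKELY: 0,
  UNLIKELY: 0.25,
  POSSIBLE: 0.5,
  LIKELY: 0.75,
  VERY_LIKELY: 1,
  UNKNOWN: 0
};

// Convert one Google faceAnnotation into numbers comparable
// with the Microsoft emotion scores
function normalizeGoogleFace(faceAnnotation) {
  return {
    joy: LIKELIHOOD_WEIGHTS[faceAnnotation.joyLikelihood] || 0,
    sorrow: LIKELIHOOD_WEIGHTS[faceAnnotation.sorrowLikelihood] || 0,
    anger: LIKELIHOOD_WEIGHTS[faceAnnotation.angerLikelihood] || 0,
    surprise: LIKELIHOOD_WEIGHTS[faceAnnotation.surpriseLikelihood] || 0
  };
}
```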
Let's take a look at a few more comparisons. Google (on the right) got this one completely wrong - Bernie doesn't look too happy to me in this picture. He looks angry and somewhat disgusted - Microsoft got it right.
Google did better in this one, capturing both the surprise and anger in this Trump image.
In this image Microsoft did better, picking up the trademark Trump surprise and anger.
I love this photo, which is 100% joy, recognized by both. Strangely, Google didn't detect she was wearing a hat.
There really isn't a category to describe this dopey image of Francois Hollande, but they both concluded he was happy, with Microsoft throwing in some surprise.
Google completely failed to notice any emotion in this Trump picture, whereas Microsoft picked up the clear sorrow, anger and even contempt in the expression. Google just focused on the hat. In fact it often thinks people are wearing hats when they are not (see the picture from Homeland at the beginning of this post).
Both did well in this Hillary image, picking up both the surprise and joy.
I've covered Google label detection in other posts in Some OAUTH2 walkthroughs, but it's interesting how often Google detects hair or headwear.
I haven't tried Microsoft label detection. Face detection is of course a precursor to being able to analyze emotion. The Google face detection and emotion detection are wrapped up together - both are returned from the same query. Microsoft has a separate API call for each.
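That structural difference shows up in the request shapes. Here's a sketch of the two payloads - the endpoints and field names are as I understand them from the Project Oxford era, so treat them as assumptions and check the current documentation before relying on them.

```javascript
// Google: face detection and emotion come back from one annotate call
function buildGoogleRequest(imageUrl) {
  return {
    url: 'https://vision.googleapis.com/v1/images:annotate',
    payload: {
      requests: [{
        image: { source: { imageUri: imageUrl } },
        features: [{ type: 'FACE_DETECTION', maxResults: 10 }]
      }]
    }
  };
}

// Microsoft: emotion analysis is its own endpoint, authorized with
// a subscription key header rather than OAuth2
function buildMicrosoftRequest(imageUrl, subscriptionKey) {
  return {
    url: 'https://api.projectoxford.ai/emotion/v1.0/recognize',
    headers: { 'Ocp-Apim-Subscription-Key': subscriptionKey },
    payload: { url: imageUrl }
  };
}
```

In Apps Script you'd hand these to UrlFetchApp.fetch with a JSON content type; in the webapp they go through the browser's fetch instead.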