Microsoft Azure Cognitive Services Fluent API v1.0
Introduction
Azure Cognitive Services are a powerful, consumer-friendly set of APIs that allow developers to easily make use of artificial intelligence and machine learning services within their applications. These include services such as text recognition, sentiment analysis, facial recognition and emotion detection, to name a few. You can go check them out here.
However, even though they are relatively easy to consume, they all follow a common usage pattern which can get a little tedious or verbose. This common pattern is:
- Initialise your HTTP communication stack
- Construct your Uri, headers, arguments and payload, then send to the relevant API.
- Parse the response, extracting out the relevant values to make sense of it all.
- For multiple calls, this process is repeated each time.
While this is not necessarily difficult, it can be tedious.
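To make that concrete, here is a rough sketch of the "raw" approach against the Text Analytics sentiment endpoint using a plain HttpClient. The endpoint path, header name and payload shape are based on the Text Analytics v2.0 REST API and are illustrative only; check the current API reference before relying on them.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class RawSentimentExample
{
    public static async Task<string> GetSentimentJsonAsync(string apiKey, string text)
    {
        // 1. Initialise the HTTP communication stack.
        using (var client = new HttpClient())
        {
            // 2. Construct the Uri, headers and payload by hand.
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", apiKey);
            var uri = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment";

            // Naive payload construction; real code would JSON-serialise the text properly.
            var payload = "{\"documents\":[{\"id\":\"1\",\"language\":\"en\",\"text\":\"" + text + "\"}]}";
            var content = new StringContent(payload, Encoding.UTF8, "application/json");

            // 3. Send the request, then parse the response yourself.
            var response = await client.PostAsync(uri, content);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync(); // raw JSON still needs deserialising
        }
    }
}

And that is for a single operation against a single service; every additional call repeats the same ceremony.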
What if....
What if you wanted to do some sentiment plus keyphrase analysis and could simply do this:
var result = await TextAnalyticConfigurationSettings
.CreateUsingConfigurationKeys(TestConfig.TextAnalyticsApiKey, LocationKeyIdentifier.WestUs)
.SetDiagnosticLoggingLevel(LoggingLevel.Everything)
.AddDebugDiagnosticLogging()
.UsingHttpCommunication()
.WithTextAnalyticAnalysisActions()
.AddSentimentAnalysis("I am having a terrible time.")
.AddKeyPhraseAnalysis("This is a basic sentence. I have absolutely nothing to assert here.")
.AnalyseAllAsync();
and get back a nice strongly typed response?
Well, you can.
Introducing v1.0 of the Fluent API for Azure Cognitive Services.
With this fluent API, you can use simple syntax to perform multiple operations against the cognitive services with very concise code. The fluent library currently supports the TextAnalytics, Face and ComputerVision sets of APIs. Full source and documentation can be found here: https://github.com/glav/CognitiveServicesFluentApi
The nuget packages are available here:
- Core package (all others dependent upon)
- Text Analytics / Text Analytics Extensions
- Face / Face Extensions
- ComputerVision / ComputerVision Extensions
This library takes away the tedious aspects of communicating with the cognitive services and makes them even easier to use, but it also builds on this by adding a number of extensions or value-add features. These include (but are not limited to):
- Convenience methods that perform context specific functions on the result set, such as determining the number of positive responses in a sentiment analysis result set (see the sketch after this list).
- A default but customisable scoring system, with descriptive names, that makes the confidence levels or scores returned easier to interpret in a more semantic and fluent way.
- An easy way to provide your own customisations or extensions.
- Support for testability by allowing easy substitution of components in the pipeline, such as the communications stack.
- Automatic support for retry and back-off protocols, without you having to do anything. This makes using the free tier easy, within its limited usage restrictions. If Azure indicates that a request should wait for 2 seconds before trying again, that's exactly what happens.
- An easy mechanism to work with the "polling" APIs, where a job is submitted and you must query an endpoint for completion. This can be done in one line of code.
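As a taste of the convenience methods, the sketch below counts the positive sentiment results, reusing the result variable from the teaser example at the top of this post. It assumes a DefaultScoreLevels.Positive level exists alongside the DefaultScoreLevels.Negative level shown later in this post, and that LINQ is available; treat it as illustrative rather than a definitive API reference.

// Count how many sentiment results fall into the "Positive" score band.
// Assumes DefaultScoreLevels.Positive mirrors the documented DefaultScoreLevels.Negative
// and that System.Linq is in scope for Count().
var positiveCount = result.SentimentAnalysis.GetResults(DefaultScoreLevels.Positive).Count();
Console.WriteLine($"Positive responses: {positiveCount}");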
Examples
There are more detailed examples and documentation here (https://github.com/glav/CognitiveServicesFluentApi), but as a simple example, to perform sentiment analysis on a block of text you can simply do the following:
var result = await TextAnalyticConfigurationSettings
.CreateUsingConfigurationKeys("{your-api-key}", LocationKeyIdentifier.WestUs)
.SetDiagnosticLoggingLevel(LoggingLevel.Everything)
.UsingHttpCommunication()
.WithTextAnalyticAnalysisActions()
.AddSentimentAnalysis("I am having a fantastic time.")
.AnalyseAllAsync();
Similarly for Keyphrase analysis:
var result = await TextAnalyticConfigurationSettings
.CreateUsingConfigurationKeys(TestConfig.TextAnalyticsApiKey, LocationKeyIdentifier.WestUs)
.SetDiagnosticLoggingLevel(LoggingLevel.Everything)
.UsingHttpCommunication()
.WithTextAnalyticAnalysisActions()
.AddKeyPhraseAnalysis("This is a basic sentence. I have absolutely nothing to assert here.")
.AnalyseAllAsync();
And what about other cognitive services? Here is an example of ComputerVision support that will look for adult content, include tags, and look for celebrities:
var result = await ComputerVisionConfigurationSettings
.CreateUsingConfigurationKeys(TestConfig.ComputerVisionApiKey, LocationKeyIdentifier.SouthEastAsia)
.SetDiagnosticLoggingLevel(LoggingLevel.Everything)
.AddDebugDiagnosticLogging()
.UsingHttpCommunication()
.WithComputerVisionAnalysisActions()
.AddUrlForImageAnalysis("http://www.scface.org/examples/001_frontal.jpg",
ImageAnalysisVisualFeatures.Adult | ImageAnalysisVisualFeatures.Tags,
ImageAnalysisDetails.Celebrities, SupportedLanguageType.English)
.AnalyseAllAsync();
However, instead of doing each of those operations separately, we can group common API operations together; in this case, one call performs both keyphrase and sentiment analysis:
var result = await TextAnalyticConfigurationSettings
.CreateUsingConfigurationKeys(TestConfig.TextAnalyticsApiKey, LocationKeyIdentifier.WestUs)
.SetDiagnosticLoggingLevel(LoggingLevel.Everything)
.UsingHttpCommunication()
.WithTextAnalyticAnalysisActions()
.AddKeyPhraseAnalysis("This is a basic sentence. I have absolutely nothing to assert here.")
.AddSentimentAnalysis("I am having a fantastic time.")
.AnalyseAllAsync();
Doing this, you can perform as many operations as you like. Generally this means multiple API calls to the cognitive service; however, some services support batch operations (such as TextAnalytics), and this is automatically handled by the fluent API.
Results of the operations are stored in an object which contains a reference to the configuration used, the inputs passed in and, finally, the respective context objects that represent each set of operations. So in the keyphrase example above, the results can be obtained using something like:
var keyphrase = result.KeyPhraseAnalysis.AnalysisResult.ResponseData.documents[0].keyPhrases[0]; // "basic sentence"
And for the sentiment analysis:
var score = result.SentimentAnalysis.AnalysisResult.ResponseData.documents[0].score; // a double like 0.9
Convenience Extensions
For each cognitive service fluent API package there is a separate package that contains a series of extension methods acting as convenience methods or helpers. This also allows further extensions to be provided easily. An example using the Face detection API is:
var firstResult = result.FaceDetectionAnalysis.GetResults().First();
var isNoiseLevelLow = firstResult.IsNoiseLevel(NoiseLevel.Low);
var isGoodExposure = firstResult.IsExposureLevel(ExposureLevel.GoodExposure);
var isBlurLevelLow = firstResult.IsBlurLevel(BlurLevel.Low);
There are many more, but this is intended as a start.
Scoring system
Common to almost all cognitive service operations is a confidence score for the returned result. This value is always between 0 and 1 inclusive: 0 represents a negative, or low confidence, score whereas 1 represents a positive, or high confidence, score. The fluent API makes parsing these results much easier through a scoring system that is fully customisable. A default set of "scores" is provided.
The default score levels are the following:
- 0 - 0.35 : "Negative"
- 0.35 - 0.45 : "Slightly Negative"
- 0.45 - 0.55 : "Neutral"
- 0.55 - 0.75 : "Slightly Positive"
- 0.75 - 1 : "Positive"
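To illustrate how the default bands map a raw score to a level name, here is a small sketch. This is just an illustration of the table above, not the library's own implementation, and the handling of the boundary values is an assumption.

// Purely illustrative mapping of a raw confidence score to the default level names above.
// Boundary handling here is an assumption; the library's own rules may differ slightly.
static string ToDefaultScoreLevel(double score)
{
    if (score < 0.35) return "Negative";
    if (score < 0.45) return "Slightly Negative";
    if (score < 0.55) return "Neutral";
    if (score < 0.75) return "Slightly Positive";
    return "Positive";
}

// For example: ToDefaultScoreLevel(0.9) returns "Positive", ToDefaultScoreLevel(0.5) returns "Neutral".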
This allows you to do things like:
var items = result.SentimentAnalysis.GetResults();
var firstItem = items.First();
var score = result.SentimentAnalysis.Score(firstItem);
Console.WriteLine($"Score level is: {score.Name}");
You can also utilise convenience methods, such as the following, to get all negative results from a SentimentAnalysis operation:
var negativeResults = result.SentimentAnalysis.GetResults(DefaultScoreLevels.Negative);
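Building on that, here is a brief sketch that iterates those results and prints the descriptive score level for each one. It uses only the GetResults and Score calls shown above; the exact shape of each result item is not covered here, so treat it as a sketch rather than a reference.

// Iterate the negative results and print the descriptive score level name for each.
var negativeResults = result.SentimentAnalysis.GetResults(DefaultScoreLevels.Negative);
foreach (var item in negativeResults)
{
    var scoreLevel = result.SentimentAnalysis.Score(item);
    Console.WriteLine($"Negative result scored as: {scoreLevel.Name}");
}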
Wrapping up
This has really only touched the surface of what you can do. The Face API, for example, provides extensive support for not only face recognition, but also working with people groups, face detection and group processing. This library has been in the works for a little while, simply as a plaything that has grown over time. Rather than just let it lie, I thought I'd polish it up, release it, and see if it helps anyone or garners any interest. If not, nothing lost, as it's been a great learning experience. I certainly welcome any feedback. For full documentation, see the Github link below.
Github: https://github.com/glav/CognitiveServicesFluentApi
Nuget packages:
- Core package (all others dependent upon)
- Text Analytics / Text Analytics Extensions
- Face / Face Extensions
- ComputerVision / ComputerVision Extensions
Sonarcloud analysis: https://sonarcloud.io/dashboard?id=CognitiveFluentApi