Performance Analysis of the Watson NLU Service Using a Confusion Matrix

Some of you may already be using the Watson Natural Language Understanding (NLU) service to quickly extract metadata such as entities, keywords, sentiment, emotion, and relations from unstructured content.

    


In this blog post, we will analyze the accuracy of this service with the help of a confusion matrix.

A confusion matrix is not as confusing as the name suggests (pun intended). It is simply a table used to describe the performance of a classification model.

First of all, you need to be familiar with the confusion matrix. If you haven't heard of it before, refer to:

http://www.dataschool.io/simple-guide-to-confusion-matrix-terminology/

Now let's see how you can calculate it. Before calculating the matrix, you need two things in hand for a set of examples (it should be more than 50):

  1. Actual values

  2. Watson predicted values for the same examples

The sample data in my case was eCommerce Product review data.

Let's say you are analyzing sentiment in text. For each text, you need to store the actual sentiment value it should get and the value Watson predicted for it. For example:

-1  for  Negative

0   for  Neutral

1   for  Positive

You can define the classes based on the output you get from the Watson NLU service for the metadata you are extracting.

For sentiment, the NLU service outputs a score in the range [-1, 1], and my data was review based, so the three classes mentioned above suited it best.
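
To make the class definition concrete, here is a minimal sketch of turning a sentiment score in [-1, 1] into one of the three classes. The function name `score_to_class` and the 0.25 cut-off are my own assumptions for illustration, not something the NLU service prescribes, so pick a threshold that suits your data.

```python
def score_to_class(score, threshold=0.25):
    """Map a sentiment score in [-1, 1] to -1 (negative), 0 (neutral) or 1 (positive).

    The threshold is a hypothetical cut-off chosen for illustration only.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be in the range [-1, 1]")
    if score > threshold:
        return 1
    if score < -threshold:
        return -1
    return 0


# Made-up scores standing in for the values returned by the service.
scores = [0.82, -0.61, 0.05, -0.10, 0.40]
predicted = [score_to_class(s) for s in scores]
print(predicted)  # [1, -1, 0, 0, 1]
```

The same mapping should be applied to every review so that the predicted values and the actual values use the same three labels.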

Once you have the Watson predicted values and the actual values for the text you are analyzing, you can get a confusion matrix by running a simple Python script.
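
A script of this kind could look like the sketch below. It assumes scikit-learn is installed and uses its `confusion_matrix` function. The two lists are made-up toy data with only ten examples to keep the output readable, so replace them with your own labels.

```python
from sklearn.metrics import confusion_matrix

# Toy data for illustration only: -1 = negative, 0 = neutral, 1 = positive.
actual    = [1, 1, 0, -1, -1, 0, 1, -1,  0, 1]
predicted = [1, 0, 0, -1,  1, 0, 1, -1, -1, 1]

# Fix the label order so rows and columns always read negative, neutral, positive.
labels = [-1, 0, 1]

# Rows are the actual classes, columns are the predicted classes.
matrix = confusion_matrix(actual, predicted, labels=labels)
print(matrix)
# [[2 0 1]
#  [1 2 0]
#  [0 1 3]]
```

The diagonal holds the correctly classified examples, and every other cell counts one kind of mistake, for example the 1 in the top-right corner is a negative review predicted as positive.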


Once you have the confusion matrix, you can see how many examples are misclassified and how many are correctly classified, and calculate the accuracy and precision of the model.
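
As a rough sketch of that calculation in plain Python: accuracy is the sum of the diagonal divided by the total number of examples, and the precision of a class is its diagonal cell divided by the sum of its column. The helper names and the toy matrix below are assumptions for illustration, with rows as actual classes and columns as predicted classes.

```python
def accuracy_from_matrix(matrix):
    """Correctly classified examples (the diagonal) divided by all examples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def precision_per_class(matrix, labels):
    """For each class: correct predictions of that class / all predictions of that class."""
    result = {}
    for j, label in enumerate(labels):
        predicted_as_label = sum(row[j] for row in matrix)  # column total
        result[label] = matrix[j][j] / predicted_as_label if predicted_as_label else 0.0
    return result


# Toy matrix for illustration: rows = actual, columns = predicted, order -1, 0, 1.
labels = [-1, 0, 1]
matrix = [
    [2, 0, 1],
    [1, 2, 0],
    [0, 1, 3],
]

accuracy = accuracy_from_matrix(matrix)
precision = precision_per_class(matrix, labels)
print(accuracy)       # 0.7
print(precision[1])   # 0.75
```

Here 7 of the 10 examples sit on the diagonal, and 3 of the 4 examples predicted as positive really are positive.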

Accuracy can also be calculated directly with sklearn.
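
One way to do this, assuming scikit-learn is installed, is its `accuracy_score` function, which takes the actual and predicted labels directly. The lists below are made-up toy data.

```python
from sklearn.metrics import accuracy_score

# Toy data for illustration only: -1 = negative, 0 = neutral, 1 = positive.
actual    = [1, 1, 0, -1, -1, 0, 1, -1,  0, 1]
predicted = [1, 0, 0, -1,  1, 0, 1, -1, -1, 1]

# Fraction of examples where the predicted class equals the actual class.
accuracy = accuracy_score(actual, predicted)
print(accuracy)
```

For this toy data, 7 of the 10 predictions match the actual values.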


This way you will get to know how well the NLU service performs on your data.

The accuracy score I achieved on my review-based data was 0.648.

 

 
