With Azure Data Lake Analytics, beyond U-SQL's built-in analytics such as statistical percentiles, we've incorporated the best of Microsoft's machine learning in the form of cognitive capabilities. With U-SQL, you can perform simple image and text analysis: detect objects in pictures, detect human faces and their emotions, and detect text in photos. You can also extract sentiment and key phrases from text. Now we'll show you a demo of how to accomplish all this very easily using Azure Data Lake Analytics and U-SQL.
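For the text side mentioned above, the pattern is the same as for images. The following is only a sketch: the assembly names and the `Cognition.Text.SentimentAnalyzer` operator follow Microsoft's published U-SQL cognitive samples, while the input file, column schema, and output path are assumptions for illustration.

```sql
// Text cognitive assemblies (names per the U-SQL cognitive samples).
REFERENCE ASSEMBLY [TextCommon];
REFERENCE ASSEMBLY [TextSentiment];

// Hypothetical input: one piece of text per row in a CSV file.
@text =
    EXTRACT Id int,
            Text string
    FROM @"/input/reviews.csv"
    USING Extractors.Csv();

// Score each row's sentiment, with a confidence value.
@sentiment =
    PROCESS @text
    PRODUCE Id,
            Text,
            Sentiment string,
            Conf double
    USING new Cognition.Text.SentimentAnalyzer(true);

OUTPUT @sentiment
    TO "/output/Sentiment.csv"
    USING Outputters.Csv();
```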
Now here we are in the Azure portal, and as you can see we're looking at some files. I've uploaded four JPEGs from Wikimedia Commons, and we're going to show how Azure Data Lake Analytics, U-SQL, and our integrated cognitive capabilities let us see what's in these pictures without any prior knowledge of machine learning or object tagging and detection: just plain U-SQL. So here's a very simple script. Let me walk through what it's doing.
- First, all these assemblies are provided as part of U-SQL. As you can see, we have the FaceSdk for detecting faces and emotions, tagging for detecting objects in images, and OCR for detecting text in images.
- Next, we have a bit of U-SQL that reads a set of JPEGs using Cognition.Vision.ImageExtractor and stores the file name and the binary image data in a rowset called images. Then we use built-in image processing code, Cognition.Vision.ImageTagger, to detect the objects in each image: the number of objects and the tags, strings that describe what's in the image.
- Then finally, we output this as a simple comma-separated value file in the Azure Data Lake Store. One more thing I want to point out is the slider that sets AUs, analytics units: the number of containers that will be used to execute this query. This is a small query, so I'll just pick one container. But I can drag this up to a big number; if I had many images I could use 100, or increase it even further to 1,000 or 3,000.
Any number of containers, and I pay only for the containers I use. In this case, I'll leave it at 1 since I don't have that many images. So, now let's take a look at this job after it ran. As you can see, it completed: it read from these files, there were four images here, and it placed the output here. Let's take a look at the output file. In the four images, it detected a number of objects, and for the third image, which is called Flood Under Old Route 49 Bridge Crossing Over, it says outdoor; tree; bridge; ground; arch. So based on Microsoft's machine learning algorithms, this is what we think is in the image.
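Putting the three steps from the walkthrough together, the demo script looks roughly like the following sketch. The assembly names and the Cognition.Vision operators come from Microsoft's published U-SQL cognitive samples; the input pattern, column names, and output path are assumptions for illustration.

```sql
// Cognitive assemblies shipped with U-SQL.
REFERENCE ASSEMBLY ImageCommon;
REFERENCE ASSEMBLY FaceSdk;
REFERENCE ASSEMBLY ImageEmotion;
REFERENCE ASSEMBLY ImageTagging;
REFERENCE ASSEMBLY ImageOcr;

// Read every JPEG under /images into a rowset of (file name, raw bytes).
@images =
    EXTRACT FileName string,
            ImgData byte[]
    FROM @"/images/{FileName}.jpg"
    USING new Cognition.Vision.ImageExtractor();

// Tag the objects in each image: a count plus a string of descriptive tags.
@tags =
    PROCESS @images
    PRODUCE FileName,
            NumObjects int,
            Tags string
    READONLY FileName
    USING new Cognition.Vision.ImageTagger();

// Write the results to a CSV file in the Azure Data Lake Store.
OUTPUT @tags
    TO "/output/ImageTags.csv"
    USING Outputters.Csv();
```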