AI Developer on Azure

Learn how to build applications using Microsoft's Machine Learning and Artificial Intelligence services

Learning Plan - Artificial Intelligence Applications on Microsoft Azure

Developed by the Microsoft Artificial Intelligence and Research Team

Course Audience and Requirements

Level: Advanced
Audience: Software Developer, Data Scientist
Requirements: Software development experience with Python or another scientific computing language, and a Microsoft Azure subscription

Using the Microsoft Cognitive Toolkit for training and operationalizing intelligent applications using deep learning for Image Segmentation and Object Detection

Computer vision can be broadly described as the study of visual data. The amount of visual data being produced is growing at a remarkable pace, largely due to the number of sensors and cameras in the world (which by some estimates exceeds the number of humans on the planet) and the ubiquity of cheap data storage. Computer vision is a highly interdisciplinary field, touching on many areas of science, engineering, and technology. Prior to the deep learning revolution, practitioners tackling computer vision problems like object detection needed years of experience to learn how to identify the features in visual data required to solve the problem at hand.

In 2012, a revolution occurred in the field of computer vision, largely driven by the wide availability of image datasets and computing power. That revolution was the application of deep learning algorithms, specifically convolutional neural networks, to computer vision problems like image classification, where they set new state-of-the-art results by a large margin. Rather than relying on hand-crafted features to solve the image classification or object detection problem, deep learning algorithms automatically find the most useful representations of the data. The Microsoft Cognitive Toolkit (CNTK) has emerged as one of the most performant software tools for training and operationalizing deep learning algorithms. CNTK enables developers to build custom deep learning solutions that scale across multiple GPUs and systems and can be embedded in small mobile devices, covering the full end-to-end pipeline from training to production deployment.

This learning path will help you train and test your own deep learning object detection model using the Microsoft Cognitive Toolkit (CNTK) and deploy it on a mobile device.

Course Overview

Module | Topic | Description | Test Your Skills
Overview | Introduction to Deep Learning for Computer Vision | This lecture, the first from a course at Stanford, provides a great introduction to computer vision. It introduces the field and describes the approach we will be taking in this Learning Path. | Explain at a high level the success of convolutional neural networks in computer vision.
Setup | Deploy an Azure DSVM | In order to do deep learning, you'll need a system that meets the heavy computational requirements. The Azure DSVM (Data Science Virtual Machine) will be our primary workspace for training deep learning algorithms. It is already equipped with the drivers we need for GPU acceleration, CNTK, and a complete Python environment for writing your application. Make sure you select an NC6, NC12, or NC24 SKU in order to get GPU access. | Launch a Jupyter notebook, and do the data science walkthrough on your brand-new DSVM.
101-Course | edX - Deep Learning Explained | This is an introductory course covering the core concepts of CNTK and deep learning. The course runs for six weeks, but each week can be completed in a few hours. While we highly recommend this course, this learning path also provides links to tutorials you can do at your own pace. | Complete the first four modules of the course.
Concepts | Test CNTK on Synthetic Data | This first tutorial will give you an introduction to CNTK's Python interface and the process of using CNTK for machine learning. While the data here is synthetic, you'll get a firm foundation in the core concepts of CNTK, including data loading, training, and evaluation. | Run through the entire notebook and change the number of output classes.
OCR Recognition | Ingest and Explore Data with CNTK and Python's Scientific Packages | Now that you understand CNTK's core Python API for data processing and model training/evaluation, you are ready to try out your skills on a real dataset. The MNIST dataset is sometimes referred to as the "Hello, World!" of the machine learning world. Working through this dataset with CNTK is a great way of learning about deep learning and CNTK. | Explore the dataset in different ways: try shuffling the data, apply transformations to see how they distort the images, and add noise to see how that impacts generalization error.
OCR Recognition | Logistic Regression with MNIST | This tutorial continues our OCR problem using a simple logistic classifier. While not yet a "deep" neural network, this approach nonetheless reaches an accuracy of 93% and shows the standard workflow of CNTK: data reading, data processing, creating a model, learning, and evaluation. | Adjust the following parameters and see how they affect your test accuracy: minibatch size, number of sweeps, network architecture. Can you explain why these parameters affected the score the way they did?
OCR Recognition | Fully Connected Neural Network with Two Hidden Layers | Finally, a "deep" net! Here you will implement a fully connected neural network with two hidden layers. Your test score should go up to nearly 99%! | Try adding more hidden units. At what point does "overfitting" occur?
OCR Recognition | Convolutional Neural Network for MNIST | Convolutional neural networks (CNNs) are very similar to the perceptron and multi-layer neural network of the previous sections. The main difference is the assumptions they make about the data source. They explicitly assume the data source is an image and constrain the network architecture in a sensible way, inspired in part by our understanding of the visual cortex. Rather than connecting each neuron in the previous layer to every neuron in the next layer, CNNs use "local connections", where only a small input region (the receptive field) feeds each neuron in the next layer. Here you'll implement a CNN on the MNIST dataset, and if you're careful, you could even get past a 99% accuracy rate. | Try different activation functions and strides, and see how they impact performance.
Transfer Learning | Image Recognition with Transfer Learning | In the previous sections, you learned about CNTK's core API and trained a neural network for OCR recognition. While this worked well for our problem, our data was neither very large nor very varied. It consisted entirely of greyscale images with only 10 categories. In real-world examples, however, you'll frequently encounter images with far greater diversity and many more categories. Rather than training a very deep architecture for each new problem you encounter, transfer learning allows you to reuse an existing architecture trained on a large image corpus (usually ImageNet) and retrain only the last few layers to "adapt" to your new dataset. | Try this notebook with a different pretrained model, say ResNet-50, and a different dataset.
Transfer Learning | Neural Style Transfer | While not directly related to transfer learning, this module will show you how to use the optimization procedure we have used so far to optimize a loss function whose result is close to both the content of one image and the style of another. Here, you'll take a famous painting by van Gogh and synthesize its texture on any image of your choosing! | Try different images and different pre-trained network architectures.
Object Detection | Fast R-CNN Object Detection Tutorial | You now understand the core concepts of CNTK, image classification, transfer learning, and optimization of loss functions. Now you're going to apply all of those to the task of object detection! In particular, you will use an approach called Fast R-CNN. The basic method of R-CNN is to take a deep neural network that was originally trained for image classification using millions of annotated images and modify it for the purpose of object detection. | Try this same approach on the MS COCO dataset, a far richer and more challenging dataset for object detection.
Operationalization | Embarrassingly Parallel Image Classification on pySpark with HDInsight | Now that you have trained your deep neural network for the tasks of image classification and object detection, the question is how to operationalize it. In this module, you'll take an image classification model and deploy it on HDInsight Spark, where it can score thousands of images while taking advantage of both the distributed nature of Spark and the high throughput of Azure Data Lake. | Adapt the model to use the Fast R-CNN model you trained in the previous section.
Operationalization | Deploy CNTK on a Raspberry Pi | Now you'll take your CNTK architecture and deploy it on a simple Raspberry Pi. The Microsoft Embedded Learning Library allows you to build and deploy machine-learned pipelines onto embedded platforms, such as the Raspberry Pi, Arduino, micro:bits, and other microcontrollers. We'll take our image classification model and put it onto a Raspberry Pi, where it can do real-time classification using a video feed. | Decrease the size of your network architecture and see how that improves the latency of your real-time classification device.
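
The workflow named in the Logistic Regression module (data reading, creating a model, learning, and evaluation) can be tried out before you open the CNTK notebooks. The sketch below is not CNTK code. It is a plain-Python illustration on synthetic two-class data, and every function name and default value in it is an assumption made for this example. The minibatch_size and sweeps arguments correspond to the parameters the exercise asks you to adjust.

```python
import math
import random


def make_synthetic(n, seed):
    """Data reading stand-in: two Gaussian blobs in 2-D with labels 0 and 1."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        samples.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return samples


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, features):
    """Model: probability that the label is 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(samples, minibatch_size=16, sweeps=5, learning_rate=0.1, seed=0):
    """Learning: minibatch gradient descent on the cross-entropy loss."""
    rng = random.Random(seed)
    samples = list(samples)  # copy so the caller's list is not reordered
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(sweeps):  # one sweep = one pass over the training set
        rng.shuffle(samples)
        for start in range(0, len(samples), minibatch_size):
            batch = samples[start:start + minibatch_size]
            grad_w, grad_b = [0.0, 0.0], 0.0
            for features, label in batch:
                error = predict(weights, bias, features) - label
                grad_w = [g + error * x for g, x in zip(grad_w, features)]
                grad_b += error
            weights = [w - learning_rate * g / len(batch)
                       for w, g in zip(weights, grad_w)]
            bias -= learning_rate * grad_b / len(batch)
    return weights, bias


def accuracy(weights, bias, samples):
    """Evaluation: fraction of samples classified correctly at a 0.5 threshold."""
    correct = sum((predict(weights, bias, f) >= 0.5) == (y == 1)
                  for f, y in samples)
    return correct / len(samples)


train_set = make_synthetic(500, seed=1)
test_set = make_synthetic(200, seed=2)
weights, bias = train(train_set)
print("test accuracy:", accuracy(weights, bias, test_set))
```

Calling train with a different minibatch_size or sweeps value and comparing the resulting test accuracy mirrors the "Test Your Skills" exercise on a much smaller scale.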
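
The "local connections" described in the Convolutional Neural Network module can be illustrated without any deep learning library. The following is a minimal sketch, assuming a single-channel image stored as nested lists and a square kernel; conv2d is a hypothetical helper written for this example, not a CNTK function. Each output value depends only on one small patch of the input (its receptive field), and the stride controls how far the patch moves between outputs.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a 2-D image with no padding.

    Like most deep learning frameworks, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    k = len(kernel)
    out_rows = (len(image) - k) // stride + 1
    out_cols = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            top, left = r * stride, c * stride
            # The receptive field is the k-by-k patch starting at (top, left).
            total = sum(image[top + i][left + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
            row.append(total)
        output.append(row)
    return output


image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
kernel = [[1, 1],
          [1, 1]]

print(conv2d(image, kernel, stride=1))  # 3 x 3 output
print(conv2d(image, kernel, stride=2))  # 2 x 2 output: [[10, 18], [42, 50]]
```

With a stride of 2 the patches no longer overlap, so the output shrinks from 3 x 3 to 2 x 2. This is the trade-off between resolution and computation that the strides exercise asks you to explore.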
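
The idea behind the Transfer Learning module, reusing trained layers and retraining only the end of the network, can be sketched on a toy problem. Everything below is an assumption made for illustration: FROZEN_WEIGHTS is a hypothetical stand-in for the layers of a pretrained network (in practice these would be convolutional layers trained on a corpus such as ImageNet), and only the small logistic "head" on top of them is trained.

```python
import math
import random

# Hypothetical stand-in for pretrained layers. Stored as tuples so that
# training cannot modify them.
FROZEN_WEIGHTS = ((1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0))


def frozen_features(x):
    """Fixed linear layer followed by ReLU; never updated during training."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def head_predict(head_w, head_b, feats):
    """Trainable last layer: logistic classifier on the frozen features."""
    return sigmoid(head_b + sum(w * f for w, f in zip(head_w, feats)))


def train_head(inputs, labels, epochs=100, learning_rate=1.0):
    """Full-batch gradient descent that updates only the head parameters."""
    feats = [frozen_features(x) for x in inputs]  # computed once, base is fixed
    head_w, head_b = [0.0] * len(FROZEN_WEIGHTS), 0.0
    n = len(feats)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(head_w), 0.0
        for f, y in zip(feats, labels):
            error = head_predict(head_w, head_b, f) - y
            grad_w = [g + error * v for g, v in zip(grad_w, f)]
            grad_b += error
        head_w = [w - learning_rate * g / n for w, g in zip(head_w, grad_w)]
        head_b -= learning_rate * grad_b / n
    return head_w, head_b


# New task for the head: decide whether the first coordinate exceeds the second.
rng = random.Random(0)
inputs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
labels = [1 if x0 > x1 else 0 for x0, x1 in inputs]

head_w, head_b = train_head(inputs, labels)
correct = sum((head_predict(head_w, head_b, frozen_features(x)) >= 0.5) == (y == 1)
              for x, y in zip(inputs, labels))
train_accuracy = correct / len(inputs)
print("training accuracy with a frozen base:", train_accuracy)
```

Because the frozen features are computed once and reused in every epoch, training touches only five numbers (four head weights and a bias). The same economy is what makes retraining the last few layers of a large pretrained network practical.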
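
The Neural Style Transfer module describes a loss that is close to the content of one image and the style of another. One common formulation, assumed here rather than taken from the tutorial itself, compares feature activations directly for content and compares Gram matrices of those activations for style. The sketch below uses tiny hand-written lists in place of real network activations; the function names, the weights, and the lack of normalization are all illustrative choices.

```python
def gram_matrix(features):
    """Channel-by-channel inner products.

    `features` is a list of channels, each a flat list of activations.
    The result records which channels fire together, not where they fire.
    """
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]


def mean_squared_difference(a, b):
    """Mean squared difference between two equally shaped 2-D lists."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def total_loss(generated, content, style, content_weight=1.0, style_weight=1.0):
    """Weighted sum of a content term and a Gram-matrix style term."""
    content_loss = mean_squared_difference(generated, content)
    style_loss = mean_squared_difference(gram_matrix(generated),
                                         gram_matrix(style))
    return content_weight * content_loss + style_weight * style_loss


# Two channels with two spatial positions each.
generated = [[1.0, 0.0], [0.0, 1.0]]
shifted = [[0.0, 1.0], [1.0, 0.0]]  # same activations at swapped positions

print(gram_matrix(generated))
print(total_loss(generated, content=generated, style=shifted))
print(total_loss(generated, content=shifted, style=generated))
```

The second call illustrates why Gram matrices capture texture rather than layout: the style image has its activations at different positions, yet the style term is zero because the channel correlations match. In the third call the content term alone is nonzero, since the content target differs position by position.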