Learning Plan - Artificial Intelligence Applications on Microsoft Azure
Developed by the Microsoft Artificial Intelligence and Research Team
Course Audience and Requirements
|Requirements:||Software development, Microsoft C#, desktop applications, and working with Application Programming Interfaces (APIs)|
Getting Started with AI and Cognitive Systems
Artificial Intelligence is a vastly complicated set of topics. A simple definition is that Artificial Intelligence is when computers do things that humans normally do. Almost every AI system involves three basic phases, modeled on how the brain works: acquire, process, and respond.
Acquire
In this phase, the system takes in information or stimuli in some way. Note that acquiring doesn't have to involve external senses like sight, sound, or touch. It can also be the process of getting new or different information from another subsystem. This is the part of an AI program that you write to bring in data, wherever that data comes from. The output of one "Respond" phase can feed the "Acquire" phase of another part of your AI program.
Process
In this phase, the data is processed in some way. Disciplines such as Machine Learning and Deep Learning, and algorithms within them such as Neural Networks, are used to process the data from the inputs. The Cognitive Services API for computer vision offers an example: input from the computer's camera is passed on to another API for image recognition.
Respond
In this phase, the AI system either performs an action, sends the processed data along to another AI function, or both. AI programs work in this way - as a series of systems for acquiring information, processing it, and responding according to a set of rules. AI is a chain of these smaller systems working together to mimic human behavior.
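The chaining described above can be sketched in a few lines. This is only an illustrative sketch, written in Python for brevity rather than the C# used by the sample application. The names `Stage`, `run_chain`, `parse_command`, and `build_reply` are hypothetical and do not come from FamilyNotes or any Cognitive Services SDK. Each stage acquires its input, processes it, and responds, and the response of one stage becomes the input that the next stage acquires.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass
class Stage:
    """One acquire/process/respond unit (hypothetical helper, for illustration only)."""
    acquire: Callable[[Any], Any]   # bring data in and normalize it
    process: Callable[[Any], Any]   # interpret or transform the data
    respond: Callable[[Any], Any]   # act on the result or hand it onward

    def run(self, stimulus: Any) -> Any:
        data = self.acquire(stimulus)
        result = self.process(data)
        return self.respond(result)


def run_chain(stages: Iterable[Stage], stimulus: Any) -> Any:
    """Feed each stage's response into the next stage's acquire step."""
    for stage in stages:
        stimulus = stage.run(stimulus)
    return stimulus


def parse_command(text: str) -> dict:
    """Toy stand-in for speech understanding: recognize 'Add a note for <name>'."""
    prefix = "add a note for "
    if text.lower().startswith(prefix):
        return {"action": "add_note", "recipient": text[len(prefix):].strip()}
    return {"action": "unknown", "recipient": None}


def build_reply(command: dict) -> str:
    """Turn the parsed command into a message for the user."""
    if command["action"] == "add_note":
        return f"Note created for {command['recipient']}"
    return "Sorry, I did not understand that."


# Stage 1 acquires raw text and responds with a structured command.
command_stage = Stage(acquire=str.strip, process=parse_command, respond=lambda c: c)
# Stage 2 acquires that command and responds with a user-facing message.
reply_stage = Stage(acquire=lambda c: c, process=build_reply, respond=lambda m: m)

message = run_chain([command_stage, reply_stage], "  Add a note for Bob ")
print(message)
```

In a real application the `process` step would call a recognition service instead of matching a string prefix, but the shape of the chain stays the same.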
You can use the following resources to create an application that follows the acquire/process/respond pattern in AI. Once you've completed this application, you'll be able to code other applications that use more Cognitive Services.
FamilyNotes is a Universal Windows Platform (UWP) application that explores different input modalities and scenarios of user awareness. It is a bulletin-board app that allows family members to leave notes for each other on a common PC or tablet, just as they would on a bulletin board. Using text, speech, ink, or pictures, a user can create a note and tag it for another user. Later, when that other user approaches the PC or tablet, the app uses imaging APIs and the Microsoft Cognitive Services (Face API) to detect their presence and display the notes that have been left for them, effectively filtering based on facial recognition. While the app is open, users can interact with it naturally using speech ("Add a note for Bob"). If the app isn't open, a user can launch it and interact with it using Cortana.
|Module||Topic||Description||Test your skills|
|Overview||Using Microsoft Cognitive Services in your applications||Let's begin with this short video that explains the Microsoft Cognitive Services and shows a sample application.||Explain to a friend what Cognitive Services are and give an example of how you can use them in an application.|
|Introduction and Setup||Install and configure Visual Studio||To get started, you'll need Visual Studio to create and run your Cognitive Services application. The free Community Edition works fine.||Open Visual Studio and create a simple application.|
|Introduction and Setup||Copy the sample application||Next, you'll need to download the sample application. You'll work through other example solutions, but this one has the complete application used for this learning path.||Open the ZIP file and expand it to a directory on your computer. Open the Visual Studio solution (the ".sln" file) in Visual Studio.|
|Introduction and Setup||Working with the Universal Windows Platform SDK||The Universal Windows Platform (UWP) is the app platform for Windows 10. We'll be using this SDK throughout the Learning Path to work with vision, text, and speech. This overview gives you an introduction to what the SDK is.||First, set up the SDK, and then complete a sample application.|
|The Acquire Phase||Accessing the Camera in Windows||We'll start the Acquire Phase with the built-in Windows camera UI. You'll learn how to create Universal Windows Platform (UWP) apps that use the camera to capture photos, video, or audio. This GitHub sample demonstrates the Windows.Media.Capture API and how to use it.||Create a basic app that has photo, video, and audio capture using MediaCapture.|
|The Acquire Phase||Working with Handwriting||Now that we have the video and audio interfaces created, we're ready to add the next human-interaction part of the application: handwriting. We'll learn how to take input using the InkCanvas API. This GitHub-based project (scroll down to the README file) shows how to use ink functionality (such as capturing ink from user input and performing handwriting recognition on ink strokes) in Universal Windows apps using C#.||Create an app that supports writing and drawing with Windows Ink.|
|The Acquire Phase||Acquiring Speech with Cortana||With both Acquire and Processing capabilities in place, we'll now focus on Cortana, Microsoft's intelligent assistant, and explain how it can start the sample application and set and receive messages using natural language. This GitHub sample demonstrates how to integrate your app with Cortana voice commands. It covers authoring and installing a Voice Command Definition file, and shows how your app can respond to being activated by Cortana.||Ensure you can start an application using Cortana.|
|Processing Phase||Processing speech with the Windows.Media.SpeechRecognition Namespace||One of the most complicated (and powerful) capabilities in human-like processing in AI is working with speech. The Windows.Media.SpeechRecognition namespace provides functionality with which you can acquire and monitor speech input, create speech recognition grammars that produce both literal and semantic recognition results, capture information from events generated by speech recognition, and configure and manage speech recognition engines. This resource has an overview of this technology in Windows, a set of articles at the bottom of the page to learn more, and samples you should work through.||Make a UWP app respond with speech.|
|Processing Phase||Recognizing a Face Image||We'll continue the Processing Phase with the video we've captured. We'll zero in on the face of the user using the FaceDetectionEffect class of the Windows.Media.Core namespace. Next, we'll take that area of the video and pass it along to our first Cognitive Services call, the Face API. Using this service, we'll first upload a still shot of the user, which the Face API can then compare against previously tagged people in images to identify them. This lets the user set or get messages. This GitHub sample shows you how to work with the APIs.||Create a simple app that detects faces in an image.|
|Processing Phase||Working with the Cognitive Services API||We're ready to make our first call to the Cognitive Services API. In this step we'll learn how to take the images and send them to the API for processing, and how to interpret the results. Read this resource and the API guides it references.||Work through this tutorial to ensure you know how to make calls to the Cognitive Services API.|
|The Response Phase||Compiling and Running the Sample Application||We're ready to try out the application. We'll compile the application, open the settings, and enter the information for the Cognitive Services API. We'll then test the application with video, voice, and handwriting.||Download the application at this link, and then open the "FamilyNotes.sln" file in Visual Studio. Compile the application, set your API key, and try it out!|
|Next Steps||Other Cognitive Services||Now you're ready to branch out and create your own applications using more of the Cognitive Services. Take a look at the list here and follow the samples provided.||Create and deploy more AI applications.|
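As a rough illustration of the "send the images to the API" step in the Processing Phase, the sketch below builds (but does not send) an HTTP request for the Face API detect operation. It is written in Python for brevity rather than the C# used by FamilyNotes, and rests on several assumptions: the function name `build_face_detect_request` is hypothetical, the endpoint and key are placeholders you would replace with the values from your own Azure resource, and the `/face/v1.0/detect` path and `Ocp-Apim-Subscription-Key` header follow the publicly documented REST interface, which may change between service versions.

```python
import urllib.request
from urllib.parse import urlencode


def build_face_detect_request(endpoint: str,
                              subscription_key: str,
                              image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request that asks the Face API to detect faces in an image.

    Assumption: the endpoint looks like https://<your-resource>.cognitiveservices.azure.com
    """
    # Ask only for face rectangles, not persistent face IDs.
    query = urlencode({"returnFaceId": "false"})
    url = f"{endpoint.rstrip('/')}/face/v1.0/detect?{query}"
    headers = {
        # The subscription key authenticates the call to Cognitive Services.
        "Ocp-Apim-Subscription-Key": subscription_key,
        # The image is sent as raw binary data in the request body.
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


# Placeholder values for illustration only; nothing is sent over the network here.
request = build_face_detect_request(
    "https://example.invalid/", "YOUR-API-KEY", b"\x89PNG-placeholder-bytes"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` with a real endpoint, key, and image would send it. The service is expected to answer with a JSON array containing one entry per detected face, which the application would then interpret in its Respond phase.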