Building your first Alexa Skill — Part 1

By September 26, 2019 November 13th, 2019 AI, Alexa, Blogs, Machine Learning, ML

Written by Tejaswee Das, Software Engineer, Powerupcloud Technologies

Technological advancement in the area of Artificial Intelligence & Machine Learning has not only helped systems to become more intelligent but has also made them more vocational. You can just speak to the phone & add items to your shopping list or just instruct your laptop to read your email. In this fast-growing era of voice-enabled automation, Amazon’s Alexa enabled devices are changing the way people go through their daily routines. In fact, it has introduced a new term in the dictionary, Intelligent Virtual Assistant (IVA).

Technopedia defines Intelligent Virtual Assistant as an engineered entity residing in software that interfaces with humans in a human way. This technology incorporates elements of interactive voice response and other modern artificial intelligence projects to deliver full-fledged “virtual identities” that converse with users.”

Some of the most commonly used IVAs are Google Assistant, Amazon Alexa, Apple Siri, Microsoft Cortana, with Samsung Bixby joining the already brimming list lately. Although IVAs seem to be technically charged, they bring enormous automation & value. Not only do they make jobs for humans easier, but they also optimize processes and reduce inefficiencies. These systems are so seamless, that just a simple voice command is required to get tasks completed.

The future of personalized customer experience is inevitably tied to “Intelligent Assistance”. –Dan Miller, Founder, Opus Research

So let’s bring our focus to Alexa, Amazon’s IVA. Alexa is Amazon’s cloud-based voice service, which can interface with multiple devices on Amazon. Alexa gives you the power to create applications, which have the capability to interact in natural language, making your systems more intuitive to interact with technology. Its capabilities mimic those of other IVAs such as Google Assistant, Apple Siri, Microsoft Cortana, and Samsung Bixby.

The Alexa Voice Service (AVS) is Amazon’s intelligent voice recognition and natural language understanding service that allows you to voice-enable any connected device that has a microphone and a speaker.

Powerupcloud has worked on multiple use-cases, where they have developed Alexa voice automation. One of the most successful & adopted use cases being one of the largest General Insurance providers.

This blog series aims at giving a high-level overview of building your first Alexa Skills. It has been divided into two parts, first, covering the required configurations for setting up the Alexa skills, while the second focuses on the approach for training the model and programming.

Before we dive in to start building our first skill, let’s have a look at some Alexa terminologies.

  • Alexa Skill — It is a robust set of actions or tasks that are accomplished by Alexa. It provides a set of built-in skills (such as playing music), and developers can use the Alexa Skills Kit to give Alexa new skills. A skill includes both the code (in the form of a cloud-based service) and the configuration provided on the developer console.
  • Alexa Skills Kit — A collection of APIs, tools, and documentation that will help us work with Alexa.
  • Utterances — The words, phrases or sentences the user says to Alexa to convey a meaning.
  • Intents — A representation of the action that fulfils the user’s spoken request.

You can find the detailed glossary at

https://developer.amazon.com/docs/ask-overviews/alexa-skills-kit-glossary.html

Following are the prerequisites to get started with your 1st Alexa skill.

  1. Amazon Developer Account (Free: It’s the same as the account you use for Amazon.in)
  2. Amazon Web Services (AWS) Account (Recommended)
  3. Basic Programming knowledge

Let’s now spend some time going through each requirement in depth.

We need to use the Amazon Developer Portal to configure our skill and build our model which is a necessity.

  • Click on Create Skill, and then select Custom Model to create your Custom Skill.

Please select your locale carefully. Alexa currently caters to English (AU), English (CA), English (IN), English (UK), German (DE), Japanese (JP), Spanish (ES), Spanish (MX), French (FR), and Italian (IT). We will use English (IN) while developing the current skill.

  • Select ‘Start from Scratch’
  • Alexa Developer Console
  • Enter an Invocation Name for your skill. Invocation name should be unique because it identifies Skills. Invocation Name is what you say Alexa to invoke or activate your skill.

There are certain requirements that your Invocation name must strictly adhere to.

  • Invocation name should be two or more words and can contain only lowercase alphabetic characters, spaces between words, possessive apostrophes (for example, “sam’s science trivia”), or periods used in abbreviations (for example, “a. b. c.”). Other characters like numbers must be spelt out. For example, “twenty-one”.
  • Invocation names cannot contain any of the Alexa skill launch phrases such as “launch”, “ask”, “tell”, “load”, “begin”, and “enable”. Wake words including “Alexa”, “Amazon”, “Echo”, “Computer”, or the words “skill” or “app” are not allowed. Learn more about invocation names for custom skills.
  • Changes to your skill’s invocation name will not take effect until you have built your skill’s interaction model. In order to successfully build, your skill’s interaction model must contain an intent with at least one sample utterance. Learn more about creating interaction models for custom skills.
  • Endpoint — The Endpoint will receive POST requests when a user interacts with your Alexa Skill. So this is basically the backend for your Alexa Skill. You can host your skill’s service endpoint either using AWS Lambda ARN, which is recommended, or a simple HTTPS endpoint. Advantages of using an AWS Lambda ARN are :
  • Sign in to AWS Management Console at https://aws.amazon.com/console/
  • Lookup for Lambda in AWS services
  • US East (N. Virginia)
  • EU (Ireland)
  • US West (Oregon)
  • Asia Pacific(Tokyo)

We are using Lambda in the N.Virginia (us-east-1) region.

  • Once we are in a supported region, we can go ahead to create a new function. There are three different options for creating your function. You can create a function from scratch or you can also use available Blueprints and Serverless Application Repositories.
  • C# / .NET
  • Go
  • Java
  • NodeJS
  • Python

We will discuss programming Alexa with different languages in the next part of this series.

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html

  • Go back to the Endpoint section in Alexa Developer Console, and add the ARN we had copied from Lambda in AWS Lambda ARN Default Region.

ARN format — arn:aws:lambda:us-east-1:XXXXX:function:function_name

In, part 2, we will discuss the training our model — adding Intents & Utterances, finding walkarounds for some interesting issues we faced, making workflows using dialog state, understanding the Alexa Request & Response JSON, and finally our programming approach in Python.

Leave a Reply