The range of custom Alexa skills is wide, extending from searching for a retail store in a city to assisting drones in military operations. The steps for creating a custom skill are:
- The user interacts through the Wake Word (trigger), the Invocation Name (tag), and utterances. The Alexa engine in the Echo device maps the request to the appropriate custom Alexa skill based on the invocation name.
- The Alexa service analyzes the user's utterance, resolves it to an intent, and sends the request to the back-end system (an AWS Lambda function) mapped to the skill.
- The AWS Lambda function handles the intent with the appropriate intent handler.
- After gathering the required input, the AWS Lambda function calls third-party or other APIs to get the information and, based on what it retrieves, calls further APIs as needed to fetch the details.
- The JSON response is built with the Speech Synthesis Markup Language (SSML) builder and consumed by the Amazon Echo device, which plays the audio response.
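The steps above can be sketched as a minimal Lambda handler. The intent name (`NearestStoreIntent`), the `city` slot, and the store data are illustrative assumptions, not from the original text; the response envelope follows the JSON shape the Alexa service consumes:

```python
# Hypothetical store data, used instead of a real third-party API call.
STORE_LOCATIONS = {"Minnesota": "123 Main Street, Minneapolis"}

def build_ssml_response(ssml_text):
    """Wrap SSML speech in the JSON response envelope the Echo device consumes."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": f"<speak>{ssml_text}</speak>"},
            "shouldEndSession": True,
        },
    }

def lambda_handler(event, context):
    """Entry point invoked by the Alexa service with a JSON request envelope."""
    request = event["request"]
    if (request["type"] == "IntentRequest"
            and request["intent"]["name"] == "NearestStoreIntent"):
        # Read the slot value supplied by the user's utterance.
        city = request["intent"]["slots"]["city"]["value"]
        address = STORE_LOCATIONS.get(city, "an unknown location")
        return build_ssml_response(f"The nearest store in {city} is at {address}.")
    return build_ssml_response("Sorry, I did not understand that.")
```

In a real skill, the dictionary lookup would be replaced by the third-party API call described above.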
Note: An Amazon Developer Account is mandatory to create the skill and to define its intents with slot values.
The steps required for custom skill creation in the Amazon Developer Console are:
- Configuring Skill Information
- Setting Interaction Model
- Registering Skill Configuration
- Testing Custom Alexa Skill
Setting Alexa Skill Information
In the Skill Information step, the user creates the Alexa skill with an invocation name. The invocation name determines what the user says to start the skill.
Setting Alexa Interaction Model
The interaction model defines the business logic of the voice user interface, expressed as an intent schema. The intent schema is a JSON object containing all the intents the user builds and covering the possible meanings of the utterances. Based on the interaction model, the following actions are performed.
Different user utterances can carry the same or different meanings; Alexa groups them into intents, and each intent is defined by its sample utterances. For example, “Alexa, answer the location of my nearest store”.
Slots supply the input data needed to fulfill an intent. For example, when the user utters “Alexa, where is the nearest store from Minnesota”, the slot is of type city and its value is Minnesota. The city slot's possible values are defined as an enumeration.
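As a sketch, the intent schema for this store-lookup example might look like the following interaction-model JSON; the invocation name, intent name, slot type name, and city values are illustrative assumptions:

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "store finder",
      "intents": [
        {
          "name": "NearestStoreIntent",
          "slots": [{ "name": "city", "type": "LIST_OF_CITIES" }],
          "samples": [
            "where is the nearest store from {city}",
            "answer the location of my nearest store"
          ]
        }
      ],
      "types": [
        {
          "name": "LIST_OF_CITIES",
          "values": [
            { "name": { "value": "Minnesota" } },
            { "name": { "value": "Ohio" } }
          ]
        }
      ]
    }
  }
}
```

The `types` entry is the enumeration of city slot values mentioned above, and the `samples` list shows utterances both with and without the slot filled.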
Alexa asks follow-up (feedback) questions when it expects the user to supply data for a slot that is required to fulfill the intent. For example, the feedback question entered is “for which city”.
Alexa matches the user's reply against the sample utterances in the interaction model. For example, the sample answer from the user may be the city name alone. For this reason, utterances are provided both with the slot value filled and with the slot value empty.
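One way the back end can drive this feedback question is to check whether the slot arrived empty and, if so, ask for it with a `Dialog.ElicitSlot` directive. The handler and slot names here are illustrative assumptions:

```python
def handle_nearest_store(intent):
    """Fulfill the intent if the city slot is filled; otherwise elicit it.

    The intent/slot names are hypothetical; Dialog.ElicitSlot is the
    directive Alexa uses to ask a follow-up question for a single slot.
    """
    city = intent["slots"].get("city", {}).get("value")
    if not city:
        # Slot is empty: ask the feedback question and keep the session open.
        return {
            "version": "1.0",
            "response": {
                "outputSpeech": {"type": "PlainText", "text": "For which city?"},
                "directives": [{"type": "Dialog.ElicitSlot",
                                "slotToElicit": "city"}],
                "shouldEndSession": False,
            },
        }
    # Slot is filled: proceed to fulfill the intent.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText",
                             "text": f"Looking up the nearest store in {city}."},
            "shouldEndSession": True,
        },
    }
```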
Setting up Skill Configuration
After the interaction model is configured, the user links the skill to the back-end system so that invoking the Alexa device routes requests to it.
AWS Lambda receives the intent and the user's request in decoded JSON form for processing the skill. Alexa maintains the Lambda Amazon Resource Name (ARN) for each custom skill. The custom skill code parses the JSON, reads the intent and context, and performs the applicable processing to retrieve the data from the appropriate APIs.
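The parsing step can be sketched as follows: pull the request type, intent name, and slot values out of the JSON request envelope before dispatching to an intent handler. The envelope fields (`request`, `type`, `intent`, `slots`) follow the Alexa request format; the sample intent and slot are illustrative:

```python
import json

def parse_request(raw_json):
    """Parse the JSON request envelope sent by Alexa and return the pieces
    the skill code dispatches on: request type, intent name, slot values."""
    event = json.loads(raw_json)
    request = event["request"]
    intent = request.get("intent", {})
    # Collapse each slot object down to its value (None when unfilled).
    slots = {name: slot.get("value")
             for name, slot in intent.get("slots", {}).items()}
    return request["type"], intent.get("name"), slots
```

With the request parsed this way, the skill code can branch on the intent name and pass the slot values to the API calls described above.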