Emotion & Tone (BETA)

Enable this model configuration to analyze the speaker's tone (acoustic) and emotions based on the spoken text (Lexical Emotion Analysis).

Overview

Emotion Analysis

The Emotion Analysis model helps you understand and interpret speaker emotions in a conversation or text. It is designed to understand human conversation in the form of free text or spoken text, and is modeled after the emotion wheel.

The emotion wheel describes eight basic emotions: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust.

Emotion Types

Types of Emotions detected by enabling this model configuration in the Speech Analytics API:

Admiration, Amusement, Anger, Annoyance, Approval, Caring, Confusion, Curiosity, Desire, Disappointment, Disapproval, Disgust, Embarrassment, Excitement, Fear, Gratitude, Grief, Joy, Love, Nervousness, Optimism, Pride, Realization, Relief, Remorse, Sadness, Surprise, Neutral
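The sample response further down this page reports the emotion value in upper case (`JOY`), while the labels above are written in title case. A minimal sketch of a case-insensitive check against the documented labels is shown here; `KNOWN_EMOTIONS` and `is_known_emotion` are hypothetical helper names, not part of the Marsview API, and the exact casing the API returns is an assumption based on that one sample.

```python
# Labels copied from the list above. The set and the helper are illustrative
# names only; they are not provided by the Marsview API.
KNOWN_EMOTIONS = frozenset(
    """Admiration Amusement Anger Annoyance Approval Caring Confusion Curiosity
    Desire Disappointment Disapproval Disgust Embarrassment Excitement Fear
    Gratitude Grief Joy Love Nervousness Optimism Pride Realization Relief
    Remorse Sadness Surprise Neutral""".lower().split()
)


def is_known_emotion(label: str) -> bool:
    """Return True if the label matches a documented emotion, ignoring case."""
    return label.strip().lower() in KNOWN_EMOTIONS


print(is_known_emotion("JOY"))
print(is_known_emotion("Trust"))
```

A check like this can flag unexpected labels early instead of silently passing them to downstream reporting.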

Tone Analysis

Tone Analysis suggests speaker emotion using only audio cues. A speaker may express emotion through tone of voice rather than word choice. Capturing this is important for understanding the overall sentiment or mood of the conversation, which conventional Lexical Emotion Analysis cannot extract.

Marsview's proprietary Tone Analysis AI can detect intonation in the speaker's tone down to the statement level.

Types of Tone

Marsview is capable of detecting the following tones in an audio file:

negative, positive, neutral, slightly-negative

`modelType` Configuration

| Keys | Value |
| --- | --- |
| modelType | emotion_analysis |
| modelConfig | Model Configuration object for emotion_analysis (no configurations) |
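Since `emotion_analysis` takes no configuration, enabling it amounts to adding one entry to `enableModels`. The Python example on this page notes that emotion analysis depends on the speech-to-text output, so both models are listed. The following is a minimal sketch of building such a request body; `build_compute_payload` is a hypothetical helper, and the speech-to-text settings simply mirror the example request on this page (including its `speaker_seperation` key spelling) rather than a confirmed minimum.

```python
import json


def build_compute_payload(txn_id: str, num_speakers: int = 2) -> dict:
    """Build a request body enabling speech_to_text and emotion_analysis.

    Hypothetical helper: key names are copied from the example request on
    this page, not from a verified schema.
    """
    return {
        "txnId": txn_id,
        "enableModels": [
            {
                "modelType": "speech_to_text",
                "modelConfig": {
                    "automatic_punctuation": True,
                    "custom_vocabulary": ["Marsview", "Communication"],
                    "speaker_seperation": {"num_speakers": num_speakers},
                    "enableKeywords": True,
                    "enableTopics": False,
                },
            },
            # emotion_analysis has no configuration, so modelConfig is omitted.
            {"modelType": "emotion_analysis"},
        ],
    }


payload = build_compute_payload("example-txn-id")
print(json.dumps(payload, indent=2))
```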

Example Request

curl --location --request POST 'https://api.marsview.ai/cb/v1/conversation/compute' \
--header 'Content-Type: application/json' \
--header "Authorization: {{Insert Auth Token With Type}}" \
--data-raw '{
        "txnId": "{{Insert txn ID}}",
        "enableModels":[
            {
            "modelType":"speech_to_text",
                "modelConfig":{
                    "automatic_punctuation" : true,
                    "custom_vocabulary":["Marsview", "Communication"],
                    "speaker_seperation":{
                        "num_speakers":2
                    },
                    "enableKeywords":true,
                    "enableTopics":false
                    }
            },
            {
            "modelType":"emotion_analysis"
            }
        ]
}'

import requests

auth_token = "replace this with your auth token"
txn_id = "replace this with your txn id"
request_url = "https://api.marsview.ai/cb/v1/conversation/compute"

# Note: emotion analysis depends on the output of the speech-to-text model,
# so both models must be enabled in the request for this to work.
def get_emotion_and_tone():
    payload = {
        "txnId": txn_id,
        "enableModels": [
            {
                "modelType": "speech_to_text",
                "modelConfig": {
                    "automatic_punctuation": True,
                    "custom_vocabulary": ["Marsview", "Communication"],
                    "speaker_seperation": {
                        "num_speakers": 2
                    },
                    "enableKeywords": True,
                    "enableTopics": False
                }
            },
            {
                "modelType": "emotion_analysis"
            }
        ]
    }
    headers = {"authorization": auth_token}

    # The endpoint URL must be passed explicitly as the second argument.
    response = requests.request("POST", request_url, headers=headers, json=payload)
    print(response.text)
    if response.status_code == 200 and response.json()["status"] == "true":
        return response.json()["data"]["enableModels"]["state"]["status"]
    raise Exception("Compute request failed with HTTP status {}".format(response.status_code))

if __name__ == "__main__":
    get_emotion_and_tone()

Example Metadata Response

"data": {
    "emotion": [
        {
            "transcript": "Good evening teresa.",
            "startTime": 1390,
            "endTime": 2690,
            "speaker": "1",
            "tone": {
                "value": "calm",
                "confidence": 0.9030694961547852
            },
            "emotion": {
                "confidence": 0.9549336433410645,
                "value": "JOY"
            },
            "wordsPerMinute": 92.3076923076923
        }
    ]
}

Response Object

| Field | Description |
| --- | --- |
| emotion | A list of emotion objects. |
| transcript | The sentence for which emotion is being analyzed. |
| startTime | Start time of the sentence in the input video/audio, in milliseconds. |
| endTime | End time of the sentence in the input video/audio, in milliseconds. |
| speaker | ID of the speaker whose voice is identified in the given time frame. |
| tone | Object that describes the tone of the speaker. |
| tone[value] | Tone of the speaker in the given time frame. |
| tone[confidence] | Value indicating the model's confidence in the predicted tone value. |
| emotion (object) | Object that describes the emotion of the speaker. |
| emotion[confidence] | Value indicating the model's confidence in the predicted emotion value. |
| emotion[value] | Emotion of the speaker in the given time frame. |
| wordsPerMinute | Average words per minute spoken by the speaker. |
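To show how these fields fit together, here is a minimal sketch that tallies emotion labels per speaker and converts a statement's timestamps to seconds. The sample data is copied from the example response above; `summarize_emotions`, `duration_seconds`, and the `min_confidence` threshold of 0.5 are hypothetical choices for illustration, not part of the Marsview API.

```python
from collections import Counter, defaultdict

# Sample shaped like the example metadata response on this page.
sample = {
    "data": {
        "emotion": [
            {
                "transcript": "Good evening teresa.",
                "startTime": 1390,
                "endTime": 2690,
                "speaker": "1",
                "tone": {"value": "calm", "confidence": 0.9030694961547852},
                "emotion": {"confidence": 0.9549336433410645, "value": "JOY"},
                "wordsPerMinute": 92.3076923076923,
            }
        ]
    }
}


def summarize_emotions(response: dict, min_confidence: float = 0.5) -> dict:
    """Count emotion labels per speaker, skipping low-confidence predictions.

    min_confidence is an arbitrary illustrative cutoff.
    """
    counts = defaultdict(Counter)
    for item in response.get("data", {}).get("emotion", []):
        emotion = item["emotion"]
        if emotion["confidence"] >= min_confidence:
            counts[item["speaker"]][emotion["value"]] += 1
    return {speaker: dict(c) for speaker, c in counts.items()}


def duration_seconds(item: dict) -> float:
    """Statement length in seconds; startTime and endTime are in milliseconds."""
    return (item["endTime"] - item["startTime"]) / 1000.0


print(summarize_emotions(sample))
print(duration_seconds(sample["data"]["emotion"][0]))
```

The same loop can be pointed at `item["tone"]` instead of `item["emotion"]` to tally tone values, since both objects carry `value` and `confidence`.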

