Speech-to-text

Enable this model configuration to convert speech to text automatically with the highest accuracy.

Overview

Marsview Automatic Speech Recognition (ASR) technology converts speech into text in live or batch mode. The API can be deployed in the cloud or on-premises. Output includes speaker separation, punctuation, casing, word-level time markers, and more.

Model Features

`modelType` Configuration

`modelConfig` Parameters

Example Request

curl --location --request POST 'https://api.marsview.ai/cb/v1/conversation/compute' \
--header 'Content-Type: application/json' \
--header "Authorization:{{Insert Auth Token With Type}}" \
--data-raw '{
        "txnId": "{{Insert txn ID}}",
        "enableModels":[
            {
            "modelType":"speech_to_text",
                "modelConfig":{
                    "automatic_punctuation" : true,
                    "custom_vocabulary":["Marsview", "Communication"],
                    "speaker_seperation":{
                        "num_speakers":2
                    },
                    "multi_channel": {
                        "enable": true,
                        "channel_ids": {
                            "0":"0",
                            "1":"1"
                        }
                    },
                    "enableKeywords":true,
                    "enableTopics":true,
                    "aggressiveness": 3,
                    "topics": {
                        "threshold": 0.5
                        }
                    }
            }
        ]
}'

import requests
auth_token = "replace this with your auth token"
txn_id = "Replace this with your transaction ID"

# Note: the speech-to-text model does not depend on any other models, so it
# can be used independently.

def get_speech_to_text():
  url = "https://api.marsview.ai/cb/v1/conversation/compute"
  payload={
        "txnId": txn_id,
        "enableModels":[
            {
            "modelType":"speech_to_text",
                "modelConfig":{
                    "automatic_punctuation" : True,
                    "custom_vocabulary":["Marsview", "Communication"],
                    "speaker_seperation":{
                        "num_speakers":2
                    },
                    "multi_channel": {
                        "enable": True,
                        "channel_ids": {
                            "0":"0",
                            "1":"1"
                        }
                    },
                    "enableKeywords":True,
                    "enableTopics":False,
                    "aggressiveness": 2
                    }
                }
            ]
        }

  headers = {'authorization': '{}'.format(auth_token)}
  
  response = requests.request("POST", url, headers=headers, json=payload)
  print(response.text)
  if response.status_code == 200 and response.json()["status"] == "true":
    return response.json()["data"]["enableModels"]["state"]["status"]
  else:
    raise Exception("Compute request failed (HTTP {}): {}".format(response.status_code, response.text))

if __name__ == "__main__": 
    get_speech_to_text()

Example Metadata Response

"data": {
    "transcript": [
        {
            "sentence": "Good evening teresa.",
            "channelId": 0,
            "startTime": 1390,
            "endTime": 2690,
            "speakers": [
                "1"
            ],
            "keywords": [
                {
                    "keyword": "good evening teresa",
                    "metadata": [],
                    "type": "DNN"
                }
            ],
            "keySentence": "Good evening teresa.",
            "topics": [
                    {
                        "tiers": [
                            {
                                "tierName": "Education",
                                "type": 1
                            }
                        ],
                        "name": "Secondary Education"
                    },
                    {
                        "tiers": [
                            {
                                "tierName": "Education",
                                "type": 1
                            }
                        ],
                        "name": "College Education"
                    }
                ],
            "suggestedIntents": [
                    "foolish power school board",
                    "stressful situation",
                    "determination",
                    "good good job"
                ]
        }
    ]
}
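As a rough sketch of how a client might consume this metadata, the snippet below flattens each transcript entry into one row. The helper name `summarize_transcript` is hypothetical, and the sample wraps the `"data"` fragment in an outer object, which is an assumption about the full response envelope rather than something this page confirms. Time values are passed through unchanged, since their unit is not stated here.

```python
import json

# Trimmed sample shaped like the example response above. The outer braces are
# an assumption so the fragment parses as standalone JSON.
SAMPLE = json.loads("""
{
  "data": {
    "transcript": [
      {
        "sentence": "Good evening teresa.",
        "channelId": 0,
        "startTime": 1390,
        "endTime": 2690,
        "speakers": ["1"],
        "keywords": [
          {"keyword": "good evening teresa", "metadata": [], "type": "DNN"}
        ],
        "topics": [
          {"tiers": [{"tierName": "Education", "type": 1}],
           "name": "Secondary Education"}
        ]
      }
    ]
  }
}
""")


def summarize_transcript(response_body):
    """Return one flat dict per transcript sentence (hypothetical helper)."""
    rows = []
    for item in response_body.get("data", {}).get("transcript", []):
        rows.append({
            "sentence": item.get("sentence", ""),
            "channel": item.get("channelId"),
            # Time values are copied as-is; their unit is not assumed here.
            "start": item.get("startTime"),
            "end": item.get("endTime"),
            "speakers": item.get("speakers", []),
            "keywords": [k["keyword"] for k in item.get("keywords", []) if "keyword" in k],
            "topics": [t["name"] for t in item.get("topics", []) if "name" in t],
        })
    return rows


for row in summarize_transcript(SAMPLE):
    print(row["start"], row["end"], row["speakers"], row["sentence"])
```

Reading every field with `.get` keeps the helper tolerant of entries where keywords or topics are absent, for example when `enableTopics` is turned off in the request.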

Response Object

Sentence Level Keywords `type`

