Screengrabs

Enable this model configuration to capture keyframes and slides from a visual meeting/conversation.

Overview

Built as a video-first platform, Marsview is the only company in this category that captures keyframes and slides both from videos and from screen sharing in online web conferences.

modelTypeConfiguration

Keys | Value
modelType | screengrabs
modelConfig | Model Configuration object for screengrabs

modelConfig parameters

modelConfig | Description | Defaults
ocr.enable | Boolean to enable or disable OCR | False
screen_activity.enable | Boolean to enable or disable class-based filtering. When set to true, the screengrabs captured by the model are filtered to the classes given in screen_activity.classes | True
screen_activity.classes | A list of classes that the model can recognize. By default, all classes are included. Select the classes that best suit the type of input video. For example, for a meeting, set the class list to ppt and speaker_presentation | ['drawing', 'ppt', 'screen', 'speaker_face', 'speaker_presentation']
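
As a minimal sketch of how these parameters fit together, the helper below assembles one screengrabs entry for enableModels. The function name build_screengrabs_config and its argument names are hypothetical and not part of the Marsview API; only the keys, class names, and defaults come from the parameters listed above.

```python
# Class names taken from the documented default of screen_activity.classes.
KNOWN_CLASSES = ["drawing", "ppt", "screen", "speaker_face", "speaker_presentation"]


def build_screengrabs_config(ocr=False, filter_by_class=True, classes=None):
    """Return one enableModels entry for the screengrabs model (hypothetical helper)."""
    if classes is None:
        classes = list(KNOWN_CLASSES)  # documented default: all classes
    unknown = [c for c in classes if c not in KNOWN_CLASSES]
    if unknown:
        raise ValueError(f"Unknown screen_activity classes: {unknown}")
    return {
        "modelType": "screengrabs",
        "modelConfig": {
            "ocr": {"enable": ocr},
            "screen_activity": {"enable": filter_by_class, "classes": list(classes)},
        },
    }


# Example: a slide-driven meeting, as suggested in the parameter description.
meeting_config = build_screengrabs_config(ocr=True, classes=["ppt", "speaker_presentation"])
```

Validating the class names locally is a design choice of this sketch: it catches a misspelled class before the request is sent.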

Example Request

Curl

curl --location --request POST 'https://api.marsview.ai/cb/v1/conversation/save_file_link/{{user ID}}' \
--header 'Content-Type: application/json' \
--header 'authorization: {{Replace with your authentication token}}' \
--data-raw '{
    "userId": "{{Insert User ID}}",
    "txnId": "{{Insert txn ID}}",
    "enableModels": [
        {
            "modelType": "screengrabs",
            "modelConfig": {
                "ocr": {
                    "enable": true
                },
                "screen_activity": {
                    "enable": true,
                    "classes": ["drawing", "ppt", "screen", "speaker_face", "speaker_presentation"]
                }
            }
        }
    ]
}'

Python

import requests

user_id = "replace this with your user id"
auth_token = "replace this with your auth token"
txn_id = "Replace this with your txn id"
request_url = "https://api.marsview.ai/cb/v1/conversation/{user_id}/compute"


def get_screen_grabs():
    # Request body: enable the screengrabs model with OCR and class filtering.
    payload = {
        "userId": user_id,
        "txnId": txn_id,
        "enableModels": [
            {
                "modelType": "screengrabs",
                "modelConfig": {
                    "ocr": {"enable": True},
                    "screen_activity": {
                        "enable": True,
                        "classes": ["drawing", "ppt", "screen",
                                    "speaker_face", "speaker_presentation"],
                    },
                },
            },
        ],
    }
    headers = {"authorization": auth_token}
    response = requests.post(request_url.format(user_id=user_id), headers=headers, json=payload)
    print(response.text)
    # The API reports success with HTTP 200 and a "status" field of "true".
    if response.status_code == 200 and response.json()["status"] == "true":
        return response.json()["data"]["enableModels"]["state"]["status"]
    else:
        raise Exception("Custom exception")


if __name__ == "__main__":
    get_screen_grabs()

Example Response

"data": {
    "screengrabs": [
        {
            "shotId": null,
            "timeStamp": 13466.666666666666,
            "frameNumber": 202,
            "meetingActivity": null,
            "title": [],
            "text": [],
            "confidenceTitle": [],
            "confidenceText": [],
            "bboxTitle": [],
            "bboxText": []
        }
    ]
}
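
A minimal sketch of reading this response in Python follows. It assumes the JSON body has already been parsed into a dict and that the "data" object sits at the top level, as in the fragment above; the helper name summarize_screengrabs is hypothetical. The timeStamp value is treated as milliseconds, as stated in the Response Object section.

```python
def summarize_screengrabs(body):
    """Return (frameNumber, seconds, has_text) for each screengrab in a parsed response."""
    summary = []
    for grab in body.get("data", {}).get("screengrabs", []):
        seconds = grab["timeStamp"] / 1000.0  # timeStamp is documented as milliseconds
        has_text = bool(grab.get("title") or grab.get("text"))
        summary.append((grab["frameNumber"], round(seconds, 3), has_text))
    return summary


# Sample mirroring the example response above.
sample = {
    "data": {
        "screengrabs": [
            {"shotId": None, "timeStamp": 13466.666666666666, "frameNumber": 202,
             "title": [], "text": [], "confidenceTitle": [], "confidenceText": [],
             "bboxTitle": [], "bboxText": []}
        ]
    }
}

print(summarize_screengrabs(sample))  # prints [(202, 13.467, False)]
```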

Response Object

Field | Description
screengrabs | List of Screengrab objects
shotId | ID of the collection of frames being analyzed
timeStamp | Start time of the shot in milliseconds
frameNumber | Frame number of the frame being analyzed
meetingActivity | Object specifying the meeting activity; null if there is no meeting activity
title | List of strings giving the titles identified in the given frame
text | List of text identified in the given time frame
confidenceTitle | List of confidence scores for the identified titles
confidenceText | List of confidence scores for the identified text
bboxTitle | List of 4 parameters (index 0 = X-displacement, 1 = Y-displacement, 2 = X-end coordinate, 3 = Y-end coordinate) representing the coordinates of the bounding boxes for the titles identified by the model in the given frame
bboxText | List of 4 parameters (index 0 = X-displacement, 1 = Y-displacement, 2 = X-end coordinate, 3 = Y-end coordinate) representing the coordinates of the bounding boxes for the text areas identified by the model in the given frame
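
The bounding-box layout can be sketched as follows. Because the example response only shows empty lists, this sketch assumes that each entry of bboxText is itself a 4-element list [x_start, y_start, x_end, y_end] and that text, confidenceText, and bboxText are aligned by index. The helper names and the sample values are hypothetical.

```python
def bbox_size(bbox):
    """Width and height of a box given as [x_start, y_start, x_end, y_end]."""
    x_start, y_start, x_end, y_end = bbox
    return x_end - x_start, y_end - y_start


def text_regions(grab, min_confidence=0.0):
    """Pair each text entry with its confidence and box size (assumes index alignment)."""
    regions = []
    for text, conf, bbox in zip(grab["text"], grab["confidenceText"], grab["bboxText"]):
        if conf >= min_confidence:
            width, height = bbox_size(bbox)
            regions.append({"text": text, "confidence": conf,
                            "width": width, "height": height})
    return regions


# Made-up values for illustration only; this is not real API output.
grab = {
    "text": ["Quarterly roadmap", "page 3"],
    "confidenceText": [0.9, 0.4],
    "bboxText": [[10, 20, 210, 60], [300, 400, 340, 410]],
}
regions = text_regions(grab, min_confidence=0.5)
```

The same pairing would apply to title, confidenceTitle, and bboxTitle if the alignment assumption holds for them as well.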