Objective
Exporting Kaia data using the daily export feature.
Kaia recordings and transcripts, for both meetings and calls, can be exported daily to customer-owned storage with one of the following providers: sFTP server, AWS S3, or Azure Blob Store.
Recordings must be made by Kaia users and be available in Outreach in order to be exported.
The data exported will include all selected data from the previous day.
Applies To
- Kaia
- Outreach Admins
Overview
Procedure
- Log in to Outreach as an Admin.
- Click Administration > Tools > Kaia.
- Toggle Bulk Export.
- Select the service.
- Select the export option.
- Select the location.
Note: If an export fails because the customer’s storage is unavailable, Outreach emails the administrator(s) and retries for up to 48 hours. If a retry succeeds, a follow-up email notifies the customer that the data export has resumed. Outreach cannot guarantee that data will be uploaded if customer-owned storage is unavailable for more than 48 hours.
Export Information
Export destinations
The following storage options are available as export destinations:
- sFTP server
- AWS S3
- Azure Blob Store
sFTP Credentials
The following information is required for sFTP:
- SFTP Endpoint
- SFTP UserName
- SFTP Password
- SFTP HostKeyType (Optional)
  - Examples: ecdsa-sha2-nistp256, rsa-sha2-256
- SFTP Host Public Key (Optional)
AWS S3 Credentials
The following information is required for AWS S3:
- S3 Region
- S3 Bucket Name
- IAM Access Key ID
- IAM Access Secret
This bucket needs to be created within AWS before entering the information into Outreach. Please create a dedicated IAM User with List/Put/Delete permissions for the bucket.
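As a rough illustration of the permissions described above, the following Python sketch builds an IAM policy document that grants list, put, and delete access to a single bucket. The bucket name `example-kaia-export` is a placeholder, and the mapping of "List/Put/Delete" to the three S3 actions shown is an assumption; review it against your own security requirements before attaching it to the dedicated IAM user.

```python
import json


def build_export_policy(bucket_name: str) -> dict:
    """Return an IAM policy document limited to one export bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Put and delete apply to objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-kaia-export" is a hypothetical bucket name.
    print(json.dumps(build_export_policy("example-kaia-export"), indent=2))
```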
Azure Blob Credentials
The following information is required for Azure Blob Storage:
- Azure Storage Account Name
- Azure Storage Account SharedKey
- Azure Storage Container Name (Optional. If not provided, we will use outreach-export by default.)
- Azure Endpoint Suffix (Optional. If not provided, blob.core.windows.net is used by default.)
We recommend creating a dedicated storage account instead of using an existing one.
Data export files
The exported files consist of:
- A zip file for each Kaia recording, containing the files for that recording, with the naming convention: ‘KAIA-<RFC3339 DateTime>+<InstanceID>.zip’
- A 0 byte trigger file for each zip file, uploaded once the zip file has been uploaded, with the naming convention: ‘KAIA-<RFC3339 DateTime>+<InstanceID>.done’
- A final trigger file, uploaded once a daily batch of zip files has finished uploading, with the naming convention: ‘KAIA-<RFC3339 DateTime>.end’. This file contains a list of the names of the zip files that were uploaded.
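A consumer of the export can use the trigger files to avoid reading a zip file that is still uploading. The sketch below is one possible approach, not an Outreach-provided tool: `parse_export_name` and `ready_zips` are hypothetical helper names, and the code assumes the instance ID never contains a `+` character.

```python
from typing import Iterable, List, NamedTuple, Optional


class ExportName(NamedTuple):
    timestamp: str    # RFC 3339 date/time portion, kept as text
    instance_id: str  # recording instance ID
    extension: str    # "zip" or "done"


def parse_export_name(filename: str) -> Optional[ExportName]:
    """Split 'KAIA-<DateTime>+<InstanceID>.<ext>' into its parts."""
    if not filename.startswith("KAIA-") or "." not in filename:
        return None
    stem, extension = filename[len("KAIA-"):].rsplit(".", 1)
    # The daily '.end' file has no instance ID, so it is not parsed here.
    if extension == "end" or "+" not in stem:
        return None
    # Split on the last '+', in case the timestamp carries a '+hh:mm' offset.
    timestamp, instance_id = stem.rsplit("+", 1)
    return ExportName(timestamp, instance_id, extension)


def ready_zips(filenames: Iterable[str]) -> List[str]:
    """Return zip names whose matching '.done' trigger file is present."""
    names = set(filenames)
    return sorted(
        name for name in names
        if name.endswith(".zip") and name[: -len(".zip")] + ".done" in names
    )
```

Given a listing of the export location, `ready_zips` returns only the archives that are safe to download, and `parse_export_name` recovers the instance ID for later correlation.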
The contents of the Zip files containing recordings are as follows:
- The transcript of the recording with naming convention: ‘KAIA-<RFC3339 DateTime>+<InstanceID>.txt’
- The media file for the recording with the naming convention: ‘KAIA-<RFC3339 DateTime>+<InstanceID>.mp4’
- The metadata for the call/meeting recording with the naming convention: ‘KAIA-<RFC3339 DateTime>+<InstanceID>.json’
These files are not encrypted.
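To check that a downloaded archive holds the three files listed above, something like the following could be used. It is a minimal sketch using only the Python standard library; `index_recording_zip` is a hypothetical name, and the code assumes the files sit in the archive under names ending in `.txt`, `.mp4`, and `.json`.

```python
import io
import zipfile
from typing import BinaryIO, Dict

EXPECTED_SUFFIXES = (".txt", ".mp4", ".json")


def index_recording_zip(source: BinaryIO) -> Dict[str, str]:
    """Map each expected suffix to the first matching member name in the zip."""
    with zipfile.ZipFile(source) as archive:
        members = archive.namelist()
    found: Dict[str, str] = {}
    for suffix in EXPECTED_SUFFIXES:
        matches = [m for m in members if m.endswith(suffix)]
        if matches:
            found[suffix] = matches[0]
    return found


if __name__ == "__main__":
    # Build a tiny in-memory archive for demonstration only.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("KAIA-2024-03-12T17:02:32Z+abc.txt", "transcript")
        archive.writestr("KAIA-2024-03-12T17:02:32Z+abc.json", "{}")
    buffer.seek(0)
    print(index_recording_zip(buffer))
```

A suffix missing from the returned mapping indicates that the corresponding file was not found in the archive.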
Recording transcript files
Transcript text file definition:
- Transcript .txt files contain a list of spoken utterances.
- For each spoken utterance, the first line consists of the timestamp at which the utterance began within the meeting, the speaker in_meeting_id, and the speaker name (when available). The transcribed utterance follows on a new line.
- The length of the utterance is determined by how long a user talks before Kaia identifies a pause in speaking.
Example transcript file
00:00:06 - (2)Mike Goodwin:
Hello. How's everyone doing?
00:00:10 - (1)Neil Deason:
Great.
00:00:13 - (1)Neil Deason:
How about you?
00:00:17 - (2)Mike Goodwin:
I’m well. Can you tell me about the product pricing?
00:00:25 - (1)Neil Deason:
I’d be happy to.
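The layout shown in the example can be parsed with a small regular expression. The following is a minimal sketch that assumes every utterance header looks exactly like the lines above (`HH:MM:SS - (id)Name:`); `Utterance` and `parse_transcript` are hypothetical names, not part of any Outreach library.

```python
import re
from typing import List, NamedTuple, Optional

# Header line: timestamp, " - ", "(in_meeting_id)", optional name, trailing colon.
HEADER = re.compile(r"^(\d{2}:\d{2}:\d{2}) - \((\d+)\)(.*):\s*$")


class Utterance(NamedTuple):
    start_time: str
    speaker: int
    name: Optional[str]
    text: str


def parse_transcript(content: str) -> List[Utterance]:
    """Turn transcript text into a list of utterances."""
    utterances: List[Utterance] = []
    for line in content.splitlines():
        stripped = line.strip()
        match = HEADER.match(stripped)
        if match:
            start, speaker, name = match.groups()
            utterances.append(
                Utterance(start, int(speaker), name.strip() or None, "")
            )
        elif utterances and stripped:
            # Append body text to the most recent utterance.
            last = utterances[-1]
            merged = f"{last.text} {stripped}".strip()
            utterances[-1] = last._replace(text=merged)
    return utterances
```

The `speaker` value corresponds to the `in_meeting_id` of a participant in the metadata .json file, so the two files can be joined on that field.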
Attributes for the Transcription.txt File
Note: Max Length values for Strings are recommended values based on expected content. When extracting this data, it is recommended to gracefully handle the case where the max length is exceeded in an attribute (e.g. a meeting title, utterance, etc).
Attribute | Definition | Data Type | Max Length |
startTime | Time (formatted HH:MM:SS, as in the example transcript) at which the utterance began within the meeting | String | 8 |
speaker | The in_meeting_id of the participant speaking (see Participant attributes in metadata .json file definition) | Integer | n/a |
name | Name of the speaker (when available) | String | 255 |
utterance | The transcription of the speaker's speech | String | 64k |
Recording info .json file definition
- version: the revision number for the data sharing format
- title: the meeting title
- instance_id: unique instance of the recording
- startTime: recording start date/time in UTC
- endTime: recording end date/time in UTC
- duration: length of recording in seconds
- host_guid: unique identifier for Outreach user that created the recording
- call_direction: whether call was “inbound” or “outbound” for Outreach user (Used for Outreach Voice call recordings only)
- call_id: unique identifier for the call within the scope of an Outreach instance, used to correlate the call with data in Salesforce (Used for Outreach Voice call recordings only)
- participants: array of meeting participants in the call/meeting
  - in_meeting_id: unique identifier during the call/meeting (required)
  - name: participant name (If available - Only used for an Outreach Voice call for non Outreach users if their phone number is associated with a Prospect)
  - email: email address (If available - Used for Outreach user, used for other parties if they are known in Outreach and not used for Kaia endpoint)
  - phone_number: phone number (If available - Used for an Outreach Voice call recording for Outreach user & non Outreach user; not used for Kaia endpoint)
  - derived_guid: unique identifier for each identified Outreach user in the call (if applicable - only available for Outreach users)
Example .json file for a meeting recording
{
"version": "1.0",
"title": "Topics (General, Custom, Methodology)",
"instance_id": "MdpV8csOQ8iDE5hDHHih3w",
"startTime": "2024-03-12T17:02:32.157Z",
"endTime": "2024-03-12T17:46:04.703Z",
"durationInSeconds": 2612.546,
"host_guid": "1a0a3d8c-e28c-30b2-bb4a-9a670fed0a1f",
"participants": [
{
"in_meeting_id": 2,
"name": "Jesper Holmberg",
"derived_guid": "50fc5240-3bf0-3ccb-86d7-393abbaeb761"
},
{
"in_meeting_id": 3,
"name": "Elizabeth Webber",
"derived_guid": "1a0a3d8c-e28c-30b2-bb4a-9a670fed0a1f",
"email": "elizabeth.webber@outreach.io"
},
{
"in_meeting_id": 4,
"name": "Maksym Manziuk",
"derived_guid": "ebb9ce5e-462e-3550-9b53-4d7793088d65",
"email": "maksym.manziuk@outreach.io"
}
]
}
Example .json file for an Outreach Voice call recording
{
"version": "1.0.0",
"title": "Diego Villasenor - Paul Smith",
"instance_id": "FJs66ia-QN-TQwGfMR7qUQ",
"startTime": "2023-02-11T16:52:07.467Z",
"endTime": "2023-02-11T16:53:04.271Z",
"host_guid": "e13a0da2-74a2-3f6d-88cc-e67d03772792",
"call_direction": "inbound",
"call_id": 123123,
"duration": 57,
"participants": [
{
"in_meeting_id": 1,
"name": "Outreach Kaia"
},
{
"in_meeting_id": 2,
"phone_number": "+15550001234",
"name": "Diego Villasenor",
"email": "diego.villasenor@outreach.io",
"derived_guid": "e13a0da2-74a2-3f6d-88cc-e67d03772792"
},
{
"in_meeting_id": 3,
"phone_number":"+15551230987",
"name": "Paul Smith",
"email": "paul.smith@test.com"
}
]
}
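When loading the metadata, it helps to treat most fields as optional, since call-only fields are absent from meeting recordings and participant details appear only when available. The sketch below is one way to do that with standard-library dataclasses; the class and function names are hypothetical. It accepts both `duration` and `durationInSeconds`, because the two examples in this article name that field differently.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    in_meeting_id: int
    name: Optional[str] = None
    email: Optional[str] = None
    phone_number: Optional[str] = None
    derived_guid: Optional[str] = None


@dataclass
class RecordingInfo:
    instance_id: str
    version: Optional[str] = None
    title: Optional[str] = None
    start_time: Optional[str] = None
    end_time: Optional[str] = None
    duration: Optional[float] = None
    host_guid: Optional[str] = None
    call_direction: Optional[str] = None  # Outreach Voice calls only
    call_id: Optional[int] = None         # Outreach Voice calls only
    participants: List[Participant] = field(default_factory=list)


def load_recording_info(text: str) -> RecordingInfo:
    """Parse the metadata .json content into a RecordingInfo."""
    data = json.loads(text)
    participants = [
        Participant(
            in_meeting_id=p["in_meeting_id"],
            name=p.get("name"),
            email=p.get("email"),
            phone_number=p.get("phone_number"),
            derived_guid=p.get("derived_guid"),
        )
        for p in data.get("participants", [])
    ]
    return RecordingInfo(
        instance_id=data["instance_id"],
        version=data.get("version"),
        title=data.get("title"),
        start_time=data.get("startTime"),
        end_time=data.get("endTime"),
        # Accept either key seen in the examples.
        duration=data.get("duration", data.get("durationInSeconds")),
        host_guid=data.get("host_guid"),
        call_direction=data.get("call_direction"),
        call_id=data.get("call_id"),
        participants=participants,
    )
```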
File Attributes within info .json file
Note: Max Length values for Strings are recommended values based on expected content. When extracting this data, it is recommended to gracefully handle the case where the max length is exceeded in an attribute (e.g. a meeting title, utterance, etc).
Attribute | Definition | Data Type | Recommended Max Length |
version | the revision number for the data sharing format | String | 16 |
title | the meeting title | String | 255 |
instance_id | unique instance of the meeting | String | 36 |
startTime | recording start date/time in UTC | Timestamp | 24 |
endTime | recording end date/time in UTC | Timestamp | 24 |
duration | length of call recording in seconds | Integer | n/a |
host_guid | unique identifier for Outreach user | String | 36 |
call_direction | whether call was “inbound” or “outbound” for Outreach user | String | 8 |
call_id | unique identifier for the call within the scope of an Outreach instance, can be used to correlate recording call object in Salesforce | Integer | n/a |
participants | See below | Array | 3+ nodes |
Participant Attributes within the Metadata .json file
Note: Max Length values for Strings are recommended values based on expected content. When extracting this data, it is recommended to gracefully handle the case where the max length is exceeded in an attribute (e.g. a meeting title, utterance, etc).
Participant Attribute | Definition | Data Type | Max Length |
in_meeting_id | unique identifier during the call (required) | Integer | n/a |
phone_number | Participant's phone number (optional) | String | 24 |
name | Participant's name (optional) | String | 255 |
email | Participant's email address (optional) | String | 255 |
derived_guid | unique identifier for each identified Outreach user in the recording (optional) | String | 36 |
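One way to handle values that exceed the recommended maximum lengths is to truncate free-text fields and log a warning rather than fail the whole import. The sketch below shows that approach for a few attributes from the tables in this article; truncation is only one possible policy, and `fit_to_limit` is a hypothetical helper name. Identifier fields such as instance_id are left out on purpose, since shortening an identifier would break correlation.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Recommended maximum lengths for free-text attributes, taken from the tables.
RECOMMENDED_MAX = {
    "title": 255,
    "name": 255,
    "email": 255,
    "phone_number": 24,
}


def fit_to_limit(attribute: str, value: Optional[str]) -> Optional[str]:
    """Truncate a value to its recommended length, logging when that happens."""
    if value is None:
        return None
    limit = RECOMMENDED_MAX.get(attribute)
    if limit is not None and len(value) > limit:
        logger.warning(
            "%s exceeded %d characters (got %d); truncating",
            attribute, limit, len(value),
        )
        return value[:limit]
    return value
```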
Correlating exported Outreach Voice recordings to CRM data
In Outreach, Advanced Task Mapping can be used to sync the call_id value into a corresponding Task field in Salesforce.
Correlating exported Meeting recordings to CRM data
Outreach can be configured to sync the Kaia recording URLs to Salesforce Meetings objects.
The instance_id in the exported metadata is contained within the Kaia recording URL and can be used to correlate data sets.
Example
instance_id: MdpV8csOQ8iDE5hDHHih3w
Kaia recording URL: https://web.outreach.io/kaia/record/MdpV8csOQ8iDE5hDHHih3w
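To join the exported metadata with the URLs synced to Salesforce, the instance_id can be pulled out of each URL. The following is a minimal sketch that assumes the instance_id is always the last path segment, as in the example above; `instance_id_from_url` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def instance_id_from_url(url: str) -> str:
    """Return the last path segment of a Kaia recording URL."""
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


if __name__ == "__main__":
    url = "https://web.outreach.io/kaia/record/MdpV8csOQ8iDE5hDHHih3w"
    print(instance_id_from_url(url))
```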
Generating an exported file to develop against
Before daily exports are set up, downloading an individual recording from Outreach generates a file in the same format as an exported recording.