-
Notifications
You must be signed in to change notification settings - Fork 168
Description
Found your wonderful software, but had minor issue when loading an Amazon Transcribe transcript that had the variant format for independent audio channels as oppose to the typical speakers format.
Impressively, your software still loaded the rows of the transcript correctly, however, it made every speaker label have a unique number suffix, so it was impossible to relabel the speaker labels all at once and almost insurmountable task to track and correct by hand a very long transcript.
It's used when each speakers are each on a dedicated channel/track in the source audio file:
https://docs.aws.amazon.com/transcribe/latest/dg/how-channel-id.html
Excerpt from referred AWS doc showing the JSON format:
{
"jobName": "job id",
"accountId": "account id",
"results": {
"transcripts": [
{
"transcript": "When you try ... It seems to ..."
}
],
"channel_labels": {
"channels": [
{
"channel_label": "ch_0",
"items": [
{
"start_time": "12.282",
"end_time": "12.592",
"alternatives": [
{
"confidence": "1.0000",
"content": "When"
}
],
"type": "pronunciation"
},
{
"start_time": "12.592",
"end_time": "12.692",
"alternatives": [
{
"confidence": "0.8787",
"content": "you"
}
],
"type": "pronunciation"
},
{
"start_time": "12.702",
"end_time": "13.252",
"alternatives": [
{
"confidence": "0.8318",
"content": "try"
}
],
"type": "pronunciation"
},
Transcription abbreviated
]
},
{
"channel_label": "ch_1",
"items": [
{
"start_time": "12.379",
"end_time": "12.589",
"alternatives": [
{
"confidence": "0.5645",
"content": "It"
}
],
"type": "pronunciation"
},
{
"start_time": "12.599",
"end_time": "12.659",
"alternatives": [
{
"confidence": "0.2907",
"content": "seems"
}
],
"type": "pronunciation"
},
{
"start_time": "12.669",
"end_time": "13.029",
"alternatives": [
{
"confidence": "0.2497",
"content": "to"
}
],
"type": "pronunciation"
},
Transcription abbreviated
]
}
}
It has "channel_labels" (object) -> "channels" (array/list) ->"channel" (object) with each channel containing it's own "items" for words oppose to "items" being declared once in the other format and uses "channel_label" instead of "speaker_label" for speakers.
Could you please accommodate the Amazon Transcribe channel format variant and at least have speaker ID labels be consistent per channel if not matching the "channel_label?"
Just for reference here's the doc for speaker identification format:
https://docs.aws.amazon.com/transcribe/latest/dg/how-diarization.html