Speech To Text API#
This service exposes speech-to-text models from a variety of vendors.
Time to Integrate#
Less than 5 minutes
Service Providers#
You can find a list of all service providers here:
slashml.SpeechToText.ServiceProvider
Instructions#
1. (Optional) Upload your file to a static server by sending a POST request to https://api.slashml.com/speech-to-text/v1/upload/ where the data points to your audio file. Save the uploaded_url from the response. You can use this URL in the rest of the calls.
2. Submit your audio file for transcription by sending a POST request to https://api.slashml.com/speech-to-text/v1/jobs/ with a JSON body containing uploaded_audio_url, which points to a publicly available audio file. Save the id from the response object.
3. Check the status of the transcription by sending a GET request to https://api.slashml.com/speech-to-text/v1/jobs/YOUR-JOB-ID/
Note: The uploaded_url will be used when submitting a transcription job.
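The three steps above can be chained into one function. The following is a minimal sketch rather than an official client: the function name `transcribe`, the `http` parameter, and the polling defaults are assumptions made for illustration. The `http` argument can be any object with requests-style `post()` and `get()` methods; when it is omitted, the `requests` library is used.

```python
import time

BASE_URL = "https://api.slashml.com/speech-to-text/v1"


def transcribe(audio_path, api_key, http=None, service_provider="assembly",
               poll_seconds=5.0, max_polls=120):
    """Upload a local file, submit a job, and poll until it finishes.

    Hypothetical helper: the name, parameters, and defaults are illustrative.
    """
    if http is None:
        import requests as http  # any object with requests-style post()/get() works

    headers = {"Authorization": f"Token {api_key}"}

    # Step 1: upload the file and keep the returned URL.
    with open(audio_path, "rb") as audio:
        upload = http.post(
            f"{BASE_URL}/upload/",
            headers=headers,
            data={"service_provider": service_provider},
            files={"audio": audio},
        )
    uploaded_url = upload.json()["uploaded_url"]

    # Step 2: submit the transcription job and keep its id.
    job = http.post(
        f"{BASE_URL}/jobs/",
        headers=headers,
        data={"uploaded_audio_url": uploaded_url,
              "service_provider": service_provider},
    ).json()
    job_id = job["id"]

    # Step 3: poll the job until it reaches a terminal status.
    for _ in range(max_polls):
        job = http.get(
            f"{BASE_URL}/jobs/{job_id}/",
            headers=headers,
            params={"service_provider": service_provider},
        ).json()
        if job["status"] in ("COMPLETED", "ERROR"):
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A call such as `transcribe("/path/to/your_audio.mp3", "<YOUR_API_KEY>")` returns the final job object. Error responses (400) are not handled in this sketch and would surface as a `KeyError`.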
Code Blocks#
Upload audio file to static storage#
If your audio files aren’t already accessible via a URL (for example, in an S3 bucket or on a static file server), you can upload them using this API. All uploads are deleted after 24 hours.
POST https://api.slashml.com/speech-to-text/v1/upload/
Request#
import requests

url = "https://api.slashml.com/speech-to-text/v1/upload/"
headers = {"Authorization": "Token <YOUR_API_KEY>"}
payload = {"service_provider": "assembly"}

# Open the file in a context manager so the handle is closed after the upload.
with open("/path/to/your_audio.mp3", "rb") as audio_file:
    files = [("audio", ("test_audio.mp3", audio_file, "audio/mpeg"))]
    response = requests.post(url, headers=headers, data=payload, files=files)

print(response.text)
Response (200)#
{
"uploaded_url": "https://cdn.slashml.com/upload/ccbbbfaf-f319-4455-9556-272d48faaf7f"
}
Response (400)#
{
"error" : {
"message" : "something bad happened"
}
}
Note: The uploaded_url will be used when submitting a transcription job.
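Since the upload endpoint can answer with either of the two shapes shown above, it may help to separate them before moving on. The sketch below assumes the response bodies look exactly like the 200 and 400 examples; `SlashMLError` and `parse_upload_response` are hypothetical names, not part of any official client.

```python
class SlashMLError(Exception):
    """Raised when the API answers with an error payload (hypothetical helper)."""


def parse_upload_response(status_code, body):
    """Return the uploaded URL from a parsed upload response, or raise.

    `body` is the decoded JSON dict, e.g. the result of response.json().
    """
    if status_code == 200 and "uploaded_url" in body:
        return body["uploaded_url"]
    # Fall back to a generic message if the error object is missing.
    message = body.get("error", {}).get("message", "unknown error")
    raise SlashMLError(f"upload failed ({status_code}): {message}")
```

With the `requests` example above, this would be called as `parse_upload_response(response.status_code, response.json())`.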
Convert audio to text#
The body of the request should contain two fields: uploaded_audio_url, the URL of the uploaded audio file, and service_provider, the name of the service provider you want to use.
POST https://api.slashml.com/speech-to-text/v1/jobs/
Request#
import requests
url = 'https://api.slashml.com/speech-to-text/v1/jobs/'
payload = {
"uploaded_audio_url": "https://cdn.slashml.com/upload/ccbbbfaf-f319-4455-9556-272d48faaf7f",
"service_provider": 'assembly'
}
headers = {
"Authorization": "Token <YOUR_API_KEY>",
}
response = requests.post(url, headers=headers, data=payload)
print(response.text)
Response (200)#
{
# keep track of the id for later
"id": "ozfv3zim7-9725-4b54-9b71-f527bc21e5ab",
# note that the status is now "IN_PROGRESS"
"status": "IN_PROGRESS",
"audio_duration": null,
"audio_url": "https://cdn.slashml.com/upload/ccbbbfaf-f319-4455-9556-272d48faaf7f",
"text": null
}
Response (400)#
{
"error" : {
"message" : "something bad happened"
}
}
Note: The ‘id’ is used to fetch the status of the job from the status endpoint.
Check status of job#
The status request follows the same pattern as job submission. Send a GET request to retrieve the status of the job and, once the job has finished, its result (for this API, the transcription).
GET https://api.slashml.com/speech-to-text/v1/jobs/YOUR-JOB-ID/
Request#
import requests
url = 'https://api.slashml.com/speech-to-text/v1/jobs/YOUR-JOB-ID/?service_provider=assembly'
headers = {
'Authorization': 'Token <YOUR_API_KEY>'
}
response = requests.get(url, headers=headers)
print(response.text)
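The request above writes the job id and the service_provider query parameter directly into the URL string. As a small sketch, the same URL can be assembled with the standard library so that both values are escaped properly; `build_status_url` is a hypothetical helper name. With `requests`, passing `params={"service_provider": ...}` to `requests.get` would achieve the same result.

```python
from urllib.parse import quote, urlencode

JOBS_URL = "https://api.slashml.com/speech-to-text/v1/jobs/"


def build_status_url(job_id, service_provider="assembly"):
    """Build the status URL for a job, escaping the id and the query string."""
    query = urlencode({"service_provider": service_provider})
    return f"{JOBS_URL}{quote(job_id, safe='')}/?{query}"
```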
Response (200) - In Progress#
{
"id": "ozfv3zim7-9725-4b54-9b71-f527bc21e5ab",
# the job is still running, so the result fields below are null
"status": "IN_PROGRESS",
"acoustic_model": "assemblyai_default",
"language_model": "assemblyai_default",
"audio_duration": null,
"audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
"confidence": null,
"dual_channel": null,
"text": null,
"words": null
}
Response (200) - Completed#
{
"id": "5551722-f677-48a6-9287-39c0aafd9ac1",
"status": "COMPLETED",
"acoustic_model": "assemblyai_default",
"language_model": "assemblyai_default",
"audio_duration": 12.0960090702948,
"audio_url": "https://s3-us-west-2.amazonaws.com/blog.assemblyai.com/audio/8-7-2018-post/7510.mp3",
"confidence": 0.956,
"dual_channel": null,
"text": "You know Demons on TV like that and and for people to expose themselves to being rejected on TV or humiliated by fear factor or.",
"words": [
{
"confidence": 1.0,
"end": 440,
"start": 0,
"text": "You"
},
...
{
"confidence": 0.96,
"end": 10060,
"start": 9600,
"text": "factor"
},
{
"confidence": 0.97,
"end": 10260,
"start": 10080,
"text": "or."
}
]
}
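The words array in a completed response can be turned into a simple timestamped listing. The sketch below assumes that start and end are expressed in milliseconds, which is consistent with the example (the last word ends at 10260 while audio_duration is about 12.1 seconds) but is not stated explicitly in this document; `format_word_timings` is a hypothetical helper name.

```python
def format_word_timings(job):
    """Return one 'start-end text' line per word of a completed job.

    Assumes 'start' and 'end' are in milliseconds (not confirmed by the docs).
    Returns an empty list while 'words' is still null.
    """
    lines = []
    for word in job.get("words") or []:
        start_s = word["start"] / 1000
        end_s = word["end"] / 1000
        lines.append(f"{start_s:.2f}-{end_s:.2f}s {word['text']}")
    return lines
```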
Response (400) - Error#
{
"error" : {
"message" : "something bad happened"
}
}
Note: The status will go from ‘QUEUED’ to ‘IN_PROGRESS’ to ‘COMPLETED’. If there’s an error processing your input, the status will go to ‘ERROR’ and there will be an ‘ERROR’ key in the response JSON which will contain more information.
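The status lifecycle described in the note can be handled with a small dispatch function. This is a sketch under stated assumptions: `job_result` is a hypothetical name, and because the note mentions an ‘ERROR’ key while the 400 examples use a lowercase "error" key, the code checks both spellings rather than assuming one.

```python
def job_result(job):
    """Interpret a parsed job response.

    Returns the transcript when the job is COMPLETED, None while it is
    QUEUED or IN_PROGRESS, and raises RuntimeError when the status is ERROR.
    """
    status = job.get("status")
    if status == "COMPLETED":
        return job.get("text")
    if status == "ERROR":
        # The exact key name is an assumption, so try both spellings.
        details = job.get("ERROR") or job.get("error") or "no details provided"
        raise RuntimeError(f"transcription failed: {details}")
    if status in ("QUEUED", "IN_PROGRESS"):
        return None
    raise ValueError(f"unexpected status: {status!r}")
```

A polling loop would call this after each GET request and keep waiting for as long as it returns None.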