@msolstice If you search for a term, then the API takes you to the best match in the file. So, yep, the API can definitely be used to navigate around video & audio based on keywords/phrases.
You could make a fifteen-hour-long video scavenger hunt if you wanted—oh no, ideas for a demo.
Huh, I'm kind of confused. Is it parsing the audio into words and then comparing those with a fuzzy match? Or is it something much subtler, using phonemes rather than words? It seems that the words shown aren't actually that accurate, but the fuzzy match *is*.
Given that searching "conquer" matches the spoken word "concrete" which is closed-captioned as "conquering", but "concrete" doesn't seem to find this snippet, I'm guessing it's more like the first thing I guessed...
Curious also at which stage(s) in the process the AI/ML is used.
@malcolm_ocean Great point! There are circumstances where the match will be imperfect. We are constantly working on improving the algorithm by increasing both precision and recall (check out https://en.wikipedia.org/wiki/Pr... for a very detailed discussion on precision and recall in search).
It's useful to think of the results in the Google sense where your first result may not necessarily be the best result. You might have to venture into the next few results (keep clicking 'next' and the engine will 'lower its standards') to really find what you are looking for. The UI for this isn't super simple yet since it's a new idea—how do you present search results in video and audio?
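To make the precision/recall trade-off above concrete, here's a tiny illustrative sketch (toy data, not Deepgram's evaluation code): widening the result set by clicking 'next' tends to raise recall at the cost of precision.

```python
# Toy precision/recall calculation for a search result set.
# "returned" = snippets the engine showed; "relevant" = snippets that truly match.
def precision_recall(returned, relevant):
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 2 results shown, 1 of them truly relevant,
# out of 3 relevant snippets in the whole file.
p, r = precision_recall(["clip1", "clip3"], ["clip1", "clip2", "clip4"])
```

Returning more results ("lowering standards") can only grow the `hits` set, so recall goes up while precision usually drops.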
We have achieved much better accuracy than traditional search based on transcriptions, but we always are trying to improve!
@malcolm_ocean I noticed I missed a few points you asked about. The AI is used in the indexing stage and in prediction layers that are built on top of the search (the prediction is a 'special' thing that customers have to request).
The search does work on a fuzzy model: matches of words and sounds are weighted and compared based on their probability of being correct and how far they are from your query.
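A minimal sketch of that idea (purely illustrative, not Deepgram's actual model): score each indexed candidate by combining the recognizer's confidence that the word is correct with the word's closeness to the query string.

```python
# Illustrative weighted fuzzy match: recognizer confidence x string similarity.
# The index entries and confidences below are made-up example data.
from difflib import SequenceMatcher

def fuzzy_score(query, candidate, confidence):
    # similarity in [0, 1]: how close the candidate is to the query
    similarity = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
    # weight the match by the probability that the candidate was heard correctly
    return confidence * similarity

# (word heard, recognizer confidence) pairs from a hypothetical index
index = [("conquering", 0.6), ("concrete", 0.9), ("banana", 0.99)]
ranked = sorted(index, key=lambda wc: fuzzy_score("conquer", *wc), reverse=True)
```

This shows why "conquer" can surface a snippet whose transcript shows a different word: a high-confidence near-miss and a low-confidence close match can score similarly, so the ranking depends on both factors.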
@stephensonsco hm, I still don't quite understand. Are you using the neural networks directly on the audio data? Like with the audio as input nodes? Or both that and the transcription? Or...
@malcolm_ocean The NN is used (more or less) directly on the audio waveform to produce a searchable index. When you hit the search button, the index is queried and the most relevant results are returned back to you. The NN is not active during that query stage though, just the indexing. I hope that helps!
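The two-stage split described above (model runs once at indexing time; queries are just lookups) can be sketched roughly like this. The `transcribe` function here is a stand-in for whatever the NN produces from the waveform; everything else is hypothetical structure, not Deepgram's implementation.

```python
# Toy index-then-query pipeline: the model is only involved while indexing.
def build_index(audio_segments, transcribe):
    # transcribe: stand-in for the NN mapping a waveform segment -> words
    index = {}
    for timestamp, segment in audio_segments:
        for word in transcribe(segment):
            index.setdefault(word, []).append(timestamp)
    return index

def search(index, word):
    # no model runs at query time -- just a lookup into the prebuilt index
    return index.get(word, [])

# Fake "audio" segments, with a trivial transcriber for illustration.
segments = [(0.0, "hello world"), (5.0, "world peace")]
idx = build_index(segments, transcribe=lambda s: s.split())
```

The payoff of this design is that search stays fast and cheap no matter how expensive the model is, since inference cost is paid once per file rather than once per query.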
Wow. This is really great execution. Can't wait to see this sort of indexing of AV content to become more commonplace.
This is really interesting. It could have applications for audiobooks, particularly non-fiction content. Navigation/searchability is something most commercial digital audiobook providers struggle with, which really limits the usefulness of audiobooks as a reference resource.
Wow, this could help so many people out there! Can you please explain the techniques you used? I can tell from the company's name that you use deep learning in some way, but I'd be happy to hear more about it. Thanks!
As a voice enthusiast, I really love what you guys are doing!
Best of luck!
Congrats on the launch! Seems like the folks over at @bumpersfm would have a great dataset and application for DeepGram.
This is a fascinating concept! The ability to search through speech and video content in such a precise way opens up countless possibilities for creators, researchers, and businesses.