Magisto! It sounds like "Magic" and "Presto" put together.
This is how it works: https://www.magisto.com/how-it-w...
This is where the API docs and API key access live: https://developers.magisto.com/
Beware: Magisto API usage may come at a cost ($$). Consider this:
You are sending them data to process (and not simple text-based data, either), and you're getting nicely packaged digital content back, so some server somewhere is getting hot on your behalf. Depending on how much video/audio/image data you send, and how large each item is, that server may heat up enough that using it eventually costs you something. Makes sense.
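To get a feel for how much data you'd be handing over, here's a minimal sketch in TypeScript. The `MediaItem` shape and the `summarizeUpload` name are made up for illustration. It doesn't touch the Magisto API or assume anything about their pricing; it only adds up file sizes.

```typescript
// Hypothetical description of one file you plan to send for processing.
interface MediaItem {
  name: string;
  sizeBytes: number;
}

// Hypothetical summary shape: total payload plus the single largest item.
interface UploadSummary {
  totalBytes: number;
  largestBytes: number;
  totalMegabytes: number;
}

// Adds up the sizes of all items and tracks the largest one.
function summarizeUpload(items: MediaItem[]): UploadSummary {
  let totalBytes = 0;
  let largestBytes = 0;
  for (const item of items) {
    totalBytes += item.sizeBytes;
    largestBytes = Math.max(largestBytes, item.sizeBytes);
  }
  // 1 megabyte is treated as 1024 * 1024 bytes here.
  return { totalBytes, largestBytes, totalMegabytes: totalBytes / (1024 * 1024) };
}
```

Run it over your own file list before committing to anything; the bigger the total and the bigger the individual items, the more likely usage will cost you.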
If you can piece together an algorithm yourself that does what Magisto's AI does, you can avoid this cost at the expense of your own programming time. I'm assuming you wouldn't have asked for a recommendation if you could build it yourself, but consider this too:
Where is your content going to live once it's all packaged up and sent back to you? Will you only show it to others on a smart device or computer screen? If so, you could use plain JS with HTML5 Audio and Video to build the final digital content at a URL, since even Smart TVs can open URLs. If you need a single file back, though, the URL option is not viable.
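Here's a rough sketch of what the URL approach could look like, written in TypeScript: play a list of media URLs back to back in a single player. `MediaLike` and `playInSequence` are names I made up. The interface only asks for `src`, `play()`, and an "ended" listener, which is the part of an HTML5 video or audio element this relies on, so treat it as a starting point under that assumption rather than a finished player.

```typescript
// Hypothetical minimal view of a media player. An HTML5 <video> or <audio>
// element exposes these same members (src, play, the "ended" event).
interface MediaLike {
  src: string;
  play(): unknown;
  addEventListener(type: "ended", listener: () => void): void;
}

// Plays each URL in order, moving to the next one when the current one ends.
function playInSequence(player: MediaLike, urls: string[]): void {
  let index = 0;

  const playCurrent = (): void => {
    const url = urls[index];
    if (url === undefined) {
      return; // Nothing left to play.
    }
    player.src = url;
    player.play();
  };

  // When one item finishes, advance and start the next.
  player.addEventListener("ended", () => {
    index += 1;
    playCurrent();
  });

  playCurrent();
}
```

On a real page you would pass in the element you got from something like `document.querySelector("video")`. Images and background audio would need their own handling on top of this.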
If paying for Magisto API use is totally bugging you out, check this list for some possible alternatives: http://alternativeto.net/softwar...
If you do choose to host the combined video+audio+image thing at a URL, I would consider using the GreenSock (GSAP) and CreateJS libraries to combine your collections of video, audio, and image files into one presentation.
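Whichever library you pick, you'll first need to decide when each piece starts and stops. Below is a minimal TypeScript sketch of that planning step. It deliberately does not call GSAP or CreateJS; `Clip`, `Cue`, and `buildTimeline` are hypothetical names, and it assumes you already know each item's duration (for images, however long you want them on screen).

```typescript
// Hypothetical description of one piece of media to place on the timeline.
interface Clip {
  url: string;
  kind: "video" | "audio" | "image";
  durationSeconds: number;
}

// Hypothetical cue: the same clip plus where it sits on the overall timeline.
interface Cue {
  url: string;
  kind: Clip["kind"];
  startSeconds: number;
  endSeconds: number;
}

// Lays the clips end to end and returns their start and end offsets.
function buildTimeline(clips: Clip[]): Cue[] {
  const cues: Cue[] = [];
  let cursor = 0; // Running position on the timeline, in seconds.
  for (const clip of clips) {
    cues.push({
      url: clip.url,
      kind: clip.kind,
      startSeconds: cursor,
      endSeconds: cursor + clip.durationSeconds,
    });
    cursor += clip.durationSeconds;
  }
  return cues;
}
```

The resulting cue list is what you would then hand to your animation or playback code to show, hide, and start each item at the right moment.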