I built an app for creating audiobooks that's user-friendly and fully automatic. There are other options, but they require manual work and have complex dashboards.
The AI model behind it is different from most TTS you'll find: it's LLM-based, so you get different voice pacing, emotion, etc. each time you generate the audiobook, which means the voices sound less robotic and less flat.