State of Text-To-Speech

#21

I think he was more talking about integrating the voice tech and not the dedicated standalone server stuff. Think voice chat in CounterStrike, Battlefield, Call of Duty. See here:

http://teamspeak.com/?page=teamspeak3sdk

#22

He was. It may be that I-Novae Studios is relying on the fact that players can modify the game to permit stuff like Teamspeak integration. For example, ARMA 3 has multiple mods that use Teamspeak to implement various radios. In one mod the short range radio has fairly low quality audio, while the longer range (and much bulkier) radio has better quality audio. There are mic-press chirps and everything. I’m fairly certain that game sounds can also be sent through the Teamspeak channels to indicate background activity to the receiver.

#23

It’s called positional audio. There’s a Teamspeak plugin to enable it, and Mumble has the feature by default, iirc. It’s really great for games like ARMA: the game basically tells Teamspeak where the person is or what effects to add to their voice. Though I’m not sure how useful it would be for a game like Infinity; surely having realistic voice carry distance isn’t so important in the vacuum of space.
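To make the idea concrete, here is a minimal sketch of what "the game tells the voice client where the person is" could boil down to: a volume and a left/right pan computed from two positions. The function names, the linear falloff, and the 50-unit range are all assumptions for illustration, not how the Teamspeak plugin or Mumble actually do it.

```python
import math


def voice_gain(listener, speaker, max_range=50.0):
    """Return a 0..1 volume multiplier that falls off linearly with distance.

    max_range is a hypothetical cut-off beyond which the voice is silent.
    """
    distance = math.dist(listener, speaker)
    if distance >= max_range:
        return 0.0
    return 1.0 - distance / max_range


def stereo_pan(listener, speaker):
    """Return -1.0 (fully left) to 1.0 (fully right), using only the x offset.

    This ignores which way the listener is facing, which a real
    implementation would have to take into account.
    """
    distance = math.dist(listener, speaker)
    if distance == 0:
        return 0.0
    return (speaker[0] - listener[0]) / distance
```

A radio mod could swap `voice_gain` for a quality setting per radio type instead of a distance falloff, which is closer to what the ARMA 3 mods mentioned above seem to do.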

#24

The important bit for me is that the sounds the pilot hears are also heard by anyone the pilot communicates with. Other players would hear the pilot’s background chatter on other communications channels, warning klaxons, impacts, weapon fire, possibly simulated explosion sounds, scanner noises, atmospheric noises, etc.

Using a proper microphone with a separate Teamspeak server means that ambient noises (such as people yelling in the background) are not heard by others. But that also means that game noises are not heard.

It could be argued that a space-faring society would have the pilot using a proper microphone so that the ambient noises of combat wouldn’t interfere with their critical communications. I would say that gameplay invites keeping those ambient noises.

#25

AEIOU!

(John Madden)

#26

Right, time for some nightmares because that crap is going to stick in my head for a long time.

#27

Of course, not by default! :smiley:

[edit] Though it seems IVONA does a bit better than Sam with ‘???’ (it reads only one). But with ‘uuuuuuuu’ it pronounces every vowel as a single one and goes into a stuttering repeat without losing breath… ever.

#28

*in a kill streak voice*

#ULTRA BUMP!

#29

I use IVONA TTS to read back to me on Windows. Indian English is my fav, and a lot more TTS companies now offer a version of Indian English. I just really like the female Indian English accent.

#30

Yeah, but as Flavien points out:

The issue is that IVONA, now part of Amazon, and others offer TTS as a service. That means any TTS conversion already done for one client would have to be saved to I-Novae Studios’ Azure servers in order to avoid paying a few cents for the same line again and again.
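The "save it so you only pay once" part is just a cache keyed on the voice and the text. A rough sketch, assuming a local folder stands in for the server-side storage and that `synthesize` is whatever function makes the paid request (both are placeholders, not anything I-Novae or IVONA actually provide):

```python
import hashlib
from pathlib import Path


def cache_path(cache_dir, voice, text):
    """Map a (voice, text) pair to a stable file name inside cache_dir."""
    key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{key}.wav"


def get_speech(cache_dir, voice, text, synthesize):
    """Return audio bytes, calling the paid `synthesize` only on a cache miss.

    `synthesize(voice, text)` is a hypothetical callable that returns the
    audio as bytes; it stands in for the TTS service request.
    """
    path = cache_path(cache_dir, voice, text)
    if path.exists():
        return path.read_bytes()
    audio = synthesize(voice, text)  # the only place money is spent
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

This only helps for lines that repeat exactly. Anything with a player name or a number in it would be a new cache entry each time.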

#31

Seems we will soon have a way to do this client-side.
https://15.ai/
DeepThroat (/ˈdēpˌTHrōt/): Natural emotive high-fidelity text-to-speech synthesis with minimal viable data

Warning: Includes swearing/cursing

Not dubbed by the voice actors of Team Fortress 2.

Another example… the little ponies skit:
https://www.youtube.com/watch?v=Fj6dufqpQOw

P.S. I do find the Big Chungus meme funny.
https://knowyourmeme.com/memes/big-chungus

#32

So I tinkered with an open-source TTS engine called MaryTTS for several hours and wrote a Python script that just reads the log file for text notifications.


There is no need to compile MaryTTS: there is already a release for Windows (marytts-installer-5.2.zip), where you just run the marytts.bat file to start the MaryTTS web server locally.

Download link for the executable file for the log reader, which watches for changes within the log, fetches voice lines and then sends those voice lines to the local MaryTTS Web server:
https://1drv.ms/u/s!AuEqVKK0eUKDhuQIs5ommY8AI_Alig?e=fpbLFq (SHA-256 hash = 09C3B8E6EAC3C26C08EA6064E1AF03390114FCE641A19AA35086BC3001B6049E)
Source code:
https://gist.github.com/Pendrokar/9311de9c2be9e51c608372637d4e8583
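For anyone who just wants the general shape without opening the gist, the approach can be sketched roughly like this. This is not the actual script: the `[Notification]` marker is a made-up placeholder for however the game log tags voice lines, and the address and query parameters are what I understand the MaryTTS 5.x HTTP interface to accept by default, so check them against your local server page.

```python
import time
import urllib.parse
import urllib.request

# Assumed default address of a locally running MaryTTS server.
MARY_URL = "http://localhost:59125/process"


def build_request_url(text, locale="en_US"):
    """Build a GET URL asking the local MaryTTS server for a WAV of `text`."""
    query = urllib.parse.urlencode({
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
    })
    return f"{MARY_URL}?{query}"


def extract_voice_line(log_line, marker="[Notification]"):
    """Return the text after the (hypothetical) marker, or None if absent."""
    _, found, rest = log_line.partition(marker)
    text = rest.strip()
    return text if found and text else None


def follow(path, poll_seconds=0.5):
    """Yield text appended to the file after it is opened, like `tail -f`.

    A line still being written may arrive in pieces; this sketch ignores that.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        handle.seek(0, 2)  # jump to the end so old entries are not read aloud
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)


def run(log_path):
    """Watch the log forever and fetch speech for every voice line found."""
    for line in follow(log_path):
        text = extract_voice_line(line)
        if text:
            with urllib.request.urlopen(build_request_url(text)) as response:
                wav_bytes = response.read()
            # Playing wav_bytes is left out; on Windows the standard
            # library's winsound module is one option.
```

Calling `run()` with the path to the newest log file starts the loop; it never returns, so stop it with Ctrl+C.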

Demonstration video:



Installation steps:

  1. Unpack the MaryTTS installer and run the marytts.bat file, which starts the TTS engine and provides an interface accessible through a web browser.
  2. Download and move the executable file to Documents\I-Novae Studios\Infinity Battlescape\Logs
    (Valid path in File Explorer)
  3. Launch Infinity Battlescape, so that a new log file is created
  4. Run the exe file
  5. Play Battlescape
  6. You may adjust TTS volume through Windows Sound Mixer

The voice may sound more robotic due to two randomized parameters I add to it, so that the same line is never read in the same manner.
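The post does not say which two parameters are randomized, so the following is only an illustration of the idea, assuming they are something like a speaking-rate factor and a pitch factor. The names `rate_scale` and `pitch_scale` are made up here, and mapping them onto actual MaryTTS request fields is deliberately left out.

```python
import random


def random_variation(rng, rate_spread=0.1, pitch_spread=0.1):
    """Pick a slightly different rate and pitch factor for each spoken line.

    Both factors are centred on 1.0 (no change); the spreads are
    hypothetical defaults, not values taken from the original script.
    """
    return {
        "rate_scale": 1.0 + rng.uniform(-rate_spread, rate_spread),
        "pitch_scale": 1.0 + rng.uniform(-pitch_spread, pitch_spread),
    }


# Example: a fresh variation per line, so repeats never sound identical.
rng = random.Random()
variation = random_variation(rng)
```

Keeping the spreads small is probably what matters most: wide ranges are likely to push the voice further into robotic territory.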

But this is what an open-source solution can already give. I used the plain text input option, but there are other options that allow attaching certain emotions to phrases. That would help, as friendly fire and critical hits are currently also read slowly and calmly.

#33

Went through some of the TTS and AI voice actor services. Sadly, all AI-enhanced TTS still offer the same type of service: pay their data cloud to have it generate an audio file. Most of their costs go into generating the neural network, which then just makes audio files based on text input. There is no reason to have a supercomputer generate the files, other than to keep control. :unamused:

While the following is impressive, it cannot be mass produced:

Even if these NPCs would have simple backgrounds: