State of Text-to-Speech (Stretch goal?)


Probably because you're not used to hearing the presented foreign languages.
Just as the English (both UK and US, I guess) was instantly categorized as "synthesised", I had exactly the same thought about the French version. Moreover, the female voice does not articulate well enough (at least for the text they chose), and even as a Frenchman myself, I had a hard time guessing what she was actually saying for two of the words. She also had some weird intonation...

As for the Russian, it somehow sounded more "chopped" than the other languages. Though I can't say I speak it :)

Best choice IMO. There's a reason not everybody's a singer... and that commercial ads choose specific voices according to the emotions / stimuli they want you to feel.

Though I would love to get my hands on the same kind of AI as Kane acquired (rather creepy, though, if you remember the whole story), we're still miles from anything close to that.
Your proposition is a very good one, but not feasible with current tech. Or not without a huge portion of players making fun of, or being irritated by, the chopped "one-tone" voices currently on the market.


This one-tone thing seems to have been noticed even in sound effects. Most developers found the solution in changing the pitch a little each time the sound plays. I have noticed that this has even been done to voice samples (clicking on characters on the Total War series campaign maps). Others, starting with games like Half-Life 2 and Battlefield: Bad Company, modify played sounds based on distance and environment.
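To illustrate the pitch trick described above, here is a minimal sketch in Python. It is not taken from any of the games mentioned; the function names and the one-semitone default are assumptions chosen for illustration, and the resampler is a deliberately naive one that changes duration along with pitch.

```python
import random
from typing import List, Sequence


def pitch_ratio(semitones: float) -> float:
    """Convert a pitch offset in semitones to a playback-rate multiplier."""
    return 2.0 ** (semitones / 12.0)


def jittered_rate(rng: random.Random, max_semitones: float = 1.0) -> float:
    """Pick a random playback rate within +/- max_semitones of the original."""
    return pitch_ratio(rng.uniform(-max_semitones, max_semitones))


def resample(samples: Sequence[float], rate: float) -> List[float]:
    """Naive pitch shift: resample with linear interpolation.

    rate > 1 plays faster and higher, rate < 1 slower and lower.
    """
    if not samples:
        return []
    last = len(samples) - 1
    out_len = max(1, int(len(samples) / rate))
    out = []
    for i in range(out_len):
        pos = i * rate
        lo = min(int(pos), last)      # sample just before the read position
        hi = min(lo + 1, last)        # sample just after, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# Each playback of the same effect gets a slightly different pitch.
rng = random.Random()
click = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
varied = resample(click, jittered_rate(rng))
```

A real engine would normally just set the playback rate on the audio source instead of resampling by hand; the explicit loop is only there to show what the rate change does to the samples.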

By the way, I have actually tried to recreate CABAL's voice, but without directly contacting former Westwood employees I won't get the same effect. It seems the effect is a mix of multiple samples with different pitches in the same timeframe, with a metallic effect added; this is actually too heavy on the CPU to be done in real time. Anyway, here is what I got by using IVONA's Brian British English voice:

Westwood's CABAL Sample

IVONA's Brian saying the same without added effects

My attempt to recreate the effect

To me Brian sounds good, except for the sudden high-pitched "frontLINE".

If some of you are asking about the content that the TTS would read, yes, it would be very similar. I haven't looked at or read any computer-generated articles, sports articles especially. But such generators would probably be expensive and require heavy modifications.
Computer Generated Sports Article


What really matters is player and corp names, so when creating a name for either of those there would be a TTS window that allows players to fine-tune how they want the TTS to read their name. If I name myself "bob the miner" but the TTS reads it as "boob the minor", I can choose a different variation until it hits the intended one.

EDIT: I can see some players intentionally making it funny... xD
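One way to picture the name fine-tuning idea is a per-name pronunciation override that is applied before any text reaches the TTS engine. The sketch below is only an assumption about how such a feature could be stored; `PronunciationBook` and its methods are hypothetical names, and the respelling "bawb the miner" is an invented example of what a player might pick.

```python
from typing import Dict


class PronunciationBook:
    """Maps display names to the respelling the player chose for TTS."""

    def __init__(self) -> None:
        self._overrides: Dict[str, str] = {}

    def set_override(self, name: str, spoken_form: str) -> None:
        # Names are matched case-insensitively.
        self._overrides[name.casefold()] = spoken_form

    def spoken(self, name: str) -> str:
        # Fall back to the name as written when no override exists.
        return self._overrides.get(name.casefold(), name)

    def render(self, template: str, name: str) -> str:
        # Build the sentence that would be handed to the TTS engine.
        return template.format(name=self.spoken(name))


book = PronunciationBook()
book.set_override("bob the miner", "bawb the miner")
line = book.render("You have been targeted by {name}!", "Bob the Miner")
```

The on-screen name stays untouched; only the string sent to the speech engine changes.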


I can see a lot of players failing to make it funny. I don't particularly look forward to having the game repeatedly tell me "You have been targeted by coxcoxcoxcoxcox!"


Personally I would much prefer that TTS is not used for any story related text, I'd rather there be no voice acting there than TTS.

However, I would like to see it used in all private/guild chat and sidequests/missions. And of course there should be an option to turn it off, because it'll most likely irritate some people.


Yeah I am definitely of the opinion that no TTS is better than bad TTS.

An option for visually impaired players? Sure. But definitely off by default.


I think it would be pertinent to this discussion if we all went and spent an hour or two playing Moonbase Alpha.


I watched about 20 seconds of the first YouTube hit on "Moonbase Alpha". There are many ways to design something wrong, and the developers of Moonbase Alpha certainly found one. For those uninterested in going through that exercise, consider how long it takes to ignore a spoken sentence that you don't want to hear versus a text message you don't want to read. The spoken sentence just won't go away.

I hope that I-Novae Studios will pursue Teamspeak integration so that voice communication is a standard part of gameplay. I am a gamer who wants to interact with other players through the game, so I have no interest in text-to-speech or even canned voice (i.e. voice acting). That said, I can imagine experiences in game that are scripted, and text-to-speech or canned voice seems appropriate there. I'll just do my best to avoid them.

An example of how canned voice is not interesting to me is the "Jumping" announcement in EVE Online. The woman has a beautiful voice and the delivery is velvety smooth, but with so many jumps in the game the repetition got to me. I'd rather have the game fiddle with the sound effects so that they're not always the same. That is, if the game world is procedural, then the sound effects can modulate according to the environment. A nearby star in a jump causes reverb to one element of the jump sound. Low fuel produces a rattle sound in another element. Starting a jump in a bendy jump tunnel plays with the volume. And so on.
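The modulation idea above can be sketched as a function that maps the game state at jump time to per-layer sound parameters. Everything here is hypothetical: the field names, the 5 AU reverb range, the 25% fuel threshold, and the halving of volume in a bendy tunnel are placeholder values, not anything from Infinity or EVE Online.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class JumpContext:
    star_distance_au: float   # distance to the nearest star
    fuel_fraction: float      # 0.0 (empty) to 1.0 (full)
    tunnel_curvature: float   # 0.0 (straight) to 1.0 (very bendy)


def clamp01(x: float) -> float:
    """Clamp a value into the range [0, 1]."""
    return max(0.0, min(1.0, x))


def jump_sound_params(ctx: JumpContext) -> Dict[str, float]:
    """Derive mixer settings for the layers of the jump sound."""
    # Closer star -> more reverb on one layer (none beyond 5 AU).
    reverb_mix = clamp01(1.0 - ctx.star_distance_au / 5.0)
    # Rattle fades in once fuel drops below 25%.
    rattle_gain = clamp01((0.25 - ctx.fuel_fraction) / 0.25)
    # A bendy tunnel pulls the overall volume down by up to half.
    volume = 1.0 - 0.5 * clamp01(ctx.tunnel_curvature)
    return {
        "reverb_mix": reverb_mix,
        "rattle_gain": rattle_gain,
        "volume": volume,
    }


params = jump_sound_params(JumpContext(1.0, 0.1, 0.4))
```

Because the inputs change from jump to jump, the same base sample never plays back quite the same way twice.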


The canned voice in the X series is one of the greatest parts about it, in my opinion.


Speaking of TS integration, I'm not so sure. Most of the time, players already have their own server. And more importantly, how would you give management rights to players? Because the TS would be public, it would most likely be a happy mess at best, a troll nest at worst.

But well, as long as it is only an option and not compulsory, it'll be fine.


I think he was talking more about integrating the voice tech and not the dedicated standalone server stuff. Think voice chat in Counter-Strike, Battlefield, Call of Duty. See here:


He was. It may be that I-Novae Studios is relying on the fact that players can modify the game to permit stuff like Teamspeak integration. For example, ARMA 3 has multiple mods that use Teamspeak to implement various radios. In one mod the short range radio has fairly low quality audio, while the longer range (and much bulkier) radio has better quality audio. There are mic-press chirps and everything. I'm fairly certain that game sounds can also be sent through the Teamspeak channels to indicate background activity to the receiver.


It's called positional audio; there's a TeamSpeak plugin to enable it, and Mumble has the feature by default, IIRC. It's really great for games like ARMA. The game basically tells TeamSpeak where the person is, what effects to add to their voice, and so on. Though I'm not sure how useful it would be for a game like Infinity; surely having realistic voice carry distance isn't so important in the vacuum of space.


The important bit for me is that the sounds the pilot hears are also heard by anyone the pilot communicates with. Other players would hear the pilot's background chatter on other communications channels, warning klaxons, impacts, weapon fire, possibly simulated explosion sounds, scanner noises, atmospheric noises, etc.

Using a proper microphone with a separate Teamspeak server means that ambient noises (such as people yelling in the background) are not heard by others. But that also means that game noises are not heard.

It could be argued that a space-faring society would have the pilot using a proper microphone so that the ambient noises of combat wouldn't interfere with their critical communications. I would say that gameplay invites keeping those ambient noises.



(John Madden)


Right, time for some nightmares because that crap is going to stick in my head for a long time.


Of course, not by default! :)

[edit] Though it seems IVONA does a bit better than Sam with '!?!?!?' (it reads only one). But with 'uuuuuuuu' it pronounces every vowel as a single one and goes into a stuttering repeat without losing breath... ever.


*in a kill streak voice*



I use IVONA TTS to read back to me on Windows. Indian English is my fav, and a lot more TTS companies now offer a version of Indian English. I just really like the female Indian English accent.


Yeah, but as Flavien points out:

The issue is that IVONA, now part of Amazon, and others offer TTS as a service. That means any TTS conversion already done for one client would have to be saved to I-Novae Studios' Azure servers so that the same few cents aren't paid for it again every time.
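The "save it once, reuse it" idea can be sketched as a small cache keyed by a hash of the voice and the text. This is an assumption about how it might be done, not how I-Novae Studios or any TTS vendor does it: `TtsCache` is a hypothetical name, the `synthesize` callable stands in for whatever paid service call would be used, and a local directory stands in for server-side storage.

```python
import hashlib
from pathlib import Path
from typing import Callable


class TtsCache:
    """Stores synthesized audio so each (voice, text) pair is paid for once."""

    def __init__(self, cache_dir: Path,
                 synthesize: Callable[[str, str], bytes]) -> None:
        self.cache_dir = cache_dir
        self.synthesize = synthesize  # hypothetical paid TTS call
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, voice: str, text: str) -> Path:
        # Hash voice and text together so the file name is safe and unique.
        key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
        return self.cache_dir / f"{key}.audio"

    def get(self, voice: str, text: str) -> bytes:
        path = self._path(voice, text)
        if path.exists():
            return path.read_bytes()          # cache hit: no service call
        audio = self.synthesize(voice, text)  # cache miss: pay once
        path.write_bytes(audio)
        return audio
```

Whether caching is allowed at all would depend on the service's terms, which this sketch does not address.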
