About Text To Speech:
A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations, such as phonetic transcriptions, into speech. Android uses a text-to-speech engine to read text and convert it into speech using downloaded language data. I have noticed that few posts explain what the parts of a text-to-speech system are, so here is some brief information about them.
Parts of Text To Speech:
A text-to-speech system (or "engine") is composed of two parts:
- A Front-End
The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end.
- A Back-End
The back-end—often referred to as the synthesizer—then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations), which is then imposed on the output speech.
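The text normalization step of the front-end can be illustrated with a toy sketch. This is not how Android's engine does it: the class name TextNormalizer, the tiny abbreviation table, and the digit-by-digit spelling of numbers are all assumptions made for illustration. A real front-end handles far more cases (dates, currency, ambiguous abbreviations such as "St." for Street or Saint).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy front-end normalizer: expands a few abbreviations and spells out digits. */
final class TextNormalizer {
    // Hypothetical, deliberately tiny lookup table (illustration only).
    private static final Map<String, String> ABBREVIATIONS = new LinkedHashMap<>();
    private static final String[] DIGITS = {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
    };

    static {
        ABBREVIATIONS.put("Dr.", "Doctor");
        ABBREVIATIONS.put("St.", "Street");
        ABBREVIATIONS.put("approx.", "approximately");
    }

    /** Splits the raw text on whitespace and expands each token into written-out words. */
    static String normalize(String raw) {
        StringBuilder out = new StringBuilder();
        for (String token : raw.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for empty input
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(expand(token));
        }
        return out.toString();
    }

    private static String expand(String token) {
        String expansion = ABBREVIATIONS.get(token);
        if (expansion != null) {
            return expansion;
        }
        if (token.matches("\\d+")) {
            // Simplification: spell the number digit by digit ("42" -> "four two").
            StringBuilder words = new StringBuilder();
            for (char c : token.toCharArray()) {
                if (words.length() > 0) {
                    words.append(' ');
                }
                words.append(DIGITS[c - '0']);
            }
            return words.toString();
        }
        return token; // ordinary word: leave unchanged
    }

    public static void main(String[] args) {
        // prints: Doctor Smith lives at four two Elm Street
        System.out.println(normalize("Dr. Smith lives at 42 Elm St."));
    }
}
```

The output of a step like this is what the front-end would then pass on to grapheme-to-phoneme conversion.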
If the language data is not already available on your device, a popup dialog may ask you to download the text-to-speech language data. Make sure the language data you need is already downloaded to your device, or download it by following these steps before using the application:
Settings -> Language & input -> scroll down to Text-to-speech output -> under “Preferred Engine” click the settings icon next to Google Text-to-speech Engine -> Install voice data -> select whichever language you like -> click the download icon next to the “high quality” version (should be around 240MB) -> once downloaded it should already be selected for you.
I have used the US locale, so download the United States voice data to follow along with this post.
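Whether the voice data is present can also be decided in code from the value returned by tts.setLanguage(Locale.US). The sketch below is plain Java so it can run outside Android: the class LanguageCheck, the Action enum, and the decide method are hypothetical names, and the integer constants are local copies of the values documented for android.speech.tts.TextToSpeech. In a real app you would use the TextToSpeech constants directly.

```java
/** Sketch: decide what to do with the result code of TextToSpeech.setLanguage(). */
final class LanguageCheck {
    // Local copies of the documented android.speech.tts.TextToSpeech values
    // (assumption for this sketch: use the real constants in an Android app).
    static final int LANG_COUNTRY_VAR_AVAILABLE = 2;
    static final int LANG_COUNTRY_AVAILABLE = 1;
    static final int LANG_AVAILABLE = 0;
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    /** Hypothetical set of follow-up actions for the app. */
    enum Action { SPEAK, PROMPT_DOWNLOAD, UNSUPPORTED }

    static Action decide(int setLanguageResult) {
        switch (setLanguageResult) {
            case LANG_MISSING_DATA:
                return Action.PROMPT_DOWNLOAD; // voice data can be installed from Settings
            case LANG_NOT_SUPPORTED:
                return Action.UNSUPPORTED;     // the engine does not offer this language
            default:
                // Any non-negative code means the language is available to some degree.
                return setLanguageResult >= LANG_AVAILABLE ? Action.SPEAK : Action.UNSUPPORTED;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(LANG_MISSING_DATA));      // prints: PROMPT_DOWNLOAD
        System.out.println(decide(LANG_COUNTRY_AVAILABLE)); // prints: SPEAK
    }
}
```

On a device, the PROMPT_DOWNLOAD case could point the user to the settings steps listed earlier, or start an Intent with the action TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA.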
Create new Android Project
Project Name: TextToSpeak
Build Target: Android 2.3.3 or greater (tested from 2.3.3 up to the current Android SDK)
Application Name: TextToSpeak
Package Name: com.shaikhhamadali.blogspot.texttospeech
Create Layout file: activity_text_to_speech
1. code of Layout:
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical"
    tools:context=".TextToSpeak" >

    <TextView
        android:id="@+id/tVSpeechRate"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="Set Speech Rate" />

    <SeekBar
        android:id="@+id/sBSpeechRate"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:max="19"
        android:progress="9" />

    <TextView
        android:id="@+id/tVPitchRate"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="Set Pitch" />

    <SeekBar
        android:id="@+id/sBPitchRate"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:max="19"
        android:progress="9" />

    <EditText
        android:id="@+id/eTPronounce"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:ems="10"
        android:hint="Enter Text to Speak" >

        <requestFocus />
    </EditText>

    <Button
        android:id="@+id/btnSpeak"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Speak" />

</LinearLayout>
2. code of activity:
package com.shaikhhamadali.blogspot.texttospeech;

import java.util.Locale;

import android.app.Activity;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.EditText;
import android.widget.SeekBar;
import android.widget.SeekBar.OnSeekBarChangeListener;
import android.widget.Toast;

public class TextToSpeak extends Activity implements TextToSpeech.OnInitListener {

    // Pitch and speech rate. 1.0 is the engine's normal value and matches the
    // default seek bar progress of 9, because (9 + 1) / 10 = 1.0
    private float pitch = 1.0f, speechRate = 1.0f;

    // Declare views/controls
    private TextToSpeech tts;
    private SeekBar sBSpeechRate, sBPitchRate;
    private EditText eTPronounce;
    private Button btnSpeak;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_text_to_speech);
        initializeControls();
        /* Initialize the text-to-speech engine using the default TTS engine.
         * This will also initialize the associated TextToSpeech engine if it
         * isn't already running. */
        tts = new TextToSpeech(this, this);
    }

    private void initializeControls() {
        // Get references to the UI controls
        sBSpeechRate = (SeekBar) findViewById(R.id.sBSpeechRate);
        sBPitchRate = (SeekBar) findViewById(R.id.sBPitchRate);
        eTPronounce = (EditText) findViewById(R.id.eTPronounce);
        btnSpeak = (Button) findViewById(R.id.btnSpeak);

        /* Set seek bar change listeners to listen for every change on the
         * seek bars, either increment or decrement */
        sBSpeechRate.setOnSeekBarChangeListener(new OnSeekBarChangeListener() {
            @Override
            public void onStopTrackingTouch(SeekBar seekBar) {}

            @Override
            public void onStartTrackingTouch(SeekBar seekBar) {}

            @Override
            public void onProgressChanged(SeekBar seekBar, int progress, boolean fromUser) {
                // Add 1 and divide by 10 so that progress 0..19 maps to a rate of 0.1..2.0
                speechRate = (progress + 1) / 10f;
            }
        });
        sBPitchRate.setOnSeekBarChangeListener(new OnSeekBarChangeListener() {
            @Override
            public void onStopTrackingTouch(SeekBar seekBar) {}

            @Override
            public void onStartTrackingTouch(SeekBar seekBar) {}

            @Override
            public void onProgressChanged(SeekBar seekBar, int progress, boolean fromUser) {
                // Add 1 and divide by 10 so that progress 0..19 maps to a pitch of 0.1..2.0
                pitch = (progress + 1) / 10f;
            }
        });

        // Set default text as Welcome to shaikhhamadali.blogspot.com
        eTPronounce.setText("Welcome to shaikhhamadali.blogspot.com");

        // Keep the button disabled until the engine has been initialized in onInit
        btnSpeak.setEnabled(false);
        // On click, call speakOut to speak the entered text
        btnSpeak.setOnClickListener(new OnClickListener() {
            @Override
            public void onClick(View v) {
                speakOut();
            }
        });
    }

    @Override
    public void onInit(int status) {
        // Check the initialization status
        if (status == TextToSpeech.SUCCESS) {
            // Set the language locale to US
            int result = tts.setLanguage(Locale.US);
            // Check whether the locale's language data is available and supported
            if (result == TextToSpeech.LANG_MISSING_DATA
                    || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                Toast.makeText(this, "US English voice data is missing or not supported!",
                        Toast.LENGTH_SHORT).show();
            } else {
                // Enable the button so the user can trigger speech
                btnSpeak.setEnabled(true);
                // and speak the default text once
                speakOut();
            }
        } else {
            // Show a toast if initialization failed
            Toast.makeText(this, "TTS Engine Initialization Failed!",
                    Toast.LENGTH_SHORT).show();
        }
    }

    private void speakOut() {
        // Get the entered text to speak
        String text = eTPronounce.getText().toString();
        // Set the pitch adjusted by the user
        tts.setPitch(pitch);
        // Set the speech rate adjusted by the user
        tts.setSpeechRate(speechRate);
        /* Pass the text to the engine with queue mode QUEUE_FLUSH, where all
         * entries in the playback queue (media to be played and text to be
         * synthesized) are dropped and replaced by the new entry. Queues are
         * flushed with respect to a given calling app. Entries in the queue
         * from other callers are not discarded.
         * This three-argument overload is deprecated since API 21 but keeps
         * the code compatible with Android 2.3.3. */
        tts.speak(text, TextToSpeech.QUEUE_FLUSH, null);
    }

    @Override
    protected void onDestroy() {
        // Don't forget to stop and shut down the text-to-speech engine!
        if (tts != null) {
            tts.stop();
            tts.shutdown();
        }
        super.onDestroy();
    }
}
3. note that:
- Good practice is to always shut down the text-to-speech engine in onDestroy.
- Speech rate is the speed at which the text is spoken; you can slow it down or speed it up. A value of 1.0 is the normal rate.
- Pitch is the tone (frequency) of the voice; you can raise or lower it. A higher pitch makes the voice sound thinner, a lower pitch makes it sound deeper, and 1.0 is the normal pitch.
- Learn more uses of intents: voice search, speech recognition, and web search using an intent.
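The seek bar mapping used in the activity, (progress + 1) / 10, can be isolated into a small helper to make the arithmetic explicit. The class SeekBarMapping and both method names are hypothetical and not part of the original project; the inverse mapping is an addition that could be used to position a seek bar for a given value.

```java
/** Sketch: convert between seek bar progress and the TTS pitch / speech-rate value. */
final class SeekBarMapping {

    /** Maps progress (0..19 with android:max="19") to a value of 0.1..2.0. */
    static float progressToValue(int progress) {
        if (progress < 0) {
            throw new IllegalArgumentException("progress must not be negative: " + progress);
        }
        // Adding 1 avoids a value of 0.0 at the left end of the seek bar.
        return (progress + 1) / 10f;
    }

    /** Inverse mapping, rounded to the nearest seek bar step. */
    static int valueToProgress(float value) {
        return Math.round(value * 10f) - 1;
    }

    public static void main(String[] args) {
        System.out.println(progressToValue(9));    // prints: 1.0 (normal pitch / rate)
        System.out.println(progressToValue(19));   // prints: 2.0 (right end of the bar)
        System.out.println(valueToProgress(1.0f)); // prints: 9
    }
}
```

With the layout's default progress of 9, both seek bars start at 1.0, which is the normal speech rate and pitch.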
4. conclusion:
- Some information about the text-to-speech engine.
- Some information about the pitch and speech rate settings.
- Know how to use the seek bar control and how to read its progress values.
- Know how to download text-to-speech engine voices for any language from Settings.
5. About the post:
- The code should be self-explanatory thanks to its comments, but feel free to ask if you have any questions.
- Feel free to leave a comment with anything you would like to ask, know, suggest, or recommend.
- Hope you enjoy it!
6. Source Code:
You can download the source code from: Google Drive, GitHub
Cheers,
Hamad Ali Shaikh