Adding speech recognition to your Jetpack Compose Android app — The Minimum Effort Guide

6 min readJun 19, 2024

NB — pretty much the entirety of the content (or the inspiration of it) of this post comes from this Youtube video by Indently. Give them the wonderful credit please (and if you’re still using Views, take a look at the vid!). All I did was update it to Jetpack Compose.

What you’ll have by the end of this post

An app with button that allows you to launch Google’s speech recognition and then returns to the app the resulting speech-to-text, which is displayed in the text on the screen.

But I want it now

OK fine, for those of you who just want grabby-grab, here’s the final code up-front. If you need it, the detail comes after…

package de.chrisward.googlespeechrecognitionprototype

import android.app.Activity
import android.content.Intent
import android.os.Bundle
import android.speech.RecognizerIntent
import androidx.activity.ComponentActivity
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.compose.setContent
import androidx.activity.enableEdgeToEdge
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.padding
import androidx.compose.material3.Button
import androidx.compose.material3.Scaffold
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.tooling.preview.Preview
import androidx.compose.ui.unit.dp
import de.chrisward.googlespeechrecognitionprototype.ui.theme.GoogleSpeechRecognitionPrototypeTheme
import java.util.Locale

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()
        setContent {
            GoogleSpeechRecognitionPrototypeTheme {
                Scaffold(modifier = Modifier.fillMaxSize()) { innerPadding ->
                    MainScreen(modifier = Modifier.padding(innerPadding))
                }
            }
        }
    }
}

@Composable
fun MainScreen(modifier: Modifier = Modifier) {
    val speechText = remember { mutableStateOf("Your speech will appear here.") }
    val launcher = rememberLauncherForActivityResult(ActivityResultContracts.StartActivityForResult()) {
        if (it.resultCode == Activity.RESULT_OK) {
            val data = it.data
            val result = data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            speechText.value = result?.get(0) ?: "No speech detected."
        } else {
            speechText.value = "[Speech recognition failed.]"
        }
    }
    Column(modifier = modifier
        .fillMaxSize(),
        horizontalAlignment = Alignment.CenterHorizontally,
        verticalArrangement = Arrangement.Center
    ) {
        Button(onClick = {
            val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
            intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Go on then, say something.")
            launcher.launch(intent)
        }) {
            Text("Start speech recognition")
        }
        Spacer(modifier = Modifier.padding(16.dp))
        Text(speechText.value)
    }
}

And here’s the not so tldr…

Create a new project in Android Studio

Choose Empty Activity and select Next.
Call it what you like, but I’m calling it Google Speech Recognition Prototype and I’m setting a Min SDK of API 29 (Android 10). Then press Finish.

Create your screen

We want a simple screen with a vertically and horizontally-centred Button and Text component. The button launches speech recognition, and the text displays the resulting text afterwards.

Yes, I also hate that Medium doesn’t let you resize images. Sorry for the extra scrolling effort…

Let’s create a Composable for the screen called MainScreen. In it, we’ll have a Column with the Button and the Text centre-aligned…

@Composable
fun MainScreen(modifier: Modifier = Modifier) {
    val speechText = remember { mutableStateOf("Your speech will appear here.") }

    Column(modifier = modifier
        .fillMaxSize(),
        horizontalAlignment = Alignment.CenterHorizontally,
        verticalArrangement = Arrangement.Center
    ) {
        Button(onClick = {}) {
            Text("Start speech recognition")
        }
        Spacer(modifier = Modifier.padding(16.dp))
        Text(speechText.value)
    }
}

Nothing special here — just a column with a button and a text component. The text component makes use of a MutableState speechText variable towards the top, as we expect this to change depending on what is returned from speech recognition.

Don’t forget to change your onCreate method to use this MainScreen Composable instead of the built in Greeting:

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        enableEdgeToEdge()
        setContent {
            GoogleSpeechRecognitionPrototypeTheme {
                Scaffold(modifier = Modifier.fillMaxSize()) { innerPadding ->
                    MainScreen(modifier = Modifier.padding(innerPadding))
                }
            }
        }
    }
}

Create the launcher for Android’s built-in speech recognition

Previously one would have done this with startActivityForResult(), but this has now been deprecated. The up-to-date way of launching a different activity with the expectation of something being returned afterwards is using rememberLauncherForActivityResult. We set it up towards the top of the Composable and we only launch it with the specific Intent upon the button press, as follows:

val launcher = rememberLauncherForActivityResult(ActivityResultContracts.StartActivityForResult()) {
    if (it.resultCode == Activity.RESULT_OK) {
       val data = it.data
       val result = data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
       speechText.value = result?.get(0) ?: "No speech detected."
    } else {
       speechText.value = "[Speech recognition failed.]"
    }
}

Here we are simply stating what we wish to happen when Google’s speech recogniser (or indeed any other Intent launched from launcher should do with the result.

First we check to see if the speech recognition was a success with by checking:

if (it.resultCode == Activity.RESULT_OK) {
..
}

Otherwise we change our text to show an error message.

We then extract the speech text from the data returned to us specifying that we want information from Recognizer. If no data is present (e.g. if the user didn’t provide any speech or just pushed the back button immediately) this will be empty, so we state that no speech was detected.

val data = it.data
val result = data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
speechText.value = result?.get(0) ?: "No speech detected."

And here it is in the context of the rest of the code:

@Composable
fun MainScreen(modifier: Modifier = Modifier) {
    val speechText = remember { mutableStateOf("Your speech will appear here.") }
    val launcher = rememberLauncherForActivityResult(ActivityResultContracts.StartActivityForResult()) {
        if (it.resultCode == Activity.RESULT_OK) {
            val data = it.data
            val result = data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            speechText.value = result?.get(0) ?: "No speech detected."
        } else {
            speechText.value = "[Speech recognition failed.]"
        }
    }
    Column(modifier = modifier
        .fillMaxSize(),
        horizontalAlignment = Alignment.CenterHorizontally,
        verticalArrangement = Arrangement.Center
    ) {
        Button(onClick = {}) {
            Text("Start speech recognition")
        }
        Spacer(modifier = Modifier.padding(16.dp))
        Text(speechText.value)
    }
}

This is all fine and dandy, but it does nothing until we actually launch something with the launcher. So let’s do that…

Launch the launcher

If you’ve used startActivityForResult() before this will look pretty similar to you. In the new Android world, you simply call the launcher’s launch method with the Intent you want to launch in much the same way you’d have done before calling startActivityForResult() in the past.

Button(onClick = {
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Go on then, say something.")    
    launcher.launch(intent)
}) {
    Text("Start speech recognition")
}

All we’re doing here within the button’s onClick lambda is creating an intent stating we want to use SpeechRecognizer, then specifying that we want to use the free-form language model (based on free-form speech recognition as opposed to web search terms), specifying the language to be the device’s default locale, and optionally sending a prompt. Then we call the launch method of our previously created launcher.

And as before, here’s that code snippet in the context of the rest of the code.

@Composable
fun MainScreen(modifier: Modifier = Modifier) {
    val speechText = remember { mutableStateOf("Your speech will appear here.") }
		val launcher = rememberLauncherForActivityResult(ActivityResultContracts.StartActivityForResult()) {
        if (it.resultCode == Activity.RESULT_OK) {
            val data = it.data
            val result = data?.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)
            speechText.value = result?.get(0) ?: "No speech detected."
        } else {
            speechText.value = "[Speech recognition failed.]"
        }
    }
    Column(modifier = modifier
        .fillMaxSize(),
        horizontalAlignment = Alignment.CenterHorizontally,
        verticalArrangement = Arrangement.Center
    ) {
    Button(onClick = {
            val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault())
            intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Go on then, say something.")    
            launcher.launch(intent)
        }) {
            Text("Start speech recognition")
        }
        Spacer(modifier = Modifier.padding(16.dp))
        Text(speechText.value)
    }
}

And that’s it…

This is the lowest effort way of introducing speech recognition into your app. If all you want is text-to-speech and you don’t mind Google’s UI, this’ll do the job!

Adding speech recognition to your Jetpack Compose Android app — The Minimum Effort Guide

What you’ll have by the end of this post

But I want it now

Create a new project in Android Studio

Create your screen

Create the launcher for Android’s built-in speech recognition

Launch the launcher

And that’s it…

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Chris Ward

Responses (1)

More from Chris Ward

Installing Jupyter Notebook on Termux on Android with one command

tldr…

I made a very quick Android app to help you get 5 friends out to vote for democratic parties on…

It’s very “graphic design is my passion” so be gentle, but Friends Against Fascists is also entirely offline and doesn’t send your data…

Deserialising like a pro — Getting to grips with Moshi’s JsonReader

Creating your first ever very own custom Android View!

When I first started out with Android, most of the work I did was with the predefined Views — TextView, EditText, etc. It took me a long…

Recommended from Medium

The 5 paid subscriptions I actually use in 2025 as a Staff Software Engineer

Tools I use that are cheaper than Netflix

Enhance Your App with Conversational Input Using ChatGPT API

Shift from Traditional User Input to an Engaging Chat-Based Input

Lists

Medium's Huge List of Publications Accepting Submissions

Staff picks

Building a Music Player App with Jetpack Compose: Services, Permissions, Broadcast Receiver and…

Hello Buddy’s, Creating a music player app is an exciting project that combines many core Android development concepts, including handling…

You are an absolute moron for believing in the hype of “AI Agents”.

All of my articles are 100% free to read. Non-members can read for free by clicking my friend link!

🚀 10 Kotlin Coroutine Mistakes Every Senior Android Developer Must Avoid (With Real-World Fixes!)

Kotlin Coroutines have revolutionized asynchronous programming on Android, making it simpler, cleaner, and more intuitive. However, even…

Google MLKit Face Recognition with CameraX Android

What is Face detection?

Adding speech recognition to your Jetpack Compose Android app — The Minimum Effort Guide

What you’ll have by the end of this post

But I want it now

Create a new project in Android Studio

Create your screen

Create the launcher for Android’s built-in speech recognition

Launch the launcher

And that’s it…

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Chris Ward

Responses (1)

More from Chris Ward

Installing Jupyter Notebook on Termux on Android with *one* command

tldr…

I made a very quick Android app to help you get 5 friends out to vote for democratic parties on…

It’s very “graphic design is my passion” so be gentle, but Friends Against Fascists is also entirely offline and doesn’t send your data…

Deserialising like a pro — Getting to grips with Moshi’s JsonReader

Creating your first ever very own custom Android View!

When I first started out with Android, most of the work I did was with the predefined Views — TextView, EditText, etc. It took me a long…

Recommended from Medium

The 5 paid subscriptions I actually use in 2025 as a Staff Software Engineer

Tools I use that are cheaper than Netflix

Enhance Your App with Conversational Input Using ChatGPT API

Shift from Traditional User Input to an Engaging Chat-Based Input

Lists

Medium's Huge List of Publications Accepting Submissions

Staff picks

Building a Music Player App with Jetpack Compose: Services, Permissions, Broadcast Receiver and…

Hello Buddy’s, Creating a music player app is an exciting project that combines many core Android development concepts, including handling…

You are an absolute moron for believing in the hype of “AI Agents”.

All of my articles are 100% free to read. Non-members can read for free by clicking my friend link!

🚀 10 Kotlin Coroutine Mistakes Every Senior Android Developer Must Avoid (With Real-World Fixes!)

Kotlin Coroutines have revolutionized asynchronous programming on Android, making it simpler, cleaner, and more intuitive. However, even…

Google MLKit Face Recognition with CameraX Android

What is Face detection?

Installing Jupyter Notebook on Termux on Android with one command