Speech Technology and Research lab

Communicating with, and through, computer applications

SRI’s speech and language technologies allow us to interact more naturally with computing applications and provide a wealth of actionable information about our intentions, health, and emotional state.

Core technologies and applications

SRI’s Speech Technology and Research (STAR) Laboratory brings together a multidisciplinary mix of engineers, computer scientists and linguists. Together our experts build systems for a wide range of applications including signal processing; data indexing and mining; and computer-aided learning.

  • Noise robustness
  • Speech production and perception-based features
  • Keyword spotting
  • Prosodic modeling and disfluencies

  • Voice biometrics
  • Language/accent identification
  • Speaker and speaker-state characterization
  • Audio event detection
  • Speaker diarization

  • Speech-to-speech translation
  • Cross-lingual information retrieval
  • Machine-mediated cross-lingual communication

  • Human-computer interaction
  • Dialog systems and virtual personal assistants (VPAs)
  • Error detection and recovery
  • Semantic and syntactic parsing

  • Multi-lingual information extraction
  • Topic and event identification
  • Summarization
  • Question answering

Real-world impact


Speech and natural language leadership


Novel speech processing technology leverages AI algorithms to enable speech activity detection in high levels of noise and distortion.

Real-time speaker state platform estimates speaker state—such as emotion, sentiment, cognition, health, mental health and communication quality—in a range of end applications.

Small-footprint, high-accuracy engine incorporates patented techniques that increase recognition performance using speaker adaptation, microphone adaptation, end-of-speech detection, distributed speech recognition and noise robustness.

Toolkit specifically designed for language-learning applications and other educational and training software. It works for both adult and child voices and excels at recognizing native and non-native speakers.

Toolkit helps build and apply statistical language models for speech recognition, statistical tagging and segmentation, and machine translation. It can be downloaded and used free of charge.

