Speech technology and research lab

Communicating with, and through, computer applications

SRI’s speech and language technologies allow us to interact more naturally with computing applications and provide a wealth of actionable information about our intentions, health, and emotional state.

Core technologies and applications

SRI’s Speech Technology and Research (STAR) Laboratory brings together a multidisciplinary mix of engineers, computer scientists, and linguists. Together, our experts build systems for a wide range of applications, including signal processing, data indexing and mining, and computer-aided learning.

  • Noise robustness
  • Speech production and perception-based features
  • Keyword spotting
  • Prosodic modeling and disfluencies

  • Voice biometrics
  • Language/accent identification
  • Speaker and speaker-state characterization
  • Audio event detection
  • Speaker diarization

  • Speech-to-speech translation
  • Cross-lingual information retrieval
  • Machine-mediated cross-lingual communication

  • Human-computer interaction
  • Dialog systems and virtual personal assistants (VPAs)
  • Error detection and recovery
  • Semantic and syntactic parsing

  • Multi-lingual information extraction
  • Topic and event identification
  • Summarization
  • Question answering

Real-world impact

Speech and natural language leadership

Platforms

Novel speech processing technology leverages AI algorithms to enable speech activity detection in high levels of noise and distortion.

Real-time speaker state platform estimates speaker state—such as emotion, sentiment, cognition, health, mental health and communication quality—in a range of end applications.

Small-footprint, high-accuracy engine incorporates patented techniques that increase recognition performance using speaker adaptation, microphone adaptation, end-of-speech detection, distributed speech recognition and noise robustness.

Toolkit specifically designed for language-learning applications and other educational and training software. It works for both adult and child voices and excels at recognizing native and non-native speakers.

Toolkit helps build and apply statistical language models for speech recognition, statistical tagging and segmentation, and machine translation. Can be downloaded and used free of charge.

Publications
