Open Language Interface for Voice Exploitation (OLIVE)

The Open Language Interface for Voice Exploitation (OLIVE) speech processing system provides robust speech information extraction amid high levels of noise and distortion in real-world data.

AI algorithms underlying OLIVE enable the technology to:

Detect the presence of speech, not just an open channel (speech activity detection);
Find and/or track speakers of interest (speaker identification);
Detect languages and dialects from a set of languages of interest (language and dialect identification); and
Detect specific keywords and phrases (keyword spotting).

Graphical user interfaces in OLIVE enable close editing of audio files, enrollment of new speakers, scoring of segments, speech activity segmentation, and semi-supervised speaker diarization (separating and labeling the individual speakers in a recording based on voice qualities).

Initially developed under the DARPA Robust Automatic Transcription of Speech (RATS) program, OLIVE is designed for easy integration into end-user applications. The technology is under continuous development and refinement based on user feedback.

Capabilities

Automatic detection of speech, speaker, keywords, and languages of interest from live streaming input or file-based audio
Automatic speaker segmentation of audio, labeling where each person speaks
Functions with high accuracy in tactical communications with high noise and across multiple channels

Key technologies

Speech Activity Detection (SAD)

Accurate on noisy operational audio
Detect speech, not just an open channel
Process hundreds of channels on low-powered hardware

Language Identification (LID)

Detect languages/dialects
Add new languages using collected audio

Speaker Identification (SID)

Find/track speakers of interest across time and channels
Add new speakers offline or live, with as little as 8 seconds of speech

Query by Example (QBE)

Language-agnostic keyword spotting
Enroll keywords/phrases offline or live with as little as a single example

Other technologies

Keyword Spotting (KWS)

Word detection in Spanish, Mandarin, Iraqi Arabic

Acoustic Event Detection (AED)

Detection of non-speech acoustic events including whistling, barking, vehicles, gunshots

Speaker Diarization (DIA)

Separation and labeling of unknown speakers in multi-speaker conversations

Forensic Speaker Identification

Close analysis speaker identification for forensic use

OLIVE GUIs

Batch processing/data triage

Automatically discard files with no speech
Find files most likely to contain target language, speaker or keywords
Processing speed scales with available CPUs
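
The triage steps above (discard files with no speech, then surface the files most likely to contain a target) can be sketched in a few lines. This is only an illustration of the idea, not OLIVE's API: the `FileScores` record, its fields, the `triage` function, and the one-second speech threshold are all hypothetical, and the scores are assumed to have been produced already by a speech activity detector and a language, speaker, or keyword detector.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileScores:
    """Hypothetical per-file results; not an OLIVE data structure."""
    path: str
    speech_seconds: float  # assumed SAD output: total detected speech
    target_score: float    # assumed detector score; higher = more likely


def triage(files: List[FileScores], min_speech_seconds: float = 1.0) -> List[FileScores]:
    """Drop files with too little speech, then rank the rest by target score."""
    with_speech = [f for f in files if f.speech_seconds >= min_speech_seconds]
    return sorted(with_speech, key=lambda f: f.target_score, reverse=True)


if __name__ == "__main__":
    batch = [
        FileScores("a.wav", speech_seconds=0.0, target_score=0.9),
        FileScores("b.wav", speech_seconds=12.5, target_score=0.2),
        FileScores("c.wav", speech_seconds=30.0, target_score=0.8),
    ]
    for f in triage(batch):
        print(f.path, f.target_score)
```

Filtering on speech content first means a high detector score on a file with no speech (such as `a.wav` here) never reaches the reviewer, which is the point of discarding silent files before ranking.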

Live streaming

Live monitoring of incoming audio streams
Save, search and review past audio
On-the-fly enrollment of new speakers
Up to 16 channels running SAD, SID, LID

Close waveform analysis

Forensic analysis of speech
Simple but powerful GUI for selecting and reviewing audio segments
Run any plugin on any selected segments
