• Skip to primary navigation
  • Skip to main content
SRI InternationalSRI mobile logo

SRI International

SRI International - American Nonprofit Research Institute

  • About
    • Blog
    • Press room
  • Expertise
    • Advanced imaging systems
    • Artificial intelligence
    • Biomedical R&D services
    • Biomedical sciences
    • Computer vision
    • Cyber & formal methods
    • Education and learning
    • Innovation strategy and policy
    • National security
    • Ocean & space
    • Quantum
    • QED-C
    • Robotics, sensors & devices
    • Speech & natural language
    • Video test & measurement
  • Ventures
  • NSIC
  • Careers
  • Contact
  • 日本支社
Show Search
Hide Search
Home » Archives for Ajay Divakaran

Ajay Divakaran

SRI Author

  • Ajay Divakaran

    View all posts

ajay-divakaran-bio-pic
Other Innovators September 8, 2021

Ajay Divakaran

Ajay Divakaran September 8, 2021

Senior Technical Director, Vision and Learning Laboratory, Center for Vision Technologies

Computer vision publications March 16, 2020 Tech Report

Deep Adaptive Semantic Logic (DASL): Compiling Declarative Knowledge into Deep Neural Networks

Andrew Silberfarb, John Byrnes, Ajay Divakaran March 16, 2020

We introduce Deep Adaptive Semantic Logic (DASL), a novel framework for automating the generation of deep neural networks that incorporates user-provided formal knowledge to improve learning from data. We provide formal semantics that demonstrate that our knowledge representation captures all of first order logic and that finite sampling from infinite domains converges to correct truth values. DASL’s representation improves on prior neural-symbolic work by avoiding vanishing gradients, allowing deeper logical structure, and enabling richer interactions between the knowledge and learning components. We illustrate DASL through a toy problem in which we add structure to an image classification problem and demonstrate that knowledge of that structure reduces data requirements by a factor of 1000 . We then evaluate DASL on a visual relationship detection task and demonstrate that the addition of commonsense knowledge improves performance by 10.7 % in a data scarce setting.

Computer vision publications December 1, 2013 Conference Paper

Dynamic Pooling for Complex Event Recognition

SRI International, Ajay Divakaran December 1, 2013

The problem of adaptively selecting pooling regions for the classification of complex video events is considered. Complex events are defined as events composed of several
characteristic behaviors, whose temporal configuration can change from sequence to sequence. A dynamic pooling operator is defined so as to enable a unified solution to the
problems of event specific video segmentation, temporal structure modeling, and event detection

Computer vision publications October 1, 2013 Conference Paper

Semantic Pooling for Complex Event Detection

SRI International, Ajay Divakaran October 1, 2013

SRI Authors: Ajay Divakaran

Computer vision publications August 1, 2013 Conference Paper

Affect Analysis in Natural Human Interaction Using Joint Hidden Conditional Random Fields

SRI International, Ajay Divakaran August 1, 2013

We present a novel approach for multi-modal affect analysis in human interactions that is capable of integrating data from multiple modalities while also taking into account temporal dynamics. Our fusion approach, Joint Hidden Conditional Random Fields (JHRCFs), combines the advantages of purely feature level (early fusion) fusion approaches with late fusion (CRFs on individual modalities) to simultaneously learn the correlations between features from multiple modalities as well as their temporal dynamics. Our approach addresses major shortcomings of other fusion approaches such as the domination of other modalities by a single modality with early fusion and the loss of cross-modal information with late fusion. Extensive results on the AVEC 2011 dataset show that we outperform the state-of-the-art on the Audio Sub-Challenge, while achieving competitive performance on the Video Sub-Challenge and the Audiovisual Sub-Challenge.

Computer vision publications July 1, 2013 Conference Paper

Affect Analysis in Natural Human Interaction Using Joint Hidden Conditional Random Fields

SRI International, Ajay Divakaran July 1, 2013

SRI Authors: Ajay Divakaran

Computer vision publications April 1, 2013 Article

On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video

Ajay Divakaran April 1, 2013

A video’s soundtrack is usually highly correlated to its content. Hence, audio-based techniques have recently emerged as a means for video concept detection complementary to visual analysis. Most state-of-the-art approaches rely on manual definition of predefined sound concepts such as “engine sounds,” “outdoor/indoor sounds.” These approaches come with three major drawbacks: manual definitions do not scale as they are highly domain-dependent, manual definitions are highly subjective with respect to annotators and a large part of the audio content is omitted since the predefined concepts are usually found only in a fraction of the soundtrack. This paper explores how unsupervised audio segmentation systems like speaker diarization can be adapted to automatically identify low-level sound concepts similar to annotator defined concepts and how these concepts can be used for audio indexing. Speaker diarization systems are designed to answer the question “who spoke when?” by finding segments in an audio stream that exhibit similar properties in feature space, i.e., sound similar. Using a diarization system, all the content of an audio file is analyzed and similar sounds are clustered. This article provides an in-depth analysis on the statistic properties of similar acoustic segments identified by the diarization system in a predefined document set and the theoretical fitness of this approach to discern one document class from another. It also discusses how diarization can be tuned in order to better reflect the acoustic properties of general sounds as opposed to speech and introduces a proof-of-concept system for multimedia event classification working with diarization-based indexing.

How can we help?

Once you hit send…

We’ll match your inquiry to the person who can best help you.

Expect a response within 48 hours.

Career call to action image

Make your own mark.

Search jobs
Our work

Case studies

Publications

Timeline of innovation

Areas of expertise

Blog

Institute

Leadership

Press room

Media inquiries

Compliance

Privacy policy

Careers

Job listings

Contact

SRI Ventures

Our locations

Headquarters

333 Ravenswood Ave
Menlo Park, CA 94025 USA

+1 (650) 859-2000

Subscribe to our newsletter

日本支社

SRI International

  • Contact us
  • Privacy Policy
  • Cookies
  • DMCA
  • Copyright © 2022 SRI International