N-CANDA Data Integration: Anatomy of an Asynchronous Infrastructure for Multi-Site, Multi-Instrument Longitudinal Data Capture


Rohlfing, T., Cummins, K., Henthorn, T., Chu, W., & Nichols, B. N. (2014). N-CANDA data integration: anatomy of an asynchronous infrastructure for multi-site, multi-instrument longitudinal data capture. Journal of the American Medical Informatics Association : JAMIA, 21(4), 758-762.


The data collection infrastructure implemented by the National Consortium on Alcohol and NeuroDevelopment in Adolescence (N-CANDA) comprises several innovative features: (a) secure, asynchronous transfer and persistent storage of collected data via a revision control system; (b) two-stage import into a longitudinal database; and (c) use of a script-controlled web browser for data retrieval from a third-party, web-based neuropsychological test battery. The asynchronous operation of data transmission and import is of particular benefit, as it allowed the consortium sites to begin data collection before the receiving database infrastructure had been deployed. Records were collected within 86 days of funding, 35 days after the collected instruments were finalized. The final instruments were added to the database import 225 days after instrument selection, with up to 173 records already collected at that time. Thus, the concepts implemented in N-CANDA’s data collection system helped reduce project start-up time by several months.
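The decoupling of transmission from import described above can be illustrated with a minimal sketch. It is not the N-CANDA implementation: the `Repository` and `Importer` classes, the record fields (`subject`, `visit`, `score`), the site names, and the particular split into a "stage" step and a "load" step are all assumptions made for illustration, since the abstract does not specify what the two import stages do. The sketch only shows why an append-only, revision-numbered store lets sites submit records before any importer exists.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a revision control repository:
    an append-only log in which every commit gets a revision number."""
    commits: list = field(default_factory=list)

    def commit(self, site, record):
        # Store a copy so later changes by the caller do not alter history.
        self.commits.append({"site": site, "record": dict(record)})
        return len(self.commits)  # revision number of this commit


@dataclass
class Importer:
    """Hypothetical two-stage importer (the stages are an assumption)."""
    last_revision: int = 0
    staging: list = field(default_factory=list)
    database: dict = field(default_factory=dict)

    def stage(self, repo):
        # Stage 1: pick up every commit made since the last run.
        new_commits = repo.commits[self.last_revision:]
        self.staging.extend(new_commits)
        self.last_revision = len(repo.commits)
        return len(new_commits)

    def load(self):
        # Stage 2: file staged records under (subject, visit) keys,
        # a simple longitudinal layout.
        loaded = 0
        for entry in self.staging:
            rec = entry["record"]
            key = (rec["subject"], rec["visit"])
            self.database[key] = {**rec, "site": entry["site"]}
            loaded += 1
        self.staging.clear()
        return loaded


# Sites commit records before any importer has been created.
repo = Repository()
repo.commit("site_a", {"subject": "S001", "visit": 1, "score": 12})
repo.commit("site_b", {"subject": "S002", "visit": 1, "score": 9})

# The importer is deployed later and works through the backlog.
importer = Importer()
staged_first = importer.stage(repo)
loaded_first = importer.load()

# A later commit is picked up by the next import run.
repo.commit("site_a", {"subject": "S001", "visit": 2, "score": 14})
staged_second = importer.stage(repo)
loaded_second = importer.load()
```

Because the importer tracks only the last revision it has seen, nothing committed before its deployment is lost, which mirrors the benefit the abstract attributes to asynchronous operation.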

Keywords: informatics, longitudinal data collection, data integration, revision control system
