


Ratner Corpus

N. Ratner, S. Wagovich, S. Orloski

This dataset contains corpus records from 23 children within 4 months of stuttering onset and 15 age-, gender-, and SES-matched fluent peers, used in a series of published research reports from 1997 to 2012. Both audio and video recordings of language testing sessions were made, and language samples were transcribed.
Stuttering, Language development

Lu Corpus

C. Lu

A corpus of text and audio was collected from Taiwanese dementia patients. The dataset focuses on narrative and procedural discourse. Patients read set texts aloud, and the audio was recorded.

The Multimodal Dyadic Behavior Dataset

James M. Rehg, A. Rozga, Gregory D. Abowd, Matthew S. Goodwin

Behavioral imaging can affect the quality of care for individuals with a developmental or behavioral disorder. Multimodal video, audio, and affective sensor recordings are annotated for relevant child behaviours.
Autism Spectrum Disorders, Behavioral research

The Nemours database of dysarthric speech

X. Menendez-Pidal, J.B. Polikoff, S.M. Peters, J.E. Leonzio, H.T. Bunnell

This work collects speech from dysarthric patients to investigate the general characteristics of dysarthric speech. Each speaker produced 74 sentences, and the speech was then labeled at the phoneme level.

Purdue RVL-SLLL American Sign Language Database

R. Wilbur

A video dataset is proposed for automatic recognition of American Sign Language. High-resolution cameras and different lighting conditions were used to record videos of motion primitives and shapes, and of subjects signing two or more sentences, including prosody patterns.
Sign language, American Sign Language

DSet Touchscreen typing-pattern analysis for detecting fine motor skills decline in early-stage Parkinson’s disease

D. Iakovakis, S. Hadjidimitriou, V. Charisis, S. Bostantzopoulou, Z. Katsarou, Leontios J. Hadjileontiadis

A machine learning classifier to automatically detect motor impairment in patients with Parkinson's disease is constructed using a dataset collected by recording touchscreen keystrokes in a clinical setting. The collection protocol includes a typing experiment with multiple text excerpts on smartphones and a clinical evaluation of early PD patients and healthy controls.
Parkinson's disease, Parkinsons

Illinois International Stuttering Research Project Corpus

N. Ambrose, E. Yairi

This dataset is part of the longitudinal stuttering research project at the University of Illinois. It is used to study the onset and subsequent developmental course of stuttering in children under age six, documenting what happens to children who begin stuttering and generating criteria for risk. Speech-language samples were collected from preschool children with and without stuttering, who completed multiple speech and language tests at different ages.
Stuttering, Early Childhood Stuttering, Stuttering-like disfluencies

Alector: A Parallel Corpus of Simplified French Texts with Alignments of Misreadings by Poor and Dyslexic Readers

N. Gala, A. Tack, L. Javourey-Drevet, T. François, Johannes C. Ziegler

Reading errors by poor and dyslexic readers are collected to build a corpus for the development of automatic text simplification tools. A sample of 21 children was asked to read aloud 5 original and 5 simplified texts, and to answer a reading comprehension test after each text.
dyslexia, Reading Impaired

K-RSL: A Dataset for Linguistic Understanding, Visual Evaluation, and Recognition of Sign Languages

A. Imashev, M. Mukushev, V. Kimmelman, A. Sygulova

This work presents the first Kazakh-Russian Sign Language (KRSL) corpus that includes non-manual features to improve sign recognition accuracy. The recording setup used a green background in an office space without professional lighting to collect videos of phrases signed by 5 professional sign language interpreters and, for one subset, 5 deaf native signers.
Sign language, Kazakh-Russian Sign Language


Y. Li, C. Juan

A dataset of 3D depth video and skeleton data is collected for Chinese sign language. Four signers performed 2000 Chinese sign language signs twice each and four signers performed them once; the signing was recorded using a 3D Kinect sensor.
Sign language, Chinese Sign Language

English PPA DePaul

R. DePaul

Speech samples were collected to construct a dataset on the speech difficulties faced by people with aphasia. The patient was given tasks in two sessions, with audio recorded for one and video recorded for the other.
Aphasia, Primary Progressive Aphasia

English PPA Hopkins

A. Hillis

A dataset of speech is collected to investigate the progression of Primary Progressive Aphasia. Patients were given tasks, and some of them were recorded.
Aphasia, Primary Progressive Aphasia

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues

S. Albanie, G. Varol, L. Momeni, T. Afouras, J. Chung, N. Fox, A. Zisserman

A large-scale dataset of continuous BSL signing is generated from publicly broadcast TV programs and used to train sign language recognition models for co-articulated signs. Weakly-aligned subtitles for broadcast footage, together with a keyword spotting method, were used to automatically localise sign instances for a vocabulary of 1,000 signs in 1,000 hours of video.
Sign language, British Sign Language

iMove dataset

H. Kacorri, S. Mascetti, A. Gerino, D. Ahmetovic, H. Takagi, C. Asakawa

This work collects mobile app usage data to understand how people with visual impairment use VoiceOver and other settings. Participants with visual impairment used software on a mobile device that logged their interactions.
Visual impairment

NKI-CCRT corpus: speech intelligibility before and after advanced head and neck cancer treated with concomitant chemoradiotherapy

R.P. Clapham, L. van der Molen, R.J.J.H. van Son, M. van den Brekel, F.J.M. Hilgers

This work investigates how speech differs before and after surgery and treatment for neck cancer. All participants read the same Dutch fairy tale, and the recorded speech was annotated with text and noise markers.
Speech Intelligibility, Neck Cancer

Diagnostically relevant facial gestalt information from ordinary photos

Q. Ferry, J. Steinberg, C. Webber, D. FitzPatrick, C. Ponting, A. Zisserman, C. Nellåker

Ordinary non-clinical photographs are used to model facial dysmorphisms in order to reduce the diagnostic search space for patients with developmental disorders. Images of people with various developmental disorders were collected from the internet, along with images of people without these disorders.
Facial Phenotype, Down's Syndrome


Y. Li, C. Juan

A dataset of 3D depth video and skeleton data is collected for Chinese sign language. Four signers performed 500 daily vocabulary words twice each and four signers performed them once; the signing was recorded using a 3D Kinect sensor.
Sign language, Chinese Sign Language

Detecting neurodegenerative disorders from web search signals

R. White, M. Doraiswamy, E. Horvitz

The study used web search engine queries and usage patterns to estimate the chance that a user has a neurodegenerative disorder. Microsoft Bing search queries were collected.
Parkinson's disease, Parkinsons, Alzheimer's disease

English Kempler

D. Kempler

The work investigates the language ability of patients with probable Alzheimer's disease. Conversations were recorded as patients described the Cookie Theft picture from the Boston Diagnostic Aphasia Examination.
Alzheimer's disease, Aphasia

GData Motor Impairment Estimates via Touchscreen Typing Dynamics Toward Parkinson's Disease Detection From Data Harvested In-the-Wild

D. Iakovakis, S. Hadjidimitriou, V. Charisis

A machine learning classifier to automatically detect motor impairment in patients with Parkinson's disease is constructed using a dataset collected by recording touchscreen keystrokes in the wild. Subjects from four countries across the EU remotely contributed pseudo-anonymised multimodal data on keyboard interactions and keystroke dynamics through an app installed on their smartphones. To keep the process privacy-aware, the characters typed were not captured.
Parkinsons, Parkinson's disease