Twine DMP

Kinetics-700

A large, high-quality video dataset of URL links t...

MP4

24.3MB

Video

Casual Conversations Dataset

45,000 videos (3,011 participants) and intended to...

MP4

15GB

Video

An audio-visual dataset consisting of short clips ...

11,827 videos related to 180 different tasks, whic...

A large-scale dataset that contains a diverse set ...

A Large-Scale Video Benchmark for Human Activity U...

AVA is a project that provides audiovisual annotat...

The largest collection of poses which focuses on v...

MP4

n/a

Video

Yahoo-Flickr Creative Commons 100 Million Dataset

The YFCC100M is the largest publicly and freely us...

UMDFaces is a face dataset divided into two parts:...

A large-scale video dataset, featuring clips from ...

AVSpeech is a new, large-scale audio-visual datase...

MP4

128MB

Video

The WebVid-10M Dataset

A large-scale dataset of short videos with textual...

The first dataset of egocentric videos to study hu...

MP4

32.3GB

Video

The Stereo Human Pose Estimation Dataset

A dataset of stereo image pairs suited for stereo ...

JPG

197.8MB

Video

The VIRAT Video Dataset

The VIRAT Video Dataset is designed to be realisti...

A large-scale dataset for recognizing and understa...

MP4

150MB

Video

Something Something Dataset

A large collection of labeled video clips that sho...

Comprises ten tasks and 100K videos to estimate th...

MP4

3.9GB

Video

TV Human Interaction Dataset

300+ videos from 20 different TV shows for predict...

A large collection of video clips of different kin...

Fully annotated 4.5 hour dataset of RGB-D video + ...

A database of face videos designed for studying th...

Facial recognition: 9,376 still images and 2,802 vi...

n/a

Video

iQIYI-VID

The largest video dataset for multi-modal person i...

Videos by 42 subjects, coming from 14 different na...

10 actors portraying 10 different emotional states...

12 hours of audiovisual data by 10 actors; 5 emoti...

3D video eye tracking dataset...

A large multi-purpose human motion and video datas...

Urban Sound 8K dataset

Contains 8732 urban sounds from 10 classes like an...

An open source, multi-language dataset of voices t...

MP3

65GB

Audio

Voices Obscured in Complex Environmental Settings (VOICES) Dataset

A creative commons speech dataset targeting acoust...

MP3

1.4GB

Audio

Free Spoken digit dataset

A simple audio or speech dataset which consists of re...

WAV

10MB

Audio

The Spoken Wikipedia Corpora

This is a corpus of aligned spoken Wikipedia artic...

Audio transcription of TED talks. 1495 TED talks a...

n/a

Audio

Speech Commands Dataset

65,000 one-second long utterances of 30 short word...

n/a

Audio

Persian Consonant Vowel Combination (PCVC) Speech Dataset

This dataset contains 23 Persian consonants and 6 ...

This 38.7 GB dataset helps predict which letter-na...

Phonetic and orthographic transcriptions of more t...

Recordings of 630 speakers of eight major dialects...

.WAV, .TXT, .WRD, .PHN

n/a

Audio

Mivia Audio Events Dataset

6,000 events of surveillance applications, namely ...

1302 labeled sound recordings. Each recording is l...

A novel audio captioning dataset, consisting of 49...

An open dataset of human-labeled sound events cont...

WAV

n/a

Audio

Vocal Imitation Set v1.1.3

A collection of crowd-sourced vocal imitations of ...

635 audio event classes and a collection of 2,084,...

MP3

n/a

Audio

CALLHOME American English Speech

120 unscripted 30-minute telephone conversations b...

n/a

Audio

LibriSpeech ASR Corpus

1,000 hours of 16kHz read English speech...

MP3

n/a

Audio

Speech Accent Archive

Parallel English speech samples from 177 countries...

MP3

907MB

Audio

Phone Conversation Data Sample

Conversations in Dutch, Japanese, and Irish Englis...

.WAV, .JSON

n/a

Audio

Alexa Wake Word Voice Samples

Sample of 24 Alexa wake word recordings in four la...

WAV

n/a

Audio

The LJ Speech Dataset

Public domain speech dataset consisting of 13,100 ...

The largest free speech corpus available for Manda...

n/a

Audio

AESDD

500 utterances by a diverse group of actors (over ...

n/a

Audio

ANAD

1384 recordings by multiple speakers; 3 emotions: a...

Consists of 30000 audio samples of spoken digits (...

1935 recordings by 61 speakers (45 male and 16 fema...

65 hours of annotated video from more than 1000 sp...

n/a

Audio

CMU-MOSI

2199 opinion utterances with annotated sentiment; ...

n/a

Audio

CMU Wilderness

Speech dataset with speakers of many accents reciting passages from...

7,442 original clips from 91 actors. These clips w...

20 speakers (10 female and 10 male) reading 5 exce...

n/a

Audio

Deep Clustering Dataset

Training deep discriminative embeddings to solve t...

9365 emotional and 332 neutral samples produced by...

n/a

Audio

DIPCO

Dinner Party Corpus - The participants were record...

n/a

Audio

EEKK

26 text passages read by 10 speakers; 4 main emotio...

800 recordings spoken by 10 actors (5 males and 5 f...

n/a

Audio

EmoFilm

1115 audio instances of sentences extracted from vari...

WAV

n/a

Audio

Emotional Voice dataset - Nature

2,519 speech samples produced by 100 actors from 5...

n/a

Audio

EmotionTTS

Recordings and their associated transcriptions by ...

n/a

Audio

Emov-DB

Recordings for 4 speakers (2 males and 2 females); ...

n/a

Audio

EMOVO

6 actors who played 14 sentences; 6 emotions: disg...

n/a

Audio

Keio-ESD

A set of human speech with vocal emotion spoken by...

20 sentences by 12 actors; 4 emotions: angry, sad,...

n/a

Audio

MSP Podcast Corpus

100 hours by over 100 speakers - annotated with em...

MP3

n/a

Audio

NISQA Speech Quality Corpus

Includes 14k speech samples with simulated (codecs...

n/a

Audio

OGVC

9114 spontaneous utterances and 2656 acted utteran...

n/a

Audio

RECOLA

3.8 hours of recordings by 46 participants; negati...

n/a

Audio

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

7356 files (total size: 24.8 GB). The database con...

4 male actors in 7 different emotions, 480 British...

95 dyadic conversations from 21 subjects. Each sub...

n/a

Audio

SEWA

More than 2000 minutes of audio-visual data of 398...

n/a

Audio

ShEMO

3000 semi-natural utterances, equivalent to 3 hour...

WAV

n/a

Audio

Spoken Commands dataset

A test bed for voice activity detection algorithms...

2800 recordings by 2 actresses; 7 emotions: anger, ...

German language dataset, 22,668 recorded phrases, ...

400 utterances by 38 speakers (27 male and 11 fema...

110 English speakers with various accents; each sp...

Non-speech; 1085 audio files by ~12 speakers; non-s...

100K hours of unlabelled speech data for 23 langua...

WAV

6.4TB

Audio

The SIWIS French Speech Synthesis Database

High quality French speech recordings and associat...

WAV

2.671GB

Audio

TCOF : Traitement de Corpus Oraux en Français

The corpus made available includes two main catego...

WAV

n/a

Audio

African Accented French

This corpus consists of approximately 22 hours of ...

WAV

1.8GB

Audio

Fisher Spanish Speech

This corpus consists of audio files covering rough...

WAV

n/a

Audio

CallFriend - Spanish Corpus

The CallFriend Spanish corpus of telephone speech ...

This corpus includes 240 hours of Catalan speech f...

Recordings and their associated transcriptions by ...

The Pansori TEDxKR Corpus is a Korean speech recog...

This dataset consists of 6 actors who recite 14 se...

WAV

237MB

Audio

Online gaming voice chat corpus (OGVC)

This speech material contains 2,656 acted utteranc...

WAV

n/a

Audio

Keio University Japanese Emotional Speech Database (Keio-ESD)

A set of human speech with vocal emotion spoken by...

WAV

n/a

Audio

NST Danish ASR Database

This database was created by Nordic Language Techn...

This database contains speech data for Danish, mad...

WAV

n/a

Audio

NST Danish Speech Synthesis

This database contains speech data for Danish, mad...

FT Speech is a new speech corpus created from the ...

WAV

n/a

Audio

FalaBrasil-LaPS Benchmark

LaPS is a dataset used by the Fala Brasil group to...

WAV

n/a

Audio

M-AILABS Polish Corpus

The M-AILABS Speech Dataset is the first large dat...

WAV

n/a

Audio

Estonian Emotional Speech Corpus

26 text passages read by 10 speakers, covering 4 m...

The Acted Emotional Speech Dynamic Database (AESDD...

WAV

391MB

Audio

Microsoft Speech Corpus (Indian languages)

Microsoft Speech Corpus (Indian languages) release...

WAV

n/a

Audio

Tunisian Modern Standard Arabic

The Tunisian_MSA corpus was originally collected t...

400 people from different accent areas in China ar...

WAV

15GB

Audio

Malayalam Speech Corpus

The Malayalam Speech Corpus (MSC) is one of the fi...

This data set contains transcribed high-quality au...

WAV

1.345GB

Audio

Facebook AI is releasing Multilingual LibriSpeech

Multilingual LibriSpeech (MLS), a large-scale, open...

BABEL was a joint European project under the COPER...

A "Crowd-Built" continuously growing speech datase...

WAV

n/a

Audio

Microsoft Speech Language Translation Corpus

The Microsoft Speech Language Translation Corpus r...

WAV

326MB

Audio