Dataset Management Platform
Trending data sets
Video
Kinetics-700

A large, high-quality video dataset of URL links t...

MP4
24.3MB
Video
Video
Casual Conversations Dataset

45,000 videos (3,011 participants) and intended to...

MP4
15GB
Video
Audio
Urban Sound 8K dataset

Contains 8732 urban sounds from 10 classes like an...

.WAV/.CSV
13.84KB
Audio
Video
VoxCeleb

An audio-visual dataset consisting of short clips ...

MP4
133MB
Video
Video
COIN

11,827 videos related to 180 different tasks, whic...

JSON
8.47MB
Video
Video
CityScapes

A large-scale dataset that contains a diverse set ...

JPG
51.92GB
Video
Audio
Mozilla Common Voice

An open source, multi-language dataset of voices t...

MP3
65GB
Audio
Video
Activity Net

A Large-Scale Video Benchmark for Human Activity U...

JSON
600GB
Video
Video
AVA-Kinetics Dataset

AVA is a project that provides audiovisual annotat...

CSV
7.7MB
Video
Video
HiEve

The largest collection of poses which focuses on v...

MP4
n/a
Video
Video
Yahoo-Flickr Creative Commons 100 Million Dataset

The YFCC100M is the largest publicly and freely us...

MP4
15GB
Video
Video
UMDFaces

UMDFaces is a face dataset divided into two parts:...

MP4
173MB
Video
Video
Condensed Movies

A large-scale video dataset, featuring clips from ...

MP4
250GB
Video
Video
AVSpeech

AVSpeech is a new, large-scale audio-visual datase...

MP4
128MB
Video
Audio
Voices Obscured in Complex Environmental Settings (VOICES) Dataset

A creative commons speech dataset targeting acoust...

MP3
1.4GB
Audio
Audio
Free Spoken digit dataset

A simple audio/speech dataset which consists of re...

WAV
10MB
Audio
Video
The WebVid-10M Dataset

A large-scale dataset of short videos with textual...

MP4
2.5MB
Video
Video
The MECCANO Dataset

The first dataset of egocentric videos to study hu...

MP4
32.3GB
Video
Video
The Stereo Human Pose Estimation Dataset

A dataset of stereo image pairs suited for stereo ...

JPG
197.8MB
Video
Video
The VIRAT Video Dataset

The VIRAT Video Dataset is designed to be realisti...

PDF
12MB
Video
Video data sets
Video
Kinetics-700

A large, high-quality video dataset of URL links t...

MP4
24.3MB
Video
Video
Casual Conversations Dataset

45,000 videos (3,011 participants) and intended to...

MP4
15GB
Video
Video
VoxCeleb

An audio-visual dataset consisting of short clips ...

MP4
133MB
Video
Video
COIN

11,827 videos related to 180 different tasks, whic...

JSON
8.47MB
Video
Video
CityScapes

A large-scale dataset that contains a diverse set ...

JPG
51.92GB
Video
Video
Activity Net

A Large-Scale Video Benchmark for Human Activity U...

JSON
600GB
Video
Video
AVA-Kinetics Dataset

AVA is a project that provides audiovisual annotat...

CSV
7.7MB
Video
Video
HiEve

The largest collection of poses which focuses on v...

MP4
n/a
Video
Video
Yahoo-Flickr Creative Commons 100 Million Dataset

The YFCC100M is the largest publicly and freely us...

MP4
15GB
Video
Video
UMDFaces

UMDFaces is a face dataset divided into two parts:...

MP4
173MB
Video
Video
Condensed Movies

A large-scale video dataset, featuring clips from ...

MP4
250GB
Video
Video
AVSpeech

AVSpeech is a new, large-scale audio-visual datase...

MP4
128MB
Video
Video
The WebVid-10M Dataset

A large-scale dataset of short videos with textual...

MP4
2.5MB
Video
Video
The MECCANO Dataset

The first dataset of egocentric videos to study hu...

MP4
32.3GB
Video
Video
The Stereo Human Pose Estimation Dataset

A dataset of stereo image pairs suited for stereo ...

JPG
197.8MB
Video
Video
The VIRAT Video Dataset

The VIRAT Video Dataset is designed to be realisti...

PDF
12MB
Video
Video
Moments In Time

A large-scale dataset for recognizing and understa...

MP4
150MB
Video
Video
Something Something Dataset

A large collection of labeled video clips that sho...

WEBM
19.4GB
Video
Video
BDD100K

Comprises ten tasks and 100K videos to estimate th...

MP4
3.9GB
Video
Video
TV Human Interaction Dataset

300+ videos from 20 different TV shows for predict...

MP4
156MB
Video
Video
THUMOS Dataset

A large collection of video clips of different kin...

MP4
385KB
Video
Video
50 Salads Dataset

Fully annotated 4.5 hour dataset of RGB-D video + ...

RGB
31GB
Video
Video
YoutubeFace

A database of face videos designed for studying th...

MP4
n/a
Video
Video
PaSc

Facial recognition dataset with 9,376 still images and 2,802 vi...

n/a
n/a
Video
Video
iQIYI-VID

The largest video dataset for multi-modal person i...

MP4
n/a
Video
Video
eNTERFACE05

Videos by 42 subjects, coming from 14 different na...

n/a
801MB
Video
Video
GEMEP corpus

10 actors portraying 10 different emotional states...

MP3
n/a
Video
Video
IEMOCAP

12 hours of audiovisual data by 10 actors; 5 emoti...

WAV
n/a
Video
Video
EyeC3D

3D video eye tracking dataset...

n/a
3.9GB
Video
Video
MoVi

A large multi-purpose human motion and video datas...

MP4
1.3MB
Video
Audio data sets
Audio
Urban Sound 8K dataset

Contains 8732 urban sounds from 10 classes like an...

.WAV/.CSV
13.84KB
Audio
Audio
Mozilla Common Voice

An open source, multi-language dataset of voices t...

MP3
65GB
Audio
Audio
Voices Obscured in Complex Environmental Settings (VOICES) Dataset

A creative commons speech dataset targeting acoust...

MP3
1.4GB
Audio
Audio
Free Spoken digit dataset

A simple audio/speech dataset which consists of re...

WAV
10MB
Audio
Audio
The Spoken Wikipedia Corpora

This is a corpus of aligned spoken Wikipedia artic...

MP3
23GB
Audio
Audio
TED-LIUM

Audio transcription of TED talks. 1495 TED talks a...

n/a
n/a
Audio
Audio
Speech Commands Dataset

65,000 one-second long utterances of 30 short word...

n/a
n/a
Audio
Audio
Persian Consonant Vowel Combination (PCVC) Speech Dataset

This dataset contains 23 Persian consonants and 6 ...

MAT
n/a
Audio
Audio
ISOLET Data Set

This 38.7 GB dataset helps predict which letter-na...

n/a
38.7GB
Audio
Audio
Arabic Speech Corpus

Phonetic and orthographic transcriptions of more t...

WAV
n/a
Audio
Audio
TIMIT

Recordings of 630 speakers of eight major dialects...

.WAV, .TXT, .WRD, .PHN
n/a
Audio
Audio
Mivia Audio Events Dataset

6,000 events of surveillance applications, namely ...

WAV
n/a
Audio
Audio
Urban Sound Dataset

1302 labeled sound recordings. Each recording is l...

WAV
n/a
Audio
Audio
Clotho Dataset

A novel audio captioning dataset, consisting of 49...

MP3
n/a
Audio
Audio
FSD50K

An open dataset of human-labeled sound events cont...

WAV
n/a
Audio
Audio
Vocal Imitation Set v1.1.3

A collection of crowd-sourced vocal imitations of ...

WAV
7.6GB
Audio
Audio
Google Audioset

632 audio event classes and a collection of 2,084,...

MP3
n/a
Audio
Audio
CALLHOME American English Speech

120 unscripted 30-minute telephone conversations b...

n/a
n/a
Audio
Audio
LibriSpeech ASR Corpus

1,000 hours of 16kHz read English speech...

MP3
n/a
Audio
Audio
Speech Accent Archive

Parallel English speech samples from 177 countries...

MP3
907MB
Audio
Audio
Phone Conversation Data Sample

Conversations in Dutch, Japanese, and Irish Englis...

.WAV, .JSON
n/a
Audio
Audio
Alexa Wake Word Voice Samples

Sample of 24 Alexa wake word recordings in four la...

WAV
n/a
Audio
Audio
The LJ Speech Dataset

Public domain speech dataset consisting of 13,100 ...

CSV
2.6GB
Audio
Audio
AISHELL-2

The largest free speech corpus available for Manda...

n/a
n/a
Audio
Audio
AEDD

500 utterances by a diverse group of actors (over ...

n/a
n/a
Audio
Audio
ANAD

1384 recordings by multiple speakers; 3 emotions: a...

WAV
2GB
Audio
Audio
AudioMNIST

Consists of 30000 audio samples of spoken digits (...

MP3
n/a
Audio
Audio
BAVED

1935 recordings by 61 speakers (45 male and 16 fema...

WAV
97.8MB
Audio
Audio
CMU-MOSEI

65 hours of annotated video from more than 1000 sp...

n/a
n/a
Audio
Audio
CMU-MOSI

2199 opinion utterances with annotated sentiment; ...

n/a
n/a
Audio
Audio
CMU Wilderness

Speech dataset of many accents reciting passages from...

MP3
n/a
Audio
Audio
CREMA-D

7,442 original clips from 91 actors. These clips w...

GIT-LFS
163MB
Audio
Audio
DAPS Dataset

20 speakers (10 female and 10 male) reading 5 exce...

n/a
n/a
Audio
Audio
Deep Clustering Dataset

Training deep discriminative embeddings to solve t...

WAV / MP3 / OGG
n/a
Audio
Audio
DEMoS

9365 emotional and 332 neutral samples produced by...

n/a
n/a
Audio
Audio
DIPCO

Dinner Party Corpus - The participants were record...

n/a
n/a
Audio
Audio
EEKK

26 text passages read by 10 speakers; 4 main emotio...

MP3
n/a
Audio
Audio
Emo-DB

800 recordings spoken by 10 actors (5 males and 5 f...

n/a
n/a
Audio
Audio
EmoFilm

1115 audio instances (sentences) extracted from vari...

WAV
n/a
Audio
Audio
Emotional Voice dataset - Nature

2,519 speech samples produced by 100 actors from 5...

n/a
n/a
Audio
Audio
EmotionTTS

Recordings and their associated transcriptions by ...

n/a
n/a
Audio
Audio
Emov-DB

Recordings for 4 speakers (2 males and 2 females); ...

n/a
n/a
Audio
Audio
EMOVO

6 actors who played 14 sentences; 6 emotions: disg...

n/a
n/a
Audio
Audio
Keio-ESD

A set of human speech with vocal emotion spoken by...

WAV
n/a
Audio
Audio
MSP-IMPROV

20 sentences by 12 actors; 4 emotions: angry, sad,...

n/a
n/a
Audio
Audio
MSP Podcast Corpus

100 hours by over 100 speakers - annotated with em...

MP3
n/a
Audio
Audio
NISQA Speech Quality Corpus

Includes 14k speech samples with simulated (codecs...

n/a
n/a
Audio
Audio
OGVC

9114 spontaneous utterances and 2656 acted utteran...

n/a
n/a
Audio
Audio
RECOLA

3.8 hours of recordings by 46 participants; negati...

n/a
n/a
Audio
Audio
The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

7356 files (total size: 24.8 GB). The database con...

WAV
24.8GB
Audio
Audio
SAVEE Dataset

4 male actors in 7 different emotions, 480 British...

MP4
n/a
Audio
Audio
SEMAINE

95 dyadic conversations from 21 subjects. Each sub...

n/a
n/a
Audio
Audio
SEWA

More than 2000 minutes of audio-visual data of 398...

n/a
n/a
Audio
Audio
ShEMO

3000 semi-natural utterances, equivalent to 3 hour...

WAV
n/a
Audio
Audio
Spoken Commands dataset

A test bed for voice activity detection algorithms...

n/a
10MB per word
Audio
Audio
Tess

2800 recordings by 2 actresses; 7 emotions: anger, ...

WAV
n/a
Audio
Audio
Thorsten dataset

German language dataset, 22,668 recorded phrases, ...

WAV
n/a
Audio
Audio
URDU-Dataset

400 utterances by 38 speakers (27 male and 11 fema...

WAV
n/a
Audio
Audio
VCTK dataset

110 English speakers with various accents; each sp...

TXT
10.94GB
Audio
Audio
VIVAE

Non-speech, 1085 audio files by ~12 speakers; non-s...

VIVAE
93.5MB
Audio
Audio
VoxPopuli

100K hours of unlabelled speech data for 23 langua...

WAV
6.4TB
Audio
Audio
The SIWIS French Speech Synthesis Database

High quality French speech recordings and associat...

WAV
2.671GB
Audio
Audio
TCOF : Traitement de Corpus Oraux en Français

The corpus made available includes two main catego...

WAV
n/a
Audio
Audio
African Accented French

This corpus consists of approximately 22 hours of ...

WAV
1.8GB
Audio
Audio
Fisher Spanish Speech

This corpus consists of audio files covering rough...

WAV
n/a
Audio
Audio
CallFriend - Spanish Corpus

The CallFriend Spanish corpus of telephone speech ...

WAV
n/a
Audio
Audio
TV3Parla

This corpus includes 240 hours of Catalan speech f...

WAV
27.6GB
Audio
Audio
emotiontts_open_db

Recordings and their associated transcriptions by ...

WAV
n/a
Audio
Audio
Pansori TEDxKR

The Pansori TEDxKR Corpus is a Korean speech recog...

WAV
174MB
Audio
Audio
EMOVO

This dataset consists of 6 actors who recite 14 se...

WAV
237MB
Audio
Audio
Online gaming voice chat corpus (OGVC)

This speech material contains 2,656 acted utteranc...

WAV
n/a
Audio
Audio
Keio University Japanese Emotional Speech Database (Keio-ESD)

A set of human speech with vocal emotion spoken by...

WAV
n/a
Audio
Audio
NST Danish ASR Database

This database was created by Nordic Language Techn...

WAV
n/a
Audio
Audio
NST Danish Dictation

This database contains speech data for Danish, mad...

WAV
n/a
Audio
Audio
NST Danish Speech Synthesis

This database contains speech data for Danish, mad...

WAV
n/a
Audio
Audio
FT Speech

FT Speech is a new speech corpus created from the ...

WAV
n/a
Audio
Audio
FalaBrasil-LaPS Benchmark

LaPS is a dataset used by the Fala Brasil group to...

WAV
n/a
Audio
Audio
M-AILABS Polish Corpus

The M-AILABS Speech Dataset is the first large dat...

WAV
n/a
Audio
Audio
Estonian Emotional Speech Corpus

26 text passages read by 10 speakers, covering 4 m...

WAV
n/a
Audio
Audio
AESDD

The Acted Emotional Speech Dynamic Database (AESDD...

WAV
391MB
Audio
Audio
Microsoft Speech Corpus (Indian languages)

Microsoft Speech Corpus (Indian languages) release...

WAV
n/a
Audio
Audio
Tunisian Modern Standard Arabic

The Tunisian_MSA corpus was originally collected t...

WAV
1.2GB
Audio
Audio
AISHELL-1

400 people from different accent areas in China ar...

WAV
15GB
Audio
Audio
Malayalam Speech Corpus

The Malayalam Speech Corpus (MSC) is one of the fi...

WAV
326MB
Audio
Audio
Google Malayalam

This data set contains transcribed high-quality au...

WAV
1.345GB
Audio
Audio
Facebook AI is releasing Multilingual LibriSpeech

Multilingual LibriSpeech (MLS), a large-scale, open...

WAV
3TB
Audio
Audio
The BABEL Project

BABEL was a joint European project under the COPER...

WAV
n/a
Audio
Audio
Living Audio Dataset

A "Crowd-Built" continuously growing speech datase...

WAV
n/a
Audio
Audio
Microsoft Speech Language Translation Corpus

The Microsoft Speech Language Translation Corpus r...

WAV
326MB
Audio