This repository contains the ARAUS dataset, a publicly available dataset (comprising a 5-fold training/validation set and an independent test set) of 25,440 unique subjective perceptual responses to augmented soundscapes presented as audio-visual stimuli. Each augmented soundscape was made by digitally adding a "masker" (bird, water, wind, traffic, construction, or silence) to an urban soundscape recording at a fixed soundscape-to-masker ratio. This mimics a real-life soundscape augmentation system, whereby a speaker (or some other sound source) is used to add "maskers" to an actual urban soundscape.
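As a rough illustration of this augmentation procedure (this is not the dataset's actual generation code; the file names and the mix_at_smr helper below are assumptions made up for the sketch), a masker can be scaled so that the RMS soundscape-to-masker ratio matches a target value in decibels and then added sample-wise to the soundscape recording:

```python
# Minimal sketch of masker addition at a fixed soundscape-to-masker ratio (SMR).
# File names and this helper are illustrative assumptions, not part of ARAUS.
import numpy as np
import soundfile as sf

def mix_at_smr(soundscape, masker, smr_db):
    """Scale the masker so that 20*log10(rms(soundscape) / rms(scaled masker))
    equals smr_db, then add it to the soundscape."""
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    gain = rms(soundscape) / (rms(masker) * 10 ** (smr_db / 20))
    return soundscape + gain * masker

soundscape, fs = sf.read("urban_soundscape.wav")   # hypothetical file names
masker, _ = sf.read("bird_masker.wav")
masker = masker[: len(soundscape)]                 # crudely match lengths for the sketch
augmented = mix_at_smr(soundscape, masker, smr_db=-6.0)  # masker 6 dB above the soundscape
sf.write("augmented.wav", augmented, fs)
```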
Responses were then collected by asking participants to rate how pleasant, annoying, eventful, uneventful, vibrant, monotonous, chaotic, calm, and appropriate each augmented soundscape was.
The data in this repository are intended to serve as a benchmark for fair comparisons of models for the prediction and analysis of perceptual attributes of soundscapes. Please refer to our publication in
IEEE Transactions on Affective Computing for more details on the data collection, annotation, and processing methodologies used to create the dataset:
Kenneth Ooi, Zhen-Ting Ong, Karn N. Watcharasupat, Bhan Lam, Joo Young Hong, and Woon-Seng Gan, "ARAUS: A large-scale dataset and baseline models of affective responses to augmented urban soundscapes,"
IEEE Transactions on Affective Computing, doi: 10.1109/TAFFC.2023.3247914.
Replication code and the baseline models we trained on the
ARAUS dataset can be found in our GitHub repository:
https://github.com/ntudsp/araus-dataset-baseline-models (2022-03-30)
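For orientation only, the fixed folds lend themselves to an evaluation loop of the following shape. This is a hedged sketch rather than code from the repository above; the file name "responses.csv", the "fold" and "pleasant" columns, and the stand-in feature columns are illustrative assumptions, not the dataset's actual schema.

```python
# Hedged sketch of a 5-fold evaluation protocol over fixed folds.
# All file and column names here are assumptions, not the ARAUS schema.
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

responses = pd.read_csv("responses.csv")     # one row per perceptual response (assumed)
feature_cols = ["feature_1", "feature_2"]    # stand-ins for whatever acoustic features a model uses
target_col = "pleasant"                      # one of the rated attributes (assumed column name)

fold_mse = []
for fold in range(1, 6):                     # the five training/validation folds
    train = responses[responses["fold"] != fold]
    val = responses[responses["fold"] == fold]
    model = Ridge().fit(train[feature_cols], train[target_col])
    fold_mse.append(mean_squared_error(val[target_col], model.predict(val[feature_cols])))

print("Mean validation MSE over the 5 folds:", sum(fold_mse) / len(fold_mse))
# A final model would then be assessed once on the independent test set.
```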