Dataset for: Probably Pleasant? A Neural-Probabilistic Approach to Automatic Masker Selection for Urban Soundscape Augmentation

Version 1.1

Ooi, Kenneth; Watcharasupat, Karn N.; Lam, Bhan; Ong, Zhen-Ting; Gan, Woon-Seng, 2022, "Dataset for: Probably Pleasant? A Neural-Probabilistic Approach to Automatic Masker Selection for Urban Soundscape Augmentation", https://doi.org/10.21979/N9/YSJQKD, DR-NTU (Data), V1

Dataset Metrics

462 Views

31 Downloads

0 Citations

Description	This dataset contains the log-mel spectrograms, in `.npy` format, for the augmented soundscapes described in our ICASSP 2022 submission "Probably Pleasant? A Neural-Probabilistic Approach to Automatic Masker Selection for Urban Soundscape Augmentation". The files can be read in Python with the `numpy.load` function of the `numpy` package. The dataset is organized as a 5-fold cross-validation set plus an independent test set. The log-mel spectrograms for each fold are stored in `fold_#_features.npy` and the subjective ratings for the corresponding augmented soundscapes in `fold_#_labels.npy`, where `#` is the fold number: 1 to 5 for the cross-validation folds and 0 for the independent test set.

Generation of augmented soundscapes
Each augmented soundscape was created by adding 30-second excerpts of recordings of sounds known as maskers to binaural recordings of urban soundscapes (element-wise addition in the time domain). Each masker recording has only one class ("construction", "traffic", "water", or "wind") active for its entire duration, whereas each binaural recording of an urban soundscape may have multiple sound sources active at any point, including sources outside the four masker classes.

Cross-validation set
The masker samples were obtained from Freesound by searching for the names of the masker classes ("construction", "traffic", "water", and "wind") and randomly picking a selection of tracks containing 30-second sections of sound that corresponded only to that particular masker class.
The soundscape samples were obtained from the Urban Soundscapes of the World (USotW) dataset. They consist of all binaural recordings available in the public dataset, except those with (1) audible electrical noise, so that only accurately captured real-life soundscapes are included; (2) measured in-situ L_A,eq values below 52 dB, so that reproduction levels were significantly above the noise floor of the location with the highest noise floor (~36 dB) where the subjective responses were obtained; and (3) measured in-situ L_A,eq values above 77 dB, so that listening levels were safe for our participants. In total, 120 of the 127 publicly available recordings in the USotW dataset were used for the cross-validation set.

Test set
The masker samples were obtained from Freesound in the same manner as for the cross-validation set, while ensuring that no recording was shared between the test set and cross-validation set maskers. The soundscape samples were taken from binaural recordings of locations in Singapore, which is not represented in any of the soundscapes in the USotW dataset and hence not in the cross-validation set. They were recorded under the similar Soundscape Indices Protocol, in urban contexts similar to those of the USotW dataset. Specifically, they were recorded at a road facing a construction site, a gazebo in a park, a walkway facing a lake, a walkway facing a crowded canteen, a path facing a lake, and a path facing a lake with an aircraft flying overhead.

Participant information
The participants of the listening test were a sample of people who were able to come in person to our laboratory (at Nanyang Technological University, Singapore) to listen to the stimuli and provide their responses. Their mean age was 28.4 ± 11.8 years, and there were 151 female and 149 male participants. All participants were tested to have normal hearing, defined as a mean hearing threshold at 0.5, 1, 2, 4, and 6 kHz below 20 dB for participants under 30 years of age and below 30 dB for participants aged 30 years or above.
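The file naming scheme described above can be used to assemble training and validation splits. The following is a minimal sketch, not the authors' code: the helper names `load_fold` and `cross_validation_split` are hypothetical, and it assumes that the first axis of each features array indexes the same samples as the first axis of the matching labels array, which the description does not state explicitly.

```python
from pathlib import Path

import numpy as np

TEST_FOLD = 0                # fold index of the independent test set
CV_FOLDS = (1, 2, 3, 4, 5)   # fold indices of the cross-validation set


def load_fold(data_dir, fold, mmap_mode=None):
    """Load the features and labels of one fold (hypothetical helper).

    Pass mmap_mode="r" to memory-map the large features file instead of
    reading it into RAM all at once.
    """
    data_dir = Path(data_dir)
    features = np.load(data_dir / f"fold_{fold}_features.npy", mmap_mode=mmap_mode)
    labels = np.load(data_dir / f"fold_{fold}_labels.npy")
    # Assumption: one label entry per spectrogram along the first axis.
    if len(features) != len(labels):
        raise ValueError(f"fold {fold}: {len(features)} features vs {len(labels)} labels")
    return features, labels


def cross_validation_split(data_dir, val_fold):
    """Use one cross-validation fold for validation and the rest for training."""
    if val_fold not in CV_FOLDS:
        raise ValueError(f"val_fold must be one of {CV_FOLDS}")
    train = [load_fold(data_dir, fold) for fold in CV_FOLDS if fold != val_fold]
    x_train = np.concatenate([x for x, _ in train], axis=0)
    y_train = np.concatenate([y for _, y in train], axis=0)
    x_val, y_val = load_fold(data_dir, val_fold)
    return (x_train, y_train), (x_val, y_val)
```

For example, `cross_validation_split("data", val_fold=3)` would train on folds 1, 2, 4, and 5 and validate on fold 3, while `load_fold("data", TEST_FOLD)` returns the held-out test set. Concatenating four features files of roughly 800 MB each needs several gigabytes of memory, so loading folds lazily may be preferable on smaller machines.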
Subject	Computer and Information Science; Engineering
Keyword	Soundscape
Related Publication	Ooi, K., Watcharasupat, K. N., Lam, B., Ong, Z. & Gan, W. (2022). Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation. 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022), 8887-8891. doi: 10.1109/ICASSP43922.2022.9746897
Grant Information	Ministry of National Development (MND): COT-V4-2020-1; National Research Foundation (NRF): COT-V4-2020-1
License/Data Use Agreement	CC BY-NC 4.0

Files 1 to 10 of 12

File	Type	Size	Published	Downloads	MD5
fold_0_features.npy	Unknown	15.1 MB	Jan 28, 2022	8	a9142e6e5e268cfa00a3b821d1eb96c1
fold_0_labels.npy	Unknown	512 B	Jan 28, 2022	4	9f42d9207cd4cc57ca26361cb34e8730
fold_1_features.npy	Unknown	792.4 MB	Jan 28, 2022	7	b8ff1d1e82131ec528d7a9e4d46af34d
fold_1_labels.npy	Unknown	19.8 KB	Jan 28, 2022	5	35febc306206901ecddded05fbc9345f
fold_2_features.npy	Unknown	792.4 MB	Jan 28, 2022	6	8feafc3920d28b22a25c98e85eda8ca7
fold_2_labels.npy	Unknown	19.8 KB	Jan 28, 2022	6	3f3334a99cc995ece472e50299985a29
fold_3_features.npy	Unknown	792.4 MB	Jan 28, 2022	6	7bfd1bd19267f4a172268a47ba68ab1f
fold_3_labels.npy	Unknown	19.8 KB	Jan 28, 2022	4	46cdd0e0f99c506f53fb0f3823cd3c2d
fold_4_features.npy	Unknown	792.4 MB	Jan 28, 2022	7	6b9f7be1a1abbda23507db0e216abb17
fold_4_labels.npy	Unknown	19.8 KB	Jan 28, 2022	4	164ff70810e7f97592bb6603fb4ec957
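Because several of the files are close to 800 MB, it may be worth checking a download against the MD5 checksums listed above before using it. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_downloads` are hypothetical, and the dictionary covers only the ten files shown in this listing.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing above (10 of the 12 files).
EXPECTED_MD5 = {
    "fold_0_features.npy": "a9142e6e5e268cfa00a3b821d1eb96c1",
    "fold_0_labels.npy": "9f42d9207cd4cc57ca26361cb34e8730",
    "fold_1_features.npy": "b8ff1d1e82131ec528d7a9e4d46af34d",
    "fold_1_labels.npy": "35febc306206901ecddded05fbc9345f",
    "fold_2_features.npy": "8feafc3920d28b22a25c98e85eda8ca7",
    "fold_2_labels.npy": "3f3334a99cc995ece472e50299985a29",
    "fold_3_features.npy": "7bfd1bd19267f4a172268a47ba68ab1f",
    "fold_3_labels.npy": "46cdd0e0f99c506f53fb0f3823cd3c2d",
    "fold_4_features.npy": "6b9f7be1a1abbda23507db0e216abb17",
    "fold_4_labels.npy": "164ff70810e7f97592bb6603fb4ec957",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(data_dir, expected=EXPECTED_MD5):
    """Map each expected file name to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected_md5 in expected.items():
        path = Path(data_dir) / name
        if path.exists():
            results[name] = (md5_of_file(path) == expected_md5)
        else:
            results[name] = None
    return results
```

Calling `verify_downloads("data")` on the download directory returns one entry per listed file; any `False` value suggests a corrupted or incomplete download.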

Citation Metadata

Dataset Persistent ID	doi:10.21979/N9/YSJQKD
Publication Date	2022-01-28
Title	Dataset for: Probably Pleasant? A Neural-Probabilistic Approach to Automatic Masker Selection for Urban Soundscape Augmentation
Author	Ooi, Kenneth (Nanyang Technological University) - ORCID: 0000-0001-5629-6275; Watcharasupat, Karn N. (Nanyang Technological University) - ORCID: 0000-0002-3878-5048; Lam, Bhan (Nanyang Technological University) - ORCID: 0000-0001-5193-6560; Ong, Zhen-Ting (Nanyang Technological University) - ORCID: 0000-0002-1249-4760; Gan, Woon-Seng (Nanyang Technological University) - ORCID: 0000-0002-7143-1823
Contact	Ooi Wen Rui Kenneth (Nanyang Technological University)
Subject	Computer and Information Science; Engineering
Keyword	Soundscape
Related Publication	Ooi, K., Watcharasupat, K. N., Lam, B., Ong, Z. & Gan, W. (2022). Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation. 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022), 8887-8891. doi: 10.1109/ICASSP43922.2022.9746897 (https://ieeexplore.ieee.org/document/9746897); handle: 10356/158000 (https://hdl.handle.net/10356/158000)
Grant Information	Ministry of National Development (MND): COT-V4-2020-1; National Research Foundation (NRF): COT-V4-2020-1
Depositor	Ooi Wen Rui Kenneth
Deposit Date	2021-10-04
Kind of Data	Processed audio data (log-mel spectrograms)
Software	Python
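The dataset stores only processed spectrograms, but the description states that each augmented soundscape was produced by element-wise addition of a masker excerpt and a binaural soundscape recording in the time domain. The following is a minimal sketch of that operation, not the authors' pipeline: the function name `mix_masker_into_soundscape`, the `masker_gain` parameter, and the choice to copy a mono masker to both ears are all assumptions made for illustration.

```python
import numpy as np


def mix_masker_into_soundscape(soundscape, masker, masker_gain=1.0):
    """Add a masker to a binaural soundscape sample by sample (hypothetical helper).

    soundscape: array of shape (n_samples, 2), left and right channels.
    masker:     array of shape (n_samples,) or (n_samples, 2).
    masker_gain: linear gain applied to the masker (an assumption; the
                 description does not say how masker levels were set).
    """
    soundscape = np.asarray(soundscape, dtype=np.float64)
    masker = np.asarray(masker, dtype=np.float64)
    if masker.ndim == 1:
        # Assumption: a mono masker is presented identically to both ears.
        masker = masker[:, np.newaxis]
    if masker.shape[0] != soundscape.shape[0]:
        raise ValueError("masker and soundscape must have the same number of samples")
    # Element-wise addition in the time domain.
    return soundscape + masker_gain * masker
```

With a 30-second soundscape and a 30-second masker at the same sampling rate, the result has the same shape as the soundscape, so it could be passed to whatever log-mel feature extraction is used downstream.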

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Creative Commons Attribution-NonCommercial 4.0 International License. CC BY-NC 4.0
