Relating language and sound: two distributional models.

C.W.J. van Miltenburg, A. Lopopolo

Research output: Contribution to conference › Poster › Other research output

Abstract

We present preliminary results in the domain of sound labeling and sound representation. Our work is based on data from the Freesound database, which contains thousands of sounds complete with tags and descriptions, under a Creative Commons license. We want to investigate how people represent and categorize different sounds, and how language reflects this categorization. Moreover, following recent developments in multimodal distributional semantics (Bruni et al. 2012), we want to assess whether acoustic information can improve the semantic representation of lexemes.
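The kind of multimodal combination explored by Bruni et al. (2012) can be sketched as follows. This is an illustrative sketch only, not the model from the poster: the toy vectors, the `fuse` function, and the equal modality weighting (`alpha=0.5`) are all assumptions made for the example.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so neither modality dominates."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

def fuse(text_vec, audio_vec, alpha=0.5):
    """Concatenate L2-normalised text and acoustic vectors,
    weighting the modalities by alpha (text) and 1 - alpha (audio)."""
    t = [alpha * x for x in l2_normalize(text_vec)]
    a = [(1 - alpha) * x for x in l2_normalize(audio_vec)]
    return t + a

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

# Toy text and acoustic vectors (invented values, purely illustrative)
dog = fuse([0.9, 0.1, 0.0], [0.7, 0.3])
bark = fuse([0.8, 0.2, 0.1], [0.6, 0.4])
print(cosine(dog, bark))
```

The fused representation lets similarity judgments draw on both what is written about a sound and how it actually sounds.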

We built two distributional models on a subset of the Freesound database containing all sounds that were manually classified as SoundFX (e.g. footsteps, opening and closing doors, animal sounds). The first model is based on tag co-occurrence. From this model, we created a network of tags that we partitioned using cluster analysis. The resulting clusters intuitively seem to correspond to different types of scenes. We see this partitioning as a first step towards linking particular sounds with relevant frames in FrameNet. The second model is built using a bag-of-auditory-words approach. To assess the quality of the semantic representations, both models are compared to human judgment scores from the WordSim353 and MEN databases.
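A minimal sketch of the tag co-occurrence idea is given below. The tag sets are invented for illustration (the actual model uses Freesound tags); the sketch builds a distributional vector per tag from pairwise co-occurrence counts and shows that tags from the same kind of scene end up with higher cosine similarity:

```python
import math
from collections import Counter
from itertools import combinations

# Toy tag sets, one per sound (invented, not actual Freesound data)
sounds = [
    ["footsteps", "walking", "gravel"],
    ["footsteps", "walking", "wood"],
    ["door", "creak", "wood"],
    ["door", "slam", "wood"],
    ["dog", "bark", "animal"],
]

# Count how often each pair of tags appears on the same sound
cooc = Counter()
for tags in sounds:
    for a, b in combinations(sorted(set(tags)), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1

vocab = sorted({t for tags in sounds for t in tags})

def vector(tag):
    """Distributional vector of a tag: its co-occurrence counts
    with every tag in the vocabulary."""
    return [cooc.get((tag, other), 0) for other in vocab]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Tags that occur in similar contexts get similar vectors
same_scene = cosine(vector("footsteps"), vector("walking"))
diff_scene = cosine(vector("footsteps"), vector("bark"))
print(same_scene > diff_scene)  # True
```

Vectors like these can then be correlated against the human similarity scores in WordSim353 and MEN; the clustering step mentioned above would partition the tag network built from the same co-occurrence counts.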
Original language: English
Publication status: Published - 2015
Event: CLIN 25
Duration: 6 Feb 2015 – 6 Feb 2015

Conference

Conference: CLIN 25
Period: 6/02/15 – 6/02/15


Cite this

van Miltenburg, C. W. J., & Lopopolo, A. (2015). Relating language and sound: two distributional models. Poster session presented at CLIN 25.