The winning approach to cross-genre gender identification in Russian at RUSProfiling 2017

I. Markov, H. Gómez-Adorno, G. Sidorov, A. Gelbukh

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-review

Abstract

We present the CIC systems submitted to the 2017 PAN shared task on Cross-Genre Gender Identification in Russian texts (RUSProfiling). We submitted five systems. One was based on a statistical approach using only lexical features; the other four used machine-learning techniques with combinations of gender-specific Russian grammatical features, word and character n-grams, and suffix n-grams. Our systems achieved the highest weighted accuracy across all the test datasets, occupying the first four places in the ranking.
Original language: English
Title of host publication: FIRE 2017 - Working Notes of FIRE 2017 - Forum for Information Retrieval Evaluation
Editors: P. Majumder, J. Sankhavara, M. Mitra, P. Mehta
Publisher: CEUR-WS
Pages: 20-24
Volume: 2036
Publication status: Published - 2017
Externally published: Yes
Event: 2017 Working Notes of Forum for Information Retrieval Evaluation, FIRE 2017 - Bangalore, India
Duration: 8 Dec 2017 to 10 Dec 2017

Publication series

Name: CEUR Workshop Proceedings
ISSN (Print): 1613-0073

Conference

Conference: 2017 Working Notes of Forum for Information Retrieval Evaluation, FIRE 2017
Country/Territory: India
City: Bangalore
Period: 8/12/17 to 10/12/17

Funding

This work was partially supported by the Mexican Government (CONACYT projects 240844, SNI, COFAA-IPN, SIP-IPN 20171813, 20172008, and 20172044).

Funders (funder number)
Mexican Government
California Health Care Safety Net Institute
Consejo Nacional de Ciencia y Tecnología (240844)
Sistema Nacional de Investigadores (20172044, SIP-IPN 20171813, 20172008)
