A corpus of data collected from YouTube annotated with MBTI personality labels.

elisabassignana/Personal-ITY

Personal-ITY

Personal-ITY is a corpus of data collected from YouTube annotated with MBTI personality labels.

Language: Italian

Statistics:

  • 1048 users
  • 92 average comments/user
  • 115 average tokens/comment

Each line in the files describes one author. The fields, in order and separated by a tab ("\t"), are:

  • YouTube username
  • MBTI label
  • list of comments
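
As a minimal sketch of reading this layout, the snippet below splits one line into its three tab-separated fields. The names `AuthorRecord` and `parse_line` are hypothetical, and because the internal format of the comment list is not specified here, the third field is kept as a raw string.

```python
from typing import NamedTuple


class AuthorRecord(NamedTuple):
    """One corpus line: a single author (hypothetical container name)."""
    username: str
    mbti: str
    comments: str  # raw third field; its internal format is an open assumption


def parse_line(line: str) -> AuthorRecord:
    # Split on the first two tabs only, so any tab that might occur
    # inside the comments field stays part of that field.
    username, mbti, comments = line.rstrip("\n").split("\t", 2)
    return AuthorRecord(username, mbti, comments)


record = parse_line("some_user\tINTJ\tprimo commento secondo commento\n")
print(record.username, record.mbti)
```

A line with fewer than three fields raises a `ValueError` at the unpacking step, which is a simple way to catch malformed rows early.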

In the pre-processed file:

  • punctuation marks are separated from adjacent words by spaces
  • URLs, hashtags, usernames and emojis are replaced with four distinctive labels
  • comments are concatenated
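
The three steps above can be approximated as follows. This is only a sketch and not the authors' actual pipeline: the placeholder strings (`<URL>`, `<HASHTAG>`, `<USERNAME>`, `<EMOJI>`), the regular expressions, the punctuation set, and the function names are all assumptions, and the emoji pattern covers only a rough range of code points.

```python
import re

# Hypothetical patterns; the corpus may use different rules and labels.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")  # rough emoji range
PUNCT_RE = re.compile(r"([.,;:!?()\"])")


def preprocess_comment(text: str) -> str:
    # Replace URLs first so their punctuation is not split apart later.
    text = URL_RE.sub(" <URL> ", text)
    text = HASHTAG_RE.sub(" <HASHTAG> ", text)
    text = MENTION_RE.sub(" <USERNAME> ", text)
    text = EMOJI_RE.sub(" <EMOJI> ", text)
    # Put a space on both sides of each punctuation mark.
    text = PUNCT_RE.sub(r" \1 ", text)
    # Collapse the extra whitespace introduced by the substitutions.
    return " ".join(text.split())


def preprocess_author(comments: list[str]) -> str:
    # Concatenate all of an author's comments into one string.
    return " ".join(preprocess_comment(c) for c in comments)


print(preprocess_comment("Ciao @mario, guarda https://example.com #bello!"))
```

Replacing URLs before spacing the punctuation is a deliberate ordering choice: otherwise the dots and slashes inside a URL would be split before the URL pattern could match.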

MBTI traits:

  • Extravert - Introvert
  • Sensing - iNtuition
  • Thinking - Feeling
  • Judging - Perceiving
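
Because each four-letter MBTI label encodes one choice per trait, a label can be decomposed into four binary values, which is convenient when each trait is treated as its own prediction task. The sketch below assumes the standard letter coding (E/I, S/N, T/F, J/P); the names `TRAIT_AXES`, `TRAIT_NAMES`, and `split_label` are hypothetical.

```python
# One (first pole, second pole) pair per trait, in label order.
TRAIT_AXES = (("E", "I"), ("S", "N"), ("T", "F"), ("J", "P"))

TRAIT_NAMES = {
    "E": "Extravert", "I": "Introvert",
    "S": "Sensing", "N": "iNtuition",
    "T": "Thinking", "F": "Feeling",
    "J": "Judging", "P": "Perceiving",
}


def split_label(label: str) -> tuple[str, str, str, str]:
    """Map a label such as 'INTJ' to the names of its four traits."""
    label = label.strip().upper()
    if len(label) != 4:
        raise ValueError(f"expected a 4-letter MBTI label, got {label!r}")
    for letter, axis in zip(label, TRAIT_AXES):
        # Each position must hold one of the two letters of its trait.
        if letter not in axis:
            raise ValueError(f"unexpected letter {letter!r} in {label!r}")
    return tuple(TRAIT_NAMES[letter] for letter in label)


print(split_label("INTJ"))
```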

MBTI personality types:

  • Analysts

    • INTJ Architect
    • INTP Logician
    • ENTJ Commander
    • ENTP Debater
  • Diplomats

    • INFJ Advocate
    • INFP Mediator
    • ENFJ Protagonist
    • ENFP Campaigner
  • Sentinels

    • ISTJ Logistician
    • ISFJ Defender
    • ESTJ Executive
    • ESFJ Consul
  • Explorers

    • ISTP Virtuoso
    • ISFP Adventurer
    • ESTP Entrepreneur
    • ESFP Entertainer

Publications

Personal-ITY is described in:

  • Elisa Bassignana, Malvina Nissim, Viviana Patti. Personal-ITY: A Novel YouTube-based Corpus for Personality Prediction in Italian. In Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020) http://ceur-ws.org/Vol-2769/paper_77.pdf
  • Elisa Bassignana, Malvina Nissim, Viviana Patti. Matching Theory and Data with Personal-ITY: What a Corpus of Italian YouTube Comments Reveals About Personality. In Proceedings of the Third Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media (PEOPLES 2020) https://www.aclweb.org/anthology/2020.peoples-1.2.pdf
