Personal-ITY is a corpus of Italian YouTube comments in which each author is annotated with an MBTI personality label.
Language: Italian
Statistics:
- 1048 users
- 92 comments per user on average
- 115 tokens per comment on average
Each line in the files describes one author. The fields are, in order and separated by a tab character ("\t"):
- YouTube username
- MBTI label
- list of comments
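A minimal sketch of reading this layout in Python follows. It assumes only what is stated above: three tab-separated fields per line. How the list of comments is serialized inside the third field is not specified here, so the sketch keeps that field as a raw string. The names `AuthorRecord` and `parse_lines` are hypothetical and not part of the corpus distribution.

```python
from typing import Iterable, Iterator, NamedTuple


class AuthorRecord(NamedTuple):
    username: str
    mbti: str
    comments: str  # raw third field; its internal serialization is an open assumption


def parse_lines(lines: Iterable[str]) -> Iterator[AuthorRecord]:
    """Yield one record per non-empty line of a tab-separated file."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the first two tabs only, so tabs inside the comments survive.
        # A line with fewer than three fields raises ValueError here.
        username, mbti, comments = line.split("\t", 2)
        yield AuthorRecord(username, mbti, comments)


# Illustrative input, not real corpus data.
sample = ["user_a\tINTJ\tfirst comment\n", "user_b\tENFP\tanother comment\n"]
records = list(parse_lines(sample))
```

With a real file, the same function can be fed an open file handle, for example `parse_lines(open(path, encoding="utf-8"))`, where `path` points to one of the corpus files.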
In the pre-processed file:
- the punctuation is spaced
- URLs, hashtags, usernames and emojis are replaced with four distinctive labels
- comments are concatenated
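The three steps above can be approximated as follows. This is an illustrative sketch, not the authors' pipeline: the placeholder strings (`<URL>`, `<HASHTAG>`, `<USERNAME>`, `<EMOJI>`), the regular expressions, the punctuation set and the single-space separator used for concatenation are all assumptions, and the emoji pattern covers only some common Unicode ranges.

```python
import re

# Hypothetical placeholder labels; the actual labels in the corpus may differ.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
USERNAME_RE = re.compile(r"@\w+")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")  # rough coverage only
PUNCT_RE = re.compile(r"([.,!?;:])")


def preprocess_comment(text: str) -> str:
    """Replace URLs, hashtags, usernames and emojis, then space out punctuation."""
    text = URL_RE.sub("<URL>", text)  # URLs first, so "#" and "@" inside them are not matched
    text = HASHTAG_RE.sub("<HASHTAG>", text)
    text = USERNAME_RE.sub("<USERNAME>", text)
    text = EMOJI_RE.sub("<EMOJI>", text)
    text = PUNCT_RE.sub(r" \1 ", text)  # put spaces around punctuation marks
    return " ".join(text.split())  # collapse repeated whitespace


def preprocess_author(comments: list[str]) -> str:
    """Concatenate an author's pre-processed comments into one string."""
    return " ".join(preprocess_comment(c) for c in comments)


example = preprocess_comment("Ciao @mario, guarda https://example.com #top!")
# example == "Ciao <USERNAME> , guarda <URL> <HASHTAG> !"
```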
MBTI traits:
- Extravert - Introvert
- Sensing - iNtuition
- Thinking - Feeling
- Judging - Perceiving
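Since each label combines one letter from each of the four trait pairs, a label can be split into four binary traits, which is convenient when predicting each dimension separately. The sketch below shows one way to do this; `TRAIT_LETTERS` and `decode_mbti` are hypothetical names.

```python
# One dict per position in the four-letter label, in the order listed above.
TRAIT_LETTERS = [
    {"E": "Extravert", "I": "Introvert"},
    {"S": "Sensing", "N": "iNtuition"},
    {"T": "Thinking", "F": "Feeling"},
    {"J": "Judging", "P": "Perceiving"},
]


def decode_mbti(label: str) -> tuple[str, ...]:
    """Map a four-letter MBTI label to its four trait names."""
    label = label.strip().upper()
    if len(label) != 4:
        raise ValueError(f"expected a four-letter label, got {label!r}")
    traits = []
    for letter, options in zip(label, TRAIT_LETTERS):
        if letter not in options:
            raise ValueError(f"unexpected letter {letter!r} in {label!r}")
        traits.append(options[letter])
    return tuple(traits)


traits = decode_mbti("INTJ")
# traits == ("Introvert", "iNtuition", "Thinking", "Judging")
```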
MBTI personality types:
- Analysts
  - INTJ Architect
  - INTP Logician
  - ENTJ Commander
  - ENTP Debater
- Diplomats
  - INFJ Advocate
  - INFP Mediator
  - ENFJ Protagonist
  - ENFP Campaigner
- Sentinels
  - ISTJ Logistician
  - ISFJ Defender
  - ESTJ Executive
  - ESFJ Consul
- Explorers
  - ISTP Virtuoso
  - ISFP Adventurer
  - ESTP Entrepreneur
  - ESFP Entertainer
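The grouping above can be encoded as a lookup table, for example to inspect how labels are distributed across the four groups. This is a small sketch; `GROUPS`, `TYPE_TO_GROUP` and `group_counts` are hypothetical names, and the labels passed in the example are illustrative, not corpus statistics.

```python
from collections import Counter
from typing import Iterable

# The four groups and their types, as listed above.
GROUPS = {
    "Analysts": ["INTJ", "INTP", "ENTJ", "ENTP"],
    "Diplomats": ["INFJ", "INFP", "ENFJ", "ENFP"],
    "Sentinels": ["ISTJ", "ISFJ", "ESTJ", "ESFJ"],
    "Explorers": ["ISTP", "ISFP", "ESTP", "ESFP"],
}

# Invert the table: type -> group name.
TYPE_TO_GROUP = {mbti: group for group, types in GROUPS.items() for mbti in types}


def group_counts(labels: Iterable[str]) -> Counter:
    """Count how many labels fall into each group (raises KeyError on unknown labels)."""
    return Counter(TYPE_TO_GROUP[label.strip().upper()] for label in labels)


counts = group_counts(["INTJ", "enfp", "ISTP", "INTP"])
# counts == Counter({"Analysts": 2, "Diplomats": 1, "Explorers": 1})
```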
Personal-ITY is described in:
- Elisa Bassignana, Malvina Nissim, Viviana Patti. Personal-ITY: A Novel YouTube-based Corpus for Personality Prediction in Italian. In Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020) http://ceur-ws.org/Vol-2769/paper_77.pdf
- Elisa Bassignana, Malvina Nissim, Viviana Patti. Matching Theory and Data with Personal-ITY: What a Corpus of Italian YouTube Comments Reveals About Personality. In Proceedings of the Third Workshop on Computational Modeling of People's Opinions, Personality, and Emotion's in Social Media (PEOPLES 2020) https://www.aclweb.org/anthology/2020.peoples-1.2.pdf