HaRe (Harassment Recognizer) is a command line tool and Python library to automatically detect harassment as it happens (real-time) with the help of machine learning techniques.
The monitoring tool can be started by running bin/hare monitor <pretrained_model_name>
, and listens on localhost:11118
. If you want to run it on another address or port, you can add <address>:<port>
, for example:
$bin/hare monitor moba localhost:1234
Loading pretrained model moba...
HaRe is listening on localhost:1234. To start monitoring the first conversation, send either an HTTP PUT request to localhost:1234/yourconvoid, or use the convenient clien tool: 'bin/hare client convo yourconvoid'
You can then send HTTP PUT, POST and GET requests to this process. For convenience, HaRe includes a separate tool bin/hare client
to send these requests quickly from the command line. It will use localhost:11118
by default, but this can be overridden by including the address and host somewhere in the arguments.
Start a new conversation:
bin/hare client convo <yourconvoid>
- Does an HTTP PUT to address:port/yourconvoid
- Example:
bin/hare client convo myconvo
does an HTTP PUT tolocalhost:11118/myconvo
Add an utterance:
bin/hare client utterance <yourconvoid> <yourspeakerid> <the rest of the words are for the utterance>
- Does an HTTP POST to address:port/yourconvoid
- Example:
bin/hare client utterance myconvo speakera good luck everyone
does an HTTP POST tolocalhost:11118/myconvo
with this data:
{"utterance":"good luck everyone","speaker":"a"}
Check the status:
bin/hare client status <yourconvoid>
- Does an HTTP GET to address:port/yourconvoid
- Example:
bin/hare client status myconvo
does an HTTP GET to `localhost:11118/myconvo', and will return something in this format:
TODO
The easiest way to use HaRe is by simply loading a pretrained HaRe model included with this repo in the models
folder, like the one named 'moba':
from hare import load_pretrained
moba_hare = load_pretrained('moba')
You can then use this object to monitor conversations in progress. Let's start a conversation and ask HaRe to monitor it:
from hare import Conversation
convo = Conversation()
moba_hare.add_conversation(convo)
At any point in time, you can then request the current status of the conversation according to this HaRe model:
convo.add_utterance(speaker='a',content='hello')
convo.add_utterance(speaker='b',content='hi everyone')
moba_hare.get_status()
You can also add multiple sentences at once; for example a whole conversation if it has already finished.
from hare import Utterance
convo.add_utterances([Utterance(speaker='a',content='good luck'),
Utterance(speaker='c',content='ur all n00bs')])
moba_hare.get_status()
If you add multiple conversations for Hare to monitor, you will need to specify the conversation index when asking for the status:
second_convo = Conversation()
convo_index = moba_hare.add_conversation(second_convo)
second_convo.add_utterance(speaker='a',content='hello')
second_convo.add_utterance(speaker='b',content='hi everyone')
moba_hare.get_status(id=convo_index)
If you have a labeled dataset (that is: for each conversation an indication which participants are considered toxic), HaRe can calculate to what extent its judgments match the labels. A label can range from the default 0 (not toxic at all) to 1 (maximally toxic). Let's label speaker c
:
convo.label_speaker('c',0.9)
There are several evaluation metrics, depending on what is important to you (detecting ALL harassment, detecting harassment quickly, no false positives, etc):
moba_hare.calculate_accuracy()
These metrics are calculated on the basis on all conversations the HaRe object is aware of that have at least 1 labeled participant. If you want to exclude a label from evaluation, simply add it to the conversations_excluded_for_evaluation
list:
moba_hare.conversations_excluded_for_evaluation = [0]
At some point, you might want to do some training yourself. This can for example be the case because you are applying HaRe in another domain than the pretrained models, and harassment looks slightly different there, or because you even want to detect something different than harassment.
from hare import Hare, Utterance
new_hare = Hare()
new_hare.start_conversation(conversation_id='convo_001')
new_hare.label_speaker('b',1)
new_hare.add_utterances([Utterance(speaker='a',content='good luck'),
Utterance(speaker='b',content='ur all n00bs')])
new_hare.start_conversation(conversation_id='convo_002')
new_hare.label_speaker('b',1)
new_hare.add_utterances([Utterance(speaker='a',content='hi'),
Utterance(speaker='b',content='SHUT UP!')])
new_hare.train()
It will probably be highly effective to use so-called word embeddings during training. You can see these embeddings as a dictionary that translates from a word's characters to an estimate of its meaning. Research has shown that classifying these 'meanings' is much more successful that classifying the raw words. [reference] . To use them, simply point your HaRe object to the embedding file in the embeddings
folder before training:
new_hare.embedding_file = 'english_large'
new_hare.train()
Of course, these embeddings should be in the same language as the rest of your dataset.