HateCheck
An application to detect hate speech on social media
These words of Martin Luther King still echo in our minds when we hear of incidents such as the riots in Bangalore incited by a Facebook post, the US Capitol Hill siege, or more recently, the Shaheen Bagh protest, where peaceful protestors were labeled 'anti-nationals' and 'jihadists.' These incidents have one thing in common: they were all incited, or at least propagated, by organized hate speech campaigns that used social media as a tool. This brings to the fore the immense potential of social media for propagating hate speech. Given its wide relevance in today's world, it becomes crucial to understand what ‘Hate Speech’ is and how it can be prevented.
The United Nations (UN) defines hate speech as "any kind of communication in speech, writing or behavior, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, color, descent, gender, or other identity factor". In India, it manifests mainly along religious and caste lines. The consequences of hate speech can be devastating. It has the potential to breed intolerance and extremism in society. This was well understood by dictators such as Hitler, who used intense and consistent propaganda against Jews to create an entrenched opinion among the masses, which ultimately culminated in the widespread persecution of Jews, remembered in history as the Holocaust.
Now that we know the potential of hate speech, it becomes imperative to prevent it. How can we prevent hate speech?
I, along with my friends, have come up with a solution as part of our project 'HateCheck' for the course 'Privacy and Security in Online Social Media (PSOSM)', which we took this semester at IIIT Delhi. We have tried to address the problem by creating a website that takes the URL of a tweet from the user and identifies whether the tweet contains hate speech. If it does, the website highlights the section that it has flagged as hate speech. It also classifies the tweet into one of 6 categories: Not Hate, Racist, Sexist, Homophobic, Religious, or Other. This helps users become aware of such tweets online and make informed choices while consuming content.
How does it work?
The solution is implemented with a machine learning model. A training dataset of texts, each labeled by content reviewers, is used to build a machine learning classifier that can subsequently be used to detect hate speech in a text. In layman's terms, our classifier learns from the training dataset fed to it and uses what it has learned to decide whether new text contains hate speech.
Methodology
Our methodology is based on four key steps, namely Data Collection, Data Pre-processing, Constructing Classifier Model, and Classification Model Evaluation.
Let us understand each step in a little more detail.
Data Collection
The machine learning classifier has been trained on a dataset of more than 1.46 lakh (146,000) tweets. Each tweet has been labeled by a content reviewer with one of the following labels:
- Not Hate
- Racist
- Sexist
- Homophobic
- Religious
- Other
Data Pre-processing
In this stage, the collected data has to be organized so that the later processing steps can be applied to it. For this, we remove the following features if they are present in the tweet:
- Images: Images cannot be processed by our algorithm.
- Special characters: Special characters like !, #, @, (, ), etc. are removed, as they do not make a tweet more or less hateful.
- Stop words: Words like I, We, They, are, our, All, At, etc. are removed.
The remaining words are then converted into shingles, that is, short overlapping sequences of consecutive words. This is to ensure that each word of the text is processed in the later steps.
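The pre-processing steps above can be sketched roughly as follows. This is a minimal illustration, not our exact code: the stop-word list is abbreviated, the shingle size `k` is an assumed parameter, and the function names `preprocess` and `shingles` are made up for this example.

```python
import re

# Abbreviated, assumed stop-word list; a real list would be much longer.
STOP_WORDS = {"i", "we", "they", "are", "our", "all", "at", "the", "a", "is"}

def preprocess(tweet: str) -> list[str]:
    """Lowercase the tweet, drop special characters, then drop stop words."""
    text = re.sub(r"[^a-z0-9\s]", " ", tweet.lower())
    return [word for word in text.split() if word not in STOP_WORDS]

def shingles(words: list[str], k: int = 2) -> list[tuple[str, ...]]:
    """Return overlapping k-word shingles; short inputs yield a single shingle."""
    if len(words) <= k:
        return [tuple(words)] if words else []
    return [tuple(words[i:i + k]) for i in range(len(words) - k + 1)]

tokens = preprocess("We are ALL going to the #park at noon!")
print(tokens)
print(shingles(tokens))
```

Because the shingles overlap, every remaining word appears in at least one shingle, which is the property the later steps rely on.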
Constructing Classifier Model
At this step, a machine learning classifier is trained using the pre-processed data. To train the model, we need to associate an integer identifier with each of the data labels. The following identifiers have been used:
0 - Not Hate
1 - Racist
2 - Sexist
3 - Homophobic
4 - Religious
5 - Other
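In code, this mapping could look something like the sketch below. The names `LABEL_TO_ID`, `ID_TO_LABEL`, and `encode` are illustrative choices for this post, not names from our project.

```python
# Integer identifiers for the six labels, as listed above.
LABEL_TO_ID = {
    "Not Hate": 0,
    "Racist": 1,
    "Sexist": 2,
    "Homophobic": 3,
    "Religious": 4,
    "Other": 5,
}

# Reverse lookup, useful for turning a model prediction back into a label.
ID_TO_LABEL = {label_id: name for name, label_id in LABEL_TO_ID.items()}

def encode(labels):
    """Convert a list of label names into their integer identifiers."""
    return [LABEL_TO_ID[name] for name in labels]

print(encode(["Racist", "Not Hate", "Other"]))
print(ID_TO_LABEL[2])
```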
We trained the following machine learning models on our data:
- Gaussian Naïve Bayes (GNB)
- Support Vector Machines (SVM)
- Decision Tree (DT)
- K-Nearest Neighbours (KNN)
- Random Forest
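Training these five models could be sketched with scikit-learn as below. Several things here are assumptions made for illustration: the use of scikit-learn itself, the `CountVectorizer` word unigram and bigram features standing in for shingles, the hyperparameters, and the tiny placeholder corpus (the placeholder tokens are stand-ins, not real data).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus; the real dataset has more than 1.46 lakh labeled tweets.
texts = [
    "lovely weather today",
    "enjoying lunch with friends",
    "placeholder_racial_insult go away",
    "placeholder_racial_insult not welcome",
    "placeholder_gender_insult stay home",
    "placeholder_gender_insult be quiet",
]
labels = [0, 0, 1, 1, 2, 2]  # 0 = Not Hate, 1 = Racist, 2 = Sexist

# Word unigrams and bigrams approximate shingle features (an assumption).
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(texts).toarray()  # GaussianNB needs a dense array

models = {
    "GNB": GaussianNB(),
    "SVM": SVC(kernel="linear"),
    "DT": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=1),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
}
for model in models.values():
    model.fit(X, labels)

# Classify an unseen text with every trained model.
sample = vectorizer.transform(["placeholder_gender_insult go home"]).toarray()
predictions = {name: int(model.predict(sample)[0]) for name, model in models.items()}
print(predictions)
```

Keeping the models in one dictionary makes it easy to run the same evaluation on each of them afterwards.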
Classification Model Evaluation
Each of the trained models is evaluated objectively by considering its precision and recall values on a few tweets. We observed that the Random Forest classifier had the highest precision and a very high recall. Hence, we infer that the Random Forest classifier is the best-performing model for our dataset among all the models we trained.
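For readers unfamiliar with these metrics: for a given label, precision is the fraction of tweets predicted as that label that really carry it, and recall is the fraction of tweets carrying that label that the model found. A small sketch of the computation follows; the function name and the toy predictions are made up for illustration and are not our actual results.

```python
def precision_recall(y_true, y_pred, label):
    """Compute precision and recall for one label from true and predicted ids."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy example: true vs. predicted label ids for six tweets.
y_true = [1, 1, 0, 2, 1, 0]
y_pred = [1, 0, 0, 2, 1, 1]
print(precision_recall(y_true, y_pred, label=1))
```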
How to Use Our Website
Step 1: Go to our website at http://127.0.0.1:8000/ (this is a local address, so the application must be running on your own machine).
Step 2: You will see the description of our project on the website. Scroll down and enter the URL of the tweet that you want to check.
Step 3: You will get the result that our model predicted, i.e., whether the tweet is hateful or not.
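Before the model can run, the submitted URL has to be checked and turned into something the server can look up. One plausible first step, sketched below, is pulling the username and numeric tweet ID out of a standard `twitter.com/<user>/status/<id>` URL. The function name `parse_tweet_url` and the regular expression are assumptions for illustration; how the tweet text is then fetched is not covered here.

```python
import re

# Matches URLs of the form https://twitter.com/<user>/status/<id>.
TWEET_URL = re.compile(
    r"^https?://(?:www\.|mobile\.)?twitter\.com/(\w+)/status/(\d+)"
)

def parse_tweet_url(url):
    """Return (username, tweet_id) for a tweet URL, or None if it does not match."""
    match = TWEET_URL.match(url.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)

print(parse_tweet_url("https://twitter.com/someuser/status/1234567890"))
print(parse_tweet_url("https://example.com/not-a-tweet"))
```

Rejecting anything that does not match keeps malformed input away from the classifier.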
Conclusion
Hate speech is an emerging problem of the 21st century whose effects are only going to increase. Hence, there is an urgent need to monitor and regulate hate speech, as it breeds extremist views and intolerance. HateCheck is a step toward reducing hateful comments on social media, by alerting users before posting if their content is hateful and by highlighting hateful content in the text. In the future, we would like to explore identifying hate speech in other media, such as audio and video.