Hate speech and misinformation on social media can have a devastating impact, particularly on marginalized communities. But what if we used artificial intelligence to combat such harmful content?
That's the goal of a team of University of Toronto researchers who were awarded a Catalyst Grant by the Data Sciences Institute (DSI) to develop an AI system that addresses the marginalization of communities in data-centric systems, including social media platforms such as Twitter.
The collaborative research team consists of Syed Ishtiaque Ahmed, an assistant professor in the department of computer science in the Faculty of Arts & Science; Shohini Bhattasali, an assistant professor in the department of language studies at U of T Scarborough; and Shion Guha, an assistant professor cross-appointed between the department of computer science and the Faculty of Information and the director of the Human-Centered Data Science Lab.
Their goal is to make content moderation more inclusive by involving the communities affected by harmful or hateful content on social media. The project is a collaboration with two Canadian non-profit organizations: the Chinese Canadian National Council for Social Justice (CCNC-SJ) and the Islam Unravelled Anti-Racism Initiative.
Historically marginalized groups are most affected by content-moderation failures, Ahmed explains, because they are underrepresented among human moderators and less of their data is available to the algorithms.
"While most social media platforms have taken measures to moderate and identify harmful content and limit its spread, human moderators and AI algorithms often fail to identify it correctly and take proper actions," he says.
The team plans to design and evaluate the proposed system on potentially Islamophobic and Sinophobic posts on Twitter. The AI system aims to democratize content moderation by including diverse voices in two primary ways: first, by allowing users to contest a moderation decision, it makes the process more transparent and trustworthy for users who are victims of online harm; second, by retraining machine learning (ML) models on user input, it ensures that users' contesting positions are reflected in the prescreening ML system.
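As a rough illustration only, not the team's actual implementation, a contest-and-retrain loop of the kind described above might look like the following sketch. The model choice, label scheme, `moderate` and `contest` functions, and the retraining threshold are all assumptions introduced for the example.

```python
# A minimal, hypothetical sketch of a contest-and-retrain moderation loop.
# The model, features, labels and retraining policy are illustrative
# assumptions, not the research team's actual system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Seed training data: posts labelled 1 (harmful) or 0 (benign).
posts = ["example hateful post", "ordinary friendly post"]
labels = [1, 0]

# Prescreening model that flags posts before human review.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(posts, labels)

contested = []  # (post, community_label) pairs gathered from appeals


def moderate(post):
    """Flag a post; affected users may then contest the decision."""
    return bool(model.predict([post])[0])


def contest(post, community_label):
    """Record a community member's appeal against a moderation decision."""
    contested.append((post, community_label))
    # Retrain once enough contested examples accumulate, so that
    # contestations feed back into the prescreening model
    # (the threshold of 10 is arbitrary).
    if len(contested) >= 10:
        new_posts, new_labels = zip(*contested)
        model.fit(posts + list(new_posts), labels + list(new_labels))
        contested.clear()
```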
"Annotating data becomes challenging when the annotators are divided in their opinions. Resolving this issue democratically requires involving different communities, which is currently not common in data science practices," Ahmed notes.
"This project addresses the issue by designing, developing and evaluating a pluralistic framework of justification and contestation in data science while working with two historically marginalized communities in Toronto."
The AI system will integrate the knowledge and experiences of community members into the process of reducing hateful content directed at their communities. The team is using a participatory data-curation methodology: it helps the researchers learn how different kinds of harmful content affecting a community are characterized, and it includes members of that community in the data-labelling process to ensure data quality.
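One way such community labelling is commonly quality-checked, sketched below purely as a hypothetical example rather than the team's method, is to aggregate labels from several annotators by majority vote and to measure inter-annotator agreement. All annotator names and data here are invented.

```python
# A hypothetical sketch of one data-quality step in participatory labelling:
# aggregating labels from several community annotators and measuring how
# well two annotators agree. The annotator data below is invented.
from collections import Counter
from sklearn.metrics import cohen_kappa_score

# Labels from three community annotators for the same five posts
# (1 = harmful, 0 = benign).
annotations = {
    "annotator_a": [1, 0, 1, 1, 0],
    "annotator_b": [1, 0, 0, 1, 0],
    "annotator_c": [1, 1, 1, 1, 0],
}

# Majority vote gives each post a single training label.
votes = zip(*annotations.values())
final_labels = [Counter(v).most_common(1)[0][0] for v in votes]

# Cohen's kappa quantifies pairwise agreement beyond chance; posts with
# low agreement could be sent back to the community for discussion.
kappa = cohen_kappa_score(annotations["annotator_a"], annotations["annotator_b"])
print(final_labels, round(kappa, 2))
```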
"We are grateful to DSI for their generous support for this project. The DSI community has also helped us connect with people conducting similar research and learn from them," Ahmed says, adding that his team's research is expected to have far-reaching impacts beyond the two communities it is currently focused on.