The Association for Computational Linguistics conference (ACL) 2021, a top publication venue and event for research in natural language processing (NLP), was held virtually from August 1-6, and I was fortunate to present our paper, "Claim Matching Beyond English to Scale Global Fact-Checking," at the conference.
From the opening remarks, it was clear that a major theme of the conference would be large language models: according to program committee co-chair Roberto Navigli, BERT-based models were the most prevalent topic in this year’s proceedings. The conference also brought a fresh and critical perspective, from the opening remarks to discussions of ethics, CO2 emissions and social good in NLP. It reminded me of Dr. Strangelove, Stanley Kubrick’s cinematic masterpiece about the threat of nuclear war and how thoughtless actions by a small group of influential people could endanger humanity. While in the past scientists only speculated about the threat of artificial intelligence to humanity, the AI we feared already exists. It was heartening and hopeful to see the computational linguistics community engage in these conversations before we hit the point of no return on harmful AI and language technologies.
ACL Presidential Address
One of the opening talks of the conference was Professor Rada Mihalcea’s presidential address as the 2021 president of the Association for Computational Linguistics. Coincidentally, she is also my PhD advisor!
Dr. Mihalcea called on the NLP community to "stop chasing accuracy numbers" and observed that "there is more to natural language processing than state of the art results." She rightfully pointed out that neural networks have taken over a large part of NLP even though they have major shortcomings that current NLP benchmarks overlook: lack of explainability, concerning biases and large environmental footprints. The speech followed a year of heated discussion around the ethics and implications of large language models, which peaked when Timnit Gebru was fired from and harassed by Google after submitting a paper critical of large language models to the ACM FAccT conference.
Large language models (LLMs) like BERT have transformed natural language processing and played a major role in recent advances in AI technology: powering personal assistants like Siri and Alexa, automating call center services and improving Google Search. There are no silver bullets, however, when it comes to these models. A 2019 paper found that training a large Transformer-based language model with neural architecture search emits roughly five times the lifetime CO2 emissions of a car. While a large number of ACL 2021 papers relied on pretrained LLMs, recently released models like OpenAI’s GPT-3 or Google’s T5 cannot be trained on an academic budget, leaving this impactful line of research a near-monopoly of big tech companies. This is all made worse by the biases encoded in these models, including stereotypes and negative attributions toward specific groups, which make their widespread adoption dangerous to society.
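Emissions estimates like these typically come from a simple accounting of hardware power draw, training time, datacenter overhead (PUE) and grid carbon intensity. A minimal sketch of that arithmetic, in the spirit of the 2019 study; the default power draw, PUE and carbon-intensity values below are illustrative assumptions, not measurements from any particular training run:

```python
# Back-of-the-envelope CO2e estimate for a model training run.
# All default constants are assumptions for illustration only.

def training_co2e_kg(gpu_count, gpu_power_watts, hours,
                     pue=1.58, kg_co2e_per_kwh=0.477):
    """Estimate kilograms of CO2-equivalent emitted by a training run.

    pue: datacenter Power Usage Effectiveness (1.58 ~ a common industry average).
    kg_co2e_per_kwh: grid carbon intensity (0.477 ~ a rough US grid average).
    """
    # Total energy: device power x devices x hours, scaled by datacenter overhead.
    energy_kwh = gpu_count * gpu_power_watts * hours / 1000 * pue
    return energy_kwh * kg_co2e_per_kwh

# Example: 8 GPUs drawing 300 W each, trained for one week.
print(round(training_co2e_kg(8, 300, 24 * 7), 1))
```

The point of writing it out is that every factor is measurable, which is exactly what efforts like Green NLP argue the community should start reporting.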
Taken together with the major limitations of neural networks raised in Dr. Mihalcea’s presidential address, we can see that amid the hype surrounding Transformer-based language models, they are likely still a long way from human-like intelligence and understanding of language. To quote an anonymous survey respondent from Dr. Mihalcea’s research, "[NLP]’s major negative impact now is putting more money in less hands"; we should shift our focus from improving accuracy numbers to factors such as interpretability, emissions and ethics, so that we build language technology that benefits everyone, not just a handful of powerful companies.
Green NLP Panel
Green NLP is a community initiative within ACL that aims to address the environmental impacts of NLP. A panel of academic and industry researchers, moderated by Dr. Iryna Gurevych, discussed these impacts on the second day of the main conference. Dr. Jesse Dodge opened the panel with a presentation on "Efficient Natural Language Processing", an effort to encourage more resource-efficient exploration in NLP and to address some of the challenges creative, low-budget research faces when publication venues are dominated by work on LLMs, without slowing down the impressive progress NLP has seen in recent years. Throughout the panel, the audience and panelists raised many interesting points.
Time and again, panelists expressed concern about the lack of access to the resources required to reproduce many of the papers presented at ACL conferences, and some even called for interventions. Dr. Mona Diab argued that instead of the current "black box" approach to NLP through the use of LLMs, we should move toward "green box", efficient NLP that is easily reproducible and accessible to a broader and more diverse group of researchers, eventually democratizing NLP while in parallel reducing the emissions of our research. Others pointed out that the current setup in NLP discourages competition from academia, and that moving toward greener, more efficient NLP could mitigate that and increase creativity in the community’s research output.
The panel ended with a question from the audience asking how the energy use and emissions of NLP compare with those of cryptocurrencies like Bitcoin. One of the panelists, Dr. Emma Strubell, explained that we simply don’t have an answer to this question yet. While Bitcoin currently consumes enough energy to power the Czech Republic twice over, there are active efforts to reduce emissions from cryptocurrencies, efforts made possible only through measurement, something the NLP and AI community may be lagging behind on. There is a lot to be done to ensure NLP is democratized and environmentally safe, but community initiatives like Green NLP spark hope that this could become a reality.
NLP for Social Good
The conference’s theme track, a workshop and a social event shared the same topic, NLP for social good: an effort by the Association for Computational Linguistics to nurture discussion of the role of NLP in society. These discussions included efforts to define what "NLP for social good" means, to identify both positive and negative societal impacts of NLP, and to find methods for better assessing these effects. Dr. Chris Potts’ closing keynote of the conference, "Reliable characterizations of NLP systems as a social responsibility", offered detailed and fresh directions for building and evaluating NLP systems in ways that reduce their adverse social impacts and align their performance with our collective social values.
At the NLP for Social Good birds-of-a-feather social event, led by Zhijing Jin, Dr. Rada Mihalcea and Dr. Sam Bowman, a friendly conversation unfolded around community building, current NLP-for-social-good initiatives and future directions for developing NLP with positive social impact. The consensus on defining "social good" was to go with a loose and broad definition, so long as we don’t overstate the impact of our research, as computer scientists sometimes do. Topics such as NLP for climate change and preserving indigenous languages came up as research initiatives the NLP for Social Good community could focus on in the near future. I unfortunately could not attend the "NLP for Positive Impact" workshop, but I encourage readers to check out its proceedings.
Ending on some favorite NLP + CSS papers from the conference
- A shoutout for our paper "Claim Matching Beyond English to Scale Global Fact-Checking", in which we group together similar claims that can be served with a single fact-check, across a variety of high- and low-resource Indian languages. We released two novel multilingual datasets for this task.
- Changing the World by Changing the Data
- How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact
- COVID-19 and Misinformation: A Large-Scale Lexical Analysis on Twitter
- Structurizing Misinformation Stories via Rationalizing Fact-Checks
- Tackling Fake News Detection by Interactively Learning Representations using Graph Neural Networks
- Read more about my experience at the 2020 ACL conference
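For readers curious what claim matching looks like in practice: our paper uses multilingual sentence embeddings, but the core idea of grouping claims by pairwise similarity can be sketched with a toy bag-of-words similarity. The threshold and the greedy grouping heuristic below are illustrative stand-ins, not our paper's actual method:

```python
# Toy sketch of claim matching: group near-duplicate claims so that a
# single fact-check can serve all of them. A bag-of-words cosine stands
# in for a multilingual sentence-embedding model to keep this self-contained.
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_claims(claims, threshold=0.5):
    """Greedy single-link grouping: a claim joins the first group whose
    representative it matches above the threshold; otherwise it starts
    a new group. (Illustrative heuristic, not the paper's method.)"""
    groups = []  # list of (representative_vector, member_claims)
    for claim in claims:
        vec = Counter(claim.lower().split())
        for rep, members in groups:
            if cosine(rep, vec) >= threshold:
                members.append(claim)
                break
        else:
            groups.append((vec, [claim]))
    return [members for _, members in groups]

claims = [
    "Drinking hot water cures COVID-19",
    "Hot water can cure COVID-19",
    "5G towers spread the coronavirus",
]
print(group_claims(claims))
```

With a real multilingual embedding model in place of the token counts, the same grouping idea extends across languages, which is what makes the approach useful for scaling fact-checking globally.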
<p><a href="https://www.ashkankazemi.ir" title="Ashkan's personal website">Ashkan</a> is a natural language processing (NLP) intern at Meedan, contributing to research efforts in building fact-checking technology. He is also a PhD candidate at University of Michigan’s department of Computer Science and Engineering.</p>