This is a roundup of the most fascinating AI research papers released this year, spanning advances in data science and artificial intelligence (AI). Each entry links to a longer write-up, and the papers are arranged chronologically.
An Approach to Meta-Learning Based on Language Clustering for Cross-Lingual Transfer and Zero-Shot Generation
Research on multilingual and cross-lingual transfer, in which supervision is shifted from high-resource languages (HRLs) to low-resource languages (LRLs), has advanced significantly in recent years. However, the quality of cross-lingual transfer varies considerably from one language to another, particularly in the zero-shot setting. One promising research direction is to learn architectures that generalize across a variety of tasks from small amounts of labelled data.
In this paper, the researchers present Meta-XNLG, a meta-learning framework that uses language clustering and meta-learning to learn shareable structures from typologically diverse languages. It is a step towards uniform cross-lingual transfer to entirely unseen languages. The languages are first clustered by typological similarity, and the centroid language of each cluster is identified. A meta-learning model is then trained on all the centroid languages and evaluated zero-shot on the remaining languages. The researchers demonstrate the effectiveness of this modelling on two NLG tasks (Abstractive Text Summarization and Question Generation), five popular datasets, and 30 typologically diverse languages. Consistent improvements over strong baselines demonstrate the viability of the proposed framework. Moreover, because the model is carefully designed, this end-to-end NLG setup is less prone to accidental translation, a significant concern in zero-shot cross-lingual NLG tasks.
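To make the clustering step concrete, here is a minimal sketch of grouping languages by feature vectors and picking each cluster's centroid language. The feature vectors, language codes, and the use of k-means here are illustrative assumptions, not details taken from the paper:

```python
# Minimal sketch of the clustering step described above: group languages by
# (hypothetical) typological feature vectors, then pick the language closest
# to each cluster centre as the "centroid" language used for meta-training.
import numpy as np
from sklearn.cluster import KMeans

def pick_centroid_languages(lang_features: dict[str, np.ndarray], k: int) -> list[str]:
    langs = list(lang_features)
    X = np.stack([lang_features[l] for l in langs])
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    centroids = []
    for c in range(k):
        members = [i for i, label in enumerate(km.labels_) if label == c]
        # Centroid language = cluster member nearest to the cluster centre.
        nearest = min(members, key=lambda i: np.linalg.norm(X[i] - km.cluster_centers_[c]))
        centroids.append(langs[nearest])
    return centroids

# Example with made-up 4-dimensional typology vectors.
features = {lang: np.random.rand(4) for lang in ["en", "hi", "bn", "ta", "fr", "de"]}
print(pick_centroid_languages(features, k=2))
```

The remaining (non-centroid) languages would then only ever be seen at zero-shot evaluation time, which is what makes the setup a test of transfer to unseen languages.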
Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages
Pretrained multilingual language models such as mBERT and XLM-R have shown great promise for zero-shot cross-lingual transfer to low web-resource languages (LRLs). However, because model capacity is limited, the large gap in pretraining corpus size between high web-resource languages (HRLs) and LRLs leaves little room for co-embedding the LRLs with the HRLs, which ultimately hurts LRL performance.
The authors argue that the scarcity of LRL corpora can be partly offset by exploiting the lexical overlap between languages of the same family. They propose Overlap BPE (OBPE), a simple yet effective modification to the BPE vocabulary-generation algorithm that increases the overlap between related languages. Through extensive experiments across NLP tasks and datasets, they find that OBPE produces a vocabulary in which LRLs are represented largely through tokens shared with HRLs. This improves zero-shot transfer from related HRLs to LRLs without degrading the representation or accuracy of the HRLs.
In contrast to earlier studies that dismissed token overlap as unimportant, the researchers show that it is crucial in low-resource settings: reducing the overlap to zero can cause up to a four-fold drop in zero-shot transfer accuracy.
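To illustrate the core idea, here is a toy sketch of an overlap-aware merge choice in BPE. The scoring rule (raw pair counts plus an overlap bonus) is an assumption made for illustration, not OBPE's exact formulation:

```python
# Toy sketch of the overlap-aware idea behind OBPE: when choosing the next
# BPE merge, boost symbol pairs that occur in both the HRL and the LRL
# corpus, so the final vocabulary is shared across related languages.
from collections import Counter

def pair_counts(corpus):
    """Count adjacent symbol pairs over a corpus of tokenized words."""
    counts = Counter()
    for word in corpus:
        for a, b in zip(word, word[1:]):
            counts[(a, b)] += 1
    return counts

def next_merge(hrl_corpus, lrl_corpus, overlap_bonus=2.0):
    hrl, lrl = pair_counts(hrl_corpus), pair_counts(lrl_corpus)
    def score(pair):
        total = hrl[pair] + lrl[pair]
        # Reward pairs present in BOTH corpora, nudging BPE toward
        # merges that both languages can reuse.
        return total * (overlap_bonus if pair in hrl and pair in lrl else 1.0)
    return max(set(hrl) | set(lrl), key=score)

# Words are lists of symbols; a full BPE loop would apply the chosen merge
# everywhere and repeat until the vocabulary budget is reached.
hrl = [list("lower"), list("newest")]
lrl = [list("lowen"), list("newes")]
print(next_merge(hrl, lrl))
```

Standard BPE would pick the most frequent pair regardless of which corpus it comes from; the bonus term is what tilts the vocabulary toward tokens the LRL can share with the HRL.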
Moving Towards Fair Dialogue State Tracking Evaluation via Flexible Turn-level Performance Incorporation
Joint Goal Accuracy (JGA), the percentage of turns in which the predicted dialogue state exactly matches the ground truth, is the primary metric used to evaluate Dialogue State Tracking (DST). In DST, the dialogue (or belief) state at a given turn accumulates the user's intents from all previous turns. Because the belief state is cumulative, a single wrong prediction makes it harder for subsequent turns to match exactly. As a result, although JGA is a useful metric, it can be harsh and underestimate a DST model's true ability. Moreover, because of inconsistent annotations, optimizing for JGA can degrade turn-level (non-cumulative) belief-state prediction. Relying on JGA alone for model selection may therefore not always be the best choice.
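For reference, here is a minimal sketch of JGA as described above, with dialogue states modelled as slot-to-value dicts (the slot names are made up for the example):

```python
# Joint Goal Accuracy: a turn counts only if the predicted dialogue state
# matches the gold state exactly; partial matches earn nothing.
def joint_goal_accuracy(predicted_states, gold_states):
    assert len(predicted_states) == len(gold_states)
    exact = sum(p == g for p, g in zip(predicted_states, gold_states))
    return exact / len(gold_states)

gold = [{"area": "north"}, {"area": "north", "food": "thai"}]
pred = [{"area": "north"}, {"area": "north", "food": "indian"}]
print(joint_goal_accuracy(pred, gold))  # 0.5: the second turn mismatches
```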
In this study, the researchers examine various DST evaluation schemes and their shortcomings. To address the issues above, they propose a new evaluation metric, Flexible Goal Accuracy (FGA), a generalization of JGA. Unlike JGA, FGA assigns a discounted penalty to locally correct mispredictions, i.e., cases where the error originates in an earlier turn. FGA thereby measures both cumulative and turn-level prediction performance more flexibly than earlier metrics and offers better insight. The researchers also show that FGA is a better discriminator of DST model performance.
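To show the intuition, here is a hedged sketch of an FGA-style score in which locally correct turns receive partial credit that decays with distance from the inherited error. The exponential decay used here is an illustrative assumption, not the paper's exact formula:

```python
# Sketch of the intuition behind Flexible Goal Accuracy: a turn whose joint
# state is wrong only because of an error inherited from an earlier turn
# (its turn-level prediction is locally correct) receives partial credit
# that fades with distance from the offending turn.
import math

def flexible_goal_accuracy(turn_correct, joint_correct, lam=0.5):
    """turn_correct[t]: turn-level slots predicted correctly at turn t.
    joint_correct[t]: accumulated joint state matches exactly at turn t."""
    score, last_error = 0.0, None
    for t, (tc, jc) in enumerate(zip(turn_correct, joint_correct)):
        if jc:
            score += 1.0
        elif tc and last_error is not None:
            # Locally correct turn: discounted credit based on how far
            # back the inherited error occurred.
            score += math.exp(-lam * (t - last_error))
        if not tc:
            last_error = t
    return score / len(turn_correct)

# Turn 1 is wrong; turns 2-3 are locally correct but inherit that error.
# JGA would score this dialogue 0.25, while the FGA-style score is ~0.49.
print(flexible_goal_accuracy([True, False, True, True],
                             [True, False, False, False]))
```

The design choice this illustrates is the one the paper argues for: a model that recovers at the turn level after one mistake should score higher than one that keeps mispredicting, even though plain JGA treats both the same.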