These are some of the most intriguing artificial intelligence research papers published this year. They combine innovations in artificial intelligence (AI) and data science. The papers are presented in chronological order, and a link to a longer article is provided for each.
GCR: Gradient Coreset-Based Replay Buffer Selection for Continual Learning
Continual learning (CL) aims to develop methods by which a single model can adapt to a growing number of tasks encountered sequentially, ideally transferring learning across tasks in a resource-efficient way. Catastrophic forgetting, however, is a serious challenge for CL systems: when a model learns a new task, it tends to forget what it has already learned.
Gradient Coreset Replay (GCR) is a novel method for selecting and updating replay buffers, built on a carefully designed optimization criterion. Specifically, it selects and maintains a "coreset" whose gradients approximate the gradients of all the data observed so far with respect to the current model's parameters. The authors investigate the approaches needed to implement this efficiently in a continual learning setting and discuss their findings.
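To make the selection step concrete, here is a minimal sketch of one way a gradient-matching coreset could be chosen. It assumes per-example gradient vectors are already available and uses a greedy, matching-pursuit-style rule; the function name, the greedy rule, and the stopping condition are illustrative assumptions, not the paper's exact optimization criterion.

    import numpy as np

    def select_gradient_coreset(grads, k):
        # grads: (n, p) float array, one flattened per-example gradient per row.
        # Greedily pick k examples whose gradients best explain the mean
        # gradient of all observed data (an illustrative stand-in for GCR's
        # criterion, not the paper's exact algorithm).
        residual = grads.mean(axis=0).copy()
        selected = []
        for _ in range(k):
            scores = grads @ residual              # alignment with what is left
            if selected:
                scores[np.array(selected)] = -np.inf   # no repeats
            i = int(np.argmax(scores))
            selected.append(i)
            g = grads[i]
            coef = float(residual @ g) / (float(g @ g) + 1e-12)
            residual = residual - coef * g         # remove the explained part
        return selected

The selected examples would then populate the replay buffer and be mixed into training batches for subsequent tasks.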
The researchers show that their method outperforms the state of the art by a significant margin (2%-4% absolute) in a well-studied offline continual learning setting. Applied to online and streaming CL settings, it yields improvements of up to 5% over existing solutions. Finally, they show that supervised contrastive loss is beneficial for continual learning, giving a cumulative accuracy gain of up to 5% when combined with their subset-selection technique.
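For reference, the supervised contrastive loss they combine with replay can be written down in a few lines. The sketch below follows the standard SupCon formulation (Khosla et al.) and assumes L2-normalized embeddings; it is not code from the paper.

    import torch

    def supcon_loss(features, labels, temperature=0.1):
        # features: (n, d) L2-normalized embeddings; labels: (n,) class ids.
        n = features.size(0)
        sim = features @ features.t() / temperature
        not_self = ~torch.eye(n, dtype=torch.bool, device=features.device)
        positives = (labels.unsqueeze(0) == labels.unsqueeze(1)) & not_self
        # log-softmax over every other example in the batch
        sim = sim.masked_fill(~not_self, float("-inf"))
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        # mean log-probability of the positives for each anchor
        pos_count = positives.sum(dim=1).clamp(min=1)
        per_anchor = -log_prob.masked_fill(~positives, 0.0).sum(dim=1) / pos_count
        # average only over anchors that actually have positives in the batch
        return per_anchor[positives.any(dim=1)].mean()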
Merry Go Round: Rotate a Frame and Fool a DNN
Most first-person videos captured today come from wearable cameras. Egocentric vision is one of the hardest problems in computer vision, and most state-of-the-art (SOTA) vision systems rely on deep neural networks (DNNs). However, DNNs are known to be susceptible to adversarial attacks (AAs), which perturb the input with noise invisible to the human eye. Both black-box and white-box attacks have been shown to be effective against image- and video-analysis tasks.
The researchers observe that most AA methods work by altering image intensities, and for videos the process must be repeated for every frame. They emphasize that the notion of imperceptibility used for images may not carry over to videos, because in a video a random shift in intensity can be noticeable even between two consecutive frames.
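As a point of contrast, the dominant intensity-based attacks look roughly like the FGSM sketch below, which adds per-frame pixel noise to a clip in one gradient step. The clip shape and classifier interface are assumptions for illustration; this is the style of attack the authors argue against for video, not their method.

    import torch

    def fgsm_clip(model, clip, label, eps=2.0 / 255.0):
        # clip: (1, T, C, H, W) video tensor in [0, 1]; label: (1,) class id.
        # One-step FGSM: every frame receives its own additive intensity
        # noise, i.e. exactly the per-frame perturbation style described above.
        clip = clip.clone().requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(clip), label)
        loss.backward()
        return (clip + eps * clip.grad.sign()).clamp(0.0, 1.0).detach()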
As the main novelty of this work, the authors propose carrying out AAs on a video-analysis system by perturbing the optical flow. This kind of disruption is particularly well suited to egocentric videos: egocentric recordings already contain a significant amount of shake, and adding a small amount more is almost impossible to detect. Broadly, their idea can be understood as adding structured, parametric noise as the adversarial perturbation. Their instantiation of the idea, which applies 3D rotations to the frames, shows that their technique can mount a black-box AA on an egocentric activity-detection system with one-third fewer queries than the SOTA AA technique.
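A rough sketch of what such a parametric rotational perturbation could look like is given below: a small 3D camera rotation applied to a frame through the homography H = K R K^-1. The intrinsics, angle values, and border handling are illustrative assumptions; the actual attack would search over the rotation parameters rather than over pixel noise.

    import numpy as np
    import cv2

    def rotate_frame_3d(frame, yaw=0.0, pitch=0.0, roll=0.0, focal=500.0):
        # Warp a frame as if the camera rotated slightly (angles in radians),
        # using the planar-scene approximation H = K @ R @ inv(K).
        h, w = frame.shape[:2]
        K = np.array([[focal, 0.0, w / 2.0],
                      [0.0, focal, h / 2.0],
                      [0.0, 0.0, 1.0]])
        cx, sx = np.cos(pitch), np.sin(pitch)
        cy, sy = np.cos(yaw), np.sin(yaw)
        cz, sz = np.cos(roll), np.sin(roll)
        Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
        Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
        Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
        H = K @ (Rz @ Ry @ Rx) @ np.linalg.inv(K)
        return cv2.warpPerspective(frame, H, (w, h),
                                   borderMode=cv2.BORDER_REFLECT)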
Multi-modal Extreme Classification
This research presents MUFIN, a method for extreme classification (XC) problems that involve millions of labels and data points carrying both visual and textual descriptors. MUFIN is demonstrated on applications such as product-to-product recommendation and bid-query prediction across millions of products. Modern multimodal techniques typically rely on embedding-based methods alone. XC approaches, by contrast, use classifier architectures to achieve higher accuracy than embedding-only methods, but have concentrated mainly on text-based classification problems.
MUFIN develops an architecture based on cross-modal attention and trains it in a modular fashion using pre-training and positive and negative mining. A new dataset, MM-AmazonTitles-300K, was compiled from publicly available listings on Amazon.com for product-to-product recommendation; it contains roughly 300 thousand products, each with a title and a set of images. On MM-AmazonTitles-300K, on Polyvore, and on a dataset of over 4 million labels drawn from Bing search-engine click logs, MUFIN delivered at least 3% higher accuracy than leading text-based, image-based, and multimodal approaches.
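To give a sense of what cross-modal attention means here, the sketch below lets text-token embeddings attend over image-patch embeddings. The dimensions, the single-block design, and the residual wiring are assumptions for illustration, not MUFIN's exact architecture.

    import torch
    import torch.nn as nn

    class CrossModalAttention(nn.Module):
        # Text tokens query image patches; the fused representation could
        # then feed downstream extreme classifiers. Illustrative only.
        def __init__(self, dim=256, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, text_tokens, image_patches):
            # text_tokens: (B, Lt, dim); image_patches: (B, Li, dim)
            fused, _ = self.attn(text_tokens, image_patches, image_patches)
            return self.norm(text_tokens + fused)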