Encodec, a tool from Meta, uses AI to significantly compress audio.
Nearly every industry is currently investing in AI and machine learning research. This month, researchers at Meta presented two new advances: an AI system that compresses audio files and a method that can speed up AI-based protein structure prediction by 60x. MIT researchers also disclosed that, to give machines a better sense of their surroundings, they are modelling how a listener would perceive a sound from any location in a room.
Meta’s audio system is not the first of its kind. Lyra is a neural audio codec developed by Google that aims to compress speech at low bit rates. But according to Meta, its system is the first to handle CD-quality stereo audio, making it suitable for commercial applications like voice calls.
Meta’s AI-based compression system, Encodec, can compress and decompress audio in real time on a single CPU core at rates ranging from 1.5 kbps to 12 kbps. Relative to MP3 at 64 kbps, Encodec can achieve roughly 10x compression without substantially lowering quality. According to the researchers who developed Encodec, human listeners preferred the quality of Encodec-processed audio over Lyra-processed audio. This finding suggests that Encodec may one day be used to deliver higher-quality audio in situations where bandwidth is expensive or limited.
Meta’s protein folding research has fewer immediate commercial applications. However, it could serve as the starting point for important biological research in the future. Meta says its AI system, ESMFold, predicted the structures of about 600 million previously uncharacterized proteins from bacteria, viruses, and other microbes.
ESMFold is less accurate than DeepMind’s AlphaFold, however. Of the 600 million protein structures it generated, only a third were of high quality. But because it predicts structures 60 times more quickly, it can scale structure prediction up to much larger protein databases.
Not to give Meta outsized attention, but the company’s AI division also detailed a system this month that is designed to reason mathematically. Company researchers claim that by studying a dataset of successful mathematical proofs, their “neural problem solver” learned to generalise to new, distinct kinds of problems. Meta is not the first to build such a system. OpenAI announced its own in February, a neural theorem prover built on the Lean proof assistant. Separately, DeepMind has experimented with systems that can solve challenging mathematical problems in the study of symmetries and knots.
Researchers at MIT developed a machine learning model that can predict how sounds will propagate through a room. By modelling the acoustics, the system can learn the geometry of a room from sound recordings, and this geometry can then be used to produce visual renderings of the room.
According to the researchers, the method could be applied to virtual and augmented reality applications as well as to robots that must navigate complex environments. In future work, they plan to extend the approach to new and larger environments, such as whole buildings or even entire towns and cities.
Two separate teams at Berkeley’s robotics department are accelerating the rate at which a quadrupedal robot can learn to walk and perform other tricks. One team set out to combine the best of numerous recent advances in reinforcement learning so that a robot can go from a completely blank slate to robust walking on uncertain terrain in under 20 minutes of real time. Perhaps surprisingly, the researchers found that, with a few careful design decisions regarding task setup and algorithm implementation, a quadrupedal robot can learn to walk from scratch with deep RL in under 20 minutes, across a range of different environments and surface types. They stress that this requires neither new algorithmic components nor any other unexpected innovation. Instead, they select and combine several state-of-the-art techniques to achieve impressive results.
The researchers say the technique is challenging to apply in the real world because it is not clear whether enough data exists to train the forecasting system. They are nonetheless optimistic about potential applications, which could include anticipating damage to bridges and other structures.