Suraj Srinivas is a postdoctoral research fellow at Harvard University. He completed his PhD studies in Switzerland at the Idiap Research Institute and EPFL.
Suraj earned his Master’s degree (by research) at the Indian Institute of Science in Bangalore, where he focused on deep neural network model compression.
His research interests include model interpretability, generative modelling, uncertainty quantification, and deep learning theory.
INDIAai interviewed Suraj to hear his perspective on artificial intelligence.
Can you tell us about your research work in the Video Analytics Lab at IISc?
Deep neural networks are powerful tools that have been wildly successful in many AI tasks such as computer vision, speech recognition, and natural language processing. However, a problem with these models is that they are often large, require massive amounts of memory for storage, and can be slow to run. This means that we can practically use them only on powerful hardware such as GPUs and workstations, not on tiny mobile devices or even laptops.
At the Indian Institute of Science, Bangalore, I worked on the problem of deep neural network compression, which involves reducing the size of these models so that they can eventually run on devices with small amounts of memory. Together with Prof. R. Venkatesh Babu, who leads the Video Analytics Lab at IISc, we published papers from 2015 to 2017 exploring different ways to perform such compression. We published these works when model compression was not a very popular topic of research, but the field has grown in size and importance over the last 5-6 years. Today, it is not uncommon to find deep neural networks running on mobile devices, which is one reason why smartphone image quality has drastically improved over the last few years.
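To make the idea of compression concrete, here is a minimal sketch of magnitude pruning, a common compression baseline in which the smallest-magnitude weights are zeroed out. This is an illustrative example only, not one of the specific methods from the papers mentioned above:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the `sparsity` fraction of weights with smallest magnitude.

    Returns the pruned weight array and the boolean keep-mask. Zeroed
    weights need not be stored, which is where the memory saving comes from.
    """
    flat = np.abs(weights).ravel()
    k = int(flat.size * sparsity)
    threshold = np.partition(flat, k)[k]  # k-th smallest magnitude
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Toy stand-in for a trained layer's weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))
pruned, mask = magnitude_prune(w, sparsity=0.9)
print(f"fraction of weights kept: {mask.mean():.2f}")
```

In practice, pruned networks are usually fine-tuned afterwards to recover any lost accuracy, and the sparse weights are stored in a compressed format.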
What motivated you to research at Idiap Research Institute on neural network interpretability? Can you tell us about your role model and motivating factors?
The complexity of deep neural networks not only leads to the large memory requirements I mentioned above, but also makes it difficult for researchers to analyze and understand these models. This creates a paradox that rarely occurs in other areas of engineering: the engineer who builds the ML model does not themselves know how it works! While we can get models to perform specific tasks, we do not understand how they accomplish them. Specifically, consider the task of identifying whether an image contains a cat or a dog. Given a large set of cat and dog images, we can train models to solve this task, but we do not know how the model tells cats and dogs apart. For example, does it look for whiskers, fur, ear shape, or all of these? This is known as the problem of AI interpretability, which broadly consists of understanding the relevant factors that influence model behaviour.
During my PhD at EPFL and Idiap Research Institute in Switzerland, I worked with my supervisor Prof. François Fleuret on this problem. We showed that several existing approaches to interpretability were flawed and did not work as intended. As a result, I do not think we currently have promising approaches to tackle this problem. At the same time, this is a significant and fundamental problem that urgently needs solutions.
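One simple family of interpretability methods asks how sensitive a model's output is to each input feature. The sketch below approximates such a gradient-based saliency map by finite differences on a toy black-box scorer; it is a hedged illustration of the general idea, not a reconstruction of the specific analyses from this PhD work:

```python
import numpy as np

def saliency_map(f, x, eps=1e-4):
    """Approximate |df/dx_i| for each input feature by central finite differences.

    `f` is treated as a black box mapping an input vector to a scalar score,
    standing in for a trained model's output for one class.
    """
    grads = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = eps
        grads[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return np.abs(grads)

# Toy scorer: depends strongly on features 0 and 2, barely on feature 1.
f = lambda x: 3.0 * x[0] + 0.1 * x[1] - 2.0 * x[2]
x = np.array([1.0, 1.0, 1.0])
print(saliency_map(f, x))  # approximately [3.0, 0.1, 2.0]
```

For real networks the gradient is computed by backpropagation rather than finite differences, and the resulting per-pixel scores are visualized as a heatmap over the input image. The flaws referred to above concern whether such maps faithfully reflect what the model actually uses.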
What is the difference between working as a PhD student and a postdoctoral research fellow?
Both PhD students and postdocs are researchers whose primary job is to contribute to and extend the knowledge of their field. They do this by proposing new ideas and publishing them as papers in top-tier conferences and journals.
PhD students typically need to write a certain number of papers and take several courses to graduate; the exact numbers depend on the field and the lab. Many take up a postdoctoral fellowship (or simply, a "postdoc") after earning their PhD, which involves mentoring junior PhD students, publishing papers, writing grants, and teaching some courses. The objective of a postdoc is to eventually work independently in academia as a professor leading their own lab. However, both PhDs and postdocs can later pursue careers in academia or industry, which is especially true in an area such as machine learning.
According to a market report, the AI image recognition market will grow by USD 3.56 billion between 2021 and 2026. What role could India's research play in this area?
The critical factors for success in AI are access to computational resources, access to data, and the availability of research excellence. Large companies such as Google, Amazon, Facebook, etc., have access to these.
My personal opinion is that India already has excellence in AI research, with world-class researchers at institutes like IISc and IITs. However, we can bridge the gap in access to computation & data by increasing public spending on AI infrastructure, which will enable Indian companies and researchers to be competitive in the global AI market.
What kind of knowledge do you need to do research or work in the field of artificial intelligence?
AI is unique among engineering disciplines due to its multidisciplinary nature. It combines ideas from calculus, statistics, computational biology, psychology, control theory, signal processing, theory of computation, and physics, among many others. This means that curious individuals from a wide range of backgrounds can understand and contribute to the field, not only those from a computer science background.
Moreover, it is essential to be comfortable with coding and undergraduate-level math topics such as linear algebra and probability to get started with AI research. The field of AI is fast-changing, and thus one should be open to constantly learning throughout their career.
What advice would you provide to someone considering a career in artificial intelligence?
From the outside, the field of AI can look intimidating: research papers at top-tier conferences are full of hard-to-parse math equations and incomprehensible jargon. It is essential not to be discouraged when reading such articles; focus on learning one thing at a time, and do not try to learn everything at once. Apart from this, continually invest in your general math and coding skills (I still do!), which will pay off well in your career.
Can you suggest some research articles and books on AI?
My favourite ML textbooks are "Pattern Recognition and Machine Learning" by Dr Christopher Bishop and "Information Theory, Inference, and Learning Algorithms" by Prof. David MacKay. I consider the book by Bishop to be a bible for ML, and I highly recommend it to all students of the field.
Regarding research papers, I like everything Prof. Geoff Hinton has written because of his no-nonsense writing style. My favourite paper of his is "Keeping the neural networks simple by minimizing the description length of the weights" by Hinton and Van Camp, published in 1993. Although this paper was published almost 30 years ago, the ideas it proposes, such as variational inference for neural networks, are relevant even today. I also like the papers of Prof. Aapo Hyvärinen (although they took me a long time to understand). My favourite of these is "Estimation of Non-Normalized Statistical Models by Score Matching", published in 2005. This paper proposes a new way of fitting probability distributions to data, and I highly recommend it to more advanced readers.
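For readers curious what score matching does, a brief sketch (paraphrasing the 2005 paper): writing the model's score function as $\psi(x;\theta) = \nabla_x \log p(x;\theta)$, the method matches this to the data's score, and integration by parts yields an objective that can be evaluated without knowing the model's normalizing constant:

```latex
J(\theta) \;=\; \mathbb{E}_{p_{\text{data}}}\!\left[ \sum_i \left( \partial_i \psi_i(x;\theta) \;+\; \tfrac{1}{2}\,\psi_i(x;\theta)^2 \right) \right] \;+\; \text{const}
```

Because $\nabla_x \log p(x;\theta)$ is unchanged by rescaling $p$, the normalizing constant drops out, which is what makes fitting non-normalized models tractable.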
Source: indiaai.gov.in