These statistical ideas have driven major breakthroughs in AI development.
AI has made remarkable progress across many fields of application. It is the branch of computer science concerned with designing systems that exhibit the characteristics of human intelligence: the ability to perceive, learn, reason, and solve problems. The field has attracted researchers because of its ambitious goals and the enormous intellectual challenges underlying them, and it has remained controversial because of its social, ethical, and philosophical implications. Yet despite the controversy, AI, deep learning, and machine learning have become household terms. Much of this progress has been triggered by breakthroughs in statistics. In this article, we have listed the top statistical ideas that have triggered AI development over the years.
Information Theory and an Extension of the Maximum Likelihood Principles by Hirotugu Akaike (1973):
This paper introduced the term AIC, originally an abbreviation of "An Information Criterion" but now known as the Akaike Information Criterion. The term describes the process of evaluating a model's fit based on its estimated predictive accuracy. AIC was immediately recognized as a useful tool, and this paper became one of several publications that placed statistical inference within a predictive framework.
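To make the idea concrete, here is a minimal Python sketch (not code from the paper) that compares a simple linear fit with an over-flexible polynomial fit using AIC = 2k − 2 ln L̂, where k is the number of estimated parameters and L̂ the maximized likelihood; the simulated data and the gaussian_aic helper are purely illustrative.

```python
# Minimal illustration of comparing two linear models by AIC (AIC = 2k - 2*ln(L)).
# Assumes Gaussian errors; a smaller AIC suggests better estimated predictive accuracy.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=100)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=100)   # data generated by a linear model

def gaussian_aic(y, y_hat, k):
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)  # variance profiled out
    return 2 * k - 2 * log_lik

for degree in (1, 5):                       # linear fit vs. needlessly flexible quintic fit
    coefs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coefs, x)
    k = degree + 2                          # polynomial coefficients plus noise variance
    print(f"degree {degree}: AIC = {gaussian_aic(y, y_hat, k):.1f}")
```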
Exploratory Data Analysis by John W. Tukey (1977):
This book has not only been greatly influential but is also fun to read. Previously, data visualization and exploration were considered low-grade aspects of practical statistics. John Tukey changed that view entirely when he wrote about using statistical tools not just to confirm what we already know and reject hypotheses we never believed, but to discover new and unexpected insights from data.
Improper Priors, Spline Smoothing and the Problem of Guarding Against Model Errors in Regression by Grace Wahba (1978):
Spline smoothing is an approach for fitting non-parametric curves. Wahba wrote another paper in the same year describing a class of algorithms that can fit arbitrary smooth curves through data without overfitting to noise or outliers. Although this idea may seem obvious in the modern era of disruptive technologies, it was a huge step back in the day, when the standard starting points for curve fitting were polynomials, exponentials, and other fixed forms.
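As a rough illustration of the idea (using SciPy's smoothing splines rather than Wahba's original method), the sketch below fits a smooth curve through noisy samples of a sine wave; the simulated data and the choice of smoothing parameter s are purely illustrative.

```python
# Minimal sketch of spline smoothing: fit a smooth curve through noisy data,
# with a smoothing parameter guarding against overfitting to the noise.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 80)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)   # smooth signal plus noise

# s trades off fidelity against smoothness; in practice it can be chosen by
# (generalized) cross-validation, the approach Wahba championed.
spline = UnivariateSpline(x, y, s=len(x) * 0.3 ** 2)
x_grid = np.linspace(0, 10, 500)
print(spline(x_grid)[:5])   # smoothed curve evaluated on a fine grid
```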
Bootstrap Methods: Another Look at the Jackknife by B. Efron (1979):
Bootstrapping is a method for performing statistical inference with minimal assumptions. Traditionally, inference could not be carried out without strong assumptions and mathematical analysis. This paper framed bootstrapping as a simple procedure for resampling the data. Among the statistical methods of the past 50 years, it became widely useful because the explosion in computing power allowed simulation to take the place of mathematical analysis.
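A minimal sketch of the resampling idea, with purely illustrative simulated data: resample with replacement, recompute the statistic of interest each time, and read off an interval from the resulting distribution.

```python
# Minimal sketch of the bootstrap: resample the data with replacement many times and
# use the spread of the recomputed statistic as an estimate of its sampling variability.
import numpy as np

rng = np.random.default_rng(2)
data = rng.exponential(scale=2.0, size=50)          # a sample from a skewed distribution

boot_medians = []
for _ in range(2000):
    resample = rng.choice(data, size=len(data), replace=True)
    boot_medians.append(np.median(resample))

low, high = np.percentile(boot_medians, [2.5, 97.5])
print(f"sample median = {np.median(data):.2f}, 95% bootstrap interval = ({low:.2f}, {high:.2f})")
```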
Sampling-Based Approaches to Calculating Marginal Densities by Alan E. Gelfand and Adrian F. M. Smith (1990):
One way that fast computing has revolutionized statistics and machine learning is through Bayesian models. Modern statistical modeling offers great flexibility for solving complex problems, but users need computational tools capable of fitting these models. In this influential paper, Gelfand and Smith did not develop any new tools; instead, they showed how Gibbs sampling can be used to fit a large class of statistical models.
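The sketch below illustrates the flavor of Gibbs sampling on a toy problem (a bivariate normal with known correlation, not one of the paper's examples): each variable is repeatedly redrawn from its distribution conditional on the current value of the other.

```python
# Minimal sketch of Gibbs sampling: draw from a bivariate normal with correlation rho
# by alternating draws from the two full conditional distributions.
import numpy as np

rng = np.random.default_rng(3)
rho = 0.8
x, y = 0.0, 0.0
samples = []
for _ in range(5000):
    # each conditional of a standard bivariate normal is N(rho * other, 1 - rho^2)
    x = rng.normal(rho * y, np.sqrt(1 - rho ** 2))
    y = rng.normal(rho * x, np.sqrt(1 - rho ** 2))
    samples.append((x, y))

samples = np.array(samples[500:])          # discard burn-in draws
print("empirical correlation:", np.corrcoef(samples.T)[0, 1])
```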
Identification and Estimation of Local Average Treatment Effects by Guido W. Imbens and Joshua D. Angrist (1994):
Causal inference is central to many artificial intelligence problems. Causal inference methods have evolved along with the rest of statistics and machine learning, but they carry the added challenge of asking questions about quantities that cannot be directly measured. Imbens and Angrist are influential economists who worked out what can be estimated when causal effects vary, and their ideas have formed the basis of much of the later development in this domain.
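A minimal sketch of the intuition, with simulated data and a hypothetical treatment effect of 2.0: the Wald/instrumental-variable ratio recovers the effect for the subpopulation whose treatment status is actually moved by the instrument (the "compliers").

```python
# Minimal sketch of the Wald/IV estimator behind local average treatment effects:
# with a binary instrument Z that shifts treatment D, the ratio of the instrument's
# effect on the outcome Y to its effect on D estimates the effect for compliers.
import numpy as np

rng = np.random.default_rng(4)
n = 20000
z = rng.integers(0, 2, size=n)                          # randomized instrument (e.g. encouragement)
complier = rng.random(n) < 0.5                          # units whose treatment responds to Z
d = np.where(complier, z, rng.integers(0, 2, size=n))   # treatment take-up
y = 2.0 * d + rng.normal(size=n)                        # outcome with treatment effect 2.0

late = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())
print(f"estimated LATE = {late:.2f}")                   # should be close to 2.0
```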
Regression Shrinkage and Selection via the Lasso by Robert Tibshirani (1996):
In regression, the real challenge lies in including many inputs and their interactions, which makes the estimation problem statistically unstable. In this paper, Tibshirani introduced the lasso, a computationally efficient regularization approach that is now widely used in more complex models.
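A minimal sketch using scikit-learn's Lasso (not Tibshirani's original implementation), on simulated data where only a few of many inputs matter: the L1 penalty drives most coefficients exactly to zero, keeping the fit stable.

```python
# Minimal sketch of lasso regression: the L1 penalty shrinks many coefficients
# exactly to zero, which stabilizes estimation when there are many inputs.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, p = 100, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                      # only 3 of 50 inputs actually matter
y = X @ beta + rng.normal(size=n)

model = Lasso(alpha=0.1).fit(X, y)
print("non-zero coefficients:", np.sum(model.coef_ != 0))
```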
The Grammar of Graphics by Leland Wilkinson (1999):
Leland Wilkinson has worked on several influential commercial software projects, including SPSS and Tableau. In this book, Wilkinson lays out a framework for statistical graphics that goes beyond the traditional focus on pie charts and histograms toward an understanding of how data and visualization relate.
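As a small illustration of the framework, the sketch below uses plotnine, a Python library modeled on the grammar of graphics (ggplot2 in R is the best-known implementation); the toy data frame and output file name are purely illustrative.

```python
# Minimal sketch of grammar-of-graphics thinking: a plot is built by mapping data
# variables to aesthetics and layering geometries, rather than picking a chart type.
import pandas as pd
from plotnine import ggplot, aes, geom_point, facet_wrap

df = pd.DataFrame({
    "height": [1.6, 1.7, 1.8, 1.75, 1.65, 1.9],
    "weight": [60, 72, 80, 77, 62, 95],
    "group":  ["a", "a", "a", "b", "b", "b"],
})

plot = (
    ggplot(df, aes(x="height", y="weight", color="group"))  # data + aesthetic mappings
    + geom_point()                                           # geometric layer
    + facet_wrap("~group")                                   # split into small multiples
)
plot.save("example.png")   # or simply display `plot` in a notebook
```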
Generative Adversarial Nets by Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio (2014):
One of machine learning's notable achievements in recent years is real-time decision-making through feedback from prediction and inference. This paper introduces generative adversarial nets (GANs), a conceptual advance that allows such reinforcement learning problems to be solved automatically.
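A toy sketch of the adversarial idea in PyTorch (not the paper's architecture or data): a generator is trained to produce samples resembling draws from N(4, 1), while a discriminator is trained to tell real from generated samples; network sizes, learning rates, and step counts here are arbitrary illustrative choices.

```python
# Minimal sketch of a generative adversarial net (GAN) on 1-D data.
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))   # noise -> fake sample
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))   # sample -> real/fake logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) + 4.0                  # real data: N(4, 1)
    fake = G(torch.randn(64, 1))                     # generated data

    # discriminator step: label real as 1, fake as 0
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator step: try to make the discriminator label fakes as real
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print("mean of generated samples:", G(torch.randn(1000, 1)).mean().item())  # should drift toward 4
```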
Deep Learning by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton (2015):
Deep learning is a class of artificial neural network models that can be used to make flexible non-linear predictions using a large number of features. This paper explores the uses of deep learning in statistics and machine learning.
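As a small illustration of this flexibility (using scikit-learn's MLPRegressor purely for convenience, not the paper's models), the sketch below fits a two-hidden-layer network to a simulated non-linear function of many input features.

```python
# Minimal sketch of deep learning as flexible non-linear prediction with many features.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
X = rng.normal(size=(1000, 20))                                         # many input features
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=1000)   # non-linear target

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(X[:800], y[:800])
print("held-out R^2:", round(net.score(X[800:], y[800:]), 2))
```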
Source: analyticsinsight.net