These machine learning projects will bring your resume to the top.
The AI and machine learning industries are booming like never before. In 2021, the increase in AI usage across businesses was projected to create US$2.9 trillion in business value. AI has automated many industries across the globe and changed the way they operate. Most large companies incorporate AI to maximize productivity in their workflows, and industries like marketing and healthcare have undergone a paradigm shift due to the adoption of AI.
Sales Forecasting
Time-series forecasting is a machine learning technique that is widely used in industry. Using past data to predict future sales has a large number of business use cases. The Kaggle Demand Forecasting dataset can be used to practice this project. This dataset has 5 years of sales data, and you will need to predict sales for the next three months. There are ten different stores listed in the dataset, and there are 50 items at each store. To predict sales, you can try out various methods: ARIMA, vector autoregression, or deep learning. One method you can use for this project is to compute the change in sales from each month to the next, then build the model on these month-over-month differences rather than on the raw sales figures. Taking into account factors like holidays and seasonality can improve the performance of your machine learning model.
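The differencing approach can be sketched in a few lines of Python. This is a minimal illustration, not a competition-ready model: the function name `forecast_next_month`, the choice of three lagged differences, and the use of linear regression are all assumptions made for the example, and the sales figures are synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def forecast_next_month(monthly_sales: pd.Series, n_lags: int = 3) -> float:
    """Predict next month's sales from lagged month-over-month differences."""
    # Difference between the present month and the previous month.
    diffs = monthly_sales.diff().dropna()

    # Each row holds the previous n_lags differences; the target is the current one.
    feature_cols = [f"lag_{k}" for k in range(1, n_lags + 1)]
    frame = pd.DataFrame({f"lag_{k}": diffs.shift(k) for k in range(1, n_lags + 1)})
    frame["target"] = diffs
    frame = frame.dropna()

    model = LinearRegression().fit(frame[feature_cols], frame["target"])

    # Most recent differences, ordered lag_1 (newest) to lag_n (oldest).
    latest = pd.DataFrame([diffs.iloc[::-1].iloc[:n_lags].to_numpy()], columns=feature_cols)
    predicted_diff = model.predict(latest)[0]

    # Add the predicted change back onto the last observed value.
    return float(monthly_sales.iloc[-1] + predicted_diff)


# Synthetic monthly totals that rise by 10 units per month (illustrative data only).
months = pd.date_range("2017-01-01", periods=12, freq="MS")
sales = pd.Series(100.0 + 10.0 * np.arange(12), index=months)
forecast = forecast_next_month(sales)
```

On real data you would aggregate the daily store and item rows into monthly totals first, and you could add holiday or month-of-year columns as extra features.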
Customer Service Chatbot
A customer service chatbot uses AI and machine learning techniques to reply to customers, taking the role of a human representative. A chatbot should be able to answer simple questions to satisfy customer needs.
There are presently three kinds of chatbots that you can build:
- Rule-Based Chatbots — These chatbots aren’t intelligent. They are fed a set of pre-defined rules and only reply to users based on these rules. Some chatbots are also provided with a pre-defined set of questions and answers and cannot answer queries that fall outside this domain.
- Independent Chatbots — Independent chatbots utilize machine learning to process and analyze a user’s request and provide responses accordingly.
- NLP Chatbots — These chatbots can understand patterns in words and distinguish between different word combinations. They are the most advanced of all three chatbot types, as they can come up with what to say next based on the word patterns they were trained on.
An NLP chatbot is an interesting machine learning project idea. You will need an existing corpus of words to train your model on, and you can easily find Python libraries to help with this. You can also create a pre-defined dictionary of question-and-answer pairs to train your model on.
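As a starting point, here is a small sketch of a chatbot built on a dictionary of question-and-answer pairs. It matches the user's message to the most similar known question using TF-IDF and cosine similarity, which is a simple retrieval approach rather than a model that generates new sentences. The question-and-answer pairs, the `reply` function, and the similarity threshold are all made up for this example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical question-and-answer pairs; replace with your own dictionary.
QA_PAIRS = {
    "What are your opening hours?": "We are open from 9am to 5pm, Monday to Friday.",
    "How can I reset my password?": "Use the 'Forgot password' link on the login page.",
    "Where is my order?": "You can track your order from the 'My orders' page.",
}

questions = list(QA_PAIRS)
vectorizer = TfidfVectorizer()
question_vectors = vectorizer.fit_transform(questions)


def reply(message: str, min_similarity: float = 0.2) -> str:
    """Answer with the reply attached to the most similar known question."""
    scores = cosine_similarity(vectorizer.transform([message]), question_vectors)[0]
    best = scores.argmax()
    if scores[best] < min_similarity:
        # Nothing in the dictionary is close enough to the user's message.
        return "Sorry, I don't know the answer to that yet."
    return QA_PAIRS[questions[best]]


answer = reply("I forgot my password, how do I reset it?")
fallback = reply("zzz qqq")
```

A fallback reply for low-similarity messages keeps the bot from giving confident answers to questions it was never given.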
Wildlife Object Detection System
If you live in an area with frequent wild-animal sightings, it is helpful to implement an object detection system to identify their presence in your area. Follow these steps to build a system like this:
- Install cameras in the area you want to monitor.
- Download all video footage and save it.
- Create a Python application to analyze incoming images and identify wild animals.
Microsoft has built an image recognition API using data collected from wildlife cameras and released an open-source pre-trained model for this purpose called MegaDetector. You can use this pre-trained model in your Python application to identify wild animals in the images you collect. It is one of the most exciting ML projects mentioned so far and is relatively simple to implement because a pre-trained model is already available.
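Once the detector has run, your application still has to decide which images to flag. The sketch below does only that filtering step. It assumes the detector's results are available as a dictionary with an "images" list, where each image has a "file" name and a list of "detections" carrying a "category" and a confidence score "conf", and that category "1" means "animal". This layout is an assumption for the example, so check it against the MegaDetector documentation before relying on it. The function name `images_with_animals` and the sample data are hypothetical.

```python
ANIMAL_CATEGORY = "1"  # assumed category id for "animal"; verify against the model's docs


def images_with_animals(results: dict, min_conf: float = 0.5) -> list:
    """Return the file names of images with at least one confident animal detection."""
    flagged = []
    for image in results.get("images", []):
        detections = image.get("detections") or []
        if any(d["category"] == ANIMAL_CATEGORY and d["conf"] >= min_conf for d in detections):
            flagged.append(image["file"])
    return flagged


# Hand-written sample in the assumed layout (not real detector output).
sample_results = {
    "images": [
        {"file": "cam1/0001.jpg", "detections": [{"category": "1", "conf": 0.92}]},
        {"file": "cam1/0002.jpg", "detections": [{"category": "2", "conf": 0.88}]},
        {"file": "cam1/0003.jpg", "detections": [{"category": "1", "conf": 0.12}]},
        {"file": "cam1/0004.jpg", "detections": []},
    ]
}

flagged = images_with_animals(sample_results)
flagged_lenient = images_with_animals(sample_results, min_conf=0.1)
```

Raising `min_conf` reduces false alarms at the cost of missing animals that are small or partly hidden.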
Spotify Music Recommender System
Spotify uses AI to recommend music to its users. You can try building a recommender system based on publicly available data on Spotify. Spotify has an API that you can use to retrieve audio data, including features like the year of release, key, popularity, and artist. To access this API in Python, you can use a library called Spotipy. You can also use the Spotify dataset on Kaggle that has around 600K rows. Using these datasets, you can suggest the best alternative to each user’s favorite musician. You can also come up with song recommendations based on the content and genre preferred by each user. This recommender system can be built using K-Means clustering, which groups similar data points together. You can then recommend to the end-user the songs in the same cluster that lie closest to the ones they already like. Once you have built the recommender system, you can also turn it into a simple Python app and deploy it. You can get users to enter their favorite songs on Spotify, then display the recommendations with the highest similarity to the songs they enjoyed.
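Here is a rough sketch of the clustering idea using scikit-learn. It groups songs with K-Means and then recommends the songs from the same cluster that are closest to the one the user picked. The two feature columns, the number of clusters, and the `recommend` function are assumptions for the example, and the six songs are invented numbers rather than real Spotify data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def recommend(features: np.ndarray, song_index: int, n_clusters: int = 2, top_n: int = 2) -> list:
    """Return indices of the closest songs in the same cluster as song_index."""
    # Scale features so that no single column dominates the distance.
    scaled = StandardScaler().fit_transform(features)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(scaled)

    # Candidates: every other song that landed in the same cluster.
    same_cluster = np.flatnonzero(labels == labels[song_index])
    same_cluster = same_cluster[same_cluster != song_index]

    # Rank candidates by distance to the chosen song, nearest first.
    distances = np.linalg.norm(scaled[same_cluster] - scaled[song_index], axis=1)
    return same_cluster[np.argsort(distances)][:top_n].tolist()


# Invented values for two hypothetical features: danceability and tempo.
songs = np.array([
    [0.90, 128.0],
    [0.85, 125.0],
    [0.80, 120.0],
    [0.20, 70.0],
    [0.25, 75.0],
    [0.30, 80.0],
])

recommendations = recommend(songs, song_index=0)
```

With real data you would use more features and more clusters, and map the returned indices back to track names before showing them to the user.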
Market Basket Analysis
Market Basket Analysis can help retailers identify hidden correlations between items that are frequently bought together. Stores can then position these items so that shoppers find them more easily. You can use the Market Basket Optimization dataset on Kaggle to build and train your model. The algorithm most commonly used to perform Market Basket Analysis is the Apriori algorithm.
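To see how Apriori works before reaching for a library, you can try a compact pure-Python version like the sketch below. It finds frequent itemsets level by level: it joins frequent itemsets of size k-1 into candidates of size k, prunes any candidate that has an infrequent subset, and keeps the ones that meet the minimum support. The function name, the support threshold, and the four shopping baskets are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support=0.5):
    """Return {itemset: support} for every itemset that meets min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Made-up shopping baskets for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

frequent = apriori(baskets, min_support=0.5)

# Confidence of the rule "butter -> bread": support(both) / support(butter).
confidence = frequent[frozenset({"bread", "butter"})] / frequent[frozenset({"butter"})]
```

Here every basket that contains butter also contains bread, so the rule "butter -> bread" has a confidence of 1.0, which is the kind of pattern a store could use when placing items.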
NYC Taxi Trip Duration
The NYC Taxi Trip Duration dataset on Kaggle has variables that include the start and end coordinates of a taxi trip, the time, and the number of passengers. The goal of this ML project is to predict trip duration from these variables, which makes it a regression problem. This project isn’t as straightforward as it seems. Variables like time and coordinates need to be pre-processed and converted into a format the model can use. The dataset also has some outliers that make prediction more complex, so you will need to handle them during pre-processing and feature engineering. The evaluation metric for this NYC Taxi Trip Kaggle competition is RMSLE, or Root Mean Squared Log Error. The top submission on Kaggle received an RMSLE score of 0.29, and Kaggle’s baseline model has an RMSLE of 0.89. You can use any regression algorithm to solve this Kaggle project, but the highest-performing competitors in this challenge used either gradient boosting models or deep learning techniques.
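Two small helpers are useful early in this project: one to compute RMSLE so you can score your own predictions, and one to turn start and end coordinates into a distance feature. The sketch below uses the standard RMSLE definition (the root of the mean squared difference between log(1 + prediction) and log(1 + actual)) and the haversine great-circle distance. The function names are chosen for this example, and using haversine distance as a feature is one common option rather than a requirement of the competition.

```python
import numpy as np


def rmsle(y_true, y_pred) -> float:
    """Root Mean Squared Log Error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))  # 6371 km: mean Earth radius


perfect_score = rmsle([100, 200], [100, 200])
one_degree_north = haversine_km(0.0, 0.0, 1.0, 0.0)
```

Because RMSLE works on logarithms, it penalizes relative errors rather than absolute ones, so many competitors train their model on log(1 + duration) directly.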
Real-Time Spam Detection
In this project, you can use machine learning techniques to distinguish between spam (illegitimate) and ham (legitimate) messages. To achieve this, you can use the Kaggle SMS spam collection dataset. This dataset contains a set of approximately 5K messages that have been labeled as spam or ham. To build the machine learning model, you first need to pre-process the text messages present in Kaggle’s SMS spam collection dataset. Then, convert these messages into a bag of words so that they can easily be passed into your classification model for prediction.
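The bag-of-words step and the classifier can be chained together with scikit-learn, as in the sketch below. The choice of a Naive Bayes classifier is an assumption for this example, since any classification model would fit here, and the six messages are made up rather than taken from the Kaggle dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up messages for illustration; the real project would load the Kaggle dataset.
messages = [
    "win free prize now",
    "free cash claim now",
    "claim your free prize",
    "are we meeting for lunch",
    "see you at home tonight",
    "call me when you arrive",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# CountVectorizer builds the bag of words; MultinomialNB classifies the word counts.
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(messages, labels)

predictions = spam_filter.predict(["free prize", "lunch at home"]).tolist()
```

On the full dataset you would hold out a test split and look at precision and recall for the spam class, since only a minority of the messages are spam.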
Myers-Briggs Personality Prediction App
You can create an app to predict a user’s personality type based on what they say. The Myers-Briggs type indicator categorizes individuals into 16 different personality types. It is one of the most popular personality tests in the world. If you try to find your personality type on the Internet, you will find many online quizzes. After answering around 20–30 questions, you will be assigned to a personality type. However, in this project, you can use machine learning to predict anyone’s personality type just based on one sentence.
Here are the steps you can take to achieve this:
- Build a multi-class classification model and train it on the Myers-Briggs dataset on Kaggle. This involves data pre-processing (removing stop-words and unnecessary characters) and some feature engineering. You can use a shallow learning model like logistic regression or a deep learning model like an LSTM for this purpose.
- Create an application that allows users to enter any sentence of their choice.
- Save your machine learning model weights and integrate the model with your app. After the end-user enters a sentence, have the model make a prediction and display the predicted personality type on the screen.
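The steps above can be sketched as follows, using logistic regression as the shallow model. The four example posts and their labels are invented for illustration and far too few to learn anything meaningful, and the `predict_type` function is a hypothetical stand-in for whatever your app calls. Pickle is used here as one simple way to save a scikit-learn model.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented posts and labels for illustration; train on the Kaggle dataset in practice.
posts = [
    "I enjoy planning projects alone and analysing systems",
    "I love meeting new people at parties",
    "Quiet evenings reading about strategy are ideal",
    "Spontaneous road trips with friends give me energy",
]
types = ["INTJ", "ENFP", "INTJ", "ENFP"]

# Step 1: pre-process (stop-word removal), extract features, and train the classifier.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(posts, types)

# Step 3: save the trained model, then load it again inside the app.
saved_model = pickle.dumps(model)
restored = pickle.loads(saved_model)


def predict_type(sentence: str) -> str:
    """Step 2 and 3: take the user's sentence and return the predicted type."""
    return restored.predict([sentence])[0]


prediction = predict_type("I like reading about strategy on quiet evenings")
```

In a real app you would write the pickled model to a file during training and load it once when the app starts, rather than retraining on every request.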
Mood Recognition System + Recommender System
You can build an app that recognizes a user’s mood from a live webcam feed and suggests a movie based on the user’s expression.
To build this, you can take the following steps:
- Create an app that can take in a live video feed.
- Use a Python face recognition library to detect faces in the video feed and classify the emotion each face shows.
- After classifying these emotions into various categories, start building the recommender system. This can be a set of hardcoded values for each emotion, which means you don’t need to involve machine learning for the recommendations.
- Once you’re done building the app, you can build its interface with a framework like Dash and deploy it on Heroku or another web server.
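The hardcoded recommender from the third step can be as simple as a dictionary lookup. In the sketch below, the emotion labels, the placeholder titles, and the `suggest_movie` function are all made up for illustration. Use whatever labels your emotion classifier actually outputs.

```python
import random

# Hypothetical emotion labels mapped to placeholder titles; fill in real movies.
MOVIES_BY_EMOTION = {
    "happy": ["Placeholder comedy", "Placeholder musical"],
    "sad": ["Placeholder feel-good drama", "Placeholder animated film"],
    "angry": ["Placeholder nature documentary"],
}
DEFAULT_MOVIES = ["Placeholder crowd-pleaser"]


def suggest_movie(emotion, rng=None):
    """Pick a movie for the detected emotion, falling back to a default list."""
    rng = rng or random.Random()
    options = MOVIES_BY_EMOTION.get(emotion.lower(), DEFAULT_MOVIES)
    return rng.choice(options)


suggestion = suggest_movie("Sad")
fallback_suggestion = suggest_movie("confused")
```

The default list matters because an emotion classifier may return labels you did not plan for, and the app should still show something.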
YouTube Comment Sentiment Analysis
In this project, you can create a dashboard analyzing the overall sentiment toward popular YouTubers. Over 2 billion users watch YouTube videos at least once a month. Popular YouTubers garner hundreds of billions of views with their content. However, many of these influencers have come under fire due to controversies in the past, and public perception is constantly changing. You can build a sentiment analysis model and create a dashboard to visualize how sentiment around these creators changes over time.
To build this, you can take the following steps:
- Scrape the comments on videos by the YouTubers you want to analyze.
- Use a pre-trained sentiment analysis model to make predictions on each comment.
- Visualize the model’s predictions on a dashboard. You can even create a dashboard app using libraries like Dash (Python) or Shiny (R).
- You can make the dashboard interactive by allowing users to filter sentiment by time frame, name of YouTuber, and video genre.
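Before building the dashboard itself, you need the numbers it will plot. The sketch below assumes the scraping and prediction steps have already produced a table with one row per comment. The column names (`youtuber`, `published_at`, `sentiment`), the function `monthly_positive_share`, and the five sample rows are hypothetical. It computes the share of positive comments per YouTuber per month with pandas, with an optional filter by YouTuber.

```python
import pandas as pd

# Hypothetical output of the scraping and prediction steps (invented rows).
comments = pd.DataFrame({
    "youtuber": ["A", "A", "A", "B", "B"],
    "published_at": pd.to_datetime(
        ["2021-01-05", "2021-01-20", "2021-02-03", "2021-01-11", "2021-02-15"]
    ),
    "sentiment": ["positive", "negative", "positive", "negative", "negative"],
})


def monthly_positive_share(df: pd.DataFrame, youtuber=None) -> pd.Series:
    """Share of positive comments per YouTuber and month, optionally filtered."""
    if youtuber is not None:
        df = df[df["youtuber"] == youtuber]
    month = df["published_at"].dt.strftime("%Y-%m").rename("month")
    is_positive = df["sentiment"].eq("positive")
    # The mean of a boolean column is the fraction of True values.
    return is_positive.groupby([df["youtuber"], month]).mean().rename("positive_share")


share = monthly_positive_share(comments)
share_b = monthly_positive_share(comments, youtuber="B")
```

The resulting series can be passed straight to a plotting library, and the `youtuber` argument corresponds to the name filter on the dashboard.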
Source: analyticsinsight.net