Projects

Music recommendation engine using ALS based Matrix factorization

According to a new report released by Nielsen Music, on an average, Americans now spend just slightly more than 32 hours a week listening to music. This is a staggering 36% increase in 2 years. With such a tremendous growth in the music industry, it becomes crucial to deliver personalized music recommendation to the listeners. This piqued our curiosity to understand the process that goes behind the music recommendation engine and led us to work on this project. We achieved a ~90% AUC for ~40,000 users and ~100,000 artists.

Sketch recognition using Mobilenet

This is a computer vision project with the objective of classifying hand drawn sketches. We have deployed multiple deep learning CNN architectures such as simple CNN, Resnet and Mobilenet and achieved 92.11% precision (MAP@3) on all the 350 classes. This result also helped us secure a rank of 268 among 1316 teams in the kaggle competition.

Visualizing different factors that affect Austin bike sharing and recommendations for bike rebalancing

Traffic is always a painful problem for both authorities and commuters. Recently bike sharing has evolved as a viable alternative to both reduce traffic and pollution.In this project, we have performed exploratory data analysis to understand the different aspects that affect bike sharing and recommendations for bike rebalancing

Cable-cord cutter sentiment analysis using Reddit data

Understanding customer sentiments about their products or services is key to any business. In this project, we scraped data from Reddit and performed Named entity recognition,topic modelling and sentiment analysis on the comments to understand public views about moving from cable channels to streaming services

Salary prediction based on job description using XGboost

During my job search, i often wondered how companies such as Glassdoor,LinkedIn and others are able to identify the pay scale of a particular job. Applying the text processing and predictive analytics skills that we learnt from our course, we have achieved the objective of predicting high and low paying jobs with ~80% accuracy which is an increase of 30% from baseline accuracy(50%)

Association between grocery items using Apriori algorithm and Gephi visualization

Association rule mining is a very interesting and important topic in retail analytics. In this mini project, i implemented Apriori algorithm in R to discover associations between various products and visualized the results through compelling graph visualizations using Gephi visualization

Author attribution using TF-IDF, PCA and Randomforest

Author attribution is one of the famous NLP technique to identify the author of an unidentified article or to determine the genuine author of a publication when there are multiple claims. In this mini project, I have tried to attribute articles to the respective authors and used multiple classification techniques to come up with the most suitable model for the analysis.

Car brand association analysis using web scraped data

Understanding how customers are associating a brand with their own sentiments is crucial information to growth in industries. In this mini project, we have found associations between luxury cars discussed in Edmund’s forum and generated insights regarding what attributes are customers talking about when it comes to these brands

Test time prediction for Mercedes Benz using XGboost

Predicting the test time for Mercedes benz cars after manufacturing was the main objective of this project. Using XGBoost, we achieved an R2 of ~55% for the prediction model

Understanding the market potential from customer tweets using k-means clustering

Understanding the customer base is important for any organization. In this mini project, a nutrition company wants to identify its customer segments based on their Twitter feed. I have identified the customer segments by performing cluster analysis on the data.Some of the segments are fitness enthusiasts, highly educated adults and Family people

Blog Posts

More Posts

An intuitive way to understand Bayesian personalized ranking optimization criterion using Matrix factorization

CONTINUE READING

A summarized view of the challenges in implementing recommender systems from an industry point of view

CONTINUE READING

Brief overview of our Journey with sketch recognition| Achieved 92.11% precision (MAP@3) on all the 350 classes and ranked 268 among 1316 teams that participated in the Kaggle competition.

CONTINUE READING

A user friendly approach to access Reddit API and Google Bigquery

CONTINUE READING

In this article, i will be discussing about the key features of different boosting models

CONTINUE READING