yilin_wu

Yilin Wu's Data Science/Analytics Portfolio

This project is maintained by foolwuilin

Yilin’s Data Science/Analytics Projects

Kaggle | LinkedIn | TikTok | Instagram | Facebook | YouTube | Website

Yilin Wu - Holder of two Master’s degrees: an MBA in Business Analytics and an M.Eng. in Media. Two years of experience as a project manager driving product development for consumer electronics and DJ equipment, plus six years as a broadcast engineer for TV production, news, live concerts, etc. Certified in PMP (Project Management), NPDP (Product Management), Data Analytics, WBSA (Strategic Planning), Six Sigma Green Belt, DJing, etc.

Project 1: Time Series Prediction for The Rainfall in Kerala

Agriculture is important to Kerala, notably coconut, tea, coffee, cashew, and spices. Accurate rainfall prediction would therefore help the agricultural sector plan better and possibly increase output. The key methods of this analysis are as follows.

  1. Building five regression models: Linear Regression, Ridge, Lasso, XGBRegressor, and ElasticNet.
  2. Evaluating accuracy with MAE and RMSE, and examining the correlation between features to assess the risk of over-fitting.
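As a rough illustration of the two accuracy metrics named above, the sketch below computes MAE and RMSE in plain Python. The rainfall numbers are made up for the example and are not taken from the project's dataset; the function names are likewise illustrative.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared difference; large misses weigh more."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical monthly rainfall values in mm (assumption, illustration only).
actual = [100.0, 250.0, 400.0, 150.0]
predicted = [110.0, 240.0, 430.0, 150.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}")
```

RMSE is never smaller than MAE on the same errors, so a large gap between the two hints that a few predictions miss by a wide margin.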

Project 2: Customer Behavior Analysis (K-means and Hierarchical Clustering)

A Russian alcohol company ran a successful wine promotion in Saint Petersburg. This analysis suggests other locations where customers’ buying behavior is similar to that in Saint Petersburg, as targets for further marketing campaigns to maximize profit. The analysis methods are as follows.

  1. Choosing the optimal number of clusters for K-means clustering with the elbow method.
  2. Targeting cities with a quantile method, K-means clustering, and hierarchical clustering.
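The elbow method in step 1 can be sketched as follows: run K-means for several values of k, record the within-cluster sum of squares (inertia), and look for the k after which the improvement flattens. This is a minimal standard-library sketch with invented 2-D points and a deterministic farthest-point seeding chosen for reproducibility; it is not the project's actual code or data.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_point_init(points, k):
    """Deterministic seeding: start from the first point, then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids

def kmeans_inertia(points, k, iterations=20):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_point_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)

# Hypothetical points forming three well-separated groups (illustration only).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
          (9.0, 1.0), (9.2, 1.1), (8.9, 0.8)]

inertia_by_k = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, inertia in inertia_by_k.items():
    print(k, round(inertia, 3))
```

On this toy data the inertia drops sharply up to k = 3 and barely changes afterwards, which is the "elbow" one would read off the plot.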

Project 3: Sentiment Analysis, Likes & Retweets Prediction (Data Scraping)

Access the data scraping code here

Sentiment analysis for inMusic Brands, reviewing tweets about the company’s brands and its major competitor, Pioneer DJ. inMusic owns four DJ brands: Numark, Rane DJ, Denon DJ, and Stanton. Messages were scraped using the Twitter API from July 16 to July 25, 2021. The analysis methods are as follows.

  1. Crawling data with the Twitter API and then extracting useful information from the JSON files.
  2. Applying NLP to get WordCloud keywords, using Pipeline and GridSearchCV to optimize classification models, and then evaluating the predictions with a confusion matrix.
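To make the final evaluation step concrete, here is a small sketch of how a confusion matrix is tallied from actual and predicted sentiment labels. It uses only the standard library rather than the scikit-learn tooling mentioned above, and the labels are invented for illustration, not drawn from the scraped tweets.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

labels = ["negative", "positive"]

# Hypothetical sentiment labels (assumption, illustration only).
actual = ["positive", "negative", "positive", "positive", "negative", "negative"]
predicted = ["positive", "negative", "negative", "positive", "positive", "negative"]

matrix = confusion_matrix(actual, predicted, labels)

# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)

for label, row in zip(labels, matrix):
    print(f"{label:>8}: {row}")
print(f"accuracy={accuracy:.2f}")
```

Off-diagonal cells show which class the model confuses with which, something a single accuracy figure hides.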

Project 4: Market Basket Analysis (Support, Confidence, and Lift)

A market basket analysis helps in understanding customers’ buying behavior. By knowing which items customers tend to buy together, retail managers can plan better shelf displays. This analysis answers the following questions.

  1. What are the consequents of the popular items?
  2. What are the most important items that should always be in the store?
  3. How do the items connect to each other?
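The three measures in the project title can be sketched directly from their definitions: support is the share of baskets containing an itemset, confidence is the support of antecedent and consequent together divided by the support of the antecedent, and lift is confidence divided by the support of the consequent. The baskets below are hypothetical and the function names are illustrative.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Values above 1 suggest the items are bought together more often
    than they would be if purchases were independent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)

# Hypothetical baskets (assumption, illustration only).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

rule_support = support(transactions, {"bread", "butter"})
rule_confidence = confidence(transactions, {"bread"}, {"butter"})
rule_lift = lift(transactions, {"bread"}, {"butter"})
print(rule_support, rule_confidence, rule_lift)
```

For the rule bread → butter, butter is the consequent; a lift above 1 here would support placing the two items near each other.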

Project 5: Market RFM Analysis (Recency, Frequency, and Monetary)

A good RFM analysis helps businesses target the right customers in order to maximize profit. The given dataset contains sales records with 64,682 transactions and 22,625 customer IDs in 2016. The columns are Transaction, Customer ID, Transaction ID, Category, SKU, Quantity, and Sales Amount. This analysis answers the following questions.

  1. What are the customer retention rates?
  2. Who are the most valuable customers according to K-means clustering?
  3. How do the outcomes differ between K-means clustering and the linear quantile method?
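Before any clustering or quantile scoring, an RFM analysis typically reduces the transaction log to three numbers per customer: recency (days since the last purchase), frequency (number of transactions), and monetary value (total sales amount). The sketch below shows that aggregation on a few invented records; the tuple layout and the snapshot date are assumptions for the example, not the project's actual schema.

```python
from datetime import date

# Hypothetical records: (customer_id, transaction_date, sales_amount).
transactions = [
    ("C1", date(2016, 1, 5), 20.0),
    ("C1", date(2016, 11, 20), 35.0),
    ("C2", date(2016, 3, 14), 120.0),
    ("C3", date(2016, 12, 28), 15.0),
    ("C3", date(2016, 12, 30), 10.0),
    ("C3", date(2016, 6, 1), 5.0),
]

# Assumed reference date from which recency is measured.
snapshot = date(2017, 1, 1)

def rfm_table(records, snapshot_date):
    """Aggregate transactions into recency, frequency, and monetary value."""
    latest = {}
    for customer, when, amount in records:
        last, freq, money = latest.get(customer, (None, 0, 0.0))
        if last is None or when > last:
            last = when
        latest[customer] = (last, freq + 1, money + amount)
    return {
        customer: {
            "recency": (snapshot_date - last).days,
            "frequency": freq,
            "monetary": money,
        }
        for customer, (last, freq, money) in latest.items()
    }

rfm = rfm_table(transactions, snapshot)
for customer, scores in rfm.items():
    print(customer, scores)
```

The resulting per-customer table is what either K-means clustering or a quantile-based scoring would then take as input.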

Executive Board (Flood-It by Google Data Studio and Google Analytics)