The Introduction

Bridge the Gap

Drive business value through data analysis.

3+ years of experience in analytics and BI, using SQL and Tableau to deliver actionable business insights.

2+ years of experience with Python and R for ETL (extract, transform, load), statistical analysis and ML.

Design idea credited to @Adham Dannaway

My Skills

Database Management

#MySQL, #PyMySQL
#PySpark
#Hive
#MongoDB, #MongoClient

Programming

#ETL - Extract, Transform, Load:
(Python:numpy,pandas; R:dplyr)
#Web Scraping:
(Python:BeautifulSoup; R:rvest; Java:jsoup)
#Regular Expressions (Python:re)

Visualization

#Python (Matplotlib, Seaborn)
#R (ggplot2)
#Tableau Desktop Certified Associate

Cloud Computing

#AWS Certified Cloud Practitioner (EC2, S3, Lambda)
#GCP (AutoML)
#Salesforce (Einstein)

Machine Learning

#Regression
#Classification:
(LogisticRegression, SVM, RandomForest)
#Clustering
#NLP:
(Sentiment Analysis, Topic Modeling, Word2Vec)

Growth Analytics

#A/B Testing
#DOE (Design of Experiments)
#Retention Analysis

My Projects

ETL with Regex on mock data

A holistic summary of the most common syntax and functions of the "re" module in Python, with hands-on regex practice for ETL (Extract, Transform, Load).
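To give a flavor of regex-driven ETL, here is a minimal sketch using only the "re" module. The mock rows, the field names and the helper functions extract, transform and load are all made up for illustration and are not taken from the project itself.

```python
import re

# Hypothetical mock records; the "key=value;" layout is an assumption for illustration.
RAW_ROWS = [
    "id=001; name=Alice Smith ; phone=(555) 123-4567; joined=2020-01-15",
    "id=002; name=bob  JONES; phone=555.987.6543; joined=2019-11-03",
    "id=003; name=Carol Lee; phone=n/a; joined=2021-07-30",
]

FIELD_RE = re.compile(r"(\w+)=([^;]*)")  # one key=value pair
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def extract(row):
    """Extract: pull every key=value pair out of a raw line."""
    return {key: value.strip() for key, value in FIELD_RE.findall(row)}


def transform(record):
    """Transform: normalize names, phone numbers and dates."""
    digits = re.sub(r"\D", "", record["phone"])        # keep digits only
    name = re.sub(r"\s+", " ", record["name"]).title()  # collapse whitespace
    date = DATE_RE.fullmatch(record["joined"])
    return {
        "id": int(record["id"]),
        "name": name,
        "phone": digits if len(digits) == 10 else None,
        "joined_year": int(date.group("year")) if date else None,
    }


def load(records):
    """Load: a dict keyed by id stands in for a real database table."""
    return {record["id"]: record for record in records}


table = load(transform(extract(row)) for row in RAW_ROWS)
```

The three stages are kept as separate functions so that each regex can be tested on its own before the pieces are chained together.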

eBay sponsored items Web-Scraping

Curious about the differences between sponsored and non-sponsored items on eBay?
Check out this Python script, which scrapes item information with "BeautifulSoup" and stores it in a database via "PyMySQL".

USNews top stories Web-Scraping

Short on time? Fetch the real-time headlines and the first three lines of each top story in Python (BeautifulSoup, tokenize), R (rvest, dplyr, stringr) or Java (jsoup, BreakIterator).

Create your own movie database using OMDb API

Wouldn't it be cool to have your own local movie database to track all the movies you have watched and want to watch? Check out the script, which uses the OMDb API, in either Python (json, PyMySQL) or Java (gson, java.sql).
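As a rough idea of the "parse the JSON, then store a row" step, here is a minimal sketch. It swaps in the standard-library sqlite3 module for a MySQL connection so that it runs without a database server, and it uses a hard-coded sample payload instead of a live API call. The table layout and the helper names create_table and save_movie are assumptions for illustration; the JSON keys follow the shape of an OMDb response.

```python
import json
import sqlite3

# Sample payload shaped like an OMDb response; hard-coded so that
# no API key or network access is needed.
SAMPLE_RESPONSE = (
    '{"Title": "Inception", "Year": "2010", '
    '"imdbID": "tt1375666", "Response": "True"}'
)


def create_table(conn):
    """Create the (hypothetical) movies table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS movies (
               imdb_id TEXT PRIMARY KEY,
               title   TEXT NOT NULL,
               year    TEXT,
               watched INTEGER NOT NULL DEFAULT 0
           )"""
    )


def save_movie(conn, payload, watched=False):
    """Parse one JSON payload and upsert it; return False if the lookup failed."""
    movie = json.loads(payload)
    if movie.get("Response") != "True":
        return False
    conn.execute(
        "INSERT OR REPLACE INTO movies (imdb_id, title, year, watched) "
        "VALUES (?, ?, ?, ?)",
        (movie["imdbID"], movie["Title"], movie.get("Year"), int(watched)),
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
create_table(conn)
saved = save_movie(conn, SAMPLE_RESPONSE, watched=True)
rows = conn.execute("SELECT imdb_id, title, year, watched FROM movies").fetchall()
```

Using the IMDb ID as the primary key means that saving the same movie twice updates the existing row rather than creating a duplicate.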

DB management of unstructured big data in MongoDB

This project handles the storage and manipulation (Map-Reduce) of unstructured big data in MongoDB, using MongoClient in Python. The data can also be transformed and transferred from MongoDB to a SQL database.

Recommender system based on Association Rules Mining

Designing promotion bundles? Want to know which products to recommend next? Check out this recommender system, which generates frequent itemsets and association rules via the "Apriori" algorithm in Python.
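For readers new to Apriori, here is a small from-scratch sketch of the idea in plain Python. It is not the project's code (which may well rely on a library); the baskets, the thresholds and the function names are made up for illustration.

```python
from itertools import combinations

# Hypothetical shopping baskets, made up for illustration.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: grow candidates of size k from frequent sets of size k-1."""
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        kept = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, k - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Turn frequent itemsets into (antecedent, consequent, confidence) rules."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


frequent = frequent_itemsets(TRANSACTIONS, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.9)
```

With these toy baskets the only rule that clears the confidence threshold is "beer implies diapers", because every basket containing beer also contains diapers.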

Sentiment Analysis of Yelp reviews on Google Cloud Platform

Are rating stars everything? People inevitably apply different standards when giving stars. This project helps you build a more practical and useful rating system based on sentiment analysis of the reviews, which applies a uniform standard.

Multiple Classification models for diabetes prediction

By separately applying the LogisticRegression, SVM and RandomForest classification algorithms with GridSearchCV and RecursiveFeatureElimination (RFE), this project reached an accuracy of 90.3% in diabetes prediction.

Modeling and analysis on NYTimes articles about 2020 primary election

By applying topic modeling, sentiment analysis and word clouds to New York Times articles, this project shows the main topics being discussed and the media sentiment toward Joe Biden and Bernie Sanders.

My Blogs

Contribute to Team Productivity when you are not the “Boss”

Based on my practicum experience, I summarize three ways to improve team productivity as a team member rather than the leader.

The era of data capitalism

“This is the dawn of the era of data capitalism.” In the age of big data, capitalism has been reinvented. Check out how my practicum team made good use of data to drive business value for our client.

Secure the analysis with the right data

How do you deliver outcomes that will "WOW" your client even when data is lacking? Check out my four practical suggestions and tips.

Devise the roadmap to buried gold

As an analyst, I think the art of data science is much like digging for gold. How do you devise a precise, easy-to-understand roadmap, which is the key to the treasure and also the key to data analysis?

My Visualizations

The Startup Quadrant - Find out your investment choices

A dashboard for risk investors to get a grasp of the most investment-worthy
#startups #quadrant #investments

World Demographic Analysis

An #animated dashboard showing how key country indicators have developed over the decades. Drill into individual countries via #trailRun and #highlight.

Laundry Startup Extension Plan

An #extension plan dashboard for a laundry startup via #ClusterAnalytics.
#RegionalProfitability #TableauMaps #analytics #Groups