Data Mining: Student Projects
Project 3
- Liu:
Predict whether a product is top-selling and forecast its demand.
- Cairo: Income prediction based on census data (above 50K or not).
- Zoe: Beer recommendation
- Juhan Choi: Box Office
- Amin:
Spotify Music Popularity Regressor
- Fuxian Li:
Movie Genre Prediction from a plot description
- Xuan Wang:
Image Recognition
- Dan Grennel
County primary election vote prediction for a Democratic candidate (Clinton)
Code on GitHub.
- Bohyun:
CKD: Chronic Kidney Disease Prediction ***
- Vishal:
Fruit image clustering app
- Moti: Irony Detection from tweets ***
- Steve Riesenberg:
Fentanyl Early Alert ***
- Steven Hanna: Predict political party from tweets
slides
paper
- Paul Thomson:
slides
paper
- HongTu Yan:
Stock overreaction prediction
(No Python code implemented?)
stock overreaction info
- Enis:
Train AudioSet model
VGG inference: https://colab.research.google.com/drive/1TCYU2qRdw7Nh8qd2DZT_yTkaUz3liNK1
AudioSet model inference: https://colab.research.google.com/drive/1ILwZH5NqtM-wVNyBDBSwpxag-6-AKabp
- Jason Fushian Li:
Movie Genre Prediction models
Comments: performance information is needed (accuracy for each model).
Project 2
- Hongtu Yan
Web App: Credit default prediction
- Bohyun
Web App: bohyun/dm2019
- Moti: (Please make sure your web app works)
Web App: Predicting purchase
- Wang Xuan:
Web App: Predict the sentiment of Yelp Review
- Vishal:
Web App: Image object classifier
- Linmei Lin
Predicting the cryptocurrency price.
- Steve Hanna
Web app: Rain Prediction
Model building: Predicting whether it will rain tomorrow in Australia.
- Lu Liu
Predict whether a construction job requires self-professional certification.
- Amin:
Dataset: Spotify music data
Web App: Predict popularity of music.
- Mark Perlman:
web app: Titanic survival classification
- Zoe
Kaggle’s Students’ Academic Performance Dataset
Webapp: Classify a student's academic performance as high, medium, or low.
Features fall into three categories: demographic, academic background, and behavioral.
- Cairo
Dataset: UCLA application information from at least 2009; 400 x 9 data points
Webapp: Predict whether a student will be accepted into a graduate program
- Steve Riesenberg
Dataset: Kaggle "What's Cooking?" dataset
Webapp: classify a recipe into one of 20 cuisines based on the ingredients of the recipe. The winner of the competition had an accuracy of about 82%, so my goal will be to see if I can get to 80%.
- Fushian Li
Web app:
Movie Genre prediction
- Juhan Choi
Predict whether a customer will make a transaction.
- Enis:
Webapp: Building a news classifier for incendiary content and its website.
- Daniel Grenell
Web app:
Predicting snow days.
A probabilistic classifier for snow days in New York City
Snow day data:
Example of using wunderground API:
Weather data:
Project 1
- Enis Berk: Short speech commands classification
Kaggle competition dataset of 1-second short speech command files (.wav).
CNN model using TensorFlow to classify audio into short speech commands.
- Xuan Wang:
NYC Government Employee Payroll data analysis
Citywide payroll data set
Payroll analyses (high average payment, income by department)
- Fuxian Li: Crime Distribution in Boston
crime data of Boston
- Juhan Choi: Housing price prediction
Dataset: Kaggle housing price dataset
- Daniel Amin:
Indie Music analysis
UCI FMA: A Dataset for Music Analysis
Predict the success of a song
- Daniel Grenell: 311 Complaints data analysis in NYC
311 Dataset of NYC in 2019?
How have the response times to 311 complaints changed over time for different parts of the city?
complaint types?
- Lu Liu:
pdf
NYC OpenData - DOB Job Application Filings dataset
Webapp
Building Construction Job Trends and Prediction for Real Estate Investment.
-- Recommend new investments in properties that allow construction projects.
- Steve Riesenberg: SHSAT testing schools
slides
Selective High School Admissions Test data
Identifying which schools have the most opportunity for increasing the number of low-income students taking the SHSAT
- Vishal Bharti: NY Collision Analyses
NYPD Collisions dataset.
Discovering trends in motor accidents based on location, vehicle types, etc.
- Bohyun Lee: HPM data analysis
Human Proteome Map data: intensity profiles for hundreds of proteins
(from spectrometry); humanproteomemap.org data, paper in PubMed
Identify protein expression patterns for each body site and cell type.
- Linmei Huang: Cryptocurrency Price Development
Historical daily transaction data for cryptocurrencies from the website "http://nomic.com" and other websites
The price development of cryptocurrencies and volatility
- Mark Perelman:
Shoes data analysis
Women's shoe data
Trends of prices
- Paul Thompson: Clinical Data analysis
Clinical Research Trends
- Cairo Thompson: Loan application analysis
Loan Applications and then Predictions
- Zoe Markovits:
Twitter Sentiment analysis
Twitter data on US airlines classified by sentiment.
Working with sentiment analysis and clustering algorithms to see whether a natural pattern appears
- Hongtu Yan: Credit card analysis
A very large credit card data set
Credit card default prediction.
- Angel Vizzuraga: Crime in NYC
NYC crime data
- Moti Mounesan
Customer purchase prediction
Kaggle competition dataset
- Steven Hanna
Data - Cervical Cancer dataset
Create an accurate predictor of cervical cancer.
Exercise: Titanic survival prediction