Data Mining: Student Projects
Predict whether a product is top-selling and forecast its demand.
- Cairo: Income prediction based on census data (above 50K or not).
- Zoe: Beer recommendation
- Juhan Choi: Box Office
Spotify Music Popularity Regressor
- Fuxian Li:
Movie Genre Prediction from a plot description
- Xuan Wang:
- Dan Grennel
County Primary Election Votes prediction for a democratic candidate (Clinton)
Code on GitHub.
CKD: Chronic Kidney Disease Prediction ***
Fruit image clustering app
- Moti: Irony Detection from tweets ***
- Steve Riesberger:
Fentanyl Early Alert ***
- Steven Hanna: Predict political party from tweets
- Paul Thomson:
- HongTu Yan:
Stock overreaction prediction
(No Python code implemented?)
stock overreaction info
Train audioset model
VGG inference: https://colab.research.google.com/drive/1TCYU
Audioset model inference: https://colab.research.google.com/
Movie Genre Prediction models - Jason Fushian Li
Comments: need performance info - accuracy for each model
- Hongtu Yan
Web App: Credit default prediction
- Moti: (Please make sure your web app works)
Web App: Predicting purchase
- Wang Xuan:
Web App: Predict the sentiment of Yelp Review
Web App: Image object classifier
- Linmei Lin
Predicting the cryptocurrency price.
- Steve Hanna
Web app: Rain Prediction
Model building: Predicting whether it will rain tomorrow in Australia.
- Lu Liu
Predict whether a construction job requires self-professional certification.
Dataset: Spotify music data
Web App:
Predict popularity of music.
- Mark Perlman:
web app: Titanic survival classification
Kaggle’s Students’ Academic Performance Dataset
Webapp: Can we classify a student's academic performance as high, medium, or low?
Features broken up into three categories: demographic features, academic background features, and behavioral features
dataset: UCLA application information from at least 2009; 400 x 9 data points
Webapp: predict whether a student will be accepted into a graduate program
- Steve Risenberg
Dataset: Kaggle "What's Cooking?" dataset
Webapp: classify a recipe into one of 20 cuisines based on the ingredients of the recipe. The winner of the competition had an accuracy of about 82%, so my goal will be to see if I can get to 80%.
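The recipe-to-cuisine task above can be illustrated with a minimal sketch. It assumes a bag-of-ingredients naive Bayes classifier with add-one smoothing, which is one plausible baseline and not necessarily the model used in the project. The toy recipes, the function names (train, predict, accuracy), and the two cuisines are hypothetical stand-ins for the 20-cuisine Kaggle data.

```python
import math
from collections import Counter, defaultdict


def train(recipes):
    """Count cuisines and per-cuisine ingredients from (cuisine, ingredients) pairs."""
    cuisine_counts = Counter()
    ingredient_counts = defaultdict(Counter)
    vocabulary = set()
    for cuisine, ingredients in recipes:
        cuisine_counts[cuisine] += 1
        for ingredient in ingredients:
            ingredient_counts[cuisine][ingredient] += 1
            vocabulary.add(ingredient)
    return cuisine_counts, ingredient_counts, vocabulary


def predict(model, ingredients):
    """Return the cuisine with the highest smoothed log-probability."""
    cuisine_counts, ingredient_counts, vocabulary = model
    total_recipes = sum(cuisine_counts.values())
    best_cuisine, best_score = None, -math.inf
    for cuisine, count in cuisine_counts.items():
        score = math.log(count / total_recipes)  # log prior
        denom = sum(ingredient_counts[cuisine].values()) + len(vocabulary)
        for ingredient in ingredients:
            if ingredient in vocabulary:  # ingredients never seen in training are skipped
                score += math.log((ingredient_counts[cuisine][ingredient] + 1) / denom)
        if score > best_score:
            best_cuisine, best_score = cuisine, score
    return best_cuisine


def accuracy(model, recipes):
    """Fraction of recipes whose predicted cuisine matches the label."""
    correct = sum(predict(model, ings) == cuisine for cuisine, ings in recipes)
    return correct / len(recipes)


# Hypothetical toy data; the real dataset has 20 cuisines.
toy_recipes = [
    ("italian", ["tomato", "basil", "olive oil", "garlic"]),
    ("italian", ["pasta", "tomato", "parmesan"]),
    ("mexican", ["tortilla", "beans", "salsa", "cilantro"]),
    ("mexican", ["avocado", "lime", "cilantro", "tomato"]),
]
model = train(toy_recipes)
print(predict(model, ["basil", "pasta"]))
```

To compare against the 80% goal, accuracy would need to be measured on recipes held out from training, not on the training recipes themselves.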
- Fushian Li
Movie Genre prediction
Predict customer transaction or not.
the news classifier for incendiary content and its
- Daniel Grenell
Predicting snow days.
a probabilistic classifier for snow days in New York City
Snow day data:
Example of using wunderground API:
Enis Berk: Short speech commands classification
Kaggle competition dataset of 1-second short speech command files (.wav).
CNN model using TensorFlow to classify audio clips into short speech commands.
- Xuan Wang:
NYC Government Employee Payroll data analysis
Citywide payroll data set
payroll analyses (high average payment, income by department)
- Fuxian Li: Crime Distribution in Boston
crime data of Boston
- Juhan Cho - Housing price prediction
dataset: Kaggle housing price dataset
- Daniel Amin:
Indie Music analysis
UCI FMA: A Dataset for Music Analysis
Predict the success of a song
Daniel Grenell: 311 Complaints data analysis in NYC
311 Dataset of NYC in 2019?
How have the response times to 311 complaints changed over time for different parts of the city?
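The response-time question above amounts to grouping complaints by area and time period and comparing a summary statistic. A minimal sketch, assuming each complaint has an open timestamp, a close timestamp, and a borough: the field names (created, closed, borough) and the sample rows are hypothetical, not taken from the actual 311 dataset.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def median_response_hours(complaints):
    """Median hours from open to close, keyed by (borough, year opened)."""
    groups = defaultdict(list)
    for row in complaints:
        opened = datetime.fromisoformat(row["created"])
        closed = datetime.fromisoformat(row["closed"])
        hours = (closed - opened).total_seconds() / 3600
        groups[(row["borough"], opened.year)].append(hours)
    return {key: median(values) for key, values in sorted(groups.items())}


# Hypothetical sample rows for illustration only.
sample = [
    {"borough": "BROOKLYN", "created": "2018-03-01T08:00:00", "closed": "2018-03-01T20:00:00"},
    {"borough": "BROOKLYN", "created": "2018-06-10T09:00:00", "closed": "2018-06-11T09:00:00"},
    {"borough": "BROOKLYN", "created": "2019-01-05T10:00:00", "closed": "2019-01-05T16:00:00"},
    {"borough": "QUEENS", "created": "2019-02-01T12:00:00", "closed": "2019-02-03T12:00:00"},
]
result = median_response_hours(sample)
for (borough, year), hours in result.items():
    print(borough, year, hours)
```

The median is used here instead of the mean because a few complaints that stay open for a very long time would otherwise dominate the comparison.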
- Lu Liu:
NYC OpenData - DOB Job Application Filings dataset
Building Construction Job Trends and Prediction for Real Estate Investment.
-- recommend new investments in properties that allow construction projects.
Steve Riesenberg: SHSAT testing schools
Selective High School Admissions Test data
identifying which schools have the most opportunity for increasing the number of low income students taking the SHSAT
Vishal Bharti: NY Collision Analyses
NYPD Collisions dataset.
Discovering trends in motor accidents based on location, vehicle types, etc.
- Bohyun Lee: HPM data analysis
Human Proteome Map data: intensity profiles for hundreds of proteins
(from spectrometry); humanproteomemap.org data, paper in PubMed
Identify protein expression patterns for each body site and cell type.
Linmei Huang: Cryptocurrency Price Development
Historical daily cryptocurrency transaction data from the website "http://nomic.com" and other websites
The price development of cryptocurrencies and volatility
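One common way to quantify the volatility mentioned above is the standard deviation of daily log returns. A minimal sketch, assuming a list of daily closing prices: the function names and the price series are hypothetical, and the project may have used a different volatility measure.

```python
import math
from statistics import stdev


def log_returns(prices):
    """Daily log returns: ln(p_t / p_{t-1}) for consecutive closing prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def daily_volatility(prices):
    """Sample standard deviation of daily log returns (needs at least 3 prices)."""
    return stdev(log_returns(prices))


# Hypothetical closing prices for illustration only.
steady = [100.0, 110.0, 121.0]           # grows 10% each day
choppy = [100.0, 120.0, 90.0, 130.0]     # swings up and down
print(daily_volatility(steady), daily_volatility(choppy))
```

A series that grows at a constant rate has near-zero volatility by this measure, while a series that swings between gains and losses scores higher even if it ends at a similar price.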
- Mark Perelman:
Shoes data analysis
Women's shoe data
Trends of prices
Paul Thompson: Clinical Data analysis
Clinical Research Trends
Cairo Thompson: Loan application analysis
Loan Applications and then Predictions
- Zoe Markovits:
Twitter Sentiment analysis
Twitter data on US airlines classified by sentiment.
Working with sentiment analysis, and clustering algorithms to see if a natural pattern appears
- Hongtu Yan: Credit card analysis
a very large credit card data set
Credit card default prediction.
- Angel Vizzuraga: Crime in NYC
NYC crime data
- Moti Mounesan
Customer purchase prediction
Kaggle competition dataset
- Steven Hanna
Data - Cervical Cancer dataset
Create an accurate predictor of cervical cancer.
exercise: Titanic Survival prediction