- Fall On Campus
- Spring On Campus
Students will learn analysis methods that are widely used across internet companies, from start-ups to online giants such as Amazon and Google. At the end of the course, students will apply these methods to answer a real scientific question.
Prerequisites: Students must have programming experience. Students are also strongly encouraged to have taken Multivariate Data Analytics (BIA 652), Data & Knowledge Management (MIS 630), and Knowledge Discovery in Databases (MIS 637).
Additional learning objectives include the development of:
Data Collection and Preprocessing Skills: students will learn how to identify and profile candidate sources of valuable data, as well as how to automatically collect and manage the information they need for their analytics tasks.
Diverse Analytic Skills: students will be exposed to a wide range of both quantitative and qualitative analytics techniques with applications across multiple business domains.
Team Skills: students will be organized into teams and will collaborate on projects for the duration of the course. Each student will evaluate their teammates twice during the semester via a customized team survey tool. The tool provides a detailed analysis of each person’s contributions to the different stages of the team’s operation and will be used to promptly identify and address possible problems.
Collect, clean, and organize online data from two different websites of your choice. The deliverable includes the two datasets, the collection and cleaning scripts, and a presentation to be given in class.
Choose an important research question that emerges in the context of one of the two datasets collected for the midterm project. Develop, apply and record an analytics methodology to address your question. This work will be presented in class.
Introduction to the course
Introduction to Python I (basic concepts)
Introduction to Python II (parsing & using libraries)
Using Python to scrape the web I (regex & other libraries)
Using Python to scrape the web II (data cleaning)
Text Mining with Python (nltk)
Midterm Project Presentations
Sentiment Analysis with Python
Social Network & Graph Mining with Python (networkx)
Machine Learning & Analytics with Python I (sklearn)
Machine Learning & Analytics with Python II (sklearn)
Machine Learning & Analytics with Python III (sklearn)
Visualization (matplotlib & other tools)
Work on Final Projects
Final Project Presentations