ICOS Big Data Summer Camp

University of Michigan

Room R0220 - Ross School of Business - 701 Tappan Street, Central Campus
June 1-4, 2015
9:00 am - 5:00 pm

General Information

Social and organizational life are increasingly conducted or tracked online through electronic media, from emails to Twitter feeds to dating sites to GPS phone tracking. The traces these activities leave behind have acquired the (misleading) title of “big data.” It is a good bet that within a few years, a standard part of graduate training in the social sciences will include a hefty dose of “how to make use of big data,” just as statistical analysis is a standard part of such training today. The ICOS Big Data Camp aims to make big data accessible to people with no prior background. We want people to leave with enough confidence and basic knowledge to know what is possible in their research and where they might go next, drawing on resources at the University of Michigan.

Organizing committee: Jerry Davis, H. V. Jagadish, Cliff Lampe, and Brian Noble

Instructors: Jon Atwell, Mike Anderson, Jerry Davis, Zakir Durumeric, Gareth Keeves, Cliff Lampe, Colleen Van Lent, Brian Noble, Katharina Reinecke, Eric Seymour

Helpers: David Adrian, Joshua Adkins, Antonio Deusany de Carvalho Junior, Ariana Mirian

Who: The course is aimed at graduate students and other researchers.

Requirements: Participants must bring a laptop with a few specific software packages installed (listed below).

Contact: Please email schifelt@umich.edu for more information.

Resources: Go here for example papers and data sources.


Schedule

Monday
09:00 Introduction and Overview with Jerry Davis (Intro ppt, pdf. Assignment ppt, pdf)
10:15 Break
10:30 Hero's Journey #1 - Cliff Lampe
11:15 Hero's Journey #2 - Katharina Reinecke
12:00 Lunch break
1:00 Group formation & How to learn in groups: lessons from design teams, Brian Noble
2:00 The Setup & Command line with Todd Schifeling (Tutorials: install, command line)

Tuesday
09:00-10:45 Introduction to SQL with Mike Anderson (Slides: ppt pdf)
10:45-11:00 Coffee Break
11:00-12:00 Using SQL with Eric Seymour (Materials: zip file)
12:00-1:00 Lunch break
1:00-5:00 Group Work (play data)
4:00-5:00 Check-in and end of day discussion

Wednesday
09:00-10:45 Introduction to Python with Colleen Van Lent (HTML, Notebook)
10:45-11:00 Coffee Break
11:00-12:00 Python with Jon Atwell (Link to code)
12:00-1:00 Lunch break
1:00-4:00 Group Work (scraping links)
4:00-5:00 Check-in and end of day discussion
SIGN IN SHEET

Thursday
9:00-9:20 Now What? with Sharon Broude Geva (Slides)
09:20-10:45 Introduction to APIs with Zakir Durumeric (Install, Code, Lecture)
10:45-11:00 Coffee Break
11:00-12:00 Using APIs with Gareth Keeves
12:00-1:00 Lunch break
1:00-1:30 Music API with Antonio Deusany de Carvalho Junior (notebook)
1:00-4:00 Group Work & Python + SQL (slides, code)
4:00-5:00 Check-in and end of day discussion

Thursday, June 11th
1:00-4:00 Final Session with Group Presentations. Ross R0220
4:00-5:00 Dominicks!

Setup

To participate in the ICOS Big Data Summer Camp, you will need working copies of the software described below. Please make sure to install everything (or at least to download the installers) before the camp starts.

Overview of the tools

Editor

When you're writing code, it helps to have a text editor designed for the job, with features like automatic color-coding of keywords.

The Bash Shell

Bash is a commonly used shell, a program that lets you control your computer by typing commands. Working in a shell lets you automate repetitive tasks and get more done, more quickly.

Python

Python is becoming very popular in scientific computing, and it's a great language for teaching general programming concepts due to its easy-to-read syntax. We teach with Python version 2.7, since it is still the most widely used. Installing all the scientific packages for Python individually can be a bit difficult, so we recommend an all-in-one installer.
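Once an installer has finished, a short script like the one below is one way to confirm which Python you are running and to get a feel for the syntax. It is only a sketch: the function names `describe_python` and `count_words` and the sample sentence are made up for illustration, and the code is written so that it should run under Python 2.7 as well as Python 3.

```python
import sys
from collections import Counter


def describe_python():
    """Return the running interpreter's version as a string such as '2.7'."""
    return "{0}.{1}".format(sys.version_info[0], sys.version_info[1])


def count_words(text):
    """Count how often each lowercase word appears in a piece of text."""
    return Counter(text.lower().split())


if __name__ == "__main__":
    print("Running Python " + describe_python())
    # Hypothetical sample text, not camp data.
    counts = count_words("big data is big")
    print(counts.most_common(1))
```

If the version printed is not the one you expected, another Python installation is probably earlier on your system path than the one you just installed.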

IPython Notebook

The IPython Notebook is a web-based interface for interactive computing with Python. Individual notebooks are composable, executable, and sharable documents that mix text, code, data, and visualizations. The IPython Notebook comes pre-loaded with many all-in-one Python installers like Anaconda CE.

SQL

SQL is a specialized programming language used with databases. SQL is a declarative language: you describe (declare) the data you want from the database rather than the steps for retrieving it. We use a Firefox plugin called SQLite Manager for the lessons.
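To give a flavor of what "declarative" means, here is a minimal sketch that uses Python's built-in `sqlite3` module rather than the SQLite Manager plugin used in the lessons. The table name `messages`, its columns, the function name `messages_per_sender`, and the sample rows are all invented for illustration. The query states which result is wanted (a message count per sender) and leaves it to the database to work out how to compute it.

```python
import sqlite3


def messages_per_sender(rows):
    """Load (sender, body) rows into an in-memory database and count messages per sender."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing is written to disk
    try:
        conn.execute("CREATE TABLE messages (sender TEXT, body TEXT)")
        conn.executemany("INSERT INTO messages (sender, body) VALUES (?, ?)", rows)
        # The query describes the desired result, not the steps to produce it.
        query = """
            SELECT sender, COUNT(*) AS n_messages
            FROM messages
            GROUP BY sender
            ORDER BY n_messages DESC, sender ASC
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical sample rows, not camp data.
    sample = [("ana", "hello"), ("ben", "hi"), ("ana", "see you at lunch")]
    print(messages_per_sender(sample))
```

The same `SELECT` statement could be pasted into any SQLite tool that has a `messages` table with these columns.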

Windows Installation

Python

  • Download and install Anaconda CE.
  • Use all of the installation defaults, but make sure to check "Make Anaconda the default Python".

Editor

Notepad++ is a popular free code editor for Windows. Be aware that you must add its installation directory to your system path in order to launch it from the command line (or have other tools like Git launch it for you). Please ask your instructor to help you do this.

Firefox SQLite Plugin

Windows doesn't have sqlite3 available on the command line, so we will use this plugin for Firefox instead. To install it:

  • Start Firefox.
  • Go to the plugin homepage.
  • Click the "Add Now" button.
  • Click "Install Now" on the dialog that appears after the download completes.
  • Restart Firefox when prompted.
  • Depending on your Firefox version, either 1) select "SQLite Manager" from the "Tools" menu or 2) go to "Customize" in the main menu and drag SQLite Manager into the menu.

Mac OS X Installation

Python

  • Download and install Anaconda CE.
  • Use all of the installation defaults, but make sure to check "Make Anaconda the default Python".

Editor

We recommend TextWrangler or Sublime Text. In a pinch, you can use nano or vi, which should be pre-installed.

Firefox SQLite Plugin

Instead of using sqlite3 from the command line, we will use this plugin for Firefox. To install it:

  • Start Firefox.
  • Go to the plugin homepage.
  • Click the "Add Now" button.
  • Click "Install Now" on the dialog that appears after the download completes.
  • Restart Firefox when prompted.
  • Depending on your Firefox version, either 1) select "SQLite Manager" from the "Tools" menu or 2) go to "Customize" in the main menu and drag SQLite Manager into the menu.