The Python Data Scientist Toolbox!
Data Science Courses
- Harvard course: CS109 Data Science
- University of Washington: Introduction to Data Science
- Emilio Ferrara’s graduate course on Mining the social web
Readings & Tutorials
Frameworks
Plotting – Visualization
- matplotlib
- seaborn
- plotly (interactive)
- pygal (interactive)
- ggplot (Python wrapper)
- gnuplot.py (Python wrapper)
- Geomapping
Machine Learning
Data Repositories
- Network Data Repository
- Awesome public datasets
- SNAP – Stanford Large Network Dataset Collection
- PyDataset
Network analysis
Network algorithms
Data Slicing
- pandas (10-minute tutorial)
- DateFinder (finds dates in text)
Scientific, Numerical, and Statistical Analysis
Topic modeling
Natural Language Processing (NLP) & Sentiment Analysis (SA)
Twitter Analytics
Web Data Extraction
Geocoding
Time series analysis