Introduction to Data Analytics with Pandas
Matt Harrison, MetaSnake
Matt Harrison is a consultant and corporate trainer at MetaSnake, focusing on Python and Data Science. He has been using Python since 2000 across the domains of search, build management and testing, business intelligence, and storage.
Matt also runs pycast.io, a screencasting service providing instruction on Python and Data Science. He occasionally tweets useful Python-related information at @__mharrison__.
This course introduces the pandas toolkit for Python. Beyond general-purpose development, Python is one of the top skills for data scientists because it serves as a full-stack analytics environment: you can use it to access data (or crawl to gather it), slice and dice it, load it into a database, visualize it, and perform machine learning on it.
Note: Attendees should have the (free) Anaconda stack installed on their computer. This is a large download, so it should be installed prior to the class. Downloads can be found at http://continuum.io/downloads.
This course is for anyone with basic programming skills who wants to learn how to use pandas to manipulate tabular data.
Attendees will be able to slice and dice tabular data with pandas.
- IPython Notebook
- Navigation in Notebook
- Executing code in Notebook
- Pandas Introduction
- Getting data
- Cleaning data
- Examining data
- Filtering, joining and updating data
- Working with aggregates
- Creating pivot tables
- Basic plots
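
To give a feel for the pandas topics in the outline above, here is a minimal sketch that touches most of them in order. The sample data, column names, and the `states` lookup table are made up for illustration and are not taken from the course materials; the calls themselves (`read_csv`, `fillna`, `describe`, `merge`, `groupby`, `pivot_table`) are standard pandas.

```python
import io

import pandas as pd

# Hypothetical sample data; the course's own datasets are not specified here.
RAW_CSV = """city,year,sales
Austin,2013,100
Austin,2014,
Boise,2013,80
Boise,2014,120
"""

# Getting data: read CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(RAW_CSV))

# Cleaning data: replace the missing sales value with 0.
df["sales"] = df["sales"].fillna(0)

# Examining data: summary statistics for the numeric columns.
summary = df.describe()

# Filtering: keep only the rows for 2014.
sales_2014 = df[df["year"] == 2014]

# Joining: attach a state to each row from a second (made-up) table.
states = pd.DataFrame({"city": ["Austin", "Boise"], "state": ["TX", "ID"]})
merged = df.merge(states, on="city", how="left")

# Working with aggregates: total sales per city.
totals = df.groupby("city")["sales"].sum()

# Creating pivot tables: cities as rows, years as columns.
pivot = df.pivot_table(index="city", columns="year", values="sales", aggfunc="sum")

print(totals)
print(pivot)
```

In a notebook, a basic plot of the per-city totals would typically be one more line, `totals.plot(kind="bar")`, assuming matplotlib is installed (it ships with the Anaconda stack).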