Course length: 4 days (32 hours)
Description: This course introduces the NumPy and Pandas packages for Python, and shows how they can be used to ask and answer a variety of questions involving data analysis. NumPy and Pandas are the foundation of the “SciPy stack,” a set of Python packages that have become extremely popular in recent years. Indeed, some financial institutions have begun to replace certain uses of Excel with Pandas, because of its versatility and power.
This course covers the major ways in which NumPy and Pandas are typically used: reading data, processing and cleaning it, visualizing it, and exporting it to other formats.
The course includes numerous hands-on labs, in which participants work on real-world problems using real data sets.
Audience: This course is aimed at programmers who have day-to-day practical experience working with Python. Knowledge of the basic data types, the ability to write loops, and familiarity with writing and executing functions are all required.
Syllabus:
- Jupyter notebook
- NumPy
  - NumPy arrays
  - Data types
  - Operations
  - Working with external data
  - Boolean indexing techniques
  - Sorting, searching, and retrieving
- Pandas
  - Series
  - DataFrame
  - Working with Pandas data
  - Importing and exporting data
  - Filtering data by row and column
  - Working with string data
  - Indexes
  - Indexing and multi-level indexing
  - Stacking and unstacking
  - Pivot tables
  - Aggregate functionality
  - Grouping
  - Sorting
  - Joining
  - Combining data frames
  - Categorizing data
  - Working with date/time data
  - Visualization of data with Pandas
  - Pandas and memory usage