Monthly Archives: April 2017

Boolean indexing in NumPy and Pandas: A free e-mail course for aspiring data scientists

For nearly two years, I have been teaching my introductory course in data science and machine learning to companies around the world. On the one hand, participants are excited by data science, and all of the potential that it has to change our world. On the other hand, there is a lot to learn in order to be effective: a mix of theory and practice. While Python’s tools reduce the learning curve, you still end up having to learn about NumPy, Pandas, Matplotlib, scikit-learn, and a variety of other tools, such as the Jupyter notebook.

One of the things that most surprises and frustrates my students is the notion of “boolean indexing.” If you want to retrieve all of the odd elements from a NumPy array, you don’t use a “for” loop or even a list comprehension. Rather, you use a boolean index, as follows:
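
A minimal sketch of such a boolean index, assuming "odd elements" means odd values; the array `a` and its contents are illustrative, not taken from the course:

```python
import numpy as np

a = np.arange(10)      # illustrative data: the integers 0 through 9

# Comparing an array to a value yields a boolean array of the same shape,
# with True wherever the condition holds.
mask = a % 2 == 1

# Indexing with that boolean array keeps only the elements where it is True.
odds = a[mask]

print(mask)
print(odds)            # [1 3 5 7 9]
```

In practice the two steps are often written as one expression, `a[a % 2 == 1]`, which is the form that tends to surprise newcomers: the expression inside the brackets is not a position but an array of True and False values.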


To cut down on the confusion, and to help people understand and use this powerful tool, I have created a free, 15-part e-mail course. Each day, you’ll be introduced to another aspect of boolean indexing. Every lesson includes exercises to help you practice what you’ve learned and internalize the ideas. By the end, you’ll know how to use boolean indexing with data in NumPy and Pandas (both series and data frames).

The course is 100% free, and a new lesson arrives in your e-mail inbox every day. So if you, or a colleague of yours, have been frustrated trying to understand how boolean indexing works, this is your chance! The more fluently you can use these tools, the better you can be as a data scientist. And that, I would argue, is better for the world.

Sign up here: