binder-environments


Commits

List of commits on branch master.
Verified
e38cdf3853ba52e719ae3f0750bcdc978a5f7a03

Update README.md

stefmolin committed 3 years ago
Verified
10b218c1b44acca0b3211d1490b8d130e2cbb272

Add new workshop to README.

stefmolin committed 3 years ago
Verified
b12c40475755545627b2353d9276283d8deed797

Add pandas workshop

stefmolin committed 4 years ago
Verified
524c65b8b0f336314c017f4fe3ccfe5a6621ef28

Include Binder link generators as resources

stefmolin committed 4 years ago
Verified
dc1b2409a285d87be61e8096ed668f83c5e76f92

Update publication date for second edition.

stefmolin committed 4 years ago
Verified
214bea56a014df67960efe770d62dce98da806aa

Update README.md

stefmolin committed 4 years ago

README


Binder environments

Central location for Binder environments, especially those used with my book, Hands-On Data Analysis with Pandas. The actual Jupyter Notebooks and content are located in separate GitHub repositories. Keeping them separate makes it possible to build the environment faster, since pushing new content to a content repository won't trigger a rebuild of the environment.

Some helpful Binder link generators:

  • general case: when the environment specifications and content are in the same repository
  • nbgitpuller: when using separate repositories for the content and the environment
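
As a rough illustration of what the two kinds of generators produce, here is a minimal Python sketch. It assumes the public mybinder.org URL layout (`/v2/gh/<owner>/<repo>/<ref>`) and nbgitpuller's `git-pull` endpoint with `repo`, `branch`, and `urlpath` parameters. The function names, the `example-user` owner, and the `example-content` repository are hypothetical placeholders rather than values from this repository, so treat the link generators themselves as the authority on the exact format.

```python
from urllib.parse import urlencode

BINDER_HOST = "https://mybinder.org"  # assumed: the public Binder service


def binder_link(owner, env_repo, ref):
    """General case: environment specs and content live in the same repository."""
    return f"{BINDER_HOST}/v2/gh/{owner}/{env_repo}/{ref}"


def binder_pull_link(owner, env_repo, ref, content_repo_url, content_branch, open_path):
    """nbgitpuller case: launch the environment repo, then pull a content repo."""
    # Inner query string, read by nbgitpuller once the server has started.
    pull_query = urlencode(
        {"repo": content_repo_url, "urlpath": open_path, "branch": content_branch}
    )
    # Outer query string, read by Binder; the inner one is encoded a second time.
    launch_query = urlencode({"urlpath": f"git-pull?{pull_query}"})
    return f"{binder_link(owner, env_repo, ref)}?{launch_query}"


# Hypothetical values for illustration only.
link = binder_pull_link(
    owner="example-user",
    env_repo="binder-environments",
    ref="pandas_workshop",
    content_repo_url="https://github.com/example-user/example-content",
    content_branch="main",
    open_path="lab/tree/example-content/",
)
print(link)
```

Because the environment repository and branch appear only in the path, pushing to the content repository leaves the cached environment image untouched, which is the point of splitting the two.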

Available environments


Hands-On Data Analysis with Pandas – Second Edition (April 2021)

Binder Purchase the book on Amazon View repository on GitHub


Data analysis has become an essential skill in a variety of domains where knowing how to work with data and extract insights can generate significant value. Hands-On Data Analysis with Pandas will show you how to analyze your data, get started with machine learning, and work effectively with the Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn.

Using real-world datasets, you will learn how to use the pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will learn how to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding chapters, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data.

This updated edition will equip you with the skills you need to use pandas 1.x to efficiently perform various data manipulation tasks, reliably reproduce analyses, and visualize your data for effective decision making—valuable knowledge that can be applied across multiple domains.

The 2nd_edition branch defines the environment needed to run the code bundle for the 2nd edition of Hands-On Data Analysis with Pandas using Binder.

Hands-On Data Analysis with Pandas – First Edition (July 2019)

Binder Purchase the book on Amazon View repository on GitHub

Data analysis has become an essential skill in a variety of domains where knowing how to work with data and extract insights can generate significant value.

Hands-On Data Analysis with Pandas will show you how to analyze your data, get started with machine learning, and work effectively with Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn. Using real-world datasets, you will learn how to use the powerful pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will learn how to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding chapters, you will explore some applications of anomaly detection, regression, clustering, and classification, using scikit-learn, to make predictions based on past data.

By the end of this book, you will be equipped with the skills you need to use pandas to ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analysis across multiple domains.

The 1st_edition branch defines the environment needed to run the code bundle for the 1st edition of Hands-On Data Analysis with Pandas using Binder.


Pandas Workshop

Binder View repository on GitHub

Working with data can be challenging: it often doesn’t come in the best format for analysis, and understanding it well enough to extract insights requires both time and the skills to filter, aggregate, reshape, and visualize it. This session will equip you with the knowledge you need to effectively use pandas – a powerful library for data analysis in Python – to make this process easier.

Pandas makes it possible to work with tabular data and perform all parts of the analysis, from collection and manipulation through aggregation and visualization. While most of this session focuses on pandas, our discussion of visualization will also introduce, at a high level, matplotlib (the library that pandas uses for its visualization features, which, when used directly, makes it possible to create custom layouts, add annotations, and more) and seaborn (another plotting library, which features additional plot types and the ability to visualize long-format data).

The pandas_workshop branch defines the environment needed to run the Jupyter Notebooks for my "Introduction to Data Analysis Using Pandas" workshop.


Python Data Visualization Workshop

Binder View repository on GitHub

The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python.

While there are many plotting libraries to choose from, the prolific Matplotlib library is always a great place to start. Since various Python data science libraries utilize Matplotlib under the hood, familiarity with Matplotlib itself gives you the flexibility to fine-tune the resulting visualizations (e.g., by adding annotations or animation). This session will also introduce interactive visualizations using HoloViz, which provides a higher-level plotting API capable of using Matplotlib and Bokeh (a Python library for generating interactive, JavaScript-powered visualizations) under the hood.

The data_viz_workshop branch defines the environment needed to run the Jupyter Notebooks for my "Beyond the Basics: Data Visualization in Python" workshop.