Monday, October 12, 2020

Using ggplot in Python: Visualizing Data With plotnine

In this tutorial, you’ll learn how to use ggplot in Python to create data visualizations using a grammar of graphics. A grammar of graphics is a high-level tool that allows you to create data plots in an efficient and consistent way. It abstracts most low-level details, letting you focus on creating meaningful and beautiful visualizations for your data.

There are several Python packages that provide a grammar of graphics. This tutorial focuses on plotnine since it’s one of the most mature ones. plotnine is based on ggplot2 from the R programming language, so if you have a background in R, then you can consider plotnine as the equivalent of ggplot2 in Python.

In this tutorial, you’ll learn how to:

  • Install plotnine and Jupyter Notebook
  • Combine the different elements of the grammar of graphics
  • Use plotnine to create visualizations in an efficient and consistent way
  • Export your data visualizations to files

This tutorial assumes that you already have some experience in Python and at least some knowledge of Jupyter Notebook and pandas. To get up to speed on these topics, check out Jupyter Notebook: An Introduction and Using Pandas and Python to Explore Your Dataset.

Setting Up Your Environment

In this section, you’ll learn how to set up your environment. You’ll cover the following topics:

  1. Creating a virtual environment
  2. Installing plotnine
  3. Installing Jupyter Notebook

Virtual environments enable you to install packages in isolated environments. They’re very useful when you want to try some packages or projects without messing with your system-wide installation. You can learn more about them in Python Virtual Environments: A Primer.

Run the following commands to create a directory named data-visualization and a virtual environment inside it:

$ mkdir data-visualization
$ cd data-visualization
$ python3 -m venv venv

After running the above commands, you’ll find your virtual environment inside the data-visualization directory. On Linux or macOS, run the following command to activate the virtual environment and start using it:

$ source ./venv/bin/activate

When you activate a virtual environment, any package that you install will be installed inside the environment without affecting your system-wide installation.
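If you want to confirm that the environment is active, one option is to ask the interpreter itself. The following is a minimal sketch, not part of the original tutorial: it relies on the standard library behavior that sys.prefix differs from sys.base_prefix inside an environment created with venv. The function name in_virtual_environment is a hypothetical helper chosen for illustration.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-style environment.

    Inside an environment created with `python3 -m venv`, sys.prefix points
    at the environment directory, while sys.base_prefix still points at the
    base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Report whether the current interpreter belongs to a virtual environment.
    print("Virtual environment active:", in_virtual_environment())
    print("Interpreter prefix:", sys.prefix)
```

Running this with the python command of an activated environment should report that the environment is active, while the system-wide interpreter should report the opposite.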

Next, you’ll install plotnine inside the virtual environment using the pip package installer by running this command:

$ pip install plotnine

Executing the above command makes the plotnine package available in your virtual environment.

Finally, you’ll install Jupyter Notebook. While this isn’t strictly necessary for using plotnine, you’ll find Jupyter Notebook really useful when working with data and building visualizations. If you’ve never used the program before, then you can learn more about it in Jupyter Notebook: An Introduction.

To install Jupyter Notebook, use the following command:

$ pip install jupyter

Congratulations, you now have a virtual environment with plotnine and Jupyter Notebook installed! With this setup, you’ll be able to run all the code samples presented throughout this tutorial.
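To double-check the setup, you could query the installed package versions from Python. This is a hedged sketch rather than a step from the original tutorial: it uses importlib.metadata from the standard library (Python 3.8 or later), and installed_version is a hypothetical helper name. It assumes the distributions are registered under the same names used with pip above, plotnine and jupyter.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Distribution names assumed to match the pip install commands above.
    for name in ("plotnine", "jupyter"):
        found = installed_version(name)
        print(f"{name}: {found or 'not installed'}")
```

If either package shows up as not installed, the most likely cause is that the virtual environment wasn’t active when you ran pip.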

Building Your First Plot With ggplot and Python

In this section, you’ll learn how to build your first data visualization using ggplot in Python. You’ll also learn how to inspect and use the example datasets included with plotnine.

Read the full article at https://realpython.com/ggplot-python/


from Real Python