Monday, January 11, 2021

NumPy Tutorial: Your First Steps Into Data Science in Python

NumPy is a Python library that provides a simple yet powerful data structure: the n-dimensional array. This is the foundation on which almost all the power of Python’s data science toolkit is built, and learning NumPy is the first step on any Python data scientist’s journey. This tutorial will provide you with the knowledge you need to use NumPy and the higher-level libraries that rely on it.
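As a small illustration of what an n-dimensional array looks like in practice, the sketch below builds a two-dimensional array from nested lists and inspects it. The variable name `grid` and its values are made up for this example and don’t come from the tutorial itself.

```python
import numpy as np

# A hypothetical 2 x 3 table of values, built from nested Python lists.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.ndim)   # number of dimensions (axes): 2
print(grid.shape)  # size along each axis: (2, 3)
print(grid.sum())  # a single call operates on every element: 21
```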

In this tutorial you’ll learn:

  • What core concepts in data science are made possible by NumPy
  • How to create NumPy arrays using various methods
  • How to manipulate NumPy arrays to perform useful calculations
  • How to apply these new skills to real-world problems

To get the most out of this NumPy tutorial, you should be familiar with writing Python code. Working through the Introduction to Python learning path is a great way to make sure you’ve got the basic skills covered. If you’re familiar with matrix mathematics, then that will certainly be helpful as well. You don’t need to know anything about data science, however. You’ll learn that here.

There’s also a repository containing the NumPy code samples that you’ll see throughout this tutorial. You can use it for reference and experiment with the examples to see how changing the code changes the outcome.

Choosing NumPy: The Benefits

Since you already know Python, you may be asking yourself if you really have to learn a whole new paradigm to do data science. Python’s for loops are awesome! Reading and writing CSV files can be done with traditional code. However, there are some convincing arguments for learning a new paradigm.

Here are the top four benefits that NumPy can bring to your code:

  1. More speed: NumPy uses algorithms written in C that typically run far faster than equivalent loops written in pure Python.
  2. Fewer loops: NumPy helps you to reduce loops and keep from getting tangled up in iteration indices.
  3. Clearer code: Without loops, your code will look more like the equations you’re trying to calculate.
  4. Better quality: There are thousands of contributors working to keep NumPy fast, friendly, and bug free.
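To make the “fewer loops” and “clearer code” points concrete, here’s a minimal sketch that converts temperatures with an explicit loop and then with a NumPy array. The sample data and variable names are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical sample data: temperatures in degrees Celsius.
celsius = [12.5, 18.0, 21.3, 30.1]

# Pure-Python version: an explicit loop that builds the result step by step.
fahrenheit_loop = []
for value in celsius:
    fahrenheit_loop.append(value * 9 / 5 + 32)

# NumPy version: the formula is applied to every element at once,
# so the code reads like the equation itself.
celsius_array = np.array(celsius)
fahrenheit_array = celsius_array * 9 / 5 + 32

# Both approaches give the same numbers (up to floating-point rounding).
print(np.allclose(fahrenheit_loop, fahrenheit_array))
```

On an array this small, any speed difference is negligible. The performance benefit tends to show up as the amount of data grows.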

Because of these benefits, NumPy is the de facto standard for multidimensional arrays in Python data science, and many of the most popular libraries are built on top of it. Learning NumPy is a great way to set down a solid foundation as you expand your knowledge into more specific areas of data science.

Installing NumPy

It’s time to get everything set up so you can start learning how to work with NumPy. There are a few different ways to do this, and you can’t go wrong by following the instructions on the NumPy website. There are some extra details to be aware of, though, and they’re outlined below.

You’ll also be installing Matplotlib. You’ll use it in one of the later examples to explore how other libraries make use of NumPy.

Using Repl.it as an Online Editor

If you just want to get started with some examples, follow along with this tutorial, and start building some muscle memory with NumPy, then Repl.it is a great option for in-browser editing. You can sign up and fire up a Python environment in minutes. Along the left side, there’s a tab for packages. You can add as many as you want. For this NumPy tutorial, go with the current versions of NumPy and Matplotlib.

[Figure: The location of the "Packages" tab on Repl.it.]

Luckily, Repl.it allows you to just click and install.

Installing NumPy With Anaconda

The Anaconda distribution is a suite of common Python data science tools bundled around a package manager that helps manage your virtual environments and project dependencies. It’s built around conda, which is the actual package manager. This is the method recommended by the NumPy project, especially if you’re stepping into data science in Python without having already set up a complex development environment.

If you’ve already got a workflow you like that uses pip, Pipenv, Poetry, or some other toolset, then it might be better not to add conda to the mix. The conda package repository is separate from PyPI, and conda itself sets up a separate little island of packages on your machine, so managing paths and remembering which package lives where can be a nightmare.

Once you’ve got conda installed, you can run the install command for the libraries you’ll need:

$ conda install numpy matplotlib

This will install what you need for this NumPy tutorial, and you’ll be all set to go.
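If you’d like to confirm that both libraries are visible to your Python interpreter, a quick check along these lines can help. This is only a sketch: `is_installed` is a hypothetical helper written for this example, built on the standard library’s `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


# Report on the two packages this tutorial relies on.
for name in ("numpy", "matplotlib"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, make sure you’re running Python from the same environment in which you ran the install command.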

Read the full article at https://realpython.com/numpy-tutorial/


from Real Python