Sunday, March 21, 2021

Python Pool: How to Convert Numpy Array to Pandas Dataframe

Introduction

In Python, there are many ways to convert a numpy array to a pandas dataframe, and sometimes a task calls for one particular method. In this tutorial, we will go through several ways to convert a numpy array to a pandas dataframe.

What is Numpy Array?

A numpy array is a grid of values, all of the same type, that is indexed by a tuple of non-negative integers.

import numpy as np
arr = np.array((1, 2, 3, 4, 5))
print(arr)

Output:

[1 2 3 4 5]
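
To illustrate the "indexed by a tuple" part of the definition, here is a small sketch with a two-dimensional array. The variable name grid is only illustrative and does not come from the original example.

```python
import numpy as np

# a 2 x 3 grid; every element has the same type (an integer dtype here)
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.shape)     # (rows, columns)
print(grid[1, 2])     # element at row 1, column 2
print(grid[(1, 2)])   # same element, with the index tuple written explicitly
```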

What is Pandas Dataframe?

A pandas dataframe is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled rows and columns.

import pandas as pd
lst = ['Latracal', 'Solution', 'an', 'online', 
            'portal', 'for', 'languages']
df = pd.DataFrame(lst)
print(df)

Output:

           0
0   Latracal
1   Solution
2         an
3     online
4     portal
5        for
6  languages

Syntax of Pandas Dataframe

pandas.DataFrame(data=None, index=None, columns=None)

Parameter of Pandas Dataframe

  • data: The input data, for example a numpy array or a dictionary.
  • index: The row labels for the resulting dataframe. If omitted, the rows are numbered from 0.
  • columns: The column labels for the resulting dataframe. If omitted, the columns are numbered from 0.
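
As a rough illustration of the three parameters, the sketch below builds the same small table three ways. The names values, default_df, labeled_df, and dict_df are made up for this example.

```python
import numpy as np
import pandas as pd

values = np.array([[10, 20], [30, 40]])

# data only: row and column labels default to 0, 1, ...
default_df = pd.DataFrame(values)

# data with explicit row labels (index) and column labels (columns)
labeled_df = pd.DataFrame(values, index=["r1", "r2"], columns=["x", "y"])

# data given as a dictionary: the keys become the column labels
dict_df = pd.DataFrame({"x": [10, 30], "y": [20, 40]})

print(default_df)
print(labeled_df)
print(dict_df)
```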

Steps to Convert Numpy array to Pandas Dataframe

  1. Import the modules: pandas and numpy.
  2. Create the numpy array.
  3. Create the list of index values and column values for the DataFrame.
  4. Then, create the dataframe.
  5. At last, display the dataframe.
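
The five steps above can be sketched as one short script. The array values and the labels (a, b, c, first, second) are placeholders chosen for this sketch only.

```python
# Step 1: import the modules
import numpy as np
import pandas as pd

# Step 2: create the numpy array
arr = np.array([[1.5, 2.5], [3.5, 4.5], [5.5, 6.5]])

# Step 3: create the lists of index (row) labels and column labels
index_labels = ["a", "b", "c"]
column_labels = ["first", "second"]

# Step 4: create the dataframe
df = pd.DataFrame(arr, index=index_labels, columns=column_labels)

# Step 5: display the dataframe
print(df)
```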

Various Examples to Convert Numpy array to Pandas Dataframe

Let us look at several ways to convert a numpy array to a pandas dataframe, each explained with an example:

1. Using numpy array from random.rand method to Convert Numpy array to Pandas Dataframe

In this example, we will generate the input numpy array with the random.rand() function in numpy and then apply the dataframe syntax to convert it to a pandas dataframe.

#import numpy and pandas module
import numpy as np 
import pandas as pd 
  
arr = np.random.rand(4, 4) 
print("Numpy array : ",arr ) 
  
# conversion into dataframe 
df = pd.DataFrame(arr, columns =['A', 'B', 'C', 'D']) 
print("\nPandas DataFrame: ")
print(df)

Output:

Numpy array :  [[0.93845309 0.89059495 0.51480681 0.06583541]
 [0.94972596 0.55147651 0.40720578 0.86422873]
 [0.53556404 0.7760867  0.80657461 0.37336038]
 [0.21177783 0.90187237 0.53926327 0.06067915]]
Pandas DataFrame:
          A         B         C         D
0  0.938453  0.890595  0.514807  0.065835
1  0.949726  0.551477  0.407206  0.864229
2  0.535564  0.776087  0.806575  0.373360
3  0.211778  0.901872  0.539263  0.060679

Explanation:

Here, we first import the numpy and pandas modules. Next, we generate a 4 x 4 input array with the random.rand() function from the numpy module and print it. Then we pass the array to pd.DataFrame() and label the columns A to D. Because we do not pass any row labels, pandas numbers the rows by default, starting from 0; the same would happen to the columns if we left them out. Finally, we print the dataframe. The output shows the input array followed by the converted dataframe. Since the values are random, your numbers will differ from the ones shown.
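
If you want the same values on every run and want to see the default labels, one option is a seeded generator. This is a sketch that swaps random.rand() for numpy's Generator API (np.random.default_rng), which the original example does not use; the seed value 0 is arbitrary.

```python
import numpy as np
import pandas as pd

# a seeded generator produces the same "random" values on every run
rng = np.random.default_rng(seed=0)
arr = rng.random((4, 4))

# no index or columns passed: both default to 0, 1, 2, 3
df = pd.DataFrame(arr)
print(df.columns.tolist())
print(df.index.tolist())
```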

2. Using numpy array with random.rand and reshape()

In this example, we will generate the input with random.rand() and give it a new shape with reshape(). Then we will apply the dataframe syntax with column labels and print the converted dataframe.

#import module: numpy and pandas
import numpy as np 
import pandas as pd 
  
arr = np.random.rand(6).reshape(2, 3) 
print("Numpy array : " ,arr) 
  
# converting into dataframe 
df = pd.DataFrame(arr, columns =['1', '2', '3']) 
print("\nPandas DataFrame: ") 
print(df)

Output:

Numpy array :  [[0.05949315 0.66499294 0.39795918]
 [0.93026286 0.42710097 0.70753262]]
Pandas DataFrame: 
          1         2         3
0  0.059493  0.664993  0.397959
1  0.930263  0.427101  0.707533

Explanation:

Here, we first import the numpy and pandas modules. Next, we generate six random values with random.rand(), reshape them into a 2 x 3 array with reshape(), and print the array. Then we pass the array to pd.DataFrame() and label the columns 1 to 3. Because we do not pass any row labels, pandas numbers the rows by default, starting from 0. Finally, we print the dataframe. The output shows the input array followed by the converted dataframe.
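
The number of column labels has to match the number of columns in the reshaped array. The sketch below uses np.arange() instead of random values so the result is predictable; the flag mismatch_raised is a name invented for this example.

```python
import numpy as np
import pandas as pd

arr = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns

# three labels for three columns: works
df = pd.DataFrame(arr, columns=["1", "2", "3"])
print(df)

# two labels for three columns: pandas raises a ValueError
try:
    pd.DataFrame(arr, columns=["1", "2"])
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```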

3. Using np.array() to Convert Numpy array to Pandas Dataframe

In this example, we will create the input with np.array() and then convert the numpy array to a pandas dataframe through the dataframe syntax.

#import module numpy and pandas
import numpy as np 
import pandas as pd   
  
arr = np.array([[1, 2], [3, 4]]) 
print("Numpy array : ",arr) 
  
# converting into dataframe 
df = pd.DataFrame(data = arr, index =["row1", "row2"],  
                  columns =["col1", "col2"]) 
  
print("\nPandas DataFrame: ") 
print(df)

Output:

Numpy array :  [[1 2]
 [3 4]]
Pandas DataFrame: 
      col1  col2
row1     1     2
row2     3     4

Explanation:

Here, we first import the numpy and pandas modules. Next, we create the input array with the np.array() function from the numpy module and print it. Then we pass the array to pd.DataFrame(), labeling the rows row1 and row2 and the columns col1 and col2. If we don’t set the row and column labels, they are numbered by default, starting from 0. Finally, we print the dataframe. The output shows the input array followed by the converted dataframe.
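
Once the labels are in place, values can be looked up by label, and the dataframe can be turned back into a plain array. This is a small sketch; the variable name back is illustrative.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])
df = pd.DataFrame(arr, index=["row1", "row2"], columns=["col1", "col2"])

# label-based lookup: row "row2", column "col1"
print(df.loc["row2", "col1"])

# back to a plain numpy array; the row and column labels are dropped
back = df.to_numpy()
print(back)
```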

4. Creating an empty (NaN-filled) dataframe

In this example, we will show how to create a dataframe that holds no real data yet, with every cell set to NaN, and then print it.

#import pandas module and numpy module
import pandas as pd
import numpy as np

df = pd.DataFrame(np.nan, index=[0,1,2], columns=['A'])
print(df)

Output:

   A
0 NaN
1 NaN
2 NaN

Explanation:

Here, we first import the numpy and pandas modules. Next, we apply the dataframe syntax without an input array from the numpy module. Instead, we pass np.nan as the data, which fills every cell with NaN (Not a Number), the marker for a missing value; NaN is not the same as 0. The rows are labeled 0, 1, 2, and the single column is labeled A. If we don’t set the row and column labels, they are numbered by default, starting from 0. Finally, we print the dataframe. The output shows a dataframe with three rows that contain only NaN.
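
To make the difference between NaN and 0 concrete, here is a short sketch. Replacing NaN with 0 via fillna() is shown as one possible follow-up step, not as part of the original example; the name filled is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=[0, 1, 2], columns=["A"])

# NaN marks a missing value; it does not compare equal to 0
print(np.nan == 0)
print(df["A"].isna().all())

# replacing the missing values with 0 is an explicit, separate step
filled = df.fillna(0)
print(filled)
```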

5. Generating rows and columns through iteration

In this example, we will generate the row labels and the column headers through iteration.

#import module: numpy and pandas
import pandas as pd 
import numpy as np 
  
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# generate the row and column labels from the array's shape
df = pd.DataFrame(data=arr,
                  index=['Row_' + str(i + 1)
                         for i in range(arr.shape[0])],
                  columns=['Column_' + str(i + 1)
                           for i in range(arr.shape[1])])
  
print(df) 

Output:

       Column_1  Column_2  Column_3
Row_1         1         2         3
Row_2         4         5         6

Explanation:

Here, we first import the numpy and pandas modules. Next, we create the input array with the np.array() function from the numpy module. Then we pass the array to pd.DataFrame() and generate the row and column labels by iterating over the array's dimensions with for loops inside list comprehensions. If we don’t set the row and column labels, they are numbered by default, starting from 0. Finally, we print the dataframe. The output shows the converted dataframe with the generated labels.

6. Generating Rows And Columns before converting into dataframe

In this example, we will take the input from a numpy array. Then we will build the row labels and the column headers separately, and after that pass them to the dataframe syntax.

#import module: numpy and pandas
import pandas as pd 
import numpy as np 
  
arr = np.array([[1, 2, 3],  
                       [4, 5, 6]]) 

index = ['Row_' + str(i)  
        for i in range(1, len(arr) + 1)] 
  
columns = ['Column_' + str(i)  
          for i in range(1, len(arr[0]) + 1)] 

df = pd.DataFrame(arr ,  
                        index = index, 
                        columns = columns) 
 
print(df) 

Output:

       Column_1  Column_2  Column_3
Row_1         1         2         3
Row_2         4         5         6

Explanation:

Here, we first import the numpy and pandas modules. Next, we create the input array with the np.array() function from the numpy module. Then we build the row and column labels by iterating with for loops and store them in the variables index and columns. After that, we pass the array to pd.DataFrame() together with the labels defined beforehand. If we don’t set the row and column labels, they are numbered by default, starting from 0. Finally, we print the dataframe. The output shows the converted dataframe with the generated labels.
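
If you need this pattern for arrays of different shapes, the label generation could be wrapped in a small helper. The function labeled_frame and its parameters row_prefix and col_prefix are hypothetical names for this sketch, not part of pandas; it assumes a two-dimensional input array.

```python
import numpy as np
import pandas as pd

def labeled_frame(arr, row_prefix="Row_", col_prefix="Column_"):
    """Wrap a 2-D array in a DataFrame with generated labels starting at 1."""
    n_rows, n_cols = arr.shape
    index = [f"{row_prefix}{i}" for i in range(1, n_rows + 1)]
    columns = [f"{col_prefix}{j}" for j in range(1, n_cols + 1)]
    return pd.DataFrame(arr, index=index, columns=columns)

arr = np.array([[1, 2, 3],
                [4, 5, 6]])
df = labeled_frame(arr)
print(df)
```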

Conclusion

In this tutorial, we have discussed how to create a pandas dataframe from a numpy array and how to write programs that perform the conversion. All the examples are explained in detail for a better understanding. You can use any of the programs as per your requirement.

The post How to Convert Numpy Array to Pandas Dataframe appeared first on Python Pool.


