How to Get the Max Element of a Pandas DataFrame
A DataFrame is a data structure that represents a special kind of two-dimensional array, built on top of multiple Series objects. These are the central data structures of Pandas, an extremely popular and powerful data analysis framework for Python.
If you're not already familiar with DataFrames and how they work, read our Guide to DataFrames.
DataFrames can label their rows and columns, and in that sense they represent tables.
Let's import Pandas and create a DataFrame from a dictionary:
import pandas as pd
df_data = {
    "column1": [24, 9, 20, 24],
    "column2": [17, 16, 201, 16]
}
df = pd.DataFrame(df_data)
print(df)
Pandas integrates well with Python, so we can easily create DataFrames from dictionaries. The df we've constructed now contains the columns and their respective values:
   column1  column2
0       24       17
1        9       16
2       20      201
3       24       16
Each column has a list of elements, and we can search for the maximum element of each column, each row, or the entire DataFrame.
Find Maximum Element in Pandas DataFrame's Column
To find the maximum element of each column, we call the max() method of the DataFrame class, which returns a Series of column names and their largest values:
max_elements = df.max()
print(max_elements)
This will give us the max value for each column of our df, as expected:
column1 24
column2 201
dtype: int64
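Because the result is an ordinary Series indexed by column name, an individual maximum can be read back by its label. A minimal sketch, reusing the sample data from above (the name column2_max is only illustrative):

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "column1": [24, 9, 20, 24],
    "column2": [17, 16, 201, 16],
})

max_elements = df.max()  # Series indexed by column name

# Look up one column's maximum by its label
column2_max = max_elements["column2"]
print(column2_max)

# Convert to a plain dictionary if that is more convenient downstream
print(max_elements.to_dict())
```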
However, to find the max element of a single column, you first isolate it and then call the max() method on that specific Series:
max_element = df['column1'].max()
print(max_element)
24
Find Maximum Element in Pandas DataFrame's Row
Finding the max element of each DataFrame row relies on the max() method as well, but we set the axis argument to 1.
The default value for the axis argument is 0. If axis equals 0, the max() method will find the max element of each column. On the other hand, if axis equals 1, max() will find the max element of each row.
max_elements = df.max(axis=1)
print(max_elements)
This will give us the max value for each row of our df, as expected:
0 24
1 16
2 201
3 24
dtype: int64
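The axis argument also accepts string names, which some readers find easier to remember than 0 and 1. A short sketch, assuming the same sample data, showing that axis="columns" behaves like axis=1 (the variable names are only illustrative):

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "column1": [24, 9, 20, 24],
    "column2": [17, 16, 201, 16],
})

# axis=1 and axis="columns" both reduce across the columns,
# producing one maximum per row
by_number = df.max(axis=1)
by_name = df.max(axis="columns")

print(by_number.equals(by_name))
```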
Alternatively, if you'd like to search through a specific row, you can access it via iloc[]:
print(df)
# iloc[] is position-based, so iterate over row positions
for row in range(len(df)):
    print(f'Max element of row {row} is:', max(df.iloc[row]))
We've printed the df for reference to make it easier to verify the results, and then obtained the max element of each row through iloc[]:
   column1  column2
0       24       17
1        9       16
2       20      201
3       24       16
Max element of row 0 is: 24
Max element of row 1 is: 16
Max element of row 2 is: 201
Max element of row 3 is: 24
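When only one row is of interest, the loop is not needed. A minimal sketch, assuming the same sample data and that we want the row at position 2 (the name row_max is only illustrative):

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "column1": [24, 9, 20, 24],
    "column2": [17, 16, 201, 16],
})

# iloc[2] selects the third row by position and returns it as a Series,
# so the Series max() method can be called on it directly
row_max = df.iloc[2].max()
print(row_max)
```

Since iloc[] selects by position rather than by label, this works the same way even if the DataFrame has a custom index.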
Find Maximum Element in Entire Pandas DataFrame
Finally, we can take a look at how to find the max element in an entire DataFrame.
Based on what we've previously seen, this should be pretty simple. We'll just use Python's built-in max() function and pass it one of the two previously created Series of max elements, either for all rows or for all columns. These are two views of the same data, so both give the same result.
This should give us the single highest value in the entire df:
max_by_columns = df.max()
max_by_rows = df.max(axis=1)
df_max = max(max_by_columns)
print("Max element based on the list of columns:", df_max)
df_max2 = max(max_by_rows)
print("Max element based on the list of rows:", df_max2)
This will output:
Max element based on the list of columns: 201
Max element based on the list of rows: 201
This is both expected and correct! The largest of the per-row maxima must equal the largest of the per-column maxima, and both must equal the max element of the entire DataFrame.
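As a sketch of two more direct alternatives, assuming the same sample data: chaining max() twice reduces the columns first and then the resulting Series, while to_numpy() hands the values to NumPy, whose max() reduces over all elements at once. Both should agree with the result above (the variable names are only illustrative).

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "column1": [24, 9, 20, 24],
    "column2": [17, 16, 201, 16],
})

# First max(): one maximum per column; second max(): maximum of those
chained_max = df.max().max()

# Convert to a NumPy array and take the maximum over every element
numpy_max = df.to_numpy().max()

print(chained_max, numpy_max)
```

The chained form stays entirely within Pandas, while the NumPy form assumes all columns hold comparable numeric values.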
Conclusion
In this short tutorial, we've taken a look at how to find the maximum element of a Pandas DataFrame, for columns, rows and the entire DataFrame instance.