Sunday, September 20, 2020

ListenData: How to rename columns in pandas dataframe

In this tutorial, we will cover various methods to rename columns in a pandas dataframe in Python. Renaming columns is one of the most common data wrangling tasks. If you do not come from a programming background and have worked only in Excel spreadsheets, this may seem harder in Python, since in MS Excel you rename a column simply by typing the new name into the header cell. If you come from a database background, it is similar to a column alias in SQL. Python has a popular data manipulation package called pandas that simplifies these kinds of data operations.
2 Methods to rename columns in Pandas
pandas offers two simple ways to rename columns.

The first step is to install the pandas package if it is not already installed. You can check whether it is installed on your machine by running !pip show pandas in the IPython console. If it is not installed, install it with the command !pip install pandas.

Import Dataset for practice

To import the dataset, we use the read_csv( ) function from the pandas package.

import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/JackyP/testing/master/datasets/nycflights.csv", usecols=range(1,17))
To see the column names of a dataframe, run the command below:
df.columns
Index(['year', 'month', 'day', 'dep_time', 'dep_delay', 'arr_time',
'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
'air_time', 'distance', 'hour', 'minute'],
dtype='object')
Method I : rename() function
Suppose you want to rename the column year to years. The code below creates a new dataframe named df2 that has the new column name and the same values.
df2 = df.rename(columns={'year':'years'})
If you want to change the existing dataframe df itself, use the option inplace = True.
df.rename(columns={'year':'years'}, inplace = True)
The default is inplace = False, so you must set this option to True explicitly. To rename multiple columns, add more old name and new name pairs to the dictionary, separated by commas.
df.rename(columns={'year':'years', 'month':'months' }, inplace = True)
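The behaviour of rename() described above can be tried on a small dataframe. This is a minimal sketch: the dataframe flights and its values are made up for illustration (they are not the nycflights data), and the errors="raise" option assumes a reasonably recent pandas version (0.25 or later).

```python
import pandas as pd

# Small stand-in dataframe (made-up values) so the example runs without downloading the CSV.
flights = pd.DataFrame({"year": [2013, 2013], "month": [1, 2], "day": [1, 15]})

# rename() returns a new dataframe by default; flights itself is not modified.
renamed = flights.rename(columns={"year": "years", "month": "months"})
print(list(flights.columns))  # ['year', 'month', 'day']
print(list(renamed.columns))  # ['years', 'months', 'day']

# Keys that match no column are ignored by default, so a typo goes unnoticed.
unchanged = flights.rename(columns={"yaer": "years"})
print(list(unchanged.columns))  # ['year', 'month', 'day']

# errors="raise" turns the same typo into a KeyError.
try:
    flights.rename(columns={"yaer": "years"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True
print(typo_detected)  # True
```

Because a misspelled key is silently skipped by default, it may be worth checking df.columns after a rename, or passing errors="raise" while developing.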
Method II : dataframe.columns = [list]
You can also assign a list of new column names to df.columns. The list must contain a name for every column, in the existing order, including the columns you are not renaming. The example below renames the year and month columns.
df.columns = ['years', 'months', 'day', 'dep_time', 'dep_delay', 'arr_time',
'arr_delay', 'carrier', 'tailnum', 'flight', 'origin', 'dest',
'air_time', 'distance', 'hour', 'minute']
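Typing out every column name by hand is error prone on wide dataframes. One possible alternative, sketched below on a made-up three-column dataframe, is to build the new list from the existing names. The sketch also shows that pandas rejects a list whose length does not match the number of columns.

```python
import pandas as pd

# Stand-in dataframe with made-up values; the real dataset has 16 columns.
flights = pd.DataFrame({"year": [2013], "month": [1], "day": [1]})

# Build the new list from the current names so order and length always match.
new_names = {"year": "years", "month": "months"}
flights.columns = [new_names.get(col, col) for col in flights.columns]
print(list(flights.columns))  # ['years', 'months', 'day']

# A list with the wrong number of names raises a ValueError.
try:
    flights.columns = ["years", "months"]  # only 2 names for 3 columns
    length_checked = False
except ValueError:
    length_checked = True
print(length_checked)  # True
```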
Rename columns having pattern
Suppose you want to rename the columns that have an underscore '_' in their names by removing the underscore.
df.columns = df.columns.str.replace('_' , '')
The new column names are as follows. Notice that none of them contains an underscore.
  Index(['year', 'month', 'day', 'deptime', 'depdelay', 'arrtime', 'arrdelay',
'carrier', 'tailnum', 'flight', 'origin', 'dest', 'airtime', 'distance',
'hour', 'minute'],
dtype='object')
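The same pattern-based renaming can be checked on an empty dataframe that only carries column names. In this sketch the dataframe is a stand-in, and regex=False is passed as an assumption about intent: it makes pandas treat '_' as plain text rather than a regular expression, which matters once the pattern contains characters such as '.' or '('.

```python
import pandas as pd

# Empty stand-in dataframe; only the column names matter here.
flights = pd.DataFrame(columns=["year", "dep_time", "arr_delay"])

# Remove every underscore, treating the pattern as a literal string.
flights.columns = flights.columns.str.replace("_", "", regex=False)
print(list(flights.columns))  # ['year', 'deptime', 'arrdelay']
```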
Add prefix / suffix in column names
If you want to add some text before or after the existing column names, use the add_prefix( ) and add_suffix( ) methods.
df = df.add_prefix('V_')
df = df.add_suffix('_V')
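As a quick check of what these two methods return, the sketch below uses a made-up two-column dataframe. Both methods return a new dataframe, so running the two lines above one after the other on the same df yields names with both the prefix and the suffix.

```python
import pandas as pd

# Stand-in dataframe with made-up values.
flights = pd.DataFrame({"year": [2013], "month": [1]})

prefixed = flights.add_prefix("V_")   # new dataframe, flights is unchanged
suffixed = flights.add_suffix("_V")   # new dataframe, flights is unchanged
both = flights.add_prefix("V_").add_suffix("_V")

print(list(prefixed.columns))  # ['V_year', 'V_month']
print(list(suffixed.columns))  # ['year_V', 'month_V']
print(list(both.columns))      # ['V_year_V', 'V_month_V']
```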
How to access columns having space in names
For demonstration purposes, we can put spaces in some column names by using df.columns = df.columns.str.replace('_' , ' '). You can then access such a column with the syntax df["column name"].
df["arr delay"]
How to change row names
With the index option, you can rename rows (that is, index labels). The code below changes the row labels 0 and 1 to 'First' and 'Second' in dataframe df by passing a dictionary whose keys are the existing row labels and whose values are the new ones.
df.rename(index={0:'First',1:'Second'}, inplace=True)
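The effect on the index can be seen in a short sketch with a made-up three-row dataframe. Rows whose labels are not in the dictionary keep their original label, and the renamed rows can then be selected with .loc.

```python
import pandas as pd

# Stand-in dataframe with the default integer index 0, 1, 2.
flights = pd.DataFrame({"year": [2013, 2014, 2015]})

# Rename only the first two row labels; label 2 is left as it is.
relabeled = flights.rename(index={0: "First", 1: "Second"})
print(list(relabeled.index))  # ['First', 'Second', 2]

# Look up a row by its new label.
second_year = relabeled.loc["Second", "year"]
print(second_year)  # 2014
```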


from Planet Python
