To rename the columns of a dataframe in pandas, you can either assign new names to the .columns attribute or use the rename() method.
Consider this code –
>>> import pandas as pd
>>> superHeroDF = pd.DataFrame({'marvel':["Ironman", "Thor"], 'dc': ["Superman", "Batman"]})
>>> superHeroDF
    marvel        dc
0  Ironman  Superman
1     Thor    Batman
In this code we created a pandas dataframe with two columns – marvel and dc. The marvel column holds two superheroes, Ironman and Thor, while dc holds Superman and Batman. Suppose we want to rename the columns marvel and dc to avenger and justice league. We can do it in the following ways –
1. Assigning new columns to attribute .columns
You can simply use the attribute .columns like this –
>>> superHeroDF.columns = ["avenger", "justice league"]
>>> superHeroDF
   avenger justice league
0  Ironman       Superman
1     Thor         Batman
There is a drawback with this approach: you need to provide all the column names. Even if you want to change only a single column, you still have to list every other column too, in the right order.
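The drawback can be seen in a small sketch. The three-column dataframe and the year column below are made up for illustration and are not part of the example above. The sketch rebuilds the full list of names while changing only one of them, and then shows that pandas rejects a list that is shorter than the number of columns.

```python
import pandas as pd

# Hypothetical three-column dataframe, used only for this illustration.
df = pd.DataFrame({"marvel": ["Ironman"], "dc": ["Superman"], "year": [2008]})

# Every name must be supplied, so rebuild the whole list and change one entry.
df.columns = ["avenger" if name == "marvel" else name for name in df.columns]
print(list(df.columns))

# A list with the wrong number of names raises ValueError.
try:
    df.columns = ["avenger", "dc"]
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

The list comprehension keeps the original order, which matters because .columns assigns names purely by position.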
2. Copying the columns, changing the required one and reassigning
In this approach we first copy the current column names into a list, then change the required name and assign the list back to the dataframe. Use .columns.tolist() for the copy; .columns.values exposes the array that backs the index, so editing it would modify the index in place. Starting again from the original marvel and dc columns, look at this code –
>>> superHeroColumns = superHeroDF.columns.tolist()
>>> superHeroColumns[0] = "avenger"
>>> superHeroDF.columns = superHeroColumns
>>> superHeroDF
   avenger        dc
0  Ironman  Superman
1     Thor    Batman
Notice that only one column changed: dc is untouched, while marvel is now avenger.
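The copy, change and reassign steps can be wrapped in a small helper. This is only a sketch: rename_by_position is a hypothetical name, not a pandas function, and it works on a copy of the dataframe so the original keeps its column names.

```python
import pandas as pd


def rename_by_position(df, position, new_name):
    """Return a copy of df whose column at `position` is named `new_name`."""
    names = df.columns.tolist()  # independent Python list of the current names
    names[position] = new_name   # change only the required entry
    result = df.copy()           # leave the caller's dataframe untouched
    result.columns = names
    return result


heroes = pd.DataFrame({"marvel": ["Ironman", "Thor"], "dc": ["Superman", "Batman"]})
renamed = rename_by_position(heroes, 0, "avenger")
print(list(renamed.columns))
print(list(heroes.columns))
```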
3. Using rename() function
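A better approach is the rename() method, which takes a mapping from old names to new names and changes only the columns you list. There are 4 ways to use it –
columns parameter (the original way) – Pass the mapping of old names to new names through the columns parameter. Check the code –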
>>> superHeroDF = superHeroDF.rename(columns={'marvel': 'avenger'})
>>> superHeroDF
   avenger        dc
0  Ironman  Superman
1     Thor    Batman
axis=1 – In pandas, axis 0 refers to the rows (the index) and axis 1 to the columns. Since we want to change column names, we can pass the mapping as the first argument together with axis=1 instead of using the columns parameter. Check out the code –
>>> superHeroDF = superHeroDF.rename({'marvel': 'avenger'}, axis=1)
>>> superHeroDF
   avenger        dc
0  Ironman  Superman
1     Thor    Batman
axis='columns' – You may also pass the axis by name. Here we use axis='columns', which is equivalent to axis=1. This code makes it clear –
>>> superHeroDF = superHeroDF.rename({'marvel': 'avenger'}, axis='columns')
>>> superHeroDF
   avenger        dc
0  Ironman  Superman
1     Thor    Batman
inplace – With inplace=True, rename() modifies the dataframe directly and returns None, so there is no need to assign the result back. Here is the code –
>>> superHeroDF.rename({'marvel': 'avenger'}, axis='columns', inplace=True)
>>> superHeroDF
   avenger        dc
0  Ironman  Superman
1     Thor    Batman
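The rename() variants above can be compared side by side. The sketch below uses ordinary variable names chosen for illustration. It also shows one behaviour worth knowing: by default rename() silently ignores keys that do not match any column, while errors='raise' turns such a typo into a KeyError.

```python
import pandas as pd

heroes = pd.DataFrame({"marvel": ["Ironman", "Thor"], "dc": ["Superman", "Batman"]})

# The columns parameter and the axis forms produce the same result.
via_columns = heroes.rename(columns={"marvel": "avenger"})
via_axis_number = heroes.rename({"marvel": "avenger"}, axis=1)
via_axis_name = heroes.rename({"marvel": "avenger"}, axis="columns")

# Without inplace=True the original dataframe keeps its names.
print(list(heroes.columns))

# A misspelled key is ignored by default, so nothing is renamed.
unchanged = heroes.rename(columns={"marvle": "avenger"})

# errors="raise" makes pandas report the unknown key instead.
try:
    heroes.rename(columns={"marvle": "avenger"}, errors="raise")
    typo_detected = False
except KeyError:
    typo_detected = True
print(typo_detected)
```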
4. Using set_axis() function
Pandas also provides the set_axis() method, which replaces all the labels of an axis. As with the .columns attribute, you need to provide every column name. Older pandas versions accepted inplace=True here, but that parameter was removed in pandas 2.0, so assign the result back instead. Code –
>>> superHeroDF = superHeroDF.set_axis(['avenger', 'dc'], axis=1)
>>> superHeroDF
   avenger        dc
0  Ironman  Superman
1     Thor    Batman
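As a minimal sketch of set_axis() without the inplace parameter, the snippet below assigns the result to a new variable and checks that the original dataframe is left as it was. The variable names are illustrative only.

```python
import pandas as pd

heroes = pd.DataFrame({"marvel": ["Ironman", "Thor"], "dc": ["Superman", "Batman"]})

# set_axis returns a new dataframe carrying the full list of new labels.
relabelled = heroes.set_axis(["avenger", "justice league"], axis=1)

print(list(relabelled.columns))
print(list(heroes.columns))  # the original names are still in place
```

Because set_axis() returns a dataframe, it can be used in the middle of a method chain, which plain assignment to .columns cannot.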
5. Using lambda function
If your column names follow a pattern that can be fixed by a formula, you can pass a function, such as a lambda, to rename() instead of a dictionary. Pandas calls the function on each column name and uses the return value as the new name. Check this code –
>>> superHeroDF = pd.DataFrame({'$$marvel':["Ironman", "Thor"], '$$dc': ["Superman", "Batman"]})
Here our columns are $$marvel and $$dc. Suppose we want to remove $$ from every column name, so that the columns become marvel and dc. Since the same change applies to every name, we can use a lambda.
>>> superHeroDF.rename(columns=lambda x: x.replace('$$', ''), inplace=True)
>>> superHeroDF
    marvel        dc
0  Ironman  Superman
1     Thor    Batman
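Any callable that takes a name and returns a name works here, not only a lambda. The sketch below, with variable names chosen for illustration, repeats the lambda, passes the built-in str.upper directly, and shows an alternative that works on the column index as a whole through its .str accessor.

```python
import pandas as pd

heroes = pd.DataFrame({"$$marvel": ["Ironman", "Thor"], "$$dc": ["Superman", "Batman"]})

# A lambda applied to each column name.
stripped = heroes.rename(columns=lambda name: name.replace("$$", ""))

# An existing function can be passed as is.
shouting = stripped.rename(columns=str.upper)

# Alternative: transform all names at once. regex=False treats "$$" as
# literal text rather than as a regular expression.
vectorised = heroes.copy()
vectorised.columns = vectorised.columns.str.replace("$$", "", regex=False)

print(list(stripped.columns))
print(list(shouting.columns))
print(list(vectorised.columns))
```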