Subscribe now to Datacamp and save 63% - Subscribe now

18 Pandas Functions to Replace Excel with Python (and be happy forever)

Share this post

Excel is the worst. If you want to live a long and healthy life, drop it.

Excel does nothing that Python can’t do better. Let’s learn a little bit about Python Pandas.

If you drop Excel for Python (and I mean, never use Excel EVER), I guarantee you that you will never regret it.


Subscribe to my Newsletter


Excel to Python

These 18 Pandas functions will help you replace Excel with Pandas.

import pandas as pd
df = pd.read_excel('filename.xlsx') # Read Excel File
df.to_excel('filename.xlsx', index=Fales) # Save Excel File
df.read_csv('filename.csv') # Read CSV File
df.to_csv('filename.csv') # Save CSV File
df.head() # Preview dataframe
df.count() # Count Rows
df.describe() # Get general statistics (min,max,mean,std,...)
df['col_name'] # Select one column
df['col_name'].value_counts() # Get unique value count
df.fillna(0) # Replace Null values
df.dropna() # Remove Null values
df[df['col_name'] == 0] # Filter DataFrame
df.drop_duplicates() # Remove duplicates
df.merge(df2,on='col_name') # vlookup
df.replace() # Find and replace
df.groupby(['col1','col2']).sum() # Pivot/Groupby
df.groupby(['col1','col2']).agg({'col1':'sum','col2','count'})
df.plot() # Plot your DataFrame
df = pd.util.testing.makeDataFrame() # or # pd.get_dummies(pd.Series(list('abcaa'))) # Create dummy dataframe

Import Package

The first step is to import the pandas package into your code.

import pandas as pd

Work With Excel Files

To read and save to Excel files, use pandas.read_excel or pandas.to_excel.

import pandas as pd

df = pd.read_excel('filename.xlsx') # Read Excel File
df.to_excel('filename.xlsx', index=Fales) # Save Excel File

Work With CSV Files

import pandas as pd

df.read_csv('filename.csv') # Read CSV File
df.to_csv('filename.csv') # Save CSV File

Preview DataFrame

To preview you Dataframe, use the head() method.

df.head() # Preview dataframe
df.count() # Count Rows
df.describe() # Get general statistics (min,max,mean,std,...)
df['col_name'] # Select one column
df['col_name'].value_counts() # Get unique value count
df.fillna(0) # Replace Null values
df.dropna() # Remove Null values
df[df['col_name'] == 0] # Filter DataFrame
df.drop_duplicates() # Remove duplicates
df.merge(df2,on='col_name') # vlookup
df.groupby(['col1','col2']).sum() # Pivot/Groupby
df.groupby(['col1','col2']).agg({'col1':'sum','col2','count'})
df.plot() # Plot your DataFrame
df = pd.get_dummies(pd.Series(list('abcaa'))) # Create dummy dataframe

Get Statistics

df.head() # Preview dataframe
df.count() # Count Rows
df.describe() # Get general statistics (min,max,mean,std,...)
df['col_name'].value_counts() # Get unique value count

Work with DataFrame

df['col_name'] # Select one column
df.fillna(0) # Replace Null values
df.dropna() # Remove Null values
df[df['col_name'] == 0] # Filter DataFrame
df.drop_duplicates() # Remove duplicates

Replace Vlookup With Pandas merge

To replace vlookup using pandas, use the pandas.DataFrame.merge method.

df.merge(df2,on='col_name') # vlookup

Both Data frames will need to have a common “key”. In the example above, both df and df2 should have a column named “col_name“.

You Might Also Like  SEO Split-Testing Experiments Using Google Tag Manager

Pandas .replace() vs Excel Find and Replace

Pandas .replace() method is far more efficient that Excel’s search and replace or substitute functionalities.

df.replace('to_replace', 'new_value) # Find and Replace
df.replace(regex=r'^ba.$', value='new') # Allows regex

Replace Pivot Tables with GroupBy

df.groupby(['col1','col2']).sum() # Pivot/Groupby
df.groupby(['col1','col2']).agg({'col1':'sum','col2','count'})

Plot Your DataFrame

df.plot() # Plot your DataFrame

This is it. You can now replace Excel with Python and never look back. If you are hooked, try 30 days of Pandas.