Keep Columns
Tags: #pandas #snippet #datacleaning #operations
Description: This notebook shows how to create a new DataFrame that keeps only the columns listed in the Input section.
Reference: https://www.statology.org/pandas-keep-columns/
import pandas as pd
# Create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [11, 7, 8, 10, 13, 13],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})
df
# List of columns to keep in the DataFrame
# ('blocks' is not a column of df, so it exercises the missing-column check)
to_keep = ['team', 'points', 'blocks']
Selecting a column that does not exist in the DataFrame raises a KeyError, so the function below checks each requested column and leaves out any that are missing.
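To illustrate that failure mode, here is a minimal sketch. The `demo` frame and the `wanted` list are hypothetical names used only for this illustration and are not part of the notebook.

```python
import pandas as pd

# Hypothetical two-column frame used only for this illustration
demo = pd.DataFrame({'team': ['A', 'B'], 'points': [11, 7]})
wanted = ['team', 'blocks']  # 'blocks' is not a column of demo

try:
    demo[wanted]  # selecting a missing label with a list raises KeyError
    raised = False
except KeyError as err:
    raised = True
    print(f"KeyError: {err}")
```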
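def keep_columns(df, to_keep):
    # Collect the requested columns that exist in the DataFrame.
    # A new list is built instead of calling to_keep.pop(i) inside the loop:
    # popping while iterating skips the next item and changes the caller's list.
    existing = []
    for x in to_keep:
        if x in df.columns:
            existing.append(x)
        else:
            print(f"🚨 Column '{x}' does not exist in DataFrame -> skipped!")
    # Select only the columns that were found
    df1 = df[existing]
    return df1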
df1 = keep_columns(df, to_keep)
df1
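As an alternative sketch, pandas' built-in `DataFrame.filter(items=...)` keeps only the labels that exist and ignores the others without raising an error. The names `df_demo`, `requested`, `missing`, and `kept` below are hypothetical and chosen for this example. This is one possible approach, not the method used in the reference above.

```python
import pandas as pd

# Hypothetical small frame used only for this illustration
df_demo = pd.DataFrame({'team': ['A', 'B'],
                        'points': [11, 7],
                        'assists': [5, 7]})
requested = ['team', 'points', 'blocks']  # 'blocks' does not exist in df_demo

# Report the requested columns that are missing, without changing the list
missing = [col for col in requested if col not in df_demo.columns]
if missing:
    print(f"Columns not found and ignored: {missing}")

# filter(items=...) keeps only the requested labels that exist in the frame
kept = df_demo.filter(items=requested)
```

Because `filter` ignores missing labels silently, the explicit `missing` check is what keeps the warning behaviour of `keep_columns`.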