Recommand · June 9, 2021

Remove values from a pandas DataFrame and move the remaining values upwards

I have a dataframe with categorical data in it.

I have come up with a procedure that keeps only the desired categories and moves the remaining values up into the cells left empty by the deleted ones.

But I want to do it without the intermediate lists, if possible.

My workaround:

import pandas as pd
mydf = pd.DataFrame(data={'a': [6, 3, 8, 5], 'b': [4, 3, 5, 6], 'c': [5, 3, 6, 9]})
selecList = [5, 8, 4, 6]  # only these categories shall remain

mydf

   a  b  c
0  6  4  5
1  3  3  3
2  8  5  6
3  5  6  9

myList = mydf.T.values.tolist()
myList
[[6, 3, 8, 5], [4, 3, 5, 6], [5, 3, 6, 9]]

filtered_list = [[x for x in y if x in selecList] for y in myList]
filtered_list
[[6, 8, 5], [4, 5, 6], [5, 6]]


filtered_df = pd.DataFrame(filtered_list).T
filtered_df.columns = list(mydf)
filtered_df = filtered_df.astype('Int64')
filtered_df

    a   b   c
0   6   4   5
1   8   5   6
2   5   6   <NA>

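Here is an alternative that avoids the intermediate lists. It masks every value that is not in selecList and drops the rows that become entirely empty. For the example above this returns the same values (with the original index labels 0, 2, 3), but in general a removed value leaves a NaN in place whenever its row still holds other kept values:

mydf.where(mydf.isin(selecList)).dropna(how='all')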
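
If the kept values really have to move up within each column, one possible pandas-only sketch is to mask the unwanted values, then drop the NaN entries column by column and renumber each column from 0. The helper name keep_and_shift is made up for this illustration and is not a pandas function; mydf and selecList are the ones from the question.

```python
import pandas as pd

mydf = pd.DataFrame({'a': [6, 3, 8, 5], 'b': [4, 3, 5, 6], 'c': [5, 3, 6, 9]})
selecList = [5, 8, 4, 6]  # only these categories shall remain


def keep_and_shift(df, keep):
    """Hypothetical helper: keep only values in `keep`, shifted up per column."""
    # Values not in `keep` become NaN (affected columns turn into floats).
    masked = df.where(df.isin(keep))
    # Per column: drop the NaN entries and renumber from 0, so the remaining
    # values move up. Columns of different length are aligned on the new
    # index, which pads the shorter ones with NaN at the bottom.
    shifted = masked.apply(lambda col: col.dropna().reset_index(drop=True))
    # Nullable integer dtype, so the padding shows up as <NA> instead of
    # forcing float columns.
    return shifted.astype('Int64')


filtered_df = keep_and_shift(mydf, selecList)
print(filtered_df)
```

This should give the same filtered_df as the list-based workaround in the question. The lambda still runs once per column in Python, so for very wide frames it may be worth timing it against the list version.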