Recommand · May 14, 2022 0

Dataframe new columns to tell if the row contains column's header text

2 columns dataframe as the first screenshot. I want to add new columns (by the contents in the Note column from the original dataframe) to tell if the Note column contains the new column’s header text.

Example as the second screenshot.

enter image description here

Some lines work for a few columns. When there are a lot of new columns, it’s not efficient.

What would be the good way to do so? Thank you.

import pandas as pd
from io import StringIO

csvfile = StringIO(
Mike\tBright, Kind
Kate\tConsiderate, energetic
John\tReliable, friendly

df = pd.read_csv(csvfile, sep = '\t', engine='python')

col_list =  df['Note'].tolist()

n_list = []
for c in col_list:
    for _ in c.split(','):

df = df.assign(**dict.fromkeys(n_list, ''))
df["Bright"][df['Note'].str.contains("Bright")] = "Yes"

You can try .str.get_dummies then replace 1 with Yes

df = df.join(df['Note'].str.get_dummies(', ').replace({1: 'Yes', 0: ''}))

   Name                    Note Bright Considerate Friendly Kind Reliable energetic friendly
0  Mike            Bright, Kind    Yes                       Yes
1  Lily                Friendly                         Yes
2  Kate  Considerate, energetic                Yes                              Yes
3  John      Reliable, friendly                                       Yes                Yes
4   Ale                  Bright    Yes

You may also like…

amazon-web-services android angular api arrays c# css dart dataframe django docker excel express firebase flutter html ios java javascript jquery json kotlin laravel linux list mongodb mysql node.js pandas php postgresql python python-3.x r react-native reactjs regex spring spring-boot sql sql-server string swift typescript vue.js