Feature Request: Add example using the comparison operator to assign a new boolean column #1

@skilfullycurled

Hello,

First, thank you so much for these great tutorials. There have been a number of warnings about using "just the indexing operator" for quite a while, and the explanations of .loc and .iloc were tremendously helpful.

I'm writing to recommend that you add, in the article on assignment, an example of assigning a new column from a comparison that returns a boolean Series. Take, for example, the following:

criteria = df['some_col'] > some_number
criteria.head()

0     True
1    False
2     True
4     True
6    False

Using just the assignment operator...

df['new_col'] = df['some_col'] > some_number

...works but yields the warning:

Try using .loc[row_indexer,col_indexer] = value instead
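
For anyone reproducing this: a minimal sketch of one common way this warning comes up, assuming df was itself produced by filtering a larger DataFrame (the gaps in the index above would be consistent with that, but it is only a guess). The names raw and keep and the data values are hypothetical. Taking an explicit .copy() makes df independent, so the plain assignment has an unambiguous target.

```python
import pandas as pd

# Hypothetical data; 'raw' and 'keep' are made-up names for illustration.
raw = pd.DataFrame({
    "some_col": [5, 1, 7, 4, 9, 6, 2],
    "keep": [True, True, True, False, True, False, True],
})
some_number = 2

# Filtering and then taking an explicit copy gives an independent DataFrame,
# so adding a column to df cannot be confused with modifying raw.
df = raw[raw["keep"]].copy()

# The comparison returns a boolean Series aligned on df's index.
df["new_col"] = df["some_col"] > some_number
print(df)
```

Whether the warning appears without the .copy() depends on the pandas version and how df was created, so treat this as one possible explanation rather than a diagnosis.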

The closest example I've found in your article is this one:

last_name = pd.Series(data=['Smith', 'Jones', 'Williams', 'Green', 'Brown', 'Simpson', 'Peters'],
                      index=['Tom', 'Niko', 'Penelope', 'Aria', 'Sofia', 'Dean', 'Zach'])
last_name
df['last_name'] = last_name

However, at least in pandas 0.19.2, this still yields the same warning. After searching around a bit, I found this Stack Overflow discussion, which states that as of pandas 0.16.0 the best way to do this is to use the assign method in the following manner:

criteria = df['some_col'] > some_number
df = df.assign(new_col_name=criteria)  # note: no quotes on new_col_name; assign returns a new DataFrame

Which seems to work well for me.
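
Here is a small self-contained sketch of the assign approach, in case it is useful for the article. The data values are made up for illustration. It shows that assign leaves the original DataFrame untouched and returns a new one, and that a column name which is not a valid Python identifier can be passed by unpacking a dict.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"some_col": [5, 1, 7]})
some_number = 2

# assign returns a new DataFrame; df itself is not modified.
result = df.assign(new_col_name=df["some_col"] > some_number)
print(result)

# A name that is not a valid identifier (here it contains a space)
# can be supplied through dict unpacking.
result_spaced = df.assign(**{"new col": df["some_col"] > some_number})
print(result_spaced.columns.tolist())
```

Because the result is a new object, it has to be bound to a name (or back to df) for the new column to be kept.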

Alternatively, I suppose you could simply state which pandas version the tutorial was written for.

Thanks again for this wonderful guide!
