[SNOW-1649172]: Fix loc set when setting DataFrame row with Series value
#2213
# Conflicts: # CHANGELOG.md # src/snowflake/snowpark/modin/pandas/series.py # tests/integ/modin/frame/test_loc.py
@@ -1832,6 +1832,15 @@ def loc():
viper 0 0
sidewinder 0 0

Setting the values with a Series item.
@sfc-gh-helmeleegy this is the example I added
Please describe what the problem is.
Please describe what the issue was.
@sql_count_checker(query_count=1, join_count=3)
def test_df_iloc_full_set_row_from_series_int_and_string_indexes():
you can combine this one into the previous one.
if isinstance(df, pd.DataFrame):
    df.loc[:] = series
else:
    if index == [0, 1, 2]:
you'd better just compare the result with the expected result. These steps here can be confusing.
@@ -1039,6 +1039,9 @@ def __setitem__(
)
if item_is_2d_array:
    item = pd.DataFrame(item)
frame_is_df_and_item_is_series = isinstance(item, pd.Series) and isinstance(
What happens when item is an Index or a list? Does the behavior match pandas? Can you verify that too?
original_index = index
# If `item` is from a Series (rather than a Dataframe), flip the series item values to apply them
# across columns rather than rows.
if frame_is_df_and_item_is_series and (columns == slice(None) or len(columns) > 1):  # type: ignore[arg-type]
Can you wrap it in a function and use the function name to summarize what it does?
What does (columns == slice(None) or len(columns) > 1) mean?
This type: ignore[arg-type] actually indicates that something is wrong. You didn't consider all type cases.
    item, col_len, move_index_to_cols=True
)

if is_scalar(index):
what happens if index is not scalar?
original_index = index
# If `item` is from a Series (rather than a Dataframe), flip the series item values to apply them
# across columns rather than rows.
if frame_is_df_and_item_is_series and (columns == slice(None) or len(columns) > 1):  # type: ignore[arg-type]
This should be done in _set_2d_labels_helper_for_frame_item
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-1649172
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
When doing df.loc[x] = series, an error occurs because the Series does not have the same number of columns as the DataFrame being set. Instead, the Series should be transposed and set as a row, regardless of whether its length equals the number of columns in the DataFrame.
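For reference, here is a minimal sketch of the target behavior in native pandas (not Snowpark pandas). The row labels are borrowed from the loc docstring example; the column names and the values 4 and 5 are assumptions made up for illustration. Native pandas aligns the Series index with the DataFrame columns and writes the values across the selected row:

```python
import pandas as pd

# Hypothetical frame; the column names are illustrative assumptions.
df = pd.DataFrame(
    {"max_speed": [0, 0, 0], "shield": [0, 0, 0]},
    index=["cobra", "viper", "sidewinder"],
)

# The Series index matches the DataFrame's column labels.
row = pd.Series({"max_speed": 4, "shield": 5})

# Native pandas aligns the Series with the columns and sets it as a row,
# which is the behavior the description above says df.loc[x] = series should have.
df.loc["viper"] = row

print(df)
```

Only the "viper" row changes; the other rows keep their original values.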