[SIT-2372] Add the DataFrame.map method #2315

Open
wants to merge 6 commits into
base: main


@sfc-gh-lfallasavendano sfc-gh-lfallasavendano commented Sep 18, 2024

Adds the DataFrame.map method, which applies a Python function to every row of a DataFrame.

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SIT-2372

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
  3. Please describe how your code solves the related issue.

    Adds the DataFrame.map method that applies a Python function to every row of a DataFrame.
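For readers who want a concrete picture of "applies a Python function to every row", the following is a minimal plain-Python sketch of that behaviour. It is not the Snowpark implementation from this PR: `map_rows`, the tuple-based rows, and the default `c_1`, `c_2`, ... column names are all assumptions made for illustration, and no UDTF is involved.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def map_rows(
    rows: Sequence[tuple],
    func: Callable[[tuple], tuple],
    output_column_names: Optional[List[str]] = None,
) -> Tuple[List[str], List[tuple]]:
    """Hypothetical helper: apply `func` to each row and name the result columns."""
    # Apply the function once per input row; each result becomes one output row.
    results = [tuple(func(row)) for row in rows]
    width = len(results[0]) if results else 0

    # Assumed default naming scheme, used only when no names are supplied.
    if output_column_names is None:
        output_column_names = [f"c_{i + 1}" for i in range(width)]

    if results and len(output_column_names) != width:
        raise ValueError("output_column_names must match the number of output columns")
    return list(output_column_names), results


names, mapped = map_rows(
    [(1, "a"), (2, "b")],
    lambda row: (row[0] * 10, row[1].upper()),
    output_column_names=["n", "s"],
)
print(names, mapped)
```

The function receives a whole row and returns a whole row, which is why the output needs its own types and column names rather than inheriting them from the input.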


github-actions bot commented Sep 18, 2024

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@sfc-gh-lfallasavendano
Author

I have read the CLA Document and I hereby sign the CLA


@sfc-gh-mvega sfc-gh-mvega left a comment


lgtm

@sfc-gh-yixie
Collaborator

@sfc-gh-aalam could you help review this PR?

self,
func: Callable,
output_types: list[StructType],
output_column_names: Optional[list[str]] = None,
Collaborator


I suggest adding a `*` before this parameter to make the optional parameters keyword-only arguments.
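As a small illustration of the suggestion, a bare `*` in a Python signature forces every parameter after it to be passed by keyword. The sketch below uses a hypothetical stand-in, `map_signature_sketch`, with plain strings in place of Snowpark types; it is not the signature that was merged or proposed in this PR.

```python
from typing import Callable, List, Optional


def map_signature_sketch(
    func: Callable,
    output_types: List[str],
    *,  # everything after this marker must be passed by keyword
    output_column_names: Optional[List[str]] = None,
) -> dict:
    """Hypothetical stand-in that only records how it was called."""
    return {
        "func": func,
        "output_types": list(output_types),
        "output_column_names": output_column_names,
    }


# Passing the optional parameter by keyword is accepted.
call = map_signature_sketch(str, ["string"], output_column_names=["value"])
print(call["output_column_names"])

# Passing it positionally is rejected with a TypeError.
try:
    map_signature_sketch(str, ["string"], ["value"])
except TypeError as exc:
    print("rejected:", exc)
```

Keyword-only optional parameters let a signature grow later without breaking callers that relied on positional order.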


Thanks for your suggestion.

I just removed the extra_packages parameter and added a **kwargs parameter to pass arguments to the UDTF registration (as with other similar methods).

@@ -4188,6 +4189,117 @@ def _explain_string(self) -> str:

return f"{msg}\n--------------------------------------------"

def map(
Collaborator


A UDTF call may have other parameters like partitioning.


As mentioned in the reply to the previous comment, the `**kwargs` parameter can be used to pass arguments to the UDTF registration.
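The forwarding pattern described in this reply can be sketched in plain Python as follows. Both `register_udtf_stub` and `map_sketch` are hypothetical names invented for this illustration, and the `packages` option is only an example keyword; the sketch does not call Snowpark and does not show which options the real registration accepts.

```python
from typing import Any, Callable, Dict, List, Optional


def register_udtf_stub(
    handler: Callable, output_schema: List[str], **options: Any
) -> Dict[str, Any]:
    """Hypothetical stand-in for a UDTF registration call; it records its inputs."""
    return {"handler": handler, "output_schema": output_schema, "options": options}


def map_sketch(
    func: Callable,
    output_types: List[str],
    output_column_names: Optional[List[str]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical map-like method that forwards unknown keywords unchanged."""
    # Anything the method does not name explicitly is handed to the registration
    # step, so new registration options need no change to this signature.
    registration = register_udtf_stub(func, output_types, **kwargs)
    registration["output_column_names"] = output_column_names
    return registration


result = map_sketch(len, ["int"], output_column_names=["size"], packages=["numpy"])
print(result["options"])
```

The trade-off of this design is that misspelled keywords are only caught by the registration step, not by the method's own signature.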

Adds the `DataFrame.map` method which applies a Python function to
every element of a DataFrame.
Added a `kwargs` parameter to pass arguments to the UDTF registration
function.

Part of the code review comments.
Skips executing `test_map` in local testing mode because it requires UDTFs.

Seems like your changes contain some Local Testing changes, please request review from @snowflakedb/local-testing



3 participants