Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1565789: Column.isin() fails with empty list #1999

Open
NickFishr opened this issue Jul 30, 2024 · 1 comment
Open

SNOW-1565789: Column.isin() fails with empty list #1999

NickFishr opened this issue Jul 30, 2024 · 1 comment
Assignees
Labels
bug Something isn't working status-triage_done Initial triage done, will be further handled by the driver team

Comments

@NickFishr
Copy link
Contributor

  1. What version of Python are you using?

    Python 3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]

  2. What operating system and processor architecture are you using?

    • Linux-5.19.0-46-generic-x86_64-with-glibc2.35
    • Web Console
  3. What are the component versions in the environment (pip freeze)?

    ...
    pandas==2.2.2
    pyarrow==17.0.0
    pytest==7.4.4
    requests==2.32.3
    snowflake-connector-python==3.11.0
    snowflake-snowpark-python==1.20.0
    ...
    
  4. What did you do?

    Create a dataframe and a apply filter() with an isin() subquery (whose result was empty).

    import snowflake.snowpark as snowpark
    
    def main(session: snowpark.Session): 
        df1 = session.create_dataframe([1, 2, 3, 4]).to_df("a")
        df2 = session.create_dataframe([1, 2, 3, 4]).to_df("b")
    
        # Lazy evaluation of a subquery with empty result set --> OK
        df3 = df1.filter(df1["a"].isin(df2[df2["b"] > 4]))
        df3.collect()
    
        # None --> OK
        df4 = df1.filter(df1["a"].isin(None))
        df4.collect()
    
        # Empty list --> EXCEPTION
        df5 = df1.filter(df1["a"].isin([]))
        df5.collect()
    
        # Collecting the subquery which yields an empty result is essentially the same
        #df5 = df1.filter(df1["a"].isin((df2[df2["b"] > 4]).collect()))
        #df5.collect()
    

    This problem does not occurr in local testing mode.

  5. What did you expect to see?

    An empty dataframe (since the intersection of any set with an empty set is again the empty set).

  6. Can you set logging to DEBUG and collect the logs?

    This is the output from the Python worksheet run in the web console (without the additional debug logs):

  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 123, in wrap
    raise ex
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 117, in wrap
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 417, in run_query
    raise ex
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 402, in run_query
    results_cursor = self.execute_and_notify_query_listener(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/snowpark/_internal/server_connection.py", line 354, in execute_and_notify_query_listener
    results_cursor = self._cursor.execute(query, **kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/cursor.py", line 1016, in execute
    Error.errorhandler_wrapper(
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/errors.py", line 232, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/errors.py", line 287, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/usr/lib/python_udf/ba358c1eee732decf7355676d6091c8871aded551c68484cde9ee9ddcf4e5260/lib/python3.11/site-packages/snowflake/connector/errors.py", line 165, in default_errorhandler
    raise error_class(
snowflake.snowpark.exceptions.SnowparkSQLException: (1304): <REDACTED>: 001003 (42000): SQL compilation error:
syntax error line 1 at position 138 unexpected ')'.

The failing generated SQL query was:

SELECT  *  FROM ( SELECT "_1" AS "A" FROM ( SELECT $1 AS "_1" FROM  VALUES (1 :: INT), (2 :: INT), (3 :: INT), (4 :: INT))) WHERE "A" IN () LIMIT 10

With the empty brackets after the IN at the end being the problem.

@NickFishr NickFishr added bug Something isn't working needs triage Initial RCA is required labels Jul 30, 2024
@github-actions github-actions bot changed the title Column.isin() fails with empty list SNOW-1565789: Column.isin() fails with empty list Jul 30, 2024
@sfc-gh-sghosh sfc-gh-sghosh self-assigned this Aug 1, 2024
@sfc-gh-sghosh
Copy link

Hello @NickFishr ,

Thanks for raising the issue, we confirm the issue is with regular session and its throwing exception
will work on eliminating it.

SnowparkSQLException: (1304): 01b60cf5-080a-ecc4-0000-164925ab06da: 001003 (42000): SQL compilation error:
syntax error line 1 at position 138 unexpected ')'.

Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh added status-triage_done Initial triage done, will be further handled by the driver team and removed needs triage Initial RCA is required labels Aug 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working status-triage_done Initial triage done, will be further handled by the driver team
Projects
None yet
Development

No branches or pull requests

3 participants