Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-122145: Handle an empty AST body when reporting tracebacks. #122161

Merged
merged 8 commits into from
Sep 18, 2024

Conversation

picnixz
Copy link
Contributor

@picnixz picnixz commented Jul 23, 2024

Credits to @sobolevn.

Should I rather try to fix the two issues simultaneously or should I just fix this one here?

@picnixz
Copy link
Contributor Author

picnixz commented Jul 23, 2024

I don't know what to really put in the NEWS entry 😢 I'm not sure that the only way to get an AST body that is empty is only by having comments inside (and without causing a SyntaxError/Import Error) =/ (I also forgot to add credits to you Nikita).

@iritkatriel
Copy link
Member

CC @pablogsal

@iritkatriel
Copy link
Member

I don't think it's possible for an AST to have a node with empty body, there seems to be something else going on here.

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

I don't think it's possible for an AST to have a node with empty body,

Actually it is if you only have comments:

>>> ast.parse('#').body
[]

@iritkatriel
Copy link
Member

Could you rewrite the test to use this ast rather than construct the intermediate structure?

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

Could you rewrite the test to use this ast rather than construct the intermediate structure?

I can do it but I need to check that I'm able to hit the ast.parse() call. I don't know how I can call format_frame_summary without the intermediate FrameSummary (I can however change my '#1234567890' into '#' if this is what you wanted and can directly call _should_show_carets)

@picnixz picnixz requested a review from iritkatriel July 29, 2024 10:45
@iritkatriel
Copy link
Member

Could you rewrite the test to use this ast rather than construct the intermediate structure?

I can do it but I need to check that I'm able to hit the ast.parse() call. I don't know how I can call format_frame_summary without the intermediate FrameSummary (I can however change my '#1234567890' into '#' if this is what you wanted and can directly call _should_show_carets)

Can you create the frame summary from the python code snippet, rather than hard code it?
This would make the test look like it's testing something real.

@iritkatriel
Copy link
Member

(I can however change my '#1234567890' into '#' if this is what you wanted and can directly call _should_show_carets)

Ideally no, we don't write unit tests for internal functions.

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

Ok, I'll try to see how I can write my test without this (I was fixing some stuff in the meantime)

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

@iritkatriel I tried to use as much public API I could but it's actually extremely difficult to trigger the bug without an explicit construction of a FrameSummary. The reason is that the column offsets are only set when invoking traceback.TracebackException.from_exception(exc).stack, otherwise they are set to None and the AST is never parsed.

(Also, I don't think I can write everything in the file since otherwise the AST won't be empty...)

@iritkatriel
Copy link
Member

@iritkatriel I tried to use as much public API I could but it's actually extremely difficult to trigger the bug without an explicit construction of a FrameSummary. The reason is that the column offsets are only set when invoking traceback.TracebackException.from_exception(exc).stack, otherwise they are set to None and the AST is never parsed.

(Also, I don't think I can write everything in the file since otherwise the AST won't be empty...)

So this makes me wonder - ok, it's possible to have an empty body on an AST node, but how is it possible for that to appear on the traceback of an exception? It would be compiled into something like "RESUME" followed by "RETURN_CONST None". So maybe if there is a KeyboardInterrupt while the RETURN_CONST is executing?

How did an empty body end up being accessed from tracebackin the case that was reported in this issue?

@iritkatriel
Copy link
Member

Sorry, RETURN_CONST will only be added to a function node body. For a module not even that.

But the question remains - how does executing an empty module raise an exception?

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

But the question remains - how does executing an empty module raise an exception?

Actually, I think it's because of linecache (but I'm not sure). In my test, I need to fake the real source of the file so that the line to render (and to actually parase) would give an empty AST body.

For the issue, I think the source is incorrectly deduced or one of the frame is actually used in the traceback itself. Except for a live reproducer with the REPL or with a linecache hack, I don't know how to do reproduce the bug. I agree that we should not have an empty AST because that would mean the file is empty or only consists of comments. Empty lines are already handled but somewhat comments only no.

In the meantime, we could perhaps have this hotfix and then try to investigate more in details (or we can just take our time, though it could be annoying that your REPL crashes without you being able to do anything...). The culprit AST is in _pyrepl/__main__.py (it seems that the traceback is assuming that the file it should take the line from is this one):

# Important: don't add things to this module, as they will end up in the REPL's
# default globals.  Use _pyrepl.main instead.

if __name__ == "__main__":
    from .main import interactive_console as __pyrepl_interactive_console
    __pyrepl_interactive_console()

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

But _pyrepl/__main__.py is not empty. Is linecache returning the wrong file?

It's not empty but the first line is a comment. If you add print(all_lines) just after ast.parse() in traceback.py, you'll see:

>>> exec(compile("1/0", "?", "exec"))
["# Important: don't add things to this module, as they will end up in the REPL's"]
Traceback (most recent call last):
["# Important: don't add things to this module, as they will end up in the REPL's"]
Traceback (most recent call last):
  File "/lib/python/cpython/Lib/code.py", line 91, in runcode
    exec(code, self.locals)
  File "<python-input-0>", line 1, in <module>
  File "?", line 1, in <module>
ZeroDivisionError: division by zero

(That's how I knew which file was the culprit...)

@picnixz
Copy link
Contributor Author

picnixz commented Jul 29, 2024

Actually, the files being queried by linecache are:

>>> exec(compile("1/0", "?", "exec"))
filename= <python-input-0>
filename= ?
["# Important: don't add things to this module, as they will end up in the REPL's"]
filename= <frozen runpy>
filename= <frozen runpy>
filename= /lib/python/cpython/Lib/_pyrepl/__main__.py
filename= /lib/python/cpython/Lib/_pyrepl/main.py
filename= /lib/python/cpython/Lib/_pyrepl/simple_interact.py
filename= /lib/python/cpython/Lib/code.py
filename= /lib/python/cpython/Lib/_pyrepl/console.py
filename= /lib/python/cpython/Lib/code.py
filename= /lib/python/cpython/Lib/_pyrepl/console.py
filename= /lib/python/cpython/Lib/code.py
filename= /lib/python/cpython/Lib/traceback.py
filename= /lib/python/cpython/Lib/traceback.py
filename= /lib/python/cpython/Lib/traceback.py
filename= /lib/python/cpython/Lib/traceback.py
filename= /lib/python/cpython/Lib/traceback.py
filename= /lib/python/cpython/Lib/code.py
filename= <python-input-0>
filename= ?
Traceback (most recent call last):
["# Important: don't add things to this module, as they will end up in the REPL's"]
Traceback (most recent call last):
  File "/lib/python/cpython/Lib/code.py", line 91, in runcode
    exec(code, self.locals)
  File "<python-input-0>", line 1, in <module>
  File "?", line 1, in <module>
ZeroDivisionError: division by zero

I've added a print("filename=", self.filename) inside the FrameSummary._set_lines() to check when the lines are being queried.

@iritkatriel
Copy link
Member

I put a breakpoint just before the call to StackSummary._extract_from_extended_frame_gen, and the exception's traceback has two entries:

(Pdb) p exc_traceback.tb_frame.f_code
<code object <module> at 0x10dee5470, file "<python-input-0>", line 1>
(Pdb) p exc_traceback.tb_next.tb_frame.f_code
<code object <module> at 0x10db16340, file "s", line 1>

neither of these is in _pyrepl/__main__.py. So there is some bug that makes code from _pyrepl/__main__.py to be looked up.

@devdanzin
Copy link
Contributor

It's linecache using a loader that has the wrong filename. I'll post the relevant code once I'm on my computer.

@devdanzin
Copy link
Contributor

See #122071 (comment) and surrounding context for an explanation, a possible fix and a PR targeting this.

@iritkatriel
Copy link
Member

It's linecache using a loader that has the wrong filename. I'll post the relevant code once I'm on my computer.

Right. I just landed on this possible solution before I saw your comment (not sure this is 100% correct, there's stuff here I'm not very familiar with):

diff --git a/Lib/linecache.py b/Lib/linecache.py
index 3462f1c451b..2b98488adf3 100644
--- a/Lib/linecache.py
+++ b/Lib/linecache.py
@@ -175,6 +175,8 @@ def lazycache(filename, module_globals):
         return False
     # Try for a __loader__, if available
     if module_globals and '__name__' in module_globals:
+        if not ((mod_file := module_globals.get('__file__', None)) and mod_file.endswith(filename)):
+            return False
         spec = module_globals.get('__spec__')
         name = getattr(spec, 'name', None) or module_globals['__name__']
         loader = getattr(spec, 'loader', None)

@iritkatriel
Copy link
Member

It would be nice to have compile() feed linecache with the line it's compiling. Then we would have that in the traceback as well (not just the exec call). compile is written in C though, so it's not clear how simple this is.

@picnixz
Copy link
Contributor Author

picnixz commented Jul 30, 2024

I'll add a DO-NOT-MERGE label here because we might possibly want to fix linecache instead of using this simple fix. If we fix linecache, I think we should instead have an assert out there just in case (unless I'm mistaken, we seem to agree that you cannot raise an exception with an empty AST body).

@vstinner
Copy link
Member

@picnixz: Your PR has the DO-NOT-MERGE label. Is the PR ready for review? It already got multiple reviews :-)

@vstinner
Copy link
Member

I'll add a DO-NOT-MERGE label here because we might possibly want to fix linecache instead of using this simple fix.

IMO this fix is needed. Fixing linecache is something different which deserves its own PR.

@vstinner
Copy link
Member

This bug can occur if the Python source code is modified after a program started, which is likely while developing. So again, IMO it's worth it to fix this bug.

Fixing the other bug about getting the right line in linecache is also worth it but has a lower priority. I marked this bug as a blocker issue.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vstinner
Copy link
Member

@iritkatriel @sobolevn: I removed the DO-NOT-MERGE label. Are you still ok to merge this change?

@vstinner vstinner added the needs backport to 3.13 bugs and security fixes label Sep 18, 2024
@vstinner vstinner merged commit 5cd50cb into python:main Sep 18, 2024
36 of 37 checks passed
@miss-islington-app
Copy link

Thanks @picnixz for the PR, and @vstinner for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 18, 2024
@bedevere-app
Copy link

bedevere-app bot commented Sep 18, 2024

GH-124214 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Sep 18, 2024
@vstinner
Copy link
Member

Merged. Thanks @picnixz for your fix and the test.

@picnixz picnixz deleted the fix-traceback-ast branch September 19, 2024 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants