Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use FUNC_RET_READ for inet_csk_accept #201

Merged
merged 4 commits into from
Aug 28, 2024

Conversation

fearful-symmetry
Copy link
Contributor

closes #200

This changes the inet_csk_accept fexit probe to use FUNC_RET_READ instead of passing args to the method; this means we don't have to deal with the function arguments changing, like with the 6.10 kernel.

Tested on Fedora with 6.10.3, but we seem to be missing the addr fields. Not sure if this is a quirk of using the fexit probe, or if there's a bug here and the inet_csk_accept method no longer works the way we assume.

sudo ./EventsTrace --net-conn-accept -i                                                                                                                                                   
{"probes_initialized": true, "features": {"bpf_tramp": true}}
{"event_type":"NETWORK_CONNECTION_ACCEPTED","pids":{"tid":4184470,"tgid":4184470,"ppid":4145903,"pgid":4184470,"sid":4145903,"start_time_ns":1468277566951535},"net":{"transport":"TCP","family":"AF_INET","source_address":"0.0.0.0","source_port":8000,"destination_address":"0.0.0.0","destination_port":0,"network_namespace":4026531840},"comm":"python3"}

@fearful-symmetry fearful-symmetry self-assigned this Aug 26, 2024
@fearful-symmetry fearful-symmetry requested a review from a team as a code owner August 26, 2024 18:10
@haesbaert
Copy link
Contributor

Not sure I grok this but, won't this break for multiple cpus? Two cpus entering the same function would basically store the return value in the same storage space.
I guess the magic is in the ctx arithmetic, where does that come from?

@fearful-symmetry
Copy link
Contributor Author

@haesbaert hmm, good point, didn't think of that. @stanek-michal do you know how this will work across CPUs?

@fearful-symmetry
Copy link
Contributor Author

fearful-symmetry commented Aug 26, 2024

Hmm. Tested with bpftrace via kretfunc, the addresses show up fine:

Tracing TCP accepts. Hit Ctrl-C to end.
TIME     PID    COMM           RADDR                                   RPORT LADDR                                   LPORT BL
11:33:39 4184470 python3        192.168.1.124                           54570 192.168.1.37                            8000  0/5

@fearful-symmetry
Copy link
Contributor Author

Using the branch where I rewrote the probes to use an updated vmlinux, it works fine:

sudo ./EventsTrace --net-conn-accept -i                                                                                                              
{"probes_initialized": true, "features": {"bpf_tramp": true}}
{"event_type":"NETWORK_CONNECTION_ACCEPTED","pids":{"tid":4184470,"tgid":4184470,"ppid":4145903,"pgid":4184470,"sid":4145903,"start_time_ns":1468277566951535},"net":{"transport":"TCP","family":"AF_INET","source_address":"192.168.1.37","source_port":8000,"destination_address":"192.168.1.124","destination_port":54728,"network_namespace":4026531840},"comm":"python3"}

I guess FUNC_RET_READ is behaving in some unexpected way here?

@haesbaert
Copy link
Contributor

So I was wrong, ctx + 0 is supposed to be where the return address is, but I can't seem to find the layout of ctx.

@haesbaert
Copy link
Contributor

with the rellocation addition it's correct I believe.

@stanek-michal
Copy link
Contributor

So I was wrong, ctx + 0 is supposed to be where the return address is, but I can't seem to find the layout of ctx.

Yeah, its eBPF's R0 which is the return value, the layout I guess is just the eBPF registers (like pt_regs but for eBPF arch)

Copy link
Contributor

@stanek-michal stanek-michal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I don't see any issues, multiple CPUs don't seem to matter here. Thanks for the changes!

@haesbaert
Copy link
Contributor

So I was wrong, ctx + 0 is supposed to be where the return address is, but I can't seem to find the layout of ctx.

Yeah, its eBPF's R0 which is the return value, the layout I guess is just the eBPF registers (like pt_regs but for eBPF arch)

Not sure that's quite right, if I understand this now, it does ctx + patched_return_index, which is the last register in the return function:

int BPF_PROG(
    fexit__inet_csk_accept, struct sock *sk, int flags, int *err, bool kern, struct sock *ret)

So the patched_return_index here would be 4, meaning use R5 (R1=sk; R2=flags; R3=err; R4=kern; R5=ret).
Problem before is we were doing ctx + 0 which is R1.

@stanek-michal
Copy link
Contributor

So I was wrong, ctx + 0 is supposed to be where the return address is, but I can't seem to find the layout of ctx.

Yeah, its eBPF's R0 which is the return value, the layout I guess is just the eBPF registers (like pt_regs but for eBPF arch)

Not sure that's quite right, if I understand this now, it does ctx + patched_return_index, which is the last register in the return function:

int BPF_PROG(
    fexit__inet_csk_accept, struct sock *sk, int flags, int *err, bool kern, struct sock *ret)

So the patched_return_index here would be 4, meaning use R5 (R1=sk; R2=flags; R3=err; R4=kern; R5=ret). Problem before is we were doing ctx + 0 which is R1.

Oh interesting. I guess the ctx (which seems to be struct 'bpf_context') can be different things in different cases, like for network stuff its the packet data etc.

@fearful-symmetry
Copy link
Contributor Author

Not really sure what's going on with the CI failures, can't see anything other than this:

make: *** [Makefile:171: run-multikernel-test] Error 1

@haesbaert
Copy link
Contributor

So I was wrong, ctx + 0 is supposed to be where the return address is, but I can't seem to find the layout of ctx.

Yeah, its eBPF's R0 which is the return value, the layout I guess is just the eBPF registers (like pt_regs but for eBPF arch)

Not sure that's quite right, if I understand this now, it does ctx + patched_return_index, which is the last register in the return function:

int BPF_PROG(
    fexit__inet_csk_accept, struct sock *sk, int flags, int *err, bool kern, struct sock *ret)

So the patched_return_index here would be 4, meaning use R5 (R1=sk; R2=flags; R3=err; R4=kern; R5=ret). Problem before is we were doing ctx + 0 which is R1.

Oh interesting. I guess the ctx (which seems to be struct 'bpf_context') can be different things in different cases, like for network stuff its the packet data etc.

I guess so as well, I'll try to read the bpf code later and see if I can confirm.

@fearful-symmetry
Copy link
Contributor Author

are the tests just....timing out?

2024-08-27T17:42:18.7669228Z [            ] stdout: timed out waiting for EventsTrace to get ready, dumped stderr above
2024-08-27T17:42:18.7669408Z [            ] stdout: ===== STACKTRACE FOR FAILED TEST =====
2024-08-27T17:42:18.7669543Z [            ] stdout: goroutine 1 [running]:
2024-08-27T17:42:18.7669709Z [            ] stdout: main.TestFail({0x400088fd98, 0x1, 0x1})
2024-08-27T17:42:18.7670123Z [            ] stdout: 	/home/runner/work/ebpf/ebpf/testing/testrunner/utils.go:292 +0xe8
2024-08-27T17:42:18.7670417Z [            ] stdout: main.(*EventsTraceInstance).Start(0x4000012080, {0x13d168, 0x400008c240})
2024-08-27T17:42:18.7670761Z [            ] stdout: 	/home/runner/work/ebpf/ebpf/testing/testrunner/eventstrace.go:87 +0x360
2024-08-27T17:42:18.7670983Z [            ] stdout: main.RunEventsTest(0x11dd40, {0x400009df38, 0x1, 0x1})
2024-08-27T17:42:18.7671283Z [            ] stdout: 	/home/runner/work/ebpf/ebpf/testing/testrunner/utils.go:350 +0x190
2024-08-27T17:42:18.7671395Z [            ] stdout: main.main()
2024-08-27T17:42:18.7671684Z [            ] stdout: 	/home/runner/work/ebpf/ebpf/testing/testrunner/main.go:30 +0x2dc

@fearful-symmetry fearful-symmetry merged commit d9a42f9 into main Aug 28, 2024
25 of 26 checks passed
@fearful-symmetry fearful-symmetry deleted the use_func_ret_read_for_inet_accept branch August 28, 2024 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fexit/inet_csk_accept probe will not work on kernel >=6.10
3 participants