[PATCH 0/1] Get tracer PID without reliance on the proc FS

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/1] Get tracer PID without reliance on the proc FS
@ 2024-09-05 21:27 Roman Kisel
  2024-09-05 21:27 ` [PATCH 1/1] ptrace: " Roman Kisel
  0 siblings, 1 reply; 29+ messages in thread
From: Roman Kisel @ 2024-09-05 21:27 UTC (permalink / raw)
  To: oleg, linux-kernel; +Cc: apais, benhill, ssengar, sunilmut, vdso

For debugging, it might be useful to run the debug trap
instruction to break into the debugger. To detect the debugger
presence, the kernel provides the `/proc/self/status` pseudo-file
that needs to be searched for the "TracerPid:" string.

Provide a prctl command that returns the PID of the tracer if any.
That allows for much simpler logic in the user land, and makes it
possible to detect tracer presence even if PROC_FS is not enabled.

As an example where this might be useful, one might refer to
the standard C++ and Rust libraries. See these links for the details:

* https://en.cppreference.com/w/cpp/utility/breakpoint_if_debugging
* https://lists.llvm.org/pipermail/libcxx-commits/2024-May/083574.html
* https://patchwork-proxy.ozlabs.org/project/gcc/patch/20240601102446.878286-1-jwakely@redhat.com/#3321542
* https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2546r0.html
* https://github.com/rust-lang/rust/pull/129019

Roman Kisel (1):
  ptrace: Get tracer PID without reliance on the proc FS

 include/uapi/linux/ptrace.h |  1 +
 kernel/ptrace.c             | 11 ++++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

base-commit: ad618736883b8970f66af799e34007475fe33a68
-- 
2.34.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-05 21:27 [PATCH 0/1] Get tracer PID without reliance on the proc FS Roman Kisel
@ 2024-09-05 21:27 ` Roman Kisel
  2024-09-06 11:24   ` Oleg Nesterov
  2024-09-07 19:33   ` kernel test robot
  0 siblings, 2 replies; 29+ messages in thread
From: Roman Kisel @ 2024-09-05 21:27 UTC (permalink / raw)
  To: oleg, linux-kernel; +Cc: apais, benhill, ssengar, sunilmut, vdso

For debugging, it might be useful to run the debug trap
instruction to break into the debugger. To detect the debugger
presence, the kernel provides the `/proc/self/status` pseudo-file
that needs to be searched for the "TracerPid:" string.

Provide a prctl command that returns the PID of the tracer if any.
That allows for much simpler logic in the user land, and makes it
possible to detect tracer presence even if PROC_FS is not enabled.

Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
---
 include/uapi/linux/ptrace.h |  1 +
 kernel/ptrace.c             | 11 ++++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 72c038fc71d0..5056f5d1df6b 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -21,6 +21,7 @@
 
 #define PTRACE_ATTACH		  16
 #define PTRACE_DETACH		  17
+#define PTRACE_TRACER		  18
 
 #define PTRACE_SYSCALL		  24
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index d5f89f9ef29f..91275c5c4f57 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1258,7 +1258,7 @@ int ptrace_request(struct task_struct *child, long request,
 SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
 		unsigned long, data)
 {
-	struct task_struct *child;
+	struct task_struct *child, *tracer;
 	long ret;
 
 	if (request == PTRACE_TRACEME) {
@@ -1277,6 +1277,15 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
 		goto out_put_task_struct;
 	}
 
+	if (request == PTRACE_TRACER) {
+		rcu_read_lock();
+		tracer = ptrace_parent(current);
+		ret = tracer ? task_pid_nr_ns(tracer,
+					task_active_pid_ns(current->parent)) : -ESRCH;
+		rcu_read_unlock();
+		goto out;
+	}
+
 	ret = ptrace_check_attach(child, request == PTRACE_KILL ||
 				  request == PTRACE_INTERRUPT);
 	if (ret < 0)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-05 21:27 ` [PATCH 1/1] ptrace: " Roman Kisel
@ 2024-09-06 11:24   ` Oleg Nesterov
  2024-09-06 11:48     ` Oleg Nesterov
  2024-09-06 19:09     ` Linus Torvalds
  2024-09-07 19:33   ` kernel test robot
  1 sibling, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-06 11:24 UTC (permalink / raw)
  To: Roman Kisel, Eric W. Biederman, Linus Torvalds, Andrew Morton
  Cc: linux-kernel, apais, benhill, ssengar, sunilmut, vdso

Add cc's. Perhaps someone else can ack/nack the intent...

This (trivial) patch is obviously buggy, but fixable. I won't argue
if it can help userspace.

On 09/05, Roman Kisel wrote:
>
> For debugging, it might be useful to run the debug trap
> instruction to break into the debugger. To detect the debugger
> presence, the kernel provides the `/proc/self/status` pseudo-file
> that needs to be searched for the "TracerPid:" string.
>
> Provide a prctl command that returns the PID of the tracer if any.

prctl?

> That allows for much simpler logic in the user land, and makes it
> possible to detect tracer presence even if PROC_FS is not enabled.

You should probably move the links from 0/1 to the changelog to make
it more convincing.

> +	if (request == PTRACE_TRACER) {
> +		rcu_read_lock();
> +		tracer = ptrace_parent(current);
> +		ret = tracer ? task_pid_nr_ns(tracer,
> +					task_active_pid_ns(current->parent)) : -ESRCH;

The namespace is wrong, we need task_active_pid_ns(current). So this
code should simply do task_tgid_vnr(tracer) like sys_getppid() does.
And to me it would be better to return 0 if !current->ptrace.

> +		rcu_read_unlock();
> +		goto out;

Wrong, this code runs after "child = find_get_task_by_vpid(pid);" above.

And why? perhaps the intent was to check if this child is traced, not
current?

Oleg.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 11:24   ` Oleg Nesterov
@ 2024-09-06 11:48     ` Oleg Nesterov
  2024-09-06 19:09     ` Linus Torvalds
  1 sibling, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-06 11:48 UTC (permalink / raw)
  To: Roman Kisel, Eric W. Biederman, Linus Torvalds, Andrew Morton
  Cc: linux-kernel, apais, benhill, ssengar, sunilmut, vdso

Forgot to ask...

Do you really want the tracer's pid or can PTRACE_TRACER/whatever
simply return the !!current->ptrace boolean? The changelog should
probably explain this too.

On 09/06, Oleg Nesterov wrote:
>
> Add cc's. Perhaps someone else can ack/nack the intent...
>
> This (trivial) patch is obviously buggy, but fixable. I won't argue
> if it can help userspace.
>
> On 09/05, Roman Kisel wrote:
> >
> > For debugging, it might be useful to run the debug trap
> > instruction to break into the debugger. To detect the debugger
> > presence, the kernel provides the `/proc/self/status` pseudo-file
> > that needs to be searched for the "TracerPid:" string.
> >
> > Provide a prctl command that returns the PID of the tracer if any.
>
> prctl?
>
> > That allows for much simpler logic in the user land, and makes it
> > possible to detect tracer presence even if PROC_FS is not enabled.
>
> You should probably move the links from 0/1 to the changelog to make
> it more convincing.
>
> > +	if (request == PTRACE_TRACER) {
> > +		rcu_read_lock();
> > +		tracer = ptrace_parent(current);
> > +		ret = tracer ? task_pid_nr_ns(tracer,
> > +					task_active_pid_ns(current->parent)) : -ESRCH;
>
> The namespace is wrong, we need task_active_pid_ns(current). So this
> code should simply do task_tgid_vnr(tracer) like sys_getppid() does.
> And to me it would be better to return 0 if !current->ptrace.
>
> > +		rcu_read_unlock();
> > +		goto out;
>
> Wrong, this code runs after "child = find_get_task_by_vpid(pid);" above.
>
> And why? perhaps the intent was to check if this child is traced, not
> current?
>
> Oleg.
>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 11:24   ` Oleg Nesterov
  2024-09-06 11:48     ` Oleg Nesterov
@ 2024-09-06 19:09     ` Linus Torvalds
  2024-09-06 20:08       ` Roman Kisel
  1 sibling, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2024-09-06 19:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roman Kisel, Eric W. Biederman, Andrew Morton, linux-kernel,
	apais, benhill, ssengar, sunilmut, vdso

On Fri, 6 Sept 2024 at 04:24, Oleg Nesterov <oleg@redhat.com> wrote:
>
> Add cc's. Perhaps someone else can ack/nack the intent...
>
> This (trivial) patch is obviously buggy, but fixable. I won't argue
> if it can help userspace.

I think the "what's the point for user space" is the much more important thing.

Honestly, acting differently when traced sounds like a truly
fundamentally HORRIBLE model for anything at all - much less debugging
- and I think it should not be helped in any way unless you have some
really really strong arguments for it.

Can you figure it out as-is? Sure. But that's still not a reason to
make bad behavior _easier_.

               Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 19:09     ` Linus Torvalds
@ 2024-09-06 20:08       ` Roman Kisel
  2024-09-06 20:26         ` Linus Torvalds
  2024-09-06 20:55         ` Oleg Nesterov
  0 siblings, 2 replies; 29+ messages in thread
From: Roman Kisel @ 2024-09-06 20:08 UTC (permalink / raw)
  To: Linus Torvalds, Oleg Nesterov
  Cc: Eric W. Biederman, Andrew Morton, linux-kernel, apais, benhill,
	ssengar, sunilmut, vdso

On 9/6/2024 12:09 PM, Linus Torvalds wrote:
> On Fri, 6 Sept 2024 at 04:24, Oleg Nesterov <oleg@redhat.com> wrote:
>>
>> Add cc's. Perhaps someone else can ack/nack the intent...
>>
>> This (trivial) patch is obviously buggy, but fixable. I won't argue
>> if it can help userspace.
> 
> I think the "what's the point for user space" is the much more important thing.
> 
> Honestly, acting differently when traced sounds like a truly
> fundamentally HORRIBLE model for anything at all - much less debugging
> - and I think it should not be helped in any way unless you have some
> really really strong arguments for it.
> 
> Can you figure it out as-is? Sure. But that's still not a reason to
> make bad behavior _easier_.

No dispute that altering behavior based on whether a process is traced
or not _is_ bad behavior. To be precise, when the process is still
doing work, it undoubtedly is.

When the process has run into a fatal error and is about to exit, having
a way to break into the debugger at this exact moment wouldn't change
anything about the logic of the data processing happening in the process.
What's so horrible in that to have a way to land in the debugger to see
what exactly is going on?

Another aid of a similar kind is logging. Sure, can figure out what's
the bug without it. It is easier to with it though, and logging might
change resource consumption so much more than the check for the tracer
being present when the process is dying.

All told, let me know if I may proceed with fixing the code as Oleg
suggested, or this piece should go into the waste basket. I could make
an argument that providing the way to get the tracer PID only via
proc FS through parsing text is more like shell/Perl/Python interface
to the kernel, and for compiled languages could have what's easier in
that setting (there is an easy syscall for getting PID, and there could
be code changing the logic on the PID being odd or even for the sake
of argument).

> 
>                 Linus

-- 
Thank you,
Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 20:08       ` Roman Kisel
@ 2024-09-06 20:26         ` Linus Torvalds
  2024-09-06 21:15           ` Roman Kisel
  2024-09-06 20:55         ` Oleg Nesterov
  1 sibling, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2024-09-06 20:26 UTC (permalink / raw)
  To: Roman Kisel
  Cc: Oleg Nesterov, Eric W. Biederman, Andrew Morton, linux-kernel,
	apais, benhill, ssengar, sunilmut, vdso

On Fri, 6 Sept 2024 at 13:08, Roman Kisel <romank@linux.microsoft.com> wrote:
>
> When the process has run into a fatal error and is about to exit, having
> a way to break into the debugger at this exact moment wouldn't change
> anything about the logic of the data processing happening in the process.
> What's so horrible in that to have a way to land in the debugger to see
> what exactly is going on?

I don't buy it.

If you want to debug some fatal behavior, and a debugger *isn't*
attached, you want it to create a core-dump.

And if a debugger *is* attached, it will catch that.

This is basically how abort() has always worked, and it very much is
*not* doing some "let's check if we're being debugged" stuff. Exactly
because that would be a bad idea and an anti-pattern.

The other very traditional model - for example if you do *not* want to
do core-dumps for some reason, and just exit with an error message -
is to just put a breakpoint on the "fatal()" function (or whatever
your "fatal error happened" situation is) using the debugger.

Then the target will behave differently depending on whether it is
being debugged or not BECAUSE THE DEBUGGER ASKED FOR IT, not because
the program decided to act differently when debugged.

In other words, this is a long-solved problem with solid solutions
from before Linux even existed.

              Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 20:26         ` Linus Torvalds
@ 2024-09-06 21:15           ` Roman Kisel
  2024-09-09 16:18             ` Eric W. Biederman
  0 siblings, 1 reply; 29+ messages in thread
From: Roman Kisel @ 2024-09-06 21:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Oleg Nesterov, Eric W. Biederman, Andrew Morton, linux-kernel,
	apais, benhill, ssengar, sunilmut, vdso

On 9/6/2024 1:26 PM, Linus Torvalds wrote:
> On Fri, 6 Sept 2024 at 13:08, Roman Kisel <romank@linux.microsoft.com> wrote:
>>
>> When the process has run into a fatal error and is about to exit, having
>> a way to break into the debugger at this exact moment wouldn't change
>> anything about the logic of the data processing happening in the process.
>> What's so horrible in that to have a way to land in the debugger to see
>> what exactly is going on?
> 
> I don't buy it.
> 
> If you want to debug some fatal behavior, and a debugger *isn't*
> attached, you want it to create a core-dump.
> 
> And if a debugger *is* attached, it will catch that.
> 
> This is basically how abort() has always worked, and it very much is
> *not* doing some "let's check if we're being debugged" stuff. Exactly
> because that would be a bad idea and an anti-pattern.
> 
> The other very traditional model - for example if you do *not* want to
> do core-dumps for some reason, and just exit with an error message -
> is to just put a breakpoint on the "fatal()" function (or whatever
> your "fatal error happened" situation is) using the debugger.
> 
> Then the target will behave differently depending on whether it is
> being debugged or not BECAUSE THE DEBUGGER ASKED FOR IT, not because
> the program decided to act differently when debugged.
> 
> In other words, this is a long-solved problem with solid solutions
> from before Linux even existed.

Writing a core-dump might not be an option as you have pointed out
hence the "fatal()" function might not be permitted to fault.

Breaking into the debugger if it is attached saves a step of setting
a breakpoint and doesn't require the knowledge of the guts of the
standard library and/or the journey of the trap exception from the
CPU to the debugger. The very name of the "fatal()" function is a
tight contract, and something akin to the onion in the varnish for
the uninitiated.

Libraries like Google Breakpad, Boost.Test, C++ std, AWS C SDK,
Unreal Engine have that "breakpoint_if_debugging" facility, folks
find that useful. They all read and parse the "/proc/self/status" file,
where as it may seem, just the ptrace syscall one liner could save
the trouble of that. The kernel helps the user space do work as I
understand, why police it? There is "fork()", and threads can deadlock.
Quite horrible, still the user space has access to that. Here,
there is evidence folks do want to detect if a debugger is present,
there is evidence how the kernel can help the user space compute that
with so much less effort, the patch is trivial. Why don't let the
userspace burn less electricity?

> 
>                Linus

-- 
Thank you,
Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 21:15           ` Roman Kisel
@ 2024-09-09 16:18             ` Eric W. Biederman
  2024-09-09 17:05               ` Oleg Nesterov
  2024-09-09 17:22               ` Roman Kisel
  0 siblings, 2 replies; 29+ messages in thread
From: Eric W. Biederman @ 2024-09-09 16:18 UTC (permalink / raw)
  To: Roman Kisel
  Cc: Linus Torvalds, Oleg Nesterov, Andrew Morton, linux-kernel, apais,
	benhill, ssengar, sunilmut, vdso

Roman Kisel <romank@linux.microsoft.com> writes:

> On 9/6/2024 1:26 PM, Linus Torvalds wrote:
>> On Fri, 6 Sept 2024 at 13:08, Roman Kisel <romank@linux.microsoft.com> wrote:
>>>
>>> When the process has run into a fatal error and is about to exit, having
>>> a way to break into the debugger at this exact moment wouldn't change
>>> anything about the logic of the data processing happening in the process.
>>> What's so horrible in that to have a way to land in the debugger to see
>>> what exactly is going on?
>> I don't buy it.
>> If you want to debug some fatal behavior, and a debugger *isn't*
>> attached, you want it to create a core-dump.
>> And if a debugger *is* attached, it will catch that.
>> This is basically how abort() has always worked, and it very much is
>> *not* doing some "let's check if we're being debugged" stuff. Exactly
>> because that would be a bad idea and an anti-pattern.
>> The other very traditional model - for example if you do *not* want to
>> do core-dumps for some reason, and just exit with an error message -
>> is to just put a breakpoint on the "fatal()" function (or whatever
>> your "fatal error happened" situation is) using the debugger.
>> Then the target will behave differently depending on whether it is
>> being debugged or not BECAUSE THE DEBUGGER ASKED FOR IT, not because
>> the program decided to act differently when debugged.
>> In other words, this is a long-solved problem with solid solutions
>> from before Linux even existed.
>
>
> Writing a core-dump might not be an option as you have pointed out
> hence the "fatal()" function might not be permitted to fault.

For that you just need to set core file size rlimit to 0.
Then you can safely raise SIGABRT to terminate your process.

A debugger can also stop at PTRACE_EVENT_EXIT.  The process
is still available (not cleaned up), but already fatally dead.

So I think the scenario of a process exiting is safely handled.

So if that is the case you care about I would say please look at
PTRACE_EVENT_EXIT.

> Breaking into the debugger if it is attached saves a step of setting
> a breakpoint and doesn't require the knowledge of the guts of the
> standard library and/or the journey of the trap exception from the
> CPU to the debugger. The very name of the "fatal()" function is a
> tight contract, and something akin to the onion in the varnish for
> the uninitiated.
>
> Libraries like Google Breakpad, Boost.Test, C++ std, AWS C SDK,
> Unreal Engine have that "breakpoint_if_debugging" facility, folks
> find that useful. They all read and parse the "/proc/self/status" file,
> where as it may seem, just the ptrace syscall one liner could save
> the trouble of that. The kernel helps the user space do work as I
> understand, why police it? There is "fork()", and threads can deadlock.
> Quite horrible, still the user space has access to that. Here,
> there is evidence folks do want to detect if a debugger is present,
> there is evidence how the kernel can help the user space compute that
> with so much less effort, the patch is trivial. Why don't let the
> userspace burn less electricity?

If you want more than just stopping at when a process exits, such as the
not yet standardized std:is_debugger_present and
std:breakpoint_if_debugging calls.  Then a real world justification
needs to be shown why the kernel should optimize for the uncommon case.

Especially because it will take time and energy and maintenance to keep
that going, all for to support a set of facilities that seem highly
dubious.

I suspect the best way to support breakpoint_if_debugging (in the
general case) is to make them nop functions, and then have something
scan the source code collect those locations, and feed those locations
into the debugger as break points.  Perhaps the compiler can be that
scanner and record the list of suggest break points somewhere that
a debugger can read.

As for std:is_debugger_present I suspect the best way to handle that
is to keep the current implementation, making it expensive to use so
people won't use it unthinkingly.  If people are using this facility
enough that they are wasting electricity, an expensive operation at
least has the potential to make people to stop and think about what they
are doing.

From a don't introduce heisenbug's perspective I think optimizing any of
this in the kernel looks like a bad idea.

Am I missing something?

Eric

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-09 16:18             ` Eric W. Biederman
@ 2024-09-09 17:05               ` Oleg Nesterov
  2024-09-09 17:34                 ` Eric W. Biederman
  2024-09-09 17:22               ` Roman Kisel
  1 sibling, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-09 17:05 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Roman Kisel, Linus Torvalds, Andrew Morton, linux-kernel, apais,
	benhill, ssengar, sunilmut, vdso

On 09/09, Eric W. Biederman wrote:
>
> I suspect the best way to support breakpoint_if_debugging (in the
> general case) is to make them nop functions,

or may be make it call a single function which can be used as a
breakpoint placeholder.

Either way, at least the program with breakpoint_if_debugging() will
survive under /usr/bin/strace.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-09 17:05               ` Oleg Nesterov
@ 2024-09-09 17:34                 ` Eric W. Biederman
  0 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2024-09-09 17:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roman Kisel, Linus Torvalds, Andrew Morton, linux-kernel, apais,
	benhill, ssengar, sunilmut, vdso

Oleg Nesterov <oleg@redhat.com> writes:

> On 09/09, Eric W. Biederman wrote:
>>
>> I suspect the best way to support breakpoint_if_debugging (in the
>> general case) is to make them nop functions,
>
> or may be make it call a single function which can be used as a
> breakpoint placeholder.

That would be much simpler to implement.

> Either way, at least the program with breakpoint_if_debugging() will
> survive under /usr/bin/strace.

Yes, I can't imagine wanting any of that functionality to trigger
when it is just strace using ptrace on a program.

Eric


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-09 16:18             ` Eric W. Biederman
  2024-09-09 17:05               ` Oleg Nesterov
@ 2024-09-09 17:22               ` Roman Kisel
  1 sibling, 0 replies; 29+ messages in thread
From: Roman Kisel @ 2024-09-09 17:22 UTC (permalink / raw)
  To: ebiederm
  Cc: akpm, apais, benhill, linux-kernel, oleg, romank, ssengar,
	sunilmut, torvalds, vdso

On 9/9/2024, Eric wrote:

> > On 9/6/2024 1:26 PM, Linus Torvalds wrote:
> >> On Fri, 6 Sept 2024 at 13:08, Roman Kisel <romank@linux.microsoft.com> wrote:
> >>>
> >>> When the process has run into a fatal error and is about to exit, having
> >>> a way to break into the debugger at this exact moment wouldn't change
> >>> anything about the logic of the data processing happening in the process.
> >>> What's so horrible in that to have a way to land in the debugger to see
> >>> what exactly is going on?
> >> I don't buy it.
> >> If you want to debug some fatal behavior, and a debugger *isn't*
> >> attached, you want it to create a core-dump.
> >> And if a debugger *is* attached, it will catch that.
> >> This is basically how abort() has always worked, and it very much is
> >> *not* doing some "let's check if we're being debugged" stuff. Exactly
> >> because that would be a bad idea and an anti-pattern.
> >> The other very traditional model - for example if you do *not* want to
> >> do core-dumps for some reason, and just exit with an error message -
> >> is to just put a breakpoint on the "fatal()" function (or whatever
> >> your "fatal error happened" situation is) using the debugger.
> >> Then the target will behave differently depending on whether it is
> >> being debugged or not BECAUSE THE DEBUGGER ASKED FOR IT, not because
> >> the program decided to act differently when debugged.
> >> In other words, this is a long-solved problem with solid solutions
> >> from before Linux even existed.
> >
> >
> > Writing a core-dump might not be an option as you have pointed out
> > hence the "fatal()" function might not be permitted to fault.
> 
> For that you just need to set core file size rlimit to 0.
> Then you can safely raise SIGABRT to terminate your process.
> 
> A debugger can also stop at PTRACE_EVENT_EXIT.  The process
> is still available (not cleaned up), but already fatally dead.
> 
> So I think the scenario of a process exiting is safely handled.
> 
> So if that is the case you care about I would say please look at
> PTRACE_EVENT_EXIT.
> 
> > Breaking into the debugger if it is attached saves a step of setting
> > a breakpoint and doesn't require the knowledge of the guts of the
> > standard library and/or the journey of the trap exception from the
> > CPU to the debugger. The very name of the "fatal()" function is a
> > tight contract, and something akin to the onion in the varnish for
> > the uninitiated.
> >
> > Libraries like Google Breakpad, Boost.Test, C++ std, AWS C SDK,
> > Unreal Engine have that "breakpoint_if_debugging" facility, folks
> > find that useful. They all read and parse the "/proc/self/status" file,
> > where as it may seem, just the ptrace syscall one liner could save
> > the trouble of that. The kernel helps the user space do work as I
> > understand, why police it? There is "fork()", and threads can deadlock.
> > Quite horrible, still the user space has access to that. Here,
> > there is evidence folks do want to detect if a debugger is present,
> > there is evidence how the kernel can help the user space compute that
> > with so much less effort, the patch is trivial. Why don't let the
> > userspace burn less electricity?
> 
> If you want more than just stopping at when a process exits, such as the
> not yet standardized std:is_debugger_present and
> std:breakpoint_if_debugging calls.  Then a real world justification
> needs to be shown why the kernel should optimize for the uncommon case.
> 
> Especially because it will take time and energy and maintenance to keep
> that going, all for to support a set of facilities that seem highly
> dubious.
> 
> I suspect the best way to support breakpoint_if_debugging (in the
> general case) is to make them nop functions, and then have something
> scan the source code collect those locations, and feed those locations
> into the debugger as break points.  Perhaps the compiler can be that
> scanner and record the list of suggest break points somewhere that
> a debugger can read.
> 
> As for std:is_debugger_present I suspect the best way to handle that
> is to keep the current implementation, making it expensive to use so
> people won't use it unthinkingly.  If people are using this facility
> enough that they are wasting electricity, an expensive operation at
> least has the potential to make people to stop and think about what they
> are doing.
> 
> From a don't introduce heisenbug's perspective I think optimizing any of
> this in the kernel looks like a bad idea.
> 
> Am I missing something?

This part "people won't use it unthinkingly" does not look too convincing
to me: why not add an expensive spin loop or a 1 second wait to all parts
of the kernel that must be used with caution? I am imagining flashing
L.E.D.s in the modern smartphone, sorry, couldn't help but imagine that :D.
I believe that's a matter of documentation and perfromance evaluation.

I see that the vote has been 3:1 in favor of not merging anything like that,
and no new ideas have surfaced so far to change this. Looks like could close
on this. Appreciate helping me see your points!

> 
> Eric

Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 20:08       ` Roman Kisel
  2024-09-06 20:26         ` Linus Torvalds
@ 2024-09-06 20:55         ` Oleg Nesterov
  2024-09-06 21:25           ` Roman Kisel
  1 sibling, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-06 20:55 UTC (permalink / raw)
  To: Roman Kisel
  Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton, linux-kernel,
	apais, benhill, ssengar, sunilmut, vdso

Well, I leave this to you and Linus (and other reviewers), but if it was not
clear I too do not really like this feature, that is why I added cc's.

Perhaps it makes sense to discuss the alternatives? Say, a process can have a
please_insert_the_breakpoint_here() function implemented in asm which just does
asm(ret).

Then something like

	#define breakpoint_if_debugging()	\
		asm volatile ("call please_insert_the_breakpoint_here" : ASM_CALL_CONSTRAINT);

if the process is ptraced, debugger can insert the breakoint into
please_insert_the_breakpoint_here(). Otherwise breakpoint_if_debugging()
is a cheap nop.

Not that I think this is a good idea, but std::breakpoint_if_debugging()
looks even more strange to me...

Oleg.

On 09/06, Roman Kisel wrote:
>
> All told, let me know if I may proceed with fixing the code as Oleg
> suggested, or this piece should go into the waste basket. I could make
> an argument that providing the way to get the tracer PID only via
> proc FS through parsing text is more like shell/Perl/Python interface
> to the kernel, and for compiled languages could have what's easier in
> that setting (there is an easy syscall for getting PID, and there could
> be code changing the logic on the PID being odd or even for the sake
> of argument).
> 
> >
> >                Linus
> 
> -- 
> Thank you,
> Roman
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 20:55         ` Oleg Nesterov
@ 2024-09-06 21:25           ` Roman Kisel
  2024-09-08 14:08             ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Roman Kisel @ 2024-09-06 21:25 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton, linux-kernel,
	apais, benhill, ssengar, sunilmut, vdso



On 9/6/2024 1:55 PM, Oleg Nesterov wrote:
> Well, I leave this to you and Linus (and other reviewers), but if it was not
> clear I too do not really like this feature, that is why I added cc's.
> 
Appreciate you time and help in understanding the Linux kernel code
better!

> Perhaps it makes sense to discuss the alternatives? Say, a process can have a
> please_insert_the_breakpoint_here() function implemented in asm which just does
> asm(ret).
> 
> Then something like
> 
> 	#define breakpoint_if_debugging()	\
> 		asm volatile ("call please_insert_the_breakpoint_here" : ASM_CALL_CONSTRAINT);
> 
> if the process is ptraced, debugger can insert the breakoint into
> please_insert_the_breakpoint_here(). Otherwise breakpoint_if_debugging()
> is a cheap nop.
> 
> Not that I think this is a good idea, but std::breakpoint_if_debugging()
> looks even more strange to me...
Can't speak for everyone obviously, I've found that convenient
when making sense of large (unknown) codebases instead of setting
up breakpoints and adding prints/logs, and when the process
can't/doesn't fault when it encounters a fatal error.

> 
> Oleg.
> 
> On 09/06, Roman Kisel wrote:
>>
>> All told, let me know if I may proceed with fixing the code as Oleg
>> suggested, or this piece should go into the waste basket. I could make
>> an argument that providing the way to get the tracer PID only via
>> proc FS through parsing text is more like shell/Perl/Python interface
>> to the kernel, and for compiled languages could have what's easier in
>> that setting (there is an easy syscall for getting PID, and there could
>> be code changing the logic on the PID being odd or even for the sake
>> of argument).
>>
>>>
>>>                 Linus
>>
>> -- 
>> Thank you,
>> Roman
>>

-- 
Thank you,
Roman


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-06 21:25           ` Roman Kisel
@ 2024-09-08 14:08             ` Oleg Nesterov
  2024-09-09 15:19               ` Roman Kisel
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-08 14:08 UTC (permalink / raw)
  To: Roman Kisel
  Cc: Linus Torvalds, Eric W. Biederman, Andrew Morton, linux-kernel,
	apais, benhill, ssengar, sunilmut, vdso

On 09/06, Roman Kisel wrote:
>
> On 9/6/2024 1:55 PM, Oleg Nesterov wrote:
> >
> >Not that I think this is a good idea, but std::breakpoint_if_debugging()
> >looks even more strange to me...
> Can't speak for everyone obviously, I've found that convenient
> when making sense of large (unknown) codebases instead of setting
> up breakpoints and adding prints/logs, and when the process
> can't/doesn't fault when it encounters a fatal error.

Sorry, I don't understand.

I fail to understand how/why people can use std::breakpoint_if_debugging().
To me it doesn't look useful at all.

But you can safely ignore me, I do not pretend I understand the userspace's
needs.

And I guess people will use it anyway, so I won't argue with, say, a trivial
patch which just adds

	case PR_GET_PTRACED:
		error = !!current->ptrace;
		break;

into sys_prctl(), even if I agree that this probably just makes bad behavior
easier.

But you need to convince Linus.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-08 14:08             ` Oleg Nesterov
@ 2024-09-09 15:19               ` Roman Kisel
  2024-09-09 16:42                 ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Roman Kisel @ 2024-09-09 15:19 UTC (permalink / raw)
  To: oleg
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso

On 9/8/2024, Oleg Nesterov wrote:

> On 09/06, Roman Kisel wrote:
> >
> > On 9/6/2024 1:55 PM, Oleg Nesterov wrote:
> > >
> > >Not that I think this is a good idea, but std::breakpoint_if_debugging()
> > >looks even more strange to me...
> > Can't speak for everyone obviously, I've found that convenient
> > when making sense of large (unknown) codebases instead of setting
> > up breakpoints and adding prints/logs, and when the process
> > can't/doesn't fault when it encounters a fatal error.
>
> Sorry, I don't understand.
>
> I fail to understand how/why people can use std::breakpoint_if_debugging().
> To me it doesn't look useful at all.
>
> But you can safely ignore me, I do not pretend I understand the userspace's
> needs.
>
> And I guess people will use it anyway, so I won't argue with, say, a trivial
> patch which just adds
> 
> case PR_GET_PTRACED:
>     error = !!current->ptrace;
>     break;
>
> into sys_prctl(), even if I agree that this probably just makes bad behavior
> easier.

Very kind of you trying to build a longer table rather than a taller fence,
I appreciate that very much! Your aproach looks very neat indeed, I've learned
a lot from all sugestions you have shared.

>
> But you need to convince Linus.

No new evidence, I rest my case. The difference seems to be a
matter of on which which set of axioms one builds the theorems,
and these sets come across as non-compatible. It might be prudent
to repack these to give this some final thoughts and move on.
Hoping this might be interesting to folks whose service to humanity
requires wearing kernel-tinted glasses.

I brought up evidence-based arguments of the change providing
benefits for the user space, and these were countered with "bad
behavior" on the grounds that changing program's behavior under
debugger is bad. Well, if it's bad for you, you won't do that.
The very notion of convincing becomes devoid of sense in this
situation.

Good for us to land in the debugger at the point of panic without
fiddling with the breakpoints; we're going to continue enjoying that.
Also good to slip in `std::breakpoint_if_debugging` in some obscure
function in someone's library to see how the execution gets there
instead of figuring out through which pointers it is called and what
full name like  `A::B::C::X::Y::Z::func::{impl #12}()` needs to be
used for the breakpoint. Not having to use/know the trap instruction
mnemonic for the target architecture feels not all that bad in the
user space.

That all is very different for the kernel: one C aka portable assembly
codebase that runs the world and where change might be hard, no deadline
or time-to-market as no one sells the kernel as a commodity. Producing
more software faster requires farming things off to the toolchain
(like memory management) or to the 3rd party libraries (to be as general
as possible they use abstractions of abstractions of abstractions cooked
on vtabless, generics and traits), and these present a complication when
debugging. Sure can use a demangled name for the "fatal()" function,
then again why bother and learn the name of that function and making that
into a hard dependency?

Gdb has got aids to land at the `main` function (the `start` command),
why so much ink is being spent of the aids for panic?

It was told in the discussion that all these problems had been solved
before Linux existed. I dare to say the more precise statement would
be that they were solved for a different world. Some 40-30 years
after we live in the world eaten by the software and automation and
figuring out where the bug is can be helped by different means better.

Besides bugs in the patch, why would the kernel even _care_ if/how
the user land uses that tracer PID? That is not sensitive data, and
the user land has access to that already, and uses it in numeruous
large libraries (not just some scrawny pet project of mine) but only
via proc FS and parsing the "/proc/self/status" text pseudo-file.

The kernel could be like "dear user land, here is your sweet one liner
for getting the tracer PID, go knock yourself out reading the tracer
PIDs all day long". And the user land be like "oh, thank you, dear
kernel, one can always count on you!". Win-win. Easy. Joyous.

If folks are still reading or jumped over this long wall of text to
the last paragraph, color me clueless as to why providing a simpler
interface for such an inconsequntial thing as the tracer PID instead
of the existing convoluted one (nailed to proc FS, too) needs to be
resolved by the project's BDFL. Excuse my sudden loss of eloquence
but LOL WUT!

>
> Oleg.

Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-09 15:19               ` Roman Kisel
@ 2024-09-09 16:42                 ` Oleg Nesterov
  2024-09-09 17:05                   ` Roman Kisel
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-09 16:42 UTC (permalink / raw)
  To: Roman Kisel
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, ssengar, sunilmut,
	torvalds, vdso

On 09/09, Roman Kisel wrote:
>
> On 9/8/2024, Oleg Nesterov wrote:
>
> > But you can safely ignore me, I do not pretend I understand the userspace's
> > needs.
> >
> > And I guess people will use it anyway, so I won't argue with, say, a trivial
> > patch which just adds
> >
> > case PR_GET_PTRACED:
> >     error = !!current->ptrace;
> >     break;
> >
> > into sys_prctl(), even if I agree that this probably just makes bad behavior
> > easier.
>
> Very kind of you trying to build a longer table rather than a taller fence,
> I appreciate that very much! Your aproach looks very neat indeed,

Well, you didn't answer my question in
https://lore.kernel.org/all/20240906114819.GA20831@redhat.com/
so I decided that a simpler change which returns !!current->ptrace instead
of the tracer's pid might work as well.

Sorry for annoying you.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-09 16:42                 ` Oleg Nesterov
@ 2024-09-09 17:05                   ` Roman Kisel
  0 siblings, 0 replies; 29+ messages in thread
From: Roman Kisel @ 2024-09-09 17:05 UTC (permalink / raw)
  To: oleg
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso

On 09/09, Oleg Nesterov wrote:

> On 09/09, Roman Kisel wrote:
> >
> > On 9/8/2024, Oleg Nesterov wrote:
> >
> > > But you can safely ignore me, I do not pretend I understand the userspace's
> > > needs.
> > >
> > > And I guess people will use it anyway, so I won't argue with, say, a trivial
> > > patch which just adds
> > >
> > > case PR_GET_PTRACED:
> > >     error = !!current->ptrace;
> > >     break;
> > >
> > > into sys_prctl(), even if I agree that this probably just makes bad behavior
> > > easier.
> >
> > Very kind of you trying to build a longer table rather than a taller fence,
> > I appreciate that very much! Your aproach looks very neat indeed,
> 
> Well, you didn't answer my question in
> https://lore.kernel.org/all/20240906114819.GA20831@redhat.com/
> so I decided that a simpler change which returns !!current->ptrace instead
> of the tracer's pid might work as well.
> 

Apologies for that! After Linus had been added, I braced for the impact as
obviously I was not fixing anything urgent or making some breakthrough
deserving such attention. I guess I got my 101 on adding code to "./kernel" :D

> Sorry for annoying you.
> 
Sorry if my response carried that connotation. I indeed learned a lot from
your suggestions. I'll make sure to write better.

> Oleg.

Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-05 21:27 ` [PATCH 1/1] ptrace: " Roman Kisel
  2024-09-06 11:24   ` Oleg Nesterov
@ 2024-09-07 19:33   ` kernel test robot
  1 sibling, 0 replies; 29+ messages in thread
From: kernel test robot @ 2024-09-07 19:33 UTC (permalink / raw)
  To: Roman Kisel, oleg, linux-kernel
  Cc: oe-kbuild-all, apais, benhill, ssengar, sunilmut, vdso

Hi Roman,

kernel test robot noticed the following build warnings:

[auto build test WARNING on ad618736883b8970f66af799e34007475fe33a68]

url:    https://github.com/intel-lab-lkp/linux/commits/Roman-Kisel/ptrace-Get-tracer-PID-without-reliance-on-the-proc-FS/20240906-085121
base:   ad618736883b8970f66af799e34007475fe33a68
patch link:    https://lore.kernel.org/r/20240905212741.143626-2-romank%40linux.microsoft.com
patch subject: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
config: x86_64-randconfig-122-20240907 (https://download.01.org/0day-ci/archive/20240908/202409080315.xFbJCdN5-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240908/202409080315.xFbJCdN5-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409080315.xFbJCdN5-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/ptrace.c:55:22: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/ptrace.c:55:22: sparse:    struct task_struct *
   kernel/ptrace.c:55:22: sparse:    struct task_struct [noderef] __rcu *
   kernel/ptrace.c:74:23: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct task_struct [noderef] __rcu *parent @@     got struct task_struct *new_parent @@
   kernel/ptrace.c:74:23: sparse:     expected struct task_struct [noderef] __rcu *parent
   kernel/ptrace.c:74:23: sparse:     got struct task_struct *new_parent
   kernel/ptrace.c:75:29: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct cred const [noderef] __rcu *ptracer_cred @@     got struct cred const * @@
   kernel/ptrace.c:75:29: sparse:     expected struct cred const [noderef] __rcu *ptracer_cred
   kernel/ptrace.c:75:29: sparse:     got struct cred const *
   kernel/ptrace.c:129:18: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct cred const *old_cred @@     got struct cred const [noderef] __rcu *ptracer_cred @@
   kernel/ptrace.c:129:18: sparse:     expected struct cred const *old_cred
   kernel/ptrace.c:129:18: sparse:     got struct cred const [noderef] __rcu *ptracer_cred
   kernel/ptrace.c:133:25: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:133:25: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:133:25: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:160:27: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:160:27: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:160:27: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:192:28: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:192:28: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:192:28: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:198:30: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:198:30: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:198:30: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:251:44: sparse: sparse: incompatible types in comparison expression (different address spaces):
   kernel/ptrace.c:251:44: sparse:    struct task_struct [noderef] __rcu *
   kernel/ptrace.c:251:44: sparse:    struct task_struct *
   kernel/ptrace.c:494:54: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *parent @@     got struct task_struct [noderef] __rcu *parent @@
   kernel/ptrace.c:494:54: sparse:     expected struct task_struct *parent
   kernel/ptrace.c:494:54: sparse:     got struct task_struct [noderef] __rcu *parent
   kernel/ptrace.c:502:53: sparse: sparse: incorrect type in argument 2 (different address spaces) @@     expected struct task_struct *new_parent @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/ptrace.c:502:53: sparse:     expected struct task_struct *new_parent
   kernel/ptrace.c:502:53: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/ptrace.c:550:41: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *p1 @@     got struct task_struct [noderef] __rcu *real_parent @@
   kernel/ptrace.c:550:41: sparse:     expected struct task_struct *p1
   kernel/ptrace.c:550:41: sparse:     got struct task_struct [noderef] __rcu *real_parent
   kernel/ptrace.c:552:50: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct sighand_struct *sigh @@     got struct sighand_struct [noderef] __rcu *sighand @@
   kernel/ptrace.c:552:50: sparse:     expected struct sighand_struct *sigh
   kernel/ptrace.c:552:50: sparse:     got struct sighand_struct [noderef] __rcu *sighand
   kernel/ptrace.c:743:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:743:37: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:743:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:751:39: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:751:39: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:751:39: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:862:29: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:862:29: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:862:29: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:866:31: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:866:31: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:866:31: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:1096:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:1096:37: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:1096:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:1098:39: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:1098:39: sparse:     expected struct spinlock [usertype] *lock
   kernel/ptrace.c:1098:39: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c: note: in included file (through include/linux/rcuwait.h, include/linux/percpu-rwsem.h, include/linux/fs.h, ...):
   include/linux/sched/signal.h:754:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   include/linux/sched/signal.h:754:37: sparse:     expected struct spinlock [usertype] *lock
   include/linux/sched/signal.h:754:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:380:30: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *l @@     got struct spinlock [noderef] __rcu * @@
   kernel/ptrace.c:380:30: sparse:     expected struct spinlock [usertype] *l
   kernel/ptrace.c:380:30: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:456:17: sparse: sparse: context imbalance in 'ptrace_attach' - different lock contexts for basic block
   kernel/ptrace.c:500:38: sparse: sparse: dereference of noderef expression
   include/linux/sched/signal.h:754:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   include/linux/sched/signal.h:754:37: sparse:     expected struct spinlock [usertype] *lock
   include/linux/sched/signal.h:754:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:690:9: sparse: sparse: context imbalance in 'ptrace_getsiginfo' - different lock contexts for basic block
   include/linux/sched/signal.h:754:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   include/linux/sched/signal.h:754:37: sparse:     expected struct spinlock [usertype] *lock
   include/linux/sched/signal.h:754:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:706:9: sparse: sparse: context imbalance in 'ptrace_setsiginfo' - different lock contexts for basic block
   include/linux/sched/signal.h:754:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   include/linux/sched/signal.h:754:37: sparse:     expected struct spinlock [usertype] *lock
   include/linux/sched/signal.h:754:37: sparse:     got struct spinlock [noderef] __rcu *
   include/linux/sched/signal.h:754:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   include/linux/sched/signal.h:754:37: sparse:     expected struct spinlock [usertype] *lock
   include/linux/sched/signal.h:754:37: sparse:     got struct spinlock [noderef] __rcu *
   kernel/ptrace.c:1255:9: sparse: sparse: context imbalance in 'ptrace_request' - different lock contexts for basic block
>> kernel/ptrace.c:1284:67: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct task_struct *tsk @@     got struct task_struct [noderef] __rcu *parent @@
   kernel/ptrace.c:1284:67: sparse:     expected struct task_struct *tsk
   kernel/ptrace.c:1284:67: sparse:     got struct task_struct [noderef] __rcu *parent

vim +1284 kernel/ptrace.c

  1257	
  1258	SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
  1259			unsigned long, data)
  1260	{
  1261		struct task_struct *child, *tracer;
  1262		long ret;
  1263	
  1264		if (request == PTRACE_TRACEME) {
  1265			ret = ptrace_traceme();
  1266			goto out;
  1267		}
  1268	
  1269		child = find_get_task_by_vpid(pid);
  1270		if (!child) {
  1271			ret = -ESRCH;
  1272			goto out;
  1273		}
  1274	
  1275		if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
  1276			ret = ptrace_attach(child, request, addr, data);
  1277			goto out_put_task_struct;
  1278		}
  1279	
  1280		if (request == PTRACE_TRACER) {
  1281			rcu_read_lock();
  1282			tracer = ptrace_parent(current);
  1283			ret = tracer ? task_pid_nr_ns(tracer,
> 1284						task_active_pid_ns(current->parent)) : -ESRCH;
  1285			rcu_read_unlock();
  1286			goto out;
  1287		}
  1288	
  1289		ret = ptrace_check_attach(child, request == PTRACE_KILL ||
  1290					  request == PTRACE_INTERRUPT);
  1291		if (ret < 0)
  1292			goto out_put_task_struct;
  1293	
  1294		ret = arch_ptrace(child, request, addr, data);
  1295		if (ret || request != PTRACE_DETACH)
  1296			ptrace_unfreeze_traced(child);
  1297	
  1298	 out_put_task_struct:
  1299		put_task_struct(child);
  1300	 out:
  1301		return ret;
  1302	}
  1303	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
@ 2024-09-07  8:45 Jubilee Young
  2024-09-09 19:37 ` Oleg Nesterov
  2024-09-10 16:34 ` Eric W. Biederman
  0 siblings, 2 replies; 29+ messages in thread
From: Jubilee Young @ 2024-09-07  8:45 UTC (permalink / raw)
  To: oleg
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso

> if the process is ptraced, debugger can insert the breakoint into
> please_insert_the_breakpoint_here(). Otherwise breakpoint_if_debugging()
> is a cheap nop.

Generally a programmer wants to put this kind of conditional breakpoint
on an exception path, for which, unless one is working with a language that
abuses exceptions for control flow (which unfortunately describes... many),
the performance isn't of enormous concern. Not that it's free, either, but
opening a file and scanning it is a lot more code than a single call to prctl.

> Perhaps it makes sense to discuss the alternatives? Say, a process can have a
> please_insert_the_breakpoint_here() function implemented in asm which just does
> asm(ret).

There's some merit in having the debuggers recognize this pattern, as that
then would save every language that wants to have this power available
the trouble of reimplementing it. But first debuggers must recognize it,
which would require teaching each of them, which can be... tedious.
Having a function named `fatal` or whatever likewise has this issue.
A toolchain, however, can simply insert a jump to a breakpoint easily,
without having to hold gdb, lldb, cdb, and whatever-other-db's hand.

- Jubilee

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-07  8:45 Jubilee Young
@ 2024-09-09 19:37 ` Oleg Nesterov
  2024-09-10 15:40   ` Roman Kisel
  2024-09-10 16:34 ` Eric W. Biederman
  1 sibling, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-09 19:37 UTC (permalink / raw)
  To: Jubilee Young
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso

On 09/07, Jubilee Young wrote:
>
> > Perhaps it makes sense to discuss the alternatives? Say, a process can have a
> > please_insert_the_breakpoint_here() function implemented in asm which just does
> > asm(ret).
>
> There's some merit in having the debuggers recognize this pattern, as that
> then would save every language that wants to have this power available
> the trouble of reimplementing it. But first debuggers must recognize it,
> which would require teaching each of them, which can be... tedious.

Yet another thing in this discussion I can't understand... sorry, I tried.
You do not need to teach, say, gdb to recognize this pattern. You can just do

	$ gdb -ex 'b please_insert_the_breakpoint_here' ...

Nevermind, as I have already said you can safely ignore me. I still do not
see any "real" use-case for breakpoint_if_debugging(), but I guess that is
due to my ignorance and lack of imagination.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-09 19:37 ` Oleg Nesterov
@ 2024-09-10 15:40   ` Roman Kisel
  2024-09-11 14:44     ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Roman Kisel @ 2024-09-10 15:40 UTC (permalink / raw)
  To: oleg
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso, workingjubilee

On 09/09, Oleg wrote:

> On 09/07, Jubilee Young wrote:
> >
> > > Perhaps it makes sense to discuss the alternatives? Say, a process can have a
> > > please_insert_the_breakpoint_here() function implemented in asm which just does
> > > asm(ret).
> >
> > There's some merit in having the debuggers recognize this pattern, as that
> > then would save every language that wants to have this power available
> > the trouble of reimplementing it. But first debuggers must recognize it,
> > which would require teaching each of them, which can be... tedious.
> 
> Yet another thing in this discussion I can't understand... sorry, I tried.
> You do not need to teach, say, gdb to recognize this pattern. You can just do
> 
> 	$ gdb -ex 'b please_insert_the_breakpoint_here' ...
> 
> Nevermind, as I have already said you can safely ignore me. I still do not
> see any "real" use-case for breakpoint_if_debugging(), but I guess that is
> due to my ignorance and lack of imagination.

I've started this so let me butt in and take up the gaunlet.

Lambda's would be the most prominent example to me[1]. The toolchain
doesn't give them the user-accesible type and the name as it does for
the functions. Hence in the example above can't do the short'n'sweet
"b lol_wut", the debugger would complain about a function not found.
The right incantaion should be

"b main::{lambda()#1}::operator()() const" for gdb and 
"b main::$_0::operator()() const" for lldb,

that is visible in the disassembly with name demangling. I tried all "b"'s
options mentioned in the documentation to no avail. Likely the complexity
of some of the modern day user land code isn't handled gracefully by
the console debuggers that started out decades ago as debuggers for
the C code.

Sure, can fix that. Or the patch might be rejected because of reasons
being pretty noble from the maintainers' point of view. Say, lambdas
are ugly so why encourage people to use them?

D'oh, might go then write a Python extension that figures out the address
of the lambda. Can set the breakpoint by the line number yet that might
need to be updated when the source file changes, can put a hardware
breakpoint on the variable's address, yet slipping in the call to
"breakpoint_if_debugging" feels easier than using the commands in the
debugger.

Somewhat less painful wrt breakpoints are couroutines[2]. One would
think

"b switch_to_new_thread::awaitable::await_suspend"

should suffice but the right form incantation appears to be

"b switch_to_new_thread(std::jthread&)::awaitable::await_suspend(std::__n4861::coroutine_handle<void>)".

Fortunately, "br se --basename await_suspend" saves the day, still a
handful to type. Wait, what if there are _many_ "await_suspend" due to
function overloading, for example?

As for how real the examples are, the pattern of using lambdas/functors
is pretty common for libraries to define callbacks. Coroutines
are used to get some more perf due to less context switches - as we all
know - giving the semblance of a funcition call (I have had an experience
of working in a codebase that has that), makes coding huge state machines
more modular as I feel it.

That becomes very real if the customer is in pain and need to find the bug
asap. Using "breakpoint_if_debugging" vs figuring out in what nested namespace
the function is defined does have its merits. One can see that similar to
adding more printk's in the code. That notwithstanding, shipping the code with
"breakpoint_if_debugging" in the non-fatal path might be the last resort to
perhaps catch a race happening rarely if nothing else works out. Highly
debatable though in my opinion.

Similar should apply for Rust and Swift I believe as both use name mangling
to desugar code and they add hidden/automatic code paths as C++ does. The
sophisticated abstraction machinery, function name overloading, syntactic
sugar and the hidden code paths seem to make using the console debuggers
a bit more tedious I believe. It all started out with people not wanting to
care all the time about register allocation, and we got C instead of assembly :D
Machines are faster now, can parse more complex languages it appears and 
the compiler can do static reference checking, type inference and not needing
to have the declaration strictly before using it in the source file.
'Cause the times, they are a-changin' I guess.

> 
> Oleg.

Roman

[1]
// Compile with clang or gcc and "-std=c++11 -o bp ./bp.cpp".

// The stack in lldb/macOS:
//
// * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BREAKPOINT (code=1, subcode=0x100003f38)
//   * frame #0: 0x0000000100003f38 bp`std_breakpoint_if_debugging()
//     frame #1: 0x0000000100003f84 bp`main::$_0::operator()() const + 20
//     frame #2: 0x0000000100003f60 bp`main + 32

// The stack in gdb/Linux:
//
// Program received signal SIGTRAP, Trace/breakpoint trap.
// 0x00000000004101ac in std_breakpoint_if_debugging() ()
// (gdb) bt
// #0  0x00000000004101ac in std_breakpoint_if_debugging() ()
// #1  0x00000000004101c0 in main::{lambda()#1}::operator()() const ()
// #2  0x00000000004101f8 in main ()

void std_breakpoint_if_debugging() {
    #if defined(__i386__) || defined(__x86_64__)
        __asm__("int3");
    #elif defined(__aarch64__) || defined(__arm__)
        __builtin_trap();
    #elif defined(__powerpc__) || defined(__ppc__) || defined(__PPC__) || defined(__powerpc64__)
        __asm__(".long 0x7d821008"); // twge r2,r2? Wouldn't bet on that. Would you?
    #elif defined(__riscv)
        __asm__("ebreak");
    #else
        #error "Arch not supported"
    #endif
}

int main() {
    volatile int why_not;

    auto lol_wut = [&](){
        std_breakpoint_if_debugging();
        ++why_not;
    };

    lol_wut();
    return why_not;
}

[2, taken from https://en.cppreference.com/w/cpp/language/coroutines and adapted for illustrating the point]
// Compile with g++ -std=c++23 -fcoroutines -o corou ./corou.cpp

#include <coroutine>
#include <iostream>
#include <stdexcept>
#include <thread>

void std_breakpoint_if_debugging() {
#if defined(__i386__) || defined(__x86_64__)
    __asm__("int3");
#elif defined(__aarch64__) || defined(__arm__)
    __builtin_trap();
#elif defined(__powerpc__) || defined(__ppc__) || defined(__PPC__) || defined(__powerpc64__)
    __asm__(".long 0x7d821008"); // twge r2,r2? Wouldn't bet on that.
#elif defined(__riscv)
    __asm__("ebreak");
#else
    #error "Arch not supported"
#endif
}

auto switch_to_new_thread(std::jthread& out)
{
    struct awaitable
    {
        std::jthread* p_out;
        bool await_ready() { return false; }
        void await_suspend(std::coroutine_handle<> h)
        {
            std::jthread& out = *p_out;
            if (out.joinable())
                throw std::runtime_error("Output jthread parameter not empty");
            out = std::jthread([h] { h.resume(); });
            // Potential undefined behavior: accessing potentially destroyed *this
            // std::cout << "New thread ID: " << p_out->get_id() << '\n';
            std::cout << "New thread ID: " << out.get_id() << '\n'; // this is OK
            std_breakpoint_if_debugging();
        }
        void await_resume() {}
    };
    return awaitable{&out};
}

struct task
{
    struct promise_type
    {
        task get_return_object() { return {}; }
        std::suspend_never initial_suspend() { return {}; }
        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() {}
    };
};

task resuming_on_new_thread(std::jthread& out)
{
    std::cout << "Coroutine started on thread: " << std::this_thread::get_id() << '\n';
    co_await switch_to_new_thread(out);
    // awaiter destroyed here
    std::cout << "Coroutine resumed on thread: " << std::this_thread::get_id() << '\n';
}

int main()
{
    std::jthread out;
    resuming_on_new_thread(out);
}

--
Thank you,
Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-10 15:40   ` Roman Kisel
@ 2024-09-11 14:44     ` Oleg Nesterov
  2024-09-11 17:41       ` Roman Kisel
  0 siblings, 1 reply; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-11 14:44 UTC (permalink / raw)
  To: Roman Kisel
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, ssengar, sunilmut,
	torvalds, vdso, workingjubilee

We certainly can't understand each other. At least, I certainly can't.

On 09/10, Roman Kisel wrote:
>
> On 09/09, Oleg wrote:
>

> > Yet another thing in this discussion I can't understand... sorry, I tried.
> > You do not need to teach, say, gdb to recognize this pattern. You can just do
> >
> > 	$ gdb -ex 'b please_insert_the_breakpoint_here' ...
> >
> > Nevermind, as I have already said you can safely ignore me. I still do not
> > see any "real" use-case for breakpoint_if_debugging(), but I guess that is
> > due to my ignorance and lack of imagination.
>
> I've started this so let me butt in and take up the gaunlet.
>
> Lambda's would be the most prominent example to me[1]. The toolchain
> doesn't give them the user-accesible type and the name as it does for
> the functions.

And?

Once again, what I tried to suggest is a single "nop" function,
"void please_insert_the_breakpoint_here()" which simply does asm("ret") and

	#define breakpoint_if_debugging()	\
		please_insert_the_breakpoint_here()

Or, to make it more cheap,

	#define breakpoint_if_debugging()	\
		asm volatile ("call please_insert_the_breakpoint_here" : ASM_CALL_CONSTRAINT);

so that compiler will know that breakpoint_if_debugging() doesn't change the regs.

Again, again, I am not saying that this is necessarily the best solution.
Just I fail to understand your email, sorry.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-11 14:44     ` Oleg Nesterov
@ 2024-09-11 17:41       ` Roman Kisel
  2024-09-11 19:53         ` Oleg Nesterov
  0 siblings, 1 reply; 29+ messages in thread
From: Roman Kisel @ 2024-09-11 17:41 UTC (permalink / raw)
  To: oleg
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso, workingjubilee

On 09/11, Oleg Nesterov wrote:

> On 09/10, Roman Kisel wrote:
> >
> > On 09/09, Oleg wrote:
> >
>
> > > Yet another thing in this discussion I can't understand... sorry, I tried.
> > > You do not need to teach, say, gdb to recognize this pattern. You can just do
> > >
> > > 	$ gdb -ex 'b please_insert_the_breakpoint_here' ...
> > >
> > > Nevermind, as I have already said you can safely ignore me. I still do not
> > > see any "real" use-case for breakpoint_if_debugging(), but I guess that is
> > > due to my ignorance and lack of imagination.
> >
> > I've started this so let me butt in and take up the gaunlet.
> >
> > Lambda's would be the most prominent example to me[1]. The toolchain
> > doesn't give them the user-accesible type and the name as it does for
> > the functions.
> 
> And?
>

You wanted an example of '"real" use-case for breakpoint_if_debugging()':

> > > Nevermind, as I have already said you can safely ignore me. I still do not
> > > see any "real" use-case for breakpoint_if_debugging(), but I guess that is
> > > due to my ignorance and lack of imagination.

I have provided them, and illustrated how it is tiresome to set the breakpoint
in the debugger in these cases so can add a call to breakpoint_if_debugging()
to these places instead.

> Once again, what I tried to suggest is a single "nop" function,
> "void please_insert_the_breakpoint_here()" which simply does asm("ret") and
> 
> 	#define breakpoint_if_debugging()	\
> 		please_insert_the_breakpoint_here()
> 
> Or, to make it more cheap,
>
> 	#define breakpoint_if_debugging()	\
> 		asm volatile ("call please_insert_the_breakpoint_here" : ASM_CALL_CONSTRAINT);
> 
> so that compiler will know that breakpoint_if_debugging() doesn't change the regs.
> 
> Again, again, I am not saying that this is necessarily the best solution.
> Just I fail to understand your email, sorry.

No worries, appreciate your willingness to help!

>
> Oleg.

Roman

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-11 17:41       ` Roman Kisel
@ 2024-09-11 19:53         ` Oleg Nesterov
  2024-09-11 19:57           ` Linus Torvalds
  2024-09-11 20:25           ` Roman Kisel
  0 siblings, 2 replies; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-11 19:53 UTC (permalink / raw)
  To: Roman Kisel
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, ssengar, sunilmut,
	torvalds, vdso, workingjubilee

Roman,

I can only repeat that we can't understand each other. Quite possibly my bad.

On 09/11, Roman Kisel wrote:
>
> On 09/11, Oleg Nesterov wrote:
>
> > On 09/10, Roman Kisel wrote:
> > >
> > > On 09/09, Oleg wrote:
> > >
> >
> > > > Yet another thing in this discussion I can't understand... sorry, I tried.
> > > > You do not need to teach, say, gdb to recognize this pattern. You can just do
> > > >
> > > > 	$ gdb -ex 'b please_insert_the_breakpoint_here' ...
> > > >
> > > > Nevermind, as I have already said you can safely ignore me. I still do not
> > > > see any "real" use-case for breakpoint_if_debugging(), but I guess that is
> > > > due to my ignorance and lack of imagination.
> > >
> > > I've started this so let me butt in and take up the gaunlet.
> > >
> > > Lambda's would be the most prominent example to me[1]. The toolchain
> > > doesn't give them the user-accesible type and the name as it does for
> > > the functions.
> >
> > And?
>
> You wanted an example of '"real" use-case for breakpoint_if_debugging()':

Then why does your email explain that c++ lambdas don't have a good name?
Why doesi it mention lambdas at all?

> > > > Nevermind, as I have already said you can safely ignore me. I still do not
> > > > see any "real" use-case for breakpoint_if_debugging(), but I guess that is
> > > > due to my ignorance and lack of imagination.
>
> I have provided them, and illustrated how it is tiresome to set the breakpoint
> in the debugger in these cases so can add a call to breakpoint_if_debugging()
> to these places instead.

Instead of what??? Instead of

	#define breakpoint_if_debugging()	\
		asm volatile ("call please_insert_the_breakpoint_here" : ASM_CALL_CONSTRAINT);

plus -ex 'b please_insert_the_breakpoint_here'???

If you say that this is ugly I won't even argue. But instead of what?

Roman, I am leaving this thread, sorry. But let me try to summarize.
Your patch was buggy and you seem to agree. Feel free to send V2 and I will
be happy to review it correctness-wise. But:

	- please keep Eric/Linus cc'ed

	- please try to make your changelog more convincing. And in particular,
	  please explain why !!current->ptrace is not enough and this feature
	  needs the tracer's pid.

	  If possible, please provide a clear/simple/artificial/whatever example
	  of the (pseudo)code which can justify this feature.
Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-11 19:53         ` Oleg Nesterov
@ 2024-09-11 19:57           ` Linus Torvalds
  2024-09-11 20:14             ` Oleg Nesterov
  2024-09-11 20:25           ` Roman Kisel
  1 sibling, 1 reply; 29+ messages in thread
From: Linus Torvalds @ 2024-09-11 19:57 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roman Kisel, akpm, apais, benhill, ebiederm, linux-kernel,
	ssengar, sunilmut, vdso, workingjubilee

On Wed, 11 Sept 2024 at 12:54, Oleg Nesterov <oleg@redhat.com> wrote:
>
>         - please try to make your changelog more convincing. And in particular,
>           please explain why !!current->ptrace is not enough and this feature
>           needs the tracer's pid.

Oleg, I realize you like the simpler patch that only has that
"!!current->ptrace", but my point is that even that simpler patch is
simply WRONG, WRONG, WRONG.

There is simply no valid situation where a "I have a tracer" is a good
thing to test for.

Whether it then gives just that "tracer exists" information, or the
pid of the tracer, or a proper pidfd that is actually reliable, is
then entirely immaterial. The whole feature fails at an earlier and
more fundamental point.

              Linus

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-11 19:57           ` Linus Torvalds
@ 2024-09-11 20:14             ` Oleg Nesterov
  0 siblings, 0 replies; 29+ messages in thread
From: Oleg Nesterov @ 2024-09-11 20:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Roman Kisel, akpm, apais, benhill, ebiederm, linux-kernel,
	ssengar, sunilmut, vdso, workingjubilee

On 09/11, Linus Torvalds wrote:
>
> On Wed, 11 Sept 2024 at 12:54, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> >         - please try to make your changelog more convincing. And in particular,
> >           please explain why !!current->ptrace is not enough and this feature
> >           needs the tracer's pid.
>
> Oleg, I realize you like the simpler patch that only has that
                  ^^^^^^^^
No, no, I don't!!! ;)

> "!!current->ptrace", but my point is that even that simpler patch is
> simply WRONG, WRONG, WRONG.

and I agree, agree, agree.

> There is simply no valid situation where a "I have a tracer" is a good
> thing to test for.

Yes, yes, and that is why I added you/Eric to this discussion.

I just tried to play fair. I just thought that I can't simply "nack" this
change, because I can't explain why I didn't like the whole idea.

Oleg.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-11 19:53         ` Oleg Nesterov
  2024-09-11 19:57           ` Linus Torvalds
@ 2024-09-11 20:25           ` Roman Kisel
  1 sibling, 0 replies; 29+ messages in thread
From: Roman Kisel @ 2024-09-11 20:25 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: akpm, apais, benhill, ebiederm, linux-kernel, ssengar, sunilmut,
	torvalds, vdso, workingjubilee



On 9/11/2024 12:53 PM, Oleg Nesterov wrote:
> Roman,
> 
> I can only repeat that we can't understand each other. Quite possibly my bad.
> 

Oleg,

I'll re-read the whole thread to see where I can improve.
Didn't mean to cause the undue burden for you. My apologies, will make a
note to myself to be crisper and more concise at the very least.

> On 09/11, Roman Kisel wrote:
>>
>> On 09/11, Oleg Nesterov wrote:
>>
>>> On 09/10, Roman Kisel wrote:
>>>>
>>>> On 09/09, Oleg wrote:
>>>>
>>>
>>>>> Yet another thing in this discussion I can't understand... sorry, I tried.
>>>>> You do not need to teach, say, gdb to recognize this pattern. You can just do
>>>>>
>>>>> 	$ gdb -ex 'b please_insert_the_breakpoint_here' ...
>>>>>
>>>>> Nevermind, as I have already said you can safely ignore me. I still do not
>>>>> see any "real" use-case for breakpoint_if_debugging(), but I guess that is
>>>>> due to my ignorance and lack of imagination.
>>>>
>>>> I've started this so let me butt in and take up the gaunlet.
>>>>
>>>> Lambda's would be the most prominent example to me[1]. The toolchain
>>>> doesn't give them the user-accesible type and the name as it does for
>>>> the functions.
>>>
>>> And?
>>
>> You wanted an example of '"real" use-case for breakpoint_if_debugging()':
> 
> Then why does your email explain that c++ lambdas don't have a good name?
> Why doesi it mention lambdas at all?
> 

I was making the point that setting a breakpoint on the lambda is quite
laborious, and instead one can put "breakpoint_if_debugging()" inside
of it.

>>>>> Nevermind, as I have already said you can safely ignore me. I still do not
>>>>> see any "real" use-case for breakpoint_if_debugging(), but I guess that is
>>>>> due to my ignorance and lack of imagination.
>>
>> I have provided them, and illustrated how it is tiresome to set the breakpoint
>> in the debugger in these cases so can add a call to breakpoint_if_debugging()
>> to these places instead.
> 
> Instead of what??? Instead of
> 
> 	#define breakpoint_if_debugging()	\
> 		asm volatile ("call please_insert_the_breakpoint_here" : ASM_CALL_CONSTRAINT);
> 
> plus -ex 'b please_insert_the_breakpoint_here'???
> 
> If you say that this is ugly I won't even argue. But instead of what?
> 
> Roman, I am leaving this thread, sorry. But let me try to summarize.
> Your patch was buggy and you seem to agree. Feel free to send V2 and I will
> be happy to review it correctness-wise. But:
> 
> 	- please keep Eric/Linus cc'ed
> 
> 	- please try to make your changelog more convincing. And in particular,
> 	  please explain why !!current->ptrace is not enough and this feature
> 	  needs the tracer's pid.
> 
> 	  If possible, please provide a clear/simple/artificial/whatever example
> 	  of the (pseudo)code which can justify this feature.

Oleg,

I owe you a great deal of gratitude for giving my arguments a chance! I
won't reply more to this thread, too. Will use the relevant points
from the list you kindly provided where that applies when posting
to LKML in the future.

> Oleg.

-- 
Thank you,
Roman


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
  2024-09-07  8:45 Jubilee Young
  2024-09-09 19:37 ` Oleg Nesterov
@ 2024-09-10 16:34 ` Eric W. Biederman
  1 sibling, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2024-09-10 16:34 UTC (permalink / raw)
  To: Jubilee Young
  Cc: oleg, akpm, apais, benhill, linux-kernel, romank, ssengar,
	sunilmut, torvalds, vdso

Jubilee Young <workingjubilee@gmail.com> writes:

>> if the process is ptraced, debugger can insert the breakoint into
>> please_insert_the_breakpoint_here(). Otherwise breakpoint_if_debugging()
>> is a cheap nop.
>
> Generally a programmer wants to put this kind of conditional breakpoint
> on an exception path, for which, unless one is working with a language that
> abuses exceptions for control flow (which unfortunately describes... many),
> the performance isn't of enormous concern. Not that it's free, either, but
> opening a file and scanning it is a lot more code than a single call
> to prctl.

I want to reiterate what Oleg pointed out elsewhere in this
conversation.

Observing a process is ptracing the current process does not mean a
debugger is attached to the current process.  It could be strace, or
upstart or some other code that happens to use the ptrace facility.

Which in turn means that opening /proc/self/status and looking for
TracerPid does not reliably detect if a debugger is attached.

Which unfortunately means that this must be solved as a coopeartive
effort between the library implementation and the debuggers.  Nothing
else can reliably tell you that a debugger is attached to your process.

Which in turn means this can not be solved in the kernel.  It must
be done as a cooperative effort between the implementation of the
library and the debuggers.

>> Perhaps it makes sense to discuss the alternatives? Say, a process can have a
>> please_insert_the_breakpoint_here() function implemented in asm which just does
>> asm(ret).
>
> There's some merit in having the debuggers recognize this pattern, as that
> then would save every language that wants to have this power available
> the trouble of reimplementing it. But first debuggers must recognize it,
> which would require teaching each of them, which can be... tedious.

Unfortunately that is the only solution that I can see that will work
reliably.

> Having a function named `fatal` or whatever likewise has this issue.
> A toolchain, however, can simply insert a jump to a breakpoint easily,
> without having to hold gdb, lldb, cdb, and whatever-other-db's hand.

Eric

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-09-11 20:25 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-05 21:27 [PATCH 0/1] Get tracer PID without reliance on the proc FS Roman Kisel
2024-09-05 21:27 ` [PATCH 1/1] ptrace: " Roman Kisel
2024-09-06 11:24   ` Oleg Nesterov
2024-09-06 11:48     ` Oleg Nesterov
2024-09-06 19:09     ` Linus Torvalds
2024-09-06 20:08       ` Roman Kisel
2024-09-06 20:26         ` Linus Torvalds
2024-09-06 21:15           ` Roman Kisel
2024-09-09 16:18             ` Eric W. Biederman
2024-09-09 17:05               ` Oleg Nesterov
2024-09-09 17:34                 ` Eric W. Biederman
2024-09-09 17:22               ` Roman Kisel
2024-09-06 20:55         ` Oleg Nesterov
2024-09-06 21:25           ` Roman Kisel
2024-09-08 14:08             ` Oleg Nesterov
2024-09-09 15:19               ` Roman Kisel
2024-09-09 16:42                 ` Oleg Nesterov
2024-09-09 17:05                   ` Roman Kisel
2024-09-07 19:33   ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2024-09-07  8:45 Jubilee Young
2024-09-09 19:37 ` Oleg Nesterov
2024-09-10 15:40   ` Roman Kisel
2024-09-11 14:44     ` Oleg Nesterov
2024-09-11 17:41       ` Roman Kisel
2024-09-11 19:53         ` Oleg Nesterov
2024-09-11 19:57           ` Linus Torvalds
2024-09-11 20:14             ` Oleg Nesterov
2024-09-11 20:25           ` Roman Kisel
2024-09-10 16:34 ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox