public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Roman Kisel <romank@linux.microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	 Oleg Nesterov <oleg@redhat.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,  apais@microsoft.com,
	benhill@microsoft.com,  ssengar@microsoft.com,
	 sunilmut@microsoft.com, vdso@hexbites.dev
Subject: Re: [PATCH 1/1] ptrace: Get tracer PID without reliance on the proc FS
Date: Mon, 09 Sep 2024 11:18:45 -0500	[thread overview]
Message-ID: <87ed5ser6i.fsf@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <61713c72-bd86-4694-9c06-49579a33d8f3@linux.microsoft.com> (Roman Kisel's message of "Fri, 6 Sep 2024 14:15:33 -0700")

Roman Kisel <romank@linux.microsoft.com> writes:

> On 9/6/2024 1:26 PM, Linus Torvalds wrote:
>> On Fri, 6 Sept 2024 at 13:08, Roman Kisel <romank@linux.microsoft.com> wrote:
>>>
>>> When the process has run into a fatal error and is about to exit, having
>>> a way to break into the debugger at this exact moment wouldn't change
>>> anything about the logic of the data processing happening in the process.
>>> What's so horrible in that to have a way to land in the debugger to see
>>> what exactly is going on?
>> I don't buy it.
>> If you want to debug some fatal behavior, and a debugger *isn't*
>> attached, you want it to create a core-dump.
>> And if a debugger *is* attached, it will catch that.
>> This is basically how abort() has always worked, and it very much is
>> *not* doing some "let's check if we're being debugged" stuff. Exactly
>> because that would be a bad idea and an anti-pattern.
>> The other very traditional model - for example if you do *not* want to
>> do core-dumps for some reason, and just exit with an error message -
>> is to just put a breakpoint on the "fatal()" function (or whatever
>> your "fatal error happened" situation is) using the debugger.
>> Then the target will behave differently depending on whether it is
>> being debugged or not BECAUSE THE DEBUGGER ASKED FOR IT, not because
>> the program decided to act differently when debugged.
>> In other words, this is a long-solved problem with solid solutions
>> from before Linux even existed.
>
>
> Writing a core-dump might not be an option as you have pointed out
> hence the "fatal()" function might not be permitted to fault.

For that you just need to set core file size rlimit to 0.
Then you can safely raise SIGABRT to terminate your process.

A debugger can also stop at PTRACE_EVENT_EXIT.  The process
is still available (not cleaned up), but already fatally dead.

So I think the scenario of a process exiting is safely handled.

So if that is the case you care about I would say please look at
PTRACE_EVENT_EXIT.

> Breaking into the debugger if it is attached saves a step of setting
> a breakpoint and doesn't require the knowledge of the guts of the
> standard library and/or the journey of the trap exception from the
> CPU to the debugger. The very name of the "fatal()" function is a
> tight contract, and something akin to the onion in the varnish for
> the uninitiated.
>
> Libraries like Google Breakpad, Boost.Test, C++ std, AWS C SDK,
> Unreal Engine have that "breakpoint_if_debugging" facility, folks
> find that useful. They all read and parse the "/proc/self/status" file,
> where as it may seem, just the ptrace syscall one liner could save
> the trouble of that. The kernel helps the user space do work as I
> understand, why police it? There is "fork()", and threads can deadlock.
> Quite horrible, still the user space has access to that. Here,
> there is evidence folks do want to detect if a debugger is present,
> there is evidence how the kernel can help the user space compute that
> with so much less effort, the patch is trivial. Why don't let the
> userspace burn less electricity?

If you want more than just stopping at when a process exits, such as the
not yet standardized std:is_debugger_present and
std:breakpoint_if_debugging calls.  Then a real world justification
needs to be shown why the kernel should optimize for the uncommon case.

Especially because it will take time and energy and maintenance to keep
that going, all for to support a set of facilities that seem highly
dubious.

I suspect the best way to support breakpoint_if_debugging (in the
general case) is to make them nop functions, and then have something
scan the source code collect those locations, and feed those locations
into the debugger as break points.  Perhaps the compiler can be that
scanner and record the list of suggest break points somewhere that
a debugger can read.

As for std:is_debugger_present I suspect the best way to handle that
is to keep the current implementation, making it expensive to use so
people won't use it unthinkingly.  If people are using this facility
enough that they are wasting electricity, an expensive operation at
least has the potential to make people to stop and think about what they
are doing.

From a don't introduce heisenbug's perspective I think optimizing any of
this in the kernel looks like a bad idea.

Am I missing something?

Eric

  reply	other threads:[~2024-09-09 16:46 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-05 21:27 [PATCH 0/1] Get tracer PID without reliance on the proc FS Roman Kisel
2024-09-05 21:27 ` [PATCH 1/1] ptrace: " Roman Kisel
2024-09-06 11:24   ` Oleg Nesterov
2024-09-06 11:48     ` Oleg Nesterov
2024-09-06 19:09     ` Linus Torvalds
2024-09-06 20:08       ` Roman Kisel
2024-09-06 20:26         ` Linus Torvalds
2024-09-06 21:15           ` Roman Kisel
2024-09-09 16:18             ` Eric W. Biederman [this message]
2024-09-09 17:05               ` Oleg Nesterov
2024-09-09 17:34                 ` Eric W. Biederman
2024-09-09 17:22               ` Roman Kisel
2024-09-06 20:55         ` Oleg Nesterov
2024-09-06 21:25           ` Roman Kisel
2024-09-08 14:08             ` Oleg Nesterov
2024-09-09 15:19               ` Roman Kisel
2024-09-09 16:42                 ` Oleg Nesterov
2024-09-09 17:05                   ` Roman Kisel
2024-09-07 19:33   ` kernel test robot
  -- strict thread matches above, loose matches on Subject: below --
2024-09-07  8:45 Jubilee Young
2024-09-09 19:37 ` Oleg Nesterov
2024-09-10 15:40   ` Roman Kisel
2024-09-11 14:44     ` Oleg Nesterov
2024-09-11 17:41       ` Roman Kisel
2024-09-11 19:53         ` Oleg Nesterov
2024-09-11 19:57           ` Linus Torvalds
2024-09-11 20:14             ` Oleg Nesterov
2024-09-11 20:25           ` Roman Kisel
2024-09-10 16:34 ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ed5ser6i.fsf@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=apais@microsoft.com \
    --cc=benhill@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=romank@linux.microsoft.com \
    --cc=ssengar@microsoft.com \
    --cc=sunilmut@microsoft.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vdso@hexbites.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox