All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Nabih Estefan <nabihestefan@google.com>
Cc: qemu-devel@nongnu.org,  richard.henderson@linaro.org,
	pbonzini@redhat.com,  philmd@linaro.org
Subject: Re: [PATCH 0/2] Fix for multi-process gdbstub breakpoints
Date: Tue, 20 May 2025 15:25:18 +0100	[thread overview]
Message-ID: <87tt5f9x69.fsf@draig.linaro.org> (raw)
In-Reply-To: <20250508224514.805456-1-nabihestefan@google.com> (Nabih Estefan's message of "Thu, 8 May 2025 22:45:12 +0000")

Nabih Estefan <nabihestefan@google.com> writes:

> This patch series modifies the gdbstub to address a bug running a
> multi cluster machine in QEMU using TCG.

Was this a downstream multi-cluster machine? Do we have any examples for
upstream? It would be nice to add a gdbstub test case to cover the
multi-inferior behaviour.

> The machine where the
> problem was seen had several clusters of CPUs with similar
> architectures and similar memory layout all working with physical
> addresses. It was discovered under gdb debugging that a breakpoint
> targeting one cluster misfired on the wrong cluster quite frequently
> with no possible workaround since gdb was also unaware of any
> breakpoint in that cluster and simply reported SIGTRAP.
>
> The sequence that discovered the bug adds N inferiors and adds a
> breakpoint on inferior N. Then after continuing an unrelated thread
> stops the execution when its PC hits the same address of the break
> targeting a different inferior.
>
> target extended-remote :1234
> add-inferior
> inferior 2
> attach 2
> ...
> add-inferior
> inferior N
> attach N
> add-symbol-file /path/to/foo.elf
> break foo
>> Breakpoint 1 at 0xf00add
> info break
>> Num     Type           Disp Enb Address    What
>> 1       breakpoint     keep y   0x00f00add in foo
>>                                            at foo.c:1234 inf N
> continue
>> Continuing.
>>
>> Thread 2.1 received signal SIGTRAP, Trace/breakpoint trap.
>> [Switching to Thread 2.2]
>> 0xf00add in ?? ()
>
> The main problem is the current implementation of
> 'tcg_insert_breakpoint' and 'tcg_remove_breakpoint' insert and remove
> breakpoints to all the CPUs in the system regardless of what the
> remote gdb protocol implements.
>
> If we look at the current source code of GDB we can see that the
> function 'remote_target::insert_breakpoint' in file gdb/remote.c has
> the intention to select the process ID of the inferior where the
> break was inserted.
>
> int
> remote_target::insert_breakpoint (struct gdbarch *gdbarch,
>                                   struct bp_target_info *bp_tgt)
> {
> ...
>   /* Make sure the remote is pointing at the right process, if
>      necessary.  */
>   if (!gdbarch_has_global_breakpoints (current_inferior ()->arch ()))
>     set_general_process ();
> ...
> }
>
> https:sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/remote.c;h=2c3988cb5075655e8a799d1cc5d4760ad8ed426e;hb=HEAD#l11023
>
> This would not happen when we input the 'break' in gdb but it is
> deferred until the time we execute the 'continue' command. Because we
> might be currently selecting an inferior that is not the one where we
> previously set the breakpoint, the remote gdb has to make sure we
> move the focus to the process ID of the inferior where we inserted
> the break.
>
> In the previous example this will translate to something like:
>
> HgpN.M
> Z0,00f00add,4
>
> Another thing that is wrong with the current implementation (and it
> affects both TCG and KVM mode) is that the remote gdb protocol uses
> 'Hg' and not 'Hc' to select the process. Functions
> 'gdb_breakpoint_insert' and 'gdb_breakpoint_remove' receive wrongly
> 'gdbserver_state.c_cpu' instead of 'gdbserver_state.g_cpu'.
>
> This is supported by the documentation of 'H op thread-id' where op =
> 'c' is reserved to the step and continue:
>
> https:sourceware.org/gdb/current/onlinedocs/gdb.html/Packets.html
>
> And it can be verified that the function 'set_general_process' in the
> previous code snippet will eventually call
> 'remote_target::set_general_thread' and not
> 'remote_target::set_continue_thread' if it needs to change focus.
>
> A third scenario that has to be taken into account is the case of a
> break on a specific thread, for instance the sequence:
>
> inferior 1
> break bar thread 1.3
> break bar thread 1.4
>
> The remote protocol expects the gdbstub to apply the break to the
> process ID of inferior 1 and considers the specific thread-id as a
> breakpoint condition (not too different from a 'break if').
>
> In this case the packet exchange may look like:
>
> Hgp1.1
> Z0,00ba2add,4
>
> There wouldn't be an independent set of packets for 'Hgp1.3' and
> 'Hgp1.4'.
>
> In the gdb source code, the handling of the specific thread-id
> happens during breakpoint evaluation in function
> 'bpstat_check_breakpoint_conditions' of file gdb/breakpoint.c
>
> https:sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/breakpoint.c;h=17bd627f867cf3d4dc81322ed1919ba40cbb237d;hb=HEAD#l5550
>
> The proposed fix inserts or removes a breakpoint to all the cpus
> sharing the same process ID as the one selected with the latest
> received 'Hg' packet.
>
> Roque Arcudia Hernandez (2):
>   gdbstub: Fix wrong CPUState pointer in breakpoint functions
>   gdbstub: Apply breakpoints only to the selected PID
>
>  accel/tcg/tcg-accel-ops.c | 37 +++++++++++++++++++++++--------------
>  gdbstub/gdbstub.c         | 10 ++++++++--
>  gdbstub/internals.h       | 13 +++++++++++--
>  include/exec/gdbstub.h    | 12 ++++++++++++
>  4 files changed, 54 insertions(+), 18 deletions(-)

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


  parent reply	other threads:[~2025-05-20 14:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08 22:45 [PATCH 0/2] Fix for multi-process gdbstub breakpoints Nabih Estefan
2025-05-08 22:45 ` [PATCH 1/2] gdbstub: Fix wrong CPUState pointer in breakpoint functions Nabih Estefan
2025-05-08 22:45 ` [PATCH 2/2] gdbstub: Apply breakpoints only to the selected PID Nabih Estefan
2025-05-20 14:25 ` Alex Bennée [this message]
  -- strict thread matches above, loose matches on Subject: below --
2024-09-06 22:54 [PATCH 0/2] Fix for multi-process gdbstub breakpoints Roque Arcudia Hernandez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tt5f9x69.fsf@draig.linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=nabihestefan@google.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.