qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Nabih Estefan <nabihestefan@google.com>
Cc: qemu-devel@nongnu.org,  richard.henderson@linaro.org,
	pbonzini@redhat.com,  philmd@linaro.org
Subject: Re: [PATCH 0/2] Fix for multi-process gdbstub breakpoints
Date: Tue, 20 May 2025 15:25:18 +0100	[thread overview]
Message-ID: <87tt5f9x69.fsf@draig.linaro.org> (raw)
In-Reply-To: <20250508224514.805456-1-nabihestefan@google.com> (Nabih Estefan's message of "Thu, 8 May 2025 22:45:12 +0000")

Nabih Estefan <nabihestefan@google.com> writes:

> This patch series modifies the gdbstub to address a bug running a
> multi cluster machine in QEMU using TCG.

Was this a downstream multi-cluster machine? Do we have any examples for
upstream? It would be nice to add a gdbstub test case to cover the
multi-inferior behaviour.

> The machine where the
> problem was seen had several clusters of CPUs with similar
> architectures and similar memory layout all working with physical
> addresses. It was discovered under gdb debugging that a breakpoint
> targeting one cluster misfired on the wrong cluster quite frequently
> with no possible workaround since gdb was also unaware of any
> breakpoint in that cluster and simply reported SIGTRAP.
>
> The sequence that discovered the bug adds N inferiors and adds a
> breakpoint on inferior N. Then after continuing an unrelated thread
> stops the execution when its PC hits the same address of the break
> targeting a different inferior.
>
> target extended-remote :1234
> add-inferior
> inferior 2
> attach 2
> ...
> add-inferior
> inferior N
> attach N
> add-symbol-file /path/to/foo.elf
> break foo
>> Breakpoint 1 at 0xf00add
> info break
>> Num     Type           Disp Enb Address    What
>> 1       breakpoint     keep y   0x00f00add in foo
>>                                            at foo.c:1234 inf N
> continue
>> Continuing.
>>
>> Thread 2.1 received signal SIGTRAP, Trace/breakpoint trap.
>> [Switching to Thread 2.2]
>> 0xf00add in ?? ()
>
> The main problem is the current implementation of
> 'tcg_insert_breakpoint' and 'tcg_remove_breakpoint' insert and remove
> breakpoints to all the CPUs in the system regardless of what the
> remote gdb protocol implements.
>
> If we look at the current source code of GDB we can see that the
> function 'remote_target::insert_breakpoint' in file gdb/remote.c has
> the intention to select the process ID of the inferior where the
> break was inserted.
>
> int
> remote_target::insert_breakpoint (struct gdbarch *gdbarch,
>                                   struct bp_target_info *bp_tgt)
> {
> ...
>   /* Make sure the remote is pointing at the right process, if
>      necessary.  */
>   if (!gdbarch_has_global_breakpoints (current_inferior ()->arch ()))
>     set_general_process ();
> ...
> }
>
> https:sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/remote.c;h=2c3988cb5075655e8a799d1cc5d4760ad8ed426e;hb=HEAD#l11023
>
> This would not happen when we input the 'break' in gdb but it is
> deferred until the time we execute the 'continue' command. Because we
> might be currently selecting an inferior that is not the one where we
> previously set the breakpoint, the remote gdb has to make sure we
> move the focus to the process ID of the inferior where we inserted
> the break.
>
> In the previous example this will translate to something like:
>
> HgpN.M
> Z0,00f00add,4
>
> Another thing that is wrong with the current implementation (and it
> affects both TCG and KVM mode) is that the remote gdb protocol uses
> 'Hg' and not 'Hc' to select the process. Functions
> 'gdb_breakpoint_insert' and 'gdb_breakpoint_remove' receive wrongly
> 'gdbserver_state.c_cpu' instead of 'gdbserver_state.g_cpu'.
>
> This is supported by the documentation of 'H op thread-id' where op =
> 'c' is reserved to the step and continue:
>
> https:sourceware.org/gdb/current/onlinedocs/gdb.html/Packets.html
>
> And it can be verified that the function 'set_general_process' in the
> previous code snippet will eventually call
> 'remote_target::set_general_thread' and not
> 'remote_target::set_continue_thread' if it needs to change focus.
>
> A third scenario that has to be taken into account is the case of a
> break on a specific thread, for instance the sequence:
>
> inferior 1
> break bar thread 1.3
> break bar thread 1.4
>
> The remote protocol expects the gdbstub to apply the break to the
> process ID of inferior 1 and considers the specific thread-id as a
> breakpoint condition (not too different from a 'break if').
>
> In this case the packet exchange may look like:
>
> Hgp1.1
> Z0,00ba2add,4
>
> There wouldn't be an independent set of packets for 'Hgp1.3' and
> 'Hgp1.4'.
>
> In the gdb source code, the handling of the specific thread-id
> happens during breakpoint evaluation in function
> 'bpstat_check_breakpoint_conditions' of file gdb/breakpoint.c
>
> https:sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/breakpoint.c;h=17bd627f867cf3d4dc81322ed1919ba40cbb237d;hb=HEAD#l5550
>
> The proposed fix inserts or removes a breakpoint to all the cpus
> sharing the same process ID as the one selected with the latest
> received 'Hg' packet.
>
> Roque Arcudia Hernandez (2):
>   gdbstub: Fix wrong CPUState pointer in breakpoint functions
>   gdbstub: Apply breakpoints only to the selected PID
>
>  accel/tcg/tcg-accel-ops.c | 37 +++++++++++++++++++++++--------------
>  gdbstub/gdbstub.c         | 10 ++++++++--
>  gdbstub/internals.h       | 13 +++++++++++--
>  include/exec/gdbstub.h    | 12 ++++++++++++
>  4 files changed, 54 insertions(+), 18 deletions(-)

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


  parent reply	other threads:[~2025-05-20 14:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-08 22:45 [PATCH 0/2] Fix for multi-process gdbstub breakpoints Nabih Estefan
2025-05-08 22:45 ` [PATCH 1/2] gdbstub: Fix wrong CPUState pointer in breakpoint functions Nabih Estefan
2025-05-08 22:45 ` [PATCH 2/2] gdbstub: Apply breakpoints only to the selected PID Nabih Estefan
2025-05-20 14:25 ` Alex Bennée [this message]
  -- strict thread matches above, loose matches on Subject: below --
2024-09-06 22:54 [PATCH 0/2] Fix for multi-process gdbstub breakpoints Roque Arcudia Hernandez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tt5f9x69.fsf@draig.linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=nabihestefan@google.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).