[Bug 218339] New: kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set

public inbox for kvm@vger.kernel.org

* [Bug 218339] New: kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: bugzilla-daemon @ 2024-01-04  2:35 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=218339

            Bug ID: 218339
           Summary: kernel goes unresponsive if single-stepping over an
                    instruction which writes to an address for which a
                    hardware read/write watchpoint has been set
           Product: Virtualization
           Version: unspecified
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: anthony.louis.eden@gmail.com
        Regression: No

In a debian QEMU/KVM virtual machine, run `gdb` on any executable (e.g.
`/usr/bin/ls`). Run the program by typing `starti`. Proceed to `_dl_start`
(i.e. `break _dl_start`, `continue`). When you get there disassemble the
function (i.e. `disas`). Find an instruction that's going to be executed for
which you can compute the address in memory it will write to. Run the program
to that instruction (i.e. `break *0xINSN`, `continue`). When you're on that
instruction, set a read/write watchpoint on the address it will write to, then
single-step (i.e. `stepi`) and the kernel will go unresponsive.

>(gdb) x/1i $pc
>=> 0x7ffff7fe6510 <_dl_start+48>:      mov    %rdi,-0x88(%rbp)
>(gdb) x/1wx $rbp-0x88
>0x7fffffffec28:        0x00000000
>(gdb) awatch *0x7fffffffec28
>Hardware access (read/write) watchpoint 2: *0x7fffffffec28
>(gdb) stepi

Looking with `journalctl`, I cannot find anything printed to dmesg.

The kernel of the guest inside the virtual machine is Debian 6.1.0-15-amd64.
The kernel of the host running qemu-system-x86_64 is Archlinux 6.6.7-arch1-1.
gdb is version 13.1.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 218339] New: kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: Sean Christopherson @ 2024-01-04 16:54 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: kvm

On Thu, Jan 04, 2024, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=218339
> 
>             Bug ID: 218339
>            Summary: kernel goes unresponsive if single-stepping over an
>                     instruction which writes to an address for which a
>                     hardware read/write watchpoint has been set
>            Product: Virtualization
>            Version: unspecified
>           Hardware: All
>                 OS: Linux
>             Status: NEW
>           Severity: normal
>           Priority: P3
>          Component: kvm
>           Assignee: virtualization_kvm@kernel-bugs.osdl.org
>           Reporter: anthony.louis.eden@gmail.com
>         Regression: No
> 
> In a debian QEMU/KVM virtual machine, run `gdb` on any executable (e.g.
> `/usr/bin/ls`). Run the program by typing `starti`. Proceed to `_dl_start`
> (i.e. `break _dl_start`, `continue`). When you get there disassemble the
> function (i.e. `disas`). Find an instruction that's going to be executed for
> which you can compute the address in memory it will write to. Run the program
> to that instruction (i.e. `break *0xINSN`, `continue`). When you're on that
> instruction, set a read/write watchpoint on the address it will write to, then
> single-step (i.e. `stepi`) and the kernel will go unresponsive.

By "the kernel", I assume you mean the guest kernel?

> >(gdb) x/1i $pc
> >=> 0x7ffff7fe6510 <_dl_start+48>:      mov    %rdi,-0x88(%rbp)
> >(gdb) x/1wx $rbp-0x88
> >0x7fffffffec28:        0x00000000
> >(gdb) awatch *0x7fffffffec28
> >Hardware access (read/write) watchpoint 2: *0x7fffffffec28
> >(gdb) stepi
> 
> 
> Looking with `journalctl`, I cannot find anything printed to dmesg.
> 
> The kernel of the guest inside the virtual machine is Debian 6.1.0-15-amd64.
> The kernel of the host running qemu-system-x86_64 is Archlinux 6.6.7-arch1-1.
> gdb is version 13.1.

Is this a regression or something that has always been broken?  I.e. did this work
on previous host kernels?




* [Bug 218339] kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: bugzilla-daemon @ 2024-01-04 23:21 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=218339

--- Comment #2 from Anthony L. Eden (anthony.louis.eden@gmail.com) ---
> By "the kernel", I assume you mean the guest kernel?
Yes, the guest kernel. I can no longer interact with the VM via the serial
console. It is unresponsive.

I attached a debugger to qemu-system-x86_64 to see if qemu itself was in an
infinite loop or something but the stacktraces all looked normal.

> Is this a regression or something that has always been broken?  I.e. did this
> work on previous host kernels?
I do not know whether this has always been broken or not.


* [Bug 218339] kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: bugzilla-daemon @ 2024-01-10 12:38 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=218339

Yao Yuan (yaoyuan0329@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yaoyuan0329@gmail.com

--- Comment #3 from Yao Yuan (yaoyuan0329@gmail.com) ---
Hi,

I tried on my side but can't reproduce it; logs below. Any steps I missed?

(gdb) b *0x00007ffff7fe4048                                                     
Breakpoint 1 at 0x7ffff7fe4048: file ./elf/rtld.c, line 527.                    
(gdb) c                                                                         
Continuing.                                                                     

Breakpoint 1, 0x00007ffff7fe4048 in _dl_start (arg=0x7fffffffe510) at ./elf/rtld.c:527
527     in ./elf/rtld.c                                                         
(gdb) disassemble                                                               
Dump of assembler code for function _dl_start:                                  
   0x00007ffff7fe4030 <+0>:     endbr64                                         
   0x00007ffff7fe4034 <+4>:     push   %rbp                                     
   0x00007ffff7fe4035 <+5>:     mov    %rsp,%rbp                                
   0x00007ffff7fe4038 <+8>:     push   %r15                                     
   0x00007ffff7fe403a <+10>:    push   %r14                                     
   0x00007ffff7fe403c <+12>:    push   %r13                                     
   0x00007ffff7fe403e <+14>:    push   %r12                                     
   0x00007ffff7fe4040 <+16>:    push   %rbx                                     
   0x00007ffff7fe4041 <+17>:    sub    $0x88,%rsp                               
=> 0x00007ffff7fe4048 <+24>:    mov    %rdi,-0x78(%rbp)                         
   0x00007ffff7fe404c <+28>:    rdtsc                                           
(gdb) x/16xb $rbp-0x78                                                          
0x7fffffffe488: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00    
0x7fffffffe490: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00    
(gdb) awatch *0x7fffffffe488                                                    
Hardware access (read/write) watchpoint 2: *0x7fffffffe488                      
(gdb) stepi                                                                     

Hardware access (read/write) watchpoint 2: *0x7fffffffe488                      

Old value = 0                                                                   
New value = -6896                                                               
0x00007ffff7fe404c in rtld_timer_start (var=0x7ffff7ffcaa0 <start_time>) at ./elf/rtld.c:85
85      in ./elf/rtld.c     


The guest kernel keeps running properly after I perform the above steps inside
the guest.

My configuration:
Host: stable kernel v6.6.8 commit 4c9646a796d66a2d81871a694e88e19a38b115a7
QEMU: v8.1.1 commit 6bb4a8a47a43f35a345f107227fcd6abed59e62c
Guest kernel: kvm tree tags/kvm-6.8-1 commit 1c6d984f523f67ecfad1083bb04c55d91977bb15


* [Bug 218339] kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: bugzilla-daemon @ 2024-01-10 20:21 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=218339

--- Comment #4 from Anthony L. Eden (anthony.louis.eden@gmail.com) ---
> I tried on my side but can't reproduce it, logs below. Any steps I missed?

Nope, it looks like you did everything right.



I spent a little more time investigating this, since for me it's trivial to
reproduce. I was able to get the guest kernel vmlinux *with* debugging
information from the linux-image-6.1.0-15-amd64-dbg debian package.

After entering the final `stepi` within gdb, which is when the guest goes
totally unresponsive, in htop I see qemu-system-x86_64 is taking up 100% CPU.
Like I said, the thread call stacks in the qemu process look typical.

I used the qemu monitor command 'dump-guest-memory -p /root/linux.core' three
separate times after the guest went unresponsive, and all three of the core
file backtraces look like this:

#0  pv_native_set_debugreg (regno=7, val=0) at arch/x86/include/asm/debugreg.h:92
#1  0xffffffff81a21533 in set_debugreg (reg=7, val=0) at arch/x86/include/asm/paravirt.h:129
#2  local_db_save () at arch/x86/include/asm/debugreg.h:127
#3  exc_debug_kernel (dr6=0, regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1038
#4  exc_debug (regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1175
#5  0xffffffff81c00c6a in asm_exc_debug () at /build/reproducible-path/linux-6.1.66/arch/x86/include/asm/idtentry.h:606
#6  0x0000000000000000 in ?? ()



My VM was in a self-contained folder with its own run script on the host so I
made a tarball of it. It is available for download here (~9 GB):

https://drive.google.com/file/d/1r3tlrw8kG17vFwXzP6ETv76ptNhbLYjt/view?usp=sharing

Usage:

$ tar xvSf deb-vm-x86_64.tar
$ cd deb-vm-x86_64/
$ ./run.sh

In another terminal,

$ screen /dev/pts/23 115200

Log in as user 'root' with password 'root'.

Once inside,

$ gdb /usr/bin/ls
(gdb) starti
...


Oh and by the way, the version of qemu-system-x86_64 on my host is 7.2.7
(Debian 1:7.2+dfsg-7+deb12u3).


* [Bug 218339] kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: bugzilla-daemon @ 2024-01-10 21:07 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=218339

--- Comment #5 from Anthony L. Eden (anthony.louis.eden@gmail.com) ---
Actually upon closer inspection I'm seeing two distinct call stacks appear in
the core files.

#0  pv_native_set_debugreg (regno=7, val=0) at arch/x86/include/asm/debugreg.h:92
#1  0xffffffff81a21533 in set_debugreg (reg=7, val=0) at arch/x86/include/asm/paravirt.h:129
#2  local_db_save () at arch/x86/include/asm/debugreg.h:127
#3  exc_debug_kernel (dr6=0, regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1038
#4  exc_debug (regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1175
#5  0xffffffff81c00c6a in asm_exc_debug () at /build/reproducible-path/linux-6.1.66/arch/x86/include/asm/idtentry.h:606
#6  0x0000000000000000 in ?? ()

#0  pv_native_set_debugreg (regno=7, val=983554) at arch/x86/include/asm/debugreg.h:92
#1  0xffffffff81a21509 in set_debugreg (reg=7, val=983554) at arch/x86/include/asm/paravirt.h:129
#2  local_db_restore (dr7=983554) at arch/x86/include/asm/debugreg.h:147
#3  exc_debug_kernel (dr6=<optimized out>, regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1095
#4  exc_debug (regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1175
#5  0xffffffff81c00c6a in asm_exc_debug () at /build/reproducible-path/linux-6.1.66/arch/x86/include/asm/idtentry.h:606
#6  0x0000000000000000 in ?? ()


The two are nearly identical. They differ in frame #2, which is
local_db_save() (writing 0 to DR7) in the first trace and local_db_restore()
(writing 983554 back to DR7) in the second.
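
For reference, the DR7 value seen in the second trace can be decoded by hand.
The following is only a minimal sketch, assuming the standard x86 DR7 layout
(per-slot L/G enable bits in bits 0-7, 4-bit R/W+LEN fields starting at bit
16). The helper name decode_dr7 is made up for illustration; it is not kernel
or gdb code.

```python
# Hypothetical helper (not kernel code): decode an x86 DR7 value.
# Assumed layout: bits 2n / 2n+1 are the local/global enable bits for slot n;
# bits 16+4n .. 19+4n hold R/W (low 2 bits) and LEN (high 2 bits) for slot n.
RW_NAMES = {0: "execute", 1: "write", 2: "io", 3: "read/write"}
LEN_BYTES = {0: 1, 1: 2, 2: 8, 3: 4}  # LEN encoding: 00=1, 01=2, 10=8, 11=4


def decode_dr7(dr7):
    """Return a list of dicts, one per enabled breakpoint slot."""
    slots = []
    for n in range(4):
        local = bool((dr7 >> (2 * n)) & 1)
        global_ = bool((dr7 >> (2 * n + 1)) & 1)
        if not (local or global_):
            continue  # slot n is disabled
        ctrl = (dr7 >> (16 + 4 * n)) & 0xF
        slots.append({
            "slot": n,
            "local": local,
            "global": global_,
            "access": RW_NAMES[ctrl & 0x3],
            "length": LEN_BYTES[ctrl >> 2],
        })
    return slots


if __name__ == "__main__":
    print(hex(983554))
    for slot in decode_dr7(983554):
        print(slot)
```

Under these assumptions, 983554 is 0xf0202: slot 0 is globally enabled as a
4-byte read/write watchpoint, which is consistent with the `awatch` set in
gdb. Bit 9 (GE) is also set; the sketch does not report it.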

By the way this time I ran the VM under a different, newer version of
qemu-system-x86_64 (8.2.0), and it appears to have made no difference.


Also, concerning the VM in the tarball I linked to: if run.sh is run as is,
you will be able to ssh into the running guest with 'ssh -p 10024
root@localhost'. The kernel image *with* debug info is located at
/usr/lib/debug/vmlinux-6.1.0-15-amd64.


* [Bug 218339] kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
From: bugzilla-daemon @ 2024-03-05 20:24 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=218339

Kishen Maloor (kishen.maloor@intel.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kishen.maloor@intel.com

--- Comment #6 from Kishen Maloor (kishen.maloor@intel.com) ---
(In reply to Anthony L. Eden from comment #0)
> In a debian QEMU/KVM virtual machine, run `gdb` on any executable (e.g.
> `/usr/bin/ls`). Run the program by typing `starti`. Proceed to `_dl_start`
> (i.e. `break _dl_start`, `continue`). When you get there disassemble the
> function (i.e. `disas`). Find an instruction that's going to be executed for
> which you can compute the address in memory it will write to. Run the
> program to that instruction (i.e. `break *0xINSN`, `continue`). When you're
> on that instruction, set a read/write watchpoint on the address it will
> write to, then single-step (i.e. `stepi`) and the kernel will go
> unresponsive.
> 
> 
> >(gdb) x/1i $pc
> >=> 0x7ffff7fe6510 <_dl_start+48>:      mov    %rdi,-0x88(%rbp)
> >(gdb) x/1wx $rbp-0x88
> >0x7fffffffec28:        0x00000000
> >(gdb) awatch *0x7fffffffec28
> >Hardware access (read/write) watchpoint 2: *0x7fffffffec28
> >(gdb) stepi
> 

I can reproduce the behavior you describe. But it seems that you're not
invoking KVM at all, because when I add '-accel kvm' or '-enable-kvm' to your
qemu command line the problem goes away.

There may be an issue specifically in how QEMU's emulation handles hardware
watchpoints. If I disable hardware watchpoints in gdb using 'set
can-use-hw-watchpoints 0' and then use 'watch *<ADDR>', that works.
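
To check quickly whether a given QEMU command line requests KVM at all, a
rough sketch along these lines may help. It only looks for the single-dash
spellings '-enable-kvm', '-accel kvm' and '-machine'/'-M' with 'accel=kvm'.
The function name requests_kvm and the sample command lines are made up for
illustration and are not taken from the reporter's run.sh.

```python
# Hypothetical helper: report whether a QEMU argument list asks for KVM.
# It inspects the arguments only; it does not check that /dev/kvm is usable.
import shlex


def requests_kvm(argv):
    """Return True if argv contains -enable-kvm, -accel kvm[,...],
    or -machine/-M with an accel= property that lists kvm."""
    for i, arg in enumerate(argv):
        if arg == "-enable-kvm":
            return True
        nxt = argv[i + 1] if i + 1 < len(argv) else ""
        if arg == "-accel" and nxt.split(",")[0] == "kvm":
            return True
        if arg in ("-machine", "-M"):
            for opt in nxt.split(","):
                key, _, value = opt.partition("=")
                # accel= may list fallbacks separated by ':' (e.g. kvm:tcg)
                if key == "accel" and "kvm" in value.split(":"):
                    return True
    return False


if __name__ == "__main__":
    # Made-up command lines, for illustration only.
    for cmd in ("qemu-system-x86_64 -m 2G -nographic",
                "qemu-system-x86_64 -m 2G -nographic -accel kvm"):
        print(requests_kvm(shlex.split(cmd)), cmd)
```

A command line for which this returns False gives no sign of requesting KVM,
which would match the behavior described above.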
