qemu-devel.nongnu.org archive mirror
* [Qemu-devel] question: I found a bug which will lead to qemu crash
@ 2017-09-12 11:17 WangJie (Captain)
  2017-09-12 11:37 ` Kevin Wolf
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: WangJie (Captain) @ 2017-09-12 11:17 UTC (permalink / raw)
  To: Kevin Wolf, berto, stefanha, famz, wangjie88; +Cc: qemu-devel, pbonzini

Hi, Kevin.

I found a bug in qemu-kvm (versions 2.7.0-rc0 and 2.8.1), but qemu 2.6.0 and current master are OK.
So I ran git-bisect on the master branch and found that the patch you committed (block: Decouple throttling from BlockDriverState) introduced the bug.

The patch that introduced the bug: (https://github.com/qemu/qemu/commit/7ca7f0f6db1fedd28d490795d778cf23979a2aa7#diff-ea36ba0f79150cc299732696a069caba)

Since current master is OK, I think you have already fixed it. Can you tell me which patch fixed the bug? Thank you :>


The bug: qemu crashes after a disk with QoS (I/O throttling) configured is repeatedly attached to and detached from a VM for a while.


*Segmentation fault info (qemu 2.7.0-rc0):*
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/qemu-kvm -name guest=wangjie-i-clone203_rhel_7.3_64_guestosdev,debug-t'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fe960413e3c in throttle_group_next_blk (blk=0x11) at block/throttle-groups.c:160
160        ThrottleState *ts = blkp->throttle_state;
Missing separate debuginfos, use: debuginfo-install glib2-2.40.0-4.x86_64 glibc-2.17-157.h5.x86_64 libaio-0.3.109-13.x86_64 libgcc-4.8.3-10.h1.x86_64 nettle-2.7.1-4.h1.x86_64 numactl-libs-2.0.9-4.x86_64 pixman-0.32.4-3.x86_64 zlib-1.2.7-14.x86_64
(gdb) bt
#0  0x00007fe960413e3c in throttle_group_next_blk (blk=0x11) at block/throttle-groups.c:160
#1  0x00007fe960413eff in next_throttle_token (blk=0x7fe963f5c400, is_write=false) at block/throttle-groups.c:192
#2  0x00007fe9604141a8 in throttle_group_co_io_limits_intercept (blk=0x7fe963f5c400, bytes=512, is_write=false)
    at block/throttle-groups.c:303
#3  0x00007fe960400048 in blk_co_preadv (blk=0x7fe963f5c400, offset=0, bytes=512, qiov=0x7ffc37ee8aa0, flags=(unknown: 0))
    at block/block-backend.c:728
#4  0x00007fe960400159 in blk_read_entry (opaque=0x7ffc37ee8ac0) at block/block-backend.c:769
#5  0x00007fe96048f4d7 in coroutine_trampoline (i0=1678853408, i1=32745) at util/coroutine-ucontext.c:78
#6  0x00007fe95dfdacf0 in ?? () from /lib64/libc.so.6
#7  0x00007ffc37ee9c00 in ?? ()
#8  0x0000000000000000 in ?? ()



*Segmentation fault info (qemu 2.8.1):*
Program received signal SIGSEGV, Segmentation fault.
0x00007f5469220607 in blk_has_pending_reqs (blk=0x7f54672a0032, is_write=false) at block/throttle-groups.c:184
184        return blkp->pending_reqs[is_write];
(gdb) bt
#0  0x00007f5469220607 in blk_has_pending_reqs (blk=0x7f54672a0032, is_write=false) at block/throttle-groups.c:184
#1  0x00007f54692206a8 in next_throttle_token (blk=0x7f546b6cd120, is_write=false) at block/throttle-groups.c:207
#2  0x00007f5469220984 in throttle_group_co_io_limits_intercept (blk=0x7f546b6cd120, bytes=512, is_write=false)
    at block/throttle-groups.c:322
#3  0x00007f546920bc79 in blk_co_preadv (blk=0x7f546b6cd120, offset=0, bytes=512, qiov=0x7ffcc7355060, flags=0)
    at block/block-backend.c:815
#4  0x00007f546920bddf in blk_read_entry (opaque=0x7ffcc7355080) at block/block-backend.c:865
#5  0x00007f54692a00f0 in coroutine_trampoline (i0=-588050448, i1=32595) at util/coroutine-ucontext.c:79
#6  0x00007f5466f34cf0 in ?? () from /lib64/libc.so.6
#7  0x00007f53f27fa9e0 in ?? ()
#8  0x0000000000000000 in ?? ()


*How to reproduce the bug:*
*1. Start a VM*


*2. Attach and detach a disk repeatedly; the disk configuration (add-1.xml) is as follows:*
<disk device="disk" type="file">
  <driver cache="none" io="native" name="qemu" type="raw" />
  <source file="/mnt/sdb/wangjie-kvm/core/fk8b42zr-oz" />
  <target bus="virtio" dev="vdb" />
  <iotune>
    <read_iops_sec>3000</read_iops_sec>
    <write_iops_sec>3000</write_iops_sec>
    <read_bytes_sec>120000000</read_bytes_sec>
    <write_bytes_sec>120000000</write_bytes_sec>
  </iotune>
</disk>


*3. Run the script below for a while; the VM's qemu process will crash*
ret=1
while [ $ret -ne 0 ]; do
        virsh attach-device i-clone203_rhel_7.3_64_guestosdev add-1.xml
        sleep 2
        virsh detach-device i-clone203_rhel_7.3_64_guestosdev add-1.xml
done
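
The loop above never changes $ret, so it keeps running even after qemu has died. A variant that stops on its own could look roughly like the sketch below. It is not part of the original report: the function name `reproduce` and the PAUSE variable are made up for illustration, and it assumes `virsh domstate` prints `running` for a live guest.

```shell
# Domain and XML file names are the ones used in the report above.
DOMAIN=${DOMAIN:-i-clone203_rhel_7.3_64_guestosdev}
DISK_XML=${DISK_XML:-add-1.xml}
PAUSE=${PAUSE:-2}   # seconds between attach and detach (assumed tunable)

# Hypothetical helper: cycle the disk until libvirt stops reporting
# the guest as running, then print how many cycles it took.
reproduce() {
    cycles=0
    while [ "$(virsh domstate "$DOMAIN" 2>/dev/null)" = "running" ]; do
        virsh attach-device "$DOMAIN" "$DISK_XML"
        sleep "$PAUSE"
        virsh detach-device "$DOMAIN" "$DISK_XML"
        cycles=$((cycles + 1))
    done
    echo "guest stopped running after $cycles attach/detach cycles"
}
```

Calling `reproduce` on the host then returns once the guest is gone, which makes it easier to tell when a given build has crashed.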

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] question: I found a bug which will lead to qemu crash
  2017-09-12 11:17 [Qemu-devel] question: I found a bug which will lead to qemu crash WangJie (Captain)
@ 2017-09-12 11:37 ` Kevin Wolf
  2017-09-12 13:53   ` Eric Blake
  2017-09-12 12:00 ` Alberto Garcia
  2017-09-12 13:37 ` WangJie (Captain)
  2 siblings, 1 reply; 6+ messages in thread
From: Kevin Wolf @ 2017-09-12 11:37 UTC (permalink / raw)
  To: WangJie (Captain); +Cc: berto, stefanha, famz, qemu-devel, pbonzini

On 12.09.2017 at 13:17, WangJie (Captain) wrote:
> Hi, Kevin.
>
> I found a bug in qemu-kvm (versions 2.7.0-rc0 and 2.8.1), but qemu 2.6.0 and current master are OK.
> So I ran git-bisect on the master branch and found that the patch you committed (block: Decouple throttling from BlockDriverState) introduced the bug.
>
> The patch that introduced the bug: (https://github.com/qemu/qemu/commit/7ca7f0f6db1fedd28d490795d778cf23979a2aa7#diff-ea36ba0f79150cc299732696a069caba)
>
> Since current master is OK, I think you have already fixed it. Can you tell me which patch fixed the bug? Thank you :>

I can't tell offhand which fix this was, but you can use 'git bisect'
not only to find which commit introduced the bug, but also to find the
fix. You just bisect between a broken commit and master, and then use
the reversed meaning of 'good' and 'bad' (i.e. 'good' means that the bug
is still there, 'bad' means it is already fixed).
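
As a rough sketch of that reversed bisect (not taken from the thread): it assumes the introducing commit 7ca7f0f6db1fedd28d490795d778cf23979a2aa7 as the still-broken endpoint and master as the fixed one. The function name `start_reversed_bisect` is made up for illustration.

```shell
# Reversed meaning: 'good' = bug still present, 'bad' = bug already fixed.
BROKEN=7ca7f0f6db1fedd28d490795d778cf23979a2aa7   # commit blamed by the first bisect
FIXED=master                                       # reported to work

# Hypothetical helper; run it inside a qemu.git checkout.
start_reversed_bisect() {
    git bisect start &&
    git bisect good "$BROKEN" &&   # old state: crash still reproduces
    git bisect bad "$FIXED"        # new state: crash no longer reproduces
}
```

After that, build and test each revision git checks out, then answer `git bisect good` if qemu still crashes or `git bisect bad` if it survives. `git bisect reset` restores the original checkout at the end.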

Kevin

* Re: [Qemu-devel] question: I found a bug which will lead to qemu crash
  2017-09-12 11:17 [Qemu-devel] question: I found a bug which will lead to qemu crash WangJie (Captain)
  2017-09-12 11:37 ` Kevin Wolf
@ 2017-09-12 12:00 ` Alberto Garcia
  2017-09-12 13:37 ` WangJie (Captain)
  2 siblings, 0 replies; 6+ messages in thread
From: Alberto Garcia @ 2017-09-12 12:00 UTC (permalink / raw)
  To: WangJie (Captain), Kevin Wolf, stefanha, famz; +Cc: qemu-devel, pbonzini

On Tue 12 Sep 2017 01:17:38 PM CEST, WangJie (Captain) wrote:
> Hi, Kevin.
>
> I found a bug in qemu-kvm (versions 2.7.0-rc0 and 2.8.1), but qemu 2.6.0 and current master are OK.
> So I ran git-bisect on the master branch and found that the patch you committed (block: Decouple throttling from BlockDriverState) introduced the bug.
>
> The patch that introduced the bug: (https://github.com/qemu/qemu/commit/7ca7f0f6db1fedd28d490795d778cf23979a2aa7#diff-ea36ba0f79150cc299732696a069caba)
>
> Since current master is OK, I think you have already fixed it. Can you tell me which patch fixed the bug? Thank you :>
>
>
> The bug: qemu crashes after a disk with QoS (I/O throttling) configured is repeatedly attached to and detached from a VM for a while.
>
>
> *Segmentation fault info (qemu 2.7.0-rc0):*
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/bin/qemu-kvm -name guest=wangjie-i-clone203_rhel_7.3_64_guestosdev,debug-t'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007fe960413e3c in throttle_group_next_blk (blk=0x11) at block/throttle-groups.c:160

That's clearly an invalid pointer, so the code is iterating over a
BlockBackend that has either been freed or is not on the throttle_groups
list anymore.

Commit 6bf77e1c2dc24da1bade16e8a9a637f3b127314d fixed a problem in which
the code was not iterating the list correctly, although I don't think
that would have caused any crash.

Did you try using git-bisect to find the commit that fixed the bug?

Berto

* Re: [Qemu-devel] question: I found a bug which will lead to qemu crash
  2017-09-12 11:17 [Qemu-devel] question: I found a bug which will lead to qemu crash WangJie (Captain)
  2017-09-12 11:37 ` Kevin Wolf
  2017-09-12 12:00 ` Alberto Garcia
@ 2017-09-12 13:37 ` WangJie (Captain)
  2017-09-12 13:51   ` Alberto Garcia
  2 siblings, 1 reply; 6+ messages in thread
From: WangJie (Captain) @ 2017-09-12 13:37 UTC (permalink / raw)
  To: Kevin Wolf, eblake; +Cc: berto, stefanha, famz, qemu-devel, pbonzini

Hi, Eric.
I used git-bisect and found that the patch you committed (throttle: Remove block from group on hot-unplug) fixed the bug I reported to Kevin.

The patch that fixed the bug is: https://github.com/qemu/qemu/commit/1606e4cf8a976513ecac70ad6642a7ec45744cf5#diff-7cb66df56045598b75a219eebc27efb6


But the conditions under which I triggered the crash differ from the ones you described in the patch description. Is it the same root cause?

I am quite confused; please let me know. Thank you :>


* Re: [Qemu-devel] question: I found a bug which will lead to qemu crash
  2017-09-12 13:37 ` WangJie (Captain)
@ 2017-09-12 13:51   ` Alberto Garcia
  0 siblings, 0 replies; 6+ messages in thread
From: Alberto Garcia @ 2017-09-12 13:51 UTC (permalink / raw)
  To: WangJie (Captain), Kevin Wolf, eblake
  Cc: stefanha, famz, qemu-devel, pbonzini

On Tue 12 Sep 2017 03:37:09 PM CEST, WangJie (Captain) wrote:
> the patch which fixed the bug is: https://github.com/qemu/qemu/commit/1606e4cf8a976513ecac70ad6642a7ec45744cf5#diff-7cb66df56045598b75a219eebc27efb6

Oh, now I remember. Here's the bug report:

https://bugzilla.redhat.com/show_bug.cgi?id=1428810

> But the conditions under which I triggered the crash differ from the
> ones you described in the patch description. Is it the same root cause?

How is it different? Doesn't the script that you posted do exactly that?
(hot-unplug)

>> *3. Run the script below for a while; the VM's qemu process will crash*
>> ret=1
>> while [ $ret -ne 0 ]; do
>>         virsh attach-device i-clone203_rhel_7.3_64_guestosdev add-1.xml
>>         sleep 2
>>         virsh detach-device i-clone203_rhel_7.3_64_guestosdev add-1.xml
>> done

It makes total sense that that patch fixes the problem. Without it, a
deleted BlockBackend remains in a throttling group, which explains your
backtrace.

Berto

* Re: [Qemu-devel] question: I found a bug which will lead to qemu crash
  2017-09-12 11:37 ` Kevin Wolf
@ 2017-09-12 13:53   ` Eric Blake
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Blake @ 2017-09-12 13:53 UTC (permalink / raw)
  To: Kevin Wolf, WangJie (Captain); +Cc: famz, berto, qemu-devel, stefanha, pbonzini

On 09/12/2017 06:37 AM, Kevin Wolf wrote:
> On 12.09.2017 at 13:17, WangJie (Captain) wrote:
>> Hi, Kevin.
>>
>> I found a bug in qemu-kvm (versions 2.7.0-rc0 and 2.8.1), but qemu 2.6.0 and current master are OK.
>> So I ran git-bisect on the master branch and found that the patch you committed (block: Decouple throttling from BlockDriverState) introduced the bug.
>>
>> The patch that introduced the bug: (https://github.com/qemu/qemu/commit/7ca7f0f6db1fedd28d490795d778cf23979a2aa7#diff-ea36ba0f79150cc299732696a069caba)
>>
>> Since current master is OK, I think you have already fixed it. Can you tell me which patch fixed the bug? Thank you :>
> 
> I can't tell offhand which fix this was, but you can use 'git bisect'
> not only to find which commit introduced the bug, but also to find the
> fix. You just bisect between a broken commit and master, and then use
> the reversed meaning of 'good' and 'bad' (i.e. 'good' means that the bug
> is still there, 'bad' means it is already fixed).

That can be mentally confusing; with new-enough git, you can also use:

git bisect start --term-old=buggy --term-new=fixed

at which point, you can then say 'git bisect buggy' or 'git bisect
fixed' according to whether the bug is still present on a given
compilation, without having to remember which direction good/bad means.
There's also 'git bisect terms' to remind you what you chose.
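
Put together, a session with those custom terms might look like the sketch below. The helper names and the outcome strings 'crashed' and 'survived' are made up for illustration; only the git subcommands themselves come from the text above.

```shell
# Map the outcome of one test run to the bisect term chosen at start.
# The outcome strings 'crashed' and 'survived' are assumptions of this sketch.
term_for_result() {
    case "$1" in
        crashed)  echo buggy ;;   # bug still present (old behaviour)
        survived) echo fixed ;;   # bug gone (new behaviour)
        *)        echo skip ;;    # revision could not be built or tested
    esac
}

# Hypothetical helper: $1 = revision known to crash, $2 = revision known to work.
start_fix_bisect() {
    git bisect start --term-old=buggy --term-new=fixed &&
    git bisect buggy "$1" &&
    git bisect fixed "$2"
}

# After testing the revision git checked out:
#   git bisect "$(term_for_result crashed)"    # same as: git bisect buggy
```

The first commit git reports as 'fixed' is then the one that made the crash go away.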


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

