qemu-devel.nongnu.org archive mirror
From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	qemu-block@nongnu.org
Subject: Re: [Qemu-devel] strange crash in tracked_request_begin
Date: Mon, 7 Mar 2016 20:00:49 +0100
Message-ID: <56DDCFE1.4000808@de.ibm.com>
In-Reply-To: <20160307170139.GB26074@stefanha-x1.localdomain>

On 03/07/2016 06:01 PM, Stefan Hajnoczi wrote:
> On Mon, Mar 07, 2016 at 01:29:08PM +0100, Christian Borntraeger wrote:
>> Folks,
>>
>> I had a crash of a qemu guest in tracked_request_begin.
>> The test case was a guest with ramdisk/kernel that reboots in a
>> loop (about 10 times per second) with a single null-co disk
>> attached. No idea how to reproduce this; it seems to be a lucky hit.
>>
>> (gdb) bt
>> #0  0x00000000101db5ba in tracked_request_begin (req=req@entry=0x3ff90f1bdc0, bs=bs@entry=0x42a39190, offset=offset@entry=0, bytes=bytes@entry=4096, type=type@entry=BDRV_TRACKED_READ)
>>     at /home/cborntra/REPOS/qemu/block/io.c:390
>> #1  0x00000000101de91e in bdrv_co_do_preadv (bs=0x42a39190, offset=0, bytes=4096, qiov=0x3ff7400cbd8, flags=<optimized out>, flags@entry=(unknown: 0))
>>     at /home/cborntra/REPOS/qemu/block/io.c:1001
>> #2  0x00000000101dfc3e in bdrv_co_do_readv (flags=(unknown: 0), qiov=<optimized out>, nb_sectors=<optimized out>, sector_num=<optimized out>, bs=<optimized out>)
>>     at /home/cborntra/REPOS/qemu/block/io.c:1024
>> #3  bdrv_co_do_rw (opaque=0x3ff7400e370) at /home/cborntra/REPOS/qemu/block/io.c:2173
>> #4  0x000000001022d8f6 in coroutine_trampoline (i0=<optimized out>, i1=-1946150928) at /home/cborntra/REPOS/qemu/util/coroutine-ucontext.c:79
>> #5  0x000003ff95ed150a in __makecontext_ret () from /lib64/libc.so.6
>>
>> looking at the code we are at
>>
>> QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
>> which translates to
>>
>> if (((req)->list.le_next = (&bs->tracked_requests)->lh_first) != NULL) 
>>     (&bs->tracked_requests)->lh_first->list.le_prev = &(req)->list.le_next;
>> (&bs->tracked_requests)->lh_first = (req);                       
>> (req)->list.le_prev = &(&bs->tracked_requests)->lh_first;
>>
>> gdb says that (&bs->tracked_requests)->lh_first is zero in the core file
>> (gdb) print /x bs->tracked_requests
>> $6 = {lh_first = 0x0}
>>
>> Now, looking at the code, I am asking myself whether this can happen in
>> parallel with other code that touches tracked_requests, because gcc seems
>> to read (&bs->tracked_requests)->lh_first twice (first to check the value,
>> then to use it as a pointer).
> 
> tracked_requests is protected by AioContext.  Perhaps something is doing
> I/O without acquiring AioContext?

Hmm, the guest was rebooting, which resets all devices. Maybe something
in that code is still not right? I will have a look.
> 
> Luckily there is only 1 place where items are added and removed from
> tracked_requests.  This might make debugging somewhat easier.

I have trouble reproducing the issue, which makes it hard :-/

>>
>> 388	    qemu_co_queue_init(&req->wait_queue);
>>    0x00000000101db594 <+76>:	la	%r2,72(%r13)
>>    0x00000000101db598 <+80>:	brasl	%r14,0x1022cdc0 <qemu_co_queue_init>
>>
>> 389	
>> 390	    QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
>>    0x00000000101db59e <+86>:	lg	%r1,12744(%r12)		# r1 = (&bs->tracked_requests)->lh_first
>>    0x00000000101db5a4 <+92>:	stg	%r1,48(%r13)		# (req)->list.le_next = r1
>>    0x00000000101db5aa <+98>:	cgij	%r1,0,8,0x101db5c0 ---+ # if r1==0 goto
>>    0x00000000101db5b0 <+104>:	lg	%r1,12744(%r12)       | # r1 = (&bs->tracked_requests)->lh_first (again!!)
>>    0x00000000101db5b6 <+110>:	la	%r2,48(%r13)          | 
>> => 0x00000000101db5ba <+114>:	stg	%r2,56(%r1)           | # r1==0 bang
>>    0x00000000101db5c0 <+120>:	stg	%r13,12744(%r12)<-----+
>>    0x00000000101db5c6 <+126>:	lay	%r12,12744(%r12)
>>    0x00000000101db5cc <+132>:	stg	%r12,56(%r13)
>>
>>
>> Christian
>>
