From: Kevin Wolf <kwolf@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: qemu-devel <qemu-devel@nongnu.org>, kvm <kvm@vger.kernel.org>
Subject: [Qemu-devel] Re: Endless loop in qcow2_alloc_cluster_offset
Date: Mon, 07 Dec 2009 16:00:18 +0100
Message-ID: <4B1D1882.7040404@redhat.com>
In-Reply-To: <4B1D0E34.6070907@siemens.com>
On 07.12.2009 15:16, Jan Kiszka wrote:
>> Likely not. What I did was nothing special, and I did not notice such a
>> crash in the last months.
>
> And now it happened again (qemu-kvm head, during kernel installation
> from network onto local qcow2-disk). Any clever idea how to proceed with
> this?
I still haven't seen this and I still have no theory on what could be
happening here. I'm just trying to write down what I think must happen
to get into this situation. Maybe you can point at something I'm missing,
or maybe it gives you a sudden inspiration.
The crash happens because we have a loop in the s->cluster_allocs list.
A loop can only be created by inserting an object twice. The only insert
to this list happens in qcow2_alloc_cluster_offset (though in an earlier
call than the one in the stack trace).
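
To illustrate how a double insert produces such a loop, here is a minimal sketch. It assumes a simplified singly linked list with a head insert in the style of QLIST_INSERT_HEAD; the struct and function names (alloc_req, alloc_list, list_insert_head, list_has_loop, demo_double_insert) are made up for this example and are not QEMU's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued allocation request; not QEMU's struct. */
struct alloc_req {
    int id;
    struct alloc_req *next;
};

struct alloc_list {
    struct alloc_req *first;
};

/* Head insert that, like the QLIST macros, does not check for membership. */
static void list_insert_head(struct alloc_list *l, struct alloc_req *r)
{
    assert(l != NULL && r != NULL);
    r->next = l->first;
    l->first = r;
}

/* Floyd's cycle check: returns 1 if following ->next never reaches NULL. */
static int list_has_loop(const struct alloc_list *l)
{
    const struct alloc_req *slow = l->first;
    const struct alloc_req *fast = l->first;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
            return 1;
        }
    }
    return 0;
}

/*
 * Insert a, then b, then a again. The second insert of a sets
 * a.next = b while b.next still points at a, so the list becomes
 * a -> b -> a -> ... Returns 1 if the loop is detected, -1 if the
 * list was already looped before the second insert of a.
 */
static int demo_double_insert(void)
{
    struct alloc_req a = { 1, NULL };
    struct alloc_req b = { 2, NULL };
    struct alloc_list l = { NULL };

    list_insert_head(&l, &a);
    list_insert_head(&l, &b);
    if (list_has_loop(&l)) {
        return -1;
    }
    list_insert_head(&l, &a);   /* a is still on the list */
    return list_has_loop(&l);
}
```

Any plain traversal of such a list (a for-each over the entries) would never terminate, which would match the endless loop seen in the stack trace.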
There is only one relevant caller of this function, qcow_aio_write_cb.
Part of it is a call to run_dependent_requests which removes the request
from s->cluster_allocs. So after the QLIST_REMOVE in
run_dependent_requests the request can't be contained in the list, but
at the call of qcow2_alloc_cluster_offset it must be contained again. It
must be added somewhere in between these two calls.
In qcow_aio_write_cb there isn't much happening between these calls. The
only thing that could somehow become dangerous is the
qcow_aio_write_cb(req, 0); for queued requests in run_dependent_requests.
> I could try to run the step in a loop, hopefully retriggering it once in
> a (likely longer) while. But then we need some good instrumentation first.
I can't explain what exactly would be going wrong there, but if my
thoughts are right so far, I think that moving this into a Bottom Half
would help. So if you can reproduce it in a loop this could be worth a try.
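
A rough sketch of the idea, not a patch: the hypothetical deferred_queue below stands in for a bottom half (in QEMU this would go through qemu_bh_new() and qemu_bh_schedule()), and write_cb stands in for qcow_aio_write_cb. All names and the nesting counter are assumptions made for illustration. The point is only that a deferred restart of a dependent request runs after the current callback has returned, instead of nested inside it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void deferred_fn(void *opaque);

/* Hypothetical mini "bottom half" queue, drained from the main loop. */
struct deferred_queue {
    deferred_fn *fn[MAX_DEFERRED];
    void *opaque[MAX_DEFERRED];
    size_t count;
};

static void defer(struct deferred_queue *q, deferred_fn *fn, void *opaque)
{
    assert(q->count < MAX_DEFERRED);
    q->fn[q->count] = fn;
    q->opaque[q->count] = opaque;
    q->count++;
}

/* Run queued callbacks outside of any request callback. */
static void run_deferred(struct deferred_queue *q)
{
    /* count may grow while callbacks run; those entries are run too */
    for (size_t i = 0; i < q->count; i++) {
        q->fn[i](q->opaque[i]);
    }
    q->count = 0;
}

struct request {
    struct request *dependent;   /* request waiting on this one, or NULL */
    struct deferred_queue *bh;   /* NULL: restart dependents synchronously */
    int *depth;                  /* current callback nesting depth */
    int *max_depth;              /* deepest nesting seen so far */
};

/* Stand-in for the write callback that restarts a dependent request. */
static void write_cb(void *opaque)
{
    struct request *req = opaque;

    ++*req->depth;
    if (*req->depth > *req->max_depth) {
        *req->max_depth = *req->depth;
    }
    if (req->dependent != NULL) {
        if (req->bh != NULL) {
            defer(req->bh, write_cb, req->dependent);   /* runs later */
        } else {
            write_cb(req->dependent);                   /* re-enters now */
        }
    }
    --*req->depth;
}

/* Returns the deepest callback nesting for one request with one dependent. */
static int max_nesting(int use_deferred)
{
    int depth = 0, max_depth = 0;
    struct deferred_queue q = { .count = 0 };
    struct deferred_queue *bh = use_deferred ? &q : NULL;
    struct request second = { NULL, bh, &depth, &max_depth };
    struct request first = { &second, bh, &depth, &max_depth };

    write_cb(&first);
    run_deferred(&q);
    return max_depth;
}
```

With the synchronous restart the dependent callback runs nested inside the first one (nesting depth 2); with the deferred variant each callback runs on its own (depth 1), so no request state is touched while another callback is halfway through.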
I'd certainly prefer to understand the problem first, but thinking about
AIO is the perfect way to make your brain hurt...
Kevin