From: Jan Kiszka <jan.kiszka@siemens.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel <qemu-devel@nongnu.org>, kvm <kvm@vger.kernel.org>
Subject: Re: Endless loop in qcow2_alloc_cluster_offset
Date: Mon, 07 Dec 2009 17:09:45 +0100
Message-ID: <4B1D28C9.70201@siemens.com>
In-Reply-To: <4B1D1882.7040404@redhat.com>
Kevin Wolf wrote:
> On 07.12.2009 15:16, Jan Kiszka wrote:
>>> Likely not. What I did was nothing special, and I did not notice such a
>>> crash in the last months.
>> And now it happened again (qemu-kvm head, during kernel installation
>> from network onto local qcow2-disk). Any clever idea how to proceed with
>> this?
>
> I still haven't seen this and I still have no theory on what could be
> happening here. I'm just trying to write down what I think must happen
> to get into this situation. Maybe you can point at something I'm missing
> or maybe it helps you to have a sudden inspiration.
>
> The crash happens because we have a loop in the s->cluster_allocs list.
> A loop can only be created by inserting an object twice. The only insert
> to this list happens in qcow2_alloc_cluster_offset (though an earlier
> call than that of the stack trace).
>
> There is only one relevant caller of this function, qcow_aio_write_cb.
> Part of it is a call to run_dependent_requests which removes the request
> from s->cluster_allocs. So after the QLIST_REMOVE in
> run_dependent_requests the request can't be contained in the list, but
> at the call of qcow2_alloc_cluster_offset it must be contained again. It
> must be added somewhere in between these two calls.
>
> In qcow_aio_write_cb there isn't much happening between these calls. The
> only thing that could somehow become dangerous is the
> qcow_aio_write_cb(req, 0); for queued requests in run_dependent_requests.
If m->nb_clusters is 0, the entry won't be removed from the list. And
if something corrupted nb_clusters so that it became 0 although it's
still enqueued, we would see the deadly loop I faced, right?
Unfortunately, any arbitrary memory corruption that generates such zeros
can cause this...
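
To make the hypothesis concrete, here is a minimal sketch of how a skipped
removal followed by a second insert of the same object yields a list that
never terminates. It is not QEMU code: the BSD <sys/queue.h> LIST macros
stand in for QLIST, and struct alloc_meta, enqueue(), complete(),
list_loops() and simulate() are hypothetical names made up for illustration.
The only parts taken from the discussion are the nb_clusters check guarding
the removal and the fact that the same object gets inserted again later.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the per-request allocation metadata. */
struct alloc_meta {
    int nb_clusters;
    LIST_ENTRY(alloc_meta) next_in_flight;
};

LIST_HEAD(alloc_list, alloc_meta);

/* Allocation path (assumed): record the cluster count and enqueue. */
static void enqueue(struct alloc_list *list, struct alloc_meta *m, int nb)
{
    m->nb_clusters = nb;
    LIST_INSERT_HEAD(list, m, next_in_flight);
}

/* Completion path (assumed): removal is skipped when nb_clusters is 0. */
static void complete(struct alloc_meta *m)
{
    if (m->nb_clusters != 0) {
        LIST_REMOVE(m, next_in_flight);
    }
}

/* Bounded walk: returns 1 if the list does not end within limit steps. */
static int list_loops(struct alloc_list *list, int limit)
{
    struct alloc_meta *m;
    int steps = 0;

    assert(limit > 0);
    LIST_FOREACH(m, list, next_in_flight) {
        if (++steps > limit) {
            return 1;
        }
    }
    return 0;
}

/* Runs enqueue, complete, enqueue on one object; corrupt != 0 zeroes
 * nb_clusters while the object is still enqueued. Returns 1 on a loop. */
static int simulate(int corrupt)
{
    struct alloc_list list;
    struct alloc_meta m;

    LIST_INIT(&list);
    enqueue(&list, &m, 4);
    if (corrupt) {
        m.nb_clusters = 0;      /* models the suspected memory corruption */
    }
    complete(&m);               /* removal skipped in the corrupt case */
    enqueue(&list, &m, 4);      /* second insert of an already-linked object */
    return list_loops(&list, 16);
}
```

In the corrupted case the second LIST_INSERT_HEAD sets the element's next
pointer to the current head, which is the element itself, so any
LIST_FOREACH over the list spins forever. simulate(0) takes the normal path
and leaves a single-element list that terminates.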
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux