From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1JhNfM-0001kD-UU for qemu-devel@nongnu.org; Thu, 03 Apr 2008 07:31:37 -0400 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1JhNfJ-0001gC-Tl for qemu-devel@nongnu.org; Thu, 03 Apr 2008 07:31:35 -0400 Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1JhNfJ-0001fu-JN for qemu-devel@nongnu.org; Thu, 03 Apr 2008 07:31:33 -0400 Received: from mail2.shareable.org ([80.68.89.115]) by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1JhNfJ-000123-FF for qemu-devel@nongnu.org; Thu, 03 Apr 2008 07:31:33 -0400 Received: from jamie by mail2.shareable.org with local (Exim 4.63) (envelope-from ) id 1JhNfF-0004nP-5e for qemu-devel@nongnu.org; Thu, 03 Apr 2008 12:31:29 +0100 Date: Thu, 3 Apr 2008 12:31:29 +0100 From: Jamie Lokier Subject: Re: [Qemu-devel] QEMU/KVM SCSI lock up Message-ID: <20080403113128.GA17900@shareable.org> References: <87k5jfixcb.fsf@fftw.org> <47F49770.1010906@qumranet.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47F49770.1010906@qumranet.com> Reply-To: qemu-devel@nongnu.org List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: qemu-devel@nongnu.org Avi Kivity wrote: > > bdrv_aio_read() > > r->sector += n; > > r->sector_count -= n; > > > >For reasons that I do not fully understand, bdrv_aio_read() does > >not return immediately, but instead it calls scsi_read_data() > >recursively. > > What happens (I think) is that bdrv_aio_read() completes immediately, > calls the completion callback, which starts a read for the next batch of > sectors. > > Long term we want to replace the recursion by queuing. Ah yes. Reminds me of a bug in a program of mine with asynchronous sockets, which sometimes completes immediately. My async connection function took a callback argument. The callback was called immediately if the connect returned success immediately, otherwise was queued. Then I wrote some code like this: // some stuff my_obj->connector = async_connection (address, my_obj_connect_cb, my_obj); // some more stuff void my_obj_connect_cb (void * arg) { MyObj * my_obj = arg; do_stuff_with (my_obj->connector); } That worked for several months flawlessly, the code was deployed on 500 units around the country, then one day an async connect happened to complete fast enough that the callback was called recursively, and a core dump followed. I learned a few lessons about async callbacks: 1. Queue them, even when the async operation completes immediately. It's really not worth the "optimisation" of immediate callback, especially if the code for queuing is already there. Even when immediate callback is documented, mistakes are too easy to make when calling it, because the async call pattern _looks_ safe, and may work most of the time or always with some configurations. 2. If you really must have recursion, e.g. perhaps the synchronous path is common and a hot path, beware that this is prone to mistakes in usage. Check that it's worth the optimisation. Then document with words like "warning" the need for special care. 3. If there is immediate callback possible, sometimes it's useful API to have a flag argument to enable that, so that always queueing is an option, and so the programmer has to think about consequences if they set the flag. E.g.: async_id = async_func (args, callback, cb_arg, CB_ALWAYS_QUEUED); async_id = async_func (args, callback, cb_arg, CB_MAYBE_IMMEDIATE); 3. The code pattern "async_id = async_func (args, callback, callback_arg)" looks very natural, but it doesn't work when there is immediate callback. Callback is called before the id can be stored anywhere. If the async_id is still meaningful at the point of callback, the API is fundamentally broken. Just a few notes on async callback API pitfalls I've had... -- Jamie