From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: [PATCH 1/2] kvm tools: Respect ISR status in virtio header
Date: Sat, 7 May 2011 16:55:21 +0200
Message-ID: <20110507145521.GG2859@elte.hu>
References: <1304735660-10844-1-git-send-email-asias.hejun@gmail.com>
 <20110507093027.GD27657@elte.hu> <4DC545BA.3030501@us.ibm.com>
 <20110507140251.GB2859@elte.hu> <4DC55562.4070304@codemonkey.ws>
 <20110507144735.GD2859@elte.hu> <1304779960.10621.16.camel@jaguar>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Anthony Liguori, Asias He, "Michael S. Tsirkin", Rusty Russell,
 Mark McLoughlin, Cyrill Gorcunov, Sasha Levin, Prasad Joshi,
 kvm@vger.kernel.org
To: Pekka Enberg
Return-path:
Received: from mx3.mail.elte.hu ([157.181.1.138]:43692 "EHLO mx3.mail.elte.hu"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755158Ab1EGOzf
 (ORCPT ); Sat, 7 May 2011 10:55:35 -0400
Content-Disposition: inline
In-Reply-To: <1304779960.10621.16.camel@jaguar>
Sender: kvm-owner@vger.kernel.org
List-ID:

* Pekka Enberg wrote:

> On Sat, 2011-05-07 at 16:47 +0200, Ingo Molnar wrote:
> > Can you see anything in the virtio protocol implementation that would
> > explain networking lags, which seem to be caused by guest notifications
> > either not being sent or being missed?
> >
> > In particular this sequence:
> >
> > > while pending_requests:
> > >    a = get_next_request();
> > >    process_next_request(a);
> >
> > is apparently not what Qemu uses - so maybe there's some latent bug or some
> > silly oversight somewhere.
> >
> > It is suboptimal and i agree with you that the better sequence should be
> > implemented, but the above *should* work, yet it does not.
>
> Yes, so the performance benefits of Asias' patch aren't the interesting
> part but the fact that it fixes a real bug in our tool.

It could be the same as with the mutex_lock() change: that too seemed to 'fix'
the latency bug, but we still do not understand the root cause of the 'stuck
ring-buffer' situation.

I.e. some sort of timing-related condition which goes away spuriously when
unrelated but timing-relevant changes are made to the code.

And we'll continue to see these problems on and off, in probably all virtio
drivers. virtio-console might be suffering from it, virtio-blk, etc.

I'd suggest freezing changes to this driver until this bug is analyzed
correctly...

Thanks,

	Ingo
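
For reference, the drain loop quoted above can be written out as a small
runnable toy. This is only a sketch under loud assumptions: the deque is a
hypothetical in-memory stand-in for a virtio ring, drain() and the request
names are made up for illustration, and nothing here models guest
notifications or says anything about the root cause of the lags.

```python
from collections import deque


def drain(pending_requests, process_next_request):
    """Handle every queued request before returning; return how many ran.

    Hypothetical helper: 'pending_requests' is a plain deque standing in for
    a ring, not kvm tools' or Qemu's actual data structure.
    """
    handled = 0
    while pending_requests:
        a = pending_requests.popleft()  # plays the role of get_next_request()
        process_next_request(a)
        handled += 1
    return handled


# Toy usage: three made-up requests, "processed" by appending them to a list.
ring = deque(["req0", "req1", "req2"])
seen = []
count = drain(ring, seen.append)
```

The point of the loop shape is only that it keeps going until the queue is
empty, rather than handling one request per wakeup.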