From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35149) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YJW4o-0004fE-5h for qemu-devel@nongnu.org; Thu, 05 Feb 2015 18:43:15 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YJW4k-00016v-Va for qemu-devel@nongnu.org; Thu, 05 Feb 2015 18:43:14 -0500 Received: from mx1.redhat.com ([209.132.183.28]:57615) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YJW4k-00016o-OE for qemu-devel@nongnu.org; Thu, 05 Feb 2015 18:43:10 -0500 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t15Nh9AQ016036 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL) for ; Thu, 5 Feb 2015 18:43:09 -0500 Message-ID: <54D4000B.3080703@redhat.com> Date: Thu, 05 Feb 2015 18:43:07 -0500 From: Max Reitz MIME-Version: 1.0 References: <1423178141-22642-1-git-send-email-jsnow@redhat.com> In-Reply-To: <1423178141-22642-1-git-send-email-jsnow@redhat.com> Content-Type: text/plain; charset=iso-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] blkdebug: fix "once" rule List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: John Snow , qemu-devel@nongnu.org Cc: kwolf@redhat.com, pbonzini@redhat.com On 2015-02-05 at 18:15, John Snow wrote: > The blkdebug scripts are currently engineered so that when a debug event > occurs, a prefilter browses a master list of parsed rules for a certain > event and adds them to an "active list" of rules to be used for the > forthcoming action, provided the events and state numbers match. > > Then, once the request is received, the last active rule is used to inject > an error if certain parameters match. > > This active list is cleared every time the prefilter injects a new rule > for the first time during a debug event. > > The "once" rule currently causes the error injection, if it is triggered, > to only clear the active list. This is insufficient for preventing future > injections of the same rule. > > This patch /deletes/ the rule from the list that the prefilter browses, > so it is gone for good. > > Lastly, this impacts iotests 026. Several ENOSPC tests that used "once" > can be seen to have output that shows multiple failure messages. After > this patch, the error messages tend to be smaller and less severe, but > the injection can still be seen to be working. > > Patch the expected output to expect the smaller error messages. > > Signed-off-by: John Snow > --- > block/blkdebug.c | 9 +++++---- > tests/qemu-iotests/026.out | 24 ++++-------------------- > 2 files changed, 9 insertions(+), 24 deletions(-) > > diff --git a/block/blkdebug.c b/block/blkdebug.c > index 9ce35cd..d30c44c 100644 > --- a/block/blkdebug.c > +++ b/block/blkdebug.c > @@ -473,14 +473,15 @@ static BlockAIOCB *inject_error(BlockDriverState *bs, > struct BlkdebugAIOCB *acb; > QEMUBH *bh; > > - if (rule->options.inject.once) { > - QSIMPLEQ_INIT(&s->active_rules); > - } > - > if (rule->options.inject.immediately) { > return NULL; > } > > + if (rule->options.inject.once) { > + QSIMPLEQ_INIT(&s->active_rules); This will still delete all other currently active rules, which I don't think is what is intended. Case in point why this is not really right: $ ./qemu-img create -f qcow2 test.qcow2 64M Formatting 'test.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off $ ./qemu-io -c 'aio_write 0 64k' "json:{'driver':'qcow2','file':{'driver':'blkdebug','image':{'driver':'file','filename':'test.qcow2'},'inject-error':[{'event':'write_aio'},{'event':'write_aio','state':42,'once':true}],'set-state':[{'event':'cluster_alloc','new_state':42}]}}" aio_write failed: Input/output error Failed to flush the L2 table cache: Input/output error Failed to flush the refcount block cache: Input/output error So, here we get three errors; the non-once rule stayed active. Now let's just turn the rules around: $ ./qemu-img create -f qcow2 test.qcow2 64M Formatting 'test.qcow2', fmt=qcow2 size=67108864 encryption=off cluster_size=65536 lazy_refcounts=off $ ./qemu-io -c 'aio_write 0 64k' "json:{'driver':'qcow2','file':{'driver':'blkdebug','image':{'driver':'file','filename':'test.qcow2'},'inject-error':[{'event':'write_aio','state':42,'once':true},{'event':'write_aio'}],'set-state':[{'event':'cluster_alloc','new_state':42}]}}" aio_write failed: Input/output error Hmmm... So what happens is that in the first case, the non-"once" rule had precedence, so the list of active rules was not changed. In the second case, however, the "once" rule had precedence, therefore all active rules (including the non-"once" rule) were deleted. I think it's fine that the "once" rule does not get deleted in the first case, but whether the non-"once" rule stays active should not be influenced by the "once" rule. Note that this issue was not introduced by this patch, but I think this patch should fix it. :-) Max > + remove_rule(rule); > + } > +