From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bart Van Assche
Subject: Re: dangling pointers and/or reentrancy in scmd_eh_abort_handler?
Date: Tue, 20 May 2014 09:32:05 +0200
Message-ID: <537B04F5.4080808@acm.org>
References: <537A105B.4080504@redhat.com> <537A1E88.9080803@acm.org> <537A2CB8.9060302@redhat.com> <537A34C6.7090905@acm.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Return-path:
Received: from andre.telenet-ops.be ([195.130.132.53]:33839 "EHLO
	andre.telenet-ops.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751164AbaETHcI (ORCPT );
	Tue, 20 May 2014 03:32:08 -0400
In-Reply-To: <537A34C6.7090905@acm.org>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Paolo Bonzini
Cc: linux-scsi , Ulrich Obergfell

On 05/19/14 18:43, Bart Van Assche wrote:
> On 05/19/14 18:09, Paolo Bonzini wrote:
>> Il 19/05/2014 17:08, Bart Van Assche ha scritto:
>>> On 05/19/14 16:08, Paolo Bonzini wrote:
>>>> 2) reentrancy: the softirq handler and scmd_eh_abort_handler can run
>>>> concurrently, and call scsi_finish_command without any lock protecting
>>>> the calls. You can then get memory corruption.
>>>
>>> I'm not sure what the recommended approach is to address this race. But
>>> it is possible to address this in the LLD. See e.g. the srp_claim_req()
>>> function in the SRP LLD and how it is invoked from the reply handler,
>>> the abort handler and the reset handlers in that LLD.
>>
>> That's not enough, unless I'm missing something. Say the request
>> handler claims the request and the abort handler doesn't:
>>
>> - the request handler calls scsi_done and ends up in scsi_finish_command.
>>
>> - the abort handler will return SUCCESS, and scmd_eh_abort_handler then
>> calls scsi_finish_command.
>
> It depends on how the SCSI abort handler gets invoked.
> If the SCSI abort handler gets invoked because a SCSI command timed out
> that means that the block layer has already detected a timeout and also
> that the REQ_ATOM_COMPLETE bit has already been set. In this scenario if
> a SCSI LLD invokes scsi_done() that causes blk_complete_request() to
> return without invoking __blk_complete_request() and hence without
> invoking scsi_softirq_done().

(replying to my own e-mail)

Please note that scsi_eh_abort_cmds() neither checks nor sets the
REQ_ATOM_COMPLETE bit before it invokes hostt->eh_abort_handler(). Would
it make sense to modify that function such that it invokes
blk_abort_request() instead? The latter atomically tests and sets the
REQ_ATOM_COMPLETE bit before invoking the timeout handler, so a
completion that races with the abort would find the bit already set and
return without finishing the command a second time.

Bart.
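
For illustration only, here is a minimal user-space sketch of the
test-and-set idea discussed above. It is not kernel code: struct
demo_request and all demo_*() functions are hypothetical names, and a
C11 atomic_bool stands in for the REQ_ATOM_COMPLETE bit. The point is
only that whichever path sets the flag first owns the request, so the
command is finished once no matter in which order the two paths run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request; not the kernel's struct request. */
struct demo_request {
	atomic_bool complete;	/* plays the role of REQ_ATOM_COMPLETE */
	int finish_calls;	/* how often the command was finished */
};

/* Atomic test-and-set: returns true only for the first caller. */
static bool demo_mark_complete(struct demo_request *rq)
{
	return !atomic_exchange(&rq->complete, true);
}

/* Normal completion path, as reached when an LLD reports completion. */
static void demo_complete_request(struct demo_request *rq)
{
	if (!demo_mark_complete(rq))
		return;		/* the abort path already owns the request */
	rq->finish_calls++;
}

/* Abort path that claims the request before doing any abort work. */
static bool demo_abort_request(struct demo_request *rq)
{
	if (!demo_mark_complete(rq))
		return false;	/* the completion path won the race */
	rq->finish_calls++;	/* the error handler finishes the command */
	return true;
}

/* Run both paths in either order and report how often we finished. */
static int demo_run_both_paths(bool abort_first)
{
	struct demo_request rq = { false, 0 };

	if (abort_first) {
		demo_abort_request(&rq);
		demo_complete_request(&rq);
	} else {
		demo_complete_request(&rq);
		demo_abort_request(&rq);
	}
	assert(rq.finish_calls <= 1);
	return rq.finish_calls;
}
```

Calling demo_run_both_paths() with either argument yields a finish
count of one. Without the claim in demo_abort_request(), the
completion-first ordering would finish the command twice, which is the
double scsi_finish_command() call described earlier in this thread.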