Subject: Re: [patch|rfc] mtip32x: fix regression introduced by blk-mq per-hctx flush
To: Jeff Moyer
References: <55B8E6D6.7090809@kernel.dk>
Cc: Ming Lei, Sam Bradshaw, Asai Thambi SP, linux-kernel@vger.kernel.org, dmilburn@redhat.com
From: Jens Axboe
Message-ID: <55DCD1E7.8090006@kernel.dk>
Date: Tue, 25 Aug 2015 14:36:55 -0600
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/05/2015 02:44 PM, Jeff Moyer wrote:
> Jens Axboe writes:
>
>> On 07/29/2015 08:22 AM, Jeff Moyer wrote:
>>> Hi,
>>>
>>> After commit f70ced091707 (blk-mq: support per-distpatch_queue flush
>>> machinery), the mtip32xx driver may oops upon module load due to walking
>>> off the end of an array in mtip_init_cmd. On initialization of the
>>> flush_rq, init_request is called with request_index >= the maximum queue
>>> depth the driver supports. For mtip32xx, this value is used to index
>>> into an array. What this means is that the driver will walk off the end
>>> of the array, and either oops or cause random memory corruption.
>>>
>>> The problem is easily reproduced by doing modprobe/rmmod of the mtip32xx
>>> driver in a loop. I can typically reproduce the problem in about 30
>>> seconds.
>>>
>>> Now, in the case of mtip32xx, it actually doesn't support flush/fua, so
>>> I think we can simply return without doing anything. In addition, no
>>> other mq-enabled driver does anything with the request_index passed into
>>> init_request(), so no other driver is affected. However, I'm not really
>>> sure what is expected of drivers. Ming, what did you envision drivers
>>> would do when initializing the flush requests?
>>
>> This is really a bug in the core, we should not have to work around
>> this in the driver. I'll take a look at this.
>
> Hi, Jens,
>
> Any update on this?

To avoid stalling further on this, I'll apply the simple fix for 4.2 so
we can move forward. It's a memory corruption issue and should get
fixed, we can argue details later.

-- 
Jens Axboe
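
For readers following the thread, the "simple fix" described in the quoted
report (return early when request_index is at or past the queue depth) can
be illustrated with a small userspace sketch. This is not the mtip32xx
patch and does not use the blk-mq API: demo_dev, demo_cmd, demo_init_cmd,
demo_count_initialized and DEMO_MAX_SLOTS are hypothetical names, and the
depth of 4 is an arbitrary assumption chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical queue depth; the real driver's limit is not given here. */
#define DEMO_MAX_SLOTS 4

/* Hypothetical per-command state, standing in for a driver's command. */
struct demo_cmd {
	int initialized;
};

/* Hypothetical device: a fixed table indexed by request index. */
struct demo_dev {
	struct demo_cmd slots[DEMO_MAX_SLOTS];
	unsigned int guard;	/* sits after the table; must never change */
};

/*
 * Sketch of an init callback. An index at or past DEMO_MAX_SLOTS stands
 * in for the flush request from the report: it has no slot in the table,
 * so the function returns success without touching memory.
 */
int demo_init_cmd(struct demo_dev *dev, unsigned int request_idx)
{
	if (request_idx >= DEMO_MAX_SLOTS)
		return 0;	/* nothing to set up for this index */

	dev->slots[request_idx].initialized = 1;
	return 0;
}

/* Count how many table entries have been initialized. */
unsigned int demo_count_initialized(const struct demo_dev *dev)
{
	unsigned int i, n = 0;

	for (i = 0; i < DEMO_MAX_SLOTS; i++)
		n += dev->slots[i].initialized ? 1 : 0;
	return n;
}
```

Without the bounds check, a call with request_idx equal to DEMO_MAX_SLOTS
would write past the end of slots[], which is the out-of-bounds access the
report describes. Whether the check belongs in the driver or in the core is
the open question in the thread; the sketch only shows the driver-side
variant.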