From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932599AbbHYUhA (ORCPT <rfc822;w@1wt.eu>);
	Tue, 25 Aug 2015 16:37:00 -0400
Received: from mail-ig0-f171.google.com ([209.85.213.171]:35421 "EHLO
	mail-ig0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751887AbbHYUg5 (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 25 Aug 2015 16:36:57 -0400
Subject: Re: [patch|rfc] mtip32x: fix regression introduced by blk-mq per-hctx
 flush
To: Jeff Moyer <jmoyer@redhat.com>
References: <x494mknhwlh.fsf@segfault.boston.devel.redhat.com>
 <55B8E6D6.7090809@kernel.dk>
 <x49fv3xihxz.fsf@segfault.boston.devel.redhat.com>
Cc: Ming Lei <ming.lei@canonical.com>, Sam Bradshaw <sbradshaw@micron.com>,
        Asai Thambi SP <asamymuthupa@micron.com>, linux-kernel@vger.kernel.org,
        dmilburn@redhat.com
From: Jens Axboe <axboe@kernel.dk>
Message-ID: <55DCD1E7.8090006@kernel.dk>
Date: Tue, 25 Aug 2015 14:36:55 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <x49fv3xihxz.fsf@segfault.boston.devel.redhat.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/05/2015 02:44 PM, Jeff Moyer wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>
>> On 07/29/2015 08:22 AM, Jeff Moyer wrote:
>>> Hi,
>>>
>>> After commit f70ced091707 (blk-mq: support per-distpatch_queue flush
>>> machinery), the mtip32xx driver may oops upon module load due to walking
>>> off the end of an array in mtip_init_cmd.  On initialization of the
>>> flush_rq, init_request is called with request_index >= the maximum queue
>>> depth the driver supports.  For mtip32xx, this value is used to index
>>> into an array.  What this means is that the driver will walk off the end
>>> of the array, and either oops or cause random memory corruption.
>>>
>>> The problem is easily reproduced by doing modprobe/rmmod of the mtip32xx
>>> driver in a loop.  I can typically reproduce the problem in about 30
>>> seconds.
>>>
>>> Now, in the case of mtip32xx, it actually doesn't support flush/fua, so
>>> I think we can simply return without doing anything.  In addition, no
>>> other mq-enabled driver does anything with the request_index passed into
>>> init_request(), so no other driver is affected.  However, I'm not really
>>> sure what is expected of drivers.  Ming, what did you envision drivers
>>> would do when initializing the flush requests?
>>
>> This is really a bug in the core, we should not have to work around
>> this in the driver. I'll take a look at this.
>
> Hi, Jens,
>
> Any update on this?

To avoid stalling further on this, I'll apply the simple fix for 4.2 so 
we can move forward. It's a memory corruption issue and should get 
fixed; we can argue the details later.
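
For anyone following along, the failure mode from the report can be sketched
in isolation. This is only an illustration under stated assumptions, not the
mtip32xx code or the patch itself: sketch_port, sketch_cmd, sketch_init_cmd,
sketch_count_initialized and SKETCH_MAX_CMDS are all hypothetical names, and
the guard simply follows the "return without doing anything" idea from the
original report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue depth; stands in for the driver's maximum slot count. */
#define SKETCH_MAX_CMDS 4

/* Hypothetical per-slot command state. */
struct sketch_cmd {
	int initialized;
};

/* Hypothetical per-port state holding one entry per command slot. */
struct sketch_port {
	struct sketch_cmd cmds[SKETCH_MAX_CMDS];
};

/*
 * Loosely modeled on the shape of an init_request callback.
 * The flush request is reported to arrive with an index >= the queue
 * depth, so indexing cmds[] with it would walk off the end of the array.
 * Assumption: the device needs no per-slot setup for that request, so
 * the function returns success without touching the array.
 */
static int sketch_init_cmd(struct sketch_port *port, unsigned int request_idx)
{
	if (request_idx >= SKETCH_MAX_CMDS)
		return 0;

	port->cmds[request_idx].initialized = 1;
	return 0;
}

/* Count how many slots were set up; used to check nothing was skipped. */
static int sketch_count_initialized(const struct sketch_port *port)
{
	int count = 0;
	size_t i;

	for (i = 0; i < SKETCH_MAX_CMDS; i++)
		count += port->cmds[i].initialized;
	return count;
}
```

Calling sketch_init_cmd() with indexes 0 through SKETCH_MAX_CMDS inclusive
initializes every real slot and leaves the out-of-range call as a no-op.
Whether the guard belongs in the driver or in the core is exactly the open
question in this thread.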

-- 
Jens Axboe