From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752784AbbIROV4 (ORCPT <rfc822;w@1wt.eu>);
	Fri, 18 Sep 2015 10:21:56 -0400
Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:44207 "EHLO
	mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751754AbbIROVy (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Fri, 18 Sep 2015 10:21:54 -0400
Subject: Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()
To: Linus Torvalds <torvalds@linux-foundation.org>,
        Dave Chinner <david@fromorbit.com>
References: <20150916220704.GM3902@dastard> <20150917003738.GN3902@dastard>
 <CA+55aFw3=_asAhUR3=o0pv0vtOJpownyWJpAfgSVtJVeaX0+bQ@mail.gmail.com>
 <20150917021453.GO3902@dastard>
 <CA+55aFz6zfHQnrwtimgm9v10s8dkF-e1w1aQQ3aWperbZGT1Jg@mail.gmail.com>
 <20150917224230.GF8624@ret.masoncoding.com>
 <CA+55aFw40VNejeCtHC+-fPThK+xp9WnoNGQUwYW2JEVoVp5JJw@mail.gmail.com>
 <20150917235647.GG8624@ret.masoncoding.com> <20150918003735.GR3902@dastard>
 <CA+55aFzXW7t+1v3tmW2sxn-BLpvZ1_Ye6epiPWBeq70FoaSmFQ@mail.gmail.com>
 <20150918054044.GT3902@dastard>
 <CA+55aFw3Y51ZtaPK=r1dp66hDsGmc-dFz9wf-gYMGi5B0FP4KQ@mail.gmail.com>
 <CA+55aFzKG-TxBSWzw=k0q=E16_KwvF3BdXK0f7D-fXncLm05+w@mail.gmail.com>
CC: Chris Mason <clm@fb.com>, Jan Kara <jack@suse.cz>,
        Josef Bacik <jbacik@fb.com>, LKML <linux-kernel@vger.kernel.org>,
        linux-fsdevel <linux-fsdevel@vger.kernel.org>,
        Neil Brown <neilb@suse.de>, Christoph Hellwig <hch@lst.de>,
        Tejun Heo <tj@kernel.org>
From: Jens Axboe <axboe@fb.com>
Message-ID: <55FC1DF5.9050102@fb.com>
Date: Fri, 18 Sep 2015 08:21:41 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <CA+55aFzKG-TxBSWzw=k0q=E16_KwvF3BdXK0f7D-fXncLm05+w@mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [192.168.52.123]
X-Proofpoint-Spam-Reason: safe
X-FB-Internal: Safe
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.14.151,1.0.33,0.0.0000
 definitions=2015-09-18_07:2015-09-18,2015-09-18,1970-01-01 signatures=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/18/2015 12:06 AM, Linus Torvalds wrote:
> Gaah, my mailer autocompleted Jens' email with an old one..
>
> Sorry for the repeat email with the correct address.
>
> On Thu, Sep 17, 2015 at 11:04 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Thu, Sep 17, 2015 at 10:40 PM, Dave Chinner <david@fromorbit.com> wrote:
>>>
>>> PS: just hit another "did this just get broken in 4.3-rc1" issue - I
>>> can't run blktrace while there's a IO load because:
>>>
>>> $ sudo blktrace -d /dev/vdc
>>> BLKTRACESETUP(2) /dev/vdc failed: 5/Input/output error
>>> Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
>>> ....
>>>
>>> [  641.424618] blktrace: page allocation failure: order:5, mode:0x2040d0
>>> [  641.438933]  [<ffffffff811c1569>] kmem_cache_alloc_trace+0x129/0x400
>>> [  641.440240]  [<ffffffff811424f8>] relay_open+0x68/0x2c0
>>> [  641.441299]  [<ffffffff8115deb1>] do_blk_trace_setup+0x191/0x2d0
>>>
>>> (gdb) l *(relay_open+0x68)
>>> 0xffffffff811424f8 is in relay_open (kernel/relay.c:582).
>>> 577                     return NULL;
>>> 578             if (subbuf_size > UINT_MAX / n_subbufs)
>>> 579                     return NULL;
>>> 580
>>> 581             chan = kzalloc(sizeof(struct rchan), GFP_KERNEL);
>>> 582             if (!chan)
>>> 583                     return NULL;
>>> 584
>>> 585             chan->version = RELAYFS_CHANNEL_VERSION;
>>> 586             chan->n_subbufs = n_subbufs;
>>>
>>> and struct rchan has a member struct rchan_buf *buf[NR_CPUS];
>>> and CONFIG_NR_CPUS=8192, hence the attempt at an order 5 allocation
>>> that fails here....
>>
>> Hm. Have you always had MAX_SMP (and the NR_CPU==8192 that it causes)?
>> From a quick check, none of this code seems to be new.
>>
>> That said, having that
>>
>>          struct rchan_buf *buf[NR_CPUS];
>>
>> in "struct rchan" really is something we should fix. We really should
>> strive to not allocate things by CONFIG_NR_CPU's, but by the actual
>> real CPU count.
>>
>> This looks to be mostly Jens' code, and much of it harkens back to 2006. Jens?

The relayfs code mostly came out of IBM, but yes, that allocation
doesn't look nice. It's not a regression, though; I don't think that
code has changed in years. I'll take a stab at fixing this.
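
For reference, the order-5 figure in the failure report follows from
simple arithmetic. The sketch below is not the relay code: the header
struct and every demo_* name are hypothetical stand-ins, and it assumes
4 KiB pages and that a large kzalloc() is rounded up to a power-of-two
number of pages. With 8-byte pointers, 8192 pointer slots alone fill
exactly 64 KiB (order 4), so any other member in the structure pushes
the request to order 5.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on x86_64. */
#define DEMO_PAGE_SIZE 4096UL

/*
 * Hypothetical stand-in for the fixed-size members of struct rchan.
 * The real structure has more members, which only makes it larger.
 */
struct demo_chan_header {
	unsigned int version;
	size_t subbuf_size;
	size_t n_subbufs;
};

/* Smallest order such that (DEMO_PAGE_SIZE << order) >= size. */
static unsigned int demo_alloc_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((DEMO_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Size of a channel that embeds one pointer slot per possible CPU. */
static size_t demo_chan_size(size_t nr_cpus)
{
	return sizeof(struct demo_chan_header) + nr_cpus * sizeof(void *);
}
```

On a 64-bit build, demo_alloc_order(demo_chan_size(8192)) comes out as
5, while sizing the array by a real CPU count such as 16 stays at
order 0, which is the direction Linus suggests above.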

-- 
Jens Axboe