From mboxrd@z Thu Jan  1 00:00:00 1970
From: John Fastabend <john.fastabend@gmail.com>
Subject: Re: [PATCH net-next 1/5] bpf: use __GFP_COMP while allocating page
Date: Wed, 12 Sep 2018 09:51:24 -0700
Message-ID: <6855742d-5925-0d94-e4f3-74bf118ca3d2@gmail.com>
References: <1536694684-3200-1-git-send-email-tushar.n.dave@oracle.com>
 <1536694684-3200-2-git-send-email-tushar.n.dave@oracle.com>
 <2fd601e3-5679-08f4-1610-b3c22de80935@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
To: Tushar Dave <tushar.n.dave@oracle.com>, ast@kernel.org,
        daniel@iogearbox.net, davem@davemloft.net,
        santosh.shilimkar@oracle.com, jakub.kicinski@netronome.com,
        quentin.monnet@netronome.com, jiong.wang@netronome.com,
        sandipan@linux.vnet.ibm.com, kafai@fb.com, rdna@fb.com, yhs@fb.com,
        netdev@vger.kernel.org, rds-devel@oss.oracle.com,
        sowmini.varadhan@oracle.com
Return-path: <netdev-owner@vger.kernel.org>
Received: from mail-io1-f66.google.com ([209.85.166.66]:43323 "EHLO
        mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1727716AbeILV5J (ORCPT
        <rfc822;netdev@vger.kernel.org>); Wed, 12 Sep 2018 17:57:09 -0400
Received: by mail-io1-f66.google.com with SMTP id y10-v6so740710ioa.10
        for <netdev@vger.kernel.org>; Wed, 12 Sep 2018 09:51:45 -0700 (PDT)
In-Reply-To: <2fd601e3-5679-08f4-1610-b3c22de80935@oracle.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 09/12/2018 09:21 AM, Tushar Dave wrote:
> 
> 
> On 09/11/2018 12:38 PM, Tushar Dave wrote:
>> Helper bpf_msg_pull_data() can allocate multiple pages while
>> linearizing multiple scatterlist elements into one shared page.
>> However, if the shared page has size > PAGE_SIZE, using
>> copy_page_to_iter() causes the warning below.
>>
>> e.g.
>> [ 6367.019832] WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825
>> page_copy_sane.part.8+0x0/0x8
>>
>> To avoid the above warning, use __GFP_COMP while allocating multiple
>> contiguous pages.
>>
>> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
>> ---
>>   net/core/filter.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index d301134..0b40f95 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -2344,7 +2344,8 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>       if (unlikely(bytes_sg_total > copy))
>>           return -EINVAL;
>>   -    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC, get_order(copy));
>> +    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
>> +               get_order(copy));
>>       if (unlikely(!page))
>>           return -ENOMEM;
>>       p = page_address(page);
> 
> I should have mentioned that I could reorder this patch anywhere in
> the patch series (as long as it doesn't break git bisect). I kept it
> first because I think it is more of a bug fix. I sent it along with
> this patch series so that there is context for why I need this
> patch!
> 
> Daniel, John,
> 
> Not sure if you guys hit this page_copy_sane warning. I hit it when RDS
> copies an sg page to userspace using copy_page_to_iter().
> 

I have not hit this before, but I'm working on a set of patches for
test_sockmap to test bpf_msg_pull_data(), so I'll add a case
for this. Currently, the selftests only cover the simple case where
we pull data out of a single page. This was sufficient for
my use case but missed a handful of other valid cases.

> example:
> 
> RDS packet size 8KB represented in scatterlist:
> sg_data[0].length = 1400
> sg_data[1].length = 1448
> sg_data[2].length = 1448
> sg_data[3].length = 1448
> sg_data[4].length = 1448
> sg_data[5].length = 1000
> 
> If start=0 and end=8192, bpf_msg_pull_data() will linearize all
> sg_data elements into one shared page, e.g. sg_data[0].length = 8192.
> Using this sg_data[0].page in function copy_page_to_iter() causes:
> WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825
> page_copy_sane.part.8+0x0/0x8
> 
> (FYI, patch 4 has code that does copy_page_to_iter)
> 

How about sending it as a bugfix against bpf on its own? It
looks like we could reproduce this with a combination of
bpf_msg_pull_data() + redirect (to ingress) perhaps. Either
way, it seems like a candidate for the bpf fixes tree to me.

Thanks,
John

> 
> Comments?
> 
> Thanks in advance,
> -Tushar
> 
>