From mboxrd@z Thu Jan 1 00:00:00 1970 From: John Fastabend Subject: Re: [PATCH net-next 1/5] bpf: use __GFP_COMP while allocating page Date: Wed, 12 Sep 2018 09:51:24 -0700 Message-ID: <6855742d-5925-0d94-e4f3-74bf118ca3d2@gmail.com> References: <1536694684-3200-1-git-send-email-tushar.n.dave@oracle.com> <1536694684-3200-2-git-send-email-tushar.n.dave@oracle.com> <2fd601e3-5679-08f4-1610-b3c22de80935@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit To: Tushar Dave , ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, santosh.shilimkar@oracle.com, jakub.kicinski@netronome.com, quentin.monnet@netronome.com, jiong.wang@netronome.com, sandipan@linux.vnet.ibm.com, kafai@fb.com, rdna@fb.com, yhs@fb.com, netdev@vger.kernel.org, rds-devel@oss.oracle.com, sowmini.varadhan@oracle.com Return-path: Received: from mail-io1-f66.google.com ([209.85.166.66]:43323 "EHLO mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727716AbeILV5J (ORCPT ); Wed, 12 Sep 2018 17:57:09 -0400 Received: by mail-io1-f66.google.com with SMTP id y10-v6so740710ioa.10 for ; Wed, 12 Sep 2018 09:51:45 -0700 (PDT) In-Reply-To: <2fd601e3-5679-08f4-1610-b3c22de80935@oracle.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 09/12/2018 09:21 AM, Tushar Dave wrote: > > > On 09/11/2018 12:38 PM, Tushar Dave wrote: >> Helper bpg_msg_pull_data() can allocate multiple pages while >> linearizing multiple scatterlist elements into one shared page. >> However, if the shared page has size > PAGE_SIZE, using >> copy_page_to_iter() causes below warning. >> >> e.g. >> [ 6367.019832] WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825 >> page_copy_sane.part.8+0x0/0x8 >> >> To avoid above warning, use __GFP_COMP while allocating multiple >> contiguous pages. >> >> Signed-off-by: Tushar Dave >> --- >>   net/core/filter.c | 3 ++- >>   1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/net/core/filter.c b/net/core/filter.c >> index d301134..0b40f95 100644 >> --- a/net/core/filter.c >> +++ b/net/core/filter.c >> @@ -2344,7 +2344,8 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg) >>       if (unlikely(bytes_sg_total > copy)) >>           return -EINVAL; >>   -    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC, get_order(copy)); >> +    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP, >> +               get_order(copy)); >>       if (unlikely(!page)) >>           return -ENOMEM; >>       p = page_address(page); > > I should have mentioned that I could re-order this patch anywhere in > patch series (as long as it doesn't break git bisect). I kept it first > because I think it is more like a bug fix. I sent it along with these > patch series considering we have a context of why and for what I need > this patch! > > Daniel, John, > > Not sure if you guys hit this page_copy_sane warning. I hit it when RDS > copy sg page to userspace using copy_page_to_iter(). > I have not hit this before but I'm working on a set of patches for test_sockmap to test the bpf_msg_pull_data() so I'll add a case for this. Currently, we only test the simple case where we pull data out of a single page in selftests. This was sufficient for my use case but missed a handful of other valid cases. > example: > > RDS packet size 8KB represented in scatterlist: > sg_data[0].length = 1400 > sg_data[1].length = 1448 > sg_data[2].length = 1448 > sg_data[3].length = 1448 > sg_data[4].length = 1448 > sg_data[5].length = 1000 > > If start=0 and end=8192, bpf_msg_pull_data() will linearize all > sg_data elements into one shared page. e.g. sg_data[0].length = 8192. > Using this sg_data[0].page in function copy_page_to_iter() causes: > WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825 > page_copy_sane.part.8+0x0/0x8 > > (FYI, patch 4 has code that does copy_page_to_iter) > How about sending it as a bugfix against bpf on its own. It looks like we could reproduce this with a combination of bpf_msg_pull_data() + redirect (to ingress) perhaps. Either way seems like a candidate for the bpf fixes tree to me. Thanks, John > > Comments? > > Thanks in advance, > -Tushar > >