From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tushar Dave Subject: Re: [PATCH net-next 1/5] bpf: use __GFP_COMP while allocating page Date: Wed, 12 Sep 2018 13:15:48 -0700 Message-ID: <8f22f67e-437b-6533-4c87-e13d4fe33f95@oracle.com> References: <1536694684-3200-1-git-send-email-tushar.n.dave@oracle.com> <1536694684-3200-2-git-send-email-tushar.n.dave@oracle.com> <2fd601e3-5679-08f4-1610-b3c22de80935@oracle.com> <6855742d-5925-0d94-e4f3-74bf118ca3d2@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit To: John Fastabend , ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net, santosh.shilimkar@oracle.com, jakub.kicinski@netronome.com, quentin.monnet@netronome.com, jiong.wang@netronome.com, sandipan@linux.vnet.ibm.com, kafai@fb.com, rdna@fb.com, yhs@fb.com, netdev@vger.kernel.org, rds-devel@oss.oracle.com, sowmini.varadhan@oracle.com Return-path: Received: from userp2130.oracle.com ([156.151.31.86]:58848 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726069AbeIMBXN (ORCPT ); Wed, 12 Sep 2018 21:23:13 -0400 In-Reply-To: <6855742d-5925-0d94-e4f3-74bf118ca3d2@gmail.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 09/12/2018 09:51 AM, John Fastabend wrote: > On 09/12/2018 09:21 AM, Tushar Dave wrote: >> >> >> On 09/11/2018 12:38 PM, Tushar Dave wrote: >>> Helper bpg_msg_pull_data() can allocate multiple pages while >>> linearizing multiple scatterlist elements into one shared page. >>> However, if the shared page has size > PAGE_SIZE, using >>> copy_page_to_iter() causes below warning. >>> >>> e.g. >>> [ 6367.019832] WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825 >>> page_copy_sane.part.8+0x0/0x8 >>> >>> To avoid above warning, use __GFP_COMP while allocating multiple >>> contiguous pages. >>> >>> Signed-off-by: Tushar Dave >>> --- >>>   net/core/filter.c | 3 ++- >>>   1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/net/core/filter.c b/net/core/filter.c >>> index d301134..0b40f95 100644 >>> --- a/net/core/filter.c >>> +++ b/net/core/filter.c >>> @@ -2344,7 +2344,8 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg) >>>       if (unlikely(bytes_sg_total > copy)) >>>           return -EINVAL; >>>   -    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC, get_order(copy)); >>> +    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP, >>> +               get_order(copy)); >>>       if (unlikely(!page)) >>>           return -ENOMEM; >>>       p = page_address(page); >> >> I should have mentioned that I could re-order this patch anywhere in >> patch series (as long as it doesn't break git bisect). I kept it first >> because I think it is more like a bug fix. I sent it along with these >> patch series considering we have a context of why and for what I need >> this patch! >> >> Daniel, John, >> >> Not sure if you guys hit this page_copy_sane warning. I hit it when RDS >> copy sg page to userspace using copy_page_to_iter(). >> > > I have not hit this before but I'm working on a set of patches for > test_sockmap to test the bpf_msg_pull_data() so I'll add a case > for this. Currently, we only test the simple case where we pull > data out of a single page in selftests. This was sufficient for > my use case but missed a handful of other valid cases. > >> example: >> >> RDS packet size 8KB represented in scatterlist: >> sg_data[0].length = 1400 >> sg_data[1].length = 1448 >> sg_data[2].length = 1448 >> sg_data[3].length = 1448 >> sg_data[4].length = 1448 >> sg_data[5].length = 1000 >> >> If start=0 and end=8192, bpf_msg_pull_data() will linearize all >> sg_data elements into one shared page. e.g. sg_data[0].length = 8192. >> Using this sg_data[0].page in function copy_page_to_iter() causes: >> WARNING: CPU: 2 PID: 7410 at lib/iov_iter.c:825 >> page_copy_sane.part.8+0x0/0x8 >> >> (FYI, patch 4 has code that does copy_page_to_iter) >> > > How about sending it as a bugfix against bpf on its own. It > looks like we could reproduce this with a combination of > bpf_msg_pull_data() + redirect (to ingress) perhaps. Either > way seems like a candidate for the bpf fixes tree to me. Done. Thanks. -Tushar > > Thanks, > John > >> >> Comments? >> >> Thanks in advance, >> -Tushar >> >> > >