From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx3-rdu2.redhat.com ([66.187.233.73]:48998 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752014AbeCPInd (ORCPT ); Fri, 16 Mar 2018 04:43:33 -0400 Subject: Re: [bpf-next V2 PATCH 02/15] xdp: introduce xdp_return_frame API and use in cpumap To: Jesper Dangaard Brouer Cc: netdev@vger.kernel.org, =?UTF-8?B?QmrDtnJuVMO2cGVs?= , magnus.karlsson@intel.com, eugenia@mellanox.com, John Fastabend , Eran Ben Elisha , Saeed Mahameed , galp@mellanox.com, Daniel Borkmann , Alexei Starovoitov , Tariq Toukan References: <152051439383.7018.11827926732878918934.stgit@firesoul> <152051447139.7018.551457480386098465.stgit@firesoul> <2fcc8356-4d68-30dc-5424-4dea618ac6b5@redhat.com> <20180309103536.1d7fdc58@redhat.com> <50eeb5c4-d9fd-8d5b-af98-01f17ec56ec4@redhat.com> <20180309170520.11967b11@redhat.com> From: Jason Wang Message-ID: Date: Fri, 16 Mar 2018 16:43:26 +0800 MIME-Version: 1.0 In-Reply-To: <20180309170520.11967b11@redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 2018年03月10日 00:05, Jesper Dangaard Brouer wrote: > On Fri, 9 Mar 2018 21:04:23 +0800 > Jason Wang wrote: > >> On 2018年03月09日 17:35, Jesper Dangaard Brouer wrote: >>> On Fri, 9 Mar 2018 15:24:10 +0800 >>> Jason Wang wrote: >>> >>>> On 2018年03月08日 21:07, Jesper Dangaard Brouer wrote: >>>>> Introduce an xdp_return_frame API, and convert over cpumap as >>>>> the first user, given it have queued XDP frame structure to leverage. >>>>> >>>>> Signed-off-by: Jesper Dangaard Brouer >>>>> --- >>>>> include/net/xdp.h | 32 +++++++++++++++++++++++++++ >>>>> kernel/bpf/cpumap.c | 60 +++++++++++++++++++++++++++++++-------------------- >>>>> net/core/xdp.c | 18 +++++++++++++++ >>>>> 3 files changed, 86 insertions(+), 24 deletions(-) >>>>> >>>>> diff --git a/include/net/xdp.h b/include/net/xdp.h >>>>> index b2362ddfa694..3cb726a6dc5b 100644 >>>>> --- a/include/net/xdp.h >>>>> +++ b/include/net/xdp.h >>>>> @@ -33,16 +33,48 @@ >>>>> * also mandatory during RX-ring setup. >>>>> */ >>>>> >>>>> +enum mem_type { >>>>> + MEM_TYPE_PAGE_SHARED = 0, /* Split-page refcnt based model */ >>>>> + MEM_TYPE_PAGE_ORDER0, /* Orig XDP full page model */ >>>>> + // Possible new ideas for types: >>>>> + // MEM_TYPE_PAGE_POOL, /* Will be added later */ >>>>> + // MEM_TYPE_AF_XDP, >>>>> + // MEM_TYPE_DEVICE_OWNED -- invoking an dev->ndo? >>>>> + MEM_TYPE_MAX, >>>>> +}; >>>> So if we plan to support dev->ndo, it looks to me two types AF_XDP and >>>> DEVICE_OWNED are sufficient? Driver can do what it wants (e.g page pool >>>> or ordinary page allocator) in ndo or what ever other callbacks. >>> So, the design is open to go both ways, we can figure out later. >>> >>> The reason I'm not calling page_pool from a driver level dev->ndo, is >>> that I'm trying to avoid invoking code in a module, as the driver >>> module code can be (in the process of being) unloaded from the kernel, >>> while another driver is calling xdp_return_frame. The current design >>> in patch 10, uses the RCU period to guarantee that the allocator >>> pointer is valid as long as the ID lookup succeeded. >>> >>> To do what you propose, we also need to guarantee that the net_device >>> cannot disappear so it is safe to invoke dev->ndo. To do so, the >>> driver likely have to add a rcu_barrier before unloading. I'm also >>> thinking that for MEM_TYPE_DEVICE_OWNED the allocator pointer ("under >>> protection") could be the net_device pointer, and we could take a >>> refcnt on dev (dev_hold/dev_put). >> Then it looks like a device (e.g tun) can lock another device for >> indefinite time. >> >>> Thus, I think it is doable, but lets >>> figure this out later. >>> >>> Another important design consideration, is that the xdp core need to >>> know how to release memory in case the ID lookup failed. >> Can we simply use put_page() in this case? > For now, yes. The (future) use case is when pages are kept DMA mapped, > then the page also need to be DMA unmapped. This probably needs careful thought. E.g current virtio-net driver does not do mapping by itself, instead it depends on the virtio core to do map and unmap. > To DMA unmap we need to > know the struct device (not net_device). (And then the discussion > start around how to know struct device is still valid). So still need a safe fallback if I understand this correctly. Thanks