From: Toke Høiland-Jørgensen
To: Jakub Kicinski
Cc: David Miller, netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
Subject: Re: [PATCH net-next 2/2] xdp: Add devmap_idx map type for looking up devices by ifindex
In-Reply-To: <20190221163218.72905325@cakuba.netronome.com>
References: <155075021399.13610.12521373406832889226.stgit@alrua-x1> <155075021407.13610.6656977312753058829.stgit@alrua-x1> <20190221134923.53c40b11@cakuba.netronome.com> <874l8wiw3k.fsf@toke.dk> <20190221163218.72905325@cakuba.netronome.com>
Date: Fri, 22 Feb 2019 10:47:10 +0100
Message-ID: <87y368gnoh.fsf@toke.dk>

Jakub Kicinski writes:

> On Fri, 22 Feb 2019 00:02:23 +0100, Toke Høiland-Jørgensen wrote:
>> Jakub Kicinski writes:
>>
>> > On Thu, 21 Feb 2019 12:56:54 +0100, Toke Høiland-Jørgensen wrote:
>> >> A common pattern when using xdp_redirect_map() is to create a device map
>> >> where the lookup key is simply ifindex. Because device maps are arrays,
>> >> this leaves holes in the map, and the map has to be sized to fit the
>> >> largest ifindex, regardless of how many devices are actually
>> >> needed in the map.
>> >>
>> >> This patch adds a second type of device map where the key is interpreted as
>> >> an ifindex and looked up using a hashmap, instead of being used as an array
>> >> index. This leads to maps being densely packed, so they can be smaller.
>> >>
>> >> The default maps used by xdp_redirect() are changed to use the new map
>> >> type, which means that xdp_redirect() is no longer limited to ifindex < 64,
>> >> but instead to 64 total simultaneous interfaces per network namespace. This
>> >> also provides an easy way to compare the performance of devmap and
>> >> devmap_idx:
>> >>
>> >> xdp_redirect_map (devmap): 8394560 pkt/s
>> >> xdp_redirect (devmap_idx): 8179480 pkt/s
>> >>
>> >> Difference: 215080 pkt/s or 3.1 nanoseconds per packet.
>> >
>> > Could you share what the ifindex mix was here, to arrive at these
>> > numbers? How does it compare to using an array but not keying with
>> > ifindex?
>>
>> Just the standard set on my test machine; ifindex 1 through 9, except 8
>> in this case. So certainly no more than 1 ifindex in each hash bucket
>> for those numbers.
>
> Oh, I clearly misread your numbers, it's still slower than array, you
> just don't need the size limit.

Yeah, this is not about speeding up devmap, it's about lifting the size
restriction.

>> >> Signed-off-by: Toke Høiland-Jørgensen
>> >
>> >> +static int dev_map_idx_update_elem(struct bpf_map *map, void *key, void *value,
>> >> +				     u64 map_flags)
>> >> +{
>> >> +	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
>> >> +	struct bpf_dtab_netdev *dev, *old_dev;
>> >> +	u32 idx = *(u32 *)key;
>> >> +	u32 val = *(u32 *)value;
>> >> +	u32 bit;
>> >> +
>> >> +	if (unlikely(map_flags > BPF_EXIST))
>> >> +		return -EINVAL;
>> >> +	if (unlikely(map_flags == BPF_NOEXIST))
>> >> +		return -EEXIST;
>> >> +
>> >> +	old_dev = __dev_map_idx_lookup_elem(map, idx);
>> >> +	if (!val) {
>> >> +		if (!old_dev)
>> >> +			return 0;
>> >
>> > IMHO this is a fairly strange mix of array and hashmap semantics.
>> > I think you should stick to hashmap behaviour AFA flags and
>> > update/delete goes.
>>
>> Yeah, the double book-keeping is a bit strange, but it allows the actual
>> forwarding and flush code to be reused between both types of maps. I
>> think this is worth the slight semantic confusion :)
>
> I'm not sure I was clear, let me try again :) Your get_next_key only
> reports existing indexes if I read the code right, so that's not an
> array - in an array indexes always exist. What follows inserting 0
> should not be equivalent to delete and BPF_NOEXIST should be handled
> appropriately.

Ah, I see what you mean. Yeah, sure, I guess I can restrict deletion to
only working through explicit delete. I could also add a fail on
NOEXIST, but since each index is tied to a particular value, you can't
actually change the contents of each index, only insert and remove. So
why would you ever set that flag?

> Different maps behave differently, I think it's worth trying to limit
> the divergence in how things behave to the basic array and a hashmap
> models when possible.

So I don't actually think of this as a hashmap in the general sense;
after all, you can only store ifindexes in it, and key and value are
tied to one another. So it's an ifindex'ed devmap (which is also why I
named it devmap_idx and not devmap_hash); the fact that it's implemented
as a hashmap is just incidental.

So I guess it's a choice between being consistent with the other devmap
type, or with a general hashmap. I'm not actually sure that the latter
is less surprising? :)

-Toke
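
To make the "general hashmap" option concrete, here is a minimal userspace
sketch of the flag handling it would imply. It is an illustration only, not
the code from the patch. The names idx_map, idx_entry, idx_map_update(),
idx_map_lookup() and idx_map_delete(), the bucket count, and the capacity are
all hypothetical. The sketch assumes that the key must equal the stored
ifindex (one reading of "key and value are tied to one another") and that a
zero value is rejected rather than treated as a delete.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in the kernel UAPI header <linux/bpf.h>. */
enum { BPF_ANY = 0, BPF_NOEXIST = 1, BPF_EXIST = 2 };

#define IDX_MAP_BUCKETS 8  /* hypothetical; must be a power of two */
#define IDX_MAP_MAX     64 /* hypothetical capacity */

struct idx_entry {
	uint32_t ifindex;
	int used;
	struct idx_entry *next;
};

/* Zero-initialise before use, e.g. "struct idx_map m = {0};". */
struct idx_map {
	struct idx_entry *buckets[IDX_MAP_BUCKETS];
	struct idx_entry pool[IDX_MAP_MAX];
	unsigned int count;
};

struct idx_entry *idx_map_lookup(struct idx_map *m, uint32_t ifindex)
{
	struct idx_entry *e = m->buckets[ifindex & (IDX_MAP_BUCKETS - 1)];

	while (e && e->ifindex != ifindex)
		e = e->next;
	return e;
}

int idx_map_update(struct idx_map *m, uint32_t key, uint32_t value,
		   uint64_t flags)
{
	struct idx_entry *e;
	size_t i;

	assert(m != NULL);
	if (flags > BPF_EXIST)
		return -EINVAL;
	/* Assumption: key and value are tied, and 0 never means "delete". */
	if (value == 0 || key != value)
		return -EINVAL;

	e = idx_map_lookup(m, key);
	if (e)	/* key present: only BPF_NOEXIST fails */
		return flags == BPF_NOEXIST ? -EEXIST : 0;
	if (flags == BPF_EXIST)	/* key absent: BPF_EXIST fails */
		return -ENOENT;
	if (m->count == IDX_MAP_MAX)
		return -E2BIG;

	for (i = 0; m->pool[i].used; i++)
		;	/* count < IDX_MAP_MAX guarantees a free slot */
	e = &m->pool[i];
	e->ifindex = key;
	e->used = 1;
	e->next = m->buckets[key & (IDX_MAP_BUCKETS - 1)];
	m->buckets[key & (IDX_MAP_BUCKETS - 1)] = e;
	m->count++;
	return 0;
}

int idx_map_delete(struct idx_map *m, uint32_t key)
{
	struct idx_entry **pp = &m->buckets[key & (IDX_MAP_BUCKETS - 1)];
	struct idx_entry *e;

	while (*pp && (*pp)->ifindex != key)
		pp = &(*pp)->next;
	if (!*pp)
		return -ENOENT;	/* removal only through explicit delete */
	e = *pp;
	*pp = e->next;
	e->used = 0;
	e->next = NULL;
	m->count--;
	return 0;
}
```

With these semantics, inserting ifindex 3 twice with BPF_NOEXIST fails the
second time with -EEXIST, BPF_EXIST on an absent ifindex fails with -ENOENT,
and deleting an absent ifindex fails with -ENOENT instead of silently
succeeding.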