From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 977ADC7618B for ; Thu, 25 Jul 2019 15:05:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 71501218D4 for ; Thu, 25 Jul 2019 15:05:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388850AbfGYPFH convert rfc822-to-8bit (ORCPT ); Thu, 25 Jul 2019 11:05:07 -0400 Received: from mail-ed1-f68.google.com ([209.85.208.68]:46202 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388312AbfGYPFH (ORCPT ); Thu, 25 Jul 2019 11:05:07 -0400 Received: by mail-ed1-f68.google.com with SMTP id d4so50566012edr.13 for ; Thu, 25 Jul 2019 08:05:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version:content-transfer-encoding; bh=+mu2HirervjgYd6JY58DZ33I6K7Dku+vAMAcY8H7Y7U=; b=AH4Feythe6U9RCaQVjgekkxZZQNXT8QOw5or9sRkVxa9ge/B7nSwOjaj+STiK/QF1B kogKzUDDEMlwAKtXHfkiTEIh3oZsq1yJ0Wt1tWm+JMrCc/fhpDKY3lLvkWRIDNAFnsxi 2hEZYkpbq3HgiZhe+KRucbVLHwRP31Oi/SZ3aFfi6+SAsJP+ELKbl50tqZFscTKh6PAQ RV678j5ZJRkkSMqYbOv3t43oXz0VQwNhXicmYhEoi/bj3tqYa6AUi2Kp8r+TOrg94AFe m9871wRYn8nBj5DAFkAK4STDtlQ5KdiCY7CncLpX9/E1oOOaOStJB3YDQxJ5nCEUTPad bIAw== X-Gm-Message-State: APjAAAUYQWnzfkWcc5GzJX+UIXuFqiMasv2/f4aHdlN2d8DOdA6LYBoj sWzLsgqbiBYTD1h6uib0pg/Shw== X-Google-Smtp-Source: APXvYqwNlPc3Cg/VWxt//7wC5Kzi7b/rHF9HLsDmm0o58+fV5ECH+EWJTV6Rc29AommslAEUGEfHOg== X-Received: by 2002:a17:906:2544:: with SMTP id j4mr31018548ejb.221.1564067105505; Thu, 25 Jul 2019 08:05:05 -0700 (PDT) Received: from alrua-x1.borgediget.toke.dk ([2a00:7660:6da:443::2]) by smtp.gmail.com with ESMTPSA id j11sm9840907ejr.69.2019.07.25.08.05.04 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Thu, 25 Jul 2019 08:05:04 -0700 (PDT) Received: by alrua-x1.borgediget.toke.dk (Postfix, from userid 1000) id C97A21800C5; Thu, 25 Jul 2019 17:05:03 +0200 (CEST) From: Toke =?utf-8?Q?H=C3=B8iland-J=C3=B8rgensen?= To: Jesper Dangaard Brouer Cc: Daniel Borkmann , Alexei Starovoitov , netdev@vger.kernel.org, David Miller , Jakub Kicinski , =?utf-8?B?QmrDtnJuIFTDtnBlbA==?= , Yonghong Song , brouer@redhat.com Subject: Re: [PATCH bpf-next v4 3/6] xdp: Add devmap_hash map type for looking up devices by hashed index In-Reply-To: <20190725133730.3750c66c@carbon> References: <156379636786.12332.17776973951938230698.stgit@alrua-x1> <156379636866.12332.6546616116016146789.stgit@alrua-x1> <20190725100717.0c4e8265@carbon> <87muh2z9os.fsf@toke.dk> <20190725133730.3750c66c@carbon> X-Clacks-Overhead: GNU Terry Pratchett Date: Thu, 25 Jul 2019 17:05:03 +0200 Message-ID: <8736iuyx28.fsf@toke.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Jesper Dangaard Brouer writes: > On Thu, 25 Jul 2019 12:32:19 +0200 > Toke Høiland-Jørgensen wrote: > >> Jesper Dangaard Brouer writes: >> >> > On Mon, 22 Jul 2019 13:52:48 +0200 >> > Toke Høiland-Jørgensen wrote: >> > >> >> +static inline struct hlist_head *dev_map_index_hash(struct bpf_dtab *dtab, >> >> + int idx) >> >> +{ >> >> + return &dtab->dev_index_head[idx & (NETDEV_HASHENTRIES - 1)]; >> >> +} >> > >> > It is good for performance that our "hash" function is simply an AND >> > operation on the idx. We want to keep it this way. >> > >> > I don't like that you are using NETDEV_HASHENTRIES, because the BPF map >> > infrastructure already have a way to specify the map size (struct >> > bpf_map_def .max_entries). BUT for performance reasons, to keep the >> > AND operation, we would need to round up the hash-array size to nearest >> > power of 2 (or reject if user didn't specify a power of 2, if we want >> > to "expose" this limit to users). >> >> But do we really want the number of hash buckets to be equal to the max >> number of entries? The values are not likely to be evenly distributed, >> so we'll end up with big buckets if the number is small, meaning we'll >> blow performance on walking long lists in each bucket. > > The requested change makes it user-configurable, instead of fixed 256 > entries. I've seen production use-case with >5000 net_devices, thus > they need a knob to increase this (to avoid the list walking as you > mention). Ah, I see. That makes sense; I thought you wanted to make it smaller (cf. the previous discussion about it being too big). Still, it seems counter-intuitive to overload max_entries in this way. I do see that this is what the existing hash map is also doing, though, so I guess there is some precedence. I do wonder if we'll end up getting bad performance from the hash being too simplistic, but I guess we can always fix that later. >> Also, if the size is dynamic the size needs to be loaded from memory >> instead of being a compile-time constant, which will presumably hurt >> performance (though not sure by how much)? > > To counter this, the mask value which need to be loaded from memory, > needs to be placed next to some other struct member which is already in > use (at least on same cacheline, Intel have some 16 bytes access micro > optimizations, which I've never been able to measure, as its in 0.5 > nanosec scale). In the fast path (i.e., in __xdp_map_lookup_elem) we will have already loaded map->max_entries since it's on the same cacheline as map_type which we use to disambiguate which function to call. So it should be fine to just use that directly. I'll send a new version with this change :) -Toke