From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E0CE723BCEB; Thu, 31 Jul 2025 16:41:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.52 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753980101; cv=none; b=jY/VgGkigumQf6cT5mEiZss2FLNqoknkk89eb/MJ4QSascR8xysWXjkVjQGrS44RzdbkwAmbFqzA1ireuwDyitT78dMX9YLCb5bhIu9OWevEe7XHoMG3TP12VSA2a9oZg/DQn70697osDxl7GVo9zoJ8DxSoLZa43Q9p3iKPjFs= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753980101; c=relaxed/simple; bh=FOn/6rSOFrcvFlub8uI3wyAurRPAFWyHtW66MMdoEnQ=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=AXtpzm9mM/ZDzcLAsB+IfL3XZ9TJ1qsmBH3ohf5leepRGi3sl1ONJ90E4gZMhWZFbLlEzY73hcUPEurxV17e2HDys50MtvKqpeQCM0HjKa/RHuyE9/rSnkxpEuaJxvtQF15nXj8mTb826xgIvevAPjM7zP+Ou8etKXFpFDFiiNo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=kEY2Uz3p; arc=none smtp.client-ip=209.85.128.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kEY2Uz3p" Received: by mail-wm1-f52.google.com with SMTP id 5b1f17b1804b1-4563cfac2d2so10809795e9.3; Thu, 31 Jul 2025 09:41:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1753980098; x=1754584898; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Xsu6ObGdZ62l8P/qwUGJQLtAjhbxoXSmlIgJ/7hyZRU=; b=kEY2Uz3pVoapjD4YJJrkPwogcO+4fxuB5IqT+mfSwaOP11pYN4Ba/eFbwtVr+l5QlQ bTYfLt6gHitGsN2UPf6PCXql1uQw6gLwJjrb5DA5WUiEp2CX1KWJwVXKpwTUBAj1e/FA J+nt923sbDoLRdm7V+WWrPrWZCnQwopjQpWOko9BPkGfpVNwo3LINmFdh7PPG7bOHmvR G2VfxdZMrziTypJXMDLWOgt0AoO3rxy9gAOQBzjgZHUv35K5eYCTM/afFTvymjHMzbix EwU/admIy6YCGr/bSsIC00teeEbu5XxNFvBhq2IK0zK50348qCkawWt+Rzhdy7VAUY4g AyEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753980098; x=1754584898; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Xsu6ObGdZ62l8P/qwUGJQLtAjhbxoXSmlIgJ/7hyZRU=; b=F0ejwvQWVwk+3aBagI4AcUd9UNgP5p5uuiRcNrWphLvUwNs764zLjvUPtT8OpgX3Di 0gvFdUC2NSHJX2Yi5OLqKu6h2dArZ/l4HGnuiyyzHXKqdua6qp7WbQ1qtZKtJIu68g8Z H/bUavib3wxeL2PUVGw3N6GANg5aqobDPc5MOHP8BeiJadxVCnpf1bjIkylE3bXU8CLJ uTXVe0RAIA3YzgxjhBpxPzBh26uvxMd44q1a0VXY7ZvbS7+PlprPm6akQzKBnP3nD2bG agKig/jX5KdbTG0fF9tK2u51IhEtQs1R7qhuZLI/m8mmRJEWYQg3mL2tZ4ctDgFJGJ75 FftA== X-Forwarded-Encrypted: i=1; AJvYcCWJlBBdrnVjLxuVtyo89cQh19w++3NkkLyXIruY+wS2mUPwlBrIJYDL04632uuYvrccx9tyHQVTwbiQUD2F@vger.kernel.org, AJvYcCWgmg7S8gubYy713E066aHIDmSOccS1L4gAhfSjZRHHJLKdrqPw35QTSH8CtTTV3xODLaXdxTYN9XVKSxKEMZx6@vger.kernel.org, AJvYcCWqThGoNqMI7+/cDMwLy3p+iv95TjoXeP7j+SBas5K1dalnvG2uHuaQ/neLL0fbr20JxzrjDJem@vger.kernel.org, AJvYcCX3aNdWMluw+sAX6KGTPu9VxNSboeKPAZeMLf1FaiXbJPTDJo2yPzDn7wF0nckHkPpcPs8=@vger.kernel.org X-Gm-Message-State: AOJu0YwYCsH8rQlst0r3wuzVnCGriCNWkWjdgvusEjWExSw5owiQRbLU p4HAe+f9C0bX181iQB7TVsoW+HYkZ79vYfk2C3aeDNh9TrzFSGAzmx8E4xYR8q3Hy2G2IAL4bB9 GQ7twV7JPwFJIKmX35lmjCTQW6GvHuoibcg== X-Gm-Gg: ASbGncunMnqRES0/Q7nOx6l1qfzu+waGbJXz6UKffI0WMaFKHZMep/t1w0V3t6OOREk HJzxuFguFSMGZ673d0CoOrSYA1z6z8PbRihYhfDvGvDequ7hyGLBma59OIlfzFxdh29EP5sjzoU Ym/fxKj7PxyCdf8/ajW+hKc+hHj1ABPK215tn5XJCsc9BeeyIQbVnUJPM1uSvz3Wo1dvI50ifSB 0KkgXWNEiibkm8mi/iUavs= X-Google-Smtp-Source: AGHT+IFM8ZRUhTMMFPoPU9m6U65rQihPR6DTHa7d6s2IreKPbGglE506/v94vJCHwTLm/k801E+rGxEYCdn9O+Ebt44= X-Received: by 2002:a05:600c:3145:b0:453:81a:2f3f with SMTP id 5b1f17b1804b1-45892bed97emr80555055e9.30.1753980097839; Thu, 31 Jul 2025 09:41:37 -0700 (PDT) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20250722150152.1158205-1-matt@readmodwrite.com> In-Reply-To: From: Alexei Starovoitov Date: Thu, 31 Jul 2025 09:41:22 -0700 X-Gm-Features: Ac12FXxYox-ZTydO-9bGxcV7tjEfh_Od76qWoqE9ySEFca9hzhP54c2ZNF2ytCk Message-ID: Subject: Re: [PATCH bpf-next v3] selftests/bpf: Add LPM trie microbenchmarks To: Matt Fleming Cc: Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Eduard Zingerman , Shuah Khan , kernel-team , Jesper Dangaard Brouer , "open list:KERNEL SELFTEST FRAMEWORK" , LKML , bpf , Martin KaFai Lau , Yonghong Song , Network Development , Matt Fleming Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Jul 29, 2025 at 6:56=E2=80=AFAM Matt Fleming wrote: > > On Mon, Jul 28, 2025 at 3:35=E2=80=AFPM Alexei Starovoitov > wrote: > > > > Please make a full description of what the test does, > > since it's not trivial to decipher from the code. > > If I'm reading it correctly, first, the user space > > makes the LPM completely full and then lookup/update > > use random key-s within range. > > Yep, that's correct. I'll provide a more comprehensive description in v4. > > > But delete looks different. It seems the kernel delete > > operation can interleave with user space refilling the LPM, > > so ... > > > > > lookup: throughput 7.423 =C2=B1 0.023 M ops/s ( 7.423M ops/prod= ), latency 134.710 ns/op > > > update: throughput 2.643 =C2=B1 0.015 M ops/s ( 2.643M ops/prod= ), latency 378.310 ns/op > > > delete: throughput 0.712 =C2=B1 0.008 M ops/s ( 0.712M ops/prod= ), latency 1405.152 ns/op > > > > this comparison doesn't look like apples to apples, > > since delete will include user space refill time. > > Yeah, you're right. Though we only measure the delete time from in the > BPF prog, delete throughput is essentially blocked while the refill > happens and because things are measured with a 1-second timer > (bench.c:sigalarm_handler) the refill time gets accounted for anyway. > I could try extrapolating the delete time like I've done for the free > op, i.e. we calculate how many ops were completed in what fraction of > a second. > > > > free: throughput 0.574 =C2=B1 0.003 K ops/s ( 0.574K ops/prod= ), latency 1.743 ms/op > > > > Does this measure the free-ing of full LPM ? > > Yes, this measures the total time to free every element in the trie. > > > > +static void __lpm_validate(void) > > > > why double underscore ? > > Just lpm_validate() ? > > The double underscore is used for the common validation parts, but > I'll rename this to include "_common()" (along with all other uses of > __). No. It's also wrong. Double underscore or _common suffix are both meaningless. The helper name should describe what it's for. > > > + /* > > > + * Ideally we'd have access to the map ID but that's already > > > + * freed before we enter trie_free(). > > > + */ > > > + BPF_CORE_READ_STR_INTO(&name, map, name); > > > + if (bpf_strncmp(name, BPF_OBJ_NAME_LEN, "trie_free_map")) > > > + return 0; > > > + > > > + val =3D bpf_ktime_get_ns(); > > > + bpf_map_update_elem(&latency_free_start, &map, &val, BPF_ANY)= ; > > > > Looks like there is only one lpm map. > > What's the point of that extra map ? > > Just have a global var for start time ? > > Sure, I can make this a global variable for the start time instead. > > > bpf_get_prandom_u32() is not free > > and modulo operation isn't free either. > > The benchmark includes their time. > > It's ok to have it, but add a mode where the bench > > tests linear lookup/update too with simple key.data++ > > Good idea. > > > Since the map suppose to full before we start all keys > > should be there, right? > > Yes. > > > Let's add a sanity check that update() succeeds. > > Will do. > > > > +static int delete (__u32 index, bool *need_refill) > > > +{ > > > + struct trie_key key =3D { > > > + .data =3D deleted_entries, > > > + .prefixlen =3D prefixlen, > > > + }; > > > + > > > + bpf_map_delete_elem(&trie_map, &key); > > > + > > > + /* Do we need to refill the map? */ > > > + if (++deleted_entries =3D=3D nr_entries) { > > > + /* > > > + * Atomicity isn't required because DELETE only suppo= rts > > > + * one producer running concurrently. What we need is= a > > > + * way to track how many entries have been deleted fr= om > > > + * the trie between consecutive invocations of the BP= F > > > + * prog because a single bpf_loop() call might not > > > + * delete all entries, e.g. when NR_LOOPS < nr_entrie= s. > > > + */ > > > + deleted_entries =3D 0; > > > + *need_refill =3D true; > > > + return 1; > > > > This early break is another reason that makes > > 'delete' op different from 'lookup/update'. > > Pls make all 3 consistent, so they can be compared to each other. > > Hmm.. I'm not quite sure how to do that. lookup/update don't modify > the number of entries in the map whereas delete does (update only > updates existing entries, it doesn't create new ones). So when the map > is empty it needs to be refilled. You're right that somehow the time > to refill needs to be removed from the delete throughput/latency > numbers but fundamentally these 3 ops are not the same and I don't see > how to treat them as such. well, random-key update when the map is full is also quite different from random-key update when the map is empty. Instead doing an update from user space do timed ops: 1 start with empty map, update (aka insert) all keys sequentially 2 lookup all sequentially 3 delete all sequentially 4 update (aka insert) all sequentially 5 lookup random 6 update random 7 delete all random The elapsed time for 1 and 4 should be exactly the same. While all others might have differences, but all can be compared to each other and reasoned about.