Date: Fri, 12 Sep 2025 13:22:46 -0700
From: Stanislav Fomichev
To: Samiullah Khawaja
Cc: Jakub Kicinski, "David S. Miller", Eric Dumazet, Paolo Abeni,
 almasrymina@google.com, willemb@google.com, Joe Damato,
 mkarsten@uwaterloo.ca, netdev@vger.kernel.org
Subject: Re: [PATCH net-next v9 0/2] Add support to do threaded napi busy poll
References: <20250911212901.1718508-1-skhawaja@google.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20250911212901.1718508-1-skhawaja@google.com>

On 09/11, Samiullah Khawaja wrote:
> Extend the already existing support of threaded napi poll to do continuous
> busy polling.
>
> This is used for doing continuous polling of napi to fetch descriptors
> from backing RX/TX queues for low latency applications. Allow enabling
> of threaded busypoll using netlink so this can be enabled on a set of
> dedicated napis for low latency applications.
>
> Once enabled user can fetch the PID of the kthread doing NAPI polling
> and set affinity, priority and scheduler for it depending on the
> low-latency requirements.
>
> Extend the netlink interface to allow enabling/disabling threaded
> busypolling at individual napi level.
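
For readers following along, the "set affinity, priority and scheduler"
step could look roughly like the sketch below. pin_napi_thread and the RUN
switch are hypothetical names made up for illustration and are not part of
the series; the taskset and chrt invocations mirror the ones used in the
setup script quoted further down. The PID is assumed to have been fetched
already.

```shell
#!/bin/sh
# Sketch only: pin a NAPI kthread to one CPU and move it to SCHED_FIFO.
# pin_napi_thread is a hypothetical helper, not something from the patches.
#
# Usage: pin_napi_thread PID CPU PRIORITY
# By default the commands are only printed, so the helper can be tried
# without root. Set RUN=1 to actually execute them through sudo.
pin_napi_thread() {
    pid=$1 cpu=$2 prio=$3
    for cmd in "taskset -pc $cpu $pid" "chrt -f -p $prio $pid"; do
        if [ "${RUN:-0}" = 1 ]; then
            # Intentionally unquoted so the command string is word-split.
            sudo $cmd
        else
            echo "$cmd"
        fi
    done
}
```

Calling `pin_napi_thread "$NAPI_T" 2 50` with RUN unset prints the two
commands for review before anything is changed on the system.
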
>
> We use this for our AF_XDP based hard low-latency usecase with usecs
> level latency requirement. For our usecase we want low jitter and stable
> latency at P99.
>
> Following is an analysis and comparison of available (and compatible)
> busy poll interfaces for a low latency usecase with stable P99. This can
> be suitable for applications that want very low latency at the expense
> of cpu usage and efficiency.
>
> Already existing APIs (SO_BUSYPOLL and epoll) allow busy polling a NAPI
> backing a socket, but the missing piece is a mechanism to busy poll a
> NAPI instance in a dedicated thread while ignoring available events or
> packets, regardless of the userspace API. Most existing mechanisms are
> designed to work in a pattern where you poll until new packets or events
> are received, after which userspace is expected to handle them.
>
> As a result, one has to hack together a solution using a mechanism
> intended to receive packets or events, not to simply NAPI poll. NAPI
> threaded busy polling, on the other hand, provides this capability
> natively, independent of any userspace API. This makes it really easy to
> setup and manage.
>
> For analysis we use an AF_XDP based benchmarking tool `xsk_rr`. The
> description of the tool and how it tries to simulate the real workload
> is following,
>
> - It sends UDP packets between 2 machines.
> - The client machine sends packets at a fixed frequency. To maintain the
>   frequency of the packet being sent, we use open-loop sampling. That is
>   the packets are sent in a separate thread.
> - The server replies to the packet inline by reading the pkt from the
>   recv ring and replies using the tx ring.
> - To simulate the application processing time, we use a configurable
>   delay in usecs on the client side after a reply is received from the
>   server.
>
> The xsk_rr tool is posted separately as an RFC for tools/testing/selftest.
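
To make the "open-loop" part concrete: in an open-loop sender the send
times depend only on the configured rate, never on when replies arrive.
The sketch below just prints such a schedule. It is not taken from xsk_rr;
open_loop_schedule and the 20000 pps figure in the usage note are
illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: print the send offsets (in usecs) of an open-loop sender.
# open_loop_schedule is a hypothetical helper, not part of xsk_rr.
#
# Usage: open_loop_schedule RATE_PPS COUNT
open_loop_schedule() {
    rate=$1 count=$2
    # Fixed gap between packets, derived from the rate alone
    # (integer division, so the rate should divide 1000000 evenly).
    period_us=$((1000000 / rate))
    i=0
    while [ "$i" -lt "$count" ]; do
        # Packet i is due at i * period, no matter how long the
        # previous round trip took.
        echo $((i * period_us))
        i=$((i + 1))
    done
}
```

For example, `open_loop_schedule 20000 3` prints 0, 50 and 100, one per
line: a packet every 50 usecs regardless of reply latency.
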
>
> We use this tool with following napi polling configurations,
>
> - Interrupts only
> - SO_BUSYPOLL (inline in the same thread where the client receives the
>   packet).
> - SO_BUSYPOLL (separate thread and separate core)
> - Threaded NAPI busypoll
>
> System is configured using following script in all 4 cases,
>
> ```
> echo 0 | sudo tee /sys/class/net/eth0/threaded
> echo 0 | sudo tee /proc/sys/kernel/timer_migration
> echo off | sudo tee /sys/devices/system/cpu/smt/control
>
> sudo ethtool -L eth0 rx 1 tx 1
> sudo ethtool -G eth0 rx 1024
>
> echo 0 | sudo tee /proc/sys/net/core/rps_sock_flow_entries
> echo 0 | sudo tee /sys/class/net/eth0/queues/rx-0/rps_cpus
>
> # pin IRQs on CPU 2
> IRQS="$(gawk '/eth0-(TxRx-)?1/ {match($1, /([0-9]+)/, arr); \
> 				print arr[0]}' < /proc/interrupts)"
> for irq in "${IRQS}"; \
> 	do echo 2 | sudo tee /proc/irq/$irq/smp_affinity_list; done
>
> echo -1 | sudo tee /proc/sys/kernel/sched_rt_runtime_us
>
> for i in /sys/devices/virtual/workqueue/*/cpumask; \
> 	do echo $i; echo 1,2,3,4,5,6 > $i; done
>
> if [[ -z "$1" ]]; then
> 	echo 400 | sudo tee /proc/sys/net/core/busy_read
> 	echo 100 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
> 	echo 15000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
> fi
>
> sudo ethtool -C eth0 adaptive-rx off adaptive-tx off rx-usecs 0 tx-usecs 0
>
> if [[ "$1" == "enable_threaded" ]]; then
> 	echo 0 | sudo tee /proc/sys/net/core/busy_poll
> 	echo 0 | sudo tee /proc/sys/net/core/busy_read
> 	echo 100 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
> 	echo 15000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
> 	NAPI_ID=$(ynl --family netdev --output-json --do queue-get \
> 		--json '{"ifindex": '${IFINDEX}', "id": '0', "type": "rx"}' | jq '."napi-id"')
>
> 	ynl --family netdev --json '{"id": "'${NAPI_ID}'", "threaded": "busy-poll-enabled"}'
>
> 	NAPI_T=$(ynl --family netdev --output-json --do napi-get \
> 		--json '{"id": "'$NAPI_ID'"}' | jq '."pid"')
>
> 	sudo chrt -f -p 50 $NAPI_T
>
> 	# pin threaded poll thread to CPU 2
> 	sudo taskset -pc 2 $NAPI_T
> fi
>
> if [[ "$1" == "enable_interrupt" ]]; then
> 	echo 0 | sudo tee /proc/sys/net/core/busy_read
> 	echo 0 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
> 	echo 15000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
> fi
> ```
>
> To enable various configurations, script can be run as following,
>
> - Interrupt Only
> ```
>