linux-doc.vger.kernel.org archive mirror
* [PATCH net-next] docs: networking: document NAPI
@ 2023-03-15 22:30 Jakub Kicinski
  2023-03-15 22:46 ` Stephen Hemminger
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-15 22:30 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, Jakub Kicinski, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

Add basic documentation about NAPI. We can stop linking to the ancient
doc on the LF wiki.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: jesse.brandeburg@intel.com
CC: anthony.l.nguyen@intel.com
CC: corbet@lwn.net
CC: linux-doc@vger.kernel.org
---
 .../device_drivers/ethernet/intel/e100.rst    |   3 +-
 .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
 .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-
 Documentation/networking/index.rst            |   1 +
 Documentation/networking/napi.rst             | 231 ++++++++++++++++++
 include/linux/netdevice.h                     |  13 +-
 6 files changed, 244 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/networking/napi.rst

diff --git a/Documentation/networking/device_drivers/ethernet/intel/e100.rst b/Documentation/networking/device_drivers/ethernet/intel/e100.rst
index 3d4a9ba21946..371b7e5c3293 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/e100.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/e100.rst
@@ -151,8 +151,7 @@ NAPI
 
 NAPI (Rx polling mode) is supported in the e100 driver.
 
-See https://wiki.linuxfoundation.org/networking/napi for more
-information on NAPI.
+See :ref:`Documentation/networking/napi.rst <napi>` for more information.
 
 Multiple Interfaces on Same Ethernet Broadcast Network
 ------------------------------------------------------
diff --git a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
index ac35bd472bdc..c495c4e16b3b 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
@@ -399,8 +399,8 @@ operate only in full duplex and only at their native speed.
 NAPI
 ----
 NAPI (Rx polling mode) is supported in the i40e driver.
-For more information on NAPI, see
-https://wiki.linuxfoundation.org/networking/napi
+
+See :ref:`Documentation/networking/napi.rst <napi>` for more information.
 
 Flow Control
 ------------
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst
index c6a233e68ad6..90ddbc912d8d 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst
@@ -367,9 +367,7 @@ NAPI
 ----
 NAPI (Rx polling mode) is supported in the ixgb driver.
 
-See https://wiki.linuxfoundation.org/networking/napi for more information on
-NAPI.
-
+See :ref:`Documentation/networking/napi.rst <napi>` for more information.
 
 Known Issues/Troubleshooting
 ============================
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 4ddcae33c336..24bb256d6d53 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -73,6 +73,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics.
    mpls-sysctl
    mptcp-sysctl
    multiqueue
+   napi
    netconsole
    netdev-features
    netdevices
diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
new file mode 100644
index 000000000000..4d87032a7e9e
--- /dev/null
+++ b/Documentation/networking/napi.rst
@@ -0,0 +1,231 @@
+.. _napi:
+
+====
+NAPI
+====
+
+NAPI is the event handling mechanism used by the Linux networking stack.
+The name NAPI does not stand for anything in particular.
+
+In basic operation the device notifies the host about new events via an interrupt.
+The host then schedules a NAPI instance to process the events.
+The device may also be polled for events via NAPI without receiving
+an interrupt first (busy polling).
+
+NAPI processing usually happens in the software interrupt context,
+but the user may choose to use separate kernel threads for NAPI processing.
+
+All in all, NAPI abstracts away from drivers the context and configuration
+of event (packet Rx and Tx) processing.
+
+Driver API
+==========
+
+The two most important elements of NAPI are the struct napi_struct
+and the associated poll method. struct napi_struct holds the state
+of the NAPI instance while the method is the driver-specific event
+handler. The method will typically free Tx packets which had been
+transmitted and process newly received packets.
+
+.. _drv_ctrl:
+
+Control API
+-----------
+
+netif_napi_add() and netif_napi_del() add/remove a NAPI instance
+from the system. The instances are attached to the netdevice passed
+as argument (and will be deleted automatically when netdevice is
+unregistered). Instances are added in a disabled state.
+
+napi_enable() and napi_disable() manage the disabled state.
+A disabled NAPI can't be scheduled and its poll method is guaranteed
+to not be invoked. napi_disable() waits for ownership of the NAPI
+instance to be released.
+
+Datapath API
+------------
+
+napi_schedule() is the basic method of scheduling a NAPI poll.
+Drivers should call this function in their interrupt handler
+(see :ref:`drv_sched` for more info). A successful call to napi_schedule()
+will take ownership of the NAPI instance.
+
+Some time after NAPI is scheduled, the driver's poll method will be
+called to process the events/packets. The method takes a ``budget``
+argument - drivers can process completions for any number of Tx
+packets but should only process up to ``budget`` number of
+Rx packets. Rx processing is usually much more expensive.
+
+.. warning::
+
+   ``budget`` may be 0 if the core tries to only process Tx completions
+   and no Rx packets.
+
+The poll method returns the amount of work performed. If the driver still
+has outstanding work to do (e.g. ``budget`` was exhausted)
+the poll method should return exactly ``budget``. In that case
+the NAPI instance will be serviced/polled again (without the
+need to be scheduled).
+
+If event processing has been completed (all outstanding packets
+processed) the poll method should call napi_complete_done()
+before returning. napi_complete_done() releases the ownership
+of the instance.
+
+.. warning::
+
+   The case of finishing all events and using exactly ``budget``
+   must be handled carefully. There is no way to report this
+   (rare) condition to the stack, so the driver must either
+   not call napi_complete_done() and wait to be called again,
+   or return ``budget - 1``.
+
+   If ``budget`` is 0 napi_complete_done() should never be called.
+
+Call sequence
+-------------
+
+Drivers should not make assumptions about the exact sequencing
+of calls. The poll method may be called without the driver scheduling
+the instance (unless the instance is disabled). Similarly, it's
+not guaranteed that the poll method will be called, even
+if napi_schedule() succeeded (e.g. if the instance gets disabled).
+
+As mentioned in the :ref:`drv_ctrl` section - napi_disable() and subsequent
+calls to the poll method only wait for the ownership of the instance
+to be released, not for the poll method to exit. This means that
+drivers should avoid accessing any data structures after calling
+napi_complete_done().
+
+.. _drv_sched:
+
+Scheduling and IRQ masking
+--------------------------
+
+Drivers should keep the interrupts masked after scheduling
+the NAPI instance - until NAPI polling finishes any further
+interrupts are unnecessary.
+
+Drivers which have to mask the interrupts explicitly (as opposed
+to IRQ being auto-masked by the device) should use the napi_schedule_prep()
+and __napi_schedule() calls:
+
+.. code-block:: c
+
+  if (napi_schedule_prep(&v->napi)) {
+      mydrv_mask_rxtx_irq(v->idx);
+      /* schedule after masking to avoid races */
+      __napi_schedule(&v->napi);
+  }
+
+IRQ should only be unmasked after a successful call to napi_complete_done():
+
+.. code-block:: c
+
+  if (budget && napi_complete_done(&v->napi, work_done)) {
+      mydrv_unmask_rxtx_irq(v->idx);
+      return min(work_done, budget - 1);
+  }
+
+napi_schedule_irqoff() is a variant of napi_schedule() which takes advantage
+of guarantees given by being invoked in IRQ context (no need to
+mask interrupts). Note that PREEMPT_RT forces all interrupts
+to be threaded so the interrupt may need to be marked ``IRQF_NO_THREAD``
+to avoid issues on real-time kernel configurations.
+
+Instance to queue mapping
+-------------------------
+
+Modern devices have multiple NAPI instances (struct napi_struct) per
+interface. There is no strong requirement on how the instances are
+mapped to queues and interrupts. NAPI is primarily a polling/processing
+abstraction without many user-facing semantics. That said, most networking
+devices end up using NAPI in fairly similar ways.
+
+NAPI instances most often correspond 1:1:1 to interrupts and queue pairs
+(queue pair is a set of a single Rx and single Tx queue).
+
+In less common cases a NAPI instance may be used for multiple queues
+or Rx and Tx queues can be serviced by separate NAPI instances on a single
+core. Regardless of the queue assignment, however, there is usually still
+a 1:1 mapping between NAPI instances and interrupts.
+
+It's worth noting that the ethtool API uses a "channel" terminology where
+each channel can be either ``rx``, ``tx`` or ``combined``. It's not clear
+what constitutes a channel; the recommended interpretation is to understand
+a channel as an IRQ/NAPI which services queues of a given type. For example
+a configuration of 1 ``rx``, 1 ``tx`` and 1 ``combined`` channel is expected
+to utilize 3 interrupts, 2 Rx and 2 Tx queues.
+
+User API
+========
+
+User interactions with NAPI depend on NAPI instance ID. The instance IDs
+are only visible to the user through the ``SO_INCOMING_NAPI_ID`` socket option.
+It's not currently possible to query IDs used by a given device.
+
+Software IRQ coalescing
+-----------------------
+
+NAPI does not perform any explicit event coalescing by default.
+In most scenarios batching happens due to IRQ coalescing which is done
+by the device. There are cases where software coalescing is helpful.
+
+NAPI can be configured to arm a repoll timer instead of unmasking
+the hardware interrupts as soon as all packets are processed.
+The ``gro_flush_timeout`` sysfs configuration of the netdevice
+is reused to control the delay of the timer, while
+``napi_defer_hard_irqs`` controls the number of consecutive empty polls
+before NAPI gives up and goes back to using hardware IRQs.
+
+Busy polling
+------------
+
+Busy polling allows a user process to check for incoming packets before
+the device interrupt fires. As is the case with any busy polling, it trades
+off CPU cycles for lower latency (in fact production uses of NAPI busy
+polling are not well known).
+
+The user can enable busy polling by either setting ``SO_BUSY_POLL`` on
+selected sockets or using the global ``net.core.busy_poll`` and
+``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
+also exists.
+
+IRQ mitigation
+---------------
+
+While busy polling is supposed to be used by low latency applications,
+a similar mechanism can be used for IRQ mitigation.
+
+Very high request-per-second applications (especially routing/forwarding
+applications and especially applications using AF_XDP sockets) may not
+want to be interrupted until they finish processing a request or a batch
+of packets.
+
+Such applications can pledge to the kernel that they will perform a busy
+polling operation periodically, and the driver should keep the device IRQs
+permanently masked. This mode is enabled by using the ``SO_PREFER_BUSY_POLL``
+socket option. To avoid system misbehavior, the pledge is revoked
+if ``gro_flush_timeout`` passes without any busy poll call.
+
+The NAPI budget for busy polling is lower than the default (which makes
+sense given the low latency intention of normal busy polling). This is
+not the case with IRQ mitigation, however, so the budget can be adjusted
+with the ``SO_BUSY_POLL_BUDGET`` socket option.
+
+Threaded NAPI
+-------------
+
+Use dedicated kernel threads rather than software IRQ context for NAPI
+processing. The configuration is per netdevice and will affect all
+NAPI instances of that device. Each NAPI instance will spawn a separate
+thread (called ``napi/${ifc-name}-${napi-id}``).
+
+It is recommended to pin each kernel thread to a single CPU, the same
+CPU as services the interrupt. Note that the mapping between IRQs and
+NAPI instances may not be trivial (and is driver dependent).
+The NAPI instance IDs will be assigned in the opposite order
+to the process IDs of the kernel threads.
+
+Threaded NAPI is controlled by writing 0/1 to the ``threaded`` file in
+netdev's sysfs directory.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 470085b121d3..b439f877bc3a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -508,15 +508,18 @@ static inline bool napi_reschedule(struct napi_struct *napi)
 	return false;
 }
 
-bool napi_complete_done(struct napi_struct *n, int work_done);
 /**
- *	napi_complete - NAPI processing complete
- *	@n: NAPI context
+ * napi_complete_done - NAPI processing complete
+ * @n: NAPI context
+ * @work_done: number of packets processed
  *
- * Mark NAPI processing as complete.
- * Consider using napi_complete_done() instead.
+ * Mark NAPI processing as complete. Should only be called if poll budget
+ * has not been completely consumed.
+ * Prefer over napi_complete().
  * Return false if device should avoid rearming interrupts.
  */
+bool napi_complete_done(struct napi_struct *n, int work_done);
+
 static inline bool napi_complete(struct napi_struct *n)
 {
 	return napi_complete_done(n, 0);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
@ 2023-03-15 22:46 ` Stephen Hemminger
  2023-03-15 22:52 ` Stephen Hemminger
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Stephen Hemminger @ 2023-03-15 22:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, 15 Mar 2023 15:30:44 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> Add basic documentation about NAPI. We can stop linking to the ancient
> doc on the LF wiki.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: jesse.brandeburg@intel.com
> CC: anthony.l.nguyen@intel.com
> CC: corbet@lwn.net
> CC: linux-doc@vger.kernel.org

And the ancient LF wiki can be updated to point to kernel.org



* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
  2023-03-15 22:46 ` Stephen Hemminger
@ 2023-03-15 22:52 ` Stephen Hemminger
  2023-03-15 23:11   ` Jakub Kicinski
  2023-03-15 23:12 ` Tony Nguyen
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Stephen Hemminger @ 2023-03-15 22:52 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, 15 Mar 2023 15:30:44 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> Add basic documentation about NAPI. We can stop linking to the ancient
> doc on the LF wiki.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: jesse.brandeburg@intel.com
> CC: anthony.l.nguyen@intel.com
> CC: corbet@lwn.net
> CC: linux-doc@vger.kernel.org

The one thing missing, is how to handle level vs edge triggered interrupts.
For level triggered interrupts, the re-enable is inherently not racy.
I.e re-enabling interrupt when packet is present will cause an interrupt.
But for devices with edge triggered interrupts, it is often necessary to
poll and manually schedule again. Older documentation referred to this
as the "rotten packet" problem.

Maybe this is no longer a problem for drivers?
Or maybe all new hardware uses PCI MSI and is level triggered?


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:52 ` Stephen Hemminger
@ 2023-03-15 23:11   ` Jakub Kicinski
  2023-03-16  1:36     ` Stephen Hemminger
  2023-03-16 22:59     ` Florian Fainelli
  0 siblings, 2 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-15 23:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, 15 Mar 2023 15:52:02 -0700 Stephen Hemminger wrote:
> On Wed, 15 Mar 2023 15:30:44 -0700
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > Add basic documentation about NAPI. We can stop linking to the ancient
> > doc on the LF wiki.
> > 
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > ---
> > CC: jesse.brandeburg@intel.com
> > CC: anthony.l.nguyen@intel.com
> > CC: corbet@lwn.net
> > CC: linux-doc@vger.kernel.org  
> 
> The one thing missing, is how to handle level vs edge triggered interrupts.
> For level triggered interrupts, the re-enable is inherently not racy.
> I.e re-enabling interrupt when packet is present will cause an interrupt.
> But for devices with edge triggered interrupts, it is often necessary to
> poll and manually schedule again. Older documentation referred to this
> as the "rotten packet" problem.
> 
> Maybe this is no longer a problem for drivers?
> Or maybe all new hardware uses PCI MSI and is level triggered?

It's still a problem depending on the exact design of the interrupt
controller in the chip / tradeoffs the SW wants to make.
I haven't actually read the LF doc, because I wasn't sure about the
licenses (sigh). The rotten packet problem does not come up in reviews
very often, so it wasn't front of mind. I'm not sure I'd be able to
concisely describe it, actually :S There are many races and conditions
which can lead to it.


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
  2023-03-15 22:46 ` Stephen Hemminger
  2023-03-15 22:52 ` Stephen Hemminger
@ 2023-03-15 23:12 ` Tony Nguyen
  2023-03-15 23:17   ` Jakub Kicinski
  2023-03-16  1:38 ` Stephen Hemminger
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Tony Nguyen @ 2023-03-15 23:12 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, jesse.brandeburg, corbet, linux-doc

On 3/15/2023 3:30 PM, Jakub Kicinski wrote:
> Add basic documentation about NAPI. We can stop linking to the ancient
> doc on the LF wiki.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: jesse.brandeburg@intel.com
> CC: anthony.l.nguyen@intel.com
> CC: corbet@lwn.net
> CC: linux-doc@vger.kernel.org
> ---
>   .../device_drivers/ethernet/intel/e100.rst    |   3 +-
>   .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
>   .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-

ice has an entry as well; we recently updated the (more?) ancient link 
to the ancient one :P

>   Documentation/networking/index.rst            |   1 +
>   Documentation/networking/napi.rst             | 231 ++++++++++++++++++
>   include/linux/netdevice.h                     |  13 +-
>   6 files changed, 244 insertions(+), 12 deletions(-)
>   create mode 100644 Documentation/networking/napi.rst



* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 23:12 ` Tony Nguyen
@ 2023-03-15 23:17   ` Jakub Kicinski
  2023-03-15 23:19     ` Jakub Kicinski
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-15 23:17 UTC (permalink / raw)
  To: Tony Nguyen
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg, corbet,
	linux-doc

On Wed, 15 Mar 2023 16:12:42 -0700 Tony Nguyen wrote:
> >   .../device_drivers/ethernet/intel/e100.rst    |   3 +-
> >   .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
> >   .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-  
> 
> ice has an entry as well; we recently updated the (more?) ancient link 
> to the ancient one :P

Sweet Baby J. I'll fix that in v2, and there seems to be another 
link in CAN.. should have grepped harder :)


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 23:17   ` Jakub Kicinski
@ 2023-03-15 23:19     ` Jakub Kicinski
  2023-03-16  0:20       ` Tony Nguyen
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-15 23:19 UTC (permalink / raw)
  To: Tony Nguyen
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg, corbet,
	linux-doc

On Wed, 15 Mar 2023 16:17:06 -0700 Jakub Kicinski wrote:
> On Wed, 15 Mar 2023 16:12:42 -0700 Tony Nguyen wrote:
> > >   .../device_drivers/ethernet/intel/e100.rst    |   3 +-
> > >   .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
> > >   .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-  
> > 
> > ice has an entry as well; we recently updated the (more?) ancient link 
> > to the ancient one :P
> 
> Sweet Baby J. I'll fix that in v2, and there seems to be another 
> link in CAN.. should have grepped harder :)

BTW are there any ixgb parts still in use? The driver doesn't seem
like much of a burden, but IIRC it was a bit of an oddball design
so maybe we can axe it?


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 23:19     ` Jakub Kicinski
@ 2023-03-16  0:20       ` Tony Nguyen
  2023-03-16 21:27         ` Tony Nguyen
  0 siblings, 1 reply; 24+ messages in thread
From: Tony Nguyen @ 2023-03-16  0:20 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg, corbet,
	linux-doc

On 3/15/2023 4:19 PM, Jakub Kicinski wrote:
> On Wed, 15 Mar 2023 16:17:06 -0700 Jakub Kicinski wrote:
>> On Wed, 15 Mar 2023 16:12:42 -0700 Tony Nguyen wrote:
>>>>    .../device_drivers/ethernet/intel/e100.rst    |   3 +-
>>>>    .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
>>>>    .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-
>>>
>>> ice has an entry as well; we recently updated the (more?) ancient link
>>> to the ancient one :P
>>
>> Sweet Baby J. I'll fix that in v2, and there seems to be another
>> link in CAN.. should have grepped harder :)
> 
> BTW are there any ixgb parts still in use. The driver doesn't seem
> like much of a burden, but IIRC it was a bit of an oddball design
> so maybe we can axe it?

Let me ask around.


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 23:11   ` Jakub Kicinski
@ 2023-03-16  1:36     ` Stephen Hemminger
  2023-03-16 22:59     ` Florian Fainelli
  1 sibling, 0 replies; 24+ messages in thread
From: Stephen Hemminger @ 2023-03-16  1:36 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, 15 Mar 2023 16:11:42 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> > 
> > The one thing missing, is how to handle level vs edge triggered interrupts.
> > For level triggered interrupts, the re-enable is inherently not racy.
> > I.e re-enabling interrupt when packet is present will cause an interrupt.
> > But for devices with edge triggered interrupts, it is often necessary to
> > poll and manually schedule again. Older documentation referred to this
> > as the "rotten packet" problem.
> > 
> > Maybe this is no longer a problem for drivers?
> > Or maybe all new hardware uses PCI MSI and is level triggered?  
> 
> It's still a problem depending on the exact design of the interrupt
> controller in the chip / tradeoffs the SW wants to make.
> I haven't actually read the LF doc, because I wasn't sure about the
> licenses (sigh)

I wrote the old NAPI from some older info that was available back then.


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
                   ` (2 preceding siblings ...)
  2023-03-15 23:12 ` Tony Nguyen
@ 2023-03-16  1:38 ` Stephen Hemminger
  2023-03-16  2:58   ` Jakub Kicinski
  2023-03-23  0:44   ` Jamal Hadi Salim
  2023-03-16  9:50 ` Bagas Sanjaya
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Stephen Hemminger @ 2023-03-16  1:38 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, 15 Mar 2023 15:30:44 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> Add basic documentation about NAPI. We can stop linking to the ancient
> doc on the LF wiki.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: jesse.brandeburg@intel.com
> CC: anthony.l.nguyen@intel.com
> CC: corbet@lwn.net
> CC: linux-doc@vger.kernel.org

Older pre LF wiki NAPI docs still survive here
https://lwn.net/2002/0321/a/napi-howto.php3


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16  1:38 ` Stephen Hemminger
@ 2023-03-16  2:58   ` Jakub Kicinski
  2023-03-16 12:03     ` Francois Romieu
  2023-03-23  0:44   ` Jamal Hadi Salim
  1 sibling, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-16  2:58 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, 15 Mar 2023 18:38:46 -0700 Stephen Hemminger wrote:
> On Wed, 15 Mar 2023 15:30:44 -0700
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > Add basic documentation about NAPI. We can stop linking to the ancient
> > doc on the LF wiki.
> > 
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > ---
> > CC: jesse.brandeburg@intel.com
> > CC: anthony.l.nguyen@intel.com
> > CC: corbet@lwn.net
> > CC: linux-doc@vger.kernel.org  
> 
> Older pre LF wiki NAPI docs still survive here
> https://lwn.net/2002/0321/a/napi-howto.php3

Wow, it's over 20 years old and still largely relevant!
Makes me feel that we stopped innovating :)

Why were all the docs hosted out of tree back then?
Because there was no web rendering of the in-tree stuff?
Or no in-tree stuff at all?


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
                   ` (3 preceding siblings ...)
  2023-03-16  1:38 ` Stephen Hemminger
@ 2023-03-16  9:50 ` Bagas Sanjaya
  2023-03-16 10:29 ` Toke Høiland-Jørgensen
  2023-03-16 23:16 ` Florian Fainelli
  6 siblings, 0 replies; 24+ messages in thread
From: Bagas Sanjaya @ 2023-03-16  9:50 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, jesse.brandeburg, anthony.l.nguyen,
	corbet, linux-doc


On Wed, Mar 15, 2023 at 03:30:44PM -0700, Jakub Kicinski wrote:
> -See https://wiki.linuxfoundation.org/networking/napi for more
> -information on NAPI.
> +See :ref:`Documentation/networking/napi.rst <napi>` for more information.
> <snipped>...
> -For more information on NAPI, see
> -https://wiki.linuxfoundation.org/networking/napi
> +
> +See :ref:`Documentation/networking/napi.rst <napi>` for more information.
>  
> <snipped>...
> -See https://wiki.linuxfoundation.org/networking/napi for more information on
> -NAPI.
> -
> +See :ref:`Documentation/networking/napi.rst <napi>` for more information.

I prefer not to use :ref:, but simply write out the full document path
to achieve the same internal link.

> +====
> +NAPI
> +====
> +
> +NAPI is the event handling mechanism used by the Linux networking stack.
> +The name NAPI does not stand for anything in particular.
> +
> +In basic operation the device notifies the host about new events via an interrupt.
> +The host then schedules a NAPI instance to process the events.
> +The device may also be polled for events via NAPI without receiving
> +an interrupt first (busy polling).
> +
> +NAPI processing usually happens in the software interrupt context,
> +but the user may choose to use separate kernel threads for NAPI processing.
> +
> +All in all, NAPI abstracts away from drivers the context and configuration
> +of event (packet Rx and Tx) processing.
> +
> +Driver API
> +==========
> +
> +The two most important elements of NAPI are the struct napi_struct
> +and the associated poll method. struct napi_struct holds the state
> +of the NAPI instance while the method is the driver-specific event
> +handler. The method will typically free Tx packets which had been
> +transmitted and process newly received packets.
> +
> +.. _drv_ctrl:
> +
> +Control API
> +-----------
> +
> +netif_napi_add() and netif_napi_del() add/remove a NAPI instance
> +from the system. The instances are attached to the netdevice passed
> +as argument (and will be deleted automatically when netdevice is
> +unregistered). Instances are added in a disabled state.
> +
> +napi_enable() and napi_disable() manage the disabled state.
> +A disabled NAPI can't be scheduled and its poll method is guaranteed
> +to not be invoked. napi_disable() waits for ownership of the NAPI
> +instance to be released.
> +
> +Datapath API
> +------------
> +
> +napi_schedule() is the basic method of scheduling a NAPI poll.
> +Drivers should call this function in their interrupt handler
> +(see :ref:`drv_sched` for more info). A successful call to napi_schedule()
> +will take ownership of the NAPI instance.
> +
> +Some time after NAPI is scheduled, the driver's poll method will be
> +called to process the events/packets. The method takes a ``budget``
> +argument - drivers can process completions for any number of Tx
> +packets but should only process up to ``budget`` number of
> +Rx packets. Rx processing is usually much more expensive.
> +
> +.. warning::
> +
> +   ``budget`` may be 0 if the core tries to only process Tx completions
> +   and no Rx packets.
> +
> +The poll method returns the amount of work performed. If the driver still
> +has outstanding work to do (e.g. ``budget`` was exhausted)
> +the poll method should return exactly ``budget``. In that case
> +the NAPI instance will be serviced/polled again (without the
> +need to be scheduled).
> +
> +If event processing has been completed (all outstanding packets
> +processed) the poll method should call napi_complete_done()
> +before returning. napi_complete_done() releases the ownership
> +of the instance.
> +
> +.. warning::
> +
> +   The case of finishing all events and using exactly ``budget``
> +   must be handled carefully. There is no way to report this
> +   (rare) condition to the stack, so the driver must either
> +   not call napi_complete_done() and wait to be called again,
> +   or return ``budget - 1``.
> +
> +   If ``budget`` is 0 napi_complete_done() should never be called.
> +
> +Call sequence
> +-------------
> +
> +Drivers should not make assumptions about the exact sequencing
> +of calls. The poll method may be called without the driver scheduling
> +the instance (unless the instance is disabled). Similarly, it's
> +not guaranteed that the poll method will be called, even
> +if napi_schedule() succeeded (e.g. if the instance gets disabled).
> +
> +As mentioned in the :ref:`drv_ctrl` section - napi_disable() and subsequent
> +calls to the poll method only wait for the ownership of the instance
> +to be released, not for the poll method to exit. This means that
> +drivers should avoid accessing any data structures after calling
> +napi_complete_done().
> +
> +.. _drv_sched:
> +
> +Scheduling and IRQ masking
> +--------------------------
> +
> +Drivers should keep the interrupts masked after scheduling
> +the NAPI instance - until NAPI polling finishes, any further
> +interrupts are unnecessary.
> +
> +Drivers which have to mask the interrupts explicitly (as opposed
> +to IRQ being auto-masked by the device) should use the napi_schedule_prep()
> +and __napi_schedule() calls:
> +
> +.. code-block:: c
> +
> +  if (napi_schedule_prep(&v->napi)) {
> +      mydrv_mask_rxtx_irq(v->idx);
> +      /* schedule after masking to avoid races */
> +      __napi_schedule(&v->napi);
> +  }
> +
> +IRQ should only be unmasked after a successful call to napi_complete_done():
> +
> +.. code-block:: c
> +
> +  if (budget && napi_complete_done(&v->napi, work_done)) {
> +      mydrv_unmask_rxtx_irq(v->idx);
> +      return min(work_done, budget - 1);
> +  }
> +
> +napi_schedule_irqoff() is a variant of napi_schedule() which takes advantage
> +of guarantees given by being invoked in IRQ context (no need to
> +mask interrupts). Note that PREEMPT_RT forces all interrupts
> +to be threaded so the interrupt may need to be marked ``IRQF_NO_THREAD``
> +to avoid issues on real-time kernel configurations.
> +
> +Instance to queue mapping
> +-------------------------
> +
> +Modern devices have multiple NAPI instances (struct napi_struct) per
> +interface. There is no strong requirement on how the instances are
> +mapped to queues and interrupts. NAPI is primarily a polling/processing
> +abstraction without many user-facing semantics. That said, most networking
> +devices end up using NAPI in fairly similar ways.
> +
> +NAPI instances most often correspond 1:1:1 to interrupts and queue pairs
> +(a queue pair is a set of a single Rx and a single Tx queue).
> +
> +In less common cases a NAPI instance may be used for multiple queues
> +or Rx and Tx queues can be serviced by separate NAPI instances on a single
> +core. Regardless of the queue assignment, however, there is usually still
> +a 1:1 mapping between NAPI instances and interrupts.
> +
> +It's worth noting that the ethtool API uses a "channel" terminology where
> +each channel can be either ``rx``, ``tx`` or ``combined``. It's not clear
> +what constitutes a channel; the recommended interpretation is to understand
> +a channel as an IRQ/NAPI which services queues of a given type. For example,
> +a configuration of 1 ``rx``, 1 ``tx`` and 1 ``combined`` channel is expected
> +to utilize 3 interrupts, 2 Rx and 2 Tx queues.
> +
> +User API
> +========
> +
> +User interactions with NAPI depend on NAPI instance IDs. The instance IDs
> +are only visible to the user through the ``SO_INCOMING_NAPI_ID`` socket option.
> +It's not currently possible to query IDs used by a given device.
> +
> +Software IRQ coalescing
> +-----------------------
> +
> +NAPI does not perform any explicit event coalescing by default.
> +In most scenarios batching happens due to IRQ coalescing which is done
> +by the device. There are cases where software coalescing is helpful.
> +
> +NAPI can be configured to arm a repoll timer instead of unmasking
> +the hardware interrupts as soon as all packets are processed.
> +The ``gro_flush_timeout`` sysfs configuration of the netdevice
> +is reused to control the delay of the timer, while
> +``napi_defer_hard_irqs`` controls the number of consecutive empty polls
> +before NAPI gives up and goes back to using hardware IRQs.
> +
> +Busy polling
> +------------
> +
> +Busy polling allows a user process to check for incoming packets before
> +the device interrupt fires. As is the case with any busy polling it trades
> +off CPU cycles for lower latency (in fact production uses of NAPI busy
> +polling are not well known).
> +
> +Users can enable busy polling by either setting ``SO_BUSY_POLL`` on
> +selected sockets or using the global ``net.core.busy_poll`` and
> +``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
> +also exists.
> +
> +IRQ mitigation
> +--------------
> +
> +While busy polling is supposed to be used by low latency applications,
> +a similar mechanism can be used for IRQ mitigation.
> +
> +Very high request-per-second applications (especially routing/forwarding
> +applications and especially applications using AF_XDP sockets) may not
> +want to be interrupted until they finish processing a request or a batch
> +of packets.
> +
> +Such applications can pledge to the kernel that they will perform a busy
> +polling operation periodically, and the driver should keep the device IRQs
> +permanently masked. This mode is enabled by using the ``SO_PREFER_BUSY_POLL``
> +socket option. To avoid system misbehavior the pledge is revoked
> +if ``gro_flush_timeout`` passes without any busy poll call.
> +
> +The NAPI budget for busy polling is lower than the default (which makes
> +sense given the low latency intention of normal busy polling). This is
> +not the case with IRQ mitigation, however, so the budget can be adjusted
> +with the ``SO_BUSY_POLL_BUDGET`` socket option.
> +
> +Threaded NAPI
> +-------------
> +
> +Threaded NAPI is an operating mode that uses dedicated kernel threads
> +rather than the software IRQ context for NAPI processing.
> +The configuration is per netdevice and will affect all
> +NAPI instances of that device. Each NAPI instance will spawn a separate
> +thread (called ``napi/${ifc-name}-${napi-id}``).
> +
> +It is recommended to pin each kernel thread to a single CPU, the same
> +CPU that services the interrupt. Note that the mapping between IRQs and
> +NAPI instances may not be trivial (and is driver dependent).
> +The NAPI instance IDs will be assigned in the opposite order
> +from the process IDs of the kernel threads.
> +
> +Threaded NAPI is controlled by writing 0/1 to the ``threaded`` file in
> +netdev's sysfs directory.
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 470085b121d3..b439f877bc3a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -508,15 +508,18 @@ static inline bool napi_reschedule(struct napi_struct *napi)
>  	return false;
>  }
>  
> -bool napi_complete_done(struct napi_struct *n, int work_done);
>  /**
> - *	napi_complete - NAPI processing complete
> - *	@n: NAPI context
> + * napi_complete_done - NAPI processing complete
> + * @n: NAPI context
> + * @work_done: number of packets processed
>   *
> - * Mark NAPI processing as complete.
> - * Consider using napi_complete_done() instead.
> + * Mark NAPI processing as complete. Should only be called if poll budget
> + * has not been completely consumed.
> + * Prefer over napi_complete().
>   * Return false if device should avoid rearming interrupts.
>   */
> +bool napi_complete_done(struct napi_struct *n, int work_done);
> +
>  static inline bool napi_complete(struct napi_struct *n)
>  {
>  	return napi_complete_done(n, 0);

The doc LGTM, thanks!

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
                   ` (4 preceding siblings ...)
  2023-03-16  9:50 ` Bagas Sanjaya
@ 2023-03-16 10:29 ` Toke Høiland-Jørgensen
  2023-03-16 21:35   ` Jakub Kicinski
  2023-03-16 23:16 ` Florian Fainelli
  6 siblings, 1 reply; 24+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-03-16 10:29 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, Jakub Kicinski, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

Jakub Kicinski <kuba@kernel.org> writes:

> Add basic documentation about NAPI. We can stop linking to the ancient
> doc on the LF wiki.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Looks good, just one nit:

[...]

> +Threaded NAPI
> +-------------
> +
> +Use dedicated kernel threads rather than software IRQ context for NAPI
> +processing. The configuration is per netdevice and will affect all
> +NAPI instances of that device. Each NAPI instance will spawn a separate
> +thread (called ``napi/${ifc-name}-${napi-id}``).

This section starts a bit abruptly. Maybe start it with "Threaded NAPI
is an operating mode that uses dedicated..." or something along those
lines?

Other than that:

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>



* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16  2:58   ` Jakub Kicinski
@ 2023-03-16 12:03     ` Francois Romieu
  0 siblings, 0 replies; 24+ messages in thread
From: Francois Romieu @ 2023-03-16 12:03 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Stephen Hemminger, davem, netdev, edumazet, pabeni,
	jesse.brandeburg, anthony.l.nguyen, corbet, linux-doc

Jakub Kicinski <kuba@kernel.org> :
> On Wed, 15 Mar 2023 18:38:46 -0700 Stephen Hemminger wrote:
> > On Wed, 15 Mar 2023 15:30:44 -0700
> > Jakub Kicinski <kuba@kernel.org> wrote:
[...]
> > Older pre LF wiki NAPI docs still survive here
> > https://lwn.net/2002/0321/a/napi-howto.php3
> 
> Wow, it's over 20 years old and still largely relevant!
> Makes me feel that we stopped innovating :)
> 
> Why were all the docs hosted out of tree back then?

This is not completely true.

Dave Jones's full-history-linux.git.tar shows that
Documentation/networking/NAPI_HOWTO.txt with the same content was included by
davem on 2002/03/13.

It was possible to do quite some work with the then in-tree kernel doc
(napi, locking, dma).

-- 
Ueimor


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16  0:20       ` Tony Nguyen
@ 2023-03-16 21:27         ` Tony Nguyen
  0 siblings, 0 replies; 24+ messages in thread
From: Tony Nguyen @ 2023-03-16 21:27 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg, corbet,
	linux-doc

On 3/15/2023 5:20 PM, Tony Nguyen wrote:
> On 3/15/2023 4:19 PM, Jakub Kicinski wrote:
>> On Wed, 15 Mar 2023 16:17:06 -0700 Jakub Kicinski wrote:
>>> On Wed, 15 Mar 2023 16:12:42 -0700 Tony Nguyen wrote:
>>>>>    .../device_drivers/ethernet/intel/e100.rst    |   3 +-
>>>>>    .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
>>>>>    .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-
>>>>
>>>> ice has an entry as well; we recently updated the (more?) ancient link
>>>> to the ancient one :P
>>>
>>> Sweet Baby J. I'll fix that in v2, and there seems to be another
>>> link in CAN.. should have grepped harder :)
>>
>> BTW are there any ixgb parts still in use. The driver doesn't seem
>> like much of a burden, but IIRC it was a bit of an oddball design
>> so maybe we can axe it?
> 
> Let me ask around.

It seems like we're ok to remove it; I'll work up a patch for it.

Thanks,
Tony


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16 10:29 ` Toke Høiland-Jørgensen
@ 2023-03-16 21:35   ` Jakub Kicinski
  0 siblings, 0 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-16 21:35 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Thu, 16 Mar 2023 11:29:14 +0100 Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> > +Threaded NAPI
> > +-------------
> > +
> > +Use dedicated kernel threads rather than software IRQ context for NAPI
> > +processing. The configuration is per netdevice and will affect all
> > +NAPI instances of that device. Each NAPI instance will spawn a separate
> > +thread (called ``napi/${ifc-name}-${napi-id}``).  
> 
> This section starts a bit abruptly. Maybe start it with "Threaded NAPI
> is an operating mode that uses dedicated..." or something along those
> lines?

Fair point, I'll change as suggested.

> Other than that:
> 
> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>

Thanks!


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 23:11   ` Jakub Kicinski
  2023-03-16  1:36     ` Stephen Hemminger
@ 2023-03-16 22:59     ` Florian Fainelli
  2023-03-16 23:07       ` Jakub Kicinski
  1 sibling, 1 reply; 24+ messages in thread
From: Florian Fainelli @ 2023-03-16 22:59 UTC (permalink / raw)
  To: Jakub Kicinski, Stephen Hemminger
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On 3/15/23 16:11, Jakub Kicinski wrote:
> On Wed, 15 Mar 2023 15:52:02 -0700 Stephen Hemminger wrote:
>> On Wed, 15 Mar 2023 15:30:44 -0700
>> Jakub Kicinski <kuba@kernel.org> wrote:
>>
>>> Add basic documentation about NAPI. We can stop linking to the ancient
>>> doc on the LF wiki.
>>>
>>> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>>> ---
>>> CC: jesse.brandeburg@intel.com
>>> CC: anthony.l.nguyen@intel.com
>>> CC: corbet@lwn.net
>>> CC: linux-doc@vger.kernel.org
>>
>> The one thing missing, is how to handle level vs edge triggered interrupts.
>> For level triggered interrupts, the re-enable is inherently not racy.
>> I.e re-enabling interrupt when packet is present will cause an interrupt.
>> But for devices with edge triggered interrupts, it is often necessary to
>> poll and manually schedule again. Older documentation referred to this
>> as the "rotten packet" problem.
>>
>> Maybe this is no longer a problem for drivers?
>> Or maybe all new hardware uses PCI MSI and is level triggered?
> 
> It's still a problem depending on the exact design of the interrupt
> controller in the chip / tradeoffs the SW wants to make.
> I haven't actually read the LF doc, because I wasn't sure about the
> licenses (sigh). The rotten packet problem does not come up in reviews
> very often, so it wasn't front of mind. I'm not sure I'd be able to
> concisely describe it, actually :S There are many races and conditions
> which can lead to it.
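For readers who haven't met the problem: with an edge-triggered IRQ, a packet that arrives between the driver's final ring check and the IRQ unmask raises no new edge, so it "rots" in the ring until an unrelated event. A minimal illustration of the usual mitigation - re-checking the ring after unmasking - as a self-contained mock in plain C (all names hypothetical, none of this is kernel API):

```c
#include <assert.h>

static int ring_pending;     /* packets sitting in the Rx ring */
static int irq_masked = 1;
static int napi_scheduled;

static void mydrv_unmask_irq(void) { irq_masked = 0; }

/* stands in for napi_schedule() */
static void mock_napi_schedule(void) { napi_scheduled = 1; }

/* End-of-poll path: complete, unmask, then look at the ring again.
 * A packet that landed while the IRQ was still masked produced no
 * edge, so without the final check nothing would ever service it.
 */
static void mydrv_finish_poll(void)
{
	napi_scheduled = 0;          /* napi_complete_done() succeeded */
	mydrv_unmask_irq();

	if (ring_pending)            /* "rotten packet" re-check */
		mock_napi_schedule();
}
```

With a level-triggered interrupt the re-check is unnecessary, since unmasking while work is pending immediately re-fires the IRQ.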

True, though I would put a word in or two about level vs. edge triggered 
anyway, if nothing else, explain essentially what Stephen just provided 
ought to be a good starting point for driver writers to consider the 
possible issue.
-- 
Florian



* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16 22:59     ` Florian Fainelli
@ 2023-03-16 23:07       ` Jakub Kicinski
  2023-03-16 23:18         ` Florian Fainelli
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-16 23:07 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Stephen Hemminger, davem, netdev, edumazet, pabeni,
	jesse.brandeburg, anthony.l.nguyen, corbet, linux-doc

On Thu, 16 Mar 2023 15:59:49 -0700 Florian Fainelli wrote:
> True, though I would put a word in or two about level vs. edge triggered 
> anyway, if nothing else, explain essentially what Stephen just provided 
> ought to be a good starting point for driver writers to consider the 
> possible issue.

It's not a blocker for something close to the current document
going in, tho, right? More of a future extension (possibly done
by someone else...) ?


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-15 22:30 [PATCH net-next] docs: networking: document NAPI Jakub Kicinski
                   ` (5 preceding siblings ...)
  2023-03-16 10:29 ` Toke Høiland-Jørgensen
@ 2023-03-16 23:16 ` Florian Fainelli
  2023-03-21  0:02   ` Jakub Kicinski
  6 siblings, 1 reply; 24+ messages in thread
From: Florian Fainelli @ 2023-03-16 23:16 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, jesse.brandeburg, anthony.l.nguyen,
	corbet, linux-doc

On 3/15/23 15:30, Jakub Kicinski wrote:
> Add basic documentation about NAPI. We can stop linking to the ancient
> doc on the LF wiki.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: jesse.brandeburg@intel.com
> CC: anthony.l.nguyen@intel.com
> CC: corbet@lwn.net
> CC: linux-doc@vger.kernel.org
> ---
>   .../device_drivers/ethernet/intel/e100.rst    |   3 +-
>   .../device_drivers/ethernet/intel/i40e.rst    |   4 +-
>   .../device_drivers/ethernet/intel/ixgb.rst    |   4 +-
>   Documentation/networking/index.rst            |   1 +
>   Documentation/networking/napi.rst             | 231 ++++++++++++++++++
>   include/linux/netdevice.h                     |  13 +-
>   6 files changed, 244 insertions(+), 12 deletions(-)
>   create mode 100644 Documentation/networking/napi.rst
> 
> diff --git a/Documentation/networking/device_drivers/ethernet/intel/e100.rst b/Documentation/networking/device_drivers/ethernet/intel/e100.rst
> index 3d4a9ba21946..371b7e5c3293 100644
> --- a/Documentation/networking/device_drivers/ethernet/intel/e100.rst
> +++ b/Documentation/networking/device_drivers/ethernet/intel/e100.rst
> @@ -151,8 +151,7 @@ NAPI
>   
>   NAPI (Rx polling mode) is supported in the e100 driver.
>   
> -See https://wiki.linuxfoundation.org/networking/napi for more
> -information on NAPI.
> +See :ref:`Documentation/networking/napi.rst <napi>` for more information.
>   
>   Multiple Interfaces on Same Ethernet Broadcast Network
>   ------------------------------------------------------
> diff --git a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
> index ac35bd472bdc..c495c4e16b3b 100644
> --- a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
> +++ b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst
> @@ -399,8 +399,8 @@ operate only in full duplex and only at their native speed.
>   NAPI
>   ----
>   NAPI (Rx polling mode) is supported in the i40e driver.
> -For more information on NAPI, see
> -https://wiki.linuxfoundation.org/networking/napi
> +
> +See :ref:`Documentation/networking/napi.rst <napi>` for more information.
>   
>   Flow Control
>   ------------
> diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst
> index c6a233e68ad6..90ddbc912d8d 100644
> --- a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst
> +++ b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst
> @@ -367,9 +367,7 @@ NAPI
>   ----
>   NAPI (Rx polling mode) is supported in the ixgb driver.
>   
> -See https://wiki.linuxfoundation.org/networking/napi for more information on
> -NAPI.
> -
> +See :ref:`Documentation/networking/napi.rst <napi>` for more information.
>   
>   Known Issues/Troubleshooting
>   ============================
> diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
> index 4ddcae33c336..24bb256d6d53 100644
> --- a/Documentation/networking/index.rst
> +++ b/Documentation/networking/index.rst
> @@ -73,6 +73,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics.
>      mpls-sysctl
>      mptcp-sysctl
>      multiqueue
> +   napi
>      netconsole
>      netdev-features
>      netdevices
> diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
> new file mode 100644
> index 000000000000..4d87032a7e9e
> --- /dev/null
> +++ b/Documentation/networking/napi.rst
> @@ -0,0 +1,231 @@
> +.. _napi:
> +
> +====
> +NAPI
> +====
> +
> +NAPI is the event handling mechanism used by the Linux networking stack.
> +The name NAPI does not stand for anything in particular.

Did it not stand for New API?

> +
> +In basic operation device notifies the host about new events via an interrupt.
> +The host then schedules a NAPI instance to process the events.
> +Device may also be polled for events via NAPI without receiving
> +an interrupts first (busy polling).

s/an//

> +
> +NAPI processing usually happens in the software interrupt context,
> +but user may choose to use separate kernel threads for NAPI processing.

(called threaded NAPI)

> +
> +All in all NAPI abstracts away from the drivers the context and configuration
> +of event (packet Rx and Tx) processing.
> +
> +Driver API
> +==========
> +
> +The two most important elements of NAPI are the struct napi_struct
> +and the associated poll method. struct napi_struct holds the state
> +of the NAPI instance while the method is the driver-specific event
> +handler. The method will typically free Tx packets which had been
> +transmitted and process newly received packets.
> +
> +.. _drv_ctrl:
> +
> +Control API
> +-----------
> +
> +netif_napi_add() and netif_napi_del() add/remove a NAPI instance
> +from the system. The instances are attached to the netdevice passed
> +as argument (and will be deleted automatically when netdevice is
> +unregistered). Instances are added in a disabled state.
> +
> +napi_enable() and napi_disable() manage the disabled state.
> +A disabled NAPI can't be scheduled and its poll method is guaranteed
> +to not be invoked. napi_disable() waits for ownership of the NAPI
> +instance to be released.

Might add a word that calling napi_disable() twice will deadlock? This 
seems to be a frequent trap driver authors fall into.

> +
> +Datapath API
> +------------
> +
> +napi_schedule() is the basic method of scheduling a NAPI poll.
> +Drivers should call this function in their interrupt handler
> +(see :ref:`drv_sched` for more info). Successful call to napi_schedule()
> +will take ownership of the NAPI instance.
> +
> +Some time after NAPI is scheduled driver's poll method will be
> +called to process the events/packets. The method takes a ``budget``
> +argument - drivers can process completions for any number of Tx
> +packets but should only process up to ``budget`` number of
> +Rx packets. Rx processing is usually much more expensive.

In other words, it is recommended to ignore the budget argument when 
performing TX buffer reclamation to ensure that the reclamation is not 
arbitrarily bounded, however it is required to honor the budget argument 
for RX processing.
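That contract can be sketched with a self-contained mock - hypothetical names, userspace C standing in for driver code, no kernel API involved:

```c
#include <assert.h>

static int tx_pending;    /* Tx completions waiting to be reclaimed */
static int rx_pending;    /* Rx packets waiting in the ring */
static int napi_owned;    /* 1 while the driver owns the instance */
static int irq_unmasked;

/* Mocked napi_complete_done(): releases ownership, returns 1 on success */
static int mock_napi_complete_done(int work_done)
{
	(void)work_done;
	napi_owned = 0;
	return 1;
}

/* Poll method following the documented contract: reclaim all Tx
 * regardless of budget, process at most @budget Rx packets, return
 * exactly @budget while Rx work remains, and never both complete
 * and return @budget (hence the budget - 1).
 */
static int mydrv_poll(int budget)
{
	int work_done = 0;

	tx_pending = 0;			/* Tx reclaim is not bounded */

	if (budget == 0)
		return 0;		/* Tx-only poll: no napi_complete_done() */

	while (rx_pending && work_done < budget) {
		rx_pending--;		/* "process" one Rx packet */
		work_done++;
	}

	if (rx_pending)
		return budget;		/* exhausted: will be polled again */

	if (mock_napi_complete_done(work_done))
		irq_unmasked = 1;	/* mydrv_unmask_rxtx_irq() */

	return work_done < budget ? work_done : budget - 1;
}
```

The ``budget == 0`` early return and the ``budget - 1`` on the "finished exactly at budget" edge mirror the two warnings in the patch.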

> +
> +.. warning::
> +
> +   ``budget`` may be 0 if core tries to only process Tx completions
> +   and no Rx packets.
> +
> +The poll method returns amount of work performed.

returns the amount of work.

> If driver still

If the driver

> +has outstanding work to do (e.g. ``budget`` was exhausted)
> +the poll method should return exactly ``budget``. In that case
> +the NAPI instance will be serviced/polled again (without the
> +need to be scheduled).
> +
> +If event processing has been completed (all outstanding packets
> +processed) the poll method should call napi_complete_done()
> +before returning. napi_complete_done() releases the ownership
> +of the instance.
> +
> +.. warning::
> +
> +   The case of finishing all events and using exactly ``budget``
> +   must be handled carefully. There is no way to report this
> +   (rare) condition to the stack, so the driver must either
> +   not call napi_complete_done() and wait to be called again,
> +   or return ``budget - 1``.
> +
> +   If ``budget`` is 0 napi_complete_done() should never be called.

Can we detail when budget may be 0?

> +
> +Call sequence
> +-------------
> +
> +Drivers should not make assumptions about the exact sequencing
> +of calls. The poll method may be called without driver scheduling
> +the instance (unless the instance is disabled). Similarly if
> +it's not guaranteed that the poll method will be called, even
> +if napi_schedule() succeeded (e.g. if the instance gets disabled).

You lost me there, it seems to me that what you mean to say is that:

- drivers should ensure that past the point where they call 
netif_napi_add(), any software context referenced by the NAPI poll 
function should be fully set-up

- it is not guaranteed that the NAPI poll function will not be called 
once netif_napi_disable() returns

> +
> +As mentioned in the :ref:`drv_ctrl` section - napi_disable() and subsequent
> +calls to the poll method only wait for the ownership of the instance
> +to be released, not for the poll method to exit. This means that
> +drivers should avoid accessing any data structures after calling
> +napi_complete_done().
> +
> +.. _drv_sched:
> +
> +Scheduling and IRQ masking
> +--------------------------
> +
> +Drivers should keep the interrupts masked after scheduling
> +the NAPI instance - until NAPI polling finishes any further
> +interrupts are unnecessary.
> +
> +Drivers which have to mask the interrupts explicitly (as opposed
> +to IRQ being auto-masked by the device) should use the napi_schedule_prep()
> +and __napi_schedule() calls:
> +
> +.. code-block:: c
> +
> +  if (napi_schedule_prep(&v->napi)) {
> +      mydrv_mask_rxtx_irq(v->idx);
> +      /* schedule after masking to avoid races */
> +      __napi_schedule(&v->napi);
> +  }
> +
> +IRQ should only be unmasked after successful call to napi_complete_done():
> +
> +.. code-block:: c
> +
> +  if (budget && napi_complete_done(&v->napi, work_done)) {
> +    mydrv_unmask_rxtx_irq(v->idx);
> +    return min(work_done, budget - 1);
> +  }
> +
> +napi_schedule_irqoff() is a variant of napi_schedule() which takes advantage
> +of guarantees given by being invoked in IRQ context (no need to
> +mask interrupts). Note that PREEMPT_RT forces all interrupts
> +to be threaded so the interrupt may need to be marked ``IRQF_NO_THREAD``
> +to avoid issues on real-time kernel configurations.
> +
> +Instance to queue mapping
> +-------------------------
> +
> +Modern devices have multiple NAPI instances (struct napi_struct) per
> +interface. There is no strong requirement on how the instances are
> +mapped to queues and interrupts. NAPI is primarily a polling/processing
> +abstraction without many user-facing semantics. That said, most networking
> +devices end up using NAPI is fairly similar ways.

s/is/in/

> +
> +NAPI instances most often correspond 1:1:1 to interrupts and queue pairs
> +(queue pair is a set of a single Rx and single Tx queue).

correspond to.

> +
> +In less common cases a NAPI instance may be used for multiple queues
> +or Rx and Tx queues can be serviced by separate NAPI instances on a single
> +core. Regardless of the queue assignment, however, there is usually still
> +a 1:1 mapping between NAPI instances and interrupts.
> +
> +It's worth noting that the ethtool API uses a "channel" terminology where
> +each channel can be either ``rx``, ``tx`` or ``combined``. It's not clear
> +what constitutes a channel, the recommended interpretation is to understand
> +a channel as an IRQ/NAPI which services queues of a given type. For example
> +a configuration of 1 ``rx``, 1 ``tx`` and 1 ``combined`` channel is expected
> +to utilize 3 interrupts, 2 Rx and 2 Tx queues.
> +
> +User API
> +========
> +
> +User interactions with NAPI depend on NAPI instance ID. The instance IDs
> +are only visible to the user thru the ``SO_INCOMING_NAPI_ID`` socket option.
> +It's not currently possible to query IDs used by a given device.
> +
> +Software IRQ coalescing
> +-----------------------
> +
> +NAPI does not perform any explicit event coalescing by default.
> +In most scenarios batching happens due to IRQ coalescing which is done
> +by the device. There are cases where software coalescing is helpful.
> +
> +NAPI can be configured to arm a repoll timer instead of unmasking
> +the hardware interrupts as soon as all packets are processed.
> +The ``gro_flush_timeout`` sysfs configuration of the netdevice
> +is reused to control the delay of the timer, while
> +``napi_defer_hard_irqs`` controls the number of consecutive empty polls
> +before NAPI gives up and goes back to using hardware IRQs.
> +
> +Busy polling
> +------------
> +
> +Busy polling allows user process to check for incoming packets before
> +device interrupt fires.

the device

> As is the case with any busy polling it trades
> +off CPU cycles for lower latency (in fact production uses of NAPI busy
> +polling are not well known).

Did not this originate via Intel at the request of financial companies 
doing high speed trading? Have they moved entirely away from busy 
polling nowadays?

> +
> +User can enable busy polling by either setting ``SO_BUSY_POLL`` on
> +selected sockets or using the global ``net.core.busy_poll`` and
> +``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
> +also exists.
> +
> +IRQ mitigation
> +---------------
> +
> +While busy polling is supposed to be used by low latency applications,
> +a similar mechanism can be used for IRQ mitigation.
> +
> +Very high request-per-second applications (especially routing/forwarding
> +applications and especially applications using AF_XDP sockets) may not
> +want to be interrupted until they finish processing a request or a batch
> +of packets.
> +
> +Such applications can pledge to the kernel that they will perform a busy
> +polling operation periodically, and the driver should keep the device IRQs
> +permanently masked. This mode is enabled by using the ``SO_PREFER_BUSY_POLL``
> +socket option. To avoid the system misbehavior the pledge is revoked
> +if ``gro_flush_timeout`` passes without any busy poll call.
> +
> +The NAPI budget for busy polling is lower than the default (which makes
> +sense given the low latency intention of normal busy polling). This is
> +not the case with IRQ mitigation, however, so the budget can be adjusted
> +with the ``SO_BUSY_POLL_BUDGET`` socket option.
> +
> +Threaded NAPI
> +-------------
> +
> +Use dedicated kernel threads rather than software IRQ context for NAPI
> +processing. 

Uses

> The configuration is per netdevice and will affect all
> +NAPI instances of that device. Each NAPI instance will spawn a separate
> +thread (called ``napi/${ifc-name}-${napi-id}``).
> +
> +It is recommended to pin each kernel thread to a single CPU, the same
> +CPU as services the interrupt. Note that the mapping between IRQs and
> +NAPI instances may not be trivial (and is driver dependent).
> +The NAPI instance IDs will be assigned in the opposite order
> +than the process IDs of the kernel threads.

Device drivers may opt for threaded NAPI behavior by default by calling 
dev_set_threaded(.., true)
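For the userspace side of the same switch, the patch's sysfs note can be sketched as follows - assuming an interface called ``eth0``, root privileges, and the thread naming scheme the patch describes (the CPU chosen below is a placeholder; the IRQ-to-NAPI mapping is driver dependent):

```shell
# Enable threaded NAPI for all instances of the device
echo 1 > /sys/class/net/eth0/threaded

# Pin each napi/eth0-<id> kernel thread to the CPU servicing its IRQ
for pid in $(pgrep -f 'napi/eth0'); do
    taskset -pc 0 "$pid"
done
```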
-- 
Florian



* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16 23:07       ` Jakub Kicinski
@ 2023-03-16 23:18         ` Florian Fainelli
  0 siblings, 0 replies; 24+ messages in thread
From: Florian Fainelli @ 2023-03-16 23:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Stephen Hemminger, davem, netdev, edumazet, pabeni,
	jesse.brandeburg, anthony.l.nguyen, corbet, linux-doc

On 3/16/23 16:07, Jakub Kicinski wrote:
> On Thu, 16 Mar 2023 15:59:49 -0700 Florian Fainelli wrote:
>> True, though I would put a word in or two about level vs. edge triggered
>> anyway, if nothing else, explain essentially what Stephen just provided
>> ought to be a good starting point for driver writers to consider the
>> possible issue.
> 
> It's not a blocker for something close to the current document
> going in, tho, right? More of a future extension (possibly done
> by someone else...) ?

Works for me, Stephen should have plenty of time to come up with an 
addendum ;)
-- 
Florian



* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16 23:16 ` Florian Fainelli
@ 2023-03-21  0:02   ` Jakub Kicinski
  2023-03-21  0:48     ` Stephen Hemminger
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-21  0:02 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Thu, 16 Mar 2023 16:16:39 -0700 Florian Fainelli wrote:
> Did it not stand for New API?

I think it did. But we had extra 20 years of software development
experience and now agree that naming things "new" or "next" is
a bad idea? So let's pretend it stands for nothing. Or NAPI API ;)

> > +NAPI processing usually happens in the software interrupt context,
> > +but user may choose to use separate kernel threads for NAPI processing.  
> 
> (called threaded NAPI)

I added a cross link:

but user may choose to use :ref:`separate kernel threads<threaded>`

(and same for the busy poll sentence).

> > +Control API
> > +-----------
> > +
> > +netif_napi_add() and netif_napi_del() add/remove a NAPI instance
> > +from the system. The instances are attached to the netdevice passed
> > +as argument (and will be deleted automatically when netdevice is
> > +unregistered). Instances are added in a disabled state.
> > +
> > +napi_enable() and napi_disable() manage the disabled state.
> > +A disabled NAPI can't be scheduled and its poll method is guaranteed
> > +to not be invoked. napi_disable() waits for ownership of the NAPI
> > +instance to be released.  
> 
> Might add a word that calling napi_disable() twice will deadlock? This 
> seems to be a frequent trap driver authors fall into.

Good point. I'll say that the APIs are not idempotent:

The control APIs are not idempotent. Control API calls are safe against
concurrent use of datapath APIs but incorrect sequence of control API
calls may result in crashes, deadlocks, or race conditions. For example
calling napi_disable() multiple times in a row will deadlock.
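
FWIW the non-idempotence can be shown with a tiny user-space toy model
(purely illustrative, not kernel code -- the real napi_disable() spins
waiting for the SCHED bit rather than returning, and the state bit
names here only mirror the real flags):

```c
#include <stdbool.h>

/* Toy model: napi_disable() takes the SCHED (ownership) bit and keeps
 * it set until napi_enable().  A second napi_disable() therefore waits
 * for a bit that nobody will ever clear.  The model returns false
 * where the real kernel would deadlock. */
enum { NAPI_STATE_SCHED = 1 << 0, NAPI_STATE_DISABLE = 1 << 1 };

struct napi_model { unsigned int state; };

static bool napi_disable_model(struct napi_model *n)
{
	if (n->state & NAPI_STATE_SCHED)
		return false;	/* SCHED held and never released: deadlock */
	n->state |= NAPI_STATE_SCHED | NAPI_STATE_DISABLE; /* own instance */
	return true;
}

static void napi_enable_model(struct napi_model *n)
{
	/* release ownership; only now may napi_disable() run again */
	n->state &= ~(NAPI_STATE_SCHED | NAPI_STATE_DISABLE);
}
```

i.e. disable/enable must be strictly paired; a repeated disable never
gets the ownership bit back.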

> > +Datapath API
> > +------------
> > +
> > +napi_schedule() is the basic method of scheduling a NAPI poll.
> > +Drivers should call this function in their interrupt handler
> > +(see :ref:`drv_sched` for more info). Successful call to napi_schedule()
> > +will take ownership of the NAPI instance.
> > +
> > +Some time after NAPI is scheduled driver's poll method will be
> > +called to process the events/packets. The method takes a ``budget``
> > +argument - drivers can process completions for any number of Tx
> > +packets but should only process up to ``budget`` number of
> > +Rx packets. Rx processing is usually much more expensive.  
> 
> In other words, it is recommended to ignore the budget argument when 
> performing TX buffer reclamation to ensure that the reclamation is not 
> arbitrarily bounded, however it is required to honor the budget argument 
> for RX processing.

Added verbatim.

> > +.. warning::
> > +
> > +   ``budget`` may be 0 if core tries to only process Tx completions
> > +   and no Rx packets.
> > +
> > +The poll method returns amount of work performed.  
> 
> returns the amount of work.

Hm. Reads to me like we need an attributive(?) in this sentence.
"amount of work done" maybe? No?

> > +has outstanding work to do (e.g. ``budget`` was exhausted)
> > +the poll method should return exactly ``budget``. In that case
> > +the NAPI instance will be serviced/polled again (without the
> > +need to be scheduled).
> > +
> > +If event processing has been completed (all outstanding packets
> > +processed) the poll method should call napi_complete_done()
> > +before returning. napi_complete_done() releases the ownership
> > +of the instance.
> > +
> > +.. warning::
> > +
> > +   The case of finishing all events and using exactly ``budget``
> > +   must be handled carefully. There is no way to report this
> > +   (rare) condition to the stack, so the driver must either
> > +   not call napi_complete_done() and wait to be called again,
> > +   or return ``budget - 1``.
> > +
> > +   If ``budget`` is 0 napi_complete_done() should never be called.  
> 
> Can we detail when budget may be 0?

I was trying to avoid enshrining implementation details.
budget == 0 -> don't process Rx, don't ask why.
In practice AFAIK it's only done by netpoll. I don't think AF_XDP 
does it.
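
The return-value rules could be condensed into a toy user-space helper,
something like this (not kernel code -- napi_complete_done() is stubbed
out, and this picks the "wait to be polled again" option for the
exactly-budget corner case):

```c
#include <stdbool.h>

/* Toy model of the poll-method return contract.  rx_done is the number
 * of Rx packets processed this round, more_work is whether the device
 * still has pending events.  "released" records whether the model
 * called (the stub of) napi_complete_done(). */
static bool released;

static void napi_complete_done_stub(int work_done)
{
	(void)work_done;
	released = true;
}

static int poll_return(int budget, int rx_done, bool more_work)
{
	released = false;

	if (budget == 0)
		return 0;	/* Tx-only poll: napi_complete_done()
				 * must never be called */
	if (more_work || rx_done == budget)
		return budget;	/* keep ownership, get polled again; this
				 * also covers the rare "all done but used
				 * exactly budget" case (the alternative is
				 * to complete and return budget - 1) */
	napi_complete_done_stub(rx_done);
	return rx_done;		/* all done, ownership released */
}
```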

> > +Call sequence
> > +-------------
> > +
> > +Drivers should not make assumptions about the exact sequencing
> > +of calls. The poll method may be called without driver scheduling
> > +the instance (unless the instance is disabled). Similarly if

s/if//

> > +it's not guaranteed that the poll method will be called, even
> > +if napi_schedule() succeeded (e.g. if the instance gets disabled).  
> 
> You lost me there, it seems to me that what you mean to say is that:
> 
> - drivers should ensure that past the point where they call 
> netif_napi_add(), any software context referenced by the NAPI poll 
> function should be fully set-up
> 
> - it is not guaranteed that the NAPI poll function will not be called 
> once netif_napi_disable() returns

That is guaranteed. What's not guaranteed is 1:1 relationship between
napi_schedule() and napi->poll(). For busy polling we'll see
napi->poll() without there ever being an interrupt. And inverse may
also be true, where napi_schedule() is done but the polling never
happens.

I'm trying to make sure nobody tries to split the logic between the IRQ
handler and napi->poll(), expecting 1:1.

> > +NAPI instances most often correspond 1:1:1 to interrupts and queue pairs
> > +(queue pair is a set of a single Rx and single Tx queue).  
> 
> correspond to.

1:1:1 is meant as an attributive(?), describing the relationship
as 3-way 1:1.

> > As is the case with any busy polling it trades
> > +off CPU cycles for lower latency (in fact production uses of NAPI busy
> > +polling are not well known).  
> 
> Did not this originate via Intel at the request of financial companies 
> doing high speed trading? Have they moved entirely away from busy 
> polling nowadays?

No idea. If someone knows of prod use please speak up? 🤷️
I feel like the theoretical excitement about this feature does 
not match its impact :S

> > The configuration is per netdevice and will affect all
> > +NAPI instances of that device. Each NAPI instance will spawn a separate
> > +thread (called ``napi/${ifc-name}-${napi-id}``).
> > +
> > +It is recommended to pin each kernel thread to a single CPU, the same
> > +CPU as services the interrupt. Note that the mapping between IRQs and
> > +NAPI instances may not be trivial (and is driver dependent).
> > +The NAPI instance IDs will be assigned in the opposite order
> > +than the process IDs of the kernel threads.  
> 
> Device drivers may opt for threaded NAPI behavior by default by calling 
> dev_set_threaded(.., true)

Let's not advertise it too widely... We'd need to describe under what
conditions it's okay to opt-in by default.
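
For the pinning recommendation quoted above, the user-space side is just
an affinity call -- a sketch, assuming you have already found the tid of
the napi/${ifc-name}-${napi-id} thread (e.g. by its comm name in /proc);
equivalent to `taskset -pc <cpu> <tid>`:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/* Pin a task to a single CPU, the way one would pin a NAPI kernel
 * thread to the CPU servicing its interrupt.  tid 0 means "the calling
 * thread"; for a NAPI kthread pass the tid of e.g. napi/eth0-8841. */
static int pin_to_cpu(pid_t tid, int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(tid, sizeof(set), &set);
}
```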

Fixed all the points which I'm not quoting. Thanks!!


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-21  0:02   ` Jakub Kicinski
@ 2023-03-21  0:48     ` Stephen Hemminger
  2023-03-21  1:19       ` Jakub Kicinski
  0 siblings, 1 reply; 24+ messages in thread
From: Stephen Hemminger @ 2023-03-21  0:48 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Florian Fainelli, davem, netdev, edumazet, pabeni,
	jesse.brandeburg, anthony.l.nguyen, corbet, linux-doc

On Mon, 20 Mar 2023 17:02:21 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> On Thu, 16 Mar 2023 16:16:39 -0700 Florian Fainelli wrote:
> > Did it not stand for New API?  
> 
> I think it did. But we had extra 20 years of software development
> experience and now agree that naming things "new" or "next" is
> a bad idea? So let's pretend it stands for nothing. Or NAPI API ;)


Maybe just a footnote like:
  [1] Was originally referred to as New API in 2.4 Linux.


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-21  0:48     ` Stephen Hemminger
@ 2023-03-21  1:19       ` Jakub Kicinski
  0 siblings, 0 replies; 24+ messages in thread
From: Jakub Kicinski @ 2023-03-21  1:19 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Florian Fainelli, davem, netdev, edumazet, pabeni,
	jesse.brandeburg, anthony.l.nguyen, corbet, linux-doc

On Mon, 20 Mar 2023 17:48:40 -0700 Stephen Hemminger wrote:
> > On Thu, 16 Mar 2023 16:16:39 -0700 Florian Fainelli wrote:  
> > > Did it not stand for New API?    
> > 
> > I think it did. But we had extra 20 years of software development
> > experience and now agree that naming things "new" or "next" is
> > a bad idea? So let's pretend it stands for nothing. Or NAPI API ;)  
> 
> 
> Maybe just a footnote like:
>   [1] Was originally referred to as New API in 2.4 Linux.

👌️ let me find out how to make a footnote in sphinx


* Re: [PATCH net-next] docs: networking: document NAPI
  2023-03-16  1:38 ` Stephen Hemminger
  2023-03-16  2:58   ` Jakub Kicinski
@ 2023-03-23  0:44   ` Jamal Hadi Salim
  1 sibling, 0 replies; 24+ messages in thread
From: Jamal Hadi Salim @ 2023-03-23  0:44 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, jesse.brandeburg,
	anthony.l.nguyen, corbet, linux-doc

On Wed, Mar 15, 2023 at 9:38 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 15 Mar 2023 15:30:44 -0700
> Jakub Kicinski <kuba@kernel.org> wrote:
>
> > Add basic documentation about NAPI. We can stop linking to the ancient
> > doc on the LF wiki.
> >
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > ---
> > CC: jesse.brandeburg@intel.com
> > CC: anthony.l.nguyen@intel.com
> > CC: corbet@lwn.net
> > CC: linux-doc@vger.kernel.org
>
> Older pre LF wiki NAPI docs still survive here
> https://lwn.net/2002/0321/a/napi-howto.php3

Feel free to use that doc or excerpts under whatever licence you want.
Other references:
https://www.usenix.org/legacy/publications/library/proceedings/als01/full_papers/jamal/jamal.pdf
(good reference for most excellent diagrams!)
And some musings (still relevant today):
http://ftp.dei.uc.pt/pub/linux/kernel/people/hadi/docs/UKUUG2005.pdf

cheers,
jamal

