All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Si-Wei Liu <si-wei.liu@oracle.com>,
	Parav Pandit <parav@nvidia.com>,
	Heng Qi <hengqi@linux.alibaba.com>,
	Cornelia Huck <cohuck@redhat.com>,
	"virtio-comment@lists.linux.dev" <virtio-comment@lists.linux.dev>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: Re: [PATCH v5] virtio-net: clarify coalescing parameters settings
Date: Tue, 2 Jul 2024 17:04:09 -0400	[thread overview]
Message-ID: <20240702165806-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20240702223735.4fd2847a.pasic@linux.ibm.com>

On Tue, Jul 02, 2024 at 10:37:35PM +0200, Halil Pasic wrote:
> On Fri, 28 Jun 2024 16:23:17 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > > Hmmm, this next sentence you reference above was indeed for the
> > > interaction between coalescing and used buffer suppression. Then what's
> > > the best-effort part was about, really? Round-up or round down the set
> > > value to the power of 2 to save space? How is it relevant to our
> > > discussion? I think even with rounding it shouldn't be too off? (as
> > > said, by best-effort v.s. give up)  
> > 
> > Would it help if we add something like below in the guidance?
> >
> 
> It would IMHO greatly benefit clarity, provided we do get it right.
>  
> > When the device wants to send a notification it needs:
> > 
> > 1) check notification suppression, don't send interrupt if it is
> > suppressed otherwise goto 2)
> > 2) check event index, don't send interrupt if event index is not
> > crossed otherwise goto 3)
> > 3) check coalescing, don't send interrupt if the coalescing limit is
> > not exceed, otherwise send the interrupt
> > 
> > (Just to demonstrate the idea, needs tweaking for sure)
> 
> Yes I agree this indeed needs tweaking. Let me make some points.

Very good points below.


> 1) for virtio-net we have 3 distinct mechanisms that police used
> buffer notification avoidance, but 2 of the three are actually
> mutually exclusive if I understand it correctly. Namely we have
> coalescing guarded by VIRTIO_NET_F_NOTF_COAL and then depending on
> whether _F_EVENT_IDX was negotiated or not, we have "event_idx based" or
> "crude flag based" notification suppression.
> 
> 2) Let me just quote the entire section on the good old used buffer
> notification suppression
> """
> 2.7.7.2 Device Requirements: Used Buffer Notification Suppression
> 
> If the VIRTIO_F_EVENT_IDX feature bit is not negotiated:
> 
>     * The device MUST ignore the used_event value.
>     * After the device writes a descriptor index into the used ring:
>        -  If flags is 1, the device SHOULD NOT send a notification.
>        -  If flags is 0, the device MUST send a notification.
> 
> Otherwise, if the VIRTIO_F_EVENT_IDX feature bit is negotiated:
> 
>     * The device MUST ignore the lower bit of flags.
>     * After the device writes a descriptor index into the used ring:
>        - If the idx field in the used ring (which determined where that
>       descriptor index was placed) was equal to used_event, the device
>       MUST send a notification.
>        - Otherwise the device SHOULD NOT send a notification.
> 
> Note: For example, if used_event is 0, then a device using
> 
> VIRTIO_F_EVENT_IDX would send a used buffer notification to the driver
> after the first buffer is used (and again after the 65536th buffer,
> etc).
> """ 
> 
> Frankly, a MUST is a must, and I don't see very little leeway
> for the coalescing to avoid any notifications. But on the other hand,
> that is the very purpose of the coalescing so IMHO we have a problem
> here.
> 
> The only loophole here is "after", which could be stretched to
> "eventually, before the end of the world". Which would basically reduce
> the MUST ad-absurdum, and I would not like that.

I agree as such but I also think we can just go ahead and make this
explicit. That is s/after/at some point in time after/.

And then let's add text that actually explains how
- devices can defer notifications for performance
- some devices can have device specific mechanisms
  for driver to control how much notifications are deferred.


WDYT?



> 3) AFAIU if multiple suppression mechanisms are active
> concurrently we are aiming for some sort of an logical AND semantic i.e.
> a further reduction over just a single one being employed (and not OR).
> 
> But IMHO the both the wording for the event_idx based and curde
> suppression, as well as for coalescing.
> 
> Let me quote the full description for the coalescing:
> 
> """
> 5.1.6.5.9.1 Operation
> The device sends a used buffer notification once the notification conditions are met and if the notifications are not suppressed as explained in 2.7.7.
> 
> When the device has non-zero max_usecs and non-zero max_packets, it starts counting microseconds and packets upon receiving/sending a packet. The device counts packets and microseconds for each receive virtqueue and transmit virtqueue separately. In this case, the notification conditions are met when max_usecs microseconds elapse, or upon sending/receiving max_packets packets, whichever happens first. Afterwards, the device waits for the next packet and starts counting packets and microseconds again.
> 
> When the device has max_usecs = 0 or max_packets = 0, the notification conditions are met after every packet received/sent. 
> """
> 
> Here "Afterwards" is ambiguous. What we want here is "afterwards" ==
> "after the notification has been sent", but one can also read it as
> "after the condition has been met".
> 
> 4) I see two ways to make sense out of it.
> 
> My preferred way would be to replace "send a notification" with
> "generate a notification" in the main text including "2.7.7.2 Device
> Requirements: Used Buffer Notification Suppression", along with adding an
> explanation which states that generated notifications are sent
> immediately unless a device has an active facility that under a certain
> set of conditions  retains and possibly coalesces notifications. 
> 
> Please notice that here coalescence is used with its dictionary
> meaning (see https://dictionary.cambridge.org/dictionary/english/coalesce), 
> and that the current description of NET_F_NOTIF_COAL is more akin to a
> suppression mechanism where notifications are suppressed or discarded and
> not coalesced.
> 
> The notification coalescing facility would then retain the first
> used buffer notification request for each queue (those are still
> generated by the same rules), until either the packet count or the
> timeout type max frequency (or delay we need to clarify that) condition
> is met, and coalesce any further notification requests with the retained
> one until it is sent.
> 
> The other way, which I'm not sure is actually viable, is basically to
> try to clarify the hell out of the current approach as taken by
> "5.1.6.5.9.1 Operation" by making it crystal clear when exactly are the
> "notification conditions met" and what is the complement state of that,
> and add a sentence to 2.7.7.2 that references device specific mechanisms
> to do more suppression.  And  of course we would have to find a nice
> definition for "if the notifications are not suppressed as explained in
> 2.7.7." such that loss of initiative is not possible.
> 
> IMHO any solution where things can stall because we would have sent
> an used buffer notification without "coalescing" but we did not because
> of coalescing  is not acceptable. This could happen if we cross
> event_idx when the coalescing notification conditions are not met and
> thus do not notify, and then when the coalescing notification condition
> are finally met (e.g. via timeout) the event_id type suppression
> is active again because we crossed event_idx when coalescing
> notification condition was not met, and we end up not notifying again.
> 
> Sorry for the wall of text.
> 
> Regards,
> Halil
> 
> 
> 


  reply	other threads:[~2024-07-02 21:04 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-28  4:47 [PATCH v5] virtio-net: clarify coalescing parameters settings Heng Qi
2024-05-28  4:50 ` Heng Qi
2024-05-31  6:36   ` Heng Qi
2024-05-31  9:39     ` Cornelia Huck
2024-06-07 20:02 ` Halil Pasic
2024-06-08  2:34   ` Heng Qi
2024-06-10 12:46     ` Halil Pasic
2024-06-10 13:35       ` Heng Qi
2024-06-10 14:50         ` Michael S. Tsirkin
2024-06-10 15:12           ` Parav Pandit
2024-06-11 14:04           ` Cornelia Huck
2024-06-10 20:19         ` Halil Pasic
2024-06-11 10:40           ` Heng Qi
2024-06-11 16:29             ` Michael S. Tsirkin
2024-06-11 17:43               ` Parav Pandit
2024-06-13  6:13                 ` Michael S. Tsirkin
2024-06-17  2:27                   ` Heng Qi
2024-06-17 23:31                     ` Si-Wei Liu
2024-06-20  7:40                       ` Heng Qi
2024-06-21  1:21                         ` Si-Wei Liu
2024-06-21  3:24                           ` Heng Qi
2024-06-21 23:46                             ` Si-Wei Liu
2024-06-22  1:34                               ` Heng Qi
2024-06-25  4:51                                 ` Si-Wei Liu
2024-06-25  5:56                                   ` Parav Pandit
2024-06-26  1:14                                     ` Si-Wei Liu
2024-06-27 10:37                                       ` Halil Pasic
2024-06-27 11:27                                         ` Parav Pandit
2024-06-27 12:35                                         ` Michael S. Tsirkin
2024-06-27 12:45                                           ` Parav Pandit
2024-06-27 12:52                                             ` Michael S. Tsirkin
2024-06-27 13:03                                               ` Parav Pandit
2024-06-27 14:59                                                 ` Michael S. Tsirkin
2024-06-27 17:27                                               ` Si-Wei Liu
2024-06-27 17:14                                           ` Si-Wei Liu
2024-06-27 22:18                                             ` Michael S. Tsirkin
2024-06-28  6:56                                               ` Si-Wei Liu
2024-06-28  8:23                                                 ` Jason Wang
2024-06-28 19:31                                                   ` Si-Wei Liu
2024-06-30 17:04                                                     ` Michael S. Tsirkin
2024-07-03  6:09                                                     ` Jason Wang
2024-07-02 20:37                                                   ` Halil Pasic
2024-07-02 21:04                                                     ` Michael S. Tsirkin [this message]
2024-07-03  5:01                                                     ` Jason Wang
2024-06-29  6:47                                           ` Halil Pasic
2024-06-30 16:55                                             ` Michael S. Tsirkin
2024-07-02 21:43                                               ` Halil Pasic
2024-06-27 12:13                                       ` Parav Pandit
2024-06-27 12:42                                         ` Michael S. Tsirkin
2024-06-25  7:53                               ` Jason Wang
2024-06-25  8:06                                 ` Michael S. Tsirkin
2024-06-25  8:13                                   ` Jason Wang
2024-06-25  8:21                                     ` Michael S. Tsirkin
2024-06-11 23:03 ` Michael S. Tsirkin
2024-06-17  2:35   ` Heng Qi
2024-06-25  7:26     ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240702165806-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=hengqi@linux.alibaba.com \
    --cc=jasowang@redhat.com \
    --cc=parav@nvidia.com \
    --cc=pasic@linux.ibm.com \
    --cc=si-wei.liu@oracle.com \
    --cc=virtio-comment@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.