public inbox for virtio-comment@lists.linux.dev
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Si-Wei Liu <si-wei.liu@oracle.com>,
	Parav Pandit <parav@nvidia.com>,
	Heng Qi <hengqi@linux.alibaba.com>,
	Cornelia Huck <cohuck@redhat.com>,
	"virtio-comment@lists.linux.dev" <virtio-comment@lists.linux.dev>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: Re: [PATCH v5] virtio-net: clarify coalescing parameters settings
Date: Tue, 2 Jul 2024 17:04:09 -0400	[thread overview]
Message-ID: <20240702165806-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20240702223735.4fd2847a.pasic@linux.ibm.com>

On Tue, Jul 02, 2024 at 10:37:35PM +0200, Halil Pasic wrote:
> On Fri, 28 Jun 2024 16:23:17 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > > Hmmm, this next sentence you reference above was indeed for the
> > > interaction between coalescing and used buffer suppression. Then what's
> > > the best-effort part was about, really? Round-up or round down the set
> > > value to the power of 2 to save space? How is it relevant to our
> > > discussion? I think even with rounding it shouldn't be too off? (as
> > > said, by best-effort v.s. give up)  
> > 
> > Would it help if we add something like below in the guidance?
> >
> 
> It would IMHO greatly benefit clarity, provided we do get it right.
>  
> > When the device wants to send a notification it needs:
> > 
> > 1) check notification suppression, don't send interrupt if it is
> > suppressed otherwise goto 2)
> > 2) check event index, don't send interrupt if event index is not
> > crossed otherwise goto 3)
> > 3) check coalescing, don't send interrupt if the coalescing limit is
> > not exceed, otherwise send the interrupt
> > 
> > (Just to demonstrate the idea, needs tweaking for sure)
> 
> Yes I agree this indeed needs tweaking. Let me make some points.

Very good points below.


> 1) for virtio-net we have 3 distinct mechanisms that police used
> buffer notification avoidance, but 2 of the three are actually
> mutually exclusive if I understand it correctly. Namely we have
> coalescing guarded by VIRTIO_NET_F_NOTF_COAL and then depending on
> whether _F_EVENT_IDX was negotiated or not, we have "event_idx based" or
> "crude flag based" notification suppression.
> 
> 2) Let me just quote the entire section on the good old used buffer
> notification suppression
> """
> 2.7.7.2 Device Requirements: Used Buffer Notification Suppression
> 
> If the VIRTIO_F_EVENT_IDX feature bit is not negotiated:
> 
>     * The device MUST ignore the used_event value.
>     * After the device writes a descriptor index into the used ring:
>        -  If flags is 1, the device SHOULD NOT send a notification.
>        -  If flags is 0, the device MUST send a notification.
> 
> Otherwise, if the VIRTIO_F_EVENT_IDX feature bit is negotiated:
> 
>     * The device MUST ignore the lower bit of flags.
>     * After the device writes a descriptor index into the used ring:
>        - If the idx field in the used ring (which determined where that
>       descriptor index was placed) was equal to used_event, the device
>       MUST send a notification.
>        - Otherwise the device SHOULD NOT send a notification.
> 
> Note: For example, if used_event is 0, then a device using
> 
> VIRTIO_F_EVENT_IDX would send a used buffer notification to the driver
> after the first buffer is used (and again after the 65536th buffer,
> etc).
> """ 
> 
> Frankly, a MUST is a must, and I don't see very little leeway
> for the coalescing to avoid any notifications. But on the other hand,
> that is the very purpose of the coalescing so IMHO we have a problem
> here.
> 
> The only loophole here is "after", which could be stretched to
> "eventually, before the end of the world". Which would basically reduce
> the MUST ad-absurdum, and I would not like that.

I agree as such but I also think we can just go ahead and make this
explicit. That is s/after/at some point in time after/.

And then let's add text that actually explains how
- devices can defer notifications for performance
- some devices can have device specific mechanisms
  for driver to control how much notifications are deferred.


WDYT?



> 3) AFAIU if multiple suppression mechanisms are active
> concurrently we are aiming for some sort of an logical AND semantic i.e.
> a further reduction over just a single one being employed (and not OR).
> 
> But IMHO the both the wording for the event_idx based and curde
> suppression, as well as for coalescing.
> 
> Let me quote the full description for the coalescing:
> 
> """
> 5.1.6.5.9.1 Operation
> The device sends a used buffer notification once the notification conditions are met and if the notifications are not suppressed as explained in 2.7.7.
> 
> When the device has non-zero max_usecs and non-zero max_packets, it starts counting microseconds and packets upon receiving/sending a packet. The device counts packets and microseconds for each receive virtqueue and transmit virtqueue separately. In this case, the notification conditions are met when max_usecs microseconds elapse, or upon sending/receiving max_packets packets, whichever happens first. Afterwards, the device waits for the next packet and starts counting packets and microseconds again.
> 
> When the device has max_usecs = 0 or max_packets = 0, the notification conditions are met after every packet received/sent. 
> """
> 
> Here "Afterwards" is ambiguous. What we want here is "afterwards" ==
> "after the notification has been sent", but one can also read it as
> "after the condition has been met".
> 
> 4) I see two ways to make sense out of it.
> 
> My preferred way would be to replace "send a notification" with
> "generate a notification" in the main text including "2.7.7.2 Device
> Requirements: Used Buffer Notification Suppression", along with adding an
> explanation which states that generated notifications are sent
> immediately unless a device has an active facility that under a certain
> set of conditions  retains and possibly coalesces notifications. 
> 
> Please notice that here coalescence is used with its dictionary
> meaning (see https://dictionary.cambridge.org/dictionary/english/coalesce), 
> and that the current description of NET_F_NOTIF_COAL is more akin to a
> suppression mechanism where notifications are suppressed or discarded and
> not coalesced.
> 
> The notification coalescing facility would then retain the first
> used buffer notification request for each queue (those are still
> generated by the same rules), until either the packet count or the
> timeout type max frequency (or delay we need to clarify that) condition
> is met, and coalesce any further notification requests with the retained
> one until it is sent.
> 
> The other way, which I'm not sure is actually viable, is basically to
> try to clarify the hell out of the current approach as taken by
> "5.1.6.5.9.1 Operation" by making it crystal clear when exactly are the
> "notification conditions met" and what is the complement state of that,
> and add a sentence to 2.7.7.2 that references device specific mechanisms
> to do more suppression.  And  of course we would have to find a nice
> definition for "if the notifications are not suppressed as explained in
> 2.7.7." such that loss of initiative is not possible.
> 
> IMHO any solution where things can stall because we would have sent
> an used buffer notification without "coalescing" but we did not because
> of coalescing  is not acceptable. This could happen if we cross
> event_idx when the coalescing notification conditions are not met and
> thus do not notify, and then when the coalescing notification condition
> are finally met (e.g. via timeout) the event_id type suppression
> is active again because we crossed event_idx when coalescing
> notification condition was not met, and we end up not notifying again.
> 
> Sorry for the wall of text.
> 
> Regards,
> Halil
> 
> 
> 


  reply	other threads:[~2024-07-02 21:04 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-28  4:47 [PATCH v5] virtio-net: clarify coalescing parameters settings Heng Qi
2024-05-28  4:50 ` Heng Qi
2024-05-31  6:36   ` Heng Qi
2024-05-31  9:39     ` Cornelia Huck
2024-06-07 20:02 ` Halil Pasic
2024-06-08  2:34   ` Heng Qi
2024-06-10 12:46     ` Halil Pasic
2024-06-10 13:35       ` Heng Qi
2024-06-10 14:50         ` Michael S. Tsirkin
2024-06-10 15:12           ` Parav Pandit
2024-06-11 14:04           ` Cornelia Huck
2024-06-10 20:19         ` Halil Pasic
2024-06-11 10:40           ` Heng Qi
2024-06-11 16:29             ` Michael S. Tsirkin
2024-06-11 17:43               ` Parav Pandit
2024-06-13  6:13                 ` Michael S. Tsirkin
2024-06-17  2:27                   ` Heng Qi
2024-06-17 23:31                     ` Si-Wei Liu
2024-06-20  7:40                       ` Heng Qi
2024-06-21  1:21                         ` Si-Wei Liu
2024-06-21  3:24                           ` Heng Qi
2024-06-21 23:46                             ` Si-Wei Liu
2024-06-22  1:34                               ` Heng Qi
2024-06-25  4:51                                 ` Si-Wei Liu
2024-06-25  5:56                                   ` Parav Pandit
2024-06-26  1:14                                     ` Si-Wei Liu
2024-06-27 10:37                                       ` Halil Pasic
2024-06-27 11:27                                         ` Parav Pandit
2024-06-27 12:35                                         ` Michael S. Tsirkin
2024-06-27 12:45                                           ` Parav Pandit
2024-06-27 12:52                                             ` Michael S. Tsirkin
2024-06-27 13:03                                               ` Parav Pandit
2024-06-27 14:59                                                 ` Michael S. Tsirkin
2024-06-27 17:27                                               ` Si-Wei Liu
2024-06-27 17:14                                           ` Si-Wei Liu
2024-06-27 22:18                                             ` Michael S. Tsirkin
2024-06-28  6:56                                               ` Si-Wei Liu
2024-06-28  8:23                                                 ` Jason Wang
2024-06-28 19:31                                                   ` Si-Wei Liu
2024-06-30 17:04                                                     ` Michael S. Tsirkin
2024-07-03  6:09                                                     ` Jason Wang
2024-07-02 20:37                                                   ` Halil Pasic
2024-07-02 21:04                                                     ` Michael S. Tsirkin [this message]
2024-07-03  5:01                                                     ` Jason Wang
2024-06-29  6:47                                           ` Halil Pasic
2024-06-30 16:55                                             ` Michael S. Tsirkin
2024-07-02 21:43                                               ` Halil Pasic
2024-06-27 12:13                                       ` Parav Pandit
2024-06-27 12:42                                         ` Michael S. Tsirkin
2024-06-25  7:53                               ` Jason Wang
2024-06-25  8:06                                 ` Michael S. Tsirkin
2024-06-25  8:13                                   ` Jason Wang
2024-06-25  8:21                                     ` Michael S. Tsirkin
2024-06-11 23:03 ` Michael S. Tsirkin
2024-06-17  2:35   ` Heng Qi
2024-06-25  7:26     ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240702165806-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=hengqi@linux.alibaba.com \
    --cc=jasowang@redhat.com \
    --cc=parav@nvidia.com \
    --cc=pasic@linux.ibm.com \
    --cc=si-wei.liu@oracle.com \
    --cc=virtio-comment@lists.linux.dev \
    --cc=xuanzhuo@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox