From: "Michael S. Tsirkin" <mst@redhat.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>,
Si-Wei Liu <si-wei.liu@oracle.com>,
Parav Pandit <parav@nvidia.com>,
Heng Qi <hengqi@linux.alibaba.com>,
Cornelia Huck <cohuck@redhat.com>,
"virtio-comment@lists.linux.dev" <virtio-comment@lists.linux.dev>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: Re: [PATCH v5] virtio-net: clarify coalescing parameters settings
Date: Tue, 2 Jul 2024 17:04:09 -0400 [thread overview]
Message-ID: <20240702165806-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20240702223735.4fd2847a.pasic@linux.ibm.com>
On Tue, Jul 02, 2024 at 10:37:35PM +0200, Halil Pasic wrote:
> On Fri, 28 Jun 2024 16:23:17 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > > Hmmm, this next sentence you reference above was indeed for the
> > > interaction between coalescing and used buffer suppression. Then what's
> > > the best-effort part was about, really? Round-up or round down the set
> > > value to the power of 2 to save space? How is it relevant to our
> > > discussion? I think even with rounding it shouldn't be too off? (as
> > > said, by best-effort v.s. give up)
> >
> > Would it help if we add something like below in the guidance?
> >
>
> It would IMHO greatly benefit clarity, provided we do get it right.
>
> > When the device wants to send a notification it needs:
> >
> > 1) check notification suppression, don't send interrupt if it is
> > suppressed otherwise goto 2)
> > 2) check event index, don't send interrupt if event index is not
> > crossed otherwise goto 3)
> > 3) check coalescing, don't send interrupt if the coalescing limit is
> > not exceed, otherwise send the interrupt
> >
> > (Just to demonstrate the idea, needs tweaking for sure)
>
> Yes I agree this indeed needs tweaking. Let me make some points.
Very good points below.
> 1) for virtio-net we have 3 distinct mechanisms that police used
> buffer notification avoidance, but 2 of the three are actually
> mutually exclusive if I understand it correctly. Namely we have
> coalescing guarded by VIRTIO_NET_F_NOTF_COAL and then depending on
> whether _F_EVENT_IDX was negotiated or not, we have "event_idx based" or
> "crude flag based" notification suppression.
>
> 2) Let me just quote the entire section on the good old used buffer
> notification suppression
> """
> 2.7.7.2 Device Requirements: Used Buffer Notification Suppression
>
> If the VIRTIO_F_EVENT_IDX feature bit is not negotiated:
>
> * The device MUST ignore the used_event value.
> * After the device writes a descriptor index into the used ring:
> - If flags is 1, the device SHOULD NOT send a notification.
> - If flags is 0, the device MUST send a notification.
>
> Otherwise, if the VIRTIO_F_EVENT_IDX feature bit is negotiated:
>
> * The device MUST ignore the lower bit of flags.
> * After the device writes a descriptor index into the used ring:
> - If the idx field in the used ring (which determined where that
> descriptor index was placed) was equal to used_event, the device
> MUST send a notification.
> - Otherwise the device SHOULD NOT send a notification.
>
> Note: For example, if used_event is 0, then a device using
>
> VIRTIO_F_EVENT_IDX would send a used buffer notification to the driver
> after the first buffer is used (and again after the 65536th buffer,
> etc).
> """
>
> Frankly, a MUST is a must, and I don't see very little leeway
> for the coalescing to avoid any notifications. But on the other hand,
> that is the very purpose of the coalescing so IMHO we have a problem
> here.
>
> The only loophole here is "after", which could be stretched to
> "eventually, before the end of the world". Which would basically reduce
> the MUST ad-absurdum, and I would not like that.
I agree as such but I also think we can just go ahead and make this
explicit. That is s/after/at some point in time after/.
And then let's add text that actually explains how
- devices can defer notifications for performance
- some devices can have device specific mechanisms
for driver to control how much notifications are deferred.
WDYT?
> 3) AFAIU if multiple suppression mechanisms are active
> concurrently we are aiming for some sort of an logical AND semantic i.e.
> a further reduction over just a single one being employed (and not OR).
>
> But IMHO the both the wording for the event_idx based and curde
> suppression, as well as for coalescing.
>
> Let me quote the full description for the coalescing:
>
> """
> 5.1.6.5.9.1 Operation
> The device sends a used buffer notification once the notification conditions are met and if the notifications are not suppressed as explained in 2.7.7.
>
> When the device has non-zero max_usecs and non-zero max_packets, it starts counting microseconds and packets upon receiving/sending a packet. The device counts packets and microseconds for each receive virtqueue and transmit virtqueue separately. In this case, the notification conditions are met when max_usecs microseconds elapse, or upon sending/receiving max_packets packets, whichever happens first. Afterwards, the device waits for the next packet and starts counting packets and microseconds again.
>
> When the device has max_usecs = 0 or max_packets = 0, the notification conditions are met after every packet received/sent.
> """
>
> Here "Afterwards" is ambiguous. What we want here is "afterwards" ==
> "after the notification has been sent", but one can also read it as
> "after the condition has been met".
>
> 4) I see two ways to make sense out of it.
>
> My preferred way would be to replace "send a notification" with
> "generate a notification" in the main text including "2.7.7.2 Device
> Requirements: Used Buffer Notification Suppression", along with adding an
> explanation which states that generated notifications are sent
> immediately unless a device has an active facility that under a certain
> set of conditions retains and possibly coalesces notifications.
>
> Please notice that here coalescence is used with its dictionary
> meaning (see https://dictionary.cambridge.org/dictionary/english/coalesce),
> and that the current description of NET_F_NOTIF_COAL is more akin to a
> suppression mechanism where notifications are suppressed or discarded and
> not coalesced.
>
> The notification coalescing facility would then retain the first
> used buffer notification request for each queue (those are still
> generated by the same rules), until either the packet count or the
> timeout type max frequency (or delay we need to clarify that) condition
> is met, and coalesce any further notification requests with the retained
> one until it is sent.
>
> The other way, which I'm not sure is actually viable, is basically to
> try to clarify the hell out of the current approach as taken by
> "5.1.6.5.9.1 Operation" by making it crystal clear when exactly are the
> "notification conditions met" and what is the complement state of that,
> and add a sentence to 2.7.7.2 that references device specific mechanisms
> to do more suppression. And of course we would have to find a nice
> definition for "if the notifications are not suppressed as explained in
> 2.7.7." such that loss of initiative is not possible.
>
> IMHO any solution where things can stall because we would have sent
> an used buffer notification without "coalescing" but we did not because
> of coalescing is not acceptable. This could happen if we cross
> event_idx when the coalescing notification conditions are not met and
> thus do not notify, and then when the coalescing notification condition
> are finally met (e.g. via timeout) the event_id type suppression
> is active again because we crossed event_idx when coalescing
> notification condition was not met, and we end up not notifying again.
>
> Sorry for the wall of text.
>
> Regards,
> Halil
>
>
>
next prev parent reply other threads:[~2024-07-02 21:04 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-28 4:47 [PATCH v5] virtio-net: clarify coalescing parameters settings Heng Qi
2024-05-28 4:50 ` Heng Qi
2024-05-31 6:36 ` Heng Qi
2024-05-31 9:39 ` Cornelia Huck
2024-06-07 20:02 ` Halil Pasic
2024-06-08 2:34 ` Heng Qi
2024-06-10 12:46 ` Halil Pasic
2024-06-10 13:35 ` Heng Qi
2024-06-10 14:50 ` Michael S. Tsirkin
2024-06-10 15:12 ` Parav Pandit
2024-06-11 14:04 ` Cornelia Huck
2024-06-10 20:19 ` Halil Pasic
2024-06-11 10:40 ` Heng Qi
2024-06-11 16:29 ` Michael S. Tsirkin
2024-06-11 17:43 ` Parav Pandit
2024-06-13 6:13 ` Michael S. Tsirkin
2024-06-17 2:27 ` Heng Qi
2024-06-17 23:31 ` Si-Wei Liu
2024-06-20 7:40 ` Heng Qi
2024-06-21 1:21 ` Si-Wei Liu
2024-06-21 3:24 ` Heng Qi
2024-06-21 23:46 ` Si-Wei Liu
2024-06-22 1:34 ` Heng Qi
2024-06-25 4:51 ` Si-Wei Liu
2024-06-25 5:56 ` Parav Pandit
2024-06-26 1:14 ` Si-Wei Liu
2024-06-27 10:37 ` Halil Pasic
2024-06-27 11:27 ` Parav Pandit
2024-06-27 12:35 ` Michael S. Tsirkin
2024-06-27 12:45 ` Parav Pandit
2024-06-27 12:52 ` Michael S. Tsirkin
2024-06-27 13:03 ` Parav Pandit
2024-06-27 14:59 ` Michael S. Tsirkin
2024-06-27 17:27 ` Si-Wei Liu
2024-06-27 17:14 ` Si-Wei Liu
2024-06-27 22:18 ` Michael S. Tsirkin
2024-06-28 6:56 ` Si-Wei Liu
2024-06-28 8:23 ` Jason Wang
2024-06-28 19:31 ` Si-Wei Liu
2024-06-30 17:04 ` Michael S. Tsirkin
2024-07-03 6:09 ` Jason Wang
2024-07-02 20:37 ` Halil Pasic
2024-07-02 21:04 ` Michael S. Tsirkin [this message]
2024-07-03 5:01 ` Jason Wang
2024-06-29 6:47 ` Halil Pasic
2024-06-30 16:55 ` Michael S. Tsirkin
2024-07-02 21:43 ` Halil Pasic
2024-06-27 12:13 ` Parav Pandit
2024-06-27 12:42 ` Michael S. Tsirkin
2024-06-25 7:53 ` Jason Wang
2024-06-25 8:06 ` Michael S. Tsirkin
2024-06-25 8:13 ` Jason Wang
2024-06-25 8:21 ` Michael S. Tsirkin
2024-06-11 23:03 ` Michael S. Tsirkin
2024-06-17 2:35 ` Heng Qi
2024-06-25 7:26 ` Michael S. Tsirkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240702165806-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=cohuck@redhat.com \
--cc=hengqi@linux.alibaba.com \
--cc=jasowang@redhat.com \
--cc=parav@nvidia.com \
--cc=pasic@linux.ibm.com \
--cc=si-wei.liu@oracle.com \
--cc=virtio-comment@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox