From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jiri Pirko <jiri@resnulli.us>
Cc: Jason Wang <jasowang@redhat.com>,
Jason Xing <kerneljasonxing@gmail.com>,
Heng Qi <hengqi@linux.alibaba.com>,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, xuanzhuo@linux.alibaba.com,
virtualization@lists.linux.dev, ast@kernel.org,
daniel@iogearbox.net, hawk@kernel.org, john.fastabend@gmail.com,
netdev@vger.kernel.org
Subject: Re: [patch net-next] virtio_net: add support for Byte Queue Limits
Date: Fri, 7 Jun 2024 02:43:29 -0400 [thread overview]
Message-ID: <20240607024057-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <ZmKrGBLiNvDVKL2Z@nanopsycho.orion>
On Fri, Jun 07, 2024 at 08:39:20AM +0200, Jiri Pirko wrote:
> Fri, Jun 07, 2024 at 08:25:19AM CEST, jasowang@redhat.com wrote:
> >On Thu, Jun 6, 2024 at 9:45 PM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Thu, Jun 06, 2024 at 09:56:50AM CEST, jasowang@redhat.com wrote:
> >> >On Thu, Jun 6, 2024 at 2:05 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >>
> >> >> On Thu, Jun 06, 2024 at 12:25:15PM +0800, Jason Wang wrote:
> >> >> > > If the codes of orphan mode don't have an impact when you enable
> >> >> > > napi_tx mode, please keep it if you can.
> >> >> >
> >> >> > For example, it complicates BQL implementation.
> >> >> >
> >> >> > Thanks
> >> >>
> >> >> I very much doubt sending interrupts to a VM can
> >> >> *on all benchmarks* compete with not sending interrupts.
> >> >
> >> >It should not differ too much from the physical NIC. We can have one
> >> >more round of benchmarks to see the difference.
> >> >
> >> >But if NAPI mode needs to win all of the benchmarks in order to get
> >> >rid of orphan, that would be very difficult. Considering various bugs
> >> >will be fixed by dropping skb_orphan(), it would be sufficient if most
> >> >of the benchmark doesn't show obvious differences.
> >> >
> >> >Looking at git history, there're commits that removes skb_orphan(), for example:
> >> >
> >> >commit 8112ec3b8722680251aecdcc23dfd81aa7af6340
> >> >Author: Eric Dumazet <edumazet@google.com>
> >> >Date: Fri Sep 28 07:53:26 2012 +0000
> >> >
> >> > mlx4: dont orphan skbs in mlx4_en_xmit()
> >> >
> >> > After commit e22979d96a55d (mlx4_en: Moving to Interrupts for TX
> >> > completions) we no longer need to orphan skbs in mlx4_en_xmit()
> >> > since skb wont stay a long time in TX ring before their release.
> >> >
> >> > Orphaning skbs in ndo_start_xmit() should be avoided as much as
> >> > possible, since it breaks TCP Small Queue or other flow control
> >> > mechanisms (per socket limits)
> >> >
> >> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> >> > Acked-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
> >> > Cc: Or Gerlitz <ogerlitz@mellanox.com>
> >> > Signed-off-by: David S. Miller <davem@davemloft.net>
> >> >
> >> >>
> >> >> So yea, it's great if napi and hardware are advanced enough
> >> >> that the default can be changed, since this way virtio
> >> >> is closer to a regular nic and more or standard
> >> >> infrastructure can be used.
> >> >>
> >> >> But dropping it will go against *no breaking userspace* rule.
> >> >> Complicated? Tough.
> >> >
> >> >I don't know what kind of userspace is broken by this. Or why it is
> >> >not broken since the day we enable NAPI mode by default.
> >>
> >> There is a module option that explicitly allows user to set
> >> napi_tx=false
> >> or
> >> napi_weight=0
> >>
> >> So if you remove this option or ignore it, both breaks the user
> >> expectation.
> >
> >We can keep them, but I wonder what's the expectation of the user
> >here? The only thing so far I can imagine is the performance
> >difference.
>
> True.
>
> >
> >> I personally would vote for this breakage. To carry ancient
> >> things like this one forever does not make sense to me.
> >
> >Exactly.
> >
> >> While at it,
> >> let's remove all virtio net module params. Thoughts?
> >
> >I tend to
> >
> >1) drop the orphan mode, but we can have some benchmarks first
>
> Any idea which? That would be really tricky to find the ones where
> orphan mode makes difference I assume.
Exactly. We are kind of stuck with it I think.
I would just do this:
void orphan_destructor(struct sk_buff *skb)
{
}
skb_orphan(skb);
skb->destructor = orphan_destructor;
/* skip BQL */
return;
and then later
/* skip BQL accounting if we orphaned on xmit path */
if (skb->destructor == orphan_destructor)
return;
Hmm?
>
> >2) keep the module parameters
>
> and ignore them, correct? Perhaps a warning would be good.
>
>
> >
> >Thanks
> >
> >>
> >>
> >>
> >> >
> >> >Thanks
> >> >
> >> >>
> >> >> --
> >> >> MST
> >> >>
> >> >
> >>
> >
next prev parent reply other threads:[~2024-06-07 6:43 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-09 11:46 [patch net-next] virtio_net: add support for Byte Queue Limits Jiri Pirko
2024-05-09 12:41 ` Michael S. Tsirkin
2024-05-09 13:31 ` Jiri Pirko
2024-05-09 14:28 ` Michael S. Tsirkin
2024-05-10 4:25 ` Jason Wang
2024-05-10 10:37 ` Jiri Pirko
2024-05-10 10:52 ` Michael S. Tsirkin
2024-05-10 11:11 ` Jiri Pirko
2024-05-10 11:27 ` Michael S. Tsirkin
2024-05-10 11:36 ` Jiri Pirko
2024-05-15 7:34 ` Jiri Pirko
2024-05-15 8:20 ` Michael S. Tsirkin
2024-05-15 10:12 ` Jiri Pirko
2024-05-15 12:54 ` Jiri Pirko
2024-05-16 4:48 ` Jason Wang
2024-05-16 10:54 ` Jiri Pirko
2024-05-16 12:31 ` Michael S. Tsirkin
2024-05-16 15:25 ` Jiri Pirko
2024-05-16 19:04 ` Michael S. Tsirkin
2024-05-17 7:52 ` Jiri Pirko
[not found] ` <CAA93jw6WanAQrPAFZ1hYVTXuWDwP+4J70LnmPOD2ugNwYK6HMA@mail.gmail.com>
2024-06-06 7:30 ` Jiri Pirko
2024-05-10 4:25 ` Jason Wang
2024-05-10 7:11 ` Heng Qi
2024-05-10 10:35 ` Jiri Pirko
2024-05-20 12:48 ` Jiri Pirko
2024-06-05 11:30 ` Jiri Pirko
2024-06-05 11:42 ` Heng Qi
2024-06-06 0:20 ` Jason Wang
2024-06-06 2:58 ` Jason Xing
2024-06-06 4:25 ` Jason Wang
2024-06-06 6:05 ` Michael S. Tsirkin
2024-06-06 7:56 ` Jason Wang
2024-06-06 13:45 ` Jiri Pirko
2024-06-07 6:25 ` Jason Wang
2024-06-07 6:39 ` Jiri Pirko
2024-06-07 6:43 ` Michael S. Tsirkin [this message]
2024-06-07 6:47 ` Jason Wang
2024-06-07 9:57 ` Jiri Pirko
2024-06-07 10:23 ` Michael S. Tsirkin
2024-06-07 11:30 ` Jiri Pirko
2024-06-10 14:18 ` Michael S. Tsirkin
2024-06-17 1:44 ` Jason Wang
2024-06-17 9:30 ` Jiri Pirko
2024-06-17 16:16 ` Michael S. Tsirkin
2024-06-18 1:19 ` Jason Wang
2024-06-18 0:52 ` Jason Wang
2024-06-18 18:23 ` Michael S. Tsirkin
2024-06-17 16:18 ` Michael S. Tsirkin
2024-06-07 11:22 ` Jason Xing
2024-06-06 11:42 ` Jason Xing
2024-06-06 12:00 ` Michael S. Tsirkin
2024-06-06 13:41 ` Jiri Pirko
2024-06-07 6:22 ` Jason Wang
2024-06-07 6:39 ` Michael S. Tsirkin
2024-06-07 6:40 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240607024057-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hawk@kernel.org \
--cc=hengqi@linux.alibaba.com \
--cc=jasowang@redhat.com \
--cc=jiri@resnulli.us \
--cc=john.fastabend@gmail.com \
--cc=kerneljasonxing@gmail.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).