netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Alexander Duyck <alexander.duyck@gmail.com>, pabeni@redhat.com
Cc: davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
	willemb@google.com
Subject: Re: [PATCH net-next 1/3] net: provide macros for commonly copied lockless queue stop/wake code
Date: Tue, 28 Mar 2023 17:56:01 -0700	[thread overview]
Message-ID: <20230328175601.76574704@kernel.org> (raw)
In-Reply-To: <CAKgT0Ufoy2WM3=aMNOdq2PFYL8AH9QSs=QrP_Xx59uouTnKLJg@mail.gmail.com>

On Sun, 26 Mar 2023 14:23:07 -0700 Alexander Duyck wrote:
> > > Except this isn't "stop", this is "maybe stop".  
> >
> > So the return value from try_stop and maybe_stop would be different?
> > try_stop needs to return 0 if it stopped - the same semantics as
> > trylock(), AFAIR. Not that I love those semantics, but it's a fairly
> > strong precedent.  
> 
> The problem is this isn't a lock. Ideally with this we aren't taking
> the action. So if anything this functions in my mind more like the
> inverse where if this does stop we have to abort more like trylock
> failing.

No.. for try_stop we are trying to stop.

> This is why I mentioned that maybe this should be renamed. I view this
> more as a check to verify we are good to proceed. In addition there is
> the problem that there are 3 possible outcomes with maybe_stop versus
> the two from try_stop.

I'm open to other names :S

> > > The thing is in order to make this work for the ixgbe patch you didn't
> > > use the maybe_stop instead you went with the try_stop. If you replaced
> > > the ixgbe_maybe_stop_tx with your maybe stop would have to do
> > > something such as the code above to make it work. That is what I am
> > > getting at. From what I can tell the only real difference between
> > > ixgbe_maybe_stop_tx and your maybe_stop is that you avoided having to
> > > move the restart_queue stat increment out.  
> >
> > I can convert ixgbe further, true, but I needed the try_stop, anyway,
> > because bnxt does:
> >
> > if (/* need to stop */) {
> >         if (xmit_more())
> >                 flush_db_write();
> >         netif_tx_queue_try_stop();
> > }
> >
> > which seems reasonable.  
> 
> I wasn't saying we didn't need try_stop. However the logic here
> doesn't care about the return value. In the ixgbe case we track the
> queue restarts so we would want a 0 on success and a non-zero if we
> have to increment the stat. I would be okay with the 0 (success) / -1
> (queue restarted) in this case.
> 
> > > The general thought is I would prefer to keep it so that 0 is the
> > > default most likely case in both where the queue is enabled and is
> > > still enabled. By moving the "take action" items into the 1/-1 values
> > > then it becomes much easier to sort them out with 1 being a stat
> > > increment and -1 being an indication to stop transmitting and prep for
> > > watchdog hang if we don't clear this in the next watchdog period.  
> >
> > Maybe worth taking a step back - the restart stat which ixgbe
> > maintains made perfect sense when you pioneered this approach but
> > I think we had a decade of use, and have kprobes now, so we don't
> > really need to maintain a statistic for a condition with no impact
> > to the user? New driver should not care 1 vs -1..  
> 
> Actually the restart_queue stat is VERY useful for debugging. It tells
> us we are seeing backlogs develop in the Tx queue. We track it any
> time we wake up the queue, not just in the maybe_stop case.
> 
> WIthout that we are then having to break out kprobes and the like
> which we could only add after-the-fact which makes things much harder
> to debug when issues occur. For example, a common case to use it is to
> monitor it when we see a system with slow Tx connections. With that
> stat we can tell if we are building a backlog in the qdisc or if it is
> something else such as a limited amount of socket memory is limiting
> the transmits.

Oh, I missed that wake uses the same stat. Let me clarify - the
stop/start counter is definitely useful. What I thought the restart
counter is counting is just the race cases. I don't think the race
cases are worth counting in any way.

> > > The thought I had with the enum is to more easily connect the outcomes
> > > with the sources. It would also help to prevent any confusion on what
> > > is what. Having the two stop/wake functions return different values is
> > > a potential source for errors since 0/1 means different things in the
> > > different functions. Basically since we have 3 possible outcomes using
> > > the enum would make it very clear what the mapping is between the two.  
> >
> > IMO only two outcomes matter in practice (as mentioned above).
> > I really like the ability to treat the return value as a bool, if only
> > we had negative zero we would have a perfect compromise :(  
> 
> I think we are just thinking about two different things. I am focusing
> on the "maybe" calls that have 3 outcomes whereas I think you are
> mostly focused on the "try" calls. My thought is to treat it something
> like the msix allocation calls where a negative indicates a failure
> forcing us to stop since the ring is full, 0 is a success, and a value
> indicates that there are resources but they are/were limited.

I don't see a strong analogy to PCI resource allocation :(

I prefer to keep the 0 vs non-0 distinction to indicate whether 
the action was performed.

Paolo, Eric, any opinion? Other than the one likely vs unlikely
flip -- is this good enough to merge for you?

  reply	other threads:[~2023-03-29  0:56 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-22 23:30 [PATCH net-next 1/3] net: provide macros for commonly copied lockless queue stop/wake code Jakub Kicinski
2023-03-22 23:30 ` [PATCH net-next 2/3] ixgbe: use new queue try_stop/try_wake macros Jakub Kicinski
2023-03-22 23:30 ` [PATCH net-next 3/3] bnxt: " Jakub Kicinski
2023-03-23  0:35 ` [PATCH net-next 1/3] net: provide macros for commonly copied lockless queue stop/wake code Andrew Lunn
2023-03-23  1:04   ` Jakub Kicinski
2023-03-23 21:02     ` Andrew Lunn
2023-03-23 22:46       ` Jakub Kicinski
2023-03-23  3:05 ` Yunsheng Lin
2023-03-23  3:27   ` Jakub Kicinski
2023-03-23  4:53 ` Pavan Chebbi
2023-03-23  5:08   ` Jakub Kicinski
2023-03-23 16:05 ` Alexander H Duyck
2023-03-24  3:09   ` Jakub Kicinski
2023-03-24 15:45     ` Alexander Duyck
2023-03-24 21:28       ` Jakub Kicinski
2023-03-26 21:23         ` Alexander Duyck
2023-03-29  0:56           ` Jakub Kicinski [this message]
2023-03-30 14:56             ` Paolo Abeni
  -- strict thread matches above, loose matches on Subject: below --
2023-04-01  5:12 [PATCH net-next 0/3] " Jakub Kicinski
2023-04-01  5:12 ` [PATCH net-next 1/3] " Jakub Kicinski
2023-04-01 15:04   ` Heiner Kallweit
2023-04-01 18:03     ` Jakub Kicinski
2023-04-01 15:18   ` Heiner Kallweit
2023-04-01 18:58     ` Jakub Kicinski
2023-04-01 20:41       ` Heiner Kallweit
2023-04-03 15:18       ` Alexander Duyck
2023-04-03 15:56         ` Jakub Kicinski
2023-04-03 18:11           ` Alexander Duyck
2023-04-03 19:03             ` Jakub Kicinski
2023-04-03 20:27               ` Alexander Duyck
2023-04-05 22:20                 ` Paul E. McKenney
2023-04-06  5:15                   ` Herbert Xu
2023-04-06 14:17                     ` Paul E. McKenney
2023-04-06 14:46                       ` Jakub Kicinski
2023-04-06 15:45                         ` Paul E. McKenney
2023-04-06 15:56                           ` Jakub Kicinski
2023-04-06 16:25                             ` Paul E. McKenney
2023-04-07  0:58                         ` Herbert Xu
2023-04-07  1:03                           ` Jakub Kicinski
2023-04-07  1:14                             ` Herbert Xu
2023-04-07  1:21                               ` Jakub Kicinski
2023-04-04  6:39         ` Herbert Xu
2023-04-04 22:36           ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230328175601.76574704@kernel.org \
    --to=kuba@kernel.org \
    --cc=alexander.duyck@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).