From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
Larysa Zaremba <larysa.zaremba@intel.com>,
Yunsheng Lin <linyunsheng@huawei.com>,
Alexander Duyck <alexanderduyck@fb.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
Ilias Apalodimas <ilias.apalodimas@linaro.org>,
Simon Horman <simon.horman@corigine.com>,
<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next v4 5/6] page_pool: add a lockdep check for recycling in hardirq
Date: Tue, 8 Aug 2023 17:06:05 +0200
Message-ID: <8ee66e8f-cada-b492-d23f-e4e15cfef868@intel.com>
In-Reply-To: <CAKgT0UcZspvhYcfiKs90snAfwwb+CMn-vhA62XcSTRiV0BfOqw@mail.gmail.com>

From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Tue, 8 Aug 2023 07:52:32 -0700

> On Tue, Aug 8, 2023 at 6:59 AM Alexander Lobakin
> <aleksander.lobakin@intel.com> wrote:
>>
>> From: Alexander Duyck <alexander.duyck@gmail.com>
>> Date: Tue, 8 Aug 2023 06:45:26 -0700
[...]
>>>>> Secondly rather than returning an error is there any reason why we
>>>>> couldn't just look at not returning page and instead just drop into the
>>>>> release path which wouldn't take the locks in the first place? Either
>>>>
>>>> That is exception path to quickly catch broken drivers and fix them, why
>>>> bother? It's not something we have to live with.
>>>
>>> My concern is that the current "fix" consists of stalling a Tx ring.
>>> We need to have a way to allow forward progress when somebody mixes
>>> xdp_frame and skb traffic as I suspect we will end up with a number of
>>> devices doing this since they cannot handle recycling the pages in
>>> hardirq context.
>>
>> You could've seen that several vendors already disabled recycling XDP
>> buffers when in hardirq (= netpoll) in their drivers. hardirq is in
>> general not for networking-related operations.
>
> The whole idea behind the netpoll cleanup is to get the Tx buffers out
> of the way so that we can transmit even after the system has crashed.
> The idea isn't to transmit XDP buffers, but to get the buffers out of
> the way in the cases where somebody is combining both xdp_frame and
> sk_buff on the same queue due to a limited number of rings being
> present on the device.

I see now, thanks a lot!

>
> My concern is that at some point in the near future somebody is going
> to have a system crash and instead of being able to get the crash log
> message out via their netconsole it is going to get cut off because
> the driver stopped cleaning the Tx ring because somebody was also
> using it as an XDP redirect destination.
>
>>>
>>> The only reason why the skbs don't have the problem is that they are
>>> queued and then cleaned up in the net_tx_action. That is why I wonder
>>> if we shouldn't look at adding some sort of support for doing
>>> something like that with xdp_frame as well. Something like a
>>> dev_kfree_pp_page_any to go along with the dev_kfree_skb_any.
>>
>> I still don't get why we may need to clean XDP buffers in hardirq, maybe
>> someone could give me some links to read why we may need this and how
>> that happens? netpoll is a very specific thing for some debug
>> operations, isn't it? XDP shouldn't in general be enabled when this
>> happens, should it?
>
> I think I kind of explained it above. It isn't so much about cleaning
> the XDP buffers as getting them off of the ring and out of the way. If
> we block a Tx queue because of an XDP buffer then we cannot use that
> Tx queue. I would be good with us just deferring the cleanup like we
> do with an sk_buff in dev_kfree_skb_irq, the only issue is we don't
> have the ability to put them on a queue since they don't have
> prev/next pointers.
>
> I suppose an alternative to cleaning them might be to make a mandatory
> requirement that you cannot support netpoll and mix xdp_frame and
> sk_buff on the same queue. If we enforced that then my concern about
> them blocking a queue would be addressed.

I'm leaning more towards this one TBH. I don't see netpoll alone as
a solid argument for introducing deferred queues for XDP frames :s
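
For reference, the deferral being weighed above can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not
kernel code and not a proposed patch: struct frame, struct defer_queue,
defer_release() and drain_deferred() are hypothetical names, and the
"hardirq"/"softirq" split is simulated by two ordinary function calls.
Because the frame has no prev/next pointers, the sketch keeps the
pointers in a separate fixed-size array instead of linking the frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an xdp_frame: it has no prev/next pointers,
 * so it cannot be chained into a list the way an sk_buff can. */
struct frame {
	int id;
	bool released;
};

#define DEFER_SLOTS 4

/* Hypothetical deferred queue: the pointers live in an external array
 * rather than inside the frames themselves. */
struct defer_queue {
	struct frame *slot[DEFER_SLOTS];
	size_t count;
};

/* Simulates the "hardirq" side: only gets the frame out of the way and
 * remembers it. Returns false when the queue is already full. */
static bool defer_release(struct defer_queue *q, struct frame *f)
{
	assert(q && f);

	if (q->count == DEFER_SLOTS)
		return false;

	q->slot[q->count++] = f;
	return true;
}

/* Simulates the later "softirq" side: performs the actual release and
 * returns how many frames were processed. */
static size_t drain_deferred(struct defer_queue *q)
{
	size_t n = q->count;

	for (size_t i = 0; i < n; i++)
		q->slot[i]->released = true;

	q->count = 0;
	return n;
}
```

The cost this makes visible is the extra per-queue storage and the
"queue full" case, both of which would exist only to serve netpoll.
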
>
>> (unrelated: 6:58 AM West Coast, you use to wake up early or traveling?
>> :D)
>
> I am usually up pretty early, especially this time of year. Sunrise
> here is 6AM and I am usually up a little before that.. :)

Nice!

Thanks,
Olek