From: Jesper Dangaard Brouer <hawk@kernel.org>
To: "Yunsheng Lin" <linyunsheng@huawei.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
Cc: zhangkun09@huawei.com, fanghaiqing@huawei.com,
	liuyonglong@huawei.com, Robin Murphy <robin.murphy@arm.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	IOMMU <iommu@lists.linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	Eric Dumazet <edumazet@google.com>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, kernel-team <kernel-team@cloudflare.com>
Subject: Re: [PATCH net-next v3 3/3] page_pool: fix IOMMU crash when driver has already unbound
Date: Tue, 12 Nov 2024 15:19:53 +0100	[thread overview]
Message-ID: <be049c33-936a-4c93-94ff-69cd51b5de8e@kernel.org> (raw)
In-Reply-To: <eab44c89-5ada-48b6-b880-65967c0f3b49@huawei.com>



On 12/11/2024 13.22, Yunsheng Lin wrote:
> On 2024/11/12 2:51, Toke Høiland-Jørgensen wrote:
> 
> ...
> 
>>>
>>> Is there any other suggestion/concern about how to fix the problem here?
>>>
>>>  From the previous discussion, it seems the main concern about tracking the
>>> inflight pages is about how many inflight pages it is needed.
>>
>> Yeah, my hardest objection was against putting a hard limit on the
>> number of outstanding pages.
>>
>>> If there is no other suggestion/concern , it seems the above concern might be
>>> addressed by using pre-allocated memory to satisfy the mostly used case, and
>>> use the dynamically allocated memory if/when necessary.
>>
>> For this, my biggest concern would be performance.
>>
>> In general, doing extra work in rarely used code paths (such as device
>> teardown) is much preferred to adding extra tracking in the fast path.
>> Which would be an argument for Alexander's suggestion of just scanning
>> the entire system page table to find pages to unmap. Don't know enough
>> about mm system internals to have an opinion on whether this is
>> feasible, though.
> 
> Yes, there are many MM internals involved, like the CONFIG_SPARSEMEM*
> configs, memory offline/online and other MM-specific optimizations, so
> it is hard to tell whether it is feasible.
> 
> It would be good if MM experts can clarify on this.
>

Yes, please.  Can Alex Duyck or the MM experts point me at some code
that walks the entire system page table?

Then I'll write some kernel code (maybe a module) to benchmark how long
such a walk takes on my machine with 384 GiB of memory. I do like Alex's
suggestion, but I want to assess the overhead of doing this on modern
hardware.

>>
>> In any case, we'll need some numbers to really judge the overhead in
>> practice. So benchmarking would be the logical next step in any case :)
> 
> POC code shows that dynamic memory allocation does not seem to add much
> overhead compared to the pre-allocated memory used in this patch; the
> overhead is about 10~20ns, which is similar to the overhead already
> added by the patch.
> 

An overhead of around 10~20ns is too large for page_pool, because the
XDP DDoS use-case has a very small per-packet time budget (which is what
page_pool was designed for).

[1] 
https://github.com/xdp-project/xdp-project/blob/master/areas/hints/traits01_bench_kmod.org#benchmark-basics

  | Link speed | Packet rate           | Time-budget   |
  |            | at smallest pkts size | per packet    |
  |------------+-----------------------+---------------|
  |  10 Gbit/s |  14,880,952 pps       | 67.2 nanosec  |
  |  25 Gbit/s |  37,202,381 pps       | 26.88 nanosec |
  | 100 Gbit/s | 148,809,523 pps       |  6.72 nanosec |


--Jesper

Thread overview: 38+ messages
2024-10-22  3:22 [PATCH net-next v3 0/3] fix two bugs related to page_pool Yunsheng Lin
2024-10-22  3:22 ` [Intel-wired-lan] " Yunsheng Lin
2024-10-22  3:22 ` [PATCH net-next v3 1/3] page_pool: introduce page_pool_to_pp() API Yunsheng Lin
2024-10-22  3:22   ` [Intel-wired-lan] " Yunsheng Lin
2024-10-22  3:22 ` [PATCH net-next v3 2/3] page_pool: fix timing for checking and disabling napi_local Yunsheng Lin
2024-11-07  6:17   ` Xuan Zhuo
2024-10-22  3:22 ` [PATCH net-next v3 3/3] page_pool: fix IOMMU crash when driver has already unbound Yunsheng Lin
2024-10-22 16:40   ` Simon Horman
2024-10-22 18:14   ` Jesper Dangaard Brouer
2024-10-23  8:59     ` Yunsheng Lin
2024-10-24 14:40       ` Toke Høiland-Jørgensen
2024-10-25  3:20         ` Yunsheng Lin
2024-10-25 11:16           ` Toke Høiland-Jørgensen
2024-10-25 14:07             ` Jesper Dangaard Brouer
2024-10-26  7:33               ` Yunsheng Lin
2024-11-06 13:25                 ` Jesper Dangaard Brouer
2024-11-06 15:57                   ` Jesper Dangaard Brouer
2024-11-06 19:55                     ` Alexander Duyck
2024-11-07 11:10                       ` Yunsheng Lin
2024-11-07 11:09                     ` Yunsheng Lin
2024-11-11 11:31                 ` Yunsheng Lin
2024-11-11 18:51                   ` Toke Høiland-Jørgensen
2024-11-12 12:22                     ` Yunsheng Lin
2024-11-12 14:19                       ` Jesper Dangaard Brouer [this message]
2024-11-13 12:21                         ` Yunsheng Lin
2024-11-18  9:08                         ` Yunsheng Lin
2024-11-18 15:11                           ` Jesper Dangaard Brouer
2024-10-26  7:32             ` Yunsheng Lin
2024-10-29 13:58               ` Toke Høiland-Jørgensen
2024-10-30 11:30                 ` Yunsheng Lin
2024-10-30 11:57                   ` Toke Høiland-Jørgensen
2024-10-31 12:17                     ` Yunsheng Lin
2024-10-31 16:18                       ` Toke Høiland-Jørgensen
2024-11-01 11:11                         ` Yunsheng Lin
2024-11-05 20:11                           ` Jesper Dangaard Brouer
2024-11-06 10:56                             ` Yunsheng Lin
2024-11-06 14:17                               ` Robin Murphy
2024-11-07  8:41                               ` Christoph Hellwig
