linux-mm.kvack.org archive mirror
From: Muchun Song <muchun.song@linux.dev>
To: Huan Yang <link@vivo.com>
Cc: bingbu.cao@linux.intel.com, "Christoph Hellwig" <hch@lst.de>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Vivek Kasireddy" <vivek.kasireddy@intel.com>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Uladzislau Rezki" <urezki@gmail.com>,
	"Shuah Khan" <shuah@kernel.org>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
	opensource.kernel@vivo.com
Subject: Re: CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is broken, was Re: [RFC PATCH 0/6] Deep talk about folio vmap
Date: Mon, 7 Apr 2025 10:57:13 +0800
Message-ID: <E4D6E02F-BC82-4630-8CB8-CD1A0163ABCF@linux.dev>
In-Reply-To: <e9f44d16-fd9a-4d82-b40e-c173d068676a@vivo.com>



> On Apr 7, 2025, at 09:59, Huan Yang <link@vivo.com> wrote:
> 
> 
> On 2025/4/4 18:07, Muchun Song wrote:
>> 
>>> On Apr 4, 2025, at 17:38, Muchun Song <muchun.song@linux.dev> wrote:
>>> 
>>> 
>>> 
>>>> On Apr 4, 2025, at 17:01, Christoph Hellwig <hch@lst.de> wrote:
>>>> 
>>>> After the btrfs compressed bio discussion I think the hugetlb changes that
>>>> skip the tail pages are fundamentally unsafe in the current kernel.
>>>> 
>>>> That is because the bio_vec representation assumes tail pages do exist, so
>>>> as soon as you are doing direct I/O that generates a bvec starting beyond
>>>> the present head page things will blow up.  Other users of bio_vecs might
>>>> do the same, but the way the block bio_vecs are generated is very susceptible
>>>> to that.  So we'll first need to sort that out and a few other things
>>>> before we can even think of enabling such a feature.
>>>> 
>>> I would like to express my gratitude to Christoph for including me in the
>>> thread. I have carefully read the cover letter in [1], which indicates
>>> that an issue has arisen due to the improper use of `vmap_pfn()`. I'm
>>> wondering if we could consider using `vmap()` instead. In the HVO scenario,
>>> the tail struct pages do **exist**, but they are read-only. I've examined
>>> the code of `vmap()`, and it appears that it only reads the struct page.
>>> Therefore, it seems feasible for us to use `vmap()` (I am not an expert in
>>> udmabuf). Right?
>> I believe my stance is correct. I've also reviewed another thread in [2].
>> Allow me to clarify and correct the viewpoints you presented. You stated:
>>   "
>>    So by HVO, it also not backed by pages, only contains folio head, each
>>    tail pfn's page struct go away.
>>   "
>> This statement is entirely inaccurate. The tail pages do not cease to exist;
>> rather, they are read-only. For your specific use-case, please use `vmap()`
>> to resolve the issue at hand. If you wish to gain a comprehensive understanding
> 
> I see the document gives a simple diagram to illustrate this:
> 
>  +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+
>  |           |                     |     0     | -------------> |     0     |
>  |           |                     +-----------+                +-----------+
>  |           |                     |     1     | -------------> |     1     |
>  |           |                     +-----------+                +-----------+
>  |           |                     |     2     | ----------------^ ^ ^ ^ ^ ^
>  |           |                     +-----------+                   | | | | |
>  |           |                     |     3     | ------------------+ | | | |
>  |           |                     +-----------+                     | | | |
>  |           |                     |     4     | --------------------+ | | |
>  |    PMD    |                     +-----------+                       | | |
>  |   level   |                     |     5     | ----------------------+ | |
>  |  mapping  |                     +-----------+                         | |
>  |           |                     |     6     | ------------------------+ |
>  |           |                     +-----------+                           |
>  |           |                     |     7     | --------------------------+
>  |           |                     +-----------+
>  |           |
>  |           |
>  |           |
>  +-----------+
> 
> If I understand correctly, the page structs of tails 2-7 are freed, so if I only
> need to map pages 2-7, can vmap() still do that correctly?

The answer is you can. It is essential to distinguish between virtual
address (VA) and physical address (PA). The VAs of tail struct pages
aren't freed but remapped to the physical page mapped by the VA of the
head struct page (since contents of those tail physical pages are the
same). Thus, the freed pages are the physical pages mapped by original
tail struct pages, not their virtual addresses. Moreover, while the
virtual addresses of these tail struct pages can be read, any write to
them is prohibited. That is acceptable because the kernel is expected
to write only to the head struct page of a compound page and to only
read the tail struct pages. BTW, the folio infrastructure is also based
on this assumption.
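
To make the VA/PA distinction concrete, here is a minimal user-space sketch
in Python. It is only a toy model, not kernel code: the class ToyVmemmap, its
methods, and the constants (4 KiB base pages, a 64-byte struct page, a 2 MiB
HugeTLB page) are assumptions made for illustration, and the layout (two
frames kept, the remaining VAs pointed at the last kept frame) simply follows
the diagram quoted above.

```python
PAGE_SIZE = 4096           # assumed base page size
STRUCT_PAGE_SIZE = 64      # assumed sizeof(struct page)
HUGE_PAGE_SIZE = 2 << 20   # one PMD-level (2 MiB) HugeTLB page


class ToyVmemmap:
    """Hypothetical model of the vmemmap range that describes one HugeTLB page."""

    def __init__(self):
        struct_pages = HUGE_PAGE_SIZE // PAGE_SIZE                  # 512 struct pages
        self.nr_vas = struct_pages * STRUCT_PAGE_SIZE // PAGE_SIZE  # 8 vmemmap pages
        # Before optimization every vmemmap VA has its own writable frame.
        self.va_to_frame = {va: va for va in range(self.nr_vas)}
        self.writable = {va: True for va in range(self.nr_vas)}
        self.frames = {f: f"struct pages in frame {f}" for f in range(self.nr_vas)}

    def optimize(self, kept=2):
        """Remap VAs >= kept onto the last kept frame (read-only); free their frames."""
        shared = kept - 1
        freed = []
        for va in range(kept, self.nr_vas):
            old_frame = self.va_to_frame[va]
            freed.append(old_frame)
            del self.frames[old_frame]      # the physical frame is what gets freed
            self.va_to_frame[va] = shared   # the VA stays valid, it just points elsewhere
            self.writable[va] = False
        return freed

    def read(self, va):
        return self.frames[self.va_to_frame[va]]

    def write(self, va, value):
        if not self.writable[va]:
            raise PermissionError(f"vmemmap VA {va} is mapped read-only")
        self.frames[self.va_to_frame[va]] = value
```

After optimize(), reading through any of the remapped VAs still succeeds and
returns the contents of the shared frame, while a write through them fails.
This is the property a reader such as vmap() would rely on.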

Thanks,
Muchun.

> 
> Or if I still misunderstand something, please correct me.
> 
> Thanks,
> 
> Huan Yang
> 
>> of the fundamentals of HVO, I kindly suggest a thorough review of the document
>> in [3].
>> 
>> [2] https://lore.kernel.org/lkml/5229b24f-1984-4225-ae03-8b952de56e3b@vivo.com/#t
>> [3] Documentation/mm/vmemmap_dedup.rst
>> 
>>> [1] https://lore.kernel.org/linux-mm/20250327092922.536-1-link@vivo.com/T/#m055b34978cf882fd44d2d08d929b50292d8502b4
>>> 
>>> Thanks,
>>> Muchun.
>>> 
>> 



Thread overview: 21+ messages
2025-03-27  9:28 [RFC PATCH 0/6] Deep talk about folio vmap Huan Yang
2025-03-27  9:28 ` [RFC PATCH 1/6] udmabuf: try fix udmabuf vmap Huan Yang
2025-03-27  9:28 ` [RFC PATCH 2/6] udmabuf: try udmabuf vmap test Huan Yang
2025-03-27  9:28 ` [RFC PATCH 3/6] mm/vmalloc: try add vmap folios range Huan Yang
2025-03-27  9:28 ` [RFC PATCH 4/6] udmabuf: use vmap_range_folios Huan Yang
2025-03-27  9:28 ` [RFC PATCH 5/6] udmabuf: vmap test suit for pages and pfns compare Huan Yang
2025-03-27  9:28 ` [RFC PATCH 6/6] udmabuf: remove no need code Huan Yang
2025-03-28 21:09 ` [RFC PATCH 0/6] Deep talk about folio vmap Vishal Moola (Oracle)
2025-04-04  9:01 ` CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is broken, was " Christoph Hellwig
2025-04-04  9:38   ` Muchun Song
2025-04-04 10:07     ` Muchun Song
2025-04-07  1:59       ` Huan Yang
2025-04-07  2:57         ` Muchun Song [this message]
2025-04-07  3:21           ` Huan Yang
2025-04-07  3:37             ` Muchun Song
2025-04-07  6:43               ` Muchun Song
2025-04-07  7:09                 ` Huan Yang
2025-04-07  7:22                   ` Muchun Song
2025-04-07  8:55                     ` Huan Yang
2025-04-07  8:59                 ` Christoph Hellwig
2025-04-07  9:48                   ` Muchun Song
