From: Usama Arif <usamaarif642@gmail.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Yosry Ahmed <yosryahmed@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Kairui Song <ryncsn@gmail.com>,
hanchuanhua@oppo.com, linux-mm@kvack.org,
baolin.wang@linux.alibaba.com, chrisl@kernel.org,
david@redhat.com, hannes@cmpxchg.org, hughd@google.com,
kaleshsingh@google.com, linux-kernel@vger.kernel.org,
mhocko@suse.com, minchan@kernel.org, nphamcs@gmail.com,
ryan.roberts@arm.com, senozhatsky@chromium.org,
shakeel.butt@linux.dev, shy828301@gmail.com, surenb@google.com,
v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org,
ying.huang@intel.com, hch@infradead.org
Subject: Re: [PATCH v7 2/2] mm: support large folios swap-in for sync io devices
Date: Thu, 5 Sep 2024 00:23:47 +0100
Message-ID: <bf232555-3653-40c7-bbdc-a8fe58a93a9e@gmail.com>
In-Reply-To: <CAGsJ_4yX7xmyDokYgc_H7MaxcOptcLeQs-SB1O22bSRHFdvVhQ@mail.gmail.com>
On 05/09/2024 00:10, Barry Song wrote:
> On Thu, Sep 5, 2024 at 9:30 AM Usama Arif <usamaarif642@gmail.com> wrote:
>>
>>
>>
>> On 03/09/2024 23:05, Yosry Ahmed wrote:
>>> On Tue, Sep 3, 2024 at 2:36 PM Barry Song <21cnbao@gmail.com> wrote:
>>>>
>>>> On Wed, Sep 4, 2024 at 8:08 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>>>>>
>>>>> On Tue, 3 Sep 2024 11:38:37 -0700 Yosry Ahmed <yosryahmed@google.com> wrote:
>>>>>
>>>>>>> [ 39.157954] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007
>>>>>>> [ 39.158288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>>>>>>> [ 39.158634] R13: 0000000000002b9a R14: 0000000000000000 R15: 00007ffd619d5518
>>>>>>> [ 39.158998] </TASK>
>>>>>>> [ 39.159226] ---[ end trace 0000000000000000 ]---
>>>>>>>
>>>>>>> After reverting this or Usama's "mm: store zero pages to be swapped
>>>>>>> out in a bitmap", the problem is gone. I think these two patches may
>>>>>>> have some conflict that needs to be resolved.
>>>>>>
>>>>>> Yup. I saw this conflict coming and specifically asked for this
>>>>>> warning to be added in Usama's patch to catch it [1]. It served its
>>>>>> purpose.
>>>>>>
>>>>>> Usama's patch does not handle large folio swapin, because at the time
>>>>>> it was written we didn't have it. We expected Usama's series to land
>>>>>> sooner than this one, so the warning was to make sure that this series
>>>>>> handles large folio swapin in the zeromap code. Now that they are both
>>>>>> in mm-unstable, we are gonna have to figure this out.
>>>>>>
>>>>>> I suspect Usama's patches are closer to land so it's better to handle
>>>>>> this in this series, but I will leave it up to Usama and
>>>>>> Chuanhua/Barry to figure this out :)
>>>>
>>>> I believe handling this in swap-in might violate layer separation.
>>>> `swap_read_folio()` should be a reliable API to call, regardless of
>>>> whether `zeromap` is present. Therefore, the fix should likely be
>>>> within `zeromap` but not this `swap-in`. I’ll take a look at this with
>>>> Usama :-)
>>>
>>> I meant handling it within this series to avoid blocking Usama
>>> patches, not within this code. Thanks for taking a look, I am sure you
>>> and Usama will figure out the best way forward :)
>>
>> Hi Barry and Yosry,
>>
>> Is the best (and quickest) way forward to have a v8 of this with
>> https://lore.kernel.org/all/20240904055522.2376-1-21cnbao@gmail.com/
>> as the first patch, and using swap_zeromap_entries_count in alloc_swap_folio
>> in this support large folios swap-in patch?
>
> Yes, Usama. I can actually do a check:
>
> zeromap_cnt = swap_zeromap_entries_count(entry, nr);
>
> /* swap_read_folio() can't handle inconsistent zeromap in multiple entries */
> if (zeromap_cnt > 0 && zeromap_cnt < nr)
> try next order;
>
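For illustration, the check quoted above can be modelled in userspace roughly as follows. This is a minimal sketch, not kernel code: the `zeromap` array stands in for the per-entry zeromap bitmap, and `zeromap_entries_count()` and `pick_swapin_order()` are hypothetical names with assumed signatures (the `swap_zeromap_entries_count(entry, nr)` helper in the thread takes a swap entry, not an array). A range that mixes zeromapped and non-zeromapped entries is skipped and the next lower order is tried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. zeromap[] stands in for the per-swap-entry
 * zeromap bitmap: true means the entry was swapped out as a zero page.
 * Counts how many of the nr entries starting at 'start' are zeromapped.
 */
static unsigned int zeromap_entries_count(const bool *zeromap,
					  size_t start, unsigned int nr)
{
	unsigned int i, cnt = 0;

	assert(zeromap && nr > 0);
	for (i = 0; i < nr; i++)
		cnt += zeromap[start + i];
	return cnt;
}

/*
 * Return the largest order <= max_order whose naturally aligned range
 * around 'entry' is either fully zeromapped or not zeromapped at all.
 * Order 0 always qualifies, because a single entry cannot be mixed.
 * The caller must ensure zeromap[] covers the aligned max_order range.
 */
static unsigned int pick_swapin_order(const bool *zeromap, size_t entry,
				      unsigned int max_order)
{
	unsigned int order;

	for (order = max_order; order > 0; order--) {
		unsigned int nr = 1u << order;
		size_t start = entry & ~((size_t)nr - 1);
		unsigned int cnt = zeromap_entries_count(zeromap, start, nr);

		if (cnt > 0 && cnt < nr)
			continue;	/* mixed range: try next order */
		return order;
	}
	return 0;
}
```

With eight entries of which only the first two are zeromapped, a fault on entry 0 with `max_order` 3 falls back to order 1, while a fault on entry 5 can still use order 2.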
> On the other hand, if you read the zRAM code, you will find that zRAM has
> exactly the same mechanism as zeromap, and can do even more by handling
> same-filled pages (same_pages). Since zRAM does this in the swapfile layer,
> it has no consistency issue of the kind zeromap has.
>
> So for the zRAM case, I feel we don't need zeromap at all, since the effort
> is duplicated, though I really appreciate your work, which benefits all
> swapfiles. I mean, zRAM is able to check for "zero" (and also for non-zero
> but same content). After zeromap has done its check, zRAM will check again:
>
Yes, there is a reason for having the zeromap patches, which I outlined in
the cover letter:
https://lore.kernel.org/all/20240627105730.3110705-1-usamaarif642@gmail.com/

There are use cases where zswap/zram might not be used in production, and
in those cases zeromap can reduce I/O and flash wear by a large amount.
Also, running in Meta production, we found that non-zero same-filled pages
made up less than 1%, so essentially it's only the zero-filled pages that
matter.

I believe that after zeromap it might be a good idea to remove the
page_same_filled check from the zram code. It's not really a problem if it's
kept either, as I don't believe any zero-filled pages should reach
zram_write_page?
> static int zram_write_page(struct zram *zram, struct page *page, u32 index)
> {
> ...
>
> if (page_same_filled(mem, &element)) {
> kunmap_local(mem);
> /* Free memory associated with this sector now. */
> flags = ZRAM_SAME;
> atomic64_inc(&zram->stats.same_pages);
> goto out;
> }
> ...
> }
>
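To make the overlap concrete, here is a rough userspace model of a same-filled check. It is a sketch under stated assumptions, not the zram implementation: `model_page_same_filled()`, `model_page_zero_filled()` and `MODEL_PAGE_WORDS` are hypothetical names, and the page is modelled as a plain array of words. It shows that a zero-filled page is simply the same-filled case whose repeated element is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page length in machine words; illustrative only. */
#define MODEL_PAGE_WORDS 512

/*
 * Return true if every word in the page equals the first word, and
 * report that word through *element. *element is left untouched when
 * the page is not same-filled.
 */
static bool model_page_same_filled(const unsigned long *page,
				   unsigned long *element)
{
	size_t i;

	assert(page && element);
	for (i = 1; i < MODEL_PAGE_WORDS; i++)
		if (page[i] != page[0])
			return false;
	*element = page[0];
	return true;
}

/* A zero-filled page is a same-filled page whose element is 0. */
static bool model_page_zero_filled(const unsigned long *page)
{
	unsigned long element;

	return model_page_same_filled(page, &element) && element == 0;
}
```

In this model, a layer that filters out every page for which `model_page_zero_filled()` is true leaves only the non-zero same-filled pages for a later `model_page_same_filled()` check to catch.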
> So it seems that zeromap might slightly impact my zRAM use case. I'm not
> blaming you, just pointing out that there might be some overlap in effort
> here :-)
>
>>
>> Thanks,
>> Usama
>
> Thanks
> Barry