From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: agk@redhat.com, snitzer@kernel.org, axboe@kernel.dk, hch@lst.de,
dan.j.williams@intel.com, Jonathan.Cameron@Huawei.com,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev,
dm-devel@lists.linux.dev
Subject: Re: [PATCH v1 00/11] dm-pcache – persistent-memory cache for block devices
Date: Mon, 30 Jun 2025 22:16:37 +0800
Message-ID: <43e84a3e-f574-4c97-9f33-35fcb3751e01@linux.dev>
In-Reply-To: <7ff7c4fc-d830-41c9-ab94-a198d3d9a3b5@linux.dev>
On 6/30/2025 9:40 PM, Dongsheng Yang wrote:
>
> On 6/30/2025 9:30 PM, Mikulas Patocka wrote:
>>
>> On Tue, 24 Jun 2025, Dongsheng Yang wrote:
>>
>>> Hi Mikulas,
>>> This is V1 for dm-pcache, please take a look.
>>>
>>> Code:
>>> https://github.com/DataTravelGuide/linux tags/pcache_v1
>>>
>>> Changelogs from RFC-V2:
>>> - use crc32c to replace crc32
>>> - only retry pcache_req when the cache is full: add pcache_req to
>>>   defer_list and wait for cache invalidation to happen.
>>> - new format for the pcache table; it is more easily extended with
>>>   new parameters later.
>>> - remove __packed.
>>> - use spin_lock_irq in req_complete_fn to replace spin_lock_irqsave.
>>> - fix bug in backing_dev_bio_end with spin_lock_irqsave.
>>> - queue_work() inside spinlock.
>>> - introduce inline_bvecs in backing_dev_req.
>>> - use kmalloc_array for bvecs allocation.
>>> - calculate ->off with dm_target_offset() before using it.
>> Hi
>>
>> The out-of-memory handling still doesn't seem right.
>>
>> If the GFP_NOWAIT allocation doesn't succeed (which may happen anytime,
>> for example it happens when the machine is receiving network packets
>> faster than the swapper is able to swap out data), create_cache_miss_req
>> returns NULL, the caller changes it to -ENOMEM, cache_read returns
>> -ENOMEM, -ENOMEM is propagated up to end_req and end_req will set the
>> status to BLK_STS_RESOURCE. So, it may randomly fail I/Os with an error.
>>
>> Properly, you should use mempools. The mempool allocation will wait
>> until some other process frees data into the mempool.
>>
>> If you need to allocate memory inside a spinlock, you can't do it
>> reliably (because you can't sleep inside a spinlock and non-sleeping
>> memory allocation may fail anytime). So, in this case, you should drop
>> the spinlock, allocate the memory from a mempool with GFP_NOIO and jump
>> back to grab the spinlock - and now you are holding the allocated
>> object, so you can use it while you hold the spinlock.
>
>
> Hi Mikulas,
>
> Thanx for your suggestion. I will cook up a GFP_NOIO version of the
> memory allocation for the pcache data path.
Hi Mikulas,
The reason we don’t release the spinlock here is that, if we did, the
subtree could change while the lock is dropped.
For example, in the `fixup_overlap_contained()` function, we may need to
split a `cache_key`, and that requires allocating a new `cache_key`.
If we drop the spinlock at this point and re-acquire it after the
allocation, the subtree might already have been modified, and we cannot
safely continue with the split operation.
In that case, we would have to restart the entire subtree search
and walk. But the new walk might require more memory, or less,
so it is very difficult to know in advance how much memory will be needed
before acquiring the spinlock.
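To make the restart problem concrete, here is a minimal userspace sketch of
the drop-lock, allocate, retry pattern. It is only a model under stated
assumptions, not the pcache code: `model_subtree`, `model_key`,
`reserve_grow()` and `walk_with_reserve()` are hypothetical names, a C11
`atomic_flag` stands in for the subtree spinlock, and `malloc()` stands in
for a sleeping mempool allocation with GFP_NOIO.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the pcache structures. */
struct model_key {
	struct model_key *next;
};

struct model_subtree {
	atomic_flag lock;   /* models the subtree spinlock */
	int splits_needed;  /* new keys the current walk would have to allocate */
};

static void model_lock(struct model_subtree *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;  /* spin */
}

static void model_unlock(struct model_subtree *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/*
 * Grow the private reserve to at least `want` keys. Called only with the
 * lock dropped; malloc() stands in for a sleeping mempool allocation.
 */
static struct model_key *reserve_grow(struct model_key *reserve, int *have, int want)
{
	while (*have < want) {
		struct model_key *k = malloc(sizeof(*k));

		if (!k)
			abort();  /* a real mempool would wait rather than fail */
		k->next = reserve;
		reserve = k;
		(*have)++;
	}
	return reserve;
}

/* Returns how many times the walk had to be restarted. */
static int walk_with_reserve(struct model_subtree *t)
{
	struct model_key *reserve = NULL;
	int have = 0, restarts = 0;

	for (;;) {
		int need;

		model_lock(t);
		need = t->splits_needed;  /* models the subtree walk */
		if (need <= have) {
			/* Enough keys in hand: finish the walk under the lock. */
			while (need-- > 0) {
				struct model_key *k = reserve;

				reserve = k->next;
				have--;
				free(k);  /* stand-in for linking the key into the tree */
			}
			t->splits_needed = 0;
			model_unlock(t);
			break;
		}
		model_unlock(t);
		/* The subtree may change here, so `need` is only a guess. */
		reserve = reserve_grow(reserve, &have, need);
		restarts++;
	}

	while (reserve) {  /* hand back keys the final walk did not use */
		struct model_key *k = reserve;

		reserve = k->next;
		free(k);
	}
	return restarts;
}
```

In a single-threaded run with `splits_needed = 2`, the first pass finds the
reserve empty, drops the lock, allocates two keys and restarts once. With
concurrent writers changing the subtree while the lock is dropped, the
number of restarts in this model has no fixed upper bound, which is the
difficulty described above.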
So allocating memory inside the spinlock is actually the more direct
and feasible approach. `GFP_NOWAIT` fails too easily; maybe `GFP_ATOMIC`
is more appropriate.
What do you think?
>
>>
>>
>> Another comment:
>> set_bit/clear_bit use atomic instructions which are slow. As you already
>> hold a spinlock when calling them, you don't need the atomicity, so you
>> can replace them with __set_bit and __clear_bit.
>
>
> Good idea.
>
>
> Thanx
>
> Dongsheng
>
>>
>> Mikulas
>>
>