From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: mpatocka@redhat.com, agk@redhat.com, snitzer@kernel.org,
axboe@kernel.dk, hch@lst.de, dan.j.williams@intel.com,
Jonathan.Cameron@Huawei.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev,
dm-devel@lists.linux.dev,
Dongsheng Yang <dongsheng.yang@linux.dev>
Subject: [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices
Date: Thu, 5 Jun 2025 14:22:55 +0000
Message-ID: <20250605142306.1930831-1-dongsheng.yang@linux.dev>
Hi Mikulas and all,
This is *RFC v2* of the *pcache* series, a persistent-memory-backed cache.
Compared with *RFC v1*
<https://lore.kernel.org/lkml/20250414014505.20477-1-dongsheng.yang@linux.dev/>
the most important change is that the whole cache has been *ported to
the Device-Mapper framework* and is now exposed as a regular DM target.
Code:
https://github.com/DataTravelGuide/linux/tree/dm-pcache
Full RFC v2 test results:
https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/results.html
All 962 xfstests cases passed successfully under four different
pcache configurations.
One of the detailed xfstests runs:
https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/test-results/02-._pcache.py_PcacheTest.test_run-crc-enable-gc-gc0-test_script-xfstests-a515/debug.log
Below is a quick tour through the three layers of the implementation,
followed by an example invocation.
----------------------------------------------------------------------
1. pmem access layer
----------------------------------------------------------------------
* All reads use *copy_mc_to_kernel()* so that uncorrectable media
errors are detected and reported.
* All writes go through *memcpy_flushcache()* to guarantee durability
on real persistent memory.
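The read-side convention can be illustrated with a small user-space model.
copy_mc_to_kernel() returns the number of bytes it could not copy, so a
caller can treat any non-zero result as an I/O error. Everything named
model_* below is a hypothetical stand-in written for this sketch: there is
no real pmem or machine-check handling here, and how pcache itself
propagates the error is not shown in this cover letter.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a pmem region with at most one bad byte. */
struct model_pmem {
	const unsigned char *base;
	size_t size;
	size_t bad_off;		/* offset of the poisoned byte; >= size means none */
};

/*
 * Stand-in for copy_mc_to_kernel(): copies up to the poisoned byte and
 * returns the number of bytes NOT copied (0 means full success).
 */
size_t model_copy_mc(void *dst, const struct model_pmem *pm,
		     size_t off, size_t len)
{
	size_t good = len;

	if (pm->bad_off >= off && pm->bad_off - off < len)
		good = pm->bad_off - off;
	memcpy(dst, pm->base + off, good);
	return len - good;
}

/* Read helper: a short copy is reported to the caller as -EIO. */
int model_pmem_read(void *dst, const struct model_pmem *pm,
		    size_t off, size_t len)
{
	if (off > pm->size || len > pm->size - off)
		return -EINVAL;
	return model_copy_mc(dst, pm, off, len) ? -EIO : 0;
}
```

With a bad byte at offset 8, a read of bytes 0-7 succeeds, while a read of
bytes 4-11 returns -EIO instead of handing back partially valid data.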
----------------------------------------------------------------------
2. cache-logic layer (segments / keys / workers)
----------------------------------------------------------------------
Main features
- 16 MiB pmem segments, log-structured allocation.
- Multi-subtree RB-tree index for high parallelism.
- Optional per-entry *CRC32* on cached data.
- Background *write-back* worker and watermark-driven *GC*.
- Crash-safe replay: key-sets are scanned from *key_tail* on start-up.
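The optional data CRC can be pictured as follows. This is a minimal
user-space sketch, assuming the standard reflected CRC-32 (polynomial
0xEDB88320, initial value and final XOR of 0xFFFFFFFF). The seed, the entry
layout and the names (demo_crc32, struct demo_entry, data_crc) are
illustrative assumptions, not pcache's on-media format.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320). */
uint32_t demo_crc32(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical cache entry: payload plus the CRC stored at write time. */
struct demo_entry {
	const void *data;
	size_t len;
	uint32_t data_crc;
};

/* Record the checksum when the data is written into the cache. */
void demo_entry_seal(struct demo_entry *e, const void *data, size_t len)
{
	e->data = data;
	e->len = len;
	e->data_crc = demo_crc32(data, len);
}

/* On a cache hit, recompute and compare; returns 1 if the data is intact. */
int demo_entry_verify(const struct demo_entry *e)
{
	return demo_crc32(e->data, e->len) == e->data_crc;
}
```

A single flipped bit in the cached payload makes demo_entry_verify() return
0, which is the kind of silent corruption a per-entry checksum is meant to
catch.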
Current limitations
- Only *write-back* mode implemented.
- Only FIFO cache invalidation; other policies (LRU, ARC, ...) are planned.
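FIFO invalidation pairs naturally with log-structured allocation: data is
only ever appended at a head position, so the oldest segment is always the
next candidate for reclaim. The sketch below models that idea in user
space, assuming 16 MiB segments arranged as a ring. struct demo_log, the
head/tail naming and the single-threaded behaviour are assumptions made for
illustration, not the series' actual data structures.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_SEG_SIZE (16u << 20)	/* 16 MiB, as in the feature list */

struct demo_pos {
	uint32_t seg;	/* segment index */
	uint32_t off;	/* byte offset inside the segment */
};

/* Hypothetical log state: a ring of segments, filled at head, freed at tail. */
struct demo_log {
	uint32_t nr_segs;	/* must be at least 1 */
	uint32_t used_segs;	/* segments from tail to head, inclusive */
	uint32_t tail_seg;	/* oldest segment still holding data */
	struct demo_pos head;	/* next free byte */
};

void demo_log_init(struct demo_log *log, uint32_t nr_segs)
{
	log->nr_segs = nr_segs;
	log->used_segs = 1;
	log->tail_seg = 0;
	log->head.seg = 0;
	log->head.off = 0;
}

/* Append-only allocation: space behind the head is never reused in place. */
int demo_log_alloc(struct demo_log *log, uint32_t len, struct demo_pos *out)
{
	if (len == 0 || len > DEMO_SEG_SIZE)
		return -EINVAL;
	if (len > DEMO_SEG_SIZE - log->head.off) {
		/* Does not fit: open the next segment (the remainder is wasted). */
		if (log->used_segs == log->nr_segs)
			return -ENOSPC;
		log->head.seg = (log->head.seg + 1) % log->nr_segs;
		log->head.off = 0;
		log->used_segs++;
	}
	*out = log->head;
	log->head.off += len;
	return 0;
}

/* FIFO invalidation: drop the oldest segment; returns its index or -ENOENT. */
int demo_log_reclaim_oldest(struct demo_log *log)
{
	uint32_t victim = log->tail_seg;

	if (log->used_segs <= 1)
		return -ENOENT;	/* only the open head segment remains */
	log->tail_seg = (log->tail_seg + 1) % log->nr_segs;
	log->used_segs--;
	return (int)victim;
}
```

When demo_log_alloc() returns -ENOSPC, reclaiming the oldest segment frees
exactly the slot the head will advance into next, which is why FIFO is the
simplest policy to implement on top of a log.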
----------------------------------------------------------------------
3. dm-pcache target integration
----------------------------------------------------------------------
* Table line
`pcache <pmem_dev> <origin_dev> writeback <true|false>`
* Features advertised to DM:
- `ti->flush_supported = true`, so *PREFLUSH* and *FUA* are honoured
(they force all open key-sets to close and data to be durable).
* Not yet supported:
- Discard / TRIM.
- dynamic `dmsetup reload`.
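When the table line above is handed to dmsetup it is prefixed with a start
sector and a length, both counted in 512-byte sectors (the unit that
`blockdev --getsz` reports). The helper below is a small sketch of building
that string; demo_pcache_table() is a hypothetical name, and the final
true/false argument is passed through verbatim since its meaning is defined
in the RST document rather than here.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Build "0 <len> pcache <pmem> <origin> writeback <true|false>".
 * origin_bytes is the origin device size; DM tables count 512-byte sectors.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int demo_pcache_table(char *buf, size_t buflen, const char *pmem_dev,
		      const char *origin_dev, uint64_t origin_bytes, int flag)
{
	unsigned long long sectors = origin_bytes / 512;
	int n = snprintf(buf, buflen, "0 %llu pcache %s %s writeback %s",
			 sectors, pmem_dev, origin_dev, flag ? "true" : "false");

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For a 1 GiB origin device this yields a length of 2097152 sectors, which is
the value `blockdev --getsz` would print for the same device.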
Runtime controls
- `dmsetup message <dev> 0 gc_percent <0-90>` adjusts the GC trigger.
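A watermark-driven trigger of this kind is usually a utilisation check. The
sketch below assumes that GC starts once the percentage of used segments
exceeds gc_percent; the exact comparison pcache performs is not stated in
this cover letter. The 0-90 range comes from the message syntax above, and
the struct and function names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_GC_PERCENT_MAX 90	/* upper bound from the message syntax */

struct demo_gc {
	uint32_t gc_percent;	/* current threshold */
};

/* Validate and apply a new threshold, as a message handler might. */
int demo_gc_set_percent(struct demo_gc *gc, long value)
{
	if (value < 0 || value > DEMO_GC_PERCENT_MAX)
		return -EINVAL;
	gc->gc_percent = (uint32_t)value;
	return 0;
}

/* Assumed policy: collect once segment utilisation exceeds the threshold. */
int demo_gc_should_run(const struct demo_gc *gc, uint32_t used_segs,
		       uint32_t total_segs)
{
	if (total_segs == 0)
		return 0;
	return (uint64_t)used_segs * 100 / total_segs > gc->gc_percent;
}
```

Under this assumed policy, a threshold of 80 leaves a cache with 80 of 100
segments in use alone and starts collecting at 81. An out-of-range value
such as 91 is rejected and the previous threshold stays in effect.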
The status line reports the super-block flags, segment counts, GC threshold
and the three tail/head pointers (see the RST document for details).
----------------------------------------------------------------------
Example
----------------------------------------------------------------------
# 1. select the pmem and ssd devices
pmem=/dev/pmem0
ssd=/dev/sdb
# 2. map a pcache device in front.
dmsetup create pcache_sdb --table \
"0 $(blockdev --getsz $ssd) pcache $pmem $ssd writeback true"
# 3. format and mount
mkfs.ext4 /dev/mapper/pcache_sdb
mount /dev/mapper/pcache_sdb /mnt
# 4. tune GC to 80 %
dmsetup message pcache_sdb 0 gc_percent 80
# 5. monitor
watch -n1 'dmsetup status pcache_sdb'
Testing:
The test suite for pcache is hosted in the dtg-tests project, built
on top of the Avocado Framework. It currently includes:
- Management-related test cases for pcache devices.
- Data verification and validation tests.
- Complete execution of the xfstests suite under multiple
configurations.
Thanks
Dongsheng
Dongsheng Yang (11):
dm-pcache: add pcache_internal.h
dm-pcache: add backing device management
dm-pcache: add cache device
dm-pcache: add segment layer
dm-pcache: add cache_segment
dm-pcache: add cache_writeback
dm-pcache: add cache_gc
dm-pcache: add cache_key
dm-pcache: add cache_req
dm-pcache: add cache core
dm-pcache: initial dm-pcache target
.../admin-guide/device-mapper/dm-pcache.rst | 200 ++++
MAINTAINERS | 9 +
drivers/md/Kconfig | 2 +
drivers/md/Makefile | 1 +
drivers/md/dm-pcache/Kconfig | 17 +
drivers/md/dm-pcache/Makefile | 3 +
drivers/md/dm-pcache/backing_dev.c | 305 ++++++
drivers/md/dm-pcache/backing_dev.h | 84 ++
drivers/md/dm-pcache/cache.c | 443 +++++++++
drivers/md/dm-pcache/cache.h | 601 ++++++++++++
drivers/md/dm-pcache/cache_dev.c | 310 ++++++
drivers/md/dm-pcache/cache_dev.h | 70 ++
drivers/md/dm-pcache/cache_gc.c | 170 ++++
drivers/md/dm-pcache/cache_key.c | 907 ++++++++++++++++++
drivers/md/dm-pcache/cache_req.c | 810 ++++++++++++++++
drivers/md/dm-pcache/cache_segment.c | 300 ++++++
drivers/md/dm-pcache/cache_writeback.c | 239 +++++
drivers/md/dm-pcache/dm_pcache.c | 388 ++++++++
drivers/md/dm-pcache/dm_pcache.h | 61 ++
drivers/md/dm-pcache/pcache_internal.h | 116 +++
drivers/md/dm-pcache/segment.c | 63 ++
drivers/md/dm-pcache/segment.h | 74 ++
22 files changed, 5173 insertions(+)
create mode 100644 Documentation/admin-guide/device-mapper/dm-pcache.rst
create mode 100644 drivers/md/dm-pcache/Kconfig
create mode 100644 drivers/md/dm-pcache/Makefile
create mode 100644 drivers/md/dm-pcache/backing_dev.c
create mode 100644 drivers/md/dm-pcache/backing_dev.h
create mode 100644 drivers/md/dm-pcache/cache.c
create mode 100644 drivers/md/dm-pcache/cache.h
create mode 100644 drivers/md/dm-pcache/cache_dev.c
create mode 100644 drivers/md/dm-pcache/cache_dev.h
create mode 100644 drivers/md/dm-pcache/cache_gc.c
create mode 100644 drivers/md/dm-pcache/cache_key.c
create mode 100644 drivers/md/dm-pcache/cache_req.c
create mode 100644 drivers/md/dm-pcache/cache_segment.c
create mode 100644 drivers/md/dm-pcache/cache_writeback.c
create mode 100644 drivers/md/dm-pcache/dm_pcache.c
create mode 100644 drivers/md/dm-pcache/dm_pcache.h
create mode 100644 drivers/md/dm-pcache/pcache_internal.h
create mode 100644 drivers/md/dm-pcache/segment.c
create mode 100644 drivers/md/dm-pcache/segment.h
--
2.34.1