linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices
@ 2025-06-05 14:22 Dongsheng Yang
  2025-06-05 14:22 ` [RFC PATCH 01/11] dm-pcache: add pcache_internal.h Dongsheng Yang
                   ` (11 more replies)
  0 siblings, 12 replies; 18+ messages in thread
From: Dongsheng Yang @ 2025-06-05 14:22 UTC (permalink / raw)
  To: mpatocka, agk, snitzer, axboe, hch, dan.j.williams,
	Jonathan.Cameron
  Cc: linux-block, linux-kernel, linux-cxl, nvdimm, dm-devel,
	Dongsheng Yang

Hi Mikulas and all,

This is *RFC v2* of the *pcache* series, a persistent-memory backed cache.
Compared with *RFC v1* 
<https://lore.kernel.org/lkml/20250414014505.20477-1-dongsheng.yang@linux.dev/>  
the most important change is that the whole cache has been *ported to
the Device-Mapper framework* and is now exposed as a regular DM target.

Code:
    https://github.com/DataTravelGuide/linux/tree/dm-pcache

Full RFC v2 test results:
    https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/results.html

    All 962 xfstests cases passed successfully under four different
pcache configurations.

    One of the detailed xfstests run:
        https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/test-results/02-._pcache.py_PcacheTest.test_run-crc-enable-gc-gc0-test_script-xfstests-a515/debug.log

Below is a quick tour through the three layers of the implementation,
followed by an example invocation.

----------------------------------------------------------------------
1. pmem access layer
----------------------------------------------------------------------

* All reads use *copy_mc_to_kernel()* so that uncorrectable media
  errors are detected and reported.
* All writes go through *memcpy_flushcache()* to guarantee durability
  on real persistent memory.

----------------------------------------------------------------------
2. cache-logic layer (segments / keys / workers)
----------------------------------------------------------------------

Main features
  - 16 MiB pmem segments, log-structured allocation.
  - Multi-subtree RB-tree index for high parallelism.
  - Optional per-entry *CRC32* on cached data.
  - Background *write-back* worker and watermark-driven *GC*.
  - Crash-safe replay: key-sets are scanned from *key_tail* on start-up.

Current limitations
  - Only *write-back* mode implemented.
  - Only FIFO cache invalidate; other (LRU, ARC...) planned.

----------------------------------------------------------------------
3. dm-pcache target integration
----------------------------------------------------------------------

* Table line  
    `pcache <pmem_dev> <origin_dev> writeback <true|false>`
* Features advertised to DM:
  - `ti->flush_supported = true`, so *PREFLUSH* and *FUA* are honoured
    (they force all open key-sets to close and data to be durable).
* Not yet supported:
  - Discard / TRIM.
  - dynamic `dmsetup reload`.

Runtime controls
  - `dmsetup message <dev> 0 gc_percent <0-90>` adjusts the GC trigger.

Status line reports super-block flags, segment counts, GC threshold and
the three tail/head pointers (see the RST document for details).

----------------------------------------------------------------------
Example
----------------------------------------------------------------------
# 1. create a pmem and ssd
pmem=/dev/pmem0
ssd=/dev/sdb

# 2. map a pcache device in front.
dmsetup create pcache_sdb --table \
  "0 $(blockdev --getsz $ssd) pcache $pmem $ssd writeback true"

# 3. format and mount
mkfs.ext4 /dev/mapper/pcache_sdb
mount /dev/mapper/pcache_sdb /mnt

# 4. tune GC to 80 %
dmsetup message pcache_sdb 0 gc_percent 80

# 5. monitor
watch -n1 'dmsetup status pcache_sdb'

Testing:
    The test suite for pcache is hosted in the dtg-tests project, built
on top of the Avocado Framework. It includes currently:
        - Management-related test cases for pcache devices.
        - Data verification and validation tests.
        - Complete execution of xfstests suite under multiple
          configurations.

Thanx
Dongsheng

Dongsheng Yang (11):
  dm-pcache: add pcache_internal.h
  dm-pcache: add backing device management
  dm-pcache: add cache device
  dm-pcache: add segment layer
  dm-pcache: add cache_segment
  dm-pcache: add cache_writeback
  dm-pcache: add cache_gc
  dm-pcache: add cache_key
  dm-pcache: add cache_req
  dm-pcache: add cache core
  dm-pcache: initial dm-pcache target

 .../admin-guide/device-mapper/dm-pcache.rst   | 200 ++++
 MAINTAINERS                                   |   9 +
 drivers/md/Kconfig                            |   2 +
 drivers/md/Makefile                           |   1 +
 drivers/md/dm-pcache/Kconfig                  |  17 +
 drivers/md/dm-pcache/Makefile                 |   3 +
 drivers/md/dm-pcache/backing_dev.c            | 305 ++++++
 drivers/md/dm-pcache/backing_dev.h            |  84 ++
 drivers/md/dm-pcache/cache.c                  | 443 +++++++++
 drivers/md/dm-pcache/cache.h                  | 601 ++++++++++++
 drivers/md/dm-pcache/cache_dev.c              | 310 ++++++
 drivers/md/dm-pcache/cache_dev.h              |  70 ++
 drivers/md/dm-pcache/cache_gc.c               | 170 ++++
 drivers/md/dm-pcache/cache_key.c              | 907 ++++++++++++++++++
 drivers/md/dm-pcache/cache_req.c              | 810 ++++++++++++++++
 drivers/md/dm-pcache/cache_segment.c          | 300 ++++++
 drivers/md/dm-pcache/cache_writeback.c        | 239 +++++
 drivers/md/dm-pcache/dm_pcache.c              | 388 ++++++++
 drivers/md/dm-pcache/dm_pcache.h              |  61 ++
 drivers/md/dm-pcache/pcache_internal.h        | 116 +++
 drivers/md/dm-pcache/segment.c                |  63 ++
 drivers/md/dm-pcache/segment.h                |  74 ++
 22 files changed, 5173 insertions(+)
 create mode 100644 Documentation/admin-guide/device-mapper/dm-pcache.rst
 create mode 100644 drivers/md/dm-pcache/Kconfig
 create mode 100644 drivers/md/dm-pcache/Makefile
 create mode 100644 drivers/md/dm-pcache/backing_dev.c
 create mode 100644 drivers/md/dm-pcache/backing_dev.h
 create mode 100644 drivers/md/dm-pcache/cache.c
 create mode 100644 drivers/md/dm-pcache/cache.h
 create mode 100644 drivers/md/dm-pcache/cache_dev.c
 create mode 100644 drivers/md/dm-pcache/cache_dev.h
 create mode 100644 drivers/md/dm-pcache/cache_gc.c
 create mode 100644 drivers/md/dm-pcache/cache_key.c
 create mode 100644 drivers/md/dm-pcache/cache_req.c
 create mode 100644 drivers/md/dm-pcache/cache_segment.c
 create mode 100644 drivers/md/dm-pcache/cache_writeback.c
 create mode 100644 drivers/md/dm-pcache/dm_pcache.c
 create mode 100644 drivers/md/dm-pcache/dm_pcache.h
 create mode 100644 drivers/md/dm-pcache/pcache_internal.h
 create mode 100644 drivers/md/dm-pcache/segment.c
 create mode 100644 drivers/md/dm-pcache/segment.h

-- 
2.34.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-06-30 16:28 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-05 14:22 [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices Dongsheng Yang
2025-06-05 14:22 ` [RFC PATCH 01/11] dm-pcache: add pcache_internal.h Dongsheng Yang
2025-06-05 14:22 ` [RFC PATCH 02/11] dm-pcache: add backing device management Dongsheng Yang
2025-06-05 14:22 ` [RFC PATCH 03/11] dm-pcache: add cache device Dongsheng Yang
2025-06-05 14:22 ` [RFC PATCH 04/11] dm-pcache: add segment layer Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 05/11] dm-pcache: add cache_segment Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 06/11] dm-pcache: add cache_writeback Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 07/11] dm-pcache: add cache_gc Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 08/11] dm-pcache: add cache_key Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 09/11] dm-pcache: add cache_req Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 10/11] dm-pcache: add cache core Dongsheng Yang
2025-06-05 14:23 ` [RFC PATCH 11/11] dm-pcache: initial dm-pcache target Dongsheng Yang
2025-06-12 16:57 ` [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices Mikulas Patocka
2025-06-13  3:39   ` Dongsheng Yang
2025-06-23  3:13   ` Dongsheng Yang
2025-06-23  4:18     ` Dongsheng Yang
2025-06-30 15:57     ` Mikulas Patocka
2025-06-30 16:28       ` Dongsheng Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).