public inbox for linux-block@vger.kernel.org
From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: mpatocka@redhat.com, agk@redhat.com, snitzer@kernel.org,
	axboe@kernel.dk, hch@lst.de, dan.j.williams@intel.com,
	Jonathan.Cameron@Huawei.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev,
	dm-devel@lists.linux.dev,
	Dongsheng Yang <dongsheng.yang@linux.dev>
Subject: [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices
Date: Thu,  5 Jun 2025 14:22:55 +0000
Message-ID: <20250605142306.1930831-1-dongsheng.yang@linux.dev>

Hi Mikulas and all,

This is *RFC v2* of the *pcache* series, a persistent-memory backed cache.
Compared with *RFC v1*
<https://lore.kernel.org/lkml/20250414014505.20477-1-dongsheng.yang@linux.dev/>,
the most important change is that the whole cache has been *ported to
the Device-Mapper framework* and is now exposed as a regular DM target.

Code:
    https://github.com/DataTravelGuide/linux/tree/dm-pcache

Full RFC v2 test results:
    https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/results.html

    All 962 xfstests cases passed under each of four different
    pcache configurations.

    One of the detailed xfstests runs:
        https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/test-results/02-._pcache.py_PcacheTest.test_run-crc-enable-gc-gc0-test_script-xfstests-a515/debug.log

Below is a quick tour through the three layers of the implementation,
followed by an example invocation.

----------------------------------------------------------------------
1. pmem access layer
----------------------------------------------------------------------

* All reads use *copy_mc_to_kernel()* so that uncorrectable media
  errors are detected and reported.
* All writes go through *memcpy_flushcache()* to guarantee durability
  on real persistent memory.

----------------------------------------------------------------------
2. cache-logic layer (segments / keys / workers)
----------------------------------------------------------------------

Main features
  - 16 MiB pmem segments, log-structured allocation.
  - Multi-subtree RB-tree index for high parallelism.
  - Optional per-entry *CRC32* on cached data.
  - Background *write-back* worker and watermark-driven *GC*.
  - Crash-safe replay: key-sets are scanned from *key_tail* on start-up.

Current limitations
  - Only *write-back* mode is implemented.
  - Only FIFO cache invalidation; other policies (LRU, ARC, ...) are planned.
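
To give a feel for the 16 MiB segment size and a percentage-based GC
watermark, here is a small shell sketch. The helper names are
hypothetical, on-media metadata overhead is ignored, and the ">="
comparison is only an assumption about how a watermark could be
checked; it is not taken from the series.

```shell
#!/bin/sh
# Hypothetical helpers; they only illustrate the arithmetic.
SEG_SIZE=$((16 * 1024 * 1024))   # 16 MiB per segment, as stated above

# segs_for_bytes <pmem_bytes>
# Whole segments that fit into the given size (metadata ignored).
segs_for_bytes() {
    echo $(( $1 / SEG_SIZE ))
}

# gc_needed <used_segs> <total_segs> <gc_percent>
# Prints "yes" once used/total reaches gc_percent (assumed comparison).
gc_needed() {
    if [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]; then
        echo yes
    else
        echo no
    fi
}
```

For example, `segs_for_bytes $((1024 * 1024 * 1024))` prints 64, and
with 64 segments an 80 % watermark would be reached at 52 used segments
under the assumption above.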

----------------------------------------------------------------------
3. dm-pcache target integration
----------------------------------------------------------------------

* Table line  
    `pcache <pmem_dev> <origin_dev> writeback <true|false>`
* Features advertised to DM:
  - `ti->flush_supported = true`, so *PREFLUSH* and *FUA* are honoured
    (they force all open key-sets to close and data to be durable).
* Not yet supported:
  - Discard / TRIM.
  - dynamic `dmsetup reload`.
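
The table line above can be assembled by a small helper before it is
handed to `dmsetup create`. This is only a sketch: `pcache_table` is a
hypothetical name, and the only check it performs is that the last field
is the literal `true` or `false` shown in the table line.

```shell
#!/bin/sh
# pcache_table <sectors> <pmem_dev> <origin_dev> <true|false>
# Prints a DM table line in the format given above; hypothetical helper.
pcache_table() {
    case "$4" in
        true|false) ;;
        *)
            echo "pcache_table: last argument must be true or false" >&2
            return 1
            ;;
    esac
    # Start sector is 0; length is given in 512-byte sectors.
    printf '0 %s pcache %s %s writeback %s\n' "$1" "$2" "$3" "$4"
}
```

A possible use would be
`dmsetup create pcache_sdb --table "$(pcache_table "$(blockdev --getsz $ssd)" $pmem $ssd true)"`.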

Runtime controls
  - `dmsetup message <dev> 0 gc_percent <0-90>` adjusts the GC trigger.
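
Since `gc_percent` only accepts 0-90, a wrapper script may want to
validate the value before sending the message. The sketch below uses a
hypothetical helper, `gc_percent_cmd`, which prints the command instead
of running it so that it can be inspected first.

```shell
#!/bin/sh
# gc_percent_cmd <dm_name> <percent>
# Prints the dmsetup message command, or fails if <percent> is not an
# integer in the documented 0-90 range. Hypothetical helper.
gc_percent_cmd() {
    case "$2" in
        ''|*[!0-9]*)
            echo "gc_percent must be a non-negative integer" >&2
            return 1
            ;;
    esac
    if [ "$2" -gt 90 ]; then
        echo "gc_percent must be within 0-90" >&2
        return 1
    fi
    echo "dmsetup message $1 0 gc_percent $2"
}
```

For instance, `gc_percent_cmd pcache_sdb 80` prints
`dmsetup message pcache_sdb 0 gc_percent 80`, while a value of 95 is
rejected.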

Status line reports super-block flags, segment counts, GC threshold and
the three tail/head pointers (see the RST document for details).

----------------------------------------------------------------------
Example
----------------------------------------------------------------------
# 1. select the pmem and ssd devices
pmem=/dev/pmem0
ssd=/dev/sdb

# 2. map a pcache device in front.
dmsetup create pcache_sdb --table \
  "0 $(blockdev --getsz $ssd) pcache $pmem $ssd writeback true"

# 3. format and mount
mkfs.ext4 /dev/mapper/pcache_sdb
mount /dev/mapper/pcache_sdb /mnt

# 4. tune GC to 80 %
dmsetup message pcache_sdb 0 gc_percent 80

# 5. monitor
watch -n1 'dmsetup status pcache_sdb'

Testing:
    The test suite for pcache is hosted in the dtg-tests project and is
    built on top of the Avocado Framework. It currently includes:
        - Management-related test cases for pcache devices.
        - Data verification and validation tests.
        - Complete execution of the xfstests suite under multiple
          configurations.

Thanx
Dongsheng

Dongsheng Yang (11):
  dm-pcache: add pcache_internal.h
  dm-pcache: add backing device management
  dm-pcache: add cache device
  dm-pcache: add segment layer
  dm-pcache: add cache_segment
  dm-pcache: add cache_writeback
  dm-pcache: add cache_gc
  dm-pcache: add cache_key
  dm-pcache: add cache_req
  dm-pcache: add cache core
  dm-pcache: initial dm-pcache target

 .../admin-guide/device-mapper/dm-pcache.rst   | 200 ++++
 MAINTAINERS                                   |   9 +
 drivers/md/Kconfig                            |   2 +
 drivers/md/Makefile                           |   1 +
 drivers/md/dm-pcache/Kconfig                  |  17 +
 drivers/md/dm-pcache/Makefile                 |   3 +
 drivers/md/dm-pcache/backing_dev.c            | 305 ++++++
 drivers/md/dm-pcache/backing_dev.h            |  84 ++
 drivers/md/dm-pcache/cache.c                  | 443 +++++++++
 drivers/md/dm-pcache/cache.h                  | 601 ++++++++++++
 drivers/md/dm-pcache/cache_dev.c              | 310 ++++++
 drivers/md/dm-pcache/cache_dev.h              |  70 ++
 drivers/md/dm-pcache/cache_gc.c               | 170 ++++
 drivers/md/dm-pcache/cache_key.c              | 907 ++++++++++++++++++
 drivers/md/dm-pcache/cache_req.c              | 810 ++++++++++++++++
 drivers/md/dm-pcache/cache_segment.c          | 300 ++++++
 drivers/md/dm-pcache/cache_writeback.c        | 239 +++++
 drivers/md/dm-pcache/dm_pcache.c              | 388 ++++++++
 drivers/md/dm-pcache/dm_pcache.h              |  61 ++
 drivers/md/dm-pcache/pcache_internal.h        | 116 +++
 drivers/md/dm-pcache/segment.c                |  63 ++
 drivers/md/dm-pcache/segment.h                |  74 ++
 22 files changed, 5173 insertions(+)
 create mode 100644 Documentation/admin-guide/device-mapper/dm-pcache.rst
 create mode 100644 drivers/md/dm-pcache/Kconfig
 create mode 100644 drivers/md/dm-pcache/Makefile
 create mode 100644 drivers/md/dm-pcache/backing_dev.c
 create mode 100644 drivers/md/dm-pcache/backing_dev.h
 create mode 100644 drivers/md/dm-pcache/cache.c
 create mode 100644 drivers/md/dm-pcache/cache.h
 create mode 100644 drivers/md/dm-pcache/cache_dev.c
 create mode 100644 drivers/md/dm-pcache/cache_dev.h
 create mode 100644 drivers/md/dm-pcache/cache_gc.c
 create mode 100644 drivers/md/dm-pcache/cache_key.c
 create mode 100644 drivers/md/dm-pcache/cache_req.c
 create mode 100644 drivers/md/dm-pcache/cache_segment.c
 create mode 100644 drivers/md/dm-pcache/cache_writeback.c
 create mode 100644 drivers/md/dm-pcache/dm_pcache.c
 create mode 100644 drivers/md/dm-pcache/dm_pcache.h
 create mode 100644 drivers/md/dm-pcache/pcache_internal.h
 create mode 100644 drivers/md/dm-pcache/segment.c
 create mode 100644 drivers/md/dm-pcache/segment.h

-- 
2.34.1

