From: Dongsheng Yang <dongsheng.yang@linux.dev>
To: mpatocka@redhat.com, agk@redhat.com, snitzer@kernel.org,
	axboe@kernel.dk, hch@lst.de, dan.j.williams@intel.com,
	Jonathan.Cameron@Huawei.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org, nvdimm@lists.linux.dev,
	dm-devel@lists.linux.dev,
	Dongsheng Yang <dongsheng.yang@linux.dev>
Subject: [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices
Date: Thu,  5 Jun 2025 14:22:55 +0000
Message-ID: <20250605142306.1930831-1-dongsheng.yang@linux.dev>

Hi Mikulas and all,

This is *RFC v2* of the *pcache* series, a persistent-memory backed cache.
Compared with *RFC v1*
<https://lore.kernel.org/lkml/20250414014505.20477-1-dongsheng.yang@linux.dev/>,
the most important change is that the whole cache has been *ported to
the Device-Mapper framework* and is now exposed as a regular DM target.

Code:
    https://github.com/DataTravelGuide/linux/tree/dm-pcache

Full RFC v2 test results:
    https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/results.html

    All 962 xfstests cases passed under four different pcache
    configurations.

    One of the detailed xfstests runs:
        https://datatravelguide.github.io/dtg-blog/pcache/pcache_rfc_v2_result/test-results/02-._pcache.py_PcacheTest.test_run-crc-enable-gc-gc0-test_script-xfstests-a515/debug.log

Below is a quick tour through the three layers of the implementation,
followed by an example invocation.

----------------------------------------------------------------------
1. pmem access layer
----------------------------------------------------------------------

* All reads use *copy_mc_to_kernel()* so that uncorrectable media
  errors are detected and reported.
* All writes go through *memcpy_flushcache()* to guarantee durability
  on real persistent memory.
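
To illustrate the read/write discipline above, here is a minimal userspace
model. It is a sketch, not code from the patches: `struct pmem_model`,
`model_copy_mc()`, `pmem_read_sketch()` and `pmem_write_sketch()` are
hypothetical names, and the "poisoned byte" is only a stand-in for an
uncorrectable media error. The one property carried over from the kernel
helper is that *copy_mc_to_kernel()* returns the number of bytes it could
not copy, so a non-zero result has to be turned into an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of a pmem region. poison_off marks one byte that
 * "fails" on read; (size_t)-1 means no media error. Hypothetical type,
 * not taken from the patches.
 */
struct pmem_model {
	unsigned char *base;
	size_t size;
	size_t poison_off;
};

/*
 * Stand-in for copy_mc_to_kernel(): like the kernel helper, it returns
 * the number of bytes NOT copied, so 0 means success.
 */
static unsigned long model_copy_mc(struct pmem_model *pm, void *dst,
				   size_t off, size_t len)
{
	size_t good = len;

	assert(off <= pm->size && len <= pm->size - off);
	if (pm->poison_off != (size_t)-1 &&
	    pm->poison_off >= off && pm->poison_off < off + len)
		good = pm->poison_off - off;	/* copy stops at the bad byte */

	memcpy(dst, pm->base + off, good);
	return len - good;
}

/* Read path: a short copy is reported as -EIO instead of returning junk. */
static int pmem_read_sketch(struct pmem_model *pm, void *dst,
			    size_t off, size_t len)
{
	return model_copy_mc(pm, dst, off, len) ? -EIO : 0;
}

/*
 * Write path: in the kernel this would be memcpy_flushcache(). Plain
 * memcpy() here models only the data movement, not the durability.
 */
static void pmem_write_sketch(struct pmem_model *pm, size_t off,
			      const void *src, size_t len)
{
	assert(off <= pm->size && len <= pm->size - off);
	memcpy(pm->base + off, src, len);
}
```

A caller would treat `-EIO` from the read helper as a failed cache read
rather than handing partially copied data up the stack.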

----------------------------------------------------------------------
2. cache-logic layer (segments / keys / workers)
----------------------------------------------------------------------

Main features
  - 16 MiB pmem segments, log-structured allocation.
  - Multi-subtree RB-tree index for high parallelism.
  - Optional per-entry *CRC32* on cached data.
  - Background *write-back* worker and watermark-driven *GC*.
  - Crash-safe replay: key-sets are scanned from *key_tail* on start-up.
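
As a rough picture of what "log-structured allocation" in 16 MiB segments
means, the sketch below hands out space at a moving head and advances to
the next segment when a request does not fit. The names (`struct log_head`,
`struct log_pos`, `log_alloc()`) are hypothetical, and two simplifications
are assumptions of this sketch rather than statements about the patches:
an allocation never spans two segments, and running out of segments is a
plain failure instead of waiting for GC to free one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEG_SIZE (16u * 1024 * 1024)	/* 16 MiB, as stated above */

/* Hypothetical allocator state: the "head" of the log. */
struct log_head {
	uint32_t seg_id;	/* segment currently being filled */
	uint32_t seg_off;	/* next free byte inside that segment */
	uint32_t nr_segs;	/* total segments on the cache device */
};

/* Where an allocation landed. */
struct log_pos {
	uint32_t seg_id;
	uint32_t seg_off;
};

/*
 * Append-only allocation: space is taken at the head; when the request
 * does not fit in the rest of the current segment, the head moves to
 * the start of the next one. Returns false when no segment is left.
 */
static bool log_alloc(struct log_head *h, uint32_t len, struct log_pos *out)
{
	assert(len > 0 && len <= SEG_SIZE);
	assert(h->seg_id < h->nr_segs && h->seg_off <= SEG_SIZE);

	if (len > SEG_SIZE - h->seg_off) {
		if (h->seg_id + 1 >= h->nr_segs)
			return false;	/* assumption: fail, do not wait for GC */
		h->seg_id++;
		h->seg_off = 0;
	}

	out->seg_id = h->seg_id;
	out->seg_off = h->seg_off;
	h->seg_off += len;
	return true;
}
```

Because data is only ever appended, the oldest segments are the natural
candidates for reclaim, which matches the FIFO invalidation noted below.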

Current limitations
  - Only *write-back* mode implemented.
  - Only FIFO cache invalidation; other policies (LRU, ARC, ...) are planned.

----------------------------------------------------------------------
3. dm-pcache target integration
----------------------------------------------------------------------

* Table line  
    `pcache <pmem_dev> <origin_dev> writeback <true|false>`
* Features advertised to DM:
  - `ti->flush_supported = true`, so *PREFLUSH* and *FUA* are honoured
    (they force all open key-sets to close and data to be durable).
* Not yet supported:
  - Discard / TRIM.
  - dynamic `dmsetup reload`.
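
For readers less familiar with DM targets, the table line above reaches the
target constructor as an argument vector holding everything after the
target name. The userspace sketch below shows how those four arguments
could be checked. `struct pcache_args` and `pcache_parse_args()` are
hypothetical names, and because this letter does not say what the trailing
`<true|false>` controls, the sketch stores it in a neutrally named field.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parsed form of the four target arguments. */
struct pcache_args {
	const char *pmem_dev;
	const char *origin_dev;
	bool opt;	/* trailing <true|false>; meaning not given in this letter */
};

/*
 * argv holds only what follows the target name "pcache" in the table
 * line: <pmem_dev> <origin_dev> writeback <true|false>.
 */
static int pcache_parse_args(int argc, char **argv, struct pcache_args *out)
{
	assert(out);

	if (argc != 4)
		return -EINVAL;
	if (strcmp(argv[2], "writeback") != 0)	/* only mode implemented */
		return -EINVAL;

	if (strcmp(argv[3], "true") == 0)
		out->opt = true;
	else if (strcmp(argv[3], "false") == 0)
		out->opt = false;
	else
		return -EINVAL;

	out->pmem_dev = argv[0];
	out->origin_dev = argv[1];
	return 0;
}
```

Rejecting anything other than `writeback` up front keeps a table written
for a future cache mode from being accepted silently.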

Runtime controls
  - `dmsetup message <dev> 0 gc_percent <0-90>` adjusts the GC trigger.
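
The effect of `gc_percent` can be pictured with the small sketch below: the
message handler range-checks the value, and GC is wanted once the share of
used segments passes the threshold. All names (`struct gc_state`,
`gc_set_percent()`, `gc_wanted()`) are hypothetical, and whether the real
comparison is strict or inclusive is an assumption here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define GC_PERCENT_MAX 90	/* upper bound from the message syntax above */

/* Hypothetical state: only the fields this sketch needs. */
struct gc_state {
	uint32_t segs_total;
	uint32_t segs_used;
	uint32_t gc_percent;
};

/* Validate a new threshold; out-of-range values leave the old one intact. */
static int gc_set_percent(struct gc_state *gc, long val)
{
	if (val < 0 || val > GC_PERCENT_MAX)
		return -EINVAL;
	gc->gc_percent = (uint32_t)val;
	return 0;
}

/*
 * Watermark check: GC is wanted once used/total exceeds gc_percent/100.
 * Cross-multiplying in 64 bits avoids both division and overflow.
 */
static bool gc_wanted(const struct gc_state *gc)
{
	assert(gc->segs_total > 0 && gc->segs_used <= gc->segs_total);
	return (uint64_t)gc->segs_used * 100 >
	       (uint64_t)gc->segs_total * gc->gc_percent;
}
```

With 100 segments of which 75 are used, a threshold of 70 asks for GC while
a threshold of 80 does not.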

Status line reports super-block flags, segment counts, GC threshold and
the three tail/head pointers (see the RST document for details).

----------------------------------------------------------------------
Example
----------------------------------------------------------------------
# 1. choose the pmem device (cache) and the SSD (origin)
pmem=/dev/pmem0
ssd=/dev/sdb

# 2. map a pcache device in front.
dmsetup create pcache_sdb --table \
  "0 $(blockdev --getsz $ssd) pcache $pmem $ssd writeback true"

# 3. format and mount
mkfs.ext4 /dev/mapper/pcache_sdb
mount /dev/mapper/pcache_sdb /mnt

# 4. tune GC to 80 %
dmsetup message pcache_sdb 0 gc_percent 80

# 5. monitor
watch -n1 'dmsetup status pcache_sdb'

Testing:
    The test suite for pcache is hosted in the dtg-tests project, built
    on top of the Avocado Framework. It currently includes:
        - Management-related test cases for pcache devices.
        - Data verification and validation tests.
        - Complete execution of the xfstests suite under multiple
          configurations.

Thanx
Dongsheng

Dongsheng Yang (11):
  dm-pcache: add pcache_internal.h
  dm-pcache: add backing device management
  dm-pcache: add cache device
  dm-pcache: add segment layer
  dm-pcache: add cache_segment
  dm-pcache: add cache_writeback
  dm-pcache: add cache_gc
  dm-pcache: add cache_key
  dm-pcache: add cache_req
  dm-pcache: add cache core
  dm-pcache: initial dm-pcache target

 .../admin-guide/device-mapper/dm-pcache.rst   | 200 ++++
 MAINTAINERS                                   |   9 +
 drivers/md/Kconfig                            |   2 +
 drivers/md/Makefile                           |   1 +
 drivers/md/dm-pcache/Kconfig                  |  17 +
 drivers/md/dm-pcache/Makefile                 |   3 +
 drivers/md/dm-pcache/backing_dev.c            | 305 ++++++
 drivers/md/dm-pcache/backing_dev.h            |  84 ++
 drivers/md/dm-pcache/cache.c                  | 443 +++++++++
 drivers/md/dm-pcache/cache.h                  | 601 ++++++++++++
 drivers/md/dm-pcache/cache_dev.c              | 310 ++++++
 drivers/md/dm-pcache/cache_dev.h              |  70 ++
 drivers/md/dm-pcache/cache_gc.c               | 170 ++++
 drivers/md/dm-pcache/cache_key.c              | 907 ++++++++++++++++++
 drivers/md/dm-pcache/cache_req.c              | 810 ++++++++++++++++
 drivers/md/dm-pcache/cache_segment.c          | 300 ++++++
 drivers/md/dm-pcache/cache_writeback.c        | 239 +++++
 drivers/md/dm-pcache/dm_pcache.c              | 388 ++++++++
 drivers/md/dm-pcache/dm_pcache.h              |  61 ++
 drivers/md/dm-pcache/pcache_internal.h        | 116 +++
 drivers/md/dm-pcache/segment.c                |  63 ++
 drivers/md/dm-pcache/segment.h                |  74 ++
 22 files changed, 5173 insertions(+)
 create mode 100644 Documentation/admin-guide/device-mapper/dm-pcache.rst
 create mode 100644 drivers/md/dm-pcache/Kconfig
 create mode 100644 drivers/md/dm-pcache/Makefile
 create mode 100644 drivers/md/dm-pcache/backing_dev.c
 create mode 100644 drivers/md/dm-pcache/backing_dev.h
 create mode 100644 drivers/md/dm-pcache/cache.c
 create mode 100644 drivers/md/dm-pcache/cache.h
 create mode 100644 drivers/md/dm-pcache/cache_dev.c
 create mode 100644 drivers/md/dm-pcache/cache_dev.h
 create mode 100644 drivers/md/dm-pcache/cache_gc.c
 create mode 100644 drivers/md/dm-pcache/cache_key.c
 create mode 100644 drivers/md/dm-pcache/cache_req.c
 create mode 100644 drivers/md/dm-pcache/cache_segment.c
 create mode 100644 drivers/md/dm-pcache/cache_writeback.c
 create mode 100644 drivers/md/dm-pcache/dm_pcache.c
 create mode 100644 drivers/md/dm-pcache/dm_pcache.h
 create mode 100644 drivers/md/dm-pcache/pcache_internal.h
 create mode 100644 drivers/md/dm-pcache/segment.c
 create mode 100644 drivers/md/dm-pcache/segment.h

-- 
2.34.1

