From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Dongsheng Yang <dongsheng.yang@linux.dev>
Cc: <mpatocka@redhat.com>, <agk@redhat.com>, <snitzer@kernel.org>,
<axboe@kernel.dk>, <hch@lst.de>, <dan.j.williams@intel.com>,
<linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-cxl@vger.kernel.org>, <nvdimm@lists.linux.dev>,
<dm-devel@lists.linux.dev>
Subject: Re: [PATCH v1 03/11] dm-pcache: add cache device
Date: Tue, 1 Jul 2025 15:07:21 +0100
Message-ID: <20250701150721.00003e67@huawei.com>
In-Reply-To: <20250624073359.2041340-4-dongsheng.yang@linux.dev>
On Tue, 24 Jun 2025 07:33:50 +0000
Dongsheng Yang <dongsheng.yang@linux.dev> wrote:
> Add cache_dev.{c,h} to manage the persistent-memory device that stores
> all pcache metadata and data segments. Splitting this logic out keeps
> the main dm-pcache code focused on policy while cache_dev handles the
> low-level interaction with the DAX block device.
>
> * DAX mapping
> - Opens the underlying device via dm_get_device().
> - Uses dax_direct_access() to obtain a direct linear mapping; falls
> back to vmap() when the range is fragmented.
>
> * On-disk layout
> ┌─ 4 KB ─┐ super-block (SB)
> ├─ 4 KB ─┤ cache_info[0]
> ├─ 4 KB ─┤ cache_info[1]
> ├─ 4 KB ─┤ cache_ctrl
> └─ ... ─┘ segments
> Constants and macros in the header expose offsets and sizes.
>
> * Super-block handling
> - sb_read(), sb_validate(), sb_init() verify magic, CRC32 and host
> endianness (flag *PCACHE_SB_F_BIGENDIAN*).
> - Formatting zeroes the metadata replicas and initialises the segment
> bitmap when the SB is blank.
>
> * Segment allocator
> - Bitmap protected by seg_lock; find_next_zero_bit() yields the next
> free 16 MB segment.
>
> * Lifecycle helpers
> - cache_dev_start()/stop() encapsulate init/exit and are invoked by
> dm-pcache core.
> - Gracefully handles errors: CRC mismatch, wrong endianness, device
> too small (< 512 MB), or failed DAX mapping.
>
> Signed-off-by: Dongsheng Yang <dongsheng.yang@linux.dev>
> ---
> drivers/md/dm-pcache/cache_dev.c | 299 +++++++++++++++++++++++++++++++
> drivers/md/dm-pcache/cache_dev.h | 70 ++++++++
> 2 files changed, 369 insertions(+)
> create mode 100644 drivers/md/dm-pcache/cache_dev.c
> create mode 100644 drivers/md/dm-pcache/cache_dev.h
>
> diff --git a/drivers/md/dm-pcache/cache_dev.c b/drivers/md/dm-pcache/cache_dev.c
> new file mode 100644
> index 000000000000..4dcebc9c167e
> --- /dev/null
> +++ b/drivers/md/dm-pcache/cache_dev.c
> @@ -0,0 +1,299 @@
> +static int build_vmap(struct dax_device *dax_dev, long total_pages, void **vaddr)
> +{
> + struct page **pages;
> + long i = 0, chunk;
> + pfn_t pfn;
> + int ret;
> +
> + pages = vmalloc_array(total_pages, sizeof(struct page *));
Perhaps, if DM allows it, use __free() here to avoid the need to clean this up
manually and to allow early returns on errors.
> + if (!pages)
> + return -ENOMEM;
> +
> + do {
> + chunk = dax_direct_access(dax_dev, i, total_pages - i,
> + DAX_ACCESS, NULL, &pfn);
> + if (chunk <= 0) {
> + ret = chunk ? chunk : -EINVAL;
> + goto out_free;
> + }
> +
> + if (!pfn_t_has_page(pfn)) {
> + ret = -EOPNOTSUPP;
> + goto out_free;
> + }
> +
> + while (chunk-- && i < total_pages) {
> + pages[i++] = pfn_t_to_page(pfn);
> + pfn.val++;
> + if (!(i & 15))
> + cond_resched();
> + }
> + } while (i < total_pages);
> +
> + *vaddr = vmap(pages, total_pages, VM_MAP, PAGE_KERNEL);
> + if (!*vaddr)
> + ret = -ENOMEM;
> +out_free:
> + vfree(pages);
> + return ret;
> +}
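To illustrate the __free() suggestion for build_vmap() above, here is a minimal
userspace sketch, not kernel code. The kernel helper is built on the compiler's
cleanup attribute, which the sketch uses directly; auto_free, free_ptr() and
build_table() are made-up names for illustration only. Every error path becomes
a plain return, and the success path has to return 0 explicitly (as quoted, ret
does not look to be assigned when vmap() succeeds).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's __free(): calls free() at scope exit. */
static void free_ptr(void *p)
{
	free(*(void **)p);
}
#define auto_free __attribute__((cleanup(free_ptr)))

/*
 * Hypothetical analogue of build_vmap(): fill a temporary array, then derive
 * a result from it. The temporary array is released automatically on every
 * return, so no out_free label is needed.
 */
static int build_table(long total, long fail_at, long *sum)
{
	long *tmp auto_free = calloc((size_t)total, sizeof(*tmp));
	long i;

	assert(sum);
	if (!tmp)
		return -ENOMEM;

	for (i = 0; i < total; i++) {
		if (i == fail_at)
			return -EOPNOTSUPP;	/* early return, tmp still freed */
		tmp[i] = i;
	}

	*sum = 0;
	for (i = 0; i < total; i++)
		*sum += tmp[i];

	return 0;	/* success value is explicit, not a leftover ret */
}
```

In the kernel the declaration would presumably be something along the lines of
struct page **pages __free(kvfree) = vmalloc_array(...), assuming DM is happy
with cleanup.h usage.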
> +
> +static int cache_dev_dax_init(struct pcache_cache_dev *cache_dev)
> +{
> + struct dm_pcache *pcache = CACHE_DEV_TO_PCACHE(cache_dev);
> + struct dax_device *dax_dev;
> + long total_pages, mapped_pages;
> + u64 bdev_size;
> + void *vaddr;
> + int ret;
> + int id;
Combine ret and id on one line (int ret, id;).
> + pfn_t pfn;
> +
> + dax_dev = cache_dev->dm_dev->dax_dev;
> + /* total size check */
> + bdev_size = bdev_nr_bytes(cache_dev->dm_dev->bdev);
> + if (bdev_size < PCACHE_CACHE_DEV_SIZE_MIN) {
> + pcache_dev_err(pcache, "dax device is too small, required at least %llu",
> + PCACHE_CACHE_DEV_SIZE_MIN);
> + ret = -ENOSPC;
> + goto out;
Return directly here instead of going to the label:
	return -ENOSPC;
> +int cache_dev_start(struct dm_pcache *pcache)
> +{
> + struct pcache_cache_dev *cache_dev = &pcache->cache_dev;
> + struct pcache_sb sb;
> + bool format = false;
> + int ret;
> +
> + mutex_init(&cache_dev->seg_lock);
> +
> + ret = cache_dev_dax_init(cache_dev);
> + if (ret) {
> + pcache_dev_err(pcache, "failed to init cache_dev %s via dax way: %d.",
> + cache_dev->dm_dev->name, ret);
> + goto err;
> + }
> +
> + ret = sb_read(cache_dev, &sb);
> + if (ret)
> + goto dax_release;
> +
> + if (le64_to_cpu(sb.magic) == 0) {
> + format = true;
> + ret = sb_init(cache_dev, &sb);
> + if (ret < 0)
> + goto dax_release;
> + }
> +
> + ret = sb_validate(cache_dev, &sb);
> + if (ret)
> + goto dax_release;
> +
> + cache_dev->sb_flags = le32_to_cpu(sb.flags);
> + ret = cache_dev_init(cache_dev, sb.seg_num);
> + if (ret)
> + goto dax_release;
> +
> + if (format)
> + sb_write(cache_dev, &sb);
> +
> + return 0;
> +
> +dax_release:
> + cache_dev_dax_exit(cache_dev);
> +err:
In cases like this, where the label does nothing but return, just return
directly instead of jumping to it. That generally gives more readable code.
> + return ret;
> +}
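A minimal userspace sketch of the shape suggested for cache_dev_start() above:
return directly while there is nothing to undo, and keep a label only where
real cleanup happens. struct demo_dev, demo_map(), demo_unmap() and
demo_start() are hypothetical stand-ins, not the pcache functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the pcache structures. */
struct demo_dev {
	bool mapped;
};

static int demo_map(struct demo_dev *dev, bool fail)
{
	if (fail)
		return -EIO;
	dev->mapped = true;
	return 0;
}

static void demo_unmap(struct demo_dev *dev)
{
	dev->mapped = false;
}

static int demo_start(struct demo_dev *dev, bool map_fails, bool sb_fails)
{
	int ret;

	assert(dev);

	ret = demo_map(dev, map_fails);
	if (ret)
		return ret;	/* nothing to undo yet: return directly */

	if (sb_fails) {
		ret = -EINVAL;
		goto unmap;	/* real cleanup needed: keep this label */
	}

	return 0;

unmap:
	demo_unmap(dev);
	return ret;
}
```

Only the label that actually releases something survives, so each goto tells
the reader that there is state to unwind.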
> +
> +int cache_dev_get_empty_segment_id(struct pcache_cache_dev *cache_dev, u32 *seg_id)
> +{
> + int ret;
> +
> + mutex_lock(&cache_dev->seg_lock);
If DM is fine with guard(), use it here.
> + *seg_id = find_next_zero_bit(cache_dev->seg_bitmap, cache_dev->seg_num, 0);
> + if (*seg_id == cache_dev->seg_num) {
> + ret = -ENOSPC;
> + goto unlock;
> + }
> +
> + set_bit(*seg_id, cache_dev->seg_bitmap);
> + ret = 0;
> +unlock:
> + mutex_unlock(&cache_dev->seg_lock);
> + return ret;
> +}
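To show what the guard() suggestion buys in cache_dev_get_empty_segment_id()
above, here is a minimal userspace sketch. It imitates the scoped lock with a
pthread mutex and the compiler's cleanup attribute; demo_guard(), struct
demo_alloc and the 8-segment bitmap are assumptions made for the example, not
the kernel API or the pcache layout.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_SEG_NUM 8u

/* Hypothetical allocator state: bit n set means segment n is in use. */
struct demo_alloc {
	pthread_mutex_t lock;
	uint32_t bitmap;
};

static void demo_unlock(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Take the lock now; drop it automatically when the enclosing scope ends. */
#define demo_guard(m) \
	pthread_mutex_t *guard_ __attribute__((cleanup(demo_unlock), unused)) = \
		(pthread_mutex_lock(m), (m))

static int demo_get_empty_segment_id(struct demo_alloc *a, uint32_t *seg_id)
{
	uint32_t i;

	assert(a && seg_id);
	demo_guard(&a->lock);

	for (i = 0; i < DEMO_SEG_NUM; i++) {
		if (!(a->bitmap & (1u << i))) {
			a->bitmap |= 1u << i;
			*seg_id = i;
			return 0;	/* lock released on return */
		}
	}

	return -ENOSPC;		/* released here too, no unlock label */
}
```

With the lock scoped this way, the ret variable and the unlock label both go
away, and the -ENOSPC case becomes a direct return.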