Linux RAID subsystem development
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Christoph Hellwig <hch@lst.de>, Yu Kuai <yukuai1@huaweicloud.com>
Cc: xni@redhat.com, colyli@kernel.org, axboe@kernel.dk,
	agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com,
	song@kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, dm-devel@lists.linux.dev,
	linux-raid@vger.kernel.org, yi.zhang@huawei.com,
	yangerkun@huawei.com, kbusch@kernel.org,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH RFC v2 00/14] md: introduce a new lockless bitmap
Date: Wed, 9 Apr 2025 17:27:11 +0800
Message-ID: <115c3b08-aff1-dd97-fe6a-7901452ce62c@huaweicloud.com>
In-Reply-To: <20250409083208.GA2326@lst.de>

Hi,

On 2025/04/09 16:32, Christoph Hellwig wrote:
> On Sat, Mar 29, 2025 at 09:11:13AM +0800, Yu Kuai wrote:
>> The purpose here is to hide the low-level bitmap IO implementation
>> behind the API disk->submit_bio(), so that the bitmap IO can be
>> converted to buffered IO on the bdev_file. This is the easiest way I
>> can think of to reuse the pagecache, with its natural ability for dirty
>> page writeback. I did think about creating a new anon file and
>> implementing a new file_operations, but this would be much more
>> complicated.
> 
> I've started looking at this a bit now, sorry for the delay.
> 
> As far as I can see you use the bitmap file just so that you have your
> own struct address_space and thus page cache instance and then call
> read_mapping_page and filemap_write_and_wait_range on it right?
Yes.

> 
> For that you'd be much better off just creating your own trivial
> file_system_type with an inode fully controlled by your driver
> that has a trivial set of address_space ops instead of oddly
> mixing with the block layer.

Yes, this is exactly what I meant by implementing a new file_operations
(and address_space ops). I wanted to do this the easy way and just reuse
the raw block device ops; this way I only need to implement the
submit_bio op for the new hidden disk.

I can try a new fs type if we really think this solution is too hacky;
however, it will take many more lines of code. :(

> 
> Note that either way I'm not sure using the page cache here is all
> that good an idea, as we're at the bottom of the I/O stack and
> thus memory allocations can very easily deadlock.

Yes. For the bitmap pages, this set takes the easy way: it just reads
and pins all related pages while loading the bitmap, for two reasons:

1) We don't need to allocate and read pages from the IO path (in the
first RFC version, I used a worker to do that).
2) In the first RFC version, I looked up and got the page in the IO
path; it turns out the page reference count is an *atomic*, and the
overhead is not acceptable.

The only action triggered from the IO path is that, if a bitmap page is
dirty, filemap_write_and_wait_range() is called from an async worker,
the same as the old bitmap, to flush the dirty bitmap pages.
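
A minimal user-space sketch of the pin-at-load idea described above.
This is only an analogy and not the md-llbitmap code: struct bm_page,
bm_cache_load(), bm_set_byte() and bm_flush() are hypothetical names,
heap buffers stand in for pagecache pages, and the flush function only
counts the pages a real worker would write back.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ 4096

/* Hypothetical stand-in for one cached bitmap page. */
struct bm_page {
	atomic_int refcount;	/* taken once at load time, never in the IO path */
	atomic_bool dirty;
	unsigned char data[PAGE_SZ];
};

struct bm_cache {
	size_t nr_pages;
	struct bm_page **pages;	/* every page stays pinned until unload */
};

void bm_cache_free(struct bm_cache *c)
{
	for (size_t i = 0; i < c->nr_pages; i++)
		free(c->pages[i]);
	free(c->pages);
	c->pages = NULL;
	c->nr_pages = 0;
}

/* Load time: allocate and pin every page up front. */
int bm_cache_load(struct bm_cache *c, size_t nr_pages)
{
	c->pages = calloc(nr_pages, sizeof(*c->pages));
	if (!c->pages)
		return -1;
	c->nr_pages = nr_pages;
	for (size_t i = 0; i < nr_pages; i++) {
		c->pages[i] = calloc(1, sizeof(struct bm_page));
		if (!c->pages[i]) {
			bm_cache_free(c);
			return -1;
		}
		atomic_init(&c->pages[i]->refcount, 1);
		atomic_init(&c->pages[i]->dirty, false);
	}
	return 0;
}

/* IO path: no allocation, no lookup, no refcount update. */
void bm_set_byte(struct bm_cache *c, size_t offset, unsigned char val)
{
	assert(offset / PAGE_SZ < c->nr_pages);
	struct bm_page *p = c->pages[offset / PAGE_SZ];

	p->data[offset % PAGE_SZ] = val;
	atomic_store_explicit(&p->dirty, true, memory_order_release);
}

/* Async worker: flush dirty pages; returns how many it would write. */
size_t bm_flush(struct bm_cache *c)
{
	size_t written = 0;

	for (size_t i = 0; i < c->nr_pages; i++) {
		if (atomic_exchange(&c->pages[i]->dirty, false))
			written++;	/* real code would submit the page here */
	}
	return written;
}
```

In this sketch the only work left in the IO path is a store to the page
and a dirty flag update; the reference count is touched once at load
time and is never incremented per IO.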
> 
> What speaks against using your own folios explicitly allocated at
> probe time and then just doing manual submit_bio on that?  That's
> probably not much more code but a lot more robust.

I'm not quite sure I understand you correctly. Do you mean not using
the pagecache for bitmap IO and manually creating BIOs like the old
bitmap does, while inventing a new solution for synchronization instead
of the global spin_lock from the old bitmap?

Thanks,
Kuai

> 
> Also a high level note: the bitmap_operations aren't a very nice
> interface.  A lot of methods are empty and should just be called
> conditionally.  Or even better you'd do away with the expensive
> indirect calls and just directly call either the old or new
> bitmap code.
> 
>> Meanwhile, the bitmap file for the old bitmap will be removed sooner
>> or later, and this bdev_file implementation will be compatible with
>> the bitmap file as well.
> 
> Which would also mean that at that point the operations vector would
> be pointless, so we might as well not add it to start with.
> 
> .
> 


Thread overview: 27+ messages
2025-03-28  6:08 [PATCH RFC v2 00/14] md: introduce a new lockless bitmap Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 01/14] block: factor out a helper bdev_file_alloc() Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 02/14] md/md-bitmap: pass discard information to bitmap_{start, end}write Yu Kuai
2025-04-04  9:29   ` Christoph Hellwig
2025-04-07  1:19     ` Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 03/14] md/md-bitmap: remove parameter slot from bitmap_create() Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 04/14] md: add a new sysfs api bitmap_version Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 05/14] md: delay registeration of bitmap_ops until creating bitmap Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 06/14] md/md-llbitmap: implement bit state machine Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 07/14] md/md-llbitmap: implement hidden disk to manage bitmap IO Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 08/14] md/md-llbitmap: implement APIs for page level dirty bits synchronization Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 09/14] md/md-llbitmap: implement APIs to mange bitmap lifetime Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 10/14] md/md-llbitmap: implement APIs to dirty bits and clear bits Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 11/14] md/md-llbitmap: implement APIs for sync_thread Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 12/14] md/md-llbitmap: implement all bitmap operations Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 13/14] md/md-llbitmap: implement sysfs APIs Yu Kuai
2025-03-28  6:08 ` [PATCH RFC v2 14/14] md/md-llbitmap: add Kconfig Yu Kuai
2025-03-28 11:06 ` [PATCH RFC v2 00/14] md: introduce a new lockless bitmap Christoph Hellwig
2025-03-29  1:11   ` Yu Kuai
2025-04-09  8:32     ` Christoph Hellwig
2025-04-09  9:27       ` Yu Kuai [this message]
2025-04-09  9:40         ` Christoph Hellwig
2025-04-11  1:36           ` Yu Kuai
2025-04-19  8:46             ` Yu Kuai
2025-04-21  7:39               ` Christoph Hellwig
2025-04-04  9:27 ` Christoph Hellwig
2025-04-07  1:09   ` Yu Kuai
