From: David Howells <dhowells@redhat.com>
To: Viacheslav Dubeyko <slava@dubeyko.com>,
Alex Markuze <amarkuze@redhat.com>
Cc: David Howells <dhowells@redhat.com>,
Ilya Dryomov <idryomov@gmail.com>,
Jeff Layton <jlayton@kernel.org>,
Dongsheng Yang <dongsheng.yang@easystack.cn>,
ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 00/35] ceph, rbd, netfs: Make ceph fully use netfslib
Date: Thu, 13 Mar 2025 23:32:52 +0000
Message-ID: <20250313233341.1675324-1-dhowells@redhat.com>
Hi Viacheslav, Alex,
[!] NOTE: This is a preview of a work in progress. rbd works and ceph
works for plain I/O, but content crypto does not.
[!] NOTE: These patches are based on some other sets of patches not
included in this posting. They are, however, included in the git
branch mentioned below.
These patches do a number of things:
(1) (Mostly) collapse the different I/O types (PAGES, PAGELIST, BVECS,
BIO) down to a single one.
I added a new type, ceph_databuf, to make this easier. The page list
is attached to that as a bio_vec[] with an iov_iter, but could also be
some other type supported by the iov_iter. The iov_iter defines the
data or buffer to be used. I have an additional iov_iter type
implemented that allows use of a straight folio[] or page[] instead of
a bio_vec[] that I can deploy if that proves more useful.
(2) RBD is modified to stop using the removed page-list types and, I
think, now fully works.
(3) Ceph is mostly converted to using netfslib. At this point, it can do
plain reads and writes, but content crypto is currently
non-functional.
Multipage folios are enabled and work (all the support for that is
hidden inside of netfslib).
(4) The old Ceph VFS/VM I/O API implementation is removed. With that, as
the code currently stands, the patches overall result in a ~2500 LoC
reduction. That figure may shrink as some more bits need transferring
from the old code to the new code.
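As a rough illustration of item (1), here is a minimal userspace sketch of
a single data container that holds a segment array plus iterator state.
This is not the kernel's struct ceph_databuf: the names demo_databuf,
demo_databuf_add() and demo_databuf_copy_out() are hypothetical, struct
iovec stands in for bio_vec, and the cursor fields stand in for iov_iter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct ceph_databuf. */
#define DEMO_MAX_SEGS 8

struct demo_databuf {
	struct iovec segs[DEMO_MAX_SEGS]; /* stands in for the bio_vec[] */
	size_t nr_segs;                   /* number of segments in use */
	size_t cur_seg;                   /* iterator: current segment */
	size_t seg_off;                   /* iterator: offset in that segment */
	size_t remaining;                 /* iterator: bytes left to consume */
};

/* Append one buffer segment; returns 0 on success, -1 if the table is full. */
static int demo_databuf_add(struct demo_databuf *d, void *base, size_t len)
{
	if (d->nr_segs >= DEMO_MAX_SEGS)
		return -1;
	d->segs[d->nr_segs].iov_base = base;
	d->segs[d->nr_segs].iov_len = len;
	d->nr_segs++;
	d->remaining += len;
	return 0;
}

/*
 * Copy up to len bytes out of the container and advance the iterator,
 * crossing segment boundaries as needed. Returns the bytes copied.
 */
static size_t demo_databuf_copy_out(struct demo_databuf *d, void *dst,
				    size_t len)
{
	size_t done = 0;

	while (done < len && d->cur_seg < d->nr_segs) {
		struct iovec *seg = &d->segs[d->cur_seg];
		size_t avail = seg->iov_len - d->seg_off;
		size_t n = (len - done < avail) ? len - done : avail;

		memcpy((char *)dst + done, (char *)seg->iov_base + d->seg_off, n);
		assert(d->remaining >= n);
		done += n;
		d->seg_off += n;
		d->remaining -= n;
		if (d->seg_off == seg->iov_len) {
			/* Segment exhausted: move to the start of the next one. */
			d->cur_seg++;
			d->seg_off = 0;
		}
	}
	return done;
}
```

The point of the sketch is only that callers see one container type and
one cursor, whatever the underlying segment layout is.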
The conversion isn't quite complete:
(1) ceph_osd_linger_request::preply_pages needs switching over to a
ceph_databuf, but I haven't yet managed to work out how the pages that
handle_watch_notify() sticks in there come about.
(2) I haven't altered data transmission in net/ceph/messenger*.c yet. The
aim is to reduce it to a single sendmsg() call for each ceph_msg_data
struct, using the iov_iter therein.
(3) The data reception routines in net/ceph/messenger*.c also need
modifying to pass each ceph_msg_data::iter to recvmsg() in turn.
(4) It might be possible to merge struct ceph_databuf into struct
ceph_msg_data and eliminate the former.
(5) fs/ceph/ still needs a bit more work to clean up the use of page
arrays.
(6) I would like to replace the front and middle buffers with a
ceph_databuf, vmapping them when we need to access them.
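To illustrate the aim in items (2) and (3), here is a userspace analogy
of sending and receiving a multi-segment blob with one sendmsg() or
recvmsg() call each, instead of looping over the segments. It is only a
sketch: demo_send_blob() and demo_recv_blob() are hypothetical names, and
a struct iovec array stands in for the iov_iter carried by a
ceph_msg_data. The in-kernel messenger code would look different.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: transmit every segment of a blob with a single
 * sendmsg() call. On a stream socket a large blob may be sent only
 * partially, so a real caller would check the return value and resume.
 */
static ssize_t demo_send_blob(int fd, struct iovec *segs, size_t nr_segs)
{
	struct msghdr msg = {0};

	assert(segs != NULL && nr_segs > 0);
	msg.msg_iov = segs;
	msg.msg_iovlen = nr_segs;
	return sendmsg(fd, &msg, 0);
}

/*
 * Hypothetical helper: fill every segment of a blob with a single
 * recvmsg() call. MSG_WAITALL asks the kernel to wait for the full
 * amount, though a signal or error can still cut the read short.
 */
static ssize_t demo_recv_blob(int fd, struct iovec *segs, size_t nr_segs)
{
	struct msghdr msg = {0};

	assert(segs != NULL && nr_segs > 0);
	msg.msg_iov = segs;
	msg.msg_iovlen = nr_segs;
	return recvmsg(fd, &msg, MSG_WAITALL);
}
```

Both helpers return the byte count reported by the kernel, so a caller
can compare it with the total segment length to detect a short transfer.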
I added a kmap_ceph_databuf_page() macro that gets a page and calls
kmap_local_page() on it. This hides the bvec[] inside the databuf, which
should make it easier to replace.
Anyway, if anyone has any thoughts...
I've pushed the patches here also:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=ceph-iter
David
David Howells (35):
ceph: Fix incorrect flush end position calculation
libceph: Rename alignment to offset
libceph: Add a new data container type, ceph_databuf
ceph: Convert ceph_mds_request::r_pagelist to a databuf
libceph: Add functions to add ceph_databufs to requests
rbd: Use ceph_databuf for rbd_obj_read_sync()
libceph: Change ceph_osdc_call()'s reply to a ceph_databuf
libceph: Unexport osd_req_op_cls_request_data_pages()
libceph: Remove osd_req_op_cls_response_data_pages()
libceph: Convert notify_id_pages to a ceph_databuf
ceph: Use ceph_databuf in DIO
libceph: Bypass the messenger-v1 Tx loop for databuf/iter data blobs
rbd: Switch from using bvec_iter to iov_iter
libceph: Remove bvec and bio data container types
libceph: Make osd_req_op_cls_init() use a ceph_databuf and map it
libceph: Convert req_page of ceph_osdc_call() to ceph_databuf
libceph, rbd: Use ceph_databuf encoding start/stop
libceph, rbd: Convert some page arrays to ceph_databuf
libceph, ceph: Convert users of ceph_pagelist to ceph_databuf
libceph: Remove ceph_pagelist
libceph: Make notify code use ceph_databuf_enc_start/stop
libceph, rbd: Convert ceph_osdc_notify() reply to ceph_databuf
rbd: Use ceph_databuf_enc_start/stop()
ceph: Make ceph_calc_file_object_mapping() return size as size_t
ceph: Wrap POSIX_FADV_WILLNEED to get caps
ceph: Kill ceph_rw_context
netfs: Pass extra write context to write functions
netfs: Adjust group handling
netfs: Allow fs-private data to be handed through to request alloc
netfs: Make netfs_page_mkwrite() use folio_mkwrite_check_truncate()
netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int
netfs: Add some more RMW support for ceph
ceph: Use netfslib [INCOMPLETE]
ceph: Enable multipage folios for ceph files
ceph: Remove old I/O API bits
drivers/block/rbd.c | 904 ++++++--------
fs/9p/vfs_file.c | 2 +-
fs/afs/write.c | 2 +-
fs/ceph/Makefile | 2 +-
fs/ceph/acl.c | 39 +-
fs/ceph/addr.c | 2009 +------------------------------
fs/ceph/cache.h | 5 +
fs/ceph/caps.c | 2 +-
fs/ceph/crypto.c | 56 +-
fs/ceph/file.c | 1810 +++-------------------------
fs/ceph/inode.c | 116 +-
fs/ceph/ioctl.c | 2 +-
fs/ceph/locks.c | 23 +-
fs/ceph/mds_client.c | 134 +--
fs/ceph/mds_client.h | 2 +-
fs/ceph/rdwr.c | 1006 ++++++++++++++++
fs/ceph/super.h | 81 +-
fs/ceph/xattr.c | 69 +-
fs/netfs/buffered_read.c | 11 +-
fs/netfs/buffered_write.c | 48 +-
fs/netfs/direct_read.c | 83 +-
fs/netfs/direct_write.c | 3 +-
fs/netfs/internal.h | 40 +-
fs/netfs/main.c | 5 +-
fs/netfs/objects.c | 4 +
fs/netfs/read_collect.c | 2 +
fs/netfs/read_pgpriv2.c | 2 +-
fs/netfs/read_single.c | 2 +-
fs/netfs/write_issue.c | 55 +-
fs/netfs/write_retry.c | 5 +-
fs/smb/client/file.c | 4 +-
include/linux/ceph/databuf.h | 169 +++
include/linux/ceph/decode.h | 4 +-
include/linux/ceph/libceph.h | 3 +-
include/linux/ceph/messenger.h | 122 +-
include/linux/ceph/osd_client.h | 87 +-
include/linux/ceph/pagelist.h | 60 -
include/linux/ceph/striper.h | 60 +-
include/linux/netfs.h | 89 +-
include/trace/events/netfs.h | 3 +
net/ceph/Makefile | 5 +-
net/ceph/cls_lock_client.c | 200 ++-
net/ceph/databuf.c | 200 +++
net/ceph/messenger.c | 310 +----
net/ceph/messenger_v1.c | 76 +-
net/ceph/mon_client.c | 10 +-
net/ceph/osd_client.c | 510 +++-----
net/ceph/pagelist.c | 133 --
net/ceph/snapshot.c | 20 +-
net/ceph/striper.c | 57 +-
50 files changed, 2996 insertions(+), 5650 deletions(-)
create mode 100644 fs/ceph/rdwr.c
create mode 100644 include/linux/ceph/databuf.h
delete mode 100644 include/linux/ceph/pagelist.h
create mode 100644 net/ceph/databuf.c
delete mode 100644 net/ceph/pagelist.c
Thread overview: 72+ messages
2025-03-13 23:32 David Howells [this message]
2025-03-13 23:32 ` [RFC PATCH 01/35] ceph: Fix incorrect flush end position calculation David Howells
2025-03-13 23:32 ` [RFC PATCH 02/35] libceph: Rename alignment to offset David Howells
2025-03-14 19:04 ` Viacheslav Dubeyko
2025-03-14 20:01 ` David Howells
2025-03-13 23:32 ` [RFC PATCH 03/35] libceph: Add a new data container type, ceph_databuf David Howells
2025-03-14 20:06 ` Viacheslav Dubeyko
2025-03-17 11:27 ` David Howells
2025-03-13 23:32 ` [RFC PATCH 04/35] ceph: Convert ceph_mds_request::r_pagelist to a databuf David Howells
2025-03-14 22:27 ` slava
2025-03-17 11:52 ` David Howells
2025-03-20 20:34 ` Viacheslav Dubeyko
2025-03-20 22:01 ` David Howells
2025-03-13 23:32 ` [RFC PATCH 05/35] libceph: Add functions to add ceph_databufs to requests David Howells
2025-03-13 23:32 ` [RFC PATCH 06/35] rbd: Use ceph_databuf for rbd_obj_read_sync() David Howells
2025-03-17 19:08 ` Viacheslav Dubeyko
2025-04-11 13:48 ` David Howells
2025-03-13 23:32 ` [RFC PATCH 07/35] libceph: Change ceph_osdc_call()'s reply to a ceph_databuf David Howells
2025-03-17 19:41 ` Viacheslav Dubeyko
2025-03-17 22:12 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 08/35] libceph: Unexport osd_req_op_cls_request_data_pages() David Howells
2025-03-13 23:33 ` [RFC PATCH 09/35] libceph: Remove osd_req_op_cls_response_data_pages() David Howells
2025-03-13 23:33 ` [RFC PATCH 10/35] libceph: Convert notify_id_pages to a ceph_databuf David Howells
2025-03-13 23:33 ` [RFC PATCH 11/35] ceph: Use ceph_databuf in DIO David Howells
2025-03-17 20:03 ` Viacheslav Dubeyko
2025-03-17 22:26 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 12/35] libceph: Bypass the messenger-v1 Tx loop for databuf/iter data blobs David Howells
2025-03-13 23:33 ` [RFC PATCH 13/35] rbd: Switch from using bvec_iter to iov_iter David Howells
2025-03-18 19:38 ` Viacheslav Dubeyko
2025-03-18 22:13 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 14/35] libceph: Remove bvec and bio data container types David Howells
2025-03-13 23:33 ` [RFC PATCH 15/35] libceph: Make osd_req_op_cls_init() use a ceph_databuf and map it David Howells
2025-03-13 23:33 ` [RFC PATCH 16/35] libceph: Convert req_page of ceph_osdc_call() to ceph_databuf David Howells
2025-03-13 23:33 ` [RFC PATCH 17/35] libceph, rbd: Use ceph_databuf encoding start/stop David Howells
2025-03-18 19:59 ` Viacheslav Dubeyko
2025-03-18 22:19 ` David Howells
2025-03-20 21:45 ` Viacheslav Dubeyko
2025-03-13 23:33 ` [RFC PATCH 18/35] libceph, rbd: Convert some page arrays to ceph_databuf David Howells
2025-03-18 20:02 ` Viacheslav Dubeyko
2025-03-18 22:25 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 19/35] libceph, ceph: Convert users of ceph_pagelist " David Howells
2025-03-18 20:09 ` Viacheslav Dubeyko
2025-03-18 22:27 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 20/35] libceph: Remove ceph_pagelist David Howells
2025-03-13 23:33 ` [RFC PATCH 21/35] libceph: Make notify code use ceph_databuf_enc_start/stop David Howells
2025-03-18 20:12 ` Viacheslav Dubeyko
2025-03-18 22:36 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 22/35] libceph, rbd: Convert ceph_osdc_notify() reply to ceph_databuf David Howells
2025-03-19 0:08 ` Viacheslav Dubeyko
2025-03-20 14:44 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 23/35] rbd: Use ceph_databuf_enc_start/stop() David Howells
2025-03-19 0:32 ` Viacheslav Dubeyko
2025-03-20 14:59 ` Why use plain numbers and totals rather than predef'd constants for RPC sizes? David Howells
2025-03-20 21:48 ` Viacheslav Dubeyko
2025-03-13 23:33 ` [RFC PATCH 24/35] ceph: Make ceph_calc_file_object_mapping() return size as size_t David Howells
2025-03-13 23:33 ` [RFC PATCH 25/35] ceph: Wrap POSIX_FADV_WILLNEED to get caps David Howells
2025-03-13 23:33 ` [RFC PATCH 26/35] ceph: Kill ceph_rw_context David Howells
2025-03-13 23:33 ` [RFC PATCH 27/35] netfs: Pass extra write context to write functions David Howells
2025-03-13 23:33 ` [RFC PATCH 28/35] netfs: Adjust group handling David Howells
2025-03-19 18:57 ` Viacheslav Dubeyko
2025-03-20 15:22 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 29/35] netfs: Allow fs-private data to be handed through to request alloc David Howells
2025-03-13 23:33 ` [RFC PATCH 30/35] netfs: Make netfs_page_mkwrite() use folio_mkwrite_check_truncate() David Howells
2025-03-13 23:33 ` [RFC PATCH 31/35] netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int David Howells
2025-03-13 23:33 ` [RFC PATCH 32/35] netfs: Add some more RMW support for ceph David Howells
2025-03-19 19:14 ` Viacheslav Dubeyko
2025-03-20 15:25 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 33/35] ceph: Use netfslib [INCOMPLETE] David Howells
2025-03-19 19:54 ` Viacheslav Dubeyko
2025-03-20 15:38 ` David Howells
2025-03-13 23:33 ` [RFC PATCH 34/35] ceph: Enable multipage folios for ceph files David Howells
2025-03-13 23:33 ` [RFC PATCH 35/35] ceph: Remove old I/O API bits David Howells