From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
To: Alex Markuze <amarkuze@redhat.com>,
"slava@dubeyko.com" <slava@dubeyko.com>,
David Howells <dhowells@redhat.com>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"idryomov@gmail.com" <idryomov@gmail.com>,
"jlayton@kernel.org" <jlayton@kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"dongsheng.yang@easystack.cn" <dongsheng.yang@easystack.cn>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 03/35] libceph: Add a new data container type, ceph_databuf
Date: Fri, 14 Mar 2025 20:06:02 +0000
Message-ID: <1bab7ad752df6f2fa953fbf8eed8370e10344ff7.camel@ibm.com>
In-Reply-To: <20250313233341.1675324-4-dhowells@redhat.com>
On Thu, 2025-03-13 at 23:32 +0000, David Howells wrote:
> Add a new ceph data container type, ceph_databuf, that can carry a list of
> pages in a bvec and use an iov_iter to describe the data to the next
> layer down. The iterator can also be used to refer to other types, such as
> ITER_FOLIOQ.
>
> There are two ways of loading the bvec. One way is to allocate a buffer
> with space in it and then add data, expanding the space as needed; the
> other is to splice in pages, expanding the bvec[] as needed.
>
> This is intended to replace all other types.
>
We definitely need to think about unit tests or self-tests here.
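For instance, a minimal KUnit sketch (purely illustrative; it only assumes
the helpers declared in this patch) could exercise the encode/length round
trip:

#include <kunit/test.h>
#include <linux/ceph/databuf.h>

/* Append one encoded value and check that the occupied length advances. */
static void databuf_encode_test(struct kunit *test)
{
	struct ceph_databuf *dbuf;

	dbuf = ceph_databuf_req_alloc(1, PAGE_SIZE, GFP_KERNEL);
	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, dbuf);

	KUNIT_EXPECT_EQ(test, ceph_databuf_encode_64(dbuf, 0x1122334455667788ULL), 0);
	KUNIT_EXPECT_EQ(test, ceph_databuf_len(dbuf), sizeof(__le64));

	ceph_databuf_release(dbuf);
}

static struct kunit_case databuf_test_cases[] = {
	KUNIT_CASE(databuf_encode_test),
	{}
};

static struct kunit_suite databuf_test_suite = {
	.name = "ceph_databuf",
	.test_cases = databuf_test_cases,
};
kunit_test_suite(databuf_test_suite);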
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Viacheslav Dubeyko <slava@dubeyko.com>
> cc: Alex Markuze <amarkuze@redhat.com>
> cc: Ilya Dryomov <idryomov@gmail.com>
> cc: ceph-devel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---
> include/linux/ceph/databuf.h | 131 +++++++++++++++++++++
> include/linux/ceph/messenger.h | 6 +-
> include/linux/ceph/osd_client.h | 3 +
> net/ceph/Makefile | 3 +-
> net/ceph/databuf.c | 200 ++++++++++++++++++++++++++++++++
> net/ceph/messenger.c | 20 +++-
> net/ceph/osd_client.c | 11 +-
> 7 files changed, 369 insertions(+), 5 deletions(-)
> create mode 100644 include/linux/ceph/databuf.h
> create mode 100644 net/ceph/databuf.c
>
> diff --git a/include/linux/ceph/databuf.h b/include/linux/ceph/databuf.h
> new file mode 100644
> index 000000000000..14c7a6449467
> --- /dev/null
> +++ b/include/linux/ceph/databuf.h
> @@ -0,0 +1,131 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __FS_CEPH_DATABUF_H
> +#define __FS_CEPH_DATABUF_H
> +
> +#include <asm/byteorder.h>
> +#include <linux/refcount.h>
> +#include <linux/blk_types.h>
> +
> +struct ceph_databuf {
> + struct bio_vec *bvec; /* List of pages */
So, maybe we need to think about folios now?
> + struct bio_vec inline_bvec[1]; /* Inline bvecs for small buffers */
> + struct iov_iter iter; /* Iterator defining occupied data */
> + size_t limit; /* Maximum length before expansion required */
> + size_t nr_bvec; /* Number of bvec[] that have pages */
Folios? :)
> + size_t max_bvec; /* Size of bvec[] */
> + refcount_t refcnt;
> + bool put_pages; /* T if pages in bvec[] need to be put */
Maybe folios? :)
> +};
> +
> +struct ceph_databuf *ceph_databuf_alloc(size_t min_bvec, size_t space,
> + unsigned int data_source, gfp_t gfp);
> +struct ceph_databuf *ceph_databuf_get(struct ceph_databuf *dbuf);
> +void ceph_databuf_release(struct ceph_databuf *dbuf);
> +int ceph_databuf_append(struct ceph_databuf *dbuf, const void *d, size_t l);
I think this declaration is important, and the argument names need to be
clear enough. Short names are good, but these could be confusing. Why not len
instead of l? And I am still guessing what d means. :)
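Something like this would read better (just a naming suggestion):

int ceph_databuf_append(struct ceph_databuf *dbuf, const void *data, size_t len);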
> +int ceph_databuf_reserve(struct ceph_databuf *dbuf, size_t space, gfp_t gfp);
> +int ceph_databuf_insert_frag(struct ceph_databuf *dbuf, unsigned int ix,
> + size_t len, gfp_t gfp);
> +
> +static inline
> +struct ceph_databuf *ceph_databuf_req_alloc(size_t min_bvec, size_t space, gfp_t gfp)
> +{
> + return ceph_databuf_alloc(min_bvec, space, ITER_SOURCE, gfp);
> +}
> +
> +static inline
> +struct ceph_databuf *ceph_databuf_reply_alloc(size_t min_bvec, size_t space, gfp_t gfp)
> +{
> + struct ceph_databuf *dbuf;
> +
> + dbuf = ceph_databuf_alloc(min_bvec, space, ITER_DEST, gfp);
> + if (dbuf)
> + iov_iter_reexpand(&dbuf->iter, space);
> + return dbuf;
> +}
> +
> +static inline struct page *ceph_databuf_page(struct ceph_databuf *dbuf,
> + unsigned int ix)
> +{
> + return dbuf->bvec[ix].bv_page;
> +}
> +
> +#define kmap_ceph_databuf_page(dbuf, ix) \
> + kmap_local_page(ceph_databuf_page(dbuf, ix));
> +
I am still thinking that we need to base the new code on folios.
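For example, a folio accessor could be layered on top without changing the
bvec storage (a sketch only; page_folio() already exists in the kernel):

static inline struct folio *ceph_databuf_folio(struct ceph_databuf *dbuf,
					       unsigned int ix)
{
	/* Map the stored page back to its owning folio. */
	return page_folio(dbuf->bvec[ix].bv_page);
}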
> +static inline int ceph_databuf_encode_64(struct ceph_databuf *dbuf, u64 v)
> +{
> + __le64 ev = cpu_to_le64(v);
> + return ceph_databuf_append(dbuf, &ev, sizeof(ev));
> +}
> +static inline int ceph_databuf_encode_32(struct ceph_databuf *dbuf, u32 v)
> +{
> + __le32 ev = cpu_to_le32(v);
> + return ceph_databuf_append(dbuf, &ev, sizeof(ev));
> +}
> +static inline int ceph_databuf_encode_16(struct ceph_databuf *dbuf, u16 v)
> +{
> + __le16 ev = cpu_to_le16(v);
> + return ceph_databuf_append(dbuf, &ev, sizeof(ev));
> +}
> +static inline int ceph_databuf_encode_8(struct ceph_databuf *dbuf, u8 v)
> +{
> + return ceph_databuf_append(dbuf, &v, 1);
> +}
Maybe order these as encode_8, encode_16, encode_32, encode_64? I mean, in
the reverse sequence here.
> +static inline int ceph_databuf_encode_string(struct ceph_databuf *dbuf,
> + const char *s, u32 len)
> +{
> + int ret = ceph_databuf_encode_32(dbuf, len);
> + if (ret)
> + return ret;
> + if (len)
> + return ceph_databuf_append(dbuf, s, len);
> + return 0;
> +}
> +
> +static inline size_t ceph_databuf_len(struct ceph_databuf *dbuf)
> +{
> + return dbuf->iter.count;
> +}
> +
> +static inline void ceph_databuf_added_data(struct ceph_databuf *dbuf,
> + size_t len)
> +{
> + dbuf->iter.count += len;
> +}
> +
> +static inline void ceph_databuf_reply_ready(struct ceph_databuf *reply,
> + size_t len)
> +{
> + reply->iter.data_source = ITER_SOURCE;
> + iov_iter_truncate(&reply->iter, len);
> +}
> +
> +static inline void ceph_databuf_reset_reply(struct ceph_databuf *reply)
> +{
> + iov_iter_bvec(&reply->iter, ITER_DEST,
> + reply->bvec, reply->nr_bvec, reply->limit);
> +}
> +
> +static inline void ceph_databuf_append_page(struct ceph_databuf *dbuf,
> + struct page *page,
> + unsigned int offset,
> + unsigned int len)
> +{
> + BUG_ON(dbuf->nr_bvec >= dbuf->max_bvec);
> + bvec_set_page(&dbuf->bvec[dbuf->nr_bvec++], page, len, offset);
> + dbuf->iter.count += len;
> + dbuf->iter.nr_segs++;
Why do we add len to dbuf->iter.count but only increment dbuf->iter.nr_segs by one?
> +}
> +
> +static inline void *ceph_databuf_enc_start(struct ceph_databuf *dbuf)
> +{
> + return page_address(ceph_databuf_page(dbuf, 0)) + dbuf->iter.count;
> +}
> +
> +static inline void ceph_databuf_enc_stop(struct ceph_databuf *dbuf, void *p)
> +{
> + dbuf->iter.count = p - page_address(ceph_databuf_page(dbuf, 0));
> + BUG_ON(dbuf->iter.count > dbuf->limit);
> +}
The same comment about pages vs. folios applies here...
> +
> +#endif /* __FS_CEPH_DATABUF_H */
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index db2aba32b8a0..864aad369c91 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -117,6 +117,7 @@ struct ceph_messenger {
>
> enum ceph_msg_data_type {
> CEPH_MSG_DATA_NONE, /* message contains no data payload */
> + CEPH_MSG_DATA_DATABUF, /* data source/destination is a data buffer */
> CEPH_MSG_DATA_PAGES, /* data source/destination is a page array */
> CEPH_MSG_DATA_PAGELIST, /* data source/destination is a pagelist */
So, the final replacement of these with databuf will come in the future?
> #ifdef CONFIG_BLOCK
> @@ -210,7 +211,10 @@ struct ceph_bvec_iter {
>
> struct ceph_msg_data {
> enum ceph_msg_data_type type;
> + struct iov_iter iter;
> + bool release_dbuf;
> union {
> + struct ceph_databuf *dbuf;
> #ifdef CONFIG_BLOCK
> struct {
> struct ceph_bio_iter bio_pos;
> @@ -225,7 +229,6 @@ struct ceph_msg_data {
> bool own_pages;
> };
> struct ceph_pagelist *pagelist;
> - struct iov_iter iter;
> };
> };
>
> @@ -601,6 +604,7 @@ extern void ceph_con_keepalive(struct ceph_connection *con);
> extern bool ceph_con_keepalive_expired(struct ceph_connection *con,
> unsigned long interval);
>
> +void ceph_msg_data_add_databuf(struct ceph_msg *msg, struct ceph_databuf *dbuf);
> void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,
> size_t length, size_t offset, bool own_pages);
> extern void ceph_msg_data_add_pagelist(struct ceph_msg *msg,
> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
> index 8fc84f389aad..b8fb5a71dd57 100644
> --- a/include/linux/ceph/osd_client.h
> +++ b/include/linux/ceph/osd_client.h
> @@ -16,6 +16,7 @@
> #include <linux/ceph/msgpool.h>
> #include <linux/ceph/auth.h>
> #include <linux/ceph/pagelist.h>
> +#include <linux/ceph/databuf.h>
>
> struct ceph_msg;
> struct ceph_snap_context;
> @@ -103,6 +104,7 @@ struct ceph_osd {
>
> enum ceph_osd_data_type {
> CEPH_OSD_DATA_TYPE_NONE = 0,
> + CEPH_OSD_DATA_TYPE_DATABUF,
> CEPH_OSD_DATA_TYPE_PAGES,
> CEPH_OSD_DATA_TYPE_PAGELIST,
The same question about replacement with databuf applies here. Is it future work?
> #ifdef CONFIG_BLOCK
> @@ -115,6 +117,7 @@ enum ceph_osd_data_type {
> struct ceph_osd_data {
> enum ceph_osd_data_type type;
> union {
> + struct ceph_databuf *dbuf;
> struct {
> struct page **pages;
> u64 length;
> diff --git a/net/ceph/Makefile b/net/ceph/Makefile
> index 8802a0c0155d..4b2e0b654e45 100644
> --- a/net/ceph/Makefile
> +++ b/net/ceph/Makefile
> @@ -15,4 +15,5 @@ libceph-y := ceph_common.o messenger.o msgpool.o buffer.o pagelist.o \
> auth_x.o \
> ceph_strings.o ceph_hash.o \
> pagevec.o snapshot.o string_table.o \
> - messenger_v1.o messenger_v2.o
> + messenger_v1.o messenger_v2.o \
> + databuf.o
> diff --git a/net/ceph/databuf.c b/net/ceph/databuf.c
> new file mode 100644
> index 000000000000..9d108fff5a4f
> --- /dev/null
> +++ b/net/ceph/databuf.c
> @@ -0,0 +1,200 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Data container
> + *
> + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells (dhowells@redhat.com)
> + */
> +
> +#include <linux/export.h>
> +#include <linux/gfp.h>
> +#include <linux/slab.h>
> +#include <linux/uio.h>
> +#include <linux/pagemap.h>
> +#include <linux/highmem.h>
> +#include <linux/ceph/databuf.h>
> +
> +struct ceph_databuf *ceph_databuf_alloc(size_t min_bvec, size_t space,
> + unsigned int data_source, gfp_t gfp)
> +{
> + struct ceph_databuf *dbuf;
> + size_t inl = ARRAY_SIZE(dbuf->inline_bvec);
> +
> + dbuf = kzalloc(sizeof(*dbuf), gfp);
> + if (!dbuf)
> + return NULL;
I am guessing... Should we return an error code here?
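If callers need to distinguish failure reasons, the usual kernel idiom would
be ERR_PTR()/IS_ERR() (a sketch only, not what the patch does):

	dbuf = kzalloc(sizeof(*dbuf), gfp);
	if (!dbuf)
		return ERR_PTR(-ENOMEM);

and then callers check IS_ERR(dbuf) and use PTR_ERR(dbuf) instead of testing
for NULL.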
> +
> + refcount_set(&dbuf->refcnt, 1);
> +
> + if (min_bvec == 0 && space == 0) {
> + /* Do nothing */
> + } else if (min_bvec <= inl && space <= inl * PAGE_SIZE) {
> + dbuf->bvec = dbuf->inline_bvec;
> + dbuf->max_bvec = inl;
> + dbuf->limit = space;
> + } else if (min_bvec) {
> + min_bvec = umax(min_bvec, 16);
Why 16 here? Maybe we need to introduce some well-explained constant?
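For example (the constant name is hypothetical):

/* Minimum bvec[] capacity when allocating the array on the heap. */
#define CEPH_DATABUF_MIN_BVECS	16

	min_bvec = umax(min_bvec, CEPH_DATABUF_MIN_BVECS);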
> +
> + dbuf->bvec = kcalloc(min_bvec, sizeof(struct bio_vec), gfp);
> + if (!dbuf->bvec) {
> + kfree(dbuf);
> + return NULL;
Ditto. Should we return an error code here?
> + }
> +
> + dbuf->max_bvec = min_bvec;
Why do we assign min_bvec to max_bvec? I am slightly confused that the
function argument is named min_bvec, but in the end we save the min_bvec
value into max_bvec.
> + }
> +
> + iov_iter_bvec(&dbuf->iter, data_source, dbuf->bvec, 0, 0);
> +
> + if (space) {
> + if (ceph_databuf_reserve(dbuf, space, gfp) < 0) {
> + ceph_databuf_release(dbuf);
> + return NULL;
Ditto. Should we return an error code here?
> + }
> + }
> + return dbuf;
> +}
> +EXPORT_SYMBOL(ceph_databuf_alloc);
> +
> +struct ceph_databuf *ceph_databuf_get(struct ceph_databuf *dbuf)
I see the point here. But do we really need to return the pointer? Why not simply:
void ceph_databuf_get(struct ceph_databuf *dbuf)
> +{
> + if (!dbuf)
> + return NULL;
> + refcount_inc(&dbuf->refcnt);
> + return dbuf;
> +}
> +EXPORT_SYMBOL(ceph_databuf_get);
> +
> +void ceph_databuf_release(struct ceph_databuf *dbuf)
> +{
> + size_t i;
> +
> + if (!dbuf || !refcount_dec_and_test(&dbuf->refcnt))
> + return;
> +
> + if (dbuf->put_pages)
> + for (i = 0; i < dbuf->nr_bvec; i++)
> + put_page(dbuf->bvec[i].bv_page);
> + if (dbuf->bvec != dbuf->inline_bvec)
> + kfree(dbuf->bvec);
> + kfree(dbuf);
> +}
> +EXPORT_SYMBOL(ceph_databuf_release);
> +
> +/*
> + * Expand the bvec[] in the dbuf.
> + */
> +static int ceph_databuf_expand(struct ceph_databuf *dbuf, size_t req_bvec,
> + gfp_t gfp)
> +{
> + struct bio_vec *bvec = dbuf->bvec, *old = bvec;
I think the assignment (*old = bvec) looks confusing when it is kept on the
same line as the declaration and initialization of bvec. Why not declare and
initialize it on the next line?
> + size_t size, max_bvec, off = dbuf->iter.bvec - old;
I think there are too many declarations on this line. Why not:
size_t size, max_bvec;
size_t off = dbuf->iter.bvec - old;
> + size_t inl = ARRAY_SIZE(dbuf->inline_bvec);
> +
> + if (req_bvec <= inl) {
> + dbuf->bvec = dbuf->inline_bvec;
> + dbuf->max_bvec = inl;
> + dbuf->iter.bvec = dbuf->inline_bvec + off;
> + return 0;
> + }
> +
> + max_bvec = roundup_pow_of_two(req_bvec);
> + size = array_size(max_bvec, sizeof(struct bio_vec));
> +
> + if (old == dbuf->inline_bvec) {
> + bvec = kmalloc_array(max_bvec, sizeof(struct bio_vec), gfp);
> + if (!bvec)
> + return -ENOMEM;
> + memcpy(bvec, old, inl);
> + } else {
> + bvec = krealloc(old, size, gfp);
> + if (!bvec)
> + return -ENOMEM;
> + }
> + dbuf->bvec = bvec;
> + dbuf->max_bvec = max_bvec;
> + dbuf->iter.bvec = bvec + off;
> + return 0;
> +}
> +
> +/* Allocate enough pages for a dbuf to append the given amount
> + * of dbuf without allocating.
> + * Returns: 0 on success, -ENOMEM on error.
> + */
> +int ceph_databuf_reserve(struct ceph_databuf *dbuf, size_t add_space,
> + gfp_t gfp)
> +{
> + struct bio_vec *bvec;
> + size_t i, req_bvec = DIV_ROUND_UP(dbuf->iter.count + add_space, PAGE_SIZE);
Why not:
size_t req_bvec = DIV_ROUND_UP(dbuf->iter.count + add_space, PAGE_SIZE);
size_t i;
> + int ret;
> +
> + dbuf->put_pages = true;
> + if (req_bvec > dbuf->max_bvec) {
> + ret = ceph_databuf_expand(dbuf, req_bvec, gfp);
> + if (ret < 0)
> + return ret;
> + }
> +
> + bvec = dbuf->bvec;
> + while (dbuf->nr_bvec < req_bvec) {
> + struct page *pages[16];
Why is 16 hardcoded here instead of using some well-defined constant? (See
the sketch after this hunk.) And, again, why not folios?
> + size_t want = min(req_bvec, ARRAY_SIZE(pages)), got;
> +
> + memset(pages, 0, sizeof(pages));
> + got = alloc_pages_bulk(gfp, want, pages);
> + if (!got)
> + return -ENOMEM;
> + for (i = 0; i < got; i++)
Why do we use size_t for i and got? Why not int, for example?
> + bvec_set_page(&bvec[dbuf->nr_bvec + i], pages[i],
> + PAGE_SIZE, 0);
> + dbuf->iter.nr_segs += got;
> + dbuf->nr_bvec += got;
If I understood correctly, ceph_databuf_append_page() uses slightly
different logic:
+ dbuf->iter.count += len;
+ dbuf->iter.nr_segs++;
But here we add the number of allocated pages to nr_segs without advancing
count. It is slightly confusing. I think I am missing something here.
> + dbuf->limit = dbuf->nr_bvec * PAGE_SIZE;
> + }
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(ceph_databuf_reserve);
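Following up on the hardcoded 16 in the bulk allocation above, a named
constant (the name is hypothetical) would document the batching:

/* Number of pages to request per alloc_pages_bulk() call. */
#define CEPH_DATABUF_ALLOC_BATCH	16

	struct page *pages[CEPH_DATABUF_ALLOC_BATCH];
	size_t want = min(req_bvec, ARRAY_SIZE(pages)), got;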
> +
> +int ceph_databuf_append(struct ceph_databuf *dbuf, const void *buf, size_t len)
> +{
> + struct iov_iter temp_iter;
> +
> + if (!len)
> + return 0;
> + if (dbuf->limit - dbuf->iter.count > len &&
> + ceph_databuf_reserve(dbuf, len, GFP_NOIO) < 0)
> + return -ENOMEM;
> +
> + iov_iter_bvec(&temp_iter, ITER_DEST,
> + dbuf->bvec, dbuf->nr_bvec, dbuf->limit);
> + iov_iter_advance(&temp_iter, dbuf->iter.count);
> +
> + if (copy_to_iter(buf, len, &temp_iter) != len)
> + return -EFAULT;
> + dbuf->iter.count += len;
> + return 0;
> +}
> +EXPORT_SYMBOL(ceph_databuf_append);
> +
> +/*
> + * Allocate a fragment and insert it into the buffer at the specified index.
> + */
> +int ceph_databuf_insert_frag(struct ceph_databuf *dbuf, unsigned int ix,
> + size_t len, gfp_t gfp)
> +{
> + struct page *page;
> +
Why not a folio? (A folio-based sketch follows after this hunk.)
> + page = alloc_page(gfp);
> + if (!page)
> + return -ENOMEM;
> +
> + bvec_set_page(&dbuf->bvec[ix], page, len, 0);
> +
> + if (dbuf->nr_bvec == ix) {
> + dbuf->iter.nr_segs = ix + 1;
> + dbuf->nr_bvec = ix + 1;
> + dbuf->iter.count += len;
> + }
> + return 0;
> +}
> +EXPORT_SYMBOL(ceph_databuf_insert_frag);
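As for the folio question above, the allocation part of this function could
look like the following sketch (folio_alloc() and bvec_set_folio() already
exist in the kernel; the rest of the function body would stay unchanged):

	struct folio *folio;

	folio = folio_alloc(gfp, 0);	/* order-0, i.e. single-page, folio */
	if (!folio)
		return -ENOMEM;

	bvec_set_folio(&dbuf->bvec[ix], folio, len, 0);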
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 1df4291cc80b..802f0b222131 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -1872,7 +1872,9 @@ static struct ceph_msg_data *ceph_msg_data_add(struct ceph_msg *msg)
>
> static void ceph_msg_data_destroy(struct ceph_msg_data *data)
> {
> - if (data->type == CEPH_MSG_DATA_PAGES && data->own_pages) {
> + if (data->type == CEPH_MSG_DATA_DATABUF) {
> + ceph_databuf_release(data->dbuf);
> + } else if (data->type == CEPH_MSG_DATA_PAGES && data->own_pages) {
> int num_pages = calc_pages_for(data->offset, data->length);
> ceph_release_page_vector(data->pages, num_pages);
> } else if (data->type == CEPH_MSG_DATA_PAGELIST) {
> @@ -1880,6 +1882,22 @@ static void ceph_msg_data_destroy(struct ceph_msg_data *data)
> }
> }
>
> +void ceph_msg_data_add_databuf(struct ceph_msg *msg, struct ceph_databuf *dbuf)
> +{
> + struct ceph_msg_data *data;
> +
> + BUG_ON(!dbuf);
> + BUG_ON(!ceph_databuf_len(dbuf));
> +
> + data = ceph_msg_data_add(msg);
> + data->type = CEPH_MSG_DATA_DATABUF;
> + data->dbuf = ceph_databuf_get(dbuf);
> + data->iter = dbuf->iter;
> +
> + msg->data_length += ceph_databuf_len(dbuf);
> +}
> +EXPORT_SYMBOL(ceph_msg_data_add_databuf);
> +
> void ceph_msg_data_add_pages(struct ceph_msg *msg, struct page **pages,
> size_t length, size_t offset, bool own_pages)
> {
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index e359e70ad47e..c84634264377 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -359,6 +359,8 @@ static u64 ceph_osd_data_length(struct ceph_osd_data *osd_data)
> switch (osd_data->type) {
> case CEPH_OSD_DATA_TYPE_NONE:
> return 0;
> + case CEPH_OSD_DATA_TYPE_DATABUF:
> + return ceph_databuf_len(osd_data->dbuf);
> case CEPH_OSD_DATA_TYPE_PAGES:
> return osd_data->length;
> case CEPH_OSD_DATA_TYPE_PAGELIST:
> @@ -379,7 +381,9 @@ static u64 ceph_osd_data_length(struct ceph_osd_data *osd_data)
>
> static void ceph_osd_data_release(struct ceph_osd_data *osd_data)
> {
> - if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES && osd_data->own_pages) {
> + if (osd_data->type == CEPH_OSD_DATA_TYPE_DATABUF) {
> + ceph_databuf_release(osd_data->dbuf);
> + } else if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES && osd_data->own_pages) {
> int num_pages;
>
> num_pages = calc_pages_for((u64)osd_data->offset,
> @@ -965,7 +969,10 @@ static void ceph_osdc_msg_data_add(struct ceph_msg *msg,
> {
> u64 length = ceph_osd_data_length(osd_data);
>
> - if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
> + if (osd_data->type == CEPH_OSD_DATA_TYPE_DATABUF) {
> + BUG_ON(!length);
> + ceph_msg_data_add_databuf(msg, osd_data->dbuf);
> + } else if (osd_data->type == CEPH_OSD_DATA_TYPE_PAGES) {
> BUG_ON(length > (u64) SIZE_MAX);
> if (length)
> ceph_msg_data_add_pages(msg, osd_data->pages,
>
>
Thanks,
Slava.