linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Markuze <amarkuze@redhat.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
	Viacheslav Dubeyko <vdubeyko@redhat.com>,
	 "idryomov@gmail.com" <idryomov@gmail.com>,
	 "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v3 4/4] ceph: adding CEPH_SUBVOLUME_ID_NONE
Date: Wed, 3 Dec 2025 23:22:59 +0200	[thread overview]
Message-ID: <CAO8a2SjQDC2qaVV6_jsQbzOtUUdxStx2jEMYkG3VVkSCPbiH_Q@mail.gmail.com> (raw)
In-Reply-To: <361062ac3b2caf3262b319003c7b4aa2cf0f6a6e.camel@ibm.com>

The latest ceph code supports subvolume metrics.
The test is simple:
1. Deploy a ceph cluster
2. Create and mount a subvolume
3. run some I/O
4. I used debugfs to see that subvolume metrics were collected on the
client side and checked for subvolume metrics being reported on the
mds.

Nothing more to it.

On Wed, Dec 3, 2025 at 10:15 PM Viacheslav Dubeyko
<Slava.Dubeyko@ibm.com> wrote:
>
> On Wed, 2025-12-03 at 15:46 +0000, Alex Markuze wrote:
> > 1. Introduce CEPH_SUBVOLUME_ID_NONE constant (value 0) to make the
> >    unknown/unset state explicit and self-documenting.
> >
> > 2. Add WARN_ON_ONCE if attempting to change an already-set subvolume_id.
> >    An inode's subvolume membership is immutable - once created in a
> >    subvolume, it stays there. Attempting to change it indicates a bug.
> > ---
> >  fs/ceph/inode.c             | 32 +++++++++++++++++++++++++-------
> >  fs/ceph/mds_client.c        |  5 +----
> >  fs/ceph/subvolume_metrics.c |  7 ++++---
> >  fs/ceph/super.h             | 10 +++++++++-
> >  4 files changed, 39 insertions(+), 15 deletions(-)
> >
> > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> > index 835049004047..257b3e27b741 100644
> > --- a/fs/ceph/inode.c
> > +++ b/fs/ceph/inode.c
> > @@ -638,7 +638,7 @@ struct inode *ceph_alloc_inode(struct super_block *sb)
> >
> >       ci->i_max_bytes = 0;
> >       ci->i_max_files = 0;
> > -     ci->i_subvolume_id = 0;
> > +     ci->i_subvolume_id = CEPH_SUBVOLUME_ID_NONE;
>
> I was expected to see the code of this patch in the second and third ones. And
> it looks really confusing. Why have you introduced another one patch?
>
> So, how I can test this patchset? I assume that xfstests run will be not enough.
> Do we have special test environment or test-cases for this?
>
> Thanks,
> Slava.
>
> >
> >       memset(&ci->i_dir_layout, 0, sizeof(ci->i_dir_layout));
> >       memset(&ci->i_cached_layout, 0, sizeof(ci->i_cached_layout));
> > @@ -743,7 +743,7 @@ void ceph_evict_inode(struct inode *inode)
> >
> >       percpu_counter_dec(&mdsc->metric.total_inodes);
> >
> > -     ci->i_subvolume_id = 0;
> > +     ci->i_subvolume_id = CEPH_SUBVOLUME_ID_NONE;
> >
> >       netfs_wait_for_outstanding_io(inode);
> >       truncate_inode_pages_final(&inode->i_data);
> > @@ -877,19 +877,37 @@ int ceph_fill_file_size(struct inode *inode, int issued,
> >  }
> >
> >  /*
> > - * Set the subvolume ID for an inode. Following the FUSE client convention,
> > - * 0 means unknown/unset (MDS only sends non-zero IDs for subvolume inodes).
> > + * Set the subvolume ID for an inode.
> > + *
> > + * The subvolume_id identifies which CephFS subvolume this inode belongs to.
> > + * CEPH_SUBVOLUME_ID_NONE (0) means unknown/unset - the MDS only sends
> > + * non-zero IDs for inodes within subvolumes.
> > + *
> > + * An inode's subvolume membership is immutable - once an inode is created
> > + * in a subvolume, it stays there. Therefore, if we already have a valid
> > + * (non-zero) subvolume_id and receive a different one, that indicates a bug.
> >   */
> >  void ceph_inode_set_subvolume(struct inode *inode, u64 subvolume_id)
> >  {
> >       struct ceph_inode_info *ci;
> > +     u64 old;
> >
> > -     if (!inode || !subvolume_id)
> > +     if (!inode || subvolume_id == CEPH_SUBVOLUME_ID_NONE)
> >               return;
> >
> >       ci = ceph_inode(inode);
> > -     if (READ_ONCE(ci->i_subvolume_id) != subvolume_id)
> > -             WRITE_ONCE(ci->i_subvolume_id, subvolume_id);
> > +     old = READ_ONCE(ci->i_subvolume_id);
> > +
> > +     if (old == subvolume_id)
> > +             return;
> > +
> > +     if (old != CEPH_SUBVOLUME_ID_NONE) {
> > +             /* subvolume_id should not change once set */
> > +             WARN_ON_ONCE(1);
> > +             return;
> > +     }
> > +
> > +     WRITE_ONCE(ci->i_subvolume_id, subvolume_id);
> >  }
> >
> >  void ceph_fill_file_time(struct inode *inode, int issued,
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index 2b831f48c844..f2a17e11fcef 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -122,10 +122,7 @@ static int parse_reply_info_in(void **p, void *end,
> >       u32 struct_len = 0;
> >       struct ceph_client *cl = mdsc ? mdsc->fsc->client : NULL;
> >
> > -     info->subvolume_id = 0;
> > -     doutc(cl, "subv_metric parse start features=0x%llx\n", features);
> > -
> > -     info->subvolume_id = 0;
> > +     info->subvolume_id = CEPH_SUBVOLUME_ID_NONE;
> >
> >       if (features == (u64)-1) {
> >               ceph_decode_8_safe(p, end, struct_v, bad);
> > diff --git a/fs/ceph/subvolume_metrics.c b/fs/ceph/subvolume_metrics.c
> > index 111f6754e609..37cbed5b52c3 100644
> > --- a/fs/ceph/subvolume_metrics.c
> > +++ b/fs/ceph/subvolume_metrics.c
> > @@ -136,8 +136,9 @@ void ceph_subvolume_metrics_record(struct ceph_subvolume_metrics_tracker *tracke
> >       struct ceph_subvol_metric_rb_entry *entry, *new_entry = NULL;
> >       bool retry = false;
> >
> > -     /* 0 means unknown/unset subvolume (matches FUSE client convention) */
> > -     if (!READ_ONCE(tracker->enabled) || !subvol_id || !size || !latency_us)
> > +     /* CEPH_SUBVOLUME_ID_NONE (0) means unknown/unset subvolume */
> > +     if (!READ_ONCE(tracker->enabled) ||
> > +         subvol_id == CEPH_SUBVOLUME_ID_NONE || !size || !latency_us)
> >               return;
> >
> >       do {
> > @@ -403,7 +404,7 @@ void ceph_subvolume_metrics_record_io(struct ceph_mds_client *mdsc,
> >       }
> >
> >       subvol_id = READ_ONCE(ci->i_subvolume_id);
> > -     if (!subvol_id) {
> > +     if (subvol_id == CEPH_SUBVOLUME_ID_NONE) {
> >               atomic64_inc(&tracker->record_no_subvol);
> >               return;
> >       }
> > diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> > index a03c373efd52..731df0fcbcc8 100644
> > --- a/fs/ceph/super.h
> > +++ b/fs/ceph/super.h
> > @@ -386,7 +386,15 @@ struct ceph_inode_info {
> >
> >       /* quotas */
> >       u64 i_max_bytes, i_max_files;
> > -     u64 i_subvolume_id;     /* 0 = unknown/unset, matches FUSE client */
> > +
> > +     /*
> > +      * Subvolume ID this inode belongs to. CEPH_SUBVOLUME_ID_NONE (0)
> > +      * means unknown/unset, matching the FUSE client convention.
> > +      * Once set to a valid (non-zero) value, it should not change
> > +      * during the inode's lifetime.
> > +      */
> > +#define CEPH_SUBVOLUME_ID_NONE 0
> > +     u64 i_subvolume_id;
> >
> >       s32 i_dir_pin;
> >


  reply	other threads:[~2025-12-03 21:23 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-03 15:46 [PATCH v3 0/4] ceph: add subvolume metrics reporting support Alex Markuze
2025-12-03 15:46 ` [PATCH v3 1/4] ceph: handle InodeStat v8 versioned field in reply parsing Alex Markuze
2025-12-03 15:46 ` [PATCH v3 2/4] ceph: parse subvolume_id from InodeStat v9 and store in inode Alex Markuze
2025-12-03 15:46 ` [PATCH v3 3/4] ceph: add subvolume metrics collection and reporting Alex Markuze
2025-12-03 15:46 ` [PATCH v3 4/4] ceph: adding CEPH_SUBVOLUME_ID_NONE Alex Markuze
2025-12-03 20:15   ` Viacheslav Dubeyko
2025-12-03 21:22     ` Alex Markuze [this message]
2025-12-03 22:54       ` Viacheslav Dubeyko
2025-12-04  8:18         ` Alex Markuze
2025-12-04 18:53           ` Viacheslav Dubeyko
2025-12-07 10:51             ` Alex Markuze
2025-12-12  0:14   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAO8a2SjQDC2qaVV6_jsQbzOtUUdxStx2jEMYkG3VVkSCPbiH_Q@mail.gmail.com \
    --to=amarkuze@redhat.com \
    --cc=Slava.Dubeyko@ibm.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=idryomov@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=vdubeyko@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).