linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v2 0/3] ceph: add subvolume metrics reporting support
@ 2025-12-02 15:57 Alex Markuze
From: Alex Markuze @ 2025-12-02 15:57 UTC
  To: ceph-devel; +Cc: idryomov, linux-fsdevel, amarkuze, vdubeyko

This patch series adds support for per-subvolume I/O metrics collection
and reporting to the MDS. This lets administrators monitor I/O patterns
at subvolume granularity, which is useful for multi-tenant CephFS
deployments where different subvolumes may be allocated to different
users or applications.

The implementation requires protocol changes to receive the subvolume_id
from the MDS (InodeStat v9), and introduces a new metrics type
(CLIENT_METRIC_TYPE_SUBVOLUME_METRICS) for reporting aggregated I/O
statistics back to the MDS.

Patch 1 adds forward-compatible handling for InodeStat v8. The MDS v8
encoding added a versioned "optmetadata" field (the actual field name in
the MDS C++ code, short for "optional metadata"). This field contains
optional inode metadata such as the charmap for case-insensitive/
case-preserving file systems. The kernel client does not currently
support case-insensitive lookups, so this field is skipped rather than
parsed. This keeps the client compatible with newer MDS versions
without requiring the full case-insensitivity feature implementation.
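
As an illustration only (this is not the code in patch 1), skipping a
versioned field can be sketched in userspace C. The sketch assumes the
usual Ceph versioned-encoding layout of struct_v (u8), struct_compat (u8)
and a little-endian u32 payload length; get_le32() and
skip_versioned_field() are hypothetical names, not kernel helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian u32 without assuming alignment. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Skip one versioned field: struct_v (u8), struct_compat (u8),
 * struct_len (le32), followed by struct_len payload bytes.
 * (Assumed layout; hypothetical helper, not the kernel API.)
 *
 * On success, advances *p past the field and returns 0.
 * If the buffer is too short, returns -1 and leaves *p untouched.
 */
static int skip_versioned_field(const uint8_t **p, const uint8_t *end)
{
	const uint8_t *cur;
	uint32_t len;

	assert(p && *p && end && *p <= end);
	cur = *p;

	/* Header is 1 + 1 + 4 bytes. */
	if ((size_t)(end - cur) < 6)
		return -1;
	cur += 2;		/* struct_v and struct_compat are not needed */
	len = get_le32(cur);
	cur += 4;

	/* The payload must fit in what is left of the buffer. */
	if ((size_t)(end - cur) < len)
		return -1;

	*p = cur + len;
	return 0;
}
```

Leaving the cursor untouched on failure keeps the caller's error path
simple: it can bail out without having to rewind.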

Patch 2 adds support for parsing the subvolume_id field from InodeStat
v9 and storing it in the inode structure for later use. Following the
FUSE client convention, subvolume_id of 0 indicates unknown/unset
(the MDS only sends non-zero subvolume IDs for inodes within subvolumes).
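
The version gating described above could look roughly like the following
userspace sketch. It is not the code in patch 2: the 64-bit little-endian
wire width of subvolume_id is an assumption, and SUBVOLUME_ID_UNKNOWN and
decode_subvolume_id() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the "unknown/unset" convention (value 0). */
#define SUBVOLUME_ID_UNKNOWN 0ULL

/*
 * Decode subvolume_id only when the sender encoded InodeStat v9 or later.
 * For older encodings the field is absent, so nothing is consumed and the
 * ID is reported as unknown.
 *
 * Returns 0 on success, -1 if a v9+ buffer is too short. *out is set to
 * SUBVOLUME_ID_UNKNOWN on every path that does not decode a value.
 * The u64 little-endian width is an assumption of this sketch.
 */
static int decode_subvolume_id(uint8_t struct_v, const uint8_t **p,
			       const uint8_t *end, uint64_t *out)
{
	uint64_t v = 0;
	int i;

	assert(p && *p && end && out && *p <= end);
	*out = SUBVOLUME_ID_UNKNOWN;

	if (struct_v < 9)
		return 0;	/* field not present in this encoding */

	if ((size_t)(end - *p) < 8)
		return -1;

	/* Assemble the value byte by byte, least significant first. */
	for (i = 7; i >= 0; i--)
		v = (v << 8) | (*p)[i];

	*p += 8;
	*out = v;
	return 0;
}
```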

Patch 3 adds the complete subvolume metrics infrastructure:
- CEPHFS_FEATURE_SUBVOLUME_METRICS feature flag for MDS negotiation
- Red-black tree based metrics tracker for efficient per-subvolume
  aggregation
- Wire format encoding matching the MDS C++ AggregatedIOMetrics struct
- Integration with the existing CLIENT_METRICS message
- Recording of I/O operations from file read/write and writeback paths
- Debugfs interfaces for monitoring

Metrics tracked per subvolume include:
- Read/write operation counts
- Read/write byte counts
- Read/write latency sums (for average calculation)
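
To show why latency sums are enough for average calculation, here is a
minimal userspace sketch of such a per-subvolume record. The struct
layout, the field and function names, and the microsecond unit are all
assumptions for illustration; they are not taken from patch 3 or from the
MDS AggregatedIOMetrics struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-subvolume record; the real layout may differ. */
struct subvol_io_metrics {
	uint64_t subvolume_id;
	uint64_t read_ops;
	uint64_t write_ops;
	uint64_t read_bytes;
	uint64_t write_bytes;
	uint64_t read_latency_us;	/* sum over all reads (assumed unit) */
	uint64_t write_latency_us;	/* sum over all writes (assumed unit) */
};

/* Account one completed I/O operation against a subvolume record. */
static void subvol_record_io(struct subvol_io_metrics *m, int is_write,
			     uint64_t bytes, uint64_t latency_us)
{
	assert(m);
	if (is_write) {
		m->write_ops++;
		m->write_bytes += bytes;
		m->write_latency_us += latency_us;
	} else {
		m->read_ops++;
		m->read_bytes += bytes;
		m->read_latency_us += latency_us;
	}
}

/* Average latency is the latency sum divided by the op count (0 if idle). */
static uint64_t subvol_avg_latency_us(uint64_t latency_sum_us, uint64_t ops)
{
	return ops ? latency_sum_us / ops : 0;
}
```

Keeping sums and counts rather than a running average means records from
several reporting intervals, or from several clients, can be added
together before the division is done.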

The metrics are periodically sent to the MDS as part of the existing
metrics reporting infrastructure when the MDS advertises support for
the SUBVOLUME_METRICS feature.

Debugfs additions in Patch 3:
- metrics/subvolumes: displays last sent and pending subvolume metrics
- metrics/metric_features: displays MDS session feature negotiation
  status, showing which metric-related features are enabled (including
  METRIC_COLLECT and SUBVOLUME_METRICS)

Changes since v1:
- Fixed unused variable warnings in patch 1 (v8_struct_v, v8_struct_compat)
  reported by kernel test robot. Now uses ceph_decode_skip_8() instead of
  ceph_decode_8_safe() since we only need to skip the versioned field header.
- Added comprehensive comment explaining InodeStat encoding versions v1-v9.
- Clarified that "optmetadata" is the actual field name in MDS C++ code.
- Added comments documenting that subvolume_id of 0 means unknown/unset,
  following the FUSE client convention.
- Fixed smatch warning in subvolume_metrics_show() where mdsc was assumed
  to potentially be NULL but later dereferenced unconditionally.

Alex Markuze (3):
  ceph: handle InodeStat v8 versioned field in reply parsing
  ceph: parse subvolume_id from InodeStat v9 and store in inode
  ceph: add subvolume metrics collection and reporting

 fs/ceph/Makefile            |   2 +-
 fs/ceph/addr.c              |  10 +
 fs/ceph/debugfs.c           | 153 ++++++++++++++
 fs/ceph/file.c              |  58 ++++-
 fs/ceph/inode.c             |  23 ++
 fs/ceph/mds_client.c        |  97 +++++++--
 fs/ceph/mds_client.h        |  14 +-
 fs/ceph/metric.c            | 172 ++++++++++++++-
 fs/ceph/metric.h            |  27 ++-
 fs/ceph/subvolume_metrics.c | 408 ++++++++++++++++++++++++++++++++++++
 fs/ceph/subvolume_metrics.h |  68 ++++++
 fs/ceph/super.c             |   1 +
 fs/ceph/super.h             |   3 +
 13 files changed, 1010 insertions(+), 26 deletions(-)
 create mode 100644 fs/ceph/subvolume_metrics.c
 create mode 100644 fs/ceph/subvolume_metrics.h

-- 
2.34.1


Thread overview: 9+ messages
2025-12-02 15:57 [PATCH v2 0/3] ceph: add subvolume metrics reporting support Alex Markuze
2025-12-02 15:57 ` [PATCH v2 1/3] ceph: handle InodeStat v8 versioned field in reply parsing Alex Markuze
2025-12-02 20:44   ` Viacheslav Dubeyko
2025-12-02 15:57 ` [PATCH v2 2/3] ceph: parse subvolume_id from InodeStat v9 and store in inode Alex Markuze
2025-12-02 20:50   ` Viacheslav Dubeyko
2025-12-03 15:48     ` Alex Markuze
2025-12-02 15:57 ` [PATCH v2 3/3] ceph: add subvolume metrics collection and reporting Alex Markuze
2025-12-02 22:54   ` Viacheslav Dubeyko
2025-12-03 15:57     ` Alex Markuze
