* [PATCH v3 0/4] cgroup/rdma: add rdma.peak and rdma.events[.local]
From: Tao Cui @ 2026-05-14  6:50 UTC (permalink / raw)
  To: tj, hannes, mkoutny, cgroups; +Cc: Tao Cui

Hi,

This is v3 of the RDMA cgroup observability series.  Thanks to the
reviewers for the detailed feedback on v1 and v2.

This series adds new cgroup interface files to the RDMA controller
to improve observability of resource usage and limit enforcement:

  - rdma.peak:        per-device high watermark of resource usage
  - rdma.events:      hierarchical max and alloc_fail event counters
  - rdma.events.local: per-cgroup local max and alloc_fail counters

rdma.peak tracks the historical high watermark so administrators can
determine a sensible rdma.max based on actual peak demand rather than
guesswork.  This is directly analogous to memory.peak.
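
As a rough illustration of the high-watermark idea, here is a userspace
model, not the kernel code.  The struct and the model_* function names
are hypothetical, and the check-then-commit walk is a simplification
chosen only to show that peak moves after the whole hierarchy has
accepted the charge:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one resource in one cgroup. */
struct cg_node {
	struct cg_node *parent;
	uint64_t usage;
	uint64_t max;
	uint64_t peak;	/* high watermark of usage */
};

/*
 * Charge one unit to cg and all of its ancestors.
 * Returns 0 on success, -1 if any level is already at its limit.
 * peak is only touched once every level has been checked.
 */
static int model_try_charge(struct cg_node *cg)
{
	struct cg_node *p;

	/* Pass 1: every level must have room, otherwise nothing changes. */
	for (p = cg; p; p = p->parent)
		if (p->usage >= p->max)
			return -1;

	/* Pass 2: commit the charge and raise the watermark if needed. */
	for (p = cg; p; p = p->parent) {
		p->usage++;
		if (p->usage > p->peak)
			p->peak = p->usage;
	}
	return 0;
}

/* Uncharge one unit; peak is deliberately left alone. */
static void model_uncharge(struct cg_node *cg)
{
	struct cg_node *p;

	for (p = cg; p; p = p->parent)
		if (p->usage)
			p->usage--;
}
```

In this model a failed charge leaves peak untouched at every level, and
uncharging never lowers it, which is the behaviour an administrator
would rely on when sizing rdma.max.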

rdma.events and rdma.events.local provide per-device counters that
track how often resource limits block allocations, and can be monitored
via poll/epoll for real-time alerting.  Both files expose the same
keys (max and alloc_fail); rdma.events aggregates hierarchically while
rdma.events.local shows per-cgroup values.  This follows the
pids.events / pids.events.local design.
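
For monitoring, a minimal userspace sketch of waiting on one of these
files could look like the following.  It assumes the new files behave
like other cgroup v2 event files, where a notification is reported to
poll() as POLLPRI; the helper name is hypothetical and the caller is
expected to open the file itself:

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait for a change notification on an already opened cgroup event
 * file (for example rdma.events).
 * Returns 1 if notified, 0 on timeout, -1 on poll() error.
 */
static int wait_for_cgroup_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;
	/* Assumption: notifications show up as POLLPRI (often with POLLERR). */
	return (pfd.revents & (POLLPRI | POLLERR)) ? 1 : 0;
}
```

After a wakeup the caller would typically lseek() back to offset 0 and
read the file again to pick up the new counter values.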

Patch overview:
  Patch 1 introduces rdma.peak.  It adds a per-resource peak field that
  tracks the high watermark of usage and is updated only after a full
  hierarchical charge succeeds, and it extends rpool lifetime so that
  non-zero peak values are preserved.
  Patch 2 adds rdma.events.  It introduces rdmacg_event_locked(), which
  propagates hierarchical max counters upward from the over-limit
  cgroup.  The walk uses get_cg_rpool_locked() so that ancestors without
  a prior rpool are still covered, and poll/epoll waiters are notified
  via cgroup_file_notify().
  Patch 3 adds rdma.events.local and hierarchical alloc_fail.  It extends
  the event framework with per-cgroup local counters (local_max for
  the over-limit cgroup, local_alloc_fail for the requesting cgroup)
  and a hierarchical alloc_fail counter propagated from the requestor
  upward.  It also extracts the duplicated rpool-keep predicate into
  an rpool_has_persistent_state() helper and replaces the non-error
  goto dev_err in rdmacg_resource_set_max() with an if-guard.
  Patch 4 documents all three new interface files in cgroup-v2.rst.
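
The counter attribution described for patches 2 and 3 can be modelled
in userspace roughly as follows.  This is only a sketch of the
semantics stated above; struct ev_node and model_record_failure() are
hypothetical names, and locking, rpool lookup and notification are
left out:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cgroup, per-device event counters. */
struct ev_node {
	struct ev_node *parent;
	uint64_t max;			/* rdma.events: hierarchical */
	uint64_t alloc_fail;		/* rdma.events: hierarchical */
	uint64_t local_max;		/* rdma.events.local */
	uint64_t local_alloc_fail;	/* rdma.events.local */
};

/*
 * Record one failed allocation: 'requestor' asked for the resource and
 * 'over_limit' (the requestor or one of its ancestors) hit its limit.
 */
static void model_record_failure(struct ev_node *requestor,
				 struct ev_node *over_limit)
{
	struct ev_node *p;

	/* max: local to the over-limit cgroup, hierarchical from it upward. */
	over_limit->local_max++;
	for (p = over_limit; p; p = p->parent)
		p->max++;

	/* alloc_fail: local to the requestor, hierarchical from it upward. */
	requestor->local_alloc_fail++;
	for (p = requestor; p; p = p->parent)
		p->alloc_fail++;
}
```

With a root / mid / leaf chain where leaf requests and mid is over its
limit, leaf sees alloc_fail but no max, while mid and root see both
hierarchical counters incremented.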

Tao Cui (4):
  cgroup/rdma: add rdma.peak for per-device peak usage tracking
  cgroup/rdma: add rdma.events to track resource limit exhaustion
  cgroup/rdma: add rdma.events.local for per-cgroup allocation failure
    attribution
  cgroup/rdma: document rdma.peak, rdma.events and rdma.events.local

 Documentation/admin-guide/cgroup-v2.rst |  54 +++++++
 include/linux/cgroup_rdma.h             |   4 +
 kernel/cgroup/rdma.c                    | 199 ++++++++++++++++++++++--
 3 files changed, 247 insertions(+), 10 deletions(-)

---
Changes in v3:
  - Switch rdmacg_event_locked() from find_cg_rpool_locked() to
    get_cg_rpool_locked() in the hierarchical propagation loops
    (events_max and events_alloc_fail) to ensure full hierarchical
    coverage; the rpool-keep check now covers event counters, so the
    spurious-rpool concern from v1 no longer applies.
  - Extract the duplicated rpool-keep predicate (peak + 4 event
    counters) into rpool_has_persistent_state() helper.
  - Replace the non-error goto dev_err in rdmacg_resource_set_max()
    with an if-guard so dev_err is only used for real error paths.
  - Fix commit message of rdma.events.local patch to mention the
    rdma.events hierarchical alloc_fail extension.
  - Use %llu and drop (s64) cast in rdmacg_events_show() and
    rdmacg_events_local_show() to match u64 counter type.
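
For illustration only, the rpool-keep predicate (peak + 4 event
counters) could be modelled like this.  The flat struct and the model_*
names are hypothetical; the sketch assumes one set of counters per
rpool and only shows the shape of the check:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flat view of the state that must outlive zero usage. */
struct model_rpool_state {
	uint64_t peak;
	uint64_t events_max;
	uint64_t events_alloc_fail;
	uint64_t events_local_max;
	uint64_t events_local_alloc_fail;
};

/* True if freeing the rpool would lose observable history. */
static bool model_has_persistent_state(const struct model_rpool_state *s)
{
	return s->peak || s->events_max || s->events_alloc_fail ||
	       s->events_local_max || s->events_local_alloc_fail;
}
```

An rpool with zero usage and default limits would then be freed only
when this predicate is false.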

Changes in v2:
  - Fix peak being updated before the full hierarchical charge succeeds.
  - Use find_cg_rpool_locked() to avoid creating spurious rpools.
  - Replace atomic64_t with u64 + READ_ONCE (all under rdmacg_mutex).
  - Use key=value output format, remove trailing spaces.
  - Always list all devices, show zero for devices without an rpool.
  - Extend rpool-free condition to preserve non-zero event counters.
  - Rename "failcnt" to "alloc_fail" (cgroup v2 naming convention).
  - Fix alloc_fail semantics: local to the requesting cgroup only.
  - Add hierarchical alloc_fail to rdma.events for key consistency.
  - Add documentation in Documentation/admin-guide/cgroup-v2.rst.

v1:
  https://lore.kernel.org/all/20260512031719.273507-1-cuitao@kylinos.cn/
v2:
  https://lore.kernel.org/all/20260513104956.373216-1-cuitao@kylinos.cn/
-- 
2.43.0



Thread overview: 9+ messages
2026-05-14  6:50 [PATCH v3 0/4] cgroup/rdma: add rdma.peak and rdma.events[.local] Tao Cui
2026-05-14  6:50 ` [PATCH v3 1/4] cgroup/rdma: add rdma.peak for per-device peak usage tracking Tao Cui
2026-05-14  6:50 ` [PATCH v3 2/4] cgroup/rdma: add rdma.events to track resource limit exhaustion Tao Cui
2026-05-14  6:50 ` [PATCH v3 3/4] cgroup/rdma: add rdma.events.local for per-cgroup allocation failure attribution Tao Cui
2026-05-14  6:50 ` [PATCH v3 4/4] cgroup/rdma: document rdma.peak, rdma.events and rdma.events.local Tao Cui
2026-05-14 21:26 ` [PATCH v3 0/4] cgroup/rdma: add rdma.peak and rdma.events[.local] Tejun Heo
2026-05-15  0:48   ` Tao Cui
2026-05-14 21:27 ` Tejun Heo
2026-05-14 21:27 ` Tejun Heo
