From: Drew Fustini <fustini@kernel.org>
To: sashiko-reviews@lists.linux.dev
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH RFC v5 13/18] riscv_cbqri: resctrl: Add MB_MIN bandwidth allocation via Rbwb
Date: Sat, 30 May 2026 18:50:50 -0700
Message-ID: <ahuT-nUeUwLpzzaR@gen8>
In-Reply-To: <20260525005523.D677A1F000E9@smtp.kernel.org>

On Mon, May 25, 2026 at 12:55:23AM +0000, sashiko-bot@kernel.org wrote:
> > +	case RDT_RESOURCE_MB_MIN:
> > +		/* sum(Rbwb) <= MRBWB validation runs inside cbqri_apply_rbwb(). */
> > +		return cbqri_apply_rbwb(dom->hw_ctrl, closid, cfg_val, true);
> 
> Does this validation approach leak bandwidth when groups are deleted?
> 
> When a resctrl group is deleted via rmdir, the CLOSID is returned to the free
> pool, but standard resctrl flows do not trigger an architecture-specific
> callback to reset the resource's value. 
> 
> Because cbqri_apply_rbwb() iterates over all RCIDs (including inactive ones)
> to enforce sum(Rbwb) <= MRBWB, the bandwidth assigned to deleted groups remains
> permanently accounted against the global MRBWB budget.
> 
> If a user deletes a group to free up bandwidth and tries to assign it to an
> active group, could it be falsely rejected with -EINVAL, effectively leaking
> that capacity until the exact CLOSID is randomly reused?

This is already listed under open issues in the cover letter. The fix
would require changes to the resctrl core, so I would prefer to address
it in a follow-up patch series after this series lands.
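
For readers following the thread, the accounting problem raised in the
quoted review can be modelled in a few lines of userspace C. This is a
simplified sketch, not the driver's code: struct demo_bw_ctrl,
demo_apply_rbwb() and demo_release_rcid() are hypothetical names, the
number of RCIDs is arbitrary, and the reset value of 0 is an assumption.
The release hook only stands in for whatever mechanism a resctrl core
change might eventually provide.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_NR_RCID	4	/* arbitrary size for the model */
#define DEMO_RESET_RBWB	0	/* assumed reset value, not from the patch */

/* Hypothetical model of one bandwidth controller's reservation state. */
struct demo_bw_ctrl {
	unsigned int mrbwb;			/* total reservable blocks */
	unsigned int rbwb[DEMO_NR_RCID];	/* per-RCID reservation */
};

/*
 * Accept a new reservation only if the sum over all RCIDs, including
 * ones whose group no longer exists, stays within MRBWB.
 */
int demo_apply_rbwb(struct demo_bw_ctrl *c, unsigned int rcid,
		    unsigned int val)
{
	unsigned int sum = val;
	unsigned int i;

	if (rcid >= DEMO_NR_RCID)
		return -EINVAL;

	for (i = 0; i < DEMO_NR_RCID; i++)
		if (i != rcid)
			sum += c->rbwb[i];

	if (sum > c->mrbwb)
		return -EINVAL;

	c->rbwb[rcid] = val;
	return 0;
}

/* Hypothetical release hook: return a deleted group's reservation. */
void demo_release_rcid(struct demo_bw_ctrl *c, unsigned int rcid)
{
	assert(rcid < DEMO_NR_RCID);
	c->rbwb[rcid] = DEMO_RESET_RBWB;
}
```

In this model, a reservation written for one RCID keeps counting against
the budget until demo_release_rcid() runs for it, which is the behaviour
the review describes for deleted groups.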

> > +static int cbqri_attach_cpu_to_one_bw_res(struct cbqri_controller *ctrl,
> > +					  enum resctrl_res_level rid,
> > +					  unsigned int cpu)
> > +{
> > +	struct cbqri_resctrl_res *hw_res = &cbqri_resctrl_resources[rid];
> > +	struct rdt_resource *res = &hw_res->resctrl_res;
> > +	struct rdt_ctrl_domain *domain;
> > +	int dom_id = ctrl->mem.prox_dom;

> Can this alias multiple memory controllers to a single domain?
> 
> A single NUMA node (proximity domain) can contain multiple memory controllers.
> If multiple bandwidth controllers share the same proximity domain, the first one
> processed will create a domain and assign its hw_ctrl pointer to it.
> 
> When the second controller is processed, cbqri_find_ctrl_domain() will match
> the existing dom_id, and simply add its CPUs to the cpu_mask without saving
> the second controller's hw_ctrl pointer.
> 
> Will subsequent schemata writes to this domain exclusively program the first
> bandwidth controller, leaving the second unmanaged by resctrl?

This is a known limitation that can be addressed in a future series.
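
To make the limitation concrete, here is a minimal userspace model of
the aliasing described in the quoted review. It assumes, as the review
states, that domains are keyed by proximity domain and that each domain
stores a single controller pointer. All names (demo_ctrl, demo_domain,
demo_attach_cpu() and so on) are hypothetical and the structures are
heavily simplified compared to the driver.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_MAX_DOMAINS 4

/* Hypothetical stand-in for a bandwidth controller. */
struct demo_ctrl {
	int id;
	int prox_dom;
};

/* Hypothetical domain: one id, one CPU mask, one controller pointer. */
struct demo_domain {
	int dom_id;
	unsigned long cpu_mask;
	const struct demo_ctrl *hw_ctrl;
};

struct demo_domains {
	struct demo_domain d[DEMO_MAX_DOMAINS];
	size_t n;
};

struct demo_domain *demo_find_domain(struct demo_domains *ds, int dom_id)
{
	size_t i;

	for (i = 0; i < ds->n; i++)
		if (ds->d[i].dom_id == dom_id)
			return &ds->d[i];
	return NULL;
}

/*
 * Attach a CPU to the domain for ctrl's proximity domain. hw_ctrl is
 * recorded only when the domain is created, so a second controller
 * with the same prox_dom contributes its CPU but is never stored.
 */
struct demo_domain *demo_attach_cpu(struct demo_domains *ds,
				    const struct demo_ctrl *ctrl,
				    unsigned int cpu)
{
	struct demo_domain *dom;

	if (cpu >= sizeof(unsigned long) * CHAR_BIT)
		return NULL;

	dom = demo_find_domain(ds, ctrl->prox_dom);
	if (!dom) {
		if (ds->n == DEMO_MAX_DOMAINS)
			return NULL;
		dom = &ds->d[ds->n++];
		dom->dom_id = ctrl->prox_dom;
		dom->cpu_mask = 0;
		dom->hw_ctrl = ctrl;
	}
	dom->cpu_mask |= 1UL << cpu;
	return dom;
}
```

Attaching CPUs through two controllers that share a prox_dom yields one
domain whose hw_ctrl still points at the first controller.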

> > -		err = cbqri_attach_cpu_to_cap_ctrl(ctrl, cpu);
> > +		}
> >  		if (err)
> >  			break;
> >  	}
> 
> This is a pre-existing issue, but does this leak domain state if an error
> occurs here?
> 
> If an attachment fails, the loop breaks and returns the error immediately. The
> CPU hotplug core catches the error and aborts the online state transition.
> However, it does not invoke the corresponding offline teardown callback because
> this state failed during its startup sequence.
> 
> Since cbqri_resctrl_online_cpu() fails to manually roll back the domain state
> for the controllers that successfully processed the CPU in earlier iterations,
> will the CPU remain permanently set in their cpu_mask?

In the next revision, I will track the last successfully attached
controller and detach in reverse order on the error path.
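
The unwind pattern described above might look roughly like the
following userspace sketch. It is not the planned patch: demo_ctrl,
demo_attach(), demo_detach() and demo_online_cpu() are hypothetical
names, the cpu_attached flag stands in for the CPU's bit in a domain
cpu_mask, and fail_attach exists only to force an error in the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a controller; not the driver's struct. */
struct demo_ctrl {
	bool cpu_attached;	/* models the CPU's bit in cpu_mask */
	bool fail_attach;	/* model-only knob to force a failure */
};

int demo_attach(struct demo_ctrl *c)
{
	if (c->fail_attach)
		return -ENOMEM;
	c->cpu_attached = true;
	return 0;
}

void demo_detach(struct demo_ctrl *c)
{
	c->cpu_attached = false;
}

/*
 * Attach one CPU to every controller. On failure, detach from the
 * controllers that already succeeded, newest first, so no stale state
 * is left behind when the error is returned.
 */
int demo_online_cpu(struct demo_ctrl *ctrls, size_t n)
{
	size_t i;
	int err = 0;

	for (i = 0; i < n; i++) {
		err = demo_attach(&ctrls[i]);
		if (err)
			break;
	}
	if (!err)
		return 0;

	/* ctrls[0..i-1] were attached; ctrls[i] failed and holds nothing. */
	while (i-- > 0)
		demo_detach(&ctrls[i]);

	return err;
}
```

The loop index doubles as the "last successfully attached" marker, so
no separate bookkeeping is needed in this model.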

-Drew

