From: Yuquan Wang <wangyuquan1236@phytium.com.cn>
To: Gregory Price <gregory.price@memverge.com>
Cc: lizhijian@fujitsu.com, dan.j.williams@intel.com,
	linux-cxl@vger.kernel.org, y-goto@fujitsu.com,
	Jonathan.Cameron@huawei.com, dave.jiang@intel.com,
	fan.ni@samsung.com
Subject: Re: CXL volatile memory: How to restore the previous region/Interleave set
Date: Thu, 30 May 2024 18:35:10 +0800
Message-ID: <ZlhWXu6l6i2nYL+t@phytium.com.cn>
In-Reply-To: <ZldaiYTv1fPcGbCs@memverge.com>

On Wed, May 29, 2024 at 12:40:41PM -0400, Gregory Price wrote:
> 
> The CFMWS is the BIOS/EFI's mechanism to report the system configuration
> to the Operating System, not the Operating System's mechanism to change
> system configurations (such as interleave).  What you're talking about
> is re-configuring HDM Decoders to interleave devices *presented by* the
> CFMWS to the operating system.
> 
> Confusing, I know. But stick with me.
> 
> 
> 
> The interleave described in the CFMWS is the BIOS/EFI telling the system
> that memory accesses to this (physical address) region will be interleaved
> across the set of devices that are backing that region. The operating system
> is responsible for reading these settings and presenting the memory to the
> system accordingly.
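
To make the quoted description concrete, here is a minimal sketch of how an
address inside an interleaved window maps to one of the backing devices. It
assumes plain modulo interleave with a fixed granularity; real CFMWS entries
can also select XOR-based arithmetic, and real decoders have further
constraints on ways and granularity. The function name and parameters are
hypothetical, chosen only for illustration.

```python
def interleave_target(hpa, base, ways, granularity):
    """Return the index of the device backing host physical address `hpa`.

    Simplified model (assumption): modulo interleave, where consecutive
    `granularity`-byte chunks of the window rotate across `ways` devices.
    """
    if ways < 1 or granularity < 1:
        raise ValueError("ways and granularity must be positive")
    if hpa < base:
        raise ValueError("address is below the start of the window")
    offset = hpa - base
    # Chunk number within the window, wrapped around the device count.
    return (offset // granularity) % ways


# A 2-way window with 256-byte granularity alternates devices per chunk.
targets = [interleave_target(addr, 0, 2, 256) for addr in (0, 256, 512, 768)]
```

With these inputs, consecutive 256-byte chunks alternate between device 0 and
device 1, which is what "interleaved across the set of devices" means above.
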
> 
> The BIOS for example could configure all devices behind a single CFMW as
> a "Single Device" that interleaves many physical devices, and the OS should
> present it as such.  In this scenario, there is no need to configure an
> interleave region via cxl-cli - the BIOS already did that for you and
> presented all these devices as a single device.  All you need to do is
> online the memory.
> 
Sorry Gregory, I have a question here. Is it right that, according to your
description, the BIOS drivers could prepare some interleaved CXL region
configurations on the default CXL hardware (SoC), just as we do with the
ndctl tools at OS run time (cxl create-region)?

> Configuring the CFMWS *should* (but may not) manifest as a set of BIOS/EFI
> options that say how to configure a set of CXL devices behind one or more
> host bridges prior to OS boot. This has its limitations. For example, you'd
> need to reboot the system to make changes and hotplugging a memory device
> becomes impossible. The BIOS/EFI would also need to understand when the
> prior configuration is no longer valid - complicated and problematic.
> 
> Additionally, for more dynamic environments (devices behind a switch,
> or a DCD) this more "static" configuration may (read: does) reduce your
> management flexibility.  I.e. hotplug may not be possible.
> 
> 
> 
> Alternatively, the BIOS may configure each device separately, and the
> OS may create a region that interleaves those devices explicitly by
> programming an HDM decoder.
> 
> In this scenario, the OS could tear down the region, hotplug that device,
> and recreate the region with new settings accordingly. Greater
> management flexibility, but more software/management complexity.
> 
> This requires the OS to recreate the region/interleave set on each
> reboot - and is probably the preferred mechanism for configuring the
> system (if only because hotplug and device failure are not uncommon).
> 
> In this scenario, re-configuration looks a lot like storage mounting.
> The device is either there or it isn't, and the configuration file
> either works or it doesn't.  Alternatively the daemon setting this all
> up is free to try to make auto-configuration decisions.
> 
> 
> 
> 
> (Final note about interleave for completeness' sake, but not really
> relevant to this discussion)
> 
> Alternatively you could just online each device as a separate region,
> and simply use something like set_mempolicy/numactl to implement
> interleave on a per-task basis.
> 
> 
> > 
> > But the reality is that the above scenario applies only to persistent memory with an LSA.
> > Even if a user configures a new region for volatile memory, I could not find any specification for
> > conveying the new configuration to the firmware.
> > 
> > Could you tell me why such an interface is not defined in the CXL specification?
> > Is it just because there is no place to store region information for volatile memory?
> >
> > 
> > IMHO, users want to keep the previous configuration after a reboot, even for volatile memory.
> > Though users don't care about the contents of volatile memory, they want to keep the region/interleave
> > configuration after a reboot. Especially if the previous configuration was made some years ago, I'll bet
> > users will have forgotten how they configured regions on CXL volatile memory.
> >
> 
> Probably we want some daemon that reconfigures this similar to how we're
> doing it with storage.  You register a preferred configuration given the
> hardware environment that is valid until the hardware changes.
> 
> The OS shouldn't really be telling the firmware to configure itself if
> only because what happens if you unplug a device?
> 
> ~Gregory
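
The "register a preferred configuration that is valid until the hardware
changes" idea above could be sketched roughly as follows. This is only an
illustration under stated assumptions: the SavedRegion record, its fields,
and the use of device serial numbers as the identity check are hypothetical
and are not taken from cxl-cli or any existing daemon. The sketch only
decides which saved regions are still valid; actually recreating them (for
example via cxl create-region) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SavedRegion:
    """Hypothetical saved description of one interleaved region."""
    name: str
    ways: int
    granularity: int          # interleave granularity in bytes
    member_serials: tuple     # device serial numbers, in interleave order


def plan_restore(saved, present_serials):
    """Split saved regions into restorable ones and stale ones.

    A region is restorable only if every member device is still present
    and the member count matches the recorded number of ways. Stale
    entries are returned with the list of missing serial numbers.
    """
    present = set(present_serials)
    restorable, stale = [], []
    for region in saved:
        missing = [s for s in region.member_serials if s not in present]
        if missing or len(region.member_serials) != region.ways:
            stale.append((region.name, missing))
        else:
            restorable.append(region)
    return restorable, stale


saved = [
    SavedRegion("region0", 2, 4096, ("0xa", "0xb")),
    SavedRegion("region1", 2, 4096, ("0xc", "0xd")),
]
# Device 0xd has been unplugged since the configuration was saved.
ok, stale = plan_restore(saved, ["0xa", "0xb", "0xc"])
```

Here region0 can be recreated as recorded, while region1 is reported as stale
because one of its members is gone, much like a mount entry whose disk is
missing.
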


