* Should bios always mark CXL DRAM as EFI_MEMORY_SP?
       [not found] <07cedbe6-00ab-52fc-9475-c8d7120f5a95@jagalactic.com>
@ 2022-01-27 16:18 ` John Groves
  2022-01-28  0:47   ` Dan Williams
  2022-01-28  4:08   ` Li Qiang (Johnny Li)
  0 siblings, 2 replies; 6+ messages in thread
From: John Groves @ 2022-01-27 16:18 UTC
  To: Dan Williams, linux-cxl@vger.kernel.org
  Cc: Jonathan Cameron, Ben Widawsky, John Groves

I’d like to seek some feedback and see whether a consensus exists or can be developed regarding how system firmware (bios/efi/etc) should present CXL DRAM to a system in a pre-fabric world (CXL 1.1/2.0).

The CXL spec, along with the Intel documentation, is pretty specific and useful, but one open issue seems not to be outright specified: should the system mark CXL-attached DRAM as “specific purpose” (EFI_MEMORY_SP)? Consistency across platforms is certainly desirable. If this behavior is not prescribed, we could end up with inconsistent behavior across server and bios vendors.

If this is already specified, no need to read on (but please point me to where it’s specified).

Objective: I think everyone will likely agree that it should be possible to use CXL DRAM as either general-purpose memory, or via DAX, or a mix.

What’s the difference?

Memory marked as EFI_MEMORY_SP:
· Mappable via DAX
· Can be online-converted to general purpose memory via daxctl
· Can be boot-converted to general-purpose with efi=nosoftreserve on Linux command line

Memory NOT marked as EFI_MEMORY_SP:
· CXL is general-purpose memory (NUMA node with no local CPU cores)
· Some of the contents appear to be used for in-memory metadata (presumably buddy lists, etc.)
· Can be boot-converted to DAX with an efi_fake_mem= argument on the Linux command line
· Currently cannot be online-converted to DAX-managed (can this work? Is it intended to be working?)

If online conversion from general-purpose to DAX is not going to work, it seems that the default should preserve the ability to use it either way: mark the memory as EFI_MEMORY_SP.

Is there a right and wrong answer re: EFI_MEMORY_SP? How important is it to have consistency across platforms?

If there is a consensus, the next question is who should express it. Perhaps the CXL consortium. I’m a part of that, but it seemed like the Linux dev community was the right place to start.

Thanks for any thoughts.

John Groves
Micron
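[ Editorial sketch, not part of the original message. ] The boot-time conversion toward DAX mentioned above works by tagging a range with the EFI_MEMORY_SP attribute from the kernel command line. The snippet below is a minimal sketch of how such an argument could be assembled, assuming the `efi_fake_mem=nn@ss:aa` form described in the kernel's command-line documentation of that era. The helper name `efi_fake_mem_arg` and the example size and address are hypothetical; only the attribute value 0x40000 comes from the UEFI definition of EFI_MEMORY_SP.

```python
# UEFI "specific purpose" attribute bit (value from the UEFI specification).
EFI_MEMORY_SP = 0x40000


def efi_fake_mem_arg(size: int, start: int, attribute: int = EFI_MEMORY_SP) -> str:
    """Build an efi_fake_mem= argument of the form nn@ss:aa.

    Hypothetical helper for illustration only; size and start are in bytes.
    """
    if size <= 0 or start < 0:
        raise ValueError("size must be positive and start must be non-negative")
    return f"efi_fake_mem={size:#x}@{start:#x}:{attribute:#x}"


# Made-up range: tag 4 GiB starting at 16 GiB as specific purpose.
print(efi_fake_mem_arg(4 << 30, 16 << 30))
```

The opposite direction needs no address range at all: per the list above, `efi=nosoftreserve` makes the kernel ignore the attribute for the whole boot.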
* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-27 16:18 ` Should bios always mark CXL DRAM as EFI_MEMORY_SP? John Groves
@ 2022-01-28  0:47   ` Dan Williams
  2022-01-28  4:08   ` Li Qiang (Johnny Li)
  1 sibling, 0 replies; 6+ messages in thread
From: Dan Williams @ 2022-01-28  0:47 UTC
  To: John Groves
  Cc: linux-cxl@vger.kernel.org, Jonathan Cameron, Ben Widawsky, John Groves, Linux MM

[ add linux-mm since my opinion is not the only one that matters here ]

Responses inline below with only my Linux kernel developer hat on,
i.e. not necessarily the view of $current_employer:

On Thu, Jan 27, 2022 at 8:18 AM John Groves <john@jagalactic.com> wrote:
>
> I’d like to seek some feedback and see whether a consensus exists or can be developed regarding how system firmware (bios/efi/etc) should present CXL DRAM to a system in a pre-fabric world (CXL 1.1/2.0).
>
> The CXL spec, along with the Intel documentation, is pretty specific and useful, but one open issue seems not to be outright specified: should the system mark CXL-attached DRAM as “specific purpose” (EFI_MEMORY_SP)? Consistency across platforms is certainly desirable. If this behavior is not prescribed, we could end up with inconsistent behavior across server and bios vendors.
>
> If this is already specified, no need to read on (but please point me to where it’s specified).

There is no specification for how an OS handles EFI_MEMORY_SP.
Everything below is only a Linux perspective and likely any other OS
you ask will give a different perspective.

> Objective: I think everyone will likely agree that it should be possible to use CXL DRAM as either general-purpose memory, or via DAX, or a mix.
[..]
>
> · Currently cannot be online-converted to DAX-managed (can this work? Is it intended to be working?)
>

There is no guaranteed way to un-online memory, especially ZONE_NORMAL
memory. There are heuristics to make it fail less often, but in
general it's not reliable so online-conversion to DAX-managed is not
being attempted for the general case.

> If online conversion from general-purpose to DAX is not going to work, it seems that the default should preserve the ability to use it either way: mark the memory as EFI_MEMORY_SP.

Yes, unfortunately that requires a paradigm shift for end users to
make a policy decision about memory where they did not need to make
one before. My hope is that distributions would set a default daxctl
policy to just online soft-reserved (Linux term for EFI_MEMORY_SP)
memory. That way savvy users have a control point to change the policy
to varying degrees of exclusive access through a DAX-device instance /
instances, and other users, that don't even know what EFI_MEMORY_SP
is, will see just another NUMA node by default.

> Is there a right and wrong answer re: EFI_MEMORY_SP? How important is it to have consistency across platforms?

The Principle of Least Surprise applies, and the vast bulk of users
simply don't know that they need to care about memory types and memory
performance classes. The ones that do know and care are also likely
the ones to be surprised if they cannot guarantee 100% exclusive
access, i.e. machines purpose built to run a workload where the
application gets 100% of the high performance memory. The distro gets
to decide the CONFIG_EFI_SOFT_RESERVE policy, and if it chooses
CONFIG_EFI_SOFT_RESERVE=y I think it should go further to ship daxctl
and a policy that onlines it by default.

https://github.com/pmem/ndctl/blob/main/Documentation/daxctl/daxctl-reconfigure-device.txt#L244

> If there is a consensus, the next question is who should express it. Perhaps the CXL consortium. I’m a part of that, but it seemed like the Linux dev community was the right place to start.

EFI_MEMORY_SP is defined as a hint, so to me that effectively kicks
all the policy questions over to OS specific / Distro specific
solution space.
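[ Editorial sketch, not part of the original message. ] Because EFI_MEMORY_SP is only a hint, the outcome for a given range depends on OS policy as much as on firmware. The snippet below is a simplified model of the boot-time decision described in this reply, not the kernel's actual code path. The `Disposition` enum and `classify` function are hypothetical names; `soft_reserve_enabled` is an assumption standing in for a kernel built with CONFIG_EFI_SOFT_RESERVE=y and booted without `efi=nosoftreserve`. The numeric constants are the UEFI-defined values.

```python
from enum import Enum

# UEFI constants (values from the UEFI specification).
EFI_RESERVED_MEMORY_TYPE = 0   # EfiReservedMemoryType
EFI_CONVENTIONAL_MEMORY = 7    # EfiConventionalMemory
EFI_MEMORY_SP = 0x40000        # "specific purpose" attribute bit


class Disposition(Enum):
    SYSTEM_RAM = "system-ram"        # joins the general memory pool at boot
    SOFT_RESERVED = "soft-reserved"  # held back for DAX; policy may online it later
    HARD_RESERVED = "hard-reserved"  # not offered to the general pool by the core OS


def classify(mem_type: int, attributes: int, soft_reserve_enabled: bool) -> Disposition:
    """Hypothetical model of how a Linux-like OS could treat one memory range."""
    if mem_type == EFI_RESERVED_MEMORY_TYPE:
        return Disposition.HARD_RESERVED
    if mem_type == EFI_CONVENTIONAL_MEMORY:
        # The SP hint is honoured only when the soft-reserve policy is enabled.
        if (attributes & EFI_MEMORY_SP) and soft_reserve_enabled:
            return Disposition.SOFT_RESERVED
        return Disposition.SYSTEM_RAM
    raise ValueError(f"memory type {mem_type} is outside the scope of this sketch")


# Same firmware marking, two different OS policies.
print(classify(EFI_CONVENTIONAL_MEMORY, EFI_MEMORY_SP, soft_reserve_enabled=True))
print(classify(EFI_CONVENTIONAL_MEMORY, EFI_MEMORY_SP, soft_reserve_enabled=False))
```

In this model the distro default suggested above corresponds to a later userspace step (a daxctl policy) that moves SOFT_RESERVED ranges into the general pool unless the administrator opts out.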
* RE: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-27 16:18 ` Should bios always mark CXL DRAM as EFI_MEMORY_SP? John Groves
  2022-01-28  0:47   ` Dan Williams
@ 2022-01-28  4:08   ` Li Qiang (Johnny Li)
  2022-01-28  5:28     ` Dan Williams
  1 sibling, 1 reply; 6+ messages in thread
From: Li Qiang (Johnny Li) @ 2022-01-28  4:08 UTC
  To: 'John Groves', 'Dan Williams', linux-cxl
  Cc: 'Jonathan Cameron', 'Ben Widawsky', 'John Groves'

I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure

In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
  0 – EfiConventionalMemory
  1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
  2 – EfiReservedMemoryType
  3-255 – Reserved encoding
The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.

Thanks
Johnny
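[ Editorial sketch, not part of the original message. ] The DSEMTS encoding quoted in this reply maps a one-byte field onto a UEFI memory type plus attribute bits. The snippet below is a minimal sketch of that mapping under the definition given above. The function name `decode_dsemts` and the boolean `dsmas_nonvolatile` parameter (standing in for the NonVolatile flag of the associated DSMAS entry) are hypothetical; the numeric constants are the UEFI-defined values.

```python
from typing import Tuple

# UEFI constants (values from the UEFI specification).
EFI_RESERVED_MEMORY_TYPE = 0   # EfiReservedMemoryType
EFI_CONVENTIONAL_MEMORY = 7    # EfiConventionalMemory
EFI_MEMORY_NV = 0x8000         # non-volatile attribute bit
EFI_MEMORY_SP = 0x40000        # "specific purpose" attribute bit


def decode_dsemts(encoding: int, dsmas_nonvolatile: bool = False) -> Tuple[int, int]:
    """Map the DSEMTS 'EFI Memory Type and Attribute' field to (type, attributes).

    Hypothetical helper: encodings 0..2 follow the table quoted in the thread,
    and 3..255 are rejected because they are reserved.
    """
    if not 0 <= encoding <= 255:
        raise ValueError("the field is a single byte")
    if encoding == 0:
        mem_type, attributes = EFI_CONVENTIONAL_MEMORY, 0
    elif encoding == 1:
        mem_type, attributes = EFI_CONVENTIONAL_MEMORY, EFI_MEMORY_SP
    elif encoding == 2:
        mem_type, attributes = EFI_RESERVED_MEMORY_TYPE, 0
    else:
        raise ValueError(f"reserved DSEMTS encoding: {encoding}")
    # EFI_MEMORY_NV is not carried in DSEMTS; it is inferred from DSMAS.
    if dsmas_nonvolatile:
        attributes |= EFI_MEMORY_NV
    return mem_type, attributes


mem_type, attributes = decode_dsemts(1)
print(mem_type, hex(attributes))
```

Note that this only captures what the device advertises. Whether firmware should pass the attribute through unchanged is exactly the question the rest of the thread discusses.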
* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-28  4:08   ` Li Qiang (Johnny Li)
@ 2022-01-28  5:28     ` Dan Williams
  2022-01-28 10:24       ` Jonathan Cameron
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2022-01-28  5:28 UTC
  To: Li Qiang (Johnny Li)
  Cc: John Groves, linux-cxl, Jonathan Cameron, Ben Widawsky, John Groves

On Thu, Jan 27, 2022 at 8:12 PM Li Qiang (Johnny Li)
<johnny.li@montage-tech.com> wrote:
>
> I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure
>
> In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
> 0 – EfiConventionalMemory
> 1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
> 2 – EfiReservedMemoryType
> 3-255 – Reserved encoding
> The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
> Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.

Definitely BIOS should follow CDAT for the type, but it's not so clear
to me the same can be said about the attribute. I think the bigger
question is when should devices claim to be EFI_MEMORY_SP, and when
should BIOS apply EFI_MEMORY_SP regardless of what the device
advertises. EFI_MEMORY_SP is a claim about usage that the memory is
either too high performance or too low performance to be added to the
general memory pool by default. That's not a decision that a device
necessarily knows how to make on its own. The platform BIOS might have
a better chance to know the intended application the system was built
for. The OS kernel is somewhat blind to usage but OS policy can do the
last mile tuning of how much if any memory of a given performance
class should be set aside for exclusive access.
* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-28  5:28     ` Dan Williams
@ 2022-01-28 10:24       ` Jonathan Cameron
  2022-01-28 16:12         ` Dan Williams
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Cameron @ 2022-01-28 10:24 UTC
  To: Dan Williams
  Cc: Li Qiang (Johnny Li), John Groves, linux-cxl, Ben Widawsky, John Groves

On Thu, 27 Jan 2022 21:28:40 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> On Thu, Jan 27, 2022 at 8:12 PM Li Qiang (Johnny Li)
> <johnny.li@montage-tech.com> wrote:
> >
> > I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure
> >
> > In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
> > 0 – EfiConventionalMemory
> > 1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
> > 2 – EfiReservedMemoryType
> > 3-255 – Reserved encoding
> > The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
> > Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.
>
> Definitely BIOS should follow CDAT for the type, but it's not so clear
> to me the same can be said about the attribute. I think the bigger
> question is when should devices claim to be EFI_MEMORY_SP, and when
> should BIOS apply EFI_MEMORY_SP regardless of what the device
> advertises. EFI_MEMORY_SP is a claim about usage that the memory is
> either too high performance or too low performance to be added to the
> general memory pool by default. That's not a decision that a device
> necessarily knows how to make on its own. The platform BIOS might have
> a better chance to know the intended application the system was built
> for. The OS kernel is somewhat blind to usage but OS policy can do the
> last mile tuning of how much if any memory of a given performance
> class should be set aside for exclusive access.

I'd add another spin based on where EFI_MEMORY_SP originally came from,
though it's not relevant to memory-only devices which I think is what
is being discussed here.

For some devices the memory will work fine as general purpose RAM, but
it was put there with an intended use. Typically something like DDR
attached to a GPU or other accelerator. Might be nice and quick for
general use, but it's even quicker if the GPU is using it :)

Again, how much to reserve for what use case is an OS policy decision
hence the hint from the attribute. Neither the device nor the
bios can know the answer as it depends on what is actually being run
in the OS.

Jonathan
* Re: Should bios always mark CXL DRAM as EFI_MEMORY_SP?
  2022-01-28 10:24       ` Jonathan Cameron
@ 2022-01-28 16:12         ` Dan Williams
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Williams @ 2022-01-28 16:12 UTC
  To: Jonathan Cameron
  Cc: Li Qiang (Johnny Li), John Groves, linux-cxl, Ben Widawsky, John Groves

On Fri, Jan 28, 2022 at 2:25 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Thu, 27 Jan 2022 21:28:40 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
>
> > On Thu, Jan 27, 2022 at 8:12 PM Li Qiang (Johnny Li)
> > <johnny.li@montage-tech.com> wrote:
> > >
> > > I think BIOS should follow CDAT spec v1.01 Device Scoped EFI Memory Type Structure (DSEMTS) structure
> > >
> > > In Table 8 Device Scoped EFI Memory Type Structure, field EFI Memory Type and Attribute has below definition
> > > 0 – EfiConventionalMemory
> > > 1 - EfiConventionalMemory Type with EFI_MEMORY_SP Attribute
> > > 2 – EfiReservedMemoryType
> > > 3-255 – Reserved encoding
> > > The memory attribute EFI_MEMORY_NV may be inferred from NonVolatile flag in DSMAS.
> > > Memory types other than EfiConventionalMemory and EfiReservedMemoryType are not permitted.
> >
> > Definitely BIOS should follow CDAT for the type, but it's not so clear
> > to me the same can be said about the attribute. I think the bigger
> > question is when should devices claim to be EFI_MEMORY_SP, and when
> > should BIOS apply EFI_MEMORY_SP regardless of what the device
> > advertises. EFI_MEMORY_SP is a claim about usage that the memory is
> > either too high performance or too low performance to be added to the
> > general memory pool by default. That's not a decision that a device
> > necessarily knows how to make on its own. The platform BIOS might have
> > a better chance to know the intended application the system was built
> > for. The OS kernel is somewhat blind to usage but OS policy can do the
> > last mile tuning of how much if any memory of a given performance
> > class should be set aside for exclusive access.
>
> I'd add another spin based on where EFI_MEMORY_SP originally came from,
> though it's not relevant to memory-only devices which I think is what
> is being discussed here.
>
> For some devices the memory will work fine as general purpose RAM, but
> it was put there with an intended use. Typically something like DDR
> attached to a GPU or other accelerator. Might be nice and quick for
> general use, but it's even quicker if the GPU is using it :)
>
> Again, how much to reserve for what use case is an OS policy decision
> hence the hint from the attribute. Neither the device nor the
> bios can know the answer as it depends on what is actually being run
> in the OS.

I agree this was one of the original motivations, but every time I
talk to a GPU developer and bring up the case of the OS taking a page,
and pinning it indefinitely, they balk. So I think this case is
covered by setting the type to EfiReservedMemoryType (hard-reserved)
and the GPU driver owns the policy about giving the memory to the OS
general pool, if ever.