* [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
@ 2026-03-31 6:38 Uros Bizjak
2026-03-31 6:38 ` [PATCH 2/2] x86/process: Use storesegment() when saving segment selectors Uros Bizjak
2026-03-31 6:56 ` [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Ingo Molnar
0 siblings, 2 replies; 8+ messages in thread
From: Uros Bizjak @ 2026-03-31 6:38 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Uros Bizjak, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen
Introduce a new helper, storesegment(), that stores a segment selector
directly into a u16 (or compatible) memory location without using an
intermediate general-purpose register.
To support this, split the existing SAVE_SEGMENT macro into two parts:
SAVE_SEGMENT_VAR(): retains the current behavior of reading a segment
register into an unsigned long via a register.
SAVE_SEGMENT_PTR(): adds a new variant that writes the 16-bit selector
directly to memory.
The combined SAVE_SEGMENT() macro now generates both helpers for each
segment register.
The new storesegment() interface is preferred over savesegment() when
the value only needs to be stored (e.g. into a struct field), avoiding
an unnecessary register move and making the intent clearer.
No functional change for existing users of savesegment().
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Suggested-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
---
arch/x86/include/asm/segment.h | 30 +++++++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index dbd90fede5e7..b9728c4acd46 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -344,7 +344,7 @@ static inline void __loadsegment_fs(u16 value)
/*
* Save a segment register away:
*/
-#define SAVE_SEGMENT(seg) \
+#define SAVE_SEGMENT_VAR(seg) \
static inline unsigned long __savesegment_##seg(void) \
{ \
unsigned long v; \
@@ -352,6 +352,16 @@ static inline unsigned long __savesegment_##seg(void) \
return v; \
}
+#define SAVE_SEGMENT_PTR(seg) \
+static inline void __savesegment_##seg##_ptr(u16 *p) \
+{ \
+ asm volatile("movw %%" #seg ",%0" : "=m" (*p)); \
+}
+
+#define SAVE_SEGMENT(seg) \
+ SAVE_SEGMENT_VAR(seg) \
+ SAVE_SEGMENT_PTR(seg)
+
SAVE_SEGMENT(cs)
SAVE_SEGMENT(ss)
SAVE_SEGMENT(ds)
@@ -359,10 +369,28 @@ SAVE_SEGMENT(es)
SAVE_SEGMENT(fs)
SAVE_SEGMENT(gs)
+#undef SAVE_SEGMENT_VAR
+#undef SAVE_SEGMENT_PTR
#undef SAVE_SEGMENT
+/*
+ * savesegment(seg, var) - Read a segment register into an unsigned long.
+ *
+ * Reads the segment selector via a general-purpose register into an
+ * unsigned long. Preferred when the value is needed in a register for
+ * subsequent arithmetic or comparison.
+ */
#define savesegment(seg, var) ((var) = __savesegment_##seg())
+/*
+ * storesegment(seg, loc) - Store a segment register directly to a u16 location.
+ *
+ * Writes the 16-bit segment selector directly to memory, bypassing any
+ * intermediate general-purpose register. Preferred over savesegment()
+ * for simply saving a segment register to a u16 (or compatible) field.
+ */
+#define storesegment(seg, loc) __savesegment_##seg##_ptr(&(loc))
+
#endif /* !__ASSEMBLER__ */
#endif /* __KERNEL__ */
--
2.53.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] x86/process: Use storesegment() when saving segment selectors
2026-03-31 6:38 [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Uros Bizjak
@ 2026-03-31 6:38 ` Uros Bizjak
2026-03-31 6:56 ` [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Ingo Molnar
1 sibling, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2026-03-31 6:38 UTC (permalink / raw)
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
Switch segment save sites in copy_thread() and save_fsgs() from
savesegment() to the newly introduced storesegment() helper.
These call sites only store the segment selector into a u16 field
and do not require the value in a general-purpose register. Using
storesegment() avoids the unnecessary register intermediate and
better matches the intended use.
The generated code is improved from, e.g.:
34f: 8c c0 mov %es,%eax
351: 66 89 83 08 0c 00 00 mov %ax,0xc08(%rbx)
to a single direct store:
34f: 8c 83 08 0c 00 00 mov %es,0xc08(%rbx)
This is a mechanical follow-up to the introduction of
storesegment(), with no functional change.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
arch/x86/kernel/process.c | 4 ++--
arch/x86/kernel/process_64.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4c718f8adc59..10e2ba668a14 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -197,8 +197,8 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.gsindex = current->thread.gsindex;
p->thread.gsbase = current->thread.gsbase;
- savesegment(es, p->thread.es);
- savesegment(ds, p->thread.ds);
+ storesegment(es, p->thread.es);
+ storesegment(ds, p->thread.ds);
if (p->mm && (clone_flags & (CLONE_VM | CLONE_VFORK)) == CLONE_VM)
set_bit(MM_CONTEXT_LOCK_LAM, &p->mm->context.flags);
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index b85e715ebb30..fe391961ed86 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -275,8 +275,8 @@ static __always_inline void save_base_legacy(struct task_struct *prev_p,
static __always_inline void save_fsgs(struct task_struct *task)
{
- savesegment(fs, task->thread.fsindex);
- savesegment(gs, task->thread.gsindex);
+ storesegment(fs, task->thread.fsindex);
+ storesegment(gs, task->thread.gsindex);
if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
/*
* If FSGSBASE is enabled, we can't make any useful guesses
--
2.53.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
2026-03-31 6:38 [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Uros Bizjak
2026-03-31 6:38 ` [PATCH 2/2] x86/process: Use storesegment() when saving segment selectors Uros Bizjak
@ 2026-03-31 6:56 ` Ingo Molnar
2026-03-31 9:53 ` Uros Bizjak
1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2026-03-31 6:56 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Dave Hansen
* Uros Bizjak <ubizjak@gmail.com> wrote:
> Introduce a new helper, storesegment(), that stores a segment selector
> directly into a u16 (or compatible) memory location without using an
> intermediate general-purpose register.
>
> To support this, split the existing SAVE_SEGMENT macro into two parts:
>
> SAVE_SEGMENT_VAR(): retains the current behavior of reading a segment
> register into an unsigned long via a register.
> SAVE_SEGMENT_PTR(): adds a new variant that writes the 16-bit selector
> directly to memory.
>
> The combined SAVE_SEGMENT() macro now generates both helpers for each
> segment register.
>
> The new storesegment() interface is preferred over savesegment() when
> the value only needs to be stored (e.g. into a struct field), avoiding
> an unnecessary register move and making the intent clearer.
>
> No functional change for existing users of savesegment().
Why does the API have to be split into =r and =m variants?
Coulnd't we use a more generic constraint and let the compiler
decide what the target is? Would that negatively impact
other aspects of code generation?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
2026-03-31 6:56 ` [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Ingo Molnar
@ 2026-03-31 9:53 ` Uros Bizjak
2026-03-31 9:59 ` Uros Bizjak
2026-04-01 6:40 ` Ingo Molnar
0 siblings, 2 replies; 8+ messages in thread
From: Uros Bizjak @ 2026-03-31 9:53 UTC (permalink / raw)
To: Ingo Molnar
Cc: x86, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Dave Hansen
On Tue, Mar 31, 2026 at 8:56 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Uros Bizjak <ubizjak@gmail.com> wrote:
>
> > Introduce a new helper, storesegment(), that stores a segment selector
> > directly into a u16 (or compatible) memory location without using an
> > intermediate general-purpose register.
> >
> > To support this, split the existing SAVE_SEGMENT macro into two parts:
> >
> > SAVE_SEGMENT_VAR(): retains the current behavior of reading a segment
> > register into an unsigned long via a register.
> > SAVE_SEGMENT_PTR(): adds a new variant that writes the 16-bit selector
> > directly to memory.
> >
> > The combined SAVE_SEGMENT() macro now generates both helpers for each
> > segment register.
> >
> > The new storesegment() interface is preferred over savesegment() when
> > the value only needs to be stored (e.g. into a struct field), avoiding
> > an unnecessary register move and making the intent clearer.
> >
> > No functional change for existing users of savesegment().
>
> Why does the API have to be split into =r and =m variants?
>
> Coulnd't we use a more generic constraint and let the compiler
> decide what the target is? Would that negatively impact
> other aspects of code generation?
The "=r" variant actually outputs zero-extended value to the whole
register width. So, the "=r" variant is used to eliminate
zero-extensions when the value is used in the follow-up calculations,
comparisons, or when the value is stored to a location that is more
than 16-bits wide. Additionally, "r" variant always uses MOVL, where
operand size prefix byte (0x66) is not needed.
The "=m" variant only outputs to a 16-bit location. Having "=rm" here
would always emit a 0x66 operand size prefix when register is used as
an output, and there would be many zero-extensions emitted, because
the compiler needs to zero-extend the value from 'unsigned short' to
anything wider.
Other than that, GCC (and Clang, too) has serious problems with "=rm"
output constraints. Forward propagation (AKA combine pass) does not
work reliably with assembly outputs (due to always present clobbers
for assembly clauses), so there will be many cases of moves to a
temporary register and even to a temporary stack location with this
constraint. Having two separate functions (with clear and
informational function comment) leaves the decision to the programmer,
which function is the most optimal.
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
2026-03-31 9:53 ` Uros Bizjak
@ 2026-03-31 9:59 ` Uros Bizjak
2026-04-01 6:40 ` Ingo Molnar
1 sibling, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2026-03-31 9:59 UTC (permalink / raw)
To: Ingo Molnar
Cc: x86, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Dave Hansen
[-- Attachment #1: Type: text/plain, Size: 2770 bytes --]
On Tue, Mar 31, 2026 at 11:53 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Tue, Mar 31, 2026 at 8:56 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > > Introduce a new helper, storesegment(), that stores a segment selector
> > > directly into a u16 (or compatible) memory location without using an
> > > intermediate general-purpose register.
> > >
> > > To support this, split the existing SAVE_SEGMENT macro into two parts:
> > >
> > > SAVE_SEGMENT_VAR(): retains the current behavior of reading a segment
> > > register into an unsigned long via a register.
> > > SAVE_SEGMENT_PTR(): adds a new variant that writes the 16-bit selector
> > > directly to memory.
> > >
> > > The combined SAVE_SEGMENT() macro now generates both helpers for each
> > > segment register.
> > >
> > > The new storesegment() interface is preferred over savesegment() when
> > > the value only needs to be stored (e.g. into a struct field), avoiding
> > > an unnecessary register move and making the intent clearer.
> > >
> > > No functional change for existing users of savesegment().
> >
> > Why does the API have to be split into =r and =m variants?
> >
> > Coulnd't we use a more generic constraint and let the compiler
> > decide what the target is? Would that negatively impact
> > other aspects of code generation?
>
> The "=r" variant actually outputs zero-extended value to the whole
> register width. So, the "=r" variant is used to eliminate
> zero-extensions when the value is used in the follow-up calculations,
> comparisons, or when the value is stored to a location that is more
> than 16-bits wide. Additionally, "r" variant always uses MOVL, where
> operand size prefix byte (0x66) is not needed.
>
> The "=m" variant only outputs to a 16-bit location. Having "=rm" here
> would always emit a 0x66 operand size prefix when register is used as
> an output, and there would be many zero-extensions emitted, because
> the compiler needs to zero-extend the value from 'unsigned short' to
> anything wider.
>
> Other than that, GCC (and Clang, too) has serious problems with "=rm"
> output constraints. Forward propagation (AKA combine pass) does not
> work reliably with assembly outputs (due to always present clobbers
> for assembly clauses), so there will be many cases of moves to a
> temporary register and even to a temporary stack location with this
> constraint. Having two separate functions (with clear and
> informational function comment) leaves the decision to the programmer,
> which function is the most optimal.
I forgot to say that there are more opportunities for storesegment()
in other subsystems, please see the attached patch.
Uros.
[-- Attachment #2: segment.diff.txt --]
[-- Type: text/plain, Size: 2875 bytes --]
diff --git a/arch/x86/hyperv/hv_crash.c b/arch/x86/hyperv/hv_crash.c
index 5ffcc23255de..1d29ada3d9f1 100644
--- a/arch/x86/hyperv/hv_crash.c
+++ b/arch/x86/hyperv/hv_crash.c
@@ -207,12 +207,12 @@ static void hv_hvcrash_ctxt_save(void)
asm volatile("movq %%cr2, %0" : "=r"(ctxt->cr2));
asm volatile("movq %%cr8, %0" : "=r"(ctxt->cr8));
- asm volatile("movw %%cs, %0" : "=m"(ctxt->cs));
- asm volatile("movw %%ss, %0" : "=m"(ctxt->ss));
- asm volatile("movw %%ds, %0" : "=m"(ctxt->ds));
- asm volatile("movw %%es, %0" : "=m"(ctxt->es));
- asm volatile("movw %%fs, %0" : "=m"(ctxt->fs));
- asm volatile("movw %%gs, %0" : "=m"(ctxt->gs));
+ storesegment(cs, ctxt->cs);
+ storesegment(ss, ctxt->ss);
+ storesegment(ds, ctxt->ds);
+ storesegment(es, ctxt->es);
+ storesegment(fs, ctxt->fs);
+ storesegment(gs, ctxt->gs);
native_store_gdt(&ctxt->gdtr);
store_idt(&ctxt->idtr);
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 5cfb27f26583..9b20eb699c54 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -106,11 +106,11 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
asm volatile("mov %%r14,%0" : "=m"(newregs->r14));
asm volatile("mov %%r15,%0" : "=m"(newregs->r15));
#endif
- asm volatile("mov %%ss,%k0" : "=a"(newregs->ss));
- asm volatile("mov %%cs,%k0" : "=a"(newregs->cs));
+ storesegment(ss, newregs->ss);
+ storesegment(cs, newregs->cs);
#ifdef CONFIG_X86_32
- asm volatile("mov %%ds,%k0" : "=a"(newregs->ds));
- asm volatile("mov %%es,%k0" : "=a"(newregs->es));
+ storesegment(ds, newregs->ds);
+ storesegment(es, newregs->es);
#endif
asm volatile("pushf\n\t"
"pop %0" : "=m"(newregs->flags));
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8b24e682535b..4e9939a8ff30 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1390,8 +1390,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
host_state->ldt_sel = kvm_read_ldt();
#ifdef CONFIG_X86_64
- savesegment(ds, host_state->ds_sel);
- savesegment(es, host_state->es_sel);
+ storesegment(ds, host_state->ds_sel);
+ storesegment(es, host_state->es_sel);
gs_base = cpu_kernelmode_gs_base(cpu);
if (likely(is_64bit_mm(current->mm))) {
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 702f30eaf9c4..24253f40fa52 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -105,11 +105,11 @@ static void __save_processor_state(struct saved_context *ctxt)
/*
* segment registers
*/
- savesegment(gs, ctxt->gs);
+ storesegment(gs, ctxt->gs);
#ifdef CONFIG_X86_64
- savesegment(fs, ctxt->fs);
- savesegment(ds, ctxt->ds);
- savesegment(es, ctxt->es);
+ storesegment(fs, ctxt->fs);
+ storesegment(ds, ctxt->ds);
+ storesegment(es, ctxt->es);
rdmsrq(MSR_FS_BASE, ctxt->fs_base);
rdmsrq(MSR_GS_BASE, ctxt->kernelmode_gs_base);
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
2026-03-31 9:53 ` Uros Bizjak
2026-03-31 9:59 ` Uros Bizjak
@ 2026-04-01 6:40 ` Ingo Molnar
2026-04-01 6:59 ` Uros Bizjak
1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2026-04-01 6:40 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Dave Hansen
* Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Mar 31, 2026 at 8:56 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > > Introduce a new helper, storesegment(), that stores a segment selector
> > > directly into a u16 (or compatible) memory location without using an
> > > intermediate general-purpose register.
> > >
> > > To support this, split the existing SAVE_SEGMENT macro into two parts:
> > >
> > > SAVE_SEGMENT_VAR(): retains the current behavior of reading a segment
> > > register into an unsigned long via a register.
> > > SAVE_SEGMENT_PTR(): adds a new variant that writes the 16-bit selector
> > > directly to memory.
> > >
> > > The combined SAVE_SEGMENT() macro now generates both helpers for each
> > > segment register.
> > >
> > > The new storesegment() interface is preferred over savesegment() when
> > > the value only needs to be stored (e.g. into a struct field), avoiding
> > > an unnecessary register move and making the intent clearer.
> > >
> > > No functional change for existing users of savesegment().
> >
> > Why does the API have to be split into =r and =m variants?
> >
> > Coulnd't we use a more generic constraint and let the compiler
> > decide what the target is? Would that negatively impact
> > other aspects of code generation?
>
> The "=r" variant actually outputs zero-extended value to the whole
> register width. So, the "=r" variant is used to eliminate
> zero-extensions when the value is used in the follow-up calculations,
> comparisons, or when the value is stored to a location that is more
> than 16-bits wide. Additionally, "r" variant always uses MOVL, where
> operand size prefix byte (0x66) is not needed.
>
> The "=m" variant only outputs to a 16-bit location. Having "=rm" here
> would always emit a 0x66 operand size prefix when register is used as
> an output, and there would be many zero-extensions emitted, because
> the compiler needs to zero-extend the value from 'unsigned short' to
> anything wider.
>
> Other than that, GCC (and Clang, too) has serious problems with "=rm"
> output constraints. Forward propagation (AKA combine pass) does not
> work reliably with assembly outputs (due to always present clobbers
> for assembly clauses), so there will be many cases of moves to a
> temporary register and even to a temporary stack location with this
> constraint. Having two separate functions (with clear and
> informational function comment) leaves the decision to the programmer,
> which function is the most optimal.
Yeah, so I still have a problem with the split API, 'savesegment()'
is very similar to 'storesegment()' and there's no real way to tell
them apart in practice.
Since the main difference is that the =m variant outputs to an u16, I'd
suggest naming the two APIs according to the type they handle, not some
random word that nobody remembers:
savesegment()
savesegment_u16()
... or so?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
2026-04-01 6:40 ` Ingo Molnar
@ 2026-04-01 6:59 ` Uros Bizjak
2026-04-01 7:02 ` Ingo Molnar
0 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2026-04-01 6:59 UTC (permalink / raw)
To: Ingo Molnar
Cc: x86, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Dave Hansen
On Wed, Apr 1, 2026 at 8:40 AM Ingo Molnar <mingo@kernel.org> wrote:
> > > Why does the API have to be split into =r and =m variants?
> > >
> > > Coulnd't we use a more generic constraint and let the compiler
> > > decide what the target is? Would that negatively impact
> > > other aspects of code generation?
> >
> > The "=r" variant actually outputs zero-extended value to the whole
> > register width. So, the "=r" variant is used to eliminate
> > zero-extensions when the value is used in the follow-up calculations,
> > comparisons, or when the value is stored to a location that is more
> > than 16-bits wide. Additionally, "r" variant always uses MOVL, where
> > operand size prefix byte (0x66) is not needed.
> >
> > The "=m" variant only outputs to a 16-bit location. Having "=rm" here
> > would always emit a 0x66 operand size prefix when register is used as
> > an output, and there would be many zero-extensions emitted, because
> > the compiler needs to zero-extend the value from 'unsigned short' to
> > anything wider.
> >
> > Other than that, GCC (and Clang, too) has serious problems with "=rm"
> > output constraints. Forward propagation (AKA combine pass) does not
> > work reliably with assembly outputs (due to always present clobbers
> > for assembly clauses), so there will be many cases of moves to a
> > temporary register and even to a temporary stack location with this
> > constraint. Having two separate functions (with clear and
> > informational function comment) leaves the decision to the programmer,
> > which function is the most optimal.
>
> Yeah, so I still have a problem with the split API, 'savesegment()'
> is very similar to 'storesegment()' and there's no real way to tell
> them apart in practice.
>
> Since the main difference is that the =m variant outputs to an u16, I'd
> suggest naming the two APIs according to the type they handle, not some
> random word that nobody remembers:
>
> savesegment()
> savesegment_u16()
>
> ... or so?
Yes, the above is a good proposal. I was undecided how to name the new
function, but the above is definitely more informative. Maybe also
emphasize that the new function operates on memory, so:
savesegment_mem16()
(the location does not need to be unsigned)?
Thanks,
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory
2026-04-01 6:59 ` Uros Bizjak
@ 2026-04-01 7:02 ` Ingo Molnar
0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2026-04-01 7:02 UTC (permalink / raw)
To: Uros Bizjak
Cc: x86, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Dave Hansen
* Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Apr 1, 2026 at 8:40 AM Ingo Molnar <mingo@kernel.org> wrote:
>
> > > > Why does the API have to be split into =r and =m variants?
> > > >
> > > > Coulnd't we use a more generic constraint and let the compiler
> > > > decide what the target is? Would that negatively impact
> > > > other aspects of code generation?
> > >
> > > The "=r" variant actually outputs zero-extended value to the whole
> > > register width. So, the "=r" variant is used to eliminate
> > > zero-extensions when the value is used in the follow-up calculations,
> > > comparisons, or when the value is stored to a location that is more
> > > than 16-bits wide. Additionally, "r" variant always uses MOVL, where
> > > operand size prefix byte (0x66) is not needed.
> > >
> > > The "=m" variant only outputs to a 16-bit location. Having "=rm" here
> > > would always emit a 0x66 operand size prefix when register is used as
> > > an output, and there would be many zero-extensions emitted, because
> > > the compiler needs to zero-extend the value from 'unsigned short' to
> > > anything wider.
> > >
> > > Other than that, GCC (and Clang, too) has serious problems with "=rm"
> > > output constraints. Forward propagation (AKA combine pass) does not
> > > work reliably with assembly outputs (due to always present clobbers
> > > for assembly clauses), so there will be many cases of moves to a
> > > temporary register and even to a temporary stack location with this
> > > constraint. Having two separate functions (with clear and
> > > informational function comment) leaves the decision to the programmer,
> > > which function is the most optimal.
> >
> > Yeah, so I still have a problem with the split API, 'savesegment()'
> > is very similar to 'storesegment()' and there's no real way to tell
> > them apart in practice.
> >
> > Since the main difference is that the =m variant outputs to an u16, I'd
> > suggest naming the two APIs according to the type they handle, not some
> > random word that nobody remembers:
> >
> > savesegment()
> > savesegment_u16()
> >
> > ... or so?
>
> Yes, the above is a good proposal. I was undecided how to name the new
> function, but the above is definitely more informative. Maybe also
> emphasize that the new function operates on memory, so:
>
> savesegment_mem16()
>
> (the location does not need to be unsigned)?
Sounds good to me.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-04-01 7:02 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-31 6:38 [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Uros Bizjak
2026-03-31 6:38 ` [PATCH 2/2] x86/process: Use storesegment() when saving segment selectors Uros Bizjak
2026-03-31 6:56 ` [PATCH 1/2] x86/segment: Introduce storesegment() helper to write segment selectors to memory Ingo Molnar
2026-03-31 9:53 ` Uros Bizjak
2026-03-31 9:59 ` Uros Bizjak
2026-04-01 6:40 ` Ingo Molnar
2026-04-01 6:59 ` Uros Bizjak
2026-04-01 7:02 ` Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox