From: David Laight <david.laight.linux@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>,
	x86@kernel.org, hkrzesin@redhat.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, olichtne@redhat.com, atomasov@redhat.com,
	aokuliar@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86: handle the tail in rep_movs_alternative() with an overlapping store
Date: Fri, 21 Mar 2025 20:47:23 +0000
Message-ID: <20250321204723.1e21cb23@pumpkin>
In-Reply-To: <CAHk-=wjxi0poUzCd666Kx5wCjgOwN5v=-zG8xSAL7Wj_ax8Zvw@mail.gmail.com>

On Thu, 20 Mar 2025 16:53:32 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 20 Mar 2025 at 14:17, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Thu, 20 Mar 2025 at 12:33, Mateusz Guzik <mjguzik@gmail.com> wrote:  
> > >
> > > I have a recollection that handling the tail after rep movsq with an
> > > overlapping store was suffering a penalty big enough to warrant a
> > > "normal" copy instead, avoiding the just written to area.  
> >
> > Ahh. Good point. The rep movsq might indeed end up having odd effects
> > with subsequent aliasing memory operations.
> >
> > Consider myself convinced.  
> 
> Actually, I think there's a solution for this.
> 
> Do not do the last 0-7 bytes as a word that overlaps with the tail of
> the 'rep movs'
> 
> Do the last 8-15 bytes *non-overlapping* (well, they overlap each
> other, but not the 'rep movs')
> 
> Something UNTESTED like the appended, in other words. The large case
> then ends up without any conditionals, looking something like this:
> 
>         mov    %rcx,%rax
>         shr    $0x3,%rcx
>         dec    %rcx
>         and    $0x7,%eax
>         rep movsq %ds:(%rsi),%es:(%rdi)
>         mov    (%rsi),%rcx
>         mov    %rcx,(%rdi)
>         mov    (%rsi,%rax,1),%rcx
>         mov    %rcx,(%rdi,%rax,1)
>         xor    %ecx,%ecx
>         ret

I think you can stop the tail copying the same 8 bytes twice (which happens
whenever the length is a multiple of 8, so %rax is 0) by doing:
	sub	$9,%rcx
	mov	%rcx,%rax
	shr	$3,%rcx
	and	$7,%rax
	inc	%rax
before the 'rep movsq'. That leaves 9 to 16 bytes for the two tail moves,
with %rax between 1 and 8.
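
A quick way to sanity-check the arithmetic of both sequences is a small
user-space model in C. This is only a sketch: the struct and function names
are made up for illustration, and the 64-byte lower bound for the 'large
case' is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not kernel code. */
struct split {
	uint64_t qwords;   /* %rcx going into 'rep movsq' */
	uint64_t tail_off; /* %rax, offset of the second 8-byte tail move */
};

/* Quoted sequence: mov %rcx,%rax; shr $3,%rcx; dec %rcx; and $7,%eax */
static struct split split_quoted(uint64_t len)
{
	struct split s = { (len >> 3) - 1, len & 7 };
	return s;
}

/* Suggested sequence: sub $9,%rcx; mov; shr $3,%rcx; and $7,%rax; inc %rax */
static struct split split_sub9(uint64_t len)
{
	uint64_t n = len - 9;
	struct split s = { n >> 3, (n & 7) + 1 };
	return s;
}

/*
 * Bytes written: the 'rep movsq' part, then one 8-byte move at offset 0
 * and one at offset tail_off. There is no gap as long as tail_off <= 8.
 */
static uint64_t covered(struct split s)
{
	return 8 * s.qwords + s.tail_off + 8;
}

/* Returns 1 if both splits copy exactly len bytes for every len in [lo, hi). */
static int tail_split_ok(uint64_t lo, uint64_t hi)
{
	for (uint64_t len = lo; len < hi; len++) {
		struct split a = split_quoted(len);
		struct split b = split_sub9(len);

		if (covered(a) != len || a.tail_off > 7)
			return 0;
		/* tail_off is never 0 here, so the two moves never coincide */
		if (covered(b) != len || b.tail_off < 1 || b.tail_off > 8)
			return 0;
	}
	return 1;
}
```

For a 64-byte copy the quoted sequence gives 7 qwords and a tail offset of 0
(the same 8 bytes moved twice), while the 'sub $9' variant gives 6 qwords and
a tail offset of 8 (two adjacent, non-overlapping moves). The model says
nothing about the exception fixup paths or the aliasing cost.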

	David
	
> 
> with some added complexity - but not a lot - in the exception fixup cases.
> 
> This is once again intentionally whitespace-damaged, because I don't
> want people applying this mindlessly. Somebody needs to double-check
> my logic, and verify that this also avoids the cost from the aliasing
> with the rep movs.
> 
>                    Linus
...
