Generic Linux architectural discussions
From: David Laight <David.Laight@ACULAB.COM>
To: 'Linus Torvalds' <torvalds@linux-foundation.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>
Subject: RE: [PATCH v2] x86: bring back rep movsq for user access on CPUs without ERMS
Date: Wed, 13 Sep 2023 08:25:40 +0000
Message-ID: <e0228468e054426c9737530fed594ad0@AcuMS.aculab.com>
In-Reply-To: <CAHk-=whC8TaarEhz2ie_w01r34hQHNCTiZLAs6e42ewP7+cvoA@mail.gmail.com>

From: Linus Torvalds
> Sent: 12 September 2023 21:48
> 
> On Tue, 12 Sept 2023 at 12:41, David Laight <David.Laight@aculab.com> wrote:
> >
> > What I found seemed to imply that 'rep movsq' used the same internal
> > logic as 'rep movsb' (pretty easy to do in hardware)
> 
> Christ.
> 
> I told you. It's pretty easy in hardware  AS LONG AS IT'S ALIGNED.
> 
> And if it's unaligned, "rep movsq" is FUNDAMENTALLY HARDER.

For cached memory it only has to appear to have used 8-byte
accesses.
So, in the same way that 'rep movsb' could be optimised to do
cache-line-sized reads and writes even if the addresses are
completely misaligned, 'rep movsq' could use exactly the same
hardware logic with a byte count that is 8 times larger.

The only subtlety is that the read length would need masking
to a multiple of 8 if there is a page fault on a misaligned
read side (so that a multiple of 8 bytes would be written).
That wouldn't really be hard.
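
The masking described above can be sketched as a toy model in C. This
is only an illustration under stated assumptions, not a description of
any real microcode: model_rep_movsq(), struct movsq_result and the
'readable' parameter (source bytes available before the faulting page)
are hypothetical names, and the buffers are assumed not to overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result of a 'rep movsq' that may stop on a read fault. */
struct movsq_result {
    size_t bytes_written;   /* always a multiple of 8 */
    size_t qwords_left;     /* what RCX would hold when the fault is taken */
};

/*
 * Toy model: copy 'qwords' 8-byte units using byte-granular bulk logic.
 * 'readable' is the number of source bytes that can be read before the
 * (assumed) faulting page starts.
 */
static struct movsq_result
model_rep_movsq(unsigned char *dst, const unsigned char *src,
                size_t qwords, size_t readable)
{
    struct movsq_result r;
    size_t want = qwords * 8;
    size_t chunk = readable < want ? readable : want;

    /* Mask the length down to a multiple of 8 so that only whole
     * quadwords ever appear to have been written. */
    chunk &= ~(size_t)7;

    memcpy(dst, src, chunk);
    r.bytes_written = chunk;
    r.qwords_left = qwords - chunk / 8;
    return r;
}
```

With 13 readable bytes and a request for 4 quadwords, the model writes
8 bytes and leaves 3 quadwords outstanding, which is what software
would expect to see from 'rep movsq' after the fault.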

I definitely saw exactly the same number of bytes/clock
for 'rep movsb' and 'rep movsq' when the destination was
misaligned.
The alignment made no difference, except that an alignment that
was a multiple of 32 ran (about) twice as fast.
I even double-checked the disassembly to make sure I was
running the right code.
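
For anyone wanting to repeat that kind of comparison, the two copy
primitives might look roughly like the sketch below. It is not the
code that was actually measured: the function names are made up, it
assumes GCC or clang inline asm on x86-64 (with a plain memcpy()
fallback elsewhere so it still compiles), and the timing harness is
left out. copies_match() only checks that both variants copy the same
bytes to a misaligned destination.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) && defined(__GNUC__)
/* Byte-granular string copy; the ABI guarantees DF is clear on entry. */
static void copy_rep_movsb(void *dst, const void *src, size_t len)
{
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(len)
                     :
                     : "memory");
}

/* Quadword-granular string copy; 'qwords' is the count of 8-byte units. */
static void copy_rep_movsq(void *dst, const void *src, size_t qwords)
{
    __asm__ volatile("rep movsq"
                     : "+D"(dst), "+S"(src), "+c"(qwords)
                     :
                     : "memory");
}
#else
/* Fallback for other targets: same interface, no string instructions. */
static void copy_rep_movsb(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

static void copy_rep_movsq(void *dst, const void *src, size_t qwords)
{
    memcpy(dst, src, qwords * 8);
}
#endif

/*
 * Copy 'len' bytes (a multiple of 8, at most 4096) to a destination
 * offset by 'dst_misalign' bytes (less than 64) with both variants.
 * Returns 1 if both copies match the source, 0 otherwise or on bad input.
 */
static int copies_match(size_t dst_misalign, size_t len)
{
    static unsigned char src[4096], a[4096 + 64], b[4096 + 64];
    size_t i;

    if (len % 8 != 0 || len > sizeof(src) || dst_misalign >= 64)
        return 0;
    for (i = 0; i < len; i++)
        src[i] = (unsigned char)(i * 31 + 7);

    copy_rep_movsb(a + dst_misalign, src, len);
    copy_rep_movsq(b + dst_misalign, src, len / 8);

    return memcmp(a + dst_misalign, src, len) == 0 &&
           memcmp(b + dst_misalign, src, len) == 0;
}
```

Timing each wrapper over a large number of iterations, for a range of
destination offsets, is what would show whether the two instructions
move the same number of bytes per clock.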

So it looks like the Intel hardware engineers have solved
the 'FUNDAMENTALLY HARDER' problem.

	David


