linux-mm.kvack.org archive mirror
From: Borislav Petkov <bp@alien8.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	elliott@hpe.com, Brian Gerst <brgerst@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@ml01.01.org, x86@kernel.org
Subject: Re: [PATCH v10 3/4] x86, mce: Add __mcsafe_copy()
Date: Wed, 10 Feb 2016 11:58:43 +0100
Message-ID: <20160210105843.GD23914@pd.tnic>
In-Reply-To: <20160209231557.GA23207@agluck-desk.sc.intel.com>

On Tue, Feb 09, 2016 at 03:15:57PM -0800, Luck, Tony wrote:
> > You can save yourself this MOV here in what is, I'm assuming, the
> > general likely case where @src is aligned and do:
> > 
> >         /* check for bad alignment of source */
> >         testl $7, %esi
> >         /* already aligned? */
> >         jz 102f
> > 
> >         movl %esi,%ecx
> >         subl $8,%ecx
> >         negl %ecx
> >         subl %ecx,%edx
> > 0:      movb (%rsi),%al
> >         movb %al,(%rdi)
> >         incq %rsi
> >         incq %rdi
> >         decl %ecx
> >         jnz 0b
> 
> The "testl $7, %esi" just checks the low three bits ... it doesn't
> change %esi.  But the code from the "subl $8" on down assumes that
> %ecx is a number in [1..7] as the count of bytes to copy until we
> achieve alignment.

Grr, sorry about that, I actually forgot to copy-paste the AND:

        /* check for bad alignment of source */
        testl $7, %esi
        jz 102f                         /* already aligned */

        movl %esi,%ecx
        andl $7,%ecx
        subl $8,%ecx
        negl %ecx
        subl %ecx,%edx
0:      movb (%rsi),%al
        movb %al,(%rdi)
        incq %rsi
        incq %rdi
        decl %ecx
        jnz 0b

I basically am proposing to move the unlikely case out of line and
optimize the likely one.
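
To make the arithmetic concrete, here is a rough C model of what the
lead-in computes. It is only a sketch: head_bytes() and
head_bytes_no_and() are hypothetical names made up for illustration,
and this is not the kernel code itself. The first helper mirrors the
corrected sequence (test, AND, subtract 8, negate); the second mirrors
the earlier snippet without the AND, to show why the count lands
outside [1..7].

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes to copy one at a time until
 * the source address is 8-byte aligned.
 */
static inline unsigned int head_bytes(uintptr_t src)
{
	unsigned int low = (unsigned int)(src & 7);	/* andl $7,%ecx */

	if (low == 0)
		return 0;	/* jz 102f: already aligned, skip the byte loop */

	/* subl $8,%ecx ; negl %ecx  =>  -(low - 8) == 8 - low, in [1..7] */
	return 8 - low;
}

/*
 * Hypothetical helper: the same sequence without the AND. The whole
 * low 32 bits of the address take part, so the result is
 * (8 - esi) mod 2^32 and not a small byte count.
 */
static inline uint32_t head_bytes_no_and(uint32_t esi)
{
	uint32_t ecx = esi;	/* movl %esi,%ecx */

	ecx -= 8;		/* subl $8,%ecx */
	ecx = 0u - ecx;		/* negl %ecx */
	return ecx;
}
```

For a source address ending in ...001 the first helper yields 7 and for
one ending in ...007 it yields 1, which is the loop count the byte-copy
loop at label 0 expects in %ecx.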

> So your "movl %esi,%ecx" needs to be something that just copies the
> low three bits and zeroes the high part of %ecx.  Is there a cute
> way to do that in x86 assembler?

We could do some funky games with byte-sized moves but those are
generally slower anyway so doing the default operand size thing should
be ok.

> I copied that loop from arch/x86/lib/copy_user_64.S:__copy_user_nocache()
> I guess the answer depends on whether you generally copy enough
> cache lines to save enough time to cover the cost of saving and
> restoring those registers.

Well, that function will run on modern hw with a stack engine so I'd
assume those 4 pushes and pops would be paid for by the increased
register count for the data shuffling.

But one could take that function out and do some microbenchmarking with
different sizes, once with the current version and once with the
pushes and pops of r12-r15, to see where the breakeven is.
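
A userspace harness for that kind of comparison could look roughly like
the sketch below. Everything in it is an assumption for illustration:
copy_bytes() and copy_words() are trivial stand-ins, not the two
assembly variants under discussion, and time_copy_ns() is a made-up
helper. The point is only the shape: same buffers, same iteration
count, one timing per variant and per size. The empty asm statement is
a GCC/Clang-style compiler barrier so the copy loop is not optimized
away.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef void (*copy_fn)(void *dst, const void *src, size_t len);

/* Stand-in variant A (hypothetical): one byte per iteration. */
static void copy_bytes(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* Stand-in variant B (hypothetical): 8 bytes per iteration, then the tail. */
static void copy_words(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len >= 8) {
		uint64_t w;

		memcpy(&w, s, 8);
		memcpy(d, &w, 8);
		s += 8;
		d += 8;
		len -= 8;
	}
	while (len--)
		*d++ = *s++;
}

/* Run fn() iters times over len bytes and return the elapsed nanoseconds. */
static uint64_t time_copy_ns(copy_fn fn, void *dst, const void *src,
			     size_t len, unsigned int iters)
{
	struct timespec t0, t1;
	int64_t ns;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (unsigned int i = 0; i < iters; i++) {
		fn(dst, src, len);
		/* compiler barrier: keep each copy from being elided */
		__asm__ volatile("" : : "r"(dst) : "memory");
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	     (int64_t)(t1.tv_nsec - t0.tv_nsec);
	return (uint64_t)ns;
}
```

One would call time_copy_ns() for each variant over a range of sizes
(say from a few bytes up to several cache lines) and look for the size
at which the variant with the extra setup cost starts to win.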

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
