From: Catalin Marinas <catalin.marinas@arm.com>
To: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>,
Yang Yingliang <yangyingliang@huawei.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, guohanjun@huawei.com
Subject: Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes
Date: Tue, 23 Mar 2021 15:03:59 +0000 [thread overview]
Message-ID: <20210323150358.GA10576@arm.com> (raw)
In-Reply-To: <20210323133217.GA11802@willie-the-truck>
On Tue, Mar 23, 2021 at 01:32:18PM +0000, Will Deacon wrote:
> On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote:
> > On 2021-03-23 07:34, Yang Yingliang wrote:
> > > When copy over 128 bytes, src/dst is added after
> > > each ldp/stp instruction, it will cost more time.
> > > To improve this, we only add src/dst after load
> > > or store 64 bytes.
> >
> > This breaks the required behaviour for copy_*_user(), since the fault
> > handler expects the base address to be up-to-date at all times. Say you're
> > copying 128 bytes and fault on the 4th store, it should return 80 bytes not
> > copied; the code below would return 128 bytes not copied, even though 48
> > bytes have actually been written to the destination.
> >
> > We've had a couple of tries at updating this code (because the whole
> > template is frankly a bit terrible, and a long way from the well-optimised
> > code it was derived from), but getting the fault-handling behaviour right
> > without making the handler itself ludicrously complex has proven tricky. And
> > then it got bumped down the priority list while the uaccess behaviour in
> > general was in flux - now that the dust has largely settled on that I should
> > probably try to find time to pick this up again...
>
> I think the v5 from Oli was pretty close, but it didn't get any review:
>
> https://lore.kernel.org/r/20200914151800.2270-1-oli.swede@arm.com
These are still unread in my inbox as I was planning to look at them
again. However, I think we discussed a few options on how to proceed but
no concrete plans:
1. Merge Oli's patches as they are, with some potential complexity
issues as fixing the user copy accuracy was non-trivial. I think the
latest version uses a two-stage approach: when taking a fault, it
falls back to to byte-by-byte with the expectation that it faults
again and we can then report the correct fault address.
2. Only use Cortex Strings for in-kernel memcpy() while the uaccess
routines are some simple loops that align the uaccess part only
(unlike Cortex Strings which usually to align the source).
3. Similar to 2 but with Cortex Strings imported automatically with some
script to make it easier to keep the routines up to date.
If having non-optimal (but good enough) uaccess routines is acceptable,
I'd go for (2) with a plan to move to (3) at the next Cortex Strings
update.
I also need to look again at option (1) to see how complex it is but
given the time one spends on importing a new Cortex Strings library, I
don't think (1) scales well on the long term. We could, however, go for
(1) now and look at (3) with the next update.
--
Catalin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-03-23 15:05 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-23 7:34 [PATCH 0/3] arm64: lib: improve copy performance Yang Yingliang
2021-03-23 7:34 ` [PATCH 1/3] arm64: lib: introduce ldp2/stp2 macro Yang Yingliang
2021-03-23 7:34 ` [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes Yang Yingliang
2021-03-23 12:08 ` Robin Murphy
2021-03-23 13:32 ` Will Deacon
2021-03-23 14:28 ` Robin Murphy
2021-03-23 15:03 ` Catalin Marinas [this message]
2021-03-24 16:38 ` David Laight
2021-03-24 19:36 ` Robin Murphy
2021-03-23 7:34 ` [PATCH 3/3] arm64: lib: improve copy performance when size is less than 128 and ge 64 bytes Yang Yingliang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210323150358.GA10576@arm.com \
--to=catalin.marinas@arm.com \
--cc=guohanjun@huawei.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
--cc=yangyingliang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).