From: mingo@kernel.org (Ingo Molnar)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/3] Batched user access support
Date: Fri, 18 Dec 2015 10:44:52 +0100 [thread overview]
Message-ID: <20151218094452.GA32455@gmail.com> (raw)
In-Reply-To: <alpine.LFD.2.20.1512171016510.27310@i7>
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> So I already sent the end result of these three patches to the x86 people,
> but since I *think* it may bve an arm64 issue too, I'm including the arm64
> people too for information.
>
> Background for the the arm64 people: I upgraded my main desktop to
> Skylake, and did my usual build performance tests, including a perf run to
> check that everything looks fine. Yes, the machine is 20% faster than my
> old one, but the profile also shows that now that I have a CPU that
> supports SMAP, the overhead of that on the user string handling functions
> was horrendous.
>
> Normally, that probably isn't really noticeable, but on loads that do a
> ton of pathname handling (like a "make -j" on the fully built kernel, or
> doing "git diff" etc - both of which spend most of their time just doing
> 'lstat()' on all the files they care about), the user space string
> accesses really are pretty hot.
>
> On the 'make -j' test on a fully built kernel, strncpy_from_user() was
> about 1.5% of all CPU time. And almost two thirds of that was just the
> SMAP overhead.
Just curious: by how much did the overhead shift after your patches?
> So this patch series introduces a model for batching that SMAP overhead on
> x86, and the reason the ARM people are involved is that the same _may_ be
> true of the PAN overhead. I don't know - for all I know, the pstate "set
> pan" instruction may be so cheap on ARM64 that it doesn't really matter.
>
> Thew new interface is very simple: new "unsafe_{get,put}_user()" functions
> that have exactly the same semantics as the old unsafe ones (that weren't
> called "unsafe", but have the two underscores). The only difference is
> that you have to use "user_access_{begin,end}()" around them, which allows
> the architecture to hoist the user access permission wrapper to outside
> the loop, and then batch the raw accesses.
>
> The series contains this addition to uaccess.h:
>
> #ifndef user_access_begin
> #define user_access_begin() do { } while (0)
> #define user_access_end() do { } while (0)
> #define unsafe_get_user(x, ptr) __get_user(x, ptr)
> #define unsafe_put_user(x, ptr) __put_user(x, ptr)
> #endif
>
> so architectures that don't care or haven't implemented it yet, don't need
> to worry about it. Architectures that _do_ care just need to implement
> their own versions, and make sure that user_access_begin is a macro (it
> may obviously be an inline function and just then an additional
> self-defining macro).
>
> Any comments?
I like this interface, especially the method of naming it 'unsafe'. This draws
review focus on them - and we had a number of security holes with the
__get/put_user() 'fast' variants before, which are named in a too benign fashion.
Btw., still today, when I use get_user()/put_user() APIs after a few weeks of not
having coded such code, I have to look up their argument order in the headers ...
while I don't have to look up memcpy() arguments.
In hindsight, if we could go back 20 years, I'd organize our user memory access
APIs in such a fashion:
memcpy_begin()
memcpy_user_kernel(userptr, kernelvar)
memcpy_kernel_user(kernelvar, userptr)
memcpy_user_kernel(userptr, kernelptr, size)
memcpy_kernel_user(kernelptr, userptr, size)
memcpy(kernelptr1, kernelptr2, size)
memcpy_end()
and I'd use the regular memcpy ordering. GCC macro magic would make a distinction
between our 2-argument auto-sizing optimized APIs:
memcpy_user_kernel(userptr, kernelvar)
and the regular 3-argument sized memcpy:
memcpy_user_kernel(userptr, kernelptr, size)
i.e. generated could would still be the exact same as today, but the argument
order is streamlined, and it matches the naming of the API: 'user_kernel' fits
'userptr, kernelvar'.
It's also easy to read at a glance, every 'user_kernel' API conceptually maps to a
regular C assignment:
uservar = kernelvar;
In fact we might want to harmonize the APIs a bit more and make the 2-argument
APIs all pointer based:
memcpy_user_kernel(userptr, kernelptr)
memcpy_kernel_user(kernelptr, userptr)
the pointers still have to be type compatible so I think it's still just as
robust.
There would be various other advantages to such a family of APIs as well:
- while get/put matches the read/write IO logic, in reality it's very often mixed
with memcpy and C variable assigment code, and the mixed and conflicting
semantics are IMHO lethal to robustness.
- with this it's all a single well-known pattern, throughout all major user
memory access APIs.
- subsequently it's also harder to mis-review what such code does and what
security implications it has.
- the naming space is easily extensible: does any code need memcpy_user_user()
perhaps? Or memcpy_user_iomem()?
- another dimension for intuitive extensions would be _nocache() variants for
non-local copies. Variants of such APIs do exist, but the naming is not very
well organized.
But there are of course the disadvantages:
- I guess we don't want to change thousands of get_user()/put_user() API
usages ... We could automate 99.9% of it, and we could make it all graceful
(i.e. keep the old APIs as well), but it would still be a big change.
- [ I might have missed some advantage of the get/put idiom that my suggestion
regresses. ]
Changing this is something I'd love to volunteer for though, so I thought I'd
throw in the offer ;-)
Thanks,
Ingo
next prev parent reply other threads:[~2015-12-18 9:44 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-17 18:33 [PATCH 0/3] Batched user access support Linus Torvalds
2015-12-18 9:44 ` Ingo Molnar [this message]
2015-12-18 17:06 ` Linus Torvalds
2015-12-18 11:13 ` Will Deacon
2015-12-18 18:33 ` H. Peter Anvin
2015-12-18 18:43 ` Linus Torvalds
2015-12-18 19:56 ` Russell King - ARM Linux
2015-12-18 20:18 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151218094452.GA32455@gmail.com \
--to=mingo@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).