All of lore.kernel.org
 help / color / mirror / Atom feed
From: mingo@kernel.org (Ingo Molnar)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/3] Batched user access support
Date: Fri, 18 Dec 2015 10:44:52 +0100	[thread overview]
Message-ID: <20151218094452.GA32455@gmail.com> (raw)
In-Reply-To: <alpine.LFD.2.20.1512171016510.27310@i7>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> So I already sent the end result of these three patches to the x86 people, 
> but since I *think* it may bve an arm64 issue too, I'm including the arm64 
> people too for information.
> 
> Background for the the arm64 people: I upgraded my main desktop to 
> Skylake, and did my usual build performance tests, including a perf run to 
> check that everything looks fine. Yes, the machine is 20% faster than my 
> old one, but the profile also shows that now that I have a CPU that 
> supports SMAP, the overhead of that on the user string handling functions 
> was horrendous.
> 
> Normally, that probably isn't really noticeable, but on loads that do a 
> ton of pathname handling (like a "make -j" on the fully built kernel, or 
> doing "git diff" etc - both of which spend most of their time just doing 
> 'lstat()' on all the files they care about), the user space string 
> accesses really are pretty hot.
> 
> On the 'make -j' test on a fully built kernel, strncpy_from_user() was 
> about 1.5% of all CPU time. And almost two thirds of that was just the 
> SMAP overhead.

Just curious: by how much did the overhead shift after your patches?

> So this patch series introduces a model for batching that SMAP overhead on 
> x86, and the reason the ARM people are involved is that the same _may_ be 
> true of the PAN overhead. I don't know - for all I know, the pstate "set 
> pan" instruction may be so cheap on ARM64 that it doesn't really matter.
> 
> Thew new interface is very simple: new "unsafe_{get,put}_user()" functions 
> that have exactly the same semantics as the old unsafe ones (that weren't 
> called "unsafe", but have the two underscores). The only difference is 
> that you have to use "user_access_{begin,end}()" around them, which allows 
> the architecture to hoist the user access permission wrapper to outside 
> the loop, and then batch the raw accesses.
> 
> The series contains this addition to uaccess.h:
> 
>   #ifndef user_access_begin
>   #define user_access_begin() do { } while (0)
>   #define user_access_end() do { } while (0)
>   #define unsafe_get_user(x, ptr) __get_user(x, ptr)
>   #define unsafe_put_user(x, ptr) __put_user(x, ptr)
>   #endif
> 
> so architectures that don't care or haven't implemented it yet, don't need 
> to worry about it. Architectures that _do_ care just need to implement 
> their own versions, and make sure that user_access_begin is a macro (it 
> may obviously be an inline function and just then an additional 
> self-defining macro).
> 
> Any comments? 

I like this interface, especially the method of naming it 'unsafe'. This draws 
review focus on them - and we had a number of security holes with the 
__get/put_user() 'fast' variants before, which are named in a too benign fashion.

Btw., still today, when I use get_user()/put_user() APIs after a few weeks of not 
having coded such code, I have to look up their argument order in the headers ... 
while I don't have to look up memcpy() arguments.

In hindsight, if we could go back 20 years, I'd organize our user memory access 
APIs in such a fashion:

	memcpy_begin()

	memcpy_user_kernel(userptr, kernelvar)
	memcpy_kernel_user(kernelvar, userptr)

	memcpy_user_kernel(userptr, kernelptr, size)
	memcpy_kernel_user(kernelptr, userptr, size)

	memcpy(kernelptr1, kernelptr2, size)

	memcpy_end()

and I'd use the regular memcpy ordering. GCC macro magic would make a distinction 
between our 2-argument auto-sizing optimized APIs:

	memcpy_user_kernel(userptr, kernelvar)

and the regular 3-argument sized memcpy:

	memcpy_user_kernel(userptr, kernelptr, size)

i.e. generated could would still be the exact same as today, but the argument 
order is streamlined, and it matches the naming of the API: 'user_kernel' fits 
'userptr, kernelvar'.

It's also easy to read at a glance, every 'user_kernel' API conceptually maps to a 
regular C assignment:

	uservar = kernelvar;

In fact we might want to harmonize the APIs a bit more and make the 2-argument 
APIs all pointer based:

	memcpy_user_kernel(userptr, kernelptr)
	memcpy_kernel_user(kernelptr, userptr)

the pointers still have to be type compatible so I think it's still just as 
robust.

There would be various other advantages to such a family of APIs as well:

 - while get/put matches the read/write IO logic, in reality it's very often mixed 
   with memcpy and C variable assigment code, and the mixed and conflicting
   semantics are IMHO lethal to robustness.

 - with this it's all a single well-known pattern, throughout all major user
   memory access APIs.

 - subsequently it's also harder to mis-review what such code does and what
   security implications it has.

 - the naming space is easily extensible: does any code need memcpy_user_user()
   perhaps? Or memcpy_user_iomem()?

 - another dimension for intuitive extensions would be _nocache() variants for 
   non-local copies. Variants of such APIs do exist, but the naming is not very 
   well organized.

But there are of course the disadvantages:

 - I guess we don't want to change thousands of get_user()/put_user() API
   usages ... We could automate 99.9% of it, and we could make it all graceful
   (i.e. keep the old APIs as well), but it would still be a big change.

 - [ I might have missed some advantage of the get/put idiom that my suggestion
     regresses. ]

Changing this is something I'd love to volunteer for though, so I thought I'd 
throw in the offer ;-)

Thanks,

	Ingo

  reply	other threads:[~2015-12-18  9:44 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-17 18:33 [PATCH 0/3] Batched user access support Linus Torvalds
2015-12-18  9:44 ` Ingo Molnar [this message]
2015-12-18 17:06   ` Linus Torvalds
2015-12-18 11:13 ` Will Deacon
2015-12-18 18:33   ` H. Peter Anvin
2015-12-18 18:43     ` Linus Torvalds
2015-12-18 19:56 ` Russell King - ARM Linux
2015-12-18 20:18   ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151218094452.GA32455@gmail.com \
    --to=mingo@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.