* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() [not found] ` <121ABD73-9400-4657-997C-6AEA578864C5@nutanix.com> @ 2025-11-26 6:04 ` Jason Wang 2025-11-26 10:25 ` Arnd Bergmann 0 siblings, 1 reply; 12+ messages in thread From: Jason Wang @ 2025-11-26 6:04 UTC (permalink / raw) To: Jon Kohler Cc: Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel, Russell King - ARM Linux, Catalin Marinas, Will Deacon, Arnd Bergmann, krzk, alexandre.belloni, Linus Walleij, fustini On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: > > > > > On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: > > > > On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: > >> > >> > >>> On Nov 16, 2025, at 11:32 PM, Jason Wang <jasowang@redhat.com> wrote: > >>> > >>> On Fri, Nov 14, 2025 at 10:53 PM Jon Kohler <jon@nutanix.com> wrote: > >>>> > >>>> > >>>>> On Nov 12, 2025, at 8:09 PM, Jason Wang <jasowang@redhat.com> wrote: > >>>>> > >>>>> On Thu, Nov 13, 2025 at 8:14 AM Jon Kohler <jon@nutanix.com> wrote: > >>>>>> > >>>>>> vhost_get_user and vhost_put_user leverage __get_user and __put_user, > >>>>>> respectively, which were both added in 2016 by commit 6b1e6cc7855b > >>>>>> ("vhost: new device IOTLB API"). > >>>>> > >>>>> It has been used even before this commit. > >>>> > >>>> Ah, thanks for the pointer. I’d have to go dig to find its genesis, but > >>>> its more to say, this existed prior to the LFENCE commit. > >>>> > >>>>> > >>>>>> In a heavy UDP transmit workload on a > >>>>>> vhost-net backed tap device, these functions showed up as ~11.6% of > >>>>>> samples in a flamegraph of the underlying vhost worker thread. > >>>>>> > >>>>>> Quoting Linus from [1]: > >>>>>> Anyway, every single __get_user() call I looked at looked like > >>>>>> historical garbage. [...] End result: I get the feeling that we > >>>>>> should just do a global search-and-replace of the __get_user/ > >>>>>> __put_user users, replace them with plain get_user/put_user instead, > >>>>>> and then fix up any fallout (eg the coco code). > >>>>>> > >>>>>> Switch to plain get_user/put_user in vhost, which results in a slight > >>>>>> throughput speedup. get_user now about ~8.4% of samples in flamegraph. > >>>>>> > >>>>>> Basic iperf3 test on a Intel 5416S CPU with Ubuntu 25.10 guest: > >>>>>> TX: taskset -c 2 iperf3 -c <rx_ip> -t 60 -p 5200 -b 0 -u -i 5 > >>>>>> RX: taskset -c 2 iperf3 -s -p 5200 -D > >>>>>> Before: 6.08 Gbits/sec > >>>>>> After: 6.32 Gbits/sec > >>>>> > >>>>> I wonder if we need to test on archs like ARM. > >>>> > >>>> Are you thinking from a performance perspective? Or a correctness one? > >>> > >>> Performance, I think the patch is correct. > >>> > >>> Thanks > >>> > >> > >> Ok gotcha. If anyone has an ARM system stuffed in their > >> front pocket and can give this a poke, I’d appreciate it, as > >> I don’t have ready access to one personally. > >> > >> That said, I think this might end up in “well, it is what it is” > >> territory as Linus was alluding to, i.e. if performance dips on > >> ARM for vhost, then thats a compelling point to optimize whatever > >> ends up being the culprit for get/put user? > >> > >> Said another way, would ARM perf testing (or any other arch) be a > >> blocker to taking this change? > > > > Not a must but at least we need to explain the implication for other > > archs as the discussion you quoted are all for x86. > > > > Thanks > > I’ll admit my ARM muscle is weak, but here’s my best take on this: > > Looking at arch/arm/include/asm/uaccess.h, the biggest thing that I > noticed is the CONFIG_CPU_SPECTRE ifdef, which already remaps > __get_user() to get_user(), so anyone running that in their kconfig > will already practically have the behavior implemented by this patch > by way of commit b1cd0a148063 ("ARM: spectre-v1: use get_user() for > __get_user()”). > > Same deal goes for __put_user() vs put_user by way of commit > e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) > > Looking at arch/arm/mm/Kconfig, there are a variety of scenarios > where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at > commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") > it says that "ARMv8 is a superset of ARMv7", so I’d guess that just > about everything ARM would include this by default? > > If so, that mean at least for a non-zero population of ARM’ers, > they wouldn’t notice anything from this patch, yea? Adding ARM maintainers for more thought. Thanks > > Happy to learn otherwise if that read is incorrect! > > Thanks all, > Jon ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 6:04 ` [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() Jason Wang @ 2025-11-26 10:25 ` Arnd Bergmann 2025-11-26 19:47 ` Jon Kohler 0 siblings, 1 reply; 12+ messages in thread From: Arnd Bergmann @ 2025-11-26 10:25 UTC (permalink / raw) To: Jason Wang, Jon Kohler Cc: Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: > On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: >> > On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: >> > On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: >> Same deal goes for __put_user() vs put_user by way of commit >> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) >> >> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios >> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at >> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") >> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just >> about everything ARM would include this by default? I think the more relevant commit is for 64-bit Arm here, but this does the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother eliding access_ok checks in __{get, put}_user"). Note that there is no KVM on 32-bit Arm any more, so we really don't care about vhost performance there. The added access_ok() check in arm32 __get_user() is probably avoidable, as embedded systems with in-order cores could turn off the spectre workarounds, but as Will explained in the arm64 commit, it's not that expensive either. >> If so, that mean at least for a non-zero population of ARM’ers, >> they wouldn’t notice anything from this patch, yea? > > Adding ARM maintainers for more thought. I would think that if we change the __get_user() to get_user() in this driver, the same should be done for the __copy_{from,to}_user(), which similarly skips the access_ok() check but not the PAN/SMAP handling. In general, the access_ok()/__get_user()/__copy_from_user() pattern isn't really helpful any more, as Linus already explained. I can't tell from the vhost driver code whether we can just drop the access_ok() here and use the plain get_user()/copy_from_user(), or if it makes sense to move to the newer user_access_begin()/unsafe_get_user()/ unsafe_copy_from_user()/user_access_end() and try optimize out a few PAN/SMAP flips in the process. Arnd ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 10:25 ` Arnd Bergmann @ 2025-11-26 19:47 ` Jon Kohler 2025-11-26 19:58 ` Arnd Bergmann 2025-11-27 1:08 ` Jason Wang 0 siblings, 2 replies; 12+ messages in thread From: Jon Kohler @ 2025-11-26 19:47 UTC (permalink / raw) To: Arnd Bergmann Cc: Jason Wang, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini > On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: >> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: >>>> On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: >>>> On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: >>> Same deal goes for __put_user() vs put_user by way of commit >>> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) >>> >>> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios >>> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at >>> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") >>> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just >>> about everything ARM would include this by default? > > I think the more relevant commit is for 64-bit Arm here, but this does > the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother > eliding access_ok checks in __{get, put}_user"). Ah! Right, this is definitely the important bit, as it makes it crystal clear that these are exactly the same thing. The current code is: #define get_user __get_user #define put_user __put_user So, this patch changing from __* to regular versions is a no-op on arm side of the house, yea? > I would think that if we change the __get_user() to get_user() > in this driver, the same should be done for the > __copy_{from,to}_user(), which similarly skips the access_ok() > check but not the PAN/SMAP handling. Perhaps, thats a good call out. I’d file that under one battle at a time. Let’s get get/put user dusted first, then go down that road? > In general, the access_ok()/__get_user()/__copy_from_user() > pattern isn't really helpful any more, as Linus already > explained. I can't tell from the vhost driver code whether > we can just drop the access_ok() here and use the plain > get_user()/copy_from_user(), or if it makes sense to move > to the newer user_access_begin()/unsafe_get_user()/ > unsafe_copy_from_user()/user_access_end() and try optimize > out a few PAN/SMAP flips in the process. In general, I think there are a few spots where we might be able to optimize (vhost_get_vq_desc perhaps?) as that gets called quite a bit and IIRC there are at least two flips in there that perhaps we could elide to one? An investigation for another day I think. Anyhow, with this info - Jason - is there anything else you can think of that we want to double click on? Jon ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 19:47 ` Jon Kohler @ 2025-11-26 19:58 ` Arnd Bergmann 2025-11-26 21:42 ` Jon Kohler 2025-11-27 1:08 ` Jason Wang 1 sibling, 1 reply; 12+ messages in thread From: Arnd Bergmann @ 2025-11-26 19:58 UTC (permalink / raw) To: Jon Kohler Cc: Jason Wang, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini On Wed, Nov 26, 2025, at 20:47, Jon Kohler wrote: >> On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: >>> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: >> I think the more relevant commit is for 64-bit Arm here, but this does >> the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother >> eliding access_ok checks in __{get, put}_user"). > > Ah! Right, this is definitely the important bit, as it makes it > crystal clear that these are exactly the same thing. The current > code is: > #define get_user __get_user > #define put_user __put_user > > So, this patch changing from __* to regular versions is a no-op > on arm side of the house, yea? Certainly on 64-bit, and almost always on 32-bit, yes. >> I would think that if we change the __get_user() to get_user() >> in this driver, the same should be done for the >> __copy_{from,to}_user(), which similarly skips the access_ok() >> check but not the PAN/SMAP handling. > > Perhaps, thats a good call out. I’d file that under one battle > at a time. Let’s get get/put user dusted first, then go down > that road? It depends on what your bigger plan is. Are you working on improving the vhost driver specifically, or are you trying to kill off the __get_user/__put_user calls across the entire kernel? In the latter case, I would suggest you do one driver at a time but address access_ok(), __{get,put}_user and __copy_{from,to}_user() with a single patch per driver as long as this is simple enough. For vhost specifically, doing it piecemeal is probably fine since the interaction is more complicated than most others. >> In general, the access_ok()/__get_user()/__copy_from_user() >> pattern isn't really helpful any more, as Linus already >> explained. I can't tell from the vhost driver code whether >> we can just drop the access_ok() here and use the plain >> get_user()/copy_from_user(), or if it makes sense to move >> to the newer user_access_begin()/unsafe_get_user()/ >> unsafe_copy_from_user()/user_access_end() and try optimize >> out a few PAN/SMAP flips in the process. > > In general, I think there are a few spots where we might be > able to optimize (vhost_get_vq_desc perhaps?) as that gets > called quite a bit and IIRC there are at least two flips > in there that perhaps we could elide to one? An investigation > for another day I think. Yes, sounds good. Arnd ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 19:58 ` Arnd Bergmann @ 2025-11-26 21:42 ` Jon Kohler 2025-11-26 21:45 ` Linus Torvalds 0 siblings, 1 reply; 12+ messages in thread From: Jon Kohler @ 2025-11-26 21:42 UTC (permalink / raw) To: Arnd Bergmann Cc: Jason Wang, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini > On Nov 26, 2025, at 2:58 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Wed, Nov 26, 2025, at 20:47, Jon Kohler wrote: >>> On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: >>> On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: >>>> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: >>> I think the more relevant commit is for 64-bit Arm here, but this does >>> the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother >>> eliding access_ok checks in __{get, put}_user"). >> >> Ah! Right, this is definitely the important bit, as it makes it >> crystal clear that these are exactly the same thing. The current >> code is: >> #define get_user __get_user >> #define put_user __put_user >> >> So, this patch changing from __* to regular versions is a no-op >> on arm side of the house, yea? > > Certainly on 64-bit, and almost always on 32-bit, yes. > >>> I would think that if we change the __get_user() to get_user() >>> in this driver, the same should be done for the >>> __copy_{from,to}_user(), which similarly skips the access_ok() >>> check but not the PAN/SMAP handling. >> >> Perhaps, thats a good call out. I’d file that under one battle >> at a time. Let’s get get/put user dusted first, then go down >> that road? > > It depends on what your bigger plan is. Are you working on > improving the vhost driver specifically, or are you trying > to kill off the __get_user/__put_user calls across the > entire kernel? I’m working on vhost / virtualized networking improvements at the moment, not the broader kernel wide work. Linus mentioned he might get into the mix and do a bulk change and kill the whole thing once and for all, so I’m simply trying to help knock an incremental amount of work off the pile in advance of that (and reap some performance benefits at the same time, at least on the x86 side). Thanks for the info and back n forth here, I appreciate it! ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 21:42 ` Jon Kohler @ 2025-11-26 21:45 ` Linus Torvalds 2025-11-27 2:58 ` Jon Kohler 0 siblings, 1 reply; 12+ messages in thread From: Linus Torvalds @ 2025-11-26 21:45 UTC (permalink / raw) To: Jon Kohler Cc: Arnd Bergmann, Jason Wang, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini On Wed, 26 Nov 2025 at 13:43, Jon Kohler <jon@nutanix.com> wrote: > > Linus mentioned he might get into the mix and do a bulk > change and kill the whole thing once and for all, so I’m > simply trying to help knock an incremental amount of work > off the pile in advance of that (and reap some performance > benefits at the same time, at least on the x86 side). So I'm definitely going to do some bulk conversion at some point, but honestly, I'll be a lot happier if most users already self-converted before that, and I only end up doing a "convert unmaintained old code that nobody really cares about". Linus ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 21:45 ` Linus Torvalds @ 2025-11-27 2:58 ` Jon Kohler 0 siblings, 0 replies; 12+ messages in thread From: Jon Kohler @ 2025-11-27 2:58 UTC (permalink / raw) To: Linus Torvalds Cc: Arnd Bergmann, Jason Wang, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini > On Nov 26, 2025, at 4:45 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Wed, 26 Nov 2025 at 13:43, Jon Kohler <jon@nutanix.com> wrote: >> >> Linus mentioned he might get into the mix and do a bulk >> change and kill the whole thing once and for all, so I’m >> simply trying to help knock an incremental amount of work >> off the pile in advance of that (and reap some performance >> benefits at the same time, at least on the x86 side). > > So I'm definitely going to do some bulk conversion at some point, but > honestly, I'll be a lot happier if most users already self-converted > before that, and I only end up doing a "convert unmaintained old code > that nobody really cares about". > > Linus That is fair, thanks Linus. I’ll see if I can pick off a couple more to start thinning out the pile, and see what sort of trouble I can get into. Cheers - Jon ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-26 19:47 ` Jon Kohler 2025-11-26 19:58 ` Arnd Bergmann @ 2025-11-27 1:08 ` Jason Wang 2025-11-27 3:11 ` Jon Kohler 1 sibling, 1 reply; 12+ messages in thread From: Jason Wang @ 2025-11-27 1:08 UTC (permalink / raw) To: Jon Kohler Cc: Arnd Bergmann, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini On Thu, Nov 27, 2025 at 3:48 AM Jon Kohler <jon@nutanix.com> wrote: > > > > > On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: > >> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: > >>>> On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: > >>>> On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: > >>> Same deal goes for __put_user() vs put_user by way of commit > >>> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) > >>> > >>> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios > >>> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at > >>> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") > >>> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just > >>> about everything ARM would include this by default? > > > > I think the more relevant commit is for 64-bit Arm here, but this does > > the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother > > eliding access_ok checks in __{get, put}_user"). > > Ah! Right, this is definitely the important bit, as it makes it > crystal clear that these are exactly the same thing. The current > code is: > #define get_user __get_user > #define put_user __put_user > > So, this patch changing from __* to regular versions is a no-op > on arm side of the house, yea? > > > I would think that if we change the __get_user() to get_user() > > in this driver, the same should be done for the > > __copy_{from,to}_user(), which similarly skips the access_ok() > > check but not the PAN/SMAP handling. > > Perhaps, thats a good call out. I’d file that under one battle > at a time. Let’s get get/put user dusted first, then go down > that road? > > > In general, the access_ok()/__get_user()/__copy_from_user() > > pattern isn't really helpful any more, as Linus already > > explained. I can't tell from the vhost driver code whether > > we can just drop the access_ok() here and use the plain > > get_user()/copy_from_user(), or if it makes sense to move > > to the newer user_access_begin()/unsafe_get_user()/ > > unsafe_copy_from_user()/user_access_end() and try optimize > > out a few PAN/SMAP flips in the process. Right, according to my testing in the past, PAN/SMAP is a killer for small packet performance (PPS). > > In general, I think there are a few spots where we might be > able to optimize (vhost_get_vq_desc perhaps?) as that gets > called quite a bit and IIRC there are at least two flips > in there that perhaps we could elide to one? An investigation > for another day I think. Did you mean trying to read descriptors in a batch, that would be better and with IN_ORDER it would be even faster as a single (at most two) copy_from_user() might work (without the need to use user_access_begin()/user_access_end(). > > Anyhow, with this info - Jason - is there anything else you > can think of that we want to double click on? Nope. Thanks > > Jon ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-27 1:08 ` Jason Wang @ 2025-11-27 3:11 ` Jon Kohler 2025-11-27 6:31 ` Michael S. Tsirkin 2025-11-27 6:32 ` Michael S. Tsirkin 0 siblings, 2 replies; 12+ messages in thread From: Jon Kohler @ 2025-11-27 3:11 UTC (permalink / raw) To: Jason Wang Cc: Arnd Bergmann, Michael S. Tsirkin, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini > On Nov 26, 2025, at 8:08 PM, Jason Wang <jasowang@redhat.com> wrote: > > On Thu, Nov 27, 2025 at 3:48 AM Jon Kohler <jon@nutanix.com> wrote: >> >> >>> On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: >>> >>> On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: >>>> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: >>>>>> On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: >>>>>> On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: >>>>> Same deal goes for __put_user() vs put_user by way of commit >>>>> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) >>>>> >>>>> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios >>>>> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at >>>>> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") >>>>> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just >>>>> about everything ARM would include this by default? >>> >>> I think the more relevant commit is for 64-bit Arm here, but this does >>> the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother >>> eliding access_ok checks in __{get, put}_user"). >> >> Ah! Right, this is definitely the important bit, as it makes it >> crystal clear that these are exactly the same thing. The current >> code is: >> #define get_user __get_user >> #define put_user __put_user >> >> So, this patch changing from __* to regular versions is a no-op >> on arm side of the house, yea? >> >>> I would think that if we change the __get_user() to get_user() >>> in this driver, the same should be done for the >>> __copy_{from,to}_user(), which similarly skips the access_ok() >>> check but not the PAN/SMAP handling. >> >> Perhaps, thats a good call out. I’d file that under one battle >> at a time. Let’s get get/put user dusted first, then go down >> that road? >> >>> In general, the access_ok()/__get_user()/__copy_from_user() >>> pattern isn't really helpful any more, as Linus already >>> explained. I can't tell from the vhost driver code whether >>> we can just drop the access_ok() here and use the plain >>> get_user()/copy_from_user(), or if it makes sense to move >>> to the newer user_access_begin()/unsafe_get_user()/ >>> unsafe_copy_from_user()/user_access_end() and try optimize >>> out a few PAN/SMAP flips in the process. > > Right, according to my testing in the past, PAN/SMAP is a killer for > small packet performance (PPS). For sure, every little bit helps along that path > >> >> In general, I think there are a few spots where we might be >> able to optimize (vhost_get_vq_desc perhaps?) as that gets >> called quite a bit and IIRC there are at least two flips >> in there that perhaps we could elide to one? An investigation >> for another day I think. > > Did you mean trying to read descriptors in a batch, that would be > better and with IN_ORDER it would be even faster as a single (at most > two) copy_from_user() might work (without the need to use > user_access_begin()/user_access_end(). Yep. I haven’t fully thought through it, just a drive-by idea from looking at code for the recent work I’ve been doing, just scratching my head thinking there *must* be something we can do better there. Basically on the get rx/tx bufs path as well as the vhost_add_used_and_signal_n path, I think we could cluster together some of the get/put users and copy to/from’s. Would take some massaging, but I think there is something there. >> >> Anyhow, with this info - Jason - is there anything else you >> can think of that we want to double click on? > > Nope. > > Thanks Ok thanks. Perhaps we can land this in next and let it soak in, though, knock on wood, I don’t think there will be fallout (famous last words!) ? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-27 3:11 ` Jon Kohler @ 2025-11-27 6:31 ` Michael S. Tsirkin 2025-11-27 6:32 ` Michael S. Tsirkin 1 sibling, 0 replies; 12+ messages in thread From: Michael S. Tsirkin @ 2025-11-27 6:31 UTC (permalink / raw) To: Jon Kohler Cc: Jason Wang, Arnd Bergmann, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini On Thu, Nov 27, 2025 at 03:11:57AM +0000, Jon Kohler wrote: > > > > On Nov 26, 2025, at 8:08 PM, Jason Wang <jasowang@redhat.com> wrote: > > > > On Thu, Nov 27, 2025 at 3:48 AM Jon Kohler <jon@nutanix.com> wrote: > >> > >> > >>> On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: > >>> > >>> On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: > >>>> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: > >>>>>> On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: > >>>>>> On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: > >>>>> Same deal goes for __put_user() vs put_user by way of commit > >>>>> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) > >>>>> > >>>>> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios > >>>>> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at > >>>>> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") > >>>>> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just > >>>>> about everything ARM would include this by default? > >>> > >>> I think the more relevant commit is for 64-bit Arm here, but this does > >>> the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother > >>> eliding access_ok checks in __{get, put}_user"). > >> > >> Ah! Right, this is definitely the important bit, as it makes it > >> crystal clear that these are exactly the same thing. The current > >> code is: > >> #define get_user __get_user > >> #define put_user __put_user > >> > >> So, this patch changing from __* to regular versions is a no-op > >> on arm side of the house, yea? > >> > >>> I would think that if we change the __get_user() to get_user() > >>> in this driver, the same should be done for the > >>> __copy_{from,to}_user(), which similarly skips the access_ok() > >>> check but not the PAN/SMAP handling. > >> > >> Perhaps, thats a good call out. I’d file that under one battle > >> at a time. Let’s get get/put user dusted first, then go down > >> that road? > >> > >>> In general, the access_ok()/__get_user()/__copy_from_user() > >>> pattern isn't really helpful any more, as Linus already > >>> explained. I can't tell from the vhost driver code whether > >>> we can just drop the access_ok() here and use the plain > >>> get_user()/copy_from_user(), or if it makes sense to move > >>> to the newer user_access_begin()/unsafe_get_user()/ > >>> unsafe_copy_from_user()/user_access_end() and try optimize > >>> out a few PAN/SMAP flips in the process. > > > > Right, according to my testing in the past, PAN/SMAP is a killer for > > small packet performance (PPS). > > For sure, every little bit helps along that path > > > > >> > >> In general, I think there are a few spots where we might be > >> able to optimize (vhost_get_vq_desc perhaps?) as that gets > >> called quite a bit and IIRC there are at least two flips > >> in there that perhaps we could elide to one? An investigation > >> for another day I think. > > > > Did you mean trying to read descriptors in a batch, that would be > > better and with IN_ORDER it would be even faster as a single (at most > > two) copy_from_user() might work (without the need to use > > user_access_begin()/user_access_end(). > > Yep. I haven’t fully thought through it, just a drive-by idea > from looking at code for the recent work I’ve been doing, just > scratching my head thinking there *must* be something we can do > better there. > > Basically on the get rx/tx bufs path as well as the > vhost_add_used_and_signal_n path, I think we could cluster together > some of the get/put users and copy to/from’s. Would take some > massaging, but I think there is something there. > > >> > >> Anyhow, with this info - Jason - is there anything else you > >> can think of that we want to double click on? > > > > Nope. > > > > Thanks > > Ok thanks. Perhaps we can land this in next and let it soak in, > though, knock on wood, I don’t think there will be fallout > (famous last words!) ? Yea I'll put this in linux-next and we'll see what happens. -- MST ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-27 3:11 ` Jon Kohler 2025-11-27 6:31 ` Michael S. Tsirkin @ 2025-11-27 6:32 ` Michael S. Tsirkin 2025-12-02 16:54 ` Jon Kohler 1 sibling, 1 reply; 12+ messages in thread From: Michael S. Tsirkin @ 2025-11-27 6:32 UTC (permalink / raw) To: Jon Kohler Cc: Jason Wang, Arnd Bergmann, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini On Thu, Nov 27, 2025 at 03:11:57AM +0000, Jon Kohler wrote: > > > > On Nov 26, 2025, at 8:08 PM, Jason Wang <jasowang@redhat.com> wrote: > > > > On Thu, Nov 27, 2025 at 3:48 AM Jon Kohler <jon@nutanix.com> wrote: > >> > >> > >>> On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: > >>> > >>> On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: > >>>> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: > >>>>>> On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: > >>>>>> On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: > >>>>> Same deal goes for __put_user() vs put_user by way of commit > >>>>> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) > >>>>> > >>>>> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios > >>>>> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at > >>>>> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") > >>>>> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just > >>>>> about everything ARM would include this by default? > >>> > >>> I think the more relevant commit is for 64-bit Arm here, but this does > >>> the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother > >>> eliding access_ok checks in __{get, put}_user"). > >> > >> Ah! Right, this is definitely the important bit, as it makes it > >> crystal clear that these are exactly the same thing. The current > >> code is: > >> #define get_user __get_user > >> #define put_user __put_user > >> > >> So, this patch changing from __* to regular versions is a no-op > >> on arm side of the house, yea? > >> > >>> I would think that if we change the __get_user() to get_user() > >>> in this driver, the same should be done for the > >>> __copy_{from,to}_user(), which similarly skips the access_ok() > >>> check but not the PAN/SMAP handling. > >> > >> Perhaps, thats a good call out. I’d file that under one battle > >> at a time. Let’s get get/put user dusted first, then go down > >> that road? > >> > >>> In general, the access_ok()/__get_user()/__copy_from_user() > >>> pattern isn't really helpful any more, as Linus already > >>> explained. I can't tell from the vhost driver code whether > >>> we can just drop the access_ok() here and use the plain > >>> get_user()/copy_from_user(), or if it makes sense to move > >>> to the newer user_access_begin()/unsafe_get_user()/ > >>> unsafe_copy_from_user()/user_access_end() and try optimize > >>> out a few PAN/SMAP flips in the process. > > > > Right, according to my testing in the past, PAN/SMAP is a killer for > > small packet performance (PPS). > > For sure, every little bit helps along that path > > > > >> > >> In general, I think there are a few spots where we might be > >> able to optimize (vhost_get_vq_desc perhaps?) as that gets > >> called quite a bit and IIRC there are at least two flips > >> in there that perhaps we could elide to one? An investigation > >> for another day I think. > > > > Did you mean trying to read descriptors in a batch, that would be > > better and with IN_ORDER it would be even faster as a single (at most > > two) copy_from_user() might work (without the need to use > > user_access_begin()/user_access_end(). > > Yep. I haven’t fully thought through it, just a drive-by idea > from looking at code for the recent work I’ve been doing, just > scratching my head thinking there *must* be something we can do > better there. > > Basically on the get rx/tx bufs path as well as the > vhost_add_used_and_signal_n path, I think we could cluster together > some of the get/put users and copy to/from’s. Would take some > massaging, but I think there is something there. > > >> > >> Anyhow, with this info - Jason - is there anything else you > >> can think of that we want to double click on? > > > > Nope. > > > > Thanks > > Ok thanks. Perhaps we can land this in next and let it soak in, > though, knock on wood, I don’t think there will be fallout > (famous last words!) ? > To clairify, I think vhost tree is better to put this in next than net-next, both because it's purely core vhost and because unlike net-next vhost rebases so it is easy to just drop the patch if there are issues. I'll put it there. -- MST ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() 2025-11-27 6:32 ` Michael S. Tsirkin @ 2025-12-02 16:54 ` Jon Kohler 0 siblings, 0 replies; 12+ messages in thread From: Jon Kohler @ 2025-12-02 16:54 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Jason Wang, Arnd Bergmann, Eugenio Pérez, kvm@vger.kernel.org, virtualization@lists.linux.dev, Netdev, linux-kernel@vger.kernel.org, Linus Torvalds, Borislav Petkov, Sean Christopherson, linux-arm-kernel@lists.infradead.org, Russell King, Catalin Marinas, Will Deacon, Krzysztof Kozlowski, Alexandre Belloni, Linus Walleij, Drew Fustini > On Nov 27, 2025, at 1:32 AM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Thu, Nov 27, 2025 at 03:11:57AM +0000, Jon Kohler wrote: >> >> >>> On Nov 26, 2025, at 8:08 PM, Jason Wang <jasowang@redhat.com> wrote: >>> >>> On Thu, Nov 27, 2025 at 3:48 AM Jon Kohler <jon@nutanix.com> wrote: >>>> >>>> >>>>> On Nov 26, 2025, at 5:25 AM, Arnd Bergmann <arnd@arndb.de> wrote: >>>>> >>>>> On Wed, Nov 26, 2025, at 07:04, Jason Wang wrote: >>>>>> On Wed, Nov 26, 2025 at 3:45 AM Jon Kohler <jon@nutanix.com> wrote: >>>>>>>> On Nov 19, 2025, at 8:57 PM, Jason Wang <jasowang@redhat.com> wrote: >>>>>>>> On Tue, Nov 18, 2025 at 1:35 AM Jon Kohler <jon@nutanix.com> wrote: >>>>>>> Same deal goes for __put_user() vs put_user by way of commit >>>>>>> e3aa6243434f ("ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()”) >>>>>>> >>>>>>> Looking at arch/arm/mm/Kconfig, there are a variety of scenarios >>>>>>> where CONFIG_CPU_SPECTRE will be enabled automagically. Looking at >>>>>>> commit 252309adc81f ("ARM: Make CONFIG_CPU_V7 valid for 32bit ARMv8 implementations") >>>>>>> it says that "ARMv8 is a superset of ARMv7", so I’d guess that just >>>>>>> about everything ARM would include this by default? >>>>> >>>>> I think the more relevant commit is for 64-bit Arm here, but this does >>>>> the same thing, see 84624087dd7e ("arm64: uaccess: Don't bother >>>>> eliding access_ok checks in __{get, put}_user"). >>>> >>>> Ah! Right, this is definitely the important bit, as it makes it >>>> crystal clear that these are exactly the same thing. The current >>>> code is: >>>> #define get_user __get_user >>>> #define put_user __put_user >>>> >>>> So, this patch changing from __* to regular versions is a no-op >>>> on arm side of the house, yea? >>>> >>>>> I would think that if we change the __get_user() to get_user() >>>>> in this driver, the same should be done for the >>>>> __copy_{from,to}_user(), which similarly skips the access_ok() >>>>> check but not the PAN/SMAP handling. >>>> >>>> Perhaps, thats a good call out. I’d file that under one battle >>>> at a time. Let’s get get/put user dusted first, then go down >>>> that road? >>>> >>>>> In general, the access_ok()/__get_user()/__copy_from_user() >>>>> pattern isn't really helpful any more, as Linus already >>>>> explained. I can't tell from the vhost driver code whether >>>>> we can just drop the access_ok() here and use the plain >>>>> get_user()/copy_from_user(), or if it makes sense to move >>>>> to the newer user_access_begin()/unsafe_get_user()/ >>>>> unsafe_copy_from_user()/user_access_end() and try optimize >>>>> out a few PAN/SMAP flips in the process. >>> >>> Right, according to my testing in the past, PAN/SMAP is a killer for >>> small packet performance (PPS). >> >> For sure, every little bit helps along that path >> >>> >>>> >>>> In general, I think there are a few spots where we might be >>>> able to optimize (vhost_get_vq_desc perhaps?) as that gets >>>> called quite a bit and IIRC there are at least two flips >>>> in there that perhaps we could elide to one? An investigation >>>> for another day I think. >>> >>> Did you mean trying to read descriptors in a batch, that would be >>> better and with IN_ORDER it would be even faster as a single (at most >>> two) copy_from_user() might work (without the need to use >>> user_access_begin()/user_access_end(). >> >> Yep. I haven’t fully thought through it, just a drive-by idea >> from looking at code for the recent work I’ve been doing, just >> scratching my head thinking there *must* be something we can do >> better there. >> >> Basically on the get rx/tx bufs path as well as the >> vhost_add_used_and_signal_n path, I think we could cluster together >> some of the get/put users and copy to/from’s. Would take some >> massaging, but I think there is something there. >> >>>> >>>> Anyhow, with this info - Jason - is there anything else you >>>> can think of that we want to double click on? >>> >>> Nope. >>> >>> Thanks >> >> Ok thanks. Perhaps we can land this in next and let it soak in, >> though, knock on wood, I don’t think there will be fallout >> (famous last words!) ? >> > > > To clairify, I think vhost tree is better to put this > in next than net-next, both because it's purely core vhost > and because unlike net-next vhost rebases so it is easy to > just drop the patch if there are issues. > I'll put it there. > > -- > MST > Ok cool, thank you! ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-12-02 16:55 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20251113005529.2494066-1-jon@nutanix.com>
[not found] ` <CACGkMEtQZ3M-sERT2P8WV=82BuXCbBHeJX+zgxx+9X7OUTqi4g@mail.gmail.com>
[not found] ` <E1226897-C6D1-439C-AB3B-012F8C4A72DF@nutanix.com>
[not found] ` <CACGkMEuPK4=Tf3x-k0ZHY1rqL=2rg60-qdON8UJmQZTqpUryTQ@mail.gmail.com>
[not found] ` <A0AFD371-1FA3-48F7-A259-6503A6F052E5@nutanix.com>
[not found] ` <CACGkMEvD16y2rt+cXupZ-aEcPZ=nvU7+xYSYBkUj7tH=ER3f-A@mail.gmail.com>
[not found] ` <121ABD73-9400-4657-997C-6AEA578864C5@nutanix.com>
2025-11-26 6:04 ` [PATCH net-next] vhost: use "checked" versions of get_user() and put_user() Jason Wang
2025-11-26 10:25 ` Arnd Bergmann
2025-11-26 19:47 ` Jon Kohler
2025-11-26 19:58 ` Arnd Bergmann
2025-11-26 21:42 ` Jon Kohler
2025-11-26 21:45 ` Linus Torvalds
2025-11-27 2:58 ` Jon Kohler
2025-11-27 1:08 ` Jason Wang
2025-11-27 3:11 ` Jon Kohler
2025-11-27 6:31 ` Michael S. Tsirkin
2025-11-27 6:32 ` Michael S. Tsirkin
2025-12-02 16:54 ` Jon Kohler
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).