* Question about memcpy
From: bing zhu @ 2018-07-07 11:36 UTC
To: kernelnewbies

Dear Sir/Ma'am,

Thank you for your time. I'm a student new to the Linux kernel.

I have a question about memcpy: I noticed that memcpy is faster in the kernel than in user space. For example, in a helloworld module I use memcpy to copy 4096 bytes into a block of memory about 10000 times, and in user space I do the same thing. I noticed that the kernel is faster than user space.

Is it possible that in the kernel, when I insmod hello, the code cannot be scheduled out, while in user space it can, and that is why the kernel is faster?

Is there a way for a user task to run a block of code uninterruptibly? No switch, no schedule?

Thank you!
* Question about memcpy
From: valdis.kletnieks at vt.edu @ 2018-07-07 18:44 UTC
To: kernelnewbies

On Sat, 07 Jul 2018 19:36:47 +0800, bing zhu said:

> and in user space i do the same thing,I noticed that kernel is faster than
> user ,

How did you measure the times? Doing this right is actually harder than it looks...
* Question about memcpy
From: bing zhu @ 2018-07-08 14:03 UTC
To: kernelnewbies

    void *p = malloc(4096 * max);
    start = usec();
    for (i = 0; i < max; i++) {
            memcpy(p + i * 4096, page, 4096);
    }
    end = usec();
    printf("%s : %d time use %lu us \n", __func__, max, end - start);

    static unsigned long usec(void)
    {
            struct timeval tv;
            gettimeofday(&tv, 0);
            return (unsigned long)tv.tv_sec * 1000000 + tv.tv_usec;
    }

I don't think it's really precise, but I did notice a difference.
* Question about memcpy
From: 袁建鹏 @ 2018-07-09 7:54 UTC
To: kernelnewbies

Can you show all of the code, both kernel and userspace?

Kernel compile options are optimized and very different from userspace.

You can use the same object (memcpy.o) to link both the userspace program and the kernel module.
* Question about memcpy
From: bing zhu @ 2018-07-09 8:14 UTC
To: kernelnewbies

In the kernel you should use this function:

    static unsigned long usec(void)
    {
            struct timeval tv;
            do_gettimeofday(&tv);
            return (unsigned long)tv.tv_sec * 1000000 + tv.tv_usec;
    }
* Question about memcpy
From: Himanshu Jha @ 2018-07-09 14:04 UTC
To: kernelnewbies

Hi Bing,

On Sun, Jul 08, 2018 at 10:03:48PM +0800, bing zhu wrote:
> void *p = malloc(4096 * max);
> start = usec();
> for (i = 0; i < max; i++) {
>         memcpy(p + i * 4096, page, 4096);
> }
> end = usec();
> printf("%s : %d time use %lu us \n", __func__, max, end - start);
>
> static unsigned long usec(void)
> {
>         struct timeval tv;
>         gettimeofday(&tv, 0);
>         return (unsigned long)tv.tv_sec * 1000000 + tv.tv_usec;
> }

I think for this kind of benchmarking, to evaluate the cycles and time correctly you should use __rdtscp (more info in "AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions", p. 401).

Userspace:
----------------------------------------------------------------------
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <x86intrin.h>

volatile unsigned sink;
unsigned int junk;

int main(void)
{
        clock_t start = clock();
        uint64_t t = __rdtscp(&junk);

        for (size_t i = 0; i < 10000000; ++i)
                sink++;

        t = __rdtscp(&junk) - t;
        clock_t end = clock();
        double cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;

        /* PRIu64 is the matching format for uint64_t (%zu is for size_t) */
        printf("for loop took %f seconds to execute %" PRIu64 " cycles\n",
               cpu_time_used, t);
        return 0;
}
----------------------------------------------------------------------

Kernelspace:

If you want to dig more:
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf

Thanks
--
Himanshu Jha
Undergraduate Student
Department of Electronics & Communication
Guru Tegh Bahadur Institute of Technology
* Question about memcpy
From: valdis.kletnieks at vt.edu @ 2018-07-09 16:16 UTC
To: kernelnewbies

On Mon, 09 Jul 2018 19:34:44 +0530, Himanshu Jha said:

> I think for these benchmarking stuff, to evaluate the cycles and time
> correctly you should use the __rdtscp(more info at "AMD64 Architecture
> Programmer's Manual Volume 3: General-Purpose and System Instructions"
> Pg 401)

Just beware that many Intel (and maybe some AMD) chipsets have a non-constant TSC frequency. Check /proc/cpuinfo for 'constant_tsc' before relying on the value.
* Question about memcpy
From: Himanshu Jha @ 2018-07-14 10:10 UTC
To: kernelnewbies

On Mon, Jul 09, 2018 at 12:16:27PM -0400, valdis.kletnieks at vt.edu wrote:
> On Mon, 09 Jul 2018 19:34:44 +0530, Himanshu Jha said:
>
> > I think for these benchmarking stuff, to evaluate the cycles and time
> > correctly you should use the __rdtscp(more info at "AMD64 Architecture
> > Programmer's Manual Volume 3: General-Purpose and System Instructions"
> > Pg 401)
>
> Just beware that many Intel (and maybe some AMD) chipsets have a non-constant
> TSC frequency. Check /proc/cpuinfo for 'constant_tsc' before relying on the value.

How about setting the "performance" governor[1] for all CPUs? Would that work? I mean no throttling down, but I'm not sure if we would then have a constant cpufreq.

Something like the following script:

    for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
            [ -f "$CPUFREQ" ] || continue
            echo -n performance > "$CPUFREQ"
    done

[1] https://www.kernel.org/doc/html/v4.14/admin-guide/pm/cpufreq.html#performance

--
Himanshu Jha
Undergraduate Student
Department of Electronics & Communication
Guru Tegh Bahadur Institute of Technology
* Question about memcpy
From: bing zhu @ 2018-07-10 4:50 UTC
To: kernelnewbies

I agree! I just think the problem is still there: memcpy is indeed faster in the kernel than in user space. I've tried both ways. Scheduling might be to blame.
* Question about memcpy
From: Greg KH @ 2018-07-10 6:22 UTC
To: kernelnewbies

On Tue, Jul 10, 2018 at 12:50:21PM +0800, bing zhu wrote:
> I agree !,just i think the problem is still there,memcpy is indeed faster in
> kernel than in user,i've tried both ways .

Make sure you are actually using the same code for memcpy in both places. Do not rely on your libc or the kernel library for such a thing, otherwise you are not comparing the same code exactly.

> schedule might be to blame.

Lots of things "might be to blame", but first off, try to work out exactly what you are trying to test, and why, and work on that.

good luck!

greg k-h
* Question about memcpy
From: bing zhu @ 2018-07-10 14:51 UTC
To: kernelnewbies

Thank you. I use this function for both kernel and user space, and the results are the same:

    void *memcpy(void *dest, const void *src, size_t n)
    {
            long d0, d1, d2;
            asm volatile(
                    "rep ; movsq\n\t"       /* copy n / 8 quadwords */
                    "movq %4,%%rcx\n\t"     /* load the remaining n % 8 bytes count */
                    "rep ; movsb\n\t"       /* copy the tail byte by byte */
                    : "=&c" (d0), "=&D" (d1), "=&S" (d2)
                    : "0" (n >> 3), "g" (n & 7), "1" (dest), "2" (src)
                    : "memory");

            return dest;
    }

The kernel is indeed faster than user space.
* Question about memcpy
From: Greg KH @ 2018-07-10 14:57 UTC
To: kernelnewbies

On Tue, Jul 10, 2018 at 10:51:34PM +0800, bing zhu wrote:
> Thank you ,I use this func for both kernel and user ,result are same.
> void *memcpy(void *dest, const void *src, size_t n)
> {
>         long d0, d1, d2;
>         asm volatile(
>                 "rep ; movsq\n\t"
>                 "movq %4,%%rcx\n\t"
>                 "rep ; movsb\n\t"
>                 : "=&c" (d0), "=&D" (d1), "=&S" (d2)
>                 : "0" (n >> 3), "g" (n & 7), "1" (dest), "2" (src)
>                 : "memory");
>
>         return dest;
> }
> kernel is indeed faster than user.

Ok, and that is due to the fact that the kernel thread does not get scheduled, unlike your userspace program. So this means the kernel is working as designed :)

greg k-h
* Question about memcpy
From: valdis.kletnieks at vt.edu @ 2018-07-10 16:03 UTC
To: kernelnewbies

On Tue, 10 Jul 2018 22:51:34 +0800, bing zhu said:

> Thank you ,I use this func for both kernel and user ,result are same.
> void *memcpy(void *dest, const void *src, size_t n)
> {

Might want to use 'void *my_memcpy(..)' instead, just in case the build environment plays #define games with you and causes a different memcpy() to get invoked instead.

[/usr/src/linux-next] egrep -r '#define\s*memcpy\(' include/ arch/*/include
arch/arm64/include/asm/string.h:#define memcpy(dst, src, len) __memcpy(dst, src, len)
arch/m68k/include/asm/string.h:#define memcpy(d, s, n) __builtin_memcpy(d, s, n)
arch/sparc/include/asm/string.h:#define memcpy(t, f, n) __builtin_memcpy(t, f, n)
arch/x86/include/asm/string_64.h:#define memcpy(dst, src, len) \
arch/x86/include/asm/string_64.h:#define memcpy(dst, src, len) __memcpy(dst, src, len)
arch/x86/include/asm/string_32.h:#define memcpy(t, f, n) \
arch/x86/include/asm/string_32.h:#define memcpy(t, f, n) __builtin_memcpy(t, f, n)
arch/x86/include/asm/string_32.h:#define memcpy(t, f, n) \
arch/xtensa/include/asm/string.h:#define memcpy(dst, src, len) __memcpy(dst, src, len)
* Question about memcpy
From: bing zhu @ 2018-07-12 4:47 UTC
To: kernelnewbies

Agreed! A simple rename would suffice. The results are the same: the kernel is faster. Could anyone help fix this?
* Question about memcpy
From: Greg KH @ 2018-07-12 5:34 UTC
To: kernelnewbies

On Thu, Jul 12, 2018 at 12:47:12PM +0800, bing zhu wrote:
> agree! a simple rename would survice.results are the same .kernel is faster
> could anyone help fix this ?

Fix what exactly?
* Question about memcpy
From: bing zhu @ 2018-07-12 14:27 UTC
To: kernelnewbies

As for memcpy, the kernel is faster than user space, possibly because of scheduling. Can I try to make user space as fast as the kernel?
* Question about memcpy
From: Greg KH @ 2018-07-12 14:53 UTC
To: kernelnewbies

A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Thu, Jul 12, 2018 at 10:27:37PM +0800, bing zhu wrote:
> as for memcpy ,kernel is faster than user ,might because schedule ,can i try to
> make user as fast as kernel ?

You can bind a specific CPU to your userspace task, and have it only run that program and not get interrupted at all for anything. That would make it as fast as the kernel runs. Lots of people do that in high frequency trading as they don't want the CPU to get in the way of their work or response times to the network.

But without doing fancy tricks like that, no. Think about what an operating system does. Its job is to schedule things that need to be done behind the back of your program. Otherwise there's no need for it, right?

good luck!

greg k-h
* Question about memcpy
From: bing zhu @ 2018-07-13 3:02 UTC
To: kernelnewbies

I'm trying to write a simple fs in user space. If memcpy is slower than in the kernel, I think that's unfair. As for dedicating a CPU to only my task, that's a bit arbitrary. I just want my task not to be interrupted during a specific time. Is that possible?
* Question about memcpy
From: valdis.kletnieks at vt.edu @ 2018-07-13 7:33 UTC
To: kernelnewbies

On Fri, 13 Jul 2018 11:02:13 +0800, bing zhu said:

> I'm trying to write a simple fs in user space,if memcpy is slower than
> kernel , i think it's unfair,as for only cpu for my task,
> it's a bit of arbitrary, i just want my task not interrupted during a
> specific time is that possible ?

Not getting interrupted is an *entirely* different issue than making memcpy fast.

Note that in general, systems code should be able to deal with interruptions during most parts of the code, with locking and disabled pre-emption used for the sections of code that can't deal with being interrupted. Remember that if your filesystem code turns off interrupts for long enough, you can start losing things like I/O completions. Fortunately for those who write systems code, the vast majority of interrupts are totally transparent to the vast majority of the kernel code.

And if you're doing a file system in userspace, you're going to fail to notice hundreds or even thousands of interrupts happening. If you don't believe me, 'cat /proc/interrupts', and realize that userspace didn't notice *any* of them happening.
* Question about memcpy
From: bing zhu @ 2018-07-17 2:44 UTC
To: kernelnewbies

Thanks for elaborating. I've learned that it's not worth it, so I'm turning to other ways of improving performance. Thanks!
* Question about memcpy
From: valdis.kletnieks at vt.edu @ 2018-07-12 16:49 UTC
To: kernelnewbies

On Thu, 12 Jul 2018 22:27:37 +0800, bing zhu said:

> as for memcpy ,kernel is faster than user ,might because schedule ,can i
> try to make user as fast as kernel ?

Do you have an actual issue where the difference in speed of these two things makes a difference? Or is this primarily a mental curiosity thing?
* Question about memcpy
From: Alex Arvelaez @ 2018-07-07 13:21 UTC
To: kernelnewbies

On Jul 7, 2018 7:37 AM, bing zhu <zhubohong12@gmail.com> wrote:
>
> Dear Sir/Ma'am
> Thank you for your time ,i'm a student new to linux kernel.
> I have a question about memcpy,i noticed that memcpy is faster in kernel than in user space
> for example :
> in a module helloworld , i use memcpy to copy a 4096B to a block of memory for like 10000 times
> and in user space i do the same thing,I noticed that kernel is faster than user ,
> is it possible that in kernel when i insmod hello it can not be scheduled but in user space it will so kernel is faster?

This makes sense, fewer context switches.

> is there a possible way that a user task can run a block of code that uninterruptable? No switch ,no schedule ?

I don't think this is possible, Linux is a preemptive kernel.

> Thank you !
end of thread, other threads:[~2018-07-17 2:44 UTC | newest]

Thread overview: 22+ messages
2018-07-07 11:36 Question about memcpy bing zhu
2018-07-07 18:44 ` valdis.kletnieks at vt.edu
2018-07-08 14:03   ` bing zhu
2018-07-09  7:54     ` 袁建鹏
2018-07-09  8:14       ` bing zhu
2018-07-09 14:04     ` Himanshu Jha
2018-07-09 16:16       ` valdis.kletnieks at vt.edu
2018-07-14 10:10         ` Himanshu Jha
2018-07-10  4:50       ` bing zhu
2018-07-10  6:22         ` Greg KH
2018-07-10 14:51           ` bing zhu
2018-07-10 14:57             ` Greg KH
2018-07-10 16:03             ` valdis.kletnieks at vt.edu
2018-07-12  4:47               ` bing zhu
2018-07-12  5:34                 ` Greg KH
2018-07-12 14:27                   ` bing zhu
2018-07-12 14:53                     ` Greg KH
2018-07-13  3:02                       ` bing zhu
2018-07-13  7:33                         ` valdis.kletnieks at vt.edu
2018-07-17  2:44                           ` bing zhu
2018-07-12 16:49                     ` valdis.kletnieks at vt.edu
-- strict thread matches above, loose matches on Subject: below --
2018-07-07 13:21 Alex Arvelaez