From mboxrd@z Thu Jan 1 00:00:00 1970
From: mulyadi.santosa@gmail.com (Mulyadi Santosa)
Date: Sun, 12 Feb 2012 18:46:42 +0700
Subject: How to measure performance inside Kernel?
In-Reply-To:
References: <6F5DE7538AFCDA45A114F5E7510424A702F8C9FC@hq-exchange01.bytemobile.com> <4F35E432.6060002@gmail.com>
Message-ID:
To: kernelnewbies@lists.kernelnewbies.org
List-Id: kernelnewbies.lists.kernelnewbies.org

Hi Peter...

On Sat, Feb 11, 2012 at 20:57, Peter Senna Tschudin wrote:
> Graeme,
>
> I found a problem on my code. I was calling kmalloc() only once for
> both portions of code. The result is that the first loop that accessed
> the memory was finding some penalty. Now I'm calling independent
> kmalloc for each test.

Sorry for jumping into the middle of the discussion :)

I read your code and I think the kmalloc() usage can be streamlined
here. I recommend a single kmalloc() call that allocates all the memory
needed for the whole q->buf[] array. Something like (CMIIW):

    q->buf = kmalloc(sizeof(struct vb_buffer) * q->num_buffers, GFP_KERNEL);

then access q->buf[0], q->buf[1], etc.

This way, AFAIK, the buffers end up in memory that is not only virtually
contiguous but also physically contiguous, since kmalloc() returns
physically contiguous memory for a single allocation. Having the buffers
adjacent should also make prefetching into the L1/L2 cache easier.

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
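
P.S. To make the single-allocation idea concrete, here is a rough
userspace sketch, not the actual driver code. Everything in it is an
assumption for illustration: the layout of struct vb_buffer, the
struct vb_queue wrapper and the vb_queue_*() helpers are hypothetical,
and calloc() stands in for the kmalloc() call above. It can only show
that the elements sit back to back in virtual address space; physical
contiguity is a property of kmalloc() and cannot be checked this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the struct vb_buffer from the thread. */
struct vb_buffer {
	unsigned int index;
	unsigned char payload[60];
};

/* Hypothetical queue that owns the buffer array. */
struct vb_queue {
	struct vb_buffer *buf;
	size_t num_buffers;
};

/*
 * One allocation for the whole array instead of one per buffer.
 * calloc() is the userspace stand-in for kmalloc(..., GFP_KERNEL);
 * it also checks the n * size multiplication for overflow.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int vb_queue_alloc(struct vb_queue *q, size_t n)
{
	size_t i;

	assert(q != NULL);
	q->buf = calloc(n, sizeof(*q->buf));
	if (!q->buf)
		return -1;
	q->num_buffers = n;
	for (i = 0; i < n; i++)
		q->buf[i].index = (unsigned int)i;
	return 0;
}

/* Release the array and reset the queue to an empty state. */
static void vb_queue_free(struct vb_queue *q)
{
	free(q->buf);
	q->buf = NULL;
	q->num_buffers = 0;
}

/* Returns 1 if every element starts exactly where the previous one ends. */
static int vb_queue_is_contiguous(const struct vb_queue *q)
{
	size_t i;

	for (i = 0; i + 1 < q->num_buffers; i++) {
		const unsigned char *end =
			(const unsigned char *)&q->buf[i] + sizeof(q->buf[i]);

		if (end != (const unsigned char *)&q->buf[i + 1])
			return 0;
	}
	return 1;
}
```

A caller would run vb_queue_alloc(&q, 8), walk q.buf[0] through
q.buf[7], and finish with vb_queue_free(&q). With one allocation there
is also only one failure path to handle and one pointer to free.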