* [PATCH] oprofile per-cpu buffer overrun @ 2004-01-26 2:37 Philippe Elie 2004-01-26 4:07 ` Andrew Morton 0 siblings, 1 reply; 6+ messages in thread From: Philippe Elie @ 2004-01-26 2:37 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, John Levon Hi Andrew, In a ring buffer controlled by a read and write positions we can't use buffer_size but only buffer_size - 1 entry, the last free entry act as a guard to avoid write pos overrun. This bug was hidden because the writer, oprofile_add_sample(), request one more entry than really needed. regards, Phil Index: drivers/oprofile/cpu_buffer.c =================================================================== RCS file: /usr/local/cvsroot/linux-2.5/drivers/oprofile/cpu_buffer.c,v retrieving revision 1.9 diff -u -p -r1.9 cpu_buffer.c --- drivers/oprofile/cpu_buffer.c 26 May 2003 04:42:54 -0000 1.9 +++ drivers/oprofile/cpu_buffer.c 24 Jan 2004 21:07:03 -0000 @@ -86,9 +86,9 @@ static unsigned long nr_available_slots( unsigned long tail = b->tail_pos; if (tail > head) - return tail - head; + return (tail - head) - 1; - return tail + (b->buffer_size - head); + return tail + (b->buffer_size - head) - 1; } ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] oprofile per-cpu buffer overrun 2004-01-26 2:37 [PATCH] oprofile per-cpu buffer overrun Philippe Elie @ 2004-01-26 4:07 ` Andrew Morton 2004-01-26 5:56 ` Anton Blanchard 2004-01-26 10:32 ` John Levon 0 siblings, 2 replies; 6+ messages in thread From: Andrew Morton @ 2004-01-26 4:07 UTC (permalink / raw) To: Philippe Elie; +Cc: linux-kernel, levon Philippe Elie <phil.el@wanadoo.fr> wrote: > > Hi Andrew, > > In a ring buffer controlled by a read and write positions we > can't use buffer_size but only buffer_size - 1 entry, you can, actually. > the last > free entry act as a guard to avoid write pos overrun. This bug > was hidden because the writer, oprofile_add_sample(), request > one more entry than really needed. > >... > diff -u -p -r1.9 cpu_buffer.c > --- drivers/oprofile/cpu_buffer.c 26 May 2003 04:42:54 -0000 1.9 > +++ drivers/oprofile/cpu_buffer.c 24 Jan 2004 21:07:03 -0000 > @@ -86,9 +86,9 @@ static unsigned long nr_available_slots( > unsigned long tail = b->tail_pos; > > if (tail > head) > - return tail - head; > + return (tail - head) - 1; > > - return tail + (b->buffer_size - head); > + return tail + (b->buffer_size - head) - 1; > } When implementing a circular buffer it is better to not constrain the head and tail indices - just let them grow and wrap without bound. You only need to bring them in-bounds when you actually use them to index the buffer. This way, - head-tail is always the amount of used space, no need to futz around handling the case where one has wrapped and the other hasn't. - you get to use all of the buffer, because the cases head-tail == 0 (empty) and head-tail == bufsize (full) are now distinguishable. It helps if the buffer size is a power of two, of course, but integer modulus is pretty quick. All the net drivers and the printk log buffer implement their ring buffers in this way; it works nicely. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] oprofile per-cpu buffer overrun 2004-01-26 4:07 ` Andrew Morton @ 2004-01-26 5:56 ` Anton Blanchard 2004-01-26 10:32 ` John Levon 1 sibling, 0 replies; 6+ messages in thread From: Anton Blanchard @ 2004-01-26 5:56 UTC (permalink / raw) To: Andrew Morton; +Cc: Philippe Elie, linux-kernel, levon > It helps if the buffer size is a power of two, of course, but integer > modulus is pretty quick. If its something thats hit real hard, the way the modulus is done can make a difference. We've seen it in various places (eg disabling tigon 1 support in the acenic changes the driver from using a variable to a compile time constant and you can actually see it in the profile): quickest == power of 2 compile time constant (results in a mask) quickish == compile time constant (results in the multiplication by an inverse trick) slow == run time (results in a divide) Of course you can do like the printk stuff and use a variable but enforce a power of 2 value and & it. Anton -- unsigned long i, j, k; int quickest() { j = i % 2; } int quickish() { j = i % 3; } int slow() { j = i % k; } -- quickest: lis 9,i@ha lwz 0,i@l(9) lis 9,j@ha rlwinm 0,0,0,31,31 stw 0,j@l(9) blr quickish: lis 9,i@ha lis 0,0xaaaa lwz 11,i@l(9) ori 0,0,43691 lis 9,j@ha mulhwu 0,11,0 srwi 0,0,1 mulli 0,0,3 subf 11,0,11 stw 11,j@l(9) blr slow: lis 9,i@ha lis 11,k@ha lwz 10,i@l(9) lwz 9,k@l(11) divwu 0,10,9 mullw 0,0,9 lis 9,j@ha subf 10,0,10 stw 10,j@l(9) blr ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] oprofile per-cpu buffer overrun 2004-01-26 4:07 ` Andrew Morton 2004-01-26 5:56 ` Anton Blanchard @ 2004-01-26 10:32 ` John Levon 2004-01-26 11:07 ` Nick Piggin 2004-01-26 15:52 ` Philippe Elie 1 sibling, 2 replies; 6+ messages in thread From: John Levon @ 2004-01-26 10:32 UTC (permalink / raw) To: Andrew Morton; +Cc: Philippe Elie, linux-kernel On Sun, Jan 25, 2004 at 08:07:01PM -0800, Andrew Morton wrote: > When implementing a circular buffer it is better to not constrain the head > and tail indices - just let them grow and wrap without bound. You only need > to bring them in-bounds when you actually use them to index the buffer. I'm not sure why that's better. > - head-tail is always the amount of used space, no need to futz around > handling the case where one has wrapped and the other hasn't. I admit I have a hangover, but it seems to me it would actually be more complicated to make damn sure that the integer overflow case is definitely handled properly. I clearly can't write functioning ring buffers :), so I'd prefer it to stay as simple as possible. regards, john -- Khendon's Law: If the same point is made twice by the same person, the thread is over. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] oprofile per-cpu buffer overrun 2004-01-26 10:32 ` John Levon @ 2004-01-26 11:07 ` Nick Piggin 2004-01-26 15:52 ` Philippe Elie 1 sibling, 0 replies; 6+ messages in thread From: Nick Piggin @ 2004-01-26 11:07 UTC (permalink / raw) To: John Levon; +Cc: Andrew Morton, Philippe Elie, linux-kernel John Levon wrote: >On Sun, Jan 25, 2004 at 08:07:01PM -0800, Andrew Morton wrote: > > >>When implementing a circular buffer it is better to not constrain the head >>and tail indices - just let them grow and wrap without bound. You only need >>to bring them in-bounds when you actually use them to index the buffer. >> > >I'm not sure why that's better. > > >>- head-tail is always the amount of used space, no need to futz around >> handling the case where one has wrapped and the other hasn't. >> > >I admit I have a hangover, but it seems to me it would actually be more >complicated to make damn sure that the integer overflow case is >definitely handled properly. > You needn't worry about integer overflow - it is handled properly ;) ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] oprofile per-cpu buffer overrun 2004-01-26 10:32 ` John Levon 2004-01-26 11:07 ` Nick Piggin @ 2004-01-26 15:52 ` Philippe Elie 1 sibling, 0 replies; 6+ messages in thread From: Philippe Elie @ 2004-01-26 15:52 UTC (permalink / raw) To: John Levon; +Cc: Andrew Morton, linux-kernel John Levon wrote: > On Sun, Jan 25, 2004 at 08:07:01PM -0800, Andrew Morton wrote: > > >>When implementing a circular buffer it is better to not constrain the head >>and tail indices - just let them grow and wrap without bound. You only need >>to bring them in-bounds when you actually use them to index the buffer. neat! > I'm not sure why that's better. We win in increment_head/increment_tail: static void increment_head(struct oprofile_cpu_buffer * b) { - unsigned long new_head = b->head_pos + 1; wmb(); - if (new_head < (b->buffer_size)) - b->head_pos = new_head; - else - b->head_pos = 0; + b->head_pos++; } for this added cost when accessing the buffer: b->buffer[b->head & b->buffer_size_mask]; Modulo use is not worth but with buffer_size a power of 2 it's probably a win, I'll try and measure this later, not urgent since the problem is fixed, I added it in our todo. regards, Phil ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2004-01-26 14:36 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2004-01-26 2:37 [PATCH] oprofile per-cpu buffer overrun Philippe Elie 2004-01-26 4:07 ` Andrew Morton 2004-01-26 5:56 ` Anton Blanchard 2004-01-26 10:32 ` John Levon 2004-01-26 11:07 ` Nick Piggin 2004-01-26 15:52 ` Philippe Elie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox