public inbox for kernel-janitors@vger.kernel.org
 help / color / mirror / Atom feed
* busy loops in kernel
@ 2012-01-18  4:15 Asim
  2012-01-18  4:55 ` Greg Dietsche
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Asim @ 2012-01-18  4:15 UTC (permalink / raw)
  To: kernel-janitors

Hi,

I was just wanted to send a message to janitors -

We've identified a set of busy loops, waiting on hardware in the Linux
kernel. These are perhaps not the best way to wait for a device to
become ready and can occasionally freeze the system. These need to be
fixed by adding timeouts (for example by using rdtscll to wait for 2
clock cycles).

The list is here => http://pages.cs.wisc.edu/~kadav/busy_waits.txt  I
have added some caveats to be on the safe side. I will also add/update
this list as time passes by and as these are fixed (or as I identify
more).

Thanks,
Asim

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
@ 2012-01-18  4:55 ` Greg Dietsche
  2012-01-18  7:37 ` Dan Carpenter
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Greg Dietsche @ 2012-01-18  4:55 UTC (permalink / raw)
  To: kernel-janitors

On 1/17/2012 10:15 PM, Asim wrote:
> The list is here => http://pages.cs.wisc.edu/~kadav/busy_waits.txt  I
> have added some caveats to be on the safe side. I will also add/update
> this list as time passes by and as these are fixed (or as I identify
> more).
Hi Asim,
What version of the kernel source was used to generate this list?

Thanks,
Greg

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
  2012-01-18  4:55 ` Greg Dietsche
@ 2012-01-18  7:37 ` Dan Carpenter
  2012-01-18  7:39 ` Dan Carpenter
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Dan Carpenter @ 2012-01-18  7:37 UTC (permalink / raw)
  To: kernel-janitors

[-- Attachment #1: Type: text/plain, Size: 383 bytes --]

Interesting stuff.  You guys are doing static analysis on the .o
files?  I'm looking at the first line.

ticks 25 What does the ticks count mean?

Infinite Loop:142  Ok.  I see this on drivers/atm/zatm.c line 142.
It's a real infinite loop (if you're zatm hardware is broken).

Dynamic array:393  What's this?  I don't see any dynamic arrays in
this driver.

regards,
dan carpenter


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
  2012-01-18  4:55 ` Greg Dietsche
  2012-01-18  7:37 ` Dan Carpenter
@ 2012-01-18  7:39 ` Dan Carpenter
  2012-01-18 17:45 ` Asim
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Dan Carpenter @ 2012-01-18  7:39 UTC (permalink / raw)
  To: kernel-janitors

[-- Attachment #1: Type: text/plain, Size: 334 bytes --]

On Wed, Jan 18, 2012 at 10:37:30AM +0300, Dan Carpenter wrote:
> Interesting stuff.  You guys are doing static analysis on the .o
> files?  I'm looking at the first line.
> 
> ticks 25 What does the ticks count mean?

I figured it out.  Ticks is the number of infinite loops found in
the .o file.

regards,
dan carpenter


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (2 preceding siblings ...)
  2012-01-18  7:39 ` Dan Carpenter
@ 2012-01-18 17:45 ` Asim
  2012-01-18 17:48 ` Asim
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Asim @ 2012-01-18 17:45 UTC (permalink / raw)
  To: kernel-janitors

Hi Greg,

This is a 6 month old kernel. 2.6.38. I am now working to get the
results in 3.2.1 kernel. I cross checked and most drivers still have
these problems even though in some cases the line numbers have changed
by an offset.
I will send the 3.2.1 results as soon as I can to this list. I am
mostly facing issues with macros like DEFINE_TIMER which just breaks
all over analyses and have surprising spread all over the drivers
directory.

But as I said earlier, most of the problems still  remain, in many
cases with the same line numbers..

Thanks,
Asim

On Tue, Jan 17, 2012 at 8:55 PM, Greg Dietsche <gregory.dietsche@cuw.edu> wrote:
> On 1/17/2012 10:15 PM, Asim wrote:
>> The list is here => http://pages.cs.wisc.edu/~kadav/busy_waits.txt  I
>> have added some caveats to be on the safe side. I will also add/update
>> this list as time passes by and as these are fixed (or as I identify
>> more).
> Hi Asim,
> What version of the kernel source was used to generate this list?
>
> Thanks,
> Greg
--
To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (3 preceding siblings ...)
  2012-01-18 17:45 ` Asim
@ 2012-01-18 17:48 ` Asim
  2012-01-21 17:33 ` Asim
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Asim @ 2012-01-18 17:48 UTC (permalink / raw)
  To: kernel-janitors

Hi Dan,

Yes we do static analysis. The txt file has a bit more information. It
is actually: count, line numbers and C file. In some cases, when
driver comprises of multiple files, line numbers may be shifted (since
all files are merged together for analysis).

Some cases are more insidious where drivers use hardware values for
array/pointer de-reference. They are way fewer than busy loops but I
have muted them for this iteration at least.

And, yes these are real infinite loops, about 1000 of them.

Thanks,
Asim

On Tue, Jan 17, 2012 at 11:37 PM, Dan Carpenter
<dan.carpenter@oracle.com> wrote:
> Interesting stuff.  You guys are doing static analysis on the .o
> files?  I'm looking at the first line.
>
> ticks 25 What does the ticks count mean?
>
> Infinite Loop:142  Ok.  I see this on drivers/atm/zatm.c line 142.
> It's a real infinite loop (if you're zatm hardware is broken).
>
> Dynamic array:393  What's this?  I don't see any dynamic arrays in
> this driver.
>
> regards,
> dan carpenter
>
--
To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (4 preceding siblings ...)
  2012-01-18 17:48 ` Asim
@ 2012-01-21 17:33 ` Asim
  2012-01-21 19:04 ` Julia Lawall
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Asim @ 2012-01-21 17:33 UTC (permalink / raw)
  To: kernel-janitors

Hi all,

I was able to generate a list for Linux 3.2.1. This is the list of
busy waits for hardware in the Linux 3.2.1 kernel:

http://pages.cs.wisc.edu/~kadav/busy_waits-3.2.1.txt

I was suggested that I release this list to the janitors and hopefully
we can reduce some of these.

Thanks,
Asim

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (5 preceding siblings ...)
  2012-01-21 17:33 ` Asim
@ 2012-01-21 19:04 ` Julia Lawall
  2012-01-21 20:46 ` Asim
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Julia Lawall @ 2012-01-21 19:04 UTC (permalink / raw)
  To: kernel-janitors

I looked at this one:

ticks 2  Infinite Loop:228  Infinite Loop:600    CC [M] 
drivers/i2c/busses/i2c-xiic.o

Line 200 looks like what you described in your previous message.  I don't 
understand the problem in line 600, though:

         while ((fifo_space >= 2) && (first || (i2c->nmsgs > 1))) {
                 if (!first) {
                         i2c->nmsgs--;
                         i2c->tx_msg++;
                         i2c->tx_pos = 0;
                 } else
                         first = 0;

                 if (i2c->tx_msg->flags & I2C_M_RD) {
                         /* we dont date putting several reads in the FIFO 
*/
                         xiic_start_recv(i2c);
                         return;
                 } else {
                         xiic_start_send(i2c);
                         if (xiic_tx_space(i2c) != 0) {
                                 /* the message could not be completely 
sent */
                                 break;
                         }
                 }

 		fifo_space = xiic_tx_fifo_space(i2c);
         }

It looks like on every iteration execpt the first one, i2c->nmsgs is 
decremented, so it will not long be greater than one.

The next two rely on kernel timing mechansims:

ticks 1  Infinite Loop:284    CC [M]  drivers/i2c/busses/i2c-eg20t.o
    while (ktime_lt(ktime_get(), ns_val))
ticks 1  Infinite Loop:84    CC [M]  drivers/i2c/busses/i2c-pca-isa.o
    ret = time_before(jiffies, timeout); ... while (ret)

Are these timing mechanisms unsafe?

In this case:

ticks 1  Infinite Loop:338    CC [M]  drivers/infiniband/hw/amso1100/c2.o

it looks like it is actually the outer loop, starting on line 336, that is 
unsafe.

For this one:

ticks 1  Infinite Loop:47    CC [M]  drivers/infiniband/hw/amso1100/c2_intr.o

I am not completely sure to understand the code.  c2dev->hints_read is 
nonlocal and is never explicitly reset to 0.  Perhaps this code is only 
executed once per instance of cdev.  In any case hints_read is incremented 
on each iteration.  But perhaps the value of 
be16_to_cpu(*c2dev->hint_count) can also change at each access?

This is clearly the sort of loop you described:

ticks 1  Infinite Loop:735    CC [M]  drivers/infiniband/hw/amso1100/c2_qp.o

Skipping ahead a bit, I don't at all see the problem in the following 
code:

ticks 1  Infinite Loop:184    CC [M]  drivers/media/video/ivtv/ivtv-firmware.o

         for (i = 0; i < size; i += 0x100) {
 		if (readl(mem + i)      = 0x12345678 &&
                     readl(mem + i + 4)  = 0x34567812 &&
                     readl(mem + i + 8)  = 0x56781234 &&
                     readl(mem + i + 12) = 0x78123456) {
                         return (volatile struct ivtv_mailbox __iomem *)(mem + i + 16);
                 }
         }

I don't see the problem here either:

ticks 1  Infinite Loop:93    CC [M]  drivers/media/video/pvrusb2/pvrusb2-eeprom.o

         for (tcnt = 0; tcnt < EEPROM_SIZE; tcnt += pcnt) {
                 pcnt = 16;
                 if (pcnt + tcnt > EEPROM_SIZE) pcnt = EEPROM_SIZE-tcnt;
 		...
 	}

The next report is on similar code:

ticks 1  Infinite Loop:3536    CC [M]  drivers/media/video/pvrusb2/pvrusb2-hdw.o

These two indeed might be dangerous:

ticks 2  Infinite Loop:481  Infinite Loop:676    CC [M]  drivers/media/video/saa7134/saa7134-tvaudio.o

julia

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (6 preceding siblings ...)
  2012-01-21 19:04 ` Julia Lawall
@ 2012-01-21 20:46 ` Asim
  2012-01-21 20:50 ` Julia Lawall
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Asim @ 2012-01-21 20:46 UTC (permalink / raw)
  To: kernel-janitors

Yeah - most of the busy waits that are correct are short and fairly
obvious while loops and there are plenty of them. There are some
complex ones too.

However, we do have false positive rate of 5- 8% (as measured on
2.6.18 kernel) - so in some (~100) cases for example the while loops
where variables may be re-used it may report the false positive. As
far as kernel timing is concerned - we do have mechanisms to account
for these based on our experience with the timers used in 2.6 kernel.
I will add these timing mechanisms and refresh the results - and
continue to update as I encounter more diverse mechanisms to account
for timing. But I strongly believe this a good subset to look at to
catch these problems.

Thanks,
Asim

On Sat, Jan 21, 2012 at 11:04 AM, Julia Lawall <julia.lawall@lip6.fr> wrote:
> I looked at this one:
>
> ticks 2  Infinite Loop:228  Infinite Loop:600    CC [M]
> drivers/i2c/busses/i2c-xiic.o
>
> Line 200 looks like what you described in your previous message.  I don't
> understand the problem in line 600, though:
>
>        while ((fifo_space >= 2) && (first || (i2c->nmsgs > 1))) {
>                if (!first) {
>                        i2c->nmsgs--;
>                        i2c->tx_msg++;
>                        i2c->tx_pos = 0;
>                } else
>                        first = 0;
>
>                if (i2c->tx_msg->flags & I2C_M_RD) {
>                        /* we dont date putting several reads in the FIFO */
>                        xiic_start_recv(i2c);
>                        return;
>                } else {
>                        xiic_start_send(i2c);
>                        if (xiic_tx_space(i2c) != 0) {
>                                /* the message could not be completely sent
> */
>                                break;
>                        }
>                }
>
>                fifo_space = xiic_tx_fifo_space(i2c);
>        }
>
> It looks like on every iteration execpt the first one, i2c->nmsgs is
> decremented, so it will not long be greater than one.
>
> The next two rely on kernel timing mechansims:
>
> ticks 1  Infinite Loop:284    CC [M]  drivers/i2c/busses/i2c-eg20t.o
>   while (ktime_lt(ktime_get(), ns_val))
> ticks 1  Infinite Loop:84    CC [M]  drivers/i2c/busses/i2c-pca-isa.o
>   ret = time_before(jiffies, timeout); ... while (ret)
>
> Are these timing mechanisms unsafe?
>
> In this case:
>
> ticks 1  Infinite Loop:338    CC [M]  drivers/infiniband/hw/amso1100/c2.o
>
> it looks like it is actually the outer loop, starting on line 336, that is
> unsafe.
>
> For this one:
>
> ticks 1  Infinite Loop:47    CC [M]
>  drivers/infiniband/hw/amso1100/c2_intr.o
>
> I am not completely sure to understand the code.  c2dev->hints_read is
> nonlocal and is never explicitly reset to 0.  Perhaps this code is only
> executed once per instance of cdev.  In any case hints_read is incremented
> on each iteration.  But perhaps the value of be16_to_cpu(*c2dev->hint_count)
> can also change at each access?
>
> This is clearly the sort of loop you described:
>
> ticks 1  Infinite Loop:735    CC [M]  drivers/infiniband/hw/amso1100/c2_qp.o
>
> Skipping ahead a bit, I don't at all see the problem in the following code:
>
> ticks 1  Infinite Loop:184    CC [M]
>  drivers/media/video/ivtv/ivtv-firmware.o
>
>        for (i = 0; i < size; i += 0x100) {
>                if (readl(mem + i)      = 0x12345678 &&
>                    readl(mem + i + 4)  = 0x34567812 &&
>                    readl(mem + i + 8)  = 0x56781234 &&
>                    readl(mem + i + 12) = 0x78123456) {
>                        return (volatile struct ivtv_mailbox __iomem *)(mem +
> i + 16);
>                }
>        }
>
> I don't see the problem here either:
>
> ticks 1  Infinite Loop:93    CC [M]
>  drivers/media/video/pvrusb2/pvrusb2-eeprom.o
>
>        for (tcnt = 0; tcnt < EEPROM_SIZE; tcnt += pcnt) {
>                pcnt = 16;
>                if (pcnt + tcnt > EEPROM_SIZE) pcnt = EEPROM_SIZE-tcnt;
>                ...
>        }
>
> The next report is on similar code:
>
> ticks 1  Infinite Loop:3536    CC [M]
>  drivers/media/video/pvrusb2/pvrusb2-hdw.o
>
> These two indeed might be dangerous:
>
> ticks 2  Infinite Loop:481  Infinite Loop:676    CC [M]
>  drivers/media/video/saa7134/saa7134-tvaudio.o
>
> julia
--
To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (7 preceding siblings ...)
  2012-01-21 20:46 ` Asim
@ 2012-01-21 20:50 ` Julia Lawall
  2012-01-21 22:32 ` Asim
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Julia Lawall @ 2012-01-21 20:50 UTC (permalink / raw)
  To: kernel-janitors

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4791 bytes --]

Do you have any idea about the two for loops near the end of the message 
below?  It seems strange that they would be included, unless I am 
completely overlooking something.

julia

On Sat, 21 Jan 2012, Asim wrote:

> Yeah - most of the busy waits that are correct are short and fairly
> obvious while loops and there are plenty of them. There are some
> complex ones too.
>
> However, we do have false positive rate of 5- 8% (as measured on
> 2.6.18 kernel) - so in some (~100) cases for example the while loops
> where variables may be re-used it may report the false positive. As
> far as kernel timing is concerned - we do have mechanisms to account
> for these based on our experience with the timers used in 2.6 kernel.
> I will add these timing mechanisms and refresh the results - and
> continue to update as I encounter more diverse mechanisms to account
> for timing. But I strongly believe this a good subset to look at to
> catch these problems.
>
> Thanks,
> Asim
>
> On Sat, Jan 21, 2012 at 11:04 AM, Julia Lawall <julia.lawall@lip6.fr> wrote:
>> I looked at this one:
>>
>> ticks 2  Infinite Loop:228  Infinite Loop:600    CC [M]
>> drivers/i2c/busses/i2c-xiic.o
>>
>> Line 200 looks like what you described in your previous message.  I don't
>> understand the problem in line 600, though:
>>
>>        while ((fifo_space >= 2) && (first || (i2c->nmsgs > 1))) {
>>                if (!first) {
>>                        i2c->nmsgs--;
>>                        i2c->tx_msg++;
>>                        i2c->tx_pos = 0;
>>                } else
>>                        first = 0;
>>
>>                if (i2c->tx_msg->flags & I2C_M_RD) {
>>                        /* we dont date putting several reads in the FIFO */
>>                        xiic_start_recv(i2c);
>>                        return;
>>                } else {
>>                        xiic_start_send(i2c);
>>                        if (xiic_tx_space(i2c) != 0) {
>>                                /* the message could not be completely sent
>> */
>>                                break;
>>                        }
>>                }
>>
>>                fifo_space = xiic_tx_fifo_space(i2c);
>>        }
>>
>> It looks like on every iteration execpt the first one, i2c->nmsgs is
>> decremented, so it will not long be greater than one.
>>
>> The next two rely on kernel timing mechansims:
>>
>> ticks 1  Infinite Loop:284    CC [M]  drivers/i2c/busses/i2c-eg20t.o
>>   while (ktime_lt(ktime_get(), ns_val))
>> ticks 1  Infinite Loop:84    CC [M]  drivers/i2c/busses/i2c-pca-isa.o
>>   ret = time_before(jiffies, timeout); ... while (ret)
>>
>> Are these timing mechanisms unsafe?
>>
>> In this case:
>>
>> ticks 1  Infinite Loop:338    CC [M]  drivers/infiniband/hw/amso1100/c2.o
>>
>> it looks like it is actually the outer loop, starting on line 336, that is
>> unsafe.
>>
>> For this one:
>>
>> ticks 1  Infinite Loop:47    CC [M]
>>  drivers/infiniband/hw/amso1100/c2_intr.o
>>
>> I am not completely sure to understand the code.  c2dev->hints_read is
>> nonlocal and is never explicitly reset to 0.  Perhaps this code is only
>> executed once per instance of cdev.  In any case hints_read is incremented
>> on each iteration.  But perhaps the value of be16_to_cpu(*c2dev->hint_count)
>> can also change at each access?
>>
>> This is clearly the sort of loop you described:
>>
>> ticks 1  Infinite Loop:735    CC [M]  drivers/infiniband/hw/amso1100/c2_qp.o
>>
>> Skipping ahead a bit, I don't at all see the problem in the following code:
>>
>> ticks 1  Infinite Loop:184    CC [M]
>>  drivers/media/video/ivtv/ivtv-firmware.o
>>
>>        for (i = 0; i < size; i += 0x100) {
>>                if (readl(mem + i)      == 0x12345678 &&
>>                    readl(mem + i + 4)  == 0x34567812 &&
>>                    readl(mem + i + 8)  == 0x56781234 &&
>>                    readl(mem + i + 12) == 0x78123456) {
>>                        return (volatile struct ivtv_mailbox __iomem *)(mem +
>> i + 16);
>>                }
>>        }
>>
>> I don't see the problem here either:
>>
>> ticks 1  Infinite Loop:93    CC [M]
>>  drivers/media/video/pvrusb2/pvrusb2-eeprom.o
>>
>>        for (tcnt = 0; tcnt < EEPROM_SIZE; tcnt += pcnt) {
>>                pcnt = 16;
>>                if (pcnt + tcnt > EEPROM_SIZE) pcnt = EEPROM_SIZE-tcnt;
>>                ...
>>        }
>>
>> The next report is on similar code:
>>
>> ticks 1  Infinite Loop:3536    CC [M]
>>  drivers/media/video/pvrusb2/pvrusb2-hdw.o
>>
>> These two indeed might be dangerous:
>>
>> ticks 2  Infinite Loop:481  Infinite Loop:676    CC [M]
>>  drivers/media/video/saa7134/saa7134-tvaudio.o
>>
>> julia
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (8 preceding siblings ...)
  2012-01-21 20:50 ` Julia Lawall
@ 2012-01-21 22:32 ` Asim
  2012-01-22  6:09 ` Julia Lawall
  2012-01-22 12:17 ` Bernd Petrovitsch
  11 siblings, 0 replies; 13+ messages in thread
From: Asim @ 2012-01-21 22:32 UTC (permalink / raw)
  To: kernel-janitors

Yeah - these are not the strongest parts of the analyses. Lately, as
we have added more busses, interprocedural analyses and supported
newer kernels, we've seen more problems. In the first example, CIL(our
underlying analysis language) cannot distinguish between for and while
loops. To distinguish for loops and to eliminate certain class of
timers, we try to detect for loops and counters by searching for
integers that increment/decrement by 1. Here, it increments by 25, and
it is not caught as a for loop. Then, the intermediate code generated
for this is :

#line 185
    tmp___7 = readl((void const volatile   *)(mem + i));
#line 185
    if (tmp___7 = 305419896U) {
#line 185
      tmp___8 = readl((void const volatile   *)((mem + i) + 4));
#line 185
      if (tmp___8 = 878082066U) {
#line 185
        tmp___9 = readl((void const volatile   *)((mem + i) + 8));
#line 185
        if (tmp___9 = 1450709556U) {
#line 185
          tmp___10 = readl((void const volatile   *)((mem + i) + 12));
#line 185
          if (tmp___10 = 2014458966U) {
#line 189
            return ((struct ivtv_mailbox  volatile  *)((mem + i) + 16));
          }


Here it detects, that the exit condition is dependent on a read from
device, and marks this loop as a suspect. Apart from random
increments, we also face problems with casts used while detecting loop
exit conditions/for loops. Second one is a bug in how i2c_transfer
function is handled (i2c bus was added sometime this year). I've fixed
it and will update results shorty...

As I mentioned in my previous email, I would perhaps give priority to
the simpler examples that are plenty..


Thanks,
Asim

On Sat, Jan 21, 2012 at 12:50 PM, Julia Lawall <julia.lawall@lip6.fr> wrote:
> Do you have any idea about the two for loops near the end of the message
> below?  It seems strange that they would be included, unless I am completely
> overlooking something.
>
> julia
>
>
> On Sat, 21 Jan 2012, Asim wrote:
>
>> Yeah - most of the busy waits that are correct are short and fairly
>> obvious while loops and there are plenty of them. There are some
>> complex ones too.
>>
>> However, we do have false positive rate of 5- 8% (as measured on
>> 2.6.18 kernel) - so in some (~100) cases for example the while loops
>> where variables may be re-used it may report the false positive. As
>> far as kernel timing is concerned - we do have mechanisms to account
>> for these based on our experience with the timers used in 2.6 kernel.
>> I will add these timing mechanisms and refresh the results - and
>> continue to update as I encounter more diverse mechanisms to account
>> for timing. But I strongly believe this a good subset to look at to
>> catch these problems.
>>
>> Thanks,
>> Asim
>>
>> On Sat, Jan 21, 2012 at 11:04 AM, Julia Lawall <julia.lawall@lip6.fr>
>> wrote:
>>>
>>> I looked at this one:
>>>
>>> ticks 2  Infinite Loop:228  Infinite Loop:600    CC [M]
>>> drivers/i2c/busses/i2c-xiic.o
>>>
>>> Line 200 looks like what you described in your previous message.  I don't
>>> understand the problem in line 600, though:
>>>
>>>        while ((fifo_space >= 2) && (first || (i2c->nmsgs > 1))) {
>>>                if (!first) {
>>>                        i2c->nmsgs--;
>>>                        i2c->tx_msg++;
>>>                        i2c->tx_pos = 0;
>>>                } else
>>>                        first = 0;
>>>
>>>                if (i2c->tx_msg->flags & I2C_M_RD) {
>>>                        /* we dont date putting several reads in the FIFO
>>> */
>>>                        xiic_start_recv(i2c);
>>>                        return;
>>>                } else {
>>>                        xiic_start_send(i2c);
>>>                        if (xiic_tx_space(i2c) != 0) {
>>>                                /* the message could not be completely
>>> sent
>>> */
>>>                                break;
>>>                        }
>>>                }
>>>
>>>                fifo_space = xiic_tx_fifo_space(i2c);
>>>        }
>>>
>>> It looks like on every iteration execpt the first one, i2c->nmsgs is
>>> decremented, so it will not long be greater than one.
>>>
>>> The next two rely on kernel timing mechansims:
>>>
>>> ticks 1  Infinite Loop:284    CC [M]  drivers/i2c/busses/i2c-eg20t.o
>>>   while (ktime_lt(ktime_get(), ns_val))
>>> ticks 1  Infinite Loop:84    CC [M]  drivers/i2c/busses/i2c-pca-isa.o
>>>   ret = time_before(jiffies, timeout); ... while (ret)
>>>
>>> Are these timing mechanisms unsafe?
>>>
>>> In this case:
>>>
>>> ticks 1  Infinite Loop:338    CC [M]  drivers/infiniband/hw/amso1100/c2.o
>>>
>>> it looks like it is actually the outer loop, starting on line 336, that
>>> is
>>> unsafe.
>>>
>>> For this one:
>>>
>>> ticks 1  Infinite Loop:47    CC [M]
>>>  drivers/infiniband/hw/amso1100/c2_intr.o
>>>
>>> I am not completely sure to understand the code.  c2dev->hints_read is
>>> nonlocal and is never explicitly reset to 0.  Perhaps this code is only
>>> executed once per instance of cdev.  In any case hints_read is
>>> incremented
>>> on each iteration.  But perhaps the value of
>>> be16_to_cpu(*c2dev->hint_count)
>>> can also change at each access?
>>>
>>> This is clearly the sort of loop you described:
>>>
>>> ticks 1  Infinite Loop:735    CC [M]
>>>  drivers/infiniband/hw/amso1100/c2_qp.o
>>>
>>> Skipping ahead a bit, I don't at all see the problem in the following
>>> code:
>>>
>>> ticks 1  Infinite Loop:184    CC [M]
>>>  drivers/media/video/ivtv/ivtv-firmware.o
>>>
>>>        for (i = 0; i < size; i += 0x100) {
>>>                if (readl(mem + i)      = 0x12345678 &&
>>>                    readl(mem + i + 4)  = 0x34567812 &&
>>>                    readl(mem + i + 8)  = 0x56781234 &&
>>>                    readl(mem + i + 12) = 0x78123456) {
>>>                        return (volatile struct ivtv_mailbox __iomem
>>> *)(mem +
>>> i + 16);
>>>                }
>>>        }
>>>
>>> I don't see the problem here either:
>>>
>>> ticks 1  Infinite Loop:93    CC [M]
>>>  drivers/media/video/pvrusb2/pvrusb2-eeprom.o
>>>
>>>        for (tcnt = 0; tcnt < EEPROM_SIZE; tcnt += pcnt) {
>>>                pcnt = 16;
>>>                if (pcnt + tcnt > EEPROM_SIZE) pcnt = EEPROM_SIZE-tcnt;
>>>                ...
>>>        }
>>>
>>> The next report is on similar code:
>>>
>>> ticks 1  Infinite Loop:3536    CC [M]
>>>  drivers/media/video/pvrusb2/pvrusb2-hdw.o
>>>
>>> These two indeed might be dangerous:
>>>
>>> ticks 2  Infinite Loop:481  Infinite Loop:676    CC [M]
>>>  drivers/media/video/saa7134/saa7134-tvaudio.o
>>>
>>> julia
--
To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (9 preceding siblings ...)
  2012-01-21 22:32 ` Asim
@ 2012-01-22  6:09 ` Julia Lawall
  2012-01-22 12:17 ` Bernd Petrovitsch
  11 siblings, 0 replies; 13+ messages in thread
From: Julia Lawall @ 2012-01-22  6:09 UTC (permalink / raw)
  To: kernel-janitors



On Sat, 21 Jan 2012, Asim wrote:

> Yeah - these are not the strongest parts of the analyses. Lately, as
> we have added more busses, interprocedural analyses and supported
> newer kernels, we've seen more problems. In the first example, CIL(our
> underlying analysis language) cannot distinguish between for and while
> loops. To distinguish for loops and to eliminate certain class of
> timers, we try to detect for loops and counters by searching for
> integers that increment/decrement by 1. Here, it increments by 25, and
> it is not caught as a for loop. Then, the intermediate code generated
> for this is :
>
> #line 185
>    tmp___7 = readl((void const volatile   *)(mem + i));
> #line 185
>    if (tmp___7 = 305419896U) {
> #line 185
>      tmp___8 = readl((void const volatile   *)((mem + i) + 4));
> #line 185
>      if (tmp___8 = 878082066U) {
> #line 185
>        tmp___9 = readl((void const volatile   *)((mem + i) + 8));
> #line 185
>        if (tmp___9 = 1450709556U) {
> #line 185
>          tmp___10 = readl((void const volatile   *)((mem + i) + 12));
> #line 185
>          if (tmp___10 = 2014458966U) {
> #line 189
>            return ((struct ivtv_mailbox  volatile  *)((mem + i) + 16));
>          }
>
>
> Here it detects, that the exit condition is dependent on a read from
> device, and marks this loop as a suspect.

There exists an exit condition that has this property, but isn't there 
only a problem if this is the only exit condition?  Or is the code 
obfuscated by CIL such that the normal exit cannot be detected?

> Apart from random
> increments, we also face problems with casts used while detecting loop
> exit conditions/for loops. Second one is a bug in how i2c_transfer
> function is handled (i2c bus was added sometime this year). I've fixed
> it and will update results shorty...
>
> As I mentioned in my previous email, I would perhaps give priority to
> the simpler examples that are plenty..

Are they ordered by increasing complexity?  I didn't start at the top.  In 
any case the false positive rate seems to be much higher than 5%.  So you 
might get better results by removing the false positives first.  Also, you 
might want to send the reports to the specific maintainers (you can use 
get_maintainer.pl to get a list for each file).  I'm not sure that the 
people who read the kernel janitors list would have the expertise to both 
be sure that there is a real problem and choose the desired solution.  But 
if there is a real problem, the maintainer will probably react quickly.

julia

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: busy loops in kernel
  2012-01-18  4:15 busy loops in kernel Asim
                   ` (10 preceding siblings ...)
  2012-01-22  6:09 ` Julia Lawall
@ 2012-01-22 12:17 ` Bernd Petrovitsch
  11 siblings, 0 replies; 13+ messages in thread
From: Bernd Petrovitsch @ 2012-01-22 12:17 UTC (permalink / raw)
  To: kernel-janitors

Hi!

On Mit, 2012-01-18 at 09:48 -0800, Asim wrote:
[....]
> is actually: count, line numbers and C file. In some cases, when
> driver comprises of multiple files, line numbers may be shifted (since
> all files are merged together for analysis).

It's somewhat off-topic, but people are probably more motivated if they
get correct filenames+line numbers of the original source.
Other source munging tools like yacc/bison or lex/flex use "#line" to
get the C compiler to report in errors and warnings the correct
reference to the source.

Kind regards,
	Bernd
-- 
Bernd Petrovitsch                  Email : bernd@petrovitsch.priv.at
                     LUGA : http://www.luga.at


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-01-22 12:17 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-01-18  4:15 busy loops in kernel Asim
2012-01-18  4:55 ` Greg Dietsche
2012-01-18  7:37 ` Dan Carpenter
2012-01-18  7:39 ` Dan Carpenter
2012-01-18 17:45 ` Asim
2012-01-18 17:48 ` Asim
2012-01-21 17:33 ` Asim
2012-01-21 19:04 ` Julia Lawall
2012-01-21 20:46 ` Asim
2012-01-21 20:50 ` Julia Lawall
2012-01-21 22:32 ` Asim
2012-01-22  6:09 ` Julia Lawall
2012-01-22 12:17 ` Bernd Petrovitsch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox