timing code in 2.6.1

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* timing code in 2.6.1
@ 2004-01-16 16:51 Richard B. Johnson
  2004-01-16 23:31 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Richard B. Johnson @ 2004-01-16 16:51 UTC (permalink / raw)
  To: Linux kernel

Some drivers are being re-written for 2.6++. The following
construct seems to work for "waiting for an event" in
the kernel modules.

        // No locks are being held
        tim = jiffies + EVENT_TIMEOUT;
        while(!event() && time_before(jiffies, tim))
            schedule_timeout(0);

Is there anything wrong?
Do I have to execute "set_current_state(TASK_INTERRUPTIBLE)" before?
Do I have to execute "set_current_state(TASK_RUNNING)" after?

I don't want to have to change this again so I really need to
know. For instance, if I execute "set_current_state(TASK_INTERRUPTIBLE)"
in version 2.4.24, it didn't hurt anything. In 2.6.1, there are
conditions where schedule_timeout(0) doesn't return if another
task is spinning "while(1) ; ". This is NotGood(tm).

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: timing code in 2.6.1
  2004-01-16 16:51 timing code in 2.6.1 Richard B. Johnson
@ 2004-01-16 23:31 ` Andrew Morton
  2004-01-19 14:11   ` Richard B. Johnson
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2004-01-16 23:31 UTC (permalink / raw)
  To: root; +Cc: linux-kernel

"Richard B. Johnson" <root@chaos.analogic.com> wrote:
>
> 
> Some drivers are being re-written for 2.6++. The following
> construct seems to work for "waiting for an event" in
> the kernel modules.
> 
>         // No locks are being held
>         tim = jiffies + EVENT_TIMEOUT;
>         while(!event() && time_before(jiffies, tim))
>             schedule_timeout(0);
> 
> Is there anything wrong?

This is not a good thing to be doing.  You should add this task to a
waitqueue and then sleep.  Make the code which causes event() to come true
deliver a wake_up to that waitqueue.  There are many examples of this in
the kernel.

If the hardware only supports polling then gee, you'd be best off spinning
for a few microseconds then fall into a schedule_timeout(1) polling loop. 
Or something like that.  Or make the hardware designer write the damn
driver.

> Do I have to execute "set_current_state(TASK_INTERRUPTIBLE)" before?
> Do I have to execute "set_current_state(TASK_RUNNING)" after?
> 
> I don't want to have to change this again so I really need to
> know. For instance, if I execute "set_current_state(TASK_INTERRUPTIBLE)"
> in version 2.4.24, it didn't hurt anything. In 2.6.1, there are
> conditions where schedule_timeout(0) doesn't return if another
> task is spinning "while(1) ; ". This is NotGood(tm).

As you have it, you may as well be calling schedule() inside that loop. 
You _have_ to be in state TASK_RUNNING, else you'll sleep forever.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: timing code in 2.6.1
  2004-01-16 23:31 ` Andrew Morton
@ 2004-01-19 14:11   ` Richard B. Johnson
  2004-01-20  9:59     ` George Anzinger
  2004-01-22  2:26     ` Jamie Lokier
  0 siblings, 2 replies; 5+ messages in thread
From: Richard B. Johnson @ 2004-01-19 14:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Fri, 16 Jan 2004, Andrew Morton wrote:

> "Richard B. Johnson" <root@chaos.analogic.com> wrote:
> >
> >
> > Some drivers are being re-written for 2.6++. The following
> > construct seems to work for "waiting for an event" in
> > the kernel modules.
> >
> >         // No locks are being held
> >         tim = jiffies + EVENT_TIMEOUT;
> >         while(!event() && time_before(jiffies, tim))
> >             schedule_timeout(0);
> >
> > Is there anything wrong?
>
> This is not a good thing to be doing.  You should add this task to a
> waitqueue and then sleep.  Make the code which causes event() to come true
> deliver a wake_up to that waitqueue.  There are many examples of this in
> the kernel.
>

Huh? The code that causes "event()" needs to get the CPU occasionally
to check for the event. The hardware doesn't produce an interrupt
upon that event.

> If the hardware only supports polling then gee, you'd be best off spinning
> for a few microseconds then fall into a schedule_timeout(1) polling loop.
> Or something like that.  Or make the hardware designer write the damn
> driver.

The poll will almost never be true when first called. Spinning and
wasting CPU cycles that can be used by another task isn't a good
idea.

The hardware designer has designed the hardware according to
the requirements dictated by a government agency (the FDA).
There was no requirement to make interface code simple. The
interface code must check for multiple failure modes during
every specific operation. Both the hardware and software
check for these modes so that no single failure can cause
injury to a patient. This is SOP for medical equipment.

In the specific case, we operate a patient table. The operator
presses an UP and DOWN button. These will produce interrupts.

When an UP button is hit, thousands of interrupts are generated.
This is because it is a mechanical operation. The same is true
for the DOWN button. These events are filtered to determine
the true intervals for "UP", "NOTHING", and "DOWN". When the
"UP" button is pressed, the CPU servos position information
from another driver and motor speed to to move the table to
the commanded position. If the button is pressed for a short
period of time, the table moves slowly. If it is pressed for
a longer period, it moves quickly. It must accellerate according
to a schedule and decellerate according to a schedule so that
patients that weigh 350 lbs and patients that weigh 45 lbs are
accellerated and decellerated at the same rate.

When the table is being moved, 12 different parameters are
monitored. At least two parameters are actually calculated
and filtered to predict where the patient will be if the
button is released.

So, this cannot be dismissed as "get the hardware designer
to write the same driver..."

Writing software often requires knowing about the whole sustem.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: timing code in 2.6.1
  2004-01-19 14:11   ` Richard B. Johnson
@ 2004-01-20  9:59     ` George Anzinger
  2004-01-22  2:26     ` Jamie Lokier
  1 sibling, 0 replies; 5+ messages in thread
From: George Anzinger @ 2004-01-20  9:59 UTC (permalink / raw)
  To: root; +Cc: Andrew Morton, linux-kernel

Richard B. Johnson wrote:
> On Fri, 16 Jan 2004, Andrew Morton wrote:
> 
> 
>>"Richard B. Johnson" <root@chaos.analogic.com> wrote:
>>
>>>
>>>Some drivers are being re-written for 2.6++. The following
>>>construct seems to work for "waiting for an event" in
>>>the kernel modules.
>>>
>>>        // No locks are being held
>>>        tim = jiffies + EVENT_TIMEOUT;
>>>        while(!event() && time_before(jiffies, tim))
>>>            schedule_timeout(0);
>>>
>>>Is there anything wrong?
>>
>>This is not a good thing to be doing.  You should add this task to a
>>waitqueue and then sleep.  Make the code which causes event() to come true
>>deliver a wake_up to that waitqueue.  There are many examples of this in
>>the kernel.
>>
> 
> 
> Huh? The code that causes "event()" needs to get the CPU occasionally
> to check for the event. The hardware doesn't produce an interrupt
> upon that event.

THAT is where the hardware designer went wrong.  An interrupt is really the best 
way for hardware to notify the cpu of an event.  Details may then be read from 
hardware registers, if needed.

The request of a 0 timeout is really asking for a timeout on the last jiffie.  I 
suspect that relying on this to actually not happen until the next jiffie is NOT 
a good thing.  I am not aware of any spec that says that is what it should do 
and I am aware of timer patchs that will run the timer immeadiatly.  A request 
for 1 jiffie will do the same thing in the current kernel AND under that patch. 
  That thing being, by the way, is to wait for the next jiffie which will be any 
time from ~0 to 1 jiffie.
> 
Lots of words, which did not explain why an interrupt was not used to signal the 
event, deleted :)
-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: timing code in 2.6.1
  2004-01-19 14:11   ` Richard B. Johnson
  2004-01-20  9:59     ` George Anzinger
@ 2004-01-22  2:26     ` Jamie Lokier
  1 sibling, 0 replies; 5+ messages in thread
From: Jamie Lokier @ 2004-01-22  2:26 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: linux-kernel

Richard B. Johnson wrote:
> > > Some drivers are being re-written for 2.6++. The following
> > > construct seems to work for "waiting for an event" in
> > > the kernel modules.
> > >
> > >         // No locks are being held
> > >         tim = jiffies + EVENT_TIMEOUT;
> > >         while(!event() && time_before(jiffies, tim))
> > >             schedule_timeout(0);
> > >
> > > Is there anything wrong?

Yes.

  1. The call to schedule_timeout(0) returns immediately, so you
     might as well have an empty loop body.

     See this from schedule_timeout's documentation:

       The routine will return immediately unless
       the current task state has been set (see set_current_state()).

     You need to call set_current_state(TASK_INTERRUPTIBLE) or
     set_current_state(TASK_UNINTERRUPTIBLE) inside the loop prior to
     calling schedule_timeout.

  2. With some experimental timer patches, schedule_timeout(0) could
     return immediately.  You should call schedule_timeout(1) instead to
     ensure at least some kind of yielding the CPU, assuming that's what
     you intended.

     It would be much better to work out how often you have to poll
     the event() function, and use that time interval as the argument
     to schedule_timeout().

     For example if you know you need to poll approximately every
     20ms, write schedule_timeout(HZ/50).

Some additional advice:

You are polling for an event.  If the event test is not complex, the
highest performance way to do it is to create a periodic timer and
have that check for the event, and use a waitqueue to wake the
sleeping task.  What you are doing is much simpler to write, though.

> Huh? The code that causes "event()" needs to get the CPU occasionally
> to check for the event. The hardware doesn't produce an interrupt
> upon that event.
> 
> The hardware designer has designed the hardware according to
> the requirements dictated by a government agency (the FDA).
> There was no requirement to make interface code simple. The
> interface code must check for multiple failure modes during
> every specific operation. Both the hardware and software
> check for these modes so that no single failure can cause
> injury to a patient. This is SOP for medical equipment.

When you call schedule_timeout(0), the scheduler _could_ delay your
process for an arbitrarily long time, unless you have set an
appropriate SCHED_FIFO or SCHED_RR policy.

For medical equipment which affects patient safety, usually _maximum_
polling intervals are required so that a dangerous operation is
aborted within a fixed time limit.

schedule_timeout(0) only guarantees a _minimum_ polling interval.
Please ensure you are doing this right.

> In the specific case, we operate a patient table. The operator
> presses an UP and DOWN button. These will produce interrupts.
> When an UP button is hit, thousands of interrupts are generated.

The hardware generates interrupts for button events but doesn't
produce interrupts for the periodic sampling that is monitoring the
table movement, is that right?

> This is because it is a mechanical operation. The same is true
> for the DOWN button. These events are filtered to determine
> the true intervals for "UP", "NOTHING", and "DOWN". When the
> "UP" button is pressed, the CPU servos position information
> from another driver and motor speed to to move the table to
> the commanded position. If the button is pressed for a short
> period of time, the table moves slowly. If it is pressed for
> a longer period, it moves quickly. It must accellerate according
> to a schedule and decellerate according to a schedule so that
> patients that weigh 350 lbs and patients that weigh 45 lbs are
> accellerated and decellerated at the same rate.
> 
> When the table is being moved, 12 different parameters are
> monitored. At least two parameters are actually calculated
> and filtered to predict where the patient will be if the
> button is released.

If you are periodically monitoring and filtering parameters, it seems
very likely that you need accurate and consistent time intervals
between each measurement, to ensure patient safety.

-- Jamie

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2004-01-22  2:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-01-16 16:51 timing code in 2.6.1 Richard B. Johnson
2004-01-16 23:31 ` Andrew Morton
2004-01-19 14:11   ` Richard B. Johnson
2004-01-20  9:59     ` George Anzinger
2004-01-22  2:26     ` Jamie Lokier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox