public inbox for linux-kernel@vger.kernel.org
* linux rt priority  thread corrupt  global variable?
@ 2003-05-08  9:03 Ming Lei
  2003-05-08  9:43 ` Jörn Engel
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Ming Lei @ 2003-05-08  9:03 UTC (permalink / raw)
  To: linux-kernel

Platform:
Intel Pentium II; RedHat 7.2 with kernel version 2.4.7-10, libc 2.2.4-13 and
gcc 2.96.

Problem description:

A program has 3 threads with priorities 12, 10 and 6 respectively, and the main
process at priority 0. All the threads except the main process are created with
pthread_create and use SCHED_FIFO as the real-time scheduling policy.

There is a global variable defined as 'int cpl'. All the threads and the main
process may alter cpl at any time. cpl may have one of these values: {0,
0xf000006e, 0xf0000068, 0xe0000000, 0xe0000060}. cpl is protected by a mutex
for every access.

<Problem> At some point during execution cpl should hold one value, say
e0000060, but the value actually found in cpl is another, say e0000000; that
is, the value changes without the program having done anything to it.
The value I observe is always a historic one (one of the values in the set
above), never an arbitrary one. The problem has occurred just after a context
switch, and also in the middle of a thread's execution.

<Confirm> I used the Intel debug registers to track every write to the cpl
memory address globally, which is how GDB implements hardware watchpoints on
x86. I could see all the writes my program makes to cpl, but none that
accounts for the bad value. So I don't know what causes the problem.

Can anyone listening give me a direction or hint on this annoying situation?

PS. please cc to this email address.
-Ming


Related questions:

Is Linux kernel 2.4.10 considered strictly preemptive, like VxWorks or
other RTOSes? I guess 2.4.10 may simulate preemption by running the scheduler
on every return from a syscall or interrupt. Am I right?

Is printf() real-time priority thread safe?



* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:03 linux rt priority thread corrupt global variable? Ming Lei
@ 2003-05-08  9:43 ` Jörn Engel
  2003-05-08 16:59   ` Ming Lei
  2003-05-08  9:51 ` Arjan van de Ven
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Jörn Engel @ 2003-05-08  9:43 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel

On Thu, 8 May 2003 02:03:35 -0700, Ming Lei wrote:
> 
> Platform:
> Intel Pentium II; RedHat 7.2 with kernel version 2.4.7-10, libc 2.2.4-13 and
> gcc 2.96.

You should either upgrade to 2.4.20 or similar or post the question to
RedHat for their kernels. If the problem can be reproduced with
2.4.20, come back here. :)

> Problem description:
> 
> A program has 3 threads with priorities 12, 10 and 6 respectively, and the main
> process at priority 0. All the threads except the main process are created with
> pthread_create and use SCHED_FIFO as the real-time scheduling policy.
> 
> There is a global variable defined as 'int cpl'. All the threads and the main
> process may alter cpl at any time. cpl may have one of these values: {0,
> 0xf000006e, 0xf0000068, 0xe0000000, 0xe0000060}. cpl is protected by a mutex
> for every access.
> 
> <Problem> At some point during execution cpl should hold one value, say
> e0000060, but the value actually found in cpl is another, say e0000000; that
> is, the value changes without the program having done anything to it.
> The value I observe is always a historic one (one of the values in the set
> above), never an arbitrary one. The problem has occurred just after a context
> switch, and also in the middle of a thread's execution.
> 
> <Confirm> I used the Intel debug registers to track every write to the cpl
> memory address globally, which is how GDB implements hardware watchpoints on
> x86. I could see all the writes my program makes to cpl, but none that
> accounts for the bad value. So I don't know what causes the problem.
> 
> Can anyone listening give me a direction or hint on this annoying situation?

Sounds a bit like a caching problem. Old value in cache, new value
written to memory, cache line dirty => flushed, old value written to
memory again. But it could also be something else.

Jörn

-- 
Simplicity is prerequisite for reliability.
-- Edsger W. Dijkstra


* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:03 linux rt priority thread corrupt global variable? Ming Lei
  2003-05-08  9:43 ` Jörn Engel
@ 2003-05-08  9:51 ` Arjan van de Ven
  2003-05-08  9:52 ` Bill Huey
  2003-05-08 20:45 ` Roger Larsson
  3 siblings, 0 replies; 8+ messages in thread
From: Arjan van de Ven @ 2003-05-08  9:51 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel

On Thu, 2003-05-08 at 11:03, Ming Lei wrote:
> Platform:
> Intel Pentium II; RedHat 7.2 with kernel version 2.4.7-10, 
eeep that's an old one; it has been superseded by like 10 or more
errata kernels.



* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:03 linux rt priority thread corrupt global variable? Ming Lei
  2003-05-08  9:43 ` Jörn Engel
  2003-05-08  9:51 ` Arjan van de Ven
@ 2003-05-08  9:52 ` Bill Huey
  2003-05-08  9:59   ` Bill Huey
  2003-05-08 20:45 ` Roger Larsson
  3 siblings, 1 reply; 8+ messages in thread
From: Bill Huey @ 2003-05-08  9:52 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel, Bill Huey (Hui)

On Thu, May 08, 2003 at 02:03:35AM -0700, Ming Lei wrote:
> Related questions:
> 
> Is Linux kernel 2.4.10 considered strictly preemptive, like VxWorks or
> other RTOSes? I guess 2.4.10 may simulate preemption by running the scheduler
> on every return from a syscall or interrupt. Am I right?

No, it's not a fully preemptive kernel, but spreads preemption points
throughout the source tree, both directly and indirectly, instead. Spinlocks
are the primary mutex of choice in Linux and create atomic critical sections
that can't be preempted with respect to the normal Linux scheduler. Fully
preemptive systems tend to use sleepable locks with relaxed preemptability
within critical sections and add the possible option of priority inheritance
depending on the system.

If you're going to do RT Linux related stuff use RTLinux, RTAI or other
commercial options instead.

> Is printf() real-time priority thread safe?

Stock Linux is definitely not if I understand what you're saying and
if I understand the code correctly. :)

bill



* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:52 ` Bill Huey
@ 2003-05-08  9:59   ` Bill Huey
  2003-05-08 10:42     ` Bill Huey
  0 siblings, 1 reply; 8+ messages in thread
From: Bill Huey @ 2003-05-08  9:59 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel, Bill Huey (Hui)

On Thu, May 08, 2003 at 02:52:38AM -0700, Bill Huey wrote:
> No, it's not a fully preemptive kernel, but spreads preemption points
> throughout the source tree, both directly and indirectly, instead. Spinlocks
> are the primary mutex of choice in Linux and create atomic critical sections
> that can't be preempted with respect to the normal Linux scheduler. Fully

Geez, this isn't exactly right either, my brain is failing me at the moment.

> preemptive systems tend to use sleepable locks with relaxed preemptability
> within critical sections and add the possible option of priority inheritance
> depending on the system.

/me thinks

bill



* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:59   ` Bill Huey
@ 2003-05-08 10:42     ` Bill Huey
  0 siblings, 0 replies; 8+ messages in thread
From: Bill Huey @ 2003-05-08 10:42 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-kernel, Bill Huey (Hui)

On Thu, May 08, 2003 at 02:59:11AM -0700, Bill Huey wrote:
> On Thu, May 08, 2003 at 02:52:38AM -0700, Bill Huey wrote:
> > No, it's not a fully preemptive kernel, but spreads preemption points
> > throughout the source tree, both directly and indirectly, instead. Spinlocks
> > are the primary mutex of choice in Linux and create atomic critical sections
> > that can't be preempted with respect to the normal Linux scheduler. Fully
> 
> Geez, this isn't exactly right either, my brain is failing me at the moment.

I was right the first time. :) Just remember why breaking a spinlock is
a bad thing to do. :)

bill



* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:43 ` Jörn Engel
@ 2003-05-08 16:59   ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2003-05-08 16:59 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-kernel


Does anyone know how the Intel x86 debug registers monitor write access to a
specified memory address? I looked at the gdb code and found that only the
process's virtual address of the watched variable is written to the debug
register. Does that mean the x86 debug registers only watch the virtual
address? I want to know whether the Intel hardware watches the physical
address, the virtual address, or the CPU cache. Where can I find this
information? I didn't find it in the Intel manual.


> > A program has 3 threads with priorities 12, 10 and 6 respectively, and the
> > main process at priority 0. All the threads except the main process are
> > created with pthread_create and use SCHED_FIFO as the real-time scheduling
> > policy.
> >
> > There is a global variable defined as 'int cpl'. All the threads and the
> > main process may alter cpl at any time. cpl may have one of these values:
> > {0, 0xf000006e, 0xf0000068, 0xe0000000, 0xe0000060}. cpl is protected by a
> > mutex for every access.
> >
> > <Problem> At some point during execution cpl should hold one value, say
> > e0000060, but the value actually found in cpl is another, say e0000000;
> > that is, the value changes without the program having done anything to it.
> > The value I observe is always a historic one (one of the values in the set
> > above), never an arbitrary one. The problem has occurred just after a
> > context switch, and also in the middle of a thread's execution.
> >
> > <Confirm> I used the Intel debug registers to track every write to the cpl
> > memory address globally, which is how GDB implements hardware watchpoints
> > on x86. I could see all the writes my program makes to cpl, but none that
> > accounts for the bad value. So I don't know what causes the problem.
> >
> > Can anyone listening give me a direction or hint on this annoying
> > situation?
>
> Sounds a bit like a caching problem. Old value in cache, new value
> written to memory, cache line dirty => flushed, old value written to
> memory again. But it could also be something else.



* Re: linux rt priority  thread corrupt  global variable?
  2003-05-08  9:03 linux rt priority thread corrupt global variable? Ming Lei
                   ` (2 preceding siblings ...)
  2003-05-08  9:52 ` Bill Huey
@ 2003-05-08 20:45 ` Roger Larsson
  3 siblings, 0 replies; 8+ messages in thread
From: Roger Larsson @ 2003-05-08 20:45 UTC (permalink / raw)
  To: linux-kernel

On Thursday 08 May 2003 11:03, Ming Lei wrote:
> 
> Is Linux kernel 2.4.10 considered strictly preemptive, like VxWorks or
> other RTOSes? I guess 2.4.10 may simulate preemption by running the scheduler
> on every return from a syscall or interrupt. Am I right?
>

Yes, but what else is there?
- A timer interrupt that ends a sleep for a RT process.
- A device interrupt that notifies a RT process about new data.
- A process that wakes up another process.
The problem with 2.4.10 is that while the current process is
executing in the kernel, the woken RT process has to wait
until the current process leaves the kernel or goes to sleep.

This is not a huge problem, since there are patches for 2.4.10 that add
explicit checks at the spots found in the kernel (loops over long lists).

Later kernels got some of these improvements. There are patches for
these as well.

In the 2.5 series you can select a preemptive kernel.
With that, preemption can happen inside the kernel, but not
while a spinlock is held. There are patches for this case
as well.

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden


