public inbox for linux-ia64@vger.kernel.org
* [Linux-ia64] mprotect problem
@ 2001-12-06 23:56 Hoeflinger, Jay P
  2001-12-07  0:11 ` n0ano
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Hoeflinger, Jay P @ 2001-12-06 23:56 UTC
  To: linux-ia64

We are seeing an apparent problem with mprotect on Itanium.  We have
seen the problem on two different machines, one running RedHat 7.1
(Seawolf) [2.4.3-12smp] and one running Turbolinux [2.4.1-010131-8smp].

The mprotect is called from user space, as part of the implementation
of a distributed virtual shared memory (DVSM) system that uses the
virtual memory mechanism to implement a shared address space between
two or more nodes.

The code works correctly under RedHat 7.1 for IA32 (and on a variety of
other OSes and platforms), so we feel that there aren't coding errors,
although maybe there is some slightly different way to use mprotect on
Itanium (additional parameters or flags?).
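
For reference on the parameters question: mprotect(2) on Linux takes an
address, a length, and protection bits, and it rejects an address that
is not a multiple of the system page size. Page size can differ between
IA-32 and IA-64 kernels, so the sketch below queries it at run time
rather than hard-coding it. The helper names protect_containing_page
and check_alignment_rule are hypothetical and are not part of the DVSM
described in this message; nothing here is claimed to be the cause of
the reported problem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: apply `prot` to the one page containing `addr`.
 * The page size is queried at run time, never assumed to be 4096. */
static int protect_containing_page(void *addr, int prot)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    uintptr_t mask = (uintptr_t)page_size - 1;
    void *page = (void *)((uintptr_t)addr & ~mask);   /* round down */
    return mprotect(page, (size_t)page_size, prot);
}

/* Returns 1 if a raw mprotect on an unaligned address fails with EINVAL
 * while the rounding helper succeeds on the same address, else 0. */
static int check_alignment_rule(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return 0;

    size_t len = 2 * (size_t)page_size;
    char *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return 0;

    char *inside = region + page_size + 8;   /* deliberately unaligned */

    errno = 0;
    int raw = mprotect(inside, 1, PROT_READ);
    int raw_errno = errno;

    int rounded = protect_containing_page(inside, PROT_READ);

    munmap(region, len);
    return raw == -1 && raw_errno == EINVAL && rounded == 0;
}
```

Calling check_alignment_rule() from main and checking its return value
is enough to confirm the alignment behaviour on a given machine.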

The problem we see is this:

During the course of running the user's program on top of our DVSM, the
program touches a "shared" page that was previously mprotect'ed against
reading and writing because it is not up-to-date with respect to the
same page on other nodes in the system.  The access faults and our SEGV
handler is called.  We do the appropriate message-passing operations to
make the data on the page consistent and up-to-date, then do an
mprotect allowing READ and WRITE this time, and return from the SEGV
handler.  At this point, the original instruction (a READ) is restarted
and immediately faults, causing control to go to the SEGV handler
again.  This time, since we know the page is up-to-date, we do nothing
and return; the instruction is again restarted, again faults, again
jumps to the SEGV handler . . . an infinite loop.
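
The cycle described above can be reduced to a small single-threaded
test, sketched here. It is not the DVSM code: every name is
hypothetical, the message-passing step is replaced by a comment, and
the handler gives up after a few repeated faults so that a looping
fault ends the process instead of hanging it. It also calls mprotect
from a signal handler, which POSIX does not list as async-signal-safe
but which this kind of system relies on in practice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count;   /* faults seen so far */
static long page_size;                      /* set before any fault */

/* SIGSEGV handler: unprotect the faulting page and return, so that the
 * kernel restarts the faulting instruction. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    if (++fault_count > 8)
        _exit(2);              /* repeated faults: the reported loop */

    uintptr_t addr = (uintptr_t)info->si_addr;
    void *page = (void *)(addr & ~((uintptr_t)page_size - 1));

    /* A real DVSM would fetch the current page contents here. */
    if (mprotect(page, (size_t)page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(3);
}

/* Protects one page, reads from it, and returns how many faults the
 * single read took; returns -1 on setup failure or a wrong value. */
static int run_fault_cycle(void)
{
    struct sigaction sa, old;

    page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    volatile char *p = mmap(NULL, (size_t)page_size,
                            PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 42;
    if (mprotect((void *)p, (size_t)page_size, PROT_NONE) != 0)
        return -1;

    fault_count = 0;
    char value = p[0];         /* faults; handler unprotects; restarts */
    int faults = fault_count;

    sigaction(SIGSEGV, &old, NULL);
    munmap((void *)p, (size_t)page_size);
    return value == 42 ? faults : -1;
}
```

When the protection change takes effect before the restart, one read
costs exactly one fault and run_fault_cycle() returns 1; the behaviour
reported here would instead end the process through the _exit(2) path.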

The interesting thing is that this particular user code fails at random
points, sometimes working correctly at points where it failed before.
We have never seen the code work correctly all the way through, though.
It always fails very soon after it begins, just at different points on
different runs.  We theorized that this was a timing problem, such that
it just took some time for mprotect to take effect, so we put in
10-millisecond delays after each mprotect, but this changed nothing.

One potential clue is that the code is a pthreads program: multiple
threads are running while one thread is doing the mprotect, and both
machines are dual-processor machines.
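
A threaded variant of the same test may help separate the pthreads
clue from everything else. The sketch below is only an illustration
under assumptions: the names and the reader count are hypothetical,
several threads read the same protected page at about the same time,
and the handler runs in whichever thread faulted. Since nothing
re-protects the page, each reader should fault at most once.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

enum { NUM_READERS = 4 };            /* hypothetical thread count */

static atomic_int mt_faults;         /* total faults across threads */
static atomic_int mt_go;             /* start flag for the readers */
static long mt_page_size;
static volatile char *mt_page;

/* Runs in the faulting thread: unprotect the page and return. */
static void mt_segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    if (atomic_fetch_add(&mt_faults, 1) >= 64)
        _exit(2);                    /* faults keep repeating */

    uintptr_t addr = (uintptr_t)info->si_addr;
    void *page = (void *)(addr & ~((uintptr_t)mt_page_size - 1));
    if (mprotect(page, (size_t)mt_page_size,
                 PROT_READ | PROT_WRITE) != 0)
        _exit(3);
}

/* Each reader waits for the start flag, then reads the shared page. */
static void *mt_reader(void *arg)
{
    char *out = arg;
    while (!atomic_load(&mt_go))
        sched_yield();
    *out = mt_page[0];
    return NULL;
}

/* Returns the total number of faults, or -1 on failure. */
static int run_threaded_fault_cycle(void)
{
    struct sigaction sa, old;
    pthread_t tid[NUM_READERS];
    char seen[NUM_READERS] = {0};
    int started = 0, ok = 1;

    mt_page_size = sysconf(_SC_PAGESIZE);
    if (mt_page_size <= 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = mt_segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    mt_page = mmap(NULL, (size_t)mt_page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mt_page == MAP_FAILED)
        return -1;
    mt_page[0] = 42;
    if (mprotect((void *)mt_page, (size_t)mt_page_size, PROT_NONE) != 0)
        return -1;

    atomic_store(&mt_faults, 0);
    atomic_store(&mt_go, 0);
    for (int i = 0; i < NUM_READERS; i++) {
        if (pthread_create(&tid[i], NULL, mt_reader, &seen[i]) != 0)
            break;
        started++;
    }
    atomic_store(&mt_go, 1);         /* release all readers together */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    for (int i = 0; i < NUM_READERS; i++)
        if (seen[i] != 42)
            ok = 0;                  /* unstarted or wrong read */

    sigaction(SIGSEGV, &old, NULL);
    munmap((void *)mt_page, (size_t)mt_page_size);
    return ok ? atomic_load(&mt_faults) : -1;
}
```

A result between 1 and NUM_READERS means every reader eventually saw
the page; a process exit with status 2 would correspond to the
repeating fault described in this message. Older C libraries may need
the program to be linked with the threads library.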

We would appreciate any help that anyone can give.

Jay

Jay Hoeflinger, jay.p.hoeflinger@intel.com
KAI Software, A Division of Intel Americas, Inc., http://www.kai.com
Phone 217/356-2288, Direct 217/356-5052 x 140, Fax 217/356-5199





Thread overview: 11+ messages
2001-12-06 23:56 [Linux-ia64] mprotect problem Hoeflinger, Jay P
2001-12-07  0:11 ` n0ano
2001-12-07 14:53 ` Hoeflinger, Jay P
2001-12-07 15:13 ` n0ano
2001-12-07 15:18 ` Hoeflinger, Jay P
2001-12-07 16:10 ` David Mosberger
2001-12-07 16:23 ` Hoeflinger, Jay P
2001-12-07 17:34 ` David Mosberger
2001-12-07 19:47 ` Hoeflinger, Jay P
2001-12-07 20:13 ` Boehm, Hans
2001-12-07 20:27 ` David Mosberger
