* RE: [Fwd: Weird ia64 problem]
2005-10-05 21:43 [Fwd: Weird ia64 problem] Jeff Licquia
@ 2005-10-05 21:54 ` Luck, Tony
2005-10-05 22:09 ` Luck, Tony
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Luck, Tony @ 2005-10-05 21:54 UTC
To: linux-ia64
>It was suggested to me that this might be of interest outside the LSB,
>and that I should forward it more widely...
Running under strace, I see you write 2048 bytes at a time until you
see EAGAIN. Then you read 4096 bytes to empty the pipe a little.
Then you try to write 4120 bytes (more than you just emptied).
Is that what you are trying to do?
Do you expect a partial write, rather than the EAGAIN?
-Tony
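The sequence seen under strace can be reproduced outside the LSB suite with a small C helper. This is a minimal sketch, not the LSB test itself: the function name and its parameters are made up for illustration, and the only figures taken from this thread are the 2048-byte fill writes, the 4096-byte read, and a final write slightly larger than PIPE_BUF.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the LSB test): fill a non-blocking pipe
 * with fill_chunk-byte writes until the kernel refuses, read drain_bytes
 * back, then try one write of PIPE_BUF + extra bytes.
 * Returns the byte count accepted by that last write, or -errno on failure.
 */
long probe_pipe_after_partial_drain(size_t fill_chunk, size_t drain_bytes,
                                    size_t extra)
{
    int fds[2];
    long pipe_buf, result;
    size_t big, cap;
    ssize_t n;
    char *buf = NULL;

    assert(fill_chunk > 0 && drain_bytes > 0);

    if (pipe(fds) != 0)
        return -errno;

    /* Ask for PIPE_BUF at run time rather than hard-coding 4096. */
    pipe_buf = fpathconf(fds[1], _PC_PIPE_BUF);
    if (pipe_buf <= 0) {
        result = -EINVAL;
        goto out;
    }
    if (fcntl(fds[1], F_SETFL, O_NONBLOCK) != 0) {
        result = -errno;
        goto out;
    }

    /* One zeroed buffer, large enough for every transfer below. */
    big = (size_t)pipe_buf + extra;
    cap = big;
    if (fill_chunk > cap)
        cap = fill_chunk;
    if (drain_bytes > cap)
        cap = drain_bytes;
    buf = calloc(1, cap);
    if (buf == NULL) {
        result = -ENOMEM;
        goto out;
    }

    /* Fill the pipe until write() fails. */
    while (write(fds[1], buf, fill_chunk) > 0)
        ;
    if (errno != EAGAIN && errno != EWOULDBLOCK) {
        result = -errno;
        goto out;
    }

    /* Empty the pipe a little. */
    if (read(fds[0], buf, drain_bytes) < 0) {
        result = -errno;
        goto out;
    }

    /* The interesting call: more than PIPE_BUF bytes into a nearly full pipe. */
    n = write(fds[1], buf, big);
    result = (n >= 0) ? (long)n : -errno;
out:
    free(buf);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Calling `probe_pipe_after_partial_drain(2048, 4096, 24)` mirrors the figures above when PIPE_BUF is 4096. A positive return value means a full or partial write happened; `-EAGAIN` corresponds to the failure seen in the strace output.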
* RE: [Fwd: Weird ia64 problem]
From: Luck, Tony @ 2005-10-05 22:09 UTC
To: linux-ia64
>Do you expect a partial write, rather than the EAGAIN?
I suppose you should expect EAGAIN here: http://tinyurl.com/82wr9
describes this situation quite clearly.
So either:
1) ia64 thinks 4120 is less than PIPE_BUF, so that it believes
that it should not do a partial write
2) Even though we removed some data from the pipe, it thinks that
it is still all the way full.
[My tests on 2.6.14-rc2].
Which other architectures have you tried this on? Is ia64 all alone
in failing this test?
-Tony
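The rules behind the two possibilities above can be written down as a small decision function. This is a sketch of the POSIX write() rules for a pipe with O_NONBLOCK set (assumed to be what the link describes). The enum, the function name and the `free_space` parameter are illustrative only; `free_space` stands for whatever the implementation currently considers writable, which is exactly what possibility 2 calls into question.

```c
#include <assert.h>
#include <stddef.h>

enum nb_write_result { NB_WRITE_ALL, NB_WRITE_PARTIAL, NB_WRITE_EAGAIN };

/*
 * Expected outcome of a non-blocking write of nbyte bytes to a pipe that
 * has free_space bytes of room (illustrative model, not kernel code).
 */
enum nb_write_result posix_nonblock_pipe_write(size_t nbyte, size_t free_space,
                                               size_t pipe_buf)
{
    assert(pipe_buf > 0);

    if (nbyte <= free_space)
        return NB_WRITE_ALL;       /* everything fits */
    if (nbyte <= pipe_buf)
        return NB_WRITE_EAGAIN;    /* atomic request: all or nothing */
    if (free_space == 0)
        return NB_WRITE_EAGAIN;    /* no data can be written at all */
    return NB_WRITE_PARTIAL;       /* larger than PIPE_BUF: write what fits */
}
```

With the numbers from this thread, `posix_nonblock_pipe_write(4120, 4096, 4096)` gives `NB_WRITE_PARTIAL`, while the same request against a pipe with no room gives `NB_WRITE_EAGAIN`.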
* RE: [Fwd: Weird ia64 problem]
From: Jeff Licquia @ 2005-10-05 22:13 UTC
To: linux-ia64
On Wed, 2005-10-05 at 14:54 -0700, Luck, Tony wrote:
> >It was suggested to me that this might be of interest outside the LSB,
> >and that I should forward it more widely...
>
> Running under strace, I see you write 2048 bytes at a time until you
> see EAGAIN. Then you read 4096 bytes to empty the pipe a little.
>
> Then you try to write 4120 bytes (more than you just emptied).
>
> Is that what you are trying to do?
That sounds about right.
FWIW, if you'd like a look at the source of one of the LSB tests that's
failing, see:
http://cvs.gforge.freestandards.org/cgi-bin/cvsweb.cgi/tests/lsb-runtime-test/modules/vsx-pcts/tset/ANSI.os/streamio/fwrite/fwrite.c?rev=1.1.1.1&contenttype=text/x-cvsweb-markup&cvsroot=lsb
Look at function "test3", specifically the part after where it reports
"testing assertion using non-blocking pipe".
> Do you expect a partial write, rather than the EAGAIN?
That appears to be what's happening in most other cases, and also
appears to be what the LSB tests want.
* RE: [Fwd: Weird ia64 problem]
From: Jeff Licquia @ 2005-10-05 22:30 UTC
To: linux-ia64
On Wed, 2005-10-05 at 15:09 -0700, Luck, Tony wrote:
> >Do you expect a partial write, rather than the EAGAIN?
>
> I suppose you should expect EAGAIN here: http://tinyurl.com/82wr9
> describes this situation quite clearly.
>
> So either:
> 1) ia64 thinks 4120 is less than PIPE_BUF, so that it believes
> that it should not do a partial write
Except that the test code asks the system for PIPE_BUF, and then adds to
it to determine how much to write. (See lines 26, 32, and 33 of my
test.) Specifically, we try to write PIPE_BUF + 24 bytes.
> 2) Even though we removed some data from the pipe, it thinks that
> it is still all the way full.
Hmm. The spec doesn't seem to say whether a read of PIPE_BUF bytes on a
full pipe _must_ put the pipe in a state where it can accept more input.
The test, certainly, seems to think this is mandatory.
> [My tests on 2.6.14-rc2].
>
> Which other architectures have you tried this on? Is ia64 all alone
> in failing this test?
Yup. In fact, only recent kernels fail; 2.6.8 succeeds, while 2.6.12
fails.
For other architectures, I've tested on i386 and amd64 for both 2.6.8
and 2.6.12 kernels.
* RE: [Fwd: Weird ia64 problem]
From: Luck, Tony @ 2005-10-05 22:33 UTC
To: linux-ia64
>> Do you expect a partial write, rather than the EAGAIN?
>
>That appears to be what's happening in most other cases, and also
>appears to be what the LSB tests want.
Running with a kernel configured for 4K pagesize, this
test passes (prints "the write appeared to succeed").
I expect this difference in behavior can be traced to when
Linus improved pipe performance by allowing a list of many
pages to implement the pipe buffer. See http://tinyurl.com/7jfen
The date on this is Jan 6th, 2005 ... which puts it between
2.6.10 and 2.6.11-rc1
-Tony
* RE: [Fwd: Weird ia64 problem]
From: Jeff Licquia @ 2005-10-05 23:09 UTC
To: linux-ia64
On Wed, 2005-10-05 at 15:33 -0700, Luck, Tony wrote:
> Running with a kernel configured for 4K pagesize, this
> test passes (prints "the write appeared to succeed").
>
> I expect this difference in behavior can be traced to when
> Linus improved pipe performance by allowing a list of many
> pages to implement the pipe buffer. See http://tinyurl.com/7jfen
>
> The date on this is Jan 6th, 2005 ... which puts it between
> 2.6.10 and 2.6.11-rc1
This implies that PIPE_BUF should be equal to the kernel's configured
page size on ia64. I modified my test to report on PIPE_BUF's returned
value, and it is 4096 on all my tested architectures, including ia64 on
2.6.8 and 2.6.12.
(Also, the page size on my 2.6.8 ia64 kernel is 16K, same as my 2.6.12
kernel.)
It strikes me that, if this is correct, forcing 4K page size isn't the
right fix. Do you have any ideas on that front?
* RE: [Fwd: Weird ia64 problem]
From: Luck, Tony @ 2005-10-05 23:28 UTC
To: linux-ia64
>This implies that PIPE_BUF should be equal to the kernel's configured
>page size on ia64. I modified my test to report on PIPE_BUF's returned
>value, and it is 4096 on all my tested architectures, including ia64 on
>2.6.8 and 2.6.12.
4096 is possibly for historical reasons.
>(Also, the page size on my 2.6.8 ia64 kernel is 16K, same as my 2.6.12
>kernel.)
Yes, the default pagesize on ia64 is 16k.
>It strikes me that, if this is correct, forcing 4K page size isn't the
>right fix. Do you have any ideas on that front?
My experiment with a 4K page kernel was just that, an experiment. I
didn't intend it to be a fix. But it is a good thing to try when
some generic test fails on ia64 and works on everything else.
Looking at fs/pipe.c ... it appears that pipe_readv() only
wants to let any writers know that we've read anything if
we manage to empty one of the page buffers:
    if (!buf->len) {
        ...
        do_wakeup = 1;
    }
Now, the question becomes "is this a bug?" As you pointed
out in an earlier e-mail the standard makes no mention of
what happens when you read some data from a full pipe. So
we appear not to be in violation of the letter of the standard.
But this does fly in the face of common sense.
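To see why a 4096-byte read might not help on a 16K-page kernel, here is a toy user-space model of a page-slot pipe. It is not the code in fs/pipe.c: the 16-slot limit, the merge rule for small writes, and every name below are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_SLOTS 16  /* assumption: a pipe owns at most 16 page-sized slots */

struct model_slot { size_t off, len; };  /* data occupies [off, off + len) */

struct pipe_model {
    size_t page_size;
    int nrbufs;                          /* slots in use; slot[0] is oldest */
    struct model_slot slot[MODEL_SLOTS];
};

/* Non-blocking write; returns bytes accepted, 0 standing in for EAGAIN. */
size_t model_write(struct pipe_model *p, size_t n)
{
    size_t done = 0;
    size_t tail = n % p->page_size;

    /* Assumed merge rule: a sub-page tail may be appended to the newest slot. */
    if (tail != 0 && p->nrbufs > 0) {
        struct model_slot *last = &p->slot[p->nrbufs - 1];
        if (last->off + last->len + tail <= p->page_size) {
            last->len += tail;
            done = tail;
        }
    }
    /* Everything else needs a completely free slot. */
    while (done < n && p->nrbufs < MODEL_SLOTS) {
        size_t chunk = n - done;
        if (chunk > p->page_size)
            chunk = p->page_size;
        p->slot[p->nrbufs].off = 0;
        p->slot[p->nrbufs].len = chunk;
        p->nrbufs++;
        done += chunk;
    }
    return done;
}

/* Read up to n bytes; a slot is released only once it is fully drained. */
size_t model_read(struct pipe_model *p, size_t n)
{
    size_t done = 0;

    while (done < n && p->nrbufs > 0) {
        size_t chunk = n - done;
        if (chunk > p->slot[0].len)
            chunk = p->slot[0].len;
        p->slot[0].off += chunk;
        p->slot[0].len -= chunk;
        done += chunk;
        if (p->slot[0].len == 0) {
            memmove(&p->slot[0], &p->slot[1],
                    (size_t)(p->nrbufs - 1) * sizeof p->slot[0]);
            p->nrbufs--;
        }
    }
    return done;
}

/* Fill with `fill`-byte writes, read `drain` bytes, then write `final`. */
size_t model_scenario(size_t page_size, size_t fill, size_t drain, size_t final)
{
    struct pipe_model p = { .page_size = page_size };

    assert(page_size > 0 && fill > 0);
    while (model_write(&p, fill) == fill)
        ;
    model_read(&p, drain);
    return model_write(&p, final);
}
```

Under these assumptions, `model_scenario(16384, 2048, 4096, 4120)` returns 0 (the model's stand-in for EAGAIN) because the oldest slot still holds 12288 bytes, while `model_scenario(4096, 2048, 4096, 4120)` returns 4096, a partial write.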
As you say above, reporting the PIPE_BUF value as PAGE_SIZE
[probably max(4096, PAGE_SIZE) ... for any arch that has a
page size smaller than 4k] would fix this. But then we get
back to the historical properties of 4k as the PIPE_BUF size.
Would such a change break existing applications that are not
well enough written to use fpathconf(fd, _PC_PIPE_BUF)?
-Tony
* RE: [Fwd: Weird ia64 problem]
From: Luck, Tony @ 2005-10-05 23:40 UTC
To: linux-ia64
>This implies that PIPE_BUF should be equal to the kernel's configured
>page size on ia64. I modified my test to report on PIPE_BUF's returned
>value, and it is 4096 on all my tested architectures, including ia64 on
>2.6.8 and 2.6.12.
Hmmph. According to strace, there are no system calls resulting from
the fpathconf() call. How does it come up with the magic 4096 value?
-Tony
* Re: [Fwd: Weird ia64 problem]
From: Ian Wienand @ 2005-10-06 1:02 UTC
To: linux-ia64
On Wed, Oct 05, 2005 at 04:40:24PM -0700, Luck, Tony wrote:
> Hmmph. According to strace, there are no system calls resulting from
> the fpathconf() call. How does it come up with the magic 4096 value?
Looks like it's built with the value of PIPE_BUF from
include/linux/limits.h
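The two sources of the value can be compared directly. A minimal sketch with made-up function names: one helper asks fpathconf() on a freshly created pipe, the other returns the PIPE_BUF macro that <limits.h> supplied at compile time (or -1 if the macro is not defined).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <unistd.h>

/* Value reported by fpathconf() for a fresh pipe, or -1 on error. */
long runtime_pipe_buf(void)
{
    int fds[2];
    long value;

    if (pipe(fds) != 0)
        return -1;
    value = fpathconf(fds[0], _PC_PIPE_BUF);
    close(fds[0]);
    close(fds[1]);
    return value;
}

/* Value baked in when this file was compiled, or -1 if not defined. */
long compiled_pipe_buf(void)
{
#ifdef PIPE_BUF
    return PIPE_BUF;
#else
    return -1;
#endif
}
```

If both helpers return the same number regardless of the kernel's page size, that would be consistent with the value being fixed at build time rather than queried from the running kernel.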
Through a binary search, on my system (16K pages) with the original
test program, setting pipe_buf to 10923 fails with EAGAIN, but add one
more byte (10924) and it works. I can't immediately see how that
number relates to anything.
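One possible reading of the 10923/10924 boundary, offered as an assumption rather than a confirmed explanation: suppose the test fills the pipe with writes of pipe_buf / 2 bytes (as the 2048-byte writes with a 4096 PIPE_BUF suggest), then reads pipe_buf bytes, and suppose the kernel packs whole writes into 16384-byte pages without splitting them. The arithmetic below then gives the bytes left in the oldest page after the read. The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes still sitting in the oldest page after reading `drain` bytes,
 * assuming the page was packed with as many whole `fill`-byte writes as fit.
 */
size_t oldest_page_residue(size_t page_size, size_t fill, size_t drain)
{
    size_t packed;

    assert(fill > 0 && fill <= page_size);
    packed = (page_size / fill) * fill;   /* whole writes only */
    return (drain >= packed) ? 0 : packed - drain;
}
```

With pipe_buf = 10923 the fill size is 5461, three writes fit in a page (16383 bytes), and a 10923-byte read leaves 5460 bytes behind, so the page is not emptied. With pipe_buf = 10924 the fill size is 5462, only two writes fit (10924 bytes), and the read drains the page completely.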
-i
* RE: [Fwd: Weird ia64 problem]
From: Jeff Licquia @ 2005-10-07 14:29 UTC
To: linux-ia64
On Wed, 2005-10-05 at 16:28 -0700, Luck, Tony wrote:
> Now, the question becomes "is this a bug?" As you pointed
> out in an earlier e-mail the standard makes no mention of
> what happens when you read some data from a full pipe. So
> we appear not to be in violation of the letter of the standard.
>
> But this does fly in the face of common sense.
>
> As you say above, reporting the PIPE_BUF value as PAGE_SIZE
> [probably max(4096, PAGE_SIZE) ... for any arch that has a
> page size smaller than 4k] would fix this. But then we get
> back to the historical properties of 4k as the PIPE_BUF size.
> Would such a change break existing applications that are not
> well enough written to use fpathconf(fd, _PC_PIPE_BUF)?
I think there's a problem with this. While PIPE_BUF is specified in a
kernel header, it ends up becoming an embedded value in glibc, from what
I can tell. Which also makes it an embedded value in statically linked
apps. This looks like a dead end, even for apps that use fpathconf().
Also, from looking at some of the kernel comments (like where PIPE_SIZE
is defined), it seems that the kernel powers-that-be also intend to keep
PIPE_BUF and PAGE_SIZE decoupled.
So the way forward seems to be to add a test to pipe_readv() for this
condition.
I'm thinking it should check if (PIPE_SIZE - buf->len) > PIPE_BUF, and
set do_wakeup. The code could be #defined out if PIPE_SIZE = PIPE_BUF,
so the change reduces to a no-op on other archs.
Does that sound right? I'm going to work on a patch here.
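The proposed condition can be written as a stand-alone predicate to try values against. This is only a sketch of the check as stated above, not a kernel patch: the function name is hypothetical, and `pipe_size`, `buf_len` and `pipe_buf` stand in for PIPE_SIZE, buf->len and PIPE_BUF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True when the reader would set do_wakeup: either the buffer is fully
 * drained (the existing rule quoted earlier in the thread) or the proposed
 * extra condition holds.
 */
bool reader_should_wake_writers(size_t pipe_size, size_t buf_len,
                                size_t pipe_buf)
{
    assert(buf_len <= pipe_size);
    return buf_len == 0 || (pipe_size - buf_len) > pipe_buf;
}
```

When `pipe_size` equals `pipe_buf` the second clause can never be true, which matches the remark that the change reduces to a no-op on other architectures. With a 16384-byte buffer and 12288 bytes left after a 4096-byte read, the strict `>` also evaluates to false, so the boundary case may be worth checking when writing the patch.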
* Re: [Fwd: Weird ia64 problem]
From: Tony Luck @ 2005-10-08 2:40 UTC
To: linux-ia64
> I'm thinking it should check if (PIPE_SIZE - buf->len) > PIPE_BUF, and
> set do_wakeup. The code could be #defined out if PIPE_SIZE = PIPE_BUF,
> so the change reduces to a no-op on other archs.
>
> Does that sound right? I'm going to work on a patch here.
Roughly (without looking back at fs/pipe.c). When you have your
patch, you will need to restart this discussion on the linux-kernel
mailing list (since you'll be touching a generic file). Some random
grepping for PAGE_SIZE (well, PAGE_SHIFT) shows that some
other architectures have some support for page size > PIPE_BUF.
So there should be some interest in this fix.
-Tony