* Re: mmap operation not working as expected on sparc linux
2011-06-01 8:18 mmap operation not working as expected on sparc linux Steven Dake
@ 2011-06-01 8:38 ` David Miller
2011-06-01 9:37 ` Steven Dake
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2011-06-01 8:38 UTC (permalink / raw)
To: sparclinux
From: Steven Dake <sdake@redhat.com>
Date: Wed, 01 Jun 2011 01:18:01 -0700
> I maintain a project called corosync which uses a memory backed file to
> generate a duplicate mapping in memory to implement a ring buffer. It
> essentially uses the concepts here:
>
> http://en.wikipedia.org/wiki/Ring_buffer#Exemplary_POSIX_Implementation
>
> This doesn't appear to work on sparclinux, returning an error EINVAL on
> the second memory map operation:
>
> address = mmap (buffer->address + buffer->count_bytes,
> buffer->count_bytes, PROT_READ | PROT_WRITE,
> MAP_FIXED | MAP_SHARED, file_descriptor, 0);
>
> Any ideas?
The start addresses of fixed and shared mappings need to have a
certain property relative to other such mappings.
And that property is that the addresses must all be congruent modulo
the D-cache alias factor, which on sparc64 is 16384 bytes.
If we didn't enforce this, then writes from one mapping could get
aliased in the L1 D-cache, and not show up in other mappings.
However if we force all the start addresses to the same 16K boundary,
then this guarantees that writes to one mapping will show up in
others.
You'll hit similar issues on MIPS, ARM, PARISC, and SH. Just look for
which platforms define __ARCH_FORCE_SHMLBA in their shmparam.h header
file.
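As an illustration that is not part of the original messages, here is a minimal sketch of the alignment rule described above. It assumes a userspace program that takes SHMLBA from <sys/shm.h>; the helper names shm_is_aligned and shm_align_up are hypothetical, and the mask arithmetic assumes SHMLBA is a power of two (16384 on sparc64, per the message above).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/shm.h>   /* SHMLBA */

/* Hypothetical helper: nonzero if addr is a multiple of SHMLBA. */
static int shm_is_aligned(const void *addr)
{
    return ((uintptr_t)addr % (uintptr_t)SHMLBA) == 0;
}

/* Hypothetical helper: round addr up to the next SHMLBA boundary. */
static void *shm_align_up(void *addr)
{
    uintptr_t lba = (uintptr_t)SHMLBA;

    assert((lba & (lba - 1)) == 0);   /* the mask trick needs a power of two */
    return (void *)(((uintptr_t)addr + lba - 1) & ~(lba - 1));
}
```

An address with a nonzero remainder modulo SHMLBA cannot be used as-is for MAP_FIXED | MAP_SHARED on such a platform. The rounded-up address is only safe to map at if it lies inside a region the caller has already reserved.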
* Re: mmap operation not working as expected on sparc linux
From: Steven Dake @ 2011-06-01 9:37 UTC (permalink / raw)
To: sparclinux
On 06/01/2011 01:38 AM, David Miller wrote:
> From: Steven Dake <sdake@redhat.com>
> Date: Wed, 01 Jun 2011 01:18:01 -0700
>
>> I maintain a project called corosync which uses a memory backed file to
>> generate a duplicate mapping in memory to implement a ring buffer. It
>> essentially uses the concepts here:
>>
>> http://en.wikipedia.org/wiki/Ring_buffer#Exemplary_POSIX_Implementation
>>
>> This doesn't appear to work on sparclinux, returning an error EINVAL on
>> the second memory map operation:
>>
>> address = mmap (buffer->address + buffer->count_bytes,
>> buffer->count_bytes, PROT_READ | PROT_WRITE,
>> MAP_FIXED | MAP_SHARED, file_descriptor, 0);
>>
>> Any ideas?
>
> The start addresses of fixed and shared mappings need to have a
> certain property relative to other such mappings.
>
> And that property is that the addresses must all be congruent modulo
> the D-cache alias factor, which on sparc64 is 16384 bytes.
>
> If we didn't enforce this, then writes from one mapping could get
> aliased in the L1 D-cache, and not show up in other mappings.
>
> However if we force all the start addresses to the same 16K boundary,
> then this guarantees that writes to one mapping will show up in
> others.
>
> You'll hit similar issues on MIPS, ARM, PARISC, and SH. Just look for
> which platforms define __ARCH_FORCE_SHMLBA in their shmparam.h header
> file.
Thanks for the response. For those searching for this problem in the
future, we tried 16384, and then decided to have a look at the kernel
source tree and came up with some speculation that the hugetlb settings
would result in EINVAL via prepare_hugetlb. The minimum page we could
execute this operation on was 4MB. I notice on some other arches this
is not required.
Best regards
-steve
* Re: mmap operation not working as expected on sparc linux
From: william felipe_welter @ 2011-06-01 16:39 UTC (permalink / raw)
To: sparclinux
> The start addresses of fixed and shared mappings need to have a
> certain property relative to other such mappings.
>
> And that property is that the addresses must all be modulo the
> D-cache alias factor, which on sparc64 is 16384 bytes.
>
How can I do this? How do I calculate it?
Example:
addr_orig = mmap (NULL, bytes, PROT_NONE,
                  MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
addr = mmap (addr_orig, bytes, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_SHARED, fd, 0);
Should I do addr_orig % 16384?
--
William Felipe Welter
------------------------------
Consultor em Tecnologias Livres
william.welter@4linux.com.br
www.4linux.com.br
* Re: mmap operation not working as expected on sparc linux
From: william felipe_welter @ 2011-06-02 15:47 UTC (permalink / raw)
To: sparclinux
> http://en.wikipedia.org/wiki/Ring_buffer#Exemplary_POSIX_Implementation
>
> This doesn't appear to work on sparclinux, returning an error EINVAL on
> the second memory map operation:
>
> address = mmap (buffer->address + buffer->count_bytes,
> buffer->count_bytes, PROT_READ | PROT_WRITE,
> MAP_FIXED | MAP_SHARED, file_descriptor, 0);
>
> Any ideas?
Steven, the problem is in the second mmap; the call you quoted is the third call to mmap.
First call:
buffer->address = mmap (NULL, buffer->count_bytes << 1, PROT_NONE,
                        MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
Second call (where the problem occurs, returns 0xffffffff):
address = mmap (buffer->address, buffer->count_bytes, PROT_READ | PROT_WRITE,
                MAP_FIXED | MAP_SHARED, file_descriptor, 0);
Third call (not reached because the second one fails):
address = mmap (buffer->address + buffer->count_bytes,
                buffer->count_bytes, PROT_READ | PROT_WRITE,
                MAP_FIXED | MAP_SHARED, file_descriptor, 0);
Any ideas? What could be the reason for this behavior?
--
William Felipe Welter
------------------------------
Consultor em Tecnologias Livres
william.welter@4linux.com.br
www.4linux.com.br
* Re: mmap operation not working as expected on sparc linux
From: David Miller @ 2011-06-02 21:27 UTC (permalink / raw)
To: sparclinux
From: Steven Dake <sdake@redhat.com>
Date: Wed, 01 Jun 2011 02:37:04 -0700
> Thanks for the response. For those searching for this problem in the
> future, We tried 16384, and then decided to have a look at the kernel
> source tree and came up with some speculation that the hugetlb settings
> would result in EINVAL via prepare_hugetlb. The minimum page we could
> execute this operation on was 4MB. I notice on some other arches this
> is not required.
Requiring 4MB is not kosher, it shouldn't be asking you to do that
much.
Where is this prepare_hugetlb() function in the kernel? I cannot
find it.
Also, please prepare a test case for me, I want to fix this.
Thanks.
* Re: mmap operation not working as expected on sparc linux
From: Steven Dake @ 2011-06-02 21:53 UTC (permalink / raw)
To: sparclinux
On 06/02/2011 02:27 PM, David Miller wrote:
> From: Steven Dake <sdake@redhat.com>
> Date: Wed, 01 Jun 2011 02:37:04 -0700
>
>> Thanks for the response. For those searching for this problem in the
>> future, We tried 16384, and then decided to have a look at the kernel
>> source tree and came up with some speculation that the hugetlb settings
>> would result in EINVAL via prepare_hugetlb. The minimum page we could
>> execute this operation on was 4MB. I notice on some other arches this
>> is not required.
>
> Requiring 4MB is not kosher, it shouldn't be asking you to do that
> much.
>
> Where is this prepare_hugetlb() function in the kernel? I cannot
> find it.
>
> Also, please prepare a test case for me, I want to fix this.
>
> Thanks.
Dave,
Sorry for the noise. We did not require 4MB regions, as I had
originally stated. Unfortunately I just got access to hardware today to
test (was relying on community hardware testers previously). I have
prepared a patch which fixes the problem in our software. x86_64
behaves differently (works) than sparc64.
In summary we were doing the following:
1. creating a nonwrapping mmap as follows:
    addr_orig = mmap (NULL, bytes, PROT_NONE,
                      MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (addr_orig == MAP_FAILED) {
        goto error_close_unlink;
    }
    addr = mmap (addr_orig, bytes, PROT_READ | PROT_WRITE,
                 MAP_FIXED | MAP_SHARED, fd, 0);
    ^^ clearly dubious
2. further creations of circular memory maps caused all sorts of problems
on sparc but not on x86_64.
This resulted in later circular memory maps we wanted to create having
to be 4MB in size to work properly. I can't explain why.
I changed the mmap operation in 1 to do the following:
addr = mmap (NULL, bytes, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0);
This allows our software to function properly.
I've cc'ed Angus who is working on a new implementation of this ring
buffer in a different project (www.libqb.org) who found the 4MB sizes
resulted in the software working. He is investigating introducing the
change I stated into his code to see if this also works on sparc.
He can follow up with you if he runs into further problems.
Regards
-steve
* Re: mmap operation not working as expected on sparc linux
From: Matthias Rosenfelder @ 2011-06-06 23:06 UTC (permalink / raw)
To: sparclinux
On 02.06.2011 17:47, william felipe_welter wrote:
> Any ideas? What could be the reason for this behavior?
Any fixed and shared mmap() mapping address must be aligned to SHMLBA as
mentioned by Dave. I guess the first mmap() succeeds because these
restrictions do not apply to private mappings. The second mmap() wants a
shared memory mapping but the address returned by the first one is only
page-aligned and not aligned to SHMLBA. This is why the second one fails.
See the comment in
http://lxr.linux.no/#linux+v2.6.39/arch/sparc/kernel/sys_sparc_64.c#L124
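As an illustration that is not part of the original messages, the condition described above can be sketched as a userspace predicate. The form of the test, (addr - file offset) & (SHMLBA - 1), is an assumption about what the linked kernel source checks, not a quotation of it, and the name passes_alias_rule is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/shm.h>     /* SHMLBA */
#include <sys/types.h>   /* off_t */

/* Hypothetical predicate: would this mmap() request satisfy the aliasing
 * rule? Only MAP_FIXED | MAP_SHARED requests are constrained here; for
 * anything else the function reports success. */
static int passes_alias_rule(const void *addr, off_t offset, int flags)
{
    uintptr_t mask = (uintptr_t)SHMLBA - 1;

    assert(offset >= 0);
    if (!(flags & MAP_FIXED) || !(flags & MAP_SHARED))
        return 1;
    return (((uintptr_t)addr - (uintptr_t)offset) & mask) == 0;
}
```

With a file offset of 0, as in the calls quoted in this thread, the predicate reduces to "addr is a multiple of SHMLBA". That is why a merely page-aligned address from the first, private mmap() can be rejected by the second one.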
On 02.06.2011 23:53, Steven Dake wrote:
> 2.
> further creations of circular memory maps caused all sorts of problems
> on sparc but not on x86_64.
>
> This resulted in later circular memory maps we wanted to create having
> to be 4MB in size to work properly. I can't explain why.
>
> I changed the mmap operation in 1 to do the following:
> addr = mmap (NULL, bytes, PROT_READ | PROT_WRITE,
> MAP_SHARED, fd, 0);
>
> This allows our software to function properly.
This may work only by accident. You're requesting a new mapping for
which the kernel may choose a virtual address. There is no guarantee
that there is enough space after this address to also map the buffer for
the second time. Therefore, the third mmap() call might fail. If it does
not fail, you're just lucky.
As far as I can see, the first mmap() is only there in order to find a
large enough region in the virtual address space that the buffer can be
mapped twice - one after the other. There is no point in mapping it the
first time successfully and then finding out that there is already
something else mapped right behind it.
In order to make this circular buffer work, you need the two mappings
to be consecutive. Furthermore, (due to architectural restrictions) any
two successful mmap() mappings are at least SHMLBA bytes away from each
other and are also aligned to this size. Therefore, your buffer must be
at least SHMLBA bytes large to avoid a gap and both mappings must be
aligned to SHMLBA bytes.
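As an illustration that is not part of the original messages: because the second view starts at the first view's address plus the buffer size, and both views map file offset 0, the size itself has to be a multiple of SHMLBA for the second start address to be aligned. For power-of-two sizes this is the same as "at least SHMLBA". A minimal sketch of rounding a requested size accordingly, with the hypothetical helper name ring_size_for:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/shm.h>   /* SHMLBA */

/* Hypothetical helper: smallest multiple of SHMLBA that is >= requested. */
static size_t ring_size_for(size_t requested)
{
    size_t lba = (size_t)SHMLBA;

    assert(requested > 0);
    return ((requested + lba - 1) / lba) * lba;
}
```

On a platform where SHMLBA equals the page size this is ordinary page rounding. On sparc64 it would round up to 16384-byte units, assuming <sys/shm.h> reports that value there.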
Unfortunately, you cannot specify the alignment for mmap(). You either
choose an address by yourself (which one?) or you make the kernel decide
for you. Therefore, the difficulty is to find an address suitable for
the first of the three mmap() calls.
You could try the following:
At first let the kernel choose an address for the first mmap(). If it is
successful but the alignment is not right, you can take this address and
align it properly by hand in order to repeat the first mmap() with this
aligned address. I guess, in most cases this should succeed. If it does
not, you can repeat the first mmap() request with three times the buffer
size. If it is successful and the alignment is right, then you're done.
If not, align it by hand and try again.
Something like this:
#include <asm/shmparam.h>

#define ALIGNUP(p, q) \
    ((void *)(((unsigned long)(p) + (q) - 1) & ~((q) - 1)))
/* true if p is already a multiple of q (q must be a power of two) */
#define ALIGN_TEST(p, q) \
    (((unsigned long)(p) & ~((q) - 1)) == (unsigned long)(p))

/* forward declaration */
void ring_buffer_free (struct ring_buffer *buffer);

void
ring_buffer_create (struct ring_buffer *buffer, unsigned long order)
{
    ...
    int req_size;
    ...
    buffer->address = mmap (NULL, buffer->count_bytes << 1, PROT_NONE,
                            MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (buffer->address == MAP_FAILED)
        report_exceptional_condition ();

    /* my proposal goes here */
#if __ARCH_FORCE_SHMLBA
    if (buffer->count_bytes < SHMLBA) /* ... then this cannot work */
        report_exceptional_condition ();
    req_size = buffer->count_bytes << 1;
    while (1) {
        if (buffer->address == MAP_FAILED)
            report_exceptional_condition ();
        if (ALIGN_TEST(buffer->address, SHMLBA)) {
            break;
        } else {
            /* try again this addr with manual alignment */
            void *aligned_addr = ALIGNUP(buffer->address, SHMLBA);
            if (buffer->address != MAP_FAILED)
                ring_buffer_free(buffer);
            /* without MAP_FIXED, aligned_addr is only a hint */
            buffer->address = mmap (aligned_addr, req_size, PROT_NONE,
                                    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
            if (buffer->address != MAP_FAILED)
                break;
            if (req_size == buffer->count_bytes << 1) {
                /* failed; try again in larger region */
                req_size = 3 * buffer->count_bytes;
                buffer->address = mmap (NULL, req_size, PROT_NONE,
                                        MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
            } else {
                report_exceptional_condition ();
            }
        }
    }
#endif
    /* 2nd and third mmap() go here; they should succeed */
This is not tested, as I don't have any SPARC hardware near me at the
moment. But I guess this should work also for buffer sizes much smaller
than 4 MiB.
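As an illustration that is not part of the original messages, here is a self-contained sketch of a different technique from the retry loop above: reserve the doubled buffer plus SHMLBA bytes of slack in one PROT_NONE mapping, round the start up inside that reservation, and place both shared views there with MAP_FIXED. The names make_backing_file and map_twice and the /tmp path are hypothetical, SHMLBA is taken from <sys/shm.h> and assumed to be a power of two, and the code has not been run on SPARC hardware.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/shm.h>   /* SHMLBA */
#include <unistd.h>

/* Hypothetical helper: create an unlinked temporary file of `size` bytes.
 * The /tmp location is an assumption. Returns -1 on failure. */
static int make_backing_file(size_t size)
{
    char path[] = "/tmp/ring-buffer-XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: map the first `size` bytes of fd twice, back to back.
 * `size` must be a nonzero multiple of SHMLBA. Returns MAP_FAILED on error. */
static void *map_twice(int fd, size_t size)
{
    size_t lba = (size_t)SHMLBA;
    size_t reserve = 2 * size + lba;   /* slack to align within */
    unsigned char *raw, *lo, *end;

    assert((lba & (lba - 1)) == 0);    /* the mask trick needs a power of two */
    if (size == 0 || size % lba != 0)
        return MAP_FAILED;

    raw = mmap(NULL, reserve, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (raw == MAP_FAILED)
        return MAP_FAILED;

    /* First SHMLBA-aligned address inside the reservation. */
    lo = (unsigned char *)(((uintptr_t)raw + lba - 1) & ~(uintptr_t)(lba - 1));
    end = raw + reserve;

    /* Both views land inside our own reservation, so MAP_FIXED cannot
     * overwrite a mapping that belongs to anything else. */
    if (mmap(lo, size, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_SHARED, fd, 0) == MAP_FAILED ||
        mmap(lo + size, size, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_SHARED, fd, 0) == MAP_FAILED) {
        munmap(raw, reserve);
        return MAP_FAILED;
    }

    /* Give back the unused head and tail of the reservation. */
    if (lo > raw)
        munmap(raw, (size_t)(lo - raw));
    if (end > lo + 2 * size)
        munmap(lo + 2 * size, (size_t)(end - (lo + 2 * size)));
    return lo;
}
```

A caller would create the file with make_backing_file(size), pass the descriptor to map_twice(fd, size), and then expect a byte written at offset i of the result to be readable at offset i + size. The slack costs at most SHMLBA bytes of address space, and the unused part is unmapped before returning.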
On 02.06.2011 23:27, David Miller wrote:
> Also, please prepare a test case for me, I want to fix this.
I don't think there is anything to fix inside of the kernel. People just
need to pay attention to the SHMLBA alignment when they specify
MAP_FIXED | MAP_SHARED.