* Re: [PATCH]: Fix D-cache corruption in mremap()
2006-06-02 0:52 [PATCH]: Fix D-cache corruption in mremap() (was David Miller
@ 2006-06-02 0:54 ` David Miller
2006-06-02 8:04 ` Christian Joensson
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2006-06-02 0:54 UTC (permalink / raw)
To: sparclinux
From: David Miller <davem@davemloft.net>
Date: Thu, 01 Jun 2006 17:52:58 -0700 (PDT)
> I'll also post a little reproducer program that can be used to
> accurately see if the patch really fixes the problem on your
> machine or not, which should help with the variability of the
> dpkg corruption case.
As promised:
/* mremap() stress tester for D-cache aliasing platforms.
*
* Copyright (C) 2006 David S. Miller (davem@davemloft.net)
*
* It tries to exercise the case where mremap() with MREMAP_MAYMOVE
* will actually place the area somewhere else and not just extend
* or shrink at the existing mapping location.
*
* This can cause problems on virtually indexed cache platforms
* if they do not implement move_pte() with logic to handle a
* change of virtual color. If the cache virtual color changes
* when mremap() moves the mapping around, we can end up accessing
* stale aliases in the cache on subsequent cpu accesses to the
* new virtual addresses.
*
* This bug was first discovered as file corruption occurring occasionally
* in 'dpkg'. When 'dpkg' is building a 'status' or 'available' file it
* uses an expanding allocator called 'varbuf' which uses realloc()
* heavily to expand its internal buffer. In glibc, for very large
* malloc() buffer sizes, realloc() uses mremap() with MREMAP_MAYMOVE to
* try and satisfy expansion requests. Most of the time there is room
* in the address space, but if we bump up against another mmap() region
* it can move things around. If this results in different D-cache coloring
* for the region we can end up reading corrupt data from existing aliases
* in the D-cache.
*
* It was very hard to reproduce, so this test case was written.
*/
#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
/* XXX There is no easy way to get at mremap()'s MREMAP_FIXED functionality
* XXX with older glibc versions...
*/
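extern void *mremap(void *old_address, size_t old_size, size_t new_size,
		    unsigned long flags, ...);
# define MREMAP_MAYMOVE 1
# define MREMAP_FIXED 2
extern void *mmap(void *start, size_t length, int prot, int flags, int fd,
		  off_t offset);
/* munmap() is used by destroy_main_buf() and needs a declaration too. */
extern int munmap(void *start, size_t length);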
/* Same on all platforms... */
#define PROT_READ 0x1 /* Page can be read. */
#define PROT_WRITE 0x2 /* Page can be written. */
#if defined(__alpha__)
#define MAP_FIXED 0x100 /* Interpret addr exactly. */
#else
#if defined(__parisc__)
#define MAP_FIXED 0x04 /* Interpret addr exactly. */
#else
#define MAP_FIXED 0x10 /* Interpret addr exactly. */
#endif
#endif
#if defined(__alpha__) || defined(__parisc__)
#define MAP_ANONYMOUS 0x10 /* Don't use a file. */
#else
#if defined(__mips__) || defined(__xtensa__)
#define MAP_ANONYMOUS 0x800 /* Don't use a file. */
#else
#define MAP_ANONYMOUS 0x20 /* Don't use a file. */
#endif
#endif
#define MAP_PRIVATE 0x02 /* Changes are private. */
#define MAP_FAILED ((void *) -1)
static int page_size;
static void *unmapped_region;
#define MAX_OFFSET 8
static int init_main_buf(void **main_buf_p, int *main_buf_size_p)
{
	page_size = getpagesize();
	*main_buf_size_p = page_size;

	/* Reserve MAX_OFFSET + 1 pages of address space to move the buffer within. */
	unmapped_region = mmap(NULL, (MAX_OFFSET + 1) * page_size,
			       PROT_READ,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (unmapped_region == (void *) MAP_FAILED) {
		perror("mmap() of unmapped_region");
		return 1;
	}
	fprintf(stdout, "unmapped region at %p\n", unmapped_region);
	fflush(stdout);

	/* Place the one-page test buffer at the start of the reserved region. */
	*main_buf_p = mmap(unmapped_region, *main_buf_size_p,
			   PROT_READ | PROT_WRITE,
			   MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS,
			   -1, 0);
	if (*main_buf_p == (void *) MAP_FAILED) {
		perror("Initial 1-page mmap() of main_buf");
		return 1;
	}
	fprintf(stdout, "Initial main_buf at %p\n", *main_buf_p);
	fflush(stdout);
	return 0;
}
static void destroy_main_buf(void *main_buf, int main_buf_size)
{
	munmap(main_buf, main_buf_size);
}
static unsigned int key = 0x00000001;
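static void fill_page(void *start)
{
	unsigned int *p = start;
	unsigned int *end = (unsigned int *) ((char *) start + page_size);

	while (p < end) {
		volatile unsigned int *xp = (volatile unsigned int *) p;

		/* Read the word before overwriting it. */
		(void) *xp;
		/* Store index + key; keep the increment separate so p is not
		 * both read and modified in one expression.
		 */
		*p = (unsigned int) (p - (unsigned int *) start) + key;
		p++;
	}
}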
static int remap_page(void **main_buf_p, int *main_buf_size_p, int off)
{
	void *p;

	/* Force the page to move to slot 'off' of the reserved region. */
	p = mremap(*main_buf_p,
		   *main_buf_size_p,
		   *main_buf_size_p,
		   MREMAP_FIXED | MREMAP_MAYMOVE,
		   (char *) unmapped_region + (off * page_size));
	if (p == (void *) MAP_FAILED) {
		perror("mremap() of main_buf");
		return 1;
	}
	*main_buf_p = p;
	return 0;
}
static void check_page(void *start)
{
	unsigned int *p = start;
	unsigned int *end = (unsigned int *) ((char *) start + page_size);

	while (p < end) {
		unsigned int val = *p;
		unsigned int exp_val;

		exp_val = (unsigned int) (p - (unsigned int *) start) + key;
		if (val != exp_val) {
			fprintf(stderr, "Bogus value %08x should be %08x\n",
				val, exp_val);
			exit(99);
		}
		p++;
	}
}
static int do_test(void)
{
	void *main_buf;
	int main_buf_size;
	int err, i;

	err = init_main_buf(&main_buf, &main_buf_size);
	if (err)
		return 1;

	/* Runs until check_page() finds corruption or mremap() fails. */
	while (1) {
		for (i = 0; i < MAX_OFFSET; i++) {
			fill_page(main_buf);
			err = remap_page(&main_buf, &main_buf_size, i + 1);
			if (err)
				goto out;
			check_page(main_buf);
			key++;
		}
	}
out:
	destroy_main_buf(main_buf, main_buf_size);
	return err;
}
int main(int argc, char **argv, char **envp)
{
	page_size = getpagesize();
	return do_test();
}
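The header comment talks about a change of "virtual color" when mremap() moves a page. As a rough illustration of what that means, the sketch below computes a color from a virtual address. The helper names cache_color() and color_changes() are made up for this note, and the cache geometry in the usage note (8 KB pages, a 16 KB direct-mapped D-cache) is an assumption, not something stated in the patch or the tester.

```c
#include <assert.h>

/*
 * Illustrative only: compute the "virtual color" of an address for a
 * virtually indexed cache. way_size is the cache size divided by its
 * associativity. Both helpers are hypothetical and are not part of
 * the tester or of the kernel patch.
 */
static unsigned long cache_color(unsigned long vaddr,
				 unsigned long page_size,
				 unsigned long way_size)
{
	assert(page_size > 0 && way_size >= page_size);
	/* Page number modulo the number of pages that fit in one way. */
	return (vaddr / page_size) % (way_size / page_size);
}

/* Nonzero if moving a page from old_addr to new_addr changes its color. */
static int color_changes(unsigned long old_addr, unsigned long new_addr,
			 unsigned long page_size, unsigned long way_size)
{
	return cache_color(old_addr, page_size, way_size) !=
	       cache_color(new_addr, page_size, way_size);
}
```

Under those assumed sizes there are two colors, so cache_color(0xf7d72000UL, 8192UL, 16384UL) is 1 and the next page, 0xf7d74000UL, has color 0. Moving the buffer by an odd number of pages would then change its color, which is the situation the tester tries to provoke on every pass.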
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: Christian Joensson @ 2006-06-02 8:04 UTC (permalink / raw)
To: sparclinux
On 6/2/06, David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Thu, 01 Jun 2006 17:52:58 -0700 (PDT)
>
> > I'll also post a little reproducer program that can be used to
> > accurately see if the patch really fixes the problem on your
> > machine or not, which should help with the variability of the
> > dpkg corruption case.
>
> As promised:
>
> /* mremap() stress tester for D-cache aliasing platforms.
> *
> * Copyright (C) 2006 David S. Miller (davem@davemloft.net)
> *
great, so if I compile with gcc and run it... I get this:
[chj@arnljot ~]$ gcc mremap-stress-tester.c
[chj@arnljot ~]$ ./a.out
unmapped region at 0xf7d72000
Initial main_buf at 0xf7d72000
Bogus value 00000001 should be 00000002
[chj@arnljot ~]$ uname -a
Linux arnljot 2.6.16-1.2128sp4 #1 Fri May 26 10:08:13 EDT 2006 sparc64
sparc64 sparc64 GNU/Linux
[chj@arnljot ~]$
which means?
--
Cheers,
/ChJ
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: David Miller @ 2006-06-02 8:20 UTC (permalink / raw)
To: sparclinux
From: "Christian Joensson" <christian.joensson@gmail.com>
Date: Fri, 2 Jun 2006 10:04:29 +0200
> On 6/2/06, David Miller <davem@davemloft.net> wrote:
> > From: David Miller <davem@davemloft.net>
> > Date: Thu, 01 Jun 2006 17:52:58 -0700 (PDT)
> >
> > > I'll also post a little reproducer program that can be used to
> > > accurately see if the patch really fixes the problem on your
> > > machine or not, which should help with the variability of the
> > > dpkg corruption case.
> >
> > As promised:
> >
> > /* mremap() stress tester for D-cache aliasing platforms.
> > *
> > * Copyright (C) 2006 David S. Miller (davem@davemloft.net)
> > *
>
> great, so if I compile with gcc and run it... I get this:
>
> [chj@arnljot ~]$ gcc mremap-stress-tester.c
> [chj@arnljot ~]$ ./a.out
> unmapped region at 0xf7d72000
> Initial main_buf at 0xf7d72000
> Bogus value 00000001 should be 00000002
> [chj@arnljot ~]$ uname -a
> Linux arnljot 2.6.16-1.2128sp4 #1 Fri May 26 10:08:13 EDT 2006 sparc64
> sparc64 sparc64 GNU/Linux
> [chj@arnljot ~]$
>
> which means?
It means the kernel you are running has the bug. Try rerunning
it with the kernel patch I posted; it should no longer produce
lines like:
Bogus value 00000001 should be 00000002
* Re: [PATCH]: Fix D-cache corruption in mremap() (was Re: 2.6.17-rc3+git
From: Meelis Roos @ 2006-06-02 13:17 UTC (permalink / raw)
To: sparclinux
> Anyways, give this patch below a test, it should fix things.
> It is against the current 2.6.17 tree.
>
> I'll also post a little reproducer program that can be used to
> accurately see if the patch really fixes the problem on your
> machine or not, which should help with the variability of the
> dpkg corruption case.
>
> Please test this, I need to know if this works for everyone
> or not.
Works on my Ultra 5 - without the patch I get the error at once and with
the patch I get no error for half an hour.
--
Meelis Roos (mroos@linux.ee)
* Re: [PATCH]: Fix D-cache corruption in mremap() (was Re: 2.6.17-rc3+git
From: Gustavo Zacarias @ 2006-06-02 15:31 UTC (permalink / raw)
To: sparclinux
David Miller wrote:
> I'll also post a little reproducer program that can be used to
> accurately see if the patch really fixes the problem on your
> machine or not, which should help with the variability of the
> dpkg corruption case.
>
> Please test this, I need to know if this works for everyone
> or not.
Works for me too on my u5.
On a side note, it seems to have fixed the ALSA oops I reported some time
ago that's missing from the archives.
To reproduce it, play an ogg file through ALSA output with ogg123
(vorbis-tools); usually when it finishes, or if you interrupt it, you'll get
the oops (at least with snd-sun-cs4231; I think ens1371 too, but I have
it out of my box at the moment).
--
Gustavo Zacarias
Gentoo/SPARC monkey
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: Daniel Smolik @ 2006-06-02 20:18 UTC (permalink / raw)
To: sparclinux
David Miller wrote:
> From: "Christian Joensson" <christian.joensson@gmail.com>
> Date: Fri, 2 Jun 2006 10:04:29 +0200
>
>
>>On 6/2/06, David Miller <davem@davemloft.net> wrote:
>>
>>>From: David Miller <davem@davemloft.net>
>>>Date: Thu, 01 Jun 2006 17:52:58 -0700 (PDT)
>>>
>>>
>>>>I'll also post a little reproducer program that can be used to
>>>>accurately see if the patch really fixes the problem on your
>>>>machine or not, which should help with the variability of the
>>>>dpkg corruption case.
It may be OK on my E250? The mremap stress program never returns; is that OK?
The problem with dselect is solved. In which kernel version will the
patch be included?
Many thanks
Dan
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: David Miller @ 2006-06-02 22:19 UTC (permalink / raw)
To: sparclinux
From: Daniel Smolik <marvin@mydatex.cz>
Date: Fri, 02 Jun 2006 22:18:29 +0200
> It may be OK on my E250? The mremap stress program never returns; is that OK?
> The problem with dselect is solved. In which kernel version will the
> patch be included?
The program just runs forever until it triggers the bug.
If it runs fine for more than a few seconds you are ok.
Thank you very much for testing.
I plan to put this into 2.6.17 and give it to the 2.6.16-stable folks
as well. If Chris W. or Greg KH are still doing 2.6.15-stable
releases I'll ask them to toss it into there too.
I'll try to take care of that over the weekend.
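Since the tester loops until it hits the bug, a time-limited run can be convenient on a patched kernel. The following is only a sketch of one way to bound such a loop: run_for() and count_pass() are hypothetical names, and count_pass() merely stands in for one fill/remap/check pass of the tester.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for one fill_page()/remap_page()/check_page() pass. */
static unsigned long passes;

static void count_pass(void)
{
	passes++;
}

/*
 * Call step() repeatedly until roughly 'seconds' seconds have elapsed.
 * step() always runs at least once. Returns the number of calls made.
 */
static unsigned long run_for(unsigned int seconds, void (*step)(void))
{
	time_t deadline;
	unsigned long iterations = 0;

	assert(step != NULL);
	deadline = time(NULL) + (time_t) seconds;
	do {
		step();
		iterations++;
	} while (time(NULL) < deadline);

	return iterations;
}
```

A call such as run_for(10, count_pass) would keep going for about ten seconds and then return, so a clean return could be read as "no corruption seen in that window". The resolution is whole seconds because time() is used; that is a simplification chosen for brevity.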
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: David Miller @ 2006-06-03 6:08 UTC (permalink / raw)
To: sparclinux
From: Meelis Roos <mroos@linux.ee>
Date: Fri, 2 Jun 2006 16:17:29 +0300 (EEST)
> Works on my Ultra 5 - without the patch I get the error at once and with
> the patch I get no error for half an hour.
Great, please watch for any dpkg database file corruption. That's the
reason all of this started :)
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: David Miller @ 2006-06-03 6:09 UTC (permalink / raw)
To: sparclinux
From: Gustavo Zacarias <gustavoz@gentoo.org>
Date: Fri, 02 Jun 2006 12:31:37 -0300
> On a side note, it seems to have fixed the ALSA oops I reported some time
> ago that's missing from the archives.
> To reproduce it, play an ogg file through ALSA output with ogg123
> (vorbis-tools); usually when it finishes, or if you interrupt it, you'll get
> the oops (at least with snd-sun-cs4231; I think ens1371 too, but I have
> it out of my box at the moment).
Glad to hear it fixes other bugs :)
Thanks for testing.
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: Jurij Smakov @ 2006-06-03 19:01 UTC (permalink / raw)
To: sparclinux
On Fri, 2 Jun 2006, David Miller wrote:
> Great, please watch for any dpkg database file corruption. That's the
> reason all of this started :)
I finally got around to reinstalling my machine and re-ran the same
upgrade/install which triggered the bug last time. With the patched kernel
everything worked fine. Thanks a lot for taking care of it so fast, Dave.
Best regards,
Jurij Smakov jurij@wooyd.org
Key: http://www.wooyd.org/pgpkey/ KeyID: C99E03CC
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: David Miller @ 2006-06-04 6:41 UTC (permalink / raw)
To: sparclinux
From: Jurij Smakov <jurij@wooyd.org>
Date: Sat, 3 Jun 2006 12:01:24 -0700 (PDT)
> On Fri, 2 Jun 2006, David Miller wrote:
>
> > Great, please watch for any dpkg database file corruption. That's the
> > reason all of this started :)
>
> I finally got around to reinstalling my machine and re-ran the same
> upgrade/install which triggered the bug last time. With the patched kernel
> everything worked fine. Thanks a lot for taking care of it so fast, Dave.
Thanks for testing.
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: Meelis Roos @ 2006-06-04 16:13 UTC (permalink / raw)
To: sparclinux
>> Works on my Ultra 5 - without the patch I get the error at once and with
>> the patch I get no error for half an hour.
>
> Great, please watch for any dpkg database file corruption. That's the
> reason all of this started :)
Seems to have survived a large debian update since the patch was in
place, so it seems to work. Thanks!
--
Meelis Roos (mroos@linux.ee)
* Re: [PATCH]: Fix D-cache corruption in mremap()
From: Marc Zyngier @ 2006-06-06 10:05 UTC (permalink / raw)
To: sparclinux
Just to add yet another testing point.
The latest git runs nicely on my oddest sparc64 system, a Netra AX1105
(a strange crossover between a Blade-100 and a Netra-X1).
Thanks a lot for fixing that, David.
M.
--
And if you don't know where you're going, any road will take you there...
* Re: [PATCH]: Fix D-cache corruption in mremap() (was Re: 2.6.17-rc3+git seems to corrupt dpkg databa
From: Ludovic Courtès @ 2006-06-09 15:23 UTC (permalink / raw)
To: sparclinux
Hi,
7 days, 14 hours, 26 minutes, 10 seconds ago,
David Miller wrote:
> Anyways, give this patch below a test, it should fix things.
> It is against the current 2.6.17 tree.
It seems to fix the bug (according to `mremap-stress' and with
2.6.17-rc6) on my U5 (I was hoping it would also somehow allow X.org 7
to run but unfortunately it doesn't ;-)).
Thanks!
Ludovic.