From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Dilger
Subject: Re: Memory corruption
Date: Thu, 15 Aug 2002 14:36:57 -0600
Sender: linux-fsdevel-owner@vger.kernel.org
Message-ID: <20020815203657.GY9642@clusterfs.com>
References: <1029443222.1334.23.camel@terrier2>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-fsdevel@vger.kernel.org
Return-path:
To: Dave Boutcher
Content-Disposition: inline
In-Reply-To: <1029443222.1334.23.camel@terrier2>
List-Id: linux-fsdevel.vger.kernel.org

On Aug 15, 2002 15:26 -0500, Dave Boutcher wrote:
> I'm chasing a wierd memory corruption problem on a ppc64 system. The
> first byte of a slab_t structure keeps getting stepped on (zeroed,
> actually.) This happens during a testcase that copies a large file
> called "junk" between file systems (a mix of ext2 and reiser) on a
> 2.4.13 kernel.

Well, I hate to say it, but 2.4.13 is a very old kernel. Also, since
reiserfs was fairly new to non-x86 architectures there may have been
significant fixes since then.

> C000000037008E00: FD8C0600 FE8C0600 FF8C0600 008D0600  < >
> C000000037008E10: 018D0600 028D0600 038D0600 048D0600  < >
> C000000037008E20: 058D0600 068D0600 078D0600 088D0600  < >
> C000000037008E30: 098D0600 0A8D0600 0B8D0600 0C8D0600  < >
> C000000037008E40: 0D8D0600 0E8D0600 0F8D0600 108D0600  < >
> C000000037008E50: 118D0600 128D0600 138D0600 148D0600  < >
> C000000037008E60: 158D0600 168D0600 178D0600 188D0600  < >
> C000000037008E70: 198D0600 1A8D0600 1B8D0600 1C8D0600  < >
> C000000037008E80: 1D8D0600 1E8D0600 1F8D0600 208D0600  < >
> C000000037008E90: 218D0600 228D0600 238D0600 248D0600
> C000000037008EA0: 258D0600 268D0600 278D0600 288D0600  <% & ' ( >
> C000000037008EB0: 298D0600 2A8D0600 2B8D0600 2C8D0600  <) * + , >
> C000000037008EC0: 2D8D0600 2E8D0600 2F8D0600 308D0600  <- . / 0 >
> C000000037008ED0: 318D0600 328D0600 338D0600 348D0600  <1 2 3 4 >
> C000000037008EE0: 358D0600 368D0600 378D0600 388D0600  <5 6 7 8 >
> C000000037008EF0: 398D0600 3A8D0600 3B8D0600 3C8D0600  <9 : ; < >
> C000000037008F00: 3D8D0600 3E8D0600 3F8D0600 408D0600  <= > ? @ >
> C000000037008F10: 418D0600 428D0600 438D0600 448D0600
> C000000037008F20: 458D0600 468D0600 478D0600 488D0600
> C000000037008F30: 498D0600 4A8D0600 4B8D0600 4C8D0600
> C000000037008F40: 4D8D0600 4E8D0600 4F8D0600 508D0600
> C000000037008F50: 518D0600 528D0600 538D0600 548D0600
> C000000037008F60: 558D0600 A4810000 01000000 0020F906
> C000000037008F70: 00000000 00000000 00000000 B377493D  < wI=>
> C000000037008F80: C377493D C377493D 907C0300 32000000  < wI= wI= | 2 >
> C000000037008F90: 01000000 01000000 02000000 40000400  < @ >
> C000000037008FA0: 02000000 00000000 01000000 38000400  < 8 >
> C000000037008FB0: 80F1A501 02000000 03000000 30000400  < 0 >
> C000000037008FC0: 6A756E6B 00000000 2E2E0000 00000000
> C000000037008FD0: 2E000000 00000000 ED4174F0 03000000  <. At >
> C000000037008FE0: 48000000 00000000 00000000 00000000
> C000000037008FF0: 91B2103D B377493D B377493D 01000000  < = wI= wI= >
>
> The byte immediately following that gets zeroed. It sure looks to me
> like someone is going over the end of a buffer.
>
> The question is, does anyone recognize that data structure?!?!?!

It doesn't look ext2-ish. The ext2 on-disk directory entries would have
a few bytes between "junk", "..", and "." (reclen, namelen, inode number),
and would be in the order ".", "..", and "junk" instead. The items are
also too small to be dentries or dirents from a readdir.

I don't know enough about reiserfs to say either way, but I would suggest
posting to their list also (be prepared again for the "your kernel is too
old" from them as well).

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/
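
Note on the ext2 comparison above: a minimal Python sketch of what an
ext2 directory block holding ".", ".." and "junk" would look like, for
contrast with the dump. It assumes the ext2_dir_entry_2 layout with the
filetype feature (32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit
file_type, name padded to 4 bytes, all little-endian). The inode numbers
and the helper names ext2_dirent and parse_dirents are hypothetical, and
each entry uses its minimum rec_len, whereas a real block stretches the
last entry to the end of the block.

```python
import struct

def ext2_dirent(inode, name, file_type=0):
    """Pack one ext2 directory entry with the minimum record length."""
    raw = name.encode("ascii")
    # 8-byte header plus the name, rounded up to a multiple of 4.
    rec_len = (8 + len(raw) + 3) & ~3
    header = struct.pack("<IHBB", inode, rec_len, len(raw), file_type)
    return (header + raw).ljust(rec_len, b"\0")

def parse_dirents(block):
    """Walk a directory block, yielding (inode, rec_len, name) tuples."""
    off = 0
    while off + 8 <= len(block):
        inode, rec_len, name_len, _ftype = struct.unpack_from("<IHBB", block, off)
        if rec_len < 8:
            break  # corrupt or empty record, stop walking
        name = block[off + 8:off + 8 + name_len].decode("ascii")
        yield inode, rec_len, name
        off += rec_len

# Hypothetical inode numbers; file_type 2 = directory, 1 = regular file.
block = (ext2_dirent(12, ".", file_type=2)
         + ext2_dirent(2, "..", file_type=2)
         + ext2_dirent(13, "junk", file_type=1))

for inode, rec_len, name in parse_dirents(block):
    print(inode, rec_len, repr(name))
```

In this layout each name is preceded by its own 8-byte header and the
entries run ".", "..", "junk", whereas the dump shows "junk", "..", "."
in bare 8-byte slots with no per-entry header between them.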