From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jens Axboe
Subject: Re: [Bug #13112] Oops in drain_array
Date: Tue, 28 Apr 2009 20:22:48 +0200
Message-ID: <20090428182248.GL4593@kernel.dk>
References: <20090428171139N.fujita.tomonori@lab.ntt.co.jp>
	<20090428234512P.fujita.tomonori@lab.ntt.co.jp>
Mime-Version: 1.0
Return-path:
Content-Disposition: inline
In-Reply-To: <20090428234512P.fujita.tomonori-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Sender: kernel-testers-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: FUJITA Tomonori
Cc: mmx-G/jkD+u3s4s@public.gmane.org,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
	cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org,
	penberg-bbCR+/B0CizivPeTLB3BmA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	kernel-testers-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	rjw-KKrjLPT3xs0@public.gmane.org,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org

On Tue, Apr 28 2009, FUJITA Tomonori wrote:
> On Tue, 28 Apr 2009 14:43:37 +0200 (CEST)
> Bart wrote:
>
> > > On Mon, 27 Apr 2009 13:36:46 -0700 (PDT)
> > > David Rientjes wrote:
> > >
> > >> On Mon, 27 Apr 2009, Bart wrote:
> > >>
> > >>> After turning the suggested debuging options I've got tons of these when
> > >>> trying to stress the tape device like before:
> > >>>
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446708] slab error in verify_redzone_free(): cache `size-128': memory outside object was overwritten
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446713] Pid: 0, comm: swapper Not tainted 2.6.29.1-64 #2
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446715] Call Trace:
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446717] [] __slab_error+0x1f/0x25
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446728] [] cache_free_debugcheck+0x108/0x1d6
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446731] [] kfree+0x81/0xc2
> > >>> Apr 27 16:57:30 fs kernel: [ 96.446735] [] bio_free_map_data+0xc/0x1e
> > >>
> > >> This appears to be kfree(bmd->iovecs) in bio_free_map_data().  It looks
> > >> like the memcpy size in bio_set_map_data() overrides the kmalloc size; in
> > >> other words, for a redzone error, bio->bi_vcnt > nr_pages in
> > >> bio_copy_user_iov().
> > >
> > > Can you try this?
> > >
> > > diff --git a/fs/bio.c b/fs/bio.c
> > > index 7bbc98f..6a09356 100644
> > > --- a/fs/bio.c
> > > +++ b/fs/bio.c
> > > @@ -817,6 +817,9 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
> > >  		len += iov[i].iov_len;
> > >  	}
> > >  
> > > +	if (offset)
> > > +		nr_pages += 1;
> > > +
> > >  	bmd = bio_alloc_map_data(nr_pages, iov_count, gfp_mask);
> > >  	if (!bmd)
> > >  		return ERR_PTR(-ENOMEM);
> > >
> >
> > There are no more errors in the dmesg after applying this patch to
> > 2.6.29.2.
> >
> > Without this patch I can reproduce this kind of errors on
> > 2.6.29.1, 2.6.29.2.
> >
> > I've not tested this patch with 2.6.29.1 and 2.6.30rc3-git3.
> > I will try to reproduce the error on 2.6.30rc3-git3 as soon as I compile
> > it.
>
> Thanks for testing! And very sorry about the bug.
>
> I'm sure that you hit the same bug with 2.6.30-rc3-git.
>
> Jens, can you please apply this against 2.6.30-rc (and we need this
> for 2.6.29.x too)?
>
> I know that bio_copy_user_iov() is hacky. I'll try to clean up the
> mapping API later.

I'll apply it for 2.6.30-rc and CC stable. bio_copy_user_iov() is indeed
not pretty, both the API and the implementation needs looking at...

-- 
Jens Axboe
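
The off-by-one that the quoted patch addresses can be illustrated with a
small userspace sketch. This is not the kernel code: pages_spanned(),
sketch_self_check() and SKETCH_PAGE_SIZE are hypothetical names, a
4096-byte page is assumed, and "offset" is assumed to be a byte offset
into the first page of the buffer being filled. The point is only that
the same length can touch one more page when it does not start on a page
boundary, so a page count computed without the offset can come out one
short.

```c
#include <assert.h>

/* Assumption for this sketch only: 4096-byte pages. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: number of pages touched by 'len' bytes that
 * start at byte position 'start'. Rounds the end up to a page
 * boundary and the start down, then takes the difference.
 */
unsigned long pages_spanned(unsigned long start, unsigned long len)
{
	unsigned long first = start / SKETCH_PAGE_SIZE;
	unsigned long end = (start + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	return end - first;
}

/*
 * Hypothetical self-check: one page of data needs one page when it is
 * page-aligned, but two pages once it starts 512 bytes into a page.
 */
void sketch_self_check(void)
{
	unsigned long len = SKETCH_PAGE_SIZE;

	assert(pages_spanned(0, len) == 1);	/* aligned start */
	assert(pages_spanned(512, len) == 2);	/* nonzero offset: one extra page */
}
```

If an array is sized from the aligned count while the data is laid out
from a nonzero offset, the copy covers one element more than was
allocated, which is the kind of overrun a redzone check reports.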