Re: Fw: 2.6.17 oops, possibly ntfs/mmap related

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Andrew Morton <akpm@osdl.org>
To: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: Jonathan Woithe <jwoithe@physics.adelaide.edu.au>,
	linux-kernel@vger.kernel.org
Subject: Re: Fw: 2.6.17 oops, possibly ntfs/mmap related
Date: Thu, 21 Sep 2006 10:52:36 -0700	[thread overview]
Message-ID: <20060921105236.76e344a2.akpm@osdl.org> (raw)
In-Reply-To: <1158849696.5958.39.camel@imp.csi.cam.ac.uk>

On Thu, 21 Sep 2006 15:41:36 +0100
Anton Altaparmakov <aia21@cam.ac.uk> wrote:

> Hi,
> 
> On Thu, 2006-09-21 at 10:54 +0100, Anton Altaparmakov wrote:
> > On Tue, 2006-09-12 at 20:56 -0700, Andrew Morton wrote:
> > Andrew, thanks for forwarding me the message...
> > > Begin forwarded message:
> > > 
> > > We have a machine which is currently making heavy use of a usb hard disc
> > > formatted with ntfs.  There have been two occasions where the kernel has
> > > oopsed while this disc was being accessed heavily.  Before adding this HDD
> > > the machine in question was rock solid which leads me to think that it
> > > might be related to ntfs.  USB drives formatted with other filesystems do
> > > not appear to suffer from this problem.
> 
> I have now seen such an oops too with 2.6.18 kernel.

I assume it is a once-off?

>  Note no NTFS file
> systems were mounted at the time (but I had an NTFS file system mounted
> earlier in the day).
> 
> The oops is caused by kswapd0 kernel thread, the stack trace is:
> 
> Call Trace:
>  [<c10470a3>] shrink_inactive_list+0x46b/0x790
>  [<c104747c>] shrink_zone+0xb4/0xd3
>  [<c104797d>] kswapd+0x2de/0x3cf
>  [<c102c18e>] kthread+0xc2/0xf0
>  [<c1000bf1>] kernel_thread_helper+0x5/0xb
> DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
> Leftover inexact backtrace:
>  [<c1003e6c>] show_stack_log_lvl+0x8c/0x97
>  [<c1003fc8>] show_registers+0x151/0x1c6
>  [<c10041af>] die+0x172/0x27b
>  [<c145f22c>] do_page_fault+0x42c/0x4f9
>  [<c10037dd>] error_code+0x39/0x40
>  [<c10470a3>] shrink_inactive_list+0x46b/0x790
>  [<c104747c>] shrink_zone+0xb4/0xd3
>  [<c104797d>] kswapd+0x2de/0x3cf
>  [<c102c18e>] kthread+0xc2/0xf0
>  [<c1000bf1>] kernel_thread_helper+0x5/0xb
> 
> And the EIP is at fs/buffer.c::try_to_release_page() the code of which
> is here:
> 
> int try_to_release_page(struct page *page, gfp_t gfp_mask)
> {
>         struct address_space * const mapping = page->mapping;
> 
>         BUG_ON(!PageLocked(page));
>         if (PageWriteback(page))
>                 return 0;
> 
>         if (mapping && mapping->a_ops->releasepage)
> 
> ^^^ bug happens here when the value of mapping->a_ops is used to obtain
> mapping->a_ops->releasepage
> 
>                 return mapping->a_ops->releasepage(page, gfp_mask);
>         return try_to_free_buffers(page);
> }
> 
> This bug seems to suggest that there is a page which the kernel is
> trying to release private data which has page->mapping set to a valid
> value and page->mapping->a_ops apparently set to an invalid value and
> when page->mapping->a_ops->releasepage is dereferenced it causes an oops
> with the kernel saying:
> 
> BUG: unable to handle kernel paging request at virtual address 020030d2
> 
> The values of the relevant variables from the oops are:
> 
> page = 0xc2248fa0
> page->mapping = 0xe3a79eac
> page->mapping->a_ops = 0x020030aa

I wonder if page->mapping really wanted to be 0xc3a79eac, only something
set bit 29.

> Note that 0x020030aa+0x28 = 020030d2 which is the oops causing address
> and 0x28 is the offset of the releasepage function pointer in the
> address space operations structure...
> 
> This oops is not identical to the oopses pointed out by Jonathan at:
> 
> http://www.atrad.com.au/~jwoithe/kernel/oopses-20060913.txt
> 
> But those oopses have to do with pages also so could be related...

Looks a bit different - Jonathan appears to have pulled a bad page* out
of the radix tree whereas you got your page off the LRU.

> Anyone have any ideas how a page can end up in such a weird state?

Nope.

next prev parent reply	other threads:[~2006-09-21 17:52 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20060912205602.57568b2a.akpm@osdl.org>
2006-09-21  9:54 ` Fw: 2.6.17 oops, possibly ntfs/mmap related Anton Altaparmakov
2006-09-21 14:41   ` Anton Altaparmakov
2006-09-21 17:52     ` Andrew Morton [this message]
2006-09-21 19:04       ` Hugh Dickins
2006-09-21 19:24         ` Dave Jones
2006-09-22  7:17           ` Jonathan Woithe
2006-09-22 15:40             ` Dave Jones
2006-09-24 23:49               ` Jonathan Woithe
2006-09-22  6:59       ` Anton Altaparmakov
2006-09-22  7:23         ` Jonathan Woithe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060921105236.76e344a2.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=aia21@cam.ac.uk \
    --cc=jwoithe@physics.adelaide.edu.au \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox