Linux NFS development
 help / color / mirror / Atom feed
From: Peter Staubach <staubach@redhat.com>
To: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Frank van Maarseveen <frankvm@frankvm.com>, linux-nfs@vger.kernel.org
Subject: Re: 2.6.24.3 kernel BUG at fs/nfs/pagelist.c:82
Date: Fri, 11 Apr 2008 16:10:42 -0400	[thread overview]
Message-ID: <47FFC5C2.1070009@redhat.com> (raw)
In-Reply-To: <1207944436.14621.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>

Trond Myklebust wrote:
> On Thu, 2008-04-10 at 13:54 +0200, Frank van Maarseveen wrote:
>   
>> FYI,
>>
>> On Thu, Mar 20, 2008 at 01:57:16PM +0100, Frank van Maarseveen wrote:
>>     
>>> On Thu, Mar 20, 2008 at 08:47:13AM -0400, Trond Myklebust wrote:
>>>       
>>>> On Wed, 2008-03-19 at 10:49 +0100, Frank van Maarseveen wrote:
>>>>         
>>>>> FYI,
>>>>>
>>>>> 2.6.24.3 wrote:
>>>>>           
>>>>>> kernel BUG at fs/nfs/pagelist.c:82!
>>>>>>             
>>>>> BUG_ON(PagePrivate(page));
>>>>>
>>>>>           
>>>>>> invalid opcode: 0000 [#1] SMP 
>>>>>> Modules linked in: vmnetfilter vmnet(P) vmmon(P) vmthrottle
>>>>>>             
>>>>> In addition, there are some NFS patches for handling >16 groups and
>>>>> selectively disabling attribute caching so its not a clean kernel.
>>>>>
>>>>>           
>>>>>> Pid: 4575, comm: tail Tainted: P        (2.6.24.3-x177 #1)
>>>>>>             
>>>                        ^^^^
>>>       
>>>>>> EIP: 0060:[<c01fc084>] EFLAGS: 00010202 CPU: 1
>>>>>> EIP is at nfs_create_request+0xf4/0x100
>>>>>> EAX: 80000821 EBX: e31a5300 ECX: 00000000 EDX: c1f0712c
>>>>>> ESI: c1f0712c EDI: e31a5338 EBP: e56dfd90 ESP: e56dfd74
>>>>>>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 
>>>>>> Process tail (pid: 4575, ti=e56de000 task=d4b65500 task.ti=e56de000) 
>>>>>> Stack: 00000000 f669ad20 cac3c168 e7330cb0 00000000 00000000 cac3c168 e56dfdc8 
>>>>>>        c01fded5 00000000 000000a4 039cffff 000000a4 c1f0712c cac3c168 e7330cb0 
>>>>>>        e56dfdb4 e56dfdb4 ffffff8c c1f0712c e7330cb0 e56dfdf0 c01fe8ce e56dfddc 
>>>>>> Call Trace:
>>>>>>  [<c010562a>] show_trace_log_lvl+0x1a/0x30
>>>>>>  [<c01056fa>] show_stack_log_lvl+0x9a/0xc0
>>>>>>  [<c01058a8>] show_registers+0xc8/0x1d0
>>>>>>  [<c0105b1c>] die+0x10c/0x230
>>>>>>  [<c0105cd1>] do_trap+0x91/0xd0  
>>>>>>  [<c0105f79>] do_invalid_op+0x89/0xa0
>>>>>>  [<c05bba02>] error_code+0x72/0x80
>>>>>>  [<c01fded5>] nfs_readpage_async+0xb5/0x1b0
>>>>>>  [<c01fe8ce>] nfs_readpage+0xae/0x120
>>>>>>  [<c0155268>] do_generic_mapping_read+0xe8/0x440
>>>>>>  [<c01558b0>] generic_file_aio_read+0x160/0x190
>>>>>>  [<c01f4fb7>] nfs_file_read+0x97/0xe0
>>>>>>  [<c01795e7>] do_sync_read+0xc7/0x120
>>>>>>  [<c01796c4>] vfs_read+0x84/0x130
>>>>>>  [<c01799fd>] sys_read+0x3d/0x70 
>>>>>>  [<c0104292>] syscall_call+0x7/0xb
>>>>>>  =======================
>>>>>> Code: 02 75 0a e8 4f dc 3b 00 e9 4a ff ff ff 83 c4 10 b8 00 fe ff ff 5b 5e 5f 5d c3 8b 56 0c e9 7a ff ff ff 0f 0b eb fe 90 0f 0b eb fe <0f> 0b eb fe 90 8d b4 26 00 00 00 00 55 89 e5 53 83 ec 04 89 c3 
>>>>>> EIP: [<c01fc084>] nfs_create_request+0xf4/0x100 SS:ESP 0068:e56dfd74
>>>>>> ---[ end trace 0ef921372ea6410b ]--- 
>>>>>>             
>>>>> The machine is a quad Xeon with 4GB ram with CONFIG_HIGHMEM64G=y
>>>>>           
>>>> Would that be on a file that was open for read and write, or is it
>>>> possible that some other process was writing to the same file? If so,
>>>> then it might be a bug in nfs_wb_page().
>>>>         
>>> Yes, I'm quite sure it was a "tail -f" on a logfile which gets
>>> continuously appended to by another process.. So, one process reads it
>>> while another one writes to it through different descriptors/struct file.
>>>       
>> The problem occurred again on a different box under exactly the same
>> userland conditions yielding exactly the same stack trace. Kernels are
>> identical but no vmware modules this time.
>>     
>
> Just a quick question: how does your > 16 groups patch behave when it is
> denied a write with an EACCES error? I've got a feeling that this may be
> due to the page getting redirtied and the RPC call retried. If so, then
> the following patch may help.
>
> Cheers
>   Trond
> --------------------------------------------------------------------------
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> Date: Fri, 11 Apr 2008 16:03:54 -0400
> Subject: NFS: Fix nfs_wb_page() to always exit with an error or a clean page
>
> It is possible for nfs_wb_page() to sometimes exit with 0 return value, yet
> the page is left in a dirty state.
> For instance in the case where the server rebooted, and the COMMIT request
> failed, then all the previously "clean" pages which were cached by the
> server, but were not guaranteed to have been writted out to disk,
> have to be redirtied and resent to the server.
> The fix is to have nfs_wb_page_priority() check that the page is clean
> before it exits...
>
>   

Is this a problem if the page is mmap'd into a process address
space which is constantly touching it?

    Thanx...

       ps

> This fixes a condition that triggers the BUG_ON(PagePrivate(page)) in
> nfs_create_request() when we're in the nfs_readpage() path.
>
> Also eliminate a redundant BUG_ON(!PageLocked(page)) while we're at it. It
> turns out that clear_page_dirty_for_io() has the exact same test.
>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
>
>  fs/nfs/write.c |   24 +++++++++++++-----------
>  1 files changed, 13 insertions(+), 11 deletions(-)
>
> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
> index ce40cad..3a2e821 100644
> --- a/fs/nfs/write.c
> +++ b/fs/nfs/write.c
> @@ -1493,18 +1493,20 @@ static int nfs_wb_page_priority(struct inode *inode, struct page *page,
>  	};
>  	int ret;
>  
> -	BUG_ON(!PageLocked(page));
> -	if (clear_page_dirty_for_io(page)) {
> -		ret = nfs_writepage_locked(page, &wbc);
> +	do {
> +		if (clear_page_dirty_for_io(page)) {
> +			ret = nfs_writepage_locked(page, &wbc);
> +			if (ret < 0)
> +				goto out_error;
> +		} else if (!PagePrivate(page))
> +			break;
> +		ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
>  		if (ret < 0)
> -			goto out;
> -	}
> -	if (!PagePrivate(page))
> -		return 0;
> -	ret = nfs_sync_mapping_wait(page->mapping, &wbc, how);
> -	if (ret >= 0)
> -		return 0;
> -out:
> +			goto out_error;
> +	/* Is the test for PageDirty() really needed here? */
> +	} while (PagePrivate(page) || PageDirty(page));
> +	return 0;
> +out_error:
>  	__mark_inode_dirty(inode, I_DIRTY_PAGES);
>  	return ret;
>  }
>
>   


  parent reply	other threads:[~2008-04-11 20:11 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-19  9:49 2.6.24.3 kernel BUG at fs/nfs/pagelist.c:82 Frank van Maarseveen
2008-03-20 12:47 ` Trond Myklebust
     [not found]   ` <1206017233.8465.7.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-03-20 12:57     ` Frank van Maarseveen
2008-04-10 11:54       ` Frank van Maarseveen
2008-04-11 20:07         ` Trond Myklebust
     [not found]           ` <1207944436.14621.6.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-04-11 20:10             ` Peter Staubach [this message]
2008-04-11 20:44               ` Trond Myklebust
     [not found]                 ` <1207946679.15646.29.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org>
2008-04-12 10:10                   ` Frank van Maarseveen
2008-04-12  9:42             ` Frank van Maarseveen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47FFC5C2.1070009@redhat.com \
    --to=staubach@redhat.com \
    --cc=Trond.Myklebust@netapp.com \
    --cc=frankvm@frankvm.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox