Re: [Bugme-new] [Bug 2019] New: Bug from the mm subsystem involving X (fwd)

From: "Martin J. Bligh" <mbligh@aracnet.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm mailing list <linux-mm@kvack.org>,
	kmannth@us.ibm.com
Subject: Re: [Bugme-new] [Bug 2019] New: Bug from the mm subsystem involving X  (fwd)
Date: Wed, 04 Feb 2004 16:12:38 -0800
Message-ID: <60330000.1075939958@flay>
In-Reply-To: <Pine.LNX.4.58.0402041539470.2086@home.osdl.org>

>> So there have been a lot of X issues with Red Hat and 2.6 kernels.  I managed to
>> get the system to panic and I decided it was time to open this bug.  I got this
>> on boot up.
> 
> Hmm. Compiler? Why would AS-3 in particular have problems?

I think it's more likely the combination of NUMA and X. People hardly
ever run X on the big servers ... Keith is just odd ;-)
 
>> Unable to handle kernel paging request at virtual address 0264d000
>>  printing eip:
>> c0147af4
>> *pde = 00000000
>> Oops: 0000 [#1]
>> CPU:    7
>> EIP:    0060:[<c0147af4>]    Not tainted
>> EFLAGS: 00013206
>> EIP is at remap_page_range+0x193/0x26c
>> eax: 0264d000   ebx: 000f5200   ecx: 00000001   edx: dad0fa80
>> esi: 001fe000   edi: d87c9ff0   ebp: f5200000   esp: d8835ee4
>> ds: 007b   es: 007b   ss: 0068
>> Process X (pid: 1285, threadinfo=d8834000 task=d9474ce0)
>> Stack: d961d580 001ff000 001ff000 40000000 f5002000 001fe000 d9578000 d961d580
>>        401ff000 d9576508 00000000 f5200000 d961d580 00000001 c0247055 d87d62c0
>>        401fe000 b5002000 00001000 00000027 d9388e80 00001000 c014a7fd d9388e80
>> Call Trace:
>>  [<c0247055>] mmap_mem+0x71/0xd4
>>  [<c014a7fd>] do_mmap_pgoff+0x362/0x70d
>>  [<c0156f65>] filp_open+0x67/0x69
>>  [<c0111c4d>] sys_mmap2+0x7a/0xaa
>>  [<c010aced>] sysenter_past_esp+0x52/0x71
>> 
>> Code: 8b 00 a9 00 08 00 00 74 10 89 d8 8b 54 24 4c c1 e8 14 09 ea
> 
> This _seems_ to be the code
> 
> 		...
>                 if (!pfn_valid(pfn) || PageReserved(pfn_to_page(pfn)))
>                         set_pte(pte, pfn_pte(pfn, prot));
> 		...
> 
> in particular, it disassembles to
> 
> 	0x8048490 <insn>:       mov    (%eax),%eax
> 	0x8048492 <insn+2>:     test   $0x800,%eax
> 	0x8048497 <insn+7>:     je     0x80484a9
> 	0x8048499 <insn+9>:     mov    %ebx,%eax
> 	0x804849b <insn+11>:    mov    0x4c(%esp,1),%edx
> 	0x804849f <insn+15>:    shr    $0x14,%eax
> 
> which seems to be the "PageReserved(pfn_to_page(pfn))" test.
> 
> This implies that you have either:
>  - a buggy "pfn_valid()" macro (do you use CONFIG_DISCONTIGMEM?)

Yup.
#define pfn_valid(pfn)          ((pfn) < num_physpages)

Which is wrong. There's even a comment above it that says:

/*
 * pfn_valid should be made as fast as possible, and the current definition
 * is valid for machines that are NUMA, but still contiguous, which is what
 * is currently supported. A more generalised, but slower definition would
 * be something like this - mbligh:
 * ( pfn_to_pgdat(pfn) && ((pfn) < node_end_pfn(pfn_to_nid(pfn))) )
 */

;-)

Which I still don't think is correct, as there's a hole in the middle of
node 0 ... I'll make up a new patch somehow and give it to Keith to test ;-)

Thanks,

M.
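
For readers following along, here is a minimal user-space sketch of the
three candidate checks discussed above. It is not the kernel code and not
the patch mentioned in the message. The memory map, the names
pfn_valid_flat(), pfn_valid_node_span(), pfn_valid_ranges() and
model_num_physpages(), and the choice to model num_physpages as one past
the highest present pfn are all assumptions made for illustration. The
test pfn 0xf5200 is taken from ebx in the register dump, assuming that
register held the pfn at the time of the fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One contiguous run of present page frames, [start, end), owned by node nid. */
struct pfn_range {
    unsigned long start;
    unsigned long end;
    int nid;
};

/*
 * Hypothetical memory map, NOT taken from the thread: node 0 has a hole
 * at [0xe0000, 0x100000). Entries are sorted by start and do not overlap.
 */
static const struct pfn_range present[] = {
    { 0x000000, 0x0e0000, 0 },
    { 0x100000, 0x180000, 0 },
    { 0x180000, 0x300000, 1 },
};
#define NR_RANGES (sizeof(present) / sizeof(present[0]))
#define MAX_NODES 2

/* Stand-in for num_physpages: modelled as one past the highest present pfn. */
static unsigned long model_num_physpages(void)
{
    assert(NR_RANGES > 0);
    return present[NR_RANGES - 1].end;
}

/* The current definition: only right when memory is one contiguous block. */
static bool pfn_valid_flat(unsigned long pfn)
{
    return pfn < model_num_physpages();
}

/* Compute the lowest start and highest end of all ranges owned by nid. */
static bool node_span(int nid, unsigned long *start, unsigned long *end)
{
    bool found = false;

    for (size_t i = 0; i < NR_RANGES; i++) {
        if (present[i].nid != nid)
            continue;
        if (!found || present[i].start < *start)
            *start = present[i].start;
        if (!found || present[i].end > *end)
            *end = present[i].end;
        found = true;
    }
    return found;
}

/*
 * Rough shape of the definition suggested in the source comment: find a
 * node whose span covers the pfn. Holes inside a node's span still pass.
 */
static bool pfn_valid_node_span(unsigned long pfn)
{
    for (int nid = 0; nid < MAX_NODES; nid++) {
        unsigned long start = 0, end = 0;

        if (node_span(nid, &start, &end) && pfn >= start && pfn < end)
            return true;
    }
    return false;
}

/* Hole-aware check: the pfn must fall inside a range that is really present. */
static bool pfn_valid_ranges(unsigned long pfn)
{
    for (size_t i = 0; i < NR_RANGES; i++)
        if (pfn >= present[i].start && pfn < present[i].end)
            return true;
    return false;
}
```

With this made-up map, pfn 0xf5200 passes both pfn_valid_flat() and
pfn_valid_node_span() but fails pfn_valid_ranges(), which is the kind of
hole the last paragraph of the message worries about. A linear scan like
this would be far too slow for a real pfn_valid(); it only shows the
difference between the three conditions.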
