Re: Oops with 4GB memory setting in 2.4.0 stable

* Re: Oops with 4GB memory setting in 2.4.0 stable
@ 2001-01-16 13:33 Petr Vandrovec
  2001-01-16 20:17 ` Urban Widmark
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Vandrovec @ 2001-01-16 13:33 UTC (permalink / raw)
  To: Urban Widmark; +Cc: linux-kernel, rmager

On 16 Jan 01 at 9:40, Urban Widmark wrote:
> On Tue, 16 Jan 2001, Rainer Mager wrote:
> 
> > Hi all,
> >
> >   I have a 100% reproducible bug in all of the 2.4.0 kernels including the
> > latest stable one. The issue is that if I compile the kernel to support 4GB
> > RAM (I have 1 GB) and then try to access a samba mount I get an oops. This
> 
> I'll have a look tonight or so. It works for you on non-bigmem?
> 
> > ALWAYS happens. Usually after this the system is frozen (although the magic
> > SYSREQ still works). If the system isn't frozen then any commands that
> > access the disk will freeze. Fortunately GPM worked and I was able to paste
> > the oops to a file via telnet.
> 
> smb_rename suggests mv, but the process is ls ... er? What commands were
> you running on smbfs when it crashed?
> 
> Could this be a symbol mismatch? Keith Owens suggested a less manual way
> to get module symbol output. Do you get the same results using that?

smb_get_dircache looks suspicious to me, as it can try to map an unlimited
number of pages with kmap, and kmaps are not an unlimited resource.
You have 512 kmaps, but one SMBFS cache page can reference about 504
pages, so two smbfs cached directories can consume all your kmaps,
and then you die in an endless loop in mm/highmem.c:map_new_virtual().
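
A minimal sketch of the arithmetic above, written as a userspace toy model rather than kernel code. Only the figures 512 and 504 come from the message; the names toy_kmap_pool, toy_kmap and toy_map_directory are hypothetical, and the toy returns an error where the real code is described as looping.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the message; everything else is a toy model. */
#define TOY_KMAP_SLOTS     512  /* total kmap slots available */
#define TOY_PAGES_PER_DIR  504  /* pages one cached directory may pin */

struct toy_kmap_pool {
    int used;  /* slots currently mapped and not yet unmapped */
};

/*
 * Take one slot. Returns 0 on success, -1 when the pool is exhausted.
 * The toy fails instead of spinning so that the shortage is visible.
 */
static int toy_kmap(struct toy_kmap_pool *pool)
{
    assert(pool != NULL);
    if (pool->used >= TOY_KMAP_SLOTS)
        return -1;
    pool->used++;
    return 0;
}

/*
 * Pin every page of one cached directory, as a dircache that keeps all
 * of its pages mapped would. Returns how many pages could NOT be mapped.
 */
static int toy_map_directory(struct toy_kmap_pool *pool)
{
    int failed = 0;

    for (int i = 0; i < TOY_PAGES_PER_DIR; i++)
        if (toy_kmap(pool) != 0)
            failed++;
    return failed;
}
```

Calling toy_map_directory() twice on a fresh pool shows the problem: the first directory fits, the second finds only 8 free slots and fails on the remaining 496 pages.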

Also, smb_add_to_cache looks suspicious:

cachep->idx++;
if (cachep->idx > NINDEX) goto out_full;

Can't idx grow without limit, since it is incremented before the check?

get_block:
  cachep->pages++;
  ...
  if (page) {
    block = kmap(page);
    ...
  }

Shouldn't you increment cachep->pages only if grab_cache_page
succeeded? Otherwise smb_find_in_cache can find a NULL
index->block, which then oopses...

smb_find_in_cache should check for index->block == NULL anyway, as
smb_get_dircache can return a couple of index->block == NULL entries when
the system decides to throw out one of the cache pages connected to the
directory.
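
The two suggestions above can be sketched as follows. This is a userspace toy, not the smbfs code: toy_cache, toy_add_block, toy_find_block, toy_evict and the capacity TOY_NINDEX are all hypothetical, and the `page` argument merely stands in for the result of grab_cache_page().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NINDEX 4  /* hypothetical index capacity */

struct toy_index {
    void *block;  /* mapped cache page, or NULL once it has been evicted */
};

struct toy_cache {
    int pages;                          /* number of slots in use */
    struct toy_index index[TOY_NINDEX];
};

/*
 * Add one block. A NULL page means the allocation failed. The bound is
 * checked before anything is modified, and the counter advances only
 * after the page is known to be good.
 * Returns 0 on success, -1 if the cache is full or the page is missing.
 */
static int toy_add_block(struct toy_cache *c, void *page)
{
    assert(c != NULL);
    if (c->pages >= TOY_NINDEX)
        return -1;  /* full: leave the counter untouched */
    if (page == NULL)
        return -1;  /* allocation failed: do not count it */
    c->index[c->pages].block = page;
    c->pages++;
    return 0;
}

/* Simulate the system throwing out one cached page. */
static void toy_evict(struct toy_cache *c, int i)
{
    assert(c != NULL && i >= 0 && i < c->pages);
    c->index[i].block = NULL;
}

/*
 * Look up block i. Returns NULL for out-of-range slots and for slots
 * whose page has gone away; the caller must test the result before use.
 */
static void *toy_find_block(const struct toy_cache *c, int i)
{
    assert(c != NULL);
    if (i < 0 || i >= c->pages)
        return NULL;
    return c->index[i].block;
}
```

With this shape a failed allocation leaves pages unchanged, and a lookup after an eviction returns NULL for the caller to handle instead of being dereferenced.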

But I personally use neither smbfs nor PAE, so what can I say...
                                            Best regards,
                                                    Petr Vandrovec
                                                    vandrove@vc.cvut.cz

BTW: For ncpfs PAE testing I was using a patch which required kmap() for
all memory above 32MB... It was very educational...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

* Re: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-16 13:33 Oops with 4GB memory setting in 2.4.0 stable Petr Vandrovec
@ 2001-01-16 20:17 ` Urban Widmark
  0 siblings, 0 replies; 15+ messages in thread
From: Urban Widmark @ 2001-01-16 20:17 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: linux-kernel, rmager

On Tue, 16 Jan 2001, Petr Vandrovec wrote:

> smb_get_dircache looks suspicious to me, as it can try to map unlimited
> number of pages with kmap. And kmaps are not unlimited resource...
> You have 512 kmaps, but one SMBFS cache page can contain about 504
> pages... So two smbfs cached directories can consume all your kmaps,
> dying then in endless loop in mm/highmem.c:map_new_virtual().

The smbfs dircache needs to find/kmap all of its cache pages, both because
the entries in it are variable length and because of the way it is called.
It would be nice to change that.

I haven't looked at all your detailed comments yet. They may not matter if
the many kmaps are a problem.

The ncpfs code puts 'struct dentry *' in its cache pages. Fixed-size
entries make it easy to know which page you need to start reading from,
so only one kmap is needed. That looks simpler so I want to steal it,
except ...

ncpfs ends up calling d_validate to verify that the dentry is sane. But
how can it know that the dentry is the right one? I thought that dentries
could be removed/reused by someone at will (d_count will be 0 because of
the dput in ncp_fill_cache, no?). Why isn't it possible for someone to
write a new dentry where the old one was?

fs/ncpfs/dir.c:ncp_d_validate() calls
  valid = d_validate(dentry, dentry->d_parent, dentry->d_name.hash, len);

all values are taken from the dentry pointer on the cache page (including
len). d_validate verifies that d_hash() points to a list and it searches
the list for dentry. How do you know that it is the same dentry that was
put in the cache and not someone else's dentry?

> But I personally do not use neither smbfs nor PAE, so what I can say...

A whole lot, thanks. Especially for the kmap info.

Now if someone could explain the dentry pointers ... what am I missing?

/Urban

* Re: Oops with 4GB memory setting in 2.4.0 stable
@ 2001-01-16 22:29 Petr Vandrovec
  2001-01-16 22:38 ` Urban Widmark
  2001-01-16 22:42 ` Rainer Mager
  0 siblings, 2 replies; 15+ messages in thread
From: Petr Vandrovec @ 2001-01-16 22:29 UTC (permalink / raw)
  To: Urban Widmark; +Cc: linux-kernel, rmager

On 16 Jan 01 at 21:17, Urban Widmark wrote:
> The smbfs dircache needs to find/kmap all of its cache pages since the
> entries in it are variable length and the way it is called. It would be
> nice to change that.
> 
> I haven't looked at all your detailed comments yet. They may not matter if
> the many kmaps are a problem.

I think that too many kmaps could explain the reported 'silent hang'... (if
my memory serves, there was some report about a silent PAE hang during
the last 7 days, yes?). Not checking ->block for NULL looks like a bug which
can be triggered without kmap too.
 
> how can it know that the dentry is the right one? I thought that dentries
> could be removed/reused by someone at will (d_count will be 0 because of
> the dput in ncp_fill_cache, no?). Why isn't it possible for someone to
> write a new dentry where the old one was?
> 
> fs/ncpfs/dir.c:ncp_d_validate() calls
>   valid = d_validate(dentry, dentry->d_parent, dentry->d_name.hash, len);
> 
> all values are taken from the dentry pointer on the cache page (including
> len). d_validate verifies that d_hash() points to a list and it searches
> the list for dentry. How do you know that it is the same dentry that was
> put in the cache and not someone else's dentry?

Before calling d_validate it checks whether dentry->d_parent == parent
(the readdir-ed directory). And if the dentry is in the directory we read,
it is in the dentry d_hash, and even d_fsdata matches its position in
the directory, I bet that it is a valid dentry...

If there is a new dentry, which is at the fpos position, and it is a child
of the readdir-ed directory, we should return it anyway, no? There must not
be two ncpfs dentries with the same d_parent and d_fsdata if d_fsdata != 0,
as each dentry can be in only one directory.

This looked like a reasonable limitation to me ;-)
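
The three conditions described above can be sketched in a userspace toy. This is not the ncpfs or dcache code: toy_dentry, toy_on_chain and toy_cached_dentry_ok are hypothetical names, the hash chain is a plain singly linked list, and the order of the checks is this sketch's own (the chain walk comes first so that a stale pointer is never dereferenced).

```c
#include <assert.h>
#include <stddef.h>

struct toy_dentry {
    struct toy_dentry *d_parent;   /* directory this entry lives in */
    unsigned long d_fsdata;        /* position in the directory, 0 = not cached */
    struct toy_dentry *hash_next;  /* next entry on the same hash chain */
};

/* Report whether d is on the chain, comparing pointers only. */
static int toy_on_chain(const struct toy_dentry *head,
                        const struct toy_dentry *d)
{
    for (; head != NULL; head = head->hash_next)
        if (head == d)
            return 1;
    return 0;
}

/*
 * Accept a cached dentry pointer only if it is still hashed, is a child
 * of the directory being read, and records the expected position.
 * Returns 1 if the pointer may be used, 0 otherwise.
 */
static int toy_cached_dentry_ok(const struct toy_dentry *chain,
                                const struct toy_dentry *d,
                                const struct toy_dentry *parent,
                                unsigned long fpos)
{
    if (!toy_on_chain(chain, d))
        return 0;  /* stale pointer: no longer hashed */
    if (d->d_parent != parent)
        return 0;  /* belongs to some other directory */
    if (fpos == 0 || d->d_fsdata != fpos)
        return 0;  /* not the entry cached at this position */
    return 1;
}
```

If a different dentry has been allocated at the same address, it passes only when it is also a child of the same directory at the same position, which is the case the message argues should be returned anyway.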
                                            Best regards,
                                                Petr Vandrovec
                                                vandrove@vc.cvut.cz
                                                
* Re: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-16 22:29 Petr Vandrovec
@ 2001-01-16 22:38 ` Urban Widmark
  2001-01-16 22:42 ` Rainer Mager
  1 sibling, 0 replies; 15+ messages in thread
From: Urban Widmark @ 2001-01-16 22:38 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: linux-kernel, rmager

On Tue, 16 Jan 2001, Petr Vandrovec wrote:

> If there is new dentry, which is at fpos position, and it is child of
> readdir-ed directory, we should return it anyway, no? There must not be
> two ncpfs dentries with same d_parent and d_fsdata if d_fsdata != 0,
> as each dentry can be in only one directory.
>
> This looked as reasonable limitation to me ;-)

Right. I chose not to read those tests for some reason ... good.

The parent test should be ok. d_fsdata is only set in ncpfs if it is put
in the cache and d_alloc sets it to 0. Works for me (whatever that may be
worth).

Rewriting the smbfs cache code allows for a nice speedup too.

In ncpfs when reading a directory you create dentries and inodes at once.
I assume that when reading the dir list from the server that you get all
the info you need in one go.

I think smbfs gets all needed info on all protocol versions it supports,
so that should be a nice speedup for readdir() + stat()-each file (ls -l).
Currently it only caches name info and then does a remote call for each
entry.

Too bad this is 2.4.0, the biggest problem may be sneaking it past Linus.
oh well ... :)

/Urban

* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-16 22:29 Petr Vandrovec
  2001-01-16 22:38 ` Urban Widmark
@ 2001-01-16 22:42 ` Rainer Mager
  1 sibling, 0 replies; 15+ messages in thread
From: Rainer Mager @ 2001-01-16 22:42 UTC (permalink / raw)
  To: Urban Widmark, linux-kernel

Hi again,

	It looks like some progress is being made, *wonderful*. As to some earlier
questions...


> I'll have a look tonight or so. It works for you on non-bigmem?

Yes. Absolutely no problems on non-bigmem.


> smb_rename suggests mv, but the process is ls ... er? What commands were
> you running on smbfs when it crashed?

It seems that ANY access to the smbfs has this effect. Definitely confirmed
are: ls, tab completion from bash, cat [some file], and usually df.

>
> Could this be a symbol mismatch? Keith Owens suggested a less manual way
> to get module symbol output. Do you get the same results using that?

I'll try to do this and report back.



--Rainer

* FS corruption on 2.4.0-ac8
@ 2001-01-15 22:47 Jure Pecar
  2001-01-15 23:31 ` Oops with 4GB memory setting in 2.4.0 stable Rainer Mager
  0 siblings, 1 reply; 15+ messages in thread
From: Jure Pecar @ 2001-01-15 22:47 UTC (permalink / raw)
  To: linux-kernel

Hi all,

I was running 2.4.0test10pre5 happily for months and wanted to see how
things stand in the 'latest stuff'. Here's what I found:

I compiled 2.4.0-ac8 with nearly the same .config as test10pre5 (with
latest gcc on rh7). Then I booted it and used X for some normal browsing
and mp3s. Performance was poor, responsiveness also, even the mouse
stopped responding for a couple of seconds at a time, a lot of disk
thrashing & so on. I decided to boot test10 back, and there was a nasty
surprise: fsck found filesystem with errors, and LOTS of them ... I had
to hold down 'y' for almost 5 minutes ... :)

Then I examined the logs for what would be the cause for this ... and
here's what 2.4.0-ac8 left in the logs:

Jan 14 16:26:47 open kernel: ee_blocks: Freeing blocks not in datazone -
block = 979727457, count = 1
Jan 14 16:26:47 open kernel: EXT2-fs error (device md(9,1)):
ext2_free_blocks: Freeing blocks not in datazone - block = 1769096736,
count = 1
Jan 14 16:26:47 open kernel: EXT2-fs error (device md(9,1)):
ext2_free_blocks: Freeing blocks not in datazone - block = 842080300,
count = 1
Jan 14 16:26:47 open kernel: EXT2-fs error (device md(9,1)):
ext2_free_blocks: Freeing blocks not in datazone - block = 1851869728,
count = 1
Jan 14 16:26:47 open kernel: EXT2-fs error (device md(9,1)):
ext2_free_blocks: Freeing blocks not in datazone - block = 808464928,
count = 1
...
and so on for about 150 such lines in 3 seconds.

There is something not that usual about my setup: I run raid1 /boot and
raid5 root with one disk disconnected (it's simply too loud...), so the
array is in degraded mode all the time. Other hardware is more or less
standard, p200 classic, 430vx board, adaptec2940u, 64mb ram.

Is this a known problem? If it's not, please advise me on how to provide
more useful information.

-- 

Jure Pecar

* Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 22:47 FS corruption on 2.4.0-ac8 Jure Pecar
@ 2001-01-15 23:31 ` Rainer Mager
  2001-01-15 21:47   ` Marcelo Tosatti
  2001-01-16  8:40   ` Urban Widmark
  0 siblings, 2 replies; 15+ messages in thread
From: Rainer Mager @ 2001-01-15 23:31 UTC (permalink / raw)
  To: linux-kernel

Hi all,

	I have a 100% reproducible bug in all of the 2.4.0 kernels including the
latest stable one. The issue is that if I compile the kernel to support 4GB
RAM (I have 1 GB) and then try to access a samba mount I get an oops. This
ALWAYS happens. Usually after this the system is frozen (although the magic
SYSREQ still works). If the system isn't frozen then any commands that
access the disk will freeze. Fortunately GPM worked and I was able to paste
the oops to a file via telnet.

	Attached is my oops.txt and the result sent through ksymoops. The results
don't look particularly useful to me so perhaps I'm doing something wrong.
PLEASE tell me if I should parse this differently. Likewise, if there is
anything else I can do to help debug this, please tell me.

--Rainer

* Re: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 23:31 ` Oops with 4GB memory setting in 2.4.0 stable Rainer Mager
@ 2001-01-15 21:47   ` Marcelo Tosatti
  2001-01-15 23:45     ` Rainer Mager
  2001-01-16  8:40   ` Urban Widmark
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2001-01-15 21:47 UTC (permalink / raw)
  To: Rainer Mager; +Cc: linux-kernel



On Tue, 16 Jan 2001, Rainer Mager wrote:

> 	Attached is my oops.txt and the result sent through ksymoops. The results
> don't look particularly useful to me so perhaps I'm doing something wrong.
> PLEASE tell me if I should parse this differently. Likewise, if there is
> anything else I can do to help debug this, please tell me.

It seems you forgot to attach oops.txt. 

* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 21:47   ` Marcelo Tosatti
@ 2001-01-15 23:45     ` Rainer Mager
  2001-01-15 22:09       ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Rainer Mager @ 2001-01-15 23:45 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 964 bytes --]

I knew that, I was just testing you all.  ;-)

\e hides his head in shame



> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org]On Behalf Of Marcelo Tosatti
> Sent: Tuesday, January 16, 2001 6:47 AM
> To: Rainer Mager
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: Oops with 4GB memory setting in 2.4.0 stable
>
>
>
>
> On Tue, 16 Jan 2001, Rainer Mager wrote:
>
> > 	Attached is my oops.txt and the result sent through
> ksymoops. The results
> > don't look particularly useful to me so perhaps I'm doing
> something wrong.
> > PLEASE tell me if I should parse this differently. Likewise, if there is
> > anything else I can do to help debug this, please tell me.
>
> It seems you forgot to attach oops.txt.

[-- Attachment #2: oops.parsed --]
[-- Type: application/octet-stream, Size: 2090 bytes --]

ksymoops 0.7c on i686 2.4.0.  Options used
     -v /boot/vmlinux-2.4.0-bigmem (specified)
     -K (specified)
     -L (specified)
     -o /lib/modules/2.4.0/ (default)
     -m /boot/System.map-2.4.0-bigmem (specified)

No modules in ksyms, skipping objects
Unable to handle kernel NULL pointer dereference at virtual address 00000000
f889e044
*pde = 00000000
Oops: 0002
CPU:    1
EIP:    0010:[<f889e044>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000   ebx: d5762800   ecx: 00000400   edx: c19665fc
esi: d55be120   edi: 00000000   ebp: d5764260   esp: d5505f1c
ds: 0018   es: 0018   ss: 0018
Process ls (pid: 865, stackpage=d5505000)
Stack: d5762800 d55be120 d5764260 d5764260 d55be120 00000000 f889d966 d55be120
       d5762800 d5504000 d5764260 fffffffe fffffffb d5762800 d5764260 d55be120
       00000000 d5764260 bffffa40 00000006 c0140c10 d5764260 d5505fb0 c0140e7c
Call Trace: [<f889d966>] [<c0140c10>] [<c0140e7c>] [<c0140f9e>] [<c0140e7c>] [<c0108f4b>]
Code: f3 ab e9 8b 00 00 00 90 8d 74 26 00 8b 44 24 14 c7 00 00 00

>>EIP; f889e044 <END_OF_CODE+385bfe34/????>   <=====
Trace; f889d966 <END_OF_CODE+385bf756/????>
Trace; c0140c10 <vfs_readdir+90/ec>
Trace; c0140e7c <filldir+0/d8>
Trace; c0140f9e <sys_getdents+4a/98>
Trace; c0140e7c <filldir+0/d8>
Trace; c0108f4b <system_call+33/38>
Code;  f889e044 <END_OF_CODE+385bfe34/????>
00000000 <_EIP>:
Code;  f889e044 <END_OF_CODE+385bfe34/????>   <=====
   0:   f3 ab                     repz stos %eax,%es:(%edi)   <=====
Code;  f889e046 <END_OF_CODE+385bfe36/????>
   2:   e9 8b 00 00 00            jmp    92 <_EIP+0x92> f889e0d6 <END_OF_CODE+385bfec6/????>
Code;  f889e04b <END_OF_CODE+385bfe3b/????>
   7:   90                        nop    
Code;  f889e04c <END_OF_CODE+385bfe3c/????>
   8:   8d 74 26 00               lea    0x0(%esi,1),%esi
Code;  f889e050 <END_OF_CODE+385bfe40/????>
   c:   8b 44 24 14               mov    0x14(%esp,1),%eax
Code;  f889e054 <END_OF_CODE+385bfe44/????>
  10:   c7 00 00 00 00 00         movl   $0x0,(%eax)


[-- Attachment #3: oops.txt --]
[-- Type: text/plain, Size: 810 bytes --]

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
f889e044
*pde = 00000000
Oops: 0002
CPU:    1
EIP:    0010:[<f889e044>]
EFLAGS: 00010246
eax: 00000000   ebx: d5762800   ecx: 00000400   edx: c19665fc
esi: d55be120   edi: 00000000   ebp: d5764260   esp: d5505f1c
ds: 0018   es: 0018   ss: 0018
Process ls (pid: 865, stackpage=d5505000)
Stack: d5762800 d55be120 d5764260 d5764260 d55be120 00000000 f889d966 d55be120
       d5762800 d5504000 d5764260 fffffffe fffffffb d5762800 d5764260 d55be120
       00000000 d5764260 bffffa40 00000006 c0140c10 d5764260 d5505fb0 c0140e7c
Call Trace: [<f889d966>] [<c0140c10>] [<c0140e7c>] [<c0140f9e>] [<c0140e7c>] [<c0108f4b>]

Code: f3 ab e9 8b 00 00 00 90 8d 74 26 00 8b 44 24 14 c7 00 00 00
Segmentation fault
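
As a reading aid for the dump above, here is a small sketch that decodes the "Oops:" value using the i386 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper names toy_fault and toy_decode_oops_line are hypothetical, and the parser assumes the line has exactly the "Oops: NNNN" hexadecimal form shown in this thread.

```c
#include <assert.h>
#include <stdio.h>

struct toy_fault {
    int protection;  /* 1: protection violation, 0: page not present */
    int write;       /* 1: write access, 0: read access */
    int user;        /* 1: fault taken in user mode, 0: kernel mode */
};

/*
 * Parse a line such as "Oops: 0002" and split the hexadecimal error
 * code into its three low bits.
 * Returns 0 on success, -1 if the line does not match.
 */
static int toy_decode_oops_line(const char *line, struct toy_fault *out)
{
    unsigned int code;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "Oops: %x", &code) != 1)
        return -1;
    out->protection = (code & 1u) != 0;
    out->write      = (code & 2u) != 0;
    out->user       = (code & 4u) != 0;
    return 0;
}
```

For "Oops: 0002" this gives a kernel-mode write to a page that is not present. Together with edi = 0 and ecx = 0x400 at a `repz stos` instruction, that is consistent with zero-filling 0x400 dwords (4096 bytes) through a NULL pointer, though the dump alone does not prove which source line is responsible.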

* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 23:45     ` Rainer Mager
@ 2001-01-15 22:09       ` Marcelo Tosatti
  2001-01-16  0:21         ` Rainer Mager
  2001-01-16  2:03         ` Keith Owens
  0 siblings, 2 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2001-01-15 22:09 UTC (permalink / raw)
  To: Rainer Mager; +Cc: linux-kernel



On Tue, 16 Jan 2001, Rainer Mager wrote:

> I knew that, I was just testing you all.  ;-)

>>EIP; f889e044 <END_OF_CODE+385bfe34/????>   <=====
Trace; f889d966 <END_OF_CODE+385bf756/????>
Trace; c0140c10 <vfs_readdir+90/ec>
Trace; c0140e7c <filldir+0/d8>
Trace; c0140f9e <sys_getdents+4a/98>
Trace; c0140e7c <filldir+0/d8>

It seems the oops is happening in a module's function.

You have to make ksymoops parse the oops output against a System.map which
has all modules symbols. Load each module by hand with the insmod -m
option ("insmod -m module.o") and _append_ the outputs to System.map.

After that you can run ksymoops against this new System.map. 

* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 22:09       ` Marcelo Tosatti
@ 2001-01-16  0:21         ` Rainer Mager
  2001-01-15 22:37           ` Marcelo Tosatti
  2001-01-16  2:03         ` Keith Owens
  1 sibling, 1 reply; 15+ messages in thread
From: Rainer Mager @ 2001-01-16  0:21 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1185 bytes --]

Ok, now we're making progress. I did as you said and have attached (really!)
the new parsed output. Now we have some useful information (I hope). I still
got lots of warnings on symbols (which I have edited out of the parsed file
for the sake of brevity). What's the next step?

--Rainer


> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org]On Behalf Of Marcelo Tosatti
> Sent: Tuesday, January 16, 2001 7:09 AM
> To: Rainer Mager
> Cc: linux-kernel@vger.kernel.org
> Subject: RE: Oops with 4GB memory setting in 2.4.0 stable
>
> >>EIP; f889e044 <END_OF_CODE+385bfe34/????>   <=====
> Trace; f889d966 <END_OF_CODE+385bf756/????>
> Trace; c0140c10 <vfs_readdir+90/ec>
> Trace; c0140e7c <filldir+0/d8>
> Trace; c0140f9e <sys_getdents+4a/98>
> Trace; c0140e7c <filldir+0/d8>
>
> It seems the oops is happening in a module's function.
>
> You have to make ksymoops parse the oops output against a System.map which
> has all modules symbols. Load each module by hand with the insmod -m
> option ("insmod -m module.o") and _append_ the outputs to System.map.
>
> After that you can run ksymoops against this new System.map.

[-- Attachment #2: oops.parsed.edit --]
[-- Type: application/octet-stream, Size: 2066 bytes --]

ksymoops 0.7c on i686 2.4.0.  Options used
     -v /boot/vmlinux-2.4.0-bigmem (specified)
     -K (specified)
     -L (specified)
     -o /lib/modules/2.4.0/ (default)
     -m ./System.map-2.4.0-bigmem (specified)

No modules in ksyms, skipping objects
Unable to handle kernel NULL pointer dereference at virtual address 00000000
f889e044
*pde = 00000000
Oops: 0002
CPU:    1
EIP:    0010:[<f889e044>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000   ebx: d5762800   ecx: 00000400   edx: c19665fc
esi: d55be120   edi: 00000000   ebp: d5764260   esp: d5505f1c
ds: 0018   es: 0018   ss: 0018
Process ls (pid: 865, stackpage=d5505000)
Stack: d5762800 d55be120 d5764260 d5764260 d55be120 00000000 f889d966 d55be120
       d5762800 d5504000 d5764260 fffffffe fffffffb d5762800 d5764260 d55be120
       00000000 d5764260 bffffa40 00000006 c0140c10 d5764260 d5505fb0 c0140e7c
Call Trace: [<f889d966>] [<c0140c10>] [<c0140e7c>] [<c0140f9e>] [<c0140e7c>] [<c0108f4b>]
Code: f3 ab e9 8b 00 00 00 90 8d 74 26 00 8b 44 24 14 c7 00 00 00

>>EIP; f889e044 <smb_rename+fc/19c>   <=====
Trace; f889d966 <smb_readdir+b6/188>
Trace; c0140c10 <vfs_readdir+90/ec>
Trace; c0140e7c <filldir+0/d8>
Trace; c0140f9e <sys_getdents+4a/98>
Trace; c0140e7c <filldir+0/d8>
Trace; c0108f4b <system_call+33/38>
Code;  f889e044 <smb_rename+fc/19c>
00000000 <_EIP>:
Code;  f889e044 <smb_rename+fc/19c>   <=====
   0:   f3 ab                     repz stos %eax,%es:(%edi)   <=====
Code;  f889e046 <smb_rename+fe/19c>
   2:   e9 8b 00 00 00            jmp    92 <_EIP+0x92> f889e0d6 <smb_rename+18e/19c>
Code;  f889e04b <smb_rename+103/19c>
   7:   90                        nop    
Code;  f889e04c <smb_rename+104/19c>
   8:   8d 74 26 00               lea    0x0(%esi,1),%esi
Code;  f889e050 <smb_rename+108/19c>
   c:   8b 44 24 14               mov    0x14(%esp,1),%eax
Code;  f889e054 <smb_rename+10c/19c>
  10:   c7 00 00 00 00 00         movl   $0x0,(%eax)


367 warnings issued.  Results may not be reliable.

* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-16  0:21         ` Rainer Mager
@ 2001-01-15 22:37           ` Marcelo Tosatti
  0 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2001-01-15 22:37 UTC (permalink / raw)
  To: Rainer Mager; +Cc: linux-kernel, Urban Widmark



On Tue, 16 Jan 2001, Rainer Mager wrote:

> Ok, now we're making progress. I did as you said and have attached (really!)
> the new parsed output. Now we have some useful information (I hope). I still
> got lots of warnings on symbols (which I have edited out of the parsed file
> for the sake of brevity). What's the next step?

Wait for someone who has a clue about smbfs to find out the problem. 



* Re: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 22:09       ` Marcelo Tosatti
  2001-01-16  0:21         ` Rainer Mager
@ 2001-01-16  2:03         ` Keith Owens
  1 sibling, 0 replies; 15+ messages in thread
From: Keith Owens @ 2001-01-16  2:03 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Rainer Mager, linux-kernel

On Mon, 15 Jan 2001 20:09:14 -0200 (BRST), 
Marcelo Tosatti <marcelo@conectiva.com.br> wrote:
>On Tue, 16 Jan 2001, Rainer Mager wrote:
>>>EIP; f889e044 <END_OF_CODE+385bfe34/????>   <=====
>Trace; f889d966 <END_OF_CODE+385bf756/????>
>
>It seems the oops is happening in a module's function.
>
>You have to make ksymoops parse the oops output against a System.map which
>has all modules symbols. Load each module by hand with the insmod -m
>option ("insmod -m module.o") and _append_ the outputs to System.map.

No need, just create directory /var/log/ksymoops.  insmod and rmmod
will automatically save the list of modules and the symbol table on
every module load or unload, neatly timestamped.  When you get an oops,
find the entries just before the oops and point ksymoops at those.

ksymoops -m /lib/modules/2.4.0/System.map \
	 -k /var/log/ksymoops/20010116093850.ksyms \
	 -l /var/log/ksymoops/20010116093850.modules < oops.txt

man insmod, section KSYMOOPS ASSISTANCE.  Much easier than trying to
reproduce the environment by hand.
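
A small sketch of the "find the entries just before the oops" step. It assumes the snapshot names are fixed-width YYYYMMDDHHMMSS stamps, as the example file name 20010116093850 suggests; the helpers toy_pick_snapshot and toy_ksymoops_cmd are hypothetical, and the command string simply reuses the paths and options shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Pick the newest snapshot stamp that is not later than the oops time.
 * Fixed-width numeric stamps sort chronologically under strcmp().
 * Returns NULL if every snapshot is newer than the oops.
 */
static const char *toy_pick_snapshot(const char *const stamps[], int n,
                                     const char *oops_stamp)
{
    const char *best = NULL;

    for (int i = 0; i < n; i++) {
        if (strcmp(stamps[i], oops_stamp) > 0)
            continue;  /* taken after the oops: not usable */
        if (best == NULL || strcmp(stamps[i], best) > 0)
            best = stamps[i];
    }
    return best;
}

/*
 * Format the ksymoops invocation for the chosen stamp.
 * Returns what snprintf() returns, so the caller can detect truncation.
 */
static int toy_ksymoops_cmd(char *buf, size_t len, const char *stamp)
{
    assert(buf != NULL && stamp != NULL);
    return snprintf(buf, len,
                    "ksymoops -m /lib/modules/2.4.0/System.map"
                    " -k /var/log/ksymoops/%s.ksyms"
                    " -l /var/log/ksymoops/%s.modules < oops.txt",
                    stamp, stamp);
}
```

The list of stamps would come from listing /var/log/ksymoops; the order of the list does not matter because every candidate is compared.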

* Re: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-15 23:31 ` Oops with 4GB memory setting in 2.4.0 stable Rainer Mager
  2001-01-15 21:47   ` Marcelo Tosatti
@ 2001-01-16  8:40   ` Urban Widmark
  2001-01-17 23:59     ` Rainer Mager
  1 sibling, 1 reply; 15+ messages in thread
From: Urban Widmark @ 2001-01-16  8:40 UTC (permalink / raw)
  To: Rainer Mager; +Cc: linux-kernel

On Tue, 16 Jan 2001, Rainer Mager wrote:

> Hi all,
>
> 	I have a 100% reproducible bug in all of the 2.4.0 kernels including the
> latest stable one. The issue is that if I compile the kernel to support 4GB
> RAM (I have 1 GB) and then try to access a samba mount I get an oops. This

I'll have a look tonight or so. It works for you on non-bigmem?

> ALWAYS happens. Usually after this the system is frozen (although the magic
> SYSREQ still works). If the system isn't frozen then any commands that
> access the disk will freeze. Fortunately GPM worked and I was able to paste
> the oops to a file via telnet.

smb_rename suggests mv, but the process is ls ... er? What commands were
you running on smbfs when it crashed?

Could this be a symbol mismatch? Keith Owens suggested a less manual way
to get module symbol output. Do you get the same results using that?

/Urban

* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-16  8:40   ` Urban Widmark
@ 2001-01-17 23:59     ` Rainer Mager
  2001-01-18  0:30       ` Urban Widmark
  0 siblings, 1 reply; 15+ messages in thread
From: Rainer Mager @ 2001-01-17 23:59 UTC (permalink / raw)
  To: Urban Widmark; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 398 bytes --]

> smb_rename suggests mv, but the process is ls ... er? What commands were
> you running on smbfs when it crashed?
>
> Could this be a symbol mismatch? Keith Owens suggested a less manual way
> to get module symbol output. Do you get the same results using that?

Here is a newly parsed oops, this time using the /var/log/ksymoops method
mentioned by Keith Owens. Does this look better?

--Rainer

[-- Attachment #2: oops.parsed --]
[-- Type: application/octet-stream, Size: 3771 bytes --]

ksymoops 0.7c on i686 2.4.0.  Options used
     -V (default)
     -k /var/log/ksymoops/20010118084505.ksyms (specified)
     -l /var/log/ksymoops/20010118084505.modules (specified)
     -o /lib/modules/2.4.0/ (default)
     -m /boot/System.map-2.4.0-bigmem (specified)

Warning (compare_maps): ksyms_base symbol highmem_start_page_R__ver_highmem_start_page not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol kmap_high_R__ver_kmap_high not found in System.map.  Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol kunmap_high_R__ver_kunmap_high not found in System.map.  Ignoring ksyms_base entry
Unable to handle kernel NULL pointer dereference at virtual address 00000000
c01239a4
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c01239a4>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 00001001   ebx: 00000000   ecx: c0256730   edx: 0003f435
esi: c20cde24   edi: 00000000   ebp: 00000001   esp: ee5e3e30
ds: 0018   es: 0018   ss: 0018
Process ls (pid: 449, stackpage=ee5e3000)
Stack: c20cde24 ee5e3e64 f7e00004 00000001 c01262f5 c20cde24 00000000 00000001
       f7e00004 c1000010 fe2f0014 00000018 fe2f0000 c20cde24 f88982f8 00000000
       00000001 00000070 ee5e3ee8 f889e180 ee61a000 ee6ede9c 00000010 f8896e69
Call Trace: [<c01262f5>] [<fe2f0014>] [<fe2f0000>] [<f88982f8>] [<f889e180>] [<f8896e69>] [<f8896eaa>]
       [<fe2f0000>] [<fe2f0000>] [<f889e048>] [<f889e03c>] [<f8896f40>] [<fe2f0000>] [<f88983b0>] [<fe2f0000>]
       [<fe2f0000>] [<f889798b>] [<fe2f0000>] [<c0140c10>] [<c0140e7c>] [<c0140f9e>] [<c0140e7c>] [<c0108f4b>]
Code: 8b 07 ff 47 18 89 70 04 89 06 89 7e 04 89 37 89 7e 08 8b 44

>>EIP; c01239a4 <add_to_page_cache_unique+c0/f4>   <=====
Trace; c01262f5 <grab_cache_page+7d/a4>
Trace; fe2f0014 <END_OF_CODE+5a531b5/????>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; f88982f8 <[smbfs]smb_add_to_cache+dc/104>
Trace; f889e180 <.data.end+1321/????>
Trace; f8896e69 <[smbfs]smb_proc_readdir_long+34d/400>
Trace; f8896eaa <[smbfs]smb_proc_readdir_long+38e/400>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; f889e048 <.data.end+11e9/????>
Trace; f889e03c <.data.end+11dd/????>
Trace; f8896f40 <[smbfs]smb_proc_readdir+24/34>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; f88983b0 <[smbfs]smb_refill_dircache+24/70>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; f889798b <[smbfs]smb_readdir+db/188>
Trace; fe2f0000 <END_OF_CODE+5a531a1/????>
Trace; c0140c10 <vfs_readdir+90/ec>
Trace; c0140e7c <filldir+0/d8>
Trace; c0140f9e <sys_getdents+4a/98>
Trace; c0140e7c <filldir+0/d8>
Trace; c0108f4b <system_call+33/38>
Code;  c01239a4 <add_to_page_cache_unique+c0/f4>
00000000 <_EIP>:
Code;  c01239a4 <add_to_page_cache_unique+c0/f4>   <=====
   0:   8b 07                     mov    (%edi),%eax   <=====
Code;  c01239a6 <add_to_page_cache_unique+c2/f4>
   2:   ff 47 18                  incl   0x18(%edi)
Code;  c01239a9 <add_to_page_cache_unique+c5/f4>
   5:   89 70 04                  mov    %esi,0x4(%eax)
Code;  c01239ac <add_to_page_cache_unique+c8/f4>
   8:   89 06                     mov    %eax,(%esi)
Code;  c01239ae <add_to_page_cache_unique+ca/f4>
   a:   89 7e 04                  mov    %edi,0x4(%esi)
Code;  c01239b1 <add_to_page_cache_unique+cd/f4>
   d:   89 37                     mov    %esi,(%edi)
Code;  c01239b3 <add_to_page_cache_unique+cf/f4>
   f:   89 7e 08                  mov    %edi,0x8(%esi)
Code;  c01239b6 <add_to_page_cache_unique+d2/f4>
  12:   8b 44 00 00               mov    0x0(%eax,%eax,1),%eax


3 warnings issued.  Results may not be reliable.

[-- Attachment #3: oops.txt --]
[-- Type: text/plain, Size: 1047 bytes --]

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c01239a4
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c01239a4>]
EFLAGS: 00010202
eax: 00001001   ebx: 00000000   ecx: c0256730   edx: 0003f435
esi: c20cde24   edi: 00000000   ebp: 00000001   esp: ee5e3e30
ds: 0018   es: 0018   ss: 0018
Process ls (pid: 449, stackpage=ee5e3000)
Stack: c20cde24 ee5e3e64 f7e00004 00000001 c01262f5 c20cde24 00000000 00000001
       f7e00004 c1000010 fe2f0014 00000018 fe2f0000 c20cde24 f88982f8 00000000
       00000001 00000070 ee5e3ee8 f889e180 ee61a000 ee6ede9c 00000010 f8896e69
Call Trace: [<c01262f5>] [<fe2f0014>] [<fe2f0000>] [<f88982f8>] [<f889e180>] [<f8896e69>] [<f8896eaa>]
       [<fe2f0000>] [<fe2f0000>] [<f889e048>] [<f889e03c>] [<f8896f40>] [<fe2f0000>] [<f88983b0>] [<fe2f0000>]
       [<fe2f0000>] [<f889798b>] [<fe2f0000>] [<c0140c10>] [<c0140e7c>] [<c0140f9e>] [<c0140e7c>] [<c0108f4b>]

Code: 8b 07 ff 47 18 89 70 04 89 06 89 7e 04 89 37 89 7e 08 8b 44
Segmentation fault


* RE: Oops with 4GB memory setting in 2.4.0 stable
  2001-01-17 23:59     ` Rainer Mager
@ 2001-01-18  0:30       ` Urban Widmark
  0 siblings, 0 replies; 15+ messages in thread
From: Urban Widmark @ 2001-01-18  0:30 UTC (permalink / raw)
  To: Rainer Mager; +Cc: linux-kernel

On Thu, 18 Jan 2001, Rainer Mager wrote:

> Here is a newly parsed oops, this time using the /var/log/ksymoops method
> mentioned by Keith Owens. Does this look better?

Yes, and it sort of matches the other oops someone sent. Thanks.

I have a changed version now, based on the ncpfs directory cache code.
But it doesn't work at all right now. (and that would be the "based on"
bit, the copy and paste bits haven't crashed yet :)

Assuming that all meetings have an end, and sometimes they don't seem to
have one, there may be something for you to try tomorrow.

/Urban



Thread overview: 15+ messages
2001-01-16 13:33 Oops with 4GB memory setting in 2.4.0 stable Petr Vandrovec
2001-01-16 20:17 ` Urban Widmark
  -- strict thread matches above, loose matches on Subject: below --
2001-01-16 22:29 Petr Vandrovec
2001-01-16 22:38 ` Urban Widmark
2001-01-16 22:42 ` Rainer Mager
2001-01-15 22:47 FS corruption on 2.4.0-ac8 Jure Pecar
2001-01-15 23:31 ` Oops with 4GB memory setting in 2.4.0 stable Rainer Mager
2001-01-15 21:47   ` Marcelo Tosatti
2001-01-15 23:45     ` Rainer Mager
2001-01-15 22:09       ` Marcelo Tosatti
2001-01-16  0:21         ` Rainer Mager
2001-01-15 22:37           ` Marcelo Tosatti
2001-01-16  2:03         ` Keith Owens
2001-01-16  8:40   ` Urban Widmark
2001-01-17 23:59     ` Rainer Mager
2001-01-18  0:30       ` Urban Widmark
