public inbox for linux-kernel@vger.kernel.org
* filp_open() in 2.2.19 causes memory corruption
@ 2001-04-23 18:04 Jeff V. Merkey
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff V. Merkey @ 2001-04-23 18:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: jmerkey

[-- Attachment #1: Type: text/plain, Size: 537 bytes --]



I am now using the filp_open() call in the kernel to scan for tape
devices in lieu of chrdev_open()/blkdev_open(), but I have
discovered that calling this API with non-existent devices
appears to result in memory corruption and a nasty oops.

I have attached the code fragment and the oops generated by calling
ScanTapeDevices().  This basically works the way Al Viro described,
and I like the auto-probing of the tape devices via calls to
filp_open(), which is really slick.  If I can just get past
the oops, I think it's there.

Jeff


[-- Attachment #2: trace.txt --]
[-- Type: text/plain, Size: 1812 bytes --]


read enter 
Unable to handle kernel paging request at virtual address c4035020 
current->tss.cr3 = 03dcc000, %cr3 = 03dcc000 
*pde = 03f7a063 
*pte = 00000000 
Oops: 0000 
CPU:    0 
EIP:    0010:[sys_mremap+31/884] 
EFLAGS: 00010206 
eax: ffffffff   ebx: c3fefb00   ecx: c023e964   edx: c404d020 
esi: c4035020   edi: c404d020   ebp: c3fefb60   esp: c238dd28 
ds: 0018   es: 0018   ss: 0018 
Process insmod (pid: 658, process nr: 23, stackpage=c238d000) 
Stack: c238dd74 c238ddc8 00000020 00000020 ffffffe0 c3fefb6c c238dd50 00000000  
       00000020 00000004 00000020 c4049c30 c404d020 00000048 00000020 00002000  
       00000000 00000000 c4049d35 00000000 c025b5a0 c01cb0b1 00000000 0000000d  
Call Trace: 
[lockd:nlm4_granted_Rea24c726+123124/114388216] 
[lockd:nlm4_granted_Rea24c726+136420/114374920] 
[lockd:nlm4_granted_Rea24c726+123385/114387955] 
[fbcon_cfb24_putc+69/808] 
[lockd:nlm4_granted_Rea24c726+143844/114367496] 
[lockd:nlm4_granted_Rea24c726+130960/114380380] 
[lockd:nlm4_granted_Rea24c726+143588/114367752]  
[lockd:nlm4_granted_Rea24c726+127420/114383920] 
[lockd:nlm4_granted_Rea24c726+126445/114384895] 
[lockd:nlm4_granted_Rea24c726+126268/114385072] 
[lockd:nlm4_granted_Rea24c726+121676/114389664] 
[lockd:nlm4_granted_Rea24c726+120083/114391257] 
[lockd:nlm4_granted_Rea24c726+144284/114367056] 
[ide_timer_expiry+172/436] 
[lockd:nlm4_granted_Rea24c726+126932/114384408]  
[lockd:nlm4_granted_Rea24c726+120004/114391336] 
[lockd:nlm4_granted_Rea24c726+127808/114383532] 
[lockd:nlm4_granted_Rea24c726+120004/114391336] 
[getrusage+263/924] 
[lockd:nlm4_granted_Rea24c726+142028/114369312] 
[lockd:nlm4_granted_Rea24c726+87236/114424104] 
[lockd:nlm4_granted_Rea24c726+120076/114391264] 
[do_signal+512/616]  
Code: ac ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 75 d9  


[-- Attachment #3: tape.c --]
[-- Type: text/plain, Size: 3076 bytes --]


BYTE *scsi_tape_handles[] =
{
    "/dev/st0",  "/dev/st1",  "/dev/st2",  "/dev/st3",  "/dev/st4",
    "/dev/st5",  "/dev/st6",  "/dev/st7",  "/dev/st8",  "/dev/st9",
    "/dev/st10", "/dev/st11", "/dev/st12", "/dev/st13", "/dev/st14",
    "/dev/st15", "/dev/st16", "/dev/st17", "/dev/st18", "/dev/st19",
    "/dev/st20", "/dev/st21", "/dev/st22", "/dev/st23", "/dev/st24",
    "/dev/st25", "/dev/st26", "/dev/st27", "/dev/st28", "/dev/st29",
    "/dev/st30", "/dev/st31",
    0,
};

kdev_t scsi_tape_devs[] =
{
    0x0900, 0x0901, 0x0902, 0x0903, 0x0904, 0x0905, 0x0906, 0x0907, 
    0x0908, 0x0909, 0x090A, 0x090B, 0x090C, 0x090D, 0x090E, 0x090F,
    0x0910, 0x0911, 0x0912, 0x0913, 0x0914, 0x0915, 0x0916, 0x0917, 
    0x0918, 0x0919, 0x091A, 0x091B, 0x091C, 0x091D, 0x091E, 0x091F,  
    0,
};

BYTE *ide_tape_handles[] =
{
    "/dev/ht0",  "/dev/ht1",  "/dev/ht2",  "/dev/ht3",  "/dev/ht4", 
    "/dev/ht5",  "/dev/ht6",  "/dev/ht7",  "/dev/ht8",  "/dev/ht9",
    "/dev/ht10", "/dev/ht11", "/dev/ht12", "/dev/ht13", "/dev/ht14",
    "/dev/ht15", "/dev/ht16", "/dev/ht17", "/dev/ht18", "/dev/ht19", 
    "/dev/ht20", "/dev/ht21", "/dev/ht22", "/dev/ht23", "/dev/ht24", 
    "/dev/ht25", "/dev/ht26", "/dev/ht27", "/dev/ht28", "/dev/ht29", 
    "/dev/ht30", "/dev/ht31", 0
};

kdev_t ide_tape_devs[] =
{
    0x3700, 0x3701, 0x3702, 0x3703, 0x3704, 0x3705, 0x3706, 0x3707, 
    0x3708, 0x3709, 0x370A, 0x370B, 0x370C, 0x370D, 0x370E, 0x370F,
    0x3710, 0x3711, 0x3712, 0x3713, 0x3714, 0x3715, 0x3716, 0x3717, 
    0x3718, 0x3719, 0x371A, 0x371B, 0x371C, 0x371D, 0x371E, 0x371F,
    0 
};

ULONG max_scsi_tape_devs = sizeof(scsi_tape_devs) / sizeof(kdev_t);
ULONG max_ide_tape_devs = sizeof(ide_tape_devs) / sizeof(kdev_t);
ULONG max_scsi_tape_names = sizeof(scsi_tape_handles) / sizeof(BYTE *);
ULONG max_ide_tape_names = sizeof(ide_tape_handles) / sizeof(BYTE *);

void RemoveTapeDevices(void)
{
    register ULONG j;

    for (j = 0; j < max_scsi_tape_names; j++)
    {
        if (SystemTape[j])
        {
            if (SystemTape[j]->filp)
                filp_close(SystemTape[j]->filp, NULL);
            TRXDRVFree(SystemTape[j]);
            SystemTape[j] = 0;
        }
    }
}

void ScanTapeDevices(void)
{
    register ULONG j;

    for (j = 0; j < max_scsi_tape_names; j++)
    {
        if (!SystemTape[j])
        {
            /* The device table is terminated by a 0 entry. */
            if (!scsi_tape_devs[j])
                break;

            SystemTape[j] = (NWTAPE *) TRXDRVAlloc(sizeof(NWTAPE), NWTAPE_TAG);
            if (!SystemTape[j])
            {
                TRXDRVPrint("trxdrv: memory alloc failure in ScanTapeDevices\n");
                continue;
            }
            TRXDRVSet(SystemTape[j], 0, sizeof(NWTAPE));

            TRXDRVPrint("filp_open %s\n", scsi_tape_handles[j]);

            /* filp_open() reports failure with an error pointer, not NULL. */
            SystemTape[j]->filp = filp_open(scsi_tape_handles[j], O_RDWR, 0600);
            if (IS_ERR(SystemTape[j]->filp))
            {
                /* Device absent: free the slot so that RemoveTapeDevices()
                   never passes the error pointer to filp_close(). */
                TRXDRVFree(SystemTape[j]);
                SystemTape[j] = 0;
                continue;
            }
            TRXDRVPrint("trxdrv:  tape device detected at %s\n",
                        scsi_tape_handles[j]);
        }
    }
}
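
The probe loop above depends on filp_open() signalling failure through an error pointer rather than NULL. The following is a minimal userspace sketch of that convention, not kernel code: every `sketch_` name, the 4095 cutoff, and the table of "present" devices are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace imitation of an error-pointer convention. All names are
 * hypothetical; the cutoff of 4095 is an assumption for this sketch. */
#define SKETCH_MAX_ERRNO 4095

static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* A pointer counts as an error if it lies in the top SKETCH_MAX_ERRNO
 * bytes of the address space, i.e. it encodes a small negative number. */
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_file {
    const char *path;
};

/* Pretend that only /dev/st0 exists on this machine. */
static struct sketch_file sketch_present[] = { { "/dev/st0" } };

/* Stand-in for an open call: a valid pointer on success,
 * an encoded -ENOENT on failure. */
static struct sketch_file *sketch_open(const char *path)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_present) / sizeof(sketch_present[0]); i++)
        if (strcmp(sketch_present[i].path, path) == 0)
            return &sketch_present[i];
    return sketch_err_ptr(-ENOENT);
}

/* Count how many names in a NULL-terminated table open successfully. */
static int sketch_scan(const char *const *names)
{
    int found = 0;

    for (; *names; names++) {
        struct sketch_file *f = sketch_open(*names);

        if (sketch_is_err(f))
            continue; /* never dereference or close an error pointer */
        found++;
    }
    return found;
}
```

The point of the pattern is that a failed probe must be detected with the error test before the result is stored anywhere a later cleanup pass might treat it as a live handle.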



* Re: filp_open() in 2.2.19 causes memory corruption
@ 2001-04-23 20:24 Manfred Spraul
  2001-04-23 20:44 ` Jeff V. Merkey
  2001-04-23 22:03 ` David Woodhouse
  0 siblings, 2 replies; 7+ messages in thread
From: Manfred Spraul @ 2001-04-23 20:24 UTC (permalink / raw)
  To: jmerkey; +Cc: linux-kernel

Are you sure the trace is decoded correctly?

> CPU:    0 
> EIP:    0010:[sys_mremap+31/884] 
> EFLAGS: 00010206

> Code: ac ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 75 d9
ac ae is
lodsb
scasb
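
Decoding the rest of the quoted Code: bytes by hand suggests a byte-wise string comparison loop rather than anything one would expect inside sys_mremap. The sketch below is a C rendering of that instruction sequence; the function name is hypothetical and the decoding is a manual reading of the bytes, not ksymoops output. Assuming the Code: bytes begin at the faulting instruction, it is also consistent that the faulting address (c4035020) equals esi in the register dump, since lodsb reads through esi.

```c
/* C rendering of the instruction sequence in the "Code:" line:
 *
 *   ac       lodsb          ; al = *esi++
 *   ae       scasb          ; compare al with *edi++
 *   75 08    jne  +8        ; bytes differ: go to the sbb below
 *   84 c0    test al,al
 *   75 f8    jne  -8        ; not at the terminator yet: loop
 *   31 c0    xor  eax,eax   ; strings equal: result 0
 *   eb 04    jmp  +4
 *   19 c0    sbb  eax,eax   ; 0 or -1 depending on the carry flag
 *   0c 01    or   al,1      ; ... which becomes 1 or -1
 *   85 c0    test eax,eax   ; caller checks the result
 *   75 d9    jne  -39
 */
static int sketch_strcmp(const char *cs, const char *ct)
{
    for (;;) {
        unsigned char a = (unsigned char)*cs++;
        unsigned char b = (unsigned char)*ct++;

        if (a != b)
            return a < b ? -1 : 1; /* the sbb / or pair */
        if (a == 0)
            return 0;              /* the xor path */
    }
}
```

If this reading is right, the fault is a read of an unmapped string pointer during a comparison, which is the kind of code the symbol lookup should have attributed to a different function.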

Could you run
#objdump --disassemble-all --reloc linux/mm/mremap.o | less

and check that the code is really at offset 31 of sys_mremap?

And is it correct that only 64 MB memory is installed/enabled?

--
    Manfred




* Re: filp_open() in 2.2.19 causes memory corruption
  2001-04-23 20:24 filp_open() in 2.2.19 causes memory corruption Manfred Spraul
@ 2001-04-23 20:44 ` Jeff V. Merkey
  2001-04-23 22:03 ` David Woodhouse
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff V. Merkey @ 2001-04-23 20:44 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

On Mon, Apr 23, 2001 at 10:24:55PM +0200, Manfred Spraul wrote:
> Are you sure the trace is decoded correctly?
> 
> > CPU:    0 
> > EIP:    0010:[sys_mremap+31/884] 
> > EFLAGS: 00010206
> 
> > Code: ac ae 75 08 84 c0 75 f8 31 c0 eb 04 19 c0 0c 01 85 c0 75 d9
> ac ae is
> lodsb
> scasb
> 
> Could you run
> #objdump --disassemble-all --reloc linux/mm/mremap.o | less
> 
> and check that the code is really at offset 31 of sys_mremap?
> 
> And is it correct that only 64 MB memory is installed/enabled?
> 
> --
>     Manfred


Manfred,

This is what's being reported when I produce the oops.  I think we have
memory corruption somewhere, which explains the funky code offsets.  It's
easy to reproduce.  Call filp_open() with the handle table I gave you
on a single IDE system with **NO** tape drive in the system, and it
crashes after the module has been loaded the first time, unloaded,
and then loaded a second time.  The oops happens on the second insmod
of the module.  I can provide you the actual module itself built with
all the code if you want to reproduce it.

It's 100% reproducible.

Jeff

> 


* Re: filp_open() in 2.2.19 causes memory corruption
  2001-04-23 20:24 filp_open() in 2.2.19 causes memory corruption Manfred Spraul
  2001-04-23 20:44 ` Jeff V. Merkey
@ 2001-04-23 22:03 ` David Woodhouse
  2001-04-23 22:32   ` Jeff V. Merkey
  1 sibling, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2001-04-23 22:03 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: jmerkey, linux-kernel


manfred@colorfullife.com said:
> Are you sure the trace is decoded correctly?

> > CPU:    0 
> > EIP:    0010:[sys_mremap+31/884]  

Probably not. It looks like it was munged by klogd. Some distributions are 
still shipping with klogd configured to destroy the original information on 
the way to the log, without even making it do a sanity check that the 
System.map it's using actually matches the current kernel.

Jeff, please disable the broken klogd symbol munging and reproduce it,
running the oops through ksymoops manually. Ksymoops should have built-in 
sanity checks on the System.map it tries to use.

Also, please make sure you report this as a serious bug with the vendor of 
whatever distribution you're running on this box.

--
dwmw2




* Re: filp_open() in 2.2.19 causes memory corruption
  2001-04-23 22:03 ` David Woodhouse
@ 2001-04-23 22:32   ` Jeff V. Merkey
       [not found]     ` <4750.988065680@redhat.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff V. Merkey @ 2001-04-23 22:32 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Manfred Spraul, linux-kernel

On Mon, Apr 23, 2001 at 11:03:48PM +0100, David Woodhouse wrote:
> 
> manfred@colorfullife.com said:
> > Are you sure the trace is decoded correctly?
> 
> > > CPU:    0 
> > > EIP:    0010:[sys_mremap+31/884]  
> 
> Probably not. It looks like it was munged by klogd. Some distributions are 
> still shipping with klogd configured to destroy the original information on 
> the way to the log, without even making it do a sanity check that the 
> System.map it's using actually matches the current kernel.
> 
> Jeff, please disable the broken klogd symbol munging and reproduce it,
> running the oops through ksymoops manually. Ksymoops should have built-in 
> sanity checks on the System.map it tries to use.
> 
> Also, please make sure you report this as a serious bug with the vendor of 
> whatever distribution you're running on this box.
> 


David,

I will comply and repost the oops.

Jeff

> --
> dwmw2
> 


* Re: filp_open() in 2.2.19 causes memory corruption
       [not found]             ` <4897.988066047@redhat.com>
@ 2001-04-25 23:32               ` Jeff V. Merkey
  2001-04-27 11:50               ` David Woodhouse
  1 sibling, 0 replies; 7+ messages in thread
From: Jeff V. Merkey @ 2001-04-25 23:32 UTC (permalink / raw)
  To: David Woodhouse, linux-kernel; +Cc: jmerkey

On Mon, Apr 23, 2001 at 11:47:27PM +0100, David Woodhouse wrote:

David/LKML,

I've gotten to the bottom of this problem, and you are correct that klog
is trashing the messages file for the oops.  As for the oops, it was related
to the use of ll_rw_block() instead of submit_bh() in 2.4.3, which was causing
memory corruption in Linus' buffer cache code.   In NetWare, we used to
put a signature field in I/O and other structures that were submitted
by modules other than the media manager.

It would be useful for the buffer cache to carry such a signature field, so
that if it ever gets back a buffer head that is not its own, it
can drop it with a noisy message rather than suffer memory corruption
and other side effects that take days to track down.
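
A minimal userspace sketch of the signature-field idea follows. It is not the buffer cache's code and not the NetWare implementation; the structure, the tag value, and every `sketch_` name are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Arbitrary tag value chosen for this sketch. */
#define SKETCH_IO_SIGNATURE 0x4F494253u

struct sketch_io_request {
    uint32_t signature; /* stamped by the owning subsystem at creation */
    void *data;
    int completed;
};

/* The owner stamps every request it creates. */
static void sketch_io_init(struct sketch_io_request *req, void *data)
{
    req->signature = SKETCH_IO_SIGNATURE;
    req->data = data;
    req->completed = 0;
}

/* Completion path: refuse, loudly, anything that was not stamped by
 * sketch_io_init(). Returns 0 on success and -1 for a foreign request. */
static int sketch_io_complete(struct sketch_io_request *req)
{
    if (req->signature != SKETCH_IO_SIGNATURE) {
        fprintf(stderr,
                "sketch: dropping foreign request %p (signature %08x)\n",
                (void *)req, (unsigned)req->signature);
        return -1;
    }
    req->completed = 1;
    return 0;
}
```

The check costs one comparison per completion and turns silent corruption into an immediate, attributable message. It only catches structures that were never stamped; it does not detect a correctly stamped structure that was later overwritten elsewhere.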

Jeff

> 


* Re: filp_open() in 2.2.19 causes memory corruption
       [not found]             ` <4897.988066047@redhat.com>
  2001-04-25 23:32               ` Jeff V. Merkey
@ 2001-04-27 11:50               ` David Woodhouse
  1 sibling, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2001-04-27 11:50 UTC (permalink / raw)
  To: Jeff V. Merkey; +Cc: linux-kernel, jmerkey


jmerkey@vger.timpanogas.org said:
> I've gotten to the bottom of this problem, and you are correct that
> klog  is trashing the messages file for the oops.

Oh dear. That's quite a serious bug in klogd. It should never destroy the
original information, _especially_ if the System.map it's looking at
blatantly doesn't match /proc/ksyms.
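
The kind of sanity check meant here can be sketched in userspace as follows. This is not how klogd or ksymoops implement it; the function names are hypothetical, and the input formats ("address type name" per line for System.map, "address name" per line for /proc/ksyms) are simplifying assumptions that ignore details such as versioned symbol suffixes and module annotations.

```c
#include <stdio.h>
#include <string.h>

/* Look up `name` in System.map-style text ("address type name" per line).
 * Returns 1 and stores the address if found, 0 otherwise. */
static int sketch_map_lookup(const char *map, const char *name,
                             unsigned long *addr)
{
    const char *line = map;

    while (line && *line) {
        unsigned long a;
        char type;
        char sym[128];

        if (sscanf(line, "%lx %c %127s", &a, &type, sym) == 3 &&
            strcmp(sym, name) == 0) {
            *addr = a;
            return 1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return 0;
}

/* Count symbols in ksyms-style text ("address name" per line) whose
 * address disagrees with the map. Symbols absent from the map (for
 * example, module symbols) are ignored. */
static int sketch_count_mismatches(const char *map, const char *ksyms)
{
    const char *line = ksyms;
    int bad = 0;

    while (line && *line) {
        unsigned long a, m;
        char sym[128];

        if (sscanf(line, "%lx %127s", &a, sym) == 2 &&
            sketch_map_lookup(map, sym, &m) && m != a)
            bad++;
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return bad;
}
```

A decoder that finds even one mismatch has good reason to distrust the map and leave the raw addresses in the log untouched.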

Have you reported it to your distribution vendor yet?

--
dwmw2



