* [PATCH as445] Fix reference to deallocated memory in sd.c
From: Alan Stern @ 2005-01-14 16:05 UTC
To: James Bottomley
Cc: David Brownell, USB development list, SCSI development list

James:

This patch of yours:

http://linux-scsi.bkbits.net:8080/scsi-for-linus-2.6/cset@1.2034.95.5?nav=index.html|src/|src/drivers|src/drivers/scsi|related/drivers/scsi/sd.c

--- 1.162/drivers/scsi/sd.c	2005-01-14 07:53:59 -08:00
+++ 1.163/drivers/scsi/sd.c	2005-01-14 07:53:59 -08:00
@@ -198,8 +198,8 @@
 static void scsi_disk_put(struct scsi_disk *sdkp)
 {
 	down(&sd_ref_sem);
-	scsi_device_put(sdkp->device);
 	kref_put(&sdkp->kref, scsi_disk_release);
+	scsi_device_put(sdkp->device);
 	up(&sd_ref_sem);
 }

is causing almost as much trouble as it fixed.  If kref_put() drops the
last reference to the scsi_disk (this happens when the device file is
closed after the device has been hot-unplugged) then the call to
scsi_device_put() will take its argument from an area of memory that has
been deallocated.

The patch below should fix things.

Alan Stern

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

===== drivers/scsi/sd.c 1.75 vs edited =====
--- 1.75/drivers/scsi/sd.c	2004-12-22 11:18:12 -05:00
+++ edited/drivers/scsi/sd.c	2005-01-14 11:01:14 -05:00
@@ -197,9 +197,11 @@
 static void scsi_disk_put(struct scsi_disk *sdkp)
 {
+	struct scsi_device *sdev = sdkp->device;
+
 	down(&sd_ref_sem);
 	kref_put(&sdkp->kref, scsi_disk_release);
-	scsi_device_put(sdkp->device);
+	scsi_device_put(sdev);
 	up(&sd_ref_sem);
 }

On Wed, 12 Jan 2005, David Brownell wrote:

> I didn't realize FC3 was mounting this drive, else I might have done
> things differently ... but I think everyone will agree that oopsing
> is not OK.  See the following dmesg trace.
>
> I've seen a lot of messages about similar failures lately, as if
> maybe more distros are automounting removable drives.  But I also
> remember seeing a lot of fixes go by; does this oops have a fix?
>
> - Dave
>
>
> ============================================================================
>
> Connect drive to NEC EHCI
> Erm, FC3 must be automatically mounting this for me.
> I didn't ask it to, but I suppose that could be OK ...
> ... except that when I then unplug the drive ...
> ... then things go completely haywire ...
>
> lost page write due to I/O error on sda1
> Unable to handle kernel paging request at virtual address 6b6b6b6b
>  printing eip:
> c027169b
> *pde = 00000000
> Oops: 0000 [#1]
> SMP
> Modules linked in: usb_storage ohci_hcd ehci_hcd
> CPU:    1
> EIP:    0060:[<c027169b>]    Not tainted VLI
> EFLAGS: 00010286   (2.6.11-rc1-helium)
> EIP is at scsi_device_put+0x7/0x48
> eax: 0000000f   ebx: 6b6b6b6b   ecx: 00000000   edx: c14efbdc
> esi: cd3f2360   edi: d12b3148   ebp: cd241ee4   esp: cd241ee0
> ds: 007b   es: 007b   ss: 0068
> Process umount (pid: 3497, threadinfo=cd240000 task=ccd71ac0)
> Stack: cd3f2360 cd241efc c0278852 6b6b6b6b cd3f2360 c0279e4e cd2f05b8 cd241f10
>        c0278bf2 cd3f2360 c14df02c c14df02c cd241f34 c01519c3 c14df0a0 00000000
>        c14df0a0 00000000 00000000 dfc695d4 d12b3148 cd241f54 c0151a69 c14df02c
> Call Trace:
>  [<c0102f63>] show_stack+0x74/0x7c
>  [<c0103077>] show_registers+0xf4/0x15e
>  [<c010323e>] die+0xd8/0x157
>  [<c010fe9a>] do_page_fault+0x43d/0x5cc
>  [<c0102c2b>] error_code+0x2b/0x30
>  [<c0278852>] scsi_disk_put+0x38/0x4d
>  [<c0278bf2>] sd_release+0x46/0x4f
>  [<c01519c3>] blkdev_put+0x69/0x137
>  [<c0151a69>] blkdev_put+0x10f/0x137
>  [<c014fdc7>] deactivate_super+0x59/0x78
>  [<c0161ffa>] sys_umount+0x6b/0x73
>  [<c0102155>] sysenter_past_esp+0x52/0x75
> Code: 06 8d 04 02 ff 80 00 01 00 00 eb 0d 56 e8 3a 8c fd ff ba fa ff ff ff eb 02 31 d2 8d 65 f8 89 d0 5b 5e c9 c3 55 89 e5 53 8b 5d 08 <8b> 03 8b 40 74 8b 10 85 d2 74 26 b8 00 e0 ff ff 21 e0 8b 40 10
> <7>hub 2-0:1.0: debounce: port 5: total 100ms stable 100ms status 0x100
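The ordering problem described in this message can be shown in miniature in userspace. What follows is only a sketch, not kernel code: toy_device, toy_disk, toy_disk_unref() and toy_disk_put() are hypothetical stand-ins for struct scsi_device, struct scsi_disk, kref_put() and scsi_disk_put(). The point is that the device pointer is copied out of the disk structure before the last disk reference can be dropped. The faulting address 6b6b6b6b in the trace is consistent with a read from freed, poisoned slab memory, assuming that kernel was built with slab debugging.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct scsi_device and struct scsi_disk. */
struct toy_device {
	int refs;
};

struct toy_disk {
	int refs;
	struct toy_device *device;
};

/* Drop one reference on the device (the device is never freed in this sketch). */
static void toy_device_put(struct toy_device *dev)
{
	assert(dev->refs > 0);
	dev->refs--;
}

/* Drop one reference on the disk; free it when the count reaches zero. */
static void toy_disk_unref(struct toy_disk *disk)
{
	assert(disk->refs > 0);
	if (--disk->refs == 0)
		free(disk);
}

/*
 * Same ordering as the fixed scsi_disk_put(): read disk->device while the
 * disk is still guaranteed to exist, then drop the disk reference, then
 * drop the device reference through the saved pointer.
 */
static void toy_disk_put(struct toy_disk *disk)
{
	struct toy_device *dev = disk->device;

	toy_disk_unref(disk);	/* disk may have been freed by this call */
	toy_device_put(dev);	/* safe: does not touch disk any more */
}
```

Reading disk->device after toy_disk_unref() instead would be the userspace equivalent of the bug: it works until the call happens to drop the last reference.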
* [PATCH as448] Fix reference to deallocated memory in sr.c
From: Alan Stern @ 2005-01-18 14:56 UTC
To: James Bottomley
Cc: SCSI development list

James:

When I posted a patch last week to fix a reference to deallocated memory
in sd.c, I forgot to check whether the same problem exists in sr.c.  It
does, and here's the patch to fix it.

Alan Stern

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

===== drivers/scsi/sr.c 1.78 vs edited =====
--- 1.78/drivers/scsi/sr.c	2005-01-11 11:57:28 -05:00
+++ edited/drivers/scsi/sr.c	2005-01-18 09:53:50 -05:00
@@ -152,9 +152,11 @@
 static inline void scsi_cd_put(struct scsi_cd *cd)
 {
+	struct scsi_device *sdev = cd->device;
+
 	down(&sr_ref_sem);
 	kref_put(&cd->kref, sr_kref_release);
-	scsi_device_put(cd->device);
+	scsi_device_put(sdev);
 	up(&sr_ref_sem);
 }
* Re: [PATCH as448] Fix reference to deallocated memory in sr.c
From: James Bottomley @ 2005-01-18 15:10 UTC
To: Alan Stern
Cc: SCSI Mailing List

On Tue, 2005-01-18 at 09:56 -0500, Alan Stern wrote:
> When I posted a patch last week to fix a reference to deallocated memory
> in sd.c, I forgot to check whether the same problem exists in sr.c.  It
> does, and here's the patch to fix it.

Yes, I already caught that in the scsi-rc-fixes-2.6 tree (although I
haven't synchronised it yet).

James
* Help decoding: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
From: Guy @ 2005-01-18 20:55 UTC
Cc: 'SCSI Mailing List'

Can anyone help decode this info?

What is 0x25e6e3?
What disk is sd08:b1?

I have disks on 3 SCSI buses (scsi0, scsi2 and scsi3).

Do you need more info?

Thanks,
Guy

kernel: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
kernel: Additional sense indicates Recovered data with error corr. & retries applied
* Re: Help decoding: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
From: Matthias Andree @ 2005-01-18 21:08 UTC
To: Guy

"Guy" <bugzilla@watkins-home.com> writes:

> Can anyone help decode this info?
>
> What is 0x25e6e3?
> What disk is sd08:b1?

/dev/sdl1 (ess dee ell one) - that's hexadecimal notation for a device
with major 8, minor 0xb1 = 177:

$ ls -l /dev/sd* | grep " 8, 177"
brw-rw----  1 root disk 8, 177 2004-10-02 10:38 /dev/sdl1

> kernel: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
> kernel: Additional sense indicates Recovered data with error corr. & retries
> applied

Time to check and possibly replace the drive, or at least refresh the
block.

smartmontools (on sourceforge) and perhaps badblocks or Jörg Schilling's
sformat (careful!) may help you with that.

--
Matthias Andree
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
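The arithmetic behind that answer can be sketched in a few lines of C. This is a hedged illustration only: sd_minor_to_name() and SD_MINORS_PER_DISK are names made up for this example, and it assumes the classic sd numbering on major 8, where each disk owns 16 consecutive minors (minor 0 of each group is the whole disk, 1 to 15 are partitions), so that major 8 covers sda through sdp.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: classic sd layout on major 8, 16 minors per disk. */
enum { SD_MINORS_PER_DISK = 16 };

/*
 * Hypothetical helper: turn a minor number on major 8 into a device name.
 * Writes e.g. "sdl1" into buf and returns what snprintf() returns.
 */
static int sd_minor_to_name(unsigned int minor, char *buf, size_t len)
{
	unsigned int disk = minor / SD_MINORS_PER_DISK;	/* 0 = sda, 11 = sdl */
	unsigned int part = minor % SD_MINORS_PER_DISK;	/* 0 = whole disk */

	assert(disk < 16);	/* major 8 only covers sda..sdp */

	if (part == 0)
		return snprintf(buf, len, "sd%c", 'a' + disk);
	return snprintf(buf, len, "sd%c%u", 'a' + disk, part);
}
```

For the message in question, minor 0xb1 is 177 = 11 * 16 + 1, that is, partition 1 of the twelfth disk, which is how sd08:b1 maps to /dev/sdl1.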
* RE: Help decoding: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
From: Guy @ 2005-01-18 23:32 UTC
To: 'Matthias Andree'
Cc: 'SCSI Mailing List'

Good info.  Thanks!
I could not find the answer with google.  Too much noise!

Is 0x25e6e3 the block number?
If it is, is it relative to the beginning of sdl1, or sdl?
If not, what is it?

Thanks,
Guy
* Re: Help decoding: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
From: Douglas Gilbert @ 2005-01-19 1:22 UTC
To: Guy
Cc: 'Matthias Andree', 'SCSI Mailing List'

Guy wrote:
> Good info.  Thanks!
> I could not find the answer with google.  Too much noise!
>
> Is 0x25e6e3 the block number?

Yes (logical block number expressed in hex)

> If it is, is it relative to the beginning of sdl1, or sdl?

/dev/sdl

> If not, what is it?

Looking at the settings of the "read write error recovery"
mode page on /dev/sdl may be instructive.
['sginfo -e /dev/sdl' from sg3_utils.] The PER bit seems
to be set (otherwise a recovered error should not have been
reported) but the ARRE and AWRE bits are probably clear.
Those bits control the automatic reassignment of a block
when a recovered error occurs, as reported in your case.

Assuming the problem occurred on a read and that the ARRE
bit is clear, you may want to reassign that block.
To check its current state you might try:
   sg_dd if=/dev/sdl skip=0x25e6e3 of=. bs=512 count=1 blk_sgio=1

If that recovered error persists (or worse), rather than
formatting the disk, reassigning that block is more
surgical. sg_reassign has been added to sg3_utils recently
(v1.12 beta at www.torque.net/sg) to do this. In your case:
   sg_reassign -a 0x25e6e3 /dev/sdl

If successful, the replaced sector should go into the "grown"
defect list ('sginfo -G /dev/sdl'). It may be worth running
that command before and after the sg_reassign.

Another way to accomplish the same thing is to set the ARRE
bit (and the AWRE while you are at it) and do another
read of that block. The reported additional sense message
should change to something like "Recovered data: data
auto-reallocated". Reading the whole disk might be wise
(to see if that lba was a lone case).

More generally this is not a good sign concerning the health
of that disk. No data has been lost _yet_ but the disk had to
work hard to recover it. Any entries in the "grown"
defect list are not a good sign.

Also with smartmontools you might like to try
'smartctl -a /dev/sdl' and examine the "Error counter log"
and compare that with some of your other drives that are
not reporting problems. A long self test may also be
appropriate: 'smartctl -t long /dev/sdl'.

Doug Gilbert
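Because the reported block number is relative to the whole disk (/dev/sdl) rather than to the partition, a little arithmetic is needed to locate it. The following is a minimal sketch under stated assumptions: both function names are hypothetical, the 512-byte block size is taken from the bs=512 in the sg_dd command above, and the partition start and length would have to be read from the actual partition table (they are not given anywhere in this thread).

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a logical block from the start of the whole disk. */
static uint64_t lba_to_byte_offset(uint64_t lba, uint32_t block_size)
{
	return lba * block_size;
}

/*
 * Hypothetical helper: translate a whole-disk LBA into a block number
 * relative to a partition that starts at part_start and spans part_len
 * blocks. Returns 1 and stores the result in *rel if the LBA lies inside
 * the partition, 0 otherwise.
 */
static int lba_to_partition_block(uint64_t lba, uint64_t part_start,
				  uint64_t part_len, uint64_t *rel)
{
	if (lba < part_start || lba - part_start >= part_len)
		return 0;
	*rel = lba - part_start;
	return 1;
}
```

With 512-byte blocks, LBA 0x25e6e3 (2483939 decimal) lies 1271776768 bytes from the start of /dev/sdl; which block of sdl1 that corresponds to depends on where sdl1 begins.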
* RE: Help decoding: Info fld=0x25e6e3, Current sd08:b1: sense key Recovered Error
From: Guy @ 2005-01-19 4:32 UTC
To: dougg
Cc: 'Matthias Andree', 'SCSI Mailing List'

Lots of good info!  Thanks.

I have installed sg3_utils, cool stuff.
I knew about AWRE and ARRE.
AWRE is on, ARRE is off.  I do plan to turn on ARRE for all of my disks.

I can't reproduce these errors, so I guess they were write errors that
were relocated.  I was hoping to find a reproducible error, then turn
ARRE on and "see" the error get corrected.  You had the same idea using
'sginfo -G /dev/sdl' to verify the error was corrected.

I do a read test of all my disks, every night.  It is required, IMO,
since md kicks disks out for having 1 bad block.  I want to find the bad
blocks before md finds them.  But since I started my nightly disk tests,
I have had no bad blocks.  It seems ARRE is on, but it is not.

Anyway, thanks for the good info.

Guy