* md disk fault communication code
From: Sonu a @ 2014-04-18 5:38 UTC
To: linux-raid

When a disk is removed without using mdadm, the stack below shows the
notification reaching the md driver:

dump_stack+0x49/0x5e
md_error+0x50/0x110 [md_mod]
state_store+0x43/0x300 [md_mod]
rdev_attr_store+0xad/0xd0 [md_mod]
? sysfs_write_file+0x62/0x1c0
sysfs_write_file+0x138/0x1c0
vfs_write+0xc0/0x1e0
SyS_write+0x5a/0xa0
? __audit_syscall_exit+0x246/0x2f0
system_call_fastpath+0x16/0x1b

Could someone point me to the code that monitors SCSI disk status and
calls the md driver's sysfs interface accordingly?

Thx.

* Re: md disk fault communication code
From: NeilBrown @ 2014-04-18 6:13 UTC
To: Sonu a; +Cc: linux-raid

On Fri, 18 Apr 2014 13:38:58 +0800 Sonu a <p10sonu@gmail.com> wrote:

> when disk is removed with out mdadm as I see from the stack below the
> communication reaching the md driver.
>
> dump_stack+0x49/0x5e
> md_error+0x50/0x110 [md_mod]
> state_store+0x43/0x300 [md_mod]
> rdev_attr_store+0xad/0xd0 [md_mod]
> ? sysfs_write_file+0x62/0x1c0
> sysfs_write_file+0x138/0x1c0
> vfs_write+0xc0/0x1e0
> SyS_write+0x5a/0xa0
> ? __audit_syscall_exit+0x246/0x2f0
> system_call_fastpath+0x16/0x1b
>
> could someone point me to the code which is monitoring scsi disks
> status and thus calling md driver sysfs interface accordingly ?

I think you are asking how md_error gets called when a SCSI device fails,
having already discovered how it is called when you explicitly write to a
sysfs file.

Nothing monitors the SCSI disks. md only discovers a failure if it sends a
request to a disk and the request signals an error. If you search for
'bi_end_io', functions assigned to this field are called when a request
finishes. Those functions might call md_error if the request failed, or they
might schedule some other handling first to try to correct the error.

NeilBrown
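
The completion-callback pattern described above can be sketched with a small userspace model. This is only an illustration in Python, not kernel code: the `Bio` and `ToyArray` classes, the `submit` and `end_request` methods, and the use of -5 as a stand-in for -EIO are all hypothetical names chosen for the example. Only `md_error` and the idea of a callback stored in a `bi_end_io`-like field come from the message.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Bio:
    """Toy stand-in for a block I/O request (hypothetical, not the kernel struct)."""
    device: str
    # Completion callback, loosely modelled on the kernel's bi_end_io field.
    end_io: Optional[Callable[["Bio", int], None]] = None


@dataclass
class ToyArray:
    """Toy stand-in for an md array that tracks which members are faulty."""
    members: List[str]
    faulty: List[str] = field(default_factory=list)

    def md_error(self, device: str) -> None:
        # Record the member as faulty, at most once.
        if device in self.members and device not in self.faulty:
            self.faulty.append(device)

    def end_request(self, bio: Bio, error: int) -> None:
        # Runs when a request finishes; only a failed request marks the device.
        if error:
            self.md_error(bio.device)

    def submit(self, device: str, disk_ok: bool) -> None:
        bio = Bio(device=device, end_io=self.end_request)
        # The "lower layer" completes the request and reports its status.
        bio.end_io(bio, 0 if disk_ok else -5)  # -5 stands in for -EIO


array = ToyArray(members=["sda", "sdc"])
array.submit("sda", disk_ok=True)
array.submit("sdc", disk_ok=False)
print(array.faulty)  # prints ['sdc']
```

The point of the model is that `md_error` is reached only from a completion callback, so a member that never receives a request is never marked faulty by this path.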

* Re: md disk fault communication code
From: Sonu a @ 2014-04-18 6:47 UTC
To: NeilBrown; +Cc: linux-raid

Yes, it does when there is an I/O failure.

But my question was about a disk that fails silently, without any I/O, as
shown below.

The md sysfs interface /sys/block/mdY/md/dev-sdX/state is written with
"faulty" when the corresponding sd disk is deleted with:

echo 1 > /sys/block/sdc/device/delete

kernel: [21853.981735] sd 2:0:0:0: [sdc] Synchronizing SCSI cache
kernel: [21854.049967] md: md0 still in use.
kernel: [21854.051201] md/raid1:md0: Disk failure on sdc, disabling device.
kernel: [21854.051201] md/raid1:md0: Operation continuing on 1 devices.
kernel: [21854.308355] sd 2:0:0:0: [sdc] Stopping disk
kernel: [21854.415122] ata3.00: disabled
kernel: [21854.467540] md: unbind<sdc>
kernel: [21854.467544] md: export_rdev(sdc)

The earlier stack dump shows the sysfs write interface being used, so there
has to be code monitoring the block disk state and propagating that state
to md?

Thx.

* Re: md disk fault communication code
From: NeilBrown @ 2014-04-18 7:16 UTC
To: Sonu a; +Cc: linux-raid

On Fri, 18 Apr 2014 14:47:06 +0800 Sonu a <p10sonu@gmail.com> wrote:

> Yes it does when there is IO failure But.
>
> But my question was when disk fail silently with out IO as show below.
>
> The md sysfs interface /sys/block/mdY/md/dev-sdX/state is written with
> faulty when sd corresponding disk is deleted with..
>
> echo 1 > /sys/block/sdc/device/delete
>
> kernel: [21853.981735] sd 2:0:0:0: [sdc] Synchronizing SCSI cache
> kernel: [21854.049967] md: md0 still in use.
> kernel: [21854.051201] md/raid1:md0: Disk failure on sdc, disabling device.
> kernel: [21854.051201] md/raid1:md0: Operation continuing on 1 devices.
> kernel: [21854.308355] sd 2:0:0:0: [sdc] Stopping disk
> kernel: [21854.415122] ata3.00: disabled
> kernel: [21854.467540] md: unbind<sdc>
> kernel: [21854.467544] md: export_rdev(sdc)
>
> earlier stack dump which shows the sysfs write interface
>
> there has to be code monitoring block disk state, and propagating that
> state to the md ?

I understand your question now.

This is handled by udev.

/usr/lib/udev/rules.d/64-md-raid-assembly.rules

or some file name like that contains a line like

ACTION=="remove", ENV{ID_PATH}!="?*", RUN+="/sbin/mdadm -If $name"

so when the device is removed, udev runs "mdadm -If /dev/devicename".
mdadm finds which array this device is in, marks it as faulty via sysfs, and
then removes the device from the array if it can.

NeilBrown
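
The sysfs side of that sequence can be sketched roughly as follows. This is a minimal illustration in Python, not how mdadm itself is implemented (mdadm is written in C and does considerably more). The function names `find_member_state` and `fail_and_remove` are hypothetical, and the `sysfs_root` parameter is an assumption added so the sketch can be tried against a scratch directory instead of the real /sys. The state file path comes from the thread; `faulty` and `remove` are the keywords the kernel's md documentation lists as writable to that file.

```python
import glob
import os
from typing import Optional


def find_member_state(devname: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the path of the md 'state' file for devname, or None if not found."""
    pattern = os.path.join(
        sysfs_root, "block", "md*", "md", f"dev-{devname}", "state"
    )
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None


def fail_and_remove(devname: str, sysfs_root: str = "/sys") -> bool:
    """Mark devname faulty in its array, then try to remove it.

    Returns False when the device is not a member of any md array.
    """
    state = find_member_state(devname, sysfs_root)
    if state is None:
        return False
    # Writing "faulty" is the step that shows up as
    # state_store -> md_error in the stack trace from the first message.
    with open(state, "w") as f:
        f.write("faulty")
    try:
        # Removal may be refused (for example while the device is busy),
        # matching the "if it can" in the description above.
        with open(state, "w") as f:
            f.write("remove")
    except OSError:
        pass
    return True
```

For example, `fail_and_remove("sdc")` run as root would attempt the same two writes against the array that contains sdc. In practice the udev rule and mdadm already do this, so the sketch is only meant to make the data path visible.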