* md disk fault communication code
@ 2014-04-18 5:38 Sonu a
2014-04-18 6:13 ` NeilBrown
0 siblings, 1 reply; 4+ messages in thread
From: Sonu a @ 2014-04-18 5:38 UTC (permalink / raw)
To: linux-raid
When a disk is removed without using mdadm, I can see from the stack below
that the notification still reaches the md driver.
dump_stack+0x49/0x5e
md_error+0x50/0x110 [md_mod]
state_store+0x43/0x300 [md_mod]
rdev_attr_store+0xad/0xd0 [md_mod]
? sysfs_write_file+0x62/0x1c0
sysfs_write_file+0x138/0x1c0
vfs_write+0xc0/0x1e0
SyS_write+0x5a/0xa0
? __audit_syscall_exit+0x246/0x2f0
system_call_fastpath+0x16/0x1b
Could someone point me to the code that monitors SCSI disk status and
calls the md driver's sysfs interface accordingly?
Thx.
* Re: md disk fault communication code
2014-04-18 5:38 md disk fault communication code Sonu a
@ 2014-04-18 6:13 ` NeilBrown
2014-04-18 6:47 ` Sonu a
0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2014-04-18 6:13 UTC (permalink / raw)
To: Sonu a; +Cc: linux-raid
On Fri, 18 Apr 2014 13:38:58 +0800 Sonu a <p10sonu@gmail.com> wrote:
> when disk is removed with out mdadm as I see from the stack below the
> communication reaching the md driver.
>
> dump_stack+0x49/0x5e
> md_error+0x50/0x110 [md_mod]
> state_store+0x43/0x300 [md_mod]
> rdev_attr_store+0xad/0xd0 [md_mod]
> ? sysfs_write_file+0x62/0x1c0
> sysfs_write_file+0x138/0x1c0
> vfs_write+0xc0/0x1e0
> SyS_write+0x5a/0xa0
> ? __audit_syscall_exit+0x246/0x2f0
> system_call_fastpath+0x16/0x1b
>
> could someone point me to the code which is monitoring scsi disks
> status and thus calling md driver sysfs interface accordingly ?
I think you are asking how md_error gets called when a SCSI device fails,
having already discovered how it is called when you explicitly write to a
sysfs file.
Nothing monitors the scsi disks. md only discovers failure if it sends a
request to a disk, and the request signals an error. If you search for
'bi_end_io', functions assigned to this field are called when a request
finishes. Those functions might call md_error if the request failed, or they
might schedule some other handling first to try to correct the error.
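
The search suggested above could be done roughly as follows. This is only a
sketch: the function name find_end_io_handlers and the default source path
/usr/src/linux are assumptions for illustration, not something from this
thread. It lists the places under drivers/md where a completion handler is
assigned to bi_end_io, and skips comparisons such as "bi_end_io == NULL".

```shell
#!/bin/sh
# Hypothetical helper: list assignments to bi_end_io in the md driver sources.
# $1 is the path to a kernel source tree (assumed default: /usr/src/linux).
find_end_io_handlers() {
    ksrc="${1:-/usr/src/linux}"
    # -r: recurse, -n: print line numbers, -E: extended regular expressions.
    # "=[^=]" matches an assignment but not an "==" comparison.
    grep -rnE 'bi_end_io[[:space:]]*=[^=]' "$ksrc/drivers/md"
}

# Usage (not run automatically):
#   find_end_io_handlers ~/src/linux
```

Each function named on the right-hand side of a match is a candidate to read
for a call to md_error or for the retry handling mentioned above.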
NeilBrown
* Re: md disk fault communication code
2014-04-18 6:13 ` NeilBrown
@ 2014-04-18 6:47 ` Sonu a
2014-04-18 7:16 ` NeilBrown
0 siblings, 1 reply; 4+ messages in thread
From: Sonu a @ 2014-04-18 6:47 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
Yes, it does when there is an I/O failure.
But my question was about a disk that fails silently, without any I/O, as
shown below.
The md sysfs file /sys/block/mdY/md/dev-sdX/state is written with
"faulty" when the corresponding sd disk is deleted with:
echo 1 > /sys/block/sdc/device/delete
kernel: [21853.981735] sd 2:0:0:0: [sdc] Synchronizing SCSI cache
kernel: [21854.049967] md: md0 still in use.
kernel: [21854.051201] md/raid1:md0: Disk failure on sdc, disabling device.
kernel: [21854.051201] md/raid1:md0: Operation continuing on 1 devices.
kernel: [21854.308355] sd 2:0:0:0: [sdc] Stopping disk
kernel: [21854.415122] ata3.00: disabled
kernel: [21854.467540] md: unbind<sdc>
kernel: [21854.467544] md: export_rdev(sdc)
The stack dump I posted earlier shows the sysfs write interface being used.
So there has to be code that monitors the block device state and propagates
that state to md?
Thx.
* Re: md disk fault communication code
2014-04-18 6:47 ` Sonu a
@ 2014-04-18 7:16 ` NeilBrown
0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2014-04-18 7:16 UTC (permalink / raw)
To: Sonu a; +Cc: linux-raid
On Fri, 18 Apr 2014 14:47:06 +0800 Sonu a <p10sonu@gmail.com> wrote:
> Yes it does when there is IO failure But.
>
> But my question was when disk fail silently with out IO as show below.
>
> The md sysfs interface /sys/block/mdY/md/dev-sdX/state is written with
> faulty when sd corresponding disk is deleted with..
>
> echo 1 > /sys/block/sdc/device/delete
>
> kernel: [21853.981735] sd 2:0:0:0: [sdc] Synchronizing SCSI cache
> kernel: [21854.049967] md: md0 still in use.
> kernel: [21854.051201] md/raid1:md0: Disk failure on sdc, disabling device.
> kernel: [21854.051201] md/raid1:md0: Operation continuing on 1 devices.
> kernel: [21854.308355] sd 2:0:0:0: [sdc] Stopping disk
> kernel: [21854.415122] ata3.00: disabled
> kernel: [21854.467540] md: unbind<sdc>
> kernel: [21854.467544] md: export_rdev(sdc)
>
> earlier stack dump which shows the sysfs write interface
>
> there has to be code monitoring block disk state, and propagating that
> state to the md ?
I understand your question now.
This is handled by udev. /usr/lib/udev/rules.d/64-md-raid-assembly.rules, or
a file with a similar name, contains a line like
ACTION=="remove", ENV{ID_PATH}!="?*", RUN+="/sbin/mdadm -If $name"
so when the device is removed, udev runs "mdadm -If /dev/devicename".
mdadm finds which array this device is in, marks it as faulty via sysfs, and
then removes the device from the array if it can.
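
For illustration, the sysfs side of that sequence can be reproduced by hand
through the state file mentioned earlier in the thread. This is a minimal
sketch, not mdadm's actual code: the function name fail_and_remove and the
SYSFS_ROOT override are assumptions added so it can be tried against a scratch
directory. It relies on the md state attribute accepting the words "faulty"
and "remove".

```shell
#!/bin/sh
# Hypothetical helper: mark a member device faulty, then try to remove it,
# using /sys/block/mdY/md/dev-sdX/state.
# $1 = array name (e.g. md0), $2 = member device name (e.g. sdc).
# SYSFS_ROOT is an assumed override for testing; it defaults to /sys.
fail_and_remove() {
    md="$1"
    dev="$2"
    state="${SYSFS_ROOT:-/sys}/block/$md/md/dev-$dev/state"

    if [ ! -w "$state" ]; then
        echo "no writable state file: $state" >&2
        return 1
    fi

    # This write is what shows up as state_store -> md_error in the stack.
    echo faulty > "$state" || return 1

    # Removal may be refused while the array is still using the device,
    # so report the failure instead of treating it as fatal.
    echo remove > "$state" || echo "could not remove $dev from $md yet" >&2
}

# Usage (as root, not run automatically):
#   fail_and_remove md0 sdc
```

In normal operation there is no need to do this manually, since the udev rule
runs "mdadm -If" for you; the sketch only shows where the sysfs write in the
original stack trace comes from.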
NeilBrown