From: Jarod Wilson <jwilson@redhat.com>
To: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: linux1394-devel@lists.sourceforge.net,
"Kristian Høgsberg" <krh@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH update] firewire: fix "kobject_add failed for fw* with -EEXIST"
Date: Mon, 28 Jan 2008 17:24:11 -0500
Message-ID: <200801281724.12029.jwilson@redhat.com>
In-Reply-To: <479E24D6.4060006@s5r6.in-berlin.de>

On Monday 28 January 2008 01:54:14 pm Stefan Richter wrote:
> Jarod Wilson wrote:
> > We may have another issue there though, as when this happened to me, the
> > md layer apparently never noticed (after ~6 hours) that one of the array
> > members had disappeared -- not sure if that's firewire's fault or md's
> > though... This will presumably avoid this situation entirely, but worth
> > noting that there may still be somewhere we need to better communicate
> > status to an upper layer.
>
> I don't know how md ticks, so I have no idea what might have happened
> there.

It looks like firewire is doing the right thing, unregistering the fw* device,
and the SCSI layer is subsequently removing the appropriate /dev/sd* nodes,
but for whatever reason, md hasn't a clue this has happened. I can reproduce
this particular part of the problem by bringing the array up, and then simply
pulling the firewire cable on one of the drives in the array...
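
For anyone who wants to check for this mismatch from userspace, here is a
rough sketch in Python. It assumes the usual /proc/mdstat layout, where an
array line looks like "md0 : active raid1 sdb1[1] sda1[0]", and it simply
reports array members that no longer have a node in /dev. The function names
parse_mdstat and missing_members are made up for illustration; this is not
something md or mdadm provides.

```python
import os
import re


def parse_mdstat(text):
    """Map each md array name to the member device names listed for it."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\S*) : (.*)$", line)
        if not m:
            continue
        # Members look like "sdb1[1]" or "sda1[0](F)"; keep only the name.
        members = re.findall(r"(\S+?)\[\d+\](?:\([A-Z]\))*", m.group(2))
        arrays[m.group(1)] = members
    return arrays


def missing_members(arrays, dev_dir="/dev"):
    """Return (array, member) pairs whose device node no longer exists."""
    return [
        (md, dev)
        for md, devs in arrays.items()
        for dev in devs
        if not os.path.exists(os.path.join(dev_dir, dev))
    ]
```

Feeding it the contents of /proc/mdstat after pulling the cable should, if
the behaviour is as described above, list the vanished member even though md
still counts it as part of the array.
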
> Somewhat related: What if
>   - we lose connection to disk "A", represented by scsi_device "a",
>   - the SCSI core sets "a" offline,
>   - we gain connection to disk "A" again (i.e. it only shortly
>     disappeared from the bus from firewire-core's and -sbp2's point
>     of view),
>   - and firewire-sbp2 adds it as scsi_device "b", even before SCSI
>     core got rid of "a"?
> No big problem for stand-alone volumes (unless it happens when the
> volume is in use), but maybe trouble for md managed volumes.

That does appear to be the case. If I reconnect the drive I disconnected,
which was originally /dev/sdb, it comes back up as /dev/sdd now. So
apparently, the SCSI layer is at least bright enough to see that someone (md)
is still trying to use /dev/sdb, but I'm clueless as to why md doesn't have
any idea that /dev/sdb actually went away. :\

> To smooth such issues out, my longer term goal was to allow brief
> periods of disconnection in (firewire-)sbp2. I.e. the SCSI core
> wouldn't notice that "A"/"a" went away, it would only notice that "a"
> wasn't accessible for a short time. I think the Fibre Channel drivers
> already support this. The ieee1394 driver even has a "limbo" for
> devices which went away, in order to remember them until they come back,
> but sbp2 doesn't use this feature. (Nobody did the work to enhance sbp2
> to utilize the feature.)
>
> BTW, if you unplug and replug a FireWire disk under Mac OS X fairly
> quickly, OS X will pretend that nothing happened and let the user
> continue using the disk if he hadn't "ejected" it before the brief
> connection loss.

Certainly sounds like a feature we'd benefit from having in this particular
case...
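
To make the "limbo" idea concrete, here is a minimal sketch in Python. It is
only an illustration of the concept discussed above, not how ieee1394, sbp2
or the Fibre Channel drivers implement it: a device that disappears is parked
under its GUID for a grace period, and a reconnect within that period hands
back the same object instead of creating a new one. The class name Limbo, the
grace_seconds parameter and the injectable clock are all assumptions made for
the example.

```python
import time


class Limbo:
    """Hold recently disconnected devices so a quick replug can reuse them."""

    def __init__(self, grace_seconds, clock=time.monotonic):
        self.grace = grace_seconds
        self.clock = clock
        self.parked = {}  # guid -> (device, time of disconnect)

    def disconnect(self, guid, device):
        # Park the device instead of tearing it down immediately.
        self.parked[guid] = (device, self.clock())

    def reconnect(self, guid):
        # Return the parked device if it came back in time, else None.
        entry = self.parked.pop(guid, None)
        if entry is None:
            return None
        device, since = entry
        return device if self.clock() - since <= self.grace else None

    def expire(self):
        # Drop devices that stayed away too long; return them for teardown.
        now = self.clock()
        gone = [g for g, (_, t) in self.parked.items() if now - t > self.grace]
        return [self.parked.pop(g)[0] for g in gone]
```

In the scenario above, disk "A" coming back within the grace period would be
matched to scsi_device "a" again rather than being added as "b"; only devices
returned by expire() would actually be removed.
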
> Anyhow, we have a few more urgent problems to solve in firewire-sbp2's
> reconnection handling before we can think about such extras.

Very true... Perhaps I'll just file this one away a bit down the TODO list for
now... ;)
--
Jarod Wilson
jwilson@redhat.com