public inbox for linux-kernel@vger.kernel.org
From: Jarod Wilson <jwilson@redhat.com>
To: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: linux1394-devel@lists.sourceforge.net,
	"Kristian Høgsberg" <krh@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH update] firewire: fix "kobject_add failed for fw* with -EEXIST"
Date: Mon, 28 Jan 2008 17:24:11 -0500
Message-ID: <200801281724.12029.jwilson@redhat.com>
In-Reply-To: <479E24D6.4060006@s5r6.in-berlin.de>

On Monday 28 January 2008 01:54:14 pm Stefan Richter wrote:
> Jarod Wilson wrote:
> > We may have another issue there though, as when this happened to me, the
> > md layer apparently never noticed (after ~6 hours) that one of the array
> > members had disappeared -- not sure if that's firewire's fault or md's
> > though... This will presumably avoid this situation entirely, but worth
> > noting that there may still be somewhere we need to better communicate
> > status to an upper layer.
>
> I don't know how md ticks, so I have no idea what might have happened
> there.

It looks like firewire is doing the right thing, unregistering the fw* device, 
and the SCSI layer is subsequently removing the appropriate /dev/sd* nodes, 
but for whatever reason, md hasn't a clue this has happened. I can reproduce 
this particular part of the problem by bringing the array up, and then simply 
pulling the firewire cable on one of the drives in the array...
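
One way to spot this kind of mismatch is to compare the members md still lists as healthy against the device nodes that actually exist. The following is a minimal sketch, not an md tool: the function name `stale_members` and the sample text are made up for illustration, and it assumes the usual /proc/mdstat line shape (`md0 : active raid1 sdb[1] sda[0]`, with `(F)` marking members md already considers failed).

```python
import re

# Matches array members such as "sdb[1]" or "sdb[1](F)".
MEMBER = re.compile(r"(\w+)\[\d+\](\(F\))?")


def stale_members(mdstat_text, present):
    """Map each array to the members md lists as healthy but that are not in `present`."""
    stale = {}
    for line in mdstat_text.splitlines():
        if not line.startswith("md") or " : " not in line:
            continue
        array, rest = line.split(" : ", 1)
        gone = [dev for dev, failed in MEMBER.findall(rest)
                if not failed and dev not in present]
        if gone:
            stale[array.strip()] = gone
    return stale


# Illustrative input only: sdb's node is gone, yet md still lists it without (F).
sample = "md0 : active raid1 sdb[1] sda[0]\n"
print(stale_members(sample, present={"sda", "sdc"}))
```

On a live system the same function could presumably be fed `open("/proc/mdstat").read()` and `set(os.listdir("/dev"))`.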

> Somewhat related:  What if
>   - we lose connection to disk "A", represented by scsi_device "a",
>   - the SCSI core sets "a" offline,
>   - we gain connection to disk "A" again (i.e. it only shortly
>     disappeared from the bus from firewire-core's and -sbp2's point
>     of view),
>   - and firewire-sbp2 adds it as scsi_device "b", even before SCSI
>     core got rid of "a"?
> No big problem for stand-alone volumes (unless it happens when the
> volume is in use), but maybe trouble for md managed volumes.

That does appear to be the case. If I reconnect the drive I disconnected,
which was originally /dev/sdb, it now comes back up as /dev/sdd. So
apparently, the SCSI layer is at least bright enough to see that someone (md)
is still trying to use /dev/sdb, but I'm clueless as to why md doesn't have
any idea that /dev/sdb actually went away. :\
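
The renaming can be pictured with a toy model. This is not kernel code, and the class and method names are hypothetical. It assumes names are handed out from the lowest free slot, and that a slot is freed only once its last holder lets go:

```python
class NameAllocator:
    """Toy model of lowest-free-slot naming with reference counts (not the sd driver)."""

    def __init__(self):
        self.refs = {}  # slot index -> number of holders still referencing it

    def attach(self):
        index = 0
        while index in self.refs:
            index += 1
        self.refs[index] = 1  # the bus connection itself holds one reference
        return index

    def hold(self, index):
        self.refs[index] += 1

    def release(self, index):
        self.refs[index] -= 1
        if self.refs[index] == 0:
            del self.refs[index]  # slot becomes reusable only now

    @staticmethod
    def name(index):
        return "sd" + chr(ord("a") + index)


alloc = NameAllocator()
sda, sdb, sdc = alloc.attach(), alloc.attach(), alloc.attach()
alloc.hold(sdb)               # md opens the array member
alloc.release(sdb)            # cable pulled: bus reference gone, md still holds sdb
reconnected = alloc.attach()  # same disk plugged back in
print(NameAllocator.name(reconnected))  # prints "sdd"
```

In this model the old name only becomes available again after md drops its reference.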

> To smooth such issues out, my longer term goal was to allow brief
> periods of disconnection in (firewire-)sbp2.  I.e. the SCSI core
> wouldn't notice that "A"/"a" went away, it would only notice that "a"
> wasn't accessible for a short time.  I think the Fibre Channel drivers
> already support this.  The ieee1394 driver even has a "limbo" for
> devices which went away, in order to remember them until they come back,
> but sbp2 doesn't use this feature.  (Nobody did the work to enhance sbp2
> to utilize the feature.)
>
> BTW, if you unplug and replug a FireWire disk under Mac OS X fairly
> quickly, OS X will pretend that nothing happened and let the user
> continue using the disk if he hadn't "ejected" it before the brief
> connection loss.

Certainly sounds like a feature we'd benefit from having in this particular 
case...
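
A rough sketch of the idea, purely as a toy model: nothing here reflects the ieee1394 "limbo" code or any firewire-sbp2 interface, and `LimboTable`, its methods, and the grace period are all hypothetical. A device that drops off the bus is parked for a while, and it gets its old handle back if it returns in time:

```python
class LimboTable:
    """Toy model: keep a lost device's handle for a grace period (hypothetical)."""

    def __init__(self, grace_seconds):
        self.grace = grace_seconds
        self.active = {}      # guid -> handle currently in use
        self.limbo = {}       # guid -> (handle, time the device was lost)
        self.next_handle = 0

    def connect(self, guid, now):
        self.expire(now)
        if guid in self.limbo:
            handle, _ = self.limbo.pop(guid)  # came back in time: reuse handle
        else:
            handle = self.next_handle         # unknown or expired: new handle
            self.next_handle += 1
        self.active[guid] = handle
        return handle

    def disconnect(self, guid, now):
        self.limbo[guid] = (self.active.pop(guid), now)

    def expire(self, now):
        for guid, (_, lost) in list(self.limbo.items()):
            if now - lost > self.grace:
                del self.limbo[guid]          # grace period over: really gone


table = LimboTable(grace_seconds=5)
first = table.connect("disk-A", now=0)
table.disconnect("disk-A", now=1)
print(table.connect("disk-A", now=3) == first)  # brief loss: same handle
```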

> Anyhow, we have a few more urgent problems to solve in firewire-sbp2's
> reconnection handling before we can think about such extras.

Very true... Perhaps I'll just file this one away a bit down the TODO list for 
now... ;)

-- 
Jarod Wilson
jwilson@redhat.com
