Re: how to do fc_remote_port_delete correctly

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Joe Eykholt <jeykholt-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
To: James Smart <James.Smart-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org>
Cc: "devel-s9riP+hp16TNLxjTenLetw@public.gmane.org"
	<devel-s9riP+hp16TNLxjTenLetw@public.gmane.org>,
	"linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: how to do fc_remote_port_delete correctly
Date: Wed, 24 Jun 2009 10:10:09 -0700	[thread overview]
Message-ID: <4A425DF1.30005@cisco.com> (raw)
In-Reply-To: <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>

James Smart wrote:
> 
> 
> Joe Eykholt wrote:
>> This seems pretty basic, but I'm having trouble with it.
>>
>> The current libfc code has a private structure called fc_rport_libfc_priv
>> that gets allocated along with the fc_rport structure.
>>
>> I'm working on patches that restructure this to be separately allocated,
>> since we need a the private structure to be around before we can do
>> fc_remote_port_add().  It's a long story, but my question applies to
>> any FC driver that wants to allocate its rport private data separately.
>>   
> Most of the other fc drivers are in this position already, and always 
> have been. They've been the reverse - they desire a position where they 
> could move to rport data, but you seem to always need a context before 
> you know what it is enough to talk to the transport.
>> How should rport->dd_data be set to NULL before/after calling
>> fc_rport_delete().
>>   
> You never do set it to NULL. This is the role of the transport, and it 
> will do so after calling the devloss_tmo_callbk(), which is the 
> "end-of-life" indicator for the rport.

That sounds doable.  One thing I'm wondering about is what if I
re-discover the same remote port before devloss timeout.

I'll create new context, but then when I do fc_remote_port_add()
I'll find that the fc_rport already has a dd_data.   Now I have
two contexts for the same rport.  I guess I could switch over to
the old one, but it's a bit awkward.

One aspect is that the newly discovered rport may have the same port_id
but different port_name and/or node_name.  Since libfc shouldn't be
concerned about what the binding method is, it may or may not be the
same fc_rport as before.  So fc_remote_port_add should not be called
until all three items are known, it seems to me.

It would be nice if there was a way for scsi_transport_fc to allow
creating the fc_rport with only port_id known and add functions to
change port_name and/or node_name later.  However, that just passes
the problem up to the transport.  It would be possible but unlikely
for a new remote port to have the same port_name as an old one, and
if that's the binding method, it's messy to handle.

BTW, why is it possible to bind by node_name?  It seems useless to me.
If we made node_name just an attribute and not something that could
be used for binding, it makes the problem slightly easier.  At least
we could then create fc_rports before knowing the node name.

> If you are changing it - it can cause problems as the transport still 
> has the rport active, may make other calls, etc until 
> devloss_tmo_callbk() would occur.

Right.  That's the issue I see.

>> I used to set it to NULL before, but figure it's safer to clear it after
>> terminate_io has been called.  I did a get_device(&rport->dev) first
>> to be sure the rport doesn't get freed in the meantime.
>>   
> terminate_io is one of those calls the transport tries to make before 
> the rport truly dies.
> 
>> However, other threads may be in queuecommand already and already
>> beyond their call to fc_remote_port_chkready() and will reference 
>> dd_data.
>> Maybe they just aren't allowed do that?
>>
>> Or maybe dd_data isn't allowed to go NULL until the rport times
>> out and is getting deleted?
>>   
> Yes. This last point.
> 
>> Maybe we need a small private rport data with just enough info for the 
>> I/O
>> that's already in progress, or I can recode the I/O path to not use
>> dd_data (it's mostly stuff that can be gotten through scsi_host instead).
>>   
> have it be whatever you need it to be. You can always maintain your own 
> state flag in your own structure.
> 
>> Or maybe we need to do more in our fc_rport_terminate_io callback
>> from fc_remote_port_delete() to be sure that all I/O really ceases.
>> We currently do an exch_mgr_reset(), but perhaps some I/O thread hasn't
>> allocated an exchange yet.
>>   
> Sounds like race conditions within all the library routines.   The 
> transport has some simple expectations, and really doesn't care how many 
> internal states the LLDD has. fc_rport_terminated_io() should terminate 
> (or at least initiate termination) of anything within the driver (in a 
> path post queuecommand()). Fc_remote_port_delete() is really just a 
> notifier to the transport that connection to the rport has gone away and 
> to start the devloss timers and block/reject new io requests - so it's 
> too early in the process. I/O may or may not be killed as of 
> rport_delete, as that's somewhat a LLDD policy (depends on FCP-2 
> Recovery desires and how discovery engines work).

I understand.

> The devloss_tmo_callbk(), which is the "end-of-life" indicator for the 
> rport, should never return from the LLDD until anything referencing the 
> rport (a if you're using it's dd_data, that too), has been killed - as 
> the transport may immediately "free" the rport after it's return.   Do 
> you have a devloss_tmo_callbk() ?  

No.  I can see that it would be useful.

Thanks very much for the help!

	Joe

next prev parent reply	other threads:[~2009-06-24 17:10 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-06-24  0:27 how to do fc_remote_port_delete correctly Joe Eykholt
     [not found] ` <4A4172FA.70008-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-06-24 14:40   ` James Smart
     [not found]     ` <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
2009-06-24 16:29       ` James Smart
2009-06-24 16:30       ` James Smart
     [not found]         ` <4A4254BC.6090302-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
2009-06-24 17:47           ` Joe Eykholt
     [not found]             ` <4A426698.3-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-06-24 18:31               ` James Smart
     [not found]                 ` <4A427107.6060303-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
2009-06-24 19:33                   ` Joe Eykholt
2009-06-24 17:10       ` Joe Eykholt [this message]
     [not found]         ` <4A425DF1.30005-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-06-24 17:43           ` James Smart
2009-06-24 17:45         ` [Open-FCoE] " James Smart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A425DF1.30005@cisco.com \
    --to=jeykholt-fyb4gu1cfyuavxtiumwx3w@public.gmane.org \
    --cc=James.Smart-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org \
    --cc=devel-s9riP+hp16TNLxjTenLetw@public.gmane.org \
    --cc=linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.