* how to do fc_remote_port_delete correctly
@ 2009-06-24 0:27 Joe Eykholt
[not found] ` <4A4172FA.70008-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 10+ messages in thread
From: Joe Eykholt @ 2009-06-24 0:27 UTC (permalink / raw)
To: devel@open-fcoe.org, linux-scsi
This seems pretty basic, but I'm having trouble with it.
The current libfc code has a private structure called fc_rport_libfc_priv
that gets allocated along with the fc_rport structure.
I'm working on patches that restructure this to be separately allocated,
since we need a the private structure to be around before we can do
fc_remote_port_add(). It's a long story, but my question applies to
any FC driver that wants to allocate its rport private data separately.
How should rport->dd_data be set to NULL before/after calling
fc_rport_delete().
I used to set it to NULL before, but figure it's safer to clear it after
terminate_io has been called. I did a get_device(&rport->dev) first
to be sure the rport doesn't get freed in the meantime.
However, other threads may be in queuecommand already and already
beyond their call to fc_remote_port_chkready() and will reference dd_data.
Maybe they just aren't allowed do that?
Or maybe dd_data isn't allowed to go NULL until the rport times
out and is getting deleted?
Maybe we need a small private rport data with just enough info for the I/O
that's already in progress, or I can recode the I/O path to not use
dd_data (it's mostly stuff that can be gotten through scsi_host instead).
Or maybe we need to do more in our fc_rport_terminate_io callback
from fc_remote_port_delete() to be sure that all I/O really ceases.
We currently do an exch_mgr_reset(), but perhaps some I/O thread hasn't
allocated an exchange yet.
Ideas?
Thanks,
Joe
^ permalink raw reply [flat|nested] 10+ messages in thread[parent not found: <4A4172FA.70008-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>]
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A4172FA.70008-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> @ 2009-06-24 14:40 ` James Smart [not found] ` <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: James Smart @ 2009-06-24 14:40 UTC (permalink / raw) To: Joe Eykholt Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Joe Eykholt wrote: > This seems pretty basic, but I'm having trouble with it. > > The current libfc code has a private structure called fc_rport_libfc_priv > that gets allocated along with the fc_rport structure. > > I'm working on patches that restructure this to be separately allocated, > since we need a the private structure to be around before we can do > fc_remote_port_add(). It's a long story, but my question applies to > any FC driver that wants to allocate its rport private data separately. > Most of the other fc drivers are in this position already, and always have been. They've been the reverse - they desire a position where they could move to rport data, but you seem to always need a context before you know what it is enough to talk to the transport. > How should rport->dd_data be set to NULL before/after calling > fc_rport_delete(). > You never do set it to NULL. This is the role of the transport, and it will do so after calling the devloss_tmo_callbk(), which is the "end-of-life" indicator for the rport. If you are changing it - it can cause problems as the transport still has the rport active, may make other calls, etc until devloss_tmo_callbk() would occur. > I used to set it to NULL before, but figure it's safer to clear it after > terminate_io has been called. I did a get_device(&rport->dev) first > to be sure the rport doesn't get freed in the meantime. > terminate_io is one of those calls the transport tries to make before the rport truly dies. > However, other threads may be in queuecommand already and already > beyond their call to fc_remote_port_chkready() and will reference dd_data. > Maybe they just aren't allowed do that? > > Or maybe dd_data isn't allowed to go NULL until the rport times > out and is getting deleted? > Yes. This last point. > Maybe we need a small private rport data with just enough info for the I/O > that's already in progress, or I can recode the I/O path to not use > dd_data (it's mostly stuff that can be gotten through scsi_host instead). > have it be whatever you need it to be. You can always maintain your own state flag in your own structure. > Or maybe we need to do more in our fc_rport_terminate_io callback > from fc_remote_port_delete() to be sure that all I/O really ceases. > We currently do an exch_mgr_reset(), but perhaps some I/O thread hasn't > allocated an exchange yet. > Sounds like race conditions within all the library routines. The transport has some simple expectations, and really doesn't care how many internal states the LLDD has. fc_rport_terminated_io() should terminate (or at least initiate termination) of anything within the driver (in a path post queuecommand()). Fc_remote_port_delete() is really just a notifier to the transport that connection to the rport has gone away and to start the devloss timers and block/reject new io requests - so it's too early in the process. I/O may or may not be killed as of rport_delete, as that's somewhat a LLDD policy (depends on FCP-2 Recovery desires and how discovery engines work). The devloss_tmo_callbk(), which is the "end-of-life" indicator for the rport, should never return from the LLDD until anything referencing the rport (a if you're using it's dd_data, that too), has been killed - as the transport may immediately "free" the rport after it's return. Do you have a devloss_tmo_callbk() ? -- james > Ideas? > > Thanks, > Joe > > > > _______________________________________________ > devel mailing list > devel-s9riP+hp16TNLxjTenLetw@public.gmane.org > http://www.open-fcoe.org/mailman/listinfo/devel > > ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>]
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> @ 2009-06-24 16:29 ` James Smart 2009-06-24 16:30 ` James Smart 2009-06-24 17:10 ` Joe Eykholt 2 siblings, 0 replies; 10+ messages in thread From: James Smart @ 2009-06-24 16:29 UTC (permalink / raw) To: James Smart Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org James Smart wrote: > You never do set it to NULL. This is the role of the transport, and it > will do so after calling the devloss_tmo_callbk(), which is the > "end-of-life" indicator for the rport. > > If you are changing it - it can cause problems as the transport still > has the rport active, may make other calls, etc until > devloss_tmo_callbk() would occur. > Actually, it can cause other problems if you're changing dd_data. The transport will set it when it allocates the rport. As the rport can also be used as a container for the scsi tgt id bindings, and later reused if the same device comes back post devloss_tmo (in which case, it "appears" as a new rport to the LLDD). In this case, if you are NULL-ing the dd_data value, you've hosed the structure for the later use. Let the tansport manage the dd_data value, you can manage the contents pointed to by dd_data. -- james s ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> 2009-06-24 16:29 ` James Smart @ 2009-06-24 16:30 ` James Smart [not found] ` <4A4254BC.6090302-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> 2009-06-24 17:10 ` Joe Eykholt 2 siblings, 1 reply; 10+ messages in thread From: James Smart @ 2009-06-24 16:30 UTC (permalink / raw) To: James Smart Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org James Smart wrote: > You never do set it to NULL. This is the role of the transport, and it > will do so after calling the devloss_tmo_callbk(), which is the > "end-of-life" indicator for the rport. > > If you are changing it - it can cause problems as the transport still > has the rport active, may make other calls, etc until > devloss_tmo_callbk() would occur. > > Actually, it can cause other problems if you're changing dd_data. The transport will set it when it allocates the rport. As the rport can also be used as a container for the scsi tgt id bindings, and later reused if the same device comes back post devloss_tmo (in which case, it "appears" as a new rport to the LLDD). In this case, if you are NULL-ing the dd_data value, you've hosed the structure for the later use. Let the tansport manage the dd_data value, you can manage the contents pointed to by dd_data. -- james s ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <4A4254BC.6090302-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>]
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A4254BC.6090302-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> @ 2009-06-24 17:47 ` Joe Eykholt [not found] ` <4A426698.3-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: Joe Eykholt @ 2009-06-24 17:47 UTC (permalink / raw) To: James Smart Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org James Smart wrote: > > James Smart wrote: >> You never do set it to NULL. This is the role of the transport, and it >> will do so after calling the devloss_tmo_callbk(), which is the >> "end-of-life" indicator for the rport. >> >> If you are changing it - it can cause problems as the transport still >> has the rport active, may make other calls, etc until >> devloss_tmo_callbk() would occur. >> >> > Actually, it can cause other problems if you're changing dd_data. The > transport will set it when it allocates the rport. As the rport can also > be used as a container for the scsi tgt id bindings, and later reused if > the same device comes back post devloss_tmo (in which case, it "appears" > as a new rport to the LLDD). In this case, if you are NULL-ing the > dd_data value, you've hosed the structure for the later use. Let the > tansport manage the dd_data value, you can manage the contents pointed > to by dd_data. I see what you mean and agree. It seems like there are two slightly different sets of rules depending on whether dd_fcrport_size is zero or not, as specified by the LLDD. In the first model, where dd_fcrport_size is zero, the transport never sets dd_data at all. My understanding now is that its OK for the LLDD to set it non-NULL, but not OK to change it after that. I guess it would be OK but unnecessary to NULL it at dev_loss timeout just before freeing the attached context. These are the usage rules I didn't fully understand. These rules are really established by how the LLDDs I/O routines use dd_data. In the second model, it doesn't seem like the LLDD has full control over the contents pointed to by dd_data either, since when the remote port is re-added the area pointed to by dd_data is cleared by the transport, so we always start fresh. This is fine, but has implications on how the context is used during devloss. For example, it shouldn't be used for list linkage unless it's unlinked before fc_remote_port_add. All that's in the LLDDs control, so it's OK. For libfc, I'm leaning towards continuing to use a non-zero dd_fcrport_size and the fc_rport_libfc_priv struct. libfc could use a separately allocated struct like fc_disc_rport for the discovery and rport (PLOGI, PRLI, etc.) state machines. This is all an effort to clean up some issues caused by creating "rogue" fc_rports in libfc so that we would always have both an fc_rport_libfc_priv and an fc_rport allocated together, even before fc_remote_port_add(). It causes issues when we do remote_port_add and have to transition the state from the rogue to the "real" rport. In the meantime, the rogue could still be accessed by incoming requests, or new RSCNs, and those changes wouldn't get reflected to the real rport. It's messy, and hard to analyze all the potential problems, so I'm trying to fix that. I really appreciate your help! Thanks a bunch! Joe ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <4A426698.3-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>]
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A426698.3-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> @ 2009-06-24 18:31 ` James Smart [not found] ` <4A427107.6060303-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: James Smart @ 2009-06-24 18:31 UTC (permalink / raw) To: Joe Eykholt Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Joe Eykholt wrote: > It seems like there are two slightly different sets of rules depending > on whether dd_fcrport_size is zero or not, as specified by the LLDD. > > In the first model, where dd_fcrport_size is zero, the transport > never sets dd_data at all. My understanding now is that its OK > for the LLDD to set it non-NULL, but not OK to change it after that. > I guess it would be OK but unnecessary to NULL it at dev_loss timeout > just before freeing the attached context. These are the usage rules > I didn't fully understand. These rules are really established by > how the LLDDs I/O routines use dd_data. No. In all cases where you see dd_data, the dd_data field is a transport owned field. The LLDD can change the contents of data pointed to by dd_data, but an LLDD is not supposed to change the dd_data field itself. > > In the second model, it doesn't seem like the LLDD has full control over > the contents pointed to by dd_data either, since when the remote > port is re-added the area pointed to by dd_data is cleared by the > transport, so we always start fresh. This is fine, but has > implications on how the context is used during devloss. For example, > it shouldn't be used for list linkage unless it's unlinked > before fc_remote_port_add. All that's in the LLDDs control, so it's OK. The only place the transport should be zero'ing the contents are: a) when it's allocated b) when it was deleted but kept around to hold the target id binding, then reallocated. Note: by deleted, I mean it fully transitioned through devloss_tmo_callbk() so for all intents and purposes, its as if it was freed and realloc'd. If there's another case, it's bug and we should be fixing it. > > For libfc, I'm leaning towards continuing to use a non-zero dd_fcrport_size > and the fc_rport_libfc_priv struct. libfc could use a separately > allocated struct like fc_disc_rport for the discovery and > rport (PLOGI, PRLI, etc.) state machines. makes sense > > This is all an effort to clean up some issues caused by creating "rogue" > fc_rports in libfc so that we would always have both an fc_rport_libfc_priv > and an fc_rport allocated together, even before fc_remote_port_add(). > It causes issues when we do remote_port_add and have to transition > the state from the rogue to the "real" rport. > In the meantime, the rogue could still be accessed by incoming requests, > or new RSCNs, and those changes wouldn't get reflected to the real rport. > It's messy, and hard to analyze all the potential problems, so I'm > trying to fix that. > > I really appreciate your help! Thanks a bunch! > > Joe Ok.. Good Luck. -- james ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <4A427107.6060303-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>]
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A427107.6060303-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> @ 2009-06-24 19:33 ` Joe Eykholt 0 siblings, 0 replies; 10+ messages in thread From: Joe Eykholt @ 2009-06-24 19:33 UTC (permalink / raw) To: James Smart Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org James Smart wrote: > > > Joe Eykholt wrote: >> It seems like there are two slightly different sets of rules depending >> on whether dd_fcrport_size is zero or not, as specified by the LLDD. >> >> In the first model, where dd_fcrport_size is zero, the transport >> never sets dd_data at all. My understanding now is that its OK >> for the LLDD to set it non-NULL, but not OK to change it after that. >> I guess it would be OK but unnecessary to NULL it at dev_loss timeout >> just before freeing the attached context. These are the usage rules >> I didn't fully understand. These rules are really established by >> how the LLDDs I/O routines use dd_data. > > No. In all cases where you see dd_data, the dd_data field is a transport > owned field. The LLDD can change the contents of data pointed to by > dd_data, but an LLDD is not supposed to change the dd_data field itself. I didn't know that was the intent. The comment "exported data" seems to indicate the LLDD can use it. In that case, there is no other good hook the LLDD could use for its private data and this model would never be used. Since the transport never sets dd_data to anything (except NULL on initial allocation) in this model, I think it would be OK for the LLDD to set it once, but it does cause usage problems for the I/O routines. I think most/all LLDDs have non-zero dd_fcrport_size so this model isn't used. >> In the second model, it doesn't seem like the LLDD has full control over >> the contents pointed to by dd_data either, since when the remote >> port is re-added the area pointed to by dd_data is cleared by the >> transport, so we always start fresh. This is fine, but has >> implications on how the context is used during devloss. For example, >> it shouldn't be used for list linkage unless it's unlinked >> before fc_remote_port_add. All that's in the LLDDs control, so it's OK. > > The only place the transport should be zero'ing the contents are: > a) when it's allocated > b) when it was deleted but kept around to hold the target id binding, then > reallocated. Note: by deleted, I mean it fully transitioned through > devloss_tmo_callbk() so for all intents and purposes, its as if it was > freed and realloc'd. > > If there's another case, it's bug and we should be fixing it. I didn't see any other case, and didn't mean to imply that there was. >> For libfc, I'm leaning towards continuing to use a non-zero >> dd_fcrport_size >> and the fc_rport_libfc_priv struct. libfc could use a separately >> allocated struct like fc_disc_rport for the discovery and >> rport (PLOGI, PRLI, etc.) state machines. > > makes sense > >> >> This is all an effort to clean up some issues caused by creating "rogue" >> fc_rports in libfc so that we would always have both an >> fc_rport_libfc_priv >> and an fc_rport allocated together, even before fc_remote_port_add(). >> It causes issues when we do remote_port_add and have to transition >> the state from the rogue to the "real" rport. >> In the meantime, the rogue could still be accessed by incoming requests, >> or new RSCNs, and those changes wouldn't get reflected to the real rport. >> It's messy, and hard to analyze all the potential problems, so I'm >> trying to fix that. >> >> I really appreciate your help! Thanks a bunch! >> >> Joe > > Ok.. Good Luck. > > -- james Thanks again. Joe ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> 2009-06-24 16:29 ` James Smart 2009-06-24 16:30 ` James Smart @ 2009-06-24 17:10 ` Joe Eykholt [not found] ` <4A425DF1.30005-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> 2009-06-24 17:45 ` [Open-FCoE] " James Smart 2 siblings, 2 replies; 10+ messages in thread From: Joe Eykholt @ 2009-06-24 17:10 UTC (permalink / raw) To: James Smart Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org James Smart wrote: > > > Joe Eykholt wrote: >> This seems pretty basic, but I'm having trouble with it. >> >> The current libfc code has a private structure called fc_rport_libfc_priv >> that gets allocated along with the fc_rport structure. >> >> I'm working on patches that restructure this to be separately allocated, >> since we need a the private structure to be around before we can do >> fc_remote_port_add(). It's a long story, but my question applies to >> any FC driver that wants to allocate its rport private data separately. >> > Most of the other fc drivers are in this position already, and always > have been. They've been the reverse - they desire a position where they > could move to rport data, but you seem to always need a context before > you know what it is enough to talk to the transport. >> How should rport->dd_data be set to NULL before/after calling >> fc_rport_delete(). >> > You never do set it to NULL. This is the role of the transport, and it > will do so after calling the devloss_tmo_callbk(), which is the > "end-of-life" indicator for the rport. That sounds doable. One thing I'm wondering about is what if I re-discover the same remote port before devloss timeout. I'll create new context, but then when I do fc_remote_port_add() I'll find that the fc_rport already has a dd_data. Now I have two contexts for the same rport. I guess I could switch over to the old one, but it's a bit awkward. One aspect is that the newly discovered rport may have the same port_id but different port_name and/or node_name. Since libfc shouldn't be concerned about what the binding method is, it may or may not be the same fc_rport as before. So fc_remote_port_add should not be called until all three items are known, it seems to me. It would be nice if there was a way for scsi_transport_fc to allow creating the fc_rport with only port_id known and add functions to change port_name and/or node_name later. However, that just passes the problem up to the transport. It would be possible but unlikely for a new remote port to have the same port_name as an old one, and if that's the binding method, it's messy to handle. BTW, why is it possible to bind by node_name? It seems useless to me. If we made node_name just an attribute and not something that could be used for binding, it makes the problem slightly easier. At least we could then create fc_rports before knowing the node name. > If you are changing it - it can cause problems as the transport still > has the rport active, may make other calls, etc until > devloss_tmo_callbk() would occur. Right. That's the issue I see. >> I used to set it to NULL before, but figure it's safer to clear it after >> terminate_io has been called. I did a get_device(&rport->dev) first >> to be sure the rport doesn't get freed in the meantime. >> > terminate_io is one of those calls the transport tries to make before > the rport truly dies. > >> However, other threads may be in queuecommand already and already >> beyond their call to fc_remote_port_chkready() and will reference >> dd_data. >> Maybe they just aren't allowed do that? >> >> Or maybe dd_data isn't allowed to go NULL until the rport times >> out and is getting deleted? >> > Yes. This last point. > >> Maybe we need a small private rport data with just enough info for the >> I/O >> that's already in progress, or I can recode the I/O path to not use >> dd_data (it's mostly stuff that can be gotten through scsi_host instead). >> > have it be whatever you need it to be. You can always maintain your own > state flag in your own structure. > >> Or maybe we need to do more in our fc_rport_terminate_io callback >> from fc_remote_port_delete() to be sure that all I/O really ceases. >> We currently do an exch_mgr_reset(), but perhaps some I/O thread hasn't >> allocated an exchange yet. >> > Sounds like race conditions within all the library routines. The > transport has some simple expectations, and really doesn't care how many > internal states the LLDD has. fc_rport_terminated_io() should terminate > (or at least initiate termination) of anything within the driver (in a > path post queuecommand()). Fc_remote_port_delete() is really just a > notifier to the transport that connection to the rport has gone away and > to start the devloss timers and block/reject new io requests - so it's > too early in the process. I/O may or may not be killed as of > rport_delete, as that's somewhat a LLDD policy (depends on FCP-2 > Recovery desires and how discovery engines work). I understand. > The devloss_tmo_callbk(), which is the "end-of-life" indicator for the > rport, should never return from the LLDD until anything referencing the > rport (a if you're using it's dd_data, that too), has been killed - as > the transport may immediately "free" the rport after it's return. Do > you have a devloss_tmo_callbk() ? No. I can see that it would be useful. Thanks very much for the help! Joe ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <4A425DF1.30005-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>]
* Re: how to do fc_remote_port_delete correctly [not found] ` <4A425DF1.30005-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> @ 2009-06-24 17:43 ` James Smart 0 siblings, 0 replies; 10+ messages in thread From: James Smart @ 2009-06-24 17:43 UTC (permalink / raw) To: Joe Eykholt Cc: devel-s9riP+hp16TNLxjTenLetw@public.gmane.org, linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org Joe Eykholt wrote: > James Smart wrote: > That sounds doable. One thing I'm wondering about is what if I > re-discover the same remote port before devloss timeout. > > I'll create new context, but then when I do fc_remote_port_add() > I'll find that the fc_rport already has a dd_data. Now I have > two contexts for the same rport. I guess I could switch over to > the old one, but it's a bit awkward. > > One aspect is that the newly discovered rport may have the same port_id > but different port_name and/or node_name. Since libfc shouldn't be > concerned about what the binding method is, it may or may not be the > same fc_rport as before. So fc_remote_port_add should not be called > until all three items are known, it seems to me. > > It would be nice if there was a way for scsi_transport_fc to allow > creating the fc_rport with only port_id known and add functions to > change port_name and/or node_name later. However, that just passes > the problem up to the transport. It would be possible but unlikely > for a new remote port to have the same port_name as an old one, and > if that's the binding method, it's messy to handle. > > This is all a repeat of the discussion we had back in Sept 08 on "rogue" rports, which included a patch to add a routine fc_remote_port_alloc(), that essentially does everything you describe above. http://www.open-fcoe.org/pipermail/devel/2008-September/000760.html http://www.open-fcoe.org/pipermail/devel/2008-September/000837.html What confuses me - I thought you were already using this, and that it was to be merged into the kernel tree when fcoe was merged. I don't know that it went anywhere, and if you're not using it, then we are where we should be I guess. > BTW, why is it possible to bind by node_name? It seems useless to me. > If we made node_name just an attribute and not something that could > be used for binding, it makes the problem slightly easier. At least > we could then create fc_rports before knowing the node name. > I don't know why node name is interesting - this was something left in for compatibility reasons (I know our driver had this option before we introduced transport support). I agree, no one in their sane mind should be doing it, but not everyone is sane (or one mans sanity is anothers madness). node_name isn't the hiccup here. We really need 2 elements - port_id, wwpn - although we usually include wwnn as well as we treat the "name" as the <wwnn,wwpn> pair. One without the other doesn't help. The notion of the rogue rport was - you had an rport structure with nothing in it - so you could use the context for anything and at the time fc_remote_port_add() was called, you would either get the rport back (its new) or you would get the existing structure and the driver would have to resolve its internal state structures then throw away the rogue. -- james s ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Open-FCoE] how to do fc_remote_port_delete correctly 2009-06-24 17:10 ` Joe Eykholt [not found] ` <4A425DF1.30005-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> @ 2009-06-24 17:45 ` James Smart 1 sibling, 0 replies; 10+ messages in thread From: James Smart @ 2009-06-24 17:45 UTC (permalink / raw) To: Joe Eykholt; +Cc: devel@open-fcoe.org, linux-scsi@vger.kernel.org Joe Eykholt wrote: > > That sounds doable. One thing I'm wondering about is what if I > re-discover the same remote port before devloss timeout. > > I'll create new context, but then when I do fc_remote_port_add() > I'll find that the fc_rport already has a dd_data. Now I have > two contexts for the same rport. I guess I could switch over to > the old one, but it's a bit awkward. > > One aspect is that the newly discovered rport may have the same port_id > but different port_name and/or node_name. Since libfc shouldn't be > concerned about what the binding method is, it may or may not be the > same fc_rport as before. So fc_remote_port_add should not be called > until all three items are known, it seems to me. > > It would be nice if there was a way for scsi_transport_fc to allow > creating the fc_rport with only port_id known and add functions to > change port_name and/or node_name later. However, that just passes > the problem up to the transport. It would be possible but unlikely > for a new remote port to have the same port_name as an old one, and > if that's the binding method, it's messy to handle. This is all a repeat of the discussion we had back in Sept 08 on "rogue" rports, which included a patch to add a routine fc_remote_port_alloc(), that essentially does everything you describe above. http://www.open-fcoe.org/pipermail/devel/2008-September/000760.html http://www.open-fcoe.org/pipermail/devel/2008-September/000837.html What confuses me - I thought you were already using this, and that it was to be merged into the kernel tree when fcoe was merged. I don't know that it went anywhere, and if you're not using it, then we are where we should be I guess. > BTW, why is it possible to bind by node_name? It seems useless to me. > If we made node_name just an attribute and not something that could > be used for binding, it makes the problem slightly easier. At least > we could then create fc_rports before knowing the node name. I don't know why node name is interesting - this was something left in for compatibility reasons (I know our driver had this option before we introduced transport support). I agree, no one in their sane mind should be doing it, but not everyone is sane (or one mans sanity is anothers madness). node_name isn't the hiccup here. We really need 2 elements - port_id, wwpn - although we usually include wwnn as well as we treat the "name" as the <wwnn,wwpn> pair. One without the other doesn't help. The notion of the rogue rport was - you had an rport structure with nothing in it - so you could use the context for anything and at the time fc_remote_port_add() was called, you would either get the rport back (its new) or you would get the existing structure and the driver would have to resolve its internal state structures then throw away the rogue. -- james s ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2009-06-24 19:33 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-06-24 0:27 how to do fc_remote_port_delete correctly Joe Eykholt
[not found] ` <4A4172FA.70008-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-06-24 14:40 ` James Smart
[not found] ` <4A423ADE.80306-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
2009-06-24 16:29 ` James Smart
2009-06-24 16:30 ` James Smart
[not found] ` <4A4254BC.6090302-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
2009-06-24 17:47 ` Joe Eykholt
[not found] ` <4A426698.3-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-06-24 18:31 ` James Smart
[not found] ` <4A427107.6060303-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
2009-06-24 19:33 ` Joe Eykholt
2009-06-24 17:10 ` Joe Eykholt
[not found] ` <4A425DF1.30005-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2009-06-24 17:43 ` James Smart
2009-06-24 17:45 ` [Open-FCoE] " James Smart
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox