public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] opensm/osm_trap_rcv.c: In trap_rcv_process_request, no need to sweep on trap 145 and certain trap 144s
       [not found] ` <20090805222737.GA8523-Wuw85uim5zDR7s880joybQ@public.gmane.org>
@ 2009-10-30  1:57   ` Sasha Khapyorsky
  2009-10-30 15:23     ` Hal Rosenstock
  0 siblings, 1 reply; 4+ messages in thread
From: Sasha Khapyorsky @ 2009-10-30  1:57 UTC (permalink / raw)
  To: Hal Rosenstock; +Cc: linux-rdma

Hi Hal,

On 18:27 Wed 05 Aug     , Hal Rosenstock wrote:
> 
> NodeDescription changed trap only needs to query the new NodeDescription
> and not cause sweep
> 
> Similarly for SystemImageGUID changed (trap 145)
> 
> LinkWidth/SpeedEnabled changed traps (at least right now) and
> SM priority changed traps do need to sweep.
> 
> In the future, LinkWidth/SpeedEnabled changed trap handling
> can query PortInfo (may also need to bounce port too).
> 
> Also, as noted in related email thread, it's unclear what
> sweeping on non generic traps accomplishes but this behavior
> is preserved.
> 
> Signed-off-by: Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c
> index d2e4202..e5bd529 100644
> --- a/opensm/opensm/osm_trap_rcv.c
> +++ b/opensm/opensm/osm_trap_rcv.c
> @@ -291,8 +291,9 @@ trap_rcv_process_request(IN osm_sm_t * sm,
>  	osm_physp_t *p_physp;
>  	cl_ptr_vector_t *p_tbl;
>  	osm_port_t *p_port;
> +	osm_node_t *p_node;
>  	ib_net16_t source_lid = 0;
> -	boolean_t is_gsi = TRUE;
> +	boolean_t is_gsi = TRUE, is_trap144_sweep = FALSE;

What about reusing already existing 'run_heavy_sweep' variable and not
introduce new one?

>  	uint8_t port_num = 0;
>  	boolean_t physp_change_trap = FALSE;
>  	uint64_t event_wheel_timeout = OSM_DEFAULT_TRAP_SUPRESSION_TIMEOUT;
> @@ -547,44 +548,59 @@ trap_rcv_process_request(IN osm_sm_t * sm,
>  		}
>  	}

In general: function trap_rcv_process_request() already has 400 lines of
code and adding there more lines without simplifying flows makes me
nervous.

>  
> -	/* Check for node description update. IB Spec v1.2.1 pg 823 */
> -	if (ib_notice_is_generic(p_ntci) &&
> -	    cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 &&
> -	    p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES &&
> -	    p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
> -		OSM_LOG(sm->p_log, OSM_LOG_INFO, "Trap 144 Node description update\n");
> -
> -		if (p_physp) {
> -			CL_PLOCK_ACQUIRE(sm->p_lock);
> -			osm_req_get_node_desc(sm, p_physp);
> -			CL_PLOCK_RELEASE(sm->p_lock);
> -		} else
> -			OSM_LOG(sm->p_log, OSM_LOG_ERROR,
> -				"ERR 3812: No physical port found for "
> -				"trap 144: \"node description update\"\n");
> -	}
> +	if (ib_notice_is_generic(p_ntci)) {
> +		/* Check for node description update. IB Spec v1.2.1 pg 823 */
> +		if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144) {
> +			/* update port's capability mask (in PortInfo) */
> +			p_physp->port_info.capability_mask = p_ntci->data_details.ntc_144.new_cap_mask;

Hmm, could this introduce some issues?

> +			if (p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES) {
> +				if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {

	if ((p_ntci->data_details.ntc_144.local_changes &
	     TRAP_144_MASK_OTHER_LOCAL_CHANGES) &&
	    (p_ntci->data_details.ntc_144.change_flgs &
	     TRAP_144_MASK_NODE_DESCRIPTION_CHANGE)) {

> +					OSM_LOG(sm->p_log, OSM_LOG_INFO,
> +						"Trap 144 Node description update\n");
> +
> +					if (p_physp) {
> +						CL_PLOCK_ACQUIRE(sm->p_lock);
> +						osm_req_get_node_desc(sm, p_physp);
> +						CL_PLOCK_RELEASE(sm->p_lock);
> +					} else
> +						OSM_LOG(sm->p_log, OSM_LOG_ERROR,
> +							"ERR 3812: No physical port found for "
> +							"trap 144: \"node description update\"\n");
> +				}
> +			}
> +			if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE ||
> +			    p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE ||
> +			    p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_SM_PRIORITY_CHANGE)

	if ((p_ntci->data_details.ntc_144.change_flgs &
	     (TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE |
	      TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE |
	      TRAP_144_MASK_SM_PRIORITY_CHANGE)))


> +				is_trap144_sweep = TRUE;
> +		}
>  
> -	/* do a sweep if we received a trap */
> -	if (sm->p_subn->opt.sweep_on_trap) {
> -		/* if this is trap number 128 or run_heavy_sweep is TRUE -
> -		   update the force_heavy_sweep flag of the subnet.
> -		   Sweep also on traps 144/145 - these traps signal a change of
> -		   certain port capabilities/system image guid.
> -		   TODO: In the future this can be changed to just getting
> -		   PortInfo on this port instead of sweeping the entire subnet. */
> -		if (ib_notice_is_generic(p_ntci) &&
> -		    (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
> -		     cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 ||
> -		     cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145 ||
> -		     run_heavy_sweep)) {
> -			OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
> -				"Forcing heavy sweep. Received trap:%u\n",
> -				cl_ntoh16(p_ntci->g_or_v.generic.trap_num));
> +		if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145) {
> +			/* update system image guid (in NodeInfo) */
> +			p_node = osm_physp_get_node_ptr(p_physp);
> +			if (p_node)

When 'p_node' could be NULL?

> +				p_node->node_info.node_guid = p_ntci->data_details.ntc_145.new_sys_guid;

Didn't you mean 'node_info.sys_guid' and 'node_info.node_guid'?

> +		}
> +
> +		/* do a sweep if we received a trap */
> +		if (sm->p_subn->opt.sweep_on_trap) {
> +			/* if this is trap number 128 or run_heavy_sweep is
> +			   TRUE - update the force_heavy_sweep flag of the
> +			   subnet. Also, sweep on certain types of trap 144.
> +			   TODO: In the future this can be changed to just
> +			   getting PortInfo on this port instead of sweeping
> +			   the entire subnet. */
> +		    	if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
> +			    is_trap144_sweep || run_heavy_sweep) {

It looks that prevents sweeping on PortInfo:CapMask change which
is effectively breaks at least SM handover - new SM will be undetected.

> +				OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
> +					"Forcing heavy sweep. Received trap:%u\n",
> +					cl_ntoh16(p_ntci->g_or_v.generic.trap_num));
>  
> -			sm->p_subn->force_heavy_sweep = TRUE;
> +				sm->p_subn->force_heavy_sweep = TRUE;
> +			}
> +			osm_sm_signal(sm, OSM_SIGNAL_SWEEP);
>  		}
> +	} else if (sm->p_subn->opt.sweep_on_trap)
>  		osm_sm_signal(sm, OSM_SIGNAL_SWEEP);
> -	}
>  
>  	/* If we reached here due to trap 129/130/131 - do not need to do
>  	   the notice report. Just goto exit. We know this is the case
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] opensm/osm_trap_rcv.c: In trap_rcv_process_request, no need to sweep on trap 145 and certain trap 144s
  2009-10-30  1:57   ` [PATCH] opensm/osm_trap_rcv.c: In trap_rcv_process_request, no need to sweep on trap 145 and certain trap 144s Sasha Khapyorsky
@ 2009-10-30 15:23     ` Hal Rosenstock
       [not found]       ` <f0e08f230910300823j56edb800nf88efca23165bf5-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Hal Rosenstock @ 2009-10-30 15:23 UTC (permalink / raw)
  To: Sasha Khapyorsky; +Cc: linux-rdma

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 9862 bytes --]

Hi Sasha,

On Thu, Oct 29, 2009 at 9:57 PM, Sasha Khapyorsky <sashak@voltaire.com> wrote:
> Hi Hal,
>
> On 18:27 Wed 05 Aug     , Hal Rosenstock wrote:
>>
>> NodeDescription changed trap only needs to query the new NodeDescription
>> and not cause sweep
>>
>> Similarly for SystemImageGUID changed (trap 145)
>>
>> LinkWidth/SpeedEnabled changed traps (at least right now) and
>> SM priority changed traps do need to sweep.
>>
>> In the future, LinkWidth/SpeedEnabled changed trap handling
>> can query PortInfo (may also need to bounce port too).
>>
>> Also, as noted in related email thread, it's unclear what
>> sweeping on non generic traps accomplishes but this behavior
>> is preserved.
>>
>> Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com>
>> ---
>> diff --git a/opensm/opensm/osm_trap_rcv.c b/opensm/opensm/osm_trap_rcv.c
>> index d2e4202..e5bd529 100644
>> --- a/opensm/opensm/osm_trap_rcv.c
>> +++ b/opensm/opensm/osm_trap_rcv.c
>> @@ -291,8 +291,9 @@ trap_rcv_process_request(IN osm_sm_t * sm,
>>       osm_physp_t *p_physp;
>>       cl_ptr_vector_t *p_tbl;
>>       osm_port_t *p_port;
>> +     osm_node_t *p_node;
>>       ib_net16_t source_lid = 0;
>> -     boolean_t is_gsi = TRUE;
>> +     boolean_t is_gsi = TRUE, is_trap144_sweep = FALSE;
>
> What about reusing already existing 'run_heavy_sweep' variable and not
> introduce new one?

Sure.

>>       uint8_t port_num = 0;
>>       boolean_t physp_change_trap = FALSE;
>>       uint64_t event_wheel_timeout = OSM_DEFAULT_TRAP_SUPRESSION_TIMEOUT;
>> @@ -547,44 +548,59 @@ trap_rcv_process_request(IN osm_sm_t * sm,
>>               }
>>       }
>
> In general: function trap_rcv_process_request() already has 400 lines of
> code and adding there more lines without simplifying flows makes me
> nervous.

Are you going to block this change until such restructure ?

>> -     /* Check for node description update. IB Spec v1.2.1 pg 823 */
>> -     if (ib_notice_is_generic(p_ntci) &&
>> -         cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 &&
>> -         p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES &&
>> -         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
>> -             OSM_LOG(sm->p_log, OSM_LOG_INFO, "Trap 144 Node description update\n");
>> -
>> -             if (p_physp) {
>> -                     CL_PLOCK_ACQUIRE(sm->p_lock);
>> -                     osm_req_get_node_desc(sm, p_physp);
>> -                     CL_PLOCK_RELEASE(sm->p_lock);
>> -             } else
>> -                     OSM_LOG(sm->p_log, OSM_LOG_ERROR,
>> -                             "ERR 3812: No physical port found for "
>> -                             "trap 144: \"node description update\"\n");
>> -     }
>> +     if (ib_notice_is_generic(p_ntci)) {
>> +             /* Check for node description update. IB Spec v1.2.1 pg 823 */
>> +             if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144) {
>> +                     /* update port's capability mask (in PortInfo) */
>> +                     p_physp->port_info.capability_mask = p_ntci->data_details.ntc_144.new_cap_mask;
>
> Hmm, could this introduce some issues?

Like what ?

>> +                     if (p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES) {
>> +                             if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
>
>        if ((p_ntci->data_details.ntc_144.local_changes &
>             TRAP_144_MASK_OTHER_LOCAL_CHANGES) &&
>            (p_ntci->data_details.ntc_144.change_flgs &
>             TRAP_144_MASK_NODE_DESCRIPTION_CHANGE)) {
>
>> +                                     OSM_LOG(sm->p_log, OSM_LOG_INFO,
>> +                                             "Trap 144 Node description update\n");
>> +
>> +                                     if (p_physp) {
>> +                                             CL_PLOCK_ACQUIRE(sm->p_lock);
>> +                                             osm_req_get_node_desc(sm, p_physp);
>> +                                             CL_PLOCK_RELEASE(sm->p_lock);
>> +                                     } else
>> +                                             OSM_LOG(sm->p_log, OSM_LOG_ERROR,
>> +                                                     "ERR 3812: No physical port found for "
>> +                                                     "trap 144: \"node description update\"\n");
>> +                             }
>> +                     }
>> +                     if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE ||
>> +                         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE ||
>> +                         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_SM_PRIORITY_CHANGE)
>
>        if ((p_ntci->data_details.ntc_144.change_flgs &
>             (TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE |
>              TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE |
>              TRAP_144_MASK_SM_PRIORITY_CHANGE)))
>
>
>> +                             is_trap144_sweep = TRUE;
>> +             }
>>
>> -     /* do a sweep if we received a trap */
>> -     if (sm->p_subn->opt.sweep_on_trap) {
>> -             /* if this is trap number 128 or run_heavy_sweep is TRUE -
>> -                update the force_heavy_sweep flag of the subnet.
>> -                Sweep also on traps 144/145 - these traps signal a change of
>> -                certain port capabilities/system image guid.
>> -                TODO: In the future this can be changed to just getting
>> -                PortInfo on this port instead of sweeping the entire subnet. */
>> -             if (ib_notice_is_generic(p_ntci) &&
>> -                 (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
>> -                  cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 ||
>> -                  cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145 ||
>> -                  run_heavy_sweep)) {
>> -                     OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
>> -                             "Forcing heavy sweep. Received trap:%u\n",
>> -                             cl_ntoh16(p_ntci->g_or_v.generic.trap_num));
>> +             if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145) {
>> +                     /* update system image guid (in NodeInfo) */
>> +                     p_node = osm_physp_get_node_ptr(p_physp);
>> +                     if (p_node)
>
> When 'p_node' could be NULL?

It's just a possible return from osm_physp_get_node_ptr and wanted to
be sure this was not the case before setting node_info. Should this be
eliminated ?

>> +                             p_node->node_info.node_guid = p_ntci->data_details.ntc_145.new_sys_guid;
>
> Didn't you mean 'node_info.sys_guid' and 'node_info.node_guid'?

Just sys_guid.

>> +             }
>> +
>> +             /* do a sweep if we received a trap */
>> +             if (sm->p_subn->opt.sweep_on_trap) {
>> +                     /* if this is trap number 128 or run_heavy_sweep is
>> +                        TRUE - update the force_heavy_sweep flag of the
>> +                        subnet. Also, sweep on certain types of trap 144.
>> +                        TODO: In the future this can be changed to just
>> +                        getting PortInfo on this port instead of sweeping
>> +                        the entire subnet. */
>> +                     if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
>> +                         is_trap144_sweep || run_heavy_sweep) {
>
> It looks that prevents sweeping on PortInfo:CapMask change which
> is effectively breaks at least SM handover - new SM will be undetected.

Yes, CapMask change should be included along with the other local
changes to cause sweep.

-- Hal

>> +                             OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
>> +                                     "Forcing heavy sweep. Received trap:%u\n",
>> +                                     cl_ntoh16(p_ntci->g_or_v.generic.trap_num));
>>
>> -                     sm->p_subn->force_heavy_sweep = TRUE;
>> +                             sm->p_subn->force_heavy_sweep = TRUE;
>> +                     }
>> +                     osm_sm_signal(sm, OSM_SIGNAL_SWEEP);
>>               }
>> +     } else if (sm->p_subn->opt.sweep_on_trap)
>>               osm_sm_signal(sm, OSM_SIGNAL_SWEEP);
>> -     }
>>
>>       /* If we reached here due to trap 129/130/131 - do not need to do
>>          the notice report. Just goto exit. We know this is the case
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±­ÙšŠ{ayº\x1dʇڙë,j\a­¢f£¢·hš‹»öì\x17/oSc¾™Ú³9˜uÀ¦æå‰È&jw¨®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿïêäz¹Þ–Šàþf£¢·hšˆ§~ˆmš

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] opensm/osm_trap_rcv.c: In trap_rcv_process_request, no need to sweep on trap 145 and certain trap 144s
       [not found]       ` <f0e08f230910300823j56edb800nf88efca23165bf5-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-11-01 15:49         ` Sasha Khapyorsky
  2009-11-02 15:27           ` Hal Rosenstock
  0 siblings, 1 reply; 4+ messages in thread
From: Sasha Khapyorsky @ 2009-11-01 15:49 UTC (permalink / raw)
  To: Hal Rosenstock; +Cc: linux-rdma

On 11:23 Fri 30 Oct     , Hal Rosenstock wrote:
> >
> > In general: function trap_rcv_process_request() already has 400 lines of
> > code and adding there more lines without simplifying flows makes me
> > nervous.
> 
> Are you going to block this change until such restructure ?

I would prefer to improve it at least in a part where patch adds even
more flows. Also there are many issues yet without restructuring.

> 
> >> -     /* Check for node description update. IB Spec v1.2.1 pg 823 */
> >> -     if (ib_notice_is_generic(p_ntci) &&
> >> -         cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 &&
> >> -         p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES &&
> >> -         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
> >> -             OSM_LOG(sm->p_log, OSM_LOG_INFO, "Trap 144 Node description update\n");
> >> -
> >> -             if (p_physp) {
> >> -                     CL_PLOCK_ACQUIRE(sm->p_lock);
> >> -                     osm_req_get_node_desc(sm, p_physp);
> >> -                     CL_PLOCK_RELEASE(sm->p_lock);
> >> -             } else
> >> -                     OSM_LOG(sm->p_log, OSM_LOG_ERROR,
> >> -                             "ERR 3812: No physical port found for "
> >> -                             "trap 144: \"node description update\"\n");
> >> -     }
> >> +     if (ib_notice_is_generic(p_ntci)) {
> >> +             /* Check for node description update. IB Spec v1.2.1 pg 823 */
> >> +             if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144) {
> >> +                     /* update port's capability mask (in PortInfo) */
> >> +                     p_physp->port_info.capability_mask = p_ntci->data_details.ntc_144.new_cap_mask;
> >
> > Hmm, could this introduce some issues?
> 
> Like what ?

Like ignoring isSM bit and maybe some other "problematic" bits which
required SM DB update.

> 
> >> +                     if (p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES) {
> >> +                             if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
> >
> >        if ((p_ntci->data_details.ntc_144.local_changes &
> >             TRAP_144_MASK_OTHER_LOCAL_CHANGES) &&
> >            (p_ntci->data_details.ntc_144.change_flgs &
> >             TRAP_144_MASK_NODE_DESCRIPTION_CHANGE)) {
> >
> >> +                                     OSM_LOG(sm->p_log, OSM_LOG_INFO,
> >> +                                             "Trap 144 Node description update\n");
> >> +
> >> +                                     if (p_physp) {
> >> +                                             CL_PLOCK_ACQUIRE(sm->p_lock);
> >> +                                             osm_req_get_node_desc(sm, p_physp);
> >> +                                             CL_PLOCK_RELEASE(sm->p_lock);
> >> +                                     } else
> >> +                                             OSM_LOG(sm->p_log, OSM_LOG_ERROR,
> >> +                                                     "ERR 3812: No physical port found for "
> >> +                                                     "trap 144: \"node description update\"\n");
> >> +                             }
> >> +                     }
> >> +                     if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE ||
> >> +                         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE ||
> >> +                         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_SM_PRIORITY_CHANGE)
> >
> >        if ((p_ntci->data_details.ntc_144.change_flgs &
> >             (TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE |
> >              TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE |
> >              TRAP_144_MASK_SM_PRIORITY_CHANGE)))
> >
> >
> >> +                             is_trap144_sweep = TRUE;
> >> +             }
> >>
> >> -     /* do a sweep if we received a trap */
> >> -     if (sm->p_subn->opt.sweep_on_trap) {
> >> -             /* if this is trap number 128 or run_heavy_sweep is TRUE -
> >> -                update the force_heavy_sweep flag of the subnet.
> >> -                Sweep also on traps 144/145 - these traps signal a change of
> >> -                certain port capabilities/system image guid.
> >> -                TODO: In the future this can be changed to just getting
> >> -                PortInfo on this port instead of sweeping the entire subnet. */
> >> -             if (ib_notice_is_generic(p_ntci) &&
> >> -                 (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
> >> -                  cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 ||
> >> -                  cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145 ||
> >> -                  run_heavy_sweep)) {
> >> -                     OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
> >> -                             "Forcing heavy sweep. Received trap:%u\n",
> >> -                             cl_ntoh16(p_ntci->g_or_v.generic.trap_num));
> >> +             if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145) {
> >> +                     /* update system image guid (in NodeInfo) */
> >> +                     p_node = osm_physp_get_node_ptr(p_physp);
> >> +                     if (p_node)
> >
> > When 'p_node' could be NULL?
> 
> It's just a possible return from osm_physp_get_node_ptr and wanted to
> be sure this was not the case before setting node_info. Should this be
> eliminated ?

If p_physp->p_node cannot be NULL legally, then the check should be
eliminated in order to not hide an errors and to not eat cpu cycles.

> 
> >> +                             p_node->node_info.node_guid = p_ntci->data_details.ntc_145.new_sys_guid;
> >
> > Didn't you mean 'node_info.sys_guid' and 'node_info.node_guid'?
> 
> Just sys_guid.

Sorry typo, I tried to ask: shouldn't be:

	p_node->node_info.sys_guid = p_ntci->data_details.ntc_145.new_sys_guid;

, instead?

> 
> >> +             }
> >> +
> >> +             /* do a sweep if we received a trap */
> >> +             if (sm->p_subn->opt.sweep_on_trap) {
> >> +                     /* if this is trap number 128 or run_heavy_sweep is
> >> +                        TRUE - update the force_heavy_sweep flag of the
> >> +                        subnet. Also, sweep on certain types of trap 144.
> >> +                        TODO: In the future this can be changed to just
> >> +                        getting PortInfo on this port instead of sweeping
> >> +                        the entire subnet. */
> >> +                     if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
> >> +                         is_trap144_sweep || run_heavy_sweep) {
> >
> > It looks that prevents sweeping on PortInfo:CapMask change which
> > is effectively breaks at least SM handover - new SM will be undetected.
> 
> Yes, CapMask change should be included along with the other local
> changes to cause sweep.

So this should change the patch. 

Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] opensm/osm_trap_rcv.c: In trap_rcv_process_request, no need to sweep on trap 145 and certain trap 144s
  2009-11-01 15:49         ` Sasha Khapyorsky
@ 2009-11-02 15:27           ` Hal Rosenstock
  0 siblings, 0 replies; 4+ messages in thread
From: Hal Rosenstock @ 2009-11-02 15:27 UTC (permalink / raw)
  To: Sasha Khapyorsky; +Cc: linux-rdma

On Sun, Nov 1, 2009 at 11:49 AM, Sasha Khapyorsky <sashak@voltaire.com> wrote:
> On 11:23 Fri 30 Oct     , Hal Rosenstock wrote:
>> >
>> > In general: function trap_rcv_process_request() already has 400 lines of
>> > code and adding there more lines without simplifying flows makes me
>> > nervous.
>>
>> Are you going to block this change until such restructure ?
>
> I would prefer to improve it at least in a part where patch adds even
> more flows. Also there are many issues yet without restructuring.

The latter is a horse of a different color IMO.

>>
>> >> -     /* Check for node description update. IB Spec v1.2.1 pg 823 */
>> >> -     if (ib_notice_is_generic(p_ntci) &&
>> >> -         cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 &&
>> >> -         p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES &&
>> >> -         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
>> >> -             OSM_LOG(sm->p_log, OSM_LOG_INFO, "Trap 144 Node description update\n");
>> >> -
>> >> -             if (p_physp) {
>> >> -                     CL_PLOCK_ACQUIRE(sm->p_lock);
>> >> -                     osm_req_get_node_desc(sm, p_physp);
>> >> -                     CL_PLOCK_RELEASE(sm->p_lock);
>> >> -             } else
>> >> -                     OSM_LOG(sm->p_log, OSM_LOG_ERROR,
>> >> -                             "ERR 3812: No physical port found for "
>> >> -                             "trap 144: \"node description update\"\n");
>> >> -     }
>> >> +     if (ib_notice_is_generic(p_ntci)) {
>> >> +             /* Check for node description update. IB Spec v1.2.1 pg 823 */
>> >> +             if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144) {
>> >> +                     /* update port's capability mask (in PortInfo) */
>> >> +                     p_physp->port_info.capability_mask = p_ntci->data_details.ntc_144.new_cap_mask;
>> >
>> > Hmm, could this introduce some issues?
>>
>> Like what ?
>
> Like ignoring isSM bit and maybe some other "problematic" bits which
> required SM DB update.

You had already pointed that issue out elsewhere in your email.

>>
>> >> +                     if (p_ntci->data_details.ntc_144.local_changes & TRAP_144_MASK_OTHER_LOCAL_CHANGES) {
>> >> +                             if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_NODE_DESCRIPTION_CHANGE) {
>> >
>> >        if ((p_ntci->data_details.ntc_144.local_changes &
>> >             TRAP_144_MASK_OTHER_LOCAL_CHANGES) &&
>> >            (p_ntci->data_details.ntc_144.change_flgs &
>> >             TRAP_144_MASK_NODE_DESCRIPTION_CHANGE)) {
>> >
>> >> +                                     OSM_LOG(sm->p_log, OSM_LOG_INFO,
>> >> +                                             "Trap 144 Node description update\n");
>> >> +
>> >> +                                     if (p_physp) {
>> >> +                                             CL_PLOCK_ACQUIRE(sm->p_lock);
>> >> +                                             osm_req_get_node_desc(sm, p_physp);
>> >> +                                             CL_PLOCK_RELEASE(sm->p_lock);
>> >> +                                     } else
>> >> +                                             OSM_LOG(sm->p_log, OSM_LOG_ERROR,
>> >> +                                                     "ERR 3812: No physical port found for "
>> >> +                                                     "trap 144: \"node description update\"\n");
>> >> +                             }
>> >> +                     }
>> >> +                     if (p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE ||
>> >> +                         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE ||
>> >> +                         p_ntci->data_details.ntc_144.change_flgs & TRAP_144_MASK_SM_PRIORITY_CHANGE)
>> >
>> >        if ((p_ntci->data_details.ntc_144.change_flgs &
>> >             (TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE |
>> >              TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE |
>> >              TRAP_144_MASK_SM_PRIORITY_CHANGE)))
>> >
>> >
>> >> +                             is_trap144_sweep = TRUE;
>> >> +             }
>> >>
>> >> -     /* do a sweep if we received a trap */
>> >> -     if (sm->p_subn->opt.sweep_on_trap) {
>> >> -             /* if this is trap number 128 or run_heavy_sweep is TRUE -
>> >> -                update the force_heavy_sweep flag of the subnet.
>> >> -                Sweep also on traps 144/145 - these traps signal a change of
>> >> -                certain port capabilities/system image guid.
>> >> -                TODO: In the future this can be changed to just getting
>> >> -                PortInfo on this port instead of sweeping the entire subnet. */
>> >> -             if (ib_notice_is_generic(p_ntci) &&
>> >> -                 (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
>> >> -                  cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 144 ||
>> >> -                  cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145 ||
>> >> -                  run_heavy_sweep)) {
>> >> -                     OSM_LOG(sm->p_log, OSM_LOG_VERBOSE,
>> >> -                             "Forcing heavy sweep. Received trap:%u\n",
>> >> -                             cl_ntoh16(p_ntci->g_or_v.generic.trap_num));
>> >> +             if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 145) {
>> >> +                     /* update system image guid (in NodeInfo) */
>> >> +                     p_node = osm_physp_get_node_ptr(p_physp);
>> >> +                     if (p_node)
>> >
>> > When 'p_node' could be NULL?
>>
>> It's just a possible return from osm_physp_get_node_ptr and wanted to
>> be sure this was not the case before setting node_info. Should this be
>> eliminated ?
>
> If p_physp->p_node cannot be NULL legally, then the check should be
> eliminated in order to not hide an errors and to not eat cpu cycles.
>
>>
>> >> +                             p_node->node_info.node_guid = p_ntci->data_details.ntc_145.new_sys_guid;
>> >
>> > Didn't you mean 'node_info.sys_guid' and 'node_info.node_guid'?
>>
>> Just sys_guid.
>
> Sorry typo, I tried to ask: shouldn't be:
>
>        p_node->node_info.sys_guid = p_ntci->data_details.ntc_145.new_sys_guid;
>
> , instead?

Yes, that's what I meant when I wrote "just sys_guid".

>>
>> >> +             }
>> >> +
>> >> +             /* do a sweep if we received a trap */
>> >> +             if (sm->p_subn->opt.sweep_on_trap) {
>> >> +                     /* if this is trap number 128 or run_heavy_sweep is
>> >> +                        TRUE - update the force_heavy_sweep flag of the
>> >> +                        subnet. Also, sweep on certain types of trap 144.
>> >> +                        TODO: In the future this can be changed to just
>> >> +                        getting PortInfo on this port instead of sweeping
>> >> +                        the entire subnet. */
>> >> +                     if (cl_ntoh16(p_ntci->g_or_v.generic.trap_num) == 128 ||
>> >> +                         is_trap144_sweep || run_heavy_sweep) {
>> >
>> > It looks that prevents sweeping on PortInfo:CapMask change which
>> > is effectively breaks at least SM handover - new SM will be undetected.
>>
>> Yes, CapMask change should be included along with the other local
>> changes to cause sweep.
>
> So this should change the patch.

Yes.

-- Hal

> Sasha
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2009-11-02 15:27 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20090805222737.GA8523@comcast.net>
     [not found] ` <20090805222737.GA8523-Wuw85uim5zDR7s880joybQ@public.gmane.org>
2009-10-30  1:57   ` [PATCH] opensm/osm_trap_rcv.c: In trap_rcv_process_request, no need to sweep on trap 145 and certain trap 144s Sasha Khapyorsky
2009-10-30 15:23     ` Hal Rosenstock
     [not found]       ` <f0e08f230910300823j56edb800nf88efca23165bf5-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-11-01 15:49         ` Sasha Khapyorsky
2009-11-02 15:27           ` Hal Rosenstock

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox