[Lustre-devel] extend lnet_notify to public LNet API

From: Nic Henke @ 2010-11-16 16:00 UTC
  To: lustre-devel

We'd like to allow upper layers (Lustre, Cray DVS, etc.) to register a
callback that would be called from lnet_notify. This would notify them
when the lower layers have seen network problems between NIDs, so they
can take appropriate action. The upper layer could also be notified when
that peer has returned to 'network health' after the LND gets its act
together.

This would let upper layers aggressively resend/reconnect in cases where
all TX have completed successfully (meaning no LNet -EIO on LND errors)
but LNET_MSG_ACK or other REPLY traffic is still outstanding.

The initial proposal is on the verbose side, passing the callback all
the data that lnet_notify sees:
- lnet_nid_t
- is_alive (boolean)
- cfs_time_t when (unsigned long on Linux) - jiffies when last alive
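
To make the shape of the proposal concrete, here is a minimal user-space
sketch of what such a registration interface could look like. Everything
prefixed example_ is a hypothetical name made up for illustration and is
not existing LNet API; the two typedefs are stand-ins so the sketch
compiles on its own. It assumes a small fixed table of callbacks and does
no locking, which real kernel code would need.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the real LNet types so this sketch compiles on its own. */
typedef uint64_t lnet_nid_t;
typedef unsigned long cfs_time_t;   /* jiffies on Linux */

/* Hypothetical callback type carrying the three proposed values. */
typedef void (*example_notify_cb_t)(lnet_nid_t nid, int is_alive,
                                    cfs_time_t when, void *arg);

#define EXAMPLE_NOTIFY_CB_MAX 8

struct example_notify_slot {
    example_notify_cb_t cb;
    void *arg;
};

/* Fixed-size table of registered callbacks (assumption: no locking). */
static struct example_notify_slot example_slots[EXAMPLE_NOTIFY_CB_MAX];

/* Register cb/arg; returns 0 on success, -1 if cb is NULL or table is full. */
int example_notify_register(example_notify_cb_t cb, void *arg)
{
    size_t i;

    if (cb == NULL)
        return -1;
    for (i = 0; i < EXAMPLE_NOTIFY_CB_MAX; i++) {
        if (example_slots[i].cb == NULL) {
            example_slots[i].cb = cb;
            example_slots[i].arg = arg;
            return 0;
        }
    }
    return -1;
}

/* Remove a previously registered cb/arg pair; returns 0 if found, else -1. */
int example_notify_unregister(example_notify_cb_t cb, void *arg)
{
    size_t i;

    if (cb == NULL)
        return -1;
    for (i = 0; i < EXAMPLE_NOTIFY_CB_MAX; i++) {
        if (example_slots[i].cb == cb && example_slots[i].arg == arg) {
            example_slots[i].cb = NULL;
            example_slots[i].arg = NULL;
            return 0;
        }
    }
    return -1;
}

/* Hypothetical hook: what lnet_notify would call once it has the data. */
void example_notify_dispatch(lnet_nid_t nid, int is_alive, cfs_time_t when)
{
    size_t i;

    for (i = 0; i < EXAMPLE_NOTIFY_CB_MAX; i++) {
        if (example_slots[i].cb != NULL)
            example_slots[i].cb(nid, is_alive, when, example_slots[i].arg);
    }
}

/* Example upper-layer consumer: records the last event it was told about. */
struct example_peer_event {
    lnet_nid_t nid;
    int is_alive;
    cfs_time_t when;
    int count;
};

void example_record_event(lnet_nid_t nid, int is_alive, cfs_time_t when,
                          void *arg)
{
    struct example_peer_event *ev = arg;

    ev->nid = nid;
    ev->is_alive = is_alive;
    ev->when = when;
    ev->count++;
}
```

An upper layer would register example_record_event (or its own handler)
once at setup, react to is_alive == 0 by resending or reconnecting, and
unregister on teardown. Passing an opaque arg lets several upper layers
share the same table without global state of their own.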

Is this workable and likely to be accepted upstream?

Nic
