netdev.vger.kernel.org archive mirror
* RE: [PATCH 0/2][RFC] Network Event Notifier Mechanism
From: Caitlin Bestler @ 2006-06-22 22:11 UTC (permalink / raw)
  To: hadi, Steve Wise; +Cc: David Miller, netdev

jamal wrote:
> On Thu, 2006-06-22 at 15:40 -0500, Steve Wise wrote:
>> On Thu, 2006-06-22 at 15:43 -0400, jamal wrote:
>>> 
>>> No - what these two gents are saying is that these events and
>>> infrastructure already exist.
>> 
>> Notification of the exact events needed does not exist today.
>> 
> 
> Ok, so you can't even make use of anything that already exists?
> Or is a subset of what you need already there?
> 
>> The key events, again, are:
>> 
>> - the neighbour entry MAC address has changed.
>> 
>> - the next hop IP address (i.e. the neighbour) for a given dst_entry
>>   has changed.
> 
> 
> I don't see a difference between the above two from an L2 perspective.
> Are you keeping track of IP addresses?
> You didn't answer my question in the previous email as to
> what RDMA needs to keep track of in hardware.
> 

The RDMA device is handling L4 or L5 connections that
have L3 addresses (IP). Subscribing to this information
allows the device to keep its behaviour consistent
with the host stack.

The common alternative, before this integration was proposed,
was to have the RDMA device sniff all incoming packets and
attempt parallel processing of a large set of lower-layer
protocols (ICMP, ARP, routing, ...), or to simply trust that
the IB network administrator has faithfully replicated all
IP-relevant instructions in two forums (traditional IP network
administration and IB network administration).

These subscriptions are an attempt to cede full control
of these issues back to one place, the kernel, and to
guarantee that an offload device can never think that
the route to X is Y when the kernel says it is Z, or
that it has a different PMTU, and so on.
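
(To make the shape of that concrete, a minimal sketch using the
kernel's existing notifier_block convention. The event ID and the
helper below are made up for illustration; they are not code from
the actual patch.)

#include <linux/notifier.h>
#include <net/dst.h>

#define NETEVENT_ROUTE_UPDATE	1	/* assumed event ID */

static void rdma_dev_sync_route(struct dst_entry *dst)
{
	/* Driver-private: push the new next hop / PMTU for this
	 * dst_entry into the device's cached connection state. */
}

static int rdma_dev_netevent(struct notifier_block *nb,
			     unsigned long event, void *data)
{
	/* Called synchronously at the moment the kernel updates its
	 * own tables, so the device can never believe the route to X
	 * is Y while the kernel says it is Z. */
	if (event == NETEVENT_ROUTE_UPDATE)
		rdma_dev_sync_route(data);
	return NOTIFY_DONE;
}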

I don't have any strong opinion on the best mechanism
for implementing these subscriptions, but having correct,
consistent networking behaviour depend on a user-mode
relay strikes me as odd.




* RE: [PATCH 0/2][RFC] Network Event Notifier Mechanism
From: Caitlin Bestler @ 2006-06-22 22:39 UTC (permalink / raw)
  To: hadi, Steve Wise; +Cc: netdev, David Miller

jamal wrote:
> On Thu, 2006-06-22 at 15:58 -0500, Steve Wise wrote:
>> On Thu, 2006-06-22 at 16:36 -0400, jamal wrote:
> 
>> I created a new notifier block in my patch for these network events.
>> I guess I thought I was using the existing infrastructure to provide
>> this notification service. (I thought my patch was lovely :)  But I
>> didn't integrate with netlink for user-space notification, mainly
>> because I didn't think these events should be propagated up to users
>> unless there was a need.
> 
> I think they will be useful in user space. Typically you only
> propagate them if there is a user-space program subscribed and
> listening (there are hooks which will tell you if there is
> anyone listening).
> The netdevice events tend to be a lot more usable in a few
> other blocks because they are lower in the hierarchy (i.e.
> routing depends on IP addresses, which depend on netdevices)
> within the kernel, unlike in this case where you are the only
> consumer; so it does sound logical to me to do it in user
> space, though it is not totally unreasonable to do it in the kernel.
> 


These services are relevant to any RDMA connection. The user-space
consumer of RDMA services is no more interested in tracking the
routing of the remote IP address than the consumer of socket
services is.


>> Another issue I see with netlink is that the event notifications
>> aren't reliable, especially the CONFIG_ARPD stuff, because it
>> allocates an sk_buff with GFP_ATOMIC. A lost neighbour MAC-address
>> change is perhaps fatal for an RDMA connection...
>> 
> 
> This would happen in the cases where you are short on memory;
> I would suspect you will need to allocate memory in your
> driver as well to update something in the hardware -
> so it is the same problem.
> You can, however, work around issues like these in netlink.
>

A direct notification call to the driver makes the driver responsible
for providing whatever buffering it requires to save the information,
and if there is insufficient memory available, at least the driver
is aware of the failure.

Allowing a third component to fail to relay information means that
the driver can no longer be responsible for maintaining its own
consistency with the kernel routing, ARP and neighbour tables.

Maintaining that consistency is a matter of correct network
behaviour, not of status reporting. Obviously we cannot have the
hardware looking at and interpreting these tables directly,
so a *reliable* subscription would seem to be the only option.

If the only subscribers who require reliable notifications are
kernel drivers, does it really make sense to make those changes
in code that also supports user space?
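
(As a sketch of the contrast, with assumed names and sizes
throughout: with a direct notifier call, the driver copies what it
needs into storage it allocated up front at probe time, so the event
path never depends on an allocation that can fail silently inside a
relay.)

#include <linux/notifier.h>
#include <linux/if_ether.h>
#include <linux/string.h>
#include <linux/types.h>
#include <net/neighbour.h>

struct rdma_neigh_event {
	unsigned char	mac[ETH_ALEN];
	__be32		ip;
};

static struct rdma_neigh_event event_ring[64];	/* allocated at probe */
static unsigned int event_head;

static int rdma_neigh_notify(struct notifier_block *nb,
			     unsigned long event, void *data)
{
	struct neighbour *neigh = data;
	struct rdma_neigh_event *slot = &event_ring[event_head++ % 64];

	/* Copy synchronously: nothing here can be lost in transit,
	 * and any failure would be visible at the call site. */
	memcpy(slot->mac, neigh->ha, ETH_ALEN);
	memcpy(&slot->ip, neigh->primary_key, sizeof(slot->ip));
	return NOTIFY_OK;
}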
 

> 
> I am still unclear:
> You have the destination IP address, the dstMAC of the nexthop to
> get the packet to this IP address, and I suspect some srcMAC
> address you will use when sending out, as well as the pathMTU to
> get there, correct?
> Because of the IP address it sounds to me like you are
> populating an L3 table. How is this info used in hardware? Can
> you explain how an arriving packet would be used by the RDMA
> device in conjunction with this info once it is in the hardware?
>

Some packets are associated with established RDMA (or iSCSI)
connections and are processed on the RDMA (or iSCSI) device.
The device will also pass other packets through to the
host stack for processing (non-matched Ethernet frames for
IP networks, and IPoIB tunneled frames for IB networks).
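
(Roughly, the receive-side dispatch being described looks like the
following; this is purely illustrative, with an assumed
connection-table helper, and is not the device's actual logic.)

#include <linux/types.h>

struct conn_key {
	__be32 saddr, daddr;
	__be16 sport, dport;
};

extern void *conn_table_lookup(const struct conn_key *key); /* assumed */

/*
 * Returns nonzero if the frame belongs to an established RDMA/iSCSI
 * connection and is consumed on the device; otherwise the frame is
 * passed through to the host stack unchanged.
 */
static int offload_rx_match(const struct conn_key *key)
{
	return conn_table_lookup(key) != NULL;
}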

The device provides L5 services (RDMA and/or iSCSI) in addition
to L2 services (as an Ethernet device). The rest of the network
rightfully demands that the left hand knows what the right hand
is doing, so information that is provided to a host (ARP/ICMP)
should affect the behaviour of *all* connections from that host.

Do you agree that having the device subscribe to the
kernel-maintained tables is a better solution than having it
attempt to guess the correct values in parallel?
 


* [PATCH 0/2][RFC] Network Event Notifier Mechanism
From: Steve Wise @ 2006-06-21 18:45 UTC (permalink / raw)
  To: netdev


This patch implements a mechanism that allows interested clients to
register for notification of certain network events. The intended use
is to allow RDMA devices (linux/drivers/infiniband) to be notified of
neighbour updates, ICMP redirects, path MTU changes, and route changes.

The reason these devices need update events is that they typically
cache this information in hardware and need to be notified when it
changes.

This approach is one of many possibilities and may be preferred because it
uses an existing notification mechanism that has precedent in the stack.
An alternative would be to add a netdev method to notify affected devices
of these events.
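
(For concreteness, client registration with the notifier approach
might look like the sketch below. The entry points mirror the style
of the kernel's other notifier chains, e.g.
register_netdevice_notifier(), and are assumptions rather than
identifiers quoted from the patch.)

#include <linux/init.h>
#include <linux/notifier.h>

extern int register_netevent_notifier(struct notifier_block *nb);   /* assumed */
extern int unregister_netevent_notifier(struct notifier_block *nb); /* assumed */

static int my_netevent_cb(struct notifier_block *nb,
			  unsigned long event, void *data)
{
	/* Dispatch on event: neighbour update, redirect, route
	 * change, and so on. */
	return NOTIFY_DONE;
}

static struct notifier_block my_netevent_nb = {
	.notifier_call = my_netevent_cb,
};

static int __init my_client_init(void)
{
	return register_netevent_notifier(&my_netevent_nb);
}

static void __exit my_client_exit(void)
{
	unregister_netevent_notifier(&my_netevent_nb);
}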

This code does not yet implement path MTU changes, because the number
of places in which this value is updated is large; if this mechanism
seems reasonable, it would probably be best to funnel these updates
through a single function.
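
(That "single function" funnel might look like this sketch; the
helper name, event constant, and chain-invocation function are all
assumptions. Every site that changes a cached PMTU would call the
helper instead of touching the metric directly, giving one notifier
call site.)

#include <linux/rtnetlink.h>
#include <net/dst.h>

extern int call_netevent_notifiers(unsigned long val, void *v); /* assumed */
#define NETEVENT_PMTU_UPDATE	2	/* assumed event ID */

static void dst_update_pmtu_and_notify(struct dst_entry *dst, u32 mtu)
{
	dst->metrics[RTAX_MTU - 1] = mtu;	/* 2.6-era metrics layout */
	call_netevent_notifiers(NETEVENT_PMTU_UPDATE, dst);
}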

We would like to get this or similar functionality included in 2.6.19
and request comments.

This patchset consists of 2 patches:

1) New files implementing the Network Event Notifier
2) Core network changes to generate network event notifications

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>


Thread overview: 25+ messages
2006-06-22 22:11 [PATCH 0/2][RFC] Network Event Notifier Mechanism Caitlin Bestler
2006-06-22 22:21 ` jamal
2006-06-22 22:58 ` David Miller
2006-06-23  0:56   ` jamal
2006-06-23 13:24     ` Steve Wise
2006-06-23 19:57       ` David Miller
2006-06-23 20:12         ` Steve Wise
2006-06-24 14:30       ` jamal
2006-06-26 14:34         ` Steve Wise
2006-06-27 12:44           ` jamal
2006-06-22 22:39 Caitlin Bestler
2006-06-21 18:45 Steve Wise
2006-06-21 19:08 ` YOSHIFUJI Hideaki / 吉藤英明
2006-06-22  8:57 ` David Miller
2006-06-22 13:53   ` Steve Wise
2006-06-22 15:27     ` Steve Wise
2006-06-22 19:43       ` jamal
2006-06-22 20:18         ` Steve Wise
2006-06-22 20:36           ` jamal
2006-06-22 20:58             ` Steve Wise
2006-06-22 22:14               ` jamal
2006-06-23 13:11                 ` Steve Wise
2006-06-22 20:40         ` Steve Wise
2006-06-22 20:56           ` jamal
2006-06-23 13:17             ` Steve Wise
