netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC RTNETLINK 00/09]: Netlink link creation API
@ 2007-06-05 14:12 Patrick McHardy
  2007-06-05 14:12 ` [RFC NETLINK 01/09]: Mark netlink policies const Patrick McHardy
                   ` (13 more replies)
  0 siblings, 14 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:12 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

The following patches contain the rtnetlink link creation API I promised,
as well as two simple driver conversion to use the API as an example.
I've also converted VLAN as a more complex example, but these patches
need some more work and are most likely not interesting to all the CCed
parties, so I'm sending them seperately.

A few words about the API:

Drivers wishing to use the API register a struct rtnl_link_ops, which
contains a few function pointers for device setup, registation, changing
and deletion, as well as netlink attribute validation and device dumping.

All netlink communication happens within the AF_UNSPEC family. I
initially introduced new netlink families for this, but removed them
again since that would require adding new protocol families that serve
no further purpose for most drivers. Additionally we currently use
RTM.*LINK messages with ifi_family != AF_UNSPEC for information that
is related to the device, but doesn't come from the driver that created
the device itself, like bridge port state, IPv6 device configuration etc.

The device specific attributes are nested within a new attribute
IFLA_LINKINFO. I didn't use IFLA_PROTINFO since userspace can reasonably
expect to have IFLA_PROTINFO unset for AF_UNSPEC messages, and the
userspace STP daemon does that. Identification of the driver happens
by name, stored in the IFLA_INFO_NAME attribute. IFLA_INFO_DATA contains
driver specific attributes, IFLA_INFO_XSTATS driver specific statistics.

The API does *not* use the existing RTM_SETLINK message type, instead
it adds support for receiving RTM_NEWLINK within the kernel. I did this
because of three reasons: 

- RTM_SETLINK does not follow the usual rtnetlink conventions and ignores
  all netlink flags

- Other rtnetlink subsystems use the same message type for dumps and
  notifications from the kernel as for configuration from userspace,
  which usually allows to recreate an object by simply setting the
  NLM_F_REQUEST flag on message received from the kernel and sending
  it back.

- Easier for userspace to detect support for the new features

The RTM_NEWLINK message type is a superset of RTM_SETLINK, it allows
to change both driver specific and generic attributes of the device.
The set of generic device attributes that may be supplied during
device creation is limited to a few simple ones, it currently does
not support specifying link layer address/broadcast address as well
as device flags. The change operation can change all device attributes.

Not sure what else to say .. comments welcome.


 drivers/net/dummy.c               |  144 +++++++-----
 drivers/net/ifb.c                 |  115 ++++++---
 include/linux/if_link.h           |   13 +
 include/linux/netdevice.h         |    3 
 include/net/fib_rules.h           |    2 
 include/net/genetlink.h           |    2 
 include/net/ip_fib.h              |    2 
 include/net/netlink.h             |   12 -
 include/net/rtnetlink.h           |   57 ++++
 net/core/neighbour.c              |    4 
 net/core/rtnetlink.c              |  451 +++++++++++++++++++++++++++++++++-----
 net/decnet/dn_dev.c               |    2 
 net/decnet/dn_rules.c             |    2 
 net/ipv4/devinet.c                |    2 
 net/ipv4/fib_frontend.c           |    2 
 net/ipv4/fib_rules.c              |    2 
 net/ipv6/addrconf.c               |    2 
 net/ipv6/fib6_rules.c             |    2 
 net/ipv6/route.c                  |    2 
 net/netlabel/netlabel_cipso_v4.c  |    2 
 net/netlabel/netlabel_mgmt.c      |    2 
 net/netlabel/netlabel_unlabeled.c |    2 
 net/netlink/attr.c                |    8 
 net/netlink/genetlink.c           |    2 
 24 files changed, 665 insertions(+), 172 deletions(-)

Patrick McHardy (9):
      [NETLINK]: Mark netlink policies const
      [RTNETLINK]: ifindex 0 does not exist
      [RTNETLINK]: Split up rtnl_setlink
      [RTNETLINK]: Link creation API
      [DUMMY]: Use dev->stats
      [DUMMY]: Keep dummy devices on list
      [DUMMY]: Use rtnl_link API
      [IFB]: Keep ifb devices on list
      [IFB]: Use rtnl_link API

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [RFC NETLINK 01/09]: Mark netlink policies const
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
@ 2007-06-05 14:12 ` Patrick McHardy
  2007-06-05 19:39   ` David Miller
  2007-06-05 14:12 ` [RFC RTNETLINK 02/09]: ifindex 0 does not exist Patrick McHardy
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:12 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[NETLINK]: Mark netlink policies const

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 842f5a7aa8357f4e43b1ab8938c51f4d7ce7aba3
tree 701a1399958ff019ab6939fa0e5a96e5b148b643
parent c1a13ff57ab1ce52a0aae9984594dbfcfbaf68c0
author Patrick McHardy <kaber@trash.net> Mon, 04 Jun 2007 14:52:51 +0200
committer Patrick McHardy <kaber@trash.net> Mon, 04 Jun 2007 14:52:51 +0200

 include/net/fib_rules.h           |    2 +-
 include/net/genetlink.h           |    2 +-
 include/net/ip_fib.h              |    2 +-
 include/net/netlink.h             |   12 ++++++------
 net/core/neighbour.c              |    4 ++--
 net/core/rtnetlink.c              |    2 +-
 net/decnet/dn_dev.c               |    2 +-
 net/decnet/dn_rules.c             |    2 +-
 net/ipv4/devinet.c                |    2 +-
 net/ipv4/fib_frontend.c           |    2 +-
 net/ipv4/fib_rules.c              |    2 +-
 net/ipv6/addrconf.c               |    2 +-
 net/ipv6/fib6_rules.c             |    2 +-
 net/ipv6/route.c                  |    2 +-
 net/netlabel/netlabel_cipso_v4.c  |    2 +-
 net/netlabel/netlabel_mgmt.c      |    2 +-
 net/netlabel/netlabel_unlabeled.c |    2 +-
 net/netlink/attr.c                |    8 ++++----
 net/netlink/genetlink.c           |    2 +-
 19 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index ed3a887..83e41dd 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -64,7 +64,7 @@ struct fib_rules_ops
 	void			(*flush_cache)(void);
 
 	int			nlgroup;
-	struct nla_policy	*policy;
+	const struct nla_policy	*policy;
 	struct list_head	*rules_list;
 	struct module		*owner;
 };
diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index adff4c8..b6eaca1 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -60,7 +60,7 @@ struct genl_ops
 {
 	u8			cmd;
 	unsigned int		flags;
-	struct nla_policy	*policy;
+	const struct nla_policy	*policy;
 	int		       (*doit)(struct sk_buff *skb,
 				       struct genl_info *info);
 	int		       (*dumpit)(struct sk_buff *skb,
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 5a4a036..69252cb 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -213,7 +213,7 @@ extern void fib_select_default(const struct flowi *flp, struct fib_result *res);
 #endif /* CONFIG_IP_MULTIPLE_TABLES */
 
 /* Exported by fib_frontend.c */
-extern struct nla_policy rtm_ipv4_policy[];
+extern const struct nla_policy rtm_ipv4_policy[];
 extern void		ip_fib_init(void);
 extern int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 			       struct net_device *dev, __be32 *spec_dst, u32 *itag);
diff --git a/include/net/netlink.h b/include/net/netlink.h
index 0bf325c..7b510a9 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -222,10 +222,10 @@ extern int		nlmsg_notify(struct sock *sk, struct sk_buff *skb,
 				     gfp_t flags);
 
 extern int		nla_validate(struct nlattr *head, int len, int maxtype,
-				     struct nla_policy *policy);
+				     const struct nla_policy *policy);
 extern int		nla_parse(struct nlattr *tb[], int maxtype,
 				  struct nlattr *head, int len,
-				  struct nla_policy *policy);
+				  const struct nla_policy *policy);
 extern struct nlattr *	nla_find(struct nlattr *head, int len, int attrtype);
 extern size_t		nla_strlcpy(char *dst, const struct nlattr *nla,
 				    size_t dstsize);
@@ -360,7 +360,7 @@ static inline struct nlmsghdr *nlmsg_next(struct nlmsghdr *nlh, int *remaining)
  */
 static inline int nlmsg_parse(struct nlmsghdr *nlh, int hdrlen,
 			      struct nlattr *tb[], int maxtype,
-			      struct nla_policy *policy)
+			      const struct nla_policy *policy)
 {
 	if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen))
 		return -EINVAL;
@@ -392,7 +392,7 @@ static inline struct nlattr *nlmsg_find_attr(struct nlmsghdr *nlh,
  * @policy: validation policy
  */
 static inline int nlmsg_validate(struct nlmsghdr *nlh, int hdrlen, int maxtype,
-				 struct nla_policy *policy)
+				 const struct nla_policy *policy)
 {
 	if (nlh->nlmsg_len < nlmsg_msg_size(hdrlen))
 		return -EINVAL;
@@ -729,7 +729,7 @@ static inline struct nlattr *nla_find_nested(struct nlattr *nla, int attrtype)
  */
 static inline int nla_parse_nested(struct nlattr *tb[], int maxtype,
 				   struct nlattr *nla,
-				   struct nla_policy *policy)
+				   const struct nla_policy *policy)
 {
 	return nla_parse(tb, maxtype, nla_data(nla), nla_len(nla), policy);
 }
@@ -990,7 +990,7 @@ static inline int nla_nest_cancel(struct sk_buff *skb, struct nlattr *start)
  * Returns 0 on success or a negative error code.
  */
 static inline int nla_validate_nested(struct nlattr *start, int maxtype,
-				      struct nla_policy *policy)
+				      const struct nla_policy *policy)
 {
 	return nla_validate(nla_data(start), nla_len(start), maxtype, policy);
 }
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 6f3bb73..9df26a0 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1761,7 +1761,7 @@ static inline struct neigh_parms *lookup_neigh_params(struct neigh_table *tbl,
 	return NULL;
 }
 
-static struct nla_policy nl_neightbl_policy[NDTA_MAX+1] __read_mostly = {
+static const struct nla_policy nl_neightbl_policy[NDTA_MAX+1] = {
 	[NDTA_NAME]		= { .type = NLA_STRING },
 	[NDTA_THRESH1]		= { .type = NLA_U32 },
 	[NDTA_THRESH2]		= { .type = NLA_U32 },
@@ -1770,7 +1770,7 @@ static struct nla_policy nl_neightbl_policy[NDTA_MAX+1] __read_mostly = {
 	[NDTA_PARMS]		= { .type = NLA_NESTED },
 };
 
-static struct nla_policy nl_ntbl_parm_policy[NDTPA_MAX+1] __read_mostly = {
+static const struct nla_policy nl_ntbl_parm_policy[NDTPA_MAX+1] = {
 	[NDTPA_IFINDEX]			= { .type = NLA_U32 },
 	[NDTPA_QUEUE_LEN]		= { .type = NLA_U32 },
 	[NDTPA_PROXY_QLEN]		= { .type = NLA_U32 },
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 27da9cd..a8a5093 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -551,7 +551,7 @@ cont:
 	return skb->len;
 }
 
-static struct nla_policy ifla_policy[IFLA_MAX+1] __read_mostly = {
+static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_IFNAME]		= { .type = NLA_STRING, .len = IFNAMSIZ-1 },
 	[IFLA_MAP]		= { .len = sizeof(struct rtnl_link_ifmap) },
 	[IFLA_MTU]		= { .type = NLA_U32 },
diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c
index 764a56a..ab41c18 100644
--- a/net/decnet/dn_dev.c
+++ b/net/decnet/dn_dev.c
@@ -638,7 +638,7 @@ static struct dn_dev *dn_dev_by_index(int ifindex)
 	return dn_dev;
 }
 
-static struct nla_policy dn_ifa_policy[IFA_MAX+1] __read_mostly = {
+static const struct nla_policy dn_ifa_policy[IFA_MAX+1] = {
 	[IFA_ADDRESS]		= { .type = NLA_U16 },
 	[IFA_LOCAL]		= { .type = NLA_U16 },
 	[IFA_LABEL]		= { .type = NLA_STRING,
diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c
index 17a1932..84ff3dd 100644
--- a/net/decnet/dn_rules.c
+++ b/net/decnet/dn_rules.c
@@ -108,7 +108,7 @@ errout:
 	return err;
 }
 
-static struct nla_policy dn_fib_rule_policy[FRA_MAX+1] __read_mostly = {
+static const struct nla_policy dn_fib_rule_policy[FRA_MAX+1] = {
 	FRA_GENERIC_POLICY,
 };
 
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 7f95e6e..965e35a 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -78,7 +78,7 @@ static struct ipv4_devconf ipv4_devconf_dflt = {
 	.accept_source_route = 1,
 };
 
-static struct nla_policy ifa_ipv4_policy[IFA_MAX+1] __read_mostly = {
+static const struct nla_policy ifa_ipv4_policy[IFA_MAX+1] = {
 	[IFA_LOCAL]     	= { .type = NLA_U32 },
 	[IFA_ADDRESS]   	= { .type = NLA_U32 },
 	[IFA_BROADCAST] 	= { .type = NLA_U32 },
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 9ad1f62..311d633 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -441,7 +441,7 @@ int ip_rt_ioctl(unsigned int cmd, void __user *arg)
 	return -EINVAL;
 }
 
-struct nla_policy rtm_ipv4_policy[RTA_MAX+1] __read_mostly = {
+const struct nla_policy rtm_ipv4_policy[RTA_MAX+1] = {
 	[RTA_DST]		= { .type = NLA_U32 },
 	[RTA_SRC]		= { .type = NLA_U32 },
 	[RTA_IIF]		= { .type = NLA_U32 },
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 33083ad..2a94784 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -169,7 +169,7 @@ static struct fib_table *fib_empty_table(void)
 	return NULL;
 }
 
-static struct nla_policy fib4_rule_policy[FRA_MAX+1] __read_mostly = {
+static const struct nla_policy fib4_rule_policy[FRA_MAX+1] = {
 	FRA_GENERIC_POLICY,
 	[FRA_FLOW]	= { .type = NLA_U32 },
 };
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 329de67..5a5f8bd 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2990,7 +2990,7 @@ static struct in6_addr *extract_addr(struct nlattr *addr, struct nlattr *local)
 	return pfx;
 }
 
-static struct nla_policy ifa_ipv6_policy[IFA_MAX+1] __read_mostly = {
+static const struct nla_policy ifa_ipv6_policy[IFA_MAX+1] = {
 	[IFA_ADDRESS]		= { .len = sizeof(struct in6_addr) },
 	[IFA_LOCAL]		= { .len = sizeof(struct in6_addr) },
 	[IFA_CACHEINFO]		= { .len = sizeof(struct ifa_cacheinfo) },
diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index fc3882c..53b3998 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -157,7 +157,7 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags)
 	return 1;
 }
 
-static struct nla_policy fib6_rule_policy[FRA_MAX+1] __read_mostly = {
+static const struct nla_policy fib6_rule_policy[FRA_MAX+1] = {
 	FRA_GENERIC_POLICY,
 };
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 1324b06..fe8d983 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1999,7 +1999,7 @@ void rt6_mtu_change(struct net_device *dev, unsigned mtu)
 	fib6_clean_all(rt6_mtu_change_route, 0, &arg);
 }
 
-static struct nla_policy rtm_ipv6_policy[RTA_MAX+1] __read_mostly = {
+static const struct nla_policy rtm_ipv6_policy[RTA_MAX+1] = {
 	[RTA_GATEWAY]           = { .len = sizeof(struct in6_addr) },
 	[RTA_OIF]               = { .type = NLA_U32 },
 	[RTA_IIF]		= { .type = NLA_U32 },
diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
index 07e47db..24b660f 100644
--- a/net/netlabel/netlabel_cipso_v4.c
+++ b/net/netlabel/netlabel_cipso_v4.c
@@ -59,7 +59,7 @@ static struct genl_family netlbl_cipsov4_gnl_family = {
 };
 
 /* NetLabel Netlink attribute policy */
-static struct nla_policy netlbl_cipsov4_genl_policy[NLBL_CIPSOV4_A_MAX + 1] = {
+static const struct nla_policy netlbl_cipsov4_genl_policy[NLBL_CIPSOV4_A_MAX + 1] = {
 	[NLBL_CIPSOV4_A_DOI] = { .type = NLA_U32 },
 	[NLBL_CIPSOV4_A_MTYPE] = { .type = NLA_U32 },
 	[NLBL_CIPSOV4_A_TAG] = { .type = NLA_U8 },
diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
index e8c80f3..e00fc21 100644
--- a/net/netlabel/netlabel_mgmt.c
+++ b/net/netlabel/netlabel_mgmt.c
@@ -59,7 +59,7 @@ static struct genl_family netlbl_mgmt_gnl_family = {
 };
 
 /* NetLabel Netlink attribute policy */
-static struct nla_policy netlbl_mgmt_genl_policy[NLBL_MGMT_A_MAX + 1] = {
+static const struct nla_policy netlbl_mgmt_genl_policy[NLBL_MGMT_A_MAX + 1] = {
 	[NLBL_MGMT_A_DOMAIN] = { .type = NLA_NUL_STRING },
 	[NLBL_MGMT_A_PROTOCOL] = { .type = NLA_U32 },
 	[NLBL_MGMT_A_VERSION] = { .type = NLA_U32 },
diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c
index b931ede..5c303c6 100644
--- a/net/netlabel/netlabel_unlabeled.c
+++ b/net/netlabel/netlabel_unlabeled.c
@@ -61,7 +61,7 @@ static struct genl_family netlbl_unlabel_gnl_family = {
 };
 
 /* NetLabel Netlink attribute policy */
-static struct nla_policy netlbl_unlabel_genl_policy[NLBL_UNLABEL_A_MAX + 1] = {
+static const struct nla_policy netlbl_unlabel_genl_policy[NLBL_UNLABEL_A_MAX + 1] = {
 	[NLBL_UNLABEL_A_ACPTFLG] = { .type = NLA_U8 },
 };
 
diff --git a/net/netlink/attr.c b/net/netlink/attr.c
index df5f820..c591212 100644
--- a/net/netlink/attr.c
+++ b/net/netlink/attr.c
@@ -24,9 +24,9 @@ static u16 nla_attr_minlen[NLA_TYPE_MAX+1] __read_mostly = {
 };
 
 static int validate_nla(struct nlattr *nla, int maxtype,
-			struct nla_policy *policy)
+			const struct nla_policy *policy)
 {
-	struct nla_policy *pt;
+	const struct nla_policy *pt;
 	int minlen = 0, attrlen = nla_len(nla);
 
 	if (nla->nla_type <= 0 || nla->nla_type > maxtype)
@@ -99,7 +99,7 @@ static int validate_nla(struct nlattr *nla, int maxtype,
  * Returns 0 on success or a negative error code.
  */
 int nla_validate(struct nlattr *head, int len, int maxtype,
-		 struct nla_policy *policy)
+		 const struct nla_policy *policy)
 {
 	struct nlattr *nla;
 	int rem, err;
@@ -130,7 +130,7 @@ errout:
  * Returns 0 on success or a negative error code.
  */
 int nla_parse(struct nlattr *tb[], int maxtype, struct nlattr *head, int len,
-	      struct nla_policy *policy)
+	      const struct nla_policy *policy)
 {
 	struct nlattr *nla;
 	int rem, err;
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
index 6e31234..b9ab62f 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -472,7 +472,7 @@ static struct sk_buff *ctrl_build_msg(struct genl_family *family, u32 pid,
 	return skb;
 }
 
-static struct nla_policy ctrl_policy[CTRL_ATTR_MAX+1] __read_mostly = {
+static const struct nla_policy ctrl_policy[CTRL_ATTR_MAX+1] = {
 	[CTRL_ATTR_FAMILY_ID]	= { .type = NLA_U16 },
 	[CTRL_ATTR_FAMILY_NAME]	= { .type = NLA_NUL_STRING,
 				    .len = GENL_NAMSIZ - 1 },

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC RTNETLINK 02/09]: ifindex 0 does not exist
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
  2007-06-05 14:12 ` [RFC NETLINK 01/09]: Mark netlink policies const Patrick McHardy
@ 2007-06-05 14:12 ` Patrick McHardy
  2007-06-05 19:40   ` David Miller
  2007-06-05 14:12 ` [RFC RTNETLINK 03/09]: Split up rtnl_setlink Patrick McHardy
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:12 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[RTNETLINK]: ifindex 0 does not exist

ifindex == 0 does not exist and implies we should do a lookup by name if
one was given.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 72104442408666a0ff9395a5c62a23e96a3845ef
tree d46c7ec0b401a99bfbec94ef328e9c7cf7197854
parent 842f5a7aa8357f4e43b1ab8938c51f4d7ce7aba3
author Patrick McHardy <kaber@trash.net> Mon, 04 Jun 2007 23:23:14 +0200
committer Patrick McHardy <kaber@trash.net> Mon, 04 Jun 2007 23:23:14 +0200

 net/core/rtnetlink.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a8a5093..02e8bf0 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -580,7 +580,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
 	err = -EINVAL;
 	ifm = nlmsg_data(nlh);
-	if (ifm->ifi_index >= 0)
+	if (ifm->ifi_index > 0)
 		dev = dev_get_by_index(ifm->ifi_index);
 	else if (tb[IFLA_IFNAME])
 		dev = dev_get_by_name(ifname);
@@ -672,7 +672,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	 * name provided implies that a name change has been
 	 * requested.
 	 */
-	if (ifm->ifi_index >= 0 && ifname[0]) {
+	if (ifm->ifi_index > 0 && ifname[0]) {
 		err = dev_change_name(dev, ifname);
 		if (err < 0)
 			goto errout_dev;
@@ -740,7 +740,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 		return err;
 
 	ifm = nlmsg_data(nlh);
-	if (ifm->ifi_index >= 0) {
+	if (ifm->ifi_index > 0) {
 		dev = dev_get_by_index(ifm->ifi_index);
 		if (dev == NULL)
 			return -ENODEV;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC RTNETLINK 03/09]: Split up rtnl_setlink
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
  2007-06-05 14:12 ` [RFC NETLINK 01/09]: Mark netlink policies const Patrick McHardy
  2007-06-05 14:12 ` [RFC RTNETLINK 02/09]: ifindex 0 does not exist Patrick McHardy
@ 2007-06-05 14:12 ` Patrick McHardy
  2007-06-05 19:41   ` David Miller
  2007-06-05 14:12 ` [RFC RTNETLINK 04/09]: Link creation API Patrick McHardy
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:12 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[RTNETLINK]: Split up rtnl_setlink

Split up rtnl_setlink into a function performing validation and a function
performing the actual changes. This allows to share the modifcation logic
with rtnl_newlink, which is introduced by the next patch.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 40b0b8787c1057d055baa6e3d11ff6db7783c982
tree 625d2bb4401e72882e6e5907d4aded9d2dcb416e
parent 72104442408666a0ff9395a5c62a23e96a3845ef
author Patrick McHardy <kaber@trash.net> Mon, 04 Jun 2007 23:28:24 +0200
committer Patrick McHardy <kaber@trash.net> Mon, 04 Jun 2007 23:28:24 +0200

 net/core/rtnetlink.c |  105 +++++++++++++++++++++++++++-----------------------
 1 files changed, 57 insertions(+), 48 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 02e8bf0..25ca219 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -561,44 +561,11 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_LINKMODE]		= { .type = NLA_U8 },
 };
 
-static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
+		      struct nlattr **tb, char *ifname)
 {
-	struct ifinfomsg *ifm;
-	struct net_device *dev;
-	int err, send_addr_notify = 0, modified = 0;
-	struct nlattr *tb[IFLA_MAX+1];
-	char ifname[IFNAMSIZ];
-
-	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
-	if (err < 0)
-		goto errout;
-
-	if (tb[IFLA_IFNAME])
-		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
-	else
-		ifname[0] = '\0';
-
-	err = -EINVAL;
-	ifm = nlmsg_data(nlh);
-	if (ifm->ifi_index > 0)
-		dev = dev_get_by_index(ifm->ifi_index);
-	else if (tb[IFLA_IFNAME])
-		dev = dev_get_by_name(ifname);
-	else
-		goto errout;
-
-	if (dev == NULL) {
-		err = -ENODEV;
-		goto errout;
-	}
-
-	if (tb[IFLA_ADDRESS] &&
-	    nla_len(tb[IFLA_ADDRESS]) < dev->addr_len)
-		goto errout_dev;
-
-	if (tb[IFLA_BROADCAST] &&
-	    nla_len(tb[IFLA_BROADCAST]) < dev->addr_len)
-		goto errout_dev;
+	int modified = 0, send_addr_notify = 0;
+	int err;
 
 	if (tb[IFLA_MAP]) {
 		struct rtnl_link_ifmap *u_map;
@@ -606,12 +573,12 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
 		if (!dev->set_config) {
 			err = -EOPNOTSUPP;
-			goto errout_dev;
+			goto errout;
 		}
 
 		if (!netif_device_present(dev)) {
 			err = -ENODEV;
-			goto errout_dev;
+			goto errout;
 		}
 
 		u_map = nla_data(tb[IFLA_MAP]);
@@ -624,7 +591,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
 		err = dev->set_config(dev, &k_map);
 		if (err < 0)
-			goto errout_dev;
+			goto errout;
 
 		modified = 1;
 	}
@@ -635,19 +602,19 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
 		if (!dev->set_mac_address) {
 			err = -EOPNOTSUPP;
-			goto errout_dev;
+			goto errout;
 		}
 
 		if (!netif_device_present(dev)) {
 			err = -ENODEV;
-			goto errout_dev;
+			goto errout;
 		}
 
 		len = sizeof(sa_family_t) + dev->addr_len;
 		sa = kmalloc(len, GFP_KERNEL);
 		if (!sa) {
 			err = -ENOMEM;
-			goto errout_dev;
+			goto errout;
 		}
 		sa->sa_family = dev->type;
 		memcpy(sa->sa_data, nla_data(tb[IFLA_ADDRESS]),
@@ -655,7 +622,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 		err = dev->set_mac_address(dev, sa);
 		kfree(sa);
 		if (err)
-			goto errout_dev;
+			goto errout;
 		send_addr_notify = 1;
 		modified = 1;
 	}
@@ -663,7 +630,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	if (tb[IFLA_MTU]) {
 		err = dev_set_mtu(dev, nla_get_u32(tb[IFLA_MTU]));
 		if (err < 0)
-			goto errout_dev;
+			goto errout;
 		modified = 1;
 	}
 
@@ -675,7 +642,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	if (ifm->ifi_index > 0 && ifname[0]) {
 		err = dev_change_name(dev, ifname);
 		if (err < 0)
-			goto errout_dev;
+			goto errout;
 		modified = 1;
 	}
 
@@ -684,7 +651,6 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 		send_addr_notify = 1;
 	}
 
-
 	if (ifm->ifi_flags || ifm->ifi_change) {
 		unsigned int flags = ifm->ifi_flags;
 
@@ -712,7 +678,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 
 	err = 0;
 
-errout_dev:
+errout:
 	if (err < 0 && modified && net_ratelimit())
 		printk(KERN_WARNING "A link change request failed with "
 		       "some changes comitted already. Interface %s may "
@@ -721,7 +687,50 @@ errout_dev:
 
 	if (send_addr_notify)
 		call_netdevice_notifiers(NETDEV_CHANGEADDR, dev);
+	return err;
+}
 
+static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+{
+	struct ifinfomsg *ifm;
+	struct net_device *dev;
+	int err;
+	struct nlattr *tb[IFLA_MAX+1];
+	char ifname[IFNAMSIZ];
+
+	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
+	if (err < 0)
+		goto errout;
+
+	if (tb[IFLA_IFNAME])
+		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
+	else
+		ifname[0] = '\0';
+
+	err = -EINVAL;
+	ifm = nlmsg_data(nlh);
+	if (ifm->ifi_index > 0)
+		dev = dev_get_by_index(ifm->ifi_index);
+	else if (tb[IFLA_IFNAME])
+		dev = dev_get_by_name(ifname);
+	else
+		goto errout;
+
+	if (dev == NULL) {
+		err = -ENODEV;
+		goto errout;
+	}
+
+	if (tb[IFLA_ADDRESS] &&
+	    nla_len(tb[IFLA_ADDRESS]) < dev->addr_len)
+		goto errout_dev;
+
+	if (tb[IFLA_BROADCAST] &&
+	    nla_len(tb[IFLA_BROADCAST]) < dev->addr_len)
+		goto errout_dev;
+
+	err = do_setlink(dev, ifm, tb, ifname);
+errout_dev:
 	dev_put(dev);
 errout:
 	return err;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (2 preceding siblings ...)
  2007-06-05 14:12 ` [RFC RTNETLINK 03/09]: Split up rtnl_setlink Patrick McHardy
@ 2007-06-05 14:12 ` Patrick McHardy
  2007-06-05 19:43   ` David Miller
                     ` (2 more replies)
  2007-06-05 14:12 ` [RFC DUMMY 05/09]: Use dev->stats Patrick McHardy
                   ` (9 subsequent siblings)
  13 siblings, 3 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:12 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[RTNETLINK]: Link creation API

Add rtnetlink API for creating, changing and deleting software devices.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 0323e7d1e7d5042492684264cfcba6d7ff55c473
tree 161530836d43b39ddef42a2c2b48b82187580e3c
parent 40b0b8787c1057d055baa6e3d11ff6db7783c982
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:12 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:12 +0200

 include/linux/if_link.h   |   13 ++
 include/linux/netdevice.h |    3 
 include/net/rtnetlink.h   |   57 ++++++++
 net/core/rtnetlink.c      |  338 ++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 404 insertions(+), 7 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 604c243..e46ed94 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -76,6 +76,8 @@ enum
 #define IFLA_WEIGHT IFLA_WEIGHT
 	IFLA_OPERSTATE,
 	IFLA_LINKMODE,
+	IFLA_LINKINFO,
+#define IFLA_LINKINFO IFLA_LINKINFO
 	__IFLA_MAX
 };
 
@@ -140,4 +142,15 @@ struct ifla_cacheinfo
 	__u32	retrans_time;
 };
 
+enum
+{
+	IFLA_INFO_UNSPEC,
+	IFLA_INFO_NAME,
+	IFLA_INFO_DATA,
+	IFLA_INFO_XSTATS,
+	__IFLA_INFO_MAX,
+};
+
+#define IFLA_INFO_MAX	(__IFLA_INFO_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3a70f55..e327ccc 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -540,6 +540,9 @@ struct net_device
 	struct device		dev;
 	/* space for optional statistics and wireless sysfs groups */
 	struct attribute_group  *sysfs_groups[3];
+
+	/* rtnetlink link ops */
+	struct rtnl_link_ops	*rtnl_link_ops;
 };
 #define to_net_dev(d) container_of(d, struct net_device, dev)
 
diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 3b3d474..d744198 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -22,4 +22,61 @@ static inline int rtnl_msg_family(struct nlmsghdr *nlh)
 		return AF_UNSPEC;
 }
 
+/**
+ *	struct rtnl_link_ops - rtnetlink link operations
+ *
+ *	@list: Used internally
+ *	@name: Identifier
+ *	@maxtype: Highest device specific netlink attribute number
+ *	@policy: Netlink policy for device specific attribute validation
+ *	@validate: Optional validation function for netlink/changelink parameters
+ *	@priv_size: sizeof net_device private space
+ *	@setup: net_device setup function
+ *	@newlink: Function for configuring and registering a new device
+ *	@changelink: Function for changing parameters of an existing device
+ *	@dellink: Function to remove a device
+ *	@get_size: Function to calculate required room for dumping device
+ *		   specific netlink attributes
+ *	@fill_info: Function to dump device specific netlink attributes
+ *	@xstats_size: Size of device specific statistics
+ *	@fill_xstats: Function to dump device specific statistics
+ */
+struct rtnl_link_ops {
+	struct list_head	list;
+
+	const char		*name;
+
+	size_t			priv_size;
+	void			(*setup)(struct net_device *dev);
+
+	int			maxtype;
+	const struct nla_policy	*policy;
+	int			(*validate)(struct nlattr *tb[],
+					    struct nlattr *data[]);
+
+	int			(*newlink)(struct net_device *dev,
+					   struct nlattr *tb[],
+					   struct nlattr *data[]);
+	int			(*changelink)(struct net_device *dev,
+					      struct nlattr *tb[],
+					      struct nlattr *data[]);
+	void			(*dellink)(struct net_device *dev);
+
+	size_t			(*get_size)(struct net_device *dev);
+	int			(*fill_info)(struct sk_buff *skb,
+					     struct net_device *dev);
+
+	size_t			xstats_size;
+	int			(*fill_xstats)(struct sk_buff *skb,
+					       struct net_device *dev);
+};
+
+extern int	__rtnl_link_register(struct rtnl_link_ops *ops);
+extern void	__rtnl_link_unregister(struct rtnl_link_ops *ops);
+
+extern int	rtnl_link_register(struct rtnl_link_ops *ops);
+extern void	rtnl_link_unregister(struct rtnl_link_ops *ops);
+
+#define MODULE_ALIAS_RTNL_LINK(name) MODULE_ALIAS("rtnl-link-" #name)
+
 #endif
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 25ca219..ed17288 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -243,6 +243,141 @@ void rtnl_unregister_all(int protocol)
 
 EXPORT_SYMBOL_GPL(rtnl_unregister_all);
 
+static LIST_HEAD(link_ops);
+
+/**
+ * __rtnl_link_register - Register rtnl_link_ops with rtnetlink.
+ * @ops: struct rtnl_link_ops * to register
+ *
+ * The caller must hold the rtnl_mutex. This function should be used
+ * by drivers that create devices during module initialization. It
+ * must be called before registering the devices.
+ *
+ * Returns 0 on success or a negative error code.
+ */
+int __rtnl_link_register(struct rtnl_link_ops *ops)
+{
+	list_add_tail(&ops->list, &link_ops);
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(__rtnl_link_register);
+
+/**
+ * rtnl_link_register - Register rtnl_link_ops with rtnetlink.
+ * @ops: struct rtnl_link_ops * to register
+ *
+ * Returns 0 on success or a negative error code.
+ */
+int rtnl_link_register(struct rtnl_link_ops *ops)
+{
+	int err;
+
+	rtnl_lock();
+	err = __rtnl_link_register(ops);
+	rtnl_unlock();
+	return err;
+}
+
+EXPORT_SYMBOL_GPL(rtnl_link_register);
+
+/**
+ * __rtnl_link_unregister - Unregister rtnl_link_ops from rtnetlink.
+ * @ops: struct rtnl_link_ops * to unregister
+ *
+ * The caller must hold the rtnl_mutex. This function should be used
+ * by drivers that unregister devices during module unloading. It must
+ * be called after unregistering the devices.
+ */
+void __rtnl_link_unregister(struct rtnl_link_ops *ops)
+{
+	list_del(&ops->list);
+}
+
+EXPORT_SYMBOL_GPL(__rtnl_link_unregister);
+
+/**
+ * rtnl_link_unregister - Unregister rtnl_link_ops from rtnetlink.
+ * @ops: struct rtnl_link_ops * to unregister
+ */
+void rtnl_link_unregister(struct rtnl_link_ops *ops)
+{
+	rtnl_lock();
+	__rtnl_link_unregister(ops);
+	rtnl_unlock();
+}
+
+EXPORT_SYMBOL_GPL(rtnl_link_unregister);
+
+static struct rtnl_link_ops *rtnl_link_ops_get(char *name)
+{
+	struct rtnl_link_ops *ops;
+
+	list_for_each_entry(ops, &link_ops, list) {
+		if (!strcmp(ops->name, name))
+			return ops;
+	}
+	return NULL;
+}
+
+static size_t rtnl_link_get_size(struct net_device *dev)
+{
+	const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
+	size_t size;
+
+	if (!ops)
+		return 0;
+
+	size = nlmsg_total_size(sizeof(struct nlattr)) + /* IFLA_LINKINFO */
+	       nlmsg_total_size(strlen(ops->name) + 1) + /* IFLA_INFO_NAME */
+	       nlmsg_total_size(ops->xstats_size);	 /* IFLA_INFO_XSTATS */
+
+	if (ops->get_size)
+		/* IFLA_INFO_DATA + nested data */
+		size += nlmsg_total_size(sizeof(struct nlattr)) +
+			ops->get_size(dev);
+
+	return size;
+}
+
+static int rtnl_link_fill(struct sk_buff *skb, struct net_device *dev)
+{
+	const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
+	struct nlattr *linkinfo, *data;
+	int err = -EMSGSIZE;
+
+	linkinfo = nla_nest_start(skb, IFLA_LINKINFO);
+	if (linkinfo == NULL)
+		goto out;
+
+	if (nla_put_string(skb, IFLA_INFO_NAME, ops->name) < 0)
+		goto err_cancel_link;
+	if (ops->fill_xstats) {
+		err = ops->fill_xstats(skb, dev);
+		if (err < 0)
+			goto err_cancel_link;
+	}
+	if (ops->fill_info) {
+		data = nla_nest_start(skb, IFLA_INFO_DATA);
+		if (data == NULL)
+			goto err_cancel_link;
+		err = ops->fill_info(skb, dev);
+		if (err < 0)
+			goto err_cancel_data;
+		nla_nest_end(skb, data);
+	}
+
+	nla_nest_end(skb, linkinfo);
+	return 0;
+
+err_cancel_data:
+	nla_nest_cancel(skb, data);
+err_cancel_link:
+	nla_nest_cancel(skb, linkinfo);
+out:
+	return err;
+}
+
 static const int rtm_min[RTM_NR_FAMILIES] =
 {
 	[RTM_FAM(RTM_NEWLINK)]      = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
@@ -437,7 +572,7 @@ static void copy_rtnl_link_stats(struct rtnl_link_stats *a,
 	a->tx_compressed = b->tx_compressed;
 };
 
-static inline size_t if_nlmsg_size(void)
+static inline size_t if_nlmsg_size(struct net_device *dev)
 {
 	return NLMSG_ALIGN(sizeof(struct ifinfomsg))
 	       + nla_total_size(IFNAMSIZ) /* IFLA_IFNAME */
@@ -452,7 +587,8 @@ static inline size_t if_nlmsg_size(void)
 	       + nla_total_size(4) /* IFLA_LINK */
 	       + nla_total_size(4) /* IFLA_MASTER */
 	       + nla_total_size(1) /* IFLA_OPERSTATE */
-	       + nla_total_size(1); /* IFLA_LINKMODE */
+	       + nla_total_size(1) /* IFLA_LINKMODE */
+	       + rtnl_link_get_size(dev); /* IFLA_LINKINFO */
 }
 
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
@@ -522,6 +658,11 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 		}
 	}
 
+	if (dev->rtnl_link_ops) {
+		if (rtnl_link_fill(skb, dev) < 0)
+			goto nla_put_failure;
+	}
+
 	return nlmsg_end(skb, nlh);
 
 nla_put_failure:
@@ -553,6 +694,8 @@ cont:
 
 static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_IFNAME]		= { .type = NLA_STRING, .len = IFNAMSIZ-1 },
+	[IFLA_ADDRESS]		= { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
+	[IFLA_BROADCAST]	= { .type = NLA_BINARY, .len = MAX_ADDR_LEN },
 	[IFLA_MAP]		= { .len = sizeof(struct rtnl_link_ifmap) },
 	[IFLA_MTU]		= { .type = NLA_U32 },
 	[IFLA_TXQLEN]		= { .type = NLA_U32 },
@@ -561,10 +704,15 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_LINKMODE]		= { .type = NLA_U8 },
 };
 
+static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
+	[IFLA_INFO_NAME]	= { .type = NLA_STRING },
+	[IFLA_INFO_DATA]	= { .type = NLA_NESTED },
+};
+
 static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
-		      struct nlattr **tb, char *ifname)
+		      struct nlattr **tb, char *ifname, int modified)
 {
-	int modified = 0, send_addr_notify = 0;
+	int send_addr_notify = 0;
 	int err;
 
 	if (tb[IFLA_MAP]) {
@@ -729,13 +877,187 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	    nla_len(tb[IFLA_BROADCAST]) < dev->addr_len)
 		goto errout_dev;
 
-	err = do_setlink(dev, ifm, tb, ifname);
+	err = do_setlink(dev, ifm, tb, ifname, 0);
 errout_dev:
 	dev_put(dev);
 errout:
 	return err;
 }
 
+static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+{
+	const struct rtnl_link_ops *ops;
+	struct net_device *dev;
+	struct ifinfomsg *ifm;
+	char ifname[IFNAMSIZ];
+	struct nlattr *tb[IFLA_MAX+1];
+	int err;
+
+	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
+	if (err < 0)
+		return err;
+
+	if (tb[IFLA_IFNAME])
+		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
+
+	ifm = nlmsg_data(nlh);
+	if (ifm->ifi_index > 0)
+		dev = __dev_get_by_index(ifm->ifi_index);
+	else if (tb[IFLA_IFNAME])
+		dev = __dev_get_by_name(ifname);
+	else
+		return -EINVAL;
+
+	if (!dev)
+		return -ENODEV;
+
+	ops = dev->rtnl_link_ops;
+	if (!ops)
+		return -EOPNOTSUPP;
+
+	ops->dellink(dev);
+	return 0;
+}
+
+static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
+{
+	struct rtnl_link_ops *ops;
+	struct net_device *dev;
+	struct ifinfomsg *ifm;
+	char name[MODULE_NAME_LEN];
+	char ifname[IFNAMSIZ];
+	struct nlattr *tb[IFLA_MAX+1];
+	struct nlattr *linkinfo[IFLA_INFO_MAX+1];
+	int err;
+
+	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
+	if (err < 0)
+		return err;
+
+	if (tb[IFLA_IFNAME])
+		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
+	else
+		ifname[0] = '\0';
+
+	ifm = nlmsg_data(nlh);
+	if (ifm->ifi_index > 0)
+		dev = __dev_get_by_index(ifm->ifi_index);
+	else if (ifname[0])
+		dev = __dev_get_by_name(ifname);
+	else
+		dev = NULL;
+
+	if (tb[IFLA_LINKINFO]) {
+		err = nla_parse_nested(linkinfo, IFLA_INFO_MAX,
+				       tb[IFLA_LINKINFO], ifla_info_policy);
+		if (err < 0)
+			return err;
+	} else
+		memset(linkinfo, 0, sizeof(linkinfo));
+
+	if (linkinfo[IFLA_INFO_NAME]) {
+		nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name));
+		ops = rtnl_link_ops_get(name);
+	} else {
+		name[0] = '\0';
+		ops = NULL;
+	}
+
+	if (1) {
+		struct nlattr *attr[ops ? ops->maxtype + 1 : 0], **data = NULL;
+
+		if (ops) {
+			if (ops->maxtype && linkinfo[IFLA_INFO_DATA]) {
+				err = nla_parse_nested(attr, ops->maxtype,
+						       linkinfo[IFLA_INFO_DATA],
+						       ops->policy);
+				if (err < 0)
+					return err;
+				data = attr;
+			}
+			if (ops->validate) {
+				err = ops->validate(tb, data);
+				if (err < 0)
+					return err;
+			}
+		}
+
+		if (dev) {
+			int modified = 0;
+
+			if (nlh->nlmsg_flags & NLM_F_EXCL)
+				return -EEXIST;
+			if (nlh->nlmsg_flags & NLM_F_REPLACE)
+				return -EOPNOTSUPP;
+
+			if (linkinfo[IFLA_INFO_DATA]) {
+				if (!ops || ops != dev->rtnl_link_ops ||
+				    !ops->changelink)
+					return -EOPNOTSUPP;
+
+				err = ops->changelink(dev, tb, data);
+				if (err < 0)
+					return err;
+				modified = 1;
+			}
+
+			return do_setlink(dev, ifm, tb, ifname, modified);
+		}
+
+		if (!(nlh->nlmsg_flags & NLM_F_CREATE))
+			return -ENODEV;
+
+		if (ifm->ifi_index)
+			return -EINVAL;
+		if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP])
+			return -EOPNOTSUPP;
+
+#ifdef CONFIG_KMOD
+		if (!ops && name[0]) {
+			/* race condition: device may be created while rtnl is
+			 * unlocked, final register_netdevice will catch it.
+			 */
+			__rtnl_unlock();
+			request_module("rtnl-link-%s", name);
+			rtnl_lock();
+			ops = rtnl_link_ops_get(name);
+		}
+#endif
+		if (!ops)
+			return -EOPNOTSUPP;
+
+		if (!ifname[0])
+			snprintf(ifname, IFNAMSIZ, "%s%%d", ops->name);
+		dev = alloc_netdev(ops->priv_size, ifname, ops->setup);
+		if (!dev)
+			return -ENOMEM;
+
+		if (strchr(dev->name, '%')) {
+			err = dev_alloc_name(dev, dev->name);
+			if (err < 0)
+				goto err_free;
+		}
+		dev->rtnl_link_ops = ops;
+
+		if (tb[IFLA_MTU])
+			dev->mtu = nla_get_u32(tb[IFLA_MTU]);
+		if (tb[IFLA_TXQLEN])
+			dev->tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]);
+		if (tb[IFLA_WEIGHT])
+			dev->weight = nla_get_u32(tb[IFLA_WEIGHT]);
+		if (tb[IFLA_OPERSTATE])
+			set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));
+		if (tb[IFLA_LINKMODE])
+			dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]);
+
+		err = ops->newlink(dev, tb, data);
+err_free:
+		if (err < 0)
+			free_netdev(dev);
+		return err;
+	}
+}
+
 static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 {
 	struct ifinfomsg *ifm;
@@ -756,7 +1078,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 	} else
 		return -EINVAL;
 
-	nskb = nlmsg_new(if_nlmsg_size(), GFP_KERNEL);
+	nskb = nlmsg_new(if_nlmsg_size(dev), GFP_KERNEL);
 	if (nskb == NULL) {
 		err = -ENOBUFS;
 		goto errout;
@@ -806,7 +1128,7 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change)
 	struct sk_buff *skb;
 	int err = -ENOBUFS;
 
-	skb = nlmsg_new(if_nlmsg_size(), GFP_KERNEL);
+	skb = nlmsg_new(if_nlmsg_size(dev), GFP_KERNEL);
 	if (skb == NULL)
 		goto errout;
 
@@ -961,6 +1283,8 @@ void __init rtnetlink_init(void)
 
 	rtnl_register(PF_UNSPEC, RTM_GETLINK, rtnl_getlink, rtnl_dump_ifinfo);
 	rtnl_register(PF_UNSPEC, RTM_SETLINK, rtnl_setlink, NULL);
+	rtnl_register(PF_UNSPEC, RTM_NEWLINK, rtnl_newlink, NULL);
+	rtnl_register(PF_UNSPEC, RTM_DELLINK, rtnl_dellink, NULL);
 
 	rtnl_register(PF_UNSPEC, RTM_GETADDR, NULL, rtnl_dump_all);
 	rtnl_register(PF_UNSPEC, RTM_GETROUTE, NULL, rtnl_dump_all);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC DUMMY 05/09]: Use dev->stats
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (3 preceding siblings ...)
  2007-06-05 14:12 ` [RFC RTNETLINK 04/09]: Link creation API Patrick McHardy
@ 2007-06-05 14:12 ` Patrick McHardy
  2007-06-05 19:44   ` David Miller
  2007-06-05 14:13 ` [RFC DUMMY 06/09]: Keep dummy devices on list Patrick McHardy
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:12 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[DUMMY]: Use dev->stats

Use dev->stats instead of netdev_priv().

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit c5178079b5d191e34a516dc111be862e3382e32b
tree 339bf81b1270b51272a1e9ebc643188dc755a2ee
parent 0323e7d1e7d5042492684264cfcba6d7ff55c473
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:28 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:28 +0200

 drivers/net/dummy.c |   16 +++-------------
 1 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 60673bc..91b474c 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -38,7 +38,6 @@
 static int numdummies = 1;
 
 static int dummy_xmit(struct sk_buff *skb, struct net_device *dev);
-static struct net_device_stats *dummy_get_stats(struct net_device *dev);
 
 static int dummy_set_address(struct net_device *dev, void *p)
 {
@@ -59,7 +58,6 @@ static void set_multicast_list(struct net_device *dev)
 static void __init dummy_setup(struct net_device *dev)
 {
 	/* Initialize the device structure. */
-	dev->get_stats = dummy_get_stats;
 	dev->hard_start_xmit = dummy_xmit;
 	dev->set_multicast_list = set_multicast_list;
 	dev->set_mac_address = dummy_set_address;
@@ -76,20 +74,13 @@ static void __init dummy_setup(struct net_device *dev)
 
 static int dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 {
-	struct net_device_stats *stats = netdev_priv(dev);
-
-	stats->tx_packets++;
-	stats->tx_bytes+=skb->len;
+	dev->stats.tx_packets++;
+	dev->stats.tx_bytes += skb->len;
 
 	dev_kfree_skb(skb);
 	return 0;
 }
 
-static struct net_device_stats *dummy_get_stats(struct net_device *dev)
-{
-	return netdev_priv(dev);
-}
-
 static struct net_device **dummies;
 
 /* Number of dummy devices to be set up by this module. */
@@ -101,8 +92,7 @@ static int __init dummy_init_one(int index)
 	struct net_device *dev_dummy;
 	int err;
 
-	dev_dummy = alloc_netdev(sizeof(struct net_device_stats),
-				 "dummy%d", dummy_setup);
+	dev_dummy = alloc_netdev(0, "dummy%d", dummy_setup);
 
 	if (!dev_dummy)
 		return -ENOMEM;

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC DUMMY 06/09]: Keep dummy devices on list
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (4 preceding siblings ...)
  2007-06-05 14:12 ` [RFC DUMMY 05/09]: Use dev->stats Patrick McHardy
@ 2007-06-05 14:13 ` Patrick McHardy
  2007-06-05 19:44   ` David Miller
  2007-06-05 14:13 ` [RFC DUMMY 07/09]: Use rtnl_link API Patrick McHardy
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:13 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[DUMMY]: Keep dummy devices on list

Use a list instead of an array to allow creating new devices.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 5785eb9eb2c30be5662261fc115a65d28cc98c17
tree 51a0c8e4d7522895e2582f4a90330dad7e4d237e
parent c5178079b5d191e34a516dc111be862e3382e32b
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:28 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:28 +0200

 drivers/net/dummy.c |   47 +++++++++++++++++++++++++++++------------------
 1 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 91b474c..2f2cf3c 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -34,6 +34,12 @@
 #include <linux/etherdevice.h>
 #include <linux/init.h>
 #include <linux/moduleparam.h>
+#include <linux/rtnetlink.h>
+
+struct dummy_priv {
+	struct net_device *dev;
+	struct list_head list;
+};
 
 static int numdummies = 1;
 
@@ -81,18 +87,20 @@ static int dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 	return 0;
 }
 
-static struct net_device **dummies;
+static LIST_HEAD(dummies);
 
 /* Number of dummy devices to be set up by this module. */
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
 
-static int __init dummy_init_one(int index)
+static int __init dummy_init_one(void)
 {
 	struct net_device *dev_dummy;
+	struct dummy_priv *priv;
 	int err;
 
-	dev_dummy = alloc_netdev(0, "dummy%d", dummy_setup);
+	dev_dummy = alloc_netdev(sizeof(struct dummy_priv), "dummy%d",
+				 dummy_setup);
 
 	if (!dev_dummy)
 		return -ENOMEM;
@@ -101,40 +109,43 @@ static int __init dummy_init_one(int index)
 		free_netdev(dev_dummy);
 		dev_dummy = NULL;
 	} else {
-		dummies[index] = dev_dummy;
+		priv = netdev_priv(dev_dummy);
+		priv->dev = dev_dummy;
+		list_add_tail(&priv->list, &dummies);
 	}
 
 	return err;
 }
 
-static void dummy_free_one(int index)
+static void dummy_free_one(struct net_device *dev)
 {
-	unregister_netdev(dummies[index]);
-	free_netdev(dummies[index]);
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	list_del(&priv->list);
+	unregister_netdev(dev);
+	free_netdev(dev);
 }
 
 static int __init dummy_init_module(void)
 {
+	struct dummy_priv *priv, *next;
 	int i, err = 0;
-	dummies = kmalloc(numdummies * sizeof(void *), GFP_KERNEL);
-	if (!dummies)
-		return -ENOMEM;
+
 	for (i = 0; i < numdummies && !err; i++)
-		err = dummy_init_one(i);
+		err = dummy_init_one();
 	if (err) {
-		i--;
-		while (--i >= 0)
-			dummy_free_one(i);
+		list_for_each_entry_safe(priv, next, &dummies, list)
+			dummy_free_one(priv->dev);
 	}
 	return err;
 }
 
 static void __exit dummy_cleanup_module(void)
 {
-	int i;
-	for (i = 0; i < numdummies; i++)
-		dummy_free_one(i);
-	kfree(dummies);
+	struct dummy_priv *priv, *next;
+
+	list_for_each_entry_safe(priv, next, &dummies, list)
+		dummy_free_one(priv->dev);
 }
 
 module_init(dummy_init_module);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC DUMMY 07/09]: Use rtnl_link API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (5 preceding siblings ...)
  2007-06-05 14:13 ` [RFC DUMMY 06/09]: Keep dummy devices on list Patrick McHardy
@ 2007-06-05 14:13 ` Patrick McHardy
  2007-06-05 19:46   ` David Miller
  2007-06-05 14:13 ` [RFC IFB 08/09]: Keep ifb devices on list Patrick McHardy
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:13 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[DUMMY]: Use rtnl_link API

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit a86d7c15680b4bbeec06f0194c2ca927648e33dd
tree f6f55cbbd72b9643fb2f9c4e5859a3a2d699504a
parent 5785eb9eb2c30be5662261fc115a65d28cc98c17
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:28 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:28 +0200

 drivers/net/dummy.c |   81 ++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 2f2cf3c..b1bb7a0 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -35,6 +35,7 @@
 #include <linux/init.h>
 #include <linux/moduleparam.h>
 #include <linux/rtnetlink.h>
+#include <net/rtnetlink.h>
 
 struct dummy_priv {
 	struct net_device *dev;
@@ -61,12 +62,13 @@ static void set_multicast_list(struct net_device *dev)
 {
 }
 
-static void __init dummy_setup(struct net_device *dev)
+static void dummy_setup(struct net_device *dev)
 {
 	/* Initialize the device structure. */
 	dev->hard_start_xmit = dummy_xmit;
 	dev->set_multicast_list = set_multicast_list;
 	dev->set_mac_address = dummy_set_address;
+	dev->destructor = free_netdev;
 
 	/* Fill in device structure with ethernet-generic values. */
 	ether_setup(dev);
@@ -89,6 +91,37 @@ static int dummy_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static LIST_HEAD(dummies);
 
+static int dummy_newlink(struct net_device *dev,
+			 struct nlattr *tb[], struct nlattr *data[])
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+	int err;
+
+	err = register_netdevice(dev);
+	if (err < 0)
+		return err;
+
+	priv->dev = dev;
+	list_add_tail(&priv->list, &dummies);
+	return 0;
+}
+
+static void dummy_dellink(struct net_device *dev)
+{
+	struct dummy_priv *priv = netdev_priv(dev);
+
+	list_del(&priv->list);
+	unregister_netdevice(dev);
+}
+
+static struct rtnl_link_ops dummy_link_ops = {
+	.name		= "dummy",
+	.priv_size	= sizeof(struct dummy_priv),
+	.setup		= dummy_setup,
+	.newlink	= dummy_newlink,
+	.dellink	= dummy_dellink,
+};
+
 /* Number of dummy devices to be set up by this module. */
 module_param(numdummies, int, 0);
 MODULE_PARM_DESC(numdummies, "Number of dummy pseudo devices");
@@ -105,25 +138,22 @@ static int __init dummy_init_one(void)
 	if (!dev_dummy)
 		return -ENOMEM;
 
-	if ((err = register_netdev(dev_dummy))) {
-		free_netdev(dev_dummy);
-		dev_dummy = NULL;
-	} else {
-		priv = netdev_priv(dev_dummy);
-		priv->dev = dev_dummy;
-		list_add_tail(&priv->list, &dummies);
-	}
+	err = dev_alloc_name(dev_dummy, dev_dummy->name);
+	if (err < 0)
+		goto err;
 
-	return err;
-}
+	err = register_netdevice(dev_dummy);
+	if (err < 0)
+		goto err;
 
-static void dummy_free_one(struct net_device *dev)
-{
-	struct dummy_priv *priv = netdev_priv(dev);
+	priv = netdev_priv(dev_dummy);
+	priv->dev = dev_dummy;
+	list_add_tail(&priv->list, &dummies);
+	return 0;
 
-	list_del(&priv->list);
-	unregister_netdev(dev);
-	free_netdev(dev);
+err:
+	free_netdev(dev_dummy);
+	return err;
 }
 
 static int __init dummy_init_module(void)
@@ -131,12 +161,18 @@ static int __init dummy_init_module(void)
 	struct dummy_priv *priv, *next;
 	int i, err = 0;
 
+	rtnl_lock();
+	err = __rtnl_link_register(&dummy_link_ops);
+
 	for (i = 0; i < numdummies && !err; i++)
 		err = dummy_init_one();
-	if (err) {
+	if (err < 0) {
 		list_for_each_entry_safe(priv, next, &dummies, list)
-			dummy_free_one(priv->dev);
+			dummy_dellink(priv->dev);
+		__rtnl_link_unregister(&dummy_link_ops);
 	}
+	rtnl_unlock();
+
 	return err;
 }
 
@@ -144,10 +180,15 @@ static void __exit dummy_cleanup_module(void)
 {
 	struct dummy_priv *priv, *next;
 
+	rtnl_lock();
 	list_for_each_entry_safe(priv, next, &dummies, list)
-		dummy_free_one(priv->dev);
+		dummy_dellink(priv->dev);
+
+	__rtnl_link_unregister(&dummy_link_ops);
+	rtnl_unlock();
 }
 
 module_init(dummy_init_module);
 module_exit(dummy_cleanup_module);
 MODULE_LICENSE("GPL");
+MODULE_ALIAS_RTNL_LINK("dummy");

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC IFB 08/09]: Keep ifb devices on list
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (6 preceding siblings ...)
  2007-06-05 14:13 ` [RFC DUMMY 07/09]: Use rtnl_link API Patrick McHardy
@ 2007-06-05 14:13 ` Patrick McHardy
  2007-06-05 14:13 ` [RFC IFB 09/09]: Use rtnl_link API Patrick McHardy
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:13 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[IFB]: Keep ifb devices on list

Use a list instead of an array to allow creating new devices.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 279921d3d97ab5f310142bdb27b5459742dbba4f
tree b6602b51779ffefeae65a32c80cc92da90a40a6c
parent a86d7c15680b4bbeec06f0194c2ca927648e33dd
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:29 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:29 +0200

 drivers/net/ifb.c |   36 +++++++++++++++++++++---------------
 1 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 07b4c0d..819945e 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -33,12 +33,15 @@
 #include <linux/etherdevice.h>
 #include <linux/init.h>
 #include <linux/moduleparam.h>
+#include <linux/list.h>
 #include <net/pkt_sched.h>
 
 #define TX_TIMEOUT  (2*HZ)
 
 #define TX_Q_LIMIT    32
 struct ifb_private {
+	struct list_head	list;
+	struct net_device	*dev;
 	struct net_device_stats stats;
 	struct tasklet_struct   ifb_tasklet;
 	int     tasklet_pending;
@@ -197,7 +200,7 @@ static struct net_device_stats *ifb_get_stats(struct net_device *dev)
 	return stats;
 }
 
-static struct net_device **ifbs;
+static LIST_HEAD(ifbs);
 
 /* Number of ifb devices to be set up by this module. */
 module_param(numifbs, int, 0);
@@ -229,6 +232,7 @@ static int ifb_open(struct net_device *dev)
 static int __init ifb_init_one(int index)
 {
 	struct net_device *dev_ifb;
+	struct ifb_private *priv;
 	int err;
 
 	dev_ifb = alloc_netdev(sizeof(struct ifb_private),
@@ -241,30 +245,33 @@ static int __init ifb_init_one(int index)
 		free_netdev(dev_ifb);
 		dev_ifb = NULL;
 	} else {
-		ifbs[index] = dev_ifb;
+		priv = netdev_priv(dev_ifb);
+		priv->dev = dev_ifb;
+		list_add_tail(&priv->list, &ifbs);
 	}
 
 	return err;
 }
 
-static void ifb_free_one(int index)
+static void ifb_free_one(struct net_device *dev)
 {
-	unregister_netdev(ifbs[index]);
-	free_netdev(ifbs[index]);
+	struct ifb_private *priv = netdev_priv(dev);
+
+	list_del(&priv->list);
+	unregister_netdev(dev);
+	free_netdev(dev);
 }
 
 static int __init ifb_init_module(void)
 {
+	struct ifb_private *priv, *next;
 	int i, err = 0;
-	ifbs = kmalloc(numifbs * sizeof(void *), GFP_KERNEL);
-	if (!ifbs)
-		return -ENOMEM;
+
 	for (i = 0; i < numifbs && !err; i++)
 		err = ifb_init_one(i);
 	if (err) {
-		i--;
-		while (--i >= 0)
-			ifb_free_one(i);
+		list_for_each_entry_safe(priv, next, &ifbs, list)
+			ifb_free_one(priv->dev);
 	}
 
 	return err;
@@ -272,11 +279,10 @@ static int __init ifb_init_module(void)
 
 static void __exit ifb_cleanup_module(void)
 {
-	int i;
+	struct ifb_private *priv, *next;
 
-	for (i = 0; i < numifbs; i++)
-		ifb_free_one(i);
-	kfree(ifbs);
+	list_for_each_entry_safe(priv, next, &ifbs, list)
+		ifb_free_one(priv->dev);
 }
 
 module_init(ifb_init_module);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC IFB 09/09]: Use rtnl_link API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (7 preceding siblings ...)
  2007-06-05 14:13 ` [RFC IFB 08/09]: Keep ifb devices on list Patrick McHardy
@ 2007-06-05 14:13 ` Patrick McHardy
  2007-06-05 14:18 ` [RFC IPROUTE]: iplink: use netlink for link configuration Patrick McHardy
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:13 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

[IFB]: Use rtnl_link API

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit 6a0492fd68c8c5f528e4277bec20bef8047aec2e
tree 7e2ae7fb9e33d7c547c3030b9923de232aaf9ab3
parent 279921d3d97ab5f310142bdb27b5459742dbba4f
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:29 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 15:40:29 +0200

 drivers/net/ifb.c |   79 ++++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index 819945e..9ec01ff 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -139,13 +139,14 @@ resched:
 
 }
 
-static void __init ifb_setup(struct net_device *dev)
+static void ifb_setup(struct net_device *dev)
 {
 	/* Initialize the device structure. */
 	dev->get_stats = ifb_get_stats;
 	dev->hard_start_xmit = ifb_xmit;
 	dev->open = &ifb_open;
 	dev->stop = &ifb_close;
+	dev->destructor = free_netdev;
 
 	/* Fill in device structure with ethernet-generic values. */
 	ether_setup(dev);
@@ -229,6 +230,37 @@ static int ifb_open(struct net_device *dev)
 	return 0;
 }
 
+static int ifb_newlink(struct net_device *dev,
+		       struct nlattr *tb[], struct nlattr *data[])
+{
+	struct ifb_private *priv = netdev_priv(dev);
+	int err;
+
+	err = register_netdevice(dev);
+	if (err < 0)
+		return err;
+
+	priv->dev = dev;
+	list_add_tail(&priv->list, &ifbs);
+	return 0;
+}
+
+static void ifb_dellink(struct net_device *dev)
+{
+	struct ifb_private *priv = netdev_priv(dev);
+
+	list_del(&priv->list);
+	unregister_netdevice(dev);
+}
+
+static struct rtnl_link_ops ifb_link_ops = {
+	.name		= "ifb",
+	.priv_size	= sizeof(struct ifb_private),
+	.setup		= ifb_setup,
+	.newlink	= ifb_newlink,
+	.dellink	= ifb_dellink,
+};
+
 static int __init ifb_init_one(int index)
 {
 	struct net_device *dev_ifb;
@@ -241,38 +273,40 @@ static int __init ifb_init_one(int index)
 	if (!dev_ifb)
 		return -ENOMEM;
 
-	if ((err = register_netdev(dev_ifb))) {
-		free_netdev(dev_ifb);
-		dev_ifb = NULL;
-	} else {
-		priv = netdev_priv(dev_ifb);
-		priv->dev = dev_ifb;
-		list_add_tail(&priv->list, &ifbs);
-	}
+	err = dev_alloc_name(dev_ifb, dev_ifb->name);
+	if (err < 0)
+		goto err;
 
-	return err;
-}
+	err = register_netdevice(dev_ifb);
+	if (err < 0)
+		goto err;
 
-static void ifb_free_one(struct net_device *dev)
-{
-	struct ifb_private *priv = netdev_priv(dev);
+	priv = netdev_priv(dev_ifb);
+	priv->dev = dev_ifb;
+	list_add_tail(&priv->list, &ifbs);
+	return 0;
 
-	list_del(&priv->list);
-	unregister_netdev(dev);
-	free_netdev(dev);
+err:
+	free_netdev(dev_ifb);
+	return err;
 }
 
 static int __init ifb_init_module(void)
 {
 	struct ifb_private *priv, *next;
-	int i, err = 0;
+	int i, err;
+
+	rtnl_lock();
+	err = __rtnl_link_register(&ifb_link_ops);
 
 	for (i = 0; i < numifbs && !err; i++)
 		err = ifb_init_one(i);
 	if (err) {
 		list_for_each_entry_safe(priv, next, &ifbs, list)
-			ifb_free_one(priv->dev);
+			ifb_dellink(priv->dev);
+		__rtnl_link_unregister(&ifb_link_ops);
 	}
+	rtnl_unlock();
 
 	return err;
 }
@@ -281,11 +315,16 @@ static void __exit ifb_cleanup_module(void)
 {
 	struct ifb_private *priv, *next;
 
+	rtnl_lock();
 	list_for_each_entry_safe(priv, next, &ifbs, list)
-		ifb_free_one(priv->dev);
+		ifb_dellink(priv->dev);
+
+	__rtnl_link_unregister(&ifb_link_ops);
+	rtnl_unlock();
 }
 
 module_init(ifb_init_module);
 module_exit(ifb_cleanup_module);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Jamal Hadi Salim");
+MODULE_ALIAS_RTNL_LINK("ifb");

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC IPROUTE]: iplink: use netlink for link configuration
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (8 preceding siblings ...)
  2007-06-05 14:13 ` [RFC IFB 09/09]: Use rtnl_link API Patrick McHardy
@ 2007-06-05 14:18 ` Patrick McHardy
  2007-06-05 19:37 ` [RFC RTNETLINK 00/09]: Netlink link creation API David Miller
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 14:18 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, ebiederm, tgraf

[-- Attachment #1: Type: text/plain, Size: 397 bytes --]

The iproute patch for the rtnl_link API. For simple devices
that take no configuration like dummy or ifb no further
changes are needed.

Example:

Create dummy device:

# ip link add type dummy

Show device:

# ip -d link list dummy0

9: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop
    link/ether ae:9a:0c:8e:f5:e1 brd ff:ff:ff:ff:ff:ff
    dummy

Delete device again:

# ip link delete dummy0


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 13869 bytes --]

[IPROUTE]: iplink: use netlink for link configuration

Add support for using netlink for link configuration. Kernel-support is
probed, when not available it falls back to using ioctls.

Signed-off-by: Patrick McHardy <kaber@trash.net>

---
commit e59a7a02053c997a2b7ff9a4436bd3deb4781bf4
tree 0c0a45170d43c0b1bca2560851ccfb3f3ccbebaa
parent b16621cafd599499fdbaa79236266d72a53106bb
author Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 16:14:50 +0200
committer Patrick McHardy <kaber@trash.net> Tue, 05 Jun 2007 16:14:50 +0200

 include/linux/if_link.h |   13 ++
 ip/Makefile             |    2 
 ip/ip.c                 |    5 +
 ip/ip_common.h          |   13 ++
 ip/ipaddress.c          |   35 +++++
 ip/iplink.c             |  311 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 376 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 2920e8a..aac0df1 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -76,6 +76,8 @@ enum
 #define IFLA_WEIGHT IFLA_WEIGHT
 	IFLA_OPERSTATE,
 	IFLA_LINKMODE,
+	IFLA_LINKINFO,
+#define IFLA_LINKINFO IFLA_LINKINFO
 	__IFLA_MAX
 };
 
@@ -137,4 +139,15 @@ struct ifla_cacheinfo
 	__u32	retrans_time;
 };
 
+enum
+{
+	IFLA_INFO_UNSPEC,
+	IFLA_INFO_NAME,
+	IFLA_INFO_DATA,
+	IFLA_INFO_XSTATS,
+	__IFLA_INFO_MAX,
+};
+
+#define IFLA_INFO_MAX	(__IFLA_INFO_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/ip/Makefile b/ip/Makefile
index a749993..9a5bfe3 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -22,3 +22,5 @@ install: all
 clean:
 	rm -f $(ALLOBJ) $(TARGETS)
 
+LDLIBS	+= -ldl
+LDFLAGS	+= -Wl,-export-dynamic
diff --git a/ip/ip.c b/ip/ip.c
index c084292..4bdb83b 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -30,6 +30,7 @@
 
 int preferred_family = AF_UNSPEC;
 int show_stats = 0;
+int show_details = 0;
 int resolve_hosts = 0;
 int oneline = 0;
 int timestamp = 0;
@@ -47,7 +48,7 @@ static void usage(void)
 "       ip [ -force ] [-batch filename\n"
 "where  OBJECT := { link | addr | route | rule | neigh | ntable | tunnel |\n"
 "                   maddr | mroute | monitor | xfrm }\n"
-"       OPTIONS := { -V[ersion] | -s[tatistics] | -r[esolve] |\n"
+"       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
 "                    -o[neline] | -t[imestamp] }\n");
 	exit(-1);
@@ -188,6 +189,8 @@ int main(int argc, char **argv)
 		} else if (matches(opt, "-stats") == 0 ||
 			   matches(opt, "-statistics") == 0) {
 			++show_stats;
+		} else if (matches(opt, "-details") == 0) {
+			++show_details;
 		} else if (matches(opt, "-resolve") == 0) {
 			++resolve_hosts;
 		} else if (matches(opt, "-oneline") == 0) {
diff --git a/ip/ip_common.h b/ip/ip_common.h
index 5bfd9b9..642c609 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -45,6 +45,19 @@ static inline int rtm_get_table(struct rtmsg *r, struct rtattr **tb)
 
 extern struct rtnl_handle rth;
 
+struct link_util
+{
+	struct link_util	*next;
+	const char		*id;
+	int			maxattr;
+	int			(*parse_opt)(struct link_util *, int, char **,
+					     struct nlmsghdr *);
+	void			(*print_opt)(struct link_util *, FILE *,
+					     struct rtattr *[]);
+};
+
+struct link_util *get_link_type(const char *type);
+
 #ifndef	INFINITY_LIFE_TIME
 #define     INFINITY_LIFE_TIME      0xFFFFFFFFU
 #endif
diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 98effa3..58254ea 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -134,6 +134,37 @@ void print_queuelen(char *name)
 		printf("qlen %d", ifr.ifr_qlen);
 }
 
+static void print_linktype(FILE *fp, struct rtattr *tb)
+{
+	struct rtattr *linkinfo[IFLA_INFO_MAX+1];
+	struct link_util *lu;
+	char *type;
+
+	parse_rtattr_nested(linkinfo, IFLA_INFO_MAX, tb);
+
+	if (!linkinfo[IFLA_INFO_NAME])
+		return;
+	type = RTA_DATA(linkinfo[IFLA_INFO_NAME]);
+
+	fprintf(fp, "%s", _SL_);
+	fprintf(fp, "    %s ", type);
+
+	lu = get_link_type(type);
+	if (!lu || !lu->print_opt)
+		return;
+
+	if (1) {
+		struct rtattr *attr[lu->maxattr+1], **data = NULL;
+
+		if (linkinfo[IFLA_INFO_DATA]) {
+			parse_rtattr_nested(attr, lu->maxattr,
+					    linkinfo[IFLA_INFO_DATA]);
+			data = attr;
+		}
+		lu->print_opt(lu, fp, data);
+	}
+}
+
 int print_linkinfo(const struct sockaddr_nl *who,
 		   struct nlmsghdr *n, void *arg)
 {
@@ -221,6 +252,10 @@ int print_linkinfo(const struct sockaddr_nl *who,
 						      b1, sizeof(b1)));
 		}
 	}
+
+	if (do_link && tb[IFLA_LINKINFO] && show_details)
+		print_linktype(fp, tb[IFLA_LINKINFO]);
+
 	if (do_link && tb[IFLA_STATS] && show_stats) {
 		struct rtnl_link_stats slocal;
 		struct rtnl_link_stats *s = RTA_DATA(tb[IFLA_STATS]);
diff --git a/ip/iplink.c b/ip/iplink.c
index 8f82a08..cfacdab 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <syslog.h>
 #include <fcntl.h>
+#include <dlfcn.h>
 #include <errno.h>
 #include <sys/socket.h>
 #include <linux/if.h>
@@ -31,6 +32,7 @@
 #include "utils.h"
 #include "ip_common.h"
 
+#define IPLINK_IOCTL_COMPAT	1
 
 static void usage(void) __attribute__((noreturn));
 
@@ -62,6 +64,290 @@ static int on_off(char *msg)
 	return -1;
 }
 
+static void *BODY;		/* cached dlopen(NULL) handle */
+static struct link_util *linkutil_list;
+
+struct link_util *get_link_type(const char *id)
+{
+	void *dlh;
+	char buf[256];
+	struct link_util *l;
+
+	for (l = linkutil_list; l; l = l->next)
+		if (strcmp(l->id, id) == 0)
+			return l;
+
+	snprintf(buf, sizeof(buf), "/usr/lib/ip/link_%s.so", id);
+	dlh = dlopen(buf, RTLD_LAZY);
+	if (dlh == NULL) {
+		/* look in current binary, only open once */
+		dlh = BODY;
+		if (dlh == NULL) {
+			dlh = BODY = dlopen(NULL, RTLD_LAZY);
+			if (dlh == NULL)
+				return NULL;
+		}
+	}
+
+	snprintf(buf, sizeof(buf), "%s_link_util", id);
+	l = dlsym(dlh, buf);
+	if (l == NULL)
+		return NULL;
+
+	l->next = linkutil_list;
+	linkutil_list = l;
+	return l;
+}
+
+#if IPLINK_IOCTL_COMPAT
+static int have_rtnl_newlink = -1;
+
+static int accept_msg(const struct sockaddr_nl *who,
+		      struct nlmsghdr *n, void *arg)
+{
+	struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(n);
+
+	if (n->nlmsg_type == NLMSG_ERROR && err->error == -EOPNOTSUPP)
+		have_rtnl_newlink = 0;
+	else
+		have_rtnl_newlink = 1;
+	return -1;
+}
+
+static int iplink_have_newlink(void)
+{
+	struct {
+		struct nlmsghdr		n;
+		struct ifinfomsg	i;
+		char			buf[1024];
+	} req;
+
+	if (have_rtnl_newlink < 0) {
+		memset(&req, 0, sizeof(req));
+
+		req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+		req.n.nlmsg_flags = NLM_F_REQUEST|NLM_F_ACK;
+		req.n.nlmsg_type = RTM_NEWLINK;
+		req.i.ifi_family = AF_UNSPEC;
+
+		rtnl_send(&rth, (char *)&req.n, req.n.nlmsg_len);
+		rtnl_listen(&rth, accept_msg, NULL);
+	}
+	return have_rtnl_newlink;
+}
+#else /* IPLINK_IOCTL_COMPAT */
+static int iplink_have_newlink(void)
+{
+	return 1;
+}
+#endif /* ! IPLINK_IOCTL_COMPAT */
+
+static int iplink_modify(int cmd, unsigned int flags, int argc, char **argv)
+{
+	int qlen = -1;
+	int mtu = -1;
+	int len;
+	char abuf[32];
+	char *dev = NULL;
+	char *name = NULL;
+	char *link = NULL;
+	char *type = NULL;
+	struct link_util *lu = NULL;
+	struct {
+		struct nlmsghdr		n;
+		struct ifinfomsg	i;
+		char			buf[1024];
+	} req;
+
+	memset(&req, 0, sizeof(req));
+
+	req.n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
+	req.n.nlmsg_flags = NLM_F_REQUEST|flags;
+	req.n.nlmsg_type = cmd;
+	req.i.ifi_family = preferred_family;
+
+	while (argc > 0) {
+		if (strcmp(*argv, "up") == 0) {
+			req.i.ifi_change |= IFF_UP;
+			req.i.ifi_flags |= IFF_UP;
+		} else if (strcmp(*argv, "down") == 0) {
+			req.i.ifi_change |= IFF_UP;
+			req.i.ifi_flags &= ~IFF_UP;
+		} else if (strcmp(*argv, "name") == 0) {
+			NEXT_ARG();
+			name = *argv;
+		} else if (matches(*argv, "link") == 0) {
+			NEXT_ARG();
+			link = *argv;
+		} else if (matches(*argv, "address") == 0) {
+			NEXT_ARG();
+			len = ll_addr_a2n(abuf, sizeof(abuf), *argv);
+			addattr_l(&req.n, sizeof(req), IFLA_ADDRESS, abuf, len);
+		} else if (matches(*argv, "broadcast") == 0 ||
+			   strcmp(*argv, "brd") == 0) {
+			NEXT_ARG();
+			len = ll_addr_a2n(abuf, sizeof(abuf), *argv);
+			addattr_l(&req.n, sizeof(req), IFLA_BROADCAST, abuf, len);
+		} else if (matches(*argv, "txqueuelen") == 0 ||
+			   strcmp(*argv, "qlen") == 0 ||
+			   matches(*argv, "txqlen") == 0) {
+			NEXT_ARG();
+			if (qlen != -1)
+				duparg("txqueuelen", *argv);
+			if (get_integer(&qlen,  *argv, 0))
+				invarg("Invalid \"txqueuelen\" value\n", *argv);
+			addattr_l(&req.n, sizeof(req), IFLA_TXQLEN, &qlen, 4);
+		} else if (strcmp(*argv, "mtu") == 0) {
+			NEXT_ARG();
+			if (mtu != -1)
+				duparg("mtu", *argv);
+			if (get_integer(&mtu, *argv, 0))
+				invarg("Invalid \"mtu\" value\n", *argv);
+			addattr_l(&req.n, sizeof(req), IFLA_MTU, &mtu, 4);
+		} else if (strcmp(*argv, "multicast") == 0) {
+			NEXT_ARG();
+			req.i.ifi_change |= IFF_MULTICAST;
+			if (strcmp(*argv, "on") == 0) {
+				req.i.ifi_flags |= IFF_MULTICAST;
+			} else if (strcmp(*argv, "off") == 0) {
+				req.i.ifi_flags &= ~IFF_MULTICAST;
+			} else
+				return on_off("multicast");
+		} else if (strcmp(*argv, "allmulticast") == 0) {
+			NEXT_ARG();
+			req.i.ifi_change |= IFF_ALLMULTI;
+			if (strcmp(*argv, "on") == 0) {
+				req.i.ifi_flags |= IFF_ALLMULTI;
+			} else if (strcmp(*argv, "off") == 0) {
+				req.i.ifi_flags &= ~IFF_ALLMULTI;
+			} else
+				return on_off("allmulticast");
+		} else if (strcmp(*argv, "promisc") == 0) {
+			NEXT_ARG();
+			req.i.ifi_change |= IFF_PROMISC;
+			if (strcmp(*argv, "on") == 0) {
+				req.i.ifi_flags |= IFF_PROMISC;
+			} else if (strcmp(*argv, "off") == 0) {
+				req.i.ifi_flags &= ~IFF_PROMISC;
+			} else
+				return on_off("promisc");
+		} else if (strcmp(*argv, "trailers") == 0) {
+			NEXT_ARG();
+			req.i.ifi_change |= IFF_NOTRAILERS;
+			if (strcmp(*argv, "off") == 0) {
+				req.i.ifi_flags |= IFF_NOTRAILERS;
+			} else if (strcmp(*argv, "on") == 0) {
+				req.i.ifi_flags &= ~IFF_NOTRAILERS;
+			} else
+				return on_off("trailers");
+		} else if (strcmp(*argv, "arp") == 0) {
+			NEXT_ARG();
+			req.i.ifi_change |= IFF_NOARP;
+			if (strcmp(*argv, "on") == 0) {
+				req.i.ifi_flags &= ~IFF_NOARP;
+			} else if (strcmp(*argv, "off") == 0) {
+				req.i.ifi_flags |= IFF_NOARP;
+			} else
+				return on_off("noarp");
+#ifdef IFF_DYNAMIC
+		} else if (matches(*argv, "dynamic") == 0) {
+			NEXT_ARG();
+			req.i.ifi_change |= IFF_DYNAMIC;
+			if (strcmp(*argv, "on") == 0) {
+				req.i.ifi_flags |= IFF_DYNAMIC;
+			} else if (strcmp(*argv, "off") == 0) {
+				req.i.ifi_flags &= ~IFF_DYNAMIC;
+			} else
+				return on_off("dynamic");
+#endif
+		} else if (matches(*argv, "type") == 0) {
+			NEXT_ARG();
+			type = *argv;
+			argc--; argv++;
+			break;
+		} else {
+                        if (strcmp(*argv, "dev") == 0) {
+				NEXT_ARG();
+			}
+			if (dev)
+				duparg2("dev", *argv);
+			dev = *argv;
+		}
+		argc--; argv++;
+	}
+
+	ll_init_map(&rth);
+
+	if (type) {
+		struct rtattr *linkinfo = NLMSG_TAIL(&req.n);
+		addattr_l(&req.n, sizeof(req), IFLA_LINKINFO, NULL, 0);
+		addattr_l(&req.n, sizeof(req), IFLA_INFO_NAME, type,
+			 strlen(type));
+
+		lu = get_link_type(type);
+		if (lu) {
+			struct rtattr * data = NLMSG_TAIL(&req.n);
+			addattr_l(&req.n, sizeof(req), IFLA_INFO_DATA, NULL, 0);
+
+			if (lu->parse_opt &&
+			    lu->parse_opt(lu, argc, argv, &req.n))
+				return -1;
+
+			data->rta_len = (void *)NLMSG_TAIL(&req.n) - (void *)data;
+		} else if (argc) {
+			if (matches(*argv, "help") == 0)
+				usage();
+			fprintf(stderr, "Garbage instead of arguments \"%s ...\". "
+					"Try \"ip link help\".\n", *argv);
+			return -1;
+		}
+		linkinfo->rta_len = (void *)NLMSG_TAIL(&req.n) - (void *)linkinfo;
+	}
+
+	if (!(flags & NLM_F_CREATE)) {
+		if (!dev) {
+			fprintf(stderr, "Not enough information: \"dev\" "
+					"argument is required.\n");
+			exit(-1);
+		}
+
+		req.i.ifi_index = ll_name_to_index(dev);
+		if (req.i.ifi_index == 0) {
+			fprintf(stderr, "Cannot find device \"%s\"\n", dev);
+			return -1;
+		}
+	} else {
+		/* Allow "ip link add dev" and "ip link add name" */
+		if (!name)
+			name = dev;
+
+		if (link) {
+			int ifindex;
+			
+			ifindex = ll_name_to_index(link);
+			if (ifindex == 0) {
+				fprintf(stderr, "Cannot find device \"%s\"\n", 
+					link);
+				return -1;
+			}
+			addattr_l(&req.n, sizeof(req), IFLA_LINK, &ifindex, 4);
+		}
+	}
+
+	if (name) {
+		len = strlen(name) + 1;
+		if (len > IFNAMSIZ)
+			invarg("\"name\" too long\n", *argv);
+		addattr_l(&req.n, sizeof(req), IFLA_IFNAME, name, len);
+	}
+
+	if (rtnl_talk(&rth, &req.n, 0, 0, NULL, NULL, NULL) < 0)
+		exit(2);
+
+	return 0;
+}
+
+#if IPLINK_IOCTL_COMPAT
 static int get_ctl_fd(void)
 {
 	int s_errno;
@@ -410,12 +696,33 @@ static int do_set(int argc, char **argv)
 		return do_chflags(dev, flags, mask);
 	return 0;
 }
+#endif /* IPLINK_IOCTL_COMPAT */
 
 int do_iplink(int argc, char **argv)
 {
 	if (argc > 0) {
-		if (matches(*argv, "set") == 0)
-			return do_set(argc-1, argv+1);
+		if (iplink_have_newlink()) {
+			if (matches(*argv, "add") == 0)
+				return iplink_modify(RTM_NEWLINK,
+						     NLM_F_CREATE|NLM_F_EXCL,
+						     argc-1, argv+1);
+			if (matches(*argv, "set") == 0 ||
+			    matches(*argv, "change") == 0)
+				return iplink_modify(RTM_NEWLINK, 0,
+						     argc-1, argv+1);
+			if (matches(*argv, "replace") == 0)
+				return iplink_modify(RTM_NEWLINK,
+						     NLM_F_CREATE|NLM_F_REPLACE,
+						     argc-1, argv+1);
+			if (matches(*argv, "delete") == 0)
+				return iplink_modify(RTM_DELLINK, 0,
+						     argc-1, argv+1);
+		} else {
+#if IPLINK_IOCTL_COMPAT
+			if (matches(*argv, "set") == 0)
+				return do_set(argc-1, argv+1);
+#endif
+		}
 		if (matches(*argv, "show") == 0 ||
 		    matches(*argv, "lst") == 0 ||
 		    matches(*argv, "list") == 0)

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (9 preceding siblings ...)
  2007-06-05 14:18 ` [RFC IPROUTE]: iplink: use netlink for link configuration Patrick McHardy
@ 2007-06-05 19:37 ` David Miller
  2007-06-05 21:10   ` Patrick McHardy
  2007-06-05 22:00 ` jamal
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: David Miller @ 2007-06-05 19:37 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:12:51 +0200 (MEST)

> A few words about the API:
> 
> Drivers wishing to use the API register a struct rtnl_link_ops, which
> contains a few function pointers for device setup, registation, changing
> and deletion, as well as netlink attribute validation and device dumping.
> 
> All netlink communication happens within the AF_UNSPEC family. I
> initially introduced new netlink families for this, but removed them
> again since that would require adding new protocol families that serve
> no further purpose for most drivers. Additionally we currently use
> RTM.*LINK messages with ifi_family != AF_UNSPEC for information that
> is related to the device, but doesn't come from the driver that created
> the device itself, like bridge port state, IPv6 device configuration etc.
> 
> The device specific attributes are nested within a new attribute
> IFLA_LINKINFO. I didn't use IFLA_PROTINFO since userspace can reasonably
> expect to have IFLA_PROTINFO unset for AF_UNSPEC messages, and the
> userspace STP daemon does that. Identification of the driver happens
> by name, stored in the IFLA_INFO_NAME attribute. IFLA_INFO_DATA contains
> driver specific attributes, IFLA_INFO_XSTATS driver specific statistics.
> 
> The API does *not* use the existing RTM_SETLINK message type, instead
> it adds support for receiving RTM_NEWLINK within the kernel. I did this
> because of three reasons: 
> 
> - RTM_SETLINK does not follow the usual rtnetlink conventions and ignores
>   all netlink flags
> 
> - Other rtnetlink subsystems use the same message type for dumps and
>   notifications from the kernel as for configuration from userspace,
>   which usually allows to recreate an object by simply setting the
>   NLM_F_REQUEST flag on message received from the kernel and sending
>   it back.
> 
> - Easier for userspace to detect support for the new features
> 
> The RTM_NEWLINK message type is a superset of RTM_SETLINK, it allows
> to change both driver specific and generic attributes of the device.
> The set of generic device attributes that may be supplied during
> device creation is limited to a few simple ones, it currently does
> not support specifying link layer address/broadcast address as well
> as device flags. The change operation can change all device attributes.
> 
> Not sure what else to say .. comments welcome.

This excellent description of the APIs (particularly the background
and reasoning) belongs in a file under Documentation/networking/ :-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC NETLINK 01/09]: Mark netlink policies const
  2007-06-05 14:12 ` [RFC NETLINK 01/09]: Mark netlink policies const Patrick McHardy
@ 2007-06-05 19:39   ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2007-06-05 19:39 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:12:52 +0200 (MEST)

> [NETLINK]: Mark netlink policies const
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

This one looks good enough for 2.6.22, as I consider more
accurate typing a way to prevent and find bugs.

So I'll apply this one now, thanks!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 02/09]: ifindex 0 does not exist
  2007-06-05 14:12 ` [RFC RTNETLINK 02/09]: ifindex 0 does not exist Patrick McHardy
@ 2007-06-05 19:40   ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2007-06-05 19:40 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:12:54 +0200 (MEST)

> [RTNETLINK]: ifindex 0 does not exist
> 
> ifindex == 0 does not exist and implies we should do a lookup by name if
> one was given.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

This one is also more like a bug fix, thus applied.

I thought for a moment that perhaps we should signal an
error on ifindex==0 but I think this behavior you choose
makes more sense.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 03/09]: Split up rtnl_setlink
  2007-06-05 14:12 ` [RFC RTNETLINK 03/09]: Split up rtnl_setlink Patrick McHardy
@ 2007-06-05 19:41   ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2007-06-05 19:41 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:12:55 +0200 (MEST)

> [RTNETLINK]: Split up rtnl_setlink
> 
> Split up rtnl_setlink into a function performing validation and a function
> performing the actual changes. This allows to share the modifcation logic
> with rtnl_newlink, which is introduced by the next patch.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

This looks fine to me.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 14:12 ` [RFC RTNETLINK 04/09]: Link creation API Patrick McHardy
@ 2007-06-05 19:43   ` David Miller
  2007-06-05 21:09     ` Patrick McHardy
  2007-06-05 22:03   ` Stephen Hemminger
  2007-06-06 16:37   ` Eric W. Biederman
  2 siblings, 1 reply; 43+ messages in thread
From: David Miller @ 2007-06-05 19:43 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:12:57 +0200 (MEST)

> [RTNETLINK]: Link creation API
> 
> Add rtnetlink API for creating, changing and deleting software devices.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Looks mostly fine, perhaps you can make even more use of 'const'
for instances of "struct rtnl_link_ops *" at least as function
arguments deeper in the implementation.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC DUMMY 05/09]: Use dev->stats
  2007-06-05 14:12 ` [RFC DUMMY 05/09]: Use dev->stats Patrick McHardy
@ 2007-06-05 19:44   ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2007-06-05 19:44 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:12:59 +0200 (MEST)

> [DUMMY]: Use dev->stats
> 
> Use dev->stats instead of netdev_priv().
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Looks good.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC DUMMY 06/09]: Keep dummy devices on list
  2007-06-05 14:13 ` [RFC DUMMY 06/09]: Keep dummy devices on list Patrick McHardy
@ 2007-06-05 19:44   ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2007-06-05 19:44 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:13:00 +0200 (MEST)

> [DUMMY]: Keep dummy devices on list
> 
> Use a list instead of an array to allow creating new devices.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Looks good.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC DUMMY 07/09]: Use rtnl_link API
  2007-06-05 14:13 ` [RFC DUMMY 07/09]: Use rtnl_link API Patrick McHardy
@ 2007-06-05 19:46   ` David Miller
  0 siblings, 0 replies; 43+ messages in thread
From: David Miller @ 2007-06-05 19:46 UTC (permalink / raw)
  To: kaber; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

From: Patrick McHardy <kaber@trash.net>
Date: Tue,  5 Jun 2007 16:13:02 +0200 (MEST)

> [DUMMY]: Use rtnl_link API
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>

This looks good too.

I'll let folks like Jamal and Ben look over the IFB and
VLAN bits.

Thanks for doing this work Patrick!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 19:43   ` David Miller
@ 2007-06-05 21:09     ` Patrick McHardy
  0 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 21:09 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Tue,  5 Jun 2007 16:12:57 +0200 (MEST)
> 
>>[RTNETLINK]: Link creation API
>>
>>Add rtnetlink API for creating, changing and deleting software devices.
>>
>>Signed-off-by: Patrick McHardy <kaber@trash.net>
> 
> 
> Looks mostly fine, perhaps you can make even more use of 'const'
> for instances of "struct rtnl_link_ops *" at least as function
> arguments deeper in the implementation.


Good suggestion. I initially had rtnl_link_ops const everywhere
(since the lookup was by family I stored it in an array and didn't
need the list_head), then changed it to a list and removed all
consts. I'll add them back where possible.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 19:37 ` [RFC RTNETLINK 00/09]: Netlink link creation API David Miller
@ 2007-06-05 21:10   ` Patrick McHardy
  0 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 21:10 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Tue,  5 Jun 2007 16:12:51 +0200 (MEST)
> 
> 
>>A few words about the API:
>>
>>[..]
>>
>>Not sure what else to say .. comments welcome.
> 
> 
> This excellent description of the APIs (particularly the background
> and reasoning) belongs in a file under Documentation/networking/ :-)


I'll add something like this under Documentation/, thanks.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (10 preceding siblings ...)
  2007-06-05 19:37 ` [RFC RTNETLINK 00/09]: Netlink link creation API David Miller
@ 2007-06-05 22:00 ` jamal
  2007-06-05 22:07   ` Patrick McHardy
  2007-06-06  0:40 ` Eric W. Biederman
  2007-06-06 13:46 ` Patrick McHardy
  13 siblings, 1 reply; 43+ messages in thread
From: jamal @ 2007-06-05 22:00 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, xemul, ebiederm, tgraf


All patches looked really good. speaking for the ifb stuff, a definete
ACK. 
The only thing that threw me off for a sec was the naming convention for
"type" referenced via IFLA_INFO_NAME because it seems to be colliding
semantic with dev->type and dev->name as in IFLA_NAME and  
ifi_type ifinfomsg.
But i cant come with a better noun. 

Good stuff, nevertheless  

cheers,
jamal


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 14:12 ` [RFC RTNETLINK 04/09]: Link creation API Patrick McHardy
  2007-06-05 19:43   ` David Miller
@ 2007-06-05 22:03   ` Stephen Hemminger
  2007-06-05 23:17     ` Patrick McHardy
  2007-06-06 16:37   ` Eric W. Biederman
  2 siblings, 1 reply; 43+ messages in thread
From: Stephen Hemminger @ 2007-06-05 22:03 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: netdev, socketcan, hadi, xemul, Patrick McHardy, ebiederm, tgraf

On Tue,  5 Jun 2007 16:12:57 +0200 (MEST)
Patrick McHardy <kaber@trash.net> wrote:

> [RTNETLINK]: Link creation API
> 
> Add rtnetlink API for creating, changing and deleting software devices.
> 
> Signed-off-by: Patrick McHardy <kaber@trash.net>
> 

If you want I'll extend existing bridge netlink to use these.

-- 
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 22:00 ` jamal
@ 2007-06-05 22:07   ` Patrick McHardy
  2007-06-05 22:29     ` jamal
  0 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 22:07 UTC (permalink / raw)
  To: hadi; +Cc: netdev, socketcan, xemul, ebiederm, tgraf

jamal wrote:
> All patches looked really good. speaking for the ifb stuff, a definete
> ACK. 
> The only thing that threw me off for a sec was the naming convention for
> "type" referenced via IFLA_INFO_NAME because it seems to be colliding
> semantic with dev->type and dev->name as in IFLA_NAME and  
> ifi_type ifinfomsg.
> But i cant come with a better noun. 


How about IFLA_INFO_KIND (borrowed from sch_api)? I generally don't
like the IFLA_INFO_ prefix very much, but so far didn't come up with
something better. Suggestions welcome :)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 22:07   ` Patrick McHardy
@ 2007-06-05 22:29     ` jamal
  0 siblings, 0 replies; 43+ messages in thread
From: jamal @ 2007-06-05 22:29 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, xemul, ebiederm, tgraf

On Wed, 2007-06-06 at 00:07 +0200, Patrick McHardy wrote:

> How about IFLA_INFO_KIND (borrowed from sch_api)? I generally don't
> like the IFLA_INFO_ prefix very much, but so far didn't come up with
> something better. Suggestions welcome :)

KIND sounds a lot more tasty ;-> Thanks.

cheers,
jamal


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 23:17     ` Patrick McHardy
@ 2007-06-05 23:16       ` Stephen Hemminger
  2007-06-06 11:42         ` Patrick McHardy
  0 siblings, 1 reply; 43+ messages in thread
From: Stephen Hemminger @ 2007-06-05 23:16 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

On Wed, 06 Jun 2007 01:17:11 +0200
Patrick McHardy <kaber@trash.net> wrote:

> Stephen Hemminger wrote:
> > On Tue,  5 Jun 2007 16:12:57 +0200 (MEST)
> > Patrick McHardy <kaber@trash.net> wrote:
> > 
> > 
> >>[RTNETLINK]: Link creation API
> >>
> >>Add rtnetlink API for creating, changing and deleting software devices.
> >>
> >>Signed-off-by: Patrick McHardy <kaber@trash.net>
> >>
> > 
> > If you want I'll extend existing bridge netlink to use these.
> 
> 
> Are you talking about brige-port information or bridge device
> configuration? So far the API is not suitable for anything that
> currently uses IFLA_PROTINFO because the sender is not the driver
> which created the device and doesn't use AF_UNSPEC. For bridge
> device configuration it would certainly be nice to have, but I'm
> not sure yet how to handle enslave operations. So far my favourite
> idea is to add enslave/release operations to rtnl_link_ops and call
> them when IFLA_MASTER is set (so the netlink message would look like
> this: ifindex: eth0 master: br0 nlmsg_type: RTM_NETLINK). But I
> haven't really thought this through yet.

Was thinking AF_BRIDGE (we have it already so use it), and both
add/remove bridge, and enslave/unslave device.


> I would also like to add support for handling "secondary device state"
> like bridge port state and others that currently use IFLA_PROTINFO
> (a lot of the code is very similar to the generic code), but all ideas
> so far turned out not to work very well.
> 
> I'm leaving for a short vacation until Sunday tommorrow, so replies
> may be delayed :)


-- 
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 22:03   ` Stephen Hemminger
@ 2007-06-05 23:17     ` Patrick McHardy
  2007-06-05 23:16       ` Stephen Hemminger
  0 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-05 23:17 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

Stephen Hemminger wrote:
> On Tue,  5 Jun 2007 16:12:57 +0200 (MEST)
> Patrick McHardy <kaber@trash.net> wrote:
> 
> 
>>[RTNETLINK]: Link creation API
>>
>>Add rtnetlink API for creating, changing and deleting software devices.
>>
>>Signed-off-by: Patrick McHardy <kaber@trash.net>
>>
> 
> If you want I'll extend existing bridge netlink to use these.


Are you talking about brige-port information or bridge device
configuration? So far the API is not suitable for anything that
currently uses IFLA_PROTINFO because the sender is not the driver
which created the device and doesn't use AF_UNSPEC. For bridge
device configuration it would certainly be nice to have, but I'm
not sure yet how to handle enslave operations. So far my favourite
idea is to add enslave/release operations to rtnl_link_ops and call
them when IFLA_MASTER is set (so the netlink message would look like
this: ifindex: eth0 master: br0 nlmsg_type: RTM_NETLINK). But I
haven't really thought this through yet.

I would also like to add support for handling "secondary device state"
like bridge port state and others that currently use IFLA_PROTINFO
(a lot of the code is very similar to the generic code), but all ideas
so far turned out not to work very well.

I'm leaving for a short vacation until Sunday tommorrow, so replies
may be delayed :)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (11 preceding siblings ...)
  2007-06-05 22:00 ` jamal
@ 2007-06-06  0:40 ` Eric W. Biederman
  2007-06-06 11:50   ` Patrick McHardy
  2007-06-06 13:46 ` Patrick McHardy
  13 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-06  0:40 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, hadi, xemul, tgraf


Reading through the patches they look usable to me.

Having to patch iproute to create the more interesting network
devices sucks, but that problem seems fundamental.  We might
be able to avoid it if we allowed fields to be reused between
different types of devices but that makes the error checking
trickier, and we aren't likely to have that many types of
devices so there likely isn't much value in generalizing.

I do think we should specify the IFLA_KIND (was: IFLA_NAME) values in
a header file.  So it is easy to get a list of all of the different
kinds and so we don't conflict.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 23:16       ` Stephen Hemminger
@ 2007-06-06 11:42         ` Patrick McHardy
  0 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 11:42 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, socketcan, hadi, xemul, ebiederm, tgraf

Stephen Hemminger wrote:
> On Wed, 06 Jun 2007 01:17:11 +0200
> Patrick McHardy <kaber@trash.net> wrote:
> 
>>>If you want I'll extend existing bridge netlink to use these.
>>
>>
>>Are you talking about brige-port information or bridge device
>>configuration? So far the API is not suitable for anything that
>>currently uses IFLA_PROTINFO because the sender is not the driver
>>which created the device and doesn't use AF_UNSPEC. For bridge
>>device configuration it would certainly be nice to have, but I'm
>>not sure yet how to handle enslave operations. So far my favourite
>>idea is to add enslave/release operations to rtnl_link_ops and call
>>them when IFLA_MASTER is set (so the netlink message would look like
>>this: ifindex: eth0 master: br0 nlmsg_type: RTM_NETLINK). But I
>>haven't really thought this through yet.
> 
> 
> Was thinking AF_BRIDGE (we have it already so use it), and both
> add/remove bridge, and enslave/unslave device.


I think we should use two seperate families for bridge devices
and bridge ports. I'll think about the enslave operation some
more and try to add it next week.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06  0:40 ` Eric W. Biederman
@ 2007-06-06 11:50   ` Patrick McHardy
  2007-06-06 15:25     ` Eric W. Biederman
  0 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 11:50 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, socketcan, hadi, xemul, tgraf

Eric W. Biederman wrote:
> Reading through the patches they look usable to me.
> 
> Having to patch iproute to create the more interesting network
> devices sucks, but that problem seems fundamental.  We might
> be able to avoid it if we allowed fields to be reused between
> different types of devices but that makes the error checking
> trickier, and we aren't likely to have that many types of
> devices so there likely isn't much value in generalizing.


You don't really need to patch it, installing a new iplink_XXX.so
file is enough. Generalizing driver specific options more than
what we currently have doesn't look very promising. I think your
driver was simple enough to get along with the generic device
attributes though (IFLA_LINK or IFLA_MASTER).

> I do think we should specify the IFLA_KIND (was: IFLA_NAME) values in
> a header file.  So it is easy to get a list of all of the different
> kinds and so we don't conflict.


I don't think conflicts are going to be a problem (it would be
nice if modpost would warn about duplicate aliases though).
How is listing IFLA_KIND types in a header file going to help
get a list? Presuming the user knows what kind of device he
wants to set up and is not just looking for things to play
around with I also don't see any real value in this.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
                   ` (12 preceding siblings ...)
  2007-06-06  0:40 ` Eric W. Biederman
@ 2007-06-06 13:46 ` Patrick McHardy
  13 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 13:46 UTC (permalink / raw)
  To: netdev; +Cc: socketcan, hadi, xemul, ebiederm, tgraf

Patrick McHardy wrote:
> The following patches contain the rtnetlink link creation API I promised,
> as well as two simple driver conversion to use the API as an example.
> I've also converted VLAN as a more complex example, but these patches
> need some more work and are most likely not interesting to all the CCed
> parties, so I'm sending them seperately.


I've updated the patches to remove the broken VLAN ID change, added back
some consts, renamed IFLA_INFO_NAME to IFLA_INFO_KIND and rebased to
current net-2.6. The current patches and -git trees can be found at
http://people.netfilter.org/kaber/rtnl_link/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 11:50   ` Patrick McHardy
@ 2007-06-06 15:25     ` Eric W. Biederman
  2007-06-06 15:35       ` Patrick McHardy
  0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-06 15:25 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, hadi, xemul, tgraf

Patrick McHardy <kaber@trash.net> writes:

> Eric W. Biederman wrote:
>> Reading through the patches they look usable to me.
>> 
>> Having to patch iproute to create the more interesting network
>> devices sucks, but that problem seems fundamental.  We might
>> be able to avoid it if we allowed fields to be reused between
>> different types of devices but that makes the error checking
>> trickier, and we aren't likely to have that many types of
>> devices so there likely isn't much value in generalizing.
>
>
> You don't really need to patch it, installing a new iplink_XXX.so
> file is enough. Generalizing driver specific options more than
> what we currently have doesn't look very promising. I think your
> driver was simple enough to get along with the generic device
> attributes though (IFLA_LINK or IFLA_MASTER).

I need to know the other device in the pair of devices I am creating.
If ifindex was selectable IFLA_LINK or IFLA_MASTER might be
interesting however they are currently are not, and I'm not quite
certain about letting a user select the ifindex.  Although there may
come a point when dealing with migration when it makes sense.

Hmm.  I guess if I had a reasonable default I could find out the
pair device by looking at the returned NEW_LINK message.

Looking more close.  IFLA_MASTER does not work because I don't
have a master/slave relationship.

IFLA_LINK looks like it will work.  I don't precisely match the
semantics though which makes me nervous.  In particular my other
device is not something I am sending through but what I am sending
to.  The way the IPv6 code uses iflink to get the link local address
starting with the hardware address of the iflink would be completely
the wrong thing to do in my case.  Now my device won't have the magic
IPv6 tunnel arp type so that code won't trigger.  Still it is
a challenge.

I still think adding a IFLA_PARTNER or a custom attribute is cleaner
in this case.  Slight semantic mismatches are the worst design bugs
to correct.

To some extent this is a practical problematic point for me, because
in the context of multiple network namespaces I could theoretically
have both network devices have the same name and the same ifindex in
different network namespaces.  Although it really doesn't matter
unless they are in the same network namespace in which case they
can't have the same ifindex or ifname.

>> I do think we should specify the IFLA_KIND (was: IFLA_NAME) values in
>> a header file.  So it is easy to get a list of all of the different
>> kinds and so we don't conflict.
>
>
> I don't think conflicts are going to be a problem (it would be
> nice if modpost would warn about duplicate aliases though).
> How is listing IFLA_KIND types in a header file going to help
> get a list? Presuming the user knows what kind of device he
> wants to set up and is not just looking for things to play
> around with I also don't see any real value in this.

This isn't about the user this is about maintaining the ABI.

We have to control set of strings for IFLA_KIND.  Having them all
in a single header file means that we can easily look when we add
support for a new kind to see if some other driver has already
used that kind.

The same reason we stick the rest of the enumerations into a header
file.

Strings don't conflict as easily as small integers do, but it is still
possible to have a conflict, so having something like an ifla_kind.h to
hold them would be useful.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 15:25     ` Eric W. Biederman
@ 2007-06-06 15:35       ` Patrick McHardy
  2007-06-06 15:56         ` Alexey Kuznetsov
  2007-06-06 16:13         ` Eric W. Biederman
  0 siblings, 2 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 15:35 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, socketcan, hadi, xemul, tgraf

Eric W. Biederman wrote:
> Patrick McHardy <kaber@trash.net> writes:
> 
>>You don't really need to patch it, installing a new iplink_XXX.so
>>file is enough. Generalizing driver specific options more than
>>what we currently have doesn't look very promising. I think your
>>driver was simple enough to get along with the generic device
>>attributes though (IFLA_LINK or IFLA_MASTER).
> 
> 
> I need to know the other device in the pair of devices I am creating.
> If ifindex was selectable IFLA_LINK or IFLA_MASTER might be
> interesting however they are currently are not, and I'm not quite
> certain about letting a user select the ifindex.  Although there may
> come a point when dealing with migration when it makes sense.


It shouldn't be very hard to implement, so far I just didn't see
any use for it.

> Hmm.  I guess if I had a reasonable default I could find out the
> pair device by looking at the returned NEW_LINK message.
> 
> Looking more close.  IFLA_MASTER does not work because I don't
> have a master/slave relationship.
> 
> IFLA_LINK looks like it will work.  I don't precisely match the
> semantics though which makes me nervous.  In particular my other
> device is not something I am sending through but what I am sending
> to.  The way the IPv6 code uses iflink to get the link local address
> starting with the hardware address of the iflink would be completely
> the wrong thing to do in my case.  Now my device won't have the magic
> IPv6 tunnel arp type so that code won't trigger.  Still it is
> a challenge.
> 
> I still think adding a IFLA_PARTNER or a custom attribute is cleaner
> in this case.  Slight semantic mismatches are the worst design bugs
> to correct.


Indeed, IFLA_PARTNER sounds like a better idea. I just suggested to
Pavel to create only a single device per newlink operation and binding
them later, what do you think about that?

>>>I do think we should specify the IFLA_KIND (was: IFLA_NAME) values in
>>>a header file.  So it is easy to get a list of all of the different
>>>kinds and so we don't conflict.
>>
>>
>>I don't think conflicts are going to be a problem (it would be
>>nice if modpost would warn about duplicate aliases though).
>>How is listing IFLA_KIND types in a header file going to help
>>get a list? Presuming the user knows what kind of device he
>>wants to set up and is not just looking for things to play
>>around with I also don't see any real value in this.
> 
> 
> This isn't about the user this is about maintaining the ABI.
> 
> We have to control set of strings for IFLA_KIND.  Having them all
> in a single header file means that we can easily look when we add
> support for a new kind to see if some other driver has already
> used that kind.
> 
> The same reason we stick the rest of the enumerations into a header
> file.
> 
> Strings don't conflict as easily as small integers do, but it is still
> possible to have a conflict, so having something like an ifla_kind.h to
> hold them would be useful.


Mhh .. we have multiple string based APIs that do just fine. I'd
prefer having someone adding a new driver do a quick grep for
MODULE_ALIAS_RTNL_LINK to adding unused definitions.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 15:35       ` Patrick McHardy
@ 2007-06-06 15:56         ` Alexey Kuznetsov
  2007-06-06 16:32           ` Patrick McHardy
  2007-06-06 16:13         ` Eric W. Biederman
  1 sibling, 1 reply; 43+ messages in thread
From: Alexey Kuznetsov @ 2007-06-06 15:56 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric W. Biederman, netdev, socketcan, hadi, xemul, tgraf

Hello!

> 						 I just suggested to
> Pavel to create only a single device per newlink operation and binding
> them later,

I see some logical inconsistency here.

Look, the second end is supposed to be in another namespace.
It will have identity, which cannot be expressed in any way in namespace,
which is allowed to create the pair: name, ifindex - nothing
is shared between namespaces.

Moreover, do not forget we have two netlink spaces as well.
Events happening in one namespace are reported only inside that namespace.

Actually, the only self-consistent solution, which I see right now
(sorry, did not think that much) is to create the whole pair as
one operation; required parameters (name of partner, identity of namespace)
can be passed as attributes. I guess IFLA_PARTNER approach suggests
the same thing, right?

As response to this action two replies are generated: one RTM_NEWLINK
for one end of device with the whole desciption of partnet
is broadcasted inside this namespace, another RTM_NETLINK with index/name
of partner device is broadcasted inside the second namespace
(and, probably, some attributes, which must be hidden inside namespace,
f.e. identity of main device is suppressed). 

Alexey



^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 15:35       ` Patrick McHardy
  2007-06-06 15:56         ` Alexey Kuznetsov
@ 2007-06-06 16:13         ` Eric W. Biederman
  2007-06-06 16:25           ` Patrick McHardy
  1 sibling, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-06 16:13 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, hadi, xemul, tgraf

Patrick McHardy <kaber@trash.net> writes:

>> I still think adding a IFLA_PARTNER or a custom attribute is cleaner
>> in this case.  Slight semantic mismatches are the worst design bugs
>> to correct.
>
>
> Indeed, IFLA_PARTNER sounds like a better idea. I just suggested to
> Pavel to create only a single device per newlink operation and binding
> them later, what do you think about that?

I don't think it solves much because we still need a way to report the
partner device.

On the actual using side I think it makes the core of the driver much
more difficult to get right.  

Basically if we can't count on having a partner device we have to
add NULL pointer checks and locking to the packet dispatch which
is otherwise unnecessary.

Eric


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 16:13         ` Eric W. Biederman
@ 2007-06-06 16:25           ` Patrick McHardy
  0 siblings, 0 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 16:25 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, socketcan, hadi, xemul, tgraf

Eric W. Biederman wrote:
> Patrick McHardy <kaber@trash.net> writes:
> 
> 
>>>I still think adding a IFLA_PARTNER or a custom attribute is cleaner
>>>in this case.  Slight semantic mismatches are the worst design bugs
>>>to correct.
>>
>>
>>Indeed, IFLA_PARTNER sounds like a better idea. I just suggested to
>>Pavel to create only a single device per newlink operation and binding
>>them later, what do you think about that?
> 
> 
> I don't think it solves much because we still need a way to report the
> partner device.

I was thinking of something like this:

ip link add veth0 type veth
ip link add veth1 partner veth0 type veth

ip would resolve veth0 to an ifindex and set IFLA_PARTNER. But Alexey
just raised a few good points, so this might not work.

> On the actual using side I think it makes the core of the driver much
> more difficult to get right.  
> 
> Basically if we can't count on having a partner device we have to
> add NULL pointer checks and locking to the packet dispatch which
> is otherwise unnecessary.


All you'd need to do is keep the queue stopped until the device
is bound. No changes to rx or tx path neccessary.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 15:56         ` Alexey Kuznetsov
@ 2007-06-06 16:32           ` Patrick McHardy
  2007-06-06 17:31             ` Eric W. Biederman
  2007-06-06 19:14             ` Alexey Kuznetsov
  0 siblings, 2 replies; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 16:32 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Eric W. Biederman, netdev, socketcan, hadi, xemul, tgraf

Alexey Kuznetsov wrote:
>>						 I just suggested to
>>Pavel to create only a single device per newlink operation and binding
>>them later,
> 
> 
> I see some logical inconsistency here.
> 
> Look, the second end is supposed to be in another namespace.
> It will have identity, which cannot be expressed in any way in namespace,
> which is allowed to create the pair: name, ifindex - nothing
> is shared between namespaces.


Good point, I didn't think of that. Is there a version of this patch
that already uses different namespaces so I can look at it?

Are network namespace completely seperated or is there some hierarchy
with all lower namespaces visible above or something like that?

> Moreover, do not forget we have two netlink spaces as well.
> Events happening in one namespace are reported only inside that namespace.
> 
> Actually, the only self-consistent solution, which I see right now
> (sorry, did not think that much) is to create the whole pair as
> one operation; required parameters (name of partner, identity of namespace)
> can be passed as attributes. I guess IFLA_PARTNER approach suggests
> the same thing, right?


I imagined it more as a bind operation, pretty similar to enslave, so
it would only contain an ifindex, no parameters. But as you say that
doesn't work, so I guess we'd have to nest an entire ifinfomsg + the
attributes for the partner device under it .. not exactly pretty.

> As response to this action two replies are generated: one RTM_NEWLINK
> for one end of device with the whole desciption of partnet
> is broadcasted inside this namespace, another RTM_NETLINK with index/name
> of partner device is broadcasted inside the second namespace
> (and, probably, some attributes, which must be hidden inside namespace,
> f.e. identity of main device is suppressed). 


The identity of the main device has no meaning within a different
namespace, but are there other reasons for hiding it?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-05 14:12 ` [RFC RTNETLINK 04/09]: Link creation API Patrick McHardy
  2007-06-05 19:43   ` David Miller
  2007-06-05 22:03   ` Stephen Hemminger
@ 2007-06-06 16:37   ` Eric W. Biederman
  2007-06-06 16:50     ` Patrick McHardy
  2 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-06 16:37 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, hadi, xemul, tgraf


> +static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
> +{
> +	struct rtnl_link_ops *ops;
> +	struct net_device *dev;
> +	struct ifinfomsg *ifm;
> +	char name[MODULE_NAME_LEN];
> +	char ifname[IFNAMSIZ];
> +	struct nlattr *tb[IFLA_MAX+1];
> +	struct nlattr *linkinfo[IFLA_INFO_MAX+1];
> +	int err;
> +
> +	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
> +	if (err < 0)
> +		return err;
> +
> +	if (tb[IFLA_IFNAME])
> +		nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
> +	else
> +		ifname[0] = '\0';
> +
> +	ifm = nlmsg_data(nlh);
> +	if (ifm->ifi_index > 0)
> +		dev = __dev_get_by_index(ifm->ifi_index);
> +	else if (ifname[0])
> +		dev = __dev_get_by_name(ifname);
> +	else
> +		dev = NULL;
> +
> +	if (tb[IFLA_LINKINFO]) {
> +		err = nla_parse_nested(linkinfo, IFLA_INFO_MAX,
> +				       tb[IFLA_LINKINFO], ifla_info_policy);
> +		if (err < 0)
> +			return err;
> +	} else
> +		memset(linkinfo, 0, sizeof(linkinfo));
> +
> +	if (linkinfo[IFLA_INFO_NAME]) {
> +		nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name));
> +		ops = rtnl_link_ops_get(name);

Ugh.  Shouldn't we have the request_module logic here?
Otherwise it looks like we can skip the validate method and 
have other weird interactions.


> +	} else {
> +		name[0] = '\0';
> +		ops = NULL;
> +	}
> +
> +	if (1) {
> +		struct nlattr *attr[ops ? ops->maxtype + 1 : 0], **data = NULL;
> +
> +		if (ops) {
> +			if (ops->maxtype && linkinfo[IFLA_INFO_DATA]) {
> +				err = nla_parse_nested(attr, ops->maxtype,
> + linkinfo[IFLA_INFO_DATA],
> +						       ops->policy);
> +				if (err < 0)
> +					return err;
> +				data = attr;
> +			}
> +			if (ops->validate) {
> +				err = ops->validate(tb, data);
> +				if (err < 0)
> +					return err;
> +			}
> +		}
> +
> +		if (dev) {
> +			int modified = 0;
> +
> +			if (nlh->nlmsg_flags & NLM_F_EXCL)
> +				return -EEXIST;
> +			if (nlh->nlmsg_flags & NLM_F_REPLACE)
> +				return -EOPNOTSUPP;
> +
> +			if (linkinfo[IFLA_INFO_DATA]) {
> +				if (!ops || ops != dev->rtnl_link_ops ||
> +				    !ops->changelink)
> +					return -EOPNOTSUPP;
> +
> +				err = ops->changelink(dev, tb, data);
> +				if (err < 0)
> +					return err;
> +				modified = 1;
> +			}
> +
> +			return do_setlink(dev, ifm, tb, ifname, modified);
> +		}
> +
> +		if (!(nlh->nlmsg_flags & NLM_F_CREATE))
> +			return -ENODEV;
> +
> +		if (ifm->ifi_index)
> +			return -EINVAL;
> +		if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP])
> +			return -EOPNOTSUPP;
> +
> +#ifdef CONFIG_KMOD
> +		if (!ops && name[0]) {
> +			/* race condition: device may be created while rtnl is
> +			 * unlocked, final register_netdevice will catch it.
> +			 */
> +			__rtnl_unlock();
> +			request_module("rtnl-link-%s", name);
> +			rtnl_lock();
> +			ops = rtnl_link_ops_get(name);
> +		}
> +#endif
> +		if (!ops)
> +			return -EOPNOTSUPP;

> +
> +		if (!ifname[0])
> +			snprintf(ifname, IFNAMSIZ, "%s%%d", ops->name);
> +		dev = alloc_netdev(ops->priv_size, ifname, ops->setup);
> +		if (!dev)
> +			return -ENOMEM;
> +
> +		if (strchr(dev->name, '%')) {
> +			err = dev_alloc_name(dev, dev->name);
> +			if (err < 0)
> +				goto err_free;
> +		}
> +		dev->rtnl_link_ops = ops;
> +
> +		if (tb[IFLA_MTU])
> +			dev->mtu = nla_get_u32(tb[IFLA_MTU]);
> +		if (tb[IFLA_TXQLEN])
> +			dev->tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]);
> +		if (tb[IFLA_WEIGHT])
> +			dev->weight = nla_get_u32(tb[IFLA_WEIGHT]);
> +		if (tb[IFLA_OPERSTATE])
> +			set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));
> +		if (tb[IFLA_LINKMODE])
> +			dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]);
> +
> +		err = ops->newlink(dev, tb, data);
> +err_free:
> +		if (err < 0)
> +			free_netdev(dev);
> +		return err;
> +	}
> +}
> +

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-06 16:37   ` Eric W. Biederman
@ 2007-06-06 16:50     ` Patrick McHardy
  2007-06-06 18:02       ` Eric W. Biederman
  0 siblings, 1 reply; 43+ messages in thread
From: Patrick McHardy @ 2007-06-06 16:50 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: netdev, socketcan, hadi, xemul, tgraf

[-- Attachment #1: Type: text/plain, Size: 499 bytes --]

Eric W. Biederman wrote:
>>+	if (linkinfo[IFLA_INFO_NAME]) {
>>+		nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name));
>>+		ops = rtnl_link_ops_get(name);
> 
> 
> Ugh.  Shouldn't we have the request_module logic here?
> Otherwise it looks like we can skip the validate method and 
> have other weird interactions.


Good catch. The easiest solution seems be to simply replay the
request after successful module load, which also avoids the
device lookup race.

Something like this (untested).


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1198 bytes --]

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 8d2f817..f2868b0 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -930,6 +930,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 	struct nlattr *linkinfo[IFLA_INFO_MAX+1];
 	int err;
 
+replay:
 	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
 	if (err < 0)
 		return err;
@@ -1012,19 +1013,19 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg)
 		if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP])
 			return -EOPNOTSUPP;
 
+		if (!ops) {
 #ifdef CONFIG_KMOD
-		if (!ops && kind[0]) {
-			/* race condition: device may be created while rtnl is
-			 * unlocked, final register_netdevice will catch it.
-			 */
-			__rtnl_unlock();
-			request_module("rtnl-link-%s", kind);
-			rtnl_lock();
-			ops = rtnl_link_ops_get(kind);
-		}
+			if (kind[0]) {
+				__rtnl_unlock();
+				request_module("rtnl-link-%s", kind);
+				rtnl_lock();
+				ops = rtnl_link_ops_get(kind);
+				if (ops)
+					goto replay;
+			}
 #endif
-		if (!ops)
 			return -EOPNOTSUPP;
+		}
 
 		if (!ifname[0])
 			snprintf(ifname, IFNAMSIZ, "%s%%d", ops->kind);

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 16:32           ` Patrick McHardy
@ 2007-06-06 17:31             ` Eric W. Biederman
  2007-06-06 19:14             ` Alexey Kuznetsov
  1 sibling, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-06 17:31 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Alexey Kuznetsov, netdev, socketcan, hadi, xemul, tgraf

Patrick McHardy <kaber@trash.net> writes:

> Alexey Kuznetsov wrote:
>>>						 I just suggested to
>>>Pavel to create only a single device per newlink operation and binding
>>>them later,
>> 
>> 
>> I see some logical inconsistency here.
>> 
>> Look, the second end is supposed to be in another namespace.
>> It will have identity, which cannot be expressed in any way in namespace,
>> which is allowed to create the pair: name, ifindex - nothing
>> is shared between namespaces.
>
>
> Good point, I didn't think of that. Is there a version of this patch
> that already uses different namespaces so I can look at it?

We have posted patches a couple of times, and veth or etun were always
a part of it.  But except for some book keeping details we really don't
care.

> Are network namespace completely seperated or is there some hierarchy
> with all lower namespaces visible above or something like that?

Completely separated.  The goal is to look like two separate machines
to user space, with respect to the network stack.

There is a bit of a hierarchy usage wise.  Because frequently only
one namespace will have real hardware devices in it.  So everything
needs to route through there.  But that detail is a usage detail
and is easiest not to reflect in the actual implementation.

> I imagined it more as a bind operation, pretty similar to enslave, so
> it would only contain an ifindex, no parameters. But as you say that
> doesn't work, so I guess we'd have to nest an entire ifinfomsg + the
> attributes for the partner device under it .. not exactly pretty.

In the model I'm working in, is that there is a separate operation:
"move device to other namespace", which should work for any network device.

So there should be an interval immediately after device creation
when both devices are in the same namespace, and then one of the
pair is moved to another namespace.

>> As response to this action two replies are generated: one RTM_NEWLINK
>> for one end of device with the whole desciption of partnet
>> is broadcasted inside this namespace, another RTM_NETLINK with index/name
>> of partner device is broadcasted inside the second namespace
>> (and, probably, some attributes, which must be hidden inside namespace,
>> f.e. identity of main device is suppressed). 
>
>
> The identity of the main device has no meaning within a different
> namespace, but are there other reasons for hiding it?

Not really.  We can already recognized the type of the device.

Eric


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 04/09]: Link creation API
  2007-06-06 16:50     ` Patrick McHardy
@ 2007-06-06 18:02       ` Eric W. Biederman
  0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-06 18:02 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netdev, socketcan, hadi, xemul, tgraf

Patrick McHardy <kaber@trash.net> writes:

> Eric W. Biederman wrote:
>>>+	if (linkinfo[IFLA_INFO_NAME]) {
>>>+		nla_strlcpy(name, linkinfo[IFLA_INFO_NAME], sizeof(name));
>>>+		ops = rtnl_link_ops_get(name);
>> 
>> 
>> Ugh.  Shouldn't we have the request_module logic here?
>> Otherwise it looks like we can skip the validate method and 
>> have other weird interactions.
>
>
> Good catch. The easiest solution seems be to simply replay the
> request after successful module load, which also avoids the
> device lookup race.

Looks reasonable to me.   There might be a variable or two that
we need to make certain is initialized but at a quick glance it
looks ok.

Eric


> Something like this (untested).
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 8d2f817..f2868b0 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -930,6 +930,7 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr
> *nlh, void *arg)
>  	struct nlattr *linkinfo[IFLA_INFO_MAX+1];
>  	int err;
>  
> +replay:
>  	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
>  	if (err < 0)
>  		return err;
> @@ -1012,19 +1013,19 @@ static int rtnl_newlink(struct sk_buff *skb, struct
> nlmsghdr *nlh, void *arg)
>  		if (tb[IFLA_ADDRESS] || tb[IFLA_BROADCAST] || tb[IFLA_MAP])
>  			return -EOPNOTSUPP;
>  
> +		if (!ops) {
>  #ifdef CONFIG_KMOD
> -		if (!ops && kind[0]) {
> -			/* race condition: device may be created while rtnl is
> -			 * unlocked, final register_netdevice will catch it.
> -			 */
> -			__rtnl_unlock();
> -			request_module("rtnl-link-%s", kind);
> -			rtnl_lock();
> -			ops = rtnl_link_ops_get(kind);
> -		}
> +			if (kind[0]) {
> +				__rtnl_unlock();
> +				request_module("rtnl-link-%s", kind);
> +				rtnl_lock();
> +				ops = rtnl_link_ops_get(kind);
> +				if (ops)
> +					goto replay;
> +			}
>  #endif
> -		if (!ops)
>  			return -EOPNOTSUPP;
> +		}
>  
>  		if (!ifname[0])
>  			snprintf(ifname, IFNAMSIZ, "%s%%d", ops->kind);

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 16:32           ` Patrick McHardy
  2007-06-06 17:31             ` Eric W. Biederman
@ 2007-06-06 19:14             ` Alexey Kuznetsov
  2007-06-07  8:06               ` Eric W. Biederman
  1 sibling, 1 reply; 43+ messages in thread
From: Alexey Kuznetsov @ 2007-06-06 19:14 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Eric W. Biederman, netdev, socketcan, hadi, xemul, tgraf

Hello!

> Good point, I didn't think of that. Is there a version of this patch
> that already uses different namespaces so I can look at it?

Pavel does not like the idea. It looks "not exactly pretty", like you said. :-)

The alternative is to create pair in main namespace and then move
one end to another namespace renaming it and changing index.
Why I do not like it? Because this makes RTM_NEWLINK just useless step,
all its work is undone and real work is remade when the device moves,
with all the unrettiness moved to another place.

>From another hand, some move operation is required in any case.
Right now in openvz the problem is solved in tricky, but quite
inerseting way: all the devices in main namespace are assigned
with odd index, child devices get odd index. So that, when a device
moves from main namespace to child, openvz does not need to change
ifindex, conflict is impossible. Well, it is working approach.
But it is not pretty either.


> Are network namespace completely seperated or is there some hierarchy
> with all lower namespaces visible above or something like that?

Right now they are completely separate.

It is possible to make child devices visible in parent namespace
like it is done for process pids: i.e. there is an abstract identity
which is seen under different names and indices in different namespaces.
Sounds cool, but this add a lot of complexity, which has no meaning
outside of context of device creation, I do not think it is worth to do.




> The identity of the main device has no meaning within a different
> namespace, but are there other reasons for hiding it?

Perhaps, security. It is not a good idea to leak information
about parent namespace to child namespace.

Also, people will want to see emulated ethernet inside namespace
looking exactly like ethernet. No freaking additional attributes.

Alexey

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC RTNETLINK 00/09]: Netlink link creation API
  2007-06-06 19:14             ` Alexey Kuznetsov
@ 2007-06-07  8:06               ` Eric W. Biederman
  0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2007-06-07  8:06 UTC (permalink / raw)
  To: Alexey Kuznetsov; +Cc: Patrick McHardy, netdev, socketcan, hadi, xemul, tgraf

Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> writes:

> Hello!
>
>> Good point, I didn't think of that. Is there a version of this patch
>> that already uses different namespaces so I can look at it?
>
> Pavel does not like the idea. It looks "not exactly pretty", like you said. :-)
>
> The alternative is to create pair in main namespace and then move
> one end to another namespace renaming it and changing index.
> Why I do not like it? Because this makes RTM_NEWLINK just useless step,
> all its work is undone and real work is remade when the device moves,
> with all the unrettiness moved to another place.

- A move network device between namespaces operation is necessary.
- If we limit these devices to just communication between namespaces
  we severely limit their utility.  In particular there are know applications
  now that do not need this.
- Further I believe by using RTM_NEWLINK the ethernet tunnel driver
  will never need to have any code that knows about namespaces,
  all that is needed is for RTM_NEWLINK to have an appropriate default
  network namespace, (the network namespace of the netlink socket).

>>From another hand, some move operation is required in any case.
> Right now in openvz the problem is solved in tricky, but quite
> inerseting way: all the devices in main namespace are assigned
> with odd index, child devices get odd index. So that, when a device
> moves from main namespace to child, openvz does not need to change
> ifindex, conflict is impossible. Well, it is working approach.
> But it is not pretty either.

We can solve the ifindex change even more simply by simply using
a global ifindex sequence number for now.  In the context of migration
that is likely to prove insufficient for virtual devices but for now
it is simple, it already exists and it is good enough.

>> Are network namespace completely seperated or is there some hierarchy
>> with all lower namespaces visible above or something like that?
>
> Right now they are completely separate.
>
> It is possible to make child devices visible in parent namespace
> like it is done for process pids: i.e. there is an abstract identity
> which is seen under different names and indices in different namespaces.
> Sounds cool, but this add a lot of complexity, which has no meaning
> outside of context of device creation, I do not think it is worth to do.

I completely agree.  There is no advantage and a considerable
disadvantage in having network namespaces being other then completely
separate.

>> The identity of the main device has no meaning within a different
>> namespace, but are there other reasons for hiding it?
>
> Perhaps, security. It is not a good idea to leak information
> about parent namespace to child namespace.
>
> Also, people will want to see emulated ethernet inside namespace
> looking exactly like ethernet. No freaking additional attributes.

As long as we keep ourselves within the usual variation of ethernet
network devices we should be fine.  For someone who wants to know
we can't hide the fact we are a particular kind of ethernet device.

Eric


^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2007-06-07  8:09 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-06-05 14:12 [RFC RTNETLINK 00/09]: Netlink link creation API Patrick McHardy
2007-06-05 14:12 ` [RFC NETLINK 01/09]: Mark netlink policies const Patrick McHardy
2007-06-05 19:39   ` David Miller
2007-06-05 14:12 ` [RFC RTNETLINK 02/09]: ifindex 0 does not exist Patrick McHardy
2007-06-05 19:40   ` David Miller
2007-06-05 14:12 ` [RFC RTNETLINK 03/09]: Split up rtnl_setlink Patrick McHardy
2007-06-05 19:41   ` David Miller
2007-06-05 14:12 ` [RFC RTNETLINK 04/09]: Link creation API Patrick McHardy
2007-06-05 19:43   ` David Miller
2007-06-05 21:09     ` Patrick McHardy
2007-06-05 22:03   ` Stephen Hemminger
2007-06-05 23:17     ` Patrick McHardy
2007-06-05 23:16       ` Stephen Hemminger
2007-06-06 11:42         ` Patrick McHardy
2007-06-06 16:37   ` Eric W. Biederman
2007-06-06 16:50     ` Patrick McHardy
2007-06-06 18:02       ` Eric W. Biederman
2007-06-05 14:12 ` [RFC DUMMY 05/09]: Use dev->stats Patrick McHardy
2007-06-05 19:44   ` David Miller
2007-06-05 14:13 ` [RFC DUMMY 06/09]: Keep dummy devices on list Patrick McHardy
2007-06-05 19:44   ` David Miller
2007-06-05 14:13 ` [RFC DUMMY 07/09]: Use rtnl_link API Patrick McHardy
2007-06-05 19:46   ` David Miller
2007-06-05 14:13 ` [RFC IFB 08/09]: Keep ifb devices on list Patrick McHardy
2007-06-05 14:13 ` [RFC IFB 09/09]: Use rtnl_link API Patrick McHardy
2007-06-05 14:18 ` [RFC IPROUTE]: iplink: use netlink for link configuration Patrick McHardy
2007-06-05 19:37 ` [RFC RTNETLINK 00/09]: Netlink link creation API David Miller
2007-06-05 21:10   ` Patrick McHardy
2007-06-05 22:00 ` jamal
2007-06-05 22:07   ` Patrick McHardy
2007-06-05 22:29     ` jamal
2007-06-06  0:40 ` Eric W. Biederman
2007-06-06 11:50   ` Patrick McHardy
2007-06-06 15:25     ` Eric W. Biederman
2007-06-06 15:35       ` Patrick McHardy
2007-06-06 15:56         ` Alexey Kuznetsov
2007-06-06 16:32           ` Patrick McHardy
2007-06-06 17:31             ` Eric W. Biederman
2007-06-06 19:14             ` Alexey Kuznetsov
2007-06-07  8:06               ` Eric W. Biederman
2007-06-06 16:13         ` Eric W. Biederman
2007-06-06 16:25           ` Patrick McHardy
2007-06-06 13:46 ` Patrick McHardy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).