* [PATCH 1/1] [TIPC]: Fixed erroneous introduction of for_each_netdev
[not found] <1>
@ 2007-05-16 0:21 ` Jon Paul Maloy
2007-05-16 0:25 ` David Miller
2007-05-23 22:12 ` David Miller
2007-05-18 0:33 ` [PATCH 1/3] [TIPC]: Improved support for Ethernet traffic filtering Jon Paul Maloy
` (4 subsequent siblings)
5 siblings, 2 replies; 24+ messages in thread
From: Jon Paul Maloy @ 2007-05-16 0:21 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Jon Paul Maloy
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
---
net/tipc/eth_media.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
index 0ee6ded..c73c206 100644
--- a/net/tipc/eth_media.c
+++ b/net/tipc/eth_media.c
@@ -120,18 +120,20 @@ static int recv_msg(struct sk_buff *buf, struct net_device *dev,
static int enable_bearer(struct tipc_bearer *tb_ptr)
{
- struct net_device *dev, *pdev;
+ struct net_device *dev = NULL;
+ struct net_device *pdev = NULL;
struct eth_bearer *eb_ptr = &eth_bearers[0];
struct eth_bearer *stop = &eth_bearers[MAX_ETH_BEARERS];
char *driver_name = strchr((const char *)tb_ptr->name, ':') + 1;
/* Find device with specified name */
- dev = NULL;
- for_each_netdev(pdev)
- if (!strncmp(dev->name, driver_name, IFNAMSIZ)) {
+
+ for_each_netdev(pdev){
+ if (!strncmp(pdev->name, driver_name, IFNAMSIZ)) {
dev = pdev;
break;
}
+ }
if (!dev)
return -ENODEV;
--
1.5.0.5
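For readers following the fix: the old loop compared against dev->name while dev was still NULL, instead of using the loop cursor pdev. A minimal userspace sketch of the corrected pattern follows. The struct fake_netdev list, IFNAMSIZ_DEMO, and find_dev_by_name() are hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ_DEMO 16 /* assumed to mirror the kernel's IFNAMSIZ */

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
	char name[IFNAMSIZ_DEMO];
	struct fake_netdev *next;
};

/*
 * Walk the list with a cursor (pdev) and assign the result (dev) only
 * on a match, so dev is still NULL when no device has the given name.
 */
static struct fake_netdev *find_dev_by_name(struct fake_netdev *head,
					    const char *driver_name)
{
	struct fake_netdev *dev = NULL;
	struct fake_netdev *pdev;

	for (pdev = head; pdev != NULL; pdev = pdev->next) {
		if (!strncmp(pdev->name, driver_name, IFNAMSIZ_DEMO)) {
			dev = pdev;
			break;
		}
	}
	return dev;
}
```

A NULL return here corresponds to the -ENODEV path in enable_bearer().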
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 1/3] [TIPC]: Improved support for Ethernet traffic filtering
[not found] <1>
2007-05-16 0:21 ` [PATCH 1/1] [TIPC]: Fixed erroneous introduction of for_each_netdev Jon Paul Maloy
@ 2007-05-18 0:33 ` Jon Paul Maloy
2007-05-18 0:33 ` [PATCH 2/3] [TIPC]: Use standard socket "not implemented" routines Jon Paul Maloy
` (3 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: Jon Paul Maloy @ 2007-05-18 0:33 UTC (permalink / raw)
To: David Miller; +Cc: Jon Paul Maloy, netdev, tipc-discussion
This patch simplifies TIPC's Ethernet receive routine to take
advantage of information already present in each incoming sk_buff
indicating whether the packet was explicitly sent to the interface,
has been broadcast to all interfaces, or was picked up because the
interface is in promiscuous mode.
This new approach also fixes the problem of TIPC accepting unwanted
traffic through UML's multicast-based Ethernet interfaces (which
deliver traffic in a promiscuous manner even if the interface is
not configured to be promiscuous).
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
---
net/tipc/eth_media.c | 11 ++++++-----
1 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
index c73c206..19a71cf 100644
--- a/net/tipc/eth_media.c
+++ b/net/tipc/eth_media.c
@@ -1,8 +1,8 @@
/*
* net/tipc/eth_media.c: Ethernet bearer support for TIPC
*
- * Copyright (c) 2001-2006, Ericsson AB
- * Copyright (c) 2005-2006, Wind River Systems
+ * Copyright (c) 2001-2007, Ericsson AB
+ * Copyright (c) 2005-2007, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -87,6 +87,9 @@ static int send_msg(struct sk_buff *buf, struct tipc_bearer *tb_ptr,
/**
* recv_msg - handle incoming TIPC message from an Ethernet interface
*
+ * Accept only packets explicitly sent to this node, or broadcast packets;
+ * ignores packets sent using Ethernet multicast, and traffic sent to other
+ * nodes (which can happen if interface is running in promiscuous mode).
* Routine truncates any Ethernet padding/CRC appended to the message,
* and ensures message size matches actual length
*/
@@ -98,9 +101,7 @@ static int recv_msg(struct sk_buff *buf, struct net_device *dev,
u32 size;
if (likely(eb_ptr->bearer)) {
- if (likely(!dev->promiscuity) ||
- !memcmp(skb_mac_header(buf), dev->dev_addr, ETH_ALEN) ||
- !memcmp(skb_mac_header(buf), dev->broadcast, ETH_ALEN)) {
+ if (likely(buf->pkt_type <= PACKET_BROADCAST)) {
size = msg_size((struct tipc_msg *)buf->data);
skb_trim(buf, size);
if (likely(buf->len == size)) {
--
1.5.0.5
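The single comparison in the new recv_msg() test works because of how the packet type values are ordered. A small userspace sketch, assuming the numeric values from <linux/if_packet.h> (copied locally under DEMO_ names; demo_recv_accepts() is a hypothetical helper, not part of TIPC):

```c
#include <stdbool.h>

/* Local copies of the packet type values from <linux/if_packet.h>. */
enum demo_pkt_type {
	DEMO_PACKET_HOST = 0,      /* addressed to this host */
	DEMO_PACKET_BROADCAST = 1, /* link-layer broadcast */
	DEMO_PACKET_MULTICAST = 2, /* link-layer multicast */
	DEMO_PACKET_OTHERHOST = 3  /* addressed to some other host */
};

/* Mirrors the patch's "pkt_type <= PACKET_BROADCAST" test. */
static bool demo_recv_accepts(enum demo_pkt_type pkt_type)
{
	return pkt_type <= DEMO_PACKET_BROADCAST;
}
```

Only the host and broadcast types pass; multicast and other-host traffic are dropped, which matches the comment added to recv_msg().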
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/3] [TIPC]: Use standard socket "not implemented" routines
[not found] <1>
2007-05-16 0:21 ` [PATCH 1/1] [TIPC]: Fixed erroneous introduction of for_each_netdev Jon Paul Maloy
2007-05-18 0:33 ` [PATCH 1/3] [TIPC]: Improved support for Ethernet traffic filtering Jon Paul Maloy
@ 2007-05-18 0:33 ` Jon Paul Maloy
2007-05-18 0:33 ` [PATCH 3/3] [TIPC]: Optimize stream send routine to avoid fragmentation Jon Paul Maloy
` (2 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: Jon Paul Maloy @ 2007-05-18 0:33 UTC (permalink / raw)
To: David Miller; +Cc: Jon Paul Maloy, netdev, tipc-discussion
This patch modifies TIPC's socket API to utilize existing
generic routines to indicate unsupported operations, rather
than adding similar TIPC-specific routines.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
---
net/tipc/socket.c | 55 +++++++++++++---------------------------------------
1 files changed, 14 insertions(+), 41 deletions(-)
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 45832fb..ac7f2aa 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1,8 +1,8 @@
/*
* net/tipc/socket.c: TIPC socket API
*
- * Copyright (c) 2001-2006, Ericsson AB
- * Copyright (c) 2004-2006, Wind River Systems
+ * Copyright (c) 2001-2007, Ericsson AB
+ * Copyright (c) 2004-2007, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -1600,33 +1600,6 @@ static int getsockopt(struct socket *sock,
}
/**
- * Placeholders for non-implemented functionality
- *
- * Returns error code (POSIX-compliant where defined)
- */
-
-static int ioctl(struct socket *s, u32 cmd, unsigned long arg)
-{
- return -EINVAL;
-}
-
-static int no_mmap(struct file *file, struct socket *sock,
- struct vm_area_struct *vma)
-{
- return -EINVAL;
-}
-static ssize_t no_sendpage(struct socket *sock, struct page *page,
- int offset, size_t size, int flags)
-{
- return -EINVAL;
-}
-
-static int no_skpair(struct socket *s1, struct socket *s2)
-{
- return -EOPNOTSUPP;
-}
-
-/**
* Protocol switches for the various types of TIPC sockets
*/
@@ -1636,19 +1609,19 @@ static struct proto_ops msg_ops = {
.release = release,
.bind = bind,
.connect = connect,
- .socketpair = no_skpair,
+ .socketpair = sock_no_socketpair,
.accept = accept,
.getname = get_name,
.poll = poll,
- .ioctl = ioctl,
+ .ioctl = sock_no_ioctl,
.listen = listen,
.shutdown = shutdown,
.setsockopt = setsockopt,
.getsockopt = getsockopt,
.sendmsg = send_msg,
.recvmsg = recv_msg,
- .mmap = no_mmap,
- .sendpage = no_sendpage
+ .mmap = sock_no_mmap,
+ .sendpage = sock_no_sendpage
};
static struct proto_ops packet_ops = {
@@ -1657,19 +1630,19 @@ static struct proto_ops packet_ops = {
.release = release,
.bind = bind,
.connect = connect,
- .socketpair = no_skpair,
+ .socketpair = sock_no_socketpair,
.accept = accept,
.getname = get_name,
.poll = poll,
- .ioctl = ioctl,
+ .ioctl = sock_no_ioctl,
.listen = listen,
.shutdown = shutdown,
.setsockopt = setsockopt,
.getsockopt = getsockopt,
.sendmsg = send_packet,
.recvmsg = recv_msg,
- .mmap = no_mmap,
- .sendpage = no_sendpage
+ .mmap = sock_no_mmap,
+ .sendpage = sock_no_sendpage
};
static struct proto_ops stream_ops = {
@@ -1678,19 +1651,19 @@ static struct proto_ops stream_ops = {
.release = release,
.bind = bind,
.connect = connect,
- .socketpair = no_skpair,
+ .socketpair = sock_no_socketpair,
.accept = accept,
.getname = get_name,
.poll = poll,
- .ioctl = ioctl,
+ .ioctl = sock_no_ioctl,
.listen = listen,
.shutdown = shutdown,
.setsockopt = setsockopt,
.getsockopt = getsockopt,
.sendmsg = send_stream,
.recvmsg = recv_stream,
- .mmap = no_mmap,
- .sendpage = no_sendpage
+ .mmap = sock_no_mmap,
+ .sendpage = sock_no_sendpage
};
static struct net_proto_family tipc_family_ops = {
--
1.5.0.5
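The idea behind this cleanup can be sketched outside the kernel: several operation tables point at one shared "not supported" stub instead of each protocol defining its own. Everything below (struct demo_proto_ops, demo_no_socketpair(), the two tables) is hypothetical and only models the pattern; the -EOPNOTSUPP return is taken from the removed no_skpair() and is assumed to match the generic routine.

```c
#include <errno.h>
#include <stddef.h>

struct demo_sock; /* opaque; never dereferenced in this sketch */

/* Hypothetical, much reduced analogue of struct proto_ops. */
struct demo_proto_ops {
	int (*socketpair)(struct demo_sock *s1, struct demo_sock *s2);
	int (*shutdown)(struct demo_sock *s, int how);
};

/* One shared stub for an operation no table implements. */
static int demo_no_socketpair(struct demo_sock *s1, struct demo_sock *s2)
{
	(void)s1;
	(void)s2;
	return -EOPNOTSUPP;
}

/* A trivially "implemented" operation, for contrast. */
static int demo_shutdown(struct demo_sock *s, int how)
{
	(void)s;
	(void)how;
	return 0;
}

/* Both tables reuse the same stub rather than private copies. */
static const struct demo_proto_ops demo_msg_ops = {
	.socketpair = demo_no_socketpair,
	.shutdown = demo_shutdown,
};

static const struct demo_proto_ops demo_stream_ops = {
	.socketpair = demo_no_socketpair,
	.shutdown = demo_shutdown,
};
```

Note that the kernel's generic routines need not return the same codes as the TIPC-private placeholders they replace; this sketch does not model those differences.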
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/3] [TIPC]: Optimize stream send routine to avoid fragmentation
[not found] <1>
` (2 preceding siblings ...)
2007-05-18 0:33 ` [PATCH 2/3] [TIPC]: Use standard socket "not implemented" routines Jon Paul Maloy
@ 2007-05-18 0:33 ` Jon Paul Maloy
2015-08-24 11:39 ` [PATCH] IGMP: Inhibit reports for local multicast groups Philip Downey
2015-08-27 15:46 ` Philip Downey
5 siblings, 0 replies; 24+ messages in thread
From: Jon Paul Maloy @ 2007-05-18 0:33 UTC (permalink / raw)
To: David Miller; +Cc: Jon Paul Maloy, netdev, tipc-discussion
This patch enhances TIPC's stream socket send routine so that
it avoids transmitting data in chunks that require fragmentation
and reassembly, thereby improving performance at both the
sending and receiving ends of the connection.
The "maximum packet size" hint that records MTU info allows
the socket to decide how big a chunk it should send; in the
event that the hint has become stale, fragmentation may still
occur, but the data will be passed correctly and the hint will
be updated in time for the following send. Note: The 66060 byte
pseudo-MTU used for intra-node connections requires the send
routine to perform an additional check to ensure it does not
exceed TIPC's limit of 66000 bytes of user data per chunk.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com>
---
include/net/tipc/tipc_port.h | 6 ++++--
net/tipc/link.c | 16 ++++++++--------
net/tipc/port.c | 10 +++++-----
net/tipc/port.h | 6 ++----
net/tipc/socket.c | 25 +++++++++++++++++--------
5 files changed, 36 insertions(+), 27 deletions(-)
diff --git a/include/net/tipc/tipc_port.h b/include/net/tipc/tipc_port.h
index 333bba6..cfc4ba4 100644
--- a/include/net/tipc/tipc_port.h
+++ b/include/net/tipc/tipc_port.h
@@ -1,8 +1,8 @@
/*
* include/net/tipc/tipc_port.h: Include file for privileged access to TIPC ports
*
- * Copyright (c) 1994-2006, Ericsson AB
- * Copyright (c) 2005, Wind River Systems
+ * Copyright (c) 1994-2007, Ericsson AB
+ * Copyright (c) 2005-2007, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -55,6 +55,7 @@
* @conn_unacked: number of unacknowledged messages received from peer port
* @published: non-zero if port has one or more associated names
* @congested: non-zero if cannot send because of link or port congestion
+ * @max_pkt: maximum packet size "hint" used when building messages sent by port
* @ref: unique reference to port in TIPC object registry
* @phdr: preformatted message header used when sending messages
*/
@@ -68,6 +69,7 @@ struct tipc_port {
u32 conn_unacked;
int published;
u32 congested;
+ u32 max_pkt;
u32 ref;
struct tipc_msg phdr;
};
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 2124f32..5adfdfd 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1,8 +1,8 @@
/*
* net/tipc/link.c: TIPC link code
*
- * Copyright (c) 1996-2006, Ericsson AB
- * Copyright (c) 2004-2006, Wind River Systems
+ * Copyright (c) 1996-2007, Ericsson AB
+ * Copyright (c) 2004-2007, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -1260,7 +1260,7 @@ again:
* (Must not hold any locks while building message.)
*/
- res = msg_build(hdr, msg_sect, num_sect, sender->max_pkt,
+ res = msg_build(hdr, msg_sect, num_sect, sender->publ.max_pkt,
!sender->user_port, &buf);
read_lock_bh(&tipc_net_lock);
@@ -1271,7 +1271,7 @@ again:
if (likely(l_ptr)) {
if (likely(buf)) {
res = link_send_buf_fast(l_ptr, buf,
- &sender->max_pkt);
+ &sender->publ.max_pkt);
if (unlikely(res < 0))
buf_discard(buf);
exit:
@@ -1299,12 +1299,12 @@ exit:
* then re-try fast path or fragment the message
*/
- sender->max_pkt = link_max_pkt(l_ptr);
+ sender->publ.max_pkt = link_max_pkt(l_ptr);
tipc_node_unlock(node);
read_unlock_bh(&tipc_net_lock);
- if ((msg_hdr_sz(hdr) + res) <= sender->max_pkt)
+ if ((msg_hdr_sz(hdr) + res) <= sender->publ.max_pkt)
goto again;
return link_send_sections_long(sender, msg_sect,
@@ -1357,7 +1357,7 @@ static int link_send_sections_long(struct port *sender,
again:
fragm_no = 1;
- max_pkt = sender->max_pkt - INT_H_SIZE;
+ max_pkt = sender->publ.max_pkt - INT_H_SIZE;
/* leave room for tunnel header in case of link changeover */
fragm_sz = max_pkt - INT_H_SIZE;
/* leave room for fragmentation header in each fragment */
@@ -1463,7 +1463,7 @@ error:
goto reject;
}
if (link_max_pkt(l_ptr) < max_pkt) {
- sender->max_pkt = link_max_pkt(l_ptr);
+ sender->publ.max_pkt = link_max_pkt(l_ptr);
tipc_node_unlock(node);
for (; buf_chain; buf_chain = buf) {
buf = buf_chain->next;
diff --git a/net/tipc/port.c b/net/tipc/port.c
index bcd5da0..5d2b9ce 100644
--- a/net/tipc/port.c
+++ b/net/tipc/port.c
@@ -1,8 +1,8 @@
/*
* net/tipc/port.c: TIPC port code
*
- * Copyright (c) 1992-2006, Ericsson AB
- * Copyright (c) 2004-2005, Wind River Systems
+ * Copyright (c) 1992-2007, Ericsson AB
+ * Copyright (c) 2004-2007, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -239,6 +239,8 @@ u32 tipc_createport_raw(void *usr_handle,
}
tipc_port_lock(ref);
+ p_ptr->publ.usr_handle = usr_handle;
+ p_ptr->publ.max_pkt = MAX_PKT_DEFAULT;
p_ptr->publ.ref = ref;
msg = &p_ptr->publ.phdr;
msg_init(msg, DATA_LOW, TIPC_NAMED_MSG, TIPC_OK, LONG_H_SIZE, 0);
@@ -248,11 +250,9 @@ u32 tipc_createport_raw(void *usr_handle,
msg_set_importance(msg,importance);
p_ptr->last_in_seqno = 41;
p_ptr->sent = 1;
- p_ptr->publ.usr_handle = usr_handle;
INIT_LIST_HEAD(&p_ptr->wait_list);
INIT_LIST_HEAD(&p_ptr->subscription.nodesub_list);
p_ptr->congested_link = NULL;
- p_ptr->max_pkt = MAX_PKT_DEFAULT;
p_ptr->dispatcher = dispatcher;
p_ptr->wakeup = wakeup;
p_ptr->user_port = NULL;
@@ -1243,7 +1243,7 @@ int tipc_connect2port(u32 ref, struct tipc_portid const *peer)
res = TIPC_OK;
exit:
tipc_port_unlock(p_ptr);
- p_ptr->max_pkt = tipc_link_get_max_pkt(peer->node, ref);
+ p_ptr->publ.max_pkt = tipc_link_get_max_pkt(peer->node, ref);
return res;
}
diff --git a/net/tipc/port.h b/net/tipc/port.h
index 7ef4d64..e5f8c16 100644
--- a/net/tipc/port.h
+++ b/net/tipc/port.h
@@ -1,8 +1,8 @@
/*
* net/tipc/port.h: Include file for TIPC port code
*
- * Copyright (c) 1994-2006, Ericsson AB
- * Copyright (c) 2004-2005, Wind River Systems
+ * Copyright (c) 1994-2007, Ericsson AB
+ * Copyright (c) 2004-2007, Wind River Systems
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -81,7 +81,6 @@ struct user_port {
* @acked:
* @publications: list of publications for port
* @pub_count: total # of publications port has made during its lifetime
- * @max_pkt: maximum packet size "hint" used when building messages sent by port
* @probing_state:
* @probing_interval:
* @last_in_seqno:
@@ -102,7 +101,6 @@ struct port {
u32 acked;
struct list_head publications;
u32 pub_count;
- u32 max_pkt;
u32 probing_state;
u32 probing_interval;
u32 last_in_seqno;
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index ac7f2aa..4a8f37f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -607,23 +607,24 @@ exit:
static int send_stream(struct kiocb *iocb, struct socket *sock,
struct msghdr *m, size_t total_len)
{
+ struct tipc_port *tport;
struct msghdr my_msg;
struct iovec my_iov;
struct iovec *curr_iov;
int curr_iovlen;
char __user *curr_start;
+ u32 hdr_size;
int curr_left;
int bytes_to_send;
int bytes_sent;
int res;
- if (likely(total_len <= TIPC_MAX_USER_MSG_SIZE))
- return send_packet(iocb, sock, m, total_len);
-
- /* Can only send large data streams if already connected */
+ /* Handle special cases where there is no connection */
if (unlikely(sock->state != SS_CONNECTED)) {
- if (sock->state == SS_DISCONNECTING)
+ if (sock->state == SS_UNCONNECTED)
+ return send_packet(iocb, sock, m, total_len);
+ else if (sock->state == SS_DISCONNECTING)
return -EPIPE;
else
return -ENOTCONN;
@@ -648,17 +649,25 @@ static int send_stream(struct kiocb *iocb, struct socket *sock,
my_msg.msg_name = NULL;
bytes_sent = 0;
+ tport = tipc_sk(sock->sk)->p;
+ hdr_size = msg_hdr_sz(&tport->phdr);
+
while (curr_iovlen--) {
curr_start = curr_iov->iov_base;
curr_left = curr_iov->iov_len;
while (curr_left) {
- bytes_to_send = (curr_left < TIPC_MAX_USER_MSG_SIZE)
- ? curr_left : TIPC_MAX_USER_MSG_SIZE;
+ bytes_to_send = tport->max_pkt - hdr_size;
+ if (bytes_to_send > TIPC_MAX_USER_MSG_SIZE)
+ bytes_to_send = TIPC_MAX_USER_MSG_SIZE;
+ if (curr_left < bytes_to_send)
+ bytes_to_send = curr_left;
my_iov.iov_base = curr_start;
my_iov.iov_len = bytes_to_send;
if ((res = send_packet(iocb, sock, &my_msg, 0)) < 0) {
- return bytes_sent ? bytes_sent : res;
+ if (bytes_sent != 0)
+ res = bytes_sent;
+ return res;
}
curr_left -= bytes_to_send;
curr_start += bytes_to_send;
--
1.5.0.5
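The chunk sizing in the new send_stream() loop can be read as three successive limits: the room left under the max_pkt hint, the 66000 byte user data limit, and the bytes remaining in the current iovec. A userspace sketch under those assumptions follows; stream_chunk_size() and DEMO_MAX_USER_MSG_SIZE are illustrative names, and the 24 byte header size used in the usage note is an assumed value.

```c
#include <assert.h>
#include <stdint.h>

/* Per-chunk user data limit quoted in the patch description. */
#define DEMO_MAX_USER_MSG_SIZE 66000u

/*
 * Pick the size of the next chunk so it fits in one packet:
 *   1. room under the max_pkt hint once the header is accounted for,
 *   2. capped at the user data limit (needed for the 66060 pseudo-MTU),
 *   3. capped at what is left to send.
 */
static uint32_t stream_chunk_size(uint32_t max_pkt, uint32_t hdr_size,
				  uint32_t curr_left)
{
	uint32_t bytes_to_send;

	assert(max_pkt > hdr_size); /* the hint must leave room for data */

	bytes_to_send = max_pkt - hdr_size;
	if (bytes_to_send > DEMO_MAX_USER_MSG_SIZE)
		bytes_to_send = DEMO_MAX_USER_MSG_SIZE;
	if (curr_left < bytes_to_send)
		bytes_to_send = curr_left;
	return bytes_to_send;
}
```

With an assumed 24 byte header, a 1500 byte hint yields 1476 byte chunks, while the 66060 byte intra-node pseudo-MTU would allow 66036 bytes and is therefore clamped to 66000.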
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] IGMP: Inhibit reports for local multicast groups
[not found] <1>
` (3 preceding siblings ...)
2007-05-18 0:33 ` [PATCH 3/3] [TIPC]: Optimize stream send routine to avoid fragmentation Jon Paul Maloy
@ 2015-08-24 11:39 ` Philip Downey
2015-08-25 21:20 ` David Miller
2015-08-27 15:46 ` Philip Downey
5 siblings, 1 reply; 24+ messages in thread
From: Philip Downey @ 2015-08-24 11:39 UTC (permalink / raw)
Cc: kuznet, jmorris, yoshfuji, kaber, linux-kernel, netdev,
Philip Downey
The range of addresses between 224.0.0.0 and 224.0.0.255 inclusive, is
reserved for the use of routing protocols and other low-level topology
discovery or maintenance protocols, such as gateway discovery and
group membership reporting. Multicast routers should not forward any
multicast datagram with destination addresses in this range,
regardless of its TTL.
Currently, IGMP reports are generated for this reserved range of
addresses even though a router will ignore this information since it
has no purpose. However, the presence of reserved group addresses in
an IGMP membership report uses up network bandwidth and can also
obscure addresses of interest when inspecting membership reports using
packet inspection or debug messages.
Although the RFCs for the various versions of IGMP (e.g. RFC 3376 for
v3) do not specify that the reserved addresses be excluded from
membership reports, it should do no harm in doing so. In particular
there should be no adverse effect in any IGMP snooping functionality
since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
Snooping Switches Considerations) section 2.1.2. Data Forwarding
Rules:
2) Packets with a destination IP (DIP) address in the 224.0.0.X
range which are not IGMP must be forwarded on all ports.
IGMP reports for local multicast groups can now be optionally
inhibited by means of a system control variable (by setting the value
to zero) e.g.:
echo 0 > /proc/sys/net/ipv4/igmp_link_local_reports
To retain backwards compatibility the previous behaviour is retained
by default on system boot or reverted by setting the value back to
non-zero e.g.:
echo 1 > /proc/sys/net/ipv4/igmp_link_local_reports
Signed-off-by: Philip Downey <pdowney@brocade.com>
---
include/linux/igmp.h | 1 +
net/ipv4/igmp.c | 29 ++++++++++++++++++++++++++++-
net/ipv4/sysctl_net_ipv4.c | 7 +++++++
3 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 193ad48..e3e0dae 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -37,6 +37,7 @@ static inline struct igmpv3_query *
return (struct igmpv3_query *)skb_transport_header(skb);
}
+extern int sysctl_igmp_link_local_reports;
extern int sysctl_igmp_max_memberships;
extern int sysctl_igmp_max_msf;
extern int sysctl_igmp_qrv;
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 9fdfd9d..a3df89d 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -110,6 +110,15 @@
#define IP_MAX_MEMBERSHIPS 20
#define IP_MAX_MSF 10
+/* IGMP reports for link-local multicast groups are enabled by default */
+#define IGMP_ENABLE_LLM 1
+
+int sysctl_igmp_link_local_reports __read_mostly = IGMP_ENABLE_LLM;
+
+#define IGMP_INHIBIT_LINK_LOCAL_REPORTS(_ipaddr) \
+ (ipv4_is_local_multicast(_ipaddr) && \
+ (sysctl_igmp_link_local_reports == 0))
+
#ifdef CONFIG_IP_MULTICAST
/* Parameter names and values are taken from igmp-v2-06 draft */
@@ -437,6 +446,8 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ip_mc_list *pmc,
if (pmc->multiaddr == IGMP_ALL_HOSTS)
return skb;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(pmc->multiaddr))
+ return skb;
isquery = type == IGMPV3_MODE_IS_INCLUDE ||
type == IGMPV3_MODE_IS_EXCLUDE;
@@ -545,6 +556,8 @@ static int igmpv3_send_report(struct in_device *in_dev, struct ip_mc_list *pmc)
for_each_pmc_rcu(in_dev, pmc) {
if (pmc->multiaddr == IGMP_ALL_HOSTS)
continue;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(pmc->multiaddr))
+ continue;
spin_lock_bh(&pmc->lock);
if (pmc->sfcount[MCAST_EXCLUDE])
type = IGMPV3_MODE_IS_EXCLUDE;
@@ -678,7 +691,11 @@ static int igmp_send_report(struct in_device *in_dev, struct ip_mc_list *pmc,
if (type == IGMPV3_HOST_MEMBERSHIP_REPORT)
return igmpv3_send_report(in_dev, pmc);
- else if (type == IGMP_HOST_LEAVE_MESSAGE)
+
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(group))
+ return 0;
+
+ if (type == IGMP_HOST_LEAVE_MESSAGE)
dst = IGMP_ALL_ROUTER;
else
dst = group;
@@ -851,6 +868,8 @@ static bool igmp_heard_report(struct in_device *in_dev, __be32 group)
if (group == IGMP_ALL_HOSTS)
return false;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(group))
+ return false;
rcu_read_lock();
for_each_pmc_rcu(in_dev, im) {
@@ -957,6 +976,8 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb,
continue;
if (im->multiaddr == IGMP_ALL_HOSTS)
continue;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(im->multiaddr))
+ continue;
spin_lock_bh(&im->lock);
if (im->tm_running)
im->gsquery = im->gsquery && mark;
@@ -1181,6 +1202,8 @@ static void igmp_group_dropped(struct ip_mc_list *im)
#ifdef CONFIG_IP_MULTICAST
if (im->multiaddr == IGMP_ALL_HOSTS)
return;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(im->multiaddr))
+ return;
reporter = im->reporter;
igmp_stop_timer(im);
@@ -1213,6 +1236,8 @@ static void igmp_group_added(struct ip_mc_list *im)
#ifdef CONFIG_IP_MULTICAST
if (im->multiaddr == IGMP_ALL_HOSTS)
return;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(im->multiaddr))
+ return;
if (in_dev->dead)
return;
@@ -1518,6 +1543,8 @@ static void ip_mc_rejoin_groups(struct in_device *in_dev)
for_each_pmc_rtnl(in_dev, im) {
if (im->multiaddr == IGMP_ALL_HOSTS)
continue;
+ if (IGMP_INHIBIT_LINK_LOCAL_REPORTS(im->multiaddr))
+ continue;
/* a failover is happening and switches
* must be notified immediately
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0330ab2..157c25a 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -910,6 +910,13 @@ static struct ctl_table ipv4_net_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "igmp_link_local_reports",
+ .data = &sysctl_igmp_link_local_reports,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec
+ },
{ }
};
--
1.7.10.4
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] IGMP: Inhibit reports for local multicast groups
2015-08-24 11:39 ` [PATCH] IGMP: Inhibit reports for local multicast groups Philip Downey
@ 2015-08-25 21:20 ` David Miller
2015-08-26 9:23 ` Philip Downey
0 siblings, 1 reply; 24+ messages in thread
From: David Miller @ 2015-08-25 21:20 UTC (permalink / raw)
To: pdowney; +Cc: kuznet, jmorris, yoshfuji, kaber, linux-kernel, netdev
From: Philip Downey <pdowney@brocade.com>
Date: Mon, 24 Aug 2015 12:39:17 +0100
> +extern int sysctl_igmp_link_local_reports;
...
> +/* IGMP reports for link-local multicast groups are enabled by default */
> +#define IGMP_ENABLE_LLM 1
> +
> +int sysctl_igmp_link_local_reports __read_mostly = IGMP_ENABLE_LLM;
> +
> +#define IGMP_INHIBIT_LINK_LOCAL_REPORTS(_ipaddr) \
> + (ipv4_is_local_multicast(_ipaddr) && \
> + (sysctl_igmp_link_local_reports == 0))
> +
People know that "1" and "0" mean enable and disable respectively, so this
macro is pretty excessive. Just remove it.
Also, simplify the name of the sysctl to something like
"sysctl_igmp_llm_reports" or similar, and simplify the test against 0
to be in the canonical "!x" format. Then the test can fit on one
line:
(ipv4_is_local_multicast(_ipaddr) && !sysctl_igmp_llm_reports)
^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: [PATCH] IGMP: Inhibit reports for local multicast groups
2015-08-25 21:20 ` David Miller
@ 2015-08-26 9:23 ` Philip Downey
0 siblings, 0 replies; 24+ messages in thread
From: Philip Downey @ 2015-08-26 9:23 UTC (permalink / raw)
To: David Miller
Cc: kuznet@ms2.inr.ac.ru, jmorris@namei.org, yoshfuji@linux-ipv6.org,
kaber@trash.net, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Tuesday, August 25, 2015 10:20 PM
> To: Philip Downey
> Cc: kuznet@ms2.inr.ac.ru; jmorris@namei.org; yoshfuji@linux-ipv6.org;
> kaber@trash.net; linux-kernel@vger.kernel.org; netdev@vger.kernel.org
> Subject: Re: [PATCH] IGMP: Inhibit reports for local multicast groups
>
> From: Philip Downey <pdowney@brocade.com>
> Date: Mon, 24 Aug 2015 12:39:17 +0100
>
> > +extern int sysctl_igmp_link_local_reports;
> ...
> > +/* IGMP reports for link-local multicast groups are enabled by default */
> > +#define IGMP_ENABLE_LLM 1
> > +
> > +int sysctl_igmp_link_local_reports __read_mostly = IGMP_ENABLE_LLM;
> > +
> > +#define IGMP_INHIBIT_LINK_LOCAL_REPORTS(_ipaddr) \
> > + (ipv4_is_local_multicast(_ipaddr) && \
> > + (sysctl_igmp_link_local_reports == 0))
> > +
>
> People know that "1" and "0" mean enable and disable respectively, so this
> macro is pretty excessive. Just remove it.
>
> Also, simplify the name of the sysctl to something like
> "sysctl_igmp_llm_reports" or similar, and simplify the test against 0 to be in
> the canonical "!x" format. Then the test can fit on one
> line:
>
> (ipv4_is_local_multicast(_ipaddr) && !sysctl_igmp_llm_reports).
Thanks for reviewing David.
I will make the requested changes (fitting the test on a single line was my main reason for introducing the macro - that and making it patently obvious what the test was doing. Your suggestion would seem to meet that aim).
Will amend and resubmit.
Regards
Philip
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] IGMP: Inhibit reports for local multicast groups
[not found] <1>
` (4 preceding siblings ...)
2015-08-24 11:39 ` [PATCH] IGMP: Inhibit reports for local multicast groups Philip Downey
@ 2015-08-27 15:46 ` Philip Downey
2015-08-28 20:29 ` David Miller
2015-08-28 21:19 ` Cong Wang
5 siblings, 2 replies; 24+ messages in thread
From: Philip Downey @ 2015-08-27 15:46 UTC (permalink / raw)
To: davem; +Cc: kuznet, jmorris, yoshfuji, kaber, linux-kernel, netdev,
Philip Downey
The range of addresses between 224.0.0.0 and 224.0.0.255 inclusive, is
reserved for the use of routing protocols and other low-level topology
discovery or maintenance protocols, such as gateway discovery and
group membership reporting. Multicast routers should not forward any
multicast datagram with destination addresses in this range,
regardless of its TTL.
Currently, IGMP reports are generated for this reserved range of
addresses even though a router will ignore this information since it
has no purpose. However, the presence of reserved group addresses in
an IGMP membership report uses up network bandwidth and can also
obscure addresses of interest when inspecting membership reports using
packet inspection or debug messages.
Although the RFCs for the various versions of IGMP (e.g. RFC 3376 for
v3) do not specify that the reserved addresses be excluded from
membership reports, it should do no harm in doing so. In particular
there should be no adverse effect in any IGMP snooping functionality
since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
Snooping Switches Considerations) section 2.1.2. Data Forwarding
Rules:
2) Packets with a destination IP (DIP) address in the 224.0.0.X
range which are not IGMP must be forwarded on all ports.
IGMP reports for local multicast groups can now be optionally
inhibited by means of a system control variable (by setting the value
to zero) e.g.:
echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
To retain backwards compatibility the previous behaviour is retained
by default on system boot or reverted by setting the value back to
non-zero e.g.:
echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
Signed-off-by: Philip Downey <pdowney@brocade.com>
---
include/linux/igmp.h | 1 +
net/ipv4/igmp.c | 26 +++++++++++++++++++++++++-
net/ipv4/sysctl_net_ipv4.c | 7 +++++++
3 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 193ad48..9084292 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -37,6 +37,7 @@ static inline struct igmpv3_query *
return (struct igmpv3_query *)skb_transport_header(skb);
}
+extern int sysctl_igmp_llm_reports;
extern int sysctl_igmp_max_memberships;
extern int sysctl_igmp_max_msf;
extern int sysctl_igmp_qrv;
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 9fdfd9d..d38b8b6 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -110,6 +110,9 @@
#define IP_MAX_MEMBERSHIPS 20
#define IP_MAX_MSF 10
+/* IGMP reports for link-local multicast groups are enabled by default */
+int sysctl_igmp_llm_reports __read_mostly = 1;
+
#ifdef CONFIG_IP_MULTICAST
/* Parameter names and values are taken from igmp-v2-06 draft */
@@ -437,6 +440,8 @@ static struct sk_buff *add_grec(struct sk_buff *skb, struct ip_mc_list *pmc,
if (pmc->multiaddr == IGMP_ALL_HOSTS)
return skb;
+ if (ipv4_is_local_multicast(pmc->multiaddr) && !sysctl_igmp_llm_reports)
+ return skb;
isquery = type == IGMPV3_MODE_IS_INCLUDE ||
type == IGMPV3_MODE_IS_EXCLUDE;
@@ -545,6 +550,9 @@ static int igmpv3_send_report(struct in_device *in_dev, struct ip_mc_list *pmc)
for_each_pmc_rcu(in_dev, pmc) {
if (pmc->multiaddr == IGMP_ALL_HOSTS)
continue;
+ if (ipv4_is_local_multicast(pmc->multiaddr) &&
+ !sysctl_igmp_llm_reports)
+ continue;
spin_lock_bh(&pmc->lock);
if (pmc->sfcount[MCAST_EXCLUDE])
type = IGMPV3_MODE_IS_EXCLUDE;
@@ -678,7 +686,11 @@ static int igmp_send_report(struct in_device *in_dev, struct ip_mc_list *pmc,
if (type == IGMPV3_HOST_MEMBERSHIP_REPORT)
return igmpv3_send_report(in_dev, pmc);
- else if (type == IGMP_HOST_LEAVE_MESSAGE)
+
+ if (ipv4_is_local_multicast(group) && !sysctl_igmp_llm_reports)
+ return 0;
+
+ if (type == IGMP_HOST_LEAVE_MESSAGE)
dst = IGMP_ALL_ROUTER;
else
dst = group;
@@ -851,6 +863,8 @@ static bool igmp_heard_report(struct in_device *in_dev, __be32 group)
if (group == IGMP_ALL_HOSTS)
return false;
+ if (ipv4_is_local_multicast(group) && !sysctl_igmp_llm_reports)
+ return false;
rcu_read_lock();
for_each_pmc_rcu(in_dev, im) {
@@ -957,6 +971,9 @@ static bool igmp_heard_query(struct in_device *in_dev, struct sk_buff *skb,
continue;
if (im->multiaddr == IGMP_ALL_HOSTS)
continue;
+ if (ipv4_is_local_multicast(im->multiaddr) &&
+ !sysctl_igmp_llm_reports)
+ continue;
spin_lock_bh(&im->lock);
if (im->tm_running)
im->gsquery = im->gsquery && mark;
@@ -1181,6 +1198,8 @@ static void igmp_group_dropped(struct ip_mc_list *im)
#ifdef CONFIG_IP_MULTICAST
if (im->multiaddr == IGMP_ALL_HOSTS)
return;
+ if (ipv4_is_local_multicast(im->multiaddr) && !sysctl_igmp_llm_reports)
+ return;
reporter = im->reporter;
igmp_stop_timer(im);
@@ -1213,6 +1232,8 @@ static void igmp_group_added(struct ip_mc_list *im)
#ifdef CONFIG_IP_MULTICAST
if (im->multiaddr == IGMP_ALL_HOSTS)
return;
+ if (ipv4_is_local_multicast(im->multiaddr) && !sysctl_igmp_llm_reports)
+ return;
if (in_dev->dead)
return;
@@ -1518,6 +1539,9 @@ static void ip_mc_rejoin_groups(struct in_device *in_dev)
for_each_pmc_rtnl(in_dev, im) {
if (im->multiaddr == IGMP_ALL_HOSTS)
continue;
+ if (ipv4_is_local_multicast(im->multiaddr) &&
+ !sysctl_igmp_llm_reports)
+ continue;
/* a failover is happening and switches
* must be notified immediately
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0330ab2..74eede2 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -910,6 +910,13 @@ static struct ctl_table ipv4_net_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "igmp_link_local_mcast_reports",
+ .data = &sysctl_igmp_llm_reports,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec
+ },
{ }
};
--
1.7.10.4
* Re: [PATCH] IGMP: Inhibit reports for local multicast groups
2015-08-27 15:46 ` Philip Downey
@ 2015-08-28 20:29 ` David Miller
2015-08-28 21:19 ` Cong Wang
1 sibling, 0 replies; 24+ messages in thread
From: David Miller @ 2015-08-28 20:29 UTC (permalink / raw)
To: pdowney; +Cc: kuznet, jmorris, yoshfuji, kaber, linux-kernel, netdev
From: Philip Downey <pdowney@brocade.com>
Date: Thu, 27 Aug 2015 16:46:26 +0100
> The range of addresses between 224.0.0.0 and 224.0.0.255, inclusive, is
> reserved for the use of routing protocols and other low-level topology
> discovery or maintenance protocols, such as gateway discovery and
> group membership reporting. Multicast routers should not forward any
> multicast datagram with destination addresses in this range,
> regardless of its TTL.
>
> Currently, IGMP reports are generated for this reserved range of
> addresses even though a router will ignore this information since it
> has no purpose. However, the presence of reserved group addresses in
> an IGMP membership report uses up network bandwidth and can also
> obscure addresses of interest when inspecting membership reports using
> packet inspection or debug messages.
>
> Although the RFCs for the various versions of IGMP (e.g. RFC 3376 for
> v3) do not specify that the reserved addresses be excluded from
> membership reports, excluding them should do no harm. In particular
> there should be no adverse effect in any IGMP snooping functionality
> since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
> Snooping Switches Considerations) section 2.1.2. Data Forwarding
> Rules:
>
> 2) Packets with a destination IP (DIP) address in the 224.0.0.X
> range which are not IGMP must be forwarded on all ports.
>
> IGMP reports for local multicast groups can now be optionally
> inhibited by means of a system control variable (by setting the value
> to zero) e.g.:
> echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
>
> To retain backwards compatibility the previous behaviour is retained
> by default on system boot or reverted by setting the value back to
> non-zero e.g.:
> echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
>
> Signed-off-by: Philip Downey <pdowney@brocade.com>
Applied to net-next, thanks.
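
For readers following the thread, the rule the patch applies at each call site can be sketched in user space. This is a minimal illustration, not the kernel code: `is_local_multicast()` mirrors what the kernel helper `ipv4_is_local_multicast()` is understood to test (membership in 224.0.0.0/24), while `llm_reports`, `report_inhibited()` and `self_check()` are hypothetical names standing in for `sysctl_igmp_llm_reports` and the repeated `if (...) return/continue` checks in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htonl(), inet_addr() */

/* Hypothetical stand-in for sysctl_igmp_llm_reports; 1 (reports enabled)
 * matches the default chosen in the patch. */
int llm_reports = 1;

/* True if addr (network byte order) lies in 224.0.0.0/24, the reserved
 * link-local multicast range described in the commit message. */
bool is_local_multicast(uint32_t addr)
{
	return (addr & htonl(0xFFFFFF00)) == htonl(0xE0000000);
}

/* True if a membership report for this group would be suppressed:
 * the group is link-local AND the knob has been set to zero. */
bool report_inhibited(uint32_t group)
{
	return is_local_multicast(group) && !llm_reports;
}

/* Usage sketch: exercises the range boundaries and both knob settings. */
void self_check(void)
{
	llm_reports = 0;
	assert(report_inhibited(inet_addr("224.0.0.251")));	/* inside range */
	assert(!report_inhibited(inet_addr("224.0.1.1")));	/* just outside */
	llm_reports = 1;
	assert(!report_inhibited(inet_addr("224.0.0.251")));	/* default: report */
}
```

Groups outside 224.0.0.0/24 are never affected by the knob, which is why every check in the patch combines the range test with `!sysctl_igmp_llm_reports` rather than testing the sysctl alone.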
* Re: [PATCH] IGMP: Inhibit reports for local multicast groups
2015-08-27 15:46 ` Philip Downey
2015-08-28 20:29 ` David Miller
@ 2015-08-28 21:19 ` Cong Wang
2015-08-31 10:33 ` Philip Downey
1 sibling, 1 reply; 24+ messages in thread
From: Cong Wang @ 2015-08-28 21:19 UTC (permalink / raw)
To: Philip Downey
Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, linux-kernel@vger.kernel.org, netdev
On Thu, Aug 27, 2015 at 8:46 AM, Philip Downey <pdowney@brocade.com> wrote:
> IGMP reports for local multicast groups can now be optionally
> inhibited by means of a system control variable (by setting the value
> to zero) e.g.:
> echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
>
> To retain backwards compatibility the previous behaviour is retained
> by default on system boot or reverted by setting the value back to
> non-zero e.g.:
> echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
>
Please document it in Documentation/networking/ip-sysctl.txt.
* RE: [PATCH] IGMP: Inhibit reports for local multicast groups
2015-08-28 21:19 ` Cong Wang
@ 2015-08-31 10:33 ` Philip Downey
0 siblings, 0 replies; 24+ messages in thread
From: Philip Downey @ 2015-08-31 10:33 UTC (permalink / raw)
To: Cong Wang
Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, linux-kernel@vger.kernel.org, netdev
> -----Original Message-----
> From: Cong Wang [mailto:cwang@twopensource.com]
> Sent: Friday, August 28, 2015 10:20 PM
> To: Philip Downey
> Cc: David Miller; Alexey Kuznetsov; James Morris; Hideaki YOSHIFUJI; Patrick
> McHardy; linux-kernel@vger.kernel.org; netdev
> Subject: Re: [PATCH] IGMP: Inhibit reports for local multicast groups
>
> On Thu, Aug 27, 2015 at 8:46 AM, Philip Downey <pdowney@brocade.com>
> wrote:
> > IGMP reports for local multicast groups can now be optionally
> > inhibited by means of a system control variable (by setting the value
> > to zero) e.g.:
> > echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
> >
> > To retain backwards compatibility the previous behaviour is retained
> > by default on system boot or reverted by setting the value back to
> > non-zero e.g.:
> > echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
> >
>
> Please document it in Documentation/networking/ip-sysctl.txt.
Thanks for the comment.
I have generated a new patch for the proposed documentation change.
Hope this is the correct thing to do.
Regards
Philip