* [net-next 0/4] tipc: emulate multicast through replication
@ 2017-01-18 18:50 Jon Maloy
2017-01-18 18:50 ` [net-next 1/4] tipc: add function for checking broadcast support in bearer Jon Maloy
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Jon Maloy @ 2017-01-18 18:50 UTC (permalink / raw)
To: davem
Cc: netdev, parthasarathy.bhuvaragan, ying.xue, maloy,
tipc-discussion, Jon Maloy
TIPC multicast messages are currently distributed via L2 broadcast
or IP multicast to all nodes in the cluster, irrespective of the
number of real destinations of the message.
In this series we introduce an option to transport messages via
replication ("replicast") across a selected number of unicast links,
instead of relying on the underlying media. This option is used when
true broadcast/multicast is not supported by the media, or when the
number of true destinations is much smaller than the cluster size.
Jon Maloy (4):
tipc: add function for checking broadcast support in bearer
tipc: add functionality to lookup multicast destination nodes
tipc: introduce replicast as transport option for multicast
tipc: make replicast a user selectable option
include/uapi/linux/tipc.h | 6 +-
net/tipc/bcast.c | 200 +++++++++++++++++++++++++++++++++++++++-------
net/tipc/bcast.h | 33 +++++++-
net/tipc/bearer.c | 15 +++-
net/tipc/bearer.h | 8 +-
net/tipc/link.c | 12 ++-
net/tipc/msg.c | 17 ++++
net/tipc/msg.h | 9 +--
net/tipc/name_table.c | 38 +++++++--
net/tipc/name_table.h | 9 +++
net/tipc/node.c | 27 ++++---
net/tipc/node.h | 4 +-
net/tipc/socket.c | 61 ++++++++++----
net/tipc/udp_media.c | 8 +-
14 files changed, 374 insertions(+), 73 deletions(-)
--
2.7.4
* [net-next 1/4] tipc: add function for checking broadcast support in bearer
2017-01-18 18:50 [net-next 0/4] tipc: emulate multicast through replication Jon Maloy
@ 2017-01-18 18:50 ` Jon Maloy
2017-01-18 18:50 ` [net-next 2/4] tipc: add functionality to lookup multicast destination nodes Jon Maloy
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Jon Maloy @ 2017-01-18 18:50 UTC (permalink / raw)
To: davem; +Cc: Jon Maloy, netdev, tipc-discussion
As a preparation for the 'replicast' functionality we are going to
introduce in the next commits, we need the broadcast base structure to
store whether bearer broadcast is available at all from the currently
used bearer or bearers.
We do this by adding a new function tipc_bearer_bcast_support() to
the bearer layer, and letting the bearer selection function in
bcast.c use it to give a new boolean field, 'bcast_support', the
appropriate value.
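For readers who want the selection rule in isolation, here is a minimal
userspace sketch of how 'bcast_support' ends up with its value. It is not
the kernel code: struct bearer_model, model_bcast_support() and NO_PRIMARY
are hypothetical names, and skipping bearers that carry no destinations is
an assumption about the surrounding loop, which the hunks in this patch do
not show in full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_PRIMARY (-1)	/* stand-in for INVALID_BEARER_ID */

/* Hypothetical per-bearer summary; not a kernel structure. */
struct bearer_model {
	int dests;	/* broadcast destinations reachable via this bearer */
	bool bcast;	/* media offers true broadcast/multicast */
};

/* Return true if the broadcast link may rely on media broadcast. */
bool model_bcast_support(const struct bearer_model *b, size_t n, int primary)
{
	bool support = true;
	size_t i;

	assert(primary == NO_PRIMARY || (size_t)primary < n);

	/* A primary bearer reaches every destination, so it alone decides. */
	if (primary != NO_PRIMARY)
		return b[primary].bcast;

	/* Otherwise every bearer in use must support broadcast. */
	for (i = 0; i < n; i++) {
		if (b[i].dests == 0)
			continue;	/* assumed: unused bearers are skipped */
		support = support && b[i].bcast;
	}
	return support;
}
```

With no primary bearer, a single in-use bearer without broadcast is enough
to turn the flag off; with a primary bearer, the other bearers no longer
matter.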
Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
net/tipc/bcast.c | 12 +++++++++---
net/tipc/bearer.c | 15 ++++++++++++++-
net/tipc/bearer.h | 8 +++++++-
net/tipc/udp_media.c | 8 ++++----
4 files changed, 34 insertions(+), 9 deletions(-)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index c35fad3..3256276 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -1,7 +1,7 @@
/*
* net/tipc/bcast.c: TIPC broadcast code
*
- * Copyright (c) 2004-2006, 2014-2015, Ericsson AB
+ * Copyright (c) 2004-2006, 2014-2016, Ericsson AB
* Copyright (c) 2004, Intel Corporation.
* Copyright (c) 2005, 2010-2011, Wind River Systems
* All rights reserved.
@@ -54,12 +54,14 @@ const char tipc_bclink_name[] = "broadcast-link";
* @inputq: data input queue; will only carry SOCK_WAKEUP messages
* @dest: array keeping number of reachable destinations per bearer
* @primary_bearer: a bearer having links to all broadcast destinations, if any
+ * @bcast_support: indicates if primary bearer, if any, supports broadcast
*/
struct tipc_bc_base {
struct tipc_link *link;
struct sk_buff_head inputq;
int dests[MAX_BEARERS];
int primary_bearer;
+ bool bcast_support;
};
static struct tipc_bc_base *tipc_bc_base(struct net *net)
@@ -79,9 +81,10 @@ static void tipc_bcbase_select_primary(struct net *net)
{
struct tipc_bc_base *bb = tipc_bc_base(net);
int all_dests = tipc_link_bc_peers(bb->link);
- int i, mtu;
+ int i, mtu, prim;
bb->primary_bearer = INVALID_BEARER_ID;
+ bb->bcast_support = true;
if (!all_dests)
return;
@@ -93,7 +96,7 @@ static void tipc_bcbase_select_primary(struct net *net)
mtu = tipc_bearer_mtu(net, i);
if (mtu < tipc_link_mtu(bb->link))
tipc_link_set_mtu(bb->link, mtu);
-
+ bb->bcast_support &= tipc_bearer_bcast_support(net, i);
if (bb->dests[i] < all_dests)
continue;
@@ -103,6 +106,9 @@ static void tipc_bcbase_select_primary(struct net *net)
if ((i ^ tipc_own_addr(net)) & 1)
break;
}
+ prim = bb->primary_bearer;
+ if (prim != INVALID_BEARER_ID)
+ bb->bcast_support = tipc_bearer_bcast_support(net, prim);
}
void tipc_bcast_inc_bearer_dst_cnt(struct net *net, int bearer_id)
diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index 52d7476..33a5bdf 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -431,7 +431,7 @@ int tipc_enable_l2_media(struct net *net, struct tipc_bearer *b,
memset(&b->bcast_addr, 0, sizeof(b->bcast_addr));
memcpy(b->bcast_addr.value, dev->broadcast, b->media->hwaddr_len);
b->bcast_addr.media_id = b->media->type_id;
- b->bcast_addr.broadcast = 1;
+ b->bcast_addr.broadcast = TIPC_BROADCAST_SUPPORT;
b->mtu = dev->mtu;
b->media->raw2addr(b, &b->addr, (char *)dev->dev_addr);
rcu_assign_pointer(dev->tipc_ptr, b);
@@ -482,6 +482,19 @@ int tipc_l2_send_msg(struct net *net, struct sk_buff *skb,
return 0;
}
+bool tipc_bearer_bcast_support(struct net *net, u32 bearer_id)
+{
+ bool supp = false;
+ struct tipc_bearer *b;
+
+ rcu_read_lock();
+ b = bearer_get(net, bearer_id);
+ if (b)
+ supp = (b->bcast_addr.broadcast == TIPC_BROADCAST_SUPPORT);
+ rcu_read_unlock();
+ return supp;
+}
+
int tipc_bearer_mtu(struct net *net, u32 bearer_id)
{
int mtu = 0;
diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h
index 278ff7f..635c908 100644
--- a/net/tipc/bearer.h
+++ b/net/tipc/bearer.h
@@ -60,9 +60,14 @@
#define TIPC_MEDIA_TYPE_IB 2
#define TIPC_MEDIA_TYPE_UDP 3
-/* minimum bearer MTU */
+/* Minimum bearer MTU */
#define TIPC_MIN_BEARER_MTU (MAX_H_SIZE + INT_H_SIZE)
+/* Identifiers for distinguishing between broadcast/multicast and replicast
+ */
+#define TIPC_BROADCAST_SUPPORT 1
+#define TIPC_REPLICAST_SUPPORT 2
+
/**
* struct tipc_media_addr - destination address used by TIPC bearers
* @value: address info (format defined by media)
@@ -210,6 +215,7 @@ int tipc_bearer_setup(void);
void tipc_bearer_cleanup(void);
void tipc_bearer_stop(struct net *net);
int tipc_bearer_mtu(struct net *net, u32 bearer_id);
+bool tipc_bearer_bcast_support(struct net *net, u32 bearer_id);
void tipc_bearer_xmit_skb(struct net *net, u32 bearer_id,
struct sk_buff *skb,
struct tipc_media_addr *dest);
diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index b58dc95..46061cf 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -113,7 +113,7 @@ static void tipc_udp_media_addr_set(struct tipc_media_addr *addr,
memcpy(addr->value, ua, sizeof(struct udp_media_addr));
if (tipc_udp_is_mcast_addr(ua))
- addr->broadcast = 1;
+ addr->broadcast = TIPC_BROADCAST_SUPPORT;
}
/* tipc_udp_addr2str - convert ip/udp address to string */
@@ -229,7 +229,7 @@ static int tipc_udp_send_msg(struct net *net, struct sk_buff *skb,
goto out;
}
- if (!addr->broadcast || list_empty(&ub->rcast.list))
+ if (addr->broadcast != TIPC_REPLICAST_SUPPORT)
return tipc_udp_xmit(net, skb, ub, src, dst);
/* Replicast, send an skb to each configured IP address */
@@ -296,7 +296,7 @@ static int tipc_udp_rcast_add(struct tipc_bearer *b,
else if (ntohs(addr->proto) == ETH_P_IPV6)
pr_info("New replicast peer: %pI6\n", &rcast->addr.ipv6);
#endif
-
+ b->bcast_addr.broadcast = TIPC_REPLICAST_SUPPORT;
list_add_rcu(&rcast->list, &ub->rcast.list);
return 0;
}
@@ -681,7 +681,7 @@ static int tipc_udp_enable(struct net *net, struct tipc_bearer *b,
goto err;
b->bcast_addr.media_id = TIPC_MEDIA_TYPE_UDP;
- b->bcast_addr.broadcast = 1;
+ b->bcast_addr.broadcast = TIPC_BROADCAST_SUPPORT;
rcu_assign_pointer(b->media_ptr, ub);
rcu_assign_pointer(ub->bearer, b);
tipc_udp_media_addr_set(&b->addr, &local);
--
2.7.4
* [net-next 2/4] tipc: add functionality to lookup multicast destination nodes
2017-01-18 18:50 [net-next 0/4] tipc: emulate multicast through replication Jon Maloy
2017-01-18 18:50 ` [net-next 1/4] tipc: add function for checking broadcast support in bearer Jon Maloy
@ 2017-01-18 18:50 ` Jon Maloy
2017-01-18 18:50 ` [net-next 3/4] tipc: introduce replicast as transport option for multicast Jon Maloy
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Jon Maloy @ 2017-01-18 18:50 UTC (permalink / raw)
To: davem
Cc: netdev, parthasarathy.bhuvaragan, ying.xue, maloy,
tipc-discussion, Jon Maloy
As a further preparation for the upcoming 'replicast' functionality,
we add some necessary structs and functions for looking up and returning
a list of all nodes that host destinations for a given multicast message.
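The bookkeeping done by the new node list can be sketched in userspace as
follows. This is an illustration only: struct nlist_model and the
nlist_model_*() helpers are hypothetical, a fixed array with the made-up
capacity NLIST_MAX replaces the kernel's linked list, and the duplicate
check reflects my reading that u32_push() only reports success for a value
not already on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NLIST_MAX 64	/* hypothetical capacity; the kernel list is unbounded */

struct nlist_model {
	uint32_t nodes[NLIST_MAX];	/* remote destination nodes, no duplicates */
	uint32_t self;			/* own node address */
	uint16_t remote;		/* number of entries in nodes[] */
	bool local;			/* true if own node hosts a destination */
};

void nlist_model_init(struct nlist_model *nl, uint32_t self)
{
	memset(nl, 0, sizeof(*nl));
	nl->self = self;
}

/* Return the index of node in nodes[], or -1 if it is absent. */
static int nlist_model_find(const struct nlist_model *nl, uint32_t node)
{
	int i;

	for (i = 0; i < nl->remote; i++)
		if (nl->nodes[i] == node)
			return i;
	return -1;
}

void nlist_model_add(struct nlist_model *nl, uint32_t node)
{
	if (node == nl->self) {
		nl->local = true;	/* own node is flagged, never listed */
		return;
	}
	if (nlist_model_find(nl, node) >= 0)
		return;			/* already counted */
	assert(nl->remote < NLIST_MAX);
	nl->nodes[nl->remote++] = node;
}

void nlist_model_del(struct nlist_model *nl, uint32_t node)
{
	int i;

	if (node == nl->self) {
		nl->local = false;
		return;
	}
	i = nlist_model_find(nl, node);
	if (i >= 0)
		nl->nodes[i] = nl->nodes[--nl->remote];	/* fill the hole */
}
```

A node publishing several matching names is therefore counted once, and the
sender's own node only sets the 'local' flag.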
Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
net/tipc/bcast.c | 33 +++++++++++++++++++++++++++++++--
net/tipc/bcast.h | 15 ++++++++++++++-
net/tipc/name_table.c | 38 +++++++++++++++++++++++++++++++++-----
net/tipc/name_table.h | 9 +++++++++
4 files changed, 87 insertions(+), 8 deletions(-)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 3256276..412d335 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -39,9 +39,8 @@
#include "socket.h"
#include "msg.h"
#include "bcast.h"
-#include "name_distr.h"
#include "link.h"
-#include "node.h"
+#include "name_table.h"
#define BCLINK_WIN_DEFAULT 50 /* bcast link window size (default) */
#define BCLINK_WIN_MIN 32 /* bcast minimum link window size */
@@ -434,3 +433,33 @@ void tipc_bcast_stop(struct net *net)
kfree(tn->bcbase);
kfree(tn->bcl);
}
+
+void tipc_nlist_init(struct tipc_nlist *nl, u32 self)
+{
+ memset(nl, 0, sizeof(*nl));
+ INIT_LIST_HEAD(&nl->list);
+ nl->self = self;
+}
+
+void tipc_nlist_add(struct tipc_nlist *nl, u32 node)
+{
+ if (node == nl->self)
+ nl->local = true;
+ else if (u32_push(&nl->list, node))
+ nl->remote++;
+}
+
+void tipc_nlist_del(struct tipc_nlist *nl, u32 node)
+{
+ if (node == nl->self)
+ nl->local = false;
+ else if (u32_del(&nl->list, node))
+ nl->remote--;
+}
+
+void tipc_nlist_purge(struct tipc_nlist *nl)
+{
+ u32_list_purge(&nl->list);
+ nl->remote = 0;
+ nl->local = 0;
+}
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 855d53c..18f3791 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -42,9 +42,22 @@
struct tipc_node;
struct tipc_msg;
struct tipc_nl_msg;
-struct tipc_node_map;
+struct tipc_nlist;
+struct tipc_nitem;
extern const char tipc_bclink_name[];
+struct tipc_nlist {
+ struct list_head list;
+ u32 self;
+ u16 remote;
+ bool local;
+};
+
+void tipc_nlist_init(struct tipc_nlist *nl, u32 self);
+void tipc_nlist_purge(struct tipc_nlist *nl);
+void tipc_nlist_add(struct tipc_nlist *nl, u32 node);
+void tipc_nlist_del(struct tipc_nlist *nl, u32 node);
+
int tipc_bcast_init(struct net *net);
void tipc_bcast_stop(struct net *net);
void tipc_bcast_add_peer(struct net *net, struct tipc_link *l,
diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index 5a86df1..9be6592 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -645,6 +645,39 @@ int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
return res;
}
+/* tipc_nametbl_lookup_dst_nodes - find broadcast destination nodes
+ * - Creates list of nodes that overlap the given multicast address
+ * - Determines if any node local ports overlap
+ */
+void tipc_nametbl_lookup_dst_nodes(struct net *net, u32 type, u32 lower,
+ u32 upper, u32 domain,
+ struct tipc_nlist *nodes)
+{
+ struct sub_seq *sseq, *stop;
+ struct publication *publ;
+ struct name_info *info;
+ struct name_seq *seq;
+
+ rcu_read_lock();
+ seq = nametbl_find_seq(net, type);
+ if (!seq)
+ goto exit;
+
+ spin_lock_bh(&seq->lock);
+ sseq = seq->sseqs + nameseq_locate_subseq(seq, lower);
+ stop = seq->sseqs + seq->first_free;
+ for (; sseq->lower <= upper && sseq != stop; sseq++) {
+ info = sseq->info;
+ list_for_each_entry(publ, &info->zone_list, zone_list) {
+ if (tipc_in_scope(domain, publ->node))
+ tipc_nlist_add(nodes, publ->node);
+ }
+ }
+ spin_unlock_bh(&seq->lock);
+exit:
+ rcu_read_unlock();
+}
+
/*
* tipc_nametbl_publish - add name publication to network name tables
*/
@@ -1022,11 +1055,6 @@ int tipc_nl_name_table_dump(struct sk_buff *skb, struct netlink_callback *cb)
return skb->len;
}
-struct u32_item {
- struct list_head list;
- u32 value;
-};
-
bool u32_find(struct list_head *l, u32 value)
{
struct u32_item *item;
diff --git a/net/tipc/name_table.h b/net/tipc/name_table.h
index c89bb3f..6ebdeb1 100644
--- a/net/tipc/name_table.h
+++ b/net/tipc/name_table.h
@@ -39,6 +39,7 @@
struct tipc_subscription;
struct tipc_plist;
+struct tipc_nlist;
/*
* TIPC name types reserved for internal TIPC use (both current and planned)
@@ -100,6 +101,9 @@ int tipc_nl_name_table_dump(struct sk_buff *skb, struct netlink_callback *cb);
u32 tipc_nametbl_translate(struct net *net, u32 type, u32 instance, u32 *node);
int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
u32 limit, struct list_head *dports);
+void tipc_nametbl_lookup_dst_nodes(struct net *net, u32 type, u32 lower,
+ u32 upper, u32 domain,
+ struct tipc_nlist *nodes);
struct publication *tipc_nametbl_publish(struct net *net, u32 type, u32 lower,
u32 upper, u32 scope, u32 port_ref,
u32 key);
@@ -116,6 +120,11 @@ void tipc_nametbl_unsubscribe(struct tipc_subscription *s);
int tipc_nametbl_init(struct net *net);
void tipc_nametbl_stop(struct net *net);
+struct u32_item {
+ struct list_head list;
+ u32 value;
+};
+
bool u32_push(struct list_head *l, u32 value);
u32 u32_pop(struct list_head *l);
bool u32_find(struct list_head *l, u32 value);
--
2.7.4
* [net-next 3/4] tipc: introduce replicast as transport option for multicast
2017-01-18 18:50 [net-next 0/4] tipc: emulate multicast through replication Jon Maloy
2017-01-18 18:50 ` [net-next 1/4] tipc: add function for checking broadcast support in bearer Jon Maloy
2017-01-18 18:50 ` [net-next 2/4] tipc: add functionality to lookup multicast destination nodes Jon Maloy
@ 2017-01-18 18:50 ` Jon Maloy
2017-01-18 18:50 ` [net-next 4/4] tipc: make replicast a user selectable option Jon Maloy
2017-01-20 17:10 ` [net-next 0/4] tipc: emulate multicast through replication David Miller
4 siblings, 0 replies; 6+ messages in thread
From: Jon Maloy @ 2017-01-18 18:50 UTC (permalink / raw)
To: davem; +Cc: Jon Maloy, netdev, tipc-discussion
TIPC multicast messages are currently carried over a reliable
'broadcast link', making use of the underlying media's ability to
transport packets as L2 broadcast or IP multicast to all nodes in
the cluster.
When the bearer in use lacks that ability, we can instead emulate
the broadcast service by replicating the packets and sending them over
as many unicast links as needed to reach all identified destinations.
We now introduce a new TIPC link-level 'replicast' service that does
this.
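As a rough userspace illustration of the idea (not the kernel
implementation), replicast boils down to one copy per destination node,
with congested links counted rather than treated as errors. The names
struct link_model and rcast_model_xmit() are hypothetical, and modelling
congestion as a per-link flag is a simplification of what the unicast
send path reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unicast link to one peer node; not a kernel structure. */
struct link_model {
	uint32_t peer;		/* address of the node at the other end */
	bool congested;		/* link took the message but is now congested */
	unsigned int sent;	/* message copies handed to this link */
};

/* Send one copy of a message to each destination; count congested links. */
int rcast_model_xmit(struct link_model *links, size_t nlinks,
		     const uint32_t *dests, size_t ndests,
		     uint16_t *cong_link_cnt)
{
	size_t d, l;

	assert(cong_link_cnt != NULL);

	for (d = 0; d < ndests; d++) {
		for (l = 0; l < nlinks; l++) {
			if (links[l].peer != dests[d])
				continue;
			links[l].sent++;	/* one replica per node */
			if (links[l].congested)
				(*cong_link_cnt)++;
			break;
		}
		/* A destination without a link is skipped, not an error. */
	}
	return 0;
}
```

The caller can then block the sending socket until the congestion count
drops back to zero, which is how the socket layer in this patch uses
'cong_link_cnt'.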
Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
net/tipc/bcast.c | 105 ++++++++++++++++++++++++++++++++++++++++++------------
net/tipc/bcast.h | 3 +-
net/tipc/link.c | 8 ++++-
net/tipc/msg.c | 17 +++++++++
net/tipc/msg.h | 9 +++--
net/tipc/node.c | 27 +++++++++-----
net/tipc/socket.c | 27 +++++++++-----
7 files changed, 149 insertions(+), 47 deletions(-)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 412d335..672e6ef 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -70,7 +70,7 @@ static struct tipc_bc_base *tipc_bc_base(struct net *net)
int tipc_bcast_get_mtu(struct net *net)
{
- return tipc_link_mtu(tipc_bc_sndlink(net));
+ return tipc_link_mtu(tipc_bc_sndlink(net)) - INT_H_SIZE;
}
/* tipc_bcbase_select_primary(): find a bearer with links to all destinations,
@@ -175,42 +175,101 @@ static void tipc_bcbase_xmit(struct net *net, struct sk_buff_head *xmitq)
__skb_queue_purge(&_xmitq);
}
-/* tipc_bcast_xmit - deliver buffer chain to all nodes in cluster
- * and to identified node local sockets
+/* tipc_bcast_xmit - broadcast the buffer chain to all external nodes
* @net: the applicable net namespace
- * @list: chain of buffers containing message
+ * @pkts: chain of buffers containing message
+ * @cong_link_cnt: set to 1 if broadcast link is congested, otherwise 0
* Consumes the buffer chain.
- * Returns 0 if success, otherwise errno: -ELINKCONG,-EHOSTUNREACH,-EMSGSIZE
+ * Returns 0 if success, otherwise errno: -EHOSTUNREACH,-EMSGSIZE
*/
-int tipc_bcast_xmit(struct net *net, struct sk_buff_head *list)
+static int tipc_bcast_xmit(struct net *net, struct sk_buff_head *pkts,
+ u16 *cong_link_cnt)
{
struct tipc_link *l = tipc_bc_sndlink(net);
- struct sk_buff_head xmitq, inputq, rcvq;
+ struct sk_buff_head xmitq;
int rc = 0;
- __skb_queue_head_init(&rcvq);
__skb_queue_head_init(&xmitq);
- skb_queue_head_init(&inputq);
-
- /* Prepare message clone for local node */
- if (unlikely(!tipc_msg_reassemble(list, &rcvq)))
- return -EHOSTUNREACH;
-
tipc_bcast_lock(net);
if (tipc_link_bc_peers(l))
- rc = tipc_link_xmit(l, list, &xmitq);
+ rc = tipc_link_xmit(l, pkts, &xmitq);
tipc_bcast_unlock(net);
+ tipc_bcbase_xmit(net, &xmitq);
+ __skb_queue_purge(pkts);
+ if (rc == -ELINKCONG) {
+ *cong_link_cnt = 1;
+ rc = 0;
+ }
+ return rc;
+}
- /* Don't send to local node if adding to link failed */
- if (unlikely(rc && (rc != -ELINKCONG))) {
- __skb_queue_purge(&rcvq);
- return rc;
+/* tipc_rcast_xmit - replicate and send a message to given destination nodes
+ * @net: the applicable net namespace
+ * @pkts: chain of buffers containing message
+ * @dests: list of destination nodes
+ * @cong_link_cnt: returns number of congested links
+ * @cong_links: returns identities of congested links
+ * Returns 0 if success, otherwise errno
+ */
+static int tipc_rcast_xmit(struct net *net, struct sk_buff_head *pkts,
+ struct tipc_nlist *dests, u16 *cong_link_cnt)
+{
+ struct sk_buff_head _pkts;
+ struct u32_item *n, *tmp;
+ u32 dst, selector;
+
+ selector = msg_link_selector(buf_msg(skb_peek(pkts)));
+ __skb_queue_head_init(&_pkts);
+
+ list_for_each_entry_safe(n, tmp, &dests->list, list) {
+ dst = n->value;
+ if (!tipc_msg_pskb_copy(dst, pkts, &_pkts))
+ return -ENOMEM;
+
+ /* Any other return value than -ELINKCONG is ignored */
+ if (tipc_node_xmit(net, &_pkts, dst, selector) == -ELINKCONG)
+ (*cong_link_cnt)++;
}
+ return 0;
+}
- /* Broadcast to all nodes, inluding local node */
- tipc_bcbase_xmit(net, &xmitq);
- tipc_sk_mcast_rcv(net, &rcvq, &inputq);
- __skb_queue_purge(list);
+/* tipc_mcast_xmit - deliver message to indicated destination nodes
+ * and to identified node local sockets
+ * @net: the applicable net namespace
+ * @pkts: chain of buffers containing message
+ * @dests: destination nodes for message. Not consumed.
+ * @cong_link_cnt: returns number of encountered congested destination links
+ * @cong_links: returns identities of congested links
+ * Consumes buffer chain.
+ * Returns 0 if success, otherwise errno
+ */
+int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
+ struct tipc_nlist *dests, u16 *cong_link_cnt)
+{
+ struct tipc_bc_base *bb = tipc_bc_base(net);
+ struct sk_buff_head inputq, localq;
+ int rc = 0;
+
+ skb_queue_head_init(&inputq);
+ skb_queue_head_init(&localq);
+
+ /* Clone packets before they are consumed by next call */
+ if (dests->local && !tipc_msg_reassemble(pkts, &localq)) {
+ rc = -ENOMEM;
+ goto exit;
+ }
+
+ if (dests->remote) {
+ if (!bb->bcast_support)
+ rc = tipc_rcast_xmit(net, pkts, dests, cong_link_cnt);
+ else
+ rc = tipc_bcast_xmit(net, pkts, cong_link_cnt);
+ }
+
+ if (dests->local)
+ tipc_sk_mcast_rcv(net, &localq, &inputq);
+exit:
+ __skb_queue_purge(pkts);
return rc;
}
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 18f3791..dd772e6 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -66,7 +66,8 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_bcl);
void tipc_bcast_inc_bearer_dst_cnt(struct net *net, int bearer_id);
void tipc_bcast_dec_bearer_dst_cnt(struct net *net, int bearer_id);
int tipc_bcast_get_mtu(struct net *net);
-int tipc_bcast_xmit(struct net *net, struct sk_buff_head *list);
+int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
+ struct tipc_nlist *dests, u16 *cong_link_cnt);
int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
diff --git a/net/tipc/link.c b/net/tipc/link.c
index b0f8646..b17b9e1 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1032,11 +1032,17 @@ int tipc_link_retrans(struct tipc_link *l, u16 from, u16 to,
static bool tipc_data_input(struct tipc_link *l, struct sk_buff *skb,
struct sk_buff_head *inputq)
{
- switch (msg_user(buf_msg(skb))) {
+ struct tipc_msg *hdr = buf_msg(skb);
+
+ switch (msg_user(hdr)) {
case TIPC_LOW_IMPORTANCE:
case TIPC_MEDIUM_IMPORTANCE:
case TIPC_HIGH_IMPORTANCE:
case TIPC_CRITICAL_IMPORTANCE:
+ if (unlikely(msg_type(hdr) == TIPC_MCAST_MSG)) {
+ skb_queue_tail(l->bc_rcvlink->inputq, skb);
+ return true;
+ }
case CONN_MANAGER:
skb_queue_tail(inputq, skb);
return true;
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index ab02d07..312ef7d 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -607,6 +607,23 @@ bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq)
return false;
}
+bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg,
+ struct sk_buff_head *cpy)
+{
+ struct sk_buff *skb, *_skb;
+
+ skb_queue_walk(msg, skb) {
+ _skb = pskb_copy(skb, GFP_ATOMIC);
+ if (!_skb) {
+ __skb_queue_purge(cpy);
+ return false;
+ }
+ msg_set_destnode(buf_msg(_skb), dst);
+ __skb_queue_tail(cpy, _skb);
+ }
+ return true;
+}
+
/* tipc_skb_queue_sorted(); sort pkt into list according to sequence number
* @list: list to be appended to
* @seqno: sequence number of buffer to add
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index f07b51e..c843fd2 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -631,14 +631,11 @@ static inline void msg_set_bc_netid(struct tipc_msg *m, u32 id)
static inline u32 msg_link_selector(struct tipc_msg *m)
{
+ if (msg_user(m) == MSG_FRAGMENTER)
+ m = (void *)msg_data(m);
return msg_bits(m, 4, 0, 1);
}
-static inline void msg_set_link_selector(struct tipc_msg *m, u32 n)
-{
- msg_set_bits(m, 4, 0, 1, n);
-}
-
/*
* Word 5
*/
@@ -835,6 +832,8 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
int offset, int dsz, int mtu, struct sk_buff_head *list);
bool tipc_msg_lookup_dest(struct net *net, struct sk_buff *skb, int *err);
bool tipc_msg_reassemble(struct sk_buff_head *list, struct sk_buff_head *rcvq);
+bool tipc_msg_pskb_copy(u32 dst, struct sk_buff_head *msg,
+ struct sk_buff_head *cpy);
void __tipc_skb_queue_sorted(struct sk_buff_head *list, u16 seqno,
struct sk_buff *skb);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 2883f6a..f96dacf 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1257,6 +1257,19 @@ void tipc_node_broadcast(struct net *net, struct sk_buff *skb)
kfree_skb(skb);
}
+static void tipc_node_mcast_rcv(struct tipc_node *n)
+{
+ struct tipc_bclink_entry *be = &n->bc_entry;
+
+ /* 'arrvq' is under inputq2's lock protection */
+ spin_lock_bh(&be->inputq2.lock);
+ spin_lock_bh(&be->inputq1.lock);
+ skb_queue_splice_tail_init(&be->inputq1, &be->arrvq);
+ spin_unlock_bh(&be->inputq1.lock);
+ spin_unlock_bh(&be->inputq2.lock);
+ tipc_sk_mcast_rcv(n->net, &be->arrvq, &be->inputq2);
+}
+
static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
int bearer_id, struct sk_buff_head *xmitq)
{
@@ -1330,15 +1343,8 @@ static void tipc_node_bc_rcv(struct net *net, struct sk_buff *skb, int bearer_id
if (!skb_queue_empty(&xmitq))
tipc_bearer_xmit(net, bearer_id, &xmitq, &le->maddr);
- /* Deliver. 'arrvq' is under inputq2's lock protection */
- if (!skb_queue_empty(&be->inputq1)) {
- spin_lock_bh(&be->inputq2.lock);
- spin_lock_bh(&be->inputq1.lock);
- skb_queue_splice_tail_init(&be->inputq1, &be->arrvq);
- spin_unlock_bh(&be->inputq1.lock);
- spin_unlock_bh(&be->inputq2.lock);
- tipc_sk_mcast_rcv(net, &be->arrvq, &be->inputq2);
- }
+ if (!skb_queue_empty(&be->inputq1))
+ tipc_node_mcast_rcv(n);
if (rc & TIPC_LINK_DOWN_EVT) {
/* Reception reassembly failure => reset all links to peer */
@@ -1565,6 +1571,9 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
if (unlikely(!skb_queue_empty(&n->bc_entry.namedq)))
tipc_named_rcv(net, &n->bc_entry.namedq);
+ if (unlikely(!skb_queue_empty(&n->bc_entry.inputq1)))
+ tipc_node_mcast_rcv(n);
+
if (!skb_queue_empty(&le->inputq))
tipc_sk_rcv(net, &le->inputq);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index d2f3539..93b6ae3 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -740,32 +740,43 @@ static int tipc_sendmcast(struct socket *sock, struct tipc_name_seq *seq,
struct tipc_msg *hdr = &tsk->phdr;
struct net *net = sock_net(sk);
int mtu = tipc_bcast_get_mtu(net);
+ u32 domain = addr_domain(net, TIPC_CLUSTER_SCOPE);
struct sk_buff_head pkts;
+ struct tipc_nlist dsts;
int rc;
+ /* Block or return if any destination link is congested */
rc = tipc_wait_for_cond(sock, &timeout, !tsk->cong_link_cnt);
if (unlikely(rc))
return rc;
+ /* Lookup destination nodes */
+ tipc_nlist_init(&dsts, tipc_own_addr(net));
+ tipc_nametbl_lookup_dst_nodes(net, seq->type, seq->lower,
+ seq->upper, domain, &dsts);
+ if (!dsts.local && !dsts.remote)
+ return -EHOSTUNREACH;
+
+ /* Build message header */
msg_set_type(hdr, TIPC_MCAST_MSG);
+ msg_set_hdr_sz(hdr, MCAST_H_SIZE);
msg_set_lookup_scope(hdr, TIPC_CLUSTER_SCOPE);
msg_set_destport(hdr, 0);
msg_set_destnode(hdr, 0);
msg_set_nametype(hdr, seq->type);
msg_set_namelower(hdr, seq->lower);
msg_set_nameupper(hdr, seq->upper);
- msg_set_hdr_sz(hdr, MCAST_H_SIZE);
+ /* Build message as chain of buffers */
skb_queue_head_init(&pkts);
rc = tipc_msg_build(hdr, msg, 0, dlen, mtu, &pkts);
- if (unlikely(rc != dlen))
- return rc;
- rc = tipc_bcast_xmit(net, &pkts);
- if (unlikely(rc == -ELINKCONG)) {
- tsk->cong_link_cnt = 1;
- rc = 0;
- }
+ /* Send message if build was successful */
+ if (unlikely(rc == dlen))
+ rc = tipc_mcast_xmit(net, &pkts, &dsts,
+ &tsk->cong_link_cnt);
+
+ tipc_nlist_purge(&dsts);
return rc ? rc : dlen;
}
--
2.7.4
* [net-next 4/4] tipc: make replicast a user selectable option
2017-01-18 18:50 [net-next 0/4] tipc: emulate multicast through replication Jon Maloy
` (2 preceding siblings ...)
2017-01-18 18:50 ` [net-next 3/4] tipc: introduce replicast as transport option for multicast Jon Maloy
@ 2017-01-18 18:50 ` Jon Maloy
2017-01-20 17:10 ` [net-next 0/4] tipc: emulate multicast through replication David Miller
4 siblings, 0 replies; 6+ messages in thread
From: Jon Maloy @ 2017-01-18 18:50 UTC (permalink / raw)
To: davem; +Cc: Jon Maloy, netdev, tipc-discussion
If the bearer carrying multicast messages supports broadcast, those
messages will be sent to all cluster nodes, irrespective of whether
these nodes host any actual destination sockets or not. This is clearly
wasteful if the cluster is large and there are only a few real
destinations for the message being sent.
In this commit we extend the eligibility of the newly introduced
"replicast" transmit option. We now make it possible for a user to
select which method is to be used, either as a mandatory setting
via setsockopt(), or as a relative setting where we let the broadcast
layer decide which method to use based on the ratio between cluster
size and the message's actual number of destination nodes.
In the latter case, a sending socket must stick to a previously
selected method until it enters an idle period of at least 5 seconds.
This eliminates the risk of message reordering caused by method change,
i.e., when changes to cluster size or number of destinations would
otherwise mandate a new method to be used.
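The selection rule and the 5 second stickiness can be modelled in
userspace roughly as follows. This is a sketch, not the kernel function:
the struct and function names are hypothetical, time is a plain
millisecond counter instead of jiffies, and the ratio is whatever the
caller supplies (the value 25 in the usage note is only an example).

```c
#include <assert.h>
#include <stdbool.h>

#define METHOD_EXPIRE_MS 5000UL	/* idle period before a method may change */

/* Hypothetical per-socket send method state. */
struct method_model {
	bool rcast;		/* true: replicast, false: broadcast */
	bool mandatory;		/* method was fixed by the user */
	unsigned long expires;	/* time until which the method is locked */
};

/* Hypothetical summary of the broadcast base. */
struct bcbase_model {
	bool bcast_support;	/* bearers in use support broadcast */
	bool rcast_support;	/* all peer nodes support replicast */
	int rc_ratio;		/* percentage of cluster size */
	int cluster_size;	/* number of peer nodes */
};

int model_bc_threshold(const struct bcbase_model *bb)
{
	assert(bb->rc_ratio >= 0 && bb->rc_ratio <= 100);
	return 1 + (bb->cluster_size * bb->rc_ratio / 100);
}

void model_select_method(const struct bcbase_model *bb, int dests,
			 struct method_model *m, unsigned long now_ms)
{
	unsigned long exp = m->expires;

	if (!bb->bcast_support) {	/* no broadcast: replicast only */
		m->rcast = true;
		return;
	}
	if (!bb->rcast_support) {	/* some peer lacks replicast */
		m->rcast = false;
		return;
	}
	/* Every send renews the lock; the method changes only after idling. */
	m->expires = now_ms + METHOD_EXPIRE_MS;
	if (m->mandatory || (long)(now_ms - exp) < 0)
		return;

	m->rcast = dests <= model_bc_threshold(bb);
}
```

With a cluster of 100 peers and a ratio of 25, the threshold is 26: a
message for 5 nodes selects replicast, and a later message for 90 nodes
keeps using replicast until the socket has been idle for the full expiry
period.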
Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
include/uapi/linux/tipc.h | 6 +++--
net/tipc/bcast.c | 62 ++++++++++++++++++++++++++++++++++++++++++-----
net/tipc/bcast.h | 17 ++++++++++++-
net/tipc/link.c | 4 +++
net/tipc/node.h | 4 ++-
net/tipc/socket.c | 36 +++++++++++++++++++++------
6 files changed, 112 insertions(+), 17 deletions(-)
diff --git a/include/uapi/linux/tipc.h b/include/uapi/linux/tipc.h
index bf049e8..5351b08 100644
--- a/include/uapi/linux/tipc.h
+++ b/include/uapi/linux/tipc.h
@@ -1,7 +1,7 @@
/*
* include/uapi/linux/tipc.h: Header for TIPC socket interface
*
- * Copyright (c) 2003-2006, Ericsson AB
+ * Copyright (c) 2003-2006, 2015-2016 Ericsson AB
* Copyright (c) 2005, 2010-2011, Wind River Systems
* All rights reserved.
*
@@ -220,7 +220,7 @@ struct sockaddr_tipc {
#define TIPC_DESTNAME 3 /* destination name */
/*
- * TIPC-specific socket option values
+ * TIPC-specific socket option names
*/
#define TIPC_IMPORTANCE 127 /* Default: TIPC_LOW_IMPORTANCE */
@@ -229,6 +229,8 @@ struct sockaddr_tipc {
#define TIPC_CONN_TIMEOUT 130 /* Default: 8000 (ms) */
#define TIPC_NODE_RECVQ_DEPTH 131 /* Default: none (read only) */
#define TIPC_SOCK_RECVQ_DEPTH 132 /* Default: none (read only) */
+#define TIPC_MCAST_BROADCAST 133 /* Default: TIPC selects. No arg */
+#define TIPC_MCAST_REPLICAST 134 /* Default: TIPC selects. No arg */
/*
* Maximum sizes of TIPC bearer-related names (including terminating NULL)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 672e6ef..7d99029 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -54,6 +54,9 @@ const char tipc_bclink_name[] = "broadcast-link";
* @dest: array keeping number of reachable destinations per bearer
* @primary_bearer: a bearer having links to all broadcast destinations, if any
* @bcast_support: indicates if primary bearer, if any, supports broadcast
+ * @rcast_support: indicates if all peer nodes support replicast
+ * @rc_ratio: dest count as percentage of cluster size where send method changes
+ * @bc_threshold: calculated from rc_ratio; if dests > threshold use broadcast
*/
struct tipc_bc_base {
struct tipc_link *link;
@@ -61,6 +64,9 @@ struct tipc_bc_base {
int dests[MAX_BEARERS];
int primary_bearer;
bool bcast_support;
+ bool rcast_support;
+ int rc_ratio;
+ int bc_threshold;
};
static struct tipc_bc_base *tipc_bc_base(struct net *net)
@@ -73,6 +79,19 @@ int tipc_bcast_get_mtu(struct net *net)
return tipc_link_mtu(tipc_bc_sndlink(net)) - INT_H_SIZE;
}
+void tipc_bcast_disable_rcast(struct net *net)
+{
+ tipc_bc_base(net)->rcast_support = false;
+}
+
+static void tipc_bcbase_calc_bc_threshold(struct net *net)
+{
+ struct tipc_bc_base *bb = tipc_bc_base(net);
+ int cluster_size = tipc_link_bc_peers(tipc_bc_sndlink(net));
+
+ bb->bc_threshold = 1 + (cluster_size * bb->rc_ratio / 100);
+}
+
/* tipc_bcbase_select_primary(): find a bearer with links to all destinations,
* if any, and make it primary bearer
*/
@@ -175,6 +194,31 @@ static void tipc_bcbase_xmit(struct net *net, struct sk_buff_head *xmitq)
__skb_queue_purge(&_xmitq);
}
+static void tipc_bcast_select_xmit_method(struct net *net, int dests,
+ struct tipc_mc_method *method)
+{
+ struct tipc_bc_base *bb = tipc_bc_base(net);
+ unsigned long exp = method->expires;
+
+ /* Broadcast supported by used bearer/bearers? */
+ if (!bb->bcast_support) {
+ method->rcast = true;
+ return;
+ }
+ /* Any destinations which don't support replicast ? */
+ if (!bb->rcast_support) {
+ method->rcast = false;
+ return;
+ }
+ /* Can current method be changed ? */
+ method->expires = jiffies + TIPC_METHOD_EXPIRE;
+ if (method->mandatory || time_before(jiffies, exp))
+ return;
+
+ /* Determine method to use now */
+ method->rcast = dests <= bb->bc_threshold;
+}
+
/* tipc_bcast_xmit - broadcast the buffer chain to all external nodes
* @net: the applicable net namespace
* @pkts: chain of buffers containing message
@@ -237,16 +281,16 @@ static int tipc_rcast_xmit(struct net *net, struct sk_buff_head *pkts,
* and to identified node local sockets
* @net: the applicable net namespace
* @pkts: chain of buffers containing message
- * @dests: destination nodes for message. Not consumed.
+ * @method: send method to be used
+ * @dests: destination nodes for message.
* @cong_link_cnt: returns number of encountered congested destination links
- * @cong_links: returns identities of congested links
* Consumes buffer chain.
* Returns 0 if success, otherwise errno
*/
int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
- struct tipc_nlist *dests, u16 *cong_link_cnt)
+ struct tipc_mc_method *method, struct tipc_nlist *dests,
+ u16 *cong_link_cnt)
{
- struct tipc_bc_base *bb = tipc_bc_base(net);
struct sk_buff_head inputq, localq;
int rc = 0;
@@ -258,9 +302,10 @@ int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
rc = -ENOMEM;
goto exit;
}
-
+ /* Send according to determined transmit method */
if (dests->remote) {
- if (!bb->bcast_support)
+ tipc_bcast_select_xmit_method(net, dests->remote, method);
+ if (method->rcast)
rc = tipc_rcast_xmit(net, pkts, dests, cong_link_cnt);
else
rc = tipc_bcast_xmit(net, pkts, cong_link_cnt);
@@ -269,6 +314,7 @@ int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
if (dests->local)
tipc_sk_mcast_rcv(net, &localq, &inputq);
exit:
+ /* This queue should normally be empty by now */
__skb_queue_purge(pkts);
return rc;
}
@@ -377,6 +423,7 @@ void tipc_bcast_add_peer(struct net *net, struct tipc_link *uc_l,
tipc_bcast_lock(net);
tipc_link_add_bc_peer(snd_l, uc_l, xmitq);
tipc_bcbase_select_primary(net);
+ tipc_bcbase_calc_bc_threshold(net);
tipc_bcast_unlock(net);
}
@@ -395,6 +442,7 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_l)
tipc_bcast_lock(net);
tipc_link_remove_bc_peer(snd_l, rcv_l, &xmitq);
tipc_bcbase_select_primary(net);
+ tipc_bcbase_calc_bc_threshold(net);
tipc_bcast_unlock(net);
tipc_bcbase_xmit(net, &xmitq);
@@ -477,6 +525,8 @@ int tipc_bcast_init(struct net *net)
goto enomem;
bb->link = l;
tn->bcl = l;
+ bb->rc_ratio = 25;
+ bb->rcast_support = true;
return 0;
enomem:
kfree(bb);
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index dd772e6..751530a 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -46,6 +46,8 @@ struct tipc_nlist;
struct tipc_nitem;
extern const char tipc_bclink_name[];
+#define TIPC_METHOD_EXPIRE msecs_to_jiffies(5000)
+
struct tipc_nlist {
struct list_head list;
u32 self;
@@ -58,6 +60,17 @@ void tipc_nlist_purge(struct tipc_nlist *nl);
void tipc_nlist_add(struct tipc_nlist *nl, u32 node);
void tipc_nlist_del(struct tipc_nlist *nl, u32 node);
+/* Cookie to be used between socket and broadcast layer
+ * @rcast: replicast (instead of broadcast) was used at previous xmit
+ * @mandatory: broadcast/replicast indication was set by user
+ * @expires: re-evaluate non-mandatory transmit method if we are past this
+ */
+struct tipc_mc_method {
+ bool rcast;
+ bool mandatory;
+ unsigned long expires;
+};
+
int tipc_bcast_init(struct net *net);
void tipc_bcast_stop(struct net *net);
void tipc_bcast_add_peer(struct net *net, struct tipc_link *l,
@@ -66,8 +79,10 @@ void tipc_bcast_remove_peer(struct net *net, struct tipc_link *rcv_bcl);
void tipc_bcast_inc_bearer_dst_cnt(struct net *net, int bearer_id);
void tipc_bcast_dec_bearer_dst_cnt(struct net *net, int bearer_id);
int tipc_bcast_get_mtu(struct net *net);
+void tipc_bcast_disable_rcast(struct net *net);
int tipc_mcast_xmit(struct net *net, struct sk_buff_head *pkts,
- struct tipc_nlist *dests, u16 *cong_link_cnt);
+ struct tipc_mc_method *method, struct tipc_nlist *dests,
+ u16 *cong_link_cnt);
int tipc_bcast_rcv(struct net *net, struct tipc_link *l, struct sk_buff *skb);
void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
diff --git a/net/tipc/link.c b/net/tipc/link.c
index b17b9e1..ddd2dd6f 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -515,6 +515,10 @@ bool tipc_link_bc_create(struct net *net, u32 ownnode, u32 peer,
if (link_is_bc_sndlink(l))
l->state = LINK_ESTABLISHED;
+ /* Disable replicast if even a single peer doesn't support it */
+ if (link_is_bc_rcvlink(l) && !(peer_caps & TIPC_BCAST_RCAST))
+ tipc_bcast_disable_rcast(net);
+
return true;
}
diff --git a/net/tipc/node.h b/net/tipc/node.h
index 39ef54c..898c229 100644
--- a/net/tipc/node.h
+++ b/net/tipc/node.h
@@ -47,11 +47,13 @@
enum {
TIPC_BCAST_SYNCH = (1 << 1),
TIPC_BCAST_STATE_NACK = (1 << 2),
- TIPC_BLOCK_FLOWCTL = (1 << 3)
+ TIPC_BLOCK_FLOWCTL = (1 << 3),
+ TIPC_BCAST_RCAST = (1 << 4)
};
#define TIPC_NODE_CAPABILITIES (TIPC_BCAST_SYNCH | \
TIPC_BCAST_STATE_NACK | \
+ TIPC_BCAST_RCAST | \
TIPC_BLOCK_FLOWCTL)
#define INVALID_BEARER_ID -1
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 93b6ae3..5bec8aa 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -79,6 +79,7 @@ enum {
* @rcv_unacked: # messages read by user, but not yet acked back to peer
* @peer: 'connected' peer for dgram/rdm
* @node: hash table node
+ * @mc_method: cookie for use between socket and broadcast layer
* @rcu: rcu struct for tipc_sock
*/
struct tipc_sock {
@@ -103,6 +104,7 @@ struct tipc_sock {
u16 rcv_win;
struct sockaddr_tipc peer;
struct rhash_head node;
+ struct tipc_mc_method mc_method;
struct rcu_head rcu;
};
@@ -740,6 +742,7 @@ static int tipc_sendmcast(struct socket *sock, struct tipc_name_seq *seq,
struct tipc_msg *hdr = &tsk->phdr;
struct net *net = sock_net(sk);
int mtu = tipc_bcast_get_mtu(net);
+ struct tipc_mc_method *method = &tsk->mc_method;
u32 domain = addr_domain(net, TIPC_CLUSTER_SCOPE);
struct sk_buff_head pkts;
struct tipc_nlist dsts;
@@ -773,7 +776,7 @@ static int tipc_sendmcast(struct socket *sock, struct tipc_name_seq *seq,
/* Send message if build was successful */
if (unlikely(rc == dlen))
- rc = tipc_mcast_xmit(net, &pkts, &dsts,
+ rc = tipc_mcast_xmit(net, &pkts, method, &dsts,
&tsk->cong_link_cnt);
tipc_nlist_purge(&dsts);
@@ -2344,18 +2347,29 @@ static int tipc_setsockopt(struct socket *sock, int lvl, int opt,
{
struct sock *sk = sock->sk;
struct tipc_sock *tsk = tipc_sk(sk);
- u32 value;
+ u32 value = 0;
int res;
if ((lvl == IPPROTO_TCP) && (sock->type == SOCK_STREAM))
return 0;
if (lvl != SOL_TIPC)
return -ENOPROTOOPT;
- if (ol < sizeof(value))
- return -EINVAL;
- res = get_user(value, (u32 __user *)ov);
- if (res)
- return res;
+
+ switch (opt) {
+ case TIPC_IMPORTANCE:
+ case TIPC_SRC_DROPPABLE:
+ case TIPC_DEST_DROPPABLE:
+ case TIPC_CONN_TIMEOUT:
+ if (ol < sizeof(value))
+ return -EINVAL;
+ res = get_user(value, (u32 __user *)ov);
+ if (res)
+ return res;
+ break;
+ default:
+ if (ov || ol)
+ return -EINVAL;
+ }
lock_sock(sk);
@@ -2376,6 +2390,14 @@ static int tipc_setsockopt(struct socket *sock, int lvl, int opt,
tipc_sk(sk)->conn_timeout = value;
/* no need to set "res", since already 0 at this point */
break;
+ case TIPC_MCAST_BROADCAST:
+ tsk->mc_method.rcast = false;
+ tsk->mc_method.mandatory = true;
+ break;
+ case TIPC_MCAST_REPLICAST:
+ tsk->mc_method.rcast = true;
+ tsk->mc_method.mandatory = true;
+ break;
default:
res = -EINVAL;
}
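The method selection added to net/tipc/bcast.c in this patch can be modeled in user space to see how the threshold and the expiry timer interact. The following is a minimal sketch, not kernel code: the names `mc_model`, `mc_method`, `calc_bc_threshold()` and `select_xmit_method()` are hypothetical stand-ins, time is an abstract tick counter instead of jiffies, and the plain `<` comparison ignores the counter wrap-around that the kernel's time_before() handles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the fields this patch adds to
 * struct tipc_bc_base; names are illustrative only.
 */
struct mc_model {
	bool bcast_support;	/* primary bearer supports broadcast */
	bool rcast_support;	/* all peers advertise replicast capability */
	int rc_ratio;		/* percent of cluster size (patch sets 25) */
	int bc_threshold;	/* derived from rc_ratio and cluster size */
};

/* Model of the per-socket cookie (struct tipc_mc_method in the patch) */
struct mc_method {
	bool rcast;
	bool mandatory;
	unsigned long expires;	/* abstract ticks here, jiffies in the kernel */
};

/* Assumption: one tick is one millisecond, mirroring the 5000 ms expiry */
#define METHOD_EXPIRE_TICKS 5000UL

/* Same arithmetic as tipc_bcbase_calc_bc_threshold() in the patch */
static void calc_bc_threshold(struct mc_model *m, int cluster_size)
{
	assert(cluster_size >= 0 && m->rc_ratio >= 0);
	m->bc_threshold = 1 + (cluster_size * m->rc_ratio / 100);
}

/* Same decision order as tipc_bcast_select_xmit_method() in the patch */
static void select_xmit_method(const struct mc_model *m, int dests,
			       struct mc_method *method, unsigned long now)
{
	unsigned long exp = method->expires;

	/* No broadcast on the bearer: replicast is the only choice */
	if (!m->bcast_support) {
		method->rcast = true;
		return;
	}
	/* Some peer lacks replicast support: must broadcast */
	if (!m->rcast_support) {
		method->rcast = false;
		return;
	}
	/* Push the expiry forward, then test against the previous value */
	method->expires = now + METHOD_EXPIRE_TICKS;
	if (method->mandatory || now < exp)
		return;

	/* Few destinations relative to cluster size: replicate */
	method->rcast = dests <= m->bc_threshold;
}
```

With rc_ratio at 25 and 40 peers the threshold comes out as 11, so a message with 5 remote destinations would be replicated. In this model, every send that reaches the expiry check extends the expiry, so the method is only re-evaluated once the previous deadline has passed.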
--
2.7.4
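From user space, the two new options would be set with setsockopt() at level SOL_TIPC. The sketch below assumes a libc whose headers may predate this patch, so the option values from the uapi hunk above are defined locally when missing; the helper name `tipc_set_mcast_method()` is hypothetical. Since the patched tipc_setsockopt() returns -EINVAL for these options when a value or length is supplied, the call passes NULL and 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

#ifndef SOL_TIPC
#define SOL_TIPC 271
#endif
#ifndef TIPC_MCAST_BROADCAST
#define TIPC_MCAST_BROADCAST 133	/* value from this patch's uapi hunk */
#endif
#ifndef TIPC_MCAST_REPLICAST
#define TIPC_MCAST_REPLICAST 134	/* value from this patch's uapi hunk */
#endif

/* Hypothetical helper: force the multicast transmit method on a TIPC
 * socket. Returns 0 on success or a negative errno value on failure.
 */
static int tipc_set_mcast_method(int sd, int opt)
{
	assert(opt == TIPC_MCAST_BROADCAST || opt == TIPC_MCAST_REPLICAST);

	/* The options take no argument: optval is NULL, optlen is 0 */
	if (setsockopt(sd, SOL_TIPC, opt, NULL, 0) < 0)
		return -errno;
	return 0;
}
```

A caller would typically invoke `tipc_set_mcast_method(sd, TIPC_MCAST_REPLICAST)` once after creating the socket and before the first multicast send. Note that nothing in this patch clears the mandatory flag again, so the choice appears to stay in force for the lifetime of the socket.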
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [net-next 0/4] tipc: emulate multicast through replication
2017-01-18 18:50 [net-next 0/4] tipc: emulate multicast through replication Jon Maloy
` (3 preceding siblings ...)
2017-01-18 18:50 ` [net-next 4/4] tipc: make replicast a user selectable option Jon Maloy
@ 2017-01-20 17:10 ` David Miller
4 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2017-01-20 17:10 UTC (permalink / raw)
To: jon.maloy; +Cc: netdev, tipc-discussion
From: Jon Maloy <jon.maloy@ericsson.com>
Date: Wed, 18 Jan 2017 13:50:49 -0500
> TIPC multicast messages are currently distributed via L2 broadcast
> or IP multicast to all nodes in the cluster, irrespective of the
> number of real destinations of the message.
>
> In this series we introduce an option to transport messages via
> replication ("replicast") across a selected number of unicast links,
> instead of relying on the underlying media. This option is used when
> true broadcast/multicast is not supported by the media, or when the
> number of true destinations is much smaller than the cluster size.
Series applied, thanks Jon.
^ permalink raw reply [flat|nested] 6+ messages in thread
Thread overview: 6+ messages
2017-01-18 18:50 [net-next 0/4] tipc: emulate multicast through replication Jon Maloy
2017-01-18 18:50 ` [net-next 1/4] tipc: add function for checking broadcast support in bearer Jon Maloy
2017-01-18 18:50 ` [net-next 2/4] tipc: add functionality to lookup multicast destination nodes Jon Maloy
2017-01-18 18:50 ` [net-next 3/4] tipc: introduce replicast as transport option for multicast Jon Maloy
2017-01-18 18:50 ` [net-next 4/4] tipc: make replicast a user selectable option Jon Maloy
2017-01-20 17:10 ` [net-next 0/4] tipc: emulate multicast through replication David Miller