Netdev List
* Re: [PATCH iproute2 1/3] ss: allow AF_FAMILY constants >32
From: Stephen Hemminger @ 2017-10-03 18:26 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: netdev, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171003175744.24987-2-stefanha@redhat.com>

On Tue,  3 Oct 2017 13:57:42 -0400
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> Linux has more than 32 address families defined in <bits/socket.h>.  Use
> a 64-bit type so all of them can be represented in the filter->families
> bitmask.
> 
> It's easy to introduce bugs when using (1 << AF_FAMILY) because the
> value is 32-bit.  This can produce incorrect results from bitmask
> operations so introduce the FAMILY_MASK() macro to eliminate these bugs.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  misc/ss.c | 54 ++++++++++++++++++++++++++++--------------------------
>  1 file changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/misc/ss.c b/misc/ss.c
> index dd8dfaa4..12a31c90 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -170,55 +170,57 @@ enum {
>  struct filter {
>  	int dbs;
>  	int states;
> -	int families;
> +	__u64 families;

Since this isn't a value that is coming from the kernel, it should be
uint64_t rather than __u64.
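
A minimal sketch of the point under discussion, assuming a FAMILY_MASK()
definition along these lines (the patch's actual macro body is not quoted
above, so this definition is a guess). EXAMPLE_AF_HIGH and family_is_set()
are hypothetical names used only for illustration. The cast widens the
constant to 64 bits before the shift, so families numbered 32 and above
land in distinct bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an address family numbered above 31. */
enum { EXAMPLE_AF_HIGH = 40 };

/* Widen to 64 bits *before* shifting; a plain (1 << family) is an int shift. */
#define FAMILY_MASK(family) ((uint64_t)1 << (family))

/* Return non-zero if 'family' is set in the 64-bit filter bitmask. */
static inline int family_is_set(uint64_t families, int family)
{
	assert(family >= 0 && family < 64); /* the bitmask only has 64 slots */
	return (families & FAMILY_MASK(family)) != 0;
}
```

Using uint64_t from <stdint.h> here matches the review comment: the mask
is purely a userspace value, not something exchanged with the kernel.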

^ permalink raw reply

* Re: [PATCH net-next v2 1/3] bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood
From: Stephen Hemminger @ 2017-10-03 18:29 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: davem, netdev, nikolay, bridge
In-Reply-To: <1507054876-16746-2-git-send-email-roopa@cumulusnetworks.com>

On Tue,  3 Oct 2017 11:21:14 -0700
Roopa Prabhu <roopa@cumulusnetworks.com> wrote:

> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
> index 48fb174..7a50dc5 100644
> --- a/net/bridge/br_forward.c
> +++ b/net/bridge/br_forward.c
> @@ -204,7 +204,8 @@ void br_flood(struct net_bridge *br, struct sk_buff *skb,
>  		/* Do not flood to ports that enable proxy ARP */
>  		if (p->flags & BR_PROXYARP)
>  			continue;
> -		if ((p->flags & BR_PROXYARP_WIFI) &&
> +		if ((p->flags & BR_PROXYARP_WIFI ||
> +		     p->flags & BR_NEIGH_SUPPRESS) &&
>  		    BR_INPUT_SKB_CB(skb)->proxyarp_replied)
>  			continue;

Don't you need additional parens here to avoid warnings?
Or do one mask:
		if ((p->flags & (BR_PROXYARP_WIFI | BR_NEIGH_SUPPRESS)) &&
		    BR_INPUT_SKB_CB(skb)->proxyarp_replied)
			continue;
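
A small sketch showing that the two spellings are equivalent, assuming
illustrative flag values (the EX_* bits below are made up and are not the
kernel's BR_* definitions). The helper names are hypothetical as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; the real BR_* values live in kernel headers. */
#define EX_PROXYARP_WIFI   (1UL << 0)
#define EX_NEIGH_SUPPRESS  (1UL << 1)

/* Two separate bit tests joined with ||, each fully parenthesised. */
static bool skip_two_tests(unsigned long flags, bool replied)
{
	return ((flags & EX_PROXYARP_WIFI) || (flags & EX_NEIGH_SUPPRESS)) &&
	       replied;
}

/* One test against the combined mask. */
static bool skip_one_mask(unsigned long flags, bool replied)
{
	return (flags & (EX_PROXYARP_WIFI | EX_NEIGH_SUPPRESS)) && replied;
}

/* Exhaustively compare both forms over every flag/replied combination. */
static void check_equivalence(void)
{
	for (unsigned long flags = 0; flags < 4; flags++)
		for (int replied = 0; replied < 2; replied++)
			assert(skip_two_tests(flags, replied) ==
			       skip_one_mask(flags, replied));
}
```

The single-mask form has one fewer operator for the reader to untangle,
which is presumably why it was suggested.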

^ permalink raw reply

* Re: [PATCH] net: phy: DP83822 initial driver submission
From: Florian Fainelli @ 2017-10-03 18:31 UTC (permalink / raw)
  To: Dan Murphy, andrew; +Cc: netdev
In-Reply-To: <ccab5880-eace-503c-d325-d20867b98bd5@ti.com>

On 10/03/2017 11:03 AM, Dan Murphy wrote:
> Florian
> 
> Thanks for the review
> 
> On 10/03/2017 12:15 PM, Florian Fainelli wrote:
>>> +		} else {
>>> +			value &= ~DP83822_WOL_SECURE_ON;
>>> +		}
>>> +
>>> +		value |= (DP83822_WOL_EN | DP83822_WOL_CLR_INDICATION |
>>> +			  DP83822_WOL_CLR_INDICATION);
>>
>> The extra parenthesis should not be required here.
> 
> I did not code that in.  I had to add it after Checkpatch cribbed about it.
> Let me know if you want me to remove it.

Let's keep those, that does not change much.

> 
>>
>>> +		phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG,
>>> +			      value);
>>> +	} else {
>>> +		value =
>>> +		    phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>>> +		value &= (~DP83822_WOL_EN);
>>
>> Same here, parenthesis should not be needed.
> 
> There are three lines of code in the else.  This code all needs to be executed in the else case.
> I might reformat it to read better.  Lindent messed that one up.

Sorry, I meant to write that you don't need the parentheses around
~DP83822_WOL_EN since that is just a single bit here.

[snip]

>>> +
>>> +	mutex_unlock(&phydev->lock);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int dp83822_resume(struct phy_device *phydev)
>>> +{
>>> +	int value;
>>> +
>>> +	mutex_lock(&phydev->lock);
>>> +
>>> +	value = phy_read(phydev, MII_BMCR);
>>> +	phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN);
>>
>> And genphy_resume() here as well?
> 
> genphy_resume does not have WoL.

I should have been clearer: I meant using genphy_{suspend,resume} to
avoid open coding the setting of the BMCR_PDOWN bit and, conversely, the
clearing of that bit. Because of the locking, maybe you could introduce
unlocked versions of these two routines, or you could acquire and release
the lock outside of genphy_{suspend,resume}?
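
To make the locked/unlocked split concrete, here is a userspace sketch
only. It assumes a pthread mutex in place of phydev->lock and plain struct
fields in place of the PHY registers; every ex_* name and the
EX_WOL_CLR_INDICATION bit position are hypothetical. Only the 0x0800 value
of the power-down bit matches BMCR_PDOWN in <linux/mii.h>.

```c
#include <pthread.h>
#include <stdint.h>

#define EX_BMCR_PDOWN          0x0800u    /* same value as BMCR_PDOWN */
#define EX_WOL_CLR_INDICATION  (1u << 11) /* hypothetical bit position */

struct ex_phy {
	pthread_mutex_t lock;
	uint16_t bmcr;    /* stand-in for the MII_BMCR register */
	uint16_t wol_cfg; /* stand-in for the WoL config register */
};

/* Unlocked variants: the caller must already hold phy->lock. */
static void ex_suspend_unlocked(struct ex_phy *phy)
{
	phy->bmcr |= EX_BMCR_PDOWN;
}

static void ex_resume_unlocked(struct ex_phy *phy)
{
	phy->bmcr &= (uint16_t)~EX_BMCR_PDOWN;
}

/* Locked wrappers for callers that need nothing else under the lock. */
static void ex_suspend(struct ex_phy *phy)
{
	pthread_mutex_lock(&phy->lock);
	ex_suspend_unlocked(phy);
	pthread_mutex_unlock(&phy->lock);
}

static void ex_resume(struct ex_phy *phy)
{
	pthread_mutex_lock(&phy->lock);
	ex_resume_unlocked(phy);
	pthread_mutex_unlock(&phy->lock);
}

/* Driver-specific resume: reuse the helper, add WoL work, one lock hold. */
static void ex_driver_resume(struct ex_phy *phy)
{
	pthread_mutex_lock(&phy->lock);
	ex_resume_unlocked(phy);
	phy->wol_cfg |= EX_WOL_CLR_INDICATION;
	pthread_mutex_unlock(&phy->lock);
}
```

The driver-specific path then shares the bit manipulation with the generic
one instead of repeating it, while still doing its extra register write
under the same lock acquisition.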

> 
>>
>>> +
>>> +	value = phy_read_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG);
>>> +
>>> +	phy_write_mmd(phydev, DP83822_DEVADDR, MII_DP83822_WOL_CFG, value |
>>> +		      DP83822_WOL_CLR_INDICATION);
>>> +
>>> +	mutex_unlock(&phydev->lock);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static struct phy_driver dp83822_driver[] = {
>>> +	{
>>> +	 .phy_id = DP83822_PHY_ID,
>>> +	 .phy_id_mask = 0xfffffff0,
>>> +	 .name = "TI DP83822",
>>> +	 .features = PHY_BASIC_FEATURES,
>>> +	 .flags = PHY_HAS_INTERRUPT,
>>> +
>>> +	 .config_init = genphy_config_init,
>>> +	 .soft_reset = dp83822_phy_reset,
>>> +
>>> +	 .get_wol = dp83822_get_wol,
>>> +	 .set_wol = dp83822_set_wol,
>>> +
>>> +	 /* IRQ related */
>>> +	 .ack_interrupt = dp83822_ack_interrupt,
>>> +	 .config_intr = dp83822_config_intr,
>>> +
>>> +	 .config_aneg = genphy_config_aneg,
>>> +	 .read_status = genphy_read_status,
>>> +	 .suspend = dp83822_suspend,
>>> +	 .resume = dp83822_resume,
>>> +	 },
>>
>> I would omit newlines between definitions of callbacks, but this is
>> really a personal preference. Unless you are planning on adding new IDs,
>> you could also avoid using an array of 1 element and just a plain
>> phy_driver structure, but that's not a big deal either.
> 
> Yes there is a plan to add another phy id in early 2018 to this driver.

Alright then!
-- 
Florian

^ permalink raw reply

* Re: [PATCH v4 net-next 0/8] flow_dissector: Protocol specific flow dissector offload
From: Tom Herbert @ 2017-10-03 18:35 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Tom Herbert, David Miller, Hannes Frederic Sowa,
	Linux Kernel Network Developers, Rohit Seth
In-Reply-To: <20171003074632.GD1916@nanopsycho>

On Tue, Oct 3, 2017 at 12:46 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Sep 29, 2017 at 07:59:35PM CEST, tom@herbertland.com wrote:
>>On Fri, Sep 29, 2017 at 10:42 AM, David Miller <davem@davemloft.net> wrote:
>>> From: Tom Herbert <tom@herbertland.com>
>>> Date: Fri, 29 Sep 2017 08:48:55 -0700
>>>
>>>> The flow_dissector interface is not a uAPI.
>>>
>>> That's not true, insofar as cls_flower.c uses the flow_dissector
>>> therefore if you change the flow_dissector in certain ways then
>>> cls_flower.c might have its behavior changed and that is in fact UAPI
>>> facing.
>>
>>Then I would suggest adding another flag like FLOW_DISSECTOR_F_FLOWER
>>and when anyone puts new code into flow_dissector they can wrap it
>>with "if !(flags & FLOW_DISSECTOR_F_FLOWER)". If the flower uAPI is
>>subsequently updated then the conditional can be removed. This way
>>flower can maintain its APIs, but we can still extend
>>and improve flow_dissector for other use cases.
>
> This is not flower-specific problem. Flow_dissector is a servant of many.

Besides flower, what other use cases of flow_dissector have made
flow_dissector interface a uAPI? Any use of hashing does not do this.
Maybe OVS does?

> As such, it is instructed what should it do. If you want to
> change the way inner headers are parsed, you should either:

Why would that only affect the way inner headers are parsed? Wouldn't
we need to consider any change to flow_dissector that might affect the
output in any way? For instance, the depth limits I added would change
the output for someone that was parsing thirty-five layers of
encapsulation, so it looks like that feature needs a flag. What if
someone adds a new Ethernet protocol or a new encap protocol?

> 1) change the callers so they are behaving the same as before
> 2) make the flow_dissection change optional so the caller can say if he
>    wants original or new behaviour.

I guess we can do that, but I am concerned about the overhead this will
generate if we're adding a flag each time anyone modifies the function.
There are performance critical use cases of flow_dissector that will
be impacted by such changes.
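
To illustrate what "a flag per change" could look like, here is a toy
sketch. It is not the flow_dissector code: EX_F_LEGACY_OUTPUT,
EX_MAX_ENCAP_DEPTH and ex_dissect_depth() are hypothetical names, and the
limit of 5 is an arbitrary value chosen for the example.

```c
/* Hypothetical flag: the caller asks for the pre-change behaviour. */
#define EX_F_LEGACY_OUTPUT  (1U << 0)
/* Hypothetical encapsulation depth limit introduced by a later change. */
#define EX_MAX_ENCAP_DEPTH  5

/*
 * Walk 'layers' levels of encapsulation and return how many were parsed.
 * The depth limit applies only when the caller did not request legacy
 * behaviour, so existing callers keep seeing the old output.
 */
static unsigned int ex_dissect_depth(unsigned int layers, unsigned int flags)
{
	unsigned int depth = 0;

	while (depth < layers) {
		/* Extra test on every iteration: the overhead in question. */
		if (!(flags & EX_F_LEGACY_OUTPUT) &&
		    depth >= EX_MAX_ENCAP_DEPTH)
			break;
		depth++;
	}
	return depth;
}
```

With thirty-five layers, a legacy caller still gets 35 while a new caller
gets 5. The cost is one more conditional in the loop for each such flag.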

Tom


>

^ permalink raw reply

* Re: [RFC 1/2] bpf: move instruction printing into a separate file
From: Daniel Borkmann @ 2017-10-03 19:32 UTC (permalink / raw)
  To: Jakub Kicinski, dsahern, alexei.starovoitov
  Cc: netdev, oss-drivers, david.beckett
In-Reply-To: <20171003175746.30145-1-jakub.kicinski@netronome.com>

On 10/03/2017 07:57 PM, Jakub Kicinski wrote:
> Separate the instruction printing into a standalone source file.
> This way sneaky code from tools/ can use it directly.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> Like this?

Looks good to me, yes.

^ permalink raw reply

* Re: [PATCH v2 net-next 06/12] qed: Add LL2 slowpath handling
From: Kalderon, Michal @ 2017-10-03 19:48 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Elior, Ariel
In-Reply-To: <20171003132632.GB25829-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>

From: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Sent: Tuesday, October 3, 2017 4:26 PM

>On Tue, Oct 03, 2017 at 11:54:56AM +0300, Michal Kalderon wrote:
>> For iWARP unaligned MPA flow, a slowpath event of flushing an
>> MPA connection that entered an unaligned state is required.
>> The flush ramrod is received on the ll2 queue, and a pre-registered
>> callback function is called to handle the flush event.
>>
>> Signed-off-by: Michal Kalderon <Michal.Kalderon-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
>> Signed-off-by: Ariel Elior <Ariel.Elior-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>
>> ---
>>  drivers/net/ethernet/qlogic/qed/qed_ll2.c | 40 +++++++++++++++++++++++++++++--
>>  include/linux/qed/qed_ll2_if.h            |  5 ++++
>>  2 files changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
>> index 8eb9645..047f556 100644
>> --- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
>> +++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
>> @@ -423,6 +423,41 @@ static void qed_ll2_rxq_parse_reg(struct qed_hwfn *p_hwfn,
>>  }
>>
>>  static int
>> +qed_ll2_handle_slowpath(struct qed_hwfn *p_hwfn,
>> +                     struct qed_ll2_info *p_ll2_conn,
>> +                     union core_rx_cqe_union *p_cqe,
>> +                     unsigned long *p_lock_flags)
>> +{
>> +     struct qed_ll2_rx_queue *p_rx = &p_ll2_conn->rx_queue;
>> +     struct core_rx_slow_path_cqe *sp_cqe;
>> +
>> +     sp_cqe = &p_cqe->rx_cqe_sp;
>> +     if (sp_cqe->ramrod_cmd_id != CORE_RAMROD_RX_QUEUE_FLUSH) {
>> +             DP_NOTICE(p_hwfn,
>> +                       "LL2 - unexpected Rx CQE slowpath ramrod_cmd_id:%d\n",
>> +                       sp_cqe->ramrod_cmd_id);
>> +             return -EINVAL;
>> +     }
>> +
>> +     if (!p_ll2_conn->cbs.slowpath_cb) {
>> +             DP_NOTICE(p_hwfn,
>> +                       "LL2 - received RX_QUEUE_FLUSH but no callback was provided\n");
>> +             return -EINVAL;
>> +     }
>> +
>> +     spin_unlock_irqrestore(&p_rx->lock, *p_lock_flags);
>
>Interesting, you are unlocking a lock which was taken in the upper layer.
>It is not an actual error, but the chances of introducing such an error
>are pretty high (for example, after refactoring).

Thanks. Ensuring that the lock will only be unlocked inside the calling function would make
the calling function long and less readable.
The risk exists, but I think the fact that p_lock_flags is passed as a parameter should
give a strong indication in the future that the lock should be handled delicately.

^ permalink raw reply

* [PATCH] nfp: convert nfp_eth_set_bit_config() into a macro
From: Matthias Kaehlcke @ 2017-10-03 20:05 UTC (permalink / raw)
  To: Jakub Kicinski, David S . Miller, Simon Horman,
	Dirk van der Merwe
  Cc: oss-drivers, netdev, linux-kernel, Renato Golin, Manoj Gupta,
	Guenter Roeck, Doug Anderson, Matthias Kaehlcke

nfp_eth_set_bit_config() is marked as __always_inline to allow gcc to
identify the 'mask' parameter as known to be constant at compile time,
which is required to use the FIELD_GET() macro.

The forced inlining does the trick for gcc, but for kernel builds with
clang it results in undefined symbols:

drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.o: In function
  `__nfp_eth_set_aneg':
drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c:(.text+0x787):
  undefined reference to `__compiletime_assert_492'
drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c:(.text+0x7b1):
  undefined reference to `__compiletime_assert_496'

These __compiletime_assert_xyx() calls would have been optimized away if
the compiler had seen 'mask' as a constant.

Convert nfp_eth_set_bit_config() into a macro, which allows both gcc and
clang to identify 'mask' as a compile time constant.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
I am aware that a lengthy macro is not a pretty solution; I'm open to
better suggestions.

Note: The patch has been build-tested only since I don't have any NFP
hardware.

 .../ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c   | 67 +++++++++++-----------
 1 file changed, 34 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c
index f6f7c085f8e0..e9c635867918 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c
@@ -469,39 +469,40 @@ int nfp_eth_set_configured(struct nfp_cpp *cpp, unsigned int idx, bool configed)
 	return nfp_eth_config_commit_end(nsp);
 }
 
-/* Force inline, FIELD_* macroes require masks to be compilation-time known */
-static __always_inline int
-nfp_eth_set_bit_config(struct nfp_nsp *nsp, unsigned int raw_idx,
-		       const u64 mask, unsigned int val, const u64 ctrl_bit)
-{
-	union eth_table_entry *entries = nfp_nsp_config_entries(nsp);
-	unsigned int idx = nfp_nsp_config_idx(nsp);
-	u64 reg;
-
-	/* Note: set features were added in ABI 0.14 but the error
-	 *	 codes were initially not populated correctly.
-	 */
-	if (nfp_nsp_get_abi_ver_minor(nsp) < 17) {
-		nfp_err(nfp_nsp_cpp(nsp),
-			"set operations not supported, please update flash\n");
-		return -EOPNOTSUPP;
-	}
-
-	/* Check if we are already in requested state */
-	reg = le64_to_cpu(entries[idx].raw[raw_idx]);
-	if (val == FIELD_GET(mask, reg))
-		return 0;
-
-	reg &= ~mask;
-	reg |= FIELD_PREP(mask, val);
-	entries[idx].raw[raw_idx] = cpu_to_le64(reg);
-
-	entries[idx].control |= cpu_to_le64(ctrl_bit);
-
-	nfp_nsp_config_set_modified(nsp, true);
-
-	return 0;
-}
+#define nfp_eth_set_bit_config(nsp, raw_idx, mask, val, ctrl_bit)	\
+({									\
+	union eth_table_entry *entries = nfp_nsp_config_entries(nsp);	\
+	unsigned int idx = nfp_nsp_config_idx(nsp);			\
+	u64 reg;							\
+	int rc;								\
+									\
+	/* Note: set features were added in ABI 0.14 but the error */	\
+	/*	 codes were initially not populated correctly.	   */	\
+	if (nfp_nsp_get_abi_ver_minor(nsp) < 17) {			\
+		nfp_err(nfp_nsp_cpp(nsp),				\
+			"set operations not supported, please update flash\n"); \
+		rc = -EOPNOTSUPP;					\
+		goto out;						\
+	}								\
+									\
+	rc = 0;								\
+									\
+	/* Check if we are already in requested state */		\
+	reg = le64_to_cpu(entries[idx].raw[raw_idx]);			\
+	if (val == FIELD_GET(mask, reg))				\
+		goto out;						\
+									\
+	reg &= ~mask;							\
+	reg |= FIELD_PREP(mask, val);					\
+	entries[idx].raw[raw_idx] = cpu_to_le64(reg);			\
+									\
+	entries[idx].control |= cpu_to_le64(ctrl_bit);			\
+									\
+	nfp_nsp_config_set_modified(nsp, true);				\
+									\
+out:									\
+	rc;								\
+})
 
 /**
  * __nfp_eth_set_aneg() - set PHY autonegotiation control bit
-- 
2.14.2.920.gcf0c67979c-goog

^ permalink raw reply related

* [PATCH] net: 8021q: skip packets if the vlan is down
From: Vishakha Narvekar @ 2017-10-03 20:13 UTC (permalink / raw)
  To: netdev; +Cc: allen.hubbe, andrew.boyer, Vishakha Narvekar, David S. Miller

If the vlan is down, free the packet instead of proceeding with other
processing, or counting it as received.  If vlan interfaces are used
as slaves for bonding, with arp monitoring for connectivity, if the rx
counter is seen to be incrementing, then the bond device will not
observe that the interface is down.

CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Vishakha Narvekar <Vishakha.Narvekar@dell.com>
---
I don't know if this is the appropriate change, or if it is supposed to
work as before.  This change seemed to fix the behavior for bonding.

 net/8021q/vlan_core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e2ed698..0bc31de 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -21,6 +21,12 @@ bool vlan_do_receive(struct sk_buff **skbp)
 	if (unlikely(!skb))
 		return false;
 
+	if (unlikely(!(vlan_dev->flags & IFF_UP))) {
+		kfree_skb(skb);
+		*skbp = NULL;
+		return false;
+	}
+
 	skb->dev = vlan_dev;
 	if (unlikely(skb->pkt_type == PACKET_OTHERHOST)) {
 		/* Our lower layer thinks this is not local, let's make sure.
-- 
1.8.3.1

^ permalink raw reply related

* Re: [RFC 1/2] bpf: move instruction printing into a separate file
From: Alexei Starovoitov @ 2017-10-03 20:14 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: daniel, dsahern, netdev, oss-drivers, david.beckett
In-Reply-To: <20171003175746.30145-1-jakub.kicinski@netronome.com>

On Tue, Oct 03, 2017 at 10:57:45AM -0700, Jakub Kicinski wrote:
> Separate the instruction printing into a standalone source file.
> This way sneaky code from tools/ can use it directly.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> Like this?
...
> +static void print_bpf_end_insn(void (*verbose)(const char *, ...),
> +			       const struct bpf_insn *insn)
> +{
> +	verbose("(%02x) r%d = %s%d r%d\n", insn->code, insn->dst_reg,
> +		BPF_SRC(insn->code) == BPF_TO_BE ? "be" : "le",
> +		insn->imm, insn->dst_reg);
> +}
...
> +			print_bpf_insn(verbose, insn, env->allow_ptr_leaks);

Since you're changing it, please please please kill that global verbose() ugliness.
It's been on the todo list for a long time.
iirc that's the only thing that prevents us from removing the global bpf_verifier_lock.
If we don't do it as part of this change, we'd need another one in the future
with an equal amount of changed lines, so let's do it now.
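
One possible shape for this, sketched in userspace under the assumption
that the print callback gains an explicit context argument in place of
global state. The ex_* names and the struct ex_log buffer are hypothetical
and not taken from the verifier. The format string is the one from the
quoted print_bpf_end_insn().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-caller log state, replacing a global buffer. */
struct ex_log {
	char buf[128];
	size_t len;
};

/* Print callback that carries its own context instead of using globals. */
typedef void (*ex_print_cb)(void *ctx, const char *fmt, ...);

/* Append formatted text to the ex_log passed in as 'ctx'. */
static void ex_log_print(void *ctx, const char *fmt, ...)
{
	struct ex_log *log = ctx;
	size_t room = sizeof(log->buf) - log->len - 1; /* keep space for NUL */
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(log->buf + log->len, room + 1, fmt, args);
	va_end(args);
	if (n > 0)
		log->len += (size_t)n < room ? (size_t)n : room;
}

/* Context-passing variant of the endianness-conversion printer. */
static void ex_print_end_insn(ex_print_cb print, void *ctx, unsigned int code,
			      int dst_reg, int imm, int to_be)
{
	print(ctx, "(%02x) r%d = %s%d r%d\n", code, dst_reg,
	      to_be ? "be" : "le", imm, dst_reg);
}
```

Because each caller owns its struct ex_log, two callers can print
concurrently without a shared lock, which is the property being asked for.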

^ permalink raw reply

* Re: [PATCH net-next 2/3] bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports
From: kbuild test robot @ 2017-10-03 20:14 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: kbuild-all, davem, netdev, nikolay, stephen, bridge
In-Reply-To: <1506919018-27875-3-git-send-email-roopa@cumulusnetworks.com>

[-- Attachment #1: Type: text/plain, Size: 1173 bytes --]

Hi Roopa,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Roopa-Prabhu/bridge-neigh-msg-proxy-and-flood-suppression-support/20171003-124610
config: x86_64-randconfig-i0-10030107 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   net/bridge/br_arp_nd_proxy.o: In function `br_chk_addr_ip':
>> br_arp_nd_proxy.c:(.text+0x5c): undefined reference to `inet_confirm_addr'
   net/bridge/br_arp_nd_proxy.o: In function `br_do_proxy_suppress_arp':
>> br_arp_nd_proxy.c:(.text+0x528): undefined reference to `arp_tbl'
   net/bridge/br_arp_nd_proxy.o: In function `br_arp_send.constprop.4':
>> br_arp_nd_proxy.c:(.text.unlikely+0x7d): undefined reference to `arp_send'
>> br_arp_nd_proxy.c:(.text.unlikely+0xaa): undefined reference to `arp_create'
>> br_arp_nd_proxy.c:(.text.unlikely+0x392): undefined reference to `arp_xmit'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32872 bytes --]

^ permalink raw reply

* Re: [PATCH net-next v2 0/3] tools: add bpftool
From: Arnaldo Carvalho de Melo @ 2017-10-03 20:19 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, daniel, alexei.starovoitov, oss-drivers
In-Reply-To: <20171002231130.12406-1-jakub.kicinski@netronome.com>

Em Mon, Oct 02, 2017 at 04:11:27PM -0700, Jakub Kicinski escreveu:
> Hi!
> 
> This set adds bpftool to the tools/ directory.  The first 
> patch renames tools/net to tools/bpf, the second one adds 
> the new code, while the third adds simple documentation.
> 
> v2:
>  - report names, map ids, load time, uid;
>  - add docs/man pages;
>  - general cleanups & fixes.
> 
> Thanks to David Beckett for help with docs and testing.

Why not call it just 'bpf'?

- Arnaldo
 
> Jakub Kicinski (3):
>   tools: rename tools/net directory to tools/bpf
>   tools: bpf: add bpftool
>   tools: bpftool: add documentation
> 
>  MAINTAINERS                                      |   3 +-
>  tools/Makefile                                   |  14 +-
>  tools/{net => bpf}/Makefile                      |  18 +-
>  tools/{net => bpf}/bpf_asm.c                     |   0
>  tools/{net => bpf}/bpf_dbg.c                     |   0
>  tools/{net => bpf}/bpf_exp.l                     |   0
>  tools/{net => bpf}/bpf_exp.y                     |   0
>  tools/{net => bpf}/bpf_jit_disasm.c              |   0
>  tools/bpf/bpftool/Documentation/Makefile         |  34 ++
>  tools/bpf/bpftool/Documentation/bpftool-map.txt  | 110 ++++
>  tools/bpf/bpftool/Documentation/bpftool-prog.txt |  81 +++
>  tools/bpf/bpftool/Documentation/bpftool.txt      |  34 ++
>  tools/bpf/bpftool/Makefile                       |  86 +++
>  tools/bpf/bpftool/common.c                       | 215 +++++++
>  tools/bpf/bpftool/jit_disasm.c                   |  87 +++
>  tools/bpf/bpftool/main.c                         | 212 +++++++
>  tools/bpf/bpftool/main.h                         |  99 +++
>  tools/bpf/bpftool/map.c                          | 744 +++++++++++++++++++++++
>  tools/bpf/bpftool/prog.c                         | 427 +++++++++++++
>  19 files changed, 2152 insertions(+), 12 deletions(-)
>  rename tools/{net => bpf}/Makefile (74%)
>  rename tools/{net => bpf}/bpf_asm.c (100%)
>  rename tools/{net => bpf}/bpf_dbg.c (100%)
>  rename tools/{net => bpf}/bpf_exp.l (100%)
>  rename tools/{net => bpf}/bpf_exp.y (100%)
>  rename tools/{net => bpf}/bpf_jit_disasm.c (100%)
>  create mode 100644 tools/bpf/bpftool/Documentation/Makefile
>  create mode 100644 tools/bpf/bpftool/Documentation/bpftool-map.txt
>  create mode 100644 tools/bpf/bpftool/Documentation/bpftool-prog.txt
>  create mode 100644 tools/bpf/bpftool/Documentation/bpftool.txt
>  create mode 100644 tools/bpf/bpftool/Makefile
>  create mode 100644 tools/bpf/bpftool/common.c
>  create mode 100644 tools/bpf/bpftool/jit_disasm.c
>  create mode 100644 tools/bpf/bpftool/main.c
>  create mode 100644 tools/bpf/bpftool/main.h
>  create mode 100644 tools/bpf/bpftool/map.c
>  create mode 100644 tools/bpf/bpftool/prog.c
> 
> -- 
> 2.14.1

^ permalink raw reply

* Re: [PATCH net v1 1/2] ARM: dts: imx: let's name the ptp interrupt for the fec ethernet driver
From: Troy Kisky @ 2017-10-03 20:41 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: shawn.guo, fugang.duan, netdev, davem, fabio.estevam, lznuaa
In-Reply-To: <20171003005148.GA24147@lunn.ch>

On 10/2/2017 5:51 PM, Andrew Lunn wrote:
> On Mon, Oct 02, 2017 at 05:04:41PM -0700, Troy Kisky wrote:
>> imx7s/imx7d has the ptp interrupt newly added as well.
>> This will allow the ptp interrupt to have its own interrupt routine.
>>
>> Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
>> ---
>>  arch/arm/boot/dts/imx6qdl.dtsi | 1 +
>>  arch/arm/boot/dts/imx6sx.dtsi  | 2 ++
>>  arch/arm/boot/dts/imx6ul.dtsi  | 2 ++
>>  arch/arm/boot/dts/imx7d.dtsi   | 4 +++-
>>  arch/arm/boot/dts/imx7s.dtsi   | 4 +++-
>>  5 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
>> index 8884b4a3cafb..d848d2bfe8e2 100644
>> --- a/arch/arm/boot/dts/imx6qdl.dtsi
>> +++ b/arch/arm/boot/dts/imx6qdl.dtsi
>> @@ -1017,6 +1017,7 @@
>>  			fec: ethernet@02188000 {
>>  				compatible = "fsl,imx6q-fec";
>>  				reg = <0x02188000 0x4000>;
>> +				interrupt-names = "","ptp";
> 
> Hi Troy
> 
> The "" looks a bit odd. Can you use a name here?
> 
>     Andrew
> 

Sure. Can I use "q0", "q1", "q2", and look them up by name in fec_main
as well?

Should I worry about compatibility with old dtbs?
I could look up by number if the name fails. Maybe with a
deprecation warning?


Thanks
Troy

^ permalink raw reply

* Re: [PATCH] bridge: Fix format string for %ul
From: Stephen Hemminger @ 2017-10-03 20:43 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: David S. Miller, bridge, netdev, linux-kernel
In-Reply-To: <1472267428-810527-1-git-send-email-green@linuxhacker.ru>

On Fri, 26 Aug 2016 23:10:28 -0400
Oleg Drokin <green@linuxhacker.ru> wrote:

> %ul would print an unsigned value and a letter l,
> likely it was %lu that was meant to print the long int,
> but in reality the values printed there are just regular signed
> ints, so just dropping the l altogether.
> 
> Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
> ---
>  net/bridge/br_stp_bpdu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/bridge/br_stp_bpdu.c b/net/bridge/br_stp_bpdu.c
> index 5881fbc..15c4a9c 100644
> --- a/net/bridge/br_stp_bpdu.c
> +++ b/net/bridge/br_stp_bpdu.c
> @@ -230,7 +230,7 @@ void br_stp_rcv(const struct stp_proto *proto, struct sk_buff *skb,
>  			if (net_ratelimit())
>  				br_notice(p->br,
>  					  "port %u config from %pM"
> -					  " (message_age %ul > max_age %ul)\n",
> +					  " (message_age %u > max_age %u)\n",
>  					  p->port_no,
>  					  eth_hdr(skb)->h_source,
>  					  bpdu.message_age, bpdu.max_age);

Could you make the format string a single line, please?
And add a Fixes tag.
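
For reference, the behaviour described in the quoted commit message can be
reproduced with standard snprintf(): in "%ul" the "%u" consumes the
argument and the 'l' is printed as literal text. The ex_* helper names
below are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* "%ul": "%u" formats the value, then a literal 'l' is appended. */
static int ex_fmt_ul(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%ul", v);
}

/* "%u": formats the value and nothing else. */
static int ex_fmt_u(char *buf, size_t len, unsigned int v)
{
	return snprintf(buf, len, "%u", v);
}
```

Formatting the value 20 gives "20l" with the first helper and "20" with
the second.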

^ permalink raw reply

* Re: [PATCH net v1 1/2] ARM: dts: imx: let's name the ptp interrupt for the fec ethernet driver
From: Andrew Lunn @ 2017-10-03 20:53 UTC (permalink / raw)
  To: Troy Kisky; +Cc: shawn.guo, fugang.duan, netdev, davem, fabio.estevam, lznuaa
In-Reply-To: <b07b44ca-3804-bd7d-900c-556443a02ac8@boundarydevices.com>

On Tue, Oct 03, 2017 at 01:41:34PM -0700, Troy Kisky wrote:
> On 10/2/2017 5:51 PM, Andrew Lunn wrote:
> > On Mon, Oct 02, 2017 at 05:04:41PM -0700, Troy Kisky wrote:
> >> imx7s/imx7d has the ptp interrupt newly added as well.
> >> This will allow the ptp interrupt to have its own interrupt routine.
> >>
> >> Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
> >> ---
> >>  arch/arm/boot/dts/imx6qdl.dtsi | 1 +
> >>  arch/arm/boot/dts/imx6sx.dtsi  | 2 ++
> >>  arch/arm/boot/dts/imx6ul.dtsi  | 2 ++
> >>  arch/arm/boot/dts/imx7d.dtsi   | 4 +++-
> >>  arch/arm/boot/dts/imx7s.dtsi   | 4 +++-
> >>  5 files changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
> >> index 8884b4a3cafb..d848d2bfe8e2 100644
> >> --- a/arch/arm/boot/dts/imx6qdl.dtsi
> >> +++ b/arch/arm/boot/dts/imx6qdl.dtsi
> >> @@ -1017,6 +1017,7 @@
> >>  			fec: ethernet@02188000 {
> >>  				compatible = "fsl,imx6q-fec";
> >>  				reg = <0x02188000 0x4000>;
> >> +				interrupt-names = "","ptp";
> > 
> > Hi Troy
> > 
> > The "" looks a bit odd. Can you use a name here?
> > 
> >     Andrew
> > 
> 
> Sure. Can I use "q0", "q1", "q2", and look them up by name in fec_main
> as well?

Hi Troy

Is there no better name? How does the datasheet name them?

> Should I worry about compatibility with old dtbs?

Yes. You cannot break old dtb blobs. So I would suggest you keep looking
up the old interrupts by number. But for this new interrupt you can use
the name.
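
A userspace sketch of that split, assuming the interrupt numbers and the
optional interrupt-names strings have already been read into plain arrays.
All ex_* names are hypothetical; in the driver itself the lookups would go
through the platform IRQ helpers, not these functions.

```c
#include <stddef.h>
#include <string.h>

/* Return the index of 'name' in names[], or -1 if absent or no names. */
static int ex_irq_index_byname(const char *const *names, int count,
			       const char *name)
{
	if (!names)
		return -1; /* old dtb: no interrupt-names property */
	for (int i = 0; i < count; i++)
		if (names[i] && strcmp(names[i], name) == 0)
			return i;
	return -1;
}

/* Existing queue interrupts: still looked up by position, as before. */
static int ex_get_queue_irq(const int *irqs, int count, int queue)
{
	return (queue >= 0 && queue < count) ? irqs[queue] : -1;
}

/* New PTP interrupt: looked up by name only; -1 when the dtb lacks it. */
static int ex_get_ptp_irq(const int *irqs, const char *const *names,
			  int count)
{
	int idx = ex_irq_index_byname(names, count, "ptp");

	return idx < 0 ? -1 : irqs[idx];
}
```

An old blob without names keeps working for the queue interrupts and
simply reports no PTP interrupt.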

    Andrew

^ permalink raw reply

* Re: [PATCH] rndis_host: support Novatel Verizon USB730L
From: David Miller @ 2017-10-03 21:31 UTC (permalink / raw)
  To: bjorn; +Cc: aleksander, oliver, linux-usb, netdev
In-Reply-To: <87k20cmitw.fsf@miraculix.mork.no>

From: Bjørn Mork <bjorn@mork.no>
Date: Tue, 03 Oct 2017 16:01:15 +0200

> We can pretty much ignore the USB-IF and any specs, since that is what
> the vendors appear to do.  They provide device specific drivers for
> Windows, so all they care about is that their device "works" with their
> driver.
> 
> But in Linux we prefer to create drivers for device classes whenever we
> can, to avoid having to add every single device by ID.  So we try to
> guess future patterns based on the devices we have observed, even when
> there is no clear spec.  This is what Aleksander does here. He has a
> device with a 'Cls=ef(misc ) Sub=04 Prot=01' function.  This device
> works with the rndis_host driver. That is all we know.
> 
> We cannot prove that a class match is correct. But it does make sense to
> try it.  At least we know that this works for one device.
> 
> Adding anything else, e.g. based on the table at
> http://www.usb.org/developers/defined_class/#BaseClassEFh , is a bit
> more risky.  We don't know if a driver will work with *any* such device
> until we've actually seen one.
> 
> This is just my opinion, and probably full of bogus assumptions as
> usual.  I was sort of hoping that some expert would speak up so I didn't
> have to :-)

Ok ;-)

> But FWIW:
> 
> Reviewed-by: Bjørn Mork <bjorn@mork.no>

So I'll apply this for now, thanks for your feedback.

^ permalink raw reply

* Re: [PATCH next] bonding: speed/duplex update at NETDEV_UP event
From: David Miller @ 2017-10-03 21:32 UTC (permalink / raw)
  To: mahesh; +Cc: j.vosburgh, andy, vfalico, netdev, maheshb
In-Reply-To: <20170928010349.8988-1-mahesh@bandewar.net>

From: Mahesh Bandewar <mahesh@bandewar.net>
Date: Wed, 27 Sep 2017 18:03:49 -0700

> From: Mahesh Bandewar <maheshb@google.com>
> 
> Some NIC drivers don't have correct speed/duplex settings at the
> time they send NETDEV_UP notification and that messes up the
> bonding state. Especially 802.3ad mode which is very sensitive
> to these settings. In the current implementation we invoke
> bond_update_speed_duplex() when we receive NETDEV_UP, however,
> ignore the return value. If the values we get are invalid
> (UNKNOWN), then slave gets removed from the aggregator with
> speed and duplex set to UNKNOWN while link is still marked as UP.
> 
> This patch fixes this scenario. Also 802.3ad mode is sensitive to
> these conditions while other modes are not, so making sure that it
> doesn't change the behavior for other modes.
> 
> Signed-off-by: Mahesh Bandewar <maheshb@google.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] nfp: convert nfp_eth_set_bit_config() into a macro
From: Jakub Kicinski @ 2017-10-03 21:50 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: David S . Miller, Simon Horman, Dirk van der Merwe, oss-drivers,
	netdev, linux-kernel, Renato Golin, Manoj Gupta, Guenter Roeck,
	Doug Anderson
In-Reply-To: <20171003200546.165731-1-mka@chromium.org>

On Tue,  3 Oct 2017 13:05:46 -0700, Matthias Kaehlcke wrote:
> nfp_eth_set_bit_config() is marked as __always_inline to allow gcc to
> identify the 'mask' parameter as known to be constant at compile time,
> which is required to use the FIELD_GET() macro.
> 
> The forced inlining does the trick for gcc, but for kernel builds with
> clang it results in undefined symbols:
> 
> drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.o: In function
>   `__nfp_eth_set_aneg':
> drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c:(.text+0x787):
>   undefined reference to `__compiletime_assert_492'
> drivers/net/ethernet/netronome/nfp/nfpcore/nfp_nsp_eth.c:(.text+0x7b1):
>   undefined reference to `__compiletime_assert_496'
> 
> These __compiletime_assert_xyx() calls would have been optimized away if
> the compiler had seen 'mask' as a constant.
> 
> Convert nfp_eth_set_bit_config() into a macro, which allows both gcc and
> clang to identify 'mask' as a compile time constant.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>

:(

Is there no chance of fixing the constant propagation in the compiler?
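
As background for the question: with a macro, the mask is substituted
textually, so the compiler sees a literal constant without relying on
inlining or constant propagation. The sketch below illustrates that in
plain C11. FIELD_GET_SKETCH(), FIELD_SET_SKETCH(), MUST_BE_NONZERO_CONST()
and ETH_SPEED_MASK are hypothetical names made up for this example; they
are not the kernel's FIELD_GET()/FIELD_PREP() and not the nfp driver's
code.

```c
#include <assert.h>
#include <stdint.h>

/* Fails to build unless 'e' is a non-zero integer constant expression.
 * C11 permits a static assertion inside a struct declaration.
 */
#define MUST_BE_NONZERO_CONST(e) \
	((void)sizeof(struct { \
		static_assert((e) != 0, "mask must be a non-zero constant"); \
		int unused; \
	}))

/* Lowest set bit of the mask; multiplying or dividing by it shifts a
 * value into or out of the field.
 */
#define FIELD_LSB(mask) ((mask) & -(mask))

#define FIELD_GET_SKETCH(mask, reg) \
	(MUST_BE_NONZERO_CONST(mask), \
	 (uint64_t)(((reg) & (mask)) / FIELD_LSB(mask)))

#define FIELD_SET_SKETCH(mask, reg, val) \
	(MUST_BE_NONZERO_CONST(mask), \
	 (uint64_t)(((reg) & ~(mask)) | (((val) * FIELD_LSB(mask)) & (mask))))

#define ETH_SPEED_MASK 0x0ff0ULL	/* hypothetical register layout */

uint64_t eth_get_speed(uint64_t reg)
{
	return FIELD_GET_SKETCH(ETH_SPEED_MASK, reg);
}

uint64_t eth_set_speed(uint64_t reg, uint64_t speed)
{
	return FIELD_SET_SKETCH(ETH_SPEED_MASK, reg, speed);
}
```

Passing a function parameter as the mask to either macro would fail the
static assertion at compile time on any compiler, instead of depending
on whether the optimizer managed to prove the value constant.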


* Re: [PATCH net] net: fib_rules: Fix fib_rules_ops->compare implementations to support exact match
From: David Miller @ 2017-10-03 21:54 UTC (permalink / raw)
  To: shmulik; +Cc: netdev, mateusz.bajorski, dsa, tgraf, shmulik.ladkani
In-Reply-To: <20170930085909.1103-1-shmulik@nsof.io>

From: Shmulik Ladkani <shmulik@nsof.io>
Date: Sat, 30 Sep 2017 11:59:09 +0300

> This leads to inconsistencies, depending on order of operations, e.g.:

I don't see any inconsistency.  When you insert using NLM_F_EXCL the
insertion fails if any existing rule matches or overlaps in any way
with the keys in the new rule.

Sorry I'm not going to apply this.


* [PATCH net-next v2 00/10] Introduce SCTP Stream Schedulers
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight

This patchset introduces the SCTP Stream Schedulers as defined by
https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13

It provides 3 schedulers at the moment: FCFS, Priority and Round Robin.
The other 3, Round Robin per packet, Fair Capacity and Weighted Fair
Capacity will be added later. More specifically, WFQ is required by
WebRTC Datachannels.

The draft also defines the idata chunk, allowing a usermsg to be
interrupted by another piece of idata from another stream. This patchset
*doesn't* include it. It will be posted later by Xin Long.  Its
integration with this patchset is very simple and it basically only
requires a tweak in sctp_sched_dequeue_done(), to ignore datamsg
boundaries.

The first 5 patches are a preparation for the next ones. The most
relevant patches are the 4th and 6th ones. More details are available on
each patch.

v2: changelog update on patch 3

Marcelo Ricardo Leitner (10):
  sctp: silence warns on sctp_stream_init allocations
  sctp: factor out stream->out allocation
  sctp: factor out stream->in allocation
  sctp: introduce struct sctp_stream_out_ext
  sctp: introduce sctp_chunk_stream_no
  sctp: introduce stream scheduler foundations
  sctp: add sockopt to get/set stream scheduler
  sctp: add sockopt to get/set stream scheduler parameters
  sctp: introduce priority based stream scheduler
  sctp: introduce round robin stream scheduler

 include/net/sctp/stream_sched.h |  72 +++++++++
 include/net/sctp/structs.h      |  63 +++++++-
 include/uapi/linux/sctp.h       |  16 ++
 net/sctp/Makefile               |   3 +-
 net/sctp/chunk.c                |   6 +-
 net/sctp/outqueue.c             |  63 ++++----
 net/sctp/sm_sideeffect.c        |   3 +
 net/sctp/socket.c               | 179 ++++++++++++++++++++-
 net/sctp/stream.c               | 196 +++++++++++++++++++----
 net/sctp/stream_sched.c         | 275 +++++++++++++++++++++++++++++++
 net/sctp/stream_sched_prio.c    | 347 ++++++++++++++++++++++++++++++++++++++++
 net/sctp/stream_sched_rr.c      | 201 +++++++++++++++++++++++
 12 files changed, 1347 insertions(+), 77 deletions(-)
 create mode 100644 include/net/sctp/stream_sched.h
 create mode 100644 net/sctp/stream_sched.c
 create mode 100644 net/sctp/stream_sched_prio.c
 create mode 100644 net/sctp/stream_sched_rr.c

-- 
2.13.5


* [PATCH net-next v2 01/10] sctp: silence warns on sctp_stream_init allocations
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight
In-Reply-To: <cover.1507069005.git.marcelo.leitner@gmail.com>

As SCTP supports up to 65535 streams, that can lead to very large
allocations in sctp_stream_init(). As Xin Long noticed, systems with
small amounts of memory are more likely to fail these allocations and
dump warnings to dmesg in response to user actions. Thus, silence them.

Also, if the reallocation of stream->out is not necessary, skip it and
keep the memory we already have.

Reported-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/sctp/stream.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 63ea1550371493ec8863627c7a43f46a22f4a4c9..1afa9555808390d5fc736727422d9700a3855613 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -40,9 +40,14 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 {
 	int i;
 
+	gfp |= __GFP_NOWARN;
+
 	/* Initial stream->out size may be very big, so free it and alloc
-	 * a new one with new outcnt to save memory.
+	 * a new one with new outcnt to save memory if needed.
 	 */
+	if (outcnt == stream->outcnt)
+		goto in;
+
 	kfree(stream->out);
 
 	stream->out = kcalloc(outcnt, sizeof(*stream->out), gfp);
@@ -53,6 +58,7 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	for (i = 0; i < stream->outcnt; i++)
 		stream->out[i].state = SCTP_STREAM_OPEN;
 
+in:
 	if (!incnt)
 		return 0;
 
-- 
2.13.5


* [PATCH net-next v2 02/10] sctp: factor out stream->out allocation
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight
In-Reply-To: <cover.1507069005.git.marcelo.leitner@gmail.com>

There is 1 place allocating it and 2 others reallocating it. Move such
procedures to a common function.

Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/sctp/stream.c | 52 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 32 insertions(+), 20 deletions(-)

diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 1afa9555808390d5fc736727422d9700a3855613..6d0e997d301f89e165367106c02e82f8a6c3a877 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -35,6 +35,30 @@
 #include <net/sctp/sctp.h>
 #include <net/sctp/sm.h>
 
+static int sctp_stream_alloc_out(struct sctp_stream *stream, __u16 outcnt,
+				 gfp_t gfp)
+{
+	struct sctp_stream_out *out;
+
+	out = kmalloc_array(outcnt, sizeof(*out), gfp);
+	if (!out)
+		return -ENOMEM;
+
+	if (stream->out) {
+		memcpy(out, stream->out, min(outcnt, stream->outcnt) *
+					 sizeof(*out));
+		kfree(stream->out);
+	}
+
+	if (outcnt > stream->outcnt)
+		memset(out + stream->outcnt, 0,
+		       (outcnt - stream->outcnt) * sizeof(*out));
+
+	stream->out = out;
+
+	return 0;
+}
+
 int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 		     gfp_t gfp)
 {
@@ -48,11 +72,9 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	if (outcnt == stream->outcnt)
 		goto in;
 
-	kfree(stream->out);
-
-	stream->out = kcalloc(outcnt, sizeof(*stream->out), gfp);
-	if (!stream->out)
-		return -ENOMEM;
+	i = sctp_stream_alloc_out(stream, outcnt, gfp);
+	if (i)
+		return i;
 
 	stream->outcnt = outcnt;
 	for (i = 0; i < stream->outcnt; i++)
@@ -276,15 +298,9 @@ int sctp_send_add_streams(struct sctp_association *asoc,
 	}
 
 	if (out) {
-		struct sctp_stream_out *streamout;
-
-		streamout = krealloc(stream->out, outcnt * sizeof(*streamout),
-				     GFP_KERNEL);
-		if (!streamout)
+		retval = sctp_stream_alloc_out(stream, outcnt, GFP_KERNEL);
+		if (retval)
 			goto out;
-
-		memset(streamout + stream->outcnt, 0, out * sizeof(*streamout));
-		stream->out = streamout;
 	}
 
 	chunk = sctp_make_strreset_addstrm(asoc, out, in);
@@ -682,10 +698,10 @@ struct sctp_chunk *sctp_process_strreset_addstrm_in(
 	struct sctp_strreset_addstrm *addstrm = param.v;
 	struct sctp_stream *stream = &asoc->stream;
 	__u32 result = SCTP_STRRESET_DENIED;
-	struct sctp_stream_out *streamout;
 	struct sctp_chunk *chunk = NULL;
 	__u32 request_seq, outcnt;
 	__u16 out, i;
+	int ret;
 
 	request_seq = ntohl(addstrm->request_seq);
 	if (TSN_lt(asoc->strreset_inseq, request_seq) ||
@@ -714,14 +730,10 @@ struct sctp_chunk *sctp_process_strreset_addstrm_in(
 	if (!out || outcnt > SCTP_MAX_STREAM)
 		goto out;
 
-	streamout = krealloc(stream->out, outcnt * sizeof(*streamout),
-			     GFP_ATOMIC);
-	if (!streamout)
+	ret = sctp_stream_alloc_out(stream, outcnt, GFP_ATOMIC);
+	if (ret)
 		goto out;
 
-	memset(streamout + stream->outcnt, 0, out * sizeof(*streamout));
-	stream->out = streamout;
-
 	chunk = sctp_make_strreset_addstrm(asoc, out, 0);
 	if (!chunk)
 		goto out;
-- 
2.13.5
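
The resize pattern that sctp_stream_alloc_out() centralizes in the diff
above (allocate a new array, copy the overlapping part, zero the new
tail, free the old array) can be sketched in userspace C. This is an
illustration only: struct stream, struct stream_out and
stream_alloc_out() are reduced, hypothetical versions, malloc()/free()
stand in for the kernel allocators, and treating a NULL array as having
zero old entries is an extra safety assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct stream_out {		/* hypothetical, reduced element type */
	uint16_t ssn;
	uint8_t state;
};

struct stream {
	struct stream_out *out;
	uint16_t outcnt;
};

/* Resize s->out to 'outcnt' elements: keep existing entries, zero new
 * ones. On failure the old array is left untouched. As in the patch,
 * the caller is responsible for updating s->outcnt afterwards.
 */
int stream_alloc_out(struct stream *s, uint16_t outcnt)
{
	struct stream_out *out;
	size_t old, keep;

	assert(s != NULL && outcnt > 0);

	out = malloc((size_t)outcnt * sizeof(*out));
	if (!out)
		return -1;

	/* Sketch-only assumption: no old array means no old entries. */
	old = s->out ? s->outcnt : 0;
	keep = outcnt < old ? outcnt : old;

	if (s->out) {
		memcpy(out, s->out, keep * sizeof(*out));
		free(s->out);
	}
	if (outcnt > old)
		memset(out + old, 0, (outcnt - old) * sizeof(*out));

	s->out = out;
	return 0;
}
```

Allocating first and freeing the old array only after a successful copy
means a failed allocation never leaves the stream without its previous
state.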


* [PATCH net-next v2 03/10] sctp: factor out stream->in allocation
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight
In-Reply-To: <cover.1507069005.git.marcelo.leitner@gmail.com>

There is 1 place allocating it and another reallocating. Move such
procedures to a common function.

v2: updated changelog

Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 net/sctp/stream.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 6d0e997d301f89e165367106c02e82f8a6c3a877..952437d656cc71ad1c133a736c539eff9a8d80c2 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -59,6 +59,31 @@ static int sctp_stream_alloc_out(struct sctp_stream *stream, __u16 outcnt,
 	return 0;
 }
 
+static int sctp_stream_alloc_in(struct sctp_stream *stream, __u16 incnt,
+				gfp_t gfp)
+{
+	struct sctp_stream_in *in;
+
+	in = kmalloc_array(incnt, sizeof(*stream->in), gfp);
+
+	if (!in)
+		return -ENOMEM;
+
+	if (stream->in) {
+		memcpy(in, stream->in, min(incnt, stream->incnt) *
+				       sizeof(*in));
+		kfree(stream->in);
+	}
+
+	if (incnt > stream->incnt)
+		memset(in + stream->incnt, 0,
+		       (incnt - stream->incnt) * sizeof(*in));
+
+	stream->in = in;
+
+	return 0;
+}
+
 int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 		     gfp_t gfp)
 {
@@ -84,8 +109,8 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	if (!incnt)
 		return 0;
 
-	stream->in = kcalloc(incnt, sizeof(*stream->in), gfp);
-	if (!stream->in) {
+	i = sctp_stream_alloc_in(stream, incnt, gfp);
+	if (i) {
 		kfree(stream->out);
 		stream->out = NULL;
 		return -ENOMEM;
@@ -623,7 +648,6 @@ struct sctp_chunk *sctp_process_strreset_addstrm_out(
 	struct sctp_strreset_addstrm *addstrm = param.v;
 	struct sctp_stream *stream = &asoc->stream;
 	__u32 result = SCTP_STRRESET_DENIED;
-	struct sctp_stream_in *streamin;
 	__u32 request_seq, incnt;
 	__u16 in, i;
 
@@ -670,13 +694,9 @@ struct sctp_chunk *sctp_process_strreset_addstrm_out(
 	if (!in || incnt > SCTP_MAX_STREAM)
 		goto out;
 
-	streamin = krealloc(stream->in, incnt * sizeof(*streamin),
-			    GFP_ATOMIC);
-	if (!streamin)
+	if (sctp_stream_alloc_in(stream, incnt, GFP_ATOMIC))
 		goto out;
 
-	memset(streamin + stream->incnt, 0, in * sizeof(*streamin));
-	stream->in = streamin;
 	stream->incnt = incnt;
 
 	result = SCTP_STRRESET_PERFORMED;
-- 
2.13.5


* [PATCH net-next v2 04/10] sctp: introduce struct sctp_stream_out_ext
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight
In-Reply-To: <cover.1507069005.git.marcelo.leitner@gmail.com>

With the stream schedulers, sctp_stream_out will become too big to be
allocated by kmalloc and as we need to allocate with BH disabled, we
cannot use __vmalloc in sctp_stream_init().

This patch moves out the stats from sctp_stream_out to
sctp_stream_out_ext, which will be allocated only when the application
tries to sendmsg something on it.

Just the introduction of sctp_stream_out_ext would already fix the issue
described above by splitting the allocation in two. Moving the stats
to it also reduces the pressure on the allocator as we will ask for less
memory atomically when creating the socket and we will use GFP_KERNEL
later.

Then, for stream schedulers, we will just use sctp_stream_out_ext.

Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 include/net/sctp/structs.h | 10 ++++++++--
 net/sctp/chunk.c           |  6 +++---
 net/sctp/outqueue.c        |  4 ++--
 net/sctp/socket.c          | 27 +++++++++++++++++++++------
 net/sctp/stream.c          | 16 ++++++++++++++++
 5 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 0477945de1a3cf5c27348e99d9a30e02c491d1de..9b2b30b3ba4dfd10c24c3e06ed80779180a06baf 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -84,6 +84,7 @@ struct sctp_ulpq;
 struct sctp_ep_common;
 struct crypto_shash;
 struct sctp_stream;
+struct sctp_stream_out;
 
 
 #include <net/sctp/tsnmap.h>
@@ -380,6 +381,7 @@ struct sctp_sender_hb_info {
 
 int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 		     gfp_t gfp);
+int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid);
 void sctp_stream_free(struct sctp_stream *stream);
 void sctp_stream_clear(struct sctp_stream *stream);
 void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new);
@@ -1315,11 +1317,15 @@ struct sctp_inithdr_host {
 	__u32 initial_tsn;
 };
 
+struct sctp_stream_out_ext {
+	__u64 abandoned_unsent[SCTP_PR_INDEX(MAX) + 1];
+	__u64 abandoned_sent[SCTP_PR_INDEX(MAX) + 1];
+};
+
 struct sctp_stream_out {
 	__u16	ssn;
 	__u8	state;
-	__u64	abandoned_unsent[SCTP_PR_INDEX(MAX) + 1];
-	__u64	abandoned_sent[SCTP_PR_INDEX(MAX) + 1];
+	struct sctp_stream_out_ext *ext;
 };
 
 struct sctp_stream_in {
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 3afac275ee82dbec825dd71378dffe69a53718a7..7b261afc47b9d709fdd780a93aaba874f35d79be 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -311,10 +311,10 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
 
 		if (chunk->sent_count) {
 			chunk->asoc->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
-			streamout->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
+			streamout->ext->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
 		} else {
 			chunk->asoc->abandoned_unsent[SCTP_PR_INDEX(TTL)]++;
-			streamout->abandoned_unsent[SCTP_PR_INDEX(TTL)]++;
+			streamout->ext->abandoned_unsent[SCTP_PR_INDEX(TTL)]++;
 		}
 		return 1;
 	} else if (SCTP_PR_RTX_ENABLED(chunk->sinfo.sinfo_flags) &&
@@ -323,7 +323,7 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
 			&chunk->asoc->stream.out[chunk->sinfo.sinfo_stream];
 
 		chunk->asoc->abandoned_sent[SCTP_PR_INDEX(RTX)]++;
-		streamout->abandoned_sent[SCTP_PR_INDEX(RTX)]++;
+		streamout->ext->abandoned_sent[SCTP_PR_INDEX(RTX)]++;
 		return 1;
 	} else if (!SCTP_PR_POLICY(chunk->sinfo.sinfo_flags) &&
 		   chunk->msg->expires_at &&
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 2966ff400755fe93e3658e09d3bb44b9d7d19d2e..746b07b7937d8730824b9e09917d947aa7863ec6 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -366,7 +366,7 @@ static int sctp_prsctp_prune_sent(struct sctp_association *asoc,
 		streamout = &asoc->stream.out[chk->sinfo.sinfo_stream];
 		asoc->sent_cnt_removable--;
 		asoc->abandoned_sent[SCTP_PR_INDEX(PRIO)]++;
-		streamout->abandoned_sent[SCTP_PR_INDEX(PRIO)]++;
+		streamout->ext->abandoned_sent[SCTP_PR_INDEX(PRIO)]++;
 
 		if (!chk->tsn_gap_acked) {
 			if (chk->transport)
@@ -404,7 +404,7 @@ static int sctp_prsctp_prune_unsent(struct sctp_association *asoc,
 			struct sctp_stream_out *streamout =
 				&asoc->stream.out[chk->sinfo.sinfo_stream];
 
-			streamout->abandoned_unsent[SCTP_PR_INDEX(PRIO)]++;
+			streamout->ext->abandoned_unsent[SCTP_PR_INDEX(PRIO)]++;
 		}
 
 		msg_len -= SCTP_DATA_SNDSIZE(chk) +
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index d4730ada7f3233367be7a0e3bb10e286a25602c8..d207734326b085e60625e4333f74221481114892 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1927,6 +1927,13 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 		goto out_free;
 	}
 
+	/* Allocate sctp_stream_out_ext if not already done */
+	if (unlikely(!asoc->stream.out[sinfo->sinfo_stream].ext)) {
+		err = sctp_stream_init_ext(&asoc->stream, sinfo->sinfo_stream);
+		if (err)
+			goto out_free;
+	}
+
 	if (sctp_wspace(asoc) < msg_len)
 		sctp_prsctp_prune(asoc, sinfo, msg_len - sctp_wspace(asoc));
 
@@ -6645,7 +6652,7 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
 					   char __user *optval,
 					   int __user *optlen)
 {
-	struct sctp_stream_out *streamout;
+	struct sctp_stream_out_ext *streamoute;
 	struct sctp_association *asoc;
 	struct sctp_prstatus params;
 	int retval = -EINVAL;
@@ -6668,21 +6675,29 @@ static int sctp_getsockopt_pr_streamstatus(struct sock *sk, int len,
 	if (!asoc || params.sprstat_sid >= asoc->stream.outcnt)
 		goto out;
 
-	streamout = &asoc->stream.out[params.sprstat_sid];
+	streamoute = asoc->stream.out[params.sprstat_sid].ext;
+	if (!streamoute) {
+		/* Not allocated yet, means all stats are 0 */
+		params.sprstat_abandoned_unsent = 0;
+		params.sprstat_abandoned_sent = 0;
+		retval = 0;
+		goto out;
+	}
+
 	if (policy == SCTP_PR_SCTP_NONE) {
 		params.sprstat_abandoned_unsent = 0;
 		params.sprstat_abandoned_sent = 0;
 		for (policy = 0; policy <= SCTP_PR_INDEX(MAX); policy++) {
 			params.sprstat_abandoned_unsent +=
-				streamout->abandoned_unsent[policy];
+				streamoute->abandoned_unsent[policy];
 			params.sprstat_abandoned_sent +=
-				streamout->abandoned_sent[policy];
+				streamoute->abandoned_sent[policy];
 		}
 	} else {
 		params.sprstat_abandoned_unsent =
-			streamout->abandoned_unsent[__SCTP_PR_INDEX(policy)];
+			streamoute->abandoned_unsent[__SCTP_PR_INDEX(policy)];
 		params.sprstat_abandoned_sent =
-			streamout->abandoned_sent[__SCTP_PR_INDEX(policy)];
+			streamoute->abandoned_sent[__SCTP_PR_INDEX(policy)];
 	}
 
 	if (put_user(len, optlen) || copy_to_user(optval, &params, len)) {
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 952437d656cc71ad1c133a736c539eff9a8d80c2..055ca25bbc91bf932db8048c72a1b11cc2214942 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -121,8 +121,24 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	return 0;
 }
 
+int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid)
+{
+	struct sctp_stream_out_ext *soute;
+
+	soute = kzalloc(sizeof(*soute), GFP_KERNEL);
+	if (!soute)
+		return -ENOMEM;
+	stream->out[sid].ext = soute;
+
+	return 0;
+}
+
 void sctp_stream_free(struct sctp_stream *stream)
 {
+	int i;
+
+	for (i = 0; i < stream->outcnt; i++)
+		kfree(stream->out[i].ext);
 	kfree(stream->out);
 	kfree(stream->in);
 }
-- 
2.13.5
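
The lazy-allocation idea in this patch (keep the per-stream array small,
allocate the extension on first send, and report zeroed statistics while
it is still missing) can be sketched in userspace C as follows. The names
stream_ensure_ext(), stream_abandoned_sent() and PR_POLICIES are
hypothetical, and calloc() stands in for kzalloc(); this is not the
kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PR_POLICIES 4	/* hypothetical number of PR-SCTP policy slots */

struct stream_out_ext {
	uint64_t abandoned_unsent[PR_POLICIES];
	uint64_t abandoned_sent[PR_POLICIES];
};

struct stream_out {
	uint16_t ssn;
	uint8_t state;
	struct stream_out_ext *ext;	/* NULL until the stream is used */
};

/* Send path: allocate the extension on first use only. */
int stream_ensure_ext(struct stream_out *so)
{
	assert(so != NULL);
	if (so->ext)
		return 0;
	so->ext = calloc(1, sizeof(*so->ext));
	return so->ext ? 0 : -1;
}

/* Readers treat a missing extension as "all counters are zero". */
uint64_t stream_abandoned_sent(const struct stream_out *so,
			       unsigned int policy)
{
	assert(so != NULL && policy < PR_POLICIES);
	return so->ext ? so->ext->abandoned_sent[policy] : 0;
}
```

Each array element shrinks to a pointer plus the small fixed fields, so
an association with many idle streams pays for the counters only on the
streams that actually send.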


* [PATCH net-next v2 05/10] sctp: introduce sctp_chunk_stream_no
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight
In-Reply-To: <cover.1507069005.git.marcelo.leitner@gmail.com>

Add a helper to fetch the stream number from a given chunk.

Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 include/net/sctp/structs.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 9b2b30b3ba4dfd10c24c3e06ed80779180a06baf..c48f7999fe9b80c5b5e41910a3608059b94140a7 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -642,6 +642,11 @@ void sctp_init_addrs(struct sctp_chunk *, union sctp_addr *,
 		     union sctp_addr *);
 const union sctp_addr *sctp_source(const struct sctp_chunk *chunk);
 
+static inline __u16 sctp_chunk_stream_no(struct sctp_chunk *ch)
+{
+	return ntohs(ch->subh.data_hdr->stream);
+}
+
 enum {
 	SCTP_ADDR_NEW,		/* new address added to assoc/ep */
 	SCTP_ADDR_SRC,		/* address can be used as source */
-- 
2.13.5


* [PATCH net-next v2 06/10] sctp: introduce stream scheduler foundations
From: Marcelo Ricardo Leitner @ 2017-10-03 22:20 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich, Xin Long, David Laight
In-Reply-To: <cover.1507069005.git.marcelo.leitner@gmail.com>

This patch introduces the hooks necessary to do stream scheduling, as
per RFC Draft ndata.  It also introduces the first scheduler, which is
what we do today but now factored out: first come first served (FCFS).

With stream scheduling, we now have to track which chunk was enqueued on
which stream and be able to select a chunk other than the one at the
front of the main outqueue. So we introduce a list on the
sctp_stream_out_ext structure for this purpose.

We reuse sctp_chunk->transmitted_list space for the list above, as the
chunk cannot belong to the two lists at the same time. By using the
union in there, we can have distinct names for these moments.

sctp_sched_ops are the operations expected to be implemented by each
scheduler. The dequeueing is a bit particular to this implementation but
it is to match how we dequeue packets today. We first dequeue and then
check if it fits the packet and if not, we requeue it at head. This is why
we don't have a peek operation but have dequeue_done instead, which is
called once the chunk can be safely considered as transmitted.

The check removed from sctp_outq_flush is now performed by
sctp_stream_outq_migrate, which is only called during assoc setup.
(sctp_sendmsg() also checks for it)

The only operation that is foreseen but not yet added here is a way to
signal that a new packet is starting or that the packet is done, for the
per-packet round robin scheduler; it is intentionally left to the
patch that actually implements it.

Support for I-DATA chunks, also described in this RFC, with user message
interleaving is straightforward as it just requires the schedulers to
probe for the feature and ignore datamsg boundaries when dequeueing.
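
To illustrate the dequeue / requeue-at-head / dequeue_done protocol
described above, here is a minimal userspace sketch of an FCFS scheduler
behind an ops table. It is an assumption-laden illustration, not the code
in this patch: struct chunk, struct outq, struct sched_ops,
outq_fill_packet() and the 'done' counter are hypothetical, and only two
of the operations are modelled.

```c
#include <assert.h>
#include <stddef.h>

struct chunk {
	size_t len;
	struct chunk *next;
};

struct outq;

struct sched_ops {	/* hypothetical subset of the scheduler interface */
	struct chunk *(*dequeue)(struct outq *q);
	void (*dequeue_done)(struct outq *q, struct chunk *ch);
};

struct outq {
	struct chunk *head, *tail;
	const struct sched_ops *sched;
	unsigned int done;	/* chunks confirmed as sent */
};

void outq_tail(struct outq *q, struct chunk *ch)
{
	ch->next = NULL;
	if (q->tail)
		q->tail->next = ch;
	else
		q->head = ch;
	q->tail = ch;
}

void outq_head(struct outq *q, struct chunk *ch)
{
	ch->next = q->head;
	q->head = ch;
	if (!q->tail)
		q->tail = ch;
}

static struct chunk *fcfs_dequeue(struct outq *q)
{
	struct chunk *ch = q->head;

	if (ch) {
		q->head = ch->next;
		if (!q->head)
			q->tail = NULL;
	}
	return ch;
}

static void fcfs_dequeue_done(struct outq *q, struct chunk *ch)
{
	(void)ch;
	q->done++;	/* FCFS has no per-stream state to update */
}

const struct sched_ops sched_fcfs = { fcfs_dequeue, fcfs_dequeue_done };

/* Fill one packet of 'room' bytes; returns the number of bytes packed. */
size_t outq_fill_packet(struct outq *q, size_t room)
{
	size_t used = 0;
	struct chunk *ch;

	assert(q != NULL && q->sched != NULL);
	while ((ch = q->sched->dequeue(q)) != NULL) {
		if (ch->len > room - used) {
			outq_head(q, ch);	/* does not fit: requeue at head */
			break;
		}
		used += ch->len;
		q->sched->dequeue_done(q, ch);	/* now safe to count as sent */
	}
	return used;
}
```

Because a chunk that does not fit goes back to the head without
dequeue_done being called, a scheduler with real per-stream state would
not advance that state for a chunk that was never transmitted.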

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
 include/net/sctp/stream_sched.h |  72 +++++++++++
 include/net/sctp/structs.h      |  15 ++-
 include/uapi/linux/sctp.h       |   6 +
 net/sctp/Makefile               |   2 +-
 net/sctp/outqueue.c             |  59 +++++----
 net/sctp/sm_sideeffect.c        |   3 +
 net/sctp/stream.c               |  88 +++++++++++--
 net/sctp/stream_sched.c         | 270 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 477 insertions(+), 38 deletions(-)
 create mode 100644 include/net/sctp/stream_sched.h
 create mode 100644 net/sctp/stream_sched.c

diff --git a/include/net/sctp/stream_sched.h b/include/net/sctp/stream_sched.h
new file mode 100644
index 0000000000000000000000000000000000000000..c676550a4c7dd0ea27ac0e14437d0a2b451ef499
--- /dev/null
+++ b/include/net/sctp/stream_sched.h
@@ -0,0 +1,72 @@
+/* SCTP kernel implementation
+ * (C) Copyright Red Hat Inc. 2017
+ *
+ * These are definitions used by the stream schedulers, defined in RFC
+ * draft ndata (https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-11)
+ *
+ * This SCTP implementation is free software;
+ * you can redistribute it and/or modify it under the terms of
+ * the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This SCTP implementation  is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; without even the implied
+ *                 ************************
+ * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with GNU CC; see the file COPYING.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Please send any bug reports or fixes you make to the
+ * email addresses:
+ *    lksctp developers <linux-sctp@vger.kernel.org>
+ *
+ * Written or modified by:
+ *   Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
+ */
+
+#ifndef __sctp_stream_sched_h__
+#define __sctp_stream_sched_h__
+
+struct sctp_sched_ops {
+	/* Property handling for a given stream */
+	int (*set)(struct sctp_stream *stream, __u16 sid, __u16 value,
+		   gfp_t gfp);
+	int (*get)(struct sctp_stream *stream, __u16 sid, __u16 *value);
+
+	/* Init the specific scheduler */
+	int (*init)(struct sctp_stream *stream);
+	/* Init a stream */
+	int (*init_sid)(struct sctp_stream *stream, __u16 sid, gfp_t gfp);
+	/* Frees the entire thing */
+	void (*free)(struct sctp_stream *stream);
+
+	/* Enqueue a chunk */
+	void (*enqueue)(struct sctp_outq *q, struct sctp_datamsg *msg);
+	/* Dequeue a chunk */
+	struct sctp_chunk *(*dequeue)(struct sctp_outq *q);
+	/* Called only if the chunk fit the packet */
+	void (*dequeue_done)(struct sctp_outq *q, struct sctp_chunk *chunk);
+	/* Sched all chunks already enqueued */
+	void (*sched_all)(struct sctp_stream *stream);
+	/* Unsched all chunks already enqueued */
+	void (*unsched_all)(struct sctp_stream *stream);
+};
+
+int sctp_sched_set_sched(struct sctp_association *asoc,
+			 enum sctp_sched_type sched);
+int sctp_sched_get_sched(struct sctp_association *asoc);
+int sctp_sched_set_value(struct sctp_association *asoc, __u16 sid,
+			 __u16 value, gfp_t gfp);
+int sctp_sched_get_value(struct sctp_association *asoc, __u16 sid,
+			 __u16 *value);
+void sctp_sched_dequeue_done(struct sctp_outq *q, struct sctp_chunk *ch);
+
+void sctp_sched_dequeue_common(struct sctp_outq *q, struct sctp_chunk *ch);
+int sctp_sched_init_sid(struct sctp_stream *stream, __u16 sid, gfp_t gfp);
+struct sctp_sched_ops *sctp_sched_ops_from_stream(struct sctp_stream *stream);
+
+#endif /* __sctp_stream_sched_h__ */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index c48f7999fe9b80c5b5e41910a3608059b94140a7..3c22a30fd71b4ef87419a77cf69b00807a5986bb 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -84,7 +84,6 @@ struct sctp_ulpq;
 struct sctp_ep_common;
 struct crypto_shash;
 struct sctp_stream;
-struct sctp_stream_out;
 
 
 #include <net/sctp/tsnmap.h>
@@ -531,8 +530,12 @@ struct sctp_chunk {
 	/* How many times this chunk have been sent, for prsctp RTX policy */
 	int sent_count;
 
-	/* This is our link to the per-transport transmitted list.  */
-	struct list_head transmitted_list;
+	union {
+		/* This is our link to the per-transport transmitted list.  */
+		struct list_head transmitted_list;
+		/* List in specific stream outq */
+		struct list_head stream_list;
+	};
 
 	/* This field is used by chunks that hold fragmented data.
 	 * For the first fragment this is the list that holds the rest of
@@ -1019,6 +1022,9 @@ struct sctp_outq {
 	/* Data pending that has never been transmitted.  */
 	struct list_head out_chunk_list;
 
+	/* Stream scheduler being used */
+	struct sctp_sched_ops *sched;
+
 	unsigned int out_qlen;	/* Total length of queued data chunks. */
 
 	/* Error of send failed, may used in SCTP_SEND_FAILED event. */
@@ -1325,6 +1331,7 @@ struct sctp_inithdr_host {
 struct sctp_stream_out_ext {
 	__u64 abandoned_unsent[SCTP_PR_INDEX(MAX) + 1];
 	__u64 abandoned_sent[SCTP_PR_INDEX(MAX) + 1];
+	struct list_head outq; /* chunks enqueued by this stream */
 };
 
 struct sctp_stream_out {
@@ -1342,6 +1349,8 @@ struct sctp_stream {
 	struct sctp_stream_in *in;
 	__u16 outcnt;
 	__u16 incnt;
+	/* Current stream being sent, if any */
+	struct sctp_stream_out *out_curr;
 };
 
 #define SCTP_STREAM_CLOSED		0x00
diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
index 6217ff8500a1d818fd1002fbd6f81c0c11974665..4487e7625ddbd48be1868a8292a807ecd0a314bc 100644
--- a/include/uapi/linux/sctp.h
+++ b/include/uapi/linux/sctp.h
@@ -1088,4 +1088,10 @@ struct sctp_add_streams {
 	uint16_t sas_outstrms;
 };
 
+/* SCTP Stream schedulers */
+enum sctp_sched_type {
+	SCTP_SS_FCFS,
+	SCTP_SS_MAX = SCTP_SS_FCFS
+};
+
 #endif /* _UAPI_SCTP_H */
diff --git a/net/sctp/Makefile b/net/sctp/Makefile
index 70f1b570bab9764d692f1c2e605d76d056cda2cd..0f6e6d1d69fd336b4a99f896851b0120f9a0d1e0 100644
--- a/net/sctp/Makefile
+++ b/net/sctp/Makefile
@@ -12,7 +12,7 @@ sctp-y := sm_statetable.o sm_statefuns.o sm_sideeffect.o \
 	  inqueue.o outqueue.o ulpqueue.o \
 	  tsnmap.o bind_addr.o socket.o primitive.o \
 	  output.o input.o debug.o stream.o auth.o \
-	  offload.o
+	  offload.o stream_sched.o
 
 sctp_probe-y := probe.o
 
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 746b07b7937d8730824b9e09917d947aa7863ec6..4db012aa25f7a042f063bc17b56270effebc6cc6 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -50,6 +50,7 @@
 
 #include <net/sctp/sctp.h>
 #include <net/sctp/sm.h>
+#include <net/sctp/stream_sched.h>
 
 /* Declare internal functions here.  */
 static int sctp_acked(struct sctp_sackhdr *sack, __u32 tsn);
@@ -72,32 +73,38 @@ static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp);
 
 /* Add data to the front of the queue. */
 static inline void sctp_outq_head_data(struct sctp_outq *q,
-					struct sctp_chunk *ch)
+				       struct sctp_chunk *ch)
 {
+	struct sctp_stream_out_ext *oute;
+	__u16 stream;
+
 	list_add(&ch->list, &q->out_chunk_list);
 	q->out_qlen += ch->skb->len;
+
+	stream = sctp_chunk_stream_no(ch);
+	oute = q->asoc->stream.out[stream].ext;
+	list_add(&ch->stream_list, &oute->outq);
 }
 
 /* Take data from the front of the queue. */
 static inline struct sctp_chunk *sctp_outq_dequeue_data(struct sctp_outq *q)
 {
-	struct sctp_chunk *ch = NULL;
-
-	if (!list_empty(&q->out_chunk_list)) {
-		struct list_head *entry = q->out_chunk_list.next;
-
-		ch = list_entry(entry, struct sctp_chunk, list);
-		list_del_init(entry);
-		q->out_qlen -= ch->skb->len;
-	}
-	return ch;
+	return q->sched->dequeue(q);
 }
+
 /* Add data chunk to the end of the queue. */
 static inline void sctp_outq_tail_data(struct sctp_outq *q,
 				       struct sctp_chunk *ch)
 {
+	struct sctp_stream_out_ext *oute;
+	__u16 stream;
+
 	list_add_tail(&ch->list, &q->out_chunk_list);
 	q->out_qlen += ch->skb->len;
+
+	stream = sctp_chunk_stream_no(ch);
+	oute = q->asoc->stream.out[stream].ext;
+	list_add_tail(&ch->stream_list, &oute->outq);
 }
 
 /*
@@ -207,6 +214,7 @@ void sctp_outq_init(struct sctp_association *asoc, struct sctp_outq *q)
 	INIT_LIST_HEAD(&q->retransmit);
 	INIT_LIST_HEAD(&q->sacked);
 	INIT_LIST_HEAD(&q->abandoned);
+	sctp_sched_set_sched(asoc, SCTP_SS_FCFS);
 }
 
 /* Free the outqueue structure and any related pending chunks.
@@ -258,6 +266,7 @@ static void __sctp_outq_teardown(struct sctp_outq *q)
 
 	/* Throw away any leftover data chunks. */
 	while ((chunk = sctp_outq_dequeue_data(q)) != NULL) {
+		sctp_sched_dequeue_done(q, chunk);
 
 		/* Mark as send failure. */
 		sctp_chunk_fail(chunk, q->error);
@@ -391,13 +400,14 @@ static int sctp_prsctp_prune_unsent(struct sctp_association *asoc,
 	struct sctp_outq *q = &asoc->outqueue;
 	struct sctp_chunk *chk, *temp;
 
+	q->sched->unsched_all(&asoc->stream);
+
 	list_for_each_entry_safe(chk, temp, &q->out_chunk_list, list) {
 		if (!SCTP_PR_PRIO_ENABLED(chk->sinfo.sinfo_flags) ||
 		    chk->sinfo.sinfo_timetolive <= sinfo->sinfo_timetolive)
 			continue;
 
-		list_del_init(&chk->list);
-		q->out_qlen -= chk->skb->len;
+		sctp_sched_dequeue_common(q, chk);
 		asoc->sent_cnt_removable--;
 		asoc->abandoned_unsent[SCTP_PR_INDEX(PRIO)]++;
 		if (chk->sinfo.sinfo_stream < asoc->stream.outcnt) {
@@ -415,6 +425,8 @@ static int sctp_prsctp_prune_unsent(struct sctp_association *asoc,
 			break;
 	}
 
+	q->sched->sched_all(&asoc->stream);
+
 	return msg_len;
 }
 
@@ -1033,22 +1045,9 @@ static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
 		while ((chunk = sctp_outq_dequeue_data(q)) != NULL) {
 			__u32 sid = ntohs(chunk->subh.data_hdr->stream);
 
-			/* RFC 2960 6.5 Every DATA chunk MUST carry a valid
-			 * stream identifier.
-			 */
-			if (chunk->sinfo.sinfo_stream >= asoc->stream.outcnt) {
-
-				/* Mark as failed send. */
-				sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM);
-				if (asoc->peer.prsctp_capable &&
-				    SCTP_PR_PRIO_ENABLED(chunk->sinfo.sinfo_flags))
-					asoc->sent_cnt_removable--;
-				sctp_chunk_free(chunk);
-				continue;
-			}
-
 			/* Has this chunk expired? */
 			if (sctp_chunk_abandoned(chunk)) {
+				sctp_sched_dequeue_done(q, chunk);
 				sctp_chunk_fail(chunk, 0);
 				sctp_chunk_free(chunk);
 				continue;
@@ -1070,6 +1069,7 @@ static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
 				new_transport = asoc->peer.active_path;
 			if (new_transport->state == SCTP_UNCONFIRMED) {
 				WARN_ONCE(1, "Attempt to send packet on unconfirmed path.");
+				sctp_sched_dequeue_done(q, chunk);
 				sctp_chunk_fail(chunk, 0);
 				sctp_chunk_free(chunk);
 				continue;
@@ -1133,6 +1133,11 @@ static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
 				else
 					asoc->stats.oodchunks++;
 
+				/* Only now it's safe to consider this
+				 * chunk as sent, sched-wise.
+				 */
+				sctp_sched_dequeue_done(q, chunk);
+
 				break;
 
 			default:
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index e6a2974e020e1a4232d94e6c2933eebff5f8acb4..402bfbb888cda53248dd192d3756a2f4db1d2a7f 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -50,6 +50,7 @@
 #include <net/sock.h>
 #include <net/sctp/sctp.h>
 #include <net/sctp/sm.h>
+#include <net/sctp/stream_sched.h>
 
 static int sctp_cmd_interpreter(enum sctp_event event_type,
 				union sctp_subtype subtype,
@@ -1089,6 +1090,8 @@ static void sctp_cmd_send_msg(struct sctp_association *asoc,
 
 	list_for_each_entry(chunk, &msg->chunks, frag_list)
 		sctp_outq_tail(&asoc->outqueue, chunk, gfp);
+
+	asoc->outqueue.sched->enqueue(&asoc->outqueue, msg);
 }
 
 
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 055ca25bbc91bf932db8048c72a1b11cc2214942..5ea33a2c453b4272c5c22fa61e8e8bec06001f8b 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -32,8 +32,61 @@
  *    Xin Long <lucien.xin@gmail.com>
  */
 
+#include <linux/list.h>
 #include <net/sctp/sctp.h>
 #include <net/sctp/sm.h>
+#include <net/sctp/stream_sched.h>
+
+/* Migrates chunks from stream queues to new stream queues if needed,
+ * but not across associations. Also removes chunks queued on streams
+ * higher than the new max.
+ */
+static void sctp_stream_outq_migrate(struct sctp_stream *stream,
+				     struct sctp_stream *new, __u16 outcnt)
+{
+	struct sctp_association *asoc;
+	struct sctp_chunk *ch, *temp;
+	struct sctp_outq *outq;
+	int i;
+
+	asoc = container_of(stream, struct sctp_association, stream);
+	outq = &asoc->outqueue;
+
+	list_for_each_entry_safe(ch, temp, &outq->out_chunk_list, list) {
+		__u16 sid = sctp_chunk_stream_no(ch);
+
+		if (sid < outcnt)
+			continue;
+
+		sctp_sched_dequeue_common(outq, ch);
+		/* No need to call dequeue_done here because
+		 * the chunks have already been unscheduled.
+		 */
+
+		/* Mark as failed send. */
+		sctp_chunk_fail(ch, SCTP_ERROR_INV_STRM);
+		if (asoc->peer.prsctp_capable &&
+		    SCTP_PR_PRIO_ENABLED(ch->sinfo.sinfo_flags))
+			asoc->sent_cnt_removable--;
+
+		sctp_chunk_free(ch);
+	}
+
+	if (new) {
+		/* Here we actually move the old ext stuff into the new
+		 * buffer, because we want to keep it. Then
+		 * sctp_stream_update will swap ->out pointers.
+		 */
+		for (i = 0; i < outcnt; i++) {
+			kfree(new->out[i].ext);
+			new->out[i].ext = stream->out[i].ext;
+			stream->out[i].ext = NULL;
+		}
+	}
+
+	for (i = outcnt; i < stream->outcnt; i++)
+		kfree(stream->out[i].ext);
+}
 
 static int sctp_stream_alloc_out(struct sctp_stream *stream, __u16 outcnt,
 				 gfp_t gfp)
@@ -87,7 +140,8 @@ static int sctp_stream_alloc_in(struct sctp_stream *stream, __u16 incnt,
 int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 		     gfp_t gfp)
 {
-	int i;
+	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+	int i, ret = 0;
 
 	gfp |= __GFP_NOWARN;
 
@@ -97,6 +151,11 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	if (outcnt == stream->outcnt)
 		goto in;
 
+	/* Filter out chunks queued on streams that won't exist anymore */
+	sched->unsched_all(stream);
+	sctp_stream_outq_migrate(stream, NULL, outcnt);
+	sched->sched_all(stream);
+
 	i = sctp_stream_alloc_out(stream, outcnt, gfp);
 	if (i)
 		return i;
@@ -105,20 +164,27 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	for (i = 0; i < stream->outcnt; i++)
 		stream->out[i].state = SCTP_STREAM_OPEN;
 
+	sched->init(stream);
+
 in:
 	if (!incnt)
-		return 0;
+		goto out;
 
 	i = sctp_stream_alloc_in(stream, incnt, gfp);
 	if (i) {
-		kfree(stream->out);
-		stream->out = NULL;
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto free;
 	}
 
 	stream->incnt = incnt;
+	goto out;
 
-	return 0;
+free:
+	sched->free(stream);
+	kfree(stream->out);
+	stream->out = NULL;
+out:
+	return ret;
 }
 
 int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid)
@@ -130,13 +196,15 @@ int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid)
 		return -ENOMEM;
 	stream->out[sid].ext = soute;
 
-	return 0;
+	return sctp_sched_init_sid(stream, sid, GFP_KERNEL);
 }
 
 void sctp_stream_free(struct sctp_stream *stream)
 {
+	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
 	int i;
 
+	sched->free(stream);
 	for (i = 0; i < stream->outcnt; i++)
 		kfree(stream->out[i].ext);
 	kfree(stream->out);
@@ -156,6 +224,10 @@ void sctp_stream_clear(struct sctp_stream *stream)
 
 void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new)
 {
+	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+
+	sched->unsched_all(stream);
+	sctp_stream_outq_migrate(stream, new, new->outcnt);
 	sctp_stream_free(stream);
 
 	stream->out = new->out;
@@ -163,6 +235,8 @@ void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new)
 	stream->outcnt = new->outcnt;
 	stream->incnt  = new->incnt;
 
+	sched->sched_all(stream);
+
 	new->out = NULL;
 	new->in  = NULL;
 }
diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c
new file mode 100644
index 0000000000000000000000000000000000000000..40a9a9de2b98a56786a4c8585f5ad514be9189af
--- /dev/null
+++ b/net/sctp/stream_sched.c
@@ -0,0 +1,270 @@
+/* SCTP kernel implementation
+ * (C) Copyright Red Hat Inc. 2017
+ *
+ * This file is part of the SCTP kernel implementation
+ *
+ * These functions manipulate sctp stream queue/scheduling.
+ *
+ * This SCTP implementation is free software;
+ * you can redistribute it and/or modify it under the terms of
+ * the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This SCTP implementation is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; without even the implied
+ *                 ************************
+ * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with GNU CC; see the file COPYING.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Please send any bug reports or fixes you make to the
+ * email address(es):
+ *    lksctp developers <linux-sctp@vger.kernel.org>
+ *
+ * Written or modified by:
+ *    Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
+ */
+
+#include <linux/list.h>
+#include <net/sctp/sctp.h>
+#include <net/sctp/sm.h>
+#include <net/sctp/stream_sched.h>
+
+/* First Come First Serve (a.k.a. FIFO)
+ * RFC DRAFT ndata Section 3.1
+ */
+static int sctp_sched_fcfs_set(struct sctp_stream *stream, __u16 sid,
+			       __u16 value, gfp_t gfp)
+{
+	return 0;
+}
+
+static int sctp_sched_fcfs_get(struct sctp_stream *stream, __u16 sid,
+			       __u16 *value)
+{
+	*value = 0;
+	return 0;
+}
+
+static int sctp_sched_fcfs_init(struct sctp_stream *stream)
+{
+	return 0;
+}
+
+static int sctp_sched_fcfs_init_sid(struct sctp_stream *stream, __u16 sid,
+				    gfp_t gfp)
+{
+	return 0;
+}
+
+static void sctp_sched_fcfs_free(struct sctp_stream *stream)
+{
+}
+
+static void sctp_sched_fcfs_enqueue(struct sctp_outq *q,
+				    struct sctp_datamsg *msg)
+{
+}
+
+static struct sctp_chunk *sctp_sched_fcfs_dequeue(struct sctp_outq *q)
+{
+	struct sctp_stream *stream = &q->asoc->stream;
+	struct sctp_chunk *ch = NULL;
+	struct list_head *entry;
+
+	if (list_empty(&q->out_chunk_list))
+		goto out;
+
+	if (stream->out_curr) {
+		ch = list_entry(stream->out_curr->ext->outq.next,
+				struct sctp_chunk, stream_list);
+	} else {
+		entry = q->out_chunk_list.next;
+		ch = list_entry(entry, struct sctp_chunk, list);
+	}
+
+	sctp_sched_dequeue_common(q, ch);
+
+out:
+	return ch;
+}
+
+static void sctp_sched_fcfs_dequeue_done(struct sctp_outq *q,
+					 struct sctp_chunk *chunk)
+{
+}
+
+static void sctp_sched_fcfs_sched_all(struct sctp_stream *stream)
+{
+}
+
+static void sctp_sched_fcfs_unsched_all(struct sctp_stream *stream)
+{
+}
+
+static struct sctp_sched_ops sctp_sched_fcfs = {
+	.set = sctp_sched_fcfs_set,
+	.get = sctp_sched_fcfs_get,
+	.init = sctp_sched_fcfs_init,
+	.init_sid = sctp_sched_fcfs_init_sid,
+	.free = sctp_sched_fcfs_free,
+	.enqueue = sctp_sched_fcfs_enqueue,
+	.dequeue = sctp_sched_fcfs_dequeue,
+	.dequeue_done = sctp_sched_fcfs_dequeue_done,
+	.sched_all = sctp_sched_fcfs_sched_all,
+	.unsched_all = sctp_sched_fcfs_unsched_all,
+};
+
+/* API to other parts of the stack */
+
+struct sctp_sched_ops *sctp_sched_ops[] = {
+	&sctp_sched_fcfs,
+};
+
+int sctp_sched_set_sched(struct sctp_association *asoc,
+			 enum sctp_sched_type sched)
+{
+	struct sctp_sched_ops *n = sctp_sched_ops[sched];
+	struct sctp_sched_ops *old = asoc->outqueue.sched;
+	struct sctp_datamsg *msg = NULL;
+	struct sctp_chunk *ch;
+	int i, ret = 0;
+
+	if (old == n)
+		return ret;
+
+	if (sched > SCTP_SS_MAX)
+		return -EINVAL;
+
+	if (old) {
+		old->free(&asoc->stream);
+
+		/* Give the next scheduler a clean slate. */
+		for (i = 0; i < asoc->stream.outcnt; i++) {
+			void *p = asoc->stream.out[i].ext;
+
+			if (!p)
+				continue;
+
+			p += offsetofend(struct sctp_stream_out_ext, outq);
+			memset(p, 0, sizeof(struct sctp_stream_out_ext) -
+				     offsetofend(struct sctp_stream_out_ext, outq));
+		}
+	}
+
+	asoc->outqueue.sched = n;
+	n->init(&asoc->stream);
+	for (i = 0; i < asoc->stream.outcnt; i++) {
+		if (!asoc->stream.out[i].ext)
+			continue;
+
+		ret = n->init_sid(&asoc->stream, i, GFP_KERNEL);
+		if (ret)
+			goto err;
+	}
+
+	/* We have to requeue all chunks already queued. */
+	list_for_each_entry(ch, &asoc->outqueue.out_chunk_list, list) {
+		if (ch->msg == msg)
+			continue;
+		msg = ch->msg;
+		n->enqueue(&asoc->outqueue, msg);
+	}
+
+	return ret;
+
+err:
+	n->free(&asoc->stream);
+	asoc->outqueue.sched = &sctp_sched_fcfs; /* Always safe */
+
+	return ret;
+}
+
+int sctp_sched_get_sched(struct sctp_association *asoc)
+{
+	int i;
+
+	for (i = 0; i <= SCTP_SS_MAX; i++)
+		if (asoc->outqueue.sched == sctp_sched_ops[i])
+			return i;
+
+	return 0;
+}
+
+int sctp_sched_set_value(struct sctp_association *asoc, __u16 sid,
+			 __u16 value, gfp_t gfp)
+{
+	if (sid >= asoc->stream.outcnt)
+		return -EINVAL;
+
+	if (!asoc->stream.out[sid].ext) {
+		int ret;
+
+		ret = sctp_stream_init_ext(&asoc->stream, sid);
+		if (ret)
+			return ret;
+	}
+
+	return asoc->outqueue.sched->set(&asoc->stream, sid, value, gfp);
+}
+
+int sctp_sched_get_value(struct sctp_association *asoc, __u16 sid,
+			 __u16 *value)
+{
+	if (sid >= asoc->stream.outcnt)
+		return -EINVAL;
+
+	if (!asoc->stream.out[sid].ext)
+		return 0;
+
+	return asoc->outqueue.sched->get(&asoc->stream, sid, value);
+}
+
+void sctp_sched_dequeue_done(struct sctp_outq *q, struct sctp_chunk *ch)
+{
+	if (!list_is_last(&ch->frag_list, &ch->msg->chunks)) {
+		struct sctp_stream_out *sout;
+		__u16 sid;
+
+		/* datamsg is not finished, so save it as the current one,
+		 * in case the application switches scheduler or a higher
+		 * priority stream comes in.
+		 */
+		sid = sctp_chunk_stream_no(ch);
+		sout = &q->asoc->stream.out[sid];
+		q->asoc->stream.out_curr = sout;
+		return;
+	}
+
+	q->asoc->stream.out_curr = NULL;
+	q->sched->dequeue_done(q, ch);
+}
+
+/* Auxiliary functions for the schedulers */
+void sctp_sched_dequeue_common(struct sctp_outq *q, struct sctp_chunk *ch)
+{
+	list_del_init(&ch->list);
+	list_del_init(&ch->stream_list);
+	q->out_qlen -= ch->skb->len;
+}
+
+int sctp_sched_init_sid(struct sctp_stream *stream, __u16 sid, gfp_t gfp)
+{
+	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+
+	INIT_LIST_HEAD(&stream->out[sid].ext->outq);
+	return sched->init_sid(stream, sid, gfp);
+}
+
+struct sctp_sched_ops *sctp_sched_ops_from_stream(struct sctp_stream *stream)
+{
+	struct sctp_association *asoc;
+
+	asoc = container_of(stream, struct sctp_association, stream);
+
+	return asoc->outqueue.sched;
+}
-- 
2.13.5
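For readers following the scheduler interface added in stream_sched.c, here is a minimal user-space sketch of the same idea: the out queue holds a pointer to an ops table, and dequeue goes through that table instead of hard-coding FIFO order. This is not the kernel code. Every demo_* name is hypothetical, the queue is a plain singly linked list rather than list_head, and only the dequeue op is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for sctp_chunk / sctp_outq; not kernel types. */
struct demo_chunk {
	int id;
	struct demo_chunk *next;
};

struct demo_outq;

/* Cut-down ops table: only the dequeue hook is modelled here. */
struct demo_sched_ops {
	struct demo_chunk *(*dequeue)(struct demo_outq *q);
};

struct demo_outq {
	struct demo_chunk *head, *tail;
	const struct demo_sched_ops *sched;
};

/* Append a chunk to the end of the queue. */
static void demo_outq_tail(struct demo_outq *q, struct demo_chunk *ch)
{
	assert(ch != NULL);
	ch->next = NULL;
	if (q->tail)
		q->tail->next = ch;
	else
		q->head = ch;
	q->tail = ch;
}

/* FCFS policy: always hand out the oldest queued chunk. */
static struct demo_chunk *demo_fcfs_dequeue(struct demo_outq *q)
{
	struct demo_chunk *ch = q->head;

	if (!ch)
		return NULL;
	q->head = ch->next;
	if (!q->head)
		q->tail = NULL;
	ch->next = NULL;
	return ch;
}

static const struct demo_sched_ops demo_sched_fcfs = {
	.dequeue = demo_fcfs_dequeue,
};

/* Callers do not pick a policy; they go through the ops table. */
static struct demo_chunk *demo_outq_dequeue(struct demo_outq *q)
{
	return q->sched->dequeue(q);
}
```

Swapping the policy in this sketch means pointing q->sched at another ops table; the caller of demo_outq_dequeue() stays unchanged, which is the property the patch relies on when it replaces the open-coded list handling in sctp_outq_dequeue_data().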
