Netdev List


* Re: [PATCH] ipv4: Minor logic clean-up in ipv4_mtu
From: David Miller @ 2012-08-31 20:23 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: netdev, jeffrey.t.kirsher
In-Reply-To: <20120827162930.2969.96733.stgit@gitlad.jf.intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Mon, 27 Aug 2012 09:30:01 -0700

> In ipv4_mtu there is some logic where we are testing for a non-zero value
> and a timer expiration, then setting the value to zero, and then testing if
> the value is zero we set it to a value based on the dst.  Instead of
> bothering with the extra steps it is easier to just cleanup the logic so
> that we set it to the dst based value if it is zero or if the timer has
> expired.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

Applied to net-next, thanks.
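
The cleanup described in the quoted commit message can be sketched in user-space C. The struct, its fields, and the helper names below (fake_rt, pmtu, expires, dst_mtu, timer_expired) are hypothetical stand-ins, not the kernel's definitions; the sketch only illustrates that the shorter form returns the same value as the original three-step logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the route state; not the kernel's struct. */
struct fake_rt {
	unsigned int pmtu;     /* cached path MTU, 0 if none */
	unsigned long expires; /* time at which the cached value expires */
	unsigned int dst_mtu;  /* fallback value derived from the dst */
};

/* True once 'now' has reached 'deadline' (wrap-safe comparison). */
static bool timer_expired(unsigned long now, unsigned long deadline)
{
	return (long)(now - deadline) >= 0;
}

/* Before: test non-zero and expired, zero the value, then test for zero. */
static unsigned int mtu_before(const struct fake_rt *rt, unsigned long now)
{
	unsigned int mtu = rt->pmtu;

	if (mtu && timer_expired(now, rt->expires))
		mtu = 0;
	if (!mtu)
		mtu = rt->dst_mtu;
	return mtu;
}

/* After: fall back directly if the value is zero or the timer expired. */
static unsigned int mtu_after(const struct fake_rt *rt, unsigned long now)
{
	unsigned int mtu;

	assert(rt);
	mtu = rt->pmtu;
	if (!mtu || timer_expired(now, rt->expires))
		mtu = rt->dst_mtu;
	return mtu;
}
```

Both functions agree for a live cached value, an expired one, and a zero one; the second simply drops the intermediate assignment to zero.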

^ permalink raw reply

* Re: [PATCH v3] net: add new QCA alx ethernet driver
From: David Miller @ 2012-08-31 20:20 UTC (permalink / raw)
  To: cjren; +Cc: netdev, linux-kernel, qca-linux-team, nic-devel, xiong, rodrigue
In-Reply-To: <1346083963-17610-1-git-send-email-cjren@qca.qualcomm.com>

From: <cjren@qca.qualcomm.com>
Date: Tue, 28 Aug 2012 00:12:43 +0800

> +/*
> + * Definition to enable some features
> + */
> +#undef CONFIG_ALX_MSIX
> +#undef CONFIG_ALX_MSI
> +#undef CONFIG_ALX_MTQ
> +#undef CONFIG_ALX_MRQ
> +#undef CONFIG_ALX_RSS
> +/* #define CONFIG_ALX_MSIX */
> +#define CONFIG_ALX_MSI
> +#define CONFIG_ALX_MTQ
> +#define CONFIG_ALX_MRQ
> +#ifdef CONFIG_ALX_MRQ
> +#define CONFIG_ALX_RSS
> +#endif
> +

Get rid of all of these.  You may never use private feature control macros
in the CONFIG_* namespace, those are for the Kconfig system only.

Local controls of this nature are only appropriate for a driver amidst
development, and not a final version that should be included in the
upstream kernel tree.

You must remove all of these CPP macros, and all code that is currently
protected by the ones which are off.

Just so that your expectations are set appropriately, I anticipate
that there will be at least 5 more rounds of review for things of this
nature before we can even remotely consider adding this driver to the
tree.  This driver is very poorly written and is far away from meeting
our standards for inclusion.

^ permalink raw reply

* Re: [PATCH 2/5] net:atm:fix up ENOIOCTLCMD error handling
From: David Miller @ 2012-08-31 20:14 UTC (permalink / raw)
  To: gaowanlong; +Cc: linux-kernel, netdev
In-Reply-To: <1346052196-32682-3-git-send-email-gaowanlong@cn.fujitsu.com>

From: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Date: Mon, 27 Aug 2012 15:23:13 +0800

> At commit 07d106d0, Linus pointed out that ENOIOCTLCMD should be
> translated as ENOTTY to user mode.
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [Stlinux-devel] [PATCH linux-stm 4/4] net:stmmac: convert driver to use devm_request_and_ioremap.
From: David Miller @ 2012-08-31 20:13 UTC (permalink / raw)
  To: srinivas.kandagatla; +Cc: netdev, peppe.cavallaro
In-Reply-To: <1346341869-20211-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas KANDAGATLA <srinivas.kandagatla@st.com>
Date: Thu, 30 Aug 2012 16:51:09 +0100

> From: Srinivas Kandagatla <srinivas.kandagatla@st.com>
> 
> This patch moves calls to ioremap and request_mem_region to
> devm_request_and_ioremap call.
> 
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Applied.

^ permalink raw reply

* Re: [Stlinux-devel] [PATCH linux-stm 3/4] net/stmmac: Remove bus_id from mdio platform data.
From: David Miller @ 2012-08-31 20:13 UTC (permalink / raw)
  To: srinivas.kandagatla; +Cc: netdev, peppe.cavallaro
In-Reply-To: <1346341843-20169-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas KANDAGATLA <srinivas.kandagatla@st.com>
Date: Thu, 30 Aug 2012 16:50:43 +0100

> From: Srinivas Kandagatla <srinivas.kandagatla@st.com>
> 
> This patch removes bus_id from the mdio platform data. The stmmac mdio
> bus_id is always the same as the stmmac bus_id, so there is no point in
> passing it in a separate variable.
> Also, the stmmac ethernet driver connects to the phy with the bus_id
> passed in its platform data.
> So, having a single bus_id is much simpler.
> 
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Applied.

^ permalink raw reply

* Re: [Stlinux-devel] [PATCH linux-stm 2/4] net:stmmac: fix broken stmmac_pltfr_remove.
From: David Miller @ 2012-08-31 20:12 UTC (permalink / raw)
  To: srinivas.kandagatla; +Cc: netdev, peppe.cavallaro
In-Reply-To: <1346341819-20125-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas KANDAGATLA <srinivas.kandagatla@st.com>
Date: Thu, 30 Aug 2012 16:50:19 +0100

> From: Srinivas Kandagatla <srinivas.kandagatla@st.com>
> 
> This patch fixes the stmmac_pltfr_remove function, which is broken because
> it accesses the plat variable through the priv pointer after that memory
> has been freed by free_netdev, called from stmmac_dvr_remove.
> 
> In short, this patch caches the plat pointer in a local variable before
> calling stmmac_dvr_remove to prevent the code from accessing freed memory.
> 
> Without this patch any attempt to remove the stmmac device will fail as
> below:
 ...
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Applied.
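
The fix described above, caching a pointer before the structure that holds it is freed, can be sketched in user-space C. All names here (plat_data, priv_data, fake_dvr_remove, fake_pltfr_remove, exit_calls) are hypothetical stand-ins for the driver's types and functions, and free() stands in for free_netdev; this is an illustration of the pattern, not the applied patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical platform data that outlives the private structure. */
struct plat_data {
	int exit_calls; /* counts calls to a notional platform exit hook */
};

/* Hypothetical private data, released during driver removal. */
struct priv_data {
	struct plat_data *plat;
};

/* Stand-in for the core remove routine: it frees priv. */
static void fake_dvr_remove(struct priv_data *priv)
{
	free(priv);
}

static void fake_pltfr_remove(struct priv_data *priv)
{
	/* Cache the pointer first: priv must not be touched after removal. */
	struct plat_data *plat = priv->plat;

	assert(plat);
	fake_dvr_remove(priv);

	/* Safe: uses the cached pointer rather than the freed priv->plat. */
	plat->exit_calls++;
}
```

Reading priv->plat after fake_dvr_remove() would be a use-after-free, which is the class of bug the quoted commit message describes.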

^ permalink raw reply

* Re: [Stlinux-devel] [PATCH linux-stm 1/4] net:stmmac: Add check if mdiobus is registered in stmmac_mdio_unregister
From: David Miller @ 2012-08-31 20:12 UTC (permalink / raw)
  To: srinivas.kandagatla; +Cc: netdev, peppe.cavallaro
In-Reply-To: <1346341798-19704-1-git-send-email-srinivas.kandagatla@st.com>

From: Srinivas KANDAGATLA <srinivas.kandagatla@st.com>
Date: Thu, 30 Aug 2012 16:49:58 +0100

> From: Srinivas Kandagatla <srinivas.kandagatla@st.com>
> 
> This patch adds a basic check in stmmac_mdio_unregister to see if mdio
> bus registration for this driver was actually successful or not.
> 
> The use case here is a BSP that uses the mdio-gpio bus along with the
> stmmac driver by passing mdio_bus_data as NULL in platform data.
> A call to stmmac_mdio_register with mdio_bus_data as NULL returns 0, which
> is considered a successful call from stmmac. Then, when we unload
> the driver, we just call stmmac_mdio_unregister. This is where the actual
> problem is: the stmmac-mdio code does not really know at this point
> whether stmmac_mdio_register was actually successful.
> 
> So adding a check in stmmac_mdio_unregister is always safe.
> 
> Without this patch stmmac driver calls stmmac_mdio_register from
> stmmac_release which Segfaults as mii bus was never registered in the
> first place.
> 
> Originally this bug was found when unloading an stmmac driver
> instance which uses mdio-gpio for smi access.
> 
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Applied.
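
A minimal user-space sketch of the guard described above, assuming a private structure whose bus pointer stays NULL when registration was skipped. The names (fake_priv, fake_mii_bus, fake_mdio_register, fake_mdio_unregister) are hypothetical and do not reproduce the driver's code.

```c
#include <assert.h>
#include <stdlib.h>

struct fake_mii_bus {
	int id;
};

struct fake_priv {
	struct fake_mii_bus *mii; /* stays NULL if no bus was registered */
};

/* Returns 0 without registering anything when no bus data is supplied. */
static int fake_mdio_register(struct fake_priv *priv, const int *bus_data)
{
	assert(priv);
	if (!bus_data)
		return 0;

	priv->mii = malloc(sizeof(*priv->mii));
	if (!priv->mii)
		return -1;
	priv->mii->id = *bus_data;
	return 0;
}

/* The guard: do nothing if registration never created a bus. */
static int fake_mdio_unregister(struct fake_priv *priv)
{
	if (!priv->mii)
		return 0;

	free(priv->mii);
	priv->mii = NULL;
	return 0;
}
```

Because register can return 0 in both the "registered" and "skipped" cases, the unregister path cannot rely on the return value alone and has to test the pointer.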

^ permalink raw reply

* Re: [PATCH] usbnet: fix deadlock in resume
From: David Miller @ 2012-08-31 20:12 UTC (permalink / raw)
  To: oliver; +Cc: netdev, ming.lei, oneukum, stable
In-Reply-To: <1346049698-10740-1-git-send-email-oliver@neukum.org>

From: oliver@neukum.org
Date: Mon, 27 Aug 2012 08:41:38 +0200

> From: Oliver Neukum <oliver@neukum.org>
> 
> A usbnet device can share a multifunction device
> with a storage device. If the storage device is autoresumed
> the usbnet devices also needs to be autoresumed. Allocating
> memory with GFP_KERNEL can deadlock in this case.
> 
> This should go back into all kernels that have
> commit 65841fd5132c3941cdf5df09e70df3ed28323212
> That is 3.5
> 
> Signed-off-by: Oliver Neukum <oneukum@suse.de>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 01/12] qlcnic: Refactoring - template based hardware interface
From: David Miller @ 2012-08-31 20:05 UTC (permalink / raw)
  To: sony.chacko; +Cc: netdev, Dept_NX_Linux_NIC_Driver, anirban.chakraborty
In-Reply-To: <1346394541-3486-2-git-send-email-sony.chacko@qlogic.com>

From: Sony Chacko <sony.chacko@qlogic.com>
Date: Fri, 31 Aug 2012 02:28:50 -0400

> +static inline int
> +qlcnic_config_bridged_mode(struct qlcnic_adapter *adapter, u32 enable)
> +{
> +	return adapter->nic_ops->config_bridged_mode(adapter, enable);
> +
> +}
> +
> +static inline int
> +qlcnic_config_led(struct qlcnic_adapter *adapter, u32 state, u32 rate)
> +{
> +	return adapter->nic_ops->config_led(adapter, state, rate);
> +
> +}
> +

Please get rid of those unnecessary empty lines in the function bodies.

^ permalink raw reply

* Re: [net-next 0/8][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2012-08-31 20:03 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1346390174-30449-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 30 Aug 2012 22:16:06 -0700

> This series contains updates to e1000e and ixgbevf.
> 
> The following are changes since commit 761743ebc92df72053e736fce953a5d2e90099d5:
>   net/fsl_pq_mdio: add support for the Fman 1G MDIO controller
> and are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Pulled, thanks Jeff.

^ permalink raw reply

* Re: [PATCH] openvswitch: using kfree_rcu() to simplify the code
From: David Miller @ 2012-08-31 19:57 UTC (permalink / raw)
  To: jesse; +Cc: weiyj.lk, yongjun_wei, dev, netdev
In-Reply-To: <CAEP_g=_9PYpQQbu-1eH7uF-Dk1+LbOfW1GAS6v6=Vu-8hq42yg@mail.gmail.com>

From: Jesse Gross <jesse@nicira.com>
Date: Tue, 28 Aug 2012 16:00:24 -0700

> On Sun, Aug 26, 2012 at 9:20 PM, Wei Yongjun <weiyj.lk@gmail.com> wrote:
>> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>>
>> The callback function of call_rcu() just calls a kfree(), so we
>> can use kfree_rcu() instead of call_rcu() + callback function.
>>
>> spatch with a semantic match was used to find this problem.
>> (http://coccinelle.lip6.fr/)
>>
>> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> 
> Thanks Wei.
> 
> Acked-by: Jesse Gross <jesse@nicira.com>

Applied to net-next

^ permalink raw reply

* Re: [PATCH] af_unix: fix shutdown parameter checking
From: David Miller @ 2012-08-31 19:57 UTC (permalink / raw)
  To: xi.wang; +Cc: netdev, linux-kernel
In-Reply-To: <1346035633-2492-1-git-send-email-xi.wang@gmail.com>

From: Xi Wang <xi.wang@gmail.com>
Date: Sun, 26 Aug 2012 22:47:13 -0400

> Return -EINVAL rather than 0 given an invalid "mode" parameter.
> 
> Signed-off-by: Xi Wang <xi.wang@gmail.com>

Applied to net-next

^ permalink raw reply

* Re: [PATCH] decnet: fix shutdown parameter checking
From: David Miller @ 2012-08-31 19:57 UTC (permalink / raw)
  To: swhiteho; +Cc: xi.wang, netdev, linux-kernel
In-Reply-To: <1346059001.2703.7.camel@menhir>

From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 27 Aug 2012 10:16:41 +0100

> On Sun, 2012-08-26 at 22:37 -0400, Xi Wang wrote:
>> The allowed value of "how" is SHUT_RD/SHUT_WR/SHUT_RDWR (0/1/2),
>> rather than SHUTDOWN_MASK (3).
>> 
>> Signed-off-by: Xi Wang <xi.wang@gmail.com>
> Acked-by: Steven Whitehouse <swhiteho@redhat.com>

Applied to net-next.

> Although it could be argued that we should also continue to accept the
> value 3 just in case there is any userland software out there which
> sends that value,

True, but this is a rather standard BSD socket interface with a very
specific small set of legitimate input parameters.  Allowing
deviation, even for compatibility for specific protocols, is largely
unwise.
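
The parameter check under discussion can be sketched as follows. This is a user-space illustration only: the FAKE_* constants are local copies of the bit values mentioned in the thread, the helper name is hypothetical, and the "how + 1" conversion is an assumption based on the 0/1/2 values quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Local illustration-only copies of the internal shutdown bits. */
#define FAKE_RCV_SHUTDOWN  1
#define FAKE_SEND_SHUTDOWN 2
#define FAKE_SHUTDOWN_MASK 3

/* The conversion below relies on the 0/1/2 values quoted in the thread. */
_Static_assert(SHUT_RD == 0 && SHUT_WR == 1 && SHUT_RDWR == 2,
	       "unexpected SHUT_* values");

/*
 * Validate the user-supplied "how" and convert it to an internal mask.
 * Returns a positive mask on success, or -EINVAL for anything other than
 * SHUT_RD, SHUT_WR or SHUT_RDWR.
 */
static int shutdown_how_to_mask(int how)
{
	if (how != SHUT_RD && how != SHUT_WR && how != SHUT_RDWR)
		return -EINVAL;
	return how + 1; /* 0/1/2 -> 1/2/3 */
}
```

The value 3 is a valid internal mask but not a valid user-supplied argument, which is why a check against the mask accepts one value too many.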

^ permalink raw reply

* Re: [PATCH v2] cs89x0 : packet reception not working
From: David Miller @ 2012-08-31 19:49 UTC (permalink / raw)
  To: jaccon.bastiaansen; +Cc: joe, linux-arm-kernel, netdev, s.hauer, festevam
In-Reply-To: <1346104431-3784-1-git-send-email-jaccon.bastiaansen@gmail.com>

From: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Date: Mon, 27 Aug 2012 23:53:51 +0200

> The RxCFG register of the CS89x0 could be configured incorrectly
> (because of misplaced parentheses), resulting in the disabling
> of packet reception.
> 
> Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH 1/1] tcp: Wrong timeout for SYN segments
From: David Miller @ 2012-08-31 19:47 UTC (permalink / raw)
  To: eric.dumazet; +Cc: alex, hkjerry.chu, netdev, linux-kernel
In-Reply-To: <20120831.154234.735439593093335863.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Fri, 31 Aug 2012 15:42:34 -0400 (EDT)

> Applied with some minor comment formatting and wording adjustments.

BTW, please keep in mind that when you modify the value
of TCP_SYN_RETRIES, you are having an influence upon DCCP
as well.

TCP is not the only protocol which uses this value.

^ permalink raw reply

* Re: [PATCH 1/1] tcp: Wrong timeout for SYN segments
From: David Miller @ 2012-08-31 19:42 UTC (permalink / raw)
  To: eric.dumazet; +Cc: alex, hkjerry.chu, netdev, linux-kernel
In-Reply-To: <1346419550.2591.12.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 31 Aug 2012 06:25:50 -0700

> On Fri, 2012-08-31 at 14:48 +0200, Alexander Bergmann wrote:
> 
>> Hi Eric!
>> 
>> I've also changed the Documentation file. As usual, comments are welcome!
>> 
>> 
>> Alex
>> 
>> 
>> From 848f34ce27f65401940ae98e0b2d395888d3986d Mon Sep 17 00:00:00 2001
>> From: Alexander Bergmann <alex@linlab.net>
>> Date: Fri, 31 Aug 2012 14:31:00 +0200
>> Subject: [PATCH 1/1] tcp: Increase timeout for SYN segments
>> 
>> Commit 9ad7c049 changed the initRTO from 3secs to 1sec in accordance to
>> RFC6298 (former RFC2988bis). This reduced the time till the last SYN
>> retransmission packet gets sent from 93secs to 31secs.
>> 
>> RFC1122 is stating that the retransmission should be done for at least 3
>> minutes, but this seems to be quite high.
>> 
>>   "However, the values of R1 and R2 may be different for SYN
>>   and data segments.  In particular, R2 for a SYN segment MUST
>>   be set large enough to provide retransmission of the segment
>>   for at least 3 minutes.  The application can close the
>>   connection (i.e., give up on the open attempt) sooner, of
>>   course."
>> 
>> This patch increases the value of TCP_SYN_RETRIES to the value of 6,
>> providing a retransmission window of 63secs.
>> 
>> The comments for SYN and SYNACK retries have also been updated to
>> describe the current settings. The same goes for the documentation file
>> "Documentation/networking/ip-sysctl.txt".
>> 
>> Signed-off-by: Alexander Bergmann <alex@linlab.net>
>> ---
> 
> Thanks for your patience and followup, this seems good to me !
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied with some minor comment formatting and wording adjustments.

Thanks everyone.
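
The figures in the quoted commit message follow from simple exponential backoff. The sketch below assumes the timeout doubles after each retransmission and that no upper bound on the timeout is reached; the function name is hypothetical. Under those assumptions, five retransmissions reproduce the 93 s and 31 s figures, and six give the 63 s window.

```c
#include <assert.h>

/*
 * Seconds from the first SYN until the last retransmission is sent,
 * assuming the timeout starts at init_rto and doubles every time.
 */
static unsigned int last_syn_retransmit_secs(unsigned int init_rto,
					     unsigned int retries)
{
	unsigned int rto = init_rto;
	unsigned int elapsed = 0;
	unsigned int i;

	assert(retries < 16); /* keep the doubling well inside range */
	for (i = 0; i < retries; i++) {
		elapsed += rto; /* wait one timeout, then retransmit */
		rto *= 2;       /* exponential backoff */
	}
	return elapsed;
}
```

For example, last_syn_retransmit_secs(3, 5) sums 3 + 6 + 12 + 24 + 48, and last_syn_retransmit_secs(1, 6) sums 1 + 2 + 4 + 8 + 16 + 32.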

^ permalink raw reply

* Re: [PATCH 1/2] ipv4: Improve the scaling of the ARP cache for multicast destinations.
From: Bob Gilligan @ 2012-08-31 19:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120830.210628.365120808137655227.davem@davemloft.net>

On 8/30/12 6:06 PM, David Miller wrote:
> From: Bob Gilligan <gilligan@aristanetworks.com>
> Date: Thu, 30 Aug 2012 17:55:04 -0700
> 
>> The mapping from multicast IPv4 address to MAC address can just as
>> easily be done at the time a packet is to be sent.  With this change,
>> we maintain one ARP cache entry for each interface that has at least
>> one multicast group member.  All routes to IPv4 multicast destinations
>> via a particular interface use the same ARP cache entry.  This entry
>> does not store the MAC address to use.  Instead, packets for multicast
>> destinations go to a new output function that maps the destination
>> IPv4 multicast address into the MAC address and forms the MAC header.
> 
> Doing an ARP MC mapping on every packet is much more expensive than
> doing a copy of the hard header cache.
> 
> I do not believe the memory consumption issue you use to justify this
> change is a real issue.
> 
> If you are talking to that many multicast groups actively, you do want
> that many neighbour cache entries.  This is not different from talking
> to nearly every IP address on a local /8 subnet.  You'll have a huge
> number of neighbour table entries in that case as well.
> 
> If the actual steady state number of active groups being spoken
> to is smaller, you can tune the neighbour cache thresholds to collect
> old less used entries more quickly.
> 
> And this today is trivial, since routes no longer hold a reference
> to neighbour entries.  Therefore any neighbour entry whatsoever can
> be immediately reclaimed at any moment.

The scaling is N-squared: the number of neighbor cache entries
required for your multicast traffic is interfaces * groups.  100
interfaces and 100 groups could generate 10,000 entries. 1,000
interfaces and 1,000 groups could generate a million entries.

But the number of groups is hard to predict: it depends on the
applications in use and the multicast traffic they generate.  So, it
is hard to come up with a "budget" for multicast entries in the
neighbor cache for a multicast router.

If you pick a gc_thresh3 that is less than your working set, you'll
end up thrashing the neighbor cache.  And calls to neigh_forced_gc()
are expensive: It performs a linear search of the entire neighbor
cache.  Also, the calls to neigh_forced_gc() due to a large number of
multicast entries will negatively impact the unicast entries sharing the
neighbor cache: it will free any unreferenced but resolved unicast
entries. Any subsequent packets for those destinations will trigger a
re-ARP.  Unnecessary re-ARPing is generally undesirable in a router.

The user who wants to avoid these problems is left with the
alternative of setting gc_thresh3 to a very large number based on a
worst case estimate of the number of unicast plus multicast entries
required.

Seems just simpler and more efficient to keep the multicast entries
out of the neighbor cache entirely.

Bob.

> 
> I'm not fond of these patches, and adding yet more special cases to
> the neighbour layer, and therefore will not apply them.
> 
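
For reference, the address mapping this thread refers to is the standard IPv4 multicast to Ethernet mapping (RFC 1112): the low 23 bits of the group address are placed under the fixed prefix 01:00:5e. The sketch below is a user-space illustration with a hypothetical function name; it is not the output function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map an IPv4 multicast address to its Ethernet multicast MAC address.
 * addr is in host byte order, e.g. 224.0.0.251 is 0xE00000FB.
 */
static void ipv4_mcast_to_mac(uint32_t addr, uint8_t mac[6])
{
	assert((addr >> 28) == 0xE); /* must lie in 224.0.0.0/4 */

	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (addr >> 16) & 0x7f; /* only 23 bits of the group survive */
	mac[4] = (addr >> 8) & 0xff;
	mac[5] = addr & 0xff;
}
```

The mapping itself is a handful of shifts and masks; the disagreement in the thread is about where to pay for it (per packet, or once per cached neighbour entry), not about its complexity.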

^ permalink raw reply

* Re: [PATCH 0/6] netfilter updates for 3.6-rc
From: David Miller @ 2012-08-31 19:15 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1346421789-3449-1-git-send-email-pablo@netfilter.org>

From: pablo@netfilter.org
Date: Fri, 31 Aug 2012 16:03:03 +0200

> You can pull these changes from:
> 
> git://1984.lsi.us.es/nf master

Pulled.

> BTW, please merge net to net-next after this so I can resolve the
> conflict between the SIP helper and NAT IPv6 changes from Patrick,
> which is scheduled for net-next.

Done.

^ permalink raw reply

* Re: [BUG]  TIPC handling of -ERESTARTSYS in connect()
From: Chris Friesen @ 2012-08-31 15:18 UTC (permalink / raw)
  To: Ying Xue; +Cc: netdev, Allan Stephens, Jon Maloy
In-Reply-To: <5040D3C4.2010308@genband.com>

On 08/31/2012 09:09 AM, Chris Friesen wrote:
> On 08/31/2012 03:37 AM, Ying Xue wrote:
>> Hi Chris,
>>
>> Although this is a known issue, still thanks for your report.
>>
>> Regardless of 1.7.7 or mainline, the issue really exists.
>> Can you please check and verify the attached patch?
>>
>> PS: the patch is based on 1.7.7 rather than mainline.
>
>
> I haven't had a chance to test it yet but from visual inspection the
> patch looks pretty good.
>
> It looks like the case where we come in with "sock->state ==
> SS_LISTENING" has changed. Previously we would return -EOPNOTSUPP but
> now it'll be -EINVAL. Is that intentional?

Just noticed something else.  In the SS_CONNECTING case I don't think 
there's any point in setting "res = -EALREADY" since a bit further down 
it gets set unconditionally anyway.

Chris

^ permalink raw reply

* Re: [BUG]  TIPC handling of -ERESTARTSYS in connect()
From: Chris Friesen @ 2012-08-31 15:09 UTC (permalink / raw)
  To: Ying Xue; +Cc: netdev, Allan Stephens, Jon Maloy
In-Reply-To: <504085D9.3010907@windriver.com>

On 08/31/2012 03:37 AM, Ying Xue wrote:
> Hi Chris,
>
> Although this is a known issue, still thanks for your report.
>
> Regardless of 1.7.7 or mainline, the issue really exists.
> Can you please check and verify the attached patch?
>
> PS: the patch is based on 1.7.7 rather than mainline.

I haven't had a chance to test it yet but from visual inspection the 
patch looks pretty good.

It looks like the case where we come in with "sock->state == 
SS_LISTENING" has changed.  Previously we would return -EOPNOTSUPP but 
now it'll be -EINVAL.  Is that intentional?

Chris

^ permalink raw reply

* Re: [PATCH 1/3] tcp: TCP Fast Open Server - header & support functions
From: Eric Dumazet @ 2012-08-31 14:21 UTC (permalink / raw)
  To: H.K. Jerry Chu
  Cc: davem, ycheng, edumazet, ncardwell, sivasankar, therbert, netdev
In-Reply-To: <1346369948-1722-2-git-send-email-hkchu@google.com>

On Thu, 2012-08-30 at 16:39 -0700, H.K. Jerry Chu wrote:
> From: Jerry Chu <hkchu@google.com>
> 
> This patch adds all the necessary data structure and support
> functions to implement TFO server side. It also documents a number
> of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
> extension MIBs.
> 
> In addition, it includes the following:
> 
> 1. a new TCP_FASTOPEN socket option an application must call to
> supply a max backlog allowed in order to enable TFO on its listener.
> 
> 2. A number of key data structures:
> "fastopen_rsk" in tcp_sock - for a big socket to access its
> request_sock for retransmission and ack processing purpose. It is
> non-NULL iff 3WHS not completed.
> 
> "fastopenq" in request_sock_queue - points to a per Fast Open
> listener data structure "fastopen_queue" to keep track of qlen (# of
> outstanding Fast Open requests) and max_qlen, among other things.
> 
> "listener" in tcp_request_sock - to point to the original listener
> for book-keeping purpose, i.e., to maintain qlen against max_qlen
> as part of defense against IP spoofing attack.
> 
> 3. various data structure and functions, many in tcp_fastopen.c, to
> support server side Fast Open cookie operations, including
> /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
> 
> Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Tom Herbert <therbert@google.com>
> ---
>  Documentation/networking/ip-sysctl.txt |   29 ++++++++---
>  include/linux/snmp.h                   |    4 ++
>  include/linux/tcp.h                    |   45 ++++++++++++++++-
>  include/net/request_sock.h             |   36 ++++++++++++++
>  include/net/tcp.h                      |   46 +++++++++++++++---
>  net/ipv4/proc.c                        |    4 ++
>  net/ipv4/sysctl_net_ipv4.c             |   45 +++++++++++++++++
>  net/ipv4/tcp_fastopen.c                |   83 +++++++++++++++++++++++++++++++-
>  net/ipv4/tcp_input.c                   |    4 +-
>  9 files changed, 276 insertions(+), 20 deletions(-)

There are two very small points that can be addressed later, or in next
iteration if there is one.

static inline bool fastopen_cookie_present(struct tcp_fastopen_cookie *foc)
{
       return (foc)->len != -1;
}

should be :

static inline bool fastopen_cookie_present(const struct tcp_fastopen_cookie *foc)
{
       return foc->len != -1;
}

And we should add a BUILD_BUG_ON(TCP_FASTOPEN_KEY_LENGTH != 4*sizeof(u32));
in proc_tcp_fastopen_key().

Acked-by: Eric Dumazet <edumazet@google.com>
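
Both suggestions can be tried out in user space. In the sketch below the struct layout and the two FAKE_* lengths are assumptions made for illustration, and _Static_assert stands in for BUILD_BUG_ON; none of the names are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_FASTOPEN_KEY_LENGTH 16 /* assumed value, illustration only */
#define FAKE_COOKIE_MAX          16 /* assumed value, illustration only */

/* Hypothetical cookie layout; len == -1 means "no cookie present". */
struct fake_fastopen_cookie {
	int8_t len;
	uint8_t val[FAKE_COOKIE_MAX];
};

/* Compile-time check, the user-space analogue of BUILD_BUG_ON(). */
_Static_assert(FAKE_FASTOPEN_KEY_LENGTH == 4 * sizeof(uint32_t),
	       "key must be exactly four 32-bit words");

/* const-qualified parameter, no redundant parentheses around foc. */
static inline bool
fake_cookie_present(const struct fake_fastopen_cookie *foc)
{
	return foc->len != -1;
}
```

Taking a pointer to const lets callers that only hold a read-only cookie use the helper without a cast, and the compile-time check fails the build as soon as the key length and the parsing code drift apart.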

^ permalink raw reply

* [PATCH 6/6] netfilter: nf_conntrack: fix racy timer handling with reliable events
From: pablo @ 2012-08-31 14:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1346421789-3449-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

Existing code assumes that del_timer returns true for alive conntrack
entries. However, this is not true if reliable events are enabled.
In that case, del_timer may return true for entries that were
just inserted in the dying list. Note that packets / ctnetlink may
hold references to conntrack entries that were just inserted to such
list.

This patch fixes the issue by adding an independent timer for
event delivery. This increases the size of the ecache extension.
Still we can revisit this later and use variable size extensions
to allocate this area on demand.

Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_ecache.h |    1 +
 net/netfilter/nf_conntrack_core.c           |   16 +++++++++++-----
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_ecache.h b/include/net/netfilter/nf_conntrack_ecache.h
index e1ce104..4a045cd 100644
--- a/include/net/netfilter/nf_conntrack_ecache.h
+++ b/include/net/netfilter/nf_conntrack_ecache.h
@@ -18,6 +18,7 @@ struct nf_conntrack_ecache {
 	u16 ctmask;		/* bitmask of ct events to be delivered */
 	u16 expmask;		/* bitmask of expect events to be delivered */
 	u32 pid;		/* netlink pid of destroyer */
+	struct timer_list timeout;
 };
 
 static inline struct nf_conntrack_ecache *
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index cf48755..2ceec64 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -249,12 +249,15 @@ static void death_by_event(unsigned long ul_conntrack)
 {
 	struct nf_conn *ct = (void *)ul_conntrack;
 	struct net *net = nf_ct_net(ct);
+	struct nf_conntrack_ecache *ecache = nf_ct_ecache_find(ct);
+
+	BUG_ON(ecache == NULL);
 
 	if (nf_conntrack_event(IPCT_DESTROY, ct) < 0) {
 		/* bad luck, let's retry again */
-		ct->timeout.expires = jiffies +
+		ecache->timeout.expires = jiffies +
 			(random32() % net->ct.sysctl_events_retry_timeout);
-		add_timer(&ct->timeout);
+		add_timer(&ecache->timeout);
 		return;
 	}
 	/* we've got the event delivered, now it's dying */
@@ -268,6 +271,9 @@ static void death_by_event(unsigned long ul_conntrack)
 void nf_ct_insert_dying_list(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
+	struct nf_conntrack_ecache *ecache = nf_ct_ecache_find(ct);
+
+	BUG_ON(ecache == NULL);
 
 	/* add this conntrack to the dying list */
 	spin_lock_bh(&nf_conntrack_lock);
@@ -275,10 +281,10 @@ void nf_ct_insert_dying_list(struct nf_conn *ct)
 			     &net->ct.dying);
 	spin_unlock_bh(&nf_conntrack_lock);
 	/* set a new timer to retry event delivery */
-	setup_timer(&ct->timeout, death_by_event, (unsigned long)ct);
-	ct->timeout.expires = jiffies +
+	setup_timer(&ecache->timeout, death_by_event, (unsigned long)ct);
+	ecache->timeout.expires = jiffies +
 		(random32() % net->ct.sysctl_events_retry_timeout);
-	add_timer(&ct->timeout);
+	add_timer(&ecache->timeout);
 }
 EXPORT_SYMBOL_GPL(nf_ct_insert_dying_list);
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 5/6] netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation
From: pablo @ 2012-08-31 14:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1346421789-3449-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

We're hitting a bug while trying to reinsert an already existing
expectation:

kernel BUG at kernel/timer.c:895!
invalid opcode: 0000 [#1] SMP
[...]
Call Trace:
 <IRQ>
 [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack]
 [<ffffffff812d423a>] ? in4_pton+0x72/0x131
 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip]
 [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip]
 [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip]
 [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c
 [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip]

We have to remove the RTP expectation if the RTCP expectation hits EBUSY
since we keep trying with other ports until we succeed.

Reported-by: Rafal Fitt <rafalf@aplusc.com.pl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nf_nat_sip.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/netfilter/nf_nat_sip.c b/net/ipv4/netfilter/nf_nat_sip.c
index 4ad9cf1..9c87cde 100644
--- a/net/ipv4/netfilter/nf_nat_sip.c
+++ b/net/ipv4/netfilter/nf_nat_sip.c
@@ -502,7 +502,10 @@ static unsigned int ip_nat_sdp_media(struct sk_buff *skb, unsigned int dataoff,
 		ret = nf_ct_expect_related(rtcp_exp);
 		if (ret == 0)
 			break;
-		else if (ret != -EBUSY) {
+		else if (ret == -EBUSY) {
+			nf_ct_unexpect_related(rtp_exp);
+			continue;
+		} else if (ret < 0) {
 			nf_ct_unexpect_related(rtp_exp);
 			port = 0;
 			break;
-- 
1.7.10.4
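
The retry behaviour the patch describes can be modelled in user space. Everything below is a hypothetical simulation: fake_expect and fake_unexpect stand in for registering and removing an expectation, and two small arrays stand in for the expectation table. The point it illustrates is that the RTP expectation has to be dropped when the RTCP one hits EBUSY, because the loop moves on to another port pair.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define FAKE_PORTS 16

static bool fake_busy[FAKE_PORTS];     /* ports taken by someone else */
static bool fake_expected[FAKE_PORTS]; /* expectations we registered */

/* Register an expectation; -EBUSY if the port is already in use. */
static int fake_expect(unsigned int port)
{
	if (port >= FAKE_PORTS)
		return -EINVAL;
	if (fake_busy[port] || fake_expected[port])
		return -EBUSY;
	fake_expected[port] = true;
	return 0;
}

static void fake_unexpect(unsigned int port)
{
	assert(port < FAKE_PORTS);
	fake_expected[port] = false;
}

/* Returns the RTP port chosen (RTCP is port + 1), or 0 on failure. */
static unsigned int fake_alloc_rtp_rtcp(unsigned int first)
{
	unsigned int port;

	assert(first != 0); /* 0 is reserved as the failure value */
	for (port = first; port + 1 < FAKE_PORTS; port += 2) {
		int ret = fake_expect(port);

		if (ret == -EBUSY)
			continue;
		if (ret < 0)
			return 0;

		ret = fake_expect(port + 1);
		if (ret == 0)
			return port;

		/* RTCP failed: drop the RTP expectation before moving on. */
		fake_unexpect(port);
		if (ret != -EBUSY)
			return 0;
	}
	return 0;
}
```

Without the fake_unexpect() call on the EBUSY path, the first RTP expectation would stay registered while the loop goes on to register another one for the next pair.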

^ permalink raw reply related

* [PATCH 2/6] ipvs: fix error return code
From: pablo @ 2012-08-31 14:03 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1346421789-3449-1-git-send-email-pablo@netfilter.org>

From: Julia Lawall <Julia.Lawall@lip6.fr>

Initialize return variable before exiting on an error path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
 { ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
 return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipvs/ip_vs_ctl.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 72bf32a..f51013c 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -1171,8 +1171,10 @@ ip_vs_add_service(struct net *net, struct ip_vs_service_user_kern *u,
 		goto out_err;
 	}
 	svc->stats.cpustats = alloc_percpu(struct ip_vs_cpu_stats);
-	if (!svc->stats.cpustats)
+	if (!svc->stats.cpustats) {
+		ret = -ENOMEM;
 		goto out_err;
+	}
 
 	/* I'm the first user of the service */
 	atomic_set(&svc->usecnt, 0);
-- 
1.7.10.4
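
The class of bug this patch fixes, an error path that jumps to cleanup without setting the return variable, can be reproduced in a few lines of user-space C. The function and struct names below are hypothetical, and the fail_alloc flag exists only so the failure can be simulated.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct fake_service {
	int *stats;
};

/* fail_alloc lets a caller simulate failure of the second allocation. */
static int fake_add_service(struct fake_service **out, int fail_alloc)
{
	struct fake_service *svc;
	int ret = 0;

	assert(out);
	*out = NULL;

	svc = calloc(1, sizeof(*svc));
	if (!svc) {
		ret = -ENOMEM;
		goto out_err;
	}

	svc->stats = fail_alloc ? NULL : calloc(1, sizeof(*svc->stats));
	if (!svc->stats) {
		ret = -ENOMEM; /* without this, the error path returns 0 */
		goto out_err;
	}

	*out = svc;
	return 0;

out_err:
	free(svc);
	return ret;
}
```

If the assignment before the second goto is removed, the caller sees a return value of 0 together with a NULL service pointer, which is the inconsistency the semantic match looks for.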

^ permalink raw reply related
