* merge window warning
From: David Miller @ 2013-09-04 17:10 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA
Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netfilter-devel-u79uwXL29TY76Z2rM5mHXA
The merge window is now open.
This means if a feature patch isn't in patchwork already, and
completely ready to apply, you should not submit it at this time.
I will just automatically reject any inappropriate patches sent from
now until the net-next tree opens again after the merge window closes,
in order to assist with my ability to process the current backlog
efficiently.
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [patch v2] sfc: check for allocation failure
From: David Miller @ 2013-09-04 17:08 UTC (permalink / raw)
To: dan.carpenter; +Cc: linux-net-drivers, bhutchings, netdev, kernel-janitors
In-Reply-To: <20130904150727.GA32327@elgon.mountain>
From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Wed, 4 Sep 2013 18:07:27 +0300
> It upsets static analyzers when we don't check for allocation failure.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> v2: rebased on latest linux-next
Applied, thanks Dan.
^ permalink raw reply
* Re: [PATCH] net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
From: Gregory CLEMENT @ 2013-09-04 16:45 UTC (permalink / raw)
To: Thomas Petazzoni
Cc: David S. Miller, netdev, Ezequiel Garcia, Lior Amsalem,
linux-arm-kernel, Jason Cooper, Andrew Lunn
In-Reply-To: <1378304812-21390-1-git-send-email-thomas.petazzoni@free-electrons.com>
On 04/09/2013 16:26, Thomas Petazzoni wrote:
> This commit implements the ->ndo_do_ioctl() operation so that the
> PHY-related ioctl() calls can work from userspace, which allows
> applications like mii-tool or mii-diag to do their job.
>
I tested it successfully with mii-diag for the Armada 370 on Mirabox
and for the Armada XP on Open Blocks AX3-4
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
> drivers/net/ethernet/marvell/mvneta.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 90ab292..389a854 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -2451,6 +2451,21 @@ static int mvneta_stop(struct net_device *dev)
> return 0;
> }
>
> +static int mvneta_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
> +{
> + struct mvneta_port *pp = netdev_priv(dev);
> + int ret;
> +
> + if (!pp->phy_dev)
> + return -ENOTSUPP;
> +
> + ret = phy_mii_ioctl(pp->phy_dev, ifr, cmd);
> + if (!ret)
> + mvneta_adjust_link(dev);
> +
> + return ret;
> +}
> +
> /* Ethtool methods */
>
> /* Get settings (phy address, speed) for ethtools */
> @@ -2569,6 +2584,7 @@ static const struct net_device_ops mvneta_netdev_ops = {
> .ndo_change_mtu = mvneta_change_mtu,
> .ndo_tx_timeout = mvneta_tx_timeout,
> .ndo_get_stats64 = mvneta_get_stats64,
> + .ndo_do_ioctl = mvneta_ioctl,
> };
>
> const struct ethtool_ops mvneta_eth_tool_ops = {
>
--
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply
* Re: [net-next 0/9][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2013-09-04 16:41 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1378300136-26003-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 4 Sep 2013 06:08:47 -0700
> This series contains updates to igb only.
>
> Todd provides a fix for igb to not look for a PBA in the iNVM on
> devices that are flashless.
>
> Akeem provides igb patches to add a new PHY id for i354, as well as
> a couple of patches to implement the new PHY id. He also provides
> several patches to correctly report the appropriate media type as
> well as correctly report advertised/supported link for i354 devices.
> Lastly Akeem implements a 1 second delay mechanism for i210 devices
> to avoid erroneous link issue with the link partner.
Pulled, thanks a lot Jeff.
^ permalink raw reply
* Re: [PATCH] net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
From: Willy Tarreau @ 2013-09-04 16:32 UTC (permalink / raw)
To: Thomas Petazzoni
Cc: Lior Amsalem, Jochen De Smet, Simon Guinot, Ryan Press, netdev,
vdonnefort, Ethan Tuttle, stable, Ezequiel Garcia,
Chény Yves-Gael, Gregory Clement, Peter Sanford,
David S. Miller, linux-arm-kernel
In-Reply-To: <1378304478-21237-1-git-send-email-thomas.petazzoni@free-electrons.com>
Hi Thomas!
On Wed, Sep 04, 2013 at 04:21:18PM +0200, Thomas Petazzoni wrote:
> This commit fixes a long-standing bug that has been reported by many
> users: on some Armada 370 platforms, only the network interface that
> has been used in U-Boot to tftp the kernel works properly in
> Linux. The other network interfaces can see a 'link up', but are
> unable to transmit data. The reports were generally made on the Armada
> 370-based Mirabox, but have also been given on the Armada 370-RD
> board.
(...)
> This patch has been tested on Armada 370 Mirabox, and now both network
> interfaces are usable after boot.
Just as a complementary check, I can also confirm that the OpenBlocks
AX3 continues to work fine after this change.
Best regards!
Willy
^ permalink raw reply
* RE: [PATCH net-next 4/5] driver/net: enic: Exposing symbols for Cisco's low latency driver
From: Christian Benvenuti (benve) @ 2013-09-04 16:32 UTC (permalink / raw)
To: Ben Hutchings, Govindarajulu Varadarajan
Cc: davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Sujith Sankar (ssujith),
Nishank Trivedi (nistrive), Upinder Malhi (umalhi)
In-Reply-To: <1378306376.3133.7.camel@bwh-desktop.uk.level5networks.com>
Hi Ben,
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Ben Hutchings
> Sent: Wednesday, September 04, 2013 7:53 AM
> To: Govindarajulu Varadarajan
> Cc: davem@davemloft.net; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; Christian Benvenuti (benve); Sujith Sankar
> (ssujith); Nishank Trivedi (nistrive); Upinder Malhi (umalhi)
> Subject: Re: [PATCH net-next 4/5] driver/net: enic: Exposing symbols for
> Cisco's low latency driver
>
> On Wed, 2013-09-04 at 11:17 +0530, Govindarajulu Varadarajan wrote:
> > This patch exposes symbols for usnic low latency driver that can be
> > used to register and unregister vNics as well to traverse the resources on
> vNics.
> >
> > Signed-off-by: Upinder Malhi <umalhi@cisco.com>
> > Signed-off-by: Nishank Trivedi <nistrive@cisco.com>
> > Signed-off-by: Christian Benvenuti <benve@cisco.com>
>
> Will usnic, or any other user of these symbols, be submitted for inclusion in-
> tree as well? It is generally expected that exported functions do have an in-
> tree user.
The usnic driver has been posted yesterday on the rdma list.
(http://www.spinics.net/lists/linux-rdma/msg16999.html)
/Chris
> Also, header files under drivers/ generally won't be included in distribution
> -devel packages, so to support an out-of-tree module the function
> prototypes would need to be included in a header under include/ (or else
> you have to repeat them and hope the types never change).
>
> Ben.
>
> > ---
> > drivers/net/ethernet/cisco/enic/vnic_dev.c | 10 ++++++++++
> > drivers/net/ethernet/cisco/enic/vnic_dev.h | 1 +
> > 2 files changed, 11 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/cisco/enic/vnic_dev.c
> > b/drivers/net/ethernet/cisco/enic/vnic_dev.c
> > index 97455c5..69dd925 100644
> > --- a/drivers/net/ethernet/cisco/enic/vnic_dev.c
> > +++ b/drivers/net/ethernet/cisco/enic/vnic_dev.c
> > @@ -175,6 +175,7 @@ unsigned int vnic_dev_get_res_count(struct
> > vnic_dev *vdev, {
> > return vdev->res[type].count;
> > }
> > +EXPORT_SYMBOL(vnic_dev_get_res_count);
> >
> > void __iomem *vnic_dev_get_res(struct vnic_dev *vdev, enum
> vnic_res_type type,
> > unsigned int index)
> > @@ -193,6 +194,7 @@ void __iomem *vnic_dev_get_res(struct vnic_dev
> *vdev, enum vnic_res_type type,
> > return (char __iomem *)vdev->res[type].vaddr;
> > }
> > }
> > +EXPORT_SYMBOL(vnic_dev_get_res);
> >
> > static unsigned int vnic_dev_desc_ring_size(struct vnic_dev_ring *ring,
> > unsigned int desc_count, unsigned int desc_size) @@ -942,6 +944,7
> @@
> > void vnic_dev_unregister(struct vnic_dev *vdev)
> > kfree(vdev);
> > }
> > }
> > +EXPORT_SYMBOL(vnic_dev_unregister);
> >
> > struct vnic_dev *vnic_dev_register(struct vnic_dev *vdev,
> > void *priv, struct pci_dev *pdev, struct vnic_dev_bar *bar, @@
> > -969,6 +972,13 @@ err_out:
> > vnic_dev_unregister(vdev);
> > return NULL;
> > }
> > +EXPORT_SYMBOL(vnic_dev_register);
> > +
> > +struct pci_dev *vnic_dev_get_pdev(struct vnic_dev *vdev) {
> > + return vdev->pdev;
> > +}
> > +EXPORT_SYMBOL(vnic_dev_get_pdev);
> >
> > int vnic_dev_init_prov2(struct vnic_dev *vdev, u8 *buf, u32 len) {
> > diff --git a/drivers/net/ethernet/cisco/enic/vnic_dev.h
> > b/drivers/net/ethernet/cisco/enic/vnic_dev.h
> > index f3d9b79..e670029 100644
> > --- a/drivers/net/ethernet/cisco/enic/vnic_dev.h
> > +++ b/drivers/net/ethernet/cisco/enic/vnic_dev.h
> > @@ -127,6 +127,7 @@ int vnic_dev_set_ig_vlan_rewrite_mode(struct
> > vnic_dev *vdev, struct vnic_dev *vnic_dev_register(struct vnic_dev *vdev,
> > void *priv, struct pci_dev *pdev, struct vnic_dev_bar *bar,
> > unsigned int num_bars);
> > +struct pci_dev *vnic_dev_get_pdev(struct vnic_dev *vdev);
> > int vnic_dev_init_prov2(struct vnic_dev *vdev, u8 *buf, u32 len);
> > int vnic_dev_enable2(struct vnic_dev *vdev, int active); int
> > vnic_dev_enable2_done(struct vnic_dev *vdev, int *status);
>
> --
> Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer;
> that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in the
> body of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH net-next v2 5/6] bonding: restructure and add rcu for bond_for_each_slave_next()
From: Veaceslav Falico @ 2013-09-04 16:29 UTC (permalink / raw)
To: Ding Tianhong
Cc: Ding Tianhong, Jay Vosburgh, Andy Gospodarek, David S. Miller,
Nikolay Aleksandrov, Netdev
In-Reply-To: <52274B84.2010509@gmail.com>
On Wed, Sep 04, 2013 at 11:02:28PM +0800, Ding Tianhong wrote:
>于 2013/9/4 18:35, Veaceslav Falico 写道:
>>On Wed, Sep 04, 2013 at 05:44:15PM +0800, Ding Tianhong wrote:
>>...snip...
>>>+/* Check whether the slave is the only one in bond */
>>>+#define bond_is_only_slave(bond, pos) \
>>>+ (((pos)->list.prev == &(bond)->slave_list) && \
>>>+ ((pos)->list.next == &(bond)->slave_list))
>>
>>Could be done without pos at all -
>>
>>!list_empty(&(bond)->slave_list) && \
>>&(bond)->slave_list.next == &(bond)->slave_list.prev
>>
>>If we have only one slave and pos is NOT our slave then... well.. we have
>>big troubles.
>>
>yes, more simple more beautiful, thanks.
>
>but if the pos is not our slave, it is the mistake, not bug. :)
>
>>>+
>>>/**
>>>* bond_for_each_slave_from - iterate the slaves list from a
>>>starting point
>>>* @bond: the bond holding this list.
>>>* @pos: current slave.
>>>- * @cnt: counter for max number of moves
>>>* @start: starting point.
>>>*
>>>* Caller must hold bond->lock
>>>*/
>>>-#define bond_for_each_slave_from(bond, pos, cnt, start) \
>>>- for (cnt = 0, pos = start; pos && cnt < (bond)->slave_cnt; \
>>>- cnt++, pos = bond_next_slave(bond, pos))
>>>+#define bond_for_each_slave_from(bond, pos, start) \
>>>+ for (pos = start; pos && (bond_is_only_slave(bond, start) ? \
>>>+ &pos->list != &bond->slave_list : \
>>>+ &pos->list != &start->list); bond_is_only_slave(bond, start) ? \
>>>+ (pos = list_entry(pos->list.next, typeof(*pos), list)) : \
>>>+ (pos = bond_next_slave(bond, pos)))
>>
>>Did you check that?
>>
>>pos = slave1 (bond has more than one slave);
>>pos && &pos->list != &slave1->list - false.
>>
>>We won't ever enter this loop if we have >1 slaves.
>>
>>I don't understand this at all.
>>
>
>ok, the logic is : if slaves == 1, run once for the slave.
your code, actually, says differently:
(bond_is_only_slave(bond, start) ? &pos->list != &bond->slave_list :
which means that if bond_is_only_slave(bond, start) == true then we check
&pos->list != &bond->slave_list, so we run till the end of the list.
Ternary operator works like that
expression ? what_to_do_if_true : what_to_do_if_false
http://en.wikipedia.org/wiki/?:#C
>if slaves > 1, run loops until reach the end list.
Here we test for &pos->list != &start->list, which is false cause pos ==
start.
>I could not get a better way to simplify the function, I think it is
>a suitable scheme.
Nope. bond_for_each_slave_from() is supposed to loop through slaves from a
starting slave and till that starting slave, not included.
Your loop... I don't understand what it does. But clearly not what it was
doing (and supposed to do).
>
>by the way, I test the function and works well. :)
>
>>>+
>>>+/**
>>>+ * bond_for_each_slave_from_rcu - iterate the slaves list from a
>>>starting point
>>>+ * @bond: the bond holding this list.
>>>+ * @pos: current slave.
>>>+ * @start: starting point.
>>>+ *
>>>+ * Caller must hold rcu_read_lock
>>>+ */
>>>+#define bond_for_each_slave_from_rcu(bond, pos, start) \
>>>+ for (pos = start; pos && (bond_is_only_slave(bond, start) ? \
>>>+ &pos->list != &bond->slave_list : \
>>>+ &pos->list != &start->list); bond_is_only_slave(bond, start) ? \
>>>+ (pos = list_entry_rcu(pos->list.next, typeof(*pos), list)) : \
>>>+ (pos = bond_next_slave_rcu(bond, pos)))
>>
>>Ditto as bond_for_each_slave_from() and, also, see my comment about RCU
>>from patch 1.
>>
>>>
>>>/**
>>>* bond_for_each_slave - iterate over all slaves
>>>--
>>>1.8.2.1
>>>
>>>
>>>
>>--
>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>the body of a message to majordomo@vger.kernel.org
>>More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
^ permalink raw reply
* Re: [PATCH 0/4] netfilter updates for net-next
From: David Miller @ 2013-09-04 16:28 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1378299625-4638-1-git-send-email-pablo@netfilter.org>
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 4 Sep 2013 15:00:21 +0200
> The following batch contains:
>
> * Three fixes for the new synproxy target available in your
> net-next tree, from Jesper D. Brouer and Patrick McHardy.
>
> * One fix for TCPMSS to correctly handling the fragmentation
> case, from Phil Oester. I'll pass this one to -stable.
Pulled, thanks Pablo.
^ permalink raw reply
* Re: [PATCH net-next v2 1/6] bonding: simplify and use RCU protection for 3ad xmit path
From: David Miller @ 2013-09-04 16:25 UTC (permalink / raw)
To: vfalico; +Cc: dingtianhong, fubar, andy, nikolay, netdev
In-Reply-To: <20130904101823.GO1992@redhat.com>
From: Veaceslav Falico <vfalico@redhat.com>
Date: Wed, 4 Sep 2013 12:18:24 +0200
> On Wed, Sep 04, 2013 at 05:43:45PM +0800, Ding Tianhong wrote:
> ...snip...
>>+/**
>>+ * IMPORTANT: bond_first/last_slave_rcu can return NULL in case of an
>>empty list
>>+ * Caller must hold rcu_read_lock
>>+ */
>>+#define bond_first_slave_rcu(bond) \
>>+ list_first_or_null_rcu(&(bond)->slave_list, struct slave, list)
>>+#define bond_last_slave_rcu(bond) \
>>+ (list_empty(&(bond)->slave_list) ? NULL : \
>>+ bond_to_slave_rcu((bond)->slave_list.prev))
>
> Here, bond_last_slave_rcu() is racy. The list can be non-empty when
> list_empty() is verified, however afterwards it might become empty,
> when
> you call bond_to_slave_rcu(), and thus you'll get
> bond_to_slave(bond->slave_list) in the result, which is not a slave.
>
> Take a look at list_first_or_null_rcu() for a reference. The main idea
> is
> that it first gets the ->next pointer, with RCU protection, and then
> verifies if it's the list head or not, and if not - it gets the
> container
> already. This way the ->next pointer won't get away.
>
> These kind of bugs are really rare, but are *EXTREMELY* hard to debug.
I agree with this analysis.
Ding, "rcu_read_lock()" doesn't "lock" anything. It's just a memory
barrier.
All the list can still change on you asynchronously to your accesses.
That's why list_first_or_null_rcu() is so carefully arranged.
Therefore, you must make similar accomodations.
^ permalink raw reply
* Re: [PATCH 5/7] ixgbe: use pcie_capability_read_word() to simplify code
From: Bjorn Helgaas @ 2013-09-04 16:20 UTC (permalink / raw)
To: Yijing Wang
Cc: Benjamin Herrenschmidt, Gavin Shan, James E.J. Bottomley,
David S. Miller, linux-kernel, linux-pci, Hanjun Guo, e1000-devel,
netdev, Jeff Kirsher, Jacob Keller
In-Reply-To: <1378193715-25328-5-git-send-email-wangyijing@huawei.com>
[+cc Jacob, Jeff]
On Tue, Sep 03, 2013 at 03:35:13PM +0800, Yijing Wang wrote:
> use pcie_capability_read_word() to simplify code.
>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> Cc: e1000-devel@lists.sourceforge.net
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 ++----
> 1 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index bad8f14..bfa0b06 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -152,7 +152,6 @@ MODULE_VERSION(DRV_VERSION);
> static int ixgbe_read_pci_cfg_word_parent(struct ixgbe_adapter *adapter,
> u32 reg, u16 *value)
> {
> - int pos = 0;
> struct pci_dev *parent_dev;
> struct pci_bus *parent_bus;
>
> @@ -164,11 +163,10 @@ static int ixgbe_read_pci_cfg_word_parent(struct ixgbe_adapter *adapter,
> if (!parent_dev)
> return -1;
>
> - pos = pci_find_capability(parent_dev, PCI_CAP_ID_EXP);
> - if (!pos)
> + if (!pci_is_pcie(parent_dev))
> return -1;
>
> - pci_read_config_word(parent_dev, pos + reg, value);
> + pcie_capability_read_word(parent_dev, reg, value);
> return 0;
> }
>
Here's the caller of ixgbe_read_pci_cfg_word_parent():
/* Get the negotiated link width and speed from PCI config space of the
* parent, as this device is behind a switch
*/
err = ixgbe_read_pci_cfg_word_parent(adapter, 18, &link_status);
This should be using PCI_EXP_LNKSTA instead of "18".
But it would be even better if we could drop ixgbe_get_parent_bus_info()
completely. It seems redundant after merging Jacob's new
pcie_get_minimum_link() stuff [1].
ixgbe_disable_pcie_master() looks like it should be using
pcie_capability_read_word() with PCI_EXP_DEVSTA instead of using
IXGBE_PCI_DEVICE_STATUS. If fact, it looks like it could use the
new pci_wait_for_pending_transaction() interface [2].
It looks like all the #defines in the "PCI Bus Info" block
(IXGBE_PCI_DEVICE_STATUS, IXGBE_PCI_DEVICE_STATUS_TRANSACTION_PENDING,
IXGBE_PCI_LINK_STATUS, etc.) [3] are really for PCIe-generic things. If
so, the IXGBE-specific ones should be dropped in favor of the generic
ones.
[1] http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c?id=e027d1aec4bb49030646d2c186a721f94372d7f2
[2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci.c?id=3775a209d38aa3a0c7ed89a7d0f529e0230f280e
[3] http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/tree/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h#n1833
^ permalink raw reply
* Re: [PATCH] net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
From: Gregory CLEMENT @ 2013-09-04 16:08 UTC (permalink / raw)
To: Thomas Petazzoni
Cc: David S. Miller, netdev, Lior Amsalem, Jochen De Smet,
Simon Guinot, Ryan Press, vdonnefort, Ethan Tuttle, stable,
Ezequiel Garcia, Chény Yves-Gael, Peter Sanford,
Willy Tarreau, linux-arm-kernel
In-Reply-To: <1378304478-21237-1-git-send-email-thomas.petazzoni@free-electrons.com>
On 04/09/2013 16:21, Thomas Petazzoni wrote:
> This commit fixes a long-standing bug that has been reported by many
> users: on some Armada 370 platforms, only the network interface that
> has been used in U-Boot to tftp the kernel works properly in
> Linux. The other network interfaces can see a 'link up', but are
> unable to transmit data. The reports were generally made on the Armada
> 370-based Mirabox, but have also been given on the Armada 370-RD
> board.
>
> The network MAC in the Armada 370/XP (supported by the mvneta driver
> in Linux) has a functionality that allows it to continuously poll the
> PHY and directly update the MAC configuration accordingly (speed,
> duplex, etc.). The very first versions of the driver submitted for
> review were using this hardware mechanism, but due to this, the driver
> was not integrated with the kernel phylib. Following reviews, the
> driver was changed to use the phylib, and therefore a software based
> polling. In software based polling, Linux regularly talks to the PHY
> over the MDIO bus, and sees if the link status has changed. If it's
> the case then the adjust_link() callback of the driver is called to
> update the MAC configuration accordingly.
>
> However, it turns out that the adjust_link() callback was not
> configuring the hardware in a completely correct way: while it was
> setting the speed and duplex bits correctly, it wasn't telling the
> hardware to actually take into account those bits rather than what the
> hardware-based PHY polling mechanism has concluded. So, in fact the
> adjust_link() callback was basically a no-op.
>
> However, the network happened to be working because on the network
> interfaces used by U-Boot for tftp on Armada 370 platforms because the
> hardware PHY polling was enabled by the bootloader, and left enabled
> by Linux. However, the second network interface not used for tftp (or
> both network interfaces if the kernel is loaded from USB, NAND or SD
> card) didn't had the hardware PHY polling enabled.
>
> This patch fixes this situation by:
>
> (1) Making sure that the hardware PHY polling is disabled by clearing
> the MVNETA_PHY_POLLING_ENABLE bit in the MVNETA_UNIT_CONTROL
> register in the driver ->probe() function.
>
> (2) Making sure that the duplex and speed selections made by the
> adjust_link() callback are taken into account by clearing the
> MVNETA_GMAC_AN_SPEED_EN and MVNETA_GMAC_AN_DUPLEX_EN bits in the
> MVNETA_GMAC_AUTONEG_CONFIG register.
>
> This patch has been tested on Armada 370 Mirabox, and now both network
> interfaces are usable after boot.
>
Well done Thomas!
I have successfully tested it on Armada 370 Mirabox:
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Thanks
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> Cc: Willy Tarreau <w@1wt.eu>
> Cc: Jochen De Smet <jochen.armkernel@leahnim.org>
> Cc: Peter Sanford <psanford@nearbuy.io>
> Cc: Ethan Tuttle <ethan@ethantuttle.com>
> Cc: Chény Yves-Gael <yves@cheny.fr>
> Cc: Ryan Press <ryan@presslab.us>
> Cc: Simon Guinot <simon.guinot@sequanux.org>
> Cc: vdonnefort@lacie.com
> Cc: stable@vger.kernel.org
> ---
> David, this patch is a fix for a problem that has been here since 3.8
> (when the mvneta driver was introduced), so I've Cc'ed stable@ and if
> possible I'd like to patch to be included for 3.12.
> ---
> drivers/net/ethernet/marvell/mvneta.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index b017818..90ab292 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -138,7 +138,9 @@
> #define MVNETA_GMAC_FORCE_LINK_PASS BIT(1)
> #define MVNETA_GMAC_CONFIG_MII_SPEED BIT(5)
> #define MVNETA_GMAC_CONFIG_GMII_SPEED BIT(6)
> +#define MVNETA_GMAC_AN_SPEED_EN BIT(7)
> #define MVNETA_GMAC_CONFIG_FULL_DUPLEX BIT(12)
> +#define MVNETA_GMAC_AN_DUPLEX_EN BIT(13)
> #define MVNETA_MIB_COUNTERS_BASE 0x3080
> #define MVNETA_MIB_LATE_COLLISION 0x7c
> #define MVNETA_DA_FILT_SPEC_MCAST 0x3400
> @@ -915,6 +917,13 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
> /* Assign port SDMA configuration */
> mvreg_write(pp, MVNETA_SDMA_CONFIG, val);
>
> + /* Disable PHY polling in hardware, since we're using the
> + * kernel phylib to do this.
> + */
> + val = mvreg_read(pp, MVNETA_UNIT_CONTROL);
> + val &= ~MVNETA_PHY_POLLING_ENABLE;
> + mvreg_write(pp, MVNETA_UNIT_CONTROL, val);
> +
> mvneta_set_ucast_table(pp, -1);
> mvneta_set_special_mcast_table(pp, -1);
> mvneta_set_other_mcast_table(pp, -1);
> @@ -2307,7 +2316,9 @@ static void mvneta_adjust_link(struct net_device *ndev)
> val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
> val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
> MVNETA_GMAC_CONFIG_GMII_SPEED |
> - MVNETA_GMAC_CONFIG_FULL_DUPLEX);
> + MVNETA_GMAC_CONFIG_FULL_DUPLEX |
> + MVNETA_GMAC_AN_SPEED_EN |
> + MVNETA_GMAC_AN_DUPLEX_EN);
>
> if (phydev->duplex)
> val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
>
--
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply
* Re: [PATCH] net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
From: yves @ 2013-09-04 16:00 UTC (permalink / raw)
To: Vincent Donnefort
Cc: Thomas Petazzoni, Lior Amsalem, Jochen De Smet, Jason Cooper,
Ryan Press, netdev, stable, Willy Tarreau, Ethan Tuttle,
Ezequiel Garcia, Gregory Clement, Peter Sanford, David S. Miller,
linux-arm-kernel, Simon Guinot
In-Reply-To: <20130904152018.GA21032@grp-vdonnefort>
Le 2013-09-04 17:20, Vincent Donnefort a écrit :
> On Wed, Sep 04, 2013 at 10:50:51AM -0400, Jason Cooper wrote:
>> On Wed, Sep 04, 2013 at 04:21:18PM +0200, Thomas Petazzoni wrote:
>> > This commit fixes a long-standing bug that has been reported by many
>> > users: on some Armada 370 platforms, only the network interface that
>> > has been used in U-Boot to tftp the kernel works properly in
>> > Linux. The other network interfaces can see a 'link up', but are
>> > unable to transmit data. The reports were generally made on the Armada
>> > 370-based Mirabox, but have also been given on the Armada 370-RD
>> > board.
>> >
>> > The network MAC in the Armada 370/XP (supported by the mvneta driver
>> > in Linux) has a functionality that allows it to continuously poll the
>> > PHY and directly update the MAC configuration accordingly (speed,
>> > duplex, etc.). The very first versions of the driver submitted for
>> > review were using this hardware mechanism, but due to this, the driver
>> > was not integrated with the kernel phylib. Following reviews, the
>> > driver was changed to use the phylib, and therefore a software based
>> > polling. In software based polling, Linux regularly talks to the PHY
>> > over the MDIO bus, and sees if the link status has changed. If it's
>> > the case then the adjust_link() callback of the driver is called to
>> > update the MAC configuration accordingly.
>> >
>> > However, it turns out that the adjust_link() callback was not
>> > configuring the hardware in a completely correct way: while it was
>> > setting the speed and duplex bits correctly, it wasn't telling the
>> > hardware to actually take into account those bits rather than what the
>> > hardware-based PHY polling mechanism has concluded. So, in fact the
>> > adjust_link() callback was basically a no-op.
>> >
>> > However, the network happened to be working because on the network
>> > interfaces used by U-Boot for tftp on Armada 370 platforms because the
>> > hardware PHY polling was enabled by the bootloader, and left enabled
>> > by Linux. However, the second network interface not used for tftp (or
>> > both network interfaces if the kernel is loaded from USB, NAND or SD
>> > card) didn't had the hardware PHY polling enabled.
>> >
>> > This patch fixes this situation by:
>> >
>> > (1) Making sure that the hardware PHY polling is disabled by clearing
>> > the MVNETA_PHY_POLLING_ENABLE bit in the MVNETA_UNIT_CONTROL
>> > register in the driver ->probe() function.
>> >
>> > (2) Making sure that the duplex and speed selections made by the
>> > adjust_link() callback are taken into account by clearing the
>> > MVNETA_GMAC_AN_SPEED_EN and MVNETA_GMAC_AN_DUPLEX_EN bits in the
>> > MVNETA_GMAC_AUTONEG_CONFIG register.
>> >
>> > This patch has been tested on Armada 370 Mirabox, and now both network
>> > interfaces are usable after boot.
>> >
>> > Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> > Cc: Willy Tarreau <w@1wt.eu>
>> > Cc: Jochen De Smet <jochen.armkernel@leahnim.org>
>> > Cc: Peter Sanford <psanford@nearbuy.io>
>> > Cc: Ethan Tuttle <ethan@ethantuttle.com>
>> > Cc: Chény Yves-Gael <yves@cheny.fr>
>> > Cc: Ryan Press <ryan@presslab.us>
>> > Cc: Simon Guinot <simon.guinot@sequanux.org>
>> > Cc: vdonnefort@lacie.com
>> > Cc: stable@vger.kernel.org
>> > ---
>> > David, this patch is a fix for a problem that has been here since 3.8
>> > (when the mvneta driver was introduced), so I've Cc'ed stable@ and if
>> > possible I'd like to patch to be included for 3.12.
>>
>> David,
>>
>> Offending patch is:
>>
>> c5aff18 net: mvneta: driver for Marvell Armada 370/XP network unit
>>
>> Applies and builds cleanly against v3.8.13, v3.9.11, v3.10.10, and
>> v3.11
>>
>> Acked-by: Jason Cooper <jason@lakedaemon.net>
>>
>> thx,
>>
>> Jason.
>
> Works with the armada370-rd board.
>
> Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
>
> Thank you!
>
> Vincent.
>
Works with the mirabox armada370 devkit.
Tested-by: Yves-Gael Cheny <yves@cheny.fr>
Many thx,
Yves-Gaël .
>>
>> > ---
>> > drivers/net/ethernet/marvell/mvneta.c | 13 ++++++++++++-
>> > 1 file changed, 12 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
>> > index b017818..90ab292 100644
>> > --- a/drivers/net/ethernet/marvell/mvneta.c
>> > +++ b/drivers/net/ethernet/marvell/mvneta.c
>> > @@ -138,7 +138,9 @@
>> > #define MVNETA_GMAC_FORCE_LINK_PASS BIT(1)
>> > #define MVNETA_GMAC_CONFIG_MII_SPEED BIT(5)
>> > #define MVNETA_GMAC_CONFIG_GMII_SPEED BIT(6)
>> > +#define MVNETA_GMAC_AN_SPEED_EN BIT(7)
>> > #define MVNETA_GMAC_CONFIG_FULL_DUPLEX BIT(12)
>> > +#define MVNETA_GMAC_AN_DUPLEX_EN BIT(13)
>> > #define MVNETA_MIB_COUNTERS_BASE 0x3080
>> > #define MVNETA_MIB_LATE_COLLISION 0x7c
>> > #define MVNETA_DA_FILT_SPEC_MCAST 0x3400
>> > @@ -915,6 +917,13 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
>> > /* Assign port SDMA configuration */
>> > mvreg_write(pp, MVNETA_SDMA_CONFIG, val);
>> >
>> > + /* Disable PHY polling in hardware, since we're using the
>> > + * kernel phylib to do this.
>> > + */
>> > + val = mvreg_read(pp, MVNETA_UNIT_CONTROL);
>> > + val &= ~MVNETA_PHY_POLLING_ENABLE;
>> > + mvreg_write(pp, MVNETA_UNIT_CONTROL, val);
>> > +
>> > mvneta_set_ucast_table(pp, -1);
>> > mvneta_set_special_mcast_table(pp, -1);
>> > mvneta_set_other_mcast_table(pp, -1);
>> > @@ -2307,7 +2316,9 @@ static void mvneta_adjust_link(struct net_device *ndev)
>> > val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
>> > val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
>> > MVNETA_GMAC_CONFIG_GMII_SPEED |
>> > - MVNETA_GMAC_CONFIG_FULL_DUPLEX);
>> > + MVNETA_GMAC_CONFIG_FULL_DUPLEX |
>> > + MVNETA_GMAC_AN_SPEED_EN |
>> > + MVNETA_GMAC_AN_DUPLEX_EN);
>> >
>> > if (phydev->duplex)
>> > val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
>> > --
>> > 1.8.1.2
>> >
>> >
>> > _______________________________________________
>> > linux-arm-kernel mailing list
>> > linux-arm-kernel@lists.infradead.org
>> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Re: [PATCH net-next 2/2] tuntap: orphan frags before trying to set tx timestamp
From: Richard Cochran @ 2013-09-04 15:57 UTC (permalink / raw)
To: Jason Wang; +Cc: davem, netdev, linux-kernel, mst
In-Reply-To: <1378269226-5969-2-git-send-email-jasowang@redhat.com>
On Wed, Sep 04, 2013 at 12:33:46PM +0800, Jason Wang wrote:
> sock_tx_timestamp() will clear all zerocopy flags of skb which may lead the
> frags never to be orphaned. This will break guest to guest traffic when zerocopy
> is enabled. Fix this by orphaning the frags before trying to set tx time stamp.
>
> The issue were introduced by commit eda297729171fe16bf34fe5b0419dfb69060f623
> (tun: Support software transmit time stamping).
>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
^ permalink raw reply
* Re: [PATCH net-next 1/2] tuntap: purge socket error queue on detach
From: Richard Cochran @ 2013-09-04 15:56 UTC (permalink / raw)
To: Jason Wang; +Cc: davem, netdev, linux-kernel, mst
In-Reply-To: <1378269226-5969-1-git-send-email-jasowang@redhat.com>
On Wed, Sep 04, 2013 at 12:33:45PM +0800, Jason Wang wrote:
> Commit eda297729171fe16bf34fe5b0419dfb69060f623
> (tun: Support software transmit time stamping) will queue skbs into error queue
> when tx stamping is enabled. But it forgets to purge the error queue during
> detach. This patch fixes this.
>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
^ permalink raw reply
* Re: [PATCH net-next v2 1/2] random: add prandom_u32_range and prandom_u32_max helpers
From: Joe Perches @ 2013-09-04 15:54 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, netdev, linux-kernel, Theodore Ts'o
In-Reply-To: <1378298247-29364-2-git-send-email-dborkman@redhat.com>
On Wed, 2013-09-04 at 14:37 +0200, Daniel Borkmann wrote:
> We have implemented the same function over and over, so introduce
> generic helpers that unify these implementations in order to migrate
> such code to use them. Make the API similarly to randomize_range()
> for consistency. prandom_u32_range() generates numbers in [start, end]
> interval and prandom_u32_max() generates numbers in [0, end] interval.
I think these helpers can in many cases cause
poorer compiler generated object code.
> +/**
> + * prandom_u32_range - return a random number in interval [start, end]
> + * @start: lower interval endpoint
> + * @end: higher interval endpoint
> + *
> + * Returns a number that is in the given interval:
> + *
> + * [...... <range> .....]
> + * start end
> + *
> + * Callers need to make sure that start <= end. Note that the result
> + * depends on PRNG being well distributed in [0, ~0U] space. Here we
> + * use maximally equidistributed combined Tausworthe generator.
> + */
> +static inline u32 prandom_u32_range(u32 start, u32 end)
> +{
> + return (u32)(((u64) prandom_u32() * (end + 1 - start)) >> 32) + start;
> +}
This is effectively:
return (prandom_u32() % (end - start)) + start;
and if start and end are constant, gcc can optimize the
division by constant to a 32 bit multiply/shift/add.
I think if you add __builtin_constant_p tests for start
and end and expand the code a little you can still get
the optimizations done.
^ permalink raw reply
* Re: [PATCH] xen-netback: count number required slots for an skb more carefully
From: Wei Liu @ 2013-09-04 15:44 UTC (permalink / raw)
To: David Vrabel
Cc: Wei Liu, xen-devel, Konrad Rzeszutek Wilk, Boris Ostrovsky,
Ian Campbell, netdev, msw, annie.li
In-Reply-To: <52273D59.2020205@citrix.com>
On Wed, Sep 04, 2013 at 03:02:01PM +0100, David Vrabel wrote:
[...]
> >>
> >> I think I prefer fixing the counting for backporting to stable kernels.
> >
> > The original patch has coding style change. Sans that contextual change
> > it's not a very long patch.
>
> The size of the patch isn't the main concern for backport-ability. It's
> the frontend visible changes and thus any (unexpected) impacts on
> frontends -- this is especially important as only a small fraction of
> frontends in use will be tested with these changes.
>
> >> Xi's approach of packing the ring differently is a change in frontend
> >> visible behaviour and seems more risky. e.g., possible performance
> >> impact so I would like to see some performance analysis of that approach.
> >>
> >
> > With Xi's approach it is more efficient for backend to process. As we
> > now use one less grant copy operation which means we copy the same
> > amount of data with less grant ops.
>
> It think it uses more grant ops because the copies of the linear
> portion are in chunks that do not cross source page boundaries.
>
> i.e., in netbk_gop_skb():
>
> data = skb->data;
> while (data < skb_tail_pointer(skb)) {
> unsigned int offset = offset_in_page(data);
> unsigned int len = PAGE_SIZE - offset;
> [...]
>
> It wasn't clear from the patch that this had been considered and that
> any extra space needed in the grant op array was made available.
>
If I'm not mistaken the grant op array is already enormous. See the
comment in struct xen_netbk for grant_copy_op. The case that a buffer
straddles two slots was taken into consideration long ago -- that's
why you don't see any comment or code change WRT that...
> > From frontend's PoV I think the impact is minimal. Frontend is involved
> > in assembling the packets. It only takes what's in the ring and chain
> > them together. The operation involves copying so far is the
> > __pskb_pull_tail which happens a) in rare case when there's more frags
> > than frontend's MAX_SKB_FRAGS, b) when pull_to > skb_headlen which
> > happens. With Xi's change the rare case a) will even be rarer than
> > before as we use less slots. b) happens the same as it happens before
> > Xi's change, because the pull is guarded by "if (pull_to >
> > skb_headlen(skb))" and Xi's change doesn't affect skb_headlen.
> >
> > So overall I don't see obvious downside.
>
> The obvious downside is it doesn't exist (in a form that can be applied
> now), it hasn't been tested and I think there may well be a subtle bug
> that would need a careful review or testing to confirm/deny.
>
It does exist and apply cleanly on top of net tree. I haven't posted
it yet because we haven't reached concensus which path to take. :-)
The only reason that last version didn't get upstreamed is that the
commit message wasn't clear enough. From the technical PoV it's quite
sound and I believe Amazon has been using it for a long time -- the
older reference dates back to Aug 2012 IIRC. It's just never properly
upstreamed.
> You are free to work on this as a future improvements but I really don't
> see why this critical bug fix needs to be delayed any further.
>
True. I don't mean to hold off critical fix. Just want to make sure that
every option is presented and considered.
Wei.
> David
^ permalink raw reply
* Re: [PATCH v3 0/3] Send audit/procinfo/cgroup data in socket-level control message
From: Jan Kaluža @ 2013-09-04 15:40 UTC (permalink / raw)
To: Eric Dumazet
Cc: Richard Guy Briggs, Eric W. Biederman,
davem-fT/PcQaiUtIeIZ0/mPfg9Q, LKML, netdev-u79uwXL29TY76Z2rM5mHXA,
eparis-H+wXaHxf7aLQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
lizefan-hv44wF8Li93QT0dZR+AlfA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn
In-Reply-To: <1378308621.7360.110.camel@edumazet-glaptop>
On 09/04/2013 05:30 PM, Eric Dumazet wrote:
> On Wed, 2013-09-04 at 11:20 -0400, Richard Guy Briggs wrote:
>> On Wed, Sep 04, 2013 at 10:58:30AM -0400, Richard Guy Briggs wrote:
>>> On Wed, Sep 04, 2013 at 12:42:26AM -0700, Eric W. Biederman wrote:
>>>> Jan Kaluza <jkaluza-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>>>>> this patchset against net-next (applies also to linux-next) adds 3 new types
>>>>> of "Socket"-level control message (SCM_AUDIT, SCM_PROCINFO and SCM_CGROUP).
>>>
>>>> By my count you have overflowed cb in struct sk_buff and are stomping on
>>>> _skb_refdest.
>>>
>>> For patch1/3 I count 56/48, then for patch3 I get 48/48. Jan, you might
>>> do the conversion to a pointer in patch1/3 to avoid bisect breakage.
>>
>> Wait, that __aligned(8) is for cb[48], not for the contents.
>>
>> For patch1/3 I count 28/48 on 32-bit, 36/48 on 64-bit (or would that be
>> 56 by default on 64-bit arches without aligned specified?), then for
>> patch3 I get 24/48 on 32 and 40/48 on 64 (or again 48/48 by default?).
>
> Do not count, just add this :
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 86de99a..5b61320 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1329,6 +1329,7 @@ static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb)
> {
> int i;
>
> + BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof(skb->cb));
> scm->fp = UNIXCB(skb).fp;
> UNIXCB(skb).fp = NULL;
>
That's already in af_unix.c:
BUILD_BUG_ON(sizeof(struct unix_skb_parms) > FIELD_SIZEOF(struct
sk_buff, cb));
This test is passing for me even with only patch1/3 applied when
building 64bit kernel.
Jan Kaluza
^ permalink raw reply
* Re: [PATCH v3 0/3] Send audit/procinfo/cgroup data in socket-level control message
From: Eric Dumazet @ 2013-09-04 15:30 UTC (permalink / raw)
To: Richard Guy Briggs
Cc: Jan Kaluza, netdev-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML,
eparis-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman,
tj-DgEjT+Ai2ygdnm+yROfE0A, cgroups-u79uwXL29TY76Z2rM5mHXA,
davem-fT/PcQaiUtIeIZ0/mPfg9Q,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn
In-Reply-To: <20130904152022.GD28517-bcJWsdo4jJjeVoXN4CMphl7TgLCtbB0G@public.gmane.org>
On Wed, 2013-09-04 at 11:20 -0400, Richard Guy Briggs wrote:
> On Wed, Sep 04, 2013 at 10:58:30AM -0400, Richard Guy Briggs wrote:
> > On Wed, Sep 04, 2013 at 12:42:26AM -0700, Eric W. Biederman wrote:
> > > Jan Kaluza <jkaluza-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> > > > this patchset against net-next (applies also to linux-next) adds 3 new types
> > > > of "Socket"-level control message (SCM_AUDIT, SCM_PROCINFO and SCM_CGROUP).
> >
> > > By my count you have overflowed cb in struct sk_buff and are stomping on
> > > _skb_refdest.
> >
> > For patch1/3 I count 56/48, then for patch3 I get 48/48. Jan, you might
> > do the conversion to a pointer in patch1/3 to avoid bisect breakage.
>
> Wait, that __aligned(8) is for cb[48], not for the contents.
>
> For patch1/3 I count 28/48 on 32-bit, 36/48 on 64-bit (or would that be
> 56 by default on 64-bit arches without aligned specified?), then for
> patch3 I get 24/48 on 32 and 40/48 on 64 (or again 48/48 by default?).
Do not count, just add this :
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 86de99a..5b61320 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1329,6 +1329,7 @@ static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb)
{
int i;
+ BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof(skb->cb));
scm->fp = UNIXCB(skb).fp;
UNIXCB(skb).fp = NULL;
^ permalink raw reply related
* Re: [PATCH v3 0/3] Send audit/procinfo/cgroup data in socket-level control message
From: Richard Guy Briggs @ 2013-09-04 15:20 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Jan Kaluza, davem-fT/PcQaiUtIeIZ0/mPfg9Q, LKML,
netdev-u79uwXL29TY76Z2rM5mHXA, eparis-H+wXaHxf7aLQT0dZR+AlfA,
tj-DgEjT+Ai2ygdnm+yROfE0A, lizefan-hv44wF8Li93QT0dZR+AlfA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn
In-Reply-To: <20130904145830.GC28517-bcJWsdo4jJjeVoXN4CMphl7TgLCtbB0G@public.gmane.org>
On Wed, Sep 04, 2013 at 10:58:30AM -0400, Richard Guy Briggs wrote:
> On Wed, Sep 04, 2013 at 12:42:26AM -0700, Eric W. Biederman wrote:
> > Jan Kaluza <jkaluza-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> > > this patchset against net-next (applies also to linux-next) adds 3 new types
> > > of "Socket"-level control message (SCM_AUDIT, SCM_PROCINFO and SCM_CGROUP).
>
> > By my count you have overflowed cb in struct sk_buff and are stomping on
> > _skb_refdest.
>
> For patch1/3 I count 56/48, then for patch3 I get 48/48. Jan, you might
> do the conversion to a pointer in patch1/3 to avoid bisect breakage.
Wait, that __aligned(8) is for cb[48], not for the contents.
For patch1/3 I count 28/48 on 32-bit, 36/48 on 64-bit (or would that be
56 by default on 64-bit arches without aligned specified?), then for
patch3 I get 24/48 on 32 and 40/48 on 64 (or again 48/48 by default?).
> > Eric
>
> - RGB
- RGB
--
Richard Guy Briggs <rbriggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
^ permalink raw reply
* Re: [PATCH] net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
From: Vincent Donnefort @ 2013-09-04 15:20 UTC (permalink / raw)
To: Jason Cooper
Cc: Thomas Petazzoni, Lior Amsalem, Jochen De Smet, Simon Guinot,
Ryan Press, netdev, stable, Willy Tarreau, Ethan Tuttle,
Ezequiel Garcia, Chény Yves-Gael, Gregory Clement,
Peter Sanford, David S. Miller, linux-arm-kernel
In-Reply-To: <20130904145051.GO19598@titan.lakedaemon.net>
On Wed, Sep 04, 2013 at 10:50:51AM -0400, Jason Cooper wrote:
> On Wed, Sep 04, 2013 at 04:21:18PM +0200, Thomas Petazzoni wrote:
> > This commit fixes a long-standing bug that has been reported by many
> > users: on some Armada 370 platforms, only the network interface that
> > has been used in U-Boot to tftp the kernel works properly in
> > Linux. The other network interfaces can see a 'link up', but are
> > unable to transmit data. The reports were generally made on the Armada
> > 370-based Mirabox, but have also been given on the Armada 370-RD
> > board.
> >
> > The network MAC in the Armada 370/XP (supported by the mvneta driver
> > in Linux) has a functionality that allows it to continuously poll the
> > PHY and directly update the MAC configuration accordingly (speed,
> > duplex, etc.). The very first versions of the driver submitted for
> > review were using this hardware mechanism, but due to this, the driver
> > was not integrated with the kernel phylib. Following reviews, the
> > driver was changed to use the phylib, and therefore a software based
> > polling. In software based polling, Linux regularly talks to the PHY
> > over the MDIO bus, and sees if the link status has changed. If it's
> > the case then the adjust_link() callback of the driver is called to
> > update the MAC configuration accordingly.
> >
> > However, it turns out that the adjust_link() callback was not
> > configuring the hardware in a completely correct way: while it was
> > setting the speed and duplex bits correctly, it wasn't telling the
> > hardware to actually take into account those bits rather than what the
> > hardware-based PHY polling mechanism has concluded. So, in fact the
> > adjust_link() callback was basically a no-op.
> >
> > However, the network happened to be working because on the network
> > interfaces used by U-Boot for tftp on Armada 370 platforms because the
> > hardware PHY polling was enabled by the bootloader, and left enabled
> > by Linux. However, the second network interface not used for tftp (or
> > both network interfaces if the kernel is loaded from USB, NAND or SD
> > card) didn't had the hardware PHY polling enabled.
> >
> > This patch fixes this situation by:
> >
> > (1) Making sure that the hardware PHY polling is disabled by clearing
> > the MVNETA_PHY_POLLING_ENABLE bit in the MVNETA_UNIT_CONTROL
> > register in the driver ->probe() function.
> >
> > (2) Making sure that the duplex and speed selections made by the
> > adjust_link() callback are taken into account by clearing the
> > MVNETA_GMAC_AN_SPEED_EN and MVNETA_GMAC_AN_DUPLEX_EN bits in the
> > MVNETA_GMAC_AUTONEG_CONFIG register.
> >
> > This patch has been tested on Armada 370 Mirabox, and now both network
> > interfaces are usable after boot.
> >
> > Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> > Cc: Willy Tarreau <w@1wt.eu>
> > Cc: Jochen De Smet <jochen.armkernel@leahnim.org>
> > Cc: Peter Sanford <psanford@nearbuy.io>
> > Cc: Ethan Tuttle <ethan@ethantuttle.com>
> > Cc: Chény Yves-Gael <yves@cheny.fr>
> > Cc: Ryan Press <ryan@presslab.us>
> > Cc: Simon Guinot <simon.guinot@sequanux.org>
> > Cc: vdonnefort@lacie.com
> > Cc: stable@vger.kernel.org
> > ---
> > David, this patch is a fix for a problem that has been here since 3.8
> > (when the mvneta driver was introduced), so I've Cc'ed stable@ and if
> > possible I'd like to patch to be included for 3.12.
>
> David,
>
> Offending patch is:
>
> c5aff18 net: mvneta: driver for Marvell Armada 370/XP network unit
>
> Applies and builds cleanly against v3.8.13, v3.9.11, v3.10.10, and v3.11
>
> Acked-by: Jason Cooper <jason@lakedaemon.net>
>
> thx,
>
> Jason.
Works with the armada370-rd board.
Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
Thank you!
Vincent.
>
> > ---
> > drivers/net/ethernet/marvell/mvneta.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > index b017818..90ab292 100644
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > @@ -138,7 +138,9 @@
> > #define MVNETA_GMAC_FORCE_LINK_PASS BIT(1)
> > #define MVNETA_GMAC_CONFIG_MII_SPEED BIT(5)
> > #define MVNETA_GMAC_CONFIG_GMII_SPEED BIT(6)
> > +#define MVNETA_GMAC_AN_SPEED_EN BIT(7)
> > #define MVNETA_GMAC_CONFIG_FULL_DUPLEX BIT(12)
> > +#define MVNETA_GMAC_AN_DUPLEX_EN BIT(13)
> > #define MVNETA_MIB_COUNTERS_BASE 0x3080
> > #define MVNETA_MIB_LATE_COLLISION 0x7c
> > #define MVNETA_DA_FILT_SPEC_MCAST 0x3400
> > @@ -915,6 +917,13 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
> > /* Assign port SDMA configuration */
> > mvreg_write(pp, MVNETA_SDMA_CONFIG, val);
> >
> > + /* Disable PHY polling in hardware, since we're using the
> > + * kernel phylib to do this.
> > + */
> > + val = mvreg_read(pp, MVNETA_UNIT_CONTROL);
> > + val &= ~MVNETA_PHY_POLLING_ENABLE;
> > + mvreg_write(pp, MVNETA_UNIT_CONTROL, val);
> > +
> > mvneta_set_ucast_table(pp, -1);
> > mvneta_set_special_mcast_table(pp, -1);
> > mvneta_set_other_mcast_table(pp, -1);
> > @@ -2307,7 +2316,9 @@ static void mvneta_adjust_link(struct net_device *ndev)
> > val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
> > val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
> > MVNETA_GMAC_CONFIG_GMII_SPEED |
> > - MVNETA_GMAC_CONFIG_FULL_DUPLEX);
> > + MVNETA_GMAC_CONFIG_FULL_DUPLEX |
> > + MVNETA_GMAC_AN_SPEED_EN |
> > + MVNETA_GMAC_AN_DUPLEX_EN);
> >
> > if (phydev->duplex)
> > val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
> > --
> > 1.8.1.2
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
Vincent
LaCie will welcome you at IBC Amsterdam (13-17 Sept) on booth 7.G17 (Hall7). Come and visit us.
^ permalink raw reply
* Re: [PATCH net-next v2 5/6] bonding: restructure and add rcu for bond_for_each_slave_next()
From: Ding Tianhong @ 2013-09-04 15:02 UTC (permalink / raw)
To: Veaceslav Falico
Cc: Ding Tianhong, Jay Vosburgh, Andy Gospodarek, David S. Miller,
Nikolay Aleksandrov, Netdev
In-Reply-To: <20130904103540.GP1992@redhat.com>
于 2013/9/4 18:35, Veaceslav Falico 写道:
> On Wed, Sep 04, 2013 at 05:44:15PM +0800, Ding Tianhong wrote:
> ...snip...
>> +/* Check whether the slave is the only one in bond */
>> +#define bond_is_only_slave(bond, pos) \
>> + (((pos)->list.prev == &(bond)->slave_list) && \
>> + ((pos)->list.next == &(bond)->slave_list))
>
> Could be done without pos at all -
>
> !list_empty(&(bond)->slave_list) && \
> &(bond)->slave_list.next == &(bond)->slave_list.prev
>
> If we have only one slave and pos is NOT our slave then... well.. we have
> big troubles.
>
yes, more simple more beautiful, thanks.
but if the pos is not our slave, it is the mistake, not bug. :)
>> +
>> /**
>> * bond_for_each_slave_from - iterate the slaves list from a starting
>> point
>> * @bond: the bond holding this list.
>> * @pos: current slave.
>> - * @cnt: counter for max number of moves
>> * @start: starting point.
>> *
>> * Caller must hold bond->lock
>> */
>> -#define bond_for_each_slave_from(bond, pos, cnt, start) \
>> - for (cnt = 0, pos = start; pos && cnt < (bond)->slave_cnt; \
>> - cnt++, pos = bond_next_slave(bond, pos))
>> +#define bond_for_each_slave_from(bond, pos, start) \
>> + for (pos = start; pos && (bond_is_only_slave(bond, start) ? \
>> + &pos->list != &bond->slave_list : \
>> + &pos->list != &start->list); bond_is_only_slave(bond, start) ? \
>> + (pos = list_entry(pos->list.next, typeof(*pos), list)) : \
>> + (pos = bond_next_slave(bond, pos)))
>
> Did you check that?
>
> pos = slave1 (bond has more than one slave);
> pos && &pos->list != &slave1->list - false.
>
> We won't ever enter this loop if we have >1 slaves.
>
> I don't understand this at all.
>
ok, the logic is : if slaves == 1, run once for the slave.
if slaves > 1, run loops until reach the end list.
I could not get a better way to simplify the function, I think it is a
suitable scheme.
by the way, I test the function and works well. :)
>> +
>> +/**
>> + * bond_for_each_slave_from_rcu - iterate the slaves list from a
>> starting point
>> + * @bond: the bond holding this list.
>> + * @pos: current slave.
>> + * @start: starting point.
>> + *
>> + * Caller must hold rcu_read_lock
>> + */
>> +#define bond_for_each_slave_from_rcu(bond, pos, start) \
>> + for (pos = start; pos && (bond_is_only_slave(bond, start) ? \
>> + &pos->list != &bond->slave_list : \
>> + &pos->list != &start->list); bond_is_only_slave(bond, start) ? \
>> + (pos = list_entry_rcu(pos->list.next, typeof(*pos), list)) : \
>> + (pos = bond_next_slave_rcu(bond, pos)))
>
> Ditto as bond_for_each_slave_from() and, also, see my comment about RCU
> from patch 1.
>
>>
>> /**
>> * bond_for_each_slave - iterate over all slaves
>> --
>> 1.8.2.1
>>
>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply
* [patch v2] sfc: check for allocation failure
From: Dan Carpenter @ 2013-09-04 15:07 UTC (permalink / raw)
To: Solarflare linux maintainers; +Cc: Ben Hutchings, netdev, kernel-janitors
In-Reply-To: <20130903.224939.866623192858217581.davem@davemloft.net>
It upsets static analyzers when we don't check for allocation failure.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
v2: rebased on latest linux-next
diff --git a/drivers/net/ethernet/sfc/falcon.c b/drivers/net/ethernet/sfc/falcon.c
index 38d179c..75799f8 100644
--- a/drivers/net/ethernet/sfc/falcon.c
+++ b/drivers/net/ethernet/sfc/falcon.c
@@ -894,6 +894,8 @@ static int falcon_mtd_probe(struct efx_nic *efx)
/* Allocate space for maximum number of partitions */
parts = kcalloc(2, sizeof(*parts), GFP_KERNEL);
+ if (!parts)
+ return -ENOMEM;
n_parts = 0;
spi = &nic_data->spi_flash;
^ permalink raw reply related
* Re: [PATCH v3 0/3] Send audit/procinfo/cgroup data in socket-level control message
From: Jan Kaluža @ 2013-09-04 15:04 UTC (permalink / raw)
To: Richard Guy Briggs
Cc: Eric W. Biederman, davem-fT/PcQaiUtIeIZ0/mPfg9Q, LKML,
netdev-u79uwXL29TY76Z2rM5mHXA, eparis-H+wXaHxf7aLQT0dZR+AlfA,
tj-DgEjT+Ai2ygdnm+yROfE0A, lizefan-hv44wF8Li93QT0dZR+AlfA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
cgroups-u79uwXL29TY76Z2rM5mHXA,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn
In-Reply-To: <20130904145830.GC28517-bcJWsdo4jJjeVoXN4CMphl7TgLCtbB0G@public.gmane.org>
On 09/04/2013 04:58 PM, Richard Guy Briggs wrote:
> On Wed, Sep 04, 2013 at 12:42:26AM -0700, Eric W. Biederman wrote:
>> Jan Kaluza <jkaluza-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>>> Hi,
>>>
>>> this patchset against net-next (applies also to linux-next) adds 3 new types
>>> of "Socket"-level control message (SCM_AUDIT, SCM_PROCINFO and SCM_CGROUP).
>>>
>>> Server-like processes in many cases need credentials and other
>>> metadata of the peer, to decide if the calling process is allowed to
>>> request a specific action, or the server just wants to log away this
>>> type of information for auditing tasks.
>>>
>>> The current practice to retrieve such process metadata is to look that
>>> information up in procfs with the $PID received over SCM_CREDENTIALS.
>>> This is sufficient for long-running tasks, but introduces a race which
>>> cannot be worked around for short-living processes; the calling
>>> process and all the information in /proc/$PID/ is gone before the
>>> receiver of the socket message can look it up.
>>
>>> Changes introduced in this patchset can also increase performance
>>> of such server-like processes, because current way of opening and
>>> parsing /proc/$PID/* files is much more expensive than receiving these
>>> metadata using SCM.
>>
>> Can I just say ick, blech, barf, gag.
>
> /me hands ebiederman an air sickness bag.
>
>> You don't require this information to be passed. You are asking people
>> to suport a lot of new code for the forseeable future. The only advantage
>> appears to be for short lived racy processes that don't even bother to
>> make certain their message was acknowleged before exiting.
>>
>> You sent this during the merge window which is the time for code
>> integration and testing not new code.
>
> This is an RFC. How is this important?
>
>> By my count you have overflowed cb in struct sk_buff and are stomping on
>> _skb_refdest.
>
> For patch1/3 I count 56/48, then for patch3 I get 48/48. Jan, you might
> do the conversion to a pointer in patch1/3 to avoid bisect breakage.
Yes, this is valid point. I will do the conversion in patch1. Thanks all
for reviewing and pointing that out.
Jan Kaluza
>> If you are going to go crazy and pass things is there a reason you do
>> not add a patch to pass the bsd SCM_CREDS? That information seems more
>> relevant in a security context and for making security decisions than
>> about half the information you are passing.
>>
>> Eric
>
> - RGB
>
> --
> Richard Guy Briggs <rbriggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Senior Software Engineer
> Kernel Security
> AMER ENG Base Operating Systems
> Remote, Ottawa, Canada
> Voice: +1.647.777.2635
> Internal: (81) 32635
> Alt: +1.613.693.0684x3545
>
^ permalink raw reply
* Re: [PATCH net-next v2 1/6] bonding: simplify and use RCU protection for 3ad xmit path
From: Ding Tianhong @ 2013-09-04 14:53 UTC (permalink / raw)
To: Veaceslav Falico
Cc: Ding Tianhong, Jay Vosburgh, Andy Gospodarek, David S. Miller,
Nikolay Aleksandrov, Netdev
In-Reply-To: <20130904101823.GO1992@redhat.com>
于 2013/9/4 18:18, Veaceslav Falico 写道:
> On Wed, Sep 04, 2013 at 05:43:45PM +0800, Ding Tianhong wrote:
> ...snip...
>> +/**
>> + * IMPORTANT: bond_first/last_slave_rcu can return NULL in case of
>> an empty list
>> + * Caller must hold rcu_read_lock
>> + */
>> +#define bond_first_slave_rcu(bond) \
>> + list_first_or_null_rcu(&(bond)->slave_list, struct slave, list)
>> +#define bond_last_slave_rcu(bond) \
>> + (list_empty(&(bond)->slave_list) ? NULL : \
>> + bond_to_slave_rcu((bond)->slave_list.prev))
>
> Here, bond_last_slave_rcu() is racy. The list can be non-empty when
> list_empty() is verified, however afterwards it might become empty, when
> you call bond_to_slave_rcu(), and thus you'll get
> bond_to_slave(bond->slave_list) in the result, which is not a slave.
>
> Take a look at list_first_or_null_rcu() for a reference. The main idea is
> that it first gets the ->next pointer, with RCU protection, and then
> verifies if it's the list head or not, and if not - it gets the container
> already. This way the ->next pointer won't get away.
>
> These kind of bugs are really rare, but are *EXTREMELY* hard to debug.
Thanks for your response and opinions, but I think your miss something,
the slave_list will not changed in the rcu_read_lock, so ,the bugs will not
happen.
>
>> +
>> #define bond_is_first_slave(bond, pos) ((pos)->list.prev ==
>> &(bond)->slave_list)
>> #define bond_is_last_slave(bond, pos) ((pos)->list.next ==
>> &(bond)->slave_list)
>>
>> @@ -93,6 +106,15 @@
>> (bond_is_first_slave(bond, pos) ? bond_last_slave(bond) : \
>> bond_to_slave((pos)->list.prev))
>>
>> +/* Since bond_first/last_slave_rcu can return NULL, these can return
>> NULL too */
>> +#define bond_next_slave_rcu(bond, pos) \
>> + (bond_is_last_slave(bond, pos) ? bond_first_slave_rcu(bond) : \
>> + bond_to_slave_rcu((pos)->list.next))
>> +
>> +#define bond_prev_slave_rcu(bond, pos) \
>> + (bond_is_first_slave(bond, pos) ? bond_last_slave_rcu(bond) : \
>> + bond_to_slave_rcu((pos)->list.prev))
>> +
>
> These two are also racy. bond_is_last/first_slave() is not rcu-ified, and
> thus you can't rely on it without proper locking. Same ideas apply as per
> bond_first_slave_rcu().
> --
refer to the above answer.
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply
* Re: [PATCH v3 0/3] Send audit/procinfo/cgroup data in socket-level control message
From: Richard Guy Briggs @ 2013-09-04 14:58 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Jan Kaluza, netdev-u79uwXL29TY76Z2rM5mHXA,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, LKML,
eparis-H+wXaHxf7aLQT0dZR+AlfA,
viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn, tj-DgEjT+Ai2ygdnm+yROfE0A,
cgroups-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <878uzdf2xp.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
On Wed, Sep 04, 2013 at 12:42:26AM -0700, Eric W. Biederman wrote:
> Jan Kaluza <jkaluza-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> > Hi,
> >
> > this patchset against net-next (applies also to linux-next) adds 3 new types
> > of "Socket"-level control message (SCM_AUDIT, SCM_PROCINFO and SCM_CGROUP).
> >
> > Server-like processes in many cases need credentials and other
> > metadata of the peer, to decide if the calling process is allowed to
> > request a specific action, or the server just wants to log away this
> > type of information for auditing tasks.
> >
> > The current practice to retrieve such process metadata is to look that
> > information up in procfs with the $PID received over SCM_CREDENTIALS.
> > This is sufficient for long-running tasks, but introduces a race which
> > cannot be worked around for short-living processes; the calling
> > process and all the information in /proc/$PID/ is gone before the
> > receiver of the socket message can look it up.
>
> > Changes introduced in this patchset can also increase performance
> > of such server-like processes, because current way of opening and
> > parsing /proc/$PID/* files is much more expensive than receiving these
> > metadata using SCM.
>
> Can I just say ick, blech, barf, gag.
/me hands ebiederman an air sickness bag.
> You don't require this information to be passed. You are asking people
> to suport a lot of new code for the forseeable future. The only advantage
> appears to be for short lived racy processes that don't even bother to
> make certain their message was acknowleged before exiting.
>
> You sent this during the merge window which is the time for code
> integration and testing not new code.
This is an RFC. How is this important?
> By my count you have overflowed cb in struct sk_buff and are stomping on
> _skb_refdest.
For patch1/3 I count 56/48, then for patch3 I get 48/48. Jan, you might
do the conversion to a pointer in patch1/3 to avoid bisect breakage.
> If you are going to go crazy and pass things is there a reason you do
> not add a patch to pass the bsd SCM_CREDS? That information seems more
> relevant in a security context and for making security decisions than
> about half the information you are passing.
>
> Eric
- RGB
--
Richard Guy Briggs <rbriggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox