Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH ghak90 (was ghak32) V4 03/10] audit: log container info of syscalls
From: Paul Moore @ 2018-10-25 10:49 UTC (permalink / raw)
  To: sgrubb
  Cc: rgb, simo, carlos, linux-api, containers, linux-kernel, dhowells,
	linux-audit, netfilter-devel, ebiederm, luto, netdev,
	linux-fsdevel, Eric Paris, Serge Hallyn, viro
In-Reply-To: <20181025080638.771621a3@ivy-bridge>

On Thu, Oct 25, 2018 at 2:06 AM Steve Grubb <sgrubb@redhat.com> wrote:
> On Wed, 24 Oct 2018 20:42:55 -0400
> Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2018-10-24 16:55, Paul Moore wrote:
> > > On Wed, Oct 24, 2018 at 11:15 AM Richard Guy Briggs
> > > <rgb@redhat.com> wrote:
> > > > On 2018-10-19 19:16, Paul Moore wrote:
> > > > > On Sun, Aug 5, 2018 at 4:32 AM Richard Guy Briggs
> > > > > <rgb@redhat.com> wrote:

...

> > > > > > +/*
> > > > > > + * audit_log_contid - report container info
> > > > > > + * @tsk: task to be recorded
> > > > > > + * @context: task or local context for record
> > > > > > + * @op: contid string description
> > > > > > + */
> > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > +                            struct audit_context *context,
> > > > > > char *op) +{
> > > > > > +       struct audit_buffer *ab;
> > > > > > +
> > > > > > +       if (!audit_contid_set(tsk))
> > > > > > +               return 0;
> > > > > > +       /* Generate AUDIT_CONTAINER record with container ID
> > > > > > */
> > > > > > +       ab = audit_log_start(context, GFP_KERNEL,
> > > > > > AUDIT_CONTAINER);
> > > > > > +       if (!ab)
> > > > > > +               return -ENOMEM;
> > > > > > +       audit_log_format(ab, "op=%s contid=%llu",
> > > > > > +                        op, audit_get_contid(tsk));
> > > > > > +       audit_log_end(ab);
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL(audit_log_contid);
> > > > >
> > > > > As discussed in the previous iteration of the patch, I prefer
> > > > > AUDIT_CONTAINER_ID here over AUDIT_CONTAINER.  If you feel
> > > > > strongly about keeping it as-is with AUDIT_CONTAINER I suppose
> > > > > I could live with that, but it is isn't my first choice.
> > > >
> > > > I don't have a strong opinion on this one, mildly preferring the
> > > > shorter one only because it is shorter.
> > >
> > > We already have multiple AUDIT_CONTAINER* record types, so it seems
> > > as though we should use "AUDIT_CONTAINER" as a prefix of sorts,
> > > rather than a type itself.
> >
> > I'm fine with that.  I'd still like to hear Steve's input.  He had
> > stronger opinions than me.
>
> The creation event should be separate and distinct from the continuing
> use when its used as a supplemental record. IOW, binding the ID to a
> container is part of the lifecycle and needs to be kept distinct.

Steve's comment is pretty ambiguous when it comes to AUDIT_CONTAINER
vs AUDIT_CONTAINER_ID, but one could argue that AUDIT_CONTAINER_ID
helps distinguish the audit container id marking record and gets to
what I believe is the spirit of Steve's comment.  Taking this in
context with my previous remarks, let's switch to using
AUDIT_CONTAINER_ID.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* [PATCH v2] netfilter: conntrack: fix calculation of next bucket number in early_drop
From: Vasily Khoruzhick @ 2018-10-25 19:15 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, netfilter-devel, coreteam, netdev, linux-kernel,
	Dmitry Safonov
  Cc: Vasily Khoruzhick, stable

If there's no entry to drop in bucket that corresponds to the hash,
early_drop() should look for it in other buckets. But since it increments
hash instead of bucket number, it actually looks in the same bucket 8
times: hsize is 16k by default (14 bits) and hash is 32-bit value, so
reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in
most cases.

Fix it by increasing bucket number instead of hash and rename _hash
to bucket to avoid future confusion.

Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in early_drop logic")
Cc: <stable@vger.kernel.org> # v4.7+
Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
---
v2: move 'bucket' outside of the loop

 net/netfilter/nf_conntrack_core.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index ca1168d67fac..e92e749aff53 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1073,19 +1073,22 @@ static unsigned int early_drop_list(struct net *net,
 	return drops;
 }
 
-static noinline int early_drop(struct net *net, unsigned int _hash)
+static noinline int early_drop(struct net *net, unsigned int hash)
 {
-	unsigned int i;
+	unsigned int i, bucket;
 
 	for (i = 0; i < NF_CT_EVICTION_RANGE; i++) {
 		struct hlist_nulls_head *ct_hash;
-		unsigned int hash, hsize, drops;
+		unsigned int hsize, drops;
 
 		rcu_read_lock();
 		nf_conntrack_get_ht(&ct_hash, &hsize);
-		hash = reciprocal_scale(_hash++, hsize);
+		if (!i)
+			bucket = reciprocal_scale(hash, hsize);
+		else
+			bucket = (bucket + 1) % hsize;
 
-		drops = early_drop_list(net, &ct_hash[hash]);
+		drops = early_drop_list(net, &ct_hash[bucket]);
 		rcu_read_unlock();
 
 		if (drops) {
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH 2/2] Bluetooth: btqcomsmd: use HCI_QUIRK_USE_BDADDR_PROPERTY
From: Matthias Kaehlcke @ 2018-10-25 19:03 UTC (permalink / raw)
  To: Balakrishna Godavarthi
  Cc: Marcel Holtmann, Johan Hedberg, David S . Miller, Loic Poulain,
	linux-bluetooth, netdev, linux-kernel, Brian Norris,
	Dmitry Grinberg
In-Reply-To: <bff4f1b7d82e217f45273872eeffa77f@codeaurora.org>

On Thu, Oct 25, 2018 at 08:10:30PM +0530, Balakrishna Godavarthi wrote:
> Hi Matthias,
> 
> On 2018-10-25 05:51, Matthias Kaehlcke wrote:
> > Use the HCI_QUIRK_USE_BDADDR_PROPERTY quirk to let the HCI
> > core handle the reading of 'local-bd-address'. With this there
> > is no need to set HCI_QUIRK_INVALID_BDADDR, the case of a
> > non-existing or invalid fwnode property is handled by the core
> > code.
> > 
> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > ---
> > I couldn't actually test the changes in this driver since I
> > don't have a device with this controller. Could someone
> > from Qualcomm help with this?
> > ---
> >  drivers/bluetooth/btqcomsmd.c | 28 +++-------------------------
> >  1 file changed, 3 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/bluetooth/btqcomsmd.c
> > b/drivers/bluetooth/btqcomsmd.c
> > index 7df3eed1ef5e..e5841602c4f1 100644
> > --- a/drivers/bluetooth/btqcomsmd.c
> > +++ b/drivers/bluetooth/btqcomsmd.c
> > @@ -125,23 +125,10 @@ static int btqcomsmd_setup(struct hci_dev *hdev)
> >  		return PTR_ERR(skb);
> >  	kfree_skb(skb);
> > 
> > -	/* Devices do not have persistent storage for BD address. If no
> > -	 * BD address has been retrieved during probe, mark the device
> > -	 * as having an invalid BD address.
> > +	/* Devices do not have persistent storage for BD address. Retrieve
> > +	 * it from the firmware node property.
> >  	 */
> > -	if (!bacmp(&btq->bdaddr, BDADDR_ANY)) {
> > -		set_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
> > -		return 0;
> > -	}
> > -
> > -	/* When setting a configured BD address fails, mark the device
> > -	 * as having an invalid BD address.
> > -	 */
> > -	err = qca_set_bdaddr_rome(hdev, &btq->bdaddr);
> > -	if (err) {
> > -		set_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
> > -		return 0;
> > -	}
> > +	set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks);
> > 
> >  	return 0;
> >  }
> > @@ -169,15 +156,6 @@ static int btqcomsmd_probe(struct platform_device
> > *pdev)
> >  	if (IS_ERR(btq->cmd_channel))
> >  		return PTR_ERR(btq->cmd_channel);
> > 
> > -	/* The local-bd-address property is usually injected by the
> > -	 * bootloader which has access to the allocated BD address.
> > -	 */
> > -	if (!of_property_read_u8_array(pdev->dev.of_node, "local-bd-address",
> > -				       (u8 *)&btq->bdaddr, sizeof(bdaddr_t))) {
> > -		dev_info(&pdev->dev, "BD address %pMR retrieved from device-tree",
> > -			 &btq->bdaddr);
> > -	}
> > -
> >  	hdev = hci_alloc_dev();
> >  	if (!hdev)
> >  		return -ENOMEM;
> 
> nit: can be remove unused "bdaddr_t bdaddr" variable.
> https://elixir.bootlin.com/linux/v4.19-rc8/source/drivers/bluetooth/btqcomsmd.c#L31
> Apart from this, It looks ok to me.

Thanks for the reviews! I'll remove the field in the next revision.

^ permalink raw reply

* [PATCH v3 2/2] net: qcom/emac: add phy-handle support for ACPI
From: Wang Dongsheng @ 2018-10-25 10:09 UTC (permalink / raw)
  To: timur, andrew; +Cc: Wang Dongsheng, yu.zheng, f.fainelli, netdev
In-Reply-To: <cover.1540459999.git.dongsheng.wang@hxt-semitech.com>

Use "phy-handle" to porint an internal MDIO device port.

Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
---
 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 115 +++++++++++++++---
 1 file changed, 100 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index f2ed013ce5d5..3dc3ae55e5bb 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -96,6 +96,96 @@ static int emac_mdio_write(struct mii_bus *bus, int addr, int regnum, u16 val)
 	return 0;
 }
 
+static int acpi_device_match(struct device *dev, void *fwnode)
+{
+	return dev->fwnode == fwnode;
+}
+
+static struct phy_device *
+emac_acpi_get_phydev_from_phy_handle(struct platform_device *pdev)
+{
+	struct fwnode_reference_args args;
+	struct fwnode_handle *fw_node;
+	struct acpi_device *adev;
+	acpi_handle handle;
+	struct device *dev;
+	struct phy_device *phydev;
+	struct net_device *netdev;
+	struct emac_adapter *adpt;
+	int phy_addr;
+	int ret;
+
+	/* Get PHY Port reference from phy-handle */
+	fw_node = acpi_fwnode_handle(ACPI_COMPANION(&pdev->dev));
+	ret = acpi_node_get_property_reference(fw_node, "phy-handle", 0,
+					       &args);
+	if (ACPI_FAILURE(ret) || !is_acpi_device_node(args.fwnode))
+		return ERR_PTR(-ENODEV);
+
+	/* Get PHY addr from the port node */
+	if (fwnode_property_read_u32(args.fwnode, "phy-channel", &phy_addr))
+		return ERR_PTR(-ENODEV);
+
+	/* Get the MDIO bus that included the port */
+	handle = ACPI_HANDLE_FWNODE(args.fwnode);
+	if (!handle || acpi_bus_get_device(handle, &adev))
+		return ERR_PTR(-ENODEV);
+
+	while (adev->parent) {
+		if (!strcmp(acpi_device_hid(adev), "QCOM8070"))
+			break;
+		adev = adev->parent;
+	}
+	if (!adev->parent)
+		return ERR_PTR(-ENODEV);
+
+	dev = bus_find_device(&platform_bus_type, NULL,
+			      &adev->fwnode,
+			      acpi_device_match);
+	if (!dev)
+		return ERR_PTR(-ENODEV);
+
+	netdev = dev_get_drvdata(dev);
+	if (!netdev)
+		return ERR_PTR(-EPROBE_DEFER);
+
+	adpt = netdev_priv(netdev);
+	if (!adpt->mii_bus)
+		return ERR_PTR(-EPROBE_DEFER);
+
+	phydev = mdiobus_get_phy(adpt->mii_bus, phy_addr);
+	return phydev ? phydev : ERR_PTR(-ENODEV);
+}
+
+static struct phy_device *
+emac_acpi_get_phydev(struct platform_device *pdev, struct emac_adapter *adpt)
+{
+	struct phy_device *phydev = NULL;
+	int phy_addr;
+	int ret;
+
+	/* Compatible with "phy-channel" */
+	ret = device_property_read_u32(&pdev->dev, "phy-channel",
+				       &phy_addr);
+	if (!ret)
+		phydev = mdiobus_get_phy(adpt->mii_bus, phy_addr);
+	if (phydev)
+		return phydev;
+
+	/* Get PHY Port reference from phy-handle */
+	phydev = emac_acpi_get_phydev_from_phy_handle(pdev);
+	if (!IS_ERR(phydev))
+		return phydev;
+	if (PTR_ERR(phydev) == -EPROBE_DEFER)
+		return ERR_PTR(-EPROBE_DEFER);
+
+	/* If we can't read a valid phy address from "phy-channel"/"phy-handle",
+	 * then assume that there is only one phy on local mdio bus.
+	 */
+	phydev = phy_find_first(adpt->mii_bus);
+	return phydev ? phydev : ERR_PTR(-ENODEV);
+}
+
 static int emac_mdio_bus_create(struct platform_device *pdev,
 				struct emac_adapter *adpt)
 {
@@ -128,13 +218,9 @@ static int emac_get_phydev(struct platform_device *pdev,
 			    struct emac_adapter *adpt)
 {
 	struct device_node *np = pdev->dev.of_node;
-	struct mii_bus *bus = adpt->mii_bus;
 	struct device_node *phy_np;
 	struct phy_device *phydev;
 
-	u32 phy_addr;
-	int ret;
-
 	if (!has_acpi_companion(&pdev->dev)) {
 		phy_np = of_parse_phandle(np, "phy-handle", 0);
 		adpt->phydev = of_phy_find_device(phy_np);
@@ -142,14 +228,9 @@ static int emac_get_phydev(struct platform_device *pdev,
 		return adpt->phydev ? 0 : -ENODEV;
 	}
 
-	ret = device_property_read_u32(&pdev->dev, "phy-channel",
-				       &phy_addr);
-	/* If we can't read a valid phy address, then assume
-	 * that there is only one phy on this mdio bus.
-	 */
-	phydev = ret ? phy_find_first(bus) : mdiobus_get_phy(bus, phy_addr);
-	if (!phydev)
-		return -ENODEV;
+	phydev = emac_acpi_get_phydev(pdev, adpt);
+	if (IS_ERR(phydev))
+		return PTR_ERR(phydev);
 
 	/* of_phy_find_device() claims a reference to the phydev,
 	 * so we do that here manually as well. When the driver
@@ -171,10 +252,14 @@ int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
 		return ret;
 
 	ret = emac_get_phydev(pdev, adpt);
-	if (ret) {
+	if (!ret)
+		return 0;
+
+	if (ret != -EPROBE_DEFER)
 		dev_err(&pdev->dev, "Could not find external phy\n");
-		mdiobus_unregister(adpt->mii_bus);
-	}
+	else
+		dev_warn(&pdev->dev, "Phy is not available yet, deferred probing\n");
 
+	mdiobus_unregister(adpt->mii_bus);
 	return ret;
 }
-- 
2.18.0

^ permalink raw reply related

* [PATCH v3 1/2] net: qcom/emac: split phy_config to mdio bus create and get phy device
From: Wang Dongsheng @ 2018-10-25 10:08 UTC (permalink / raw)
  To: timur, andrew; +Cc: Wang Dongsheng, yu.zheng, f.fainelli, netdev
In-Reply-To: <cover.1540459999.git.dongsheng.wang@hxt-semitech.com>

This patch separate emac_mdio_bus_create and emac_get_phydev from
emac_phy_config, and do some codes clean.

Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com>
---
 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 96 +++++++++++--------
 1 file changed, 56 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index 53dbf1e163a8..f2ed013ce5d5 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -96,15 +96,15 @@ static int emac_mdio_write(struct mii_bus *bus, int addr, int regnum, u16 val)
 	return 0;
 }
 
-/* Configure the MDIO bus and connect the external PHY */
-int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
+static int emac_mdio_bus_create(struct platform_device *pdev,
+				struct emac_adapter *adpt)
 {
 	struct device_node *np = pdev->dev.of_node;
 	struct mii_bus *mii_bus;
 	int ret;
 
 	/* Create the mii_bus object for talking to the MDIO bus */
-	adpt->mii_bus = mii_bus = devm_mdiobus_alloc(&pdev->dev);
+	mii_bus = devm_mdiobus_alloc(&pdev->dev);
 	if (!mii_bus)
 		return -ENOMEM;
 
@@ -115,50 +115,66 @@ int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
 	mii_bus->parent = &pdev->dev;
 	mii_bus->priv = adpt;
 
-	if (has_acpi_companion(&pdev->dev)) {
-		u32 phy_addr;
-
-		ret = mdiobus_register(mii_bus);
-		if (ret) {
-			dev_err(&pdev->dev, "could not register mdio bus\n");
-			return ret;
-		}
-		ret = device_property_read_u32(&pdev->dev, "phy-channel",
-					       &phy_addr);
-		if (ret)
-			/* If we can't read a valid phy address, then assume
-			 * that there is only one phy on this mdio bus.
-			 */
-			adpt->phydev = phy_find_first(mii_bus);
-		else
-			adpt->phydev = mdiobus_get_phy(mii_bus, phy_addr);
-
-		/* of_phy_find_device() claims a reference to the phydev,
-		 * so we do that here manually as well. When the driver
-		 * later unloads, it can unilaterally drop the reference
-		 * without worrying about ACPI vs DT.
-		 */
-		if (adpt->phydev)
-			get_device(&adpt->phydev->mdio.dev);
-	} else {
-		struct device_node *phy_np;
-
-		ret = of_mdiobus_register(mii_bus, np);
-		if (ret) {
-			dev_err(&pdev->dev, "could not register mdio bus\n");
-			return ret;
-		}
+	ret = of_mdiobus_register(mii_bus, has_acpi_companion(&pdev->dev) ?
+				  NULL : np);
+	if (ret)
+		dev_err(&pdev->dev, "Could not register mdio bus\n");
+
+	adpt->mii_bus = ret ? NULL : mii_bus;
+	return ret;
+}
+
+static int emac_get_phydev(struct platform_device *pdev,
+			   struct emac_adapter *adpt)
+{
+	struct device_node *np = pdev->dev.of_node;
+	struct mii_bus *bus = adpt->mii_bus;
+	struct device_node *phy_np;
+	struct phy_device *phydev;
 
+	u32 phy_addr;
+	int ret;
+
+	if (!has_acpi_companion(&pdev->dev)) {
 		phy_np = of_parse_phandle(np, "phy-handle", 0);
 		adpt->phydev = of_phy_find_device(phy_np);
 		of_node_put(phy_np);
+		return adpt->phydev ? 0 : -ENODEV;
 	}
 
-	if (!adpt->phydev) {
-		dev_err(&pdev->dev, "could not find external phy\n");
-		mdiobus_unregister(mii_bus);
+	ret = device_property_read_u32(&pdev->dev, "phy-channel",
+				       &phy_addr);
+	/* If we can't read a valid phy address, then assume
+	 * that there is only one phy on this mdio bus.
+	 */
+	phydev = ret ? phy_find_first(bus) : mdiobus_get_phy(bus, phy_addr);
+	if (!phydev)
 		return -ENODEV;
-	}
 
+	/* of_phy_find_device() claims a reference to the phydev,
+	 * so we do that here manually as well. When the driver
+	 * later unloads, it can unilaterally drop the reference
+	 * without worrying about ACPI vs DT.
+	 */
+	get_device(&phydev->mdio.dev);
+	adpt->phydev = phydev;
 	return 0;
 }
+
+/* Configure the MDIO bus and connect the external PHY */
+int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt)
+{
+	int ret;
+
+	ret = emac_mdio_bus_create(pdev, adpt);
+	if (ret)
+		return ret;
+
+	ret = emac_get_phydev(pdev, adpt);
+	if (ret) {
+		dev_err(&pdev->dev, "Could not find external phy\n");
+		mdiobus_unregister(adpt->mii_bus);
+	}
+
+	return ret;
+}
-- 
2.18.0

^ permalink raw reply related

* [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
From: Wang Dongsheng @ 2018-10-25 10:08 UTC (permalink / raw)
  To: timur, andrew; +Cc: Wang Dongsheng, yu.zheng, f.fainelli, netdev
In-Reply-To: <78719753-77bd-596a-dfc7-ccd676850283@kernel.org>

The emac include MDIO controller, and the motherboard has more than one
PHY connected to an MDIO bus. So share the shared mii_bus for others MAC
device that not has MDIO bus connected.

Based on ACPI, since "phy-handle" cannot directly point to a _DSD
sub-package, so we use "phy-handle" to point an internal MDIO device port.
The port describes the phy address.

Tested: QDF2400 (ACPI), buildin/insmod/rmmod

V3:
 - Add "phy-handle" support.
 - Remove all of DT changes.

V2:
 - Separate patch.

Wang Dongsheng (2):
  net: qcom/emac: split phy_config to mdio bus create and get phy device
  net: qcom/emac: add phy-handle support for ACPI

 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 183 ++++++++++++++----
 1 file changed, 142 insertions(+), 41 deletions(-)

-- 
2.18.0

^ permalink raw reply

* Re: [RFC PATCH 01/11] phy: core add phy_set_netif_mode() api
From: Kishon Vijay Abraham I @ 2018-10-25 10:05 UTC (permalink / raw)
  To: Grygorii Strashko, David S. Miller, netdev, Tony Lindgren,
	Rob Herring
  Cc: Sekhar Nori, linux-kernel, linux-omap, devicetree
In-Reply-To: <18219d97-f86e-27ee-a116-a8febc3dcd24@ti.com>

Hi,

On Wednesday 10 October 2018 04:13 AM, Grygorii Strashko wrote:
> 
> 
> On 10/09/2018 12:22 AM, Kishon Vijay Abraham I wrote:
>> Hi Grygorii,
>>
>> On Tuesday 09 October 2018 05:19 AM, Grygorii Strashko wrote:
>>> Add new API phy_set_netif_mode(struct phy *phy, phy_interface_t mode) and
>>> new PHY operation callback .set_netif_mode() which intended to be implemnte
>>> by PHY drivers which supports Network interrfaces mode selection. Both
>>> accepts phy_interface_t vlaue as input parameter.
>>>
>>> Cc: Kishon Vijay Abraham I <kishon@ti.com>
>>> Cc: Tony Lindgren <tony@atomide.com>
>>> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
>>> ---
>>>   drivers/phy/phy-core.c  | 15 +++++++++++++++
>>>   include/linux/phy/phy.h | 12 ++++++++++++
>>>   2 files changed, 27 insertions(+)
>>>
>>> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
>>> index 35fd38c..d9aba1a 100644
>>> --- a/drivers/phy/phy-core.c
>>> +++ b/drivers/phy/phy-core.c
>>> @@ -377,6 +377,21 @@ int phy_set_mode(struct phy *phy, enum phy_mode mode)
>>>   }
>>>   EXPORT_SYMBOL_GPL(phy_set_mode);
>>>   
>>> +int phy_set_netif_mode(struct phy *phy, phy_interface_t mode)
>>> +{
>>> +	int ret;
>>> +
>>> +	if (!phy || !phy->ops->set_netif_mode)
>>> +		return 0;
>>> +
>>> +	mutex_lock(&phy->mutex);
>>> +	ret = phy->ops->set_netif_mode(phy, mode);
>>> +	mutex_unlock(&phy->mutex);
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(phy_set_netif_mode);
>>
>> We should try to add only generic PHY APIs and not subsystem specific APIs. In
>> this case I think phy_set_mode should suffice.
> 
> 
> This is what I've had in mind first, but all my guts argued against it after I've tried:
> 
> diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
> index bc73d2b..961b156 100644
> --- a/include/linux/phy/phy.h
> +++ b/include/linux/phy/phy.h
> @@ -41,6 +41,14 @@ enum phy_mode {
>         PHY_MODE_10GKR,
>         PHY_MODE_UFS_HS_A,
>         PHY_MODE_UFS_HS_B,
> +       PHY_MODE_MODE_MII,
> +       PHY_MODE_MODE_GMII,
> +       PHY_MODE_MODE_SGMII,
> +       PHY_MODE_MODE_RMII,
> +       PHY_MODE_MODE_RGMII,
> +       PHY_MODE_MODE_RGMII_ID,
> +       PHY_MODE_MODE_RGMII_RXID,
> +       PHY_MODE_MODE_RGMII_TXID,
>  };
> 
> above introduces ugly constants duplication and required every network phy driver
> to maintain conversation table  phy_interface_t -> enum phy_mode.
> More over, if above change happens third time (first time PHY_MODE_SGMII/PHY_MODE_10GKR were added,
> second - PHY_MODE_2500SGMII) it will never ends (there are ~15 more items only in phy_interface_t).
> As result, enum phy_mode might became a un-maintainable monster.
> 
> So, as per above, and considering that Network subsystem is based on standards (phy_interface_t lists standard intf)
> I've tried to add separate PHY API.
> 
> As an idea:
> - seems it could be reasonable to introduce PHY_MODE_NETWORK (or PHY_MODE_ETHERNET) and
> add generic phy_set_submode(struct phy *phy, long submode). 
> 
> So, single functional PHY device can just use phy_set_submode() and
> multi-functional devices (like serdes which can be muxed between PCIe, USB, NET), can use:
> 
> phy_set_mode(PHY_MODE_ETHERNET)
> phy_set_submode(X);

Agreed on the constant duplication comment above. We can modify set_mode to
take submode as an additional parameter and fix all the users of phy_set_mode.
int phy_set_mode(struct phy *phy, enum phy_mode mode, int submode)

Thanks
Kishon

^ permalink raw reply

* Re: [PATCH net-next 0/4] net: ethernet: ti: cpsw: fix vlan mcast
From: David Miller @ 2018-10-25 18:34 UTC (permalink / raw)
  To: ivan.khoronzhuk
  Cc: grygorii.strashko, linux-omap, netdev, linux-kernel,
	alexander.h.duyck, bjorn
In-Reply-To: <20181024221059.21834-1-ivan.khoronzhuk@linaro.org>

From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Date: Thu, 25 Oct 2018 01:10:55 +0300

> The cpsw holds separate mcast entires for vlan entries. At this moment
> driver adds only not vlan mcast addresses, omitting vlan/mcast entries.
> As result mcast for vlans doesn't work. It can be fixed by adding same
> mcast entries for every created vlan, but this patchseries uses more
> sophisticated way and allows to create mcast entries only for vlans
> that really require it. Generic functions from this series can be
> reused for fixing vlan and macvlan unicast.

This is a bug fix but targetted at net-next, and indeed it is quite
invasive as it adds new core infrastructure and converts the generic
vlan code over to using it.

Unfortunately net-next is closed.

So if you want this bug fixed in mainline you will have to come up
with a less invasive fix, and resubmit this net-next approach when the
net-next tree opens back up.

Thank you.

^ permalink raw reply

* Re: Regression in 4.19 net/phy/realtek: garbled sysfs output
From: Holger Hoffstätte @ 2018-10-25  9:29 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Netdev, Jassi Brar, David S. Miller, Heiner Kallweit
In-Reply-To: <20181024201219.GB19440@lunn.ch>

On 10/24/18 22:12, Andrew Lunn wrote:
> On Wed, Oct 24, 2018 at 09:36:02PM +0200, Holger Hoffstätte wrote:
>> Hi,
>>
>> Since 4.19 r8169 depends on phylib:
>>
>> $lsmod | grep r8169
>> r8169                  81920  0
>> libphy                 57344  2 r8169,realtek
>>
>> Unfortunately this now gives me the following sysfs error:
>>
>> $cd /sys/module/realtek/drivers
>> $ls -l
>> ls: cannot access 'mdio_bus:RTL8201F 10/100Mbps Ethernet': No such file or directory
>> total 0
>> lrwxrwxrwx 1 root root 0 Oct 24 21:09 'mdio_bus:RTL8201CP Ethernet' -> '../../../bus/mdio_bus/drivers/RTL8201CP Ethernet'
>> l????????? ? ?    ?    ?            ? 'mdio_bus:RTL8201F 10/100Mbps Ethernet'
>> lrwxrwxrwx 1 root root 0 Oct 24 21:09 'mdio_bus:RTL8211 Gigabit Ethernet' -> '../../../bus/mdio_bus/drivers/RTL8211 Gigabit Ethernet'
>> [..]
>>
>> Apparently the forward slash in "10/100Mbps Ethernet" is interpreted as
>> directory separator that leads nowhere, and was introduced in commit
>> 513588dd44b ("net: phy: realtek: add RTL8201F phy-id and functions").
>>
>> Would it be acceptable to change the name simply to "RTL8201F Ethernet"?
> 
> Hi Holger
> 
> Or use "RTL8201F Fast Ethernet"

Yes, even better since it's correct. :)
As expected changing the name .name entry fixes the sysfs behaviour.

diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index 7fc8508b5..271e8adc3 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -220,7 +220,7 @@ static struct phy_driver realtek_drvs[] = {
  		.flags          = PHY_HAS_INTERRUPT,
  	}, {
  		.phy_id		= 0x001cc816,
-		.name		= "RTL8201F 10/100Mbps Ethernet",
+		.name		= "RTL8201F Fast Ethernet",
  		.phy_id_mask	= 0x001fffff,
  		.features	= PHY_BASIC_FEATURES,
  		.flags		= PHY_HAS_INTERRUPT,

> I wonder if other drivers have similar problems?
> 
> davicom.c:      .name           = "Davicom DM9161B/C",
> intel-xway.c:           .name           = "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.3",
> intel-xway.c:           .name           = "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.4",
> intel-xway.c:           .name           = "Intel XWAY PHY11G (PEF 7071/PEF 7072) v1.5 / v1.6",
> intel-xway.c:           .name           = "Intel XWAY PHY22F (PEF 7061) v1.5 / v1.6",
> smsc.c:	 .name	       = "SMSC LAN8710/LAN8720",

I'm open to suggestions about how to rename those identifiers.
"|" seems to work but IMHO looks a bit weird:

"Davicom DM9161B/C" -> "Davicom DM9161B|C"
"(PEF 7071/PEF 7072) v1.5 / v1.6" -> "(PEF 7071|7072) v1.5|6"

We can go full regex, which will probably get me voted off the island:
"(PEF 7071/PEF 7072) v1.5 / v1.6" -> "(PEF {7071,7072}) v1.{5,6}"

Cast your votes now!

cheers,
Holger

^ permalink raw reply related

* Re: [PATCH bpf-next 6/6] selftests/bpf: test_verifier, check bpf_map_lookup_elem access in bpf prog
From: Prashant Bhole @ 2018-10-25  9:09 UTC (permalink / raw)
  To: Naresh Kamboju, liu.song.a23
  Cc: ast, Daniel Borkmann, jakub.kicinski, David S. Miller,
	quentin.monnet, netdev, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CA+G9fYtT78hr4i6JDdeKBNoO4G_NMSVWUTyxMoU4_1ZfqGdthg@mail.gmail.com>



On 10/25/2018 5:54 PM, Naresh Kamboju wrote:
> On Tue, 9 Oct 2018 at 12:32, Song Liu <liu.song.a23@gmail.com> wrote:
>>
>> On Mon, Oct 8, 2018 at 6:07 PM Prashant Bhole
>> <bhole_prashant_q7@lab.ntt.co.jp> wrote:
>>>
>>> map_lookup_elem isn't supported by certain map types like:
>>> - BPF_MAP_TYPE_PROG_ARRAY
>>> - BPF_MAP_TYPE_STACK_TRACE
>>> - BPF_MAP_TYPE_XSKMAP
>>> - BPF_MAP_TYPE_SOCKMAP/BPF_MAP_TYPE_SOCKHASH
>>> Let's add verfier tests to check whether verifier prevents
>>> bpf_map_lookup_elem call on above programs from bpf program.
>>>
>>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
>>> Acked-by: Alexei Starovoitov <ast@kernel.org>
>> Acked-by: Song Liu <songliubraving@fb.com>
>>
>>> ---
>>>   tools/testing/selftests/bpf/test_verifier.c | 121 +++++++++++++++++++-
>>>   1 file changed, 120 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
>>> index 65ae44c85d27..cf4cd32b6772 100644
>>> --- a/tools/testing/selftests/bpf/test_verifier.c
>>> +++ b/tools/testing/selftests/bpf/test_verifier.c
>>> @@ -48,7 +48,7 @@
>>>
>>>   #define MAX_INSNS      BPF_MAXINSNS
>>>   #define MAX_FIXUPS     8
>>> -#define MAX_NR_MAPS    8
>>> +#define MAX_NR_MAPS    13
>>>   #define POINTER_VALUE  0xcafe4all
>>>   #define TEST_DATA_LEN  64
>>>
>>> @@ -65,6 +65,10 @@ struct bpf_test {
>>>          int fixup_map_hash_48b[MAX_FIXUPS];
>>>          int fixup_map_hash_16b[MAX_FIXUPS];
>>>          int fixup_map_array_48b[MAX_FIXUPS];
>>> +       int fixup_map_sockmap[MAX_FIXUPS];
>>> +       int fixup_map_sockhash[MAX_FIXUPS];
>>> +       int fixup_map_xskmap[MAX_FIXUPS];
>>> +       int fixup_map_stacktrace[MAX_FIXUPS];
>>>          int fixup_prog1[MAX_FIXUPS];
>>>          int fixup_prog2[MAX_FIXUPS];
>>>          int fixup_map_in_map[MAX_FIXUPS];
>>> @@ -4541,6 +4545,85 @@ static struct bpf_test tests[] = {
>>>                  .errstr = "invalid access to packet",
>>>                  .prog_type = BPF_PROG_TYPE_SCHED_CLS,
>>>          },
>>> +       {
>>> +               "prevent map lookup in sockmap",
>>> +               .insns = {
>>> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> +                                    BPF_FUNC_map_lookup_elem),
>>> +                       BPF_EXIT_INSN(),
>>> +               },
>>> +               .fixup_map_sockmap = { 3 },
>>> +               .result = REJECT,
>>> +               .errstr = "cannot pass map_type 15 into func bpf_map_lookup_elem",
>>> +               .prog_type = BPF_PROG_TYPE_SOCK_OPS,
>>> +       },
>>> +       {
>>> +               "prevent map lookup in sockhash",
>>> +               .insns = {
>>> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> +                                    BPF_FUNC_map_lookup_elem),
>>> +                       BPF_EXIT_INSN(),
>>> +               },
>>> +               .fixup_map_sockhash = { 3 },
>>> +               .result = REJECT,
>>> +               .errstr = "cannot pass map_type 18 into func bpf_map_lookup_elem",
>>> +               .prog_type = BPF_PROG_TYPE_SOCK_OPS,
>>> +       },
>>> +       {
>>> +               "prevent map lookup in xskmap",
>>> +               .insns = {
>>> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> +                                    BPF_FUNC_map_lookup_elem),
>>> +                       BPF_EXIT_INSN(),
>>> +               },
>>> +               .fixup_map_xskmap = { 3 },
>>> +               .result = REJECT,
>>> +               .errstr = "cannot pass map_type 17 into func bpf_map_lookup_elem",
>>> +               .prog_type = BPF_PROG_TYPE_XDP,
>>> +       },
>>> +       {
>>> +               "prevent map lookup in stack trace",
>>> +               .insns = {
>>> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> +                                    BPF_FUNC_map_lookup_elem),
>>> +                       BPF_EXIT_INSN(),
>>> +               },
>>> +               .fixup_map_stacktrace = { 3 },
>>> +               .result = REJECT,
>>> +               .errstr = "cannot pass map_type 7 into func bpf_map_lookup_elem",
>>> +               .prog_type = BPF_PROG_TYPE_PERF_EVENT,
>>> +       },
>>> +       {
>>> +               "prevent map lookup in prog array",
>>> +               .insns = {
>>> +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
>>> +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
>>> +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
>>> +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
>>> +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
>>> +                                    BPF_FUNC_map_lookup_elem),
>>> +                       BPF_EXIT_INSN(),
>>> +               },
>>> +               .fixup_prog2 = { 3 },
>>> +               .result = REJECT,
>>> +               .errstr = "cannot pass map_type 3 into func bpf_map_lookup_elem",
>>> +       },
>>>          {
>>>                  "valid map access into an array with a constant",
>>>                  .insns = {
>>> @@ -13515,6 +13598,10 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
>>>          int *fixup_map_hash_48b = test->fixup_map_hash_48b;
>>>          int *fixup_map_hash_16b = test->fixup_map_hash_16b;
>>>          int *fixup_map_array_48b = test->fixup_map_array_48b;
>>> +       int *fixup_map_sockmap = test->fixup_map_sockmap;
>>> +       int *fixup_map_sockhash = test->fixup_map_sockhash;
>>> +       int *fixup_map_xskmap = test->fixup_map_xskmap;
>>> +       int *fixup_map_stacktrace = test->fixup_map_stacktrace;
>>>          int *fixup_prog1 = test->fixup_prog1;
>>>          int *fixup_prog2 = test->fixup_prog2;
>>>          int *fixup_map_in_map = test->fixup_map_in_map;
>>> @@ -13603,6 +13690,38 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
>>>                          fixup_percpu_cgroup_storage++;
>>>                  } while (*fixup_percpu_cgroup_storage);
>>>          }
>>> +       if (*fixup_map_sockmap) {
>>> +               map_fds[9] = create_map(BPF_MAP_TYPE_SOCKMAP, sizeof(int),
>>> +                                       sizeof(int), 1);
>>> +               do {
>>> +                       prog[*fixup_map_sockmap].imm = map_fds[9];
>>> +                       fixup_map_sockmap++;
>>> +               } while (*fixup_map_sockmap);
>>> +       }
>>> +       if (*fixup_map_sockhash) {
>>> +               map_fds[10] = create_map(BPF_MAP_TYPE_SOCKHASH, sizeof(int),
>>> +                                       sizeof(int), 1);
>>> +               do {
>>> +                       prog[*fixup_map_sockhash].imm = map_fds[10];
>>> +                       fixup_map_sockhash++;
>>> +               } while (*fixup_map_sockhash);
>>> +       }
>>> +       if (*fixup_map_xskmap) {
>>> +               map_fds[11] = create_map(BPF_MAP_TYPE_XSKMAP, sizeof(int),
>>> +                                       sizeof(int), 1);
>>> +               do {
>>> +                       prog[*fixup_map_xskmap].imm = map_fds[11];
>>> +                       fixup_map_xskmap++;
>>> +               } while (*fixup_map_xskmap);
>>> +       }
> 
> selftests: bpf: test_verifier sockmap, sockhash, xskmap failed on
> mainline and next
> (from 4.19.0-rc7-next-20181011 to till date )
> Are we missing any pre-required kernel configs ?

sockmap/hashmap is dependent on CONFIG_BPF_STREAM_PARSER and xskmap is 
dependent on CONFIG_XDP_SOCKETS.

Reference: include/linux/bpf_types.h

-Prashant

> 
> Test log,
> ------------
> selftests: bpf: test_verifier
> <>
> #274/p prevent map lookup in sockmap Failed to create hash map
> 'Invalid argument'!
> FAIL
> Unexpected error message!
> EXP: cannot pass map_type 15 into func bpf_map_lookup_elem
> RES: fd -1 is not pointing to valid bpf_map
> fd -1 is not pointing to valid bpf_map
> #275/p prevent map lookup in sockhash Failed to create hash map
> 'Invalid argument'!
> FAIL
> Unexpected error message!
> EXP: cannot pass map_type 18 into func bpf_map_lookup_elem
> RES: fd -1 is not pointing to valid bpf_map
> fd -1 is not pointing to valid bpf_map
> #276/p prevent map lookup in xskmap Failed to create hash map 'Invalid
> argument'!
> FAIL
> Unexpected error message!
> EXP: cannot pass map_type 17 into func bpf_map_lookup_elem
> RES: fd -1 is not pointing to valid bpf_map
> fd -1 is not pointing to valid bpf_map
> <>
> Summary: 962 PASSED, 0 SKIPPED, 3 FAILED
> not ok 1..1 selftests: bpf: test_verifier [FAIL]
> selftests: bpf_test_verifier [FAIL]
> 
> -mainline results history,
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier
> 
> -next results history,
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier
> 
> Test case full log,
> https://lkft.validation.linaro.org/scheduler/job/461881#L1655
> 
> Best regards
> Naresh Kamboju
> 
>>> +       if (*fixup_map_stacktrace) {
>>> +               map_fds[12] = create_map(BPF_MAP_TYPE_STACK_TRACE, sizeof(u32),
>>> +                                        sizeof(u64), 1);
>>> +               do {
>>> +                       prog[*fixup_map_stacktrace].imm = map_fds[12];
>>> +                       fixup_map_stacktrace++;
>>> +               } while (fixup_map_stacktrace);
>>> +       }
>>>   }
>>>
>>>   static void do_test_single(struct bpf_test *test, bool unpriv,
>>> --
>>> 2.17.1
>>>
>>>
> 
> 

^ permalink raw reply

* Re: [PATCH ghak90 (was ghak32) V4 03/10] audit: log container info of syscalls
From: Richard Guy Briggs @ 2018-10-25 17:38 UTC (permalink / raw)
  To: Steve Grubb
  Cc: Paul Moore, simo, carlos, linux-api, containers, linux-kernel,
	dhowells, linux-audit, netfilter-devel, ebiederm, luto, netdev,
	linux-fsdevel, Eric Paris, Serge Hallyn, viro
In-Reply-To: <20181025175745.5b2b13e9@ivy-bridge>

On 2018-10-25 17:57, Steve Grubb wrote:
> On Thu, 25 Oct 2018 08:27:32 -0400
> Richard Guy Briggs <rgb@redhat.com> wrote:
> 
> > On 2018-10-25 06:49, Paul Moore wrote:
> > > On Thu, Oct 25, 2018 at 2:06 AM Steve Grubb <sgrubb@redhat.com>
> > > wrote:  
> > > > On Wed, 24 Oct 2018 20:42:55 -0400
> > > > Richard Guy Briggs <rgb@redhat.com> wrote:  
> > > > > On 2018-10-24 16:55, Paul Moore wrote:  
> > > > > > On Wed, Oct 24, 2018 at 11:15 AM Richard Guy Briggs
> > > > > > <rgb@redhat.com> wrote:  
> > > > > > > On 2018-10-19 19:16, Paul Moore wrote:  
> > > > > > > > On Sun, Aug 5, 2018 at 4:32 AM Richard Guy Briggs
> > > > > > > > <rgb@redhat.com> wrote:  
> > > 
> > > ...
> > >   
> > > > > > > > > +/*
> > > > > > > > > + * audit_log_contid - report container info
> > > > > > > > > + * @tsk: task to be recorded
> > > > > > > > > + * @context: task or local context for record
> > > > > > > > > + * @op: contid string description
> > > > > > > > > + */
> > > > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > > > +                            struct audit_context
> > > > > > > > > *context, char *op) +{
> > > > > > > > > +       struct audit_buffer *ab;
> > > > > > > > > +
> > > > > > > > > +       if (!audit_contid_set(tsk))
> > > > > > > > > +               return 0;
> > > > > > > > > +       /* Generate AUDIT_CONTAINER record with
> > > > > > > > > container ID */
> > > > > > > > > +       ab = audit_log_start(context, GFP_KERNEL,
> > > > > > > > > AUDIT_CONTAINER);
> > > > > > > > > +       if (!ab)
> > > > > > > > > +               return -ENOMEM;
> > > > > > > > > +       audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > > > +                        op, audit_get_contid(tsk));
> > > > > > > > > +       audit_log_end(ab);
> > > > > > > > > +       return 0;
> > > > > > > > > +}
> > > > > > > > > +EXPORT_SYMBOL(audit_log_contid);  
> > > > > > > >
> > > > > > > > As discussed in the previous iteration of the patch, I
> > > > > > > > prefer AUDIT_CONTAINER_ID here over AUDIT_CONTAINER.  If
> > > > > > > > you feel strongly about keeping it as-is with
> > > > > > > > AUDIT_CONTAINER I suppose I could live with that, but it
> > > > > > > > is isn't my first choice.  
> > > > > > >
> > > > > > > I don't have a strong opinion on this one, mildly
> > > > > > > preferring the shorter one only because it is shorter.  
> > > > > >
> > > > > > We already have multiple AUDIT_CONTAINER* record types, so it
> > > > > > seems as though we should use "AUDIT_CONTAINER" as a prefix
> > > > > > of sorts, rather than a type itself.  
> > > > >
> > > > > I'm fine with that.  I'd still like to hear Steve's input.  He
> > > > > had stronger opinions than me.  
> > > >
> > > > The creation event should be separate and distinct from the
> > > > continuing use when its used as a supplemental record. IOW,
> > > > binding the ID to a container is part of the lifecycle and needs
> > > > to be kept distinct.  
> > > 
> > > Steve's comment is pretty ambiguous when it comes to AUDIT_CONTAINER
> > > vs AUDIT_CONTAINER_ID, but one could argue that AUDIT_CONTAINER_ID
> > > helps distinguish the audit container id marking record and gets to
> > > what I believe is the spirit of Steve's comment.  Taking this in
> > > context with my previous remarks, let's switch to using
> > > AUDIT_CONTAINER_ID.  
> > 
> > I suspect Steve is mixing up AUDIT_CONTAINER_OP with
> > AUDIT_CONTAINER_ID, confusing the fact that they are two seperate
> > records.  As a summary, the suggested records are:
> > 	CONTAINER_OP	audit container identifier creation
> > 	CONTAINER	audit container identifier aux record to an
> > event
> > 
> > and what Paul is suggesting (which is fine by me) is:
> > 	CONTAINER_OP	audit container identifier creation event
> > 	CONTAINER_ID	audit container identifier aux record to
> > an event
> > 
> > Steve, please indicate you are fine with this.
> 
> I thought it was:

It *was*.  It was changed at Paul's request in this v3 thread:
	https://www.redhat.com/archives/linux-audit/2018-July/msg00087.html

And listed in the examples and changelog to this v4 patchset:
	https://www.redhat.com/archives/linux-audit/2018-July/msg00178.html

It is also listed in this userspace patchset update v4 (which should
also have had a changelog added to it, note to self...):
	https://www.redhat.com/archives/linux-audit/2018-July/msg00189.html

I realize it is hard to keep up with all the detail changes in these
patchsets...

> CONTAINER_ID audit container identifier creation event
> CONTAINER audit container identifier aux record to an event
> 
> Or vice versa. Don't mix up creation of the identifier with operations.

Exactly what I'm trying to avoid...  Worded another way: "Don't mix up
the creation operation with routine reporting of the identifier in
events."  Steve, can you and Paul discuss and agree on what they should
be called?  I don't have a horse in this race, but I need to record the
result of that run.  ;-)

> -Steve

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

^ permalink raw reply

* Re: Regression: kernel 4.14 an later very slow with many ipsec tunnels
From: David Miller @ 2018-10-25 17:34 UTC (permalink / raw)
  To: linux
  Cc: netdev, fw, steffen.klassert, linux-kernel, torvalds,
	christophe.gouault, gregkh
In-Reply-To: <2766296.15tpkxTHJV@stwm.de>

From: Wolfgang Walter <linux@stwm.de>
Date: Thu, 25 Oct 2018 11:38:19 +0200

> there is now a new 4.19 which still has the big performance regression when 
> many ipsec tunnels are configured (throughput and latency get worse by 10 to 
> 50 times) which makes any kernel > 4.9 unusable for our routers.
> 
> I still don't understand why a revert of the flow cache removal at least for 
> the longterm kernels is that a bad option (maybe as a compile time option), 
> especially as there is no workaround available.

You do know that the flow cache is DDoS targettable, right?

That's why we removed it, we did not make the change lightly.

Adding a DDoS vector back into the kernel is not an option sorry.

Please work diligently with Florian and others to try and find ways to
soften the performance hit.

Thank you.

^ permalink raw reply

* Re: [PATCH bpf-next 6/6] selftests/bpf: test_verifier, check bpf_map_lookup_elem access in bpf prog
From: Naresh Kamboju @ 2018-10-25  8:54 UTC (permalink / raw)
  To: liu.song.a23, bhole_prashant_q7
  Cc: ast, Daniel Borkmann, jakub.kicinski, David S. Miller,
	quentin.monnet, netdev, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CAPhsuW72jhD+962NjSyxPrMhoeE9d24ArEVm0oDsP4FV46nNVA@mail.gmail.com>

On Tue, 9 Oct 2018 at 12:32, Song Liu <liu.song.a23@gmail.com> wrote:
>
> On Mon, Oct 8, 2018 at 6:07 PM Prashant Bhole
> <bhole_prashant_q7@lab.ntt.co.jp> wrote:
> >
> > map_lookup_elem isn't supported by certain map types like:
> > - BPF_MAP_TYPE_PROG_ARRAY
> > - BPF_MAP_TYPE_STACK_TRACE
> > - BPF_MAP_TYPE_XSKMAP
> > - BPF_MAP_TYPE_SOCKMAP/BPF_MAP_TYPE_SOCKHASH
> > Let's add verfier tests to check whether verifier prevents
> > bpf_map_lookup_elem call on above programs from bpf program.
> >
> > Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> > Acked-by: Alexei Starovoitov <ast@kernel.org>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> > ---
> >  tools/testing/selftests/bpf/test_verifier.c | 121 +++++++++++++++++++-
> >  1 file changed, 120 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> > index 65ae44c85d27..cf4cd32b6772 100644
> > --- a/tools/testing/selftests/bpf/test_verifier.c
> > +++ b/tools/testing/selftests/bpf/test_verifier.c
> > @@ -48,7 +48,7 @@
> >
> >  #define MAX_INSNS      BPF_MAXINSNS
> >  #define MAX_FIXUPS     8
> > -#define MAX_NR_MAPS    8
> > +#define MAX_NR_MAPS    13
> >  #define POINTER_VALUE  0xcafe4all
> >  #define TEST_DATA_LEN  64
> >
> > @@ -65,6 +65,10 @@ struct bpf_test {
> >         int fixup_map_hash_48b[MAX_FIXUPS];
> >         int fixup_map_hash_16b[MAX_FIXUPS];
> >         int fixup_map_array_48b[MAX_FIXUPS];
> > +       int fixup_map_sockmap[MAX_FIXUPS];
> > +       int fixup_map_sockhash[MAX_FIXUPS];
> > +       int fixup_map_xskmap[MAX_FIXUPS];
> > +       int fixup_map_stacktrace[MAX_FIXUPS];
> >         int fixup_prog1[MAX_FIXUPS];
> >         int fixup_prog2[MAX_FIXUPS];
> >         int fixup_map_in_map[MAX_FIXUPS];
> > @@ -4541,6 +4545,85 @@ static struct bpf_test tests[] = {
> >                 .errstr = "invalid access to packet",
> >                 .prog_type = BPF_PROG_TYPE_SCHED_CLS,
> >         },
> > +       {
> > +               "prevent map lookup in sockmap",
> > +               .insns = {
> > +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
> > +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > +                                    BPF_FUNC_map_lookup_elem),
> > +                       BPF_EXIT_INSN(),
> > +               },
> > +               .fixup_map_sockmap = { 3 },
> > +               .result = REJECT,
> > +               .errstr = "cannot pass map_type 15 into func bpf_map_lookup_elem",
> > +               .prog_type = BPF_PROG_TYPE_SOCK_OPS,
> > +       },
> > +       {
> > +               "prevent map lookup in sockhash",
> > +               .insns = {
> > +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
> > +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > +                                    BPF_FUNC_map_lookup_elem),
> > +                       BPF_EXIT_INSN(),
> > +               },
> > +               .fixup_map_sockhash = { 3 },
> > +               .result = REJECT,
> > +               .errstr = "cannot pass map_type 18 into func bpf_map_lookup_elem",
> > +               .prog_type = BPF_PROG_TYPE_SOCK_OPS,
> > +       },
> > +       {
> > +               "prevent map lookup in xskmap",
> > +               .insns = {
> > +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
> > +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > +                                    BPF_FUNC_map_lookup_elem),
> > +                       BPF_EXIT_INSN(),
> > +               },
> > +               .fixup_map_xskmap = { 3 },
> > +               .result = REJECT,
> > +               .errstr = "cannot pass map_type 17 into func bpf_map_lookup_elem",
> > +               .prog_type = BPF_PROG_TYPE_XDP,
> > +       },
> > +       {
> > +               "prevent map lookup in stack trace",
> > +               .insns = {
> > +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
> > +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > +                                    BPF_FUNC_map_lookup_elem),
> > +                       BPF_EXIT_INSN(),
> > +               },
> > +               .fixup_map_stacktrace = { 3 },
> > +               .result = REJECT,
> > +               .errstr = "cannot pass map_type 7 into func bpf_map_lookup_elem",
> > +               .prog_type = BPF_PROG_TYPE_PERF_EVENT,
> > +       },
> > +       {
> > +               "prevent map lookup in prog array",
> > +               .insns = {
> > +                       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > +                       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > +                       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > +                       BPF_LD_MAP_FD(BPF_REG_1, 0),
> > +                       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> > +                                    BPF_FUNC_map_lookup_elem),
> > +                       BPF_EXIT_INSN(),
> > +               },
> > +               .fixup_prog2 = { 3 },
> > +               .result = REJECT,
> > +               .errstr = "cannot pass map_type 3 into func bpf_map_lookup_elem",
> > +       },
> >         {
> >                 "valid map access into an array with a constant",
> >                 .insns = {
> > @@ -13515,6 +13598,10 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
> >         int *fixup_map_hash_48b = test->fixup_map_hash_48b;
> >         int *fixup_map_hash_16b = test->fixup_map_hash_16b;
> >         int *fixup_map_array_48b = test->fixup_map_array_48b;
> > +       int *fixup_map_sockmap = test->fixup_map_sockmap;
> > +       int *fixup_map_sockhash = test->fixup_map_sockhash;
> > +       int *fixup_map_xskmap = test->fixup_map_xskmap;
> > +       int *fixup_map_stacktrace = test->fixup_map_stacktrace;
> >         int *fixup_prog1 = test->fixup_prog1;
> >         int *fixup_prog2 = test->fixup_prog2;
> >         int *fixup_map_in_map = test->fixup_map_in_map;
> > @@ -13603,6 +13690,38 @@ static void do_test_fixup(struct bpf_test *test, enum bpf_map_type prog_type,
> >                         fixup_percpu_cgroup_storage++;
> >                 } while (*fixup_percpu_cgroup_storage);
> >         }
> > +       if (*fixup_map_sockmap) {
> > +               map_fds[9] = create_map(BPF_MAP_TYPE_SOCKMAP, sizeof(int),
> > +                                       sizeof(int), 1);
> > +               do {
> > +                       prog[*fixup_map_sockmap].imm = map_fds[9];
> > +                       fixup_map_sockmap++;
> > +               } while (*fixup_map_sockmap);
> > +       }
> > +       if (*fixup_map_sockhash) {
> > +               map_fds[10] = create_map(BPF_MAP_TYPE_SOCKHASH, sizeof(int),
> > +                                       sizeof(int), 1);
> > +               do {
> > +                       prog[*fixup_map_sockhash].imm = map_fds[10];
> > +                       fixup_map_sockhash++;
> > +               } while (*fixup_map_sockhash);
> > +       }
> > +       if (*fixup_map_xskmap) {
> > +               map_fds[11] = create_map(BPF_MAP_TYPE_XSKMAP, sizeof(int),
> > +                                       sizeof(int), 1);
> > +               do {
> > +                       prog[*fixup_map_xskmap].imm = map_fds[11];
> > +                       fixup_map_xskmap++;
> > +               } while (*fixup_map_xskmap);
> > +       }

selftests: bpf: test_verifier sockmap, sockhash, xskmap failed on
mainline and next
(from 4.19.0-rc7-next-20181011 to till date )
Are we missing any pre-required kernel configs ?

Test log,
------------
selftests: bpf: test_verifier
<>
#274/p prevent map lookup in sockmap Failed to create hash map
'Invalid argument'!
FAIL
Unexpected error message!
EXP: cannot pass map_type 15 into func bpf_map_lookup_elem
RES: fd -1 is not pointing to valid bpf_map
fd -1 is not pointing to valid bpf_map
#275/p prevent map lookup in sockhash Failed to create hash map
'Invalid argument'!
FAIL
Unexpected error message!
EXP: cannot pass map_type 18 into func bpf_map_lookup_elem
RES: fd -1 is not pointing to valid bpf_map
fd -1 is not pointing to valid bpf_map
#276/p prevent map lookup in xskmap Failed to create hash map 'Invalid
argument'!
FAIL
Unexpected error message!
EXP: cannot pass map_type 17 into func bpf_map_lookup_elem
RES: fd -1 is not pointing to valid bpf_map
fd -1 is not pointing to valid bpf_map
<>
Summary: 962 PASSED, 0 SKIPPED, 3 FAILED
not ok 1..1 selftests: bpf: test_verifier [FAIL]
selftests: bpf_test_verifier [FAIL]

-mainline results history,
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier

-next results history,
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/kselftest/bpf_test_verifier

Test case full log,
https://lkft.validation.linaro.org/scheduler/job/461881#L1655

Best regards
Naresh Kamboju

> > +       if (*fixup_map_stacktrace) {
> > +               map_fds[12] = create_map(BPF_MAP_TYPE_STACK_TRACE, sizeof(u32),
> > +                                        sizeof(u64), 1);
> > +               do {
> > +                       prog[*fixup_map_stacktrace].imm = map_fds[12];
> > +                       fixup_map_stacktrace++;
> > +               } while (fixup_map_stacktrace);
> > +       }
> >  }
> >
> >  static void do_test_single(struct bpf_test *test, bool unpriv,
> > --
> > 2.17.1
> >
> >

^ permalink raw reply

* Re: [PATCH nf] netfilter: ipv6: fix oops when defragmenting locally generated fragments
From: Pablo Neira Ayuso @ 2018-10-25  8:18 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netfilter-devel, lorenzo, zenczykowski, edumazet, netdev, maze
In-Reply-To: <20181023144716.19746-1-fw@strlen.de>

On Tue, Oct 23, 2018 at 04:47:16PM +0200, Florian Westphal wrote:
> Unlike ipv4 and normal ipv6 defrag, netfilter ipv6 defragmentation did
> not save/restore skb->dst.
> 
> This causes oops when handling locally generated ipv6 fragments, as
> output path needs a valid dst.

Applied, thanks!

^ permalink raw reply

* RE: [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
From: Wang, Dongsheng @ 2018-10-25  8:16 UTC (permalink / raw)
  To: timur@kernel.org, andrew@lunn.ch
  Cc: netdev@vger.kernel.org, Zheng, Joey, f.fainelli@gmail.com
In-Reply-To: <20181025081503.15683-1-dongsheng.wang@hxt-semitech.com>

Sorry, please ignore this patch.

Cheers,
-Dongsheng


> -----Original Message-----
> From: Wang, Dongsheng
> Sent: Thursday, October 25, 2018 4:15 PM
> To: timur@kernel.org; andrew@lunn.ch
> Cc: netdev@vger.kernel.org; Wang, Dongsheng
> <dongsheng.wang@hxt-semitech.com>; Zheng, Joey
> <yu.zheng@hxt-semitech.com>; f.fainelli@gmail.com
> Subject: [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
> 
> The emac include MDIO controller, and the motherboard has more than one
> PHY connected to an MDIO bus. So share the shared mii_bus for others MAC
> device that not has MDIO bus connected.
> 
> Based on ACPI, since "phy-handle" cannot directly point to a _DSD sub-package,
> so we use "phy-handle" to point an internal MDIO device port.
> The port describes the phy address.
> 
> Tested: QDF2400 (ACPI), buildin/insmod/rmmod
> 
> V3:
>  - Add "phy-handle" support.
>  - Remove all of DT changes.
> 
> V2:
>  - Separate patch.
> 
> Wang Dongsheng (2):
>   net: qcom/emac: split phy_config to mdio bus create and get phy device
>   net: qcom/emac: add phy-handle support for ACPI
> 
>  drivers/net/ethernet/qualcomm/emac/emac-phy.c | 183 ++++++++++++++----
>  1 file changed, 142 insertions(+), 41 deletions(-)
> 
> --
> 2.18.0

^ permalink raw reply

* [PATCH v3 0/2] net: qcom/emac: add shared mdio bus support
From: Wang Dongsheng @ 2018-10-25  8:15 UTC (permalink / raw)
  To: timur, andrew; +Cc: netdev, Wang Dongsheng, yu.zheng, f.fainelli

The emac include MDIO controller, and the motherboard has more than one
PHY connected to an MDIO bus. So share the shared mii_bus for others MAC
device that not has MDIO bus connected.

Based on ACPI, since "phy-handle" cannot directly point to a _DSD
sub-package, so we use "phy-handle" to point an internal MDIO device port.
The port describes the phy address.

Tested: QDF2400 (ACPI), buildin/insmod/rmmod

V3:
 - Add "phy-handle" support.
 - Remove all of DT changes.

V2:
 - Separate patch.

Wang Dongsheng (2):
  net: qcom/emac: split phy_config to mdio bus create and get phy device
  net: qcom/emac: add phy-handle support for ACPI

 drivers/net/ethernet/qualcomm/emac/emac-phy.c | 183 ++++++++++++++----
 1 file changed, 142 insertions(+), 41 deletions(-)

-- 
2.18.0

^ permalink raw reply

* Re: [PATCH ghak90 (was ghak32) V4 03/10] audit: log container info of syscalls
From: Steve Grubb @ 2018-10-25 15:57 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Paul Moore, simo, carlos, linux-api, containers, linux-kernel,
	dhowells, linux-audit, netfilter-devel, ebiederm, luto, netdev,
	linux-fsdevel, Eric Paris, Serge Hallyn, viro
In-Reply-To: <20181025122732.4j4rbychjse3gemt@madcap2.tricolour.ca>

On Thu, 25 Oct 2018 08:27:32 -0400
Richard Guy Briggs <rgb@redhat.com> wrote:

> On 2018-10-25 06:49, Paul Moore wrote:
> > On Thu, Oct 25, 2018 at 2:06 AM Steve Grubb <sgrubb@redhat.com>
> > wrote:  
> > > On Wed, 24 Oct 2018 20:42:55 -0400
> > > Richard Guy Briggs <rgb@redhat.com> wrote:  
> > > > On 2018-10-24 16:55, Paul Moore wrote:  
> > > > > On Wed, Oct 24, 2018 at 11:15 AM Richard Guy Briggs
> > > > > <rgb@redhat.com> wrote:  
> > > > > > On 2018-10-19 19:16, Paul Moore wrote:  
> > > > > > > On Sun, Aug 5, 2018 at 4:32 AM Richard Guy Briggs
> > > > > > > <rgb@redhat.com> wrote:  
> > 
> > ...
> >   
> > > > > > > > +/*
> > > > > > > > + * audit_log_contid - report container info
> > > > > > > > + * @tsk: task to be recorded
> > > > > > > > + * @context: task or local context for record
> > > > > > > > + * @op: contid string description
> > > > > > > > + */
> > > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > > +                            struct audit_context
> > > > > > > > *context, char *op) +{
> > > > > > > > +       struct audit_buffer *ab;
> > > > > > > > +
> > > > > > > > +       if (!audit_contid_set(tsk))
> > > > > > > > +               return 0;
> > > > > > > > +       /* Generate AUDIT_CONTAINER record with
> > > > > > > > container ID */
> > > > > > > > +       ab = audit_log_start(context, GFP_KERNEL,
> > > > > > > > AUDIT_CONTAINER);
> > > > > > > > +       if (!ab)
> > > > > > > > +               return -ENOMEM;
> > > > > > > > +       audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > > +                        op, audit_get_contid(tsk));
> > > > > > > > +       audit_log_end(ab);
> > > > > > > > +       return 0;
> > > > > > > > +}
> > > > > > > > +EXPORT_SYMBOL(audit_log_contid);  
> > > > > > >
> > > > > > > As discussed in the previous iteration of the patch, I
> > > > > > > prefer AUDIT_CONTAINER_ID here over AUDIT_CONTAINER.  If
> > > > > > > you feel strongly about keeping it as-is with
> > > > > > > AUDIT_CONTAINER I suppose I could live with that, but it
> > > > > > > is isn't my first choice.  
> > > > > >
> > > > > > I don't have a strong opinion on this one, mildly
> > > > > > preferring the shorter one only because it is shorter.  
> > > > >
> > > > > We already have multiple AUDIT_CONTAINER* record types, so it
> > > > > seems as though we should use "AUDIT_CONTAINER" as a prefix
> > > > > of sorts, rather than a type itself.  
> > > >
> > > > I'm fine with that.  I'd still like to hear Steve's input.  He
> > > > had stronger opinions than me.  
> > >
> > > The creation event should be separate and distinct from the
> > > continuing use when its used as a supplemental record. IOW,
> > > binding the ID to a container is part of the lifecycle and needs
> > > to be kept distinct.  
> > 
> > Steve's comment is pretty ambiguous when it comes to AUDIT_CONTAINER
> > vs AUDIT_CONTAINER_ID, but one could argue that AUDIT_CONTAINER_ID
> > helps distinguish the audit container id marking record and gets to
> > what I believe is the spirit of Steve's comment.  Taking this in
> > context with my previous remarks, let's switch to using
> > AUDIT_CONTAINER_ID.  
> 
> I suspect Steve is mixing up AUDIT_CONTAINER_OP with
> AUDIT_CONTAINER_ID, confusing the fact that they are two seperate
> records.  As a summary, the suggested records are:
> 	CONTAINER_OP	audit container identifier creation
> 	CONTAINER	audit container identifier aux record to an
> event
> 
> and what Paul is suggesting (which is fine by me) is:
> 	CONTAINER_OP	audit container identifier creation event
> 	CONTAINER_ID	audit container identifier aux record to
> an event
> 
> Steve, please indicate you are fine with this.

I thought it was:

CONTAINER_ID audit container identifier creation event
CONTAINER audit container identifier aux record to an event

Or vice versa. Don't mix up creation of the identifier with operations.

-Steve

^ permalink raw reply

* Re: [PATCH] netfilter: conntrack: fix calculation of next bucket number in early_drop
From: Vasiliy Khoruzhick @ 2018-10-25  6:37 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, netfilter-devel, coreteam, netdev, linux-kernel,
	stable
In-Reply-To: <a2fa755a-048b-459a-b526-2fe27cbb6574@arista.com>

On Wed, Oct 24, 2018 at 9:52 PM, Dmitry Safonov <dima@arista.com> wrote:
> On 10/25/18 4:48 AM, Vasily Khoruzhick wrote:
>>
>> If there's no entry to drop in bucket that corresponds to the hash,
>> early_drop() should look for it in other buckets. But since it increments
>> hash instead of bucket number, it actually looks in the same bucket 8
>> times: hsize is 16k by default (14 bits) and hash is 32-bit value, so
>> reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in
>> most cases.
>>
>> Fix it by increasing bucket number instead of hash and rename _hash
>> to bucket to avoid future confusion.
>>
>> Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in
>> early_drop logic")
>> Cc: <stable@vger.kernel.org> # v4.7+
>> Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
>
>
> Nice work!
>
> Reviewed-by: Dmitry Safonov <dima@arista.com>

Oops, found a bug in it - 'bucket' should be moved outside of the
loop. I'll send v2 tomorrow morning.

>
>> ---
>>   net/netfilter/nf_conntrack_core.c | 11 +++++++----
>>   1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/netfilter/nf_conntrack_core.c
>> b/net/netfilter/nf_conntrack_core.c
>> index ca1168d67fac..a04af246b184 100644
>> --- a/net/netfilter/nf_conntrack_core.c
>> +++ b/net/netfilter/nf_conntrack_core.c
>> @@ -1073,19 +1073,22 @@ static unsigned int early_drop_list(struct net
>> *net,
>>         return drops;
>>   }
>>   -static noinline int early_drop(struct net *net, unsigned int _hash)
>> +static noinline int early_drop(struct net *net, unsigned int hash)
>>   {
>>         unsigned int i;
>>         for (i = 0; i < NF_CT_EVICTION_RANGE; i++) {
>>                 struct hlist_nulls_head *ct_hash;
>> -               unsigned int hash, hsize, drops;
>> +               unsigned int bucket, hsize, drops;
>>                 rcu_read_lock();
>>                 nf_conntrack_get_ht(&ct_hash, &hsize);
>> -               hash = reciprocal_scale(_hash++, hsize);
>> +               if (!i)
>> +                       bucket = reciprocal_scale(hash, hsize);
>> +               else
>> +                       bucket = (bucket + 1) % hsize;
>>   -             drops = early_drop_list(net, &ct_hash[hash]);
>> +               drops = early_drop_list(net, &ct_hash[bucket]);
>>                 rcu_read_unlock();
>>                 if (drops) {
>>
>
> --
>           Dima

^ permalink raw reply

* Re: [PATCH net] net: sched: Remove TCA_OPTIONS from policy
From: Jiri Pirko @ 2018-10-25  6:31 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, davem, pupilla, David Ahern
In-Reply-To: <20181024153249.15374-1-dsahern@kernel.org>

Wed, Oct 24, 2018 at 05:32:49PM CEST, dsahern@kernel.org wrote:
>From: David Ahern <dsahern@gmail.com>
>
>Marco reported an error with hfsc:
>root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1
>Error: Attribute failed policy validation.
>
>Apparently a few implementations pass TCA_OPTIONS as a binary instead
>of nested attribute, so drop TCA_OPTIONS from the policy.

Yeah, this is nice example of a case, where I think it wouldn't hurt to
be a bit more strict. Apparently, the userspace app is buggy. It should
be fixed. Note that I'm aware of the bw compatibility.


>
>Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")
>Reported-by: Marco Berizzi <pupilla@libero.it>
>Signed-off-by: David Ahern <dsahern@gmail.com>
>---
> net/sched/sch_api.c | 1 -
> 1 file changed, 1 deletion(-)
>
>diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
>index 3dc0acf54245..be7cd140b2a3 100644
>--- a/net/sched/sch_api.c
>+++ b/net/sched/sch_api.c
>@@ -1309,7 +1309,6 @@ check_loop_fn(struct Qdisc *q, unsigned long cl, struct qdisc_walker *w)
> 
> const struct nla_policy rtm_tca_policy[TCA_MAX + 1] = {
> 	[TCA_KIND]		= { .type = NLA_STRING },
>-	[TCA_OPTIONS]		= { .type = NLA_NESTED },

A future developer might add this again. A comment why not would be good
here.


> 	[TCA_RATE]		= { .type = NLA_BINARY,
> 				    .len = sizeof(struct tc_estimator) },
> 	[TCA_STAB]		= { .type = NLA_NESTED },
>-- 
>2.11.0
>

^ permalink raw reply

* Re: [PATCH ghak90 (was ghak32) V4 03/10] audit: log container info of syscalls
From: Paul Moore @ 2018-10-25  6:13 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: containers, linux-api, linux-audit, linux-fsdevel, linux-kernel,
	netdev, netfilter-devel, ebiederm, luto, carlos, dhowells, viro,
	simo, Eric Paris, Serge Hallyn
In-Reply-To: <20181025004255.zl7p7j6gztouh2hh@madcap2.tricolour.ca>

On October 25, 2018 1:43:16 AM Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2018-10-24 16:55, Paul Moore wrote:
>> On Wed, Oct 24, 2018 at 11:15 AM Richard Guy Briggs <rgb@redhat.com> wrote:
>>> On 2018-10-19 19:16, Paul Moore wrote:
>>>> On Sun, Aug 5, 2018 at 4:32 AM Richard Guy Briggs <rgb@redhat.com> wrote:
>

...

>
>>>> However, I do care about the "op" field in this record.  It just
>>>> doesn't make any sense; the way you are using it it is more of a
>>>> context field than an operations field, and even then why is the
>>>> context important from a logging and/or security perspective?  Drop it
>>>> please.
>>>
>>> I'll rename it to whatever you like.  I'd suggest "ref=".  The reason I
>>> think it is important is there are multiple sources that aren't always
>>> obvious from the other records to which it is associated.  In the case
>>> of ptrace and signals, there can be many target tasks listed (OBJ_PID)
>>> with no other way to distinguish the matching audit container identifier
>>> records all for one event.  This is in addition to the default syscall
>>> container identifier record.  I'm not currently happy with the text
>>> content to link the two, but that should be solvable (most obvious is
>>> taret PID).  Throwing away this information seems shortsighted.
>>
>> It would be helpful if you could generate real audit events
>> demonstrating the problems you are describing, as well as a more
>> standard syscall event, so we can discuss some possible solutions.
>
> If the auditted process is in a container and it ptraces or signals
> another process in a container, there will be two AUDIT_CONTAINER
> records for the same event that won't be identified as to which record
> belongs to which process or other record (SYSCALL vs 1+ OBJ_PID
> records).  There could be many signals recorded, each with their own
> OBJ_PID record.  The first is stored in the audit context and additional
> ones are stored in a chained struct that can accommodate 16 entries each.
>
> (See audit_signal_info(), __audit_ptrace().)
>
> (As a side note, on code inspection it appears that a signal target
> would get overwritten by a ptrace action if they were to happen in that
> order.)

As requested above, please respond with real audit events generated by this patchset so that we can discuss possible solutions.

^ permalink raw reply

* Re: [PATCH net-next v8 28/28] net: WireGuard secure network tunnel
From: Jason A. Donenfeld @ 2018-10-25 14:43 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: LKML, Netdev, Linux Crypto Mailing List, David Miller,
	Greg Kroah-Hartman
In-Reply-To: <20181020224706.GC14816@lunn.ch>

Hi Andrew,

Thanks for the review. Comments and fix links are inline below.

On Sun, Oct 21, 2018 at 12:47 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > +#define choose_node(parent, key)                                               \
> > +     parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1]
> This should be a function, not a macro.

That prevents it from being used as an lvalue, which is mostly where
it's used, with the rcu-protected structure, so this will remain a
macro. I'll make it upper-case though.

> > +
> > +static void node_free_rcu(struct rcu_head *rcu)
> > +{
> > +     kfree(container_of(rcu, struct allowedips_node, rcu));
> > +}
> > +
> > +#define push_rcu(stack, p, len) ({                                             \
> > +             if (rcu_access_pointer(p)) {                                   \
> > +                     WARN_ON(IS_ENABLED(DEBUG) && (len) >= 128);            \
> > +                     stack[(len)++] = rcu_dereference_raw(p);               \
> > +             }                                                              \
> > +             true;                                                          \
> > +     })
>
> This also looks like it could be a function.

You're right. I've changed this now. That produces slightly worse code
on gcc 8.2, but that's not really hot path anyway, so it doesn't
matter.

> > +static void root_free_rcu(struct rcu_head *rcu)
> > +{
> > +     struct allowedips_node *node, *stack[128] = {
> > +             container_of(rcu, struct allowedips_node, rcu) };
> > +     unsigned int len = 1;
> > +
> > +     while (len > 0 && (node = stack[--len]) &&
> > +            push_rcu(stack, node->bit[0], len) &&
> > +            push_rcu(stack, node->bit[1], len))
> > +             kfree(node);
> > +}
> > +
> > +#define ref(p) rcu_access_pointer(p)
> > +#define deref(p) rcu_dereference_protected(*(p), lockdep_is_held(lock))
>
> Macros should be uppercase, or better still, functions.
>
> > +#define push(p) ({                                                             \
> > +             WARN_ON(IS_ENABLED(DEBUG) && len >= 128);                      \
> > +             stack[len++] = p;                                              \
> > +     })
>
> This one definitely should be upper case, to warn readers it has
> unexpected side effects.

I've made these little helper macros uppercase.

>
> > +
> > +static void walk_remove_by_peer(struct allowedips_node __rcu **top,
> > +                             struct wg_peer *peer, struct mutex *lock)
> > +{
> > +     struct allowedips_node __rcu **stack[128], **nptr;
> > +     struct allowedips_node *node, *prev;
> > +     unsigned int len;
> > +
> > +     if (unlikely(!peer || !ref(*top)))
> > +             return;
> > +
> > +     for (prev = NULL, len = 0, push(top); len > 0; prev = node) {
> > +             nptr = stack[len - 1];
> > +             node = deref(nptr);
> > +             if (!node) {
> > +                     --len;
> > +                     continue;
> > +             }
> > +             if (!prev || ref(prev->bit[0]) == node ||
> > +                 ref(prev->bit[1]) == node) {
> > +                     if (ref(node->bit[0]))
> > +                             push(&node->bit[0]);
> > +                     else if (ref(node->bit[1]))
> > +                             push(&node->bit[1]);
> > +             } else if (ref(node->bit[0]) == prev) {
> > +                     if (ref(node->bit[1]))
> > +                             push(&node->bit[1]);
> > +             } else {
> > +                     if (rcu_dereference_protected(node->peer,
> > +                             lockdep_is_held(lock)) == peer) {
> > +                             RCU_INIT_POINTER(node->peer, NULL);
> > +                             if (!node->bit[0] || !node->bit[1]) {
> > +                                     rcu_assign_pointer(*nptr,
> > +                                     deref(&node->bit[!ref(node->bit[0])]));
> > +                                     call_rcu_bh(&node->rcu, node_free_rcu);
> > +                                     node = deref(nptr);
> > +                             }
> > +                     }
> > +                     --len;
> > +             }
> > +     }
> > +}
> > +
> > +#undef ref
> > +#undef deref
> > +#undef push
> > +
> > +static __always_inline unsigned int fls128(u64 a, u64 b)
> > +{
> > +     return a ? fls64(a) + 64U : fls64(b);
> > +}
>
> Does the compiler actually get this wrong? Not inline it?

Actually for this function (in contrast with all of the other ones
marked as such, which really must have the annotation), recent gcc
gets it right, so I've removed the annotation. Nice catch.

> > +
> > +static __always_inline u8 common_bits(const struct allowedips_node *node,
> > +                                   const u8 *key, u8 bits)
> > +{
> > +     if (bits == 32)
> > +             return 32U - fls(*(const u32 *)node->bits ^ *(const u32 *)key);
> > +     else if (bits == 128)
> > +             return 128U - fls128(
> > +                     *(const u64 *)&node->bits[0] ^ *(const u64 *)&key[0],
> > +                     *(const u64 *)&node->bits[8] ^ *(const u64 *)&key[8]);
> > +     return 0;
> > +}
> > +
> > +/* This could be much faster if it actually just compared the common bits
> > + * properly, by precomputing a mask bswap(~0 << (32 - cidr)), and the rest, but
> > + * it turns out that common_bits is already super fast on modern processors,
> > + * even taking into account the unfortunate bswap. So, we just inline it like
> > + * this instead.
> > + */
> > +#define prefix_matches(node, key, bits)                                        \
> > +     (common_bits(node, key, bits) >= (node)->cidr)
>
> Could be a function.

I've made this a function now and confirmed codegen did not change
significantly.

> > +/* Returns a strong reference to a peer */
> > +static __always_inline struct wg_peer *
> > +lookup(struct allowedips_node __rcu *root, u8 bits, const void *be_ip)
> > +{
> > +     u8 ip[16] __aligned(__alignof(u64));
>
> You virtually never see aligned stack variables. This needs some sort
> of comment.

I've added a comment. The reason, if you're curious, is so that we can
pass it to fls64 on all platforms.

> > +__attribute__((nonnull(1))) static bool
> > +node_placement(struct allowedips_node __rcu *trie, const u8 *key, u8 cidr,
> > +            u8 bits, struct allowedips_node **rnode, struct mutex *lock)
> > +{
> > +     struct allowedips_node *node = rcu_dereference_protected(trie,
> > +                                             lockdep_is_held(lock));
> > +     struct allowedips_node *parent = NULL;
> > +     bool exact = false;
>
> Should there be a WARN_ON(!key) here, since the attribute will only
> detect problems at compile time, and maybe some runtime cases will get
> passed it?

Actually no, it's only used in one place, and that function is pretty
clear about checking beforehand, and even this function checks
implicitly. Looking at my commit log, I had added the annotation to
make clang-analyzer happy, because it couldn't (at the time) deduce
things correctly from rcu_access_pointer(p) checks. It looks like
recent clang-analyzer has fixed that, and besides I'm not sure the
kernel is in the business of adding annotations to work around
static-analyzer-bug-of-the-week. So I've gotten rid of the annotation.
Thanks for bringing my attention to it.

> > +             net_dbg_ratelimited("%s: Could not decrypt invalid cookie response\n",
> > +                                 wg->dev->name);
>
> It might be worth adding a netdev_dbg_ratelimited(), which takes a
> netdev as its first parameter, just line netdev_dbg().

That sounds like it could be useful. Though, I'm trying hard to make
the first patch submission _not_ need to touch any of the rest of the
networking stack. I've actually got a small list of interesting
networking stack changes that might be useful for WireGuard, but I
think I'd prefer to submit these after this is all merged, and each
one separately for a separate discussion, if that's okay with you.

> > +
> > +#if defined(CONFIG_PM_SLEEP) && !defined(CONFIG_ANDROID)
>
> I don't see any other code which uses this combination. Why is this
> needed?

WireGuard clears private key material before going to sleep, so that
ephemeral keys never live longer in ram than their expiration date.
This works well for mostly everything, except Android devices where
crazy out-of-tree wireless drivers (such as qcacld) will actually wake
and sleep the phone constantly, for a few milliseconds each wake, to
handle incoming packets. In this case, we must not zero out ephemeral
keys, so that these packets can actually be decrypted. There are no
other systems that have this characteristic, as far as I can tell, and
I'm certainly not willing to add a runtime configuration nob for that,
and it turns out that matching on CONFIG_ANDROID as such does the
right thing all of the time. If the landscape becomes more
complicated, we can visit it, probably in the form of passing reasons
through to pm_notification and related scaffolding. But in lieu of
this infrastructure someday existing, I prefer to do the
simple-and-always-works-anyway solution.

> > +     while (skb_queue_len(&peer->staged_packet_queue) > MAX_STAGED_PACKETS)
> > +             dev_kfree_skb(__skb_dequeue(&peer->staged_packet_queue));
>
> It would be good to have some stats counters in here. Maybe the
> standard interface statistics will cover it, otherwise ethtool -S.

Good point, I need to increment the drop counters there. I'll add
that; good catch.

Willy and I actually discussed around a year ago adding a series of
wireguard-specific error counters for all the various unusual things
that can happen. This is something I'll probably investigate in
earnest post-merge, as it's a bit of a "bell and whistle" thing, and
the existing counters now are already sufficient for most purposes.

> You should also get this code looked at by some of the queueing
> people. Rather than discarding, you might want to be applying back
> pressure to slow down the sender application?

Actually Toke, DaveT, and I have agonized quite a bit over the
queueing mechanisms in WireGuard, as well as two GSoC summers looking
at it with students. There's been lots of flent and rrul and tweaking
involved, and the entire queueing system has gone through probably 15
different revisions of trying things out and experimenting. At this
point, I think the way we're doing queueing and when we're dropping,
and those various factors, are taken care of pretty well. Future work
still involves considering shoehorning some fq_codel in a similar way
that wireless does, and a few other tweeks, but I think the current
system is good and stable and very much sufficient.

The particular instance you're pointing out though, with
staged_packet_queue, is not the relevant queue and isn't what you have
in mind with regards to backpressure. This is simply a very short
lived thing for dealing with packets before a handshake is made, and
for handling xmit being used from multiple cores, but it's not _the_
queueing system that's relevant in the actual buffering and latency
calculations.

A little bit of the queueing stuff is discussed in the paper I
presented at Netdev 2.2, if you're interested:
https://www.wireguard.com/papers/wireguard-netdev22.pdf Some of that
is outdated by now, but should give you some idea.

> > +static void kdf(u8 *first_dst, u8 *second_dst, u8 *third_dst, const u8 *data,
> > +             size_t first_len, size_t second_len, size_t third_len,
> > +             size_t data_len, const u8 chaining_key[NOISE_HASH_LEN])
> > +{
> > +     u8 output[BLAKE2S_HASH_SIZE + 1];
> > +     u8 secret[BLAKE2S_HASH_SIZE];
> > +
> > +     WARN_ON(IS_ENABLED(DEBUG) &&
> > +             (first_len > BLAKE2S_HASH_SIZE ||
> > +              second_len > BLAKE2S_HASH_SIZE ||
> > +              third_len > BLAKE2S_HASH_SIZE ||
> > +              ((second_len || second_dst || third_len || third_dst) &&
> > +               (!first_len || !first_dst)) ||
> > +              ((third_len || third_dst) && (!second_len || !second_dst))));
>
> Maybe split this up into a number of WARN_ON()s. At the moment, if it
> fires, you have little idea why, there are so many comparisons. Also,
> is this on the hot path? I guess not, since this is keys, not
> packets. Do you need to care about DEBUG here?

This is on the hot path, actually. Well, it's not on path of data
packets, but I do consider handshake packets to be fairly "warm". I'm
not especially keen on splitting that up into multiple warn_ons,
mostly because if that is ever hint, something is seriously wrong, and
the whole flow would likely then need auditing anyway.

Regards,
Jason

^ permalink raw reply

* Re: [PATCH 2/2] Bluetooth: btqcomsmd: use HCI_QUIRK_USE_BDADDR_PROPERTY
From: Balakrishna Godavarthi @ 2018-10-25 14:40 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Marcel Holtmann, Johan Hedberg, David S . Miller, Loic Poulain,
	linux-bluetooth, netdev, linux-kernel, Brian Norris,
	Dmitry Grinberg
In-Reply-To: <20181025002134.256791-3-mka@chromium.org>

Hi Matthias,

On 2018-10-25 05:51, Matthias Kaehlcke wrote:
> Use the HCI_QUIRK_USE_BDADDR_PROPERTY quirk to let the HCI
> core handle the reading of 'local-bd-address'. With this there
> is no need to set HCI_QUIRK_INVALID_BDADDR, the case of a
> non-existing or invalid fwnode property is handled by the core
> code.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
> I couldn't actually test the changes in this driver since I
> don't have a device with this controller. Could someone
> from Qualcomm help with this?
> ---
>  drivers/bluetooth/btqcomsmd.c | 28 +++-------------------------
>  1 file changed, 3 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/bluetooth/btqcomsmd.c 
> b/drivers/bluetooth/btqcomsmd.c
> index 7df3eed1ef5e..e5841602c4f1 100644
> --- a/drivers/bluetooth/btqcomsmd.c
> +++ b/drivers/bluetooth/btqcomsmd.c
> @@ -125,23 +125,10 @@ static int btqcomsmd_setup(struct hci_dev *hdev)
>  		return PTR_ERR(skb);
>  	kfree_skb(skb);
> 
> -	/* Devices do not have persistent storage for BD address. If no
> -	 * BD address has been retrieved during probe, mark the device
> -	 * as having an invalid BD address.
> +	/* Devices do not have persistent storage for BD address. Retrieve
> +	 * it from the firmware node property.
>  	 */
> -	if (!bacmp(&btq->bdaddr, BDADDR_ANY)) {
> -		set_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
> -		return 0;
> -	}
> -
> -	/* When setting a configured BD address fails, mark the device
> -	 * as having an invalid BD address.
> -	 */
> -	err = qca_set_bdaddr_rome(hdev, &btq->bdaddr);
> -	if (err) {
> -		set_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
> -		return 0;
> -	}
> +	set_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks);
> 
>  	return 0;
>  }
> @@ -169,15 +156,6 @@ static int btqcomsmd_probe(struct platform_device 
> *pdev)
>  	if (IS_ERR(btq->cmd_channel))
>  		return PTR_ERR(btq->cmd_channel);
> 
> -	/* The local-bd-address property is usually injected by the
> -	 * bootloader which has access to the allocated BD address.
> -	 */
> -	if (!of_property_read_u8_array(pdev->dev.of_node, "local-bd-address",
> -				       (u8 *)&btq->bdaddr, sizeof(bdaddr_t))) {
> -		dev_info(&pdev->dev, "BD address %pMR retrieved from device-tree",
> -			 &btq->bdaddr);
> -	}
> -
>  	hdev = hci_alloc_dev();
>  	if (!hdev)
>  		return -ENOMEM;

nit: can be remove unused "bdaddr_t bdaddr" variable.
https://elixir.bootlin.com/linux/v4.19-rc8/source/drivers/bluetooth/btqcomsmd.c#L31
Apart from this, It looks ok to me.

Reviewed-by: Balakrishna Godavarthi <bgodavar@codeaurora.org>
-- 
Regards
Balakrishna.

^ permalink raw reply

* Re: [PATCH 1/2] Bluetooth: Add quirk for reading BD_ADDR from fwnode property
From: Balakrishna Godavarthi @ 2018-10-25 14:36 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Marcel Holtmann, Johan Hedberg, David S . Miller, Loic Poulain,
	linux-bluetooth, netdev, linux-kernel, Brian Norris,
	Dmitry Grinberg
In-Reply-To: <20181025002134.256791-2-mka@chromium.org>

On 2018-10-25 05:51, Matthias Kaehlcke wrote:
> Add HCI_QUIRK_USE_BDADDR_PROPERTY to allow controllers to retrieve
> the public Bluetooth address from the firmware node property
> 'local-bd-address'. If quirk is set and the property does not exist
> or is invalid the controller is marked as unconfigured.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
> hci_dev_get_bd_addr_from_property() currently assumes that the
> firmware node with 'local-bd-address' is from hdev->dev.parent, not
> sure if this universally true. However if it is true for existing
> device that might use this interface we can assume this for now
> (unless there is a clear solution now), and cross the bridge of
> finding an alternative when we actually encounter the situation.
> One option could be to look for the first parent that has a fwnode.
> ---
>  include/net/bluetooth/hci.h | 12 +++++++++++
>  net/bluetooth/hci_core.c    | 42 +++++++++++++++++++++++++++++++++++++
>  net/bluetooth/mgmt.c        |  6 ++++--
>  3 files changed, 58 insertions(+), 2 deletions(-)
> 
> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> index cdd9f1fe7cfa..a5d748099752 100644
> --- a/include/net/bluetooth/hci.h
> +++ b/include/net/bluetooth/hci.h
> @@ -158,6 +158,18 @@ enum {
>  	 */
>  	HCI_QUIRK_INVALID_BDADDR,
> 
> +	/* When this quirk is set, the public Bluetooth address
> +	 * initially reported by HCI Read BD Address command
> +	 * is considered invalid. The public BD Address can be
> +	 * specified in the fwnode property 'local-bd-address'.
> +	 * If this property does not exist or is invalid controller
> +	 * configuration is required before this device can be used.
> +	 *
> +	 * This quirk can be set before hci_register_dev is called or
> +	 * during the hdev->setup vendor callback.
> +	 */
> +	HCI_QUIRK_USE_BDADDR_PROPERTY,
> +
>  	/* When this quirk is set, the duplicate filtering during
>  	 * scanning is based on Bluetooth devices addresses. To allow
>  	 * RSSI based updates, restart scanning if needed.
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index 74b29c7d841c..97214262c4fb 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -30,6 +30,7 @@
>  #include <linux/rfkill.h>
>  #include <linux/debugfs.h>
>  #include <linux/crypto.h>
> +#include <linux/property.h>
>  #include <asm/unaligned.h>
> 
>  #include <net/bluetooth/bluetooth.h>
> @@ -1355,9 +1356,40 @@ int hci_inquiry(void __user *arg)
>  	return err;
>  }
> 
> +/**
> + * hci_dev_get_bd_addr_from_property - Get the Bluetooth Device 
> Address
> + *				       (BD_ADDR) for a HCI device from
> + *				       a firmware node property.
> + * @hdev:	The HCI device
> + *
> + * Search the firmware node for 'local-bd-address'.
> + *
> + * All-zero BD addresses are rejected, because those could be 
> properties
> + * that exist in the firmware tables, but were not updated by the 
> firmware. For
> + * example, the DTS could define 'local-bd-address', with zero BD 
> addresses.
> + */
> +static int hci_dev_get_bd_addr_from_property(struct hci_dev *hdev)
> +{
> +	struct fwnode_handle *fwnode = dev_fwnode(hdev->dev.parent);
> +	bdaddr_t ba;
> +	int ret;
> +
> +	ret = fwnode_property_read_u8_array(fwnode, "local-bd-address",
> +					    (u8 *)&ba, sizeof(ba));
> +	if (ret < 0)
> +		return ret;
> +	if (!bacmp(&ba, BDADDR_ANY))
> +		return -ENODATA;
> +
> +	hdev->public_addr = ba;
> +
> +	return 0;
> +}
> +
>  static int hci_dev_do_open(struct hci_dev *hdev)
>  {
>  	int ret = 0;
> +	bool bd_addr_set = false;
> 
>  	BT_DBG("%s %p", hdev->name, hdev);
> 
> @@ -1422,6 +1454,16 @@ static int hci_dev_do_open(struct hci_dev *hdev)
>  		if (hdev->setup)
>  			ret = hdev->setup(hdev);
> 
> +		if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) {
> +			if (!hci_dev_get_bd_addr_from_property(hdev))
> +				if (hdev->set_bdaddr &&
> +				    !hdev->set_bdaddr(hdev, &hdev->public_addr))
> +					bd_addr_set = true;
> +
> +			if (!bd_addr_set)
> +				hci_dev_set_flag(hdev, HCI_UNCONFIGURED);
> +		}
> +
>  		/* The transport driver can set these quirks before
>  		 * creating the HCI device or in its setup callback.
>  		 *
> diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
> index 3bdc8f3ca259..3d9edb752403 100644
> --- a/net/bluetooth/mgmt.c
> +++ b/net/bluetooth/mgmt.c
> @@ -551,7 +551,8 @@ static bool is_configured(struct hci_dev *hdev)
>  	    !hci_dev_test_flag(hdev, HCI_EXT_CONFIGURED))
>  		return false;
> 
> -	if (test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks) &&
> +	if ((test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks) ||
> +	     test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) &&
>  	    !bacmp(&hdev->public_addr, BDADDR_ANY))
>  		return false;
> 
> @@ -566,7 +567,8 @@ static __le32 get_missing_options(struct hci_dev 
> *hdev)
>  	    !hci_dev_test_flag(hdev, HCI_EXT_CONFIGURED))
>  		options |= MGMT_OPTION_EXTERNAL_CONFIG;
> 
> -	if (test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks) &&
> +	if ((test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks) ||
> +	     test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) &&
>  	    !bacmp(&hdev->public_addr, BDADDR_ANY))
>  		options |= MGMT_OPTION_PUBLIC_ADDRESS;

Looks fine to me.

Reviewed-by: Balakrishna Godavarthi <bgodavar@codeaurora.org>
-- 
Regards
Balakrishna.

^ permalink raw reply

* Re: [PATCH RFC v3 0/3] Rlimit for module space
From: Michal Hocko @ 2018-10-25 14:18 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: daniel@iogearbox.net, jeyu@kernel.org, ard.biesheuvel@linaro.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	jannh@google.com, keescook@chromium.org, arjan@linux.intel.com,
	netdev@vger.kernel.org, tglx@linutronix.de,
	linux-mips@linux-mips.org, linux-s390@vger.kernel.org,
	x86@kernel.org, kristen@linux.intel.com, Dock, Deneen T,
	catalin.marinas@arm.com, mingo@redhat.com, will.deacon@arm.com
In-Reply-To: <d1fec827d028168047eafbac56e8e47d37cf7fb5.camel@intel.com>

On Thu 25-10-18 01:01:44, Edgecombe, Rick P wrote:
[...]
> FWIW, cgroups seems like a better solution than rlimit to me too. Maybe you all
> already know, but it looks like the cgroups vmalloc charge is done in the main
> page allocator and counts against the whole kernel memory limit.

I am not sure I understand what you are saying but let me clarify that
vmalloc memory is accounted against memory cgroup associated with the
user context calling vmalloc. All you need to do is to add __GFP_ACCOUNT
to the gfp mask. The only challenge here is the charged memory life
cycle. When does it get deallocated? In the worst case the memory is not
tight to any user context and as such it doesn't get deallocated by
killing all processes which could lead to memcg limit exhaustion.

> A user may want
> to have a higher kernel limit than the module space size, so it seems it isn't
> enough by itself and some new limit would need to be added.

If there is another restriction on this memory then you are right.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* Re: [PATCH] octeontx2-af: Use GFP_ATOMIC under spin lock
From: Sunil Kovvuri @ 2018-10-25  4:18 UTC (permalink / raw)
  To: weiyongjun1
  Cc: Sunil Goutham, Linu Cherian, Geetha sowjanya, jerinj,
	Linux Netdev List, kernel-janitors
In-Reply-To: <1540431746-176759-1-git-send-email-weiyongjun1@huawei.com>

On Thu, Oct 25, 2018 at 7:02 AM Wei Yongjun <weiyongjun1@huawei.com> wrote:
>
> The function nix_update_mce_list() is called from
> nix_update_bcast_mce_list(), and a spin lock is held
> here, so we should use GFP_ATOMIC instead.
>
> Fixes: 4b05528ebf0c ("octeontx2-af: Update bcast list upon NIXLF alloc/free")
> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
> ---
>  drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
> index 8890c95..a618e48 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
> @@ -1294,7 +1294,7 @@ static int nix_update_mce_list(struct nix_mce_list *mce_list,
>                 return 0;
>
>         /* Add a new one to the list, at the tail */
> -       mce = kzalloc(sizeof(*mce), GFP_KERNEL);
> +       mce = kzalloc(sizeof(*mce), GFP_ATOMIC);
>         if (!mce)
>                 return -ENOMEM;
>         mce->idx = idx;
>

Thanks for pointing this and for the patch.
FYI, this driver is no where near to complete, after net-next is open we will
start submitting rest of the patches. There is a subsequent patch
which changes most of the locks to mutex from spinlock, which will
fix this issue as well.

It would be great if above patch is ignored for now, as in it's
current state driver is
not complete and usable. Otherwise also fine, I will change change/fix
this again.

Regards,
Sunil.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox