Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2 net-next 1/4] umh: introduce fork_usermode_blob() helper
From: Alexei Starovoitov @ 2018-05-05 16:24 UTC (permalink / raw)
  To: Jann Horn
  Cc: Alexei Starovoitov, David S. Miller, Daniel Borkmann,
	Linus Torvalds, Greg Kroah-Hartman, Andy Lutomirski,
	Network Development, kernel list, kernel-team
In-Reply-To: <CAG48ez3gB=zFebJAvFCj+6iuicgTiw_0UDEdeWW1Ri82pz8zkw@mail.gmail.com>

On Sat, May 05, 2018 at 12:48:24AM -0400, Jann Horn wrote:
> On Thu, May 3, 2018 at 12:36 AM, Alexei Starovoitov <ast@kernel.org> wrote:
> > Introduce helper:
> > int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
> > struct umh_info {
> >        struct file *pipe_to_umh;
> >        struct file *pipe_from_umh;
> >        pid_t pid;
> > };
> >
> > that GPLed kernel modules (signed or unsigned) can use it to execute part
> > of its own data as swappable user mode process.
> >
> > The kernel will do:
> > - mount "tmpfs"
> > - allocate a unique file in tmpfs
> > - populate that file with [data, data + len] bytes
> > - user-mode-helper code will do_execve that file and, before the process
> >   starts, the kernel will create two unix pipes for bidirectional
> >   communication between kernel module and umh
> > - close tmpfs file, effectively deleting it
> > - the fork_usermode_blob will return zero on success and populate
> >   'struct umh_info' with two unix pipes and the pid of the user process
> >
> > As the first step in the development of the bpfilter project
> > the fork_usermode_blob() helper is introduced to allow user mode code
> > to be invoked from a kernel module. The idea is that user mode code plus
> > normal kernel module code are built as part of the kernel build
> > and installed as traditional kernel module into distro specified location,
> > such that from a distribution point of view, there is
> > no difference between regular kernel modules and kernel modules + umh code.
> > Such modules can be signed, modprobed, rmmod, etc. The use of this new helper
> > by a kernel module doesn't make it any special from kernel and user space
> > tooling point of view.
> [...]
> > +static struct vfsmount *umh_fs;
> > +
> > +static int init_tmpfs(void)
> > +{
> > +       struct file_system_type *type;
> > +
> > +       if (umh_fs)
> > +               return 0;
> > +       type = get_fs_type("tmpfs");
> > +       if (!type)
> > +               return -ENODEV;
> > +       umh_fs = kern_mount(type);
> > +       if (IS_ERR(umh_fs)) {
> > +               int err = PTR_ERR(umh_fs);
> > +
> > +               put_filesystem(type);
> > +               umh_fs = NULL;
> > +               return err;
> > +       }
> > +       return 0;
> > +}
> 
> Should init_tmpfs() be holding some sort of mutex if it's fiddling
> with `umh_fs`? The current code only calls it in initcall context, but
> if that ever changes and two processes try to initialize the tmpfs at
> the same time, a few things could go wrong.

I thought that module loading is serialized, so calls to
fork_usermode_blob() will be serialized as well, but looking at the code
again that doesn't seem to be the case, so need to revisit not only
this function, but the rest of it too.

> I guess Luis' suggestion (putting a call to init_tmpfs() in
> do_basic_setup()) might be the easiest way to get rid of that problem.

I still think that two mounts where umh mount is dynamic is cleaner.
Why waste the mount if no module uses this helper?
I'm thinking to wrap init_tmpfs into DO_ONCE instead or use a mutex.
Looks like shmem_file_setup_with_mnt() can be called in parallel
on the same mount, so that should be fine.

^ permalink raw reply

* [PATCH v2 1/1] drivers core: multi-threading device shutdown
From: Pavel Tatashin @ 2018-05-05 15:40 UTC (permalink / raw)
  To: pasha.tatashin, steven.sistare, daniel.m.jordan, linux-kernel,
	jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh,
	alexander.duyck, tobin
In-Reply-To: <20180505154040.28614-1-pasha.tatashin@oracle.com>

When system is rebooted, halted or kexeced device_shutdown() is
called.

This function shuts down every single device by calling either:

	dev->bus->shutdown(dev)
	dev->driver->shutdown(dev)

Even on a machine with just a moderate amount of devices, device_shutdown()
may take multiple seconds to complete. This is because many devices require
a specific delays to perform this operation.

Here is a sample analysis of time it takes to call device_shutdown() on a
two socket Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz machine.

device_shutdown		2.95s
-----------------------------
 mlx4_shutdown		1.14s
 megasas_shutdown	0.24s
 ixgbe_shutdown		0.37s x 4 (four ixgbe devices on this machine).
 the rest		0.09s

In mlx4 we spent the most time, but that is because there is a 1 second
sleep, which is defined by hardware specifications:
mlx4_shutdown
 mlx4_unload_one
  mlx4_free_ownership
   msleep(1000)

With megasas we spend a quarter of a second, but sometimes longer (up-to
0.5s) in this path:

    megasas_shutdown
      megasas_flush_cache
        megasas_issue_blocked_cmd
          wait_event_timeout

Finally, with ixgbe_shutdown() it takes 0.37 for each device, but that time
is spread all over the place, with bigger offenders:

    ixgbe_shutdown
      __ixgbe_shutdown
        ixgbe_close_suspend
          ixgbe_down
            ixgbe_init_hw_generic
              ixgbe_reset_hw_X540
                msleep(100);                        0.104483472
                ixgbe_get_san_mac_addr_generic      0.048414851
                ixgbe_get_wwn_prefix_generic        0.048409893
              ixgbe_start_hw_X540
                ixgbe_start_hw_generic
                  ixgbe_clear_hw_cntrs_generic      0.048581502
                  ixgbe_setup_fc_generic            0.024225800

    All the ixgbe_*generic functions end-up calling:
    ixgbe_read_eerd_X540()
      ixgbe_acquire_swfw_sync_X540
        usleep_range(5000, 6000);
      ixgbe_release_swfw_sync_X540
        usleep_range(5000, 6000);

While these are short sleeps, they end-up calling them over 24 times!
24 * 0.0055s = 0.132s. Adding-up to 0.528s for four devices. Also we have
four msleep(100). Totaling to:  0.928s

While we should keep optimizing the individual device drivers, in some
cases this is simply a hardware property that forces a specific delay, and
we must wait.

So, the solution for this problem is to shutdown devices in parallel.
However, we must shutdown children before shutting down parents, so parent
device must wait for its children to finish.

With this patch, on the same machine devices_shutdown() takes 1.142s, and
without mlx4 one second delay only 0.38s

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 drivers/base/core.c | 275 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 225 insertions(+), 50 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index b610816eb887..110d7970deef 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -25,6 +25,7 @@
 #include <linux/netdevice.h>
 #include <linux/sched/signal.h>
 #include <linux/sysfs.h>
+#include <linux/kthread.h>
 
 #include "base.h"
 #include "power/power.h"
@@ -2102,6 +2103,59 @@ const char *device_get_devnode(struct device *dev,
 	return *tmp = s;
 }
 
+/**
+ * device_children_count - device children count
+ * @parent: parent struct device.
+ *
+ * Returns number of children for this device or 0 if none.
+ */
+static int device_children_count(struct device *parent)
+{
+	struct klist_iter i;
+	int children = 0;
+
+	if (!parent->p)
+		return 0;
+
+	klist_iter_init(&parent->p->klist_children, &i);
+	while (next_device(&i))
+		children++;
+	klist_iter_exit(&i);
+
+	return children;
+}
+
+/**
+ * device_get_child_by_index - Return child using the provided index.
+ * @parent: parent struct device.
+ * @index:  Index of the child, where 0 is the first child in the children list,
+ * and so on.
+ *
+ * Returns child or NULL if child with this index is not present.
+ */
+static struct device *
+device_get_child_by_index(struct device *parent, int index)
+{
+	struct klist_iter i;
+	struct device *dev = NULL, *d;
+	int child_index = 0;
+
+	if (!parent->p || index < 0)
+		return NULL;
+
+	klist_iter_init(&parent->p->klist_children, &i);
+	while ((d = next_device(&i))) {
+		if (child_index == index) {
+			dev = d;
+			break;
+		}
+		child_index++;
+	}
+	klist_iter_exit(&i);
+
+	return dev;
+}
+
 /**
  * device_for_each_child - device child iterator.
  * @parent: parent struct device.
@@ -2765,71 +2819,192 @@ int device_move(struct device *dev, struct device *new_parent,
 }
 EXPORT_SYMBOL_GPL(device_move);
 
+/*
+ * device_shutdown_one - call ->shutdown() for the device passed as
+ * argument.
+ */
+static void device_shutdown_one(struct device *dev)
+{
+	/* Don't allow any more runtime suspends */
+	pm_runtime_get_noresume(dev);
+	pm_runtime_barrier(dev);
+
+	if (dev->class && dev->class->shutdown_pre) {
+		if (initcall_debug)
+			dev_info(dev, "shutdown_pre\n");
+		dev->class->shutdown_pre(dev);
+	}
+	if (dev->bus && dev->bus->shutdown) {
+		if (initcall_debug)
+			dev_info(dev, "shutdown\n");
+		dev->bus->shutdown(dev);
+	} else if (dev->driver && dev->driver->shutdown) {
+		if (initcall_debug)
+			dev_info(dev, "shutdown\n");
+		dev->driver->shutdown(dev);
+	}
+
+	/* decrement the reference counter */
+	put_device(dev);
+}
+
+/**
+ * Passed as an argument to device_shutdown_child_task().
+ * child_next_index	the next available child index.
+ * tasks_running	number of tasks still running. Each tasks decrements it
+ *			when job is finished and the last task signals that the
+ *			job is complete.
+ * complete		Used to signal job competition.
+ * parent		Parent device.
+ */
+struct device_shutdown_task_data {
+	atomic_t		child_next_index;
+	atomic_t		tasks_running;
+	struct completion	complete;
+	struct device		*parent;
+};
+
+static int device_shutdown_child_task(void *data);
+
+/**
+ * These globals are used by tasks that are started for root devices.
+ * device_root_tasks_finished	Number of root devices finished shutting down.
+ * device_root_tasks_started	Total number of root devices tasks started.
+ * device_root_tasks_done	The completion signal to the main thread.
+ */
+static atomic_t		device_root_tasks_finished;
+static atomic_t		device_root_tasks_started;
+struct completion	device_root_tasks_done;
+
+/**
+ * Shutdown device tree with root started in dev. If dev has no children
+ * simply shutdown only this device. If dev has children recursively shutdown
+ * children first, and only then the parent. For performance reasons children
+ * are shutdown in parallel using kernel threads. dev is locked, so its children
+ * cannot be removed while this functions is running.
+ */
+static void device_shutdown_tree(struct device *dev)
+{
+	int children_count = device_children_count(dev);
+
+	if (children_count) {
+		struct device_shutdown_task_data tdata;
+		int i;
+
+		init_completion(&tdata.complete);
+		atomic_set(&tdata.child_next_index, 0);
+		atomic_set(&tdata.tasks_running, children_count);
+		tdata.parent = dev;
+
+		for (i = 0; i < children_count; i++) {
+			kthread_run(device_shutdown_child_task,
+				    &tdata, "device_shutdown.%s",
+				    dev_name(dev));
+		}
+		wait_for_completion(&tdata.complete);
+	}
+	device_shutdown_one(dev);
+}
+
+/**
+ * Only devices with parent are going through this function. The parent is
+ * locked and waits for all of its children to finish shutting down before
+ * calling shutdown function for itself.
+ */
+static int device_shutdown_child_task(void *data)
+{
+	struct device_shutdown_task_data *tdata = data;
+	int cidx = atomic_inc_return(&tdata->child_next_index) - 1;
+	struct device *dev = device_get_child_by_index(tdata->parent, cidx);
+
+	/* ref. counter is going to be decremented in device_shutdown_one() */
+	get_device(dev);
+	device_lock(dev);
+	device_shutdown_tree(dev);
+	device_unlock(dev);
+
+	/* If we are the last to exit, signal the completion */
+	if (atomic_dec_return(&tdata->tasks_running) == 0)
+		complete(&tdata->complete);
+	return 0;
+}
+
+/**
+ * On shutdown each root device (the one that does not have a parent) goes
+ * through this function.
+ */
+static int device_shutdown_root_task(void *data)
+{
+	struct device *dev = (struct device *)data;
+	int root_devices;
+
+	device_lock(dev);
+	device_shutdown_tree(dev);
+	device_unlock(dev);
+
+	/* If we are the last to exit, signal the completion */
+	root_devices = atomic_inc_return(&device_root_tasks_finished);
+	if (root_devices == atomic_read(&device_root_tasks_started))
+		complete(&device_root_tasks_done);
+	return 0;
+}
+
 /**
  * device_shutdown - call ->shutdown() on each device to shutdown.
  */
 void device_shutdown(void)
 {
-	struct device *dev, *parent;
+	int root_devices = 0;
+	struct device *dev;
 
-	spin_lock(&devices_kset->list_lock);
-	/*
-	 * Walk the devices list backward, shutting down each in turn.
-	 * Beware that device unplug events may also start pulling
-	 * devices offline, even as the system is shutting down.
+	atomic_set(&device_root_tasks_finished, 0);
+	atomic_set(&device_root_tasks_started, 0);
+	init_completion(&device_root_tasks_done);
+
+	/* Shutdown the root devices in parallel. The children are going to be
+	 * shutdown first in device_shutdown_tree().
 	 */
+	spin_lock(&devices_kset->list_lock);
 	while (!list_empty(&devices_kset->list)) {
-		dev = list_entry(devices_kset->list.prev, struct device,
-				kobj.entry);
+		dev = list_entry(devices_kset->list.next, struct device,
+				 kobj.entry);
 
-		/*
-		 * hold reference count of device's parent to
-		 * prevent it from being freed because parent's
-		 * lock is to be held
-		 */
-		parent = get_device(dev->parent);
-		get_device(dev);
-		/*
-		 * Make sure the device is off the kset list, in the
-		 * event that dev->*->shutdown() doesn't remove it.
+		/* Make sure the device is off the kset list, in the event that
+		 * dev->*->shutdown() doesn't remove it.
 		 */
 		list_del_init(&dev->kobj.entry);
-		spin_unlock(&devices_kset->list_lock);
-
-		/* hold lock to avoid race with probe/release */
-		if (parent)
-			device_lock(parent);
-		device_lock(dev);
-
-		/* Don't allow any more runtime suspends */
-		pm_runtime_get_noresume(dev);
-		pm_runtime_barrier(dev);
 
-		if (dev->class && dev->class->shutdown_pre) {
-			if (initcall_debug)
-				dev_info(dev, "shutdown_pre\n");
-			dev->class->shutdown_pre(dev);
+		/* Here we start tasks for root devices only */
+		if (!dev->parent) {
+			/* Prevents devices from being freed. The counter is
+			 * going to be decremented in device_shutdown_one() once
+			 * this root device is shutdown.
+			 */
+			get_device(dev);
+
+			/* We unlock list for performance reasons,
+			 * dev->*->shutdown(), may try to take this lock to
+			 * remove us from kset list. To avoid unlocking this
+			 * list we could replace spin lock in:
+			 * dev->kobj.kset->list_lock with a dummy one once
+			 * device is locked in device_shutdown_root_task() and
+			 * in device_shutdown_child_task().
+			 */
+			spin_unlock(&devices_kset->list_lock);
+
+			root_devices++;
+			kthread_run(device_shutdown_root_task,
+				    dev, "device_root_shutdown.%s",
+				    dev_name(dev));
+			spin_lock(&devices_kset->list_lock);
 		}
-		if (dev->bus && dev->bus->shutdown) {
-			if (initcall_debug)
-				dev_info(dev, "shutdown\n");
-			dev->bus->shutdown(dev);
-		} else if (dev->driver && dev->driver->shutdown) {
-			if (initcall_debug)
-				dev_info(dev, "shutdown\n");
-			dev->driver->shutdown(dev);
-		}
-
-		device_unlock(dev);
-		if (parent)
-			device_unlock(parent);
-
-		put_device(dev);
-		put_device(parent);
-
-		spin_lock(&devices_kset->list_lock);
 	}
 	spin_unlock(&devices_kset->list_lock);
+
+	/* Set number of root tasks started, and waits for completion */
+	atomic_set(&device_root_tasks_started, root_devices);
+	if (root_devices != atomic_read(&device_root_tasks_finished))
+		wait_for_completion(&device_root_tasks_done);
 }
 
 /*
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 0/1] multi-threading device shutdown
From: Pavel Tatashin @ 2018-05-05 15:40 UTC (permalink / raw)
  To: pasha.tatashin, steven.sistare, daniel.m.jordan, linux-kernel,
	jeffrey.t.kirsher, intel-wired-lan, netdev, gregkh,
	alexander.duyck, tobin

Changelog

v1 - v2
	- It turns out we cannot lock more than MAX_LOCK_DEPTH by a single
	  thread. (By default this value is 48), and is used to detect
	  deadlocks. So, I re-wrote the code to only lock one devices per
	  thread instead of pre-locking all devices by the main thread.
	- Addressed comments from Tobin C. Harding.
	- As suggested by Alexander Duyck removed ixgbe changes. It can be
	  done as a separate work scaling RTNL mutex.

Do a faster shutdown by calling dev->*->shutdown(dev) in parallel.
device_shutdown() calls these functions for every single device but
only using one thread.

Since, nothing else is running on the machine by the device_shutdown()
s called, there is no reason not to utilize all the available CPU
resources.

Pavel Tatashin (1):
  drivers core: multi-threading device shutdown

 drivers/base/core.c | 275 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 225 insertions(+), 50 deletions(-)

-- 
2.17.0

^ permalink raw reply

* Re: [PATCH 7/8] rhashtable: add rhashtable_walk_prev()
From: Tom Herbert @ 2018-05-05 15:40 UTC (permalink / raw)
  To: Herbert Xu; +Cc: NeilBrown, Thomas Graf, Linux Kernel Network Developers, LKML
In-Reply-To: <20180505094324.wwjtl76oofgrtpg4@gondor.apana.org.au>

On Sat, May 5, 2018 at 2:43 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
>> rhashtable_walk_prev() returns the object returned by
>> the previous rhashtable_walk_next(), providing it is still in the
>> table (or was during this grace period).
>> This works even if rhashtable_walk_stop() and rhashtable_talk_start()
>> have been called since the last rhashtable_walk_next().
>>
>> If there have been no calls to rhashtable_walk_next(), or if the
>> object is gone from the table, then NULL is returned.
>>
>> This can usefully be used in a seq_file ->start() function.
>> If the pos is the same as was returned by the last ->next() call,
>> then rhashtable_walk_prev() can be used to re-establish the
>> current location in the table.  If it returns NULL, then
>> rhashtable_walk_next() should be used.
>>
>> Signed-off-by: NeilBrown <neilb@suse.com>
>
> I will ack this if Tom is OK with replacing peek with it.
>
I'm not following why this is any better than peek. The point of
having rhashtable_walk_peek is to to allow the caller to get then
current element not the next one. This is needed when table is read in
multiple parts and we need to pick up with what was returned from the
last call to rhashtable_walk_next (which apparently is what this patch
is also trying to do).

There is one significant difference in that peek will return the
element in the table regardless of where the iterator is at (this is
why peek can move the iterator) and only returns NULL at end of table.
As mentioned above rhashtable_walk_prev can return NULL so then caller
needs and so rhashtable_walk_next "should be used" to get the previous
element. Doing a peek is a lot cleaner and more straightforward API in
this regard.

Tom

> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH net-next v2 02/13] net: phy: sfp: handle non-wired SFP connectors
From: Thomas Petazzoni @ 2018-05-05 15:39 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: Florian Fainelli, davem, kishon, linux, gregory.clement, andrew,
	jason, sebastian.hesselbarth, netdev, linux-kernel,
	maxime.chevallier, miquel.raynal, nadavh, stefanc, ymarkman, mw,
	linux-arm-kernel
In-Reply-To: <20180504172337.GC13899@kwain>

Hello,

On Fri, 4 May 2018 19:23:37 +0200, Antoine Tenart wrote:

> On Fri, May 04, 2018 at 10:04:48AM -0700, Florian Fainelli wrote:
> > On 05/04/2018 06:56 AM, Antoine Tenart wrote:  
> > > SFP connectors can be solder on a board without having any of their pins
> > > (LOS, i2c...) wired. In such cases the SFP link state cannot be guessed,
> > > and the overall link status reporting is left to other layers.
> > > 
> > > In order to achieve this, a new SFP_DEV status is added, named UNKNOWN.
> > > This mode is set when it is not possible for the SFP code to get the
> > > link status and as a result the link status is reported to be always UP
> > > from the SFP point of view.  
> > 
> > Why represent the SFP in Device Tree then? Why not just declare this is
> > a fixed link which would avoid having to introduce this "unknown" state.  
> 
> The other solution would have been to represent this as a fixed-link.
> But such a representation would report the link as being up all the
> time, which is something we wanted to avoid as the GoP in PPv2 can
> report some link status. This is achieved using SFP+phylink+PPv2.
> 
> And representing the SFP cage in the device tree, although it's a
> "dummy" one, helps describing the hardware.

Just to add to this: the board physically has a SFP cage, and a cable
can be connected to it, or not. So it is absolutely not a fixed link
(cable can be connected or not) and it really is a SFP cage.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [PATCH] net: dsa: drop some VLAs in switch.c
From: Andrew Lunn @ 2018-05-05 15:39 UTC (permalink / raw)
  To: Salvatore Mesoraca
  Cc: Florian Fainelli, Vivien Didelot, linux-kernel, Kernel Hardening,
	netdev, David S. Miller, Kees Cook
In-Reply-To: <CAJHCu1+V+khxvzhQdG=mo960b5ayiQcSaoMnm9A__51JdWkdig@mail.gmail.com>

On Sat, May 05, 2018 at 12:36:36PM +0200, Salvatore Mesoraca wrote:
> 2018-03-13 21:06 GMT+01:00 Florian Fainelli <f.fainelli@gmail.com>:
> > On 03/13/2018 12:58 PM, Vivien Didelot wrote:
> >> Hi Salvatore,
> >>
> >> Salvatore Mesoraca <s.mesoraca16@gmail.com> writes:
> >>
> >>> dsa_switch's num_ports is currently fixed to DSA_MAX_PORTS. So we avoid
> >>> 2 VLAs[1] by using DSA_MAX_PORTS instead of ds->num_ports.
> >>>
> >>> [1] https://lkml.org/lkml/2018/3/7/621
> >>>
> >>> Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
> >>
> >> NAK.
> >>
> >> We are in the process to remove hardcoded limits such as DSA_MAX_PORTS
> >> and DSA_MAX_SWITCHES, so we have to stick with ds->num_ports.
> >
> > Then this means that we need to allocate a bitmap from the heap, which
> > sounds a bit superfluous and could theoretically fail... not sure which
> > way is better, but bumping the size to DSA_MAX_PORTS definitively does
> > help people working on enabling -Wvla.
> 
> Hi Florian,
> 
> Should I consider this patch still NAKed or not?
> Should I resend the patch with some modifications?

Hi Salvatore

We have been removing all uses of DSA_MAX_PORTS. I don't particularly
like arbitrary limits on how many ports a switch can have, or how many
switches a board can have.

So i would prefer to not use DSA_MAX_PORTS here.

You could make the bitmap part of the dsa_switch structure. This is
allocated by dsa_switch_alloc() and is passed the number of ports.
Doing the allocation there means you don't need to worry about it
failing in dsa_switch_mdb_add() or dsa_switch_vlan_add().

	Andrew

^ permalink raw reply

* [PATCH net] tls: fix use after free in tls_sk_proto_close
From: Eric Dumazet @ 2018-05-05 15:35 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Eric Dumazet, Atul Gupta, Steve Wise,
	Ilya Lesokhin, Aviad Yehezkel, Dave Watson

syzbot reported a use-after-free in tls_sk_proto_close

Add a boolean value to cleanup a bit this function.

BUG: KASAN: use-after-free in tls_sk_proto_close+0x8ab/0x9c0 net/tls/tls_main.c:297
Read of size 1 at addr ffff8801ae40a858 by task syz-executor363/4503

CPU: 0 PID: 4503 Comm: syz-executor363 Not tainted 4.17.0-rc3+ #34
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
 tls_sk_proto_close+0x8ab/0x9c0 net/tls/tls_main.c:297
 inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
 sock_release+0x96/0x1b0 net/socket.c:594
 sock_close+0x16/0x20 net/socket.c:1149
 __fput+0x34d/0x890 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:243
 task_work_run+0x1e4/0x290 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x1aee/0x2730 kernel/exit.c:865
 do_group_exit+0x16f/0x430 kernel/exit.c:968
 get_signal+0x886/0x1960 kernel/signal.c:2469
 do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4457b9
RSP: 002b:00007fdf4d766da8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000006dac3c RCX: 00000000004457b9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000006dac3c
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dac38
R13: 3692738801137283 R14: 6bf92c39443c4c1d R15: 0000000000000006

Allocated by task 4498:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
 kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620
 kmalloc include/linux/slab.h:512 [inline]
 kzalloc include/linux/slab.h:701 [inline]
 create_ctx net/tls/tls_main.c:521 [inline]
 tls_init+0x1f9/0xb00 net/tls/tls_main.c:633
 tcp_set_ulp+0x1bc/0x520 net/ipv4/tcp_ulp.c:153
 do_tcp_setsockopt.isra.39+0x44a/0x2600 net/ipv4/tcp.c:2588
 tcp_setsockopt+0xc1/0xe0 net/ipv4/tcp.c:2893
 sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
 __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
 __do_sys_setsockopt net/socket.c:1914 [inline]
 __se_sys_setsockopt net/socket.c:1911 [inline]
 __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 4503:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kfree+0xd9/0x260 mm/slab.c:3813
 tls_sw_free_resources+0x2a3/0x360 net/tls/tls_sw.c:1037
 tls_sk_proto_close+0x67c/0x9c0 net/tls/tls_main.c:288
 inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
 sock_release+0x96/0x1b0 net/socket.c:594
 sock_close+0x16/0x20 net/socket.c:1149
 __fput+0x34d/0x890 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:243
 task_work_run+0x1e4/0x290 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x1aee/0x2730 kernel/exit.c:865
 do_group_exit+0x16f/0x430 kernel/exit.c:968
 get_signal+0x886/0x1960 kernel/signal.c:2469
 do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
 do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801ae40a800
 which belongs to the cache kmalloc-256 of size 256
The buggy address is located 88 bytes inside of
 256-byte region [ffff8801ae40a800, ffff8801ae40a900)
The buggy address belongs to the page:
page:ffffea0006b90280 count:1 mapcount:0 mapping:ffff8801ae40a080 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffff8801ae40a080 0000000000000000 000000010000000c
raw: ffffea0006bea9e0 ffffea0006bc94a0 ffff8801da8007c0 0000000000000000
page dumped because: kasan: bad access detected

Fixes: dd0bed1665d6 ("tls: support for Inline tls record")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Atul Gupta <atul.gupta@chelsio.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Ilya Lesokhin <ilyal@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: Dave Watson <davejwatson@fb.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
---
 net/tls/tls_main.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index cc03e00785c7ffefc8c37cac39aecc7e28cc86f9..74ed1e7af3d9da0f4074cb4ee5d7ed705f89c78b 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -248,16 +248,13 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	struct tls_context *ctx = tls_get_ctx(sk);
 	long timeo = sock_sndtimeo(sk, 0);
 	void (*sk_proto_close)(struct sock *sk, long timeout);
+	bool free_ctx = false;
 
 	lock_sock(sk);
 	sk_proto_close = ctx->sk_proto_close;
 
-	if (ctx->conf == TLS_HW_RECORD)
-		goto skip_tx_cleanup;
-
-	if (ctx->conf == TLS_BASE) {
-		kfree(ctx);
-		ctx = NULL;
+	if (ctx->conf == TLS_BASE || ctx->conf == TLS_HW_RECORD) {
+		free_ctx = true;
 		goto skip_tx_cleanup;
 	}
 
@@ -294,7 +291,7 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
 	/* free ctx for TLS_HW_RECORD, used by tcp_set_state
 	 * for sk->sk_prot->unhash [tls_hw_unhash]
 	 */
-	if (ctx && ctx->conf == TLS_HW_RECORD)
+	if (free_ctx)
 		kfree(ctx);
 }
 
-- 
2.17.0.441.gb46fe60e1d-goog

^ permalink raw reply related

* Re: [PATCH iproute2] rdma: fix header files
From: David Ahern @ 2018-05-05 15:17 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: swise, netdev
In-Reply-To: <20180504215800.41b8467a@xeon-e3>

On 5/4/18 10:58 PM, Stephen Hemminger wrote:
> On Fri, 4 May 2018 16:13:07 -0600
> David Ahern <dsahern@gmail.com> wrote:
> 
>> On 5/4/18 3:56 PM, Stephen Hemminger wrote:
>>> All user api headers in iproute2 should be in include/uapi
>>> so that script can be used to put correct sanitized kernel headers
>>> there. And the header files for rdma must be a complete set; if one
>>> header file includes another, all must be present.
>>>
>>> This fixes build on older distributions, and Windows Services
>>> for Linux.
>>>
>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>> ---
>>>  include/uapi/rdma/ib_user_sa.h                |   77 ++
>>>  include/uapi/rdma/ib_user_verbs.h             | 1210 +++++++++++++++++
>>>  .../uapi/rdma/rdma_netlink.h                  |   13 +
>>>  .../uapi/rdma/rdma_user_cm.h                  |    6 +-
>>>  4 files changed, 1303 insertions(+), 3 deletions(-)
>>>  create mode 100644 include/uapi/rdma/ib_user_sa.h
>>>  create mode 100644 include/uapi/rdma/ib_user_verbs.h
>>>  rename {rdma/include => include}/uapi/rdma/rdma_netlink.h (95%)
>>>  rename {rdma/include => include}/uapi/rdma/rdma_user_cm.h (98%)
>>>   
>>
>> Stephen:
>>
>> Per a recent discussion the RDMA folks need to take ownership of the
>> uapi files. RDMA features do not hit Dave's net-next tree so the rdma
>> code can never hit iproute2-next during a dev cycle.
> 
> I want all uapi headers in include/uapi because it avoids possible overlap problems,
> During the linux-net/linus release cycle they should match what is Linus's tree.
> 
> During the net-next they can come from two sources.
> 

That creates extra work for me for no reason.

You state above "user api headers in iproute2 should be in include/uapi
so that script can be used to put correct sanitized kernel headers there."

With RDMA's development cycle that will *never* happen. With the
exception of RDMA, all iproute2 features go through net-next and it is
the right tree to pull updates from. Every time I sync from net-next the
header files for rdma will have to be ignored, so they will never be
updated through this mechanism which means the stated goal is not
achievable.

As for linux-next, I will not sync header files to a tree that
disappears; it breaks all traceability. Further, it seems to me that it
does not really solve the problem. I forget the steps now but RDMA
features have to hit some development tree before going to Linus, so
there will be a delay with the headers.

Back in March we discussed options. iproute2 is nothing more than a
delivery vehicle for rdmatool. Since it breaks everything else about the
iproute2 and net-next association, the simplest option for everyone is
for the rdma group to control syncing their own headers and putting them
under rdma directory.

^ permalink raw reply

* Re: [PATCH] net/mlx5: Fix mlx5_get_vector_affinity function
From: Guenter Roeck @ 2018-05-05 14:38 UTC (permalink / raw)
  To: Israel Rukshin
  Cc: Max Gurtovoy, Matan Barak, Doug Ledford, linux-rdma, linux-kernel,
	netdev, Thomas Gleixner

On Thu, Apr 12, 2018 at 09:49:11AM +0000, Israel Rukshin wrote:
> Adding the vector offset when calling to mlx5_vector2eqn() is wrong.
> This is because mlx5_vector2eqn() checks if EQ index is equal to vector number
> and the fact that the internal completion vectors that mlx5 allocates
> don't get an EQ index.
> 
> The second problem here is that using effective_affinity_mask gives the same
> CPU for different vectors.
> This leads to unmapped queues when calling it from blk_mq_rdma_map_queues().
> This doesn't happen when using affinity_hint mask.
> 
Except that affinity_hint is only defined if SMP is enabled. Without:

include/linux/mlx5/driver.h: In function ‘mlx5_get_vector_affinity_hint’:
include/linux/mlx5/driver.h:1299:13: error:
        ‘struct irq_desc’ has no member named ‘affinity_hint’

Note that this is the only use of affinity_hint outside kernel/irq.
Don't other drivers have similar problems ?

Guenter

> Fixes: 2572cf57d75a ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0")
> Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function")
> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  drivers/infiniband/hw/mlx5/main.c |  2 +-
>  include/linux/mlx5/driver.h       | 12 +++---------
>  2 files changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index daa919e5a442..241cf4ff9901 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -4757,7 +4757,7 @@ mlx5_ib_get_vector_affinity(struct ib_device *ibdev, int comp_vector)
>  {
>  	struct mlx5_ib_dev *dev = to_mdev(ibdev);
>  
> -	return mlx5_get_vector_affinity(dev->mdev, comp_vector);
> +	return mlx5_get_vector_affinity_hint(dev->mdev, comp_vector);
>  }
>  
>  /* The mlx5_ib_multiport_mutex should be held when calling this function */
> diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
> index 767d193c269a..2a156c5dfadd 100644
> --- a/include/linux/mlx5/driver.h
> +++ b/include/linux/mlx5/driver.h
> @@ -1284,25 +1284,19 @@ enum {
>  };
>  
>  static inline const struct cpumask *
> -mlx5_get_vector_affinity(struct mlx5_core_dev *dev, int vector)
> +mlx5_get_vector_affinity_hint(struct mlx5_core_dev *dev, int vector)
>  {
> -	const struct cpumask *mask;
>  	struct irq_desc *desc;
>  	unsigned int irq;
>  	int eqn;
>  	int err;
>  
> -	err = mlx5_vector2eqn(dev, MLX5_EQ_VEC_COMP_BASE + vector, &eqn, &irq);
> +	err = mlx5_vector2eqn(dev, vector, &eqn, &irq);
>  	if (err)
>  		return NULL;
>  
>  	desc = irq_to_desc(irq);
> -#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
> -	mask = irq_data_get_effective_affinity_mask(&desc->irq_data);
> -#else
> -	mask = desc->irq_common_data.affinity;
> -#endif
> -	return mask;
> +	return desc->affinity_hint;
>  }
>  
>  #endif /* MLX5_DRIVER_H */
> -- 
> 2.7.4

^ permalink raw reply

* Re: [PATCH] dt-bindings: can: rcar_can: Fix R8A7796 SoC name
From: Marc Kleine-Budde @ 2018-05-05 13:56 UTC (permalink / raw)
  To: Geert Uytterhoeven, Wolfgang Grandegger, Rob Herring,
	Mark Rutland
  Cc: Chris Paterson, linux-can, netdev, devicetree, linux-renesas-soc
In-Reply-To: <1525352553-17045-1-git-send-email-geert+renesas@glider.be>


[-- Attachment #1.1: Type: text/plain, Size: 461 bytes --]

On 05/03/2018 03:02 PM, Geert Uytterhoeven wrote:
> R8A7796 is R-Car M3-W.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

Applied to linux-can.

thnx,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH 38/40] ide: remove ide_driver_proc_write
From: Eric W. Biederman @ 2018-05-05 13:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-rtc, Alessandro Zummo, Alexandre Belloni, devel,
	linux-kernel, linux-scsi, linux-ide, Greg Kroah-Hartman,
	jfs-discussion, linux-afs, linux-acpi, netdev, netfilter-devel,
	Alexander Viro, Jiri Slaby, Andrew Morton, linux-ext4,
	Alexey Dobriyan, megaraidlinux.pdl, drbd-dev
In-Reply-To: <20180425154827.32251-39-hch@lst.de>

Christoph Hellwig <hch@lst.de> writes:

> The driver proc file hasn't been writeable for a long time, so this is
> just dead code.

It is possible to chmod this file to get at the write method.  Not that
I think anyone does.

It looks like this code was merged in 2.3.99-pre1 with permissions
S_IFREG|S_IRUGO so I don't think the write support was ever finished.

That cap_capable in the write method looks down right scary/buggy.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Eric



>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/ide/ide-proc.c | 46 ------------------------------------------
>  1 file changed, 46 deletions(-)
>
> diff --git a/drivers/ide/ide-proc.c b/drivers/ide/ide-proc.c
> index 863db44c7916..b3b8b8822d6a 100644
> --- a/drivers/ide/ide-proc.c
> +++ b/drivers/ide/ide-proc.c
> @@ -528,58 +528,12 @@ static int ide_driver_proc_open(struct inode *inode, struct file *file)
>  	return single_open(file, ide_driver_proc_show, PDE_DATA(inode));
>  }
>  
> -static int ide_replace_subdriver(ide_drive_t *drive, const char *driver)
> -{
> -	struct device *dev = &drive->gendev;
> -	int ret = 1;
> -	int err;
> -
> -	device_release_driver(dev);
> -	/* FIXME: device can still be in use by previous driver */
> -	strlcpy(drive->driver_req, driver, sizeof(drive->driver_req));
> -	err = device_attach(dev);
> -	if (err < 0)
> -		printk(KERN_WARNING "IDE: %s: device_attach error: %d\n",
> -			__func__, err);
> -	drive->driver_req[0] = 0;
> -	if (dev->driver == NULL) {
> -		err = device_attach(dev);
> -		if (err < 0)
> -			printk(KERN_WARNING
> -				"IDE: %s: device_attach(2) error: %d\n",
> -				__func__, err);
> -	}
> -	if (dev->driver && !strcmp(dev->driver->name, driver))
> -		ret = 0;
> -
> -	return ret;
> -}
> -
> -static ssize_t ide_driver_proc_write(struct file *file, const char __user *buffer,
> -				     size_t count, loff_t *pos)
> -{
> -	ide_drive_t	*drive = PDE_DATA(file_inode(file));
> -	char name[32];
> -
> -	if (!capable(CAP_SYS_ADMIN))
> -		return -EACCES;
> -	if (count > 31)
> -		count = 31;
> -	if (copy_from_user(name, buffer, count))
> -		return -EFAULT;
> -	name[count] = '\0';
> -	if (ide_replace_subdriver(drive, name))
> -		return -EINVAL;
> -	return count;
> -}
> -
>  static const struct file_operations ide_driver_proc_fops = {
>  	.owner		= THIS_MODULE,
>  	.open		= ide_driver_proc_open,
>  	.read		= seq_read,
>  	.llseek		= seq_lseek,
>  	.release	= single_release,
> -	.write		= ide_driver_proc_write,
>  };
>  
>  static int ide_media_proc_show(struct seq_file *m, void *v)

^ permalink raw reply

* Re: [PATCH RFC net-next] net: ipvs: Adjust gso_size for IPPROTO_TCP
From: Julian Anastasov @ 2018-05-05 12:58 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, David Ahern, Tom Herbert, Eric Dumazet, Nikita Shirokov,
	kernel-team, lvs-devel
In-Reply-To: <20180503070114.bcuusvzga45klccs@kafai-mbp>


	Hello,

On Thu, 3 May 2018, Martin KaFai Lau wrote:

> > - when exactly we start to use the new PMTU, eg. what happens
> > in case socket caches the route, whether route is killed via
> > dst->obsolete. Or may be while the PMTU expiration is handled
> > per-packet, the PMTU change is noticed only on ICMP...
> Before sk can reuse its dst cache, the sk will notice
> its dst cache is no longer valid by calling dst_check().
> dst_check() should return NULL which is one of the side
> effect of the earlier update_pmtu().  This dst_check()
> is usually only called when the sk needs to do output,
> so the new PMTU route (i.e. the RTF_CACHE IPv6 route)
> only have effect to the later packets.

	I checked again the code and it looks like sockets
are forced to use new exceptional route (RTF_CACHE/fnhe) via
dst_check only when the PMTU update should move them away
from old non-exceptional routes. Later, if PMTU is
reduced/updated this is noticed for every packet via dst_mtu,
as in the case with TCP.

	So, except the RTF_LOCAL check in __ip6_rt_update_pmtu
we should have no other issues. Only one minor bit is strange to me,
why rt6_insert_exception warns for RTF_PCPU if rt6_cache_allowed_for_pmtu
allows it when returning true... 

	Also, commit 0d3f6d297bfb allows rt6_do_update_pmtu() for
routes without RTF_CACHE, RTF_PCPU and rt6i_node. Should we
restrict rt6_do_update_pmtu only to RTF_CACHE routes?

 	if (!rt6_cache_allowed_for_pmtu(rt6)) {
-		rt6_do_update_pmtu(rt6, mtu);
-		/* update rt6_ex->stamp for cache */
-		if (rt6->rt6i_flags & RTF_CACHE)
+		if (rt6->rt6i_flags & RTF_CACHE) {
+			rt6_do_update_pmtu(rt6, mtu);
+			/* update rt6_ex->stamp for cache */
 			rt6_update_exception_stamp_rt(rt6);
+		}
 	} else if (daddr) {

Regards

^ permalink raw reply

* Re: [PATCH 34/40] atm: simplify procfs code
From: Eric W. Biederman @ 2018-05-05 12:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-rtc, Alessandro Zummo, Alexandre Belloni, devel,
	linux-kernel, linux-scsi, linux-ide, Greg Kroah-Hartman,
	jfs-discussion, linux-afs, linux-acpi, netdev, netfilter-devel,
	Alexander Viro, Jiri Slaby, Andrew Morton, linux-ext4,
	Alexey Dobriyan, megaraidlinux.pdl, drbd-dev
In-Reply-To: <20180425154827.32251-35-hch@lst.de>

Christoph Hellwig <hch@lst.de> writes:

> Use remove_proc_subtree to remove the whole subtree on cleanup, and
> unwind the registration loop into individual calls.  Switch to use
> proc_create_seq where applicable.

Can you please explain why you are removing the error handling when
you are unwinding the registration loop?

>  int __init atm_proc_init(void)
>  {
> -	static struct atm_proc_entry *e;
> -	int ret;
> -
>  	atm_proc_root = proc_net_mkdir(&init_net, "atm", init_net.proc_net);
>  	if (!atm_proc_root)
> -		goto err_out;
> -	for (e = atm_proc_ents; e->name; e++) {
> -		struct proc_dir_entry *dirent;
> -
> -		dirent = proc_create(e->name, 0444,
> -				     atm_proc_root, e->proc_fops);
> -		if (!dirent)
> -			goto err_out_remove;
> -		e->dirent = dirent;
> -	}
> -	ret = 0;
> -out:
> -	return ret;
> -
> -err_out_remove:
> -	atm_proc_dirs_remove();
> -err_out:
> -	ret = -ENOMEM;
> -	goto out;
> +		return -ENOMEM;
> +	proc_create_seq("devices", 0444, atm_proc_root, &atm_dev_seq_ops);
> +	proc_create("pvc", 0444, atm_proc_root, &pvc_seq_fops);
> +	proc_create("svc", 0444, atm_proc_root, &svc_seq_fops);
> +	proc_create("vc", 0444, atm_proc_root, &vcc_seq_fops);
> +	return 0;
>  }

These proc entries could fail to register with -ENOMEM if for no other
reason.  Can you justify the removal of the error handling?

Can you please at least mention that removal in your change description
and explain why it is reasonable.

As it sits this is not a semantics preserving transformation, and the
difference is not documented.  Which raises red flags for me.

Eric

^ permalink raw reply

* Re: [PATCH v3 12/20] media: Remove depends on HAS_DMA in case of platform dependency
From: Mauro Carvalho Chehab @ 2018-05-05 12:47 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Ulf Hansson, Wolfram Sang, linux-iio, linux-fpga,
	linux-remoteproc, alsa-devel, Bjorn Andersson, Eric Anholt,
	netdev, linux-mtd, linux-i2c, linux1394-devel, Christoph Hellwig,
	Marek Szyprowski, Stefan Wahren, Boris Brezillon, Herbert Xu,
	Richard Weinberger, Joerg Roedel, Jassi Brar, Marek Vasut,
	linux-serial, Matias Bjorling, David Woodhouse, linux-media, Ohad
In-Reply-To: <1523987360-18760-13-git-send-email-geert@linux-m68k.org>

Hi Geert,

Em Tue, 17 Apr 2018 19:49:12 +0200
Geert Uytterhoeven <geert@linux-m68k.org> escreveu:

> Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
> symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".

Applying a patch like that is hard, as there are lots of churn at
the code. That's against latest media upstream:

checking file drivers/media/pci/intel/ipu3/Kconfig
Hunk #1 FAILED at 4.
1 out of 1 hunk FAILED
checking file drivers/media/pci/solo6x10/Kconfig
checking file drivers/media/pci/sta2x11/Kconfig
Hunk #1 FAILED at 1.
1 out of 1 hunk FAILED
checking file drivers/media/pci/tw5864/Kconfig
checking file drivers/media/pci/tw686x/Kconfig
checking file drivers/media/platform/Kconfig
Hunk #2 FAILED at 63.
Hunk #3 succeeded at 81 (offset 1 line).
Hunk #4 succeeded at 91 (offset 1 line).
Hunk #5 succeeded at 101 (offset 1 line).
Hunk #6 succeeded at 111 (offset 1 line).
Hunk #7 succeeded at 124 (offset 1 line).
Hunk #8 succeeded at 142 (offset 1 line).
Hunk #9 succeeded at 169 (offset 1 line).
Hunk #10 succeeded at 186 (offset 1 line).
Hunk #11 succeeded at 197 (offset 1 line).
Hunk #12 succeeded at 213 (offset 1 line).
Hunk #13 succeeded at 227 (offset 1 line).
Hunk #14 succeeded at 254 (offset 1 line).
Hunk #15 succeeded at 265 (offset 1 line).
Hunk #16 succeeded at 275 (offset 1 line).
Hunk #17 succeeded at 284 (offset 1 line).
Hunk #18 succeeded at 295 (offset 1 line).
Hunk #19 succeeded at 303 (offset 1 line).
Hunk #20 succeeded at 312 (offset 1 line).
Hunk #21 succeeded at 338 (offset 1 line).
Hunk #22 succeeded at 383 (offset 1 line).
Hunk #23 succeeded at 397 (offset 1 line).
Hunk #24 succeeded at 422 (offset 1 line).
Hunk #25 succeeded at 435 (offset 1 line).
Hunk #26 succeeded at 452 (offset 1 line).
Hunk #27 succeeded at 470 (offset 1 line).
Hunk #28 succeeded at 612 (offset 1 line).
1 out of 28 hunks FAILED

> In most cases this other symbol is an architecture or platform specific
> symbol, or PCI.
> 
> Generic symbols and drivers without platform dependencies keep their
> dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
> cannot work anyway.

Actually, depends on HAS_DMA was introduced on media because builds
were failing otherwise. We started adding it before the addition 
of COMPILE_TEST.

Can we just remove all HAS_DMA Kconfig dependencies as a hole from the
entire media subsystem, with something like:

	$ for i in $(find drivers/media -name Kconfig) $(find drivers/staging/media -name Kconfig); do sed '/depends on HAS_DMA/d;s/ && HAS_DMA//g' -i $i; done

Or would it cause build issues?

Regards,
Mauro




Thanks,
Mauro

^ permalink raw reply

* Re: [PATCH 11/40] ipv6/flowlabel: simplify pid namespace lookup
From: Eric W. Biederman @ 2018-05-05 12:37 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Alexander Viro, Alexey Dobriyan,
	Greg Kroah-Hartman, Jiri Slaby, Alessandro Zummo,
	Alexandre Belloni, linux-acpi, drbd-dev, linux-ide, netdev,
	linux-rtc, megaraidlinux.pdl, linux-scsi, devel, linux-afs,
	linux-ext4, jfs-discussion, netfilter-devel, linux-kernel
In-Reply-To: <20180425154827.32251-12-hch@lst.de>

Christoph Hellwig <hch@lst.de> writes:

> The shole seq_file sequence already operates under a single RCU lock pair,
> so move the pid namespace lookup into it, and stop grabbing a reference
> and remove all kinds of boilerplate code.

This is wrong.

Move task_active_pid_ns(current) from open to seq_start actually means
that the results if you pass this proc file between callers the results
will change.  So this breaks file descriptor passing.

Open is a bad place to access current.  In the middle of read/write is
broken.


In this particular instance looking up the pid namespace with
task_active_pid_ns was a personal brain fart.  What the code should be
doing (with an appropriate helper) is:

struct pid_namespace *pid_ns = inode->i_sb->s_fs_info;

Because each mount of proc is bound to a pid namespace.  Looking up the
pid namespace from the super_block is a much better way to go.

Eric



> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  net/ipv6/ip6_flowlabel.c | 28 +++++-----------------------
>  1 file changed, 5 insertions(+), 23 deletions(-)
>
> diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
> index c05c4e82a7ca..a9f221d45ef9 100644
> --- a/net/ipv6/ip6_flowlabel.c
> +++ b/net/ipv6/ip6_flowlabel.c
> @@ -754,7 +754,10 @@ static struct ip6_flowlabel *ip6fl_get_idx(struct seq_file *seq, loff_t pos)
>  static void *ip6fl_seq_start(struct seq_file *seq, loff_t *pos)
>  	__acquires(RCU)
>  {
> +	struct ip6fl_iter_state *state = ip6fl_seq_private(seq);
> +
>  	rcu_read_lock_bh();
> +	state->pid_ns = task_active_pid_ns(current);
>  	return *pos ? ip6fl_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
>  }
>  
> @@ -810,36 +813,15 @@ static const struct seq_operations ip6fl_seq_ops = {
>  
>  static int ip6fl_seq_open(struct inode *inode, struct file *file)
>  {
> -	struct seq_file *seq;
> -	struct ip6fl_iter_state *state;
> -	int err;
> -
> -	err = seq_open_net(inode, file, &ip6fl_seq_ops,
> +	return seq_open_net(inode, file, &ip6fl_seq_ops,
>  			   sizeof(struct ip6fl_iter_state));
> -
> -	if (!err) {
> -		seq = file->private_data;
> -		state = ip6fl_seq_private(seq);
> -		rcu_read_lock();
> -		state->pid_ns = get_pid_ns(task_active_pid_ns(current));
> -		rcu_read_unlock();
> -	}
> -	return err;
> -}
> -
> -static int ip6fl_seq_release(struct inode *inode, struct file *file)
> -{
> -	struct seq_file *seq = file->private_data;
> -	struct ip6fl_iter_state *state = ip6fl_seq_private(seq);
> -	put_pid_ns(state->pid_ns);
> -	return seq_release_net(inode, file);
>  }
>  
>  static const struct file_operations ip6fl_seq_fops = {
>  	.open		=	ip6fl_seq_open,
>  	.read		=	seq_read,
>  	.llseek		=	seq_lseek,
> -	.release	=	ip6fl_seq_release,
> +	.release	=	seq_release_net,
>  };
>  
>  static int __net_init ip6_flowlabel_proc_init(struct net *net)

^ permalink raw reply

* Re: [PATCH] net: dsa: drop some VLAs in switch.c
From: Salvatore Mesoraca @ 2018-05-05 10:36 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Vivien Didelot, linux-kernel, Kernel Hardening, netdev,
	David S. Miller, Andrew Lunn, Kees Cook
In-Reply-To: <b948ef55-8b16-fee4-b22f-4c6d09fea0cf@gmail.com>

2018-03-13 21:06 GMT+01:00 Florian Fainelli <f.fainelli@gmail.com>:
> On 03/13/2018 12:58 PM, Vivien Didelot wrote:
>> Hi Salvatore,
>>
>> Salvatore Mesoraca <s.mesoraca16@gmail.com> writes:
>>
>>> dsa_switch's num_ports is currently fixed to DSA_MAX_PORTS. So we avoid
>>> 2 VLAs[1] by using DSA_MAX_PORTS instead of ds->num_ports.
>>>
>>> [1] https://lkml.org/lkml/2018/3/7/621
>>>
>>> Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
>>
>> NAK.
>>
>> We are in the process to remove hardcoded limits such as DSA_MAX_PORTS
>> and DSA_MAX_SWITCHES, so we have to stick with ds->num_ports.
>
> Then this means that we need to allocate a bitmap from the heap, which
> sounds a bit superfluous and could theoretically fail... not sure which
> way is better, but bumping the size to DSA_MAX_PORTS definitively does
> help people working on enabling -Wvla.

Hi Florian,

Should I consider this patch still NAKed or not?
Should I resend the patch with some modifications?

Thank you,

Salvatore

^ permalink raw reply

* Re: [net-next PATCH v2 0/8] UDP GSO Segmentation clean-ups
From: Willem de Bruijn @ 2018-05-05 10:06 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Network Development, Willem de Bruijn, David Miller
In-Reply-To: <20180504182537.5194.72775.stgit@localhost.localdomain>

On Fri, May 4, 2018 at 8:28 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> This patch set addresses a number of issues I found while sorting out
> enabling UDP GSO Segmentation support for ixgbe/ixgbevf. Specifically there
> were a number of issues related to the checksum and such that seemed to
> cause either minor irregularities or kernel panics in the case of the
> offload request being allowed to traverse between name spaces.

Were you able to traverse GSO packets between network namespace before
adding to NETIF_F_GSO_SOFTWARE? It does appear that veth includes
NETIF_F_GSO_ENCAP_ALL, which also allows GSO.

In either case, it should not be possible for GSO packets to arrive on a veth
device, as that can result in queuing the GSO packet to a recipient socket.
In this regard veth is like loopback and must exclude GSO support.

I'll take a look.

^ permalink raw reply

* Re: [net-next PATCH v2 4/8] udp: Do not pass checksum as a parameter to GSO segmentation
From: Willem de Bruijn @ 2018-05-05 10:01 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Network Development, Willem de Bruijn, David Miller
In-Reply-To: <20180504182958.5194.62398.stgit@localhost.localdomain>

On Fri, May 4, 2018 at 8:30 PM, Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This patch is meant to allow us to avoid having to recompute the checksum
> from scratch and have it passed as a parameter.
>
> Instead of taking that approach we can take advantage of the fact that the
> length that was used to compute the existing checksum is included in the
> UDP header. If we cancel that out by adding the value XOR with 0xFFFF we
> can then just add the new length in and fold that into the new result.
>
> I think this may be fixing a checksum bug in the original code as well
> since the checksum that was passed included the UDP header in the checksum
> computation, but then excluded it for the adjustment on the last frame. I
> believe this may have an effect on things in the cases where the two differ
> by bits that would result in things crossing the byte boundaries.

The replacement code, below, subtracts original payload size then adds
the new payload size. mss here excludes the udp header size.

>                 /* last packet can be partial gso_size */
> -               if (!seg->next)
> -                       csum_replace2(&uh->check, htons(mss),
> -                                     htons(seg->len - hdrlen - sizeof(*uh)));

^ permalink raw reply

* Re: [PATCH 7/8] rhashtable: add rhashtable_walk_prev()
From: Herbert Xu @ 2018-05-05  9:43 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel, Tom Herbert
In-Reply-To: <152540605441.18473.4087381584733882012.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> rhashtable_walk_prev() returns the object returned by
> the previous rhashtable_walk_next(), providing it is still in the
> table (or was during this grace period).
> This works even if rhashtable_walk_stop() and rhashtable_talk_start()
> have been called since the last rhashtable_walk_next().
> 
> If there have been no calls to rhashtable_walk_next(), or if the
> object is gone from the table, then NULL is returned.
> 
> This can usefully be used in a seq_file ->start() function.
> If the pos is the same as was returned by the last ->next() call,
> then rhashtable_walk_prev() can be used to re-establish the
> current location in the table.  If it returns NULL, then
> rhashtable_walk_next() should be used.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I will ack this if Tom is OK with replacing peek with it.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 6/8] rhashtable: further improve stability of rhashtable_walk
From: Herbert Xu @ 2018-05-05  9:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <152540605438.18473.4797800779538116583.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> If the sequence:
>    obj = rhashtable_walk_next(iter);
>    rhashtable_walk_stop(iter);
>    rhashtable_remove_fast(ht, &obj->head, params);
>    rhashtable_walk_start(iter);
> 
>  races with another thread inserting or removing
>  an object on the same hash chain, a subsequent
>  rhashtable_walk_next() is not guaranteed to get the "next"
>  object. It is possible that an object could be
>  repeated, or missed.
> 
>  This can be made more reliable by keeping the objects in a hash chain
>  sorted by memory address.  A subsequent rhashtable_walk_next()
>  call can reliably find the correct position in the list, and thus
>  find the 'next' object.
> 
>  It is not possible (certainly not so easy) to achieve this with an
>  rhltable as keeping the hash chain in order is not so easy.  When the
>  first object with a given key is removed, it is replaced in the chain
>  with the next object with the same key, and the address of that
>  object may not be correctly ordered.
>  No current user of rhltable_walk_enter() calls
>  rhashtable_walk_start() more than once, so no current code
>  could benefit from a more reliable walk of rhltables.
> 
>  This patch only attempts to improve walks for rhashtables.
>  - a new object is always inserted after the last object with a
>    smaller address, or at the start
>  - when rhashtable_walk_start() is called, it records that 'p' is not
>    'safe', meaning that it cannot be dereferenced.  The revalidation
>    that was previously done here is moved to rhashtable_walk_next()
>  - when rhashtable_walk_next() is called while p is not NULL and not
>    safe, it walks the chain looking for the first object with an
>    address greater than p and returns that.  If there is none, it moves
>    to the next hash chain.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I'm a bit torn on this.  On the hand this is definitely an improvement
over the status quo.  On the other this does not work on rhltable and
we do have a way of fixing it for both rhashtable and rhltable.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 8/8] rhashtable: don't hold lock on first table throughout insertion.
From: Herbert Xu @ 2018-05-05  9:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <152540605444.18473.9591316658457316578.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> rhashtable_try_insert() currently hold a lock on the bucket in
> the first table, while also locking buckets in subsequent tables.
> This is unnecessary and looks like a hold-over from some earlier
> version of the implementation.
> 
> As insert and remove always lock a bucket in each table in turn, and
> as insert only inserts in the final table, there cannot be any races
> that are not covered by simply locking a bucket in each table in turn.
> 
> When an insert call reaches that last table it can be sure that there
> is no match entry in any other table as it has searched them all, and
> insertion never happens anywhere but in the last table.  The fact that
> code tests for the existence of future_tbl while holding a lock on
> the relevant bucket ensures that two threads inserting the same key
> will make compatible decisions about which is the "last" table.
> 
> This simplifies the code and allows the ->rehash field to be
> discarded.
> We still need a way to ensure that a dead bucket_table is never
> re-linked by rhashtable_walk_stop().  This can be achieved by
> setting the ->size to 1.  This still allows lookup code to work (and
> correctly not find anything) but can never happen on an active bucket
> table (as the minimum size is 4).
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I'm not convinced this is safe.  There can be multiple tables
in existence.  Unless you hold the lock on the first table, what
is to stop two parallel inserters each from inserting into their
"last" tables which may not be the same table?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 5/8] rhashtable: remove rhashtable_walk_peek()
From: Herbert Xu @ 2018-05-05  9:30 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel, Tom Herbert
In-Reply-To: <152540605435.18473.12657571579874985957.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> This function has a somewhat confused behavior that is not properly
> described by the documentation.
> Sometimes is returns the previous object, sometimes it returns the
> next one.
> Sometimes it changes the iterator, sometimes it doesn't.
> 
> This function is not currently used and is not worth keeping, so
> remove it.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I don't have a problem with this.  But it would be good to hear
from Tom Herbert who added this.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 4/8] rhashtable: fix race in nested_table_alloc()
From: Herbert Xu @ 2018-05-05  9:29 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <152540605432.18473.11813271279255176724.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> If two threads run nested_table_alloc() at the same time
> they could both allocate a new table.
> Best case is that one of them will never be freed, leaking memory.
> Worst case is hat entry get stored there before it leaks,
> and the are lost from the table.
> 
> So use cmpxchg to detect the race and free the unused table.
> 
> Fixes: da20420f83ea ("rhashtable: Add nested tables")
> Cc: stable@vger.kernel.org # 4.11+
> Signed-off-by: NeilBrown <neilb@suse.com>

What about the spinlock that's meant to be held around this
operation?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 3/8] rhashtable: use cmpxchg() to protect ->future_tbl.
From: Herbert Xu @ 2018-05-05  9:27 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <152540605430.18473.11758878046224478232.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> Rather than borrowing one of the bucket locks to
> protect ->future_tbl updates, use cmpxchg().
> This gives more freedom to change how bucket locking
> is implemented.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

This looks nice.

> -	spin_unlock_bh(old_tbl->locks);
> +	rcu_assign_pointer(tmp, new_tbl);

Do we need this barrier since cmpxchg is supposed to provide memory
barrier semantics?

> +	if (cmpxchg(&old_tbl->future_tbl, NULL, tmp) != NULL)
> +		return -EEXIST;

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 2/8] rhashtable: remove nulls_base and related code.
From: Herbert Xu @ 2018-05-05  9:12 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <152540605427.18473.12277050439942480863.stgit@noble>

On Fri, May 04, 2018 at 01:54:14PM +1000, NeilBrown wrote:
> This "feature" is unused, undocumented, and untested and so
> doesn't really belong.  If a use for the nulls marker
> is found, all this code would need to be reviewed to
> ensure it works as required.  It would be just as easy to
> just add the code if/when it is needed instead.
> 
> This patch actually fixes a bug too.  The table resizing allows a
> table to grow to 2^31 buckets, but the hash is truncated to 27 bits -
> any growth beyond 2^27 is wasteful an ineffective.
> 
> This patch result in NULLS_MARKER(0) being used for all chains,
> and leave the use of rht_is_a_null() to test for it.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I disagree.  This is a fundamental requirement for the use of
rhashtable in certain networking systems such as TCP/UDP.  So
we know that there will be a use for this.

As to the bug fix, please separate it out of the patch and resubmit.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox