* [PATCH 02/14] allow root in container to copy namespaces (v3)
From: Serge E. Hallyn @ 2011-07-29 17:27 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, netdev, containers, dhowells
In-Reply-To: <m1hb67fh9l.fsf@fess.ebiederm.org>
Quoting Eric W. Biederman (ebiederm@xmission.com):
> Serge Hallyn <serge@hallyn.com> writes:
>
> > From: Serge E. Hallyn <serge.hallyn@canonical.com>
> >
> > Othewise nested containers with user namespaces won't be possible.
> >
> > It's true that user namespaces are not yet fully isolated, but for
> > that same reason there are far worse things that root in a child
> > user ns can do. Spawning a child user ns is not in itself bad.
> >
> > This patch also allows setns for root in a container:
> > @Eric Biederman: are there gotchas in allowing setns from child
> > userns?
>
> Yes. We need to ensure that the target namespaces are namespaces
> that have been created in from user_namespace or from a child of this
> user_namespace.
>
> Aka we need to ensure that we have CAP_SYS_ADMIN for the new namespace.
[New patch below]
Othewise nested containers with user namespaces won't be possible.
It's true that user namespaces are not yet fully isolated, but for
that same reason there are far worse things that root in a child
user ns can do. Spawning a child user ns is not in itself bad.
This patch also allows setns for root in a container:
@Eric Biederman: are there gotchas in allowing setns from child
userns?
Changelog:
Jul 29: setns: target capability check for setns
When changing to another namespace, make sure that we have
the CAP_SYS_ADMIN capability targeted at the user namespace
owning the new ns.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
---
ipc/namespace.c | 3 +++
kernel/fork.c | 4 ++--
kernel/nsproxy.c | 7 ++-----
kernel/utsname.c | 3 +++
net/core/net_namespace.c | 3 +++
5 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/ipc/namespace.c b/ipc/namespace.c
index ce0a647..f527e49 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -163,6 +163,9 @@ static void ipcns_put(void *ns)
static int ipcns_install(struct nsproxy *nsproxy, void *ns)
{
+ struct ipc_namespace *newns = ns;
+ if (!ns_capable(newns->user_ns, CAP_SYS_ADMIN))
+ return -1;
/* Ditch state from the old ipc namespace */
exit_sem(current);
put_ipc_ns(nsproxy->ipc_ns);
diff --git a/kernel/fork.c b/kernel/fork.c
index e7ceaca..f9fac70 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1488,8 +1488,8 @@ long do_fork(unsigned long clone_flags,
/* hopefully this check will go away when userns support is
* complete
*/
- if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SETUID) ||
- !capable(CAP_SETGID))
+ if (!nsown_capable(CAP_SYS_ADMIN) || !nsown_capable(CAP_SETUID) ||
+ !nsown_capable(CAP_SETGID))
return -EPERM;
}
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 9aeab4b..cadcee0 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -134,7 +134,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
CLONE_NEWPID | CLONE_NEWNET)))
return 0;
- if (!capable(CAP_SYS_ADMIN)) {
+ if (!nsown_capable(CAP_SYS_ADMIN)) {
err = -EPERM;
goto out;
}
@@ -191,7 +191,7 @@ int unshare_nsproxy_namespaces(unsigned long unshare_flags,
CLONE_NEWNET)))
return 0;
- if (!capable(CAP_SYS_ADMIN))
+ if (!nsown_capable(CAP_SYS_ADMIN))
return -EPERM;
*new_nsp = create_new_namespaces(unshare_flags, current,
@@ -241,9 +241,6 @@ SYSCALL_DEFINE2(setns, int, fd, int, nstype)
struct file *file;
int err;
- if (!capable(CAP_SYS_ADMIN))
- return -EPERM;
-
file = proc_ns_fget(fd);
if (IS_ERR(file))
return PTR_ERR(file);
diff --git a/kernel/utsname.c b/kernel/utsname.c
index bff131b..8f648cc 100644
--- a/kernel/utsname.c
+++ b/kernel/utsname.c
@@ -104,6 +104,9 @@ static void utsns_put(void *ns)
static int utsns_install(struct nsproxy *nsproxy, void *ns)
{
+ struct uts_namespace *newns = ns;
+ if (!ns_capable(newns->user_ns, CAP_SYS_ADMIN))
+ return -1;
get_uts_ns(ns);
put_uts_ns(nsproxy->uts_ns);
nsproxy->uts_ns = ns;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 5bbdbf0..90c97f6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -620,6 +620,9 @@ static void netns_put(void *ns)
static int netns_install(struct nsproxy *nsproxy, void *ns)
{
+ struct net *net = ns;
+ if (!ns_capable(net->user_ns, CAP_SYS_ADMIN))
+ return -1;
put_net(nsproxy->net_ns);
nsproxy->net_ns = get_net(ns);
return 0;
--
1.7.5.4
^ permalink raw reply related
* [PATCH 01/14] add Documentation/namespaces/user_namespace.txt (v3)
From: Serge E. Hallyn @ 2011-07-29 17:25 UTC (permalink / raw)
To: David Howells
Cc: Randy Dunlap, linux-kernel, ebiederm, containers, netdev, akpm,
Serge E. Hallyn
In-Reply-To: <27437.1311712186@redhat.com>
Quoting David Howells (dhowells@redhat.com):
> Randy Dunlap <rdunlap@xenotime.net> wrote:
>
> > > +Any task in or resource belonging to the initial user namespace will, to this
> > > +new task, appear to belong to UID and GID -1 - which is usually known as
> >
> > that extra hyphen is confusing. how about:
> >
> > to UID and GID -1, which is
>
> 'which are'.
>
> David
This will hold some info about the design. Currently it contains
future todos, issues and questions.
Changelog:
jul 26: incorporate feedback from David Howells.
jul 29: incorporate feedback from Randy Dunlap.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
---
Documentation/namespaces/user_namespace.txt | 107 +++++++++++++++++++++++++++
1 files changed, 107 insertions(+), 0 deletions(-)
create mode 100644 Documentation/namespaces/user_namespace.txt
diff --git a/Documentation/namespaces/user_namespace.txt b/Documentation/namespaces/user_namespace.txt
new file mode 100644
index 0000000..b0bc480
--- /dev/null
+++ b/Documentation/namespaces/user_namespace.txt
@@ -0,0 +1,107 @@
+Description
+===========
+
+Traditionally, each task is owned by a user ID (UID) and belongs to one or more
+groups (GID). Both are simple numeric IDs, though userspace usually translates
+them to names. The user namespace allows tasks to have different views of the
+UIDs and GIDs associated with tasks and other resources. (See 'UID mapping'
+below for more.)
+
+The user namespace is a simple hierarchical one. The system starts with all
+tasks belonging to the initial user namespace. A task creates a new user
+namespace by passing the CLONE_NEWUSER flag to clone(2). This requires the
+creating task to have the CAP_SETUID, CAP_SETGID, and CAP_CHOWN capabilities,
+but it does not need to be running as root. The clone(2) call will result in a
+new task which to itself appears to be running as UID and GID 0, but to its
+creator seems to have the creator's credentials.
+
+To this new task, any resource belonging to the initial user namespace will
+appear to belong to user and group 'nobody', which are UID and GID -1.
+Permission to open such files will be granted according to world access
+permissions. UID comparisons and group membership checks will return false,
+and privilege will be denied.
+
+When a task belonging to (for example) userid 500 in the initial user namespace
+creates a new user namespace, even though the new task will see itself as
+belonging to UID 0, any task in the initial user namespace will see it as
+belonging to UID 500. Therefore, UID 500 in the initial user namespace will be
+able to kill the new task. Files created by the new user will (eventually) be
+seen by tasks in its own user namespace as belonging to UID 0, but to tasks in
+the initial user namespace as belonging to UID 500.
+
+Note that this userid mapping for the VFS is not yet implemented, though the
+lkml and containers mailing list archives will show several previous
+prototypes. In the end, those got hung up waiting on the concept of targeted
+capabilities to be developed, which, thanks to the insight of Eric Biederman,
+they finally did.
+
+Relationship between the User namespace and other namespaces
+============================================================
+
+Other namespaces, such as UTS and network, are owned by a user namespace. When
+such a namespace is created, it is assigned to the user namespace of the task
+by which it was created. Therefore, attempts to exercise privilege to
+resources in, for instance, a particular network namespace, can be properly
+validated by checking whether the caller has the needed privilege (i.e.
+CAP_NET_ADMIN) targeted to the user namespace which owns the network namespace.
+This is done using the ns_capable() function.
+
+As an example, if a new task is cloned with a private user namespace but
+no private network namespace, then the task's network namespace is owned
+by the parent user namespace. The new task has no privilege to the
+parent user namespace, so it will not be able to create or configure
+network devices. If, instead, the task were cloned with both private
+user and network namespaces, then the private network namespace is owned
+by the private user namespace, and so root in the new user namespace
+will have privilege targeted to the network namespace. It will be able
+to create and configure network devices.
+
+UID Mapping
+===========
+The current plan (see 'flexible UID mapping' at
+https://wiki.ubuntu.com/UserNamespace) is:
+
+The UID/GID stored on disk will be that in the init_user_ns. Most likely
+UID/GID in other namespaces will be stored in xattrs. But Eric was advocating
+(a few years ago) leaving the details up to filesystems while providing a lib/
+stock implementation. See the thread around here:
+http://www.mail-archive.com/devel@openvz.org/msg09331.html
+
+
+Working notes
+=============
+Capability checks for actions related to syslog must be against the
+init_user_ns until syslog is containerized.
+
+Same is true for reboot and power, control groups, devices, and time.
+
+Perf actions (kernel/event/core.c for instance) will always be constrained to
+init_user_ns.
+
+Q:
+Is accounting considered properly containerized with respect to pidns? (it
+appears to be). If so, then we can change the capable() check in
+kernel/acct.c to 'ns_capable(current_pid_ns()->user_ns, CAP_PACCT)'
+
+Q:
+For things like nice and schedaffinity, we could allow root in a container to
+control those, and leave only cgroups to constrain the container. I'm not sure
+whether that is right, or whether it violates admin expectations.
+
+I deferred some of commoncap.c. I'm punting on xattr stuff as they take
+dentries, not inodes.
+
+For drivers/tty/tty_io.c and drivers/tty/vt/vt.c, we'll want to (for some of
+them) target the capability checks at the user_ns owning the tty. That will
+have to wait until we get userns owning files straightened out.
+
+We need to figure out how to label devices. Should we just toss a user_ns
+right into struct device?
+
+capable(CAP_MAC_ADMIN) checks are always to be against init_user_ns, unless
+some day LSMs were to be containerized, near zero chance.
+
+inode_owner_or_capable() should probably take an optional ns and cap parameter.
+If cap is 0, then CAP_FOWNER is checked. If ns is NULL, we derive the ns from
+inode. But if ns is provided, then callers who need to derive
+inode_userns(inode) anyway can save a few cycles.
--
1.7.5.4
^ permalink raw reply related
* Re: [PATCH] net/smsc911x: add device tree probe support
From: Shawn Guo @ 2011-07-29 17:13 UTC (permalink / raw)
To: Grant Likely
Cc: Nicolas Pitre, patches, netdev, devicetree-discuss,
Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110729162629.GI11164@ponder.secretlab.ca>
On Fri, Jul 29, 2011 at 10:26:29AM -0600, Grant Likely wrote:
> On Sat, Jul 30, 2011 at 12:03:44AM +0800, Shawn Guo wrote:
> > On Fri, Jul 29, 2011 at 09:47:23AM -0600, Grant Likely wrote:
> > > On Fri, Jul 29, 2011 at 10:36:26AM +0800, Shawn Guo wrote:
> > > > Since I do not get any suggestion here, I'm going to follow the driver
> > > > name to use '911' as the model number. Please tell me if there is any
> > > > better one.
> > >
> > > What is the manufacturer part name? lan9111 or lan91c11? The
> > > manufacturer's documented part number is almost always preferred.
> > >
> > I just posted the v2 of the patch, and chose to use 'smsc,lan9115'
> > which is the first model supported in the driver. This is the same
> > approach I used with i.mx device bindings.
>
> You haven't answered the question.
>
For 9115 example, the part number in data sheet is "LAN9115".
--
Regards,
Shawn
^ permalink raw reply
* Re: [patch net-next-2.6] dummy: allow report link status and change it via sysfs
From: Stephen Hemminger @ 2011-07-29 16:55 UTC (permalink / raw)
To: Ben Hutchings; +Cc: Jiri Pirko, netdev, davem, eric.dumazet
In-Reply-To: <1311955389.3220.15.camel@deadeye>
On Fri, 29 Jul 2011 18:03:09 +0200
Ben Hutchings <bhutchings@solarflare.com> wrote:
> On Fri, 2011-07-29 at 17:27 +0200, Jiri Pirko wrote:
> > Signed-off-by: Jiri Pirko <jpirko@redhat.com>
> > ---
> > drivers/net/dummy.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 files changed, 54 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
> > index 39cf9b9..fc39c24 100644
> > --- a/drivers/net/dummy.c
> > +++ b/drivers/net/dummy.c
> > @@ -37,9 +37,57 @@
> > #include <linux/rtnetlink.h>
> > #include <net/rtnetlink.h>
> > #include <linux/u64_stats_sync.h>
> > +#include <linux/ethtool.h>
> >
> > static int numdummies = 1;
> >
> > +static ssize_t dummy_show_link(struct device *d,
> > + struct device_attribute *attr,
> > + char *buf)
> > +{
> > + struct net_device *dev = to_net_dev(d);
> > +
> > + return sprintf(buf, "%d\n", netif_carrier_ok(dev) ? 1 : 0);
> > +}
> [...]
>
> Net devices already have the 'carrier' attribute. You should make that
> attribute writable for dummy devices, rather than adding another one.
>
And adding ethtool support is unnecessary since carrier is already
reported by tools like 'ip link'
^ permalink raw reply
* RE: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
From: Rose, Gregory V @ 2011-07-29 16:54 UTC (permalink / raw)
To: Jesse Barnes, Ian Campbell
Cc: Konrad Rzeszutek Wilk, netdev@vger.kernel.org,
davem@davemloft.net, bhutchings@solarflare.com,
Kirsher, Jeffrey T, linux-pci@vger.kernel.org
In-Reply-To: <20110729095141.472e8081@jbarnes-desktop>
> -----Original Message-----
> From: Jesse Barnes [mailto:jbarnes@virtuousgeek.org]
> Sent: Friday, July 29, 2011 9:52 AM
> To: Ian Campbell
> Cc: Rose, Gregory V; Konrad Rzeszutek Wilk; netdev@vger.kernel.org;
> davem@davemloft.net; bhutchings@solarflare.com; Kirsher, Jeffrey T; linux-
> pci@vger.kernel.org
> Subject: Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has
> been assigned by KVM
>
> On Thu, 28 Jul 2011 16:11:17 +0100
> Ian Campbell <ijc@hellion.org.uk> wrote:
>
> > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > Device drivers that create and destroy SR-IOV virtual functions via
> > > calls to pci_enable_sriov() and pci_disable_sriov can cause
> catastrophic
> > > failures if they attempt to destroy VFs while they are assigned to
> > > guest virtual machines. By adding a flag for use by the KVM module
> > > to indicate that a device is assigned a device driver can check that
> > > flag and avoid destroying VFs while they are assigned and avoid system
> > > failures.
> > >
> > > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > > ---
> > >
> > > include/linux/pci.h | 2 ++
> >
> > I added Jesse and linux-pci to CC.
> >
> > > virt/kvm/assigned-dev.c | 2 ++
> > > virt/kvm/iommu.c | 4 ++++
> > > 3 files changed, 8 insertions(+), 0 deletions(-)
> >
> > I suppose this would also be useful in Xen's pciback or any other system
> > which does passthrough? (Konrad CC'd for pciback)
> >
> > Is there some common lower layer we could hook this in to? (does
> > iommu_attach/detach_device make sense?) Or shall we just add the flag
> > manipulations to pciback as well?
> >
> > Ian.
> >
> > >
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 2d29218..a297ca2 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -174,6 +174,8 @@ enum pci_dev_flags {
> > > PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG = (__force pci_dev_flags_t) 1,
> > > /* Device configuration is irrevocably lost if disabled into D3 */
> > > PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
> > > + /* Provide indication device is assigned by KVM */
> > > + PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
> > > };
>
> Looks fine, but I'd make the comment less redundant with the code, e.g.
> "set when the device is assigned to a guest instance" or somesuch.
Sure, sounds good to me.
Rev 2 of the RFC patches will be out in a couple of weeks, I'm away next week.
Thanks,
- Greg
>
> --
> Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply
* Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
From: Konrad Rzeszutek Wilk @ 2011-07-29 16:54 UTC (permalink / raw)
To: Rose, Gregory V
Cc: Ian Campbell, netdev@vger.kernel.org, davem@davemloft.net,
bhutchings@solarflare.com, Kirsher, Jeffrey T, Jesse Barnes,
linux-pci@vger.kernel.org
In-Reply-To: <43F901BD926A4E43B106BF17856F0755019414D59C@orsmsx508.amr.corp.intel.com>
> > > > On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > > > > Device drivers that create and destroy SR-IOV virtual functions via
> > > > > calls to pci_enable_sriov() and pci_disable_sriov can cause
> > catastrophic
> > > > > failures if they attempt to destroy VFs while they are assigned to
> > > > > guest virtual machines. By adding a flag for use by the KVM module
> > > > > to indicate that a device is assigned a device driver can check that
> > > > > flag and avoid destroying VFs while they are assigned and avoid
> > system
> > > > > failures.
> OK, but I hope Xen can still use the dev_flag assignment bit.
Yeah, I think the attached patch would do it, but I need to double check it.
Do you have a git tree with this patchset?
Um, so you are fixing up ixgbe only - what about the cxgb4 and be driver?
Shouldn't they also get some of this treatment?
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 206c4ce0..0d72e84 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -250,6 +250,7 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
goto out;
dev_dbg(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
+ dev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
if (xen_register_device_domain_owner(dev,
pdev->xdev->otherend_id) != 0) {
dev_err(&dev->dev, "device has been assigned to another " \
@@ -289,6 +290,7 @@ static int xen_pcibk_remove_device(struct xen_pcibk_device *pdev,
}
dev_dbg(&dev->dev, "unregistering for %d\n", pdev->xdev->otherend_id);
+ dev->dev_flags &= ~PCI_DEV_FLAGS_ASSIGNED;
xen_unregister_device_domain_owner(dev);
xen_pcibk_release_pci_dev(pdev, dev);
^ permalink raw reply related
* Re: [PATCH] sunrpc: use better NUMA affinities
From: J. Bruce Fields @ 2011-07-29 16:53 UTC (permalink / raw)
To: Greg Banks
Cc: Eric Dumazet, Christoph Hellwig, NeilBrown,
linux-nfs@vger.kernel.org, David Miller, linux-kernel, netdev
In-Reply-To: <20110729164836.GL23194@fieldses.org>
On Fri, Jul 29, 2011 at 12:48:36PM -0400, bfields wrote:
> On Fri, Jul 29, 2011 at 11:30:05PM +1000, Greg Banks wrote:
> >
> >
> > Sent from my iPhone
> >
> > On 29/07/2011, at 22:11, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > > Le vendredi 29 juillet 2011 à 21:58 +1000, Greg Banks a écrit :
> > >
> > >>
> > >> Sure, and a whole lot of the callsites are ("..._%d", cpu), hence the
> > >> unfortune :(
> > >
> > > BTW, we could name nfsd threads differently :
> > >
> > > Currently, they all are named : "nfsd"
> > >
> > > If SVC_POOL_PERCPU is selected, we could name them :
> > > nfsd_c0 -> nfsd_cN
> > >
> > > If SVC_POOL_PERNODE is selected, we could name them :
> > > nfsd_n0 -> nfsd_nN
> > >
> > > That would help to check with "ps aux" which cpu/nodes are under
> > > stress.
> > >
> > >
> >
> > I like it!
>
> Yup, patch welcomed.--b.
(Annoying fact: some initscripts stop nfsd using a rough equivalent of
"killall nfsd". So the name of the threads is arguably ABI. I think
those initscripts are nuts and deserve what they get, but that may be
because I'm forgetting the reason they do that.)
--b.
^ permalink raw reply
* Re: [RFC net-next PATCH 1/4] pci: Add flag indicating device has been assigned by KVM
From: Jesse Barnes @ 2011-07-29 16:51 UTC (permalink / raw)
To: Ian Campbell
Cc: Greg Rose, Konrad Rzeszutek Wilk, netdev, davem, bhutchings,
jeffrey.t.kirsher, linux-pci
In-Reply-To: <1311865877.24408.144.camel@cthulhu.hellion.org.uk>
On Thu, 28 Jul 2011 16:11:17 +0100
Ian Campbell <ijc@hellion.org.uk> wrote:
> On Wed, 2011-07-27 at 15:17 -0700, Greg Rose wrote:
> > Device drivers that create and destroy SR-IOV virtual functions via
> > calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
> > failures if they attempt to destroy VFs while they are assigned to
> > guest virtual machines. By adding a flag for use by the KVM module
> > to indicate that a device is assigned a device driver can check that
> > flag and avoid destroying VFs while they are assigned and avoid system
> > failures.
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> > ---
> >
> > include/linux/pci.h | 2 ++
>
> I added Jesse and linux-pci to CC.
>
> > virt/kvm/assigned-dev.c | 2 ++
> > virt/kvm/iommu.c | 4 ++++
> > 3 files changed, 8 insertions(+), 0 deletions(-)
>
> I suppose this would also be useful in Xen's pciback or any other system
> which does passthrough? (Konrad CC'd for pciback)
>
> Is there some common lower layer we could hook this in to? (does
> iommu_attach/detach_device make sense?) Or shall we just add the flag
> manipulations to pciback as well?
>
> Ian.
>
> >
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 2d29218..a297ca2 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -174,6 +174,8 @@ enum pci_dev_flags {
> > PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG = (__force pci_dev_flags_t) 1,
> > /* Device configuration is irrevocably lost if disabled into D3 */
> > PCI_DEV_FLAGS_NO_D3 = (__force pci_dev_flags_t) 2,
> > + /* Provide indication device is assigned by KVM */
> > + PCI_DEV_FLAGS_ASSIGNED = (__force pci_dev_flags_t) 4,
> > };
Looks fine, but I'd make the comment less redundant with the code, e.g.
"set when the device is assigned to a guest instance" or somesuch.
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply
* Re: [PATCH] sunrpc: use better NUMA affinities
From: J. Bruce Fields @ 2011-07-29 16:48 UTC (permalink / raw)
To: Greg Banks
Cc: Eric Dumazet, Christoph Hellwig, NeilBrown,
linux-nfs@vger.kernel.org, David Miller, linux-kernel, netdev
In-Reply-To: <F67B6E82-9FA8-464E-B3FB-DC0D4EC18BB0@fastmail.fm>
On Fri, Jul 29, 2011 at 11:30:05PM +1000, Greg Banks wrote:
>
>
> Sent from my iPhone
>
> On 29/07/2011, at 22:11, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > Le vendredi 29 juillet 2011 à 21:58 +1000, Greg Banks a écrit :
> >
> >>
> >> Sure, and a whole lot of the callsites are ("..._%d", cpu), hence the
> >> unfortune :(
> >
> > BTW, we could name nfsd threads differently :
> >
> > Currently, they all are named : "nfsd"
> >
> > If SVC_POOL_PERCPU is selected, we could name them :
> > nfsd_c0 -> nfsd_cN
> >
> > If SVC_POOL_PERNODE is selected, we could name them :
> > nfsd_n0 -> nfsd_nN
> >
> > That would help to check with "ps aux" which cpu/nodes are under
> > stress.
> >
> >
>
> I like it!
Yup, patch welcomed.--b.
^ permalink raw reply
* Re: Fw: [PATCH] sunrpc: use better NUMA affinities
From: J. Bruce Fields @ 2011-07-29 16:48 UTC (permalink / raw)
To: Greg Banks
Cc: NeilBrown, linux-nfs-u79uwXL29TY76Z2rM5mHXA, David Miller,
linux-kernel, netdev, Eric Dumazet
In-Reply-To: <4E324DB4.7060600-97jfqw80gc6171pxa8y+qA@public.gmane.org>
On Fri, Jul 29, 2011 at 04:05:40PM +1000, Greg Banks wrote:
> On 29/07/11 15:32, NeilBrown wrote:
> >
> >Hi Greg,
> > I saw this patch float past and thought of you... You may not be interested
> > any more, and it may be a perfectly good patch that does not need any
> > comment, but I thought I would let you know anyway.
>
> Thanks Neil.
>
> I've trimmed the cc list to limit the number of copies Trond and Bruce get:)
I appreciate the thought, but in future please don't trim cc's.
--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Fw: [PATCH] sunrpc: use better NUMA affinities
From: J. Bruce Fields @ 2011-07-29 16:45 UTC (permalink / raw)
To: Greg Banks
Cc: Eric Dumazet, NeilBrown, linux-nfs, David Miller, linux-kernel,
netdev
In-Reply-To: <4E3258E1.6020000@fastmail.fm>
On Fri, Jul 29, 2011 at 04:53:21PM +1000, Greg Banks wrote:
> On 29/07/11 16:30, Eric Dumazet wrote:
> >Le vendredi 29 juillet 2011 à 16:05 +1000, Greg Banks a écrit :
> >>On 29/07/11 15:32, NeilBrown wrote:
> >>
> >>I seem to remember coming to the conclusion that Jeff eventually
> >>addressed this problem...am I misremembering or did something regress?
> >>
> >Currently, all nfsd kthreads use memory for their kernel stack and
> >various initial data from a _single_ node, even if you use
> >sunrpc.pool_mode=pernode (or percpu)
>
> That's just plain broken and I'm very pleased to see you fix it.
Should I take that as a "Reviewed-by"?
> I was just surprised that it was still broken and wondering how that
> happened. Looking at ToT I see that because I dropped the ball in
> 2008, Jeff's patches didn't address the problem. In ToT
> svc_pool_map_set_cpumask() is called *after* kthread_create() and
> applies to the child thread, *after* it's stack has been allocated
> on the wrong node. In the working SGI code,
> svc_pool_map_set_cpumask() is called by the parent node on itself
> *before* calling kernel_thread() or doing any of the data structure
> allocations, thus ensuring that everything gets allocated using the
> default memory allocation policy, which on SGI NFS servers was
> globally tuned to be "node-local".
OK, so would it be enough to just move the svc_pool_map_set_cpumask()
back a few lines, or do we want Eric's approach, in order to have
something that will work better with other memory allocation policies?
--b.
>
> >With my patch, we make sure each thread gets its stack from its local
> >node.
> >
> >Check commit 94dcf29a11b3d20a (kthread: use kthread_create_on_node()) to
> >see how this strategy already was adopted for ksoftirqd, kworker,
> >migration, and pktgend kthreads.
>
> Ah, I see. It's unfortunate that the kthread_create() API ends up
> being passed a CPU number but that's only used to format the name
> and not for sensible things :(
>
> --
> Greg.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] sunrpc: use better NUMA affinities
From: J. Bruce Fields @ 2011-07-29 16:42 UTC (permalink / raw)
To: Eric Dumazet
Cc: Trond Myklebust, Neil Brown, David Miller,
linux-nfs-u79uwXL29TY76Z2rM5mHXA, netdev, linux-kernel,
Greg Banks
In-Reply-To: <1311876249.2346.39.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On Thu, Jul 28, 2011 at 08:04:09PM +0200, Eric Dumazet wrote:
> Use NUMA aware allocations to reduce latencies and increase throughput.
>
> sunrpc kthreads can use kthread_create_on_node() if pool_mode is
> "percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
> also take into account NUMA node affinity for memory allocations.
...
> @@ -662,14 +675,16 @@ svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
> nrservs--;
> chosen_pool = choose_pool(serv, pool, &state);
>
> - rqstp = svc_prepare_thread(serv, chosen_pool);
> + node = svc_pool_map_get_node(chosen_pool->sp_id);
> + rqstp = svc_prepare_thread(serv, chosen_pool, node);
The only correct value for the third argument there is
svc_pool_map_get_node(chosen_pool->sp_id), so let's have
svc_prepare_thread() call that itself.
Seems OK otherwise.
Any suggestions on how we should test this?
--b.
> if (IS_ERR(rqstp)) {
> error = PTR_ERR(rqstp);
> break;
> }
>
> __module_get(serv->sv_module);
> - task = kthread_create(serv->sv_function, rqstp, serv->sv_name);
> + task = kthread_create_on_node(serv->sv_function, rqstp,
> + node, serv->sv_name);
> if (IS_ERR(task)) {
> error = PTR_ERR(task);
> module_put(serv->sv_module);
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] net/smsc911x: add device tree probe support
From: Grant Likely @ 2011-07-29 16:26 UTC (permalink / raw)
To: Shawn Guo
Cc: Nicolas Pitre, patches, netdev, devicetree-discuss,
Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110729160343.GA2241@S2100-06.ap.freescale.net>
On Sat, Jul 30, 2011 at 12:03:44AM +0800, Shawn Guo wrote:
> On Fri, Jul 29, 2011 at 09:47:23AM -0600, Grant Likely wrote:
> > On Fri, Jul 29, 2011 at 10:36:26AM +0800, Shawn Guo wrote:
> > > Since I do not get any suggestion here, I'm going to follow the driver
> > > name to use '911' as the model number. Please tell me if there is any
> > > better one.
> >
> > What is the manufacturer part name? lan9111 or lan91c11? The
> > manufacturer's documented part number is almost always preferred.
> >
> I just posted the v2 of the patch, and chose to use 'smsc,lan9115'
> which is the first model supported in the driver. This is the same
> approach I used with i.mx device bindings.
You haven't answered the question.
g.
^ permalink raw reply
* Re: [PATCH v2] net/smsc911x: add device tree probe support
From: Shawn Guo @ 2011-07-29 16:16 UTC (permalink / raw)
To: Grant Likely
Cc: Shawn Guo, netdev, devicetree-discuss, linux-arm-kernel, patches,
Steve Glendinning, David S. Miller
In-Reply-To: <20110729155354.GD11164@ponder.secretlab.ca>
On Fri, Jul 29, 2011 at 09:53:54AM -0600, Grant Likely wrote:
> On Fri, Jul 29, 2011 at 04:43:16PM +0800, Shawn Guo wrote:
> > It adds device tree probe support for smsc911x driver.
> >
> > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > Cc: Grant Likely <grant.likely@secretlab.ca>
> > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > Cc: David S. Miller <davem@davemloft.net>
>
> Some comments below, and I asked a question on the older version about
> the actual model name vs. compatible, but otherwise it looks right and
> you can add my:
>
> Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
>
Thanks, Grant.
> > ---
> > Changes since v1:
> > * Instead of getting irq line from gpio number, it use irq domain
> > to keep platform_get_resource(IORESOURCE_IRQ) works for dt too.
> > * Use 'lan9115' the first model that smsc911x supports in the match
> > table
> > * Use reg-shift and reg-io-width which already used in of_serial for
> > shift and access size binding
> >
> > Documentation/devicetree/bindings/net/smsc911x.txt | 38 +++++++++
> > drivers/net/smsc911x.c | 85 +++++++++++++++++---
> > 2 files changed, 112 insertions(+), 11 deletions(-)
> > create mode 100644 Documentation/devicetree/bindings/net/smsc911x.txt
> >
> > diff --git a/Documentation/devicetree/bindings/net/smsc911x.txt b/Documentation/devicetree/bindings/net/smsc911x.txt
> > new file mode 100644
> > index 0000000..271c454
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/smsc911x.txt
> > @@ -0,0 +1,38 @@
> > +* Smart Mixed-Signal Connectivity (SMSC) LAN911x/912x Controller
> > +
> > +Required properties:
> > +- compatible : Should be "smsc,lan<model>", "smsc,lan9115"
> > +- reg : Address and length of the io space for SMSC LAN
> > +- interrupts : Should contain SMSC LAN interrupt line
> > +- interrupt-parent : Should be the phandle for the interrupt controller
> > + that services interrupts for this device
> > +- phy-mode : String, operation mode of the PHY interface.
> > + Supported values are: "mii", "gmii", "sgmii", "tbi", "rmii",
> > + "rgmii", "rgmii-id", "rgmii-rxid", "rgmii-txid", "rtbi", "smii".
> > +
> > +Optional properties:
> > +- reg-shift : Specify the quantity to shift the register offsets by
> > +- reg-io-width : Specify the size (in bytes) of the IO accesses that
> > + should be performed on the device. Valid value for SMSC LAN is
> > + 2 or 4. If it's omitted or invalid, the size would be 2.
> > +- smsc,irq-active-high : Indicates the IRQ polarity is active-low
>
> Which is it? Active high or active low? The property doesn't match
> the description.
>
Sorry. The description is wrong.
> > +- smsc,irq-push-pull : Indicates the IRQ type is push-pull
>
> What exactly does "push-pull" mean here?
>
I do not understand exactly. But I see the term used in driver was
originally from SMSC data sheet.
> > +- smsc,force-internal-phy : Forces SMSC LAN controller to use
> > + internal PHY
> > +- smsc,force-external-phy : Forces SMSC LAN controller to use
> > + external PHY
> > +- smsc,save-mac-address : Indicates that mac address needs to be saved
> > + before resetting the controller
> > +- local-mac-address : 6 bytes, mac address
> > +
> > +Examples:
> > +
> > +lan9220@f4000000 {
> > + compatible = "smsc,lan9220", "smsc,lan9115";
> > + reg = <0xf4000000 0x2000000>;
> > + phy-mode = "mii";
> > + interrupt-parent = <&gpio1>;
> > + interrupts = <31>;
> > + reg-io-width = <4>;
> > + smsc,irq-push-pull;
> > +};
> > diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c
> > index b9016a3..75c08a5 100644
> > --- a/drivers/net/smsc911x.c
> > +++ b/drivers/net/smsc911x.c
> > @@ -53,6 +53,10 @@
> > #include <linux/phy.h>
> > #include <linux/smsc911x.h>
> > #include <linux/device.h>
> > +#include <linux/of.h>
> > +#include <linux/of_device.h>
> > +#include <linux/of_gpio.h>
> > +#include <linux/of_net.h>
> > #include "smsc911x.h"
> >
> > #define SMSC_CHIPNAME "smsc911x"
> > @@ -2095,8 +2099,58 @@ static const struct smsc911x_ops shifted_smsc911x_ops = {
> > .tx_writefifo = smsc911x_tx_writefifo_shift,
> > };
> >
> > +#ifdef CONFIG_OF
> > +static int __devinit smsc911x_probe_config_dt(
> > + struct smsc911x_platform_config *config,
> > + struct device_node *np)
> > +{
> > + const char *mac;
> > + u32 width = 0;
> > +
> > + if (!np)
> > + return -ENODEV;
> > +
> > + config->phy_interface = of_get_phy_mode(np);
> > +
> > + mac = of_get_mac_address(np);
> > + if (mac)
> > + memcpy(config->mac, mac, ETH_ALEN);
> > +
> > + of_property_read_u32(np, "reg-shift", &config->shift);
> > +
> > + of_property_read_u32(np, "reg-io-width", &width);
> > + if (width == 4)
> > + config->flags |= SMSC911X_USE_32BIT;
> > +
> > + if (of_get_property(np, "smsc,irq-active-high", NULL))
> > + config->irq_polarity = SMSC911X_IRQ_POLARITY_ACTIVE_HIGH;
> > +
> > + if (of_get_property(np, "smsc,irq-push-pull", NULL))
> > + config->irq_type = SMSC911X_IRQ_TYPE_PUSH_PULL;
> > +
> > + if (of_get_property(np, "smsc,force-internal-phy", NULL))
> > + config->flags |= SMSC911X_FORCE_INTERNAL_PHY;
> > +
> > + if (of_get_property(np, "smsc,force-external-phy", NULL))
> > + config->flags |= SMSC911X_FORCE_EXTERNAL_PHY;
> > +
> > + if (of_get_property(np, "smsc,save-mac-address", NULL))
> > + config->flags |= SMSC911X_SAVE_MAC_ADDRESS;
> > +
> > + return 0;
> > +}
> > +#else
> > +static inline int smsc911x_probe_config_dt(
> > + struct smsc911x_platform_config *config,
> > + struct device_node *np)
> > +{
> > + return -ENODEV;
> > +}
> > +#endif /* CONFIG_OF */
> > +
> > static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> > {
> > + struct device_node *np = pdev->dev.of_node;
> > struct net_device *dev;
> > struct smsc911x_data *pdata;
> > struct smsc911x_platform_config *config = pdev->dev.platform_data;
> > @@ -2107,13 +2161,6 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> >
> > pr_info("Driver version %s\n", SMSC_DRV_VERSION);
> >
> > - /* platform data specifies irq & dynamic bus configuration */
> > - if (!pdev->dev.platform_data) {
> > - pr_warn("platform_data not provided\n");
> > - retval = -ENODEV;
> > - goto out_0;
> > - }
> > -
> > res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
> > "smsc911x-memory");
> > if (!res)
> > @@ -2152,9 +2199,6 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> > irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
> > pdata->ioaddr = ioremap_nocache(res->start, res_size);
> >
> > - /* copy config parameters across to pdata */
> > - memcpy(&pdata->config, config, sizeof(pdata->config));
> > -
> > pdata->dev = dev;
> > pdata->msg_enable = ((1 << debug) - 1);
> >
> > @@ -2164,10 +2208,22 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> > goto out_free_netdev_2;
> > }
> >
> > + retval = smsc911x_probe_config_dt(&pdata->config, np);
> > + if (retval && config) {
> > + /* copy config parameters across to pdata */
> > + memcpy(&pdata->config, config, sizeof(pdata->config));
>
> The following will do the same but is type-safe:
> pdata->config = *config;
>
I choose to keep the original taste :)
> > + retval = 0;
> > + }
> > +
> > + if (retval) {
> > + SMSC_WARN(pdata, probe, "Error smsc911x config not found");
> > + goto out_unmap_io_3;
> > + }
> > +
> > /* assume standard, non-shifted, access to HW registers */
> > pdata->ops = &standard_smsc911x_ops;
> > /* apply the right access if shifting is needed */
> > - if (config->shift)
> > + if (pdata->config.shift)
> > pdata->ops = &shifted_smsc911x_ops;
> >
> > retval = smsc911x_init(dev);
> > @@ -2314,6 +2370,12 @@ static const struct dev_pm_ops smsc911x_pm_ops = {
> > #define SMSC911X_PM_OPS NULL
> > #endif
> >
> > +static const struct of_device_id smsc911x_dt_ids[] = {
> > + { .compatible = "smsc,lan9115", },
> > + { /* sentinel */ }
> > +};
> > +MODULE_DEVICE_TABLE(of, smsc911x_dt_ids);
> > +
> > static struct platform_driver smsc911x_driver = {
> > .probe = smsc911x_drv_probe,
> > .remove = __devexit_p(smsc911x_drv_remove),
> > @@ -2321,6 +2383,7 @@ static struct platform_driver smsc911x_driver = {
> > .name = SMSC_CHIPNAME,
> > .owner = THIS_MODULE,
> > .pm = SMSC911X_PM_OPS,
> > + .of_match_table = smsc911x_dt_ids,
> > },
> > };
> >
> > --
> > 1.7.4.1
> >
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
Regards,
Shawn
^ permalink raw reply
* Re: [patch net-next-2.6] dummy: allow report link status and change it via sysfs
From: Ben Hutchings @ 2011-07-29 16:03 UTC (permalink / raw)
To: Jiri Pirko; +Cc: netdev, davem, eric.dumazet
In-Reply-To: <1311953253-25059-1-git-send-email-jpirko@redhat.com>
On Fri, 2011-07-29 at 17:27 +0200, Jiri Pirko wrote:
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
> ---
> drivers/net/dummy.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 54 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
> index 39cf9b9..fc39c24 100644
> --- a/drivers/net/dummy.c
> +++ b/drivers/net/dummy.c
> @@ -37,9 +37,57 @@
> #include <linux/rtnetlink.h>
> #include <net/rtnetlink.h>
> #include <linux/u64_stats_sync.h>
> +#include <linux/ethtool.h>
>
> static int numdummies = 1;
>
> +static ssize_t dummy_show_link(struct device *d,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct net_device *dev = to_net_dev(d);
> +
> + return sprintf(buf, "%d\n", netif_carrier_ok(dev) ? 1 : 0);
> +}
[...]
Net devices already have the 'carrier' attribute. You should make that
attribute writable for dummy devices, rather than adding another one.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH] net/smsc911x: add device tree probe support
From: Shawn Guo @ 2011-07-29 16:03 UTC (permalink / raw)
To: Grant Likely
Cc: Nicolas Pitre, patches, netdev, devicetree-discuss,
Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110729154723.GC11164@ponder.secretlab.ca>
On Fri, Jul 29, 2011 at 09:47:23AM -0600, Grant Likely wrote:
> On Fri, Jul 29, 2011 at 10:36:26AM +0800, Shawn Guo wrote:
> > On Mon, Jul 25, 2011 at 10:28:05PM -0400, Nicolas Pitre wrote:
> > > On Tue, 26 Jul 2011, Shawn Guo wrote:
> > >
> > > > On Mon, Jul 25, 2011 at 09:16:40PM -0400, Nicolas Pitre wrote:
> > > > > On Tue, 26 Jul 2011, Shawn Guo wrote:
> > > > >
> > > > > > On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> > > > > > > On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > > > > > > > It adds device tree probe support for smsc911x driver.
> > > > > > > >
> > > > > > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > > > > > > > Cc: Grant Likely <grant.likely@secretlab.ca>
> > > > > > > > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > > > > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > > > > ---
> > > > > > > > Documentation/devicetree/bindings/net/smsc.txt | 34 +++++++
> > > > > > > > drivers/net/smsc911x.c | 123 +++++++++++++++++++-----
> > > > > > > > 2 files changed, 132 insertions(+), 25 deletions(-)
> > > > > > > > create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > > > > > > >
> > > > > > > > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > > > > new file mode 100644
> > > > > > > > index 0000000..1920695
> > > > > > > > --- /dev/null
> > > > > > > > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > > > > @@ -0,0 +1,34 @@
> > > > > > > > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > > > > > > > +
> > > > > > > > +Required properties:
> > > > > > > > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> > > > > > >
> > > > > > > Drop "smsc,lan". That's far too generic.
> > > > > > >
> > > > > > The following devices are supported by the driver.
> > > > > >
> > > > > > LAN9115, LAN9116, LAN9117, LAN9118
> > > > > > LAN9215, LAN9216, LAN9217, LAN9218
> > > > > > LAN9210, LAN9211
> > > > > > LAN9220, LAN9221
> > > > > >
> > > > > > If we only keep specific <model> as the compatible, we will have a
> > > > > > long match table which is actually used nowhere to distinguish the
> > > > > > device.
> > > > > >
> > > > > > So we need some level generic compatible to save the meaningless
> > > > > > long match table. What about:
> > > > > >
> > > > > > static const struct of_device_id smsc_dt_ids[] = {
> > > > > > { .compatible = "smsc,lan9", },
> > > > > > { /* sentinel */ }
> > > > > > };
> > > > > >
> > > > > > Or:
> > > > > >
> > > > > > static const struct of_device_id smsc_dt_ids[] = {
> > > > > > { .compatible = "smsc,lan91", },
> > > > > > { .compatible = "smsc,lan92", },
> > > > > > { /* sentinel */ }
> > > > > > };
> > > > >
> > > > > None of this unambiguously distinguish the devices supported by this
> > > > > driver and the smc91x driver which supports LAN91C92, LAN91C94,
> > > > > LAN91C95, LAN91C96, LAN91C100, LAN91C110.
> > > > >
> > > > So you suggest to make a long list to explicitly tell the device type
> > > > that the driver supports?
> > >
> > > I'm not suggesting anything. :-) I'm merely pointing out that the
> > > above .compatible = "smsc,lan9" or .compatible = "smsc,lan91" are too
> > > generic given that there is another driver with different devices to
> > > which they could also apply.
> > >
> > Since I do not get any suggestion here, I'm going to follow the driver
> > name to use '911' as the model number. Please tell me if there is any
> > better one.
>
> What is the manufacturer part name? lan9111 or lan91c11? The
> manufacturer's documented part number is almost always preferred.
>
I just posted the v2 of the patch, and chose to use 'smsc,lan9115'
which is the first model supported in the driver. This is the same
approach I used with i.mx device bindings.
--
Regards,
Shawn
^ permalink raw reply
* Re: [PATCH v2] net/smsc911x: add device tree probe support
From: Grant Likely @ 2011-07-29 15:53 UTC (permalink / raw)
To: Shawn Guo
Cc: netdev, devicetree-discuss, linux-arm-kernel, patches,
Steve Glendinning, David S. Miller
In-Reply-To: <1311928996-10729-1-git-send-email-shawn.guo@linaro.org>
On Fri, Jul 29, 2011 at 04:43:16PM +0800, Shawn Guo wrote:
> It adds device tree probe support for smsc911x driver.
>
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> Cc: Grant Likely <grant.likely@secretlab.ca>
> Cc: Steve Glendinning <steve.glendinning@smsc.com>
> Cc: David S. Miller <davem@davemloft.net>
Some comments below, and I asked a question on the older version about
the actual model name vs. compatible, but otherwise it looks right and
you can add my:
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
> ---
> Changes since v1:
> * Instead of getting irq line from gpio number, it use irq domain
> to keep platform_get_resource(IORESOURCE_IRQ) works for dt too.
> * Use 'lan9115' the first model that smsc911x supports in the match
> table
> * Use reg-shift and reg-io-width which already used in of_serial for
> shift and access size binding
>
> Documentation/devicetree/bindings/net/smsc911x.txt | 38 +++++++++
> drivers/net/smsc911x.c | 85 +++++++++++++++++---
> 2 files changed, 112 insertions(+), 11 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/net/smsc911x.txt
>
> diff --git a/Documentation/devicetree/bindings/net/smsc911x.txt b/Documentation/devicetree/bindings/net/smsc911x.txt
> new file mode 100644
> index 0000000..271c454
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/smsc911x.txt
> @@ -0,0 +1,38 @@
> +* Smart Mixed-Signal Connectivity (SMSC) LAN911x/912x Controller
> +
> +Required properties:
> +- compatible : Should be "smsc,lan<model>", "smsc,lan9115"
> +- reg : Address and length of the io space for SMSC LAN
> +- interrupts : Should contain SMSC LAN interrupt line
> +- interrupt-parent : Should be the phandle for the interrupt controller
> + that services interrupts for this device
> +- phy-mode : String, operation mode of the PHY interface.
> + Supported values are: "mii", "gmii", "sgmii", "tbi", "rmii",
> + "rgmii", "rgmii-id", "rgmii-rxid", "rgmii-txid", "rtbi", "smii".
> +
> +Optional properties:
> +- reg-shift : Specify the quantity to shift the register offsets by
> +- reg-io-width : Specify the size (in bytes) of the IO accesses that
> + should be performed on the device. Valid value for SMSC LAN is
> + 2 or 4. If it's omitted or invalid, the size would be 2.
> +- smsc,irq-active-high : Indicates the IRQ polarity is active-low
Which is it? Active high or active low? The property doesn't match
the description.
> +- smsc,irq-push-pull : Indicates the IRQ type is push-pull
What exactly does "push-pull" mean here?
> +- smsc,force-internal-phy : Forces SMSC LAN controller to use
> + internal PHY
> +- smsc,force-external-phy : Forces SMSC LAN controller to use
> + external PHY
> +- smsc,save-mac-address : Indicates that mac address needs to be saved
> + before resetting the controller
> +- local-mac-address : 6 bytes, mac address
> +
> +Examples:
> +
> +lan9220@f4000000 {
> + compatible = "smsc,lan9220", "smsc,lan9115";
> + reg = <0xf4000000 0x2000000>;
> + phy-mode = "mii";
> + interrupt-parent = <&gpio1>;
> + interrupts = <31>;
> + reg-io-width = <4>;
> + smsc,irq-push-pull;
> +};
> diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c
> index b9016a3..75c08a5 100644
> --- a/drivers/net/smsc911x.c
> +++ b/drivers/net/smsc911x.c
> @@ -53,6 +53,10 @@
> #include <linux/phy.h>
> #include <linux/smsc911x.h>
> #include <linux/device.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/of_gpio.h>
> +#include <linux/of_net.h>
> #include "smsc911x.h"
>
> #define SMSC_CHIPNAME "smsc911x"
> @@ -2095,8 +2099,58 @@ static const struct smsc911x_ops shifted_smsc911x_ops = {
> .tx_writefifo = smsc911x_tx_writefifo_shift,
> };
>
> +#ifdef CONFIG_OF
> +static int __devinit smsc911x_probe_config_dt(
> + struct smsc911x_platform_config *config,
> + struct device_node *np)
> +{
> + const char *mac;
> + u32 width = 0;
> +
> + if (!np)
> + return -ENODEV;
> +
> + config->phy_interface = of_get_phy_mode(np);
> +
> + mac = of_get_mac_address(np);
> + if (mac)
> + memcpy(config->mac, mac, ETH_ALEN);
> +
> + of_property_read_u32(np, "reg-shift", &config->shift);
> +
> + of_property_read_u32(np, "reg-io-width", &width);
> + if (width == 4)
> + config->flags |= SMSC911X_USE_32BIT;
> +
> + if (of_get_property(np, "smsc,irq-active-high", NULL))
> + config->irq_polarity = SMSC911X_IRQ_POLARITY_ACTIVE_HIGH;
> +
> + if (of_get_property(np, "smsc,irq-push-pull", NULL))
> + config->irq_type = SMSC911X_IRQ_TYPE_PUSH_PULL;
> +
> + if (of_get_property(np, "smsc,force-internal-phy", NULL))
> + config->flags |= SMSC911X_FORCE_INTERNAL_PHY;
> +
> + if (of_get_property(np, "smsc,force-external-phy", NULL))
> + config->flags |= SMSC911X_FORCE_EXTERNAL_PHY;
> +
> + if (of_get_property(np, "smsc,save-mac-address", NULL))
> + config->flags |= SMSC911X_SAVE_MAC_ADDRESS;
> +
> + return 0;
> +}
> +#else
> +static inline int smsc911x_probe_config_dt(
> + struct smsc911x_platform_config *config,
> + struct device_node *np)
> +{
> + return -ENODEV;
> +}
> +#endif /* CONFIG_OF */
> +
> static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> {
> + struct device_node *np = pdev->dev.of_node;
> struct net_device *dev;
> struct smsc911x_data *pdata;
> struct smsc911x_platform_config *config = pdev->dev.platform_data;
> @@ -2107,13 +2161,6 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
>
> pr_info("Driver version %s\n", SMSC_DRV_VERSION);
>
> - /* platform data specifies irq & dynamic bus configuration */
> - if (!pdev->dev.platform_data) {
> - pr_warn("platform_data not provided\n");
> - retval = -ENODEV;
> - goto out_0;
> - }
> -
> res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
> "smsc911x-memory");
> if (!res)
> @@ -2152,9 +2199,6 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
> pdata->ioaddr = ioremap_nocache(res->start, res_size);
>
> - /* copy config parameters across to pdata */
> - memcpy(&pdata->config, config, sizeof(pdata->config));
> -
> pdata->dev = dev;
> pdata->msg_enable = ((1 << debug) - 1);
>
> @@ -2164,10 +2208,22 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> goto out_free_netdev_2;
> }
>
> + retval = smsc911x_probe_config_dt(&pdata->config, np);
> + if (retval && config) {
> + /* copy config parameters across to pdata */
> + memcpy(&pdata->config, config, sizeof(pdata->config));
The following will do the same but is type-safe:
pdata->config = *config;
> + retval = 0;
> + }
> +
> + if (retval) {
> + SMSC_WARN(pdata, probe, "Error smsc911x config not found");
> + goto out_unmap_io_3;
> + }
> +
> /* assume standard, non-shifted, access to HW registers */
> pdata->ops = &standard_smsc911x_ops;
> /* apply the right access if shifting is needed */
> - if (config->shift)
> + if (pdata->config.shift)
> pdata->ops = &shifted_smsc911x_ops;
>
> retval = smsc911x_init(dev);
> @@ -2314,6 +2370,12 @@ static const struct dev_pm_ops smsc911x_pm_ops = {
> #define SMSC911X_PM_OPS NULL
> #endif
>
> +static const struct of_device_id smsc911x_dt_ids[] = {
> + { .compatible = "smsc,lan9115", },
> + { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, smsc911x_dt_ids);
> +
> static struct platform_driver smsc911x_driver = {
> .probe = smsc911x_drv_probe,
> .remove = __devexit_p(smsc911x_drv_remove),
> @@ -2321,6 +2383,7 @@ static struct platform_driver smsc911x_driver = {
> .name = SMSC_CHIPNAME,
> .owner = THIS_MODULE,
> .pm = SMSC911X_PM_OPS,
> + .of_match_table = smsc911x_dt_ids,
> },
> };
>
> --
> 1.7.4.1
>
^ permalink raw reply
* Re: [PATCH] net/smsc911x: add device tree probe support
From: Grant Likely @ 2011-07-29 15:47 UTC (permalink / raw)
To: Shawn Guo
Cc: Nicolas Pitre, patches, netdev, devicetree-discuss,
Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110729023626.GB1946@S2100-06.ap.freescale.net>
On Fri, Jul 29, 2011 at 10:36:26AM +0800, Shawn Guo wrote:
> On Mon, Jul 25, 2011 at 10:28:05PM -0400, Nicolas Pitre wrote:
> > On Tue, 26 Jul 2011, Shawn Guo wrote:
> >
> > > On Mon, Jul 25, 2011 at 09:16:40PM -0400, Nicolas Pitre wrote:
> > > > On Tue, 26 Jul 2011, Shawn Guo wrote:
> > > >
> > > > > On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> > > > > > On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > > > > > > It adds device tree probe support for smsc911x driver.
> > > > > > >
> > > > > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > > > > > > Cc: Grant Likely <grant.likely@secretlab.ca>
> > > > > > > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > > > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > > > ---
> > > > > > > Documentation/devicetree/bindings/net/smsc.txt | 34 +++++++
> > > > > > > drivers/net/smsc911x.c | 123 +++++++++++++++++++-----
> > > > > > > 2 files changed, 132 insertions(+), 25 deletions(-)
> > > > > > > create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > > > > > >
> > > > > > > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > > > new file mode 100644
> > > > > > > index 0000000..1920695
> > > > > > > --- /dev/null
> > > > > > > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > > > @@ -0,0 +1,34 @@
> > > > > > > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > > > > > > +
> > > > > > > +Required properties:
> > > > > > > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> > > > > >
> > > > > > Drop "smsc,lan". That's far too generic.
> > > > > >
> > > > > The following devices are supported by the driver.
> > > > >
> > > > > LAN9115, LAN9116, LAN9117, LAN9118
> > > > > LAN9215, LAN9216, LAN9217, LAN9218
> > > > > LAN9210, LAN9211
> > > > > LAN9220, LAN9221
> > > > >
> > > > > If we only keep specific <model> as the compatible, we will have a
> > > > > long match table which is actually used nowhere to distinguish the
> > > > > device.
> > > > >
> > > > > So we need some level generic compatible to save the meaningless
> > > > > long match table. What about:
> > > > >
> > > > > static const struct of_device_id smsc_dt_ids[] = {
> > > > > { .compatible = "smsc,lan9", },
> > > > > { /* sentinel */ }
> > > > > };
> > > > >
> > > > > Or:
> > > > >
> > > > > static const struct of_device_id smsc_dt_ids[] = {
> > > > > { .compatible = "smsc,lan91", },
> > > > > { .compatible = "smsc,lan92", },
> > > > > { /* sentinel */ }
> > > > > };
> > > >
> > > > None of this unambiguously distinguish the devices supported by this
> > > > driver and the smc91x driver which supports LAN91C92, LAN91C94,
> > > > LAN91C95, LAN91C96, LAN91C100, LAN91C110.
> > > >
> > > So you suggest to make a long list to explicitly tell the device type
> > > that the driver supports?
> >
> > I'm not suggesting anything. :-) I'm merely pointing out that the
> > above .compatible = "smsc,lan9" or .compatible = "smsc,lan91" are too
> > generic given that there is another driver with different devices to
> > which they could also apply.
> >
> Since I do not get any suggestion here, I'm going to follow the driver
> name to use '911' as the model number. Please tell me if there is any
> better one.
What is the manufacturer part name? lan9111 or lan91c11? The
manufacturer's documented part number is almost always preferred.
g.
^ permalink raw reply
* Re: [PATCH] net/smsc911x: add device tree probe support
From: Grant Likely @ 2011-07-29 15:45 UTC (permalink / raw)
To: Shawn Guo
Cc: Nicolas Pitre, patches, netdev, devicetree-discuss,
Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110726013026.GH21641@S2100-06.ap.freescale.net>
On Tue, Jul 26, 2011 at 09:30:26AM +0800, Shawn Guo wrote:
> On Mon, Jul 25, 2011 at 09:16:40PM -0400, Nicolas Pitre wrote:
> > On Tue, 26 Jul 2011, Shawn Guo wrote:
> >
> > > On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> > > > On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > > > > It adds device tree probe support for smsc911x driver.
> > > > >
> > > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > > > > Cc: Grant Likely <grant.likely@secretlab.ca>
> > > > > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > ---
> > > > > Documentation/devicetree/bindings/net/smsc.txt | 34 +++++++
> > > > > drivers/net/smsc911x.c | 123 +++++++++++++++++++-----
> > > > > 2 files changed, 132 insertions(+), 25 deletions(-)
> > > > > create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > > > >
> > > > > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > new file mode 100644
> > > > > index 0000000..1920695
> > > > > --- /dev/null
> > > > > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > @@ -0,0 +1,34 @@
> > > > > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > > > > +
> > > > > +Required properties:
> > > > > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> > > >
> > > > Drop "smsc,lan". That's far too generic.
> > > >
> > > The following devices are supported by the driver.
> > >
> > > LAN9115, LAN9116, LAN9117, LAN9118
> > > LAN9215, LAN9216, LAN9217, LAN9218
> > > LAN9210, LAN9211
> > > LAN9220, LAN9221
> > >
> > > If we only keep specific <model> as the compatible, we will have a
> > > long match table which is actually used nowhere to distinguish the
> > > device.
> > >
> > > So we need some level generic compatible to save the meaningless
> > > long match table. What about:
> > >
> > > static const struct of_device_id smsc_dt_ids[] = {
> > > { .compatible = "smsc,lan9", },
> > > { /* sentinel */ }
> > > };
> > >
> > > Or:
> > >
> > > static const struct of_device_id smsc_dt_ids[] = {
> > > { .compatible = "smsc,lan91", },
> > > { .compatible = "smsc,lan92", },
> > > { /* sentinel */ }
> > > };
> >
> > None of this unambiguously distinguish the devices supported by this
> > driver and the smc91x driver which supports LAN91C92, LAN91C94,
> > LAN91C95, LAN91C96, LAN91C100, LAN91C110.
> >
> So you suggest to make a long list to explicitly tell the device type
> that the driver supports?
If newer devices are 100% backwards compatible with an older device,
then the newer device doesn't need to appear in this list because the
device node can claim compatibility.
If it isn't backwards compatible, then you do need an entry in this
list.
g.
^ permalink raw reply
* Re: PROBLEM: BUG (NULL ptr dereference in ipv4_dst_check)
From: Eric Dumazet @ 2011-07-29 15:43 UTC (permalink / raw)
To: David Miller; +Cc: synapse, netdev
In-Reply-To: <20110729.081902.300678107767426313.davem@davemloft.net>
Le vendredi 29 juillet 2011 à 08:19 -0700, David Miller a écrit :
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 29 Jul 2011 17:11:46 +0200
>
> > Thats tricky, because I am not sure we dont need RCU protection since we
> > can now exchange dst neighbour on the fly.
> >
> > Following patch would only reduce the window of bug, not a complete
> > fix...
> >
> > David, any opinion on this ?
>
> Indeed, old code worked because we invalidated entire route cache
> entry, and we never before ran arp_bind_neighbour() except on new
> route cache entires before they become globally visible.
>
> I think when we change an existing neigh we will need to release old
> neigh via RCU, at a minimum.
Oh well, we already use RCU in neigh_destroy(), so adding rcu would need
to change all dst_get_neighbour() callers to be in one rcu_read_lock()
section.
I'll take a look, I suspect its mostly already done.
^ permalink raw reply
* [patch net-next-2.6] dummy: allow report link status and change it via sysfs
From: Jiri Pirko @ 2011-07-29 15:27 UTC (permalink / raw)
To: netdev; +Cc: davem, eric.dumazet
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
---
drivers/net/dummy.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 54 insertions(+), 0 deletions(-)
diff --git a/drivers/net/dummy.c b/drivers/net/dummy.c
index 39cf9b9..fc39c24 100644
--- a/drivers/net/dummy.c
+++ b/drivers/net/dummy.c
@@ -37,9 +37,57 @@
#include <linux/rtnetlink.h>
#include <net/rtnetlink.h>
#include <linux/u64_stats_sync.h>
+#include <linux/ethtool.h>
static int numdummies = 1;
+static ssize_t dummy_show_link(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct net_device *dev = to_net_dev(d);
+
+ return sprintf(buf, "%d\n", netif_carrier_ok(dev) ? 1 : 0);
+}
+
+static ssize_t dummy_store_link(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct net_device *dev = to_net_dev(d);
+ int new_value;
+
+ if (sscanf(buf, "%d", &new_value) != 1) {
+ pr_err("%s: no link value specified.\n", dev->name);
+ return -EINVAL;
+ }
+ switch (new_value) {
+ case 0:
+ netif_carrier_off(dev);
+ break;
+ case 1:
+ netif_carrier_on(dev);
+ break;
+ default:
+ pr_info("%s: Ignoring invalid link value %d.\n",
+ dev->name, new_value);
+ }
+ return count;
+}
+
+static DEVICE_ATTR(link, S_IRUGO | S_IWUSR,
+ dummy_show_link, dummy_store_link);
+
+static struct attribute *per_dummy_attrs[] = {
+ &dev_attr_link.attr,
+ NULL,
+};
+
+static struct attribute_group dummy_group = {
+ .name = "dummy",
+ .attrs = per_dummy_attrs,
+};
+
static int dummy_set_address(struct net_device *dev, void *p)
{
struct sockaddr *sa = p;
@@ -103,6 +151,7 @@ static int dummy_dev_init(struct net_device *dev)
if (!dev->dstats)
return -ENOMEM;
+ dev->sysfs_groups[0] = &dummy_group;
return 0;
}
@@ -121,12 +170,17 @@ static const struct net_device_ops dummy_netdev_ops = {
.ndo_get_stats64 = dummy_get_stats64,
};
+static const struct ethtool_ops dummy_ethtool_ops = {
+ .get_link = ethtool_op_get_link,
+};
+
static void dummy_setup(struct net_device *dev)
{
ether_setup(dev);
/* Initialize the device structure. */
dev->netdev_ops = &dummy_netdev_ops;
+ dev->ethtool_ops = &dummy_ethtool_ops;
dev->destructor = dummy_dev_free;
/* Fill in device structure with ethernet-generic values. */
--
1.7.6
^ permalink raw reply related
* Re: PROBLEM: BUG (NULL ptr dereference in ipv4_dst_check)
From: David Miller @ 2011-07-29 15:19 UTC (permalink / raw)
To: eric.dumazet; +Cc: synapse, netdev
In-Reply-To: <1311952306.2843.27.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 29 Jul 2011 17:11:46 +0200
> Thats tricky, because I am not sure we dont need RCU protection since we
> can now exchange dst neighbour on the fly.
>
> Following patch would only reduce the window of bug, not a complete
> fix...
>
> David, any opinion on this ?
Indeed, old code worked because we invalidated entire route cache
entry, and we never before ran arp_bind_neighbour() except on new
route cache entires before they become globally visible.
I think when we change an existing neigh we will need to release old
neigh via RCU, at a minimum.
^ permalink raw reply
* Re: PROBLEM: BUG (NULL ptr dereference in ipv4_dst_check)
From: Eric Dumazet @ 2011-07-29 15:11 UTC (permalink / raw)
To: synapse; +Cc: David Miller, netdev
In-Reply-To: <4E32C76B.5010700@hippy.csoma.elte.hu>
Le vendredi 29 juillet 2011 à 16:44 +0200, synapse a écrit :
> On 07/29/11 16:41, Eric Dumazet wrote:
> > Le vendredi 29 juillet 2011 à 07:39 -0700, David Miller a écrit :
> >> From: Eric Dumazet<eric.dumazet@gmail.com>
> >> Date: Fri, 29 Jul 2011 16:36:24 +0200
> >>
> >>> Hmm, I'll take a look, but check_peer_redir() seems suspicious at first
> >>> glance.
> >> I take full responsibility for any bugs you discover in it :-)
> >
> > ;)
> >
> > Bug origin is commit f39925dbde7788cfb96419c0f092b086aa325c0f
> > ipv4: Cache learned redirect information in inetpeer.
> >
> > I'll cook a patch ASAP
> >
> wow that was QUICK :)
>
> When you're done I'll check it ASAP
>
> Gergely Kalman
Thats tricky, because I am not sure we dont need RCU protection since we
can now exchange dst neighbour on the fly.
Following patch would only reduce the window of bug, not a complete
fix...
David, any opinion on this ?
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 1730689..6afc4eb 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1628,16 +1628,18 @@ static int check_peer_redir(struct dst_entry *dst, struct inet_peer *peer)
{
struct rtable *rt = (struct rtable *) dst;
__be32 orig_gw = rt->rt_gateway;
- struct neighbour *n;
+ struct neighbour *n, *old_n;
dst_confirm(&rt->dst);
- neigh_release(dst_get_neighbour(&rt->dst));
- dst_set_neighbour(&rt->dst, NULL);
-
rt->rt_gateway = peer->redirect_learned.a4;
- rt_bind_neighbour(rt);
- n = dst_get_neighbour(&rt->dst);
+
+ n = ipv4_neigh_lookup(&rt->dst, &rt->rt_gateway);
+ if (IS_ERR(n))
+ return PTR_ERR(n);
+ old_n = xchg(&rt->dst._neighbour, n);
+ if (old_n)
+ neigh_release(old_n);
if (!n || !(n->nud_state & NUD_VALID)) {
if (n)
neigh_event_send(n, NULL);
^ permalink raw reply related
* Re: Realtek 8139 Flow Control?
From: Sven Anders @ 2011-07-29 14:59 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
In-Reply-To: <20110728063526.GA11214@electric-eye.fr.zoreil.com>
[-- Attachment #1: Type: text/plain, Size: 2280 bytes --]
Francois Romieu schrieb:
> Sven Anders <anders@anduras.de> :
>> Sven Anders wrote:
> [...]
>>> We have appliances with Realtek 8139C/8139C+ (rev 10) chipsets (with the
>>> PCI IDs: 10ec:8139). We are using the 8139too driver (version: 0.9.28)
>
> The PCI revision ID is not incompatible with a 8139c(l)+. If so the 8139cp
> driver may prove easier to handle at the driver programming level. It's
> almost surely less commonly used though.
>
> I'd suggest to use both PCI information and TxConfig content to identify
> Realtek's chipset. ethtool should provide the latter.
>
>>> According the datasheet the chipset supports Flow Control (IEEE 802.3x).
>>>
>>> I want to know, if the support for enabling flow control is missing in
>>> the driver by purpose or is only not implemented due to lack of time?
> [...]
>> Can somebody of the old implementors please answer this question?
>
> Hint ? :o)
>
> This hardware is not exactly fun, especially if you do not own a c+ model
> (4 Tx descriptors, a single copy-only receive buffer, really cheap...).
>
>> We need that feature and I'm willing to implement it, but I need the
>> confirmation, that it was not done due to lack of time and not because
>> will not work (correctly)...
>
> If you do not get an answer and the relevant bit is already set in the
> eeprom, you should be able to make your own mind shortly. The comment
> in the datasheet rightfully reminds that both the NIC and the peer
> networking gear need to handle flow control correctly.
Thanks for the long and only answer (so far)!
But I still need the following statement before I implement it:
[ ] It was not implemented, because I will not work!
[ ] It was not implemented due to lack of time. It should work.
I want to know this, because I'm not eager to waste my time...
Regards
Sven Anders
--
Sven Anders <anders@anduras.de> () UTF-8 Ribbon Campaign
/\ Support plain text e-mail
ANDURAS intranet security AG
Messestrasse 3 - 94036 Passau - Germany
Web: www.anduras.de - Tel: +49 (0)851-4 90 50-0 - Fax: +49 (0)851-4 90 50-55
Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
- Benjamin Franklin
[-- Attachment #2: anders.vcf --]
[-- Type: text/x-vcard, Size: 339 bytes --]
begin:vcard
fn:Sven Anders
n:Anders;Sven
org:ANDURAS AG;Research and Development
adr;quoted-printable:;;Messestra=C3=9Fe 3;Passau;Bavaria;94036;Germany
email;internet:anders@anduras.de
title:Dipl. Inf.
tel;work:++49 (0)851 / 490 50 -0
tel;fax:++49 (0)851 / 590 50 - 55
x-mozilla-html:FALSE
url:http://www.anduras.de
version:2.1
end:vcard
^ permalink raw reply
* Re: PROBLEM: BUG (NULL ptr dereference in ipv4_dst_check)
From: synapse @ 2011-07-29 14:44 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1311950488.2843.23.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On 07/29/11 16:41, Eric Dumazet wrote:
> Le vendredi 29 juillet 2011 à 07:39 -0700, David Miller a écrit :
>> From: Eric Dumazet<eric.dumazet@gmail.com>
>> Date: Fri, 29 Jul 2011 16:36:24 +0200
>>
>>> Hmm, I'll take a look, but check_peer_redir() seems suspicious at first
>>> glance.
>> I take full responsibility for any bugs you discover in it :-)
>
> ;)
>
> Bug origin is commit f39925dbde7788cfb96419c0f092b086aa325c0f
> ipv4: Cache learned redirect information in inetpeer.
>
> I'll cook a patch ASAP
>
wow that was QUICK :)
When you're done I'll check it ASAP
Gergely Kalman
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox