Netdev List


* Re: [PATCH] mac80211_hwsim: release driver when ieee80211_register_hw fails
From: Fengguang Wu @ 2014-10-29 14:14 UTC (permalink / raw)
  To: Junjie Mao
  Cc: Martin Pitt, linux-wireless, netdev, linux-kernel, Dan Carpenter
In-Reply-To: <8638a718ft.fsf@MJJ-LAPTOP.i-did-not-set--mail-host-address--so-tickle-me>

On Wed, Oct 29, 2014 at 06:23:02PM +0800, Junjie Mao wrote:
> I was not familiar with the acquiring/releasing API either, until I met
> with this bug...
> 
> Perhaps we can use static checkers to avoid these issues as early as
> possible. Any suggestions?

CC Dan. His smatch checker might be able (or could be enabled) to
handle the verification of missing device_release_driver() calls.

Thanks,
Fengguang

> Martin Pitt <martin.pitt@ubuntu.com> writes:
> 
> > Acked-By: Martin Pitt <martin.pitt@ubuntu.com>
> >
> > Hello Junjie,
> >
> > Junjie Mao [2014-10-28  9:31 +0800]:
> >> The driver is not released when ieee80211_register_hw fails in
> >> mac80211_hwsim_create_radio, leading to the access to the unregistered (and
> >> possibly freed) device in platform_driver_unregister:
> >
> > Many thanks for fixing this! Sorry about that, I don't know these bits
> > very well.
> >
> > Martin

^ permalink raw reply

* Re: nfs stalls over loopback interface (no sk_data_ready events?)
From: Jeff Layton @ 2014-10-29 14:21 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Christoph Hellwig, Linux NFS Mailing List, Bruce Fields,
	Trond Myklebust
In-Reply-To: <20141027152900.4e81a9d5-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>

On Mon, 27 Oct 2014 15:29:00 -0400
Jeff Layton <jlayton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org> wrote:

> (sorry for resend -- I got the netdev address wrong)
> 
> Sending this to netdev since I think I've now determined that this is
> not a NFS specific problem. Recently Christoph mentioned that he was
> seeing stalls when running xfstests generic/075 test on NFS over the
> loopback interface with v3.18-rc1-ish kernel.
> 
> The configuration in this case is the nfs server and client on same box
> communicating over the lo interface.
> 
> Here are tracepoints from a typical request as it's supposed to work:
> 
>        mount.nfs-906   [002] ...1 22711.996969: xprt_transmit: xprt=0xffff8800ce961000 xid=0xa8a34513 status=0
>             nfsd-678   [000] ...1 22711.997082: svc_recv: rq_xid=0xa8a34513 status=164
>             nfsd-678   [000] ..s8 22711.997185: xprt_lookup_rqst: xprt=0xffff8800ce961000 xid=0xa8a34513 status=0
>             nfsd-678   [000] ..s8 22711.997186: xprt_complete_rqst: xprt=0xffff8800ce961000 xid=0xa8a34513 status=140
>             nfsd-678   [000] ...1 22711.997236: svc_send: rq_xid=0xa8a34513 dropme=0 status=144
>             nfsd-678   [000] ...1 22711.997236: svc_process: rq_xid=0xa8a34513 dropme=0 status=144
> 
> ...basically, we send a request to the server. Server picks it up and
> sends the reply, and then the client IDs that reply and processes it.
> This runs along just fine for ~ a minute or so. At some point, the
> client stops seeing replies come in:
> 
>      kworker/2:2-107   [002] ...1 22741.696070: xprt_transmit: xprt=0xffff8800ce961000 xid=0xc3a84513 status=0
>             nfsd-678   [002] .N.1 22741.696917: svc_recv: rq_xid=0xc3a84513 status=208
>             nfsd-678   [002] ...1 22741.699890: svc_send: rq_xid=0xc3a84513 dropme=0 status=262252
>             nfsd-678   [002] ...1 22741.699891: svc_process: rq_xid=0xc3a84513 dropme=0 status=262252
> 
> 
> ...a bit more tracepoint work seems to show that we just stop getting
> sk_data_ready callbacks on the socket at all. I'm not terribly familiar
> with the lower-level socket code, so I figured I'd email here and ask...
> 
> Anyone have insight into why this might be happening?
> 

Looks like some change that went into -rc2 has fixed the problem for me.
Christoph, can you confirm that this no longer occurs with -rc2?

-- 
Jeff Layton <jlayton-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Hannes Frederic Sowa @ 2014-10-29 14:34 UTC (permalink / raw)
  To: Erik Kline
  Cc: netdev@vger.kernel.org, David Miller, Ben Hutchings,
	Lorenzo Colitti
In-Reply-To: <CAAedzxptbKLrO_0uPVqXiqOACqNNj9BeeakgB3+t+XxemeN3Sw@mail.gmail.com>

Hi Erik,

On Mi, 2014-10-29 at 18:34 +0900, Erik Kline wrote:
> Given that we spoke about this reduction in the number of netlink messages
> earlier, do you still think it's an issue?

No, it would be nice to have, but as this is just a minor detail and
the complexity would be too high, I am fine with your current solution.

> The end result here is that for listeners on netlink sockets on systems
> that (a) have optimistic dad built-in, (b) have optimistic dad enabled, and
> (c) have use_optimistic set: they'll get 2 notifications (with different flags)
> for automatically added addresses on these interfaces.

Ack, I see.

> (Personally, given that we appear to send an RTM_NEWADDR when the address
> gets deprecated (I think), I think sending one on every flag change of
> interest is in keeping with existing behaviour.)

It does make sense, I have no objections.

Bye,
Hannes

^ permalink raw reply

* Re: [PATCH v2 net-next] net: ipv6: Add a sysctl to make optimistic addresses useful candidates
From: Hannes Frederic Sowa @ 2014-10-29 14:37 UTC (permalink / raw)
  To: Erik Kline; +Cc: netdev, davem, ben, lorenzo
In-Reply-To: <1414487474-18201-1-git-send-email-ek@google.com>

On Di, 2014-10-28 at 18:11 +0900, Erik Kline wrote:
> Add a sysctl that causes an interface's optimistic addresses
> to be considered equivalent to other non-deprecated addresses
> for source address selection purposes.  Preferred addresses
> will still take precedence over optimistic addresses, subject
> to other ranking in the source address selection algorithm.
> 
> This is useful where different interfaces are connected to
> different networks from different ISPs (e.g., a cell network
> and a home wifi network).
> 
> The current behaviour complies with RFC 3484/6724, and it
> makes sense if the host has only one interface, or has
> multiple interfaces on the same network (same or cooperating
> administrative domain(s)), but not in the multiple distinct
> networks case.
> 
> For example, if a mobile device has an IPv6 address on an LTE
> network and then connects to IPv6-enabled wifi, while the wifi
> IPv6 address is undergoing DAD, IPv6 connections will try to use
> the wifi default route with the LTE IPv6 address, and will get
> stuck until they time out.
> 
> Also, because optimistic nodes can receive frames, issue
> an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMISTIC
> flag appropriately set).  A second RTM_NEWADDR is sent if DAD
> completes (the address flags have changed), otherwise an
> RTM_DELADDR is sent.
> 
> Also: add an entry in ip-sysctl.txt for optimistic_dad.
> 
> Signed-off-by: Erik Kline <ek@google.com>
> ---
>
> [...]
>  
> +static inline bool ipv6_use_optimistic_addr(struct inet6_dev *idev)
> +{
> +#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
> +	return idev && idev->cnf.optimistic_dad && idev->cnf.use_optimistic;

Just a small nit: is this idev != NULL check necessary?

Otherwise:
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Thanks,
Hannes

^ permalink raw reply

* [RFC] use smp_load_acquire()/smp_store_release()
From: Eric Dumazet @ 2014-10-29 14:49 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, Stephen Ko

Hi Alexander

The memory barriers added in commit
b37c0fbe3f6dfba1f8ad2aed47fb40578a254635
("net: Add memory barriers to prevent possible race in byte queue
limits")

have a heavy cost.

It seems we could use smp_load_acquire() and smp_store_release()
instead ?

I'll post a patch later today. I would be interested if someone was able
to test it, as your commit apparently was tested and known to fix a
reproducible race.

Thanks !
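
For readers unfamiliar with the acquire/release pairing mentioned above, here is a rough userspace analogy using C11 atomics. It is only a sketch: the names payload, ready, publish() and try_consume() are made up for illustration, and this is neither the kernel's smp_store_release()/smp_load_acquire() macros nor the BQL code itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a plain data word guarded by an atomic flag. */
static int payload;
static atomic_bool ready;

/* Writer: fill in the data first, then publish it with a release store.
 * Everything written before the release store is visible to a reader
 * that observes the flag through an acquire load. */
void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: an acquire load of the flag orders the later read of payload
 * after it. Returns false if nothing has been published yet. */
bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

The appeal of this pattern is that on strongly ordered CPUs such as x86 a release store and an acquire load usually compile to ordinary moves, while a full barrier does not. Whether that carries over to the several variables BQL has to keep in sync is the open question in this thread.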

^ permalink raw reply

* Re: some failures with vxlan offloads..
From: Tom Herbert @ 2014-10-29 14:59 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: netdev@vger.kernel.org
In-Reply-To: <54508020.1040305@mellanox.com>

On Tue, Oct 28, 2014 at 10:50 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> On 10/28/2014 5:36 PM, Tom Herbert wrote:
>>>
>>> I wonder if we have another bug somewhere... when both sides were
>>> offloaded, it works even with the mlx4 bug, can you explain that? Is it
>>> possible that the GRO stack somehow covers up the bug when both sides are
>>> offloaded and GRO/VXLAN comes into play?
>>>
>> Look at the receive side. As I mentioned, if the device is returning
>> checksum-unnecessary and setting csum_level to 1 (inner checksum was
>> validated) then stack won't try to verify the outer checksum. So in
>> this case if outer checksum is incorrect nobody complains about it.
>
>
> OK, I'll look there. Anything that should worry us in that stack trace I
> sent in my initial email of this thread, or do you think this is related to the
> mlx4 driver checksum bug?
>
The trace doesn't seem like it would be related to a checksum bug. Do
you only see this with offload enabled?

> Or.
>
>>
>

^ permalink raw reply

* Re: [PATCH net-next] net: introduce napi_schedule_irqoff()
From: Alexei Starovoitov @ 2014-10-29 15:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1414563627.631.75.camel@edumazet-glaptop2.roam.corp.google.com>

On Tue, Oct 28, 2014 at 11:20 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2014-10-28 at 22:13 -0700, Alexei Starovoitov wrote:
>
>> tried 50 parallel netperf -t TCP_RR over ixgbe
>> and the top entries in perf top were tcp stack bits, qdisc locks and netperf itself.
>> What do you see?
>
> You are kidding right ?
>
> If you save 30 nsec ( 2 * 15 nsec) per transaction, and rtt is about 20
> usec, it's a 0.15 % gain. Not bad for a trivial patch.

agreed.
I wasn't arguing against the patch at all. Was just curious
what performance gain we'll see. 0.15% is tiny and some might
say the code bloat is not worth it, but imo it's a good one. ack.

> Every atomic op we remove/avoid, every irq masking unmasking we remove,
> every cache line miss or extra bus transaction we remove, TLB miss, is
> the path for better latency.

yes. We're saying the same thing.
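
The 0.15 % figure quoted above is simply the saved time divided by the round-trip time. A tiny sketch of that arithmetic, with a made-up helper name (gain_percent is not from the patch):

```c
/* Percentage of one round trip saved, given the time saved per
 * transaction and the round-trip time, both in nanoseconds. */
double gain_percent(double saved_ns, double rtt_ns)
{
	return 100.0 * saved_ns / rtt_ns;
}
```

With the numbers from the thread, gain_percent(2 * 15.0, 20000.0) evaluates 100 * 30 / 20000, which is where the 0.15 % comes from.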

^ permalink raw reply

* Re: [PATCH] mac80211_hwsim: release driver when ieee80211_register_hw fails
From: Johannes Berg @ 2014-10-29 15:31 UTC (permalink / raw)
  To: Junjie Mao
  Cc: Martin Pitt, Fengguang Wu, linux-wireless, netdev, linux-kernel
In-Reply-To: <1414459907-7509-1-git-send-email-eternal.n08@gmail.com>

On Tue, 2014-10-28 at 09:31 +0800, Junjie Mao wrote:
> The driver is not released when ieee80211_register_hw fails in
> mac80211_hwsim_create_radio, leading to the access to the unregistered (and
> possibly freed) device in platform_driver_unregister:

Applied.

johannes

^ permalink raw reply

* Re: [patch] SUNRPC: off by one in BUG_ON()
From: J. Bruce Fields @ 2014-10-29 15:38 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Trond Myklebust, David S. Miller,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20141029084416.GC8939@mwanda>

On Wed, Oct 29, 2014 at 11:44:16AM +0300, Dan Carpenter wrote:
> The m->pool_to[] array has "maxpools" number of elements.  It's
> allocated in svc_pool_map_alloc_arrays() which we called earlier in the
> function.  This test should be >= instead of >.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> ---
> This is very old code, but hopefully the off by one doesn't affect
> runtime.

Yeah, doesn't look like a big deal, but thanks, applying for 3.19.

--b.

> 
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ca8a795..349c98f 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -189,7 +189,7 @@ svc_pool_map_init_percpu(struct svc_pool_map *m)
>  		return err;
>  
>  	for_each_online_cpu(cpu) {
> -		BUG_ON(pidx > maxpools);
> +		BUG_ON(pidx >= maxpools);
>  		m->to_pool[cpu] = pidx;
>  		m->pool_to[pidx] = cpu;
>  		pidx++;

^ permalink raw reply

* Re: some failures with vxlan offloads..
From: Or Gerlitz @ 2014-10-29 15:56 UTC (permalink / raw)
  To: Tom Herbert; +Cc: netdev@vger.kernel.org
In-Reply-To: <CA+mtBx9TCtHgW=TegUgjZ6Se=mP+0CNiNHYmbYHPd1cJJGYoBQ@mail.gmail.com>

On 10/29/2014 4:59 PM, Tom Herbert wrote:
>> OK, I'll look there. Anything that should worry us in that stack trace I
>> sent in my initial email of this thread, or do you think this is related to
>> the mlx4 driver checksum bug?
>>
> The trace doesn't seem like it would be related to a checksum bug. Do you
> only see this with offload enabled?

no, it happened on the server side of configuration #4 in my original 
email, which is offloaded client and non-offloaded server.

Or.

^ permalink raw reply

* e1000_netpoll(): disable_irq() triggers might_sleep() on linux-next
From: Sabrina Dubroca @ 2014-10-29 15:56 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, peterz, jeffrey.t.kirsher

commit e22b886a8a43b ("sched/wait: Add might_sleep() checks") included
in today's linux-next added a check that fires on e1000 with netpoll:


BUG: sleeping function called from invalid context at kernel/irq/manage.c:104
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: systemd
no locks held by systemd/1.
irq event stamp: 10102965
hardirqs last  enabled at (10102965): [<ffffffff810cbafd>] vprintk_emit+0x2dd/0x6a0
hardirqs last disabled at (10102964): [<ffffffff810cb897>] vprintk_emit+0x77/0x6a0
softirqs last  enabled at (10102342): [<ffffffff810666aa>] __do_softirq+0x27a/0x6f0
softirqs last disabled at (10102337): [<ffffffff81066e86>] irq_exit+0x56/0xe0
Preemption disabled at:[<ffffffff817de50d>] printk_emit+0x31/0x33

CPU: 1 PID: 1 Comm: systemd Not tainted 3.18.0-rc2-next-20141029-dirty #222
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140617_173321-var-lib-archbuild-testing-x86_64-tobias 04/01/2014
 ffffffff81a82291 ffff88001e743978 ffffffff817df31d 0000000000000000
 0000000000000000 ffff88001e7439a8 ffffffff8108dfa2 ffff88001e7439a8
 ffffffff81a82291 0000000000000068 0000000000000000 ffff88001e7439d8
Call Trace:
 [<ffffffff817df31d>] dump_stack+0x4f/0x7c
 [<ffffffff8108dfa2>] ___might_sleep+0x182/0x2b0
 [<ffffffff8108e10a>] __might_sleep+0x3a/0xc0
 [<ffffffff810ce358>] synchronize_irq+0x38/0xa0
 [<ffffffff810ce633>] ? __disable_irq_nosync+0x43/0x70
 [<ffffffff810ce690>] disable_irq+0x20/0x30
 [<ffffffff815d7253>] e1000_netpoll+0x23/0x60
 [<ffffffff81678d02>] netpoll_poll_dev+0x72/0x3a0
 [<ffffffff817e9993>] ? _raw_spin_trylock+0x73/0x90
 [<ffffffff8167920f>] ? netpoll_send_skb_on_dev+0x1df/0x2e0
 [<ffffffff816791e7>] netpoll_send_skb_on_dev+0x1b7/0x2e0
 [<ffffffff816795f3>] netpoll_send_udp+0x2e3/0x490
 [<ffffffff815d1f61>] ? write_msg+0x51/0x140
 [<ffffffff815d1fdf>] write_msg+0xcf/0x140
 [<ffffffff810cadbb>] call_console_drivers.constprop.22+0x13b/0x2a0
 [<ffffffff810cb6bd>] console_unlock+0x39d/0x500
 [<ffffffff810cbb3e>] ? vprintk_emit+0x31e/0x6a0
 [<ffffffff810cbb5c>] vprintk_emit+0x33c/0x6a0
 [<ffffffff811a6c6e>] ? might_fault+0x5e/0xc0
 [<ffffffff817de50d>] printk_emit+0x31/0x33
 [<ffffffff810cbfad>] devkmsg_write+0xbd/0x110
 [<ffffffff811f24d5>] do_iter_readv_writev+0x65/0xa0
 [<ffffffff811f3b72>] do_readv_writev+0xe2/0x290
 [<ffffffff810cbef0>] ? vprintk+0x30/0x30
 [<ffffffff810d499d>] ? rcu_read_lock_held+0x6d/0x70
 [<ffffffff812142d6>] ? __fget_light+0xc6/0xd0
 [<ffffffff811f3dac>] vfs_writev+0x3c/0x50
 [<ffffffff811f3eed>] SyS_writev+0x4d/0xe0
 [<ffffffff817ea16d>] system_call_fastpath+0x16/0x1b



I'm able to reproduce it consistently by sending a lot of packets from
that interface while writing to /dev/kmsg with netconsole
enabled. Just writing to /dev/kmsg isn't enough.

# with ping -f or pktgen running
for i in `seq 1 20` ; do echo '............................................' > /dev/kmsg ; done


-- 
Sabrina

^ permalink raw reply

* Fw: [Bug 87111] New: hlist_for_each_entry_rcu()  returns invalid pointer causing kernel to OOPS
From: Stephen Hemminger @ 2014-10-29 15:57 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Wed, 29 Oct 2014 07:16:13 -0700
From: "bugzilla-daemon@bugzilla.kernel.org" <bugzilla-daemon@bugzilla.kernel.org>
To: "stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: [Bug 87111] New: hlist_for_each_entry_rcu()  returns invalid pointer causing kernel to OOPS


https://bugzilla.kernel.org/show_bug.cgi?id=87111

            Bug ID: 87111
           Summary: hlist_for_each_entry_rcu()  returns invalid pointer
                    causing kernel to OOPS
           Product: Networking
           Version: 2.5
    Kernel Version: 2.6.32.24
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: IPV4
          Assignee: shemminger@linux-foundation.org
          Reporter: jith131986@gmail.com
        Regression: No

Created attachment 155781
  --> https://bugzilla.kernel.org/attachment.cgi?id=155781&action=edit
nf_nat.ko objdump for analysing IP and offset to see exact line where kernel
panic happened

In my setup the Linux stack is only used for layer 2 network services. When a
layer 2 packet is received by Linux for layer 2 functionality,
hlist_for_each_entry_rcu() in the nf_nat kernel module (where the IP points)
returns an invalid pointer, resulting in an Oops panic. I have attached the
panic dump and the nf_nat.ko objdump for further analysis.

I would like to know whether this issue has been seen/reported and fixed
before. If not, is it possible to get a cause or solution for it?

Pasting the panic dump below and attaching nf_nat.ko objdump

<1>BUG: unable to handle kernel NULL pointer dereference at 000000000000003e
<1>IP: [<ffffffffa003794b>] nf_nat_setup_info+0x1ab/0x740 [nf_nat]
<6>PGD 641576067 PUD 7dd9f3067 PMD 0 
<0>Oops: 0000 [#1] PREEMPT SMP 
<0>last sysfs file:
/sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host5/scsi_host/host5/proc_name
<6>CPU 3 
<6>Modules linked in: bridge stp llc ixgbe igb ftdi_sio usbserial xt_connlimit
xt_tcpudp xt_mark iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_conntrack iptable_filter ip_tables x_tables
<6>Pid: 0, comm: swapper Tainted: P        W  2.6.32.24 #1 S5520UR
<6>RIP: e030:[<ffffffffa003794b>]  [<ffffffffa003794b>]
nf_nat_setup_info+0x1ab/0x740 [nf_nat]
<6>RSP: e02b:ffff88002808d910  EFLAGS: 00010282
<6>RAX: 0000000000000000 RBX: ffff880381313b58 RCX: 0000000000000000
<6>RDX: 0000000000000018 RSI: 000000007049f4f6 RDI: ffff88002808d9b0
<6>RBP: ffff88002808da10 R08: ffffffff81393e80 R09: ffffffffa0040790
<6>R10: 0000000000004000 R11: 000000000000002c R12: ffff88002808da20
<6>R13: ffff8807fc8ebfd8 R14: ffff880396c3bb70 R15: 0000000000000000
<6>FS:  00007fde2cd296f0(0000) GS:ffff88002808a000(0000) knlGS:0000000000000000
<6>CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
<6>CR2: 000000000000003e CR3: 000000079ab27000 CR4: 0000000000002660
<6>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<6>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<6>Process swapper (pid: 0, threadinfo ffff8807fc8ea000, task ffff8807fc8da050)
<0>Stack:
<6> 0000000000000000 ffff88002808d980 ffff88002808da2c ffff88002808da2e
<6><0> ffff8807fc8ea000 ffff8807fc8ebfd8 0000000000000100 0000000000000100
<6><0> 0000000000000000 0000000000010001 00000000002ace3f ffff88002809a720
<0>Call Trace:
<0> <IRQ> 
<6> [<ffffffff81048b07>] ? local_bh_enable+0x77/0xc0
<6> [<ffffffffa0009945>] ? ipt_do_table+0x2a5/0x3e0 [ip_tables]
<6> [<ffffffffa00400cf>] alloc_null_binding+0x3f/0x70 [iptable_nat]
<6> [<ffffffffa00402fb>] nf_nat_rule_find+0x1fb/0x390 [iptable_nat]
<6> [<ffffffff8138ca3f>] nf_iterate+0x5f/0x90
<6> [<ffffffff81393e80>] ? ip_local_deliver_finish+0x0/0x1e0
<6> [<ffffffff8138cdb0>] nf_hook_slow+0xb0/0x110
<6> [<ffffffff81393e80>] ? ip_local_deliver_finish+0x0/0x1e0
<6> [<ffffffff81394559>] ip_local_deliver+0x69/0x90
<6> [<ffffffff81393ba6>] ip_rcv_finish+0x146/0x420
<6> [<ffffffff8139440d>] ip_rcv+0x27d/0x360
<6> [<ffffffff81371747>] netif_receive_skb+0x2b7/0x390
<6> [<ffffffffa12cce50>] br_handle_frame_finish+0x130/0x170 [bridge]
<6> [<ffffffffa12d1458>] br_netfilter_fini+0x6a8/0x810 [bridge]
<6> [<ffffffff8138cdb0>] ? nf_hook_slow+0xb0/0x110
<6> [<ffffffffa12d1270>] ? br_netfilter_fini+0x4c0/0x810 [bridge]
<6> [<ffffffffa12d2389>] nf_bridge_copy_header+0xdc9/0x10e0 [bridge]
<6> [<ffffffff8138ca3f>] nf_iterate+0x5f/0x90
<6> [<ffffffffa12ccd20>] ? br_handle_frame_finish+0x0/0x170 [bridge]
<6> [<ffffffff8138cdb0>] nf_hook_slow+0xb0/0x110
<6> [<ffffffffa12ccd20>] ? br_handle_frame_finish+0x0/0x170 [bridge]
<6> [<ffffffffa12ccfe6>] br_handle_frame+0x156/0x2b0 [bridge]
<6> [<ffffffff813f2ab8>] ? vlan_skb_recv+0x1a8/0x2f0
<6> [<ffffffff81371699>] netif_receive_skb+0x209/0x390
<6> [<ffffffff81374d79>] process_backlog+0x89/0xc0
<6> [<ffffffff81374b7f>] net_rx_action+0x7f/0x160
<6> [<ffffffffa0078165>] ? igb_reinit_locked+0x1995/0x2900 [igb]
<6> [<ffffffff810484f8>] __do_softirq+0xa8/0x130
<6> [<ffffffff810755a8>] ? handle_level_irq+0xe8/0x130
<6> [<ffffffff81014efc>] call_softirq+0x1c/0x30
<6> [<ffffffff81016765>] do_softirq+0x65/0xa0
<6> [<ffffffff81048358>] irq_exit+0x48/0x50
<6> [<ffffffff81228ddd>] xen_evtchn_do_upcall+0x3d/0x60
<6> [<ffffffff81014f4e>] xen_do_hypervisor_callback+0x1e/0x30
<0> <EOI> 
<6> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1010
<6> [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1010
<6> [<ffffffff8100f8d0>] ? xen_safe_halt+0x10/0x20
<6> [<ffffffff8100c4d5>] ? xen_idle+0x45/0x70
<6> [<ffffffff81012d78>] ? cpu_idle+0x58/0x90
<6> [<ffffffff810101c9>] ? xen_irq_enable_direct_end+0x0/0x7
<6> [<ffffffff8140a86e>] ? cpu_bringup_and_idle+0xe/0x10
<0>Code: ff ff ff 49 8d 44 24 0c 48 89 85 10 ff ff ff eb 0c 48 8b 1b 48 85 db
0f 84 f1 00 00 00 48 8b 4b 20 48 8b 03 48 8d 51 18 0f 18 08 <0f> b6 42 26 3a 45
c6 75 dd 8b 02 3b 45 a0 75 d6 0f b7 42 10 66 
<1>RIP  [<ffffffffa003794b>] nf_nat_setup_info+0x1ab/0x740 [nf_nat]
<6> RSP <ffff88002808d910>
<0>CR2: 000000000000003e


WARN  paging error trying to follow 0x0000000000000000 - level 2, cr3
000000058ea67000

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

* Re: [PATCH 1/1 net-next] cfg80211: fix set but not used warning in nl80211_channel_switch()
From: Johannes Berg @ 2014-10-29 16:04 UTC (permalink / raw)
  To: Fabian Frederick
  Cc: linux-kernel, John W. Linville, David S. Miller, linux-wireless,
	netdev
In-Reply-To: <1414252655-20506-1-git-send-email-fabf@skynet.be>

On Sat, 2014-10-25 at 17:57 +0200, Fabian Frederick wrote:
> radar_detect_width is unused since commit 97dc94f1d933
> ("cfg80211: remove channel_switch combination check")

Applied, thanks.

johannes

^ permalink raw reply

* [patch] iwlwifi: cleanup a mask shift in iwlagn_bt_traffic_is_sco()
From: Dan Carpenter @ 2014-10-29 16:08 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Emmanuel Grumbach, Intel Linux Wireless, John W. Linville,
	Paul Gortmaker, Larry Finger, linux-wireless, netdev,
	linux-kernel, kernel-janitors

The shift operation has higher precedence, so the code is wrong and it
sets off a static checker warning.  But it doesn't affect real life
because BT_UART_MSG_FRAME3SCOESCO_POS is zero so the shift is a no-op.

I have re-written it in normal style and with parentheses as a cleanup
and to silence the static checker warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/drivers/net/wireless/iwlwifi/dvm/lib.c b/drivers/net/wireless/iwlwifi/dvm/lib.c
index 2191621..065d3d5 100644
--- a/drivers/net/wireless/iwlwifi/dvm/lib.c
+++ b/drivers/net/wireless/iwlwifi/dvm/lib.c
@@ -418,7 +418,7 @@ void iwlagn_bt_adjust_rssi_monitor(struct iwl_priv *priv, bool rssi_ena)
 
 static bool iwlagn_bt_traffic_is_sco(struct iwl_bt_uart_msg *uart_msg)
 {
-	return BT_UART_MSG_FRAME3SCOESCO_MSK & uart_msg->frame3 >>
+	return (uart_msg->frame3 & BT_UART_MSG_FRAME3SCOESCO_MSK) >>
 			BT_UART_MSG_FRAME3SCOESCO_POS;
 }
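
A small standalone sketch of the precedence issue described in this patch: in C, >> binds tighter than &, so "msk & frame >> pos" is parsed as "msk & (frame >> pos)". The function names and the mask/shift values used below are invented for illustration and are not the iwlwifi definitions.

```c
#include <stdint.h>

/* Written without parentheses: the compiler shifts frame first and only
 * then applies the mask, i.e. msk & (frame >> pos). */
uint32_t extract_unparenthesized(uint32_t frame, uint32_t msk, unsigned pos)
{
	return msk & frame >> pos;
}

/* Conventional field extraction: mask the field first, then shift it
 * down to bit 0. */
uint32_t extract_parenthesized(uint32_t frame, uint32_t msk, unsigned pos)
{
	return (frame & msk) >> pos;
}
```

With a mask of 0x0C and a shift of 2, a frame value of 0x08 gives 0 from the first form and 2 from the second. With a shift of 0 the two forms agree, which matches the observation that the original code happened to work.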
 

^ permalink raw reply related

* [patch] Bluetooth: 6lowpan: use after free in disconnect_devices()
From: Dan Carpenter @ 2014-10-29 16:10 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: Gustavo Padovan, Johan Hedberg, David S. Miller, linux-bluetooth,
	netdev, linux-kernel, kernel-janitors

This was accidentally changed from list_for_each_entry_safe() to
list_for_each_entry() so now it has a use after free bug.  I've changed
it back.

Fixes: 90305829635d ('Bluetooth: 6lowpan: Converting rwlocks to use RCU')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index 7254bdd..eef298d 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -1383,7 +1383,7 @@ static const struct file_operations lowpan_control_fops = {
 
 static void disconnect_devices(void)
 {
-	struct lowpan_dev *entry, *new_dev;
+	struct lowpan_dev *entry, *tmp, *new_dev;
 	struct list_head devices;
 
 	INIT_LIST_HEAD(&devices);
@@ -1408,7 +1408,7 @@ static void disconnect_devices(void)
 
 	rcu_read_unlock();
 
-	list_for_each_entry(entry, &devices, list) {
+	list_for_each_entry_safe(entry, tmp, &devices, list) {
 		ifdown(entry->netdev);
 		BT_DBG("Unregistering netdev %s %p",
 		       entry->netdev->name, entry->netdev);
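
The reason the _safe variant matters here can be shown with a plain userspace list. This is only an analogy, assuming a hand-rolled singly linked list with made-up names (struct node, push(), free_all()); it is not the kernel's list.h. If the loop body frees the current entry, the iterator must read the next pointer before the free, which is what the extra tmp cursor is for.

```c
#include <stdlib.h>

struct node {
	int value;
	struct node *next;
};

/* Push a new node onto the front of the list and return the new head.
 * If allocation fails the list is returned unchanged. */
struct node *push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->value = value;
	n->next = head;
	return n;
}

/* Free every node and return how many were freed. The next pointer is
 * saved in tmp before the current entry is freed, so the loop never
 * reads memory that has already been released. */
size_t free_all(struct node *head)
{
	struct node *entry, *tmp;
	size_t freed = 0;

	for (entry = head; entry != NULL; entry = tmp) {
		tmp = entry->next;
		free(entry);
		freed++;
	}
	return freed;
}
```

Advancing with "entry = entry->next" after the free, as a non-safe iterator effectively does, would dereference freed memory on every step.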

^ permalink raw reply related

* Re: [patch] iwlwifi: cleanup a mask shift in iwlagn_bt_traffic_is_sco()
From: Emmanuel Grumbach @ 2014-10-29 16:16 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Johannes Berg, Emmanuel Grumbach, Intel Linux Wireless,
	John W. Linville, Paul Gortmaker, Larry Finger, linux-wireless,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-janitors
In-Reply-To: <20141029160827.GD5290@mwanda>

On Wed, Oct 29, 2014 at 6:08 PM, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> The shift operation has higher precedence, so the code is wrong and it
> sets off a static checker warning.  But it doesn't affect real life
> because BT_UART_MSG_FRAME3SCOESCO_POS is zero so the shift is a no-op.
>
> I have re-written it in normal style and with parentheses as a cleanup
> and to silence the static checker warning.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>

In my tree already - I got it from Joe.

https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/iwlwifi-next.git/commit/?id=50f6635afe565a0e1c5ab78f040294fe1dc41de0

^ permalink raw reply

* Re: [RFC] use smp_load_acquire()/smp_store_release()
From: Alexander Duyck @ 2014-10-29 16:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, jeffrey.t.kirsher
In-Reply-To: <1414594159.631.85.camel@edumazet-glaptop2.roam.corp.google.com>

On 10/29/2014 07:49 AM, Eric Dumazet wrote:
> Hi Alexander
>
> The memory barriers added in commit
> b37c0fbe3f6dfba1f8ad2aed47fb40578a254635
> ("net: Add memory barriers to prevent possible race in byte queue
> limits")
>
> have heavy cost.
>
> It seems we could use smp_load_acquire() and smp_store_release()
> instead ?
>
> I'll post a patch later today. I would be interested if someone was able
> to test it, as your commit apparently was tested and known to fix a
> reproducible race.
>
> Thanks !

Unfortunately Stephen left Intel before I did, so we will need to find 
someone else in the validation team to test this if possible. I have 
added Jeff to the CC so that he can give the appropriate validation 
people a heads up that this patch might be coming.

As I recall what was seen was random Tx hangs on systems with the 
original BQL code when interfaces were stressed.  It has been a while so 
I don't recall the exact set-up for all of it.  Also some less 
used/tested architectures such as PowerPC can be more susceptible to 
synchronization issues such as these as the memory model is more weakly 
ordered.

I'm wondering where you are seeing the barrier show up?  In
netdev_tx_sent_queue you should only hit the barrier if you actually are
triggering the XOFF condition, and in netdev_tx_completed_queue the
barrier should be coalesced in amongst a number of frames reducing the cost.

My concern with this would be that we are actually synchronizing multiple
things: the __QUEUE_STATE_STACK_XOFF flag, dql->adj_limit, and
dql->num_queued. We might be trading a reduced cost on x86 for an
increased cost on other architectures, as we may have to add additional
synchronization; I suspect we would need to use acquire/release on both
adj_limit and num_queued.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH] netdev: Fix sleeping inside wait event
From: Cong Wang @ 2014-10-29 16:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Fengguang Wu, LKP, LKML, oleg@redhat.com, Eric W. Biederman,
	David Miller, Linux Kernel Network Developers
In-Reply-To: <20141029161657.GF3337@twins.programming.kicks-ass.net>

(Adding netdev@...)

On Wed, Oct 29, 2014 at 9:16 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Dave, this relies on bits currently in tip/sched/core, if you're ok I'll
> merge it through that tree.
>
> ---
> Subject: netdev: Fix sleeping inside wait event
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Wed Oct 29 17:04:56 CET 2014
>
> rtnl_lock_unregistering() takes rtnl_lock() -- a mutex -- inside a
> wait loop. The wait loop relies on current->state to function, but so
> does mutex_lock(), nesting them makes for the inner to destroy the
> outer state.
>

While you are at it, please fix rtnl_lock_unregistering_all() too?

Thanks!

> Fix this using the new wait_woken() bits.
>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Eric Biederman <ebiederm@xmission.com>
> Cc: David Miller <davem@davemloft.net>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  net/core/dev.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7196,11 +7196,10 @@ static void __net_exit rtnl_lock_unregis
>          */
>         struct net *net;
>         bool unregistering;
> -       DEFINE_WAIT(wait);
> +       DEFINE_WAIT_FUNC(wait, woken_wake_function);
>
> +       add_wait_queue(&netdev_unregistering_wq, &wait);
>         for (;;) {
> -               prepare_to_wait(&netdev_unregistering_wq, &wait,
> -                               TASK_UNINTERRUPTIBLE);
>                 unregistering = false;
>                 rtnl_lock();
>                 list_for_each_entry(net, net_list, exit_list) {
> @@ -7212,9 +7211,10 @@ static void __net_exit rtnl_lock_unregis
>                 if (!unregistering)
>                         break;
>                 __rtnl_unlock();
> -               schedule();
> +
> +               wait_woken(&wait, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
>         }
> -       finish_wait(&netdev_unregistering_wq, &wait);
> +       remove_wait_queue(&netdev_unregistering_wq, &wait);
>  }
>
>  static void __net_exit default_device_exit_batch(struct list_head *net_list)

^ permalink raw reply

* Re: [PATCH] PPC: bpf_jit_comp: add SKF_AD_PKTTYPE instruction
From: Alexei Starovoitov @ 2014-10-29 17:08 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: linuxppc-dev, Matt Evans, netdev@vger.kernel.org
In-Reply-To: <CAOJe8K0t3G-bHm_24GjrTp9mmKnYZS1_bdGgrwQBLjC_s5is6w@mail.gmail.com>

On Wed, Oct 29, 2014 at 2:21 AM, Denis Kirjanov <kda@linux-powerpc.org> wrote:
> Any feedback from PPC folks?

Not a ppc guy, but it looks reasonable to me.
What does lib/test_bpf say? For example, the performance difference
before/after for the LD_PKTTYPE test...

^ permalink raw reply

* Re: [PATCH] netdev: Fix sleeping inside wait event
From: Peter Zijlstra @ 2014-10-29 17:13 UTC (permalink / raw)
  To: Cong Wang
  Cc: Fengguang Wu, LKP, LKML, oleg@redhat.com, Eric W. Biederman,
	David Miller, Linux Kernel Network Developers
In-Reply-To: <CAM_iQpXTyBT_4TrnYAoAuAE_jMto=2B=cutFggpHRMhvSrG7qg@mail.gmail.com>

On Wed, Oct 29, 2014 at 09:29:55AM -0700, Cong Wang wrote:
> (Adding netdev@...)
> 
> On Wed, Oct 29, 2014 at 9:16 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Dave, this relies on bits currently in tip/sched/core, if you're ok I'll
> > merge it through that tree.
> >
> > ---
> > Subject: netdev: Fix sleeping inside wait event
> > From: Peter Zijlstra <peterz@infradead.org>
> > Date: Wed Oct 29 17:04:56 CET 2014
> >
> > rtnl_lock_unregistering() takes rtnl_lock() -- a mutex -- inside a
> > wait loop. The wait loop relies on current->state to function, but so
> > does mutex_lock(), nesting them makes for the inner to destroy the
> > outer state.
> >
> 
> While you are on it, please fix rtnl_lock_unregistering_all() too?

Ah, that's hidden someplace else, sure I can do that. Thanks for
pointing it out.

^ permalink raw reply

* Re: [PATCH] netdev: Fix sleeping inside wait event
From: Peter Zijlstra @ 2014-10-29 17:31 UTC (permalink / raw)
  To: Cong Wang
  Cc: Fengguang Wu, LKP, LKML, oleg@redhat.com, Eric W. Biederman,
	David Miller, Linux Kernel Network Developers
In-Reply-To: <20141029171345.GO12706@worktop.programming.kicks-ass.net>

On Wed, Oct 29, 2014 at 06:13:45PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 29, 2014 at 09:29:55AM -0700, Cong Wang wrote:
> > While you are on it, please fix rtnl_lock_unregistering_all() too?
> 
> Ah, that's hidden someplace else, sure I can do that. Thanks for
> pointing it out.

Here goes..

---
Subject: netdev: Fix sleeping inside wait event
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed Oct 29 17:04:56 CET 2014

rtnl_lock_unregistering*() take rtnl_lock() -- a mutex -- inside a
wait loop. The wait loop relies on current->state to function, but so
does mutex_lock(); nesting them lets the inner one clobber the outer
state.

Fix this using the new wait_woken() bits.

Cc: David Miller <davem@davemloft.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 net/core/dev.c       |   10 +++++-----
 net/core/rtnetlink.c |   10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7196,11 +7196,10 @@ static void __net_exit rtnl_lock_unregis
 	 */
 	struct net *net;
 	bool unregistering;
-	DEFINE_WAIT(wait);
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
+	add_wait_queue(&netdev_unregistering_wq, &wait);
 	for (;;) {
-		prepare_to_wait(&netdev_unregistering_wq, &wait,
-				TASK_UNINTERRUPTIBLE);
 		unregistering = false;
 		rtnl_lock();
 		list_for_each_entry(net, net_list, exit_list) {
@@ -7212,9 +7211,10 @@ static void __net_exit rtnl_lock_unregis
 		if (!unregistering)
 			break;
 		__rtnl_unlock();
-		schedule();
+
+		wait_woken(&wait, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
 	}
-	finish_wait(&netdev_unregistering_wq, &wait);
+	remove_wait_queue(&netdev_unregistering_wq, &wait);
 }
 
 static void __net_exit default_device_exit_batch(struct list_head *net_list)
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -365,11 +365,10 @@ static void rtnl_lock_unregistering_all(
 {
 	struct net *net;
 	bool unregistering;
-	DEFINE_WAIT(wait);
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
+	add_wait_queue(&netdev_unregistering_wq, &wait);
 	for (;;) {
-		prepare_to_wait(&netdev_unregistering_wq, &wait,
-				TASK_UNINTERRUPTIBLE);
 		unregistering = false;
 		rtnl_lock();
 		for_each_net(net) {
@@ -381,9 +380,10 @@ static void rtnl_lock_unregistering_all(
 		if (!unregistering)
 			break;
 		__rtnl_unlock();
-		schedule();
+
+		wait_woken(&wait, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
 	}
-	finish_wait(&netdev_unregistering_wq, &wait);
+	remove_wait_queue(&netdev_unregistering_wq, &wait);
 }
 
 /**
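
To make the changelog above concrete, here is a rough userspace model of
the two wait patterns. It is not kernel code: every toy_* name is
hypothetical, the lock is reduced to "returns in TASK_RUNNING", and the
woken flag merely stands in for the flag that woken_wake_function() sets
on the wait entry. It only illustrates why setting the task state before
taking a sleeping lock defeats the following schedule().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy task states, loosely modelled on the kernel's. */
enum toy_state { TOY_TASK_RUNNING, TOY_TASK_UNINTERRUPTIBLE };

struct toy_task {
	enum toy_state state;	/* stands in for current->state */
	bool woken;		/* stands in for the "woken" flag on the wait entry */
	int sleeps;		/* how many times toy_schedule() really slept */
};

/* A sleeping lock may block internally and always returns runnable. */
static void toy_mutex_lock(struct toy_task *t)
{
	t->state = TOY_TASK_RUNNING;
}

/* Only sleeps if the task is not marked runnable. */
static void toy_schedule(struct toy_task *t)
{
	if (t->state != TOY_TASK_RUNNING) {
		t->sleeps++;
		t->state = TOY_TASK_RUNNING;	/* a later wakeup made us runnable */
	}
}

/* Old pattern: the state is set first, then the lock clobbers it. */
static int toy_old_pass(struct toy_task *t)
{
	t->state = TOY_TASK_UNINTERRUPTIBLE;	/* prepare_to_wait() */
	toy_mutex_lock(t);			/* rtnl_lock() */
	toy_schedule(t);			/* does not sleep any more */
	return t->sleeps;
}

/* New pattern: the state is set only inside the wait itself, and a
 * wakeup that already arrived is remembered in the woken flag.
 */
static void toy_wait_woken(struct toy_task *t)
{
	t->state = TOY_TASK_UNINTERRUPTIBLE;
	if (!t->woken)
		toy_schedule(t);
	t->state = TOY_TASK_RUNNING;
	t->woken = false;
}

static int toy_new_pass(struct toy_task *t)
{
	toy_mutex_lock(t);	/* harmless: the state was not set yet */
	toy_wait_woken(t);
	return t->sleeps;
}

static void toy_wait_demo(void)
{
	struct toy_task a = { TOY_TASK_RUNNING, false, 0 };
	struct toy_task b = { TOY_TASK_RUNNING, false, 0 };
	struct toy_task c = { TOY_TASK_RUNNING, true, 0 };

	assert(toy_old_pass(&a) == 0);	/* state was clobbered: no sleep */
	assert(toy_new_pass(&b) == 1);	/* sleeps as intended */
	assert(toy_new_pass(&c) == 0);	/* earlier wakeup is not lost */
}
```

In this model the old pattern degenerates into a busy loop, while the new
one either sleeps or consumes a wakeup that was already recorded.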

^ permalink raw reply

* Re: [PATCH] netdev: Fix sleeping inside wait event
From: David Miller @ 2014-10-29 17:38 UTC (permalink / raw)
  To: peterz
  Cc: xiyou.wangcong, fengguang.wu, lkp, linux-kernel, oleg, ebiederm,
	netdev
In-Reply-To: <20141029173110.GE15602@worktop.programming.kicks-ass.net>

From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 29 Oct 2014 18:31:10 +0100

> On Wed, Oct 29, 2014 at 06:13:45PM +0100, Peter Zijlstra wrote:
>> On Wed, Oct 29, 2014 at 09:29:55AM -0700, Cong Wang wrote:
>> > While you are on it, please fix rtnl_lock_unregistering_all() too?
>> 
>> Ah, that's hidden someplace else, sure I can do that. Thanks for
>> pointing it out.
> 
> Here goes..

For this as well:

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* [PATCH v3 01/15] net: dsa: Don't set skb->protocol on outgoing tagged packets
From: Guenter Roeck @ 2014-10-29 17:44 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Florian Fainelli, Andrew Lunn, linux-kernel,
	Guenter Roeck
In-Reply-To: <1414604707-22407-1-git-send-email-linux@roeck-us.net>

Setting skb->protocol to a private protocol type may result in warning
messages such as
	e1000e 0000:00:19.0 em1: checksum_partial proto=dada!

This happens if the L3 protocol is IP or IPv6 and skb->ip_summed is set
to CHECKSUM_PARTIAL. Looking through the code, it appears that changing
skb->protocol for transmitted packets is not necessary and may actually
be harmful. For example, it prevents purposely unmodified (from a DSA
perspective) network drivers from properly setting up their transmit
checksum offload pointers since they inspect skb->protocol to set up the
IPv4 header or IPv6 header pointers. So don't unnecessarily change the
protocol field.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v3:
- Updated description
v2:
- No change

 net/dsa/tag_dsa.c     | 2 --
 net/dsa/tag_edsa.c    | 2 --
 net/dsa/tag_trailer.c | 2 --
 3 files changed, 6 deletions(-)

diff --git a/net/dsa/tag_dsa.c b/net/dsa/tag_dsa.c
index ce90c8b..2dab270 100644
--- a/net/dsa/tag_dsa.c
+++ b/net/dsa/tag_dsa.c
@@ -63,8 +63,6 @@ static netdev_tx_t dsa_xmit(struct sk_buff *skb, struct net_device *dev)
 		dsa_header[3] = 0x00;
 	}
 
-	skb->protocol = htons(ETH_P_DSA);
-
 	skb->dev = p->parent->dst->master_netdev;
 	dev_queue_xmit(skb);
 
diff --git a/net/dsa/tag_edsa.c b/net/dsa/tag_edsa.c
index 94fcce7..9aeda59 100644
--- a/net/dsa/tag_edsa.c
+++ b/net/dsa/tag_edsa.c
@@ -76,8 +76,6 @@ static netdev_tx_t edsa_xmit(struct sk_buff *skb, struct net_device *dev)
 		edsa_header[7] = 0x00;
 	}
 
-	skb->protocol = htons(ETH_P_EDSA);
-
 	skb->dev = p->parent->dst->master_netdev;
 	dev_queue_xmit(skb);
 
diff --git a/net/dsa/tag_trailer.c b/net/dsa/tag_trailer.c
index 115fdca..e268f9d 100644
--- a/net/dsa/tag_trailer.c
+++ b/net/dsa/tag_trailer.c
@@ -57,8 +57,6 @@ static netdev_tx_t trailer_xmit(struct sk_buff *skb, struct net_device *dev)
 	trailer[2] = 0x10;
 	trailer[3] = 0x00;
 
-	nskb->protocol = htons(ETH_P_TRAILER);
-
 	nskb->dev = p->parent->dst->master_netdev;
 	dev_queue_xmit(nskb);
 
-- 
1.9.1
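
A minimal sketch of the failure mode described in the changelog, as a
standalone C model rather than driver code. The toy_* names are
hypothetical, and values are compared in host byte order for simplicity
(real drivers compare skb->protocol against htons() constants). The
EtherType values 0x0800, 0x86DD and 0xDADA are the usual ones for IPv4,
IPv6 and EDSA; 0xDADA matches the "proto=dada" in the warning quoted
above.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ETH_P_IP	0x0800
#define TOY_ETH_P_IPV6	0x86DD
#define TOY_ETH_P_EDSA	0xDADA

enum toy_csum_setup {
	TOY_CSUM_IPV4,		/* offload pointers set up from the IPv4 header */
	TOY_CSUM_IPV6,		/* offload pointers set up from the IPv6 header */
	TOY_CSUM_UNSUPPORTED,	/* driver cannot tell which L3 header follows */
};

/* Models a DSA-unaware driver that picks its checksum offload setup
 * purely from the protocol field of the packet it is asked to send.
 */
static enum toy_csum_setup toy_tx_csum_setup(uint16_t protocol)
{
	switch (protocol) {
	case TOY_ETH_P_IP:
		return TOY_CSUM_IPV4;
	case TOY_ETH_P_IPV6:
		return TOY_CSUM_IPV6;
	default:
		return TOY_CSUM_UNSUPPORTED;
	}
}

static void toy_csum_demo(void)
{
	/* Protocol left alone by the tagger: offload can be set up. */
	assert(toy_tx_csum_setup(TOY_ETH_P_IP) == TOY_CSUM_IPV4);
	assert(toy_tx_csum_setup(TOY_ETH_P_IPV6) == TOY_CSUM_IPV6);
	/* Protocol overwritten with the private tag type: it cannot. */
	assert(toy_tx_csum_setup(TOY_ETH_P_EDSA) == TOY_CSUM_UNSUPPORTED);
}
```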

^ permalink raw reply related

* [PATCH v3 04/15] net: dsa: Add support for Marvell 88E6352
From: Guenter Roeck @ 2014-10-29 17:44 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Florian Fainelli, Andrew Lunn, linux-kernel,
	Guenter Roeck
In-Reply-To: <1414604707-22407-1-git-send-email-linux@roeck-us.net>

Marvell 88E6352 is mostly compatible with MV88E6123/61/65,
but requires indirect PHY access. Also, its configuration
registers are a bit different.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v3:
- No change
v2:
- No change

 MAINTAINERS                 |   5 +
 drivers/net/dsa/Kconfig     |   8 +
 drivers/net/dsa/Makefile    |   3 +
 drivers/net/dsa/mv88e6352.c | 464 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/net/dsa/mv88e6xxx.c |   3 +
 drivers/net/dsa/mv88e6xxx.h |   7 +
 6 files changed, 490 insertions(+)
 create mode 100644 drivers/net/dsa/mv88e6352.c

diff --git a/MAINTAINERS b/MAINTAINERS
index dab92a7..91db18e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5860,6 +5860,11 @@ M:	Russell King <rmk+kernel@arm.linux.org.uk>
 S:	Maintained
 F:	drivers/gpu/drm/armada/
 
+MARVELL 88E6352 DSA support
+M:	Guenter Roeck <linux@roeck-us.net>
+S:	Maintained
+F:	drivers/net/dsa/mv88e6352.c
+
 MARVELL GIGABIT ETHERNET DRIVERS (skge/sky2)
 M:	Mirko Lindner <mlindner@marvell.com>
 M:	Stephen Hemminger <stephen@networkplumber.org>
diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 9234d80..0987c33 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -45,6 +45,14 @@ config NET_DSA_MV88E6171
 	  This enables support for the Marvell 88E6171 ethernet switch
 	  chip.
 
+config NET_DSA_MV88E6352
+	tristate "Marvell 88E6352 ethernet switch chip support"
+	select NET_DSA
+	select NET_DSA_MV88E6XXX
+	select NET_DSA_TAG_EDSA
+	---help---
+	  This enables support for the Marvell 88E6352 ethernet switch chip.
+
 config NET_DSA_BCM_SF2
 	tristate "Broadcom Starfighter 2 Ethernet switch support"
 	depends on HAS_IOMEM
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index 23a90de..e2d51c4 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -7,6 +7,9 @@ endif
 ifdef CONFIG_NET_DSA_MV88E6131
 mv88e6xxx_drv-y += mv88e6131.o
 endif
+ifdef CONFIG_NET_DSA_MV88E6352
+mv88e6xxx_drv-y += mv88e6352.o
+endif
 ifdef CONFIG_NET_DSA_MV88E6171
 mv88e6xxx_drv-y += mv88e6171.o
 endif
diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
new file mode 100644
index 0000000..43a5826
--- /dev/null
+++ b/drivers/net/dsa/mv88e6352.c
@@ -0,0 +1,464 @@
+/*
+ * net/dsa/mv88e6352.c - Marvell 88e6352 switch chip support
+ *
+ * Copyright (c) 2014 Guenter Roeck
+ *
+ * Derived from mv88e6123_61_65.c
+ * Copyright (c) 2008-2009 Marvell Semiconductor
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/delay.h>
+#include <linux/jiffies.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/phy.h>
+#include <net/dsa.h>
+#include "mv88e6xxx.h"
+
+static int mv88e6352_phy_wait(struct dsa_switch *ds)
+{
+	unsigned long timeout = jiffies + HZ / 10;
+
+	while (time_before(jiffies, timeout)) {
+		int ret;
+
+		ret = REG_READ(REG_GLOBAL2, 0x18);
+		if (ret < 0)
+			return ret;
+
+		if (!(ret & 0x8000))
+			return 0;
+
+		usleep_range(1000, 2000);
+	}
+	return -ETIMEDOUT;
+}
+
+static int __mv88e6352_phy_read(struct dsa_switch *ds, int addr, int regnum)
+{
+	int ret;
+
+	REG_WRITE(REG_GLOBAL2, 0x18, 0x9800 | (addr << 5) | regnum);
+
+	ret = mv88e6352_phy_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	return REG_READ(REG_GLOBAL2, 0x19);
+}
+
+static int __mv88e6352_phy_write(struct dsa_switch *ds, int addr, int regnum,
+				 u16 val)
+{
+	REG_WRITE(REG_GLOBAL2, 0x19, val);
+	REG_WRITE(REG_GLOBAL2, 0x18, 0x9400 | (addr << 5) | regnum);
+
+	return mv88e6352_phy_wait(ds);
+}
+
+static char *mv88e6352_probe(struct device *host_dev, int sw_addr)
+{
+	struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+	int ret;
+
+	if (bus == NULL)
+		return NULL;
+
+	ret = __mv88e6xxx_reg_read(bus, sw_addr, REG_PORT(0), 0x03);
+	if (ret >= 0) {
+		if (ret == 0x3521)
+			return "Marvell 88E6352 (A0)";
+		if (ret == 0x3522)
+			return "Marvell 88E6352 (A1)";
+		if ((ret & 0xfff0) == 0x3520)
+			return "Marvell 88E6352";
+	}
+
+	return NULL;
+}
+
+static int mv88e6352_switch_reset(struct dsa_switch *ds)
+{
+	unsigned long timeout;
+	int ret;
+	int i;
+
+	/* Set all ports to the disabled state. */
+	for (i = 0; i < 7; i++) {
+		ret = REG_READ(REG_PORT(i), 0x04);
+		REG_WRITE(REG_PORT(i), 0x04, ret & 0xfffc);
+	}
+
+	/* Wait for transmit queues to drain. */
+	usleep_range(2000, 4000);
+
+	/* Reset the switch. Keep PPU active (bit 14, undocumented).
+	 * The PPU needs to be active to support indirect phy register
+	 * accesses through global registers 0x18 and 0x19.
+	 */
+	REG_WRITE(REG_GLOBAL, 0x04, 0xc000);
+
+	/* Wait up to one second for reset to complete. */
+	timeout = jiffies + 1 * HZ;
+	while (time_before(jiffies, timeout)) {
+		ret = REG_READ(REG_GLOBAL, 0x00);
+		if ((ret & 0x8800) == 0x8800)
+			break;
+		usleep_range(1000, 2000);
+	}
+	if (time_after(jiffies, timeout))
+		return -ETIMEDOUT;
+
+	return 0;
+}
+
+static int mv88e6352_setup_global(struct dsa_switch *ds)
+{
+	int ret;
+	int i;
+
+	/* Discard packets with excessive collisions,
+	 * mask all interrupt sources, enable PPU (bit 14, undocumented).
+	 */
+	REG_WRITE(REG_GLOBAL, 0x04, 0x6000);
+
+	/* Set the default address aging time to 5 minutes, and
+	 * enable address learn messages to be sent to all message
+	 * ports.
+	 */
+	REG_WRITE(REG_GLOBAL, 0x0a, 0x0148);
+
+	/* Configure the priority mapping registers. */
+	ret = mv88e6xxx_config_prio(ds);
+	if (ret < 0)
+		return ret;
+
+	/* Configure the upstream port, and configure the upstream
+	 * port as the port to which ingress and egress monitor frames
+	 * are to be sent.
+	 */
+	REG_WRITE(REG_GLOBAL, 0x1a, (dsa_upstream_port(ds) * 0x1110));
+
+	/* Disable remote management for now, and set the switch's
+	 * DSA device number.
+	 */
+	REG_WRITE(REG_GLOBAL, 0x1c, ds->index & 0x1f);
+
+	/* Send all frames with destination addresses matching
+	 * 01:80:c2:00:00:2x to the CPU port.
+	 */
+	REG_WRITE(REG_GLOBAL2, 0x02, 0xffff);
+
+	/* Send all frames with destination addresses matching
+	 * 01:80:c2:00:00:0x to the CPU port.
+	 */
+	REG_WRITE(REG_GLOBAL2, 0x03, 0xffff);
+
+	/* Disable the loopback filter, disable flow control
+	 * messages, disable flood broadcast override, disable
+	 * removing of provider tags, disable ATU age violation
+	 * interrupts, disable tag flow control, force flow
+	 * control priority to the highest, and send all special
+	 * multicast frames to the CPU at the highest priority.
+	 */
+	REG_WRITE(REG_GLOBAL2, 0x05, 0x00ff);
+
+	/* Program the DSA routing table. */
+	for (i = 0; i < 32; i++) {
+		int nexthop = 0x1f;
+
+		if (i != ds->index && i < ds->dst->pd->nr_chips)
+			nexthop = ds->pd->rtable[i] & 0x1f;
+
+		REG_WRITE(REG_GLOBAL2, 0x06, 0x8000 | (i << 8) | nexthop);
+	}
+
+	/* Clear all trunk masks. */
+	for (i = 0; i < 8; i++)
+		REG_WRITE(REG_GLOBAL2, 0x07, 0x8000 | (i << 12) | 0x7f);
+
+	/* Clear all trunk mappings. */
+	for (i = 0; i < 16; i++)
+		REG_WRITE(REG_GLOBAL2, 0x08, 0x8000 | (i << 11));
+
+	/* Disable ingress rate limiting by resetting all ingress
+	 * rate limit registers to their initial state.
+	 */
+	for (i = 0; i < 7; i++)
+		REG_WRITE(REG_GLOBAL2, 0x09, 0x9000 | (i << 8));
+
+	/* Initialise cross-chip port VLAN table to reset defaults. */
+	REG_WRITE(REG_GLOBAL2, 0x0b, 0x9000);
+
+	/* Clear the priority override table. */
+	for (i = 0; i < 16; i++)
+		REG_WRITE(REG_GLOBAL2, 0x0f, 0x8000 | (i << 8));
+
+	/* @@@ initialise AVB (22/23) watchdog (27) sdet (29) registers */
+
+	return 0;
+}
+
+static int mv88e6352_setup_port(struct dsa_switch *ds, int p)
+{
+	int addr = REG_PORT(p);
+	u16 val;
+
+	/* MAC Forcing register: don't force link, speed, duplex
+	 * or flow control state to any particular values on physical
+	 * ports, but force the CPU port and all DSA ports to 1000 Mb/s
+	 * full duplex.
+	 */
+	if (dsa_is_cpu_port(ds, p) || ds->dsa_port_mask & (1 << p))
+		REG_WRITE(addr, 0x01, 0x003e);
+	else
+		REG_WRITE(addr, 0x01, 0x0003);
+
+	/* Do not limit the period of time that this port can be
+	 * paused for by the remote end or the period of time that
+	 * this port can pause the remote end.
+	 */
+	REG_WRITE(addr, 0x02, 0x0000);
+
+	/* Port Control: disable Drop-on-Unlock, disable Drop-on-Lock,
+	 * disable Header mode, enable IGMP/MLD snooping, disable VLAN
+	 * tunneling, determine priority by looking at 802.1p and IP
+	 * priority fields (IP prio has precedence), and set STP state
+	 * to Forwarding.
+	 *
+	 * If this is the CPU link, use DSA or EDSA tagging depending
+	 * on which tagging mode was configured.
+	 *
+	 * If this is a link to another switch, use DSA tagging mode.
+	 *
+	 * If this is the upstream port for this switch, enable
+	 * forwarding of unknown unicasts and multicasts.
+	 */
+	val = 0x0433;
+	if (dsa_is_cpu_port(ds, p)) {
+		if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA)
+			val |= 0x3300;
+		else
+			val |= 0x0100;
+	}
+	if (ds->dsa_port_mask & (1 << p))
+		val |= 0x0100;
+	if (p == dsa_upstream_port(ds))
+		val |= 0x000c;
+	REG_WRITE(addr, 0x04, val);
+
+	/* Port Control 1: disable trunking.  Also, if this is the
+	 * CPU port, enable learn messages to be sent to this port.
+	 */
+	REG_WRITE(addr, 0x05, dsa_is_cpu_port(ds, p) ? 0x8000 : 0x0000);
+
+	/* Port based VLAN map: give each port its own address
+	 * database, allow the CPU port to talk to each of the 'real'
+	 * ports, and allow each of the 'real' ports to only talk to
+	 * the upstream port.
+	 */
+	val = (p & 0xf) << 12;
+	if (dsa_is_cpu_port(ds, p))
+		val |= ds->phys_port_mask;
+	else
+		val |= 1 << dsa_upstream_port(ds);
+	REG_WRITE(addr, 0x06, val);
+
+	/* Default VLAN ID and priority: don't set a default VLAN
+	 * ID, and set the default packet priority to zero.
+	 */
+	REG_WRITE(addr, 0x07, 0x0000);
+
+	/* Port Control 2: don't force a good FCS, set the maximum
+	 * frame size to 10240 bytes, don't let the switch add or
+	 * strip 802.1q tags, don't discard tagged or untagged frames
+	 * on this port, do a destination address lookup on all
+	 * received packets as usual, disable ARP mirroring and don't
+	 * send a copy of all transmitted/received frames on this port
+	 * to the CPU.
+	 */
+	REG_WRITE(addr, 0x08, 0x2080);
+
+	/* Egress rate control: disable egress rate control. */
+	REG_WRITE(addr, 0x09, 0x0001);
+
+	/* Egress rate control 2: disable egress rate control. */
+	REG_WRITE(addr, 0x0a, 0x0000);
+
+	/* Port Association Vector: when learning source addresses
+	 * of packets, add the address to the address database using
+	 * a port bitmap that has only the bit for this port set and
+	 * the other bits clear.
+	 */
+	REG_WRITE(addr, 0x0b, 1 << p);
+
+	/* Port ATU control: disable limiting the number of address
+	 * database entries that this port is allowed to use.
+	 */
+	REG_WRITE(addr, 0x0c, 0x0000);
+
+	/* Priority Override: disable DA, SA and VTU priority override. */
+	REG_WRITE(addr, 0x0d, 0x0000);
+
+	/* Port Ethertype: use the Ethertype DSA Ethertype value. */
+	REG_WRITE(addr, 0x0f, ETH_P_EDSA);
+
+	/* Tag Remap: use an identity 802.1p prio -> switch prio
+	 * mapping.
+	 */
+	REG_WRITE(addr, 0x18, 0x3210);
+
+	/* Tag Remap 2: use an identity 802.1p prio -> switch prio
+	 * mapping.
+	 */
+	REG_WRITE(addr, 0x19, 0x7654);
+
+	return 0;
+}
+
+static int mv88e6352_setup(struct dsa_switch *ds)
+{
+	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+	int ret;
+	int i;
+
+	mutex_init(&ps->smi_mutex);
+	mutex_init(&ps->stats_mutex);
+	mutex_init(&ps->phy_mutex);
+
+	ps->id = REG_READ(REG_PORT(0), 0x03) & 0xfff0;
+
+	ret = mv88e6352_switch_reset(ds);
+	if (ret < 0)
+		return ret;
+
+	/* @@@ initialise vtu and atu */
+
+	ret = mv88e6352_setup_global(ds);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < 7; i++) {
+		ret = mv88e6352_setup_port(ds, i);
+		if (ret < 0)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int mv88e6352_port_to_phy_addr(int port)
+{
+	if (port >= 0 && port <= 4)
+		return port;
+	return -EINVAL;
+}
+
+static int
+mv88e6352_phy_read(struct dsa_switch *ds, int port, int regnum)
+{
+	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+	int addr = mv88e6352_port_to_phy_addr(port);
+	int ret;
+
+	if (addr < 0)
+		return addr;
+
+	mutex_lock(&ps->phy_mutex);
+	ret = __mv88e6352_phy_read(ds, addr, regnum);
+	mutex_unlock(&ps->phy_mutex);
+
+	return ret;
+}
+
+static int
+mv88e6352_phy_write(struct dsa_switch *ds, int port, int regnum, u16 val)
+{
+	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
+	int addr = mv88e6352_port_to_phy_addr(port);
+	int ret;
+
+	if (addr < 0)
+		return addr;
+
+	mutex_lock(&ps->phy_mutex);
+	ret = __mv88e6352_phy_write(ds, addr, regnum, val);
+	mutex_unlock(&ps->phy_mutex);
+
+	return ret;
+}
+
+static struct mv88e6xxx_hw_stat mv88e6352_hw_stats[] = {
+	{ "in_good_octets", 8, 0x00, },
+	{ "in_bad_octets", 4, 0x02, },
+	{ "in_unicast", 4, 0x04, },
+	{ "in_broadcasts", 4, 0x06, },
+	{ "in_multicasts", 4, 0x07, },
+	{ "in_pause", 4, 0x16, },
+	{ "in_undersize", 4, 0x18, },
+	{ "in_fragments", 4, 0x19, },
+	{ "in_oversize", 4, 0x1a, },
+	{ "in_jabber", 4, 0x1b, },
+	{ "in_rx_error", 4, 0x1c, },
+	{ "in_fcs_error", 4, 0x1d, },
+	{ "out_octets", 8, 0x0e, },
+	{ "out_unicast", 4, 0x10, },
+	{ "out_broadcasts", 4, 0x13, },
+	{ "out_multicasts", 4, 0x12, },
+	{ "out_pause", 4, 0x15, },
+	{ "excessive", 4, 0x11, },
+	{ "collisions", 4, 0x1e, },
+	{ "deferred", 4, 0x05, },
+	{ "single", 4, 0x14, },
+	{ "multiple", 4, 0x17, },
+	{ "out_fcs_error", 4, 0x03, },
+	{ "late", 4, 0x1f, },
+	{ "hist_64bytes", 4, 0x08, },
+	{ "hist_65_127bytes", 4, 0x09, },
+	{ "hist_128_255bytes", 4, 0x0a, },
+	{ "hist_256_511bytes", 4, 0x0b, },
+	{ "hist_512_1023bytes", 4, 0x0c, },
+	{ "hist_1024_max_bytes", 4, 0x0d, },
+};
+
+static void
+mv88e6352_get_strings(struct dsa_switch *ds, int port, uint8_t *data)
+{
+	mv88e6xxx_get_strings(ds, ARRAY_SIZE(mv88e6352_hw_stats),
+			      mv88e6352_hw_stats, port, data);
+}
+
+static void
+mv88e6352_get_ethtool_stats(struct dsa_switch *ds, int port, uint64_t *data)
+{
+	mv88e6xxx_get_ethtool_stats(ds, ARRAY_SIZE(mv88e6352_hw_stats),
+				    mv88e6352_hw_stats, port, data);
+}
+
+static int mv88e6352_get_sset_count(struct dsa_switch *ds)
+{
+	return ARRAY_SIZE(mv88e6352_hw_stats);
+}
+
+struct dsa_switch_driver mv88e6352_switch_driver = {
+	.tag_protocol		= DSA_TAG_PROTO_EDSA,
+	.priv_size		= sizeof(struct mv88e6xxx_priv_state),
+	.probe			= mv88e6352_probe,
+	.setup			= mv88e6352_setup,
+	.set_addr		= mv88e6xxx_set_addr_indirect,
+	.phy_read		= mv88e6352_phy_read,
+	.phy_write		= mv88e6352_phy_write,
+	.poll_link		= mv88e6xxx_poll_link,
+	.get_strings		= mv88e6352_get_strings,
+	.get_ethtool_stats	= mv88e6352_get_ethtool_stats,
+	.get_sset_count		= mv88e6352_get_sset_count,
+};
+
+MODULE_ALIAS("platform:mv88e6352");
diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index a6c90cf..8e1090b 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -507,6 +507,9 @@ static int __init mv88e6xxx_init(void)
 #if IS_ENABLED(CONFIG_NET_DSA_MV88E6123_61_65)
 	register_switch_driver(&mv88e6123_61_65_switch_driver);
 #endif
+#if IS_ENABLED(CONFIG_NET_DSA_MV88E6352)
+	register_switch_driver(&mv88e6352_switch_driver);
+#endif
 #if IS_ENABLED(CONFIG_NET_DSA_MV88E6171)
 	register_switch_driver(&mv88e6171_switch_driver);
 #endif
diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
index 5e5145a..c0ce133 100644
--- a/drivers/net/dsa/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx.h
@@ -37,6 +37,12 @@ struct mv88e6xxx_priv_state {
 	 */
 	struct mutex	stats_mutex;
 
+	/* This mutex serializes phy access for chips with
+	 * indirect phy addressing. It is unused for chips
+	 * with direct phy access.
+	 */
+	struct mutex	phy_mutex;
+
 	int		id; /* switch product id */
 };
 
@@ -70,6 +76,7 @@ void mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds,
 
 extern struct dsa_switch_driver mv88e6131_switch_driver;
 extern struct dsa_switch_driver mv88e6123_61_65_switch_driver;
+extern struct dsa_switch_driver mv88e6352_switch_driver;
 extern struct dsa_switch_driver mv88e6171_switch_driver;
 
 #define REG_READ(addr, reg)						\
-- 
1.9.1
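
For readers following the indirect PHY access in the patch above, this
is a small standalone sketch of how the command word written to global2
register 0x18 is composed. The constants 0x9800 (read), 0x9400 (write)
and the polled bit 0x8000 are taken from the patch; the toy_* names and
the 5-bit masking of the address and register number are assumptions
added here for illustration (the patch itself does not mask).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PHY_CMD_BUSY	0x8000	/* bit polled by mv88e6352_phy_wait() */
#define TOY_PHY_CMD_READ	0x9800	/* base value used by __mv88e6352_phy_read() */
#define TOY_PHY_CMD_WRITE	0x9400	/* base value used by __mv88e6352_phy_write() */

/* Build the command word: base | (addr << 5) | regnum. */
static uint16_t toy_phy_cmd(uint16_t base, int addr, int regnum)
{
	return (uint16_t)(base | ((addr & 0x1f) << 5) | (regnum & 0x1f));
}

/* True while the command register still shows the polled bit set. */
static int toy_phy_busy(uint16_t cmd_reg)
{
	return (cmd_reg & TOY_PHY_CMD_BUSY) != 0;
}

static void toy_phy_demo(void)
{
	/* Read register 2 of the PHY at address 3. */
	assert(toy_phy_cmd(TOY_PHY_CMD_READ, 3, 2) == 0x9862);
	/* Write register 0x1f of the PHY at address 4. */
	assert(toy_phy_cmd(TOY_PHY_CMD_WRITE, 4, 0x1f) == 0x949f);
	/* A freshly issued command has the polled bit set. */
	assert(toy_phy_busy(0x9862));
	assert(!toy_phy_busy(0x1862));
}
```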

^ permalink raw reply related

* [PATCH v3 06/15] net: dsa: Add support for reporting switch chip temperatures
From: Guenter Roeck @ 2014-10-29 17:44 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Florian Fainelli, Andrew Lunn, linux-kernel,
	Guenter Roeck
In-Reply-To: <1414604707-22407-1-git-send-email-linux@roeck-us.net>

Some switches provide chip temperature data.
Add support for reporting it through the hwmon subsystem.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v3:
- No change
v2:
- Updated headline to reflect what is reported, not how.
- Make added functionality optional with new Kconfig flag
- Register with hwmon subsystem as virtual device (no parent)
- Generate hwmon 'name' attribute from sanitized master network device
  name plus _dsaX, where X is the DSA switch index and 'sanitized'
  means keeping only the letters and digits of the network device name
- Do not use the devm_ function to register the hwmon device: the hwmon
  device has no parent device and thus cannot be auto-removed.

 include/net/dsa.h |  16 +++++++
 net/dsa/Kconfig   |  11 +++++
 net/dsa/dsa.c     | 131 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 158 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index b765592..55e75e7 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -139,6 +139,14 @@ struct dsa_switch {
 	 */
 	struct device		*master_dev;
 
+#ifdef CONFIG_NET_DSA_HWMON
+	/*
+	 * Hardware monitoring information
+	 */
+	char			hwmon_name[IFNAMSIZ + 8];
+	struct device		*hwmon_dev;
+#endif
+
 	/*
 	 * Slave mii_bus and devices for the individual ports.
 	 */
@@ -242,6 +250,14 @@ struct dsa_switch_driver {
 			   struct ethtool_eee *e);
 	int	(*get_eee)(struct dsa_switch *ds, int port,
 			   struct ethtool_eee *e);
+
+#ifdef CONFIG_NET_DSA_HWMON
+	/* Hardware monitoring */
+	int	(*get_temp)(struct dsa_switch *ds, int *temp);
+	int	(*get_temp_limit)(struct dsa_switch *ds, int *temp);
+	int	(*set_temp_limit)(struct dsa_switch *ds, int temp);
+	int	(*get_temp_alarm)(struct dsa_switch *ds, bool *alarm);
+#endif
 };
 
 void register_switch_driver(struct dsa_switch_driver *type);
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index a585fd6..5f8ac40 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -11,6 +11,17 @@ config NET_DSA
 
 if NET_DSA
 
+config NET_DSA_HWMON
+	bool "Distributed Switch Architecture HWMON support"
+	default y
+	depends on HWMON && !(NET_DSA=y && HWMON=m)
+	---help---
+	  Say Y if you want to expose thermal sensor data on switches supported
+	  by the Distributed Switch Architecture.
+
+	  Some of those switches contain thermal sensors. This data is available
+	  via the hwmon sysfs interface and exposes the onboard sensors.
+
 # tagging formats
 config NET_DSA_TAG_BRCM
 	bool
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 22f34cf..5edbbca 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -9,6 +9,9 @@
  * (at your option) any later version.
  */
 
+#include <linux/ctype.h>
+#include <linux/device.h>
+#include <linux/hwmon.h>
 #include <linux/list.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -17,6 +20,7 @@
 #include <linux/of.h>
 #include <linux/of_mdio.h>
 #include <linux/of_platform.h>
+#include <linux/sysfs.h>
 #include "dsa_priv.h"
 
 char dsa_driver_version[] = "0.1";
@@ -71,6 +75,104 @@ dsa_switch_probe(struct device *host_dev, int sw_addr, char **_name)
 	return ret;
 }
 
+/* hwmon support ************************************************************/
+
+#ifdef CONFIG_NET_DSA_HWMON
+
+static ssize_t temp1_input_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct dsa_switch *ds = dev_get_drvdata(dev);
+	int temp, ret;
+
+	ret = ds->drv->get_temp(ds, &temp);
+	if (ret < 0)
+		return ret;
+
+	return sprintf(buf, "%d\n", temp * 1000);
+}
+static DEVICE_ATTR_RO(temp1_input);
+
+static ssize_t temp1_max_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct dsa_switch *ds = dev_get_drvdata(dev);
+	int temp, ret;
+
+	ret = ds->drv->get_temp_limit(ds, &temp);
+	if (ret < 0)
+		return ret;
+
+	return sprintf(buf, "%d\n", temp * 1000);
+}
+
+static ssize_t temp1_max_store(struct device *dev,
+			       struct device_attribute *attr, const char *buf,
+			       size_t count)
+{
+	struct dsa_switch *ds = dev_get_drvdata(dev);
+	int temp, ret;
+
+	ret = kstrtoint(buf, 0, &temp);
+	if (ret < 0)
+		return ret;
+
+	ret = ds->drv->set_temp_limit(ds, DIV_ROUND_CLOSEST(temp, 1000));
+	if (ret < 0)
+		return ret;
+
+	return count;
+}
+static DEVICE_ATTR(temp1_max, S_IRUGO, temp1_max_show, temp1_max_store);
+
+static ssize_t temp1_max_alarm_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct dsa_switch *ds = dev_get_drvdata(dev);
+	bool alarm;
+	int ret;
+
+	ret = ds->drv->get_temp_alarm(ds, &alarm);
+	if (ret < 0)
+		return ret;
+
+	return sprintf(buf, "%d\n", alarm);
+}
+static DEVICE_ATTR_RO(temp1_max_alarm);
+
+static struct attribute *dsa_hwmon_attrs[] = {
+	&dev_attr_temp1_input.attr,	/* 0 */
+	&dev_attr_temp1_max.attr,	/* 1 */
+	&dev_attr_temp1_max_alarm.attr,	/* 2 */
+	NULL
+};
+
+static umode_t dsa_hwmon_attrs_visible(struct kobject *kobj,
+				       struct attribute *attr, int index)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct dsa_switch *ds = dev_get_drvdata(dev);
+	struct dsa_switch_driver *drv = ds->drv;
+	umode_t mode = attr->mode;
+
+	if (index == 1) {
+		if (!drv->get_temp_limit)
+			mode = 0;
+		else if (drv->set_temp_limit)
+			mode |= S_IWUSR;
+	} else if (index == 2 && !drv->get_temp_alarm) {
+		mode = 0;
+	}
+	return mode;
+}
+
+static const struct attribute_group dsa_hwmon_group = {
+	.attrs = dsa_hwmon_attrs,
+	.is_visible = dsa_hwmon_attrs_visible,
+};
+__ATTRIBUTE_GROUPS(dsa_hwmon);
+
+#endif /* CONFIG_NET_DSA_HWMON */
 
 /* basic switch operations **************************************************/
 static struct dsa_switch *
@@ -225,6 +327,31 @@ dsa_switch_setup(struct dsa_switch_tree *dst, int index,
 		ds->ports[i] = slave_dev;
 	}
 
+#ifdef CONFIG_NET_DSA_HWMON
+	/* If the switch provides a temperature sensor,
+	 * register with hardware monitoring subsystem.
+	 * Treat registration error as non-fatal and ignore it.
+	 */
+	if (drv->get_temp) {
+		const char *netname = netdev_name(dst->master_netdev);
+		char hname[IFNAMSIZ + 1];
+		int i, j;
+
+		/* Create valid hwmon 'name' attribute */
+		for (i = j = 0; i < IFNAMSIZ && netname[i]; i++) {
+			if (isalnum(netname[i]))
+				hname[j++] = netname[i];
+		}
+		hname[j] = '\0';
+		scnprintf(ds->hwmon_name, sizeof(ds->hwmon_name), "%s_dsa%d",
+			  hname, index);
+		ds->hwmon_dev = hwmon_device_register_with_groups(NULL,
+					ds->hwmon_name, ds, dsa_hwmon_groups);
+		if (IS_ERR(ds->hwmon_dev))
+			ds->hwmon_dev = NULL;
+	}
+#endif /* CONFIG_NET_DSA_HWMON */
+
 	return ds;
 
 out_free:
@@ -236,6 +363,10 @@ out:
 
 static void dsa_switch_destroy(struct dsa_switch *ds)
 {
+#ifdef CONFIG_NET_DSA_HWMON
+	if (ds->hwmon_dev)
+		hwmon_device_unregister(ds->hwmon_dev);
+#endif
 }
 
 #ifdef CONFIG_PM_SLEEP
-- 
1.9.1
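
The hwmon name generation from the v2 notes can be tried out in
userspace with the sketch below. It mirrors the loop in
dsa_switch_setup() from the patch, with snprintf() standing in for the
kernel's scnprintf(); toy_hwmon_name() and TOY_IFNAMSIZ are hypothetical
names, and 16 is the usual value of IFNAMSIZ on Linux.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define TOY_IFNAMSIZ 16

/* Keep only letters and digits of the interface name, then append
 * "_dsa" and the switch index.
 */
static void toy_hwmon_name(const char *netname, int index,
			   char *out, size_t outlen)
{
	char hname[TOY_IFNAMSIZ + 1];
	int i, j;

	for (i = j = 0; i < TOY_IFNAMSIZ && netname[i]; i++) {
		if (isalnum((unsigned char)netname[i]))
			hname[j++] = netname[i];
	}
	hname[j] = '\0';
	snprintf(out, outlen, "%s_dsa%d", hname, index);
}

static void toy_hwmon_demo(void)
{
	char buf[TOY_IFNAMSIZ + 8];

	toy_hwmon_name("eth0", 0, buf, sizeof(buf));
	assert(strcmp(buf, "eth0_dsa0") == 0);

	/* Characters such as '-' and '.' are dropped. */
	toy_hwmon_name("br-lan", 2, buf, sizeof(buf));
	assert(strcmp(buf, "brlan_dsa2") == 0);

	toy_hwmon_name("enp0s25.100", 1, buf, sizeof(buf));
	assert(strcmp(buf, "enp0s25100_dsa1") == 0);
}
```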

^ permalink raw reply related
