* [PATCH net-next 0/2] pskb_extract() helper function.
From: Sowmini Varadhan @ 2016-04-20 10:17 UTC
To: netdev, rds-devel, santosh.shilimkar, davem
Cc: sowmini.varadhan, eric.dumazet, marcelo.leitner
This patchset follows up on the discussion in
https://www.mail-archive.com/netdev@vger.kernel.org/msg105090.html
For RDS-TCP, we have to deal with the full gamut of
nonlinear sk_buffs, including all the frag_list variants.
Also, the parent skb has to remain unchanged, while the clone
is queued for Rx on the PF_RDS socket.
Patch 1 of this patchset adds a pskb_extract() function that
does all this without the redundant memcpy's in pskb_expand_head()
and __pskb_pull_tail().
Sowmini Varadhan (2):
Add pskb_extract() helper function
Call pskb_extract() helper function
include/linux/skbuff.h | 2 +
net/core/skbuff.c | 248 ++++++++++++++++++++++++++++++++++++++++++++++++
net/rds/tcp_recv.c | 14 +--
3 files changed, 253 insertions(+), 11 deletions(-)
* Re: [PATCH net-next 1/4] netlink: fix test alignment in nla_align_64bit()
From: Nicolas Dichtel @ 2016-04-20 10:14 UTC
To: Eric Dumazet; +Cc: netdev, davem, roopa, tgraf, jhs
In-Reply-To: <1461146278.10638.253.camel@edumazet-glaptop3.roam.corp.google.com>
On 20/04/2016 11:57, Eric Dumazet wrote:
> On Wed, 2016-04-20 at 11:44 +0200, Nicolas Dichtel wrote:
>> On 20/04/2016 11:33, Eric Dumazet wrote:
>> [snip]
>>> How have you tested your patch exactly ?
>> As stated in the cover letter, I didn't test it.
>
>
> You certainly can test this, by tweaking HAVE_EFFICIENT_UNALIGNED_ACCESS
> and adding another assertion in the code.
>
> By testing it you would have caught a real bug, since David incorrectly
> used HAVE_EFFICIENT_UNALIGNED_ACCESS instead of
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>
> ;)
Hehe, good catch :)
* Re: [PATCH net-next 1/4] netlink: fix test alignment in nla_align_64bit()
From: Eric Dumazet @ 2016-04-20 9:57 UTC
To: nicolas.dichtel; +Cc: netdev, davem, roopa, tgraf, jhs
In-Reply-To: <57174F8E.6050201@6wind.com>
On Wed, 2016-04-20 at 11:44 +0200, Nicolas Dichtel wrote:
> On 20/04/2016 11:33, Eric Dumazet wrote:
> [snip]
> > How have you tested your patch exactly ?
> As stated in the cover letter, I didn't test it.
You certainly can test this, by tweaking HAVE_EFFICIENT_UNALIGNED_ACCESS
and adding another assertion in the code.
By testing it you would have caught a real bug, since David incorrectly
used HAVE_EFFICIENT_UNALIGNED_ACCESS instead of
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
;)
diff --git a/include/net/netlink.h b/include/net/netlink.h
index e644b3489acf..ea6872633a92 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -1244,7 +1244,7 @@ static inline int nla_validate_nested(const struct nlattr *start, int maxtype,
  */
 static inline int nla_align_64bit(struct sk_buff *skb, int padattr)
 {
-#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 	if (IS_ALIGNED((unsigned long)skb->data, 8)) {
 		struct nlattr *attr = nla_reserve(skb, padattr, 0);
 		if (!attr)
@@ -1261,7 +1261,7 @@ static inline int nla_align_64bit(struct sk_buff *skb, int padattr)
 static inline int nla_total_size_64bit(int payload)
 {
 	return NLA_ALIGN(nla_attr_size(payload))
-#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 		+ NLA_ALIGN(nla_attr_size(0))
 #endif
 		;
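As a side note on why the misspelled macro went unnoticed: `#ifndef` on a name that is never defined is simply always true, so the padding branch is compiled in on every architecture without any compiler diagnostic. The following is a minimal userspace sketch of that effect, not kernel code. The two function names are made up for the illustration, and the `#define` is an assumption standing in for what Kconfig would generate on an architecture with efficient unaligned access.

```c
/* Assumption: stand-in for the Kconfig-generated symbol on an
 * architecture that selects HAVE_EFFICIENT_UNALIGNED_ACCESS. */
#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1

/* Hypothetical helper: tests the name without the CONFIG_ prefix.
 * That name is never defined, so the padding branch is always taken. */
static int pad_compiled_in_buggy(void)
{
#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
	return 1;	/* padding code is compiled in */
#else
	return 0;
#endif
}

/* Hypothetical helper: tests the name that is actually defined. */
static int pad_compiled_in_fixed(void)
{
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
	return 1;
#else
	return 0;
#endif
}
```

With the define above in place, the first helper still reports that padding is compiled in, while the second reports that it is not.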
* Re: [PATCH net-next 0/2] act_bpf, cls_bpf: send eBPF bytecode through
From: Daniel Borkmann @ 2016-04-20 9:55 UTC
To: Quentin Monnet; +Cc: Alexei Starovoitov, netdev
In-Reply-To: <57172ED3.30101@6wind.com>
Hi Quentin,
On 04/20/2016 09:25 AM, Quentin Monnet wrote:
> 2016-04-15 (11:44 UTC-0700) ~ Alexei Starovoitov:
>> On Fri, Apr 15, 2016 at 12:41:05PM +0200, Daniel Borkmann wrote:
>>> On 04/15/2016 12:07 PM, Quentin Monnet wrote:
>>>> When a new BPF traffic control filter or action is set up with tc, the
>>>> bytecode is sent back to userspace through a netlink socket for cBPF, but
>>>> not for eBPF (the file descriptor pointing to the object file containing
>>>> the bytecode is sent instead).
>>>>
>>>> This patch makes cls_bpf and act_bpf modules send the bytecode for eBPF as
>>>> well (in addition to the file descriptor).
>>>>
> […]
>>>
>>> Thanks for working on this, but it's unfortunately not that easy. Let
>>> me ask, what would be the intended use-case to dump the insns?
>>
>> +1
>>
>>> I'm asking because if you dump them as-is, then a reinject at a later
>>> time of that bytecode back into the kernel will most likely be rejected
>>> by the verifier.
>>>
>>> This is because on load time, verifier does rewrites/expansion on some
>>> of the insns (f.e. map pointers, helper functions, ctx access etc, see
>>> also appendix in [1]), so the code as seen in the kernel would need to
>>> be sanitized first.
>>
>> +1
>> we had similar discussion about this in seccomp context and decided that
>> the only sensible way is to keep original instructions, but it's wasteful
>> to do unconditionally and snapshotting of maps is not possible,
>> so there was no use for such dumping facility other than debugging.
>> Is it what the patch after?
>> We need to discuss it in the proper context.
>
> I am experimenting with BPF, and so far I was just trying to dump the
> bytecode sent from tc to the kernel. I had not realized that the
> verifier would bring some changes to the instructions. And I agree that
> a more comprehensive debugging solution could be obtained if I can find
> some way to get a snapshot of the maps.
>
>>> Also, how would you make sense/transform maps into a meaningful
>>> representation (probably possible to find a scheme when they are pinned)?
>>>
>>> Another possibility is that such programs need to be pinned (can be done
>>> easily by tc in the background) and then implement a CRIU facility into
>>> the bpf(2) syscall to retrieve them. tc could make use of this w/o too
>>> much effort, and at the same time it would help CRIU folks, too. It
>>> also seems cleaner to have only one central api (bpf(2)) to dump them,
>>> but needs a bit of thought.
>>
>> +1
>> any debugging or criu needs to be done in a centralized way via syscall
>> and/or bpffs.
>
> Maintaining a central API around bpf() makes sense to me. I have been
> looking at the BPF filesystem to see what information I can obtain from
> it, but I did not understand it well. I read the logs of Daniel's commit
> b2197755b263 (“bpf: add support for persistent maps/progs”), but I am
> unsure how I could use it in order to gather data about the maps and
> programs (if this is possible at all). I tried to set up some BPF
Currently, there's not yet much information to extract. For example, if you look at
the tc source code, we do bpf_map_selfcheck_pinned() from fdinfo to check whether
the map fd that we got from the pinned one matches the one from the object
file. But obviously more work is needed for extraction of bytecode as in your
case.
Haven't thought much about it yet, but one idea could be that tc also pins
programs, then sends some kind of annotation down to cls_bpf where on filter
dump tc could retrieve the path to the pinned program again, then uses bpf(2)
with BPF_OBJ_GET to get the fd, and a new command e.g. BPF_PROG_DUMP to extract
bytecode/map info from the running program and dumps it to the user in a way
where some sense can be made out of it from admin/user perspective (in other
words, not just raw opcodes I mean).
BPF_PROG_DUMP could have auxiliary information with map specs, similar to the
way we retrieve them as relo entries from the object file in
the loader, and in addition some information on where to retrieve the maps in
case they were pinned. This still doesn't give you an entire snapshot of the
map, but would at least allow you to iterate over the pinned ones
via bpf(2) with BPF_MAP_GET_NEXT_KEY, plus in general it would allow you to
reload the program.
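For the part of this that already exists today, here is a rough sketch of the two bpf(2) calls involved: BPF_OBJ_GET to turn a pinned path into an fd, and BPF_MAP_GET_NEXT_KEY to step through a map's keys. BPF_PROG_DUMP is only a proposal in this thread and is deliberately not shown. The wrapper names (sys_bpf, bpf_obj_get_fd, bpf_map_next_key) are made up for the illustration, and the caller is assumed to know the map's key size.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw syscall (hypothetical helper name). */
static int sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
	return (int)syscall(__NR_bpf, cmd, attr, size);
}

/* Get an fd for an object pinned in the BPF filesystem.
 * Returns the fd, or -1 with errno set. */
static int bpf_obj_get_fd(const char *path)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.pathname = (uint64_t)(uintptr_t)path;
	return sys_bpf(BPF_OBJ_GET, &attr, sizeof(attr));
}

/* Fetch the key that follows `key` in the map behind map_fd.
 * Both buffers must be key_size bytes for that map.
 * Returns 0 on success, -1 with errno set (ENOENT after the last key). */
static int bpf_map_next_key(int map_fd, const void *key, void *next_key)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.map_fd = (uint32_t)map_fd;
	attr.key = (uint64_t)(uintptr_t)key;
	attr.next_key = (uint64_t)(uintptr_t)next_key;
	return sys_bpf(BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
}
```

A caller would open the pinned map with bpf_obj_get_fd() on whatever path tc pinned it under, then call bpf_map_next_key() in a loop, feeding each returned key back in until it fails with ENOENT. How to obtain the first key depends on the kernel version, so that part is left to the caller.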
There's still the issue of the additional memory overhead to keep the original
insns around, as Alexei mentioned. Two things come to mind. One is
that when JITing was successful, we could actually try to shrink struct bpf_prog
again since we work on a different image, but that doesn't address the case
where the JIT is not used. The other is to perhaps only keep a 'diff' around
in orig_prog from which we can patch insns back to the original; probably possible,
but it needs a bit of work.
> filters working with maps, but I could not find any file under
> /sys/fs/bpf/tc.
There are some getting started examples under examples/bpf/ in the iproute2
repo, f.e. bpf_shared.c is one.
> Would you have a pointer to some documentation about this filesystem? Or
> is there only the kernel code?
Yeah, b2197755b263 and 42984d7c1e56, and in my netdev1.1 paper I tried to put
more extensive information, but seems the proceedings haven't been published
yet. I can send you a private copy until they are officially released I guess.
Thanks,
Daniel
* drop all fragments inside tx queue if one gets dropped
From: Alexander Aring @ 2016-04-20 9:52 UTC
To: netdev; +Cc: linux-wpan
Hi,
On linux-wpan we had a discussion about setting the right tx_queue_len
and ran into some issues in 802.15.4 6LoWPAN networks.
Our hardware parameters are:
- Bandwidth: 250kb/s
- One frame buffer on the hardware side for transmitting a frame.
- MTU: 127 bytes (without MAC headers)
To provide 6LoWPAN (IPv6) on such an interface, we have two interfaces:
a wpan interface (which works at the 802.15.4 layer and has a queue) and
a lowpan interface (which takes IPv6 packets and queues 6LoWPAN frames into
the wpan interface; it is a virtual interface and has no queue).
If an IPv6 packet needs fragmentation (mostly when the payload does not fit
into 127 bytes), we have the following situation:
- 6lowpan interface gets IPv6 packet:
- generate 6LoWPAN fragments
- dev_queue_xmit(wpan_dev, frag1)
- dev_queue_xmit(wpan_dev, frag2)
- dev_queue_xmit(wpan_dev, frag3)
- dev_queue_xmit(wpan_dev, ...)
Then a lot of fragments are lying in the tx_queue, waiting to be
transferred to the transceiver, which has only one frame buffer, transmits
one frame at a time and waits for tx completion before taking the next one.
My question is: if the qdisc drops a fragment because the queue is full
(or for some other reason), is there a way to remove all related fragments
from the queue? If one fragment is dropped and all the related ones are
still in the queue, then we mostly send garbage.
As a first step, I want to add behaviour that drops all related fragments
for 6LoWPAN fragmentation. If the payload is above 1280 bytes, we also
have IPv6 fragmentation on top of it; in the future I would also like to
remove all 6LoWPAN fragments that belong to the related IPv6
fragments.
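To make the wanted behaviour concrete, here is a small userspace model of it, not a qdisc implementation. It assumes each queued frame can be identified by the RFC 4944 datagram tag shared by all fragments of one datagram. The struct and function names are hypothetical, and the offset field is only there to tell fragments apart in the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one queued 6LoWPAN fragment. */
struct frag {
	uint16_t tag;		/* datagram tag shared by all fragments of a datagram */
	uint16_t offset;	/* position of this fragment within the datagram */
};

/* Remove every queued fragment that carries `tag`, compacting the queue in
 * place and preserving the order of the remaining entries.
 * Returns the number of fragments removed and updates *qlen. */
static size_t purge_datagram(struct frag *q, size_t *qlen, uint16_t tag)
{
	size_t kept = 0;
	size_t removed;
	size_t i;

	for (i = 0; i < *qlen; i++) {
		if (q[i].tag != tag)
			q[kept++] = q[i];
	}
	removed = *qlen - kept;
	*qlen = kept;
	return removed;
}
```

In this model, the enqueue path would call purge_datagram() with the tag of the fragment it just had to drop, so that the remaining fragments of that datagram do not occupy airtime on the 250kb/s link.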
- Alex
* Re: [PATCH net-next 1/4] netlink: fix test alignment in nla_align_64bit()
From: Nicolas Dichtel @ 2016-04-20 9:44 UTC
To: Eric Dumazet; +Cc: netdev, davem, roopa, tgraf, jhs
In-Reply-To: <1461144802.10638.249.camel@edumazet-glaptop3.roam.corp.google.com>
On 20/04/2016 11:33, Eric Dumazet wrote:
[snip]
> How have you tested your patch exactly ?
As stated in the cover letter, I didn't test it.
>
> I guess David should have copied his original comment here.
>
> - * The nlattr header is 4 bytes in size, that's why we test
> - * if the skb->data _is_ aligned. This NOP attribute, plus
> - * nlattr header for IFLA_STATS64, will make nla_data() 8-byte
> - * aligned.
>
>
I knew I was missing something, thanks for the explanation.
All other patches of this series need to be updated, I will do it if there is
no other comment.
* Re: [PATCH 4/4] drm/i915: Move ioremap_wc tracking onto VMA
From: Chris Wilson @ 2016-04-20 9:38 UTC
To: Luis R. Rodriguez
Cc: Tvrtko Ursulin, netdev, intel-gfx, linux-kernel, Ingo Molnar,
Peter Zijlstra (Intel), dri-devel, linux-rdma, Daniel Vetter,
Dan Williams, Yishai Hadas, David Hildenbrand
In-Reply-To: <20160420091054.GL1990@wotan.suse.de>
On Wed, Apr 20, 2016 at 11:10:54AM +0200, Luis R. Rodriguez wrote:
> On Tue, Apr 19, 2016 at 01:33:58PM +0100, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 6ce2c31b9a81..9ef47329e8ae 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3346,6 +3346,15 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> > old_write_domain);
> > }
> >
> > +static void __i915_vma_iounmap(struct i915_vma *vma)
> > +{
> > + if (vma->iomap == NULL)
> > + return;
> > +
> > + io_mapping_unmap(vma->iomap);
>
> The NULL check could just be done by io_mapping_unmap() then you
> can avoid this in other drivers too.
>
> > + vma->iomap = NULL;
>
> You added accounting here, by simple int and inc / dec'ing it.
> I cannot confirm if it is correctly avoiding races, can you
> confirm?
Yes, the vma->pin_count is guarded by the struct_mutex atm. (The
struct_mutex is our own BKL :( )
> Also you added accounting for the custom vma pinning thing and do
> GEM_BUG_ON(vma->pin_count == 0); when you unpin one instance but *you do not*
> do something like GEM_BUG_ON(vma->pin_count != 0); when you do the final full
> iounmap. That seems rather sloppy.
It's placed next to the function where pin_count == 0 and only called
from it. Yes, I did think the same...
> iomapping stuff has its own custom data structure, why not just use that data
> structure instead of the struct i915_vma and generalize this ? Drivers can
> be buggy and best if we avoid custom driver accounting and just do it in a neat
> generic fashion.
Completely different tasks, as far as I am aware. The iomapping is about
providing CPU access to the IO region, dma-remapping about providing
device access to physical memory, and our own VMA is about how the
object sits in all the different views of both CPU and device address
spaces (of which there are many, and even the CPU accessible address
space is not the entirety of that particular address space).
> Then other drivers could use this too.
drivers/gpu/drm/ttm (you didn't hear me say that...)
> > diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> > index 79ac202f3870..93f54a10042f 100644
> > --- a/drivers/gpu/drm/i915/intel_fbdev.c
> > +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> > @@ -244,22 +245,23 @@ static int intelfb_create(struct drm_fb_helper *helper,
> > info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
> > info->fbops = &intelfb_ops;
> >
> > + vma = i915_gem_obj_to_ggtt(obj);
> > +
> > /* setup aperture base/size for vesafb takeover */
> > info->apertures->ranges[0].base = dev->mode_config.fb_base;
> > info->apertures->ranges[0].size = ggtt->mappable_end;
> >
> > - info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_ggtt_offset(obj);
> > - info->fix.smem_len = size;
> > + info->fix.smem_start = dev->mode_config.fb_base + vma->node.start;
> > + info->fix.smem_len = vma->node.size;
> >
> > - info->screen_base =
> > - ioremap_wc(ggtt->mappable_base + i915_gem_obj_ggtt_offset(obj),
> > - size);
> > - if (!info->screen_base) {
> > + vaddr = i915_vma_pin_iomap(vma);
> > + if (IS_ERR(vaddr)) {
> > DRM_ERROR("Failed to remap framebuffer into virtual memory\n");
> > - ret = -ENOSPC;
> > + ret = PTR_ERR(vaddr);
> > goto out_destroy_fbi;
> > }
> > - info->screen_size = size;
> > + info->screen_base = vaddr;
> > + info->screen_size = vma->node.size;
>
> some framebuffer drivers tend to use a generic start address of
> iinfo->fix.smem_start and a length of info->fix.smem_len, this
> driver sets the smem_start above, but its different than what
> gets ioremap for a start address:
>
> + ptr = io_mapping_map_wc(i915_vm_to_ggtt(vma->vm)->mappable,
> + vma->node.start,
> + vma->node.size);
>
> fix.smem_start is :
>
>
> > + info->fix.smem_start = dev->mode_config.fb_base + vma->node.start;
>
> The smem_len matches though. Can you clarify if its correct for
> the io_mapping_map_wc() should not be using info->fix.smem_start
> (which is dev->mode_config.fb_base + vma->node.start)?
dev->mode_config.fb_base is the base address of the mappable region. It
is an inconsistency in naming that just hasn't annoyed me enough to
fix.
> Reason I ask is since I noticed a while ago a lot of drivers
> were using info->fix.smem_start and info->fix.smem_len consistently
> for their ioremap'd areas it might make sense instead to let the
> internal framebuffer (register_framebuffer()) optionally manage the
> ioremap_wc() for drivers, given that this is pretty generic stuff.
Apart from drivers like ours we would end up with multiple mappings to
the same region. It was just a little grievance that I think was worth
fixing. It does highlight how buggy our code is currently though as we
never relinquish that mapping when the driver is unloaded.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH v2 1/1] drivers: net: cpsw: Prevent NUll pointer dereference with two PHYs
From: Grygorii Strashko @ 2016-04-20 9:37 UTC
To: David Rivshin (Allworx), David Miller
Cc: andrew.goodbody, netdev, linux-kernel, linux-omap, mugunthanvnm,
tony
In-Reply-To: <20160419193808.29dc51c3.drivshin.allworx@gmail.com>
On 04/20/2016 02:38 AM, David Rivshin (Allworx) wrote:
> On Tue, 19 Apr 2016 18:43:39 -0400 (EDT)
> David Miller <davem@davemloft.net> wrote:
>
>> From: Grygorii Strashko <grygorii.strashko@ti.com>
>> Date: Tue, 19 Apr 2016 21:44:09 +0300
>>
>>> May be you can send revert + your patch 1 (only fix for this issue).
>>>
>>> Dave, Does that sound good to you?
>>
>> Sure.
>
> OK, I will hopefully have that ready tomorrow evening.
>
> Anything in particular I should mention in the revert commit message?
> I didn't find a discussion on it, other than the earlier statement that
> "it breaks boot on many TI boards".
>
Issues of this kind are now tracked automatically for ARM through https://kernelci.org
Example:
https://storage.kernelci.org/next/next-20160419/arm-multi_v7_defconfig+CONFIG_EFI=y/lab-cambridge/boot-am335x-boneblack.txt
We are also trying to check this automatically internally:
https://github.com/nmenon/kernel-test-logs
Below is my boot log:
ata1: SATA max UDMA/133 mmio [mem 0x4a140000-0x4a1410ff] port 0x100 irq 332
mtdoops: mtd device (mtddev=name/number) must be supplied
davinci_mdio 48485000.mdio: davinci mdio revision 1.6
davinci_mdio 48485000.mdio: detected phy mask fffffff9
libphy: 48485000.mdio: probed
davinci_mdio 48485000.mdio: phy[1]: device 48485000.mdio:01, driver unknown
davinci_mdio 48485000.mdio: phy[2]: device 48485000.mdio:02, driver unknown
cpsw 48484000.ethernet: Detected MACID = 74:da:ea:47:7d:9c
cpsw 48484000.ethernet: cpsw: Random MACID = ca:39:e7:f3:28:69
Unable to handle kernel paging request at virtual address fffffffc
pgd = c0004000
[fffffffc] *pgd=affae861, *pte=00000000, *ppte=00000000
Internal error: Oops: 37 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 4.6.0-rc4-next-20160419-00002-g94c3d2a #4
Hardware name: Generic DRA74X (Flattened Device Tree)
task: ee0c4d80 ti: ee0c6000 task.ti: ee0c6000
PC is at kset_find_obj+0x28/0xa4
LR is at kset_find_obj+0x3c/0xa4
pc : [<c04814b4>] lr : [<c04814c8>] psr: a0000013
sp : ee0c7ec8 ip : 00000000 fp : c0b005a0
[<c04814b4>] (kset_find_obj) from [<c0523780>] (driver_find+0x14/0x30)
ata1: SATA link down (SStatus 0 SControl 300)
[<c0523780>] (driver_find) from [<c0523804>] (driver_register+0x68/0xf8)
[<c0523804>] (driver_register) from [<c010185c>] (do_one_initcall+0x3c/0x178)
[<c010185c>] (do_one_initcall) from [<c0b00e94>] (kernel_init_freeable+0x218/0x2e8)
[<c0b00e94>] (kernel_init_freeable) from [<c07a4bc8>] (kernel_init+0x8/0x118)
[<c07a4bc8>] (kernel_init) from [<c01078d0>] (ret_from_fork+0x14/0x24)
Code: e5954000 e1550004 e2444004 0a00000a (e5943000)
---[ end trace 593c492662ed437e ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU0: stopping
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D W 4.6.0-rc4-next-20160419-00002-g94c3d2a #4
Hardware name: Generic DRA74X (Flattened Device Tree)
[<c01101dc>] (unwind_backtrace) from [<c010c2a4>] (show_stack+0x10/0x14)
[<c010c2a4>] (show_stack) from [<c047f01c>] (dump_stack+0xb0/0xe4)
[<c047f01c>] (dump_stack) from [<c010e7cc>] (handle_IPI+0x2c4/0x360)
[<c010e7cc>] (handle_IPI) from [<c0101574>] (gic_handle_irq+0x84/0x94)
[<c0101574>] (gic_handle_irq) from [<c07abc78>] (__irq_svc+0x58/0x78)
Exception stack(0xc0c01f58 to 0xc0c01fa0)
1f40: c0108318 00000000
1f60: 00000000 00000000 c0c02a34 00000000 00000000 c0cbe5f8 c0c029cc c0cbe5f8
1f80: c0b71a38 c0cbd3a9 00000000 c0c01fa8 c0108318 c010831c 60000013 ffffffff
[<c07abc78>] (__irq_svc) from [<c010831c>] (arch_cpu_idle+0x20/0x3c)
[<c010831c>] (arch_cpu_idle) from [<c0184c58>] (cpu_startup_entry+0x30c/0x3f0)
[<c0184c58>] (cpu_startup_entry) from [<c0b00c04>] (start_kernel+0x33c/0x3b4)
[<c0b00c04>] (start_kernel) from [<8000807c>] (0x8000807c)
--
regards,
-grygorii
* Re: [PATCH net-next 1/4] netlink: fix test alignment in nla_align_64bit()
From: Eric Dumazet @ 2016-04-20 9:33 UTC
To: Nicolas Dichtel; +Cc: netdev, davem, roopa, tgraf, jhs
In-Reply-To: <1461142655-5067-2-git-send-email-nicolas.dichtel@6wind.com>
On Wed, 2016-04-20 at 10:57 +0200, Nicolas Dichtel wrote:
> IS_ALIGN() returns true when the alignment is as expected. The pad
> attribute should be added only when the alignment is not 8.
>
> Fixes: 35c5845957c7 ("net: Add helpers for 64-bit aligning netlink attributes.")
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---
> include/net/netlink.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/net/netlink.h b/include/net/netlink.h
> index e644b3489acf..694caac31d2c 100644
> --- a/include/net/netlink.h
> +++ b/include/net/netlink.h
> @@ -1245,7 +1245,7 @@ static inline int nla_validate_nested(const struct nlattr *start, int maxtype,
> static inline int nla_align_64bit(struct sk_buff *skb, int padattr)
> {
> #ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
> - if (IS_ALIGNED((unsigned long)skb->data, 8)) {
> + if (!IS_ALIGNED((unsigned long)skb->data, 8)) {
> struct nlattr *attr = nla_reserve(skb, padattr, 0);
> if (!attr)
> return -EMSGSIZE;
This is silly.
How have you tested your patch exactly ?
I guess David should have copied his original comment here.
- * The nlattr header is 4 bytes in size, that's why we test
- * if the skb->data _is_ aligned. This NOP attribute, plus
- * nlattr header for IFLA_STATS64, will make nla_data() 8-byte
- * aligned.
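The arithmetic behind that comment can be checked with a small userspace sketch. It assumes, as netlink guarantees, that the attribute stream is already 4-byte aligned, so an offset is either 0 or 4 modulo 8. The helper names are invented for the illustration; only the 4-byte header size comes from the comment above.

```c
#include <stdbool.h>
#include <stddef.h>

#define NLA_HDR_SIZE 4	/* size of the nlattr header, per the comment above */

/* A pad attribute is wanted when the next header would start on an
 * 8-byte boundary, because its payload would then land at offset % 8 == 4. */
static bool pad_needed(size_t off)
{
	return off % 8 == 0;
}

/* Offset at which the 64-bit payload starts once the optional
 * zero-length pad attribute (header only) has been emitted.
 * Assumes `off` is a multiple of 4. */
static size_t payload_offset_after_align(size_t off)
{
	if (pad_needed(off))
		off += NLA_HDR_SIZE;	/* pad attribute: 4-byte header, no payload */
	return off + NLA_HDR_SIZE;	/* header of the real attribute */
}
```

Starting from offset 16 the pad is emitted and the payload begins at 24; starting from offset 20 no pad is emitted and the payload also begins at 24. With the test inverted, as in the patch under review, the two cases would produce payloads at 20 and 28, both misaligned.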
* Re: [RFC PATCH v3 net-next 1/3] tcp: Make use of MSG_EOR in tcp_sendmsg and tcp_sendpage
From: Eric Dumazet @ 2016-04-20 9:21 UTC
To: Martin KaFai Lau
Cc: netdev, Eric Dumazet, Neal Cardwell, Soheil Hassas Yeganeh,
Willem de Bruijn, Yuchung Cheng, Kernel Team
In-Reply-To: <1461133497-1515104-2-git-send-email-kafai@fb.com>
On Tue, 2016-04-19 at 23:24 -0700, Martin KaFai Lau wrote:
> This patch adds an eor bit to the TCP_SKB_CB. When MSG_EOR
> is passed to tcp_sendmsg/tcp_sendpage, the eor bit will
> be set at the skb containing the last byte of the userland's
> msg. The eor bit will prevent data from appending to that
> skb in the future.
>
> This patch handles the tcp_sendmsg and tcp_sendpage cases.
>
> The followup patches will handle other skb coalescing
> and fragment cases.
>
> One potential use case is to use MSG_EOR with
> SOF_TIMESTAMPING_TX_ACK to get a more accurate
> TCP ack timestamping on application protocol with
> multiple outgoing response messages (e.g. HTTP2).
>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> ---
> include/net/tcp.h | 3 ++-
> net/ipv4/tcp.c | 7 +++++--
> 2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index c0ef054..ac31798 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -762,7 +762,8 @@ struct tcp_skb_cb {
>
> __u8 ip_dsfield; /* IPv4 tos or IPv6 dsfield */
> __u8 txstamp_ack:1, /* Record TX timestamp for ack? */
> - unused:7;
> + eor:1, /* Is skb MSG_EOR marked */
> + unused:6;
> __u32 ack_seq; /* Sequence number ACK'd */
> union {
> struct inet_skb_parm h4;
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 4d73858..7df0c1a88 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -908,7 +908,8 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
> int copy, i;
> bool can_coalesce;
>
> - if (!tcp_send_head(sk) || (copy = size_goal - skb->len) <= 0) {
> + if (!tcp_send_head(sk) || (copy = size_goal - skb->len) <= 0 ||
> + TCP_SKB_CB(skb)->eor) {
> new_segment:
> if (!sk_stream_memory_free(sk))
> goto wait_for_sndbuf;
> @@ -960,6 +961,7 @@ new_segment:
> size -= copy;
> if (!size) {
> tcp_tx_timestamp(sk, sk->sk_tsflags, skb);
> + TCP_SKB_CB(skb)->eor = !!(flags & MSG_EOR);
I am not sure you understood how do_tcp_sendpages() was working.
1) It is called one page at a time, so size would reach zero for every
sent page, and we would have at most 4096 bytes (on x86) per skb, even
if a sendfile() or splice() or vmsplice() is requesting a large size.
2) @flags here does not contain typical MSG_... values,
but a combination of MSG_MORE and MSG_SENDPAGE_NOTLAST
Since there is no way to add a MSG_EOR yet in the sendfile() and related
functions, you should remove the above line and not claim sendpage()
support in patch changelog/title, since it is not true.
We only support not aggregating data to the last skb in write queue if
eor bit is set on it, thus not breaking sendmsg( ... MSG_EOR) prior
uses.
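The intended semantics (no new data is aggregated onto a tail skb whose eor bit is set) can be modelled in userspace roughly as follows. This is a toy model of a write queue, not the tcp_sendmsg() code: the struct names, the fixed-size array, and queue_write() are all hypothetical, and size_goal is just a per-segment byte limit.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 16

/* Toy stand-ins for an skb and the socket write queue. */
struct seg {
	size_t len;
	bool eor;	/* set on the segment holding the last byte of an MSG_EOR write */
};

struct write_queue {
	struct seg segs[MAX_SEGS];
	size_t nsegs;
};

/* Queue `len` bytes. Data is appended to the tail segment unless the tail is
 * full or marked eor, in which case a new segment is started.
 * Returns 0 on success, -1 if the toy queue runs out of segments. */
static int queue_write(struct write_queue *q, size_t len, size_t size_goal,
		       bool msg_eor)
{
	while (len > 0) {
		struct seg *tail = q->nsegs ? &q->segs[q->nsegs - 1] : NULL;
		size_t copy;

		if (!tail || tail->len >= size_goal || tail->eor) {
			if (q->nsegs == MAX_SEGS)
				return -1;
			tail = &q->segs[q->nsegs++];
			tail->len = 0;
			tail->eor = false;
		}
		copy = size_goal - tail->len;
		if (copy > len)
			copy = len;
		tail->len += copy;
		len -= copy;
		/* Only touch the flag in the uncommon MSG_EOR case. */
		if (len == 0 && msg_eor)
			tail->eor = true;
	}
	return 0;
}
```

Two writes without the flag coalesce into one segment as before, so existing callers see no change. Only the write that follows an MSG_EOR write is forced into a fresh segment.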
> goto out;
> }
>
> @@ -1156,7 +1158,7 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> copy = max - skb->len;
> }
>
> - if (copy <= 0) {
> + if (copy <= 0 || TCP_SKB_CB(skb)->eor) {
> new_segment:
> /* Allocate new segment. If the interface is SG,
> * allocate skb fitting to single page.
> @@ -1250,6 +1252,7 @@ new_segment:
> copied += copy;
> if (!msg_data_left(msg)) {
> tcp_tx_timestamp(sk, sockc.tsflags, skb);
> + TCP_SKB_CB(skb)->eor = !!(flags & MSG_EOR);
Since this is a rmw, it is probably better to avoid it in common case,
since we know prior value of eor must be 0 at this point.
	if (unlikely(flags & MSG_EOR))
		TCP_SKB_CB(skb)->eor = 1;
> goto out;
> }
>
Thanks.
* Re: [PATCH V2] net: stmmac: socfpga: Remove re-registration of reset controller
From: Marek Vasut @ 2016-04-20 9:04 UTC
To: Dinh Nguyen, netdev
Cc: peppe.cavallaro, alexandre.torgue, Matthew Gerlach,
David S . Miller
In-Reply-To: <5716E39E.3010905@opensource.altera.com>
On 04/20/2016 04:04 AM, Dinh Nguyen wrote:
>
>
> On 04/19/2016 07:05 PM, Marek Vasut wrote:
>> Both socfpga_dwmac_parse_data() in dwmac-socfpga.c and stmmac_dvr_probe()
>> in stmmac_main.c functions call devm_reset_control_get() to register an
>> reset controller for the stmmac. This results in an attempt to register
>> two reset controllers for the same non-shared reset line.
>>
>> The first attempt to register the reset controller works fine. The second
>> attempt fails with warning from the reset controller core, see below.
>> The warning is produced because the reset line is non-shared and thus
>> it is allowed to have only up-to one reset controller associated with
>> that reset line, not two or more.
>>
>> The solution is not great. Since the hardware needs to toggle the reset
>> before calling stmmac_dvr_probe() to perform mandatory preconfiguration,
>> this patch splits socfpga_dwmac_init_probe() from socfpga_dwmac_init().
>>
>> The socfpga_dwmac_init_probe() temporarily registers the reset controller,
>> performs the pre-configuration and unregisters the reset controller again.
>> This function is only called from the socfpga_dwmac_probe().
>>
>> The original socfpga_dwmac_init() is tweaked to use reset controller
>> pointer from the stmmac_priv (private data of the stmmac core) instead
>> of the local instance, which was used before.
>>
>> Finally, plat_dat->exit and socfpga_dwmac_exit() is no longer necessary,
>> since the functionality is already performed by the stmmac core.
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1 at drivers/reset/core.c:187 __of_reset_control_get+0x218/0x270
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc4-next-20160419-00015-gabb2477-dirty #4
>> Hardware name: Altera SOCFPGA
>> [<c010f290>] (unwind_backtrace) from [<c010b82c>] (show_stack+0x10/0x14)
>> [<c010b82c>] (show_stack) from [<c0373da4>] (dump_stack+0x94/0xa8)
>> [<c0373da4>] (dump_stack) from [<c011bcc0>] (__warn+0xec/0x104)
>> [<c011bcc0>] (__warn) from [<c011bd88>] (warn_slowpath_null+0x20/0x28)
>> [<c011bd88>] (warn_slowpath_null) from [<c03a6eb4>] (__of_reset_control_get+0x218/0x270)
>> [<c03a6eb4>] (__of_reset_control_get) from [<c03a701c>] (__devm_reset_control_get+0x54/0x90)
>> [<c03a701c>] (__devm_reset_control_get) from [<c041fa30>] (stmmac_dvr_probe+0x1b4/0x8e8)
>> [<c041fa30>] (stmmac_dvr_probe) from [<c04298c8>] (socfpga_dwmac_probe+0x1b8/0x28c)
>> [<c04298c8>] (socfpga_dwmac_probe) from [<c03d6ffc>] (platform_drv_probe+0x4c/0xb0)
>> [<c03d6ffc>] (platform_drv_probe) from [<c03d54ec>] (driver_probe_device+0x224/0x2bc)
>> [<c03d54ec>] (driver_probe_device) from [<c03d5630>] (__driver_attach+0xac/0xb0)
>> [<c03d5630>] (__driver_attach) from [<c03d382c>] (bus_for_each_dev+0x6c/0xa0)
>> [<c03d382c>] (bus_for_each_dev) from [<c03d4ad4>] (bus_add_driver+0x1a4/0x21c)
>> [<c03d4ad4>] (bus_add_driver) from [<c03d60ac>] (driver_register+0x78/0xf8)
>> [<c03d60ac>] (driver_register) from [<c0101760>] (do_one_initcall+0x40/0x170)
>> [<c0101760>] (do_one_initcall) from [<c0800e38>] (kernel_init_freeable+0x1dc/0x27c)
>> [<c0800e38>] (kernel_init_freeable) from [<c05d1bd4>] (kernel_init+0x8/0x114)
>> [<c05d1bd4>] (kernel_init) from [<c01076f8>] (ret_from_fork+0x14/0x3c)
>> ---[ end trace 059d2fbe87608fa9 ]---
>>
>
> Odd, I hadn't seen this error before. I would have noticed it, but I
> haven't ran linux-next for a few days.
>
> I see that this error was introduced on linux-next/next-20160414. The
> error was not there prior to that. While I agree that the extra call to
> get the reset control is not needed, I wonder what commit between
> next-20160413 and next-20160414 exposed this error. I'll try to dig this
> some more.
I'll save you the work, it's this one :)
commit 0b52297f2288ca239e598afe6c92db83d8d2bfcd
Author: Hans de Goede <hdegoede@redhat.com>
Date: Tue Feb 23 18:46:26 2016 +0100
reset: Add support for shared reset controls
In some SoCs some hw-blocks share a reset control. Add support for this
setup by adding new:
--
Best regards,
Marek Vasut
* Re: [PATCH 4/4] drm/i915: Move ioremap_wc tracking onto VMA
From: Luis R. Rodriguez @ 2016-04-20 9:10 UTC
To: Chris Wilson
Cc: David Airlie, netdev, intel-gfx, linux-kernel, Ingo Molnar,
Peter Zijlstra (Intel), mcgrof, dri-devel, linux-rdma,
Daniel Vetter, Dan Williams, Yishai Hadas, David Hildenbrand
In-Reply-To: <1461069238-31539-4-git-send-email-chris@chris-wilson.co.uk>
On Tue, Apr 19, 2016 at 01:33:58PM +0100, Chris Wilson wrote:
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6ce2c31b9a81..9ef47329e8ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3346,6 +3346,15 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> old_write_domain);
> }
>
> +static void __i915_vma_iounmap(struct i915_vma *vma)
> +{
> + if (vma->iomap == NULL)
> + return;
> +
> + io_mapping_unmap(vma->iomap);
The NULL check could just be done by io_mapping_unmap(); then you
can avoid it in other drivers too.
> + vma->iomap = NULL;
You added accounting here, using a plain int that you increment and
decrement. I cannot tell whether it correctly avoids races; can you
confirm?
Also, you added accounting for the custom vma pinning and do
GEM_BUG_ON(vma->pin_count == 0); when you unpin one instance, but *you do not*
do something like GEM_BUG_ON(vma->pin_count != 0); when you do the final full
iounmap. That seems rather sloppy.
The iomapping code has its own data structure, so why not use that data
structure instead of struct i915_vma and generalize this? Drivers can
be buggy, so it is best to avoid custom driver accounting and do it in a neat,
generic fashion.
Then other drivers could use this too.
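As a rough illustration of the two suggestions above (a NULL-tolerant unmap
and a check that nothing is still pinned at the final unmap), here is a
userspace sketch. Everything in it is hypothetical: struct demo_mapping and
demo_unmap() are stand-ins invented for this example, not the io-mapping or
i915 API, and locking is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a generic mapping object. */
struct demo_mapping {
	void *vaddr;     /* NULL when nothing is mapped */
	int pin_count;   /* outstanding users of vaddr */
	int unmap_calls; /* counts real unmaps, for illustration only */
};

/* Unmap helper that tolerates NULL, so callers need no check of their own. */
static void demo_unmap(struct demo_mapping *m)
{
	if (m == NULL || m->vaddr == NULL)
		return;

	/* The final unmap must not happen while someone still holds a pin. */
	assert(m->pin_count == 0);

	m->vaddr = NULL;
	m->unmap_calls++;
}
```

With the check inside the helper, calling demo_unmap() twice, or with NULL,
is harmless, and a driver that forgets to drop a pin trips the assertion.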
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 79ac202f3870..93f54a10042f 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -244,22 +245,23 @@ static int intelfb_create(struct drm_fb_helper *helper,
> info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
> info->fbops = &intelfb_ops;
>
> + vma = i915_gem_obj_to_ggtt(obj);
> +
> /* setup aperture base/size for vesafb takeover */
> info->apertures->ranges[0].base = dev->mode_config.fb_base;
> info->apertures->ranges[0].size = ggtt->mappable_end;
>
> - info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_ggtt_offset(obj);
> - info->fix.smem_len = size;
> + info->fix.smem_start = dev->mode_config.fb_base + vma->node.start;
> + info->fix.smem_len = vma->node.size;
>
> - info->screen_base =
> - ioremap_wc(ggtt->mappable_base + i915_gem_obj_ggtt_offset(obj),
> - size);
> - if (!info->screen_base) {
> + vaddr = i915_vma_pin_iomap(vma);
> + if (IS_ERR(vaddr)) {
> DRM_ERROR("Failed to remap framebuffer into virtual memory\n");
> - ret = -ENOSPC;
> + ret = PTR_ERR(vaddr);
> goto out_destroy_fbi;
> }
> - info->screen_size = size;
> + info->screen_base = vaddr;
> + info->screen_size = vma->node.size;
Some framebuffer drivers tend to use a generic start address of
info->fix.smem_start and a length of info->fix.smem_len. This
driver sets smem_start above, but it's different from the start
address that gets ioremapped:
+ ptr = io_mapping_map_wc(i915_vm_to_ggtt(vma->vm)->mappable,
+ vma->node.start,
+ vma->node.size);
fix.smem_start is :
> + info->fix.smem_start = dev->mode_config.fb_base + vma->node.start;
The smem_len matches, though. Can you clarify whether it's correct that
io_mapping_map_wc() does not use info->fix.smem_start
(which is dev->mode_config.fb_base + vma->node.start)?
The reason I ask: I noticed a while ago that a lot of drivers
use info->fix.smem_start and info->fix.smem_len consistently
for their ioremap'd areas, so it might make sense to let the
internal framebuffer code (register_framebuffer()) optionally manage the
ioremap_wc() for drivers, given that this is pretty generic stuff.
Luis
^ permalink raw reply
* [PATCH net-next 4/4] ip6mr: align RTA_MFC_STATS on 64-bit
From: Nicolas Dichtel @ 2016-04-20 8:57 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, eric.dumazet, tgraf, jhs, Nicolas Dichtel
In-Reply-To: <1461142655-5067-1-git-send-email-nicolas.dichtel@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
net/ipv6/ip6mr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index a10e77103c88..bf678324fd52 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -2268,7 +2268,7 @@ static int __ip6mr_fill_mroute(struct mr6_table *mrt, struct sk_buff *skb,
mfcs.mfcs_packets = c->mfc_un.res.pkt;
mfcs.mfcs_bytes = c->mfc_un.res.bytes;
mfcs.mfcs_wrong_if = c->mfc_un.res.wrong_if;
- if (nla_put(skb, RTA_MFC_STATS, sizeof(mfcs), &mfcs) < 0)
+ if (nla_put_64bit(skb, RTA_MFC_STATS, sizeof(mfcs), &mfcs, RTA_PAD) < 0)
return -EMSGSIZE;
rtm->rtm_type = RTN_MULTICAST;
@@ -2411,7 +2411,7 @@ static int mr6_msgsize(bool unresolved, int maxvif)
+ nla_total_size(0) /* RTA_MULTIPATH */
+ maxvif * NLA_ALIGN(sizeof(struct rtnexthop))
/* RTA_MFC_STATS */
- + nla_total_size(sizeof(struct rta_mfc_stats))
+ + nla_total_size_64bit(sizeof(struct rta_mfc_stats))
;
return len;
--
2.4.2
^ permalink raw reply related
* [PATCH net-next 3/4] ipmr: align RTA_MFC_STATS on 64-bit
From: Nicolas Dichtel @ 2016-04-20 8:57 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, eric.dumazet, tgraf, jhs, Nicolas Dichtel
In-Reply-To: <1461142655-5067-1-git-send-email-nicolas.dichtel@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
include/uapi/linux/rtnetlink.h | 1 +
net/ipv4/ipmr.c | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index ca764b5da86d..02baa5281bbf 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -312,6 +312,7 @@ enum rtattr_type_t {
RTA_ENCAP_TYPE,
RTA_ENCAP,
RTA_EXPIRES,
+ RTA_PAD,
__RTA_MAX
};
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 395e2814a46d..21a38e296fe2 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -2104,7 +2104,7 @@ static int __ipmr_fill_mroute(struct mr_table *mrt, struct sk_buff *skb,
mfcs.mfcs_packets = c->mfc_un.res.pkt;
mfcs.mfcs_bytes = c->mfc_un.res.bytes;
mfcs.mfcs_wrong_if = c->mfc_un.res.wrong_if;
- if (nla_put(skb, RTA_MFC_STATS, sizeof(mfcs), &mfcs) < 0)
+ if (nla_put_64bit(skb, RTA_MFC_STATS, sizeof(mfcs), &mfcs, RTA_PAD) < 0)
return -EMSGSIZE;
rtm->rtm_type = RTN_MULTICAST;
@@ -2237,7 +2237,7 @@ static size_t mroute_msgsize(bool unresolved, int maxvif)
+ nla_total_size(0) /* RTA_MULTIPATH */
+ maxvif * NLA_ALIGN(sizeof(struct rtnexthop))
/* RTA_MFC_STATS */
- + nla_total_size(sizeof(struct rta_mfc_stats))
+ + nla_total_size_64bit(sizeof(struct rta_mfc_stats))
;
return len;
--
2.4.2
^ permalink raw reply related
* [PATCH net-next 2/4] libnl: add more helpers to align attribute on 64-bit
From: Nicolas Dichtel @ 2016-04-20 8:57 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, eric.dumazet, tgraf, jhs, Nicolas Dichtel
In-Reply-To: <1461142655-5067-1-git-send-email-nicolas.dichtel@6wind.com>
And use them to align IFLA_STATS64.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
include/net/netlink.h | 8 ++++
lib/nlattr.c | 107 ++++++++++++++++++++++++++++++++++++++++++++++++++
net/core/rtnetlink.c | 9 +----
3 files changed, 117 insertions(+), 7 deletions(-)
diff --git a/include/net/netlink.h b/include/net/netlink.h
index 694caac31d2c..c01863b49787 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -244,13 +244,21 @@ int nla_memcpy(void *dest, const struct nlattr *src, int count);
int nla_memcmp(const struct nlattr *nla, const void *data, size_t size);
int nla_strcmp(const struct nlattr *nla, const char *str);
struct nlattr *__nla_reserve(struct sk_buff *skb, int attrtype, int attrlen);
+struct nlattr *__nla_reserve_64bit(struct sk_buff *skb, int attrtype,
+ int attrlen, int padattr);
void *__nla_reserve_nohdr(struct sk_buff *skb, int attrlen);
struct nlattr *nla_reserve(struct sk_buff *skb, int attrtype, int attrlen);
+struct nlattr *nla_reserve_64bit(struct sk_buff *skb, int attrtype,
+ int attrlen, int padattr);
void *nla_reserve_nohdr(struct sk_buff *skb, int attrlen);
void __nla_put(struct sk_buff *skb, int attrtype, int attrlen,
const void *data);
+void __nla_put_64bit(struct sk_buff *skb, int attrtype, int attrlen,
+ const void *data, int padattr);
void __nla_put_nohdr(struct sk_buff *skb, int attrlen, const void *data);
int nla_put(struct sk_buff *skb, int attrtype, int attrlen, const void *data);
+int nla_put_64bit(struct sk_buff *skb, int attrtype, int attrlen,
+ const void *data, int padattr);
int nla_put_nohdr(struct sk_buff *skb, int attrlen, const void *data);
int nla_append(struct sk_buff *skb, int attrlen, const void *data);
diff --git a/lib/nlattr.c b/lib/nlattr.c
index f5907d23272d..cc311b8b6ff0 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -355,6 +355,31 @@ struct nlattr *__nla_reserve(struct sk_buff *skb, int attrtype, int attrlen)
EXPORT_SYMBOL(__nla_reserve);
/**
+ * __nla_reserve_64bit - reserve room for attribute on the skb and align it
+ * @skb: socket buffer to reserve room on
+ * @attrtype: attribute type
+ * @attrlen: length of attribute payload
+ *
+ * Adds a netlink attribute header to a socket buffer and reserves
+ * room for the payload but does not copy it. It also ensures that this
+ * attribute will be 64-bit aligned.
+ *
+ * The caller is responsible to ensure that the skb provides enough
+ * tailroom for the attribute header and payload.
+ */
+struct nlattr *__nla_reserve_64bit(struct sk_buff *skb, int attrtype,
+ int attrlen, int padattr)
+{
+#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (!IS_ALIGNED((unsigned long)skb->data, 8))
+ nla_align_64bit(skb, padattr);
+#endif
+
+ return __nla_reserve(skb, attrtype, attrlen);
+}
+EXPORT_SYMBOL(__nla_reserve_64bit);
+
+/**
* __nla_reserve_nohdr - reserve room for attribute without header
* @skb: socket buffer to reserve room on
* @attrlen: length of attribute payload
@@ -397,6 +422,38 @@ struct nlattr *nla_reserve(struct sk_buff *skb, int attrtype, int attrlen)
EXPORT_SYMBOL(nla_reserve);
/**
+ * nla_reserve_64bit - reserve room for attribute on the skb and align it
+ * @skb: socket buffer to reserve room on
+ * @attrtype: attribute type
+ * @attrlen: length of attribute payload
+ *
+ * Adds a netlink attribute header to a socket buffer and reserves
+ * room for the payload but does not copy it. It also ensures that this
+ * attribute will be 64-bit aligned.
+ *
+ * Returns NULL if the tailroom of the skb is insufficient to store
+ * the attribute header and payload.
+ */
+struct nlattr *nla_reserve_64bit(struct sk_buff *skb, int attrtype, int attrlen,
+ int padattr)
+{
+ size_t len;
+
+#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (!IS_ALIGNED((unsigned long)skb->data, 8))
+ len = nla_total_size_64bit(attrlen);
+ else
+#endif
+ len = nla_total_size(attrlen);
+
+ if (unlikely(skb_tailroom(skb) < len))
+ return NULL;
+
+ return __nla_reserve_64bit(skb, attrtype, attrlen, padattr);
+}
+EXPORT_SYMBOL(nla_reserve_64bit);
+
+/**
* nla_reserve_nohdr - reserve room for attribute without header
* @skb: socket buffer to reserve room on
* @attrlen: length of attribute payload
@@ -436,6 +493,26 @@ void __nla_put(struct sk_buff *skb, int attrtype, int attrlen,
EXPORT_SYMBOL(__nla_put);
/**
+ * __nla_put_64bit - Add a netlink attribute to a socket buffer and align it
+ * @skb: socket buffer to add attribute to
+ * @attrtype: attribute type
+ * @attrlen: length of attribute payload
+ * @data: head of attribute payload
+ *
+ * The caller is responsible to ensure that the skb provides enough
+ * tailroom for the attribute header and payload.
+ */
+void __nla_put_64bit(struct sk_buff *skb, int attrtype, int attrlen,
+ const void *data, int padattr)
+{
+ struct nlattr *nla;
+
+ nla = __nla_reserve_64bit(skb, attrtype, attrlen, padattr);
+ memcpy(nla_data(nla), data, attrlen);
+}
+EXPORT_SYMBOL(__nla_put_64bit);
+
+/**
* __nla_put_nohdr - Add a netlink attribute without header
* @skb: socket buffer to add attribute to
* @attrlen: length of attribute payload
@@ -474,6 +551,36 @@ int nla_put(struct sk_buff *skb, int attrtype, int attrlen, const void *data)
EXPORT_SYMBOL(nla_put);
/**
+ * nla_put_64bit - Add a netlink attribute to a socket buffer and align it
+ * @skb: socket buffer to add attribute to
+ * @attrtype: attribute type
+ * @attrlen: length of attribute payload
+ * @data: head of attribute payload
+ *
+ * Returns -EMSGSIZE if the tailroom of the skb is insufficient to store
+ * the attribute header and payload.
+ */
+int nla_put_64bit(struct sk_buff *skb, int attrtype, int attrlen,
+ const void *data, int padattr)
+{
+ size_t len;
+
+#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (!IS_ALIGNED((unsigned long)skb->data, 8))
+ len = nla_total_size_64bit(attrlen);
+ else
+#endif
+ len = nla_total_size(attrlen);
+
+ if (unlikely(skb_tailroom(skb) < len))
+ return -EMSGSIZE;
+
+ __nla_put_64bit(skb, attrtype, attrlen, data, padattr);
+ return 0;
+}
+EXPORT_SYMBOL(nla_put_64bit);
+
+/**
* nla_put_nohdr - Add a netlink attribute without header
* @skb: socket buffer to add attribute to
* @attrlen: length of attribute payload
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d3694a13c85a..24cd21273383 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1051,14 +1051,9 @@ static noinline_for_stack int rtnl_fill_stats(struct sk_buff *skb,
{
struct rtnl_link_stats64 *sp;
struct nlattr *attr;
- int err;
-
- err = nla_align_64bit(skb, IFLA_PAD);
- if (err)
- return err;
- attr = nla_reserve(skb, IFLA_STATS64,
- sizeof(struct rtnl_link_stats64));
+ attr = nla_reserve_64bit(skb, IFLA_STATS64,
+ sizeof(struct rtnl_link_stats64), IFLA_PAD);
if (!attr)
return -EMSGSIZE;
--
2.4.2
^ permalink raw reply related
* [PATCH net-next 1/4] netlink: fix test alignment in nla_align_64bit()
From: Nicolas Dichtel @ 2016-04-20 8:57 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, eric.dumazet, tgraf, jhs, Nicolas Dichtel
In-Reply-To: <1461142655-5067-1-git-send-email-nicolas.dichtel@6wind.com>
IS_ALIGNED() returns true when the alignment is as expected. The pad
attribute should be added only when the data is not 8-byte aligned.
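As a reviewing aid for the condition above, here is a small userspace sketch
of where an attribute payload lands relative to the position at which the
next attribute is written. It assumes a 4-byte attribute header and a
zero-length pad attribute that consists of a header only. DEMO_IS_ALIGNED
and demo_payload_offset() are hypothetical names used for illustration; they
only imitate the kernel macro and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace imitation of an alignment test; 'a' must be a power of two. */
#define DEMO_IS_ALIGNED(x, a) ((((size_t)(x)) & ((size_t)(a) - 1)) == 0)

/* Assumed size of a netlink attribute header (length + type). */
#define DEMO_NLA_HDRLEN 4

/*
 * Offset at which the payload of the next attribute starts, given the
 * offset 'pos' where that attribute would be written. A zero-length pad
 * attribute, if emitted first, occupies one extra header.
 */
static size_t demo_payload_offset(size_t pos, bool with_pad)
{
	assert(DEMO_IS_ALIGNED(pos, 4)); /* attributes are 4-byte aligned */

	if (with_pad)
		pos += DEMO_NLA_HDRLEN;

	return pos + DEMO_NLA_HDRLEN;
}
```

Under these assumptions a write position of 8 puts the payload at 12 without
a pad and at 16 with one, while a write position of 4 puts it at 8 without
any pad, so the outcome depends on which pointer the alignment test inspects.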
Fixes: 35c5845957c7 ("net: Add helpers for 64-bit aligning netlink attributes.")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
include/net/netlink.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/netlink.h b/include/net/netlink.h
index e644b3489acf..694caac31d2c 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -1245,7 +1245,7 @@ static inline int nla_validate_nested(const struct nlattr *start, int maxtype,
static inline int nla_align_64bit(struct sk_buff *skb, int padattr)
{
#ifndef HAVE_EFFICIENT_UNALIGNED_ACCESS
- if (IS_ALIGNED((unsigned long)skb->data, 8)) {
+ if (!IS_ALIGNED((unsigned long)skb->data, 8)) {
struct nlattr *attr = nla_reserve(skb, padattr, 0);
if (!attr)
return -EMSGSIZE;
--
2.4.2
^ permalink raw reply related
* [PATCH net-next 0/4] libnl: enhance API to ease 64bit alignment for attribute
From: Nicolas Dichtel @ 2016-04-20 8:57 UTC (permalink / raw)
To: netdev; +Cc: davem, roopa, eric.dumazet, tgraf, jhs
In-Reply-To: <20160419.195009.1052027353987244150.davem@davemloft.net>
Here is a proposal to add more helpers to libnetlink to manage 64-bit
alignment issues.
Note that this series was only tested on x86.
The first patch is a fix (the bug was found by code review only, unless I've
missed something).
The second patch adds helpers and uses them for IFLA_STATS64.
The last two patches use the new API to align mcast stats.
We could also add helpers for nla_put_u64() and its variants.
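To make the size accounting behind these helpers concrete, here is a minimal
userspace sketch of how much room an attribute needs with and without a pad
attribute. It assumes a 4-byte attribute header, 4-byte attribute alignment,
and a pad attribute with an empty payload. The DEMO_* macros and demo_*()
functions are hypothetical stand-ins written for this example, not the
kernel helpers themselves.

```c
#include <assert.h>

#define DEMO_NLA_ALIGNTO 4
#define DEMO_NLA_ALIGN(len) \
	(((len) + DEMO_NLA_ALIGNTO - 1) & ~(DEMO_NLA_ALIGNTO - 1))
#define DEMO_NLA_HDRLEN 4

/* Room needed for one attribute: header plus payload, rounded up to 4. */
static int demo_total_size(int payload)
{
	assert(payload >= 0);
	return DEMO_NLA_ALIGN(DEMO_NLA_HDRLEN + payload);
}

/* Same, plus room for a zero-length pad attribute in front of it. */
static int demo_total_size_64bit(int payload)
{
	return demo_total_size(payload) + demo_total_size(0);
}
```

For a 24-byte payload (for example three 64-bit counters) this gives 28 bytes
without padding and 32 bytes when a pad attribute has to be reserved.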
include/net/netlink.h | 10 +++-
include/uapi/linux/rtnetlink.h | 1 +
lib/nlattr.c | 107 +++++++++++++++++++++++++++++++++++++++++
net/core/rtnetlink.c | 9 +---
net/ipv4/ipmr.c | 4 +-
net/ipv6/ip6mr.c | 4 +-
6 files changed, 123 insertions(+), 12 deletions(-)
Comments are welcome,
Regards,
Nicolas
^ permalink raw reply
* [PATCH 2/2] net: ethernet: davinci_emac: Fix platform_data overwrite
From: Neil Armstrong @ 2016-04-20 8:56 UTC (permalink / raw)
To: David S. Miller, Andrew Lunn, Tom Lendacky, Mugunthan V N, netdev,
linux-kernel
Cc: Neil Armstrong, Brian Hutchinson
When the DaVinci emac driver is removed and re-probed,
pdev->dev.platform_data still holds the leftover non-NULL pointer saved by
the previous davinci_emac_of_get_pdata() call, causing a kernel crash when
priv->int_disable() is called in emac_int_disable().
Unable to handle kernel paging request at virtual address c8622a80
...
[<c0426fb4>] (emac_int_disable) from [<c0427700>] (emac_dev_open+0x290/0x5f8)
[<c0427700>] (emac_dev_open) from [<c04c00ec>] (__dev_open+0xb8/0x120)
[<c04c00ec>] (__dev_open) from [<c04c0370>] (__dev_change_flags+0x88/0x14c)
[<c04c0370>] (__dev_change_flags) from [<c04c044c>] (dev_change_flags+0x18/0x48)
[<c04c044c>] (dev_change_flags) from [<c052bafc>] (devinet_ioctl+0x6b4/0x7ac)
[<c052bafc>] (devinet_ioctl) from [<c04a1428>] (sock_ioctl+0x1d8/0x2c0)
[<c04a1428>] (sock_ioctl) from [<c014f054>] (do_vfs_ioctl+0x41c/0x600)
[<c014f054>] (do_vfs_ioctl) from [<c014f2a4>] (SyS_ioctl+0x6c/0x7c)
[<c014f2a4>] (SyS_ioctl) from [<c000ff60>] (ret_fast_syscall+0x0/0x1c)
Fixes: 42f59967a091 ("net: ethernet: davinci_emac: add OF support")
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
---
drivers/net/ethernet/ti/davinci_emac.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index e9fe3fb..58d58f0 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1878,8 +1878,6 @@ davinci_emac_of_get_pdata(struct platform_device *pdev, struct emac_priv *priv)
pdata->hw_ram_addr = auxdata->hw_ram_addr;
}
- pdev->dev.platform_data = pdata;
-
return pdata;
}
--
1.9.1
^ permalink raw reply related
* [PATCH 1/2] net: ethernet: davinci_emac: Fix Unbalanced pm_runtime_enable
From: Neil Armstrong @ 2016-04-20 8:56 UTC (permalink / raw)
To: David S. Miller, Andrew Lunn, Tom Lendacky, Mugunthan V N, netdev,
linux-kernel
Cc: Neil Armstrong, Brian Hutchinson
In order to avoid an Unbalanced pm_runtime_enable in the DaVinci
emac driver when the device is removed and re-probed, add a
pm_runtime_disable() call in davinci_emac_remove().
Currently, using unbind/bind on a TI DM8168 SoC gives:
$ echo 4a120000.ethernet > /sys/bus/platform/drivers/davinci_emac/unbind
net eth1: DaVinci EMAC: davinci_emac_remove()
$ echo 4a120000.ethernet > /sys/bus/platform/drivers/davinci_emac/bind
davinci_emac 4a120000.ethernet: Unbalanced pm_runtime_enable
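The warning above can be reproduced in a simplified userspace model. The
sketch assumes that runtime PM tracks a per-device disable depth which starts
at 1, that enabling decrements it, and that enabling at depth 0 triggers the
warning. struct demo_dev and the demo_*() functions are hypothetical names
for this illustration and are not the kernel's runtime PM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-device runtime PM state. */
struct demo_dev {
	int disable_depth; /* > 0 means runtime PM is disabled */
	bool warned;       /* set when an unbalanced enable is seen */
};

static void demo_dev_init(struct demo_dev *d)
{
	assert(d != NULL);
	d->disable_depth = 1; /* devices start with runtime PM disabled */
	d->warned = false;
}

static void demo_pm_enable(struct demo_dev *d)
{
	if (d->disable_depth > 0)
		d->disable_depth--;
	else
		d->warned = true; /* models "Unbalanced pm_runtime_enable" */
}

static void demo_pm_disable(struct demo_dev *d)
{
	d->disable_depth++;
}

/* probe() enables runtime PM; remove() disables it only with the fix. */
static void demo_probe(struct demo_dev *d)
{
	demo_pm_enable(d);
}

static void demo_remove(struct demo_dev *d, bool with_fix)
{
	if (with_fix)
		demo_pm_disable(d);
}
```

In this model, probe, remove without the fix, then probe again sets the
warning flag; the same sequence with the disable call in remove does not.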
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Fixes: 3ba97381343b ("net: ethernet: davinci_emac: add pm_runtime support")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
---
drivers/net/ethernet/ti/davinci_emac.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 5d9abed..e9fe3fb 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -2101,6 +2101,7 @@ static int davinci_emac_remove(struct platform_device *pdev)
cpdma_ctlr_destroy(priv->dma);
unregister_netdev(ndev);
+ pm_runtime_disable(&pdev->dev);
free_netdev(ndev);
return 0;
--
1.9.1
^ permalink raw reply related
* RE: [PATCH v2 0/1] drivers: net: cpsw: Fix NULL pointer dereference with two slave PHYs
From: Andrew Goodbody @ 2016-04-20 8:49 UTC (permalink / raw)
To: David Miller
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
David Rivshin (Allworx), grygorii.strashko@ti.com,
mugunthanvnm@ti.com, linux-omap@vger.kernel.org, tony@atomide.com
In-Reply-To: <20160419.201550.902520333910307559.davem@davemloft.net>
> -----Original Message-----
> From: Andrew Goodbody <andrew.goodbody@cambrionix.com>
> Date: Mon, 18 Apr 2016 14:53:25 +0100
>
> > This is a fix for a NULL pointer dereference from cpsw which is
> > triggered by having two slave PHYs attached to a cpsw network device.
> > The problem is due to only maintaining a single reference to a PHY
> > node in the private data which gets overwritten by the second PHY probe.
> > So move the PHY node reference to the individual slave data so that there
> is now one per slave.
> >
> > v1 had a problem that data->slaves was used before it had been filled
> > in
>
> I already applied v1 the other day, so you need to send me a relative patch
> rather than a whole new one.
>
> Thanks.
Sorry, I had no notification that this had happened. However, I thought that the plan was to revert v1 and go with David Rivshin's patch instead. I'll see if I can create a revert in a little while.
Andrew
^ permalink raw reply
* [PATCHv2 bluetooth-next 10/10] 6lowpan: add support for 802.15.4 short addr handling
From: Alexander Aring @ 2016-04-20 8:19 UTC (permalink / raw)
To: linux-wpan
Cc: kernel, marcel, jukka.rissanen, hannes, stefan, mcr, werner,
linux-bluetooth, netdev, Alexander Aring
In-Reply-To: <1461140382-4784-1-git-send-email-aar@pengutronix.de>
This patch adds the handling necessary to use the short address for
802.15.4 6lowpan. It contains support for IPHC address compression
and a new matching algorithm to decide which link-layer address will be
used for the 802.15.4 frame.
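For orientation, the link-local address that the short-address match is
compared against has the form fe80::ff:fe00:XXXX, where XXXX is the 16-bit
short address in network byte order. The sketch below builds that address in
plain userspace C. demo_short_addr_to_lladdr() is a hypothetical helper for
illustration, and unlike the patch it takes the short address as a host-order
integer rather than a little-endian field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill 'out' with fe80::ff:fe00:XXXX for the given short address.
 * 'short_addr' is a host-order value; bytes 14 and 15 hold it in
 * network (big-endian) order.
 */
static void demo_short_addr_to_lladdr(uint16_t short_addr, uint8_t out[16])
{
	assert(out != NULL);

	memset(out, 0, 16);
	out[0] = 0xfe;  /* link-local prefix fe80::/64 */
	out[1] = 0x80;
	out[11] = 0xff; /* fixed 00ff:fe00 filler in the interface ID */
	out[12] = 0xfe;
	out[14] = (uint8_t)(short_addr >> 8);
	out[15] = (uint8_t)(short_addr & 0xff);
}
```

When an IPv6 address equals the result for the frame's short address, it can
be elided completely, because the receiver can rebuild it from the link-layer
header.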
Signed-off-by: Alexander Aring <aar@pengutronix.de>
---
net/6lowpan/iphc.c | 167 ++++++++++++++++++++++++++++++++++++--------
net/ieee802154/6lowpan/tx.c | 107 ++++++++++++++--------------
2 files changed, 189 insertions(+), 85 deletions(-)
diff --git a/net/6lowpan/iphc.c b/net/6lowpan/iphc.c
index 8501dd5..aca38dc 100644
--- a/net/6lowpan/iphc.c
+++ b/net/6lowpan/iphc.c
@@ -761,22 +761,75 @@ static const u8 lowpan_iphc_dam_to_sam_value[] = {
[LOWPAN_IPHC_DAM_11] = LOWPAN_IPHC_SAM_11,
};
-static u8 lowpan_compress_ctx_addr(u8 **hc_ptr, const struct in6_addr *ipaddr,
+static inline bool
+lowpan_iphc_compress_ctx_802154_lladdr(const struct in6_addr *ipaddr,
+ const struct lowpan_iphc_ctx *ctx,
+ const void *lladdr)
+{
+ const struct ieee802154_addr *addr = lladdr;
+ unsigned char extended_addr[EUI64_ADDR_LEN];
+ struct in6_addr tmp = {};
+ bool lladdr_compress = false;
+
+ switch (addr->mode) {
+ case IEEE802154_ADDR_LONG:
+ ieee802154_le64_to_be64(&extended_addr, &addr->extended_addr);
+ /* check for SAM/DAM = 11 */
+ memcpy(&tmp.s6_addr[8], &extended_addr, EUI64_ADDR_LEN);
+ /* second bit-flip (Universe/Local) is done according RFC2464 */
+ tmp.s6_addr[8] ^= 0x02;
+ /* context information are always used */
+ ipv6_addr_prefix_copy(&tmp, &ctx->pfx, ctx->plen);
+ if (ipv6_addr_equal(&tmp, ipaddr))
+ lladdr_compress = true;
+ break;
+ case IEEE802154_ADDR_SHORT:
+ tmp.s6_addr[11] = 0xFF;
+ tmp.s6_addr[12] = 0xFE;
+ ieee802154_le16_to_be16(&tmp.s6_addr16[7],
+ &addr->short_addr);
+ /* context information are always used */
+ ipv6_addr_prefix_copy(&tmp, &ctx->pfx, ctx->plen);
+ if (ipv6_addr_equal(&tmp, ipaddr))
+ lladdr_compress = true;
+ break;
+ default:
+ /* should never handled and filtered by 802154 6lowpan */
+ WARN_ON_ONCE(1);
+ break;
+ }
+
+ return lladdr_compress;
+}
+
+static u8 lowpan_compress_ctx_addr(u8 **hc_ptr, const struct net_device *dev,
+ const struct in6_addr *ipaddr,
const struct lowpan_iphc_ctx *ctx,
const unsigned char *lladdr, bool sam)
{
struct in6_addr tmp = {};
u8 dam;
- /* check for SAM/DAM = 11 */
- memcpy(&tmp.s6_addr[8], lladdr, 8);
- /* second bit-flip (Universe/Local) is done according RFC2464 */
- tmp.s6_addr[8] ^= 0x02;
- /* context information are always used */
- ipv6_addr_prefix_copy(&tmp, &ctx->pfx, ctx->plen);
- if (ipv6_addr_equal(&tmp, ipaddr)) {
- dam = LOWPAN_IPHC_DAM_11;
- goto out;
+ switch (lowpan_dev(dev)->lltype) {
+ case LOWPAN_LLTYPE_IEEE802154:
+ if (lowpan_iphc_compress_ctx_802154_lladdr(ipaddr, ctx,
+ lladdr)) {
+ dam = LOWPAN_IPHC_DAM_11;
+ goto out;
+ }
+ break;
+ default:
+ /* check for SAM/DAM = 11 */
+ memcpy(&tmp.s6_addr[8], lladdr, EUI64_ADDR_LEN);
+ /* second bit-flip (Universe/Local) is done according RFC2464 */
+ tmp.s6_addr[8] ^= 0x02;
+ /* context information are always used */
+ ipv6_addr_prefix_copy(&tmp, &ctx->pfx, ctx->plen);
+ if (ipv6_addr_equal(&tmp, ipaddr)) {
+ dam = LOWPAN_IPHC_DAM_11;
+ goto out;
+ }
+ break;
}
memset(&tmp, 0, sizeof(tmp));
@@ -813,28 +866,85 @@ out:
return dam;
}
-static u8 lowpan_compress_addr_64(u8 **hc_ptr, const struct in6_addr *ipaddr,
+static inline bool
+lowpan_iphc_compress_802154_lladdr(const struct in6_addr *ipaddr,
+ const void *lladdr)
+{
+ const struct ieee802154_addr *addr = lladdr;
+ unsigned char extended_addr[EUI64_ADDR_LEN];
+ struct in6_addr tmp = {};
+ bool lladdr_compress = false;
+
+ switch (addr->mode) {
+ case IEEE802154_ADDR_LONG:
+ ieee802154_le64_to_be64(&extended_addr, &addr->extended_addr);
+ if (is_addr_mac_addr_based(ipaddr, extended_addr))
+ lladdr_compress = true;
+ break;
+ case IEEE802154_ADDR_SHORT:
+ /* fe:80::ff:fe00:XXXX
+ * \__/
+ * short_addr
+ *
+ * Universe/Local bit is zero.
+ */
+ tmp.s6_addr[0] = 0xFE;
+ tmp.s6_addr[1] = 0x80;
+ tmp.s6_addr[11] = 0xFF;
+ tmp.s6_addr[12] = 0xFE;
+ ieee802154_le16_to_be16(&tmp.s6_addr16[7],
+ &addr->short_addr);
+ if (ipv6_addr_equal(&tmp, ipaddr))
+ lladdr_compress = true;
+ break;
+ default:
+ /* should never handled and filtered by 802154 6lowpan */
+ WARN_ON_ONCE(1);
+ break;
+ }
+
+ return lladdr_compress;
+}
+
+static u8 lowpan_compress_addr_64(u8 **hc_ptr, const struct net_device *dev,
+ const struct in6_addr *ipaddr,
const unsigned char *lladdr, bool sam)
{
- u8 dam = LOWPAN_IPHC_DAM_00;
+ u8 dam = LOWPAN_IPHC_DAM_01;
- if (is_addr_mac_addr_based(ipaddr, lladdr)) {
- dam = LOWPAN_IPHC_DAM_11; /* 0-bits */
- pr_debug("address compression 0 bits\n");
- } else if (lowpan_is_iid_16_bit_compressable(ipaddr)) {
+ switch (lowpan_dev(dev)->lltype) {
+ case LOWPAN_LLTYPE_IEEE802154:
+ if (lowpan_iphc_compress_802154_lladdr(ipaddr, lladdr)) {
+ dam = LOWPAN_IPHC_DAM_11; /* 0-bits */
+ pr_debug("address compression 0 bits\n");
+ goto out;
+ }
+ break;
+ default:
+ if (is_addr_mac_addr_based(ipaddr, lladdr)) {
+ dam = LOWPAN_IPHC_DAM_11; /* 0-bits */
+ pr_debug("address compression 0 bits\n");
+ goto out;
+ }
+ break;
+ }
+
+ if (lowpan_is_iid_16_bit_compressable(ipaddr)) {
/* compress IID to 16 bits xxxx::XXXX */
lowpan_push_hc_data(hc_ptr, &ipaddr->s6_addr16[7], 2);
dam = LOWPAN_IPHC_DAM_10; /* 16-bits */
raw_dump_inline(NULL, "Compressed ipv6 addr is (16 bits)",
*hc_ptr - 2, 2);
- } else {
- /* do not compress IID => xxxx::IID */
- lowpan_push_hc_data(hc_ptr, &ipaddr->s6_addr16[4], 8);
- dam = LOWPAN_IPHC_DAM_01; /* 64-bits */
- raw_dump_inline(NULL, "Compressed ipv6 addr is (64 bits)",
- *hc_ptr - 8, 8);
+ goto out;
}
+ /* do not compress IID => xxxx::IID */
+ lowpan_push_hc_data(hc_ptr, &ipaddr->s6_addr16[4], 8);
+ raw_dump_inline(NULL, "Compressed ipv6 addr is (64 bits)",
+ *hc_ptr - 8, 8);
+
+out:
+
if (sam)
return lowpan_iphc_dam_to_sam_value[dam];
else
@@ -1013,9 +1123,6 @@ int lowpan_header_compress(struct sk_buff *skb, const struct net_device *dev,
iphc0 = LOWPAN_DISPATCH_IPHC;
iphc1 = 0;
- raw_dump_inline(__func__, "saddr", saddr, EUI64_ADDR_LEN);
- raw_dump_inline(__func__, "daddr", daddr, EUI64_ADDR_LEN);
-
raw_dump_table(__func__, "sending raw skb network uncompressed packet",
skb->data, skb->len);
@@ -1088,14 +1195,15 @@ int lowpan_header_compress(struct sk_buff *skb, const struct net_device *dev,
iphc1 |= LOWPAN_IPHC_SAC;
} else {
if (sci) {
- iphc1 |= lowpan_compress_ctx_addr(&hc_ptr, &hdr->saddr,
+ iphc1 |= lowpan_compress_ctx_addr(&hc_ptr, dev,
+ &hdr->saddr,
&sci_entry, saddr,
true);
iphc1 |= LOWPAN_IPHC_SAC;
} else {
if (ipv6_saddr_type & IPV6_ADDR_LINKLOCAL &&
lowpan_is_linklocal_zero_padded(hdr->saddr)) {
- iphc1 |= lowpan_compress_addr_64(&hc_ptr,
+ iphc1 |= lowpan_compress_addr_64(&hc_ptr, dev,
&hdr->saddr,
saddr, true);
pr_debug("source address unicast link-local %pI6c iphc1 0x%02x\n",
@@ -1123,14 +1231,15 @@ int lowpan_header_compress(struct sk_buff *skb, const struct net_device *dev,
}
} else {
if (dci) {
- iphc1 |= lowpan_compress_ctx_addr(&hc_ptr, &hdr->daddr,
+ iphc1 |= lowpan_compress_ctx_addr(&hc_ptr, dev,
+ &hdr->daddr,
&dci_entry, daddr,
false);
iphc1 |= LOWPAN_IPHC_DAC;
} else {
if (ipv6_daddr_type & IPV6_ADDR_LINKLOCAL &&
lowpan_is_linklocal_zero_padded(hdr->daddr)) {
- iphc1 |= lowpan_compress_addr_64(&hc_ptr,
+ iphc1 |= lowpan_compress_addr_64(&hc_ptr, dev,
&hdr->daddr,
daddr, false);
pr_debug("dest address unicast link-local %pI6c iphc1 0x%02x\n",
diff --git a/net/ieee802154/6lowpan/tx.c b/net/ieee802154/6lowpan/tx.c
index e459afd..88c9d16 100644
--- a/net/ieee802154/6lowpan/tx.c
+++ b/net/ieee802154/6lowpan/tx.c
@@ -9,6 +9,7 @@
*/
#include <net/6lowpan.h>
+#include <net/ndisc.h>
#include <net/ieee802154_netdev.h>
#include <net/mac802154.h>
@@ -17,19 +18,9 @@
#define LOWPAN_FRAG1_HEAD_SIZE 0x4
#define LOWPAN_FRAGN_HEAD_SIZE 0x5
-/* don't save pan id, it's intra pan */
-struct lowpan_addr {
- u8 mode;
- union {
- /* IPv6 needs big endian here */
- __be64 extended_addr;
- __be16 short_addr;
- } u;
-};
-
struct lowpan_addr_info {
- struct lowpan_addr daddr;
- struct lowpan_addr saddr;
+ struct ieee802154_addr daddr;
+ struct ieee802154_addr saddr;
};
static inline struct
@@ -48,12 +39,14 @@ lowpan_addr_info *lowpan_skb_priv(const struct sk_buff *skb)
* RAW/DGRAM sockets.
*/
int lowpan_header_create(struct sk_buff *skb, struct net_device *ldev,
- unsigned short type, const void *_daddr,
- const void *_saddr, unsigned int len)
+ unsigned short type, const void *daddr,
+ const void *saddr, unsigned int len)
{
- const u8 *saddr = _saddr;
- const u8 *daddr = _daddr;
- struct lowpan_addr_info *info;
+ struct wpan_dev *wpan_dev = lowpan_802154_dev(ldev)->wdev->ieee802154_ptr;
+ struct lowpan_addr_info *info = lowpan_skb_priv(skb);
+ struct lowpan_802154_neigh *llneigh = NULL;
+ const struct ipv6hdr *hdr = ipv6_hdr(skb);
+ struct neighbour *n;
/* TODO:
* if this package isn't ipv6 one, where should it be routed?
@@ -61,21 +54,44 @@ int lowpan_header_create(struct sk_buff *skb, struct net_device *ldev,
if (type != ETH_P_IPV6)
return 0;
- if (!saddr)
- saddr = ldev->dev_addr;
+ /* intra-pan communication */
+ info->saddr.pan_id = wpan_dev->pan_id;
+ info->daddr.pan_id = info->saddr.pan_id;
- raw_dump_inline(__func__, "saddr", (unsigned char *)saddr, 8);
- raw_dump_inline(__func__, "daddr", (unsigned char *)daddr, 8);
+ if (!memcmp(daddr, ldev->broadcast, EUI64_ADDR_LEN)) {
+ info->daddr.short_addr = cpu_to_le16(IEEE802154_ADDR_BROADCAST);
+ info->daddr.mode = IEEE802154_ADDR_SHORT;
+ } else {
+ n = neigh_lookup(&nd_tbl, &hdr->daddr, ldev);
+ if (n)
+ llneigh = lowpan_802154_neigh(neighbour_priv(n));
+
+ if (llneigh &&
+ ieee802154_is_valid_src_short_addr(llneigh->short_addr)) {
+ info->daddr.mode = IEEE802154_ADDR_SHORT;
+ info->daddr.short_addr = llneigh->short_addr;
+ } else {
+ info->daddr.mode = IEEE802154_ADDR_LONG;
+ ieee802154_be64_to_le64(&info->daddr.extended_addr,
+ daddr);
+ }
- info = lowpan_skb_priv(skb);
+ if (n)
+ neigh_release(n);
+ }
- /* TODO: Currently we only support extended_addr */
- info->daddr.mode = IEEE802154_ADDR_LONG;
- memcpy(&info->daddr.u.extended_addr, daddr,
- sizeof(info->daddr.u.extended_addr));
- info->saddr.mode = IEEE802154_ADDR_LONG;
- memcpy(&info->saddr.u.extended_addr, saddr,
- sizeof(info->daddr.u.extended_addr));
+ if (!saddr) {
+ if (ieee802154_is_valid_src_short_addr(wpan_dev->short_addr)) {
+ info->saddr.mode = IEEE802154_ADDR_SHORT;
+ info->saddr.short_addr = wpan_dev->short_addr;
+ } else {
+ info->saddr.mode = IEEE802154_ADDR_LONG;
+ info->saddr.extended_addr = wpan_dev->extended_addr;
+ }
+ } else {
+ info->saddr.mode = IEEE802154_ADDR_LONG;
+ ieee802154_be64_to_le64(&info->saddr.extended_addr, saddr);
+ }
return 0;
}
@@ -209,47 +225,26 @@ static int lowpan_header(struct sk_buff *skb, struct net_device *ldev,
u16 *dgram_size, u16 *dgram_offset)
{
struct wpan_dev *wpan_dev = lowpan_802154_dev(ldev)->wdev->ieee802154_ptr;
- struct ieee802154_addr sa, da;
struct ieee802154_mac_cb *cb = mac_cb_init(skb);
struct lowpan_addr_info info;
- void *daddr, *saddr;
memcpy(&info, lowpan_skb_priv(skb), sizeof(info));
- /* TODO: Currently we only support extended_addr */
- daddr = &info.daddr.u.extended_addr;
- saddr = &info.saddr.u.extended_addr;
-
*dgram_size = skb->len;
- lowpan_header_compress(skb, ldev, daddr, saddr);
+ lowpan_header_compress(skb, ldev, &info.daddr, &info.saddr);
/* dgram_offset = (saved bytes after compression) + lowpan header len */
*dgram_offset = (*dgram_size - skb->len) + skb_network_header_len(skb);
cb->type = IEEE802154_FC_TYPE_DATA;
- /* prepare wpan address data */
- sa.mode = IEEE802154_ADDR_LONG;
- sa.pan_id = wpan_dev->pan_id;
- sa.extended_addr = ieee802154_devaddr_from_raw(saddr);
-
- /* intra-PAN communications */
- da.pan_id = sa.pan_id;
-
- /* if the destination address is the broadcast address, use the
- * corresponding short address
- */
- if (!memcmp(daddr, ldev->broadcast, EUI64_ADDR_LEN)) {
- da.mode = IEEE802154_ADDR_SHORT;
- da.short_addr = cpu_to_le16(IEEE802154_ADDR_BROADCAST);
+ if (info.daddr.mode == IEEE802154_ADDR_SHORT &&
+ ieee802154_is_broadcast_short_addr(info.daddr.short_addr))
cb->ackreq = false;
- } else {
- da.mode = IEEE802154_ADDR_LONG;
- da.extended_addr = ieee802154_devaddr_from_raw(daddr);
+ else
cb->ackreq = wpan_dev->ackreq;
- }
- return wpan_dev_hard_header(skb, lowpan_802154_dev(ldev)->wdev, &da,
- &sa, 0);
+ return wpan_dev_hard_header(skb, lowpan_802154_dev(ldev)->wdev,
+ &info.daddr, &info.saddr, 0);
}
netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *ldev)
--
2.8.0
* [PATCHv2 bluetooth-next 08/10] ipv6: export ndisc functions
From: Alexander Aring @ 2016-04-20 8:19 UTC (permalink / raw)
To: linux-wpan
Cc: kernel, marcel, jukka.rissanen, hannes, stefan, mcr, werner,
linux-bluetooth, netdev, Alexander Aring, David S . Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy
In-Reply-To: <1461140382-4784-1-git-send-email-aar@pengutronix.de>
This patch exports some neighbour discovery functions so that the
6lowpan neighbour discovery ops implementation can use them.
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
---
include/net/ndisc.h | 16 ++++++++++++++++
net/ipv6/addrconf.c | 1 +
net/ipv6/ndisc.c | 28 ++++++++++------------------
3 files changed, 27 insertions(+), 18 deletions(-)
diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 14ed016..35a4396 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -53,6 +53,15 @@ enum {
#include <net/neighbour.h>
+/* Set to 3 to get tracing... */
+#define ND_DEBUG 1
+
+#define ND_PRINTK(val, level, fmt, ...) \
+do { \
+ if (val <= ND_DEBUG) \
+ net_##level##_ratelimited(fmt, ##__VA_ARGS__); \
+} while (0)
+
struct ctl_table;
struct inet6_dev;
struct net_device;
@@ -267,6 +276,13 @@ int ndisc_late_init(void);
void ndisc_late_cleanup(void);
void ndisc_cleanup(void);
+void ndisc_fill_addr_option(struct sk_buff *skb, int type, void *data,
+ int data_len);
+struct sk_buff *ndisc_alloc_skb(struct net_device *dev, int len);
+void ndisc_send_skb(struct sk_buff *skb, const struct in6_addr *daddr,
+ const struct in6_addr *saddr);
+int pndisc_is_router(const void *pkey, struct net_device *dev);
+
int ndisc_rcv(struct sk_buff *skb);
void ndisc_send_rs(struct net_device *dev,
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index a2ef04b..8f05ef8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1775,6 +1775,7 @@ struct inet6_ifaddr *ipv6_get_ifaddr(struct net *net, const struct in6_addr *add
return result;
}
+EXPORT_SYMBOL(ipv6_get_ifaddr);
/* Gets referenced address, destroys ifaddr */
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 297080a..dc8bfec 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -73,15 +73,6 @@
#include <linux/netfilter.h>
#include <linux/netfilter_ipv6.h>
-/* Set to 3 to get tracing... */
-#define ND_DEBUG 1
-
-#define ND_PRINTK(val, level, fmt, ...) \
-do { \
- if (val <= ND_DEBUG) \
- net_##level##_ratelimited(fmt, ##__VA_ARGS__); \
-} while (0)
-
static u32 ndisc_hash(const void *pkey,
const struct net_device *dev,
__u32 *hash_rnd);
@@ -150,8 +141,8 @@ struct neigh_table nd_tbl = {
};
EXPORT_SYMBOL_GPL(nd_tbl);
-static void ndisc_fill_addr_option(struct sk_buff *skb, int type, void *data,
- int data_len)
+void ndisc_fill_addr_option(struct sk_buff *skb, int type, void *data,
+ int data_len)
{
int pad = ndisc_addr_option_pad(skb->dev->type);
int space = ndisc_opt_addr_space(skb->dev, data_len);
@@ -171,6 +162,7 @@ static void ndisc_fill_addr_option(struct sk_buff *skb, int type, void *data,
if (space > 0)
memset(opt, 0, space);
}
+EXPORT_SYMBOL(ndisc_fill_addr_option);
static struct nd_opt_hdr *ndisc_next_option(struct nd_opt_hdr *cur,
struct nd_opt_hdr *end)
@@ -378,8 +370,7 @@ static void pndisc_destructor(struct pneigh_entry *n)
ipv6_dev_mc_dec(dev, &maddr);
}
-static struct sk_buff *ndisc_alloc_skb(struct net_device *dev,
- int len)
+struct sk_buff *ndisc_alloc_skb(struct net_device *dev, int len)
{
int hlen = LL_RESERVED_SPACE(dev);
int tlen = dev->needed_tailroom;
@@ -406,6 +397,7 @@ static struct sk_buff *ndisc_alloc_skb(struct net_device *dev,
return skb;
}
+EXPORT_SYMBOL(ndisc_alloc_skb);
static void ip6_nd_hdr(struct sk_buff *skb,
const struct in6_addr *saddr,
@@ -428,9 +420,8 @@ static void ip6_nd_hdr(struct sk_buff *skb,
hdr->daddr = *daddr;
}
-static void ndisc_send_skb(struct sk_buff *skb,
- const struct in6_addr *daddr,
- const struct in6_addr *saddr)
+void ndisc_send_skb(struct sk_buff *skb, const struct in6_addr *daddr,
+ const struct in6_addr *saddr)
{
struct dst_entry *dst = skb_dst(skb);
struct net *net = dev_net(skb->dev);
@@ -479,6 +470,7 @@ static void ndisc_send_skb(struct sk_buff *skb,
rcu_read_unlock();
}
+EXPORT_SYMBOL(ndisc_send_skb);
static void ip6_ndisc_send_na(struct net_device *dev,
const struct in6_addr *daddr,
@@ -692,8 +684,7 @@ static void ndisc_solicit(struct neighbour *neigh, struct sk_buff *skb)
}
}
-static int pndisc_is_router(const void *pkey,
- struct net_device *dev)
+int pndisc_is_router(const void *pkey, struct net_device *dev)
{
struct pneigh_entry *n;
int ret = -1;
@@ -706,6 +697,7 @@ static int pndisc_is_router(const void *pkey,
return ret;
}
+EXPORT_SYMBOL(pndisc_is_router);
static void ip6_ndisc_recv_ns(struct sk_buff *skb)
{
--
2.8.0
* [PATCHv2 bluetooth-next 06/10] ndisc: add addr_len parameter to ndisc_fill_addr_option
From: Alexander Aring @ 2016-04-20 8:19 UTC (permalink / raw)
To: linux-wpan
Cc: kernel, marcel, jukka.rissanen, hannes, stefan, mcr, werner,
linux-bluetooth, netdev, Alexander Aring, David S . Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy
In-Reply-To: <1461140382-4784-1-git-send-email-aar@pengutronix.de>
This patch makes the address length an argument of the
ndisc_fill_addr_option function. This is necessary to handle addresses
whose length differs from dev->addr_len.
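To illustrate why the length matters, here is a minimal standalone sketch of the option size arithmetic. It reuses the rounding of NDISC_OPT_SPACE() from include/net/ndisc.h; the helper names lladdr_opt_space() and lladdr_opt_len_field() are hypothetical, and the device-type specific padding that ndisc_opt_addr_space() adds is assumed to be zero.

```c
/* Same rounding as NDISC_OPT_SPACE() in include/net/ndisc.h:
 * 2 bytes of option header (type, length) plus the address,
 * rounded up to the next multiple of 8 bytes.
 */
#define OPT_SPACE(len) (((len) + 2 + 7) & ~7)

/* Hypothetical helper: on-wire size in bytes of a link-layer address
 * option, assuming no device-type specific padding.
 */
static inline int lladdr_opt_space(int data_len)
{
	return OPT_SPACE(data_len);
}

/* Hypothetical helper: value of the option's length field, which is
 * expressed in units of 8 bytes.
 */
static inline int lladdr_opt_len_field(int data_len)
{
	return lladdr_opt_space(data_len) >> 3;
}
```

Under these assumptions an 8-byte 802.15.4 extended address needs a 16-byte option (length field 2), while a 2-byte short address fits into an 8-byte option (length field 1), so a single dev->addr_len cannot describe both.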
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
---
net/ipv6/ndisc.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 4e91d5e..176c7c4 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -150,11 +150,11 @@ struct neigh_table nd_tbl = {
};
EXPORT_SYMBOL_GPL(nd_tbl);
-static void ndisc_fill_addr_option(struct sk_buff *skb, int type, void *data)
+static void ndisc_fill_addr_option(struct sk_buff *skb, int type, void *data,
+ int data_len)
{
int pad = ndisc_addr_option_pad(skb->dev->type);
- int data_len = skb->dev->addr_len;
- int space = ndisc_opt_addr_space(skb->dev, skb->dev->addr_len);
+ int space = ndisc_opt_addr_space(skb->dev, data_len);
u8 *opt = skb_put(skb, space);
opt[0] = type;
@@ -528,7 +528,7 @@ void ndisc_send_na(struct net_device *dev, const struct in6_addr *daddr,
if (inc_opt)
ndisc_fill_addr_option(skb, ND_OPT_TARGET_LL_ADDR,
- dev->dev_addr);
+ dev->dev_addr, dev->addr_len);
ndisc_send_skb(skb, daddr, src_addr);
@@ -590,7 +590,7 @@ void ndisc_send_ns(struct net_device *dev, const struct in6_addr *solicit,
if (inc_opt)
ndisc_fill_addr_option(skb, ND_OPT_SOURCE_LL_ADDR,
- dev->dev_addr);
+ dev->dev_addr, dev->addr_len);
ndisc_send_skb(skb, daddr, saddr);
}
@@ -641,7 +641,7 @@ void ndisc_send_rs(struct net_device *dev, const struct in6_addr *saddr,
if (send_sllao)
ndisc_fill_addr_option(skb, ND_OPT_SOURCE_LL_ADDR,
- dev->dev_addr);
+ dev->dev_addr, dev->addr_len);
ndisc_send_skb(skb, daddr, saddr);
}
@@ -1597,7 +1597,8 @@ void ndisc_send_redirect(struct sk_buff *skb, const struct in6_addr *target)
*/
if (ha)
- ndisc_fill_addr_option(buff, ND_OPT_TARGET_LL_ADDR, ha);
+ ndisc_fill_addr_option(buff, ND_OPT_TARGET_LL_ADDR, ha,
+ dev->addr_len);
/*
* build redirect option and copy skb over to the new packet.
--
2.8.0
* [PATCHv2 bluetooth-next 09/10] 6lowpan: introduce 6lowpan-nd
From: Alexander Aring @ 2016-04-20 8:19 UTC (permalink / raw)
To: linux-wpan
Cc: kernel, marcel, jukka.rissanen, hannes, stefan, mcr, werner,
linux-bluetooth, netdev, Alexander Aring, David S . Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy
In-Reply-To: <1461140382-4784-1-git-send-email-aar@pengutronix.de>
This patch introduces 6lowpan-specific handling for receiving and
transmitting NS/NA messages in IPv6 neighbour discovery. The first use
case is support for 802.15.4 short addresses inside the option fields and
handling of the RFC6775 6CO option field as a userspace option.
Future handling:
Also add RS/RA (processing) for 802.15.4 short addresses and handle
RFC6775, which requires more 6lowpan-specific handling in the IPv6
neighbour discovery implementation.
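As a rough illustration of the option handling described above, the following standalone sketch walks a buffer of ND options and sorts link-layer address options by their length field: 2 (16 bytes) is treated as an extended address, 1 (8 bytes) as a short address. The struct ll_opts and parse_ll_opts() names are hypothetical, and unlike the kernel code the sketch stores a pointer to the address bytes rather than to the option header and ignores device-specific padding.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_SOURCE_LL_ADDR 1 /* ND_OPT_SOURCE_LL_ADDR */
#define OPT_TARGET_LL_ADDR 2 /* ND_OPT_TARGET_LL_ADDR */
#define LEN_EXTENDED 2       /* option length field for an 8-byte address */
#define LEN_SHORT 1          /* option length field for a 2-byte address */

/* Hypothetical result structure, indexed by option type. */
struct ll_opts {
	const uint8_t *extended[OPT_TARGET_LL_ADDR + 1];
	const uint8_t *short_addr[OPT_TARGET_LL_ADDR + 1];
};

/* Returns 0 on success, -1 if the option area is malformed. */
static int parse_ll_opts(const uint8_t *opt, size_t opt_len,
			 struct ll_opts *out)
{
	size_t i;

	for (i = 0; i <= OPT_TARGET_LL_ADDR; i++) {
		out->extended[i] = NULL;
		out->short_addr[i] = NULL;
	}

	while (opt_len) {
		uint8_t type;
		size_t l;

		/* every option starts with a 2-byte header */
		if (opt_len < 2)
			return -1;

		type = opt[0];
		l = (size_t)opt[1] << 3; /* length is in units of 8 bytes */
		if (l == 0 || opt_len < l)
			return -1;

		if (type == OPT_SOURCE_LL_ADDR || type == OPT_TARGET_LL_ADDR) {
			/* the first option of each kind wins, duplicates are skipped */
			if (opt[1] == LEN_EXTENDED && !out->extended[type])
				out->extended[type] = opt + 2;
			else if (opt[1] == LEN_SHORT && !out->short_addr[type])
				out->short_addr[type] = opt + 2;
		}
		/* all other option types are silently ignored */

		opt += l;
		opt_len -= l;
	}

	return 0;
}
```

A message carrying both an extended and a short source link-layer address option would therefore fill both slots for type 1, which is the case the separate nd_802154_opt_array in the patch is meant to cover.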
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
---
include/net/ndisc.h | 1 +
net/6lowpan/6lowpan_i.h | 2 +
net/6lowpan/Makefile | 2 +-
net/6lowpan/core.c | 2 +
net/6lowpan/ndisc.c | 633 ++++++++++++++++++++++++++++++++++++++++++++++++
net/ipv6/ndisc.c | 3 +
6 files changed, 642 insertions(+), 1 deletion(-)
create mode 100644 net/6lowpan/ndisc.c
diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 35a4396..e2ee83d 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -35,6 +35,7 @@ enum {
ND_OPT_ROUTE_INFO = 24, /* RFC4191 */
ND_OPT_RDNSS = 25, /* RFC5006 */
ND_OPT_DNSSL = 31, /* RFC6106 */
+ ND_OPT_6CO = 34, /* RFC6775 */
__ND_OPT_MAX
};
diff --git a/net/6lowpan/6lowpan_i.h b/net/6lowpan/6lowpan_i.h
index 97ecc27..8b01774 100644
--- a/net/6lowpan/6lowpan_i.h
+++ b/net/6lowpan/6lowpan_i.h
@@ -12,6 +12,8 @@ static inline bool lowpan_is_ll(const struct net_device *dev,
return lowpan_dev(dev)->lltype == lltype;
}
+void lowpan_register_ndisc_ops(struct net_device *dev);
+
#ifdef CONFIG_6LOWPAN_DEBUGFS
int lowpan_dev_debugfs_init(struct net_device *dev);
void lowpan_dev_debugfs_exit(struct net_device *dev);
diff --git a/net/6lowpan/Makefile b/net/6lowpan/Makefile
index e44f3bf..12d131a 100644
--- a/net/6lowpan/Makefile
+++ b/net/6lowpan/Makefile
@@ -1,6 +1,6 @@
obj-$(CONFIG_6LOWPAN) += 6lowpan.o
-6lowpan-y := core.o iphc.o nhc.o
+6lowpan-y := core.o iphc.o nhc.o ndisc.o
6lowpan-$(CONFIG_6LOWPAN_DEBUGFS) += debugfs.o
#rfc6282 nhcs
diff --git a/net/6lowpan/core.c b/net/6lowpan/core.c
index 824d1bc..e7a370e 100644
--- a/net/6lowpan/core.c
+++ b/net/6lowpan/core.c
@@ -34,6 +34,8 @@ int lowpan_register_netdevice(struct net_device *dev,
for (i = 0; i < LOWPAN_IPHC_CTX_TABLE_SIZE; i++)
lowpan_dev(dev)->ctx.table[i].id = i;
+ lowpan_register_ndisc_ops(dev);
+
ret = register_netdevice(dev);
if (ret < 0)
return ret;
diff --git a/net/6lowpan/ndisc.c b/net/6lowpan/ndisc.c
new file mode 100644
index 0000000..d088295
--- /dev/null
+++ b/net/6lowpan/ndisc.c
@@ -0,0 +1,633 @@
+/* This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <net/6lowpan.h>
+#include <net/addrconf.h>
+#include <net/ip6_route.h>
+#include <net/ndisc.h>
+
+#include "6lowpan_i.h"
+
+struct lowpan_ndisc_options {
+ struct nd_opt_hdr *nd_opt_array[ND_OPT_TARGET_LL_ADDR + 1];
+#if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
+ struct nd_opt_hdr *nd_802154_opt_array[ND_OPT_TARGET_LL_ADDR + 1];
+#endif
+};
+
+#define nd_802154_opts_src_lladdr nd_802154_opt_array[ND_OPT_SOURCE_LL_ADDR]
+#define nd_802154_opts_tgt_lladdr nd_802154_opt_array[ND_OPT_TARGET_LL_ADDR]
+
+#define NDISC_802154_EXTENDED_ADDR_LENGTH 2
+#define NDISC_802154_SHORT_ADDR_LENGTH 1
+
+#if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
+static void lowpan_ndisc_802154_neigh_update(struct neighbour *n, void *priv,
+ bool override)
+{
+ struct lowpan_802154_neigh *neigh = lowpan_802154_neigh(neighbour_priv(n));
+
+ if (!override)
+ return;
+
+ write_lock_bh(&n->lock);
+ if (priv)
+ ieee802154_be16_to_le16(&neigh->short_addr, priv);
+ else
+ neigh->short_addr = cpu_to_le16(IEEE802154_ADDR_SHORT_UNSPEC);
+ write_unlock_bh(&n->lock);
+}
+
+static inline int lowpan_ndisc_802154_short_addr_space(struct net_device *dev)
+{
+ struct wpan_dev *wpan_dev;
+ int addr_space = 0;
+
+ if (lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154)) {
+ wpan_dev = lowpan_802154_dev(dev)->wdev->ieee802154_ptr;
+
+ if (ieee802154_is_valid_src_short_addr(wpan_dev->short_addr))
+ addr_space = ndisc_opt_addr_space(dev, IEEE802154_SHORT_ADDR_LEN);
+ }
+
+ return addr_space;
+}
+
+static inline void
+lowpan_ndisc_802154_short_addr_option(struct net_device *dev,
+ struct sk_buff *skb, int type)
+{
+ struct wpan_dev *wpan_dev;
+ __be16 short_addr;
+
+ if (lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154)) {
+ wpan_dev = lowpan_802154_dev(dev)->wdev->ieee802154_ptr;
+
+ if (ieee802154_is_valid_src_short_addr(wpan_dev->short_addr)) {
+ ieee802154_le16_to_be16(&short_addr,
+ &wpan_dev->short_addr);
+ ndisc_fill_addr_option(skb, type, &short_addr,
+ IEEE802154_SHORT_ADDR_LEN);
+ }
+ }
+}
+#else
+static void
+lowpan_ndisc_802154_neigh_update(struct neighbour *n, void *priv,
+ bool override) { }
+
+static inline void
+lowpan_ndisc_802154_short_addr_option(struct net_device *dev,
+ struct sk_buff *skb,
+ int type) { }
+
+static inline int lowpan_ndisc_802154_short_addr_space(struct net_device *dev)
+{
+ return 0;
+}
+#endif
+
+static void lowpan_ndisc_parse_addr_options(const struct net_device *dev,
+ struct lowpan_ndisc_options *ndopts,
+ struct nd_opt_hdr *nd_opt)
+{
+ switch (nd_opt->nd_opt_len) {
+ case NDISC_802154_EXTENDED_ADDR_LENGTH:
+ if (ndopts->nd_opt_array[nd_opt->nd_opt_type])
+ ND_PRINTK(2, warn,
+ "%s: duplicated extended addr ND6 option found: type=%d\n",
+ __func__, nd_opt->nd_opt_type);
+ else
+ ndopts->nd_opt_array[nd_opt->nd_opt_type] = nd_opt;
+ break;
+#if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
+ case NDISC_802154_SHORT_ADDR_LENGTH:
+ /* only valid on 802.15.4 */
+ if (!lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154)) {
+ ND_PRINTK(2, warn,
+ "%s: invalid length detected: type=%d\n",
+ __func__, nd_opt->nd_opt_type);
+ break;
+ }
+
+ if (ndopts->nd_802154_opt_array[nd_opt->nd_opt_type])
+ ND_PRINTK(2, warn,
+ "%s: duplicated short addr ND6 option found: type=%d\n",
+ __func__, nd_opt->nd_opt_type);
+ else
+ ndopts->nd_802154_opt_array[nd_opt->nd_opt_type] = nd_opt;
+ break;
+#endif
+ default:
+ ND_PRINTK(2, warn,
+ "%s: invalid length detected: type=%d\n",
+ __func__, nd_opt->nd_opt_type);
+ break;
+ }
+}
+
+static struct lowpan_ndisc_options *
+lowpan_ndisc_parse_options(const struct net_device *dev, u8 *opt, int opt_len,
+ struct lowpan_ndisc_options *ndopts)
+{
+ struct nd_opt_hdr *nd_opt = (struct nd_opt_hdr *)opt;
+
+ if (!nd_opt || opt_len < 0 || !ndopts)
+ return NULL;
+
+ memset(ndopts, 0, sizeof(*ndopts));
+
+ while (opt_len) {
+ int l;
+
+ if (opt_len < sizeof(struct nd_opt_hdr))
+ return NULL;
+
+ l = nd_opt->nd_opt_len << 3;
+ if (opt_len < l || l == 0)
+ return NULL;
+
+ switch (nd_opt->nd_opt_type) {
+ case ND_OPT_SOURCE_LL_ADDR:
+ case ND_OPT_TARGET_LL_ADDR:
+ lowpan_ndisc_parse_addr_options(dev, ndopts, nd_opt);
+ break;
+ default:
+ /* Unknown options must be silently ignored,
+ * to accommodate future extension to the
+ * protocol.
+ */
+ ND_PRINTK(2, notice,
+ "%s: ignored unsupported option; type=%d, len=%d\n",
+ __func__,
+ nd_opt->nd_opt_type,
+ nd_opt->nd_opt_len);
+ }
+
+ opt_len -= l;
+ nd_opt = ((void *)nd_opt) + l;
+ }
+
+ return ndopts;
+}
+
+static void lowpan_ndisc_send_na(struct net_device *dev,
+ const struct in6_addr *daddr,
+ const struct in6_addr *solicited_addr,
+ bool router, bool solicited, bool override,
+ bool inc_opt)
+{
+ struct sk_buff *skb;
+ struct in6_addr tmpaddr;
+ struct inet6_ifaddr *ifp;
+ const struct in6_addr *src_addr;
+ struct nd_msg *msg;
+ int optlen = 0;
+
+ /* for anycast or proxy, solicited_addr != src_addr */
+ ifp = ipv6_get_ifaddr(dev_net(dev), solicited_addr, dev, 1);
+ if (ifp) {
+ src_addr = solicited_addr;
+ if (ifp->flags & IFA_F_OPTIMISTIC)
+ override = false;
+ inc_opt |= ifp->idev->cnf.force_tllao;
+ in6_ifa_put(ifp);
+ } else {
+ if (ipv6_dev_get_saddr(dev_net(dev), dev, daddr,
+ inet6_sk(dev_net(dev)->ipv6.ndisc_sk)->srcprefs,
+ &tmpaddr))
+ return;
+ src_addr = &tmpaddr;
+ }
+
+ if (!dev->addr_len)
+ inc_opt = 0;
+ if (inc_opt) {
+ optlen += ndisc_opt_addr_space(dev, dev->addr_len);
+ optlen += lowpan_ndisc_802154_short_addr_space(dev);
+ }
+
+ skb = ndisc_alloc_skb(dev, sizeof(*msg) + optlen);
+ if (!skb)
+ return;
+
+ msg = (struct nd_msg *)skb_put(skb, sizeof(*msg));
+ *msg = (struct nd_msg) {
+ .icmph = {
+ .icmp6_type = NDISC_NEIGHBOUR_ADVERTISEMENT,
+ .icmp6_router = router,
+ .icmp6_solicited = solicited,
+ .icmp6_override = override,
+ },
+ .target = *solicited_addr,
+ };
+
+ if (inc_opt) {
+ ndisc_fill_addr_option(skb, ND_OPT_TARGET_LL_ADDR,
+ dev->dev_addr, dev->addr_len);
+ lowpan_ndisc_802154_short_addr_option(dev, skb,
+ ND_OPT_TARGET_LL_ADDR);
+ }
+
+ ndisc_send_skb(skb, daddr, src_addr);
+}
+
+static void lowpan_ndisc_recv_na(struct sk_buff *skb)
+{
+ struct nd_msg *msg = (struct nd_msg *)skb_transport_header(skb);
+ struct in6_addr *saddr = &ipv6_hdr(skb)->saddr;
+ const struct in6_addr *daddr = &ipv6_hdr(skb)->daddr;
+ u8 *lladdr = NULL;
+ u32 ndoptlen = skb_tail_pointer(skb) - (skb_transport_header(skb) +
+ offsetof(struct nd_msg, opt));
+ struct lowpan_ndisc_options ndopts;
+ struct net_device *dev = skb->dev;
+ struct inet6_dev *idev = __in6_dev_get(dev);
+ struct inet6_ifaddr *ifp;
+ struct neighbour *neigh;
+ u8 *lladdr_short = NULL;
+
+ if (skb->len < sizeof(struct nd_msg)) {
+ ND_PRINTK(2, warn, "NA: packet too short\n");
+ return;
+ }
+
+ if (ipv6_addr_is_multicast(&msg->target)) {
+ ND_PRINTK(2, warn, "NA: target address is multicast\n");
+ return;
+ }
+
+ if (ipv6_addr_is_multicast(daddr) &&
+ msg->icmph.icmp6_solicited) {
+ ND_PRINTK(2, warn, "NA: solicited NA is multicasted\n");
+ return;
+ }
+
+ /* For some 802.11 wireless deployments (and possibly other networks),
+ * there will be a NA proxy and unsolicited packets are attacks
+ * and thus should not be accepted.
+ */
+ if (!msg->icmph.icmp6_solicited && idev &&
+ idev->cnf.drop_unsolicited_na)
+ return;
+
+ if (!lowpan_ndisc_parse_options(dev, msg->opt, ndoptlen, &ndopts)) {
+ ND_PRINTK(2, warn, "NA: invalid ND option\n");
+ return;
+ }
+ if (ndopts.nd_opts_tgt_lladdr) {
+ lladdr = ndisc_opt_addr_data(ndopts.nd_opts_tgt_lladdr, dev,
+ dev->addr_len);
+ if (!lladdr) {
+ ND_PRINTK(2, warn,
+ "NA: invalid link-layer address length\n");
+ return;
+ }
+ }
+#if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
+ if (lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154) &&
+ ndopts.nd_802154_opts_tgt_lladdr) {
+ lladdr_short = ndisc_opt_addr_data(ndopts.nd_802154_opts_tgt_lladdr,
+ dev, IEEE802154_SHORT_ADDR_LEN);
+ if (!lladdr_short) {
+ ND_PRINTK(2, warn,
+ "NA: invalid short link-layer address length\n");
+ return;
+ }
+ }
+#endif
+ ifp = ipv6_get_ifaddr(dev_net(dev), &msg->target, dev, 1);
+ if (ifp) {
+ if (skb->pkt_type != PACKET_LOOPBACK &&
+ (ifp->flags & IFA_F_TENTATIVE)) {
+ addrconf_dad_failure(ifp);
+ return;
+ }
+ /* What should we do now? The advertisement
+ * is invalid, but ndisc specs say nothing
+ * about it. It could be a misconfiguration, or
+ * a smart proxy agent trying to help us :-)
+ *
+ * We should not print the error if NA has been
+ * received from loopback - it is just our own
+ * unsolicited advertisement.
+ */
+ if (skb->pkt_type != PACKET_LOOPBACK)
+ ND_PRINTK(1, warn,
+ "NA: someone advertises our address %pI6 on %s!\n",
+ &ifp->addr, ifp->idev->dev->name);
+ in6_ifa_put(ifp);
+ return;
+ }
+ neigh = neigh_lookup(&nd_tbl, &msg->target, dev);
+
+ if (neigh) {
+ u8 old_flags = neigh->flags;
+ struct net *net = dev_net(dev);
+
+ if (neigh->nud_state & NUD_FAILED)
+ goto out;
+
+ /* Don't update the neighbor cache entry on a proxy NA from
+ * ourselves because either the proxied node is off link or it
+ * has already sent a NA to us.
+ */
+ if (lladdr && !memcmp(lladdr, dev->dev_addr, dev->addr_len) &&
+ net->ipv6.devconf_all->forwarding &&
+ net->ipv6.devconf_all->proxy_ndp &&
+ pneigh_lookup(&nd_tbl, net, &msg->target, dev, 0)) {
+ /* XXX: idev->cnf.proxy_ndp */
+ goto out;
+ }
+
+ neigh_update(neigh, lladdr,
+ msg->icmph.icmp6_solicited ? NUD_REACHABLE : NUD_STALE,
+ NEIGH_UPDATE_F_WEAK_OVERRIDE |
+ (msg->icmph.icmp6_override ? NEIGH_UPDATE_F_OVERRIDE : 0) |
+ NEIGH_UPDATE_F_OVERRIDE_ISROUTER |
+ (msg->icmph.icmp6_router ? NEIGH_UPDATE_F_ISROUTER : 0));
+
+ if (lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154))
+ lowpan_ndisc_802154_neigh_update(neigh, lladdr_short,
+ msg->icmph.icmp6_override);
+
+ if ((old_flags & ~neigh->flags) & NTF_ROUTER) {
+ /* Change: router to host */
+ rt6_clean_tohost(dev_net(dev), saddr);
+ }
+
+out:
+ neigh_release(neigh);
+ }
+}
+
+static void lowpan_ndisc_send_ns(struct net_device *dev,
+ const struct in6_addr *solicit,
+ const struct in6_addr *daddr,
+ const struct in6_addr *saddr)
+{
+ struct sk_buff *skb;
+ struct in6_addr addr_buf;
+ int inc_opt = dev->addr_len;
+ int optlen = 0;
+ struct nd_msg *msg;
+
+ if (!saddr) {
+ if (ipv6_get_lladdr(dev, &addr_buf,
+ (IFA_F_TENTATIVE | IFA_F_OPTIMISTIC)))
+ return;
+ saddr = &addr_buf;
+ }
+
+ if (ipv6_addr_any(saddr))
+ inc_opt = false;
+ if (inc_opt) {
+ optlen += ndisc_opt_addr_space(dev, dev->addr_len);
+ optlen += lowpan_ndisc_802154_short_addr_space(dev);
+ }
+
+ skb = ndisc_alloc_skb(dev, sizeof(*msg) + optlen);
+ if (!skb)
+ return;
+
+ msg = (struct nd_msg *)skb_put(skb, sizeof(*msg));
+ *msg = (struct nd_msg) {
+ .icmph = {
+ .icmp6_type = NDISC_NEIGHBOUR_SOLICITATION,
+ },
+ .target = *solicit,
+ };
+
+ if (inc_opt) {
+ ndisc_fill_addr_option(skb, ND_OPT_SOURCE_LL_ADDR,
+ dev->dev_addr, dev->addr_len);
+ lowpan_ndisc_802154_short_addr_option(dev, skb,
+ ND_OPT_SOURCE_LL_ADDR);
+ }
+
+ ndisc_send_skb(skb, daddr, saddr);
+}
+
+static void lowpan_ndisc_recv_ns(struct sk_buff *skb)
+{
+ struct nd_msg *msg = (struct nd_msg *)skb_transport_header(skb);
+ const struct in6_addr *saddr = &ipv6_hdr(skb)->saddr;
+ const struct in6_addr *daddr = &ipv6_hdr(skb)->daddr;
+ u8 *lladdr = NULL;
+ u32 ndoptlen = skb_tail_pointer(skb) - (skb_transport_header(skb) +
+ offsetof(struct nd_msg, opt));
+ struct lowpan_ndisc_options ndopts;
+ struct net_device *dev = skb->dev;
+ struct inet6_ifaddr *ifp;
+ struct inet6_dev *idev = NULL;
+ struct neighbour *neigh;
+ int dad = ipv6_addr_any(saddr);
+ bool inc;
+ int is_router = -1;
+ u8 *lladdr_short = NULL;
+
+ if (skb->len < sizeof(struct nd_msg)) {
+ ND_PRINTK(2, warn, "NS: packet too short\n");
+ return;
+ }
+
+ if (ipv6_addr_is_multicast(&msg->target)) {
+ ND_PRINTK(2, warn, "NS: multicast target address\n");
+ return;
+ }
+
+ /* RFC2461 7.1.1:
+ * DAD has to be destined for solicited node multicast address.
+ */
+ if (dad && !ipv6_addr_is_solict_mult(daddr)) {
+ ND_PRINTK(2, warn, "NS: bad DAD packet (wrong destination)\n");
+ return;
+ }
+
+ if (!lowpan_ndisc_parse_options(dev, msg->opt, ndoptlen, &ndopts)) {
+ ND_PRINTK(2, warn, "NS: invalid ND options\n");
+ return;
+ }
+
+ if (ndopts.nd_opts_src_lladdr) {
+ lladdr = ndisc_opt_addr_data(ndopts.nd_opts_src_lladdr, dev,
+ dev->addr_len);
+ if (!lladdr) {
+ ND_PRINTK(2, warn,
+ "NS: invalid link-layer address length\n");
+ return;
+ }
+
+ /* RFC2461 7.1.1:
+ * If the IP source address is the unspecified address,
+ * there MUST NOT be source link-layer address option
+ * in the message.
+ */
+ if (dad) {
+ ND_PRINTK(2, warn,
+ "NS: bad DAD packet (link-layer address option)\n");
+ return;
+ }
+ }
+
+#if IS_ENABLED(CONFIG_IEEE802154_6LOWPAN)
+ if (lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154) &&
+ ndopts.nd_802154_opts_src_lladdr) {
+ lladdr_short = ndisc_opt_addr_data(ndopts.nd_802154_opts_src_lladdr,
+ dev, IEEE802154_SHORT_ADDR_LEN);
+ if (!lladdr_short) {
+ ND_PRINTK(2, warn,
+ "NS: invalid short link-layer address length\n");
+ return;
+ }
+
+ /* RFC2461 7.1.1:
+ * If the IP source address is the unspecified address,
+ * there MUST NOT be source link-layer address option
+ * in the message.
+ */
+ if (dad) {
+ ND_PRINTK(2, warn,
+ "NS: bad DAD packet (short link-layer address option)\n");
+ return;
+ }
+ }
+#endif
+
+ inc = ipv6_addr_is_multicast(daddr);
+
+ ifp = ipv6_get_ifaddr(dev_net(dev), &msg->target, dev, 1);
+ if (ifp) {
+have_ifp:
+ if (ifp->flags & (IFA_F_TENTATIVE | IFA_F_OPTIMISTIC)) {
+ if (dad) {
+ /* We are colliding with another node
+ * who is doing DAD
+ * so fail our DAD process
+ */
+ addrconf_dad_failure(ifp);
+ return;
+ }
+
+ /* This is not a dad solicitation.
+ * If we are an optimistic node,
+ * we should respond.
+ * Otherwise, we should ignore it.
+ */
+ if (!(ifp->flags & IFA_F_OPTIMISTIC))
+ goto out;
+ }
+
+ idev = ifp->idev;
+ } else {
+ struct net *net = dev_net(dev);
+
+ /* perhaps an address on the master device */
+ if (netif_is_l3_slave(dev)) {
+ struct net_device *mdev;
+
+ mdev = netdev_master_upper_dev_get_rcu(dev);
+ if (mdev) {
+ ifp = ipv6_get_ifaddr(net, &msg->target, mdev, 1);
+ if (ifp)
+ goto have_ifp;
+ }
+ }
+
+ idev = in6_dev_get(dev);
+ if (!idev) {
+ /* XXX: count this drop? */
+ return;
+ }
+
+ if (ipv6_chk_acast_addr(net, dev, &msg->target) ||
+ (idev->cnf.forwarding &&
+ (net->ipv6.devconf_all->proxy_ndp || idev->cnf.proxy_ndp) &&
+ (is_router = pndisc_is_router(&msg->target, dev)) >= 0)) {
+ if (!(NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED) &&
+ skb->pkt_type != PACKET_HOST &&
+ inc &&
+ NEIGH_VAR(idev->nd_parms, PROXY_DELAY) != 0) {
+ /* for anycast or proxy,
+ * sender should delay its response
+ * by a random time between 0 and
+ * MAX_ANYCAST_DELAY_TIME seconds.
+ * (RFC2461) -- yoshfuji
+ */
+ struct sk_buff *n = skb_clone(skb, GFP_ATOMIC);
+
+ if (n)
+ pneigh_enqueue(&nd_tbl, idev->nd_parms,
+ n);
+ goto out;
+ }
+ } else {
+ goto out;
+ }
+ }
+
+ if (is_router < 0)
+ is_router = idev->cnf.forwarding;
+
+ if (dad) {
+ ndisc_send_na(dev, &in6addr_linklocal_allnodes, &msg->target,
+ !!is_router, false, (ifp != NULL), true);
+ goto out;
+ }
+
+ if (inc)
+ NEIGH_CACHE_STAT_INC(&nd_tbl, rcv_probes_mcast);
+ else
+ NEIGH_CACHE_STAT_INC(&nd_tbl, rcv_probes_ucast);
+
+ /* update / create cache entry
+ * for the source address
+ */
+ neigh = __neigh_lookup(&nd_tbl, saddr, dev,
+ !inc || lladdr || !dev->addr_len);
+ if (neigh) {
+ neigh_update(neigh, lladdr, NUD_STALE,
+ NEIGH_UPDATE_F_WEAK_OVERRIDE |
+ NEIGH_UPDATE_F_OVERRIDE);
+ if (lowpan_is_ll(dev, LOWPAN_LLTYPE_IEEE802154))
+ lowpan_ndisc_802154_neigh_update(neigh, lladdr_short,
+ true);
+ }
+ if (neigh || !dev->header_ops) {
+ ndisc_send_na(dev, saddr, &msg->target, !!is_router,
+ true, (ifp != NULL && inc), inc);
+ if (neigh)
+ neigh_release(neigh);
+ }
+
+out:
+ if (ifp)
+ in6_ifa_put(ifp);
+ else
+ in6_dev_put(idev);
+}
+
+static inline int lowpan_ndisc_is_useropt(struct nd_opt_hdr *opt)
+{
+ return __ip6_ndisc_is_useropt(opt) || opt->nd_opt_type == ND_OPT_6CO;
+}
+
+static const struct ndisc_ops lowpan_ndisc_ops = {
+ .is_useropt = lowpan_ndisc_is_useropt,
+ .send_na = lowpan_ndisc_send_na,
+ .recv_na = lowpan_ndisc_recv_na,
+ .send_ns = lowpan_ndisc_send_ns,
+ .recv_ns = lowpan_ndisc_recv_ns,
+};
+
+void lowpan_register_ndisc_ops(struct net_device *dev)
+{
+ dev->ndisc_ops = &lowpan_ndisc_ops;
+}
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index dc8bfec..9d7f228 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1792,6 +1792,9 @@ static const struct ndisc_ops ip6_ndisc_ops = {
void ip6_register_ndisc_ops(struct net_device *dev)
{
switch (dev->type) {
+ case ARPHRD_6LOWPAN:
+ /* will be assigned when the lowpan interface is registered */
+ break;
default:
if (dev->ndisc_ops) {
ND_PRINTK(2, warn,
--
2.8.0
* [PATCHv2 bluetooth-next 07/10] ipv6: introduce neighbour discovery ops
From: Alexander Aring @ 2016-04-20 8:19 UTC (permalink / raw)
To: linux-wpan
Cc: kernel, marcel, jukka.rissanen, hannes, stefan, mcr, werner,
linux-bluetooth, netdev, Alexander Aring, David S . Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy
In-Reply-To: <1461140382-4784-1-git-send-email-aar@pengutronix.de>
This patch introduces a neighbour discovery ops callback structure. For
now the structure contains receive and transmit handling for NS/NA and
the userspace option field functionality.
These callbacks allow 6lowpan to provide different handling, such as
802.15.4 short address handling or RFC6775 (Neighbor Discovery
Optimization for IPv6 over 6LoWPANs).
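The dispatch pattern can be sketched outside the kernel roughly as follows. All demo_* names are hypothetical stand-ins (struct demo_dev for struct net_device, struct demo_nd_ops for struct ndisc_ops); only the option type numbers 25 (RDNSS), 31 (DNSSL) and 34 (6CO) are taken from include/net/ndisc.h. As in the wrappers of this patch, the ops pointer itself is assumed to be always set, and only the individual callbacks are checked.

```c
#include <stddef.h>

struct demo_dev;

/* Hypothetical stand-in for struct ndisc_ops, reduced to two callbacks. */
struct demo_nd_ops {
	int  (*is_useropt)(int opt_type);
	void (*recv_ns)(struct demo_dev *dev);
};

/* Hypothetical stand-in for struct net_device. */
struct demo_dev {
	const struct demo_nd_ops *ops;
	int ns_received;
};

/* Wrappers: call the per-device callback if one is set. */
static int demo_is_useropt(const struct demo_dev *dev, int opt_type)
{
	if (dev->ops->is_useropt)
		return dev->ops->is_useropt(opt_type);
	return 0;
}

static void demo_recv_ns(struct demo_dev *dev)
{
	if (dev->ops->recv_ns)
		dev->ops->recv_ns(dev);
}

/* Generic IPv6 behaviour: RDNSS (25) and DNSSL (31) are user options. */
static int generic_is_useropt(int opt_type)
{
	return opt_type == 25 || opt_type == 31;
}

/* 6lowpan behaviour: additionally pass 6CO (34) to userspace. */
static int lowpan_is_useropt(int opt_type)
{
	return generic_is_useropt(opt_type) || opt_type == 34;
}

/* Trivial receive handler that only counts the message. */
static void count_ns(struct demo_dev *dev)
{
	dev->ns_received++;
}

static const struct demo_nd_ops generic_ops = {
	.is_useropt = generic_is_useropt,
	.recv_ns = count_ns,
};

static const struct demo_nd_ops lowpan_ops = {
	.is_useropt = lowpan_is_useropt,
	.recv_ns = count_ns,
};

/* An ops table without callbacks: the wrappers fall back to no-ops. */
static const struct demo_nd_ops empty_ops = { NULL, NULL };
```

With this layout the generic code path stays unchanged for ordinary devices, while a link layer such as 6lowpan only has to install its own table at registration time.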
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
---
include/linux/netdevice.h | 3 ++
include/net/ndisc.h | 96 +++++++++++++++++++++++++++++++++++++++++++----
net/ipv6/addrconf.c | 1 +
net/ipv6/ndisc.c | 71 ++++++++++++++++++++++++-----------
net/ipv6/route.c | 2 +-
5 files changed, 144 insertions(+), 29 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0052c42..bc60033 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1677,6 +1677,9 @@ struct net_device {
#ifdef CONFIG_NET_L3_MASTER_DEV
const struct l3mdev_ops *l3mdev_ops;
#endif
+#if IS_ENABLED(CONFIG_IPV6)
+ const struct ndisc_ops *ndisc_ops;
+#endif
const struct header_ops *header_ops;
diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index aac868e..14ed016 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -110,7 +110,8 @@ struct ndisc_options {
#define NDISC_OPT_SPACE(len) (((len)+2+7)&~7)
-struct ndisc_options *ndisc_parse_options(u8 *opt, int opt_len,
+struct ndisc_options *ndisc_parse_options(const struct net_device *dev,
+ u8 *opt, int opt_len,
struct ndisc_options *ndopts);
/*
@@ -173,6 +174,93 @@ static inline struct neighbour *__ipv6_neigh_lookup(struct net_device *dev, cons
return n;
}
+static inline int __ip6_ndisc_is_useropt(struct nd_opt_hdr *opt)
+{
+ return opt->nd_opt_type == ND_OPT_RDNSS ||
+ opt->nd_opt_type == ND_OPT_DNSSL;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+struct ndisc_ops {
+ int (*is_useropt)(struct nd_opt_hdr *opt);
+ void (*send_na)(struct net_device *dev,
+ const struct in6_addr *daddr,
+ const struct in6_addr *solicited_addr,
+ bool router, bool solicited,
+ bool override, bool inc_opt);
+ void (*recv_na)(struct sk_buff *skb);
+ void (*send_ns)(struct net_device *dev,
+ const struct in6_addr *solicit,
+ const struct in6_addr *daddr,
+ const struct in6_addr *saddr);
+ void (*recv_ns)(struct sk_buff *skb);
+};
+
+static inline int ndisc_is_useropt(const struct net_device *dev,
+ struct nd_opt_hdr *opt)
+{
+ if (likely(dev->ndisc_ops->is_useropt))
+ return dev->ndisc_ops->is_useropt(opt);
+ else
+ return 0;
+}
+
+static inline void ndisc_send_na(struct net_device *dev,
+ const struct in6_addr *daddr,
+ const struct in6_addr *solicited_addr,
+ bool router, bool solicited, bool override,
+ bool inc_opt)
+{
+ if (likely(dev->ndisc_ops->send_na))
+ dev->ndisc_ops->send_na(dev, daddr, solicited_addr, router,
+ solicited, override, inc_opt);
+}
+
+static inline void ndisc_recv_na(struct sk_buff *skb)
+{
+ if (likely(skb->dev->ndisc_ops->recv_na))
+ skb->dev->ndisc_ops->recv_na(skb);
+}
+
+static inline void ndisc_send_ns(struct net_device *dev,
+ const struct in6_addr *solicit,
+ const struct in6_addr *daddr,
+ const struct in6_addr *saddr)
+{
+ if (likely(dev->ndisc_ops->send_ns))
+ dev->ndisc_ops->send_ns(dev, solicit, daddr, saddr);
+}
+
+static inline void ndisc_recv_ns(struct sk_buff *skb)
+{
+ if (likely(skb->dev->ndisc_ops->recv_ns))
+ skb->dev->ndisc_ops->recv_ns(skb);
+}
+#else
+static inline int ndisc_is_useropt(const struct net_device *dev,
+ struct nd_opt_hdr *opt)
+{
+ return 0;
+}
+
+static inline void ndisc_send_na(struct net_device *dev,
+ const struct in6_addr *daddr,
+ const struct in6_addr *solicited_addr,
+ bool router, bool solicited, bool override,
+ bool inc_opt) { }
+
+static inline void ndisc_recv_na(struct sk_buff *skb) { }
+
+static inline void ndisc_send_ns(struct net_device *dev,
+ const struct in6_addr *solicit,
+ const struct in6_addr *daddr,
+ const struct in6_addr *saddr) { }
+
+static inline void ndisc_recv_ns(struct sk_buff *skb) { }
+#endif
+
+void ip6_register_ndisc_ops(struct net_device *dev);
+
int ndisc_init(void);
int ndisc_late_init(void);
@@ -181,14 +269,8 @@ void ndisc_cleanup(void);
int ndisc_rcv(struct sk_buff *skb);
-void ndisc_send_ns(struct net_device *dev, const struct in6_addr *solicit,
- const struct in6_addr *daddr, const struct in6_addr *saddr);
-
void ndisc_send_rs(struct net_device *dev,
const struct in6_addr *saddr, const struct in6_addr *daddr);
-void ndisc_send_na(struct net_device *dev, const struct in6_addr *daddr,
- const struct in6_addr *solicited_addr,
- bool router, bool solicited, bool override, bool inc_opt);
void ndisc_send_redirect(struct sk_buff *skb, const struct in6_addr *target);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 54e18c2..a2ef04b 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3266,6 +3266,7 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
idev = ipv6_add_dev(dev);
if (IS_ERR(idev))
return notifier_from_errno(PTR_ERR(idev));
+ ip6_register_ndisc_ops(dev);
}
break;
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 176c7c4..297080a 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -185,24 +185,25 @@ static struct nd_opt_hdr *ndisc_next_option(struct nd_opt_hdr *cur,
return cur <= end && cur->nd_opt_type == type ? cur : NULL;
}
-static inline int ndisc_is_useropt(struct nd_opt_hdr *opt)
+static inline int ip6_ndisc_is_useropt(struct nd_opt_hdr *opt)
{
- return opt->nd_opt_type == ND_OPT_RDNSS ||
- opt->nd_opt_type == ND_OPT_DNSSL;
+ return __ip6_ndisc_is_useropt(opt);
}
-static struct nd_opt_hdr *ndisc_next_useropt(struct nd_opt_hdr *cur,
+static struct nd_opt_hdr *ndisc_next_useropt(const struct net_device *dev,
+ struct nd_opt_hdr *cur,
struct nd_opt_hdr *end)
{
if (!cur || !end || cur >= end)
return NULL;
do {
cur = ((void *)cur) + (cur->nd_opt_len << 3);
- } while (cur < end && !ndisc_is_useropt(cur));
- return cur <= end && ndisc_is_useropt(cur) ? cur : NULL;
+ } while (cur < end && !ndisc_is_useropt(dev, cur));
+ return cur <= end && ndisc_is_useropt(dev, cur) ? cur : NULL;
}
-struct ndisc_options *ndisc_parse_options(u8 *opt, int opt_len,
+struct ndisc_options *ndisc_parse_options(const struct net_device *dev,
+ u8 *opt, int opt_len,
struct ndisc_options *ndopts)
{
struct nd_opt_hdr *nd_opt = (struct nd_opt_hdr *)opt;
@@ -243,7 +244,7 @@ struct ndisc_options *ndisc_parse_options(u8 *opt, int opt_len,
break;
#endif
default:
- if (ndisc_is_useropt(nd_opt)) {
+ if (ndisc_is_useropt(dev, nd_opt)) {
ndopts->nd_useropts_end = nd_opt;
if (!ndopts->nd_useropts)
ndopts->nd_useropts = nd_opt;
@@ -479,9 +480,11 @@ static void ndisc_send_skb(struct sk_buff *skb,
rcu_read_unlock();
}
-void ndisc_send_na(struct net_device *dev, const struct in6_addr *daddr,
- const struct in6_addr *solicited_addr,
- bool router, bool solicited, bool override, bool inc_opt)
+static void ip6_ndisc_send_na(struct net_device *dev,
+ const struct in6_addr *daddr,
+ const struct in6_addr *solicited_addr,
+ bool router, bool solicited, bool override,
+ bool inc_opt)
{
struct sk_buff *skb;
struct in6_addr tmpaddr;
@@ -555,8 +558,10 @@ static void ndisc_send_unsol_na(struct net_device *dev)
in6_dev_put(idev);
}
-void ndisc_send_ns(struct net_device *dev, const struct in6_addr *solicit,
- const struct in6_addr *daddr, const struct in6_addr *saddr)
+static void ip6_ndisc_send_ns(struct net_device *dev,
+ const struct in6_addr *solicit,
+ const struct in6_addr *daddr,
+ const struct in6_addr *saddr)
{
struct sk_buff *skb;
struct in6_addr addr_buf;
@@ -702,7 +707,7 @@ static int pndisc_is_router(const void *pkey,
return ret;
}
-static void ndisc_recv_ns(struct sk_buff *skb)
+static void ip6_ndisc_recv_ns(struct sk_buff *skb)
{
struct nd_msg *msg = (struct nd_msg *)skb_transport_header(skb);
const struct in6_addr *saddr = &ipv6_hdr(skb)->saddr;
@@ -738,7 +743,7 @@ static void ndisc_recv_ns(struct sk_buff *skb)
return;
}
- if (!ndisc_parse_options(msg->opt, ndoptlen, &ndopts)) {
+ if (!ndisc_parse_options(dev, msg->opt, ndoptlen, &ndopts)) {
ND_PRINTK(2, warn, "NS: invalid ND options\n");
return;
}
@@ -874,7 +879,7 @@ out:
in6_dev_put(idev);
}
-static void ndisc_recv_na(struct sk_buff *skb)
+static void ip6_ndisc_recv_na(struct sk_buff *skb)
{
struct nd_msg *msg = (struct nd_msg *)skb_transport_header(skb);
struct in6_addr *saddr = &ipv6_hdr(skb)->saddr;
@@ -912,7 +917,7 @@ static void ndisc_recv_na(struct sk_buff *skb)
idev->cnf.drop_unsolicited_na)
return;
- if (!ndisc_parse_options(msg->opt, ndoptlen, &ndopts)) {
+ if (!ndisc_parse_options(dev, msg->opt, ndoptlen, &ndopts)) {
ND_PRINTK(2, warn, "NS: invalid ND option\n");
return;
}
@@ -1019,7 +1024,7 @@ static void ndisc_recv_rs(struct sk_buff *skb)
goto out;
/* Parse ND options */
- if (!ndisc_parse_options(rs_msg->opt, ndoptlen, &ndopts)) {
+ if (!ndisc_parse_options(skb->dev, rs_msg->opt, ndoptlen, &ndopts)) {
ND_PRINTK(2, notice, "NS: invalid ND option, ignored\n");
goto out;
}
@@ -1137,7 +1142,7 @@ static void ndisc_router_discovery(struct sk_buff *skb)
return;
}
- if (!ndisc_parse_options(opt, optlen, &ndopts)) {
+ if (!ndisc_parse_options(skb->dev, opt, optlen, &ndopts)) {
ND_PRINTK(2, warn, "RA: invalid ND options\n");
return;
}
@@ -1424,7 +1429,8 @@ skip_routeinfo:
struct nd_opt_hdr *p;
for (p = ndopts.nd_useropts;
p;
- p = ndisc_next_useropt(p, ndopts.nd_useropts_end)) {
+ p = ndisc_next_useropt(skb->dev, p,
+ ndopts.nd_useropts_end)) {
ndisc_ra_useropt(skb, p);
}
}
@@ -1462,7 +1468,7 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
return;
}
- if (!ndisc_parse_options(msg->opt, ndoptlen, &ndopts))
+ if (!ndisc_parse_options(skb->dev, msg->opt, ndoptlen, &ndopts))
return;
if (!ndopts.nd_opts_rh) {
@@ -1783,6 +1789,29 @@ int ndisc_ifinfo_sysctl_change(struct ctl_table *ctl, int write, void __user *bu
#endif
+static const struct ndisc_ops ip6_ndisc_ops = {
+ .is_useropt = ip6_ndisc_is_useropt,
+ .send_na = ip6_ndisc_send_na,
+ .recv_na = ip6_ndisc_recv_na,
+ .send_ns = ip6_ndisc_send_ns,
+ .recv_ns = ip6_ndisc_recv_ns,
+};
+
+void ip6_register_ndisc_ops(struct net_device *dev)
+{
+ switch (dev->type) {
+ default:
+ if (dev->ndisc_ops) {
+ ND_PRINTK(2, warn,
+ "%s: ndisc_ops already defined for interface type=%d\n",
+ __func__, dev->type);
+ } else {
+ dev->ndisc_ops = &ip6_ndisc_ops;
+ }
+ break;
+ }
+}
+
static int __net_init ndisc_net_init(struct net *net)
{
struct ipv6_pinfo *np;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index cc180b3..5fa276d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2149,7 +2149,7 @@ static void rt6_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_bu
* first-hop router for the specified ICMP Destination Address.
*/
- if (!ndisc_parse_options(msg->opt, optlen, &ndopts)) {
+ if (!ndisc_parse_options(skb->dev, msg->opt, optlen, &ndopts)) {
net_dbg_ratelimited("rt6_redirect: invalid ND options\n");
return;
}
--
2.8.0
^ permalink raw reply related