* Re: [Bug reporting] kernel panic during handle the dst unreach icmp msg.
From: Eric Dumazet @ 2019-02-14 17:15 UTC
To: soukjin.bae, netdev@vger.kernel.org
Cc: 박종언, Steffen Klassert, Herbert Xu
In-Reply-To: <20190214074641epcms1p1db1c5589f96718a440a166328eec9ebd@epcms1p1>
On 02/13/2019 11:46 PM, 배석진 wrote:
> Dear all,
>
>
> https://www.mail-archive.com/netdev@vger.kernel.org/msg256527.html
>
> As we raised in the mail thread above,
> we have hit a problem caused by a socket that was not removed.
>
> (from now on, 'the socket' means the socket allocated at 0xFFFFFFC0051E5E00)
>
> #1. the socket is in state FIN_WAIT1 (skc_state 4); probably its process closed the socket.
> below is memory dump information from Trace32.
>
> (struct sock *)0xFFFFFFC0051E5E00 = 0xFFFFFFC0051E5E00 = end+0x3FF9E4CE00 -> (
> __sk_common = (
> ...
> skc_rcv_saddr = 0x0200A8C0, ==> 192.168.0.2
> ...
> skc_state = 4, ==> FIN_WAIT1
> ...
> skc_flags = 0x4301, ==> SOCK_DEAD(0x01) set
>
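The annotations in the dump above can be checked by hand. The following is a userspace sketch, not kernel code: `format_be32_as_read_on_le()` and `flag_bit_set()` are hypothetical helper names, and the sketch assumes the debugger printed `skc_rcv_saddr` as a 32-bit value read on a little-endian CPU, so the lowest byte is the first octet of the address.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a network-order IPv4 address that was read as a 32-bit integer
 * on a little-endian machine (hypothetical helper, for illustration). */
void format_be32_as_read_on_le(uint32_t v, char *buf, size_t len)
{
	snprintf(buf, len, "%u.%u.%u.%u",
		 (unsigned)(v & 0xff), (unsigned)((v >> 8) & 0xff),
		 (unsigned)((v >> 16) & 0xff), (unsigned)((v >> 24) & 0xff));
}

/* Return 1 if bit number `bit` is set in a flags word, else 0. */
int flag_bit_set(unsigned long flags, unsigned int bit)
{
	return (int)((flags >> bit) & 1UL);
}
```

Under those assumptions, 0x0200A8C0 formats as 192.168.0.2, and `flag_bit_set(0x4301, 0)` returns 1, which matches the SOCK_DEAD(0x01) annotation if SOCK_DEAD is bit 0 of the flags word.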
>
> #2. the user switched to another Wi-Fi AP, so the previous netdevice was deleted and its sockets were destroyed.
>
> [60392.948657][4: netd] 02-13 00:39:32.095 5249 5323 I NetdDestroyed 30 sockets on 192.168.0.2 in 2.7 ms
> [60392.948705][4: netd] 02-13 00:39:32.095 5249 5323 D Netdnotify() code: 614, msg: Address removed 192.168.0.2/24 wlan0 128 0
>
> --> the socket will still exist for a while,
> because 'sock_diag_destroy() -> tcp_abort()' cannot call tcp_done() for the socket,
> but it does clear the socket's sk_write_queue by calling tcp_write_queue_purge(sk).
>
>
> #3. an ICMP destination-unreachable message arrived for a packet sent by the socket.
> to retransmit, the kernel looks up the sk and finds it (because the socket still exists),
> but its sk_write_queue was already cleared, so there is no skb to send,
> and this triggers the kernel BUG.
>
> <4>[60392.948306] I[1: ksoftirqd/1: 19] ------------[ cut here ]------------
> <0>[60392.948334] I[1: ksoftirqd/1: 19] kernel BUG at net/ipv4/tcp_ipv4.c:519!
> <2>[60392.948344] I[1: ksoftirqd/1: 19] sec_debug_set_extra_info_fault = BUG / 0xffffff80090351d0
> <0>[60392.948386] I[1: ksoftirqd/1: 19] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> ...
> <4>[60392.950676] I[1: ksoftirqd/1: 19] PC is at tcp_v4_err+0x4b0/0x4bc
> <4>[60392.950684] I[1: ksoftirqd/1: 19] LR is at tcp_v4_err+0x3ac/0x4bc
>
>
> 370 void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
> 371 {
> ...
> 516 icsk->icsk_rto = inet_csk_rto_backoff(icsk, TCP_RTO_MAX);
> 517
> 518 skb = tcp_write_queue_head(sk);
> 519 BUG_ON(!skb);
> 520
> 521 tcp_mstamp_refresh(tp);
>
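The sequence in steps #2 and #3 can be modelled in userspace. This is only a sketch of the reported scenario: every `toy_*` name is hypothetical, the structures are drastically simplified stand-ins for `struct sock` and `struct sk_buff`, and the `check_dead` switch merely mirrors the check proposed in this report without claiming it is the right fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_skb {
	struct toy_skb *next;
};

struct toy_sock {
	struct toy_skb *write_queue_head; /* NULL when the queue is empty */
	bool dead;                        /* models SOCK_DEAD */
};

/* Models tcp_write_queue_purge(): the queue is emptied, the socket stays. */
void toy_purge_write_queue(struct toy_sock *sk)
{
	sk->write_queue_head = NULL;
}

enum toy_result { TOY_RETRANSMIT, TOY_IGNORED, TOY_WOULD_BUG };

/* Models the ICMP_DEST_UNREACH retransmit path from step #3.
 * check_dead selects whether a SOCK_DEAD test is applied first. */
enum toy_result toy_handle_dest_unreach(const struct toy_sock *sk,
					bool check_dead)
{
	if (check_dead && sk->dead)
		return TOY_IGNORED;   /* early exit on a dead socket */
	if (!sk->write_queue_head)
		return TOY_WOULD_BUG; /* where BUG_ON(!skb) would fire */
	return TOY_RETRANSMIT;
}
```

In this model a dead socket whose queue has been purged reaches the `TOY_WOULD_BUG` branch unless the dead check runs first. The model says nothing about why the socket is still found by the lookup.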
>
> we know that line 519 has been removed in the latest tree, so there the same condition would show up as a kernel panic instead.
> how about the change below? do not retransmit packets when the socket was already closed.
>
> best regards,
>
>
>
> From: soukjin bae <soukjin.bae@samsung.com>
> Date: Wen, 14 Jan 2019 14:26:35 +0900
> Subject: net: Don't retransmit packets when socket was already closed
>
> Signed-off-by: soukjin bae <soukjin.bae@samsung.com>
> Signed-off-by: jongeon park <jongeon.park@samsung.com>
> ---
> net/ipv4/tcp_ipv4.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index fe4daf6..654bd19 100755
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
>
> @@ -442,6 +465,10 @@ void tcp_v4_err(struct sk_buff *icmp_skb, u32 info)
> err = EPROTO;
> break;
> case ICMP_DEST_UNREACH:
> + /* Don't retransmit packets when socket was already closed */
> + if (sock_flag(sk, SOCK_DEAD))
> + goto out;
> +
> if (code > NR_ICMP_UNREACH)
> goto out;
>
I do not believe this patch is needed.
You probably hit another, more serious bug, but since you did not post the full stack trace
it is hard to help.
Are you using a vti tunnel?
I just got a syzbot report that might give us a clue :
(I suspect commit 61220ab349485d911083d0b7990ccd3db6c63297 vti6: Enable namespace changing
was wrong, since vti tunnels have t->net assigned to a struct net without holding a reference)
So we end up freeing a struct net (and associated resources) too soon.
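The class of bug suspected here, a pointer stored without taking a reference, can be illustrated with a generic userspace refcount model. This is a sketch under loud assumptions: the `toy_*` names are hypothetical, the model does not reflect how the kernel actually manages `struct net` lifetimes, and "freed" is only a flag so that the example stays well defined.

```c
#include <stdbool.h>

/* Toy model of a reference-counted object; names are hypothetical. */
struct toy_net {
	int refcnt;
	bool freed;
};

void toy_get(struct toy_net *net)
{
	net->refcnt++;
}

void toy_put(struct toy_net *net)
{
	if (--net->refcnt == 0)
		net->freed = true; /* the real object would be released here */
}

struct toy_tunnel {
	struct toy_net *net;
};

/* Attach a tunnel to a net, optionally taking a reference. */
void toy_tunnel_attach(struct toy_tunnel *t, struct toy_net *net,
		       bool hold_ref)
{
	t->net = net;
	if (hold_ref)
		toy_get(net);
}

/* Returns true if the tunnel's net pointer refers to a released object
 * after the original owner drops its own reference. */
bool toy_dangling_after_owner_put(bool hold_ref)
{
	struct toy_net net = { .refcnt = 1, .freed = false };
	struct toy_tunnel t;

	toy_tunnel_attach(&t, &net, hold_ref);
	toy_put(&net); /* the owner goes away */
	return t.net->freed;
}
```

Without the extra reference the tunnel is left pointing at a released object; with it, the object outlives the owner until the tunnel drops its reference.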
BUG: KASAN: slab-out-of-bounds in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
BUG: KASAN: slab-out-of-bounds in queued_spin_trylock include/asm-generic/qspinlock.h:69 [inline]
BUG: KASAN: slab-out-of-bounds in do_raw_spin_trylock+0x6a/0x180 kernel/locking/spinlock_debug.c:119
Read of size 4 at addr ffff888066405d9c by task syz-executor.4/10575
CPU: 0 PID: 10575 Comm: syz-executor.4 Not tainted 5.0.0-rc6+ #70
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
check_memory_region_inline mm/kasan/generic.c:185 [inline]
check_memory_region+0x123/0x190 mm/kasan/generic.c:191
kasan_check_read+0x11/0x20 mm/kasan/common.c:100
atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
queued_spin_trylock include/asm-generic/qspinlock.h:69 [inline]
do_raw_spin_trylock+0x6a/0x180 kernel/locking/spinlock_debug.c:119
__raw_spin_trylock include/linux/spinlock_api_smp.h:89 [inline]
_raw_spin_trylock+0x1c/0x80 kernel/locking/spinlock.c:128
spin_trylock include/linux/spinlock.h:339 [inline]
icmp_xmit_lock net/ipv4/icmp.c:219 [inline]
icmp_send+0x54c/0x1400 net/ipv4/icmp.c:665
ipv4_link_failure+0x2c/0x210 net/ipv4/route.c:1187
dst_link_failure include/net/dst.h:427 [inline]
vti6_xmit net/ipv6/ip6_vti.c:514 [inline]
vti6_tnl_xmit+0x10db/0x1c6e net/ipv6/ip6_vti.c:553
__netdev_start_xmit include/linux/netdevice.h:4385 [inline]
netdev_start_xmit include/linux/netdevice.h:4394 [inline]
xmit_one net/core/dev.c:3278 [inline]
dev_hard_start_xmit+0x1b2/0x980 net/core/dev.c:3294
__dev_queue_xmit+0x26e5/0x2fe0 net/core/dev.c:3864
dev_queue_xmit+0x18/0x20 net/core/dev.c:3897
neigh_direct_output+0x16/0x20 net/core/neighbour.c:1516
neigh_output include/net/neighbour.h:508 [inline]
ip_finish_output2+0x949/0x1740 net/ipv4/ip_output.c:229
ip_finish_output+0x73c/0xd50 net/ipv4/ip_output.c:317
NF_HOOK_COND include/linux/netfilter.h:278 [inline]
ip_output+0x21f/0x670 net/ipv4/ip_output.c:405
dst_output include/net/dst.h:444 [inline]
ip_local_out+0xc4/0x1b0 net/ipv4/ip_output.c:124
__ip_queue_xmit+0x86f/0x1bf0 net/ipv4/ip_output.c:505
ip_queue_xmit+0x5a/0x70 include/net/ip.h:198
__tcp_transmit_skb+0x1a5f/0x3680 net/ipv4/tcp_output.c:1160
tcp_transmit_skb net/ipv4/tcp_output.c:1176 [inline]
tcp_write_xmit+0xe89/0x5160 net/ipv4/tcp_output.c:2401
__tcp_push_pending_frames+0xb4/0x350 net/ipv4/tcp_output.c:2577
tcp_send_fin+0x149/0xbb0 net/ipv4/tcp_output.c:3122
tcp_close+0xddf/0x10c0 net/ipv4/tcp.c:2405
inet_release+0x105/0x1f0 net/ipv4/af_inet.c:428
__sock_release+0xd3/0x250 net/socket.c:579
sock_close+0x1b/0x30 net/socket.c:1139
__fput+0x2df/0x8d0 fs/file_table.c:278
____fput+0x16/0x20 fs/file_table.c:309
task_work_run+0x14a/0x1c0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_usermode_loop+0x273/0x2c0 arch/x86/entry/common.c:166
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
do_fast_syscall_32+0xa9d/0xc98 arch/x86/entry/common.c:397
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7fe8869
Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:000000000845fdac EFLAGS: 00000216 ORIG_RAX: 0000000000000006
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
RDX: 0000000000000005 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Allocated by task 9609:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
__kasan_kmalloc mm/kasan/common.c:496 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469
kasan_kmalloc mm/kasan/common.c:504 [inline]
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:411
kmem_cache_alloc_node+0x144/0x710 mm/slab.c:3633
alloc_task_struct_node kernel/fork.c:158 [inline]
dup_task_struct kernel/fork.c:845 [inline]
copy_process.part.0+0x1d08/0x79a0 kernel/fork.c:1753
copy_process kernel/fork.c:1710 [inline]
_do_fork+0x257/0xfe0 kernel/fork.c:2227
__do_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:240 [inline]
__se_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:236 [inline]
__ia32_compat_sys_x86_clone+0xbc/0x140 arch/x86/ia32/sys_ia32.c:236
do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
do_int80_syscall_32+0x14d/0x670 arch/x86/entry/common.c:349
entry_INT80_compat+0x76/0x80 arch/x86/entry/entry_64_compat.S:413
Freed by task 9627:
save_stack+0x45/0xd0 mm/kasan/common.c:73
set_track mm/kasan/common.c:85 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:458
kasan_slab_free+0xe/0x10 mm/kasan/common.c:466
__cache_free mm/slab.c:3487 [inline]
kmem_cache_free+0x86/0x260 mm/slab.c:3749
free_task_struct kernel/fork.c:163 [inline]
free_task+0xdd/0x120 kernel/fork.c:458
__put_task_struct+0x20a/0x4e0 kernel/fork.c:731
put_task_struct include/linux/sched/task.h:98 [inline]
delayed_put_task_struct+0x1fd/0x350 kernel/exit.c:181
__rcu_reclaim kernel/rcu/rcu.h:240 [inline]
rcu_do_batch kernel/rcu/tree.c:2452 [inline]
invoke_rcu_callbacks kernel/rcu/tree.c:2773 [inline]
rcu_process_callbacks+0x928/0x1390 kernel/rcu/tree.c:2754
__do_softirq+0x266/0x95a kernel/softirq.c:292
The buggy address belongs to the object at ffff888066404540
which belongs to the cache task_struct(81:syz5) of size 6080
The buggy address is located 156 bytes to the right of
6080-byte region [ffff888066404540, ffff888066405d00)
The buggy address belongs to the page:
page:ffffea0001990100 count:1 mapcount:0 mapping:ffff888092e85080 index:0x0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea00026efe08 ffffea0002554f08 ffff888092e85080
raw: 0000000000000000 ffff888066404540 0000000100000001 ffff8880602fe480
page dumped because: kasan: bad access detected
page->mem_cgroup:ffff8880602fe480
Memory state around the buggy address:
ffff888066405c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888066405d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888066405d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff888066405e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888066405e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
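The sizes quoted in the KASAN report can be re-derived from its addresses. The two helpers below are hypothetical names used only for this arithmetic check; the addresses come from the report itself.

```c
#include <stdint.h>

/* Size in bytes of the half-open region [start, end). */
uint64_t region_size(uint64_t start, uint64_t end)
{
	return end - start;
}

/* How many bytes addr lies past the end of a region ending at `end`. */
uint64_t bytes_past_end(uint64_t addr, uint64_t end)
{
	return addr - end;
}
```

With the region [ffff888066404540, ffff888066405d00) the size is 0x17c0 = 6080 bytes, and the accessed address ffff888066405d9c lies 0x9c = 156 bytes past its end, in agreement with the report.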
* [PATCH net-next 1/2] trace: events: add a few neigh tracepoints
From: Roopa Prabhu @ 2019-02-14 17:15 UTC
To: davem; +Cc: netdev, dsa
In-Reply-To: <1550164511-21195-1-git-send-email-roopa@cumulusnetworks.com>
From: Roopa Prabhu <roopa@cumulusnetworks.com>
The goal here is to trace neigh state changes covering all possible
neigh update paths. Plus have a specific trace point in neigh_update
to cover flags sent to neigh_update.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
include/trace/events/neigh.h | 204 +++++++++++++++++++++++++++++++++++++++++++
net/core/net-traces.c | 8 ++
2 files changed, 212 insertions(+)
create mode 100644 include/trace/events/neigh.h
diff --git a/include/trace/events/neigh.h b/include/trace/events/neigh.h
new file mode 100644
index 0000000..ed10353
--- /dev/null
+++ b/include/trace/events/neigh.h
@@ -0,0 +1,204 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM neigh
+
+#if !defined(_TRACE_NEIGH_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NEIGH_H
+
+#include <linux/skbuff.h>
+#include <linux/netdevice.h>
+#include <linux/tracepoint.h>
+#include <net/neighbour.h>
+
+#define neigh_state_str(state) \
+ __print_symbolic(state, \
+ { NUD_INCOMPLETE, "incomplete" }, \
+ { NUD_REACHABLE, "reachable" }, \
+ { NUD_STALE, "stale" }, \
+ { NUD_DELAY, "delay" }, \
+ { NUD_PROBE, "probe" }, \
+ { NUD_FAILED, "failed" })
+
+TRACE_EVENT(neigh_update,
+
+ TP_PROTO(struct neighbour *n, const u8 *lladdr, u8 new,
+ u32 flags, u32 nlmsg_pid),
+
+ TP_ARGS(n, lladdr, new, flags, nlmsg_pid),
+
+ TP_STRUCT__entry(
+ __field(u32, family)
+ __string(dev, (n->dev ? n->dev->name : "NULL"))
+ __array(u8, lladdr, MAX_ADDR_LEN)
+ __field(u8, lladdr_len)
+ __field(u8, flags)
+ __field(u8, nud_state)
+ __field(u8, type)
+ __field(u8, dead)
+ __field(int, refcnt)
+ __array(__u8, primary_key4, 4)
+ __array(__u8, primary_key6, 16)
+ __field(unsigned long, confirmed)
+ __field(unsigned long, updated)
+ __field(unsigned long, used)
+ __array(u8, new_lladdr, MAX_ADDR_LEN)
+ __field(u8, new_state)
+ __field(u32, update_flags)
+ __field(u32, pid)
+ ),
+
+ TP_fast_assign(
+ int lladdr_len = (n->dev ? n->dev->addr_len : MAX_ADDR_LEN);
+ struct in6_addr *pin6;
+ __be32 *p32;
+
+ __entry->family = n->tbl->family;
+ __assign_str(dev, (n->dev ? n->dev->name : "NULL"));
+ __entry->lladdr_len = lladdr_len;
+ memcpy(__entry->lladdr, n->ha, lladdr_len);
+ __entry->flags = n->flags;
+ __entry->nud_state = n->nud_state;
+ __entry->type = n->type;
+ __entry->dead = n->dead;
+ __entry->refcnt = refcount_read(&n->refcnt);
+ pin6 = (struct in6_addr *)__entry->primary_key6;
+ p32 = (__be32 *)__entry->primary_key4;
+
+ if (n->tbl->family == AF_INET)
+ *p32 = *(__be32 *)n->primary_key;
+ else
+ *p32 = 0;
+
+#if IS_ENABLED(CONFIG_IPV6)
+ if (n->tbl->family == AF_INET6) {
+ pin6 = (struct in6_addr *)__entry->primary_key6;
+ *pin6 = *(struct in6_addr *)n->primary_key;
+ } else
+#endif
+ {
+ ipv6_addr_set_v4mapped(*p32, pin6);
+ }
+ __entry->confirmed = n->confirmed;
+ __entry->updated = n->updated;
+ __entry->used = n->used;
+ if (lladdr)
+ memcpy(__entry->new_lladdr, lladdr, lladdr_len);
+ __entry->new_state = new;
+ __entry->update_flags = flags;
+ __entry->pid = nlmsg_pid;
+ ),
+
+ TP_printk("family %d dev %s lladdr %s flags %02x nud_state %s type %02x "
+ "dead %d refcnt %d primary_key4 %pI4 primary_key6 %pI6c "
+ "confirmed %lu updated %lu used %lu new_lladdr %s "
+ "new_state %02x update_flags %02x pid %d",
+ __entry->family, __get_str(dev),
+ __print_hex_str(__entry->lladdr, __entry->lladdr_len),
+ __entry->flags, neigh_state_str(__entry->nud_state),
+ __entry->type, __entry->dead, __entry->refcnt,
+ __entry->primary_key4, __entry->primary_key6,
+ __entry->confirmed, __entry->updated, __entry->used,
+ __print_hex_str(__entry->new_lladdr, __entry->lladdr_len),
+ __entry->new_state,
+ __entry->update_flags, __entry->pid)
+);
+
+DECLARE_EVENT_CLASS(neigh__update,
+ TP_PROTO(struct neighbour *n, int err),
+ TP_ARGS(n, err),
+ TP_STRUCT__entry(
+ __field(u32, family)
+ __string(dev, (n->dev ? n->dev->name : "NULL"))
+ __array(u8, lladdr, MAX_ADDR_LEN)
+ __field(u8, lladdr_len)
+ __field(u8, flags)
+ __field(u8, nud_state)
+ __field(u8, type)
+ __field(u8, dead)
+ __field(int, refcnt)
+ __array(__u8, primary_key4, 4)
+ __array(__u8, primary_key6, 16)
+ __field(unsigned long, confirmed)
+ __field(unsigned long, updated)
+ __field(unsigned long, used)
+ __field(u32, err)
+ ),
+
+ TP_fast_assign(
+ int lladdr_len = (n->dev ? n->dev->addr_len : MAX_ADDR_LEN);
+ struct in6_addr *pin6;
+ __be32 *p32;
+
+ __entry->family = n->tbl->family;
+ __assign_str(dev, (n->dev ? n->dev->name : "NULL"));
+ __entry->lladdr_len = lladdr_len;
+ memcpy(__entry->lladdr, n->ha, lladdr_len);
+ __entry->flags = n->flags;
+ __entry->nud_state = n->nud_state;
+ __entry->type = n->type;
+ __entry->dead = n->dead;
+ __entry->refcnt = refcount_read(&n->refcnt);
+ pin6 = (struct in6_addr *)__entry->primary_key6;
+ p32 = (__be32 *)__entry->primary_key4;
+
+ if (n->tbl->family == AF_INET)
+ *p32 = *(__be32 *)n->primary_key;
+ else
+ *p32 = 0;
+
+#if IS_ENABLED(CONFIG_IPV6)
+ if (n->tbl->family == AF_INET6) {
+ pin6 = (struct in6_addr *)__entry->primary_key6;
+ *pin6 = *(struct in6_addr *)n->primary_key;
+ } else
+#endif
+ {
+ ipv6_addr_set_v4mapped(*p32, pin6);
+ }
+
+ __entry->confirmed = n->confirmed;
+ __entry->updated = n->updated;
+ __entry->used = n->used;
+ __entry->err = err;
+ ),
+
+ TP_printk("family %d dev %s lladdr %s flags %02x nud_state %s type %02x "
+ "dead %d refcnt %d primary_key4 %pI4 primary_key6 %pI6c "
+ "confirmed %lu updated %lu used %lu err %d",
+ __entry->family, __get_str(dev),
+ __print_hex_str(__entry->lladdr, __entry->lladdr_len),
+ __entry->flags, neigh_state_str(__entry->nud_state),
+ __entry->type, __entry->dead, __entry->refcnt,
+ __entry->primary_key4, __entry->primary_key6,
+ __entry->confirmed, __entry->updated, __entry->used,
+ __entry->err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_update_done,
+ TP_PROTO(struct neighbour *neigh, int err),
+ TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_timer_handler,
+ TP_PROTO(struct neighbour *neigh, int err),
+ TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_event_send_done,
+ TP_PROTO(struct neighbour *neigh, int err),
+ TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_event_send_dead,
+ TP_PROTO(struct neighbour *neigh, int err),
+ TP_ARGS(neigh, err)
+);
+
+DEFINE_EVENT(neigh__update, neigh_cleanup_and_release,
+ TP_PROTO(struct neighbour *neigh, int rc),
+ TP_ARGS(neigh, rc)
+);
+
+#endif /* _TRACE_NEIGH_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/net/core/net-traces.c b/net/core/net-traces.c
index 419af6d..470b179 100644
--- a/net/core/net-traces.c
+++ b/net/core/net-traces.c
@@ -43,6 +43,14 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(fdb_delete);
EXPORT_TRACEPOINT_SYMBOL_GPL(br_fdb_update);
#endif
+#include <trace/events/neigh.h>
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_update);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_update_done);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_timer_handler);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_event_send_done);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_event_send_dead);
+EXPORT_TRACEPOINT_SYMBOL_GPL(neigh_cleanup_and_release);
+
EXPORT_TRACEPOINT_SYMBOL_GPL(kfree_skb);
EXPORT_TRACEPOINT_SYMBOL_GPL(napi_poll);
--
2.1.4
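In the TP_fast_assign blocks of the patch above, a neighbour entry that is not IPv6 gets its primary_key6 field filled through ipv6_addr_set_v4mapped(). The userspace sketch below shows the byte layout that results; `toy_set_v4mapped()` is a hypothetical stand-in that works on plain byte arrays rather than the kernel's `struct in6_addr`.

```c
#include <stdint.h>
#include <string.h>

/* Build ::ffff:a.b.c.d from four IPv4 bytes given in network order.
 * Userspace illustration of the v4-mapped IPv6 address layout. */
void toy_set_v4mapped(const uint8_t v4[4], uint8_t v6[16])
{
	memset(v6, 0, 10);      /* bytes 0..9: zero */
	v6[10] = 0xff;          /* bytes 10..11: 0xffff */
	v6[11] = 0xff;
	memcpy(&v6[12], v4, 4); /* bytes 12..15: the IPv4 address */
}
```

For an AF_INET neighbour 192.168.0.2 this yields ::ffff:192.168.0.2. For other non-IPv6 families the patch sets the 4-byte key to 0 first, so the mapped form carries a zero address.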
* [PATCH net-next 0/2] tracepoints in neighbor subsystem
From: Roopa Prabhu @ 2019-02-14 17:15 UTC
To: davem; +Cc: netdev, dsa
From: Roopa Prabhu <roopa@cumulusnetworks.com>
Roopa Prabhu (2):
trace: events: add a few neigh tracepoints
neigh: hook tracepoints in neigh update code
include/trace/events/neigh.h | 213 +++++++++++++++++++++++++++++++++++++++++++
net/core/neighbour.c | 11 +++
net/core/net-traces.c | 8 ++
3 files changed, 232 insertions(+)
create mode 100644 include/trace/events/neigh.h
--
2.1.4
* Re: [RESEND PATCH net] mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs
From: David Miller @ 2019-02-14 17:13 UTC
To: jannh
Cc: netdev, linux-mm, linux-kernel, mhocko, vbabka, pavel.tatashin,
osalvador, mgorman, aaron.lu, alexander.h.duyck
In-Reply-To: <20190213214559.125666-1-jannh@google.com>
From: Jann Horn <jannh@google.com>
Date: Wed, 13 Feb 2019 22:45:59 +0100
> The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum
> number of references that we might need to create in the fastpath later,
> the bump-allocation fastpath only has to modify the non-atomic bias value
> that tracks the number of extra references we hold instead of the atomic
> refcount. The maximum number of allocations we can serve (under the
> assumption that no allocation is made with size 0) is nc->size, so that's
> the bias used.
>
> However, even when all memory in the allocation has been given away, a
> reference to the page is still held; and in the `offset < 0` slowpath, the
> page may be reused if everyone else has dropped their references.
> This means that the necessary number of references is actually
> `nc->size+1`.
>
> Luckily, from a quick grep, it looks like the only path that can call
> page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which
> requires CAP_NET_ADMIN in the init namespace and is only intended to be
> used for kernel testing and fuzzing.
>
> To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the
> `offset < 0` path, below the virt_to_page() call, and then repeatedly call
> writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI,
> with a vector consisting of 15 elements containing 1 byte each.
>
> Signed-off-by: Jann Horn <jannh@google.com>
Applied and queued up for -stable.
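The off-by-one in the quoted description can be followed with a small userspace model of the reference accounting. All `toy_*` names are hypothetical and the real allocator tracks more state (offset, page size, pfmemalloc); the model keeps only the two counters that the commit message discusses.

```c
/* Toy model of the reference accounting; names are hypothetical. */
struct toy_frag_cache {
	int page_refcount; /* models the atomic page refcount */
	int bias;          /* models nc->pagecnt_bias */
};

/* Pre-charge the page with `refs` references, all tracked by the bias. */
void toy_refill(struct toy_frag_cache *nc, int refs)
{
	nc->page_refcount = refs;
	nc->bias = refs;
}

/* Hand out one fragment: one pre-charged reference moves to the caller. */
void toy_alloc_frag(struct toy_frag_cache *nc)
{
	nc->bias--;
}

/* A consumer frees its fragment, dropping one page reference. */
void toy_free_frag(struct toy_frag_cache *nc)
{
	nc->page_refcount--;
}

/* Page refcount left after `size` one-byte fragments have all been
 * handed out and all freed, with `refs` references pre-charged. */
int toy_refs_left(int size, int refs)
{
	struct toy_frag_cache nc;
	int i;

	toy_refill(&nc, refs);
	for (i = 0; i < size; i++)
		toy_alloc_frag(&nc);
	for (i = 0; i < size; i++)
		toy_free_frag(&nc);
	return nc.page_refcount;
}
```

With `refs == size` the count reaches 0 while the cache still points at the page; with `refs == size + 1` one reference remains for the cache itself, which is the `nc->size+1` the description arrives at.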
* Re: [PATCH net-next] net: ip6_gre: Give ERSPAN a fill_info link op of its own
From: David Miller @ 2019-02-14 17:08 UTC
To: petrm; +Cc: netdev, kuznet, yoshfuji, lorenzo.bianconi
In-Reply-To: <c14a9085e87ca9e36ba7f5feea46e5750a5baeeb.1550086179.git.petrm@mellanox.com>
From: Petr Machata <petrm@mellanox.com>
Date: Wed, 13 Feb 2019 19:31:32 +0000
> In commit c706863bc890 ("net: ip6_gre: always reports o_key to
> userspace"), ip6gre and ip6gretap tunnels started reporting a TUNNEL_KEY
> output flag even if one was not configured at the device.
>
> When an okey-less ip6gre or ip6gretap netdevice is created, it initially
> encapsulates the packets without okey. But any configuration change
> (even a non-change such as setting TOS to an already-configured value)
> then causes the okey flag from the reported configuration to be
> circulated back to actual configuration. From that point on, the device
> encapsulates packets with output key of 0.
>
> The intention was to implement this behavior for ERSPAN devices, not for
> all ip6gre devices. The ERSPAN netdevice should really have its own
> fill_info callback. Add one.
>
> Fixes: c706863bc890 ("net: ip6_gre: always reports o_key to userspace")
> CC: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> Signed-off-by: Petr Machata <petrm@mellanox.com>
This commit you are fixing exists in the 'net' tree, therefore this is
a bug fix and should be targeted at 'net'.
* Re: [PATCH net 0/2] net: phy: fix locking issue
From: David Miller @ 2019-02-14 17:05 UTC
To: hkallweit1; +Cc: andrew, f.fainelli, linux, netdev
In-Reply-To: <2a39271d-3b9e-e425-98b4-b2a24074e806@gmail.com>
From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Wed, 13 Feb 2019 20:10:36 +0100
> Russell pointed out that the locking used in phy_is_started() isn't
> needed and misleading. This locking also contributes to a race fixed
> with patch 2.
Series applied and queued up for -stable, thanks.
* Re: [PATCH 5/9] perf, bpf: save bpf_prog_info in a rbtree in perf_env
From: Song Liu @ 2019-02-14 17:03 UTC
To: Jiri Olsa
Cc: Netdev, linux-kernel, ast@kernel.org, daniel@iogearbox.net,
Kernel Team, peterz@infradead.org, acme@redhat.com
In-Reply-To: <20190214123311.GA7465@krava>
> On Feb 14, 2019, at 4:33 AM, Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Feb 08, 2019 at 05:17:01PM -0800, Song Liu wrote:
>> bpf_prog_info contains information necessary to annotate bpf programs.
>> This patch saves bpf_prog_info for bpf programs loaded in the system.
>>
>> perf-record saves bpf_prog_info information as headers to perf.data.
>> A new header type HEADER_BPF_PROG_INFO is introduced for this data.
>
> please move those 2 changes into separate patches then
Do you mean one patch to save data in rbtree, then a separate patch
to save data in perf.data file?
Thanks,
Song
>
> it's hard to make comments when I don't see the rest of
> the patches on the list please resend the patchset
>
> thanks,
> jirka
* Re: [PATCH net] selftests: fix timestamping Makefile
From: David Miller @ 2019-02-14 17:03 UTC
To: deepa.kernel; +Cc: shuah, willemb, netdev, linux-kselftest
In-Reply-To: <20190213170914.11991-1-deepa.kernel@gmail.com>
From: Deepa Dinamani <deepa.kernel@gmail.com>
Date: Wed, 13 Feb 2019 09:09:13 -0800
> The clean target in the makefile conflicts with the generic
> kselftests lib.mk, and fails to properly remove the compiled
> test programs.
>
> Remove the redundant rule, the TEST_GEN_FILES will be already
> removed by the CLEAN macro in lib.mk.
>
> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Applied, thank you.
* Re: [PATCH 5/9] perf, bpf: save bpf_prog_info in a rbtree in perf_env
From: Song Liu @ 2019-02-14 17:01 UTC
To: Jiri Olsa
Cc: Netdev, linux-kernel@vger.kernel.org, ast@kernel.org,
daniel@iogearbox.net, Kernel Team, peterz@infradead.org,
acme@redhat.com
In-Reply-To: <20190214122638.GD26714@krava>
> On Feb 14, 2019, at 4:26 AM, Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Feb 08, 2019 at 05:17:01PM -0800, Song Liu wrote:
>
> SNIP
>
>> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
>> index d01b8355f4ca..5894a177b7cf 100644
>> --- a/tools/perf/util/env.h
>> +++ b/tools/perf/util/env.h
>> @@ -3,7 +3,10 @@
>> #define __PERF_ENV_H
>>
>> #include <linux/types.h>
>> +#include <linux/rbtree.h>
>> #include "cpumap.h"
>> +#include "rwsem.h"
>> +#include "bpf-event.h"
>>
>> struct cpu_topology_map {
>> int socket_id;
>> @@ -64,6 +67,8 @@ struct perf_env {
>> struct memory_node *memory_nodes;
>> unsigned long long memory_bsize;
>> u64 clockid_res_ns;
>> + struct rw_semaphore bpf_info_lock;
>
> why's the lock needed?
>
> jirka
It protects the rbtrees for bpf_prog_info and btf. For perf-top,
we will have one thread writing to the trees, while the main
thread reads from them.
Let me add comments to clarify.
Thanks,
Song
* Re: [PATCH net] net: stmmac: Fix NAPI poll in TX path when in multi-queue
From: David Miller @ 2019-02-14 17:01 UTC
To: jose.abreu
Cc: netdev, linux-kernel, joao.pinto, peppe.cavallaro,
alexandre.torgue
In-Reply-To: <a264c48823687434e4d18aeb5830707e00c64250.1550077162.git.joabreu@synopsys.com>
From: Jose Abreu <jose.abreu@synopsys.com>
Date: Wed, 13 Feb 2019 18:00:43 +0100
> Commit 8fce33317023 introduced the concept of NAPI per-channel and
> independent cleaning of TX path.
>
> This is currently breaking performance in some cases. The scenario
> happens when all packets are being received in Queue 0 but the TX is
> performed in Queue != 0.
>
> I didn't look very deep but it seems that NAPI for Queue 0 will clean
> the RX path but as TX is in different NAPI, this last one is called at a
> slower rate which kills performance in TX. I suspect this is due to TX
> cleaning takes much longer than RX and because NAPI will get canceled
> once we return with 0 budget consumed (e.g. when TX is still not done it
> will return 0 budget).
>
> Fix this by looking at all TX channels in NAPI poll function.
>
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
> Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
No, this isn't right.
The TX interrupt events for Queue != 0 should clean up the TX packets
on those queues.
Furthermore you are breaking the locality of the TX processing.
I'm not applying this, sorry.
* RE: [PATCH net-next 2/3] arm64: dts: fsl: ls1028a-rdb: Add ENETC external eth ports for the LS1028A RDB board
From: Claudiu Manoil @ 2019-02-14 17:00 UTC
To: Andrew Lunn
Cc: Shawn Guo, Leo Li, David S . Miller, devicetree@vger.kernel.org,
Alexandru Marginean, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org
In-Reply-To: <20190214162746.GI708@lunn.ch>
>-----Original Message-----
>From: Andrew Lunn <andrew@lunn.ch>
>Sent: Thursday, February 14, 2019 6:28 PM
>To: Claudiu Manoil <claudiu.manoil@nxp.com>
>Cc: Shawn Guo <shawnguo@kernel.org>; Leo Li <leoyang.li@nxp.com>; David S .
>Miller <davem@davemloft.net>; devicetree@vger.kernel.org; Alexandru
>Marginean <alexandru.marginean@nxp.com>; linux-kernel@vger.kernel.org;
>linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
>Subject: Re: [PATCH net-next 2/3] arm64: dts: fsl: ls1028a-rdb: Add ENETC
>external eth ports for the LS1028A RDB board
>
>> Hi Andrew,
>>
>> The extra node for mdio seems to complicate things somewhat.
>> Just adding this node seems not enough. How to find out easily if a
>> child of a enetc port node is a mdio node?
>
>You copy somebody else code :-)
>
Provided you find the right thing to copy : ) . Thanks for the hint.
* Re: [PATCH -next] net: ipvlan_l3s: fix kconfig dependency warning
From: David Miller @ 2019-02-14 16:59 UTC
To: rdunlap; +Cc: netdev, maheshb, daniel
In-Reply-To: <204a7785-a1d2-e714-653e-2cb19e36f279@infradead.org>
From: Randy Dunlap <rdunlap@infradead.org>
Date: Wed, 13 Feb 2019 08:55:02 -0800
> From: Randy Dunlap <rdunlap@infradead.org>
>
> Fix the kconfig warning in IPVLAN_L3S when neither INET nor IPV6
> is enabled:
>
> WARNING: unmet direct dependencies detected for NET_L3_MASTER_DEV
> Depends on [n]: NET [=y] && (INET [=n] || IPV6 [=n])
> Selected by [y]:
> - IPVLAN_L3S [=y] && NETDEVICES [=y] && NET_CORE [=y] && NETFILTER [=y]
>
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> Cc: Mahesh Bandewar <maheshb@google.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> ---
> v2: simplify the dependency to IPVLAN
Applied, thanks Randy.
* Re: [PATCH net] net: nuvoton: w90p910_ether: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:57 UTC
To: albin_yang; +Cc: netdev, linux-arm-kernel, mcuos.com, yang.wei9
In-Reply-To: <1550071262-4889-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:21:02 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in w90p910_ether_start_xmit()
> when skb xmit done. It makes drop profiles(dropwatch, perf) more
> friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
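The distinction behind this series of patches can be sketched as a toy accounting model: a transmitted skb that is freed on TX completion was consumed, not dropped, and drop-monitoring tools only care about the second kind. The `toy_*` names below are hypothetical, and the model assumes that such tools count frees by the reason recorded at free time.

```c
/* Toy model of "why was this buffer freed" accounting; names are
 * hypothetical and stand in for the kernel's free-reason bookkeeping. */
enum toy_free_reason { TOY_CONSUMED, TOY_DROPPED };

struct toy_stats {
	unsigned int consumed;
	unsigned int dropped;
};

/* Record one buffer free under the given reason. */
void toy_free_skb(struct toy_stats *s, enum toy_free_reason reason)
{
	if (reason == TOY_DROPPED)
		s->dropped++;
	else
		s->consumed++;
}

/* TX completion: free `n` successfully transmitted buffers. */
void toy_tx_done(struct toy_stats *s, unsigned int n,
		 enum toy_free_reason reason)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		toy_free_skb(s, reason);
}
```

Freeing completed transmissions as `TOY_DROPPED` inflates the drop counter with packets that were in fact sent; freeing them as `TOY_CONSUMED` keeps the drop counter limited to real drops.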
* Re: [PATCH net] net: natsemi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:57 UTC
To: albin_yang; +Cc: netdev, tsbogend, yang.wei9
In-Reply-To: <1550071154-4834-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:19:14 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called when skb xmit done. It makes
> drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
* Re: [PATCH net] net: micrel: ks8695net: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:57 UTC
To: albin_yang; +Cc: netdev, yang.wei9
In-Reply-To: <1550071089-4776-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:18:09 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in ks8695_tx_irq() when skb
> xmit done. It makes drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
* Re: [PATCH net] net: sgi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:57 UTC
To: albin_yang; +Cc: netdev, ralf, yang.wei9
In-Reply-To: <1550071026-4723-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:17:06 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called when skb xmit done. It makes
> drop profiles(dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
* Re: [PATCH net] net: myri10ge: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:57 UTC
To: albin_yang; +Cc: netdev, christopher.lee, yang.wei9
In-Reply-To: <1550070943-4653-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:15:43 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in myri10ge_tx_done() when the
> skb xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
* Re: [PATCH net] net: amd: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:57 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, yang.wei9
In-Reply-To: <1550070894-4602-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:14:54 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called when the skb xmit is done. It
> makes drop profiles (dropwatch, perf) more friendly.
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
* Re: [PATCH net] net: dlink: sundance: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
From: David Miller @ 2019-02-14 16:56 UTC (permalink / raw)
To: albin_yang; +Cc: netdev, kda, yang.wei9
In-Reply-To: <1550070722-4539-1-git-send-email-albin_yang@163.com>
From: Yang Wei <albin_yang@163.com>
Date: Wed, 13 Feb 2019 23:12:02 +0800
> From: Yang Wei <yang.wei9@zte.com.cn>
>
> dev_consume_skb_irq() should be called in intr_handler() when the skb
> xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
>
> Remove a redundant blank line in intr_handler().
>
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Applied.
* Re: [PATCH iproute2] ss: Render buffer to output every time a number of chunks are allocated
From: Eric Dumazet @ 2019-02-14 16:55 UTC (permalink / raw)
To: Stefano Brivio, Stephen Hemminger
Cc: Phil Sutter, David Ahern, Sabrina Dubroca, netdev
In-Reply-To: <03dd56e5161a3c1270a21c4ba3f6e695793dbb74.1550105375.git.sbrivio@redhat.com>
On 02/13/2019 04:58 PM, Stefano Brivio wrote:
> Eric reported that, with 10 million sockets, ss -emoi (about 1000 bytes
> output per socket) can easily lead to OOM (buffer would grow to 10GB of
> memory).
>
> Limit the maximum size of the buffer to five chunks, 1M each. Render and
> flush buffers whenever we reach that.
>
> This might make the resulting blocks slightly unaligned between them, with
> occasional loss of readability on lines occurring every 5k to 50k sockets
> approximately. Something like (from ss -tu):
>
> [...]
> CLOSE-WAIT 32 0 192.168.1.50:35232 10.0.0.1:https
> ESTAB 0 0 192.168.1.50:53820 10.0.0.1:https
> ESTAB 0 0 192.168.1.50:46924 10.0.0.1:https
> CLOSE-WAIT 32 0 192.168.1.50:35228 10.0.0.1:https
> [...]
>
> However, I don't actually expect any human user to scroll through that
> amount of sockets, so readability should be preserved when it matters.
>
> The bulk of the diffstat comes from moving field_next() around, as we now
> call render() from it. Functionally, this is implemented by six lines of
> code, most of them in field_next().
>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Fixes: 691bd854bf4a ("ss: Buffer raw fields first, then render them as a table")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
> Eric, it would be nice if you could test this with your bazillion sockets,
> I checked this with -emoi and "only" 500,000 sockets.
Thanks, this seems reasonable enough to me.
# /usr/bin/time misc/ss -t |head -1
State Recv-Q Send-Q Local Address:Port Peer Address:Port
Command terminated by signal 13
0.05user 0.00system 0:00.05elapsed 100%CPU (0avgtext+0avgdata 5836maxresident)k
0inputs+0outputs (0major+1121minor)pagefaults 0swaps
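
The bounded-buffer idea in the patch description (at most five chunks of 1M each, render and flush when the limit is reached) can be sketched as follows. This is only an accounting model under stated assumptions: it tracks byte counts rather than real storage, and the names out_buf, buf_render and buf_add are hypothetical, not the ss implementation.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE	((size_t)1024 * 1024)	/* 1M per chunk, per the patch text */
#define MAX_CHUNKS	((size_t)5)		/* limit from the patch text */

/* Hypothetical accounting model of the output buffer. */
struct out_buf {
	size_t used;		/* bytes currently buffered */
	size_t rendered;	/* bytes already rendered and flushed */
	size_t flushes;		/* number of render passes so far */
};

/* Render what is buffered and start over with an empty buffer. */
static void buf_render(struct out_buf *b)
{
	b->rendered += b->used;
	b->used = 0;
	b->flushes++;
}

/* Buffer len more bytes, rendering first if the limit would be exceeded. */
static void buf_add(struct out_buf *b, size_t len)
{
	if (b->used + len > MAX_CHUNKS * CHUNK_SIZE)
		buf_render(b);
	b->used += len;
}
```

The point of the design is that memory use stays bounded by the chunk limit regardless of the socket count, at the cost of column widths being computed per flushed block rather than over the whole output.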
* Re: [PATCH net-next 0/2] uapi: Add a new header for time types
From: David Miller @ 2019-02-14 16:52 UTC (permalink / raw)
To: deepa.kernel; +Cc: linux-kernel, netdev, willemb, tglx, arnd, y2038
In-Reply-To: <20190213032604.2655-1-deepa.kernel@gmail.com>
From: Deepa Dinamani <deepa.kernel@gmail.com>
Date: Tue, 12 Feb 2019 19:26:02 -0800
> The series aims at adding a new time header: time_types.h. This header
> is what will eventually hold all the uapi time types that we plan to
> leave across the interfaces after the y2038 cleanup.
>
> The series was discussed with Arnd Bergmann.
>
> The second patch fixes the errqueue.h header, which has a dependency on
> these types.
>
> Note that there may be a trivial merge conflict with linux-next
> c70a772fda11 ("y2038: remove struct definition redirects").
Series applied, thank you.
* Re: [PATCH] net: phy: at803x: disable delay only for RGMII mode
From: Marc Gonzalez @ 2019-02-14 16:46 UTC (permalink / raw)
To: David Miller, vkoul
Cc: linux-arm-msm, bjorn.andersson, netdev, niklas.cassel, andrew,
f.fainelli, nsekhar, peter.ujfalusi
In-Reply-To: <20190214.083828.206479765039661735.davem@davemloft.net>
On 14/02/2019 17:38, David Miller wrote:
> From: Vinod Koul <vkoul@kernel.org>
> Date: Tue, 12 Feb 2019 19:49:22 +0530
>
>> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
>> index 8ff12938ab47..7b54b54e3316 100644
>> --- a/drivers/net/phy/at803x.c
>> +++ b/drivers/net/phy/at803x.c
>> @@ -110,6 +110,18 @@ static int at803x_debug_reg_mask(struct phy_device *phydev, u16 reg,
>>  	return phy_write(phydev, AT803X_DEBUG_DATA, val);
>>  }
>>
>> +static inline int at803x_enable_rx_delay(struct phy_device *phydev)
>> +{
>> +	return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_0, 0,
>> +				     AT803X_DEBUG_RX_CLK_DLY_EN);
>> +}
>> +
>> +static inline int at803x_enable_tx_delay(struct phy_device *phydev)
>> +{
>> +	return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_5, 0,
>> +				     AT803X_DEBUG_TX_CLK_DLY_EN);
>> +}
>> +}
>> +
>
> Please do not use the inline directive in foo.c files, let the compiler
> decide.
Isn't the compiler free to ignore the "inline" hint?
Regards.
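
For reference, in standard C the inline keyword is indeed only a hint, so a plain static helper leaves the decision to the compiler, which is what the review comment asks for. A minimal userspace sketch of the helpers without the keyword follows; demo_reg_mask, demo_enable_rx_delay and the DEMO_RX_CLK_DLY_EN value are hypothetical stand-ins, and the helper operates on a plain value instead of a PHY register.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RX_CLK_DLY_EN	(1u << 15)	/* hypothetical bit position */

/* Clear the bits in 'clear', then set the bits in 'set'. */
static uint16_t demo_reg_mask(uint16_t val, uint16_t clear, uint16_t set)
{
	val &= (uint16_t)~clear;
	val |= set;
	return val;
}

/* Plain static, no inline: the compiler decides whether to inline it. */
static uint16_t demo_enable_rx_delay(uint16_t val)
{
	return demo_reg_mask(val, 0, DEMO_RX_CLK_DLY_EN);
}
```

One practical difference worth knowing: an unused plain static function typically triggers an unused-function warning, while an unused static inline one usually does not, so helpers declared this way should all have callers.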
* Re: [PATCHv2 net-next 0/2] devlink: 2 fixes for devlink region read
From: David Miller @ 2019-02-14 16:46 UTC (permalink / raw)
To: parav; +Cc: jiri, netdev
In-Reply-To: <1550002970-28893-1-git-send-email-parav@mellanox.com>
From: Parav Pandit <parav@mellanox.com>
Date: Tue, 12 Feb 2019 14:22:50 -0600
> These 2 patches consist of fixes for devlink region read handling.
>
> Signed-off-by: Parav Pandit <parav@mellanox.com>
Series applied, thanks.
* Re: [PATCH] NETWORKING: avoid use IPCB in cipso_v4_error
From: David Miller @ 2019-02-14 16:43 UTC (permalink / raw)
To: s-nazarov; +Cc: netdev, linux-security-module, kuznet, yoshfuji, paul
In-Reply-To: <6691891549984203@myt5-a323eb993ef7.qloud-c.yandex.net>
From: Nazarov Sergey <s-nazarov@yandex.ru>
Date: Tue, 12 Feb 2019 18:10:03 +0300
> Since cipso_v4_error might be called from different network stack layers, we can't safely use icmp_send there.
> icmp_send copies IP options with ip_options_echo, which uses IPCB to access the compiled IP header data.
> But after commit 971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses"), IPCB can't be used
> above the IP layer.
> This patch fixes the problem by creating a local copy of the compiled IP options in cipso_v4_error and passing it
> to the newly introduced __icmp_send function. This looks somewhat heavyweight, but it runs only in quite rare error conditions.
>
> The original discussion is here:
> https://lore.kernel.org/linux-security-module/16659801547571984@sas1-890ba5c2334a.qloud-c.yandex.net/
>
> Signed-off-by: Sergey Nazarov <s-nazarov@yandex.ru>
This problem is not unique to Cipso, net/atm/clip.c's error handler
has the same exact issue.
I didn't scan more of the tree, there are probably a couple more
locations as well.
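
The underlying hazard can be modeled in userspace. In the kernel, each layer overlays its own private structure on the 48-byte cb[] scratch area of the skb, so IP-layer data read through IPCB() is only valid until a higher layer writes its own data there. The sketch below shows just that aliasing; the demo_* types and functions are hypothetical and much simpler than the real inet_skb_parm and tcp_skb_cb layouts.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical per-layer views of the shared scratch area. */
struct demo_ip_cb {
	unsigned char optlen;	/* stands in for compiled IP option data */
};

struct demo_tcp_cb {
	unsigned int seq;
	unsigned int end_seq;
};

struct demo_skb {
	union {
		struct demo_ip_cb ip;
		struct demo_tcp_cb tcp;
		unsigned char raw[48];	/* cb[] is 48 bytes in the kernel */
	} cb;
};

/* IP layer: record option data in the scratch area. */
static void demo_ip_rcv(struct demo_skb *skb, unsigned char optlen)
{
	memset(&skb->cb, 0, sizeof(skb->cb));
	skb->cb.ip.optlen = optlen;
}

/* TCP layer: reuse the same bytes for its own bookkeeping. */
static void demo_tcp_rcv(struct demo_skb *skb, unsigned int seq)
{
	skb->cb.tcp.seq = seq;
	skb->cb.tcp.end_seq = seq + 1;
}

/* Error path that naively trusts the IP view of the scratch area. */
static unsigned char demo_optlen_from_cb(const struct demo_skb *skb)
{
	return skb->cb.ip.optlen;
}
```

After demo_tcp_rcv() has run, demo_optlen_from_cb() returns bytes of the sequence number rather than the option length. Any error handler that can be reached from above the IP layer has the same exposure, which is why the issue is not specific to CIPSO.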
* Re: [PATCH] net: phy: at803x: disable delay only for RGMII mode
From: David Miller @ 2019-02-14 16:38 UTC (permalink / raw)
To: vkoul
Cc: linux-arm-msm, bjorn.andersson, netdev, niklas.cassel, andrew,
f.fainelli, nsekhar, peter.ujfalusi, marc.w.gonzalez
In-Reply-To: <20190212141922.12849-1-vkoul@kernel.org>
From: Vinod Koul <vkoul@kernel.org>
Date: Tue, 12 Feb 2019 19:49:22 +0530
> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> index 8ff12938ab47..7b54b54e3316 100644
> --- a/drivers/net/phy/at803x.c
> +++ b/drivers/net/phy/at803x.c
> @@ -110,6 +110,18 @@ static int at803x_debug_reg_mask(struct phy_device *phydev, u16 reg,
>  	return phy_write(phydev, AT803X_DEBUG_DATA, val);
>  }
>
> +static inline int at803x_enable_rx_delay(struct phy_device *phydev)
> +{
> +	return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_0, 0,
> +				     AT803X_DEBUG_RX_CLK_DLY_EN);
> +}
> +
> +static inline int at803x_enable_tx_delay(struct phy_device *phydev)
> +{
> +	return at803x_debug_reg_mask(phydev, AT803X_DEBUG_REG_5, 0,
> +				     AT803X_DEBUG_TX_CLK_DLY_EN);
> +}
> +}
> +
Please do not use the inline directive in foo.c files, let the compiler
decide.
Thank you.