* Re: [PATCH bpf-next 2/4] selftests/bpf: test_progs: test__skip
From: Andrii Nakryiko @ 2019-08-16 5:23 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Stanislav Fomichev, Stanislav Fomichev, Networking, bpf,
David S. Miller, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko
In-Reply-To: <CAADnVQ+Bz6R17bassdr3xOR7rhbuw-HbdXYu-hHkxE8S2WiNrA@mail.gmail.com>
On Thu, Aug 15, 2019 at 10:16 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Aug 14, 2019 at 1:01 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > >
> > > Let me know if you see a value in highlighting test vs subtest skip.
> > >
> > > Other related question is: should we do verbose output in case
> > > of a skip? Right now we don't do it.
> >
> > It might be useful, I guess, especially if it's not too common. But
> > Alexei is way more picky about stuff like that, so I'd defer to him. I
> > have no problem with a clean "SKIPPED: <test>/<subtest> (maybe some
> > reason for skipping here)" message.
>
> Since test_progs prints single number for FAILED tests then single number
> for SKIPPED tests is fine as well.
I'm fine with single number, but it should count number of subtests
skipped, if there are subtests within test, same as for FAILED.
^ permalink raw reply
* Re: linux-next: manual merge of the net-next tree with the kbuild tree
From: Andrii Nakryiko @ 2019-08-16 5:21 UTC (permalink / raw)
To: Stephen Rothwell
Cc: David Miller, Networking, Masahiro Yamada,
Linux Next Mailing List, Linux Kernel Mailing List, Kees Cook,
Andrii Nakryiko, Daniel Borkmann
In-Reply-To: <20190816124143.2640218a@canb.auug.org.au>
On Thu, Aug 15, 2019 at 7:42 PM Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
> scripts/link-vmlinux.sh
>
> between commit:
>
> e167191e4a8a ("kbuild: Parameterize kallsyms generation and correct reporting")
>
> from the kbuild tree and commits:
>
> 341dfcf8d78e ("btf: expose BTF info through sysfs")
> 7fd785685e22 ("btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux")
>
> from the net-next tree.
>
> I fixed it up (I think - see below) and can carry the fix as necessary.
Thanks, Stephen! Looks good except one minor issue below.
> This is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging. You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
>
> --
> Cheers,
> Stephen Rothwell
>
> diff --cc scripts/link-vmlinux.sh
> index 2438a9faf3f1,c31193340108..000000000000
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@@ -56,11 -56,10 +56,11 @@@ modpost_link(
> }
>
> # Link of vmlinux
> - # ${1} - optional extra .o files
> - # ${2} - output file
> + # ${1} - output file
> + # ${@:2} - optional extra .o files
> vmlinux_link()
> {
> + info LD ${2}
This needs to be ${1}.
> local lds="${objtree}/${KBUILD_LDS}"
> local objects
>
> @@@ -139,18 -149,6 +150,18 @@@ kallsyms(
> ${CC} ${aflags} -c -o ${2} ${afile}
> }
>
> +# Perform one step in kallsyms generation, including temporary linking of
> +# vmlinux.
> +kallsyms_step()
> +{
> + kallsymso_prev=${kallsymso}
> + kallsymso=.tmp_kallsyms${1}.o
> + kallsyms_vmlinux=.tmp_vmlinux${1}
> +
> - vmlinux_link "${kallsymso_prev}" ${kallsyms_vmlinux}
> ++ vmlinux_link ${kallsyms_vmlinux} "${kallsymso_prev}" ${btf_vmlinux_bin_o}
> + kallsyms ${kallsyms_vmlinux} ${kallsymso}
> +}
> +
> # Create map file with all symbols from ${1}
> # See mksymap for additional details
> mksysmap()
> @@@ -228,8 -227,14 +240,15 @@@ ${MAKE} -f "${srctree}/scripts/Makefile
> info MODINFO modules.builtin.modinfo
> ${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo
>
> + btf_vmlinux_bin_o=""
> + if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
> + if gen_btf .tmp_vmlinux.btf .btf.vmlinux.bin.o ; then
> + btf_vmlinux_bin_o=.btf.vmlinux.bin.o
> + fi
> + fi
> +
> kallsymso=""
> +kallsymso_prev=""
> kallsyms_vmlinux=""
> if [ -n "${CONFIG_KALLSYMS}" ]; then
>
> @@@ -268,11 -285,8 +287,7 @@@
> fi
> fi
>
> - vmlinux_link "${kallsymso}" vmlinux
> -
> - if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
> - gen_btf vmlinux
> - fi
> -info LD vmlinux
> + vmlinux_link vmlinux "${kallsymso}" "${btf_vmlinux_bin_o}"
>
> if [ -n "${CONFIG_BUILDTIME_EXTABLE_SORT}" ]; then
> info SORTEX vmlinux
^ permalink raw reply
* Re: [PATCH net-next] r8152: divide the tx and rx bottom functions
From: David Miller @ 2019-08-16 5:17 UTC (permalink / raw)
To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel
In-Reply-To: <0835B3720019904CB8F7AA43166CEEB2F18D43A3@RTITMBSVM03.realtek.com.tw>
From: Hayes Wang <hayeswang@realtek.com>
Date: Fri, 16 Aug 2019 02:59:16 +0000
> David Miller [mailto:davem@davemloft.net]
>> Sent: Friday, August 16, 2019 4:59 AM
> [...]
>> Theoretically, yes.
>>
>> But do you have actual performance numbers showing this to be worth
>> the change?
>>
>> Always provide performance numbers with changes that are supposed to
>> improve performance.
>
> On x86, they are almost the same.
> Tx/Rx: 943/943 Mbits/sec -> 945/944
>
> For arm platform,
> Tx/Rx: 917/917 Mbits/sec -> 933/933
> Improve about 1.74%.
Belongs in the commit message.
^ permalink raw reply
* Re: [PATCH bpf-next 2/4] selftests/bpf: test_progs: test__skip
From: Alexei Starovoitov @ 2019-08-16 5:16 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Stanislav Fomichev, Stanislav Fomichev, Networking, bpf,
David S. Miller, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko
In-Reply-To: <CAEf4BzaEJcTKV6s8cVinpJcBStvs2LAJ+obNjevw54EOQq1QdQ@mail.gmail.com>
On Wed, Aug 14, 2019 at 1:01 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> >
> > Let me know if you see a value in highlighting test vs subtest skip.
> >
> > Other related question is: should we do verbose output in case
> > of a skip? Right now we don't do it.
>
> It might be useful, I guess, especially if it's not too common. But
> Alexei is way more picky about stuff like that, so I'd defer to him. I
> have no problem with a clean "SKIPPED: <test>/<subtest> (maybe some
> reason for skipping here)" message.
Since test_progs prints single number for FAILED tests then single number
for SKIPPED tests is fine as well.
^ permalink raw reply
* Re: [PATCH bpf] tools: bpftool: close prog FD before exit on showing a single program
From: Alexei Starovoitov @ 2019-08-16 5:11 UTC (permalink / raw)
To: Quentin Monnet
Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Network Development,
oss-drivers
In-Reply-To: <20190815142223.2203-1-quentin.monnet@netronome.com>
On Thu, Aug 15, 2019 at 7:22 AM Quentin Monnet
<quentin.monnet@netronome.com> wrote:
>
> When showing metadata about a single program by invoking
> "bpftool prog show PROG", the file descriptor referring to the program
> is not closed before returning from the function. Let's close it.
>
> Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Applied. Thanks
^ permalink raw reply
* Re: [PATCH bpf 0/6] tools: bpftool: fix printf()-like functions
From: Alexei Starovoitov @ 2019-08-16 5:08 UTC (permalink / raw)
To: Quentin Monnet
Cc: Alexei Starovoitov, Daniel Borkmann, bpf, Network Development,
oss-drivers
In-Reply-To: <20190815143220.4199-1-quentin.monnet@netronome.com>
On Thu, Aug 15, 2019 at 7:32 AM Quentin Monnet
<quentin.monnet@netronome.com> wrote:
>
> Hi,
> Because the "__printf()" attributes were used only where the functions are
> implemented, and not in header files, the checks have not been enforced on
> all the calls to printf()-like functions, and a number of errors slipped in
> bpftool over time.
>
> This set cleans up such errors, and then moves the "__printf()" attributes
> to header files, so that the checks are performed at all locations.
Applied. Thanks
^ permalink raw reply
* Reminder: 6 active syzbot reports in "net/tls" subsystem
From: Eric Biggers @ 2019-08-16 4:18 UTC (permalink / raw)
To: netdev, Boris Pismenny, Aviad Yehezkel, Dave Watson,
John Fastabend, Daniel Borkmann, Jakub Kicinski, David S. Miller,
Vakul Garg
Cc: syzkaller-bugs
[This email was generated by a script. Let me know if you have any suggestions
to make it better, or if you want it re-generated with the latest status.]
Of the distinct crashes that syzbot has seen in the last week, I've manually
marked 6 of them as possibly being bugs in the "net/tls" subsystem. I've listed
these bug reports below.
Of these 6 reports, 3 were bisected to commits from the following people:
Vakul Garg <vakul.garg@nxp.com>
Dave Watson <davejwatson@fb.com>
I've manually checked that these bisection results look plausible.
If you believe a bug report is no longer valid, please close it by sending a
'#syz fix', '#syz dup', or '#syz invalid' command in reply to the original
thread, as explained at https://goo.gl/tpsmEJ#status
If you believe I misattributed a bug report to the "net/tls" subsystem, please
let me know and (if possible) forward it to the correct place.
Note: in total, I've actually assigned 27 open syzbot reports to this subsystem.
But to help focus people's efforts, I've only listed the 6 that have
(re-)occurred in the last week. Let me know if you want the full list.
Here are the bug reports:
--------------------------------------------------------------------------------
Title: kernel BUG at include/linux/scatterlist.h:LINE!
Last occurred: 0 days ago
Reported: 85 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=effb623cefb879664122cc47df3af728957eb279
Original thread: https://lore.kernel.org/lkml/000000000000f41cd905897c075e@google.com/T/#u
This bug has a C reproducer.
This bug was bisected to:
commit f295b3ae9f5927e084bd5decdff82390e3471801
Author: Vakul Garg <vakul.garg@nxp.com>
Date: Wed Mar 20 02:03:36 2019 +0000
net/tls: Add support of AES128-CCM based ciphers
The original thread for this bug has received 1 reply, 66 days ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+df0d4ec12332661dd1f9@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000f41cd905897c075e@google.com
--------------------------------------------------------------------------------
Title: kernel BUG at ./include/linux/scatterlist.h:LINE!
Last occurred: 6 days ago
Reported: 56 days ago
Branches: Mainline
Dashboard link: https://syzkaller.appspot.com/bug?id=3008161aab5958fe4125a4cae3e4b7ad3ea50a26
Original thread: https://lore.kernel.org/lkml/000000000000417551058bc0bef9@google.com/T/#u
This bug has a C reproducer.
This bug was bisected to:
commit f295b3ae9f5927e084bd5decdff82390e3471801
Author: Vakul Garg <vakul.garg@nxp.com>
Date: Wed Mar 20 02:03:36 2019 +0000
net/tls: Add support of AES128-CCM based ciphers
No one has replied to the original thread for this bug yet.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+ef0daa6ce95facb233c1@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000417551058bc0bef9@google.com
--------------------------------------------------------------------------------
Title: INFO: task hung in tls_sw_release_resources_tx
Last occurred: 0 days ago
Reported: 0 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=845e2a9172ab3afe80b95af12014c65930a053d5
Original thread: https://lore.kernel.org/lkml/000000000000523ea3059025b11d@google.com/T/#u
This bug has a C reproducer.
This bug was bisected to:
commit 130b392c6cd6b2aed1b7eb32253d4920babb4891
Author: Dave Watson <davejwatson@fb.com>
Date: Wed Jan 30 21:58:31 2019 +0000
net: tls: Add tls 1.3 support
The original thread for this bug has received 1 reply, 3 hours ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
If you send any email or patch for this bug, please reply to the original
thread, which had activity only 3 hours ago. For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/000000000000523ea3059025b11d@google.com
--------------------------------------------------------------------------------
Title: KMSAN: uninit-value in gf128mul_4k_lle (3)
Last occurred: 0 days ago
Reported: 265 days ago
Branches: https://github.com/google/kmsan.git master
Dashboard link: https://syzkaller.appspot.com/bug?id=a01db4c67933e9e4be8e721a8ee15a9530f1ac04
Original thread: https://lore.kernel.org/lkml/000000000000bf2457057b5ccda3@google.com/T/#u
This bug has a C reproducer.
The original thread for this bug received 2 replies; the last was 260 days ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+f8495bff23a879a6d0bd@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000bf2457057b5ccda3@google.com
--------------------------------------------------------------------------------
Title: INFO: task hung in __flush_work
Last occurred: 7 days ago
Reported: 180 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=9613d8dffb5c6cc39da8ec290cb8f3eb62bdf21f
Original thread: https://lore.kernel.org/lkml/0000000000008f9c780581fd7417@google.com/T/#u
This bug has a C reproducer.
No one replied to the original thread for this bug.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+aa0b64a57e300a1c6bcc@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000008f9c780581fd7417@google.com
--------------------------------------------------------------------------------
Title: KMSAN: uninit-value in aesti_encrypt
Last occurred: 1 day ago
Reported: 49 days ago
Branches: https://github.com/google/kmsan.git master
Dashboard link: https://syzkaller.appspot.com/bug?id=9e9babd01df34db0c4d4dbde8ca57a0380e6db0b
Original thread: https://lore.kernel.org/lkml/000000000000a97a15058c50c52e@google.com/T/#u
This bug has a C reproducer.
The original thread for this bug has received 4 replies; the last was 43 days
ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+6f50c99e8f6194bf363f@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000a97a15058c50c52e@google.com
^ permalink raw reply
* Reminder: 8 active syzbot reports in "net/bpf" subsystem
From: Eric Biggers @ 2019-08-16 4:17 UTC (permalink / raw)
To: netdev, bpf, David S. Miller, Alexei Starovoitov, Daniel Borkmann
Cc: Martin KaFai Lau, Song Liu, Yonghong Song, syzkaller-bugs
[This email was generated by a script. Let me know if you have any suggestions
to make it better, or if you want it re-generated with the latest status.]
Of the distinct crashes that syzbot has seen in the last week, I've manually
marked 8 of them as possibly being bugs in the "net/bpf" subsystem. I've listed
these bug reports below.
Of these 8 reports, 1 was bisected to a commit from the following person:
Alexei Starovoitov <ast@kernel.org>
I've manually checked that this bisection result looks plausible.
If you believe a bug report is no longer valid, please close it by sending a
'#syz fix', '#syz dup', or '#syz invalid' command in reply to the original
thread, as explained at https://goo.gl/tpsmEJ#status
If you believe I misattributed a bug report to the "net/bpf" subsystem, please
let me know and (if possible) forward it to the correct place.
Note: in total, I've actually assigned 42 open syzbot reports to this subsystem.
But to help focus people's efforts, I've only listed the 8 that have
(re-)occurred in the last week. Let me know if you want the full list.
Here are the bug reports:
--------------------------------------------------------------------------------
Title: WARNING in bpf_jit_free
Last occurred: 0 days ago
Reported: 395 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=d04f9c2ec11ab2678f7427795ff5170cb9eb2220
Original thread: https://lore.kernel.org/lkml/000000000000e92d1805711f5552@google.com/T/#u
This bug has a C reproducer.
syzbot has bisected this bug, but I think the bisection result is incorrect.
The original thread for this bug received 5 replies; the last was 65 days ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+2ff1e7cb738fd3c41113@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000e92d1805711f5552@google.com
--------------------------------------------------------------------------------
Title: WARNING: kernel stack frame pointer has bad value (2)
Last occurred: 1 day ago
Reported: 395 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=02a32f98a4e3b5a2ed6929aabdd28dd1618b9c03
Original thread: https://lore.kernel.org/lkml/0000000000000956640571197f98@google.com/T/#u
This bug has a C reproducer.
The original thread for this bug received 1 reply, 395 days ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+903cdd6bce9a6eb832a4@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000000956640571197f98@google.com
--------------------------------------------------------------------------------
Title: BUG: unable to handle kernel paging request in bpf_prog_kallsyms_add
Last occurred: 0 days ago
Reported: 339 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=97f89d84d528e4f5150dcfbdeb97347bc8471e96
Original thread: https://lore.kernel.org/lkml/0000000000009417ef0575802d44@google.com/T/#u
This bug has a syzkaller reproducer only.
The original thread for this bug received 2 replies; the last was 164 days ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+c827a78260579449ad39@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000009417ef0575802d44@google.com
--------------------------------------------------------------------------------
Title: WARNING in bpf_prog_kallsyms_find
Last occurred: 0 days ago
Reported: 100 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=40b0c218e639f1d882b86abff2549cfe11c5101e
Original thread: https://lore.kernel.org/lkml/000000000000a8fa360588580820@google.com/T/#u
This bug has a C reproducer.
No one replied to the original thread for this bug.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+89d1ce6e80218a6192d8@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000a8fa360588580820@google.com
--------------------------------------------------------------------------------
Title: KASAN: use-after-free Read in sk_psock_unlink
Last occurred: 5 days ago
Reported: 293 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=d691981726208716cc7aec231fb915e27763d662
Original thread: https://lore.kernel.org/lkml/000000000000fd342e05791cc86f@google.com/T/#u
This bug has a syzkaller reproducer only.
syzbot has bisected this bug, but I think the bisection result is incorrect.
The original thread for this bug received 1 reply, 85 days ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+3acd9f67a6a15766686e@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000fd342e05791cc86f@google.com
--------------------------------------------------------------------------------
Title: KASAN: slab-out-of-bounds Read in do_jit
Last occurred: 0 days ago
Reported: 23 days ago
Branches: Mainline and others
Dashboard link: https://syzkaller.appspot.com/bug?id=3aacade388873fa82bd6d2efb6aaa9ab85964020
Original thread: https://lore.kernel.org/lkml/000000000000a6ab6b058e5b899b@google.com/T/#u
This bug has a C reproducer.
This bug was bisected to:
commit 2589726d12a1b12eaaa93c7f1ea64287e383c7a5
Author: Alexei Starovoitov <ast@kernel.org>
Date: Sat Jun 15 19:12:20 2019 +0000
bpf: introduce bounded loops
No one has replied to the original thread for this bug yet.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+6b40f58c6d280fa23b40@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000a6ab6b058e5b899b@google.com
--------------------------------------------------------------------------------
Title: WARNING in is_bpf_text_address
Last occurred: 0 days ago
Reported: 55 days ago
Branches: Mainline
Dashboard link: https://syzkaller.appspot.com/bug?id=2386340f7a641010bb1e17228d1e9319592c01ba
Original thread: https://lore.kernel.org/lkml/00000000000000ac4f058bd50039@google.com/T/#u
This bug has a C reproducer.
The original thread for this bug has received 5 replies; the last was 2 hours
ago.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com
If you send any email or patch for this bug, please reply to the original
thread, which had activity only 2 hours ago. For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/00000000000000ac4f058bd50039@google.com
--------------------------------------------------------------------------------
Title: memory leak in sock_hash_update_common
Last occurred: 7 days ago
Reported: 85 days ago
Branches: Mainline
Dashboard link: https://syzkaller.appspot.com/bug?id=9992588b3bbe2617f62f41b1162af9fc8ea4829c
Original thread: https://lore.kernel.org/lkml/000000000000fa662405897c0774@google.com/T/#u
This bug has a syzkaller reproducer only.
No one has replied to the original thread for this bug yet.
If you fix this bug, please add the following tag to the commit:
Reported-by: syzbot+30c7a1fc662026545124@syzkaller.appspotmail.com
If you send any email or patch for this bug, please consider replying to the
original thread. For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000fa662405897c0774@google.com
^ permalink raw reply
* Re: [PATCH net] tunnel: fix dev null pointer dereference when send pkg larger than mtu in collect_md mode
From: Hangbin Liu @ 2019-08-16 4:01 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
David S . Miller
In-Reply-To: <20190816032418.GX18865@dhcp-12-139.nay.redhat.com>
On Fri, Aug 16, 2019 at 11:24:18AM +0800, Hangbin Liu wrote:
> If yes, how about just set the skb dst to rt->dst, as the
> iptunnel_xmit would do later.
>
> skb_dst_drop(skb);
> skb_dst_set(skb, &rt->dst);
>
Tested and this donesn't work good....
^ permalink raw reply
* [PATCH] airo: fix memory leaks
From: Wenwen Wang @ 2019-08-16 3:50 UTC (permalink / raw)
To: Wenwen Wang
Cc: Kalle Valo, David S. Miller, Herbert Xu, Dan Carpenter,
Eric Biggers, Ard Biesheuvel,
open list:NETWORKING DRIVERS (WIRELESS),
open list:NETWORKING DRIVERS, open list
In proc_BSSList_open(), 'file->private_data' is allocated through kzalloc()
and 'data->rbuffer' is allocated through kmalloc(). In the following
execution, if an error occurs, they are not deallocated, leading to memory
leaks. To fix this issue, free the allocated memory regions before
returning the error.
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
---
drivers/net/wireless/cisco/airo.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/cisco/airo.c b/drivers/net/wireless/cisco/airo.c
index 9342ffb..f43c065 100644
--- a/drivers/net/wireless/cisco/airo.c
+++ b/drivers/net/wireless/cisco/airo.c
@@ -5441,11 +5441,18 @@ static int proc_BSSList_open( struct inode *inode, struct file *file ) {
Cmd cmd;
Resp rsp;
- if (ai->flags & FLAG_RADIO_MASK) return -ENETDOWN;
+ if (ai->flags & FLAG_RADIO_MASK) {
+ kfree(data->rbuffer);
+ kfree(file->private_data);
+ return -ENETDOWN;
+ }
memset(&cmd, 0, sizeof(cmd));
cmd.cmd=CMD_LISTBSS;
- if (down_interruptible(&ai->sem))
+ if (down_interruptible(&ai->sem)) {
+ kfree(data->rbuffer);
+ kfree(file->private_data);
return -ERESTARTSYS;
+ }
issuecommand(ai, &cmd, &rsp);
up(&ai->sem);
data->readlen = 0;
--
2.7.4
^ permalink raw reply related
* Re: [PATCH v2] socket.7: Add description of SO_SELECT_ERR_QUEUE
From: Ricardo Biehl Pasquali @ 2019-08-16 3:43 UTC (permalink / raw)
To: Michael Kerrisk (man-pages), jacob.e.keller
Cc: linux-man, netdev, stefan.puiu, corbet, davem
In-Reply-To: <f053fe2c-20e5-4754-8b13-89cddfbfb52d@gmail.com>
TL;DR: This email proposes a description of the socket
option SO_SELECT_ERR_QUEUE taking into account the change
in wake up behavior when errors are enqueued introduced by
the commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not waking
applications when errors are enqueued") in Linux 4.16.
On Mon, Jul 29, 2019 at 08:51:42PM +0200, Michael Kerrisk (man-pages) wrote:
> Sorry -- I've not had a lot of cycles to spare for man-pages of late.
Hi. No problem, I've just wondering whether you were
receiving the messages.
> Thanks for the patch. But your text doesn't quite capture the idea
> in this commit message:
>
> commit 7d4c04fc170087119727119074e72445f2bb192b
> Author: Keller, Jacob E <jacob.e.keller@intel.com>
> Date: Thu Mar 28 11:19:25 2013 +0000
It definitely does not.
Initially, despite the description of the commit and the
name of the option, I was investigating only the poll() case
as this was what I was working on.
Sorry.
Now I investigated the behavior of select() and poll(). I've
updated a test code that I wrote some time ago.
See <https://github.com/pasqualirb/poll_select_test>.
I've also written a Behavior section in README which I did
not include here.
> What would you think of something like this:
> SO_SELECT_ERR_QUEUE (since Linux 3.10)
> When this option is set on a socket, an error condition on
> a socket causes notification not only via the exceptfds set
> of select(2). Similarly, poll(2) also returns a POLLPRI
> whenever an POLLERR event is returned.
>
> Background: this option was added when waking up on an
> error condition occurred occured only via the readfds and
> writefds sets of select(2). The option was added to allow
> monitoring for error conditions via the exceptfds argument
> without simultaneously having to receive notifications (via
> readfds) for regular data that can be read from the socket.
> After changes in Linux 4.16, in Linux 4.16, the use of this
> flag to achieve the desired notifications is no longer nec‐
> essary. This option is nevertheless retained for backwards
> compatibility.
>
> ?
I think the part "causes notification not only via the
exeptfds set" implies that the option causes notification
in other sets besides exceptfds. However, the option causes
notification in exceptfds (before Linux 4.16).
In "Background", before Linux 4.16, "waking up" happened
also in exeptfds (see 'Internal details' section), although
select() did not return.
A description covering poll() and select() cases plus wake
up behavior might be:
When this option is set on a socket and an error condition
triggers wake up (see Background below), an exeptional
condition (POLLPRI of poll(2); exeptfds of select(2)) is
returned if user requested it.
Background:
Before Linux 4.16, an error condition triggers wake up only
if user requested POLLIN or POLLPRI (i.e. any of readfds,
writefds or exeptfds of select(2)). However, for an error
condition to be returned to the user instead of sleeping
again in the kernel, POLLERR (i.e. readfds or writefds of
select(2)) must also have been requested (implicit in
poll(2)). The option eliminates this need in select(2) by
returning POLLPRI (i.e. exeptfds) if user requested it.
Since Linux 4.16, an error condition triggers wake up only
if user requested POLLERR (i.e. readfds or writefds of
select(2)). Wake up is not triggered when requesting only
exeptfds, although returning on it occurs if the error
condition was generated before calling select(2).
// Linux 4.16 commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not
// waking applications when errors are enqueued")
Another description, focusing on select(), might be:
Before Linux 4.16, when this option is set on a socket and
an error condition occurs, select(2) returns on exeptfds if
user requested it. It is already returned on readfds and
writefds. Since Linux 4.16, when the option is set, an error
condition does not return via exeptfds anymore unless it
occurred before calling select(2).
For poll(2), regardless of the kernel version, the option
causes POLLPRI to be added when POLLERR is returned.
The option does not affect wake up, it affects only whether
select(2) returns. The wake up behavior is affected in Linux
4.16. Before this release, waking up on an error condition
required requesting POLLIN or POLLPRI. However, for an error
condition to be returned to the user instead of sleeping
again in the kernel, POLLERR must also be requested. Since
Linux 4.16, waking up requires requesting only POLLERR.
I have been rewriting this multiple times in the past two
weeks, and I still think it is not clear/simple enough.
What do you think? Please comment your understanding and
your ideas.
Internal details
================
The commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not waking
applications when errors are enqueued") introduced in Linux
4.16, changed the function that triggered the wake up. The
function sk_data_ready() (sock_def_readable()), which wakes
up the task if POLLIN or POLLPRI is requested, was replaced
by sk_error_report() (sock_queue_err_skb()), which wakes up
the task only if POLLERR is requested.
With the option (SO_SELECT_ERR_QUEUE) set, requesting only
exeptfds (POLLPRI) does not intersect the trigger events
anymore, so the task is not woken. However, if POLLERR is
triggered __before__ calling select(), select() __will__
return because availability of events is checked before
sleep.
In select(), POLLPRI is always requested [1]. POLLERR is
requested by readfds and writefds [2]. POLLIN and POLLHUP
by readfds [2]. POLLOUT by writefds [2].
In poll(), user freely requests events, but POLLERR and
POLLHUP are always requested [3].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/
linux.git/tree/fs/select.c?id=6e5d58fdc9bedd0255a8#n443
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/
linux.git/tree/fs/select.c?id=6e5d58fdc9bedd0255a8#n435
[3] https://git.kernel.org/pub/scm/linux/kernel/git/stable/
linux.git/tree/fs/select.c?id=6e5d58fdc9bedd0255a8#n820
pasquali
^ permalink raw reply
* Re: [PATCH net] tunnel: fix dev null pointer dereference when send pkg larger than mtu in collect_md mode
From: Hangbin Liu @ 2019-08-16 3:24 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
David S . Miller
In-Reply-To: <cb5b5d82-1239-34a9-23f5-1894a2ec92a2@gmail.com>
Hi Eric,
Thanks for the review.
On Thu, Aug 15, 2019 at 11:16:58AM +0200, Eric Dumazet wrote:
> > diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> > index 38c02bb62e2c..c6713c7287df 100644
> > --- a/net/ipv4/ip_tunnel.c
> > +++ b/net/ipv4/ip_tunnel.c
> > @@ -597,6 +597,9 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
> > goto tx_error;
> > }
> >
> > + if (skb_dst(skb) && !skb_dst(skb)->dev)
> > + skb_dst(skb)->dev = rt->dst.dev;
> > +
>
>
> IMO this looks wrong.
> This dst seems shared.
If the dst is shared, it may cause some problem. Could you point me where the
dst may be shared possibly?
> Once set, we will reuse the same dev ?
If yes, how about just set the skb dst to rt->dst, as the
iptunnel_xmit would do later.
skb_dst_drop(skb);
skb_dst_set(skb, &rt->dst);
or do you have any other idea?
>
> If intended, why not doing this in __metadata_dst_init() instead of in the fast path ?
I'm afraid we couldn't do this, I didn't find a way to init dev in
__metadata_dst_init(). Do you?
Thanks
Hangbin
^ permalink raw reply
* RE: [PATCH net-next] r8152: divide the tx and rx bottom functions
From: Hayes Wang @ 2019-08-16 2:59 UTC (permalink / raw)
To: David Miller
Cc: netdev@vger.kernel.org, nic_swsd, linux-kernel@vger.kernel.org
In-Reply-To: <20190815.135851.1942927063321516679.davem@davemloft.net>
David Miller [mailto:davem@davemloft.net]
> Sent: Friday, August 16, 2019 4:59 AM
[...]
> Theoretically, yes.
>
> But do you have actual performance numbers showing this to be worth
> the change?
>
> Always provide performance numbers with changes that are supposed to
> improve performance.
On x86, they are almost the same.
Tx/Rx: 943/943 Mbits/sec -> 945/944
For arm platform,
Tx/Rx: 917/917 Mbits/sec -> 933/933
Improve about 1.74%.
Best Regards,
Hayes
^ permalink raw reply
* linux-next: manual merge of the net-next tree with the kbuild tree
From: Stephen Rothwell @ 2019-08-16 2:41 UTC (permalink / raw)
To: David Miller, Networking, Masahiro Yamada
Cc: Linux Next Mailing List, Linux Kernel Mailing List, Kees Cook,
Andrii Nakryiko, Daniel Borkmann
[-- Attachment #1: Type: text/plain, Size: 2690 bytes --]
Hi all,
Today's linux-next merge of the net-next tree got a conflict in:
scripts/link-vmlinux.sh
between commit:
e167191e4a8a ("kbuild: Parameterize kallsyms generation and correct reporting")
from the kbuild tree and commits:
341dfcf8d78e ("btf: expose BTF info through sysfs")
7fd785685e22 ("btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux")
from the net-next tree.
I fixed it up (I think - see below) and can carry the fix as necessary.
This is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.
--
Cheers,
Stephen Rothwell
diff --cc scripts/link-vmlinux.sh
index 2438a9faf3f1,c31193340108..000000000000
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@@ -56,11 -56,10 +56,11 @@@ modpost_link(
}
# Link of vmlinux
- # ${1} - optional extra .o files
- # ${2} - output file
+ # ${1} - output file
+ # ${@:2} - optional extra .o files
vmlinux_link()
{
+ info LD ${2}
local lds="${objtree}/${KBUILD_LDS}"
local objects
@@@ -139,18 -149,6 +150,18 @@@ kallsyms(
${CC} ${aflags} -c -o ${2} ${afile}
}
+# Perform one step in kallsyms generation, including temporary linking of
+# vmlinux.
+kallsyms_step()
+{
+ kallsymso_prev=${kallsymso}
+ kallsymso=.tmp_kallsyms${1}.o
+ kallsyms_vmlinux=.tmp_vmlinux${1}
+
- vmlinux_link "${kallsymso_prev}" ${kallsyms_vmlinux}
++ vmlinux_link ${kallsyms_vmlinux} "${kallsymso_prev}" ${btf_vmlinux_bin_o}
+ kallsyms ${kallsyms_vmlinux} ${kallsymso}
+}
+
# Create map file with all symbols from ${1}
# See mksymap for additional details
mksysmap()
@@@ -228,8 -227,14 +240,15 @@@ ${MAKE} -f "${srctree}/scripts/Makefile
info MODINFO modules.builtin.modinfo
${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo
+ btf_vmlinux_bin_o=""
+ if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
+ if gen_btf .tmp_vmlinux.btf .btf.vmlinux.bin.o ; then
+ btf_vmlinux_bin_o=.btf.vmlinux.bin.o
+ fi
+ fi
+
kallsymso=""
+kallsymso_prev=""
kallsyms_vmlinux=""
if [ -n "${CONFIG_KALLSYMS}" ]; then
@@@ -268,11 -285,8 +287,7 @@@
fi
fi
- vmlinux_link "${kallsymso}" vmlinux
-
- if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
- gen_btf vmlinux
- fi
-info LD vmlinux
+ vmlinux_link vmlinux "${kallsymso}" "${btf_vmlinux_bin_o}"
if [ -n "${CONFIG_BUILDTIME_EXTABLE_SORT}" ]; then
info SORTEX vmlinux
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* [PATCH -next v2] btf: fix return value check in btf_vmlinux_init()
From: Wei Yongjun @ 2019-08-16 2:40 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, Andrii Nakryiko
Cc: Wei Yongjun, netdev, bpf, kernel-janitors
In-Reply-To: <20190815142432.101401-1-weiyongjun1@huawei.com>
In case of error, the function kobject_create_and_add() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.
Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
---
kernel/bpf/sysfs_btf.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
index 4659349fc795..7ae5dddd1fe6 100644
--- a/kernel/bpf/sysfs_btf.c
+++ b/kernel/bpf/sysfs_btf.c
@@ -30,17 +30,12 @@ static struct kobject *btf_kobj;
static int __init btf_vmlinux_init(void)
{
- int err;
-
if (!_binary__btf_vmlinux_bin_start)
return 0;
btf_kobj = kobject_create_and_add("btf", kernel_kobj);
- if (IS_ERR(btf_kobj)) {
- err = PTR_ERR(btf_kobj);
- btf_kobj = NULL;
- return err;
- }
+ if (!btf_kobj)
+ return -ENOMEM;
bin_attr_btf_vmlinux.size = _binary__btf_vmlinux_bin_end -
_binary__btf_vmlinux_bin_start;
^ permalink raw reply related
* Re: [PATCH v5] perf machine: arm/arm64: Improve completeness for kernel address space
From: Leo Yan @ 2019-08-16 1:45 UTC (permalink / raw)
To: Adrian Hunter
Cc: Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Song Liu, Yonghong Song, linux-kernel, netdev,
bpf, clang-built-linux, Mathieu Poirier, Peter Zijlstra,
Suzuki Poulouse, coresight, linux-arm-kernel
In-Reply-To: <e0919e39-7607-815b-3a12-96f098e45a5f@intel.com>
Hi Adrian,
On Thu, Aug 15, 2019 at 02:45:57PM +0300, Adrian Hunter wrote:
[...]
> >> How come you cannot use kallsyms to get the information?
> >
> > Thanks for pointing out this. Sorry I skipped your comment "I don't
> > know how you intend to calculate ARM_PRE_START_SIZE" when you reviewed
> > the patch v3, I should use that chance to elaborate the detailed idea
> > and so can get more feedback/guidance before procceed.
> >
> > Actually, I have considered to use kallsyms when worked on the previous
> > patch set.
> >
> > As mentioned in patch set v4's cover letter, I tried to implement
> > machine__create_extra_kernel_maps() for arm/arm64, the purpose is to
> > parse kallsyms so can find more kernel maps and thus also can fixup
> > the kernel start address. But I found the 'perf script' tool directly
> > calls machine__get_kernel_start() instead of running into the flow for
> > machine__create_extra_kernel_maps();
>
> Doesn't it just need to loop through each kernel map to find the lowest
> start address?
Based on your suggestion, I worked out below change and verified it
can work well on arm64 for fixing up start address; please let me know
if the change works for you?
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f6ee7fbad3e4..51d78313dca1 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2671,9 +2671,26 @@ int machine__nr_cpus_avail(struct machine *machine)
return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
}
+static int machine__fixup_kernel_start(void *arg,
+ const char *name __maybe_unused,
+ char type,
+ u64 start)
+{
+ struct machine *machine = arg;
+
+ type = toupper(type);
+
+ /* Fixup for text, weak, data and bss sections. */
+ if (type == 'T' || type == 'W' || type == 'D' || type == 'B')
+ machine->kernel_start = min(machine->kernel_start, start);
+
+ return 0;
+}
+
int machine__get_kernel_start(struct machine *machine)
{
struct map *map = machine__kernel_map(machine);
+ char filename[PATH_MAX];
int err = 0;
/*
@@ -2687,6 +2704,7 @@ int machine__get_kernel_start(struct machine *machine)
machine->kernel_start = 1ULL << 63;
if (map) {
err = map__load(map);
/*
* On x86_64, PTI entry trampolines are less than the
* start of kernel text, but still above 2^63. So leave
@@ -2695,6 +2713,16 @@ int machine__get_kernel_start(struct machine *machine)
if (!err && !machine__is(machine, "x86_64"))
machine->kernel_start = map->start;
}
+
+ machine__get_kallsyms_filename(machine, filename, PATH_MAX);
+
+ if (symbol__restricted_filename(filename, "/proc/kallsyms"))
+ goto out;
+
+ if (kallsyms__parse(filename, machine, machine__fixup_kernel_start))
+ pr_warning("Fail to fixup kernel start address. skipping...\n");
+
+out:
return err;
}
Thanks,
Leo Yan
^ permalink raw reply related
* Re: WARNING in is_bpf_text_address
From: Bart Van Assche @ 2019-08-16 1:39 UTC (permalink / raw)
To: Will Deacon, syzbot
Cc: akpm, ast, bpf, daniel, davem, dvyukov, hawk, hdanton,
jakub.kicinski, johannes.berg, johannes, john.fastabend, kafai,
linux-kernel, longman, mingo, netdev, paulmck, peterz,
songliubraving, syzkaller-bugs, tglx, tj, torvalds, will.deacon,
xdp-newbies, yhs
In-Reply-To: <20190815075142.vuza32plqtiuhixx@willie-the-truck>
On 8/15/19 12:51 AM, Will Deacon wrote:
> Hi Bart,
>
> On Sat, Aug 10, 2019 at 05:24:06PM -0700, syzbot wrote:
>> syzbot has found a reproducer for the following crash on:
>>
>> HEAD commit: 451577f3 Merge tag 'kbuild-fixes-v5.3-3' of git://git.kern..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=120850a6600000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=2031e7d221391b8a
>> dashboard link: https://syzkaller.appspot.com/bug?extid=bd3bba6ff3fcea7a6ec6
>> compiler: clang version 9.0.0 (/home/glider/llvm/clang
>> 80fee25776c2fb61e74c1ecb1a523375c2500b69)
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=130ffe4a600000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17137d2c600000
>>
>> The bug was bisected to:
>>
>> commit a0b0fd53e1e67639b303b15939b9c653dbe7a8c4
>> Author: Bart Van Assche <bvanassche@acm.org>
>> Date: Thu Feb 14 23:00:46 2019 +0000
>>
>> locking/lockdep: Free lock classes that are no longer in use
>>
>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=152f6a9da00000
>> final crash: https://syzkaller.appspot.com/x/report.txt?x=172f6a9da00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=132f6a9da00000
>
> I know you don't think much to these reports, but please could you have a
> look (even if it's just to declare it a false positive)?
Hi Will,
Had you already noticed the following message?
https://lore.kernel.org/bpf/d76d7a63-7854-e92d-30cb-52546d333ffe@iogearbox.net/
From that message: "Hey Bart, don't think it's related in any way to
your commit. I'll allocate some time on working on this issue today,
thanks!"
Bart.
^ permalink raw reply
* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16 1:38 UTC (permalink / raw)
To: William Tu
Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, David S. Miller, Jakub Kicinski,
Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
Cong Wang, Jiri Pirko, Linux Kernel Network Developers, bpf
In-Reply-To: <CALDO+SYC4sPw-7iDkFMCD=kf2UnTW2qc0m6Kgz41zLmNNxQ+Ww@mail.gmail.com>
On 2019/08/16 0:46, William Tu wrote:
> On Tue, Aug 13, 2019 at 5:07 AM Toshiaki Makita
> <toshiaki.makita1@gmail.com> wrote:
>>
>> This is a rough PoC for an idea to offload TC flower to XDP.
>>
>>
>> * Motivation
>>
>> The purpose is to speed up software TC flower by using XDP.
>>
>> I chose TC flower because my current interest is in OVS. OVS uses TC to
>> offload flow tables to hardware, so if TC can offload flows to XDP, OVS
>> also can be offloaded to XDP.
>>
>> When TC flower filter is offloaded to XDP, the received packets are
>> handled by XDP first, and if their protocol or something is not
>> supported by the eBPF program, the program returns XDP_PASS and packets
>> are passed to upper layer TC.
>>
>> The packet processing flow will be like this when this mechanism,
>> xdp_flow, is used with OVS.
>>
>> +-------------+
>> | openvswitch |
>> | kmod |
>> +-------------+
>> ^
>> | if not match in filters (flow key or action not supported by TC)
>> +-------------+
>> | TC flower |
>> +-------------+
>> ^
>> | if not match in flow tables (flow key or action not supported by XDP)
>> +-------------+
>> | XDP prog |
>> +-------------+
>> ^
>> | incoming packets
>>
> I like this idea, some comments about the OVS AF_XDP work.
>
> Another way when using OVS AF_XDP is to serve as slow path of TC flow
> HW offload.
> For example:
>
> Userspace OVS datapath (The one used by OVS-DPDK)
> ^
> |
> +------------------------------+
> | OVS AF_XDP netdev |
> +------------------------------+
> ^
> | if not supported or not match in flow tables
> +---------------------+
> | TC HW flower |
> +---------------------+
> ^
> | incoming packets
>
> So in this case it's either TC HW flower offload, or the userspace PMD OVS.
> Both cases should be pretty fast.
>
> I think xdp_flow can also be used by OVS AF_XDP netdev, sitting between
> TC HW flower and OVS AF_XDP netdev.
> Before the XDP program sending packet to AF_XDP socket, the
> xdp_flow can execute first, and if not match, then send to AF_XDP.
> So in your patch set, implement s.t like
> bpf_redirect_map(&xsks_map, index, 0);
Thanks, the concept sounds good but this is probably difficult as long as
this is a TC offload, which is emulating TC.
If I changed the direction and implement offload in ovs-vswitchd, it would
be possible. I'll remember this optimization.
> Another thing is that at each layer we are doing its own packet parsing.
> From your graph, first parse at XDP program, then at TC flow, then at
> openvswitch kmod.
> I wonder if we can reuse some parsing result.
That would be nice if possible...
Currently I don't have any ideas to do that. Someday XDP may support more
metadata for this or HW-offload like checksum. Then we can store the information
and upper layers may be able to use that.
Toshiaki Makita
^ permalink raw reply
* Re: [PATCH net-next v7 3/3] net: phy: broadcom: add 1000Base-X support for BCM54616S
From: Tao Ren @ 2019-08-16 1:33 UTC (permalink / raw)
To: Florian Fainelli, Andrew Lunn, Heiner Kallweit, David S . Miller,
Vladimir Oltean, Arun Parameswaran, Justin Chen,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
openbmc@lists.ozlabs.org
In-Reply-To: <d45d609b-60ea-760d-31f4-51afa379c55a@gmail.com>
On 8/15/19 4:36 PM, Florian Fainelli wrote:
> On 8/11/19 4:40 PM, Tao Ren wrote:
>> The BCM54616S PHY cannot work properly in RGMII->1000Base-X mode, mainly
>> because genphy functions are designed for copper links, and 1000Base-X
>> (clause 37) auto negotiation needs to be handled differently.
>>
>> This patch enables 1000Base-X support for BCM54616S by customizing 3
>> driver callbacks, and it's verified to be working on Facebook CMM BMC
>> platform (RGMII->1000Base-KX):
>>
>> - probe: probe callback detects PHY's operation mode based on
>> INTERF_SEL[1:0] pins and 1000X/100FX selection bit in SerDES 100-FX
>> Control register.
>>
>> - config_aneg: calls genphy_c37_config_aneg when the PHY is running in
>> 1000Base-X mode; otherwise, genphy_config_aneg will be called.
>>
>> - read_status: calls genphy_c37_read_status when the PHY is running in
>> 1000Base-X mode; otherwise, genphy_read_status will be called.
>>
>> Note: BCM54616S PHY can also be configured in RGMII->100Base-FX mode, and
>> 100Base-FX support is not available as of now.
>>
>> Signed-off-by: Tao Ren <taoren@fb.com>
>
>> - reg = bcm_phy_read_shadow(phydev, BCM5482_SHD_MODE);
>> - bcm_phy_write_shadow(phydev, BCM5482_SHD_MODE,
>> - reg | BCM5482_SHD_MODE_1000BX);
>> + reg = bcm_phy_read_shadow(phydev, BCM54XX_SHD_MODE);
>> + bcm_phy_write_shadow(phydev, BCM54XX_SHD_MODE,
>> + reg | BCM54XX_SHD_MODE_1000BX);
>
> This could have been a separate patch, but this looks reasonable to me
> and this is correct with the datasheet, thanks Tao.
>
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Thank you FLorian.
It is great experience working with you all to figure out root cause and work out the patches, and I really appreciate it.
Cheers,
Tao
^ permalink raw reply
* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16 1:28 UTC (permalink / raw)
To: Jakub Kicinski, Stanislav Fomichev
Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
Yonghong Song, David S. Miller, Jesper Dangaard Brouer,
John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, netdev,
bpf, William Tu
In-Reply-To: <20190815122232.4b1fa01c@cakuba.netronome.com>
On 2019/08/16 4:22, Jakub Kicinski wrote:
> On Thu, 15 Aug 2019 08:21:00 -0700, Stanislav Fomichev wrote:
>> On 08/15, Toshiaki Makita wrote:
>>> On 2019/08/15 2:07, Stanislav Fomichev wrote:
>>>> On 08/13, Toshiaki Makita wrote:
>>>>> * Implementation
>>>>>
>>>>> xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
>>>>> bpfilter. The difference is that xdp_flow does not generate the eBPF
>>>>> program dynamically but a prebuilt program is embedded in UMH. This is
>>>>> mainly because flow insertion is considerably frequent. If we generate
>>>>> and load an eBPF program on each insertion of a flow, the latency of the
>>>>> first packet of ping in above test will incease, which I want to avoid.
>>>> Can this be instead implemented with a new hook that will be called
>>>> for TC events? This hook can write to perf event buffer and control
>>>> plane will insert/remove/modify flow tables in the BPF maps (contol
>>>> plane will also install xdp program).
>>>>
>>>> Why do we need UMH? What am I missing?
>>>
>>> So you suggest doing everything in xdp_flow kmod?
>> You probably don't even need xdp_flow kmod. Add new tc "offload" mode
>> (bypass) that dumps every command via netlink (or calls the BPF hook
>> where you can dump it into perf event buffer) and then read that info
>> from userspace and install xdp programs and modify flow tables.
>> I don't think you need any kernel changes besides that stream
>> of data from the kernel about qdisc/tc flow creation/removal/etc.
>
> There's a certain allure in bringing the in-kernel BPF translation
> infrastructure forward. OTOH from system architecture perspective IMHO
> it does seem like a task best handed in user space. bpfilter can replace
> iptables completely, here we're looking at an acceleration relatively
> loosely coupled with flower.
I don't think it's loosely coupled. Emulating TC behavior in userspace
is not so easy.
Think about recent multi-mask support in flower. Previously userspace could
assume there is one mask and hash table for each preference in TC. After the
change TC accepts different masks with the same pref. Such a change tends to
break userspace emulation. It may ignore masks passed from flow insertion
and use the mask remembered when the first flow of the pref is inserted. It
may override the mask of all existing flows with the pref. It may fail to
insert such flows. Any of them would result in unexpected wrong datapath
handling which is critical.
I think such an emulation layer needs to be updated in sync with TC.
Toshiaki Makita
^ permalink raw reply
* [PATCH net,v5 1/2] net: sched: use major priority number as hardware priority
From: Pablo Neira Ayuso @ 2019-08-16 1:24 UTC (permalink / raw)
To: netfilter-devel
Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
gerlitz.or, jakub.kicinski
In-Reply-To: <20190816012410.31844-1-pablo@netfilter.org>
tc transparently maps the software priority number to hardware. Update
it to pass the major priority which is what most drivers expect. Update
drivers too so they do not need to lshift the priority field of the
flow_cls_common_offload object. The stmmac driver is an exception, since
this code assumes the tc software priority is fine, therefore, lshift it
just to be conservative.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
v5: no changes.
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 2 +-
drivers/net/ethernet/mscc/ocelot_flower.c | 12 +++---------
drivers/net/ethernet/netronome/nfp/flower/qos_conf.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c | 2 +-
include/net/pkt_cls.h | 2 +-
6 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index deeb65da99f3..00b2d4a86159 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3167,7 +3167,7 @@ mlx5e_flow_esw_attr_init(struct mlx5_esw_flow_attr *esw_attr,
esw_attr->parse_attr = parse_attr;
esw_attr->chain = f->common.chain_index;
- esw_attr->prio = TC_H_MAJ(f->common.prio) >> 16;
+ esw_attr->prio = f->common.prio;
esw_attr->in_rep = in_rep;
esw_attr->in_mdev = in_mdev;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index e8ac90564dbe..84a87d059333 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -471,7 +471,7 @@ int mlxsw_sp_acl_rulei_commit(struct mlxsw_sp_acl_rule_info *rulei)
void mlxsw_sp_acl_rulei_priority(struct mlxsw_sp_acl_rule_info *rulei,
unsigned int priority)
{
- rulei->priority = priority >> 16;
+ rulei->priority = priority;
}
void mlxsw_sp_acl_rulei_keymask_u32(struct mlxsw_sp_acl_rule_info *rulei,
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 59487d446a09..b894bc0c9c16 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -13,12 +13,6 @@ struct ocelot_port_block {
struct ocelot_port *port;
};
-static u16 get_prio(u32 prio)
-{
- /* prio starts from 0x1000 while the ids starts from 0 */
- return prio >> 16;
-}
-
static int ocelot_flower_parse_action(struct flow_cls_offload *f,
struct ocelot_ace_rule *rule)
{
@@ -168,7 +162,7 @@ static int ocelot_flower_parse(struct flow_cls_offload *f,
}
finished_key_parsing:
- ocelot_rule->prio = get_prio(f->common.prio);
+ ocelot_rule->prio = f->common.prio;
ocelot_rule->id = f->cookie;
return ocelot_flower_parse_action(f, ocelot_rule);
}
@@ -218,7 +212,7 @@ static int ocelot_flower_destroy(struct flow_cls_offload *f,
struct ocelot_ace_rule rule;
int ret;
- rule.prio = get_prio(f->common.prio);
+ rule.prio = f->common.prio;
rule.port = port_block->port;
rule.id = f->cookie;
@@ -236,7 +230,7 @@ static int ocelot_flower_stats_update(struct flow_cls_offload *f,
struct ocelot_ace_rule rule;
int ret;
- rule.prio = get_prio(f->common.prio);
+ rule.prio = f->common.prio;
rule.port = port_block->port;
rule.id = f->cookie;
ret = ocelot_ace_rule_stats_update(&rule);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c b/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
index 86e968cd5ffd..124a43dc136a 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
@@ -93,7 +93,7 @@ nfp_flower_install_rate_limiter(struct nfp_app *app, struct net_device *netdev,
return -EOPNOTSUPP;
}
- if (flow->common.prio != (1 << 16)) {
+ if (flow->common.prio != 1) {
NL_SET_ERR_MSG_MOD(extack, "unsupported offload: qos rate limit offload requires highest priority");
return -EOPNOTSUPP;
}
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
index 37c0bc699cd9..6c305b6ecad0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
@@ -94,7 +94,7 @@ static int tc_fill_entry(struct stmmac_priv *priv,
struct stmmac_tc_entry *entry, *frag = NULL;
struct tc_u32_sel *sel = cls->knode.sel;
u32 off, data, mask, real_off, rem;
- u32 prio = cls->common.prio;
+ u32 prio = cls->common.prio << 16;
int ret;
/* Only 1 match per entry */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809ca90d..98be18ef1ed3 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -646,7 +646,7 @@ tc_cls_common_offload_init(struct flow_cls_common_offload *cls_common,
{
cls_common->chain_index = tp->chain->index;
cls_common->protocol = tp->protocol;
- cls_common->prio = tp->prio;
+ cls_common->prio = tp->prio >> 16;
if (tc_skip_sw(flags) || flags & TCA_CLS_FLAGS_VERBOSE)
cls_common->extack = extack;
}
--
2.11.0
^ permalink raw reply related
* [PATCH net,v5 2/2] netfilter: nf_tables: map basechain priority to hardware priority
From: Pablo Neira Ayuso @ 2019-08-16 1:24 UTC (permalink / raw)
To: netfilter-devel
Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
gerlitz.or, jakub.kicinski
In-Reply-To: <20190816012410.31844-1-pablo@netfilter.org>
This patch adds initial support for offloading basechains using the
priority range from 1 to 65535. This is restricting the netfilter
priority range to 16-bit integer since this is what most drivers assume
so far from tc. It should be possible to extend this range of supported
priorities later on once drivers are updated to support for 32-bit
integer priorities.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
v5: fix clang warning by simplifying the mapping of hardware priorities
to basechain priority in the range of 1-65535. Zero is left behind
since some drivers do not support this, no negative basechain
priorities are used at this stage.
include/net/netfilter/nf_tables_offload.h | 2 ++
net/netfilter/nf_tables_api.c | 4 ++++
net/netfilter/nf_tables_offload.c | 17 ++++++++++++++---
3 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index 3196663a10e3..c8b9dec376f5 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -73,4 +73,6 @@ int nft_flow_rule_offload_commit(struct net *net);
(__reg)->key = __key; \
memset(&(__reg)->mask, 0xff, (__reg)->len);
+int nft_chain_offload_priority(struct nft_base_chain *basechain);
+
#endif
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 605a7cfe7ca7..b65caf4974e1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1662,6 +1662,10 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
chain->flags |= NFT_BASE_CHAIN | flags;
basechain->policy = NF_ACCEPT;
+ if (chain->flags & NFT_CHAIN_HW_OFFLOAD &&
+ nft_chain_offload_priority(basechain) < 0)
+ return -EOPNOTSUPP;
+
flow_block_init(&basechain->flow_block);
} else {
chain = kzalloc(sizeof(*chain), GFP_KERNEL);
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5f240e..c0d18c1d77ac 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -103,10 +103,11 @@ void nft_offload_update_dependency(struct nft_offload_ctx *ctx,
}
static void nft_flow_offload_common_init(struct flow_cls_common_offload *common,
- __be16 proto,
- struct netlink_ext_ack *extack)
+ __be16 proto, int priority,
+ struct netlink_ext_ack *extack)
{
common->protocol = proto;
+ common->prio = priority;
common->extack = extack;
}
@@ -124,6 +125,15 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
return 0;
}
+int nft_chain_offload_priority(struct nft_base_chain *basechain)
+{
+ if (basechain->ops.priority <= 0 ||
+ basechain->ops.priority > USHRT_MAX)
+ return -1;
+
+ return 0;
+}
+
static int nft_flow_offload_rule(struct nft_trans *trans,
enum flow_cls_command command)
{
@@ -142,7 +152,8 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
if (flow)
proto = flow->proto;
- nft_flow_offload_common_init(&cls_flow.common, proto, &extack);
+ nft_flow_offload_common_init(&cls_flow.common, proto,
+ basechain->ops.priority, &extack);
cls_flow.command = command;
cls_flow.cookie = (unsigned long) rule;
if (flow)
--
2.11.0
^ permalink raw reply related
* [PATCH net,v5 0/2] flow_offload hardware priority fixes
From: Pablo Neira Ayuso @ 2019-08-16 1:24 UTC (permalink / raw)
To: netfilter-devel
Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
gerlitz.or, jakub.kicinski
Hi,
This patchset contains two updates for the flow_offload users:
1) Pass the major tc priority to drivers so they do not have to
lshift it. This is a preparation patch for the fix coming in
patch #2.
2) Set the hardware priority from the netfilter basechain priority,
some drivers break when using the existing hardware priority
number that is set to zero.
v5: fix patch 2/2 to address a clang warning and to simplify
the priority mapping.
Please, apply.
Pablo Neira Ayuso (2):
net: sched: use major priority number as hardware priority
netfilter: nf_tables: map basechain priority to hardware priority
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c | 2 +-
drivers/net/ethernet/mscc/ocelot_flower.c | 12 +++---------
drivers/net/ethernet/netronome/nfp/flower/qos_conf.c | 2 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c | 2 +-
include/net/netfilter/nf_tables_offload.h | 2 ++
include/net/pkt_cls.h | 2 +-
net/netfilter/nf_tables_api.c | 4 ++++
net/netfilter/nf_tables_offload.c | 17 ++++++++++++++---
9 files changed, 28 insertions(+), 17 deletions(-)
--
2.11.0
^ permalink raw reply
* Re: INFO: task hung in tls_sw_release_resources_tx
From: Jakub Kicinski @ 2019-08-16 1:11 UTC (permalink / raw)
To: Hillf Danton
Cc: syzbot, ast, aviadye, borisp, bpf, daniel, davejwatson, davem,
john.fastabend, kafai, linux-kernel, netdev, songliubraving,
syzkaller-bugs, yhs
In-Reply-To: <20190815141419.15036-1-hdanton@sina.com>
On Thu, 15 Aug 2019 22:14:19 +0800, Hillf Danton wrote:
> On Thu, 15 Aug 2019 03:54:06 -0700
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: 6d5afe20 sctp: fix memleak in sctp_send_reset_streams
> > git tree: net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16e5536a600000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
> >
> > INFO: task syz-executor153:10198 blocked for more than 143 seconds.
> > Not tainted 5.3.0-rc3+ #162
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor153 D27672 10198 10179 0x80000002
> > Call Trace:
> > context_switch kernel/sched/core.c:3254 [inline]
> > __schedule+0x755/0x1580 kernel/sched/core.c:3880
> > schedule+0xa8/0x270 kernel/sched/core.c:3944
> > schedule_timeout+0x717/0xc50 kernel/time/timer.c:1783
> > do_wait_for_common kernel/sched/completion.c:83 [inline]
> > __wait_for_common kernel/sched/completion.c:104 [inline]
> > wait_for_common kernel/sched/completion.c:115 [inline]
> > wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> > crypto_wait_req include/linux/crypto.h:685 [inline]
> > crypto_wait_req include/linux/crypto.h:680 [inline]
> > tls_sw_release_resources_tx+0x4ee/0x6b0 net/tls/tls_sw.c:2075
> > tls_sk_proto_cleanup net/tls/tls_main.c:275 [inline]
> > tls_sk_proto_close+0x686/0x970 net/tls/tls_main.c:305
> > inet_release+0xed/0x200 net/ipv4/af_inet.c:427
> > inet6_release+0x53/0x80 net/ipv6/af_inet6.c:470
> > __sock_release+0xce/0x280 net/socket.c:590
> > sock_close+0x1e/0x30 net/socket.c:1268
> > __fput+0x2ff/0x890 fs/file_table.c:280
> > ____fput+0x16/0x20 fs/file_table.c:313
> > task_work_run+0x145/0x1c0 kernel/task_work.c:113
> > exit_task_work include/linux/task_work.h:22 [inline]
> > do_exit+0x92f/0x2e50 kernel/exit.c:879
> > do_group_exit+0x135/0x360 kernel/exit.c:983
> > __do_sys_exit_group kernel/exit.c:994 [inline]
> > __se_sys_exit_group kernel/exit.c:992 [inline]
> > __x64_sys_exit_group+0x44/0x50 kernel/exit.c:992
> > do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x43ff88
> > Code: 00 00 be 3c 00 00 00 eb 19 66 0f 1f 84 00 00 00 00 00 48 89 d7 89 f0
> > 0f 05 48 3d 00 f0 ff ff 77 21 f4 48 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff
> > ff 76 e0 f7 d8 64 41 89 01 eb d8 0f 1f 84 00 00 00
> > RSP: 002b:00007ffd1c2d0f78 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043ff88
> > RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> > RBP: 00000000004bf890 R08: 00000000000000e7 R09: ffffffffffffffd0
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> > R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
> > INFO: lockdep is turned off.
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 1057 Comm: khungtaskd Not tainted 5.3.0-rc3+ #162
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> > __dump_stack lib/dump_stack.c:77 [inline]
> > dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> > nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> > nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> > trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> > check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
> > watchdog+0x9d0/0xef0 kernel/hung_task.c:289
> > kthread+0x361/0x430 kernel/kthread.c:255
> > ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> > Sending NMI from CPU 0 to CPUs 1:
> > NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x10
> > arch/x86/include/asm/irqflags.h:60
>
> 1, diff -> commit f87e62d45e51 -> commit 1023121375c6
>
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,13 @@ static void tx_work_handler(struct work_
> return;
>
> ctx = tls_sw_ctx_tx(tls_ctx);
> - if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> - return;
> -
> - if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> - return;
> + if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask)) {
> + if (!test_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> + return;
> + } else {
> + if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> + return;
> + }
> lock_sock(sk);
> tls_tx_records(sk, -1);
> release_sock(sk);
> --
>
> 2, a simpler one. And clear BIT_TX_SCHEDULED perhaps after releasing sock.
>
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,9 @@ static void tx_work_handler(struct work_
> return;
>
> ctx = tls_sw_ctx_tx(tls_ctx);
> - if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> - return;
> + if (!test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> + clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>
> - if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> - return;
> lock_sock(sk);
> tls_tx_records(sk, -1);
> release_sock(sk);
Mmm.. too terse, I don't follow what you're trying to do here :(
I've been staring at this for a while and trying to repo but it's not
happening here.
The only thing I see is that EBUSY is not handled.
^ permalink raw reply
* Re: [PATCH net-next,v4 07/12] net: sched: use flow block API
From: Pablo Neira Ayuso @ 2019-08-16 1:10 UTC (permalink / raw)
To: Edward Cree; +Cc: netdev, netfilter-devel
In-Reply-To: <b45709c7-38b5-2dcb-3db1-0c2fca1840be@solarflare.com>
On Wed, Aug 14, 2019 at 05:32:34PM +0100, Edward Cree wrote:
> On 09/07/2019 21:55, Pablo Neira Ayuso wrote:
> > This patch adds tcf_block_setup() which uses the flow block API.
> >
> > This infrastructure takes the flow block callbacks coming from the
> > driver and register/unregister to/from the cls_api core.
> >
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> > <snip>
> > @@ -796,13 +804,20 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
> > struct netlink_ext_ack *extack)
> > {
> > struct tc_block_offload bo = {};
> > + int err;
> >
> > bo.net = dev_net(dev);
> > bo.command = command;
> > bo.binder_type = ei->binder_type;
> > bo.block = block;
> > bo.extack = extack;
> > - return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > + INIT_LIST_HEAD(&bo.cb_list);
> > +
> > + err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > + if (err < 0)
> > + return err;
> > +
> > + return tcf_block_setup(block, &bo);
> > }
> >
> > static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
> > @@ -1636,6 +1651,77 @@ void tcf_block_cb_unregister(struct tcf_block *block,
> > }
> > EXPORT_SYMBOL(tcf_block_cb_unregister);
> >
> > +static int tcf_block_bind(struct tcf_block *block,
> > + struct flow_block_offload *bo)
> > +{
> > + struct flow_block_cb *block_cb, *next;
> > + int err, i = 0;
> > +
> > + list_for_each_entry(block_cb, &bo->cb_list, list) {
> > + err = tcf_block_playback_offloads(block, block_cb->cb,
> > + block_cb->cb_priv, true,
> > + tcf_block_offload_in_use(block),
> > + bo->extack);
> > + if (err)
> > + goto err_unroll;
> > +
> > + i++;
> > + }
> > + list_splice(&bo->cb_list, &block->cb_list);
> > +
> > + return 0;
> > +
> > +err_unroll:
> > + list_for_each_entry_safe(block_cb, next, &bo->cb_list, list) {
> > + if (i-- > 0) {
> > + list_del(&block_cb->list);
> > + tcf_block_playback_offloads(block, block_cb->cb,
> > + block_cb->cb_priv, false,
> > + tcf_block_offload_in_use(block),
> > + NULL);
> > + }
> > + flow_block_cb_free(block_cb);
> > + }
> > +
> > + return err;
> > +}
> >
> Why has the replay been moved from the function called by the driver
> (__tcf_block_cb_register()) to work done by the driver's caller based on
> what the driver has left on this flow_block_offload.cb_list? This makes
> it impossible for the driver to (say) unregister a block outside of an
> explicit request from ndo_setup_tc().
> In my under-development driver, I have a teardown path called on PCI
> remove, which calls tcf_block_cb_unregister() on all my block bindings
> (of which the driver keeps track), to ensure that no flow rules are still
> in place when unregister_netdev() is called;
It's the subsystem that has to release resources when
unregister_netdev() event happens. At least in netfilter, when the
device is going away, the filtering policy is removed, hence the
FLOW_BLOCK_UNBIND is called to release the blocks and, hence, the
offload resources. I remember tc ingress qdisc works like this too.
> this is needed because some of the driver's state for certain
> rules involves taking a reference on the netdevice (dev_hold()).
> Your structural changes here make that impossible; is there any
> reason why they're necessary?
May I have access to your driver code?
This would make it easier for me to understand your requirements, and
to discuss changes with you.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox