* Re: [PATCH] objtool: ignore .L prefixed local symbols
From: Arvind Sankar @ 2020-02-14 20:42 UTC
To: Fangrui Song
Cc: Arvind Sankar, Nick Desaulniers, jpoimboe, peterz,
clang-built-linux, Nathan Chancellor, linux-kernel
In-Reply-To: <20200214180527.z44b4bmzn336mff2@google.com>
On Fri, Feb 14, 2020 at 10:05:27AM -0800, Fangrui Song wrote:
> I know little about objtool, but if it may be used by other
> architectures, hope the following explanations don't appear to be too
> off-topic:)
>
> On 2020-02-14, Arvind Sankar wrote:
> >Can you describe what case the clang change is supposed to optimize?
> >AFAICT, it kicks in when the symbol is known by the compiler to be local
> >to the DSO and defined in the same translation unit.
> >
> >But then there are two cases:
> >(a) we have call foo, where foo is defined in the same section as the
> >call instruction. In this case the assembler should be able to fully
> >resolve foo and not generate any relocation, regardless of whether foo
> >is global or local.
>
> If foo is STB_GLOBAL or STB_WEAK, the assembler cannot fully resolve a
> reference to foo in the same section, unless the assembler can assume
> (the codegen tells it) the call to foo cannot be interposed by another
> foo definition at runtime.
I was testing with hidden/protected visibility; I see you want this for
the no-semantic-interposition case. Actually, a bit more testing shows
some peculiarities even with hidden visibility. With the code below, the
call and lea create relocations in the object file, but the jmp doesn't.
ld does avoid creating a PLT entry for this, though.
.text
.globl foo, bar
.hidden foo
bar:
call foo
leaq foo(%rip), %rax
jmp foo
foo: ret
>
> >(b) we have call foo, where foo is defined in a different section from
> >the call instruction. In this case the assembler must generate a
> >relocation regardless of whether foo is global or local, and the linker
> >should eliminate it.
> >In what case does replacing call foo with call .Lfoo$local help?
>
> For -fPIC -fno-semantic-interposition, the assembly emitter can perform
> the following optimization:
>
> void foo() {}
> void bar() { foo(); }
>
> .globl foo, bar
> foo:
> .Lfoo$local:
> ret
> bar:
> call foo --> call .Lfoo$local
> ret
>
> call foo generates an R_X86_64_PLT32. In a -shared link, it creates an
> unneeded PLT entry for foo.
>
> call .Lfoo$local generates an R_X86_64_PLT32. In a -shared link, .Lfoo$local is
> non-preemptible => no PLT entry is created.
>
> For -fno-PIC and -fPIE, the final link is expected to be -no-pie or
> -pie. This optimization does not save anything, because PLT entries will
> not be generated. With clang's integrated assembler, it may increase the
> number of STT_SECTION symbols (because .Lfoo$local will be turned to a
> STT_SECTION relative relocation), but the size increase is very small.
>
>
> I want to teach clang -fPIC to use -fno-semantic-interposition by
> default. (It is currently an LLVM optimization, not realized in clang.)
> clang traditionally makes various -fno-semantic-interposition
> assumptions and can perform interprocedural optimizations even if the
> strict ELF rule disallows them.
FWIW, gcc with no-semantic-interposition also uses local aliases, but
rather than using .L labels, it creates a local alias by
.set foo.localalias, foo
This makes the type of foo.localalias the same as foo, which I gather
should placate objtool as it'll still see an STT_FUNC no matter whether
it picks up foo.localalias or foo.
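For readers who want to see the local-alias idea at the C level, here is a minimal sketch, assuming GCC or clang targeting ELF. The name foo_local and the return value are made up for illustration. GCC typically lowers such an alias to a .set directive similar to the one above, but this is not the code either compiler generates for -fno-semantic-interposition.

```c
/* Sketch only: assumes GCC or clang on an ELF target (aliases are not
 * available on every object format). foo_local is a hypothetical name.
 */

/* Global symbol, and therefore preemptible in a -shared link. */
int foo(void)
{
	return 42;
}

/* File-local alias: same address and type as foo, but with local
 * binding, so a reference to it cannot be interposed at runtime.
 */
static __typeof__(foo) foo_local __attribute__((alias("foo")));

int bar(void)
{
	/* Reaches foo's body through the non-preemptible name. */
	return foo_local();
}
```

Because the alias shares foo's type, a tool looking at the symbol table would still see a function symbol for either name, which is the property discussed above.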
* [tip:x86/fpu] BUILD SUCCESS e70b100806d63fb79775858ea92e1a716da46186
From: kbuild test robot @ 2020-02-14 20:41 UTC
To: x86-ml; +Cc: linux-kernel
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/fpu
branch HEAD: e70b100806d63fb79775858ea92e1a716da46186 x86/fpu/xstate: Warn when checking alignment of disabled xfeatures
elapsed time: 2893m
configs tested: 234
configs skipped: 73
The following configs have been built successfully.
More configs may be tested in the coming days.
arm allmodconfig
arm allnoconfig
arm allyesconfig
arm at91_dt_defconfig
arm efm32_defconfig
arm exynos_defconfig
arm multi_v5_defconfig
arm multi_v7_defconfig
arm shmobile_defconfig
arm sunxi_defconfig
arm64 allmodconfig
arm64 allnoconfig
arm64 allyesconfig
arm64 defconfig
sparc allyesconfig
nios2 10m50_defconfig
riscv allnoconfig
sh sh7785lcr_32bit_defconfig
m68k m5475evb_defconfig
powerpc allnoconfig
riscv defconfig
i386 defconfig
riscv allmodconfig
sparc defconfig
nds32 defconfig
riscv allyesconfig
s390 defconfig
sparc64 allmodconfig
riscv nommu_virt_defconfig
mips allyesconfig
arc allyesconfig
nds32 allnoconfig
powerpc defconfig
i386 allyesconfig
mips malta_kvm_defconfig
mips fuloong2e_defconfig
xtensa iss_defconfig
parisc b180_defconfig
sparc64 allnoconfig
nios2 3c120_defconfig
parisc allnoconfig
powerpc ppc64_defconfig
microblaze mmu_defconfig
um i386_defconfig
ia64 allnoconfig
csky defconfig
m68k multi_defconfig
parisc defconfig
s390 alldefconfig
alpha defconfig
sh titan_defconfig
h8300 h8300h-sim_defconfig
mips allnoconfig
i386 alldefconfig
i386 allnoconfig
ia64 alldefconfig
ia64 allmodconfig
ia64 allyesconfig
ia64 defconfig
c6x allyesconfig
c6x evmc6678_defconfig
openrisc or1ksim_defconfig
openrisc simple_smp_defconfig
xtensa common_defconfig
h8300 edosk2674_defconfig
h8300 h8s-sim_defconfig
m68k allmodconfig
m68k sun3_defconfig
arc defconfig
microblaze nommu_defconfig
powerpc rhel-kconfig
mips 32r2_defconfig
mips 64r6el_defconfig
mips allmodconfig
parisc allyesconfig
parisc c3000_defconfig
x86_64 randconfig-a001-20200213
x86_64 randconfig-a002-20200213
x86_64 randconfig-a003-20200213
i386 randconfig-a001-20200213
i386 randconfig-a002-20200213
i386 randconfig-a003-20200213
x86_64 randconfig-a001-20200214
x86_64 randconfig-a002-20200214
x86_64 randconfig-a003-20200214
i386 randconfig-a001-20200214
i386 randconfig-a002-20200214
i386 randconfig-a003-20200214
x86_64 randconfig-a001-20200215
x86_64 randconfig-a002-20200215
x86_64 randconfig-a003-20200215
i386 randconfig-a001-20200215
i386 randconfig-a002-20200215
i386 randconfig-a003-20200215
alpha randconfig-a001-20200213
m68k randconfig-a001-20200213
mips randconfig-a001-20200213
nds32 randconfig-a001-20200213
parisc randconfig-a001-20200213
riscv randconfig-a001-20200213
alpha randconfig-a001-20200214
m68k randconfig-a001-20200214
mips randconfig-a001-20200214
nds32 randconfig-a001-20200214
parisc randconfig-a001-20200214
c6x randconfig-a001-20200213
h8300 randconfig-a001-20200213
microblaze randconfig-a001-20200213
nios2 randconfig-a001-20200213
sparc64 randconfig-a001-20200213
c6x randconfig-a001-20200215
h8300 randconfig-a001-20200215
microblaze randconfig-a001-20200215
nios2 randconfig-a001-20200215
sparc64 randconfig-a001-20200215
csky randconfig-a001-20200213
openrisc randconfig-a001-20200213
s390 randconfig-a001-20200213
sh randconfig-a001-20200213
xtensa randconfig-a001-20200213
csky randconfig-a001-20200214
openrisc randconfig-a001-20200214
s390 randconfig-a001-20200214
xtensa randconfig-a001-20200214
sh randconfig-a001-20200214
x86_64 randconfig-b001-20200213
x86_64 randconfig-b002-20200213
x86_64 randconfig-b003-20200213
i386 randconfig-b001-20200213
i386 randconfig-b002-20200213
i386 randconfig-b003-20200213
x86_64 randconfig-b001-20200214
x86_64 randconfig-b002-20200214
x86_64 randconfig-b003-20200214
i386 randconfig-b001-20200214
i386 randconfig-b002-20200214
i386 randconfig-b003-20200214
x86_64 randconfig-c001-20200213
x86_64 randconfig-c002-20200213
x86_64 randconfig-c003-20200213
i386 randconfig-c001-20200213
i386 randconfig-c002-20200213
i386 randconfig-c003-20200213
x86_64 randconfig-c001-20200214
x86_64 randconfig-c002-20200214
x86_64 randconfig-c003-20200214
i386 randconfig-c001-20200214
i386 randconfig-c002-20200214
i386 randconfig-c003-20200214
x86_64 randconfig-d001-20200213
x86_64 randconfig-d002-20200213
x86_64 randconfig-d003-20200213
i386 randconfig-d001-20200213
i386 randconfig-d002-20200213
i386 randconfig-d003-20200213
x86_64 randconfig-d001-20200214
x86_64 randconfig-d002-20200214
x86_64 randconfig-d003-20200214
i386 randconfig-d001-20200214
i386 randconfig-d002-20200214
i386 randconfig-d003-20200214
x86_64 randconfig-e001-20200213
x86_64 randconfig-e002-20200213
x86_64 randconfig-e003-20200213
i386 randconfig-e001-20200213
i386 randconfig-e002-20200213
i386 randconfig-e003-20200213
x86_64 randconfig-e001-20200214
x86_64 randconfig-e002-20200214
x86_64 randconfig-e003-20200214
i386 randconfig-e001-20200214
i386 randconfig-e002-20200214
i386 randconfig-e003-20200214
x86_64 randconfig-f001-20200213
x86_64 randconfig-f002-20200213
x86_64 randconfig-f003-20200213
i386 randconfig-f001-20200213
i386 randconfig-f002-20200213
i386 randconfig-f003-20200213
x86_64 randconfig-f001-20200214
x86_64 randconfig-f002-20200214
x86_64 randconfig-f003-20200214
i386 randconfig-f001-20200214
i386 randconfig-f002-20200214
i386 randconfig-f003-20200214
x86_64 randconfig-g001-20200213
x86_64 randconfig-g002-20200213
x86_64 randconfig-g003-20200213
i386 randconfig-g001-20200213
i386 randconfig-g002-20200213
i386 randconfig-g003-20200213
x86_64 randconfig-g001-20200214
x86_64 randconfig-g002-20200214
x86_64 randconfig-g003-20200214
i386 randconfig-g001-20200214
i386 randconfig-g002-20200214
i386 randconfig-g003-20200214
x86_64 randconfig-h001-20200214
x86_64 randconfig-h002-20200214
x86_64 randconfig-h003-20200214
i386 randconfig-h001-20200214
i386 randconfig-h002-20200214
i386 randconfig-h003-20200214
x86_64 randconfig-h001-20200213
x86_64 randconfig-h002-20200213
x86_64 randconfig-h003-20200213
i386 randconfig-h001-20200213
i386 randconfig-h002-20200213
i386 randconfig-h003-20200213
arc randconfig-a001-20200213
arm randconfig-a001-20200213
arm64 randconfig-a001-20200213
ia64 randconfig-a001-20200213
powerpc randconfig-a001-20200213
sparc randconfig-a001-20200213
riscv rv32_defconfig
s390 allmodconfig
s390 allnoconfig
s390 allyesconfig
s390 debug_defconfig
s390 zfcpdump_defconfig
sh allmodconfig
sh allnoconfig
sh rsk7269_defconfig
sparc64 allyesconfig
sparc64 defconfig
um defconfig
um x86_64_defconfig
x86_64 fedora-25
x86_64 kexec
x86_64 lkp
x86_64 rhel
x86_64 rhel-7.2-clear
x86_64 rhel-7.6
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH nf-next v4 5/9] nf_tables: Add set type for arbitrary concatenation of ranges
From: Pablo Neira Ayuso @ 2020-02-14 20:42 UTC
To: Stefano Brivio
Cc: netfilter-devel, Florian Westphal, Kadlecsik József,
Eric Garver, Phil Sutter
In-Reply-To: <20200214204213.50b54ed4@redhat.com>
On Fri, Feb 14, 2020 at 08:42:13PM +0100, Stefano Brivio wrote:
> On Fri, 14 Feb 2020 19:16:34 +0100
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
[...]
> > You refer to a property that says that you can split a range into
> > 2*n netmasks IIRC. Do you know what is the worst case when splitting
> > ranges?
>
> I'm not sure I got your question: that is exactly the worst case, i.e.
> we can have _up to_ 2 * n netmasks (hence rules) given a range of n
> bits. There's an additional upper bound on this, given by the address
> space, but single fields in a concatenation can overlap.
>
> For example, we can have up to 128 rules for an IPv6 range where at
> least 64 bits differ between the endpoints, and which would contain
> 2 ^ 64 addresses. Or, say, the IPv4 range 1.2.3.4 - 255.255.0.2 is
> expressed by 42 rules.
>
> By the way, 0.0.0.1 - 255.255.255.254 takes 62 rules, so we can
> *probably* say it's 2 * n - 2, but I don't have a formal proof for that.
By "splitting" I was actually referring to "expanding", so you're
replying here to my worst-case range-to-rules expansion question.
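To make the numbers quoted above concrete, here is a minimal sketch that counts how many aligned power-of-two blocks (netmasks) a greedy split needs for an IPv4 range. It is an illustration under that assumption, not the pipapo expansion code from the patch, and count_netmasks is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Count the aligned power-of-two blocks needed to cover [start, end].
 * Hypothetical helper for illustration; not taken from the patch.
 */
static unsigned int count_netmasks(uint32_t start, uint32_t end)
{
	uint64_t cur = start;	/* 64-bit so cur can step past 0xffffffff */
	uint64_t last = end;
	unsigned int rules = 0;

	assert(start <= end);

	while (cur <= last) {
		uint64_t size = 1;

		/* Double the block while it stays aligned at cur and
		 * does not run past the end of the range.
		 */
		while ((cur & (2 * size - 1)) == 0 &&
		       cur + 2 * size - 1 <= last)
			size *= 2;

		cur += size;
		rules++;
	}

	return rules;
}
```

With this sketch, count_netmasks(0x01020304, 0xffff0002), i.e. 1.2.3.4 - 255.255.0.2, gives 42, and count_netmasks(0x00000001, 0xfffffffe) gives 62, matching the figures in the quoted text and the 2 * n - 2 guess for n = 32.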
> I have a couple of ways in mind to get that down to n / 2, but it's not
> straightforward and it will take me some time (assuming it makes
> sense). For the n bound, we can introduce negations (proof in
> literature), and I have some kind of ugly prototype. For the n / 2
> bound, I'd need some auxiliary data structure to keep insertion
> invertible.
OK, so there is room to improve the "rule expansion" logic. I didn't
spend much time on that front yet.
> In practice, the "average" case is much less, but to define it we would
> first need to agree on what are the actual components of the
> multivariate distribution... size and start? Is it a Poisson
> distribution then? After spending some time on this and disagreeing
> with myself I'd shyly recommend to skip the topic. :)
Yes, I agree: sticking to something relatively simple and good is just
fine.
> > There is no ipset set like this, but I agree usecase might happen.
>
> Actually, for ipset, a "net,port,net,port" type was proposed
> (netfilter-devel <20181216213039.399-1-oliver@uptheinter.net>), but when
> József enquired about the intended use case, none was given. So maybe
> this whole "net,net,port,mac" story makes even less sense.
Would it make sense to you to restrict pipapo to 3 fields until there
is someone with a usecase for this?
[...]
> > The per-cpu scratch index is only required if we cannot fit in the
> > "result bitmap" into the stack, right?
>
> Right.
>
> > Probably up to 256 bytes result bitmap in the stack is reasonable?
> > That makes 8192 pipapo rules. There will be no need to disable bh and
> > make use of the percpu scratchpad area in that case.
>
> Right -- the question is whether that would mean yet another
> implementation for the lookup function.
This would need another lookup function that can be selected from the
control plane path. The set size and the range-to-rule expansion
worst case can tell us if it would fit into the stack. It would be
just one extra lookup function for this case, ~80-100 LOC.
> > If adjusting the code to deal with variable length "pipapo word" size
> > is not too convoluted, then you could just deal with the variable word
> > size from the insert / delete / get (slow) path and register one
> > lookup function for the version that is optimized for this pipapo word
> > size.
>
> Yes, I like this a lot -- we would also need one function to rebuild
> tables when the word size changes, but that sounds almost trivial.
> Changes for the slow path are actually rather simple.
>
> Still, I start doubting quite heavily that my original worst case is
> reasonable. If we stick to the one you mentioned, or even something in
> between, it makes no sense to keep 4-bit buckets.
OK, then moving to 8-bits will probably remove a bit of code which is
dealing with "nibbles".
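As a rough illustration of the trade-off between 4-bit and 8-bit buckets, the sketch below assumes that a field of field_bits bits is split into groups of group_bits bits and that each group indexes 2^group_bits buckets. The struct and function names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one field's lookup table. */
struct lt_shape {
	size_t groups;	/* lookup steps needed per field */
	size_t buckets;	/* total buckets (one bitmap each) per field */
};

static struct lt_shape lt_shape_for(unsigned int field_bits,
				    unsigned int group_bits)
{
	struct lt_shape s;

	assert(group_bits > 0 && group_bits < 16);
	assert(field_bits % group_bits == 0);

	s.groups = field_bits / group_bits;
	s.buckets = s.groups * ((size_t)1 << group_bits);

	return s;
}
```

Under these assumptions a 32-bit field needs 8 groups and 128 buckets with 4-bit groups, versus 4 groups and 1024 buckets with 8-bit groups: half the lookup steps for eight times the buckets.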
> By the way, I went ahead and tried the 8-bit bucket version of the C
> implementation only, on my usual x86_64 box (one thread, AMD Epyc 7351).
> I think it's worth it:
>
>                           4-bit         8-bit
> net,port
>   1000 entries       2304165pps    2901299pps
> port,net
>   100 entries        4131471pps    4751247pps
> net6,port
>   1000 entries       1092557pps    1651037pps
> port,proto
>   30000 entries       284147pps     449665pps
> net6,port,mac
>   10 entries         2082880pps    2762291pps
> net6,port,mac,proto
>   1000 entries        783810pps    1195823pps
> net,mac
>   1000 entries       1279122pps    1934003pps
Assuming the same concatenation type, larger bucket size makes pps
drop in the C implementation?
> I would now proceed extending this to the AVX2 implementation and (once
> I finish it) to the NEON one, I actually expect bigger gains there.
Good. BTW, you can probably add a new NFT_SET_CLASS_JIT class that
comes before NFT_SET_CLASS_O_1, so that the routine that selects the
set picks the jit version instead.
> > Probably adding helper function to deal with pipapo words would help
> > to prepare for such update in the future. There is the ->estimate
> > function that allows to calculate for the best word size depending on
> > all the information this gets from the set definition.
>
> Hm, I really think it should be kind of painless to make this dynamic
> on insertion/deletion.
OK, good. How would you like to proceed?
Thanks!
* Re: [PATCH net] net: add strict checks in netdev_name_node_alt_destroy()
From: Jiri Pirko @ 2020-02-14 20:40 UTC
To: Eric Dumazet; +Cc: David S . Miller, netdev, Eric Dumazet, syzbot, Jiri Pirko
In-Reply-To: <20200214155353.71062-1-edumazet@google.com>
Fri, Feb 14, 2020 at 04:53:53PM CET, edumazet@google.com wrote:
>netdev_name_node_alt_destroy() does a lookup over all
>device names of a namespace.
>
>We need to make sure the name belongs to the device
>of interest, and that we do not destroy its primary
>name, since we rely on it being not deleted :
>dev->name_node would indeed point to freed memory.
>
>syzbot report was the following :
>
>BUG: KASAN: use-after-free in dev_net include/linux/netdevice.h:2206 [inline]
>BUG: KASAN: use-after-free in mld_force_mld_version net/ipv6/mcast.c:1172 [inline]
>BUG: KASAN: use-after-free in mld_in_v2_mode_only net/ipv6/mcast.c:1180 [inline]
>BUG: KASAN: use-after-free in mld_in_v1_mode+0x203/0x230 net/ipv6/mcast.c:1190
>Read of size 8 at addr ffff88809886c588 by task swapper/1/0
>
>CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc1-syzkaller #0
>Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>Call Trace:
> <IRQ>
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x197/0x210 lib/dump_stack.c:118
> print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
> __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
> kasan_report+0x12/0x20 mm/kasan/common.c:641
> __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
> dev_net include/linux/netdevice.h:2206 [inline]
> mld_force_mld_version net/ipv6/mcast.c:1172 [inline]
> mld_in_v2_mode_only net/ipv6/mcast.c:1180 [inline]
> mld_in_v1_mode+0x203/0x230 net/ipv6/mcast.c:1190
> mld_send_initial_cr net/ipv6/mcast.c:2083 [inline]
> mld_dad_timer_expire+0x24/0x230 net/ipv6/mcast.c:2118
> call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404
> expire_timers kernel/time/timer.c:1449 [inline]
> __run_timers kernel/time/timer.c:1773 [inline]
> __run_timers kernel/time/timer.c:1740 [inline]
> run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786
> __do_softirq+0x262/0x98c kernel/softirq.c:292
> invoke_softirq kernel/softirq.c:373 [inline]
> irq_exit+0x19b/0x1e0 kernel/softirq.c:413
> exiting_irq arch/x86/include/asm/apic.h:546 [inline]
> smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1146
> apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
> </IRQ>
>RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
>Code: 68 73 c5 f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 94 be 59 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 84 be 59 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 de 2a 74 f9 e8 09
>RSP: 0018:ffffc90000d3fd68 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
>RAX: 1ffffffff136761a RBX: ffff8880a99fc340 RCX: 0000000000000000
>RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffff8880a99fcbd4
>RBP: ffffc90000d3fd98 R08: ffff8880a99fc340 R09: 0000000000000000
>R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
>R13: ffffffff8aa5a1c0 R14: 0000000000000000 R15: 0000000000000001
> arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:686
> default_idle_call+0x84/0xb0 kernel/sched/idle.c:94
> cpuidle_idle_call kernel/sched/idle.c:154 [inline]
> do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269
> cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361
> start_secondary+0x2f4/0x410 arch/x86/kernel/smpboot.c:264
> secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242
>
>Allocated by task 10229:
> save_stack+0x23/0x90 mm/kasan/common.c:72
> set_track mm/kasan/common.c:80 [inline]
> __kasan_kmalloc mm/kasan/common.c:515 [inline]
> __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:488
> kasan_kmalloc+0x9/0x10 mm/kasan/common.c:529
> __do_kmalloc_node mm/slab.c:3616 [inline]
> __kmalloc_node+0x4e/0x70 mm/slab.c:3623
> kmalloc_node include/linux/slab.h:578 [inline]
> kvmalloc_node+0x68/0x100 mm/util.c:574
> kvmalloc include/linux/mm.h:645 [inline]
> kvzalloc include/linux/mm.h:653 [inline]
> alloc_netdev_mqs+0x98/0xe40 net/core/dev.c:9797
> rtnl_create_link+0x22d/0xaf0 net/core/rtnetlink.c:3047
> __rtnl_newlink+0xf9f/0x1790 net/core/rtnetlink.c:3309
> rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377
> rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
> netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
> rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
> netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
> netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
> sock_sendmsg_nosec net/socket.c:652 [inline]
> sock_sendmsg+0xd7/0x130 net/socket.c:672
> __sys_sendto+0x262/0x380 net/socket.c:1998
> __do_compat_sys_socketcall net/compat.c:771 [inline]
> __se_compat_sys_socketcall net/compat.c:719 [inline]
> __ia32_compat_sys_socketcall+0x530/0x710 net/compat.c:719
> do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
> do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
> entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>
>Freed by task 10229:
> save_stack+0x23/0x90 mm/kasan/common.c:72
> set_track mm/kasan/common.c:80 [inline]
> kasan_set_free_info mm/kasan/common.c:337 [inline]
> __kasan_slab_free+0x102/0x150 mm/kasan/common.c:476
> kasan_slab_free+0xe/0x10 mm/kasan/common.c:485
> __cache_free mm/slab.c:3426 [inline]
> kfree+0x10a/0x2c0 mm/slab.c:3757
> __netdev_name_node_alt_destroy+0x1ff/0x2a0 net/core/dev.c:322
> netdev_name_node_alt_destroy+0x57/0x80 net/core/dev.c:334
> rtnl_alt_ifname net/core/rtnetlink.c:3518 [inline]
> rtnl_linkprop.isra.0+0x575/0x6f0 net/core/rtnetlink.c:3567
> rtnl_dellinkprop+0x46/0x60 net/core/rtnetlink.c:3588
> rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
> netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
> rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
> netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
> netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
> netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
> sock_sendmsg_nosec net/socket.c:652 [inline]
> sock_sendmsg+0xd7/0x130 net/socket.c:672
> ____sys_sendmsg+0x753/0x880 net/socket.c:2343
> ___sys_sendmsg+0x100/0x170 net/socket.c:2397
> __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
> __compat_sys_sendmsg net/compat.c:642 [inline]
> __do_compat_sys_sendmsg net/compat.c:649 [inline]
> __se_compat_sys_sendmsg net/compat.c:646 [inline]
> __ia32_compat_sys_sendmsg+0x7a/0xb0 net/compat.c:646
> do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
> do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
> entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>
>The buggy address belongs to the object at ffff88809886c000
> which belongs to the cache kmalloc-4k of size 4096
>The buggy address is located 1416 bytes inside of
> 4096-byte region [ffff88809886c000, ffff88809886d000)
>The buggy address belongs to the page:
>page:ffffea0002621b00 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
>flags: 0xfffe0000010200(slab|head)
>raw: 00fffe0000010200 ffffea0002610d08 ffffea0002607608 ffff8880aa402000
>raw: 0000000000000000 ffff88809886c000 0000000100000001 0000000000000000
>page dumped because: kasan: bad access detected
>
>Memory state around the buggy address:
> ffff88809886c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88809886c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>ffff88809886c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff88809886c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88809886c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>
>Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
>Signed-off-by: Eric Dumazet <edumazet@google.com>
>Reported-by: syzbot <syzkaller@googlegroups.com>
>Cc: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Thanks Eric!
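The two conditions named in the commit message (the name must belong to the device of interest, and must not be its primary name) can be sketched in a small userspace model. The struct layouts, the function name and the error codes below are assumptions for illustration and are not copied from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical stand-in for a name node: it records its owner. */
struct model_name_node {
	struct model_dev *dev;
};

/* Hypothetical stand-in for a device: it points at its primary name. */
struct model_dev {
	struct model_name_node *name_node;
};

/* Decide whether the node found by a namespace-wide lookup may be
 * destroyed as an alternative name of dev.
 */
static int model_alt_name_destroy_check(const struct model_dev *dev,
					const struct model_name_node *found)
{
	if (!found)
		return -ENOENT;

	/* The lookup may have returned the primary name, or a name
	 * that belongs to another device in the same namespace.
	 */
	if (found == dev->name_node || found->dev != dev)
		return -EINVAL;

	return 0;
}
```

Only when the check returns 0 would the caller go on to free the node. Refusing the primary name is what keeps dev->name_node from pointing at freed memory.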
* Re: [linux-lvm] commit c527a0cbfc3 may have a bug
From: David Teigland @ 2020-02-14 20:40 UTC
To: Gionatan Danti; +Cc: linux-lvm, heming.zhao
In-Reply-To: <1f438b012d606d06d77ff9a1fc3a6926@assyoma.it>
On Fri, Feb 14, 2020 at 08:34:19PM +0100, Gionatan Danti wrote:
> Hi David, since filters are one of the most asked-about topics, can I ask
> why we have so many different filters, leading to such complex
> interactions and behaviors?
>
> Don't get me wrong: I am sure you (the lvm team) have very good reasons to
> do that, and I am surely missing something? But what, precisely? How should
> we (end users) consider filters? Should we only use global_filter?
You're right, filters are difficult to understand and use correctly. The
complexity and confusion in the code is no better. With the removal of
lvmetad in 2.03 versions (e.g. RHEL8) there's no difference between filter
and global_filter, so that's some small improvement. But, I think filters
should be replaced or overhauled with something easier to use and more
useful at a technical level.
I've created a bz about that and welcome thoughts about what a replacement
should or should not be like. With input the work is more likely to be
prioritized.
https://bugzilla.redhat.com/show_bug.cgi?id=1803266
* Re: [PATCH RFC] memory: Don't allow to resize RAM while migrating
From: Peter Xu @ 2020-02-14 20:38 UTC
To: David Hildenbrand
Cc: Eduardo Habkost, Juan Quintela, Michael S. Tsirkin,
Richard Henderson, Dr. David Alan Gilbert,
Shameerali Kolothum Thodi, qemu-devel, Shannon Zhao,
Paolo Bonzini, Igor Mammedov, David Hildenbrand, Alex Bennée
In-Reply-To: <A5C9F372-A9A6-4D6C-8C08-798F4ED15C10@redhat.com>
On Fri, Feb 14, 2020 at 03:04:23PM -0500, David Hildenbrand wrote:
>
>
> > Am 14.02.2020 um 20:45 schrieb Peter Xu <peterx@redhat.com>:
> >
> > On Fri, Feb 14, 2020 at 07:26:59PM +0100, David Hildenbrand wrote:
> >>>>>> + if (!postcopy_is_running()) {
> >>>>>> + Error *err = NULL;
> >>>>>> +
> >>>>>> + /*
> >>>>>> + * Precopy code cannot deal with the size of ram blocks changing at
> >>>>>> + * random points in time. We're still running on the source, abort
> >>>>>> + * the migration and continue running here. Make sure to wait until
> >>>>>> + * migration was canceled.
> >>>>>> + */
> >>>>>> + error_setg(&err, "RAM resized during precopy.");
> >>>>>> + migrate_set_error(migrate_get_current(), err);
> >>>>>> + error_free(err);
> >>>>>> + migration_cancel();
> >>>>>> + } else {
> >>>>>> + /*
> >>>>>> + * Postcopy code cannot deal with the size of ram blocks changing at
> >>>>>> + * random points in time. We're running on the target. Fail hard.
> >>>>>> + *
> >>>>>> + * TODO: How to handle this in a better way?
> >>>>>> + */
> >>>>>> + error_report("RAM resized during postcopy.");
> >>>>>> + exit(-1);
> >>>>>
> >>>>> Now I'm rethinking the postcopy case....
> >>>>>
> >>>>> ram_dirty_bitmap_reload() should only happen during the postcopy
> >>>>> recovery, and when that happens the VM should be stopped on both
> >>>>> sides. Which means, ram resizing should not trigger there...
> >>>>
> >>>> But that guest got the chance to run for a bit and eventually reboot
> >>>> AFAIK. Also, there are other data races possible when used_length
> >>>> suddenly changes, this is just the most obvious one where things will
> >>>> get screwed up.
> >>>
> >>> Right, the major one could be in ram_load_postcopy() where we call
> >>> host_from_ram_block_offset(). However if FW extension is the major
> >>> use case then it seems to still work (still better than crashing,
> >>> isn't it? :).
> >>
> >> "Let's close our eyes and hope it will never happen" ? :) No, I don't
> >> like that. This screams for a better solution long term, and until then
> >> a proper fencing IMHO. We're making here wild guesses about data races
> >> and why they might not be that bad in certain cases (did I mention
> >> load/store tearing? used_length is not an atomic value ...).
> >
> > Yeah fencing is good, but crashing a VM while it wasn't going to crash
> > is another thing, imho. You can dump an error message if you really
> > like, but instead of exit() I really prefer we either still let the
> > old way to at least work or really fix it.
>
> I'll do whatever Juan/Dave think is best. I am not convinced that there is
> no way to corrupt data or crash later when the guest is already running
> again post-reboot and doing real work.
Yeah I never said it will always work. :)
However, that does not mean it'll break every time. My guess is that
even when this happens, the VM might still survive in quite a few cases,
though I confess that's without much evidence. I just prefer to avoid
having an explicit patch to bail out like that, because crashing earlier
doesn't really help that much.
That's something I learnt when I started to work on migration: we
don't call exit() on the source VM unless we really, really need to.
For postcopy, it's the destination VM that matters here.
Yeah, not a big deal, since this is really a corner case even if it
happens. Let's follow the maintainers' judgement.
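On the load/store tearing remark quoted earlier in this message, a minimal C11 sketch of reading and writing a length field atomically is shown here. The struct and helper names are hypothetical and this is not QEMU's API. It only rules out torn reads of the value itself and does nothing about the larger question of migration code observing a size that changes underneath it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of a RAM block with an atomically accessed size. */
struct model_ram_block {
	_Atomic uint64_t used_length;
};

/* Writer side: publish the new size in a single atomic store. */
static void model_resize(struct model_ram_block *rb, uint64_t new_len)
{
	atomic_store_explicit(&rb->used_length, new_len,
			      memory_order_release);
}

/* Reader side: a single atomic load sees either the old or the new
 * size, never a mix of the two.
 */
static uint64_t model_used_length(struct model_ram_block *rb)
{
	return atomic_load_explicit(&rb->used_length,
				    memory_order_acquire);
}
```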
Thanks,
--
Peter Xu
* Re: [RFC PATCH v4 02/25] ice: Create and register virtual bus for RDMA
From: Jason Gunthorpe @ 2020-02-14 20:39 UTC
To: Jeff Kirsher
Cc: davem, gregkh, Dave Ertman, netdev, linux-rdma, nhorman, sassmann,
Tony Nguyen, Andrew Bowers
In-Reply-To: <20200212191424.1715577-3-jeffrey.t.kirsher@intel.com>
On Wed, Feb 12, 2020 at 11:14:01AM -0800, Jeff Kirsher wrote:
> +/**
> + * ice_init_peer_devices - initializes peer devices
> + * @pf: ptr to ice_pf
> + *
> + * This function initializes peer devices on the virtual bus.
> + */
> +int ice_init_peer_devices(struct ice_pf *pf)
> +{
> + struct ice_vsi *vsi = pf->vsi[0];
> + struct pci_dev *pdev = pf->pdev;
> + struct device *dev = &pdev->dev;
> + int status = 0;
> + int i;
> +
> + /* Reserve vector resources */
> + status = ice_reserve_peer_qvector(pf);
> + if (status < 0) {
> + dev_err(dev, "failed to reserve vectors for peer drivers\n");
> + return status;
> + }
> + for (i = 0; i < ARRAY_SIZE(ice_peers); i++) {
> + struct ice_peer_dev_int *peer_dev_int;
> + struct ice_peer_drv_int *peer_drv_int;
> + struct iidc_qos_params *qos_info;
> + struct iidc_virtbus_object *vbo;
> + struct msix_entry *entry = NULL;
> + struct iidc_peer_dev *peer_dev;
> + struct virtbus_device *vdev;
> + int j;
> +
> + /* structure layout needed for container_of's looks like:
> + * ice_peer_dev_int (internal only ice peer superstruct)
> + * |--> iidc_peer_dev
> + * |--> *ice_peer_drv_int
> + *
> + * iidc_virtbus_object (container_of parent for vdev)
> + * |--> virtbus_device
> + * |--> *iidc_peer_dev (pointer from internal struct)
> + *
> + * ice_peer_drv_int (internal only peer_drv struct)
> + */
> + peer_dev_int = devm_kzalloc(dev, sizeof(*peer_dev_int),
> + GFP_KERNEL);
> + if (!peer_dev_int)
> + return -ENOMEM;
> +
> + vbo = kzalloc(sizeof(*vbo), GFP_KERNEL);
> + if (!vbo) {
> + devm_kfree(dev, peer_dev_int);
> + return -ENOMEM;
> + }
> +
> + peer_drv_int = devm_kzalloc(dev, sizeof(*peer_drv_int),
> + GFP_KERNEL);
To me, this looks like a lifetime mess. All these devm allocations
against the parent object are being referenced through the vbo with a
different kref lifetime. The whole thing has very unclear semantics
about who should be cleaning up on error.
> + if (!peer_drv_int) {
> + devm_kfree(dev, peer_dev_int);
> + kfree(vbo);
i.e. here we free two things
> + return -ENOMEM;
> + }
> +
> + pf->peers[i] = peer_dev_int;
> + vbo->peer_dev = &peer_dev_int->peer_dev;
> + peer_dev_int->peer_drv_int = peer_drv_int;
> + peer_dev_int->peer_dev.vdev = &vbo->vdev;
> +
> + /* Initialize driver values */
> + for (j = 0; j < IIDC_EVENT_NBITS; j++)
> + bitmap_zero(peer_drv_int->current_events[j].type,
> + IIDC_EVENT_NBITS);
> +
> + mutex_init(&peer_dev_int->peer_dev_state_mutex);
> +
> + peer_dev = &peer_dev_int->peer_dev;
> + peer_dev->peer_ops = NULL;
> + peer_dev->hw_addr = (u8 __iomem *)pf->hw.hw_addr;
> + peer_dev->peer_dev_id = ice_peers[i].id;
> + peer_dev->pf_vsi_num = vsi->vsi_num;
> + peer_dev->netdev = vsi->netdev;
> +
> + peer_dev_int->ice_peer_wq =
> + alloc_ordered_workqueue("ice_peer_wq_%d", WQ_UNBOUND,
> + i);
> + if (!peer_dev_int->ice_peer_wq)
> + return -ENOMEM;
Here we free nothing
> +
> + peer_dev->pdev = pdev;
> + qos_info = &peer_dev->initial_qos_info;
> +
> + /* setup qos_info fields with defaults */
> + qos_info->num_apps = 0;
> + qos_info->num_tc = 1;
> +
> + for (j = 0; j < IIDC_MAX_USER_PRIORITY; j++)
> + qos_info->up2tc[j] = 0;
> +
> + qos_info->tc_info[0].rel_bw = 100;
> + for (j = 1; j < IEEE_8021QAZ_MAX_TCS; j++)
> + qos_info->tc_info[j].rel_bw = 0;
> +
> + /* for DCB, override the qos_info defaults. */
> + ice_setup_dcb_qos_info(pf, qos_info);
> +
> + /* make sure peer specific resources such as msix_count and
> + * msix_entries are initialized
> + */
> + switch (ice_peers[i].id) {
> + case IIDC_PEER_RDMA_ID:
> + if (test_bit(ICE_FLAG_IWARP_ENA, pf->flags)) {
> + peer_dev->msix_count = pf->num_rdma_msix;
> + entry = &pf->msix_entries[pf->rdma_base_vector];
> + }
> + break;
> + default:
> + break;
> + }
> +
> + peer_dev->msix_entries = entry;
> + ice_peer_state_change(peer_dev_int, ICE_PEER_DEV_STATE_INIT,
> + false);
> +
> + vdev = &vbo->vdev;
> + vdev->name = ice_peers[i].name;
> + vdev->release = ice_peer_vdev_release;
> + vdev->dev.parent = &pdev->dev;
> +
> + status = virtbus_dev_register(vdev);
> + if (status) {
> + virtbus_dev_unregister(vdev);
> + vdev = NULL;
Here we double unregister and free nothing.
You need to go through all of this really carefully and make some kind
of sane lifetime model and fix all the error unwinding :(
Why doesn't the release() function of vbo trigger the free of all this
peer related stuff?
Use a sane design model: split this into functions that each allocate a
single piece of memory, use goto error unwind within each function, and
build things up properly.
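A minimal user-space sketch of that unwind style, assuming hypothetical names (`peer_ctx`, `peer_ctx_create`, `ctx_alloc`) and plain `calloc`/`free` standing in for the kernel allocators. It is not the ice driver's code, and the `fail_at` hook exists only to exercise the error paths:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the three per-peer allocations. */
struct peer_dev { int id; };
struct peer_drv { int events; };
struct peer_wq  { int ordered; };

struct peer_ctx {
	struct peer_dev *dev;
	struct peer_drv *drv;
	struct peer_wq *wq;
};

/* Illustration-only hooks: fail the Nth allocation (0 = never fail). */
static int fail_at;
static int alloc_count;
static int live_allocs;

static void *ctx_alloc(size_t size)
{
	void *p;

	if (fail_at && ++alloc_count == fail_at)
		return NULL;
	p = calloc(1, size);
	if (p)
		live_allocs++;
	return p;
}

static void ctx_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Build one peer context; on failure, undo only what was already done. */
static struct peer_ctx *peer_ctx_create(void)
{
	struct peer_ctx *ctx;

	ctx = ctx_alloc(sizeof(*ctx));
	if (!ctx)
		return NULL;

	ctx->dev = ctx_alloc(sizeof(*ctx->dev));
	if (!ctx->dev)
		goto err_free_ctx;

	ctx->drv = ctx_alloc(sizeof(*ctx->drv));
	if (!ctx->drv)
		goto err_free_dev;

	ctx->wq = ctx_alloc(sizeof(*ctx->wq));
	if (!ctx->wq)
		goto err_free_drv;

	return ctx;

err_free_drv:
	ctx_free(ctx->drv);
err_free_dev:
	ctx_free(ctx->dev);
err_free_ctx:
	ctx_free(ctx);
	return NULL;
}

/* Single teardown path, in reverse order of construction. */
static void peer_ctx_destroy(struct peer_ctx *ctx)
{
	if (!ctx)
		return;
	ctx_free(ctx->wq);
	ctx_free(ctx->drv);
	ctx_free(ctx->dev);
	ctx_free(ctx);
}
```

Each failure label frees exactly what was allocated before the failing step, so every error path can be checked by reading one function from top to bottom.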
Jason
^ permalink raw reply
* Re: [PATCH] mt76: mt7615: remove rx_mask in mt7615_eeprom_parse_hw_cap
From: kbuild test robot @ 2020-02-14 20:38 UTC (permalink / raw)
To: kbuild-all
In-Reply-To: <496a58e997ab842d912c5b5352fa6593dc7cc00f.1581455625.git.lorenzo@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 4710 bytes --]
Hi Lorenzo,
I love your patch! Yet something to improve:
[auto build test ERROR on wireless-drivers-next/master]
[also build test ERROR on wireless-drivers/master v5.6-rc1 next-20200214]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Lorenzo-Bianconi/mt76-mt7615-remove-rx_mask-in-mt7615_eeprom_parse_hw_cap/20200215-021915
base: https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git master
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 7.5.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.5.0 make.cross ARCH=ia64
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c: In function 'mt7615_eeprom_parse_hw_cap':
>> drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c:123:39: error: 'rx_mask' undeclared (first use in this function); did you mean 'tx_mask'?
dev->mt76.chainmask = tx_mask << 8 | rx_mask;
^~~~~~~
tx_mask
drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c:123:39: note: each undeclared identifier is reported only once for each function it appears in
vim +123 drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c
f9461a687928df2 Lorenzo Bianconi 2019-05-17 92
c988a77f1de523e Lorenzo Bianconi 2019-05-17 93 static void mt7615_eeprom_parse_hw_cap(struct mt7615_dev *dev)
c988a77f1de523e Lorenzo Bianconi 2019-05-17 94 {
d08f3010f4a32ee Lorenzo Bianconi 2020-02-07 95 u8 *eeprom = dev->mt76.eeprom.data;
30ec8d836cb0539 Lorenzo Bianconi 2020-02-11 96 u8 tx_mask, max_nss;
d08f3010f4a32ee Lorenzo Bianconi 2020-02-07 97 u32 val;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 98
c988a77f1de523e Lorenzo Bianconi 2019-05-17 99 val = FIELD_GET(MT_EE_NIC_WIFI_CONF_BAND_SEL,
c988a77f1de523e Lorenzo Bianconi 2019-05-17 100 eeprom[MT_EE_WIFI_CONF]);
c988a77f1de523e Lorenzo Bianconi 2019-05-17 101 switch (val) {
c988a77f1de523e Lorenzo Bianconi 2019-05-17 102 case MT_EE_5GHZ:
c988a77f1de523e Lorenzo Bianconi 2019-05-17 103 dev->mt76.cap.has_5ghz = true;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 104 break;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 105 case MT_EE_2GHZ:
c988a77f1de523e Lorenzo Bianconi 2019-05-17 106 dev->mt76.cap.has_2ghz = true;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 107 break;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 108 default:
c988a77f1de523e Lorenzo Bianconi 2019-05-17 109 dev->mt76.cap.has_2ghz = true;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 110 dev->mt76.cap.has_5ghz = true;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 111 break;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 112 }
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 113
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 114 /* read tx-rx mask from eeprom */
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 115 val = mt76_rr(dev, MT_TOP_STRAP_STA);
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 116 max_nss = val & MT_TOP_3NSS ? 3 : 4;
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 117
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 118 tx_mask = FIELD_GET(MT_EE_NIC_CONF_TX_MASK,
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 119 eeprom[MT_EE_NIC_CONF_0]);
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 120 if (!tx_mask || tx_mask > max_nss)
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 121 tx_mask = max_nss;
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 122
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 @123 dev->mt76.chainmask = tx_mask << 8 | rx_mask;
acf5457fd99db6c Lorenzo Bianconi 2019-11-14 124 dev->mt76.antenna_mask = BIT(tx_mask) - 1;
c988a77f1de523e Lorenzo Bianconi 2019-05-17 125 }
c988a77f1de523e Lorenzo Bianconi 2019-05-17 126
:::::: The code at line 123 was first introduced by commit
:::::: acf5457fd99db6c9a42ef280494dfee949ee1e09 mt76: mt7615: read {tx,rx} mask from eeprom
:::::: TO: Lorenzo Bianconi <lorenzo@kernel.org>
:::::: CC: Felix Fietkau <nbd@nbd.name>
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 56548 bytes --]
^ permalink raw reply
* Re: [PATCH v2 9/9] perf,tracing: Allow function tracing when !RCU
From: Kim Phillips @ 2020-02-14 20:38 UTC (permalink / raw)
To: Peter Zijlstra, linux-kernel, linux-arch, rostedt
Cc: mingo, joel, gregkh, gustavo, tglx, paulmck, josh,
mathieu.desnoyers, jiangshanlai
In-Reply-To: <20200212210750.312024711@infradead.org>
On 2/12/20 3:01 PM, Peter Zijlstra wrote:
> Since perf is now able to deal with !rcu_is_watching() contexts,
> remove the restraint.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/trace/trace_event_perf.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -477,7 +477,7 @@ static int perf_ftrace_function_register
> {
> struct ftrace_ops *ops = &event->ftrace_ops;
>
> - ops->flags = FTRACE_OPS_FL_RCU;
> + ops->flags = 0;
> ops->func = perf_ftrace_function_call;
> ops->private = (void *)(unsigned long)nr_cpu_ids;
If this is the last user of the flag, should all remaining
FTRACE_OPS_FL_RCU references be removed, too?
Thanks,
Kim
^ permalink raw reply
* Re: [PATCH] pinctrl: ingenic: Make unreachable path more robust
From: Josh Poimboeuf @ 2020-02-14 20:37 UTC (permalink / raw)
To: Paul Cercueil
Cc: Linus Walleij, linux-gpio, linux-kernel, Peter Zijlstra,
Randy Dunlap
In-Reply-To: <1581706938.3.5@crapouillou.net>
On Fri, Feb 14, 2020 at 04:02:18PM -0300, Paul Cercueil wrote:
> Hi Josh,
>
>
> Le ven., févr. 14, 2020 at 10:37, Josh Poimboeuf <jpoimboe@redhat.com> a
> écrit :
> > In the second loop of ingenic_pinconf_set(), it annotates the switch
> > default case as unreachable(). The annotation is technically correct,
> > because that same case would have resulted in an early return in the
> > previous loop.
> >
> > However, if a bug were to get introduced later, for example if an
> > additional case were added to the first loop without adjusting the
> > second loop, it would result in nasty undefined behavior: most likely
> > the function's generated code would fall through to the next function.
> >
> > Another issue is that, while objtool normally understands unreachable()
> > annotations, there's one special case where it doesn't: when the
> > annotation occurs immediately after a 'ret' instruction. That happens
> > to be the case here because unreachable() is immediately before the
> > return.
> >
> > So change the unreachable() to BUG() so that the unreachable code, if
> > ever executed, would panic instead of introducing undefined behavior.
> > This also makes objtool happy.
>
> I don't like the idea that you change this driver's code just to work around
> a bug in objtool, and I don't like the idea of working around a future bug
> that shouldn't be introduced in the first place.
It's not an objtool bug. It's a byproduct of the fact that GCC's
undefined behavior is inscrutable, and there's no way to determine that
it actually *wants* to jump to a random function.
And anyway, regardless of objtool, the patch is meant to make the code
more robust.
Do you not agree that BUG (defined behavior) is more robust than
unreachable (undefined behavior)?
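The difference can be sketched in user space, assuming `abort()` as a stand-in for the kernel's `BUG()` and a hypothetical two-pass function (`apply_configs`) loosely shaped like the one described: validate in the first loop, apply in the second. This is an illustration only, not the pinctrl-ingenic code:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical parameter set; CFG_UNSUPPORTED models an unhandled case. */
enum cfg_param {
	CFG_PULL_UP,
	CFG_PULL_DOWN,
	CFG_BIAS_DISABLE,
	CFG_UNSUPPORTED
};

/*
 * Pass 1 rejects unsupported parameters, pass 2 applies the rest.
 * Returns 0 on success, -1 if any parameter is rejected.
 */
static int apply_configs(const enum cfg_param *cfgs, int n, int *bias)
{
	int i;

	for (i = 0; i < n; i++) {
		switch (cfgs[i]) {
		case CFG_PULL_UP:
		case CFG_PULL_DOWN:
		case CFG_BIAS_DISABLE:
			continue;
		default:
			return -1;
		}
	}

	for (i = 0; i < n; i++) {
		switch (cfgs[i]) {
		case CFG_PULL_UP:
			*bias = 1;
			break;
		case CFG_PULL_DOWN:
			*bias = -1;
			break;
		case CFG_BIAS_DISABLE:
			*bias = 0;
			break;
		default:
			/*
			 * With __builtin_unreachable() here, reaching this
			 * point would be undefined behavior and the compiler
			 * may emit no code for it at all. abort() keeps the
			 * failure well defined if the two loops ever drift
			 * out of sync.
			 */
			fprintf(stderr, "unhandled config %d\n", (int)cfgs[i]);
			abort();
		}
	}
	return 0;
}
```

As long as both loops agree, the default branch of the second loop never runs, so the defined-behavior variant costs nothing on the normal path.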
--
Josh
^ permalink raw reply
* Re: [Virtio-fs] [PATCH v2 0/2] virtiofsd: Fix xattr and ACL
From: Vivek Goyal @ 2020-02-14 20:37 UTC (permalink / raw)
To: misono.tomohiro@fujitsu.com; +Cc: virtio-fs@redhat.com
In-Reply-To: <OSBPR01MB458283800922F1C74C094ABDE5070@OSBPR01MB4582.jpnprd01.prod.outlook.com>
On Fri, Jan 31, 2020 at 02:06:51AM +0000, misono.tomohiro@fujitsu.com wrote:
> > On Tue, Jan 28, 2020 at 07:18:17PM +0900, Misono Tomohiro wrote:
> > > Hi,
> > >
> > > This is a second version of xattr fix for virtiofsd.
> > > I included ACL fix (which introduces new option posix_acl) in this
> > > version too as ACL mostly depends on xattr.
> > >
> > > I run xfstests with XFS backend using "-o xattr -o posix_acl" option
> > > and only new failure is generic/375 which checks if sgid bit is
> > > cleared after setfacl. I'll try to investigate it.
> > >
> > > change in v1 -> v2
> > > - rebased to current dev branch
> > >
> > > - Always chdir for xattr (1st patch)
> > > In v1, I keep current implementation for regular file/dir since it
> > > show better performance in my environment. But I notice opening file
> > > for xattr causes seek sanity test fails (xfstest generic/285, 436).
> > >
> > > I'm not sure what is the fundamental problem here but I believe
> > > performance can be improved by introducing some caching mechanism
> > > in general.
> >
> > Hi Misono,
> >
> > How much is performance degradation due to fchdir(). If it is significant, then I will be inclined to keep original code for dir/file
> > till some other mechanism is introduced to offset the perofrmance loss.
>
> Please refer this replay: https://www.redhat.com/archives/virtio-fs/2020-January/msg00063.html
As per your email, the regression due to fchdir() seems to be in the range of
5% to 10%. That is not trivial, IMO. Maybe it's a good idea to keep the original
logic and use fchdir() only when needed.
Thanks
Vivek
^ permalink raw reply
* Re: [PATCH] net: phy: restore mdio regs in the iproc mdio driver
From: Florian Fainelli @ 2020-02-14 20:37 UTC (permalink / raw)
To: Andrew Lunn, Scott Branden
Cc: Ray Jui, Arun Parameswaran, Russell King, linux-kernel,
bcm-kernel-feedback-list, netdev, David S . Miller,
linux-arm-kernel, Heiner Kallweit
In-Reply-To: <20200214203310.GQ31084@lunn.ch>
On 2/14/20 12:33 PM, Andrew Lunn wrote:
> On Fri, Feb 14, 2020 at 11:48:58AM -0800, Scott Branden wrote:
>> From: Arun Parameswaran <arun.parameswaran@broadcom.com>
>>
>> The mii management register in iproc mdio block
>> does not have a reention register so it is lost on suspend.
>
> reention?
Retention presumably.
--
Florian
^ permalink raw reply
* Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed
From: Ashish Kalra @ 2020-02-14 20:36 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
Radim Krcmar, Joerg Roedel, Borislav Petkov, Tom Lendacky,
David Rientjes, X86 ML, kvm list, LKML, brijesh.singh
In-Reply-To: <CALCETrX=ycjSuf_N_ff-VQtqq2_RoawuAqdkM+bCPn_2_swkjg@mail.gmail.com>
On Fri, Feb 14, 2020 at 10:56:53AM -0800, Andy Lutomirski wrote:
> On Thu, Feb 13, 2020 at 2:28 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> >
> > On Wed, Feb 12, 2020 at 09:42:02PM -0800, Andy Lutomirski wrote:
> > >> On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@amd.com> wrote:
> > >> >
> > >> > From: Brijesh Singh <brijesh.singh@amd.com>
> > > >
> > > > Invoke a hypercall when a memory region is changed from encrypted ->
> > > > decrypted and vice versa. Hypervisor need to know the page encryption
> > > > status during the guest migration.
> > >>
> > >> What happens if the guest memory status doesn't match what the
> > >> hypervisor thinks it is? What happens if the guest gets migrated
> > >> between the hypercall and the associated flushes?
> >
> > This is basically same as the dirty page tracking and logging being done
> > during Live Migration. As with dirty page tracking and logging we
> > maintain a page encryption bitmap in the kernel which keeps tracks of
> > guest's page encrypted/decrypted state changes and this bitmap is
> > sync'ed regularly from kernel to qemu and also during the live migration
> > process, therefore any dirty pages whose encryption status will change
> > during migration, should also have their page status updated when the
> > page encryption bitmap is sync'ed.
> >
> > Also i think that when the amount of dirty pages reach a low threshold,
> > QEMU stops the source VM and then transfers all the remaining dirty
> > pages, so at that point, there will also be a final sync of the page
> > encryption bitmap, there won't be any hypercalls after this as the
> > source VM has been stopped and the remaining VM state gets transferred.
>
> And have you ensured that, in the inevitable race when a guest gets
> migrated part way through an encryption state change, that no data
> corruption occurs?
We ensure that we send the page encryption state bitmap to the
destination VM at migration completion and when the remaining
RAM/dirty pages are flushed. Also, as the source VM is stopped before
this flush of the remaining blocks occurs, any encryption state change
hypercalls will have completed before that.
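The tracking scheme described in this thread can be pictured as a per-page bitmap that is updated on every status-change hypercall and copied out on each sync. The sketch below is a user-space model under assumed names (`enc_bitmap`, `enc_bitmap_set_range`, `enc_bitmap_sync`) and an arbitrary 256-page guest; it does not reflect the actual KVM or QEMU interfaces:

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 256
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NWORDS ((NPAGES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per guest page: 1 = encrypted, 0 = decrypted. */
struct enc_bitmap {
	unsigned long bits[NWORDS];
};

/* Model of the hypercall: mark pages [gfn, gfn + npages) as (de)crypted. */
static int enc_bitmap_set_range(struct enc_bitmap *bm, size_t gfn,
				size_t npages, int encrypted)
{
	size_t i;

	if (gfn > NPAGES || npages > NPAGES - gfn)
		return -1;

	for (i = gfn; i < gfn + npages; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (encrypted)
			bm->bits[i / BITS_PER_WORD] |= mask;
		else
			bm->bits[i / BITS_PER_WORD] &= ~mask;
	}
	return 0;
}

static int enc_bitmap_test(const struct enc_bitmap *bm, size_t gfn)
{
	return (int)((bm->bits[gfn / BITS_PER_WORD] >> (gfn % BITS_PER_WORD)) & 1UL);
}

/* Model of a sync: snapshot the current state for the migration stream. */
static void enc_bitmap_sync(const struct enc_bitmap *src, struct enc_bitmap *dst)
{
	memcpy(dst, src, sizeof(*dst));
}
```

In this model a snapshot goes stale as soon as another range update arrives, which is why the argument above rests on the final sync happening only after the source VM has stopped issuing hypercalls.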
Thanks,
Ashish
^ permalink raw reply
* Re: [PATCH] x86/mce/amd: Fix kobject lifetime
From: Borislav Petkov @ 2020-02-14 20:36 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Greg KH, stable, X86 ML, Yazen Ghannam, LKML
In-Reply-To: <87a75kud8o.fsf@nanos.tec.linutronix.de>
On Fri, Feb 14, 2020 at 09:26:31PM +0100, Thomas Gleixner wrote:
> This once Cc'ed stable but lacked a Cc: stable tag in the changelog.
So that's the difference. Ok, I'm fine with that.
/me removes "suppresscc = bodycc" from his .gitconfig again.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply
* Re: s390 depending on cc-options makes it difficult to configure
From: Jeremy Cline @ 2020-02-14 20:35 UTC (permalink / raw)
To: Masahiro Yamada
Cc: Philipp Rudo, Michal Kubecek, Linux Kernel Mailing List,
Heiko Carstens, Vasily Gorbik, linux-s390
In-Reply-To: <CAK7LNATL3Oyn=FLKm0TcB9SkJLuCOWV06a_t-FRtFiFp9Vda1g@mail.gmail.com>
On Fri, Feb 14, 2020 at 12:31:05PM +0900, Masahiro Yamada wrote:
> Hi.
>
> On Tue, Feb 11, 2020 at 3:49 AM Philipp Rudo <prudo@linux.ibm.com> wrote:
> >
> > Hey Jeremy,
> > Hey Michal,
> >
> > sorry for the late response. The mail got lost in the pre-xmas rush...
> >
> > In my opinion the problem goes beyond s390 and the commit you mentioned. So I'm
> > also adding Masahiro as Kconfig maintainer and author of cc-option.
>
>
> I did not notice the former discussion.
> Thanks for CC'ing me.
>
>
>
>
> > On Wed, 11 Dec 2019 12:18:22 -0500
> > Jeremy Cline <jcline@redhat.com> wrote:
> >
> > > On Tue, Dec 10, 2019 at 10:01:08AM +0100, Michal Kubecek wrote:
> > > > On Mon, Dec 09, 2019 at 11:41:55AM -0500, Jeremy Cline wrote:
> > > > > Hi folks,
> > > > >
> > > > > Commit 5474080a3a0a ("s390/Kconfig: make use of 'depends on cc-option'")
> > > > > makes it difficult to produce an s390 configuration for Fedora and Red
> > > > > Hat kernels.
> > > > >
> > > > > The issue is I have the following configurations:
> > > > >
> > > > > CONFIG_MARCH_Z13=y
> > > > > CONFIG_TUNE_Z14=y
> > > > > # CONFIG_TUNE_DEFAULT is not set
> > > > >
> > > > > When the configuration is prepared on a non-s390x host without a
> > > > > compiler with -march=z* it changes CONFIG_TUNE_DEFAULT to y which, as
> > > > > far as I can tell, leads to a kernel tuned for z13 instead of z14.
> > > > > Fedora and Red Hat build processes produce complete configurations from
> > > > > snippets on any available host in the build infrastructure which very
> > > > > frequently is *not* s390.
> > > >
> > > > We have exactly the same problem. Our developers need to update config
> > > > files for different architectures and different kernel versions on their
> > > > machines which are usually x86_64 but that often produces different
> > > > configs than the real build environment.
> > > >
> > > > This is not an issue for upstream development as one usually updates
> > > > configs on the same system where the build takes place but it's a big
> > > > problem for distribution maintainers.
> >
> > If I recall correct the goal was to avoid trouble with clang, as it does not
> > support all processor types with -march. But yeah, in the original
> > consideration we only thought about upstream development and forgot the
> > distros.
> > > > > I did a quick search and couldn't find any other examples of Kconfigs
> > > > > depending on march or mtune compiler flags and it seems like it'd
> > > > > generally problematic for people preparing configurations.
> >
> > True, but not the whole story. Power and Arm64 use cc-option to check for
> > -mstack-protector*, which do not exist on s390. So you have the same problem
> > when you prepare a config for any of them on s390. Thus simply reverting the
> > commit you mentioned above does not solve the problem but merely hides one
> > symptom. Which also means that the original problem will return over and over
> > again in the future.
> >
> > An other reason why I don't think it makes sens to revert the commit is that it
> > would make cc-option as a whole useless. What's the benefit in having cc-option
> > when you are not allowed to use it? Or less provocative, in which use cases is
> > allowed to use cc-option?
>
>
> You are right.
> Reverting the particular s390 commit is not the solution.
>
>
> > > > There are more issues like this. In general, since 4.17 or 4.18, the
> > > > resulting config depends on both architecture and compiler version.
> > > > Earlier, you could simply run "ARCH=... make oldconfig" (or menuconfig)
> > > > to update configs for all architectures and distribution versions.
> > > > Today, you need to use the right compiler version (results with e.g.
> > > > 4.8, 7.4 and 9.2 differ) and architecture.
> > > >
> > >
> > > Yeah, that's also troublesome. This is by no means the first problem
> > > related to the environment at configuration time, but it the most
> > > bothersome to work around (at least for Fedora kernel configuration).
> > >
> > > > At the moment, I'm working around the issue by using chroot environments
> > > > with target distributions (e.g. openSUSE Tumbleweed) and set of cross
> > > > compilers for supported architectures but it's far from perfect and even
> > > > this way, there are problemantic points, e.g. BPFILTER_UMH which depends
> > > > on gcc being able to not only compile but also link.
> > > >
> > > > IMHO the key problem is that .config mixes configuration with
> > > > description of build environment. I have an idea of a solution which
> > > > would consist of
> > > >
> > > > - an option to extract "config" options which describe build
> > > > environment (i.e. their values are determined by running some
> > > > command, rather than reading from a file or asking user) into
> > > > a cache file
> > > > - an option telling "make *config" to use such cache file for these
> > > > environment "config" options instead of running the test scripts
> > > > (and probably issue an error if an environment option is missing)
> > > >
> > >
> > > I agree that the issue is mixing kernel configuration with build
> > > environment. I suppose a cache file would work, but it still sounds like
> > > a difficult process that is working around that fact that folks are
> > > coupling the configuration step with the build step.
> >
> > An other solution would be a "I know better" switch which simply disables
> > cc-option for that run. That would allow the use of cc-option for upstream
> > development and provide a simple way for distros to turn it off.
> >
> > > I would advocate that this patch be reverted and an effort made to not
> > > mix build environment checks into the configuration. I'm much happier
> > > for the build to fail because the configuration can't be satisfied by
> > > the environment than I am for the configuration to quietly change or for
> > > the tools to not allow me to make the configuration in the first place.
> > > Ideally the tools would warn the user if their environment won't build
> > > the configuration, but that's a nice-to-have.
> >
> > I too would prefer to have a warning instead of the config being silently
> > changed. But again, the problem goes beyond what was reported.
> >
> > @Masahiro: What do you think about it?
> >
> > Thanks
> > Philipp
> >
>
>
> The problem for Jeremy and Michal is,
> it is difficult to get a full-feature cross-compiler
> for every arch.
>
Indeed.
> One idea to workaround this is
> to use a fake script that accepts any flag,
> and use it as $(CC) in Kconfig.
>
> RFC patch is attached.
>
> This is not a perfect solution, of course.
>
The attached patch does look like it'd work for what we need,
although I wonder if it's easier to just have cc-option check whether
an environment variable is defined and, if so, always return y
instead of calling out to $(CC) at all. Comes to the same thing, I
suppose.
>
> Evaluating the compiler in the Kconfig stage
> conceptually has a conflict with the workflow
> of distro maintainers.
>
> I think the only way to solve it completely is,
> ultimately, go back to pre 4.18 situation.
> But, I am not sure if upstream people want to do it.
> At least, Linus was happy to do compiler-tests
> in Kconfig.
>
> I already got several criticism about the
> new feature in Kconfig because it broke the
> workflow of distro maintainers. Sorry about that.
>
No worries, it's a tough balancing act between upstream users and
distros. It's not caused me *that* much bother.
>
> The idea from Michal, separation of the build environment
> description, would work too.
> IIRC, the crosstool-ng project generates some
> Kconfig files based on the environment.
> In hindsight, Kconfig did not need to have cc-option
> but it was how I implemented. I just thought it would be cleaner to
> put cc-option and the CONFIG option depending on it very close.
>
> Anyway, comments to the attachment are appreciated.
>
I believe it would solve our problem so from that perspective, it looks
good to me.
Thanks,
Jeremy
^ permalink raw reply
* Re: [PATCH 02/13] fixup! KVM: selftests: Add support for vcpu_args_set to aarch64 and s390x
From: Christian Borntraeger @ 2020-02-14 20:35 UTC (permalink / raw)
To: Andrew Jones, kvm; +Cc: pbonzini, bgardon, frankja, thuth, peterx
In-Reply-To: <20200214145920.30792-3-drjones@redhat.com>
On 14.02.20 15:59, Andrew Jones wrote:
> [Fixed array index (num => i) and made some style changes.]
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
> .../selftests/kvm/lib/aarch64/processor.c | 24 ++++---------------
subject says s390, the patch not.
^ permalink raw reply
* Re: [PATCH 0/7] Split fsverity-utils into a shared library
From: Eric Biggers @ 2020-02-14 20:35 UTC (permalink / raw)
To: Jes Sorensen; +Cc: linux-fscrypt, kernel-team, Jes Sorensen
In-Reply-To: <c39f57d5-c9a4-5fbb-3ce3-cd21e90ef921@gmail.com>
Hi Jes,
On Tue, Feb 11, 2020 at 06:35:45PM -0500, Jes Sorensen wrote:
> On 2/11/20 6:14 PM, Eric Biggers wrote:
> > On Tue, Feb 11, 2020 at 05:09:22PM -0500, Jes Sorensen wrote:
> >> On 2/11/20 2:22 PM, Eric Biggers wrote:
> >>> Hi Jes,
> >> So I basically want to be able to carry verity signatures in RPM as RPM
> >> internal data, similar to how it supports IMA signatures. I want to be
> >> able to install those without relying on post-install scripts and
> >> signature files being distributed as actual files that gets installed,
> >> just to have to remove them. This is how IMA support is integrated into
> >> RPM as well.
> >>
> >> Right now the RPM approach for signatures involves two steps, a build
> >> digest phase, and a sign the digest phase.
> >>
> >> The reason I included enable and measure was for completeness. I don't
> >> care wildly about those.
> >
> > So the signing happens when the RPM is built, not when it's installed? Are you
> > sure you actually need a library and not just 'fsverity sign' called from a
> > build script?
>
> So the way RPM is handling these is to calculate the digest in one
> place, and sign it in another. Basically the signing is a second step,
> post build, using the rpmsign command. Shelling out is not a good fit
> for this model.
>
> >>> Separately, before you start building something around fs-verity's builtin
> >>> signature verification support, have you also considered adding support for
> >>> fs-verity to IMA? I.e., using the fs-verity hashing mechanism with the IMA
> >>> signature mechanism. The IMA maintainer has been expressed interested in that.
> >>> If rpm already supports IMA signatures, maybe that way would be a better fit?
> >>
> >> I looked at IMA and it is overly complex. It is not obvious to me how
> >> you would get around that without the full complexity of IMA? The beauty
> >> of fsverity's approach is the simplicity of relying on just the kernel
> >> keyring for validation of the signature. If you have explicit
> >> suggestions, I am certainly happy to look at it.
> >
> > fs-verity's builtin signature verification feature is simple, but does it
> > actually do what you need? Note that unlike IMA, it doesn't provide an
> > in-kernel policy about which files have to have signatures and which don't.
> > I.e., to get any authenticity guarantee, before using any files that are
> > supposed to be protected by fs-verity, userspace has to manually check whether
> > the fs-verity bit is actually set. Is that part of your design?
>
> Totally aware of this, and it fits the model I am looking at.
>
Well, this might be a legitimate use case then. We need to define the library
interface as simply as possible, though, so that we can maintain this code in
the future without breaking users. I suggest starting with something along the
lines of:
#ifndef _LIBFSVERITY_H
#define _LIBFSVERITY_H
#include <stddef.h>
#include <stdint.h>
#define FS_VERITY_HASH_ALG_SHA256 1
#define FS_VERITY_HASH_ALG_SHA512 2
struct libfsverity_merkle_tree_params {
uint32_t version;
uint32_t hash_algorithm;
uint32_t block_size;
uint32_t salt_size;
const uint8_t *salt;
size_t reserved[11];
};
struct libfsverity_digest {
uint16_t digest_algorithm;
uint16_t digest_size;
uint8_t digest[];
};
struct libfsverity_signature_params {
const char *keyfile;
const char *certfile;
size_t reserved[11];
};
int libfsverity_compute_digest(int fd,
const struct libfsverity_merkle_tree_params *params,
struct libfsverity_digest **digest_ret);
int libfsverity_sign_digest(const struct libfsverity_digest *digest,
const struct libfsverity_signature_params *sig_params,
void **sig_ret, size_t *sig_size_ret);
#endif /* _LIBFSVERITY_H */
I.e.:
- The stuff in util.h and hash_algs.h isn't exposed to library users.
- The names of all library functions and structs are appropriately prefixed
and avoid collisions with the kernel header.
- Only signing functionality is included.
- There are reserved fields, so we can add more parameters later.
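One common way to keep reserved fields usable later is for the library to reject any call that sets them, so that existing callers cannot pass garbage that a future version would interpret. A sketch, assuming a hypothetical internal helper `params_valid()` and an assumed rule that `version` must be 1; the struct is repeated from the proposal only so the example is self-contained:

```c
#include <stddef.h>
#include <stdint.h>

struct libfsverity_merkle_tree_params {
	uint32_t version;
	uint32_t hash_algorithm;
	uint32_t block_size;
	uint32_t salt_size;
	const uint8_t *salt;
	size_t reserved[11];
};

/*
 * Hypothetical validation helper (not part of the proposed API).
 * Returns 1 if the parameters are acceptable, 0 otherwise.
 */
static int params_valid(const struct libfsverity_merkle_tree_params *params)
{
	size_t i;

	/* Assumption for this sketch: only version 1 exists so far. */
	if (!params || params->version != 1)
		return 0;

	/* A non-zero salt size needs a salt buffer to read from. */
	if (params->salt_size != 0 && !params->salt)
		return 0;

	/* Reserved fields must be zero until they are given a meaning. */
	for (i = 0; i < sizeof(params->reserved) / sizeof(params->reserved[0]); i++) {
		if (params->reserved[i] != 0)
			return 0;
	}
	return 1;
}
```

Callers that zero-initialize the struct (for example with a designated initializer) pass this check without needing to know how many reserved slots exist.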
Before committing to any stable API, it would also be helpful to see the RPM
patches to see what they actually do.
We'd also need to follow shared library best practices like compiling with
-fvisibility=hidden and marking the API functions explicitly with
__attribute__((visibility("default"))), and setting the 'soname' like
-Wl,-soname=libfsverity.so.0.
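As a rough sketch of the visibility part, assuming GCC or Clang and a hypothetical export macro `LIBFSVERITY_EXPORT`: when the library is compiled with -fvisibility=hidden, only functions carrying the attribute end up in the dynamic symbol table. The function name `libfsverity_example_digest_size` is made up for illustration and is not part of the proposed API:

```c
/* Hypothetical export marker; everything else stays hidden by default. */
#define LIBFSVERITY_EXPORT __attribute__((visibility("default")))

/* Internal helper: static, so it is never part of the library's ABI. */
static unsigned int digest_size_for_alg(unsigned int alg)
{
	switch (alg) {
	case 1: /* FS_VERITY_HASH_ALG_SHA256 */
		return 32;
	case 2: /* FS_VERITY_HASH_ALG_SHA512 */
		return 64;
	default:
		return 0;
	}
}

/* Illustrative exported entry point: digest size in bytes, or -1. */
LIBFSVERITY_EXPORT int libfsverity_example_digest_size(unsigned int alg)
{
	unsigned int size = digest_size_for_alg(alg);

	return size ? (int)size : -1;
}
```

Keeping the exported surface explicit like this means internal helpers can be renamed or removed without an soname bump.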
Also, is the GPLv2+ license okay for the use case?
- Eric
^ permalink raw reply
* Re: get-lore-mbox: quickly grab full threads from lore
From: Kevin Hilman @ 2020-02-14 20:35 UTC (permalink / raw)
To: Konstantin Ryabitsev; +Cc: workflows
In-Reply-To: <20200214195318.ghvcroucki4pcz4r@chatter.i7.local>
Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:
> On Fri, Feb 14, 2020 at 11:30:42AM -0800, Kevin Hilman wrote:
>> Konstantin Ryabitsev <konstantin@linuxfoundation.org> writes:
>>
>> > I'd like your opinion on this quick helper script I wrote that uses any
>> > message-id to grab a full thread from lore.kernel.org and save it as a
>> > mbox file.
>>
>> This is very useful, thank you!
>>
>> One question/request: Is there a way for it to only grab a subset of a
>> series? e.g. Some series contain patches that might end up going
>> through a couple different trees (e.g. DT patches typically take a
>> separate path than drivers) so as a maintainer for one of the
>> subsystems, I might want to only get a subset of the series into an
>> mbox, not the whole thing.
>>
>> IOW, Right now even if I pass a msgid from the middle of the series, it
>> finds the whole series (which is cool!), but what if I want to apply
>> just that single patch? Or even better, I might want to only apply
>> patches 3-5 and 9 from a 10-patch series.
>>
>> Is this something do-able?
>
> I think for such cases it's easy enough to just edit the .mbx file to
> remove the patches you're not interested in.
Yes, that was my first "solution", but it's not very easy to
automate. :)
If there needs to be a manual step, I prefer 'git am --interactive'.
Anyways, this tool is really great and it's already replacing some of my
homebrew scripts.
Thanks,
Kevin
^ permalink raw reply
* Re: [RFC PATCH v4 01/25] virtual-bus: Implementation of Virtual Bus
From: Jason Gunthorpe @ 2020-02-14 20:34 UTC (permalink / raw)
To: Greg KH
Cc: Jeff Kirsher, davem, Dave Ertman, netdev, linux-rdma, nhorman,
sassmann, parav, galpress, selvin.xavier, sriharsha.basavapatna,
benve, bharat, xavier.huwei, yishaih, leonro, mkalderon, aditr,
Kiran Patil, Andrew Bowers
In-Reply-To: <20200214170240.GA4034785@kroah.com>
On Fri, Feb 14, 2020 at 09:02:40AM -0800, Greg KH wrote:
> > +/**
> > + * virtbus_dev_register - add a virtual bus device
> > + * @vdev: virtual bus device to add
> > + */
> > +int virtbus_dev_register(struct virtbus_device *vdev)
> > +{
> > + int ret;
> > +
> > + if (!vdev->release) {
> > + dev_err(&vdev->dev, "virtbus_device .release callback NULL\n");
>
> "virtbus_device MUST have a .release callback that does something!\n"
>
> > + return -EINVAL;
> > + }
> > +
> > + device_initialize(&vdev->dev);
> > +
> > + vdev->dev.bus = &virtual_bus_type;
> > + vdev->dev.release = virtbus_dev_release;
> > + /* All device IDs are automatically allocated */
> > + ret = ida_simple_get(&virtbus_dev_ida, 0, 0, GFP_KERNEL);
> > + if (ret < 0) {
> > + dev_err(&vdev->dev, "get IDA idx for virtbus device failed!\n");
> > + put_device(&vdev->dev);
>
> If you allocate the number before device_initialize(), no need to call
> put_device(). Just a minor thing, no big deal.
If *_register does put_device on error then it must always do
put_device on any error; for instance, the above return -EINVAL with
no put_device leaks memory.
Generally I find the design and audit of drivers simpler if the
register doesn't do device_initialize or put_device - have them
distinct and require the caller to manage this.
For instance look at ice_init_peer_devices() and ask who frees
the alloc_ordered_workqueue() if virtbus_dev_register() fails..
It is not at all easy to tell if this is right or not..
> > + put_device(&vdev->dev);
> > + ida_simple_remove(&virtbus_dev_ida, vdev->id);
>
> You need to do this before put_device().
Shouldn't it be in the release function? The ida index should not be
re-used until the kref goes to zero..
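To illustrate the ordering I mean, here is a minimal userspace sketch, not kernel code. Everything in it (toy_dev, toy_id_get/toy_id_put, the fixed-size table standing in for an IDA, the plain int refcount) is hypothetical and single-threaded. The only point is that the ID goes back to the allocator from the release path, once the last reference is dropped.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_IDS 8

/* Toy ID allocator standing in for an IDA (hypothetical, not thread safe). */
static bool id_used[MAX_IDS];

/* Return the lowest free ID, or -1 if none is left. */
static int toy_id_get(void)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!id_used[i]) {
			id_used[i] = true;
			return i;
		}
	}
	return -1;
}

static void toy_id_put(int id)
{
	id_used[id] = false;
}

/* Stand-in for a refcounted device that owns one ID. */
struct toy_dev {
	int refcount;
	int id;
};

static void toy_dev_init(struct toy_dev *d, int id)
{
	d->refcount = 1;
	d->id = id;
}

static void toy_dev_get(struct toy_dev *d)
{
	d->refcount++;
}

/* Runs only when the last reference goes away; the ID is returned here. */
static void toy_dev_release(struct toy_dev *d)
{
	toy_id_put(d->id);
}

static void toy_dev_put(struct toy_dev *d)
{
	if (--d->refcount == 0)
		toy_dev_release(d);
}
```

With this shape, an ID cannot be handed out again while anyone still holds a reference to the old device, which is the property in question.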
> > +struct virtbus_device {
> > + struct device dev;
> > + const char *name;
> > + void (*release)(struct virtbus_device *);
> > + int id;
> > + const struct virtbus_dev_id *matched_element;
> > +};
>
> Any reason you need to make "struct virtbus_device" a public structure
> at all?
The general point of this scheme is to do this in a public header:
+struct iidc_virtbus_object {
+ struct virtbus_device vdev;
+ struct iidc_peer_dev *peer_dev;
+};
And then this when the driver binds:
+int irdma_probe(struct virtbus_device *vdev)
+{
+ struct iidc_virtbus_object *vo =
+ container_of(vdev, struct iidc_virtbus_object, vdev);
+ struct iidc_peer_dev *ldev = vo->peer_dev;
So the virtbus_device is in a public header to enable the container_of
construction.
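For readers less familiar with the idiom, the container_of step can be sketched in plain userspace C. This is a minimal sketch, not the kernel code: the macro is a simplified stand-in without the kernel's type checking, the structures are trimmed, and the peer_id field and peer_from_vdev() helper are made up for illustration.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trimmed stand-in for the structure in the patch. */
struct virtbus_device {
	const char *name;
	int id;
};

struct iidc_peer_dev {
	int peer_id;	/* hypothetical field, for illustration only */
};

/* The wrapper embeds the virtbus_device by value. */
struct iidc_virtbus_object {
	struct virtbus_device vdev;
	struct iidc_peer_dev *peer_dev;
};

/*
 * What a probe routine does: the bus hands over a pointer to the
 * embedded vdev, and the driver recovers the enclosing object from it.
 */
static struct iidc_peer_dev *peer_from_vdev(struct virtbus_device *vdev)
{
	struct iidc_virtbus_object *vo =
		container_of(vdev, struct iidc_virtbus_object, vdev);

	return vo->peer_dev;
}
```

Embedding struct virtbus_device by value needs its complete definition, so it cannot be kept as an opaque type in this scheme.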
Jason
* [PATCH] xfstests: add a CGROUP configuration option
From: Josef Bacik @ 2020-02-14 20:34 UTC (permalink / raw)
To: fstests, linux-btrfs
I want to add some extended statistic gathering for xfstests, but it's
tricky to isolate xfstests from the rest of the host applications. The
most straightforward way to do this is to run every test inside of its
own cgroup. From there we can monitor the activity of tasks in the
specific cgroup using BPF.
The support for this is pretty simple: allow users to specify
CGROUP=/path/to/cgroup. We will create the path if it doesn't already
exist, and validate that we can add things to cgroup.procs. If we cannot,
it'll be disabled; otherwise we will use this when we do _run_seq by
echo'ing the bash pid into cgroup.procs, which will cause any children
to run under that cgroup.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
README | 3 +++
check | 17 ++++++++++++++++-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/README b/README
index 593c1052..722dc170 100644
--- a/README
+++ b/README
@@ -102,6 +102,9 @@ Preparing system for tests:
- set USE_KMEMLEAK=yes to scan for memory leaks in the kernel
after every test, if the kernel supports kmemleak.
- set KEEP_DMESG=yes to keep dmesg log after test
+ - set CGROUP=/path/to/cgroup to create a cgroup to run tests inside
+ of. The main check will run outside of the cgroup, only the test
+ itself and any child processes will run under the cgroup.
- or add a case to the switch in common/config assigning
these variables based on the hostname of your test
diff --git a/check b/check
index 2e148e57..07a0e251 100755
--- a/check
+++ b/check
@@ -509,11 +509,23 @@ _expunge_test()
OOM_SCORE_ADJ="/proc/self/oom_score_adj"
test -w ${OOM_SCORE_ADJ} && echo -1000 > ${OOM_SCORE_ADJ}
+# Initialize the cgroup path if it doesn't already exist
+if [ ! -z "$CGROUP" ]; then
+ mkdir -p ${CGROUP}
+
+ # If we can't write to cgroup.procs then unset cgroup
+ test -w ${CGROUP}/cgroup.procs || unset CGROUP
+fi
+
# ...and make the tests themselves somewhat more attractive to it, so that if
# the system runs out of memory it'll be the test that gets killed and not the
# test framework.
_run_seq() {
- bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq"
+ _extra="test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ};"
+ if [ ! -z "$CGROUP" ]; then
+ _extra+="echo \$\$ > ${CGROUP}/cgroup.procs;"
+ fi
+ bash -c "${_extra} exec ./$seq"
}
_detect_kmemleak
@@ -615,6 +627,9 @@ for section in $HOST_OPTIONS_SECTIONS; do
echo "MKFS_OPTIONS -- `_scratch_mkfs_options`"
echo "MOUNT_OPTIONS -- `_scratch_mount_options`"
fi
+ if [ ! -z "$CGROUP" ]; then
+ echo "CGROUP -- ${CGROUP}"
+ fi
echo
needwrap=true
--
2.24.1
* [PATCH -next] fork: annotate a data race in vm_area_dup()
From: Qian Cai @ 2020-02-14 20:33 UTC (permalink / raw)
To: akpm; +Cc: elver, linux-mm, linux-kernel, Qian Cai
struct vm_area_struct could be accessed concurrently as noticed by
KCSAN,
write to 0xffff9cf8bba08ad8 of 8 bytes by task 14263 on cpu 35:
vma_interval_tree_insert+0x101/0x150:
rb_insert_augmented_cached at include/linux/rbtree_augmented.h:58
(inlined by) vma_interval_tree_insert at mm/interval_tree.c:23
__vma_link_file+0x6e/0xe0
__vma_link_file at mm/mmap.c:629
vma_link+0xa2/0x120
mmap_region+0x753/0xb90
do_mmap+0x45c/0x710
vm_mmap_pgoff+0xc0/0x130
ksys_mmap_pgoff+0x1d1/0x300
__x64_sys_mmap+0x33/0x40
do_syscall_64+0x91/0xc44
entry_SYSCALL_64_after_hwframe+0x49/0xbe
read to 0xffff9cf8bba08a80 of 200 bytes by task 14262 on cpu 122:
vm_area_dup+0x6a/0xe0
vm_area_dup at kernel/fork.c:362
__split_vma+0x72/0x2a0
__split_vma at mm/mmap.c:2661
split_vma+0x5a/0x80
mprotect_fixup+0x368/0x3f0
do_mprotect_pkey+0x263/0x420
__x64_sys_mprotect+0x51/0x70
do_syscall_64+0x91/0xc44
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The write is holding mmap_sem while changing vm_area_struct.shared.rb.
Even though the read is lockless while making a copy, the clone will
have its own shared.rb reinitialized. Thus, mark it as an intentional
data race using the data_race() macro.
Signed-off-by: Qian Cai <cai@lca.pw>
---
kernel/fork.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 41f784b6203a..81bdc6e8a6cf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -359,7 +359,11 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
struct vm_area_struct *new = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
if (new) {
- *new = *orig;
+ /*
+ * @orig may be modified concurrently, but the clone will be
+ * reinitialized.
+ */
+ *new = data_race(*orig);
INIT_LIST_HEAD(&new->anon_vma_chain);
}
return new;
--
1.8.3.1
* Re: [PATCH] net: phy: restore mdio regs in the iproc mdio driver
From: Andrew Lunn @ 2020-02-14 20:33 UTC (permalink / raw)
To: Scott Branden
Cc: Florian Fainelli, Ray Jui, Arun Parameswaran, Russell King,
linux-kernel, bcm-kernel-feedback-list, netdev, David S . Miller,
linux-arm-kernel, Heiner Kallweit
In-Reply-To: <20200214194858.8528-1-scott.branden@broadcom.com>
On Fri, Feb 14, 2020 at 11:48:58AM -0800, Scott Branden wrote:
> From: Arun Parameswaran <arun.parameswaran@broadcom.com>
>
> The mii management register in iproc mdio block
> does not have a reention register so it is lost on suspend.
reention?
Andrew
* [cip-dev] [ANNOUNCE] Release v4.19.103-cip20 and v4.4.213-cip42
From: nobuhiro1.iwamatsu at toshiba.co.jp @ 2020-02-14 20:32 UTC (permalink / raw)
To: cip-dev
Hi,
The CIP kernel team has released Linux kernel v4.19.103-cip20 and v4.4.213-cip42.
The base version of the linux-4.19.y-cip tree has been updated from v4.19.98 to v4.19.103,
and support for many r8a774b1 features has been added.
The base version of the linux-4.4.y-cip tree has been updated from v4.4.208 to v4.4.213,
and QSPI, TPU, PWM, VSP, USB and MSIOF support for r8a7744 has been added.
This release is available from the git tree at:
v4.19.103-cip20:
repository:
https://git.kernel.org/pub/scm/linux/kernel/git/cip/linux-cip.git
branch:
linux-4.19.y-cip
commit hash:
d8d2f780968e403b42ecf61d498032ade45546d0
added commits:
CIP: Bump version suffix to -cip20 after merge from stable
arm64: dts: renesas: r8a774b1: Add USB3.0 device nodes
arm64: dts: renesas: r8a774b1: Add USB-DMAC and HSUSB device nodes
arm64: dts: renesas: r8a774b1: Add USB2.0 phy and host (EHCI/OHCI) device nodes
dt-bindings: usb: renesas_usb3: Document r8a774b1 support
dt-bindings: usb: renesas_gen3: Rename bindings documentation file to reflect IP block
dt-bindings: usb-xhci: Add r8a774b1 support
dt-bindings: rcar-gen3-phy-usb3: Add r8a774b1 support
dt-bindings: usb: renesas_usbhs: Add r8a774b1 support
dt-bindings: usb: renesas_usbhs: Rename bindings documentation file
dt-bindings: dmaengine: usb-dmac: Add binding for r8a774b1
dt-bindings: rcar-gen3-phy-usb2: Add r8a774b1 support
arm64: dts: renesas: r8a774b1: Add Sound and Audio DMAC device nodes
ASoC: rsnd: Document r8a774b1 bindings
arm64: dts: renesas: r8a774a1: Remove audio port node
arm64: dts: renesas: Add support for Advantech idk-1110wr LVDS panel
arm64: dts: renesas: hihope-rzg2-ex: Add LVDS support
drm: rcar-du: lvds: Add r8a774b1 support
arm64: dts: renesas: hihope-rzg2-ex: Enable backlight
arm64: dts: renesas: r8a774b1: Add PWM device nodes
arm64: dts: renesas: r8a774b1: Add FDP1 device nodes
arm64: dts: renesas: r8a774b1-hihope-rzg2n: Add display clock properties
arm64: dts: renesas: r8a774b1: Add HDMI encoder instance
arm64: dts: renesas: r8a774b1: Add DU device to DT
drm: rcar-du: Add R8A774B1 support
arm64: dts: renesas: hihope-common: Move du clk properties out of common dtsi
arm64: dts: renesas: r8a774b1: Connect Ethernet-AVB to IPMMU-DS0
arm64: dts: renesas: r8a774b1: Tie SYS-DMAC to IPMMU-DS0/1
arm64: dts: renesas: r8a774b1: Add VSP instances
arm64: dts: renesas: r8a774b1: Add FCPF and FCPV instances
arm64: dts: renesas: r8a774b1: Add IPMMU device nodes
iommu/ipmmu-vmsa: Hook up r8a774b1 DT matching code
dt-bindings: iommu: ipmmu-vmsa: Add r8a774b1 support
arm64: dts: renesas: r8a774b1: Add CAN and CAN FD support
dt-bindings: can: rcar_canfd: document r8a774b1 support
dt-bindings: can: rcar_can: document r8a774b1 support
arm64: dts: renesas: r8a774b1: Add TMU device nodes
clk: renesas: r8a774b1: Add TMU clock
dt-bindings: timer: renesas: tmu: Document r8a774b1 bindings
arm64: dts: renesas: r8a774b1: Add CMT device nodes
dt-bindings: timer: renesas, cmt: Document r8a774b1 CMT support
arm64: dts: renesas: r8a774b1: Add RZ/G2N thermal support
thermal: rcar_gen3_thermal: Add r8a774b1 support
dt-bindings: thermal: rcar-gen3-thermal: Add r8a774b1 support
arm64: dts: renesas: r8a774b1: Add OPPs table for cpu devices
arm64: dts: renesas: r8a774b1: Add I2C and IIC-DVFS support
dt-bindings: i2c: sh_mobile: Add r8a774b1 support
dt-bindings: i2c: sh_mobile: Rename bindings documentation file
dt-bindings: i2c: rcar: Add r8a774b1 support
dt-bindings: i2c: rcar: Rename bindings documentation file
arm64: dts: renesas: r8a774b1-hihope-rzg2n: Enable HS400 mode
arm64: dts: renesas: r8a774b1: Add SDHI support
mmc: renesas_sdhi_internal_dmac: Add r8a774b1 support
dt-bindings: mmc: renesas_sdhi: Add r8a774b1 support
arm64: dts: renesas: r8a774b1: Add INTC-EX device node
arm64: dts: renesas: hihope-rzg2-ex: Let the board specific DT decide about pciec1
arm64: dts: renesas: r8a774b1: Add PCIe device nodes
arm64: dts: renesas: r8a774b1: Add all MSIOF nodes
arm64: dts: renesas: r8a774b1: Add RWDT node
dt-bindings: watchdog: renesas-wdt: Document r8a774b1 support
dt-bindings: watchdog: Rename bindings documentation file
dt-bindings: spi: sh-msiof: Add r8a774b1 support
arm64: dts: renesas: Add HiHope RZ/G2N sub board support
arm64: dts: renesas: r8a774b1: Add Ethernet AVB node
dt-bindings: net: ravb: Add support for r8a774b1 SoC
arm64: dts: renesas: r8a774b1: Add GPIO device nodes
dt-bindings: gpio: rcar: Add DT binding for r8a774b1
arm64: dts: renesas: r8a774b1: Add SCIF and HSCIF nodes
arm64: dts: renesas: r8a774b1: Add SYS-DMAC device nodes
dt-bindings: dmaengine: rcar-dmac: Document R8A774B1 bindings
v4.4.213-cip42:
repository:
https://git.kernel.org/pub/scm/linux/kernel/git/cip/linux-cip.git
branch:
linux-4.4.y-cip
commit hash:
2507dd95fec1b330c3c62881e43d3d10d44a1a04
added commits:
CIP: Bump version suffix to -cip42 after merge from stable
ARM: dts: r8a7744: Add PCIe Controller device node
ARM: dts: r8a7744: Add xhci support
dt-bindings: usb-xhci: Document r8a7744 support
usb: host: xhci-plat: Add r8a7744 support
ARM: dts: r8a7744-iwg20m: Add SPI NOR support
ARM: dts: r8a7744: Add MSIOF[012] support
ARM: dts: r8a7744: Add QSPI support
ARM: dts: r8a7744: Add TPU support
ARM: dts: r8a7744: Add PWM SoC support
ARM: dts: r8a7744: Add VSP support
Best regards,
Nobuhiro