patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH 6.1 000/219] 6.1.54-rc1 review
@ 2023-09-17 19:12 Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 001/219] net/ipv6: SKB symmetric hash should incorporate transport ports Greg Kroah-Hartman
                   ` (229 more replies)
  0 siblings, 230 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

This is the start of the stable review cycle for the 6.1.54 release.
There are 219 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 6.1.54-rc1

Wesley Chalmers <wesley.chalmers@amd.com>
    drm/amd/display: Fix a bug when searching for insert_above_mpcc

Maciej W. Rozycki <macro@orcam.me.uk>
    MIPS: Only fiddle with CHECKFLAGS if `need-compiler'

Kuniyuki Iwashima <kuniyu@amazon.com>
    kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().

Vadim Fedorenko <vadim.fedorenko@linux.dev>
    ixgbe: fix timestamp configuration code

Kuniyuki Iwashima <kuniyu@amazon.com>
    tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.

Kuniyuki Iwashima <kuniyu@amazon.com>
    tcp: Fix bind() regression for v4-mapped-v6 wildcard address.

Kuniyuki Iwashima <kuniyu@amazon.com>
    tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).

Kuniyuki Iwashima <kuniyu@amazon.com>
    ipv6: Remove in6addr_any alternatives.

Eric Dumazet <edumazet@google.com>
    ipv6: fix ip6_sock_set_addr_preferences() typo

Sascha Hauer <s.hauer@pengutronix.de>
    net: macb: fix sleep inside spinlock

Harini Katakam <harini.katakam@xilinx.com>
    net: macb: Enable PTP unicast

Liu Jian <liujian56@huawei.com>
    net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()

Geert Uytterhoeven <geert+renesas@glider.be>
    platform/mellanox: NVSW_SN2201 should depend on ACPI

Shravan Kumar Ramani <shravankr@nvidia.com>
    platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events

Shravan Kumar Ramani <shravankr@nvidia.com>
    platform/mellanox: mlxbf-pmc: Fix potential buffer overflows

Liming Sun <limings@nvidia.com>
    platform/mellanox: mlxbf-tmfifo: Drop jumbo frames

Liming Sun <limings@nvidia.com>
    platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors

Shigeru Yoshida <syoshida@redhat.com>
    kcm: Fix memory leak in error path of kcm_sendmsg()

Hayes Wang <hayeswang@realtek.com>
    r8152: check budget for r8152_poll()

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid()

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"

Ciprian Regus <ciprian.regus@analog.com>
    net:ethernet:adi:adin1110: Fix forwarding offload

Yang Yingliang <yangyingliang@huawei.com>
    net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address

Ziyang Xuan <william.xuanziyang@huawei.com>
    hsr: Fix uninit-value access in fill_frame_info()

Hangyu Hua <hbh25y@gmail.com>
    net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all()

Hangyu Hua <hbh25y@gmail.com>
    net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc()

Vincent Whitchurch <vincent.whitchurch@axis.com>
    net: stmmac: fix handling of zero coalescing tx-usecs

Guangguan Wang <guangguan.wang@linux.alibaba.com>
    net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add

Björn Töpel <bjorn@rivosinc.com>
    selftests: Keep symlinks, when possible

Björn Töpel <bjorn@rivosinc.com>
    kselftest/runner.sh: Propagate SIGTERM to runner child

Liu Jian <liujian56@huawei.com>
    net: ipv4: fix one memleak in __inet_del_ifa()

Jinjie Ruan <ruanjinjie@huawei.com>
    kunit: Fix wild-memory-access bug in kunit_free_suite_set()

Hamza Mahfooz <hamza.mahfooz@amd.com>
    drm/amdgpu: register a dirty framebuffer callback for fbcon

Gabe Teeger <gabe.teeger@amd.com>
    drm/amd/display: Remove wait while locked

Wenjing Liu <wenjing.liu@amd.com>
    drm/amd/display: always switch off ODM before committing more streams

Namhyung Kim <namhyung@kernel.org>
    perf hists browser: Fix the number of entries for 'e' key

Namhyung Kim <namhyung@kernel.org>
    perf tools: Handle old data in PERF_RECORD_ATTR

Namhyung Kim <namhyung@kernel.org>
    perf test shell stat_bpf_counters: Fix test on Intel

Namhyung Kim <namhyung@kernel.org>
    perf hists browser: Fix hierarchy mode header

Maciej W. Rozycki <macro@orcam.me.uk>
    MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regression

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Set target pCPU during IRTE update if target vCPU is running

Sean Christopherson <seanjc@google.com>
    KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state

Sean Christopherson <seanjc@google.com>
    KVM: nSVM: Check instead of asserting on nested TSC scaling support

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insn

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Take and hold ir_list_lock when updating vCPU's Physical ID entry

Hamza Mahfooz <hamza.mahfooz@amd.com>
    drm/amd/display: prevent potential division by zero errors

Melissa Wen <mwen@igalia.com>
    drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix potential false time out warning

Linus Walleij <linus.walleij@linaro.org>
    mtd: spi-nor: Correct flags for Winbond w25q128

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix crash during the panic_write

Liu Ying <victor.liu@nxp.com>
    drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable()

Anand Jain <anand.jain@oracle.com>
    btrfs: use the correct superblock to compare fsid in btrfs_validate_super

Naohiro Aota <naohiro.aota@wdc.com>
    btrfs: zoned: re-enable metadata over-commit for zoned mode

Josef Bacik <josef@toxicpanda.com>
    btrfs: set page extent mapped after read_folio in relocate_one_page

Filipe Manana <fdmanana@suse.com>
    btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART

Boris Burkov <boris@bur.io>
    btrfs: free qgroup rsv on io failure

Boris Burkov <boris@bur.io>
    btrfs: fix start transaction qgroup rsv double free

Naohiro Aota <naohiro.aota@wdc.com>
    btrfs: zoned: do not zone finish data relocation block group

ruanmeisi <ruan.meisi@zte.com.cn>
    fuse: nlookup missing decrement in fuse_direntplus_link

Damien Le Moal <dlemoal@kernel.org>
    ata: pata_ftide010: Add missing MODULE_DESCRIPTION

Damien Le Moal <dlemoal@kernel.org>
    ata: sata_gemini: Add missing MODULE_DESCRIPTION

Michael Schmitz <schmitzmic@gmail.com>
    ata: pata_falcon: fix IO base selection for Q40

Werner Fischer <devlists@wefi.net>
    ata: ahci: Add Elkhart Lake AHCI controller

Christian Marangi <ansuelsmth@gmail.com>
    hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation

Nathan Chancellor <nathan@kernel.org>
    lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()

Jaegeuk Kim <jaegeuk@kernel.org>
    f2fs: avoid false alarm of circular locking

Jaegeuk Kim <jaegeuk@kernel.org>
    f2fs: flush inode if atomic file is aborted

Luís Henriques <lhenriques@suse.de>
    ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup}

Wang Jianjian <wangjianjian0@foxmail.com>
    ext4: add correct group descriptors and reserved GDT blocks to system zone

Zhang Yi <yi.zhang@huawei.com>
    jbd2: correct the end of the journal recovery scan range

Zhihao Cheng <chengzhihao1@huawei.com>
    jbd2: check 'jh->b_transaction' before removing it from checkpoint

Zhang Yi <yi.zhang@huawei.com>
    jbd2: fix checkpoint cleanup performance regression

Hien Huynh <hien.huynh.px@renesas.com>
    dmaengine: sh: rz-dmac: Fix destination and source data size setting

Walter Chang <walter.chang@mediatek.com>
    clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL

Pavel Kozlov <pavel.kozlov@synopsys.com>
    ARC: atomics: Add compiler barrier to atomic operations...

Saeed Mahameed <saeedm@nvidia.com>
    net/mlx5: Free IRQ rmap and notifier on kernel shutdown

Kalesh Singh <kaleshsingh@google.com>
    Multi-gen LRU: avoid race in inc_min_seq()

Petr Tesarik <petr.tesarik.ext@huawei.com>
    sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()

Jie Wang <wangjie125@huawei.com>
    net: hns3: remove GSO partial feature bit

Yisen Zhuang <yisen.zhuang@huawei.com>
    net: hns3: fix the port information display when sfp is absent

Jijie Shao <shaojijie@huawei.com>
    net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue

Hao Chen <chenhao418@huawei.com>
    net: hns3: fix debugfs concurrency issue between kfree buffer and read

Hao Chen <chenhao418@huawei.com>
    net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read()

Jian Shen <shenjian15@huawei.com>
    net: hns3: fix tx timeout issue

Wander Lairson Costa <wander@redhat.com>
    netfilter: nfnetlink_osf: avoid OOB read

Florian Westphal <fw@strlen.de>
    netfilter: nftables: exthdr: fix 4-byte stack OOB write

Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check.

Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf().

Martin KaFai Lau <martin.lau@kernel.org>
    bpf: Remove prog->active check for bpf_lsm and bpf_iter

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: complete tc-cbs offload support on SJA1110

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload

Eric Dumazet <edumazet@google.com>
    ip_tunnels: use DEV_STATS_INC()

Ariel Marcovitch <arielmarcovitch@gmail.com>
    idr: fix param name in idr_alloc_cyclic() doc

Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    s390/zcrypt: don't leak memory if dev_set_name() fails

Olga Zaborska <olga.zaborska@intel.com>
    igb: Change IGB_MIN to allow set rx/tx value between 64 and 80

Olga Zaborska <olga.zaborska@intel.com>
    igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80

Olga Zaborska <olga.zaborska@intel.com>
    igc: Change IGC_MIN to allow set rx/tx value between 64 and 80

Geetha sowjanya <gakula@marvell.com>
    octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler

Shigeru Yoshida <syoshida@redhat.com>
    kcm: Destroy mutex in kcm_exit_net()

valis <sec@valis.email>
    net: sched: sch_qfq: Fix UAF in qfq_dequeue()

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data race around sk->sk_err.

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data-races around sk->sk_shutdown.

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data-race around unix_tot_inflight.

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data-races around user->unix_inflight.

John Fastabend <john.fastabend@gmail.com>
    bpf, sockmap: Fix skb refcnt race after locking changes

Oleksij Rempel <linux@rempel-privat.de>
    net: phy: micrel: Correct bit assignments for phy_device flags

Alex Henrie <alexhenrie24@gmail.com>
    net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr

Liang Chen <liangchen.linux@gmail.com>
    veth: Fixing transmit return status for dropped packets

Eric Dumazet <edumazet@google.com>
    gve: fix frag_list chaining

Corinna Vinschen <vinschen@redhat.com>
    igb: disable virtualization features on 82580

Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    ipv6: ignore dst hint for multipath routes

Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    ipv4: ignore dst hint for multipath routes

Eric Dumazet <edumazet@google.com>
    mptcp: annotate data-races around msk->rmem_fwd_alloc

Eric Dumazet <edumazet@google.com>
    net: annotate data-races around sk->sk_forward_alloc

Eric Dumazet <edumazet@google.com>
    net: use sk_forward_alloc_get() in sk_get_meminfo()

Sean Christopherson <seanjc@google.com>
    drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt()

Sean Christopherson <seanjc@google.com>
    drm/i915/gvt: Put the page reference obtained by KVM's gfn_to_pfn()

Sean Christopherson <seanjc@google.com>
    drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page"

Xiubo Li <xiubli@redhat.com>
    ceph: make members in struct ceph_mds_request_args_ext a union

Magnus Karlsson <magnus.karlsson@intel.com>
    xsk: Fix xsk_diag use-after-free error during socket cleanup

Florian Westphal <fw@strlen.de>
    net: fib: avoid warn splat in flow dissector

Eric Dumazet <edumazet@google.com>
    net: read sk->sk_family once in sk_mc_loop()

Eric Dumazet <edumazet@google.com>
    ipv4: annotate data-races around fi->fib_dead

Eric Dumazet <edumazet@google.com>
    sctp: annotate data-races around sk->sk_wmem_queued

Eric Dumazet <edumazet@google.com>
    net/sched: fq_pie: avoid stalls in fq_pie_timer()

Katya Orlova <e.orlova@ispras.ru>
    smb: propagate error code of extract_sharename()

Paulo Alcantara <pc@cjr.nz>
    cifs: use fs_context for automounts

Yu Kuai <yukuai3@huawei.com>
    blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice()

Yu Kuai <yukuai3@huawei.com>
    blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()

Andrzej Hajda <andrzej.hajda@intel.com>
    drm/i915: mark requests for GuC virtual engines to avoid use-after-free

Namhyung Kim <namhyung@kernel.org>
    perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test

Kajol Jain <kjain@linux.ibm.com>
    perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators

Vladimir Zapolskiy <vz@mleia.com>
    pwm: lpc32xx: Remove handling of PWM channels

Raag Jadav <raag.jadav@intel.com>
    watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf top: Don't pass an ERR_PTR() directly to perf_session__delete()

Kajol Jain <kjain@linux.ibm.com>
    perf vendor events: Drop STORES_PER_INST metric event for power10 platform

Kajol Jain <kjain@linux.ibm.com>
    perf vendor events: Drop some of the JSON/events for power10 platform

Kajol Jain <kjain@linux.ibm.com>
    perf vendor events: Update the JSON/events descriptions for power10 platform

Sean Christopherson <seanjc@google.com>
    x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm()

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf annotate bpf: Don't enclose non-debug code with an assert()

Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Input: tca6416-keypad - fix interrupt enable disbalance

Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Input: tca6416-keypad - always expect proper IRQ number in i2c client

Ying Liu <victor.liu@nxp.com>
    backlight: gpio_backlight: Drop output GPIO direction check for initial power state

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    pwm: atmel-tcb: Fix resource freeing in error path and remove

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    pwm: atmel-tcb: Harmonize resource allocation order

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    pwm: atmel-tcb: Convert to platform remove callback returning void

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf trace: Really free the evsel->priv area

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf trace: Use zfree() to reduce chances of use after free

Jeff LaBundy <jeff@labundy.com>
    Input: iqs7222 - configure power mode before triggering ATI

Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
    kconfig: fix possible buffer overflow

Jonathan Marek <jonathan@marek.ca>
    mailbox: qcom-ipcc: fix incorrect num_chans counting

Andreas Gruenbacher <agruenba@redhat.com>
    gfs2: low-memory forced flush fixes

Andreas Gruenbacher <agruenba@redhat.com>
    gfs2: Switch to wait_event in gfs2_logd

Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    tpm_crb: Fix an error handling path in crb_acpi_add()

Masahiro Yamada <masahiroy@kernel.org>
    kbuild: do not run depmod for 'make modules_sign'

Masahiro Yamada <masahiroy@kernel.org>
    kbuild: rpm-pkg: define _arch conditionally

Eric Dumazet <edumazet@google.com>
    net: deal with integer overflows in kmalloc_reserve()

Eric Dumazet <edumazet@google.com>
    net: factorize code in kmalloc_reserve()

Eric Dumazet <edumazet@google.com>
    net: remove osize variable in __alloc_skb()

Eric Dumazet <edumazet@google.com>
    net: add SKB_HEAD_ALIGN() helper

Qiang Yu <quic_qianyu@quicinc.com>
    bus: mhi: host: Skip MHI reset if device is in RDDM

Fedor Pchelkin <pchelkin@ispras.ru>
    NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info

Trond Myklebust <trond.myklebust@hammerspace.com>
    NFS: Fix a potential data corruption

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: mss-sc7180: fix missing resume during probe

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: q6sstop-qcs404: fix missing resume during probe

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: lpasscc-sc7280: fix missing resume during probe

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors

Chris Lew <quic_clew@quicinc.com>
    soc: qcom: qmi_encdec: Restrict string length in decode

Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock

Marco Felsch <m.felsch@pengutronix.de>
    clk: imx: pll14xx: align pdiv with reference manual

Ahmad Fatoum <a.fatoum@pengutronix.de>
    clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz

Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    dt-bindings: clock: xlnx,versal-clk: drop select:false

Raag Jadav <raag.jadav@intel.com>
    pinctrl: cherryview: fix address_space_handler() argument

Bharath SM <bharathsm@microsoft.com>
    cifs: update desired access while requesting for directory lease

Helge Deller <deller@gmx.de>
    parisc: led: Reduce CPU overhead for disk & lan LED computation

Helge Deller <deller@gmx.de>
    parisc: led: Fix LAN receive and transmit LEDs

Andrew Donnellan <ajd@linux.ibm.com>
    lib/test_meminit: allocate pages up to order MAX_ORDER

Muchun Song <muchun.song@linux.dev>
    mm: hugetlb_vmemmap: fix a race between vmemmap pmd split

Michal Hocko <mhocko@suse.com>
    memcg: drop kmem.limit_in_bytes

Steve French <stfrench@microsoft.com>
    send channel sequence number in SMB3 requests after reconnects

Chris Paterson <chris.paterson2@renesas.com>
    arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: turingcc-qcs404: fix missing resume during probe

Sheetal <sheetal@nvidia.com>
    ASoC: tegra: Fix SFC conversion for few rates

Thomas Zimmermann <tzimmermann@suse.de>
    drm/ast: Fix DRAM init on AST2200

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: camcc-sc7180: fix async resume during probe

Thomas Zimmermann <tzimmermann@suse.de>
    fbdev/ep93xx-fb: Do not assign to struct fb_info.dev

Chengming Zhou <zhouchengming@bytedance.com>
    null_blk: fix poll request timeout handling

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix firmware resource tracking

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Error code did not return to upper layer

Nilesh Javali <njavali@marvell.com>
    scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit()

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Flush mailbox commands on chip reset

Manish Rangankar <mrangankar@marvell.com>
    scsi: qla2xxx: Remove unsupported ql2xenabledif option

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix TMF leak through

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix session hang in gnl

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Turn off noisy message log

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix erroneous link up failure

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix command flush during TMF

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: fix inconsistent TMF timeout

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix deletion race condition

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Limit TMF to 8 per function

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Adjust IOCB resource on qpair create

Gurchetan Singh <gurchetansingh@chromium.org>
    drm/virtio: Conditionally allocate virtio_gpu_fence

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: Don't set affinity on a dying sqpoll thread

Pavel Begunkov <asml.silence@gmail.com>
    io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: break out of iowq iopoll on teardown

Pavel Begunkov <asml.silence@gmail.com>
    io_uring/net: don't overflow multishot accept

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: revert "io_uring fix multishot accept ordering"

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: always lock in io_apoll_task_func

Kalesh Singh <kaleshsingh@google.com>
    Multi-gen LRU: fix per-zone reclaim

Yu Zhao <yuzhao@google.com>
    mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[]

Quan Tian <qtian@vmware.com>
    net/ipv6: SKB symmetric hash should incorporate transport ports


-------------

Diffstat:

 Documentation/admin-guide/cgroup-v1/memory.rst     |   2 -
 .../devicetree/bindings/clock/xlnx,versal-clk.yaml |   2 -
 Documentation/mm/multigen_lru.rst                  |   8 +-
 Makefile                                           |   6 +-
 arch/arc/include/asm/atomic-llsc.h                 |   6 +-
 arch/arc/include/asm/atomic64-arcv2.h              |   6 +-
 arch/arm64/boot/dts/renesas/rzg2l-smarc-som.dtsi   |   4 +-
 arch/arm64/boot/dts/renesas/rzg2lc-smarc-som.dtsi  |   2 +-
 arch/arm64/boot/dts/renesas/rzg2ul-smarc-som.dtsi  |   4 +-
 arch/arm64/net/bpf_jit_comp.c                      |   9 +-
 arch/mips/Makefile                                 |   6 +-
 arch/parisc/include/asm/led.h                      |   4 +-
 arch/sh/boards/mach-ap325rxa/setup.c               |   2 +-
 arch/sh/boards/mach-ecovec24/setup.c               |   6 +-
 arch/sh/boards/mach-kfr2r09/setup.c                |   2 +-
 arch/sh/boards/mach-migor/setup.c                  |   2 +-
 arch/sh/boards/mach-se/7724/setup.c                |   6 +-
 arch/x86/include/asm/virtext.h                     |   6 -
 arch/x86/kvm/svm/avic.c                            |  59 +++++-
 arch/x86/kvm/svm/nested.c                          |   9 +-
 arch/x86/kvm/svm/sev.c                             |   9 +-
 arch/x86/kvm/svm/svm.c                             |  35 ++-
 arch/x86/net/bpf_jit_comp.c                        |  19 +-
 block/blk-throttle.c                               |  99 ++++-----
 drivers/ata/ahci.c                                 |   2 +
 drivers/ata/pata_falcon.c                          |  50 +++--
 drivers/ata/pata_ftide010.c                        |   1 +
 drivers/ata/sata_gemini.c                          |   1 +
 drivers/block/null_blk/main.c                      |  12 +-
 drivers/bus/mhi/host/pm.c                          |   5 +
 drivers/char/tpm/tpm_crb.c                         |   5 +-
 drivers/clk/imx/clk-pll14xx.c                      |  13 +-
 drivers/clk/qcom/camcc-sc7180.c                    |   2 +-
 drivers/clk/qcom/dispcc-sm8450.c                   |  13 +-
 drivers/clk/qcom/gcc-mdm9615.c                     |   2 +-
 drivers/clk/qcom/lpasscc-sc7280.c                  |  16 +-
 drivers/clk/qcom/mss-sc7180.c                      |  13 +-
 drivers/clk/qcom/q6sstop-qcs404.c                  |  15 +-
 drivers/clk/qcom/turingcc-qcs404.c                 |  13 +-
 drivers/clocksource/arm_arch_timer.c               |   7 +
 drivers/dma/sh/rz-dmac.c                           |  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c        |  26 ++-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c    |   7 +
 drivers/gpu/drm/amd/display/dc/Makefile            |   1 +
 drivers/gpu/drm/amd/display/dc/core/dc.c           |  68 ++++--
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_mpc.c   |   5 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c |  11 -
 .../drm/amd/display/modules/freesync/freesync.c    |   9 +-
 drivers/gpu/drm/ast/ast_post.c                     |   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h       |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c  |   3 +
 drivers/gpu/drm/i915/gvt/gtt.c                     |  27 +--
 drivers/gpu/drm/i915/gvt/gtt.h                     |   1 -
 drivers/gpu/drm/i915/i915_request.c                |   7 +-
 drivers/gpu/drm/mxsfb/mxsfb_kms.c                  |   9 +
 drivers/gpu/drm/virtio/virtgpu_ioctl.c             |  30 +--
 drivers/hwspinlock/qcom_hwspinlock.c               |   9 +
 drivers/input/keyboard/tca6416-keypad.c            |  31 +--
 drivers/input/misc/iqs7222.c                       |   8 +-
 drivers/mailbox/qcom-ipcc.c                        |   4 +-
 drivers/mtd/nand/raw/brcmnand/brcmnand.c           | 112 ++++++----
 drivers/mtd/spi-nor/winbond.c                      |   5 +-
 drivers/net/dsa/sja1105/sja1105.h                  |   4 +
 drivers/net/dsa/sja1105/sja1105_dynamic_config.c   |  93 ++++----
 drivers/net/dsa/sja1105/sja1105_main.c             | 120 ++++++++---
 drivers/net/dsa/sja1105/sja1105_spi.c              |   4 +
 drivers/net/ethernet/adi/adin1110.c                |  10 +-
 drivers/net/ethernet/cadence/macb.h                |   4 +
 drivers/net/ethernet/cadence/macb_main.c           |  18 +-
 drivers/net/ethernet/google/gve/gve_rx_dqo.c       |   5 +-
 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |   1 +
 drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c |   7 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    |  19 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |   4 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c |  20 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c |  14 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |   5 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |   2 -
 drivers/net/ethernet/intel/igb/igb.h               |   4 +-
 drivers/net/ethernet/intel/igb/igb_main.c          |   5 +-
 drivers/net/ethernet/intel/igbvf/igbvf.h           |   4 +-
 drivers/net/ethernet/intel/igc/igc.h               |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c       |  28 +--
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c    |   5 +
 .../net/ethernet/marvell/octeontx2/af/rvu_nix.c    |  21 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.c        |   3 +
 .../ethernet/mellanox/mlx5/core/en/tc_tun_encap.c  |   5 +-
 drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c  |  26 ++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  10 +-
 drivers/net/usb/r8152.c                            |   3 +
 drivers/net/veth.c                                 |   4 +-
 drivers/parisc/led.c                               |   4 +-
 drivers/pinctrl/intel/pinctrl-cherryview.c         |   5 +-
 drivers/platform/mellanox/Kconfig                  |   4 +-
 drivers/platform/mellanox/mlxbf-pmc.c              |  41 ++--
 drivers/platform/mellanox/mlxbf-tmfifo.c           |  90 +++++---
 drivers/pwm/pwm-atmel-tcb.c                        |  70 +++---
 drivers/pwm/pwm-lpc32xx.c                          |  16 +-
 drivers/s390/crypto/zcrypt_api.c                   |   1 +
 drivers/scsi/qla2xxx/qla_attr.c                    |   2 -
 drivers/scsi/qla2xxx/qla_dbg.c                     |   2 +-
 drivers/scsi/qla2xxx/qla_def.h                     |  21 +-
 drivers/scsi/qla2xxx/qla_dfs.c                     |  10 +
 drivers/scsi/qla2xxx/qla_gbl.h                     |   1 +
 drivers/scsi/qla2xxx/qla_init.c                    | 234 +++++++++++++--------
 drivers/scsi/qla2xxx/qla_inline.h                  |  57 ++++-
 drivers/scsi/qla2xxx/qla_iocb.c                    |   1 +
 drivers/scsi/qla2xxx/qla_isr.c                     |   7 +-
 drivers/scsi/qla2xxx/qla_mbx.c                     |   7 +-
 drivers/scsi/qla2xxx/qla_nvme.c                    |   3 +-
 drivers/scsi/qla2xxx/qla_os.c                      |  26 ++-
 drivers/scsi/qla2xxx/qla_target.c                  |  14 +-
 drivers/soc/qcom/qmi_encdec.c                      |   4 +-
 drivers/video/backlight/gpio_backlight.c           |   3 +-
 drivers/video/fbdev/ep93xx-fb.c                    |   1 -
 drivers/watchdog/intel-mid_wdt.c                   |   1 +
 fs/btrfs/disk-io.c                                 |   5 +-
 fs/btrfs/extent-tree.c                             |  43 ++--
 fs/btrfs/inode.c                                   |   7 +
 fs/btrfs/relocation.c                              |  12 +-
 fs/btrfs/space-info.c                              |   6 +-
 fs/btrfs/transaction.c                             |  26 ++-
 fs/btrfs/zoned.c                                   |  16 +-
 fs/ext4/balloc.c                                   |  15 +-
 fs/ext4/block_validity.c                           |   8 +-
 fs/ext4/crypto.c                                   |   4 +
 fs/ext4/ext4.h                                     |   2 +
 fs/f2fs/f2fs.h                                     |  24 ++-
 fs/f2fs/inline.c                                   |   3 +-
 fs/f2fs/segment.c                                  |   2 +
 fs/fuse/readdir.c                                  |  10 +-
 fs/gfs2/aops.c                                     |   4 +-
 fs/gfs2/log.c                                      |  25 +--
 fs/jbd2/checkpoint.c                               |  22 +-
 fs/jbd2/recovery.c                                 |  12 +-
 fs/nfs/direct.c                                    |  20 +-
 fs/nfs/pnfs_dev.c                                  |   2 +-
 fs/smb/client/cached_dir.c                         |   2 +-
 fs/smb/client/cifs_dfs_ref.c                       | 100 ++++-----
 fs/smb/client/cifsglob.h                           |   1 +
 fs/smb/client/connect.c                            |   1 +
 fs/smb/client/fscache.c                            |   2 +-
 fs/smb/client/smb2ops.c                            |  11 +-
 fs/smb/client/smb2pdu.c                            |  11 +
 fs/smb/common/smb2pdu.h                            |  22 ++
 include/linux/bpf.h                                |  24 +--
 include/linux/bpf_verifier.h                       |  13 ++
 include/linux/ceph/ceph_fs.h                       |  24 ++-
 include/linux/ipv6.h                               |   1 +
 include/linux/micrel_phy.h                         |   6 +-
 include/linux/mm_inline.h                          |   4 +-
 include/linux/mmzone.h                             |   8 +-
 include/linux/skbuff.h                             |   8 +
 include/linux/tca6416_keypad.h                     |   1 -
 include/net/ip.h                                   |   1 +
 include/net/ip6_fib.h                              |  14 +-
 include/net/ip_fib.h                               |   5 +-
 include/net/ip_tunnels.h                           |  15 +-
 include/net/ipv6.h                                 |   7 +-
 include/net/sock.h                                 |  12 +-
 include/trace/events/fib.h                         |   5 +-
 include/trace/events/fib6.h                        |   5 +-
 io_uring/io-wq.c                                   |  17 +-
 io_uring/io-wq.h                                   |   3 +-
 io_uring/io_uring.c                                |  31 ++-
 io_uring/net.c                                     |   8 +-
 io_uring/poll.c                                    |   3 +-
 io_uring/sqpoll.c                                  |  17 ++
 io_uring/sqpoll.h                                  |   1 +
 kernel/bpf/syscall.c                               |   7 +-
 kernel/bpf/trampoline.c                            |  81 +++++--
 lib/idr.c                                          |   2 +-
 lib/kunit/test.c                                   |   3 +-
 lib/test_meminit.c                                 |   2 +-
 lib/test_scanf.c                                   |   2 +-
 mm/hugetlb_vmemmap.c                               |  34 ++-
 mm/memcontrol.c                                    |  10 -
 mm/vmscan.c                                        |  50 +++--
 net/core/flow_dissector.c                          |   3 +-
 net/core/skbuff.c                                  |  49 ++---
 net/core/skmsg.c                                   |  12 +-
 net/core/sock.c                                    |  19 +-
 net/ethtool/ioctl.c                                |  10 +-
 net/hsr/hsr_forward.c                              |   1 +
 net/ipv4/devinet.c                                 |  10 +-
 net/ipv4/fib_semantics.c                           |   5 +-
 net/ipv4/fib_trie.c                                |   3 +-
 net/ipv4/inet_hashtables.c                         |  43 ++--
 net/ipv4/ip_input.c                                |   3 +-
 net/ipv4/route.c                                   |   1 +
 net/ipv4/tcp_output.c                              |   2 +-
 net/ipv4/udp.c                                     |   6 +-
 net/ipv6/addrconf.c                                |   2 +-
 net/ipv6/ip6_input.c                               |   3 +-
 net/ipv6/route.c                                   |   3 +
 net/kcm/kcmsock.c                                  |  15 +-
 net/mptcp/protocol.c                               |  23 +-
 net/netfilter/nfnetlink_osf.c                      |   8 +
 net/netfilter/nft_exthdr.c                         |  22 +-
 net/sched/sch_fq_pie.c                             |  27 ++-
 net/sched/sch_plug.c                               |   2 +-
 net/sched/sch_qfq.c                                |  22 +-
 net/sctp/proc.c                                    |   2 +-
 net/sctp/socket.c                                  |  10 +-
 net/smc/smc_core.c                                 |   2 +
 net/tls/tls_sw.c                                   |   4 +-
 net/unix/af_unix.c                                 |   2 +-
 net/unix/scm.c                                     |   6 +-
 net/xdp/xsk_diag.c                                 |   3 +
 scripts/kconfig/preprocess.c                       |   3 +
 scripts/package/mkspec                             |   2 +-
 sound/soc/tegra/tegra210_sfc.c                     |  31 ++-
 sound/soc/tegra/tegra210_sfc.h                     |   4 +-
 tools/perf/builtin-top.c                           |   1 +
 tools/perf/builtin-trace.c                         |  15 +-
 .../pmu-events/arch/powerpc/power10/cache.json     |   4 +-
 .../arch/powerpc/power10/floating_point.json       |   7 -
 .../pmu-events/arch/powerpc/power10/frontend.json  |  30 +--
 .../pmu-events/arch/powerpc/power10/marked.json    |  30 +--
 .../pmu-events/arch/powerpc/power10/memory.json    |   6 +-
 .../pmu-events/arch/powerpc/power10/metrics.json   |   6 -
 .../pmu-events/arch/powerpc/power10/others.json    |  53 +++--
 .../pmu-events/arch/powerpc/power10/pipeline.json  |  30 +--
 .../perf/pmu-events/arch/powerpc/power10/pmc.json  |   4 +-
 .../arch/powerpc/power10/translation.json          |  11 +-
 tools/perf/tests/shell/stat_bpf_counters.sh        |   4 +-
 tools/perf/tests/shell/stat_bpf_counters_cgrp.sh   |  28 ++-
 tools/perf/ui/browsers/hists.c                     |  60 +++---
 tools/perf/util/annotate.c                         |  10 +-
 tools/perf/util/header.c                           |  11 +-
 tools/testing/selftests/kselftest/runner.sh        |   3 +-
 tools/testing/selftests/lib.mk                     |   4 +-
 232 files changed, 2129 insertions(+), 1340 deletions(-)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 001/219] net/ipv6: SKB symmetric hash should incorporate transport ports
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 002/219] mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[] Greg Kroah-Hartman
                   ` (228 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Lars Ekman, Quan Tian, Eric Dumazet,
	David S. Miller

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quan Tian <qtian@vmware.com>

commit a5e2151ff9d5852d0ababbbcaeebd9646af9c8d9 upstream.

__skb_get_hash_symmetric() was added to compute a symmetric hash over
the protocol, addresses and transport ports, by commit eb70db875671
("packet: Use symmetric hash for PACKET_FANOUT_HASH."). It uses
flow_keys_dissector_symmetric_keys as the flow_dissector to incorporate
IPv4 addresses, IPv6 addresses and ports. However, it should not specify
the flag as FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL, which stops further
dissection when an IPv6 flow label is encountered, making transport
ports not being incorporated in such case.

As a consequence, the symmetric hash is based on 5-tuple for IPv4 but
3-tuple for IPv6 when flow label is present. It caused a few problems,
e.g. when nft symhash and openvswitch l4_sym rely on the symmetric hash
to perform load balancing as different L4 flows between two given IPv6
addresses would always get the same symmetric hash, leading to uneven
traffic distribution.

Removing the use of FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL makes sure the
symmetric hash is based on 5-tuple for both IPv4 and IPv6 consistently.

Fixes: eb70db875671 ("packet: Use symmetric hash for PACKET_FANOUT_HASH.")
Reported-by: Lars Ekman <uablrek@gmail.com>
Closes: https://github.com/antrea-io/antrea/issues/5457
Signed-off-by: Quan Tian <qtian@vmware.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/flow_dissector.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1738,8 +1738,7 @@ u32 __skb_get_hash_symmetric(const struc
 
 	memset(&keys, 0, sizeof(keys));
 	__skb_flow_dissect(NULL, skb, &flow_keys_dissector_symmetric,
-			   &keys, NULL, 0, 0, 0,
-			   FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
+			   &keys, NULL, 0, 0, 0, 0);
 
 	return __flow_hash_from_keys(&keys, &hashrnd);
 }



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 002/219] mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[]
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 001/219] net/ipv6: SKB symmetric hash should incorporate transport ports Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 003/219] Multi-gen LRU: fix per-zone reclaim Greg Kroah-Hartman
                   ` (227 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yu Zhao, Johannes Weiner,
	Jonathan Corbet, Michael Larabel, Michal Hocko, Mike Rapoport,
	Roman Gushchin, Suren Baghdasaryan, Andrew Morton

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yu Zhao <yuzhao@google.com>

commit 6df1b2212950aae2b2188c6645ea18e2a9e3fdd5 upstream.

lru_gen_folio will be chained into per-node lists by the coming
lrugen->list.

Link: https://lkml.kernel.org/r/20221222041905.2431096-3-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Larabel <Michael@MichaelLarabel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/mm/multigen_lru.rst |    8 ++++----
 include/linux/mm_inline.h         |    4 ++--
 include/linux/mmzone.h            |    8 ++++----
 mm/vmscan.c                       |   20 ++++++++++----------
 4 files changed, 20 insertions(+), 20 deletions(-)

--- a/Documentation/mm/multigen_lru.rst
+++ b/Documentation/mm/multigen_lru.rst
@@ -89,15 +89,15 @@ variables are monotonically increasing.
 
 Generation numbers are truncated into ``order_base_2(MAX_NR_GENS+1)``
 bits in order to fit into the gen counter in ``folio->flags``. Each
-truncated generation number is an index to ``lrugen->lists[]``. The
+truncated generation number is an index to ``lrugen->folios[]``. The
 sliding window technique is used to track at least ``MIN_NR_GENS`` and
 at most ``MAX_NR_GENS`` generations. The gen counter stores a value
 within ``[1, MAX_NR_GENS]`` while a page is on one of
-``lrugen->lists[]``; otherwise it stores zero.
+``lrugen->folios[]``; otherwise it stores zero.
 
 Each generation is divided into multiple tiers. A page accessed ``N``
 times through file descriptors is in tier ``order_base_2(N)``. Unlike
-generations, tiers do not have dedicated ``lrugen->lists[]``. In
+generations, tiers do not have dedicated ``lrugen->folios[]``. In
 contrast to moving across generations, which requires the LRU lock,
 moving across tiers only involves atomic operations on
 ``folio->flags`` and therefore has a negligible cost. A feedback loop
@@ -127,7 +127,7 @@ page mapped by this PTE to ``(max_seq%MA
 Eviction
 --------
 The eviction consumes old generations. Given an ``lruvec``, it
-increments ``min_seq`` when ``lrugen->lists[]`` indexed by
+increments ``min_seq`` when ``lrugen->folios[]`` indexed by
 ``min_seq%MAX_NR_GENS`` becomes empty. To select a type and a tier to
 evict from, it first compares ``min_seq[]`` to select the older type.
 If both types are equally old, it selects the one whose first tier has
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -256,9 +256,9 @@ static inline bool lru_gen_add_folio(str
 	lru_gen_update_size(lruvec, folio, -1, gen);
 	/* for folio_rotate_reclaimable() */
 	if (reclaiming)
-		list_add_tail(&folio->lru, &lrugen->lists[gen][type][zone]);
+		list_add_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
 	else
-		list_add(&folio->lru, &lrugen->lists[gen][type][zone]);
+		list_add(&folio->lru, &lrugen->folios[gen][type][zone]);
 
 	return true;
 }
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -312,7 +312,7 @@ enum lruvec_flags {
  * They form a sliding window of a variable size [MIN_NR_GENS, MAX_NR_GENS]. An
  * offset within MAX_NR_GENS, i.e., gen, indexes the LRU list of the
  * corresponding generation. The gen counter in folio->flags stores gen+1 while
- * a page is on one of lrugen->lists[]. Otherwise it stores 0.
+ * a page is on one of lrugen->folios[]. Otherwise it stores 0.
  *
  * A page is added to the youngest generation on faulting. The aging needs to
  * check the accessed bit at least twice before handing this page over to the
@@ -324,8 +324,8 @@ enum lruvec_flags {
  * rest of generations, if they exist, are considered inactive. See
  * lru_gen_is_active().
  *
- * PG_active is always cleared while a page is on one of lrugen->lists[] so that
- * the aging needs not to worry about it. And it's set again when a page
+ * PG_active is always cleared while a page is on one of lrugen->folios[] so
+ * that the aging needs not to worry about it. And it's set again when a page
  * considered active is isolated for non-reclaiming purposes, e.g., migration.
  * See lru_gen_add_folio() and lru_gen_del_folio().
  *
@@ -412,7 +412,7 @@ struct lru_gen_struct {
 	/* the birth time of each generation in jiffies */
 	unsigned long timestamps[MAX_NR_GENS];
 	/* the multi-gen LRU lists, lazily sorted on eviction */
-	struct list_head lists[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
+	struct list_head folios[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
 	/* the multi-gen LRU sizes, eventually consistent */
 	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
 	/* the exponential moving average of refaulted */
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4258,7 +4258,7 @@ static bool inc_min_seq(struct lruvec *l
 
 	/* prevent cold/hot inversion if force_scan is true */
 	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
-		struct list_head *head = &lrugen->lists[old_gen][type][zone];
+		struct list_head *head = &lrugen->folios[old_gen][type][zone];
 
 		while (!list_empty(head)) {
 			struct folio *folio = lru_to_folio(head);
@@ -4269,7 +4269,7 @@ static bool inc_min_seq(struct lruvec *l
 			VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
 
 			new_gen = folio_inc_gen(lruvec, folio, false);
-			list_move_tail(&folio->lru, &lrugen->lists[new_gen][type][zone]);
+			list_move_tail(&folio->lru, &lrugen->folios[new_gen][type][zone]);
 
 			if (!--remaining)
 				return false;
@@ -4297,7 +4297,7 @@ static bool try_to_inc_min_seq(struct lr
 			gen = lru_gen_from_seq(min_seq[type]);
 
 			for (zone = 0; zone < MAX_NR_ZONES; zone++) {
-				if (!list_empty(&lrugen->lists[gen][type][zone]))
+				if (!list_empty(&lrugen->folios[gen][type][zone]))
 					goto next;
 			}
 
@@ -4762,7 +4762,7 @@ static bool sort_folio(struct lruvec *lr
 
 	/* promoted */
 	if (gen != lru_gen_from_seq(lrugen->min_seq[type])) {
-		list_move(&folio->lru, &lrugen->lists[gen][type][zone]);
+		list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
 		return true;
 	}
 
@@ -4771,7 +4771,7 @@ static bool sort_folio(struct lruvec *lr
 		int hist = lru_hist_from_seq(lrugen->min_seq[type]);
 
 		gen = folio_inc_gen(lruvec, folio, false);
-		list_move_tail(&folio->lru, &lrugen->lists[gen][type][zone]);
+		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
 
 		WRITE_ONCE(lrugen->protected[hist][type][tier - 1],
 			   lrugen->protected[hist][type][tier - 1] + delta);
@@ -4783,7 +4783,7 @@ static bool sort_folio(struct lruvec *lr
 	if (folio_test_locked(folio) || folio_test_writeback(folio) ||
 	    (type == LRU_GEN_FILE && folio_test_dirty(folio))) {
 		gen = folio_inc_gen(lruvec, folio, true);
-		list_move(&folio->lru, &lrugen->lists[gen][type][zone]);
+		list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
 		return true;
 	}
 
@@ -4850,7 +4850,7 @@ static int scan_folios(struct lruvec *lr
 	for (zone = sc->reclaim_idx; zone >= 0; zone--) {
 		LIST_HEAD(moved);
 		int skipped = 0;
-		struct list_head *head = &lrugen->lists[gen][type][zone];
+		struct list_head *head = &lrugen->folios[gen][type][zone];
 
 		while (!list_empty(head)) {
 			struct folio *folio = lru_to_folio(head);
@@ -5250,7 +5250,7 @@ static bool __maybe_unused state_is_vali
 		int gen, type, zone;
 
 		for_each_gen_type_zone(gen, type, zone) {
-			if (!list_empty(&lrugen->lists[gen][type][zone]))
+			if (!list_empty(&lrugen->folios[gen][type][zone]))
 				return false;
 		}
 	}
@@ -5295,7 +5295,7 @@ static bool drain_evictable(struct lruve
 	int remaining = MAX_LRU_BATCH;
 
 	for_each_gen_type_zone(gen, type, zone) {
-		struct list_head *head = &lruvec->lrugen.lists[gen][type][zone];
+		struct list_head *head = &lruvec->lrugen.folios[gen][type][zone];
 
 		while (!list_empty(head)) {
 			bool success;
@@ -5832,7 +5832,7 @@ void lru_gen_init_lruvec(struct lruvec *
 		lrugen->timestamps[i] = jiffies;
 
 	for_each_gen_type_zone(gen, type, zone)
-		INIT_LIST_HEAD(&lrugen->lists[gen][type][zone]);
+		INIT_LIST_HEAD(&lrugen->folios[gen][type][zone]);
 
 	lruvec->mm_state.seq = MIN_NR_GENS;
 	init_waitqueue_head(&lruvec->mm_state.wait);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 003/219] Multi-gen LRU: fix per-zone reclaim
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 001/219] net/ipv6: SKB symmetric hash should incorporate transport ports Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 002/219] mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[] Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 004/219] io_uring: always lock in io_apoll_task_func Greg Kroah-Hartman
                   ` (226 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kalesh Singh, Charan Teja Kalla,
	Lecopzer Chen, Yu Zhao, Barry Song, Brian Geffon,
	Jan Alexander Steffens (heftig), Matthias Brugger,
	Oleksandr Natalenko, Qi Zheng, Steven Barrett, Suleiman Souhlal,
	Suren Baghdasaryan, Aneesh Kumar K V, Andrew Morton,
	AngeloGioacchino Del Regno

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kalesh Singh <kaleshsingh@google.com>

commit 669281ee7ef731fb5204df9d948669bf32a5e68d upstream.

MGLRU has a LRU list for each zone for each type (anon/file) in each
generation:

	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];

The min_seq (oldest generation) can progress independently for each
type but the max_seq (youngest generation) is shared for both anon and
file. This is to maintain a common frame of reference.

In order for eviction to advance the min_seq of a type, all the per-zone
lists in the oldest generation of that type must be empty.

The eviction logic only considers pages from eligible zones for
eviction or promotion.

    scan_folios() {
	...
	for (zone = sc->reclaim_idx; zone >= 0; zone--)  {
	    ...
	    sort_folio(); 	// Promote
	    ...
	    isolate_folio(); 	// Evict
	}
	...
    }

Consider the system has the movable zone configured and default 4
generations. The current state of the system is as shown below
(only illustrating one type for simplicity):

Type: ANON

	Zone    DMA32     Normal    Movable    Device

	Gen 0       0          0        4GB         0

	Gen 1       0        1GB        1MB         0

	Gen 2     1MB        4GB        1MB         0

	Gen 3     1MB        1MB        1MB         0

Now consider there is a GFP_KERNEL allocation request (eligible zone
index <= Normal), evict_folios() will return without doing any work
since there are no pages to scan in the eligible zones of the oldest
generation. Reclaim won't make progress until triggered from a ZONE_MOVABLE
allocation request; which may not happen soon if there is a lot of free
memory in the movable zone. This can lead to OOM kills, although there
is 1GB pages in the Normal zone of Gen 1 that we have not yet tried to
reclaim.

This issue is not seen in the conventional active/inactive LRU since
there are no per-zone lists.

If there are no (not enough) folios to scan in the eligible zones, move
folios from ineligible zone (zone_index > reclaim_index) to the next
generation. This allows for the progression of min_seq and reclaiming
from the next generation (Gen 1).

Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.

[1] https://github.com/raspberrypi/linux/issues/5395

Link: https://lkml.kernel.org/r/20230802025606.346758-1-kaleshsingh@google.com
Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Steven Barrett <steven@liquorix.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/vmscan.c |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4728,7 +4728,8 @@ void lru_gen_look_around(struct page_vma
  *                          the eviction
  ******************************************************************************/
 
-static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
+static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
+		       int tier_idx)
 {
 	bool success;
 	int gen = folio_lru_gen(folio);
@@ -4779,6 +4780,13 @@ static bool sort_folio(struct lruvec *lr
 		return true;
 	}
 
+	/* ineligible */
+	if (zone > sc->reclaim_idx) {
+		gen = folio_inc_gen(lruvec, folio, false);
+		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		return true;
+	}
+
 	/* waiting for writeback */
 	if (folio_test_locked(folio) || folio_test_writeback(folio) ||
 	    (type == LRU_GEN_FILE && folio_test_dirty(folio))) {
@@ -4831,7 +4839,8 @@ static bool isolate_folio(struct lruvec
 static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 		       int type, int tier, struct list_head *list)
 {
-	int gen, zone;
+	int i;
+	int gen;
 	enum vm_event_item item;
 	int sorted = 0;
 	int scanned = 0;
@@ -4847,9 +4856,10 @@ static int scan_folios(struct lruvec *lr
 
 	gen = lru_gen_from_seq(lrugen->min_seq[type]);
 
-	for (zone = sc->reclaim_idx; zone >= 0; zone--) {
+	for (i = MAX_NR_ZONES; i > 0; i--) {
 		LIST_HEAD(moved);
 		int skipped = 0;
+		int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
 		struct list_head *head = &lrugen->folios[gen][type][zone];
 
 		while (!list_empty(head)) {
@@ -4863,7 +4873,7 @@ static int scan_folios(struct lruvec *lr
 
 			scanned += delta;
 
-			if (sort_folio(lruvec, folio, tier))
+			if (sort_folio(lruvec, folio, sc, tier))
 				sorted += delta;
 			else if (isolate_folio(lruvec, folio, sc)) {
 				list_add(&folio->lru, list);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 004/219] io_uring: always lock in io_apoll_task_func
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 003/219] Multi-gen LRU: fix per-zone reclaim Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 005/219] io_uring: revert "io_uring fix multishot accept ordering" Greg Kroah-Hartman
                   ` (225 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Dylan Yudaken, Jens Axboe,
	Pavel Begunkov

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Begunkov <asml.silence@gmail.com>

From: Dylan Yudaken <dylany@meta.com>

[ upstream commit c06c6c5d276707e04cedbcc55625e984922118aa ]

This is required for the failure case (io_req_complete_failed) and is
missing.

The alternative would be to only lock in the failure path, however all of
the non-error paths in io_poll_check_events that do not do not return
IOU_POLL_NO_ACTION end up locking anyway. The only extraneous lock would
be for the multishot poll overflowing the CQE ring, however multishot poll
would probably benefit from being locked as it will allow completions to
be batched.

So it seems reasonable to lock always.

Signed-off-by: Dylan Yudaken <dylany@meta.com>
Link: https://lore.kernel.org/r/20221124093559.3780686-3-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/poll.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -360,11 +360,12 @@ static void io_apoll_task_func(struct io
 	if (ret == IOU_POLL_NO_ACTION)
 		return;
 
+	io_tw_lock(req->ctx, locked);
 	io_poll_remove_entries(req);
 	io_poll_tw_hash_eject(req, locked);
 
 	if (ret == IOU_POLL_REMOVE_POLL_USE_RES)
-		io_req_complete_post(req);
+		io_req_task_complete(req, locked);
 	else if (ret == IOU_POLL_DONE || ret == IOU_POLL_REISSUE)
 		io_req_task_submit(req, locked);
 	else



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 005/219] io_uring: revert "io_uring fix multishot accept ordering"
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 004/219] io_uring: always lock in io_apoll_task_func Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 006/219] io_uring/net: dont overflow multishot accept Greg Kroah-Hartman
                   ` (224 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Dylan Yudaken, Jens Axboe,
	Pavel Begunkov

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Begunkov <asml.silence@gmail.com>

From: Dylan Yudaken <dylany@meta.com>

[ upstream commit 515e26961295bee9da5e26916c27739dca6c10e1 ]

This is no longer needed after commit aa1df3a360a0 ("io_uring: fix CQE
reordering"), since all reordering is now taken care of.

This reverts commit cbd25748545c ("io_uring: fix multishot accept
ordering").

Signed-off-by: Dylan Yudaken <dylany@meta.com>
Link: https://lore.kernel.org/r/20221107125236.260132-2-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/net.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -1337,12 +1337,12 @@ retry:
 		return IOU_OK;
 	}
 
-	if (ret >= 0 &&
-	    io_post_aux_cqe(ctx, req->cqe.user_data, ret, IORING_CQE_F_MORE, false))
+	if (ret < 0)
+		return ret;
+	if (io_post_aux_cqe(ctx, req->cqe.user_data, ret, IORING_CQE_F_MORE, true))
 		goto retry;
 
-	io_req_set_res(req, ret, 0);
-	return (issue_flags & IO_URING_F_MULTISHOT) ? IOU_STOP_MULTISHOT : IOU_OK;
+	return -ECANCELED;
 }
 
 int io_socket_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 006/219] io_uring/net: dont overflow multishot accept
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 005/219] io_uring: revert "io_uring fix multishot accept ordering" Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 007/219] io_uring: break out of iowq iopoll on teardown Greg Kroah-Hartman
                   ` (223 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Pavel Begunkov, Jens Axboe

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Begunkov <asml.silence@gmail.com>

[ upstream commit 1bfed23349716a7811645336a7ce42c4b8f250bc ]

Don't allow overflowing multishot accept CQEs, we want to limit
the grows of the overflow list.

Cc: stable@vger.kernel.org
Fixes: 4e86a2c980137 ("io_uring: implement multishot mode for accept")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7d0d749649244873772623dd7747966f516fe6e2.1691757663.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/net.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -1339,7 +1339,7 @@ retry:
 
 	if (ret < 0)
 		return ret;
-	if (io_post_aux_cqe(ctx, req->cqe.user_data, ret, IORING_CQE_F_MORE, true))
+	if (io_post_aux_cqe(ctx, req->cqe.user_data, ret, IORING_CQE_F_MORE, false))
 		goto retry;
 
 	return -ECANCELED;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 007/219] io_uring: break out of iowq iopoll on teardown
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 006/219] io_uring/net: dont overflow multishot accept Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 008/219] io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used Greg Kroah-Hartman
                   ` (222 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Pavel Begunkov, Jens Axboe

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Begunkov <asml.silence@gmail.com>

[ upstream commit 45500dc4e01c167ee063f3dcc22f51ced5b2b1e9 ]

io-wq will retry iopoll even when it failed with -EAGAIN. If that
races with task exit, which sets TIF_NOTIFY_SIGNAL for all its workers,
such workers might potentially infinitely spin retrying iopoll again and
again and each time failing on some allocation / waiting / etc. Don't
keep spinning if io-wq is dying.

Fixes: 561fb04a6a225 ("io_uring: replace workqueue usage with io-wq")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/io-wq.c    |   10 ++++++++++
 io_uring/io-wq.h    |    1 +
 io_uring/io_uring.c |    2 ++
 3 files changed, 13 insertions(+)

--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -181,6 +181,16 @@ static void io_worker_ref_put(struct io_
 		complete(&wq->worker_done);
 }
 
+bool io_wq_worker_stopped(void)
+{
+	struct io_worker *worker = current->worker_private;
+
+	if (WARN_ON_ONCE(!io_wq_current_is_worker()))
+		return true;
+
+	return test_bit(IO_WQ_BIT_EXIT, &worker->wqe->wq->state);
+}
+
 static void io_worker_cancel_cb(struct io_worker *worker)
 {
 	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
--- a/io_uring/io-wq.h
+++ b/io_uring/io-wq.h
@@ -52,6 +52,7 @@ void io_wq_hash_work(struct io_wq_work *
 
 int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
 int io_wq_max_workers(struct io_wq *wq, int *new_count);
+bool io_wq_worker_stopped(void);
 
 static inline bool io_wq_is_hashed(struct io_wq_work *work)
 {
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1823,6 +1823,8 @@ fail:
 		if (!needs_poll) {
 			if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
 				break;
+			if (io_wq_worker_stopped())
+				break;
 			cond_resched();
 			continue;
 		}



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 008/219] io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 007/219] io_uring: break out of iowq iopoll on teardown Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 009/219] io_uring: Dont set affinity on a dying sqpoll thread Greg Kroah-Hartman
                   ` (221 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Jens Axboe, Pavel Begunkov

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Begunkov <asml.silence@gmail.com>

From: Jens Axboe <axboe@kernel.dk>

[ upstream commit ebdfefc09c6de7897962769bd3e63a2ff443ebf5 ]

If we setup the ring with SQPOLL, then that polling thread has its
own io-wq setup. This means that if the application uses
IORING_REGISTER_IOWQ_AFF to set the io-wq affinity, we should not be
setting it for the invoking task, but rather the sqpoll task.

Add an sqpoll helper that parks the thread and updates the affinity,
and use that one if we're using SQPOLL.

Fixes: fe76421d1da1 ("io_uring: allow user configurable IO thread CPU affinity")
Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/discussions/884
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/io-wq.c    |    7 +++++--
 io_uring/io-wq.h    |    2 +-
 io_uring/io_uring.c |   29 ++++++++++++++++++-----------
 io_uring/sqpoll.c   |   15 +++++++++++++++
 io_uring/sqpoll.h   |    1 +
 5 files changed, 40 insertions(+), 14 deletions(-)

--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -1350,13 +1350,16 @@ static int io_wq_cpu_offline(unsigned in
 	return __io_wq_cpu_online(wq, cpu, false);
 }
 
-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
+int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask)
 {
 	int i;
 
+	if (!tctx || !tctx->io_wq)
+		return -EINVAL;
+
 	rcu_read_lock();
 	for_each_node(i) {
-		struct io_wqe *wqe = wq->wqes[i];
+		struct io_wqe *wqe = tctx->io_wq->wqes[i];
 
 		if (mask)
 			cpumask_copy(wqe->cpu_mask, mask);
--- a/io_uring/io-wq.h
+++ b/io_uring/io-wq.h
@@ -50,7 +50,7 @@ void io_wq_put_and_exit(struct io_wq *wq
 void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
 void io_wq_hash_work(struct io_wq_work *work, void *val);
 
-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
+int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask);
 int io_wq_max_workers(struct io_wq *wq, int *new_count);
 bool io_wq_worker_stopped(void);
 
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3835,16 +3835,28 @@ static int io_register_enable_rings(stru
 	return 0;
 }
 
+static __cold int __io_register_iowq_aff(struct io_ring_ctx *ctx,
+					 cpumask_var_t new_mask)
+{
+	int ret;
+
+	if (!(ctx->flags & IORING_SETUP_SQPOLL)) {
+		ret = io_wq_cpu_affinity(current->io_uring, new_mask);
+	} else {
+		mutex_unlock(&ctx->uring_lock);
+		ret = io_sqpoll_wq_cpu_affinity(ctx, new_mask);
+		mutex_lock(&ctx->uring_lock);
+	}
+
+	return ret;
+}
+
 static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
 				       void __user *arg, unsigned len)
 {
-	struct io_uring_task *tctx = current->io_uring;
 	cpumask_var_t new_mask;
 	int ret;
 
-	if (!tctx || !tctx->io_wq)
-		return -EINVAL;
-
 	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
 		return -ENOMEM;
 
@@ -3865,19 +3877,14 @@ static __cold int io_register_iowq_aff(s
 		return -EFAULT;
 	}
 
-	ret = io_wq_cpu_affinity(tctx->io_wq, new_mask);
+	ret = __io_register_iowq_aff(ctx, new_mask);
 	free_cpumask_var(new_mask);
 	return ret;
 }
 
 static __cold int io_unregister_iowq_aff(struct io_ring_ctx *ctx)
 {
-	struct io_uring_task *tctx = current->io_uring;
-
-	if (!tctx || !tctx->io_wq)
-		return -EINVAL;
-
-	return io_wq_cpu_affinity(tctx->io_wq, NULL);
+	return __io_register_iowq_aff(ctx, NULL);
 }
 
 static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -423,3 +423,18 @@ err:
 	io_sq_thread_finish(ctx);
 	return ret;
 }
+
+__cold int io_sqpoll_wq_cpu_affinity(struct io_ring_ctx *ctx,
+				     cpumask_var_t mask)
+{
+	struct io_sq_data *sqd = ctx->sq_data;
+	int ret = -EINVAL;
+
+	if (sqd) {
+		io_sq_thread_park(sqd);
+		ret = io_wq_cpu_affinity(sqd->thread->io_uring, mask);
+		io_sq_thread_unpark(sqd);
+	}
+
+	return ret;
+}
--- a/io_uring/sqpoll.h
+++ b/io_uring/sqpoll.h
@@ -27,3 +27,4 @@ void io_sq_thread_park(struct io_sq_data
 void io_sq_thread_unpark(struct io_sq_data *sqd);
 void io_put_sq_data(struct io_sq_data *sqd);
 int io_sqpoll_wait_sq(struct io_ring_ctx *ctx);
+int io_sqpoll_wq_cpu_affinity(struct io_ring_ctx *ctx, cpumask_var_t mask);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 009/219] io_uring: Dont set affinity on a dying sqpoll thread
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 008/219] io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 010/219] drm/virtio: Conditionally allocate virtio_gpu_fence Greg Kroah-Hartman
                   ` (220 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot+c74fea926a78b8a91042,
	Gabriel Krisman Bertazi, Jens Axboe, Pavel Begunkov

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Begunkov <asml.silence@gmail.com>

From: Gabriel Krisman Bertazi <krisman@suse.de>

[ upstream commit bd6fc5da4c51107e1e0cec4a3a07963d1dae2c84 ]

Syzbot reported a null-ptr-deref of sqd->thread inside
io_sqpoll_wq_cpu_affinity.  It turns out the sqd->thread can go away
from under us during io_uring_register, in case the process gets a
fatal signal during io_uring_register.

It is not particularly hard to hit the race, and while I am not sure
this is the exact case hit by syzbot, it solves it.  Finally, checking
->thread is enough to close the race because we locked sqd while
"parking" the thread, thus preventing it from going away.

I reproduced it fairly consistently with a program that does:

int main(void) {
  ...
  io_uring_queue_init(RING_LEN, &ring1, IORING_SETUP_SQPOLL);
  while (1) {
    io_uring_register_iowq_aff(ring, 1, &mask);
  }
}

Executed in a loop with timeout to trigger SIGTERM:
  while true; do timeout 1 /a.out ; done

This will hit the following BUG() in very few attempts.

BUG: kernel NULL pointer dereference, address: 00000000000007a8
PGD 800000010e949067 P4D 800000010e949067 PUD 10e46e067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 15715 Comm: dead-sqpoll Not tainted 6.5.0-rc7-next-20230825-g193296236fa0-dirty #23
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:io_sqpoll_wq_cpu_affinity+0x27/0x70
Code: 90 90 90 0f 1f 44 00 00 55 53 48 8b 9f 98 03 00 00 48 85 db 74 4f
48 89 df 48 89 f5 e8 e2 f8 ff ff 48 8b 43 38 48 85 c0 74 22 <48> 8b b8
a8 07 00 00 48 89 ee e8 ba b1 00 00 48 89 df 89 c5 e8 70
RSP: 0018:ffffb04040ea7e70 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff93c010749e40 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffffa7653331 RDI: 00000000ffffffff
RBP: ffffb04040ea7eb8 R08: 0000000000000000 R09: c0000000ffffdfff
R10: ffff93c01141b600 R11: ffffb04040ea7d18 R12: ffff93c00ea74840
R13: 0000000000000011 R14: 0000000000000000 R15: ffff93c00ea74800
FS:  00007fb7c276ab80(0000) GS:ffff93c36f200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000007a8 CR3: 0000000111634003 CR4: 0000000000370ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? __die_body+0x1a/0x60
 ? page_fault_oops+0x154/0x440
 ? do_user_addr_fault+0x174/0x7b0
 ? exc_page_fault+0x63/0x140
 ? asm_exc_page_fault+0x22/0x30
 ? io_sqpoll_wq_cpu_affinity+0x27/0x70
 __io_register_iowq_aff+0x2b/0x60
 __io_uring_register+0x614/0xa70
 __x64_sys_io_uring_register+0xaa/0x1a0
 do_syscall_64+0x3a/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7fb7c226fec9
Code: 2e 00 b8 ca 00 00 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d 97 7f 2d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe2c0674f8 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb7c226fec9
RDX: 00007ffe2c067530 RSI: 0000000000000011 RDI: 0000000000000003
RBP: 00007ffe2c0675d0 R08: 00007ffe2c067550 R09: 00007ffe2c067550
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe2c067750 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
CR2: 00000000000007a8
---[ end trace 0000000000000000 ]---

Reported-by: syzbot+c74fea926a78b8a91042@syzkaller.appspotmail.com
Fixes: ebdfefc09c6d ("io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://lore.kernel.org/r/87v8cybuo6.fsf@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 io_uring/sqpoll.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -432,7 +432,9 @@ __cold int io_sqpoll_wq_cpu_affinity(str
 
 	if (sqd) {
 		io_sq_thread_park(sqd);
-		ret = io_wq_cpu_affinity(sqd->thread->io_uring, mask);
+		/* Don't set affinity for a dying thread */
+		if (sqd->thread)
+			ret = io_wq_cpu_affinity(sqd->thread->io_uring, mask);
 		io_sq_thread_unpark(sqd);
 	}
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 010/219] drm/virtio: Conditionally allocate virtio_gpu_fence
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 009/219] io_uring: Dont set affinity on a dying sqpoll thread Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 011/219] scsi: qla2xxx: Adjust IOCB resource on qpair create Greg Kroah-Hartman
                   ` (219 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Gurchetan Singh, Dmitry Osipenko,
	Alyssa Ross

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gurchetan Singh <gurchetansingh@chromium.org>

commit 70d1ace56db6c79d39dbe9c0d5244452b67e2fde upstream.

We don't want to create a fence for every command submission.  It's
only necessary when userspace provides a waitable token for submission.
This could be:

1) bo_handles, to be used with VIRTGPU_WAIT
2) out_fence_fd, to be used with dma_fence apis
3) a ring_idx provided with VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK
   + DRM event API
4) syncobjs in the future

The use case for just submitting a command to the host, and expecting
no response.  For example, gfxstream has GFXSTREAM_CONTEXT_PING that
just wakes up the host side worker threads.  There's also
CROSS_DOMAIN_CMD_SEND which just sends data to the Wayland server.

This prevents the need to signal the automatically created
virtio_gpu_fence.

In addition, VIRTGPU_EXECBUF_RING_IDX is checked when creating a
DRM event object.  VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK is
already defined in terms of per-context rings.  It was theoretically
possible to create a DRM event on the global timeline (ring_idx == 0),
if the context enabled DRM event polling.  However, that wouldn't
work and userspace (Sommelier).  Explicitly disallow it for
clarity.

Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> # edited coding style
Link: https://patchwork.freedesktop.org/patch/msgid/20230707213124.494-1-gurchetansingh@chromium.org
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/virtio/virtgpu_ioctl.c |   30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -43,13 +43,9 @@ static int virtio_gpu_fence_event_create
 					 struct virtio_gpu_fence *fence,
 					 uint32_t ring_idx)
 {
-	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
 	struct virtio_gpu_fence_event *e = NULL;
 	int ret;
 
-	if (!(vfpriv->ring_idx_mask & BIT_ULL(ring_idx)))
-		return 0;
-
 	e = kzalloc(sizeof(*e), GFP_KERNEL);
 	if (!e)
 		return -ENOMEM;
@@ -121,6 +117,7 @@ static int virtio_gpu_execbuffer_ioctl(s
 	struct virtio_gpu_device *vgdev = dev->dev_private;
 	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
 	struct virtio_gpu_fence *out_fence;
+	bool drm_fence_event;
 	int ret;
 	uint32_t *bo_handles = NULL;
 	void __user *user_bo_handles = NULL;
@@ -216,15 +213,24 @@ static int virtio_gpu_execbuffer_ioctl(s
 			goto out_memdup;
 	}
 
-	out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
-	if(!out_fence) {
-		ret = -ENOMEM;
-		goto out_unresv;
-	}
+	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX) &&
+	    (vfpriv->ring_idx_mask & BIT_ULL(ring_idx)))
+		drm_fence_event = true;
+	else
+		drm_fence_event = false;
+
+	if ((exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_OUT) ||
+	    exbuf->num_bo_handles ||
+	    drm_fence_event)
+		out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
+	else
+		out_fence = NULL;
 
-	ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx);
-	if (ret)
-		goto out_unresv;
+	if (drm_fence_event) {
+		ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx);
+		if (ret)
+			goto out_unresv;
+	}
 
 	if (out_fence_fd >= 0) {
 		sync_file = sync_file_create(&out_fence->f);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 011/219] scsi: qla2xxx: Adjust IOCB resource on qpair create
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (9 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 010/219] drm/virtio: Conditionally allocate virtio_gpu_fence Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 012/219] scsi: qla2xxx: Limit TMF to 8 per function Greg Kroah-Hartman
                   ` (218 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit efa74a62aaa2429c04fe6cb277b3bf6739747d86 upstream.

During NVMe queue creation, a new qpair is created. FW resource limit needs
to be re-adjusted to take into account the new qpair. Otherwise, NVMe
command can not go through.  This issue was discovered while
testing/forcing FW execution to fail at load time.

Add call to readjust IOCB and exchange limit.

In addition, get FW state command and require FW to be running. Otherwise,
error is generated.

Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_gbl.h  |    1 
 drivers/scsi/qla2xxx/qla_init.c |   52 +++++++++++++++++++++++++---------------
 drivers/scsi/qla2xxx/qla_mbx.c  |    3 ++
 drivers/scsi/qla2xxx/qla_nvme.c |    1 
 4 files changed, 38 insertions(+), 19 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -143,6 +143,7 @@ void qla_edif_sess_down(struct scsi_qla_
 void qla_edif_clear_appdata(struct scsi_qla_host *vha,
 			    struct fc_port *fcport);
 const char *sc_to_str(uint16_t cmd);
+void qla_adjust_iocb_limit(scsi_qla_host_t *vha);
 
 /*
  * Global Data in qla_os.c source file.
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -4148,41 +4148,55 @@ out:
 	return ha->flags.lr_detected;
 }
 
-void qla_init_iocb_limit(scsi_qla_host_t *vha)
+static void __qla_adjust_iocb_limit(struct qla_qpair *qpair)
 {
-	u16 i, num_qps;
-	u32 limit;
-	struct qla_hw_data *ha = vha->hw;
+	u8 num_qps;
+	u16 limit;
+	struct qla_hw_data *ha = qpair->vha->hw;
 
 	num_qps = ha->num_qpairs + 1;
 	limit = (ha->orig_fw_iocb_count * QLA_IOCB_PCT_LIMIT) / 100;
 
-	ha->base_qpair->fwres.iocbs_total = ha->orig_fw_iocb_count;
-	ha->base_qpair->fwres.iocbs_limit = limit;
-	ha->base_qpair->fwres.iocbs_qp_limit = limit / num_qps;
-	ha->base_qpair->fwres.iocbs_used = 0;
+	qpair->fwres.iocbs_total = ha->orig_fw_iocb_count;
+	qpair->fwres.iocbs_limit = limit;
+	qpair->fwres.iocbs_qp_limit = limit / num_qps;
+
+	qpair->fwres.exch_total = ha->orig_fw_xcb_count;
+	qpair->fwres.exch_limit = (ha->orig_fw_xcb_count *
+				   QLA_IOCB_PCT_LIMIT) / 100;
+}
 
-	ha->base_qpair->fwres.exch_total = ha->orig_fw_xcb_count;
-	ha->base_qpair->fwres.exch_limit = (ha->orig_fw_xcb_count *
-					    QLA_IOCB_PCT_LIMIT) / 100;
+void qla_init_iocb_limit(scsi_qla_host_t *vha)
+{
+	u8 i;
+	struct qla_hw_data *ha = vha->hw;
+
+	 __qla_adjust_iocb_limit(ha->base_qpair);
+	ha->base_qpair->fwres.iocbs_used = 0;
 	ha->base_qpair->fwres.exch_used  = 0;
 
 	for (i = 0; i < ha->max_qpairs; i++) {
 		if (ha->queue_pair_map[i])  {
-			ha->queue_pair_map[i]->fwres.iocbs_total =
-				ha->orig_fw_iocb_count;
-			ha->queue_pair_map[i]->fwres.iocbs_limit = limit;
-			ha->queue_pair_map[i]->fwres.iocbs_qp_limit =
-				limit / num_qps;
+			__qla_adjust_iocb_limit(ha->queue_pair_map[i]);
 			ha->queue_pair_map[i]->fwres.iocbs_used = 0;
-			ha->queue_pair_map[i]->fwres.exch_total = ha->orig_fw_xcb_count;
-			ha->queue_pair_map[i]->fwres.exch_limit =
-				(ha->orig_fw_xcb_count * QLA_IOCB_PCT_LIMIT) / 100;
 			ha->queue_pair_map[i]->fwres.exch_used = 0;
 		}
 	}
 }
 
+void qla_adjust_iocb_limit(scsi_qla_host_t *vha)
+{
+	u8 i;
+	struct qla_hw_data *ha = vha->hw;
+
+	__qla_adjust_iocb_limit(ha->base_qpair);
+
+	for (i = 0; i < ha->max_qpairs; i++) {
+		if (ha->queue_pair_map[i])
+			__qla_adjust_iocb_limit(ha->queue_pair_map[i]);
+	}
+}
+
 /**
  * qla2x00_setup_chip() - Load and start RISC firmware.
  * @vha: HA context
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -2213,6 +2213,9 @@ qla2x00_get_firmware_state(scsi_qla_host
 	ql_dbg(ql_dbg_mbx + ql_dbg_verbose, vha, 0x1054,
 	    "Entered %s.\n", __func__);
 
+	if (!ha->flags.fw_started)
+		return QLA_FUNCTION_FAILED;
+
 	mcp->mb[0] = MBC_GET_FIRMWARE_STATE;
 	mcp->out_mb = MBX_0;
 	if (IS_FWI2_CAPABLE(vha->hw))
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -132,6 +132,7 @@ static int qla_nvme_alloc_queue(struct n
 			       "Failed to allocate qpair\n");
 			return -EINVAL;
 		}
+		qla_adjust_iocb_limit(vha);
 	}
 	*handle = qpair;
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 012/219] scsi: qla2xxx: Limit TMF to 8 per function
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 011/219] scsi: qla2xxx: Adjust IOCB resource on qpair create Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 013/219] scsi: qla2xxx: Fix deletion race condition Greg Kroah-Hartman
                   ` (217 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit a8ec192427e0516436e61f9ca9eb49c54eadfe0a upstream.

Per FW recommendation, 8 TMF's can be outstanding for each
function. Previously, it allowed 8 per target.

Limit TMF to 8 per function.

Cc: stable@vger.kernel.org
Fixes: 6a87679626b5 ("scsi: qla2xxx: Fix task management cmd fail due to unavailable resource")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_def.h  |    9 +++---
 drivers/scsi/qla2xxx/qla_init.c |   55 ++++++++++++++++++++++++----------------
 drivers/scsi/qla2xxx/qla_os.c   |    2 +
 3 files changed, 41 insertions(+), 25 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -458,6 +458,7 @@ static inline be_id_t port_id_to_be_id(p
 }
 
 struct tmf_arg {
+	struct list_head tmf_elem;
 	struct qla_qpair *qpair;
 	struct fc_port *fcport;
 	struct scsi_qla_host *vha;
@@ -2534,7 +2535,6 @@ enum rscn_addr_format {
 typedef struct fc_port {
 	struct list_head list;
 	struct scsi_qla_host *vha;
-	struct list_head tmf_pending;
 
 	unsigned int conf_compl_supported:1;
 	unsigned int deleted:2;
@@ -2555,9 +2555,6 @@ typedef struct fc_port {
 	unsigned int do_prli_nvme:1;
 
 	uint8_t nvme_flag;
-	uint8_t active_tmf;
-#define MAX_ACTIVE_TMF 8
-
 	uint8_t node_name[WWN_SIZE];
 	uint8_t port_name[WWN_SIZE];
 	port_id_t d_id;
@@ -4640,6 +4637,8 @@ struct qla_hw_data {
 		uint32_t	flt_region_aux_img_status_sec;
 	};
 	uint8_t         active_image;
+	uint8_t active_tmf;
+#define MAX_ACTIVE_TMF 8
 
 	/* Needed for BEACON */
 	uint16_t        beacon_blink_led;
@@ -4654,6 +4653,8 @@ struct qla_hw_data {
 
 	struct qla_msix_entry *msix_entries;
 
+	struct list_head tmf_pending;
+	struct list_head tmf_active;
 	struct list_head        vp_list;        /* list of VP */
 	unsigned long   vp_idx_map[(MAX_MULTI_ID_FABRIC / 8) /
 			sizeof(unsigned long)];
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2187,30 +2187,42 @@ done:
 	return rval;
 }
 
-static void qla_put_tmf(fc_port_t *fcport)
+static void qla_put_tmf(struct tmf_arg *arg)
 {
-	struct scsi_qla_host *vha = fcport->vha;
+	struct scsi_qla_host *vha = arg->vha;
 	struct qla_hw_data *ha = vha->hw;
 	unsigned long flags;
 
 	spin_lock_irqsave(&ha->tgt.sess_lock, flags);
-	fcport->active_tmf--;
+	ha->active_tmf--;
+	list_del(&arg->tmf_elem);
 	spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
 }
 
 static
-int qla_get_tmf(fc_port_t *fcport)
+int qla_get_tmf(struct tmf_arg *arg)
 {
-	struct scsi_qla_host *vha = fcport->vha;
+	struct scsi_qla_host *vha = arg->vha;
 	struct qla_hw_data *ha = vha->hw;
 	unsigned long flags;
+	fc_port_t *fcport = arg->fcport;
 	int rc = 0;
-	LIST_HEAD(tmf_elem);
+	struct tmf_arg *t;
 
 	spin_lock_irqsave(&ha->tgt.sess_lock, flags);
-	list_add_tail(&tmf_elem, &fcport->tmf_pending);
+	list_for_each_entry(t, &ha->tmf_active, tmf_elem) {
+		if (t->fcport == arg->fcport && t->lun == arg->lun) {
+			/* reject duplicate TMF */
+			ql_log(ql_log_warn, vha, 0x802c,
+			       "found duplicate TMF.  Nexus=%ld:%06x:%llu.\n",
+			       vha->host_no, fcport->d_id.b24, arg->lun);
+			spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
+			return -EINVAL;
+		}
+	}
 
-	while (fcport->active_tmf >= MAX_ACTIVE_TMF) {
+	list_add_tail(&arg->tmf_elem, &ha->tmf_pending);
+	while (ha->active_tmf >= MAX_ACTIVE_TMF) {
 		spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
 
 		msleep(1);
@@ -2222,15 +2234,17 @@ int qla_get_tmf(fc_port_t *fcport)
 			rc = EIO;
 			break;
 		}
-		if (fcport->active_tmf < MAX_ACTIVE_TMF &&
-		    list_is_first(&tmf_elem, &fcport->tmf_pending))
+		if (ha->active_tmf < MAX_ACTIVE_TMF &&
+		    list_is_first(&arg->tmf_elem, &ha->tmf_pending))
 			break;
 	}
 
-	list_del(&tmf_elem);
+	list_del(&arg->tmf_elem);
 
-	if (!rc)
-		fcport->active_tmf++;
+	if (!rc) {
+		ha->active_tmf++;
+		list_add_tail(&arg->tmf_elem, &ha->tmf_active);
+	}
 
 	spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
 
@@ -2252,15 +2266,18 @@ qla2x00_async_tm_cmd(fc_port_t *fcport,
 	a.vha = fcport->vha;
 	a.fcport = fcport;
 	a.lun = lun;
+	a.flags = flags;
+	INIT_LIST_HEAD(&a.tmf_elem);
+
 	if (flags & (TCF_LUN_RESET|TCF_ABORT_TASK_SET|TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) {
 		a.modifier = MK_SYNC_ID_LUN;
-
-		if (qla_get_tmf(fcport))
-			return QLA_FUNCTION_FAILED;
 	} else {
 		a.modifier = MK_SYNC_ID;
 	}
 
+	if (qla_get_tmf(&a))
+		return QLA_FUNCTION_FAILED;
+
 	if (vha->hw->mqenable) {
 		for (i = 0; i < vha->hw->num_qpairs; i++) {
 			qpair = vha->hw->queue_pair_map[i];
@@ -2286,13 +2303,10 @@ qla2x00_async_tm_cmd(fc_port_t *fcport,
 		goto bailout;
 
 	a.qpair = vha->hw->base_qpair;
-	a.flags = flags;
 	rval = __qla2x00_async_tm_cmd(&a);
 
 bailout:
-	if (a.modifier == MK_SYNC_ID_LUN)
-		qla_put_tmf(fcport);
-
+	qla_put_tmf(&a);
 	return rval;
 }
 
@@ -5542,7 +5556,6 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vh
 	INIT_WORK(&fcport->reg_work, qla_register_fcport_fn);
 	INIT_LIST_HEAD(&fcport->gnl_entry);
 	INIT_LIST_HEAD(&fcport->list);
-	INIT_LIST_HEAD(&fcport->tmf_pending);
 
 	INIT_LIST_HEAD(&fcport->sess_cmd_list);
 	spin_lock_init(&fcport->sess_cmd_lock);
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3002,6 +3002,8 @@ qla2x00_probe_one(struct pci_dev *pdev,
 	atomic_set(&ha->num_pend_mbx_stage3, 0);
 	atomic_set(&ha->zio_threshold, DEFAULT_ZIO_THRESHOLD);
 	ha->last_zio_threshold = DEFAULT_ZIO_THRESHOLD;
+	INIT_LIST_HEAD(&ha->tmf_pending);
+	INIT_LIST_HEAD(&ha->tmf_active);
 
 	/* Assign ISP specific operations. */
 	if (IS_QLA2100(ha)) {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 013/219] scsi: qla2xxx: Fix deletion race condition
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (11 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 012/219] scsi: qla2xxx: Limit TMF to 8 per function Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 014/219] scsi: qla2xxx: fix inconsistent TMF timeout Greg Kroah-Hartman
                   ` (216 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 6dfe4344c168c6ca20fe7640649aacfcefcccb26 upstream.

System crash when using debug kernel due to link list corruption. The cause
of the link list corruption is due to session deletion was allowed to queue
up twice.  Here's the internal trace that show the same port was allowed to
double queue for deletion on different cpu.

20808683956 015 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1
20808683957 027 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1

Move the clearing/setting of deleted flag lock.

Cc: stable@vger.kernel.org
Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c   |   16 ++++++++++++++--
 drivers/scsi/qla2xxx/qla_target.c |   14 +++++++-------
 2 files changed, 21 insertions(+), 9 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -502,6 +502,7 @@ static
 void qla24xx_handle_adisc_event(scsi_qla_host_t *vha, struct event_arg *ea)
 {
 	struct fc_port *fcport = ea->fcport;
+	unsigned long flags;
 
 	ql_dbg(ql_dbg_disc, vha, 0x20d2,
 	    "%s %8phC DS %d LS %d rc %d login %d|%d rscn %d|%d lid %d\n",
@@ -516,9 +517,15 @@ void qla24xx_handle_adisc_event(scsi_qla
 		ql_dbg(ql_dbg_disc, vha, 0x2066,
 		    "%s %8phC: adisc fail: post delete\n",
 		    __func__, ea->fcport->port_name);
+
+		spin_lock_irqsave(&vha->work_lock, flags);
 		/* deleted = 0 & logout_on_delete = force fw cleanup */
-		fcport->deleted = 0;
+		if (fcport->deleted == QLA_SESS_DELETED)
+			fcport->deleted = 0;
+
 		fcport->logout_on_delete = 1;
+		spin_unlock_irqrestore(&vha->work_lock, flags);
+
 		qlt_schedule_sess_for_deletion(ea->fcport);
 		return;
 	}
@@ -1440,7 +1447,6 @@ void __qla24xx_handle_gpdb_event(scsi_ql
 
 	spin_lock_irqsave(&vha->hw->tgt.sess_lock, flags);
 	ea->fcport->login_gen++;
-	ea->fcport->deleted = 0;
 	ea->fcport->logout_on_delete = 1;
 
 	if (!ea->fcport->login_succ && !IS_SW_RESV_ADDR(ea->fcport->d_id)) {
@@ -6143,6 +6149,8 @@ qla2x00_reg_remote_port(scsi_qla_host_t
 void
 qla2x00_update_fcport(scsi_qla_host_t *vha, fc_port_t *fcport)
 {
+	unsigned long flags;
+
 	if (IS_SW_RESV_ADDR(fcport->d_id))
 		return;
 
@@ -6152,7 +6160,11 @@ qla2x00_update_fcport(scsi_qla_host_t *v
 	qla2x00_set_fcport_disc_state(fcport, DSC_UPD_FCPORT);
 	fcport->login_retry = vha->hw->login_retry_count;
 	fcport->flags &= ~(FCF_LOGIN_NEEDED | FCF_ASYNC_SENT);
+
+	spin_lock_irqsave(&vha->work_lock, flags);
 	fcport->deleted = 0;
+	spin_unlock_irqrestore(&vha->work_lock, flags);
+
 	if (vha->hw->current_topology == ISP_CFG_NL)
 		fcport->logout_on_delete = 0;
 	else
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1085,10 +1085,6 @@ void qlt_free_session_done(struct work_s
 			(struct imm_ntfy_from_isp *)sess->iocb, SRB_NACK_LOGO);
 	}
 
-	spin_lock_irqsave(&vha->work_lock, flags);
-	sess->flags &= ~FCF_ASYNC_SENT;
-	spin_unlock_irqrestore(&vha->work_lock, flags);
-
 	spin_lock_irqsave(&ha->tgt.sess_lock, flags);
 	if (sess->se_sess) {
 		sess->se_sess = NULL;
@@ -1098,7 +1094,6 @@ void qlt_free_session_done(struct work_s
 
 	qla2x00_set_fcport_disc_state(sess, DSC_DELETED);
 	sess->fw_login_state = DSC_LS_PORT_UNAVAIL;
-	sess->deleted = QLA_SESS_DELETED;
 
 	if (sess->login_succ && !IS_SW_RESV_ADDR(sess->d_id)) {
 		vha->fcport_count--;
@@ -1150,10 +1145,15 @@ void qlt_free_session_done(struct work_s
 
 	sess->explicit_logout = 0;
 	spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
-	sess->free_pending = 0;
 
 	qla2x00_dfs_remove_rport(vha, sess);
 
+	spin_lock_irqsave(&vha->work_lock, flags);
+	sess->flags &= ~FCF_ASYNC_SENT;
+	sess->deleted = QLA_SESS_DELETED;
+	sess->free_pending = 0;
+	spin_unlock_irqrestore(&vha->work_lock, flags);
+
 	ql_dbg(ql_dbg_disc, vha, 0xf001,
 	    "Unregistration of sess %p %8phC finished fcp_cnt %d\n",
 		sess, sess->port_name, vha->fcport_count);
@@ -1202,12 +1202,12 @@ void qlt_unreg_sess(struct fc_port *sess
 	 * management from being sent.
 	 */
 	sess->flags |= FCF_ASYNC_SENT;
+	sess->deleted = QLA_SESS_DELETION_IN_PROGRESS;
 	spin_unlock_irqrestore(&sess->vha->work_lock, flags);
 
 	if (sess->se_sess)
 		vha->hw->tgt.tgt_ops->clear_nacl_from_fcport_map(sess);
 
-	sess->deleted = QLA_SESS_DELETION_IN_PROGRESS;
 	qla2x00_set_fcport_disc_state(sess, DSC_DELETE_PEND);
 	sess->last_rscn_gen = sess->rscn_gen;
 	sess->last_login_gen = sess->login_gen;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 014/219] scsi: qla2xxx: fix inconsistent TMF timeout
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (12 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 013/219] scsi: qla2xxx: Fix deletion race condition Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 015/219] scsi: qla2xxx: Fix command flush during TMF Greg Kroah-Hartman
                   ` (215 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 009e7fe4a1ed52276b332842a6b6e23b07200f2d upstream.

Different behavior were experienced of session being torn down vs not when
TMF is timed out. When FW detects the time out, the session is torn down.
When driver detects the time out, the session is not torn down.

Allow TMF error to return to upper layer without session tear down.

Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_isr.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -2539,7 +2539,6 @@ qla24xx_tm_iocb_entry(scsi_qla_host_t *v
 	case CS_PORT_BUSY:
 	case CS_INCOMPLETE:
 	case CS_PORT_UNAVAILABLE:
-	case CS_TIMEOUT:
 	case CS_RESET:
 		if (atomic_read(&fcport->state) == FCS_ONLINE) {
 			ql_dbg(ql_dbg_disc, fcport->vha, 0x3021,



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 015/219] scsi: qla2xxx: Fix command flush during TMF
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (13 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 014/219] scsi: qla2xxx: fix inconsistent TMF timeout Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 016/219] scsi: qla2xxx: Fix erroneous link up failure Greg Kroah-Hartman
                   ` (214 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit da7c21b72aa86e990af5f73bce6590b8d8d148d0 upstream.

For each TMF request, driver iterates through each qpair and flushes
commands associated to the TMF. At the end of the qpair flush, a Marker is
used to complete the flush transaction. This process was repeated for each
qpair. The multiple flush and marker for this TMF request seems to cause
confusion for FW.

Instead, 1 flush is sent to FW. Driver would wait for FW to go through all
the I/Os on each qpair to be read then return. Driver then closes out the
transaction with a Marker.

Cc: stable@vger.kernel.org
Fixes: d90171dd0da5 ("scsi: qla2xxx: Multi-que support for TMF")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c |   74 +++++++++++++++++++++-------------------
 drivers/scsi/qla2xxx/qla_iocb.c |    1 
 drivers/scsi/qla2xxx/qla_os.c   |    9 ++--
 3 files changed, 45 insertions(+), 39 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2003,12 +2003,11 @@ qla2x00_tmf_iocb_timeout(void *data)
 	int rc, h;
 	unsigned long flags;
 
-	if (sp->type == SRB_MARKER) {
-		complete(&tmf->u.tmf.comp);
-		return;
-	}
+	if (sp->type == SRB_MARKER)
+		rc = QLA_FUNCTION_FAILED;
+	else
+		rc = qla24xx_async_abort_cmd(sp, false);
 
-	rc = qla24xx_async_abort_cmd(sp, false);
 	if (rc) {
 		spin_lock_irqsave(sp->qpair->qp_lock_ptr, flags);
 		for (h = 1; h < sp->qpair->req->num_outstanding_cmds; h++) {
@@ -2130,6 +2129,17 @@ static void qla2x00_tmf_sp_done(srb_t *s
 	complete(&tmf->u.tmf.comp);
 }
 
+static int qla_tmf_wait(struct tmf_arg *arg)
+{
+	/* there are only 2 types of error handling that reaches here, lun or target reset */
+	if (arg->flags & (TCF_LUN_RESET | TCF_ABORT_TASK_SET | TCF_CLEAR_TASK_SET))
+		return qla2x00_eh_wait_for_pending_commands(arg->vha,
+		    arg->fcport->d_id.b24, arg->lun, WAIT_LUN);
+	else
+		return qla2x00_eh_wait_for_pending_commands(arg->vha,
+		    arg->fcport->d_id.b24, arg->lun, WAIT_TARGET);
+}
+
 static int
 __qla2x00_async_tm_cmd(struct tmf_arg *arg)
 {
@@ -2137,8 +2147,9 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
 	struct srb_iocb *tm_iocb;
 	srb_t *sp;
 	int rval = QLA_FUNCTION_FAILED;
-
 	fc_port_t *fcport = arg->fcport;
+	u32 chip_gen, login_gen;
+	u64 jif;
 
 	if (TMF_NOT_READY(arg->fcport)) {
 		ql_dbg(ql_dbg_taskm, vha, 0x8032,
@@ -2183,8 +2194,27 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
 		    "TM IOCB failed (%x).\n", rval);
 	}
 
-	if (!test_bit(UNLOADING, &vha->dpc_flags) && !IS_QLAFX00(vha->hw))
-		rval = qla26xx_marker(arg);
+	if (!test_bit(UNLOADING, &vha->dpc_flags) && !IS_QLAFX00(vha->hw)) {
+		chip_gen = vha->hw->chip_reset;
+		login_gen = fcport->login_gen;
+
+		jif = jiffies;
+		if (qla_tmf_wait(arg)) {
+			ql_log(ql_log_info, vha, 0x803e,
+			       "Waited %u ms Nexus=%ld:%06x:%llu.\n",
+			       jiffies_to_msecs(jiffies - jif), vha->host_no,
+			       fcport->d_id.b24, arg->lun);
+		}
+
+		if (chip_gen == vha->hw->chip_reset && login_gen == fcport->login_gen) {
+			rval = qla26xx_marker(arg);
+		} else {
+			ql_log(ql_log_info, vha, 0x803e,
+			       "Skip Marker due to disruption. Nexus=%ld:%06x:%llu.\n",
+			       vha->host_no, fcport->d_id.b24, arg->lun);
+			rval = QLA_FUNCTION_FAILED;
+		}
+	}
 
 done_free_sp:
 	/* ref: INIT */
@@ -2262,9 +2292,8 @@ qla2x00_async_tm_cmd(fc_port_t *fcport,
 		     uint32_t tag)
 {
 	struct scsi_qla_host *vha = fcport->vha;
-	struct qla_qpair *qpair;
 	struct tmf_arg a;
-	int i, rval = QLA_SUCCESS;
+	int rval = QLA_SUCCESS;
 
 	if (TMF_NOT_READY(fcport))
 		return QLA_SUSPENDED;
@@ -2284,34 +2313,9 @@ qla2x00_async_tm_cmd(fc_port_t *fcport,
 	if (qla_get_tmf(&a))
 		return QLA_FUNCTION_FAILED;
 
-	if (vha->hw->mqenable) {
-		for (i = 0; i < vha->hw->num_qpairs; i++) {
-			qpair = vha->hw->queue_pair_map[i];
-			if (!qpair)
-				continue;
-
-			if (TMF_NOT_READY(fcport)) {
-				ql_log(ql_log_warn, vha, 0x8026,
-				    "Unable to send TM due to disruption.\n");
-				rval = QLA_SUSPENDED;
-				break;
-			}
-
-			a.qpair = qpair;
-			a.flags = flags|TCF_NOTMCMD_TO_TARGET;
-			rval = __qla2x00_async_tm_cmd(&a);
-			if (rval)
-				break;
-		}
-	}
-
-	if (rval)
-		goto bailout;
-
 	a.qpair = vha->hw->base_qpair;
 	rval = __qla2x00_async_tm_cmd(&a);
 
-bailout:
 	qla_put_tmf(&a);
 	return rval;
 }
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -3887,6 +3887,7 @@ qla_marker_iocb(srb_t *sp, struct mrk_en
 {
 	mrk->entry_type = MARKER_TYPE;
 	mrk->modifier = sp->u.iocb_cmd.u.tmf.modifier;
+	mrk->handle = make_handle(sp->qpair->req->id, sp->handle);
 	if (sp->u.iocb_cmd.u.tmf.modifier != MK_SYNC_ALL) {
 		mrk->nport_handle = cpu_to_le16(sp->u.iocb_cmd.u.tmf.loop_id);
 		int_to_scsilun(sp->u.iocb_cmd.u.tmf.lun, (struct scsi_lun *)&mrk->lun);
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1478,8 +1478,9 @@ qla2xxx_eh_device_reset(struct scsi_cmnd
 		goto eh_reset_failed;
 	}
 	err = 3;
-	if (qla2x00_eh_wait_for_pending_commands(vha, sdev->id,
-	    sdev->lun, WAIT_LUN) != QLA_SUCCESS) {
+	if (qla2x00_eh_wait_for_pending_commands(vha, fcport->d_id.b24,
+						 cmd->device->lun,
+						 WAIT_LUN) != QLA_SUCCESS) {
 		ql_log(ql_log_warn, vha, 0x800d,
 		    "wait for pending cmds failed for cmd=%p.\n", cmd);
 		goto eh_reset_failed;
@@ -1545,8 +1546,8 @@ qla2xxx_eh_target_reset(struct scsi_cmnd
 		goto eh_reset_failed;
 	}
 	err = 3;
-	if (qla2x00_eh_wait_for_pending_commands(vha, sdev->id,
-	    0, WAIT_TARGET) != QLA_SUCCESS) {
+	if (qla2x00_eh_wait_for_pending_commands(vha, fcport->d_id.b24, 0,
+						 WAIT_TARGET) != QLA_SUCCESS) {
 		ql_log(ql_log_warn, vha, 0x800d,
 		    "wait for pending cmds failed for cmd=%p.\n", cmd);
 		goto eh_reset_failed;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 016/219] scsi: qla2xxx: Fix erroneous link up failure
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (14 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 015/219] scsi: qla2xxx: Fix command flush during TMF Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 017/219] scsi: qla2xxx: Turn off noisy message log Greg Kroah-Hartman
                   ` (213 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 5b51f35d127e7bef55fa869d2465e2bca4636454 upstream.

Link up failure occurred where driver failed to see certain events from FW
indicating link up (AEN 8011) and fabric login completion (AEN 8014).
Without these 2 events, driver would not proceed forward to scan the
fabric. The cause of this is due to delay in the receive of interrupt for
Mailbox 60 that causes qla to set the fw_started flag late.  The late
setting of this flag causes other interrupts to be dropped.  These dropped
interrupts happen to be the link up (AEN 8011) and fabric login completion
(AEN 8014).

Set fw_started flag early to prevent interrupts being dropped.

Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c |    3 ++-
 drivers/scsi/qla2xxx/qla_isr.c  |    6 +++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -4816,15 +4816,16 @@ qla2x00_init_rings(scsi_qla_host_t *vha)
 	if (ha->flags.edif_enabled)
 		mid_init_cb->init_cb.frame_payload_size = cpu_to_le16(ELS_MAX_PAYLOAD);
 
+	QLA_FW_STARTED(ha);
 	rval = qla2x00_init_firmware(vha, ha->init_cb_size);
 next_check:
 	if (rval) {
+		QLA_FW_STOPPED(ha);
 		ql_log(ql_log_fatal, vha, 0x00d2,
 		    "Init Firmware **** FAILED ****.\n");
 	} else {
 		ql_dbg(ql_dbg_init, vha, 0x00d3,
 		    "Init Firmware -- success.\n");
-		QLA_FW_STARTED(ha);
 		vha->u_ql2xexchoffld = vha->u_ql2xiniexchg = 0;
 	}
 
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -1121,8 +1121,12 @@ qla2x00_async_event(scsi_qla_host_t *vha
 	unsigned long	flags;
 	fc_port_t	*fcport = NULL;
 
-	if (!vha->hw->flags.fw_started)
+	if (!vha->hw->flags.fw_started) {
+		ql_log(ql_log_warn, vha, 0x50ff,
+		    "Dropping AEN - %04x %04x %04x %04x.\n",
+		    mb[0], mb[1], mb[2], mb[3]);
 		return;
+	}
 
 	/* Setup to process RIO completion. */
 	handle_cnt = 0;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 017/219] scsi: qla2xxx: Turn off noisy message log
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (15 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 016/219] scsi: qla2xxx: Fix erroneous link up failure Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 018/219] scsi: qla2xxx: Fix session hang in gnl Greg Kroah-Hartman
                   ` (212 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 8ebaa45163a3fedc885c1dc7d43ea987a2f00a06 upstream.

Some consider noisy log as test failure.  Turn off noisy message log.

Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_nvme.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -664,7 +664,7 @@ static int qla_nvme_post_cmd(struct nvme
 
 	rval = qla2x00_start_nvme_mq(sp);
 	if (rval != QLA_SUCCESS) {
-		ql_log(ql_log_warn, vha, 0x212d,
+		ql_dbg(ql_dbg_io + ql_dbg_verbose, vha, 0x212d,
 		    "qla2x00_start_nvme_mq failed = %d\n", rval);
 		sp->priv = NULL;
 		priv->sp = NULL;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 018/219] scsi: qla2xxx: Fix session hang in gnl
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (16 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 017/219] scsi: qla2xxx: Turn off noisy message log Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 019/219] scsi: qla2xxx: Fix TMF leak through Greg Kroah-Hartman
                   ` (211 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 39d22740712c7563a2e18c08f033deeacdaf66e7 upstream.

Connection does not resume after a host reset / chip reset. The cause of
the blockage is due to the FCF_ASYNC_ACTIVE left on. The gnl command was
interrupted by the chip reset. On exiting the command, this flag should be
turn off to allow relogin to reoccur. Clear this flag to prevent blockage.

Cc: stable@vger.kernel.org
Fixes: 17e64648aa47 ("scsi: qla2xxx: Correct fcport flags handling")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -1135,7 +1135,7 @@ int qla24xx_async_gnl(struct scsi_qla_ho
 	u16 *mb;
 
 	if (!vha->flags.online || (fcport->flags & FCF_ASYNC_SENT))
-		return rval;
+		goto done;
 
 	ql_dbg(ql_dbg_disc, vha, 0x20d9,
 	    "Async-gnlist WWPN %8phC \n", fcport->port_name);
@@ -1189,8 +1189,9 @@ int qla24xx_async_gnl(struct scsi_qla_ho
 done_free_sp:
 	/* ref: INIT */
 	kref_put(&sp->cmd_kref, qla2x00_sp_release);
+	fcport->flags &= ~(FCF_ASYNC_SENT);
 done:
-	fcport->flags &= ~(FCF_ASYNC_ACTIVE | FCF_ASYNC_SENT);
+	fcport->flags &= ~(FCF_ASYNC_ACTIVE);
 	return rval;
 }
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 019/219] scsi: qla2xxx: Fix TMF leak through
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (17 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 018/219] scsi: qla2xxx: Fix session hang in gnl Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 020/219] scsi: qla2xxx: Remove unsupported ql2xenabledif option Greg Kroah-Hartman
                   ` (210 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 5d3148d8e8b05f084e607ac3bd55a4c317a9f934 upstream.

Task management can retry up to 5 times when FW resource becomes bottle
neck. Between the retries, there is a short sleep.  Current code assumes
the chip has not reset or session has not changed.

Check for chip reset or session change before sending Task management.

Cc: stable@vger.kernel.org
Fixes: 9803fb5d2759 ("scsi: qla2xxx: Fix task management cmd failure")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230714070104.40052-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c |   20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2039,10 +2039,14 @@ static void qla_marker_sp_done(srb_t *sp
 	complete(&tmf->u.tmf.comp);
 }
 
-#define  START_SP_W_RETRIES(_sp, _rval) \
+#define  START_SP_W_RETRIES(_sp, _rval, _chip_gen, _login_gen) \
 {\
 	int cnt = 5; \
 	do { \
+		if (_chip_gen != sp->vha->hw->chip_reset || _login_gen != sp->fcport->login_gen) {\
+			_rval = EINVAL; \
+			break; \
+		} \
 		_rval = qla2x00_start_sp(_sp); \
 		if (_rval == EAGAIN) \
 			msleep(1); \
@@ -2065,6 +2069,7 @@ qla26xx_marker(struct tmf_arg *arg)
 	srb_t *sp;
 	int rval = QLA_FUNCTION_FAILED;
 	fc_port_t *fcport = arg->fcport;
+	u32 chip_gen, login_gen;
 
 	if (TMF_NOT_READY(arg->fcport)) {
 		ql_dbg(ql_dbg_taskm, vha, 0x8039,
@@ -2074,6 +2079,9 @@ qla26xx_marker(struct tmf_arg *arg)
 		return QLA_SUSPENDED;
 	}
 
+	chip_gen = vha->hw->chip_reset;
+	login_gen = fcport->login_gen;
+
 	/* ref: INIT */
 	sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL);
 	if (!sp)
@@ -2091,7 +2099,7 @@ qla26xx_marker(struct tmf_arg *arg)
 	tm_iocb->u.tmf.loop_id = fcport->loop_id;
 	tm_iocb->u.tmf.vp_index = vha->vp_idx;
 
-	START_SP_W_RETRIES(sp, rval);
+	START_SP_W_RETRIES(sp, rval, chip_gen, login_gen);
 
 	ql_dbg(ql_dbg_taskm, vha, 0x8006,
 	    "Async-marker hdl=%x loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d rval %d.\n",
@@ -2160,6 +2168,9 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
 		return QLA_SUSPENDED;
 	}
 
+	chip_gen = vha->hw->chip_reset;
+	login_gen = fcport->login_gen;
+
 	/* ref: INIT */
 	sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL);
 	if (!sp)
@@ -2177,7 +2188,7 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
 	tm_iocb->u.tmf.flags = arg->flags;
 	tm_iocb->u.tmf.lun = arg->lun;
 
-	START_SP_W_RETRIES(sp, rval);
+	START_SP_W_RETRIES(sp, rval, chip_gen, login_gen);
 
 	ql_dbg(ql_dbg_taskm, vha, 0x802f,
 	    "Async-tmf hdl=%x loop-id=%x portid=%06x ctrl=%x lun=%lld qp=%d rval=%x.\n",
@@ -2196,9 +2207,6 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
 	}
 
 	if (!test_bit(UNLOADING, &vha->dpc_flags) && !IS_QLAFX00(vha->hw)) {
-		chip_gen = vha->hw->chip_reset;
-		login_gen = fcport->login_gen;
-
 		jif = jiffies;
 		if (qla_tmf_wait(arg)) {
 			ql_log(ql_log_info, vha, 0x803e,



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 020/219] scsi: qla2xxx: Remove unsupported ql2xenabledif option
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (18 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 019/219] scsi: qla2xxx: Fix TMF leak through Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 021/219] scsi: qla2xxx: Flush mailbox commands on chip reset Greg Kroah-Hartman
                   ` (209 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Manish Rangankar, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Manish Rangankar <mrangankar@marvell.com>

commit e9105c4b7a9208a21a9bda133707624f12ddabc2 upstream.

User accidently passed module parameter ql2xenabledif=1 which is
unsupported. However, driver still initialized which lead to guard tag
errors during device discovery.

Remove unsupported ql2xenabledif=1 option and validate the user input.

Cc: stable@vger.kernel.org
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230821130045.34850-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_attr.c |    2 --
 drivers/scsi/qla2xxx/qla_dbg.c  |    2 +-
 drivers/scsi/qla2xxx/qla_os.c   |    9 +++++++--
 3 files changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -3093,8 +3093,6 @@ qla24xx_vport_create(struct fc_vport *fc
 			vha->flags.difdix_supported = 1;
 			ql_dbg(ql_dbg_user, vha, 0x7082,
 			    "Registered for DIF/DIX type 1 and 3 protection.\n");
-			if (ql2xenabledif == 1)
-				prot = SHOST_DIX_TYPE0_PROTECTION;
 			scsi_host_set_prot(vha->host,
 			    prot | SHOST_DIF_TYPE1_PROTECTION
 			    | SHOST_DIF_TYPE2_PROTECTION
--- a/drivers/scsi/qla2xxx/qla_dbg.c
+++ b/drivers/scsi/qla2xxx/qla_dbg.c
@@ -18,7 +18,7 @@
  * | Queue Command and IO tracing |       0x3074       | 0x300b         |
  * |                              |                    | 0x3027-0x3028  |
  * |                              |                    | 0x303d-0x3041  |
- * |                              |                    | 0x302d,0x3033  |
+ * |                              |                    | 0x302e,0x3033  |
  * |                              |                    | 0x3036,0x3038  |
  * |                              |                    | 0x303a		|
  * | DPC Thread                   |       0x4023       | 0x4002,0x4013  |
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3281,6 +3281,13 @@ qla2x00_probe_one(struct pci_dev *pdev,
 	host->max_id = ha->max_fibre_devices;
 	host->cmd_per_lun = 3;
 	host->unique_id = host->host_no;
+
+	if (ql2xenabledif && ql2xenabledif != 2) {
+		ql_log(ql_log_warn, base_vha, 0x302d,
+		       "Invalid value for ql2xenabledif, resetting it to default (2)\n");
+		ql2xenabledif = 2;
+	}
+
 	if (IS_T10_PI_CAPABLE(ha) && ql2xenabledif)
 		host->max_cmd_len = 32;
 	else
@@ -3518,8 +3525,6 @@ skip_dpc:
 			base_vha->flags.difdix_supported = 1;
 			ql_dbg(ql_dbg_init, base_vha, 0x00f1,
 			    "Registering for DIF/DIX type 1 and 3 protection.\n");
-			if (ql2xenabledif == 1)
-				prot = SHOST_DIX_TYPE0_PROTECTION;
 			if (ql2xprotmask)
 				scsi_host_set_prot(host, ql2xprotmask);
 			else



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 021/219] scsi: qla2xxx: Flush mailbox commands on chip reset
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (19 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 020/219] scsi: qla2xxx: Remove unsupported ql2xenabledif option Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 022/219] scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit() Greg Kroah-Hartman
                   ` (208 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 6d0b65569c0a10b27c49bacd8d25bcd406003533 upstream.

Fix race condition between Interrupt thread and Chip reset thread in trying
to flush the same mailbox. With the race condition, the "ha->mbx_intr_comp"
will get an extra complete() call. The extra complete call create erroneous
mailbox timeout condition when the next mailbox is sent where the mailbox
call does not wait for interrupt to arrive. Instead, it advances without
waiting.

Add lock protection around the check for mailbox completion.

Cc: stable@vger.kernel.org
Fixes: b2000805a975 ("scsi: qla2xxx: Flush mailbox commands on chip reset")
Signed-off-by: Quinn Tran <quinn.tran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230821130045.34850-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_def.h  |    1 -
 drivers/scsi/qla2xxx/qla_init.c |    7 ++++---
 drivers/scsi/qla2xxx/qla_mbx.c  |    4 ----
 drivers/scsi/qla2xxx/qla_os.c   |    1 -
 4 files changed, 4 insertions(+), 9 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -4367,7 +4367,6 @@ struct qla_hw_data {
 	uint8_t		aen_mbx_count;
 	atomic_t	num_pend_mbx_stage1;
 	atomic_t	num_pend_mbx_stage2;
-	atomic_t	num_pend_mbx_stage3;
 	uint16_t	frame_payload_size;
 
 	uint32_t	login_retry_count;
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -7444,14 +7444,15 @@ qla2x00_abort_isp_cleanup(scsi_qla_host_
 	}
 
 	/* purge MBox commands */
-	if (atomic_read(&ha->num_pend_mbx_stage3)) {
+	spin_lock_irqsave(&ha->hardware_lock, flags);
+	if (test_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags)) {
 		clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
 		complete(&ha->mbx_intr_comp);
 	}
+	spin_unlock_irqrestore(&ha->hardware_lock, flags);
 
 	i = 0;
-	while (atomic_read(&ha->num_pend_mbx_stage3) ||
-	    atomic_read(&ha->num_pend_mbx_stage2) ||
+	while (atomic_read(&ha->num_pend_mbx_stage2) ||
 	    atomic_read(&ha->num_pend_mbx_stage1)) {
 		msleep(20);
 		i++;
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -273,7 +273,6 @@ qla2x00_mailbox_command(scsi_qla_host_t
 		spin_unlock_irqrestore(&ha->hardware_lock, flags);
 
 		wait_time = jiffies;
-		atomic_inc(&ha->num_pend_mbx_stage3);
 		if (!wait_for_completion_timeout(&ha->mbx_intr_comp,
 		    mcp->tov * HZ)) {
 			ql_dbg(ql_dbg_mbx, vha, 0x117a,
@@ -290,7 +289,6 @@ qla2x00_mailbox_command(scsi_qla_host_t
 				spin_unlock_irqrestore(&ha->hardware_lock,
 				    flags);
 				atomic_dec(&ha->num_pend_mbx_stage2);
-				atomic_dec(&ha->num_pend_mbx_stage3);
 				rval = QLA_ABORTED;
 				goto premature_exit;
 			}
@@ -302,11 +300,9 @@ qla2x00_mailbox_command(scsi_qla_host_t
 			ha->flags.mbox_busy = 0;
 			spin_unlock_irqrestore(&ha->hardware_lock, flags);
 			atomic_dec(&ha->num_pend_mbx_stage2);
-			atomic_dec(&ha->num_pend_mbx_stage3);
 			rval = QLA_ABORTED;
 			goto premature_exit;
 		}
-		atomic_dec(&ha->num_pend_mbx_stage3);
 
 		if (time_after(jiffies, wait_time + 5 * HZ))
 			ql_log(ql_log_warn, vha, 0x1015, "cmd=0x%x, waited %d msecs\n",
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3000,7 +3000,6 @@ qla2x00_probe_one(struct pci_dev *pdev,
 	ha->max_exchg = FW_MAX_EXCHANGES_CNT;
 	atomic_set(&ha->num_pend_mbx_stage1, 0);
 	atomic_set(&ha->num_pend_mbx_stage2, 0);
-	atomic_set(&ha->num_pend_mbx_stage3, 0);
 	atomic_set(&ha->zio_threshold, DEFAULT_ZIO_THRESHOLD);
 	ha->last_zio_threshold = DEFAULT_ZIO_THRESHOLD;
 	INIT_LIST_HEAD(&ha->tmf_pending);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 022/219] scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (20 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 021/219] scsi: qla2xxx: Flush mailbox commands on chip reset Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 023/219] scsi: qla2xxx: Error code did not return to upper layer Greg Kroah-Hartman
                   ` (207 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Nilesh Javali, Himanshu Madhani,
	Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nilesh Javali <njavali@marvell.com>

commit b496953dd0444001b12f425ea07d78c1f47e3193 upstream.

Fix indentation for warning reported by smatch:

drivers/scsi/qla2xxx/qla_init.c:4199 qla_init_iocb_limit() warn: inconsistent indenting

Fixes: efa74a62aaa2 ("scsi: qla2xxx: Adjust IOCB resource on qpair create")
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230821130045.34850-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -4204,7 +4204,7 @@ void qla_init_iocb_limit(scsi_qla_host_t
 	u8 i;
 	struct qla_hw_data *ha = vha->hw;
 
-	 __qla_adjust_iocb_limit(ha->base_qpair);
+	__qla_adjust_iocb_limit(ha->base_qpair);
 	ha->base_qpair->fwres.iocbs_used = 0;
 	ha->base_qpair->fwres.exch_used  = 0;
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 023/219] scsi: qla2xxx: Error code did not return to upper layer
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (21 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 022/219] scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit() Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 024/219] scsi: qla2xxx: Fix firmware resource tracking Greg Kroah-Hartman
                   ` (206 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit 0ba0b018f94525a6b32f5930f980ce9b62b72e6f upstream.

TMF was returned with an error code. The error code was not preserved to be
returned to upper layer. Instead, the error code from the Marker was
returned.

Preserve error code from TMF and return it to upper layer.

Cc: stable@vger.kernel.org
Fixes: da7c21b72aa8 ("scsi: qla2xxx: Fix command flush during TMF")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230821130045.34850-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_init.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2224,6 +2224,8 @@ __qla2x00_async_tm_cmd(struct tmf_arg *a
 			rval = QLA_FUNCTION_FAILED;
 		}
 	}
+	if (tm_iocb->u.tmf.data)
+		rval = tm_iocb->u.tmf.data;
 
 done_free_sp:
 	/* ref: INIT */



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 024/219] scsi: qla2xxx: Fix firmware resource tracking
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (22 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 023/219] scsi: qla2xxx: Error code did not return to upper layer Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 025/219] null_blk: fix poll request timeout handling Greg Kroah-Hartman
                   ` (205 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Quinn Tran, Nilesh Javali,
	Himanshu Madhani, Martin K. Petersen

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Quinn Tran <qutran@marvell.com>

commit e370b64c7db96384a0886a09a9d80406e4c663d7 upstream.

The storage was not draining I/Os and the work load was not spread out
across different CPUs evenly. This led to firmware resource counters
getting overrun on the busy CPU. This overrun prevented error recovery from
happening in a timely manner.

By switching the counter to atomic, it allows the count to be little more
accurate to prevent the overrun.

Cc: stable@vger.kernel.org
Fixes: da7c21b72aa8 ("scsi: qla2xxx: Fix command flush during TMF")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230821130045.34850-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/scsi/qla2xxx/qla_def.h    |   11 +++++++
 drivers/scsi/qla2xxx/qla_dfs.c    |   10 ++++++
 drivers/scsi/qla2xxx/qla_init.c   |    8 +++++
 drivers/scsi/qla2xxx/qla_inline.h |   57 +++++++++++++++++++++++++++++++++++++-
 drivers/scsi/qla2xxx/qla_os.c     |    5 ++-
 5 files changed, 88 insertions(+), 3 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -3740,6 +3740,16 @@ struct qla_fw_resources {
 	u16 pad;
 };
 
+struct qla_fw_res {
+	u16      iocb_total;
+	u16      iocb_limit;
+	atomic_t iocb_used;
+
+	u16      exch_total;
+	u16      exch_limit;
+	atomic_t exch_used;
+};
+
 #define QLA_IOCB_PCT_LIMIT 95
 
 /*Queue pair data structure */
@@ -4782,6 +4792,7 @@ struct qla_hw_data {
 	spinlock_t sadb_lock;	/* protects list */
 	struct els_reject elsrej;
 	u8 edif_post_stop_cnt_down;
+	struct qla_fw_res fwres ____cacheline_aligned;
 };
 
 #define RX_ELS_SIZE (roundup(sizeof(struct enode) + ELS_MAX_PAYLOAD, SMP_CACHE_BYTES))
--- a/drivers/scsi/qla2xxx/qla_dfs.c
+++ b/drivers/scsi/qla2xxx/qla_dfs.c
@@ -276,6 +276,16 @@ qla_dfs_fw_resource_cnt_show(struct seq_
 
 		seq_printf(s, "estimate exchange used[%d] high water limit [%d] n",
 			   exch_used, ha->base_qpair->fwres.exch_limit);
+
+		if (ql2xenforce_iocb_limit == 2) {
+			iocbs_used = atomic_read(&ha->fwres.iocb_used);
+			exch_used  = atomic_read(&ha->fwres.exch_used);
+			seq_printf(s, "        estimate iocb2 used [%d] high water limit [%d]\n",
+					iocbs_used, ha->fwres.iocb_limit);
+
+			seq_printf(s, "        estimate exchange2 used[%d] high water limit [%d] \n",
+					exch_used, ha->fwres.exch_limit);
+		}
 	}
 
 	return 0;
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -4217,6 +4217,14 @@ void qla_init_iocb_limit(scsi_qla_host_t
 			ha->queue_pair_map[i]->fwres.exch_used = 0;
 		}
 	}
+
+	ha->fwres.iocb_total = ha->orig_fw_iocb_count;
+	ha->fwres.iocb_limit = (ha->orig_fw_iocb_count * QLA_IOCB_PCT_LIMIT) / 100;
+	ha->fwres.exch_total = ha->orig_fw_xcb_count;
+	ha->fwres.exch_limit = (ha->orig_fw_xcb_count * QLA_IOCB_PCT_LIMIT) / 100;
+
+	atomic_set(&ha->fwres.iocb_used, 0);
+	atomic_set(&ha->fwres.exch_used, 0);
 }
 
 void qla_adjust_iocb_limit(scsi_qla_host_t *vha)
--- a/drivers/scsi/qla2xxx/qla_inline.h
+++ b/drivers/scsi/qla2xxx/qla_inline.h
@@ -386,6 +386,7 @@ enum {
 	RESOURCE_IOCB = BIT_0,
 	RESOURCE_EXCH = BIT_1,  /* exchange */
 	RESOURCE_FORCE = BIT_2,
+	RESOURCE_HA = BIT_3,
 };
 
 static inline int
@@ -393,7 +394,7 @@ qla_get_fw_resources(struct qla_qpair *q
 {
 	u16 iocbs_used, i;
 	u16 exch_used;
-	struct qla_hw_data *ha = qp->vha->hw;
+	struct qla_hw_data *ha = qp->hw;
 
 	if (!ql2xenforce_iocb_limit) {
 		iores->res_type = RESOURCE_NONE;
@@ -428,15 +429,69 @@ qla_get_fw_resources(struct qla_qpair *q
 			return -ENOSPC;
 		}
 	}
+
+	if (ql2xenforce_iocb_limit == 2) {
+		if ((iores->iocb_cnt + atomic_read(&ha->fwres.iocb_used)) >=
+		    ha->fwres.iocb_limit) {
+			iores->res_type = RESOURCE_NONE;
+			return -ENOSPC;
+		}
+
+		if (iores->res_type & RESOURCE_EXCH) {
+			if ((iores->exch_cnt + atomic_read(&ha->fwres.exch_used)) >=
+			    ha->fwres.exch_limit) {
+				iores->res_type = RESOURCE_NONE;
+				return -ENOSPC;
+			}
+		}
+	}
+
 force:
 	qp->fwres.iocbs_used += iores->iocb_cnt;
 	qp->fwres.exch_used += iores->exch_cnt;
+	if (ql2xenforce_iocb_limit == 2) {
+		atomic_add(iores->iocb_cnt, &ha->fwres.iocb_used);
+		atomic_add(iores->exch_cnt, &ha->fwres.exch_used);
+		iores->res_type |= RESOURCE_HA;
+	}
 	return 0;
 }
 
+/*
+ * decrement to zero.  This routine will not decrement below zero
+ * @v:  pointer of type atomic_t
+ * @amount: amount to decrement from v
+ */
+static void qla_atomic_dtz(atomic_t *v, int amount)
+{
+	int c, old, dec;
+
+	c = atomic_read(v);
+	for (;;) {
+		dec = c - amount;
+		if (unlikely(dec < 0))
+			dec = 0;
+
+		old = atomic_cmpxchg((v), c, dec);
+		if (likely(old == c))
+			break;
+		c = old;
+	}
+}
+
 static inline void
 qla_put_fw_resources(struct qla_qpair *qp, struct iocb_resource *iores)
 {
+	struct qla_hw_data *ha = qp->hw;
+
+	if (iores->res_type & RESOURCE_HA) {
+		if (iores->res_type & RESOURCE_IOCB)
+			qla_atomic_dtz(&ha->fwres.iocb_used, iores->iocb_cnt);
+
+		if (iores->res_type & RESOURCE_EXCH)
+			qla_atomic_dtz(&ha->fwres.exch_used, iores->exch_cnt);
+	}
+
 	if (iores->res_type & RESOURCE_IOCB) {
 		if (qp->fwres.iocbs_used >= iores->iocb_cnt) {
 			qp->fwres.iocbs_used -= iores->iocb_cnt;
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -44,10 +44,11 @@ module_param(ql2xfulldump_on_mpifail, in
 MODULE_PARM_DESC(ql2xfulldump_on_mpifail,
 		 "Set this to take full dump on MPI hang.");
 
-int ql2xenforce_iocb_limit = 1;
+int ql2xenforce_iocb_limit = 2;
 module_param(ql2xenforce_iocb_limit, int, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(ql2xenforce_iocb_limit,
-		 "Enforce IOCB throttling, to avoid FW congestion. (default: 1)");
+		 "Enforce IOCB throttling, to avoid FW congestion. (default: 2) "
+		 "1: track usage per queue, 2: track usage per adapter");
 
 /*
  * CT6 CTX allocation cache



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 025/219] null_blk: fix poll request timeout handling
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (23 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 024/219] scsi: qla2xxx: Fix firmware resource tracking Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 026/219] fbdev/ep93xx-fb: Do not assign to struct fb_info.dev Greg Kroah-Hartman
                   ` (204 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, David Howells, Ming Lei,
	Chengming Zhou, Jens Axboe

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chengming Zhou <zhouchengming@bytedance.com>

commit 5a26e45edb4690d58406178b5a9ea4c6dcf2c105 upstream.

When doing io_uring benchmark on /dev/nullb0, it's easy to crash the
kernel if poll requests timeout triggered, as reported by David. [1]

BUG: kernel NULL pointer dereference, address: 0000000000000008
Workqueue: kblockd blk_mq_timeout_work
RIP: 0010:null_timeout_rq+0x4e/0x91
Call Trace:
 ? null_timeout_rq+0x4e/0x91
 blk_mq_handle_expired+0x31/0x4b
 bt_iter+0x68/0x84
 ? bt_tags_iter+0x81/0x81
 __sbitmap_for_each_set.constprop.0+0xb0/0xf2
 ? __blk_mq_complete_request_remote+0xf/0xf
 bt_for_each+0x46/0x64
 ? __blk_mq_complete_request_remote+0xf/0xf
 ? percpu_ref_get_many+0xc/0x2a
 blk_mq_queue_tag_busy_iter+0x14d/0x18e
 blk_mq_timeout_work+0x95/0x127
 process_one_work+0x185/0x263
 worker_thread+0x1b5/0x227

This is indeed a race problem between null_timeout_rq() and null_poll().

null_poll()				null_timeout_rq()
  spin_lock(&nq->poll_lock)
  list_splice_init(&nq->poll_list, &list)
  spin_unlock(&nq->poll_lock)

  while (!list_empty(&list))
    req = list_first_entry()
    list_del_init()
    ...
    blk_mq_add_to_batch()
    // req->rq_next = NULL
					spin_lock(&nq->poll_lock)

					// rq->queuelist->next == NULL
					list_del_init(&rq->queuelist)

					spin_unlock(&nq->poll_lock)

Fix these problems by setting requests state to MQ_RQ_COMPLETE under
nq->poll_lock protection, in which null_timeout_rq() can safely detect
this race and early return.

Note this patch just fix the kernel panic when request timeout happen.

[1] https://lore.kernel.org/all/3893581.1691785261@warthog.procyon.org.uk/

Fixes: 0a593fbbc245 ("null_blk: poll queue support")
Reported-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Link: https://lore.kernel.org/r/20230901120306.170520-2-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/block/null_blk/main.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1585,9 +1585,12 @@ static int null_poll(struct blk_mq_hw_ct
 	struct nullb_queue *nq = hctx->driver_data;
 	LIST_HEAD(list);
 	int nr = 0;
+	struct request *rq;
 
 	spin_lock(&nq->poll_lock);
 	list_splice_init(&nq->poll_list, &list);
+	list_for_each_entry(rq, &list, queuelist)
+		blk_mq_set_request_complete(rq);
 	spin_unlock(&nq->poll_lock);
 
 	while (!list_empty(&list)) {
@@ -1613,16 +1616,21 @@ static enum blk_eh_timer_return null_tim
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 	struct nullb_cmd *cmd = blk_mq_rq_to_pdu(rq);
 
-	pr_info("rq %p timed out\n", rq);
-
 	if (hctx->type == HCTX_TYPE_POLL) {
 		struct nullb_queue *nq = hctx->driver_data;
 
 		spin_lock(&nq->poll_lock);
+		/* The request may have completed meanwhile. */
+		if (blk_mq_request_completed(rq)) {
+			spin_unlock(&nq->poll_lock);
+			return BLK_EH_DONE;
+		}
 		list_del_init(&rq->queuelist);
 		spin_unlock(&nq->poll_lock);
 	}
 
+	pr_info("rq %p timed out\n", rq);
+
 	/*
 	 * If the device is marked as blocking (i.e. memory backed or zoned
 	 * device), the submission path may be blocked waiting for resources



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 026/219] fbdev/ep93xx-fb: Do not assign to struct fb_info.dev
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (24 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 025/219] null_blk: fix poll request timeout handling Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 027/219] clk: qcom: camcc-sc7180: fix async resume during probe Greg Kroah-Hartman
                   ` (203 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Thomas Zimmermann,
	Javier Martinez Canillas, Sam Ravnborg

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Zimmermann <tzimmermann@suse.de>

commit f90a0e5265b60cdd3c77990e8105f79aa2fac994 upstream.

Do not assing the Linux device to struct fb_info.dev. The call to
register_framebuffer() initializes the field to the fbdev device.
Drivers should not override its value.

Fixes a bug where the driver incorrectly decreases the hardware
device's reference counter and leaks the fbdev device.

v2:
	* add Fixes tag (Dan)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 88017bda96a5 ("ep93xx video driver")
Cc: <stable@vger.kernel.org> # v2.6.32+
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230613110953.24176-15-tzimmermann@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/video/fbdev/ep93xx-fb.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/video/fbdev/ep93xx-fb.c
+++ b/drivers/video/fbdev/ep93xx-fb.c
@@ -474,7 +474,6 @@ static int ep93xxfb_probe(struct platfor
 	if (!info)
 		return -ENOMEM;
 
-	info->dev = &pdev->dev;
 	platform_set_drvdata(pdev, info);
 	fbi = info->par;
 	fbi->mach_info = mach_info;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 027/219] clk: qcom: camcc-sc7180: fix async resume during probe
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (25 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 026/219] fbdev/ep93xx-fb: Do not assign to struct fb_info.dev Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 028/219] drm/ast: Fix DRAM init on AST2200 Greg Kroah-Hartman
                   ` (202 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Stephen Boyd, Johan Hovold,
	Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan+linaro@kernel.org>

commit c948ff727e25297f3a703eb5349dd66aabf004e4 upstream.

To make sure that the controller is runtime resumed and its power domain
is enabled before accessing its registers during probe, the synchronous
runtime PM interface must be used.

Fixes: 8d4025943e13 ("clk: qcom: camcc-sc7180: Use runtime PM ops instead of clk ones")
Cc: stable@vger.kernel.org      # 5.11
Cc: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230718132902.21430-2-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/camcc-sc7180.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/clk/qcom/camcc-sc7180.c
+++ b/drivers/clk/qcom/camcc-sc7180.c
@@ -1664,7 +1664,7 @@ static int cam_cc_sc7180_probe(struct pl
 		return ret;
 	}
 
-	ret = pm_runtime_get(&pdev->dev);
+	ret = pm_runtime_resume_and_get(&pdev->dev);
 	if (ret)
 		return ret;
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 028/219] drm/ast: Fix DRAM init on AST2200
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (26 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 027/219] clk: qcom: camcc-sc7180: fix async resume during probe Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 029/219] ASoC: tegra: Fix SFC conversion for few rates Greg Kroah-Hartman
                   ` (201 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Thomas Zimmermann, Dave Airlie,
	dri-devel, Sui Jingfeng, Jocelyn Falempe

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Zimmermann <tzimmermann@suse.de>

commit 4cfe75f0f14f044dae66ad0e6eea812d038465d9 upstream.

Fix the test for the AST2200 in the DRAM initialization. The value
in ast->chip has to be compared against an enum constant instead of
a numerical value.

This bug got introduced when the driver was first imported into the
kernel.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 312fec1405dd ("drm: Initial KMS driver for AST (ASpeed Technologies) 2000 series (v2)")
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v3.5+
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Tested-by: Jocelyn Falempe <jfalempe@redhat.com> # AST2600
Link: https://patchwork.freedesktop.org/patch/msgid/20230621130032.3568-2-tzimmermann@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/ast/ast_post.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/ast/ast_post.c
+++ b/drivers/gpu/drm/ast/ast_post.c
@@ -291,7 +291,7 @@ static void ast_init_dram_reg(struct drm
 				;
 			} while (ast_read32(ast, 0x10100) != 0xa8);
 		} else {/* AST2100/1100 */
-			if (ast->chip == AST2100 || ast->chip == 2200)
+			if (ast->chip == AST2100 || ast->chip == AST2200)
 				dram_reg_info = ast2100_dram_table_data;
 			else
 				dram_reg_info = ast1100_dram_table_data;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 029/219] ASoC: tegra: Fix SFC conversion for few rates
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (27 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 028/219] drm/ast: Fix DRAM init on AST2200 Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 030/219] clk: qcom: turingcc-qcs404: fix missing resume during probe Greg Kroah-Hartman
                   ` (200 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Sheetal, Mohan Kumar D, Sameer Pujar,
	Mark Brown

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sheetal <sheetal@nvidia.com>

commit d900d9a435ca95a386f49424f3689cd17ec201da upstream.

Sample rate conversions for rates greater than 48kHz are found to be
failing. It means x->y conversions fail when either x or y is greater
than 48kHz.

This happens because, tegra210_sfc_rate_to_idx() returns incorrect
index for rates greater than 48kHz. This actually depends on the
tegra210_sfc_rates[] array and it is not in sync with frequency
values of SFC TX/RX register. To be precise, 64kHz entry is missing
in above array defined in the driver. Due to this wrong index is
returned and this results in incorrect programming of coefficients.

To fix this, align the tegra210_sfc_rates[] array with SFC register
specification and thus add 64kHz entry to it. Also, the coefficient
table is updated to reflect that none of the conversions are supported
for 64kHz.

Fixes: b2f74ec53a6c ("ASoC: tegra: Add Tegra210 based SFC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sheetal <sheetal@nvidia.com>
Reviewed-by: Mohan Kumar D <mkumard@nvidia.com>
Reviewed-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/Message-Id: <1687433656-7892-2-git-send-email-spujar@nvidia.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 sound/soc/tegra/tegra210_sfc.c |   31 ++++++++++++++++++++++++++++++-
 sound/soc/tegra/tegra210_sfc.h |    4 ++--
 2 files changed, 32 insertions(+), 3 deletions(-)

--- a/sound/soc/tegra/tegra210_sfc.c
+++ b/sound/soc/tegra/tegra210_sfc.c
@@ -2,7 +2,7 @@
 //
 // tegra210_sfc.c - Tegra210 SFC driver
 //
-// Copyright (c) 2021 NVIDIA CORPORATION.  All rights reserved.
+// Copyright (c) 2021-2023 NVIDIA CORPORATION.  All rights reserved.
 
 #include <linux/clk.h>
 #include <linux/device.h>
@@ -42,6 +42,7 @@ static const int tegra210_sfc_rates[TEGR
 	32000,
 	44100,
 	48000,
+	64000,
 	88200,
 	96000,
 	176400,
@@ -2857,6 +2858,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_8to32,
 		coef_8to44,
 		coef_8to48,
+		UNSUPP_CONV,
 		coef_8to88,
 		coef_8to96,
 		UNSUPP_CONV,
@@ -2872,6 +2874,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_11to32,
 		coef_11to44,
 		coef_11to48,
+		UNSUPP_CONV,
 		coef_11to88,
 		coef_11to96,
 		UNSUPP_CONV,
@@ -2887,6 +2890,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_16to32,
 		coef_16to44,
 		coef_16to48,
+		UNSUPP_CONV,
 		coef_16to88,
 		coef_16to96,
 		coef_16to176,
@@ -2902,6 +2906,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_22to32,
 		coef_22to44,
 		coef_22to48,
+		UNSUPP_CONV,
 		coef_22to88,
 		coef_22to96,
 		coef_22to176,
@@ -2917,6 +2922,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_24to32,
 		coef_24to44,
 		coef_24to48,
+		UNSUPP_CONV,
 		coef_24to88,
 		coef_24to96,
 		coef_24to176,
@@ -2932,6 +2938,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		BYPASS_CONV,
 		coef_32to44,
 		coef_32to48,
+		UNSUPP_CONV,
 		coef_32to88,
 		coef_32to96,
 		coef_32to176,
@@ -2947,6 +2954,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_44to32,
 		BYPASS_CONV,
 		coef_44to48,
+		UNSUPP_CONV,
 		coef_44to88,
 		coef_44to96,
 		coef_44to176,
@@ -2962,11 +2970,28 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_48to32,
 		coef_48to44,
 		BYPASS_CONV,
+		UNSUPP_CONV,
 		coef_48to88,
 		coef_48to96,
 		coef_48to176,
 		coef_48to192,
 	},
+	/* Convertions from 64 kHz */
+	{
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+		UNSUPP_CONV,
+	},
 	/* Convertions from 88.2 kHz */
 	{
 		coef_88to8,
@@ -2977,6 +3002,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_88to32,
 		coef_88to44,
 		coef_88to48,
+		UNSUPP_CONV,
 		BYPASS_CONV,
 		coef_88to96,
 		coef_88to176,
@@ -2991,6 +3017,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_96to32,
 		coef_96to44,
 		coef_96to48,
+		UNSUPP_CONV,
 		coef_96to88,
 		BYPASS_CONV,
 		coef_96to176,
@@ -3006,6 +3033,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_176to32,
 		coef_176to44,
 		coef_176to48,
+		UNSUPP_CONV,
 		coef_176to88,
 		coef_176to96,
 		BYPASS_CONV,
@@ -3021,6 +3049,7 @@ static s32 *coef_addr_table[TEGRA210_SFC
 		coef_192to32,
 		coef_192to44,
 		coef_192to48,
+		UNSUPP_CONV,
 		coef_192to88,
 		coef_192to96,
 		coef_192to176,
--- a/sound/soc/tegra/tegra210_sfc.h
+++ b/sound/soc/tegra/tegra210_sfc.h
@@ -2,7 +2,7 @@
 /*
  * tegra210_sfc.h - Definitions for Tegra210 SFC driver
  *
- * Copyright (c) 2021 NVIDIA CORPORATION.  All rights reserved.
+ * Copyright (c) 2021-2023 NVIDIA CORPORATION.  All rights reserved.
  *
  */
 
@@ -47,7 +47,7 @@
 #define TEGRA210_SFC_EN_SHIFT			0
 #define TEGRA210_SFC_EN				(1 << TEGRA210_SFC_EN_SHIFT)
 
-#define TEGRA210_SFC_NUM_RATES 12
+#define TEGRA210_SFC_NUM_RATES 13
 
 /* Fields in TEGRA210_SFC_COEF_RAM */
 #define TEGRA210_SFC_COEF_RAM_EN		BIT(0)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 030/219] clk: qcom: turingcc-qcs404: fix missing resume during probe
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (28 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 029/219] ASoC: tegra: Fix SFC conversion for few rates Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 031/219] arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos Greg Kroah-Hartman
                   ` (199 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Bjorn Andersson, Johan Hovold

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan+linaro@kernel.org>

commit a9f71a033587c9074059132d34c74eabbe95ef26 upstream.

Drivers that enable runtime PM must make sure that the controller is
runtime resumed before accessing its registers to prevent the power
domain from being disabled.

Fixes: 892df0191b29 ("clk: qcom: Add QCS404 TuringCC")
Cc: stable@vger.kernel.org      # 5.2
Cc: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230718132902.21430-9-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/turingcc-qcs404.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/drivers/clk/qcom/turingcc-qcs404.c
+++ b/drivers/clk/qcom/turingcc-qcs404.c
@@ -125,11 +125,22 @@ static int turingcc_probe(struct platfor
 		return ret;
 	}
 
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret)
+		return ret;
+
 	ret = qcom_cc_probe(pdev, &turingcc_desc);
 	if (ret < 0)
-		return ret;
+		goto err_put_rpm;
+
+	pm_runtime_put(&pdev->dev);
 
 	return 0;
+
+err_put_rpm:
+	pm_runtime_put_sync(&pdev->dev);
+
+	return ret;
 }
 
 static const struct dev_pm_ops turingcc_pm_ops = {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 031/219] arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (29 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 030/219] clk: qcom: turingcc-qcs404: fix missing resume during probe Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 032/219] [SMB3] send channel sequence number in SMB3 requests after reconnects Greg Kroah-Hartman
                   ` (198 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Tomohiro Komagata, Chris Paterson,
	Geert Uytterhoeven

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chris Paterson <chris.paterson2@renesas.com>

commit db67345716a52abb750ec8f76d6a5675218715f9 upstream.

It looks like txdv-skew-psec is a typo from a copy+paste. txdv-skew-psec
is not present in the PHY bindings nor is it in the driver.

Correct to txen-skew-psec which is clearly what it was meant to be.

Given that the default for txen-skew-psec is 0, and the device tree is
only trying to set it to 0 anyway, there should not be any functional
change from this fix.

Fixes: 361b0dcbd7f9 ("arm64: dts: renesas: rzg2l-smarc-som: Enable Ethernet")
Fixes: 6494e4f90503 ("arm64: dts: renesas: rzg2ul-smarc-som: Enable Ethernet on SMARC platform")
Fixes: ce0c63b6a5ef ("arm64: dts: renesas: Add initial device tree for RZ/G2LC SMARC EVK")
Cc: stable@vger.kernel.org # 6.1.y
Reported-by: Tomohiro Komagata <tomohiro.komagata.aj@renesas.com>
Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230609221136.7431-1-chris.paterson2@renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/boot/dts/renesas/rzg2l-smarc-som.dtsi  |    4 ++--
 arch/arm64/boot/dts/renesas/rzg2lc-smarc-som.dtsi |    2 +-
 arch/arm64/boot/dts/renesas/rzg2ul-smarc-som.dtsi |    4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

--- a/arch/arm64/boot/dts/renesas/rzg2l-smarc-som.dtsi
+++ b/arch/arm64/boot/dts/renesas/rzg2l-smarc-som.dtsi
@@ -100,7 +100,7 @@
 		rxc-skew-psec = <2400>;
 		txc-skew-psec = <2400>;
 		rxdv-skew-psec = <0>;
-		txdv-skew-psec = <0>;
+		txen-skew-psec = <0>;
 		rxd0-skew-psec = <0>;
 		rxd1-skew-psec = <0>;
 		rxd2-skew-psec = <0>;
@@ -128,7 +128,7 @@
 		rxc-skew-psec = <2400>;
 		txc-skew-psec = <2400>;
 		rxdv-skew-psec = <0>;
-		txdv-skew-psec = <0>;
+		txen-skew-psec = <0>;
 		rxd0-skew-psec = <0>;
 		rxd1-skew-psec = <0>;
 		rxd2-skew-psec = <0>;
--- a/arch/arm64/boot/dts/renesas/rzg2lc-smarc-som.dtsi
+++ b/arch/arm64/boot/dts/renesas/rzg2lc-smarc-som.dtsi
@@ -77,7 +77,7 @@
 		rxc-skew-psec = <2400>;
 		txc-skew-psec = <2400>;
 		rxdv-skew-psec = <0>;
-		txdv-skew-psec = <0>;
+		txen-skew-psec = <0>;
 		rxd0-skew-psec = <0>;
 		rxd1-skew-psec = <0>;
 		rxd2-skew-psec = <0>;
--- a/arch/arm64/boot/dts/renesas/rzg2ul-smarc-som.dtsi
+++ b/arch/arm64/boot/dts/renesas/rzg2ul-smarc-som.dtsi
@@ -80,7 +80,7 @@
 		rxc-skew-psec = <2400>;
 		txc-skew-psec = <2400>;
 		rxdv-skew-psec = <0>;
-		txdv-skew-psec = <0>;
+		txen-skew-psec = <0>;
 		rxd0-skew-psec = <0>;
 		rxd1-skew-psec = <0>;
 		rxd2-skew-psec = <0>;
@@ -107,7 +107,7 @@
 		rxc-skew-psec = <2400>;
 		txc-skew-psec = <2400>;
 		rxdv-skew-psec = <0>;
-		txdv-skew-psec = <0>;
+		txen-skew-psec = <0>;
 		rxd0-skew-psec = <0>;
 		rxd1-skew-psec = <0>;
 		rxd2-skew-psec = <0>;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 032/219] [SMB3] send channel sequence number in SMB3 requests after reconnects
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (30 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 031/219] arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Greg Kroah-Hartman
                   ` (197 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Steve French

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steve French <stfrench@microsoft.com>

commit 09ee7a3bf866c0fa5ee1914d2c65958559eb5b4c upstream.

The ChannelSequence field in the SMB3 header is supposed to be
increased after reconnect to allow the server to distinguish
requests from before and after the reconnect.  We had always
been setting it to zero.  There are cases where incrementing
ChannelSequence on requests after network reconnects can reduce
the chance of data corruptions.

See MS-SMB2 3.2.4.1 and 3.2.7.1

Signed-off-by: Steve French <stfrench@microsoft.com>
Cc: stable@vger.kernel.org # 5.16+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/smb/client/cifsglob.h |    1 +
 fs/smb/client/connect.c  |    1 +
 fs/smb/client/smb2ops.c  |   11 ++++++++++-
 fs/smb/client/smb2pdu.c  |   11 +++++++++++
 fs/smb/common/smb2pdu.h  |   22 ++++++++++++++++++++++
 5 files changed, 45 insertions(+), 1 deletion(-)

--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -734,6 +734,7 @@ struct TCP_Server_Info {
 	 */
 #define CIFS_SERVER_IS_CHAN(server)	(!!(server)->primary_server)
 	struct TCP_Server_Info *primary_server;
+	__u16 channel_sequence_num;  /* incremented on primary channel on each chan reconnect */
 
 #ifdef CONFIG_CIFS_SWN_UPCALL
 	bool use_swn_dstaddr;
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -1725,6 +1725,7 @@ cifs_get_tcp_session(struct smb3_fs_cont
 		ctx->target_rfc1001_name, RFC1001_NAME_LEN_WITH_NULL);
 	tcp_ses->session_estab = false;
 	tcp_ses->sequence_number = 0;
+	tcp_ses->channel_sequence_num = 0; /* only tracked for primary channel */
 	tcp_ses->reconnect_instance = 1;
 	tcp_ses->lstrp = jiffies;
 	tcp_ses->compress_algorithm = cpu_to_le16(ctx->compression);
--- a/fs/smb/client/smb2ops.c
+++ b/fs/smb/client/smb2ops.c
@@ -167,8 +167,17 @@ smb2_set_credits(struct TCP_Server_Info
 
 	spin_lock(&server->req_lock);
 	server->credits = val;
-	if (val == 1)
+	if (val == 1) {
 		server->reconnect_instance++;
+		/*
+		 * ChannelSequence updated for all channels in primary channel so that consistent
+		 * across SMB3 requests sent on any channel. See MS-SMB2 3.2.4.1 and 3.2.7.1
+		 */
+		if (CIFS_SERVER_IS_CHAN(server))
+			server->primary_server->channel_sequence_num++;
+		else
+			server->channel_sequence_num++;
+	}
 	scredits = server->credits;
 	in_flight = server->in_flight;
 	spin_unlock(&server->req_lock);
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -88,9 +88,20 @@ smb2_hdr_assemble(struct smb2_hdr *shdr,
 		  const struct cifs_tcon *tcon,
 		  struct TCP_Server_Info *server)
 {
+	struct smb3_hdr_req *smb3_hdr;
 	shdr->ProtocolId = SMB2_PROTO_NUMBER;
 	shdr->StructureSize = cpu_to_le16(64);
 	shdr->Command = smb2_cmd;
+	if (server->dialect >= SMB30_PROT_ID) {
+		/* After reconnect SMB3 must set ChannelSequence on subsequent reqs */
+		smb3_hdr = (struct smb3_hdr_req *)shdr;
+		/* if primary channel is not set yet, use default channel for chan sequence num */
+		if (CIFS_SERVER_IS_CHAN(server))
+			smb3_hdr->ChannelSequence =
+				cpu_to_le16(server->primary_server->channel_sequence_num);
+		else
+			smb3_hdr->ChannelSequence = cpu_to_le16(server->channel_sequence_num);
+	}
 	if (server) {
 		spin_lock(&server->req_lock);
 		/* Request up to 10 credits but don't go over the limit. */
--- a/fs/smb/common/smb2pdu.h
+++ b/fs/smb/common/smb2pdu.h
@@ -153,6 +153,28 @@ struct smb2_hdr {
 	__u8   Signature[16];
 } __packed;
 
+struct smb3_hdr_req {
+	__le32 ProtocolId;	/* 0xFE 'S' 'M' 'B' */
+	__le16 StructureSize;	/* 64 */
+	__le16 CreditCharge;	/* MBZ */
+	__le16 ChannelSequence; /* See MS-SMB2 3.2.4.1 and 3.2.7.1 */
+	__le16 Reserved;
+	__le16 Command;
+	__le16 CreditRequest;	/* CreditResponse */
+	__le32 Flags;
+	__le32 NextCommand;
+	__le64 MessageId;
+	union {
+		struct {
+			__le32 ProcessId;
+			__le32  TreeId;
+		} __packed SyncId;
+		__le64  AsyncId;
+	} __packed Id;
+	__le64  SessionId;
+	__u8   Signature[16];
+} __packed;
+
 struct smb2_pdu {
 	struct smb2_hdr hdr;
 	__le16 StructureSize2; /* size of wct area (varies, request specific) */



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (31 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 032/219] [SMB3] send channel sequence number in SMB3 requests after reconnects Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-20  8:11   ` [REGRESSION] " Jeremi Piotrowski
  2023-09-17 19:12 ` [PATCH 6.1 034/219] mm: hugetlb_vmemmap: fix a race between vmemmap pmd split Greg Kroah-Hartman
                   ` (196 subsequent siblings)
  229 siblings, 1 reply; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Michal Hocko, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michal Hocko <mhocko@suse.com>

commit 86327e8eb94c52eca4f93cfece2e29d1bf52acbf upstream.

kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
deprecated since 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") merged in 5.16.  We haven't heard about any serious
users since then but it seems that the mere presence of the file is
causing more harm thatn good.  We (SUSE) have had several bug reports from
customers where Docker based containers started to fail because a write to
kmem.limit_in_bytes has failed.

This was unexpected because runc code only expects ENOENT (kmem disabled)
or EBUSY (tasks already running within cgroup).  So a new error code was
unexpected and the whole container startup failed.  This has been later
addressed by
https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
so current Docker runtimes do not suffer from the problem anymore.  There
are still older version of Docker in use and likely hard to get rid of
completely.

Address this by wiping out the file completely and effectively get back to
pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

I would recommend backporting to stable trees which have picked up
58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").

[mhocko@suse.com: restore _KMEM switch case]
  Link: https://lkml.kernel.org/r/ZKe5wxdbvPi5Cwd7@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |    2 --
 mm/memcontrol.c                                |   10 ----------
 2 files changed, 12 deletions(-)

--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -91,8 +91,6 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
- memory.kmem.limit_in_bytes          This knob is deprecated and writing to
-                                     it will return -ENOTSUPP.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3841,10 +3841,6 @@ static ssize_t mem_cgroup_write(struct k
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
-		case _KMEM:
-			/* kmem.limit_in_bytes is deprecated. */
-			ret = -EOPNOTSUPP;
-			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5056,12 +5052,6 @@ static struct cftype mem_cgroup_legacy_f
 	},
 #endif
 	{
-		.name = "kmem.limit_in_bytes",
-		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
-		.write = mem_cgroup_write,
-		.read_u64 = mem_cgroup_read_u64,
-	},
-	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
 		.read_u64 = mem_cgroup_read_u64,



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 034/219] mm: hugetlb_vmemmap: fix a race between vmemmap pmd split
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (32 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 035/219] lib/test_meminit: allocate pages up to order MAX_ORDER Greg Kroah-Hartman
                   ` (195 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Muchun Song, Mike Kravetz,
	Andrew Morton

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Muchun Song <songmuchun@bytedance.com>

commit 3ce2c24cb68f228590a053d6058a5901cd31af61 upstream.

The local variable @page in __split_vmemmap_huge_pmd() to obtain a pmd
page without holding page_table_lock may possiblely get the page table
page instead of a huge pmd page.

The effect may be in set_pte_at() since we may pass an invalid page
struct, if set_pte_at() wants to access the page struct (e.g.
CONFIG_PAGE_TABLE_CHECK is enabled), it may crash the kernel.

So fix it.  And inline __split_vmemmap_huge_pmd() since it only has one
user.

Link: https://lkml.kernel.org/r/20230707033859.16148-1-songmuchun@bytedance.com
Fixes: d8d55f5616cf ("mm: sparsemem: use page table lock to protect kernel pmd operations")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/hugetlb_vmemmap.c |   34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -36,14 +36,22 @@ struct vmemmap_remap_walk {
 	struct list_head	*vmemmap_pages;
 };
 
-static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
+static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
 {
 	pmd_t __pmd;
 	int i;
 	unsigned long addr = start;
-	struct page *page = pmd_page(*pmd);
-	pte_t *pgtable = pte_alloc_one_kernel(&init_mm);
+	struct page *head;
+	pte_t *pgtable;
+
+	spin_lock(&init_mm.page_table_lock);
+	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
+	spin_unlock(&init_mm.page_table_lock);
 
+	if (!head)
+		return 0;
+
+	pgtable = pte_alloc_one_kernel(&init_mm);
 	if (!pgtable)
 		return -ENOMEM;
 
@@ -53,7 +61,7 @@ static int __split_vmemmap_huge_pmd(pmd_
 		pte_t entry, *pte;
 		pgprot_t pgprot = PAGE_KERNEL;
 
-		entry = mk_pte(page + i, pgprot);
+		entry = mk_pte(head + i, pgprot);
 		pte = pte_offset_kernel(&__pmd, addr);
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
@@ -65,8 +73,8 @@ static int __split_vmemmap_huge_pmd(pmd_
 		 * be treated as indepdenent small pages (as they can be freed
 		 * individually).
 		 */
-		if (!PageReserved(page))
-			split_page(page, get_order(PMD_SIZE));
+		if (!PageReserved(head))
+			split_page(head, get_order(PMD_SIZE));
 
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
@@ -80,20 +88,6 @@ static int __split_vmemmap_huge_pmd(pmd_
 	return 0;
 }
 
-static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
-{
-	int leaf;
-
-	spin_lock(&init_mm.page_table_lock);
-	leaf = pmd_leaf(*pmd);
-	spin_unlock(&init_mm.page_table_lock);
-
-	if (!leaf)
-		return 0;
-
-	return __split_vmemmap_huge_pmd(pmd, start);
-}
-
 static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
 			      unsigned long end,
 			      struct vmemmap_remap_walk *walk)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 035/219] lib/test_meminit: allocate pages up to order MAX_ORDER
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (33 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 034/219] mm: hugetlb_vmemmap: fix a race between vmemmap pmd split Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 036/219] parisc: led: Fix LAN receive and transmit LEDs Greg Kroah-Hartman
                   ` (194 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Andrew Donnellan,
	Alexander Potapenko, Xiaoke Wang, Andrew Morton

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrew Donnellan <ajd@linux.ibm.com>

commit efb78fa86e95832b78ca0ba60f3706788a818938 upstream.

test_pages() tests the page allocator by calling alloc_pages() with
different orders up to order 10.

However, different architectures and platforms support different maximum
contiguous allocation sizes.  The default maximum allocation order
(MAX_ORDER) is 10, but architectures can use CONFIG_ARCH_FORCE_MAX_ORDER
to override this.  On platforms where this is less than 10, test_meminit()
will blow up with a WARN().  This is expected, so let's not do that.

Replace the hardcoded "10" with the MAX_ORDER macro so that we test
allocations up to the expected platform limit.

Link: https://lkml.kernel.org/r/20230714015238.47931-1-ajd@linux.ibm.com
Fixes: 5015a300a522 ("lib: introduce test_meminit module")
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Xiaoke Wang <xkernel.wang@foxmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 lib/test_meminit.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/test_meminit.c
+++ b/lib/test_meminit.c
@@ -93,7 +93,7 @@ static int __init test_pages(int *total_
 	int failures = 0, num_tests = 0;
 	int i;
 
-	for (i = 0; i < 10; i++)
+	for (i = 0; i <= MAX_ORDER; i++)
 		num_tests += do_alloc_pages_order(i, &failures);
 
 	REPORT_FAILURES_IN_FN();



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 036/219] parisc: led: Fix LAN receive and transmit LEDs
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (34 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 035/219] lib/test_meminit: allocate pages up to order MAX_ORDER Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 037/219] parisc: led: Reduce CPU overhead for disk & lan LED computation Greg Kroah-Hartman
                   ` (193 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Helge Deller

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Helge Deller <deller@gmx.de>

commit 4db89524b084f712a887256391fc19d9f66c8e55 upstream.

Fix the LAN receive and LAN transmit LEDs, which where swapped
up to now.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/parisc/include/asm/led.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/parisc/include/asm/led.h
+++ b/arch/parisc/include/asm/led.h
@@ -11,8 +11,8 @@
 #define	LED1		0x02
 #define	LED0		0x01		/* bottom (or furthest left) LED */
 
-#define	LED_LAN_TX	LED0		/* for LAN transmit activity */
-#define	LED_LAN_RCV	LED1		/* for LAN receive activity */
+#define	LED_LAN_RCV	LED0		/* for LAN receive activity */
+#define	LED_LAN_TX	LED1		/* for LAN transmit activity */
 #define	LED_DISK_IO	LED2		/* for disk activity */
 #define	LED_HEARTBEAT	LED3		/* heartbeat */
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 037/219] parisc: led: Reduce CPU overhead for disk & lan LED computation
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (35 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 036/219] parisc: led: Fix LAN receive and transmit LEDs Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 038/219] cifs: update desired access while requesting for directory lease Greg Kroah-Hartman
                   ` (192 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Helge Deller

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Helge Deller <deller@gmx.de>

commit 358ad816e52d4253b38c2f312e6b1cbd89e0dbf7 upstream.

Older PA-RISC machines have LEDs which show the disk- and LAN-activity.
The computation is done in software and takes quite some time, e.g. on a
J6500 this may take up to 60% time of one CPU if the machine is loaded
via network traffic.

Since most people don't care about the LEDs, start with LEDs disabled and
just show a CPU heartbeat LED. The disk and LAN LEDs can be turned on
manually via /proc/pdc/led.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/parisc/led.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/parisc/led.c
+++ b/drivers/parisc/led.c
@@ -56,8 +56,8 @@
 static int led_type __read_mostly = -1;
 static unsigned char lastleds;	/* LED state from most recent update */
 static unsigned int led_heartbeat __read_mostly = 1;
-static unsigned int led_diskio    __read_mostly = 1;
-static unsigned int led_lanrxtx   __read_mostly = 1;
+static unsigned int led_diskio    __read_mostly;
+static unsigned int led_lanrxtx   __read_mostly;
 static char lcd_text[32]          __read_mostly;
 static char lcd_text_default[32]  __read_mostly;
 static int  lcd_no_led_support    __read_mostly = 0; /* KittyHawk doesn't support LED on its LCD */



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 038/219] cifs: update desired access while requesting for directory lease
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (36 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 037/219] parisc: led: Reduce CPU overhead for disk & lan LED computation Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 039/219] pinctrl: cherryview: fix address_space_handler() argument Greg Kroah-Hartman
                   ` (191 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Bharath SM, Shyam Prasad N,
	Steve French

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Bharath SM <bharathsm@microsoft.com>

commit b6d44d42313baa45a81ce9b299aeee2ccf3d0ee1 upstream.

We read and cache directory contents when we get directory
lease, so we should ask for read permission to read contents
of directory.

Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/smb/client/cached_dir.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -218,7 +218,7 @@ int open_cached_dir(unsigned int xid, st
 		.tcon = tcon,
 		.path = path,
 		.create_options = cifs_create_options(cifs_sb, CREATE_NOT_FILE),
-		.desired_access = FILE_READ_ATTRIBUTES,
+		.desired_access =  FILE_READ_DATA | FILE_READ_ATTRIBUTES,
 		.disposition = FILE_OPEN,
 		.fid = pfid,
 	};



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 039/219] pinctrl: cherryview: fix address_space_handler() argument
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (37 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 038/219] cifs: update desired access while requesting for directory lease Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 040/219] dt-bindings: clock: xlnx,versal-clk: drop select:false Greg Kroah-Hartman
                   ` (190 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Raag Jadav, Mika Westerberg,
	Andy Shevchenko

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Raag Jadav <raag.jadav@intel.com>

commit d5301c90716a8e20bc961a348182daca00c8e8f0 upstream.

First argument of acpi_*_address_space_handler() APIs is acpi_handle of
the device, which is incorrectly passed in driver ->remove() path here.
Fix it by passing the appropriate argument and while at it, make both
API calls consistent using ACPI_HANDLE().

Fixes: a0b028597d59 ("pinctrl: cherryview: Add support for GMMR GPIO opregion")
Cc: stable@vger.kernel.org
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pinctrl/intel/pinctrl-cherryview.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/pinctrl/intel/pinctrl-cherryview.c
+++ b/drivers/pinctrl/intel/pinctrl-cherryview.c
@@ -1698,7 +1698,6 @@ static int chv_pinctrl_probe(struct plat
 	struct intel_community_context *cctx;
 	struct intel_community *community;
 	struct device *dev = &pdev->dev;
-	struct acpi_device *adev = ACPI_COMPANION(dev);
 	struct intel_pinctrl *pctrl;
 	acpi_status status;
 	unsigned int i;
@@ -1766,7 +1765,7 @@ static int chv_pinctrl_probe(struct plat
 	if (ret)
 		return ret;
 
-	status = acpi_install_address_space_handler(adev->handle,
+	status = acpi_install_address_space_handler(ACPI_HANDLE(dev),
 					community->acpi_space_id,
 					chv_pinctrl_mmio_access_handler,
 					NULL, pctrl);
@@ -1783,7 +1782,7 @@ static int chv_pinctrl_remove(struct pla
 	struct intel_pinctrl *pctrl = platform_get_drvdata(pdev);
 	const struct intel_community *community = &pctrl->communities[0];
 
-	acpi_remove_address_space_handler(ACPI_COMPANION(&pdev->dev),
+	acpi_remove_address_space_handler(ACPI_HANDLE(&pdev->dev),
 					  community->acpi_space_id,
 					  chv_pinctrl_mmio_access_handler);
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 040/219] dt-bindings: clock: xlnx,versal-clk: drop select:false
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (38 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 039/219] pinctrl: cherryview: fix address_space_handler() argument Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 041/219] clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz Greg Kroah-Hartman
                   ` (189 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Krzysztof Kozlowski, Conor Dooley,
	Shubhrajyoti Datta, Stephen Boyd

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

commit 172044e30b00977784269e8ab72132a48293c654 upstream.

select:false makes the schema basically ignored and not effective, which
is clearly not what we want for a device binding.

Fixes: 352546805a44 ("dt-bindings: clock: Add bindings for versal clock driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230728165923.108589-1-krzysztof.kozlowski@linaro.org
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/devicetree/bindings/clock/xlnx,versal-clk.yaml |    2 --
 1 file changed, 2 deletions(-)

--- a/Documentation/devicetree/bindings/clock/xlnx,versal-clk.yaml
+++ b/Documentation/devicetree/bindings/clock/xlnx,versal-clk.yaml
@@ -16,8 +16,6 @@ description: |
   reads required input clock frequencies from the devicetree and acts as clock
   provider for all clock consumers of PS clocks.
 
-select: false
-
 properties:
   compatible:
     const: xlnx,versal-clk



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 041/219] clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (39 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 040/219] dt-bindings: clock: xlnx,versal-clk: drop select:false Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 042/219] clk: imx: pll14xx: align pdiv with reference manual Greg Kroah-Hartman
                   ` (188 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Ahmad Fatoum, Marco Felsch,
	Abel Vesa

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ahmad Fatoum <a.fatoum@pengutronix.de>

commit 72d00e560d10665e6139c9431956a87ded6e9880 upstream.

Since commit b09c68dc57c9 ("clk: imx: pll14xx: Support dynamic rates"),
the driver has the ability to dynamically compute PLL parameters to
approximate the requested rates. This is not always used, because the
logic is as follows:

  - Check if the target rate is hardcoded in the frequency table
  - Check if varying only kdiv is possible, so switch over is glitch free
  - Compute rate dynamically by iterating over pdiv range

If we skip the frequency table for the 1443x PLL, we find that the
computed values differ to the hardcoded ones. This can be valid if the
hardcoded values guarantee for example an earlier lock-in or if the
divisors are chosen, so that other important rates are more likely to
be reached glitch-free.

For rates (393216000 and 361267200, this doesn't seem to be the case:
They are only approximated by existing parameters (393215995 and
361267196 Hz, respectively) and they aren't reachable glitch-free from
other hardcoded frequencies. Dropping them from the table allows us
to lock-in to these frequencies exactly.

This is immediately noticeable because they are the assigned-clock-rates
for IMX8MN_AUDIO_PLL1 and IMX8MN_AUDIO_PLL2, respectively and a look
into clk_summary so far showed that they were a few Hz short of the target:

imx8mn-board:~# grep audio_pll[12]_out /sys/kernel/debug/clk/clk_summary
audio_pll2_out           0        0        0   361267196 0     0  50000   N
audio_pll1_out           1        1        0   393215995 0     0  50000   Y

and afterwards:

imx8mn-board:~# grep audio_pll[12]_out /sys/kernel/debug/clk/clk_summary
audio_pll2_out           0        0        0   361267200 0     0  50000   N
audio_pll1_out           1        1        0   393216000 0     0  50000   Y

This change is equivalent to adding following hardcoded values:

  /*               rate     mdiv  pdiv  sdiv   kdiv */
  PLL_1443X_RATE(393216000, 655,    5,    3,  23593),
  PLL_1443X_RATE(361267200, 497,   33,    0, -16882),

Fixes: 053a4ffe2988 ("clk: imx: imx8mm: fix audio pll setting")
Cc: stable@vger.kernel.org # v5.18+
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Link: https://lore.kernel.org/r/20230807084744.1184791-2-m.felsch@pengutronix.de
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/imx/clk-pll14xx.c |    2 --
 1 file changed, 2 deletions(-)

--- a/drivers/clk/imx/clk-pll14xx.c
+++ b/drivers/clk/imx/clk-pll14xx.c
@@ -62,8 +62,6 @@ static const struct imx_pll14xx_rate_tab
 	PLL_1443X_RATE(650000000U, 325, 3, 2, 0),
 	PLL_1443X_RATE(594000000U, 198, 2, 2, 0),
 	PLL_1443X_RATE(519750000U, 173, 2, 2, 16384),
-	PLL_1443X_RATE(393216000U, 262, 2, 3, 9437),
-	PLL_1443X_RATE(361267200U, 361, 3, 3, 17511),
 };
 
 struct imx_pll14xx_clk imx_1443x_pll = {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 042/219] clk: imx: pll14xx: align pdiv with reference manual
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (40 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 041/219] clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 043/219] clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock Greg Kroah-Hartman
                   ` (187 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Marco Felsch, Abel Vesa, Adam Ford,
	Philipp Zabel, Sascha Hauer, Ahmad Fatoum

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Marco Felsch <m.felsch@pengutronix.de>

commit 37cfd5e457cbdcd030f378127ff2d62776f641e7 upstream.

The PLL14xx hardware can be found on i.MX8M{M,N,P} SoCs and always come
with a 6-bit pre-divider. Neither the reference manuals nor the
datasheets of these SoCs do mention any restrictions. Furthermore the
current code doesn't respect the restrictions from the comment too.

Therefore drop the restriction and align the max pre-divider (pdiv)
value to 63 to get more accurate frequencies.

Fixes: b09c68dc57c9 ("clk: imx: pll14xx: Support dynamic rates")
Cc: stable@vger.kernel.org
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20230807084744.1184791-1-m.felsch@pengutronix.de
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/imx/clk-pll14xx.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

--- a/drivers/clk/imx/clk-pll14xx.c
+++ b/drivers/clk/imx/clk-pll14xx.c
@@ -135,11 +135,10 @@ static void imx_pll14xx_calc_settings(st
 	/*
 	 * Fractional PLL constrains:
 	 *
-	 * a) 6MHz <= prate <= 25MHz
-	 * b) 1 <= p <= 63 (1 <= p <= 4 prate = 24MHz)
-	 * c) 64 <= m <= 1023
-	 * d) 0 <= s <= 6
-	 * e) -32768 <= k <= 32767
+	 * a) 1 <= p <= 63
+	 * b) 64 <= m <= 1023
+	 * c) 0 <= s <= 6
+	 * d) -32768 <= k <= 32767
 	 *
 	 * fvco = (m * 65536 + k) * prate / (p * 65536)
 	 */
@@ -182,7 +181,7 @@ static void imx_pll14xx_calc_settings(st
 	}
 
 	/* Finally calculate best values */
-	for (pdiv = 1; pdiv <= 7; pdiv++) {
+	for (pdiv = 1; pdiv <= 63; pdiv++) {
 		for (sdiv = 0; sdiv <= 6; sdiv++) {
 			/* calc mdiv = round(rate * pdiv * 2^sdiv) / prate) */
 			mdiv = DIV_ROUND_CLOSEST(rate * (pdiv << sdiv), prate);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 043/219] clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (41 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 042/219] clk: imx: pll14xx: align pdiv with reference manual Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 044/219] soc: qcom: qmi_encdec: Restrict string length in decode Greg Kroah-Hartman
                   ` (186 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, stable, Neil Armstrong,
	Dmitry Baryshkov, Konrad Dybcio, Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

commit 1583694bb4eaf186f17131dbc1b83d6057d2749b upstream.

The pll0_vote clock definitely should have pll0 as a parent (instead of
pll8).

Fixes: 7792a8d6713c ("clk: mdm9615: Add support for MDM9615 Clock Controllers")
Cc: stable@kernel.org
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20230512211727.3445575-7-dmitry.baryshkov@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/gcc-mdm9615.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/clk/qcom/gcc-mdm9615.c
+++ b/drivers/clk/qcom/gcc-mdm9615.c
@@ -58,7 +58,7 @@ static struct clk_regmap pll0_vote = {
 	.enable_mask = BIT(0),
 	.hw.init = &(struct clk_init_data){
 		.name = "pll0_vote",
-		.parent_names = (const char *[]){ "pll8" },
+		.parent_names = (const char *[]){ "pll0" },
 		.num_parents = 1,
 		.ops = &clk_pll_vote_ops,
 	},



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 044/219] soc: qcom: qmi_encdec: Restrict string length in decode
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (42 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 043/219] clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 045/219] clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors Greg Kroah-Hartman
                   ` (185 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Chris Lew, Praveenkumar I,
	Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Chris Lew <quic_clew@quicinc.com>

commit 8d207400fd6b79c92aeb2f33bb79f62dff904ea2 upstream.

The QMI TLV value for strings in a lot of qmi element info structures
account for null terminated strings with MAX_LEN + 1. If a string is
actually MAX_LEN + 1 length, this will cause an out of bounds access
when the NULL character is appended in decoding.

Fixes: 9b8a11e82615 ("soc: qcom: Introduce QMI encoder/decoder")
Cc: stable@vger.kernel.org
Signed-off-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Praveenkumar I <quic_ipkumar@quicinc.com>
Link: https://lore.kernel.org/r/20230801064712.3590128-1-quic_ipkumar@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/soc/qcom/qmi_encdec.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/soc/qcom/qmi_encdec.c
+++ b/drivers/soc/qcom/qmi_encdec.c
@@ -534,8 +534,8 @@ static int qmi_decode_string_elem(const
 		decoded_bytes += rc;
 	}
 
-	if (string_len > temp_ei->elem_len) {
-		pr_err("%s: String len %d > Max Len %d\n",
+	if (string_len >= temp_ei->elem_len) {
+		pr_err("%s: String len %d >= Max Len %d\n",
 		       __func__, string_len, temp_ei->elem_len);
 		return -ETOOSMALL;
 	} else if (string_len > tlv_len) {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 045/219] clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (43 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 044/219] soc: qcom: qmi_encdec: Restrict string length in decode Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 046/219] clk: qcom: lpasscc-sc7280: fix missing resume during probe Greg Kroah-Hartman
                   ` (184 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Dmitry Baryshkov, Johan Hovold,
	Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan+linaro@kernel.org>

commit b0f3d01bda6c3f6f811e70f76d2040ae81f64565 upstream.

Make sure to decrement the runtime PM usage count before returning in
case regmap initialisation fails.

Fixes: 16fb89f92ec4 ("clk: qcom: Add support for Display Clock Controller on SM8450")
Cc: stable@vger.kernel.org      # 6.1
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230718132902.21430-3-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/dispcc-sm8450.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

--- a/drivers/clk/qcom/dispcc-sm8450.c
+++ b/drivers/clk/qcom/dispcc-sm8450.c
@@ -1783,8 +1783,10 @@ static int disp_cc_sm8450_probe(struct p
 		return ret;
 
 	regmap = qcom_cc_map(pdev, &disp_cc_sm8450_desc);
-	if (IS_ERR(regmap))
-		return PTR_ERR(regmap);
+	if (IS_ERR(regmap)) {
+		ret = PTR_ERR(regmap);
+		goto err_put_rpm;
+	}
 
 	clk_lucid_evo_pll_configure(&disp_cc_pll0, regmap, &disp_cc_pll0_config);
 	clk_lucid_evo_pll_configure(&disp_cc_pll1, regmap, &disp_cc_pll1_config);
@@ -1799,9 +1801,16 @@ static int disp_cc_sm8450_probe(struct p
 	regmap_update_bits(regmap, 0xe05c, BIT(0), BIT(0));
 
 	ret = qcom_cc_really_probe(pdev, &disp_cc_sm8450_desc, regmap);
+	if (ret)
+		goto err_put_rpm;
 
 	pm_runtime_put(&pdev->dev);
 
+	return 0;
+
+err_put_rpm:
+	pm_runtime_put_sync(&pdev->dev);
+
 	return ret;
 }
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 046/219] clk: qcom: lpasscc-sc7280: fix missing resume during probe
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (44 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 045/219] clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 047/219] clk: qcom: q6sstop-qcs404: " Greg Kroah-Hartman
                   ` (183 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Taniya Das, Johan Hovold,
	Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan+linaro@kernel.org>

commit 66af5339d4f8e20c6d89a490570bd94d40f1a7f6 upstream.

Drivers that enable runtime PM must make sure that the controller is
runtime resumed before accessing its registers to prevent the power
domain from being disabled.

Fixes: 4ab43d171181 ("clk: qcom: Add lpass clock controller driver for SC7280")
Cc: stable@vger.kernel.org      # 5.16
Cc: Taniya Das <quic_tdas@quicinc.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230718132902.21430-6-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/lpasscc-sc7280.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/drivers/clk/qcom/lpasscc-sc7280.c
+++ b/drivers/clk/qcom/lpasscc-sc7280.c
@@ -115,9 +115,13 @@ static int lpass_cc_sc7280_probe(struct
 	ret = pm_clk_add(&pdev->dev, "iface");
 	if (ret < 0) {
 		dev_err(&pdev->dev, "failed to acquire iface clock\n");
-		goto destroy_pm_clk;
+		goto err_destroy_pm_clk;
 	}
 
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret)
+		goto err_destroy_pm_clk;
+
 	if (!of_property_read_bool(pdev->dev.of_node, "qcom,adsp-pil-mode")) {
 		lpass_regmap_config.name = "qdsp6ss";
 		lpass_regmap_config.max_register = 0x3f;
@@ -125,7 +129,7 @@ static int lpass_cc_sc7280_probe(struct
 
 		ret = qcom_cc_probe_by_index(pdev, 0, desc);
 		if (ret)
-			goto destroy_pm_clk;
+			goto err_put_rpm;
 	}
 
 	lpass_regmap_config.name = "top_cc";
@@ -134,11 +138,15 @@ static int lpass_cc_sc7280_probe(struct
 
 	ret = qcom_cc_probe_by_index(pdev, 1, desc);
 	if (ret)
-		goto destroy_pm_clk;
+		goto err_put_rpm;
+
+	pm_runtime_put(&pdev->dev);
 
 	return 0;
 
-destroy_pm_clk:
+err_put_rpm:
+	pm_runtime_put_sync(&pdev->dev);
+err_destroy_pm_clk:
 	pm_clk_destroy(&pdev->dev);
 
 disable_pm_runtime:



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 047/219] clk: qcom: q6sstop-qcs404: fix missing resume during probe
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (45 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 046/219] clk: qcom: lpasscc-sc7280: fix missing resume during probe Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 048/219] clk: qcom: mss-sc7180: " Greg Kroah-Hartman
                   ` (182 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Johan Hovold, Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan+linaro@kernel.org>

commit 97112c83f4671a4a722f99a53be4e91fac4091bc upstream.

Drivers that enable runtime PM must make sure that the controller is
runtime resumed before accessing its registers to prevent the power
domain from being disabled.

Fixes: 6cdef2738db0 ("clk: qcom: Add Q6SSTOP clock controller for QCS404")
Cc: stable@vger.kernel.org      # 5.5
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230718132902.21430-7-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/q6sstop-qcs404.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/drivers/clk/qcom/q6sstop-qcs404.c
+++ b/drivers/clk/qcom/q6sstop-qcs404.c
@@ -174,21 +174,32 @@ static int q6sstopcc_qcs404_probe(struct
 		return ret;
 	}
 
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret)
+		return ret;
+
 	q6sstop_regmap_config.name = "q6sstop_tcsr";
 	desc = &tcsr_qcs404_desc;
 
 	ret = qcom_cc_probe_by_index(pdev, 1, desc);
 	if (ret)
-		return ret;
+		goto err_put_rpm;
 
 	q6sstop_regmap_config.name = "q6sstop_cc";
 	desc = &q6sstop_qcs404_desc;
 
 	ret = qcom_cc_probe_by_index(pdev, 0, desc);
 	if (ret)
-		return ret;
+		goto err_put_rpm;
+
+	pm_runtime_put(&pdev->dev);
 
 	return 0;
+
+err_put_rpm:
+	pm_runtime_put_sync(&pdev->dev);
+
+	return ret;
 }
 
 static const struct dev_pm_ops q6sstopcc_pm_ops = {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 048/219] clk: qcom: mss-sc7180: fix missing resume during probe
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (46 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 047/219] clk: qcom: q6sstop-qcs404: " Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 049/219] NFS: Fix a potential data corruption Greg Kroah-Hartman
                   ` (181 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Taniya Das, Johan Hovold,
	Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Johan Hovold <johan+linaro@kernel.org>

commit e2349da0fa7ca822cda72f427345b95795358fe7 upstream.

Drivers that enable runtime PM must make sure that the controller is
runtime resumed before accessing its registers to prevent the power
domain from being disabled.

Fixes: 8def929c4097 ("clk: qcom: Add modem clock controller driver for SC7180")
Cc: stable@vger.kernel.org      # 5.7
Cc: Taniya Das <quic_tdas@quicinc.com>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230718132902.21430-8-johan+linaro@kernel.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clk/qcom/mss-sc7180.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/drivers/clk/qcom/mss-sc7180.c
+++ b/drivers/clk/qcom/mss-sc7180.c
@@ -87,11 +87,22 @@ static int mss_sc7180_probe(struct platf
 		return ret;
 	}
 
+	ret = pm_runtime_resume_and_get(&pdev->dev);
+	if (ret)
+		return ret;
+
 	ret = qcom_cc_probe(pdev, &mss_sc7180_desc);
 	if (ret < 0)
-		return ret;
+		goto err_put_rpm;
+
+	pm_runtime_put(&pdev->dev);
 
 	return 0;
+
+err_put_rpm:
+	pm_runtime_put_sync(&pdev->dev);
+
+	return ret;
 }
 
 static const struct dev_pm_ops mss_sc7180_pm_ops = {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 049/219] NFS: Fix a potential data corruption
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (47 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 048/219] clk: qcom: mss-sc7180: " Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 050/219] NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info Greg Kroah-Hartman
                   ` (180 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Trond Myklebust, Anna Schumaker

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <trond.myklebust@hammerspace.com>

commit 88975a55969e11f26fe3846bf4fbf8e7dc8cbbd4 upstream.

We must ensure that the subrequests are joined back into the head before
we can retransmit a request. If the head was not on the commit lists,
because the server wrote it synchronously, we still need to add it back
to the retransmission list.
Add a call that mirrors the effect of nfs_cancel_remove_inode() for
O_DIRECT.

Fixes: ed5d588fe47f ("NFS: Try to join page groups before an O_DIRECT retransmission")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/direct.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -474,13 +474,31 @@ out:
 	return result;
 }
 
+static void nfs_direct_add_page_head(struct list_head *list,
+				     struct nfs_page *req)
+{
+	struct nfs_page *head = req->wb_head;
+
+	if (!list_empty(&head->wb_list) || !nfs_lock_request(head))
+		return;
+	if (!list_empty(&head->wb_list)) {
+		nfs_unlock_request(head);
+		return;
+	}
+	list_add(&head->wb_list, list);
+	kref_get(&head->wb_kref);
+	kref_get(&head->wb_kref);
+}
+
 static void nfs_direct_join_group(struct list_head *list, struct inode *inode)
 {
 	struct nfs_page *req, *subreq;
 
 	list_for_each_entry(req, list, wb_list) {
-		if (req->wb_head != req)
+		if (req->wb_head != req) {
+			nfs_direct_add_page_head(&req->wb_list, req);
 			continue;
+		}
 		subreq = req->wb_this_page;
 		if (subreq == req)
 			continue;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 050/219] NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (48 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 049/219] NFS: Fix a potential data corruption Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 051/219] bus: mhi: host: Skip MHI reset if device is in RDDM Greg Kroah-Hartman
                   ` (179 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Fedor Pchelkin, Benjamin Coddington,
	Anna Schumaker

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Fedor Pchelkin <pchelkin@ispras.ru>

commit 96562c45af5c31b89a197af28f79bfa838fb8391 upstream.

It is an almost improbable error case but when page allocating loop in
nfs4_get_device_info() fails then we should only free the already
allocated pages, as __free_page() can't deal with NULL arguments.

Found by Linux Verification Center (linuxtesting.org).

Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/nfs/pnfs_dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/nfs/pnfs_dev.c
+++ b/fs/nfs/pnfs_dev.c
@@ -154,7 +154,7 @@ nfs4_get_device_info(struct nfs_server *
 		set_bit(NFS_DEVICEID_NOCACHE, &d->flags);
 
 out_free_pages:
-	for (i = 0; i < max_pages; i++)
+	while (--i >= 0)
 		__free_page(pages[i]);
 	kfree(pages);
 out_free_pdev:



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 051/219] bus: mhi: host: Skip MHI reset if device is in RDDM
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (49 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 050/219] NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:12 ` [PATCH 6.1 052/219] net: add SKB_HEAD_ALIGN() helper Greg Kroah-Hartman
                   ` (178 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Qiang Yu, Jeffrey Hugo,
	Manivannan Sadhasivam, Manivannan Sadhasivam

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Qiang Yu <quic_qianyu@quicinc.com>

commit cabce92dd805945a090dc6fc73b001bb35ed083a upstream.

In RDDM EE, device can not process MHI reset issued by host. In case of MHI
power off, host is issuing MHI reset and polls for it to get cleared until
it times out. Since this timeout can not be avoided in case of RDDM, skip
the MHI reset in this scenarios.

Cc: <stable@vger.kernel.org>
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://lore.kernel.org/r/1684390959-17836-1-git-send-email-quic_qianyu@quicinc.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/bus/mhi/host/pm.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/bus/mhi/host/pm.c
+++ b/drivers/bus/mhi/host/pm.c
@@ -470,6 +470,10 @@ static void mhi_pm_disable_transition(st
 
 	/* Trigger MHI RESET so that the device will not access host memory */
 	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
+		/* Skip MHI RESET if in RDDM state */
+		if (mhi_cntrl->rddm_image && mhi_get_exec_env(mhi_cntrl) == MHI_EE_RDDM)
+			goto skip_mhi_reset;
+
 		dev_dbg(dev, "Triggering MHI Reset in device\n");
 		mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
 
@@ -495,6 +499,7 @@ static void mhi_pm_disable_transition(st
 		}
 	}
 
+skip_mhi_reset:
 	dev_dbg(dev,
 		 "Waiting for all pending event ring processing to complete\n");
 	mhi_event = mhi_cntrl->mhi_event;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 052/219] net: add SKB_HEAD_ALIGN() helper
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (50 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 051/219] bus: mhi: host: Skip MHI reset if device is in RDDM Greg Kroah-Hartman
@ 2023-09-17 19:12 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 053/219] net: remove osize variable in __alloc_skb() Greg Kroah-Hartman
                   ` (177 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Soheil Hassas Yeganeh,
	Paolo Abeni, Alexander Duyck, Jakub Kicinski, Ajay Kaher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit 115f1a5c42bdad9a9ea356fc0b4a39ec7537947f upstream.

We have many places using this expression:

 SKB_DATA_ALIGN(sizeof(struct skb_shared_info))

Use of SKB_HEAD_ALIGN() will allow to clean them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[Ajay: Regenerated the patch for v6.1.y]
Signed-off-by: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/skbuff.h |    8 ++++++++
 net/core/skbuff.c      |   18 ++++++------------
 2 files changed, 14 insertions(+), 12 deletions(-)

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -261,6 +261,14 @@
 #define SKB_DATA_ALIGN(X)	ALIGN(X, SMP_CACHE_BYTES)
 #define SKB_WITH_OVERHEAD(X)	\
 	((X) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
+
+/* For X bytes available in skb->head, what is the minimal
+ * allocation needed, knowing struct skb_shared_info needs
+ * to be aligned.
+ */
+#define SKB_HEAD_ALIGN(X) (SKB_DATA_ALIGN(X) + \
+	SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
+
 #define SKB_MAX_ORDER(X, ORDER) \
 	SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
 #define SKB_MAX_HEAD(X)		(SKB_MAX_ORDER((X), 0))
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -504,8 +504,7 @@ struct sk_buff *__alloc_skb(unsigned int
 	 * aligned memory blocks, unless SLUB/SLAB debug is enabled.
 	 * Both skb->head and skb_shared_info are cache line aligned.
 	 */
-	size = SKB_DATA_ALIGN(size);
-	size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	size = SKB_HEAD_ALIGN(size);
 	osize = kmalloc_size_roundup(size);
 	data = kmalloc_reserve(osize, gfp_mask, node, &pfmemalloc);
 	if (unlikely(!data))
@@ -578,8 +577,7 @@ struct sk_buff *__netdev_alloc_skb(struc
 		goto skb_success;
 	}
 
-	len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-	len = SKB_DATA_ALIGN(len);
+	len = SKB_HEAD_ALIGN(len);
 
 	if (sk_memalloc_socks())
 		gfp_mask |= __GFP_MEMALLOC;
@@ -678,8 +676,7 @@ struct sk_buff *__napi_alloc_skb(struct
 		data = page_frag_alloc_1k(&nc->page_small, gfp_mask);
 		pfmemalloc = NAPI_SMALL_PAGE_PFMEMALLOC(nc->page_small);
 	} else {
-		len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-		len = SKB_DATA_ALIGN(len);
+		len = SKB_HEAD_ALIGN(len);
 
 		data = page_frag_alloc(&nc->page, len, gfp_mask);
 		pfmemalloc = nc->page.pfmemalloc;
@@ -1837,8 +1834,7 @@ int pskb_expand_head(struct sk_buff *skb
 	if (skb_pfmemalloc(skb))
 		gfp_mask |= __GFP_MEMALLOC;
 
-	size = SKB_DATA_ALIGN(size);
-	size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	size = SKB_HEAD_ALIGN(size);
 	size = kmalloc_size_roundup(size);
 	data = kmalloc_reserve(size, gfp_mask, NUMA_NO_NODE, NULL);
 	if (!data)
@@ -6204,8 +6200,7 @@ static int pskb_carve_inside_header(stru
 	if (skb_pfmemalloc(skb))
 		gfp_mask |= __GFP_MEMALLOC;
 
-	size = SKB_DATA_ALIGN(size);
-	size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	size = SKB_HEAD_ALIGN(size);
 	size = kmalloc_size_roundup(size);
 	data = kmalloc_reserve(size, gfp_mask, NUMA_NO_NODE, NULL);
 	if (!data)
@@ -6323,8 +6318,7 @@ static int pskb_carve_inside_nonlinear(s
 	if (skb_pfmemalloc(skb))
 		gfp_mask |= __GFP_MEMALLOC;
 
-	size = SKB_DATA_ALIGN(size);
-	size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	size = SKB_HEAD_ALIGN(size);
 	size = kmalloc_size_roundup(size);
 	data = kmalloc_reserve(size, gfp_mask, NUMA_NO_NODE, NULL);
 	if (!data)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 053/219] net: remove osize variable in __alloc_skb()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (51 preceding siblings ...)
  2023-09-17 19:12 ` [PATCH 6.1 052/219] net: add SKB_HEAD_ALIGN() helper Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 054/219] net: factorize code in kmalloc_reserve() Greg Kroah-Hartman
                   ` (176 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Soheil Hassas Yeganeh,
	Paolo Abeni, Alexander Duyck, Jakub Kicinski, Ajay Kaher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit 65998d2bf857b9ae5acc1f3b70892bd1b429ccab upstream.

This is a cleanup patch, to prepare following change.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[Ajay: Regenerated the patch for v6.1.y]
Signed-off-by: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -479,7 +479,6 @@ struct sk_buff *__alloc_skb(unsigned int
 {
 	struct kmem_cache *cache;
 	struct sk_buff *skb;
-	unsigned int osize;
 	bool pfmemalloc;
 	u8 *data;
 
@@ -505,16 +504,15 @@ struct sk_buff *__alloc_skb(unsigned int
 	 * Both skb->head and skb_shared_info are cache line aligned.
 	 */
 	size = SKB_HEAD_ALIGN(size);
-	osize = kmalloc_size_roundup(size);
-	data = kmalloc_reserve(osize, gfp_mask, node, &pfmemalloc);
+	size = kmalloc_size_roundup(size);
+	data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc);
 	if (unlikely(!data))
 		goto nodata;
 	/* kmalloc_size_roundup() might give us more room than requested.
 	 * Put skb_shared_info exactly at the end of allocated zone,
 	 * to allow max possible filling before reallocation.
 	 */
-	size = SKB_WITH_OVERHEAD(osize);
-	prefetchw(data + size);
+	prefetchw(data + SKB_WITH_OVERHEAD(size));
 
 	/*
 	 * Only clear those fields we need to clear, not those that we will
@@ -522,7 +520,7 @@ struct sk_buff *__alloc_skb(unsigned int
 	 * the tail pointer in struct sk_buff!
 	 */
 	memset(skb, 0, offsetof(struct sk_buff, tail));
-	__build_skb_around(skb, data, osize);
+	__build_skb_around(skb, data, size);
 	skb->pfmemalloc = pfmemalloc;
 
 	if (flags & SKB_ALLOC_FCLONE) {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 054/219] net: factorize code in kmalloc_reserve()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (52 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 053/219] net: remove osize variable in __alloc_skb() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 055/219] net: deal with integer overflows " Greg Kroah-Hartman
                   ` (175 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Soheil Hassas Yeganeh,
	Paolo Abeni, Alexander Duyck, Jakub Kicinski, Ajay Kaher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit 5c0e820cbbbe2d1c4cea5cd2bfc1302c123436df upstream.

All kmalloc_reserve() callers have to make the same computation,
we can factorize them, to prepare following patch in the series.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[Ajay: Regenerated the patch for v6.1.y]
Signed-off-by: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |   27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -424,17 +424,20 @@ EXPORT_SYMBOL(napi_build_skb);
  * may be used. Otherwise, the packet data may be discarded until enough
  * memory is free
  */
-static void *kmalloc_reserve(size_t size, gfp_t flags, int node,
+static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
 			     bool *pfmemalloc)
 {
-	void *obj;
 	bool ret_pfmemalloc = false;
+	unsigned int obj_size;
+	void *obj;
 
+	obj_size = SKB_HEAD_ALIGN(*size);
+	*size = obj_size = kmalloc_size_roundup(obj_size);
 	/*
 	 * Try a regular allocation, when that fails and we're not entitled
 	 * to the reserves, fail.
 	 */
-	obj = kmalloc_node_track_caller(size,
+	obj = kmalloc_node_track_caller(obj_size,
 					flags | __GFP_NOMEMALLOC | __GFP_NOWARN,
 					node);
 	if (obj || !(gfp_pfmemalloc_allowed(flags)))
@@ -442,7 +445,7 @@ static void *kmalloc_reserve(size_t size
 
 	/* Try again but now we are using pfmemalloc reserves */
 	ret_pfmemalloc = true;
-	obj = kmalloc_node_track_caller(size, flags, node);
+	obj = kmalloc_node_track_caller(obj_size, flags, node);
 
 out:
 	if (pfmemalloc)
@@ -503,9 +506,7 @@ struct sk_buff *__alloc_skb(unsigned int
 	 * aligned memory blocks, unless SLUB/SLAB debug is enabled.
 	 * Both skb->head and skb_shared_info are cache line aligned.
 	 */
-	size = SKB_HEAD_ALIGN(size);
-	size = kmalloc_size_roundup(size);
-	data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc);
+	data = kmalloc_reserve(&size, gfp_mask, node, &pfmemalloc);
 	if (unlikely(!data))
 		goto nodata;
 	/* kmalloc_size_roundup() might give us more room than requested.
@@ -1832,9 +1833,7 @@ int pskb_expand_head(struct sk_buff *skb
 	if (skb_pfmemalloc(skb))
 		gfp_mask |= __GFP_MEMALLOC;
 
-	size = SKB_HEAD_ALIGN(size);
-	size = kmalloc_size_roundup(size);
-	data = kmalloc_reserve(size, gfp_mask, NUMA_NO_NODE, NULL);
+	data = kmalloc_reserve(&size, gfp_mask, NUMA_NO_NODE, NULL);
 	if (!data)
 		goto nodata;
 	size = SKB_WITH_OVERHEAD(size);
@@ -6198,9 +6197,7 @@ static int pskb_carve_inside_header(stru
 	if (skb_pfmemalloc(skb))
 		gfp_mask |= __GFP_MEMALLOC;
 
-	size = SKB_HEAD_ALIGN(size);
-	size = kmalloc_size_roundup(size);
-	data = kmalloc_reserve(size, gfp_mask, NUMA_NO_NODE, NULL);
+	data = kmalloc_reserve(&size, gfp_mask, NUMA_NO_NODE, NULL);
 	if (!data)
 		return -ENOMEM;
 	size = SKB_WITH_OVERHEAD(size);
@@ -6316,9 +6313,7 @@ static int pskb_carve_inside_nonlinear(s
 	if (skb_pfmemalloc(skb))
 		gfp_mask |= __GFP_MEMALLOC;
 
-	size = SKB_HEAD_ALIGN(size);
-	size = kmalloc_size_roundup(size);
-	data = kmalloc_reserve(size, gfp_mask, NUMA_NO_NODE, NULL);
+	data = kmalloc_reserve(&size, gfp_mask, NUMA_NO_NODE, NULL);
 	if (!data)
 		return -ENOMEM;
 	size = SKB_WITH_OVERHEAD(size);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 055/219] net: deal with integer overflows in kmalloc_reserve()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (53 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 054/219] net: factorize code in kmalloc_reserve() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 056/219] kbuild: rpm-pkg: define _arch conditionally Greg Kroah-Hartman
                   ` (174 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot, Kyle Zeng, Eric Dumazet,
	Kees Cook, Vlastimil Babka, David S. Miller, Ajay Kaher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit 915d975b2ffa58a14bfcf16fafe00c41315949ff upstream.

Blamed commit changed:
    ptr = kmalloc(size);
    if (ptr)
      size = ksize(ptr);

to:
    size = kmalloc_size_roundup(size);
    ptr = kmalloc(size);

This allowed various crash as reported by syzbot [1]
and Kyle Zeng.

Problem is that if @size is bigger than 0x80000001,
kmalloc_size_roundup(size) returns 2^32.

kmalloc_reserve() uses a 32bit variable (obj_size),
so 2^32 is truncated to 0.

kmalloc(0) returns ZERO_SIZE_PTR which is not handled by
skb allocations.

Following trace can be triggered if a netdev->mtu is set
close to 0x7fffffff

We might in the future limit netdev->mtu to more sensible
limit (like KMALLOC_MAX_SIZE).

This patch is based on a syzbot report, and also a report
and tentative fix from Kyle Zeng.

[1]
BUG: KASAN: user-memory-access in __build_skb_around net/core/skbuff.c:294 [inline]
BUG: KASAN: user-memory-access in __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527
Write of size 32 at addr 00000000fffffd10 by task syz-executor.4/22554

CPU: 1 PID: 22554 Comm: syz-executor.4 Not tainted 6.1.39-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
Call trace:
dump_backtrace+0x1c8/0x1f4 arch/arm64/kernel/stacktrace.c:279
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:286
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x120/0x1a0 lib/dump_stack.c:106
print_report+0xe4/0x4b4 mm/kasan/report.c:398
kasan_report+0x150/0x1ac mm/kasan/report.c:495
kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189
memset+0x40/0x70 mm/kasan/shadow.c:44
__build_skb_around net/core/skbuff.c:294 [inline]
__alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527
alloc_skb include/linux/skbuff.h:1316 [inline]
igmpv3_newpack+0x104/0x1088 net/ipv4/igmp.c:359
add_grec+0x81c/0x1124 net/ipv4/igmp.c:534
igmpv3_send_cr net/ipv4/igmp.c:667 [inline]
igmp_ifc_timer_expire+0x1b0/0x1008 net/ipv4/igmp.c:810
call_timer_fn+0x1c0/0x9f0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers+0x54c/0x710 kernel/time/timer.c:1790
run_timer_softirq+0x28/0x4c kernel/time/timer.c:1803
_stext+0x380/0xfbc
____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:79
call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891
do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:84
invoke_softirq kernel/softirq.c:437 [inline]
__irq_exit_rcu+0x1c0/0x4cc kernel/softirq.c:683
irq_exit_rcu+0x14/0x78 kernel/softirq.c:695
el0_interrupt+0x7c/0x2e0 arch/arm64/kernel/entry-common.c:717
__el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:724
el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:729
el0t_64_irq+0x1a0/0x1a4 arch/arm64/kernel/entry.S:584

Fixes: 12d6c1d3a2ad ("skbuff: Proactively round up to kmalloc bucket size")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Ajay: Regenerated the patch for v6.1.y]
Signed-off-by: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/core/skbuff.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -428,11 +428,17 @@ static void *kmalloc_reserve(unsigned in
 			     bool *pfmemalloc)
 {
 	bool ret_pfmemalloc = false;
-	unsigned int obj_size;
+	size_t obj_size;
 	void *obj;
 
 	obj_size = SKB_HEAD_ALIGN(*size);
-	*size = obj_size = kmalloc_size_roundup(obj_size);
+
+	obj_size = kmalloc_size_roundup(obj_size);
+	/* The following cast might truncate high-order bits of obj_size, this
+	 * is harmless because kmalloc(obj_size >= 2^32) will fail anyway.
+	 */
+	*size = (unsigned int)obj_size;
+
 	/*
 	 * Try a regular allocation, when that fails and we're not entitled
 	 * to the reserves, fail.



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 056/219] kbuild: rpm-pkg: define _arch conditionally
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (54 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 055/219] net: deal with integer overflows " Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 057/219] kbuild: do not run depmod for make modules_sign Greg Kroah-Hartman
                   ` (173 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Masahiro Yamada, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Masahiro Yamada <masahiroy@kernel.org>

[ Upstream commit 233046a2afd12a4f699305b92ee634eebf1e4f31 ]

Commit 3089b2be0cce ("kbuild: rpm-pkg: fix build error when _arch is
undefined") does not work as intended; _arch is always defined as
$UTS_MACHINE.

The intention was to define _arch to $UTS_MACHINE only when it is not
defined.

Fixes: 3089b2be0cce ("kbuild: rpm-pkg: fix build error when _arch is undefined")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 scripts/package/mkspec | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/package/mkspec b/scripts/package/mkspec
index 70392fd2fd29c..f892cf8e37f03 100755
--- a/scripts/package/mkspec
+++ b/scripts/package/mkspec
@@ -51,7 +51,7 @@ $S	Source: kernel-$__KERNELRELEASE.tar.gz
 	Provides: $PROVIDES
 	# $UTS_MACHINE as a fallback of _arch in case
 	# /usr/lib/rpm/platform/*/macros was not included.
-	%define _arch %{?_arch:$UTS_MACHINE}
+	%{!?_arch: %define _arch $UTS_MACHINE}
 	%define __spec_install_post /usr/lib/rpm/brp-compress || :
 	%define debug_package %{nil}
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 057/219] kbuild: do not run depmod for make modules_sign
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (55 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 056/219] kbuild: rpm-pkg: define _arch conditionally Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 058/219] tpm_crb: Fix an error handling path in crb_acpi_add() Greg Kroah-Hartman
                   ` (172 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Masahiro Yamada, Nicolas Schier,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Masahiro Yamada <masahiroy@kernel.org>

[ Upstream commit 2429742e506a2b5939a62c629c4a46d91df0ada8 ]

Commit 961ab4a3cd66 ("kbuild: merge scripts/Makefile.modsign to
scripts/Makefile.modinst") started to run depmod at the end of
'make modules_sign'.

Move the depmod rule to scripts/Makefile.modinst and run it only when
$(modules_sign_only) is empty.

Fixes: 961ab4a3cd66 ("kbuild: merge scripts/Makefile.modsign to scripts/Makefile.modinst")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index 35fc0d62898dc..f23edaa0e8139 100644
--- a/Makefile
+++ b/Makefile
@@ -1939,7 +1939,9 @@ quiet_cmd_depmod = DEPMOD  $(MODLIB)
 
 modules_install:
 	$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modinst
+ifndef modules_sign_only
 	$(call cmd,depmod)
+endif
 
 else # CONFIG_MODULES
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 058/219] tpm_crb: Fix an error handling path in crb_acpi_add()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (56 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 057/219] kbuild: do not run depmod for make modules_sign Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 059/219] gfs2: Switch to wait_event in gfs2_logd Greg Kroah-Hartman
                   ` (171 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Christophe JAILLET, Matthew Garrett,
	Jarkko Sakkinen, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

[ Upstream commit 9c377852ddfdc557b1370f196b0cfdf28d233460 ]

Some error paths don't call acpi_put_table() before returning.
Branch to the correct place instead of doing some direct return.

Fixes: 4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Matthew Garrett <mgarrett@aurora.tech>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/char/tpm/tpm_crb.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index db0b774207d35..f45239e73c4ca 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -775,12 +775,13 @@ static int crb_acpi_add(struct acpi_device *device)
 				FW_BUG "TPM2 ACPI table has wrong size %u for start method type %d\n",
 				buf->header.length,
 				ACPI_TPM2_COMMAND_BUFFER_WITH_PLUTON);
-			return -EINVAL;
+			rc = -EINVAL;
+			goto out;
 		}
 		crb_pluton = ACPI_ADD_PTR(struct tpm2_crb_pluton, buf, sizeof(*buf));
 		rc = crb_map_pluton(dev, priv, buf, crb_pluton);
 		if (rc)
-			return rc;
+			goto out;
 	}
 
 	priv->sm = sm;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 059/219] gfs2: Switch to wait_event in gfs2_logd
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (57 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 058/219] tpm_crb: Fix an error handling path in crb_acpi_add() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 060/219] gfs2: low-memory forced flush fixes Greg Kroah-Hartman
                   ` (170 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Andreas Gruenbacher, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit 6df373b09b1dcf2f7d579f515f653f89a896d417 ]

In gfs2_logd(), switch from an open-coded wait loop to
wait_event_interruptible_timeout().

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Stable-dep-of: b74cd55aa9a9 ("gfs2: low-memory forced flush fixes")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/gfs2/log.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 61323deb80bc7..69c3facfcbef4 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1304,7 +1304,6 @@ int gfs2_logd(void *data)
 {
 	struct gfs2_sbd *sdp = data;
 	unsigned long t = 1;
-	DEFINE_WAIT(wait);
 
 	while (!kthread_should_stop()) {
 
@@ -1341,17 +1340,11 @@ int gfs2_logd(void *data)
 
 		try_to_freeze();
 
-		do {
-			prepare_to_wait(&sdp->sd_logd_waitq, &wait,
-					TASK_INTERRUPTIBLE);
-			if (!gfs2_ail_flush_reqd(sdp) &&
-			    !gfs2_jrnl_flush_reqd(sdp) &&
-			    !kthread_should_stop())
-				t = schedule_timeout(t);
-		} while(t && !gfs2_ail_flush_reqd(sdp) &&
-			!gfs2_jrnl_flush_reqd(sdp) &&
-			!kthread_should_stop());
-		finish_wait(&sdp->sd_logd_waitq, &wait);
+		t = wait_event_interruptible_timeout(sdp->sd_logd_waitq,
+				gfs2_ail_flush_reqd(sdp) ||
+				gfs2_jrnl_flush_reqd(sdp) ||
+				kthread_should_stop(),
+				t);
 	}
 
 	return 0;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 060/219] gfs2: low-memory forced flush fixes
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (58 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 059/219] gfs2: Switch to wait_event in gfs2_logd Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 061/219] mailbox: qcom-ipcc: fix incorrect num_chans counting Greg Kroah-Hartman
                   ` (169 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Andreas Gruenbacher, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit b74cd55aa9a9d0aca760028a51343ec79812e410 ]

First, function gfs2_ail_flush_reqd checks the SDF_FORCE_AIL_FLUSH flag
to determine if an AIL flush should be forced in low-memory situations.
However, it also immediately clears the flag, and when called repeatedly
as in function gfs2_logd, the flag will be lost.  Fix that by pulling
the SDF_FORCE_AIL_FLUSH flag check out of gfs2_ail_flush_reqd.

Second, function gfs2_writepages sets the SDF_FORCE_AIL_FLUSH flag
whether or not enough pages were written.  If enough pages could be
written, flushing the AIL is unnecessary, though.

Third, gfs2_writepages doesn't wake up logd after setting the
SDF_FORCE_AIL_FLUSH flag, so it can take a long time for logd to react.
It would be preferable to wake up logd, but that hurts the performance
of some workloads and we don't quite understand why so far, so don't
wake up logd so far.

Fixes: b066a4eebd4f ("gfs2: forcibly flush ail to relieve memory pressure")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/gfs2/aops.c | 4 ++--
 fs/gfs2/log.c  | 8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 2f04c0ff7470b..1e9fa26f04fe1 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -182,13 +182,13 @@ static int gfs2_writepages(struct address_space *mapping,
 	int ret;
 
 	/*
-	 * Even if we didn't write any pages here, we might still be holding
+	 * Even if we didn't write enough pages here, we might still be holding
 	 * dirty pages in the ail. We forcibly flush the ail because we don't
 	 * want balance_dirty_pages() to loop indefinitely trying to write out
 	 * pages held in the ail that it can't find.
 	 */
 	ret = iomap_writepages(mapping, wbc, &wpc, &gfs2_writeback_ops);
-	if (ret == 0)
+	if (ret == 0 && wbc->nr_to_write > 0)
 		set_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags);
 	return ret;
 }
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 69c3facfcbef4..e021d5f50c231 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1285,9 +1285,6 @@ static inline int gfs2_ail_flush_reqd(struct gfs2_sbd *sdp)
 {
 	unsigned int used_blocks = sdp->sd_jdesc->jd_blocks - atomic_read(&sdp->sd_log_blks_free);
 
-	if (test_and_clear_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags))
-		return 1;
-
 	return used_blocks + atomic_read(&sdp->sd_log_blks_needed) >=
 		atomic_read(&sdp->sd_log_thresh2);
 }
@@ -1328,7 +1325,9 @@ int gfs2_logd(void *data)
 						  GFS2_LFC_LOGD_JFLUSH_REQD);
 		}
 
-		if (gfs2_ail_flush_reqd(sdp)) {
+		if (test_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags) ||
+		    gfs2_ail_flush_reqd(sdp)) {
+			clear_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags);
 			gfs2_ail1_start(sdp);
 			gfs2_ail1_wait(sdp);
 			gfs2_ail1_empty(sdp, 0);
@@ -1341,6 +1340,7 @@ int gfs2_logd(void *data)
 		try_to_freeze();
 
 		t = wait_event_interruptible_timeout(sdp->sd_logd_waitq,
+				test_bit(SDF_FORCE_AIL_FLUSH, &sdp->sd_flags) ||
 				gfs2_ail_flush_reqd(sdp) ||
 				gfs2_jrnl_flush_reqd(sdp) ||
 				kthread_should_stop(),
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 061/219] mailbox: qcom-ipcc: fix incorrect num_chans counting
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (59 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 060/219] gfs2: low-memory forced flush fixes Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 062/219] kconfig: fix possible buffer overflow Greg Kroah-Hartman
                   ` (168 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Jonathan Marek, Jassi Brar,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jonathan Marek <jonathan@marek.ca>

[ Upstream commit a493208079e299aefdc15169dc80e3da3ebb718a ]

Breaking out early when a match is found leads to an incorrect num_chans
value when more than one ipcc mailbox channel is used by the same device.

Fixes: e9d50e4b4d04 ("mailbox: qcom-ipcc: Dynamic alloc for channel arrangement")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/mailbox/qcom-ipcc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/mailbox/qcom-ipcc.c b/drivers/mailbox/qcom-ipcc.c
index 7e27acf6c0cca..f597a1bd56847 100644
--- a/drivers/mailbox/qcom-ipcc.c
+++ b/drivers/mailbox/qcom-ipcc.c
@@ -227,10 +227,8 @@ static int qcom_ipcc_setup_mbox(struct qcom_ipcc *ipcc,
 			ret = of_parse_phandle_with_args(client_dn, "mboxes",
 						"#mbox-cells", j, &curr_ph);
 			of_node_put(curr_ph.np);
-			if (!ret && curr_ph.np == controller_dn) {
+			if (!ret && curr_ph.np == controller_dn)
 				ipcc->num_chans++;
-				break;
-			}
 		}
 	}
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 062/219] kconfig: fix possible buffer overflow
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (60 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 061/219] mailbox: qcom-ipcc: fix incorrect num_chans counting Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 063/219] Input: iqs7222 - configure power mode before triggering ATI Greg Kroah-Hartman
                   ` (167 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Konstantin Meskhidze,
	Masahiro Yamada, Sasha Levin, Ivanov Mikhail

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>

[ Upstream commit a3b7039bb2b22fcd2ad20d59c00ed4e606ce3754 ]

Buffer 'new_argv' is accessed without bound check after accessing with
bound check via 'new_argc' index.

Fixes: e298f3b49def ("kconfig: add built-in function support")
Co-developed-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 scripts/kconfig/preprocess.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/kconfig/preprocess.c b/scripts/kconfig/preprocess.c
index 748da578b418c..d1f5bcff4b62d 100644
--- a/scripts/kconfig/preprocess.c
+++ b/scripts/kconfig/preprocess.c
@@ -396,6 +396,9 @@ static char *eval_clause(const char *str, size_t len, int argc, char *argv[])
 
 		p++;
 	}
+
+	if (new_argc >= FUNCTION_MAX_ARGS)
+		pperror("too many function arguments");
 	new_argv[new_argc++] = prev;
 
 	/*
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 063/219] Input: iqs7222 - configure power mode before triggering ATI
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (61 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 062/219] kconfig: fix possible buffer overflow Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 064/219] perf trace: Use zfree() to reduce chances of use after free Greg Kroah-Hartman
                   ` (166 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jeff LaBundy, Dmitry Torokhov,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jeff LaBundy <jeff@labundy.com>

[ Upstream commit 2e00b8bf5624767f6be7427b6eb532524793463e ]

If the device drops into ultra-low-power mode before being placed
into normal-power mode as part of ATI being triggered, the device
does not assert any interrupts until the ATI routine is restarted
two seconds later.

Solve this problem by adopting the vendor's recommendation, which
calls for the device to be placed into normal-power mode prior to
being configured and ATI being triggered.

The original implementation followed this sequence, but the order
was inadvertently changed as part of the resolution of a separate
erratum.

Fixes: 1e4189d8af27 ("Input: iqs7222 - protect volatile registers")
Signed-off-by: Jeff LaBundy <jeff@labundy.com>
Link: https://lore.kernel.org/r/ZKrpHc2Ji9qR25r2@nixie71
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/input/misc/iqs7222.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/input/misc/iqs7222.c b/drivers/input/misc/iqs7222.c
index e47ab6c1177f5..f24b174c72667 100644
--- a/drivers/input/misc/iqs7222.c
+++ b/drivers/input/misc/iqs7222.c
@@ -1381,9 +1381,6 @@ static int iqs7222_ati_trigger(struct iqs7222_private *iqs7222)
 	if (error)
 		return error;
 
-	sys_setup &= ~IQS7222_SYS_SETUP_INTF_MODE_MASK;
-	sys_setup &= ~IQS7222_SYS_SETUP_PWR_MODE_MASK;
-
 	for (i = 0; i < IQS7222_NUM_RETRIES; i++) {
 		/*
 		 * Trigger ATI from streaming and normal-power modes so that
@@ -1561,8 +1558,11 @@ static int iqs7222_dev_init(struct iqs7222_private *iqs7222, int dir)
 			return error;
 	}
 
-	if (dir == READ)
+	if (dir == READ) {
+		iqs7222->sys_setup[0] &= ~IQS7222_SYS_SETUP_INTF_MODE_MASK;
+		iqs7222->sys_setup[0] &= ~IQS7222_SYS_SETUP_PWR_MODE_MASK;
 		return 0;
+	}
 
 	return iqs7222_ati_trigger(iqs7222);
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 064/219] perf trace: Use zfree() to reduce chances of use after free
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (62 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 063/219] Input: iqs7222 - configure power mode before triggering ATI Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 065/219] perf trace: Really free the evsel->priv area Greg Kroah-Hartman
                   ` (165 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Arnaldo Carvalho de Melo,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnaldo Carvalho de Melo <acme@redhat.com>

[ Upstream commit 9997d5dd177c52017fa0541bf236a4232c8148e6 ]

Do defensive programming by using zfree() to initialize freed pointers
to NULL, so that eventual use after free result in a NULL pointer deref
instead of more subtle behaviour.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: 7962ef13651a ("perf trace: Really free the evsel->priv area")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/builtin-trace.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 97b17f8941dc0..6392fcf2610c4 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2293,7 +2293,7 @@ static void syscall__exit(struct syscall *sc)
 	if (!sc)
 		return;
 
-	free(sc->arg_fmt);
+	zfree(&sc->arg_fmt);
 }
 
 static int trace__sys_enter(struct trace *trace, struct evsel *evsel,
@@ -3129,7 +3129,7 @@ static void evlist__free_syscall_tp_fields(struct evlist *evlist)
 		if (!et || !evsel->tp_format || strcmp(evsel->tp_format->system, "syscalls"))
 			continue;
 
-		free(et->fmt);
+		zfree(&et->fmt);
 		free(et);
 	}
 }
@@ -4765,11 +4765,11 @@ static void trace__exit(struct trace *trace)
 	int i;
 
 	strlist__delete(trace->ev_qualifier);
-	free(trace->ev_qualifier_ids.entries);
+	zfree(&trace->ev_qualifier_ids.entries);
 	if (trace->syscalls.table) {
 		for (i = 0; i <= trace->sctbl->syscalls.max_id; i++)
 			syscall__exit(&trace->syscalls.table[i]);
-		free(trace->syscalls.table);
+		zfree(&trace->syscalls.table);
 	}
 	syscalltbl__delete(trace->sctbl);
 	zfree(&trace->perfconfig_events);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 065/219] perf trace: Really free the evsel->priv area
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (63 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 064/219] perf trace: Use zfree() to reduce chances of use after free Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 066/219] pwm: atmel-tcb: Convert to platform remove callback returning void Greg Kroah-Hartman
                   ` (164 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Ian Rogers, Adrian Hunter, Jiri Olsa,
	Namhyung Kim, Riccardo Mancini, Arnaldo Carvalho de Melo,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnaldo Carvalho de Melo <acme@redhat.com>

[ Upstream commit 7962ef13651a9163f07b530607392ea123482e8a ]

In 3cb4d5e00e037c70 ("perf trace: Free syscall tp fields in
evsel->priv") it only was freeing if strcmp(evsel->tp_format->system,
"syscalls") returned zero, while the corresponding initialization of
evsel->priv was being performed if it was _not_ zero, i.e. if the tp
system wasn't 'syscalls'.

Just stop looking for that and free it if evsel->priv was set, which
should be equivalent.

Also use the pre-existing evsel_trace__delete() function.

This resolves these leaks, detected with:

  $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

  =================================================================
  ==481565==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212
      #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205
      #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
  [root@quaco ~]#

With this we plug all leaks with "perf trace sleep 1".

Fixes: 3cb4d5e00e037c70 ("perf trace: Free syscall tp fields in evsel->priv")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/lkml/20230719202951.534582-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/builtin-trace.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 6392fcf2610c4..93dab6423a048 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3124,13 +3124,8 @@ static void evlist__free_syscall_tp_fields(struct evlist *evlist)
 	struct evsel *evsel;
 
 	evlist__for_each_entry(evlist, evsel) {
-		struct evsel_trace *et = evsel->priv;
-
-		if (!et || !evsel->tp_format || strcmp(evsel->tp_format->system, "syscalls"))
-			continue;
-
-		zfree(&et->fmt);
-		free(et);
+		evsel_trace__delete(evsel->priv);
+		evsel->priv = NULL;
 	}
 }
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 066/219] pwm: atmel-tcb: Convert to platform remove callback returning void
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (64 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 065/219] perf trace: Really free the evsel->priv area Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-10-09 19:41   ` Uwe Kleine-König
  2023-09-17 19:13 ` [PATCH 6.1 067/219] pwm: atmel-tcb: Harmonize resource allocation order Greg Kroah-Hartman
                   ` (163 subsequent siblings)
  229 siblings, 1 reply; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Uwe Kleine-König,
	Claudiu Beznea, Thierry Reding, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

[ Upstream commit 9609284a76978daf53a54e05cff36873a75e4d13 ]

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is (mostly) ignored
and this typically results in resource leaks. To improve here there is a
quest to make the remove callback return void. In the first step of this
quest all drivers are converted to .remove_new() which already returns
void.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Stable-dep-of: c11622324c02 ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pwm/pwm-atmel-tcb.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/pwm/pwm-atmel-tcb.c b/drivers/pwm/pwm-atmel-tcb.c
index 2837b4ce8053c..4a116dc44f6e7 100644
--- a/drivers/pwm/pwm-atmel-tcb.c
+++ b/drivers/pwm/pwm-atmel-tcb.c
@@ -500,7 +500,7 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 	return err;
 }
 
-static int atmel_tcb_pwm_remove(struct platform_device *pdev)
+static void atmel_tcb_pwm_remove(struct platform_device *pdev)
 {
 	struct atmel_tcb_pwm_chip *tcbpwm = platform_get_drvdata(pdev);
 
@@ -509,8 +509,6 @@ static int atmel_tcb_pwm_remove(struct platform_device *pdev)
 	clk_disable_unprepare(tcbpwm->slow_clk);
 	clk_put(tcbpwm->slow_clk);
 	clk_put(tcbpwm->clk);
-
-	return 0;
 }
 
 static const struct of_device_id atmel_tcb_pwm_dt_ids[] = {
@@ -564,7 +562,7 @@ static struct platform_driver atmel_tcb_pwm_driver = {
 		.pm = &atmel_tcb_pwm_pm_ops,
 	},
 	.probe = atmel_tcb_pwm_probe,
-	.remove = atmel_tcb_pwm_remove,
+	.remove_new = atmel_tcb_pwm_remove,
 };
 module_platform_driver(atmel_tcb_pwm_driver);
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 067/219] pwm: atmel-tcb: Harmonize resource allocation order
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (65 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 066/219] pwm: atmel-tcb: Convert to platform remove callback returning void Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 068/219] pwm: atmel-tcb: Fix resource freeing in error path and remove Greg Kroah-Hartman
                   ` (162 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Uwe Kleine-König,
	Thierry Reding, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

[ Upstream commit 0323e8fedd1ef25342cf7abf3a2024f5670362b8 ]

Allocate driver data as first resource in the probe function. This way it
can be used during allocation of the other resources (instead of assigning
these to local variables first and update driver data only when it's
allocated). Also as driver data is allocated using a devm function this
should happen first to have the order of freeing resources in the error
path and the remove function in reverse.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Stable-dep-of: c11622324c02 ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pwm/pwm-atmel-tcb.c | 49 +++++++++++++++----------------------
 1 file changed, 20 insertions(+), 29 deletions(-)

diff --git a/drivers/pwm/pwm-atmel-tcb.c b/drivers/pwm/pwm-atmel-tcb.c
index 4a116dc44f6e7..613dd1810fb53 100644
--- a/drivers/pwm/pwm-atmel-tcb.c
+++ b/drivers/pwm/pwm-atmel-tcb.c
@@ -422,13 +422,14 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 	struct atmel_tcb_pwm_chip *tcbpwm;
 	const struct atmel_tcb_config *config;
 	struct device_node *np = pdev->dev.of_node;
-	struct regmap *regmap;
-	struct clk *clk, *gclk = NULL;
-	struct clk *slow_clk;
 	char clk_name[] = "t0_clk";
 	int err;
 	int channel;
 
+	tcbpwm = devm_kzalloc(&pdev->dev, sizeof(*tcbpwm), GFP_KERNEL);
+	if (tcbpwm == NULL)
+		return -ENOMEM;
+
 	err = of_property_read_u32(np, "reg", &channel);
 	if (err < 0) {
 		dev_err(&pdev->dev,
@@ -437,47 +438,37 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 		return err;
 	}
 
-	regmap = syscon_node_to_regmap(np->parent);
-	if (IS_ERR(regmap))
-		return PTR_ERR(regmap);
+	tcbpwm->regmap = syscon_node_to_regmap(np->parent);
+	if (IS_ERR(tcbpwm->regmap))
+		return PTR_ERR(tcbpwm->regmap);
 
-	slow_clk = of_clk_get_by_name(np->parent, "slow_clk");
-	if (IS_ERR(slow_clk))
-		return PTR_ERR(slow_clk);
+	tcbpwm->slow_clk = of_clk_get_by_name(np->parent, "slow_clk");
+	if (IS_ERR(tcbpwm->slow_clk))
+		return PTR_ERR(tcbpwm->slow_clk);
 
 	clk_name[1] += channel;
-	clk = of_clk_get_by_name(np->parent, clk_name);
-	if (IS_ERR(clk))
-		clk = of_clk_get_by_name(np->parent, "t0_clk");
-	if (IS_ERR(clk))
-		return PTR_ERR(clk);
+	tcbpwm->clk = of_clk_get_by_name(np->parent, clk_name);
+	if (IS_ERR(tcbpwm->clk))
+		tcbpwm->clk = of_clk_get_by_name(np->parent, "t0_clk");
+	if (IS_ERR(tcbpwm->clk))
+		return PTR_ERR(tcbpwm->clk);
 
 	match = of_match_node(atmel_tcb_of_match, np->parent);
 	config = match->data;
 
 	if (config->has_gclk) {
-		gclk = of_clk_get_by_name(np->parent, "gclk");
-		if (IS_ERR(gclk))
-			return PTR_ERR(gclk);
-	}
-
-	tcbpwm = devm_kzalloc(&pdev->dev, sizeof(*tcbpwm), GFP_KERNEL);
-	if (tcbpwm == NULL) {
-		err = -ENOMEM;
-		goto err_slow_clk;
+		tcbpwm->gclk = of_clk_get_by_name(np->parent, "gclk");
+		if (IS_ERR(tcbpwm->gclk))
+			return PTR_ERR(tcbpwm->gclk);
 	}
 
 	tcbpwm->chip.dev = &pdev->dev;
 	tcbpwm->chip.ops = &atmel_tcb_pwm_ops;
 	tcbpwm->chip.npwm = NPWM;
 	tcbpwm->channel = channel;
-	tcbpwm->regmap = regmap;
-	tcbpwm->clk = clk;
-	tcbpwm->gclk = gclk;
-	tcbpwm->slow_clk = slow_clk;
 	tcbpwm->width = config->counter_width;
 
-	err = clk_prepare_enable(slow_clk);
+	err = clk_prepare_enable(tcbpwm->slow_clk);
 	if (err)
 		goto err_slow_clk;
 
@@ -495,7 +486,7 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 	clk_disable_unprepare(tcbpwm->slow_clk);
 
 err_slow_clk:
-	clk_put(slow_clk);
+	clk_put(tcbpwm->slow_clk);
 
 	return err;
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 068/219] pwm: atmel-tcb: Fix resource freeing in error path and remove
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (66 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 067/219] pwm: atmel-tcb: Harmonize resource allocation order Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 069/219] backlight: gpio_backlight: Drop output GPIO direction check for initial power state Greg Kroah-Hartman
                   ` (161 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Uwe Kleine-König,
	Claudiu Beznea, Thierry Reding, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

[ Upstream commit c11622324c023415fb69196c5fc3782d2b8cced0 ]

Several resources were not freed in the error path and the remove
function. Add the forgotten items.

Fixes: 34cbcd72588f ("pwm: atmel-tcb: Add sama5d2 support")
Fixes: 061f8572a31c ("pwm: atmel-tcb: Switch to new binding")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pwm/pwm-atmel-tcb.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/pwm/pwm-atmel-tcb.c b/drivers/pwm/pwm-atmel-tcb.c
index 613dd1810fb53..2826fc216d291 100644
--- a/drivers/pwm/pwm-atmel-tcb.c
+++ b/drivers/pwm/pwm-atmel-tcb.c
@@ -450,16 +450,20 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 	tcbpwm->clk = of_clk_get_by_name(np->parent, clk_name);
 	if (IS_ERR(tcbpwm->clk))
 		tcbpwm->clk = of_clk_get_by_name(np->parent, "t0_clk");
-	if (IS_ERR(tcbpwm->clk))
-		return PTR_ERR(tcbpwm->clk);
+	if (IS_ERR(tcbpwm->clk)) {
+		err = PTR_ERR(tcbpwm->clk);
+		goto err_slow_clk;
+	}
 
 	match = of_match_node(atmel_tcb_of_match, np->parent);
 	config = match->data;
 
 	if (config->has_gclk) {
 		tcbpwm->gclk = of_clk_get_by_name(np->parent, "gclk");
-		if (IS_ERR(tcbpwm->gclk))
-			return PTR_ERR(tcbpwm->gclk);
+		if (IS_ERR(tcbpwm->gclk)) {
+			err = PTR_ERR(tcbpwm->gclk);
+			goto err_clk;
+		}
 	}
 
 	tcbpwm->chip.dev = &pdev->dev;
@@ -470,7 +474,7 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 
 	err = clk_prepare_enable(tcbpwm->slow_clk);
 	if (err)
-		goto err_slow_clk;
+		goto err_gclk;
 
 	spin_lock_init(&tcbpwm->lock);
 
@@ -485,6 +489,12 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
 err_disable_clk:
 	clk_disable_unprepare(tcbpwm->slow_clk);
 
+err_gclk:
+	clk_put(tcbpwm->gclk);
+
+err_clk:
+	clk_put(tcbpwm->clk);
+
 err_slow_clk:
 	clk_put(tcbpwm->slow_clk);
 
@@ -498,8 +508,9 @@ static void atmel_tcb_pwm_remove(struct platform_device *pdev)
 	pwmchip_remove(&tcbpwm->chip);
 
 	clk_disable_unprepare(tcbpwm->slow_clk);
-	clk_put(tcbpwm->slow_clk);
+	clk_put(tcbpwm->gclk);
 	clk_put(tcbpwm->clk);
+	clk_put(tcbpwm->slow_clk);
 }
 
 static const struct of_device_id atmel_tcb_pwm_dt_ids[] = {
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 069/219] backlight: gpio_backlight: Drop output GPIO direction check for initial power state
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (67 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 068/219] pwm: atmel-tcb: Fix resource freeing in error path and remove Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 070/219] Input: tca6416-keypad - always expect proper IRQ number in i2c client Greg Kroah-Hartman
                   ` (160 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Liu Ying, Andy Shevchenko,
	Linus Walleij, Bartosz Golaszewski, Lee Jones, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ying Liu <victor.liu@nxp.com>

[ Upstream commit fe1328b5b2a087221e31da77e617f4c2b70f3b7f ]

So, let's drop output GPIO direction check and only check GPIO value to set
the initial power state.

Fixes: 706dc68102bc ("backlight: gpio: Explicitly set the direction of the GPIO")
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230721093342.1532531-1-victor.liu@nxp.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/video/backlight/gpio_backlight.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/video/backlight/gpio_backlight.c b/drivers/video/backlight/gpio_backlight.c
index 5c5c99f7979e3..30ec5b6845335 100644
--- a/drivers/video/backlight/gpio_backlight.c
+++ b/drivers/video/backlight/gpio_backlight.c
@@ -87,8 +87,7 @@ static int gpio_backlight_probe(struct platform_device *pdev)
 		/* Not booted with device tree or no phandle link to the node */
 		bl->props.power = def_value ? FB_BLANK_UNBLANK
 					    : FB_BLANK_POWERDOWN;
-	else if (gpiod_get_direction(gbl->gpiod) == 0 &&
-		 gpiod_get_value_cansleep(gbl->gpiod) == 0)
+	else if (gpiod_get_value_cansleep(gbl->gpiod) == 0)
 		bl->props.power = FB_BLANK_POWERDOWN;
 	else
 		bl->props.power = FB_BLANK_UNBLANK;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 070/219] Input: tca6416-keypad - always expect proper IRQ number in i2c client
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (68 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 069/219] backlight: gpio_backlight: Drop output GPIO direction check for initial power state Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 071/219] Input: tca6416-keypad - fix interrupt enable disbalance Greg Kroah-Hartman
                   ` (159 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Dmitry Torokhov, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dmitry Torokhov <dmitry.torokhov@gmail.com>

[ Upstream commit 687fe7dfb736b03ab820d172ea5dbfc1ec447135 ]

Remove option having i2c client contain raw gpio number instead of proper
IRQ number. There are no users of this facility in mainline and it will
allow cleaning up the driver code with regard to wakeup handling, etc.

Link: https://lore.kernel.org/r/20230724053024.352054-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Stable-dep-of: cc141c35af87 ("Input: tca6416-keypad - fix interrupt enable disbalance")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/input/keyboard/tca6416-keypad.c | 27 +++++++++----------------
 include/linux/tca6416_keypad.h          |  1 -
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/drivers/input/keyboard/tca6416-keypad.c b/drivers/input/keyboard/tca6416-keypad.c
index afcdfbb002ff3..b48adec8fe2e7 100644
--- a/drivers/input/keyboard/tca6416-keypad.c
+++ b/drivers/input/keyboard/tca6416-keypad.c
@@ -148,7 +148,7 @@ static int tca6416_keys_open(struct input_dev *dev)
 	if (chip->use_polling)
 		schedule_delayed_work(&chip->dwork, msecs_to_jiffies(100));
 	else
-		enable_irq(chip->irqnum);
+		enable_irq(chip->client->irq);
 
 	return 0;
 }
@@ -160,7 +160,7 @@ static void tca6416_keys_close(struct input_dev *dev)
 	if (chip->use_polling)
 		cancel_delayed_work_sync(&chip->dwork);
 	else
-		disable_irq(chip->irqnum);
+		disable_irq(chip->client->irq);
 }
 
 static int tca6416_setup_registers(struct tca6416_keypad_chip *chip)
@@ -266,12 +266,7 @@ static int tca6416_keypad_probe(struct i2c_client *client,
 		goto fail1;
 
 	if (!chip->use_polling) {
-		if (pdata->irq_is_gpio)
-			chip->irqnum = gpio_to_irq(client->irq);
-		else
-			chip->irqnum = client->irq;
-
-		error = request_threaded_irq(chip->irqnum, NULL,
+		error = request_threaded_irq(client->irq, NULL,
 					     tca6416_keys_isr,
 					     IRQF_TRIGGER_FALLING |
 					     IRQF_ONESHOT | IRQF_NO_AUTOEN,
@@ -279,7 +274,7 @@ static int tca6416_keypad_probe(struct i2c_client *client,
 		if (error) {
 			dev_dbg(&client->dev,
 				"Unable to claim irq %d; error %d\n",
-				chip->irqnum, error);
+				client->irq, error);
 			goto fail1;
 		}
 	}
@@ -298,8 +293,8 @@ static int tca6416_keypad_probe(struct i2c_client *client,
 
 fail2:
 	if (!chip->use_polling) {
-		free_irq(chip->irqnum, chip);
-		enable_irq(chip->irqnum);
+		free_irq(client->irq, chip);
+		enable_irq(client->irq);
 	}
 fail1:
 	input_free_device(input);
@@ -312,8 +307,8 @@ static void tca6416_keypad_remove(struct i2c_client *client)
 	struct tca6416_keypad_chip *chip = i2c_get_clientdata(client);
 
 	if (!chip->use_polling) {
-		free_irq(chip->irqnum, chip);
-		enable_irq(chip->irqnum);
+		free_irq(client->irq, chip);
+		enable_irq(client->irq);
 	}
 
 	input_unregister_device(chip->input);
@@ -324,10 +319,9 @@ static void tca6416_keypad_remove(struct i2c_client *client)
 static int tca6416_keypad_suspend(struct device *dev)
 {
 	struct i2c_client *client = to_i2c_client(dev);
-	struct tca6416_keypad_chip *chip = i2c_get_clientdata(client);
 
 	if (device_may_wakeup(dev))
-		enable_irq_wake(chip->irqnum);
+		enable_irq_wake(client->irq);
 
 	return 0;
 }
@@ -335,10 +329,9 @@ static int tca6416_keypad_suspend(struct device *dev)
 static int tca6416_keypad_resume(struct device *dev)
 {
 	struct i2c_client *client = to_i2c_client(dev);
-	struct tca6416_keypad_chip *chip = i2c_get_clientdata(client);
 
 	if (device_may_wakeup(dev))
-		disable_irq_wake(chip->irqnum);
+		disable_irq_wake(client->irq);
 
 	return 0;
 }
diff --git a/include/linux/tca6416_keypad.h b/include/linux/tca6416_keypad.h
index b0d36a9934ccd..5cf6f6f82aa70 100644
--- a/include/linux/tca6416_keypad.h
+++ b/include/linux/tca6416_keypad.h
@@ -25,7 +25,6 @@ struct tca6416_keys_platform_data {
 	unsigned int rep:1;	/* enable input subsystem auto repeat */
 	uint16_t pinmask;
 	uint16_t invert;
-	int irq_is_gpio;
 	int use_polling;	/* use polling if Interrupt is not connected*/
 };
 #endif
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 071/219] Input: tca6416-keypad - fix interrupt enable disbalance
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (69 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 070/219] Input: tca6416-keypad - always expect proper IRQ number in i2c client Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 072/219] perf annotate bpf: Dont enclose non-debug code with an assert() Greg Kroah-Hartman
                   ` (158 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Dmitry Torokhov, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Dmitry Torokhov <dmitry.torokhov@gmail.com>

[ Upstream commit cc141c35af873c6796e043adcb820833bd8ef8c5 ]

The driver has been switched to use IRQF_NO_AUTOEN, but in the error
unwinding and remove paths calls to enable_irq() were left in place, which
will lead to an incorrect enable counter value.

Fixes: bcd9730a04a1 ("Input: move to use request_irq by IRQF_NO_AUTOEN flag")
Link: https://lore.kernel.org/r/20230724053024.352054-3-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/input/keyboard/tca6416-keypad.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/input/keyboard/tca6416-keypad.c b/drivers/input/keyboard/tca6416-keypad.c
index b48adec8fe2e7..9c1489c0dae13 100644
--- a/drivers/input/keyboard/tca6416-keypad.c
+++ b/drivers/input/keyboard/tca6416-keypad.c
@@ -292,10 +292,8 @@ static int tca6416_keypad_probe(struct i2c_client *client,
 	return 0;
 
 fail2:
-	if (!chip->use_polling) {
+	if (!chip->use_polling)
 		free_irq(client->irq, chip);
-		enable_irq(client->irq);
-	}
 fail1:
 	input_free_device(input);
 	kfree(chip);
@@ -306,10 +304,8 @@ static void tca6416_keypad_remove(struct i2c_client *client)
 {
 	struct tca6416_keypad_chip *chip = i2c_get_clientdata(client);
 
-	if (!chip->use_polling) {
+	if (!chip->use_polling)
 		free_irq(client->irq, chip);
-		enable_irq(client->irq);
-	}
 
 	input_unregister_device(chip->input);
 	kfree(chip);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 072/219] perf annotate bpf: Dont enclose non-debug code with an assert()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (70 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 071/219] Input: tca6416-keypad - fix interrupt enable disbalance Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 073/219] x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm() Greg Kroah-Hartman
                   ` (157 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Adrian Hunter, Ian Rogers, Jiri Olsa,
	Mohamed Mahmoud, Namhyung Kim, Dave Tucker, Derek Barbosa,
	Song Liu, Arnaldo Carvalho de Melo, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnaldo Carvalho de Melo <acme@redhat.com>

[ Upstream commit 979e9c9fc9c2a761303585e07fe2699bdd88182f ]

In 616b14b47a86d880 ("perf build: Conditionally define NDEBUG") we
started using NDEBUG=1 when DEBUG=1 isn't present, so code that is
enclosed with assert() is not called.

In dd317df072071903 ("perf build: Make binutil libraries opt in") we
stopped linking against binutils-devel, for licensing reasons.

Recently people asked me why annotation of BPF programs wasn't working,
i.e. this:

  $ perf annotate bpf_prog_5280546344e3f45c_kfree_skb

was returning:

  case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF:
     scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation");

This was on a fedora rpm, so its new enough that I had to try to test by
rebuilding using BUILD_NONDISTRO=1, only to get it segfaulting on me.

This combination made this libopcode function not to be called:

        assert(bfd_check_format(bfdf, bfd_object));

Changing it to:

	if (!bfd_check_format(bfdf, bfd_object))
		abort();

Made it work, looking at this "check" function made me realize it
changes the 'bfdf' internal state, i.e. we better call it.

So stop using assert() on it, just call it and abort if it fails.

Probably it is better to propagate the error, etc, but it seems it is
unlikely to fail from the usage done so far and we really need to stop
using libopcodes, so do the quick fix above and move on.

With it we have BPF annotation back working when built with
BUILD_NONDISTRO=1:

  ⬢[acme@toolbox perf-tools-next]$ perf annotate --stdio2 bpf_prog_5280546344e3f45c_kfree_skb   | head
  No kallsyms or vmlinux with build-id 939bc71a1a51cdc434e60af93c7e734f7d5c0e7e was found
  Samples: 12  of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 3000000, [percent: local period]
  bpf_prog_5280546344e3f45c_kfree_skb() bpf_prog_5280546344e3f45c_kfree_skb
  Percent      int kfree_skb(struct trace_event_raw_kfree_skb *args) {
                 nop
   33.33         xchg   %ax,%ax
                 push   %rbp
                 mov    %rsp,%rbp
                 sub    $0x180,%rsp
                 push   %rbx
                 push   %r13
  ⬢[acme@toolbox perf-tools-next]$

Fixes: 6987561c9e86eace ("perf annotate: Enable annotation of BPF programs")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mohamed Mahmoud <mmahmoud@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Dave Tucker <datucker@redhat.com>
Cc: Derek Barbosa <debarbos@redhat.com>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/lkml/ZMrMzoQBe0yqMek1@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/util/annotate.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index db475e44f42fa..a9122ea3b44c4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1756,8 +1756,11 @@ static int symbol__disassemble_bpf(struct symbol *sym,
 	perf_exe(tpath, sizeof(tpath));
 
 	bfdf = bfd_openr(tpath, NULL);
-	assert(bfdf);
-	assert(bfd_check_format(bfdf, bfd_object));
+	if (bfdf == NULL)
+		abort();
+
+	if (!bfd_check_format(bfdf, bfd_object))
+		abort();
 
 	s = open_memstream(&buf, &buf_size);
 	if (!s) {
@@ -1805,7 +1808,8 @@ static int symbol__disassemble_bpf(struct symbol *sym,
 #else
 	disassemble = disassembler(bfdf);
 #endif
-	assert(disassemble);
+	if (disassemble == NULL)
+		abort();
 
 	fflush(s);
 	do {
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 073/219] x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (71 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 072/219] perf annotate bpf: Dont enclose non-debug code with an assert() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 074/219] perf vendor events: Update the JSON/events descriptions for power10 platform Greg Kroah-Hartman
                   ` (156 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Sean Christopherson, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit 5df8ecfe3632d5879d1f154f7aa8de441b5d1c89 ]

Drop the explicit check on the extended CPUID level in cpu_has_svm(), the
kernel's cached CPUID info will leave the entire SVM leaf unset if said
leaf is not supported by hardware.  Prior to using cached information,
the check was needed to avoid false positives due to Intel's rather crazy
CPUID behavior of returning the values of the maximum supported leaf if
the specified leaf is unsupported.

Fixes: 682a8108872f ("x86/kvm/svm: Simplify cpu_has_svm()")
Link: https://lore.kernel.org/r/20230721201859.2307736-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/include/asm/virtext.h | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
index 3b12e6b994123..6c2e3ff3cb28f 100644
--- a/arch/x86/include/asm/virtext.h
+++ b/arch/x86/include/asm/virtext.h
@@ -101,12 +101,6 @@ static inline int cpu_has_svm(const char **msg)
 		return 0;
 	}
 
-	if (boot_cpu_data.extended_cpuid_level < SVM_CPUID_FUNC) {
-		if (msg)
-			*msg = "can't execute cpuid_8000000a";
-		return 0;
-	}
-
 	if (!boot_cpu_has(X86_FEATURE_SVM)) {
 		if (msg)
 			*msg = "svm not available";
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 074/219] perf vendor events: Update the JSON/events descriptions for power10 platform
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (72 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 073/219] x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 075/219] perf vendor events: Drop some of the JSON/events " Greg Kroah-Hartman
                   ` (155 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kajol Jain, Athira Rajeev,
	Disha Goel, Ian Rogers, Madhavan Srinivasan, Namhyung Kim,
	linuxppc-dev, Arnaldo Carvalho de Melo, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kajol Jain <kjain@linux.ibm.com>

[ Upstream commit 3286f88f31da060ac2789cee247153961ba57e49 ]

Update the description for some of the JSON/events for power10 platform.

Fixes: 32daa5d7899e0343 ("perf vendor events: Initial JSON/events list for power10 platform")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230814112803.1508296-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../arch/powerpc/power10/cache.json           |  4 +-
 .../arch/powerpc/power10/frontend.json        | 30 ++++++------
 .../arch/powerpc/power10/marked.json          | 20 ++++----
 .../arch/powerpc/power10/memory.json          |  6 +--
 .../arch/powerpc/power10/others.json          | 48 +++++++++----------
 .../arch/powerpc/power10/pipeline.json        | 20 ++++----
 .../pmu-events/arch/powerpc/power10/pmc.json  |  4 +-
 .../arch/powerpc/power10/translation.json     |  6 +--
 8 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/cache.json b/tools/perf/pmu-events/arch/powerpc/power10/cache.json
index 605be14f441c8..9cb929bb64afd 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/cache.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/cache.json
@@ -17,7 +17,7 @@
   {
     "EventCode": "0x34056",
     "EventName": "PM_EXEC_STALL_LOAD_FINISH",
-    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the NTF instruction merged with another load in the LMQ; cycles in which the NTF instruction is waiting for a data reload for a load miss, but the data comes back with a non-NTF instruction."
+    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the next-to-finish (NTF) instruction merged with another load in the LMQ; cycles in which the NTF instruction is waiting for a data reload for a load miss, but the data comes back with a non-NTF instruction."
   },
   {
     "EventCode": "0x3006C",
@@ -27,7 +27,7 @@
   {
     "EventCode": "0x300F4",
     "EventName": "PM_RUN_INST_CMPL_CONC",
-    "BriefDescription": "PowerPC instructions completed by this thread when all threads in the core had the run-latch set."
+    "BriefDescription": "PowerPC instruction completed by this thread when all threads in the core had the run-latch set."
   },
   {
     "EventCode": "0x4C016",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/frontend.json b/tools/perf/pmu-events/arch/powerpc/power10/frontend.json
index 558f9530f54ec..61e9e0222c873 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/frontend.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/frontend.json
@@ -7,7 +7,7 @@
   {
     "EventCode": "0x10006",
     "EventName": "PM_DISP_STALL_HELD_OTHER_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch for any other reason."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch for any other reason."
   },
   {
     "EventCode": "0x10010",
@@ -32,12 +32,12 @@
   {
     "EventCode": "0x1D05E",
     "EventName": "PM_DISP_STALL_HELD_HALT_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of power management."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch because of power management."
   },
   {
     "EventCode": "0x1E050",
     "EventName": "PM_DISP_STALL_HELD_STF_MAPPER_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the STF mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the STF mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR."
   },
   {
     "EventCode": "0x1F054",
@@ -67,7 +67,7 @@
   {
     "EventCode": "0x100F6",
     "EventName": "PM_IERAT_MISS",
-    "BriefDescription": "IERAT Reloaded to satisfy an IERAT miss. All page sizes are counted by this event."
+    "BriefDescription": "IERAT Reloaded to satisfy an IERAT miss. All page sizes are counted by this event. This event only counts instruction demand access."
   },
   {
     "EventCode": "0x100F8",
@@ -77,7 +77,7 @@
   {
     "EventCode": "0x20006",
     "EventName": "PM_DISP_STALL_HELD_ISSQ_FULL_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch due to Issue queue full. Includes issue queue and branch queue."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch due to Issue queue full. Includes issue queue and branch queue."
   },
   {
     "EventCode": "0x20114",
@@ -102,7 +102,7 @@
   {
     "EventCode": "0x2D01A",
     "EventName": "PM_DISP_STALL_IC_MISS",
-    "BriefDescription": "Cycles when dispatch was stalled for this thread due to an Icache Miss."
+    "BriefDescription": "Cycles when dispatch was stalled for this thread due to an instruction cache miss."
   },
   {
     "EventCode": "0x2E018",
@@ -112,7 +112,7 @@
   {
     "EventCode": "0x2E01A",
     "EventName": "PM_DISP_STALL_HELD_XVFC_MAPPER_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the XVFC mapper/SRB was full."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the XVFC mapper/SRB was full."
   },
   {
     "EventCode": "0x2C142",
@@ -137,7 +137,7 @@
   {
     "EventCode": "0x30004",
     "EventName": "PM_DISP_STALL_FLUSH",
-    "BriefDescription": "Cycles when dispatch was stalled because of a flush that happened to an instruction(s) that was not yet NTC. PM_EXEC_STALL_NTC_FLUSH only includes instructions that were flushed after becoming NTC."
+    "BriefDescription": "Cycles when dispatch was stalled because of a flush that happened to an instruction(s) that was not yet next-to-complete (NTC). PM_EXEC_STALL_NTC_FLUSH only includes instructions that were flushed after becoming NTC."
   },
   {
     "EventCode": "0x3000A",
@@ -157,7 +157,7 @@
   {
     "EventCode": "0x30018",
     "EventName": "PM_DISP_STALL_HELD_SCOREBOARD_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch while waiting on the Scoreboard. This event combines VSCR and FPSCR together."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch while waiting on the Scoreboard. This event combines VSCR and FPSCR together."
   },
   {
     "EventCode": "0x30026",
@@ -182,7 +182,7 @@
   {
     "EventCode": "0x3D05C",
     "EventName": "PM_DISP_STALL_HELD_RENAME_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR and XVFC."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch because the mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR and XVFC."
   },
   {
     "EventCode": "0x3E052",
@@ -192,7 +192,7 @@
   {
     "EventCode": "0x3E054",
     "EventName": "PM_LD_MISS_L1",
-    "BriefDescription": "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load."
+    "BriefDescription": "Load missed L1, counted at finish time. LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load."
   },
   {
     "EventCode": "0x301EA",
@@ -202,7 +202,7 @@
   {
     "EventCode": "0x300FA",
     "EventName": "PM_INST_FROM_L3MISS",
-    "BriefDescription": "The processor's instruction cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss."
+    "BriefDescription": "The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss."
   },
   {
     "EventCode": "0x40006",
@@ -232,16 +232,16 @@
   {
     "EventCode": "0x4E01A",
     "EventName": "PM_DISP_STALL_HELD_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch for any reason."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch for any reason."
   },
   {
     "EventCode": "0x4003C",
     "EventName": "PM_DISP_STALL_HELD_SYNC_CYC",
-    "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch."
+    "BriefDescription": "Cycles in which the next-to-complete (NTC) instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch."
   },
   {
     "EventCode": "0x44056",
     "EventName": "PM_VECTOR_ST_CMPL",
-    "BriefDescription": "Vector store instructions completed."
+    "BriefDescription": "Vector store instruction completed."
   }
 ]
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/marked.json b/tools/perf/pmu-events/arch/powerpc/power10/marked.json
index 58b5dfe3a2731..131f8d0e88317 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/marked.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/marked.json
@@ -62,7 +62,7 @@
   {
     "EventCode": "0x200FD",
     "EventName": "PM_L1_ICACHE_MISS",
-    "BriefDescription": "Demand iCache Miss."
+    "BriefDescription": "Demand instruction cache miss."
   },
   {
     "EventCode": "0x30130",
@@ -72,7 +72,7 @@
   {
     "EventCode": "0x34146",
     "EventName": "PM_MRK_LD_CMPL",
-    "BriefDescription": "Marked loads completed."
+    "BriefDescription": "Marked load instruction completed."
   },
   {
     "EventCode": "0x3E158",
@@ -82,12 +82,12 @@
   {
     "EventCode": "0x3E15A",
     "EventName": "PM_MRK_ST_FIN",
-    "BriefDescription": "The marked instruction was a store of any kind."
+    "BriefDescription": "Marked store instruction finished."
   },
   {
     "EventCode": "0x30068",
     "EventName": "PM_L1_ICACHE_RELOADED_PREF",
-    "BriefDescription": "Counts all Icache prefetch reloads ( includes demand turned into prefetch)."
+    "BriefDescription": "Counts all instruction cache prefetch reloads (includes demand turned into prefetch)."
   },
   {
     "EventCode": "0x301E4",
@@ -102,12 +102,12 @@
   {
     "EventCode": "0x300FE",
     "EventName": "PM_DATA_FROM_L3MISS",
-    "BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss."
+    "BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss."
   },
   {
     "EventCode": "0x40012",
     "EventName": "PM_L1_ICACHE_RELOADED_ALL",
-    "BriefDescription": "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch."
+    "BriefDescription": "Counts all instruction cache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch."
   },
   {
     "EventCode": "0x40134",
@@ -117,22 +117,22 @@
   {
     "EventCode": "0x4505A",
     "EventName": "PM_SP_FLOP_CMPL",
-    "BriefDescription": "Single Precision floating point instructions completed."
+    "BriefDescription": "Single Precision floating point instruction completed."
   },
   {
     "EventCode": "0x4D058",
     "EventName": "PM_VECTOR_FLOP_CMPL",
-    "BriefDescription": "Vector floating point instructions completed."
+    "BriefDescription": "Vector floating point instruction completed."
   },
   {
     "EventCode": "0x4D05A",
     "EventName": "PM_NON_MATH_FLOP_CMPL",
-    "BriefDescription": "Non Math instructions completed."
+    "BriefDescription": "Non Math instruction completed."
   },
   {
     "EventCode": "0x401E0",
     "EventName": "PM_MRK_INST_CMPL",
-    "BriefDescription": "marked instruction completed."
+    "BriefDescription": "Marked instruction completed."
   },
   {
     "EventCode": "0x400FE",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/memory.json b/tools/perf/pmu-events/arch/powerpc/power10/memory.json
index 843b51f531e95..c4c10ca98cad7 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/memory.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/memory.json
@@ -47,7 +47,7 @@
   {
     "EventCode": "0x10062",
     "EventName": "PM_LD_L3MISS_PEND_CYC",
-    "BriefDescription": "Cycles L3 miss was pending for this thread."
+    "BriefDescription": "Cycles in which an L3 miss was pending for this thread."
   },
   {
     "EventCode": "0x20010",
@@ -132,7 +132,7 @@
   {
     "EventCode": "0x300FC",
     "EventName": "PM_DTLB_MISS",
-    "BriefDescription": "The DPTEG required for the load/store instruction in execution was missing from the TLB. It includes pages of all sizes for demand and prefetch activity."
+    "BriefDescription": "The DPTEG required for the load/store instruction in execution was missing from the TLB. This event only counts for demand misses."
   },
   {
     "EventCode": "0x4D02C",
@@ -142,7 +142,7 @@
   {
     "EventCode": "0x4003E",
     "EventName": "PM_LD_CMPL",
-    "BriefDescription": "Loads completed."
+    "BriefDescription": "Load instruction completed."
   },
   {
     "EventCode": "0x4C040",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/others.json b/tools/perf/pmu-events/arch/powerpc/power10/others.json
index 7d0de1a2860b4..e691041ee8678 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/others.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/others.json
@@ -2,12 +2,12 @@
   {
     "EventCode": "0x10016",
     "EventName": "PM_VSU0_ISSUE",
-    "BriefDescription": "VSU instructions issued to VSU pipe 0."
+    "BriefDescription": "VSU instruction issued to VSU pipe 0."
   },
   {
     "EventCode": "0x1001C",
     "EventName": "PM_ULTRAVISOR_INST_CMPL",
-    "BriefDescription": "PowerPC instructions that completed while the thread was in ultravisor state."
+    "BriefDescription": "PowerPC instruction completed while the thread was in ultravisor state."
   },
   {
     "EventCode": "0x100F0",
@@ -17,12 +17,12 @@
   {
     "EventCode": "0x10134",
     "EventName": "PM_MRK_ST_DONE_L2",
-    "BriefDescription": "Marked stores completed in L2 (RC machine done)."
+    "BriefDescription": "Marked store completed in L2."
   },
   {
     "EventCode": "0x1505E",
     "EventName": "PM_LD_HIT_L1",
-    "BriefDescription": "Loads that finished without experiencing an L1 miss."
+    "BriefDescription": "Load finished without experiencing an L1 miss."
   },
   {
     "EventCode": "0x1F056",
@@ -42,7 +42,7 @@
   {
     "EventCode": "0x101E4",
     "EventName": "PM_MRK_L1_ICACHE_MISS",
-    "BriefDescription": "Marked Instruction suffered an icache Miss."
+    "BriefDescription": "Marked instruction suffered an instruction cache miss."
   },
   {
     "EventCode": "0x101EA",
@@ -72,7 +72,7 @@
   {
     "EventCode": "0x2E010",
     "EventName": "PM_ADJUNCT_INST_CMPL",
-    "BriefDescription": "PowerPC instructions that completed while the thread is in Adjunct state."
+    "BriefDescription": "PowerPC instruction completed while the thread was in Adjunct state."
   },
   {
     "EventCode": "0x2E014",
@@ -122,7 +122,7 @@
   {
     "EventCode": "0x201E4",
     "EventName": "PM_MRK_DATA_FROM_L3MISS",
-    "BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss for a marked load."
+    "BriefDescription": "The processor's L1 data cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction."
   },
   {
     "EventCode": "0x201E8",
@@ -132,17 +132,17 @@
   {
     "EventCode": "0x200F2",
     "EventName": "PM_INST_DISP",
-    "BriefDescription": "PowerPC instructions dispatched."
+    "BriefDescription": "PowerPC instruction dispatched."
   },
   {
     "EventCode": "0x30132",
     "EventName": "PM_MRK_VSU_FIN",
-    "BriefDescription": "VSU marked instructions finished. Excludes simple FX instructions issued to the Store Unit."
+    "BriefDescription": "VSU marked instruction finished. Excludes simple FX instructions issued to the Store Unit."
   },
   {
     "EventCode": "0x30038",
     "EventName": "PM_EXEC_STALL_DMISS_LMEM",
-    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local memory, local OpenCapp cache, or local OpenCapp memory."
+    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local memory, local OpenCAPI cache, or local OpenCAPI memory."
   },
   {
     "EventCode": "0x3F04A",
@@ -152,12 +152,12 @@
   {
     "EventCode": "0x3405A",
     "EventName": "PM_PRIVILEGED_INST_CMPL",
-    "BriefDescription": "PowerPC Instructions that completed while the thread is in Privileged state."
+    "BriefDescription": "PowerPC instruction completed while the thread was in Privileged state."
   },
   {
     "EventCode": "0x3F150",
     "EventName": "PM_MRK_ST_DRAIN_CYC",
-    "BriefDescription": "cycles to drain st from core to L2."
+    "BriefDescription": "Cycles in which the marked store drained from the core to the L2."
   },
   {
     "EventCode": "0x3F054",
@@ -182,7 +182,7 @@
   {
     "EventCode": "0x4001C",
     "EventName": "PM_VSU_FIN",
-    "BriefDescription": "VSU instructions finished."
+    "BriefDescription": "VSU instruction finished."
   },
   {
     "EventCode": "0x4C01A",
@@ -197,7 +197,7 @@
   {
     "EventCode": "0x4D022",
     "EventName": "PM_HYPERVISOR_INST_CMPL",
-    "BriefDescription": "PowerPC instructions that completed while the thread is in hypervisor state."
+    "BriefDescription": "PowerPC instruction completed while the thread was in hypervisor state."
   },
   {
     "EventCode": "0x4D026",
@@ -212,32 +212,32 @@
   {
     "EventCode": "0x40030",
     "EventName": "PM_INST_FIN",
-    "BriefDescription": "Instructions finished."
+    "BriefDescription": "Instruction finished."
   },
   {
     "EventCode": "0x44146",
     "EventName": "PM_MRK_STCX_CORE_CYC",
-    "BriefDescription": "Cycles spent in the core portion of a marked Stcx instruction. It starts counting when the instruction is decoded and stops counting when it drains into the L2."
+    "BriefDescription": "Cycles spent in the core portion of a marked STCX instruction. It starts counting when the instruction is decoded and stops counting when it drains into the L2."
   },
   {
     "EventCode": "0x44054",
     "EventName": "PM_VECTOR_LD_CMPL",
-    "BriefDescription": "Vector load instructions completed."
+    "BriefDescription": "Vector load instruction completed."
   },
   {
     "EventCode": "0x45054",
     "EventName": "PM_FMA_CMPL",
-    "BriefDescription": "Two floating point instructions completed (FMA class of instructions: fmadd, fnmadd, fmsub, fnmsub). Scalar instructions only."
+    "BriefDescription": "Two floating point instruction completed (FMA class of instructions: fmadd, fnmadd, fmsub, fnmsub). Scalar instructions only."
   },
   {
     "EventCode": "0x45056",
     "EventName": "PM_SCALAR_FLOP_CMPL",
-    "BriefDescription": "Scalar floating point instructions completed."
+    "BriefDescription": "Scalar floating point instruction completed."
   },
   {
     "EventCode": "0x4505C",
     "EventName": "PM_MATH_FLOP_CMPL",
-    "BriefDescription": "Math floating point instructions completed."
+    "BriefDescription": "Math floating point instruction completed."
   },
   {
     "EventCode": "0x4D05E",
@@ -252,21 +252,21 @@
   {
     "EventCode": "0x401E6",
     "EventName": "PM_MRK_INST_FROM_L3MISS",
-    "BriefDescription": "The processor's instruction cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss for a marked instruction."
+    "BriefDescription": "The processor's instruction cache was reloaded from beyond the local core's L3 due to a demand miss for a marked instruction."
   },
   {
     "EventCode": "0x401E8",
     "EventName": "PM_MRK_DATA_FROM_L2MISS",
-    "BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1 or L2 due to a demand miss for a marked load."
+    "BriefDescription": "The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss for a marked instruction."
   },
   {
     "EventCode": "0x400F0",
     "EventName": "PM_LD_DEMAND_MISS_L1_FIN",
-    "BriefDescription": "Load Missed L1, counted at finish time."
+    "BriefDescription": "Load missed L1, counted at finish time."
   },
   {
     "EventCode": "0x400FA",
     "EventName": "PM_RUN_INST_CMPL",
-    "BriefDescription": "Completed PowerPC instructions gated by the run latch."
+    "BriefDescription": "PowerPC instruction completed while the run latch is set."
   }
 ]
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json b/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json
index b8aded6045faa..449f57e8ba6af 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json
@@ -2,7 +2,7 @@
   {
     "EventCode": "0x100FE",
     "EventName": "PM_INST_CMPL",
-    "BriefDescription": "PowerPC instructions completed."
+    "BriefDescription": "PowerPC instruction completed."
   },
   {
     "EventCode": "0x1000C",
@@ -12,7 +12,7 @@
   {
     "EventCode": "0x1000E",
     "EventName": "PM_MMA_ISSUED",
-    "BriefDescription": "MMA instructions issued."
+    "BriefDescription": "MMA instruction issued."
   },
   {
     "EventCode": "0x10012",
@@ -107,7 +107,7 @@
   {
     "EventCode": "0x2D012",
     "EventName": "PM_VSU1_ISSUE",
-    "BriefDescription": "VSU instructions issued to VSU pipe 1."
+    "BriefDescription": "VSU instruction issued to VSU pipe 1."
   },
   {
     "EventCode": "0x2D018",
@@ -122,7 +122,7 @@
   {
     "EventCode": "0x2E01E",
     "EventName": "PM_EXEC_STALL_NTC_FLUSH",
-    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in any unit before it was flushed. Note that if the flush of the oldest instruction happens after finish, the cycles from dispatch to issue will be included in PM_DISP_STALL and the cycles from issue to finish will be included in PM_EXEC_STALL and its corresponding children. This event will also count cycles when the previous NTF instruction is still completing and the new NTF instruction is stalled at dispatch."
+    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in any unit before it was flushed. Note that if the flush of the oldest instruction happens after finish, the cycles from dispatch to issue will be included in PM_DISP_STALL and the cycles from issue to finish will be included in PM_EXEC_STALL and its corresponding children. This event will also count cycles when the previous next-to-finish (NTF) instruction is still completing and the new NTF instruction is stalled at dispatch."
   },
   {
     "EventCode": "0x2013C",
@@ -137,7 +137,7 @@
   {
     "EventCode": "0x201E2",
     "EventName": "PM_MRK_LD_MISS_L1",
-    "BriefDescription": "Marked DL1 Demand Miss counted at finish time."
+    "BriefDescription": "Marked demand data load miss counted at finish time."
   },
   {
     "EventCode": "0x200F4",
@@ -172,7 +172,7 @@
   {
     "EventCode": "0x30028",
     "EventName": "PM_CMPL_STALL_MEM_ECC",
-    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC."
+    "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for the non-speculative finish of either a STCX waiting for its result or a load waiting for non-critical sectors of data and ECC."
   },
   {
     "EventCode": "0x30036",
@@ -187,12 +187,12 @@
   {
     "EventCode": "0x3F044",
     "EventName": "PM_VSU2_ISSUE",
-    "BriefDescription": "VSU instructions issued to VSU pipe 2."
+    "BriefDescription": "VSU instruction issued to VSU pipe 2."
   },
   {
     "EventCode": "0x30058",
     "EventName": "PM_TLBIE_FIN",
-    "BriefDescription": "TLBIE instructions finished in the LSU. Two TLBIEs can finish each cycle. All will be counted."
+    "BriefDescription": "TLBIE instruction finished in the LSU. Two TLBIEs can finish each cycle. All will be counted."
   },
   {
     "EventCode": "0x3D058",
@@ -252,7 +252,7 @@
   {
     "EventCode": "0x4E012",
     "EventName": "PM_EXEC_STALL_UNKNOWN",
-    "BriefDescription": "Cycles in which the oldest instruction in the pipeline completed without an ntf_type pulse. The ntf_pulse was missed by the ISU because the NTF finishes and completions came too close together."
+    "BriefDescription": "Cycles in which the oldest instruction in the pipeline completed without an ntf_type pulse. The ntf_pulse was missed by the ISU because the next-to-finish (NTF) instruction finishes and completions came too close together."
   },
   {
     "EventCode": "0x4D020",
@@ -267,7 +267,7 @@
   {
     "EventCode": "0x45058",
     "EventName": "PM_IC_MISS_CMPL",
-    "BriefDescription": "Non-speculative icache miss, counted at completion."
+    "BriefDescription": "Non-speculative instruction cache miss, counted at completion."
   },
   {
     "EventCode": "0x4D050",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/pmc.json b/tools/perf/pmu-events/arch/powerpc/power10/pmc.json
index b5d1bd39cfb22..364fedbfb490b 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/pmc.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/pmc.json
@@ -12,11 +12,11 @@
   {
     "EventCode": "0x45052",
     "EventName": "PM_4FLOP_CMPL",
-    "BriefDescription": "Four floating point instructions completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)."
+    "BriefDescription": "Four floating point instruction completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)."
   },
   {
     "EventCode": "0x4D054",
     "EventName": "PM_8FLOP_CMPL",
-    "BriefDescription": "Four Double Precision vector instructions completed."
+    "BriefDescription": "Four Double Precision vector instruction completed."
   }
 ]
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/translation.json b/tools/perf/pmu-events/arch/powerpc/power10/translation.json
index db3766dca07c5..3e47b804a0a8f 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/translation.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/translation.json
@@ -17,7 +17,7 @@
   {
     "EventCode": "0x2011C",
     "EventName": "PM_MRK_NTF_CYC",
-    "BriefDescription": "Cycles during which the marked instruction is the oldest in the pipeline (NTF or NTC)."
+    "BriefDescription": "Cycles in which the marked instruction is the oldest in the pipeline (next-to-finish or next-to-complete)."
   },
   {
     "EventCode": "0x2E01C",
@@ -37,7 +37,7 @@
   {
     "EventCode": "0x200FE",
     "EventName": "PM_DATA_FROM_L2MISS",
-    "BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1 or L2 due to a demand miss."
+    "BriefDescription": "The processor's L1 data cache was reloaded from a source beyond the local core's L2 due to a demand miss."
   },
   {
     "EventCode": "0x30010",
@@ -52,6 +52,6 @@
   {
     "EventCode": "0x4D05C",
     "EventName": "PM_DPP_FLOP_CMPL",
-    "BriefDescription": "Double-Precision or Quad-Precision instructions completed."
+    "BriefDescription": "Double-Precision or Quad-Precision instruction completed."
   }
 ]
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 075/219] perf vendor events: Drop some of the JSON/events for power10 platform
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (73 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 074/219] perf vendor events: Update the JSON/events descriptions for power10 platform Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 076/219] perf vendor events: Drop STORES_PER_INST metric event " Greg Kroah-Hartman
                   ` (154 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kajol Jain, Athira Rajeev,
	Disha Goel, Ian Rogers, Madhavan Srinivasan, Namhyung Kim,
	linuxppc-dev, Arnaldo Carvalho de Melo, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kajol Jain <kjain@linux.ibm.com>

[ Upstream commit e104df97b8dcfbab2e42de634b99bf03f0805d85 ]

Drop some of the JSON/events for power10 platform due to counter
data mismatch.

Fixes: 32daa5d7899e0343 ("perf vendor events: Initial JSON/events list for power10 platform")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230814112803.1508296-2-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../arch/powerpc/power10/floating_point.json           |  7 -------
 tools/perf/pmu-events/arch/powerpc/power10/marked.json | 10 ----------
 tools/perf/pmu-events/arch/powerpc/power10/others.json |  5 -----
 .../perf/pmu-events/arch/powerpc/power10/pipeline.json | 10 ----------
 .../pmu-events/arch/powerpc/power10/translation.json   |  5 -----
 5 files changed, 37 deletions(-)
 delete mode 100644 tools/perf/pmu-events/arch/powerpc/power10/floating_point.json

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/floating_point.json b/tools/perf/pmu-events/arch/powerpc/power10/floating_point.json
deleted file mode 100644
index 54acb55e2c8c6..0000000000000
--- a/tools/perf/pmu-events/arch/powerpc/power10/floating_point.json
+++ /dev/null
@@ -1,7 +0,0 @@
-[
-  {
-    "EventCode": "0x4016E",
-    "EventName": "PM_THRESH_NOT_MET",
-    "BriefDescription": "Threshold counter did not meet threshold."
-  }
-]
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/marked.json b/tools/perf/pmu-events/arch/powerpc/power10/marked.json
index 131f8d0e88317..f2436fc5537ce 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/marked.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/marked.json
@@ -19,11 +19,6 @@
     "EventName": "PM_MRK_BR_TAKEN_CMPL",
     "BriefDescription": "Marked Branch Taken instruction completed."
   },
-  {
-    "EventCode": "0x20112",
-    "EventName": "PM_MRK_NTF_FIN",
-    "BriefDescription": "The marked instruction became the oldest in the pipeline before it finished. It excludes instructions that finish at dispatch."
-  },
   {
     "EventCode": "0x2C01C",
     "EventName": "PM_EXEC_STALL_DMISS_OFF_CHIP",
@@ -64,11 +59,6 @@
     "EventName": "PM_L1_ICACHE_MISS",
     "BriefDescription": "Demand instruction cache miss."
   },
-  {
-    "EventCode": "0x30130",
-    "EventName": "PM_MRK_INST_FIN",
-    "BriefDescription": "marked instruction finished. Excludes instructions that finish at dispatch. Note that stores always finish twice since the address gets issued to the LSU and the data gets issued to the VSU."
-  },
   {
     "EventCode": "0x34146",
     "EventName": "PM_MRK_LD_CMPL",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/others.json b/tools/perf/pmu-events/arch/powerpc/power10/others.json
index e691041ee8678..36c5bbc64c3be 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/others.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/others.json
@@ -29,11 +29,6 @@
     "EventName": "PM_DISP_SS0_2_INSTR_CYC",
     "BriefDescription": "Cycles in which Superslice 0 dispatches either 1 or 2 instructions."
   },
-  {
-    "EventCode": "0x1F15C",
-    "EventName": "PM_MRK_STCX_L2_CYC",
-    "BriefDescription": "Cycles spent in the nest portion of a marked Stcx instruction. It starts counting when the operation starts to drain to the L2 and it stops counting when the instruction retires from the Instruction Completion Table (ICT) in the Instruction Sequencing Unit (ISU)."
-  },
   {
     "EventCode": "0x10066",
     "EventName": "PM_ADJUNCT_CYC",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json b/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json
index 449f57e8ba6af..799893c56f32b 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/pipeline.json
@@ -194,11 +194,6 @@
     "EventName": "PM_TLBIE_FIN",
     "BriefDescription": "TLBIE instruction finished in the LSU. Two TLBIEs can finish each cycle. All will be counted."
   },
-  {
-    "EventCode": "0x3D058",
-    "EventName": "PM_SCALAR_FSQRT_FDIV_ISSUE",
-    "BriefDescription": "Scalar versions of four floating point operations: fdiv,fsqrt (xvdivdp, xvdivsp, xvsqrtdp, xvsqrtsp)."
-  },
   {
     "EventCode": "0x30066",
     "EventName": "PM_LSU_FIN",
@@ -269,11 +264,6 @@
     "EventName": "PM_IC_MISS_CMPL",
     "BriefDescription": "Non-speculative instruction cache miss, counted at completion."
   },
-  {
-    "EventCode": "0x4D050",
-    "EventName": "PM_VSU_NON_FLOP_CMPL",
-    "BriefDescription": "Non-floating point VSU instructions completed."
-  },
   {
     "EventCode": "0x4D052",
     "EventName": "PM_2FLOP_CMPL",
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/translation.json b/tools/perf/pmu-events/arch/powerpc/power10/translation.json
index 3e47b804a0a8f..961e2491e73f6 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/translation.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/translation.json
@@ -4,11 +4,6 @@
     "EventName": "PM_MRK_START_PROBE_NOP_CMPL",
     "BriefDescription": "Marked Start probe nop (AND R0,R0,R0) completed."
   },
-  {
-    "EventCode": "0x20016",
-    "EventName": "PM_ST_FIN",
-    "BriefDescription": "Store finish count. Includes speculative activity."
-  },
   {
     "EventCode": "0x20018",
     "EventName": "PM_ST_FWD",
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 076/219] perf vendor events: Drop STORES_PER_INST metric event for power10 platform
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (74 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 075/219] perf vendor events: Drop some of the JSON/events " Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 077/219] perf top: Dont pass an ERR_PTR() directly to perf_session__delete() Greg Kroah-Hartman
                   ` (153 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kajol Jain, Athira Rajeev,
	Disha Goel, Ian Rogers, Madhavan Srinivasan, Namhyung Kim,
	linuxppc-dev, Arnaldo Carvalho de Melo, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kajol Jain <kjain@linux.ibm.com>

[ Upstream commit 4836b9a85ef148c7c9779b66fab3f7279e488d90 ]

Drop STORES_PER_INST metric event for the power10 platform, as the
metric expression of STORES_PER_INST metric event using dropped event
PM_ST_FIN.

Fixes: 3ca3af7d1f230d1f ("perf vendor events power10: Add metric events JSON file for power10 platform")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230814112803.1508296-3-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/pmu-events/arch/powerpc/power10/metrics.json | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/tools/perf/pmu-events/arch/powerpc/power10/metrics.json b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
index b57526fa44f2d..6e76f65c314ce 100644
--- a/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
+++ b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
@@ -453,12 +453,6 @@
         "MetricGroup": "General",
         "MetricName": "LOADS_PER_INST"
     },
-    {
-        "BriefDescription": "Average number of finished stores per completed instruction",
-        "MetricExpr": "PM_ST_FIN / PM_RUN_INST_CMPL",
-        "MetricGroup": "General",
-        "MetricName": "STORES_PER_INST"
-    },
     {
         "BriefDescription": "Percentage of demand loads that reloaded from beyond the L2 per completed instruction",
         "MetricExpr": "PM_DATA_FROM_L2MISS / PM_RUN_INST_CMPL * 100",
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 077/219] perf top: Dont pass an ERR_PTR() directly to perf_session__delete()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (75 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 076/219] perf vendor events: Drop STORES_PER_INST metric event " Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 078/219] watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load Greg Kroah-Hartman
                   ` (152 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Adrian Hunter, Alexander Shishkin,
	Alexey Budankov, Jeremie Galarneau, Jiri Olsa, Kate Stewart,
	Mamatha Inamdar, Mukesh Ojha, Nageswara R Sastry, Namhyung Kim,
	Peter Zijlstra, Ravi Bangoria, Shawn Landden, Song Liu,
	Thomas Gleixner, Tzvetomir Stoyanov, Arnaldo Carvalho de Melo,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnaldo Carvalho de Melo <acme@redhat.com>

[ Upstream commit ef23cb593304bde0cc046fd4cc83ae7ea2e24f16 ]

While debugging a segfault on 'perf lock contention' without an
available perf.data file I noticed that it was basically calling:

	perf_session__delete(ERR_PTR(-1))

Resulting in:

  (gdb) run lock contention
  Starting program: /root/bin/perf lock contention
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib64/libthread_db.so.1".
  failed to open perf.data: No such file or directory  (try 'perf record' first)
  Initializing perf session failed

  Program received signal SIGSEGV, Segmentation fault.
  0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
  2858		if (!session->auxtrace)
  (gdb) p session
  $1 = (struct perf_session *) 0xffffffffffffffff
  (gdb) bt
  #0  0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
  #1  0x000000000057bb4d in perf_session__delete (session=0xffffffffffffffff) at util/session.c:300
  #2  0x000000000047c421 in __cmd_contention (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2161
  #3  0x000000000047dc95 in cmd_lock (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2604
  #4  0x0000000000501466 in run_builtin (p=0xe597a8 <commands+552>, argc=2, argv=0x7fffffffe200) at perf.c:322
  #5  0x00000000005016d5 in handle_internal_command (argc=2, argv=0x7fffffffe200) at perf.c:375
  #6  0x0000000000501824 in run_argv (argcp=0x7fffffffe02c, argv=0x7fffffffe020) at perf.c:419
  #7  0x0000000000501b11 in main (argc=2, argv=0x7fffffffe200) at perf.c:535
  (gdb)

So just set it to NULL after using PTR_ERR(session) to decode the error
as perf_session__delete(NULL) is supported.

The same problem was found in 'perf top' after an audit of all
perf_session__new() failure handling.

Fixes: 6ef81c55a2b6584c ("perf session: Return error code for perf_session__new() function on failure")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Shawn Landden <shawn@git.icu>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: https://lore.kernel.org/lkml/ZN4Q2rxxsL08A8rd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/builtin-top.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 4b3ff7687236e..f9917848cdad0 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1751,6 +1751,7 @@ int cmd_top(int argc, const char **argv)
 	top.session = perf_session__new(NULL, NULL);
 	if (IS_ERR(top.session)) {
 		status = PTR_ERR(top.session);
+		top.session = NULL;
 		goto out_delete_evlist;
 	}
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 078/219] watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (76 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 077/219] perf top: Dont pass an ERR_PTR() directly to perf_session__delete() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 079/219] pwm: lpc32xx: Remove handling of PWM channels Greg Kroah-Hartman
                   ` (151 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Raag Jadav, Andy Shevchenko,
	Guenter Roeck, Wim Van Sebroeck, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Raag Jadav <raag.jadav@intel.com>

[ Upstream commit cf38e7691c85f1b09973b22a0b89bf1e1228d2f9 ]

When built with CONFIG_INTEL_MID_WATCHDOG=m, currently the driver
needs to be loaded manually, for the lack of module alias.
This causes unintended resets in cases where watchdog timer is
set-up by bootloader and the driver is not explicitly loaded.
Add MODULE_ALIAS() to load the driver automatically at boot and
avoid this issue.

Fixes: 87a1ef8058d9 ("watchdog: add Intel MID watchdog driver support")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230811120220.31578-1-raag.jadav@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/watchdog/intel-mid_wdt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/watchdog/intel-mid_wdt.c b/drivers/watchdog/intel-mid_wdt.c
index 9b2173f765c8c..fb7fae750181b 100644
--- a/drivers/watchdog/intel-mid_wdt.c
+++ b/drivers/watchdog/intel-mid_wdt.c
@@ -203,3 +203,4 @@ module_platform_driver(mid_wdt_driver);
 MODULE_AUTHOR("David Cohen <david.a.cohen@linux.intel.com>");
 MODULE_DESCRIPTION("Watchdog Driver for Intel MID platform");
 MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:intel_mid_wdt");
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 079/219] pwm: lpc32xx: Remove handling of PWM channels
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (77 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 078/219] watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 080/219] perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators Greg Kroah-Hartman
                   ` (150 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Dan Carpenter, Vladimir Zapolskiy,
	Uwe Kleine-König, Thierry Reding, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Zapolskiy <vz@mleia.com>

[ Upstream commit 4aae44f65827f0213a7361cf9c32cfe06114473f ]

Because LPC32xx PWM controllers have only a single output which is
registered as the only PWM device/channel per controller, it is known in
advance that pwm->hwpwm value is always 0. On basis of this fact
simplify the code by removing operations with pwm->hwpwm, there is no
controls which require channel number as input.

Even though I wasn't aware at the time when I forward ported that patch,
this fixes a null pointer dereference as lpc32xx->chip.pwms is NULL
before devm_pwmchip_add() is called.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 3d2813fb17e5 ("pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered")
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pwm/pwm-lpc32xx.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/pwm/pwm-lpc32xx.c b/drivers/pwm/pwm-lpc32xx.c
index 86a0ea0f6955c..806f0bb3ad6d8 100644
--- a/drivers/pwm/pwm-lpc32xx.c
+++ b/drivers/pwm/pwm-lpc32xx.c
@@ -51,10 +51,10 @@ static int lpc32xx_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
 	if (duty_cycles > 255)
 		duty_cycles = 255;
 
-	val = readl(lpc32xx->base + (pwm->hwpwm << 2));
+	val = readl(lpc32xx->base);
 	val &= ~0xFFFF;
 	val |= (period_cycles << 8) | duty_cycles;
-	writel(val, lpc32xx->base + (pwm->hwpwm << 2));
+	writel(val, lpc32xx->base);
 
 	return 0;
 }
@@ -69,9 +69,9 @@ static int lpc32xx_pwm_enable(struct pwm_chip *chip, struct pwm_device *pwm)
 	if (ret)
 		return ret;
 
-	val = readl(lpc32xx->base + (pwm->hwpwm << 2));
+	val = readl(lpc32xx->base);
 	val |= PWM_ENABLE;
-	writel(val, lpc32xx->base + (pwm->hwpwm << 2));
+	writel(val, lpc32xx->base);
 
 	return 0;
 }
@@ -81,9 +81,9 @@ static void lpc32xx_pwm_disable(struct pwm_chip *chip, struct pwm_device *pwm)
 	struct lpc32xx_pwm_chip *lpc32xx = to_lpc32xx_pwm_chip(chip);
 	u32 val;
 
-	val = readl(lpc32xx->base + (pwm->hwpwm << 2));
+	val = readl(lpc32xx->base);
 	val &= ~PWM_ENABLE;
-	writel(val, lpc32xx->base + (pwm->hwpwm << 2));
+	writel(val, lpc32xx->base);
 
 	clk_disable_unprepare(lpc32xx->clk);
 }
@@ -141,9 +141,9 @@ static int lpc32xx_pwm_probe(struct platform_device *pdev)
 	lpc32xx->chip.npwm = 1;
 
 	/* If PWM is disabled, configure the output to the default value */
-	val = readl(lpc32xx->base + (lpc32xx->chip.pwms[0].hwpwm << 2));
+	val = readl(lpc32xx->base);
 	val &= ~PWM_PIN_LEVEL;
-	writel(val, lpc32xx->base + (lpc32xx->chip.pwms[0].hwpwm << 2));
+	writel(val, lpc32xx->base);
 
 	ret = devm_pwmchip_add(&pdev->dev, &lpc32xx->chip);
 	if (ret < 0) {
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 080/219] perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (78 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 079/219] pwm: lpc32xx: Remove handling of PWM channels Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 081/219] perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test Greg Kroah-Hartman
                   ` (149 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kajol Jain, Ian Rogers, Disha Goel,
	Jiri Olsa, Madhavan Srinivasan, Namhyung Kim, linuxppc-dev,
	Athira Rajeev, Arnaldo Carvalho de Melo, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kajol Jain <kjain@linux.ibm.com>

[ Upstream commit 0dd1f815545d7210150642741c364521cc5cf116 ]

Running shellcheck on lock_contention.sh generates below warning:

In stat_bpf_counters_cgrp.sh line 28:
	if [ -d /sys/fs/cgroup/system.slice -a -d /sys/fs/cgroup/user.slice ]; then
                                            ^-- SC2166 (warning): Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.

In stat_bpf_counters_cgrp.sh line 34:
	local self_cgrp=$(grep perf_event /proc/self/cgroup | cut -d: -f3)
        ^-------------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
              ^-------^ SC2155 (warning): Declare and assign separately to avoid masking return values.
                        ^-- SC2046 (warning): Quote this to prevent word splitting.

In stat_bpf_counters_cgrp.sh line 51:
	local output
        ^----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.

In stat_bpf_counters_cgrp.sh line 65:
	local output
        ^----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.

Fixed above warnings by:
- Changing the expression [p -a q] to [p] && [q].
- Fixing shellcheck warnings for local usage, by prefixing
  function name to the variable.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230709182800.53002-6-atrajeev@linux.vnet.ibm.com
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: a84260e31402 ("perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../tests/shell/stat_bpf_counters_cgrp.sh     | 28 ++++++++-----------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
index d724855d097c2..a74440a00b6b6 100755
--- a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
@@ -25,22 +25,22 @@ check_bpf_counter()
 find_cgroups()
 {
 	# try usual systemd slices first
-	if [ -d /sys/fs/cgroup/system.slice -a -d /sys/fs/cgroup/user.slice ]; then
+	if [ -d /sys/fs/cgroup/system.slice ] && [ -d /sys/fs/cgroup/user.slice ]; then
 		test_cgroups="system.slice,user.slice"
 		return
 	fi
 
 	# try root and self cgroups
-	local self_cgrp=$(grep perf_event /proc/self/cgroup | cut -d: -f3)
-	if [ -z ${self_cgrp} ]; then
+	find_cgroups_self_cgrp=$(grep perf_event /proc/self/cgroup | cut -d: -f3)
+	if [ -z ${find_cgroups_self_cgrp} ]; then
 		# cgroup v2 doesn't specify perf_event
-		self_cgrp=$(grep ^0: /proc/self/cgroup | cut -d: -f3)
+		find_cgroups_self_cgrp=$(grep ^0: /proc/self/cgroup | cut -d: -f3)
 	fi
 
-	if [ -z ${self_cgrp} ]; then
+	if [ -z ${find_cgroups_self_cgrp} ]; then
 		test_cgroups="/"
 	else
-		test_cgroups="/,${self_cgrp}"
+		test_cgroups="/,${find_cgroups_self_cgrp}"
 	fi
 }
 
@@ -48,13 +48,11 @@ find_cgroups()
 # Just check if it runs without failure and has non-zero results.
 check_system_wide_counted()
 {
-	local output
-
-	output=$(perf stat -a --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, sleep 1  2>&1)
-	if echo ${output} | grep -q -F "<not "; then
+	check_system_wide_counted_output=$(perf stat -a --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, sleep 1  2>&1)
+	if echo ${check_system_wide_counted_output} | grep -q -F "<not "; then
 		echo "Some system-wide events are not counted"
 		if [ "${verbose}" = "1" ]; then
-			echo ${output}
+			echo ${check_system_wide_counted_output}
 		fi
 		exit 1
 	fi
@@ -62,13 +60,11 @@ check_system_wide_counted()
 
 check_cpu_list_counted()
 {
-	local output
-
-	output=$(perf stat -C 1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1  2>&1)
-	if echo ${output} | grep -q -F "<not "; then
+	check_cpu_list_counted_output=$(perf stat -C 1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1  2>&1)
+	if echo ${check_cpu_list_counted_output} | grep -q -F "<not "; then
 		echo "Some CPU events are not counted"
 		if [ "${verbose}" = "1" ]; then
-			echo ${output}
+			echo ${check_cpu_list_counted_output}
 		fi
 		exit 1
 	fi
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 081/219] perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (79 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 080/219] perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 082/219] drm/i915: mark requests for GuC virtual engines to avoid use-after-free Greg Kroah-Hartman
                   ` (148 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, Ian Rogers, Ingo Molnar, Jiri Olsa,
	Peter Zijlstra, bpf, Arnaldo Carvalho de Melo, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <namhyung@kernel.org>

[ Upstream commit a84260e314029e6dc9904fd6eabf8d9fd7965351 ]

It has system-wide test and cpu-list test but the cpu-list test fails
sometimes.  It runs sleep command on CPU1 and measure both user.slice
and system.slice cgroups by default (on systemd-based systems).

But if the system was idle enough, sometime the system.slice gets no
count and it makes the test failing.  Maybe that's because it only looks
at the CPU1, let's add CPU0 to increase the chance it finds some tasks.

Fixes: 7901086014bbaa3a ("perf test: Add a new test for perf stat cgroup BPF counter")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/perf/tests/shell/stat_bpf_counters_cgrp.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
index a74440a00b6b6..e75d0780dc788 100755
--- a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
@@ -60,7 +60,7 @@ check_system_wide_counted()
 
 check_cpu_list_counted()
 {
-	check_cpu_list_counted_output=$(perf stat -C 1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1  2>&1)
+	check_cpu_list_counted_output=$(perf stat -C 0,1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1  2>&1)
 	if echo ${check_cpu_list_counted_output} | grep -q -F "<not "; then
 		echo "Some CPU events are not counted"
 		if [ "${verbose}" = "1" ]; then
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 082/219] drm/i915: mark requests for GuC virtual engines to avoid use-after-free
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (80 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 081/219] perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 083/219] blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice() Greg Kroah-Hartman
                   ` (147 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Chris Wilson, Andrzej Hajda,
	Andi Shyti, Rodrigo Vivi, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andrzej Hajda <andrzej.hajda@intel.com>

[ Upstream commit 5eefc5307c983b59344a4cb89009819f580c84fa ]

References to i915_requests may be trapped by userspace inside a
sync_file or dmabuf (dma-resv) and held indefinitely across different
proceses. To counter-act the memory leaks, we try to not to keep
references from the request past their completion.
On the other side on fence release we need to know if rq->engine
is valid and points to hw engine (true for non-virtual requests).
To make it possible extra bit has been added to rq->execution_mask,
for marking virtual engines.

Fixes: bcb9aa45d5a0 ("Revert "drm/i915: Hold reference to intel_context over life of i915_request"")
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230821153035.3903006-1-andrzej.hajda@intel.com
(cherry picked from commit 280410677af763f3871b93e794a199cfcf6fb580)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h      | 1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 +++
 drivers/gpu/drm/i915/i915_request.c               | 7 ++-----
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 6b5d4ea22b673..107f465a27b9e 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -56,6 +56,7 @@ struct intel_breadcrumbs;
 
 typedef u32 intel_engine_mask_t;
 #define ALL_ENGINES ((intel_engine_mask_t)~0ul)
+#define VIRTUAL_ENGINES BIT(BITS_PER_TYPE(intel_engine_mask_t) - 1)
 
 struct intel_hw_status_page {
 	struct list_head timelines;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0ec07dad1dcf1..fecdc7ea78ebd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -5111,6 +5111,9 @@ guc_create_virtual(struct intel_engine_cs **siblings, unsigned int count,
 
 	ve->base.flags = I915_ENGINE_IS_VIRTUAL;
 
+	BUILD_BUG_ON(ilog2(VIRTUAL_ENGINES) < I915_NUM_ENGINES);
+	ve->base.mask = VIRTUAL_ENGINES;
+
 	intel_context_init(&ve->context, &ve->base);
 
 	for (n = 0; n < count; n++) {
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 803cd2ad4deb5..7ce126a01cbf6 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -134,9 +134,7 @@ static void i915_fence_release(struct dma_fence *fence)
 	i915_sw_fence_fini(&rq->semaphore);
 
 	/*
-	 * Keep one request on each engine for reserved use under mempressure
-	 * do not use with virtual engines as this really is only needed for
-	 * kernel contexts.
+	 * Keep one request on each engine for reserved use under mempressure.
 	 *
 	 * We do not hold a reference to the engine here and so have to be
 	 * very careful in what rq->engine we poke. The virtual engine is
@@ -166,8 +164,7 @@ static void i915_fence_release(struct dma_fence *fence)
 	 * know that if the rq->execution_mask is a single bit, rq->engine
 	 * can be a physical engine with the exact corresponding mask.
 	 */
-	if (!intel_engine_is_virtual(rq->engine) &&
-	    is_power_of_2(rq->execution_mask) &&
+	if (is_power_of_2(rq->execution_mask) &&
 	    !cmpxchg(&rq->engine->request_pool, NULL, rq))
 		return;
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 083/219] blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (81 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 082/219] drm/i915: mark requests for GuC virtual engines to avoid use-after-free Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 084/219] blk-throttle: consider carryover_ios/bytes in throtl_trim_slice() Greg Kroah-Hartman
                   ` (146 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yu Kuai, Tejun Heo, Jens Axboe,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yu Kuai <yukuai3@huawei.com>

[ Upstream commit e8368b57c006dc0e02dcd8a9dc9f2060ff5476fe ]

There are no functional changes, just make the code cleaner.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230816012708.1193747-4-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: eead0056648c ("blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-throttle.c | 86 +++++++++++++++++++++-----------------------
 1 file changed, 41 insertions(+), 45 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index f1bc600c4ded6..931795da4d65d 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -697,11 +697,40 @@ static bool throtl_slice_used(struct throtl_grp *tg, bool rw)
 	return true;
 }
 
+static unsigned int calculate_io_allowed(u32 iops_limit,
+					 unsigned long jiffy_elapsed)
+{
+	unsigned int io_allowed;
+	u64 tmp;
+
+	/*
+	 * jiffy_elapsed should not be a big value as minimum iops can be
+	 * 1 then at max jiffy elapsed should be equivalent of 1 second as we
+	 * will allow dispatch after 1 second and after that slice should
+	 * have been trimmed.
+	 */
+
+	tmp = (u64)iops_limit * jiffy_elapsed;
+	do_div(tmp, HZ);
+
+	if (tmp > UINT_MAX)
+		io_allowed = UINT_MAX;
+	else
+		io_allowed = tmp;
+
+	return io_allowed;
+}
+
+static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
+{
+	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
+}
+
 /* Trim the used slices and adjust slice start accordingly */
 static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 {
-	unsigned long nr_slices, time_elapsed, io_trim;
-	u64 bytes_trim, tmp;
+	unsigned long time_elapsed, io_trim;
+	u64 bytes_trim;
 
 	BUG_ON(time_before(tg->slice_end[rw], tg->slice_start[rw]));
 
@@ -723,19 +752,14 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 
 	throtl_set_slice_end(tg, rw, jiffies + tg->td->throtl_slice);
 
-	time_elapsed = jiffies - tg->slice_start[rw];
-
-	nr_slices = time_elapsed / tg->td->throtl_slice;
-
-	if (!nr_slices)
+	time_elapsed = rounddown(jiffies - tg->slice_start[rw],
+				 tg->td->throtl_slice);
+	if (!time_elapsed)
 		return;
-	tmp = tg_bps_limit(tg, rw) * tg->td->throtl_slice * nr_slices;
-	do_div(tmp, HZ);
-	bytes_trim = tmp;
-
-	io_trim = (tg_iops_limit(tg, rw) * tg->td->throtl_slice * nr_slices) /
-		HZ;
 
+	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
+					     time_elapsed);
+	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
 	if (!bytes_trim && !io_trim)
 		return;
 
@@ -749,41 +773,13 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 	else
 		tg->io_disp[rw] = 0;
 
-	tg->slice_start[rw] += nr_slices * tg->td->throtl_slice;
+	tg->slice_start[rw] += time_elapsed;
 
 	throtl_log(&tg->service_queue,
 		   "[%c] trim slice nr=%lu bytes=%llu io=%lu start=%lu end=%lu jiffies=%lu",
-		   rw == READ ? 'R' : 'W', nr_slices, bytes_trim, io_trim,
-		   tg->slice_start[rw], tg->slice_end[rw], jiffies);
-}
-
-static unsigned int calculate_io_allowed(u32 iops_limit,
-					 unsigned long jiffy_elapsed)
-{
-	unsigned int io_allowed;
-	u64 tmp;
-
-	/*
-	 * jiffy_elapsed should not be a big value as minimum iops can be
-	 * 1 then at max jiffy elapsed should be equivalent of 1 second as we
-	 * will allow dispatch after 1 second and after that slice should
-	 * have been trimmed.
-	 */
-
-	tmp = (u64)iops_limit * jiffy_elapsed;
-	do_div(tmp, HZ);
-
-	if (tmp > UINT_MAX)
-		io_allowed = UINT_MAX;
-	else
-		io_allowed = tmp;
-
-	return io_allowed;
-}
-
-static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
-{
-	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
+		   rw == READ ? 'R' : 'W', time_elapsed / tg->td->throtl_slice,
+		   bytes_trim, io_trim, tg->slice_start[rw], tg->slice_end[rw],
+		   jiffies);
 }
 
 static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 084/219] blk-throttle: consider carryover_ios/bytes in throtl_trim_slice()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (82 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 083/219] blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 085/219] cifs: use fs_context for automounts Greg Kroah-Hartman
                   ` (145 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, zhuxiaohui, Yu Kuai, Tejun Heo,
	Jens Axboe, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yu Kuai <yukuai3@huawei.com>

[ Upstream commit eead0056648cef49d7b15c07ae612fa217083165 ]

Currently, 'carryover_ios/bytes' is not handled in throtl_trim_slice(),
for consequence, 'carryover_ios/bytes' will be used to throttle bio
multiple times, for example:

1) set iops limit to 100, and slice start is 0, slice end is 100ms;
2) current time is 0, and 10 ios are dispatched, those io won't be
   throttled and io_disp is 10;
3) still at current time 0, update iops limit to 1000, carryover_ios is
   updated to (0 - 10) = -10;
4) in this slice(0 - 100ms), io_allowed = 100 + (-10) = 90, which means
   only 90 ios can be dispatched without waiting;
5) assume that io is throttled in slice(0 - 100ms), and
   throtl_trim_slice() update silce to (100ms - 200ms). In this case,
   'carryover_ios/bytes' is not cleared and still only 90 ios can be
   dispatched between 100ms - 200ms.

Fix this problem by updating 'carryover_ios/bytes' in
throtl_trim_slice().

Fixes: a880ae93e5b5 ("blk-throttle: fix io hung due to configuration updates")
Reported-by: zhuxiaohui <zhuxiaohui.400@bytedance.com>
Link: https://lore.kernel.org/all/20230812072116.42321-1-zhuxiaohui.400@bytedance.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230816012708.1193747-5-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-throttle.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 931795da4d65d..1007f80278579 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -729,8 +729,9 @@ static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
 /* Trim the used slices and adjust slice start accordingly */
 static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 {
-	unsigned long time_elapsed, io_trim;
-	u64 bytes_trim;
+	unsigned long time_elapsed;
+	long long bytes_trim;
+	int io_trim;
 
 	BUG_ON(time_before(tg->slice_end[rw], tg->slice_start[rw]));
 
@@ -758,17 +759,21 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 		return;
 
 	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
-					     time_elapsed);
-	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
-	if (!bytes_trim && !io_trim)
+					     time_elapsed) +
+		     tg->carryover_bytes[rw];
+	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
+		  tg->carryover_ios[rw];
+	if (bytes_trim <= 0 && io_trim <= 0)
 		return;
 
-	if (tg->bytes_disp[rw] >= bytes_trim)
+	tg->carryover_bytes[rw] = 0;
+	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
 		tg->bytes_disp[rw] -= bytes_trim;
 	else
 		tg->bytes_disp[rw] = 0;
 
-	if (tg->io_disp[rw] >= io_trim)
+	tg->carryover_ios[rw] = 0;
+	if ((int)tg->io_disp[rw] >= io_trim)
 		tg->io_disp[rw] -= io_trim;
 	else
 		tg->io_disp[rw] = 0;
@@ -776,7 +781,7 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 	tg->slice_start[rw] += time_elapsed;
 
 	throtl_log(&tg->service_queue,
-		   "[%c] trim slice nr=%lu bytes=%llu io=%lu start=%lu end=%lu jiffies=%lu",
+		   "[%c] trim slice nr=%lu bytes=%lld io=%d start=%lu end=%lu jiffies=%lu",
 		   rw == READ ? 'R' : 'W', time_elapsed / tg->td->throtl_slice,
 		   bytes_trim, io_trim, tg->slice_start[rw], tg->slice_end[rw],
 		   jiffies);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 085/219] cifs: use fs_context for automounts
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (83 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 084/219] blk-throttle: consider carryover_ios/bytes in throtl_trim_slice() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 086/219] smb: propagate error code of extract_sharename() Greg Kroah-Hartman
                   ` (144 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Paulo Alcantara (SUSE), Steve French,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Paulo Alcantara <pc@cjr.nz>

[ Upstream commit 9fd29a5bae6e8f94b410374099a6fddb253d2d5f ]

Use filesystem context support to handle dfs links.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Stable-dep-of: efc0b0bcffcb ("smb: propagate error code of extract_sharename()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/smb/client/cifs_dfs_ref.c | 100 ++++++++++++++---------------------
 1 file changed, 40 insertions(+), 60 deletions(-)

diff --git a/fs/smb/client/cifs_dfs_ref.c b/fs/smb/client/cifs_dfs_ref.c
index b0864da9ef434..020e71fe1454e 100644
--- a/fs/smb/client/cifs_dfs_ref.c
+++ b/fs/smb/client/cifs_dfs_ref.c
@@ -258,61 +258,23 @@ char *cifs_compose_mount_options(const char *sb_mountdata,
 	goto compose_mount_options_out;
 }
 
-/**
- * cifs_dfs_do_mount - mounts specified path using DFS full path
- *
- * Always pass down @fullpath to smb3_do_mount() so we can use the root server
- * to perform failover in case we failed to connect to the first target in the
- * referral.
- *
- * @mntpt:		directory entry for the path we are trying to automount
- * @cifs_sb:		parent/root superblock
- * @fullpath:		full path in UNC format
- */
-static struct vfsmount *cifs_dfs_do_mount(struct dentry *mntpt,
-					  struct cifs_sb_info *cifs_sb,
-					  const char *fullpath)
-{
-	struct vfsmount *mnt;
-	char *mountdata;
-	char *devname;
-
-	devname = kstrdup(fullpath, GFP_KERNEL);
-	if (!devname)
-		return ERR_PTR(-ENOMEM);
-
-	convert_delimiter(devname, '/');
-
-	/* TODO: change to call fs_context_for_mount(), fill in context directly, call fc_mount */
-
-	/* See afs_mntpt_do_automount in fs/afs/mntpt.c for an example */
-
-	/* strip first '\' from fullpath */
-	mountdata = cifs_compose_mount_options(cifs_sb->ctx->mount_options,
-					       fullpath + 1, NULL, NULL);
-	if (IS_ERR(mountdata)) {
-		kfree(devname);
-		return (struct vfsmount *)mountdata;
-	}
-
-	mnt = vfs_submount(mntpt, &cifs_fs_type, devname, mountdata);
-	kfree(mountdata);
-	kfree(devname);
-	return mnt;
-}
-
 /*
  * Create a vfsmount that we can automount
  */
-static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt)
+static struct vfsmount *cifs_dfs_do_automount(struct path *path)
 {
+	int rc;
+	struct dentry *mntpt = path->dentry;
+	struct fs_context *fc;
 	struct cifs_sb_info *cifs_sb;
-	void *page;
+	void *page = NULL;
+	struct smb3_fs_context *ctx, *cur_ctx;
+	struct smb3_fs_context tmp;
 	char *full_path;
 	struct vfsmount *mnt;
 
-	cifs_dbg(FYI, "in %s\n", __func__);
-	BUG_ON(IS_ROOT(mntpt));
+	if (IS_ROOT(mntpt))
+		return ERR_PTR(-ESTALE);
 
 	/*
 	 * The MSDFS spec states that paths in DFS referral requests and
@@ -321,29 +283,47 @@ static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt)
 	 * gives us the latter, so we must adjust the result.
 	 */
 	cifs_sb = CIFS_SB(mntpt->d_sb);
-	if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS) {
-		mnt = ERR_PTR(-EREMOTE);
-		goto cdda_exit;
-	}
+	if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS)
+		return ERR_PTR(-EREMOTE);
+
+	cur_ctx = cifs_sb->ctx;
+
+	fc = fs_context_for_submount(path->mnt->mnt_sb->s_type, mntpt);
+	if (IS_ERR(fc))
+		return ERR_CAST(fc);
+
+	ctx = smb3_fc2context(fc);
 
 	page = alloc_dentry_path();
 	/* always use tree name prefix */
 	full_path = build_path_from_dentry_optional_prefix(mntpt, page, true);
 	if (IS_ERR(full_path)) {
 		mnt = ERR_CAST(full_path);
-		goto free_full_path;
+		goto out;
 	}
 
-	convert_delimiter(full_path, '\\');
+	convert_delimiter(full_path, '/');
 	cifs_dbg(FYI, "%s: full_path: %s\n", __func__, full_path);
 
-	mnt = cifs_dfs_do_mount(mntpt, cifs_sb, full_path);
-	cifs_dbg(FYI, "%s: cifs_dfs_do_mount:%s , mnt:%p\n", __func__, full_path + 1, mnt);
+	tmp = *cur_ctx;
+	tmp.source = full_path;
+	tmp.UNC = tmp.prepath = NULL;
+
+	rc = smb3_fs_context_dup(ctx, &tmp);
+	if (rc) {
+		mnt = ERR_PTR(rc);
+		goto out;
+	}
+
+	rc = smb3_parse_devname(full_path, ctx);
+	if (!rc)
+		mnt = fc_mount(fc);
+	else
+		mnt = ERR_PTR(rc);
 
-free_full_path:
+out:
+	put_fs_context(fc);
 	free_dentry_path(page);
-cdda_exit:
-	cifs_dbg(FYI, "leaving %s\n" , __func__);
 	return mnt;
 }
 
@@ -354,9 +334,9 @@ struct vfsmount *cifs_dfs_d_automount(struct path *path)
 {
 	struct vfsmount *newmnt;
 
-	cifs_dbg(FYI, "in %s\n", __func__);
+	cifs_dbg(FYI, "%s: %pd\n", __func__, path->dentry);
 
-	newmnt = cifs_dfs_do_automount(path->dentry);
+	newmnt = cifs_dfs_do_automount(path);
 	if (IS_ERR(newmnt)) {
 		cifs_dbg(FYI, "leaving %s [automount failed]\n" , __func__);
 		return newmnt;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 086/219] smb: propagate error code of extract_sharename()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (84 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 085/219] cifs: use fs_context for automounts Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 087/219] net/sched: fq_pie: avoid stalls in fq_pie_timer() Greg Kroah-Hartman
                   ` (143 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Katya Orlova, Steve French,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Katya Orlova <e.orlova@ispras.ru>

[ Upstream commit efc0b0bcffcba60d9c6301063d25a22a4744b499 ]

In addition to the EINVAL, there may be an ENOMEM.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite")
Signed-off-by: Katya Orlova <e.orlova@ispras.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/smb/client/fscache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/smb/client/fscache.c b/fs/smb/client/fscache.c
index f6f3a6b75601b..e73625b5d0cc6 100644
--- a/fs/smb/client/fscache.c
+++ b/fs/smb/client/fscache.c
@@ -48,7 +48,7 @@ int cifs_fscache_get_super_cookie(struct cifs_tcon *tcon)
 	sharename = extract_sharename(tcon->tree_name);
 	if (IS_ERR(sharename)) {
 		cifs_dbg(FYI, "%s: couldn't extract sharename\n", __func__);
-		return -EINVAL;
+		return PTR_ERR(sharename);
 	}
 
 	slen = strlen(sharename);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 087/219] net/sched: fq_pie: avoid stalls in fq_pie_timer()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (85 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 086/219] smb: propagate error code of extract_sharename() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 088/219] sctp: annotate data-races around sk->sk_wmem_queued Greg Kroah-Hartman
                   ` (142 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot+e46fbd5289363464bc13,
	Eric Dumazet, Michal Kubiak, Jamal Hadi Salim, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 8c21ab1bae945686c602c5bfa4e3f3352c2452c5 ]

When setting a high number of flows (limit being 65536),
fq_pie_timer() is currently using too much time as syzbot reported.

Add logic to yield the cpu every 2048 flows (less than 150 usec
on debug kernels).
It should also help by not blocking qdisc fast paths for too long.
Worst case (65536 flows) would need 31 jiffies for a complete scan.

Relevant extract from syzbot report:

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 2663 jiffies s: 873 root: 0x1/.
rcu: blocking rcu_node structures (internal RCU debug):
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 5177 Comm: syz-executor273 Not tainted 6.5.0-syzkaller-00453-g727dbda16b83 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
RIP: 0010:write_comp_data+0x21/0x90 kernel/kcov.c:236
Code: 2e 0f 1f 84 00 00 00 00 00 65 8b 05 01 b2 7d 7e 49 89 f1 89 c6 49 89 d2 81 e6 00 01 00 00 49 89 f8 65 48 8b 14 25 80 b9 03 00 <a9> 00 01 ff 00 74 0e 85 f6 74 59 8b 82 04 16 00 00 85 c0 74 4f 8b
RSP: 0018:ffffc90000007bb8 EFLAGS: 00000206
RAX: 0000000000000101 RBX: ffffc9000dc0d140 RCX: ffffffff885893b0
RDX: ffff88807c075940 RSI: 0000000000000100 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000dc0d178
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000555555d54380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6b442f6130 CR3: 000000006fe1c000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <NMI>
 </NMI>
 <IRQ>
 pie_calculate_probability+0x480/0x850 net/sched/sch_pie.c:415
 fq_pie_timer+0x1da/0x4f0 net/sched/sch_fq_pie.c:387
 call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700

Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Link: https://lore.kernel.org/lkml/00000000000017ad3f06040bf394@google.com/
Reported-by: syzbot+e46fbd5289363464bc13@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230829123541.3745013-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/sched/sch_fq_pie.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c
index 591d87d5e5c0f..68e6acd0f130d 100644
--- a/net/sched/sch_fq_pie.c
+++ b/net/sched/sch_fq_pie.c
@@ -61,6 +61,7 @@ struct fq_pie_sched_data {
 	struct pie_params p_params;
 	u32 ecn_prob;
 	u32 flows_cnt;
+	u32 flows_cursor;
 	u32 quantum;
 	u32 memory_limit;
 	u32 new_flow_count;
@@ -375,22 +376,32 @@ static int fq_pie_change(struct Qdisc *sch, struct nlattr *opt,
 static void fq_pie_timer(struct timer_list *t)
 {
 	struct fq_pie_sched_data *q = from_timer(q, t, adapt_timer);
+	unsigned long next, tupdate;
 	struct Qdisc *sch = q->sch;
 	spinlock_t *root_lock; /* to lock qdisc for probability calculations */
-	u32 idx;
+	int max_cnt, i;
 
 	rcu_read_lock();
 	root_lock = qdisc_lock(qdisc_root_sleeping(sch));
 	spin_lock(root_lock);
 
-	for (idx = 0; idx < q->flows_cnt; idx++)
-		pie_calculate_probability(&q->p_params, &q->flows[idx].vars,
-					  q->flows[idx].backlog);
-
-	/* reset the timer to fire after 'tupdate' jiffies. */
-	if (q->p_params.tupdate)
-		mod_timer(&q->adapt_timer, jiffies + q->p_params.tupdate);
+	/* Limit this expensive loop to 2048 flows per round. */
+	max_cnt = min_t(int, q->flows_cnt - q->flows_cursor, 2048);
+	for (i = 0; i < max_cnt; i++) {
+		pie_calculate_probability(&q->p_params,
+					  &q->flows[q->flows_cursor].vars,
+					  q->flows[q->flows_cursor].backlog);
+		q->flows_cursor++;
+	}
 
+	tupdate = q->p_params.tupdate;
+	next = 0;
+	if (q->flows_cursor >= q->flows_cnt) {
+		q->flows_cursor = 0;
+		next = tupdate;
+	}
+	if (tupdate)
+		mod_timer(&q->adapt_timer, jiffies + next);
 	spin_unlock(root_lock);
 	rcu_read_unlock();
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 088/219] sctp: annotate data-races around sk->sk_wmem_queued
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (86 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 087/219] net/sched: fq_pie: avoid stalls in fq_pie_timer() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 089/219] ipv4: annotate data-races around fi->fib_dead Greg Kroah-Hartman
                   ` (141 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot, Eric Dumazet,
	Marcelo Ricardo Leitner, Xin Long, Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit dc9511dd6f37fe803f6b15b61b030728d7057417 ]

sk->sk_wmem_queued can be read locklessly from sctp_poll()

Use sk_wmem_queued_add() when the field is changed,
and add READ_ONCE() annotations in sctp_writeable()
and sctp_assocs_seq_show()

syzbot reported:

BUG: KCSAN: data-race in sctp_poll / sctp_wfree

read-write to 0xffff888149d77810 of 4 bytes by interrupt on cpu 0:
sctp_wfree+0x170/0x4a0 net/sctp/socket.c:9147
skb_release_head_state+0xb7/0x1a0 net/core/skbuff.c:988
skb_release_all net/core/skbuff.c:1000 [inline]
__kfree_skb+0x16/0x140 net/core/skbuff.c:1016
consume_skb+0x57/0x180 net/core/skbuff.c:1232
sctp_chunk_destroy net/sctp/sm_make_chunk.c:1503 [inline]
sctp_chunk_put+0xcd/0x130 net/sctp/sm_make_chunk.c:1530
sctp_datamsg_put+0x29a/0x300 net/sctp/chunk.c:128
sctp_chunk_free+0x34/0x50 net/sctp/sm_make_chunk.c:1515
sctp_outq_sack+0xafa/0xd70 net/sctp/outqueue.c:1381
sctp_cmd_process_sack net/sctp/sm_sideeffect.c:834 [inline]
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1366 [inline]
sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
sctp_do_sm+0x12c7/0x31b0 net/sctp/sm_sideeffect.c:1169
sctp_assoc_bh_rcv+0x2b2/0x430 net/sctp/associola.c:1051
sctp_inq_push+0x108/0x120 net/sctp/inqueue.c:80
sctp_rcv+0x116e/0x1340 net/sctp/input.c:243
sctp6_rcv+0x25/0x40 net/sctp/ipv6.c:1120
ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
dst_input include/net/dst.h:468 [inline]
ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
NF_HOOK include/linux/netfilter.h:303 [inline]
ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
__netif_receive_skb_one_core net/core/dev.c:5452 [inline]
__netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
process_backlog+0x21f/0x380 net/core/dev.c:5894
__napi_poll+0x60/0x3b0 net/core/dev.c:6460
napi_poll net/core/dev.c:6527 [inline]
net_rx_action+0x32b/0x750 net/core/dev.c:6660
__do_softirq+0xc1/0x265 kernel/softirq.c:553
run_ksoftirqd+0x17/0x20 kernel/softirq.c:921
smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
kthread+0x1d7/0x210 kernel/kthread.c:389
ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

read to 0xffff888149d77810 of 4 bytes by task 17828 on cpu 1:
sctp_writeable net/sctp/socket.c:9304 [inline]
sctp_poll+0x265/0x410 net/sctp/socket.c:8671
sock_poll+0x253/0x270 net/socket.c:1374
vfs_poll include/linux/poll.h:88 [inline]
do_pollfd fs/select.c:873 [inline]
do_poll fs/select.c:921 [inline]
do_sys_poll+0x636/0xc00 fs/select.c:1015
__do_sys_ppoll fs/select.c:1121 [inline]
__se_sys_ppoll+0x1af/0x1f0 fs/select.c:1101
__x64_sys_ppoll+0x67/0x80 fs/select.c:1101
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00019e80 -> 0x0000cc80

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 17828 Comm: syz-executor.1 Not tainted 6.5.0-rc7-syzkaller-00185-g28f20a19294d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20230830094519.950007-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/sctp/proc.c   |  2 +-
 net/sctp/socket.c | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index f13d6a34f32f2..ec00ee75d59a6 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -282,7 +282,7 @@ static int sctp_assocs_seq_show(struct seq_file *seq, void *v)
 		assoc->init_retries, assoc->shutdown_retries,
 		assoc->rtx_data_chunks,
 		refcount_read(&sk->sk_wmem_alloc),
-		sk->sk_wmem_queued,
+		READ_ONCE(sk->sk_wmem_queued),
 		sk->sk_sndbuf,
 		sk->sk_rcvbuf);
 	seq_printf(seq, "\n");
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index a11b0d903514c..32e3669adf146 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -68,7 +68,7 @@
 #include <net/sctp/stream_sched.h>
 
 /* Forward declarations for internal helper functions. */
-static bool sctp_writeable(struct sock *sk);
+static bool sctp_writeable(const struct sock *sk);
 static void sctp_wfree(struct sk_buff *skb);
 static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 				size_t msg_len);
@@ -139,7 +139,7 @@ static inline void sctp_set_owner_w(struct sctp_chunk *chunk)
 
 	refcount_add(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);
 	asoc->sndbuf_used += chunk->skb->truesize + sizeof(struct sctp_chunk);
-	sk->sk_wmem_queued += chunk->skb->truesize + sizeof(struct sctp_chunk);
+	sk_wmem_queued_add(sk, chunk->skb->truesize + sizeof(struct sctp_chunk));
 	sk_mem_charge(sk, chunk->skb->truesize);
 }
 
@@ -9139,7 +9139,7 @@ static void sctp_wfree(struct sk_buff *skb)
 	struct sock *sk = asoc->base.sk;
 
 	sk_mem_uncharge(sk, skb->truesize);
-	sk->sk_wmem_queued -= skb->truesize + sizeof(struct sctp_chunk);
+	sk_wmem_queued_add(sk, -(skb->truesize + sizeof(struct sctp_chunk)));
 	asoc->sndbuf_used -= skb->truesize + sizeof(struct sctp_chunk);
 	WARN_ON(refcount_sub_and_test(sizeof(struct sctp_chunk),
 				      &sk->sk_wmem_alloc));
@@ -9292,9 +9292,9 @@ void sctp_write_space(struct sock *sk)
  * UDP-style sockets or TCP-style sockets, this code should work.
  *  - Daisy
  */
-static bool sctp_writeable(struct sock *sk)
+static bool sctp_writeable(const struct sock *sk)
 {
-	return sk->sk_sndbuf > sk->sk_wmem_queued;
+	return READ_ONCE(sk->sk_sndbuf) > READ_ONCE(sk->sk_wmem_queued);
 }
 
 /* Wait for an association to go into ESTABLISHED state. If timeout is 0,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 089/219] ipv4: annotate data-races around fi->fib_dead
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (87 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 088/219] sctp: annotate data-races around sk->sk_wmem_queued Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 090/219] net: read sk->sk_family once in sk_mc_loop() Greg Kroah-Hartman
                   ` (140 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot, Eric Dumazet, David Ahern,
	Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit fce92af1c29d90184dfec638b5738831097d66e9 ]

syzbot complained about a data-race in fib_table_lookup() [1]

Add appropriate annotations to document it.

[1]
BUG: KCSAN: data-race in fib_release_info / fib_table_lookup

write to 0xffff888150f31744 of 1 bytes by task 1189 on cpu 0:
fib_release_info+0x3a0/0x460 net/ipv4/fib_semantics.c:281
fib_table_delete+0x8d2/0x900 net/ipv4/fib_trie.c:1777
fib_magic+0x1c1/0x1f0 net/ipv4/fib_frontend.c:1106
fib_del_ifaddr+0x8cf/0xa60 net/ipv4/fib_frontend.c:1317
fib_inetaddr_event+0x77/0x200 net/ipv4/fib_frontend.c:1448
notifier_call_chain kernel/notifier.c:93 [inline]
blocking_notifier_call_chain+0x90/0x200 kernel/notifier.c:388
__inet_del_ifa+0x4df/0x800 net/ipv4/devinet.c:432
inet_del_ifa net/ipv4/devinet.c:469 [inline]
inetdev_destroy net/ipv4/devinet.c:322 [inline]
inetdev_event+0x553/0xaf0 net/ipv4/devinet.c:1606
notifier_call_chain kernel/notifier.c:93 [inline]
raw_notifier_call_chain+0x6b/0x1c0 kernel/notifier.c:461
call_netdevice_notifiers_info net/core/dev.c:1962 [inline]
call_netdevice_notifiers_mtu+0xd2/0x130 net/core/dev.c:2037
dev_set_mtu_ext+0x30b/0x3e0 net/core/dev.c:8673
do_setlink+0x5be/0x2430 net/core/rtnetlink.c:2837
rtnl_setlink+0x255/0x300 net/core/rtnetlink.c:3177
rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6445
netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2549
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6463
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1914
sock_sendmsg_nosec net/socket.c:725 [inline]
sock_sendmsg net/socket.c:748 [inline]
sock_write_iter+0x1aa/0x230 net/socket.c:1129
do_iter_write+0x4b4/0x7b0 fs/read_write.c:860
vfs_writev+0x1a8/0x320 fs/read_write.c:933
do_writev+0xf8/0x220 fs/read_write.c:976
__do_sys_writev fs/read_write.c:1049 [inline]
__se_sys_writev fs/read_write.c:1046 [inline]
__x64_sys_writev+0x45/0x50 fs/read_write.c:1046
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff888150f31744 of 1 bytes by task 21839 on cpu 1:
fib_table_lookup+0x2bf/0xd50 net/ipv4/fib_trie.c:1585
fib_lookup include/net/ip_fib.h:383 [inline]
ip_route_output_key_hash_rcu+0x38c/0x12c0 net/ipv4/route.c:2751
ip_route_output_key_hash net/ipv4/route.c:2641 [inline]
__ip_route_output_key include/net/route.h:134 [inline]
ip_route_output_flow+0xa6/0x150 net/ipv4/route.c:2869
send4+0x1e7/0x500 drivers/net/wireguard/socket.c:61
wg_socket_send_skb_to_peer+0x94/0x130 drivers/net/wireguard/socket.c:175
wg_socket_send_buffer_to_peer+0xd6/0x100 drivers/net/wireguard/socket.c:200
wg_packet_send_handshake_initiation drivers/net/wireguard/send.c:40 [inline]
wg_packet_handshake_send_worker+0x10c/0x150 drivers/net/wireguard/send.c:51
process_one_work+0x434/0x860 kernel/workqueue.c:2600
worker_thread+0x5f2/0xa10 kernel/workqueue.c:2751
kthread+0x1d7/0x210 kernel/kthread.c:389
ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

value changed: 0x00 -> 0x01

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 21839 Comm: kworker/u4:18 Tainted: G W 6.5.0-syzkaller #0

Fixes: dccd9ecc3744 ("ipv4: Do not use dead fib_info entries.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230830095520.1046984-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/fib_semantics.c | 5 ++++-
 net/ipv4/fib_trie.c      | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 3bb890a40ed73..3b6e6bc80dc1c 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -278,7 +278,8 @@ void fib_release_info(struct fib_info *fi)
 				hlist_del(&nexthop_nh->nh_hash);
 			} endfor_nexthops(fi)
 		}
-		fi->fib_dead = 1;
+		/* Paired with READ_ONCE() from fib_table_lookup() */
+		WRITE_ONCE(fi->fib_dead, 1);
 		fib_info_put(fi);
 	}
 	spin_unlock_bh(&fib_info_lock);
@@ -1581,6 +1582,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
 link_it:
 	ofi = fib_find_info(fi);
 	if (ofi) {
+		/* fib_table_lookup() should not see @fi yet. */
 		fi->fib_dead = 1;
 		free_fib_info(fi);
 		refcount_inc(&ofi->fib_treeref);
@@ -1619,6 +1621,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg,
 
 failure:
 	if (fi) {
+		/* fib_table_lookup() should not see @fi yet. */
 		fi->fib_dead = 1;
 		free_fib_info(fi);
 	}
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 74d403dbd2b4e..d13fb9e76b971 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1582,7 +1582,8 @@ int fib_table_lookup(struct fib_table *tb, const struct flowi4 *flp,
 		if (fa->fa_dscp &&
 		    inet_dscp_to_dsfield(fa->fa_dscp) != flp->flowi4_tos)
 			continue;
-		if (fi->fib_dead)
+		/* Paired with WRITE_ONCE() in fib_release_info() */
+		if (READ_ONCE(fi->fib_dead))
 			continue;
 		if (fa->fa_info->fib_scope < flp->flowi4_scope)
 			continue;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 090/219] net: read sk->sk_family once in sk_mc_loop()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (88 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 089/219] ipv4: annotate data-races around fi->fib_dead Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 091/219] net: fib: avoid warn splat in flow dissector Greg Kroah-Hartman
                   ` (139 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot, Eric Dumazet,
	Kuniyuki Iwashima, Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit a3e0fdf71bbe031de845e8e08ed7fba49f9c702c ]

syzbot is playing with IPV6_ADDRFORM quite a lot these days,
and managed to hit the WARN_ON_ONCE(1) in sk_mc_loop()

We have many more similar issues to fix.

WARNING: CPU: 1 PID: 1593 at net/core/sock.c:782 sk_mc_loop+0x165/0x260
Modules linked in:
CPU: 1 PID: 1593 Comm: kworker/1:3 Not tainted 6.1.40-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
Workqueue: events_power_efficient gc_worker
RIP: 0010:sk_mc_loop+0x165/0x260 net/core/sock.c:782
Code: 34 1b fd 49 81 c7 18 05 00 00 4c 89 f8 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 ff e8 25 36 6d fd 4d 8b 37 eb 13 e8 db 33 1b fd <0f> 0b b3 01 eb 34 e8 d0 33 1b fd 45 31 f6 49 83 c6 38 4c 89 f0 48
RSP: 0018:ffffc90000388530 EFLAGS: 00010246
RAX: ffffffff846d9b55 RBX: 0000000000000011 RCX: ffff88814f884980
RDX: 0000000000000102 RSI: ffffffff87ae5160 RDI: 0000000000000011
RBP: ffffc90000388550 R08: 0000000000000003 R09: ffffffff846d9a65
R10: 0000000000000002 R11: ffff88814f884980 R12: dffffc0000000000
R13: ffff88810dbee000 R14: 0000000000000010 R15: ffff888150084000
FS: 0000000000000000(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000180 CR3: 000000014ee5b000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
[<ffffffff8507734f>] ip6_finish_output2+0x33f/0x1ae0 net/ipv6/ip6_output.c:83
[<ffffffff85062766>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
[<ffffffff85062766>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
[<ffffffff85061f8c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
[<ffffffff85061f8c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
[<ffffffff852071cf>] dst_output include/net/dst.h:444 [inline]
[<ffffffff852071cf>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
[<ffffffff83618fb4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
[<ffffffff83618fb4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
[<ffffffff83618fb4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
[<ffffffff83618fb4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
[<ffffffff8361ddd9>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
[<ffffffff84763fc0>] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
[<ffffffff84763fc0>] xmit_one net/core/dev.c:3644 [inline]
[<ffffffff84763fc0>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
[<ffffffff8494c650>] sch_direct_xmit+0x2a0/0x9c0 net/sched/sch_generic.c:342
[<ffffffff8494d883>] qdisc_restart net/sched/sch_generic.c:407 [inline]
[<ffffffff8494d883>] __qdisc_run+0xb13/0x1e70 net/sched/sch_generic.c:415
[<ffffffff8478c426>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
[<ffffffff84796eac>] net_tx_action+0x7ac/0x940 net/core/dev.c:5247
[<ffffffff858002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:599
[<ffffffff814c3fe8>] invoke_softirq kernel/softirq.c:430 [inline]
[<ffffffff814c3fe8>] __irq_exit_rcu+0xc8/0x170 kernel/softirq.c:683
[<ffffffff814c3f09>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:695

Fixes: 7ad6848c7e81 ("ip: fix mc_loop checks for tunnels with multicast outer addresses")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230830101244.1146934-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/sock.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index fc475845c94d5..fa988063630db 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -761,7 +761,8 @@ bool sk_mc_loop(struct sock *sk)
 		return false;
 	if (!sk)
 		return true;
-	switch (sk->sk_family) {
+	/* IPV6_ADDRFORM can change sk->sk_family under us. */
+	switch (READ_ONCE(sk->sk_family)) {
 	case AF_INET:
 		return inet_sk(sk)->mc_loop;
 #if IS_ENABLED(CONFIG_IPV6)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 091/219] net: fib: avoid warn splat in flow dissector
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (89 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 090/219] net: read sk->sk_family once in sk_mc_loop() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 092/219] xsk: Fix xsk_diag use-after-free error during socket cleanup Greg Kroah-Hartman
                   ` (138 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Stanislav Fomichev, David Ahern,
	Ido Schimmel, Florian Westphal, Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 8aae7625ff3f0bd5484d01f1b8d5af82e44bec2d ]

New skbs allocated via nf_send_reset() have skb->dev == NULL.

fib*_rules_early_flow_dissect helpers already have a 'struct net'
argument but its not passed down to the flow dissector core, which
will then WARN as it can't derive a net namespace to use:

 WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0
 [..]
  ip_route_me_harder+0x143/0x330
  nf_send_reset+0x17c/0x2d0 [nf_reject_ipv4]
  nft_reject_inet_eval+0xa9/0xf2 [nft_reject_inet]
  nft_do_chain+0x198/0x5d0 [nf_tables]
  nft_do_chain_inet+0xa4/0x110 [nf_tables]
  nf_hook_slow+0x41/0xc0
  ip_local_deliver+0xce/0x110
  ..

Cc: Stanislav Fomichev <sdf@google.com>
Cc: David Ahern <dsahern@kernel.org>
Cc: Ido Schimmel <idosch@nvidia.com>
Fixes: 812fa71f0d96 ("netfilter: Dissect flow after packet mangling")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217826
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230830110043.30497-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/ip6_fib.h | 5 ++++-
 include/net/ip_fib.h  | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 6268963d95994..a92f6eb853068 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -610,7 +610,10 @@ static inline bool fib6_rules_early_flow_dissect(struct net *net,
 	if (!net->ipv6.fib6_rules_require_fldissect)
 		return false;
 
-	skb_flow_dissect_flow_keys(skb, flkeys, flag);
+	memset(flkeys, 0, sizeof(*flkeys));
+	__skb_flow_dissect(net, skb, &flow_keys_dissector,
+			   flkeys, NULL, 0, 0, 0, flag);
+
 	fl6->fl6_sport = flkeys->ports.src;
 	fl6->fl6_dport = flkeys->ports.dst;
 	fl6->flowi6_proto = flkeys->basic.ip_proto;
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index a378eff827c74..f0c13864180e2 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -418,7 +418,10 @@ static inline bool fib4_rules_early_flow_dissect(struct net *net,
 	if (!net->ipv4.fib_rules_require_fldissect)
 		return false;
 
-	skb_flow_dissect_flow_keys(skb, flkeys, flag);
+	memset(flkeys, 0, sizeof(*flkeys));
+	__skb_flow_dissect(net, skb, &flow_keys_dissector,
+			   flkeys, NULL, 0, 0, 0, flag);
+
 	fl4->fl4_sport = flkeys->ports.src;
 	fl4->fl4_dport = flkeys->ports.dst;
 	fl4->flowi4_proto = flkeys->basic.ip_proto;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 092/219] xsk: Fix xsk_diag use-after-free error during socket cleanup
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (90 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 091/219] net: fib: avoid warn splat in flow dissector Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 093/219] ceph: make members in struct ceph_mds_request_args_ext a union Greg Kroah-Hartman
                   ` (137 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot+822d1359297e2694f873,
	Magnus Karlsson, Daniel Borkmann, Maciej Fijalkowski, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Magnus Karlsson <magnus.karlsson@intel.com>

[ Upstream commit 3e019d8a05a38abb5c85d4f1e85fda964610aa14 ]

Fix a use-after-free error that is possible if the xsk_diag interface
is used after the socket has been unbound from the device. This can
happen either due to the socket being closed or the device
disappearing. In the early days of AF_XDP, the way we tested that a
socket was not bound to a device was to simply check if the netdevice
pointer in the xsk socket structure was NULL. Later, a better system
was introduced by having an explicit state variable in the xsk socket
struct. For example, the state of a socket that is on the way to being
closed and has been unbound from the device is XSK_UNBOUND.

The commit in the Fixes tag below deleted the old way of signalling
that a socket is unbound, setting dev to NULL. This in the belief that
all code using the old way had been exterminated. That was
unfortunately not true as the xsk diagnostics code was still using the
old way and thus does not work as intended when a socket is going
down. Fix this by introducing a test against the state variable. If
the socket is in the state XSK_UNBOUND, simply abort the diagnostic's
netlink operation.

Fixes: 18b1ab7aa76b ("xsk: Fix race at socket teardown")
Reported-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20230831100119.17408-1-magnus.karlsson@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xdp/xsk_diag.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/xdp/xsk_diag.c b/net/xdp/xsk_diag.c
index c014217f5fa7d..22b36c8143cfd 100644
--- a/net/xdp/xsk_diag.c
+++ b/net/xdp/xsk_diag.c
@@ -111,6 +111,9 @@ static int xsk_diag_fill(struct sock *sk, struct sk_buff *nlskb,
 	sock_diag_save_cookie(sk, msg->xdiag_cookie);
 
 	mutex_lock(&xs->mutex);
+	if (READ_ONCE(xs->state) == XSK_UNBOUND)
+		goto out_nlmsg_trim;
+
 	if ((req->xdiag_show & XDP_SHOW_INFO) && xsk_diag_put_info(xs, nlskb))
 		goto out_nlmsg_trim;
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 093/219] ceph: make members in struct ceph_mds_request_args_ext a union
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (91 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 092/219] xsk: Fix xsk_diag use-after-free error during socket cleanup Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 094/219] drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page" Greg Kroah-Hartman
                   ` (136 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Xiubo Li, Milind Changire,
	Ilya Dryomov, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Xiubo Li <xiubli@redhat.com>

[ Upstream commit 3af5ae22030cb59fab4fba35f5a2b62f47e14df9 ]

In ceph mainline it will allow to set the btime in the setattr request
and just add a 'btime' member in the union 'ceph_mds_request_args' and
then bump up the header version to 4. That means the total size of union
'ceph_mds_request_args' will increase sizeof(struct ceph_timespec) bytes,
but in kclient it will increase the sizeof(setattr_ext) bytes for each
request.

Since the MDS will always depend on the header's vesion and front_len
members to decode the 'ceph_mds_request_head' struct, at the same time
kclient hasn't supported the 'btime' feature yet in setattr request,
so it's safe to do this change here.

This will save 48 bytes memories for each request.

Fixes: 4f1ddb1ea874 ("ceph: implement updated ceph_mds_request_head structure")
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Milind Changire <mchangir@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/ceph/ceph_fs.h | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h
index 49586ff261520..b4fa2a25b7d95 100644
--- a/include/linux/ceph/ceph_fs.h
+++ b/include/linux/ceph/ceph_fs.h
@@ -462,17 +462,19 @@ union ceph_mds_request_args {
 } __attribute__ ((packed));
 
 union ceph_mds_request_args_ext {
-	union ceph_mds_request_args old;
-	struct {
-		__le32 mode;
-		__le32 uid;
-		__le32 gid;
-		struct ceph_timespec mtime;
-		struct ceph_timespec atime;
-		__le64 size, old_size;       /* old_size needed by truncate */
-		__le32 mask;                 /* CEPH_SETATTR_* */
-		struct ceph_timespec btime;
-	} __attribute__ ((packed)) setattr_ext;
+	union {
+		union ceph_mds_request_args old;
+		struct {
+			__le32 mode;
+			__le32 uid;
+			__le32 gid;
+			struct ceph_timespec mtime;
+			struct ceph_timespec atime;
+			__le64 size, old_size;       /* old_size needed by truncate */
+			__le32 mask;                 /* CEPH_SETATTR_* */
+			struct ceph_timespec btime;
+		} __attribute__ ((packed)) setattr_ext;
+	};
 };
 
 #define CEPH_MDS_FLAG_REPLAY		1 /* this is a replayed op */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 094/219] drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page"
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (92 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 093/219] ceph: make members in struct ceph_mds_request_args_ext a union Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 095/219] drm/i915/gvt: Put the page reference obtained by KVMs gfn_to_pfn() Greg Kroah-Hartman
                   ` (135 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yan Zhao, Yongwei Ma, Zhi Wang,
	Sean Christopherson, Paolo Bonzini, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit f046923af79158361295ed4f0a588c80b9fdcc1d ]

Check that the pfn found by gfn_to_pfn() is actually backed by "struct
page" memory prior to retrieving and dereferencing the page.  KVM
supports backing guest memory with VM_PFNMAP, VM_IO, etc., and so
there is no guarantee the pfn returned by gfn_to_pfn() has an associated
"struct page".

Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support")
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20230729013535.1070024-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/i915/gvt/gtt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 80c60754a5c1c..92462cd4bf7cc 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -1188,6 +1188,10 @@ static int is_2MB_gtt_possible(struct intel_vgpu *vgpu,
 	pfn = gfn_to_pfn(vgpu->vfio_device.kvm, ops->get_pfn(entry));
 	if (is_error_noslot_pfn(pfn))
 		return -EINVAL;
+
+	if (!pfn_valid(pfn))
+		return -EINVAL;
+
 	return PageTransHuge(pfn_to_page(pfn));
 }
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 095/219] drm/i915/gvt: Put the page reference obtained by KVMs gfn_to_pfn()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (93 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 094/219] drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page" Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 096/219] drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt() Greg Kroah-Hartman
                   ` (134 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yan Zhao, Yongwei Ma, Zhi Wang,
	Sean Christopherson, Paolo Bonzini, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit 708e49583d7da863898b25dafe4bcd799c414278 ]

Put the struct page reference acquired by gfn_to_pfn(), KVM's API is that
the caller is ultimately responsible for dropping any reference.

Note, kvm_release_pfn_clean() ensures the pfn is actually a refcounted
struct page before trying to put any references.

Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support")
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20230729013535.1070024-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/i915/gvt/gtt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 92462cd4bf7cc..980d671ac3595 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -1179,6 +1179,7 @@ static int is_2MB_gtt_possible(struct intel_vgpu *vgpu,
 {
 	const struct intel_gvt_gtt_pte_ops *ops = vgpu->gvt->gtt.pte_ops;
 	kvm_pfn_t pfn;
+	int ret;
 
 	if (!HAS_PAGE_SIZES(vgpu->gvt->gt->i915, I915_GTT_PAGE_SIZE_2M))
 		return 0;
@@ -1192,7 +1193,9 @@ static int is_2MB_gtt_possible(struct intel_vgpu *vgpu,
 	if (!pfn_valid(pfn))
 		return -EINVAL;
 
-	return PageTransHuge(pfn_to_page(pfn));
+	ret = PageTransHuge(pfn_to_page(pfn));
+	kvm_release_pfn_clean(pfn);
+	return ret;
 }
 
 static int split_2MB_gtt_entry(struct intel_vgpu *vgpu,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 096/219] drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (94 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 095/219] drm/i915/gvt: Put the page reference obtained by KVMs gfn_to_pfn() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 097/219] net: use sk_forward_alloc_get() in sk_get_meminfo() Greg Kroah-Hartman
                   ` (133 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yan Zhao, Yongwei Ma, Zhi Wang,
	Sean Christopherson, Paolo Bonzini, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

[ Upstream commit a90c367e5af63880008e21dd199dac839e0e9e0f ]

Drop intel_vgpu_reset_gtt() as it no longer has any callers.  In addition
to eliminating dead code, this eliminates the last possible scenario where
__kvmgt_protect_table_find() can be reached without holding vgpu_lock.
Requiring vgpu_lock to be held when calling __kvmgt_protect_table_find()
will allow a protecting the gfn hash with vgpu_lock without too much fuss.

No functional change intended.

Fixes: ba25d977571e ("drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.")
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yongwei Ma <yongwei.ma@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20230729013535.1070024-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/i915/gvt/gtt.c | 18 ------------------
 drivers/gpu/drm/i915/gvt/gtt.h |  1 -
 2 files changed, 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 980d671ac3595..79bf1be68d8cf 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -2887,24 +2887,6 @@ void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu, bool invalidate_old)
 	ggtt_invalidate(gvt->gt);
 }
 
-/**
- * intel_vgpu_reset_gtt - reset the all GTT related status
- * @vgpu: a vGPU
- *
- * This function is called from vfio core to reset reset all
- * GTT related status, including GGTT, PPGTT, scratch page.
- *
- */
-void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu)
-{
-	/* Shadow pages are only created when there is no page
-	 * table tracking data, so remove page tracking data after
-	 * removing the shadow pages.
-	 */
-	intel_vgpu_destroy_all_ppgtt_mm(vgpu);
-	intel_vgpu_reset_ggtt(vgpu, true);
-}
-
 /**
  * intel_gvt_restore_ggtt - restore all vGPU's ggtt entries
  * @gvt: intel gvt device
diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h
index a3b0f59ec8bd9..4cb183e06e95a 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.h
+++ b/drivers/gpu/drm/i915/gvt/gtt.h
@@ -224,7 +224,6 @@ void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu, bool invalidate_old);
 void intel_vgpu_invalidate_ppgtt(struct intel_vgpu *vgpu);
 
 int intel_gvt_init_gtt(struct intel_gvt *gvt);
-void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu);
 void intel_gvt_clean_gtt(struct intel_gvt *gvt);
 
 struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 097/219] net: use sk_forward_alloc_get() in sk_get_meminfo()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (95 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 096/219] drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 098/219] net: annotate data-races around sk->sk_forward_alloc Greg Kroah-Hartman
                   ` (132 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 66d58f046c9d3a8f996b7138d02e965fd0617de0 ]

inet_sk_diag_fill() has been changed to use sk_forward_alloc_get(),
but sk_get_meminfo() was forgotten.

Fixes: 292e6077b040 ("net: introduce sk_forward_alloc_get()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index fa988063630db..6ff58fa5f41ed 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3715,7 +3715,7 @@ void sk_get_meminfo(const struct sock *sk, u32 *mem)
 	mem[SK_MEMINFO_RCVBUF] = READ_ONCE(sk->sk_rcvbuf);
 	mem[SK_MEMINFO_WMEM_ALLOC] = sk_wmem_alloc_get(sk);
 	mem[SK_MEMINFO_SNDBUF] = READ_ONCE(sk->sk_sndbuf);
-	mem[SK_MEMINFO_FWD_ALLOC] = sk->sk_forward_alloc;
+	mem[SK_MEMINFO_FWD_ALLOC] = sk_forward_alloc_get(sk);
 	mem[SK_MEMINFO_WMEM_QUEUED] = READ_ONCE(sk->sk_wmem_queued);
 	mem[SK_MEMINFO_OPTMEM] = atomic_read(&sk->sk_omem_alloc);
 	mem[SK_MEMINFO_BACKLOG] = READ_ONCE(sk->sk_backlog.len);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 098/219] net: annotate data-races around sk->sk_forward_alloc
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (96 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 097/219] net: use sk_forward_alloc_get() in sk_get_meminfo() Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 099/219] mptcp: annotate data-races around msk->rmem_fwd_alloc Greg Kroah-Hartman
                   ` (131 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 5e6300e7b3a4ab5b72a82079753868e91fbf9efc ]

Every time sk->sk_forward_alloc is read locklessly,
add a READ_ONCE().

Add sk_forward_alloc_add() helper to centralize updates,
to reduce number of WRITE_ONCE().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/sock.h    | 12 +++++++++---
 net/core/sock.c       |  8 ++++----
 net/ipv4/tcp_output.c |  2 +-
 net/ipv4/udp.c        |  6 +++---
 net/mptcp/protocol.c  |  6 +++---
 5 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index d1f936ed97556..fe695e8bfe289 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1049,6 +1049,12 @@ static inline void sk_wmem_queued_add(struct sock *sk, int val)
 	WRITE_ONCE(sk->sk_wmem_queued, sk->sk_wmem_queued + val);
 }
 
+static inline void sk_forward_alloc_add(struct sock *sk, int val)
+{
+	/* Paired with lockless reads of sk->sk_forward_alloc */
+	WRITE_ONCE(sk->sk_forward_alloc, sk->sk_forward_alloc + val);
+}
+
 void sk_stream_write_space(struct sock *sk);
 
 /* OOB backlog add */
@@ -1401,7 +1407,7 @@ static inline int sk_forward_alloc_get(const struct sock *sk)
 	if (sk->sk_prot->forward_alloc_get)
 		return sk->sk_prot->forward_alloc_get(sk);
 #endif
-	return sk->sk_forward_alloc;
+	return READ_ONCE(sk->sk_forward_alloc);
 }
 
 static inline bool __sk_stream_memory_free(const struct sock *sk, int wake)
@@ -1697,14 +1703,14 @@ static inline void sk_mem_charge(struct sock *sk, int size)
 {
 	if (!sk_has_account(sk))
 		return;
-	sk->sk_forward_alloc -= size;
+	sk_forward_alloc_add(sk, -size);
 }
 
 static inline void sk_mem_uncharge(struct sock *sk, int size)
 {
 	if (!sk_has_account(sk))
 		return;
-	sk->sk_forward_alloc += size;
+	sk_forward_alloc_add(sk, size);
 	sk_mem_reclaim(sk);
 }
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 6ff58fa5f41ed..aa628c6314f64 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1034,7 +1034,7 @@ static int sock_reserve_memory(struct sock *sk, int bytes)
 		mem_cgroup_uncharge_skmem(sk->sk_memcg, pages);
 		return -ENOMEM;
 	}
-	sk->sk_forward_alloc += pages << PAGE_SHIFT;
+	sk_forward_alloc_add(sk, pages << PAGE_SHIFT);
 
 	WRITE_ONCE(sk->sk_reserved_mem,
 		   sk->sk_reserved_mem + (pages << PAGE_SHIFT));
@@ -3082,10 +3082,10 @@ int __sk_mem_schedule(struct sock *sk, int size, int kind)
 {
 	int ret, amt = sk_mem_pages(size);
 
-	sk->sk_forward_alloc += amt << PAGE_SHIFT;
+	sk_forward_alloc_add(sk, amt << PAGE_SHIFT);
 	ret = __sk_mem_raise_allocated(sk, size, amt, kind);
 	if (!ret)
-		sk->sk_forward_alloc -= amt << PAGE_SHIFT;
+		sk_forward_alloc_add(sk, -(amt << PAGE_SHIFT));
 	return ret;
 }
 EXPORT_SYMBOL(__sk_mem_schedule);
@@ -3117,7 +3117,7 @@ void __sk_mem_reduce_allocated(struct sock *sk, int amount)
 void __sk_mem_reclaim(struct sock *sk, int amount)
 {
 	amount >>= PAGE_SHIFT;
-	sk->sk_forward_alloc -= amount << PAGE_SHIFT;
+	sk_forward_alloc_add(sk, -(amount << PAGE_SHIFT));
 	__sk_mem_reduce_allocated(sk, amount);
 }
 EXPORT_SYMBOL(__sk_mem_reclaim);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 26bd039f9296f..dc3166e56169f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3380,7 +3380,7 @@ void sk_forced_mem_schedule(struct sock *sk, int size)
 	if (delta <= 0)
 		return;
 	amt = sk_mem_pages(delta);
-	sk->sk_forward_alloc += amt << PAGE_SHIFT;
+	sk_forward_alloc_add(sk, amt << PAGE_SHIFT);
 	sk_memory_allocated_add(sk, amt);
 
 	if (mem_cgroup_sockets_enabled && sk->sk_memcg)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 42c1f7d9a980a..b2aa7777521f6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1474,9 +1474,9 @@ static void udp_rmem_release(struct sock *sk, int size, int partial,
 		spin_lock(&sk_queue->lock);
 
 
-	sk->sk_forward_alloc += size;
+	sk_forward_alloc_add(sk, size);
 	amt = (sk->sk_forward_alloc - partial) & ~(PAGE_SIZE - 1);
-	sk->sk_forward_alloc -= amt;
+	sk_forward_alloc_add(sk, -amt);
 
 	if (amt)
 		__sk_mem_reduce_allocated(sk, amt >> PAGE_SHIFT);
@@ -1582,7 +1582,7 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 		sk->sk_forward_alloc += delta;
 	}
 
-	sk->sk_forward_alloc -= size;
+	sk_forward_alloc_add(sk, -size);
 
 	/* no need to setup a destructor, we will explicitly release the
 	 * forward allocated memory on dequeue
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 61fefa1a82db2..573db9c2bc1cd 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1802,7 +1802,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		}
 
 		/* data successfully copied into the write queue */
-		sk->sk_forward_alloc -= total_ts;
+		sk_forward_alloc_add(sk, -total_ts);
 		copied += psize;
 		dfrag->data_len += psize;
 		frag_truesize += psize;
@@ -3278,7 +3278,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags)
 	/* move all the rx fwd alloc into the sk_mem_reclaim_final in
 	 * inet_sock_destruct() will dispose it
 	 */
-	sk->sk_forward_alloc += msk->rmem_fwd_alloc;
+	sk_forward_alloc_add(sk, msk->rmem_fwd_alloc);
 	msk->rmem_fwd_alloc = 0;
 	mptcp_token_destroy(msk);
 	mptcp_pm_free_anno_list(msk);
@@ -3562,7 +3562,7 @@ static void mptcp_shutdown(struct sock *sk, int how)
 
 static int mptcp_forward_alloc_get(const struct sock *sk)
 {
-	return sk->sk_forward_alloc + mptcp_sk(sk)->rmem_fwd_alloc;
+	return READ_ONCE(sk->sk_forward_alloc) + mptcp_sk(sk)->rmem_fwd_alloc;
 }
 
 static int mptcp_ioctl_outq(const struct mptcp_sock *msk, u64 v)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 099/219] mptcp: annotate data-races around msk->rmem_fwd_alloc
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (97 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 098/219] net: annotate data-races around sk->sk_forward_alloc Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 100/219] ipv4: ignore dst hint for multipath routes Greg Kroah-Hartman
                   ` (130 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Paolo Abeni,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 9531e4a83febc3fb47ac77e24cfb5ea97e50034d ]

msk->rmem_fwd_alloc can be read locklessly.

Add mptcp_rmem_fwd_alloc_add(), similar to sk_forward_alloc_add(),
and appropriate READ_ONCE()/WRITE_ONCE() annotations.

Fixes: 6511882cdd82 ("mptcp: allocate fwd memory separately on the rx and tx path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/mptcp/protocol.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 573db9c2bc1cd..6dd880d6b0518 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -131,9 +131,15 @@ static void mptcp_drop(struct sock *sk, struct sk_buff *skb)
 	__kfree_skb(skb);
 }
 
+static void mptcp_rmem_fwd_alloc_add(struct sock *sk, int size)
+{
+	WRITE_ONCE(mptcp_sk(sk)->rmem_fwd_alloc,
+		   mptcp_sk(sk)->rmem_fwd_alloc + size);
+}
+
 static void mptcp_rmem_charge(struct sock *sk, int size)
 {
-	mptcp_sk(sk)->rmem_fwd_alloc -= size;
+	mptcp_rmem_fwd_alloc_add(sk, -size);
 }
 
 static bool mptcp_try_coalesce(struct sock *sk, struct sk_buff *to,
@@ -174,7 +180,7 @@ static bool mptcp_ooo_try_coalesce(struct mptcp_sock *msk, struct sk_buff *to,
 static void __mptcp_rmem_reclaim(struct sock *sk, int amount)
 {
 	amount >>= PAGE_SHIFT;
-	mptcp_sk(sk)->rmem_fwd_alloc -= amount << PAGE_SHIFT;
+	mptcp_rmem_charge(sk, amount << PAGE_SHIFT);
 	__sk_mem_reduce_allocated(sk, amount);
 }
 
@@ -183,7 +189,7 @@ static void mptcp_rmem_uncharge(struct sock *sk, int size)
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	int reclaimable;
 
-	msk->rmem_fwd_alloc += size;
+	mptcp_rmem_fwd_alloc_add(sk, size);
 	reclaimable = msk->rmem_fwd_alloc - sk_unused_reserved_mem(sk);
 
 	/* see sk_mem_uncharge() for the rationale behind the following schema */
@@ -338,7 +344,7 @@ static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int size)
 	if (!__sk_mem_raise_allocated(sk, size, amt, SK_MEM_RECV))
 		return false;
 
-	msk->rmem_fwd_alloc += amount;
+	mptcp_rmem_fwd_alloc_add(sk, amount);
 	return true;
 }
 
@@ -3279,7 +3285,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags)
 	 * inet_sock_destruct() will dispose it
 	 */
 	sk_forward_alloc_add(sk, msk->rmem_fwd_alloc);
-	msk->rmem_fwd_alloc = 0;
+	WRITE_ONCE(msk->rmem_fwd_alloc, 0);
 	mptcp_token_destroy(msk);
 	mptcp_pm_free_anno_list(msk);
 	mptcp_free_local_addr_list(msk);
@@ -3562,7 +3568,8 @@ static void mptcp_shutdown(struct sock *sk, int how)
 
 static int mptcp_forward_alloc_get(const struct sock *sk)
 {
-	return READ_ONCE(sk->sk_forward_alloc) + mptcp_sk(sk)->rmem_fwd_alloc;
+	return READ_ONCE(sk->sk_forward_alloc) +
+	       READ_ONCE(mptcp_sk(sk)->rmem_fwd_alloc);
 }
 
 static int mptcp_ioctl_outq(const struct mptcp_sock *msk, u64 v)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 100/219] ipv4: ignore dst hint for multipath routes
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (98 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 099/219] mptcp: annotate data-races around msk->rmem_fwd_alloc Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 101/219] ipv6: " Greg Kroah-Hartman
                   ` (129 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Sriram Yagnaraman, Ido Schimmel,
	David Ahern, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

[ Upstream commit 6ac66cb03ae306c2e288a9be18226310529f5b25 ]

Route hints when the nexthop is part of a multipath group causes packets
in the same receive batch to be sent to the same nexthop irrespective of
the multipath hash of the packet. So, do not extract route hint for
packets whose destination is part of a multipath group.

A new SKB flag IPSKB_MULTIPATH is introduced for this purpose, set the
flag when route is looked up in ip_mkroute_input() and use it in
ip_extract_route_hint() to check for the existence of the flag.

Fixes: 02b24941619f ("ipv4: use dst hint for ipv4 list receive")
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/ip.h    | 1 +
 net/ipv4/ip_input.c | 3 ++-
 net/ipv4/route.c    | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 1872f570abeda..c286344628dba 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -57,6 +57,7 @@ struct inet_skb_parm {
 #define IPSKB_FRAG_PMTU		BIT(6)
 #define IPSKB_L3SLAVE		BIT(7)
 #define IPSKB_NOPOLICY		BIT(8)
+#define IPSKB_MULTIPATH		BIT(9)
 
 	u16			frag_max_size;
 };
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index e880ce77322aa..e7196ecffafc6 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -584,7 +584,8 @@ static void ip_sublist_rcv_finish(struct list_head *head)
 static struct sk_buff *ip_extract_route_hint(const struct net *net,
 					     struct sk_buff *skb, int rt_type)
 {
-	if (fib4_has_custom_rules(net) || rt_type == RTN_BROADCAST)
+	if (fib4_has_custom_rules(net) || rt_type == RTN_BROADCAST ||
+	    IPCB(skb)->flags & IPSKB_MULTIPATH)
 		return NULL;
 
 	return skb;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 51bd9a50a1d1d..a04ffc128e22b 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2146,6 +2146,7 @@ static int ip_mkroute_input(struct sk_buff *skb,
 		int h = fib_multipath_hash(res->fi->fib_net, NULL, skb, hkeys);
 
 		fib_select_multipath(res, h);
+		IPCB(skb)->flags |= IPSKB_MULTIPATH;
 	}
 #endif
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 101/219] ipv6: ignore dst hint for multipath routes
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (99 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 100/219] ipv4: ignore dst hint for multipath routes Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 102/219] igb: disable virtualization features on 82580 Greg Kroah-Hartman
                   ` (128 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Sriram Yagnaraman, Ido Schimmel,
	David Ahern, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sriram Yagnaraman <sriram.yagnaraman@est.tech>

[ Upstream commit 8423be8926aa82cd2e28bba5cc96ccb72c7ce6be ]

Route hints when the nexthop is part of a multipath group causes packets
in the same receive batch to be sent to the same nexthop irrespective of
the multipath hash of the packet. So, do not extract route hint for
packets whose destination is part of a multipath group.

A new SKB flag IP6SKB_MULTIPATH is introduced for this purpose, set the
flag when route is looked up in fib6_select_path() and use it in
ip6_can_use_hint() to check for the existence of the flag.

Fixes: 197dbf24e360 ("ipv6: introduce and uses route look hints for list input.")
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/ipv6.h | 1 +
 net/ipv6/ip6_input.c | 3 ++-
 net/ipv6/route.c     | 3 +++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 37dfdcfcdd542..15d7529ac9534 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -146,6 +146,7 @@ struct inet6_skb_parm {
 #define IP6SKB_JUMBOGRAM      128
 #define IP6SKB_SEG6	      256
 #define IP6SKB_FAKEJUMBO      512
+#define IP6SKB_MULTIPATH      1024
 };
 
 #if defined(CONFIG_NET_L3_MASTER_DEV)
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index d94041bb42872..b8378814532ce 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -99,7 +99,8 @@ static bool ip6_can_use_hint(const struct sk_buff *skb,
 static struct sk_buff *ip6_extract_route_hint(const struct net *net,
 					      struct sk_buff *skb)
 {
-	if (fib6_routes_require_src(net) || fib6_has_custom_rules(net))
+	if (fib6_routes_require_src(net) || fib6_has_custom_rules(net) ||
+	    IP6CB(skb)->flags & IP6SKB_MULTIPATH)
 		return NULL;
 
 	return skb;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 960ab43a49c46..93957b20fccce 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -425,6 +425,9 @@ void fib6_select_path(const struct net *net, struct fib6_result *res,
 	if (match->nh && have_oif_match && res->nh)
 		return;
 
+	if (skb)
+		IP6CB(skb)->flags |= IP6SKB_MULTIPATH;
+
 	/* We might have already computed the hash for ICMPv6 errors. In such
 	 * case it will always be non-zero. Otherwise now is the time to do it.
 	 */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 102/219] igb: disable virtualization features on 82580
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (100 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 101/219] ipv6: " Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 103/219] gve: fix frag_list chaining Greg Kroah-Hartman
                   ` (127 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Corinna Vinschen, Simon Horman,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Corinna Vinschen <vinschen@redhat.com>

[ Upstream commit fa09bc40b21a33937872c4c4cf0f266ec9fa4869 ]

Disable virtualization features on 82580 just as on i210/i211.
This avoids that virt functions are acidentally called on 82850.

Fixes: 55cac248caa4 ("igb: Add full support for 82580 devices")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index d0ead18ec0266..45ce4ed16146e 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3877,8 +3877,9 @@ static void igb_probe_vfs(struct igb_adapter *adapter)
 	struct pci_dev *pdev = adapter->pdev;
 	struct e1000_hw *hw = &adapter->hw;
 
-	/* Virtualization features not supported on i210 family. */
-	if ((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211))
+	/* Virtualization features not supported on i210 and 82580 family. */
+	if ((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211) ||
+	    (hw->mac.type == e1000_82580))
 		return;
 
 	/* Of the below we really only want the effect of getting
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 103/219] gve: fix frag_list chaining
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (101 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 102/219] igb: disable virtualization features on 82580 Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 104/219] veth: Fixing transmit return status for dropped packets Greg Kroah-Hartman
                   ` (126 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Bailey Forrest,
	Willem de Bruijn, Catherine Sullivan, David Ahern,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 817c7cd2043a83a3d8147f40eea1505ac7300b62 ]

gve_rx_append_frags() is able to build skbs chained with frag_list,
like GRO engine.

Problem is that shinfo->frag_list should only be used
for the head of the chain.

All other links should use skb->next pointer.

Otherwise, built skbs are not valid and can cause crashes.

Equivalent code in GRO (skb_gro_receive()) is:

    if (NAPI_GRO_CB(p)->last == p)
        skb_shinfo(p)->frag_list = skb;
    else
        NAPI_GRO_CB(p)->last->next = skb;
    NAPI_GRO_CB(p)->last = skb;

Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bailey Forrest <bcf@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Catherine Sullivan <csully@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/google/gve/gve_rx_dqo.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
index 2e6461b0ea8bc..a9409e3721ad7 100644
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -492,7 +492,10 @@ static int gve_rx_append_frags(struct napi_struct *napi,
 		if (!skb)
 			return -1;
 
-		skb_shinfo(rx->ctx.skb_tail)->frag_list = skb;
+		if (rx->ctx.skb_tail == rx->ctx.skb_head)
+			skb_shinfo(rx->ctx.skb_head)->frag_list = skb;
+		else
+			rx->ctx.skb_tail->next = skb;
 		rx->ctx.skb_tail = skb;
 		num_frags = 0;
 	}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 104/219] veth: Fixing transmit return status for dropped packets
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (102 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 103/219] gve: fix frag_list chaining Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 105/219] net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr Greg Kroah-Hartman
                   ` (125 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Liang Chen, Eric Dumazet,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liang Chen <liangchen.linux@gmail.com>

[ Upstream commit 151e887d8ff97e2e42110ffa1fb1e6a2128fb364 ]

The veth_xmit function returns NETDEV_TX_OK even when packets are dropped.
This behavior leads to incorrect calculations of statistics counts, as
well as things like txq->trans_start updates.

Fixes: e314dbdc1c0d ("[NET]: Virtual ethernet device driver.")
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/veth.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 727b9278b9fe5..36c5a41f84e44 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -313,6 +313,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
 	struct veth_rq *rq = NULL;
+	int ret = NETDEV_TX_OK;
 	struct net_device *rcv;
 	int length = skb->len;
 	bool use_napi = false;
@@ -345,6 +346,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 	} else {
 drop:
 		atomic64_inc(&priv->dropped);
+		ret = NET_XMIT_DROP;
 	}
 
 	if (use_napi)
@@ -352,7 +354,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	rcu_read_unlock();
 
-	return NETDEV_TX_OK;
+	return ret;
 }
 
 static u64 veth_stats_tx(struct net_device *dev, u64 *packets, u64 *bytes)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 105/219] net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (103 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 104/219] veth: Fixing transmit return status for dropped packets Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 106/219] net: phy: micrel: Correct bit assignments for phy_device flags Greg Kroah-Hartman
                   ` (124 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Alex Henrie, David Ahern,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Alex Henrie <alexhenrie24@gmail.com>

[ Upstream commit f31867d0d9d82af757c1e0178b659438f4c1ea3c ]

The existing code incorrectly casted a negative value (the result of a
subtraction) to an unsigned value without checking. For example, if
/proc/sys/net/ipv6/conf/*/temp_prefered_lft was set to 1, the preferred
lifetime would jump to 4 billion seconds. On my machine and network the
shortest lifetime that avoided underflow was 3 seconds.

Fixes: 76506a986dc3 ("IPv6: fix DESYNC_FACTOR")
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv6/addrconf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 48a6486951cd6..83be842198244 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1368,7 +1368,7 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp, bool block)
 	 * idev->desync_factor if it's larger
 	 */
 	cnf_temp_preferred_lft = READ_ONCE(idev->cnf.temp_prefered_lft);
-	max_desync_factor = min_t(__u32,
+	max_desync_factor = min_t(long,
 				  idev->cnf.max_desync_factor,
 				  cnf_temp_preferred_lft - regen_advance);
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 106/219] net: phy: micrel: Correct bit assignments for phy_device flags
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (104 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 105/219] net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 107/219] bpf, sockmap: Fix skb refcnt race after locking changes Greg Kroah-Hartman
                   ` (123 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Oleksij Rempel,
	Russell King (Oracle), David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Oleksij Rempel <o.rempel@pengutronix.de>

[ Upstream commit 719c5e37e99d2fd588d1c994284d17650a66354c ]

Previously, the defines for phy_device flags in the Micrel driver were
ambiguous in their representation. They were intended to be bit masks
but were mistakenly defined as bit positions. This led to the following
issues:

- MICREL_KSZ8_P1_ERRATA, designated for KSZ88xx switches, overlapped
  with MICREL_PHY_FXEN and MICREL_PHY_50MHZ_CLK.
- Due to this overlap, the code path for MICREL_PHY_FXEN, tailored for
  the KSZ8041 PHY, was not executed for KSZ88xx PHYs.
- Similarly, the code associated with MICREL_PHY_50MHZ_CLK wasn't
  triggered for KSZ88xx.

To rectify this, all three flags have now been explicitly converted to
use the `BIT()` macro, ensuring they are defined as bit masks and
preventing potential overlaps in the future.

Fixes: 49011e0c1555 ("net: phy: micrel: ksz886x/ksz8081: add cabletest support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/micrel_phy.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/micrel_phy.h b/include/linux/micrel_phy.h
index 1f7c33b2f5a3f..e164facb0f363 100644
--- a/include/linux/micrel_phy.h
+++ b/include/linux/micrel_phy.h
@@ -38,9 +38,9 @@
 #define	PHY_ID_KSZ9477		0x00221631
 
 /* struct phy_device dev_flags definitions */
-#define MICREL_PHY_50MHZ_CLK	0x00000001
-#define MICREL_PHY_FXEN		0x00000002
-#define MICREL_KSZ8_P1_ERRATA	0x00000003
+#define MICREL_PHY_50MHZ_CLK	BIT(0)
+#define MICREL_PHY_FXEN		BIT(1)
+#define MICREL_KSZ8_P1_ERRATA	BIT(2)
 
 #define MICREL_KSZ9021_EXTREG_CTRL	0xB
 #define MICREL_KSZ9021_EXTREG_DATA_WRITE	0xC
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 107/219] bpf, sockmap: Fix skb refcnt race after locking changes
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (105 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 106/219] net: phy: micrel: Correct bit assignments for phy_device flags Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 108/219] af_unix: Fix data-races around user->unix_inflight Greg Kroah-Hartman
                   ` (122 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jiri Olsa, John Fastabend,
	Daniel Borkmann, Xu Kuohai, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: John Fastabend <john.fastabend@gmail.com>

[ Upstream commit a454d84ee20baf7bd7be90721b9821f73c7d23d9 ]

There is a race where skb's from the sk_psock_backlog can be referenced
after userspace side has already skb_consumed() the sk_buff and its refcnt
dropped to zer0 causing use after free.

The flow is the following:

  while ((skb = skb_peek(&psock->ingress_skb))
    sk_psock_handle_Skb(psock, skb, ..., ingress)
    if (!ingress) ...
    sk_psock_skb_ingress
       sk_psock_skb_ingress_enqueue(skb)
          msg->skb = skb
          sk_psock_queue_msg(psock, msg)
    skb_dequeue(&psock->ingress_skb)

The sk_psock_queue_msg() puts the msg on the ingress_msg queue. This is
what the application reads when recvmsg() is called. An application can
read this anytime after the msg is placed on the queue. The recvmsg hook
will also read msg->skb and then after user space reads the msg will call
consume_skb(skb) on it effectively free'ing it.

But, the race is in above where backlog queue still has a reference to
the skb and calls skb_dequeue(). If the skb_dequeue happens after the
user reads and free's the skb we have a use after free.

The !ingress case does not suffer from this problem because it uses
sendmsg_*(sk, msg) which does not pass the sk_buff further down the
stack.

The following splat was observed with 'test_progs -t sockmap_listen':

  [ 1022.710250][ T2556] general protection fault, ...
  [...]
  [ 1022.712830][ T2556] Workqueue: events sk_psock_backlog
  [ 1022.713262][ T2556] RIP: 0010:skb_dequeue+0x4c/0x80
  [ 1022.713653][ T2556] Code: ...
  [...]
  [ 1022.720699][ T2556] Call Trace:
  [ 1022.720984][ T2556]  <TASK>
  [ 1022.721254][ T2556]  ? die_addr+0x32/0x80^M
  [ 1022.721589][ T2556]  ? exc_general_protection+0x25a/0x4b0
  [ 1022.722026][ T2556]  ? asm_exc_general_protection+0x22/0x30
  [ 1022.722489][ T2556]  ? skb_dequeue+0x4c/0x80
  [ 1022.722854][ T2556]  sk_psock_backlog+0x27a/0x300
  [ 1022.723243][ T2556]  process_one_work+0x2a7/0x5b0
  [ 1022.723633][ T2556]  worker_thread+0x4f/0x3a0
  [ 1022.723998][ T2556]  ? __pfx_worker_thread+0x10/0x10
  [ 1022.724386][ T2556]  kthread+0xfd/0x130
  [ 1022.724709][ T2556]  ? __pfx_kthread+0x10/0x10
  [ 1022.725066][ T2556]  ret_from_fork+0x2d/0x50
  [ 1022.725409][ T2556]  ? __pfx_kthread+0x10/0x10
  [ 1022.725799][ T2556]  ret_from_fork_asm+0x1b/0x30
  [ 1022.726201][ T2556]  </TASK>

To fix we add an skb_get() before passing the skb to be enqueued in the
engress queue. This bumps the skb->users refcnt so that consume_skb()
and kfree_skb will not immediately free the sk_buff. With this we can
be sure the skb is still around when we do the dequeue. Then we just
need to decrement the refcnt or free the skb in the backlog case which
we do by calling kfree_skb() on the ingress case as well as the sendmsg
case.

Before locking change from fixes tag we had the sock locked so we
couldn't race with user and there was no issue here.

Fixes: 799aa7f98d53e ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
Reported-by: Jiri Olsa  <jolsa@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Xu Kuohai <xukuohai@huawei.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230901202137.214666-1-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/skmsg.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 296e45b6c3c0d..a5c1f67dc96ec 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -611,12 +611,18 @@ static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb
 static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
 			       u32 off, u32 len, bool ingress)
 {
+	int err = 0;
+
 	if (!ingress) {
 		if (!sock_writeable(psock->sk))
 			return -EAGAIN;
 		return skb_send_sock(psock->sk, skb, off, len);
 	}
-	return sk_psock_skb_ingress(psock, skb, off, len);
+	skb_get(skb);
+	err = sk_psock_skb_ingress(psock, skb, off, len);
+	if (err < 0)
+		kfree_skb(skb);
+	return err;
 }
 
 static void sk_psock_skb_state(struct sk_psock *psock,
@@ -684,9 +690,7 @@ static void sk_psock_backlog(struct work_struct *work)
 		} while (len);
 
 		skb = skb_dequeue(&psock->ingress_skb);
-		if (!ingress) {
-			kfree_skb(skb);
-		}
+		kfree_skb(skb);
 	}
 end:
 	mutex_unlock(&psock->work_mutex);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 108/219] af_unix: Fix data-races around user->unix_inflight.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (106 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 107/219] bpf, sockmap: Fix skb refcnt race after locking changes Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 109/219] af_unix: Fix data-race around unix_tot_inflight Greg Kroah-Hartman
                   ` (121 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzkaller, Kuniyuki Iwashima,
	Willy Tarreau, Eric Dumazet, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 0bc36c0650b21df36fbec8136add83936eaf0607 ]

user->unix_inflight is changed under spin_lock(unix_gc_lock),
but too_many_unix_fds() reads it locklessly.

Let's annotate the write/read accesses to user->unix_inflight.

BUG: KCSAN: data-race in unix_attach_fds / unix_inflight

write to 0xffffffff8546f2d0 of 8 bytes by task 44798 on cpu 1:
 unix_inflight+0x157/0x180 net/unix/scm.c:66
 unix_attach_fds+0x147/0x1e0 net/unix/scm.c:123
 unix_scm_to_skb net/unix/af_unix.c:1827 [inline]
 unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1950
 unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
 unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
 sock_sendmsg_nosec net/socket.c:725 [inline]
 sock_sendmsg+0x148/0x160 net/socket.c:748
 ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
 ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
 __sys_sendmsg+0x94/0x140 net/socket.c:2577
 __do_sys_sendmsg net/socket.c:2586 [inline]
 __se_sys_sendmsg net/socket.c:2584 [inline]
 __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

read to 0xffffffff8546f2d0 of 8 bytes by task 44814 on cpu 0:
 too_many_unix_fds net/unix/scm.c:101 [inline]
 unix_attach_fds+0x54/0x1e0 net/unix/scm.c:110
 unix_scm_to_skb net/unix/af_unix.c:1827 [inline]
 unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1950
 unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
 unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
 sock_sendmsg_nosec net/socket.c:725 [inline]
 sock_sendmsg+0x148/0x160 net/socket.c:748
 ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
 ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
 __sys_sendmsg+0x94/0x140 net/socket.c:2577
 __do_sys_sendmsg net/socket.c:2586 [inline]
 __se_sys_sendmsg net/socket.c:2584 [inline]
 __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

value changed: 0x000000000000000c -> 0x000000000000000d

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 44814 Comm: systemd-coredum Not tainted 6.4.0-11989-g6843306689af #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 712f4aad406b ("unix: properly account for FDs passed over unix sockets")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/unix/scm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/unix/scm.c b/net/unix/scm.c
index aa27a02478dc1..e8e2a00bb0f58 100644
--- a/net/unix/scm.c
+++ b/net/unix/scm.c
@@ -63,7 +63,7 @@ void unix_inflight(struct user_struct *user, struct file *fp)
 		/* Paired with READ_ONCE() in wait_for_unix_gc() */
 		WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + 1);
 	}
-	user->unix_inflight++;
+	WRITE_ONCE(user->unix_inflight, user->unix_inflight + 1);
 	spin_unlock(&unix_gc_lock);
 }
 
@@ -84,7 +84,7 @@ void unix_notinflight(struct user_struct *user, struct file *fp)
 		/* Paired with READ_ONCE() in wait_for_unix_gc() */
 		WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - 1);
 	}
-	user->unix_inflight--;
+	WRITE_ONCE(user->unix_inflight, user->unix_inflight - 1);
 	spin_unlock(&unix_gc_lock);
 }
 
@@ -98,7 +98,7 @@ static inline bool too_many_unix_fds(struct task_struct *p)
 {
 	struct user_struct *user = current_user();
 
-	if (unlikely(user->unix_inflight > task_rlimit(p, RLIMIT_NOFILE)))
+	if (unlikely(READ_ONCE(user->unix_inflight) > task_rlimit(p, RLIMIT_NOFILE)))
 		return !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN);
 	return false;
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 109/219] af_unix: Fix data-race around unix_tot_inflight.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (107 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 108/219] af_unix: Fix data-races around user->unix_inflight Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 110/219] af_unix: Fix data-races around sk->sk_shutdown Greg Kroah-Hartman
                   ` (120 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzkaller, Kuniyuki Iwashima,
	Eric Dumazet, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit ade32bd8a738d7497ffe9743c46728db26740f78 ]

unix_tot_inflight is changed under spin_lock(unix_gc_lock), but
unix_release_sock() reads it locklessly.

Let's use READ_ONCE() for unix_tot_inflight.

Note that the writer side was marked by commit 9d6d7f1cb67c ("af_unix:
annote lockless accesses to unix_tot_inflight & gc_in_progress")

BUG: KCSAN: data-race in unix_inflight / unix_release_sock

write (marked) to 0xffffffff871852b8 of 4 bytes by task 123 on cpu 1:
 unix_inflight+0x130/0x180 net/unix/scm.c:64
 unix_attach_fds+0x137/0x1b0 net/unix/scm.c:123
 unix_scm_to_skb net/unix/af_unix.c:1832 [inline]
 unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1955
 sock_sendmsg_nosec net/socket.c:724 [inline]
 sock_sendmsg+0x148/0x160 net/socket.c:747
 ____sys_sendmsg+0x4e4/0x610 net/socket.c:2493
 ___sys_sendmsg+0xc6/0x140 net/socket.c:2547
 __sys_sendmsg+0x94/0x140 net/socket.c:2576
 __do_sys_sendmsg net/socket.c:2585 [inline]
 __se_sys_sendmsg net/socket.c:2583 [inline]
 __x64_sys_sendmsg+0x45/0x50 net/socket.c:2583
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffffffff871852b8 of 4 bytes by task 4891 on cpu 0:
 unix_release_sock+0x608/0x910 net/unix/af_unix.c:671
 unix_release+0x59/0x80 net/unix/af_unix.c:1058
 __sock_release+0x7d/0x170 net/socket.c:653
 sock_close+0x19/0x30 net/socket.c:1385
 __fput+0x179/0x5e0 fs/file_table.c:321
 ____fput+0x15/0x20 fs/file_table.c:349
 task_work_run+0x116/0x1a0 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
 syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
 do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0x00000000 -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 4891 Comm: systemd-coredum Not tainted 6.4.0-rc5-01219-gfa0e21fa4443 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 9305cfa4443d ("[AF_UNIX]: Make unix_tot_inflight counter non-atomic")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/unix/af_unix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index ca31847a6c70c..310952f4c68f7 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -667,7 +667,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
 	 *	  What the above comment does talk about? --ANK(980817)
 	 */
 
-	if (unix_tot_inflight)
+	if (READ_ONCE(unix_tot_inflight))
 		unix_gc();		/* Garbage collect fds */
 }
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 110/219] af_unix: Fix data-races around sk->sk_shutdown.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (108 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 109/219] af_unix: Fix data-race around unix_tot_inflight Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 111/219] af_unix: Fix data race around sk->sk_err Greg Kroah-Hartman
                   ` (119 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzkaller, Kuniyuki Iwashima,
	Eric Dumazet, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit afe8764f76346ba838d4f162883e23d2fcfaa90e ]

sk->sk_shutdown is changed under unix_state_lock(sk), but
unix_dgram_sendmsg() calls two functions to read sk_shutdown locklessly.

  sock_alloc_send_pskb
  `- sock_wait_for_wmem

Let's use READ_ONCE() there.

Note that the writer side was marked by commit e1d09c2c2f57 ("af_unix:
Fix data races around sk->sk_shutdown.").

BUG: KCSAN: data-race in sock_alloc_send_pskb / unix_release_sock

write (marked) to 0xffff8880069af12c of 1 bytes by task 1 on cpu 1:
 unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
 unix_release+0x59/0x80 net/unix/af_unix.c:1053
 __sock_release+0x7d/0x170 net/socket.c:654
 sock_close+0x19/0x30 net/socket.c:1386
 __fput+0x2a3/0x680 fs/file_table.c:384
 ____fput+0x15/0x20 fs/file_table.c:412
 task_work_run+0x116/0x1a0 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
 syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
 do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

read to 0xffff8880069af12c of 1 bytes by task 28650 on cpu 0:
 sock_alloc_send_pskb+0xd2/0x620 net/core/sock.c:2767
 unix_dgram_sendmsg+0x2f8/0x14f0 net/unix/af_unix.c:1944
 unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
 unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
 sock_sendmsg_nosec net/socket.c:725 [inline]
 sock_sendmsg+0x148/0x160 net/socket.c:748
 ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
 ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
 __sys_sendmsg+0x94/0x140 net/socket.c:2577
 __do_sys_sendmsg net/socket.c:2586 [inline]
 __se_sys_sendmsg net/socket.c:2584 [inline]
 __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

value changed: 0x00 -> 0x03

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 28650 Comm: systemd-coredum Not tainted 6.4.0-11989-g6843306689af #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/sock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index aa628c6314f64..71990525d37e2 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2690,7 +2690,7 @@ static long sock_wait_for_wmem(struct sock *sk, long timeo)
 		prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
 		if (refcount_read(&sk->sk_wmem_alloc) < READ_ONCE(sk->sk_sndbuf))
 			break;
-		if (sk->sk_shutdown & SEND_SHUTDOWN)
+		if (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN)
 			break;
 		if (sk->sk_err)
 			break;
@@ -2720,7 +2720,7 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 			goto failure;
 
 		err = -EPIPE;
-		if (sk->sk_shutdown & SEND_SHUTDOWN)
+		if (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN)
 			goto failure;
 
 		if (sk_wmem_alloc_get(sk) < READ_ONCE(sk->sk_sndbuf))
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 111/219] af_unix: Fix data race around sk->sk_err.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (109 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 110/219] af_unix: Fix data-races around sk->sk_shutdown Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:13 ` [PATCH 6.1 112/219] net: sched: sch_qfq: Fix UAF in qfq_dequeue() Greg Kroah-Hartman
                   ` (118 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Eric Dumazet,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit b192812905e4b134f7b7994b079eb647e9d2d37e ]

As with sk->sk_shutdown shown in the previous patch, sk->sk_err can be
read locklessly by unix_dgram_sendmsg().

Let's use READ_ONCE() for sk_err as well.

Note that the writer side is marked by commit cc04410af7de ("af_unix:
annotate lockless accesses to sk->sk_err").

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 71990525d37e2..e5858fa5d6d57 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2692,7 +2692,7 @@ static long sock_wait_for_wmem(struct sock *sk, long timeo)
 			break;
 		if (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN)
 			break;
-		if (sk->sk_err)
+		if (READ_ONCE(sk->sk_err))
 			break;
 		timeo = schedule_timeout(timeo);
 	}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 112/219] net: sched: sch_qfq: Fix UAF in qfq_dequeue()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (110 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 111/219] af_unix: Fix data race around sk->sk_err Greg Kroah-Hartman
@ 2023-09-17 19:13 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 113/219] kcm: Destroy mutex in kcm_exit_net() Greg Kroah-Hartman
                   ` (117 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:13 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, valis, Jamal Hadi Salim, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: valis <sec@valis.email>

[ Upstream commit 8fc134fee27f2263988ae38920bc03da416b03d8 ]

When the plug qdisc is used as a class of the qfq qdisc it could trigger a
UAF. This issue can be reproduced with following commands:

  tc qdisc add dev lo root handle 1: qfq
  tc class add dev lo parent 1: classid 1:1 qfq weight 1 maxpkt 512
  tc qdisc add dev lo parent 1:1 handle 2: plug
  tc filter add dev lo parent 1: basic classid 1:1
  ping -c1 127.0.0.1

and boom:

[  285.353793] BUG: KASAN: slab-use-after-free in qfq_dequeue+0xa7/0x7f0
[  285.354910] Read of size 4 at addr ffff8880bad312a8 by task ping/144
[  285.355903]
[  285.356165] CPU: 1 PID: 144 Comm: ping Not tainted 6.5.0-rc3+ #4
[  285.357112] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[  285.358376] Call Trace:
[  285.358773]  <IRQ>
[  285.359109]  dump_stack_lvl+0x44/0x60
[  285.359708]  print_address_description.constprop.0+0x2c/0x3c0
[  285.360611]  kasan_report+0x10c/0x120
[  285.361195]  ? qfq_dequeue+0xa7/0x7f0
[  285.361780]  qfq_dequeue+0xa7/0x7f0
[  285.362342]  __qdisc_run+0xf1/0x970
[  285.362903]  net_tx_action+0x28e/0x460
[  285.363502]  __do_softirq+0x11b/0x3de
[  285.364097]  do_softirq.part.0+0x72/0x90
[  285.364721]  </IRQ>
[  285.365072]  <TASK>
[  285.365422]  __local_bh_enable_ip+0x77/0x90
[  285.366079]  __dev_queue_xmit+0x95f/0x1550
[  285.366732]  ? __pfx_csum_and_copy_from_iter+0x10/0x10
[  285.367526]  ? __pfx___dev_queue_xmit+0x10/0x10
[  285.368259]  ? __build_skb_around+0x129/0x190
[  285.368960]  ? ip_generic_getfrag+0x12c/0x170
[  285.369653]  ? __pfx_ip_generic_getfrag+0x10/0x10
[  285.370390]  ? csum_partial+0x8/0x20
[  285.370961]  ? raw_getfrag+0xe5/0x140
[  285.371559]  ip_finish_output2+0x539/0xa40
[  285.372222]  ? __pfx_ip_finish_output2+0x10/0x10
[  285.372954]  ip_output+0x113/0x1e0
[  285.373512]  ? __pfx_ip_output+0x10/0x10
[  285.374130]  ? icmp_out_count+0x49/0x60
[  285.374739]  ? __pfx_ip_finish_output+0x10/0x10
[  285.375457]  ip_push_pending_frames+0xf3/0x100
[  285.376173]  raw_sendmsg+0xef5/0x12d0
[  285.376760]  ? do_syscall_64+0x40/0x90
[  285.377359]  ? __static_call_text_end+0x136578/0x136578
[  285.378173]  ? do_syscall_64+0x40/0x90
[  285.378772]  ? kasan_enable_current+0x11/0x20
[  285.379469]  ? __pfx_raw_sendmsg+0x10/0x10
[  285.380137]  ? __sock_create+0x13e/0x270
[  285.380673]  ? __sys_socket+0xf3/0x180
[  285.381174]  ? __x64_sys_socket+0x3d/0x50
[  285.381725]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  285.382425]  ? __rcu_read_unlock+0x48/0x70
[  285.382975]  ? ip4_datagram_release_cb+0xd8/0x380
[  285.383608]  ? __pfx_ip4_datagram_release_cb+0x10/0x10
[  285.384295]  ? preempt_count_sub+0x14/0xc0
[  285.384844]  ? __list_del_entry_valid+0x76/0x140
[  285.385467]  ? _raw_spin_lock_bh+0x87/0xe0
[  285.386014]  ? __pfx__raw_spin_lock_bh+0x10/0x10
[  285.386645]  ? release_sock+0xa0/0xd0
[  285.387148]  ? preempt_count_sub+0x14/0xc0
[  285.387712]  ? freeze_secondary_cpus+0x348/0x3c0
[  285.388341]  ? aa_sk_perm+0x177/0x390
[  285.388856]  ? __pfx_aa_sk_perm+0x10/0x10
[  285.389441]  ? check_stack_object+0x22/0x70
[  285.390032]  ? inet_send_prepare+0x2f/0x120
[  285.390603]  ? __pfx_inet_sendmsg+0x10/0x10
[  285.391172]  sock_sendmsg+0xcc/0xe0
[  285.391667]  __sys_sendto+0x190/0x230
[  285.392168]  ? __pfx___sys_sendto+0x10/0x10
[  285.392727]  ? kvm_clock_get_cycles+0x14/0x30
[  285.393328]  ? set_normalized_timespec64+0x57/0x70
[  285.393980]  ? _raw_spin_unlock_irq+0x1b/0x40
[  285.394578]  ? __x64_sys_clock_gettime+0x11c/0x160
[  285.395225]  ? __pfx___x64_sys_clock_gettime+0x10/0x10
[  285.395908]  ? _copy_to_user+0x3e/0x60
[  285.396432]  ? exit_to_user_mode_prepare+0x1a/0x120
[  285.397086]  ? syscall_exit_to_user_mode+0x22/0x50
[  285.397734]  ? do_syscall_64+0x71/0x90
[  285.398258]  __x64_sys_sendto+0x74/0x90
[  285.398786]  do_syscall_64+0x64/0x90
[  285.399273]  ? exit_to_user_mode_prepare+0x1a/0x120
[  285.399949]  ? syscall_exit_to_user_mode+0x22/0x50
[  285.400605]  ? do_syscall_64+0x71/0x90
[  285.401124]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  285.401807] RIP: 0033:0x495726
[  285.402233] Code: ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 09
[  285.404683] RSP: 002b:00007ffcc25fb618 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[  285.405677] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 0000000000495726
[  285.406628] RDX: 0000000000000040 RSI: 0000000002518750 RDI: 0000000000000000
[  285.407565] RBP: 00000000005205ef R08: 00000000005f8838 R09: 000000000000001c
[  285.408523] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000002517634
[  285.409460] R13: 00007ffcc25fb6f0 R14: 0000000000000003 R15: 0000000000000000
[  285.410403]  </TASK>
[  285.410704]
[  285.410929] Allocated by task 144:
[  285.411402]  kasan_save_stack+0x1e/0x40
[  285.411926]  kasan_set_track+0x21/0x30
[  285.412442]  __kasan_slab_alloc+0x55/0x70
[  285.412973]  kmem_cache_alloc_node+0x187/0x3d0
[  285.413567]  __alloc_skb+0x1b4/0x230
[  285.414060]  __ip_append_data+0x17f7/0x1b60
[  285.414633]  ip_append_data+0x97/0xf0
[  285.415144]  raw_sendmsg+0x5a8/0x12d0
[  285.415640]  sock_sendmsg+0xcc/0xe0
[  285.416117]  __sys_sendto+0x190/0x230
[  285.416626]  __x64_sys_sendto+0x74/0x90
[  285.417145]  do_syscall_64+0x64/0x90
[  285.417624]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  285.418306]
[  285.418531] Freed by task 144:
[  285.418960]  kasan_save_stack+0x1e/0x40
[  285.419469]  kasan_set_track+0x21/0x30
[  285.419988]  kasan_save_free_info+0x27/0x40
[  285.420556]  ____kasan_slab_free+0x109/0x1a0
[  285.421146]  kmem_cache_free+0x1c2/0x450
[  285.421680]  __netif_receive_skb_core+0x2ce/0x1870
[  285.422333]  __netif_receive_skb_one_core+0x97/0x140
[  285.423003]  process_backlog+0x100/0x2f0
[  285.423537]  __napi_poll+0x5c/0x2d0
[  285.424023]  net_rx_action+0x2be/0x560
[  285.424510]  __do_softirq+0x11b/0x3de
[  285.425034]
[  285.425254] The buggy address belongs to the object at ffff8880bad31280
[  285.425254]  which belongs to the cache skbuff_head_cache of size 224
[  285.426993] The buggy address is located 40 bytes inside of
[  285.426993]  freed 224-byte region [ffff8880bad31280, ffff8880bad31360)
[  285.428572]
[  285.428798] The buggy address belongs to the physical page:
[  285.429540] page:00000000f4b77674 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbad31
[  285.430758] flags: 0x100000000000200(slab|node=0|zone=1)
[  285.431447] page_type: 0xffffffff()
[  285.431934] raw: 0100000000000200 ffff88810094a8c0 dead000000000122 0000000000000000
[  285.432757] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
[  285.433562] page dumped because: kasan: bad access detected
[  285.434144]
[  285.434320] Memory state around the buggy address:
[  285.434828]  ffff8880bad31180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  285.435580]  ffff8880bad31200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  285.436264] >ffff8880bad31280: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  285.436777]                                   ^
[  285.437106]  ffff8880bad31300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  285.437616]  ffff8880bad31380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  285.438126] ==================================================================
[  285.438662] Disabling lock debugging due to kernel taint

Fix this by:
1. Changing sch_plug's .peek handler to qdisc_peek_dequeued(), a
function compatible with non-work-conserving qdiscs
2. Checking the return value of qdisc_dequeue_peeked() in sch_qfq.

Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Reported-by: valis <sec@valis.email>
Signed-off-by: valis <sec@valis.email>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230901162237.11525-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/sched/sch_plug.c |  2 +-
 net/sched/sch_qfq.c  | 22 +++++++++++++++++-----
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/net/sched/sch_plug.c b/net/sched/sch_plug.c
index ea8c4a7174bba..35f49edf63dbf 100644
--- a/net/sched/sch_plug.c
+++ b/net/sched/sch_plug.c
@@ -207,7 +207,7 @@ static struct Qdisc_ops plug_qdisc_ops __read_mostly = {
 	.priv_size   =       sizeof(struct plug_sched_data),
 	.enqueue     =       plug_enqueue,
 	.dequeue     =       plug_dequeue,
-	.peek        =       qdisc_peek_head,
+	.peek        =       qdisc_peek_dequeued,
 	.init        =       plug_init,
 	.change      =       plug_change,
 	.reset       =	     qdisc_reset_queue,
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index e150d08f182d8..ed01634af82c2 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -973,10 +973,13 @@ static void qfq_update_eligible(struct qfq_sched *q)
 }
 
 /* Dequeue head packet of the head class in the DRR queue of the aggregate. */
-static void agg_dequeue(struct qfq_aggregate *agg,
-			struct qfq_class *cl, unsigned int len)
+static struct sk_buff *agg_dequeue(struct qfq_aggregate *agg,
+				   struct qfq_class *cl, unsigned int len)
 {
-	qdisc_dequeue_peeked(cl->qdisc);
+	struct sk_buff *skb = qdisc_dequeue_peeked(cl->qdisc);
+
+	if (!skb)
+		return NULL;
 
 	cl->deficit -= (int) len;
 
@@ -986,6 +989,8 @@ static void agg_dequeue(struct qfq_aggregate *agg,
 		cl->deficit += agg->lmax;
 		list_move_tail(&cl->alist, &agg->active);
 	}
+
+	return skb;
 }
 
 static inline struct sk_buff *qfq_peek_skb(struct qfq_aggregate *agg,
@@ -1131,11 +1136,18 @@ static struct sk_buff *qfq_dequeue(struct Qdisc *sch)
 	if (!skb)
 		return NULL;
 
-	qdisc_qstats_backlog_dec(sch, skb);
 	sch->q.qlen--;
+
+	skb = agg_dequeue(in_serv_agg, cl, len);
+
+	if (!skb) {
+		sch->q.qlen++;
+		return NULL;
+	}
+
+	qdisc_qstats_backlog_dec(sch, skb);
 	qdisc_bstats_update(sch, skb);
 
-	agg_dequeue(in_serv_agg, cl, len);
 	/* If lmax is lowered, through qfq_change_class, for a class
 	 * owning pending packets with larger size than the new value
 	 * of lmax, then the following condition may hold.
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 113/219] kcm: Destroy mutex in kcm_exit_net()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (111 preceding siblings ...)
  2023-09-17 19:13 ` [PATCH 6.1 112/219] net: sched: sch_qfq: Fix UAF in qfq_dequeue() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 114/219] octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler Greg Kroah-Hartman
                   ` (116 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Shigeru Yoshida, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shigeru Yoshida <syoshida@redhat.com>

[ Upstream commit 6ad40b36cd3b04209e2d6c89d252c873d8082a59 ]

kcm_exit_net() should call mutex_destroy() on knet->mutex. This is especially
needed if CONFIG_DEBUG_MUTEXES is enabled.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Link: https://lore.kernel.org/r/20230902170708.1727999-1-syoshida@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/kcm/kcmsock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 890a2423f559e..6a97662d7548e 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1981,6 +1981,8 @@ static __net_exit void kcm_exit_net(struct net *net)
 	 * that all multiplexors and psocks have been destroyed.
 	 */
 	WARN_ON(!list_empty(&knet->mux_list));
+
+	mutex_destroy(&knet->mutex);
 }
 
 static struct pernet_operations kcm_net_ops = {
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 114/219] octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (112 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 113/219] kcm: Destroy mutex in kcm_exit_net() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 115/219] igc: Change IGC_MIN to allow set rx/tx value between 64 and 80 Greg Kroah-Hartman
                   ` (115 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Geetha sowjanya,
	Sunil Kovvuri Goutham, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Geetha sowjanya <gakula@marvell.com>

[ Upstream commit 29fe7a1b62717d58f033009874554d99d71f7d37 ]

The smq value used in the CN10K NIX AQ instruction enqueue mailbox
handler was truncated to 9-bit value from 10-bit value because of
typecasting the CN10K mbox request structure to the CN9K structure.
Though this hasn't caused any problems when programming the NIX SQ
context to the HW because the context structure is the same size.
However, this causes a problem when accessing the structure parameters.
This patch reads the right smq value for each platform.

Fixes: 30077d210c83 ("octeontx2-af: cn10k: Update NIX/NPA context structure")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../ethernet/marvell/octeontx2/af/rvu_nix.c   | 21 +++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
index c85e0180d96da..1f3a8cf42765e 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
@@ -834,6 +834,21 @@ static int nix_aq_enqueue_wait(struct rvu *rvu, struct rvu_block *block,
 	return 0;
 }
 
+static void nix_get_aq_req_smq(struct rvu *rvu, struct nix_aq_enq_req *req,
+			       u16 *smq, u16 *smq_mask)
+{
+	struct nix_cn10k_aq_enq_req *aq_req;
+
+	if (!is_rvu_otx2(rvu)) {
+		aq_req = (struct nix_cn10k_aq_enq_req *)req;
+		*smq = aq_req->sq.smq;
+		*smq_mask = aq_req->sq_mask.smq;
+	} else {
+		*smq = req->sq.smq;
+		*smq_mask = req->sq_mask.smq;
+	}
+}
+
 static int rvu_nix_blk_aq_enq_inst(struct rvu *rvu, struct nix_hw *nix_hw,
 				   struct nix_aq_enq_req *req,
 				   struct nix_aq_enq_rsp *rsp)
@@ -845,6 +860,7 @@ static int rvu_nix_blk_aq_enq_inst(struct rvu *rvu, struct nix_hw *nix_hw,
 	struct rvu_block *block;
 	struct admin_queue *aq;
 	struct rvu_pfvf *pfvf;
+	u16 smq, smq_mask;
 	void *ctx, *mask;
 	bool ena;
 	u64 cfg;
@@ -916,13 +932,14 @@ static int rvu_nix_blk_aq_enq_inst(struct rvu *rvu, struct nix_hw *nix_hw,
 	if (rc)
 		return rc;
 
+	nix_get_aq_req_smq(rvu, req, &smq, &smq_mask);
 	/* Check if SQ pointed SMQ belongs to this PF/VF or not */
 	if (req->ctype == NIX_AQ_CTYPE_SQ &&
 	    ((req->op == NIX_AQ_INSTOP_INIT && req->sq.ena) ||
 	     (req->op == NIX_AQ_INSTOP_WRITE &&
-	      req->sq_mask.ena && req->sq_mask.smq && req->sq.ena))) {
+	      req->sq_mask.ena && req->sq.ena && smq_mask))) {
 		if (!is_valid_txschq(rvu, blkaddr, NIX_TXSCH_LVL_SMQ,
-				     pcifunc, req->sq.smq))
+				     pcifunc, smq))
 			return NIX_AF_ERR_AQ_ENQUEUE;
 	}
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 115/219] igc: Change IGC_MIN to allow set rx/tx value between 64 and 80
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (113 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 114/219] octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 116/219] igbvf: Change IGBVF_MIN " Greg Kroah-Hartman
                   ` (114 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Olga Zaborska, Naama Meir,
	Tony Nguyen, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Olga Zaborska <olga.zaborska@intel.com>

[ Upstream commit 5aa48279712e1f134aac908acde4df798955a955 ]

Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igc devices can use as low as 64 descriptors.
This change will unify igc with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")

Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers")
Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/igc/igc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index f83cbc4a1afa8..d3b17aa1d1a83 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -354,11 +354,11 @@ static inline u32 igc_rss_type(const union igc_adv_rx_desc *rx_desc)
 /* TX/RX descriptor defines */
 #define IGC_DEFAULT_TXD		256
 #define IGC_DEFAULT_TX_WORK	128
-#define IGC_MIN_TXD		80
+#define IGC_MIN_TXD		64
 #define IGC_MAX_TXD		4096
 
 #define IGC_DEFAULT_RXD		256
-#define IGC_MIN_RXD		80
+#define IGC_MIN_RXD		64
 #define IGC_MAX_RXD		4096
 
 /* Supported Rx Buffer Sizes */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 116/219] igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (114 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 115/219] igc: Change IGC_MIN to allow set rx/tx value between 64 and 80 Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 117/219] igb: Change IGB_MIN " Greg Kroah-Hartman
                   ` (113 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Olga Zaborska, Rafal Romanowski,
	Tony Nguyen, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Olga Zaborska <olga.zaborska@intel.com>

[ Upstream commit 8360717524a24a421c36ef8eb512406dbd42160a ]

Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igbvf devices can use as low as 64 descriptors.
This change will unify igbvf with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")

Fixes: d4e0fe01a38a ("igbvf: add new driver to support 82576 virtual functions")
Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/igbvf/igbvf.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/igbvf.h b/drivers/net/ethernet/intel/igbvf/igbvf.h
index 57d39ee00b585..7b83678ba83a6 100644
--- a/drivers/net/ethernet/intel/igbvf/igbvf.h
+++ b/drivers/net/ethernet/intel/igbvf/igbvf.h
@@ -39,11 +39,11 @@ enum latency_range {
 /* Tx/Rx descriptor defines */
 #define IGBVF_DEFAULT_TXD	256
 #define IGBVF_MAX_TXD		4096
-#define IGBVF_MIN_TXD		80
+#define IGBVF_MIN_TXD		64
 
 #define IGBVF_DEFAULT_RXD	256
 #define IGBVF_MAX_RXD		4096
-#define IGBVF_MIN_RXD		80
+#define IGBVF_MIN_RXD		64
 
 #define IGBVF_MIN_ITR_USECS	10 /* 100000 irq/sec */
 #define IGBVF_MAX_ITR_USECS	10000 /* 100    irq/sec */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 117/219] igb: Change IGB_MIN to allow set rx/tx value between 64 and 80
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (115 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 116/219] igbvf: Change IGBVF_MIN " Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 118/219] s390/zcrypt: dont leak memory if dev_set_name() fails Greg Kroah-Hartman
                   ` (112 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Olga Zaborska, Tony Nguyen,
	Sasha Levin, Pucha Himasekhar Reddy

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Olga Zaborska <olga.zaborska@intel.com>

[ Upstream commit 6319685bdc8ad5310890add907b7c42f89302886 ]

Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igb devices can use as low as 64 descriptors.
This change will unify igb with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")

Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver")
Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/igb/igb.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 015b781441149..a2b759531cb7b 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -34,11 +34,11 @@ struct igb_adapter;
 /* TX/RX descriptor defines */
 #define IGB_DEFAULT_TXD		256
 #define IGB_DEFAULT_TX_WORK	128
-#define IGB_MIN_TXD		80
+#define IGB_MIN_TXD		64
 #define IGB_MAX_TXD		4096
 
 #define IGB_DEFAULT_RXD		256
-#define IGB_MIN_RXD		80
+#define IGB_MIN_RXD		64
 #define IGB_MAX_RXD		4096
 
 #define IGB_DEFAULT_ITR		3 /* dynamic */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 118/219] s390/zcrypt: dont leak memory if dev_set_name() fails
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (116 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 117/219] igb: Change IGB_MIN " Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 119/219] idr: fix param name in idr_alloc_cyclic() doc Greg Kroah-Hartman
                   ` (111 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Andy Shevchenko,
	Harald Freudenberger, Heiko Carstens, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

[ Upstream commit 6252f47b78031979ad919f971dc8468b893488bd ]

When dev_set_name() fails, zcdn_create() doesn't free the newly
allocated resources. Do it.

Fixes: 00fab2350e6b ("s390/zcrypt: multiple zcrypt device nodes support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230831110000.24279-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/s390/crypto/zcrypt_api.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/s390/crypto/zcrypt_api.c b/drivers/s390/crypto/zcrypt_api.c
index f94b43ce9a658..28e34d155334b 100644
--- a/drivers/s390/crypto/zcrypt_api.c
+++ b/drivers/s390/crypto/zcrypt_api.c
@@ -441,6 +441,7 @@ static int zcdn_create(const char *name)
 			 ZCRYPT_NAME "_%d", (int)MINOR(devt));
 	nodename[sizeof(nodename) - 1] = '\0';
 	if (dev_set_name(&zcdndev->device, nodename)) {
+		kfree(zcdndev);
 		rc = -EINVAL;
 		goto unlockout;
 	}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 119/219] idr: fix param name in idr_alloc_cyclic() doc
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (117 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 118/219] s390/zcrypt: dont leak memory if dev_set_name() fails Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 120/219] ip_tunnels: use DEV_STATS_INC() Greg Kroah-Hartman
                   ` (110 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Ariel Marcovitch,
	Matthew Wilcox (Oracle), Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ariel Marcovitch <arielmarcovitch@gmail.com>

[ Upstream commit 2a15de80dd0f7e04a823291aa9eb49c5294f56af ]

The relevant parameter is 'start' and not 'nextid'

Fixes: 460488c58ca8 ("idr: Remove idr_alloc_ext")
Signed-off-by: Ariel Marcovitch <arielmarcovitch@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 lib/idr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/idr.c b/lib/idr.c
index 7ecdfdb5309e7..13f2758c23773 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -100,7 +100,7 @@ EXPORT_SYMBOL_GPL(idr_alloc);
  * @end: The maximum ID (exclusive).
  * @gfp: Memory allocation flags.
  *
- * Allocates an unused ID in the range specified by @nextid and @end.  If
+ * Allocates an unused ID in the range specified by @start and @end.  If
  * @end is <= 0, it is treated as one larger than %INT_MAX.  This allows
  * callers to use @start + N as @end as long as N is within integer range.
  * The search for an unused ID will start at the last ID allocated and will
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 120/219] ip_tunnels: use DEV_STATS_INC()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (118 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 119/219] idr: fix param name in idr_alloc_cyclic() doc Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 121/219] net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload Greg Kroah-Hartman
                   ` (109 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot, Eric Dumazet,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 9b271ebaf9a2c5c566a54bc6cd915962e8241130 ]

syzbot/KCSAN reported data-races in iptunnel_xmit_stats() [1]

This can run from multiple cpus without mutual exclusion.

Adopt SMP safe DEV_STATS_INC() to update dev->stats fields.

[1]
BUG: KCSAN: data-race in iptunnel_xmit / iptunnel_xmit

read-write to 0xffff8881353df170 of 8 bytes by task 30263 on cpu 1:
iptunnel_xmit_stats include/net/ip_tunnels.h:493 [inline]
iptunnel_xmit+0x432/0x4a0 net/ipv4/ip_tunnel_core.c:87
ip_tunnel_xmit+0x1477/0x1750 net/ipv4/ip_tunnel.c:831
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:662
__netdev_start_xmit include/linux/netdevice.h:4889 [inline]
netdev_start_xmit include/linux/netdevice.h:4903 [inline]
xmit_one net/core/dev.c:3544 [inline]
dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3560
__dev_queue_xmit+0xeee/0x1de0 net/core/dev.c:4340
dev_queue_xmit include/linux/netdevice.h:3082 [inline]
__bpf_tx_skb net/core/filter.c:2129 [inline]
__bpf_redirect_no_mac net/core/filter.c:2159 [inline]
__bpf_redirect+0x723/0x9c0 net/core/filter.c:2182
____bpf_clone_redirect net/core/filter.c:2453 [inline]
bpf_clone_redirect+0x16c/0x1d0 net/core/filter.c:2425
___bpf_prog_run+0xd7d/0x41e0 kernel/bpf/core.c:1954
__bpf_prog_run512+0x74/0xa0 kernel/bpf/core.c:2195
bpf_dispatcher_nop_func include/linux/bpf.h:1181 [inline]
__bpf_prog_run include/linux/filter.h:609 [inline]
bpf_prog_run include/linux/filter.h:616 [inline]
bpf_test_run+0x15d/0x3d0 net/bpf/test_run.c:423
bpf_prog_test_run_skb+0x77b/0xa00 net/bpf/test_run.c:1045
bpf_prog_test_run+0x265/0x3d0 kernel/bpf/syscall.c:3996
__sys_bpf+0x3af/0x780 kernel/bpf/syscall.c:5353
__do_sys_bpf kernel/bpf/syscall.c:5439 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5437 [inline]
__x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5437
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read-write to 0xffff8881353df170 of 8 bytes by task 30249 on cpu 0:
iptunnel_xmit_stats include/net/ip_tunnels.h:493 [inline]
iptunnel_xmit+0x432/0x4a0 net/ipv4/ip_tunnel_core.c:87
ip_tunnel_xmit+0x1477/0x1750 net/ipv4/ip_tunnel.c:831
__gre_xmit net/ipv4/ip_gre.c:469 [inline]
ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:662
__netdev_start_xmit include/linux/netdevice.h:4889 [inline]
netdev_start_xmit include/linux/netdevice.h:4903 [inline]
xmit_one net/core/dev.c:3544 [inline]
dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3560
__dev_queue_xmit+0xeee/0x1de0 net/core/dev.c:4340
dev_queue_xmit include/linux/netdevice.h:3082 [inline]
__bpf_tx_skb net/core/filter.c:2129 [inline]
__bpf_redirect_no_mac net/core/filter.c:2159 [inline]
__bpf_redirect+0x723/0x9c0 net/core/filter.c:2182
____bpf_clone_redirect net/core/filter.c:2453 [inline]
bpf_clone_redirect+0x16c/0x1d0 net/core/filter.c:2425
___bpf_prog_run+0xd7d/0x41e0 kernel/bpf/core.c:1954
__bpf_prog_run512+0x74/0xa0 kernel/bpf/core.c:2195
bpf_dispatcher_nop_func include/linux/bpf.h:1181 [inline]
__bpf_prog_run include/linux/filter.h:609 [inline]
bpf_prog_run include/linux/filter.h:616 [inline]
bpf_test_run+0x15d/0x3d0 net/bpf/test_run.c:423
bpf_prog_test_run_skb+0x77b/0xa00 net/bpf/test_run.c:1045
bpf_prog_test_run+0x265/0x3d0 kernel/bpf/syscall.c:3996
__sys_bpf+0x3af/0x780 kernel/bpf/syscall.c:5353
__do_sys_bpf kernel/bpf/syscall.c:5439 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5437 [inline]
__x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5437
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x0000000000018830 -> 0x0000000000018831

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 30249 Comm: syz-executor.4 Not tainted 6.5.0-syzkaller-11704-g3f86ed6ec0b3 #0

Fixes: 039f50629b7f ("ip_tunnel: Move stats update to iptunnel_xmit()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/ip_tunnels.h | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index fca3576798166..bca80522f95c8 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -473,15 +473,14 @@ static inline void iptunnel_xmit_stats(struct net_device *dev, int pkt_len)
 		u64_stats_inc(&tstats->tx_packets);
 		u64_stats_update_end(&tstats->syncp);
 		put_cpu_ptr(tstats);
+		return;
+	}
+
+	if (pkt_len < 0) {
+		DEV_STATS_INC(dev, tx_errors);
+		DEV_STATS_INC(dev, tx_aborted_errors);
 	} else {
-		struct net_device_stats *err_stats = &dev->stats;
-
-		if (pkt_len < 0) {
-			err_stats->tx_errors++;
-			err_stats->tx_aborted_errors++;
-		} else {
-			err_stats->tx_dropped++;
-		}
+		DEV_STATS_INC(dev, tx_dropped);
 	}
 }
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 121/219] net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (119 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 120/219] ip_tunnels: use DEV_STATS_INC() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 122/219] net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times Greg Kroah-Hartman
                   ` (108 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yanan Yang, Vladimir Oltean,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 954ad9bf13c4f95a4958b5f8433301f2ab99e1f5 ]

More careful measurement of the tc-cbs bandwidth shows that the stream
bandwidth (effectively idleslope) increases, there is a larger and
larger discrepancy between the rate limit obtained by the software
Qdisc, and the rate limit obtained by its offloaded counterpart.

The discrepancy becomes so large, that e.g. at an idleslope of 40000
(40Mbps), the offloaded cbs does not actually rate limit anything, and
traffic will pass at line rate through a 100 Mbps port.

The reason for the discrepancy is that the hardware documentation I've
been following is incorrect. UM11040.pdf (for SJA1105P/Q/R/S) states
about IDLE_SLOPE that it is "the rate (in unit of bytes/sec) at which
the credit counter is increased".

Cross-checking with UM10944.pdf (for SJA1105E/T) and UM11107.pdf
(for SJA1110), the wording is different: "This field specifies the
value, in bytes per second times link speed, by which the credit counter
is increased".

So there's an extra scaling for link speed that the driver is currently
not accounting for, and apparently (empirically), that link speed is
expressed in Kbps.

I've pondered whether to pollute the sja1105_mac_link_up()
implementation with CBS shaper reprogramming, but I don't think it is
worth it. IMO, the UAPI exposed by tc-cbs requires user space to
recalculate the sendslope anyway, since the formula for that depends on
port_transmit_rate (see man tc-cbs), which is not an invariant from tc's
perspective.

So we use the offload->sendslope and offload->idleslope to deduce the
original port_transmit_rate from the CBS formula, and use that value to
scale the offload->sendslope and offload->idleslope to values that the
hardware understands.

Some numerical data points:

 40Mbps stream, max interfering frame size 1500, port speed 100M
 ---------------------------------------------------------------

 tc-cbs parameters:
 idleslope 40000 sendslope -60000 locredit -900 hicredit 600

 which result in hardware values:

 Before (doesn't work)           After (works)
 credit_hi    600                600
 credit_lo    900                900
 send_slope   7500000            75
 idle_slope   5000000            50

 40Mbps stream, max interfering frame size 1500, port speed 1G
 -------------------------------------------------------------

 tc-cbs parameters:
 idleslope 40000 sendslope -960000 locredit -1440 hicredit 60

 which result in hardware values:

 Before (doesn't work)           After (works)
 credit_hi    60                 60
 credit_lo    1440               1440
 send_slope   120000000          120
 idle_slope   5000000            5

 5.12Mbps stream, max interfering frame size 1522, port speed 100M
 -----------------------------------------------------------------

 tc-cbs parameters:
 idleslope 5120 sendslope -94880 locredit -1444 hicredit 77

 which result in hardware values:

 Before (doesn't work)           After (works)
 credit_hi    77                 77
 credit_lo    1444               1444
 send_slope   11860000           118
 idle_slope   640000             6

Tested on SJA1105T, SJA1105S and SJA1110A, at 1Gbps and 100Mbps.

Fixes: 4d7525085a9b ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Reported-by: Yanan Yang <yanan.yang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 947e8f7c09880..377f177502003 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2157,6 +2157,7 @@ static int sja1105_setup_tc_cbs(struct dsa_switch *ds, int port,
 {
 	struct sja1105_private *priv = ds->priv;
 	struct sja1105_cbs_entry *cbs;
+	s64 port_transmit_rate_kbps;
 	int index;
 
 	if (!offload->enable)
@@ -2174,9 +2175,17 @@ static int sja1105_setup_tc_cbs(struct dsa_switch *ds, int port,
 	 */
 	cbs->credit_hi = offload->hicredit;
 	cbs->credit_lo = abs(offload->locredit);
-	/* User space is in kbits/sec, hardware in bytes/sec */
-	cbs->idle_slope = offload->idleslope * BYTES_PER_KBIT;
-	cbs->send_slope = abs(offload->sendslope * BYTES_PER_KBIT);
+	/* User space is in kbits/sec, while the hardware in bytes/sec times
+	 * link speed. Since the given offload->sendslope is good only for the
+	 * current link speed anyway, and user space is likely to reprogram it
+	 * when that changes, don't even bother to track the port's link speed,
+	 * but deduce the port transmit rate from idleslope - sendslope.
+	 */
+	port_transmit_rate_kbps = offload->idleslope - offload->sendslope;
+	cbs->idle_slope = div_s64(offload->idleslope * BYTES_PER_KBIT,
+				  port_transmit_rate_kbps);
+	cbs->send_slope = div_s64(abs(offload->sendslope * BYTES_PER_KBIT),
+				  port_transmit_rate_kbps);
 	/* Convert the negative values from 64-bit 2's complement
 	 * to 32-bit 2's complement (for the case of 0x80000000 whose
 	 * negative is still negative).
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 122/219] net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (120 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 121/219] net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 123/219] net: dsa: sja1105: complete tc-cbs offload support on SJA1110 Greg Kroah-Hartman
                   ` (107 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vladimir Oltean, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 894cafc5c62ccced758077bd4e970dc714c42637 ]

After running command [2] too many times in a row:

[1] $ tc qdisc add dev sw2p0 root handle 1: mqprio num_tc 8 \
	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
[2] $ tc qdisc replace dev sw2p0 parent 1:1 cbs offload 1 \
	idleslope 120000 sendslope -880000 locredit -1320 hicredit 180

(aka more than priv->info->num_cbs_shapers times)

we start seeing the following error message:

Error: Specified device failed to setup cbs hardware offload.

This comes from the fact that ndo_setup_tc(TC_SETUP_QDISC_CBS) presents
the same API for the qdisc create and replace cases, and the sja1105
driver fails to distinguish between the 2. Thus, it always thinks that
it must allocate the same shaper for a {port, queue} pair, when it may
instead have to replace an existing one.

Fixes: 4d7525085a9b ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 377f177502003..9dd5cdcda2843 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2123,6 +2123,18 @@ static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
 
 #define BYTES_PER_KBIT (1000LL / 8)
 
+static int sja1105_find_cbs_shaper(struct sja1105_private *priv,
+				   int port, int prio)
+{
+	int i;
+
+	for (i = 0; i < priv->info->num_cbs_shapers; i++)
+		if (priv->cbs[i].port == port && priv->cbs[i].prio == prio)
+			return i;
+
+	return -1;
+}
+
 static int sja1105_find_unused_cbs_shaper(struct sja1105_private *priv)
 {
 	int i;
@@ -2163,9 +2175,14 @@ static int sja1105_setup_tc_cbs(struct dsa_switch *ds, int port,
 	if (!offload->enable)
 		return sja1105_delete_cbs_shaper(priv, port, offload->queue);
 
-	index = sja1105_find_unused_cbs_shaper(priv);
-	if (index < 0)
-		return -ENOSPC;
+	/* The user may be replacing an existing shaper */
+	index = sja1105_find_cbs_shaper(priv, port, offload->queue);
+	if (index < 0) {
+		/* That isn't the case - see if we can allocate a new one */
+		index = sja1105_find_unused_cbs_shaper(priv);
+		if (index < 0)
+			return -ENOSPC;
+	}
 
 	cbs = &priv->cbs[index];
 	cbs->port = port;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 123/219] net: dsa: sja1105: complete tc-cbs offload support on SJA1110
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (121 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 122/219] net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 124/219] bpf: Remove prog->active check for bpf_lsm and bpf_iter Greg Kroah-Hartman
                   ` (106 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vladimir Oltean, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 180a7419fe4adc8d9c8e0ef0fd17bcdd0cf78acd ]

The blamed commit left this delta behind:

  struct sja1105_cbs_entry {
 -	u64 port;
 -	u64 prio;
 +	u64 port; /* Not used for SJA1110 */
 +	u64 prio; /* Not used for SJA1110 */
  	u64 credit_hi;
  	u64 credit_lo;
  	u64 send_slope;
  	u64 idle_slope;
  };

but did not actually implement tc-cbs offload fully for the new switch.
The offload is accepted, but it doesn't work.

The difference compared to earlier switch generations is that now, the
table of CBS shapers is sparse, because there are many more shapers, so
the mapping between a {port, prio} and a table index is static, rather
than requiring us to store the port and prio into the sja1105_cbs_entry.

So, the problem is that the code programs the CBS shaper parameters at a
dynamic table index which is incorrect.

All that needs to be done for SJA1110 CBS shapers to work is to bypass
the logic which allocates shapers in a dense manner, as for SJA1105, and
use the fixed mapping instead.

Fixes: 3e77e59bf8cf ("net: dsa: sja1105: add support for the SJA1110 switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105.h      |  2 ++
 drivers/net/dsa/sja1105/sja1105_main.c | 13 +++++++++++++
 drivers/net/dsa/sja1105/sja1105_spi.c  |  4 ++++
 3 files changed, 19 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index fb3cd4c78faa8..a831bb0a52074 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -132,6 +132,8 @@ struct sja1105_info {
 	int max_frame_mem;
 	int num_ports;
 	bool multiple_cascade_ports;
+	/* Every {port, TXQ} has its own CBS shaper */
+	bool fixed_cbs_mapping;
 	enum dsa_tag_protocol tag_proto;
 	const struct sja1105_dynamic_table_ops *dyn_ops;
 	const struct sja1105_table_ops *static_ops;
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 9dd5cdcda2843..ff94c5996fafb 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2122,12 +2122,22 @@ static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
 }
 
 #define BYTES_PER_KBIT (1000LL / 8)
+/* Port 0 (the uC port) does not have CBS shapers */
+#define SJA1110_FIXED_CBS(port, prio) ((((port) - 1) * SJA1105_NUM_TC) + (prio))
 
 static int sja1105_find_cbs_shaper(struct sja1105_private *priv,
 				   int port, int prio)
 {
 	int i;
 
+	if (priv->info->fixed_cbs_mapping) {
+		i = SJA1110_FIXED_CBS(port, prio);
+		if (i >= 0 && i < priv->info->num_cbs_shapers)
+			return i;
+
+		return -1;
+	}
+
 	for (i = 0; i < priv->info->num_cbs_shapers; i++)
 		if (priv->cbs[i].port == port && priv->cbs[i].prio == prio)
 			return i;
@@ -2139,6 +2149,9 @@ static int sja1105_find_unused_cbs_shaper(struct sja1105_private *priv)
 {
 	int i;
 
+	if (priv->info->fixed_cbs_mapping)
+		return -1;
+
 	for (i = 0; i < priv->info->num_cbs_shapers; i++)
 		if (!priv->cbs[i].idle_slope && !priv->cbs[i].send_slope)
 			return i;
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
index d3c9ad6d39d46..e6b61aef4127c 100644
--- a/drivers/net/dsa/sja1105/sja1105_spi.c
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -781,6 +781,7 @@ const struct sja1105_info sja1110a_info = {
 	.tag_proto		= DSA_TAG_PROTO_SJA1110,
 	.can_limit_mcast_flood	= true,
 	.multiple_cascade_ports	= true,
+	.fixed_cbs_mapping	= true,
 	.ptp_ts_bits		= 32,
 	.ptpegr_ts_bytes	= 8,
 	.max_frame_mem		= SJA1110_MAX_FRAME_MEMORY,
@@ -831,6 +832,7 @@ const struct sja1105_info sja1110b_info = {
 	.tag_proto		= DSA_TAG_PROTO_SJA1110,
 	.can_limit_mcast_flood	= true,
 	.multiple_cascade_ports	= true,
+	.fixed_cbs_mapping	= true,
 	.ptp_ts_bits		= 32,
 	.ptpegr_ts_bytes	= 8,
 	.max_frame_mem		= SJA1110_MAX_FRAME_MEMORY,
@@ -881,6 +883,7 @@ const struct sja1105_info sja1110c_info = {
 	.tag_proto		= DSA_TAG_PROTO_SJA1110,
 	.can_limit_mcast_flood	= true,
 	.multiple_cascade_ports	= true,
+	.fixed_cbs_mapping	= true,
 	.ptp_ts_bits		= 32,
 	.ptpegr_ts_bytes	= 8,
 	.max_frame_mem		= SJA1110_MAX_FRAME_MEMORY,
@@ -931,6 +934,7 @@ const struct sja1105_info sja1110d_info = {
 	.tag_proto		= DSA_TAG_PROTO_SJA1110,
 	.can_limit_mcast_flood	= true,
 	.multiple_cascade_ports	= true,
+	.fixed_cbs_mapping	= true,
 	.ptp_ts_bits		= 32,
 	.ptpegr_ts_bytes	= 8,
 	.max_frame_mem		= SJA1110_MAX_FRAME_MEMORY,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 124/219] bpf: Remove prog->active check for bpf_lsm and bpf_iter
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (122 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 123/219] net: dsa: sja1105: complete tc-cbs offload support on SJA1110 Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 125/219] bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf() Greg Kroah-Hartman
                   ` (105 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Martin KaFai Lau, Alexei Starovoitov,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Martin KaFai Lau <martin.lau@kernel.org>

[ Upstream commit 271de525e1d7f564e88a9d212c50998b49a54476 ]

The commit 64696c40d03c ("bpf: Add __bpf_prog_{enter,exit}_struct_ops for struct_ops trampoline")
removed prog->active check for struct_ops prog.  The bpf_lsm
and bpf_iter is also using trampoline.  Like struct_ops, the bpf_lsm
and bpf_iter have fixed hooks for the prog to attach.  The
kernel does not call the same hook in a recursive way.
This patch also removes the prog->active check for
bpf_lsm and bpf_iter.

A later patch has a test to reproduce the recursion issue
for a sleepable bpf_lsm program.

This patch appends the '_recur' naming to the existing
enter and exit functions that track the prog->active counter.
New __bpf_prog_{enter,exit}[_sleepable] function are
added to skip the prog->active tracking. The '_struct_ops'
version is also removed.

It also moves the decision on picking the enter and exit function to
the new bpf_trampoline_{enter,exit}().  It returns the '_recur' ones
for all tracing progs to use.  For bpf_lsm, bpf_iter,
struct_ops (no prog->active tracking after 64696c40d03c), and
bpf_lsm_cgroup (no prog->active tracking after 69fd337a975c7),
it will return the functions that don't track the prog->active.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221025184524.3526117-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 7645629f7dc8 ("bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf().")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/arm64/net/bpf_jit_comp.c |  9 +---
 arch/x86/net/bpf_jit_comp.c   | 19 +--------
 include/linux/bpf.h           | 24 +++++------
 include/linux/bpf_verifier.h  | 13 ++++++
 kernel/bpf/syscall.c          |  5 ++-
 kernel/bpf/trampoline.c       | 80 +++++++++++++++++++++++++++++------
 6 files changed, 97 insertions(+), 53 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 14134fd34ff79..0ce5f13eabb1b 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1655,13 +1655,8 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
 	struct bpf_prog *p = l->link.prog;
 	int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
 
-	if (p->aux->sleepable) {
-		enter_prog = (u64)__bpf_prog_enter_sleepable;
-		exit_prog = (u64)__bpf_prog_exit_sleepable;
-	} else {
-		enter_prog = (u64)__bpf_prog_enter;
-		exit_prog = (u64)__bpf_prog_exit;
-	}
+	enter_prog = (u64)bpf_trampoline_enter(p);
+	exit_prog = (u64)bpf_trampoline_exit(p);
 
 	if (l->cookie == 0) {
 		/* if cookie is zero, one instruction is enough to store it */
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index db6053a22e866..5e680e039d0e1 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1813,10 +1813,6 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 			   struct bpf_tramp_link *l, int stack_size,
 			   int run_ctx_off, bool save_ret)
 {
-	void (*exit)(struct bpf_prog *prog, u64 start,
-		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
-	u64 (*enter)(struct bpf_prog *prog,
-		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
 	u8 *prog = *pprog;
 	u8 *jmp_insn;
 	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
@@ -1835,23 +1831,12 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	 */
 	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off);
 
-	if (p->aux->sleepable) {
-		enter = __bpf_prog_enter_sleepable;
-		exit = __bpf_prog_exit_sleepable;
-	} else if (p->type == BPF_PROG_TYPE_STRUCT_OPS) {
-		enter = __bpf_prog_enter_struct_ops;
-		exit = __bpf_prog_exit_struct_ops;
-	} else if (p->expected_attach_type == BPF_LSM_CGROUP) {
-		enter = __bpf_prog_enter_lsm_cgroup;
-		exit = __bpf_prog_exit_lsm_cgroup;
-	}
-
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
 	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
 
-	if (emit_call(&prog, enter, prog))
+	if (emit_call(&prog, bpf_trampoline_enter(p), prog))
 		return -EINVAL;
 	/* remember prog start time returned by __bpf_prog_enter */
 	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
@@ -1896,7 +1881,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
 	/* arg3: lea rdx, [rbp - run_ctx_off] */
 	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
-	if (emit_call(&prog, exit, prog))
+	if (emit_call(&prog, bpf_trampoline_exit(p), prog))
 		return -EINVAL;
 
 	*pprog = prog;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8cef9ec3a89c2..b3d3aa8437dce 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -862,22 +862,18 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *i
 				const struct btf_func_model *m, u32 flags,
 				struct bpf_tramp_links *tlinks,
 				void *orig_call);
-/* these two functions are called from generated trampoline */
-u64 notrace __bpf_prog_enter(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_run_ctx *run_ctx);
-u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
-				       struct bpf_tramp_run_ctx *run_ctx);
-u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
-					struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
-					struct bpf_tramp_run_ctx *run_ctx);
-u64 notrace __bpf_prog_enter_struct_ops(struct bpf_prog *prog,
-					struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit_struct_ops(struct bpf_prog *prog, u64 start,
-					struct bpf_tramp_run_ctx *run_ctx);
+u64 notrace __bpf_prog_enter_sleepable_recur(struct bpf_prog *prog,
+					     struct bpf_tramp_run_ctx *run_ctx);
+void notrace __bpf_prog_exit_sleepable_recur(struct bpf_prog *prog, u64 start,
+					     struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
+typedef u64 (*bpf_trampoline_enter_t)(struct bpf_prog *prog,
+				      struct bpf_tramp_run_ctx *run_ctx);
+typedef void (*bpf_trampoline_exit_t)(struct bpf_prog *prog, u64 start,
+				      struct bpf_tramp_run_ctx *run_ctx);
+bpf_trampoline_enter_t bpf_trampoline_enter(const struct bpf_prog *prog);
+bpf_trampoline_exit_t bpf_trampoline_exit(const struct bpf_prog *prog);
 
 struct bpf_ksym {
 	unsigned long		 start;
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 0eb8f035b3d9f..1a32baa78ce26 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -648,4 +648,17 @@ static inline enum bpf_prog_type resolve_prog_type(const struct bpf_prog *prog)
 		prog->aux->dst_prog->type : prog->type;
 }
 
+static inline bool bpf_prog_check_recur(const struct bpf_prog *prog)
+{
+	switch (resolve_prog_type(prog)) {
+	case BPF_PROG_TYPE_TRACING:
+		return prog->expected_attach_type != BPF_TRACE_ITER;
+	case BPF_PROG_TYPE_STRUCT_OPS:
+	case BPF_PROG_TYPE_LSM:
+		return false;
+	default:
+		return true;
+	}
+}
+
 #endif /* _LINUX_BPF_VERIFIER_H */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0c44a716f0a24..7afec961c5728 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5136,13 +5136,14 @@ int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
 
 		run_ctx.bpf_cookie = 0;
 		run_ctx.saved_run_ctx = NULL;
-		if (!__bpf_prog_enter_sleepable(prog, &run_ctx)) {
+		if (!__bpf_prog_enter_sleepable_recur(prog, &run_ctx)) {
 			/* recursion detected */
 			bpf_prog_put(prog);
 			return -EBUSY;
 		}
 		attr->test.retval = bpf_prog_run(prog, (void *) (long) attr->test.ctx_in);
-		__bpf_prog_exit_sleepable(prog, 0 /* bpf_prog_run does runtime stats */, &run_ctx);
+		__bpf_prog_exit_sleepable_recur(prog, 0 /* bpf_prog_run does runtime stats */,
+						&run_ctx);
 		bpf_prog_put(prog);
 		return 0;
 #endif
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 30af8f66e17b4..88841e352dcdf 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -874,7 +874,7 @@ static __always_inline u64 notrace bpf_prog_start_time(void)
  * [2..MAX_U64] - execute bpf prog and record execution time.
  *     This is start time.
  */
-u64 notrace __bpf_prog_enter(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx)
+static u64 notrace __bpf_prog_enter_recur(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx)
 	__acquires(RCU)
 {
 	rcu_read_lock();
@@ -911,7 +911,8 @@ static void notrace update_prog_stats(struct bpf_prog *prog,
 	}
 }
 
-void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_run_ctx *run_ctx)
+static void notrace __bpf_prog_exit_recur(struct bpf_prog *prog, u64 start,
+					  struct bpf_tramp_run_ctx *run_ctx)
 	__releases(RCU)
 {
 	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
@@ -922,8 +923,8 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
 	rcu_read_unlock();
 }
 
-u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
-					struct bpf_tramp_run_ctx *run_ctx)
+static u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
+					       struct bpf_tramp_run_ctx *run_ctx)
 	__acquires(RCU)
 {
 	/* Runtime stats are exported via actual BPF_LSM_CGROUP
@@ -937,8 +938,8 @@ u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
 	return NO_START_TIME;
 }
 
-void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
-					struct bpf_tramp_run_ctx *run_ctx)
+static void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
+					       struct bpf_tramp_run_ctx *run_ctx)
 	__releases(RCU)
 {
 	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
@@ -947,7 +948,8 @@ void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
 	rcu_read_unlock();
 }
 
-u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx)
+u64 notrace __bpf_prog_enter_sleepable_recur(struct bpf_prog *prog,
+					     struct bpf_tramp_run_ctx *run_ctx)
 {
 	rcu_read_lock_trace();
 	migrate_disable();
@@ -963,8 +965,8 @@ u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_r
 	return bpf_prog_start_time();
 }
 
-void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
-				       struct bpf_tramp_run_ctx *run_ctx)
+void notrace __bpf_prog_exit_sleepable_recur(struct bpf_prog *prog, u64 start,
+					     struct bpf_tramp_run_ctx *run_ctx)
 {
 	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
 
@@ -974,8 +976,30 @@ void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
 	rcu_read_unlock_trace();
 }
 
-u64 notrace __bpf_prog_enter_struct_ops(struct bpf_prog *prog,
-					struct bpf_tramp_run_ctx *run_ctx)
+static u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog,
+					      struct bpf_tramp_run_ctx *run_ctx)
+{
+	rcu_read_lock_trace();
+	migrate_disable();
+	might_fault();
+
+	run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx);
+
+	return bpf_prog_start_time();
+}
+
+static void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
+					      struct bpf_tramp_run_ctx *run_ctx)
+{
+	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
+
+	update_prog_stats(prog, start);
+	migrate_enable();
+	rcu_read_unlock_trace();
+}
+
+static u64 notrace __bpf_prog_enter(struct bpf_prog *prog,
+				    struct bpf_tramp_run_ctx *run_ctx)
 	__acquires(RCU)
 {
 	rcu_read_lock();
@@ -986,8 +1010,8 @@ u64 notrace __bpf_prog_enter_struct_ops(struct bpf_prog *prog,
 	return bpf_prog_start_time();
 }
 
-void notrace __bpf_prog_exit_struct_ops(struct bpf_prog *prog, u64 start,
-					struct bpf_tramp_run_ctx *run_ctx)
+static void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start,
+				    struct bpf_tramp_run_ctx *run_ctx)
 	__releases(RCU)
 {
 	bpf_reset_run_ctx(run_ctx->saved_run_ctx);
@@ -1007,6 +1031,36 @@ void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
 	percpu_ref_put(&tr->pcref);
 }
 
+bpf_trampoline_enter_t bpf_trampoline_enter(const struct bpf_prog *prog)
+{
+	bool sleepable = prog->aux->sleepable;
+
+	if (bpf_prog_check_recur(prog))
+		return sleepable ? __bpf_prog_enter_sleepable_recur :
+			__bpf_prog_enter_recur;
+
+	if (resolve_prog_type(prog) == BPF_PROG_TYPE_LSM &&
+	    prog->expected_attach_type == BPF_LSM_CGROUP)
+		return __bpf_prog_enter_lsm_cgroup;
+
+	return sleepable ? __bpf_prog_enter_sleepable : __bpf_prog_enter;
+}
+
+bpf_trampoline_exit_t bpf_trampoline_exit(const struct bpf_prog *prog)
+{
+	bool sleepable = prog->aux->sleepable;
+
+	if (bpf_prog_check_recur(prog))
+		return sleepable ? __bpf_prog_exit_sleepable_recur :
+			__bpf_prog_exit_recur;
+
+	if (resolve_prog_type(prog) == BPF_PROG_TYPE_LSM &&
+	    prog->expected_attach_type == BPF_LSM_CGROUP)
+		return __bpf_prog_exit_lsm_cgroup;
+
+	return sleepable ? __bpf_prog_exit_sleepable : __bpf_prog_exit;
+}
+
 int __weak
 arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
 			    const struct btf_func_model *m, u32 flags,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 125/219] bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf().
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (123 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 124/219] bpf: Remove prog->active check for bpf_lsm and bpf_iter Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 126/219] bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check Greg Kroah-Hartman
                   ` (104 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Sebastian Andrzej Siewior,
	Daniel Borkmann, Jiri Olsa, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

[ Upstream commit 7645629f7dc88cd777f98970134bf1a54c8d77e3 ]

If __bpf_prog_enter_sleepable_recur() detects recursion then it returns
0 without undoing rcu_read_lock_trace(), migrate_disable() or
decrementing the recursion counter. This is fine in the JIT case because
the JIT code will jump in the 0 case to the end and invoke the matching
exit trampoline (__bpf_prog_exit_sleepable_recur()).

This is not the case in kern_sys_bpf() which returns directly to the
caller with an error code.

Add __bpf_prog_exit_sleepable_recur() as clean up in the recursion case.

Fixes: b1d18a7574d0d ("bpf: Extend sys_bpf commands for bpf_syscall programs.")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230830080405.251926-2-bigeasy@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/bpf/syscall.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 7afec961c5728..76484137233a3 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5138,6 +5138,7 @@ int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
 		run_ctx.saved_run_ctx = NULL;
 		if (!__bpf_prog_enter_sleepable_recur(prog, &run_ctx)) {
 			/* recursion detected */
+			__bpf_prog_exit_sleepable_recur(prog, 0, &run_ctx);
 			bpf_prog_put(prog);
 			return -EBUSY;
 		}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 126/219] bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (124 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 125/219] bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 127/219] netfilter: nftables: exthdr: fix 4-byte stack OOB write Greg Kroah-Hartman
                   ` (103 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Sebastian Andrzej Siewior,
	Daniel Borkmann, Jiri Olsa, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

[ Upstream commit 6764e767f4af1e35f87f3497e1182d945de37f93 ]

__bpf_prog_enter_recur() assigns bpf_tramp_run_ctx::saved_run_ctx before
performing the recursion check which means in case of a recursion
__bpf_prog_exit_recur() uses the previously set bpf_tramp_run_ctx::saved_run_ctx
value.

__bpf_prog_enter_sleepable_recur() assigns bpf_tramp_run_ctx::saved_run_ctx
after the recursion check which means in case of a recursion
__bpf_prog_exit_sleepable_recur() uses an uninitialized value. This does not
look right. If I read the entry trampoline code right, then bpf_tramp_run_ctx
isn't initialized upfront.

Align __bpf_prog_enter_sleepable_recur() with __bpf_prog_enter_recur() and
set bpf_tramp_run_ctx::saved_run_ctx before the recursion check is made.
Remove the assignment of saved_run_ctx in kern_sys_bpf() since it happens
a few cycles later.

Fixes: e384c7b7b46d0 ("bpf, x86: Create bpf_tramp_run_ctx on the caller thread's stack")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230830080405.251926-3-bigeasy@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/bpf/syscall.c    | 1 -
 kernel/bpf/trampoline.c | 5 ++---
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 76484137233a3..0c8b7733573ee 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5135,7 +5135,6 @@ int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
 		}
 
 		run_ctx.bpf_cookie = 0;
-		run_ctx.saved_run_ctx = NULL;
 		if (!__bpf_prog_enter_sleepable_recur(prog, &run_ctx)) {
 			/* recursion detected */
 			__bpf_prog_exit_sleepable_recur(prog, 0, &run_ctx);
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 88841e352dcdf..c4381dfcd6b09 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -955,13 +955,12 @@ u64 notrace __bpf_prog_enter_sleepable_recur(struct bpf_prog *prog,
 	migrate_disable();
 	might_fault();
 
+	run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx);
+
 	if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) {
 		bpf_prog_inc_misses_counter(prog);
 		return 0;
 	}
-
-	run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx);
-
 	return bpf_prog_start_time();
 }
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 127/219] netfilter: nftables: exthdr: fix 4-byte stack OOB write
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (125 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 126/219] bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 128/219] netfilter: nfnetlink_osf: avoid OOB read Greg Kroah-Hartman
                   ` (102 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Florian Westphal, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

[ Upstream commit fd94d9dadee58e09b49075240fe83423eb1dcd36 ]

If priv->len is a multiple of 4, then dst[len / 4] can write past
the destination array which leads to stack corruption.

This construct is necessary to clean the remainder of the register
in case ->len is NOT a multiple of the register size, so make it
conditional just like nft_payload.c does.

The bug was added in 4.1 cycle and then copied/inherited when
tcp/sctp and ip option support was added.

Bug reported by Zero Day Initiative project (ZDI-CAN-21950,
ZDI-CAN-21951, ZDI-CAN-21961).

Fixes: 49499c3e6e18 ("netfilter: nf_tables: switch registers to 32 bit addressing")
Fixes: 935b7f643018 ("netfilter: nft_exthdr: add TCP option matching")
Fixes: 133dc203d77d ("netfilter: nft_exthdr: Support SCTP chunks")
Fixes: dbb5281a1f84 ("netfilter: nf_tables: add support for matching IPv4 options")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_exthdr.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index c307c57a93e57..efb50c2b41f32 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -35,6 +35,14 @@ static unsigned int optlen(const u8 *opt, unsigned int offset)
 		return opt[offset + 1];
 }
 
+static int nft_skb_copy_to_reg(const struct sk_buff *skb, int offset, u32 *dest, unsigned int len)
+{
+	if (len % NFT_REG32_SIZE)
+		dest[len / NFT_REG32_SIZE] = 0;
+
+	return skb_copy_bits(skb, offset, dest, len);
+}
+
 static void nft_exthdr_ipv6_eval(const struct nft_expr *expr,
 				 struct nft_regs *regs,
 				 const struct nft_pktinfo *pkt)
@@ -56,8 +64,7 @@ static void nft_exthdr_ipv6_eval(const struct nft_expr *expr,
 	}
 	offset += priv->offset;
 
-	dest[priv->len / NFT_REG32_SIZE] = 0;
-	if (skb_copy_bits(pkt->skb, offset, dest, priv->len) < 0)
+	if (nft_skb_copy_to_reg(pkt->skb, offset, dest, priv->len) < 0)
 		goto err;
 	return;
 err:
@@ -153,8 +160,7 @@ static void nft_exthdr_ipv4_eval(const struct nft_expr *expr,
 	}
 	offset += priv->offset;
 
-	dest[priv->len / NFT_REG32_SIZE] = 0;
-	if (skb_copy_bits(pkt->skb, offset, dest, priv->len) < 0)
+	if (nft_skb_copy_to_reg(pkt->skb, offset, dest, priv->len) < 0)
 		goto err;
 	return;
 err:
@@ -210,7 +216,8 @@ static void nft_exthdr_tcp_eval(const struct nft_expr *expr,
 		if (priv->flags & NFT_EXTHDR_F_PRESENT) {
 			*dest = 1;
 		} else {
-			dest[priv->len / NFT_REG32_SIZE] = 0;
+			if (priv->len % NFT_REG32_SIZE)
+				dest[priv->len / NFT_REG32_SIZE] = 0;
 			memcpy(dest, opt + offset, priv->len);
 		}
 
@@ -388,9 +395,8 @@ static void nft_exthdr_sctp_eval(const struct nft_expr *expr,
 			    offset + ntohs(sch->length) > pkt->skb->len)
 				break;
 
-			dest[priv->len / NFT_REG32_SIZE] = 0;
-			if (skb_copy_bits(pkt->skb, offset + priv->offset,
-					  dest, priv->len) < 0)
+			if (nft_skb_copy_to_reg(pkt->skb, offset + priv->offset,
+						dest, priv->len) < 0)
 				break;
 			return;
 		}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 128/219] netfilter: nfnetlink_osf: avoid OOB read
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (126 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 127/219] netfilter: nftables: exthdr: fix 4-byte stack OOB write Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 129/219] net: hns3: fix tx timeout issue Greg Kroah-Hartman
                   ` (101 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Lucas Leong, Wander Lairson Costa,
	Florian Westphal, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wander Lairson Costa <wander@redhat.com>

[ Upstream commit f4f8a7803119005e87b716874bec07c751efafec ]

The opt_num field is controlled by user mode and is not currently
validated inside the kernel. An attacker can take advantage of this to
trigger an OOB read and potentially leak information.

BUG: KASAN: slab-out-of-bounds in nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
Read of size 2 at addr ffff88804bc64272 by task poc/6431

CPU: 1 PID: 6431 Comm: poc Not tainted 6.0.0-rc4 #1
Call Trace:
 nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
 nf_osf_find+0x186/0x2f0 net/netfilter/nfnetlink_osf.c:281
 nft_osf_eval+0x37f/0x590 net/netfilter/nft_osf.c:47
 expr_call_ops_eval net/netfilter/nf_tables_core.c:214
 nft_do_chain+0x2b0/0x1490 net/netfilter/nf_tables_core.c:264
 nft_do_chain_ipv4+0x17c/0x1f0 net/netfilter/nft_chain_filter.c:23
 [..]

Also add validation to genre, subtype and version fields.

Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Lucas Leong <wmliang@infosec.exchange>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nfnetlink_osf.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/netfilter/nfnetlink_osf.c b/net/netfilter/nfnetlink_osf.c
index 8f1bfa6ccc2d9..50723ba082890 100644
--- a/net/netfilter/nfnetlink_osf.c
+++ b/net/netfilter/nfnetlink_osf.c
@@ -315,6 +315,14 @@ static int nfnl_osf_add_callback(struct sk_buff *skb,
 
 	f = nla_data(osf_attrs[OSF_ATTR_FINGER]);
 
+	if (f->opt_num > ARRAY_SIZE(f->opt))
+		return -EINVAL;
+
+	if (!memchr(f->genre, 0, MAXGENRELEN) ||
+	    !memchr(f->subtype, 0, MAXGENRELEN) ||
+	    !memchr(f->version, 0, MAXGENRELEN))
+		return -EINVAL;
+
 	kf = kmalloc(sizeof(struct nf_osf_finger), GFP_KERNEL);
 	if (!kf)
 		return -ENOMEM;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 129/219] net: hns3: fix tx timeout issue
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (127 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 128/219] netfilter: nfnetlink_osf: avoid OOB read Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 130/219] net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() Greg Kroah-Hartman
                   ` (100 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jian Shen, Jijie Shao, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jian Shen <shenjian15@huawei.com>

[ Upstream commit 61a1deacc3d4fd3d57d7fda4d935f7f7503e8440 ]

Currently, the driver knocks the ring doorbell before updating
the ring->last_to_use in tx flow. if the hardware transmiting
packet and napi poll scheduling are fast enough, it may get
the old ring->last_to_use in drivers' napi poll.
In this case, the driver will think the tx is not completed, and
return directly without clear the flag __QUEUE_STATE_STACK_XOFF,
which may cause tx timeout.

Fixes: 20d06ca2679c ("net: hns3: optimize the tx clean process")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 61f833d61f583..9942b21cd6193 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2102,8 +2102,12 @@ static void hns3_tx_doorbell(struct hns3_enet_ring *ring, int num,
 	 */
 	if (test_bit(HNS3_NIC_STATE_TX_PUSH_ENABLE, &priv->state) && num &&
 	    !ring->pending_buf && num <= HNS3_MAX_PUSH_BD_NUM && doorbell) {
+		/* This smp_store_release() pairs with smp_load_aquire() in
+		 * hns3_nic_reclaim_desc(). Ensure that the BD valid bit
+		 * is updated.
+		 */
+		smp_store_release(&ring->last_to_use, ring->next_to_use);
 		hns3_tx_push_bd(ring, num);
-		WRITE_ONCE(ring->last_to_use, ring->next_to_use);
 		return;
 	}
 
@@ -2114,6 +2118,11 @@ static void hns3_tx_doorbell(struct hns3_enet_ring *ring, int num,
 		return;
 	}
 
+	/* This smp_store_release() pairs with smp_load_aquire() in
+	 * hns3_nic_reclaim_desc(). Ensure that the BD valid bit is updated.
+	 */
+	smp_store_release(&ring->last_to_use, ring->next_to_use);
+
 	if (ring->tqp->mem_base)
 		hns3_tx_mem_doorbell(ring);
 	else
@@ -2121,7 +2130,6 @@ static void hns3_tx_doorbell(struct hns3_enet_ring *ring, int num,
 		       ring->tqp->io_base + HNS3_RING_TX_RING_TAIL_REG);
 
 	ring->pending_buf = 0;
-	WRITE_ONCE(ring->last_to_use, ring->next_to_use);
 }
 
 static void hns3_tsyn(struct net_device *netdev, struct sk_buff *skb,
@@ -3562,9 +3570,8 @@ static void hns3_reuse_buffer(struct hns3_enet_ring *ring, int i)
 static bool hns3_nic_reclaim_desc(struct hns3_enet_ring *ring,
 				  int *bytes, int *pkts, int budget)
 {
-	/* pair with ring->last_to_use update in hns3_tx_doorbell(),
-	 * smp_store_release() is not used in hns3_tx_doorbell() because
-	 * the doorbell operation already have the needed barrier operation.
+	/* This smp_load_acquire() pairs with smp_store_release() in
+	 * hns3_tx_doorbell().
 	 */
 	int ltu = smp_load_acquire(&ring->last_to_use);
 	int ntc = ring->next_to_clean;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 130/219] net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (128 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 129/219] net: hns3: fix tx timeout issue Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 131/219] net: hns3: fix debugfs concurrency issue between kfree buffer and read Greg Kroah-Hartman
                   ` (99 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Hao Chen, Jijie Shao, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hao Chen <chenhao418@huawei.com>

[ Upstream commit efccf655e99b6907ca07a466924e91805892e7d3 ]

req1->tcam_data is defined as "u8 tcam_data[8]", and we convert it as
(u32 *) without considerring byte order conversion,
it may result in printing wrong data for tcam_data.

Convert tcam_data to (__le32 *) first to fix it.

Fixes: b5a0b70d77b9 ("net: hns3: refactor dump fd tcam of debugfs")
Signed-off-by: Hao Chen <chenhao418@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
index 5cb8f1818e51c..a1c59f4aae988 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
@@ -1517,7 +1517,7 @@ static int hclge_dbg_fd_tcam_read(struct hclge_dev *hdev, bool sel_x,
 	struct hclge_desc desc[3];
 	int pos = 0;
 	int ret, i;
-	u32 *req;
+	__le32 *req;
 
 	hclge_cmd_setup_basic_desc(&desc[0], HCLGE_OPC_FD_TCAM_OP, true);
 	desc[0].flag |= cpu_to_le16(HCLGE_COMM_CMD_FLAG_NEXT);
@@ -1542,22 +1542,22 @@ static int hclge_dbg_fd_tcam_read(struct hclge_dev *hdev, bool sel_x,
 			 tcam_msg.loc);
 
 	/* tcam_data0 ~ tcam_data1 */
-	req = (u32 *)req1->tcam_data;
+	req = (__le32 *)req1->tcam_data;
 	for (i = 0; i < 2; i++)
 		pos += scnprintf(tcam_buf + pos, HCLGE_DBG_TCAM_BUF_SIZE - pos,
-				 "%08x\n", *req++);
+				 "%08x\n", le32_to_cpu(*req++));
 
 	/* tcam_data2 ~ tcam_data7 */
-	req = (u32 *)req2->tcam_data;
+	req = (__le32 *)req2->tcam_data;
 	for (i = 0; i < 6; i++)
 		pos += scnprintf(tcam_buf + pos, HCLGE_DBG_TCAM_BUF_SIZE - pos,
-				 "%08x\n", *req++);
+				 "%08x\n", le32_to_cpu(*req++));
 
 	/* tcam_data8 ~ tcam_data12 */
-	req = (u32 *)req3->tcam_data;
+	req = (__le32 *)req3->tcam_data;
 	for (i = 0; i < 5; i++)
 		pos += scnprintf(tcam_buf + pos, HCLGE_DBG_TCAM_BUF_SIZE - pos,
-				 "%08x\n", *req++);
+				 "%08x\n", le32_to_cpu(*req++));
 
 	return ret;
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 131/219] net: hns3: fix debugfs concurrency issue between kfree buffer and read
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (129 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 130/219] net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 132/219] net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue Greg Kroah-Hartman
                   ` (98 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Hao Chen, Jijie Shao, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hao Chen <chenhao418@huawei.com>

[ Upstream commit c295160b1d95e885f1af4586a221cb221d232d10 ]

Now in hns3_dbg_uninit(), there may be concurrency between
kfree buffer and read, it may result in memory error.

Moving debugfs_remove_recursive() in front of kfree buffer to ensure
they don't happen at the same time.

Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
Signed-off-by: Hao Chen <chenhao418@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
index 69d1549e63a98..00eed9835cb55 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
@@ -1406,9 +1406,9 @@ int hns3_dbg_init(struct hnae3_handle *handle)
 	return 0;
 
 out:
-	mutex_destroy(&handle->dbgfs_lock);
 	debugfs_remove_recursive(handle->hnae3_dbgfs);
 	handle->hnae3_dbgfs = NULL;
+	mutex_destroy(&handle->dbgfs_lock);
 	return ret;
 }
 
@@ -1416,6 +1416,9 @@ void hns3_dbg_uninit(struct hnae3_handle *handle)
 {
 	u32 i;
 
+	debugfs_remove_recursive(handle->hnae3_dbgfs);
+	handle->hnae3_dbgfs = NULL;
+
 	for (i = 0; i < ARRAY_SIZE(hns3_dbg_cmd); i++)
 		if (handle->dbgfs_buf[i]) {
 			kvfree(handle->dbgfs_buf[i]);
@@ -1423,8 +1426,6 @@ void hns3_dbg_uninit(struct hnae3_handle *handle)
 		}
 
 	mutex_destroy(&handle->dbgfs_lock);
-	debugfs_remove_recursive(handle->hnae3_dbgfs);
-	handle->hnae3_dbgfs = NULL;
 }
 
 void hns3_dbg_register_debugfs(const char *debugfs_dir_name)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 132/219] net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (130 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 131/219] net: hns3: fix debugfs concurrency issue between kfree buffer and read Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 133/219] net: hns3: fix the port information display when sfp is absent Greg Kroah-Hartman
                   ` (97 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Jijie Shao, Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jijie Shao <shaojijie@huawei.com>

[ Upstream commit fa5564945f7d15ae2390b00c08b6abaef0165cda ]

We hope that tc qdisc and dcb ets commands can not be used crosswise.
If we want to use any of the commands to configure tc,
We must use the other command to clear the existing configuration.

However, when we configure a single tc with tc qdisc,
we can still configure it with dcb ets.
Because we use mqprio_active as the tag of tc qdisc configuration,
but with dcb ets, we do not check mqprio_active.

This patch fix this issue by check mqprio_active before
executing the dcb ets command. and add dcb_ets_active to
replace HCLGE_FLAG_DCB_ENABLE and HCLGE_FLAG_MQPRIO_ENABLE
at the hclge layer,

Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h   |  1 +
 .../hisilicon/hns3/hns3pf/hclge_dcb.c         | 20 +++++--------------
 .../hisilicon/hns3/hns3pf/hclge_main.c        |  5 +++--
 .../hisilicon/hns3/hns3pf/hclge_main.h        |  2 --
 4 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index fcb8b6dc5ab92..c693bb701ba3e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -797,6 +797,7 @@ struct hnae3_tc_info {
 	u8 max_tc; /* Total number of TCs */
 	u8 num_tc; /* Total number of enabled TCs */
 	bool mqprio_active;
+	bool dcb_ets_active;
 };
 
 #define HNAE3_MAX_DSCP			64
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
index 09362823140d5..2740f0d703e4f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
@@ -251,7 +251,7 @@ static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets)
 	int ret;
 
 	if (!(hdev->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) ||
-	    hdev->flag & HCLGE_FLAG_MQPRIO_ENABLE)
+	    h->kinfo.tc_info.mqprio_active)
 		return -EINVAL;
 
 	ret = hclge_ets_validate(hdev, ets, &num_tc, &map_changed);
@@ -267,10 +267,7 @@ static int hclge_ieee_setets(struct hnae3_handle *h, struct ieee_ets *ets)
 	}
 
 	hclge_tm_schd_info_update(hdev, num_tc);
-	if (num_tc > 1)
-		hdev->flag |= HCLGE_FLAG_DCB_ENABLE;
-	else
-		hdev->flag &= ~HCLGE_FLAG_DCB_ENABLE;
+	h->kinfo.tc_info.dcb_ets_active = num_tc > 1;
 
 	ret = hclge_ieee_ets_to_tm_info(hdev, ets);
 	if (ret)
@@ -463,7 +460,7 @@ static u8 hclge_getdcbx(struct hnae3_handle *h)
 	struct hclge_vport *vport = hclge_get_vport(h);
 	struct hclge_dev *hdev = vport->back;
 
-	if (hdev->flag & HCLGE_FLAG_MQPRIO_ENABLE)
+	if (h->kinfo.tc_info.mqprio_active)
 		return 0;
 
 	return hdev->dcbx_cap;
@@ -587,7 +584,8 @@ static int hclge_setup_tc(struct hnae3_handle *h,
 	if (!test_bit(HCLGE_STATE_NIC_REGISTERED, &hdev->state))
 		return -EBUSY;
 
-	if (hdev->flag & HCLGE_FLAG_DCB_ENABLE)
+	kinfo = &vport->nic.kinfo;
+	if (kinfo->tc_info.dcb_ets_active)
 		return -EINVAL;
 
 	ret = hclge_mqprio_qopt_check(hdev, mqprio_qopt);
@@ -601,7 +599,6 @@ static int hclge_setup_tc(struct hnae3_handle *h,
 	if (ret)
 		return ret;
 
-	kinfo = &vport->nic.kinfo;
 	memcpy(&old_tc_info, &kinfo->tc_info, sizeof(old_tc_info));
 	hclge_sync_mqprio_qopt(&kinfo->tc_info, mqprio_qopt);
 	kinfo->tc_info.mqprio_active = tc > 0;
@@ -610,13 +607,6 @@ static int hclge_setup_tc(struct hnae3_handle *h,
 	if (ret)
 		goto err_out;
 
-	hdev->flag &= ~HCLGE_FLAG_DCB_ENABLE;
-
-	if (tc > 1)
-		hdev->flag |= HCLGE_FLAG_MQPRIO_ENABLE;
-	else
-		hdev->flag &= ~HCLGE_FLAG_MQPRIO_ENABLE;
-
 	return hclge_notify_init_up(hdev);
 
 err_out:
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 84ecd8b9be48c..884e45fb6b72e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -11132,6 +11132,7 @@ static void hclge_get_mdix_mode(struct hnae3_handle *handle,
 
 static void hclge_info_show(struct hclge_dev *hdev)
 {
+	struct hnae3_handle *handle = &hdev->vport->nic;
 	struct device *dev = &hdev->pdev->dev;
 
 	dev_info(dev, "PF info begin:\n");
@@ -11148,9 +11149,9 @@ static void hclge_info_show(struct hclge_dev *hdev)
 	dev_info(dev, "This is %s PF\n",
 		 hdev->flag & HCLGE_FLAG_MAIN ? "main" : "not main");
 	dev_info(dev, "DCB %s\n",
-		 hdev->flag & HCLGE_FLAG_DCB_ENABLE ? "enable" : "disable");
+		 handle->kinfo.tc_info.dcb_ets_active ? "enable" : "disable");
 	dev_info(dev, "MQPRIO %s\n",
-		 hdev->flag & HCLGE_FLAG_MQPRIO_ENABLE ? "enable" : "disable");
+		 handle->kinfo.tc_info.mqprio_active ? "enable" : "disable");
 	dev_info(dev, "Default tx spare buffer size: %u\n",
 		 hdev->tx_spare_buf_size);
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 13f23d606e77b..f6fef790e16c1 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -916,8 +916,6 @@ struct hclge_dev {
 
 #define HCLGE_FLAG_MAIN			BIT(0)
 #define HCLGE_FLAG_DCB_CAPABLE		BIT(1)
-#define HCLGE_FLAG_DCB_ENABLE		BIT(2)
-#define HCLGE_FLAG_MQPRIO_ENABLE	BIT(3)
 	u32 flag;
 
 	u32 pkt_buf_size; /* Total pf buf size for tx/rx */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 133/219] net: hns3: fix the port information display when sfp is absent
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (131 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 132/219] net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 134/219] net: hns3: remove GSO partial feature bit Greg Kroah-Hartman
                   ` (96 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yisen Zhuang, Jijie Shao,
	Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yisen Zhuang <yisen.zhuang@huawei.com>

[ Upstream commit 674d9591a32d01df75d6b5fffed4ef942a294376 ]

When sfp is absent or unidentified, the port type should be
displayed as PORT_OTHERS, rather than PORT_FIBRE.

Fixes: 88d10bd6f730 ("net: hns3: add support for multiple media type")
Signed-off-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index cdf76fb58d45e..e22835ae8a941 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -776,7 +776,9 @@ static int hns3_get_link_ksettings(struct net_device *netdev,
 		hns3_get_ksettings(h, cmd);
 		break;
 	case HNAE3_MEDIA_TYPE_FIBER:
-		if (module_type == HNAE3_MODULE_TYPE_CR)
+		if (module_type == HNAE3_MODULE_TYPE_UNKNOWN)
+			cmd->base.port = PORT_OTHER;
+		else if (module_type == HNAE3_MODULE_TYPE_CR)
 			cmd->base.port = PORT_DA;
 		else
 			cmd->base.port = PORT_FIBRE;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 134/219] net: hns3: remove GSO partial feature bit
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (132 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 133/219] net: hns3: fix the port information display when sfp is absent Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 135/219] sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory() Greg Kroah-Hartman
                   ` (95 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jie Wang, Jijie Shao, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jie Wang <wangjie125@huawei.com>

[ Upstream commit 60326634f6c54528778de18bfef1e8a7a93b3771 ]

HNS3 NIC does not support GSO partial packets segmentation. Actually tunnel
packets for example NvGRE packets segment offload and checksum offload is
already supported. There is no need to keep gso partial feature bit. So
this patch removes it.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 9942b21cd6193..8aae179554a81 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3315,8 +3315,6 @@ static void hns3_set_default_feature(struct net_device *netdev)
 
 	netdev->priv_flags |= IFF_UNICAST_FLT;
 
-	netdev->gso_partial_features |= NETIF_F_GSO_GRE_CSUM;
-
 	netdev->features |= NETIF_F_HW_VLAN_CTAG_FILTER |
 		NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
 		NETIF_F_RXCSUM | NETIF_F_SG | NETIF_F_GSO |
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 135/219] sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (133 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 134/219] net: hns3: remove GSO partial feature bit Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 136/219] Multi-gen LRU: avoid race in inc_min_seq() Greg Kroah-Hartman
                   ` (94 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Petr Tesarik, Geert Uytterhoeven,
	Jacopo Mondi, John Paul Adrian Glaubitz, Laurent Pinchart,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Petr Tesarik <petr.tesarik.ext@huawei.com>

[ Upstream commit fb60211f377b69acffead3147578f86d0092a7a5 ]

In all these cases, the last argument to dma_declare_coherent_memory() is
the buffer end address, but the expected value should be the size of the
reserved region.

Fixes: 39fb993038e1 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver")
Fixes: c2f9b05fd5c1 ("media: arch: sh: ecovec: Use new renesas-ceu camera driver")
Fixes: f3590dc32974 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
Fixes: 186c446f4b84 ("media: arch: sh: migor: Use new renesas-ceu camera driver")
Fixes: 1a3c230b4151 ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver")
Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/sh/boards/mach-ap325rxa/setup.c | 2 +-
 arch/sh/boards/mach-ecovec24/setup.c | 6 ++----
 arch/sh/boards/mach-kfr2r09/setup.c  | 2 +-
 arch/sh/boards/mach-migor/setup.c    | 2 +-
 arch/sh/boards/mach-se/7724/setup.c  | 6 ++----
 5 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/sh/boards/mach-ap325rxa/setup.c b/arch/sh/boards/mach-ap325rxa/setup.c
index c77b5f00a66a3..d8f8dca4d7968 100644
--- a/arch/sh/boards/mach-ap325rxa/setup.c
+++ b/arch/sh/boards/mach-ap325rxa/setup.c
@@ -530,7 +530,7 @@ static int __init ap325rxa_devices_setup(void)
 	device_initialize(&ap325rxa_ceu_device.dev);
 	dma_declare_coherent_memory(&ap325rxa_ceu_device.dev,
 			ceu_dma_membase, ceu_dma_membase,
-			ceu_dma_membase + CEU_BUFFER_MEMORY_SIZE - 1);
+			CEU_BUFFER_MEMORY_SIZE);
 
 	platform_device_add(&ap325rxa_ceu_device);
 
diff --git a/arch/sh/boards/mach-ecovec24/setup.c b/arch/sh/boards/mach-ecovec24/setup.c
index 674da7ebd8b7f..7ec03d4a4edf0 100644
--- a/arch/sh/boards/mach-ecovec24/setup.c
+++ b/arch/sh/boards/mach-ecovec24/setup.c
@@ -1454,15 +1454,13 @@ static int __init arch_setup(void)
 	device_initialize(&ecovec_ceu_devices[0]->dev);
 	dma_declare_coherent_memory(&ecovec_ceu_devices[0]->dev,
 				    ceu0_dma_membase, ceu0_dma_membase,
-				    ceu0_dma_membase +
-				    CEU_BUFFER_MEMORY_SIZE - 1);
+				    CEU_BUFFER_MEMORY_SIZE);
 	platform_device_add(ecovec_ceu_devices[0]);
 
 	device_initialize(&ecovec_ceu_devices[1]->dev);
 	dma_declare_coherent_memory(&ecovec_ceu_devices[1]->dev,
 				    ceu1_dma_membase, ceu1_dma_membase,
-				    ceu1_dma_membase +
-				    CEU_BUFFER_MEMORY_SIZE - 1);
+				    CEU_BUFFER_MEMORY_SIZE);
 	platform_device_add(ecovec_ceu_devices[1]);
 
 	gpiod_add_lookup_table(&cn12_power_gpiod_table);
diff --git a/arch/sh/boards/mach-kfr2r09/setup.c b/arch/sh/boards/mach-kfr2r09/setup.c
index 20f4db778ed6a..c6d556dfbbbe6 100644
--- a/arch/sh/boards/mach-kfr2r09/setup.c
+++ b/arch/sh/boards/mach-kfr2r09/setup.c
@@ -603,7 +603,7 @@ static int __init kfr2r09_devices_setup(void)
 	device_initialize(&kfr2r09_ceu_device.dev);
 	dma_declare_coherent_memory(&kfr2r09_ceu_device.dev,
 			ceu_dma_membase, ceu_dma_membase,
-			ceu_dma_membase + CEU_BUFFER_MEMORY_SIZE - 1);
+			CEU_BUFFER_MEMORY_SIZE);
 
 	platform_device_add(&kfr2r09_ceu_device);
 
diff --git a/arch/sh/boards/mach-migor/setup.c b/arch/sh/boards/mach-migor/setup.c
index f60061283c482..773ee767d0c4e 100644
--- a/arch/sh/boards/mach-migor/setup.c
+++ b/arch/sh/boards/mach-migor/setup.c
@@ -604,7 +604,7 @@ static int __init migor_devices_setup(void)
 	device_initialize(&migor_ceu_device.dev);
 	dma_declare_coherent_memory(&migor_ceu_device.dev,
 			ceu_dma_membase, ceu_dma_membase,
-			ceu_dma_membase + CEU_BUFFER_MEMORY_SIZE - 1);
+			CEU_BUFFER_MEMORY_SIZE);
 
 	platform_device_add(&migor_ceu_device);
 
diff --git a/arch/sh/boards/mach-se/7724/setup.c b/arch/sh/boards/mach-se/7724/setup.c
index b60a2626e18b2..6495f93540654 100644
--- a/arch/sh/boards/mach-se/7724/setup.c
+++ b/arch/sh/boards/mach-se/7724/setup.c
@@ -940,15 +940,13 @@ static int __init devices_setup(void)
 	device_initialize(&ms7724se_ceu_devices[0]->dev);
 	dma_declare_coherent_memory(&ms7724se_ceu_devices[0]->dev,
 				    ceu0_dma_membase, ceu0_dma_membase,
-				    ceu0_dma_membase +
-				    CEU_BUFFER_MEMORY_SIZE - 1);
+				    CEU_BUFFER_MEMORY_SIZE);
 	platform_device_add(ms7724se_ceu_devices[0]);
 
 	device_initialize(&ms7724se_ceu_devices[1]->dev);
 	dma_declare_coherent_memory(&ms7724se_ceu_devices[1]->dev,
 				    ceu1_dma_membase, ceu1_dma_membase,
-				    ceu1_dma_membase +
-				    CEU_BUFFER_MEMORY_SIZE - 1);
+				    CEU_BUFFER_MEMORY_SIZE);
 	platform_device_add(ms7724se_ceu_devices[1]);
 
 	return platform_add_devices(ms7724se_devices,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 136/219] Multi-gen LRU: avoid race in inc_min_seq()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (134 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 135/219] sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 137/219] net/mlx5: Free IRQ rmap and notifier on kernel shutdown Greg Kroah-Hartman
                   ` (93 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kalesh Singh, Charan Teja Kalla,
	Yu Zhao, Aneesh Kumar K V, Barry Song, Brian Geffon,
	Jan Alexander Steffens (heftig), Lecopzer Chen, Matthias Brugger,
	Oleksandr Natalenko, Qi Zheng, Steven Barrett, Suleiman Souhlal,
	Suren Baghdasaryan, Andrew Morton, AngeloGioacchino Del Regno

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kalesh Singh <kaleshsingh@google.com>

commit bb5e7f234eacf34b65be67ebb3613e3b8cf11b87 upstream.

inc_max_seq() will try to inc_min_seq() if nr_gens == MAX_NR_GENS. This
is because the generations are reused (the last oldest now empty
generation will become the next youngest generation).

inc_min_seq() is retried until successful, dropping the lru_lock
and yielding the CPU on each failure, and retaking the lock before
trying again:

        while (!inc_min_seq(lruvec, type, can_swap)) {
                spin_unlock_irq(&lruvec->lru_lock);
                cond_resched();
                spin_lock_irq(&lruvec->lru_lock);
        }

However, the initial condition that required incrementing the min_seq
(nr_gens == MAX_NR_GENS) is not retested. This can change by another
call to inc_max_seq() from run_aging() with force_scan=true from the
debugfs interface.

Since the eviction stalls when the nr_gens == MIN_NR_GENS, avoid
unnecessarily incrementing the min_seq by rechecking the number of
generations before each attempt.

This issue was uncovered in previous discussion on the list by Yu Zhao
and Aneesh Kumar [1].

[1] https://lore.kernel.org/linux-mm/CAOUHufbO7CaVm=xjEb1avDhHVvnC8pJmGyKcFf2iY_dpf+zR3w@mail.gmail.com/

Link: https://lkml.kernel.org/r/20230802025606.346758-2-kaleshsingh@google.com
Fixes: d6c3af7d8a2b ("mm: multi-gen LRU: debugfs interface")
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Steven Barrett <steven@liquorix.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/vmscan.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4331,6 +4331,7 @@ static void inc_max_seq(struct lruvec *l
 	int type, zone;
 	struct lru_gen_struct *lrugen = &lruvec->lrugen;
 
+restart:
 	spin_lock_irq(&lruvec->lru_lock);
 
 	VM_WARN_ON_ONCE(!seq_is_valid(lruvec));
@@ -4341,11 +4342,12 @@ static void inc_max_seq(struct lruvec *l
 
 		VM_WARN_ON_ONCE(!force_scan && (type == LRU_GEN_FILE || can_swap));
 
-		while (!inc_min_seq(lruvec, type, can_swap)) {
-			spin_unlock_irq(&lruvec->lru_lock);
-			cond_resched();
-			spin_lock_irq(&lruvec->lru_lock);
-		}
+		if (inc_min_seq(lruvec, type, can_swap))
+			continue;
+
+		spin_unlock_irq(&lruvec->lru_lock);
+		cond_resched();
+		goto restart;
 	}
 
 	/*



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 137/219] net/mlx5: Free IRQ rmap and notifier on kernel shutdown
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (135 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 136/219] Multi-gen LRU: avoid race in inc_min_seq() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 138/219] ARC: atomics: Add compiler barrier to atomic operations Greg Kroah-Hartman
                   ` (92 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Saeed Mahameed, Shay Drory,
	Mathieu Tortuyaux

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Saeed Mahameed <saeedm@nvidia.com>

commit 314ded538e5f22e7610b1bf621402024a180ec80 upstream.

The kernel IRQ system needs the irq affinity notifier to be clear
before attempting to free the irq, see WARN_ON log below.

On a normal driver unload we don't have this issue since we do the
complete cleanup of the irq resources.

To fix this, put the important resources cleanup in a helper function
and use it in both normal driver unload and shutdown flows.

[ 4497.498434] ------------[ cut here ]------------
[ 4497.498726] WARNING: CPU: 0 PID: 9 at kernel/irq/manage.c:2034 free_irq+0x295/0x340
[ 4497.499193] Modules linked in:
[ 4497.499386] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.4.0-rc4+ #10
[ 4497.499876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[ 4497.500518] Workqueue: events do_poweroff
[ 4497.500849] RIP: 0010:free_irq+0x295/0x340
[ 4497.501132] Code: 85 c0 0f 84 1d ff ff ff 48 89 ef ff d0 0f 1f 00 e9 10 ff ff ff 0f 0b e9 72 ff ff ff 49 8d 7f 28 ff d0 0f 1f 00 e9 df fd ff ff <0f> 0b 48 c7 80 c0 008
[ 4497.502269] RSP: 0018:ffffc90000053da0 EFLAGS: 00010282
[ 4497.502589] RAX: ffff888100949600 RBX: ffff88810330b948 RCX: 0000000000000000
[ 4497.503035] RDX: ffff888100949600 RSI: ffff888100400490 RDI: 0000000000000023
[ 4497.503472] RBP: ffff88810330c7e0 R08: ffff8881004005d0 R09: ffffffff8273a260
[ 4497.503923] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881009ae000
[ 4497.504359] R13: ffff8881009ae148 R14: 0000000000000000 R15: ffff888100949600
[ 4497.504804] FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[ 4497.505302] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4497.505671] CR2: 00007fce98806298 CR3: 000000000262e005 CR4: 0000000000370ef0
[ 4497.506104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4497.506540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4497.507002] Call Trace:
[ 4497.507158]  <TASK>
[ 4497.507299]  ? free_irq+0x295/0x340
[ 4497.507522]  ? __warn+0x7c/0x130
[ 4497.507740]  ? free_irq+0x295/0x340
[ 4497.507963]  ? report_bug+0x171/0x1a0
[ 4497.508197]  ? handle_bug+0x3c/0x70
[ 4497.508417]  ? exc_invalid_op+0x17/0x70
[ 4497.508662]  ? asm_exc_invalid_op+0x1a/0x20
[ 4497.508926]  ? free_irq+0x295/0x340
[ 4497.509146]  mlx5_irq_pool_free_irqs+0x48/0x90
[ 4497.509421]  mlx5_irq_table_free_irqs+0x38/0x50
[ 4497.509714]  mlx5_core_eq_free_irqs+0x27/0x40
[ 4497.509984]  shutdown+0x7b/0x100
[ 4497.510184]  pci_device_shutdown+0x30/0x60
[ 4497.510440]  device_shutdown+0x14d/0x240
[ 4497.510698]  kernel_power_off+0x30/0x70
[ 4497.510938]  process_one_work+0x1e6/0x3e0
[ 4497.511183]  worker_thread+0x49/0x3b0
[ 4497.511407]  ? __pfx_worker_thread+0x10/0x10
[ 4497.511679]  kthread+0xe0/0x110
[ 4497.511879]  ? __pfx_kthread+0x10/0x10
[ 4497.512114]  ret_from_fork+0x29/0x50
[ 4497.512342]  </TASK>

Fixes: 9c2d08010963 ("net/mlx5: Free irqs only on shutdown callback")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c |   26 ++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -123,18 +123,32 @@ out:
 	return ret;
 }
 
-static void irq_release(struct mlx5_irq *irq)
+/* mlx5_system_free_irq - Free an IRQ
+ * @irq: IRQ to free
+ *
+ * Free the IRQ and other resources such as rmap from the system.
+ * BUT doesn't free or remove reference from mlx5.
+ * This function is very important for the shutdown flow, where we need to
+ * cleanup system resoruces but keep mlx5 objects alive,
+ * see mlx5_irq_table_free_irqs().
+ */
+static void mlx5_system_free_irq(struct mlx5_irq *irq)
 {
-	struct mlx5_irq_pool *pool = irq->pool;
-
-	xa_erase(&pool->irqs, irq->index);
 	/* free_irq requires that affinity_hint and rmap will be cleared
 	 * before calling it. This is why there is asymmetry with set_rmap
 	 * which should be called after alloc_irq but before request_irq.
 	 */
 	irq_update_affinity_hint(irq->irqn, NULL);
-	free_cpumask_var(irq->mask);
 	free_irq(irq->irqn, &irq->nh);
+}
+
+static void irq_release(struct mlx5_irq *irq)
+{
+	struct mlx5_irq_pool *pool = irq->pool;
+
+	xa_erase(&pool->irqs, irq->index);
+	mlx5_system_free_irq(irq);
+	free_cpumask_var(irq->mask);
 	kfree(irq);
 }
 
@@ -597,7 +611,7 @@ static void mlx5_irq_pool_free_irqs(stru
 	unsigned long index;
 
 	xa_for_each(&pool->irqs, index, irq)
-		free_irq(irq->irqn, &irq->nh);
+		mlx5_system_free_irq(irq);
 }
 
 static void mlx5_irq_pools_free_irqs(struct mlx5_irq_table *table)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 138/219] ARC: atomics: Add compiler barrier to atomic operations...
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (136 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 137/219] net/mlx5: Free IRQ rmap and notifier on kernel shutdown Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 139/219] clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL Greg Kroah-Hartman
                   ` (91 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Pavel Kozlov, Vineet Gupta

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Pavel Kozlov <pavel.kozlov@synopsys.com>

commit 42f51fb24fd39cc547c086ab3d8a314cc603a91c upstream.

... to avoid unwanted gcc optimizations

SMP kernels fail to boot with commit 596ff4a09b89
("cpumask: re-introduce constant-sized cpumask optimizations").

|
| percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()!
|

The write operation performed by the SCOND instruction in the atomic
inline asm code is not properly passed to the compiler. The compiler
cannot correctly optimize a nested loop that runs through the cpumask
in the pcpu_build_alloc_info() function.

Fix this by add a compiler barrier (memory clobber in inline asm).

Apparently atomic ops used to have memory clobber implicitly via
surrounding smp_mb(). However commit b64be6836993c431e
("ARC: atomics: implement relaxed variants") removed the smp_mb() for
the relaxed variants, but failed to add the explicit compiler barrier.

Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/135
Cc: <stable@vger.kernel.org> # v6.3+
Fixes: b64be6836993c43 ("ARC: atomics: implement relaxed variants")
Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
[vgupta: tweaked the changelog and added Fixes tag]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arc/include/asm/atomic-llsc.h    |    6 +++---
 arch/arc/include/asm/atomic64-arcv2.h |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

--- a/arch/arc/include/asm/atomic-llsc.h
+++ b/arch/arc/include/asm/atomic-llsc.h
@@ -18,7 +18,7 @@ static inline void arch_atomic_##op(int
 	: [val]	"=&r"	(val) /* Early clobber to prevent reg reuse */	\
 	: [ctr]	"r"	(&v->counter), /* Not "m": llock only supports reg direct addr mode */	\
 	  [i]	"ir"	(i)						\
-	: "cc");							\
+	: "cc", "memory");						\
 }									\
 
 #define ATOMIC_OP_RETURN(op, asm_op)				\
@@ -34,7 +34,7 @@ static inline int arch_atomic_##op##_ret
 	: [val]	"=&r"	(val)						\
 	: [ctr]	"r"	(&v->counter),					\
 	  [i]	"ir"	(i)						\
-	: "cc");							\
+	: "cc", "memory");						\
 									\
 	return val;							\
 }
@@ -56,7 +56,7 @@ static inline int arch_atomic_fetch_##op
 	  [orig] "=&r" (orig)						\
 	: [ctr]	"r"	(&v->counter),					\
 	  [i]	"ir"	(i)						\
-	: "cc");							\
+	: "cc", "memory");						\
 									\
 	return orig;							\
 }
--- a/arch/arc/include/asm/atomic64-arcv2.h
+++ b/arch/arc/include/asm/atomic64-arcv2.h
@@ -60,7 +60,7 @@ static inline void arch_atomic64_##op(s6
 	"	bnz     1b		\n"				\
 	: "=&r"(val)							\
 	: "r"(&v->counter), "ir"(a)					\
-	: "cc");							\
+	: "cc", "memory");						\
 }									\
 
 #define ATOMIC64_OP_RETURN(op, op1, op2)		        	\
@@ -77,7 +77,7 @@ static inline s64 arch_atomic64_##op##_r
 	"	bnz     1b		\n"				\
 	: [val] "=&r"(val)						\
 	: "r"(&v->counter), "ir"(a)					\
-	: "cc");	/* memory clobber comes from smp_mb() */	\
+	: "cc", "memory");						\
 									\
 	return val;							\
 }
@@ -99,7 +99,7 @@ static inline s64 arch_atomic64_fetch_##
 	"	bnz     1b		\n"				\
 	: "=&r"(orig), "=&r"(val)					\
 	: "r"(&v->counter), "ir"(a)					\
-	: "cc");	/* memory clobber comes from smp_mb() */	\
+	: "cc", "memory");						\
 									\
 	return orig;							\
 }



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 139/219] clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (137 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 138/219] ARC: atomics: Add compiler barrier to atomic operations Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 140/219] dmaengine: sh: rz-dmac: Fix destination and source data size setting Greg Kroah-Hartman
                   ` (90 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Walter Chang, Marc Zyngier,
	AngeloGioacchino Del Regno, Daniel Lezcano

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Walter Chang <walter.chang@mediatek.com>

commit e7d65e40ab5a5940785c5922f317602d0268caaf upstream.

Due to the fact that the use of `writeq_relaxed()` to program CVAL is
not guaranteed to be atomic, it is necessary to disable the timer before
programming CVAL.

However, if the MMIO timer is already enabled and has not yet expired,
there is a possibility of unexpected behavior occurring: when the CPU
enters the idle state during this period, and if the CPU's local event
is earlier than the broadcast event, the following process occurs:

tick_broadcast_enter()
  tick_broadcast_oneshot_control(TICK_BROADCAST_ENTER)
    __tick_broadcast_oneshot_control()
      ___tick_broadcast_oneshot_control()
        tick_broadcast_set_event()
          clockevents_program_event()
            set_next_event_mem()

During this process, the MMIO timer remains enabled while programming
CVAL. To prevent such behavior, disable timer explicitly prior to
programming CVAL.

Fixes: 8b82c4f883a7 ("clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL")
Cc: stable@vger.kernel.org
Signed-off-by: Walter Chang <walter.chang@mediatek.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230717090735.19370-1-walter.chang@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/clocksource/arm_arch_timer.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -773,6 +773,13 @@ static __always_inline void set_next_eve
 	u64 cnt;
 
 	ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL, clk);
+
+	/* Timer must be disabled before programming CVAL */
+	if (ctrl & ARCH_TIMER_CTRL_ENABLE) {
+		ctrl &= ~ARCH_TIMER_CTRL_ENABLE;
+		arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
+	}
+
 	ctrl |= ARCH_TIMER_CTRL_ENABLE;
 	ctrl &= ~ARCH_TIMER_CTRL_IT_MASK;
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 140/219] dmaengine: sh: rz-dmac: Fix destination and source data size setting
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (138 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 139/219] clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 141/219] jbd2: fix checkpoint cleanup performance regression Greg Kroah-Hartman
                   ` (89 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, stable, Hien Huynh, Biju Das,
	Geert Uytterhoeven, Vinod Koul

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hien Huynh <hien.huynh.px@renesas.com>

commit c6ec8c83a29fb3aec3efa6fabbf5344498f57c7f upstream.

Before setting DDS and SDS values, we need to clear its value first
otherwise, we get incorrect results when we change/update the DMA bus
width several times due to the 'OR' expression.

Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Cc: stable@kernel.org
Signed-off-by: Hien Huynh <hien.huynh.px@renesas.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230706112150.198941-3-biju.das.jz@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/dma/sh/rz-dmac.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

--- a/drivers/dma/sh/rz-dmac.c
+++ b/drivers/dma/sh/rz-dmac.c
@@ -9,6 +9,7 @@
  * Copyright 2012 Javier Martin, Vista Silicon <javier.martin@vista-silicon.com>
  */
 
+#include <linux/bitfield.h>
 #include <linux/dma-mapping.h>
 #include <linux/dmaengine.h>
 #include <linux/interrupt.h>
@@ -145,8 +146,8 @@ struct rz_dmac {
 #define CHCFG_REQD			BIT(3)
 #define CHCFG_SEL(bits)			((bits) & 0x07)
 #define CHCFG_MEM_COPY			(0x80400008)
-#define CHCFG_FILL_DDS(a)		(((a) << 16) & GENMASK(19, 16))
-#define CHCFG_FILL_SDS(a)		(((a) << 12) & GENMASK(15, 12))
+#define CHCFG_FILL_DDS_MASK		GENMASK(19, 16)
+#define CHCFG_FILL_SDS_MASK		GENMASK(15, 12)
 #define CHCFG_FILL_TM(a)		(((a) & BIT(5)) << 22)
 #define CHCFG_FILL_AM(a)		(((a) & GENMASK(4, 2)) << 6)
 #define CHCFG_FILL_LVL(a)		(((a) & BIT(1)) << 5)
@@ -609,13 +610,15 @@ static int rz_dmac_config(struct dma_cha
 	if (val == CHCFG_DS_INVALID)
 		return -EINVAL;
 
-	channel->chcfg |= CHCFG_FILL_DDS(val);
+	channel->chcfg &= ~CHCFG_FILL_DDS_MASK;
+	channel->chcfg |= FIELD_PREP(CHCFG_FILL_DDS_MASK, val);
 
 	val = rz_dmac_ds_to_val_mapping(config->src_addr_width);
 	if (val == CHCFG_DS_INVALID)
 		return -EINVAL;
 
-	channel->chcfg |= CHCFG_FILL_SDS(val);
+	channel->chcfg &= ~CHCFG_FILL_SDS_MASK;
+	channel->chcfg |= FIELD_PREP(CHCFG_FILL_SDS_MASK, val);
 
 	return 0;
 }



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 141/219] jbd2: fix checkpoint cleanup performance regression
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (139 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 140/219] dmaengine: sh: rz-dmac: Fix destination and source data size setting Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 142/219] jbd2: check jh->b_transaction before removing it from checkpoint Greg Kroah-Hartman
                   ` (88 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, stable, Zhang Yi, Jan Kara,
	Theodore Tso

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhang Yi <yi.zhang@huawei.com>

commit 373ac521799d9e97061515aca6ec6621789036bb upstream.

journal_clean_one_cp_list() has been merged into
journal_shrink_one_cp_list(), but do chekpoint buffer cleanup from the
committing process is just a best effort, it should stop scan once it
meet a busy buffer, or else it will cause a lot of invalid buffer scan
and checks. We catch a performance regression when doing fs_mark tests
below.

Test cmd:
 ./fs_mark  -d  scratch  -s  1024  -n  10000  -t  1  -D  100  -N  100

Before merging checkpoint buffer cleanup:
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       8304.9            49033

After merging checkpoint buffer cleanup:
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       7649.0            50012
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       2107.1            50871

After merging checkpoint buffer cleanup, the total loop count in
journal_shrink_one_cp_list() could be up to 6,261,600+ (50,000+ ~
100,000+ in general), most of them are invalid. This patch fix it
through passing 'shrink_type' into journal_shrink_one_cp_list() and add
a new 'SHRINK_BUSY_STOP' to indicate it should stop once meet a busy
buffer. After fix, the loop count descending back to 10,000+.

After this fix:
 FSUse%        Count         Size    Files/sec     App Overhead
     95        10000         1024       8558.4            49109

Cc: stable@kernel.org
Fixes: b98dba273a0e ("jbd2: remove journal_clean_one_cp_list()")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230714025528.564988-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/jbd2/checkpoint.c |   20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -349,6 +349,8 @@ int jbd2_cleanup_journal_tail(journal_t
 
 /* Checkpoint list management */
 
+enum shrink_type {SHRINK_DESTROY, SHRINK_BUSY_STOP, SHRINK_BUSY_SKIP};
+
 /*
  * journal_shrink_one_cp_list
  *
@@ -360,7 +362,8 @@ int jbd2_cleanup_journal_tail(journal_t
  * Called with j_list_lock held.
  */
 static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
-						bool destroy, bool *released)
+						enum shrink_type type,
+						bool *released)
 {
 	struct journal_head *last_jh;
 	struct journal_head *next_jh = jh;
@@ -376,12 +379,15 @@ static unsigned long journal_shrink_one_
 		jh = next_jh;
 		next_jh = jh->b_cpnext;
 
-		if (destroy) {
+		if (type == SHRINK_DESTROY) {
 			ret = __jbd2_journal_remove_checkpoint(jh);
 		} else {
 			ret = jbd2_journal_try_remove_checkpoint(jh);
-			if (ret < 0)
-				continue;
+			if (ret < 0) {
+				if (type == SHRINK_BUSY_SKIP)
+					continue;
+				break;
+			}
 		}
 
 		nr_freed++;
@@ -445,7 +451,7 @@ again:
 		tid = transaction->t_tid;
 
 		freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list,
-						   false, &released);
+						   SHRINK_BUSY_SKIP, &released);
 		nr_freed += freed;
 		(*nr_to_scan) -= min(*nr_to_scan, freed);
 		if (*nr_to_scan == 0)
@@ -485,19 +491,21 @@ out:
 void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
 {
 	transaction_t *transaction, *last_transaction, *next_transaction;
+	enum shrink_type type;
 	bool released;
 
 	transaction = journal->j_checkpoint_transactions;
 	if (!transaction)
 		return;
 
+	type = destroy ? SHRINK_DESTROY : SHRINK_BUSY_STOP;
 	last_transaction = transaction->t_cpprev;
 	next_transaction = transaction;
 	do {
 		transaction = next_transaction;
 		next_transaction = transaction->t_cpnext;
 		journal_shrink_one_cp_list(transaction->t_checkpoint_list,
-					   destroy, &released);
+					   type, &released);
 		/*
 		 * This function only frees up some memory if possible so we
 		 * dont have an obligation to finish processing. Bail out if



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 142/219] jbd2: check jh->b_transaction before removing it from checkpoint
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (140 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 141/219] jbd2: fix checkpoint cleanup performance regression Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 143/219] jbd2: correct the end of the journal recovery scan range Greg Kroah-Hartman
                   ` (87 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, stable, Zhihao Cheng, Zhang Yi,
	Jan Kara, Theodore Tso

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhihao Cheng <chengzhihao1@huawei.com>

commit 590a809ff743e7bd890ba5fb36bc38e20a36de53 upstream.

Following process will corrupt ext4 image:
Step 1:
jbd2_journal_commit_transaction
 __jbd2_journal_insert_checkpoint(jh, commit_transaction)
 // Put jh into trans1->t_checkpoint_list
 journal->j_checkpoint_transactions = commit_transaction
 // Put trans1 into journal->j_checkpoint_transactions

Step 2:
do_get_write_access
 test_clear_buffer_dirty(bh) // clear buffer dirty,set jbd dirty
 __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2

Step 3:
drop_cache
 journal_shrink_one_cp_list
  jbd2_journal_try_remove_checkpoint
   if (!trylock_buffer(bh))  // lock bh, true
   if (buffer_dirty(bh))     // buffer is not dirty
   __jbd2_journal_remove_checkpoint(jh)
   // remove jh from trans1->t_checkpoint_list

Step 4:
jbd2_log_do_checkpoint
 trans1 = journal->j_checkpoint_transactions
 // jh is not in trans1->t_checkpoint_list
 jbd2_cleanup_journal_tail(journal)  // trans1 is done

Step 5: Power cut, trans2 is not committed, jh is lost in next mounting.

Fix it by checking 'jh->b_transaction' before remove it from checkpoint.

Cc: stable@kernel.org
Fixes: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230714025528.564988-3-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/jbd2/checkpoint.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -639,6 +639,8 @@ int jbd2_journal_try_remove_checkpoint(s
 {
 	struct buffer_head *bh = jh2bh(jh);
 
+	if (jh->b_transaction)
+		return -EBUSY;
 	if (!trylock_buffer(bh))
 		return -EBUSY;
 	if (buffer_dirty(bh)) {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 143/219] jbd2: correct the end of the journal recovery scan range
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (141 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 142/219] jbd2: check jh->b_transaction before removing it from checkpoint Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 144/219] ext4: add correct group descriptors and reserved GDT blocks to system zone Greg Kroah-Hartman
                   ` (86 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, stable, Theodore Tso, Zhang Yi,
	Jan Kara

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Zhang Yi <yi.zhang@huawei.com>

commit 2dfba3bb40ad8536b9fa802364f2d40da31aa88e upstream.

We got a filesystem inconsistency issue below while running generic/475
I/O failure pressure test with fast_commit feature enabled.

 Symlink /p3/d3/d1c/d6c/dd6/dce/l101 (inode #132605) is invalid.

If fast_commit feature is enabled, a special fast_commit journal area is
appended to the end of the normal journal area. The journal->j_last
point to the first unused block behind the normal journal area instead
of the whole log area, and the journal->j_fc_last point to the first
unused block behind the fast_commit journal area. While doing journal
recovery, do_one_pass(PASS_SCAN) should first scan the normal journal
area and turn around to the first block once it meet journal->j_last,
but the wrap() macro misuse the journal->j_fc_last, so the recovering
could not read the next magic block (commit block perhaps) and would end
early mistakenly and missing tN and every transaction after it in the
following example. Finally, it could lead to filesystem inconsistency.

 | normal journal area                             | fast commit area |
 +-------------------------------------------------+------------------+
 | tN(rere) | tN+1 |~| tN-x |...| tN-1 | tN(front) |       ....       |
 +-------------------------------------------------+------------------+
                     /                             /                  /
                start               journal->j_last journal->j_fc_last

This patch fix it by use the correct ending journal->j_last.

Fixes: 5b849b5f96b4 ("jbd2: fast commit recovery path")
Cc: stable@kernel.org
Reported-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/linux-ext4/20230613043120.GB1584772@mit.edu/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230626073322.3956567-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/jbd2/recovery.c |   12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -229,12 +229,8 @@ static int count_tags(journal_t *journal
 /* Make sure we wrap around the log correctly! */
 #define wrap(journal, var)						\
 do {									\
-	unsigned long _wrap_last =					\
-		jbd2_has_feature_fast_commit(journal) ?			\
-			(journal)->j_fc_last : (journal)->j_last;	\
-									\
-	if (var >= _wrap_last)						\
-		var -= (_wrap_last - (journal)->j_first);		\
+	if (var >= (journal)->j_last)					\
+		var -= ((journal)->j_last - (journal)->j_first);	\
 } while (0)
 
 static int fc_do_one_pass(journal_t *journal,
@@ -517,9 +513,7 @@ static int do_one_pass(journal_t *journa
 				break;
 
 		jbd2_debug(2, "Scanning for sequence ID %u at %lu/%lu\n",
-			  next_commit_ID, next_log_block,
-			  jbd2_has_feature_fast_commit(journal) ?
-			  journal->j_fc_last : journal->j_last);
+			  next_commit_ID, next_log_block, journal->j_last);
 
 		/* Skip over each chunk of the transaction looking
 		 * either the next descriptor block or the final commit



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 144/219] ext4: add correct group descriptors and reserved GDT blocks to system zone
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (142 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 143/219] jbd2: correct the end of the journal recovery scan range Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 145/219] ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup} Greg Kroah-Hartman
                   ` (85 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, stable, Wang Jianjian, Theodore Tso

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wang Jianjian <wangjianjian0@foxmail.com>

commit 68228da51c9a436872a4ef4b5a7692e29f7e5bc7 upstream.

When setup_system_zone, flex_bg is not initialized so it is always 1.
Use a new helper function, ext4_num_base_meta_blocks() which does not
depend on sbi->s_log_groups_per_flex being initialized.

[ Squashed two patches in the Link URL's below together into a single
  commit, which is simpler to review/understand.  Also fix checkpatch
  warnings. --TYT ]

Cc: stable@kernel.org
Signed-off-by: Wang Jianjian <wangjianjian0@foxmail.com>
Link: https://lore.kernel.org/r/tencent_21AF0D446A9916ED5C51492CC6C9A0A77B05@qq.com
Link: https://lore.kernel.org/r/tencent_D744D1450CC169AEA77FCF0A64719909ED05@qq.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/ext4/balloc.c         |   15 +++++++++++----
 fs/ext4/block_validity.c |    8 ++++----
 fs/ext4/ext4.h           |    2 ++
 3 files changed, 17 insertions(+), 8 deletions(-)

--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -910,11 +910,11 @@ unsigned long ext4_bg_num_gdb(struct sup
 }
 
 /*
- * This function returns the number of file system metadata clusters at
+ * This function returns the number of file system metadata blocks at
  * the beginning of a block group, including the reserved gdt blocks.
  */
-static unsigned ext4_num_base_meta_clusters(struct super_block *sb,
-				     ext4_group_t block_group)
+unsigned int ext4_num_base_meta_blocks(struct super_block *sb,
+				       ext4_group_t block_group)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	unsigned num;
@@ -932,8 +932,15 @@ static unsigned ext4_num_base_meta_clust
 	} else { /* For META_BG_BLOCK_GROUPS */
 		num += ext4_bg_num_gdb(sb, block_group);
 	}
-	return EXT4_NUM_B2C(sbi, num);
+	return num;
 }
+
+static unsigned int ext4_num_base_meta_clusters(struct super_block *sb,
+						ext4_group_t block_group)
+{
+	return EXT4_NUM_B2C(EXT4_SB(sb), ext4_num_base_meta_blocks(sb, block_group));
+}
+
 /**
  *	ext4_inode_to_goal_block - return a hint for block allocation
  *	@inode: inode for block allocation
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -215,7 +215,6 @@ int ext4_setup_system_zone(struct super_
 	struct ext4_system_blocks *system_blks;
 	struct ext4_group_desc *gdp;
 	ext4_group_t i;
-	int flex_size = ext4_flex_bg_size(sbi);
 	int ret;
 
 	system_blks = kzalloc(sizeof(*system_blks), GFP_KERNEL);
@@ -223,12 +222,13 @@ int ext4_setup_system_zone(struct super_
 		return -ENOMEM;
 
 	for (i=0; i < ngroups; i++) {
+		unsigned int meta_blks = ext4_num_base_meta_blocks(sb, i);
+
 		cond_resched();
-		if (ext4_bg_has_super(sb, i) &&
-		    ((i < 5) || ((i % flex_size) == 0))) {
+		if (meta_blks != 0) {
 			ret = add_system_zone(system_blks,
 					ext4_group_first_block_no(sb, i),
-					ext4_bg_num_gdb(sb, i) + 1, 0);
+					meta_blks, 0);
 			if (ret)
 				goto err;
 		}
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3096,6 +3096,8 @@ extern const char *ext4_decode_error(str
 extern void ext4_mark_group_bitmap_corrupted(struct super_block *sb,
 					     ext4_group_t block_group,
 					     unsigned int flags);
+extern unsigned int ext4_num_base_meta_blocks(struct super_block *sb,
+					      ext4_group_t block_group);
 
 extern __printf(7, 8)
 void __ext4_error(struct super_block *, const char *, unsigned int, bool,



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 145/219] ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup}
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (143 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 144/219] ext4: add correct group descriptors and reserved GDT blocks to system zone Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 146/219] f2fs: flush inode if atomic file is aborted Greg Kroah-Hartman
                   ` (84 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, stable, Luís Henriques,
	Eric Biggers, Theodore Tso

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Luís Henriques <lhenriques@suse.de>

commit 7ca4b085f430f3774c3838b3da569ceccd6a0177 upstream.

If the filename casefolding fails, we'll be leaking memory from the
fscrypt_name struct, namely from the 'crypto_buf.name' member.

Make sure we free it in the error path on both ext4_fname_setup_filename()
and ext4_fname_prepare_lookup() functions.

Cc: stable@kernel.org
Fixes: 1ae98e295fa2 ("ext4: optimize match for casefolded encrypted dirs")
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20230803091713.13239-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/ext4/crypto.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -33,6 +33,8 @@ int ext4_fname_setup_filename(struct ino
 
 #if IS_ENABLED(CONFIG_UNICODE)
 	err = ext4_fname_setup_ci_filename(dir, iname, fname);
+	if (err)
+		ext4_fname_free_filename(fname);
 #endif
 	return err;
 }
@@ -51,6 +53,8 @@ int ext4_fname_prepare_lookup(struct ino
 
 #if IS_ENABLED(CONFIG_UNICODE)
 	err = ext4_fname_setup_ci_filename(dir, &dentry->d_name, fname);
+	if (err)
+		ext4_fname_free_filename(fname);
 #endif
 	return err;
 }



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 146/219] f2fs: flush inode if atomic file is aborted
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (144 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 145/219] ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup} Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 147/219] f2fs: avoid false alarm of circular locking Greg Kroah-Hartman
                   ` (83 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Chao Yu, Jaegeuk Kim,
	syzbot+e1246909d526a9d470fa

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jaegeuk Kim <jaegeuk@kernel.org>

commit a3ab55746612247ce3dcaac6de66f5ffc055b9df upstream.

Let's flush the inode being aborted atomic operation to avoid stale dirty
inode during eviction in this call stack:

  f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs]
  f2fs_abort_atomic_write+0xc4/0xf0 [f2fs]
  f2fs_evict_inode+0x3f/0x690 [f2fs]
  ? sugov_start+0x140/0x140
  evict+0xc3/0x1c0
  evict_inodes+0x17b/0x210
  generic_shutdown_super+0x32/0x120
  kill_block_super+0x21/0x50
  deactivate_locked_super+0x31/0x90
  cleanup_mnt+0x100/0x160
  task_work_run+0x59/0x90
  do_exit+0x33b/0xa50
  do_group_exit+0x2d/0x80
  __x64_sys_exit_group+0x14/0x20
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

This triggers f2fs_bug_on() in f2fs_evict_inode:
 f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));

This fixes the syzbot report:

loop0: detected capacity change from 0 to 131072
F2FS-fs (loop0): invalid crc value
F2FS-fs (loop0): Found nat_bits in checkpoint
F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
------------[ cut here ]------------
kernel BUG at fs/f2fs/inode.c:869!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab12d1a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 evict+0x2ed/0x6b0 fs/inode.c:665
 dispose_list+0x117/0x1e0 fs/inode.c:698
 evict_inodes+0x345/0x440 fs/inode.c:748
 generic_shutdown_super+0xaf/0x480 fs/super.c:478
 kill_block_super+0x64/0xb0 fs/super.c:1417
 kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704
 deactivate_locked_super+0x98/0x160 fs/super.c:330
 deactivate_super+0xb1/0xd0 fs/super.c:361
 cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254
 task_work_run+0x16f/0x270 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xa9a/0x29a0 kernel/exit.c:874
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1024
 __do_sys_exit_group kernel/exit.c:1035 [inline]
 __se_sys_exit_group kernel/exit.c:1033 [inline]
 __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f309be71a09
Code: Unable to access opcode bytes at 0x7f309be719df.
RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40
R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0

Cc: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/segment.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -204,6 +204,8 @@ void f2fs_abort_atomic_write(struct inod
 		f2fs_i_size_write(inode, fi->original_i_size);
 		fi->original_i_size = 0;
 	}
+	/* avoid stale dirty inode during eviction */
+	sync_inode_metadata(inode, 0);
 }
 
 static int __replace_atomic_write_block(struct inode *inode, pgoff_t index,



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 147/219] f2fs: avoid false alarm of circular locking
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (145 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 146/219] f2fs: flush inode if atomic file is aborted Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 148/219] lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix() Greg Kroah-Hartman
                   ` (82 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Guenter Roeck, Chao Yu, Jaegeuk Kim,
	syzbot+e5600587fa9cbf8e3826

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jaegeuk Kim <jaegeuk@kernel.org>

commit 5c13e2388bf3426fd69a89eb46e50469e9624e56 upstream.

======================================================
WARNING: possible circular locking dependency detected
6.5.0-rc5-syzkaller-00353-gae545c3283dc #0 Not tainted
------------------------------------------------------
syz-executor273/5027 is trying to acquire lock:
ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2133 [inline]
ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644

but task is already holding lock:
ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_down_read fs/f2fs/f2fs.h:2108 [inline]
ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_add_dentry+0x92/0x230 fs/f2fs/dir.c:783

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&fi->i_xattr_sem){.+.+}-{3:3}:
       down_read+0x9c/0x470 kernel/locking/rwsem.c:1520
       f2fs_down_read fs/f2fs/f2fs.h:2108 [inline]
       f2fs_getxattr+0xb1e/0x12c0 fs/f2fs/xattr.c:532
       __f2fs_get_acl+0x5a/0x900 fs/f2fs/acl.c:179
       f2fs_acl_create fs/f2fs/acl.c:377 [inline]
       f2fs_init_acl+0x15c/0xb30 fs/f2fs/acl.c:420
       f2fs_init_inode_metadata+0x159/0x1290 fs/f2fs/dir.c:558
       f2fs_add_regular_entry+0x79e/0xb90 fs/f2fs/dir.c:740
       f2fs_add_dentry+0x1de/0x230 fs/f2fs/dir.c:788
       f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827
       f2fs_add_link fs/f2fs/f2fs.h:3554 [inline]
       f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781
       vfs_mkdir+0x532/0x7e0 fs/namei.c:4117
       do_mkdirat+0x2a9/0x330 fs/namei.c:4140
       __do_sys_mkdir fs/namei.c:4160 [inline]
       __se_sys_mkdir fs/namei.c:4158 [inline]
       __x64_sys_mkdir+0xf2/0x140 fs/namei.c:4158
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&fi->i_sem){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3142 [inline]
       check_prevs_add kernel/locking/lockdep.c:3261 [inline]
       validate_chain kernel/locking/lockdep.c:3876 [inline]
       __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144
       lock_acquire kernel/locking/lockdep.c:5761 [inline]
       lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726
       down_write+0x93/0x200 kernel/locking/rwsem.c:1573
       f2fs_down_write fs/f2fs/f2fs.h:2133 [inline]
       f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644
       f2fs_add_dentry+0xa6/0x230 fs/f2fs/dir.c:784
       f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827
       f2fs_add_link fs/f2fs/f2fs.h:3554 [inline]
       f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781
       vfs_mkdir+0x532/0x7e0 fs/namei.c:4117
       ovl_do_mkdir fs/overlayfs/overlayfs.h:196 [inline]
       ovl_mkdir_real+0xb5/0x370 fs/overlayfs/dir.c:146
       ovl_workdir_create+0x3de/0x820 fs/overlayfs/super.c:309
       ovl_make_workdir fs/overlayfs/super.c:711 [inline]
       ovl_get_workdir fs/overlayfs/super.c:864 [inline]
       ovl_fill_super+0xdab/0x6180 fs/overlayfs/super.c:1400
       vfs_get_super+0xf9/0x290 fs/super.c:1152
       vfs_get_tree+0x88/0x350 fs/super.c:1519
       do_new_mount fs/namespace.c:3335 [inline]
       path_mount+0x1492/0x1ed0 fs/namespace.c:3662
       do_mount fs/namespace.c:3675 [inline]
       __do_sys_mount fs/namespace.c:3884 [inline]
       __se_sys_mount fs/namespace.c:3861 [inline]
       __x64_sys_mount+0x293/0x310 fs/namespace.c:3861
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock(&fi->i_xattr_sem);
                               lock(&fi->i_sem);
                               lock(&fi->i_xattr_sem);
  lock(&fi->i_sem);

Cc: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+e5600587fa9cbf8e3826@syzkaller.appspotmail.com
Fixes: 5eda1ad1aaff "f2fs: fix deadlock in i_xattr_sem and inode page lock"
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/f2fs/f2fs.h   |   24 +++++++++++++++---------
 fs/f2fs/inline.c |    3 ++-
 2 files changed, 17 insertions(+), 10 deletions(-)

--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2160,15 +2160,6 @@ static inline int f2fs_down_read_trylock
 	return down_read_trylock(&sem->internal_rwsem);
 }
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-static inline void f2fs_down_read_nested(struct f2fs_rwsem *sem, int subclass)
-{
-	down_read_nested(&sem->internal_rwsem, subclass);
-}
-#else
-#define f2fs_down_read_nested(sem, subclass) f2fs_down_read(sem)
-#endif
-
 static inline void f2fs_up_read(struct f2fs_rwsem *sem)
 {
 	up_read(&sem->internal_rwsem);
@@ -2179,6 +2170,21 @@ static inline void f2fs_down_write(struc
 	down_write(&sem->internal_rwsem);
 }
 
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+static inline void f2fs_down_read_nested(struct f2fs_rwsem *sem, int subclass)
+{
+	down_read_nested(&sem->internal_rwsem, subclass);
+}
+
+static inline void f2fs_down_write_nested(struct f2fs_rwsem *sem, int subclass)
+{
+	down_write_nested(&sem->internal_rwsem, subclass);
+}
+#else
+#define f2fs_down_read_nested(sem, subclass) f2fs_down_read(sem)
+#define f2fs_down_write_nested(sem, subclass) f2fs_down_write(sem)
+#endif
+
 static inline int f2fs_down_write_trylock(struct f2fs_rwsem *sem)
 {
 	return down_write_trylock(&sem->internal_rwsem);
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -642,7 +642,8 @@ int f2fs_add_inline_entry(struct inode *
 	}
 
 	if (inode) {
-		f2fs_down_write(&F2FS_I(inode)->i_sem);
+		f2fs_down_write_nested(&F2FS_I(inode)->i_sem,
+						SINGLE_DEPTH_NESTING);
 		page = f2fs_init_inode_metadata(inode, dir, fname, ipage);
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 148/219] lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (146 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 147/219] f2fs: avoid false alarm of circular locking Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 149/219] hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation Greg Kroah-Hartman
                   ` (81 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Nick Desaulniers, Nathan Chancellor,
	Petr Mladek

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nathan Chancellor <nathan@kernel.org>

commit 92382d744176f230101d54f5c017bccd62770f01 upstream.

A recent change in clang allows it to consider more expressions as
compile time constants, which causes it to point out an implicit
conversion in the scanf tests:

  lib/test_scanf.c:661:2: warning: implicit conversion from 'int' to 'unsigned char' changes value from -168 to 88 [-Wconstant-conversion]
    661 |         test_number_prefix(unsigned char,       "0xA7", "%2hhx%hhx", 0, 0xa7, 2, check_uchar);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  lib/test_scanf.c:609:29: note: expanded from macro 'test_number_prefix'
    609 |         T result[2] = {~expect[0], ~expect[1]};                                 \
        |                       ~            ^~~~~~~~~~
  1 warning generated.

The result of the bitwise negation is the type of the operand after
going through the integer promotion rules, so this truncation is
expected but harmless, as the initial values in the result array get
overwritten by _test() anyways. Add an explicit cast to the expected
type in test_number_prefix() to silence the warning. There is no
functional change, as all the tests still pass with GCC 13.1.0 and clang
18.0.0.

Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linuxq/issues/1899
Link: https://github.com/llvm/llvm-project/commit/610ec954e1f81c0e8fcadedcd25afe643f5a094e
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230807-test_scanf-wconstant-conversion-v2-1-839ca39083e1@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 lib/test_scanf.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/test_scanf.c
+++ b/lib/test_scanf.c
@@ -606,7 +606,7 @@ static void __init numbers_slice(void)
 #define test_number_prefix(T, str, scan_fmt, expect0, expect1, n_args, fn)	\
 do {										\
 	const T expect[2] = { expect0, expect1 };				\
-	T result[2] = {~expect[0], ~expect[1]};					\
+	T result[2] = { (T)~expect[0], (T)~expect[1] };				\
 										\
 	_test(fn, &expect, str, scan_fmt, n_args, &result[0], &result[1]);	\
 } while (0)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 149/219] hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (147 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 148/219] lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 150/219] ata: ahci: Add Elkhart Lake AHCI controller Greg Kroah-Hartman
                   ` (80 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Christian Marangi, Bjorn Andersson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christian Marangi <ansuelsmth@gmail.com>

commit 23316be8a9d450f33a21f1efe7d89570becbec58 upstream.

Commit 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older
SoCs") introduced and made regmap_config mandatory in the of_data struct
but didn't add the regmap_config for sfpb based devices.

SFPB based devices can both use the legacy syscon way to probe or the
new MMIO way and currently device that use the MMIO way are broken as
they lack the definition of the now required regmap_config and always
return -EINVAL (and indirectly makes fail probing everything that
depends on it, smem, nandc with smem-parser...)

Fix this by correctly adding the missing regmap_config and restore
function of hwspinlock on SFPB based devices with MMIO implementation.

Cc: stable@vger.kernel.org
Fixes: 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older SoCs")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20230716022804.21239-1-ansuelsmth@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/hwspinlock/qcom_hwspinlock.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- a/drivers/hwspinlock/qcom_hwspinlock.c
+++ b/drivers/hwspinlock/qcom_hwspinlock.c
@@ -69,9 +69,18 @@ static const struct hwspinlock_ops qcom_
 	.unlock		= qcom_hwspinlock_unlock,
 };
 
+static const struct regmap_config sfpb_mutex_config = {
+	.reg_bits		= 32,
+	.reg_stride		= 4,
+	.val_bits		= 32,
+	.max_register		= 0x100,
+	.fast_io		= true,
+};
+
 static const struct qcom_hwspinlock_of_data of_sfpb_mutex = {
 	.offset = 0x4,
 	.stride = 0x4,
+	.regmap_config = &sfpb_mutex_config,
 };
 
 static const struct regmap_config tcsr_msm8226_mutex_config = {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 150/219] ata: ahci: Add Elkhart Lake AHCI controller
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (148 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 149/219] hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 151/219] ata: pata_falcon: fix IO base selection for Q40 Greg Kroah-Hartman
                   ` (79 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Werner Fischer, Damien Le Moal

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Werner Fischer <devlists@wefi.net>

commit 2a2df98ec592667927b5c1351afa6493ea125c9f upstream.

Elkhart Lake is the successor of Apollo Lake and Gemini Lake. These
CPUs and their PCHs are used in mobile and embedded environments.

With this patch I suggest that Elkhart Lake SATA controllers [1] should
use the default LPM policy for mobile chipsets.
The disadvantage of missing hot-plug support with this setting should
not be an issue, as those CPUs are used in embedded environments and
not in servers with hot-plug backplanes.

We discovered that the Elkhart Lake SATA controllers have been missing
in ahci.c after a customer reported the throttling of his SATA SSD
after a short period of higher I/O. We determined the high temperature
of the SSD controller in idle mode as the root cause for that.

Depending on the used SSD, we have seen up to 1.8 Watt lower system
idle power usage and up to 30°C lower SSD controller temperatures in
our tests, when we set med_power_with_dipm manually. I have provided a
table showing seven different SATA SSDs from ATP, Intel/Solidigm and
Samsung [2].

Intel lists a total of 3 SATA controller IDs (4B60, 4B62, 4B63) in [1]
for those mobile PCHs.
This commit just adds 0x4b63 as I do not have test systems with 0x4b60
and 0x4b62 SATA controllers.
I have tested this patch with a system which uses 0x4b63 as SATA
controller.

[1] https://sata-io.org/product/8803
[2] https://www.thomas-krenn.com/en/wiki/SATA_Link_Power_Management#Example_LES_v4

Signed-off-by: Werner Fischer <devlists@wefi.net>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/ata/ahci.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -422,6 +422,8 @@ static const struct pci_device_id ahci_p
 	{ PCI_VDEVICE(INTEL, 0x34d3), board_ahci_low_power }, /* Ice Lake LP AHCI */
 	{ PCI_VDEVICE(INTEL, 0x02d3), board_ahci_low_power }, /* Comet Lake PCH-U AHCI */
 	{ PCI_VDEVICE(INTEL, 0x02d7), board_ahci_low_power }, /* Comet Lake PCH RAID */
+	/* Elkhart Lake IDs 0x4b60 & 0x4b62 https://sata-io.org/product/8803 not tested yet */
+	{ PCI_VDEVICE(INTEL, 0x4b63), board_ahci_low_power }, /* Elkhart Lake AHCI */
 
 	/* JMicron 360/1/3/5/6, match class to avoid IDE function */
 	{ PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 151/219] ata: pata_falcon: fix IO base selection for Q40
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (149 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 150/219] ata: ahci: Add Elkhart Lake AHCI controller Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 152/219] ata: sata_gemini: Add missing MODULE_DESCRIPTION Greg Kroah-Hartman
                   ` (78 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, William R Sowerbutts, Finn Thain,
	Geert Uytterhoeven, Michael Schmitz, Sergey Shtylyov,
	Damien Le Moal

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Michael Schmitz <schmitzmic@gmail.com>

commit 8a1f00b753ecfdb117dc1a07e68c46d80e7923ea upstream.

With commit 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver
with pata_falcon and falconide"), the Q40 IDE driver was
replaced by pata_falcon.c.

Both IO and memory resources were defined for the Q40 IDE
platform device, but definition of the IDE register addresses
was modeled after the Falcon case, both in use of the memory
resources and in including register shift and byte vs. word
offset in the address.

This was correct for the Falcon case, which does not apply
any address translation to the register addresses. In the
Q40 case, all of device base address, byte access offset
and register shift is included in the platform specific
ISA access translation (in asm/mm_io.h).

As a consequence, such address translation gets applied
twice, and register addresses are mangled.

Use the device base address from the platform IO resource
for Q40 (the IO address translation will then add the correct
ISA window base address and byte access offset), with register
shift 1. Use MMIO base address and register shift 2 as before
for Falcon.

Encode PIO_OFFSET into IO port addresses for all registers
for Q40 except the data transfer register. Encode the MMIO
offset there (pata_falcon_data_xfer() directly uses raw IO
with no address translation).

Reported-by: William R Sowerbutts <will@sowerbutts.com>
Closes: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com
Link: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com
Fixes: 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide")
Cc: stable@vger.kernel.org
Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: William R Sowerbutts <will@sowerbutts.com>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/ata/pata_falcon.c |   50 ++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

--- a/drivers/ata/pata_falcon.c
+++ b/drivers/ata/pata_falcon.c
@@ -123,8 +123,8 @@ static int __init pata_falcon_init_one(s
 	struct resource *base_res, *ctl_res, *irq_res;
 	struct ata_host *host;
 	struct ata_port *ap;
-	void __iomem *base;
-	int irq = 0;
+	void __iomem *base, *ctl_base;
+	int irq = 0, io_offset = 1, reg_shift = 2; /* Falcon defaults */
 
 	dev_info(&pdev->dev, "Atari Falcon and Q40/Q60 PATA controller\n");
 
@@ -165,26 +165,34 @@ static int __init pata_falcon_init_one(s
 	ap->pio_mask = ATA_PIO4;
 	ap->flags |= ATA_FLAG_SLAVE_POSS | ATA_FLAG_NO_IORDY;
 
-	base = (void __iomem *)base_mem_res->start;
 	/* N.B. this assumes data_addr will be used for word-sized I/O only */
-	ap->ioaddr.data_addr		= base + 0 + 0 * 4;
-	ap->ioaddr.error_addr		= base + 1 + 1 * 4;
-	ap->ioaddr.feature_addr		= base + 1 + 1 * 4;
-	ap->ioaddr.nsect_addr		= base + 1 + 2 * 4;
-	ap->ioaddr.lbal_addr		= base + 1 + 3 * 4;
-	ap->ioaddr.lbam_addr		= base + 1 + 4 * 4;
-	ap->ioaddr.lbah_addr		= base + 1 + 5 * 4;
-	ap->ioaddr.device_addr		= base + 1 + 6 * 4;
-	ap->ioaddr.status_addr		= base + 1 + 7 * 4;
-	ap->ioaddr.command_addr		= base + 1 + 7 * 4;
-
-	base = (void __iomem *)ctl_mem_res->start;
-	ap->ioaddr.altstatus_addr	= base + 1;
-	ap->ioaddr.ctl_addr		= base + 1;
-
-	ata_port_desc(ap, "cmd 0x%lx ctl 0x%lx",
-		      (unsigned long)base_mem_res->start,
-		      (unsigned long)ctl_mem_res->start);
+	ap->ioaddr.data_addr = (void __iomem *)base_mem_res->start;
+
+	if (base_res) {		/* only Q40 has IO resources */
+		io_offset = 0x10000;
+		reg_shift = 0;
+		base = (void __iomem *)base_res->start;
+		ctl_base = (void __iomem *)ctl_res->start;
+	} else {
+		base = (void __iomem *)base_mem_res->start;
+		ctl_base = (void __iomem *)ctl_mem_res->start;
+	}
+
+	ap->ioaddr.error_addr	= base + io_offset + (1 << reg_shift);
+	ap->ioaddr.feature_addr	= base + io_offset + (1 << reg_shift);
+	ap->ioaddr.nsect_addr	= base + io_offset + (2 << reg_shift);
+	ap->ioaddr.lbal_addr	= base + io_offset + (3 << reg_shift);
+	ap->ioaddr.lbam_addr	= base + io_offset + (4 << reg_shift);
+	ap->ioaddr.lbah_addr	= base + io_offset + (5 << reg_shift);
+	ap->ioaddr.device_addr	= base + io_offset + (6 << reg_shift);
+	ap->ioaddr.status_addr	= base + io_offset + (7 << reg_shift);
+	ap->ioaddr.command_addr	= base + io_offset + (7 << reg_shift);
+
+	ap->ioaddr.altstatus_addr	= ctl_base + io_offset;
+	ap->ioaddr.ctl_addr		= ctl_base + io_offset;
+
+	ata_port_desc(ap, "cmd %px ctl %px data %px",
+		      base, ctl_base, ap->ioaddr.data_addr);
 
 	irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
 	if (irq_res && irq_res->start > 0) {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 152/219] ata: sata_gemini: Add missing MODULE_DESCRIPTION
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (150 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 151/219] ata: pata_falcon: fix IO base selection for Q40 Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 153/219] ata: pata_ftide010: " Greg Kroah-Hartman
                   ` (77 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Damien Le Moal, Linus Walleij

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Damien Le Moal <dlemoal@kernel.org>

commit 8566572bf3b4d6e416a4bf2110dbb4817d11ba59 upstream.

Add the missing MODULE_DESCRIPTION() to avoid warnings such as:

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/sata_gemini.o

when compiling with W=1.

Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/ata/sata_gemini.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/ata/sata_gemini.c
+++ b/drivers/ata/sata_gemini.c
@@ -428,6 +428,7 @@ static struct platform_driver gemini_sat
 };
 module_platform_driver(gemini_sata_driver);
 
+MODULE_DESCRIPTION("low level driver for Cortina Systems Gemini SATA bridge");
 MODULE_AUTHOR("Linus Walleij <linus.walleij@linaro.org>");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("platform:" DRV_NAME);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 153/219] ata: pata_ftide010: Add missing MODULE_DESCRIPTION
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (151 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 152/219] ata: sata_gemini: Add missing MODULE_DESCRIPTION Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 154/219] fuse: nlookup missing decrement in fuse_direntplus_link Greg Kroah-Hartman
                   ` (76 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Damien Le Moal, Linus Walleij

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Damien Le Moal <dlemoal@kernel.org>

commit 7274eef5729037300f29d14edeb334a47a098f65 upstream.

Add the missing MODULE_DESCRIPTION() to avoid warnings such as:

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/pata_ftide010.o

when compiling with W=1.

Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/ata/pata_ftide010.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/ata/pata_ftide010.c
+++ b/drivers/ata/pata_ftide010.c
@@ -567,6 +567,7 @@ static struct platform_driver pata_ftide
 };
 module_platform_driver(pata_ftide010_driver);
 
+MODULE_DESCRIPTION("low level driver for Faraday Technology FTIDE010");
 MODULE_AUTHOR("Linus Walleij <linus.walleij@linaro.org>");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("platform:" DRV_NAME);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 154/219] fuse: nlookup missing decrement in fuse_direntplus_link
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (152 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 153/219] ata: pata_ftide010: " Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 155/219] btrfs: zoned: do not zone finish data relocation block group Greg Kroah-Hartman
                   ` (75 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, ruanmeisi, Miklos Szeredi

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: ruanmeisi <ruan.meisi@zte.com.cn>

commit b8bd342d50cbf606666488488f9fea374aceb2d5 upstream.

During our debugging of glusterfs, we found an Assertion failed error:
inode_lookup >= nlookup, which was caused by the nlookup value in the
kernel being greater than that in the FUSE file system.

The issue was introduced by fuse_direntplus_link, where in the function,
fuse_iget increments nlookup, and if d_splice_alias returns failure,
fuse_direntplus_link returns failure without decrementing nlookup
https://github.com/gluster/glusterfs/pull/4081

Signed-off-by: ruanmeisi <ruan.meisi@zte.com.cn>
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Cc: <stable@vger.kernel.org> # v3.9
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/fuse/readdir.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -243,8 +243,16 @@ retry:
 			dput(dentry);
 			dentry = alias;
 		}
-		if (IS_ERR(dentry))
+		if (IS_ERR(dentry)) {
+			if (!IS_ERR(inode)) {
+				struct fuse_inode *fi = get_fuse_inode(inode);
+
+				spin_lock(&fi->lock);
+				fi->nlookup--;
+				spin_unlock(&fi->lock);
+			}
 			return PTR_ERR(dentry);
+		}
 	}
 	if (fc->readdirplus_auto)
 		set_bit(FUSE_I_INIT_RDPLUS, &get_fuse_inode(inode)->state);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 155/219] btrfs: zoned: do not zone finish data relocation block group
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (153 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 154/219] fuse: nlookup missing decrement in fuse_direntplus_link Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 156/219] btrfs: fix start transaction qgroup rsv double free Greg Kroah-Hartman
                   ` (74 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Christoph Hellwig,
	Johannes Thumshirn, Naohiro Aota, David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Naohiro Aota <naohiro.aota@wdc.com>

commit 332581bde2a419d5f12a93a1cdc2856af649a3cc upstream.

When multiple writes happen at once, we may need to sacrifice a currently
active block group to be zone finished for a new allocation. We choose a
block group with the least free space left, and zone finish it.

To do the finishing, we need to send IOs for already allocated region
and wait for them and on-going IOs. Otherwise, these IOs fail because the
zone is already finished at the time the IO reach a device.

However, if a block group dedicated to the data relocation is zone
finished, there is a chance that finishing it before an ongoing write IO
reaches the device. That is because there is timing gap between an
allocation is done (block_group->reservations == 0, as pre-allocation is
done) and an ordered extent is created when the relocation IO starts.
Thus, if we finish the zone between them, we can fail the IOs.

We cannot simply use "fs_info->data_reloc_bg == block_group->start" to
avoid the zone finishing. Because, the data_reloc_bg may already switch to
a new block group, while there are still ongoing write IOs to the old
data_reloc_bg.

So, this patch reworks the BLOCK_GROUP_FLAG_ZONED_DATA_RELOC bit to
indicate there is a data relocation allocation and/or ongoing write to the
block group. The bit is set on allocation and cleared in end_io function of
the last IO for the currently allocated region.

To change the timing of the bit setting also solves the issue that the bit
being left even after there is no IO going on. With the current code, if
the data_reloc_bg switches after the last IO to the current data_reloc_bg,
the bit is set at this timing and there is no one clearing that bit. As a
result, that block group is kept unallocatable for anything.

Fixes: 343d8a30851c ("btrfs: zoned: prevent allocation from previous data relocation BG")
Fixes: 74e91b12b115 ("btrfs: zoned: zone finish unused block group")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/extent-tree.c |   43 +++++++++++++++++++++++--------------------
 fs/btrfs/zoned.c       |   16 +++++++++++++---
 2 files changed, 36 insertions(+), 23 deletions(-)

--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3810,7 +3810,8 @@ static int do_allocation_zoned(struct bt
 	       fs_info->data_reloc_bg == 0);
 
 	if (block_group->ro ||
-	    test_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags)) {
+	    (!ffe_ctl->for_data_reloc &&
+	     test_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags))) {
 		ret = 1;
 		goto out;
 	}
@@ -3853,8 +3854,26 @@ static int do_allocation_zoned(struct bt
 	if (ffe_ctl->for_treelog && !fs_info->treelog_bg)
 		fs_info->treelog_bg = block_group->start;
 
-	if (ffe_ctl->for_data_reloc && !fs_info->data_reloc_bg)
-		fs_info->data_reloc_bg = block_group->start;
+	if (ffe_ctl->for_data_reloc) {
+		if (!fs_info->data_reloc_bg)
+			fs_info->data_reloc_bg = block_group->start;
+		/*
+		 * Do not allow allocations from this block group, unless it is
+		 * for data relocation. Compared to increasing the ->ro, setting
+		 * the ->zoned_data_reloc_ongoing flag still allows nocow
+		 * writers to come in. See btrfs_inc_nocow_writers().
+		 *
+		 * We need to disable an allocation to avoid an allocation of
+		 * regular (non-relocation data) extent. With mix of relocation
+		 * extents and regular extents, we can dispatch WRITE commands
+		 * (for relocation extents) and ZONE APPEND commands (for
+		 * regular extents) at the same time to the same zone, which
+		 * easily break the write pointer.
+		 *
+		 * Also, this flag avoids this block group to be zone finished.
+		 */
+		set_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags);
+	}
 
 	ffe_ctl->found_offset = start + block_group->alloc_offset;
 	block_group->alloc_offset += num_bytes;
@@ -3872,24 +3891,8 @@ static int do_allocation_zoned(struct bt
 out:
 	if (ret && ffe_ctl->for_treelog)
 		fs_info->treelog_bg = 0;
-	if (ret && ffe_ctl->for_data_reloc &&
-	    fs_info->data_reloc_bg == block_group->start) {
-		/*
-		 * Do not allow further allocations from this block group.
-		 * Compared to increasing the ->ro, setting the
-		 * ->zoned_data_reloc_ongoing flag still allows nocow
-		 *  writers to come in. See btrfs_inc_nocow_writers().
-		 *
-		 * We need to disable an allocation to avoid an allocation of
-		 * regular (non-relocation data) extent. With mix of relocation
-		 * extents and regular extents, we can dispatch WRITE commands
-		 * (for relocation extents) and ZONE APPEND commands (for
-		 * regular extents) at the same time to the same zone, which
-		 * easily break the write pointer.
-		 */
-		set_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags);
+	if (ret && ffe_ctl->for_data_reloc)
 		fs_info->data_reloc_bg = 0;
-	}
 	spin_unlock(&fs_info->relocation_bg_lock);
 	spin_unlock(&fs_info->treelog_bg_lock);
 	spin_unlock(&block_group->lock);
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -2009,6 +2009,10 @@ static int do_zone_finish(struct btrfs_b
 	 * and block_group->meta_write_pointer for metadata.
 	 */
 	if (!fully_written) {
+		if (test_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags)) {
+			spin_unlock(&block_group->lock);
+			return -EAGAIN;
+		}
 		spin_unlock(&block_group->lock);
 
 		ret = btrfs_inc_block_group_ro(block_group, false);
@@ -2037,7 +2041,9 @@ static int do_zone_finish(struct btrfs_b
 			return 0;
 		}
 
-		if (block_group->reserved) {
+		if (block_group->reserved ||
+		    test_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC,
+			     &block_group->runtime_flags)) {
 			spin_unlock(&block_group->lock);
 			btrfs_dec_block_group_ro(block_group);
 			return -EAGAIN;
@@ -2268,7 +2274,10 @@ void btrfs_zoned_release_data_reloc_bg(s
 
 	/* All relocation extents are written. */
 	if (block_group->start + block_group->alloc_offset == logical + length) {
-		/* Now, release this block group for further allocations. */
+		/*
+		 * Now, release this block group for further allocations and
+		 * zone finish.
+		 */
 		clear_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC,
 			  &block_group->runtime_flags);
 	}
@@ -2292,7 +2301,8 @@ int btrfs_zone_finish_one_bg(struct btrf
 
 		spin_lock(&block_group->lock);
 		if (block_group->reserved || block_group->alloc_offset == 0 ||
-		    (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
+		    (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM) ||
+		    test_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags)) {
 			spin_unlock(&block_group->lock);
 			continue;
 		}



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 156/219] btrfs: fix start transaction qgroup rsv double free
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (154 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 155/219] btrfs: zoned: do not zone finish data relocation block group Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 157/219] btrfs: free qgroup rsv on io failure Greg Kroah-Hartman
                   ` (73 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Josef Bacik, Boris Burkov,
	David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Burkov <boris@bur.io>

commit a6496849671a5bc9218ecec25a983253b34351b1 upstream.

btrfs_start_transaction reserves metadata space of the PERTRANS type
before it identifies a transaction to start/join. This allows flushing
when reserving that space without a deadlock. However, it results in a
race which temporarily breaks qgroup rsv accounting.

T1                                              T2
start_transaction
do_stuff
                                            start_transaction
                                                qgroup_reserve_meta_pertrans
commit_transaction
    qgroup_free_meta_all_pertrans
                                            hit an error starting txn
                                            goto reserve_fail
                                            qgroup_free_meta_pertrans (already freed!)

The basic issue is that there is nothing preventing another commit from
committing before start_transaction finishes (in fact sometimes we
intentionally wait for it) so any error path that frees the reserve is
at risk of this race.

While this exact space was getting freed anyway, and it's not a huge
deal to double free it (just a warning, the free code catches this), it
can result in incorrectly freeing some other pertrans reservation in
this same reservation, which could then lead to spuriously granting
reservations we might not have the space for. Therefore, I do believe it
is worth fixing.

To fix it, use the existing prealloc->pertrans conversion mechanism.
When we first reserve the space, we reserve prealloc space and only when
we are sure we have a transaction do we convert it to pertrans. This way
any racing commits do not blow away our reservation, but we still get a
pertrans reservation that is freed when _this_ transaction gets committed.

This issue can be reproduced by running generic/269 with either qgroups
or squotas enabled via mkfs on the scratch device.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/transaction.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -580,8 +580,13 @@ start_transaction(struct btrfs_root *roo
 		u64 delayed_refs_bytes = 0;
 
 		qgroup_reserved = num_items * fs_info->nodesize;
-		ret = btrfs_qgroup_reserve_meta_pertrans(root, qgroup_reserved,
-				enforce_qgroups);
+		/*
+		 * Use prealloc for now, as there might be a currently running
+		 * transaction that could free this reserved space prematurely
+		 * by committing.
+		 */
+		ret = btrfs_qgroup_reserve_meta_prealloc(root, qgroup_reserved,
+							 enforce_qgroups, false);
 		if (ret)
 			return ERR_PTR(ret);
 
@@ -693,6 +698,14 @@ again:
 		h->reloc_reserved = reloc_reserved;
 	}
 
+	/*
+	 * Now that we have found a transaction to be a part of, convert the
+	 * qgroup reservation from prealloc to pertrans. A different transaction
+	 * can't race in and free our pertrans out from under us.
+	 */
+	if (qgroup_reserved)
+		btrfs_qgroup_convert_reserved_meta(root, qgroup_reserved);
+
 got_it:
 	if (!current->journal_info)
 		current->journal_info = h;
@@ -740,7 +753,7 @@ alloc_fail:
 		btrfs_block_rsv_release(fs_info, &fs_info->trans_block_rsv,
 					num_bytes, NULL);
 reserve_fail:
-	btrfs_qgroup_free_meta_pertrans(root, qgroup_reserved);
+	btrfs_qgroup_free_meta_prealloc(root, qgroup_reserved);
 	return ERR_PTR(ret);
 }
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 157/219] btrfs: free qgroup rsv on io failure
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (155 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 156/219] btrfs: fix start transaction qgroup rsv double free Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 158/219] btrfs: dont start transaction when joining with TRANS_JOIN_NOSTART Greg Kroah-Hartman
                   ` (72 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Josef Bacik, Boris Burkov,
	David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Boris Burkov <boris@bur.io>

commit e28b02118b94e42be3355458a2406c6861e2dd32 upstream.

If we do a write whose bio suffers an error, we will never reclaim the
qgroup reserved space for it. We allocate the space in the write_iter
codepath, then release the reservation as we allocate the ordered
extent, but we only create a delayed ref if the ordered extent finishes.
If it has an error, we simply leak the rsv. This is apparent in running
any error injecting (dmerror) fstests like btrfs/146 or btrfs/160. Such
tests fail due to dmesg on umount complaining about the leaked qgroup
data space.

When we clean up other aspects of space on failed ordered_extents, also
free the qgroup rsv.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/inode.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3393,6 +3393,13 @@ out:
 			btrfs_free_reserved_extent(fs_info,
 					ordered_extent->disk_bytenr,
 					ordered_extent->disk_num_bytes, 1);
+			/*
+			 * Actually free the qgroup rsv which was released when
+			 * the ordered extent was created.
+			 */
+			btrfs_qgroup_free_refroot(fs_info, inode->root->root_key.objectid,
+						  ordered_extent->qgroup_rsv,
+						  BTRFS_QGROUP_RSV_DATA);
 		}
 	}
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 158/219] btrfs: dont start transaction when joining with TRANS_JOIN_NOSTART
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (156 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 157/219] btrfs: free qgroup rsv on io failure Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 159/219] btrfs: set page extent mapped after read_folio in relocate_one_page Greg Kroah-Hartman
                   ` (71 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Filipe Manana, David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana <fdmanana@suse.com>

commit 4490e803e1fe9fab8db5025e44e23b55df54078b upstream.

When joining a transaction with TRANS_JOIN_NOSTART, if we don't find a
running transaction we end up creating one. This goes against the purpose
of TRANS_JOIN_NOSTART which is to join a running transaction if its state
is at or below the state TRANS_STATE_COMMIT_START, otherwise return an
-ENOENT error and don't start a new transaction. So fix this to not create
a new transaction if there's no running transaction at or below that
state.

CC: stable@vger.kernel.org # 4.14+
Fixes: a6d155d2e363 ("Btrfs: fix deadlock between fiemap and transaction commits")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/transaction.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -279,10 +279,11 @@ loop:
 	spin_unlock(&fs_info->trans_lock);
 
 	/*
-	 * If we are ATTACH, we just want to catch the current transaction,
-	 * and commit it. If there is no transaction, just return ENOENT.
+	 * If we are ATTACH or TRANS_JOIN_NOSTART, we just want to catch the
+	 * current transaction, and commit it. If there is no transaction, just
+	 * return ENOENT.
 	 */
-	if (type == TRANS_ATTACH)
+	if (type == TRANS_ATTACH || type == TRANS_JOIN_NOSTART)
 		return -ENOENT;
 
 	/*



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 159/219] btrfs: set page extent mapped after read_folio in relocate_one_page
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (157 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 158/219] btrfs: dont start transaction when joining with TRANS_JOIN_NOSTART Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 160/219] btrfs: zoned: re-enable metadata over-commit for zoned mode Greg Kroah-Hartman
                   ` (70 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Filipe Manana, Josef Bacik,
	David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Josef Bacik <josef@toxicpanda.com>

commit e7f1326cc24e22b38afc3acd328480a1183f9e79 upstream.

One of the CI runs triggered the following panic

  assertion failed: PagePrivate(page) && page->private, in fs/btrfs/subpage.c:229
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/subpage.c:229!
  Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
  CPU: 0 PID: 923660 Comm: btrfs Not tainted 6.5.0-rc3+ #1
  pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
  pc : btrfs_subpage_assert+0xbc/0xf0
  lr : btrfs_subpage_assert+0xbc/0xf0
  sp : ffff800093213720
  x29: ffff800093213720 x28: ffff8000932138b4 x27: 000000000c280000
  x26: 00000001b5d00000 x25: 000000000c281000 x24: 000000000c281fff
  x23: 0000000000001000 x22: 0000000000000000 x21: ffffff42b95bf880
  x20: ffff42b9528e0000 x19: 0000000000001000 x18: ffffffffffffffff
  x17: 667274622f736620 x16: 6e69202c65746176 x15: 0000000000000028
  x14: 0000000000000003 x13: 00000000002672d7 x12: 0000000000000000
  x11: ffffcd3f0ccd9204 x10: ffffcd3f0554ae50 x9 : ffffcd3f0379528c
  x8 : ffff800093213428 x7 : 0000000000000000 x6 : ffffcd3f091771e8
  x5 : ffff42b97f333948 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : 0000000000000000 x1 : ffff42b9556cde80 x0 : 000000000000004f
  Call trace:
   btrfs_subpage_assert+0xbc/0xf0
   btrfs_subpage_set_dirty+0x38/0xa0
   btrfs_page_set_dirty+0x58/0x88
   relocate_one_page+0x204/0x5f0
   relocate_file_extent_cluster+0x11c/0x180
   relocate_data_extent+0xd0/0xf8
   relocate_block_group+0x3d0/0x4e8
   btrfs_relocate_block_group+0x2d8/0x490
   btrfs_relocate_chunk+0x54/0x1a8
   btrfs_balance+0x7f4/0x1150
   btrfs_ioctl+0x10f0/0x20b8
   __arm64_sys_ioctl+0x120/0x11d8
   invoke_syscall.constprop.0+0x80/0xd8
   do_el0_svc+0x6c/0x158
   el0_svc+0x50/0x1b0
   el0t_64_sync_handler+0x120/0x130
   el0t_64_sync+0x194/0x198
  Code: 91098021 b0007fa0 91346000 97e9c6d2 (d4210000)

This is the same problem outlined in 17b17fcd6d44 ("btrfs:
set_page_extent_mapped after read_folio in btrfs_cont_expand") , and the
fix is the same.  I originally looked for the same pattern elsewhere in
our code, but mistakenly skipped over this code because I saw the page
cache readahead before we set_page_extent_mapped, not realizing that
this was only in the !page case, that we can still end up with a
!uptodate page and then do the btrfs_read_folio further down.

The fix here is the same as the above mentioned patch, move the
set_page_extent_mapped call to after the btrfs_read_folio() block to
make sure that we have the subpage blocksize stuff setup properly before
using the page.

CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/relocation.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2985,9 +2985,6 @@ static int relocate_one_page(struct inod
 		if (!page)
 			return -ENOMEM;
 	}
-	ret = set_page_extent_mapped(page);
-	if (ret < 0)
-		goto release_page;
 
 	if (PageReadahead(page))
 		page_cache_async_readahead(inode->i_mapping, ra, NULL,
@@ -3003,6 +3000,15 @@ static int relocate_one_page(struct inod
 		}
 	}
 
+	/*
+	 * We could have lost page private when we dropped the lock to read the
+	 * page above, make sure we set_page_extent_mapped here so we have any
+	 * of the subpage blocksize stuff we need in place.
+	 */
+	ret = set_page_extent_mapped(page);
+	if (ret < 0)
+		goto release_page;
+
 	page_start = page_offset(page);
 	page_end = page_start + PAGE_SIZE - 1;
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 160/219] btrfs: zoned: re-enable metadata over-commit for zoned mode
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (158 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 159/219] btrfs: set page extent mapped after read_folio in relocate_one_page Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 161/219] btrfs: use the correct superblock to compare fsid in btrfs_validate_super Greg Kroah-Hartman
                   ` (69 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn, Naohiro Aota,
	David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Naohiro Aota <naohiro.aota@wdc.com>

commit 5b135b382a360f4c87cf8896d1465b0b07f10cb0 upstream.

Now that, we can re-enable metadata over-commit. As we moved the activation
from the reservation time to the write time, we no longer need to ensure
all the reserved bytes is properly activated.

Without the metadata over-commit, it suffers from lower performance because
it needs to flush the delalloc items more often and allocate more block
groups. Re-enabling metadata over-commit will solve the issue.

Fixes: 79417d040f4f ("btrfs: zoned: disable metadata overcommit for zoned")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/space-info.c |    6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -404,11 +404,7 @@ int btrfs_can_overcommit(struct btrfs_fs
 		return 0;
 
 	used = btrfs_space_info_used(space_info, true);
-	if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags) &&
-	    (space_info->flags & BTRFS_BLOCK_GROUP_METADATA))
-		avail = 0;
-	else
-		avail = calc_available_free_space(fs_info, space_info, flush);
+	avail = calc_available_free_space(fs_info, space_info, flush);
 
 	if (used + bytes < writable_total_bytes(fs_info, space_info) + avail)
 		return 1;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 161/219] btrfs: use the correct superblock to compare fsid in btrfs_validate_super
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (159 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 160/219] btrfs: zoned: re-enable metadata over-commit for zoned mode Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 162/219] drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable() Greg Kroah-Hartman
                   ` (68 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn,
	Guilherme G. Piccoli, Anand Jain, David Sterba

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Anand Jain <anand.jain@oracle.com>

commit d167aa76dc0683828588c25767da07fb549e4f48 upstream.

The function btrfs_validate_super() should verify the fsid in the provided
superblock argument. Because, all its callers expect it to do that.

Such as in the following stack:

   write_all_supers()
       sb = fs_info->super_for_commit;
       btrfs_validate_write_super(.., sb)
         btrfs_validate_super(.., sb, ..)

   scrub_one_super()
	btrfs_validate_super(.., sb, ..)

And
   check_dev_super()
	btrfs_validate_super(.., sb, ..)

However, it currently verifies the fs_info::super_copy::fsid instead,
which is not correct.  Fix this using the correct fsid in the superblock
argument.

CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/disk-io.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2721,11 +2721,10 @@ int btrfs_validate_super(struct btrfs_fs
 		ret = -EINVAL;
 	}
 
-	if (memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid,
-		   BTRFS_FSID_SIZE)) {
+	if (memcmp(fs_info->fs_devices->fsid, sb->fsid, BTRFS_FSID_SIZE) != 0) {
 		btrfs_err(fs_info,
 		"superblock fsid doesn't match fsid of fs_devices: %pU != %pU",
-			fs_info->super_copy->fsid, fs_info->fs_devices->fsid);
+			  sb->fsid, fs_info->fs_devices->fsid);
 		ret = -EINVAL;
 	}
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 162/219] drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (160 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 161/219] btrfs: use the correct superblock to compare fsid in btrfs_validate_super Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 163/219] mtd: rawnand: brcmnand: Fix crash during the panic_write Greg Kroah-Hartman
                   ` (67 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Liu Ying, Marek Vasut

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Ying <victor.liu@nxp.com>

commit aa656d48e871a1b062e1bbf9474d8b831c35074c upstream.

When disabling overlay plane in mxsfb_plane_overlay_atomic_update(),
overlay plane's framebuffer pointer is NULL.  So, dereferencing it would
cause a kernel Oops(NULL pointer dereferencing).  Fix the issue by
disabling overlay plane in mxsfb_plane_overlay_atomic_disable() instead.

Fixes: cb285a5348e7 ("drm: mxsfb: Replace mxsfb_get_fb_paddr() with drm_fb_cma_get_gem_addr()")
Cc: stable@vger.kernel.org # 5.19+
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230612092359.784115-1-victor.liu@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/mxsfb/mxsfb_kms.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
@@ -611,6 +611,14 @@ static void mxsfb_plane_overlay_atomic_u
 	writel(ctrl, mxsfb->base + LCDC_AS_CTRL);
 }
 
+static void mxsfb_plane_overlay_atomic_disable(struct drm_plane *plane,
+					       struct drm_atomic_state *state)
+{
+	struct mxsfb_drm_private *mxsfb = to_mxsfb_drm_private(plane->dev);
+
+	writel(0, mxsfb->base + LCDC_AS_CTRL);
+}
+
 static bool mxsfb_format_mod_supported(struct drm_plane *plane,
 				       uint32_t format,
 				       uint64_t modifier)
@@ -626,6 +634,7 @@ static const struct drm_plane_helper_fun
 static const struct drm_plane_helper_funcs mxsfb_plane_overlay_helper_funcs = {
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_overlay_atomic_update,
+	.atomic_disable = mxsfb_plane_overlay_atomic_disable,
 };
 
 static const struct drm_plane_funcs mxsfb_plane_funcs = {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 163/219] mtd: rawnand: brcmnand: Fix crash during the panic_write
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (161 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 162/219] drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable() Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 164/219] mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write Greg Kroah-Hartman
                   ` (66 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, William Zhang, Florian Fainelli,
	Kursad Oney, Kamal Dasu, Miquel Raynal

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: William Zhang <william.zhang@broadcom.com>

commit e66dd317194daae0475fe9e5577c80aa97f16cb9 upstream.

When executing a NAND command within the panic write path, wait for any
pending command instead of calling BUG_ON to avoid crashing while
already crashing.

Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Kursad Oney <kursad.oney@broadcom.com>
Reviewed-by: Kamal Dasu <kamal.dasu@broadcom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-4-william.zhang@broadcom.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/mtd/nand/raw/brcmnand/brcmnand.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -1592,7 +1592,17 @@ static void brcmnand_send_cmd(struct brc
 
 	dev_dbg(ctrl->dev, "send native cmd %d addr 0x%llx\n", cmd, cmd_addr);
 
-	BUG_ON(ctrl->cmd_pending != 0);
+	/*
+	 * If we came here through _panic_write and there is a pending
+	 * command, try to wait for it. If it times out, rather than
+	 * hitting BUG_ON, just return so we don't crash while crashing.
+	 */
+	if (oops_in_progress) {
+		if (ctrl->cmd_pending &&
+			bcmnand_ctrl_poll_status(ctrl, NAND_CTRL_RDY, NAND_CTRL_RDY, 0))
+			return;
+	} else
+		BUG_ON(ctrl->cmd_pending != 0);
 	ctrl->cmd_pending = cmd;
 
 	ret = bcmnand_ctrl_poll_status(ctrl, NAND_CTRL_RDY, NAND_CTRL_RDY, 0);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 164/219] mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (162 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 163/219] mtd: rawnand: brcmnand: Fix crash during the panic_write Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 165/219] mtd: spi-nor: Correct flags for Winbond w25q128 Greg Kroah-Hartman
                   ` (65 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, William Zhang, Florian Fainelli,
	Miquel Raynal

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: William Zhang <william.zhang@broadcom.com>

commit 5d53244186c9ac58cb88d76a0958ca55b83a15cd upstream.

When the oob buffer length is not in multiple of words, the oob write
function does out-of-bounds read on the oob source buffer at the last
iteration. Fix that by always checking length limit on the oob buffer
read and fill with 0xff when reaching the end of the buffer to the oob
registers.

Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-5-william.zhang@broadcom.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/mtd/nand/raw/brcmnand/brcmnand.c |   18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -1461,19 +1461,33 @@ static int write_oob_to_regs(struct brcm
 			     const u8 *oob, int sas, int sector_1k)
 {
 	int tbytes = sas << sector_1k;
-	int j;
+	int j, k = 0;
+	u32 last = 0xffffffff;
+	u8 *plast = (u8 *)&last;
 
 	/* Adjust OOB values for 1K sector size */
 	if (sector_1k && (i & 0x01))
 		tbytes = max(0, tbytes - (int)ctrl->max_oob);
 	tbytes = min_t(int, tbytes, ctrl->max_oob);
 
-	for (j = 0; j < tbytes; j += 4)
+	/*
+	 * tbytes may not be multiple of words. Make sure we don't read out of
+	 * the boundary and stop at last word.
+	 */
+	for (j = 0; (j + 3) < tbytes; j += 4)
 		oob_reg_write(ctrl, j,
 				(oob[j + 0] << 24) |
 				(oob[j + 1] << 16) |
 				(oob[j + 2] <<  8) |
 				(oob[j + 3] <<  0));
+
+	/* handle the remaing bytes */
+	while (j < tbytes)
+		plast[k++] = oob[j++];
+
+	if (tbytes & 0x3)
+		oob_reg_write(ctrl, (tbytes & ~0x3), (__force u32)cpu_to_be32(last));
+
 	return tbytes;
 }
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 165/219] mtd: spi-nor: Correct flags for Winbond w25q128
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (163 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 164/219] mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 166/219] mtd: rawnand: brcmnand: Fix potential false time out warning Greg Kroah-Hartman
                   ` (64 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Michael Walle, Linus Walleij,
	Tudor Ambarus

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Linus Walleij <linus.walleij@linaro.org>

commit 83e824a4a595132f9bd7ac4f5afff857bfc5991e upstream.

The Winbond "w25q128" (actual vendor name W25Q128JV) has
exactly the same flags as the sibling device "w25q128jv".
The devices both require unlocking to enable write access.

The actual product naming between devices vs the Linux
strings in winbond.c:

0xef4018: "w25q128"   W25Q128JV-IN/IQ/JQ
0xef7018: "w25q128jv" W25Q128JV-IM/JM

The latter device, "w25q128jv" supports features named DTQ
and QPI, otherwise it is the same.

Not having the right flags has the annoying side effect
that write access does not work.

After this patch I can write to the flash on the Inteno
XG6846 router.

The flash memory also supports dual and quad SPI modes.
This does not currently manifest, but by turning on SFDP
parsing, the right SPI modes are emitted in
/sys/kernel/debug/spi-nor/spi1.0/capabilities
for this chip, so we also turn on this.

Since we now have determined that SFDP parsing works on
the device, we also detect the geometry using SFDP.

After this dmesg and sysfs says:
[    1.062401] spi-nor spi1.0: w25q128 (16384 Kbytes)
cat erasesize
65536
(16384*1024)/65536 = 256 sectors

spi-nor sysfs:
cat jedec_id
ef4018
cat manufacturer
winbond
cat partname
w25q128
hexdump -v -C sfdp
00000000  53 46 44 50 05 01 00 ff  00 05 01 10 80 00 00 ff
00000010  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000020  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000030  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000040  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000050  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000060  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000070  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
00000080  e5 20 f9 ff ff ff ff 07  44 eb 08 6b 08 3b 42 bb
00000090  fe ff ff ff ff ff 00 00  ff ff 40 eb 0c 20 0f 52
000000a0  10 d8 00 00 36 02 a6 00  82 ea 14 c9 e9 63 76 33
000000b0  7a 75 7a 75 f7 a2 d5 5c  19 f7 4d ff e9 30 f8 80

Cc: stable@vger.kernel.org
Suggested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Michael Walle <michael@walle.cc>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230718-spi-nor-winbond-w25q128-v5-1-a73653ee46c3@linaro.org
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/mtd/spi-nor/winbond.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/mtd/spi-nor/winbond.c
+++ b/drivers/mtd/spi-nor/winbond.c
@@ -120,8 +120,9 @@ static const struct flash_info winbond_n
 		NO_SFDP_FLAGS(SECT_4K) },
 	{ "w25q80bl", INFO(0xef4014, 0, 64 * 1024,  16)
 		NO_SFDP_FLAGS(SECT_4K) },
-	{ "w25q128", INFO(0xef4018, 0, 64 * 1024, 256)
-		NO_SFDP_FLAGS(SECT_4K) },
+	{ "w25q128", INFO(0xef4018, 0, 0, 0)
+		PARSE_SFDP
+		FLAGS(SPI_NOR_HAS_LOCK | SPI_NOR_HAS_TB) },
 	{ "w25q256", INFO(0xef4019, 0, 64 * 1024, 512)
 		NO_SFDP_FLAGS(SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ)
 		.fixups = &w25q256_fixups },



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 166/219] mtd: rawnand: brcmnand: Fix potential false time out warning
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (164 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 165/219] mtd: spi-nor: Correct flags for Winbond w25q128 Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 167/219] mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller Greg Kroah-Hartman
                   ` (63 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, William Zhang, Florian Fainelli,
	Miquel Raynal

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: William Zhang <william.zhang@broadcom.com>

commit 9cc0a598b944816f2968baf2631757f22721b996 upstream.

If system is busy during the command status polling function, the driver
may not get the chance to poll the status register till the end of time
out and return the premature status.  Do a final check after time out
happens to ensure reading the correct status.

Fixes: 9d2ee0a60b8b ("mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program")
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-3-william.zhang@broadcom.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/mtd/nand/raw/brcmnand/brcmnand.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -1072,6 +1072,14 @@ static int bcmnand_ctrl_poll_status(stru
 		cpu_relax();
 	} while (time_after(limit, jiffies));
 
+	/*
+	 * do a final check after time out in case the CPU was busy and the driver
+	 * did not get enough time to perform the polling to avoid false alarms
+	 */
+	val = brcmnand_read_reg(ctrl, BRCMNAND_INTFC_STATUS);
+	if ((val & mask) == expected_val)
+		return 0;
+
 	dev_warn(ctrl->dev, "timeout on status poll (expected %x got %x)\n",
 		 expected_val, val & mask);
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 167/219] mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (165 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 166/219] mtd: rawnand: brcmnand: Fix potential false time out warning Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 168/219] drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma Greg Kroah-Hartman
                   ` (62 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, William Zhang, Florian Fainelli,
	Miquel Raynal

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: William Zhang <william.zhang@broadcom.com>

commit 2ec2839a9062db8a592525a3fdabd42dcd9a3a9b upstream.

v7.2 controller has different ECC level field size and shift in the acc
control register than its predecessor and successor controller. It needs
to be set specifically.

Fixes: decba6d47869 ("mtd: brcmnand: Add v7.2 controller support")
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-2-william.zhang@broadcom.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/mtd/nand/raw/brcmnand/brcmnand.c |   74 +++++++++++++++++--------------
 1 file changed, 41 insertions(+), 33 deletions(-)

--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -272,6 +272,7 @@ struct brcmnand_controller {
 	const unsigned int	*page_sizes;
 	unsigned int		page_size_shift;
 	unsigned int		max_oob;
+	u32			ecc_level_shift;
 	u32			features;
 
 	/* for low-power standby/resume only */
@@ -596,6 +597,34 @@ enum {
 	INTFC_CTLR_READY		= BIT(31),
 };
 
+/***********************************************************************
+ * NAND ACC CONTROL bitfield
+ *
+ * Some bits have remained constant throughout hardware revision, while
+ * others have shifted around.
+ ***********************************************************************/
+
+/* Constant for all versions (where supported) */
+enum {
+	/* See BRCMNAND_HAS_CACHE_MODE */
+	ACC_CONTROL_CACHE_MODE				= BIT(22),
+
+	/* See BRCMNAND_HAS_PREFETCH */
+	ACC_CONTROL_PREFETCH				= BIT(23),
+
+	ACC_CONTROL_PAGE_HIT				= BIT(24),
+	ACC_CONTROL_WR_PREEMPT				= BIT(25),
+	ACC_CONTROL_PARTIAL_PAGE			= BIT(26),
+	ACC_CONTROL_RD_ERASED				= BIT(27),
+	ACC_CONTROL_FAST_PGM_RDIN			= BIT(28),
+	ACC_CONTROL_WR_ECC				= BIT(30),
+	ACC_CONTROL_RD_ECC				= BIT(31),
+};
+
+#define	ACC_CONTROL_ECC_SHIFT			16
+/* Only for v7.2 */
+#define	ACC_CONTROL_ECC_EXT_SHIFT		13
+
 static inline bool brcmnand_non_mmio_ops(struct brcmnand_controller *ctrl)
 {
 #if IS_ENABLED(CONFIG_MTD_NAND_BRCMNAND_BCMA)
@@ -737,6 +766,12 @@ static int brcmnand_revision_init(struct
 	else if (of_property_read_bool(ctrl->dev->of_node, "brcm,nand-has-wp"))
 		ctrl->features |= BRCMNAND_HAS_WP;
 
+	/* v7.2 has different ecc level shift in the acc register */
+	if (ctrl->nand_version == 0x0702)
+		ctrl->ecc_level_shift = ACC_CONTROL_ECC_EXT_SHIFT;
+	else
+		ctrl->ecc_level_shift = ACC_CONTROL_ECC_SHIFT;
+
 	return 0;
 }
 
@@ -931,30 +966,6 @@ static inline int brcmnand_cmd_shift(str
 	return 0;
 }
 
-/***********************************************************************
- * NAND ACC CONTROL bitfield
- *
- * Some bits have remained constant throughout hardware revision, while
- * others have shifted around.
- ***********************************************************************/
-
-/* Constant for all versions (where supported) */
-enum {
-	/* See BRCMNAND_HAS_CACHE_MODE */
-	ACC_CONTROL_CACHE_MODE				= BIT(22),
-
-	/* See BRCMNAND_HAS_PREFETCH */
-	ACC_CONTROL_PREFETCH				= BIT(23),
-
-	ACC_CONTROL_PAGE_HIT				= BIT(24),
-	ACC_CONTROL_WR_PREEMPT				= BIT(25),
-	ACC_CONTROL_PARTIAL_PAGE			= BIT(26),
-	ACC_CONTROL_RD_ERASED				= BIT(27),
-	ACC_CONTROL_FAST_PGM_RDIN			= BIT(28),
-	ACC_CONTROL_WR_ECC				= BIT(30),
-	ACC_CONTROL_RD_ECC				= BIT(31),
-};
-
 static inline u32 brcmnand_spare_area_mask(struct brcmnand_controller *ctrl)
 {
 	if (ctrl->nand_version == 0x0702)
@@ -967,18 +978,15 @@ static inline u32 brcmnand_spare_area_ma
 		return GENMASK(4, 0);
 }
 
-#define NAND_ACC_CONTROL_ECC_SHIFT	16
-#define NAND_ACC_CONTROL_ECC_EXT_SHIFT	13
-
 static inline u32 brcmnand_ecc_level_mask(struct brcmnand_controller *ctrl)
 {
 	u32 mask = (ctrl->nand_version >= 0x0600) ? 0x1f : 0x0f;
 
-	mask <<= NAND_ACC_CONTROL_ECC_SHIFT;
+	mask <<= ACC_CONTROL_ECC_SHIFT;
 
 	/* v7.2 includes additional ECC levels */
-	if (ctrl->nand_version >= 0x0702)
-		mask |= 0x7 << NAND_ACC_CONTROL_ECC_EXT_SHIFT;
+	if (ctrl->nand_version == 0x0702)
+		mask |= 0x7 << ACC_CONTROL_ECC_EXT_SHIFT;
 
 	return mask;
 }
@@ -992,8 +1000,8 @@ static void brcmnand_set_ecc_enabled(str
 
 	if (en) {
 		acc_control |= ecc_flags; /* enable RD/WR ECC */
-		acc_control |= host->hwcfg.ecc_level
-			       << NAND_ACC_CONTROL_ECC_SHIFT;
+		acc_control &= ~brcmnand_ecc_level_mask(ctrl);
+		acc_control |= host->hwcfg.ecc_level << ctrl->ecc_level_shift;
 	} else {
 		acc_control &= ~ecc_flags; /* disable RD/WR ECC */
 		acc_control &= ~brcmnand_ecc_level_mask(ctrl);
@@ -2593,7 +2601,7 @@ static int brcmnand_set_cfg(struct brcmn
 	tmp &= ~brcmnand_ecc_level_mask(ctrl);
 	tmp &= ~brcmnand_spare_area_mask(ctrl);
 	if (ctrl->nand_version >= 0x0302) {
-		tmp |= cfg->ecc_level << NAND_ACC_CONTROL_ECC_SHIFT;
+		tmp |= cfg->ecc_level << ctrl->ecc_level_shift;
 		tmp |= cfg->spare_area_size;
 	}
 	nand_writereg(ctrl, acc_control_offs, tmp);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 168/219] drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (166 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 167/219] mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 169/219] drm/amd/display: prevent potential division by zero errors Greg Kroah-Hartman
                   ` (61 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Melissa Wen, Alex Deucher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Melissa Wen <mwen@igalia.com>

commit 57a943ebfcdb4a97fbb409640234bdb44bfa1953 upstream.

For DRM legacy gamma, AMD display manager applies implicit sRGB degamma
using a pre-defined sRGB transfer function. It works fine for DCN2
family where degamma ROM and custom curves go to the same color block.
But, on DCN3+, degamma is split into two blocks: degamma ROM for
pre-defined TFs and `gamma correction` for user/custom curves and
degamma ROM settings doesn't apply to cursor plane. To get DRM legacy
gamma working as expected, enable cursor degamma ROM for implict sRGB
degamma on HW with this configuration.

Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803
Fixes: 96b020e2163f ("drm/amd/display: check attr flag before set cursor degamma on DCN3+")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1269,6 +1269,13 @@ void handle_cursor_update(struct drm_pla
 	attributes.rotation_angle    = 0;
 	attributes.attribute_flags.value = 0;
 
+	/* Enable cursor degamma ROM on DCN3+ for implicit sRGB degamma in DRM
+	 * legacy gamma setup.
+	 */
+	if (crtc_state->cm_is_degamma_srgb &&
+	    adev->dm.dc->caps.color.dpp.gamma_corr)
+		attributes.attribute_flags.bits.ENABLE_CURSOR_DEGAMMA = 1;
+
 	attributes.pitch = afb->base.pitches[0] / afb->base.format->cpp[0];
 
 	if (crtc_state->stream) {



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 169/219] drm/amd/display: prevent potential division by zero errors
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (167 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 168/219] drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 170/219] KVM: SVM: Take and hold ir_list_lock when updating vCPUs Physical ID entry Greg Kroah-Hartman
                   ` (60 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Aurabindo Pillai, Hamza Mahfooz,
	Alex Deucher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hamza Mahfooz <hamza.mahfooz@amd.com>

commit 07e388aab042774f284a2ad75a70a194517cdad4 upstream.

There are two places in apply_below_the_range() where it's possible for
a divide by zero error to occur. So, to fix this make sure the divisor
is non-zero before attempting the computation in both cases.

Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2637
Fixes: a463b263032f ("drm/amd/display: Fix frames_to_insert math")
Fixes: ded6119e825a ("drm/amd/display: Reinstate LFC optimization")
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -338,7 +338,9 @@ static void apply_below_the_range(struct
 		 *  - Delta for CEIL: delta_from_mid_point_in_us_1
 		 *  - Delta for FLOOR: delta_from_mid_point_in_us_2
 		 */
-		if ((last_render_time_in_us / mid_point_frames_ceil) < in_out_vrr->min_duration_in_us) {
+		if (mid_point_frames_ceil &&
+		    (last_render_time_in_us / mid_point_frames_ceil) <
+		    in_out_vrr->min_duration_in_us) {
 			/* Check for out of range.
 			 * If using CEIL produces a value that is out of range,
 			 * then we are forced to use FLOOR.
@@ -385,8 +387,9 @@ static void apply_below_the_range(struct
 		/* Either we've calculated the number of frames to insert,
 		 * or we need to insert min duration frames
 		 */
-		if (last_render_time_in_us / frames_to_insert <
-				in_out_vrr->min_duration_in_us){
+		if (frames_to_insert &&
+		    (last_render_time_in_us / frames_to_insert) <
+		    in_out_vrr->min_duration_in_us){
 			frames_to_insert -= (frames_to_insert > 1) ?
 					1 : 0;
 		}



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 170/219] KVM: SVM: Take and hold ir_list_lock when updating vCPUs Physical ID entry
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (168 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 169/219] drm/amd/display: prevent potential division by zero errors Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 171/219] KVM: SVM: Dont inject #UD if KVM attempts to skip SEV guest insn Greg Kroah-Hartman
                   ` (59 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Alejandro Jimenez, Joao Martins,
	Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit 4c08e737f056fec930b416a2bd37ed266d724f95 upstream.

Hoist the acquisition of ir_list_lock from avic_update_iommu_vcpu_affinity()
to its two callers, avic_vcpu_load() and avic_vcpu_put(), specifically to
encapsulate the write to the vCPU's entry in the AVIC Physical ID table.
This will allow a future fix to pull information from the Physical ID entry
when updating the IRTE, without potentially consuming stale information,
i.e. without racing with the vCPU being (un)loaded.

Add a comment to call out that ir_list_lock does NOT protect against
multiple writers, specifically that reading the Physical ID entry in
avic_vcpu_put() outside of the lock is safe.

To preserve some semblance of independence from ir_list_lock, keep the
READ_ONCE() in avic_vcpu_load() even though acuiring the spinlock
effectively ensures the load(s) will be generated after acquiring the
lock.

Cc: stable@vger.kernel.org
Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20230808233132.2499764-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/avic.c |   31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -1022,10 +1022,11 @@ static inline int
 avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int cpu, bool r)
 {
 	int ret = 0;
-	unsigned long flags;
 	struct amd_svm_iommu_ir *ir;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
+	lockdep_assert_held(&svm->ir_list_lock);
+
 	if (!kvm_arch_has_assigned_device(vcpu->kvm))
 		return 0;
 
@@ -1033,19 +1034,15 @@ avic_update_iommu_vcpu_affinity(struct k
 	 * Here, we go through the per-vcpu ir_list to update all existing
 	 * interrupt remapping table entry targeting this vcpu.
 	 */
-	spin_lock_irqsave(&svm->ir_list_lock, flags);
-
 	if (list_empty(&svm->ir_list))
-		goto out;
+		return 0;
 
 	list_for_each_entry(ir, &svm->ir_list, node) {
 		ret = amd_iommu_update_ga(cpu, r, ir->data);
 		if (ret)
-			break;
+			return ret;
 	}
-out:
-	spin_unlock_irqrestore(&svm->ir_list_lock, flags);
-	return ret;
+	return 0;
 }
 
 void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -1053,6 +1050,7 @@ void avic_vcpu_load(struct kvm_vcpu *vcp
 	u64 entry;
 	int h_physical_id = kvm_cpu_get_apicid(cpu);
 	struct vcpu_svm *svm = to_svm(vcpu);
+	unsigned long flags;
 
 	lockdep_assert_preemption_disabled();
 
@@ -1069,6 +1067,8 @@ void avic_vcpu_load(struct kvm_vcpu *vcp
 	if (kvm_vcpu_is_blocking(vcpu))
 		return;
 
+	spin_lock_irqsave(&svm->ir_list_lock, flags);
+
 	entry = READ_ONCE(*(svm->avic_physical_id_cache));
 
 	entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
@@ -1077,25 +1077,40 @@ void avic_vcpu_load(struct kvm_vcpu *vcp
 
 	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
 	avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, true);
+
+	spin_unlock_irqrestore(&svm->ir_list_lock, flags);
 }
 
 void avic_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	u64 entry;
 	struct vcpu_svm *svm = to_svm(vcpu);
+	unsigned long flags;
 
 	lockdep_assert_preemption_disabled();
 
+	/*
+	 * Note, reading the Physical ID entry outside of ir_list_lock is safe
+	 * as only the pCPU that has loaded (or is loading) the vCPU is allowed
+	 * to modify the entry, and preemption is disabled.  I.e. the vCPU
+	 * can't be scheduled out and thus avic_vcpu_{put,load}() can't run
+	 * recursively.
+	 */
 	entry = READ_ONCE(*(svm->avic_physical_id_cache));
 
 	/* Nothing to do if IsRunning == '0' due to vCPU blocking. */
 	if (!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK))
 		return;
 
+	spin_lock_irqsave(&svm->ir_list_lock, flags);
+
 	avic_update_iommu_vcpu_affinity(vcpu, -1, 0);
 
 	entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
 	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+
+	spin_unlock_irqrestore(&svm->ir_list_lock, flags);
+
 }
 
 void avic_refresh_virtual_apic_mode(struct kvm_vcpu *vcpu)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 171/219] KVM: SVM: Dont inject #UD if KVM attempts to skip SEV guest insn
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (169 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 170/219] KVM: SVM: Take and hold ir_list_lock when updating vCPUs Physical ID entry Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:14 ` [PATCH 6.1 172/219] KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration Greg Kroah-Hartman
                   ` (58 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Wu Zongyo, Tom Lendacky,
	Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit cb49631ad111570f1bad37702c11c2ae07fa2e3c upstream.

Don't inject a #UD if KVM attempts to "emulate" to skip an instruction
for an SEV guest, and instead resume the guest and hope that it can make
forward progress.  When commit 04c40f344def ("KVM: SVM: Inject #UD on
attempted emulation for SEV guest w/o insn buffer") added the completely
arbitrary #UD behavior, there were no known scenarios where a well-behaved
guest would induce a VM-Exit that triggered emulation, i.e. it was thought
that injecting #UD would be helpful.

However, now that KVM (correctly) attempts to re-inject INT3/INTO, e.g. if
a #NPF is encountered when attempting to deliver the INT3/INTO, an SEV
guest can trigger emulation without a buffer, through no fault of its own.
Resuming the guest and retrying the INT3/INTO is architecturally wrong,
e.g. the vCPU will incorrectly re-hit code #DBs, but for SEV guests there
is literally no other option that has a chance of making forward progress.

Drop the #UD injection for all "skip" emulation, not just those related to
INT3/INTO, even though that means that the guest will likely end up in an
infinite loop instead of getting a #UD (the vCPU may also crash, e.g. if
KVM emulated everything about an instruction except for advancing RIP).
There's no evidence that suggests that an unexpected #UD is actually
better than hanging the vCPU, e.g. a soft-hung vCPU can still respond to
IRQs and NMIs to generate a backtrace.

Reported-by: Wu Zongyo <wuzongyo@mail.ustc.edu.cn>
Closes: https://lore.kernel.org/all/8eb933fd-2cf3-d7a9-32fe-2a1d82eac42a@mail.ustc.edu.cn
Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
Cc: stable@vger.kernel.org
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230825013621.2845700-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/svm.c |   35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -366,6 +366,8 @@ static void svm_set_interrupt_shadow(str
 		svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
 
 }
+static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+					void *insn, int insn_len);
 
 static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
 					   bool commit_side_effects)
@@ -386,6 +388,14 @@ static int __svm_skip_emulated_instructi
 	}
 
 	if (!svm->next_rip) {
+		/*
+		 * FIXME: Drop this when kvm_emulate_instruction() does the
+		 * right thing and treats "can't emulate" as outright failure
+		 * for EMULTYPE_SKIP.
+		 */
+		if (!svm_can_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0))
+			return 0;
+
 		if (unlikely(!commit_side_effects))
 			old_rflags = svm->vmcb->save.rflags;
 
@@ -4592,16 +4602,25 @@ static bool svm_can_emulate_instruction(
 	 * and cannot be decrypted by KVM, i.e. KVM would read cyphertext and
 	 * decode garbage.
 	 *
-	 * Inject #UD if KVM reached this point without an instruction buffer.
-	 * In practice, this path should never be hit by a well-behaved guest,
-	 * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path
-	 * is still theoretically reachable, e.g. via unaccelerated fault-like
-	 * AVIC access, and needs to be handled by KVM to avoid putting the
-	 * guest into an infinite loop.   Injecting #UD is somewhat arbitrary,
-	 * but its the least awful option given lack of insight into the guest.
+	 * If KVM is NOT trying to simply skip an instruction, inject #UD if
+	 * KVM reached this point without an instruction buffer.  In practice,
+	 * this path should never be hit by a well-behaved guest, e.g. KVM
+	 * doesn't intercept #UD or #GP for SEV guests, but this path is still
+	 * theoretically reachable, e.g. via unaccelerated fault-like AVIC
+	 * access, and needs to be handled by KVM to avoid putting the guest
+	 * into an infinite loop.   Injecting #UD is somewhat arbitrary, but
+	 * its the least awful option given lack of insight into the guest.
+	 *
+	 * If KVM is trying to skip an instruction, simply resume the guest.
+	 * If a #NPF occurs while the guest is vectoring an INT3/INTO, then KVM
+	 * will attempt to re-inject the INT3/INTO and skip the instruction.
+	 * In that scenario, retrying the INT3/INTO and hoping the guest will
+	 * make forward progress is the only option that has a chance of
+	 * success (and in practice it will work the vast majority of the time).
 	 */
 	if (unlikely(!insn)) {
-		kvm_queue_exception(vcpu, UD_VECTOR);
+		if (!(emul_type & EMULTYPE_SKIP))
+			kvm_queue_exception(vcpu, UD_VECTOR);
 		return false;
 	}
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 172/219] KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (170 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 171/219] KVM: SVM: Dont inject #UD if KVM attempts to skip SEV guest insn Greg Kroah-Hartman
@ 2023-09-17 19:14 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 173/219] KVM: nSVM: Check instead of asserting on nested TSC scaling support Greg Kroah-Hartman
                   ` (57 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:14 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Peter Gonda, Pankaj Gupta,
	Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit f1187ef24eb8f36e8ad8106d22615ceddeea6097 upstream.

Fix a goof where KVM tries to grab source vCPUs from the destination VM
when doing intrahost migration.  Grabbing the wrong vCPU not only hoses
the guest, it also crashes the host due to the VMSA pointer being left
NULL.

  BUG: unable to handle page fault for address: ffffe38687000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 39 PID: 17143 Comm: sev_migrate_tes Tainted: GO       6.5.0-smp--fff2e47e6c3b-next #151
  Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.28.0 07/10/2023
  RIP: 0010:__free_pages+0x15/0xd0
  RSP: 0018:ffff923fcf6e3c78 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffffe38687000000 RCX: 0000000000000100
  RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffe38687000000
  RBP: ffff923fcf6e3c88 R08: ffff923fcafb0000 R09: 0000000000000000
  R10: 0000000000000000 R11: ffffffff83619b90 R12: ffff923fa9540000
  R13: 0000000000080007 R14: ffff923f6d35d000 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff929d0d7c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffe38687000000 CR3: 0000005224c34005 CR4: 0000000000770ee0
  PKRU: 55555554
  Call Trace:
   <TASK>
   sev_free_vcpu+0xcb/0x110 [kvm_amd]
   svm_vcpu_free+0x75/0xf0 [kvm_amd]
   kvm_arch_vcpu_destroy+0x36/0x140 [kvm]
   kvm_destroy_vcpus+0x67/0x100 [kvm]
   kvm_arch_destroy_vm+0x161/0x1d0 [kvm]
   kvm_put_kvm+0x276/0x560 [kvm]
   kvm_vm_release+0x25/0x30 [kvm]
   __fput+0x106/0x280
   ____fput+0x12/0x20
   task_work_run+0x86/0xb0
   do_exit+0x2e3/0x9c0
   do_group_exit+0xb1/0xc0
   __x64_sys_exit_group+0x1b/0x20
   do_syscall_64+0x41/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
   </TASK>
  CR2: ffffe38687000000

Fixes: 6defa24d3b12 ("KVM: SEV: Init target VMCBs in sev_migrate_from")
Cc: stable@vger.kernel.org
Cc: Peter Gonda <pgonda@google.com>
Reviewed-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Link: https://lore.kernel.org/r/20230825022357.2852133-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/sev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -1723,7 +1723,7 @@ static void sev_migrate_from(struct kvm
 		 * Note, the source is not required to have the same number of
 		 * vCPUs as the destination when migrating a vanilla SEV VM.
 		 */
-		src_vcpu = kvm_get_vcpu(dst_kvm, i);
+		src_vcpu = kvm_get_vcpu(src_kvm, i);
 		src_svm = to_svm(src_vcpu);
 
 		/*



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 173/219] KVM: nSVM: Check instead of asserting on nested TSC scaling support
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (171 preceding siblings ...)
  2023-09-17 19:14 ` [PATCH 6.1 172/219] KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 174/219] KVM: nSVM: Load L1s TSC multiplier based on L1 state, not L2 state Greg Kroah-Hartman
                   ` (56 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Maxim Levitsky, Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit 7cafe9b8e22bb3d77f130c461aedf6868c4aaf58 upstream.

Check for nested TSC scaling support on nested SVM VMRUN instead of
asserting that TSC scaling is exposed to L1 if L1's MSR_AMD64_TSC_RATIO
has diverged from KVM's default.  Userspace can trigger the WARN at will
by writing the MSR and then updating guest CPUID to hide the feature
(modifying guest CPUID is allowed anytime before KVM_RUN).  E.g. hacking
KVM's state_test selftest to do

		vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
		vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);

after restoring state in a new VM+vCPU yields an endless supply of:

  ------------[ cut here ]------------
  WARNING: CPU: 164 PID: 62565 at arch/x86/kvm/svm/nested.c:699
           nested_vmcb02_prepare_control+0x3d6/0x3f0 [kvm_amd]
  Call Trace:
   <TASK>
   enter_svm_guest_mode+0x114/0x560 [kvm_amd]
   nested_svm_vmrun+0x260/0x330 [kvm_amd]
   vmrun_interception+0x29/0x30 [kvm_amd]
   svm_invoke_exit_handler+0x35/0x100 [kvm_amd]
   svm_handle_exit+0xe7/0x180 [kvm_amd]
   kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
   kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
   __se_sys_ioctl+0x7a/0xc0
   __x64_sys_ioctl+0x21/0x30
   do_syscall_64+0x41/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd
  RIP: 0033:0x45ca1b

Note, the nested #VMEXIT path has the same flaw, but needs a different
fix and will be handled separately.

Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230729011608.1065019-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/nested.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -660,10 +660,9 @@ static void nested_vmcb02_prepare_contro
 
 	vmcb02->control.tsc_offset = vcpu->arch.tsc_offset;
 
-	if (svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio) {
-		WARN_ON(!svm->tsc_scaling_enabled);
+	if (svm->tsc_scaling_enabled &&
+	    svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio)
 		nested_svm_update_tsc_ratio_msr(vcpu);
-	}
 
 	vmcb02->control.int_ctl             =
 		(svm->nested.ctl.int_ctl & int_ctl_vmcb12_bits) |



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 174/219] KVM: nSVM: Load L1s TSC multiplier based on L1 state, not L2 state
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (172 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 173/219] KVM: nSVM: Check instead of asserting on nested TSC scaling support Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 175/219] KVM: SVM: Set target pCPU during IRTE update if target vCPU is running Greg Kroah-Hartman
                   ` (55 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Maxim Levitsky, Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit 0c94e2468491cbf0754f49a5136ab51294a96b69 upstream.

When emulating nested VM-Exit, load L1's TSC multiplier if L1's desired
ratio doesn't match the current ratio, not if the ratio L1 is using for
L2 diverges from the default.  Functionally, the end result is the same
as KVM will run L2 with L1's multiplier if L2's multiplier is the default,
i.e. checking that L1's multiplier is loaded is equivalent to checking if
L2 has a non-default multiplier.

However, the assertion that TSC scaling is exposed to L1 is flawed, as
userspace can trigger the WARN at will by writing the MSR and then
updating guest CPUID to hide the feature (modifying guest CPUID is
allowed anytime before KVM_RUN).  E.g. hacking KVM's state_test
selftest to do

                vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
                vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);

after restoring state in a new VM+vCPU yields an endless supply of:

  ------------[ cut here ]------------
  WARNING: CPU: 10 PID: 206939 at arch/x86/kvm/svm/nested.c:1105
           nested_svm_vmexit+0x6af/0x720 [kvm_amd]
  Call Trace:
   nested_svm_exit_handled+0x102/0x1f0 [kvm_amd]
   svm_handle_exit+0xb9/0x180 [kvm_amd]
   kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
   kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
   ? trace_hardirqs_off+0x4d/0xa0
   __se_sys_ioctl+0x7a/0xc0
   __x64_sys_ioctl+0x21/0x30
   do_syscall_64+0x41/0x90
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

Unlike the nested VMRUN path, hoisting the svm->tsc_scaling_enabled check
into the if-statement is wrong as KVM needs to ensure L1's multiplier is
loaded in the above scenario.   Alternatively, the WARN_ON() could simply
be deleted, but that would make KVM's behavior even more subtle, e.g. it's
not immediately obvious why it's safe to write MSR_AMD64_TSC_RATIO when
checking only tsc_ratio_msr.

Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230729011608.1065019-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/nested.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1021,8 +1021,8 @@ int nested_svm_vmexit(struct vcpu_svm *s
 		vmcb_mark_dirty(vmcb01, VMCB_INTERCEPTS);
 	}
 
-	if (svm->tsc_ratio_msr != kvm_caps.default_tsc_scaling_ratio) {
-		WARN_ON(!svm->tsc_scaling_enabled);
+	if (kvm_caps.has_tsc_control &&
+	    vcpu->arch.tsc_scaling_ratio != vcpu->arch.l1_tsc_scaling_ratio) {
 		vcpu->arch.tsc_scaling_ratio = vcpu->arch.l1_tsc_scaling_ratio;
 		__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
 	}



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 175/219] KVM: SVM: Set target pCPU during IRTE update if target vCPU is running
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (173 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 174/219] KVM: nSVM: Load L1s TSC multiplier based on L1 state, not L2 state Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 176/219] KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL Greg Kroah-Hartman
                   ` (54 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, dengqiao.joey, Alejandro Jimenez,
	Joao Martins, Maxim Levitsky, Suravee Suthikulpanit,
	Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit f3cebc75e7425d6949d726bb8e937095b0aef025 upstream.

Update the target pCPU for IOMMU doorbells when updating IRTE routing if
KVM is actively running the associated vCPU.  KVM currently only updates
the pCPU when loading the vCPU (via avic_vcpu_load()), and so doorbell
events will be delayed until the vCPU goes through a put+load cycle (which
might very well "never" happen for the lifetime of the VM).

To avoid inserting a stale pCPU, e.g. due to racing between updating IRTE
routing and vCPU load/put, get the pCPU information from the vCPU's
Physical APIC ID table entry (a.k.a. avic_physical_id_cache in KVM) and
update the IRTE while holding ir_list_lock.  Add comments with --verbose
enabled to explain exactly what is and isn't protected by ir_list_lock.

Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt")
Reported-by: dengqiao.joey <dengqiao.joey@bytedance.com>
Cc: stable@vger.kernel.org
Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Link: https://lore.kernel.org/r/20230808233132.2499764-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/avic.c |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -810,6 +810,7 @@ static int svm_ir_list_add(struct vcpu_s
 	int ret = 0;
 	unsigned long flags;
 	struct amd_svm_iommu_ir *ir;
+	u64 entry;
 
 	/**
 	 * In some cases, the existing irte is updated and re-set,
@@ -843,6 +844,18 @@ static int svm_ir_list_add(struct vcpu_s
 	ir->data = pi->ir_data;
 
 	spin_lock_irqsave(&svm->ir_list_lock, flags);
+
+	/*
+	 * Update the target pCPU for IOMMU doorbells if the vCPU is running.
+	 * If the vCPU is NOT running, i.e. is blocking or scheduled out, KVM
+	 * will update the pCPU info when the vCPU awkened and/or scheduled in.
+	 * See also avic_vcpu_load().
+	 */
+	entry = READ_ONCE(*(svm->avic_physical_id_cache));
+	if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
+		amd_iommu_update_ga(entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK,
+				    true, pi->ir_data);
+
 	list_add(&ir->node, &svm->ir_list);
 	spin_unlock_irqrestore(&svm->ir_list_lock, flags);
 out:
@@ -1067,6 +1080,13 @@ void avic_vcpu_load(struct kvm_vcpu *vcp
 	if (kvm_vcpu_is_blocking(vcpu))
 		return;
 
+	/*
+	 * Grab the per-vCPU interrupt remapping lock even if the VM doesn't
+	 * _currently_ have assigned devices, as that can change.  Holding
+	 * ir_list_lock ensures that either svm_ir_list_add() will consume
+	 * up-to-date entry information, or that this task will wait until
+	 * svm_ir_list_add() completes to set the new target pCPU.
+	 */
 	spin_lock_irqsave(&svm->ir_list_lock, flags);
 
 	entry = READ_ONCE(*(svm->avic_physical_id_cache));
@@ -1102,6 +1122,14 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu
 	if (!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK))
 		return;
 
+	/*
+	 * Take and hold the per-vCPU interrupt remapping lock while updating
+	 * the Physical ID entry even though the lock doesn't protect against
+	 * multiple writers (see above).  Holding ir_list_lock ensures that
+	 * either svm_ir_list_add() will consume up-to-date entry information,
+	 * or that this task will wait until svm_ir_list_add() completes to
+	 * mark the vCPU as not running.
+	 */
 	spin_lock_irqsave(&svm->ir_list_lock, flags);
 
 	avic_update_iommu_vcpu_affinity(vcpu, -1, 0);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 176/219] KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (174 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 175/219] KVM: SVM: Set target pCPU during IRTE update if target vCPU is running Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 177/219] MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install regression Greg Kroah-Hartman
                   ` (53 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Peter Gonda, Pankaj Gupta,
	Sean Christopherson

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sean Christopherson <seanjc@google.com>

commit 1952e74da96fb3e48b72a2d0ece78c688a5848c1 upstream.

Skip initializing the VMSA physical address in the VMCB if the VMSA is
NULL, which occurs during intrahost migration as KVM initializes the VMCB
before copying over state from the source to the destination (including
the VMSA and its physical address).

In normal builds, __pa() is just math, so the bug isn't fatal, but with
CONFIG_DEBUG_VIRTUAL=y, the validity of the virtual address is verified
and passing in NULL will make the kernel unhappy.

Fixes: 6defa24d3b12 ("KVM: SEV: Init target VMCBs in sev_migrate_from")
Cc: stable@vger.kernel.org
Cc: Peter Gonda <pgonda@google.com>
Reviewed-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Link: https://lore.kernel.org/r/20230825022357.2852133-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/svm/sev.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2951,9 +2951,12 @@ static void sev_es_init_vmcb(struct vcpu
 	/*
 	 * An SEV-ES guest requires a VMSA area that is a separate from the
 	 * VMCB page. Do not include the encryption mask on the VMSA physical
-	 * address since hardware will access it using the guest key.
+	 * address since hardware will access it using the guest key.  Note,
+	 * the VMSA will be NULL if this vCPU is the destination for intrahost
+	 * migration, and will be copied later.
 	 */
-	svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
+	if (svm->sev_es.vmsa)
+		svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
 
 	/* Can't intercept CR register access, HV can't modify CR registers */
 	svm_clr_intercept(svm, INTERCEPT_CR0_READ);



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 177/219] MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install regression
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (175 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 176/219] KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 178/219] perf hists browser: Fix hierarchy mode header Greg Kroah-Hartman
                   ` (52 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jan-Benedict Glaw, Maciej W. Rozycki,
	Philippe Mathieu-Daudé, Thomas Bogendoerfer

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Maciej W. Rozycki <macro@orcam.me.uk>

commit a79a404e6c2241ebc528b9ebf4c0832457b498c3 upstream.

Remove a build-time check for the presence of the GCC `-msym32' option.
This option has been there since GCC 4.1.0, which is below the minimum
required as at commit 805b2e1d427a ("kbuild: include Makefile.compiler
only when compiler is needed"), when an error message:

arch/mips/Makefile:306: *** CONFIG_CPU_DADDI_WORKAROUNDS unsupported without -msym32.  Stop.

started to trigger for the `modules_install' target with configurations
such as `decstation_64_defconfig' that set CONFIG_CPU_DADDI_WORKAROUNDS,
because said commit has made `cc-option-yn' an undefined function for
non-build targets.

Reported-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed")
Cc: stable@vger.kernel.org # v5.13+
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/Makefile |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -308,8 +308,8 @@ ifdef CONFIG_64BIT
     endif
   endif
 
-  ifeq ($(KBUILD_SYM32)$(call cc-option-yn,-msym32), yy)
-    cflags-y += -msym32 -DKBUILD_64BIT_SYM32
+  ifeq ($(KBUILD_SYM32), y)
+    cflags-$(KBUILD_SYM32) += -msym32 -DKBUILD_64BIT_SYM32
   else
     ifeq ($(CONFIG_CPU_DADDI_WORKAROUNDS), y)
       $(error CONFIG_CPU_DADDI_WORKAROUNDS unsupported without -msym32)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 178/219] perf hists browser: Fix hierarchy mode header
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (176 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 177/219] MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install regression Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 179/219] perf test shell stat_bpf_counters: Fix test on Intel Greg Kroah-Hartman
                   ` (51 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Namhyung Kim,
	Arnaldo Carvalho de Melo, Adrian Hunter, Ian Rogers, Ingo Molnar,
	Jiri Olsa, Peter Zijlstra

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <namhyung@kernel.org>

commit e2cabf2a44791f01c21f8d5189b946926e34142e upstream.

The commit ef9ff6017e3c4593 ("perf ui browser: Move the extra title
lines from the hists browser") introduced ui_browser__gotorc_title() to
help moving non-title lines easily.  But it missed to update the title
for the hierarchy mode so it won't print the header line on TUI at all.

  $ perf report --hierarchy

Fixes: ef9ff6017e3c4593 ("perf ui browser: Move the extra title lines from the hists browser")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230731094934.1616495-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/ui/browsers/hists.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1779,7 +1779,7 @@ static void hists_browser__hierarchy_hea
 	hists_browser__scnprintf_hierarchy_headers(browser, headers,
 						   sizeof(headers));
 
-	ui_browser__gotorc(&browser->b, 0, 0);
+	ui_browser__gotorc_title(&browser->b, 0, 0);
 	ui_browser__set_color(&browser->b, HE_COLORSET_ROOT);
 	ui_browser__write_nstring(&browser->b, headers, browser->b.width + 1);
 }



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 179/219] perf test shell stat_bpf_counters: Fix test on Intel
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (177 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 178/219] perf hists browser: Fix hierarchy mode header Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 180/219] perf tools: Handle old data in PERF_RECORD_ATTR Greg Kroah-Hartman
                   ` (50 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Song Liu, Namhyung Kim,
	Adrian Hunter, Ian Rogers, Ingo Molnar, Jiri Olsa, Peter Zijlstra,
	bpf, Arnaldo Carvalho de Melo

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <namhyung@kernel.org>

commit 68ca249c964f520af7f8763e22f12bd26b57b870 upstream.

As of now, bpf counters (bperf) don't support event groups.  But the
default perf stat includes topdown metrics if supported (on recent Intel
machines) which require groups.  That makes perf stat exiting.

  $ sudo perf stat --bpf-counter true
  bpf managed perf events do not yet support groups.

Actually the test explicitly uses cycles event only, but it missed to
pass the option when it checks the availability of the command.

Fixes: 2c0cb9f56020d2ea ("perf test: Add a shell test for 'perf stat --bpf-counters' new option")
Reviewed-by: Song Liu <song@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/tests/shell/stat_bpf_counters.sh |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/tools/perf/tests/shell/stat_bpf_counters.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters.sh
@@ -22,10 +22,10 @@ compare_number()
 }
 
 # skip if --bpf-counters is not supported
-if ! perf stat --bpf-counters true > /dev/null 2>&1; then
+if ! perf stat -e cycles --bpf-counters true > /dev/null 2>&1; then
 	if [ "$1" = "-v" ]; then
 		echo "Skipping: --bpf-counters not supported"
-		perf --no-pager stat --bpf-counters true || true
+		perf --no-pager stat -e cycles --bpf-counters true || true
 	fi
 	exit 2
 fi



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 180/219] perf tools: Handle old data in PERF_RECORD_ATTR
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (178 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 179/219] perf test shell stat_bpf_counters: Fix test on Intel Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 181/219] perf hists browser: Fix the number of entries for e key Greg Kroah-Hartman
                   ` (49 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Namhyung Kim, Adrian Hunter,
	Ian Rogers, Ingo Molnar, Jiri Olsa, Peter Zijlstra, Tom Zanussi,
	Arnaldo Carvalho de Melo

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <namhyung@kernel.org>

commit 9bf63282ea77a531ea58acb42fb3f40d2d1e4497 upstream.

The PERF_RECORD_ATTR is used for a pipe mode to describe an event with
attribute and IDs.  The ID table comes after the attr and it calculate
size of the table using the total record size and the attr size.

  n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64)

This is fine for most use cases, but sometimes it saves the pipe output
in a file and then process it later.  And it becomes a problem if there
is a change in attr size between the record and report.

  $ perf record -o- > perf-pipe.data  # old version
  $ perf report -i- < perf-pipe.data  # new version

For example, if the attr size is 128 and it has 4 IDs, then it would
save them in 168 byte like below:

   8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
 128 byte: perf event attr { .size = 128, ... },
  32 byte: event IDs [] = { 1234, 1235, 1236, 1237 },

But when report later, it thinks the attr size is 136 then it only read
the last 3 entries as ID.

   8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
 136 byte: perf event attr { .size = 136, ... },
  24 byte: event IDs [] = { 1235, 1236, 1237 },  // 1234 is missing

So it should use the recorded version of the attr.  The attr has the
size field already then it should honor the size when reading data.

Fixes: 2c46dbb517a10b18 ("perf: Convert perf header attrs into attr events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/util/header.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -4331,7 +4331,8 @@ int perf_event__process_attr(struct perf
 			     union perf_event *event,
 			     struct evlist **pevlist)
 {
-	u32 i, ids, n_ids;
+	u32 i, n_ids;
+	u64 *ids;
 	struct evsel *evsel;
 	struct evlist *evlist = *pevlist;
 
@@ -4347,9 +4348,8 @@ int perf_event__process_attr(struct perf
 
 	evlist__add(evlist, evsel);
 
-	ids = event->header.size;
-	ids -= (void *)&event->attr.id - (void *)event;
-	n_ids = ids / sizeof(u64);
+	n_ids = event->header.size - sizeof(event->header) - event->attr.attr.size;
+	n_ids = n_ids / sizeof(u64);
 	/*
 	 * We don't have the cpu and thread maps on the header, so
 	 * for allocating the perf_sample_id table we fake 1 cpu and
@@ -4358,8 +4358,9 @@ int perf_event__process_attr(struct perf
 	if (perf_evsel__alloc_id(&evsel->core, 1, n_ids))
 		return -ENOMEM;
 
+	ids = (void *)&event->attr.attr + event->attr.attr.size;
 	for (i = 0; i < n_ids; i++) {
-		perf_evlist__id_add(&evlist->core, &evsel->core, 0, i, event->attr.id[i]);
+		perf_evlist__id_add(&evlist->core, &evsel->core, 0, i, ids[i]);
 	}
 
 	return 0;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 181/219] perf hists browser: Fix the number of entries for e key
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (179 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 180/219] perf tools: Handle old data in PERF_RECORD_ATTR Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 182/219] drm/amd/display: always switch off ODM before committing more streams Greg Kroah-Hartman
                   ` (48 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Namhyung Kim,
	Arnaldo Carvalho de Melo, Adrian Hunter, Ian Rogers, Ingo Molnar,
	Jiri Olsa, Peter Zijlstra

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Namhyung Kim <namhyung@kernel.org>

commit f6b8436bede3e80226e8b2100279c4450c73806a upstream.

The 'e' key is to toggle expand/collapse the selected entry only.  But
the current code has a bug that it only increases the number of entries
by 1 in the hierarchy mode so users cannot move under the current entry
after the key stroke.  This is due to a wrong assumption in the
hist_entry__set_folding().

The commit b33f922651011eff ("perf hists browser: Put hist_entry folding
logic into single function") factored out the code, but actually it
should be handled separately.  The hist_browser__set_folding() is to
update fold state for each entry so it needs to traverse all (child)
entries regardless of the current fold state.  So it increases the
number of entries by 1.

But the hist_entry__set_folding() only cares the currently selected
entry and its all children.  So it should count all unfolded child
entries.  This code is implemented in hist_browser__toggle_fold()
already so we can just call it.

Fixes: b33f922651011eff ("perf hists browser: Put hist_entry folding logic into single function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230731094934.1616495-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 tools/perf/ui/browsers/hists.c |   58 ++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 34 deletions(-)

--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -407,11 +407,6 @@ static bool hist_browser__selection_has_
 	return container_of(ms, struct callchain_list, ms)->has_children;
 }
 
-static bool hist_browser__he_selection_unfolded(struct hist_browser *browser)
-{
-	return browser->he_selection ? browser->he_selection->unfolded : false;
-}
-
 static bool hist_browser__selection_unfolded(struct hist_browser *browser)
 {
 	struct hist_entry *he = browser->he_selection;
@@ -584,8 +579,8 @@ static int hierarchy_set_folding(struct
 	return n;
 }
 
-static void __hist_entry__set_folding(struct hist_entry *he,
-				      struct hist_browser *hb, bool unfold)
+static void hist_entry__set_folding(struct hist_entry *he,
+				    struct hist_browser *hb, bool unfold)
 {
 	hist_entry__init_have_children(he);
 	he->unfolded = unfold ? he->has_children : false;
@@ -603,34 +598,12 @@ static void __hist_entry__set_folding(st
 		he->nr_rows = 0;
 }
 
-static void hist_entry__set_folding(struct hist_entry *he,
-				    struct hist_browser *browser, bool unfold)
-{
-	double percent;
-
-	percent = hist_entry__get_percent_limit(he);
-	if (he->filtered || percent < browser->min_pcnt)
-		return;
-
-	__hist_entry__set_folding(he, browser, unfold);
-
-	if (!he->depth || unfold)
-		browser->nr_hierarchy_entries++;
-	if (he->leaf)
-		browser->nr_callchain_rows += he->nr_rows;
-	else if (unfold && !hist_entry__has_hierarchy_children(he, browser->min_pcnt)) {
-		browser->nr_hierarchy_entries++;
-		he->has_no_entry = true;
-		he->nr_rows = 1;
-	} else
-		he->has_no_entry = false;
-}
-
 static void
 __hist_browser__set_folding(struct hist_browser *browser, bool unfold)
 {
 	struct rb_node *nd;
 	struct hist_entry *he;
+	double percent;
 
 	nd = rb_first_cached(&browser->hists->entries);
 	while (nd) {
@@ -640,6 +613,21 @@ __hist_browser__set_folding(struct hist_
 		nd = __rb_hierarchy_next(nd, HMD_FORCE_CHILD);
 
 		hist_entry__set_folding(he, browser, unfold);
+
+		percent = hist_entry__get_percent_limit(he);
+		if (he->filtered || percent < browser->min_pcnt)
+			continue;
+
+		if (!he->depth || unfold)
+			browser->nr_hierarchy_entries++;
+		if (he->leaf)
+			browser->nr_callchain_rows += he->nr_rows;
+		else if (unfold && !hist_entry__has_hierarchy_children(he, browser->min_pcnt)) {
+			browser->nr_hierarchy_entries++;
+			he->has_no_entry = true;
+			he->nr_rows = 1;
+		} else
+			he->has_no_entry = false;
 	}
 }
 
@@ -659,8 +647,10 @@ static void hist_browser__set_folding_se
 	if (!browser->he_selection)
 		return;
 
-	hist_entry__set_folding(browser->he_selection, browser, unfold);
-	browser->b.nr_entries = hist_browser__nr_entries(browser);
+	if (unfold == browser->he_selection->unfolded)
+		return;
+
+	hist_browser__toggle_fold(browser);
 }
 
 static void ui_browser__warn_lost_events(struct ui_browser *browser)
@@ -732,8 +722,8 @@ static int hist_browser__handle_hotkey(s
 		hist_browser__set_folding(browser, true);
 		break;
 	case 'e':
-		/* Expand the selected entry. */
-		hist_browser__set_folding_selected(browser, !hist_browser__he_selection_unfolded(browser));
+		/* Toggle expand/collapse the selected entry. */
+		hist_browser__toggle_fold(browser);
 		break;
 	case 'H':
 		browser->show_headers = !browser->show_headers;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 182/219] drm/amd/display: always switch off ODM before committing more streams
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (180 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 181/219] perf hists browser: Fix the number of entries for e key Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 183/219] drm/amd/display: Remove wait while locked Greg Kroah-Hartman
                   ` (47 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Dillon Varone, Hamza Mahfooz,
	Wenjing Liu, Alex Deucher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wenjing Liu <wenjing.liu@amd.com>

commit 49a30c3d1a2258fc93cfe6eea8e4951dabadc824 upstream.

ODM power optimization is only supported with single stream. When ODM
power optimization is enabled, we might not have enough free pipes for
enabling other stream. So when we are committing more than 1 stream we
should first switch off ODM power optimization to make room for new
stream and then allocating pipe resource for the new stream.

Cc: stable@vger.kernel.org
Fixes: 59de751e3845 ("drm/amd/display: add ODM case when looking for first split pipe")
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/display/dc/core/dc.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1977,12 +1977,12 @@ enum dc_status dc_commit_streams(struct
 		}
 	}
 
-	/* Check for case where we are going from odm 2:1 to max
-	 *  pipe scenario.  For these cases, we will call
-	 *  commit_minimal_transition_state() to exit out of odm 2:1
-	 *  first before processing new streams
+	/* ODM Combine 2:1 power optimization is only applied for single stream
+	 * scenario, it uses extra pipes than needed to reduce power consumption
+	 * We need to switch off this feature to make room for new streams.
 	 */
-	if (stream_count == dc->res_pool->pipe_count) {
+	if (stream_count > dc->current_state->stream_count &&
+			dc->current_state->stream_count == 1) {
 		for (i = 0; i < dc->res_pool->pipe_count; i++) {
 			pipe = &dc->current_state->res_ctx.pipe_ctx[i];
 			if (pipe->next_odm_pipe)



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 183/219] drm/amd/display: Remove wait while locked
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (181 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 182/219] drm/amd/display: always switch off ODM before committing more streams Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 184/219] drm/amdgpu: register a dirty framebuffer callback for fbcon Greg Kroah-Hartman
                   ` (46 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jun Lei, Hamza Mahfooz, Gabe Teeger,
	Alex Deucher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gabe Teeger <gabe.teeger@amd.com>

commit 5a3ccb1400339268c5e3dc1fa044a7f6c7f59a02 upstream.

[Why]
We wait for mpc idle while in a locked state, leading to potential
deadlock.

[What]
Move the wait_for_idle call to outside of HW lock. This and a
call to wait_drr_doublebuffer_pending_clear are moved added to a new
static helper function called wait_for_outstanding_hw_updates, to make
the interface clearer.

Cc: stable@vger.kernel.org
Fixes: 8f0d304d21b3 ("drm/amd/display: Do not commit pipe when updating DRR")
Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/display/dc/Makefile            |    1 
 drivers/gpu/drm/amd/display/dc/core/dc.c           |   58 ++++++++++++++-------
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c |   11 ---
 3 files changed, 42 insertions(+), 28 deletions(-)

--- a/drivers/gpu/drm/amd/display/dc/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/Makefile
@@ -82,3 +82,4 @@ DC_EDID += dc_edid_parser.o
 AMD_DISPLAY_DMUB = $(addprefix $(AMDDALPATH)/dc/,$(DC_DMUB))
 AMD_DISPLAY_EDID = $(addprefix $(AMDDALPATH)/dc/,$(DC_EDID))
 AMD_DISPLAY_FILES += $(AMD_DISPLAY_DMUB) $(AMD_DISPLAY_EDID)
+
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3361,6 +3361,45 @@ void dc_dmub_update_dirty_rect(struct dc
 	}
 }
 
+static void wait_for_outstanding_hw_updates(struct dc *dc, const struct dc_state *dc_context)
+{
+/*
+ * This function calls HWSS to wait for any potentially double buffered
+ * operations to complete. It should be invoked as a pre-amble prior
+ * to full update programming before asserting any HW locks.
+ */
+	int pipe_idx;
+	int opp_inst;
+	int opp_count = dc->res_pool->pipe_count;
+	struct hubp *hubp;
+	int mpcc_inst;
+	const struct pipe_ctx *pipe_ctx;
+
+	for (pipe_idx = 0; pipe_idx < dc->res_pool->pipe_count; pipe_idx++) {
+		pipe_ctx = &dc_context->res_ctx.pipe_ctx[pipe_idx];
+
+		if (!pipe_ctx->stream)
+			continue;
+
+		if (pipe_ctx->stream_res.tg->funcs->wait_drr_doublebuffer_pending_clear)
+			pipe_ctx->stream_res.tg->funcs->wait_drr_doublebuffer_pending_clear(pipe_ctx->stream_res.tg);
+
+		hubp = pipe_ctx->plane_res.hubp;
+		if (!hubp)
+			continue;
+
+		mpcc_inst = hubp->inst;
+		// MPCC inst is equal to pipe index in practice
+		for (opp_inst = 0; opp_inst < opp_count; opp_inst++) {
+			if (dc->res_pool->opps[opp_inst]->mpcc_disconnect_pending[mpcc_inst]) {
+				dc->res_pool->mpc->funcs->wait_for_idle(dc->res_pool->mpc, mpcc_inst);
+				dc->res_pool->opps[opp_inst]->mpcc_disconnect_pending[mpcc_inst] = false;
+				break;
+			}
+		}
+	}
+}
+
 static void commit_planes_for_stream(struct dc *dc,
 		struct dc_surface_update *srf_updates,
 		int surface_count,
@@ -3378,24 +3417,9 @@ static void commit_planes_for_stream(str
 	// dc->current_state anymore, so we have to cache it before we apply
 	// the new SubVP context
 	subvp_prev_use = false;
-
-
 	dc_z10_restore(dc);
-
-	if (update_type == UPDATE_TYPE_FULL) {
-		/* wait for all double-buffer activity to clear on all pipes */
-		int pipe_idx;
-
-		for (pipe_idx = 0; pipe_idx < dc->res_pool->pipe_count; pipe_idx++) {
-			struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[pipe_idx];
-
-			if (!pipe_ctx->stream)
-				continue;
-
-			if (pipe_ctx->stream_res.tg->funcs->wait_drr_doublebuffer_pending_clear)
-				pipe_ctx->stream_res.tg->funcs->wait_drr_doublebuffer_pending_clear(pipe_ctx->stream_res.tg);
-		}
-	}
+	if (update_type == UPDATE_TYPE_FULL)
+		wait_for_outstanding_hw_updates(dc, context);
 
 	if (get_seamless_boot_stream_count(context) > 0 && surface_count > 0) {
 		/* Optimize seamless boot flag keeps clocks and watermarks high until
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1515,17 +1515,6 @@ static void dcn20_update_dchubp_dpp(
 			|| plane_state->update_flags.bits.global_alpha_change
 			|| plane_state->update_flags.bits.per_pixel_alpha_change) {
 		// MPCC inst is equal to pipe index in practice
-		int mpcc_inst = hubp->inst;
-		int opp_inst;
-		int opp_count = dc->res_pool->pipe_count;
-
-		for (opp_inst = 0; opp_inst < opp_count; opp_inst++) {
-			if (dc->res_pool->opps[opp_inst]->mpcc_disconnect_pending[mpcc_inst]) {
-				dc->res_pool->mpc->funcs->wait_for_idle(dc->res_pool->mpc, mpcc_inst);
-				dc->res_pool->opps[opp_inst]->mpcc_disconnect_pending[mpcc_inst] = false;
-				break;
-			}
-		}
 		hws->funcs.update_mpcc(dc, pipe_ctx);
 	}
 



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 184/219] drm/amdgpu: register a dirty framebuffer callback for fbcon
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (182 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 183/219] drm/amd/display: Remove wait while locked Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 185/219] kunit: Fix wild-memory-access bug in kunit_free_suite_set() Greg Kroah-Hartman
                   ` (45 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Aurabindo Pillai, Mario Limonciello,
	Hamza Mahfooz, Alex Deucher

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hamza Mahfooz <hamza.mahfooz@amd.com>

commit 0a611560f53bfd489e33f4a718c915f1a6123d03 upstream.

fbcon requires that we implement &drm_framebuffer_funcs.dirty.
Otherwise, the framebuffer might take a while to flush (which would
manifest as noticeable lag). However, we can't enable this callback for
non-fbcon cases since it may cause too many atomic commits to be made at
once. So, implement amdgpu_dirtyfb() and only enable it for fbcon
framebuffers (we can use the "struct drm_file file" parameter in the
callback to check for this since it is only NULL when called by fbcon,
at least in the mainline kernel) on devices that support atomic KMS.

Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org # 6.1+
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c |   26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -38,6 +38,8 @@
 #include <linux/pci.h>
 #include <linux/pm_runtime.h>
 #include <drm/drm_crtc_helper.h>
+#include <drm/drm_damage_helper.h>
+#include <drm/drm_drv.h>
 #include <drm/drm_edid.h>
 #include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_helper.h>
@@ -493,11 +495,29 @@ bool amdgpu_display_ddc_probe(struct amd
 	return true;
 }
 
+static int amdgpu_dirtyfb(struct drm_framebuffer *fb, struct drm_file *file,
+			  unsigned int flags, unsigned int color,
+			  struct drm_clip_rect *clips, unsigned int num_clips)
+{
+
+	if (file)
+		return -ENOSYS;
+
+	return drm_atomic_helper_dirtyfb(fb, file, flags, color, clips,
+					 num_clips);
+}
+
 static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
 	.destroy = drm_gem_fb_destroy,
 	.create_handle = drm_gem_fb_create_handle,
 };
 
+static const struct drm_framebuffer_funcs amdgpu_fb_funcs_atomic = {
+	.destroy = drm_gem_fb_destroy,
+	.create_handle = drm_gem_fb_create_handle,
+	.dirty = amdgpu_dirtyfb
+};
+
 uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
 					  uint64_t bo_flags)
 {
@@ -1100,7 +1120,11 @@ static int amdgpu_display_gem_fb_verify_
 	if (ret)
 		goto err;
 
-	ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
+	if (drm_drv_uses_atomic_modeset(dev))
+		ret = drm_framebuffer_init(dev, &rfb->base,
+					   &amdgpu_fb_funcs_atomic);
+	else
+		ret = drm_framebuffer_init(dev, &rfb->base, &amdgpu_fb_funcs);
 
 	if (ret)
 		goto err;



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 185/219] kunit: Fix wild-memory-access bug in kunit_free_suite_set()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (183 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 184/219] drm/amdgpu: register a dirty framebuffer callback for fbcon Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 186/219] net: ipv4: fix one memleak in __inet_del_ifa() Greg Kroah-Hartman
                   ` (44 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Jinjie Ruan, Rae Moar, David Gow,
	Shuah Khan, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jinjie Ruan <ruanjinjie@huawei.com>

[ Upstream commit 2810c1e99867a811e631dd24e63e6c1e3b78a59d ]

Inject fault while probing kunit-example-test.ko, if kstrdup()
fails in mod_sysfs_setup() in load_module(), the mod->state will
switch from MODULE_STATE_COMING to MODULE_STATE_GOING instead of
from MODULE_STATE_LIVE to MODULE_STATE_GOING, so only
kunit_module_exit() will be called without kunit_module_init(), and
the mod->kunit_suites is no set correctly and the free in
kunit_free_suite_set() will cause below wild-memory-access bug.

The mod->state state machine when load_module() succeeds:

MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_LIVE
	 ^						|
	 |						| delete_module
	 +---------------- MODULE_STATE_GOING <---------+

The mod->state state machine when load_module() fails at
mod_sysfs_setup():

MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_GOING
	^						|
	|						|
	+-----------------------------------------------+

Call kunit_module_init() at MODULE_STATE_COMING state to fix the issue
because MODULE_STATE_LIVE is transformed from it.

 Unable to handle kernel paging request at virtual address ffffff341e942a88
 KASAN: maybe wild-memory-access in range [0x0003f9a0f4a15440-0x0003f9a0f4a15447]
 Mem abort info:
   ESR = 0x0000000096000004
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x04: level 0 translation fault
 Data abort info:
   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
 swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000441ea000
 [ffffff341e942a88] pgd=0000000000000000, p4d=0000000000000000
 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
 Modules linked in: kunit_example_test(-) cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: kunit_example_test]
 CPU: 3 PID: 2035 Comm: modprobe Tainted: G        W        N 6.5.0-next-20230828+ #136
 Hardware name: linux,dummy-virt (DT)
 pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : kfree+0x2c/0x70
 lr : kunit_free_suite_set+0xcc/0x13c
 sp : ffff8000829b75b0
 x29: ffff8000829b75b0 x28: ffff8000829b7b90 x27: 0000000000000000
 x26: dfff800000000000 x25: ffffcd07c82a7280 x24: ffffcd07a50ab300
 x23: ffffcd07a50ab2e8 x22: 1ffff00010536ec0 x21: dfff800000000000
 x20: ffffcd07a50ab2f0 x19: ffffcd07a50ab2f0 x18: 0000000000000000
 x17: 0000000000000000 x16: 0000000000000000 x15: ffffcd07c24b6764
 x14: ffffcd07c24b63c0 x13: ffffcd07c4cebb94 x12: ffff700010536ec7
 x11: 1ffff00010536ec6 x10: ffff700010536ec6 x9 : dfff800000000000
 x8 : 00008fffefac913a x7 : 0000000041b58ab3 x6 : 0000000000000000
 x5 : 1ffff00010536ec5 x4 : ffff8000829b7628 x3 : dfff800000000000
 x2 : ffffff341e942a80 x1 : ffffcd07a50aa000 x0 : fffffc0000000000
 Call trace:
  kfree+0x2c/0x70
  kunit_free_suite_set+0xcc/0x13c
  kunit_module_notify+0xd8/0x360
  blocking_notifier_call_chain+0xc4/0x128
  load_module+0x382c/0x44a4
  init_module_from_file+0xd4/0x128
  idempotent_init_module+0x2c8/0x524
  __arm64_sys_finit_module+0xac/0x100
  invoke_syscall+0x6c/0x258
  el0_svc_common.constprop.0+0x160/0x22c
  do_el0_svc+0x44/0x5c
  el0_svc+0x38/0x78
  el0t_64_sync_handler+0x13c/0x158
  el0t_64_sync+0x190/0x194
 Code: aa0003e1 b25657e0 d34cfc42 8b021802 (f9400440)
 ---[ end trace 0000000000000000 ]---
 Kernel panic - not syncing: Oops: Fatal exception
 SMP: stopping secondary CPUs
 Kernel Offset: 0x4d0742200000 from 0xffff800080000000
 PHYS_OFFSET: 0xffffee43c0000000
 CPU features: 0x88000203,3c020000,1000421b
 Memory Limit: none
 Rebooting in 1 seconds..

Fixes: 3d6e44623841 ("kunit: unify module and builtin suite definitions")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 lib/kunit/test.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 184df6f701b48..a90bd265d73db 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -667,12 +667,13 @@ static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
 
 	switch (val) {
 	case MODULE_STATE_LIVE:
-		kunit_module_init(mod);
 		break;
 	case MODULE_STATE_GOING:
 		kunit_module_exit(mod);
 		break;
 	case MODULE_STATE_COMING:
+		kunit_module_init(mod);
+		break;
 	case MODULE_STATE_UNFORMED:
 		break;
 	}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 186/219] net: ipv4: fix one memleak in __inet_del_ifa()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (184 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 185/219] kunit: Fix wild-memory-access bug in kunit_free_suite_set() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 187/219] kselftest/runner.sh: Propagate SIGTERM to runner child Greg Kroah-Hartman
                   ` (43 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Liu Jian, Julian Anastasov,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Jian <liujian56@huawei.com>

[ Upstream commit ac28b1ec6135649b5d78b028e47264cb3ebca5ea ]

I got the below warning when do fuzzing test:
unregister_netdevice: waiting for bond0 to become free. Usage count = 2

It can be repoduced via:

ip link add bond0 type bond
sysctl -w net.ipv4.conf.bond0.promote_secondaries=1
ip addr add 4.117.174.103/0 scope 0x40 dev bond0
ip addr add 192.168.100.111/255.255.255.254 scope 0 dev bond0
ip addr add 0.0.0.4/0 scope 0x40 secondary dev bond0
ip addr del 4.117.174.103/0 scope 0x40 dev bond0
ip link delete bond0 type bond

In this reproduction test case, an incorrect 'last_prim' is found in
__inet_del_ifa(), as a result, the secondary address(0.0.0.4/0 scope 0x40)
is lost. The memory of the secondary address is leaked and the reference of
in_device and net_device is leaked.

Fix this problem:
Look for 'last_prim' starting at location of the deleted IP and inserting
the promoted IP into the location of 'last_prim'.

Fixes: 0ff60a45678e ("[IPV4]: Fix secondary IP addresses after promotion")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/devinet.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index e8b9a9202fecd..35d6e74be8406 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -354,14 +354,14 @@ static void __inet_del_ifa(struct in_device *in_dev,
 {
 	struct in_ifaddr *promote = NULL;
 	struct in_ifaddr *ifa, *ifa1;
-	struct in_ifaddr *last_prim;
+	struct in_ifaddr __rcu **last_prim;
 	struct in_ifaddr *prev_prom = NULL;
 	int do_promote = IN_DEV_PROMOTE_SECONDARIES(in_dev);
 
 	ASSERT_RTNL();
 
 	ifa1 = rtnl_dereference(*ifap);
-	last_prim = rtnl_dereference(in_dev->ifa_list);
+	last_prim = ifap;
 	if (in_dev->dead)
 		goto no_promotions;
 
@@ -375,7 +375,7 @@ static void __inet_del_ifa(struct in_device *in_dev,
 		while ((ifa = rtnl_dereference(*ifap1)) != NULL) {
 			if (!(ifa->ifa_flags & IFA_F_SECONDARY) &&
 			    ifa1->ifa_scope <= ifa->ifa_scope)
-				last_prim = ifa;
+				last_prim = &ifa->ifa_next;
 
 			if (!(ifa->ifa_flags & IFA_F_SECONDARY) ||
 			    ifa1->ifa_mask != ifa->ifa_mask ||
@@ -439,9 +439,9 @@ static void __inet_del_ifa(struct in_device *in_dev,
 
 			rcu_assign_pointer(prev_prom->ifa_next, next_sec);
 
-			last_sec = rtnl_dereference(last_prim->ifa_next);
+			last_sec = rtnl_dereference(*last_prim);
 			rcu_assign_pointer(promote->ifa_next, last_sec);
-			rcu_assign_pointer(last_prim->ifa_next, promote);
+			rcu_assign_pointer(*last_prim, promote);
 		}
 
 		promote->ifa_flags &= ~IFA_F_SECONDARY;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 187/219] kselftest/runner.sh: Propagate SIGTERM to runner child
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (185 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 186/219] net: ipv4: fix one memleak in __inet_del_ifa() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 188/219] selftests: Keep symlinks, when possible Greg Kroah-Hartman
                   ` (42 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Björn Töpel, Shuah Khan,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Björn Töpel <bjorn@rivosinc.com>

[ Upstream commit 9616cb34b08ec86642b162eae75c5a7ca8debe3c ]

Timeouts in kselftest are done using the "timeout" command with the
"--foreground" option. Without the "foreground" option, it is not
possible for a user to cancel the runner using SIGINT, because the
signal is not propagated to timeout which is running in a different
process group. The "forground" options places the timeout in the same
process group as its parent, but only sends the SIGTERM (on timeout)
signal to the forked process. Unfortunately, this does not play nice
with all kselftests, e.g. "net:fcnal-test.sh", where the child
processes will linger because timeout does not send SIGTERM to the
group.

Some users have noted these hangs [1].

Fix this by nesting the timeout with an additional timeout without the
foreground option.

Link: https://lore.kernel.org/all/7650b2eb-0aee-a2b0-2e64-c9bc63210f67@alu.unizg.hr/ # [1]
Fixes: 651e0d881461 ("kselftest/runner: allow to properly deliver signals to tests")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/testing/selftests/kselftest/runner.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index 294619ade49fe..1333ab1eda708 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -35,7 +35,8 @@ tap_timeout()
 {
 	# Make sure tests will time out if utility is available.
 	if [ -x /usr/bin/timeout ] ; then
-		/usr/bin/timeout --foreground "$kselftest_timeout" $1
+		/usr/bin/timeout --foreground "$kselftest_timeout" \
+			/usr/bin/timeout "$kselftest_timeout" $1
 	else
 		$1
 	fi
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 188/219] selftests: Keep symlinks, when possible
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (186 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 187/219] kselftest/runner.sh: Propagate SIGTERM to runner child Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 189/219] net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add Greg Kroah-Hartman
                   ` (41 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Björn Töpel,
	Benjamin Poirier, Shuah Khan, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Björn Töpel <bjorn@rivosinc.com>

[ Upstream commit 3f3f384139ed147c71e1d770accf610133d5309b ]

When kselftest is built/installed with the 'gen_tar' target, rsync is
used for the installation step to copy files. Extra care is needed for
tests that have symlinks. Commit ae108c48b5d2 ("selftests: net: Fix
cross-tree inclusion of scripts") added '-L' (transform symlink into
referent file/dir) to rsync, to fix dangling links. However, that
broke some tests where the symlink (being a symlink) is part of the
test (e.g. exec:execveat).

Use rsync's '--copy-unsafe-links' that does right thing.

Fixes: ae108c48b5d2 ("selftests: net: Fix cross-tree inclusion of scripts")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/testing/selftests/lib.mk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 05400462c7799..aa646e0661f36 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -72,7 +72,7 @@ endef
 run_tests: all
 ifdef building_out_of_srctree
 	@if [ "X$(TEST_PROGS)$(TEST_PROGS_EXTENDED)$(TEST_FILES)" != "X" ]; then \
-		rsync -aLq $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(OUTPUT); \
+		rsync -aq --copy-unsafe-links $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(OUTPUT); \
 	fi
 	@if [ "X$(TEST_PROGS)" != "X" ]; then \
 		$(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) \
@@ -86,7 +86,7 @@ endif
 
 define INSTALL_SINGLE_RULE
 	$(if $(INSTALL_LIST),@mkdir -p $(INSTALL_PATH))
-	$(if $(INSTALL_LIST),rsync -aL $(INSTALL_LIST) $(INSTALL_PATH)/)
+	$(if $(INSTALL_LIST),rsync -a --copy-unsafe-links $(INSTALL_LIST) $(INSTALL_PATH)/)
 endef
 
 define INSTALL_RULE
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 189/219] net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (187 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 188/219] selftests: Keep symlinks, when possible Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 190/219] net: stmmac: fix handling of zero coalescing tx-usecs Greg Kroah-Hartman
                   ` (40 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Guangguan Wang, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Guangguan Wang <guangguan.wang@linux.alibaba.com>

[ Upstream commit f5146e3ef0a9eea405874b36178c19a4863b8989 ]

While doing smcr_port_add, there maybe linkgroup add into or delete
from smc_lgr_list.list at the same time, which may result kernel crash.
So, use smc_lgr_list.lock to protect smc_lgr_list.list iterate in
smcr_port_add.

The crash calltrace show below:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 559726 Comm: kworker/0:92 Kdump: loaded Tainted: G
Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 449e491 04/01/2014
Workqueue: events smc_ib_port_event_work [smc]
RIP: 0010:smcr_port_add+0xa6/0xf0 [smc]
RSP: 0000:ffffa5a2c8f67de0 EFLAGS: 00010297
RAX: 0000000000000001 RBX: ffff9935e0650000 RCX: 0000000000000000
RDX: 0000000000000010 RSI: ffff9935e0654290 RDI: ffff9935c8560000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9934c0401918
R10: 0000000000000000 R11: ffffffffb4a5c278 R12: ffff99364029aae4
R13: ffff99364029aa00 R14: 00000000ffffffed R15: ffff99364029ab08
FS:  0000000000000000(0000) GS:ffff994380600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000f06a10003 CR4: 0000000002770ef0
PKRU: 55555554
Call Trace:
 smc_ib_port_event_work+0x18f/0x380 [smc]
 process_one_work+0x19b/0x340
 worker_thread+0x30/0x370
 ? process_one_work+0x340/0x340
 kthread+0x114/0x130
 ? __kthread_cancel_work+0x50/0x50
 ret_from_fork+0x1f/0x30

Fixes: 1f90a05d9ff9 ("net/smc: add smcr_port_add() and smcr_link_up() processing")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/smc/smc_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index c676d92af7b7d..64b6dd439938e 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -1650,6 +1650,7 @@ void smcr_port_add(struct smc_ib_device *smcibdev, u8 ibport)
 {
 	struct smc_link_group *lgr, *n;
 
+	spin_lock_bh(&smc_lgr_list.lock);
 	list_for_each_entry_safe(lgr, n, &smc_lgr_list.list, list) {
 		struct smc_link *link;
 
@@ -1665,6 +1666,7 @@ void smcr_port_add(struct smc_ib_device *smcibdev, u8 ibport)
 		if (link)
 			smc_llc_add_link_local(link);
 	}
+	spin_unlock_bh(&smc_lgr_list.lock);
 }
 
 /* link is down - switch connections to alternate link,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 190/219] net: stmmac: fix handling of zero coalescing tx-usecs
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (188 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 189/219] net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 191/219] net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc() Greg Kroah-Hartman
                   ` (39 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vincent Whitchurch, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vincent Whitchurch <vincent.whitchurch@axis.com>

[ Upstream commit fa60b8163816f194786f3ee334c9a458da7699c6 ]

Setting ethtool -C eth0 tx-usecs 0 is supposed to disable the use of the
coalescing timer but currently it gets programmed with zero delay
instead.

Disable the use of the coalescing timer if tx-usecs is zero by
preventing it from being restarted.  Note that to keep things simple we
don't start/stop the timer when the coalescing settings are changed, but
just let that happen on the next transmit or timer expiry.

Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index a07bcb2f5d2e2..1559a4dafd413 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2692,9 +2692,7 @@ static int stmmac_tx_clean(struct stmmac_priv *priv, int budget, u32 queue)
 
 	/* We still have pending packets, let's call for a new scheduling */
 	if (tx_q->dirty_tx != tx_q->cur_tx)
-		hrtimer_start(&tx_q->txtimer,
-			      STMMAC_COAL_TIMER(priv->tx_coal_timer[queue]),
-			      HRTIMER_MODE_REL);
+		stmmac_tx_timer_arm(priv, queue);
 
 	__netif_tx_unlock_bh(netdev_get_tx_queue(priv->dev, queue));
 
@@ -2975,9 +2973,13 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 static void stmmac_tx_timer_arm(struct stmmac_priv *priv, u32 queue)
 {
 	struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
+	u32 tx_coal_timer = priv->tx_coal_timer[queue];
+
+	if (!tx_coal_timer)
+		return;
 
 	hrtimer_start(&tx_q->txtimer,
-		      STMMAC_COAL_TIMER(priv->tx_coal_timer[queue]),
+		      STMMAC_COAL_TIMER(tx_coal_timer),
 		      HRTIMER_MODE_REL);
 }
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 191/219] net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (189 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 190/219] net: stmmac: fix handling of zero coalescing tx-usecs Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 192/219] net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all() Greg Kroah-Hartman
                   ` (38 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Hangyu Hua, Marcin Wojtas,
	Russell King (Oracle), David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hangyu Hua <hbh25y@gmail.com>

[ Upstream commit 51fe0a470543f345e3c62b6798929de3ddcedc1d ]

rules is allocated in ethtool_get_rxnfc and the size is determined by
rule_cnt from user space. So rule_cnt needs to be check before using
rules to avoid OOB writing or NULL pointer dereference.

Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Marcin Wojtas <mw@semihalf.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index b399bdb1ca362..f936640cca4e6 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -5578,6 +5578,11 @@ static int mvpp2_ethtool_get_rxnfc(struct net_device *dev,
 		break;
 	case ETHTOOL_GRXCLSRLALL:
 		for (i = 0; i < MVPP2_N_RFS_ENTRIES_PER_FLOW; i++) {
+			if (loc == info->rule_cnt) {
+				ret = -EMSGSIZE;
+				break;
+			}
+
 			if (port->rfs_rules[i])
 				rules[loc++] = i;
 		}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 192/219] net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (190 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 191/219] net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 193/219] hsr: Fix uninit-value access in fill_frame_info() Greg Kroah-Hartman
                   ` (37 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Hangyu Hua, Simon Horman,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hangyu Hua <hbh25y@gmail.com>

[ Upstream commit e4c79810755f66c9a933ca810da2724133b1165a ]

rule_locs is allocated in ethtool_get_rxnfc and the size is determined by
rule_cnt from user space. So rule_cnt needs to be check before using
rule_locs to avoid NULL pointer dereference.

Fixes: 7aab747e5563 ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 7e318133423a9..0ac5ae16308f6 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -2698,6 +2698,9 @@ static int mtk_hwlro_get_fdir_all(struct net_device *dev,
 	int i;
 
 	for (i = 0; i < MTK_MAX_LRO_IP_CNT; i++) {
+		if (cnt == cmd->rule_cnt)
+			return -EMSGSIZE;
+
 		if (mac->hwlro_ip[i]) {
 			rule_locs[cnt] = i;
 			cnt++;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 193/219] hsr: Fix uninit-value access in fill_frame_info()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (191 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 192/219] net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 194/219] net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address Greg Kroah-Hartman
                   ` (36 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, syzbot+bf7e6250c7ce248f3ec9,
	Ziyang Xuan, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ziyang Xuan <william.xuanziyang@huawei.com>

[ Upstream commit 484b4833c604c0adcf19eac1ca14b60b757355b5 ]

Syzbot reports the following uninit-value access problem.

=====================================================
BUG: KMSAN: uninit-value in fill_frame_info net/hsr/hsr_forward.c:601 [inline]
BUG: KMSAN: uninit-value in hsr_forward_skb+0x9bd/0x30f0 net/hsr/hsr_forward.c:616
 fill_frame_info net/hsr/hsr_forward.c:601 [inline]
 hsr_forward_skb+0x9bd/0x30f0 net/hsr/hsr_forward.c:616
 hsr_dev_xmit+0x192/0x330 net/hsr/hsr_device.c:223
 __netdev_start_xmit include/linux/netdevice.h:4889 [inline]
 netdev_start_xmit include/linux/netdevice.h:4903 [inline]
 xmit_one net/core/dev.c:3544 [inline]
 dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3560
 __dev_queue_xmit+0x34d0/0x52a0 net/core/dev.c:4340
 dev_queue_xmit include/linux/netdevice.h:3082 [inline]
 packet_xmit+0x9c/0x6b0 net/packet/af_packet.c:276
 packet_snd net/packet/af_packet.c:3087 [inline]
 packet_sendmsg+0x8b1d/0x9f30 net/packet/af_packet.c:3119
 sock_sendmsg_nosec net/socket.c:730 [inline]
 sock_sendmsg net/socket.c:753 [inline]
 __sys_sendto+0x781/0xa30 net/socket.c:2176
 __do_sys_sendto net/socket.c:2188 [inline]
 __se_sys_sendto net/socket.c:2184 [inline]
 __ia32_sys_sendto+0x11f/0x1c0 net/socket.c:2184
 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
 __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
 entry_SYSENTER_compat_after_hwframe+0x70/0x82

Uninit was created at:
 slab_post_alloc_hook+0x12f/0xb70 mm/slab.h:767
 slab_alloc_node mm/slub.c:3478 [inline]
 kmem_cache_alloc_node+0x577/0xa80 mm/slub.c:3523
 kmalloc_reserve+0x148/0x470 net/core/skbuff.c:559
 __alloc_skb+0x318/0x740 net/core/skbuff.c:644
 alloc_skb include/linux/skbuff.h:1286 [inline]
 alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6299
 sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2794
 packet_alloc_skb net/packet/af_packet.c:2936 [inline]
 packet_snd net/packet/af_packet.c:3030 [inline]
 packet_sendmsg+0x70e8/0x9f30 net/packet/af_packet.c:3119
 sock_sendmsg_nosec net/socket.c:730 [inline]
 sock_sendmsg net/socket.c:753 [inline]
 __sys_sendto+0x781/0xa30 net/socket.c:2176
 __do_sys_sendto net/socket.c:2188 [inline]
 __se_sys_sendto net/socket.c:2184 [inline]
 __ia32_sys_sendto+0x11f/0x1c0 net/socket.c:2184
 do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
 __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
 do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
 do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
 entry_SYSENTER_compat_after_hwframe+0x70/0x82

It is because VLAN not yet supported in hsr driver. Return error
when protocol is ETH_P_8021Q in fill_frame_info() now to fix it.

Fixes: 451d8123f897 ("net: prp: add packet handling support")
Reported-by: syzbot+bf7e6250c7ce248f3ec9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bf7e6250c7ce248f3ec9
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/hsr/hsr_forward.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
index 629daacc96071..b71dab630a873 100644
--- a/net/hsr/hsr_forward.c
+++ b/net/hsr/hsr_forward.c
@@ -594,6 +594,7 @@ static int fill_frame_info(struct hsr_frame_info *frame,
 		proto = vlan_hdr->vlanhdr.h_vlan_encapsulated_proto;
 		/* FIXME: */
 		netdev_warn_once(skb->dev, "VLAN not yet supported");
+		return -EINVAL;
 	}
 
 	frame->is_from_san = false;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 194/219] net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (192 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 193/219] hsr: Fix uninit-value access in fill_frame_info() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 195/219] net:ethernet:adi:adin1110: Fix forwarding offload Greg Kroah-Hartman
                   ` (35 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yang Yingliang, Simon Horman,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yang Yingliang <yangyingliang@huawei.com>

[ Upstream commit 54024dbec95585243391caeb9f04a2620e630765 ]

Use eth_broadcast_addr() to assign broadcast address instead
of memset().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 32530dba1bd4 ("net:ethernet:adi:adin1110: Fix forwarding offload")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/adi/adin1110.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/adi/adin1110.c b/drivers/net/ethernet/adi/adin1110.c
index ecce5f7a549f2..cc026780ee0e8 100644
--- a/drivers/net/ethernet/adi/adin1110.c
+++ b/drivers/net/ethernet/adi/adin1110.c
@@ -740,7 +740,7 @@ static int adin1110_broadcasts_filter(struct adin1110_port_priv *port_priv,
 	u32 port_rules = 0;
 	u8 mask[ETH_ALEN];
 
-	memset(mask, 0xFF, ETH_ALEN);
+	eth_broadcast_addr(mask);
 
 	if (accept_broadcast && port_priv->state == BR_STATE_FORWARDING)
 		port_rules = adin1110_port_rules(port_priv, true, true);
@@ -761,7 +761,7 @@ static int adin1110_set_mac_address(struct net_device *netdev,
 		return -EADDRNOTAVAIL;
 
 	eth_hw_addr_set(netdev, dev_addr);
-	memset(mask, 0xFF, ETH_ALEN);
+	eth_broadcast_addr(mask);
 
 	mac_slot = (!port_priv->nr) ?  ADIN_MAC_P1_ADDR_SLOT : ADIN_MAC_P2_ADDR_SLOT;
 	port_rules = adin1110_port_rules(port_priv, true, false);
@@ -1251,7 +1251,7 @@ static int adin1110_port_set_blocking_state(struct adin1110_port_priv *port_priv
 		goto out;
 
 	/* Allow only BPDUs to be passed to the CPU */
-	memset(mask, 0xFF, ETH_ALEN);
+	eth_broadcast_addr(mask);
 	port_rules = adin1110_port_rules(port_priv, true, false);
 	ret = adin1110_write_mac_address(port_priv, mac_slot, mac,
 					 mask, port_rules);
@@ -1366,7 +1366,7 @@ static int adin1110_fdb_add(struct adin1110_port_priv *port_priv,
 
 	other_port = priv->ports[!port_priv->nr];
 	port_rules = adin1110_port_rules(port_priv, false, true);
-	memset(mask, 0xFF, ETH_ALEN);
+	eth_broadcast_addr(mask);
 
 	return adin1110_write_mac_address(other_port, mac_nr, (u8 *)fdb->addr,
 					  mask, port_rules);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 195/219] net:ethernet:adi:adin1110: Fix forwarding offload
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (193 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 194/219] net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 196/219] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Greg Kroah-Hartman
                   ` (34 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Ciprian Regus, Simon Horman,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ciprian Regus <ciprian.regus@analog.com>

[ Upstream commit 32530dba1bd48da4437d18d9a8dbc9d2826938a6 ]

Currently, when a new fdb entry is added (with both ports of the
ADIN2111 bridged), the driver configures the MAC filters for the wrong
port, which results in the forwarding being done by the host, and not
actually hardware offloaded.

The ADIN2111 offloads the forwarding by setting filters on the
destination MAC address of incoming frames. Based on these, they may be
routed to the other port. Thus, if a frame has to be forwarded from port
1 to port 2, the required configuration for the ADDR_FILT_UPRn register
should set the APPLY2PORT1 bit (instead of APPLY2PORT2, as it's
currently the case).

Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support")
Signed-off-by: Ciprian Regus <ciprian.regus@analog.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/adi/adin1110.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/adi/adin1110.c b/drivers/net/ethernet/adi/adin1110.c
index cc026780ee0e8..ed2863ed6a5bb 100644
--- a/drivers/net/ethernet/adi/adin1110.c
+++ b/drivers/net/ethernet/adi/adin1110.c
@@ -1365,7 +1365,7 @@ static int adin1110_fdb_add(struct adin1110_port_priv *port_priv,
 		return -ENOMEM;
 
 	other_port = priv->ports[!port_priv->nr];
-	port_rules = adin1110_port_rules(port_priv, false, true);
+	port_rules = adin1110_port_rules(other_port, false, true);
 	eth_broadcast_addr(mask);
 
 	return adin1110_write_mac_address(other_port, mac_nr, (u8 *)fdb->addr,
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 196/219] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (194 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 195/219] net:ethernet:adi:adin1110: Fix forwarding offload Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 197/219] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid() Greg Kroah-Hartman
                   ` (33 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vladimir Oltean, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 02c652f5465011126152bbd93b6a582a1d0c32f1 ]

Commit 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to
device") has partially hidden some multicast entries from showing up in
the "bridge fdb show" output, but it wasn't enough. Addresses which are
added through "bridge mdb add" still show up. Hide them all.

Fixes: 291d1e72b756 ("net: dsa: sja1105: Add support for FDB and MDB management")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index ff94c5996fafb..f8228a81234fe 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1875,13 +1875,14 @@ static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
 		if (!(l2_lookup.destports & BIT(port)))
 			continue;
 
-		/* We need to hide the FDB entry for unknown multicast */
-		if (l2_lookup.macaddr == SJA1105_UNKNOWN_MULTICAST &&
-		    l2_lookup.mask_macaddr == SJA1105_UNKNOWN_MULTICAST)
-			continue;
-
 		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
 
+		/* Hardware FDB is shared for fdb and mdb, "bridge fdb show"
+		 * only wants to see unicast
+		 */
+		if (is_multicast_ether_addr(macaddr))
+			continue;
+
 		/* We need to hide the dsa_8021q VLANs from the user. */
 		if (vid_is_dsa_8021q(l2_lookup.vlanid))
 			l2_lookup.vlanid = 0;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 197/219] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (195 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 196/219] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 198/219] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry Greg Kroah-Hartman
                   ` (32 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vladimir Oltean, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit c956798062b5a308db96e75157747291197f0378 ]

Currently, sja1105_dynamic_config_wait_complete() returns either 0 or
-ETIMEDOUT, because it just looks at the read_poll_timeout() return code.

There will be future changes which move some more checks to
sja1105_dynamic_config_poll_valid(). It is important that we propagate
their exact return code (-ENOENT, -EINVAL), because callers of
sja1105_dynamic_config_read() depend on them.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 7cef293b9a63 ("net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105_dynamic_config.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
index 7729d3f8b7f50..93d47dab8d3e9 100644
--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -1211,13 +1211,14 @@ sja1105_dynamic_config_wait_complete(struct sja1105_private *priv,
 				     struct sja1105_dyn_cmd *cmd,
 				     const struct sja1105_dynamic_table_ops *ops)
 {
-	int rc;
-
-	return read_poll_timeout(sja1105_dynamic_config_poll_valid,
-				 rc, rc != -EAGAIN,
-				 SJA1105_DYNAMIC_CONFIG_SLEEP_US,
-				 SJA1105_DYNAMIC_CONFIG_TIMEOUT_US,
-				 false, priv, cmd, ops);
+	int err, rc;
+
+	err = read_poll_timeout(sja1105_dynamic_config_poll_valid,
+				rc, rc != -EAGAIN,
+				SJA1105_DYNAMIC_CONFIG_SLEEP_US,
+				SJA1105_DYNAMIC_CONFIG_TIMEOUT_US,
+				false, priv, cmd, ops);
+	return err < 0 ? err : rc;
 }
 
 /* Provides read access to the settings through the dynamic interface
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 198/219] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (196 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 197/219] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 199/219] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses Greg Kroah-Hartman
                   ` (31 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Yanan Yang, Vladimir Oltean,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 7cef293b9a634a05fcce9e1df4aee3aeed023345 ]

The commit cited in Fixes: did 2 things: it refactored the read-back
polling from sja1105_dynamic_config_read() into a new function,
sja1105_dynamic_config_wait_complete(), and it called that from
sja1105_dynamic_config_write() too.

What is problematic is the refactoring.

The refactored code from sja1105_dynamic_config_poll_valid() works like
the previous one, but the problem is that it uses another packed_buf[]
SPI buffer, and there was code at the end of sja1105_dynamic_config_read()
which was relying on the read-back packed_buf[]:

	/* Don't dereference possibly NULL pointer - maybe caller
	 * only wanted to see whether the entry existed or not.
	 */
	if (entry)
		ops->entry_packing(packed_buf, entry, UNPACK);

After the change, the packed_buf[] that this code sees is no longer the
entry read back from hardware, but the original entry that the caller
passed to the sja1105_dynamic_config_read(), packed into this buffer.

This difference is the most notable with the SJA1105_SEARCH uses from
sja1105pqrs_fdb_add() - used for both fdb and mdb. There, we have logic
added by commit 728db843df88 ("net: dsa: sja1105: ignore the FDB entry
for unknown multicast when adding a new address") to figure out whether
the address we're trying to add matches on any existing hardware entry,
with the exception of the catch-all multicast address.

That logic was broken, because with sja1105_dynamic_config_read() not
working properly, it doesn't return us the entry read back from
hardware, but the entry that we passed to it. And, since for multicast,
a match will always exist, it will tell us that any mdb entry already
exists at index=0 L2 Address Lookup table. It is index=0 because the
caller doesn't know the index - it wants to find it out, and
sja1105_dynamic_config_read() does:

	if (index < 0) { // SJA1105_SEARCH
		/* Avoid copying a signed negative number to an u64 */
		cmd.index = 0; // <- this
		cmd.search = true;
	} else {
		cmd.index = index;
		cmd.search = false;
	}

So, to the caller of sja1105_dynamic_config_read(), the returned info
looks entirely legit, and it will add all mdb entries to FDB index 0.
There, they will always overwrite each other (not to mention,
potentially they can also overwrite a pre-existing bridge fdb entry),
and the user-visible impact will be that only the last mdb entry will be
forwarded as it should. The others won't (will be flooded or dropped,
depending on the egress flood settings).

Fixing is a bit more complicated, and involves either passing the same
packed_buf[] to sja1105_dynamic_config_wait_complete(), or moving all
the extra processing on the packed_buf[] to
sja1105_dynamic_config_wait_complete(). I've opted for the latter,
because it makes sja1105_dynamic_config_wait_complete() a bit more
self-contained.

Fixes: df405910ab9f ("net: dsa: sja1105: wait for dynamic config command completion on writes too")
Reported-by: Yanan Yang <yanan.yang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../net/dsa/sja1105/sja1105_dynamic_config.c  | 80 +++++++++----------
 1 file changed, 37 insertions(+), 43 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
index 93d47dab8d3e9..984c0e604e8de 100644
--- a/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_dynamic_config.c
@@ -1175,18 +1175,15 @@ const struct sja1105_dynamic_table_ops sja1110_dyn_ops[BLK_IDX_MAX_DYN] = {
 
 static int
 sja1105_dynamic_config_poll_valid(struct sja1105_private *priv,
-				  struct sja1105_dyn_cmd *cmd,
-				  const struct sja1105_dynamic_table_ops *ops)
+				  const struct sja1105_dynamic_table_ops *ops,
+				  void *entry, bool check_valident,
+				  bool check_errors)
 {
 	u8 packed_buf[SJA1105_MAX_DYN_CMD_SIZE] = {};
+	struct sja1105_dyn_cmd cmd = {};
 	int rc;
 
-	/* We don't _need_ to read the full entry, just the command area which
-	 * is a fixed SJA1105_SIZE_DYN_CMD. But our cmd_packing() API expects a
-	 * buffer that contains the full entry too. Additionally, our API
-	 * doesn't really know how many bytes into the buffer does the command
-	 * area really begin. So just read back the whole entry.
-	 */
+	/* Read back the whole entry + command structure. */
 	rc = sja1105_xfer_buf(priv, SPI_READ, ops->addr, packed_buf,
 			      ops->packed_size);
 	if (rc)
@@ -1195,11 +1192,25 @@ sja1105_dynamic_config_poll_valid(struct sja1105_private *priv,
 	/* Unpack the command structure, and return it to the caller in case it
 	 * needs to perform further checks on it (VALIDENT).
 	 */
-	memset(cmd, 0, sizeof(*cmd));
-	ops->cmd_packing(packed_buf, cmd, UNPACK);
+	ops->cmd_packing(packed_buf, &cmd, UNPACK);
 
 	/* Hardware hasn't cleared VALID => still working on it */
-	return cmd->valid ? -EAGAIN : 0;
+	if (cmd.valid)
+		return -EAGAIN;
+
+	if (check_valident && !cmd.valident && !(ops->access & OP_VALID_ANYWAY))
+		return -ENOENT;
+
+	if (check_errors && cmd.errors)
+		return -EINVAL;
+
+	/* Don't dereference possibly NULL pointer - maybe caller
+	 * only wanted to see whether the entry existed or not.
+	 */
+	if (entry)
+		ops->entry_packing(packed_buf, entry, UNPACK);
+
+	return 0;
 }
 
 /* Poll the dynamic config entry's control area until the hardware has
@@ -1208,8 +1219,9 @@ sja1105_dynamic_config_poll_valid(struct sja1105_private *priv,
  */
 static int
 sja1105_dynamic_config_wait_complete(struct sja1105_private *priv,
-				     struct sja1105_dyn_cmd *cmd,
-				     const struct sja1105_dynamic_table_ops *ops)
+				     const struct sja1105_dynamic_table_ops *ops,
+				     void *entry, bool check_valident,
+				     bool check_errors)
 {
 	int err, rc;
 
@@ -1217,7 +1229,8 @@ sja1105_dynamic_config_wait_complete(struct sja1105_private *priv,
 				rc, rc != -EAGAIN,
 				SJA1105_DYNAMIC_CONFIG_SLEEP_US,
 				SJA1105_DYNAMIC_CONFIG_TIMEOUT_US,
-				false, priv, cmd, ops);
+				false, priv, ops, entry, check_valident,
+				check_errors);
 	return err < 0 ? err : rc;
 }
 
@@ -1287,25 +1300,14 @@ int sja1105_dynamic_config_read(struct sja1105_private *priv,
 	mutex_lock(&priv->dynamic_config_lock);
 	rc = sja1105_xfer_buf(priv, SPI_WRITE, ops->addr, packed_buf,
 			      ops->packed_size);
-	if (rc < 0) {
-		mutex_unlock(&priv->dynamic_config_lock);
-		return rc;
-	}
-
-	rc = sja1105_dynamic_config_wait_complete(priv, &cmd, ops);
-	mutex_unlock(&priv->dynamic_config_lock);
 	if (rc < 0)
-		return rc;
+		goto out;
 
-	if (!cmd.valident && !(ops->access & OP_VALID_ANYWAY))
-		return -ENOENT;
+	rc = sja1105_dynamic_config_wait_complete(priv, ops, entry, true, false);
+out:
+	mutex_unlock(&priv->dynamic_config_lock);
 
-	/* Don't dereference possibly NULL pointer - maybe caller
-	 * only wanted to see whether the entry existed or not.
-	 */
-	if (entry)
-		ops->entry_packing(packed_buf, entry, UNPACK);
-	return 0;
+	return rc;
 }
 
 int sja1105_dynamic_config_write(struct sja1105_private *priv,
@@ -1357,22 +1359,14 @@ int sja1105_dynamic_config_write(struct sja1105_private *priv,
 	mutex_lock(&priv->dynamic_config_lock);
 	rc = sja1105_xfer_buf(priv, SPI_WRITE, ops->addr, packed_buf,
 			      ops->packed_size);
-	if (rc < 0) {
-		mutex_unlock(&priv->dynamic_config_lock);
-		return rc;
-	}
-
-	rc = sja1105_dynamic_config_wait_complete(priv, &cmd, ops);
-	mutex_unlock(&priv->dynamic_config_lock);
 	if (rc < 0)
-		return rc;
+		goto out;
 
-	cmd = (struct sja1105_dyn_cmd) {0};
-	ops->cmd_packing(packed_buf, &cmd, UNPACK);
-	if (cmd.errors)
-		return -EINVAL;
+	rc = sja1105_dynamic_config_wait_complete(priv, ops, NULL, false, true);
+out:
+	mutex_unlock(&priv->dynamic_config_lock);
 
-	return 0;
+	return rc;
 }
 
 static u8 sja1105_crc8_add(u8 crc, u8 byte, u8 poly)
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 199/219] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (197 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 198/219] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 200/219] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset Greg Kroah-Hartman
                   ` (30 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vladimir Oltean, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit ea32690daf4fa525dc5a4d164bd00ed8c756e1c6 ]

sja1105_fdb_add() runs from the dsa_owq, and sja1105_port_mcast_flood()
runs from switchdev_deferred_process_work(). Prior to the blamed commit,
they used to be indirectly serialized through the rtnl_lock(), which
no longer holds true because dsa_owq dropped that.

So, it is now possible that we traverse the static config BLK_IDX_L2_LOOKUP
elements concurrently compared to when we change them, in
sja1105_static_fdb_change(). That is not ideal, since it might result in
data corruption.

Introduce a mutex which serializes accesses to the hardware FDB and to
the static config elements for the L2 Address Lookup table.

I can't find a good reason to add locking around sja1105_fdb_dump().
I'll add it later if needed.

Fixes: 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105.h      |  2 +
 drivers/net/dsa/sja1105/sja1105_main.c | 56 ++++++++++++++++++++------
 2 files changed, 45 insertions(+), 13 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index a831bb0a52074..06e0ebfe652a4 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -264,6 +264,8 @@ struct sja1105_private {
 	 * the switch doesn't confuse them with one another.
 	 */
 	struct mutex mgmt_lock;
+	/* Serializes accesses to the FDB */
+	struct mutex fdb_lock;
 	/* PTP two-step TX timestamp ID, and its serialization lock */
 	spinlock_t ts_id_lock;
 	u8 ts_id;
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index f8228a81234fe..c6c74e302e4dd 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -1805,6 +1805,7 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
 			   struct dsa_db db)
 {
 	struct sja1105_private *priv = ds->priv;
+	int rc;
 
 	if (!vid) {
 		switch (db.type) {
@@ -1819,12 +1820,16 @@ static int sja1105_fdb_add(struct dsa_switch *ds, int port,
 		}
 	}
 
-	return priv->info->fdb_add_cmd(ds, port, addr, vid);
+	mutex_lock(&priv->fdb_lock);
+	rc = priv->info->fdb_add_cmd(ds, port, addr, vid);
+	mutex_unlock(&priv->fdb_lock);
+
+	return rc;
 }
 
-static int sja1105_fdb_del(struct dsa_switch *ds, int port,
-			   const unsigned char *addr, u16 vid,
-			   struct dsa_db db)
+static int __sja1105_fdb_del(struct dsa_switch *ds, int port,
+			     const unsigned char *addr, u16 vid,
+			     struct dsa_db db)
 {
 	struct sja1105_private *priv = ds->priv;
 
@@ -1844,6 +1849,20 @@ static int sja1105_fdb_del(struct dsa_switch *ds, int port,
 	return priv->info->fdb_del_cmd(ds, port, addr, vid);
 }
 
+static int sja1105_fdb_del(struct dsa_switch *ds, int port,
+			   const unsigned char *addr, u16 vid,
+			   struct dsa_db db)
+{
+	struct sja1105_private *priv = ds->priv;
+	int rc;
+
+	mutex_lock(&priv->fdb_lock);
+	rc = __sja1105_fdb_del(ds, port, addr, vid, db);
+	mutex_unlock(&priv->fdb_lock);
+
+	return rc;
+}
+
 static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
 			    dsa_fdb_dump_cb_t *cb, void *data)
 {
@@ -1906,6 +1925,8 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 	};
 	int i;
 
+	mutex_lock(&priv->fdb_lock);
+
 	for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
 		struct sja1105_l2_lookup_entry l2_lookup = {0};
 		u8 macaddr[ETH_ALEN];
@@ -1919,7 +1940,7 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 		if (rc) {
 			dev_err(ds->dev, "Failed to read FDB: %pe\n",
 				ERR_PTR(rc));
-			return;
+			break;
 		}
 
 		if (!(l2_lookup.destports & BIT(port)))
@@ -1931,14 +1952,16 @@ static void sja1105_fast_age(struct dsa_switch *ds, int port)
 
 		u64_to_ether_addr(l2_lookup.macaddr, macaddr);
 
-		rc = sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, db);
+		rc = __sja1105_fdb_del(ds, port, macaddr, l2_lookup.vlanid, db);
 		if (rc) {
 			dev_err(ds->dev,
 				"Failed to delete FDB entry %pM vid %lld: %pe\n",
 				macaddr, l2_lookup.vlanid, ERR_PTR(rc));
-			return;
+			break;
 		}
 	}
+
+	mutex_unlock(&priv->fdb_lock);
 }
 
 static int sja1105_mdb_add(struct dsa_switch *ds, int port,
@@ -2964,7 +2987,9 @@ static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to,
 {
 	struct sja1105_l2_lookup_entry *l2_lookup;
 	struct sja1105_table *table;
-	int match;
+	int match, rc;
+
+	mutex_lock(&priv->fdb_lock);
 
 	table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
 	l2_lookup = table->entries;
@@ -2977,7 +3002,8 @@ static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to,
 	if (match == table->entry_count) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "Could not find FDB entry for unknown multicast");
-		return -ENOSPC;
+		rc = -ENOSPC;
+		goto out;
 	}
 
 	if (flags.val & BR_MCAST_FLOOD)
@@ -2985,10 +3011,13 @@ static int sja1105_port_mcast_flood(struct sja1105_private *priv, int to,
 	else
 		l2_lookup[match].destports &= ~BIT(to);
 
-	return sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
-					    l2_lookup[match].index,
-					    &l2_lookup[match],
-					    true);
+	rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
+					  l2_lookup[match].index,
+					  &l2_lookup[match], true);
+out:
+	mutex_unlock(&priv->fdb_lock);
+
+	return rc;
 }
 
 static int sja1105_port_pre_bridge_flags(struct dsa_switch *ds, int port,
@@ -3358,6 +3387,7 @@ static int sja1105_probe(struct spi_device *spi)
 	mutex_init(&priv->ptp_data.lock);
 	mutex_init(&priv->dynamic_config_lock);
 	mutex_init(&priv->mgmt_lock);
+	mutex_init(&priv->fdb_lock);
 	spin_lock_init(&priv->ts_id_lock);
 
 	rc = sja1105_parse_dt(priv);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 200/219] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (198 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 199/219] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 201/219] r8152: check budget for r8152_poll() Greg Kroah-Hartman
                   ` (29 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vladimir Oltean, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vladimir Oltean <vladimir.oltean@nxp.com>

[ Upstream commit 86899e9e1e29e854b5f6dcc24ba4f75f792c89aa ]

Currently, when we add the first sja1105 port to a bridge with
vlan_filtering 1, then we sometimes see this output:

sja1105 spi2.2: port 4 failed to read back entry for be:79:b4:9e:9e:96 vid 3088: -ENOENT
sja1105 spi2.2: Reset switch and programmed static config. Reason: VLAN filtering
sja1105 spi2.2: port 0 failed to add be:79:b4:9e:9e:96 vid 0 to fdb: -2

It is because sja1105_fdb_add() runs from the dsa_owq which is no longer
serialized with switch resets since it dropped the rtnl_lock() in the
blamed commit.

Either performing the FDB accesses before the reset, or after the reset,
is equally fine, because sja1105_static_fdb_change() backs up those
changes in the static config, but FDB access during reset isn't ok.

Make sja1105_static_config_reload() take the fdb_lock to fix that.

Fixes: 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/dsa/sja1105/sja1105_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index c6c74e302e4dd..f1f1368e8146f 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2304,6 +2304,7 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
 	int rc, i;
 	s64 now;
 
+	mutex_lock(&priv->fdb_lock);
 	mutex_lock(&priv->mgmt_lock);
 
 	mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
@@ -2418,6 +2419,7 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
 		goto out;
 out:
 	mutex_unlock(&priv->mgmt_lock);
+	mutex_unlock(&priv->fdb_lock);
 
 	return rc;
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 201/219] r8152: check budget for r8152_poll()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (199 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 200/219] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 202/219] kcm: Fix memory leak in error path of kcm_sendmsg() Greg Kroah-Hartman
                   ` (28 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Hayes Wang, David S. Miller,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hayes Wang <hayeswang@realtek.com>

[ Upstream commit a7b8d60b37237680009dd0b025fe8c067aba0ee3 ]

According to the document of napi, there is no rx process when the
budget is 0. Therefore, r8152_poll() has to return 0 directly when the
budget is equal to 0.

Fixes: d2187f8e4454 ("r8152: divide the tx and rx bottom functions")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/usb/r8152.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 059d610901d84..fc1458f96e170 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -2628,6 +2628,9 @@ static int r8152_poll(struct napi_struct *napi, int budget)
 	struct r8152 *tp = container_of(napi, struct r8152, napi);
 	int work_done;
 
+	if (!budget)
+		return 0;
+
 	work_done = rx_bottom(tp, budget);
 
 	if (work_done < budget) {
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 202/219] kcm: Fix memory leak in error path of kcm_sendmsg()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (200 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 201/219] r8152: check budget for r8152_poll() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 203/219] platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors Greg Kroah-Hartman
                   ` (27 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Shigeru Yoshida, David S. Miller,
	Sasha Levin, syzbot+6f98de741f7dbbfc4ccb

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shigeru Yoshida <syoshida@redhat.com>

[ Upstream commit c821a88bd720b0046433173185fd841a100d44ad ]

syzbot reported a memory leak like below:

BUG: memory leak
unreferenced object 0xffff88810b088c00 (size 240):
  comm "syz-executor186", pid 5012, jiffies 4294943306 (age 13.680s)
  hex dump (first 32 bytes):
    00 89 08 0b 81 88 ff ff 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff83e5d5ff>] __alloc_skb+0x1ef/0x230 net/core/skbuff.c:634
    [<ffffffff84606e59>] alloc_skb include/linux/skbuff.h:1289 [inline]
    [<ffffffff84606e59>] kcm_sendmsg+0x269/0x1050 net/kcm/kcmsock.c:815
    [<ffffffff83e479c6>] sock_sendmsg_nosec net/socket.c:725 [inline]
    [<ffffffff83e479c6>] sock_sendmsg+0x56/0xb0 net/socket.c:748
    [<ffffffff83e47f55>] ____sys_sendmsg+0x365/0x470 net/socket.c:2494
    [<ffffffff83e4c389>] ___sys_sendmsg+0xc9/0x130 net/socket.c:2548
    [<ffffffff83e4c536>] __sys_sendmsg+0xa6/0x120 net/socket.c:2577
    [<ffffffff84ad7bb8>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    [<ffffffff84ad7bb8>] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
    [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

In kcm_sendmsg(), kcm_tx_msg(head)->last_skb is used as a cursor to append
newly allocated skbs to 'head'. If some bytes are copied, an error occurred,
and jumped to out_error label, 'last_skb' is left unmodified. A later
kcm_sendmsg() will use an obsoleted 'last_skb' reference, corrupting the
'head' frag_list and causing the leak.

This patch fixes this issue by properly updating the last allocated skb in
'last_skb'.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Reported-and-tested-by: syzbot+6f98de741f7dbbfc4ccb@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=6f98de741f7dbbfc4ccb
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/kcm/kcmsock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 6a97662d7548e..dd929a8350740 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1074,6 +1074,8 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 
 	if (head != kcm->seq_skb)
 		kfree_skb(head);
+	else if (copied)
+		kcm_tx_msg(head)->last_skb = skb;
 
 	err = sk_stream_error(sk, msg->msg_flags, err);
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 203/219] platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (201 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 202/219] kcm: Fix memory leak in error path of kcm_sendmsg() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 204/219] platform/mellanox: mlxbf-tmfifo: Drop jumbo frames Greg Kroah-Hartman
                   ` (26 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Liming Sun, Vadim Pasternak,
	David Thompson, Hans de Goede, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liming Sun <limings@nvidia.com>

[ Upstream commit 78034cbece79c2d730ad0770b3b7f23eedbbecf5 ]

This commit fixes tmfifo console stuck issue when the virtual
networking interface is in down state. In such case, the network
Rx descriptors runs out and causes the Rx network packet staying
in the head of the tmfifo thus blocking the console packets. The
fix is to drop the Rx network packet when no more Rx descriptors.
Function name mlxbf_tmfifo_release_pending_pkt() is also renamed
to mlxbf_tmfifo_release_pkt() to be more approperiate.

Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
Signed-off-by: Liming Sun <limings@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/8c0177dc938ae03f52ff7e0b62dbeee74b7bec09.1693322547.git.limings@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/platform/mellanox/mlxbf-tmfifo.c | 66 ++++++++++++++++++------
 1 file changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c b/drivers/platform/mellanox/mlxbf-tmfifo.c
index d31fe7eed38df..ee8648a271fda 100644
--- a/drivers/platform/mellanox/mlxbf-tmfifo.c
+++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
@@ -56,6 +56,7 @@ struct mlxbf_tmfifo;
  * @vq: pointer to the virtio virtqueue
  * @desc: current descriptor of the pending packet
  * @desc_head: head descriptor of the pending packet
+ * @drop_desc: dummy desc for packet dropping
  * @cur_len: processed length of the current descriptor
  * @rem_len: remaining length of the pending packet
  * @pkt_len: total length of the pending packet
@@ -72,6 +73,7 @@ struct mlxbf_tmfifo_vring {
 	struct virtqueue *vq;
 	struct vring_desc *desc;
 	struct vring_desc *desc_head;
+	struct vring_desc drop_desc;
 	int cur_len;
 	int rem_len;
 	u32 pkt_len;
@@ -83,6 +85,14 @@ struct mlxbf_tmfifo_vring {
 	struct mlxbf_tmfifo *fifo;
 };
 
+/* Check whether vring is in drop mode. */
+#define IS_VRING_DROP(_r) ({ \
+	typeof(_r) (r) = (_r); \
+	(r->desc_head == &r->drop_desc ? true : false); })
+
+/* A stub length to drop maximum length packet. */
+#define VRING_DROP_DESC_MAX_LEN		GENMASK(15, 0)
+
 /* Interrupt types. */
 enum {
 	MLXBF_TM_RX_LWM_IRQ,
@@ -243,6 +253,7 @@ static int mlxbf_tmfifo_alloc_vrings(struct mlxbf_tmfifo *fifo,
 		vring->align = SMP_CACHE_BYTES;
 		vring->index = i;
 		vring->vdev_id = tm_vdev->vdev.id.device;
+		vring->drop_desc.len = VRING_DROP_DESC_MAX_LEN;
 		dev = &tm_vdev->vdev.dev;
 
 		size = vring_size(vring->num, vring->align);
@@ -348,7 +359,7 @@ static u32 mlxbf_tmfifo_get_pkt_len(struct mlxbf_tmfifo_vring *vring,
 	return len;
 }
 
-static void mlxbf_tmfifo_release_pending_pkt(struct mlxbf_tmfifo_vring *vring)
+static void mlxbf_tmfifo_release_pkt(struct mlxbf_tmfifo_vring *vring)
 {
 	struct vring_desc *desc_head;
 	u32 len = 0;
@@ -577,19 +588,25 @@ static void mlxbf_tmfifo_rxtx_word(struct mlxbf_tmfifo_vring *vring,
 
 	if (vring->cur_len + sizeof(u64) <= len) {
 		/* The whole word. */
-		if (is_rx)
-			memcpy(addr + vring->cur_len, &data, sizeof(u64));
-		else
-			memcpy(&data, addr + vring->cur_len, sizeof(u64));
+		if (!IS_VRING_DROP(vring)) {
+			if (is_rx)
+				memcpy(addr + vring->cur_len, &data,
+				       sizeof(u64));
+			else
+				memcpy(&data, addr + vring->cur_len,
+				       sizeof(u64));
+		}
 		vring->cur_len += sizeof(u64);
 	} else {
 		/* Leftover bytes. */
-		if (is_rx)
-			memcpy(addr + vring->cur_len, &data,
-			       len - vring->cur_len);
-		else
-			memcpy(&data, addr + vring->cur_len,
-			       len - vring->cur_len);
+		if (!IS_VRING_DROP(vring)) {
+			if (is_rx)
+				memcpy(addr + vring->cur_len, &data,
+				       len - vring->cur_len);
+			else
+				memcpy(&data, addr + vring->cur_len,
+				       len - vring->cur_len);
+		}
 		vring->cur_len = len;
 	}
 
@@ -690,8 +707,16 @@ static bool mlxbf_tmfifo_rxtx_one_desc(struct mlxbf_tmfifo_vring *vring,
 	/* Get the descriptor of the next packet. */
 	if (!vring->desc) {
 		desc = mlxbf_tmfifo_get_next_pkt(vring, is_rx);
-		if (!desc)
-			return false;
+		if (!desc) {
+			/* Drop next Rx packet to avoid stuck. */
+			if (is_rx) {
+				desc = &vring->drop_desc;
+				vring->desc_head = desc;
+				vring->desc = desc;
+			} else {
+				return false;
+			}
+		}
 	} else {
 		desc = vring->desc;
 	}
@@ -724,17 +749,24 @@ static bool mlxbf_tmfifo_rxtx_one_desc(struct mlxbf_tmfifo_vring *vring,
 		vring->rem_len -= len;
 
 		/* Get the next desc on the chain. */
-		if (vring->rem_len > 0 &&
+		if (!IS_VRING_DROP(vring) && vring->rem_len > 0 &&
 		    (virtio16_to_cpu(vdev, desc->flags) & VRING_DESC_F_NEXT)) {
 			idx = virtio16_to_cpu(vdev, desc->next);
 			desc = &vr->desc[idx];
 			goto mlxbf_tmfifo_desc_done;
 		}
 
-		/* Done and release the pending packet. */
-		mlxbf_tmfifo_release_pending_pkt(vring);
+		/* Done and release the packet. */
 		desc = NULL;
 		fifo->vring[is_rx] = NULL;
+		if (!IS_VRING_DROP(vring)) {
+			mlxbf_tmfifo_release_pkt(vring);
+		} else {
+			vring->pkt_len = 0;
+			vring->desc_head = NULL;
+			vring->desc = NULL;
+			return false;
+		}
 
 		/*
 		 * Make sure the load/store are in order before
@@ -914,7 +946,7 @@ static void mlxbf_tmfifo_virtio_del_vqs(struct virtio_device *vdev)
 
 		/* Release the pending packet. */
 		if (vring->desc)
-			mlxbf_tmfifo_release_pending_pkt(vring);
+			mlxbf_tmfifo_release_pkt(vring);
 		vq = vring->vq;
 		if (vq) {
 			vring->vq = NULL;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 204/219] platform/mellanox: mlxbf-tmfifo: Drop jumbo frames
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (202 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 203/219] platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 205/219] platform/mellanox: mlxbf-pmc: Fix potential buffer overflows Greg Kroah-Hartman
                   ` (25 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Liming Sun, Vadim Pasternak,
	David Thompson, Hans de Goede, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liming Sun <limings@nvidia.com>

[ Upstream commit fc4c655821546239abb3cf4274d66b9747aa87dd ]

This commit drops over-sized network packets to avoid tmfifo
queue stuck.

Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
Signed-off-by: Liming Sun <limings@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/9318936c2447f76db475c985ca6d91f057efcd41.1693322547.git.limings@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/platform/mellanox/mlxbf-tmfifo.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c b/drivers/platform/mellanox/mlxbf-tmfifo.c
index ee8648a271fda..a04ff89a7ec44 100644
--- a/drivers/platform/mellanox/mlxbf-tmfifo.c
+++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
@@ -205,7 +205,7 @@ static u8 mlxbf_tmfifo_net_default_mac[ETH_ALEN] = {
 static efi_char16_t mlxbf_tmfifo_efi_name[] = L"RshimMacAddr";
 
 /* Maximum L2 header length. */
-#define MLXBF_TMFIFO_NET_L2_OVERHEAD	36
+#define MLXBF_TMFIFO_NET_L2_OVERHEAD	(ETH_HLEN + VLAN_HLEN)
 
 /* Supported virtio-net features. */
 #define MLXBF_TMFIFO_NET_FEATURES \
@@ -623,13 +623,14 @@ static void mlxbf_tmfifo_rxtx_word(struct mlxbf_tmfifo_vring *vring,
  * flag is set.
  */
 static void mlxbf_tmfifo_rxtx_header(struct mlxbf_tmfifo_vring *vring,
-				     struct vring_desc *desc,
+				     struct vring_desc **desc,
 				     bool is_rx, bool *vring_change)
 {
 	struct mlxbf_tmfifo *fifo = vring->fifo;
 	struct virtio_net_config *config;
 	struct mlxbf_tmfifo_msg_hdr hdr;
 	int vdev_id, hdr_len;
+	bool drop_rx = false;
 
 	/* Read/Write packet header. */
 	if (is_rx) {
@@ -649,8 +650,8 @@ static void mlxbf_tmfifo_rxtx_header(struct mlxbf_tmfifo_vring *vring,
 			if (ntohs(hdr.len) >
 			    __virtio16_to_cpu(virtio_legacy_is_little_endian(),
 					      config->mtu) +
-			    MLXBF_TMFIFO_NET_L2_OVERHEAD)
-				return;
+					      MLXBF_TMFIFO_NET_L2_OVERHEAD)
+				drop_rx = true;
 		} else {
 			vdev_id = VIRTIO_ID_CONSOLE;
 			hdr_len = 0;
@@ -665,16 +666,25 @@ static void mlxbf_tmfifo_rxtx_header(struct mlxbf_tmfifo_vring *vring,
 
 			if (!tm_dev2)
 				return;
-			vring->desc = desc;
+			vring->desc = *desc;
 			vring = &tm_dev2->vrings[MLXBF_TMFIFO_VRING_RX];
 			*vring_change = true;
 		}
+
+		if (drop_rx && !IS_VRING_DROP(vring)) {
+			if (vring->desc_head)
+				mlxbf_tmfifo_release_pkt(vring);
+			*desc = &vring->drop_desc;
+			vring->desc_head = *desc;
+			vring->desc = *desc;
+		}
+
 		vring->pkt_len = ntohs(hdr.len) + hdr_len;
 	} else {
 		/* Network virtio has an extra header. */
 		hdr_len = (vring->vdev_id == VIRTIO_ID_NET) ?
 			   sizeof(struct virtio_net_hdr) : 0;
-		vring->pkt_len = mlxbf_tmfifo_get_pkt_len(vring, desc);
+		vring->pkt_len = mlxbf_tmfifo_get_pkt_len(vring, *desc);
 		hdr.type = (vring->vdev_id == VIRTIO_ID_NET) ?
 			    VIRTIO_ID_NET : VIRTIO_ID_CONSOLE;
 		hdr.len = htons(vring->pkt_len - hdr_len);
@@ -723,7 +733,7 @@ static bool mlxbf_tmfifo_rxtx_one_desc(struct mlxbf_tmfifo_vring *vring,
 
 	/* Beginning of a packet. Start to Rx/Tx packet header. */
 	if (vring->pkt_len == 0) {
-		mlxbf_tmfifo_rxtx_header(vring, desc, is_rx, &vring_change);
+		mlxbf_tmfifo_rxtx_header(vring, &desc, is_rx, &vring_change);
 		(*avail)--;
 
 		/* Return if new packet is for another ring. */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 205/219] platform/mellanox: mlxbf-pmc: Fix potential buffer overflows
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (203 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 204/219] platform/mellanox: mlxbf-tmfifo: Drop jumbo frames Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 206/219] platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events Greg Kroah-Hartman
                   ` (24 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Shravan Kumar Ramani,
	Vadim Pasternak, David Thompson, Hans de Goede, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shravan Kumar Ramani <shravankr@nvidia.com>

[ Upstream commit 80ccd40568bcd3655b0fd0be1e9b3379fd6e1056 ]

Replace sprintf with sysfs_emit where possible.
Size check in mlxbf_pmc_event_list_show should account for "\0".

Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver")
Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/bef39ef32319a31b32f999065911f61b0d3b17c3.1693917738.git.shravankr@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/platform/mellanox/mlxbf-pmc.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
index be967d797c28e..95afcae7b9fa9 100644
--- a/drivers/platform/mellanox/mlxbf-pmc.c
+++ b/drivers/platform/mellanox/mlxbf-pmc.c
@@ -1008,7 +1008,7 @@ static ssize_t mlxbf_pmc_counter_show(struct device *dev,
 	} else
 		return -EINVAL;
 
-	return sprintf(buf, "0x%llx\n", value);
+	return sysfs_emit(buf, "0x%llx\n", value);
 }
 
 /* Store function for "counter" sysfs files */
@@ -1078,13 +1078,13 @@ static ssize_t mlxbf_pmc_event_show(struct device *dev,
 
 	err = mlxbf_pmc_read_event(blk_num, cnt_num, is_l3, &evt_num);
 	if (err)
-		return sprintf(buf, "No event being monitored\n");
+		return sysfs_emit(buf, "No event being monitored\n");
 
 	evt_name = mlxbf_pmc_get_event_name(pmc->block_name[blk_num], evt_num);
 	if (!evt_name)
 		return -EINVAL;
 
-	return sprintf(buf, "0x%llx: %s\n", evt_num, evt_name);
+	return sysfs_emit(buf, "0x%llx: %s\n", evt_num, evt_name);
 }
 
 /* Store function for "event" sysfs files */
@@ -1139,9 +1139,9 @@ static ssize_t mlxbf_pmc_event_list_show(struct device *dev,
 		return -EINVAL;
 
 	for (i = 0, buf[0] = '\0'; i < size; ++i) {
-		len += sprintf(e_info, "0x%x: %s\n", events[i].evt_num,
-			       events[i].evt_name);
-		if (len > PAGE_SIZE)
+		len += snprintf(e_info, sizeof(e_info), "0x%x: %s\n",
+				events[i].evt_num, events[i].evt_name);
+		if (len >= PAGE_SIZE)
 			break;
 		strcat(buf, e_info);
 		ret = len;
@@ -1168,7 +1168,7 @@ static ssize_t mlxbf_pmc_enable_show(struct device *dev,
 
 	value = FIELD_GET(MLXBF_PMC_L3C_PERF_CNT_CFG_EN, perfcnt_cfg);
 
-	return sprintf(buf, "%d\n", value);
+	return sysfs_emit(buf, "%d\n", value);
 }
 
 /* Store function for "enable" sysfs files - only for l3cache */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 206/219] platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (204 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 205/219] platform/mellanox: mlxbf-pmc: Fix potential buffer overflows Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 207/219] platform/mellanox: NVSW_SN2201 should depend on ACPI Greg Kroah-Hartman
                   ` (23 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Shravan Kumar Ramani,
	Vadim Pasternak, David Thompson, Hans de Goede, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Shravan Kumar Ramani <shravankr@nvidia.com>

[ Upstream commit 0f5969452e162efc50bdc98968fb62b424a9874b ]

This fix involves 2 changes:
 - All event regs have a reset value of 0, which is not a valid
   event_number as per the event_list for most blocks and hence seen
   as an error. Add a "disable" event with event_number 0 for all blocks.

 - The enable bit for each counter need not be checked before
   reading the event info, and hence removed.

Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver")
Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/04d0213932d32681de1c716b54320ed894e52425.1693917738.git.shravankr@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/platform/mellanox/mlxbf-pmc.c | 27 +++++++--------------------
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
index 95afcae7b9fa9..2d4bbe99959ef 100644
--- a/drivers/platform/mellanox/mlxbf-pmc.c
+++ b/drivers/platform/mellanox/mlxbf-pmc.c
@@ -191,6 +191,7 @@ static const struct mlxbf_pmc_events mlxbf_pmc_smgen_events[] = {
 };
 
 static const struct mlxbf_pmc_events mlxbf_pmc_trio_events_1[] = {
+	{ 0x0, "DISABLE" },
 	{ 0xa0, "TPIO_DATA_BEAT" },
 	{ 0xa1, "TDMA_DATA_BEAT" },
 	{ 0xa2, "MAP_DATA_BEAT" },
@@ -214,6 +215,7 @@ static const struct mlxbf_pmc_events mlxbf_pmc_trio_events_1[] = {
 };
 
 static const struct mlxbf_pmc_events mlxbf_pmc_trio_events_2[] = {
+	{ 0x0, "DISABLE" },
 	{ 0xa0, "TPIO_DATA_BEAT" },
 	{ 0xa1, "TDMA_DATA_BEAT" },
 	{ 0xa2, "MAP_DATA_BEAT" },
@@ -246,6 +248,7 @@ static const struct mlxbf_pmc_events mlxbf_pmc_trio_events_2[] = {
 };
 
 static const struct mlxbf_pmc_events mlxbf_pmc_ecc_events[] = {
+	{ 0x0, "DISABLE" },
 	{ 0x100, "ECC_SINGLE_ERROR_CNT" },
 	{ 0x104, "ECC_DOUBLE_ERROR_CNT" },
 	{ 0x114, "SERR_INJ" },
@@ -258,6 +261,7 @@ static const struct mlxbf_pmc_events mlxbf_pmc_ecc_events[] = {
 };
 
 static const struct mlxbf_pmc_events mlxbf_pmc_mss_events[] = {
+	{ 0x0, "DISABLE" },
 	{ 0xc0, "RXREQ_MSS" },
 	{ 0xc1, "RXDAT_MSS" },
 	{ 0xc2, "TXRSP_MSS" },
@@ -265,6 +269,7 @@ static const struct mlxbf_pmc_events mlxbf_pmc_mss_events[] = {
 };
 
 static const struct mlxbf_pmc_events mlxbf_pmc_hnf_events[] = {
+	{ 0x0, "DISABLE" },
 	{ 0x45, "HNF_REQUESTS" },
 	{ 0x46, "HNF_REJECTS" },
 	{ 0x47, "ALL_BUSY" },
@@ -323,6 +328,7 @@ static const struct mlxbf_pmc_events mlxbf_pmc_hnf_events[] = {
 };
 
 static const struct mlxbf_pmc_events mlxbf_pmc_hnfnet_events[] = {
+	{ 0x0, "DISABLE" },
 	{ 0x12, "CDN_REQ" },
 	{ 0x13, "DDN_REQ" },
 	{ 0x14, "NDN_REQ" },
@@ -892,7 +898,7 @@ static int mlxbf_pmc_read_event(int blk_num, uint32_t cnt_num, bool is_l3,
 				uint64_t *result)
 {
 	uint32_t perfcfg_offset, perfval_offset;
-	uint64_t perfmon_cfg, perfevt, perfctl;
+	uint64_t perfmon_cfg, perfevt;
 
 	if (cnt_num >= pmc->block[blk_num].counters)
 		return -EINVAL;
@@ -904,25 +910,6 @@ static int mlxbf_pmc_read_event(int blk_num, uint32_t cnt_num, bool is_l3,
 	perfval_offset = perfcfg_offset +
 			 pmc->block[blk_num].counters * MLXBF_PMC_REG_SIZE;
 
-	/* Set counter in "read" mode */
-	perfmon_cfg = FIELD_PREP(MLXBF_PMC_PERFMON_CONFIG_ADDR,
-				 MLXBF_PMC_PERFCTL);
-	perfmon_cfg |= FIELD_PREP(MLXBF_PMC_PERFMON_CONFIG_STROBE, 1);
-	perfmon_cfg |= FIELD_PREP(MLXBF_PMC_PERFMON_CONFIG_WR_R_B, 0);
-
-	if (mlxbf_pmc_write(pmc->block[blk_num].mmio_base + perfcfg_offset,
-			    MLXBF_PMC_WRITE_REG_64, perfmon_cfg))
-		return -EFAULT;
-
-	/* Check if the counter is enabled */
-
-	if (mlxbf_pmc_read(pmc->block[blk_num].mmio_base + perfval_offset,
-			   MLXBF_PMC_READ_REG_64, &perfctl))
-		return -EFAULT;
-
-	if (!FIELD_GET(MLXBF_PMC_PERFCTL_EN0, perfctl))
-		return -EINVAL;
-
 	/* Set counter in "read" mode */
 	perfmon_cfg = FIELD_PREP(MLXBF_PMC_PERFMON_CONFIG_ADDR,
 				 MLXBF_PMC_PERFEVT);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 207/219] platform/mellanox: NVSW_SN2201 should depend on ACPI
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (205 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 206/219] platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 208/219] net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() Greg Kroah-Hartman
                   ` (22 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Geert Uytterhoeven, Vadim Pasternak,
	Andi Shyti, Hans de Goede, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Geert Uytterhoeven <geert+renesas@glider.be>

[ Upstream commit 0a138f1670bd1af13ba6949c48ea86ddd4bf557e ]

The only probing method supported by the Nvidia SN2201 platform driver
is probing through an ACPI match table.  Hence add a dependency on
ACPI, to prevent asking the user about this driver when configuring a
kernel without ACPI support.

Fixes: 662f24826f95 ("platform/mellanox: Add support for new SN2201 system")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Vadim Pasternak <vadimp@nvidia.com>
Acked-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/ec5a4071691ab08d58771b7732a9988e89779268.1693828363.git.geert+renesas@glider.be
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/platform/mellanox/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/mellanox/Kconfig b/drivers/platform/mellanox/Kconfig
index 382793e73a60a..30b50920b278c 100644
--- a/drivers/platform/mellanox/Kconfig
+++ b/drivers/platform/mellanox/Kconfig
@@ -80,8 +80,8 @@ config MLXBF_PMC
 
 config NVSW_SN2201
 	tristate "Nvidia SN2201 platform driver support"
-	depends on HWMON
-	depends on I2C
+	depends on HWMON && I2C
+	depends on ACPI || COMPILE_TEST
 	select REGMAP_I2C
 	help
 	  This driver provides support for the Nvidia SN2201 platform.
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 208/219] net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (206 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 207/219] platform/mellanox: NVSW_SN2201 should depend on ACPI Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 209/219] net: macb: Enable PTP unicast Greg Kroah-Hartman
                   ` (21 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Liu Jian, Sabrina Dubroca,
	Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Liu Jian <liujian56@huawei.com>

[ Upstream commit cfaa80c91f6f99b9342b6557f0f0e1143e434066 ]

I got the below warning when do fuzzing test:
BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470
Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9

CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G           OE
Hardware name: linux,dummy-virt (DT)
Workqueue: pencrypt_parallel padata_parallel_worker
Call trace:
 dump_backtrace+0x0/0x420
 show_stack+0x34/0x44
 dump_stack+0x1d0/0x248
 __kasan_report+0x138/0x140
 kasan_report+0x44/0x6c
 __asan_load4+0x94/0xd0
 scatterwalk_copychunks+0x320/0x470
 skcipher_next_slow+0x14c/0x290
 skcipher_walk_next+0x2fc/0x480
 skcipher_walk_first+0x9c/0x110
 skcipher_walk_aead_common+0x380/0x440
 skcipher_walk_aead_encrypt+0x54/0x70
 ccm_encrypt+0x13c/0x4d0
 crypto_aead_encrypt+0x7c/0xfc
 pcrypt_aead_enc+0x28/0x84
 padata_parallel_worker+0xd0/0x2dc
 process_one_work+0x49c/0xbdc
 worker_thread+0x124/0x880
 kthread+0x210/0x260
 ret_from_fork+0x10/0x18

This is because the value of rec_seq of tls_crypto_info configured by the
user program is too large, for example, 0xffffffffffffff. In addition, TLS
is asynchronously accelerated. When tls_do_encryption() returns
-EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
skmsg is released before the asynchronous encryption process ends. As a
result, the UAF problem occurs during the asynchronous processing of the
encryption module.

If the operation is asynchronous and the encryption module returns
EINPROGRESS, do not free the record information.

Fixes: 635d93981786 ("net/tls: free record only on encryption error")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/20230909081434.2324940-1-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/tls/tls_sw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 96b4545ea700f..9be00ebbb2341 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -802,7 +802,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 	psock = sk_psock_get(sk);
 	if (!psock || !policy) {
 		err = tls_push_record(sk, flags, record_type);
-		if (err && sk->sk_err == EBADMSG) {
+		if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
 			*copied -= sk_msg_free(sk, msg);
 			tls_free_open_rec(sk);
 			err = -sk->sk_err;
@@ -831,7 +831,7 @@ static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
 	switch (psock->eval) {
 	case __SK_PASS:
 		err = tls_push_record(sk, flags, record_type);
-		if (err && sk->sk_err == EBADMSG) {
+		if (err && err != -EINPROGRESS && sk->sk_err == EBADMSG) {
 			*copied -= sk_msg_free(sk, msg);
 			tls_free_open_rec(sk);
 			err = -sk->sk_err;
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 209/219] net: macb: Enable PTP unicast
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (207 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 208/219] net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 210/219] net: macb: fix sleep inside spinlock Greg Kroah-Hartman
                   ` (20 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Harini Katakam, Michal Simek,
	Radhey Shyam Pandey, Jakub Kicinski, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Harini Katakam <harini.katakam@xilinx.com>

[ Upstream commit ee4e92c26c60b7344b7261035683a37da5a6119b ]

Enable transmission and reception of PTP unicast packets by
updating PTP unicast config bit and setting current HW mac
address as allowed address in PTP unicast filter registers.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 403f0e771457 ("net: macb: fix sleep inside spinlock")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/cadence/macb.h      |  4 ++++
 drivers/net/ethernet/cadence/macb_main.c | 13 +++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 9c410f93a1039..1aa578c1ca4ad 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -95,6 +95,8 @@
 #define GEM_SA4B		0x00A0 /* Specific4 Bottom */
 #define GEM_SA4T		0x00A4 /* Specific4 Top */
 #define GEM_WOL			0x00b8 /* Wake on LAN */
+#define GEM_RXPTPUNI		0x00D4 /* PTP RX Unicast address */
+#define GEM_TXPTPUNI		0x00D8 /* PTP TX Unicast address */
 #define GEM_EFTSH		0x00e8 /* PTP Event Frame Transmitted Seconds Register 47:32 */
 #define GEM_EFRSH		0x00ec /* PTP Event Frame Received Seconds Register 47:32 */
 #define GEM_PEFTSH		0x00f0 /* PTP Peer Event Frame Transmitted Seconds Register 47:32 */
@@ -245,6 +247,8 @@
 #define MACB_TZQ_OFFSET		12 /* Transmit zero quantum pause frame */
 #define MACB_TZQ_SIZE		1
 #define MACB_SRTSM_OFFSET	15 /* Store Receive Timestamp to Memory */
+#define MACB_PTPUNI_OFFSET	20 /* PTP Unicast packet enable */
+#define MACB_PTPUNI_SIZE	1
 #define MACB_OSSMODE_OFFSET	24 /* Enable One Step Synchro Mode */
 #define MACB_OSSMODE_SIZE	1
 #define MACB_MIIONRGMII_OFFSET	28 /* MII Usage on RGMII Interface */
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 5fb991835078a..9470e895591e5 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -288,6 +288,11 @@ static void macb_set_hwaddr(struct macb *bp)
 	top = cpu_to_le16(*((u16 *)(bp->dev->dev_addr + 4)));
 	macb_or_gem_writel(bp, SA1T, top);
 
+	if (gem_has_ptp(bp)) {
+		gem_writel(bp, RXPTPUNI, bottom);
+		gem_writel(bp, TXPTPUNI, bottom);
+	}
+
 	/* Clear unused address register sets */
 	macb_or_gem_writel(bp, SA2B, 0);
 	macb_or_gem_writel(bp, SA2T, 0);
@@ -721,8 +726,12 @@ static void macb_mac_link_up(struct phylink_config *config,
 
 	spin_unlock_irqrestore(&bp->lock, flags);
 
-	/* Enable Rx and Tx */
-	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(RE) | MACB_BIT(TE));
+	/* Enable Rx and Tx; Enable PTP unicast */
+	ctrl = macb_readl(bp, NCR);
+	if (gem_has_ptp(bp))
+		ctrl |= MACB_BIT(PTPUNI);
+
+	macb_writel(bp, NCR, ctrl | MACB_BIT(RE) | MACB_BIT(TE));
 
 	netif_tx_wake_all_queues(ndev);
 }
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 210/219] net: macb: fix sleep inside spinlock
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (208 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 209/219] net: macb: Enable PTP unicast Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 211/219] ipv6: fix ip6_sock_set_addr_preferences() typo Greg Kroah-Hartman
                   ` (19 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable; +Cc: Greg Kroah-Hartman, patches, Sascha Hauer, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sascha Hauer <s.hauer@pengutronix.de>

[ Upstream commit 403f0e771457e2b8811dc280719d11b9bacf10f4 ]

macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
which can sleep. This results in:

| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
| pps pps1: new PPS source ptp1
| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| 4 locks held by kworker/u4:3/40:
|  #0: ffff000003409148
| macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
|  ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
|  #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
|  #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
|  #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
| irq event stamp: 113998
| hardirqs last  enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
| hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
| softirqs last  enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
| softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
| CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
| Hardware name: ... ZynqMP ... (DT)
| Workqueue: events_power_efficient phylink_resolve
| Call trace:
|  dump_backtrace+0x98/0xf0
|  show_stack+0x18/0x24
|  dump_stack_lvl+0x60/0xac
|  dump_stack+0x18/0x24
|  __might_resched+0x144/0x24c
|  __might_sleep+0x48/0x98
|  __mutex_lock+0x58/0x7b0
|  mutex_lock_nested+0x24/0x30
|  clk_prepare_lock+0x4c/0xa8
|  clk_set_rate+0x24/0x8c
|  macb_mac_link_up+0x25c/0x2ac
|  phylink_resolve+0x178/0x4e8
|  process_one_work+0x1ec/0x51c
|  worker_thread+0x1ec/0x3e4
|  kthread+0x120/0x124
|  ret_from_fork+0x10/0x20

The obvious fix is to move the call to macb_set_tx_clk() out of the
protected area. This seems safe as rx and tx are both disabled anyway at
this point.
It is however not entirely clear what the spinlock shall protect. It
could be the read-modify-write access to the NCFGR register, but this
is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
without holding the spinlock. It could also be the register accesses
done in mog_init_rings() or macb_init_buffers(), but again these
functions are called without holding the spinlock in macb_hresp_error_task().
The locking seems fishy in this driver and it might deserve another look
before this patch is applied.

Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20230908112913.1701766-1-s.hauer@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/cadence/macb_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 9470e895591e5..54b032a46b48a 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -705,8 +705,6 @@ static void macb_mac_link_up(struct phylink_config *config,
 		if (rx_pause)
 			ctrl |= MACB_BIT(PAE);
 
-		macb_set_tx_clk(bp, speed);
-
 		/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
 		 * cleared the pipeline and control registers.
 		 */
@@ -726,6 +724,9 @@ static void macb_mac_link_up(struct phylink_config *config,
 
 	spin_unlock_irqrestore(&bp->lock, flags);
 
+	if (!(bp->caps & MACB_CAPS_MACB_IS_EMAC))
+		macb_set_tx_clk(bp, speed);
+
 	/* Enable Rx and Tx; Enable PTP unicast */
 	ctrl = macb_readl(bp, NCR);
 	if (gem_has_ptp(bp))
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 211/219] ipv6: fix ip6_sock_set_addr_preferences() typo
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (209 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 210/219] net: macb: fix sleep inside spinlock Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 212/219] ipv6: Remove in6addr_any alternatives Greg Kroah-Hartman
                   ` (18 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Christoph Hellwig,
	Chuck Lever, Simon Horman, Paolo Abeni, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 8cdd9f1aaedf823006449faa4e540026c692ac43 ]

ip6_sock_set_addr_preferences() second argument should be an integer.

SUNRPC attempts to set IPV6_PREFER_SRC_PUBLIC were
translated to IPV6_PREFER_SRC_TMP

Fixes: 18d5ad623275 ("ipv6: add ip6_sock_set_addr_preferences")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230911154213.713941-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/ipv6.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index e4ceef687c1c2..8ced112167599 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1322,7 +1322,7 @@ static inline int __ip6_sock_set_addr_preferences(struct sock *sk, int val)
 	return 0;
 }
 
-static inline int ip6_sock_set_addr_preferences(struct sock *sk, bool val)
+static inline int ip6_sock_set_addr_preferences(struct sock *sk, int val)
 {
 	int ret;
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 212/219] ipv6: Remove in6addr_any alternatives.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (210 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 211/219] ipv6: fix ip6_sock_set_addr_preferences() typo Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 213/219] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Greg Kroah-Hartman
                   ` (17 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Mark Bloch,
	David Ahern, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 8cdc3223e78c43e1b60ea1c536a103e32fdca3c5 ]

Some code defines the IPv6 wildcard address as a local variable and
use it with memcmp() or ipv6_addr_equal().

Let's use in6addr_any and ipv6_addr_any() instead.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: aa99e5f87bd5 ("tcp: Fix bind() regression for v4-mapped-v6 wildcard address.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c |  5 ++---
 include/net/ip6_fib.h                                 |  9 +++------
 include/trace/events/fib.h                            |  5 ++---
 include/trace/events/fib6.h                           |  5 +----
 net/ethtool/ioctl.c                                   | 10 +++++-----
 net/ipv4/inet_hashtables.c                            | 11 ++++-------
 6 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index 72b61f66df37a..cd15d36b1507e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -97,7 +97,6 @@ int mlx5e_tc_set_attr_rx_tun(struct mlx5e_tc_flow *flow,
 #if IS_ENABLED(CONFIG_INET) && IS_ENABLED(CONFIG_IPV6)
 	else if (ip_version == 6) {
 		int ipv6_size = MLX5_FLD_SZ_BYTES(ipv6_layout, ipv6);
-		struct in6_addr zerov6 = {};
 
 		daddr = MLX5_ADDR_OF(fte_match_param, spec->match_value,
 				     outer_headers.dst_ipv4_dst_ipv6.ipv6_layout.ipv6);
@@ -105,8 +104,8 @@ int mlx5e_tc_set_attr_rx_tun(struct mlx5e_tc_flow *flow,
 				     outer_headers.src_ipv4_src_ipv6.ipv6_layout.ipv6);
 		memcpy(&tun_attr->dst_ip.v6, daddr, ipv6_size);
 		memcpy(&tun_attr->src_ip.v6, saddr, ipv6_size);
-		if (!memcmp(&tun_attr->dst_ip.v6, &zerov6, sizeof(zerov6)) ||
-		    !memcmp(&tun_attr->src_ip.v6, &zerov6, sizeof(zerov6)))
+		if (ipv6_addr_any(&tun_attr->dst_ip.v6) ||
+		    ipv6_addr_any(&tun_attr->src_ip.v6))
 			return 0;
 	}
 #endif
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index a92f6eb853068..fa4e6af382e2a 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -472,13 +472,10 @@ void rt6_get_prefsrc(const struct rt6_info *rt, struct in6_addr *addr)
 	rcu_read_lock();
 
 	from = rcu_dereference(rt->from);
-	if (from) {
+	if (from)
 		*addr = from->fib6_prefsrc.addr;
-	} else {
-		struct in6_addr in6_zero = {};
-
-		*addr = in6_zero;
-	}
+	else
+		*addr = in6addr_any;
 
 	rcu_read_unlock();
 }
diff --git a/include/trace/events/fib.h b/include/trace/events/fib.h
index c2300c407f583..76297ecd4935c 100644
--- a/include/trace/events/fib.h
+++ b/include/trace/events/fib.h
@@ -36,7 +36,6 @@ TRACE_EVENT(fib_table_lookup,
 	),
 
 	TP_fast_assign(
-		struct in6_addr in6_zero = {};
 		struct net_device *dev;
 		struct in6_addr *in6;
 		__be32 *p32;
@@ -74,7 +73,7 @@ TRACE_EVENT(fib_table_lookup,
 				*p32 = nhc->nhc_gw.ipv4;
 
 				in6 = (struct in6_addr *)__entry->gw6;
-				*in6 = in6_zero;
+				*in6 = in6addr_any;
 			} else if (nhc->nhc_gw_family == AF_INET6) {
 				p32 = (__be32 *) __entry->gw4;
 				*p32 = 0;
@@ -87,7 +86,7 @@ TRACE_EVENT(fib_table_lookup,
 			*p32 = 0;
 
 			in6 = (struct in6_addr *)__entry->gw6;
-			*in6 = in6_zero;
+			*in6 = in6addr_any;
 		}
 	),
 
diff --git a/include/trace/events/fib6.h b/include/trace/events/fib6.h
index 6e821eb794503..4d3e607b3cdec 100644
--- a/include/trace/events/fib6.h
+++ b/include/trace/events/fib6.h
@@ -68,11 +68,8 @@ TRACE_EVENT(fib6_table_lookup,
 			strcpy(__entry->name, "-");
 		}
 		if (res->f6i == net->ipv6.fib6_null_entry) {
-			struct in6_addr in6_zero = {};
-
 			in6 = (struct in6_addr *)__entry->gw;
-			*in6 = in6_zero;
-
+			*in6 = in6addr_any;
 		} else if (res->nh) {
 			in6 = (struct in6_addr *)__entry->gw;
 			*in6 = res->nh->fib_nh_gw6;
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 940c0e27be735..e31d1247b9f08 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -27,6 +27,7 @@
 #include <linux/net.h>
 #include <linux/pm_runtime.h>
 #include <net/devlink.h>
+#include <net/ipv6.h>
 #include <net/xdp_sock_drv.h>
 #include <net/flow_offload.h>
 #include <linux/ethtool_netlink.h>
@@ -3090,7 +3091,6 @@ struct ethtool_rx_flow_rule *
 ethtool_rx_flow_rule_create(const struct ethtool_rx_flow_spec_input *input)
 {
 	const struct ethtool_rx_flow_spec *fs = input->fs;
-	static struct in6_addr zero_addr = {};
 	struct ethtool_rx_flow_match *match;
 	struct ethtool_rx_flow_rule *flow;
 	struct flow_action_entry *act;
@@ -3196,20 +3196,20 @@ ethtool_rx_flow_rule_create(const struct ethtool_rx_flow_spec_input *input)
 
 		v6_spec = &fs->h_u.tcp_ip6_spec;
 		v6_m_spec = &fs->m_u.tcp_ip6_spec;
-		if (memcmp(v6_m_spec->ip6src, &zero_addr, sizeof(zero_addr))) {
+		if (!ipv6_addr_any((struct in6_addr *)v6_m_spec->ip6src)) {
 			memcpy(&match->key.ipv6.src, v6_spec->ip6src,
 			       sizeof(match->key.ipv6.src));
 			memcpy(&match->mask.ipv6.src, v6_m_spec->ip6src,
 			       sizeof(match->mask.ipv6.src));
 		}
-		if (memcmp(v6_m_spec->ip6dst, &zero_addr, sizeof(zero_addr))) {
+		if (!ipv6_addr_any((struct in6_addr *)v6_m_spec->ip6dst)) {
 			memcpy(&match->key.ipv6.dst, v6_spec->ip6dst,
 			       sizeof(match->key.ipv6.dst));
 			memcpy(&match->mask.ipv6.dst, v6_m_spec->ip6dst,
 			       sizeof(match->mask.ipv6.dst));
 		}
-		if (memcmp(v6_m_spec->ip6src, &zero_addr, sizeof(zero_addr)) ||
-		    memcmp(v6_m_spec->ip6dst, &zero_addr, sizeof(zero_addr))) {
+		if (!ipv6_addr_any((struct in6_addr *)v6_m_spec->ip6src) ||
+		    !ipv6_addr_any((struct in6_addr *)v6_m_spec->ip6dst)) {
 			match->dissector.used_keys |=
 				BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS);
 			match->dissector.offset[FLOW_DISSECTOR_KEY_IPV6_ADDRS] =
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index c19b462662ad0..38a92c8c66863 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -813,13 +813,11 @@ bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const
 				      unsigned short port, int l3mdev, const struct sock *sk)
 {
 #if IS_ENABLED(CONFIG_IPV6)
-	struct in6_addr addr_any = {};
-
 	if (sk->sk_family != tb->family) {
 		if (sk->sk_family == AF_INET)
 			return net_eq(ib2_net(tb), net) && tb->port == port &&
 				tb->l3mdev == l3mdev &&
-				ipv6_addr_equal(&tb->v6_rcv_saddr, &addr_any);
+				ipv6_addr_any(&tb->v6_rcv_saddr);
 
 		return false;
 	}
@@ -827,7 +825,7 @@ bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const
 	if (sk->sk_family == AF_INET6)
 		return net_eq(ib2_net(tb), net) && tb->port == port &&
 			tb->l3mdev == l3mdev &&
-			ipv6_addr_equal(&tb->v6_rcv_saddr, &addr_any);
+			ipv6_addr_any(&tb->v6_rcv_saddr);
 	else
 #endif
 		return net_eq(ib2_net(tb), net) && tb->port == port &&
@@ -853,11 +851,10 @@ inet_bhash2_addr_any_hashbucket(const struct sock *sk, const struct net *net, in
 {
 	struct inet_hashinfo *hinfo = tcp_or_dccp_get_hashinfo(sk);
 	u32 hash;
-#if IS_ENABLED(CONFIG_IPV6)
-	struct in6_addr addr_any = {};
 
+#if IS_ENABLED(CONFIG_IPV6)
 	if (sk->sk_family == AF_INET6)
-		hash = ipv6_portaddr_hash(net, &addr_any, port);
+		hash = ipv6_portaddr_hash(net, &in6addr_any, port);
 	else
 #endif
 		hash = ipv4_portaddr_hash(net, 0, port);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 213/219] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (211 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 212/219] ipv6: Remove in6addr_any alternatives Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 214/219] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Greg Kroah-Hartman
                   ` (16 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Eric Dumazet,
	David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit c6d277064b1da7f9015b575a562734de87a7e463 ]

This is a prep patch to make the following patches cleaner that touch
inet_bind2_bucket_match() and inet_bind2_bucket_match_addr_any().

Both functions have duplicated comparison for netns, port, and l3mdev.
Let's factorise them.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: aa99e5f87bd5 ("tcp: Fix bind() regression for v4-mapped-v6 wildcard address.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/inet_hashtables.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 38a92c8c66863..71d77b3b6536f 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -795,41 +795,39 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
 				    const struct net *net, unsigned short port,
 				    int l3mdev, const struct sock *sk)
 {
+	if (!net_eq(ib2_net(tb), net) || tb->port != port ||
+	    tb->l3mdev != l3mdev)
+		return false;
+
 #if IS_ENABLED(CONFIG_IPV6)
 	if (sk->sk_family != tb->family)
 		return false;
 
 	if (sk->sk_family == AF_INET6)
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev &&
-			ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
-	else
+		return ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
 #endif
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev && tb->rcv_saddr == sk->sk_rcv_saddr;
+	return tb->rcv_saddr == sk->sk_rcv_saddr;
 }
 
 bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const struct net *net,
 				      unsigned short port, int l3mdev, const struct sock *sk)
 {
+	if (!net_eq(ib2_net(tb), net) || tb->port != port ||
+	    tb->l3mdev != l3mdev)
+		return false;
+
 #if IS_ENABLED(CONFIG_IPV6)
 	if (sk->sk_family != tb->family) {
 		if (sk->sk_family == AF_INET)
-			return net_eq(ib2_net(tb), net) && tb->port == port &&
-				tb->l3mdev == l3mdev &&
-				ipv6_addr_any(&tb->v6_rcv_saddr);
+			return ipv6_addr_any(&tb->v6_rcv_saddr);
 
 		return false;
 	}
 
 	if (sk->sk_family == AF_INET6)
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev &&
-			ipv6_addr_any(&tb->v6_rcv_saddr);
-	else
+		return ipv6_addr_any(&tb->v6_rcv_saddr);
 #endif
-		return net_eq(ib2_net(tb), net) && tb->port == port &&
-			tb->l3mdev == l3mdev && tb->rcv_saddr == 0;
+	return tb->rcv_saddr == 0;
 }
 
 /* The socket's bhash2 hashbucket spinlock must be held when this is called */
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 214/219] tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (212 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 213/219] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 215/219] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Greg Kroah-Hartman
                   ` (15 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Andrei Vagin, Kuniyuki Iwashima,
	Eric Dumazet, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit aa99e5f87bd54db55dd37cb130bd5eb55933027f ]

Andrei Vagin reported bind() regression with strace logs.

If we bind() a TCPv6 socket to ::FFFF:0.0.0.0 and then bind() a TCPv4
socket to 127.0.0.1, the 2nd bind() should fail but now succeeds.

  from socket import *

  s1 = socket(AF_INET6, SOCK_STREAM)
  s1.bind(('::ffff:0.0.0.0', 0))

  s2 = socket(AF_INET, SOCK_STREAM)
  s2.bind(('127.0.0.1', s1.getsockname()[1]))

During the 2nd bind(), if tb->family is AF_INET6 and sk->sk_family is
AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check
if tb has the v4-mapped-v6 wildcard address.

The example above does not work after commit 5456262d2baa ("net: Fix
incorrect address comparison when searching for a bind2 bucket"), but
the blamed change is not the commit.

Before the commit, the leading zeros of ::FFFF:0.0.0.0 were treated
as 0.0.0.0, and the sequence above worked by chance.  Technically, this
case has been broken since bhash2 was introduced.

Note that if we bind() two sockets to 127.0.0.1 and then ::FFFF:0.0.0.0,
the 2nd bind() fails properly because we fall back to using bhash to
detect conflicts for the v4-mapped-v6 address.

Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Reported-by: Andrei Vagin <avagin@google.com>
Closes: https://lore.kernel.org/netdev/ZPuYBOFC8zsK6r9T@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/net/ipv6.h         | 5 +++++
 net/ipv4/inet_hashtables.c | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 8ced112167599..517bdae78614b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -750,6 +750,11 @@ static inline bool ipv6_addr_v4mapped(const struct in6_addr *a)
 					cpu_to_be32(0x0000ffff))) == 0UL;
 }
 
+static inline bool ipv6_addr_v4mapped_any(const struct in6_addr *a)
+{
+	return ipv6_addr_v4mapped(a) && ipv4_is_zeronet(a->s6_addr32[3]);
+}
+
 static inline bool ipv6_addr_v4mapped_loopback(const struct in6_addr *a)
 {
 	return ipv6_addr_v4mapped(a) && ipv4_is_loopback(a->s6_addr32[3]);
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 71d77b3b6536f..eb522c0374d59 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -819,7 +819,8 @@ bool inet_bind2_bucket_match_addr_any(const struct inet_bind2_bucket *tb, const
 #if IS_ENABLED(CONFIG_IPV6)
 	if (sk->sk_family != tb->family) {
 		if (sk->sk_family == AF_INET)
-			return ipv6_addr_any(&tb->v6_rcv_saddr);
+			return ipv6_addr_any(&tb->v6_rcv_saddr) ||
+				ipv6_addr_v4mapped_any(&tb->v6_rcv_saddr);
 
 		return false;
 	}
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 215/219] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (213 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 214/219] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 216/219] ixgbe: fix timestamp configuration code Greg Kroah-Hartman
                   ` (14 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Eric Dumazet,
	Andrei Vagin, David S. Miller, Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit c48ef9c4aed3632566b57ba66cec6ec78624d4cb ]

Since bhash2 was introduced, the example below does not work as expected.
These two bind() should conflict, but the 2nd bind() now succeeds.

  from socket import *

  s1 = socket(AF_INET6, SOCK_STREAM)
  s1.bind(('::ffff:127.0.0.1', 0))

  s2 = socket(AF_INET, SOCK_STREAM)
  s2.bind(('127.0.0.1', s1.getsockname()[1]))

During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
a new tb2 for the 2nd socket.  Then, we call inet_csk_bind_conflict() that
checks conflicts in the new tb2 by inet_bhash2_conflict().  However, the
new tb2 does not include the 1st socket, thus the bind() finally succeeds.

In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
returns the 1st socket's tb2.

Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
the 2nd bind() fails properly for the same reason mentinoed in the previous
commit.

Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/inet_hashtables.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index eb522c0374d59..d79de4b95186b 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -800,8 +800,13 @@ static bool inet_bind2_bucket_match(const struct inet_bind2_bucket *tb,
 		return false;
 
 #if IS_ENABLED(CONFIG_IPV6)
-	if (sk->sk_family != tb->family)
+	if (sk->sk_family != tb->family) {
+		if (sk->sk_family == AF_INET)
+			return ipv6_addr_v4mapped(&tb->v6_rcv_saddr) &&
+				tb->v6_rcv_saddr.s6_addr32[3] == sk->sk_rcv_saddr;
+
 		return false;
+	}
 
 	if (sk->sk_family == AF_INET6)
 		return ipv6_addr_equal(&tb->v6_rcv_saddr, &sk->sk_v6_rcv_saddr);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 216/219] ixgbe: fix timestamp configuration code
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (214 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 215/219] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 217/219] kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg() Greg Kroah-Hartman
                   ` (13 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Vadim Fedorenko, Simon Horman,
	Tony Nguyen, David S. Miller, Sasha Levin, Pucha Himasekhar Reddy

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Vadim Fedorenko <vadim.fedorenko@linux.dev>

[ Upstream commit 3c44191dd76cf9c0cc49adaf34384cbd42ef8ad2 ]

The commit in fixes introduced flags to control the status of hardware
configuration while processing packets. At the same time another structure
is used to provide configuration of timestamper to user-space applications.
The way it was coded makes this structures go out of sync easily. The
repro is easy for 82599 chips:

[root@hostname ~]# hwstamp_ctl -i eth0 -r 12 -t 1
current settings:
tx_type 0
rx_filter 0
new settings:
tx_type 1
rx_filter 12

The eth0 device is properly configured to timestamp any PTPv2 events.

[root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1
current settings:
tx_type 1
rx_filter 12
SIOCSHWTSTAMP failed: Numerical result out of range
The requested time stamping mode is not supported by the hardware.

The error is properly returned because HW doesn't support all packets
timestamping. But the adapter->flags is cleared of timestamp flags
even though no HW configuration was done. From that point no RX timestamps
are received by user-space application. But configuration shows good
values:

[root@hostname ~]# hwstamp_ctl -i eth0
current settings:
tx_type 1
rx_filter 12

Fix the issue by applying new flags only when the HW was actually
configured.

Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c | 28 +++++++++++---------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
index f8605f57bd067..75e1383263c1e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c
@@ -995,6 +995,7 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 	u32 tsync_tx_ctl = IXGBE_TSYNCTXCTL_ENABLED;
 	u32 tsync_rx_ctl = IXGBE_TSYNCRXCTL_ENABLED;
 	u32 tsync_rx_mtrl = PTP_EV_PORT << 16;
+	u32 aflags = adapter->flags;
 	bool is_l2 = false;
 	u32 regval;
 
@@ -1012,20 +1013,20 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 	case HWTSTAMP_FILTER_NONE:
 		tsync_rx_ctl = 0;
 		tsync_rx_mtrl = 0;
-		adapter->flags &= ~(IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
-				    IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+		aflags &= ~(IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
+			    IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
 		break;
 	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
 		tsync_rx_ctl |= IXGBE_TSYNCRXCTL_TYPE_L4_V1;
 		tsync_rx_mtrl |= IXGBE_RXMTRL_V1_SYNC_MSG;
-		adapter->flags |= (IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
-				   IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+		aflags |= (IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
+			   IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
 		break;
 	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
 		tsync_rx_ctl |= IXGBE_TSYNCRXCTL_TYPE_L4_V1;
 		tsync_rx_mtrl |= IXGBE_RXMTRL_V1_DELAY_REQ_MSG;
-		adapter->flags |= (IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
-				   IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+		aflags |= (IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
+			   IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
 		break;
 	case HWTSTAMP_FILTER_PTP_V2_EVENT:
 	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
@@ -1039,8 +1040,8 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 		tsync_rx_ctl |= IXGBE_TSYNCRXCTL_TYPE_EVENT_V2;
 		is_l2 = true;
 		config->rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
-		adapter->flags |= (IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
-				   IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
+		aflags |= (IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
+			   IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
 		break;
 	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
 	case HWTSTAMP_FILTER_NTP_ALL:
@@ -1051,7 +1052,7 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 		if (hw->mac.type >= ixgbe_mac_X550) {
 			tsync_rx_ctl |= IXGBE_TSYNCRXCTL_TYPE_ALL;
 			config->rx_filter = HWTSTAMP_FILTER_ALL;
-			adapter->flags |= IXGBE_FLAG_RX_HWTSTAMP_ENABLED;
+			aflags |= IXGBE_FLAG_RX_HWTSTAMP_ENABLED;
 			break;
 		}
 		fallthrough;
@@ -1062,8 +1063,6 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 		 * Delay_Req messages and hardware does not support
 		 * timestamping all packets => return error
 		 */
-		adapter->flags &= ~(IXGBE_FLAG_RX_HWTSTAMP_ENABLED |
-				    IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER);
 		config->rx_filter = HWTSTAMP_FILTER_NONE;
 		return -ERANGE;
 	}
@@ -1095,8 +1094,8 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 			       IXGBE_TSYNCRXCTL_TYPE_ALL |
 			       IXGBE_TSYNCRXCTL_TSIP_UT_EN;
 		config->rx_filter = HWTSTAMP_FILTER_ALL;
-		adapter->flags |= IXGBE_FLAG_RX_HWTSTAMP_ENABLED;
-		adapter->flags &= ~IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER;
+		aflags |= IXGBE_FLAG_RX_HWTSTAMP_ENABLED;
+		aflags &= ~IXGBE_FLAG_RX_HWTSTAMP_IN_REGISTER;
 		is_l2 = true;
 		break;
 	default:
@@ -1129,6 +1128,9 @@ static int ixgbe_ptp_set_timestamp_mode(struct ixgbe_adapter *adapter,
 
 	IXGBE_WRITE_FLUSH(hw);
 
+	/* configure adapter flags only when HW is actually configured */
+	adapter->flags = aflags;
+
 	/* clear TX/RX time stamp registers, just to be sure */
 	ixgbe_ptp_clear_tx_timestamp(adapter);
 	IXGBE_READ_REG(hw, IXGBE_RXSTMPH);
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 217/219] kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (215 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 216/219] ixgbe: fix timestamp configuration code Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 218/219] MIPS: Only fiddle with CHECKFLAGS if `need-compiler Greg Kroah-Hartman
                   ` (12 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
	Sasha Levin

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit a22730b1b4bf437c6bbfdeff5feddf54be4aeada ]

syzkaller found a memory leak in kcm_sendmsg(), and commit c821a88bd720
("kcm: Fix memory leak in error path of kcm_sendmsg()") suppressed it by
updating kcm_tx_msg(head)->last_skb if partial data is copied so that the
following sendmsg() will resume from the skb.

However, we cannot know how many bytes were copied when we get the error.
Thus, we could mess up the MSG_MORE queue.

When kcm_sendmsg() fails for SOCK_DGRAM, we should purge the queue as we
do so for UDP by udp_flush_pending_frames().

Even without this change, when the error occurred, the following sendmsg()
resumed from a wrong skb and the queue was messed up.  However, we have
yet to get such a report, and only syzkaller stumbled on it.  So, this
can be changed safely.

Note this does not change SOCK_SEQPACKET behaviour.

Fixes: c821a88bd720 ("kcm: Fix memory leak in error path of kcm_sendmsg()")
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230912022753.33327-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/kcm/kcmsock.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index dd929a8350740..65845c59c0655 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1065,17 +1065,18 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 out_error:
 	kcm_push(kcm);
 
-	if (copied && sock->type == SOCK_SEQPACKET) {
+	if (sock->type == SOCK_SEQPACKET) {
 		/* Wrote some bytes before encountering an
 		 * error, return partial success.
 		 */
-		goto partial_message;
-	}
-
-	if (head != kcm->seq_skb)
+		if (copied)
+			goto partial_message;
+		if (head != kcm->seq_skb)
+			kfree_skb(head);
+	} else {
 		kfree_skb(head);
-	else if (copied)
-		kcm_tx_msg(head)->last_skb = skb;
+		kcm->seq_skb = NULL;
+	}
 
 	err = sk_stream_error(sk, msg->msg_flags, err);
 
-- 
2.40.1




^ permalink raw reply related	[flat|nested] 258+ messages in thread

* [PATCH 6.1 218/219] MIPS: Only fiddle with CHECKFLAGS if `need-compiler
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (216 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 217/219] kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg() Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 19:15 ` [PATCH 6.1 219/219] drm/amd/display: Fix a bug when searching for insert_above_mpcc Greg Kroah-Hartman
                   ` (11 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Guillaume Tucker, Maciej W. Rozycki,
	kernelci.org bot, Thomas Bogendoerfer

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Maciej W. Rozycki <macro@orcam.me.uk>

commit 4fe4a6374c4db9ae2b849b61e84b58685dca565a upstream.

We have originally guarded fiddling with CHECKFLAGS in our arch Makefile
by checking for the CONFIG_MIPS variable, not set for targets such as
`distclean', etc. that neither include `.config' nor use the compiler.

Starting from commit 805b2e1d427a ("kbuild: include Makefile.compiler
only when compiler is needed") we have had a generic `need-compiler'
variable explicitly telling us if the compiler will be used and thus its
capabilities need to be checked and expressed in the form of compilation
flags.  If this variable is not set, then `make' functions such as
`cc-option' are undefined, causing all kinds of weirdness to happen if
we expect specific results to be returned, most recently:

cc1: error: '-mloongson-mmi' must be used with '-mhard-float'

messages with configurations such as `fuloong2e_defconfig' and the
`modules_install' target, which does include `.config' and yet does not
use the compiler.

Replace the check for CONFIG_MIPS with one for `need-compiler' instead,
so as to prevent the compiler from being ever called for CHECKFLAGS when
not needed.

Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Closes: https://lore.kernel.org/r/85031c0c-d981-031e-8a50-bc4fad2ddcd8@collabora.com/
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed")
Cc: stable@vger.kernel.org # v5.13+
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/mips/Makefile |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -350,7 +350,7 @@ KBUILD_CFLAGS += -fno-asynchronous-unwin
 
 KBUILD_LDFLAGS		+= -m $(ld-emul)
 
-ifdef CONFIG_MIPS
+ifdef need-compiler
 CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
 	egrep -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
 	sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')



^ permalink raw reply	[flat|nested] 258+ messages in thread

* [PATCH 6.1 219/219] drm/amd/display: Fix a bug when searching for insert_above_mpcc
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (217 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 218/219] MIPS: Only fiddle with CHECKFLAGS if `need-compiler Greg Kroah-Hartman
@ 2023-09-17 19:15 ` Greg Kroah-Hartman
  2023-09-17 20:47 ` [PATCH 6.1 000/219] 6.1.54-rc1 review SeongJae Park
                   ` (10 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-17 19:15 UTC (permalink / raw)
  To: stable
  Cc: Greg Kroah-Hartman, patches, Mario Limonciello, Alex Deucher,
	Jun Lei, Tom Chung, Wesley Chalmers, Daniel Wheeler

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wesley Chalmers <wesley.chalmers@amd.com>

commit 3d028d5d60d516c536de1ddd3ebf3d55f3f8983b upstream.

[WHY]
Currently, when insert_plane is called with insert_above_mpcc
parameter that is equal to tree->opp_list, the function returns NULL.

[HOW]
Instead, the function should insert the plane at the top of the tree.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wesley Chalmers <wesley.chalmers@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_mpc.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_mpc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_mpc.c
@@ -212,8 +212,9 @@ struct mpcc *mpc1_insert_plane(
 		/* check insert_above_mpcc exist in tree->opp_list */
 		struct mpcc *temp_mpcc = tree->opp_list;
 
-		while (temp_mpcc && temp_mpcc->mpcc_bot != insert_above_mpcc)
-			temp_mpcc = temp_mpcc->mpcc_bot;
+		if (temp_mpcc != insert_above_mpcc)
+			while (temp_mpcc && temp_mpcc->mpcc_bot != insert_above_mpcc)
+				temp_mpcc = temp_mpcc->mpcc_bot;
 		if (temp_mpcc == NULL)
 			return NULL;
 	}



^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (218 preceding siblings ...)
  2023-09-17 19:15 ` [PATCH 6.1 219/219] drm/amd/display: Fix a bug when searching for insert_above_mpcc Greg Kroah-Hartman
@ 2023-09-17 20:47 ` SeongJae Park
  2023-09-18  5:34 ` Takeshi Ogasawara
                   ` (9 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: SeongJae Park @ 2023-09-17 20:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, damon, SeongJae Park

Hello,

On Sun, 17 Sep 2023 21:12:07 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.

This rc kernel passes DAMON functionality test[1] on my test machine.
Attaching the test results summary below.  Please note that I retrieved the
kernel from linux-stable-rc tree[2].

Tested-by: SeongJae Park <sj@kernel.org>

[1] https://github.com/awslabs/damon-tests/tree/next/corr
[2] 89fc7c511aa5 ("Linux 6.1.54-rc1")

Thanks,
SJ

[...]

---

# .config:1408:warning: override: reassigning to symbol CGROUPS
ok 15 selftests: damon-tests: build_nomemcg.sh
# kselftest dir '/home/sjpark/damon-tests-cont/linux/tools/testing/selftests/damon-tests' is in dirty state.
# the log is at '/home/sjpark/log'.
 [32m
ok 1 selftests: damon: debugfs_attrs.sh
ok 2 selftests: damon: debugfs_schemes.sh
ok 3 selftests: damon: debugfs_target_ids.sh
ok 4 selftests: damon: debugfs_empty_targets.sh
ok 5 selftests: damon: debugfs_huge_count_read_write.sh
ok 6 selftests: damon: debugfs_duplicate_context_creation.sh
ok 7 selftests: damon: sysfs.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_m68k.sh
ok 12 selftests: damon-tests: build_arm64.sh
ok 13 selftests: damon-tests: build_i386_idle_flag.sh
ok 14 selftests: damon-tests: build_i386_highpte.sh
ok 15 selftests: damon-tests: build_nomemcg.sh
 [33m
 [92mPASS [39m
_remote_run_corr.sh SUCCESS

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (219 preceding siblings ...)
  2023-09-17 20:47 ` [PATCH 6.1 000/219] 6.1.54-rc1 review SeongJae Park
@ 2023-09-18  5:34 ` Takeshi Ogasawara
  2023-09-18  6:42 ` Bagas Sanjaya
                   ` (8 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Takeshi Ogasawara @ 2023-09-18  5:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

Hi Greg

On Mon, Sep 18, 2023 at 5:03 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

6.1.54-rc1 tested.

Build successfully completed.
Boot successfully completed.
No dmesg regressions.
Video output normal.
Sound output normal.

Lenovo ThinkPad X1 Carbon Gen10(Intel i7-1260P(x86_64) arch linux)

Thanks

Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (220 preceding siblings ...)
  2023-09-18  5:34 ` Takeshi Ogasawara
@ 2023-09-18  6:42 ` Bagas Sanjaya
  2023-09-18 11:24 ` Conor Dooley
                   ` (7 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Bagas Sanjaya @ 2023-09-18  6:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

Successfully compiled and installed bindeb-pkgs on my computer (Acer
Aspire E15, Intel Core i3 Haswell). No noticeable regressions.

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (221 preceding siblings ...)
  2023-09-18  6:42 ` Bagas Sanjaya
@ 2023-09-18 11:24 ` Conor Dooley
  2023-09-18 12:08 ` Ron Economos
                   ` (6 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Conor Dooley @ 2023-09-18 11:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

[-- Attachment #1: Type: text/plain, Size: 371 bytes --]

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.

Tested-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (222 preceding siblings ...)
  2023-09-18 11:24 ` Conor Dooley
@ 2023-09-18 12:08 ` Ron Economos
  2023-09-18 12:48 ` Jon Hunter
                   ` (5 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Ron Economos @ 2023-09-18 12:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor

On 9/17/23 12:12 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Built and booted successfully on RISC-V RV64 (HiFive Unmatched).

Tested-by: Ron Economos <re@w6rz.net>


^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (223 preceding siblings ...)
  2023-09-18 12:08 ` Ron Economos
@ 2023-09-18 12:48 ` Jon Hunter
  2023-09-18 18:34 ` Florian Fainelli
                   ` (4 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Jon Hunter @ 2023-09-18 12:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-tegra, stable

On Sun, 17 Sep 2023 21:12:07 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

All tests passing for Tegra ...

Test results for stable-v6.1:
    11 builds:	11 pass, 0 fail
    28 boots:	28 pass, 0 fail
    130 tests:	130 pass, 0 fail

Linux version:	6.1.54-rc1-g89fc7c511aa5
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
                tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000,
                tegra20-ventana, tegra210-p2371-2180,
                tegra210-p3450-0000, tegra30-cardhu-a04

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Jon

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (224 preceding siblings ...)
  2023-09-18 12:48 ` Jon Hunter
@ 2023-09-18 18:34 ` Florian Fainelli
  2023-09-18 18:41 ` Guenter Roeck
                   ` (3 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Florian Fainelli @ 2023-09-18 18:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, sudipm.mukherjee, srw, rwarsow,
	conor



On 9/17/2023 12:12 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on 
BMIPS_GENERIC:

Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
-- 
Florian


^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (225 preceding siblings ...)
  2023-09-18 18:34 ` Florian Fainelli
@ 2023-09-18 18:41 ` Guenter Roeck
  2023-09-18 20:56 ` Naresh Kamboju
                   ` (2 subsequent siblings)
  229 siblings, 0 replies; 258+ messages in thread
From: Guenter Roeck @ 2023-09-18 18:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 

Build results:
	total: 157 pass: 157 fail: 0
Qemu test results:
	total: 529 pass: 529 fail: 0

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (226 preceding siblings ...)
  2023-09-18 18:41 ` Guenter Roeck
@ 2023-09-18 20:56 ` Naresh Kamboju
  2023-09-18 22:21 ` Shuah Khan
  2023-09-21 13:04 ` Conor Dooley
  229 siblings, 0 replies; 258+ messages in thread
From: Naresh Kamboju @ 2023-09-18 20:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

On Mon, 18 Sept 2023 at 01:30, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h


Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>

## Build
* kernel: 6.1.54-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-6.1.y
* git commit: 89fc7c511aa5cd0b21e82ec42611db04d9e3b7c2
* git describe: v6.1.52-813-g89fc7c511aa5
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.52-813-g89fc7c511aa5

## Test Regressions (compared to v6.1.52)

## Metric Regressions (compared to v6.1.52)

## Test Fixes (compared to v6.1.52)

## Metric Fixes (compared to v6.1.52)

## Test result summary
total: 206086, pass: 176646, fail: 2859, skip: 26303, xfail: 278

## Build Summary
* arc: 10 total, 10 passed, 0 failed
* arm: 284 total, 283 passed, 1 failed
* arm64: 89 total, 87 passed, 2 failed
* i386: 67 total, 65 passed, 2 failed
* mips: 56 total, 54 passed, 2 failed
* parisc: 7 total, 7 passed, 0 failed
* powerpc: 70 total, 68 passed, 2 failed
* riscv: 28 total, 26 passed, 2 failed
* s390: 28 total, 27 passed, 1 failed
* sh: 26 total, 24 passed, 2 failed
* sparc: 14 total, 14 passed, 0 failed
* x86_64: 76 total, 72 passed, 4 failed

## Test suites summary
* boot
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-drivers-dma-buf
* kselftest-exec
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-lib
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-user_events
* kselftest-vDSO
* kselftest-vm
* kselftest-watchdog
* kselftest-x86
* kunit
* kvm-unit-tests
* libgpiod
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* perf
* rcutorture

--
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (227 preceding siblings ...)
  2023-09-18 20:56 ` Naresh Kamboju
@ 2023-09-18 22:21 ` Shuah Khan
  2023-09-21 13:04 ` Conor Dooley
  229 siblings, 0 replies; 258+ messages in thread
From: Shuah Khan @ 2023-09-18 22:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor, Shuah Khan

On 9/17/23 13:12, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 258+ messages in thread

* [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-17 19:12 ` [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Greg Kroah-Hartman
@ 2023-09-20  8:11   ` Jeremi Piotrowski
  2023-09-20  8:43     ` Michal Hocko
  0 siblings, 1 reply; 258+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20  8:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, Michal Hocko, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> 6.1-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------

Hi Greg/Michal,

This commit breaks userspace which makes it a bad commit for mainline and an
even worse commit for stable.

We ingested 6.1.54 into our nightly testing and found that runc fails to gather
cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
fine.

> Address this by wiping out the file completely and effectively get back to
> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

On reads, the runc code checks for MEMCG_KMEM=n by checking
kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
to be there (including kmem.limit_in_bytes). So this change is not effectively
the same.

Here's a link to the PR that would be needed to handle this change in userspace
(not merged yet and would need to be propagated through the ecosystem):

https://github.com/opencontainers/runc/pull/4018.

Jeremi

> 
> From: Michal Hocko <mhocko@suse.com>
> 
> commit 86327e8eb94c52eca4f93cfece2e29d1bf52acbf upstream.
> 
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16.  We haven't heard about any serious
> users since then but it seems that the mere presence of the file is
> causing more harm thatn good.  We (SUSE) have had several bug reports from
> customers where Docker based containers started to fail because a write to
> kmem.limit_in_bytes has failed.
> 
> This was unexpected because runc code only expects ENOENT (kmem disabled)
> or EBUSY (tasks already running within cgroup).  So a new error code was
> unexpected and the whole container startup failed.  This has been later
> addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore.  There
> are still older version of Docker in use and likely hard to get rid of
> completely.
> 
> Address this by wiping out the file completely and effectively get back to
> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
> 
> [mhocko@suse.com: restore _KMEM switch case]
>   Link: https://lkml.kernel.org/r/ZKe5wxdbvPi5Cwd7@dhcp22.suse.cz
> Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |    2 --
>  mm/memcontrol.c                                |   10 ----------
>  2 files changed, 12 deletions(-)
> 
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -91,8 +91,6 @@ Brief summary of control files.
>   memory.oom_control		     set/show oom controls.
>   memory.numa_stat		     show the number of memory usage per numa
>  				     node
> - memory.kmem.limit_in_bytes          This knob is deprecated and writing to
> -                                     it will return -ENOTSUPP.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>  				     hits limits
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3841,10 +3841,6 @@ static ssize_t mem_cgroup_write(struct k
>  		case _MEMSWAP:
>  			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
>  			break;
> -		case _KMEM:
> -			/* kmem.limit_in_bytes is deprecated. */
> -			ret = -EOPNOTSUPP;
> -			break;
>  		case _TCP:
>  			ret = memcg_update_tcp_max(memcg, nr_pages);
>  			break;
> @@ -5056,12 +5052,6 @@ static struct cftype mem_cgroup_legacy_f
>  	},
>  #endif
>  	{
> -		.name = "kmem.limit_in_bytes",
> -		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> -		.write = mem_cgroup_write,
> -		.read_u64 = mem_cgroup_read_u64,
> -	},
> -	{
>  		.name = "kmem.usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
>  		.read_u64 = mem_cgroup_read_u64,
> 
> 

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:11   ` [REGRESSION] " Jeremi Piotrowski
@ 2023-09-20  8:43     ` Michal Hocko
  2023-09-20  9:25       ` Greg Kroah-Hartman
  2023-09-20 10:04       ` Jeremi Piotrowski
  0 siblings, 2 replies; 258+ messages in thread
From: Michal Hocko @ 2023-09-20  8:43 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> > 6.1-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> 
> Hi Greg/Michal,
> 
> This commit breaks userspace which makes it a bad commit for mainline and an
> even worse commit for stable.
> 
> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> fine.

Could you expand some more on why is the file read? It doesn't support
writing to it for some time so how does reading it helps in any sense?

Anyway, I do agree that the stable backport should be reverted.

> > Address this by wiping out the file completely and effectively get back to
> > pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> On reads, the runc code checks for MEMCG_KMEM=n by checking
> kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
> to be there (including kmem.limit_in_bytes). So this change is not effectively
> the same.
> 
> Here's a link to the PR that would be needed to handle this change in userspace
> (not merged yet and would need to be propagated through the ecosystem):
> 
> https://github.com/opencontainers/runc/pull/4018.

Thanks. Does that mean the revert is still necessary for the Linus tree
or do you expect that the fix can be merged and propagated in a
reasonable time?

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:43     ` Michal Hocko
@ 2023-09-20  9:25       ` Greg Kroah-Hartman
  2023-09-20 10:21         ` Jeremi Piotrowski
  2023-09-20 10:04       ` Jeremi Piotrowski
  1 sibling, 1 reply; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-20  9:25 UTC (permalink / raw)
  To: Michal Hocko, Jeremi Piotrowski
  Cc: stable, patches, Shakeel Butt, Johannes Weiner, Roman Gushchin,
	Muchun Song, Tejun Heo, Andrew Morton, linux-kernel, regressions,
	mathieu.tortuyaux

On Wed, Sep 20, 2023 at 10:43:56AM +0200, Michal Hocko wrote:
> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> > On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> > > 6.1-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > ------------------
> > 
> > Hi Greg/Michal,
> > 
> > This commit breaks userspace which makes it a bad commit for mainline and an
> > even worse commit for stable.
> > 
> > We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> > cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> > into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> > fine.
> 
> Could you expand some more on why is the file read? It doesn't support
> writing to it for some time so how does reading it helps in any sense?
> 
> Anyway, I do agree that the stable backport should be reverted.

That will just postpone the breakage, we really shouldn't break
userspace.

That being said, having userspace "break" because a file is no longer
present is not good coding style on the userspace side at all.  That's
why we have sysfs and single-value-files now, if the file isn't present,
then userspace instantly notices and can handle it.  Much easier than
the old-style multi-fields-in-one-file problem.

> > > Address this by wiping out the file completely and effectively get back to
> > > pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

The fact that this is a valid option (i.e. no file) with that config
option disabled makes me want to keep this as well, as how does
userspace handle this option disabled at all?  Or old kernels?

I can drop this from stable kernels, but again, this feels like the runc
developers are just postponing the problem...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:43     ` Michal Hocko
  2023-09-20  9:25       ` Greg Kroah-Hartman
@ 2023-09-20 10:04       ` Jeremi Piotrowski
  2023-09-20 11:07         ` Michal Hocko
  1 sibling, 1 reply; 258+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20 10:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On 9/20/2023 10:43 AM, Michal Hocko wrote:
> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>>
>>> ------------------
>>
>> Hi Greg/Michal,
>>
>> This commit breaks userspace which makes it a bad commit for mainline and an
>> even worse commit for stable.
>>
>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
>> fine.
> 
> Could you expand some more on why is the file read? It doesn't support
> writing to it for some time so how does reading it helps in any sense?
> 
> Anyway, I do agree that the stable backport should be reverted.
> 

This file is read together with all the other memcg files. Each prefix:

memory
memory.memsw
memory.kmem
memory.kmem.tcp

is combined with these suffixes

.usage_in_bytes
.max_usage_in_bytes
.failcnt
.limit_in_bytes

and read, the values are then forwarded on to other components for scheduling decisions.
You want to know the limit when checking the usage (is the usage close to the limit or not).

Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
set missing is an anomaly. So maybe we could keep the dummy file just for the
sake of consistency? Cgroupv1 is legacy after all.

>>> Address this by wiping out the file completely and effectively get back to
>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
>>
>> On reads, the runc code checks for MEMCG_KMEM=n by checking
>> kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
>> to be there (including kmem.limit_in_bytes). So this change is not effectively
>> the same.
>>
>> Here's a link to the PR that would be needed to handle this change in userspace
>> (not merged yet and would need to be propagated through the ecosystem):
>>
>> https://github.com/opencontainers/runc/pull/4018.
> 
> Thanks. Does that mean the revert is still necessary for the Linus tree
> or do you expect that the fix can be merged and propagated in a
> reasonable time?
> 

We can probably get runc and currently supported kubernetes versions patched in time
before 6.6 (or the next LTS kernel) hits LTS distros.

But there's still a bunch of users running cgroupv1 with unsupported kubernetes
versions that are still taking kernel updates as they come, so this might get reported
again next year if it stays in mainline.

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  9:25       ` Greg Kroah-Hartman
@ 2023-09-20 10:21         ` Jeremi Piotrowski
  2023-09-20 10:45           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 258+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20 10:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Michal Hocko
  Cc: stable, patches, Shakeel Butt, Johannes Weiner, Roman Gushchin,
	Muchun Song, Tejun Heo, Andrew Morton, linux-kernel, regressions,
	mathieu.tortuyaux

On 9/20/2023 11:25 AM, Greg Kroah-Hartman wrote:
> On Wed, Sep 20, 2023 at 10:43:56AM +0200, Michal Hocko wrote:
>> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
>>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>>>
>>>> ------------------
>>>
>>> Hi Greg/Michal,
>>>
>>> This commit breaks userspace which makes it a bad commit for mainline and an
>>> even worse commit for stable.
>>>
>>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
>>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
>>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
>>> fine.
>>
>> Could you expand some more on why is the file read? It doesn't support
>> writing to it for some time so how does reading it helps in any sense?
>>
>> Anyway, I do agree that the stable backport should be reverted.
> 
> That will just postpone the breakage, we really shouldn't break
> userspace.
> 
> That being said, having userspace "break" because a file is no longer
> present is not good coding style on the userspace side at all.  That's
> why we have sysfs and single-value-files now, if the file isn't present,
> then userspace instantly notices and can handle it.  Much easier than
> the old-style multi-fields-in-one-file problem.
> 

The memcg files in this case are single-value, but userspace expects to be able
to read memcg limits when it can read the usage (indicating MEMCG is enabled).
If it can't - then something is off, and the node is marked unhealthy.

>>>> Address this by wiping out the file completely and effectively get back to
>>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> The fact that this is a valid option (i.e. no file) with that config
> option disabled makes me want to keep this as well, as how does
> userspace handle this option disabled at all?  Or old kernels?
> 

Userspace has had to handle the case of MEMCG_KMEM=n, but that had 2 cases so far:

limits/usage/max_usage/failcnt files are all available or none of them are available.

Now it needs to handle 3 of 4 files being available, but only for kmem (and not plain
memory, memsw or kmem.tcp). That's an inconsistency.

> I can drop this from stable kernels, but again, this feels like the runc
> developers are just postponing the problem...
>

Since cgroups v1 is deprecated, I think the runc developers haven't touched this part
of the code in years and expected it to keep working while they wait for the long tail
of usage to die out.

> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 10:21         ` Jeremi Piotrowski
@ 2023-09-20 10:45           ` Greg Kroah-Hartman
  2023-09-20 11:08             ` Michal Hocko
  0 siblings, 1 reply; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-20 10:45 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Michal Hocko, stable, patches, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 12:21:37PM +0200, Jeremi Piotrowski wrote:
> On 9/20/2023 11:25 AM, Greg Kroah-Hartman wrote:
> > On Wed, Sep 20, 2023 at 10:43:56AM +0200, Michal Hocko wrote:
> >> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> >>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> >>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
> >>>>
> >>>> ------------------
> >>>
> >>> Hi Greg/Michal,
> >>>
> >>> This commit breaks userspace which makes it a bad commit for mainline and an
> >>> even worse commit for stable.
> >>>
> >>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> >>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> >>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> >>> fine.
> >>
> >> Could you expand some more on why is the file read? It doesn't support
> >> writing to it for some time so how does reading it helps in any sense?
> >>
> >> Anyway, I do agree that the stable backport should be reverted.
> > 
> > That will just postpone the breakage, we really shouldn't break
> > userspace.
> > 
> > That being said, having userspace "break" because a file is no longer
> > present is not good coding style on the userspace side at all.  That's
> > why we have sysfs and single-value-files now, if the file isn't present,
> > then userspace instantly notices and can handle it.  Much easier than
> > the old-style multi-fields-in-one-file problem.
> > 
> 
> The memcg files in this case are single-value, but userspace expects to be able
> to read memcg limits when it can read the usage (indicating MEMCG is enabled).
> If it can't - then something is off, and the node is marked unhealthy.
> 
> >>>> Address this by wiping out the file completely and effectively get back to
> >>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> > 
> > The fact that this is a valid option (i.e. no file) with that config
> > option disabled makes me want to keep this as well, as how does
> > userspace handle this option disabled at all?  Or old kernels?
> > 
> 
> Userspace has had to handle the case of MEMCG_KMEM=n, but that had 2 cases so far:
> 
> limits/usage/max_usage/failcnt files are all available or none of them are available.
> 
> Now it needs to handle 3 of 4 files being available, but only for kmem (and not plain
> memory, memsw or kmem.tcp). That's an inconsistency.
> 
> > I can drop this from stable kernels, but again, this feels like the runc
> > developers are just postponing the problem...
> >
> 
> Since cgroups v1 is deprecated, I think the runc developers haven't touched this part
> of the code in years and expected it to keep working while they wait for the long tail
> of usage to die out.

Ok, then we should revert this, I'll go drop it in the stable trees, it
should also be reverted in Linus's tree too.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 10:04       ` Jeremi Piotrowski
@ 2023-09-20 11:07         ` Michal Hocko
  2023-09-20 13:25           ` Jeremi Piotrowski
  0 siblings, 1 reply; 258+ messages in thread
From: Michal Hocko @ 2023-09-20 11:07 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 12:04:48, Jeremi Piotrowski wrote:
> On 9/20/2023 10:43 AM, Michal Hocko wrote:
> > On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> >> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> >>> 6.1-stable review patch.  If anyone has any objections, please let me know.
> >>>
> >>> ------------------
> >>
> >> Hi Greg/Michal,
> >>
> >> This commit breaks userspace which makes it a bad commit for mainline and an
> >> even worse commit for stable.
> >>
> >> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> >> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> >> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> >> fine.
> > 
> > Could you expand some more on why is the file read? It doesn't support
> > writing to it for some time so how does reading it helps in any sense?
> > 
> > Anyway, I do agree that the stable backport should be reverted.
> > 
> 
> This file is read together with all the other memcg files. Each prefix:
> 
> memory
> memory.memsw
> memory.kmem
> memory.kmem.tcp
> 
> is combined with these suffixes
> 
> .usage_in_bytes
> .max_usage_in_bytes
> .failcnt
> .limit_in_bytes
> 
> and read, the values are then forwarded on to other components for scheduling decisions.
> You want to know the limit when checking the usage (is the usage close to the limit or not).

You know there is no kmem limit as there is no way to set it for some
time (since 5.16 - i.e. 2 years ago). I can see that users following old
kernels could have missed that though.

> Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
> set missing is an anomaly. So maybe we could keep the dummy file just for the
> sake of consistency? Cgroupv1 is legacy after all.

What we had was a dummy file. It didn't allow to write any value so it
would have always reported unlimited. The reason I've decided to remove
the file was that there were other users not being able to handle the
write failure while they are just fine not having the file. So we are
effectively between a rock and hard place here. Either way something is
broken. The other SW got fixed as well but similar to your case it takes
some time to absorb the change through all 3rd party users.

> >>> Address this by wiping out the file completely and effectively get back to
> >>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> >>
> >> On reads, the runc code checks for MEMCG_KMEM=n by checking
> >> kmem.usage_in_bytes.

Just one side note. Config options get renamed and their semantic
changes over time so I would just recomment to never make any
dependencies on any specific one. 

> >> If it is present then runc expects the other cgroup files
> >> to be there (including kmem.limit_in_bytes). So this change is not effectively
> >> the same.
> >>
> >> Here's a link to the PR that would be needed to handle this change in userspace
> >> (not merged yet and would need to be propagated through the ecosystem):
> >>
> >> https://github.com/opencontainers/runc/pull/4018.
> > 
> > Thanks. Does that mean the revert is still necessary for the Linus tree
> > or do you expect that the fix can be merged and propagated in a
> > reasonable time?
> > 
> 
> We can probably get runc and currently supported kubernetes versions patched in time
> before 6.6 (or the next LTS kernel) hits LTS distros.
> 
> But there's still a bunch of users running cgroupv1 with unsupported kubernetes
> versions that are still taking kernel updates as they come, so this might get reported
> again next year if it stays in mainline.

I can see how 3rd party users are hard to get aligned but having a fix
available should allow them to apply it or is there any actual roadblock
for them to adapt as soon as they hit the issue?

I mean, normally I would be just fine reverting this API change because
it is disruptive but the only way to have the file available and not
break somebody is to revert 58056f77502f ("memcg, kmem: further
deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
there but that sounds rather dubious. Although one could argue this
would mimic nokmem kernel option.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 10:45           ` Greg Kroah-Hartman
@ 2023-09-20 11:08             ` Michal Hocko
  2023-09-20 11:16               ` Greg Kroah-Hartman
  0 siblings, 1 reply; 258+ messages in thread
From: Michal Hocko @ 2023-09-20 11:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jeremi Piotrowski, stable, patches, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 12:45:08, Greg KH wrote:
[...]
> Ok, then we should revert this, I'll go drop it in the stable trees, it
> should also be reverted in Linus's tree too.

A simple revert would break other users as noted in other response so
wait with sending reverts to Linus before we agreen on the least painful
solution.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 11:08             ` Michal Hocko
@ 2023-09-20 11:16               ` Greg Kroah-Hartman
  0 siblings, 0 replies; 258+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-20 11:16 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, stable, patches, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 01:08:43PM +0200, Michal Hocko wrote:
> On Wed 20-09-23 12:45:08, Greg KH wrote:
> [...]
> > Ok, then we should revert this, I'll go drop it in the stable trees, it
> > should also be reverted in Linus's tree too.
> 
> A simple revert would break other users as noted in other response so
> wait with sending reverts to Linus before we agreen on the least painful
> solution.

A revert should cause the systems that stopped working to start working
again, so I'll keep the revert in the stable trees and wait for you to
work out the real solution in Linus's tree and then backport all of them
as needed.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 11:07         ` Michal Hocko
@ 2023-09-20 13:25           ` Jeremi Piotrowski
  2023-09-20 13:47             ` Michal Hocko
  0 siblings, 1 reply; 258+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20 13:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On 9/20/2023 1:07 PM, Michal Hocko wrote:
> On Wed 20-09-23 12:04:48, Jeremi Piotrowski wrote:
>> On 9/20/2023 10:43 AM, Michal Hocko wrote:
>>> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
>>>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>>>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>>>>
>>>>> ------------------
>>>>
>>>> Hi Greg/Michal,
>>>>
>>>> This commit breaks userspace which makes it a bad commit for mainline and an
>>>> even worse commit for stable.
>>>>
>>>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
>>>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
>>>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
>>>> fine.
>>>
>>> Could you expand some more on why is the file read? It doesn't support
>>> writing to it for some time so how does reading it helps in any sense?
>>>
>>> Anyway, I do agree that the stable backport should be reverted.
>>>
>>
>> This file is read together with all the other memcg files. Each prefix:
>>
>> memory
>> memory.memsw
>> memory.kmem
>> memory.kmem.tcp
>>
>> is combined with these suffixes
>>
>> .usage_in_bytes
>> .max_usage_in_bytes
>> .failcnt
>> .limit_in_bytes
>>
>> and read, the values are then forwarded on to other components for scheduling decisions.
>> You want to know the limit when checking the usage (is the usage close to the limit or not).
> 
> You know there is no kmem limit as there is no way to set it for some
> time (since 5.16 - i.e. 2 years ago). I can see that users following old
> kernels could have missed that though.

I know what you mean, but I think this generally went unnoticed because the limit file is read
unconditionally, but only written when a kmem limit is explicitly requested for a specific
container, which is rarely (if ever) done.

Regarding following old kernels: a majority of kubernetes users are still on 5.15 and only slowly
started shifting to >=6.1 very recently (this summer). This is mostly driven by distro vendor
policies which tend to follow the pattern of "follow LTS kernels but don't switch to the next
LTS immediately".

I know this is far from ideal for reporting these kinds of issues, would love to report
them as soon as a kernel release happens.

> 
>> Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
>> set missing is an anomaly. So maybe we could keep the dummy file just for the
>> sake of consistency? Cgroupv1 is legacy after all.
> 
> What we had was a dummy file. It didn't allow to write any value so it
> would have always reported unlimited. The reason I've decided to remove
> the file was that there were other users not being able to handle the
> write failure while they are just fine not having the file. So we are
> effectively between a rock and hard place here. Either way something is
> broken. The other SW got fixed as well but similar to your case it takes
> some time to absorb the change through all 3rd party users.
> 
>>>>> Address this by wiping out the file completely and effectively get back to
>>>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
>>>>
>>>> On reads, the runc code checks for MEMCG_KMEM=n by checking
>>>> kmem.usage_in_bytes.
> 
> Just one side note. Config options get renamed and their semantic
> changes over time so I would just recomment to never make any
> dependencies on any specific one. 
> 

Right, what i meant is the logic is this, with checking the "usage"
file to determine whether the controller is available:

    value, err := fscommon.GetCgroupParamUint(path, usage)
    if err != nil {
        if name != "" && os.IsNotExist(err) {
            // Ignore ENOENT as swap and kmem controllers
            // are optional in the kernel.
            return cgroups.MemoryData{}, nil
        }
        return cgroups.MemoryData{}, err
    }

and if it is, then it proceeds to read "limit_in_bytes" and the others.

>>>> If it is present then runc expects the other cgroup files
>>>> to be there (including kmem.limit_in_bytes). So this change is not effectively
>>>> the same.
>>>>
>>>> Here's a link to the PR that would be needed to handle this change in userspace
>>>> (not merged yet and would need to be propagated through the ecosystem):
>>>>
>>>> https://github.com/opencontainers/runc/pull/4018.
>>>
>>> Thanks. Does that mean the revert is still necessary for the Linus tree
>>> or do you expect that the fix can be merged and propagated in a
>>> reasonable time?
>>>
>>
>> We can probably get runc and currently supported kubernetes versions patched in time
>> before 6.6 (or the next LTS kernel) hits LTS distros.
>>
>> But there's still a bunch of users running cgroupv1 with unsupported kubernetes
>> versions that are still taking kernel updates as they come, so this might get reported
>> again next year if it stays in mainline.
> 
> I can see how 3rd party users are hard to get aligned but having a fix
> available should allow them to apply it or is there any actual roadblock
> for them to adapt as soon as they hit the issue?
> 

The issue with this is that these users are running a frozen set of kubernetes (+runc)
binaries, but still pull kernel updates from the base distro. These kubernetes versions
are out of maintenance so the code will not get fixed and no one will release fixed
binaries.

> I mean, normally I would be just fine reverting this API change because
> it is disruptive but the only way to have the file available and not
> break somebody is to revert 58056f77502f ("memcg, kmem: further
> deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> there but that sounds rather dubious. Although one could argue this
> would mimic nokmem kernel option.
> 

I just want to make sure we don't introduce yet another new behavior in this legacy
system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
(=don't event store the written limit). The latter might have unintended consequences.


^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 13:25           ` Jeremi Piotrowski
@ 2023-09-20 13:47             ` Michal Hocko
  2023-09-20 15:32               ` Shakeel Butt
  2023-09-22 23:00               ` Roman Gushchin
  0 siblings, 2 replies; 258+ messages in thread
From: Michal Hocko @ 2023-09-20 13:47 UTC (permalink / raw)
  To: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Roman Gushchin,
	Muchun Song
  Cc: Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> On 9/20/2023 1:07 PM, Michal Hocko wrote:
[...]
> > I mean, normally I would be just fine reverting this API change because
> > it is disruptive but the only way to have the file available and not
> > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > there but that sounds rather dubious. Although one could argue this
> > would mimic nokmem kernel option.
> > 
> 
> I just want to make sure we don't introduce yet another new behavior in this legacy
> system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> (=don't event store the written limit). The latter might have unintended consequences.

Yes it would mean that the limit is never enforced. Bad as it is the
thing is that the hard limit on kernel memory is broken by design and
unfixable.  This causes all sorts of unexpected kernel allocation
failures that this is simply unsafe to use.

All that being said I can see the following options
1) keep the current upstream status and not export the file
2) revert both 58056f77502f and 86327e8eb94 and make it clear
   that kmem.limit_in_bytes is unsupported so failures or misbehavior
   as a result of the limit being hit are likely not going to be
   investigated or fixed.
3) reverting like in 2) but never inforce the limit (so basically nokmem
   semantic)

Shakeel, Johannes, Roman, Muchun Song what do you think?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 13:47             ` Michal Hocko
@ 2023-09-20 15:32               ` Shakeel Butt
  2023-09-20 16:55                 ` Michal Hocko
  2023-09-22 23:00               ` Roman Gushchin
  1 sibling, 1 reply; 258+ messages in thread
From: Shakeel Butt @ 2023-09-20 15:32 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 6:47 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> [...]
> > > I mean, normally I would be just fine reverting this API change because
> > > it is disruptive but the only way to have the file available and not
> > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > there but that sounds rather dubious. Although one could argue this
> > > would mimic nokmem kernel option.
> > >
> >
> > I just want to make sure we don't introduce yet another new behavior in this legacy
> > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > (=don't event store the written limit). The latter might have unintended consequences.
>
> Yes it would mean that the limit is never enforced. Bad as it is the
> thing is that the hard limit on kernel memory is broken by design and
> unfixable.  This causes all sorts of unexpected kernel allocation
> failures that this is simply unsafe to use.
>
> All that being said I can see the following options
> 1) keep the current upstream status and not export the file
> 2) revert both 58056f77502f and 86327e8eb94 and make it clear
>    that kmem.limit_in_bytes is unsupported so failures or misbehavior
>    as a result of the limit being hit are likely not going to be
>    investigated or fixed.
> 3) reverting like in 2) but never inforce the limit (so basically nokmem
>    semantic)
>
> Shakeel, Johannes, Roman, Muchun Song what do you think?

I think the safe option would be to revert 86327e8eb94 for now and put
pr_warn_once even for the read of kmem.limit_in_bytes? We can retry
86327e8eb94 in a year or so.

However personally I would prefer option 1. Also I don't think
reverting  58056f77502f would give any benefit.

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 15:32               ` Shakeel Butt
@ 2023-09-20 16:55                 ` Michal Hocko
  2023-09-20 19:46                   ` Shakeel Butt
  0 siblings, 1 reply; 258+ messages in thread
From: Michal Hocko @ 2023-09-20 16:55 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 08:32:42, Shakeel Butt wrote:
> Also I don't think reverting 58056f77502f would give any benefit.

Not reverting 58056f77502f would re-introduce the regression in some
non-patched versions of Docker runtimes which cannot handle ENOTSUPP.
So I think we need to revert both or none of them. I would prefer the
later (option 1) as the fix is trivial but I do understand headache
of chasing all those outdated deployments or vendor code forks.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 16:55                 ` Michal Hocko
@ 2023-09-20 19:46                   ` Shakeel Butt
  2023-09-20 20:08                     ` Michal Hocko
  0 siblings, 1 reply; 258+ messages in thread
From: Shakeel Butt @ 2023-09-20 19:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 9:55 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 20-09-23 08:32:42, Shakeel Butt wrote:
> > Also I don't think reverting 58056f77502f would give any benefit.
>
> Not reverting 58056f77502f would re-introduce the regression in some
> non-patched versions of Docker runtimes which cannot handle ENOTSUPP.
> So I think we need to revert both or none of them. I would prefer the
> later (option 1) as the fix is trivial but I do understand headache
> of chasing all those outdated deployments or vendor code forks.

I think that would be too much conservative an approach but I don't
have a strong opinion against it. Also just to be clear we are not
talking about full revert of 58056f77502f but just the returning of
EOPNOTSUPP, right?

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 19:46                   ` Shakeel Butt
@ 2023-09-20 20:08                     ` Michal Hocko
  2023-09-20 21:46                       ` Shakeel Butt
  0 siblings, 1 reply; 258+ messages in thread
From: Michal Hocko @ 2023-09-20 20:08 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 12:46:23, Shakeel Butt wrote:
> On Wed, Sep 20, 2023 at 9:55 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 20-09-23 08:32:42, Shakeel Butt wrote:
> > > Also I don't think reverting 58056f77502f would give any benefit.
> >
> > Not reverting 58056f77502f would re-introduce the regression in some
> > non-patched versions of Docker runtimes which cannot handle ENOTSUPP.
> > So I think we need to revert both or none of them. I would prefer the
> > later (option 1) as the fix is trivial but I do understand headache
> > of chasing all those outdated deployments or vendor code forks.
> 
> I think that would be too much conservative an approach but I don't

Well, TBH I do not really see any sifference between one set of broken
userspace or the other. Both are making assumptions on our interfaces
and they do not overlap unfortunately.

> have a strong opinion against it. Also just to be clear we are not
> talking about full revert of 58056f77502f but just the returning of
> EOPNOTSUPP, right?

If we allow the limit to be set without returning a failure then we
still have options 2 and 3 on how to deal with that. One of them is to
enforce the limit.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 20:08                     ` Michal Hocko
@ 2023-09-20 21:46                       ` Shakeel Butt
  2023-09-21  7:52                         ` Michal Hocko
  0 siblings, 1 reply; 258+ messages in thread
From: Shakeel Butt @ 2023-09-20 21:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
>
[...]
> > have a strong opinion against it. Also just to be clear we are not
> > talking about full revert of 58056f77502f but just the returning of
> > EOPNOTSUPP, right?
>
> If we allow the limit to be set without returning a failure then we
> still have options 2 and 3 on how to deal with that. One of them is to
> enforce the limit.
>

Option 3 is a partial revert of 58056f77502f where we keep the no
limit enforcement and remove the EOPNOTSUPP return on write. Let's go
with option 3. In addition, let's add pr_warn_once on the read of
kmem.limit_in_bytes as well.

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 21:46                       ` Shakeel Butt
@ 2023-09-21  7:52                         ` Michal Hocko
  2023-09-21 10:43                           ` Jeremi Piotrowski
  0 siblings, 1 reply; 258+ messages in thread
From: Michal Hocko @ 2023-09-21  7:52 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> [...]
> > > have a strong opinion against it. Also just to be clear we are not
> > > talking about full revert of 58056f77502f but just the returning of
> > > EOPNOTSUPP, right?
> >
> > If we allow the limit to be set without returning a failure then we
> > still have options 2 and 3 on how to deal with that. One of them is to
> > enforce the limit.
> >
> 
> Option 3 is a partial revert of 58056f77502f where we keep the no
> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> with option 3. In addition, let's add pr_warn_once on the read of
> kmem.limit_in_bytes as well.

How about this?
--- 
From 81ae0797d8da1b9cfbf357b4be4787a5bbf46bb4 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation

This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required but the non-existent file just confuses the userspace
which fails in the result. The patch to fix this on the userspace side
has been submitted but it is hard to predict how it will propagate
through the maze of 3rd party consumers of the software.

Now, reverting alone 86327e8eb94c is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well or we can keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
This should work for both known breaking workloads which depend on the
existence but do not depend on the hard limit enforcement.

[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
 mm/memcontrol.c                                | 12 ++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
+ memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
+                                     memory hard limit. Kernel hard limit is not
+                                     supported since 5.16. Writing any value to
+                                     do file will not have any effect same as if
+                                     nokmem kernel parameter was specified.
+                                     Kernel memory is still charged and reported
+                                     by memory.kmem.usage_in_bytes.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..ac7f14b2338d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
+	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 		goto out;
 
 	memcg_account_kmem(memcg, nr_pages);
+
+	/* There is no way to set up kmem hard limit so this operation cannot fail */
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
 out:
 	css_put(&memcg->css);
 
@@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
+		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Writing any value to this file has no effect. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
+			ret = 0;
+			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
-- 
2.30.2

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21  7:52                         ` Michal Hocko
@ 2023-09-21 10:43                           ` Jeremi Piotrowski
  2023-09-21 11:21                             ` Michal Hocko
  0 siblings, 1 reply; 258+ messages in thread
From: Jeremi Piotrowski @ 2023-09-21 10:43 UTC (permalink / raw)
  To: Michal Hocko, Shakeel Butt
  Cc: Johannes Weiner, Roman Gushchin, Muchun Song, Greg Kroah-Hartman,
	stable, patches, Tejun Heo, Andrew Morton, linux-kernel,
	regressions, mathieu.tortuyaux

On 9/21/2023 9:52 AM, Michal Hocko wrote:
> On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
>> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
>>>
>> [...]
>>>> have a strong opinion against it. Also just to be clear we are not
>>>> talking about full revert of 58056f77502f but just the returning of
>>>> EOPNOTSUPP, right?
>>>
>>> If we allow the limit to be set without returning a failure then we
>>> still have options 2 and 3 on how to deal with that. One of them is to
>>> enforce the limit.
>>>
>>
>> Option 3 is a partial revert of 58056f77502f where we keep the no
>> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
>> with option 3. In addition, let's add pr_warn_once on the read of
>> kmem.limit_in_bytes as well.
> 
> How about this?
> --- 

I'm OK with this approach. You're missing this in the patch below:

// static struct cftype mem_cgroup_legacy_files[] = {

+       {
+               .name = "kmem.limit_in_bytes",
+               .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
+               .write = mem_cgroup_write,
+               .read_u64 = mem_cgroup_read_u64,
+       },


Thanks,
Jeremi

>>From 81ae0797d8da1b9cfbf357b4be4787a5bbf46bb4 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Thu, 21 Sep 2023 09:38:29 +0200
> Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation
> 
> This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> and partially reverts 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") which have incrementally removed support for the
> kernel memory accounting hard limit. Unfortunately it has turned out
> that there is still userspace depending on the existence of
> memory.kmem.limit_in_bytes [1]. The underlying functionality is not
> really required but the non-existent file just confuses the userspace
> which fails in the result. The patch to fix this on the userspace side
> has been submitted but it is hard to predict how it will propagate
> through the maze of 3rd party consumers of the software.
> 
> Now, reverting alone 86327e8eb94c is not an option because there is
> another set of userspace which cannot cope with ENOTSUPP returned when
> writing to the file. Therefore we have to go and revisit 58056f77502f
> as well. There are two ways to go ahead. Either we give up on the
> deprecation and fully revert 58056f77502f as well or we can keep
> kmem.limit_in_bytes but make the write a noop and warn about the fact.
> This should work for both known breaking workloads which depend on the
> existence but do not depend on the hard limit enforcement.
> 
> [1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
> Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
>  mm/memcontrol.c                                | 12 ++++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 5f502bf68fbc..ff456871bf4b 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,6 +92,13 @@ Brief summary of control files.
>   memory.oom_control		     set/show oom controls.
>   memory.numa_stat		     show the number of memory usage per numa
>  				     node
> + memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
> +                                     memory hard limit. Kernel hard limit is not
> +                                     supported since 5.16. Writing any value to
> +                                     do file will not have any effect same as if
> +                                     nokmem kernel parameter was specified.
> +                                     Kernel memory is still charged and reported
> +                                     by memory.kmem.usage_in_bytes.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>  				     hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a4d3282493b6..ac7f14b2338d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  				   unsigned int nr_pages)
>  {
> +	struct page_counter *counter;
>  	struct mem_cgroup *memcg;
>  	int ret;
>  
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  		goto out;
>  
>  	memcg_account_kmem(memcg, nr_pages);
> +
> +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
>  out:
>  	css_put(&memcg->css);
>  
> @@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
>  		case _MEMSWAP:
>  			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
>  			break;
> +		case _KMEM:
> +			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
> +				     "Writing any value to this file has no effect. "
> +				     "Please report your usecase to linux-mm@kvack.org if you "
> +				     "depend on this functionality.\n");
> +			ret = 0;
> +			break;
>  		case _TCP:
>  			ret = memcg_update_tcp_max(memcg, nr_pages);
>  			break;


^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 10:43                           ` Jeremi Piotrowski
@ 2023-09-21 11:21                             ` Michal Hocko
  2023-09-21 17:25                               ` Shakeel Butt
  2023-09-22 13:30                               ` Johannes Weiner
  0 siblings, 2 replies; 258+ messages in thread
From: Michal Hocko @ 2023-09-21 11:21 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Shakeel Butt, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu 21-09-23 12:43:05, Jeremi Piotrowski wrote:
> On 9/21/2023 9:52 AM, Michal Hocko wrote:
> > On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> >> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> >>>
> >> [...]
> >>>> have a strong opinion against it. Also just to be clear we are not
> >>>> talking about full revert of 58056f77502f but just the returning of
> >>>> EOPNOTSUPP, right?
> >>>
> >>> If we allow the limit to be set without returning a failure then we
> >>> still have options 2 and 3 on how to deal with that. One of them is to
> >>> enforce the limit.
> >>>
> >>
> >> Option 3 is a partial revert of 58056f77502f where we keep the no
> >> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> >> with option 3. In addition, let's add pr_warn_once on the read of
> >> kmem.limit_in_bytes as well.
> > 
> > How about this?
> > --- 
> 
> I'm OK with this approach. You're missing this in the patch below:
> 
> // static struct cftype mem_cgroup_legacy_files[] = {
> 
> +       {
> +               .name = "kmem.limit_in_bytes",
> +               .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> +               .write = mem_cgroup_write,
> +               .read_u64 = mem_cgroup_read_u64,
> +       },

Of course. I've lost the hunk while massaging the revert. Thanks for
spotting. Updated version below. Btw. I've decided to not pr_{warn,info}
on the read side because realistically I do not think this will help all
that much. I am worried we will get stuck with this for ever because
there always be somebody stuck on unpatched userspace.
--- 
From bb6702b698efd31f3f90f4f1dd36ffe223397bec Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation

This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required but the non-existent file just confuses the userspace
which fails in the result. The patch to fix this on the userspace side
has been submitted but it is hard to predict how it will propagate
through the maze of 3rd party consumers of the software.

Now, reverting alone 86327e8eb94c is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well or we can keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
This should work for both known breaking workloads which depend on the
existence but do not depend on the hard limit enforcement.

[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
 mm/memcontrol.c                                | 18 ++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
+ memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
+                                     memory hard limit. Kernel hard limit is not
+                                     supported since 5.16. Writing any value to
+                                     do file will not have any effect same as if
+                                     nokmem kernel parameter was specified.
+                                     Kernel memory is still charged and reported
+                                     by memory.kmem.usage_in_bytes.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..0b161705ef36 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
+	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 		goto out;
 
 	memcg_account_kmem(memcg, nr_pages);
+
+	/* There is no way to set up kmem hard limit so this operation cannot fail */
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
 out:
 	css_put(&memcg->css);
 
@@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
+		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Writing any value to this file has no effect. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
+			ret = 0;
+			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5077,6 +5089,12 @@ static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
+	{
+		.name = "kmem.limit_in_bytes",
+		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
+		.write = mem_cgroup_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
-- 
2.30.2

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (228 preceding siblings ...)
  2023-09-18 22:21 ` Shuah Khan
@ 2023-09-21 13:04 ` Conor Dooley
  229 siblings, 0 replies; 258+ messages in thread
From: Conor Dooley @ 2023-09-21 13:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow

[-- Attachment #1: Type: text/plain, Size: 371 bytes --]

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.

Tested-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 11:21                             ` Michal Hocko
@ 2023-09-21 17:25                               ` Shakeel Butt
  2023-09-21 19:50                                 ` Michal Hocko
  2023-09-22 13:30                               ` Johannes Weiner
  1 sibling, 1 reply; 258+ messages in thread
From: Shakeel Butt @ 2023-09-21 17:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu, Sep 21, 2023 at 4:21 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Thu 21-09-23 12:43:05, Jeremi Piotrowski wrote:
> > On 9/21/2023 9:52 AM, Michal Hocko wrote:
> > > On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> > >> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> > >>>
> > >> [...]
> > >>>> have a strong opinion against it. Also just to be clear we are not
> > >>>> talking about full revert of 58056f77502f but just the returning of
> > >>>> EOPNOTSUPP, right?
> > >>>
> > >>> If we allow the limit to be set without returning a failure then we
> > >>> still have options 2 and 3 on how to deal with that. One of them is to
> > >>> enforce the limit.
> > >>>
> > >>
> > >> Option 3 is a partial revert of 58056f77502f where we keep the no
> > >> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> > >> with option 3. In addition, let's add pr_warn_once on the read of
> > >> kmem.limit_in_bytes as well.
> > >
> > > How about this?
> > > ---
> >
> > I'm OK with this approach. You're missing this in the patch below:
> >
> > // static struct cftype mem_cgroup_legacy_files[] = {
> >
> > +       {
> > +               .name = "kmem.limit_in_bytes",
> > +               .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> > +               .write = mem_cgroup_write,
> > +               .read_u64 = mem_cgroup_read_u64,
> > +       },
>
> Of course. I've lost the hunk while massaging the revert. Thanks for
> spotting. Updated version below. Btw. I've decided to not pr_{warn,info}
> on the read side because realistically I do not think this will help all
> that much. I am worried we will get stuck with this for ever because
> there always be somebody stuck on unpatched userspace.
> ---
> From bb6702b698efd31f3f90f4f1dd36ffe223397bec Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Thu, 21 Sep 2023 09:38:29 +0200
> Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation
>
> This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> and partially reverts 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") which have incrementally removed support for the
> kernel memory accounting hard limit. Unfortunately it has turned out
> that there is still userspace depending on the existence of
> memory.kmem.limit_in_bytes [1]. The underlying functionality is not
> really required but the non-existent file just confuses the userspace
> which fails in the result. The patch to fix this on the userspace side
> has been submitted but it is hard to predict how it will propagate
> through the maze of 3rd party consumers of the software.
>
> Now, reverting alone 86327e8eb94c is not an option because there is
> another set of userspace which cannot cope with ENOTSUPP returned when
> writing to the file. Therefore we have to go and revisit 58056f77502f
> as well. There are two ways to go ahead. Either we give up on the
> deprecation and fully revert 58056f77502f as well or we can keep
> kmem.limit_in_bytes but make the write a noop and warn about the fact.
> This should work for both known breaking workloads which depend on the
> existence but do not depend on the hard limit enforcement.
>
> [1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
> Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> Signed-off-by: Michal Hocko <mhocko@suse.com>

With one request below:

Acked-by: Shakeel Butt <shakeelb@google.com>

> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
>  mm/memcontrol.c                                | 18 ++++++++++++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 5f502bf68fbc..ff456871bf4b 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,6 +92,13 @@ Brief summary of control files.
>   memory.oom_control                 set/show oom controls.
>   memory.numa_stat                   show the number of memory usage per numa
>                                      node
> + memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
> +                                     memory hard limit. Kernel hard limit is not
> +                                     supported since 5.16. Writing any value to
> +                                     do file will not have any effect same as if
> +                                     nokmem kernel parameter was specified.
> +                                     Kernel memory is still charged and reported
> +                                     by memory.kmem.usage_in_bytes.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>                                      hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a4d3282493b6..0b161705ef36 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>                                    unsigned int nr_pages)
>  {
> +       struct page_counter *counter;
>         struct mem_cgroup *memcg;
>         int ret;
>
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>                 goto out;
>
>         memcg_account_kmem(memcg, nr_pages);
> +
> +       /* There is no way to set up kmem hard limit so this operation cannot fail */
> +       if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +               WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));

WARN_ON_ONCE() please.

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 17:25                               ` Shakeel Butt
@ 2023-09-21 19:50                                 ` Michal Hocko
  0 siblings, 0 replies; 258+ messages in thread
From: Michal Hocko @ 2023-09-21 19:50 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu 21-09-23 10:25:11, Shakeel Butt wrote:
> On Thu, Sep 21, 2023 at 4:21 AM Michal Hocko <mhocko@suse.com> wrote:
[...]
> With one request below:
> 
> Acked-by: Shakeel Butt <shakeelb@google.com>

Thanks.

> > @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >                 goto out;
> >
> >         memcg_account_kmem(memcg, nr_pages);
> > +
> > +       /* There is no way to set up kmem hard limit so this operation cannot fail */
> > +       if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +               WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
> 
> WARN_ON_ONCE() please.

Sure. This shouldn't really trigger, but it is true that if something
unexpected happens then it is likly to flood the log so _ONCE is safer.

I will wait for others to comment before I send the official patch.
To be completely honest I am not super happy about this way of handling
stuff, but considering the level of brokenness this seems like the
safest option. Especially when nobody really want to use the kernel
memory hard limit AFAIU.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 11:21                             ` Michal Hocko
  2023-09-21 17:25                               ` Shakeel Butt
@ 2023-09-22 13:30                               ` Johannes Weiner
  2023-09-25  7:40                                 ` Michal Hocko
  1 sibling, 1 reply; 258+ messages in thread
From: Johannes Weiner @ 2023-09-22 13:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Shakeel Butt, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu, Sep 21, 2023 at 01:21:54PM +0200, Michal Hocko wrote:
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  				   unsigned int nr_pages)
>  {
> +	struct page_counter *counter;
>  	struct mem_cgroup *memcg;
>  	int ret;
>  
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  		goto out;
>  
>  	memcg_account_kmem(memcg, nr_pages);
> +
> +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));

This hunk doesn't look quite right.

static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages)
{
	mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
		if (nr_pages > 0)
			page_counter_charge(&memcg->kmem, nr_pages);
		else
			page_counter_uncharge(&memcg->kmem, -nr_pages);
	}
}

Other than that, please add

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 13:47             ` Michal Hocko
  2023-09-20 15:32               ` Shakeel Butt
@ 2023-09-22 23:00               ` Roman Gushchin
  2023-09-25  7:41                 ` Michal Hocko
  1 sibling, 1 reply; 258+ messages in thread
From: Roman Gushchin @ 2023-09-22 23:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 03:47:37PM +0200, Michal Hocko wrote:
> On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> [...]
> > > I mean, normally I would be just fine reverting this API change because
> > > it is disruptive but the only way to have the file available and not
> > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > there but that sounds rather dubious. Although one could argue this
> > > would mimic nokmem kernel option.
> > > 
> > 
> > I just want to make sure we don't introduce yet another new behavior in this legacy
> > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > (=don't event store the written limit). The latter might have unintended consequences.
> 
> Yes it would mean that the limit is never enforced. Bad as it is the
> thing is that the hard limit on kernel memory is broken by design and
> unfixable.  This causes all sorts of unexpected kernel allocation
> failures that this is simply unsafe to use.
> 
> All that being said I can see the following options
> 1) keep the current upstream status and not export the file
> 2) revert both 58056f77502f and 86327e8eb94 and make it clear
>    that kmem.limit_in_bytes is unsupported so failures or misbehavior
>    as a result of the limit being hit are likely not going to be
>    investigated or fixed.
> 3) reverting like in 2) but never inforce the limit (so basically nokmem
>    semantic)

Since it's a part of cgroup v1 interface, which is in a frozen state as a whole,
and there is no significant (performance, code complexity) benefit of
additionally deprecating kmem.limit_in_bytes, I vote for 2).
1) is also an option.

Thanks!

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-22 13:30                               ` Johannes Weiner
@ 2023-09-25  7:40                                 ` Michal Hocko
  0 siblings, 0 replies; 258+ messages in thread
From: Michal Hocko @ 2023-09-25  7:40 UTC (permalink / raw)
  To: Johannes Weiner, Andrew Morton
  Cc: Jeremi Piotrowski, Shakeel Butt, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, linux-kernel,
	regressions, mathieu.tortuyaux

On Fri 22-09-23 09:30:17, Johannes Weiner wrote:
> On Thu, Sep 21, 2023 at 01:21:54PM +0200, Michal Hocko wrote:
> > @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
> >  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >  				   unsigned int nr_pages)
> >  {
> > +	struct page_counter *counter;
> >  	struct mem_cgroup *memcg;
> >  	int ret;
> >  
> > @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >  		goto out;
> >  
> >  	memcg_account_kmem(memcg, nr_pages);
> > +
> > +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
> 
> This hunk doesn't look quite right.
> 
> static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages)
> {
> 	mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
> 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> 		if (nr_pages > 0)
> 			page_counter_charge(&memcg->kmem, nr_pages);
> 		else
> 			page_counter_uncharge(&memcg->kmem, -nr_pages);
> 	}
> }
> 
> Other than that, please add

Good point. I have missed a8c49af3be5f ("memcg: add per-memcg total kernel memory stat")
introduced in 4.18

> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Fixed version below. Andrew, it seems we have a good consensus for this.
Could you queue this up and send it to Linus please?
--- 
From 8c3cbe68bba0fe5103d8fe73a06b3608ed49bda0 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation

This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes") which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required but the non-existent file just confuses the userspace
which fails in the result. The patch to fix this on the userspace side
has been submitted but it is hard to predict how it will propagate
through the maze of 3rd party consumers of the software.

Now, reverting alone 86327e8eb94c is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well or we can keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
This should work for both known breaking workloads which depend on the
existence but do not depend on the hard limit enforcement.

Note to backporters to stable trees. a8c49af3be5f ("memcg: add per-memcg
total kernel memory stat") introduced in 4.18 has added memcg_account_kmem
so the accounting is not done by obj_cgroup_charge_pages directly for v1
anymore. Prior kernels need to add it explicitly (thanks to Johannes for
pointing this out).

[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Cc: stable
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
 mm/memcontrol.c                                | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
+ memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
+                                     memory hard limit. Kernel hard limit is not
+                                     supported since 5.16. Writing any value to
+                                     do file will not have any effect same as if
+                                     nokmem kernel parameter was specified.
+                                     Kernel memory is still charged and reported
+                                     by memory.kmem.usage_in_bytes.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..63bdaab2a906 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
+	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3867,6 +3868,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
+		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Writing any value to this file has no effect. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
+			ret = 0;
+			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5077,6 +5085,12 @@ static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
+	{
+		.name = "kmem.limit_in_bytes",
+		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
+		.write = mem_cgroup_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
-- 
2.30.2

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-22 23:00               ` Roman Gushchin
@ 2023-09-25  7:41                 ` Michal Hocko
  2023-09-26  2:49                   ` Roman Gushchin
  0 siblings, 1 reply; 258+ messages in thread
From: Michal Hocko @ 2023-09-25  7:41 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Fri 22-09-23 16:00:30, Roman Gushchin wrote:
> On Wed, Sep 20, 2023 at 03:47:37PM +0200, Michal Hocko wrote:
> > On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> > [...]
> > > > I mean, normally I would be just fine reverting this API change because
> > > > it is disruptive but the only way to have the file available and not
> > > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > > there but that sounds rather dubious. Although one could argue this
> > > > would mimic nokmem kernel option.
> > > > 
> > > 
> > > I just want to make sure we don't introduce yet another new behavior in this legacy
> > > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > > (=don't event store the written limit). The latter might have unintended consequences.
> > 
> > Yes it would mean that the limit is never enforced. Bad as it is the
> > thing is that the hard limit on kernel memory is broken by design and
> > unfixable.  This causes all sorts of unexpected kernel allocation
> > failures that this is simply unsafe to use.
> > 
> > All that being said I can see the following options
> > 1) keep the current upstream status and not export the file
> > 2) revert both 58056f77502f and 86327e8eb94 and make it clear
> >    that kmem.limit_in_bytes is unsupported so failures or misbehavior
> >    as a result of the limit being hit are likely not going to be
> >    investigated or fixed.
> > 3) reverting like in 2) but never inforce the limit (so basically nokmem
> >    semantic)
> 
> Since it's a part of cgroup v1 interface, which is in a frozen state as a whole,
> and there is no significant (performance, code complexity) benefit of
> additionally deprecating kmem.limit_in_bytes, I vote for 2).
> 1) is also an option.

We have a stronger agrement over 3)
http://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz. Please speak
up if you disagree.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-25  7:41                 ` Michal Hocko
@ 2023-09-26  2:49                   ` Roman Gushchin
  0 siblings, 0 replies; 258+ messages in thread
From: Roman Gushchin @ 2023-09-26  2:49 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Mon, Sep 25, 2023 at 09:41:24AM +0200, Michal Hocko wrote:
> On Fri 22-09-23 16:00:30, Roman Gushchin wrote:
> > On Wed, Sep 20, 2023 at 03:47:37PM +0200, Michal Hocko wrote:
> > > On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > > > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> > > [...]
> > > > > I mean, normally I would be just fine reverting this API change because
> > > > > it is disruptive but the only way to have the file available and not
> > > > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > > > there but that sounds rather dubious. Although one could argue this
> > > > > would mimic nokmem kernel option.
> > > > > 
> > > > 
> > > > I just want to make sure we don't introduce yet another new behavior in this legacy
> > > > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > > > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > > > (=don't event store the written limit). The latter might have unintended consequences.
> > > 
> > > Yes it would mean that the limit is never enforced. Bad as it is the
> > > thing is that the hard limit on kernel memory is broken by design and
> > > unfixable.  This causes all sorts of unexpected kernel allocation
> > > failures that this is simply unsafe to use.
> > > 
> > > All that being said I can see the following options
> > > 1) keep the current upstream status and not export the file
> > > 2) revert both 58056f77502f and 86327e8eb94 and make it clear
> > >    that kmem.limit_in_bytes is unsupported so failures or misbehavior
> > >    as a result of the limit being hit are likely not going to be
> > >    investigated or fixed.
> > > 3) reverting like in 2) but never inforce the limit (so basically nokmem
> > >    semantic)
> > 
> > Since it's a part of cgroup v1 interface, which is in a frozen state as a whole,
> > and there is no significant (performance, code complexity) benefit of
> > additionally deprecating kmem.limit_in_bytes, I vote for 2).
> > 1) is also an option.
> 
> We have a stronger agrement over 3)
> http://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz. Please speak
> up if you disagree.

This works for me too.
Thank you!

Btw, it seems like going forward we should be more resistant for any
cgroup v1 changes and just leave it as it is.

Thanks.

^ permalink raw reply	[flat|nested] 258+ messages in thread

* Re: [PATCH 6.1 066/219] pwm: atmel-tcb: Convert to platform remove callback returning void
  2023-09-17 19:13 ` [PATCH 6.1 066/219] pwm: atmel-tcb: Convert to platform remove callback returning void Greg Kroah-Hartman
@ 2023-10-09 19:41   ` Uwe Kleine-König
  0 siblings, 0 replies; 258+ messages in thread
From: Uwe Kleine-König @ 2023-10-09 19:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, Claudiu Beznea, Thierry Reding, Sasha Levin

[-- Attachment #1: Type: text/plain, Size: 1934 bytes --]

Hello,

On Sun, Sep 17, 2023 at 09:13:13PM +0200, Greg Kroah-Hartman wrote:
> 6.1-stable review patch.  If anyone has any objections, please let me know.
> ------------------
> 
> From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> 
> [ Upstream commit 9609284a76978daf53a54e05cff36873a75e4d13 ]
> 
> The .remove() callback for a platform driver returns an int which makes
> many driver authors wrongly assume it's possible to do error handling by
> returning an error code. However the value returned is (mostly) ignored
> and this typically results in resource leaks. To improve here there is a
> quest to make the remove callback return void. In the first step of this
> quest all drivers are converted to .remove_new() which already returns
> void.
> 
> Trivially convert this driver from always returning zero in the remove
> callback to the void returning variant.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
> Stable-dep-of: c11622324c02 ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
> Signed-off-by: Sasha Levin <sashal@kernel.org>

This is similar to the other backport I wondered about[1]. IMHO dropping
this patch an resolving the (simple) resolving conflict is the more
sensible approach here. (But keeping this patch doesn't hurt either.)

The other dependency of c11622324c02 ("pwm: atmel-tcb: Fix resource
freeing in error path and remove") is not that trivial to back out, so
I'd keep it (= "pwm: atmel-tcb: Harmonize resource allocation order").

Best regards
Uwe

[1] https://lore.kernel.org/stable/20231009154949.33tpn4fsbacllhme@pengutronix.de

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 258+ messages in thread

end of thread, other threads:[~2023-10-09 19:41 UTC | newest]

Thread overview: 258+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 001/219] net/ipv6: SKB symmetric hash should incorporate transport ports Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 002/219] mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[] Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 003/219] Multi-gen LRU: fix per-zone reclaim Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 004/219] io_uring: always lock in io_apoll_task_func Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 005/219] io_uring: revert "io_uring fix multishot accept ordering" Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 006/219] io_uring/net: dont overflow multishot accept Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 007/219] io_uring: break out of iowq iopoll on teardown Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 008/219] io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 009/219] io_uring: Dont set affinity on a dying sqpoll thread Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 010/219] drm/virtio: Conditionally allocate virtio_gpu_fence Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 011/219] scsi: qla2xxx: Adjust IOCB resource on qpair create Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 012/219] scsi: qla2xxx: Limit TMF to 8 per function Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 013/219] scsi: qla2xxx: Fix deletion race condition Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 014/219] scsi: qla2xxx: fix inconsistent TMF timeout Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 015/219] scsi: qla2xxx: Fix command flush during TMF Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 016/219] scsi: qla2xxx: Fix erroneous link up failure Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 017/219] scsi: qla2xxx: Turn off noisy message log Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 018/219] scsi: qla2xxx: Fix session hang in gnl Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 019/219] scsi: qla2xxx: Fix TMF leak through Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 020/219] scsi: qla2xxx: Remove unsupported ql2xenabledif option Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 021/219] scsi: qla2xxx: Flush mailbox commands on chip reset Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 022/219] scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit() Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 023/219] scsi: qla2xxx: Error code did not return to upper layer Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 024/219] scsi: qla2xxx: Fix firmware resource tracking Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 025/219] null_blk: fix poll request timeout handling Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 026/219] fbdev/ep93xx-fb: Do not assign to struct fb_info.dev Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 027/219] clk: qcom: camcc-sc7180: fix async resume during probe Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 028/219] drm/ast: Fix DRAM init on AST2200 Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 029/219] ASoC: tegra: Fix SFC conversion for few rates Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 030/219] clk: qcom: turingcc-qcs404: fix missing resume during probe Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 031/219] arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 032/219] [SMB3] send channel sequence number in SMB3 requests after reconnects Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Greg Kroah-Hartman
2023-09-20  8:11   ` [REGRESSION] " Jeremi Piotrowski
2023-09-20  8:43     ` Michal Hocko
2023-09-20  9:25       ` Greg Kroah-Hartman
2023-09-20 10:21         ` Jeremi Piotrowski
2023-09-20 10:45           ` Greg Kroah-Hartman
2023-09-20 11:08             ` Michal Hocko
2023-09-20 11:16               ` Greg Kroah-Hartman
2023-09-20 10:04       ` Jeremi Piotrowski
2023-09-20 11:07         ` Michal Hocko
2023-09-20 13:25           ` Jeremi Piotrowski
2023-09-20 13:47             ` Michal Hocko
2023-09-20 15:32               ` Shakeel Butt
2023-09-20 16:55                 ` Michal Hocko
2023-09-20 19:46                   ` Shakeel Butt
2023-09-20 20:08                     ` Michal Hocko
2023-09-20 21:46                       ` Shakeel Butt
2023-09-21  7:52                         ` Michal Hocko
2023-09-21 10:43                           ` Jeremi Piotrowski
2023-09-21 11:21                             ` Michal Hocko
2023-09-21 17:25                               ` Shakeel Butt
2023-09-21 19:50                                 ` Michal Hocko
2023-09-22 13:30                               ` Johannes Weiner
2023-09-25  7:40                                 ` Michal Hocko
2023-09-22 23:00               ` Roman Gushchin
2023-09-25  7:41                 ` Michal Hocko
2023-09-26  2:49                   ` Roman Gushchin
2023-09-17 19:12 ` [PATCH 6.1 034/219] mm: hugetlb_vmemmap: fix a race between vmemmap pmd split Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 035/219] lib/test_meminit: allocate pages up to order MAX_ORDER Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 036/219] parisc: led: Fix LAN receive and transmit LEDs Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 037/219] parisc: led: Reduce CPU overhead for disk & lan LED computation Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 038/219] cifs: update desired access while requesting for directory lease Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 039/219] pinctrl: cherryview: fix address_space_handler() argument Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 040/219] dt-bindings: clock: xlnx,versal-clk: drop select:false Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 041/219] clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 042/219] clk: imx: pll14xx: align pdiv with reference manual Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 043/219] clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 044/219] soc: qcom: qmi_encdec: Restrict string length in decode Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 045/219] clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 046/219] clk: qcom: lpasscc-sc7280: fix missing resume during probe Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 047/219] clk: qcom: q6sstop-qcs404: " Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 048/219] clk: qcom: mss-sc7180: " Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 049/219] NFS: Fix a potential data corruption Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 050/219] NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 051/219] bus: mhi: host: Skip MHI reset if device is in RDDM Greg Kroah-Hartman
2023-09-17 19:12 ` [PATCH 6.1 052/219] net: add SKB_HEAD_ALIGN() helper Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 053/219] net: remove osize variable in __alloc_skb() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 054/219] net: factorize code in kmalloc_reserve() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 055/219] net: deal with integer overflows " Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 056/219] kbuild: rpm-pkg: define _arch conditionally Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 057/219] kbuild: do not run depmod for make modules_sign Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 058/219] tpm_crb: Fix an error handling path in crb_acpi_add() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 059/219] gfs2: Switch to wait_event in gfs2_logd Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 060/219] gfs2: low-memory forced flush fixes Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 061/219] mailbox: qcom-ipcc: fix incorrect num_chans counting Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 062/219] kconfig: fix possible buffer overflow Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 063/219] Input: iqs7222 - configure power mode before triggering ATI Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 064/219] perf trace: Use zfree() to reduce chances of use after free Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 065/219] perf trace: Really free the evsel->priv area Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 066/219] pwm: atmel-tcb: Convert to platform remove callback returning void Greg Kroah-Hartman
2023-10-09 19:41   ` Uwe Kleine-König
2023-09-17 19:13 ` [PATCH 6.1 067/219] pwm: atmel-tcb: Harmonize resource allocation order Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 068/219] pwm: atmel-tcb: Fix resource freeing in error path and remove Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 069/219] backlight: gpio_backlight: Drop output GPIO direction check for initial power state Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 070/219] Input: tca6416-keypad - always expect proper IRQ number in i2c client Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 071/219] Input: tca6416-keypad - fix interrupt enable disbalance Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 072/219] perf annotate bpf: Dont enclose non-debug code with an assert() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 073/219] x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 074/219] perf vendor events: Update the JSON/events descriptions for power10 platform Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 075/219] perf vendor events: Drop some of the JSON/events " Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 076/219] perf vendor events: Drop STORES_PER_INST metric event " Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 077/219] perf top: Dont pass an ERR_PTR() directly to perf_session__delete() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 078/219] watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 079/219] pwm: lpc32xx: Remove handling of PWM channels Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 080/219] perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 081/219] perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 082/219] drm/i915: mark requests for GuC virtual engines to avoid use-after-free Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 083/219] blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 084/219] blk-throttle: consider carryover_ios/bytes in throtl_trim_slice() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 085/219] cifs: use fs_context for automounts Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 086/219] smb: propagate error code of extract_sharename() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 087/219] net/sched: fq_pie: avoid stalls in fq_pie_timer() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 088/219] sctp: annotate data-races around sk->sk_wmem_queued Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 089/219] ipv4: annotate data-races around fi->fib_dead Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 090/219] net: read sk->sk_family once in sk_mc_loop() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 091/219] net: fib: avoid warn splat in flow dissector Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 092/219] xsk: Fix xsk_diag use-after-free error during socket cleanup Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 093/219] ceph: make members in struct ceph_mds_request_args_ext a union Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 094/219] drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page" Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 095/219] drm/i915/gvt: Put the page reference obtained by KVMs gfn_to_pfn() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 096/219] drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 097/219] net: use sk_forward_alloc_get() in sk_get_meminfo() Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 098/219] net: annotate data-races around sk->sk_forward_alloc Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 099/219] mptcp: annotate data-races around msk->rmem_fwd_alloc Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 100/219] ipv4: ignore dst hint for multipath routes Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 101/219] ipv6: " Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 102/219] igb: disable virtualization features on 82580 Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 103/219] gve: fix frag_list chaining Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 104/219] veth: Fixing transmit return status for dropped packets Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 105/219] net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 106/219] net: phy: micrel: Correct bit assignments for phy_device flags Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 107/219] bpf, sockmap: Fix skb refcnt race after locking changes Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 108/219] af_unix: Fix data-races around user->unix_inflight Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 109/219] af_unix: Fix data-race around unix_tot_inflight Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 110/219] af_unix: Fix data-races around sk->sk_shutdown Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 111/219] af_unix: Fix data race around sk->sk_err Greg Kroah-Hartman
2023-09-17 19:13 ` [PATCH 6.1 112/219] net: sched: sch_qfq: Fix UAF in qfq_dequeue() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 113/219] kcm: Destroy mutex in kcm_exit_net() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 114/219] octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 115/219] igc: Change IGC_MIN to allow set rx/tx value between 64 and 80 Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 116/219] igbvf: Change IGBVF_MIN " Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 117/219] igb: Change IGB_MIN " Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 118/219] s390/zcrypt: dont leak memory if dev_set_name() fails Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 119/219] idr: fix param name in idr_alloc_cyclic() doc Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 120/219] ip_tunnels: use DEV_STATS_INC() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 121/219] net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 122/219] net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 123/219] net: dsa: sja1105: complete tc-cbs offload support on SJA1110 Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 124/219] bpf: Remove prog->active check for bpf_lsm and bpf_iter Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 125/219] bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 126/219] bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 127/219] netfilter: nftables: exthdr: fix 4-byte stack OOB write Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 128/219] netfilter: nfnetlink_osf: avoid OOB read Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 129/219] net: hns3: fix tx timeout issue Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 130/219] net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 131/219] net: hns3: fix debugfs concurrency issue between kfree buffer and read Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 132/219] net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 133/219] net: hns3: fix the port information display when sfp is absent Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 134/219] net: hns3: remove GSO partial feature bit Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 135/219] sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 136/219] Multi-gen LRU: avoid race in inc_min_seq() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 137/219] net/mlx5: Free IRQ rmap and notifier on kernel shutdown Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 138/219] ARC: atomics: Add compiler barrier to atomic operations Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 139/219] clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 140/219] dmaengine: sh: rz-dmac: Fix destination and source data size setting Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 141/219] jbd2: fix checkpoint cleanup performance regression Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 142/219] jbd2: check jh->b_transaction before removing it from checkpoint Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 143/219] jbd2: correct the end of the journal recovery scan range Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 144/219] ext4: add correct group descriptors and reserved GDT blocks to system zone Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 145/219] ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup} Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 146/219] f2fs: flush inode if atomic file is aborted Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 147/219] f2fs: avoid false alarm of circular locking Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 148/219] lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 149/219] hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 150/219] ata: ahci: Add Elkhart Lake AHCI controller Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 151/219] ata: pata_falcon: fix IO base selection for Q40 Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 152/219] ata: sata_gemini: Add missing MODULE_DESCRIPTION Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 153/219] ata: pata_ftide010: " Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 154/219] fuse: nlookup missing decrement in fuse_direntplus_link Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 155/219] btrfs: zoned: do not zone finish data relocation block group Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 156/219] btrfs: fix start transaction qgroup rsv double free Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 157/219] btrfs: free qgroup rsv on io failure Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 158/219] btrfs: dont start transaction when joining with TRANS_JOIN_NOSTART Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 159/219] btrfs: set page extent mapped after read_folio in relocate_one_page Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 160/219] btrfs: zoned: re-enable metadata over-commit for zoned mode Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 161/219] btrfs: use the correct superblock to compare fsid in btrfs_validate_super Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 162/219] drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable() Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 163/219] mtd: rawnand: brcmnand: Fix crash during the panic_write Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 164/219] mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 165/219] mtd: spi-nor: Correct flags for Winbond w25q128 Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 166/219] mtd: rawnand: brcmnand: Fix potential false time out warning Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 167/219] mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 168/219] drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 169/219] drm/amd/display: prevent potential division by zero errors Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 170/219] KVM: SVM: Take and hold ir_list_lock when updating vCPUs Physical ID entry Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 171/219] KVM: SVM: Dont inject #UD if KVM attempts to skip SEV guest insn Greg Kroah-Hartman
2023-09-17 19:14 ` [PATCH 6.1 172/219] KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 173/219] KVM: nSVM: Check instead of asserting on nested TSC scaling support Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 174/219] KVM: nSVM: Load L1s TSC multiplier based on L1 state, not L2 state Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 175/219] KVM: SVM: Set target pCPU during IRTE update if target vCPU is running Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 176/219] KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 177/219] MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install regression Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 178/219] perf hists browser: Fix hierarchy mode header Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 179/219] perf test shell stat_bpf_counters: Fix test on Intel Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 180/219] perf tools: Handle old data in PERF_RECORD_ATTR Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 181/219] perf hists browser: Fix the number of entries for e key Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 182/219] drm/amd/display: always switch off ODM before committing more streams Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 183/219] drm/amd/display: Remove wait while locked Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 184/219] drm/amdgpu: register a dirty framebuffer callback for fbcon Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 185/219] kunit: Fix wild-memory-access bug in kunit_free_suite_set() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 186/219] net: ipv4: fix one memleak in __inet_del_ifa() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 187/219] kselftest/runner.sh: Propagate SIGTERM to runner child Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 188/219] selftests: Keep symlinks, when possible Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 189/219] net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 190/219] net: stmmac: fix handling of zero coalescing tx-usecs Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 191/219] net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 192/219] net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 193/219] hsr: Fix uninit-value access in fill_frame_info() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 194/219] net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 195/219] net:ethernet:adi:adin1110: Fix forwarding offload Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 196/219] net: dsa: sja1105: hide all multicast addresses from "bridge fdb show" Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 197/219] net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 198/219] net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 199/219] net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 200/219] net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 201/219] r8152: check budget for r8152_poll() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 202/219] kcm: Fix memory leak in error path of kcm_sendmsg() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 203/219] platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 204/219] platform/mellanox: mlxbf-tmfifo: Drop jumbo frames Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 205/219] platform/mellanox: mlxbf-pmc: Fix potential buffer overflows Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 206/219] platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 207/219] platform/mellanox: NVSW_SN2201 should depend on ACPI Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 208/219] net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 209/219] net: macb: Enable PTP unicast Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 210/219] net: macb: fix sleep inside spinlock Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 211/219] ipv6: fix ip6_sock_set_addr_preferences() typo Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 212/219] ipv6: Remove in6addr_any alternatives Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 213/219] tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any) Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 214/219] tcp: Fix bind() regression for v4-mapped-v6 wildcard address Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 215/219] tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 216/219] ixgbe: fix timestamp configuration code Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 217/219] kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg() Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 218/219] MIPS: Only fiddle with CHECKFLAGS if `need-compiler Greg Kroah-Hartman
2023-09-17 19:15 ` [PATCH 6.1 219/219] drm/amd/display: Fix a bug when searching for insert_above_mpcc Greg Kroah-Hartman
2023-09-17 20:47 ` [PATCH 6.1 000/219] 6.1.54-rc1 review SeongJae Park
2023-09-18  5:34 ` Takeshi Ogasawara
2023-09-18  6:42 ` Bagas Sanjaya
2023-09-18 11:24 ` Conor Dooley
2023-09-18 12:08 ` Ron Economos
2023-09-18 12:48 ` Jon Hunter
2023-09-18 18:34 ` Florian Fainelli
2023-09-18 18:41 ` Guenter Roeck
2023-09-18 20:56 ` Naresh Kamboju
2023-09-18 22:21 ` Shuah Khan
2023-09-21 13:04 ` Conor Dooley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).