public inbox for linux-kernel@vger.kernel.org
* [PATCH 6.1 000/219] 6.1.54-rc1 review
From: Greg Kroah-Hartman @ 2023-09-17 19:12 UTC
  To: stable
  Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

This is the start of the stable review cycle for the 6.1.54 release.
There are 219 patches in this series; all will be posted as responses
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 6.1.54-rc1

Wesley Chalmers <wesley.chalmers@amd.com>
    drm/amd/display: Fix a bug when searching for insert_above_mpcc

Maciej W. Rozycki <macro@orcam.me.uk>
    MIPS: Only fiddle with CHECKFLAGS if `need-compiler'

Kuniyuki Iwashima <kuniyu@amazon.com>
    kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().

Vadim Fedorenko <vadim.fedorenko@linux.dev>
    ixgbe: fix timestamp configuration code

Kuniyuki Iwashima <kuniyu@amazon.com>
    tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.

Kuniyuki Iwashima <kuniyu@amazon.com>
    tcp: Fix bind() regression for v4-mapped-v6 wildcard address.

Kuniyuki Iwashima <kuniyu@amazon.com>
    tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).

Kuniyuki Iwashima <kuniyu@amazon.com>
    ipv6: Remove in6addr_any alternatives.

Eric Dumazet <edumazet@google.com>
    ipv6: fix ip6_sock_set_addr_preferences() typo

Sascha Hauer <s.hauer@pengutronix.de>
    net: macb: fix sleep inside spinlock

Harini Katakam <harini.katakam@xilinx.com>
    net: macb: Enable PTP unicast

Liu Jian <liujian56@huawei.com>
    net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()

Geert Uytterhoeven <geert+renesas@glider.be>
    platform/mellanox: NVSW_SN2201 should depend on ACPI

Shravan Kumar Ramani <shravankr@nvidia.com>
    platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events

Shravan Kumar Ramani <shravankr@nvidia.com>
    platform/mellanox: mlxbf-pmc: Fix potential buffer overflows

Liming Sun <limings@nvidia.com>
    platform/mellanox: mlxbf-tmfifo: Drop jumbo frames

Liming Sun <limings@nvidia.com>
    platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors

Shigeru Yoshida <syoshida@redhat.com>
    kcm: Fix memory leak in error path of kcm_sendmsg()

Hayes Wang <hayeswang@realtek.com>
    r8152: check budget for r8152_poll()

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid()

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"

Ciprian Regus <ciprian.regus@analog.com>
    net:ethernet:adi:adin1110: Fix forwarding offload

Yang Yingliang <yangyingliang@huawei.com>
    net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address

Ziyang Xuan <william.xuanziyang@huawei.com>
    hsr: Fix uninit-value access in fill_frame_info()

Hangyu Hua <hbh25y@gmail.com>
    net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all()

Hangyu Hua <hbh25y@gmail.com>
    net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc()

Vincent Whitchurch <vincent.whitchurch@axis.com>
    net: stmmac: fix handling of zero coalescing tx-usecs

Guangguan Wang <guangguan.wang@linux.alibaba.com>
    net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add

Björn Töpel <bjorn@rivosinc.com>
    selftests: Keep symlinks, when possible

Björn Töpel <bjorn@rivosinc.com>
    kselftest/runner.sh: Propagate SIGTERM to runner child

Liu Jian <liujian56@huawei.com>
    net: ipv4: fix one memleak in __inet_del_ifa()

Jinjie Ruan <ruanjinjie@huawei.com>
    kunit: Fix wild-memory-access bug in kunit_free_suite_set()

Hamza Mahfooz <hamza.mahfooz@amd.com>
    drm/amdgpu: register a dirty framebuffer callback for fbcon

Gabe Teeger <gabe.teeger@amd.com>
    drm/amd/display: Remove wait while locked

Wenjing Liu <wenjing.liu@amd.com>
    drm/amd/display: always switch off ODM before committing more streams

Namhyung Kim <namhyung@kernel.org>
    perf hists browser: Fix the number of entries for 'e' key

Namhyung Kim <namhyung@kernel.org>
    perf tools: Handle old data in PERF_RECORD_ATTR

Namhyung Kim <namhyung@kernel.org>
    perf test shell stat_bpf_counters: Fix test on Intel

Namhyung Kim <namhyung@kernel.org>
    perf hists browser: Fix hierarchy mode header

Maciej W. Rozycki <macro@orcam.me.uk>
    MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regression

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Set target pCPU during IRTE update if target vCPU is running

Sean Christopherson <seanjc@google.com>
    KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state

Sean Christopherson <seanjc@google.com>
    KVM: nSVM: Check instead of asserting on nested TSC scaling support

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insn

Sean Christopherson <seanjc@google.com>
    KVM: SVM: Take and hold ir_list_lock when updating vCPU's Physical ID entry

Hamza Mahfooz <hamza.mahfooz@amd.com>
    drm/amd/display: prevent potential division by zero errors

Melissa Wen <mwen@igalia.com>
    drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix potential false time out warning

Linus Walleij <linus.walleij@linaro.org>
    mtd: spi-nor: Correct flags for Winbond w25q128

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write

William Zhang <william.zhang@broadcom.com>
    mtd: rawnand: brcmnand: Fix crash during the panic_write

Liu Ying <victor.liu@nxp.com>
    drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable()

Anand Jain <anand.jain@oracle.com>
    btrfs: use the correct superblock to compare fsid in btrfs_validate_super

Naohiro Aota <naohiro.aota@wdc.com>
    btrfs: zoned: re-enable metadata over-commit for zoned mode

Josef Bacik <josef@toxicpanda.com>
    btrfs: set page extent mapped after read_folio in relocate_one_page

Filipe Manana <fdmanana@suse.com>
    btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART

Boris Burkov <boris@bur.io>
    btrfs: free qgroup rsv on io failure

Boris Burkov <boris@bur.io>
    btrfs: fix start transaction qgroup rsv double free

Naohiro Aota <naohiro.aota@wdc.com>
    btrfs: zoned: do not zone finish data relocation block group

ruanmeisi <ruan.meisi@zte.com.cn>
    fuse: nlookup missing decrement in fuse_direntplus_link

Damien Le Moal <dlemoal@kernel.org>
    ata: pata_ftide010: Add missing MODULE_DESCRIPTION

Damien Le Moal <dlemoal@kernel.org>
    ata: sata_gemini: Add missing MODULE_DESCRIPTION

Michael Schmitz <schmitzmic@gmail.com>
    ata: pata_falcon: fix IO base selection for Q40

Werner Fischer <devlists@wefi.net>
    ata: ahci: Add Elkhart Lake AHCI controller

Christian Marangi <ansuelsmth@gmail.com>
    hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation

Nathan Chancellor <nathan@kernel.org>
    lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()

Jaegeuk Kim <jaegeuk@kernel.org>
    f2fs: avoid false alarm of circular locking

Jaegeuk Kim <jaegeuk@kernel.org>
    f2fs: flush inode if atomic file is aborted

Luís Henriques <lhenriques@suse.de>
    ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup}

Wang Jianjian <wangjianjian0@foxmail.com>
    ext4: add correct group descriptors and reserved GDT blocks to system zone

Zhang Yi <yi.zhang@huawei.com>
    jbd2: correct the end of the journal recovery scan range

Zhihao Cheng <chengzhihao1@huawei.com>
    jbd2: check 'jh->b_transaction' before removing it from checkpoint

Zhang Yi <yi.zhang@huawei.com>
    jbd2: fix checkpoint cleanup performance regression

Hien Huynh <hien.huynh.px@renesas.com>
    dmaengine: sh: rz-dmac: Fix destination and source data size setting

Walter Chang <walter.chang@mediatek.com>
    clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL

Pavel Kozlov <pavel.kozlov@synopsys.com>
    ARC: atomics: Add compiler barrier to atomic operations...

Saeed Mahameed <saeedm@nvidia.com>
    net/mlx5: Free IRQ rmap and notifier on kernel shutdown

Kalesh Singh <kaleshsingh@google.com>
    Multi-gen LRU: avoid race in inc_min_seq()

Petr Tesarik <petr.tesarik.ext@huawei.com>
    sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()

Jie Wang <wangjie125@huawei.com>
    net: hns3: remove GSO partial feature bit

Yisen Zhuang <yisen.zhuang@huawei.com>
    net: hns3: fix the port information display when sfp is absent

Jijie Shao <shaojijie@huawei.com>
    net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue

Hao Chen <chenhao418@huawei.com>
    net: hns3: fix debugfs concurrency issue between kfree buffer and read

Hao Chen <chenhao418@huawei.com>
    net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read()

Jian Shen <shenjian15@huawei.com>
    net: hns3: fix tx timeout issue

Wander Lairson Costa <wander@redhat.com>
    netfilter: nfnetlink_osf: avoid OOB read

Florian Westphal <fw@strlen.de>
    netfilter: nftables: exthdr: fix 4-byte stack OOB write

Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check.

Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf().

Martin KaFai Lau <martin.lau@kernel.org>
    bpf: Remove prog->active check for bpf_lsm and bpf_iter

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: complete tc-cbs offload support on SJA1110

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times

Vladimir Oltean <vladimir.oltean@nxp.com>
    net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload

Eric Dumazet <edumazet@google.com>
    ip_tunnels: use DEV_STATS_INC()

Ariel Marcovitch <arielmarcovitch@gmail.com>
    idr: fix param name in idr_alloc_cyclic() doc

Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    s390/zcrypt: don't leak memory if dev_set_name() fails

Olga Zaborska <olga.zaborska@intel.com>
    igb: Change IGB_MIN to allow set rx/tx value between 64 and 80

Olga Zaborska <olga.zaborska@intel.com>
    igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80

Olga Zaborska <olga.zaborska@intel.com>
    igc: Change IGC_MIN to allow set rx/tx value between 64 and 80

Geetha sowjanya <gakula@marvell.com>
    octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler

Shigeru Yoshida <syoshida@redhat.com>
    kcm: Destroy mutex in kcm_exit_net()

valis <sec@valis.email>
    net: sched: sch_qfq: Fix UAF in qfq_dequeue()

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data race around sk->sk_err.

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data-races around sk->sk_shutdown.

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data-race around unix_tot_inflight.

Kuniyuki Iwashima <kuniyu@amazon.com>
    af_unix: Fix data-races around user->unix_inflight.

John Fastabend <john.fastabend@gmail.com>
    bpf, sockmap: Fix skb refcnt race after locking changes

Oleksij Rempel <linux@rempel-privat.de>
    net: phy: micrel: Correct bit assignments for phy_device flags

Alex Henrie <alexhenrie24@gmail.com>
    net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr

Liang Chen <liangchen.linux@gmail.com>
    veth: Fixing transmit return status for dropped packets

Eric Dumazet <edumazet@google.com>
    gve: fix frag_list chaining

Corinna Vinschen <vinschen@redhat.com>
    igb: disable virtualization features on 82580

Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    ipv6: ignore dst hint for multipath routes

Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    ipv4: ignore dst hint for multipath routes

Eric Dumazet <edumazet@google.com>
    mptcp: annotate data-races around msk->rmem_fwd_alloc

Eric Dumazet <edumazet@google.com>
    net: annotate data-races around sk->sk_forward_alloc

Eric Dumazet <edumazet@google.com>
    net: use sk_forward_alloc_get() in sk_get_meminfo()

Sean Christopherson <seanjc@google.com>
    drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt()

Sean Christopherson <seanjc@google.com>
    drm/i915/gvt: Put the page reference obtained by KVM's gfn_to_pfn()

Sean Christopherson <seanjc@google.com>
    drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page"

Xiubo Li <xiubli@redhat.com>
    ceph: make members in struct ceph_mds_request_args_ext a union

Magnus Karlsson <magnus.karlsson@intel.com>
    xsk: Fix xsk_diag use-after-free error during socket cleanup

Florian Westphal <fw@strlen.de>
    net: fib: avoid warn splat in flow dissector

Eric Dumazet <edumazet@google.com>
    net: read sk->sk_family once in sk_mc_loop()

Eric Dumazet <edumazet@google.com>
    ipv4: annotate data-races around fi->fib_dead

Eric Dumazet <edumazet@google.com>
    sctp: annotate data-races around sk->sk_wmem_queued

Eric Dumazet <edumazet@google.com>
    net/sched: fq_pie: avoid stalls in fq_pie_timer()

Katya Orlova <e.orlova@ispras.ru>
    smb: propagate error code of extract_sharename()

Paulo Alcantara <pc@cjr.nz>
    cifs: use fs_context for automounts

Yu Kuai <yukuai3@huawei.com>
    blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice()

Yu Kuai <yukuai3@huawei.com>
    blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()

Andrzej Hajda <andrzej.hajda@intel.com>
    drm/i915: mark requests for GuC virtual engines to avoid use-after-free

Namhyung Kim <namhyung@kernel.org>
    perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test

Kajol Jain <kjain@linux.ibm.com>
    perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators

Vladimir Zapolskiy <vz@mleia.com>
    pwm: lpc32xx: Remove handling of PWM channels

Raag Jadav <raag.jadav@intel.com>
    watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf top: Don't pass an ERR_PTR() directly to perf_session__delete()

Kajol Jain <kjain@linux.ibm.com>
    perf vendor events: Drop STORES_PER_INST metric event for power10 platform

Kajol Jain <kjain@linux.ibm.com>
    perf vendor events: Drop some of the JSON/events for power10 platform

Kajol Jain <kjain@linux.ibm.com>
    perf vendor events: Update the JSON/events descriptions for power10 platform

Sean Christopherson <seanjc@google.com>
    x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm()

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf annotate bpf: Don't enclose non-debug code with an assert()

Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Input: tca6416-keypad - fix interrupt enable disbalance

Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Input: tca6416-keypad - always expect proper IRQ number in i2c client

Ying Liu <victor.liu@nxp.com>
    backlight: gpio_backlight: Drop output GPIO direction check for initial power state

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    pwm: atmel-tcb: Fix resource freeing in error path and remove

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    pwm: atmel-tcb: Harmonize resource allocation order

Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    pwm: atmel-tcb: Convert to platform remove callback returning void

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf trace: Really free the evsel->priv area

Arnaldo Carvalho de Melo <acme@redhat.com>
    perf trace: Use zfree() to reduce chances of use after free

Jeff LaBundy <jeff@labundy.com>
    Input: iqs7222 - configure power mode before triggering ATI

Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
    kconfig: fix possible buffer overflow

Jonathan Marek <jonathan@marek.ca>
    mailbox: qcom-ipcc: fix incorrect num_chans counting

Andreas Gruenbacher <agruenba@redhat.com>
    gfs2: low-memory forced flush fixes

Andreas Gruenbacher <agruenba@redhat.com>
    gfs2: Switch to wait_event in gfs2_logd

Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    tpm_crb: Fix an error handling path in crb_acpi_add()

Masahiro Yamada <masahiroy@kernel.org>
    kbuild: do not run depmod for 'make modules_sign'

Masahiro Yamada <masahiroy@kernel.org>
    kbuild: rpm-pkg: define _arch conditionally

Eric Dumazet <edumazet@google.com>
    net: deal with integer overflows in kmalloc_reserve()

Eric Dumazet <edumazet@google.com>
    net: factorize code in kmalloc_reserve()

Eric Dumazet <edumazet@google.com>
    net: remove osize variable in __alloc_skb()

Eric Dumazet <edumazet@google.com>
    net: add SKB_HEAD_ALIGN() helper

Qiang Yu <quic_qianyu@quicinc.com>
    bus: mhi: host: Skip MHI reset if device is in RDDM

Fedor Pchelkin <pchelkin@ispras.ru>
    NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info

Trond Myklebust <trond.myklebust@hammerspace.com>
    NFS: Fix a potential data corruption

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: mss-sc7180: fix missing resume during probe

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: q6sstop-qcs404: fix missing resume during probe

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: lpasscc-sc7280: fix missing resume during probe

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors

Chris Lew <quic_clew@quicinc.com>
    soc: qcom: qmi_encdec: Restrict string length in decode

Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock

Marco Felsch <m.felsch@pengutronix.de>
    clk: imx: pll14xx: align pdiv with reference manual

Ahmad Fatoum <a.fatoum@pengutronix.de>
    clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz

Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    dt-bindings: clock: xlnx,versal-clk: drop select:false

Raag Jadav <raag.jadav@intel.com>
    pinctrl: cherryview: fix address_space_handler() argument

Bharath SM <bharathsm@microsoft.com>
    cifs: update desired access while requesting for directory lease

Helge Deller <deller@gmx.de>
    parisc: led: Reduce CPU overhead for disk & lan LED computation

Helge Deller <deller@gmx.de>
    parisc: led: Fix LAN receive and transmit LEDs

Andrew Donnellan <ajd@linux.ibm.com>
    lib/test_meminit: allocate pages up to order MAX_ORDER

Muchun Song <muchun.song@linux.dev>
    mm: hugetlb_vmemmap: fix a race between vmemmap pmd split

Michal Hocko <mhocko@suse.com>
    memcg: drop kmem.limit_in_bytes

Steve French <stfrench@microsoft.com>
    send channel sequence number in SMB3 requests after reconnects

Chris Paterson <chris.paterson2@renesas.com>
    arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: turingcc-qcs404: fix missing resume during probe

Sheetal <sheetal@nvidia.com>
    ASoC: tegra: Fix SFC conversion for few rates

Thomas Zimmermann <tzimmermann@suse.de>
    drm/ast: Fix DRAM init on AST2200

Johan Hovold <johan+linaro@kernel.org>
    clk: qcom: camcc-sc7180: fix async resume during probe

Thomas Zimmermann <tzimmermann@suse.de>
    fbdev/ep93xx-fb: Do not assign to struct fb_info.dev

Chengming Zhou <zhouchengming@bytedance.com>
    null_blk: fix poll request timeout handling

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix firmware resource tracking

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Error code did not return to upper layer

Nilesh Javali <njavali@marvell.com>
    scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit()

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Flush mailbox commands on chip reset

Manish Rangankar <mrangankar@marvell.com>
    scsi: qla2xxx: Remove unsupported ql2xenabledif option

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix TMF leak through

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix session hang in gnl

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Turn off noisy message log

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix erroneous link up failure

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix command flush during TMF

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: fix inconsistent TMF timeout

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Fix deletion race condition

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Limit TMF to 8 per function

Quinn Tran <qutran@marvell.com>
    scsi: qla2xxx: Adjust IOCB resource on qpair create

Gurchetan Singh <gurchetansingh@chromium.org>
    drm/virtio: Conditionally allocate virtio_gpu_fence

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: Don't set affinity on a dying sqpoll thread

Pavel Begunkov <asml.silence@gmail.com>
    io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: break out of iowq iopoll on teardown

Pavel Begunkov <asml.silence@gmail.com>
    io_uring/net: don't overflow multishot accept

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: revert "io_uring fix multishot accept ordering"

Pavel Begunkov <asml.silence@gmail.com>
    io_uring: always lock in io_apoll_task_func

Kalesh Singh <kaleshsingh@google.com>
    Multi-gen LRU: fix per-zone reclaim

Yu Zhao <yuzhao@google.com>
    mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[]

Quan Tian <qtian@vmware.com>
    net/ipv6: SKB symmetric hash should incorporate transport ports


-------------

Diffstat:

 Documentation/admin-guide/cgroup-v1/memory.rst     |   2 -
 .../devicetree/bindings/clock/xlnx,versal-clk.yaml |   2 -
 Documentation/mm/multigen_lru.rst                  |   8 +-
 Makefile                                           |   6 +-
 arch/arc/include/asm/atomic-llsc.h                 |   6 +-
 arch/arc/include/asm/atomic64-arcv2.h              |   6 +-
 arch/arm64/boot/dts/renesas/rzg2l-smarc-som.dtsi   |   4 +-
 arch/arm64/boot/dts/renesas/rzg2lc-smarc-som.dtsi  |   2 +-
 arch/arm64/boot/dts/renesas/rzg2ul-smarc-som.dtsi  |   4 +-
 arch/arm64/net/bpf_jit_comp.c                      |   9 +-
 arch/mips/Makefile                                 |   6 +-
 arch/parisc/include/asm/led.h                      |   4 +-
 arch/sh/boards/mach-ap325rxa/setup.c               |   2 +-
 arch/sh/boards/mach-ecovec24/setup.c               |   6 +-
 arch/sh/boards/mach-kfr2r09/setup.c                |   2 +-
 arch/sh/boards/mach-migor/setup.c                  |   2 +-
 arch/sh/boards/mach-se/7724/setup.c                |   6 +-
 arch/x86/include/asm/virtext.h                     |   6 -
 arch/x86/kvm/svm/avic.c                            |  59 +++++-
 arch/x86/kvm/svm/nested.c                          |   9 +-
 arch/x86/kvm/svm/sev.c                             |   9 +-
 arch/x86/kvm/svm/svm.c                             |  35 ++-
 arch/x86/net/bpf_jit_comp.c                        |  19 +-
 block/blk-throttle.c                               |  99 ++++-----
 drivers/ata/ahci.c                                 |   2 +
 drivers/ata/pata_falcon.c                          |  50 +++--
 drivers/ata/pata_ftide010.c                        |   1 +
 drivers/ata/sata_gemini.c                          |   1 +
 drivers/block/null_blk/main.c                      |  12 +-
 drivers/bus/mhi/host/pm.c                          |   5 +
 drivers/char/tpm/tpm_crb.c                         |   5 +-
 drivers/clk/imx/clk-pll14xx.c                      |  13 +-
 drivers/clk/qcom/camcc-sc7180.c                    |   2 +-
 drivers/clk/qcom/dispcc-sm8450.c                   |  13 +-
 drivers/clk/qcom/gcc-mdm9615.c                     |   2 +-
 drivers/clk/qcom/lpasscc-sc7280.c                  |  16 +-
 drivers/clk/qcom/mss-sc7180.c                      |  13 +-
 drivers/clk/qcom/q6sstop-qcs404.c                  |  15 +-
 drivers/clk/qcom/turingcc-qcs404.c                 |  13 +-
 drivers/clocksource/arm_arch_timer.c               |   7 +
 drivers/dma/sh/rz-dmac.c                           |  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c        |  26 ++-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c    |   7 +
 drivers/gpu/drm/amd/display/dc/Makefile            |   1 +
 drivers/gpu/drm/amd/display/dc/core/dc.c           |  68 ++++--
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_mpc.c   |   5 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c |  11 -
 .../drm/amd/display/modules/freesync/freesync.c    |   9 +-
 drivers/gpu/drm/ast/ast_post.c                     |   2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h       |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c  |   3 +
 drivers/gpu/drm/i915/gvt/gtt.c                     |  27 +--
 drivers/gpu/drm/i915/gvt/gtt.h                     |   1 -
 drivers/gpu/drm/i915/i915_request.c                |   7 +-
 drivers/gpu/drm/mxsfb/mxsfb_kms.c                  |   9 +
 drivers/gpu/drm/virtio/virtgpu_ioctl.c             |  30 +--
 drivers/hwspinlock/qcom_hwspinlock.c               |   9 +
 drivers/input/keyboard/tca6416-keypad.c            |  31 +--
 drivers/input/misc/iqs7222.c                       |   8 +-
 drivers/mailbox/qcom-ipcc.c                        |   4 +-
 drivers/mtd/nand/raw/brcmnand/brcmnand.c           | 112 ++++++----
 drivers/mtd/spi-nor/winbond.c                      |   5 +-
 drivers/net/dsa/sja1105/sja1105.h                  |   4 +
 drivers/net/dsa/sja1105/sja1105_dynamic_config.c   |  93 ++++----
 drivers/net/dsa/sja1105/sja1105_main.c             | 120 ++++++++---
 drivers/net/dsa/sja1105/sja1105_spi.c              |   4 +
 drivers/net/ethernet/adi/adin1110.c                |  10 +-
 drivers/net/ethernet/cadence/macb.h                |   4 +
 drivers/net/ethernet/cadence/macb_main.c           |  18 +-
 drivers/net/ethernet/google/gve/gve_rx_dqo.c       |   5 +-
 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |   1 +
 drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c |   7 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    |  19 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |   4 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c |  20 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c |  14 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |   5 +-
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |   2 -
 drivers/net/ethernet/intel/igb/igb.h               |   4 +-
 drivers/net/ethernet/intel/igb/igb_main.c          |   5 +-
 drivers/net/ethernet/intel/igbvf/igbvf.h           |   4 +-
 drivers/net/ethernet/intel/igc/igc.h               |   4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ptp.c       |  28 +--
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c    |   5 +
 .../net/ethernet/marvell/octeontx2/af/rvu_nix.c    |  21 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.c        |   3 +
 .../ethernet/mellanox/mlx5/core/en/tc_tun_encap.c  |   5 +-
 drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c  |  26 ++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |  10 +-
 drivers/net/usb/r8152.c                            |   3 +
 drivers/net/veth.c                                 |   4 +-
 drivers/parisc/led.c                               |   4 +-
 drivers/pinctrl/intel/pinctrl-cherryview.c         |   5 +-
 drivers/platform/mellanox/Kconfig                  |   4 +-
 drivers/platform/mellanox/mlxbf-pmc.c              |  41 ++--
 drivers/platform/mellanox/mlxbf-tmfifo.c           |  90 +++++---
 drivers/pwm/pwm-atmel-tcb.c                        |  70 +++---
 drivers/pwm/pwm-lpc32xx.c                          |  16 +-
 drivers/s390/crypto/zcrypt_api.c                   |   1 +
 drivers/scsi/qla2xxx/qla_attr.c                    |   2 -
 drivers/scsi/qla2xxx/qla_dbg.c                     |   2 +-
 drivers/scsi/qla2xxx/qla_def.h                     |  21 +-
 drivers/scsi/qla2xxx/qla_dfs.c                     |  10 +
 drivers/scsi/qla2xxx/qla_gbl.h                     |   1 +
 drivers/scsi/qla2xxx/qla_init.c                    | 234 +++++++++++++--------
 drivers/scsi/qla2xxx/qla_inline.h                  |  57 ++++-
 drivers/scsi/qla2xxx/qla_iocb.c                    |   1 +
 drivers/scsi/qla2xxx/qla_isr.c                     |   7 +-
 drivers/scsi/qla2xxx/qla_mbx.c                     |   7 +-
 drivers/scsi/qla2xxx/qla_nvme.c                    |   3 +-
 drivers/scsi/qla2xxx/qla_os.c                      |  26 ++-
 drivers/scsi/qla2xxx/qla_target.c                  |  14 +-
 drivers/soc/qcom/qmi_encdec.c                      |   4 +-
 drivers/video/backlight/gpio_backlight.c           |   3 +-
 drivers/video/fbdev/ep93xx-fb.c                    |   1 -
 drivers/watchdog/intel-mid_wdt.c                   |   1 +
 fs/btrfs/disk-io.c                                 |   5 +-
 fs/btrfs/extent-tree.c                             |  43 ++--
 fs/btrfs/inode.c                                   |   7 +
 fs/btrfs/relocation.c                              |  12 +-
 fs/btrfs/space-info.c                              |   6 +-
 fs/btrfs/transaction.c                             |  26 ++-
 fs/btrfs/zoned.c                                   |  16 +-
 fs/ext4/balloc.c                                   |  15 +-
 fs/ext4/block_validity.c                           |   8 +-
 fs/ext4/crypto.c                                   |   4 +
 fs/ext4/ext4.h                                     |   2 +
 fs/f2fs/f2fs.h                                     |  24 ++-
 fs/f2fs/inline.c                                   |   3 +-
 fs/f2fs/segment.c                                  |   2 +
 fs/fuse/readdir.c                                  |  10 +-
 fs/gfs2/aops.c                                     |   4 +-
 fs/gfs2/log.c                                      |  25 +--
 fs/jbd2/checkpoint.c                               |  22 +-
 fs/jbd2/recovery.c                                 |  12 +-
 fs/nfs/direct.c                                    |  20 +-
 fs/nfs/pnfs_dev.c                                  |   2 +-
 fs/smb/client/cached_dir.c                         |   2 +-
 fs/smb/client/cifs_dfs_ref.c                       | 100 ++++-----
 fs/smb/client/cifsglob.h                           |   1 +
 fs/smb/client/connect.c                            |   1 +
 fs/smb/client/fscache.c                            |   2 +-
 fs/smb/client/smb2ops.c                            |  11 +-
 fs/smb/client/smb2pdu.c                            |  11 +
 fs/smb/common/smb2pdu.h                            |  22 ++
 include/linux/bpf.h                                |  24 +--
 include/linux/bpf_verifier.h                       |  13 ++
 include/linux/ceph/ceph_fs.h                       |  24 ++-
 include/linux/ipv6.h                               |   1 +
 include/linux/micrel_phy.h                         |   6 +-
 include/linux/mm_inline.h                          |   4 +-
 include/linux/mmzone.h                             |   8 +-
 include/linux/skbuff.h                             |   8 +
 include/linux/tca6416_keypad.h                     |   1 -
 include/net/ip.h                                   |   1 +
 include/net/ip6_fib.h                              |  14 +-
 include/net/ip_fib.h                               |   5 +-
 include/net/ip_tunnels.h                           |  15 +-
 include/net/ipv6.h                                 |   7 +-
 include/net/sock.h                                 |  12 +-
 include/trace/events/fib.h                         |   5 +-
 include/trace/events/fib6.h                        |   5 +-
 io_uring/io-wq.c                                   |  17 +-
 io_uring/io-wq.h                                   |   3 +-
 io_uring/io_uring.c                                |  31 ++-
 io_uring/net.c                                     |   8 +-
 io_uring/poll.c                                    |   3 +-
 io_uring/sqpoll.c                                  |  17 ++
 io_uring/sqpoll.h                                  |   1 +
 kernel/bpf/syscall.c                               |   7 +-
 kernel/bpf/trampoline.c                            |  81 +++++--
 lib/idr.c                                          |   2 +-
 lib/kunit/test.c                                   |   3 +-
 lib/test_meminit.c                                 |   2 +-
 lib/test_scanf.c                                   |   2 +-
 mm/hugetlb_vmemmap.c                               |  34 ++-
 mm/memcontrol.c                                    |  10 -
 mm/vmscan.c                                        |  50 +++--
 net/core/flow_dissector.c                          |   3 +-
 net/core/skbuff.c                                  |  49 ++---
 net/core/skmsg.c                                   |  12 +-
 net/core/sock.c                                    |  19 +-
 net/ethtool/ioctl.c                                |  10 +-
 net/hsr/hsr_forward.c                              |   1 +
 net/ipv4/devinet.c                                 |  10 +-
 net/ipv4/fib_semantics.c                           |   5 +-
 net/ipv4/fib_trie.c                                |   3 +-
 net/ipv4/inet_hashtables.c                         |  43 ++--
 net/ipv4/ip_input.c                                |   3 +-
 net/ipv4/route.c                                   |   1 +
 net/ipv4/tcp_output.c                              |   2 +-
 net/ipv4/udp.c                                     |   6 +-
 net/ipv6/addrconf.c                                |   2 +-
 net/ipv6/ip6_input.c                               |   3 +-
 net/ipv6/route.c                                   |   3 +
 net/kcm/kcmsock.c                                  |  15 +-
 net/mptcp/protocol.c                               |  23 +-
 net/netfilter/nfnetlink_osf.c                      |   8 +
 net/netfilter/nft_exthdr.c                         |  22 +-
 net/sched/sch_fq_pie.c                             |  27 ++-
 net/sched/sch_plug.c                               |   2 +-
 net/sched/sch_qfq.c                                |  22 +-
 net/sctp/proc.c                                    |   2 +-
 net/sctp/socket.c                                  |  10 +-
 net/smc/smc_core.c                                 |   2 +
 net/tls/tls_sw.c                                   |   4 +-
 net/unix/af_unix.c                                 |   2 +-
 net/unix/scm.c                                     |   6 +-
 net/xdp/xsk_diag.c                                 |   3 +
 scripts/kconfig/preprocess.c                       |   3 +
 scripts/package/mkspec                             |   2 +-
 sound/soc/tegra/tegra210_sfc.c                     |  31 ++-
 sound/soc/tegra/tegra210_sfc.h                     |   4 +-
 tools/perf/builtin-top.c                           |   1 +
 tools/perf/builtin-trace.c                         |  15 +-
 .../pmu-events/arch/powerpc/power10/cache.json     |   4 +-
 .../arch/powerpc/power10/floating_point.json       |   7 -
 .../pmu-events/arch/powerpc/power10/frontend.json  |  30 +--
 .../pmu-events/arch/powerpc/power10/marked.json    |  30 +--
 .../pmu-events/arch/powerpc/power10/memory.json    |   6 +-
 .../pmu-events/arch/powerpc/power10/metrics.json   |   6 -
 .../pmu-events/arch/powerpc/power10/others.json    |  53 +++--
 .../pmu-events/arch/powerpc/power10/pipeline.json  |  30 +--
 .../perf/pmu-events/arch/powerpc/power10/pmc.json  |   4 +-
 .../arch/powerpc/power10/translation.json          |  11 +-
 tools/perf/tests/shell/stat_bpf_counters.sh        |   4 +-
 tools/perf/tests/shell/stat_bpf_counters_cgrp.sh   |  28 ++-
 tools/perf/ui/browsers/hists.c                     |  60 +++---
 tools/perf/util/annotate.c                         |  10 +-
 tools/perf/util/header.c                           |  11 +-
 tools/testing/selftests/kselftest/runner.sh        |   3 +-
 tools/testing/selftests/lib.mk                     |   4 +-
 232 files changed, 2129 insertions(+), 1340 deletions(-)



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
@ 2023-09-17 20:47 ` SeongJae Park
  2023-09-18  5:34 ` Takeshi Ogasawara
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: SeongJae Park @ 2023-09-17 20:47 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, damon, SeongJae Park

Hello,

On Sun, 17 Sep 2023 21:12:07 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.

This rc kernel passes DAMON functionality test[1] on my test machine.
Attaching the test results summary below.  Please note that I retrieved the
kernel from linux-stable-rc tree[2].

Tested-by: SeongJae Park <sj@kernel.org>

[1] https://github.com/awslabs/damon-tests/tree/next/corr
[2] 89fc7c511aa5 ("Linux 6.1.54-rc1")

Thanks,
SJ

[...]

---

# .config:1408:warning: override: reassigning to symbol CGROUPS
ok 15 selftests: damon-tests: build_nomemcg.sh
# kselftest dir '/home/sjpark/damon-tests-cont/linux/tools/testing/selftests/damon-tests' is in dirty state.
# the log is at '/home/sjpark/log'.
ok 1 selftests: damon: debugfs_attrs.sh
ok 2 selftests: damon: debugfs_schemes.sh
ok 3 selftests: damon: debugfs_target_ids.sh
ok 4 selftests: damon: debugfs_empty_targets.sh
ok 5 selftests: damon: debugfs_huge_count_read_write.sh
ok 6 selftests: damon: debugfs_duplicate_context_creation.sh
ok 7 selftests: damon: sysfs.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_m68k.sh
ok 12 selftests: damon-tests: build_arm64.sh
ok 13 selftests: damon-tests: build_i386_idle_flag.sh
ok 14 selftests: damon-tests: build_i386_highpte.sh
ok 15 selftests: damon-tests: build_nomemcg.sh
PASS
_remote_run_corr.sh SUCCESS

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
  2023-09-17 20:47 ` SeongJae Park
@ 2023-09-18  5:34 ` Takeshi Ogasawara
  2023-09-18  6:42 ` Bagas Sanjaya
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Takeshi Ogasawara @ 2023-09-18  5:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

Hi Greg

On Mon, Sep 18, 2023 at 5:03 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

6.1.54-rc1 tested.

Build successfully completed.
Boot successfully completed.
No dmesg regressions.
Video output normal.
Sound output normal.

Lenovo ThinkPad X1 Carbon Gen10 (Intel i7-1260P, x86_64, Arch Linux)

Thanks

Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
  2023-09-17 20:47 ` SeongJae Park
  2023-09-18  5:34 ` Takeshi Ogasawara
@ 2023-09-18  6:42 ` Bagas Sanjaya
  2023-09-18 11:24 ` Conor Dooley
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Bagas Sanjaya @ 2023-09-18  6:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor

[-- Attachment #1: Type: text/plain, Size: 559 bytes --]

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

Successfully compiled and installed bindeb-pkgs on my computer (Acer
Aspire E15, Intel Core i3 Haswell). No noticeable regressions.

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2023-09-18  6:42 ` Bagas Sanjaya
@ 2023-09-18 11:24 ` Conor Dooley
  2023-09-18 12:08 ` Ron Economos
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Conor Dooley @ 2023-09-18 11:24 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

[-- Attachment #1: Type: text/plain, Size: 371 bytes --]

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.

Tested-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
Conor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2023-09-18 11:24 ` Conor Dooley
@ 2023-09-18 12:08 ` Ron Economos
  2023-09-18 12:48 ` Jon Hunter
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Ron Economos @ 2023-09-18 12:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor

On 9/17/23 12:12 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Built and booted successfully on RISC-V RV64 (HiFive Unmatched).

Tested-by: Ron Economos <re@w6rz.net>


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2023-09-18 12:08 ` Ron Economos
@ 2023-09-18 12:48 ` Jon Hunter
  2023-09-18 18:34 ` Florian Fainelli
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Jon Hunter @ 2023-09-18 12:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
	shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor, linux-tegra, stable

On Sun, 17 Sep 2023 21:12:07 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

All tests passing for Tegra ...

Test results for stable-v6.1:
    11 builds:	11 pass, 0 fail
    28 boots:	28 pass, 0 fail
    130 tests:	130 pass, 0 fail

Linux version:	6.1.54-rc1-g89fc7c511aa5
Boards tested:	tegra124-jetson-tk1, tegra186-p2771-0000,
                tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000,
                tegra20-ventana, tegra210-p2371-2180,
                tegra210-p3450-0000, tegra30-cardhu-a04

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Jon

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (5 preceding siblings ...)
  2023-09-18 12:48 ` Jon Hunter
@ 2023-09-18 18:34 ` Florian Fainelli
  2023-09-18 18:41 ` Guenter Roeck
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Florian Fainelli @ 2023-09-18 18:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, sudipm.mukherjee, srw, rwarsow,
	conor



On 9/17/2023 12:12 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on 
BMIPS_GENERIC:

Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
-- 
Florian


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (6 preceding siblings ...)
  2023-09-18 18:34 ` Florian Fainelli
@ 2023-09-18 18:41 ` Guenter Roeck
  2023-09-18 20:56 ` Naresh Kamboju
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Guenter Roeck @ 2023-09-18 18:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 

Build results:
	total: 157 pass: 157 fail: 0
Qemu test results:
	total: 529 pass: 529 fail: 0

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (7 preceding siblings ...)
  2023-09-18 18:41 ` Guenter Roeck
@ 2023-09-18 20:56 ` Naresh Kamboju
  2023-09-18 22:21 ` Shuah Khan
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 39+ messages in thread
From: Naresh Kamboju @ 2023-09-18 20:56 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow, conor

On Mon, 18 Sept 2023 at 01:30, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>         https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
>         git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h


Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>

## Build
* kernel: 6.1.54-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-6.1.y
* git commit: 89fc7c511aa5cd0b21e82ec42611db04d9e3b7c2
* git describe: v6.1.52-813-g89fc7c511aa5
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.52-813-g89fc7c511aa5

## Test Regressions (compared to v6.1.52)

## Metric Regressions (compared to v6.1.52)

## Test Fixes (compared to v6.1.52)

## Metric Fixes (compared to v6.1.52)

## Test result summary
total: 206086, pass: 176646, fail: 2859, skip: 26303, xfail: 278

## Build Summary
* arc: 10 total, 10 passed, 0 failed
* arm: 284 total, 283 passed, 1 failed
* arm64: 89 total, 87 passed, 2 failed
* i386: 67 total, 65 passed, 2 failed
* mips: 56 total, 54 passed, 2 failed
* parisc: 7 total, 7 passed, 0 failed
* powerpc: 70 total, 68 passed, 2 failed
* riscv: 28 total, 26 passed, 2 failed
* s390: 28 total, 27 passed, 1 failed
* sh: 26 total, 24 passed, 2 failed
* sparc: 14 total, 14 passed, 0 failed
* x86_64: 76 total, 72 passed, 4 failed

## Test suites summary
* boot
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-drivers-dma-buf
* kselftest-exec
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-lib
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-user_events
* kselftest-vDSO
* kselftest-vm
* kselftest-watchdog
* kselftest-x86
* kunit
* kvm-unit-tests
* libgpiod
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* perf
* rcutorture

--
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (8 preceding siblings ...)
  2023-09-18 20:56 ` Naresh Kamboju
@ 2023-09-18 22:21 ` Shuah Khan
       [not found] ` <20230917191042.204185566@linuxfoundation.org>
  2023-09-21 13:04 ` [PATCH 6.1 000/219] 6.1.54-rc1 review Conor Dooley
  11 siblings, 0 replies; 39+ messages in thread
From: Shuah Khan @ 2023-09-18 22:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, stable
  Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
	rwarsow, conor, Shuah Khan

On 9/17/23 13:12, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Tue, 19 Sep 2023 19:10:04 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.54-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
       [not found] ` <20230917191042.204185566@linuxfoundation.org>
@ 2023-09-20  8:11   ` Jeremi Piotrowski
  2023-09-20  8:43     ` Michal Hocko
  2023-09-22 11:14     ` Linux regression tracking #adding (Thorsten Leemhuis)
  0 siblings, 2 replies; 39+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20  8:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, Michal Hocko, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> 6.1-stable review patch.  If anyone has any objections, please let me know.
> 
> ------------------

Hi Greg/Michal,

This commit breaks userspace which makes it a bad commit for mainline and an
even worse commit for stable.

We ingested 6.1.54 into our nightly testing and found that runc fails to gather
cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
fine.

> Address this by wiping out the file completely and effectively get back to
> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

On reads, the runc code checks for MEMCG_KMEM=n by checking
kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
to be there (including kmem.limit_in_bytes). So this change is not effectively
the same.
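
A minimal sketch of the read-side behaviour described above, for illustration
only. It is not runc's actual code: the function names, the
tolerate_missing_limit flag and the fake cgroup directory are all assumptions
made up for this example. Only the two control file names come from this
thread.

```python
import os
import tempfile


def read_u64(path):
    """Read a single integer from a cgroup control file."""
    with open(path) as f:
        return int(f.read().strip())


def read_kmem_stats(cgroup_dir, tolerate_missing_limit=False):
    """Hypothetical stats reader; NOT runc's implementation.

    Returns None when kmem accounting looks disabled, i.e. when
    memory.kmem.usage_in_bytes is absent.
    """
    usage_path = os.path.join(cgroup_dir, "memory.kmem.usage_in_bytes")
    if not os.path.exists(usage_path):
        return None

    stats = {"usage": read_u64(usage_path)}
    limit_path = os.path.join(cgroup_dir, "memory.kmem.limit_in_bytes")
    try:
        stats["limit"] = read_u64(limit_path)
    except FileNotFoundError:
        # Strict mode models a reader that assumes the limit file exists
        # whenever the usage file does, so the whole read fails.
        if not tolerate_missing_limit:
            raise
        stats["limit"] = None
    return stats


def make_fake_cgroup(files):
    """Create a temporary directory standing in for a v1 memory cgroup."""
    d = tempfile.mkdtemp()
    for name, content in files.items():
        with open(os.path.join(d, name), "w") as f:
            f.write(content)
    return d
```

With only the usage file present (the layout after this patch), the strict
call raises FileNotFoundError, while tolerate_missing_limit=True reports the
limit as None.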

Here's a link to the PR that would be needed to handle this change in userspace
(not merged yet and would need to be propagated through the ecosystem):

https://github.com/opencontainers/runc/pull/4018.

Jeremi

> 
> From: Michal Hocko <mhocko@suse.com>
> 
> commit 86327e8eb94c52eca4f93cfece2e29d1bf52acbf upstream.
> 
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16.  We haven't heard about any serious
> users since then but it seems that the mere presence of the file is
> causing more harm than good.  We (SUSE) have had several bug reports from
> customers where Docker based containers started to fail because a write to
> kmem.limit_in_bytes has failed.
> 
> This was unexpected because runc code only expects ENOENT (kmem disabled)
> or EBUSY (tasks already running within cgroup).  So a new error code was
> unexpected and the whole container startup failed.  This has been later
> addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore.  There
> are still older versions of Docker in use and likely hard to get rid of
> completely.
> 
> Address this by wiping out the file completely and effectively get back to
> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
> 
> [mhocko@suse.com: restore _KMEM switch case]
>   Link: https://lkml.kernel.org/r/ZKe5wxdbvPi5Cwd7@dhcp22.suse.cz
> Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Acked-by: Shakeel Butt <shakeelb@google.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |    2 --
>  mm/memcontrol.c                                |   10 ----------
>  2 files changed, 12 deletions(-)
> 
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -91,8 +91,6 @@ Brief summary of control files.
>   memory.oom_control		     set/show oom controls.
>   memory.numa_stat		     show the number of memory usage per numa
>  				     node
> - memory.kmem.limit_in_bytes          This knob is deprecated and writing to
> -                                     it will return -ENOTSUPP.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>  				     hits limits
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3841,10 +3841,6 @@ static ssize_t mem_cgroup_write(struct k
>  		case _MEMSWAP:
>  			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
>  			break;
> -		case _KMEM:
> -			/* kmem.limit_in_bytes is deprecated. */
> -			ret = -EOPNOTSUPP;
> -			break;
>  		case _TCP:
>  			ret = memcg_update_tcp_max(memcg, nr_pages);
>  			break;
> @@ -5056,12 +5052,6 @@ static struct cftype mem_cgroup_legacy_f
>  	},
>  #endif
>  	{
> -		.name = "kmem.limit_in_bytes",
> -		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> -		.write = mem_cgroup_write,
> -		.read_u64 = mem_cgroup_read_u64,
> -	},
> -	{
>  		.name = "kmem.usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
>  		.read_u64 = mem_cgroup_read_u64,
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:11   ` [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Jeremi Piotrowski
@ 2023-09-20  8:43     ` Michal Hocko
  2023-09-20  9:25       ` Greg Kroah-Hartman
  2023-09-20 10:04       ` Jeremi Piotrowski
  2023-09-22 11:14     ` Linux regression tracking #adding (Thorsten Leemhuis)
  1 sibling, 2 replies; 39+ messages in thread
From: Michal Hocko @ 2023-09-20  8:43 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> > 6.1-stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> 
> Hi Greg/Michal,
> 
> This commit breaks userspace which makes it a bad commit for mainline and an
> even worse commit for stable.
> 
> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> fine.

Could you expand some more on why the file is read? It hasn't supported
writing for some time, so how does reading it help in any sense?

Anyway, I do agree that the stable backport should be reverted.

> > Address this by wiping out the file completely and effectively get back to
> > pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> On reads, the runc code checks for MEMCG_KMEM=n by checking
> kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
> to be there (including kmem.limit_in_bytes). So this change is not effectively
> the same.
> 
> Here's a link to the PR that would be needed to handle this change in userspace
> (not merged yet and would need to be propagated through the ecosystem):
> 
> https://github.com/opencontainers/runc/pull/4018.

Thanks. Does that mean the revert is still necessary for the Linus tree
or do you expect that the fix can be merged and propagated in a
reasonable time?

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:43     ` Michal Hocko
@ 2023-09-20  9:25       ` Greg Kroah-Hartman
  2023-09-20 10:21         ` Jeremi Piotrowski
  2023-09-20 10:04       ` Jeremi Piotrowski
  1 sibling, 1 reply; 39+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-20  9:25 UTC (permalink / raw)
  To: Michal Hocko, Jeremi Piotrowski
  Cc: stable, patches, Shakeel Butt, Johannes Weiner, Roman Gushchin,
	Muchun Song, Tejun Heo, Andrew Morton, linux-kernel, regressions,
	mathieu.tortuyaux

On Wed, Sep 20, 2023 at 10:43:56AM +0200, Michal Hocko wrote:
> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> > On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> > > 6.1-stable review patch.  If anyone has any objections, please let me know.
> > > 
> > > ------------------
> > 
> > Hi Greg/Michal,
> > 
> > This commit breaks userspace which makes it a bad commit for mainline and an
> > even worse commit for stable.
> > 
> > We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> > cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> > into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> > fine.
> 
> Could you expand some more on why is the file read? It doesn't support
> writing to it for some time so how does reading it helps in any sense?
> 
> Anyway, I do agree that the stable backport should be reverted.

That will just postpone the breakage, we really shouldn't break
userspace.

That being said, having userspace "break" because a file is no longer
present is not good coding style on the userspace side at all.  That's
why we have sysfs and single-value-files now, if the file isn't present,
then userspace instantly notices and can handle it.  Much easier than
the old-style multi-fields-in-one-file problem.

> > > Address this by wiping out the file completely and effectively get back to
> > > pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.

The fact that this is a valid option (i.e. no file) with that config
option disabled makes me want to keep this as well, as how does
userspace handle this option disabled at all?  Or old kernels?

I can drop this from stable kernels, but again, this feels like the runc
developers are just postponing the problem...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:43     ` Michal Hocko
  2023-09-20  9:25       ` Greg Kroah-Hartman
@ 2023-09-20 10:04       ` Jeremi Piotrowski
  2023-09-20 11:07         ` Michal Hocko
  1 sibling, 1 reply; 39+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20 10:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On 9/20/2023 10:43 AM, Michal Hocko wrote:
> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>>
>>> ------------------
>>
>> Hi Greg/Michal,
>>
>> This commit breaks userspace which makes it a bad commit for mainline and an
>> even worse commit for stable.
>>
>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
>> fine.
> 
> Could you expand some more on why is the file read? It doesn't support
> writing to it for some time so how does reading it helps in any sense?
> 
> Anyway, I do agree that the stable backport should be reverted.
> 

This file is read together with all the other memcg files. Each prefix:

memory
memory.memsw
memory.kmem
memory.kmem.tcp

is combined with these suffixes

.usage_in_bytes
.max_usage_in_bytes
.failcnt
.limit_in_bytes

and read, the values are then forwarded on to other components for scheduling decisions.
You want to know the limit when checking the usage (is the usage close to the limit or not).
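
To illustrate the enumeration above, here is a minimal sketch in Go. It is not
the runc code; the function name memcgStatFiles is made up for this example,
and only the prefixes and suffixes listed above are taken from the real layout.

```go
package main

import "fmt"

// memcgStatFiles returns the cgroup v1 memory controller file names obtained
// by combining every prefix with every suffix listed in the message above.
// The function name is hypothetical and exists only for this sketch.
func memcgStatFiles() []string {
	prefixes := []string{"memory", "memory.memsw", "memory.kmem", "memory.kmem.tcp"}
	suffixes := []string{"usage_in_bytes", "max_usage_in_bytes", "failcnt", "limit_in_bytes"}

	files := make([]string, 0, len(prefixes)*len(suffixes))
	for _, p := range prefixes {
		for _, s := range suffixes {
			files = append(files, p+"."+s)
		}
	}
	return files
}

func main() {
	// Print all 16 candidate file names, one per line.
	for _, f := range memcgStatFiles() {
		fmt.Println(f)
	}
}
```

With 6.1.54, memory.kmem.limit_in_bytes is the one name out of this set that
no longer exists.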

Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
set missing is an anomaly. So maybe we could keep the dummy file just for the
sake of consistency? Cgroupv1 is legacy after all.

>>> Address this by wiping out the file completely and effectively get back to
>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
>>
>> On reads, the runc code checks for MEMCG_KMEM=n by checking
>> kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
>> to be there (including kmem.limit_in_bytes). So this change is not effectively
>> the same.
>>
>> Here's a link to the PR that would be needed to handle this change in userspace
>> (not merged yet and would need to be propagated through the ecosystem):
>>
>> https://github.com/opencontainers/runc/pull/4018.
> 
> Thanks. Does that mean the revert is still necessary for the Linus tree
> or do you expect that the fix can be merged and propagated in a
> reasonable time?
> 

We can probably get runc and currently supported kubernetes versions patched in time
before 6.6 (or the next LTS kernel) hits LTS distros.

But there's still a bunch of users running cgroupv1 with unsupported kubernetes
versions that are still taking kernel updates as they come, so this might get reported
again next year if it stays in mainline.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  9:25       ` Greg Kroah-Hartman
@ 2023-09-20 10:21         ` Jeremi Piotrowski
  2023-09-20 10:45           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 39+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20 10:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Michal Hocko
  Cc: stable, patches, Shakeel Butt, Johannes Weiner, Roman Gushchin,
	Muchun Song, Tejun Heo, Andrew Morton, linux-kernel, regressions,
	mathieu.tortuyaux

On 9/20/2023 11:25 AM, Greg Kroah-Hartman wrote:
> On Wed, Sep 20, 2023 at 10:43:56AM +0200, Michal Hocko wrote:
>> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
>>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>>>
>>>> ------------------
>>>
>>> Hi Greg/Michal,
>>>
>>> This commit breaks userspace which makes it a bad commit for mainline and an
>>> even worse commit for stable.
>>>
>>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
>>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
>>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
>>> fine.
>>
>> Could you expand some more on why is the file read? It doesn't support
>> writing to it for some time so how does reading it helps in any sense?
>>
>> Anyway, I do agree that the stable backport should be reverted.
> 
> That will just postpone the breakage, we really shouldn't break
> userspace.
> 
> That being said, having userspace "break" because a file is no longer
> present is not good coding style on the userspace side at all.  That's
> why we have sysfs and single-value-files now, if the file isn't present,
> then userspace instantly notices and can handle it.  Much easier than
> the old-style multi-fields-in-one-file problem.
> 

The memcg files in this case are single-value, but userspace expects to be able
to read memcg limits when it can read the usage (indicating MEMCG is enabled).
If it can't - then something is off, and the node is marked unhealthy.

>>>> Address this by wiping out the file completely and effectively get back to
>>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> The fact that this is a valid option (i.e. no file) with that config
> option disabled makes me want to keep this as well, as how does
> userspace handle this option disabled at all?  Or old kernels?
> 

Userspace has had to handle the case of MEMCG_KMEM=n, but that had 2 cases so far:

limits/usage/max_usage/failcnt files are all available or none of them are available.

Now it needs to handle 3 of 4 files being available, but only for kmem (and not plain
memory, memsw or kmem.tcp). That's an inconsistency.

> I can drop this from stable kernels, but again, this feels like the runc
> developers are just postponing the problem...
>

Since cgroups v1 is deprecated, I think the runc developers haven't touched this part
of the code in years and expected it to keep working while they wait for the long tail
of usage to die out.

> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 10:21         ` Jeremi Piotrowski
@ 2023-09-20 10:45           ` Greg Kroah-Hartman
  2023-09-20 11:08             ` Michal Hocko
  0 siblings, 1 reply; 39+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-20 10:45 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Michal Hocko, stable, patches, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 12:21:37PM +0200, Jeremi Piotrowski wrote:
> On 9/20/2023 11:25 AM, Greg Kroah-Hartman wrote:
> > On Wed, Sep 20, 2023 at 10:43:56AM +0200, Michal Hocko wrote:
> >> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> >>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> >>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
> >>>>
> >>>> ------------------
> >>>
> >>> Hi Greg/Michal,
> >>>
> >>> This commit breaks userspace which makes it a bad commit for mainline and an
> >>> even worse commit for stable.
> >>>
> >>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> >>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> >>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> >>> fine.
> >>
> >> Could you expand some more on why is the file read? It doesn't support
> >> writing to it for some time so how does reading it helps in any sense?
> >>
> >> Anyway, I do agree that the stable backport should be reverted.
> > 
> > That will just postpone the breakage, we really shouldn't break
> > userspace.
> > 
> > That being said, having userspace "break" because a file is no longer
> > present is not good coding style on the userspace side at all.  That's
> > why we have sysfs and single-value-files now, if the file isn't present,
> > then userspace instantly notices and can handle it.  Much easier than
> > the old-style multi-fields-in-one-file problem.
> > 
> 
> The memcg files in this case are single-value, but userspace expects to be able
> to read memcg limits when it can read the usage (indicating MEMCG is enabled).
> If it can't - then something is off, and the node is marked unhealthy.
> 
> >>>> Address this by wiping out the file completely and effectively get back to
> >>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> > 
> > The fact that this is a valid option (i.e. no file) with that config
> > option disabled makes me want to keep this as well, as how does
> > userspace handle this option disabled at all?  Or old kernels?
> > 
> 
> Userspace has had to handle the case of MEMCG_KMEM=n, but that had 2 cases so far:
> 
> limits/usage/max_usage/failcnt files are all available or none of them are available.
> 
> Now it needs to handle 3 of 4 files being available, but only for kmem (and not plain
> memory, memsw or kmem.tcp). That's an inconsistency.
> 
> > I can drop this from stable kernels, but again, this feels like the runc
> > developers are just postponing the problem...
> >
> 
> Since cgroups v1 is deprecated, I think the runc developers haven't touched this part
> of the code in years and expected it to keep working while they wait for the long tail
> of usage to die out.

Ok, then we should revert this, I'll go drop it in the stable trees, it
should also be reverted in Linus's tree too.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 10:04       ` Jeremi Piotrowski
@ 2023-09-20 11:07         ` Michal Hocko
  2023-09-20 13:25           ` Jeremi Piotrowski
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Hocko @ 2023-09-20 11:07 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 12:04:48, Jeremi Piotrowski wrote:
> On 9/20/2023 10:43 AM, Michal Hocko wrote:
> > On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
> >> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
> >>> 6.1-stable review patch.  If anyone has any objections, please let me know.
> >>>
> >>> ------------------
> >>
> >> Hi Greg/Michal,
> >>
> >> This commit breaks userspace which makes it a bad commit for mainline and an
> >> even worse commit for stable.
> >>
> >> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> >> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> >> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> >> fine.
> > 
> > Could you expand some more on why is the file read? It doesn't support
> > writing to it for some time so how does reading it helps in any sense?
> > 
> > Anyway, I do agree that the stable backport should be reverted.
> > 
> 
> This file is read together with all the other memcg files. Each prefix:
> 
> memory
> memory.memsw
> memory.kmem
> memory.kmem.tcp
> 
> is combined with these suffixes
> 
> .usage_in_bytes
> .max_usage_in_bytes
> .failcnt
> .limit_in_bytes
> 
> and read, the values are then forwarded on to other components for scheduling decisions.
> You want to know the limit when checking the usage (is the usage close to the limit or not).

You know there is no kmem limit as there is no way to set it for some
time (since 5.16 - i.e. 2 years ago). I can see that users following old
kernels could have missed that though.

> Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
> set missing is an anomaly. So maybe we could keep the dummy file just for the
> sake of consistency? Cgroupv1 is legacy after all.

What we had was a dummy file. It didn't allow any value to be written, so
it would have always reported unlimited. The reason I decided to remove
the file was that there were other users not able to handle the write
failure, while they are just fine with not having the file. So we are
effectively between a rock and a hard place here. Either way something is
broken. The other SW got fixed as well but, similar to your case, it takes
some time to absorb the change through all 3rd party users.

> >>> Address this by wiping out the file completely and effectively get back to
> >>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> >>
> >> On reads, the runc code checks for MEMCG_KMEM=n by checking
> >> kmem.usage_in_bytes.

Just one side note. Config options get renamed and their semantics
change over time, so I would just recommend never depending on any
specific one.

> >> If it is present then runc expects the other cgroup files
> >> to be there (including kmem.limit_in_bytes). So this change is not effectively
> >> the same.
> >>
> >> Here's a link to the PR that would be needed to handle this change in userspace
> >> (not merged yet and would need to be propagated through the ecosystem):
> >>
> >> https://github.com/opencontainers/runc/pull/4018.
> > 
> > Thanks. Does that mean the revert is still necessary for the Linus tree
> > or do you expect that the fix can be merged and propagated in a
> > reasonable time?
> > 
> 
> We can probably get runc and currently supported kubernetes versions patched in time
> before 6.6 (or the next LTS kernel) hits LTS distros.
> 
> But there's still a bunch of users running cgroupv1 with unsupported kubernetes
> versions that are still taking kernel updates as they come, so this might get reported
> again next year if it stays in mainline.

I can see how 3rd party users are hard to get aligned but having a fix
available should allow them to apply it or is there any actual roadblock
for them to adapt as soon as they hit the issue?

I mean, normally I would be just fine reverting this API change because
it is disruptive but the only way to have the file available and not
break somebody is to revert 58056f77502f ("memcg, kmem: further
deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
there but that sounds rather dubious. Although one could argue this
would mimic nokmem kernel option.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 10:45           ` Greg Kroah-Hartman
@ 2023-09-20 11:08             ` Michal Hocko
  2023-09-20 11:16               ` Greg Kroah-Hartman
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Hocko @ 2023-09-20 11:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jeremi Piotrowski, stable, patches, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 12:45:08, Greg KH wrote:
[...]
> Ok, then we should revert this, I'll go drop it in the stable trees, it
> should also be reverted in Linus's tree too.

A simple revert would break other users, as noted in my other response, so
please wait with sending reverts to Linus until we agree on the least painful
solution.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 11:08             ` Michal Hocko
@ 2023-09-20 11:16               ` Greg Kroah-Hartman
  0 siblings, 0 replies; 39+ messages in thread
From: Greg Kroah-Hartman @ 2023-09-20 11:16 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, stable, patches, Shakeel Butt, Johannes Weiner,
	Roman Gushchin, Muchun Song, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 01:08:43PM +0200, Michal Hocko wrote:
> On Wed 20-09-23 12:45:08, Greg KH wrote:
> [...]
> > Ok, then we should revert this, I'll go drop it in the stable trees, it
> > should also be reverted in Linus's tree too.
> 
> A simple revert would break other users, as noted in my other response, so
> please wait with sending reverts to Linus until we agree on the least painful
> solution.

A revert should cause the systems that stopped working to start working
again, so I'll keep the revert in the stable trees and wait for you to
work out the real solution in Linus's tree and then backport all of them
as needed.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 11:07         ` Michal Hocko
@ 2023-09-20 13:25           ` Jeremi Piotrowski
  2023-09-20 13:47             ` Michal Hocko
  0 siblings, 1 reply; 39+ messages in thread
From: Jeremi Piotrowski @ 2023-09-20 13:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Greg Kroah-Hartman, stable, patches, Shakeel Butt,
	Johannes Weiner, Roman Gushchin, Muchun Song, Tejun Heo,
	Andrew Morton, linux-kernel, regressions, mathieu.tortuyaux

On 9/20/2023 1:07 PM, Michal Hocko wrote:
> On Wed 20-09-23 12:04:48, Jeremi Piotrowski wrote:
>> On 9/20/2023 10:43 AM, Michal Hocko wrote:
>>> On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
>>>> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>>>>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>>>>
>>>>> ------------------
>>>>
>>>> Hi Greg/Michal,
>>>>
>>>> This commit breaks userspace which makes it a bad commit for mainline and an
>>>> even worse commit for stable.
>>>>
>>>> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
>>>> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
>>>> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
>>>> fine.
>>>
>>> Could you expand some more on why is the file read? It doesn't support
>>> writing to it for some time so how does reading it helps in any sense?
>>>
>>> Anyway, I do agree that the stable backport should be reverted.
>>>
>>
>> This file is read together with all the other memcg files. Each prefix:
>>
>> memory
>> memory.memsw
>> memory.kmem
>> memory.kmem.tcp
>>
>> is combined with these suffixes
>>
>> .usage_in_bytes
>> .max_usage_in_bytes
>> .failcnt
>> .limit_in_bytes
>>
>> and read, the values are then forwarded on to other components for scheduling decisions.
>> You want to know the limit when checking the usage (is the usage close to the limit or not).
> 
> You know there is no kmem limit as there is no way to set it for some
> time (since 5.16 - i.e. 2 years ago). I can see that users following old
> kernels could have missed that though.

I know what you mean, but I think this generally went unnoticed because the limit file is read
unconditionally, but only written when a kmem limit is explicitly requested for a specific
container, which is rarely (if ever) done.

Regarding following old kernels: a majority of kubernetes users are still on 5.15 and only slowly
started shifting to >=6.1 very recently (this summer). This is mostly driven by distro vendor
policies which tend to follow the pattern of "follow LTS kernels but don't switch to the next
LTS immediately".

I know this is far from ideal for reporting these kinds of issues; I would love to report
them as soon as a kernel release happens.

> 
>> Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
>> set missing is an anomaly. So maybe we could keep the dummy file just for the
>> sake of consistency? Cgroupv1 is legacy after all.
> 
> What we had was a dummy file. It didn't allow any value to be written, so
> it would have always reported unlimited. The reason I decided to remove
> the file was that there were other users not able to handle the write
> failure, while they are just fine with not having the file. So we are
> effectively between a rock and a hard place here. Either way something is
> broken. The other SW got fixed as well but, similar to your case, it takes
> some time to absorb the change through all 3rd party users.
> 
>>>>> Address this by wiping out the file completely and effectively get back to
>>>>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
>>>>
>>>> On reads, the runc code checks for MEMCG_KMEM=n by checking
>>>> kmem.usage_in_bytes.
> 
> Just one side note. Config options get renamed and their semantics
> change over time, so I would just recommend never depending on any
> specific one.
> 

Right, what I meant is that the logic is as follows: runc checks the "usage"
file to determine whether the controller is available:

    value, err := fscommon.GetCgroupParamUint(path, usage)
    if err != nil {
        if name != "" && os.IsNotExist(err) {
            // Ignore ENOENT as swap and kmem controllers
            // are optional in the kernel.
            return cgroups.MemoryData{}, nil
        }
        return cgroups.MemoryData{}, err
    }

and if it is, then it proceeds to read "limit_in_bytes" and the others.

>>>> If it is present then runc expects the other cgroup files
>>>> to be there (including kmem.limit_in_bytes). So this change is not effectively
>>>> the same.
>>>>
>>>> Here's a link to the PR that would be needed to handle this change in userspace
>>>> (not merged yet and would need to be propagated through the ecosystem):
>>>>
>>>> https://github.com/opencontainers/runc/pull/4018.
>>>
>>> Thanks. Does that mean the revert is still necessary for the Linus tree
>>> or do you expect that the fix can be merged and propagated in a
>>> reasonable time?
>>>
>>
>> We can probably get runc and currently supported kubernetes versions patched in time
>> before 6.6 (or the next LTS kernel) hits LTS distros.
>>
>> But there's still a bunch of users running cgroupv1 with unsupported kubernetes
>> versions that are still taking kernel updates as they come, so this might get reported
>> again next year if it stays in mainline.
> 
> I can see how 3rd party users are hard to get aligned but having a fix
> available should allow them to apply it or is there any actual roadblock
> for them to adapt as soon as they hit the issue?
> 

The issue with this is that these users are running a frozen set of kubernetes (+runc)
binaries, but still pull kernel updates from the base distro. These kubernetes versions
are out of maintenance so the code will not get fixed and no one will release fixed
binaries.

> I mean, normally I would be just fine reverting this API change because
> it is disruptive but the only way to have the file available and not
> break somebody is to revert 58056f77502f ("memcg, kmem: further
> deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> there but that sounds rather dubious. Although one could argue this
> would mimic nokmem kernel option.
> 

I just want to make sure we don't introduce yet another new behavior in this legacy
system. I have not seen breakage due to 58056f77502f. Mimicking nokmem sounds good, but
does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
(i.e. don't even store the written limit)? The latter might have unintended consequences.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 13:25           ` Jeremi Piotrowski
@ 2023-09-20 13:47             ` Michal Hocko
  2023-09-20 15:32               ` Shakeel Butt
  2023-09-22 23:00               ` Roman Gushchin
  0 siblings, 2 replies; 39+ messages in thread
From: Michal Hocko @ 2023-09-20 13:47 UTC (permalink / raw)
  To: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Roman Gushchin,
	Muchun Song
  Cc: Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> On 9/20/2023 1:07 PM, Michal Hocko wrote:
[...]
> > I mean, normally I would be just fine reverting this API change because
> > it is disruptive but the only way to have the file available and not
> > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > there but that sounds rather dubious. Although one could argue this
> > would mimic nokmem kernel option.
> > 
> 
> I just want to make sure we don't introduce yet another new behavior in this legacy
> system. I have not seen breakage due to 58056f77502f. Mimicking nokmem sounds good, but
> does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> (i.e. don't even store the written limit)? The latter might have unintended consequences.

Yes, it would mean that the limit is never enforced. Bad as that is, the
hard limit on kernel memory is broken by design and unfixable. It causes
all sorts of unexpected kernel allocation failures, so it is simply unsafe
to use.

All that being said, I can see the following options:
1) keep the current upstream status and not export the file
2) revert both 58056f77502f and 86327e8eb94 and make it clear
   that kmem.limit_in_bytes is unsupported so failures or misbehavior
   as a result of the limit being hit are likely not going to be
   investigated or fixed.
3) revert as in 2) but never enforce the limit (so basically nokmem
   semantics)

Shakeel, Johannes, Roman, Muchun Song what do you think?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 13:47             ` Michal Hocko
@ 2023-09-20 15:32               ` Shakeel Butt
  2023-09-20 16:55                 ` Michal Hocko
  2023-09-22 23:00               ` Roman Gushchin
  1 sibling, 1 reply; 39+ messages in thread
From: Shakeel Butt @ 2023-09-20 15:32 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 6:47 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> [...]
> > > I mean, normally I would be just fine reverting this API change because
> > > it is disruptive but the only way to have the file available and not
> > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > there but that sounds rather dubious. Although one could argue this
> > > would mimic nokmem kernel option.
> > >
> >
> > I just want to make sure we don't introduce yet another new behavior in this legacy
> > system. I have not seen breakage due to 58056f77502f. Mimicking nokmem sounds good, but
> > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > (i.e. don't even store the written limit)? The latter might have unintended consequences.
>
> Yes, it would mean that the limit is never enforced. Bad as that is, the
> hard limit on kernel memory is broken by design and unfixable. It causes
> all sorts of unexpected kernel allocation failures, so it is simply unsafe
> to use.
>
> All that being said, I can see the following options:
> 1) keep the current upstream status and not export the file
> 2) revert both 58056f77502f and 86327e8eb94 and make it clear
>    that kmem.limit_in_bytes is unsupported so failures or misbehavior
>    as a result of the limit being hit are likely not going to be
>    investigated or fixed.
> 3) revert as in 2) but never enforce the limit (so basically nokmem
>    semantics)
>
> Shakeel, Johannes, Roman, Muchun Song what do you think?

I think the safe option would be to revert 86327e8eb94 for now and put
pr_warn_once even for the read of kmem.limit_in_bytes? We can retry
86327e8eb94 in a year or so.

However, personally I would prefer option 1. Also, I don't think
reverting 58056f77502f would give any benefit.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 15:32               ` Shakeel Butt
@ 2023-09-20 16:55                 ` Michal Hocko
  2023-09-20 19:46                   ` Shakeel Butt
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Hocko @ 2023-09-20 16:55 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 08:32:42, Shakeel Butt wrote:
> Also I don't think reverting 58056f77502f would give any benefit.

Not reverting 58056f77502f would re-introduce the regression in some
non-patched versions of Docker runtimes which cannot handle EOPNOTSUPP.
So I think we need to revert both or none of them. I would prefer the
latter (option 1) as the fix is trivial, but I do understand the headache
of chasing all those outdated deployments or vendor code forks.
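
For illustration only, a userspace consumer could tolerate both failure
modes discussed here (the file being absent, and writes being rejected
with EOPNOTSUPP) roughly as follows. This is a sketch under assumptions:
set_kmem_limit() and kmem_limit_error_is_benign() are hypothetical names
and this is not code taken from Docker or any other runtime.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: errors that only mean "this deprecated knob is not
 * available on this kernel" and can therefore be ignored by the caller.
 */
static bool kmem_limit_error_is_benign(int err)
{
	return err == ENOENT || err == EOPNOTSUPP;
}

/*
 * Hypothetical helper: try to write a kmem limit. Returns 0 on success or
 * when the failure is benign, and a negative errno value otherwise.
 */
static int set_kmem_limit(const char *path, const char *value)
{
	int fd, err = 0;

	assert(path != NULL && value != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0) {
		err = errno;
		return kmem_limit_error_is_benign(err) ? 0 : -err;
	}

	if (write(fd, value, strlen(value)) < 0)
		err = errno;
	close(fd);

	if (err == 0 || kmem_limit_error_is_benign(err))
		return 0;
	return -err;
}
```

A caller would pass the cgroup v1 path of memory.kmem.limit_in_bytes and
treat a zero return as "nothing to worry about", whether or not the
kernel still exports the file.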
-- 
Michal Hocko
SUSE Labs

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 16:55                 ` Michal Hocko
@ 2023-09-20 19:46                   ` Shakeel Butt
  2023-09-20 20:08                     ` Michal Hocko
  0 siblings, 1 reply; 39+ messages in thread
From: Shakeel Butt @ 2023-09-20 19:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 9:55 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 20-09-23 08:32:42, Shakeel Butt wrote:
> > Also I don't think reverting 58056f77502f would give any benefit.
>
> Not reverting 58056f77502f would re-introduce the regression in some
> non-patched versions of Docker runtimes which cannot handle ENOTSUPP.
> So I think we need to revert both or none of them. I would prefer the
> later (option 1) as the fix is trivial but I do understand headache
> of chasing all those outdated deployments or vendor code forks.

I think that would be too conservative an approach, but I don't
have a strong opinion against it. Also, just to be clear, we are not
talking about a full revert of 58056f77502f but only of the EOPNOTSUPP
return, right?

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 19:46                   ` Shakeel Butt
@ 2023-09-20 20:08                     ` Michal Hocko
  2023-09-20 21:46                       ` Shakeel Butt
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Hocko @ 2023-09-20 20:08 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 12:46:23, Shakeel Butt wrote:
> On Wed, Sep 20, 2023 at 9:55 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Wed 20-09-23 08:32:42, Shakeel Butt wrote:
> > > Also I don't think reverting 58056f77502f would give any benefit.
> >
> > Not reverting 58056f77502f would re-introduce the regression in some
> > non-patched versions of Docker runtimes which cannot handle ENOTSUPP.
> > So I think we need to revert both or none of them. I would prefer the
> > later (option 1) as the fix is trivial but I do understand headache
> > of chasing all those outdated deployments or vendor code forks.
> 
> I think that would be too much conservative an approach but I don't

Well, TBH I do not really see any difference between one set of broken
userspace and the other. Both are making assumptions about our interfaces
and unfortunately they do not overlap.

> have a strong opinion against it. Also just to be clear we are not
> talking about full revert of 58056f77502f but just the returning of
> EOPNOTSUPP, right?

If we allow the limit to be set without returning a failure then we
still have options 2 and 3 on how to deal with that. One of them is to
enforce the limit.

-- 
Michal Hocko
SUSE Labs

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 20:08                     ` Michal Hocko
@ 2023-09-20 21:46                       ` Shakeel Butt
  2023-09-21  7:52                         ` Michal Hocko
  0 siblings, 1 reply; 39+ messages in thread
From: Shakeel Butt @ 2023-09-20 21:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
>
[...]
> > have a strong opinion against it. Also just to be clear we are not
> > talking about full revert of 58056f77502f but just the returning of
> > EOPNOTSUPP, right?
>
> If we allow the limit to be set without returning a failure then we
> still have options 2 and 3 on how to deal with that. One of them is to
> enforce the limit.
>

Option 3 is a partial revert of 58056f77502f where we keep the no
limit enforcement and remove the EOPNOTSUPP return on write. Let's go
with option 3. In addition, let's add pr_warn_once on the read of
kmem.limit_in_bytes as well.
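
To make the intended option 3 semantics concrete, here is a small
userspace model (not kernel code). The struct and function names are
hypothetical, and ULLONG_MAX merely stands in for "no limit"; the point
is only that a write succeeds, changes nothing, and warns a single time.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Toy model of the deprecated knob; every name here is hypothetical. */
struct kmem_knob_model {
	unsigned long long limit;	/* value reported on read, never updated */
	unsigned int warnings;		/* number of deprecation notices printed */
};

/* A write is accepted (returns 0) but the requested value is discarded. */
static int kmem_knob_write(struct kmem_knob_model *k,
			   unsigned long long requested)
{
	assert(k != NULL);
	(void)requested;	/* intentionally ignored: the limit is never enforced */

	if (k->warnings == 0) {
		fprintf(stderr,
			"kmem.limit_in_bytes is deprecated; writes have no effect\n");
		k->warnings = 1;
	}
	return 0;
}

/* A read keeps reporting the initial (unlimited) value. */
static unsigned long long kmem_knob_read(const struct kmem_knob_model *k)
{
	assert(k != NULL);
	return k->limit;
}
```

With a model initialised to { ULLONG_MAX, 0 }, any number of writes
return 0, the read value stays at ULLONG_MAX, and only the first write
prints the notice.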

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 21:46                       ` Shakeel Butt
@ 2023-09-21  7:52                         ` Michal Hocko
  2023-09-21 10:43                           ` Jeremi Piotrowski
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Hocko @ 2023-09-21  7:52 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> [...]
> > > have a strong opinion against it. Also just to be clear we are not
> > > talking about full revert of 58056f77502f but just the returning of
> > > EOPNOTSUPP, right?
> >
> > If we allow the limit to be set without returning a failure then we
> > still have options 2 and 3 on how to deal with that. One of them is to
> > enforce the limit.
> >
> 
> Option 3 is a partial revert of 58056f77502f where we keep the no
> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> with option 3. In addition, let's add pr_warn_once on the read of
> kmem.limit_in_bytes as well.

How about this?
--- 
From 81ae0797d8da1b9cfbf357b4be4787a5bbf46bb4 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation

This reverts commit 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes"), which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required, but the non-existent file confuses the userspace,
which fails as a result. A patch to fix this on the userspace side
has been submitted, but it is hard to predict how it will propagate
through the maze of 3rd party consumers of the software.

Now, reverting 86327e8eb94c alone is not an option because there is
another set of userspace which cannot cope with EOPNOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well, or we keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
The latter should work for both known breaking workloads, which depend on
the existence of the file but do not depend on hard limit enforcement.

[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
 mm/memcontrol.c                                | 12 ++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
+ memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
+                                     memory hard limit. Kernel hard limit is not
+                                     supported since 5.16. Writing any value to
+                                     the file will not have any effect, same as if
+                                     the nokmem kernel parameter was specified.
+                                     Kernel memory is still charged and reported
+                                     by memory.kmem.usage_in_bytes.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..ac7f14b2338d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
+	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 		goto out;
 
 	memcg_account_kmem(memcg, nr_pages);
+
+	/* There is no way to set up kmem hard limit so this operation cannot fail */
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
 out:
 	css_put(&memcg->css);
 
@@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
+		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Writing any value to this file has no effect. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
+			ret = 0;
+			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
-- 
2.30.2

-- 
Michal Hocko
SUSE Labs

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21  7:52                         ` Michal Hocko
@ 2023-09-21 10:43                           ` Jeremi Piotrowski
  2023-09-21 11:21                             ` Michal Hocko
  0 siblings, 1 reply; 39+ messages in thread
From: Jeremi Piotrowski @ 2023-09-21 10:43 UTC (permalink / raw)
  To: Michal Hocko, Shakeel Butt
  Cc: Johannes Weiner, Roman Gushchin, Muchun Song, Greg Kroah-Hartman,
	stable, patches, Tejun Heo, Andrew Morton, linux-kernel,
	regressions, mathieu.tortuyaux

On 9/21/2023 9:52 AM, Michal Hocko wrote:
> On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
>> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
>>>
>> [...]
>>>> have a strong opinion against it. Also just to be clear we are not
>>>> talking about full revert of 58056f77502f but just the returning of
>>>> EOPNOTSUPP, right?
>>>
>>> If we allow the limit to be set without returning a failure then we
>>> still have options 2 and 3 on how to deal with that. One of them is to
>>> enforce the limit.
>>>
>>
>> Option 3 is a partial revert of 58056f77502f where we keep the no
>> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
>> with option 3. In addition, let's add pr_warn_once on the read of
>> kmem.limit_in_bytes as well.
> 
> How about this?
> --- 

I'm OK with this approach. You're missing this in the patch below:

// static struct cftype mem_cgroup_legacy_files[] = {

+       {
+               .name = "kmem.limit_in_bytes",
+               .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
+               .write = mem_cgroup_write,
+               .read_u64 = mem_cgroup_read_u64,
+       },


Thanks,
Jeremi

> From 81ae0797d8da1b9cfbf357b4be4787a5bbf46bb4 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Thu, 21 Sep 2023 09:38:29 +0200
> Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation
> 
> This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> and partially reverts 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") which have incrementally removed support for the
> kernel memory accounting hard limit. Unfortunately it has turned out
> that there is still userspace depending on the existence of
> memory.kmem.limit_in_bytes [1]. The underlying functionality is not
> really required but the non-existent file just confuses the userspace
> which fails in the result. The patch to fix this on the userspace side
> has been submitted but it is hard to predict how it will propagate
> through the maze of 3rd party consumers of the software.
> 
> Now, reverting alone 86327e8eb94c is not an option because there is
> another set of userspace which cannot cope with ENOTSUPP returned when
> writing to the file. Therefore we have to go and revisit 58056f77502f
> as well. There are two ways to go ahead. Either we give up on the
> deprecation and fully revert 58056f77502f as well or we can keep
> kmem.limit_in_bytes but make the write a noop and warn about the fact.
> This should work for both known breaking workloads which depend on the
> existence but do not depend on the hard limit enforcement.
> 
> [1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
> Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
>  mm/memcontrol.c                                | 12 ++++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 5f502bf68fbc..ff456871bf4b 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,6 +92,13 @@ Brief summary of control files.
>   memory.oom_control		     set/show oom controls.
>   memory.numa_stat		     show the number of memory usage per numa
>  				     node
> + memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
> +                                     memory hard limit. Kernel hard limit is not
> +                                     supported since 5.16. Writing any value to
> +                                     do file will not have any effect same as if
> +                                     nokmem kernel parameter was specified.
> +                                     Kernel memory is still charged and reported
> +                                     by memory.kmem.usage_in_bytes.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>  				     hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a4d3282493b6..ac7f14b2338d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  				   unsigned int nr_pages)
>  {
> +	struct page_counter *counter;
>  	struct mem_cgroup *memcg;
>  	int ret;
>  
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  		goto out;
>  
>  	memcg_account_kmem(memcg, nr_pages);
> +
> +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
>  out:
>  	css_put(&memcg->css);
>  
> @@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
>  		case _MEMSWAP:
>  			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
>  			break;
> +		case _KMEM:
> +			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
> +				     "Writing any value to this file has no effect. "
> +				     "Please report your usecase to linux-mm@kvack.org if you "
> +				     "depend on this functionality.\n");
> +			ret = 0;
> +			break;
>  		case _TCP:
>  			ret = memcg_update_tcp_max(memcg, nr_pages);
>  			break;


* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 10:43                           ` Jeremi Piotrowski
@ 2023-09-21 11:21                             ` Michal Hocko
  2023-09-21 17:25                               ` Shakeel Butt
  2023-09-22 13:30                               ` Johannes Weiner
  0 siblings, 2 replies; 39+ messages in thread
From: Michal Hocko @ 2023-09-21 11:21 UTC (permalink / raw)
  To: Jeremi Piotrowski
  Cc: Shakeel Butt, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu 21-09-23 12:43:05, Jeremi Piotrowski wrote:
> On 9/21/2023 9:52 AM, Michal Hocko wrote:
> > On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> >> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> >>>
> >> [...]
> >>>> have a strong opinion against it. Also just to be clear we are not
> >>>> talking about full revert of 58056f77502f but just the returning of
> >>>> EOPNOTSUPP, right?
> >>>
> >>> If we allow the limit to be set without returning a failure then we
> >>> still have options 2 and 3 on how to deal with that. One of them is to
> >>> enforce the limit.
> >>>
> >>
> >> Option 3 is a partial revert of 58056f77502f where we keep the no
> >> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> >> with option 3. In addition, let's add pr_warn_once on the read of
> >> kmem.limit_in_bytes as well.
> > 
> > How about this?
> > --- 
> 
> I'm OK with this approach. You're missing this in the patch below:
> 
> // static struct cftype mem_cgroup_legacy_files[] = {
> 
> +       {
> +               .name = "kmem.limit_in_bytes",
> +               .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> +               .write = mem_cgroup_write,
> +               .read_u64 = mem_cgroup_read_u64,
> +       },

Of course. I lost the hunk while massaging the revert. Thanks for
spotting it. Updated version below. Btw. I've decided not to pr_{warn,info}
on the read side because realistically I do not think it would help all
that much. I am worried we will get stuck with this forever because
there will always be somebody stuck on unpatched userspace.
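
As a rough illustration of why the lost hunk matters: control files are
table driven, so a handler that has no table entry is simply never
reachable from userspace. The userspace model below assumes hypothetical
names (struct model_cftype, model_lookup()) and only mimics the idea of a
name-to-handler table ending in an empty terminator entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a control-file table entry. */
struct model_cftype {
	const char *name;
	int (*write)(unsigned long long value);
};

/* Write handler that accepts and ignores any value. */
static int model_kmem_limit_write(unsigned long long value)
{
	(void)value;
	return 0;
}

/* Only files listed here "exist"; the empty entry terminates the table. */
static const struct model_cftype model_files[] = {
	{ .name = "kmem.limit_in_bytes", .write = model_kmem_limit_write },
	{ .name = NULL },
};

/* Returns the matching entry, or NULL when the file is not registered. */
static const struct model_cftype *
model_lookup(const struct model_cftype *files, const char *name)
{
	assert(files != NULL && name != NULL);

	for (; files->name; files++)
		if (strcmp(files->name, name) == 0)
			return files;
	return NULL;
}
```

In this model, dropping the "kmem.limit_in_bytes" entry makes the lookup
return NULL even though model_kmem_limit_write() is still compiled in,
which is the situation the first version of the patch was in.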
--- 
From bb6702b698efd31f3f90f4f1dd36ffe223397bec Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation

This reverts commit 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes"), which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required, but the non-existent file confuses the userspace,
which fails as a result. A patch to fix this on the userspace side
has been submitted, but it is hard to predict how it will propagate
through the maze of 3rd party consumers of the software.

Now, reverting 86327e8eb94c alone is not an option because there is
another set of userspace which cannot cope with EOPNOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well, or we keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
The latter should work for both known breaking workloads, which depend on
the existence of the file but do not depend on hard limit enforcement.

[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
 mm/memcontrol.c                                | 18 ++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
+ memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
+                                     memory hard limit. Kernel hard limit is not
+                                     supported since 5.16. Writing any value to
+                                     the file will not have any effect, same as if
+                                     the nokmem kernel parameter was specified.
+                                     Kernel memory is still charged and reported
+                                     by memory.kmem.usage_in_bytes.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..0b161705ef36 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
+	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 		goto out;
 
 	memcg_account_kmem(memcg, nr_pages);
+
+	/* There is no way to set up kmem hard limit so this operation cannot fail */
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
 out:
 	css_put(&memcg->css);
 
@@ -3867,6 +3872,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
+		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Writing any value to this file has no effect. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
+			ret = 0;
+			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5077,6 +5089,12 @@ static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
+	{
+		.name = "kmem.limit_in_bytes",
+		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
+		.write = mem_cgroup_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
-- 
2.30.2

-- 
Michal Hocko
SUSE Labs

* Re: [PATCH 6.1 000/219] 6.1.54-rc1 review
  2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
                   ` (10 preceding siblings ...)
       [not found] ` <20230917191042.204185566@linuxfoundation.org>
@ 2023-09-21 13:04 ` Conor Dooley
  11 siblings, 0 replies; 39+ messages in thread
From: Conor Dooley @ 2023-09-21 13:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli,
	sudipm.mukherjee, srw, rwarsow

On Sun, Sep 17, 2023 at 09:12:07PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.1.54 release.
> There are 219 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.

Tested-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
Conor.

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 11:21                             ` Michal Hocko
@ 2023-09-21 17:25                               ` Shakeel Butt
  2023-09-21 19:50                                 ` Michal Hocko
  2023-09-22 13:30                               ` Johannes Weiner
  1 sibling, 1 reply; 39+ messages in thread
From: Shakeel Butt @ 2023-09-21 17:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu, Sep 21, 2023 at 4:21 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Thu 21-09-23 12:43:05, Jeremi Piotrowski wrote:
> > On 9/21/2023 9:52 AM, Michal Hocko wrote:
> > > On Wed 20-09-23 14:46:52, Shakeel Butt wrote:
> > >> On Wed, Sep 20, 2023 at 1:08 PM Michal Hocko <mhocko@suse.com> wrote:
> > >>>
> > >> [...]
> > >>>> have a strong opinion against it. Also just to be clear we are not
> > >>>> talking about full revert of 58056f77502f but just the returning of
> > >>>> EOPNOTSUPP, right?
> > >>>
> > >>> If we allow the limit to be set without returning a failure then we
> > >>> still have options 2 and 3 on how to deal with that. One of them is to
> > >>> enforce the limit.
> > >>>
> > >>
> > >> Option 3 is a partial revert of 58056f77502f where we keep the no
> > >> limit enforcement and remove the EOPNOTSUPP return on write. Let's go
> > >> with option 3. In addition, let's add pr_warn_once on the read of
> > >> kmem.limit_in_bytes as well.
> > >
> > > How about this?
> > > ---
> >
> > I'm OK with this approach. You're missing this in the patch below:
> >
> > // static struct cftype mem_cgroup_legacy_files[] = {
> >
> > +       {
> > +               .name = "kmem.limit_in_bytes",
> > +               .private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
> > +               .write = mem_cgroup_write,
> > +               .read_u64 = mem_cgroup_read_u64,
> > +       },
>
> Of course. I've lost the hunk while massaging the revert. Thanks for
> spotting. Updated version below. Btw. I've decided to not pr_{warn,info}
> on the read side because realistically I do not think this will help all
> that much. I am worried we will get stuck with this for ever because
> there always be somebody stuck on unpatched userspace.
> ---
> From bb6702b698efd31f3f90f4f1dd36ffe223397bec Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Thu, 21 Sep 2023 09:38:29 +0200
> Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation
>
> This reverts commits 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> and partially reverts 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") which have incrementally removed support for the
> kernel memory accounting hard limit. Unfortunately it has turned out
> that there is still userspace depending on the existence of
> memory.kmem.limit_in_bytes [1]. The underlying functionality is not
> really required but the non-existent file just confuses the userspace
> which fails in the result. The patch to fix this on the userspace side
> has been submitted but it is hard to predict how it will propagate
> through the maze of 3rd party consumers of the software.
>
> Now, reverting alone 86327e8eb94c is not an option because there is
> another set of userspace which cannot cope with ENOTSUPP returned when
> writing to the file. Therefore we have to go and revisit 58056f77502f
> as well. There are two ways to go ahead. Either we give up on the
> deprecation and fully revert 58056f77502f as well or we can keep
> kmem.limit_in_bytes but make the write a noop and warn about the fact.
> This should work for both known breaking workloads which depend on the
> existence but do not depend on the hard limit enforcement.
>
> [1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
> Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
> Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> Signed-off-by: Michal Hocko <mhocko@suse.com>

With one request below:

Acked-by: Shakeel Butt <shakeelb@google.com>

> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
>  mm/memcontrol.c                                | 18 ++++++++++++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 5f502bf68fbc..ff456871bf4b 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,6 +92,13 @@ Brief summary of control files.
>   memory.oom_control                 set/show oom controls.
>   memory.numa_stat                   show the number of memory usage per numa
>                                      node
> + memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
> +                                     memory hard limit. Kernel hard limit is not
> +                                     supported since 5.16. Writing any value to
> +                                     do file will not have any effect same as if
> +                                     nokmem kernel parameter was specified.
> +                                     Kernel memory is still charged and reported
> +                                     by memory.kmem.usage_in_bytes.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>                                      hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a4d3282493b6..0b161705ef36 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>                                    unsigned int nr_pages)
>  {
> +       struct page_counter *counter;
>         struct mem_cgroup *memcg;
>         int ret;
>
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>                 goto out;
>
>         memcg_account_kmem(memcg, nr_pages);
> +
> +       /* There is no way to set up kmem hard limit so this operation cannot fail */
> +       if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +               WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));

WARN_ON_ONCE() please.

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 17:25                               ` Shakeel Butt
@ 2023-09-21 19:50                                 ` Michal Hocko
  0 siblings, 0 replies; 39+ messages in thread
From: Michal Hocko @ 2023-09-21 19:50 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Jeremi Piotrowski, Johannes Weiner, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu 21-09-23 10:25:11, Shakeel Butt wrote:
> On Thu, Sep 21, 2023 at 4:21 AM Michal Hocko <mhocko@suse.com> wrote:
[...]
> With one request below:
> 
> Acked-by: Shakeel Butt <shakeelb@google.com>

Thanks.

> > @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >                 goto out;
> >
> >         memcg_account_kmem(memcg, nr_pages);
> > +
> > +       /* There is no way to set up kmem hard limit so this operation cannot fail */
> > +       if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +               WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
> 
> WARN_ON_ONCE() please.

Sure. This shouldn't really trigger, but it is true that if something
unexpected happens then it is likely to flood the log, so _ONCE is safer.
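
To make the log flooding argument concrete, here is a minimal userspace
sketch of a warn-once helper. It is not the kernel's WARN_ON_ONCE()
implementation; WARN_ONCE_SKETCH, warnings_printed and
charge_many_failing are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counter so the effect can be observed; not a kernel facility. */
static int warnings_printed;

/*
 * Userspace imitation of a warn-once helper: the condition is evaluated
 * on every pass, but only the first hit at this call site is reported.
 */
#define WARN_ONCE_SKETCH(cond, msg)                               \
	do {                                                      \
		static bool warned_;                              \
		if ((cond) && !warned_) {                         \
			warned_ = true;                           \
			warnings_printed++;                       \
			fprintf(stderr, "WARNING: %s\n", (msg));  \
		}                                                 \
	} while (0)

/* Simulate 'n' unexpected failures hitting the same call site. */
static int charge_many_failing(int n)
{
	for (int i = 0; i < n; i++)
		WARN_ONCE_SKETCH(true, "kmem charge failed unexpectedly");
	return warnings_printed;
}
```

With a plain per-hit warning the loop above would print n messages; with
the once-only variant a hot path that keeps failing produces one line.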

I will wait for others to comment before I send the official patch.
To be completely honest I am not super happy about this way of handling
stuff, but considering the level of brokenness this seems like the
safest option. Especially when nobody really wants to use the kernel
memory hard limit AFAIU.
-- 
Michal Hocko
SUSE Labs

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20  8:11   ` [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Jeremi Piotrowski
  2023-09-20  8:43     ` Michal Hocko
@ 2023-09-22 11:14     ` Linux regression tracking #adding (Thorsten Leemhuis)
  1 sibling, 0 replies; 39+ messages in thread
From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-09-22 11:14 UTC (permalink / raw)
  To: regressions; +Cc: linux-kernel

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 20.09.23 10:11, Jeremi Piotrowski wrote:
> On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
>> 6.1-stable review patch.  If anyone has any objections, please let me know.
>>
>> ------------------
> 
> Hi Greg/Michal,
> 
> This commit breaks userspace which makes it a bad commit for mainline and an
> even worse commit for stable.
> 
> We ingested 6.1.54 into our nightly testing and found that runc fails to gather
> cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
> into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
> fine.
> 
>> Address this by wiping out the file completely and effectively get back to
>> pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
> 
> On reads, the runc code checks for MEMCG_KMEM=n by checking
> kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
> to be there (including kmem.limit_in_bytes). So this change is not effectively
> the same.
> 
> Here's a link to the PR that would be needed to handle this change in userspace
> (not merged yet and would need to be propagated through the ecosystem):
> 
> https://github.com/opencontainers/runc/pull/4018.
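
For illustration only (this is not runc's code, which is written in Go):
a userspace reader that wants to survive kernels with and without the
file could treat a missing memory.kmem.limit_in_bytes as "no limit file"
rather than as a hard error. The function name, the enum and the
cgroup_dir parameter below are hypothetical.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum kmem_limit_status { KMEM_LIMIT_OK, KMEM_LIMIT_ABSENT, KMEM_LIMIT_ERROR };

/*
 * Read <cgroup_dir>/memory.kmem.limit_in_bytes into *limit.
 * A missing file (ENOENT) is reported separately from other failures so
 * the caller can decide to carry on without a kmem limit value.
 */
static enum kmem_limit_status read_kmem_limit(const char *cgroup_dir,
					      unsigned long long *limit)
{
	char path[4096];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/memory.kmem.limit_in_bytes",
		     cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return KMEM_LIMIT_ERROR;

	f = fopen(path, "r");
	if (!f)
		return errno == ENOENT ? KMEM_LIMIT_ABSENT : KMEM_LIMIT_ERROR;

	n = fscanf(f, "%llu", limit);
	fclose(f);
	return n == 1 ? KMEM_LIMIT_OK : KMEM_LIMIT_ERROR;
}
```

A caller would pass the v1 memory cgroup directory of interest and skip
the kmem limit statistic when KMEM_LIMIT_ABSENT comes back.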

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced 86327e8eb94c52
#regzbot title mm, memcg: runc fails to gather cgroup statistics
#regzbot fix: mm, memcg: reconsider kmem.limit_in_bytes deprecation
#regzbot ignore-activity

FWIW, the proposed fix can be found here:
https://lore.kernel.org/all/ZQwnUpX7FlzIOWXP@dhcp22.suse.cz/

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-21 11:21                             ` Michal Hocko
  2023-09-21 17:25                               ` Shakeel Butt
@ 2023-09-22 13:30                               ` Johannes Weiner
  2023-09-25  7:40                                 ` Michal Hocko
  1 sibling, 1 reply; 39+ messages in thread
From: Johannes Weiner @ 2023-09-22 13:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Shakeel Butt, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Thu, Sep 21, 2023 at 01:21:54PM +0200, Michal Hocko wrote:
> @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
>  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  				   unsigned int nr_pages)
>  {
> +	struct page_counter *counter;
>  	struct mem_cgroup *memcg;
>  	int ret;
>  
> @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
>  		goto out;
>  
>  	memcg_account_kmem(memcg, nr_pages);
> +
> +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));

This hunk doesn't look quite right.

static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages)
{
	mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
		if (nr_pages > 0)
			page_counter_charge(&memcg->kmem, nr_pages);
		else
			page_counter_uncharge(&memcg->kmem, -nr_pages);
	}
}

Other than that, please add

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
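
One possible reading of the concern, sketched in userspace C: since
memcg_account_kmem() already charges the kmem counter on cgroup v1, an
additional try_charge in obj_cgroup_charge_pages() would count the same
pages twice. The types and helpers below are simplified stand-ins, not
the kernel's page_counter API.

```c
#include <stdbool.h>

/* Simplified stand-in for a page counter: usage and a hard maximum. */
struct counter_sketch {
	long usage;
	long max;
};

/* Unconditional charge, as the helper quoted above does for v1. */
static void counter_charge(struct counter_sketch *c, long nr)
{
	c->usage += nr;
}

/* Charge only if the maximum is not exceeded. */
static bool counter_try_charge(struct counter_sketch *c, long nr)
{
	if (c->usage + nr > c->max)
		return false;
	c->usage += nr;
	return true;
}

/* Models the posted hunk: helper charge followed by a second try_charge. */
static long charge_as_posted(struct counter_sketch *kmem, long nr_pages)
{
	counter_charge(kmem, nr_pages);      /* done inside the helper */
	counter_try_charge(kmem, nr_pages);  /* extra charge from the hunk */
	return kmem->usage;
}

/* Models relying on the helper alone. */
static long charge_via_helper_only(struct counter_sketch *kmem, long nr_pages)
{
	counter_charge(kmem, nr_pages);
	return kmem->usage;
}
```

Charging 4 pages through the first path leaves the counter at 8, while
the second leaves it at 4.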

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-20 13:47             ` Michal Hocko
  2023-09-20 15:32               ` Shakeel Butt
@ 2023-09-22 23:00               ` Roman Gushchin
  2023-09-25  7:41                 ` Michal Hocko
  1 sibling, 1 reply; 39+ messages in thread
From: Roman Gushchin @ 2023-09-22 23:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Wed, Sep 20, 2023 at 03:47:37PM +0200, Michal Hocko wrote:
> On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> [...]
> > > I mean, normally I would be just fine reverting this API change because
> > > it is disruptive but the only way to have the file available and not
> > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > there but that sounds rather dubious. Although one could argue this
> > > would mimic nokmem kernel option.
> > > 
> > 
> > I just want to make sure we don't introduce yet another new behavior in this legacy
> > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > (=don't event store the written limit). The latter might have unintended consequences.
> 
> Yes it would mean that the limit is never enforced. Bad as it is the
> thing is that the hard limit on kernel memory is broken by design and
> unfixable.  This causes all sorts of unexpected kernel allocation
> failures that this is simply unsafe to use.
> 
> All that being said I can see the following options
> 1) keep the current upstream status and not export the file
> 2) revert both 58056f77502f and 86327e8eb94 and make it clear
>    that kmem.limit_in_bytes is unsupported so failures or misbehavior
>    as a result of the limit being hit are likely not going to be
>    investigated or fixed.
> 3) reverting like in 2) but never inforce the limit (so basically nokmem
>    semantic)

Since it's a part of cgroup v1 interface, which is in a frozen state as a whole,
and there is no significant (performance, code complexity) benefit of
additionally deprecating kmem.limit_in_bytes, I vote for 2).
1) is also an option.

Thanks!

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-22 13:30                               ` Johannes Weiner
@ 2023-09-25  7:40                                 ` Michal Hocko
  0 siblings, 0 replies; 39+ messages in thread
From: Michal Hocko @ 2023-09-25  7:40 UTC (permalink / raw)
  To: Johannes Weiner, Andrew Morton
  Cc: Jeremi Piotrowski, Shakeel Butt, Roman Gushchin, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, linux-kernel,
	regressions, mathieu.tortuyaux

On Fri 22-09-23 09:30:17, Johannes Weiner wrote:
> On Thu, Sep 21, 2023 at 01:21:54PM +0200, Michal Hocko wrote:
> > @@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
> >  static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >  				   unsigned int nr_pages)
> >  {
> > +	struct page_counter *counter;
> >  	struct mem_cgroup *memcg;
> >  	int ret;
> >  
> > @@ -3107,6 +3108,10 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >  		goto out;
> >  
> >  	memcg_account_kmem(memcg, nr_pages);
> > +
> > +	/* There is no way to set up kmem hard limit so this operation cannot fail */
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +		WARN_ON(!page_counter_try_charge(&memcg->kmem, nr_pages, &counter));
> 
> This hunk doesn't look quite right.
> 
> static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages)
> {
> 	mod_memcg_state(memcg, MEMCG_KMEM, nr_pages);
> 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> 		if (nr_pages > 0)
> 			page_counter_charge(&memcg->kmem, nr_pages);
> 		else
> 			page_counter_uncharge(&memcg->kmem, -nr_pages);
> 	}
> }
> 
> Other than that, please add

Good point. I have missed a8c49af3be5f ("memcg: add per-memcg total kernel memory stat"),
introduced in 5.18.

> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Fixed version below. Andrew, it seems we have a good consensus for this.
Could you queue this up and send it to Linus please?
--- 
From 8c3cbe68bba0fe5103d8fe73a06b3608ed49bda0 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 21 Sep 2023 09:38:29 +0200
Subject: [PATCH] mm, memcg: reconsider kmem.limit_in_bytes deprecation

This reverts commit 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
and partially reverts 58056f77502f ("memcg, kmem: further deprecate
kmem.limit_in_bytes"), which have incrementally removed support for the
kernel memory accounting hard limit. Unfortunately it has turned out
that there is still userspace depending on the existence of
memory.kmem.limit_in_bytes [1]. The underlying functionality is not
really required, but the missing file confuses userspace, which then
fails as a result. A patch to fix this on the userspace side has been
submitted, but it is hard to predict how it will propagate through the
maze of 3rd party consumers of the software.

Now, reverting 86327e8eb94c alone is not an option because there is
another set of userspace which cannot cope with ENOTSUPP returned when
writing to the file. Therefore we have to go and revisit 58056f77502f
as well. There are two ways to go ahead. Either we give up on the
deprecation and fully revert 58056f77502f as well, or we keep
kmem.limit_in_bytes but make the write a noop and warn about the fact.
This patch takes the latter approach, which should work for both known
breaking workloads because they depend on the existence of the file but
not on the hard limit enforcement.

Note to backporters to stable trees: a8c49af3be5f ("memcg: add per-memcg
total kernel memory stat"), introduced in 5.18, has added memcg_account_kmem
so the accounting is not done by obj_cgroup_charge_pages directly for v1
anymore. Prior kernels need to add it explicitly (thanks to Johannes for
pointing this out).

[1] http://lkml.kernel.org/r/20230920081101.GA12096@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Cc: stable
Fixes: 86327e8eb94c ("memcg: drop kmem.limit_in_bytes")
Fixes: 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes")
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++++
 mm/memcontrol.c                                | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 5f502bf68fbc..ff456871bf4b 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -92,6 +92,13 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
+ memory.kmem.limit_in_bytes          Deprecated knob to set and read the kernel
+                                     memory hard limit. Kernel hard limit is not
+                                     supported since 5.16. Writing any value to
+                                     the file will not have any effect same as if
+                                     nokmem kernel parameter was specified.
+                                     Kernel memory is still charged and reported
+                                     by memory.kmem.usage_in_bytes.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..63bdaab2a906 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3097,6 +3097,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
+	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3867,6 +3868,13 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		case _MEMSWAP:
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
+		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Writing any value to this file has no effect. "
+				     "Please report your usecase to linux-mm@kvack.org if you "
+				     "depend on this functionality.\n");
+			ret = 0;
+			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
 			break;
@@ -5077,6 +5085,12 @@ static struct cftype mem_cgroup_legacy_files[] = {
 		.seq_show = memcg_numa_stat_show,
 	},
 #endif
+	{
+		.name = "kmem.limit_in_bytes",
+		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
+		.write = mem_cgroup_write,
+		.read_u64 = mem_cgroup_read_u64,
+	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
-- 
2.30.2

-- 
Michal Hocko
SUSE Labs

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-22 23:00               ` Roman Gushchin
@ 2023-09-25  7:41                 ` Michal Hocko
  2023-09-26  2:49                   ` Roman Gushchin
  0 siblings, 1 reply; 39+ messages in thread
From: Michal Hocko @ 2023-09-25  7:41 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Fri 22-09-23 16:00:30, Roman Gushchin wrote:
> On Wed, Sep 20, 2023 at 03:47:37PM +0200, Michal Hocko wrote:
> > On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> > [...]
> > > > I mean, normally I would be just fine reverting this API change because
> > > > it is disruptive but the only way to have the file available and not
> > > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > > there but that sounds rather dubious. Although one could argue this
> > > > would mimic nokmem kernel option.
> > > > 
> > > 
> > > I just want to make sure we don't introduce yet another new behavior in this legacy
> > > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > > (=don't event store the written limit). The latter might have unintended consequences.
> > 
> > Yes it would mean that the limit is never enforced. Bad as it is the
> > thing is that the hard limit on kernel memory is broken by design and
> > unfixable.  This causes all sorts of unexpected kernel allocation
> > failures that this is simply unsafe to use.
> > 
> > All that being said I can see the following options
> > 1) keep the current upstream status and not export the file
> > 2) revert both 58056f77502f and 86327e8eb94 and make it clear
> >    that kmem.limit_in_bytes is unsupported so failures or misbehavior
> >    as a result of the limit being hit are likely not going to be
> >    investigated or fixed.
> > 3) reverting like in 2) but never inforce the limit (so basically nokmem
> >    semantic)
> 
> Since it's a part of cgroup v1 interface, which is in a frozen state as a whole,
> and there is no significant (performance, code complexity) benefit of
> additionally deprecating kmem.limit_in_bytes, I vote for 2).
> 1) is also an option.

We have a stronger agreement on 3), see
http://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz. Please speak
up if you disagree.

-- 
Michal Hocko
SUSE Labs

* Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes
  2023-09-25  7:41                 ` Michal Hocko
@ 2023-09-26  2:49                   ` Roman Gushchin
  0 siblings, 0 replies; 39+ messages in thread
From: Roman Gushchin @ 2023-09-26  2:49 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jeremi Piotrowski, Shakeel Butt, Johannes Weiner, Muchun Song,
	Greg Kroah-Hartman, stable, patches, Tejun Heo, Andrew Morton,
	linux-kernel, regressions, mathieu.tortuyaux

On Mon, Sep 25, 2023 at 09:41:24AM +0200, Michal Hocko wrote:
> On Fri 22-09-23 16:00:30, Roman Gushchin wrote:
> > On Wed, Sep 20, 2023 at 03:47:37PM +0200, Michal Hocko wrote:
> > > On Wed 20-09-23 15:25:23, Jeremi Piotrowski wrote:
> > > > On 9/20/2023 1:07 PM, Michal Hocko wrote:
> > > [...]
> > > > > I mean, normally I would be just fine reverting this API change because
> > > > > it is disruptive but the only way to have the file available and not
> > > > > break somebody is to revert 58056f77502f ("memcg, kmem: further
> > > > > deprecate kmem.limit_in_bytes") as well. Or to ignore any value written
> > > > > there but that sounds rather dubious. Although one could argue this
> > > > > would mimic nokmem kernel option.
> > > > > 
> > > > 
> > > > I just want to make sure we don't introduce yet another new behavior in this legacy
> > > > system. I have not seen breakage due to 58056f77502f. Mimicing nokmem sounds good but
> > > > does this mean "don't enforce limits" (that should be fine) or "ignore writes to the limit"
> > > > (=don't event store the written limit). The latter might have unintended consequences.
> > > 
> > > Yes it would mean that the limit is never enforced. Bad as it is the
> > > thing is that the hard limit on kernel memory is broken by design and
> > > unfixable.  This causes all sorts of unexpected kernel allocation
> > > failures that this is simply unsafe to use.
> > > 
> > > All that being said I can see the following options
> > > 1) keep the current upstream status and not export the file
> > > 2) revert both 58056f77502f and 86327e8eb94 and make it clear
> > >    that kmem.limit_in_bytes is unsupported so failures or misbehavior
> > >    as a result of the limit being hit are likely not going to be
> > >    investigated or fixed.
> > > 3) reverting like in 2) but never inforce the limit (so basically nokmem
> > >    semantic)
> > 
> > Since it's a part of cgroup v1 interface, which is in a frozen state as a whole,
> > and there is no significant (performance, code complexity) benefit of
> > additionally deprecating kmem.limit_in_bytes, I vote for 2).
> > 1) is also an option.
> 
> We have a stronger agrement over 3)
> http://lkml.kernel.org/r/ZRE5VJozPZt9bRPy@dhcp22.suse.cz. Please speak
> up if you disagree.

This works for me too.
Thank you!

Btw, it seems like going forward we should be more resistant to any
cgroup v1 changes and just leave it as it is.

Thanks.

Thread overview: 39+ messages
2023-09-17 19:12 [PATCH 6.1 000/219] 6.1.54-rc1 review Greg Kroah-Hartman
2023-09-17 20:47 ` SeongJae Park
2023-09-18  5:34 ` Takeshi Ogasawara
2023-09-18  6:42 ` Bagas Sanjaya
2023-09-18 11:24 ` Conor Dooley
2023-09-18 12:08 ` Ron Economos
2023-09-18 12:48 ` Jon Hunter
2023-09-18 18:34 ` Florian Fainelli
2023-09-18 18:41 ` Guenter Roeck
2023-09-18 20:56 ` Naresh Kamboju
2023-09-18 22:21 ` Shuah Khan
     [not found] ` <20230917191042.204185566@linuxfoundation.org>
2023-09-20  8:11   ` [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes Jeremi Piotrowski
2023-09-20  8:43     ` Michal Hocko
2023-09-20  9:25       ` Greg Kroah-Hartman
2023-09-20 10:21         ` Jeremi Piotrowski
2023-09-20 10:45           ` Greg Kroah-Hartman
2023-09-20 11:08             ` Michal Hocko
2023-09-20 11:16               ` Greg Kroah-Hartman
2023-09-20 10:04       ` Jeremi Piotrowski
2023-09-20 11:07         ` Michal Hocko
2023-09-20 13:25           ` Jeremi Piotrowski
2023-09-20 13:47             ` Michal Hocko
2023-09-20 15:32               ` Shakeel Butt
2023-09-20 16:55                 ` Michal Hocko
2023-09-20 19:46                   ` Shakeel Butt
2023-09-20 20:08                     ` Michal Hocko
2023-09-20 21:46                       ` Shakeel Butt
2023-09-21  7:52                         ` Michal Hocko
2023-09-21 10:43                           ` Jeremi Piotrowski
2023-09-21 11:21                             ` Michal Hocko
2023-09-21 17:25                               ` Shakeel Butt
2023-09-21 19:50                                 ` Michal Hocko
2023-09-22 13:30                               ` Johannes Weiner
2023-09-25  7:40                                 ` Michal Hocko
2023-09-22 23:00               ` Roman Gushchin
2023-09-25  7:41                 ` Michal Hocko
2023-09-26  2:49                   ` Roman Gushchin
2023-09-22 11:14     ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-09-21 13:04 ` [PATCH 6.1 000/219] 6.1.54-rc1 review Conor Dooley