From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
torvalds@linux-foundation.org, stable@vger.kernel.org
Cc: lwn@lwn.net, jslaby@suse.cz,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: Linux 5.18.18
Date: Wed, 17 Aug 2022 15:24:54 +0200 [thread overview]
Message-ID: <1660742693200231@kroah.com> (raw)
In-Reply-To: <16607426939212@kroah.com>
diff --git a/Documentation/ABI/testing/sysfs-driver-xen-blkback b/Documentation/ABI/testing/sysfs-driver-xen-blkback
index a74dfe52dd76..ca7f12aeb2da 100644
--- a/Documentation/ABI/testing/sysfs-driver-xen-blkback
+++ b/Documentation/ABI/testing/sysfs-driver-xen-blkback
@@ -42,5 +42,5 @@ KernelVersion: 5.10
Contact: SeongJae Park <sj@kernel.org>
Description:
Whether to enable the persistent grants feature or not. Note
- that this option only takes effect on newly created backends.
+ that this option only takes effect on newly connected backends.
The default is Y (enable).
diff --git a/Documentation/ABI/testing/sysfs-driver-xen-blkfront b/Documentation/ABI/testing/sysfs-driver-xen-blkfront
index 61fd173fabfe..98b41c3b6002 100644
--- a/Documentation/ABI/testing/sysfs-driver-xen-blkfront
+++ b/Documentation/ABI/testing/sysfs-driver-xen-blkfront
@@ -15,5 +15,5 @@ KernelVersion: 5.10
Contact: SeongJae Park <sj@kernel.org>
Description:
Whether to enable the persistent grants feature or not. Note
- that this option only takes effect on newly created frontends.
+ that this option only takes effect on newly connected frontends.
The default is Y (enable).
diff --git a/Documentation/admin-guide/device-mapper/writecache.rst b/Documentation/admin-guide/device-mapper/writecache.rst
index 10429779a91a..724e028d1858 100644
--- a/Documentation/admin-guide/device-mapper/writecache.rst
+++ b/Documentation/admin-guide/device-mapper/writecache.rst
@@ -78,16 +78,16 @@ Status:
2. the number of blocks
3. the number of free blocks
4. the number of blocks under writeback
-5. the number of read requests
-6. the number of read requests that hit the cache
-7. the number of write requests
-8. the number of write requests that hit uncommitted block
-9. the number of write requests that hit committed block
-10. the number of write requests that bypass the cache
-11. the number of write requests that are allocated in the cache
+5. the number of read blocks
+6. the number of read blocks that hit the cache
+7. the number of write blocks
+8. the number of write blocks that hit uncommitted block
+9. the number of write blocks that hit committed block
+10. the number of write blocks that bypass the cache
+11. the number of write blocks that are allocated in the cache
12. the number of write requests that are blocked on the freelist
13. the number of flush requests
-14. the number of discard requests
+14. the number of discarded blocks
Messages:
flush
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 334544c893dd..c9a272acfb5b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5130,20 +5130,33 @@
Speculative Code Execution with Return Instructions)
vulnerability.
+ AMD-based UNRET and IBPB mitigations alone do not stop
+ sibling threads from influencing the predictions of other
+ sibling threads. For that reason, STIBP is used on pro-
+ cessors that support it, and mitigate SMT on processors
+ that don't.
+
off - no mitigation
auto - automatically select a migitation
auto,nosmt - automatically select a mitigation,
disabling SMT if necessary for
the full mitigation (only on Zen1
and older without STIBP).
- ibpb - mitigate short speculation windows on
- basic block boundaries too. Safe, highest
- perf impact.
- unret - force enable untrained return thunks,
- only effective on AMD f15h-f17h
- based systems.
- unret,nosmt - like unret, will disable SMT when STIBP
- is not available.
+ ibpb - On AMD, mitigate short speculation
+ windows on basic block boundaries too.
+ Safe, highest perf impact. It also
+ enables STIBP if present. Not suitable
+ on Intel.
+ ibpb,nosmt - Like "ibpb" above but will disable SMT
+ when STIBP is not available. This is
+ the alternative for systems which do not
+ have STIBP.
+ unret - Force enable untrained return thunks,
+ only effective on AMD f15h-f17h based
+ systems.
+ unret,nosmt - Like unret, but will disable SMT when STIBP
+ is not available. This is the alternative for
+ systems which do not have STIBP.
Selecting 'auto' will choose a mitigation method at run
time according to the CPU.
diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
index aec2cd2aaea7..19754beb5a4e 100644
--- a/Documentation/admin-guide/pm/cpuidle.rst
+++ b/Documentation/admin-guide/pm/cpuidle.rst
@@ -612,8 +612,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor
by default this way, for example.
The other kernel command line parameters controlling CPU idle time management
-described below are only relevant for the *x86* architecture and some of
-them affect Intel processors only.
+described below are only relevant for the *x86* architecture and references
+to ``intel_idle`` affect Intel processors only.
The *x86* architecture support code recognizes three kernel command line
options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
@@ -635,10 +635,13 @@ idle, so it very well may hurt single-thread computations performance as well as
energy-efficiency. Thus using it for performance reasons may not be a good idea
at all.]
-The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes
-``acpi_idle`` to be used (as long as all of the information needed by it is
-there in the system's ACPI tables), but it is not allowed to use the
-``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states.
+The ``idle=nomwait`` option prevents the use of ``MWAIT`` instruction of
+the CPU to enter idle states. When this option is used, the ``acpi_idle``
+driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
+running Intel processors, this option disables the ``intel_idle`` driver
+and forces the use of the ``acpi_idle`` driver instead. Note that in either
+case, ``acpi_idle`` driver will function only if all the information needed
+by it is in the system's ACPI tables.
In addition to the architecture-level kernel command line options affecting CPU
idle time management, there are parameters affecting individual ``CPUIdle``
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index d27db84d585e..0b4235b1f8c4 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -82,10 +82,14 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #1319537 | ARM64_ERRATUM_1319367 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A57 | #1742098 | ARM64_ERRATUM_1742098 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #853709 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #1319367 | ARM64_ERRATUM_1319367 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A72 | #1655431 | ARM64_ERRATUM_1742098 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1188873,1418040| ARM64_ERRATUM_1418040 |
diff --git a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
index e2d330bd4608..69cdab18d629 100644
--- a/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
+++ b/Documentation/devicetree/bindings/riscv/sifive-l2-cache.yaml
@@ -46,7 +46,7 @@ properties:
const: 2
cache-sets:
- const: 1024
+ enum: [1024, 2048]
cache-size:
const: 2097152
@@ -84,6 +84,8 @@ then:
description: |
Must contain entries for DirError, DataError and DataFail signals.
maxItems: 3
+ cache-sets:
+ const: 1024
else:
properties:
@@ -91,6 +93,8 @@ else:
description: |
Must contain entries for DirError, DataError, DataFail, DirFail signals.
minItems: 4
+ cache-sets:
+ const: 2048
additionalProperties: false
diff --git a/Documentation/tty/device_drivers/oxsemi-tornado.rst b/Documentation/tty/device_drivers/oxsemi-tornado.rst
new file mode 100644
index 000000000000..0180d8bb0881
--- /dev/null
+++ b/Documentation/tty/device_drivers/oxsemi-tornado.rst
@@ -0,0 +1,129 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================================================
+Notes on Oxford Semiconductor PCIe (Tornado) 950 serial port devices
+====================================================================
+
+Oxford Semiconductor PCIe (Tornado) 950 serial port devices are driven
+by a fixed 62.5MHz clock input derived from the 100MHz PCI Express clock.
+
+The baud rate produced by the baud generator is obtained from this input
+frequency by dividing it by the clock prescaler, which can be set to any
+value from 1 to 63.875 in increments of 0.125, and then the usual 16-bit
+divisor is used as with the original 8250, to divide the frequency by a
+value from 1 to 65535. Finally a programmable oversampling rate is used
+that can take any value from 4 to 16 to divide the frequency further and
+determine the actual baud rate used. Baud rates from 15625000bps down
+to 0.933bps can be obtained this way.
+
+By default the oversampling rate is set to 16 and the clock prescaler is
+set to 33.875, meaning that the frequency to be used as the reference
+for the usual 16-bit divisor is 115313.653, which is close enough to the
+frequency of 115200 used by the original 8250 for the same values to be
+used for the divisor to obtain the requested baud rates by software that
+is unaware of the extra clock controls available.
+
+The oversampling rate is programmed with the TCR register and the clock
+prescaler is programmed with the CPR/CPR2 register pair[1][2][3][4].
+To switch away from the default value of 33.875 for the prescaler the
+the enhanced mode has to be explicitly enabled though, by setting bit 4
+of the EFR. In that mode setting bit 7 in the MCR enables the prescaler
+or otherwise it is bypassed as if the value of 1 was used. Additionally
+writing any value to CPR clears CPR2 for compatibility with old software
+written for older conventional PCI Oxford Semiconductor devices that do
+not have the extra prescaler's 9th bit in CPR2, so the CPR/CPR2 register
+pair has to be programmed in the right order.
+
+By using these parameters rates from 15625000bps down to 1bps can be
+obtained, with either exact or highly-accurate actual bit rates for
+standard and many non-standard rates.
+
+Here are the figures for the standard and some non-standard baud rates
+(including those quoted in Oxford Semiconductor documentation), giving
+the requested rate (r), the actual rate yielded (a) and its deviation
+from the requested rate (d), and the values of the oversampling rate
+(tcr), the clock prescaler (cpr) and the divisor (div) produced by the
+new `get_divisor' handler:
+
+r: 15625000, a: 15625000.00, d: 0.0000%, tcr: 4, cpr: 1.000, div: 1
+r: 12500000, a: 12500000.00, d: 0.0000%, tcr: 5, cpr: 1.000, div: 1
+r: 10416666, a: 10416666.67, d: 0.0000%, tcr: 6, cpr: 1.000, div: 1
+r: 8928571, a: 8928571.43, d: 0.0000%, tcr: 7, cpr: 1.000, div: 1
+r: 7812500, a: 7812500.00, d: 0.0000%, tcr: 8, cpr: 1.000, div: 1
+r: 4000000, a: 4000000.00, d: 0.0000%, tcr: 5, cpr: 3.125, div: 1
+r: 3686400, a: 3676470.59, d: -0.2694%, tcr: 8, cpr: 2.125, div: 1
+r: 3500000, a: 3496503.50, d: -0.0999%, tcr: 13, cpr: 1.375, div: 1
+r: 3000000, a: 2976190.48, d: -0.7937%, tcr: 14, cpr: 1.500, div: 1
+r: 2500000, a: 2500000.00, d: 0.0000%, tcr: 10, cpr: 2.500, div: 1
+r: 2000000, a: 2000000.00, d: 0.0000%, tcr: 10, cpr: 3.125, div: 1
+r: 1843200, a: 1838235.29, d: -0.2694%, tcr: 16, cpr: 2.125, div: 1
+r: 1500000, a: 1492537.31, d: -0.4975%, tcr: 5, cpr: 8.375, div: 1
+r: 1152000, a: 1152073.73, d: 0.0064%, tcr: 14, cpr: 3.875, div: 1
+r: 921600, a: 919117.65, d: -0.2694%, tcr: 16, cpr: 2.125, div: 2
+r: 576000, a: 576036.87, d: 0.0064%, tcr: 14, cpr: 3.875, div: 2
+r: 460800, a: 460829.49, d: 0.0064%, tcr: 7, cpr: 3.875, div: 5
+r: 230400, a: 230414.75, d: 0.0064%, tcr: 14, cpr: 3.875, div: 5
+r: 115200, a: 115207.37, d: 0.0064%, tcr: 14, cpr: 1.250, div: 31
+r: 57600, a: 57603.69, d: 0.0064%, tcr: 8, cpr: 3.875, div: 35
+r: 38400, a: 38402.46, d: 0.0064%, tcr: 14, cpr: 3.875, div: 30
+r: 19200, a: 19201.23, d: 0.0064%, tcr: 8, cpr: 3.875, div: 105
+r: 9600, a: 9600.06, d: 0.0006%, tcr: 9, cpr: 1.125, div: 643
+r: 4800, a: 4799.98, d: -0.0004%, tcr: 7, cpr: 2.875, div: 647
+r: 2400, a: 2400.02, d: 0.0008%, tcr: 9, cpr: 2.250, div: 1286
+r: 1200, a: 1200.00, d: 0.0000%, tcr: 14, cpr: 2.875, div: 1294
+r: 300, a: 300.00, d: 0.0000%, tcr: 11, cpr: 2.625, div: 7215
+r: 200, a: 200.00, d: 0.0000%, tcr: 16, cpr: 1.250, div: 15625
+r: 150, a: 150.00, d: 0.0000%, tcr: 13, cpr: 2.250, div: 14245
+r: 134, a: 134.00, d: 0.0000%, tcr: 11, cpr: 2.625, div: 16153
+r: 110, a: 110.00, d: 0.0000%, tcr: 12, cpr: 1.000, div: 47348
+r: 75, a: 75.00, d: 0.0000%, tcr: 4, cpr: 5.875, div: 35461
+r: 50, a: 50.00, d: 0.0000%, tcr: 16, cpr: 1.250, div: 62500
+r: 25, a: 25.00, d: 0.0000%, tcr: 16, cpr: 2.500, div: 62500
+r: 4, a: 4.00, d: 0.0000%, tcr: 16, cpr: 20.000, div: 48828
+r: 2, a: 2.00, d: 0.0000%, tcr: 16, cpr: 40.000, div: 48828
+r: 1, a: 1.00, d: 0.0000%, tcr: 16, cpr: 63.875, div: 61154
+
+With the baud base set to 15625000 and the unsigned 16-bit UART_DIV_MAX
+limitation imposed by `serial8250_get_baud_rate' standard baud rates
+below 300bps become unavailable in the regular way, e.g. the rate of
+200bps requires the baud base to be divided by 78125 and that is beyond
+the unsigned 16-bit range. The historic spd_cust feature can still be
+used by encoding the values for, the prescaler, the oversampling rate
+and the clock divisor (DLM/DLL) as follows to obtain such rates if so
+required:
+
+ 31 29 28 20 19 16 15 0
++-----+-----------------+-------+-------------------------------+
+|0 0 0| CPR2:CPR | TCR | DLM:DLL |
++-----+-----------------+-------+-------------------------------+
+
+Use a value such encoded for the `custom_divisor' field along with the
+ASYNC_SPD_CUST flag set in the `flags' field in `struct serial_struct'
+passed with the TIOCSSERIAL ioctl(2), such as with the setserial(8)
+utility and its `divisor' and `spd_cust' parameters, and the select
+the baud rate of 38400bps. Note that the value of 0 in TCR sets the
+oversampling rate to 16 and prescaler values below 1 in CPR2/CPR are
+clamped by the driver to 1.
+
+For example the value of 0x1f4004e2 will set CPR2/CPR, TCR and DLM/DLL
+respectively to 0x1f4, 0x0 and 0x04e2, choosing the prescaler value,
+the oversampling rate and the clock divisor of 62.500, 16 and 1250
+respectively. These parameters will set the baud rate for the serial
+port to 62500000 / 62.500 / 1250 / 16 = 50bps.
+
+References:
+
+[1] "OXPCIe200 PCI Express Multi-Port Bridge", Oxford Semiconductor,
+ Inc., DS-0045, 10 Nov 2008, Section "950 Mode", pp. 64-65
+
+[2] "OXPCIe952 PCI Express Bridge to Dual Serial & Parallel Port",
+ Oxford Semiconductor, Inc., DS-0046, Mar 06 08, Section "950 Mode",
+ p. 20
+
+[3] "OXPCIe954 PCI Express Bridge to Quad Serial Port", Oxford
+ Semiconductor, Inc., DS-0047, Feb 08, Section "950 Mode", p. 20
+
+[4] "OXPCIe958 PCI Express Bridge to Octal Serial Port", Oxford
+ Semiconductor, Inc., DS-0048, Feb 08, Section "950 Mode", p. 20
+
+Maciej W. Rozycki <macro@orcam.me.uk>
diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
index 4cd7c541fc30..bcdc14144105 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
@@ -2975,7 +2975,7 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
* - __u8
- ``colour_plane_id``
-
- * - __u16
+ * - __s32
- ``slice_pic_order_cnt``
-
* - __u8
diff --git a/MAINTAINERS b/MAINTAINERS
index 2b70e2d21405..c7c7a96b62a8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7599,9 +7599,6 @@ F: include/linux/fs.h
F: include/linux/fs_types.h
F: include/uapi/linux/fs.h
F: include/uapi/linux/openat2.h
-X: fs/io-wq.c
-X: fs/io-wq.h
-X: fs/io_uring.c
FINTEK F75375S HARDWARE MONITOR AND FAN CONTROLLER DRIVER
M: Riku Voipio <riku.voipio@iki.fi>
@@ -10277,9 +10274,7 @@ L: io-uring@vger.kernel.org
S: Maintained
T: git git://git.kernel.dk/linux-block
T: git git://git.kernel.dk/liburing
-F: fs/io-wq.c
-F: fs/io-wq.h
-F: fs/io_uring.c
+F: io_uring/
F: include/linux/io_uring.h
F: include/uapi/linux/io_uring.h
F: tools/io_uring/
diff --git a/Makefile b/Makefile
index ef8c18e5c161..23162e2bdf14 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
VERSION = 5
PATCHLEVEL = 18
-SUBLEVEL = 17
+SUBLEVEL = 18
EXTRAVERSION =
NAME = Superb Owl
@@ -1031,6 +1031,11 @@ KBUILD_CFLAGS += $(KCFLAGS)
KBUILD_LDFLAGS_MODULE += --build-id=sha1
LDFLAGS_vmlinux += --build-id=sha1
+KBUILD_LDFLAGS += -z noexecstack
+ifeq ($(CONFIG_LD_IS_BFD),y)
+KBUILD_LDFLAGS += $(call ld-option,--no-warn-rwx-segments)
+endif
+
ifeq ($(CONFIG_STRIP_ASM_SYMS),y)
LDFLAGS_vmlinux += $(call ld-option, -X,)
endif
@@ -1095,6 +1100,7 @@ export MODULES_NSDEPS := $(extmod_prefix)modules.nsdeps
ifeq ($(KBUILD_EXTMOD),)
core-y += kernel/ certs/ mm/ fs/ ipc/ security/ crypto/
core-$(CONFIG_BLOCK) += block/
+core-$(CONFIG_IO_URING) += io_uring/
vmlinux-dirs := $(patsubst %/,%,$(filter %/, \
$(core-y) $(core-m) $(drivers-y) $(drivers-m) \
diff --git a/arch/Kconfig b/arch/Kconfig
index 31c4fdc4a4ba..ab45e0f6c21b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -214,6 +214,9 @@ config HAVE_FUNCTION_DESCRIPTORS
config TRACE_IRQFLAGS_SUPPORT
bool
+config TRACE_IRQFLAGS_NMI_SUPPORT
+ bool
+
#
# An arch should select this if it provides all these things:
#
diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 7c16f8a2b738..13d788d1e6dc 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -133,6 +133,7 @@ dtb-$(CONFIG_ARCH_BCM_5301X) += \
bcm47094-luxul-xwr-3150-v1.dtb \
bcm47094-netgear-r8500.dtb \
bcm47094-phicomm-k3.dtb \
+ bcm53015-meraki-mr26.dtb \
bcm53016-meraki-mr32.dtb \
bcm94708.dtb \
bcm94709.dtb \
diff --git a/arch/arm/boot/dts/aspeed-ast2500-evb.dts b/arch/arm/boot/dts/aspeed-ast2500-evb.dts
index 1d24b394ea4c..a497dd135491 100644
--- a/arch/arm/boot/dts/aspeed-ast2500-evb.dts
+++ b/arch/arm/boot/dts/aspeed-ast2500-evb.dts
@@ -5,7 +5,7 @@
/ {
model = "AST2500 EVB";
- compatible = "aspeed,ast2500";
+ compatible = "aspeed,ast2500-evb", "aspeed,ast2500";
aliases {
serial4 = &uart5;
diff --git a/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts b/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts
index dd7148060c4a..d0a5c2ff0fec 100644
--- a/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts
+++ b/arch/arm/boot/dts/aspeed-ast2600-evb-a1.dts
@@ -5,6 +5,7 @@
/ {
model = "AST2600 A1 EVB";
+ compatible = "aspeed,ast2600-evb-a1", "aspeed,ast2600";
/delete-node/regulator-vcc-sdhci0;
/delete-node/regulator-vcc-sdhci1;
diff --git a/arch/arm/boot/dts/aspeed-ast2600-evb.dts b/arch/arm/boot/dts/aspeed-ast2600-evb.dts
index 788448cdd6b3..b8e55bf167aa 100644
--- a/arch/arm/boot/dts/aspeed-ast2600-evb.dts
+++ b/arch/arm/boot/dts/aspeed-ast2600-evb.dts
@@ -8,7 +8,7 @@
/ {
model = "AST2600 EVB";
- compatible = "aspeed,ast2600";
+ compatible = "aspeed,ast2600-evb-a1", "aspeed,ast2600";
aliases {
serial4 = &uart5;
diff --git a/arch/arm/boot/dts/bcm53015-meraki-mr26.dts b/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
new file mode 100644
index 000000000000..14f58033efeb
--- /dev/null
+++ b/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
@@ -0,0 +1,166 @@
+// SPDX-License-Identifier: GPL-2.0-or-later OR MIT
+/*
+ * Broadcom BCM470X / BCM5301X ARM platform code.
+ * DTS for Meraki MR26 / Codename: Venom
+ *
+ * Copyright (C) 2022 Christian Lamparter <chunkeey@gmail.com>
+ */
+
+/dts-v1/;
+
+#include "bcm4708.dtsi"
+#include "bcm5301x-nand-cs0-bch8.dtsi"
+#include <dt-bindings/leds/common.h>
+
+/ {
+ compatible = "meraki,mr26", "brcm,bcm53015", "brcm,bcm4708";
+ model = "Meraki MR26";
+
+ memory@0 {
+ reg = <0x00000000 0x08000000>;
+ device_type = "memory";
+ };
+
+ leds {
+ compatible = "gpio-leds";
+
+ led-0 {
+ function = LED_FUNCTION_FAULT;
+ color = <LED_COLOR_ID_AMBER>;
+ gpios = <&chipcommon 13 GPIO_ACTIVE_HIGH>;
+ panic-indicator;
+ };
+ led-1 {
+ function = LED_FUNCTION_INDICATOR;
+ color = <LED_COLOR_ID_WHITE>;
+ gpios = <&chipcommon 12 GPIO_ACTIVE_HIGH>;
+ };
+ };
+
+ keys {
+ compatible = "gpio-keys";
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ key-restart {
+ label = "Reset";
+ linux,code = <KEY_RESTART>;
+ gpios = <&chipcommon 11 GPIO_ACTIVE_LOW>;
+ };
+ };
+};
+
+&uart0 {
+ clock-frequency = <50000000>;
+ /delete-property/ clocks;
+};
+
+&uart1 {
+ status = "disabled";
+};
+
+&gmac0 {
+ status = "okay";
+};
+
+&gmac1 {
+ status = "disabled";
+};
+&gmac2 {
+ status = "disabled";
+};
+&gmac3 {
+ status = "disabled";
+};
+
+&nandcs {
+ nand-ecc-algo = "hw";
+
+ partitions {
+ compatible = "fixed-partitions";
+ #address-cells = <0x1>;
+ #size-cells = <0x1>;
+
+ partition@0 {
+ label = "u-boot";
+ reg = <0x0 0x200000>;
+ read-only;
+ };
+
+ partition@200000 {
+ label = "u-boot-env";
+ reg = <0x200000 0x200000>;
+ /* empty */
+ };
+
+ partition@400000 {
+ label = "u-boot-backup";
+ reg = <0x400000 0x200000>;
+ /* empty */
+ };
+
+ partition@600000 {
+ label = "u-boot-env-backup";
+ reg = <0x600000 0x200000>;
+ /* empty */
+ };
+
+ partition@800000 {
+ label = "ubi";
+ reg = <0x800000 0x7780000>;
+ };
+ };
+};
+
+&srab {
+ status = "okay";
+
+ ports {
+ port@0 {
+ reg = <0>;
+ label = "poe";
+ };
+
+ port@5 {
+ reg = <5>;
+ label = "cpu";
+ ethernet = <&gmac0>;
+
+ fixed-link {
+ speed = <1000>;
+ duplex-full;
+ };
+ };
+ };
+};
+
+&i2c0 {
+ status = "okay";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinmux_i2c>;
+
+ clock-frequency = <100000>;
+
+ ina219@40 {
+ compatible = "ti,ina219"; /* PoE power */
+ reg = <0x40>;
+ shunt-resistor = <60000>; /* = 60 mOhms */
+ };
+
+ eeprom@56 {
+ compatible = "atmel,24c64";
+ reg = <0x56>;
+ pagesize = <32>;
+ read-only;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ /* it's empty */
+ };
+};
+
+&thermal {
+ status = "disabled";
+ /* does not work, reads 418 degree Celsius */
+};
diff --git a/arch/arm/boot/dts/imx6qdl-apalis.dtsi b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
index bd763bae596b..da919d0544a8 100644
--- a/arch/arm/boot/dts/imx6qdl-apalis.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-apalis.dtsi
@@ -315,7 +315,7 @@ stmpe811@41 {
/* ADC conversion time: 80 clocks */
st,sample-time = <4>;
- stmpe_touchscreen: stmpe-touchscreen {
+ stmpe_touchscreen: stmpe_touchscreen {
compatible = "st,stmpe-ts";
/* 8 sample average control */
st,ave-ctrl = <3>;
@@ -332,7 +332,7 @@ stmpe_touchscreen: stmpe-touchscreen {
st,touch-det-delay = <5>;
};
- stmpe_adc: stmpe-adc {
+ stmpe_adc: stmpe_adc {
compatible = "st,stmpe-adc";
/* forbid to use ADC channels 3-0 (touch) */
st,norequest-mask = <0x0F>;
diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
index afeec01f6522..eca8bf89ab88 100644
--- a/arch/arm/boot/dts/imx6ul.dtsi
+++ b/arch/arm/boot/dts/imx6ul.dtsi
@@ -64,20 +64,18 @@ cpu0: cpu@0 {
clock-frequency = <696000000>;
clock-latency = <61036>; /* two CLK32 periods */
#cooling-cells = <2>;
- operating-points = <
+ operating-points =
/* kHz uV */
- 696000 1275000
- 528000 1175000
- 396000 1025000
- 198000 950000
- >;
- fsl,soc-operating-points = <
+ <696000 1275000>,
+ <528000 1175000>,
+ <396000 1025000>,
+ <198000 950000>;
+ fsl,soc-operating-points =
/* KHz uV */
- 696000 1275000
- 528000 1175000
- 396000 1175000
- 198000 1175000
- >;
+ <696000 1275000>,
+ <528000 1175000>,
+ <396000 1175000>,
+ <198000 1175000>;
clocks = <&clks IMX6UL_CLK_ARM>,
<&clks IMX6UL_CLK_PLL2_BUS>,
<&clks IMX6UL_CLK_PLL2_PFD2>,
@@ -149,6 +147,9 @@ soc {
ocram: sram@900000 {
compatible = "mmio-sram";
reg = <0x00900000 0x20000>;
+ ranges = <0 0x00900000 0x20000>;
+ #address-cells = <1>;
+ #size-cells = <1>;
};
intc: interrupt-controller@a01000 {
@@ -543,7 +544,7 @@ fec2: ethernet@20b4000 {
};
kpp: keypad@20b8000 {
- compatible = "fsl,imx6ul-kpp", "fsl,imx6q-kpp", "fsl,imx21-kpp";
+ compatible = "fsl,imx6ul-kpp", "fsl,imx21-kpp";
reg = <0x020b8000 0x4000>;
interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6UL_CLK_KPP>;
@@ -998,7 +999,7 @@ cpu_speed_grade: speed-grade@10 {
};
csi: csi@21c4000 {
- compatible = "fsl,imx6ul-csi", "fsl,imx7-csi";
+ compatible = "fsl,imx6ul-csi";
reg = <0x021c4000 0x4000>;
interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6UL_CLK_CSI>;
@@ -1007,7 +1008,7 @@ csi: csi@21c4000 {
};
lcdif: lcdif@21c8000 {
- compatible = "fsl,imx6ul-lcdif", "fsl,imx28-lcdif";
+ compatible = "fsl,imx6ul-lcdif", "fsl,imx6sx-lcdif";
reg = <0x021c8000 0x4000>;
interrupts = <GIC_SPI 5 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clks IMX6UL_CLK_LCDIF_PIX>,
@@ -1028,7 +1029,7 @@ pxp: pxp@21cc000 {
qspi: spi@21e0000 {
#address-cells = <1>;
#size-cells = <0>;
- compatible = "fsl,imx6ul-qspi", "fsl,imx6sx-qspi";
+ compatible = "fsl,imx6ul-qspi";
reg = <0x021e0000 0x4000>, <0x60000000 0x10000000>;
reg-names = "QuadSPI", "QuadSPI-memory";
interrupts = <GIC_SPI 107 IRQ_TYPE_LEVEL_HIGH>;
diff --git a/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi b/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi
index af39e5370fa1..045e4413d339 100644
--- a/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi
+++ b/arch/arm/boot/dts/imx7d-colibri-emmc.dtsi
@@ -13,6 +13,10 @@ memory@80000000 {
};
};
+&cpu1 {
+ cpu-supply = <®_DCDC2>;
+};
+
&gpio6 {
gpio-line-names = "",
"",
diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi
index a1c8ae516d21..33a4d3441959 100644
--- a/arch/arm/boot/dts/qcom-apq8064.dtsi
+++ b/arch/arm/boot/dts/qcom-apq8064.dtsi
@@ -227,7 +227,7 @@ smem {
smd {
compatible = "qcom,smd";
- modem@0 {
+ modem-edge {
interrupts = <0 37 IRQ_TYPE_EDGE_RISING>;
qcom,ipc = <&l2cc 8 3>;
@@ -236,7 +236,7 @@ modem@0 {
status = "disabled";
};
- q6@1 {
+ q6-edge {
interrupts = <0 90 IRQ_TYPE_EDGE_RISING>;
qcom,ipc = <&l2cc 8 15>;
@@ -245,7 +245,7 @@ q6@1 {
status = "disabled";
};
- dsps@3 {
+ dsps-edge {
interrupts = <0 138 IRQ_TYPE_EDGE_RISING>;
qcom,ipc = <&sps_sic_non_secure 0x4080 0>;
@@ -254,7 +254,7 @@ dsps@3 {
status = "disabled";
};
- riva@6 {
+ riva-edge {
interrupts = <0 198 IRQ_TYPE_EDGE_RISING>;
qcom,ipc = <&l2cc 8 25>;
diff --git a/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts b/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
index 83793b835d40..f47020cf7a90 100644
--- a/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
+++ b/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
@@ -16,331 +16,320 @@ aliases {
chosen {
stdout-path = "serial0:115200n8";
};
+};
+
+&blsp1_uart2 {
+ status = "okay";
+};
+
+&blsp2_i2c5 {
+ status = "okay";
+ clock-frequency = <200000>;
- soc {
- serial@f991e000 {
+ pinctrl-0 = <&i2c11_pins>;
+ pinctrl-names = "default";
+
+ eeprom: eeprom@52 {
+ compatible = "atmel,24c128";
+ reg = <0x52>;
+ pagesize = <32>;
+ read-only;
+ };
+};
+
+&otg {
+ status = "okay";
+
+ phys = <&usb_hs2_phy>;
+ phy-select = <&tcsr 0xb000 1>;
+ extcon = <&smbb>, <&usb_id>;
+ vbus-supply = <&chg_otg>;
+ hnp-disable;
+ srp-disable;
+ adp-disable;
+
+ ulpi {
+ phy@b {
status = "okay";
+ v3p3-supply = <&pm8941_l24>;
+ v1p8-supply = <&pm8941_l6>;
+ extcon = <&smbb>;
+ qcom,init-seq = /bits/ 8 <0x1 0x63>;
};
+ };
+};
- sdhci@f9824900 {
- bus-width = <8>;
- non-removable;
- status = "okay";
+&rpm_requests {
+ pm8841-regulators {
+ compatible = "qcom,rpm-pm8841-regulators";
- vmmc-supply = <&pm8941_l20>;
- vqmmc-supply = <&pm8941_s3>;
+ pm8841_s1: s1 {
+ regulator-min-microvolt = <675000>;
+ regulator-max-microvolt = <1050000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
+ pm8841_s2: s2 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
};
- sdhci@f98a4900 {
- cd-gpios = <&msmgpio 62 0x1>;
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
- bus-width = <4>;
- status = "okay";
+ pm8841_s3: s3 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
- vmmc-supply = <&pm8941_l21>;
- vqmmc-supply = <&pm8941_l13>;
+ pm8841_s4: s4 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
};
+ };
- usb@f9a55000 {
- status = "okay";
- phys = <&usb_hs2_phy>;
- phy-select = <&tcsr 0xb000 1>;
- extcon = <&smbb>, <&usb_id>;
- vbus-supply = <&chg_otg>;
- hnp-disable;
- srp-disable;
- adp-disable;
- ulpi {
- phy@b {
- status = "okay";
- v3p3-supply = <&pm8941_l24>;
- v1p8-supply = <&pm8941_l6>;
- extcon = <&smbb>;
- qcom,init-seq = /bits/ 8 <0x1 0x63>;
- };
- };
- };
-
-
- pinctrl@fd510000 {
- i2c11_pins: i2c11 {
- mux {
- pins = "gpio83", "gpio84";
- function = "blsp_i2c11";
- };
- };
-
- spi8_default: spi8_default {
- mosi {
- pins = "gpio45";
- function = "blsp_spi8";
- };
- miso {
- pins = "gpio46";
- function = "blsp_spi8";
- };
- cs {
- pins = "gpio47";
- function = "blsp_spi8";
- };
- clk {
- pins = "gpio48";
- function = "blsp_spi8";
- };
- };
-
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <16>;
- bias-disable;
- };
-
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <10>;
- bias-pull-up;
- };
- };
-
- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
- pins = "gpio62";
- function = "gpio";
-
- drive-strength = <2>;
- bias-disable;
- };
-
- sdhc2_pin_a: sdhc2-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <10>;
- bias-disable;
- };
-
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
- };
- };
- };
-
- i2c@f9967000 {
- status = "okay";
- clock-frequency = <200000>;
- pinctrl-0 = <&i2c11_pins>;
- pinctrl-names = "default";
+ pm8941-regulators {
+ compatible = "qcom,rpm-pm8941-regulators";
+
+ vdd_l1_l3-supply = <&pm8941_s1>;
+ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+ vdd_l4_l11-supply = <&pm8941_s1>;
+ vdd_l5_l7-supply = <&pm8941_s2>;
+ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+ vin_5vs-supply = <&pm8941_5v>;
+
+ pm8941_s1: s1 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_s2: s2 {
+ regulator-min-microvolt = <2150000>;
+ regulator-max-microvolt = <2150000>;
+ regulator-boot-on;
+ };
- eeprom: eeprom@52 {
- compatible = "atmel,24c128";
- reg = <0x52>;
- pagesize = <32>;
- read-only;
- };
+ pm8941_s3: s3 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
};
+
+ pm8941_l1: l1 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_l2: l2 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
+
+ pm8941_l3: l3 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
+
+ pm8941_l4: l4 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
+
+ pm8941_l5: l5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l6: l6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l7: l7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l8: l8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l9: l9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
+
+ pm8941_l10: l10 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ };
+
+ pm8941_l11: l11 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ };
+
+ pm8941_l12: l12 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_l13: l13 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l14: l14 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l15: l15 {
+ regulator-min-microvolt = <2050000>;
+ regulator-max-microvolt = <2050000>;
+ };
+
+ pm8941_l16: l16 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
+
+ pm8941_l17: l17 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
+
+ pm8941_l18: l18 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
+
+ pm8941_l19: l19 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ regulator-always-on;
+ };
+
+ pm8941_l20: l20 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <200000>;
+ regulator-allow-set-load;
+ regulator-boot-on;
+ };
+
+ pm8941_l21: l21 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l22: l22 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ };
+
+ pm8941_l23: l23 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ };
+
+ pm8941_l24: l24 {
+ regulator-min-microvolt = <3075000>;
+ regulator-max-microvolt = <3075000>;
+ regulator-boot-on;
+ };
+ };
+};
+
+&sdhc_1 {
+ status = "okay";
+
+ vmmc-supply = <&pm8941_l20>;
+ vqmmc-supply = <&pm8941_s3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc1_pin_a>;
+};
+
+&sdhc_2 {
+ status = "okay";
+ cd-gpios = <&tlmm 62 0x1>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+
+ vmmc-supply = <&pm8941_l21>;
+ vqmmc-supply = <&pm8941_l13>;
+};
+
+&tlmm {
+ i2c11_pins: i2c11 {
+ mux {
+ pins = "gpio83", "gpio84";
+ function = "blsp_i2c11";
+ };
+ };
+
+ spi8_default: spi8_default {
+ mosi {
+ pins = "gpio45";
+ function = "blsp_spi8";
+ };
+ miso {
+ pins = "gpio46";
+ function = "blsp_spi8";
+ };
+ cs {
+ pins = "gpio47";
+ function = "blsp_spi8";
+ };
+ clk {
+ pins = "gpio48";
+ function = "blsp_spi8";
+ };
+ };
+
+ sdhc1_pin_a: sdhc1-pin-active {
+ clk {
+ pins = "sdc1_clk";
+ drive-strength = <16>;
+ bias-disable;
+ };
+
+ cmd-data {
+ pins = "sdc1_cmd", "sdc1_data";
+ drive-strength = <10>;
+ bias-pull-up;
+ };
+ };
+
+ sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+ pins = "gpio62";
+ function = "gpio";
+
+ drive-strength = <2>;
+ bias-disable;
};
- smd {
- rpm {
- rpm_requests {
- pm8841-regulators {
- s1 {
- regulator-min-microvolt = <675000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s2 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s3 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s4 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
- };
-
- pm8941-regulators {
- vdd_l1_l3-supply = <&pm8941_s1>;
- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
- vdd_l4_l11-supply = <&pm8941_s1>;
- vdd_l5_l7-supply = <&pm8941_s2>;
- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
- vin_5vs-supply = <&pm8941_5v>;
-
- s1 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- s2 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
- regulator-boot-on;
- };
-
- s3 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l3 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l4 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l10 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- regulator-always-on;
- };
-
- l11 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- };
-
- l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l17 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l19 {
- regulator-min-microvolt = <3300000>;
- regulator-max-microvolt = <3300000>;
- regulator-always-on;
- };
-
- l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-allow-set-load;
- regulator-boot-on;
- regulator-system-load = <200000>;
- };
-
- l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l23 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
-
- regulator-boot-on;
- };
- };
- };
+ sdhc2_pin_a: sdhc2-pin-active {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <10>;
+ bias-disable;
+ };
+
+ cmd-data {
+ pins = "sdc2_cmd", "sdc2_data";
+ drive-strength = <6>;
+ bias-pull-up;
};
};
};
diff --git a/arch/arm/boot/dts/qcom-apq8084.dtsi b/arch/arm/boot/dts/qcom-apq8084.dtsi
index 52240fc7a1a6..da50a1a0197f 100644
--- a/arch/arm/boot/dts/qcom-apq8084.dtsi
+++ b/arch/arm/boot/dts/qcom-apq8084.dtsi
@@ -470,7 +470,7 @@ rpm {
qcom,ipc = <&apcs 8 0>;
qcom,smd-edge = <15>;
- rpm_requests {
+ rpm-requests {
compatible = "qcom,rpm-apq8084";
qcom,smd-channels = "rpm_requests";
diff --git a/arch/arm/boot/dts/qcom-mdm9615.dtsi b/arch/arm/boot/dts/qcom-mdm9615.dtsi
index 4d4f37cebf21..96a66862875c 100644
--- a/arch/arm/boot/dts/qcom-mdm9615.dtsi
+++ b/arch/arm/boot/dts/qcom-mdm9615.dtsi
@@ -321,6 +321,7 @@ rtc@11d {
pmicgpio: gpio@150 {
compatible = "qcom,pm8018-gpio", "qcom,ssbi-gpio";
+ reg = <0x150>;
interrupt-controller;
#interrupt-cells = <2>;
gpio-controller;
diff --git a/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts b/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts
index 6d77e0f8ca4d..085591183592 100644
--- a/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts
+++ b/arch/arm/boot/dts/qcom-msm8974-fairphone-fp2.dts
@@ -5,7 +5,6 @@
#include <dt-bindings/input/input.h>
#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
-
/ {
model = "Fairphone 2";
compatible = "fairphone,fp2", "qcom,msm8974";
@@ -51,359 +50,358 @@ volume-up {
vibrator {
compatible = "gpio-vibrator";
- enable-gpios = <&msmgpio 86 GPIO_ACTIVE_HIGH>;
+ enable-gpios = <&tlmm 86 GPIO_ACTIVE_HIGH>;
vcc-supply = <&pm8941_l18>;
};
+};
+
+&blsp1_uart2 {
+ status = "okay";
+};
+
+&imem {
+ status = "okay";
+
+ reboot-mode {
+ mode-normal = <0x77665501>;
+ mode-bootloader = <0x77665500>;
+ mode-recovery = <0x77665502>;
+ };
+};
+
+&otg {
+ status = "okay";
+
+ phys = <&usb_hs1_phy>;
+ phy-select = <&tcsr 0xb000 0>;
+ extcon = <&smbb>, <&usb_id>;
+ vbus-supply = <&chg_otg>;
+
+ hnp-disable;
+ srp-disable;
+ adp-disable;
- smd {
- rpm {
- rpm_requests {
- pm8841-regulators {
- s1 {
- regulator-min-microvolt = <675000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s2 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s3 {
- regulator-min-microvolt = <1050000>;
- regulator-max-microvolt = <1050000>;
- };
- };
-
- pm8941-regulators {
- vdd_l1_l3-supply = <&pm8941_s1>;
- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
- vdd_l4_l11-supply = <&pm8941_s1>;
- vdd_l5_l7-supply = <&pm8941_s2>;
- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
- vdd_l21-supply = <&vreg_boost>;
-
- s1 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- s2 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
-
- regulator-boot-on;
- };
-
- s3 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l3 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l4 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l10 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l11 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1350000>;
- };
-
- l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l17 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l19 {
- regulator-min-microvolt = <2900000>;
- regulator-max-microvolt = <3350000>;
- };
-
- l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- regulator-system-load = <200000>;
- regulator-allow-set-load;
- };
-
- l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3300000>;
- };
-
- l23 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
-
- regulator-boot-on;
- };
- };
- };
+ ulpi {
+ phy@a {
+ status = "okay";
+
+ v1p8-supply = <&pm8941_l6>;
+ v3p3-supply = <&pm8941_l24>;
+
+ extcon = <&smbb>;
+ qcom,init-seq = /bits/ 8 <0x1 0x64>;
};
};
};
-&soc {
- serial@f991e000 {
- status = "okay";
+&pm8941_gpios {
+ gpio_keys_pin_a: gpio-keys-active {
+ pins = "gpio1", "gpio2", "gpio5";
+ function = "normal";
+
+ bias-pull-up;
+ power-source = <PM8941_GPIO_S3>;
};
+};
- remoteproc@fb21b000 {
- status = "okay";
+&pronto {
+ status = "okay";
- vddmx-supply = <&pm8841_s1>;
- vddcx-supply = <&pm8841_s2>;
+ vddmx-supply = <&pm8841_s1>;
+ vddcx-supply = <&pm8841_s2>;
+ vddpx-supply = <&pm8941_s3>;
- pinctrl-names = "default";
- pinctrl-0 = <&wcnss_pin_a>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&wcnss_pin_a>;
- smd-edge {
- qcom,remote-pid = <4>;
- label = "pronto";
+ iris {
+ vddxo-supply = <&pm8941_l6>;
+ vddrfa-supply = <&pm8941_l11>;
+ vddpa-supply = <&pm8941_l19>;
+ vdddig-supply = <&pm8941_s3>;
+ };
+
+ smd-edge {
+ qcom,remote-pid = <4>;
+ label = "pronto";
- wcnss {
- status = "okay";
- };
+ wcnss {
+ status = "okay";
};
};
+};
- pinctrl@fd510000 {
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <16>;
- bias-disable;
- };
+&remoteproc_adsp {
+ status = "okay";
+ cx-supply = <&pm8841_s2>;
+};
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <10>;
- bias-pull-up;
- };
+&remoteproc_mss {
+ status = "okay";
+ cx-supply = <&pm8841_s2>;
+ mss-supply = <&pm8841_s3>;
+ mx-supply = <&pm8841_s1>;
+ pll-supply = <&pm8941_l12>;
+};
+
+&rpm_requests {
+ pm8841-regulators {
+ compatible = "qcom,rpm-pm8841-regulators";
+
+ pm8841_s1: s1 {
+ regulator-min-microvolt = <675000>;
+ regulator-max-microvolt = <1050000>;
};
- sdhc2_pin_a: sdhc2-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <10>;
- bias-disable;
- };
+ pm8841_s2: s2 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
- };
+ pm8841_s3: s3 {
+ regulator-min-microvolt = <1050000>;
+ regulator-max-microvolt = <1050000>;
};
+ };
- wcnss_pin_a: wcnss-pin-active {
- wlan {
- pins = "gpio36", "gpio37", "gpio38", "gpio39", "gpio40";
- function = "wlan";
+ pm8941-regulators {
+ compatible = "qcom,rpm-pm8941-regulators";
+
+ vdd_l1_l3-supply = <&pm8941_s1>;
+ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+ vdd_l4_l11-supply = <&pm8941_s1>;
+ vdd_l5_l7-supply = <&pm8941_s2>;
+ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+ vdd_l21-supply = <&vreg_boost>;
+
+ pm8941_s1: s1 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- drive-strength = <6>;
- bias-pull-down;
- };
+ pm8941_s2: s2 {
+ regulator-min-microvolt = <2150000>;
+ regulator-max-microvolt = <2150000>;
+ regulator-boot-on;
+ };
- bt {
- pins = "gpio35", "gpio43", "gpio44";
- function = "bt";
+ pm8941_s3: s3 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- drive-strength = <2>;
- bias-pull-down;
- };
+ pm8941_l1: l1 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- fm {
- pins = "gpio41", "gpio42";
- function = "fm";
+ pm8941_l2: l2 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
- drive-strength = <2>;
- bias-pull-down;
- };
+ pm8941_l3: l3 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
};
- };
- sdhci@f9824900 {
- status = "okay";
+ pm8941_l4: l4 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
- vmmc-supply = <&pm8941_l20>;
- vqmmc-supply = <&pm8941_s3>;
+ pm8941_l5: l5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- bus-width = <8>;
- non-removable;
+ pm8941_l6: l6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
- };
+ pm8941_l7: l7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
- sdhci@f98a4900 {
- status = "okay";
+ pm8941_l8: l8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- vmmc-supply = <&pm8941_l21>;
- vqmmc-supply = <&pm8941_l13>;
+ pm8941_l9: l9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- bus-width = <4>;
+ pm8941_l10: l10 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a>;
+ pm8941_l11: l11 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1350000>;
+ };
+
+ pm8941_l12: l12 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_l13: l13 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l14: l14 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l15: l15 {
+ regulator-min-microvolt = <2050000>;
+ regulator-max-microvolt = <2050000>;
+ };
+
+ pm8941_l16: l16 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
+
+ pm8941_l17: l17 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
+
+ pm8941_l18: l18 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
+
+ pm8941_l19: l19 {
+ regulator-min-microvolt = <2900000>;
+ regulator-max-microvolt = <3350000>;
+ };
+
+ pm8941_l20: l20 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <200000>;
+ regulator-allow-set-load;
+ regulator-boot-on;
+ };
+
+ pm8941_l21: l21 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l22: l22 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3300000>;
+ };
+
+ pm8941_l23: l23 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ };
+
+ pm8941_l24: l24 {
+ regulator-min-microvolt = <3075000>;
+ regulator-max-microvolt = <3075000>;
+ regulator-boot-on;
+ };
};
+};
+
+&sdhc_1 {
+ status = "okay";
- usb@f9a55000 {
- status = "okay";
+ vmmc-supply = <&pm8941_l20>;
+ vqmmc-supply = <&pm8941_s3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc1_pin_a>;
+};
- phys = <&usb_hs1_phy>;
- phy-select = <&tcsr 0xb000 0>;
- extcon = <&smbb>, <&usb_id>;
- vbus-supply = <&chg_otg>;
+&sdhc_2 {
+ status = "okay";
- hnp-disable;
- srp-disable;
- adp-disable;
+ vmmc-supply = <&pm8941_l21>;
+ vqmmc-supply = <&pm8941_l13>;
- ulpi {
- phy@a {
- status = "okay";
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc2_pin_a>;
+};
- v1p8-supply = <&pm8941_l6>;
- v3p3-supply = <&pm8941_l24>;
+&tlmm {
+ sdhc1_pin_a: sdhc1-pin-active {
+ clk {
+ pins = "sdc1_clk";
+ drive-strength = <16>;
+ bias-disable;
+ };
- extcon = <&smbb>;
- qcom,init-seq = /bits/ 8 <0x1 0x64>;
- };
+ cmd-data {
+ pins = "sdc1_cmd", "sdc1_data";
+ drive-strength = <10>;
+ bias-pull-up;
};
};
- imem@fe805000 {
- status = "okay";
+ sdhc2_pin_a: sdhc2-pin-active {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <10>;
+ bias-disable;
+ };
- reboot-mode {
- mode-normal = <0x77665501>;
- mode-bootloader = <0x77665500>;
- mode-recovery = <0x77665502>;
+ cmd-data {
+ pins = "sdc2_cmd", "sdc2_data";
+ drive-strength = <6>;
+ bias-pull-up;
};
};
-};
-&spmi_bus {
- pm8941@0 {
- gpios@c000 {
- gpio_keys_pin_a: gpio-keys-active {
- pins = "gpio1", "gpio2", "gpio5";
- function = "normal";
+ wcnss_pin_a: wcnss-pin-active {
+ wlan {
+ pins = "gpio36", "gpio37", "gpio38", "gpio39", "gpio40";
+ function = "wlan";
+
+ drive-strength = <6>;
+ bias-pull-down;
+ };
+
+ bt {
+ pins = "gpio35", "gpio43", "gpio44";
+ function = "bt";
+
+ drive-strength = <2>;
+ bias-pull-down;
+ };
+
+ fm {
+ pins = "gpio41", "gpio42";
+ function = "fm";
- bias-pull-up;
- power-source = <PM8941_GPIO_S3>;
- };
+ drive-strength = <2>;
+ bias-pull-down;
};
};
};
diff --git a/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts b/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts
index 069136170198..6537950c30ba 100644
--- a/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts
+++ b/arch/arm/boot/dts/qcom-msm8974-lge-nexus5-hammerhead.dts
@@ -12,213 +12,31 @@ / {
aliases {
serial0 = &blsp1_uart1;
- serial1 = &blsp2_uart10;
+ serial1 = &blsp2_uart4;
};
chosen {
stdout-path = "serial0:115200n8";
};
- smd {
- rpm {
- rpm_requests {
- pm8841-regulators {
- s1 {
- regulator-min-microvolt = <675000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s2 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s3 {
- regulator-min-microvolt = <1050000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s4 {
- regulator-min-microvolt = <815000>;
- regulator-max-microvolt = <900000>;
- };
- };
-
- pm8941-regulators {
- vdd_l1_l3-supply = <&pm8941_s1>;
- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
- vdd_l4_l11-supply = <&pm8941_s1>;
- vdd_l5_l7-supply = <&pm8941_s2>;
- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
- vdd_l8_l16_l18_l19-supply = <&vreg_vph_pwr>;
- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
- vdd_l21-supply = <&vreg_boost>;
-
- s1 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- s2 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
-
- regulator-boot-on;
- };
-
- s3 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l3 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l4 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l10 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l11 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- };
-
- l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l17 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l19 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3300000>;
- };
-
- l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- regulator-system-load = <200000>;
- regulator-allow-set-load;
- };
-
- l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3300000>;
- };
-
- l23 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
-
- regulator-boot-on;
- };
- };
- };
+ gpio-keys {
+ compatible = "gpio-keys";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&gpio_keys_pin_a>;
+
+ volume-up {
+ label = "volume_up";
+ gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+
+ volume-down {
+ label = "volume_down";
+ gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_VOLUMEDOWN>;
};
};
@@ -229,7 +47,7 @@ vreg_wlan: wlan-regulator {
regulator-min-microvolt = <3300000>;
regulator-max-microvolt = <3300000>;
- gpio = <&msmgpio 26 GPIO_ACTIVE_HIGH>;
+ gpio = <&tlmm 26 GPIO_ACTIVE_HIGH>;
enable-active-high;
pinctrl-names = "default";
@@ -237,525 +55,676 @@ vreg_wlan: wlan-regulator {
};
};
-&soc {
- serial@f991d000 {
- status = "okay";
+&blsp1_i2c1 {
+ status = "okay";
+ clock-frequency = <100000>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c1_pins>;
+
+ charger: bq24192@6b {
+ compatible = "ti,bq24192";
+ reg = <0x6b>;
+ interrupts-extended = <&spmi_bus 0 0xd5 0 IRQ_TYPE_EDGE_FALLING>;
+
+ omit-battery-class;
+
+ usb_otg_vbus: usb-otg-vbus { };
};
- pinctrl@fd510000 {
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <16>;
- bias-disable;
- };
+ fuelgauge: max17048@36 {
+ compatible = "maxim,max17048";
+ reg = <0x36>;
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <10>;
- bias-pull-up;
- };
- };
+ maxim,double-soc;
+ maxim,rcomp = /bits/ 8 <0x4d>;
- sdhc2_pin_a: sdhc2-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <6>;
- bias-disable;
- };
+ interrupt-parent = <&tlmm>;
+ interrupts = <9 IRQ_TYPE_LEVEL_LOW>;
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
- };
- };
+ pinctrl-names = "default";
+ pinctrl-0 = <&fuelgauge_pin>;
- i2c1_pins: i2c1 {
- mux {
- pins = "gpio2", "gpio3";
- function = "blsp_i2c1";
+ maxim,alert-low-soc-level = <2>;
+ };
+};
- drive-strength = <2>;
- bias-disable;
- };
- };
+&blsp1_i2c2 {
+ status = "okay";
+ clock-frequency = <355000>;
- i2c2_pins: i2c2 {
- mux {
- pins = "gpio6", "gpio7";
- function = "blsp_i2c2";
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c2_pins>;
- drive-strength = <2>;
- bias-disable;
- };
+ synaptics@70 {
+ compatible = "syna,rmi4-i2c";
+ reg = <0x70>;
+
+ interrupts-extended = <&tlmm 5 IRQ_TYPE_EDGE_FALLING>;
+ vdd-supply = <&pm8941_l22>;
+ vio-supply = <&pm8941_lvs3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&touch_pin>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ rmi4-f01@1 {
+ reg = <0x1>;
+ syna,nosleep-mode = <1>;
};
- i2c3_pins: i2c3 {
- mux {
- pins = "gpio10", "gpio11";
- function = "blsp_i2c3";
- drive-strength = <2>;
- bias-disable;
- };
+ rmi4-f12@12 {
+ reg = <0x12>;
+ syna,sensor-type = <1>;
};
+ };
+};
- i2c11_pins: i2c11 {
- mux {
- pins = "gpio83", "gpio84";
- function = "blsp_i2c11";
+&blsp1_i2c3 {
+ status = "okay";
+ clock-frequency = <100000>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c3_pins>;
+
+ avago_apds993@39 {
+ compatible = "avago,apds9930";
+ reg = <0x39>;
+ interrupts-extended = <&tlmm 61 IRQ_TYPE_EDGE_FALLING>;
+ vdd-supply = <&pm8941_l17>;
+ vddio-supply = <&pm8941_lvs1>;
+ led-max-microamp = <100000>;
+ amstaos,proximity-diodes = <0>;
+ };
+};
- drive-strength = <2>;
- bias-disable;
- };
+&blsp2_i2c5 {
+ status = "okay";
+ clock-frequency = <355000>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c11_pins>;
+
+ led-controller@38 {
+ compatible = "ti,lm3630a";
+ status = "okay";
+ reg = <0x38>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ led@0 {
+ reg = <0>;
+ led-sources = <0 1>;
+ label = "lcd-backlight";
+ default-brightness = <200>;
};
+ };
+};
+
+&blsp2_i2c6 {
+ status = "okay";
+ clock-frequency = <100000>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c12_pins>;
+
+ mpu6515@68 {
+ compatible = "invensense,mpu6515";
+ reg = <0x68>;
+ interrupts-extended = <&tlmm 73 IRQ_TYPE_EDGE_FALLING>;
+ vddio-supply = <&pm8941_lvs1>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&mpu6515_pin>;
- i2c12_pins: i2c12 {
- mux {
- pins = "gpio87", "gpio88";
- function = "blsp_i2c12";
- drive-strength = <2>;
- bias-disable;
+ mount-matrix = "0", "-1", "0",
+ "-1", "0", "0",
+ "0", "0", "1";
+
+ i2c-gate {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ ak8963@f {
+ compatible = "asahi-kasei,ak8963";
+ reg = <0x0f>;
+ gpios = <&tlmm 67 0>;
+ vid-supply = <&pm8941_lvs1>;
+ vdd-supply = <&pm8941_l17>;
};
- };
- mpu6515_pin: mpu6515 {
- irq {
- pins = "gpio73";
- function = "gpio";
- bias-disable;
- input-enable;
+ bmp280@76 {
+ compatible = "bosch,bmp280";
+ reg = <0x76>;
+ vdda-supply = <&pm8941_lvs1>;
+ vddd-supply = <&pm8941_l17>;
};
};
+ };
+};
+
+&blsp1_uart1 {
+ status = "okay";
+};
- touch_pin: touch {
- int {
- pins = "gpio5";
- function = "gpio";
+&blsp2_uart4 {
+ status = "okay";
- drive-strength = <2>;
- bias-disable;
- input-enable;
- };
+ pinctrl-names = "default";
+ pinctrl-0 = <&blsp2_uart4_pin_a>;
- reset {
- pins = "gpio8";
- function = "gpio";
+ bluetooth {
+ compatible = "brcm,bcm43438-bt";
+ max-speed = <3000000>;
- drive-strength = <2>;
- bias-pull-up;
- };
- };
+ pinctrl-names = "default";
+ pinctrl-0 = <&bt_pin>;
+
+ host-wakeup-gpios = <&tlmm 42 GPIO_ACTIVE_HIGH>;
+ device-wakeup-gpios = <&tlmm 62 GPIO_ACTIVE_HIGH>;
+ shutdown-gpios = <&tlmm 41 GPIO_ACTIVE_HIGH>;
+ };
+};
- panel_pin: panel {
- te {
- pins = "gpio12";
- function = "mdp_vsync";
+&dsi0 {
+ status = "okay";
- drive-strength = <2>;
- bias-disable;
- };
- };
+ vdda-supply = <&pm8941_l2>;
+ vdd-supply = <&pm8941_lvs3>;
+ vddio-supply = <&pm8941_l12>;
- bt_pin: bt {
- hostwake {
- pins = "gpio42";
- function = "gpio";
- };
+ panel: panel@0 {
+ reg = <0>;
+ compatible = "lg,acx467akm-7";
- devwake {
- pins = "gpio62";
- function = "gpio";
- };
+ pinctrl-names = "default";
+ pinctrl-0 = <&panel_pin>;
- shutdown {
- pins = "gpio41";
- function = "gpio";
+ port {
+ panel_in: endpoint {
+ remote-endpoint = <&dsi0_out>;
};
};
+ };
+};
- blsp2_uart10_pin_a: blsp2-uart10-pin-active {
- tx {
- pins = "gpio53";
- function = "blsp_uart10";
+&dsi0_out {
+ remote-endpoint = <&panel_in>;
+ data-lanes = <0 1 2 3>;
+};
- drive-strength = <2>;
- bias-disable;
- };
+&dsi0_phy {
+ status = "okay";
- rx {
- pins = "gpio54";
- function = "blsp_uart10";
+ vddio-supply = <&pm8941_l12>;
+};
- drive-strength = <2>;
- bias-pull-up;
- };
+&mdss {
+ status = "okay";
+};
- cts {
- pins = "gpio55";
- function = "blsp_uart10";
+&otg {
+ status = "okay";
- drive-strength = <2>;
- bias-pull-up;
- };
+ phys = <&usb_hs1_phy>;
+ phy-select = <&tcsr 0xb000 0>;
- rts {
- pins = "gpio56";
- function = "blsp_uart10";
+ extcon = <&charger>, <&usb_id>;
+ vbus-supply = <&usb_otg_vbus>;
- drive-strength = <2>;
- bias-disable;
- };
+ hnp-disable;
+ srp-disable;
+ adp-disable;
+
+ ulpi {
+ phy@a {
+ status = "okay";
+
+ v1p8-supply = <&pm8941_l6>;
+ v3p3-supply = <&pm8941_l24>;
+
+ qcom,init-seq = /bits/ 8 <0x1 0x64>;
};
};
+};
- sdhci@f9824900 {
- status = "okay";
+&pm8941_gpios {
+ gpio_keys_pin_a: gpio-keys-active {
+ pins = "gpio2", "gpio3";
+ function = "normal";
- vmmc-supply = <&pm8941_l20>;
- vqmmc-supply = <&pm8941_s3>;
+ bias-pull-up;
+ power-source = <PM8941_GPIO_S3>;
+ };
- bus-width = <8>;
- non-removable;
+ fuelgauge_pin: fuelgauge-int {
+ pins = "gpio9";
+ function = "normal";
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
+ bias-disable;
+ input-enable;
+ power-source = <PM8941_GPIO_S3>;
};
- sdhci@f98a4900 {
- status = "okay";
+ wlan_sleep_clk_pin: wl-sleep-clk {
+ pins = "gpio16";
+ function = "func2";
- max-frequency = <100000000>;
- bus-width = <4>;
- non-removable;
- vmmc-supply = <&vreg_wlan>;
- vqmmc-supply = <&pm8941_s3>;
+ output-high;
+ power-source = <PM8941_GPIO_S3>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a>;
+ wlan_regulator_pin: wl-reg-active {
+ pins = "gpio17";
+ function = "normal";
- #address-cells = <1>;
- #size-cells = <0>;
+ bias-disable;
+ power-source = <PM8941_GPIO_S3>;
+ };
+
+ otg {
+ gpio-hog;
+ gpios = <35 GPIO_ACTIVE_HIGH>;
+ output-high;
+ line-name = "otg-gpio";
+ };
+};
- bcrmf@1 {
- compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
- reg = <1>;
+&rpm_requests {
+ pm8841-regulators {
+ compatible = "qcom,rpm-pm8841-regulators";
- brcm,drive-strength = <10>;
+ pm8841_s1: s1 {
+ regulator-min-microvolt = <675000>;
+ regulator-max-microvolt = <1050000>;
+ };
+
+ pm8841_s2: s2 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&wlan_sleep_clk_pin>;
+ pm8841_s3: s3 {
+ regulator-min-microvolt = <1050000>;
+ regulator-max-microvolt = <1050000>;
+ };
+
+ pm8841_s4: s4 {
+ regulator-min-microvolt = <815000>;
+ regulator-max-microvolt = <900000>;
};
};
- gpio-keys {
- compatible = "gpio-keys";
+ pm8941-regulators {
+ compatible = "qcom,rpm-pm8941-regulators";
+
+ vdd_l1_l3-supply = <&pm8941_s1>;
+ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+ vdd_l4_l11-supply = <&pm8941_s1>;
+ vdd_l5_l7-supply = <&pm8941_s2>;
+ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+ vdd_l8_l16_l18_l19-supply = <&vreg_vph_pwr>;
+ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+ vdd_l21-supply = <&vreg_boost>;
+
+ pm8941_s1: s1 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&gpio_keys_pin_a>;
+ pm8941_s2: s2 {
+ regulator-min-microvolt = <2150000>;
+ regulator-max-microvolt = <2150000>;
+ regulator-boot-on;
+ };
- volume-up {
- label = "volume_up";
- gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEUP>;
+ pm8941_s3: s3 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
};
- volume-down {
- label = "volume_down";
- gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEDOWN>;
+ pm8941_l1: l1 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ regulator-always-on;
+ regulator-boot-on;
};
- };
- serial@f9960000 {
- status = "okay";
+ pm8941_l2: l2 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&blsp2_uart10_pin_a>;
+ pm8941_l3: l3 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
- bluetooth {
- compatible = "brcm,bcm43438-bt";
- max-speed = <3000000>;
+ pm8941_l4: l4 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&bt_pin>;
+ pm8941_l5: l5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- host-wakeup-gpios = <&msmgpio 42 GPIO_ACTIVE_HIGH>;
- device-wakeup-gpios = <&msmgpio 62 GPIO_ACTIVE_HIGH>;
- shutdown-gpios = <&msmgpio 41 GPIO_ACTIVE_HIGH>;
+ pm8941_l6: l6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
};
- };
- i2c@f9967000 {
- status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&i2c11_pins>;
- clock-frequency = <355000>;
- qcom,src-freq = <50000000>;
+ pm8941_l7: l7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
- led-controller@38 {
- compatible = "ti,lm3630a";
- status = "okay";
- reg = <0x38>;
+ pm8941_l8: l8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- #address-cells = <1>;
- #size-cells = <0>;
+ pm8941_l9: l9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- led@0 {
- reg = <0>;
- led-sources = <0 1>;
- label = "lcd-backlight";
- default-brightness = <200>;
- };
+ pm8941_l10: l10 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
};
- };
- i2c@f9968000 {
- status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&i2c12_pins>;
- clock-frequency = <100000>;
- qcom,src-freq = <50000000>;
-
- mpu6515@68 {
- compatible = "invensense,mpu6515";
- reg = <0x68>;
- interrupts-extended = <&msmgpio 73 IRQ_TYPE_EDGE_FALLING>;
- vddio-supply = <&pm8941_lvs1>;
-
- pinctrl-names = "default";
- pinctrl-0 = <&mpu6515_pin>;
-
- mount-matrix = "0", "-1", "0",
- "-1", "0", "0",
- "0", "0", "1";
-
- i2c-gate {
- #address-cells = <1>;
- #size-cells = <0>;
- ak8963@f {
- compatible = "asahi-kasei,ak8963";
- reg = <0x0f>;
- gpios = <&msmgpio 67 0>;
- vid-supply = <&pm8941_lvs1>;
- vdd-supply = <&pm8941_l17>;
- };
-
- bmp280@76 {
- compatible = "bosch,bmp280";
- reg = <0x76>;
- vdda-supply = <&pm8941_lvs1>;
- vddd-supply = <&pm8941_l17>;
- };
- };
+ pm8941_l11: l11 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
};
- };
- i2c@f9923000 {
- status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&i2c1_pins>;
- clock-frequency = <100000>;
- qcom,src-freq = <50000000>;
+ pm8941_l12: l12 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_l13: l13 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l14: l14 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l15: l15 {
+ regulator-min-microvolt = <2050000>;
+ regulator-max-microvolt = <2050000>;
+ };
- charger: bq24192@6b {
- compatible = "ti,bq24192";
- reg = <0x6b>;
- interrupts-extended = <&spmi_bus 0 0xd5 0 IRQ_TYPE_EDGE_FALLING>;
+ pm8941_l16: l16 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
- omit-battery-class;
+ pm8941_l17: l17 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
- usb_otg_vbus: usb-otg-vbus { };
+ pm8941_l18: l18 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
};
- fuelgauge: max17048@36 {
- compatible = "maxim,max17048";
- reg = <0x36>;
+ pm8941_l19: l19 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3300000>;
+ };
- maxim,double-soc;
- maxim,rcomp = /bits/ 8 <0x4d>;
+ pm8941_l20: l20 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <200000>;
+ regulator-allow-set-load;
+ regulator-boot-on;
+ };
- interrupt-parent = <&msmgpio>;
- interrupts = <9 IRQ_TYPE_LEVEL_LOW>;
+ pm8941_l21: l21 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&fuelgauge_pin>;
+ pm8941_l22: l22 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3300000>;
+ };
- maxim,alert-low-soc-level = <2>;
+ pm8941_l23: l23 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
};
+
+ pm8941_l24: l24 {
+ regulator-min-microvolt = <3075000>;
+ regulator-max-microvolt = <3075000>;
+ regulator-boot-on;
+ };
+
+ pm8941_lvs1: lvs1 {};
+ pm8941_lvs3: lvs3 {};
};
+};
- i2c@f9924000 {
- status = "okay";
+&sdhc_1 {
+ status = "okay";
- clock-frequency = <355000>;
- qcom,src-freq = <50000000>;
+ vmmc-supply = <&pm8941_l20>;
+ vqmmc-supply = <&pm8941_s3>;
- pinctrl-names = "default";
- pinctrl-0 = <&i2c2_pins>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc1_pin_a>;
+};
- synaptics@70 {
- compatible = "syna,rmi4-i2c";
- reg = <0x70>;
+&sdhc_2 {
+ status = "okay";
- interrupts-extended = <&msmgpio 5 IRQ_TYPE_EDGE_FALLING>;
- vdd-supply = <&pm8941_l22>;
- vio-supply = <&pm8941_lvs3>;
+ max-frequency = <100000000>;
+ vmmc-supply = <&vreg_wlan>;
+ vqmmc-supply = <&pm8941_s3>;
+ non-removable;
- pinctrl-names = "default";
- pinctrl-0 = <&touch_pin>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc2_pin_a>;
- #address-cells = <1>;
- #size-cells = <0>;
+ #address-cells = <1>;
+ #size-cells = <0>;
- rmi4-f01@1 {
- reg = <0x1>;
- syna,nosleep-mode = <1>;
- };
+ bcrmf@1 {
+ compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
+ reg = <1>;
- rmi4-f12@12 {
- reg = <0x12>;
- syna,sensor-type = <1>;
- };
+ brcm,drive-strength = <10>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&wlan_sleep_clk_pin>;
+ };
+};
+
+&tlmm {
+ sdhc1_pin_a: sdhc1-pin-active {
+ clk {
+ pins = "sdc1_clk";
+ drive-strength = <16>;
+ bias-disable;
+ };
+
+ cmd-data {
+ pins = "sdc1_cmd", "sdc1_data";
+ drive-strength = <10>;
+ bias-pull-up;
};
};
- i2c@f9925000 {
- status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&i2c3_pins>;
- clock-frequency = <100000>;
- qcom,src-freq = <50000000>;
+ sdhc2_pin_a: sdhc2-pin-active {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <6>;
+ bias-disable;
+ };
- avago_apds993@39 {
- compatible = "avago,apds9930";
- reg = <0x39>;
- interrupts-extended = <&msmgpio 61 IRQ_TYPE_EDGE_FALLING>;
- vdd-supply = <&pm8941_l17>;
- vddio-supply = <&pm8941_lvs1>;
- led-max-microamp = <100000>;
- amstaos,proximity-diodes = <0>;
+ cmd-data {
+ pins = "sdc2_cmd", "sdc2_data";
+ drive-strength = <6>;
+ bias-pull-up;
};
};
- usb@f9a55000 {
- status = "okay";
+ i2c1_pins: i2c1 {
+ mux {
+ pins = "gpio2", "gpio3";
+ function = "blsp_i2c1";
- phys = <&usb_hs1_phy>;
- phy-select = <&tcsr 0xb000 0>;
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- extcon = <&charger>, <&usb_id>;
- vbus-supply = <&usb_otg_vbus>;
+ i2c2_pins: i2c2 {
+ mux {
+ pins = "gpio6", "gpio7";
+ function = "blsp_i2c2";
- hnp-disable;
- srp-disable;
- adp-disable;
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- ulpi {
- phy@a {
- status = "okay";
+ i2c3_pins: i2c3 {
+ mux {
+ pins = "gpio10", "gpio11";
+ function = "blsp_i2c3";
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- v1p8-supply = <&pm8941_l6>;
- v3p3-supply = <&pm8941_l24>;
+ i2c11_pins: i2c11 {
+ mux {
+ pins = "gpio83", "gpio84";
+ function = "blsp_i2c11";
- qcom,init-seq = /bits/ 8 <0x1 0x64>;
- };
+ drive-strength = <2>;
+ bias-disable;
};
};
- mdss@fd900000 {
- status = "okay";
+ i2c12_pins: i2c12 {
+ mux {
+ pins = "gpio87", "gpio88";
+ function = "blsp_i2c12";
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- mdp@fd900000 {
- status = "okay";
+ mpu6515_pin: mpu6515 {
+ irq {
+ pins = "gpio73";
+ function = "gpio";
+ bias-disable;
+ input-enable;
};
+ };
- dsi@fd922800 {
- status = "okay";
+ touch_pin: touch {
+ int {
+ pins = "gpio5";
+ function = "gpio";
- vdda-supply = <&pm8941_l2>;
- vdd-supply = <&pm8941_lvs3>;
- vddio-supply = <&pm8941_l12>;
+ drive-strength = <2>;
+ bias-disable;
+ input-enable;
+ };
- #address-cells = <1>;
- #size-cells = <0>;
+ reset {
+ pins = "gpio8";
+ function = "gpio";
- ports {
- port@1 {
- endpoint {
- remote-endpoint = <&panel_in>;
- data-lanes = <0 1 2 3>;
- };
- };
- };
+ drive-strength = <2>;
+ bias-pull-up;
+ };
+ };
- panel: panel@0 {
- reg = <0>;
- compatible = "lg,acx467akm-7";
+ panel_pin: panel {
+ te {
+ pins = "gpio12";
+ function = "mdp_vsync";
- pinctrl-names = "default";
- pinctrl-0 = <&panel_pin>;
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- port {
- panel_in: endpoint {
- remote-endpoint = <&dsi0_out>;
- };
- };
- };
+ bt_pin: bt {
+ hostwake {
+ pins = "gpio42";
+ function = "gpio";
};
- dsi-phy@fd922a00 {
- status = "okay";
+ devwake {
+ pins = "gpio62";
+ function = "gpio";
+ };
- vddio-supply = <&pm8941_l12>;
+ shutdown {
+ pins = "gpio41";
+ function = "gpio";
};
};
-};
-&spmi_bus {
- pm8941@0 {
- gpios@c000 {
- gpio_keys_pin_a: gpio-keys-active {
- pins = "gpio2", "gpio3";
- function = "normal";
+ blsp2_uart4_pin_a: blsp2-uart4-pin-active {
+ tx {
+ pins = "gpio53";
+ function = "blsp_uart10";
- bias-pull-up;
- power-source = <PM8941_GPIO_S3>;
- };
+ drive-strength = <2>;
+ bias-disable;
+ };
- fuelgauge_pin: fuelgauge-int {
- pins = "gpio9";
- function = "normal";
+ rx {
+ pins = "gpio54";
+ function = "blsp_uart10";
- bias-disable;
- input-enable;
- power-source = <PM8941_GPIO_S3>;
- };
-
- wlan_sleep_clk_pin: wl-sleep-clk {
- pins = "gpio16";
- function = "func2";
+ drive-strength = <2>;
+ bias-pull-up;
+ };
- output-high;
- power-source = <PM8941_GPIO_S3>;
- };
+ cts {
+ pins = "gpio55";
+ function = "blsp_uart10";
- wlan_regulator_pin: wl-reg-active {
- pins = "gpio17";
- function = "normal";
+ drive-strength = <2>;
+ bias-pull-up;
+ };
- bias-disable;
- power-source = <PM8941_GPIO_S3>;
- };
+ rts {
+ pins = "gpio56";
+ function = "blsp_uart10";
- otg {
- gpio-hog;
- gpios = <35 GPIO_ACTIVE_HIGH>;
- output-high;
- line-name = "otg-gpio";
- };
+ drive-strength = <2>;
+ bias-disable;
};
};
};
diff --git a/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts b/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts
index 96e1c978b878..9ef5a68747f1 100644
--- a/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts
+++ b/arch/arm/boot/dts/qcom-msm8974-samsung-klte.dts
@@ -13,201 +13,42 @@ / {
aliases {
serial0 = &blsp1_uart1;
mmc0 = &sdhc_1; /* SDC1 eMMC slot */
- mmc1 = &sdhc_2; /* SDC2 SD card slot */
+ mmc1 = &sdhc_3; /* SDC2 SD card slot */
};
chosen {
stdout-path = "serial0:115200n8";
};
- smd {
- rpm {
- rpm_requests {
- pma8084-regulators {
- compatible = "qcom,rpm-pma8084-regulators";
- status = "okay";
-
- pma8084_s1: s1 {
- regulator-min-microvolt = <675000>;
- regulator-max-microvolt = <1050000>;
- regulator-always-on;
- };
-
- pma8084_s2: s2 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- pma8084_s3: s3 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- };
-
- pma8084_s4: s4 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- pma8084_s5: s5 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
- };
-
- pma8084_s6: s6 {
- regulator-min-microvolt = <1050000>;
- regulator-max-microvolt = <1050000>;
- };
-
- pma8084_l1: l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- pma8084_l2: l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- pma8084_l3: l3 {
- regulator-min-microvolt = <1050000>;
- regulator-max-microvolt = <1200000>;
- };
-
- pma8084_l4: l4 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1225000>;
- };
-
- pma8084_l5: l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- pma8084_l6: l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- pma8084_l7: l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- pma8084_l8: l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- pma8084_l9: l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- pma8084_l10: l10 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- pma8084_l11: l11 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- };
-
- pma8084_l12: l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- regulator-always-on;
- };
-
- pma8084_l13: l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- pma8084_l14: l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- pma8084_l15: l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- pma8084_l16: l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- pma8084_l17: l17 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- pma8084_l18: l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- pma8084_l19: l19 {
- regulator-min-microvolt = <3300000>;
- regulator-max-microvolt = <3300000>;
- };
-
- pma8084_l20: l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-allow-set-load;
- regulator-system-load = <200000>;
- };
-
- pma8084_l21: l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-allow-set-load;
- regulator-system-load = <200000>;
- };
-
- pma8084_l22: l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3300000>;
- };
-
- pma8084_l23: l23 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- pma8084_l24: l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
- };
-
- pma8084_l25: l25 {
- regulator-min-microvolt = <2100000>;
- regulator-max-microvolt = <2100000>;
- };
-
- pma8084_l26: l26 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2050000>;
- };
-
- pma8084_l27: l27 {
- regulator-min-microvolt = <1000000>;
- regulator-max-microvolt = <1225000>;
- };
-
- pma8084_lvs1: lvs1 {};
- pma8084_lvs2: lvs2 {};
- pma8084_lvs3: lvs3 {};
- pma8084_lvs4: lvs4 {};
-
- pma8084_5vs1: 5vs1 {};
- };
- };
+ gpio-keys {
+ compatible = "gpio-keys";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&gpio_keys_pin_a>;
+
+ volume-down {
+ label = "volume_down";
+ gpios = <&pma8084_gpios 2 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_VOLUMEDOWN>;
+ debounce-interval = <15>;
+ };
+
+ home-key {
+ label = "home_key";
+ gpios = <&pma8084_gpios 3 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_HOMEPAGE>;
+ wakeup-source;
+ debounce-interval = <15>;
+ };
+
+ volume-up {
+ label = "volume_up";
+ gpios = <&pma8084_gpios 5 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_VOLUMEUP>;
+ debounce-interval = <15>;
};
};
@@ -215,8 +56,8 @@ i2c-gpio-touchkey {
compatible = "i2c-gpio";
#address-cells = <1>;
#size-cells = <0>;
- sda-gpios = <&msmgpio 95 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
- scl-gpios = <&msmgpio 96 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ sda-gpios = <&tlmm 95 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ scl-gpios = <&tlmm 96 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
pinctrl-names = "default";
pinctrl-0 = <&i2c_touchkey_pins>;
@@ -240,8 +81,8 @@ i2c-gpio-led {
compatible = "i2c-gpio";
#address-cells = <1>;
#size-cells = <0>;
- scl-gpios = <&msmgpio 121 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
- sda-gpios = <&msmgpio 120 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ scl-gpios = <&tlmm 121 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
+ sda-gpios = <&tlmm 120 (GPIO_ACTIVE_HIGH | GPIO_OPEN_DRAIN)>;
pinctrl-names = "default";
pinctrl-0 = <&i2c_led_gpioex_pins>;
@@ -259,7 +100,7 @@ gpio_expander: gpio@20 {
pinctrl-names = "default";
pinctrl-0 = <&gpioex_pin>;
- reset-gpios = <&msmgpio 145 GPIO_ACTIVE_LOW>;
+ reset-gpios = <&tlmm 145 GPIO_ACTIVE_LOW>;
};
led-controller@30 {
@@ -315,594 +156,719 @@ vreg_panel: panel-regulator {
};
/delete-node/ vreg-boost;
-
- adsp-pil {
- cx-supply = <&pma8084_s2>;
- };
};
-&soc {
- serial@f991e000 {
- status = "okay";
- };
+&blsp1_i2c2 {
+ status = "okay";
- /* blsp2_uart8 */
- serial@f995e000 {
- status = "okay";
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c2_pins>;
- pinctrl-names = "default", "sleep";
- pinctrl-0 = <&blsp2_uart8_pins_active>;
- pinctrl-1 = <&blsp2_uart8_pins_sleep>;
+ touchscreen@20 {
+ compatible = "syna,rmi4-i2c";
+ reg = <0x20>;
- bluetooth {
- compatible = "brcm,bcm43540-bt";
- max-speed = <3000000>;
- pinctrl-names = "default";
- pinctrl-0 = <&bt_pins>;
- device-wakeup-gpios = <&msmgpio 91 GPIO_ACTIVE_HIGH>;
- shutdown-gpios = <&gpio_expander 9 GPIO_ACTIVE_HIGH>;
- interrupt-parent = <&msmgpio>;
- interrupts = <75 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "host-wakeup";
- };
- };
+ interrupt-parent = <&pma8084_gpios>;
+ interrupts = <8 IRQ_TYPE_EDGE_FALLING>;
- gpio-keys {
- compatible = "gpio-keys";
+ vdd-supply = <&max77826_ldo13>;
+ vio-supply = <&pma8084_lvs2>;
pinctrl-names = "default";
- pinctrl-0 = <&gpio_keys_pin_a>;
+ pinctrl-0 = <&touch_pin>;
- volume-down {
- label = "volume_down";
- gpios = <&pma8084_gpios 2 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEDOWN>;
- debounce-interval = <15>;
- };
+ syna,startup-delay-ms = <100>;
- home-key {
- label = "home_key";
- gpios = <&pma8084_gpios 3 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_HOMEPAGE>;
- wakeup-source;
- debounce-interval = <15>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ rmi4-f01@1 {
+ reg = <0x1>;
+ syna,nosleep-mode = <1>;
};
- volume-up {
- label = "volume_up";
- gpios = <&pma8084_gpios 5 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEUP>;
- debounce-interval = <15>;
+ rmi4-f12@12 {
+ reg = <0x12>;
+ syna,sensor-type = <1>;
};
};
+};
- pinctrl@fd510000 {
- blsp2_uart8_pins_active: blsp2-uart8-pins-active {
- pins = "gpio45", "gpio46", "gpio47", "gpio48";
- function = "blsp_uart8";
- drive-strength = <8>;
- bias-disable;
- };
+&blsp1_i2c6 {
+ status = "okay";
- blsp2_uart8_pins_sleep: blsp2-uart8-pins-sleep {
- pins = "gpio45", "gpio46", "gpio47", "gpio48";
- function = "gpio";
- drive-strength = <2>;
- bias-pull-down;
- };
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c6_pins>;
+
+ pmic@60 {
+ reg = <0x60>;
+ compatible = "maxim,max77826";
- bt_pins: bt-pins {
- hostwake {
- pins = "gpio75";
- function = "gpio";
- drive-strength = <16>;
- input-enable;
+ regulators {
+ max77826_ldo1: LDO1 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
};
- devwake {
- pins = "gpio91";
- function = "gpio";
- drive-strength = <2>;
+ max77826_ldo2: LDO2 {
+ regulator-min-microvolt = <1000000>;
+ regulator-max-microvolt = <1000000>;
};
- };
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <4>;
- bias-disable;
+ max77826_ldo3: LDO3 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
};
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <4>;
- bias-pull-up;
+ max77826_ldo4: LDO4 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- };
- sdhc2_pin_a: sdhc2-pin-active {
- clk-cmd-data {
- pins = "gpio35", "gpio36", "gpio37", "gpio38",
- "gpio39", "gpio40";
- function = "sdc3";
- drive-strength = <8>;
- bias-disable;
+ max77826_ldo5: LDO5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- };
- sdhc2_cd_pin: sdhc2-cd {
- pins = "gpio62";
- function = "gpio";
+ max77826_ldo6: LDO6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <3300000>;
+ };
- drive-strength = <2>;
- bias-disable;
- };
+ max77826_ldo7: LDO7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- sdhc3_pin_a: sdhc3-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <6>;
- bias-disable;
+ max77826_ldo8: LDO8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <3300000>;
};
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
+ max77826_ldo9: LDO9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- };
- i2c2_pins: i2c2 {
- mux {
- pins = "gpio6", "gpio7";
- function = "blsp_i2c2";
+ max77826_ldo10: LDO10 {
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- drive-strength = <2>;
- bias-disable;
+ max77826_ldo11: LDO11 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2950000>;
};
- };
- i2c6_pins: i2c6 {
- mux {
- pins = "gpio29", "gpio30";
- function = "blsp_i2c6";
+ max77826_ldo12: LDO12 {
+ regulator-min-microvolt = <2500000>;
+ regulator-max-microvolt = <3300000>;
+ };
- drive-strength = <2>;
- bias-disable;
+ max77826_ldo13: LDO13 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
};
- };
- i2c12_pins: i2c12 {
- mux {
- pins = "gpio87", "gpio88";
- function = "blsp_i2c12";
+ max77826_ldo14: LDO14 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ };
- drive-strength = <2>;
- bias-disable;
+ max77826_ldo15: LDO15 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- };
- i2c_touchkey_pins: i2c-touchkey {
- mux {
- pins = "gpio95", "gpio96";
- function = "gpio";
- input-enable;
- bias-pull-up;
+ max77826_buck: BUCK {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
};
- };
- i2c_led_gpioex_pins: i2c-led-gpioex {
- mux {
- pins = "gpio120", "gpio121";
- function = "gpio";
- input-enable;
- bias-pull-down;
+ max77826_buckboost: BUCKBOOST {
+ regulator-min-microvolt = <3400000>;
+ regulator-max-microvolt = <3400000>;
};
};
+ };
+};
- gpioex_pin: gpioex {
- res {
- pins = "gpio145";
- function = "gpio";
+&blsp1_uart2 {
+ status = "okay";
+};
- bias-pull-up;
- drive-strength = <2>;
- };
- };
+&blsp2_i2c6 {
+ status = "okay";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c12_pins>;
+
+ fuelgauge@36 {
+ compatible = "maxim,max17048";
+ reg = <0x36>;
+
+ maxim,double-soc;
+ maxim,rcomp = /bits/ 8 <0x56>;
+
+ interrupt-parent = <&pma8084_gpios>;
+ interrupts = <21 IRQ_TYPE_LEVEL_LOW>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&fuelgauge_pin>;
+ };
+};
+
+&blsp2_uart2 {
+ status = "okay";
- wifi_pin: wifi {
- int {
- pins = "gpio92";
- function = "gpio";
+ pinctrl-names = "default", "sleep";
+ pinctrl-0 = <&blsp2_uart2_pins_active>;
+ pinctrl-1 = <&blsp2_uart2_pins_sleep>;
- input-enable;
- bias-pull-down;
+ bluetooth {
+ compatible = "brcm,bcm43540-bt";
+ max-speed = <3000000>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&bt_pins>;
+ device-wakeup-gpios = <&tlmm 91 GPIO_ACTIVE_HIGH>;
+ shutdown-gpios = <&gpio_expander 9 GPIO_ACTIVE_HIGH>;
+ interrupt-parent = <&tlmm>;
+ interrupts = <75 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "host-wakeup";
+ };
+};
+
+&dsi0 {
+ status = "okay";
+
+ vdda-supply = <&pma8084_l2>;
+ vdd-supply = <&pma8084_l22>;
+ vddio-supply = <&pma8084_l12>;
+
+ panel: panel@0 {
+ reg = <0>;
+ compatible = "samsung,s6e3fa2";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&panel_te_pin &panel_rst_pin>;
+
+ iovdd-supply = <&pma8084_lvs4>;
+ vddr-supply = <&vreg_panel>;
+
+ reset-gpios = <&pma8084_gpios 17 GPIO_ACTIVE_LOW>;
+ te-gpios = <&tlmm 12 GPIO_ACTIVE_HIGH>;
+
+ port {
+ panel_in: endpoint {
+ remote-endpoint = <&dsi0_out>;
};
};
+ };
+};
- panel_te_pin: panel {
- te {
- pins = "gpio12";
- function = "mdp_vsync";
+&dsi0_out {
+ remote-endpoint = <&panel_in>;
+ data-lanes = <0 1 2 3>;
+};
- drive-strength = <2>;
- bias-disable;
- };
+&dsi0_phy {
+ status = "okay";
+
+ vddio-supply = <&pma8084_l12>;
+};
+
+&gpu {
+ status = "okay";
+};
+
+&mdss {
+ status = "okay";
+};
+
+&otg {
+ status = "okay";
+
+ phys = <&usb_hs1_phy>;
+ phy-select = <&tcsr 0xb000 0>;
+
+ hnp-disable;
+ srp-disable;
+ adp-disable;
+
+ ulpi {
+ phy@a {
+ status = "okay";
+
+ v1p8-supply = <&pma8084_l6>;
+ v3p3-supply = <&pma8084_l24>;
+
+ qcom,init-seq = /bits/ 8 <0x1 0x64>;
};
};
+};
- sdhc_1: sdhci@f9824900 {
- status = "okay";
+&pma8084_gpios {
+ gpio_keys_pin_a: gpio-keys-active {
+ pins = "gpio2", "gpio3", "gpio5";
+ function = "normal";
- vmmc-supply = <&pma8084_l20>;
- vqmmc-supply = <&pma8084_s4>;
+ bias-pull-up;
+ power-source = <PMA8084_GPIO_S4>;
+ };
- bus-width = <8>;
- non-removable;
+ touchkey_pin: touchkey-int-pin {
+ pins = "gpio6";
+ function = "normal";
+ bias-disable;
+ input-enable;
+ power-source = <PMA8084_GPIO_S4>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
+ touch_pin: touchscreen-int-pin {
+ pins = "gpio8";
+ function = "normal";
+ bias-disable;
+ input-enable;
+ power-source = <PMA8084_GPIO_S4>;
};
- sdhc_2: sdhci@f9864900 {
- status = "okay";
+ panel_en_pin: panel-en-pin {
+ pins = "gpio14";
+ function = "normal";
+ bias-pull-up;
+ power-source = <PMA8084_GPIO_S4>;
+ qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
+ };
- max-frequency = <100000000>;
+ wlan_sleep_clk_pin: wlan-sleep-clk-pin {
+ pins = "gpio16";
+ function = "func2";
- vmmc-supply = <&pma8084_l21>;
- vqmmc-supply = <&pma8084_l13>;
+ output-high;
+ power-source = <PMA8084_GPIO_S4>;
+ qcom,drive-strength = <PMIC_GPIO_STRENGTH_HIGH>;
+ };
- bus-width = <4>;
+ panel_rst_pin: panel-rst-pin {
+ pins = "gpio17";
+ function = "normal";
+ bias-disable;
+ power-source = <PMA8084_GPIO_S4>;
+ qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
+ };
- /* cd-gpio is intentionally disabled. If enabled, an SD card
- * present during boot is not initialized correctly. Without
- * cd-gpios the driver resorts to polling, so hotplug works.
- */
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a /* &sdhc2_cd_pin */>;
- // cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
+ fuelgauge_pin: fuelgauge-int-pin {
+ pins = "gpio21";
+ function = "normal";
+ bias-disable;
+ input-enable;
+ power-source = <PMA8084_GPIO_S4>;
};
+};
- sdhci@f98a4900 {
- status = "okay";
+&remoteproc_adsp {
+ status = "okay";
+ cx-supply = <&pma8084_s2>;
+};
- #address-cells = <1>;
- #size-cells = <0>;
+&remoteproc_mss {
+ status = "okay";
+ cx-supply = <&pma8084_s2>;
+ mss-supply = <&pma8084_s6>;
+ mx-supply = <&pma8084_s1>;
+ pll-supply = <&pma8084_l12>;
+};
- max-frequency = <100000000>;
+&rpm_requests {
+ pma8084-regulators {
+ compatible = "qcom,rpm-pma8084-regulators";
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc3_pin_a>;
+ pma8084_s1: s1 {
+ regulator-min-microvolt = <675000>;
+ regulator-max-microvolt = <1050000>;
+ regulator-always-on;
+ };
- vmmc-supply = <&vreg_wlan>;
- vqmmc-supply = <&pma8084_s4>;
+ pma8084_s2: s2 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
- bus-width = <4>;
- non-removable;
+ pma8084_s3: s3 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ };
- wifi@1 {
- reg = <1>;
- compatible = "brcm,bcm4329-fmac";
+ pma8084_s4: s4 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- interrupt-parent = <&msmgpio>;
- interrupts = <92 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "host-wake";
+ pma8084_s5: s5 {
+ regulator-min-microvolt = <2150000>;
+ regulator-max-microvolt = <2150000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&wlan_sleep_clk_pin &wifi_pin>;
+ pma8084_s6: s6 {
+ regulator-min-microvolt = <1050000>;
+ regulator-max-microvolt = <1050000>;
};
- };
- usb@f9a55000 {
- status = "okay";
+ pma8084_l1: l1 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
- phys = <&usb_hs1_phy>;
- phy-select = <&tcsr 0xb000 0>;
- /*extcon = <&smbb>, <&usb_id>;*/
- /*vbus-supply = <&chg_otg>;*/
+ pma8084_l2: l2 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
- hnp-disable;
- srp-disable;
- adp-disable;
+ pma8084_l3: l3 {
+ regulator-min-microvolt = <1050000>;
+ regulator-max-microvolt = <1200000>;
+ };
- ulpi {
- phy@a {
- status = "okay";
+ pma8084_l4: l4 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1225000>;
+ };
- v1p8-supply = <&pma8084_l6>;
- v3p3-supply = <&pma8084_l24>;
+ pma8084_l5: l5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- /*extcon = <&smbb>;*/
- qcom,init-seq = /bits/ 8 <0x1 0x64>;
- };
+ pma8084_l6: l6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- };
- i2c@f9924000 {
- status = "okay";
+ pma8084_l7: l7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&i2c2_pins>;
+ pma8084_l8: l8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- touchscreen@20 {
- compatible = "syna,rmi4-i2c";
- reg = <0x20>;
+ pma8084_l9: l9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- interrupt-parent = <&pma8084_gpios>;
- interrupts = <8 IRQ_TYPE_EDGE_FALLING>;
+ pma8084_l10: l10 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- vdd-supply = <&max77826_ldo13>;
- vio-supply = <&pma8084_lvs2>;
+ pma8084_l11: l11 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&touch_pin>;
+ pma8084_l12: l12 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ };
- syna,startup-delay-ms = <100>;
+ pma8084_l13: l13 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- #address-cells = <1>;
- #size-cells = <0>;
+ pma8084_l14: l14 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- rmi4-f01@1 {
- reg = <0x1>;
- syna,nosleep-mode = <1>;
- };
+ pma8084_l15: l15 {
+ regulator-min-microvolt = <2050000>;
+ regulator-max-microvolt = <2050000>;
+ };
- rmi4-f12@12 {
- reg = <0x12>;
- syna,sensor-type = <1>;
- };
+ pma8084_l16: l16 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
};
- };
- i2c@f9928000 {
- status = "okay";
+ pma8084_l17: l17 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&i2c6_pins>;
-
- pmic@60 {
- reg = <0x60>;
- compatible = "maxim,max77826";
-
- regulators {
- max77826_ldo1: LDO1 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- max77826_ldo2: LDO2 {
- regulator-min-microvolt = <1000000>;
- regulator-max-microvolt = <1000000>;
- };
-
- max77826_ldo3: LDO3 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- max77826_ldo4: LDO4 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- max77826_ldo5: LDO5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- max77826_ldo6: LDO6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <3300000>;
- };
-
- max77826_ldo7: LDO7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- max77826_ldo8: LDO8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <3300000>;
- };
-
- max77826_ldo9: LDO9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- max77826_ldo10: LDO10 {
- regulator-min-microvolt = <2800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- max77826_ldo11: LDO11 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2950000>;
- };
-
- max77826_ldo12: LDO12 {
- regulator-min-microvolt = <2500000>;
- regulator-max-microvolt = <3300000>;
- };
-
- max77826_ldo13: LDO13 {
- regulator-min-microvolt = <3300000>;
- regulator-max-microvolt = <3300000>;
- };
-
- max77826_ldo14: LDO14 {
- regulator-min-microvolt = <3300000>;
- regulator-max-microvolt = <3300000>;
- };
-
- max77826_ldo15: LDO15 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- max77826_buck: BUCK {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- max77826_buckboost: BUCKBOOST {
- regulator-min-microvolt = <3400000>;
- regulator-max-microvolt = <3400000>;
- };
- };
+ pma8084_l18: l18 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
};
- };
- i2c@f9968000 {
- status = "okay";
+ pma8084_l19: l19 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&i2c12_pins>;
+ pma8084_l20: l20 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <200000>;
+ regulator-allow-set-load;
+ };
- fuelgauge@36 {
- compatible = "maxim,max17048";
- reg = <0x36>;
+ pma8084_l21: l21 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <200000>;
+ regulator-allow-set-load;
+ };
- maxim,double-soc;
- maxim,rcomp = /bits/ 8 <0x56>;
+ pma8084_l22: l22 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3300000>;
+ };
- interrupt-parent = <&pma8084_gpios>;
- interrupts = <21 IRQ_TYPE_LEVEL_LOW>;
+ pma8084_l23: l23 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&fuelgauge_pin>;
+ pma8084_l24: l24 {
+ regulator-min-microvolt = <3075000>;
+ regulator-max-microvolt = <3075000>;
};
+
+ pma8084_l25: l25 {
+ regulator-min-microvolt = <2100000>;
+ regulator-max-microvolt = <2100000>;
+ };
+
+ pma8084_l26: l26 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2050000>;
+ };
+
+ pma8084_l27: l27 {
+ regulator-min-microvolt = <1000000>;
+ regulator-max-microvolt = <1225000>;
+ };
+
+ pma8084_lvs1: lvs1 {};
+ pma8084_lvs2: lvs2 {};
+ pma8084_lvs3: lvs3 {};
+ pma8084_lvs4: lvs4 {};
+
+ pma8084_5vs1: 5vs1 {};
};
+};
+
+&sdhc_1 {
+ status = "okay";
+
+ vmmc-supply = <&pma8084_l20>;
+ vqmmc-supply = <&pma8084_s4>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc1_pin_a>;
+};
+
+&sdhc_2 {
+ status = "okay";
+ max-frequency = <100000000>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc3_pin_a>;
+
+ vmmc-supply = <&vreg_wlan>;
+ vqmmc-supply = <&pma8084_s4>;
+
+ non-removable;
- adreno@fdb00000 {
- status = "ok";
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ wifi@1 {
+ reg = <1>;
+ compatible = "brcm,bcm4329-fmac";
+
+ interrupt-parent = <&tlmm>;
+ interrupts = <92 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "host-wake";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&wlan_sleep_clk_pin &wifi_pin>;
};
+};
- mdss@fd900000 {
- status = "ok";
+&sdhc_3 {
+ status = "okay";
+ max-frequency = <100000000>;
+
+ vmmc-supply = <&pma8084_l21>;
+ vqmmc-supply = <&pma8084_l13>;
+
+ /*
+ * cd-gpio is intentionally disabled. If enabled, an SD card
+ * present during boot is not initialized correctly. Without
+ * cd-gpios the driver resorts to polling, so hotplug works.
+ */
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc2_pin_a /* &sdhc2_cd_pin */>;
+ /* cd-gpios = <&tlmm 62 GPIO_ACTIVE_LOW>; */
+};
- mdp@fd900000 {
- status = "ok";
- };
+&tlmm {
+ blsp2_uart2_pins_active: blsp2-uart2-pins-active {
+ pins = "gpio45", "gpio46", "gpio47", "gpio48";
+ function = "blsp_uart8";
+ drive-strength = <8>;
+ bias-disable;
+ };
- dsi@fd922800 {
- status = "ok";
+ blsp2_uart2_pins_sleep: blsp2-uart2-pins-sleep {
+ pins = "gpio45", "gpio46", "gpio47", "gpio48";
+ function = "gpio";
+ drive-strength = <2>;
+ bias-pull-down;
+ };
- vdda-supply = <&pma8084_l2>;
- vdd-supply = <&pma8084_l22>;
- vddio-supply = <&pma8084_l12>;
+ bt_pins: bt-pins {
+ hostwake {
+ pins = "gpio75";
+ function = "gpio";
+ drive-strength = <16>;
+ input-enable;
+ };
- #address-cells = <1>;
- #size-cells = <0>;
+ devwake {
+ pins = "gpio91";
+ function = "gpio";
+ drive-strength = <2>;
+ };
+ };
- ports {
- port@1 {
- endpoint {
- remote-endpoint = <&panel_in>;
- data-lanes = <0 1 2 3>;
- };
- };
- };
+ sdhc1_pin_a: sdhc1-pin-active {
+ clk {
+ pins = "sdc1_clk";
+ drive-strength = <4>;
+ bias-disable;
+ };
- panel: panel@0 {
- reg = <0>;
- compatible = "samsung,s6e3fa2";
+ cmd-data {
+ pins = "sdc1_cmd", "sdc1_data";
+ drive-strength = <4>;
+ bias-pull-up;
+ };
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&panel_te_pin &panel_rst_pin>;
+ sdhc2_pin_a: sdhc2-pin-active {
+ clk-cmd-data {
+ pins = "gpio35", "gpio36", "gpio37", "gpio38",
+ "gpio39", "gpio40";
+ function = "sdc3";
+ drive-strength = <8>;
+ bias-disable;
+ };
+ };
- iovdd-supply = <&pma8084_lvs4>;
- vddr-supply = <&vreg_panel>;
+ sdhc2_cd_pin: sdhc2-cd {
+ pins = "gpio62";
+ function = "gpio";
- reset-gpios = <&pma8084_gpios 17 GPIO_ACTIVE_LOW>;
- te-gpios = <&msmgpio 12 GPIO_ACTIVE_HIGH>;
+ drive-strength = <2>;
+ bias-disable;
+ };
- port {
- panel_in: endpoint {
- remote-endpoint = <&dsi0_out>;
- };
- };
- };
+ sdhc3_pin_a: sdhc3-pin-active {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <6>;
+ bias-disable;
+ };
+
+ cmd-data {
+ pins = "sdc2_cmd", "sdc2_data";
+ drive-strength = <6>;
+ bias-pull-up;
};
+ };
- dsi-phy@fd922a00 {
- status = "ok";
+ i2c2_pins: i2c2 {
+ mux {
+ pins = "gpio6", "gpio7";
+ function = "blsp_i2c2";
- vddio-supply = <&pma8084_l12>;
+ drive-strength = <2>;
+ bias-disable;
};
};
- remoteproc@fc880000 {
- cx-supply = <&pma8084_s2>;
- mss-supply = <&pma8084_s6>;
- mx-supply = <&pma8084_s1>;
- pll-supply = <&pma8084_l12>;
+ i2c6_pins: i2c6 {
+ mux {
+ pins = "gpio29", "gpio30";
+ function = "blsp_i2c6";
+
+ drive-strength = <2>;
+ bias-disable;
+ };
};
-};
-&spmi_bus {
- pma8084@0 {
- gpios@c000 {
- gpio_keys_pin_a: gpio-keys-active {
- pins = "gpio2", "gpio3", "gpio5";
- function = "normal";
+ i2c12_pins: i2c12 {
+ mux {
+ pins = "gpio87", "gpio88";
+ function = "blsp_i2c12";
- bias-pull-up;
- power-source = <PMA8084_GPIO_S4>;
- };
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- touchkey_pin: touchkey-int-pin {
- pins = "gpio6";
- function = "normal";
- bias-disable;
- input-enable;
- power-source = <PMA8084_GPIO_S4>;
- };
+ i2c_touchkey_pins: i2c-touchkey {
+ mux {
+ pins = "gpio95", "gpio96";
+ function = "gpio";
+ input-enable;
+ bias-pull-up;
+ };
+ };
- touch_pin: touchscreen-int-pin {
- pins = "gpio8";
- function = "normal";
- bias-disable;
- input-enable;
- power-source = <PMA8084_GPIO_S4>;
- };
+ i2c_led_gpioex_pins: i2c-led-gpioex {
+ mux {
+ pins = "gpio120", "gpio121";
+ function = "gpio";
+ input-enable;
+ bias-pull-down;
+ };
+ };
- panel_en_pin: panel-en-pin {
- pins = "gpio14";
- function = "normal";
- bias-pull-up;
- power-source = <PMA8084_GPIO_S4>;
- qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
- };
+ gpioex_pin: gpioex {
+ res {
+ pins = "gpio145";
+ function = "gpio";
- wlan_sleep_clk_pin: wlan-sleep-clk-pin {
- pins = "gpio16";
- function = "func2";
+ bias-pull-up;
+ drive-strength = <2>;
+ };
+ };
- output-high;
- power-source = <PMA8084_GPIO_S4>;
- qcom,drive-strength = <PMIC_GPIO_STRENGTH_HIGH>;
- };
+ wifi_pin: wifi {
+ int {
+ pins = "gpio92";
+ function = "gpio";
- panel_rst_pin: panel-rst-pin {
- pins = "gpio17";
- function = "normal";
- bias-disable;
- power-source = <PMA8084_GPIO_S4>;
- qcom,drive-strength = <PMIC_GPIO_STRENGTH_LOW>;
- };
+ input-enable;
+ bias-pull-down;
+ };
+ };
+ panel_te_pin: panel {
+ te {
+ pins = "gpio12";
+ function = "mdp_vsync";
- fuelgauge_pin: fuelgauge-int-pin {
- pins = "gpio21";
- function = "normal";
- bias-disable;
- input-enable;
- power-source = <PMA8084_GPIO_S4>;
- };
+ drive-strength = <2>;
+ bias-disable;
};
};
};
diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts
index 79e2cfbbb1ba..68d5626bf491 100644
--- a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts
+++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-amami.dts
@@ -1,435 +1,13 @@
// SPDX-License-Identifier: GPL-2.0
-#include "qcom-msm8974.dtsi"
-#include "qcom-pm8841.dtsi"
-#include "qcom-pm8941.dtsi"
-#include <dt-bindings/gpio/gpio.h>
-#include <dt-bindings/input/input.h>
-#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
+#include "qcom-msm8974-sony-xperia-rhine.dtsi"
/ {
model = "Sony Xperia Z1 Compact";
compatible = "sony,xperia-amami", "qcom,msm8974";
-
- aliases {
- serial0 = &blsp1_uart2;
- };
-
- chosen {
- stdout-path = "serial0:115200n8";
- };
-
- gpio-keys {
- compatible = "gpio-keys";
-
- pinctrl-names = "default";
- pinctrl-0 = <&gpio_keys_pin_a>;
-
- volume-down {
- label = "volume_down";
- gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEDOWN>;
- };
-
- camera-snapshot {
- label = "camera_snapshot";
- gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_CAMERA>;
- };
-
- camera-focus {
- label = "camera_focus";
- gpios = <&pm8941_gpios 4 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_CAMERA_FOCUS>;
- };
-
- volume-up {
- label = "volume_up";
- gpios = <&pm8941_gpios 5 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEUP>;
- };
- };
-
- memory@0 {
- reg = <0 0x40000000>, <0x40000000 0x40000000>;
- device_type = "memory";
- };
-
- smd {
- rpm {
- rpm_requests {
- pm8841-regulators {
- s1 {
- regulator-min-microvolt = <675000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s2 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s3 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s4 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
- };
-
- pm8941-regulators {
- vdd_l1_l3-supply = <&pm8941_s1>;
- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
- vdd_l4_l11-supply = <&pm8941_s1>;
- vdd_l5_l7-supply = <&pm8941_s2>;
- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
- vdd_l21-supply = <&vreg_boost>;
-
- s1 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- s2 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
- regulator-boot-on;
- };
-
- s3 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- s4 {
- regulator-min-microvolt = <5000000>;
- regulator-max-microvolt = <5000000>;
- };
-
- l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l3 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l4 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l11 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1350000>;
- };
-
- l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l17 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l19 {
- regulator-min-microvolt = <3300000>;
- regulator-max-microvolt = <3300000>;
- };
-
- l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-allow-set-load;
- regulator-boot-on;
- regulator-system-load = <200000>;
- };
-
- l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l23 {
- regulator-min-microvolt = <2800000>;
- regulator-max-microvolt = <2800000>;
- };
-
- l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
-
- regulator-boot-on;
- };
- };
- };
- };
- };
};
-&soc {
- sdhci@f9824900 {
- status = "okay";
-
- vmmc-supply = <&pm8941_l20>;
- vqmmc-supply = <&pm8941_s3>;
-
- bus-width = <8>;
- non-removable;
-
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
- };
-
- sdhci@f98a4900 {
- status = "okay";
-
- bus-width = <4>;
-
- vmmc-supply = <&pm8941_l21>;
- vqmmc-supply = <&pm8941_l13>;
-
- cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
-
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
- };
-
- serial@f991e000 {
- status = "okay";
-
- pinctrl-names = "default";
- pinctrl-0 = <&blsp1_uart2_pin_a>;
- };
-
-
- pinctrl@fd510000 {
- blsp1_uart2_pin_a: blsp1-uart2-pin-active {
- rx {
- pins = "gpio5";
- function = "blsp_uart2";
-
- drive-strength = <2>;
- bias-pull-up;
- };
-
- tx {
- pins = "gpio4";
- function = "blsp_uart2";
-
- drive-strength = <4>;
- bias-disable;
- };
- };
-
- i2c2_pins: i2c2 {
- mux {
- pins = "gpio6", "gpio7";
- function = "blsp_i2c2";
-
- drive-strength = <2>;
- bias-disable;
- };
- };
-
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <16>;
- bias-disable;
- };
-
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <10>;
- bias-pull-up;
- };
- };
-
- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
- pins = "gpio62";
- function = "gpio";
-
- drive-strength = <2>;
- bias-disable;
- };
-
- sdhc2_pin_a: sdhc2-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <10>;
- bias-disable;
- };
-
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
- };
- };
- };
-
- dma-controller@f9944000 {
- qcom,controlled-remotely;
- };
-
- usb@f9a55000 {
- status = "okay";
-
- phys = <&usb_hs1_phy>;
- phy-select = <&tcsr 0xb000 0>;
- extcon = <&smbb>, <&usb_id>;
- vbus-supply = <&chg_otg>;
-
- hnp-disable;
- srp-disable;
- adp-disable;
-
- ulpi {
- phy@a {
- status = "okay";
-
- v1p8-supply = <&pm8941_l6>;
- v3p3-supply = <&pm8941_l24>;
-
- extcon = <&smbb>;
- qcom,init-seq = /bits/ 8 <0x1 0x64>;
- };
- };
- };
-};
-
-&spmi_bus {
- pm8941@0 {
- charger@1000 {
- qcom,fast-charge-safe-current = <1300000>;
- qcom,fast-charge-current-limit = <1300000>;
- qcom,dc-current-limit = <1300000>;
- qcom,fast-charge-safe-voltage = <4400000>;
- qcom,fast-charge-high-threshold-voltage = <4350000>;
- qcom,fast-charge-low-threshold-voltage = <3400000>;
- qcom,auto-recharge-threshold-voltage = <4200000>;
- qcom,minimum-input-voltage = <4300000>;
- };
-
- gpios@c000 {
- gpio_keys_pin_a: gpio-keys-active {
- pins = "gpio2", "gpio3", "gpio4", "gpio5";
- function = "normal";
-
- bias-pull-up;
- power-source = <PM8941_GPIO_S3>;
- };
- };
-
- coincell@2800 {
- status = "okay";
- qcom,rset-ohms = <2100>;
- qcom,vset-millivolts = <3000>;
- };
- };
-
- pm8941@1 {
- wled@d800 {
- status = "okay";
-
- qcom,cs-out;
- qcom,current-limit = <20>;
- qcom,current-boost-limit = <805>;
- qcom,switching-freq = <1600>;
- qcom,ovp = <29>;
- qcom,num-strings = <2>;
- };
- };
+&smbb {
+ qcom,fast-charge-safe-current = <1300000>;
+ qcom,fast-charge-current-limit = <1300000>;
+ qcom,dc-current-limit = <1300000>;
};
diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts
index e66937e3f7dd..b8b6447b3217 100644
--- a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts
+++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-castor.dts
@@ -11,7 +11,7 @@ / {
aliases {
serial0 = &blsp1_uart2;
- serial1 = &blsp2_uart7;
+ serial1 = &blsp2_uart1;
};
chosen {
@@ -53,193 +53,13 @@ volume-up {
};
};
- smd {
- rpm {
- rpm_requests {
- pm8941-regulators {
- vdd_l1_l3-supply = <&pm8941_s1>;
- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
- vdd_l4_l11-supply = <&pm8941_s1>;
- vdd_l5_l7-supply = <&pm8941_s2>;
- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
- vdd_l21-supply = <&vreg_boost>;
-
- s1 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- s2 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
- regulator-boot-on;
- };
-
- s3 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- regulator-always-on;
- regulator-boot-on;
-
- regulator-system-load = <154000>;
- };
-
- s4 {
- regulator-min-microvolt = <5000000>;
- regulator-max-microvolt = <5000000>;
- };
-
- l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l3 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l4 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l11 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1350000>;
- };
-
- l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l17 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l19 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-allow-set-load;
- regulator-boot-on;
- regulator-allow-set-load;
- regulator-system-load = <500000>;
- };
-
- l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l23 {
- regulator-min-microvolt = <2800000>;
- regulator-max-microvolt = <2800000>;
- };
-
- l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
-
- regulator-boot-on;
- };
- };
- };
- };
- };
-
vreg_bl_vddio: lcd-backlight-vddio {
compatible = "regulator-fixed";
regulator-name = "vreg_bl_vddio";
regulator-min-microvolt = <3150000>;
regulator-max-microvolt = <3150000>;
- gpio = <&msmgpio 69 0>;
+ gpio = <&tlmm 69 0>;
enable-active-high;
vin-supply = <&pm8941_s3>;
@@ -277,447 +97,605 @@ vreg_wlan: wlan-regulator {
};
};
-&soc {
- sdhci@f9824900 {
- status = "okay";
+&blsp1_uart2 {
+ status = "okay";
- vmmc-supply = <&pm8941_l20>;
- vqmmc-supply = <&pm8941_s3>;
-
- bus-width = <8>;
- non-removable;
+ pinctrl-names = "default";
+ pinctrl-0 = <&blsp1_uart2_pin_a>;
+};
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
- };
+&blsp2_i2c2 {
+ status = "okay";
+ clock-frequency = <355000>;
- sdhci@f9864900 {
- status = "okay";
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c8_pins>;
- max-frequency = <100000000>;
- non-removable;
- vmmc-supply = <&vreg_wlan>;
+ synaptics@2c {
+ compatible = "syna,rmi4-i2c";
+ reg = <0x2c>;
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc3_pin_a>;
+ interrupt-parent = <&tlmm>;
+ interrupts = <86 IRQ_TYPE_EDGE_FALLING>;
#address-cells = <1>;
#size-cells = <0>;
- bcrmf@1 {
- compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
- reg = <1>;
+ vdd-supply = <&pm8941_l22>;
+ vio-supply = <&pm8941_lvs3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&ts_int_pin>;
- brcm,drive-strength = <10>;
+ syna,startup-delay-ms = <10>;
- pinctrl-names = "default";
- pinctrl-0 = <&wlan_sleep_clk_pin>;
+ rmi-f01@1 {
+ reg = <0x1>;
+ syna,nosleep = <1>;
};
- };
- sdhci@f98a4900 {
- status = "okay";
+ rmi-f11@11 {
+ reg = <0x11>;
+ syna,f11-flip-x = <1>;
+ syna,sensor-type = <1>;
+ };
+ };
+};
- bus-width = <4>;
+&blsp2_i2c5 {
+ status = "okay";
+ clock-frequency = <355000>;
- vmmc-supply = <&pm8941_l21>;
- vqmmc-supply = <&pm8941_l13>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c11_pins>;
- cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
+ lp8566_wled: backlight@2c {
+ compatible = "ti,lp8556";
+ reg = <0x2c>;
+ power-supply = <&vreg_bl_vddio>;
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+ bl-name = "backlight";
+ dev-ctrl = /bits/ 8 <0x05>;
+ init-brt = /bits/ 8 <0x3f>;
+ rom_a0h {
+ rom-addr = /bits/ 8 <0xa0>;
+ rom-val = /bits/ 8 <0xff>;
+ };
+ rom_a1h {
+ rom-addr = /bits/ 8 <0xa1>;
+ rom-val = /bits/ 8 <0x3f>;
+ };
+ rom_a2h {
+ rom-addr = /bits/ 8 <0xa2>;
+ rom-val = /bits/ 8 <0x20>;
+ };
+ rom_a3h {
+ rom-addr = /bits/ 8 <0xa3>;
+ rom-val = /bits/ 8 <0x5e>;
+ };
+ rom_a4h {
+ rom-addr = /bits/ 8 <0xa4>;
+ rom-val = /bits/ 8 <0x02>;
+ };
+ rom_a5h {
+ rom-addr = /bits/ 8 <0xa5>;
+ rom-val = /bits/ 8 <0x04>;
+ };
+ rom_a6h {
+ rom-addr = /bits/ 8 <0xa6>;
+ rom-val = /bits/ 8 <0x80>;
+ };
+ rom_a7h {
+ rom-addr = /bits/ 8 <0xa7>;
+ rom-val = /bits/ 8 <0xf7>;
+ };
+ rom_a9h {
+ rom-addr = /bits/ 8 <0xa9>;
+ rom-val = /bits/ 8 <0x80>;
+ };
+ rom_aah {
+ rom-addr = /bits/ 8 <0xaa>;
+ rom-val = /bits/ 8 <0x0f>;
+ };
+ rom_aeh {
+ rom-addr = /bits/ 8 <0xae>;
+ rom-val = /bits/ 8 <0x0f>;
+ };
};
+};
+
+&blsp2_uart1 {
+ status = "okay";
- serial@f991e000 {
- status = "okay";
+ pinctrl-names = "default";
+ pinctrl-0 = <&blsp2_uart7_pin_a>;
+
+ bluetooth {
+ compatible = "brcm,bcm43438-bt";
+ max-speed = <3000000>;
pinctrl-names = "default";
- pinctrl-0 = <&blsp1_uart2_pin_a>;
+ pinctrl-0 = <&bt_host_wake_pin>,
+ <&bt_dev_wake_pin>,
+ <&bt_reg_on_pin>;
+
+ host-wakeup-gpios = <&tlmm 95 GPIO_ACTIVE_HIGH>;
+ device-wakeup-gpios = <&tlmm 96 GPIO_ACTIVE_HIGH>;
+ shutdown-gpios = <&pm8941_gpios 16 GPIO_ACTIVE_HIGH>;
};
+};
- serial@f995d000 {
- status = "ok";
+&otg {
+ status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&blsp2_uart7_pin_a>;
+ phys = <&usb_hs1_phy>;
+ phy-select = <&tcsr 0xb000 0>;
+ extcon = <&smbb>, <&usb_id>;
+ vbus-supply = <&chg_otg>;
- bluetooth {
- compatible = "brcm,bcm43438-bt";
- max-speed = <3000000>;
+ hnp-disable;
+ srp-disable;
+ adp-disable;
- pinctrl-names = "default";
- pinctrl-0 = <&bt_host_wake_pin>,
- <&bt_dev_wake_pin>,
- <&bt_reg_on_pin>;
+ ulpi {
+ phy@a {
+ status = "okay";
- host-wakeup-gpios = <&msmgpio 95 GPIO_ACTIVE_HIGH>;
- device-wakeup-gpios = <&msmgpio 96 GPIO_ACTIVE_HIGH>;
- shutdown-gpios = <&pm8941_gpios 16 GPIO_ACTIVE_HIGH>;
+ v1p8-supply = <&pm8941_l6>;
+ v3p3-supply = <&pm8941_l24>;
+
+ extcon = <&smbb>;
+ qcom,init-seq = /bits/ 8 <0x1 0x64>;
};
};
+};
- usb@f9a55000 {
- status = "okay";
+&pm8941_coincell {
+ status = "okay";
- phys = <&usb_hs1_phy>;
- phy-select = <&tcsr 0xb000 0>;
- extcon = <&smbb>, <&usb_id>;
- vbus-supply = <&chg_otg>;
+ qcom,rset-ohms = <2100>;
+ qcom,vset-millivolts = <3000>;
+};
- hnp-disable;
- srp-disable;
- adp-disable;
+&pm8941_gpios {
+ gpio_keys_pin_a: gpio-keys-active {
+ pins = "gpio2", "gpio5";
+ function = "normal";
- ulpi {
- phy@a {
- status = "okay";
+ bias-pull-up;
+ power-source = <PM8941_GPIO_S3>;
+ };
- v1p8-supply = <&pm8941_l6>;
- v3p3-supply = <&pm8941_l24>;
+ bt_reg_on_pin: bt-reg-on {
+ pins = "gpio16";
+ function = "normal";
- extcon = <&smbb>;
- qcom,init-seq = /bits/ 8 <0x1 0x64>;
- };
- };
+ output-low;
+ power-source = <PM8941_GPIO_S3>;
};
- pinctrl@fd510000 {
- blsp1_uart2_pin_a: blsp1-uart2-pin-active {
- rx {
- pins = "gpio5";
- function = "blsp_uart2";
+ wlan_sleep_clk_pin: wl-sleep-clk {
+ pins = "gpio17";
+ function = "func2";
- drive-strength = <2>;
- bias-pull-up;
- };
+ output-high;
+ power-source = <PM8941_GPIO_S3>;
+ };
- tx {
- pins = "gpio4";
- function = "blsp_uart2";
+ wlan_regulator_pin: wl-reg-active {
+ pins = "gpio18";
+ function = "normal";
- drive-strength = <4>;
- bias-disable;
- };
- };
+ bias-disable;
+ power-source = <PM8941_GPIO_S3>;
+ };
- blsp2_uart7_pin_a: blsp2-uart7-pin-active {
- tx {
- pins = "gpio41";
- function = "blsp_uart7";
+ lcd_dcdc_en_pin_a: lcd-dcdc-en-active {
+ pins = "gpio20";
+ function = "normal";
- drive-strength = <2>;
- bias-disable;
- };
+ bias-disable;
+ power-source = <PM8941_GPIO_S3>;
+ input-disable;
+ output-low;
+ };
- rx {
- pins = "gpio42";
- function = "blsp_uart7";
+};
- drive-strength = <2>;
- bias-pull-up;
- };
+&rpm_requests {
+ pm8941-regulators {
+ compatible = "qcom,rpm-pm8941-regulators";
+
+ vdd_l1_l3-supply = <&pm8941_s1>;
+ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+ vdd_l4_l11-supply = <&pm8941_s1>;
+ vdd_l5_l7-supply = <&pm8941_s2>;
+ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+ vdd_l21-supply = <&vreg_boost>;
+
+ pm8941_s1: s1 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- cts {
- pins = "gpio43";
- function = "blsp_uart7";
+ pm8941_s2: s2 {
+ regulator-min-microvolt = <2150000>;
+ regulator-max-microvolt = <2150000>;
+ regulator-boot-on;
+ };
- drive-strength = <2>;
- bias-pull-up;
- };
+ pm8941_s3: s3 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-system-load = <154000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- rts {
- pins = "gpio44";
- function = "blsp_uart7";
+ pm8941_s4: s4 {
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <5000000>;
+ };
- drive-strength = <2>;
- bias-disable;
- };
+ pm8941_l1: l1 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ regulator-always-on;
+ regulator-boot-on;
};
- i2c8_pins: i2c8 {
- mux {
- pins = "gpio47", "gpio48";
- function = "blsp_i2c8";
+ pm8941_l2: l2 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
- drive-strength = <2>;
- bias-disable;
- };
+ pm8941_l3: l3 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
};
- i2c11_pins: i2c11 {
- mux {
- pins = "gpio83", "gpio84";
- function = "blsp_i2c11";
+ pm8941_l4: l4 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
- drive-strength = <2>;
- bias-disable;
- };
+ pm8941_l5: l5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- lcd_backlight_en_pin_a: lcd-backlight-vddio {
- pins = "gpio69";
- drive-strength = <10>;
- output-low;
- bias-disable;
+ pm8941_l6: l6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
};
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <16>;
- bias-disable;
- };
+ pm8941_l7: l7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <10>;
- bias-pull-up;
- };
+ pm8941_l8: l8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
};
- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
- pins = "gpio62";
- function = "gpio";
+ pm8941_l9: l9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
- drive-strength = <2>;
- bias-disable;
- };
+ pm8941_l11: l11 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1350000>;
+ };
- sdhc2_pin_a: sdhc2-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <6>;
- bias-disable;
- };
+ pm8941_l12: l12 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
- };
+ pm8941_l13: l13 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
};
- sdhc3_pin_a: sdhc3-pin-active {
- clk {
- pins = "gpio40";
- function = "sdc3";
+ pm8941_l14: l14 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
- drive-strength = <10>;
- bias-disable;
- };
+ pm8941_l15: l15 {
+ regulator-min-microvolt = <2050000>;
+ regulator-max-microvolt = <2050000>;
+ };
- cmd {
- pins = "gpio39";
- function = "sdc3";
+ pm8941_l16: l16 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
- drive-strength = <10>;
- bias-pull-up;
- };
+ pm8941_l17: l17 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
- data {
- pins = "gpio35", "gpio36", "gpio37", "gpio38";
- function = "sdc3";
+ pm8941_l18: l18 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
- drive-strength = <10>;
- bias-pull-up;
- };
+ pm8941_l19: l19 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
};
- ts_int_pin: synaptics {
- pin {
- pins = "gpio86";
- function = "gpio";
+ pm8941_l20: l20 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <500000>;
+ regulator-allow-set-load;
+ regulator-boot-on;
+ };
- drive-strength = <2>;
- bias-disable;
- input-enable;
- };
+ pm8941_l21: l21 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
};
- bt_host_wake_pin: bt-host-wake {
- pins = "gpio95";
- function = "gpio";
+ pm8941_l22: l22 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ };
- drive-strength = <2>;
- bias-disable;
- output-low;
+ pm8941_l23: l23 {
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
};
- bt_dev_wake_pin: bt-dev-wake {
- pins = "gpio96";
- function = "gpio";
+ pm8941_l24: l24 {
+ regulator-min-microvolt = <3075000>;
+ regulator-max-microvolt = <3075000>;
+ regulator-boot-on;
+ };
+
+ pm8941_lvs3: lvs3 {};
+ };
+};
+
+&sdhc_1 {
+ status = "okay";
+
+ vmmc-supply = <&pm8941_l20>;
+ vqmmc-supply = <&pm8941_s3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc1_pin_a>;
+};
+
+&sdhc_2 {
+ status = "okay";
+
+ vmmc-supply = <&pm8941_l21>;
+ vqmmc-supply = <&pm8941_l13>;
+
+ cd-gpios = <&tlmm 62 GPIO_ACTIVE_LOW>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+};
+
+&sdhc_3 {
+ status = "okay";
+
+ max-frequency = <100000000>;
+ vmmc-supply = <&vreg_wlan>;
+ non-removable;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc3_pin_a>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ bcrmf@1 {
+ compatible = "brcm,bcm4339-fmac", "brcm,bcm4329-fmac";
+ reg = <1>;
+
+ brcm,drive-strength = <10>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&wlan_sleep_clk_pin>;
+ };
+};
+
+&smbb {
+ qcom,fast-charge-safe-current = <1500000>;
+ qcom,fast-charge-current-limit = <1500000>;
+ qcom,dc-current-limit = <1800000>;
+ qcom,fast-charge-safe-voltage = <4400000>;
+ qcom,fast-charge-high-threshold-voltage = <4350000>;
+ qcom,fast-charge-low-threshold-voltage = <3400000>;
+ qcom,auto-recharge-threshold-voltage = <4200000>;
+ qcom,minimum-input-voltage = <4300000>;
+};
+
+&tlmm {
+ blsp1_uart2_pin_a: blsp1-uart2-pin-active {
+ rx {
+ pins = "gpio5";
+ function = "blsp_uart2";
drive-strength = <2>;
+ bias-pull-up;
+ };
+
+ tx {
+ pins = "gpio4";
+ function = "blsp_uart2";
+
+ drive-strength = <4>;
bias-disable;
};
};
- i2c@f9964000 {
- status = "okay";
+ blsp2_uart7_pin_a: blsp2-uart7-pin-active {
+ tx {
+ pins = "gpio41";
+ function = "blsp_uart7";
- clock-frequency = <355000>;
- qcom,src-freq = <50000000>;
-
- pinctrl-names = "default";
- pinctrl-0 = <&i2c8_pins>;
+ drive-strength = <2>;
+ bias-disable;
+ };
- synaptics@2c {
- compatible = "syna,rmi4-i2c";
- reg = <0x2c>;
+ rx {
+ pins = "gpio42";
+ function = "blsp_uart7";
- interrupt-parent = <&msmgpio>;
- interrupts = <86 IRQ_TYPE_EDGE_FALLING>;
+ drive-strength = <2>;
+ bias-pull-up;
+ };
- #address-cells = <1>;
- #size-cells = <0>;
+ cts {
+ pins = "gpio43";
+ function = "blsp_uart7";
- vdd-supply = <&pm8941_l22>;
- vio-supply = <&pm8941_lvs3>;
+ drive-strength = <2>;
+ bias-pull-up;
+ };
- pinctrl-names = "default";
- pinctrl-0 = <&ts_int_pin>;
+ rts {
+ pins = "gpio44";
+ function = "blsp_uart7";
- syna,startup-delay-ms = <10>;
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
- rmi-f01@1 {
- reg = <0x1>;
- syna,nosleep = <1>;
- };
+ i2c8_pins: i2c8 {
+ mux {
+ pins = "gpio47", "gpio48";
+ function = "blsp_i2c8";
- rmi-f11@11 {
- reg = <0x11>;
- syna,f11-flip-x = <1>;
- syna,sensor-type = <1>;
- };
+ drive-strength = <2>;
+ bias-disable;
};
};
- i2c@f9967000 {
- status = "okay";
- pinctrl-names = "default";
- pinctrl-0 = <&i2c11_pins>;
- clock-frequency = <355000>;
- qcom,src-freq = <50000000>;
-
- lp8566_wled: backlight@2c {
- compatible = "ti,lp8556";
- reg = <0x2c>;
- power-supply = <&vreg_bl_vddio>;
-
- bl-name = "backlight";
- dev-ctrl = /bits/ 8 <0x05>;
- init-brt = /bits/ 8 <0x3f>;
- rom_a0h {
- rom-addr = /bits/ 8 <0xa0>;
- rom-val = /bits/ 8 <0xff>;
- };
- rom_a1h {
- rom-addr = /bits/ 8 <0xa1>;
- rom-val = /bits/ 8 <0x3f>;
- };
- rom_a2h {
- rom-addr = /bits/ 8 <0xa2>;
- rom-val = /bits/ 8 <0x20>;
- };
- rom_a3h {
- rom-addr = /bits/ 8 <0xa3>;
- rom-val = /bits/ 8 <0x5e>;
- };
- rom_a4h {
- rom-addr = /bits/ 8 <0xa4>;
- rom-val = /bits/ 8 <0x02>;
- };
- rom_a5h {
- rom-addr = /bits/ 8 <0xa5>;
- rom-val = /bits/ 8 <0x04>;
- };
- rom_a6h {
- rom-addr = /bits/ 8 <0xa6>;
- rom-val = /bits/ 8 <0x80>;
- };
- rom_a7h {
- rom-addr = /bits/ 8 <0xa7>;
- rom-val = /bits/ 8 <0xf7>;
- };
- rom_a9h {
- rom-addr = /bits/ 8 <0xa9>;
- rom-val = /bits/ 8 <0x80>;
- };
- rom_aah {
- rom-addr = /bits/ 8 <0xaa>;
- rom-val = /bits/ 8 <0x0f>;
- };
- rom_aeh {
- rom-addr = /bits/ 8 <0xae>;
- rom-val = /bits/ 8 <0x0f>;
- };
+ i2c11_pins: i2c11 {
+ mux {
+ pins = "gpio83", "gpio84";
+ function = "blsp_i2c11";
+
+ drive-strength = <2>;
+ bias-disable;
};
};
-};
-&spmi_bus {
- pm8941@0 {
- charger@1000 {
- qcom,fast-charge-safe-current = <1500000>;
- qcom,fast-charge-current-limit = <1500000>;
- qcom,dc-current-limit = <1800000>;
- qcom,fast-charge-safe-voltage = <4400000>;
- qcom,fast-charge-high-threshold-voltage = <4350000>;
- qcom,fast-charge-low-threshold-voltage = <3400000>;
- qcom,auto-recharge-threshold-voltage = <4200000>;
- qcom,minimum-input-voltage = <4300000>;
+ lcd_backlight_en_pin_a: lcd-backlight-vddio {
+ pins = "gpio69";
+ drive-strength = <10>;
+ output-low;
+ bias-disable;
+ };
+
+ sdhc1_pin_a: sdhc1-pin-active {
+ clk {
+ pins = "sdc1_clk";
+ drive-strength = <16>;
+ bias-disable;
};
- gpios@c000 {
- gpio_keys_pin_a: gpio-keys-active {
- pins = "gpio2", "gpio5";
- function = "normal";
+ cmd-data {
+ pins = "sdc1_cmd", "sdc1_data";
+ drive-strength = <10>;
+ bias-pull-up;
+ };
+ };
- bias-pull-up;
- power-source = <PM8941_GPIO_S3>;
- };
+ sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+ pins = "gpio62";
+ function = "gpio";
- bt_reg_on_pin: bt-reg-on {
- pins = "gpio16";
- function = "normal";
+ drive-strength = <2>;
+ bias-disable;
+ };
- output-low;
- power-source = <PM8941_GPIO_S3>;
- };
+ sdhc2_pin_a: sdhc2-pin-active {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <6>;
+ bias-disable;
+ };
- wlan_sleep_clk_pin: wl-sleep-clk {
- pins = "gpio17";
- function = "func2";
+ cmd-data {
+ pins = "sdc2_cmd", "sdc2_data";
+ drive-strength = <6>;
+ bias-pull-up;
+ };
+ };
- output-high;
- power-source = <PM8941_GPIO_S3>;
- };
+ sdhc3_pin_a: sdhc3-pin-active {
+ clk {
+ pins = "gpio40";
+ function = "sdc3";
- wlan_regulator_pin: wl-reg-active {
- pins = "gpio18";
- function = "normal";
+ drive-strength = <10>;
+ bias-disable;
+ };
- bias-disable;
- power-source = <PM8941_GPIO_S3>;
- };
+ cmd {
+ pins = "gpio39";
+ function = "sdc3";
- lcd_dcdc_en_pin_a: lcd-dcdc-en-active {
- pins = "gpio20";
- function = "normal";
+ drive-strength = <10>;
+ bias-pull-up;
+ };
- bias-disable;
- power-source = <PM8941_GPIO_S3>;
- input-disable;
- output-low;
- };
+ data {
+ pins = "gpio35", "gpio36", "gpio37", "gpio38";
+ function = "sdc3";
+ drive-strength = <10>;
+ bias-pull-up;
};
+ };
- coincell@2800 {
- status = "okay";
- qcom,rset-ohms = <2100>;
- qcom,vset-millivolts = <3000>;
+ ts_int_pin: synaptics {
+ pin {
+ pins = "gpio86";
+ function = "gpio";
+
+ drive-strength = <2>;
+ bias-disable;
+ input-enable;
};
};
+
+ bt_host_wake_pin: bt-host-wake {
+ pins = "gpio95";
+ function = "gpio";
+
+ drive-strength = <2>;
+ bias-disable;
+ output-low;
+ };
+
+ bt_dev_wake_pin: bt-dev-wake {
+ pins = "gpio96";
+ function = "gpio";
+
+ drive-strength = <2>;
+ bias-disable;
+ };
};
diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts
index a62e5c25b23c..ea6a941d8f8c 100644
--- a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts
+++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-honami.dts
@@ -1,484 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
-#include "qcom-msm8974.dtsi"
-#include "qcom-pm8841.dtsi"
-#include "qcom-pm8941.dtsi"
-#include <dt-bindings/gpio/gpio.h>
-#include <dt-bindings/input/input.h>
-#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
+#include "qcom-msm8974-sony-xperia-rhine.dtsi"
/ {
model = "Sony Xperia Z1";
compatible = "sony,xperia-honami", "qcom,msm8974";
-
- aliases {
- serial0 = &blsp1_uart2;
- };
-
- chosen {
- stdout-path = "serial0:115200n8";
- };
-
- gpio-keys {
- compatible = "gpio-keys";
-
- pinctrl-names = "default";
- pinctrl-0 = <&gpio_keys_pin_a>;
-
- volume-down {
- label = "volume_down";
- gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEDOWN>;
- };
-
- camera-snapshot {
- label = "camera_snapshot";
- gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_CAMERA>;
- };
-
- camera-focus {
- label = "camera_focus";
- gpios = <&pm8941_gpios 4 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_CAMERA_FOCUS>;
- };
-
- volume-up {
- label = "volume_up";
- gpios = <&pm8941_gpios 5 GPIO_ACTIVE_LOW>;
- linux,input-type = <1>;
- linux,code = <KEY_VOLUMEUP>;
- };
- };
-
- memory@0 {
- reg = <0 0x40000000>, <0x40000000 0x40000000>;
- device_type = "memory";
- };
-
- smd {
- rpm {
- rpm_requests {
- pm8841-regulators {
- s1 {
- regulator-min-microvolt = <675000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s2 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s3 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
-
- s4 {
- regulator-min-microvolt = <500000>;
- regulator-max-microvolt = <1050000>;
- };
- };
-
- pm8941-regulators {
- vdd_l1_l3-supply = <&pm8941_s1>;
- vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
- vdd_l4_l11-supply = <&pm8941_s1>;
- vdd_l5_l7-supply = <&pm8941_s2>;
- vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
- vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
- vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
- vdd_l21-supply = <&vreg_boost>;
-
- s1 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1300000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- s2 {
- regulator-min-microvolt = <2150000>;
- regulator-max-microvolt = <2150000>;
- regulator-boot-on;
- };
-
- s3 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- regulator-always-on;
- regulator-boot-on;
- };
-
- s4 {
- regulator-min-microvolt = <5000000>;
- regulator-max-microvolt = <5000000>;
- };
-
- l1 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l2 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l3 {
- regulator-min-microvolt = <1200000>;
- regulator-max-microvolt = <1200000>;
- };
-
- l4 {
- regulator-min-microvolt = <1225000>;
- regulator-max-microvolt = <1225000>;
- };
-
- l5 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l6 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l7 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-boot-on;
- };
-
- l8 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l9 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
- };
-
- l11 {
- regulator-min-microvolt = <1300000>;
- regulator-max-microvolt = <1350000>;
- };
-
- l12 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
-
- regulator-always-on;
- regulator-boot-on;
- };
-
- l13 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l14 {
- regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
- };
-
- l15 {
- regulator-min-microvolt = <2050000>;
- regulator-max-microvolt = <2050000>;
- };
-
- l16 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l17 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
- };
-
- l18 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
- };
-
- l19 {
- regulator-min-microvolt = <3300000>;
- regulator-max-microvolt = <3300000>;
- };
-
- l20 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-allow-set-load;
- regulator-boot-on;
- regulator-system-load = <200000>;
- };
-
- l21 {
- regulator-min-microvolt = <2950000>;
- regulator-max-microvolt = <2950000>;
-
- regulator-boot-on;
- };
-
- l22 {
- regulator-min-microvolt = <3000000>;
- regulator-max-microvolt = <3000000>;
- };
-
- l23 {
- regulator-min-microvolt = <2800000>;
- regulator-max-microvolt = <2800000>;
- };
-
- l24 {
- regulator-min-microvolt = <3075000>;
- regulator-max-microvolt = <3075000>;
-
- regulator-boot-on;
- };
- };
- };
- };
- };
-};
-
-&soc {
- usb@f9a55000 {
- status = "okay";
-
- phys = <&usb_hs1_phy>;
- phy-select = <&tcsr 0xb000 0>;
- extcon = <&smbb>, <&usb_id>;
- vbus-supply = <&chg_otg>;
-
- hnp-disable;
- srp-disable;
- adp-disable;
-
- ulpi {
- phy@a {
- status = "okay";
-
- v1p8-supply = <&pm8941_l6>;
- v3p3-supply = <&pm8941_l24>;
-
- extcon = <&smbb>;
- qcom,init-seq = /bits/ 8 <0x1 0x64>;
- };
- };
- };
-
- sdhci@f9824900 {
- status = "okay";
-
- vmmc-supply = <&pm8941_l20>;
- vqmmc-supply = <&pm8941_s3>;
-
- bus-width = <8>;
- non-removable;
-
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc1_pin_a>;
- };
-
- sdhci@f98a4900 {
- status = "okay";
-
- bus-width = <4>;
-
- vmmc-supply = <&pm8941_l21>;
- vqmmc-supply = <&pm8941_l13>;
-
- cd-gpios = <&msmgpio 62 GPIO_ACTIVE_LOW>;
-
- pinctrl-names = "default";
- pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
- };
-
- serial@f991e000 {
- status = "okay";
-
- pinctrl-names = "default";
- pinctrl-0 = <&blsp1_uart2_pin_a>;
- };
-
- i2c@f9924000 {
- status = "okay";
-
- clock-frequency = <355000>;
- qcom,src-freq = <50000000>;
-
- pinctrl-names = "default";
- pinctrl-0 = <&i2c2_pins>;
-
- synaptics@2c {
- compatible = "syna,rmi4-i2c";
- reg = <0x2c>;
-
- interrupts-extended = <&msmgpio 61 IRQ_TYPE_EDGE_FALLING>;
-
- #address-cells = <1>;
- #size-cells = <0>;
-
- vdd-supply = <&pm8941_l22>;
- vio-supply = <&pm8941_lvs3>;
-
- pinctrl-names = "default";
- pinctrl-0 = <&ts_int_pin>;
-
- syna,startup-delay-ms = <10>;
-
- rmi4-f01@1 {
- reg = <0x1>;
- syna,nosleep-mode = <1>;
- };
-
- rmi4-f11@11 {
- reg = <0x11>;
- touchscreen-inverted-x;
- syna,sensor-type = <1>;
- };
- };
- };
-
- pinctrl@fd510000 {
- blsp1_uart2_pin_a: blsp1-uart2-pin-active {
- rx {
- pins = "gpio5";
- function = "blsp_uart2";
-
- drive-strength = <2>;
- bias-pull-up;
- };
-
- tx {
- pins = "gpio4";
- function = "blsp_uart2";
-
- drive-strength = <4>;
- bias-disable;
- };
- };
-
- i2c2_pins: i2c2 {
- mux {
- pins = "gpio6", "gpio7";
- function = "blsp_i2c2";
-
- drive-strength = <2>;
- bias-disable;
- };
- };
-
- sdhc1_pin_a: sdhc1-pin-active {
- clk {
- pins = "sdc1_clk";
- drive-strength = <16>;
- bias-disable;
- };
-
- cmd-data {
- pins = "sdc1_cmd", "sdc1_data";
- drive-strength = <10>;
- bias-pull-up;
- };
- };
-
- sdhc2_cd_pin_a: sdhc2-cd-pin-active {
- pins = "gpio62";
- function = "gpio";
-
- drive-strength = <2>;
- bias-disable;
- };
-
- sdhc2_pin_a: sdhc2-pin-active {
- clk {
- pins = "sdc2_clk";
- drive-strength = <10>;
- bias-disable;
- };
-
- cmd-data {
- pins = "sdc2_cmd", "sdc2_data";
- drive-strength = <6>;
- bias-pull-up;
- };
- };
-
- ts_int_pin: touch-int {
- pin {
- pins = "gpio61";
- function = "gpio";
-
- drive-strength = <2>;
- bias-disable;
- input-enable;
- };
- };
- };
-
- dma-controller@f9944000 {
- qcom,controlled-remotely;
- };
-};
-
-&spmi_bus {
- pm8941@0 {
- charger@1000 {
- qcom,fast-charge-safe-current = <1500000>;
- qcom,fast-charge-current-limit = <1500000>;
- qcom,dc-current-limit = <1800000>;
- qcom,fast-charge-safe-voltage = <4400000>;
- qcom,fast-charge-high-threshold-voltage = <4350000>;
- qcom,fast-charge-low-threshold-voltage = <3400000>;
- qcom,auto-recharge-threshold-voltage = <4200000>;
- qcom,minimum-input-voltage = <4300000>;
- };
-
- gpios@c000 {
- gpio_keys_pin_a: gpio-keys-active {
- pins = "gpio2", "gpio3", "gpio4", "gpio5";
- function = "normal";
-
- bias-pull-up;
- power-source = <PM8941_GPIO_S3>;
- };
- };
-
- coincell@2800 {
- status = "okay";
- qcom,rset-ohms = <2100>;
- qcom,vset-millivolts = <3000>;
- };
- };
-
- pm8941@1 {
- wled@d800 {
- status = "okay";
-
- qcom,cs-out;
- qcom,current-limit = <20>;
- qcom,current-boost-limit = <805>;
- qcom,switching-freq = <1600>;
- qcom,ovp = <29>;
- qcom,num-strings = <2>;
- };
- };
};
diff --git a/arch/arm/boot/dts/qcom-msm8974-sony-xperia-rhine.dtsi b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-rhine.dtsi
new file mode 100644
index 000000000000..870e0aeb4d05
--- /dev/null
+++ b/arch/arm/boot/dts/qcom-msm8974-sony-xperia-rhine.dtsi
@@ -0,0 +1,455 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "qcom-msm8974.dtsi"
+#include "qcom-pm8841.dtsi"
+#include "qcom-pm8941.dtsi"
+#include <dt-bindings/gpio/gpio.h>
+#include <dt-bindings/input/input.h>
+#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
+
+/ {
+ aliases {
+ serial0 = &blsp1_uart2;
+ };
+
+ chosen {
+ stdout-path = "serial0:115200n8";
+ };
+
+ gpio-keys {
+ compatible = "gpio-keys";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&gpio_keys_pin_a>;
+
+ volume-down {
+ label = "volume_down";
+ gpios = <&pm8941_gpios 2 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
+ camera-snapshot {
+ label = "camera_snapshot";
+ gpios = <&pm8941_gpios 3 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_CAMERA>;
+ };
+
+ camera-focus {
+ label = "camera_focus";
+ gpios = <&pm8941_gpios 4 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_CAMERA_FOCUS>;
+ };
+
+ volume-up {
+ label = "volume_up";
+ gpios = <&pm8941_gpios 5 GPIO_ACTIVE_LOW>;
+ linux,input-type = <1>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+};
+
+&blsp1_i2c2 {
+ status = "okay";
+ clock-frequency = <355000>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&i2c2_pins>;
+
+ synaptics@2c {
+ compatible = "syna,rmi4-i2c";
+ reg = <0x2c>;
+
+ interrupts-extended = <&tlmm 61 IRQ_TYPE_EDGE_FALLING>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ vdd-supply = <&pm8941_l22>;
+ vio-supply = <&pm8941_lvs3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&ts_int_pin>;
+
+ syna,startup-delay-ms = <10>;
+
+ rmi4-f01@1 {
+ reg = <0x1>;
+ syna,nosleep-mode = <1>;
+ };
+
+ rmi4-f11@11 {
+ reg = <0x11>;
+ touchscreen-inverted-x;
+ syna,sensor-type = <1>;
+ };
+ };
+};
+
+&blsp1_uart2 {
+ status = "okay";
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&blsp1_uart2_pin_a>;
+};
+
+&blsp2_dma {
+ qcom,controlled-remotely;
+};
+
+&otg {
+ status = "okay";
+
+ phys = <&usb_hs1_phy>;
+ phy-select = <&tcsr 0xb000 0>;
+ extcon = <&smbb>, <&usb_id>;
+ vbus-supply = <&chg_otg>;
+
+ hnp-disable;
+ srp-disable;
+ adp-disable;
+
+ ulpi {
+ phy@a {
+ status = "okay";
+
+ v1p8-supply = <&pm8941_l6>;
+ v3p3-supply = <&pm8941_l24>;
+
+ extcon = <&smbb>;
+ qcom,init-seq = /bits/ 8 <0x1 0x64>;
+ };
+ };
+};
+
+&pm8941_coincell {
+ status = "okay";
+ qcom,rset-ohms = <2100>;
+ qcom,vset-millivolts = <3000>;
+};
+
+&pm8941_gpios {
+ gpio_keys_pin_a: gpio-keys-active {
+ pins = "gpio2", "gpio3", "gpio4", "gpio5";
+ function = "normal";
+
+ bias-pull-up;
+ power-source = <PM8941_GPIO_S3>;
+ };
+};
+
+&pm8941_wled {
+ status = "okay";
+
+ qcom,cs-out;
+ qcom,current-limit = <20>;
+ qcom,current-boost-limit = <805>;
+ qcom,switching-freq = <1600>;
+ qcom,ovp = <29>;
+ qcom,num-strings = <2>;
+};
+
+&rpm_requests {
+ pm8841-regulators {
+ compatible = "qcom,rpm-pm8841-regulators";
+
+ pm8841_s1: s1 {
+ regulator-min-microvolt = <675000>;
+ regulator-max-microvolt = <1050000>;
+ };
+
+ pm8841_s2: s2 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
+
+ pm8841_s3: s3 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
+
+ pm8841_s4: s4 {
+ regulator-min-microvolt = <500000>;
+ regulator-max-microvolt = <1050000>;
+ };
+ };
+
+ pm8941-regulators {
+ compatible = "qcom,rpm-pm8941-regulators";
+
+ vdd_l1_l3-supply = <&pm8941_s1>;
+ vdd_l2_lvs1_2_3-supply = <&pm8941_s3>;
+ vdd_l4_l11-supply = <&pm8941_s1>;
+ vdd_l5_l7-supply = <&pm8941_s2>;
+ vdd_l6_l12_l14_l15-supply = <&pm8941_s2>;
+ vdd_l9_l10_l17_l22-supply = <&vreg_boost>;
+ vdd_l13_l20_l23_l24-supply = <&vreg_boost>;
+ vdd_l21-supply = <&vreg_boost>;
+
+ pm8941_s1: s1 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_s2: s2 {
+ regulator-min-microvolt = <2150000>;
+ regulator-max-microvolt = <2150000>;
+ regulator-boot-on;
+ };
+
+ pm8941_s3: s3 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_s4: s4 {
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <5000000>;
+ };
+
+ pm8941_l1: l1 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_l2: l2 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
+
+ pm8941_l3: l3 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ };
+
+ pm8941_l4: l4 {
+ regulator-min-microvolt = <1225000>;
+ regulator-max-microvolt = <1225000>;
+ };
+
+ pm8941_l5: l5 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l6: l6 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l7: l7 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l8: l8 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l9: l9 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ };
+
+ pm8941_l11: l11 {
+ regulator-min-microvolt = <1300000>;
+ regulator-max-microvolt = <1350000>;
+ };
+
+ pm8941_l12: l12 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ pm8941_l13: l13 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l14: l14 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ };
+
+ pm8941_l15: l15 {
+ regulator-min-microvolt = <2050000>;
+ regulator-max-microvolt = <2050000>;
+ };
+
+ pm8941_l16: l16 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
+
+ pm8941_l17: l17 {
+ regulator-min-microvolt = <2700000>;
+ regulator-max-microvolt = <2700000>;
+ };
+
+ pm8941_l18: l18 {
+ regulator-min-microvolt = <2850000>;
+ regulator-max-microvolt = <2850000>;
+ };
+
+ pm8941_l19: l19 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ };
+
+ pm8941_l20: l20 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-system-load = <200000>;
+ regulator-allow-set-load;
+ regulator-boot-on;
+ };
+
+ pm8941_l21: l21 {
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+ regulator-boot-on;
+ };
+
+ pm8941_l22: l22 {
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ };
+
+ pm8941_l23: l23 {
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
+ };
+
+ pm8941_l24: l24 {
+ regulator-min-microvolt = <3075000>;
+ regulator-max-microvolt = <3075000>;
+ regulator-boot-on;
+ };
+
+ pm8941_lvs3: lvs3 {};
+ };
+};
+
+&sdhc_1 {
+ status = "okay";
+
+ vmmc-supply = <&pm8941_l20>;
+ vqmmc-supply = <&pm8941_s3>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc1_pin_a>;
+};
+
+&sdhc_2 {
+ status = "okay";
+
+ vmmc-supply = <&pm8941_l21>;
+ vqmmc-supply = <&pm8941_l13>;
+
+ cd-gpios = <&tlmm 62 GPIO_ACTIVE_LOW>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&sdhc2_pin_a>, <&sdhc2_cd_pin_a>;
+};
+
+&smbb {
+ qcom,fast-charge-safe-current = <1500000>;
+ qcom,fast-charge-current-limit = <1500000>;
+ qcom,dc-current-limit = <1800000>;
+ qcom,fast-charge-safe-voltage = <4400000>;
+ qcom,fast-charge-high-threshold-voltage = <4350000>;
+ qcom,fast-charge-low-threshold-voltage = <3400000>;
+ qcom,auto-recharge-threshold-voltage = <4200000>;
+ qcom,minimum-input-voltage = <4300000>;
+};
+
+&tlmm {
+ ts_int_pin: touch-int {
+ pin {
+ pins = "gpio61";
+ function = "gpio";
+
+ drive-strength = <2>;
+ bias-disable;
+ input-enable;
+ };
+ };
+
+ blsp1_uart2_pin_a: blsp1-uart2-pin-active {
+ rx {
+ pins = "gpio5";
+ function = "blsp_uart2";
+
+ drive-strength = <2>;
+ bias-pull-up;
+ };
+
+ tx {
+ pins = "gpio4";
+ function = "blsp_uart2";
+
+ drive-strength = <4>;
+ bias-disable;
+ };
+ };
+
+ i2c2_pins: i2c2 {
+ mux {
+ pins = "gpio6", "gpio7";
+ function = "blsp_i2c2";
+
+ drive-strength = <2>;
+ bias-disable;
+ };
+ };
+
+ sdhc1_pin_a: sdhc1-pin-active {
+ clk {
+ pins = "sdc1_clk";
+ drive-strength = <16>;
+ bias-disable;
+ };
+
+ cmd-data {
+ pins = "sdc1_cmd", "sdc1_data";
+ drive-strength = <10>;
+ bias-pull-up;
+ };
+ };
+
+ sdhc2_cd_pin_a: sdhc2-cd-pin-active {
+ pins = "gpio62";
+ function = "gpio";
+
+ drive-strength = <2>;
+ bias-disable;
+ };
+
+ sdhc2_pin_a: sdhc2-pin-active {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <10>;
+ bias-disable;
+ };
+
+ cmd-data {
+ pins = "sdc2_cmd", "sdc2_data";
+ drive-strength = <6>;
+ bias-pull-up;
+ };
+ };
+};
diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 412d94736c35..05a36566bd52 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -16,57 +16,17 @@ / {
compatible = "qcom,msm8974";
interrupt-parent = <&intc>;
- reserved-memory {
- #address-cells = <1>;
- #size-cells = <1>;
- ranges;
-
- mpss_region: mpss@8000000 {
- reg = <0x08000000 0x5100000>;
- no-map;
- };
-
- mba_region: mba@d100000 {
- reg = <0x0d100000 0x100000>;
- no-map;
- };
-
- wcnss_region: wcnss@d200000 {
- reg = <0x0d200000 0xa00000>;
- no-map;
- };
-
- adsp_region: adsp@dc00000 {
- reg = <0x0dc00000 0x1900000>;
- no-map;
- };
-
- venus@f500000 {
- reg = <0x0f500000 0x500000>;
- no-map;
- };
-
- smem_region: smem@fa00000 {
- reg = <0xfa00000 0x200000>;
- no-map;
- };
-
- tz@fc00000 {
- reg = <0x0fc00000 0x160000>;
- no-map;
- };
-
- rfsa@fd60000 {
- reg = <0x0fd60000 0x20000>;
- no-map;
+ clocks {
+ xo_board: xo_board {
+ compatible = "fixed-clock";
+ #clock-cells = <0>;
+ clock-frequency = <19200000>;
};
- rmtfs@fd80000 {
- compatible = "qcom,rmtfs-mem";
- reg = <0x0fd80000 0x180000>;
- no-map;
-
- qcom,client-id = <1>;
+ sleep_clk: sleep_clk {
+ compatible = "fixed-clock";
+ #clock-cells = <0>;
+ clock-frequency = <32768>;
};
};
@@ -136,283 +96,120 @@ CPU_SPC: spc {
};
};
+ firmware {
+ scm {
+ compatible = "qcom,scm";
+ clocks = <&gcc GCC_CE1_CLK>, <&gcc GCC_CE1_AXI_CLK>, <&gcc GCC_CE1_AHB_CLK>;
+ clock-names = "core", "bus", "iface";
+ };
+ };
+
memory {
device_type = "memory";
reg = <0x0 0x0>;
};
- thermal-zones {
- cpu0-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ pmu {
+ compatible = "qcom,krait-pmu";
+ interrupts = <GIC_PPI 7 0xf04>;
+ };
- thermal-sensors = <&tsens 5>;
+ reserved-memory {
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges;
- trips {
- cpu_alert0: trip0 {
- temperature = <75000>;
- hysteresis = <2000>;
- type = "passive";
- };
- cpu_crit0: trip1 {
- temperature = <110000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
+ mpss_region: mpss@8000000 {
+ reg = <0x08000000 0x5100000>;
+ no-map;
};
- cpu1-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ mba_region: mba@d100000 {
+ reg = <0x0d100000 0x100000>;
+ no-map;
+ };
- thermal-sensors = <&tsens 6>;
+ wcnss_region: wcnss@d200000 {
+ reg = <0x0d200000 0xa00000>;
+ no-map;
+ };
- trips {
- cpu_alert1: trip0 {
- temperature = <75000>;
- hysteresis = <2000>;
- type = "passive";
- };
- cpu_crit1: trip1 {
- temperature = <110000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
+ adsp_region: adsp@dc00000 {
+ reg = <0x0dc00000 0x1900000>;
+ no-map;
};
- cpu2-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ venus_region: memory@f500000 {
+ reg = <0x0f500000 0x500000>;
+ no-map;
+ };
- thermal-sensors = <&tsens 7>;
+ smem_region: smem@fa00000 {
+ reg = <0xfa00000 0x200000>;
+ no-map;
+ };
- trips {
- cpu_alert2: trip0 {
- temperature = <75000>;
- hysteresis = <2000>;
- type = "passive";
- };
- cpu_crit2: trip1 {
- temperature = <110000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
+ tz_region: memory@fc00000 {
+ reg = <0x0fc00000 0x160000>;
+ no-map;
};
- cpu3-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ rfsa_mem: memory@fd60000 {
+ reg = <0x0fd60000 0x20000>;
+ no-map;
+ };
- thermal-sensors = <&tsens 8>;
+ rmtfs@fd80000 {
+ compatible = "qcom,rmtfs-mem";
+ reg = <0x0fd80000 0x180000>;
+ no-map;
- trips {
- cpu_alert3: trip0 {
- temperature = <75000>;
- hysteresis = <2000>;
- type = "passive";
- };
- cpu_crit3: trip1 {
- temperature = <110000>;
- hysteresis = <2000>;
- type = "critical";
- };
- };
+ qcom,client-id = <1>;
};
+ };
- q6-dsp-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
-
- thermal-sensors = <&tsens 1>;
+ smem {
+ compatible = "qcom,smem";
- trips {
- q6_dsp_alert0: trip-point0 {
- temperature = <90000>;
- hysteresis = <2000>;
- type = "hot";
- };
- };
- };
+ memory-region = <&smem_region>;
+ qcom,rpm-msg-ram = <&rpm_msg_ram>;
- modemtx-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ hwlocks = <&tcsr_mutex 3>;
+ };
- thermal-sensors = <&tsens 2>;
+ smp2p-adsp {
+ compatible = "qcom,smp2p";
+ qcom,smem = <443>, <429>;
- trips {
- modemtx_alert0: trip-point0 {
- temperature = <90000>;
- hysteresis = <2000>;
- type = "hot";
- };
- };
- };
+ interrupt-parent = <&intc>;
+ interrupts = <GIC_SPI 158 IRQ_TYPE_EDGE_RISING>;
- video-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ qcom,ipc = <&apcs 8 10>;
- thermal-sensors = <&tsens 3>;
+ qcom,local-pid = <0>;
+ qcom,remote-pid = <2>;
- trips {
- video_alert0: trip-point0 {
- temperature = <95000>;
- hysteresis = <2000>;
- type = "hot";
- };
- };
+ adsp_smp2p_out: master-kernel {
+ qcom,entry-name = "master-kernel";
+ #qcom,smem-state-cells = <1>;
};
- wlan-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
-
- thermal-sensors = <&tsens 4>;
+ adsp_smp2p_in: slave-kernel {
+ qcom,entry-name = "slave-kernel";
- trips {
- wlan_alert0: trip-point0 {
- temperature = <105000>;
- hysteresis = <2000>;
- type = "hot";
- };
- };
+ interrupt-controller;
+ #interrupt-cells = <2>;
};
+ };
- gpu-top-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
+ smp2p-modem {
+ compatible = "qcom,smp2p";
+ qcom,smem = <435>, <428>;
- thermal-sensors = <&tsens 9>;
+ interrupt-parent = <&intc>;
+ interrupts = <GIC_SPI 27 IRQ_TYPE_EDGE_RISING>;
- trips {
- gpu1_alert0: trip-point0 {
- temperature = <90000>;
- hysteresis = <2000>;
- type = "hot";
- };
- };
- };
-
- gpu-bottom-thermal {
- polling-delay-passive = <250>;
- polling-delay = <1000>;
-
- thermal-sensors = <&tsens 10>;
-
- trips {
- gpu2_alert0: trip-point0 {
- temperature = <90000>;
- hysteresis = <2000>;
- type = "hot";
- };
- };
- };
- };
-
- cpu-pmu {
- compatible = "qcom,krait-pmu";
- interrupts = <GIC_PPI 7 0xf04>;
- };
-
- clocks {
- xo_board: xo_board {
- compatible = "fixed-clock";
- #clock-cells = <0>;
- clock-frequency = <19200000>;
- };
-
- sleep_clk: sleep_clk {
- compatible = "fixed-clock";
- #clock-cells = <0>;
- clock-frequency = <32768>;
- };
- };
-
- timer {
- compatible = "arm,armv7-timer";
- interrupts = <GIC_PPI 2 0xf08>,
- <GIC_PPI 3 0xf08>,
- <GIC_PPI 4 0xf08>,
- <GIC_PPI 1 0xf08>;
- clock-frequency = <19200000>;
- };
-
- adsp-pil {
- compatible = "qcom,msm8974-adsp-pil";
-
- interrupts-extended = <&intc GIC_SPI 162 IRQ_TYPE_EDGE_RISING>,
- <&adsp_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
- <&adsp_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
- <&adsp_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
- <&adsp_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
- interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
-
- cx-supply = <&pm8841_s2>;
-
- clocks = <&xo_board>;
- clock-names = "xo";
-
- memory-region = <&adsp_region>;
-
- qcom,smem-states = <&adsp_smp2p_out 0>;
- qcom,smem-state-names = "stop";
-
- smd-edge {
- interrupts = <GIC_SPI 156 IRQ_TYPE_EDGE_RISING>;
-
- qcom,ipc = <&apcs 8 8>;
- qcom,smd-edge = <1>;
-
- label = "lpass";
- };
- };
-
- smem {
- compatible = "qcom,smem";
-
- memory-region = <&smem_region>;
- qcom,rpm-msg-ram = <&rpm_msg_ram>;
-
- hwlocks = <&tcsr_mutex 3>;
- };
-
- smp2p-adsp {
- compatible = "qcom,smp2p";
- qcom,smem = <443>, <429>;
-
- interrupt-parent = <&intc>;
- interrupts = <GIC_SPI 158 IRQ_TYPE_EDGE_RISING>;
-
- qcom,ipc = <&apcs 8 10>;
-
- qcom,local-pid = <0>;
- qcom,remote-pid = <2>;
-
- adsp_smp2p_out: master-kernel {
- qcom,entry-name = "master-kernel";
- #qcom,smem-state-cells = <1>;
- };
-
- adsp_smp2p_in: slave-kernel {
- qcom,entry-name = "slave-kernel";
-
- interrupt-controller;
- #interrupt-cells = <2>;
- };
- };
-
- smp2p-modem {
- compatible = "qcom,smp2p";
- qcom,smem = <435>, <428>;
-
- interrupt-parent = <&intc>;
- interrupts = <GIC_SPI 27 IRQ_TYPE_EDGE_RISING>;
-
- qcom,ipc = <&apcs 8 14>;
+ qcom,ipc = <&apcs 8 14>;
qcom,local-pid = <0>;
qcom,remote-pid = <1>;
@@ -497,11 +294,23 @@ wcnss_smsm: wcnss@7 {
};
};
- firmware {
- scm {
- compatible = "qcom,scm";
- clocks = <&gcc GCC_CE1_CLK>, <&gcc GCC_CE1_AXI_CLK>, <&gcc GCC_CE1_AHB_CLK>;
- clock-names = "core", "bus", "iface";
+ smd {
+ compatible = "qcom,smd";
+
+ rpm {
+ interrupts = <GIC_SPI 168 IRQ_TYPE_EDGE_RISING>;
+ qcom,ipc = <&apcs 8 0>;
+ qcom,smd-edge = <15>;
+
+ rpm_requests: rpm_requests {
+ compatible = "qcom,rpm-msm8974";
+ qcom,smd-channels = "rpm_requests";
+
+ rpmcc: clock-controller {
+ compatible = "qcom,rpmcc-msm8974", "qcom,rpmcc";
+ #clock-cells = <1>;
+ };
+ };
};
};
@@ -524,31 +333,6 @@ apcs: syscon@f9011000 {
reg = <0xf9011000 0x1000>;
};
- qfprom: qfprom@fc4bc000 {
- #address-cells = <1>;
- #size-cells = <1>;
- compatible = "qcom,qfprom";
- reg = <0xfc4bc000 0x1000>;
- tsens_calib: calib@d0 {
- reg = <0xd0 0x18>;
- };
- tsens_backup: backup@440 {
- reg = <0x440 0x10>;
- };
- };
-
- tsens: thermal-sensor@fc4a9000 {
- compatible = "qcom,msm8974-tsens";
- reg = <0xfc4a9000 0x1000>, /* TM */
- <0xfc4a8000 0x1000>; /* SROT */
- nvmem-cells = <&tsens_calib>, <&tsens_backup>;
- nvmem-cell-names = "calib", "calib_backup";
- #qcom,sensors = <11>;
- interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "uplow";
- #thermal-sensor-cells = <1>;
- };
-
timer@f9020000 {
#address-cells = <1>;
#size-cells = <1>;
@@ -654,47 +438,53 @@ acc3: clock-controller@f90b8000 {
reg = <0xf90b8000 0x1000>, <0xf9008000 0x1000>;
};
- restart@fc4ab000 {
- compatible = "qcom,pshold";
- reg = <0xfc4ab000 0x4>;
- };
-
- gcc: clock-controller@fc400000 {
- compatible = "qcom,gcc-msm8974";
- #clock-cells = <1>;
- #reset-cells = <1>;
- #power-domain-cells = <1>;
- reg = <0xfc400000 0x4000>;
- };
+ sdhc_1: sdhci@f9824900 {
+ compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0xf9824900 0x11c>, <0xf9824000 0x800>;
+ reg-names = "hc_mem", "core_mem";
+ interrupts = <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 138 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "hc_irq", "pwr_irq";
+ clocks = <&gcc GCC_SDCC1_APPS_CLK>,
+ <&gcc GCC_SDCC1_AHB_CLK>,
+ <&xo_board>;
+ clock-names = "core", "iface", "xo";
+ bus-width = <8>;
+ non-removable;
- tcsr: syscon@fd4a0000 {
- compatible = "syscon";
- reg = <0xfd4a0000 0x10000>;
+ status = "disabled";
};
- tcsr_mutex_block: syscon@fd484000 {
- compatible = "syscon";
- reg = <0xfd484000 0x2000>;
- };
+ sdhc_3: sdhci@f9864900 {
+ compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0xf9864900 0x11c>, <0xf9864000 0x800>;
+ reg-names = "hc_mem", "core_mem";
+ interrupts = <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "hc_irq", "pwr_irq";
+ clocks = <&gcc GCC_SDCC3_APPS_CLK>,
+ <&gcc GCC_SDCC3_AHB_CLK>,
+ <&xo_board>;
+ clock-names = "core", "iface", "xo";
+ bus-width = <4>;
- mmcc: clock-controller@fd8c0000 {
- compatible = "qcom,mmcc-msm8974";
- #clock-cells = <1>;
- #reset-cells = <1>;
- #power-domain-cells = <1>;
- reg = <0xfd8c0000 0x6000>;
+ status = "disabled";
};
- tcsr_mutex: tcsr-mutex {
- compatible = "qcom,tcsr-mutex";
- syscon = <&tcsr_mutex_block 0 0x80>;
-
- #hwlock-cells = <1>;
- };
+ sdhc_2: sdhci@f98a4900 {
+ compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0xf98a4900 0x11c>, <0xf98a4000 0x800>;
+ reg-names = "hc_mem", "core_mem";
+ interrupts = <GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 221 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "hc_irq", "pwr_irq";
+ clocks = <&gcc GCC_SDCC2_APPS_CLK>,
+ <&gcc GCC_SDCC2_AHB_CLK>,
+ <&xo_board>;
+ clock-names = "core", "iface", "xo";
+ bus-width = <4>;
- rpm_msg_ram: memory@fc428000 {
- compatible = "qcom,rpm-msg-ram";
- reg = <0xfc428000 0x4000>;
+ status = "disabled";
};
blsp1_uart1: serial@f991d000 {
@@ -715,16 +505,70 @@ blsp1_uart2: serial@f991e000 {
status = "disabled";
};
- blsp2_uart7: serial@f995d000 {
+ blsp1_i2c1: i2c@f9923000 {
+ status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9923000 0x1000>;
+ interrupts = <0 95 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP1_QUP1_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ };
+
+ blsp1_i2c2: i2c@f9924000 {
+ status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9924000 0x1000>;
+ interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP1_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ };
+
+ blsp1_i2c3: i2c@f9925000 {
+ status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9925000 0x1000>;
+ interrupts = <0 97 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP1_QUP3_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ };
+
+ blsp1_i2c6: i2c@f9928000 {
+ status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9928000 0x1000>;
+ interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP1_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ };
+
+ blsp2_dma: dma-controller@f9944000 {
+ compatible = "qcom,bam-v1.4.0";
+ reg = <0xf9944000 0x19000>;
+ interrupts = <GIC_SPI 239 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP2_AHB_CLK>;
+ clock-names = "bam_clk";
+ #dma-cells = <1>;
+ qcom,ee = <0>;
+ };
+
+ blsp2_uart1: serial@f995d000 {
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
reg = <0xf995d000 0x1000>;
- interrupts = <GIC_SPI 113 IRQ_TYPE_NONE>;
+ interrupts = <GIC_SPI 113 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&gcc GCC_BLSP2_UART1_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
clock-names = "core", "iface";
status = "disabled";
};
- blsp2_uart8: serial@f995e000 {
+ blsp2_uart2: serial@f995e000 {
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
reg = <0xf995e000 0x1000>;
interrupts = <GIC_SPI 114 IRQ_TYPE_LEVEL_HIGH>;
@@ -733,7 +577,7 @@ blsp2_uart8: serial@f995e000 {
status = "disabled";
};
- blsp2_uart10: serial@f9960000 {
+ blsp2_uart4: serial@f9960000 {
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
reg = <0xf9960000 0x1000>;
interrupts = <GIC_SPI 116 IRQ_TYPE_LEVEL_HIGH>;
@@ -742,46 +586,39 @@ blsp2_uart10: serial@f9960000 {
status = "disabled";
};
- sdhci@f9824900 {
- compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
- reg = <0xf9824900 0x11c>, <0xf9824000 0x800>;
- reg-names = "hc_mem", "core_mem";
- interrupts = <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 138 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "hc_irq", "pwr_irq";
- clocks = <&gcc GCC_SDCC1_APPS_CLK>,
- <&gcc GCC_SDCC1_AHB_CLK>,
- <&xo_board>;
- clock-names = "core", "iface", "xo";
+ blsp2_i2c2: i2c@f9964000 {
status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9964000 0x1000>;
+ interrupts = <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP2_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
};
- sdhci@f9864900 {
- compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
- reg = <0xf9864900 0x11c>, <0xf9864000 0x800>;
- reg-names = "hc_mem", "core_mem";
- interrupts = <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 224 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "hc_irq", "pwr_irq";
- clocks = <&gcc GCC_SDCC3_APPS_CLK>,
- <&gcc GCC_SDCC3_AHB_CLK>,
- <&xo_board>;
- clock-names = "core", "iface", "xo";
+ blsp2_i2c5: i2c@f9967000 {
status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9967000 0x1000>;
+ interrupts = <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP2_QUP5_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ dmas = <&blsp2_dma 20>, <&blsp2_dma 21>;
+ dma-names = "tx", "rx";
};
- sdhci@f98a4900 {
- compatible = "qcom,msm8974-sdhci", "qcom,sdhci-msm-v4";
- reg = <0xf98a4900 0x11c>, <0xf98a4000 0x800>;
- reg-names = "hc_mem", "core_mem";
- interrupts = <GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>,
- <GIC_SPI 221 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "hc_irq", "pwr_irq";
- clocks = <&gcc GCC_SDCC2_APPS_CLK>,
- <&gcc GCC_SDCC2_AHB_CLK>,
- <&xo_board>;
- clock-names = "core", "iface", "xo";
+ blsp2_i2c6: i2c@f9968000 {
status = "disabled";
+ compatible = "qcom,i2c-qup-v2.1.1";
+ reg = <0xf9968000 0x1000>;
+ interrupts = <0 106 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&gcc GCC_BLSP2_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
+ clock-names = "core", "iface";
+ #address-cells = <1>;
+ #size-cells = <0>;
};
otg: usb@f9a55000 {
@@ -835,55 +672,6 @@ rng@f9bff000 {
clock-names = "core";
};
- remoteproc@fc880000 {
- compatible = "qcom,msm8974-mss-pil";
- reg = <0xfc880000 0x100>, <0xfc820000 0x020>;
- reg-names = "qdsp6", "rmb";
-
- interrupts-extended = <&intc GIC_SPI 24 IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
- <&modem_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
- interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
-
- clocks = <&gcc GCC_MSS_Q6_BIMC_AXI_CLK>,
- <&gcc GCC_MSS_CFG_AHB_CLK>,
- <&gcc GCC_BOOT_ROM_AHB_CLK>,
- <&xo_board>;
- clock-names = "iface", "bus", "mem", "xo";
-
- resets = <&gcc GCC_MSS_RESTART>;
- reset-names = "mss_restart";
-
- cx-supply = <&pm8841_s2>;
- mss-supply = <&pm8841_s3>;
- mx-supply = <&pm8841_s1>;
- pll-supply = <&pm8941_l12>;
-
- qcom,halt-regs = <&tcsr_mutex_block 0x1180 0x1200 0x1280>;
-
- qcom,smem-states = <&modem_smp2p_out 0>;
- qcom,smem-state-names = "stop";
-
- mba {
- memory-region = <&mba_region>;
- };
-
- mpss {
- memory-region = <&mpss_region>;
- };
-
- smd-edge {
- interrupts = <GIC_SPI 25 IRQ_TYPE_EDGE_RISING>;
-
- qcom,ipc = <&apcs 8 12>;
- qcom,smd-edge = <0>;
-
- label = "modem";
- };
- };
-
pronto: remoteproc@fb21b000 {
compatible = "qcom,pronto-v2-pil", "qcom,pronto";
reg = <0xfb204000 0x2000>, <0xfb202000 0x1000>, <0xfb21b000 0x3000>;
@@ -898,8 +686,6 @@ pronto: remoteproc@fb21b000 {
<&wcnss_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
- vddpx-supply = <&pm8941_s3>;
-
qcom,smem-states = <&wcnss_smp2p_out 0>;
qcom,smem-state-names = "stop";
@@ -910,11 +696,6 @@ iris {
clocks = <&rpmcc RPM_SMD_CXO_A2>;
clock-names = "xo";
-
- vddxo-supply = <&pm8941_l6>;
- vddrfa-supply = <&pm8941_l11>;
- vddpa-supply = <&pm8941_l19>;
- vdddig-supply = <&pm8941_s3>;
};
smd-edge {
@@ -942,139 +723,32 @@ wifi {
interrupt-names = "tx", "rx";
qcom,smem-states = <&apps_smsm 10>, <&apps_smsm 9>;
- qcom,smem-state-names = "tx-enable", "tx-rings-empty";
+ qcom,smem-state-names = "tx-enable",
+ "tx-rings-empty";
};
};
};
};
- msmgpio: pinctrl@fd510000 {
- compatible = "qcom,msm8974-pinctrl";
- reg = <0xfd510000 0x4000>;
- gpio-controller;
- gpio-ranges = <&msmgpio 0 0 146>;
- #gpio-cells = <2>;
- interrupt-controller;
- #interrupt-cells = <2>;
- interrupts = <GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>;
- };
-
- i2c@f9923000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9923000 0x1000>;
- interrupts = <0 95 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP1_QUP1_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- i2c@f9924000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9924000 0x1000>;
- interrupts = <GIC_SPI 96 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP1_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- blsp_i2c3: i2c@f9925000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9925000 0x1000>;
- interrupts = <0 97 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP1_QUP3_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- blsp_i2c6: i2c@f9928000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9928000 0x1000>;
- interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP1_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP1_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- blsp_i2c8: i2c@f9964000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9964000 0x1000>;
- interrupts = <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP2_QUP2_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- blsp_i2c11: i2c@f9967000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9967000 0x1000>;
- interrupts = <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP2_QUP5_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- dmas = <&blsp2_dma 20>, <&blsp2_dma 21>;
- dma-names = "tx", "rx";
- };
-
- blsp_i2c12: i2c@f9968000 {
- status = "disabled";
- compatible = "qcom,i2c-qup-v2.1.1";
- reg = <0xf9968000 0x1000>;
- interrupts = <0 106 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP2_QUP6_I2C_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
- clock-names = "core", "iface";
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- spmi_bus: spmi@fc4cf000 {
- compatible = "qcom,spmi-pmic-arb";
- reg-names = "core", "intr", "cnfg";
- reg = <0xfc4cf000 0x1000>,
- <0xfc4cb000 0x1000>,
- <0xfc4ca000 0x1000>;
- interrupt-names = "periph_irq";
- interrupts = <GIC_SPI 190 IRQ_TYPE_LEVEL_HIGH>;
- qcom,ee = <0>;
- qcom,channel = <0>;
- #address-cells = <2>;
- #size-cells = <0>;
- interrupt-controller;
- #interrupt-cells = <4>;
- };
-
- blsp2_dma: dma-controller@f9944000 {
- compatible = "qcom,bam-v1.4.0";
- reg = <0xf9944000 0x19000>;
- interrupts = <GIC_SPI 239 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&gcc GCC_BLSP2_AHB_CLK>;
- clock-names = "bam_clk";
- #dma-cells = <1>;
- qcom,ee = <0>;
- };
-
- etr@fc322000 {
+ etf@fc307000 {
compatible = "arm,coresight-tmc", "arm,primecell";
- reg = <0xfc322000 0x1000>;
+ reg = <0xfc307000 0x1000>;
clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
clock-names = "apb_pclk", "atclk";
+ out-ports {
+ port {
+ etf_out: endpoint {
+ remote-endpoint = <&replicator_in>;
+ };
+ };
+ };
+
in-ports {
port {
- etr_in: endpoint {
- remote-endpoint = <&replicator_out0>;
+ etf_in: endpoint {
+ remote-endpoint = <&merger_out>;
};
};
};
@@ -1096,59 +770,39 @@ tpiu_in: endpoint {
};
};
- replicator@fc31c000 {
- compatible = "arm,coresight-dynamic-replicator", "arm,primecell";
- reg = <0xfc31c000 0x1000>;
+ funnel@fc31a000 {
+ compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
+ reg = <0xfc31a000 0x1000>;
clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
clock-names = "apb_pclk", "atclk";
- out-ports {
+ in-ports {
#address-cells = <1>;
#size-cells = <0>;
- port@0 {
- reg = <0>;
- replicator_out0: endpoint {
- remote-endpoint = <&etr_in>;
- };
- };
- port@1 {
- reg = <1>;
- replicator_out1: endpoint {
- remote-endpoint = <&tpiu_in>;
+ /*
+ * Not described input ports:
+ * 0 - not-connected
+ * 1 - connected trought funnel to Multimedia CPU
+ * 2 - connected to Wireless CPU
+ * 3 - not-connected
+ * 4 - not-connected
+ * 6 - not-connected
+ * 7 - connected to STM
+ */
+ port@5 {
+ reg = <5>;
+ funnel1_in5: endpoint {
+ remote-endpoint = <&kpss_out>;
};
};
};
- in-ports {
+ out-ports {
port {
- replicator_in: endpoint {
- remote-endpoint = <&etf_out>;
- };
- };
- };
- };
-
- etf@fc307000 {
- compatible = "arm,coresight-tmc", "arm,primecell";
- reg = <0xfc307000 0x1000>;
-
- clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
- clock-names = "apb_pclk", "atclk";
-
- out-ports {
- port {
- etf_out: endpoint {
- remote-endpoint = <&replicator_in>;
- };
- };
- };
-
- in-ports {
- port {
- etf_in: endpoint {
- remote-endpoint = <&merger_out>;
+ funnel1_out: endpoint {
+ remote-endpoint = <&merger_in1>;
};
};
};
@@ -1188,85 +842,51 @@ merger_out: endpoint {
};
};
- funnel@fc31a000 {
- compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
- reg = <0xfc31a000 0x1000>;
+ replicator@fc31c000 {
+ compatible = "arm,coresight-dynamic-replicator", "arm,primecell";
+ reg = <0xfc31c000 0x1000>;
clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
clock-names = "apb_pclk", "atclk";
- in-ports {
+ out-ports {
#address-cells = <1>;
#size-cells = <0>;
- /*
- * Not described input ports:
- * 0 - not-connected
- * 1 - connected trought funnel to Multimedia CPU
- * 2 - connected to Wireless CPU
- * 3 - not-connected
- * 4 - not-connected
- * 6 - not-connected
- * 7 - connected to STM
- */
- port@5 {
- reg = <5>;
- funnel1_in5: endpoint {
- remote-endpoint = <&kpss_out>;
+ port@0 {
+ reg = <0>;
+ replicator_out0: endpoint {
+ remote-endpoint = <&etr_in>;
+ };
+ };
+ port@1 {
+ reg = <1>;
+ replicator_out1: endpoint {
+ remote-endpoint = <&tpiu_in>;
};
};
};
- out-ports {
+ in-ports {
port {
- funnel1_out: endpoint {
- remote-endpoint = <&merger_in1>;
+ replicator_in: endpoint {
+ remote-endpoint = <&etf_out>;
};
};
};
};
- funnel@fc345000 { /* KPSS funnel only 4 inputs are used */
- compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
- reg = <0xfc345000 0x1000>;
+ etr@fc322000 {
+ compatible = "arm,coresight-tmc", "arm,primecell";
+ reg = <0xfc322000 0x1000>;
clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
clock-names = "apb_pclk", "atclk";
in-ports {
- #address-cells = <1>;
- #size-cells = <0>;
-
- port@0 {
- reg = <0>;
- kpss_in0: endpoint {
- remote-endpoint = <&etm0_out>;
- };
- };
- port@1 {
- reg = <1>;
- kpss_in1: endpoint {
- remote-endpoint = <&etm1_out>;
- };
- };
- port@2 {
- reg = <2>;
- kpss_in2: endpoint {
- remote-endpoint = <&etm2_out>;
- };
- };
- port@3 {
- reg = <3>;
- kpss_in3: endpoint {
- remote-endpoint = <&etm3_out>;
- };
- };
- };
-
- out-ports {
port {
- kpss_out: endpoint {
- remote-endpoint = <&funnel1_in5>;
+ etr_in: endpoint {
+ remote-endpoint = <&replicator_out0>;
};
};
};
@@ -1344,25 +964,66 @@ etm3_out: endpoint {
};
};
- ocmem@fdd00000 {
- compatible = "qcom,msm8974-ocmem";
- reg = <0xfdd00000 0x2000>,
- <0xfec00000 0x180000>;
- reg-names = "ctrl",
- "mem";
- clocks = <&rpmcc RPM_SMD_OCMEMGX_CLK>,
- <&mmcc OCMEMCX_OCMEMNOC_CLK>;
- clock-names = "core",
- "iface";
+ /* KPSS funnel, only 4 inputs are used */
+ funnel@fc345000 {
+ compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
+ reg = <0xfc345000 0x1000>;
- #address-cells = <1>;
- #size-cells = <1>;
+ clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
+ clock-names = "apb_pclk", "atclk";
- gmu_sram: gmu-sram@0 {
- reg = <0x0 0x100000>;
+ in-ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ port@0 {
+ reg = <0>;
+ kpss_in0: endpoint {
+ remote-endpoint = <&etm0_out>;
+ };
+ };
+ port@1 {
+ reg = <1>;
+ kpss_in1: endpoint {
+ remote-endpoint = <&etm1_out>;
+ };
+ };
+ port@2 {
+ reg = <2>;
+ kpss_in2: endpoint {
+ remote-endpoint = <&etm2_out>;
+ };
+ };
+ port@3 {
+ reg = <3>;
+ kpss_in3: endpoint {
+ remote-endpoint = <&etm3_out>;
+ };
+ };
+ };
+
+ out-ports {
+ port {
+ kpss_out: endpoint {
+ remote-endpoint = <&funnel1_in5>;
+ };
+ };
};
};
+ gcc: clock-controller@fc400000 {
+ compatible = "qcom,gcc-msm8974";
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ reg = <0xfc400000 0x4000>;
+ };
+
+ rpm_msg_ram: memory@fc428000 {
+ compatible = "qcom,rpm-msg-ram";
+ reg = <0xfc428000 0x4000>;
+ };
+
bimc: interconnect@fc380000 {
reg = <0xfc380000 0x6a000>;
compatible = "qcom,msm8974-bimc";
@@ -1417,79 +1078,151 @@ cnoc: interconnect@fc480000 {
<&rpmcc RPM_SMD_CNOC_A_CLK>;
};
- gpu: adreno@fdb00000 {
- status = "disabled";
-
- compatible = "qcom,adreno-330.1",
- "qcom,adreno";
- reg = <0xfdb00000 0x10000>;
- reg-names = "kgsl_3d0_reg_memory";
- interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "kgsl_3d0_irq";
- clock-names = "core",
- "iface",
- "mem_iface";
- clocks = <&mmcc OXILI_GFX3D_CLK>,
- <&mmcc OXILICX_AHB_CLK>,
- <&mmcc OXILICX_AXI_CLK>;
- sram = <&gmu_sram>;
- power-domains = <&mmcc OXILICX_GDSC>;
- operating-points-v2 = <&gpu_opp_table>;
-
- interconnects = <&mmssnoc MNOC_MAS_GRAPHICS_3D &bimc BIMC_SLV_EBI_CH0>,
- <&ocmemnoc OCMEM_VNOC_MAS_GFX3D &ocmemnoc OCMEM_SLV_OCMEM>;
- interconnect-names = "gfx-mem",
- "ocmem";
-
- // iommus = <&gpu_iommu 0>;
-
- gpu_opp_table: opp_table {
- compatible = "operating-points-v2";
-
- opp-320000000 {
- opp-hz = /bits/ 64 <320000000>;
- };
+ tsens: thermal-sensor@fc4a9000 {
+ compatible = "qcom,msm8974-tsens";
+ reg = <0xfc4a9000 0x1000>, /* TM */
+ <0xfc4a8000 0x1000>; /* SROT */
+ nvmem-cells = <&tsens_calib>, <&tsens_backup>;
+ nvmem-cell-names = "calib", "calib_backup";
+ #qcom,sensors = <11>;
+ interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "uplow";
+ #thermal-sensor-cells = <1>;
+ };
- opp-200000000 {
- opp-hz = /bits/ 64 <200000000>;
- };
+ restart@fc4ab000 {
+ compatible = "qcom,pshold";
+ reg = <0xfc4ab000 0x4>;
+ };
- opp-27000000 {
- opp-hz = /bits/ 64 <27000000>;
- };
+ qfprom: qfprom@fc4bc000 {
+ #address-cells = <1>;
+ #size-cells = <1>;
+ compatible = "qcom,qfprom";
+ reg = <0xfc4bc000 0x1000>;
+ tsens_calib: calib@d0 {
+ reg = <0xd0 0x18>;
+ };
+ tsens_backup: backup@440 {
+ reg = <0x440 0x10>;
};
};
- mdss: mdss@fd900000 {
- status = "disabled";
+ spmi_bus: spmi@fc4cf000 {
+ compatible = "qcom,spmi-pmic-arb";
+ reg-names = "core", "intr", "cnfg";
+ reg = <0xfc4cf000 0x1000>,
+ <0xfc4cb000 0x1000>,
+ <0xfc4ca000 0x1000>;
+ interrupt-names = "periph_irq";
+ interrupts = <GIC_SPI 190 IRQ_TYPE_LEVEL_HIGH>;
+ qcom,ee = <0>;
+ qcom,channel = <0>;
+ #address-cells = <2>;
+ #size-cells = <0>;
+ interrupt-controller;
+ #interrupt-cells = <4>;
+ };
- compatible = "qcom,mdss";
- reg = <0xfd900000 0x100>,
- <0xfd924000 0x1000>;
- reg-names = "mdss_phys",
- "vbif_phys";
+ remoteproc_mss: remoteproc@fc880000 {
+ compatible = "qcom,msm8974-mss-pil";
+ reg = <0xfc880000 0x100>, <0xfc820000 0x020>;
+ reg-names = "qdsp6", "rmb";
- power-domains = <&mmcc MDSS_GDSC>;
+ interrupts-extended = <&intc GIC_SPI 24 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
+ <&modem_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
+ interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
- clocks = <&mmcc MDSS_AHB_CLK>,
- <&mmcc MDSS_AXI_CLK>,
- <&mmcc MDSS_VSYNC_CLK>;
- clock-names = "iface",
- "bus",
- "vsync";
+ clocks = <&gcc GCC_MSS_Q6_BIMC_AXI_CLK>,
+ <&gcc GCC_MSS_CFG_AHB_CLK>,
+ <&gcc GCC_BOOT_ROM_AHB_CLK>,
+ <&xo_board>;
+ clock-names = "iface", "bus", "mem", "xo";
+
+ resets = <&gcc GCC_MSS_RESTART>;
+ reset-names = "mss_restart";
+
+ qcom,halt-regs = <&tcsr_mutex_block 0x1180 0x1200 0x1280>;
+
+ qcom,smem-states = <&modem_smp2p_out 0>;
+ qcom,smem-state-names = "stop";
+
+ status = "disabled";
+
+ mba {
+ memory-region = <&mba_region>;
+ };
+
+ mpss {
+ memory-region = <&mpss_region>;
+ };
+
+ smd-edge {
+ interrupts = <GIC_SPI 25 IRQ_TYPE_EDGE_RISING>;
+
+ qcom,ipc = <&apcs 8 12>;
+ qcom,smd-edge = <0>;
+
+ label = "modem";
+ };
+ };
+
+ tcsr_mutex_block: syscon@fd484000 {
+ compatible = "syscon";
+ reg = <0xfd484000 0x2000>;
+ };
+
+ tcsr: syscon@fd4a0000 {
+ compatible = "syscon";
+ reg = <0xfd4a0000 0x10000>;
+ };
+
+ tlmm: pinctrl@fd510000 {
+ compatible = "qcom,msm8974-pinctrl";
+ reg = <0xfd510000 0x4000>;
+ gpio-controller;
+ gpio-ranges = <&tlmm 0 0 146>;
+ #gpio-cells = <2>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+ interrupts = <GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>;
+ };
+
+ mmcc: clock-controller@fd8c0000 {
+ compatible = "qcom,mmcc-msm8974";
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ reg = <0xfd8c0000 0x6000>;
+ };
+
+ mdss: mdss@fd900000 {
+ compatible = "qcom,mdss";
+ reg = <0xfd900000 0x100>, <0xfd924000 0x1000>;
+ reg-names = "mdss_phys", "vbif_phys";
+
+ power-domains = <&mmcc MDSS_GDSC>;
+
+ clocks = <&mmcc MDSS_AHB_CLK>,
+ <&mmcc MDSS_AXI_CLK>,
+ <&mmcc MDSS_VSYNC_CLK>;
+ clock-names = "iface", "bus", "vsync";
interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
interrupt-controller;
#interrupt-cells = <1>;
+ status = "disabled";
+
#address-cells = <1>;
#size-cells = <1>;
ranges;
mdp: mdp@fd900000 {
- status = "disabled";
-
compatible = "qcom,mdp5";
reg = <0xfd900100 0x22000>;
reg-names = "mdp_phys";
@@ -1501,10 +1234,7 @@ mdp: mdp@fd900000 {
<&mmcc MDSS_AXI_CLK>,
<&mmcc MDSS_MDP_CLK>,
<&mmcc MDSS_VSYNC_CLK>;
- clock-names = "iface",
- "bus",
- "core",
- "vsync";
+ clock-names = "iface", "bus", "core", "vsync";
interconnects = <&mmssnoc MNOC_MAS_MDP_PORT0 &bimc BIMC_SLV_EBI_CH0>;
interconnect-names = "mdp0-mem";
@@ -1523,8 +1253,6 @@ mdp5_intf1_out: endpoint {
};
dsi0: dsi@fd922800 {
- status = "disabled";
-
compatible = "qcom,mdss-dsi-ctrl";
reg = <0xfd922800 0x1f8>;
reg-names = "dsi_ctrl";
@@ -1532,29 +1260,32 @@ dsi0: dsi@fd922800 {
interrupt-parent = <&mdss>;
interrupts = <4 IRQ_TYPE_LEVEL_HIGH>;
- assigned-clocks = <&mmcc BYTE0_CLK_SRC>,
- <&mmcc PCLK0_CLK_SRC>;
- assigned-clock-parents = <&dsi_phy0 0>,
- <&dsi_phy0 1>;
+ assigned-clocks = <&mmcc BYTE0_CLK_SRC>, <&mmcc PCLK0_CLK_SRC>;
+ assigned-clock-parents = <&dsi0_phy 0>, <&dsi0_phy 1>;
clocks = <&mmcc MDSS_MDP_CLK>,
- <&mmcc MDSS_AHB_CLK>,
- <&mmcc MDSS_AXI_CLK>,
- <&mmcc MDSS_BYTE0_CLK>,
- <&mmcc MDSS_PCLK0_CLK>,
- <&mmcc MDSS_ESC0_CLK>,
- <&mmcc MMSS_MISC_AHB_CLK>;
+ <&mmcc MDSS_AHB_CLK>,
+ <&mmcc MDSS_AXI_CLK>,
+ <&mmcc MDSS_BYTE0_CLK>,
+ <&mmcc MDSS_PCLK0_CLK>,
+ <&mmcc MDSS_ESC0_CLK>,
+ <&mmcc MMSS_MISC_AHB_CLK>;
clock-names = "mdp_core",
- "iface",
- "bus",
- "byte",
- "pixel",
- "core",
- "core_mmss";
-
- phys = <&dsi_phy0>;
+ "iface",
+ "bus",
+ "byte",
+ "pixel",
+ "core",
+ "core_mmss";
+
+ phys = <&dsi0_phy>;
phy-names = "dsi-phy";
+ status = "disabled";
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
ports {
#address-cells = <1>;
#size-cells = <0>;
@@ -1574,27 +1305,118 @@ dsi0_out: endpoint {
};
};
- dsi_phy0: dsi-phy@fd922a00 {
- status = "disabled";
-
+ dsi0_phy: dsi-phy@fd922a00 {
compatible = "qcom,dsi-phy-28nm-hpm";
reg = <0xfd922a00 0xd4>,
<0xfd922b00 0x280>,
<0xfd922d80 0x30>;
reg-names = "dsi_pll",
- "dsi_phy",
- "dsi_phy_regulator";
+ "dsi_phy",
+ "dsi_phy_regulator";
#clock-cells = <1>;
#phy-cells = <0>;
- qcom,dsi-phy-index = <0>;
clocks = <&mmcc MDSS_AHB_CLK>, <&xo_board>;
clock-names = "iface", "ref";
+
+ status = "disabled";
+ };
+ };
+
+ gpu: adreno@fdb00000 {
+ compatible = "qcom,adreno-330.1", "qcom,adreno";
+ reg = <0xfdb00000 0x10000>;
+ reg-names = "kgsl_3d0_reg_memory";
+
+ interrupts = <GIC_SPI 33 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "kgsl_3d0_irq";
+
+ clocks = <&mmcc OXILI_GFX3D_CLK>,
+ <&mmcc OXILICX_AHB_CLK>,
+ <&mmcc OXILICX_AXI_CLK>;
+ clock-names = "core", "iface", "mem_iface";
+
+ sram = <&gmu_sram>;
+ power-domains = <&mmcc OXILICX_GDSC>;
+ operating-points-v2 = <&gpu_opp_table>;
+
+ interconnects = <&mmssnoc MNOC_MAS_GRAPHICS_3D &bimc BIMC_SLV_EBI_CH0>,
+ <&ocmemnoc OCMEM_VNOC_MAS_GFX3D &ocmemnoc OCMEM_SLV_OCMEM>;
+ interconnect-names = "gfx-mem", "ocmem";
+
+ // iommus = <&gpu_iommu 0>;
+
+ status = "disabled";
+
+ gpu_opp_table: opp_table {
+ compatible = "operating-points-v2";
+
+ opp-320000000 {
+ opp-hz = /bits/ 64 <320000000>;
+ };
+
+ opp-200000000 {
+ opp-hz = /bits/ 64 <200000000>;
+ };
+
+ opp-27000000 {
+ opp-hz = /bits/ 64 <27000000>;
+ };
};
};
- imem@fe805000 {
+ ocmem@fdd00000 {
+ compatible = "qcom,msm8974-ocmem";
+ reg = <0xfdd00000 0x2000>,
+ <0xfec00000 0x180000>;
+ reg-names = "ctrl", "mem";
+ ranges = <0 0xfec00000 0x180000>;
+ clocks = <&rpmcc RPM_SMD_OCMEMGX_CLK>,
+ <&mmcc OCMEMCX_OCMEMNOC_CLK>;
+ clock-names = "core", "iface";
+
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ gmu_sram: gmu-sram@0 {
+ reg = <0x0 0x100000>;
+ };
+ };
+
+ remoteproc_adsp: remoteproc@fe200000 {
+ compatible = "qcom,msm8974-adsp-pil";
+ reg = <0xfe200000 0x100>;
+
+ interrupts-extended = <&intc GIC_SPI 162 IRQ_TYPE_EDGE_RISING>,
+ <&adsp_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+ <&adsp_smp2p_in 1 IRQ_TYPE_EDGE_RISING>,
+ <&adsp_smp2p_in 2 IRQ_TYPE_EDGE_RISING>,
+ <&adsp_smp2p_in 3 IRQ_TYPE_EDGE_RISING>;
+ interrupt-names = "wdog", "fatal", "ready", "handover", "stop-ack";
+
+ clocks = <&xo_board>;
+ clock-names = "xo";
+
+ memory-region = <&adsp_region>;
+
+ qcom,smem-states = <&adsp_smp2p_out 0>;
+ qcom,smem-state-names = "stop";
+
+ status = "disabled";
+
+ smd-edge {
+ interrupts = <GIC_SPI 156 IRQ_TYPE_EDGE_RISING>;
+
+ qcom,ipc = <&apcs 8 8>;
+ qcom,smd-edge = <1>;
+ label = "lpass";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ };
+ };
+
+ imem: imem@fe805000 {
status = "disabled";
compatible = "syscon", "simple-mfd";
reg = <0xfe805000 0x1000>;
@@ -1606,76 +1428,194 @@ reboot-mode {
};
};
- smd {
- compatible = "qcom,smd";
+ tcsr_mutex: tcsr-mutex {
+ compatible = "qcom,tcsr-mutex";
+ syscon = <&tcsr_mutex_block 0 0x80>;
- rpm {
- interrupts = <GIC_SPI 168 IRQ_TYPE_EDGE_RISING>;
- qcom,ipc = <&apcs 8 0>;
- qcom,smd-edge = <15>;
+ #hwlock-cells = <1>;
+ };
- rpm_requests {
- compatible = "qcom,rpm-msm8974";
- qcom,smd-channels = "rpm_requests";
+ thermal-zones {
+ cpu0-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
- rpmcc: clock-controller {
- compatible = "qcom,rpmcc-msm8974", "qcom,rpmcc";
- #clock-cells = <1>;
+ thermal-sensors = <&tsens 5>;
+
+ trips {
+ cpu_alert0: trip0 {
+ temperature = <75000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ cpu_crit0: trip1 {
+ temperature = <110000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+ };
+ };
+
+ cpu1-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 6>;
+
+ trips {
+ cpu_alert1: trip0 {
+ temperature = <75000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ cpu_crit1: trip1 {
+ temperature = <110000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+ };
+ };
+
+ cpu2-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 7>;
+
+ trips {
+ cpu_alert2: trip0 {
+ temperature = <75000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ cpu_crit2: trip1 {
+ temperature = <110000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+ };
+ };
+
+ cpu3-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 8>;
+
+ trips {
+ cpu_alert3: trip0 {
+ temperature = <75000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ cpu_crit3: trip1 {
+ temperature = <110000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+ };
+ };
+
+ q6-dsp-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 1>;
+
+ trips {
+ q6_dsp_alert0: trip-point0 {
+ temperature = <90000>;
+ hysteresis = <2000>;
+ type = "hot";
};
+ };
+ };
- pm8841-regulators {
- compatible = "qcom,rpm-pm8841-regulators";
-
- pm8841_s1: s1 {};
- pm8841_s2: s2 {};
- pm8841_s3: s3 {};
- pm8841_s4: s4 {};
- pm8841_s5: s5 {};
- pm8841_s6: s6 {};
- pm8841_s7: s7 {};
- pm8841_s8: s8 {};
+ modemtx-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 2>;
+
+ trips {
+ modemtx_alert0: trip-point0 {
+ temperature = <90000>;
+ hysteresis = <2000>;
+ type = "hot";
};
+ };
+ };
+
+ video-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
- pm8941-regulators {
- compatible = "qcom,rpm-pm8941-regulators";
-
- pm8941_s1: s1 {};
- pm8941_s2: s2 {};
- pm8941_s3: s3 {};
-
- pm8941_l1: l1 {};
- pm8941_l2: l2 {};
- pm8941_l3: l3 {};
- pm8941_l4: l4 {};
- pm8941_l5: l5 {};
- pm8941_l6: l6 {};
- pm8941_l7: l7 {};
- pm8941_l8: l8 {};
- pm8941_l9: l9 {};
- pm8941_l10: l10 {};
- pm8941_l11: l11 {};
- pm8941_l12: l12 {};
- pm8941_l13: l13 {};
- pm8941_l14: l14 {};
- pm8941_l15: l15 {};
- pm8941_l16: l16 {};
- pm8941_l17: l17 {};
- pm8941_l18: l18 {};
- pm8941_l19: l19 {};
- pm8941_l20: l20 {};
- pm8941_l21: l21 {};
- pm8941_l22: l22 {};
- pm8941_l23: l23 {};
- pm8941_l24: l24 {};
-
- pm8941_lvs1: lvs1 {};
- pm8941_lvs2: lvs2 {};
- pm8941_lvs3: lvs3 {};
+ thermal-sensors = <&tsens 3>;
+
+ trips {
+ video_alert0: trip-point0 {
+ temperature = <95000>;
+ hysteresis = <2000>;
+ type = "hot";
+ };
+ };
+ };
+
+ wlan-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 4>;
+
+ trips {
+ wlan_alert0: trip-point0 {
+ temperature = <105000>;
+ hysteresis = <2000>;
+ type = "hot";
+ };
+ };
+ };
+
+ gpu-top-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 9>;
+
+ trips {
+ gpu1_alert0: trip-point0 {
+ temperature = <90000>;
+ hysteresis = <2000>;
+ type = "hot";
+ };
+ };
+ };
+
+ gpu-bottom-thermal {
+ polling-delay-passive = <250>;
+ polling-delay = <1000>;
+
+ thermal-sensors = <&tsens 10>;
+
+ trips {
+ gpu2_alert0: trip-point0 {
+ temperature = <90000>;
+ hysteresis = <2000>;
+ type = "hot";
};
};
};
};
+ timer {
+ compatible = "arm,armv7-timer";
+ interrupts = <GIC_PPI 2 0xf08>,
+ <GIC_PPI 3 0xf08>,
+ <GIC_PPI 4 0xf08>,
+ <GIC_PPI 1 0xf08>;
+ clock-frequency = <19200000>;
+ };
+
vreg_boost: vreg-boost {
compatible = "regulator-fixed";
@@ -1692,6 +1632,7 @@ vreg_boost: vreg-boost {
pinctrl-names = "default";
pinctrl-0 = <&boost_bypass_n_pin>;
};
+
vreg_vph_pwr: vreg-vph-pwr {
compatible = "regulator-fixed";
regulator-name = "vph-pwr";
diff --git a/arch/arm/boot/dts/qcom-pm8841.dtsi b/arch/arm/boot/dts/qcom-pm8841.dtsi
index 2caf71eacb52..b5cdde034d18 100644
--- a/arch/arm/boot/dts/qcom-pm8841.dtsi
+++ b/arch/arm/boot/dts/qcom-pm8841.dtsi
@@ -24,6 +24,7 @@ temp-alarm@2400 {
compatible = "qcom,spmi-temp-alarm";
reg = <0x2400>;
interrupts = <4 0x24 0 IRQ_TYPE_EDGE_RISING>;
+ #thermal-sensor-cells = <0>;
};
};
diff --git a/arch/arm/boot/dts/qcom-pm8941.dtsi b/arch/arm/boot/dts/qcom-pm8941.dtsi
index da00b8f5eecd..cdd2bdb77b32 100644
--- a/arch/arm/boot/dts/qcom-pm8941.dtsi
+++ b/arch/arm/boot/dts/qcom-pm8941.dtsi
@@ -131,7 +131,7 @@ pm8941_iadc: iadc@3600 {
qcom,external-resistor-micro-ohms = <10000>;
};
- coincell@2800 {
+ pm8941_coincell: coincell@2800 {
compatible = "qcom,pm8941-coincell";
reg = <0x2800>;
status = "disabled";
diff --git a/arch/arm/boot/dts/qcom-sdx55.dtsi b/arch/arm/boot/dts/qcom-sdx55.dtsi
index d455795da44c..b75e672c239d 100644
--- a/arch/arm/boot/dts/qcom-sdx55.dtsi
+++ b/arch/arm/boot/dts/qcom-sdx55.dtsi
@@ -206,7 +206,7 @@ gcc: clock-controller@100000 {
blsp1_uart3: serial@831000 {
compatible = "qcom,msm-uartdm-v1.4", "qcom,msm-uartdm";
reg = <0x00831000 0x200>;
- interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_LOW>;
+ interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&gcc 30>,
<&gcc 9>;
clock-names = "core", "iface";
diff --git a/arch/arm/boot/dts/ste-ux500-samsung-codina.dts b/arch/arm/boot/dts/ste-ux500-samsung-codina.dts
index 1c1725d31c7c..0a10fb0a3741 100644
--- a/arch/arm/boot/dts/ste-ux500-samsung-codina.dts
+++ b/arch/arm/boot/dts/ste-ux500-samsung-codina.dts
@@ -566,8 +566,8 @@ lisd3dh@19 {
reg = <0x19>;
vdd-supply = <&ab8500_ldo_aux1_reg>; // 3V
vddio-supply = <&ab8500_ldo_aux2_reg>; // 1.8V
- mount-matrix = "0", "-1", "0",
- "1", "0", "0",
+ mount-matrix = "0", "1", "0",
+ "-1", "0", "0",
"0", "0", "1";
};
};
diff --git a/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts b/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts
index fd170974765f..149838b5f9c1 100644
--- a/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts
+++ b/arch/arm/boot/dts/ste-ux500-samsung-gavini.dts
@@ -523,8 +523,8 @@ i2c-gate {
accelerometer@18 {
compatible = "bosch,bma222e";
reg = <0x18>;
- mount-matrix = "0", "1", "0",
- "-1", "0", "0",
+ mount-matrix = "0", "-1", "0",
+ "1", "0", "0",
"0", "0", "1";
vddio-supply = <&ab8500_ldo_aux2_reg>; // 1.8V
vdd-supply = <&ab8500_ldo_aux1_reg>; // 3V
diff --git a/arch/arm/boot/dts/ste-ux500-samsung-janice.dts b/arch/arm/boot/dts/ste-ux500-samsung-janice.dts
index 42762bfcd878..2069efb252a7 100644
--- a/arch/arm/boot/dts/ste-ux500-samsung-janice.dts
+++ b/arch/arm/boot/dts/ste-ux500-samsung-janice.dts
@@ -610,8 +610,8 @@ i2c-gate {
accelerometer@8 {
compatible = "bosch,bma222";
reg = <0x08>;
- mount-matrix = "0", "1", "0",
- "-1", "0", "0",
+ mount-matrix = "0", "-1", "0",
+ "1", "0", "0",
"0", "0", "1";
vddio-supply = <&ab8500_ldo_aux2_reg>; // 1.8V
vdd-supply = <&ab8500_ldo_aux1_reg>; // 3V
diff --git a/arch/arm/boot/dts/uniphier-pxs2.dtsi b/arch/arm/boot/dts/uniphier-pxs2.dtsi
index e81e5937a60a..03301ddb3403 100644
--- a/arch/arm/boot/dts/uniphier-pxs2.dtsi
+++ b/arch/arm/boot/dts/uniphier-pxs2.dtsi
@@ -597,8 +597,8 @@ usb0: usb@65a00000 {
compatible = "socionext,uniphier-dwc3", "snps,dwc3";
status = "disabled";
reg = <0x65a00000 0xcd00>;
- interrupt-names = "host", "peripheral";
- interrupts = <0 134 4>, <0 135 4>;
+ interrupt-names = "dwc_usb3";
+ interrupts = <0 134 4>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb0>, <&pinctrl_usb2>;
clock-names = "ref", "bus_early", "suspend";
@@ -693,8 +693,8 @@ usb1: usb@65c00000 {
compatible = "socionext,uniphier-dwc3", "snps,dwc3";
status = "disabled";
reg = <0x65c00000 0xcd00>;
- interrupt-names = "host", "peripheral";
- interrupts = <0 137 4>, <0 138 4>;
+ interrupt-names = "dwc_usb3";
+ interrupts = <0 137 4>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb1>, <&pinctrl_usb3>;
clock-names = "ref", "bus_early", "suspend";
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index e4dba5461cb3..149a5bd6b88c 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -63,7 +63,7 @@ config CRYPTO_SHA512_ARM
using optimized ARM assembler and NEON, when available.
config CRYPTO_BLAKE2S_ARM
- tristate "BLAKE2s digest algorithm (ARM)"
+ bool "BLAKE2s digest algorithm (ARM)"
select CRYPTO_ARCH_HAVE_LIB_BLAKE2S
help
BLAKE2s digest algorithm optimized with ARM scalar instructions. This
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index 0274f81cc8ea..971e74546fb1 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -9,8 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
-obj-$(CONFIG_CRYPTO_BLAKE2S_ARM) += blake2s-arm.o
-obj-$(if $(CONFIG_CRYPTO_BLAKE2S_ARM),y) += libblake2s-arm.o
+obj-$(CONFIG_CRYPTO_BLAKE2S_ARM) += libblake2s-arm.o
obj-$(CONFIG_CRYPTO_BLAKE2B_NEON) += blake2b-neon.o
obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
obj-$(CONFIG_CRYPTO_POLY1305_ARM) += poly1305-arm.o
@@ -32,7 +31,6 @@ sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
sha256-arm-y := sha256-core.o sha256_glue.o $(sha256-arm-neon-y)
sha512-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha512-neon-glue.o
sha512-arm-y := sha512-core.o sha512-glue.o $(sha512-arm-neon-y)
-blake2s-arm-y := blake2s-shash.o
libblake2s-arm-y:= blake2s-core.o blake2s-glue.o
blake2b-neon-y := blake2b-neon-core.o blake2b-neon-glue.o
sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o
diff --git a/arch/arm/crypto/blake2s-shash.c b/arch/arm/crypto/blake2s-shash.c
deleted file mode 100644
index 763c73beea2d..000000000000
--- a/arch/arm/crypto/blake2s-shash.c
+++ /dev/null
@@ -1,75 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * BLAKE2s digest algorithm, ARM scalar implementation
- *
- * Copyright 2020 Google LLC
- */
-
-#include <crypto/internal/blake2s.h>
-#include <crypto/internal/hash.h>
-
-#include <linux/module.h>
-
-static int crypto_blake2s_update_arm(struct shash_desc *desc,
- const u8 *in, unsigned int inlen)
-{
- return crypto_blake2s_update(desc, in, inlen, false);
-}
-
-static int crypto_blake2s_final_arm(struct shash_desc *desc, u8 *out)
-{
- return crypto_blake2s_final(desc, out, false);
-}
-
-#define BLAKE2S_ALG(name, driver_name, digest_size) \
- { \
- .base.cra_name = name, \
- .base.cra_driver_name = driver_name, \
- .base.cra_priority = 200, \
- .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
- .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
- .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
- .base.cra_module = THIS_MODULE, \
- .digestsize = digest_size, \
- .setkey = crypto_blake2s_setkey, \
- .init = crypto_blake2s_init, \
- .update = crypto_blake2s_update_arm, \
- .final = crypto_blake2s_final_arm, \
- .descsize = sizeof(struct blake2s_state), \
- }
-
-static struct shash_alg blake2s_arm_algs[] = {
- BLAKE2S_ALG("blake2s-128", "blake2s-128-arm", BLAKE2S_128_HASH_SIZE),
- BLAKE2S_ALG("blake2s-160", "blake2s-160-arm", BLAKE2S_160_HASH_SIZE),
- BLAKE2S_ALG("blake2s-224", "blake2s-224-arm", BLAKE2S_224_HASH_SIZE),
- BLAKE2S_ALG("blake2s-256", "blake2s-256-arm", BLAKE2S_256_HASH_SIZE),
-};
-
-static int __init blake2s_arm_mod_init(void)
-{
- return IS_REACHABLE(CONFIG_CRYPTO_HASH) ?
- crypto_register_shashes(blake2s_arm_algs,
- ARRAY_SIZE(blake2s_arm_algs)) : 0;
-}
-
-static void __exit blake2s_arm_mod_exit(void)
-{
- if (IS_REACHABLE(CONFIG_CRYPTO_HASH))
- crypto_unregister_shashes(blake2s_arm_algs,
- ARRAY_SIZE(blake2s_arm_algs));
-}
-
-module_init(blake2s_arm_mod_init);
-module_exit(blake2s_arm_mod_exit);
-
-MODULE_DESCRIPTION("BLAKE2s digest algorithm, ARM scalar implementation");
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
-MODULE_ALIAS_CRYPTO("blake2s-128");
-MODULE_ALIAS_CRYPTO("blake2s-128-arm");
-MODULE_ALIAS_CRYPTO("blake2s-160");
-MODULE_ALIAS_CRYPTO("blake2s-160-arm");
-MODULE_ALIAS_CRYPTO("blake2s-224");
-MODULE_ALIAS_CRYPTO("blake2s-224-arm");
-MODULE_ALIAS_CRYPTO("blake2s-256");
-MODULE_ALIAS_CRYPTO("blake2s-256-arm");
diff --git a/arch/arm/lib/findbit.S b/arch/arm/lib/findbit.S
index b5e8b9ae4c7d..7fd3600db8ef 100644
--- a/arch/arm/lib/findbit.S
+++ b/arch/arm/lib/findbit.S
@@ -40,8 +40,8 @@ ENDPROC(_find_first_zero_bit_le)
* Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
*/
ENTRY(_find_next_zero_bit_le)
- teq r1, #0
- beq 3b
+ cmp r2, r1
+ bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
ARM( ldrb r3, [r0, r2, lsr #3] )
@@ -81,8 +81,8 @@ ENDPROC(_find_first_bit_le)
* Prototype: int find_next_zero_bit(void *addr, unsigned int maxbit, int offset)
*/
ENTRY(_find_next_bit_le)
- teq r1, #0
- beq 3b
+ cmp r2, r1
+ bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
ARM( ldrb r3, [r0, r2, lsr #3] )
@@ -115,8 +115,8 @@ ENTRY(_find_first_zero_bit_be)
ENDPROC(_find_first_zero_bit_be)
ENTRY(_find_next_zero_bit_be)
- teq r1, #0
- beq 3b
+ cmp r2, r1
+ bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
eor r3, r2, #0x18 @ big endian byte ordering
@@ -149,8 +149,8 @@ ENTRY(_find_first_bit_be)
ENDPROC(_find_first_bit_be)
ENTRY(_find_next_bit_be)
- teq r1, #0
- beq 3b
+ cmp r2, r1
+ bhs 3b
ands ip, r2, #7
beq 1b @ If new byte, goto old routine
eor r3, r2, #0x18 @ big endian byte ordering
diff --git a/arch/arm/mach-bcm/bcm_kona_smc.c b/arch/arm/mach-bcm/bcm_kona_smc.c
index 43829e49ad93..347bfb7f03e2 100644
--- a/arch/arm/mach-bcm/bcm_kona_smc.c
+++ b/arch/arm/mach-bcm/bcm_kona_smc.c
@@ -52,6 +52,7 @@ int __init bcm_kona_smc_init(void)
return -ENODEV;
prop_val = of_get_address(node, 0, &prop_size, NULL);
+ of_node_put(node);
if (!prop_val)
return -EINVAL;
diff --git a/arch/arm/mach-omap2/display.c b/arch/arm/mach-omap2/display.c
index 21413a9b7b6c..8d829f3dafe7 100644
--- a/arch/arm/mach-omap2/display.c
+++ b/arch/arm/mach-omap2/display.c
@@ -211,6 +211,7 @@ static int __init omapdss_init_fbdev(void)
node = of_find_node_by_name(NULL, "omap4_padconf_global");
if (node)
omap4_dsi_mux_syscon = syscon_node_to_regmap(node);
+ of_node_put(node);
return 0;
}
@@ -259,11 +260,13 @@ static int __init omapdss_init_of(void)
if (!pdev) {
pr_err("Unable to find DSS platform device\n");
+ of_node_put(node);
return -ENODEV;
}
r = of_platform_populate(node, NULL, NULL, &pdev->dev);
put_device(&pdev->dev);
+ of_node_put(node);
if (r) {
pr_err("Unable to populate DSS submodule devices\n");
return r;
diff --git a/arch/arm/mach-omap2/pdata-quirks.c b/arch/arm/mach-omap2/pdata-quirks.c
index e7fd29a502a0..af9d8c432af0 100644
--- a/arch/arm/mach-omap2/pdata-quirks.c
+++ b/arch/arm/mach-omap2/pdata-quirks.c
@@ -551,6 +551,8 @@ pdata_quirks_init_clocks(const struct of_device_id *omap_dt_match_table)
of_platform_populate(np, omap_dt_match_table,
omap_auxdata_lookup, NULL);
+
+ of_node_put(np);
}
}
diff --git a/arch/arm/mach-omap2/prm3xxx.c b/arch/arm/mach-omap2/prm3xxx.c
index 1b442b128569..63e73e9b82bc 100644
--- a/arch/arm/mach-omap2/prm3xxx.c
+++ b/arch/arm/mach-omap2/prm3xxx.c
@@ -708,6 +708,7 @@ static int omap3xxx_prm_late_init(void)
}
irq_num = of_irq_get(np, 0);
+ of_node_put(np);
if (irq_num == -EPROBE_DEFER)
return irq_num;
diff --git a/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c b/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
index 09ef73b99dd8..ba44cec5e59a 100644
--- a/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
+++ b/arch/arm/mach-shmobile/regulator-quirk-rcar-gen2.c
@@ -125,6 +125,7 @@ static int regulator_quirk_notify(struct notifier_block *nb,
list_for_each_entry_safe(pos, tmp, &quirk_list, list) {
list_del(&pos->list);
+ of_node_put(pos->np);
kfree(pos);
}
@@ -174,11 +175,12 @@ static int __init rcar_gen2_regulator_quirk(void)
memcpy(&quirk->i2c_msg, id->data, sizeof(quirk->i2c_msg));
quirk->id = id;
- quirk->np = np;
+ quirk->np = of_node_get(np);
quirk->i2c_msg.addr = addr;
ret = of_irq_parse_one(np, 0, argsa);
if (ret) { /* Skip invalid entry and continue */
+ of_node_put(np);
kfree(quirk);
continue;
}
@@ -225,6 +227,7 @@ static int __init rcar_gen2_regulator_quirk(void)
err_mem:
list_for_each_entry_safe(pos, tmp, &quirk_list, list) {
list_del(&pos->list);
+ of_node_put(pos->np);
kfree(pos);
}
diff --git a/arch/arm/mach-zynq/common.c b/arch/arm/mach-zynq/common.c
index e1ca6a5732d2..15e8a321a713 100644
--- a/arch/arm/mach-zynq/common.c
+++ b/arch/arm/mach-zynq/common.c
@@ -77,6 +77,7 @@ static int __init zynq_get_revision(void)
}
zynq_devcfg_base = of_iomap(np, 0);
+ of_node_put(np);
if (!zynq_devcfg_base) {
pr_err("%s: Unable to map I/O memory\n", __func__);
return -1;
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 20ea89d9ac2f..54cf6faf339c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -223,6 +223,7 @@ config ARM64
select THREAD_INFO_IN_TASK
select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
select TRACE_IRQFLAGS_SUPPORT
+ select TRACE_IRQFLAGS_NMI_SUPPORT
help
ARM 64-bit (AArch64) Linux support.
@@ -500,6 +501,22 @@ config ARM64_ERRATUM_834220
If unsure, say Y.
+config ARM64_ERRATUM_1742098
+ bool "Cortex-A57/A72: 1742098: ELR recorded incorrectly on interrupt taken between cryptographic instructions in a sequence"
+ depends on COMPAT
+ default y
+ help
+ This option removes the AES hwcap for aarch32 user-space to
+ workaround erratum 1742098 on Cortex-A57 and Cortex-A72.
+
+ Affected parts may corrupt the AES state if an interrupt is
+ taken between a pair of AES instructions. These instructions
+ are only present if the cryptography extensions are present.
+ All software should have a fallback implementation for CPUs
+ that don't implement the cryptography extensions.
+
+ If unsure, say Y.
+
config ARM64_ERRATUM_845719
bool "Cortex-A53: 845719: a load might read incorrect data"
depends on COMPAT
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts
index c519d9fa6967..3d2c68d58f49 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-orangepi-win.dts
@@ -40,7 +40,7 @@ hdmi_con_in: endpoint {
leds {
compatible = "gpio-leds";
- status {
+ led-0 {
label = "orangepi:green:status";
gpios = <&pio 7 11 GPIO_ACTIVE_HIGH>; /* PH11 */
};
diff --git a/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi b/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi
index ef0349d1c3d0..68f4a0fae7cf 100644
--- a/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi
+++ b/arch/arm64/boot/dts/exynos/exynosautov9-pinctrl.dtsi
@@ -1089,21 +1089,21 @@ spi10_cs_func: spi10-cs-func-pins {
/* PERIC1 USI11_SPI */
spi11_bus: spi11-pins {
- samsung,pins = "gpp3-6", "gpp3-5", "gpp3-4";
+ samsung,pins = "gpp5-6", "gpp5-5", "gpp5-4";
samsung,pin-function = <EXYNOS_PIN_FUNC_2>;
samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
};
spi11_cs: spi11-cs-pins {
- samsung,pins = "gpp3-7";
+ samsung,pins = "gpp5-7";
samsung,pin-function = <EXYNOS_PIN_FUNC_OUTPUT>;
samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
};
spi11_cs_func: spi11-cs-func-pins {
- samsung,pins = "gpp3-7";
+ samsung,pins = "gpp5-7";
samsung,pin-function = <EXYNOS_PIN_FUNC_2>;
samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
diff --git a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
index 2b9bf8dd14ec..7538918c7a82 100644
--- a/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
+++ b/arch/arm64/boot/dts/mediatek/mt7622-bananapi-bpi-r64.dts
@@ -49,7 +49,7 @@ factory {
wps {
label = "wps";
linux,code = <KEY_WPS_BUTTON>;
- gpios = <&pio 102 GPIO_ACTIVE_HIGH>;
+ gpios = <&pio 102 GPIO_ACTIVE_LOW>;
};
};
diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
index bcecc7484453..96439a4bce09 100644
--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
@@ -41,7 +41,7 @@ cpu0: cpu@0 {
reg = <0x000>;
enable-method = "psci";
clock-frequency = <1701000000>;
- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
+ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
next-level-cache = <&l2_0>;
capacity-dmips-mhz = <530>;
};
@@ -52,7 +52,7 @@ cpu1: cpu@100 {
reg = <0x100>;
enable-method = "psci";
clock-frequency = <1701000000>;
- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
+ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
next-level-cache = <&l2_0>;
capacity-dmips-mhz = <530>;
};
@@ -63,7 +63,7 @@ cpu2: cpu@200 {
reg = <0x200>;
enable-method = "psci";
clock-frequency = <1701000000>;
- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
+ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
next-level-cache = <&l2_0>;
capacity-dmips-mhz = <530>;
};
@@ -74,7 +74,7 @@ cpu3: cpu@300 {
reg = <0x300>;
enable-method = "psci";
clock-frequency = <1701000000>;
- cpu-idle-states = <&cpuoff_l &clusteroff_l>;
+ cpu-idle-states = <&cpu_sleep_l &cluster_sleep_l>;
next-level-cache = <&l2_0>;
capacity-dmips-mhz = <530>;
};
@@ -85,7 +85,7 @@ cpu4: cpu@400 {
reg = <0x400>;
enable-method = "psci";
clock-frequency = <2171000000>;
- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
+ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
next-level-cache = <&l2_1>;
capacity-dmips-mhz = <1024>;
};
@@ -96,7 +96,7 @@ cpu5: cpu@500 {
reg = <0x500>;
enable-method = "psci";
clock-frequency = <2171000000>;
- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
+ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
next-level-cache = <&l2_1>;
capacity-dmips-mhz = <1024>;
};
@@ -107,7 +107,7 @@ cpu6: cpu@600 {
reg = <0x600>;
enable-method = "psci";
clock-frequency = <2171000000>;
- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
+ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
next-level-cache = <&l2_1>;
capacity-dmips-mhz = <1024>;
};
@@ -118,7 +118,7 @@ cpu7: cpu@700 {
reg = <0x700>;
enable-method = "psci";
clock-frequency = <2171000000>;
- cpu-idle-states = <&cpuoff_b &clusteroff_b>;
+ cpu-idle-states = <&cpu_sleep_b &cluster_sleep_b>;
next-level-cache = <&l2_1>;
capacity-dmips-mhz = <1024>;
};
@@ -170,8 +170,8 @@ l3_0: l3-cache {
};
idle-states {
- entry-method = "arm,psci";
- cpuoff_l: cpuoff_l {
+ entry-method = "psci";
+ cpu_sleep_l: cpu-sleep-l {
compatible = "arm,idle-state";
arm,psci-suspend-param = <0x00010001>;
local-timer-stop;
@@ -179,7 +179,7 @@ cpuoff_l: cpuoff_l {
exit-latency-us = <140>;
min-residency-us = <780>;
};
- cpuoff_b: cpuoff_b {
+ cpu_sleep_b: cpu-sleep-b {
compatible = "arm,idle-state";
arm,psci-suspend-param = <0x00010001>;
local-timer-stop;
@@ -187,7 +187,7 @@ cpuoff_b: cpuoff_b {
exit-latency-us = <145>;
min-residency-us = <720>;
};
- clusteroff_l: clusteroff_l {
+ cluster_sleep_l: cluster-sleep-l {
compatible = "arm,idle-state";
arm,psci-suspend-param = <0x01010002>;
local-timer-stop;
@@ -195,7 +195,7 @@ clusteroff_l: clusteroff_l {
exit-latency-us = <155>;
min-residency-us = <860>;
};
- clusteroff_b: clusteroff_b {
+ cluster_sleep_b: cluster-sleep-b {
compatible = "arm,idle-state";
arm,psci-suspend-param = <0x01010002>;
local-timer-stop;
diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index e9b40f5d79ec..77c597a6386f 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1807,6 +1807,7 @@ sram@30000000 {
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x0 0x0 0x30000000 0x50000>;
+ no-memory-wc;
cpu_bpmp_tx: sram@4e000 {
reg = <0x4e000 0x1000>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
index a7d7cfd66379..b0f9393dd39c 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi
@@ -75,7 +75,7 @@ eeprom@50 {
/* SDMMC1 (SD/MMC) */
mmc@3400000 {
- cd-gpios = <&gpio TEGRA194_MAIN_GPIO(A, 0) GPIO_ACTIVE_LOW>;
+ cd-gpios = <&gpio TEGRA194_MAIN_GPIO(G, 7) GPIO_ACTIVE_LOW>;
};
/* SDMMC4 (eMMC) */
diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 751ebe5e9506..61465eb6cccd 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -2648,6 +2648,7 @@ sram@40000000 {
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x0 0x0 0x40000000 0x50000>;
+ no-memory-wc;
cpu_bpmp_tx: sram@4e000 {
reg = <0x4e000 0x1000>;
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index aaace605bdaa..9916b87fa83f 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -1264,6 +1264,7 @@ sram@40000000 {
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x0 0x0 0x40000000 0x80000>;
+ no-memory-wc;
cpu_bpmp_tx: sram@70000 {
reg = <0x70000 0x1000>;
diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
index aac56575e30d..5cdec104d899 100644
--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
@@ -525,9 +525,9 @@ timer {
};
timer@b120000 {
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x10000000>;
compatible = "arm,armv7-timer-mem";
reg = <0x0 0x0b120000 0x0 0x1000>;
@@ -535,49 +535,49 @@ frame@b120000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x0b121000 0x0 0x1000>,
- <0x0 0x0b122000 0x0 0x1000>;
+ reg = <0x0b121000 0x1000>,
+ <0x0b122000 0x1000>;
};
frame@b123000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0xb123000 0x0 0x1000>;
+ reg = <0x0b123000 0x1000>;
status = "disabled";
};
frame@b124000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x0b124000 0x0 0x1000>;
+ reg = <0x0b124000 0x1000>;
status = "disabled";
};
frame@b125000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x0b125000 0x0 0x1000>;
+ reg = <0x0b125000 0x1000>;
status = "disabled";
};
frame@b126000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x0b126000 0x0 0x1000>;
+ reg = <0x0b126000 0x1000>;
status = "disabled";
};
frame@b127000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x0b127000 0x0 0x1000>;
+ reg = <0x0b127000 0x1000>;
status = "disabled";
};
frame@b128000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x0b128000 0x0 0x1000>;
+ reg = <0x0b128000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/ipq8074.dtsi b/arch/arm64/boot/dts/qcom/ipq8074.dtsi
index 8d4e0e193439..664fba3632b1 100644
--- a/arch/arm64/boot/dts/qcom/ipq8074.dtsi
+++ b/arch/arm64/boot/dts/qcom/ipq8074.dtsi
@@ -534,7 +534,7 @@ qpic_bam: dma-controller@7984000 {
status = "disabled";
};
- qpic_nand: nand@79b0000 {
+ qpic_nand: nand-controller@79b0000 {
compatible = "qcom,ipq8074-nand";
reg = <0x079b0000 0x10000>;
#address-cells = <1>;
diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index e34963505e07..7ecd747dc624 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -1758,8 +1758,8 @@ pronto: remoteproc@a21b000 {
<&rpmpd MSM8916_VDDMX>;
power-domain-names = "cx", "mx";
- qcom,state = <&wcnss_smp2p_out 0>;
- qcom,state-names = "stop";
+ qcom,smem-states = <&wcnss_smp2p_out 0>;
+ qcom,smem-state-names = "stop";
pinctrl-names = "default";
pinctrl-0 = <&wcnss_pin_a>;
diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index b9a48cfd760f..078edf46de2b 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -603,7 +603,7 @@ pciephy_0: phy@35000 {
<0x00035400 0x1dc>;
#phy-cells = <0>;
- #clock-cells = <1>;
+ #clock-cells = <0>;
clock-output-names = "pcie_0_pipe_clk_src";
clocks = <&gcc GCC_PCIE_0_PIPE_CLK>;
clock-names = "pipe0";
@@ -617,6 +617,7 @@ pciephy_1: phy@36000 {
<0x00036400 0x1dc>;
#phy-cells = <0>;
+ #clock-cells = <0>;
clock-output-names = "pcie_1_pipe_clk_src";
clocks = <&gcc GCC_PCIE_1_PIPE_CLK>;
clock-names = "pipe1";
@@ -630,6 +631,7 @@ pciephy_2: phy@37000 {
<0x00037400 0x1dc>;
#phy-cells = <0>;
+ #clock-cells = <0>;
clock-output-names = "pcie_2_pipe_clk_src";
clocks = <&gcc GCC_PCIE_2_PIPE_CLK>;
clock-names = "pipe2";
@@ -2659,7 +2661,7 @@ ssusb_phy_0: phy@7410200 {
<0x07410600 0x1a8>;
#phy-cells = <0>;
- #clock-cells = <1>;
+ #clock-cells = <0>;
clock-output-names = "usb3_phy_pipe_clk_src";
clocks = <&gcc GCC_USB3_PHY_PIPE_CLK>;
clock-names = "pipe0";
diff --git a/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts b/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts
index 4a1f98a21031..c21333aa73c2 100644
--- a/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts
+++ b/arch/arm64/boot/dts/qcom/msm8998-sony-xperia-yoshino-poplar.dts
@@ -26,11 +26,13 @@ &lab {
};
&vreg_l18a_2p85 {
- regulator-min-microvolt = <2850000>;
- regulator-max-microvolt = <2850000>;
+ /* Note: Round-down from 2850000 to be a multiple of PLDO step-size 8000 */
+ regulator-min-microvolt = <2848000>;
+ regulator-max-microvolt = <2848000>;
};
&vreg_l22a_2p85 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
+ /* Note: Round-down from 2700000 to be a multiple of PLDO step-size 8000 */
+ regulator-min-microvolt = <2696000>;
+ regulator-max-microvolt = <2696000>;
};
diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi
index 3f06f7cd3cf2..65d71adee345 100644
--- a/arch/arm64/boot/dts/qcom/qcs404.dtsi
+++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi
@@ -548,7 +548,7 @@ dwc3@7580000 {
compatible = "snps,dwc3";
reg = <0x07580000 0xcd00>;
interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>;
- phys = <&usb2_phy_sec>, <&usb3_phy>;
+ phys = <&usb2_phy_prim>, <&usb3_phy>;
phy-names = "usb2-phy", "usb3-phy";
snps,has-lpm-erratum;
snps,hird-threshold = /bits/ 8 <0x10>;
@@ -577,7 +577,7 @@ dwc3@78c0000 {
compatible = "snps,dwc3";
reg = <0x078c0000 0xcc00>;
interrupts = <GIC_SPI 44 IRQ_TYPE_LEVEL_HIGH>;
- phys = <&usb2_phy_prim>;
+ phys = <&usb2_phy_sec>;
phy-names = "usb2-phy";
snps,has-lpm-erratum;
snps,hird-threshold = /bits/ 8 <0x10>;
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
index 732e1181af48..262224504921 100644
--- a/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor.dtsi
@@ -42,6 +42,7 @@ charger-crit {
*/
/delete-node/ &hyp_mem;
+/delete-node/ &ipa_fw_mem;
/delete-node/ &xbl_mem;
/delete-node/ &aop_mem;
/delete-node/ &sec_apps_mem;
diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index e1c46b80f14a..120db2d0a309 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -3219,7 +3219,7 @@ aoss_reset: reset-controller@c2a0000 {
};
aoss_qmp: power-controller@c300000 {
- compatible = "qcom,sc7180-aoss-qmp";
+ compatible = "qcom,sc7180-aoss-qmp", "qcom,aoss-qmp";
reg = <0 0x0c300000 0 0x400>;
interrupts = <GIC_SPI 389 IRQ_TYPE_EDGE_RISING>;
mboxes = <&apss_shared 0>;
@@ -3388,9 +3388,9 @@ watchdog@17c10000 {
};
timer@17c20000{
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
compatible = "arm,armv7-timer-mem";
reg = <0 0x17c20000 0 0x1000>;
@@ -3398,49 +3398,49 @@ frame@17c21000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c21000 0 0x1000>,
- <0 0x17c22000 0 0x1000>;
+ reg = <0x17c21000 0x1000>,
+ <0x17c22000 0x1000>;
};
frame@17c23000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c23000 0 0x1000>;
+ reg = <0x17c23000 0x1000>;
status = "disabled";
};
frame@17c25000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c25000 0 0x1000>;
+ reg = <0x17c25000 0x1000>;
status = "disabled";
};
frame@17c27000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c27000 0 0x1000>;
+ reg = <0x17c27000 0x1000>;
status = "disabled";
};
frame@17c29000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c29000 0 0x1000>;
+ reg = <0x17c29000 0x1000>;
status = "disabled";
};
frame@17c2b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c2b000 0 0x1000>;
+ reg = <0x17c2b000 0x1000>;
status = "disabled";
};
frame@17c2d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c2d000 0 0x1000>;
+ reg = <0x17c2d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index f0b64be63c21..793ac9715da2 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -810,7 +810,7 @@ gcc: clock-controller@100000 {
reg = <0 0x00100000 0 0x1f0000>;
clocks = <&rpmhcc RPMH_CXO_CLK>,
<&rpmhcc RPMH_CXO_CLK_A>, <&sleep_clk>,
- <0>, <&pcie1_lane 0>,
+ <0>, <&pcie1_lane>,
<0>, <0>, <0>, <0>;
clock-names = "bi_tcxo", "bi_tcxo_ao", "sleep_clk",
"pcie_0_pipe_clk", "pcie_1_pipe_clk",
@@ -1839,7 +1839,7 @@ pcie1: pci@1c08000 {
clocks = <&gcc GCC_PCIE_1_PIPE_CLK>,
<&gcc GCC_PCIE_1_PIPE_CLK_SRC>,
- <&pcie1_lane 0>,
+ <&pcie1_lane>,
<&rpmhcc RPMH_CXO_CLK>,
<&gcc GCC_PCIE_1_AUX_CLK>,
<&gcc GCC_PCIE_1_CFG_AHB_CLK>,
@@ -1914,7 +1914,7 @@ pcie1_lane: lanes@1c0e200 {
clock-names = "pipe0";
#phy-cells = <0>;
- #clock-cells = <1>;
+ #clock-cells = <0>;
clock-output-names = "pcie_1_pipe_clk";
};
};
@@ -3467,7 +3467,7 @@ aoss_reset: reset-controller@c2a0000 {
};
aoss_qmp: power-controller@c300000 {
- compatible = "qcom,sc7280-aoss-qmp";
+ compatible = "qcom,sc7280-aoss-qmp", "qcom,aoss-qmp";
reg = <0 0x0c300000 0 0x400>;
interrupts-extended = <&ipcc IPCC_CLIENT_AOP
IPCC_MPROC_SIGNAL_GLINK_QMP
@@ -4395,9 +4395,9 @@ watchdog@17c10000 {
};
timer@17c20000 {
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
compatible = "arm,armv7-timer-mem";
reg = <0 0x17c20000 0 0x1000>;
@@ -4405,49 +4405,49 @@ frame@17c21000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c21000 0 0x1000>,
- <0 0x17c22000 0 0x1000>;
+ reg = <0x17c21000 0x1000>,
+ <0x17c22000 0x1000>;
};
frame@17c23000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c23000 0 0x1000>;
+ reg = <0x17c23000 0x1000>;
status = "disabled";
};
frame@17c25000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c25000 0 0x1000>;
+ reg = <0x17c25000 0x1000>;
status = "disabled";
};
frame@17c27000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c27000 0 0x1000>;
+ reg = <0x17c27000 0x1000>;
status = "disabled";
};
frame@17c29000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c29000 0 0x1000>;
+ reg = <0x17c29000 0x1000>;
status = "disabled";
};
frame@17c2b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c2b000 0 0x1000>;
+ reg = <0x17c2b000 0x1000>;
status = "disabled";
};
frame@17c2d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17c2d000 0 0x1000>;
+ reg = <0x17c2d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sdm630.dtsi b/arch/arm64/boot/dts/qcom/sdm630.dtsi
index 240293592ef9..a33638a1cb44 100644
--- a/arch/arm64/boot/dts/qcom/sdm630.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm630.dtsi
@@ -8,6 +8,7 @@
#include <dt-bindings/clock/qcom,gpucc-sdm660.h>
#include <dt-bindings/clock/qcom,mmcc-sdm660.h>
#include <dt-bindings/clock/qcom,rpmcc.h>
+#include <dt-bindings/interconnect/qcom,sdm660.h>
#include <dt-bindings/power/qcom-rpmpd.h>
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
@@ -1045,11 +1046,13 @@ adreno_gpu: gpu@5000000 {
nvmem-cells = <&gpu_speed_bin>;
nvmem-cell-names = "speed_bin";
- interconnects = <&gnoc 1 &bimc 5>;
+ interconnects = <&bimc MASTER_OXILI &bimc SLAVE_EBI>;
interconnect-names = "gfx-mem";
operating-points-v2 = <&gpu_sdm630_opp_table>;
+ status = "disabled";
+
gpu_sdm630_opp_table: opp-table {
compatible = "operating-points-v2";
opp-775000000 {
@@ -1260,7 +1263,7 @@ qusb2phy: phy@c012000 {
#phy-cells = <0>;
clocks = <&gcc GCC_USB_PHY_CFG_AHB2PHY_CLK>,
- <&gcc GCC_RX1_USB2_CLKREF_CLK>;
+ <&gcc GCC_RX0_USB2_CLKREF_CLK>;
clock-names = "cfg_ahb", "ref";
resets = <&gcc GCC_QUSB2PHY_PRIM_BCR>;
diff --git a/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts b/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts
index b96da53f2f1e..58f687fc49e0 100644
--- a/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts
+++ b/arch/arm64/boot/dts/qcom/sdm636-sony-xperia-ganges-mermaid.dts
@@ -19,7 +19,7 @@ / {
};
&sdc2_state_on {
- pinconf-clk {
+ clk {
drive-strength = <14>;
};
};
diff --git a/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts b/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts
index 8a0d94e7f598..2f5e12deaada 100644
--- a/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts
+++ b/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama-akatsuki.dts
@@ -19,8 +19,9 @@ &vreg_l14a_1p8 {
};
&vreg_l22a_2p8 {
- regulator-min-microvolt = <2700000>;
- regulator-max-microvolt = <2700000>;
+ /* Note: Round-down from 2700000 to be a multiple of PLDO step-size 8000 */
+ regulator-min-microvolt = <2696000>;
+ regulator-max-microvolt = <2696000>;
};
&vreg_l28a_2p8 {
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index ad21cf465c98..cd0029ff8246 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -4942,9 +4942,9 @@ slimbam: dma-controller@17184000 {
};
timer@17c90000 {
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
compatible = "arm,armv7-timer-mem";
reg = <0 0x17c90000 0 0x1000>;
@@ -4952,49 +4952,49 @@ frame@17ca0000 {
frame-number = <0>;
interrupts = <GIC_SPI 7 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17ca0000 0 0x1000>,
- <0 0x17cb0000 0 0x1000>;
+ reg = <0x17ca0000 0x1000>,
+ <0x17cb0000 0x1000>;
};
frame@17cc0000 {
frame-number = <1>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17cc0000 0 0x1000>;
+ reg = <0x17cc0000 0x1000>;
status = "disabled";
};
frame@17cd0000 {
frame-number = <2>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17cd0000 0 0x1000>;
+ reg = <0x17cd0000 0x1000>;
status = "disabled";
};
frame@17ce0000 {
frame-number = <3>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17ce0000 0 0x1000>;
+ reg = <0x17ce0000 0x1000>;
status = "disabled";
};
frame@17cf0000 {
frame-number = <4>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17cf0000 0 0x1000>;
+ reg = <0x17cf0000 0x1000>;
status = "disabled";
};
frame@17d00000 {
frame-number = <5>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17d00000 0 0x1000>;
+ reg = <0x17d00000 0x1000>;
status = "disabled";
};
frame@17d10000 {
frame-number = <6>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0 0x17d10000 0 0x1000>;
+ reg = <0x17d10000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts b/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts
index 871ccbba445b..038970c0b68e 100644
--- a/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts
+++ b/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts
@@ -88,11 +88,19 @@ &hsusb_phy1 {
status = "okay";
};
-&sdc2_state_off {
+&sdc2_off_state {
sd-cd {
pins = "gpio98";
+ drive-strength = <2>;
bias-disable;
+ };
+};
+
+&sdc2_on_state {
+ sd-cd {
+ pins = "gpio98";
drive-strength = <2>;
+ bias-pull-up;
};
};
@@ -102,32 +110,6 @@ &sdhc_1 {
&tlmm {
gpio-reserved-ranges = <22 2>, <28 6>;
-
- sdc2_state_on: sdc2-on {
- clk {
- pins = "sdc2_clk";
- bias-disable;
- drive-strength = <16>;
- };
-
- cmd {
- pins = "sdc2_cmd";
- bias-pull-up;
- drive-strength = <10>;
- };
-
- data {
- pins = "sdc2_data";
- bias-pull-up;
- drive-strength = <10>;
- };
-
- sd-cd {
- pins = "gpio98";
- bias-pull-up;
- drive-strength = <2>;
- };
- };
};
&usb3 {
diff --git a/arch/arm64/boot/dts/qcom/sm6125.dtsi b/arch/arm64/boot/dts/qcom/sm6125.dtsi
index e81b2a7794fb..e601b9bfdc04 100644
--- a/arch/arm64/boot/dts/qcom/sm6125.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm6125.dtsi
@@ -386,23 +386,43 @@ tlmm: pinctrl@500000 {
interrupt-controller;
#interrupt-cells = <2>;
- sdc2_state_off: sdc2-off {
+ sdc2_off_state: sdc2-off-state {
clk {
pins = "sdc2_clk";
- bias-disable;
drive-strength = <2>;
+ bias-disable;
};
cmd {
pins = "sdc2_cmd";
+ drive-strength = <2>;
bias-pull-up;
+ };
+
+ data {
+ pins = "sdc2_data";
drive-strength = <2>;
+ bias-pull-up;
+ };
+ };
+
+ sdc2_on_state: sdc2-on-state {
+ clk {
+ pins = "sdc2_clk";
+ drive-strength = <16>;
+ bias-disable;
+ };
+
+ cmd {
+ pins = "sdc2_cmd";
+ drive-strength = <10>;
+ bias-pull-up;
};
data {
pins = "sdc2_data";
+ drive-strength = <10>;
bias-pull-up;
- drive-strength = <2>;
};
};
};
@@ -470,8 +490,8 @@ sdhc_2: sdhci@4784000 {
<&xo_board>;
clock-names = "iface", "core", "xo";
- pinctrl-0 = <&sdc2_state_on>;
- pinctrl-1 = <&sdc2_state_off>;
+ pinctrl-0 = <&sdc2_on_state>;
+ pinctrl-1 = <&sdc2_off_state>;
pinctrl-names = "default", "sleep";
power-domains = <&rpmpd SM6125_VDDCX>;
diff --git a/arch/arm64/boot/dts/qcom/sm6350.dtsi b/arch/arm64/boot/dts/qcom/sm6350.dtsi
index d7c9edff19f7..c31fe27a46f2 100644
--- a/arch/arm64/boot/dts/qcom/sm6350.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm6350.dtsi
@@ -1090,57 +1090,57 @@ timer@17c20000 {
compatible = "arm,armv7-timer-mem";
reg = <0x0 0x17c20000 0x0 0x1000>;
clock-frequency = <19200000>;
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
frame@17c21000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c21000 0x0 0x1000>,
- <0x0 0x17c22000 0x0 0x1000>;
+ reg = <0x17c21000 0x1000>,
+ <0x17c22000 0x1000>;
};
frame@17c23000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c23000 0x0 0x1000>;
+ reg = <0x17c23000 0x1000>;
status = "disabled";
};
frame@17c25000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c25000 0x0 0x1000>;
+ reg = <0x17c25000 0x1000>;
status = "disabled";
};
frame@17c27000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c27000 0x0 0x1000>;
+ reg = <0x17c27000 0x1000>;
status = "disabled";
};
frame@17c29000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c29000 0x0 0x1000>;
+ reg = <0x17c29000 0x1000>;
status = "disabled";
};
frame@17c2b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2b000 0x0 0x1000>;
+ reg = <0x17c2b000 0x1000>;
status = "disabled";
};
frame@17c2d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2d000 0x0 0x1000>;
+ reg = <0x17c2d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index 15f3bf2e7ea0..6edb3986a473 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -3382,7 +3382,7 @@ camnoc_virt: interconnect@ac00000 {
};
aoss_qmp: power-controller@c300000 {
- compatible = "qcom,sm8150-aoss-qmp";
+ compatible = "qcom,sm8150-aoss-qmp", "qcom,aoss-qmp";
reg = <0x0 0x0c300000 0x0 0x400>;
interrupts = <GIC_SPI 389 IRQ_TYPE_EDGE_RISING>;
mboxes = <&apss_shared 0>;
@@ -3608,9 +3608,9 @@ watchdog@17c10000 {
};
timer@17c20000 {
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
compatible = "arm,armv7-timer-mem";
reg = <0x0 0x17c20000 0x0 0x1000>;
clock-frequency = <19200000>;
@@ -3619,49 +3619,49 @@ frame@17c21000{
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c21000 0x0 0x1000>,
- <0x0 0x17c22000 0x0 0x1000>;
+ reg = <0x17c21000 0x1000>,
+ <0x17c22000 0x1000>;
};
frame@17c23000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c23000 0x0 0x1000>;
+ reg = <0x17c23000 0x1000>;
status = "disabled";
};
frame@17c25000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c25000 0x0 0x1000>;
+ reg = <0x17c25000 0x1000>;
status = "disabled";
};
frame@17c27000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c26000 0x0 0x1000>;
+ reg = <0x17c26000 0x1000>;
status = "disabled";
};
frame@17c29000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c29000 0x0 0x1000>;
+ reg = <0x17c29000 0x1000>;
status = "disabled";
};
frame@17c2b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2b000 0x0 0x1000>;
+ reg = <0x17c2b000 0x1000>;
status = "disabled";
};
frame@17c2d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2d000 0x0 0x1000>;
+ reg = <0x17c2d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index 1304b86af1a0..9afd8b3da3a1 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -1883,6 +1883,8 @@ pcie0_lane: phy@1c06200 {
clock-names = "pipe0";
#phy-cells = <0>;
+
+ #clock-cells = <0>;
clock-output-names = "pcie_0_pipe_clk";
};
};
@@ -1989,6 +1991,8 @@ pcie1_lane: phy@1c0e200 {
clock-names = "pipe0";
#phy-cells = <0>;
+
+ #clock-cells = <0>;
clock-output-names = "pcie_1_pipe_clk";
};
};
@@ -2095,6 +2099,8 @@ pcie2_lane: phy@1c16200 {
clock-names = "pipe0";
#phy-cells = <0>;
+
+ #clock-cells = <0>;
clock-output-names = "pcie_2_pipe_clk";
};
};
@@ -3475,7 +3481,7 @@ tsens1: thermal-sensor@c265000 {
};
aoss_qmp: power-controller@c300000 {
- compatible = "qcom,sm8250-aoss-qmp";
+ compatible = "qcom,sm8250-aoss-qmp", "qcom,aoss-qmp";
reg = <0 0x0c300000 0 0x400>;
interrupts-extended = <&ipcc IPCC_CLIENT_AOP
IPCC_MPROC_SIGNAL_GLINK_QMP
@@ -4528,9 +4534,9 @@ watchdog@17c10000 {
};
timer@17c20000 {
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
compatible = "arm,armv7-timer-mem";
reg = <0x0 0x17c20000 0x0 0x1000>;
clock-frequency = <19200000>;
@@ -4539,49 +4545,49 @@ frame@17c21000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c21000 0x0 0x1000>,
- <0x0 0x17c22000 0x0 0x1000>;
+ reg = <0x17c21000 0x1000>,
+ <0x17c22000 0x1000>;
};
frame@17c23000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c23000 0x0 0x1000>;
+ reg = <0x17c23000 0x1000>;
status = "disabled";
};
frame@17c25000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c25000 0x0 0x1000>;
+ reg = <0x17c25000 0x1000>;
status = "disabled";
};
frame@17c27000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c27000 0x0 0x1000>;
+ reg = <0x17c27000 0x1000>;
status = "disabled";
};
frame@17c29000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c29000 0x0 0x1000>;
+ reg = <0x17c29000 0x1000>;
status = "disabled";
};
frame@17c2b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2b000 0x0 0x1000>;
+ reg = <0x17c2b000 0x1000>;
status = "disabled";
};
frame@17c2d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2d000 0x0 0x1000>;
+ reg = <0x17c2d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi b/arch/arm64/boot/dts/qcom/sm8350.dtsi
index 20f850b94158..ee6c202ab68c 100644
--- a/arch/arm64/boot/dts/qcom/sm8350.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi
@@ -1537,7 +1537,7 @@ tsens1: thermal-sensor@c265000 {
};
aoss_qmp: power-controller@c300000 {
- compatible = "qcom,sm8350-aoss-qmp";
+ compatible = "qcom,sm8350-aoss-qmp", "qcom,aoss-qmp";
reg = <0 0x0c300000 0 0x400>;
interrupts-extended = <&ipcc IPCC_CLIENT_AOP IPCC_MPROC_SIGNAL_GLINK_QMP
IRQ_TYPE_EDGE_RISING>;
@@ -1752,9 +1752,9 @@ intc: interrupt-controller@17a00000 {
timer@17c20000 {
compatible = "arm,armv7-timer-mem";
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
reg = <0x0 0x17c20000 0x0 0x1000>;
clock-frequency = <19200000>;
@@ -1762,49 +1762,49 @@ frame@17c21000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c21000 0x0 0x1000>,
- <0x0 0x17c22000 0x0 0x1000>;
+ reg = <0x17c21000 0x1000>,
+ <0x17c22000 0x1000>;
};
frame@17c23000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c23000 0x0 0x1000>;
+ reg = <0x17c23000 0x1000>;
status = "disabled";
};
frame@17c25000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c25000 0x0 0x1000>;
+ reg = <0x17c25000 0x1000>;
status = "disabled";
};
frame@17c27000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c27000 0x0 0x1000>;
+ reg = <0x17c27000 0x1000>;
status = "disabled";
};
frame@17c29000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c29000 0x0 0x1000>;
+ reg = <0x17c29000 0x1000>;
status = "disabled";
};
frame@17c2b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2b000 0x0 0x1000>;
+ reg = <0x17c2b000 0x1000>;
status = "disabled";
};
frame@17c2d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17c2d000 0x0 0x1000>;
+ reg = <0x17c2d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 7a14eb89e4ca..e5fe694b7be6 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -1203,9 +1203,9 @@ intc: interrupt-controller@17100000 {
timer@17420000 {
compatible = "arm,armv7-timer-mem";
- #address-cells = <2>;
- #size-cells = <2>;
- ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0 0 0x20000000>;
reg = <0x0 0x17420000 0x0 0x1000>;
clock-frequency = <19200000>;
@@ -1213,49 +1213,49 @@ frame@17421000 {
frame-number = <0>;
interrupts = <GIC_SPI 8 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 6 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17421000 0x0 0x1000>,
- <0x0 0x17422000 0x0 0x1000>;
+ reg = <0x17421000 0x1000>,
+ <0x17422000 0x1000>;
};
frame@17423000 {
frame-number = <1>;
interrupts = <GIC_SPI 9 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17423000 0x0 0x1000>;
+ reg = <0x17423000 0x1000>;
status = "disabled";
};
frame@17425000 {
frame-number = <2>;
interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17425000 0x0 0x1000>;
+ reg = <0x17425000 0x1000>;
status = "disabled";
};
frame@17427000 {
frame-number = <3>;
interrupts = <GIC_SPI 11 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17427000 0x0 0x1000>;
+ reg = <0x17427000 0x1000>;
status = "disabled";
};
frame@17429000 {
frame-number = <4>;
interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x17429000 0x0 0x1000>;
+ reg = <0x17429000 0x1000>;
status = "disabled";
};
frame@1742b000 {
frame-number = <5>;
interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x1742b000 0x0 0x1000>;
+ reg = <0x1742b000 0x1000>;
status = "disabled";
};
frame@1742d000 {
frame-number = <6>;
interrupts = <GIC_SPI 14 IRQ_TYPE_LEVEL_HIGH>;
- reg = <0x0 0x1742d000 0x0 0x1000>;
+ reg = <0x1742d000 0x1000>;
status = "disabled";
};
};
diff --git a/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi b/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi
index 5ad6cd1864c1..c289d2bb5122 100644
--- a/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi
+++ b/arch/arm64/boot/dts/renesas/beacon-renesom-baseboard.dtsi
@@ -146,7 +146,7 @@ rgb_panel: endpoint {
};
};
- reg_audio: regulator_audio {
+ reg_audio: regulator-audio {
compatible = "regulator-fixed";
regulator-name = "audio-1.8V";
regulator-min-microvolt = <1800000>;
@@ -174,7 +174,7 @@ reg_lcd_reset: regulator-lcd-reset {
vin-supply = <®_lcd>;
};
- reg_cam0: regulator_camera {
+ reg_cam0: regulator-cam0 {
compatible = "regulator-fixed";
regulator-name = "reg_cam0";
regulator-min-microvolt = <1800000>;
@@ -183,7 +183,7 @@ reg_cam0: regulator_camera {
enable-active-high;
};
- reg_cam1: regulator_camera {
+ reg_cam1: regulator-cam1 {
compatible = "regulator-fixed";
regulator-name = "reg_cam1";
regulator-min-microvolt = <1800000>;
diff --git a/arch/arm64/boot/dts/renesas/r8a774c0.dtsi b/arch/arm64/boot/dts/renesas/r8a774c0.dtsi
index e123c8d1bab9..337efbeda5a9 100644
--- a/arch/arm64/boot/dts/renesas/r8a774c0.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a774c0.dtsi
@@ -1956,7 +1956,7 @@ thermal-zones {
cpu-thermal {
polling-delay-passive = <250>;
polling-delay = <0>;
- thermal-sensors = <&thermal 0>;
+ thermal-sensors = <&thermal>;
sustainable-power = <717>;
cooling-maps {
diff --git a/arch/arm64/boot/dts/renesas/r8a77990.dtsi b/arch/arm64/boot/dts/renesas/r8a77990.dtsi
index 7e0f1aab2135..155589f67e1b 100644
--- a/arch/arm64/boot/dts/renesas/r8a77990.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a77990.dtsi
@@ -2117,7 +2117,7 @@ thermal-zones {
cpu-thermal {
polling-delay-passive = <250>;
polling-delay = <0>;
- thermal-sensors = <&thermal 0>;
+ thermal-sensors = <&thermal>;
sustainable-power = <717>;
cooling-maps {
diff --git a/arch/arm64/boot/dts/renesas/r8a779m8.dtsi b/arch/arm64/boot/dts/renesas/r8a779m8.dtsi
index 752440b0c40f..750bd8ccdb7f 100644
--- a/arch/arm64/boot/dts/renesas/r8a779m8.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a779m8.dtsi
@@ -10,3 +10,8 @@
/ {
compatible = "renesas,r8a779m8", "renesas,r8a7795";
};
+
+&cluster0_opp {
+ /delete-node/ opp-1600000000;
+ /delete-node/ opp-1700000000;
+};
diff --git a/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts b/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts
index fc334b4c2aa4..d2848cbfc6c0 100644
--- a/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts
+++ b/arch/arm64/boot/dts/renesas/r9a07g054l2-smarc.dts
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
/*
- * Device Tree Source for the RZ/G2L SMARC EVK board
+ * Device Tree Source for the RZ/V2L SMARC EVK board
*
* Copyright (C) 2021 Renesas Electronics Corp.
*/
diff --git a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
index be97da132258..ba75adedbf79 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
@@ -599,8 +599,8 @@ usb0: usb@65a00000 {
compatible = "socionext,uniphier-dwc3", "snps,dwc3";
status = "disabled";
reg = <0x65a00000 0xcd00>;
- interrupt-names = "host", "peripheral";
- interrupts = <0 134 4>, <0 135 4>;
+ interrupt-names = "dwc_usb3";
+ interrupts = <0 134 4>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb0>, <&pinctrl_usb2>;
clock-names = "ref", "bus_early", "suspend";
@@ -701,8 +701,8 @@ usb1: usb@65c00000 {
compatible = "socionext,uniphier-dwc3", "snps,dwc3";
status = "disabled";
reg = <0x65c00000 0xcd00>;
- interrupt-names = "host", "peripheral";
- interrupts = <0 137 4>, <0 138 4>;
+ interrupt-names = "dwc_usb3";
+ interrupts = <0 137 4>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_usb1>, <&pinctrl_usb3>;
clock-names = "ref", "bus_early", "suspend";
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2a965aa0188d..bdca34283c06 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -59,6 +59,7 @@ config CRYPTO_GHASH_ARM64_CE
select CRYPTO_HASH
select CRYPTO_GF128MUL
select CRYPTO_LIB_AES
+ select CRYPTO_AEAD
config CRYPTO_CRCT10DIF_ARM64_CE
tristate "CRCT10DIF digest algorithm using PMULL instructions"
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index d52a0b269ee8..65c2201b11b2 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -133,7 +133,8 @@
#define ESR_ELx_CV (UL(1) << 24)
#define ESR_ELx_COND_SHIFT (20)
#define ESR_ELx_COND_MASK (UL(0xF) << ESR_ELx_COND_SHIFT)
-#define ESR_ELx_WFx_ISS_TI (UL(1) << 0)
+#define ESR_ELx_WFx_ISS_TI (UL(3) << 0)
+#define ESR_ELx_WFx_ISS_WFxT (UL(2) << 0)
#define ESR_ELx_WFx_ISS_WFI (UL(0) << 0)
#define ESR_ELx_WFx_ISS_WFE (UL(1) << 0)
#define ESR_ELx_xVC_IMM_MASK ((1UL << 16) - 1)
@@ -146,7 +147,8 @@
#define DISR_EL1_ESR_MASK (ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
/* ESR value templates for specific events */
-#define ESR_ELx_WFx_MASK (ESR_ELx_EC_MASK | ESR_ELx_WFx_ISS_TI)
+#define ESR_ELx_WFx_MASK (ESR_ELx_EC_MASK | \
+ (ESR_ELx_WFx_ISS_TI & ~ESR_ELx_WFx_ISS_WFxT))
#define ESR_ELx_WFx_WFI_VAL ((ESR_ELx_EC_WFx << ESR_ELx_EC_SHIFT) | \
ESR_ELx_WFx_ISS_WFI)
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9839bfc163d7..78d272b26ebd 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -115,7 +115,9 @@ extern const struct kexec_file_ops kexec_image_ops;
struct kimage;
-extern int arch_kimage_file_post_load_cleanup(struct kimage *image);
+int arch_kimage_file_post_load_cleanup(struct kimage *image);
+#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
+
extern int load_other_segments(struct kimage *image,
unsigned long kernel_load_addr, unsigned long kernel_size,
char *initrd, unsigned long initrd_len,
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 6b1a12c23fe7..7058bf047fa5 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -250,8 +250,9 @@ void tls_preserve_current_state(void);
static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
{
+ s32 previous_syscall = regs->syscallno;
memset(regs, 0, sizeof(*regs));
- forget_syscall(regs);
+ regs->syscallno = previous_syscall;
regs->pc = pc;
if (system_uses_irq_prio_masking())
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 6875a16b09d2..fb0e7c7b2e20 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -59,6 +59,7 @@ struct insn_emulation {
static LIST_HEAD(insn_emulation);
static int nr_insn_emulated __initdata;
static DEFINE_RAW_SPINLOCK(insn_emulation_lock);
+static DEFINE_MUTEX(insn_emulation_mutex);
static void register_emulation_hooks(struct insn_emulation_ops *ops)
{
@@ -207,10 +208,10 @@ static int emulation_proc_handler(struct ctl_table *table, int write,
loff_t *ppos)
{
int ret = 0;
- struct insn_emulation *insn = (struct insn_emulation *) table->data;
+ struct insn_emulation *insn = container_of(table->data, struct insn_emulation, current_mode);
enum insn_emulation_mode prev_mode = insn->current_mode;
- table->data = &insn->current_mode;
+ mutex_lock(&insn_emulation_mutex);
ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
if (ret || !write || prev_mode == insn->current_mode)
@@ -223,7 +224,7 @@ static int emulation_proc_handler(struct ctl_table *table, int write,
update_insn_emulation_mode(insn, INSN_UNDEF);
}
ret:
- table->data = insn;
+ mutex_unlock(&insn_emulation_mutex);
return ret;
}
@@ -247,7 +248,7 @@ static void __init register_insn_emulation_sysctl(void)
sysctl->maxlen = sizeof(int);
sysctl->procname = insn->ops->name;
- sysctl->data = insn;
+ sysctl->data = &insn->current_mode;
sysctl->extra1 = &insn->min;
sysctl->extra2 = &insn->max;
sysctl->proc_handler = emulation_proc_handler;
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index a0f3d0aaa3c5..27baa41c46ca 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -395,6 +395,14 @@ static struct midr_range trbe_write_out_of_range_cpus[] = {
};
#endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */
+#ifdef CONFIG_ARM64_ERRATUM_1742098
+static struct midr_range broken_aarch32_aes[] = {
+ MIDR_RANGE(MIDR_CORTEX_A57, 0, 1, 0xf, 0xf),
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
+ {},
+};
+#endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */
+
const struct arm64_cpu_capabilities arm64_errata[] = {
#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
{
@@ -657,6 +665,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
/* Cortex-A510 r0p0 - r0p1 */
ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 1)
},
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_1742098
+ {
+ .desc = "ARM erratum 1742098",
+ .capability = ARM64_WORKAROUND_1742098,
+ CAP_MIDR_RANGE_LIST(broken_aarch32_aes),
+ .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
+ },
#endif
{
}
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 2cb9cc9e0eff..a54f34aa36e0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -79,6 +79,7 @@
#include <asm/cpufeature.h>
#include <asm/cpu_ops.h>
#include <asm/fpsimd.h>
+#include <asm/hwcap.h>
#include <asm/insn.h>
#include <asm/kvm_host.h>
#include <asm/mmu_context.h>
@@ -540,7 +541,7 @@ static const struct arm64_ftr_bits ftr_id_pfr2[] = {
static const struct arm64_ftr_bits ftr_id_dfr0[] = {
/* [31:28] TraceFilt */
- S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_PERFMON_SHIFT, 4, 0xf),
+ S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_DFR0_PERFMON_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MPROFDBG_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MMAPTRC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_COPTRC_SHIFT, 4, 0),
@@ -1921,6 +1922,14 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
}
#endif /* CONFIG_ARM64_MTE */
+static void elf_hwcap_fixup(void)
+{
+#ifdef CONFIG_ARM64_ERRATUM_1742098
+ if (cpus_have_const_cap(ARM64_WORKAROUND_1742098))
+ compat_elf_hwcap2 &= ~COMPAT_HWCAP2_AES;
+#endif /* ARM64_ERRATUM_1742098 */
+}
+
#ifdef CONFIG_KVM
static bool is_kvm_protected_mode(const struct arm64_cpu_capabilities *entry, int __unused)
{
@@ -3033,8 +3042,10 @@ void __init setup_cpu_features(void)
setup_system_capabilities();
setup_elf_hwcaps(arm64_elf_hwcaps);
- if (system_supports_32bit_el0())
+ if (system_supports_32bit_el0()) {
setup_elf_hwcaps(compat_elf_hwcaps);
+ elf_hwcap_fixup();
+ }
if (system_uses_ttbr0_pan())
pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
@@ -3086,6 +3097,7 @@ static int enable_mismatched_32bit_el0(unsigned int cpu)
cpu_active_mask);
get_cpu_device(lucky_winner)->offline_disabled = true;
setup_elf_hwcaps(compat_elf_hwcaps);
+ elf_hwcap_fixup();
pr_info("Asymmetric 32-bit EL0 support detected on CPU %u; CPU hot-unplug disabled on CPU %u\n",
cpu, lucky_winner);
return 0;
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 6328308be272..7754ef328657 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -300,11 +300,6 @@ static void swsusp_mte_restore_tags(void)
unsigned long pfn = xa_state.xa_index;
struct page *page = pfn_to_online_page(pfn);
- /*
- * It is not required to invoke page_kasan_tag_reset(page)
- * at this point since the tags stored in page->flags are
- * already restored.
- */
mte_restore_page_tags(page_address(page), tags);
mte_free_tag_storage(tags);
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index d502703e8373..d565ae25e48f 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -47,15 +47,6 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
if (!pte_is_tagged)
return;
- page_kasan_tag_reset(page);
- /*
- * We need smp_wmb() in between setting the flags and clearing the
- * tags because if another thread reads page->flags and builds a
- * tagged address out of it, there is an actual dependency to the
- * memory access, but on the current thread we do not guarantee that
- * the new page->flags are visible before the tags were updated.
- */
- smp_wmb();
mte_clear_page_tags(page_address(page));
}
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 6410d21d8695..858e5be48791 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -371,5 +371,5 @@ void __noreturn hyp_panic(void)
asmlinkage void kvm_unexpected_el2_exception(void)
{
- return __kvm_unexpected_el2_exception();
+ __kvm_unexpected_el2_exception();
}
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 262dfe03134d..51ea9ac71210 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -240,5 +240,5 @@ void __noreturn hyp_panic(void)
asmlinkage void kvm_unexpected_el2_exception(void)
{
- return __kvm_unexpected_el2_exception();
+ __kvm_unexpected_el2_exception();
}
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 0dea80bf6de4..24913271e898 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -23,15 +23,6 @@ void copy_highpage(struct page *to, struct page *from)
if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
set_bit(PG_mte_tagged, &to->flags);
- page_kasan_tag_reset(to);
- /*
- * We need smp_wmb() in between setting the flags and clearing the
- * tags because if another thread reads page->flags and builds a
- * tagged address out of it, there is an actual dependency to the
- * memory access, but on the current thread we do not guarantee that
- * the new page->flags are visible before the tags were updated.
- */
- smp_wmb();
mte_copy_page_tags(kto, kfrom);
}
}
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
index a9e50e930484..4334dec93bd4 100644
--- a/arch/arm64/mm/mteswap.c
+++ b/arch/arm64/mm/mteswap.c
@@ -53,15 +53,6 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page)
if (!tags)
return false;
- page_kasan_tag_reset(page);
- /*
- * We need smp_wmb() in between setting the flags and clearing the
- * tags because if another thread reads page->flags and builds a
- * tagged address out of it, there is an actual dependency to the
- * memory access, but on the current thread we do not guarantee that
- * the new page->flags are visible before the tags were updated.
- */
- smp_wmb();
mte_restore_page_tags(page_address(page), tags);
return true;
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 3ed418f70e3b..8cd6088f8875 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -58,6 +58,7 @@ WORKAROUND_1418040
WORKAROUND_1463225
WORKAROUND_1508412
WORKAROUND_1542419
+WORKAROUND_1742098
WORKAROUND_1902691
WORKAROUND_2038923
WORKAROUND_2064142
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 7cbce290f4e5..757c2f6d8d4b 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -538,7 +538,7 @@ ia64_get_irr(unsigned int vector)
{
unsigned int reg = vector / 64;
unsigned int bit = vector % 64;
- u64 irr;
+ unsigned long irr;
switch (reg) {
case 0: irr = ia64_getreg(_IA64_REG_CR_IRR0); break;
diff --git a/arch/ia64/kernel/palinfo.c b/arch/ia64/kernel/palinfo.c
index 64189f04c1a4..b9ae093bfe37 100644
--- a/arch/ia64/kernel/palinfo.c
+++ b/arch/ia64/kernel/palinfo.c
@@ -120,7 +120,7 @@ static const char *mem_attrib[]={
* Input:
* - a pointer to a buffer to hold the string
* - a 64-bit vector
- * Ouput:
+ * Output:
* - a pointer to the end of the buffer
*
*/
diff --git a/arch/ia64/kernel/traps.c b/arch/ia64/kernel/traps.c
index 753642366e12..53735b1d1be3 100644
--- a/arch/ia64/kernel/traps.c
+++ b/arch/ia64/kernel/traps.c
@@ -309,7 +309,7 @@ handle_fpu_swa (int fp_fault, struct pt_regs *regs, unsigned long isr)
/*
* Lower 4 bits are used as a count. Upper bits are a sequence
* number that is updated when count is reset. The cmpxchg will
- * fail is seqno has changed. This minimizes mutiple cpus
+ * fail is seqno has changed. This minimizes multiple cpus
* resetting the count.
*/
if (current_jiffies > last.time)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 5d165607bf35..7ae1244ed8ec 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -451,7 +451,7 @@ mem_init (void)
memblock_free_all();
/*
- * For fsyscall entrpoints with no light-weight handler, use the ordinary
+ * For fsyscall entrypoints with no light-weight handler, use the ordinary
* (heavy-weight) handler, but mark it by setting bit 0, so the fsyscall entry
* code can tell them apart.
*/
diff --git a/arch/ia64/mm/tlb.c b/arch/ia64/mm/tlb.c
index 135b5135cace..ca060e7a2a46 100644
--- a/arch/ia64/mm/tlb.c
+++ b/arch/ia64/mm/tlb.c
@@ -174,7 +174,7 @@ __setup("nptcg=", set_nptcg);
* override table (in which case we should ignore the value from
* PAL_VM_SUMMARY).
*
- * Kernel parameter "nptcg=" overrides maximum number of simultanesous ptc.g
+ * Kernel parameter "nptcg=" overrides maximum number of simultaneous ptc.g
* purges defined in either PAL_VM_SUMMARY or PAL override table. In this case,
* we should ignore the value from either PAL_VM_SUMMARY or PAL override table.
*
@@ -516,7 +516,7 @@ int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size)
if (i >= per_cpu(ia64_tr_num, cpu))
return -EBUSY;
- /*Record tr info for mca hander use!*/
+ /*Record tr info for mca handler use!*/
if (i > per_cpu(ia64_tr_used, cpu))
per_cpu(ia64_tr_used, cpu) = i;
diff --git a/arch/mips/kernel/proc.c b/arch/mips/kernel/proc.c
index bb43bf850314..8eba5a1ed664 100644
--- a/arch/mips/kernel/proc.c
+++ b/arch/mips/kernel/proc.c
@@ -311,7 +311,7 @@ static void *c_start(struct seq_file *m, loff_t *pos)
{
unsigned long i = *pos;
- return i < NR_CPUS ? (void *) (i + 1) : NULL;
+ return i < nr_cpu_ids ? (void *) (i + 1) : NULL;
}
static void *c_next(struct seq_file *m, void *v, loff_t *pos)
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 3d0cf471f2fe..b2cc2c2dd4bf 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -159,7 +159,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
/* Map GIC user page. */
if (gic_size) {
gic_base = (unsigned long)mips_gic_base + MIPS_GIC_USER_OFS;
- gic_pfn = virt_to_phys((void *)gic_base) >> PAGE_SHIFT;
+ gic_pfn = PFN_DOWN(__pa(gic_base));
ret = io_remap_pfn_range(vma, base, gic_pfn, gic_size,
pgprot_noncached(vma->vm_page_prot));
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 69a533148efd..8f61e93c0c5b 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -196,7 +196,6 @@ void __init prom_init_numa_memory(void)
pr_info("CP0_PageGrain: CP0 5.1 (0x%x)\n", read_c0_pagegrain());
prom_meminit();
}
-EXPORT_SYMBOL(prom_init_numa_memory);
pg_data_t * __init arch_alloc_nodedata(int nid)
{
diff --git a/arch/mips/mm/physaddr.c b/arch/mips/mm/physaddr.c
index a1ced5e44951..f9b8c85e9843 100644
--- a/arch/mips/mm/physaddr.c
+++ b/arch/mips/mm/physaddr.c
@@ -5,6 +5,7 @@
#include <linux/mmdebug.h>
#include <linux/mm.h>
+#include <asm/addrspace.h>
#include <asm/sections.h>
#include <asm/io.h>
#include <asm/page.h>
@@ -12,15 +13,6 @@
static inline bool __debug_virt_addr_valid(unsigned long x)
{
- /* high_memory does not get immediately defined, and there
- * are early callers of __pa() against PAGE_OFFSET
- */
- if (!high_memory && x >= PAGE_OFFSET)
- return true;
-
- if (high_memory && x >= PAGE_OFFSET && x < (unsigned long)high_memory)
- return true;
-
/*
* MAX_DMA_ADDRESS is a virtual address that may not correspond to an
* actual physical address. Enough code relies on
@@ -30,7 +22,9 @@ static inline bool __debug_virt_addr_valid(unsigned long x)
if (x == MAX_DMA_ADDRESS)
return true;
- return false;
+ return x >= PAGE_OFFSET && (KSEGX(x) < KSEG2 ||
+ IS_ENABLED(CONFIG_EVA) ||
+ !IS_ENABLED(CONFIG_HIGHMEM));
}
phys_addr_t __virt_to_phys(volatile const void *x)
diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c
index a20c1c47b780..5930ad5a3ad3 100644
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -50,9 +50,6 @@ void flush_instruction_cache_local(void); /* flushes local code-cache only */
*/
DEFINE_SPINLOCK(pa_tlb_flush_lock);
-/* Swapper page setup lock. */
-DEFINE_SPINLOCK(pa_swapper_pg_lock);
-
#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
int pa_serialize_tlb_flushes __ro_after_init;
#endif
diff --git a/arch/parisc/kernel/drivers.c b/arch/parisc/kernel/drivers.c
index 776d624a7207..d126e78e101a 100644
--- a/arch/parisc/kernel/drivers.c
+++ b/arch/parisc/kernel/drivers.c
@@ -520,7 +520,6 @@ alloc_pa_dev(unsigned long hpa, struct hardware_path *mod_path)
dev->id.hversion_rev = iodc_data[1] & 0x0f;
dev->id.sversion = ((iodc_data[4] & 0x0f) << 16) |
(iodc_data[5] << 8) | iodc_data[6];
- dev->hpa.name = parisc_pathname(dev);
dev->hpa.start = hpa;
/* This is awkward. The STI spec says that gfx devices may occupy
* 32MB or 64MB. Unfortunately, we don't know how to tell whether
@@ -534,10 +533,10 @@ alloc_pa_dev(unsigned long hpa, struct hardware_path *mod_path)
dev->hpa.end = hpa + 0xfff;
}
dev->hpa.flags = IORESOURCE_MEM;
- name = parisc_hardware_description(&dev->id);
- if (name) {
- strlcpy(dev->name, name, sizeof(dev->name));
- }
+ dev->hpa.name = dev->name;
+ name = parisc_hardware_description(&dev->id) ? : "unknown";
+ snprintf(dev->name, sizeof(dev->name), "%s [%s]",
+ name, parisc_pathname(dev));
/* Silently fail things like mouse ports which are subsumed within
* the keyboard controller
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 68b46fe2f17c..8a99c998da9b 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -413,7 +413,7 @@
412 32 utimensat_time64 sys_utimensat sys_utimensat
413 32 pselect6_time64 sys_pselect6 compat_sys_pselect6_time64
414 32 ppoll_time64 sys_ppoll compat_sys_ppoll_time64
-416 32 io_pgetevents_time64 sys_io_pgetevents sys_io_pgetevents
+416 32 io_pgetevents_time64 sys_io_pgetevents compat_sys_io_pgetevents_time64
417 32 recvmmsg_time64 sys_recvmmsg compat_sys_recvmmsg_time64
418 32 mq_timedsend_time64 sys_mq_timedsend sys_mq_timedsend
419 32 mq_timedreceive_time64 sys_mq_timedreceive sys_mq_timedreceive
diff --git a/arch/powerpc/boot/cuboot-hotfoot.c b/arch/powerpc/boot/cuboot-hotfoot.c
index 888a6b9bfead..0e5532f855d6 100644
--- a/arch/powerpc/boot/cuboot-hotfoot.c
+++ b/arch/powerpc/boot/cuboot-hotfoot.c
@@ -70,7 +70,7 @@ static void hotfoot_fixups(void)
printf("Fixing devtree for 4M Flash\n");
- /* First fix up the base addresse */
+ /* First fix up the base address */
getprop(devp, "reg", regs, sizeof(regs));
regs[0] = 0;
regs[1] = 0xffc00000;
diff --git a/arch/powerpc/configs/44x/akebono_defconfig b/arch/powerpc/configs/44x/akebono_defconfig
index 4bc549c6edc5..fde4824f235e 100644
--- a/arch/powerpc/configs/44x/akebono_defconfig
+++ b/arch/powerpc/configs/44x/akebono_defconfig
@@ -118,7 +118,7 @@ CONFIG_CRAMFS=y
CONFIG_NLS_DEFAULT="n"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_XMON=y
diff --git a/arch/powerpc/configs/44x/currituck_defconfig b/arch/powerpc/configs/44x/currituck_defconfig
index 717827219921..7283b7d4a1a5 100644
--- a/arch/powerpc/configs/44x/currituck_defconfig
+++ b/arch/powerpc/configs/44x/currituck_defconfig
@@ -73,7 +73,7 @@ CONFIG_NFS_FS=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NLS_DEFAULT="n"
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_XMON=y
diff --git a/arch/powerpc/configs/44x/fsp2_defconfig b/arch/powerpc/configs/44x/fsp2_defconfig
index 8da316e61a08..3fdfbb29b854 100644
--- a/arch/powerpc/configs/44x/fsp2_defconfig
+++ b/arch/powerpc/configs/44x/fsp2_defconfig
@@ -110,7 +110,7 @@ CONFIG_XZ_DEC=y
CONFIG_PRINTK_TIME=y
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=3
CONFIG_DYNAMIC_DEBUG=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_CRYPTO_CBC=y
diff --git a/arch/powerpc/configs/44x/iss476-smp_defconfig b/arch/powerpc/configs/44x/iss476-smp_defconfig
index c11e777b2f3d..0f6380e1e612 100644
--- a/arch/powerpc/configs/44x/iss476-smp_defconfig
+++ b/arch/powerpc/configs/44x/iss476-smp_defconfig
@@ -56,7 +56,7 @@ CONFIG_PROC_KCORE=y
CONFIG_TMPFS=y
CONFIG_CRAMFS=y
# CONFIG_NETWORK_FILESYSTEMS is not set
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_PPC_EARLY_DEBUG=y
diff --git a/arch/powerpc/configs/44x/warp_defconfig b/arch/powerpc/configs/44x/warp_defconfig
index 47252c2d7669..20891c413149 100644
--- a/arch/powerpc/configs/44x/warp_defconfig
+++ b/arch/powerpc/configs/44x/warp_defconfig
@@ -88,7 +88,7 @@ CONFIG_NLS_UTF8=y
CONFIG_CRC_CCITT=y
CONFIG_CRC_T10DIF=y
CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_FS=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
diff --git a/arch/powerpc/configs/52xx/lite5200b_defconfig b/arch/powerpc/configs/52xx/lite5200b_defconfig
index 63368e677506..7db479dcbc0c 100644
--- a/arch/powerpc/configs/52xx/lite5200b_defconfig
+++ b/arch/powerpc/configs/52xx/lite5200b_defconfig
@@ -58,6 +58,6 @@ CONFIG_NFS_FS=y
CONFIG_NFS_V4=y
CONFIG_ROOT_NFS=y
CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_DEBUG_BUGVERBOSE is not set
diff --git a/arch/powerpc/configs/52xx/motionpro_defconfig b/arch/powerpc/configs/52xx/motionpro_defconfig
index 72762da94846..6186ead1e105 100644
--- a/arch/powerpc/configs/52xx/motionpro_defconfig
+++ b/arch/powerpc/configs/52xx/motionpro_defconfig
@@ -84,7 +84,7 @@ CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_CRYPTO_ECB=y
diff --git a/arch/powerpc/configs/52xx/tqm5200_defconfig b/arch/powerpc/configs/52xx/tqm5200_defconfig
index a3c8ca74032c..e6735b945327 100644
--- a/arch/powerpc/configs/52xx/tqm5200_defconfig
+++ b/arch/powerpc/configs/52xx/tqm5200_defconfig
@@ -85,7 +85,7 @@ CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_CRYPTO_ECB=y
diff --git a/arch/powerpc/configs/adder875_defconfig b/arch/powerpc/configs/adder875_defconfig
index 5326bc739279..7f35d5bc1229 100644
--- a/arch/powerpc/configs/adder875_defconfig
+++ b/arch/powerpc/configs/adder875_defconfig
@@ -45,7 +45,7 @@ CONFIG_CRAMFS=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
CONFIG_CRC32_SLICEBY4=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_FS=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
diff --git a/arch/powerpc/configs/ep8248e_defconfig b/arch/powerpc/configs/ep8248e_defconfig
index 00d69965f898..8df6d3a293e3 100644
--- a/arch/powerpc/configs/ep8248e_defconfig
+++ b/arch/powerpc/configs/ep8248e_defconfig
@@ -59,7 +59,7 @@ CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_UTF8=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_SCHED_DEBUG is not set
CONFIG_BDI_SWITCH=y
diff --git a/arch/powerpc/configs/ep88xc_defconfig b/arch/powerpc/configs/ep88xc_defconfig
index f5c3e72da719..a98ef6a4abef 100644
--- a/arch/powerpc/configs/ep88xc_defconfig
+++ b/arch/powerpc/configs/ep88xc_defconfig
@@ -48,6 +48,6 @@ CONFIG_CRAMFS=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
CONFIG_CRC32_SLICEBY4=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
diff --git a/arch/powerpc/configs/fsl-emb-nonhw.config b/arch/powerpc/configs/fsl-emb-nonhw.config
index df37efed0aec..f14c6dbd7346 100644
--- a/arch/powerpc/configs/fsl-emb-nonhw.config
+++ b/arch/powerpc/configs/fsl-emb-nonhw.config
@@ -24,7 +24,7 @@ CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_DEBUG_FS=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_DETECT_HUNG_TASK=y
diff --git a/arch/powerpc/configs/mgcoge_defconfig b/arch/powerpc/configs/mgcoge_defconfig
index dcc8dccf54f3..498d35db7833 100644
--- a/arch/powerpc/configs/mgcoge_defconfig
+++ b/arch/powerpc/configs/mgcoge_defconfig
@@ -73,7 +73,7 @@ CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_UTF8=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_FS=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_SCHED_DEBUG is not set
diff --git a/arch/powerpc/configs/mpc5200_defconfig b/arch/powerpc/configs/mpc5200_defconfig
index 83d801307178..c0fe5e76604a 100644
--- a/arch/powerpc/configs/mpc5200_defconfig
+++ b/arch/powerpc/configs/mpc5200_defconfig
@@ -122,6 +122,6 @@ CONFIG_ROOT_NFS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DETECT_HUNG_TASK=y
diff --git a/arch/powerpc/configs/mpc8272_ads_defconfig b/arch/powerpc/configs/mpc8272_ads_defconfig
index 00a4d2bf43b2..4145ef5689ca 100644
--- a/arch/powerpc/configs/mpc8272_ads_defconfig
+++ b/arch/powerpc/configs/mpc8272_ads_defconfig
@@ -67,7 +67,7 @@ CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_UTF8=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_BDI_SWITCH=y
diff --git a/arch/powerpc/configs/mpc885_ads_defconfig b/arch/powerpc/configs/mpc885_ads_defconfig
index c74dc76b1d0d..700115d85d6f 100644
--- a/arch/powerpc/configs/mpc885_ads_defconfig
+++ b/arch/powerpc/configs/mpc885_ads_defconfig
@@ -71,7 +71,7 @@ CONFIG_ROOT_NFS=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_DEV_TALITOS=y
CONFIG_CRC32_SLICEBY4=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_VM_PGTABLE=y
diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig
index bb549cb1c3e3..6d7ae407648f 100644
--- a/arch/powerpc/configs/ppc6xx_defconfig
+++ b/arch/powerpc/configs/ppc6xx_defconfig
@@ -1066,7 +1066,7 @@ CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_HEADERS_INSTALL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_KERNEL=y
diff --git a/arch/powerpc/configs/pq2fads_defconfig b/arch/powerpc/configs/pq2fads_defconfig
index 9d8a76857c6f..9d63e2e65211 100644
--- a/arch/powerpc/configs/pq2fads_defconfig
+++ b/arch/powerpc/configs/pq2fads_defconfig
@@ -68,7 +68,7 @@ CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_UTF8=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_SCHED_DEBUG is not set
diff --git a/arch/powerpc/configs/ps3_defconfig b/arch/powerpc/configs/ps3_defconfig
index 7c95fab4b920..2d9ac233da68 100644
--- a/arch/powerpc/configs/ps3_defconfig
+++ b/arch/powerpc/configs/ps3_defconfig
@@ -153,7 +153,7 @@ CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_CRC_CCITT=m
CONFIG_CRC_T10DIF=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_STACKOVERFLOW=y
diff --git a/arch/powerpc/configs/tqm8xx_defconfig b/arch/powerpc/configs/tqm8xx_defconfig
index 77857d513022..083c2e57520a 100644
--- a/arch/powerpc/configs/tqm8xx_defconfig
+++ b/arch/powerpc/configs/tqm8xx_defconfig
@@ -55,6 +55,6 @@ CONFIG_CRAMFS=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
CONFIG_CRC32_SLICEBY4=y
-CONFIG_DEBUG_INFO=y
+CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DETECT_HUNG_TASK=y
diff --git a/arch/powerpc/crypto/aes-spe-glue.c b/arch/powerpc/crypto/aes-spe-glue.c
index c2b23b69d7b1..e8dfe9fb0266 100644
--- a/arch/powerpc/crypto/aes-spe-glue.c
+++ b/arch/powerpc/crypto/aes-spe-glue.c
@@ -404,7 +404,7 @@ static int ppc_xts_decrypt(struct skcipher_request *req)
/*
* Algorithm definitions. Disabling alignment (cra_alignmask=0) was chosen
- * because the e500 platform can handle unaligned reads/writes very efficently.
+ * because the e500 platform can handle unaligned reads/writes very efficiently.
* This improves IPsec thoughput by another few percent. Additionally we assume
* that AES context is always aligned to at least 8 bytes because it is created
* with kmalloc() in the crypto infrastructure
diff --git a/arch/powerpc/include/asm/archrandom.h b/arch/powerpc/include/asm/archrandom.h
index 9a53e29680f4..258174304904 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -38,12 +38,7 @@ static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
#endif /* CONFIG_ARCH_RANDOM */
#ifdef CONFIG_PPC_POWERNV
-int powernv_hwrng_present(void);
int powernv_get_random_long(unsigned long *v);
-int powernv_get_random_real_mode(unsigned long *v);
-#else
-static inline int powernv_hwrng_present(void) { return 0; }
-static inline int powernv_get_random_real_mode(unsigned long *v) { return 0; }
#endif
#endif /* _ASM_POWERPC_ARCHRANDOM_H */
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 2aefe14e1442..1e5e9b6ec78d 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -120,6 +120,15 @@ int setup_purgatory(struct kimage *image, const void *slave_code,
#ifdef CONFIG_PPC64
struct kexec_buf;
+int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned long buf_len);
+#define arch_kexec_kernel_image_probe arch_kexec_kernel_image_probe
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image);
+#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
+
+int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
+#define arch_kexec_locate_mem_hole arch_kexec_locate_mem_hole
+
int load_crashdump_segments_ppc64(struct kimage *image,
struct kexec_buf *kbuf);
int setup_purgatory_ppc64(struct kimage *image, const void *slave_code,
diff --git a/arch/powerpc/include/asm/simple_spinlock.h b/arch/powerpc/include/asm/simple_spinlock.h
index 7ae6aeef8464..9dcc7e9993b9 100644
--- a/arch/powerpc/include/asm/simple_spinlock.h
+++ b/arch/powerpc/include/asm/simple_spinlock.h
@@ -48,10 +48,11 @@ static inline int arch_spin_is_locked(arch_spinlock_t *lock)
static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock)
{
unsigned long tmp, token;
+ unsigned int eh = IS_ENABLED(CONFIG_PPC64);
token = LOCK_TOKEN;
__asm__ __volatile__(
-"1: lwarx %0,0,%2,1\n\
+"1: lwarx %0,0,%2,%[eh]\n\
cmpwi 0,%0,0\n\
bne- 2f\n\
stwcx. %1,0,%2\n\
@@ -59,7 +60,7 @@ static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock)
PPC_ACQUIRE_BARRIER
"2:"
: "=&r" (tmp)
- : "r" (token), "r" (&lock->slock)
+ : "r" (token), "r" (&lock->slock), [eh] "n" (eh)
: "cr0", "memory");
return tmp;
@@ -156,9 +157,10 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
static inline long __arch_read_trylock(arch_rwlock_t *rw)
{
long tmp;
+ unsigned int eh = IS_ENABLED(CONFIG_PPC64);
__asm__ __volatile__(
-"1: lwarx %0,0,%1,1\n"
+"1: lwarx %0,0,%1,%[eh]\n"
__DO_SIGN_EXTEND
" addic. %0,%0,1\n\
ble- 2f\n"
@@ -166,7 +168,7 @@ static inline long __arch_read_trylock(arch_rwlock_t *rw)
bne- 1b\n"
PPC_ACQUIRE_BARRIER
"2:" : "=&r" (tmp)
- : "r" (&rw->lock)
+ : "r" (&rw->lock), [eh] "n" (eh)
: "cr0", "xer", "memory");
return tmp;
@@ -179,17 +181,18 @@ static inline long __arch_read_trylock(arch_rwlock_t *rw)
static inline long __arch_write_trylock(arch_rwlock_t *rw)
{
long tmp, token;
+ unsigned int eh = IS_ENABLED(CONFIG_PPC64);
token = WRLOCK_TOKEN;
__asm__ __volatile__(
-"1: lwarx %0,0,%2,1\n\
+"1: lwarx %0,0,%2,%[eh]\n\
cmpwi 0,%0,0\n\
bne- 2f\n"
" stwcx. %1,0,%2\n\
bne- 1b\n"
PPC_ACQUIRE_BARRIER
"2:" : "=&r" (tmp)
- : "r" (token), "r" (&rw->lock)
+ : "r" (token), "r" (&rw->lock), [eh] "n" (eh)
: "cr0", "memory");
return tmp;
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 4ddd161aef32..63c384c3e6d4 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -20,6 +20,7 @@ CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += -fno-stack-protector
CFLAGS_prom_init.o += -DDISABLE_BRANCH_PROFILING
CFLAGS_prom_init.o += -ffreestanding
+CFLAGS_prom_init.o += $(call cc-option, -ftrivial-auto-var-init=uninitialized)
ifdef CONFIG_FUNCTION_TRACER
# Do not trace early boot code
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index ae0fdef0ac11..2a271a6d6924 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2025,7 +2025,7 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
* oprofile_cpu_type already has a value, then we are
* possibly overriding a real PVR with a logical one,
* and, in that case, keep the current value for
- * oprofile_cpu_type. Futhermore, let's ensure that the
+ * oprofile_cpu_type. Furthermore, let's ensure that the
* fix for the PMAO bug is enabled on compatibility mode.
*/
if (old.oprofile_cpu_type != NULL) {
diff --git a/arch/powerpc/kernel/dawr.c b/arch/powerpc/kernel/dawr.c
index 64e423d2fe0f..30d4eca88d17 100644
--- a/arch/powerpc/kernel/dawr.c
+++ b/arch/powerpc/kernel/dawr.c
@@ -27,7 +27,7 @@ int set_dawr(int nr, struct arch_hw_breakpoint *brk)
dawrx |= (brk->type & (HW_BRK_TYPE_PRIV_ALL)) >> 3;
/*
* DAWR length is stored in field MDR bits 48:53. Matches range in
- * doublewords (64 bits) baised by -1 eg. 0b000000=1DW and
+ * doublewords (64 bits) biased by -1 eg. 0b000000=1DW and
* 0b111111=64DW.
* brk->hw_len is in bytes.
* This aligns up to double word size, shifts and does the bias.
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 28bb1e7263a6..ab316e155ea9 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1329,7 +1329,7 @@ int eeh_pe_set_option(struct eeh_pe *pe, int option)
/*
* EEH functionality could possibly be disabled, just
- * return error for the case. And the EEH functinality
+ * return error for the case. And the EEH functionality
* isn't expected to be disabled on one specific PE.
*/
switch (option) {
@@ -1804,7 +1804,7 @@ static int eeh_debugfs_break_device(struct pci_dev *pdev)
* PE freeze. Using the in_8() accessor skips the eeh detection hook
* so the freeze hook so the EEH Detection machinery won't be
* triggered here. This is to match the usual behaviour of EEH
- * where the HW will asyncronously freeze a PE and it's up to
+ * where the HW will asynchronously freeze a PE and it's up to
* the kernel to notice and deal with it.
*
* 3. Turn Memory space back on. This is more important for VFs
diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index a7a8dc182efb..c23a454af08a 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -143,7 +143,7 @@ int __eeh_send_failure_event(struct eeh_pe *pe)
int eeh_send_failure_event(struct eeh_pe *pe)
{
/*
- * If we've manually supressed recovery events via debugfs
+ * If we've manually suppressed recovery events via debugfs
* then just drop it on the floor.
*/
if (eeh_debugfs_no_recover) {
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index dc2350b288cf..aa29b9b7920f 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1665,8 +1665,8 @@ int __init setup_fadump(void)
}
/*
* Use subsys_initcall_sync() here because there is dependency with
- * crash_save_vmcoreinfo_init(), which mush run first to ensure vmcoreinfo initialization
- * is done before regisering with f/w.
+ * crash_save_vmcoreinfo_init(), which must run first to ensure vmcoreinfo initialization
+ * is done before registering with f/w.
*/
subsys_initcall_sync(setup_fadump);
#else /* !CONFIG_PRESERVE_FA_DUMP */
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 07093b7cdcb9..a67fd54ccc57 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -776,6 +776,11 @@ bool iommu_table_in_use(struct iommu_table *tbl)
/* ignore reserved bit0 */
if (tbl->it_offset == 0)
start = 1;
+
+ /* Simple case with no reserved MMIO32 region */
+ if (!tbl->it_reserved_start && !tbl->it_reserved_end)
+ return find_next_bit(tbl->it_map, tbl->it_size, start) != tbl->it_size;
+
end = tbl->it_reserved_start - tbl->it_offset;
if (find_next_bit(tbl->it_map, end, start) != end)
return true;
diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index a0432ef46967..e25b796682cc 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -99,7 +99,7 @@ static unsigned long get_plt_size(const Elf32_Ehdr *hdr,
/* Sort the relocation information based on a symbol and
* addend key. This is a stable O(n*log n) complexity
- * alogrithm but it will reduce the complexity of
+ * algorithm but it will reduce the complexity of
* count_relocs() to linear complexity O(n)
*/
sort((void *)hdr + sechdrs[i].sh_offset,
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 794720530442..2cce576edbc5 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -194,7 +194,7 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr,
/* Sort the relocation information based on a symbol and
* addend key. This is a stable O(n*log n) complexity
- * alogrithm but it will reduce the complexity of
+ * algorithm but it will reduce the complexity of
* count_relocs() to linear complexity O(n)
*/
sort((void *)sechdrs[i].sh_addr,
@@ -361,7 +361,7 @@ static inline int create_ftrace_stub(struct ppc64_stub_entry *entry,
entry->jump[1] |= PPC_HA(reladdr);
entry->jump[2] |= PPC_LO(reladdr);
- /* Eventhough we don't use funcdata in the stub, it's needed elsewhere. */
+ /* Even though we don't use funcdata in the stub, it's needed elsewhere. */
entry->funcdata = func_desc(addr);
entry->magic = STUB_MAGIC;
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 8bc9cf62cd93..aa37faeb80ee 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -74,16 +74,32 @@ void __init set_pci_dma_ops(const struct dma_map_ops *dma_ops)
static int get_phb_number(struct device_node *dn)
{
int ret, phb_id = -1;
- u32 prop_32;
u64 prop;
/*
* Try fixed PHB numbering first, by checking archs and reading
- * the respective device-tree properties. Firstly, try powernv by
- * reading "ibm,opal-phbid", only present in OPAL environment.
+ * the respective device-tree properties. Firstly, try reading
+ * standard "linux,pci-domain", then try reading "ibm,opal-phbid"
+ * (only present in powernv OPAL environment), then try device-tree
+ * alias and as the last try to use lower bits of "reg" property.
*/
- ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
+ ret = of_get_pci_domain_nr(dn);
+ if (ret >= 0) {
+ prop = ret;
+ ret = 0;
+ }
+ if (ret)
+ ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
+
if (ret) {
+ ret = of_alias_get_id(dn, "pci");
+ if (ret >= 0) {
+ prop = ret;
+ ret = 0;
+ }
+ }
+ if (ret) {
+ u32 prop_32;
ret = of_property_read_u32_index(dn, "reg", 1, &prop_32);
prop = prop_32;
}
@@ -95,10 +111,7 @@ static int get_phb_number(struct device_node *dn)
if ((phb_id >= 0) && !test_and_set_bit(phb_id, phb_bitmap))
return phb_id;
- /*
- * If not pseries nor powernv, or if fixed PHB numbering tried to add
- * the same PHB number twice, then fallback to dynamic PHB numbering.
- */
+ /* If everything fails then fallback to dynamic PHB numbering. */
phb_id = find_first_zero_bit(phb_bitmap, MAX_PHBS);
BUG_ON(phb_id >= MAX_PHBS);
set_bit(phb_id, phb_bitmap);
@@ -1688,7 +1701,7 @@ EXPORT_SYMBOL_GPL(pcibios_scan_phb);
static void fixup_hide_host_resource_fsl(struct pci_dev *dev)
{
int i, class = dev->class >> 8;
- /* When configured as agent, programing interface = 1 */
+ /* When configured as agent, programming interface = 1 */
int prog_if = dev->class & 0xf;
if ((class == PCI_CLASS_PROCESSOR_POWERPC ||
diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index c3024f104765..6f2b0cc1ddd6 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -244,7 +244,7 @@ EXPORT_SYMBOL(of_create_pci_dev);
* @dev: pci_dev structure for the bridge
*
* of_scan_bus() calls this routine for each PCI bridge that it finds, and
- * this routine in turn call of_scan_bus() recusively to scan for more child
+ * this routine in turn call of_scan_bus() recursively to scan for more child
* devices.
*/
void of_scan_pci_bridge(struct pci_dev *dev)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9be279469a85..3940db48db77 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -307,7 +307,7 @@ static void __giveup_vsx(struct task_struct *tsk)
unsigned long msr = tsk->thread.regs->msr;
/*
- * We should never be ssetting MSR_VSX without also setting
+ * We should never be setting MSR_VSX without also setting
* MSR_FP and MSR_VEC
*/
WARN_ON((msr & MSR_VSX) && !((msr & MSR_FP) && (msr & MSR_VEC)));
@@ -645,7 +645,7 @@ static void do_break_handler(struct pt_regs *regs)
return;
}
- /* Otherwise findout which DAWR caused exception and disable it. */
+ /* Otherwise find out which DAWR caused exception and disable it. */
wp_get_instr_detail(regs, &instr, &type, &size, &ea);
for (i = 0; i < nr_wp_slots(); i++) {
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 0ac5faacc909..ace861ec4c4c 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -3416,7 +3416,7 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4,
*
* PowerMacs use a different mechanism to spin CPUs
*
- * (This must be done after instanciating RTAS)
+ * (This must be done after instantiating RTAS)
*/
if (of_platform != PLATFORM_POWERMAC)
prom_hold_cpus();
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index f15bc78caf71..076d867412c7 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -174,7 +174,7 @@ int ptrace_get_reg(struct task_struct *task, int regno, unsigned long *data)
/*
* softe copies paca->irq_soft_mask variable state. Since irq_soft_mask is
- * no more used as a flag, lets force usr to alway see the softe value as 1
+ * no more used as a flag, lets force usr to always see the softe value as 1
* which means interrupts are not soft disabled.
*/
if (IS_ENABLED(CONFIG_PPC64) && regno == PT_SOFTE) {
diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
index a99179d83538..bc817a5619d6 100644
--- a/arch/powerpc/kernel/rtas_flash.c
+++ b/arch/powerpc/kernel/rtas_flash.c
@@ -120,7 +120,7 @@ static struct kmem_cache *flash_block_cache = NULL;
/*
* Local copy of the flash block list.
*
- * The rtas_firmware_flash_list varable will be
+ * The rtas_firmware_flash_list variable will be
* set once the data is fully read.
*
* For convenience as we build the list we use virtual addrs,
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 518ae5aa9410..3acf2782acdf 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -279,7 +279,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
proc_freq / 1000000, proc_freq % 1000000);
/* If we are a Freescale core do a simple check so
- * we dont have to keep adding cases in the future */
+ * we don't have to keep adding cases in the future */
if (PVR_VER(pvr) & 0x8000) {
switch (PVR_VER(pvr)) {
case 0x8000: /* 7441/7450/7451, Voyager */
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 73d483b07ff3..858fc13b8c51 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -123,7 +123,7 @@ static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
#endif
struct pt_regs *regs = tsk->thread.regs;
unsigned long msr = regs->msr;
- /* Force usr to alway see softe as 1 (interrupts enabled) */
+ /* Force usr to always see softe as 1 (interrupts enabled) */
unsigned long softe = 0x1;
BUG_ON(tsk != current);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index de0f6f09a5dd..a69df557e2b7 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1102,7 +1102,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
DBG("smp_prepare_cpus\n");
/*
- * setup_cpu may need to be called on the boot cpu. We havent
+ * setup_cpu may need to be called on the boot cpu. We haven't
* spun any cpus up but lets be paranoid.
*/
BUG_ON(boot_cpuid != smp_processor_id());
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index f80cce0e3899..4bf757ebe13d 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -829,7 +829,7 @@ static void __read_persistent_clock(struct timespec64 *ts)
static int first = 1;
ts->tv_nsec = 0;
- /* XXX this is a litle fragile but will work okay in the short term */
+ /* XXX this is a little fragile but will work okay in the short term */
if (first) {
first = 0;
if (ppc_md.time_init)
@@ -974,7 +974,7 @@ void secondary_cpu_time_init(void)
*/
start_cpu_decrementer();
- /* FIME: Should make unrelatred change to move snapshot_timebase
+ /* FIME: Should make unrelated change to move snapshot_timebase
* call here ! */
register_decrementer_clockevent(smp_processor_id());
}
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index bfc27496fe7e..7d28b9553654 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -56,7 +56,7 @@
* solved by also having a SMP watchdog where all CPUs check all other
* CPUs heartbeat.
*
- * The SMP checker can detect lockups on other CPUs. A gobal "pending"
+ * The SMP checker can detect lockups on other CPUs. A global "pending"
* cpumask is kept, containing all CPUs which enable the watchdog. Each
* CPU clears their pending bit in their heartbeat timer. When the bitmask
* becomes empty, the last CPU to clear its pending bit updates a global
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 6cc7793b8420..c29c639551fe 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -406,7 +406,7 @@ static int __init export_htab_values(void)
if (!node)
return -ENODEV;
- /* remove any stale propertys so ours can be found */
+ /* remove any stale properties so ours can be found */
of_remove_property(node, of_find_property(node, htab_base_prop.name, NULL));
of_remove_property(node, of_find_property(node, htab_size_prop.name, NULL));
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index b4981b651d9a..349a781cea0b 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -23,6 +23,7 @@
#include <linux/vmalloc.h>
#include <asm/setup.h>
#include <asm/drmem.h>
+#include <asm/firmware.h>
#include <asm/kexec_ranges.h>
#include <asm/crashdump-ppc64.h>
@@ -1038,6 +1039,48 @@ static int update_cpus_node(void *fdt)
return ret;
}
+static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
+ const char *propname)
+{
+ const void *prop, *fdtprop;
+ int len = 0, fdtlen = 0;
+
+ prop = of_get_property(dn, propname, &len);
+ fdtprop = fdt_getprop(fdt, node_offset, propname, &fdtlen);
+
+ if (fdtprop && !prop)
+ return fdt_delprop(fdt, node_offset, propname);
+ else if (prop)
+ return fdt_setprop(fdt, node_offset, propname, prop, len);
+ else
+ return -FDT_ERR_NOTFOUND;
+}
+
+static int update_pci_dma_nodes(void *fdt, const char *dmapropname)
+{
+ struct device_node *dn;
+ int pci_offset, root_offset, ret = 0;
+
+ if (!firmware_has_feature(FW_FEATURE_LPAR))
+ return 0;
+
+ root_offset = fdt_path_offset(fdt, "/");
+ for_each_node_with_property(dn, dmapropname) {
+ pci_offset = fdt_subnode_offset(fdt, root_offset, of_node_full_name(dn));
+ if (pci_offset < 0)
+ continue;
+
+ ret = copy_property(fdt, pci_offset, dn, "ibm,dma-window");
+ if (ret < 0)
+ break;
+ ret = copy_property(fdt, pci_offset, dn, dmapropname);
+ if (ret < 0)
+ break;
+ }
+
+ return ret;
+}
+
/**
* setup_new_fdt_ppc64 - Update the flattend device-tree of the kernel
* being loaded.
@@ -1099,6 +1142,18 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt,
if (ret < 0)
goto out;
+#define DIRECT64_PROPNAME "linux,direct64-ddr-window-info"
+#define DMA64_PROPNAME "linux,dma64-ddr-window-info"
+ ret = update_pci_dma_nodes(fdt, DIRECT64_PROPNAME);
+ if (ret < 0)
+ goto out;
+
+ ret = update_pci_dma_nodes(fdt, DMA64_PROPNAME);
+ if (ret < 0)
+ goto out;
+#undef DMA64_PROPNAME
+#undef DIRECT64_PROPNAME
+
/* Update memory reserve map */
ret = get_reserved_memory_ranges(&rmem);
if (ret)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 0aeb51738ca9..1137c4df726c 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -58,7 +58,7 @@ struct kvm_resize_hpt {
/* Possible values and their usage:
* <0 an error occurred during allocation,
* -EBUSY allocation is in the progress,
- * 0 allocation made successfuly.
+ * 0 allocation made successfully.
*/
int error;
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c
index fdeda6a9cff4..fdcc7b287dd8 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -453,7 +453,7 @@ static long kvmppc_rm_ua_to_hpa(struct kvm_vcpu *vcpu, unsigned long mmu_seq,
* we are doing this on secondary cpus and current task there
* is not the hypervisor. Also this is safe against THP in the
* host, because an IPI to primary thread will wait for the secondary
- * to exit which will agains result in the below page table walk
+ * to exit which will again result in the below page table walk
* to finish.
*/
/* an rmap lock won't make it safe. because that just ensure hash
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index fdb57be71aa6..5bbfb2eed127 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -268,7 +268,7 @@ int kvmppc_core_emulate_op_pr(struct kvm_vcpu *vcpu,
/*
* add rules to fit in ISA specification regarding TM
- * state transistion in TM disable/Suspended state,
+ * state transition in TM disable/Suspended state,
* and target TM state is TM inactive(00) state. (the
* change should be suppressed).
*/
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 7e52d0beee77..5e4251b76e75 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -19,7 +19,7 @@
#include <asm/interrupt.h>
#include <asm/kvm_ppc.h>
#include <asm/kvm_book3s.h>
-#include <asm/archrandom.h>
+#include <asm/machdep.h>
#include <asm/xics.h>
#include <asm/xive.h>
#include <asm/dbell.h>
@@ -176,13 +176,14 @@ EXPORT_SYMBOL_GPL(kvmppc_hcall_impl_hv_realmode);
int kvmppc_hwrng_present(void)
{
- return powernv_hwrng_present();
+ return ppc_md.get_random_seed != NULL;
}
EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
long kvmppc_rm_h_random(struct kvm_vcpu *vcpu)
{
- if (powernv_get_random_real_mode(&vcpu->arch.regs.gpr[4]))
+ if (ppc_md.get_random_seed &&
+ ppc_md.get_random_seed(&vcpu->arch.regs.gpr[4]))
return H_SUCCESS;
return H_HARDWARE;
diff --git a/arch/powerpc/kvm/book3s_hv_p9_entry.c b/arch/powerpc/kvm/book3s_hv_p9_entry.c
index a28e5b3daabd..ac38c1cad378 100644
--- a/arch/powerpc/kvm/book3s_hv_p9_entry.c
+++ b/arch/powerpc/kvm/book3s_hv_p9_entry.c
@@ -379,7 +379,7 @@ void restore_p9_host_os_sprs(struct kvm_vcpu *vcpu,
{
/*
* current->thread.xxx registers must all be restored to host
- * values before a potential context switch, othrewise the context
+ * values before a potential context switch, otherwise the context
* switch itself will overwrite current->thread.xxx with the values
* from the guest SPRs.
*/
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 36f2314c58e5..598006301620 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -120,7 +120,7 @@ static DEFINE_SPINLOCK(kvmppc_uvmem_bitmap_lock);
* content is un-encrypted.
*
* (c) Normal - The GFN is a normal. The GFN is associated with
- * a normal VM. The contents of the GFN is accesible to
+ * a normal VM. The contents of the GFN is accessible to
* the Hypervisor. Its content is never encrypted.
*
* States of a VM.
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 7bf9e6ca5c2d..d6abed6e51e6 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -1287,7 +1287,7 @@ int kvmppc_handle_exit_pr(struct kvm_vcpu *vcpu, unsigned int exit_nr)
/* Get last sc for papr */
if (vcpu->arch.papr_enabled) {
- /* The sc instuction points SRR0 to the next inst */
+ /* The sc instruction points SRR0 to the next inst */
emul = kvmppc_get_last_inst(vcpu, INST_SC, &last_sc);
if (emul != EMULATE_DONE) {
kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) - 4);
diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index ab6d37d78c62..589a8f257120 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -462,7 +462,7 @@ static void icp_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp,
* new guy. We cannot assume that the rejected interrupt is less
* favored than the new one, and thus doesn't need to be delivered,
* because by the time we exit icp_try_to_deliver() the target
- * processor may well have alrady consumed & completed it, and thus
+ * processor may well have already consumed & completed it, and thus
* the rejected interrupt might actually be already acceptable.
*/
if (icp_try_to_deliver(icp, new_irq, state->priority, &reject)) {
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index c0ce5531d9bc..24d434f1f012 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -124,7 +124,7 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu)
* interrupt might have fired and be on its way to the
* host queue while we mask it, and if we unmask it
* early enough (re-cede right away), there is a
- * theorical possibility that it fires again, thus
+ * theoretical possibility that it fires again, thus
* landing in the target queue more than once which is
* a big no-no.
*
@@ -622,7 +622,7 @@ static int xive_target_interrupt(struct kvm *kvm,
/*
* Targetting rules: In order to avoid losing track of
- * pending interrupts accross mask and unmask, which would
+ * pending interrupts across mask and unmask, which would
* allow queue overflows, we implement the following rules:
*
* - Unless it was never enabled (or we run out of capacity)
@@ -1073,7 +1073,7 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,
/*
* If old_p is set, the interrupt is pending, we switch it to
* PQ=11. This will force a resend in the host so the interrupt
- * isn't lost to whatver host driver may pick it up
+ * isn't lost to whatever host driver may pick it up
*/
if (state->old_p)
xive_vm_esb_load(state->pt_data, XIVE_ESB_SET_PQ_11);
diff --git a/arch/powerpc/kvm/e500mc.c b/arch/powerpc/kvm/e500mc.c
index fa0d8dbbe484..4ff1372e48d4 100644
--- a/arch/powerpc/kvm/e500mc.c
+++ b/arch/powerpc/kvm/e500mc.c
@@ -309,7 +309,7 @@ static int kvmppc_core_vcpu_create_e500mc(struct kvm_vcpu *vcpu)
BUILD_BUG_ON(offsetof(struct kvmppc_vcpu_e500, vcpu) != 0);
vcpu_e500 = to_e500(vcpu);
- /* Invalid PIR value -- this LPID dosn't have valid state on any cpu */
+ /* Invalid PIR value -- this LPID doesn't have valid state on any cpu */
vcpu->arch.oldpir = 0xffffffff;
err = kvmppc_e500_tlb_init(vcpu_e500);
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
index 7ce8914992e3..2e0cad5817ba 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -377,7 +377,7 @@ int hash__has_transparent_hugepage(void)
if (mmu_psize_defs[MMU_PAGE_16M].shift != PMD_SHIFT)
return 0;
/*
- * We need to make sure that we support 16MB hugepage in a segement
+ * We need to make sure that we support 16MB hugepage in a segment
* with base page size 64K or 4K. We only enable THP with a PAGE_SIZE
* of 64K.
*/
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 985cabdd7f67..5b69b271707e 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1343,7 +1343,7 @@ static int subpage_protection(struct mm_struct *mm, unsigned long ea)
spp >>= 30 - 2 * ((ea >> 12) & 0xf);
/*
- * 0 -> full premission
+ * 0 -> full permission
* 1 -> Read only
* 2 -> no access.
* We return the flag that need to be cleared.
@@ -1664,7 +1664,7 @@ DEFINE_INTERRUPT_HANDLER(do_hash_fault)
err = hash_page_mm(mm, ea, access, TRAP(regs), flags);
if (unlikely(err < 0)) {
- // failed to instert a hash PTE due to an hypervisor error
+ // failed to insert a hash PTE due to an hypervisor error
if (user_mode(regs)) {
if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 052e6590f84f..071bb66c3ad9 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -331,7 +331,7 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm)
spin_lock(&mm->page_table_lock);
/*
* If we find pgtable_page set, we return
- * the allocated page with single fragement
+ * the allocated page with single fragment
* count.
*/
if (likely(!mm->context.pmd_frag)) {
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index def04631a74d..db2f3d193448 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -359,7 +359,7 @@ static void __init radix_init_pgtable(void)
if (!cpu_has_feature(CPU_FTR_HVMODE) &&
cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG)) {
/*
- * Older versions of KVM on these machines perfer if the
+ * Older versions of KVM on these machines prefer if the
* guest only uses the low 19 PID bits.
*/
mmu_pid_bits = 19;
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 7724af19ed7e..dda51fef2d2e 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -397,7 +397,7 @@ static inline void _tlbie_pid(unsigned long pid, unsigned long ric)
/*
* Workaround the fact that the "ric" argument to __tlbie_pid
- * must be a compile-time contraint to match the "i" constraint
+ * must be a compile-time constraint to match the "i" constraint
* in the asm statement.
*/
switch (ric) {
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index 81091b9587f6..6956f637a38c 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -347,7 +347,7 @@ void slb_setup_new_exec(void)
/*
* We have no good place to clear the slb preload cache on exec,
* flush_thread is about the earliest arch hook but that happens
- * after we switch to the mm and have aleady preloaded the SLBEs.
+ * after we switch to the mm and have already preloaded the SLBEs.
*
* For the most part that's probably okay to use entries from the
* previous exec, they will age out if unused. It may turn out to
@@ -615,7 +615,7 @@ static void slb_cache_update(unsigned long esid_data)
} else {
/*
* Our cache is full and the current cache content strictly
- * doesn't indicate the active SLB conents. Bump the ptr
+ * doesn't indicate the active SLB contents. Bump the ptr
* so that switch_slb() will ignore the cache.
*/
local_paca->slb_cache_ptr = SLB_CACHE_ENTRIES + 1;
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 83c0ee9fbf05..2e11952057f8 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -111,7 +111,7 @@ static int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_m
}
/*
- * vmemmap virtual address space management does not have a traditonal page
+ * vmemmap virtual address space management does not have a traditional page
* table to track which virtual struct pages are backed by physical mapping.
* The virtual to physical mappings are tracked in a simple linked list
* format. 'vmemmap_list' maintains the entire vmemmap physical mapping at
@@ -128,7 +128,7 @@ static struct vmemmap_backing *next;
/*
* The same pointer 'next' tracks individual chunks inside the allocated
- * full page during the boot time and again tracks the freeed nodes during
+ * full page during the boot time and again tracks the freed nodes during
* runtime. It is racy but it does not happen as they are separated by the
* boot process. Will create problem if some how we have memory hotplug
* operation during boot !!
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index f3e4d069e0ba..a70828a6d935 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -25,7 +25,7 @@ static void __init kasan_populate_pte(pte_t *ptep, pgprot_t prot)
int i;
for (i = 0; i < PTRS_PER_PTE; i++, ptep++)
- __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 0);
+ __set_pte_at(&init_mm, va, ptep, pfn_pte(PHYS_PFN(pa), prot), 1);
}
int __init kasan_init_shadow_page_tables(unsigned long k_start, unsigned long k_end)
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index 27f9186ae374..1ee08c3efe5b 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -179,8 +179,8 @@ void mmu_mark_initmem_nx(void)
unsigned long boundary = strict_kernel_rwx_enabled() ? sinittext : etext8;
unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);
- mmu_mapin_ram_chunk(0, boundary, PAGE_KERNEL_TEXT, false);
- mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
+ if (!debug_pagealloc_enabled_or_kfence())
+ mmu_mapin_ram_chunk(boundary, einittext8, PAGE_KERNEL, false);
mmu_pin_tlb(block_mapped_ram, false);
}
diff --git a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
index 8b88be91b622..307ca919d393 100644
--- a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
+++ b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
@@ -142,7 +142,7 @@ book3e_hugetlb_preload(struct vm_area_struct *vma, unsigned long ea, pte_t pte)
tsize = shift - 10;
/*
* We can't be interrupted while we're setting up the MAS
- * regusters or after we've confirmed that no tlb exists.
+ * registers or after we've confirmed that no tlb exists.
*/
local_irq_save(flags);
diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c b/arch/powerpc/mm/nohash/kaslr_booke.c
index 5f81c076621f..37eb8d80bd53 100644
--- a/arch/powerpc/mm/nohash/kaslr_booke.c
+++ b/arch/powerpc/mm/nohash/kaslr_booke.c
@@ -311,7 +311,7 @@ static unsigned long __init kaslr_choose_location(void *dt_ptr, phys_addr_t size
ram = map_mem_in_cams(ram, CONFIG_LOWMEM_CAM_NUM, true, true);
linear_sz = min_t(unsigned long, ram, SZ_512M);
- /* If the linear size is smaller than 64M, do not randmize */
+ /* If the linear size is smaller than 64M, do not randomize */
if (linear_sz < SZ_64M)
return 0;
diff --git a/arch/powerpc/mm/nohash/tlb_low_64e.S b/arch/powerpc/mm/nohash/tlb_low_64e.S
index 8b97c4acfebf..9e9ab3803fb2 100644
--- a/arch/powerpc/mm/nohash/tlb_low_64e.S
+++ b/arch/powerpc/mm/nohash/tlb_low_64e.S
@@ -583,7 +583,7 @@ itlb_miss_fault_e6500:
*/
rlwimi r11,r14,32-19,27,27
rlwimi r11,r14,32-16,19,19
- beq normal_tlb_miss
+ beq normal_tlb_miss_user
/* XXX replace the RMW cycles with immediate loads + writes */
1: mfspr r10,SPRN_MAS1
cmpldi cr0,r15,8 /* Check for vmalloc region */
@@ -626,7 +626,7 @@ itlb_miss_fault_e6500:
cmpldi cr0,r15,0 /* Check for user region */
std r14,EX_TLB_ESR(r12) /* write crazy -1 to frame */
- beq normal_tlb_miss
+ beq normal_tlb_miss_user
li r11,_PAGE_PRESENT|_PAGE_BAP_SX /* Base perm */
oris r11,r11,_PAGE_ACCESSED@h
@@ -653,6 +653,12 @@ itlb_miss_fault_e6500:
* r11 = PTE permission mask
* r10 = crap (free to use)
*/
+normal_tlb_miss_user:
+#ifdef CONFIG_PPC_KUAP
+ mfspr r14,SPRN_MAS1
+ rlwinm. r14,r14,0,0x3fff0000
+ beq- normal_tlb_miss_access_fault /* KUAP fault */
+#endif
normal_tlb_miss:
/* So we first construct the page table address. We do that by
* shifting the bottom of the address (not the region ID) by
@@ -683,11 +689,6 @@ finish_normal_tlb_miss:
/* Check if required permissions are met */
andc. r15,r11,r14
bne- normal_tlb_miss_access_fault
-#ifdef CONFIG_PPC_KUAP
- mfspr r11,SPRN_MAS1
- rlwinm. r10,r11,0,0x3fff0000
- beq- normal_tlb_miss_access_fault /* KUAP fault */
-#endif
/* Now we build the MAS:
*
@@ -709,9 +710,7 @@ finish_normal_tlb_miss:
rldicl r10,r14,64-8,64-8
cmpldi cr0,r10,BOOK3E_PAGESZ_4K
beq- 1f
-#ifndef CONFIG_PPC_KUAP
mfspr r11,SPRN_MAS1
-#endif
rlwimi r11,r14,31,21,24
rlwinm r11,r11,0,21,19
mtspr SPRN_MAS1,r11
diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
index 97ae4935da79..20652daa1d7e 100644
--- a/arch/powerpc/mm/pgtable-frag.c
+++ b/arch/powerpc/mm/pgtable-frag.c
@@ -83,7 +83,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
spin_lock(&mm->page_table_lock);
/*
* If we find pgtable_page set, we return
- * the allocated page with single fragement
+ * the allocated page with single fragment
* count.
*/
if (likely(!pte_frag_get(&mm->context))) {
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index a56ade39dc68..3ac73f9fb5d5 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -135,9 +135,9 @@ void mark_initmem_nx(void)
unsigned long numpages = PFN_UP((unsigned long)_einittext) -
PFN_DOWN((unsigned long)_sinittext);
- if (v_block_mapped((unsigned long)_sinittext)) {
- mmu_mark_initmem_nx();
- } else {
+ mmu_mark_initmem_nx();
+
+ if (!v_block_mapped((unsigned long)_sinittext)) {
set_memory_nx((unsigned long)_sinittext, numpages);
set_memory_rw((unsigned long)_sinittext, numpages);
}
diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index 03607ab90c66..f884760ca5cf 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -17,9 +17,9 @@ static const struct flag_info flag_array[] = {
.clear = " ",
}, {
.mask = _PAGE_RW,
- .val = _PAGE_RW,
- .set = "rw",
- .clear = "r ",
+ .val = 0,
+ .set = "r ",
+ .clear = "rw",
}, {
.mask = _PAGE_EXEC,
.val = _PAGE_EXEC,
diff --git a/arch/powerpc/perf/8xx-pmu.c b/arch/powerpc/perf/8xx-pmu.c
index 4738c4dbf567..308a2e40d7be 100644
--- a/arch/powerpc/perf/8xx-pmu.c
+++ b/arch/powerpc/perf/8xx-pmu.c
@@ -157,7 +157,7 @@ static void mpc8xx_pmu_del(struct perf_event *event, int flags)
mpc8xx_pmu_read(event);
- /* If it was the last user, stop counting to avoid useles overhead */
+ /* If it was the last user, stop counting to avoid useless overhead */
switch (event_type(event)) {
case PERF_8xx_ID_CPU_CYCLES:
break;
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index b5b42cf0a703..03c64a0195df 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1142,7 +1142,7 @@ static u64 check_and_compute_delta(u64 prev, u64 val)
/*
* POWER7 can roll back counter values, if the new value is smaller
* than the previous value it will cause the delta and the counter to
- * have bogus values unless we rolled a counter over. If a coutner is
+ * have bogus values unless we rolled a counter over. If a counter is
* rolled back, it will be smaller, but within 256, which is the maximum
* number of events to rollback at once. If we detect a rollback
* return 0. This can lead to a small lack of precision in the
@@ -1349,27 +1349,22 @@ static void power_pmu_disable(struct pmu *pmu)
* a PMI happens during interrupt replay and perf counter
* values are cleared by PMU callbacks before replay.
*
- * If any PMC corresponding to the active PMU events are
- * overflown, disable the interrupt by clearing the paca
- * bit for PMI since we are disabling the PMU now.
- * Otherwise provide a warning if there is PMI pending, but
- * no counter is found overflown.
+ * Disable the interrupt by clearing the paca bit for PMI
+ * since we are disabling the PMU now. Otherwise provide a
+ * warning if there is PMI pending, but no counter is found
+ * overflown.
+ *
+ * Since power_pmu_disable runs under local_irq_save, it
+ * could happen that code hits a PMC overflow without PMI
+ * pending in paca. Hence only clear PMI pending if it was
+ * set.
+ *
+ * If a PMI is pending, then MSR[EE] must be disabled (because
+ * the masked PMI handler disabling EE). So it is safe to
+ * call clear_pmi_irq_pending().
*/
- if (any_pmc_overflown(cpuhw)) {
- /*
- * Since power_pmu_disable runs under local_irq_save, it
- * could happen that code hits a PMC overflow without PMI
- * pending in paca. Hence only clear PMI pending if it was
- * set.
- *
- * If a PMI is pending, then MSR[EE] must be disabled (because
- * the masked PMI handler disabling EE). So it is safe to
- * call clear_pmi_irq_pending().
- */
- if (pmi_irq_pending())
- clear_pmi_irq_pending();
- } else
- WARN_ON(pmi_irq_pending());
+ if (pmi_irq_pending())
+ clear_pmi_irq_pending();
val = mmcra = cpuhw->mmcr.mmcra;
@@ -2057,7 +2052,7 @@ static int power_pmu_event_init(struct perf_event *event)
/*
* PMU config registers have fields that are
* reserved and some specific values for bit fields are reserved.
- * For ex., MMCRA[61:62] is Randome Sampling Mode (SM)
+ * For ex., MMCRA[61:62] is Random Sampling Mode (SM)
* and value of 0b11 to this field is reserved.
* Check for invalid values in attr.config.
*/
@@ -2447,7 +2442,7 @@ static void __perf_event_interrupt(struct pt_regs *regs)
}
/*
- * During system wide profling or while specific CPU is monitored for an
+ * During system wide profiling or while specific CPU is monitored for an
* event, some corner cases could cause PMC to overflow in idle path. This
* will trigger a PMI after waking up from idle. Since counter values are _not_
* saved/restored in idle path, can lead to below "Can't find PMC" message.
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 526d4b767534..498f1a2f7658 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -521,7 +521,7 @@ static int nest_imc_event_init(struct perf_event *event)
/*
* Nest HW counter memory resides in a per-chip reserve-memory (HOMER).
- * Get the base memory addresss for this cpu.
+ * Get the base memory address for this cpu.
*/
chip_id = cpu_to_chip_id(event->cpu);
@@ -674,7 +674,7 @@ static int ppc_core_imc_cpu_offline(unsigned int cpu)
/*
* Check whether core_imc is registered. We could end up here
* if the cpuhotplug callback registration fails. i.e, callback
- * invokes the offline path for all sucessfully registered cpus.
+ * invokes the offline path for all successfully registered cpus.
* At this stage, core_imc pmu will not be registered and we
* should return here.
*
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index bb5d64862bc9..42abbcfc73da 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -82,11 +82,11 @@ static unsigned long sdar_mod_val(u64 event)
static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
{
/*
- * MMCRA[SDAR_MODE] specifices how the SDAR should be updated in
- * continous sampling mode.
+ * MMCRA[SDAR_MODE] specifies how the SDAR should be updated in
+ * continuous sampling mode.
*
* Incase of Power8:
- * MMCRA[SDAR_MODE] will be programmed as "0b01" for continous sampling
+ * MMCRA[SDAR_MODE] will be programmed as "0b01" for continuous sampling
* mode and will be un-changed when setting MMCRA[63] (Marked events).
*
* Incase of Power9/power10:
diff --git a/arch/powerpc/platforms/512x/clock-commonclk.c b/arch/powerpc/platforms/512x/clock-commonclk.c
index 0b03d812baae..0652c7e69225 100644
--- a/arch/powerpc/platforms/512x/clock-commonclk.c
+++ b/arch/powerpc/platforms/512x/clock-commonclk.c
@@ -663,7 +663,7 @@ static void __init mpc512x_clk_setup_mclk(struct mclk_setup_data *entry, size_t
* the PSC/MSCAN/SPDIF (serial drivers et al) need the MCLK
* for their bitrate
* - in the absence of "aliases" for clocks we need to create
- * individial 'struct clk' items for whatever might get
+ * individual 'struct clk' items for whatever might get
* referenced or looked up, even if several of those items are
* identical from the logical POV (their rate value)
* - for easier future maintenance and for better reflection of
diff --git a/arch/powerpc/platforms/512x/mpc512x_shared.c b/arch/powerpc/platforms/512x/mpc512x_shared.c
index e3411663edad..96d9cf49560d 100644
--- a/arch/powerpc/platforms/512x/mpc512x_shared.c
+++ b/arch/powerpc/platforms/512x/mpc512x_shared.c
@@ -289,7 +289,7 @@ static void __init mpc512x_setup_diu(void)
/*
* We do not allocate and configure new area for bitmap buffer
- * because it would requere copying bitmap data (splash image)
+ * because it would require copying bitmap data (splash image)
* and so negatively affect boot time. Instead we reserve the
* already configured frame buffer area so that it won't be
* destroyed. The starting address of the area to reserve and
diff --git a/arch/powerpc/platforms/52xx/mpc52xx_common.c b/arch/powerpc/platforms/52xx/mpc52xx_common.c
index 565e3a83dc9e..60aa6015e284 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_common.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_common.c
@@ -308,7 +308,7 @@ int mpc5200_psc_ac97_gpio_reset(int psc_number)
spin_lock_irqsave(&gpio_lock, flags);
- /* Reconfiure pin-muxing to gpio */
+ /* Reconfigure pin-muxing to gpio */
mux = in_be32(&simple_gpio->port_config);
out_be32(&simple_gpio->port_config, mux & (~gpio));
diff --git a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
index f862b48b4824..7252d992ca9f 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
@@ -398,7 +398,7 @@ static int mpc52xx_gpt_do_start(struct mpc52xx_gpt_priv *gpt, u64 period,
set |= MPC52xx_GPT_MODE_CONTINUOUS;
/* Determine the number of clocks in the requested period. 64 bit
- * arithmatic is done here to preserve the precision until the value
+ * arithmetic is done here to preserve the precision until the value
* is scaled back down into the u32 range. Period is in 'ns', bus
* frequency is in Hz. */
clocks = period * (u64)gpt->ipb_freq;
diff --git a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
index b91ebebd9ff2..54dfc63809d3 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_lpbfifo.c
@@ -104,7 +104,7 @@ static void mpc52xx_lpbfifo_kick(struct mpc52xx_lpbfifo_request *req)
*
* Configure the watermarks so DMA will always complete correctly.
* It may be worth experimenting with the ALARM value to see if
- * there is a performance impacit. However, if it is wrong there
+ * there is a performance impact. However, if it is wrong there
* is a risk of DMA not transferring the last chunk of data
*/
if (write) {
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_cds.c b/arch/powerpc/platforms/85xx/mpc85xx_cds.c
index 5bd487030256..ad0b88ecfd0c 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_cds.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_cds.c
@@ -151,7 +151,7 @@ static void __init mpc85xx_cds_pci_irq_fixup(struct pci_dev *dev)
*/
case PCI_DEVICE_ID_VIA_82C586_2:
/* There are two USB controllers.
- * Identify them by functon number
+ * Identify them by function number
*/
if (PCI_FUNC(dev->devfn) == 3)
dev->irq = 11;
diff --git a/arch/powerpc/platforms/86xx/gef_ppc9a.c b/arch/powerpc/platforms/86xx/gef_ppc9a.c
index 44bbbc535e1d..884da08806ce 100644
--- a/arch/powerpc/platforms/86xx/gef_ppc9a.c
+++ b/arch/powerpc/platforms/86xx/gef_ppc9a.c
@@ -180,7 +180,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB,
*
* This function is called to determine whether the BSP is compatible with the
* supplied device-tree, which is assumed to be the correct one for the actual
- * board. It is expected thati, in the future, a kernel may support multiple
+ * board. It is expected that, in the future, a kernel may support multiple
* boards.
*/
static int __init gef_ppc9a_probe(void)
diff --git a/arch/powerpc/platforms/86xx/gef_sbc310.c b/arch/powerpc/platforms/86xx/gef_sbc310.c
index 46d6d3d4957a..baaf1ab07016 100644
--- a/arch/powerpc/platforms/86xx/gef_sbc310.c
+++ b/arch/powerpc/platforms/86xx/gef_sbc310.c
@@ -167,7 +167,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB,
*
* This function is called to determine whether the BSP is compatible with the
* supplied device-tree, which is assumed to be the correct one for the actual
- * board. It is expected thati, in the future, a kernel may support multiple
+ * board. It is expected that, in the future, a kernel may support multiple
* boards.
*/
static int __init gef_sbc310_probe(void)
diff --git a/arch/powerpc/platforms/86xx/gef_sbc610.c b/arch/powerpc/platforms/86xx/gef_sbc610.c
index acf2c6c3c1eb..120caf6af71d 100644
--- a/arch/powerpc/platforms/86xx/gef_sbc610.c
+++ b/arch/powerpc/platforms/86xx/gef_sbc610.c
@@ -157,7 +157,7 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_USB,
*
* This function is called to determine whether the BSP is compatible with the
* supplied device-tree, which is assumed to be the correct one for the actual
- * board. It is expected thati, in the future, a kernel may support multiple
+ * board. It is expected that, in the future, a kernel may support multiple
* boards.
*/
static int __init gef_sbc610_probe(void)
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index e2e1fec91c6e..660309559bd8 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -173,11 +173,11 @@ config POWER9_CPU
config E5500_CPU
bool "Freescale e5500"
- depends on E500
+ depends on PPC64 && E500
config E6500_CPU
bool "Freescale e6500"
- depends on E500
+ depends on PPC64 && E500
config 860_CPU
bool "8xx family"
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index f9a1615b74da..c0799fb26b6d 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -30,7 +30,7 @@
*
* where "vas_copy" and "vas_paste" are defined in copy-paste.h.
* copy/paste returns to the user space directly. So refer NX hardware
- * documententation for exact copy/paste usage and completion / error
+ * documentation for exact copy/paste usage and completion / error
* conditions.
*/
diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c
index 354a58c1e6f2..dba34f4d5162 100644
--- a/arch/powerpc/platforms/cell/axon_msi.c
+++ b/arch/powerpc/platforms/cell/axon_msi.c
@@ -223,6 +223,7 @@ static int setup_msi_msg_address(struct pci_dev *dev, struct msi_msg *msg)
if (!prop) {
dev_dbg(&dev->dev,
"axon_msi: no msi-address-(32|64) properties found\n");
+ of_node_put(dn);
return -ENOENT;
}
diff --git a/arch/powerpc/platforms/cell/cbe_regs.c b/arch/powerpc/platforms/cell/cbe_regs.c
index 1c4c53bec66c..03512a41bd7e 100644
--- a/arch/powerpc/platforms/cell/cbe_regs.c
+++ b/arch/powerpc/platforms/cell/cbe_regs.c
@@ -23,7 +23,7 @@
* Current implementation uses "cpu" nodes. We build our own mapping
* array of cpu numbers to cpu nodes locally for now to allow interrupt
* time code to have a fast path rather than call of_get_cpu_node(). If
- * we implement cpu hotplug, we'll have to install an appropriate norifier
+ * we implement cpu hotplug, we'll have to install an appropriate notifier
* in order to release references to the cpu going away
*/
static struct cbe_regs_map
diff --git a/arch/powerpc/platforms/cell/iommu.c b/arch/powerpc/platforms/cell/iommu.c
index 25e726bf0172..3f141cf5e580 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -582,7 +582,7 @@ static int cell_of_bus_notify(struct notifier_block *nb, unsigned long action,
{
struct device *dev = data;
- /* We are only intereted in device addition */
+ /* We are only interested in device addition */
if (action != BUS_NOTIFY_ADD_DEVICE)
return 0;
diff --git a/arch/powerpc/platforms/cell/spider-pci.c b/arch/powerpc/platforms/cell/spider-pci.c
index a1c293f42a1f..3a2ea8376e32 100644
--- a/arch/powerpc/platforms/cell/spider-pci.c
+++ b/arch/powerpc/platforms/cell/spider-pci.c
@@ -81,7 +81,7 @@ static int __init spiderpci_pci_setup_chip(struct pci_controller *phb,
/*
* On CellBlade, we can't know that which XDR memory is used by
* kmalloc() to allocate dummy_page_va.
- * In order to imporve the performance, the XDR which is used to
+ * In order to improve the performance, the XDR which is used to
* allocate dummy_page_va is the nearest the spider-pci.
* We have to select the CBE which is the nearest the spider-pci
* to allocate memory from the best XDR, but I don't know that
diff --git a/arch/powerpc/platforms/cell/spu_manage.c b/arch/powerpc/platforms/cell/spu_manage.c
index ddf8742f09a3..080ed2d2c682 100644
--- a/arch/powerpc/platforms/cell/spu_manage.c
+++ b/arch/powerpc/platforms/cell/spu_manage.c
@@ -457,7 +457,7 @@ static void __init init_affinity_node(int cbe)
/*
* Walk through each phandle in vicinity property of the spu
- * (tipically two vicinity phandles per spe node)
+ * (typically two vicinity phandles per spe node)
*/
for (i = 0; i < (lenp / sizeof(phandle)); i++) {
if (vic_handles[i] == avoid_ph)
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 4c702192412f..96f5e2e43018 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -660,6 +660,7 @@ spufs_init_isolated_loader(void)
return;
loader = of_get_property(dn, "loader", &size);
+ of_node_put(dn);
if (!loader)
return;
diff --git a/arch/powerpc/platforms/powermac/low_i2c.c b/arch/powerpc/platforms/powermac/low_i2c.c
index df89d916236d..9aded0188ce8 100644
--- a/arch/powerpc/platforms/powermac/low_i2c.c
+++ b/arch/powerpc/platforms/powermac/low_i2c.c
@@ -1472,7 +1472,7 @@ int __init pmac_i2c_init(void)
smu_i2c_probe();
#endif
- /* Now add plaform functions for some known devices */
+ /* Now add platform functions for some known devices */
pmac_i2c_devscan(pmac_i2c_dev_create);
return 0;
diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 89e22c460ebf..33f7b959c810 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -390,7 +390,7 @@ static struct eeh_dev *pnv_eeh_probe(struct pci_dev *pdev)
* should be blocked until PE reset. MMIO access is dropped
* by hardware certainly. In order to drop PCI config requests,
* one more flag (EEH_PE_CFG_RESTRICTED) is introduced, which
- * will be checked in the backend for PE state retrival. If
+ * will be checked in the backend for PE state retrieval. If
* the PE becomes frozen for the first time and the flag has
* been set for the PE, we will set EEH_PE_CFG_BLOCKED for
* that PE to block its config space.
@@ -981,7 +981,7 @@ static int pnv_eeh_do_af_flr(struct pci_dn *pdn, int option)
case EEH_RESET_FUNDAMENTAL:
/*
* Wait for Transaction Pending bit to clear. A word-aligned
- * test is used, so we use the conrol offset rather than status
+ * test is used, so we use the control offset rather than status
* and shift the test bit to match.
*/
pnv_eeh_wait_for_pending(pdn, "AF",
@@ -1048,7 +1048,7 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
* frozen state during PE reset. However, the good idea here from
* benh is to keep frozen state before we get PE reset done completely
* (until BAR restore). With the frozen state, HW drops illegal IO
- * or MMIO access, which can incur recrusive frozen PE during PE
+ * or MMIO access, which can incur recursive frozen PE during PE
* reset. The side effect is that EEH core has to clear the frozen
* state explicitly after BAR restore.
*/
@@ -1095,8 +1095,8 @@ static int pnv_eeh_reset(struct eeh_pe *pe, int option)
* bus is behind a hotplug slot and it will use the slot provided
* reset methods to prevent spurious hotplug events during the reset.
*
- * Fundemental resets need to be handled internally to EEH since the
- * PCI core doesn't really have a concept of a fundemental reset,
+ * Fundamental resets need to be handled internally to EEH since the
+ * PCI core doesn't really have a concept of a fundamental reset,
* mainly because there's no standard way to generate one. Only a
* few devices require an FRESET so it should be fine.
*/
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index a6677a111aca..6f94b808dd39 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -112,7 +112,7 @@ static int __init pnv_save_sprs_for_deep_states(void)
if (rc != 0)
return rc;
- /* Only p8 needs to set extra HID regiters */
+ /* Only p8 needs to set extra HID registers */
if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
uint64_t hid1_val = mfspr(SPRN_HID1);
uint64_t hid4_val = mfspr(SPRN_HID4);
@@ -1204,7 +1204,7 @@ static void __init pnv_arch300_idle_init(void)
* The idle code does not deal with TB loss occurring
* in a shallower state than SPR loss, so force it to
* behave like SPRs are lost if TB is lost. POWER9 would
- * never encouter this, but a POWER8 core would if it
+ * never encounter this, but a POWER8 core would if it
* implemented the stop instruction. So this is for forward
* compatibility.
*/
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 28b009b46464..27c936075031 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -289,7 +289,7 @@ int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count)
* be used by a function depends on how many functions exist
* on the device. The NPU needs to be configured to know how
* many bits are available to PASIDs and how many are to be
- * used by the function BDF indentifier.
+ * used by the function BDF identifier.
*
* We only support one AFU-carrying function for now.
*/
diff --git a/arch/powerpc/platforms/powernv/opal-fadump.c b/arch/powerpc/platforms/powernv/opal-fadump.c
index 9d74d3950a52..c1bcd2d4826e 100644
--- a/arch/powerpc/platforms/powernv/opal-fadump.c
+++ b/arch/powerpc/platforms/powernv/opal-fadump.c
@@ -206,7 +206,7 @@ static u64 opal_fadump_init_mem_struct(struct fw_dump *fadump_conf)
opal_fdm->region_cnt = cpu_to_be16(reg_cnt);
/*
- * Kernel metadata is passed to f/w and retrieved in capture kerenl.
+ * Kernel metadata is passed to f/w and retrieved in capture kernel.
* So, use it to save fadump header address instead of calculating it.
*/
opal_fdm->fadumphdr_addr = cpu_to_be64(be64_to_cpu(opal_fdm->rgn[0].dest) +
diff --git a/arch/powerpc/platforms/powernv/opal-lpc.c b/arch/powerpc/platforms/powernv/opal-lpc.c
index 5390c888db16..d129d6d45a50 100644
--- a/arch/powerpc/platforms/powernv/opal-lpc.c
+++ b/arch/powerpc/platforms/powernv/opal-lpc.c
@@ -197,7 +197,7 @@ static ssize_t lpc_debug_read(struct file *filp, char __user *ubuf,
/*
* Select access size based on count and alignment and
- * access type. IO and MEM only support byte acceses,
+ * access type. IO and MEM only support byte accesses,
* FW supports all 3.
*/
len = 1;
diff --git a/arch/powerpc/platforms/powernv/opal-memory-errors.c b/arch/powerpc/platforms/powernv/opal-memory-errors.c
index 1e8e17df9ce8..a1754a28265d 100644
--- a/arch/powerpc/platforms/powernv/opal-memory-errors.c
+++ b/arch/powerpc/platforms/powernv/opal-memory-errors.c
@@ -82,7 +82,7 @@ static DECLARE_WORK(mem_error_work, mem_error_handler);
/*
* opal_memory_err_event - notifier handler that queues up the opal message
- * to be preocessed later.
+ * to be processed later.
*/
static int opal_memory_err_event(struct notifier_block *nb,
unsigned long msg_type, void *msg)
diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
index 04155aaaadb1..fe3d111b881c 100644
--- a/arch/powerpc/platforms/powernv/pci-sriov.c
+++ b/arch/powerpc/platforms/powernv/pci-sriov.c
@@ -699,7 +699,7 @@ static int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
return -ENOSPC;
}
- /* allocate a contigious block of PEs for our VFs */
+ /* allocate a contiguous block of PEs for our VFs */
base_pe = pnv_ioda_alloc_pe(phb, num_vfs);
if (!base_pe) {
pci_err(pdev, "Unable to allocate PEs for %d VFs\n", num_vfs);
diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
index 3805ad13b8f3..d19305292e1e 100644
--- a/arch/powerpc/platforms/powernv/rng.c
+++ b/arch/powerpc/platforms/powernv/rng.c
@@ -29,15 +29,6 @@ struct powernv_rng {
static DEFINE_PER_CPU(struct powernv_rng *, powernv_rng);
-int powernv_hwrng_present(void)
-{
- struct powernv_rng *rng;
-
- rng = get_cpu_var(powernv_rng);
- put_cpu_var(rng);
- return rng != NULL;
-}
-
static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
{
unsigned long parity;
@@ -58,17 +49,6 @@ static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
return val;
}
-int powernv_get_random_real_mode(unsigned long *v)
-{
- struct powernv_rng *rng;
-
- rng = raw_cpu_read(powernv_rng);
-
- *v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
-
- return 1;
-}
-
static int powernv_get_random_darn(unsigned long *v)
{
unsigned long val;
@@ -105,12 +85,14 @@ int powernv_get_random_long(unsigned long *v)
{
struct powernv_rng *rng;
- rng = get_cpu_var(powernv_rng);
-
- *v = rng_whiten(rng, in_be64(rng->regs));
-
- put_cpu_var(rng);
-
+ if (mfmsr() & MSR_DR) {
+ rng = get_cpu_var(powernv_rng);
+ *v = rng_whiten(rng, in_be64(rng->regs));
+ put_cpu_var(rng);
+ } else {
+ rng = raw_cpu_read(powernv_rng);
+ *v = rng_whiten(rng, __raw_rm_readq(rng->regs_real));
+ }
return 1;
}
EXPORT_SYMBOL_GPL(powernv_get_random_long);
diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index 5ce924611b94..63ef61ed7597 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -364,7 +364,7 @@ static void __maybe_unused _dma_dump_region(const struct ps3_dma_region *r,
* @bus_addr: Starting ioc bus address of the area to map.
* @len: Length in bytes of the area to map.
* @link: A struct list_head used with struct ps3_dma_region.chunk_list, the
- * list of all chuncks owned by the region.
+ * list of all chunks owned by the region.
*
* This implementation uses a very simple dma page manager
* based on the dma_chunk structure. This scheme assumes
diff --git a/arch/powerpc/platforms/ps3/system-bus.c b/arch/powerpc/platforms/ps3/system-bus.c
index b637bf292047..2502e9b17df4 100644
--- a/arch/powerpc/platforms/ps3/system-bus.c
+++ b/arch/powerpc/platforms/ps3/system-bus.c
@@ -601,7 +601,7 @@ static dma_addr_t ps3_ioc0_map_page(struct device *_dev, struct page *page,
iopte_flag |= CBE_IOPTE_PP_W | CBE_IOPTE_SO_RW;
break;
default:
- /* not happned */
+ /* not happened */
BUG();
}
result = ps3_dma_map(dev->d_region, (unsigned long)ptr, size,
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 09fafcf2d3a0..f0b2bca750da 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -510,7 +510,7 @@ static int pseries_eeh_set_option(struct eeh_pe *pe, int option)
int ret = 0;
/*
- * When we're enabling or disabling EEH functioality on
+ * When we're enabling or disabling EEH functionality on
* the particular PE, the PE config address is possibly
* unavailable. Therefore, we have to figure it out from
* the FDT node.
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 4d991cf840d9..a90024697c11 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -701,6 +701,33 @@ struct iommu_table_ops iommu_table_lpar_multi_ops = {
.get = tce_get_pSeriesLP
};
+/*
+ * Find nearest ibm,dma-window (default DMA window) or direct DMA window or
+ * dynamic 64bit DMA window, walking up the device tree.
+ */
+static struct device_node *pci_dma_find(struct device_node *dn,
+ const __be32 **dma_window)
+{
+ const __be32 *dw = NULL;
+
+ for ( ; dn && PCI_DN(dn); dn = dn->parent) {
+ dw = of_get_property(dn, "ibm,dma-window", NULL);
+ if (dw) {
+ if (dma_window)
+ *dma_window = dw;
+ return dn;
+ }
+ dw = of_get_property(dn, DIRECT64_PROPNAME, NULL);
+ if (dw)
+ return dn;
+ dw = of_get_property(dn, DMA64_PROPNAME, NULL);
+ if (dw)
+ return dn;
+ }
+
+ return NULL;
+}
+
static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
{
struct iommu_table *tbl;
@@ -713,20 +740,10 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
pr_debug("pci_dma_bus_setup_pSeriesLP: setting up bus %pOF\n",
dn);
- /*
- * Find nearest ibm,dma-window (default DMA window), walking up the
- * device tree
- */
- for (pdn = dn; pdn != NULL; pdn = pdn->parent) {
- dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
- if (dma_window != NULL)
- break;
- }
+ pdn = pci_dma_find(dn, &dma_window);
- if (dma_window == NULL) {
+ if (dma_window == NULL)
pr_debug(" no ibm,dma-window property !\n");
- return;
- }
ppci = PCI_DN(pdn);
@@ -736,11 +753,13 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
if (!ppci->table_group) {
ppci->table_group = iommu_pseries_alloc_group(ppci->phb->node);
tbl = ppci->table_group->tables[0];
- iommu_table_setparms_lpar(ppci->phb, pdn, tbl,
- ppci->table_group, dma_window);
+ if (dma_window) {
+ iommu_table_setparms_lpar(ppci->phb, pdn, tbl,
+ ppci->table_group, dma_window);
- if (!iommu_init_table(tbl, ppci->phb->node, 0, 0))
- panic("Failed to initialize iommu table");
+ if (!iommu_init_table(tbl, ppci->phb->node, 0, 0))
+ panic("Failed to initialize iommu table");
+ }
iommu_register_group(ppci->table_group,
pci_domain_nr(bus), 0);
pr_debug(" created table: %p\n", ppci->table_group);
@@ -1233,7 +1252,7 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
bool default_win_removed = false, direct_mapping = false;
bool pmem_present;
struct pci_dn *pci = PCI_DN(pdn);
- struct iommu_table *tbl = pci->table_group->tables[0];
+ struct property *default_win = NULL;
dn = of_find_node_by_type(NULL, "ibm,pmemory");
pmem_present = dn != NULL;
@@ -1290,11 +1309,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
* for extensions presence.
*/
if (query.windows_available == 0) {
- struct property *default_win;
int reset_win_ext;
/* DDW + IOMMU on single window may fail if there is any allocation */
- if (iommu_table_in_use(tbl)) {
+ if (iommu_table_in_use(pci->table_group->tables[0])) {
dev_warn(&dev->dev, "current IOMMU table in use, can't be replaced.\n");
goto out_failed;
}
@@ -1430,16 +1448,18 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
pci->table_group->tables[1] = newtbl;
- /* Keep default DMA window stuct if removed */
- if (default_win_removed) {
- tbl->it_size = 0;
- vfree(tbl->it_map);
- tbl->it_map = NULL;
- }
-
set_iommu_table_base(&dev->dev, newtbl);
}
+ if (default_win_removed) {
+ iommu_tce_table_put(pci->table_group->tables[0]);
+ pci->table_group->tables[0] = NULL;
+
+ /* default_win is valid here because default_win_removed == true */
+ of_remove_property(pdn, default_win);
+ dev_info(&dev->dev, "Removed default DMA window for %pOF\n", pdn);
+ }
+
spin_lock(&dma_win_list_lock);
list_add(&window->list, &dma_win_list);
spin_unlock(&dma_win_list_lock);
@@ -1504,13 +1524,7 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
dn = pci_device_to_OF_node(dev);
pr_debug(" node is %pOF\n", dn);
- for (pdn = dn; pdn && PCI_DN(pdn) && !PCI_DN(pdn)->table_group;
- pdn = pdn->parent) {
- dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
- if (dma_window)
- break;
- }
-
+ pdn = pci_dma_find(dn, &dma_window);
if (!pdn || !PCI_DN(pdn)) {
printk(KERN_WARNING "pci_dma_dev_setup_pSeriesLP: "
"no DMA window found for pci dev=%s dn=%pOF\n",
@@ -1541,7 +1555,6 @@ static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
static bool iommu_bypass_supported_pSeriesLP(struct pci_dev *pdev, u64 dma_mask)
{
struct device_node *dn = pci_device_to_OF_node(pdev), *pdn;
- const __be32 *dma_window = NULL;
/* only attempt to use a new window if 64-bit DMA is requested */
if (dma_mask < DMA_BIT_MASK(64))
@@ -1555,13 +1568,7 @@ static bool iommu_bypass_supported_pSeriesLP(struct pci_dev *pdev, u64 dma_mask)
* search upwards in the tree until we either hit a dma-window
* property, OR find a parent with a table already allocated.
*/
- for (pdn = dn; pdn && PCI_DN(pdn) && !PCI_DN(pdn)->table_group;
- pdn = pdn->parent) {
- dma_window = of_get_property(pdn, "ibm,dma-window", NULL);
- if (dma_window)
- break;
- }
-
+ pdn = pci_dma_find(dn, NULL);
if (pdn && PCI_DN(pdn))
return enable_ddw(pdev, pdn);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index f27735f623ba..27bed0dd866e 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -658,7 +658,7 @@ static resource_size_t pseries_get_iov_fw_value(struct pci_dev *dev, int resno,
*/
num_res = of_read_number(&indexes[NUM_RES_PROPERTY], 1);
if (resno >= num_res)
- return 0; /* or an errror */
+ return 0; /* or an error */
i = START_OF_ENTRIES + NEXT_ENTRY * resno;
switch (value) {
@@ -762,7 +762,7 @@ static void pseries_pci_fixup_iov_resources(struct pci_dev *pdev)
if (!pdev->is_physfn)
return;
- /*Firmware must support open sriov otherwise dont configure*/
+ /*Firmware must support open sriov otherwise don't configure*/
indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
if (indexes)
of_pci_parse_iov_addrs(pdev, indexes);
diff --git a/arch/powerpc/platforms/pseries/vas-sysfs.c b/arch/powerpc/platforms/pseries/vas-sysfs.c
index ec65586cbeb3..241c84374045 100644
--- a/arch/powerpc/platforms/pseries/vas-sysfs.c
+++ b/arch/powerpc/platforms/pseries/vas-sysfs.c
@@ -76,7 +76,7 @@ struct vas_sysfs_entry {
* Create sysfs interface:
* /sys/devices/vas/vas0/gzip/default_capabilities
* This directory contains the following VAS GZIP capabilities
- * for the defaule credit type.
+ * for the default credit type.
* /sys/devices/vas/vas0/gzip/default_capabilities/nr_total_credits
* Total number of default credits assigned to the LPAR which
* can be changed with DLPAR operation.
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index ec643bbdb67f..500a1fc4a1d7 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -801,7 +801,7 @@ int vas_reconfig_capabilties(u8 type, int new_nr_creds)
atomic_set(&caps->nr_total_credits, new_nr_creds);
/*
* The total number of available credits may be decreased or
- * inceased with DLPAR operation. Means some windows have to be
+ * increased with DLPAR operation. Means some windows have to be
* closed / reopened. Hold the vas_pseries_mutex so that the
* the user space can not open new windows.
*/
diff --git a/arch/powerpc/sysdev/fsl_lbc.c b/arch/powerpc/sysdev/fsl_lbc.c
index 1985e067e952..18acfb4e82af 100644
--- a/arch/powerpc/sysdev/fsl_lbc.c
+++ b/arch/powerpc/sysdev/fsl_lbc.c
@@ -37,7 +37,7 @@ EXPORT_SYMBOL(fsl_lbc_ctrl_dev);
*
* This function converts a base address of lbc into the right format for the
* BR register. If the SOC has eLBC then it returns 32bit physical address
- * else it convers a 34bit local bus physical address to correct format of
+ * else it converts a 34bit local bus physical address to correct format of
* 32bit address for BR register (Example: MPC8641).
*/
u32 fsl_lbc_addr(phys_addr_t addr_base)
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index a97ce602394e..3c430a6a6a4e 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -218,7 +218,7 @@ static void setup_pci_atmu(struct pci_controller *hose)
* windows have implemented the default target value as 0xf
* for CCSR space.In all Freescale legacy devices the target
* of 0xf is reserved for local memory space. 9132 Rev1.0
- * now has local mempry space mapped to target 0x0 instead of
+ * now has local memory space mapped to target 0x0 instead of
* 0xf. Hence adding a workaround to remove the target 0xf
* defined for memory space from Inbound window attributes.
*/
@@ -520,6 +520,7 @@ int fsl_add_bridge(struct platform_device *pdev, int is_primary)
struct resource rsrc;
const int *bus_range;
u8 hdr_type, progif;
+ u32 class_code;
struct device_node *dev;
struct ccsr_pci __iomem *pci;
u16 temp;
@@ -593,6 +594,13 @@ int fsl_add_bridge(struct platform_device *pdev, int is_primary)
PPC_INDIRECT_TYPE_SURPRESS_PRIMARY_BUS;
if (fsl_pcie_check_link(hose))
hose->indirect_type |= PPC_INDIRECT_TYPE_NO_PCIE_LINK;
+ /* Fix Class Code to PCI_CLASS_BRIDGE_PCI_NORMAL for pre-3.0 controller */
+ if (in_be32(&pci->block_rev1) < PCIE_IP_REV_3_0) {
+ early_read_config_dword(hose, 0, 0, PCIE_FSL_CSR_CLASSCODE, &class_code);
+ class_code &= 0xff;
+ class_code |= PCI_CLASS_BRIDGE_PCI_NORMAL << 8;
+ early_write_config_dword(hose, 0, 0, PCIE_FSL_CSR_CLASSCODE, class_code);
+ }
} else {
/*
* Set PBFR(PCI Bus Function Register)[10] = 1 to
diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
index cdbde2e0c96e..093a875d7d1e 100644
--- a/arch/powerpc/sysdev/fsl_pci.h
+++ b/arch/powerpc/sysdev/fsl_pci.h
@@ -18,6 +18,7 @@ struct platform_device;
#define PCIE_LTSSM 0x0404 /* PCIE Link Training and Status */
#define PCIE_LTSSM_L0 0x16 /* L0 state */
+#define PCIE_FSL_CSR_CLASSCODE 0x474 /* FSL GPEX CSR */
#define PCIE_IP_REV_2_2 0x02080202 /* PCIE IP block version Rev2.2 */
#define PCIE_IP_REV_3_0 0x02080300 /* PCIE IP block version Rev3.0 */
#define PIWAR_EN 0x80000000 /* Enable */
diff --git a/arch/powerpc/sysdev/ge/ge_pic.c b/arch/powerpc/sysdev/ge/ge_pic.c
index 02553a8ce191..413b375c4d28 100644
--- a/arch/powerpc/sysdev/ge/ge_pic.c
+++ b/arch/powerpc/sysdev/ge/ge_pic.c
@@ -150,7 +150,7 @@ static struct irq_chip gef_pic_chip = {
};
-/* When an interrupt is being configured, this call allows some flexibilty
+/* When an interrupt is being configured, this call allows some flexibility
* in deciding which irq_chip structure is used
*/
static int gef_pic_host_map(struct irq_domain *h, unsigned int virq,
diff --git a/arch/powerpc/sysdev/mpic_msgr.c b/arch/powerpc/sysdev/mpic_msgr.c
index 36ec0bdd8b63..a25413826b63 100644
--- a/arch/powerpc/sysdev/mpic_msgr.c
+++ b/arch/powerpc/sysdev/mpic_msgr.c
@@ -99,7 +99,7 @@ void mpic_msgr_disable(struct mpic_msgr *msgr)
EXPORT_SYMBOL_GPL(mpic_msgr_disable);
/* The following three functions are used to compute the order and number of
- * the message register blocks. They are clearly very inefficent. However,
+ * the message register blocks. They are clearly very inefficient. However,
* they are called *only* a few times during device initialization.
*/
static unsigned int mpic_msgr_number_of_blocks(void)
diff --git a/arch/powerpc/sysdev/mpic_msi.c b/arch/powerpc/sysdev/mpic_msi.c
index f412d6ad0b66..9936c014ac7d 100644
--- a/arch/powerpc/sysdev/mpic_msi.c
+++ b/arch/powerpc/sysdev/mpic_msi.c
@@ -37,7 +37,7 @@ static int __init mpic_msi_reserve_u3_hwirqs(struct mpic *mpic)
/* Reserve source numbers we know are reserved in the HW.
*
* This is a bit of a mix of U3 and U4 reserves but that's going
- * to work fine, we have plenty enugh numbers left so let's just
+ * to work fine, we have plenty enough numbers left so let's just
* mark anything we don't like reserved.
*/
for (i = 0; i < 8; i++)
diff --git a/arch/powerpc/sysdev/mpic_timer.c b/arch/powerpc/sysdev/mpic_timer.c
index 444e9ce42d0a..b2f0a73e8f93 100644
--- a/arch/powerpc/sysdev/mpic_timer.c
+++ b/arch/powerpc/sysdev/mpic_timer.c
@@ -255,7 +255,7 @@ EXPORT_SYMBOL(mpic_start_timer);
/**
* mpic_stop_timer - stop hardware timer
- * @handle: the timer to be stoped
+ * @handle: the timer to be stopped
*
* The timer periodically generates an interrupt. Unless user stops the timer.
*/
diff --git a/arch/powerpc/sysdev/mpic_u3msi.c b/arch/powerpc/sysdev/mpic_u3msi.c
index 3f4841dfefb5..73d129594078 100644
--- a/arch/powerpc/sysdev/mpic_u3msi.c
+++ b/arch/powerpc/sysdev/mpic_u3msi.c
@@ -78,7 +78,7 @@ static u64 find_u4_magic_addr(struct pci_dev *pdev, unsigned int hwirq)
/* U4 PCIe MSIs need to write to the special register in
* the bridge that generates interrupts. There should be
- * theorically a register at 0xf8005000 where you just write
+ * theoretically a register at 0xf8005000 where you just write
* the MSI number and that triggers the right interrupt, but
* unfortunately, this is busted in HW, the bridge endian swaps
* the value and hits the wrong nibble in the register.
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index f940428ad13f..45f72fc715fc 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -617,7 +617,7 @@ bool __init xive_native_init(void)
xive_tima_os = r.start;
- /* Grab size of provisionning pages */
+ /* Grab size of provisioning pages */
xive_parse_provisioning(np);
/* Switch the XIVE to exploitation mode */
diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index b0d36e430dbc..013446f5cdbb 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -716,6 +716,7 @@ static bool __init xive_get_max_prio(u8 *max_prio)
}
reg = of_get_property(rootdn, "ibm,plat-res-int-priorities", &len);
+ of_node_put(rootdn);
if (!reg) {
pr_err("Failed to read 'ibm,plat-res-int-priorities' property\n");
return false;
diff --git a/arch/powerpc/xmon/ppc-opc.c b/arch/powerpc/xmon/ppc-opc.c
index dfb80810b16c..0774d711453e 100644
--- a/arch/powerpc/xmon/ppc-opc.c
+++ b/arch/powerpc/xmon/ppc-opc.c
@@ -408,7 +408,7 @@ const struct powerpc_operand powerpc_operands[] =
#define FXM4 FXM + 1
{ 0xff, 12, insert_fxm, extract_fxm,
PPC_OPERAND_OPTIONAL | PPC_OPERAND_OPTIONAL_VALUE},
- /* If the FXM4 operand is ommitted, use the sentinel value -1. */
+ /* If the FXM4 operand is omitted, use the sentinel value -1. */
{ -1, -1, NULL, NULL, 0},
/* The IMM20 field in an LI instruction. */
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index fd72753e8ad5..27da7d5c2024 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2024,7 +2024,7 @@ static void dump_206_sprs(void)
if (!cpu_has_feature(CPU_FTR_ARCH_206))
return;
- /* Actually some of these pre-date 2.06, but whatevs */
+ /* Actually some of these pre-date 2.06, but whatever */
printf("srr0 = %.16lx srr1 = %.16lx dsisr = %.8lx\n",
mfspr(SPRN_SRR0), mfspr(SPRN_SRR1), mfspr(SPRN_DSISR));
diff --git a/arch/riscv/boot/dts/starfive/jh7100.dtsi b/arch/riscv/boot/dts/starfive/jh7100.dtsi
index 69f22f9aad9d..f48e232a72a7 100644
--- a/arch/riscv/boot/dts/starfive/jh7100.dtsi
+++ b/arch/riscv/boot/dts/starfive/jh7100.dtsi
@@ -118,7 +118,7 @@ plic: interrupt-controller@c000000 {
interrupt-controller;
#address-cells = <0>;
#interrupt-cells = <1>;
- riscv,ndev = <127>;
+ riscv,ndev = <133>;
};
clkgen: clock-controller@11800000 {
diff --git a/arch/riscv/include/asm/cpu_ops.h b/arch/riscv/include/asm/cpu_ops.h
index 134590f1b843..aa128466c4d4 100644
--- a/arch/riscv/include/asm/cpu_ops.h
+++ b/arch/riscv/include/asm/cpu_ops.h
@@ -38,6 +38,7 @@ struct cpu_operations {
#endif
};
+extern const struct cpu_operations cpu_ops_spinwait;
extern const struct cpu_operations *cpu_ops[NR_CPUS];
void __init cpu_set_ops(int cpu);
diff --git a/arch/riscv/kernel/cpu_ops.c b/arch/riscv/kernel/cpu_ops.c
index 170d07e57721..f92c0e6eddb1 100644
--- a/arch/riscv/kernel/cpu_ops.c
+++ b/arch/riscv/kernel/cpu_ops.c
@@ -15,9 +15,7 @@
const struct cpu_operations *cpu_ops[NR_CPUS] __ro_after_init;
extern const struct cpu_operations cpu_ops_sbi;
-#ifdef CONFIG_RISCV_BOOT_SPINWAIT
-extern const struct cpu_operations cpu_ops_spinwait;
-#else
+#ifndef CONFIG_RISCV_BOOT_SPINWAIT
const struct cpu_operations cpu_ops_spinwait = {
.name = "",
.cpu_prepare = NULL,
diff --git a/arch/riscv/kernel/cpu_ops_spinwait.c b/arch/riscv/kernel/cpu_ops_spinwait.c
index 346847f6c41c..d98d19226b5f 100644
--- a/arch/riscv/kernel/cpu_ops_spinwait.c
+++ b/arch/riscv/kernel/cpu_ops_spinwait.c
@@ -11,6 +11,8 @@
#include <asm/sbi.h>
#include <asm/smp.h>
+#include "head.h"
+
const struct cpu_operations cpu_ops_spinwait;
void *__cpu_spinwait_stack_pointer[NR_CPUS] __section(".data");
void *__cpu_spinwait_task_pointer[NR_CPUS] __section(".data");
@@ -18,7 +20,7 @@ void *__cpu_spinwait_task_pointer[NR_CPUS] __section(".data");
static void cpu_update_secondary_bootdata(unsigned int cpuid,
struct task_struct *tidle)
{
- int hartid = cpuid_to_hartid_map(cpuid);
+ unsigned long hartid = cpuid_to_hartid_map(cpuid);
/*
* The hartid must be less than NR_CPUS to avoid out-of-bound access
@@ -27,7 +29,7 @@ static void cpu_update_secondary_bootdata(unsigned int cpuid,
* spinwait booting is not the recommended approach for any platforms
* booting Linux in S-mode and can be disabled in the future.
*/
- if (hartid == INVALID_HARTID || hartid >= NR_CPUS)
+ if (hartid == INVALID_HARTID || hartid >= (unsigned long) NR_CPUS)
return;
/* Make sure tidle is updated */
diff --git a/arch/riscv/kernel/crash_save_regs.S b/arch/riscv/kernel/crash_save_regs.S
index 7832fb763aba..b2a1908c0463 100644
--- a/arch/riscv/kernel/crash_save_regs.S
+++ b/arch/riscv/kernel/crash_save_regs.S
@@ -44,7 +44,7 @@ SYM_CODE_START(riscv_crash_save_regs)
REG_S t6, PT_T6(a0) /* x31 */
csrr t1, CSR_STATUS
- csrr t2, CSR_EPC
+ auipc t2, 0x0
csrr t3, CSR_TVAL
csrr t4, CSR_CAUSE
diff --git a/arch/riscv/kernel/machine_kexec.c b/arch/riscv/kernel/machine_kexec.c
index df8e24559035..ee79e6839b86 100644
--- a/arch/riscv/kernel/machine_kexec.c
+++ b/arch/riscv/kernel/machine_kexec.c
@@ -138,19 +138,37 @@ void machine_shutdown(void)
#endif
}
+/* Override the weak function in kernel/panic.c */
+void crash_smp_send_stop(void)
+{
+ static int cpus_stopped;
+
+ /*
+ * This function can be called twice in panic path, but obviously
+ * we execute this only once.
+ */
+ if (cpus_stopped)
+ return;
+
+ smp_send_stop();
+ cpus_stopped = 1;
+}
+
/*
* machine_crash_shutdown - Prepare to kexec after a kernel crash
*
* This function is called by crash_kexec just before machine_kexec
- * below and its goal is similar to machine_shutdown, but in case of
- * a kernel crash. Since we don't handle such cases yet, this function
- * is empty.
+ * and its goal is to shutdown non-crashing cpus and save registers.
*/
void
machine_crash_shutdown(struct pt_regs *regs)
{
+ local_irq_disable();
+
+ /* shutdown non-crashing cpus */
+ crash_smp_send_stop();
+
crash_save_cpu(regs, smp_processor_id());
- machine_shutdown();
pr_info("Starting crashdump kernel...\n");
}
@@ -171,7 +189,7 @@ machine_kexec(struct kimage *image)
struct kimage_arch *internal = &image->arch;
unsigned long jump_addr = (unsigned long) image->start;
unsigned long first_ind_entry = (unsigned long) &image->head;
- unsigned long this_cpu_id = smp_processor_id();
+ unsigned long this_cpu_id = __smp_processor_id();
unsigned long this_hart_id = cpuid_to_hartid_map(this_cpu_id);
unsigned long fdt_addr = internal->fdt_addr;
void *control_code_buffer = page_address(image->control_code_page);
diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c
index 7a057b5f0adc..c976a21cd4bd 100644
--- a/arch/riscv/kernel/probes/uprobes.c
+++ b/arch/riscv/kernel/probes/uprobes.c
@@ -59,8 +59,6 @@ int arch_uprobe_pre_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
instruction_pointer_set(regs, utask->xol_vaddr);
- regs->status &= ~SR_SPIE;
-
return 0;
}
@@ -72,8 +70,6 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
instruction_pointer_set(regs, utask->vaddr + auprobe->insn_size);
- regs->status |= SR_SPIE;
-
return 0;
}
@@ -111,8 +107,6 @@ void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
* address.
*/
instruction_pointer_set(regs, utask->vaddr);
-
- regs->status &= ~SR_SPIE;
}
bool arch_uretprobe_is_alive(struct return_instance *ret, enum rp_check ctx,
diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
index 8c475f4da308..ec486e5369d9 100644
--- a/arch/riscv/lib/uaccess.S
+++ b/arch/riscv/lib/uaccess.S
@@ -175,7 +175,7 @@ ENTRY(__asm_copy_from_user)
/* Exception fixup code */
10:
/* Disable access to user memory */
- csrs CSR_STATUS, t6
+ csrc CSR_STATUS, t6
mv a0, t5
ret
ENDPROC(__asm_copy_to_user)
@@ -227,7 +227,7 @@ ENTRY(__clear_user)
/* Exception fixup code */
11:
/* Disable access to user memory */
- csrs CSR_STATUS, t6
+ csrc CSR_STATUS, t6
mv a0, a1
ret
ENDPROC(__clear_user)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 39e2e1d0e94f..80b4a407b3f1 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -99,6 +99,10 @@ static void __init print_vm_layout(void)
(unsigned long)VMEMMAP_END);
print_mlm("vmalloc", (unsigned long)VMALLOC_START,
(unsigned long)VMALLOC_END);
+#ifdef CONFIG_64BIT
+ print_mlm("modules", (unsigned long)MODULES_VADDR,
+ (unsigned long)MODULES_END);
+#endif
print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
(unsigned long)high_memory);
if (IS_ENABLED(CONFIG_64BIT)) {
diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
index 40264f60b0da..f4073106e1f3 100644
--- a/arch/s390/include/asm/gmap.h
+++ b/arch/s390/include/asm/gmap.h
@@ -148,4 +148,6 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long dirty_bitmap[4],
unsigned long gaddr, unsigned long vmaddr);
int gmap_mark_unmergeable(void);
void s390_reset_acc(struct mm_struct *mm);
+void s390_unlist_old_asce(struct gmap *gmap);
+int s390_replace_asce(struct gmap *gmap);
#endif /* _ASM_S390_GMAP_H */
diff --git a/arch/s390/include/asm/kexec.h b/arch/s390/include/asm/kexec.h
index 63098df81c9f..d13bd221cd37 100644
--- a/arch/s390/include/asm/kexec.h
+++ b/arch/s390/include/asm/kexec.h
@@ -92,5 +92,8 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
const Elf_Shdr *relsec,
const Elf_Shdr *symtab);
#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image);
+#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
#endif
#endif /*_S390_KEXEC_H */
diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h
index 0bf06f1682d8..02462e7100c1 100644
--- a/arch/s390/include/asm/unwind.h
+++ b/arch/s390/include/asm/unwind.h
@@ -47,7 +47,7 @@ struct unwind_state {
static inline unsigned long unwind_recover_ret_addr(struct unwind_state *state,
unsigned long ip)
{
- ip = ftrace_graph_ret_addr(state->task, &state->graph_idx, ip, NULL);
+ ip = ftrace_graph_ret_addr(state->task, &state->graph_idx, ip, (void *)state->sp);
if (is_kretprobe_trampoline(ip))
ip = kretprobe_find_ret_addr(state->task, (void *)state->sp, &state->kr_cur);
return ip;
diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c
index 28124d0fa1d5..f8ebdd70dd31 100644
--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -199,7 +199,7 @@ static int copy_oldmem_user(void __user *dst, unsigned long src, size_t count)
} else {
len = count;
}
- rc = copy_to_user_real(dst, src, count);
+ rc = copy_to_user_real(dst, src, len);
if (rc)
return rc;
}
diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c
index 8f43575a4dd3..fc6d5f58debe 100644
--- a/arch/s390/kernel/machine_kexec_file.c
+++ b/arch/s390/kernel/machine_kexec_file.c
@@ -31,6 +31,7 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len)
const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
struct module_signature *ms;
unsigned long sig_len;
+ int ret;
/* Skip signature verification when not secure IPLed. */
if (!ipl_secure_flag)
@@ -65,11 +66,18 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len)
return -EBADMSG;
}
- return verify_pkcs7_signature(kernel, kernel_len,
- kernel + kernel_len, sig_len,
- VERIFY_USE_PLATFORM_KEYRING,
- VERIFYING_MODULE_SIGNATURE,
- NULL, NULL);
+ ret = verify_pkcs7_signature(kernel, kernel_len,
+ kernel + kernel_len, sig_len,
+ VERIFY_USE_SECONDARY_KEYRING,
+ VERIFYING_MODULE_SIGNATURE,
+ NULL, NULL);
+ if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING))
+ ret = verify_pkcs7_signature(kernel, kernel_len,
+ kernel + kernel_len, sig_len,
+ VERIFY_USE_PLATFORM_KEYRING,
+ VERIFYING_MODULE_SIGNATURE,
+ NULL, NULL);
+ return ret;
}
#endif /* CONFIG_KEXEC_SIG */
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 8bd42a20d924..88112065d941 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -528,12 +528,27 @@ static int handle_pv_uvc(struct kvm_vcpu *vcpu)
static int handle_pv_notification(struct kvm_vcpu *vcpu)
{
+ int ret;
+
if (vcpu->arch.sie_block->ipa == 0xb210)
return handle_pv_spx(vcpu);
if (vcpu->arch.sie_block->ipa == 0xb220)
return handle_pv_sclp(vcpu);
if (vcpu->arch.sie_block->ipa == 0xb9a4)
return handle_pv_uvc(vcpu);
+ if (vcpu->arch.sie_block->ipa >> 8 == 0xae) {
+ /*
+ * Besides external call, other SIGP orders also cause a
+ * 108 (pv notify) intercept. In contrast to external call,
+ * these orders need to be emulated and hence the appropriate
+ * place to handle them is in handle_instruction().
+ * So first try kvm_s390_handle_sigp_pei() and if that isn't
+ * successful, go on with handle_instruction().
+ */
+ ret = kvm_s390_handle_sigp_pei(vcpu);
+ if (!ret)
+ return ret;
+ }
return handle_instruction(vcpu);
}
diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
index cc7c9599f43e..8eee3fc414e5 100644
--- a/arch/s390/kvm/pv.c
+++ b/arch/s390/kvm/pv.c
@@ -161,10 +161,13 @@ int kvm_s390_pv_deinit_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
atomic_set(&kvm->mm->context.is_protected, 0);
KVM_UV_EVENT(kvm, 3, "PROTVIRT DESTROY VM: rc %x rrc %x", *rc, *rrc);
WARN_ONCE(cc, "protvirt destroy vm failed rc %x rrc %x", *rc, *rrc);
- /* Inteded memory leak on "impossible" error */
- if (!cc)
+ /* Intended memory leak on "impossible" error */
+ if (!cc) {
kvm_s390_pv_dealloc_vm(kvm);
- return cc ? -EIO : 0;
+ return 0;
+ }
+ s390_replace_asce(kvm->arch.gmap);
+ return -EIO;
}
int kvm_s390_pv_init_vm(struct kvm *kvm, u16 *rc, u16 *rrc)
diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index 8aaee2892ec3..cb747bf6c798 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -480,9 +480,9 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu)
struct kvm_vcpu *dest_vcpu;
u8 order_code = kvm_s390_get_base_disp_rs(vcpu, NULL);
- trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr);
-
if (order_code == SIGP_EXTERNAL_CALL) {
+ trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr);
+
dest_vcpu = kvm_get_vcpu_by_id(vcpu->kvm, cpu_addr);
BUG_ON(dest_vcpu == NULL);
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index b8ae4a4aa2ba..85cab61d87a9 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2735,3 +2735,89 @@ void s390_reset_acc(struct mm_struct *mm)
mmput(mm);
}
EXPORT_SYMBOL_GPL(s390_reset_acc);
+
+/**
+ * s390_unlist_old_asce - Remove the topmost level of page tables from the
+ * list of page tables of the gmap.
+ * @gmap: the gmap whose table is to be removed
+ *
+ * On s390x, KVM keeps a list of all pages containing the page tables of the
+ * gmap (the CRST list). This list is used at tear down time to free all
+ * pages that are now not needed anymore.
+ *
+ * This function removes the topmost page of the tree (the one pointed to by
+ * the ASCE) from the CRST list.
+ *
+ * This means that it will not be freed when the VM is torn down, and needs
+ * to be handled separately by the caller, unless a leak is actually
+ * intended. Notice that this function will only remove the page from the
+ * list, the page will still be used as a top level page table (and ASCE).
+ */
+void s390_unlist_old_asce(struct gmap *gmap)
+{
+ struct page *old;
+
+ old = virt_to_page(gmap->table);
+ spin_lock(&gmap->guest_table_lock);
+ list_del(&old->lru);
+ /*
+ * Sometimes the topmost page might need to be "removed" multiple
+ * times, for example if the VM is rebooted into secure mode several
+ * times concurrently, or if s390_replace_asce fails after calling
+ * s390_remove_old_asce and is attempted again later. In that case
+ * the old asce has been removed from the list, and therefore it
+ * will not be freed when the VM terminates, but the ASCE is still
+ * in use and still pointed to.
+ * A subsequent call to replace_asce will follow the pointer and try
+ * to remove the same page from the list again.
+ * Therefore it's necessary that the page of the ASCE has valid
+ * pointers, so list_del can work (and do nothing) without
+ * dereferencing stale or invalid pointers.
+ */
+ INIT_LIST_HEAD(&old->lru);
+ spin_unlock(&gmap->guest_table_lock);
+}
+EXPORT_SYMBOL_GPL(s390_unlist_old_asce);
+
+/**
+ * s390_replace_asce - Try to replace the current ASCE of a gmap with a copy
+ * @gmap: the gmap whose ASCE needs to be replaced
+ *
+ * If the allocation of the new top level page table fails, the ASCE is not
+ * replaced.
+ * In any case, the old ASCE is always removed from the gmap CRST list.
+ * Therefore the caller has to make sure to save a pointer to it
+ * beforehand, unless a leak is actually intended.
+ */
+int s390_replace_asce(struct gmap *gmap)
+{
+ unsigned long asce;
+ struct page *page;
+ void *table;
+
+ s390_unlist_old_asce(gmap);
+
+ page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER);
+ if (!page)
+ return -ENOMEM;
+ table = page_to_virt(page);
+ memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
+
+ /*
+ * The caller has to deal with the old ASCE, but here we make sure
+ * the new one is properly added to the CRST list, so that
+ * it will be freed when the VM is torn down.
+ */
+ spin_lock(&gmap->guest_table_lock);
+ list_add(&page->lru, &gmap->crst_list);
+ spin_unlock(&gmap->guest_table_lock);
+
+ /* Set new table origin while preserving existing ASCE control bits */
+ asce = (gmap->asce & ~_ASCE_ORIGIN) | __pa(table);
+ WRITE_ONCE(gmap->asce, asce);
+ WRITE_ONCE(gmap->mm->context.gmap_asce, asce);
+ WRITE_ONCE(gmap->table, table);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(s390_replace_asce);
diff --git a/arch/um/drivers/random.c b/arch/um/drivers/random.c
index 433a3f8f2ef3..32b3341fe970 100644
--- a/arch/um/drivers/random.c
+++ b/arch/um/drivers/random.c
@@ -28,7 +28,7 @@
* protects against a module being loaded twice at the same time.
*/
static int random_fd = -1;
-static struct hwrng hwrng = { 0, };
+static struct hwrng hwrng;
static DECLARE_COMPLETION(have_data);
static int rng_dev_read(struct hwrng *rng, void *buf, size_t max, bool block)
diff --git a/arch/um/include/asm/archrandom.h b/arch/um/include/asm/archrandom.h
new file mode 100644
index 000000000000..2f24cb96391d
--- /dev/null
+++ b/arch/um/include/asm/archrandom.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_UM_ARCHRANDOM_H__
+#define __ASM_UM_ARCHRANDOM_H__
+
+#include <linux/types.h>
+
+/* This is from <os.h>, but better not to #include that in a global header here. */
+ssize_t os_getrandom(void *buf, size_t len, unsigned int flags);
+
+static inline bool __must_check arch_get_random_long(unsigned long *v)
+{
+ return os_getrandom(v, sizeof(*v), 0) == sizeof(*v);
+}
+
+static inline bool __must_check arch_get_random_int(unsigned int *v)
+{
+ return os_getrandom(v, sizeof(*v), 0) == sizeof(*v);
+}
+
+static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
+{
+ return false;
+}
+
+static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
+{
+ return false;
+}
+
+#endif
diff --git a/arch/um/include/asm/xor.h b/arch/um/include/asm/xor.h
index 22b39de73c24..647fae200c5d 100644
--- a/arch/um/include/asm/xor.h
+++ b/arch/um/include/asm/xor.h
@@ -18,7 +18,7 @@
#undef XOR_SELECT_TEMPLATE
/* pick an arbitrary one - measuring isn't possible with inf-cpu */
#define XOR_SELECT_TEMPLATE(x) \
- (time_travel_mode == TT_MODE_INFCPU ? TT_CPU_INF_XOR_DEFAULT : x))
+ (time_travel_mode == TT_MODE_INFCPU ? TT_CPU_INF_XOR_DEFAULT : x)
#endif
#endif
diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index fafde1d5416e..0df646c6651e 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -11,6 +11,12 @@
#include <irq_user.h>
#include <longjmp.h>
#include <mm_id.h>
+/* This is to get size_t */
+#ifndef __UM_HOST__
+#include <linux/types.h>
+#else
+#include <sys/types.h>
+#endif
#define CATCH_EINTR(expr) while ((errno = 0, ((expr) < 0)) && (errno == EINTR))
@@ -243,6 +249,7 @@ extern void stack_protections(unsigned long address);
extern int raw(int fd);
extern void setup_machinename(char *machine_out);
extern void setup_hostinfo(char *buf, int len);
+extern ssize_t os_getrandom(void *buf, size_t len, unsigned int flags);
extern void os_dump_core(void) __attribute__ ((noreturn));
extern void um_early_printk(const char *s, unsigned int n);
extern void os_fix_helper_signals(void);
diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
index 9838967d0b2f..e0de60e503b9 100644
--- a/arch/um/kernel/um_arch.c
+++ b/arch/um/kernel/um_arch.c
@@ -16,6 +16,7 @@
#include <linux/sched/task.h>
#include <linux/kmsg_dump.h>
#include <linux/suspend.h>
+#include <linux/random.h>
#include <asm/processor.h>
#include <asm/cpufeature.h>
@@ -406,6 +407,8 @@ int __init __weak read_initrd(void)
void __init setup_arch(char **cmdline_p)
{
+ u8 rng_seed[32];
+
stack_protections((unsigned long) &init_thread_info);
setup_physmem(uml_physmem, uml_reserved, physmem_size, highmem);
mem_total_pages(physmem_size, iomem_size, highmem);
@@ -416,6 +419,11 @@ void __init setup_arch(char **cmdline_p)
strlcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
*cmdline_p = command_line;
setup_hostinfo(host_info, sizeof host_info);
+
+ if (os_getrandom(rng_seed, sizeof(rng_seed), 0) == sizeof(rng_seed)) {
+ add_bootloader_randomness(rng_seed, sizeof(rng_seed));
+ memzero_explicit(rng_seed, sizeof(rng_seed));
+ }
}
void __init check_bugs(void)
diff --git a/arch/um/os-Linux/util.c b/arch/um/os-Linux/util.c
index 41297ec404bf..fc0f2a9dee5a 100644
--- a/arch/um/os-Linux/util.c
+++ b/arch/um/os-Linux/util.c
@@ -14,6 +14,7 @@
#include <sys/wait.h>
#include <sys/mman.h>
#include <sys/utsname.h>
+#include <sys/random.h>
#include <init.h>
#include <os.h>
@@ -96,6 +97,11 @@ static inline void __attribute__ ((noreturn)) uml_abort(void)
exit(127);
}
+ssize_t os_getrandom(void *buf, size_t len, unsigned int flags)
+{
+ return getrandom(buf, len, flags);
+}
+
/*
* UML helper threads must not handle SIGWINCH/INT/TERM
*/
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ce1f5a876cfe..04a9865dbdbf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -272,6 +272,7 @@ config X86
select SYSCTL_EXCEPTION_TRACE
select THREAD_INFO_IN_TASK
select TRACE_IRQFLAGS_SUPPORT
+ select TRACE_IRQFLAGS_NMI_SUPPORT
select USER_STACKTRACE_SUPPORT
select VIRT_TO_BUS
select HAVE_ARCH_KCSAN if X86_64
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index d3a6f74a94bd..d4d6db4dde22 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -1,8 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-config TRACE_IRQFLAGS_NMI_SUPPORT
- def_bool y
-
config EARLY_PRINTK_USB
bool
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index b5aecb524a8a..ffec8bb01ba8 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -103,7 +103,7 @@ $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE
AFLAGS_header.o += -I$(objtree)/$(obj)
$(obj)/header.o: $(obj)/zoffset.h
-LDFLAGS_setup.elf := -m elf_i386 -T
+LDFLAGS_setup.elf := -m elf_i386 -z noexecstack -T
$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
$(call if_changed,ld)
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 6115274fe10f..c66ea96936de 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -69,6 +69,10 @@ LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
ifdef CONFIG_LD_ORPHAN_WARN
LDFLAGS_vmlinux += --orphan-handling=warn
endif
+LDFLAGS_vmlinux += -z noexecstack
+ifeq ($(CONFIG_LD_IS_BFD),y)
+LDFLAGS_vmlinux += $(call ld-option,--no-warn-rwx-segments)
+endif
LDFLAGS_vmlinux += -T
hostprogs := mkpiggy
diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 2831685adf6f..8ed4597fdf6a 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -61,9 +61,7 @@ sha256-ssse3-$(CONFIG_AS_SHA256_NI) += sha256_ni_asm.o
obj-$(CONFIG_CRYPTO_SHA512_SSSE3) += sha512-ssse3.o
sha512-ssse3-y := sha512-ssse3-asm.o sha512-avx-asm.o sha512-avx2-asm.o sha512_ssse3_glue.o
-obj-$(CONFIG_CRYPTO_BLAKE2S_X86) += blake2s-x86_64.o
-blake2s-x86_64-y := blake2s-shash.o
-obj-$(if $(CONFIG_CRYPTO_BLAKE2S_X86),y) += libblake2s-x86_64.o
+obj-$(CONFIG_CRYPTO_BLAKE2S_X86) += libblake2s-x86_64.o
libblake2s-x86_64-y := blake2s-core.o blake2s-glue.o
obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index 69853c13e8fb..aaba21230528 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -4,7 +4,6 @@
*/
#include <crypto/internal/blake2s.h>
-#include <crypto/internal/simd.h>
#include <linux/types.h>
#include <linux/jump_label.h>
@@ -33,7 +32,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
/* SIMD disables preemption, so relax after processing each page. */
BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
- if (!static_branch_likely(&blake2s_use_ssse3) || !crypto_simd_usable()) {
+ if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
blake2s_compress_generic(state, block, nblocks, inc);
return;
}
diff --git a/arch/x86/crypto/blake2s-shash.c b/arch/x86/crypto/blake2s-shash.c
deleted file mode 100644
index 59ae28abe35c..000000000000
--- a/arch/x86/crypto/blake2s-shash.c
+++ /dev/null
@@ -1,77 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0 OR MIT
-/*
- * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
- */
-
-#include <crypto/internal/blake2s.h>
-#include <crypto/internal/simd.h>
-#include <crypto/internal/hash.h>
-
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/sizes.h>
-
-#include <asm/cpufeature.h>
-#include <asm/processor.h>
-
-static int crypto_blake2s_update_x86(struct shash_desc *desc,
- const u8 *in, unsigned int inlen)
-{
- return crypto_blake2s_update(desc, in, inlen, false);
-}
-
-static int crypto_blake2s_final_x86(struct shash_desc *desc, u8 *out)
-{
- return crypto_blake2s_final(desc, out, false);
-}
-
-#define BLAKE2S_ALG(name, driver_name, digest_size) \
- { \
- .base.cra_name = name, \
- .base.cra_driver_name = driver_name, \
- .base.cra_priority = 200, \
- .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
- .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
- .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
- .base.cra_module = THIS_MODULE, \
- .digestsize = digest_size, \
- .setkey = crypto_blake2s_setkey, \
- .init = crypto_blake2s_init, \
- .update = crypto_blake2s_update_x86, \
- .final = crypto_blake2s_final_x86, \
- .descsize = sizeof(struct blake2s_state), \
- }
-
-static struct shash_alg blake2s_algs[] = {
- BLAKE2S_ALG("blake2s-128", "blake2s-128-x86", BLAKE2S_128_HASH_SIZE),
- BLAKE2S_ALG("blake2s-160", "blake2s-160-x86", BLAKE2S_160_HASH_SIZE),
- BLAKE2S_ALG("blake2s-224", "blake2s-224-x86", BLAKE2S_224_HASH_SIZE),
- BLAKE2S_ALG("blake2s-256", "blake2s-256-x86", BLAKE2S_256_HASH_SIZE),
-};
-
-static int __init blake2s_mod_init(void)
-{
- if (IS_REACHABLE(CONFIG_CRYPTO_HASH) && boot_cpu_has(X86_FEATURE_SSSE3))
- return crypto_register_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
- return 0;
-}
-
-static void __exit blake2s_mod_exit(void)
-{
- if (IS_REACHABLE(CONFIG_CRYPTO_HASH) && boot_cpu_has(X86_FEATURE_SSSE3))
- crypto_unregister_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
-}
-
-module_init(blake2s_mod_init);
-module_exit(blake2s_mod_exit);
-
-MODULE_ALIAS_CRYPTO("blake2s-128");
-MODULE_ALIAS_CRYPTO("blake2s-128-x86");
-MODULE_ALIAS_CRYPTO("blake2s-160");
-MODULE_ALIAS_CRYPTO("blake2s-160-x86");
-MODULE_ALIAS_CRYPTO("blake2s-224");
-MODULE_ALIAS_CRYPTO("blake2s-224-x86");
-MODULE_ALIAS_CRYPTO("blake2s-256");
-MODULE_ALIAS_CRYPTO("blake2s-256-x86");
-MODULE_LICENSE("GPL v2");
diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index eeadbd7d92cc..ca2fe186994b 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -11,12 +11,13 @@ CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE)
CFLAGS_common.o += -fno-stack-protector
-obj-y := entry.o entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
+obj-y := entry.o entry_$(BITS).o syscall_$(BITS).o
obj-y += common.o
obj-y += vdso/
obj-y += vsyscall/
+obj-$(CONFIG_PREEMPTION) += thunk_$(BITS).o
obj-$(CONFIG_IA32_EMULATION) += entry_64_compat.o syscall_32.o
obj-$(CONFIG_X86_X32_ABI) += syscall_x32.o
diff --git a/arch/x86/entry/thunk_32.S b/arch/x86/entry/thunk_32.S
index 7591bab060f7..ff6e7003da97 100644
--- a/arch/x86/entry/thunk_32.S
+++ b/arch/x86/entry/thunk_32.S
@@ -29,10 +29,8 @@ SYM_CODE_START_NOALIGN(\name)
SYM_CODE_END(\name)
.endm
-#ifdef CONFIG_PREEMPTION
THUNK preempt_schedule_thunk, preempt_schedule
THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
EXPORT_SYMBOL(preempt_schedule_thunk)
EXPORT_SYMBOL(preempt_schedule_notrace_thunk)
-#endif
diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index 505b488fcc65..f38b07d2768b 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -31,14 +31,11 @@ SYM_FUNC_END(\name)
_ASM_NOKPROBE(\name)
.endm
-#ifdef CONFIG_PREEMPTION
THUNK preempt_schedule_thunk, preempt_schedule
THUNK preempt_schedule_notrace_thunk, preempt_schedule_notrace
EXPORT_SYMBOL(preempt_schedule_thunk)
EXPORT_SYMBOL(preempt_schedule_notrace_thunk)
-#endif
-#ifdef CONFIG_PREEMPTION
SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore)
popq %r11
popq %r10
@@ -53,4 +50,3 @@ SYM_CODE_START_LOCAL_NOALIGN(__thunk_restore)
RET
_ASM_NOKPROBE(__thunk_restore)
SYM_CODE_END(__thunk_restore)
-#endif
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index e893af5aa8f5..d80a133093a2 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -179,7 +179,7 @@ quiet_cmd_vdso = VDSO $@
sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@'
VDSO_LDFLAGS = -shared --hash-style=both --build-id=sha1 \
- $(call ld-option, --eh-frame-hdr) -Bsymbolic
+ $(call ld-option, --eh-frame-hdr) -Bsymbolic -z noexecstack
GCOV_PROFILE := n
quiet_cmd_vdso_and_check = VDSO $@
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 1f156098a5bf..4f70fb6c2c1e 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -769,6 +769,7 @@ void intel_pmu_lbr_disable_all(void)
void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
{
unsigned long mask = x86_pmu.lbr_nr - 1;
+ struct perf_branch_entry *br = cpuc->lbr_entries;
u64 tos = intel_pmu_lbr_tos();
int i;
@@ -784,15 +785,11 @@ void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
rdmsrl(x86_pmu.lbr_from + lbr_idx, msr_lastbranch.lbr);
- cpuc->lbr_entries[i].from = msr_lastbranch.from;
- cpuc->lbr_entries[i].to = msr_lastbranch.to;
- cpuc->lbr_entries[i].mispred = 0;
- cpuc->lbr_entries[i].predicted = 0;
- cpuc->lbr_entries[i].in_tx = 0;
- cpuc->lbr_entries[i].abort = 0;
- cpuc->lbr_entries[i].cycles = 0;
- cpuc->lbr_entries[i].type = 0;
- cpuc->lbr_entries[i].reserved = 0;
+ perf_clear_branch_entry_bitfields(br);
+
+ br->from = msr_lastbranch.from;
+ br->to = msr_lastbranch.to;
+ br++;
}
cpuc->lbr_stack.nr = i;
cpuc->lbr_stack.hw_idx = tos;
@@ -807,6 +804,7 @@ void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
{
bool need_info = false, call_stack = false;
unsigned long mask = x86_pmu.lbr_nr - 1;
+ struct perf_branch_entry *br = cpuc->lbr_entries;
u64 tos = intel_pmu_lbr_tos();
int i;
int out = 0;
@@ -878,15 +876,14 @@ void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
if (abort && x86_pmu.lbr_double_abort && out > 0)
out--;
- cpuc->lbr_entries[out].from = from;
- cpuc->lbr_entries[out].to = to;
- cpuc->lbr_entries[out].mispred = mis;
- cpuc->lbr_entries[out].predicted = pred;
- cpuc->lbr_entries[out].in_tx = in_tx;
- cpuc->lbr_entries[out].abort = abort;
- cpuc->lbr_entries[out].cycles = cycles;
- cpuc->lbr_entries[out].type = 0;
- cpuc->lbr_entries[out].reserved = 0;
+ perf_clear_branch_entry_bitfields(br+out);
+ br[out].from = from;
+ br[out].to = to;
+ br[out].mispred = mis;
+ br[out].predicted = pred;
+ br[out].in_tx = in_tx;
+ br[out].abort = abort;
+ br[out].cycles = cycles;
out++;
}
cpuc->lbr_stack.nr = out;
@@ -951,6 +948,8 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
to = rdlbr_to(i, lbr);
info = rdlbr_info(i, lbr);
+ perf_clear_branch_entry_bitfields(e);
+
e->from = from;
e->to = to;
e->mispred = get_lbr_mispred(info);
@@ -959,7 +958,6 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
e->abort = !!(info & LBR_INFO_ABORT);
e->cycles = get_lbr_cycles(info);
e->type = get_lbr_br_type(info);
- e->reserved = 0;
}
cpuc->lbr_stack.nr = i;
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 6ad8d946cd3e..5ec359c1b50c 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -193,6 +193,12 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
const Elf_Shdr *relsec,
const Elf_Shdr *symtab);
#define arch_kexec_apply_relocations_add arch_kexec_apply_relocations_add
+
+void *arch_kexec_kernel_image_load(struct kimage *image);
+#define arch_kexec_kernel_image_load arch_kexec_kernel_image_load
+
+int arch_kimage_file_post_load_cleanup(struct kimage *image);
+#define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
#endif
#endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9fdaa847d4b6..35c7a1fce8ea 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -508,6 +508,7 @@ struct kvm_pmu {
unsigned nr_arch_fixed_counters;
unsigned available_event_types;
u64 fixed_ctr_ctrl;
+ u64 fixed_ctr_ctrl_mask;
u64 global_ctrl;
u64 global_status;
u64 counter_bitmask[2];
@@ -1588,7 +1589,7 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
#define kvm_arch_pmi_in_guest(vcpu) \
((vcpu) && (vcpu)->arch.handling_intr_from_guest)
-void kvm_mmu_x86_module_init(void);
+void __init kvm_mmu_x86_module_init(void);
int kvm_mmu_vendor_module_init(void);
void kvm_mmu_vendor_module_exit(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index fa625b2a8a93..a80329bd0004 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -152,7 +152,7 @@ void __init check_bugs(void)
/*
* spectre_v2_user_select_mitigation() relies on the state set by
* retbleed_select_mitigation(); specifically the STIBP selection is
- * forced for UNRET.
+ * forced for UNRET or IBPB.
*/
spectre_v2_user_select_mitigation();
ssb_select_mitigation();
@@ -1172,7 +1172,8 @@ spectre_v2_user_select_mitigation(void)
boot_cpu_has(X86_FEATURE_AMD_STIBP_ALWAYS_ON))
mode = SPECTRE_V2_USER_STRICT_PREFERRED;
- if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET) {
+ if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
+ retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
if (mode != SPECTRE_V2_USER_STRICT &&
mode != SPECTRE_V2_USER_STRICT_PREFERRED)
pr_info("Selecting STIBP always-on mode to complement retbleed mitigation\n");
@@ -2353,10 +2354,11 @@ static ssize_t srbds_show_state(char *buf)
static ssize_t retbleed_show_state(char *buf)
{
- if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET) {
+ if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
+ retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
- return sprintf(buf, "Vulnerable: untrained return thunk on non-Zen uarch\n");
+ return sprintf(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n");
return sprintf(buf, "%s; SMT %s\n",
retbleed_strings[retbleed_mitigation],
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 2c87d62f191e..ae7d4c85f4f4 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1145,22 +1145,23 @@ static void bus_lock_init(void)
{
u64 val;
- /*
- * Warn and fatal are handled by #AC for split lock if #AC for
- * split lock is supported.
- */
- if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT) ||
- (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
- (sld_state == sld_warn || sld_state == sld_fatal)) ||
- sld_state == sld_off)
+ if (!boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
return;
- /*
- * Enable #DB for bus lock. All bus locks are handled in #DB except
- * split locks are handled in #AC in the fatal case.
- */
rdmsrl(MSR_IA32_DEBUGCTLMSR, val);
- val |= DEBUGCTLMSR_BUS_LOCK_DETECT;
+
+ if ((boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+ (sld_state == sld_warn || sld_state == sld_fatal)) ||
+ sld_state == sld_off) {
+ /*
+ * Warn and fatal are handled by #AC for split lock if #AC for
+ * split lock is supported.
+ */
+ val &= ~DEBUGCTLMSR_BUS_LOCK_DETECT;
+ } else {
+ val |= DEBUGCTLMSR_BUS_LOCK_DETECT;
+ }
+
wrmsrl(MSR_IA32_DEBUGCTLMSR, val);
}
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 6892ca67d9c6..b6d7ece7bf51 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -93,6 +93,7 @@ static int ftrace_verify_code(unsigned long ip, const char *old_code)
/* Make sure it is what we expect it to be */
if (memcmp(cur_code, old_code, MCOUNT_INSN_SIZE) != 0) {
+ ftrace_expected = old_code;
WARN_ON(1);
return -EINVAL;
}
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 7c4ab8870da4..74167dc5f55e 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -814,16 +814,20 @@ set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
{
- if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
- kcb->kprobe_status = KPROBE_HIT_SSDONE;
- cur->post_handler(cur, regs, 0);
- }
-
/* Restore back the original saved kprobes variables and continue. */
- if (kcb->kprobe_status == KPROBE_REENTER)
+ if (kcb->kprobe_status == KPROBE_REENTER) {
+ /* This will restore both kcb and current_kprobe */
restore_previous_kprobe(kcb);
- else
+ } else {
+ /*
+ * Always update the kcb status because
+ * reset_curent_kprobe() doesn't update kcb.
+ */
+ kcb->kprobe_status = KPROBE_HIT_SSDONE;
+ if (cur->post_handler)
+ cur->post_handler(cur, regs, 0);
reset_current_kprobe();
+ }
}
NOKPROBE_SYMBOL(kprobe_post_process);
diff --git a/arch/x86/kernel/pmem.c b/arch/x86/kernel/pmem.c
index 6b07faaa1579..23154d24b117 100644
--- a/arch/x86/kernel/pmem.c
+++ b/arch/x86/kernel/pmem.c
@@ -27,6 +27,11 @@ static __init int register_e820_pmem(void)
* simply here to trigger the module to load on demand.
*/
pdev = platform_device_alloc("e820_pmem", -1);
- return platform_device_add(pdev);
+
+ rc = platform_device_add(pdev);
+ if (rc)
+ platform_device_put(pdev);
+
+ return rc;
}
device_initcall(register_e820_pmem);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 622dc3673c37..8011536ba5c4 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -824,6 +824,10 @@ static void amd_e400_idle(void)
*/
static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
{
+ /* User has disallowed the use of MWAIT. Fallback to HALT */
+ if (boot_option_idle_override == IDLE_NOMWAIT)
+ return 0;
+
if (c->x86_vendor != X86_VENDOR_INTEL)
return 0;
@@ -932,9 +936,8 @@ static int __init idle_setup(char *str)
} else if (!strcmp(str, "nomwait")) {
/*
* If the boot option of "idle=nomwait" is added,
- * it means that mwait will be disabled for CPU C2/C3
- * states. In such case it won't touch the variable
- * of boot_option_idle_override.
+ * it means that mwait will be disabled for CPU C1/C2/C3
+ * states.
*/
boot_option_idle_override = IDLE_NOMWAIT;
} else
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index f8382abe22ff..aa907cec0918 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1687,16 +1687,6 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
case VCPU_SREG_TR:
if (seg_desc.s || (seg_desc.type != 1 && seg_desc.type != 9))
goto exception;
- if (!seg_desc.p) {
- err_vec = NP_VECTOR;
- goto exception;
- }
- old_desc = seg_desc;
- seg_desc.type |= 2; /* busy */
- ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc,
- sizeof(seg_desc), &ctxt->exception);
- if (ret != X86EMUL_CONTINUE)
- return ret;
break;
case VCPU_SREG_LDTR:
if (seg_desc.s || seg_desc.type != 2)
@@ -1734,8 +1724,17 @@ static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
if (ret != X86EMUL_CONTINUE)
return ret;
if (emul_is_noncanonical_address(get_desc_base(&seg_desc) |
- ((u64)base3 << 32), ctxt))
- return emulate_gp(ctxt, 0);
+ ((u64)base3 << 32), ctxt))
+ return emulate_gp(ctxt, err_code);
+ }
+
+ if (seg == VCPU_SREG_TR) {
+ old_desc = seg_desc;
+ seg_desc.type |= 2; /* busy */
+ ret = ctxt->ops->cmpxchg_emulated(ctxt, desc_addr, &old_desc, &seg_desc,
+ sizeof(seg_desc), &ctxt->exception);
+ if (ret != X86EMUL_CONTINUE)
+ return ret;
}
load:
ctxt->ops->set_segment(ctxt, selector, &seg_desc, base3, seg);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7f9c3b0fcd0b..da7a25d41a97 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6264,7 +6264,7 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
* nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as
* its default value of -1 is technically undefined behavior for a boolean.
*/
-void kvm_mmu_x86_module_init(void)
+void __init kvm_mmu_x86_module_init(void)
{
if (nx_huge_pages == -1)
__set_nx_huge_pages(get_nx_auto_mode());
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index beb3ce8d94eb..43f6a882615f 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -1044,7 +1044,14 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access))
continue;
- if (gfn != sp->gfns[i]) {
+ /*
+ * Drop the SPTE if the new protections would result in a RWX=0
+ * SPTE or if the gfn is changing. The RWX=0 case only affects
+ * EPT with execute-only support, i.e. EPT without an effective
+ * "present" bit, as all other paging modes will create a
+ * read-only SPTE if pte_access is zero.
+ */
+ if ((!pte_access && !shadow_present_mask) || gfn != sp->gfns[i]) {
drop_spte(vcpu->kvm, &sp->spt[i]);
flush = true;
continue;
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index e5c0b6db6f2c..8223a80802e7 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -128,6 +128,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
u64 spte = SPTE_MMU_PRESENT_MASK;
bool wrprot = false;
+ WARN_ON_ONCE(!pte_access && !shadow_present_mask);
+
if (sp->role.ad_disabled)
spte |= SPTE_TDP_AD_DISABLED_MASK;
else if (kvm_mmu_page_ad_need_write_protect(sp))
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 9a5963bff0f9..83e7325f7df4 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -292,7 +292,8 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *vcpu,
return false;
}
- if (CC(!kvm_is_valid_cr4(vcpu, save->cr4)))
+ /* Note, SVM doesn't have any additional restrictions on CR4. */
+ if (CC(!__kvm_is_valid_cr4(vcpu, save->cr4)))
return false;
if (CC(!kvm_valid_efer(vcpu, save->efer)))
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c667214c630b..e5a154f0d2ab 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -390,6 +390,10 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
*/
(void)svm_skip_emulated_instruction(vcpu);
rip = kvm_rip_read(vcpu);
+
+ if (boot_cpu_has(X86_FEATURE_NRIPS))
+ svm->vmcb->control.next_rip = rip;
+
svm->int3_rip = rip + svm->vmcb->save.cs.base;
svm->int3_injected = rip - old_rip;
}
@@ -3305,8 +3309,6 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
- BUG_ON(!(gif_set(svm)));
-
trace_kvm_inj_virq(vcpu->arch.interrupt.nr);
++vcpu->stat.irq_injections;
@@ -3615,6 +3617,18 @@ static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
vector = exitintinfo & SVM_EXITINTINFO_VEC_MASK;
type = exitintinfo & SVM_EXITINTINFO_TYPE_MASK;
+ /*
+ * If NextRIP isn't enabled, KVM must manually advance RIP prior to
+ * injecting the soft exception/interrupt. That advancement needs to
+ * be unwound if vectoring didn't complete. Note, the new event may
+ * not be the injected event, e.g. if KVM injected an INTn, the INTn
+ * hit a #NP in the guest, and the #NP encountered a #PF, the #NP will
+ * be the reported vectored event, but RIP still needs to be unwound.
+ */
+ if (int3_injected && type == SVM_EXITINTINFO_TYPE_EXEPT &&
+ kvm_is_linear_rip(vcpu, svm->int3_rip))
+ kvm_rip_write(vcpu, kvm_rip_read(vcpu) - int3_injected);
+
switch (type) {
case SVM_EXITINTINFO_TYPE_NMI:
vcpu->arch.nmi_injected = true;
@@ -3628,16 +3642,11 @@ static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
/*
* In case of software exceptions, do not reinject the vector,
- * but re-execute the instruction instead. Rewind RIP first
- * if we emulated INT3 before.
+ * but re-execute the instruction instead.
*/
- if (kvm_exception_is_soft(vector)) {
- if (vector == BP_VECTOR && int3_injected &&
- kvm_is_linear_rip(vcpu, svm->int3_rip))
- kvm_rip_write(vcpu,
- kvm_rip_read(vcpu) - int3_injected);
+ if (kvm_exception_is_soft(vector))
break;
- }
+
if (exitintinfo & SVM_EXITINTINFO_VALID_ERR) {
u32 err = svm->vmcb->control.exit_int_info_err;
kvm_requeue_exception_e(vcpu, vector, err);
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 28ccf25c4124..5c62e552082a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1224,7 +1224,7 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data)
BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) |
/* reserved */
BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56);
- u64 vmx_basic = vmx->nested.msrs.basic;
+ u64 vmx_basic = vmcs_config.nested.basic;
if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved))
return -EINVAL;
@@ -1247,36 +1247,42 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data)
return 0;
}
-static int
-vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+static void vmx_get_control_msr(struct nested_vmx_msrs *msrs, u32 msr_index,
+ u32 **low, u32 **high)
{
- u64 supported;
- u32 *lowp, *highp;
-
switch (msr_index) {
case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
- lowp = &vmx->nested.msrs.pinbased_ctls_low;
- highp = &vmx->nested.msrs.pinbased_ctls_high;
+ *low = &msrs->pinbased_ctls_low;
+ *high = &msrs->pinbased_ctls_high;
break;
case MSR_IA32_VMX_TRUE_PROCBASED_CTLS:
- lowp = &vmx->nested.msrs.procbased_ctls_low;
- highp = &vmx->nested.msrs.procbased_ctls_high;
+ *low = &msrs->procbased_ctls_low;
+ *high = &msrs->procbased_ctls_high;
break;
case MSR_IA32_VMX_TRUE_EXIT_CTLS:
- lowp = &vmx->nested.msrs.exit_ctls_low;
- highp = &vmx->nested.msrs.exit_ctls_high;
+ *low = &msrs->exit_ctls_low;
+ *high = &msrs->exit_ctls_high;
break;
case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
- lowp = &vmx->nested.msrs.entry_ctls_low;
- highp = &vmx->nested.msrs.entry_ctls_high;
+ *low = &msrs->entry_ctls_low;
+ *high = &msrs->entry_ctls_high;
break;
case MSR_IA32_VMX_PROCBASED_CTLS2:
- lowp = &vmx->nested.msrs.secondary_ctls_low;
- highp = &vmx->nested.msrs.secondary_ctls_high;
+ *low = &msrs->secondary_ctls_low;
+ *high = &msrs->secondary_ctls_high;
break;
default:
BUG();
}
+}
+
+static int
+vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+{
+ u32 *lowp, *highp;
+ u64 supported;
+
+ vmx_get_control_msr(&vmcs_config.nested, msr_index, &lowp, &highp);
supported = vmx_control_msr(*lowp, *highp);
@@ -1288,6 +1294,7 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32)))
return -EINVAL;
+ vmx_get_control_msr(&vmx->nested.msrs, msr_index, &lowp, &highp);
*lowp = data;
*highp = data >> 32;
return 0;
@@ -1301,10 +1308,8 @@ static int vmx_restore_vmx_misc(struct vcpu_vmx *vmx, u64 data)
BIT_ULL(28) | BIT_ULL(29) | BIT_ULL(30) |
/* reserved */
GENMASK_ULL(13, 9) | BIT_ULL(31);
- u64 vmx_misc;
-
- vmx_misc = vmx_control_msr(vmx->nested.msrs.misc_low,
- vmx->nested.msrs.misc_high);
+ u64 vmx_misc = vmx_control_msr(vmcs_config.nested.misc_low,
+ vmcs_config.nested.misc_high);
if (!is_bitwise_subset(vmx_misc, data, feature_and_reserved_bits))
return -EINVAL;
@@ -1332,10 +1337,8 @@ static int vmx_restore_vmx_misc(struct vcpu_vmx *vmx, u64 data)
static int vmx_restore_vmx_ept_vpid_cap(struct vcpu_vmx *vmx, u64 data)
{
- u64 vmx_ept_vpid_cap;
-
- vmx_ept_vpid_cap = vmx_control_msr(vmx->nested.msrs.ept_caps,
- vmx->nested.msrs.vpid_caps);
+ u64 vmx_ept_vpid_cap = vmx_control_msr(vmcs_config.nested.ept_caps,
+ vmcs_config.nested.vpid_caps);
/* Every bit is either reserved or a feature bit. */
if (!is_bitwise_subset(vmx_ept_vpid_cap, data, -1ULL))
@@ -1346,20 +1349,21 @@ static int vmx_restore_vmx_ept_vpid_cap(struct vcpu_vmx *vmx, u64 data)
return 0;
}
-static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+static u64 *vmx_get_fixed0_msr(struct nested_vmx_msrs *msrs, u32 msr_index)
{
- u64 *msr;
-
switch (msr_index) {
case MSR_IA32_VMX_CR0_FIXED0:
- msr = &vmx->nested.msrs.cr0_fixed0;
- break;
+ return &msrs->cr0_fixed0;
case MSR_IA32_VMX_CR4_FIXED0:
- msr = &vmx->nested.msrs.cr4_fixed0;
- break;
+ return &msrs->cr4_fixed0;
default:
BUG();
}
+}
+
+static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
+{
+ const u64 *msr = vmx_get_fixed0_msr(&vmcs_config.nested, msr_index);
/*
* 1 bits (which indicates bits which "must-be-1" during VMX operation)
@@ -1368,7 +1372,7 @@ static int vmx_restore_fixed0_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
if (!is_bitwise_subset(data, *msr, -1ULL))
return -EINVAL;
- *msr = data;
+ *vmx_get_fixed0_msr(&vmx->nested.msrs, msr_index) = data;
return 0;
}
@@ -1429,7 +1433,7 @@ int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
vmx->nested.msrs.vmcs_enum = data;
return 0;
case MSR_IA32_VMX_VMFUNC:
- if (data & ~vmx->nested.msrs.vmfunc_controls)
+ if (data & ~vmcs_config.nested.vmfunc_controls)
return -EINVAL;
vmx->nested.msrs.vmfunc_controls = data;
return 0;
@@ -2279,7 +2283,6 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_APIC_REGISTER_VIRT |
SECONDARY_EXEC_ENABLE_VMFUNC |
- SECONDARY_EXEC_TSC_SCALING |
SECONDARY_EXEC_DESC);
if (nested_cpu_has(vmcs12,
@@ -2618,6 +2621,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
+ intel_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu)) &&
WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
vmcs12->guest_ia32_perf_global_ctrl))) {
*entry_failure_code = ENTRY_FAIL_DEFAULT;
@@ -3378,10 +3382,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
if (likely(!evaluate_pending_interrupts) && kvm_vcpu_apicv_active(vcpu))
evaluate_pending_interrupts |= vmx_has_apicv_interrupt(vcpu);
- if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
+ if (!vmx->nested.nested_run_pending ||
+ !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
if (kvm_mpx_supported() &&
- !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
+ (!vmx->nested.nested_run_pending ||
+ !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
/*
@@ -4341,7 +4347,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat);
vcpu->arch.pat = vmcs12->host_ia32_pat;
}
- if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
+ if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL) &&
+ intel_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu)))
WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
vmcs12->host_ia32_perf_global_ctrl));
@@ -4967,20 +4974,25 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
| FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
/*
- * The Intel VMX Instruction Reference lists a bunch of bits that are
- * prerequisite to running VMXON, most notably cr4.VMXE must be set to
- * 1 (see vmx_is_valid_cr4() for when we allow the guest to set this).
- * Otherwise, we should fail with #UD. But most faulting conditions
- * have already been checked by hardware, prior to the VM-exit for
- * VMXON. We do test guest cr4.VMXE because processor CR4 always has
- * that bit set to 1 in non-root mode.
+ * Note, KVM cannot rely on hardware to perform the CR0/CR4 #UD checks
+ * that have higher priority than VM-Exit (see Intel SDM's pseudocode
+ * for VMXON), as KVM must load valid CR0/CR4 values into hardware while
+ * running the guest, i.e. KVM needs to check the _guest_ values.
+ *
+ * Rely on hardware for the other two pre-VM-Exit checks, !VM86 and
+ * !COMPATIBILITY modes. KVM may run the guest in VM86 to emulate Real
+ * Mode, but KVM will never take the guest out of those modes.
*/
- if (!kvm_read_cr4_bits(vcpu, X86_CR4_VMXE)) {
+ if (!nested_host_cr0_valid(vcpu, kvm_read_cr0(vcpu)) ||
+ !nested_host_cr4_valid(vcpu, kvm_read_cr4(vcpu))) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
- /* CPL=0 must be checked manually. */
+ /*
+ * CPL=0 and all other checks that are lower priority than VM-Exit must
+ * be checked manually.
+ */
if (vmx_get_cpl(vcpu)) {
kvm_inject_gp(vcpu, 0);
return 1;
@@ -6780,6 +6792,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
+ if (vmx_umip_emulated())
+ msrs->cr4_fixed1 |= X86_CR4_UMIP;
+
msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
}
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..129ae4e01f7c 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -281,7 +281,8 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *vcpu, unsigned long val)
u64 fixed0 = to_vmx(vcpu)->nested.msrs.cr4_fixed0;
u64 fixed1 = to_vmx(vcpu)->nested.msrs.cr4_fixed1;
- return fixed_bits_valid(val, fixed0, fixed1);
+ return fixed_bits_valid(val, fixed0, fixed1) &&
+ __kvm_is_valid_cr4(vcpu, val);
}
/* No difference in the restrictions on guest and host CR4 in VMX operation. */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b82b6709d7a8..8bd154f8c966 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -98,6 +98,9 @@ static bool intel_pmc_is_enabled(struct kvm_pmc *pmc)
{
struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+ if (!intel_pmu_has_perf_global_ctrl(pmu))
+ return true;
+
return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
}
@@ -212,7 +215,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
case MSR_CORE_PERF_GLOBAL_STATUS:
case MSR_CORE_PERF_GLOBAL_CTRL:
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
- ret = pmu->version > 1;
+ return intel_pmu_has_perf_global_ctrl(pmu);
break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
@@ -395,7 +398,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_CORE_PERF_FIXED_CTR_CTRL:
if (pmu->fixed_ctr_ctrl == data)
return 0;
- if (!(data & 0xfffffffffffff444ull)) {
+ if (!(data & pmu->fixed_ctr_ctrl_mask)) {
reprogram_fixed_counters(pmu, data);
return 0;
}
@@ -479,6 +482,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
struct kvm_cpuid_entry2 *entry;
union cpuid10_eax eax;
union cpuid10_edx edx;
+ int i;
pmu->nr_arch_gp_counters = 0;
pmu->nr_arch_fixed_counters = 0;
@@ -487,6 +491,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->version = 0;
pmu->reserved_bits = 0xffffffff00200000ull;
pmu->raw_event_mask = X86_RAW_EVENT_MASK;
+ pmu->global_ctrl_mask = ~0ull;
+ pmu->global_ovf_ctrl_mask = ~0ull;
+ pmu->fixed_ctr_ctrl_mask = ~0ull;
entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
if (!entry || !vcpu->kvm->arch.enable_pmu)
@@ -522,6 +529,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
setup_fixed_pmc_eventsel(pmu);
}
+ for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+ pmu->fixed_ctr_ctrl_mask &= ~(0xbull << (i * 4));
pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 597c3c08da50..3cfbb0413cd8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3230,8 +3230,8 @@ static bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
{
/*
* We operate under the default treatment of SMM, so VMX cannot be
- * enabled under SMM. Note, whether or not VMXE is allowed at all is
- * handled by kvm_is_valid_cr4().
+ * enabled under SMM. Note, whether or not VMXE is allowed at all,
+ * i.e. is a reserved bit, is handled by common x86 code.
*/
if ((cr4 & X86_CR4_VMXE) && is_smm(vcpu))
return false;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 1e7f9453894b..93aa1f3ea01e 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -92,6 +92,18 @@ union vmx_exit_reason {
u32 full;
};
+static inline bool intel_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu)
+{
+ /*
+ * Architecturally, Intel's SDM states that IA32_PERF_GLOBAL_CTRL is
+ * supported if "CPUID.0AH: EAX[7:0] > 0", i.e. if the PMU version is
+ * greater than zero. However, KVM only exposes and emulates the MSR
+ * to/for the guest if the guest PMU supports at least "Architectural
+ * Performance Monitoring Version 2".
+ */
+ return pmu->version > 1;
+}
+
#define vcpu_to_lbr_desc(vcpu) (&to_vmx(vcpu)->lbr_desc)
#define vcpu_to_lbr_records(vcpu) (&to_vmx(vcpu)->lbr_desc.records)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 65b0ec28bd52..0d6cea0d33a9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1066,7 +1066,7 @@ int kvm_emulate_xsetbv(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvm_emulate_xsetbv);
-bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
{
if (cr4 & cr4_reserved_bits)
return false;
@@ -1074,9 +1074,15 @@ bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
return false;
- return static_call(kvm_x86_is_valid_cr4)(vcpu, cr4);
+ return true;
+}
+EXPORT_SYMBOL_GPL(__kvm_is_valid_cr4);
+
+static bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+{
+ return __kvm_is_valid_cr4(vcpu, cr4) &&
+ static_call(kvm_x86_is_valid_cr4)(vcpu, cr4);
}
-EXPORT_SYMBOL_GPL(kvm_is_valid_cr4);
void kvm_post_set_cr4(struct kvm_vcpu *vcpu, unsigned long old_cr4, unsigned long cr4)
{
@@ -3220,17 +3226,20 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
/* only 0 or all 1s can be written to IA32_MCi_CTL
* some Linux kernels though clear bit 10 in bank 4 to
* workaround a BIOS/GART TBL issue on AMD K8s, ignore
- * this to avoid an uncatched #GP in the guest
+ * this to avoid an uncatched #GP in the guest.
+ *
+ * UNIXWARE clears bit 0 of MC1_CTL to ignore
+ * correctable, single-bit ECC data errors.
*/
if ((offset & 0x3) == 0 &&
- data != 0 && (data | (1 << 10)) != ~(u64)0)
- return -1;
+ data != 0 && (data | (1 << 10) | 1) != ~(u64)0)
+ return 1;
/* MCi_STATUS */
if (!msr_info->host_initiated &&
(offset & 0x3) == 1 && data != 0) {
if (!can_set_mci_status(vcpu))
- return -1;
+ return 1;
}
vcpu->arch.mce_banks[offset] = data;
@@ -3361,6 +3370,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
struct gfn_to_hva_cache *ghc = &vcpu->arch.st.cache;
struct kvm_steal_time __user *st;
struct kvm_memslots *slots;
+ gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
u64 steal;
u32 version;
@@ -3378,13 +3388,12 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
slots = kvm_memslots(vcpu->kvm);
if (unlikely(slots->generation != ghc->generation ||
+ gpa != ghc->gpa ||
kvm_is_error_hva(ghc->hva) || !ghc->memslot)) {
- gfn_t gfn = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
-
/* We rely on the fact that it fits in a single page. */
BUILD_BUG_ON((sizeof(*st) - 1) & KVM_STEAL_VALID_BITS);
- if (kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, gfn, sizeof(*st)) ||
+ if (kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, gpa, sizeof(*st)) ||
kvm_is_error_hva(ghc->hva) || !ghc->memslot)
return;
}
@@ -4371,10 +4380,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
if (r < sizeof(struct kvm_xsave))
r = sizeof(struct kvm_xsave);
break;
+ }
case KVM_CAP_PMU_CAPABILITY:
r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
break;
- }
case KVM_CAP_DISABLE_QUIRKS2:
r = KVM_X86_VALID_QUIRKS;
break;
@@ -4608,6 +4617,7 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
struct kvm_steal_time __user *st;
struct kvm_memslots *slots;
static const u8 preempted = KVM_VCPU_PREEMPTED;
+ gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
/*
* The vCPU can be marked preempted if and only if the VM-Exit was on
@@ -4635,6 +4645,7 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
slots = kvm_memslots(vcpu->kvm);
if (unlikely(slots->generation != ghc->generation ||
+ gpa != ghc->gpa ||
kvm_is_error_hva(ghc->hva) || !ghc->memslot))
return;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 588792f00334..80417761fe4a 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -407,7 +407,7 @@ static inline void kvm_machine_check(void)
void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
int kvm_spec_ctrl_test_value(u64 value);
-bool kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+bool __kvm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
int kvm_handle_memory_failure(struct kvm_vcpu *vcpu, int r,
struct x86_exception *e);
int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva);
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index dba2197c05c3..331310c29349 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -94,16 +94,18 @@ static bool ex_handler_copy(const struct exception_table_entry *fixup,
static bool ex_handler_msr(const struct exception_table_entry *fixup,
struct pt_regs *regs, bool wrmsr, bool safe, int reg)
{
- if (!safe && wrmsr &&
- pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n",
- (unsigned int)regs->cx, (unsigned int)regs->dx,
- (unsigned int)regs->ax, regs->ip, (void *)regs->ip))
+ if (__ONCE_LITE_IF(!safe && wrmsr)) {
+ pr_warn("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n",
+ (unsigned int)regs->cx, (unsigned int)regs->dx,
+ (unsigned int)regs->ax, regs->ip, (void *)regs->ip);
show_stack_regs(regs);
+ }
- if (!safe && !wrmsr &&
- pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n",
- (unsigned int)regs->cx, regs->ip, (void *)regs->ip))
+ if (__ONCE_LITE_IF(!safe && !wrmsr)) {
+ pr_warn("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n",
+ (unsigned int)regs->cx, regs->ip, (void *)regs->ip);
show_stack_regs(regs);
+ }
if (!wrmsr) {
/* Pretend that the read succeeded and returned 0. */
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index e8b061557887..2aadb2019b4f 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -867,7 +867,7 @@ void debug_cpumask_set_cpu(int cpu, int node, bool enable)
return;
}
mask = node_to_cpumask_map[node];
- if (!mask) {
+ if (!cpumask_available(mask)) {
pr_err("node_to_cpumask_map[%i] NULL\n", node);
dump_stack();
return;
@@ -913,7 +913,7 @@ const struct cpumask *cpumask_of_node(int node)
dump_stack();
return cpu_none_mask;
}
- if (node_to_cpumask_map[node] == NULL) {
+ if (!cpumask_available(node_to_cpumask_map[node])) {
printk(KERN_WARNING
"cpumask_of_node(%d): no node_to_cpumask_map!\n",
node);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 2dab2816b3f7..400117f630b8 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1777,10 +1777,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
}
static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
- struct bpf_prog *p, int stack_size, bool save_ret)
+ struct bpf_tramp_link *l, int stack_size,
+ bool save_ret)
{
u8 *prog = *pprog;
u8 *jmp_insn;
+ struct bpf_prog *p = l->link.prog;
/* arg1: mov rdi, progs[i] */
emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
@@ -1865,14 +1867,14 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
}
static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
- struct bpf_tramp_progs *tp, int stack_size,
+ struct bpf_tramp_links *tl, int stack_size,
bool save_ret)
{
int i;
u8 *prog = *pprog;
- for (i = 0; i < tp->nr_progs; i++) {
- if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size,
+ for (i = 0; i < tl->nr_links; i++) {
+ if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
save_ret))
return -EINVAL;
}
@@ -1881,7 +1883,7 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
}
static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
- struct bpf_tramp_progs *tp, int stack_size,
+ struct bpf_tramp_links *tl, int stack_size,
u8 **branches)
{
u8 *prog = *pprog;
@@ -1892,8 +1894,8 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
*/
emit_mov_imm32(&prog, false, BPF_REG_0, 0);
emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
- for (i = 0; i < tp->nr_progs; i++) {
- if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, true))
+ for (i = 0; i < tl->nr_links; i++) {
+ if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size, true))
return -EINVAL;
/* mod_ret prog stored return value into [rbp - 8]. Emit:
@@ -1995,14 +1997,14 @@ static bool is_valid_bpf_tramp_flags(unsigned int flags)
*/
int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_tramp_progs *tprogs,
+ struct bpf_tramp_links *tlinks,
void *orig_call)
{
int ret, i, nr_args = m->nr_args;
int regs_off, ip_off, args_off, stack_size = nr_args * 8;
- struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
- struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
- struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
+ struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
+ struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
+ struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
u8 **branches = NULL;
u8 *prog;
bool save_ret;
@@ -2093,13 +2095,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
}
}
- if (fentry->nr_progs)
+ if (fentry->nr_links)
if (invoke_bpf(m, &prog, fentry, regs_off,
flags & BPF_TRAMP_F_RET_FENTRY_RET))
return -EINVAL;
- if (fmod_ret->nr_progs) {
- branches = kcalloc(fmod_ret->nr_progs, sizeof(u8 *),
+ if (fmod_ret->nr_links) {
+ branches = kcalloc(fmod_ret->nr_links, sizeof(u8 *),
GFP_KERNEL);
if (!branches)
return -ENOMEM;
@@ -2126,7 +2128,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
prog += X86_PATCH_SIZE;
}
- if (fmod_ret->nr_progs) {
+ if (fmod_ret->nr_links) {
/* From Intel 64 and IA-32 Architectures Optimization
* Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
* Coding Rule 11: All branch targets should be 16-byte
@@ -2136,12 +2138,12 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
/* Update the branches saved in invoke_bpf_mod_ret with the
* aligned address of do_fexit.
*/
- for (i = 0; i < fmod_ret->nr_progs; i++)
+ for (i = 0; i < fmod_ret->nr_links; i++)
emit_cond_near_jump(&branches[i], prog, branches[i],
X86_JNE);
}
- if (fexit->nr_progs)
+ if (fexit->nr_links)
if (invoke_bpf(m, &prog, fexit, regs_off, false)) {
ret = -EINVAL;
goto cleanup;
@@ -2475,3 +2477,34 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
return ERR_PTR(-EINVAL);
return dst;
}
+
+/* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
+bool bpf_jit_supports_subprog_tailcalls(void)
+{
+ return true;
+}
+
+void bpf_jit_free(struct bpf_prog *prog)
+{
+ if (prog->jited) {
+ struct x64_jit_data *jit_data = prog->aux->jit_data;
+ struct bpf_binary_header *hdr;
+
+ /*
+ * If we fail the final pass of JIT (from jit_subprogs),
+ * the program may not be finalized yet. Call finalize here
+ * before freeing it.
+ */
+ if (jit_data) {
+ bpf_jit_binary_pack_finalize(prog, jit_data->header,
+ jit_data->rw_header);
+ kvfree(jit_data->addrs);
+ kfree(jit_data);
+ }
+ hdr = bpf_jit_binary_pack_hdr(prog);
+ bpf_jit_binary_pack_free(hdr, NULL);
+ WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(prog));
+ }
+
+ bpf_prog_unlock_free(prog);
+}
diff --git a/arch/x86/platform/olpc/olpc-xo1-sci.c b/arch/x86/platform/olpc/olpc-xo1-sci.c
index f03a6883dcc6..89f25af4b3c3 100644
--- a/arch/x86/platform/olpc/olpc-xo1-sci.c
+++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
@@ -80,7 +80,7 @@ static void send_ebook_state(void)
return;
}
- if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
+ if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
return; /* Nothing new to report. */
input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
index ba5789c35809..a8cde4e8ab11 100644
--- a/arch/x86/um/Makefile
+++ b/arch/x86/um/Makefile
@@ -28,7 +28,8 @@ else
obj-y += syscalls_64.o vdso/
-subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o ../entry/thunk_64.o
+subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o
+subarch-$(CONFIG_PREEMPTION) += ../entry/thunk_64.o
endif
diff --git a/arch/xtensa/platforms/iss/network.c b/arch/xtensa/platforms/iss/network.c
index be3aaaad8bee..eee389187807 100644
--- a/arch/xtensa/platforms/iss/network.c
+++ b/arch/xtensa/platforms/iss/network.c
@@ -503,16 +503,24 @@ static const struct net_device_ops iss_netdev_ops = {
.ndo_set_rx_mode = iss_net_set_multicast_list,
};
-static int iss_net_configure(int index, char *init)
+static void iss_net_pdev_release(struct device *dev)
+{
+ struct platform_device *pdev = to_platform_device(dev);
+ struct iss_net_private *lp =
+ container_of(pdev, struct iss_net_private, pdev);
+
+ free_netdev(lp->dev);
+}
+
+static void iss_net_configure(int index, char *init)
{
struct net_device *dev;
struct iss_net_private *lp;
- int err;
dev = alloc_etherdev(sizeof(*lp));
if (dev == NULL) {
pr_err("eth_configure: failed to allocate device\n");
- return 1;
+ return;
}
/* Initialize private element. */
@@ -541,7 +549,7 @@ static int iss_net_configure(int index, char *init)
if (!tuntap_probe(lp, index, init)) {
pr_err("%s: invalid arguments. Skipping device!\n",
dev->name);
- goto errout;
+ goto err_free_netdev;
}
pr_info("Netdevice %d (%pM)\n", index, dev->dev_addr);
@@ -549,7 +557,8 @@ static int iss_net_configure(int index, char *init)
/* sysfs register */
if (!driver_registered) {
- platform_driver_register(&iss_net_driver);
+ if (platform_driver_register(&iss_net_driver))
+ goto err_free_netdev;
driver_registered = 1;
}
@@ -559,7 +568,9 @@ static int iss_net_configure(int index, char *init)
lp->pdev.id = index;
lp->pdev.name = DRIVER_NAME;
- platform_device_register(&lp->pdev);
+ lp->pdev.dev.release = iss_net_pdev_release;
+ if (platform_device_register(&lp->pdev))
+ goto err_free_netdev;
SET_NETDEV_DEV(dev, &lp->pdev.dev);
dev->netdev_ops = &iss_netdev_ops;
@@ -568,23 +579,20 @@ static int iss_net_configure(int index, char *init)
dev->irq = -1;
rtnl_lock();
- err = register_netdevice(dev);
- rtnl_unlock();
-
- if (err) {
+ if (register_netdevice(dev)) {
+ rtnl_unlock();
pr_err("%s: error registering net device!\n", dev->name);
- /* XXX: should we call ->remove() here? */
- free_netdev(dev);
- return 1;
+ platform_device_unregister(&lp->pdev);
+ return;
}
+ rtnl_unlock();
timer_setup(&lp->tl, iss_net_user_timer_expire, 0);
- return 0;
+ return;
-errout:
- /* FIXME: unregister; free, etc.. */
- return -EIO;
+err_free_netdev:
+ free_netdev(dev);
}
/* ------------------------------------------------------------------------- */
diff --git a/block/bio.c b/block/bio.c
index d3ca79c3ebdf..7d4d5723350b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1129,6 +1129,37 @@ static void bio_put_pages(struct page **pages, size_t size, size_t off)
put_page(pages[i]);
}
+static int bio_iov_add_page(struct bio *bio, struct page *page,
+ unsigned int len, unsigned int offset)
+{
+ bool same_page = false;
+
+ if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
+ if (WARN_ON_ONCE(bio_full(bio, len)))
+ return -EINVAL;
+ __bio_add_page(bio, page, len, offset);
+ return 0;
+ }
+
+ if (same_page)
+ put_page(page);
+ return 0;
+}
+
+static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
+ unsigned int len, unsigned int offset)
+{
+ struct request_queue *q = bdev_get_queue(bio->bi_bdev);
+ bool same_page = false;
+
+ if (bio_add_hw_page(q, bio, page, len, offset,
+ queue_max_zone_append_sectors(q), &same_page) != len)
+ return -EINVAL;
+ if (same_page)
+ put_page(page);
+ return 0;
+}
+
#define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
/**
@@ -1147,61 +1178,11 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
struct page **pages = (struct page **)bv;
- bool same_page = false;
- ssize_t size, left;
- unsigned len, i;
- size_t offset;
-
- /*
- * Move page array up in the allocated memory for the bio vecs as far as
- * possible so that we can start filling biovecs from the beginning
- * without overwriting the temporary page array.
- */
- BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
- pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
-
- size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
- if (unlikely(size <= 0))
- return size ? size : -EFAULT;
-
- for (left = size, i = 0; left > 0; left -= len, i++) {
- struct page *page = pages[i];
-
- len = min_t(size_t, PAGE_SIZE - offset, left);
-
- if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
- if (same_page)
- put_page(page);
- } else {
- if (WARN_ON_ONCE(bio_full(bio, len))) {
- bio_put_pages(pages + i, left, offset);
- return -EINVAL;
- }
- __bio_add_page(bio, page, len, offset);
- }
- offset = 0;
- }
-
- iov_iter_advance(iter, size);
- return 0;
-}
-
-static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
-{
- unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
- unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
- struct request_queue *q = bdev_get_queue(bio->bi_bdev);
- unsigned int max_append_sectors = queue_max_zone_append_sectors(q);
- struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
- struct page **pages = (struct page **)bv;
ssize_t size, left;
unsigned len, i;
size_t offset;
int ret = 0;
- if (WARN_ON_ONCE(!max_append_sectors))
- return 0;
-
/*
* Move page array up in the allocated memory for the bio vecs as far as
* possible so that we can start filling biovecs from the beginning
@@ -1216,17 +1197,18 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
for (left = size, i = 0; left > 0; left -= len, i++) {
struct page *page = pages[i];
- bool same_page = false;
len = min_t(size_t, PAGE_SIZE - offset, left);
- if (bio_add_hw_page(q, bio, page, len, offset,
- max_append_sectors, &same_page) != len) {
+ if (bio_op(bio) == REQ_OP_ZONE_APPEND)
+ ret = bio_iov_add_zone_append_page(bio, page, len,
+ offset);
+ else
+ ret = bio_iov_add_page(bio, page, len, offset);
+
+ if (ret) {
bio_put_pages(pages + i, left, offset);
- ret = -EINVAL;
break;
}
- if (same_page)
- put_page(page);
offset = 0;
}
@@ -1268,10 +1250,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
}
do {
- if (bio_op(bio) == REQ_OP_ZONE_APPEND)
- ret = __bio_iov_append_get_pages(bio, iter);
- else
- ret = __bio_iov_iter_get_pages(bio, iter);
+ ret = __bio_iov_iter_get_pages(bio, iter);
} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
/* don't account direct I/O as memory stall */
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 16705fbd0699..a19f2db4eeb2 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -2893,15 +2893,21 @@ static int blk_iocost_init(struct request_queue *q)
* called before policy activation completion, can't assume that the
* target bio has an iocg associated and need to test for NULL iocg.
*/
- rq_qos_add(q, rqos);
+ ret = rq_qos_add(q, rqos);
+ if (ret)
+ goto err_free_ioc;
+
ret = blkcg_activate_policy(q, &blkcg_policy_iocost);
- if (ret) {
- rq_qos_del(q, rqos);
- free_percpu(ioc->pcpu_stat);
- kfree(ioc);
- return ret;
- }
+ if (ret)
+ goto err_del_qos;
return 0;
+
+err_del_qos:
+ rq_qos_del(q, rqos);
+err_free_ioc:
+ free_percpu(ioc->pcpu_stat);
+ kfree(ioc);
+ return ret;
}
static struct blkcg_policy_data *ioc_cpd_alloc(gfp_t gfp)
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index 9568bf8dfe82..7845dca5fcfd 100644
--- a/block/blk-iolatency.c
+++ b/block/blk-iolatency.c
@@ -773,19 +773,23 @@ int blk_iolatency_init(struct request_queue *q)
rqos->ops = &blkcg_iolatency_ops;
rqos->q = q;
- rq_qos_add(q, rqos);
-
+ ret = rq_qos_add(q, rqos);
+ if (ret)
+ goto err_free;
ret = blkcg_activate_policy(q, &blkcg_policy_iolatency);
- if (ret) {
- rq_qos_del(q, rqos);
- kfree(blkiolat);
- return ret;
- }
+ if (ret)
+ goto err_qos_del;
timer_setup(&blkiolat->timer, blkiolatency_timer_fn, 0);
INIT_WORK(&blkiolat->enable_work, blkiolatency_enable_work_fn);
return 0;
+
+err_qos_del:
+ rq_qos_del(q, rqos);
+err_free:
+ kfree(blkiolat);
+ return ret;
}
static void iolatency_set_min_lat_nsec(struct blkcg_gq *blkg, u64 val)
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index aa0349e9f083..d491b6eb0ab9 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -713,11 +713,6 @@ void blk_mq_debugfs_register(struct request_queue *q)
}
}
-void blk_mq_debugfs_unregister(struct request_queue *q)
-{
- q->sched_debugfs_dir = NULL;
-}
-
static void blk_mq_debugfs_register_ctx(struct blk_mq_hw_ctx *hctx,
struct blk_mq_ctx *ctx)
{
@@ -737,6 +732,9 @@ void blk_mq_debugfs_register_hctx(struct request_queue *q,
char name[20];
int i;
+ if (!q->debugfs_dir)
+ return;
+
snprintf(name, sizeof(name), "hctx%u", hctx->queue_num);
hctx->debugfs_dir = debugfs_create_dir(name, q->debugfs_dir);
@@ -748,6 +746,8 @@ void blk_mq_debugfs_register_hctx(struct request_queue *q,
void blk_mq_debugfs_unregister_hctx(struct blk_mq_hw_ctx *hctx)
{
+ if (!hctx->queue->debugfs_dir)
+ return;
debugfs_remove_recursive(hctx->debugfs_dir);
hctx->sched_debugfs_dir = NULL;
hctx->debugfs_dir = NULL;
@@ -775,6 +775,8 @@ void blk_mq_debugfs_register_sched(struct request_queue *q)
{
struct elevator_type *e = q->elevator->type;
+ lockdep_assert_held(&q->debugfs_mutex);
+
/*
* If the parent directory has not been created yet, return, we will be
* called again later on and the directory/files will be created then.
@@ -792,6 +794,8 @@ void blk_mq_debugfs_register_sched(struct request_queue *q)
void blk_mq_debugfs_unregister_sched(struct request_queue *q)
{
+ lockdep_assert_held(&q->debugfs_mutex);
+
debugfs_remove_recursive(q->sched_debugfs_dir);
q->sched_debugfs_dir = NULL;
}
@@ -813,6 +817,10 @@ static const char *rq_qos_id_to_name(enum rq_qos_id id)
void blk_mq_debugfs_unregister_rqos(struct rq_qos *rqos)
{
+ lockdep_assert_held(&rqos->q->debugfs_mutex);
+
+ if (!rqos->q->debugfs_dir)
+ return;
debugfs_remove_recursive(rqos->debugfs_dir);
rqos->debugfs_dir = NULL;
}
@@ -822,6 +830,8 @@ void blk_mq_debugfs_register_rqos(struct rq_qos *rqos)
struct request_queue *q = rqos->q;
const char *dir_name = rq_qos_id_to_name(rqos->id);
+ lockdep_assert_held(&q->debugfs_mutex);
+
if (rqos->debugfs_dir || !rqos->ops->debugfs_attrs)
return;
@@ -837,6 +847,8 @@ void blk_mq_debugfs_register_rqos(struct rq_qos *rqos)
void blk_mq_debugfs_unregister_queue_rqos(struct request_queue *q)
{
+ lockdep_assert_held(&q->debugfs_mutex);
+
debugfs_remove_recursive(q->rqos_debugfs_dir);
q->rqos_debugfs_dir = NULL;
}
@@ -846,6 +858,8 @@ void blk_mq_debugfs_register_sched_hctx(struct request_queue *q,
{
struct elevator_type *e = q->elevator->type;
+ lockdep_assert_held(&q->debugfs_mutex);
+
/*
* If the parent debugfs directory has not been created yet, return;
* We will be called again later on with appropriate parent debugfs
@@ -865,6 +879,10 @@ void blk_mq_debugfs_register_sched_hctx(struct request_queue *q,
void blk_mq_debugfs_unregister_sched_hctx(struct blk_mq_hw_ctx *hctx)
{
+ lockdep_assert_held(&hctx->queue->debugfs_mutex);
+
+ if (!hctx->queue->debugfs_dir)
+ return;
debugfs_remove_recursive(hctx->sched_debugfs_dir);
hctx->sched_debugfs_dir = NULL;
}
diff --git a/block/blk-mq-debugfs.h b/block/blk-mq-debugfs.h
index 69918f4170d6..771d45832878 100644
--- a/block/blk-mq-debugfs.h
+++ b/block/blk-mq-debugfs.h
@@ -21,7 +21,6 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq);
int blk_mq_debugfs_rq_show(struct seq_file *m, void *v);
void blk_mq_debugfs_register(struct request_queue *q);
-void blk_mq_debugfs_unregister(struct request_queue *q);
void blk_mq_debugfs_register_hctx(struct request_queue *q,
struct blk_mq_hw_ctx *hctx);
void blk_mq_debugfs_unregister_hctx(struct blk_mq_hw_ctx *hctx);
@@ -42,10 +41,6 @@ static inline void blk_mq_debugfs_register(struct request_queue *q)
{
}
-static inline void blk_mq_debugfs_unregister(struct request_queue *q)
-{
-}
-
static inline void blk_mq_debugfs_register_hctx(struct request_queue *q,
struct blk_mq_hw_ctx *hctx)
{
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 9e56a69422b6..e84bec39fd3a 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -593,7 +593,9 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
if (ret)
goto err_free_map_and_rqs;
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_register_sched(q);
+ mutex_unlock(&q->debugfs_mutex);
queue_for_each_hw_ctx(q, hctx, i) {
if (e->ops.init_hctx) {
@@ -606,7 +608,9 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
return ret;
}
}
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_register_sched_hctx(q, hctx);
+ mutex_unlock(&q->debugfs_mutex);
}
return 0;
@@ -647,14 +651,21 @@ void blk_mq_exit_sched(struct request_queue *q, struct elevator_queue *e)
unsigned int flags = 0;
queue_for_each_hw_ctx(q, hctx, i) {
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_unregister_sched_hctx(hctx);
+ mutex_unlock(&q->debugfs_mutex);
+
if (e->type->ops.exit_hctx && hctx->sched_data) {
e->type->ops.exit_hctx(hctx, i);
hctx->sched_data = NULL;
}
flags = hctx->flags;
}
+
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_unregister_sched(q);
+ mutex_unlock(&q->debugfs_mutex);
+
if (e->type->ops.exit_sched)
e->type->ops.exit_sched(e);
blk_mq_sched_tags_teardown(q, flags);
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index e83af7bc7591..249a6f05dd3b 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -294,7 +294,9 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
void rq_qos_exit(struct request_queue *q)
{
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_unregister_queue_rqos(q);
+ mutex_unlock(&q->debugfs_mutex);
while (q->rq_qos) {
struct rq_qos *rqos = q->rq_qos;
diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h
index 68267007da1c..08b856570ad1 100644
--- a/block/blk-rq-qos.h
+++ b/block/blk-rq-qos.h
@@ -86,7 +86,7 @@ static inline void rq_wait_init(struct rq_wait *rq_wait)
init_waitqueue_head(&rq_wait->wait);
}
-static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
+static inline int rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
{
/*
* No IO can be in-flight when adding rqos, so freeze queue, which
@@ -98,14 +98,26 @@ static inline void rq_qos_add(struct request_queue *q, struct rq_qos *rqos)
blk_mq_freeze_queue(q);
spin_lock_irq(&q->queue_lock);
+ if (rq_qos_id(q, rqos->id))
+ goto ebusy;
rqos->next = q->rq_qos;
q->rq_qos = rqos;
spin_unlock_irq(&q->queue_lock);
blk_mq_unfreeze_queue(q);
- if (rqos->ops->debugfs_attrs)
+ if (rqos->ops->debugfs_attrs) {
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_register_rqos(rqos);
+ mutex_unlock(&q->debugfs_mutex);
+ }
+
+ return 0;
+ebusy:
+ spin_unlock_irq(&q->queue_lock);
+ blk_mq_unfreeze_queue(q);
+ return -EBUSY;
+
}
static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos)
@@ -129,7 +141,9 @@ static inline void rq_qos_del(struct request_queue *q, struct rq_qos *rqos)
blk_mq_unfreeze_queue(q);
+ mutex_lock(&q->debugfs_mutex);
blk_mq_debugfs_unregister_rqos(rqos);
+ mutex_unlock(&q->debugfs_mutex);
}
typedef bool (acquire_inflight_cb_t)(struct rq_wait *rqw, void *private_data);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 88bd41d4cb59..6e4801b217a7 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -779,14 +779,13 @@ static void blk_release_queue(struct kobject *kobj)
if (queue_is_mq(q))
blk_mq_release(q);
- blk_trace_shutdown(q);
mutex_lock(&q->debugfs_mutex);
+ blk_trace_shutdown(q);
debugfs_remove_recursive(q->debugfs_dir);
+ q->debugfs_dir = NULL;
+ q->sched_debugfs_dir = NULL;
mutex_unlock(&q->debugfs_mutex);
- if (queue_is_mq(q))
- blk_mq_debugfs_unregister(q);
-
bioset_exit(&q->bio_split);
if (blk_queue_has_srcu(q))
@@ -836,17 +835,16 @@ int blk_register_queue(struct gendisk *disk)
goto unlock;
}
+ if (queue_is_mq(q))
+ __blk_mq_register_dev(dev, q);
+ mutex_lock(&q->sysfs_lock);
+
mutex_lock(&q->debugfs_mutex);
q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
blk_debugfs_root);
- mutex_unlock(&q->debugfs_mutex);
-
- if (queue_is_mq(q)) {
- __blk_mq_register_dev(dev, q);
+ if (queue_is_mq(q))
blk_mq_debugfs_register(q);
- }
-
- mutex_lock(&q->sysfs_lock);
+ mutex_unlock(&q->debugfs_mutex);
ret = disk_register_independent_access_ranges(disk, NULL);
if (ret)
diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 0c119be0e813..ae6ea0b54579 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -820,6 +820,7 @@ int wbt_init(struct request_queue *q)
{
struct rq_wb *rwb;
int i;
+ int ret;
rwb = kzalloc(sizeof(*rwb), GFP_KERNEL);
if (!rwb)
@@ -846,7 +847,10 @@ int wbt_init(struct request_queue *q)
/*
* Assign rwb and add the stats callback.
*/
- rq_qos_add(q, &rwb->rqos);
+ ret = rq_qos_add(q, &rwb->rqos);
+ if (ret)
+ goto err_free;
+
blk_stat_add_callback(q, rwb->cb);
rwb->min_lat_nsec = wbt_default_latency_nsec(q);
@@ -855,4 +859,10 @@ int wbt_init(struct request_queue *q)
wbt_set_write_cache(q, test_bit(QUEUE_FLAG_WC, &q->queue_flags));
return 0;
+
+err_free:
+ blk_stat_free_callback(rwb->cb);
+ kfree(rwb);
+ return ret;
+
}
diff --git a/crypto/Kconfig b/crypto/Kconfig
index b4e00a7a046b..38601a072b99 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -692,26 +692,8 @@ config CRYPTO_BLAKE2B
See https://blake2.net for further information.
-config CRYPTO_BLAKE2S
- tristate "BLAKE2s digest algorithm"
- select CRYPTO_LIB_BLAKE2S_GENERIC
- select CRYPTO_HASH
- help
- Implementation of cryptographic hash function BLAKE2s
- optimized for 8-32bit platforms and can produce digests of any size
- between 1 to 32. The keyed hash is also implemented.
-
- This module provides the following algorithms:
-
- - blake2s-128
- - blake2s-160
- - blake2s-224
- - blake2s-256
-
- See https://blake2.net for further information.
-
config CRYPTO_BLAKE2S_X86
- tristate "BLAKE2s digest algorithm (x86 accelerated version)"
+ bool "BLAKE2s digest algorithm (x86 accelerated version)"
depends on X86 && 64BIT
select CRYPTO_LIB_BLAKE2S_GENERIC
select CRYPTO_ARCH_HAVE_LIB_BLAKE2S
diff --git a/crypto/Makefile b/crypto/Makefile
index a40e6d5fb2c8..dbfa53567c92 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -83,7 +83,6 @@ obj-$(CONFIG_CRYPTO_STREEBOG) += streebog_generic.o
obj-$(CONFIG_CRYPTO_WP512) += wp512.o
CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b_generic.o
-obj-$(CONFIG_CRYPTO_BLAKE2S) += blake2s_generic.o
obj-$(CONFIG_CRYPTO_GF128MUL) += gf128mul.o
obj-$(CONFIG_CRYPTO_ECB) += ecb.o
obj-$(CONFIG_CRYPTO_CBC) += cbc.o
diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index 7c9e6be35c30..2f8352e88860 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -304,6 +304,10 @@ static int cert_sig_digest_update(const struct public_key_signature *sig,
BUG_ON(!sig->data);
+ /* SM2 signatures always use the SM3 hash algorithm */
+ if (!sig->hash_algo || strcmp(sig->hash_algo, "sm3") != 0)
+ return -EINVAL;
+
ret = sm2_compute_z_digest(tfm_pkey, SM2_DEFAULT_USERID,
SM2_DEFAULT_USERID_LEN, dgst);
if (ret)
@@ -414,8 +418,7 @@ int public_key_verify_signature(const struct public_key *pkey,
if (ret)
goto error_free_key;
- if (sig->pkey_algo && strcmp(sig->pkey_algo, "sm2") == 0 &&
- sig->data_size) {
+ if (strcmp(pkey->pkey_algo, "sm2") == 0 && sig->data_size) {
ret = cert_sig_digest_update(sig, tfm);
if (ret)
goto error_free_key;
diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c
deleted file mode 100644
index 5f96a21f8788..000000000000
--- a/crypto/blake2s_generic.c
+++ /dev/null
@@ -1,75 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0 OR MIT
-/*
- * shash interface to the generic implementation of BLAKE2s
- *
- * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
- */
-
-#include <crypto/internal/blake2s.h>
-#include <crypto/internal/hash.h>
-
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-
-static int crypto_blake2s_update_generic(struct shash_desc *desc,
- const u8 *in, unsigned int inlen)
-{
- return crypto_blake2s_update(desc, in, inlen, true);
-}
-
-static int crypto_blake2s_final_generic(struct shash_desc *desc, u8 *out)
-{
- return crypto_blake2s_final(desc, out, true);
-}
-
-#define BLAKE2S_ALG(name, driver_name, digest_size) \
- { \
- .base.cra_name = name, \
- .base.cra_driver_name = driver_name, \
- .base.cra_priority = 100, \
- .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \
- .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \
- .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \
- .base.cra_module = THIS_MODULE, \
- .digestsize = digest_size, \
- .setkey = crypto_blake2s_setkey, \
- .init = crypto_blake2s_init, \
- .update = crypto_blake2s_update_generic, \
- .final = crypto_blake2s_final_generic, \
- .descsize = sizeof(struct blake2s_state), \
- }
-
-static struct shash_alg blake2s_algs[] = {
- BLAKE2S_ALG("blake2s-128", "blake2s-128-generic",
- BLAKE2S_128_HASH_SIZE),
- BLAKE2S_ALG("blake2s-160", "blake2s-160-generic",
- BLAKE2S_160_HASH_SIZE),
- BLAKE2S_ALG("blake2s-224", "blake2s-224-generic",
- BLAKE2S_224_HASH_SIZE),
- BLAKE2S_ALG("blake2s-256", "blake2s-256-generic",
- BLAKE2S_256_HASH_SIZE),
-};
-
-static int __init blake2s_mod_init(void)
-{
- return crypto_register_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
-}
-
-static void __exit blake2s_mod_exit(void)
-{
- crypto_unregister_shashes(blake2s_algs, ARRAY_SIZE(blake2s_algs));
-}
-
-subsys_initcall(blake2s_mod_init);
-module_exit(blake2s_mod_exit);
-
-MODULE_ALIAS_CRYPTO("blake2s-128");
-MODULE_ALIAS_CRYPTO("blake2s-128-generic");
-MODULE_ALIAS_CRYPTO("blake2s-160");
-MODULE_ALIAS_CRYPTO("blake2s-160-generic");
-MODULE_ALIAS_CRYPTO("blake2s-224");
-MODULE_ALIAS_CRYPTO("blake2s-224-generic");
-MODULE_ALIAS_CRYPTO("blake2s-256");
-MODULE_ALIAS_CRYPTO("blake2s-256-generic");
-MODULE_LICENSE("GPL v2");
diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 2bacf8384f59..66b7ca1ccb23 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1669,10 +1669,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
ret += tcrypt_test("rmd160");
break;
- case 41:
- ret += tcrypt_test("blake2s-256");
- break;
-
case 42:
ret += tcrypt_test("blake2b-512");
break;
@@ -2240,10 +2236,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_hash_speed("rmd160", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
fallthrough;
- case 316:
- test_hash_speed("blake2s-256", sec, generic_hash_speed_template);
- if (mode > 300 && mode < 400) break;
- fallthrough;
case 317:
test_hash_speed("blake2b-512", sec, generic_hash_speed_template);
if (mode > 300 && mode < 400) break;
@@ -2352,10 +2344,6 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
test_ahash_speed("rmd160", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
fallthrough;
- case 416:
- test_ahash_speed("blake2s-256", sec, generic_hash_speed_template);
- if (mode > 400 && mode < 500) break;
- fallthrough;
case 417:
test_ahash_speed("blake2b-512", sec, generic_hash_speed_template);
if (mode > 400 && mode < 500) break;
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 4948201065cc..56facdb63843 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -4324,30 +4324,6 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.hash = __VECS(blake2b_512_tv_template)
}
- }, {
- .alg = "blake2s-128",
- .test = alg_test_hash,
- .suite = {
- .hash = __VECS(blakes2s_128_tv_template)
- }
- }, {
- .alg = "blake2s-160",
- .test = alg_test_hash,
- .suite = {
- .hash = __VECS(blakes2s_160_tv_template)
- }
- }, {
- .alg = "blake2s-224",
- .test = alg_test_hash,
- .suite = {
- .hash = __VECS(blakes2s_224_tv_template)
- }
- }, {
- .alg = "blake2s-256",
- .test = alg_test_hash,
- .suite = {
- .hash = __VECS(blakes2s_256_tv_template)
- }
}, {
.alg = "cbc(aes)",
.test = alg_test_skcipher,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 4d7449fc6a65..c29658337d96 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -34034,221 +34034,4 @@ static const struct hash_testvec blake2b_512_tv_template[] = {{
0xae, 0x15, 0x81, 0x15, 0xd0, 0x88, 0xa0, 0x3c, },
}};
-static const struct hash_testvec blakes2s_128_tv_template[] = {{
- .digest = (u8[]){ 0x64, 0x55, 0x0d, 0x6f, 0xfe, 0x2c, 0x0a, 0x01,
- 0xa1, 0x4a, 0xba, 0x1e, 0xad, 0xe0, 0x20, 0x0c, },
-}, {
- .plaintext = blake2_ordered_sequence,
- .psize = 64,
- .digest = (u8[]){ 0xdc, 0x66, 0xca, 0x8f, 0x03, 0x86, 0x58, 0x01,
- 0xb0, 0xff, 0xe0, 0x6e, 0xd8, 0xa1, 0xa9, 0x0e, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 1,
- .digest = (u8[]){ 0x88, 0x1e, 0x42, 0xe7, 0xbb, 0x35, 0x80, 0x82,
- 0x63, 0x7c, 0x0a, 0x0f, 0xd7, 0xec, 0x6c, 0x2f, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 7,
- .digest = (u8[]){ 0xcf, 0x9e, 0x07, 0x2a, 0xd5, 0x22, 0xf2, 0xcd,
- 0xa2, 0xd8, 0x25, 0x21, 0x80, 0x86, 0x73, 0x1c, },
-}, {
- .ksize = 1,
- .key = "B",
- .plaintext = blake2_ordered_sequence,
- .psize = 15,
- .digest = (u8[]){ 0xf6, 0x33, 0x5a, 0x2c, 0x22, 0xa0, 0x64, 0xb2,
- 0xb6, 0x3f, 0xeb, 0xbc, 0xd1, 0xc3, 0xe5, 0xb2, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 247,
- .digest = (u8[]){ 0x72, 0x66, 0x49, 0x60, 0xf9, 0x4a, 0xea, 0xbe,
- 0x1f, 0xf4, 0x60, 0xce, 0xb7, 0x81, 0xcb, 0x09, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 256,
- .digest = (u8[]){ 0xd5, 0xa4, 0x0e, 0xc3, 0x16, 0xc7, 0x51, 0xa6,
- 0x3c, 0xd0, 0xd9, 0x11, 0x57, 0xfa, 0x1e, 0xbb, },
-}};
-
-static const struct hash_testvec blakes2s_160_tv_template[] = {{
- .plaintext = blake2_ordered_sequence,
- .psize = 7,
- .digest = (u8[]){ 0xb4, 0xf2, 0x03, 0x49, 0x37, 0xed, 0xb1, 0x3e,
- 0x5b, 0x2a, 0xca, 0x64, 0x82, 0x74, 0xf6, 0x62,
- 0xe3, 0xf2, 0x84, 0xff, },
-}, {
- .plaintext = blake2_ordered_sequence,
- .psize = 256,
- .digest = (u8[]){ 0xaa, 0x56, 0x9b, 0xdc, 0x98, 0x17, 0x75, 0xf2,
- 0xb3, 0x68, 0x83, 0xb7, 0x9b, 0x8d, 0x48, 0xb1,
- 0x9b, 0x2d, 0x35, 0x05, },
-}, {
- .ksize = 1,
- .key = "B",
- .digest = (u8[]){ 0x50, 0x16, 0xe7, 0x0c, 0x01, 0xd0, 0xd3, 0xc3,
- 0xf4, 0x3e, 0xb1, 0x6e, 0x97, 0xa9, 0x4e, 0xd1,
- 0x79, 0x65, 0x32, 0x93, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 1,
- .digest = (u8[]){ 0x1c, 0x2b, 0xcd, 0x9a, 0x68, 0xca, 0x8c, 0x71,
- 0x90, 0x29, 0x6c, 0x54, 0xfa, 0x56, 0x4a, 0xef,
- 0xa2, 0x3a, 0x56, 0x9c, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 15,
- .digest = (u8[]){ 0x36, 0xc3, 0x5f, 0x9a, 0xdc, 0x7e, 0xbf, 0x19,
- 0x68, 0xaa, 0xca, 0xd8, 0x81, 0xbf, 0x09, 0x34,
- 0x83, 0x39, 0x0f, 0x30, },
-}, {
- .ksize = 1,
- .key = "B",
- .plaintext = blake2_ordered_sequence,
- .psize = 64,
- .digest = (u8[]){ 0x86, 0x80, 0x78, 0xa4, 0x14, 0xec, 0x03, 0xe5,
- 0xb6, 0x9a, 0x52, 0x0e, 0x42, 0xee, 0x39, 0x9d,
- 0xac, 0xa6, 0x81, 0x63, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 247,
- .digest = (u8[]){ 0x2d, 0xd8, 0xd2, 0x53, 0x66, 0xfa, 0xa9, 0x01,
- 0x1c, 0x9c, 0xaf, 0xa3, 0xe2, 0x9d, 0x9b, 0x10,
- 0x0a, 0xf6, 0x73, 0xe8, },
-}};
-
-static const struct hash_testvec blakes2s_224_tv_template[] = {{
- .plaintext = blake2_ordered_sequence,
- .psize = 1,
- .digest = (u8[]){ 0x61, 0xb9, 0x4e, 0xc9, 0x46, 0x22, 0xa3, 0x91,
- 0xd2, 0xae, 0x42, 0xe6, 0x45, 0x6c, 0x90, 0x12,
- 0xd5, 0x80, 0x07, 0x97, 0xb8, 0x86, 0x5a, 0xfc,
- 0x48, 0x21, 0x97, 0xbb, },
-}, {
- .plaintext = blake2_ordered_sequence,
- .psize = 247,
- .digest = (u8[]){ 0x9e, 0xda, 0xc7, 0x20, 0x2c, 0xd8, 0x48, 0x2e,
- 0x31, 0x94, 0xab, 0x46, 0x6d, 0x94, 0xd8, 0xb4,
- 0x69, 0xcd, 0xae, 0x19, 0x6d, 0x9e, 0x41, 0xcc,
- 0x2b, 0xa4, 0xd5, 0xf6, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .digest = (u8[]){ 0x32, 0xc0, 0xac, 0xf4, 0x3b, 0xd3, 0x07, 0x9f,
- 0xbe, 0xfb, 0xfa, 0x4d, 0x6b, 0x4e, 0x56, 0xb3,
- 0xaa, 0xd3, 0x27, 0xf6, 0x14, 0xbf, 0xb9, 0x32,
- 0xa7, 0x19, 0xfc, 0xb8, },
-}, {
- .ksize = 1,
- .key = "B",
- .plaintext = blake2_ordered_sequence,
- .psize = 7,
- .digest = (u8[]){ 0x73, 0xad, 0x5e, 0x6d, 0xb9, 0x02, 0x8e, 0x76,
- 0xf2, 0x66, 0x42, 0x4b, 0x4c, 0xfa, 0x1f, 0xe6,
- 0x2e, 0x56, 0x40, 0xe5, 0xa2, 0xb0, 0x3c, 0xe8,
- 0x7b, 0x45, 0xfe, 0x05, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 15,
- .digest = (u8[]){ 0x16, 0x60, 0xfb, 0x92, 0x54, 0xb3, 0x6e, 0x36,
- 0x81, 0xf4, 0x16, 0x41, 0xc3, 0x3d, 0xd3, 0x43,
- 0x84, 0xed, 0x10, 0x6f, 0x65, 0x80, 0x7a, 0x3e,
- 0x25, 0xab, 0xc5, 0x02, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 64,
- .digest = (u8[]){ 0xca, 0xaa, 0x39, 0x67, 0x9c, 0xf7, 0x6b, 0xc7,
- 0xb6, 0x82, 0xca, 0x0e, 0x65, 0x36, 0x5b, 0x7c,
- 0x24, 0x00, 0xfa, 0x5f, 0xda, 0x06, 0x91, 0x93,
- 0x6a, 0x31, 0x83, 0xb5, },
-}, {
- .ksize = 1,
- .key = "B",
- .plaintext = blake2_ordered_sequence,
- .psize = 256,
- .digest = (u8[]){ 0x90, 0x02, 0x26, 0xb5, 0x06, 0x9c, 0x36, 0x86,
- 0x94, 0x91, 0x90, 0x1e, 0x7d, 0x2a, 0x71, 0xb2,
- 0x48, 0xb5, 0xe8, 0x16, 0xfd, 0x64, 0x33, 0x45,
- 0xb3, 0xd7, 0xec, 0xcc, },
-}};
-
-static const struct hash_testvec blakes2s_256_tv_template[] = {{
- .plaintext = blake2_ordered_sequence,
- .psize = 15,
- .digest = (u8[]){ 0xd9, 0x7c, 0x82, 0x8d, 0x81, 0x82, 0xa7, 0x21,
- 0x80, 0xa0, 0x6a, 0x78, 0x26, 0x83, 0x30, 0x67,
- 0x3f, 0x7c, 0x4e, 0x06, 0x35, 0x94, 0x7c, 0x04,
- 0xc0, 0x23, 0x23, 0xfd, 0x45, 0xc0, 0xa5, 0x2d, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .digest = (u8[]){ 0x48, 0xa8, 0x99, 0x7d, 0xa4, 0x07, 0x87, 0x6b,
- 0x3d, 0x79, 0xc0, 0xd9, 0x23, 0x25, 0xad, 0x3b,
- 0x89, 0xcb, 0xb7, 0x54, 0xd8, 0x6a, 0xb7, 0x1a,
- 0xee, 0x04, 0x7a, 0xd3, 0x45, 0xfd, 0x2c, 0x49, },
-}, {
- .ksize = 1,
- .key = "B",
- .plaintext = blake2_ordered_sequence,
- .psize = 1,
- .digest = (u8[]){ 0x22, 0x27, 0xae, 0xaa, 0x6e, 0x81, 0x56, 0x03,
- 0xa7, 0xe3, 0xa1, 0x18, 0xa5, 0x9a, 0x2c, 0x18,
- 0xf4, 0x63, 0xbc, 0x16, 0x70, 0xf1, 0xe7, 0x4b,
- 0x00, 0x6d, 0x66, 0x16, 0xae, 0x9e, 0x74, 0x4e, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 7,
- .digest = (u8[]){ 0x58, 0x5d, 0xa8, 0x60, 0x1c, 0xa4, 0xd8, 0x03,
- 0x86, 0x86, 0x84, 0x64, 0xd7, 0xa0, 0x8e, 0x15,
- 0x2f, 0x05, 0xa2, 0x1b, 0xbc, 0xef, 0x7a, 0x34,
- 0xb3, 0xc5, 0xbc, 0x4b, 0xf0, 0x32, 0xeb, 0x12, },
-}, {
- .ksize = 32,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 64,
- .digest = (u8[]){ 0x89, 0x75, 0xb0, 0x57, 0x7f, 0xd3, 0x55, 0x66,
- 0xd7, 0x50, 0xb3, 0x62, 0xb0, 0x89, 0x7a, 0x26,
- 0xc3, 0x99, 0x13, 0x6d, 0xf0, 0x7b, 0xab, 0xab,
- 0xbd, 0xe6, 0x20, 0x3f, 0xf2, 0x95, 0x4e, 0xd4, },
-}, {
- .ksize = 1,
- .key = "B",
- .plaintext = blake2_ordered_sequence,
- .psize = 247,
- .digest = (u8[]){ 0x2e, 0x74, 0x1c, 0x1d, 0x03, 0xf4, 0x9d, 0x84,
- 0x6f, 0xfc, 0x86, 0x32, 0x92, 0x49, 0x7e, 0x66,
- 0xd7, 0xc3, 0x10, 0x88, 0xfe, 0x28, 0xb3, 0xe0,
- 0xbf, 0x50, 0x75, 0xad, 0x8e, 0xa4, 0xe6, 0xb2, },
-}, {
- .ksize = 16,
- .key = blake2_ordered_sequence,
- .plaintext = blake2_ordered_sequence,
- .psize = 256,
- .digest = (u8[]){ 0xb9, 0xd2, 0x81, 0x0e, 0x3a, 0xb1, 0x62, 0x9b,
- 0xad, 0x44, 0x05, 0xf4, 0x92, 0x2e, 0x99, 0xc1,
- 0x4a, 0x47, 0xbb, 0x5b, 0x6f, 0xb2, 0x96, 0xed,
- 0xd5, 0x06, 0xb5, 0x3a, 0x7c, 0x7a, 0x65, 0x1d, },
-}};
-
#endif /* _CRYPTO_TESTMGR_H */
diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
index fbe0756259c5..c4d4d21391d7 100644
--- a/drivers/acpi/acpi_lpss.c
+++ b/drivers/acpi/acpi_lpss.c
@@ -422,6 +422,9 @@ static int register_device_clock(struct acpi_device *adev,
if (!lpss_clk_dev)
lpt_register_clock_device();
+ if (IS_ERR(lpss_clk_dev))
+ return PTR_ERR(lpss_clk_dev);
+
clk_data = platform_get_drvdata(lpss_clk_dev);
if (!clk_data)
return -ENODEV;
diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c
index 95cc2a9f3e05..49b5e317e916 100644
--- a/drivers/acpi/apei/einj.c
+++ b/drivers/acpi/apei/einj.c
@@ -546,6 +546,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
!= REGION_INTERSECTS) &&
(region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY)
!= REGION_INTERSECTS) &&
+ (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_SOFT_RESERVED)
+ != REGION_INTERSECTS) &&
!arch_is_platform_page(base_addr)))
return -EINVAL;
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 6c735cfa7d43..ef7858393a3c 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -1373,6 +1373,7 @@ static int __init acpi_init(void)
pci_mmcfg_late_init();
acpi_iort_init();
+ acpi_viot_early_init();
acpi_hest_init();
acpi_ghes_init();
acpi_scan_init();
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
index b8e26b6b5523..35d894674eba 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
@@ -600,33 +600,6 @@ static int pcc_data_alloc(int pcc_ss_id)
return 0;
}
-/* Check if CPPC revision + num_ent combination is supported */
-static bool is_cppc_supported(int revision, int num_ent)
-{
- int expected_num_ent;
-
- switch (revision) {
- case CPPC_V2_REV:
- expected_num_ent = CPPC_V2_NUM_ENT;
- break;
- case CPPC_V3_REV:
- expected_num_ent = CPPC_V3_NUM_ENT;
- break;
- default:
- pr_debug("Firmware exports unsupported CPPC revision: %d\n",
- revision);
- return false;
- }
-
- if (expected_num_ent != num_ent) {
- pr_debug("Firmware exports %d entries. Expected: %d for CPPC rev:%d\n",
- num_ent, expected_num_ent, revision);
- return false;
- }
-
- return true;
-}
-
/*
* An example CPC table looks like the following.
*
@@ -715,7 +688,6 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
cpc_obj->type, pr->id);
goto out_free;
}
- cpc_ptr->num_entries = num_ent;
/* Second entry should be revision. */
cpc_obj = &out_obj->package.elements[1];
@@ -726,10 +698,32 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
cpc_obj->type, pr->id);
goto out_free;
}
- cpc_ptr->version = cpc_rev;
- if (!is_cppc_supported(cpc_rev, num_ent))
+ if (cpc_rev < CPPC_V2_REV) {
+ pr_debug("Unsupported _CPC Revision (%d) for CPU:%d\n", cpc_rev,
+ pr->id);
+ goto out_free;
+ }
+
+ /*
+ * Disregard _CPC if the number of entries in the return pachage is not
+ * as expected, but support future revisions being proper supersets of
+ * the v3 and only causing more entries to be returned by _CPC.
+ */
+ if ((cpc_rev == CPPC_V2_REV && num_ent != CPPC_V2_NUM_ENT) ||
+ (cpc_rev == CPPC_V3_REV && num_ent != CPPC_V3_NUM_ENT) ||
+ (cpc_rev > CPPC_V3_REV && num_ent <= CPPC_V3_NUM_ENT)) {
+ pr_debug("Unexpected number of _CPC return package entries (%d) for CPU:%d\n",
+ num_ent, pr->id);
goto out_free;
+ }
+ if (cpc_rev > CPPC_V3_REV) {
+ num_ent = CPPC_V3_NUM_ENT;
+ cpc_rev = CPPC_V3_REV;
+ }
+
+ cpc_ptr->num_entries = num_ent;
+ cpc_ptr->version = cpc_rev;
/* Iterate through remaining entries in _CPC */
for (i = 2; i < num_ent; i++) {
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index a1b871a418f8..488c9ec0da0b 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -180,7 +180,6 @@ static struct workqueue_struct *ec_wq;
static struct workqueue_struct *ec_query_wq;
static int EC_FLAGS_CORRECT_ECDT; /* Needs ECDT port address correction */
-static int EC_FLAGS_IGNORE_DSDT_GPE; /* Needs ECDT GPE as correction setting */
static int EC_FLAGS_TRUST_DSDT_GPE; /* Needs DSDT GPE as correction setting */
static int EC_FLAGS_CLEAR_ON_RESUME; /* Needs acpi_ec_clear() on boot/resume */
@@ -1407,24 +1406,16 @@ ec_parse_device(acpi_handle handle, u32 Level, void *context, void **retval)
if (ec->data_addr == 0 || ec->command_addr == 0)
return AE_OK;
- if (boot_ec && boot_ec_is_ecdt && EC_FLAGS_IGNORE_DSDT_GPE) {
- /*
- * Always inherit the GPE number setting from the ECDT
- * EC.
- */
- ec->gpe = boot_ec->gpe;
- } else {
- /* Get GPE bit assignment (EC events). */
- /* TODO: Add support for _GPE returning a package */
- status = acpi_evaluate_integer(handle, "_GPE", NULL, &tmp);
- if (ACPI_SUCCESS(status))
- ec->gpe = tmp;
+ /* Get GPE bit assignment (EC events). */
+ /* TODO: Add support for _GPE returning a package */
+ status = acpi_evaluate_integer(handle, "_GPE", NULL, &tmp);
+ if (ACPI_SUCCESS(status))
+ ec->gpe = tmp;
+ /*
+ * Errors are non-fatal, allowing for ACPI Reduced Hardware
+ * platforms which use GpioInt instead of GPE.
+ */
- /*
- * Errors are non-fatal, allowing for ACPI Reduced Hardware
- * platforms which use GpioInt instead of GPE.
- */
- }
/* Use the global lock for all EC transactions? */
tmp = 0;
acpi_evaluate_integer(handle, "_GLK", NULL, &tmp);
@@ -1862,60 +1853,12 @@ static int ec_honor_dsdt_gpe(const struct dmi_system_id *id)
return 0;
}
-/*
- * Some DSDTs contain wrong GPE setting.
- * Asus FX502VD/VE, GL702VMK, X550VXK, X580VD
- * https://bugzilla.kernel.org/show_bug.cgi?id=195651
- */
-static int ec_honor_ecdt_gpe(const struct dmi_system_id *id)
-{
- pr_debug("Detected system needing ignore DSDT GPE setting.\n");
- EC_FLAGS_IGNORE_DSDT_GPE = 1;
- return 0;
-}
-
static const struct dmi_system_id ec_dmi_table[] __initconst = {
{
ec_correct_ecdt, "MSI MS-171F", {
DMI_MATCH(DMI_SYS_VENDOR, "Micro-Star"),
DMI_MATCH(DMI_PRODUCT_NAME, "MS-171F"),}, NULL},
{
- ec_honor_ecdt_gpe, "ASUS FX502VD", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "FX502VD"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUS FX502VE", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "FX502VE"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUS GL702VMK", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "GL702VMK"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X505BA", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "X505BA"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X505BP", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "X505BP"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X542BA", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "X542BA"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUSTeK COMPUTER INC. X542BP", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "X542BP"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUS X550VXK", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "X550VXK"),}, NULL},
- {
- ec_honor_ecdt_gpe, "ASUS X580VD", {
- DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "X580VD"),}, NULL},
- {
/* https://bugzilla.kernel.org/show_bug.cgi?id=209989 */
ec_honor_dsdt_gpe, "HP Pavilion Gaming Laptop 15-cx0xxx", {
DMI_MATCH(DMI_SYS_VENDOR, "HP"),
@@ -2207,13 +2150,6 @@ static const struct dmi_system_id acpi_ec_no_wakeup[] = {
DMI_MATCH(DMI_PRODUCT_FAMILY, "Thinkpad X1 Carbon 6th"),
},
},
- {
- .ident = "ThinkPad X1 Carbon 6th",
- .matches = {
- DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_FAMILY, "ThinkPad X1 Carbon 6th"),
- },
- },
{
.ident = "ThinkPad X1 Yoga 3rd",
.matches = {
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index eb95e188d62b..573ba7f617e8 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -602,7 +602,7 @@ static DEFINE_RAW_SPINLOCK(c3_lock);
* @cx: Target state context
* @index: index of target state
*/
-static int acpi_idle_enter_bm(struct cpuidle_driver *drv,
+static int __cpuidle acpi_idle_enter_bm(struct cpuidle_driver *drv,
struct acpi_processor *pr,
struct acpi_processor_cx *cx,
int index)
@@ -659,7 +659,7 @@ static int acpi_idle_enter_bm(struct cpuidle_driver *drv,
return index;
}
-static int acpi_idle_enter(struct cpuidle_device *dev,
+static int __cpuidle acpi_idle_enter(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
@@ -688,7 +688,7 @@ static int acpi_idle_enter(struct cpuidle_device *dev,
return index;
}
-static int acpi_idle_enter_s2idle(struct cpuidle_device *dev,
+static int __cpuidle acpi_idle_enter_s2idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 3147702710af..1ec3238e2cdc 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -360,6 +360,14 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
DMI_MATCH(DMI_PRODUCT_NAME, "80E3"),
},
},
+ {
+ .callback = init_nvs_save_s3,
+ .ident = "Lenovo G40-45",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "80E1"),
+ },
+ },
/*
* ThinkPad X1 Tablet(2016) cannot do suspend-to-idle using
* the Low Power S0 Idle firmware interface (see
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 6615f59ab7fd..5d7f38016a24 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -347,6 +347,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "MacBookPro12,1"),
},
},
+ {
+ .callback = video_detect_force_native,
+ /* Dell Inspiron N4010 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Inspiron N4010"),
+ },
+ },
{
.callback = video_detect_force_native,
/* Dell Vostro V131 */
diff --git a/drivers/acpi/viot.c b/drivers/acpi/viot.c
index d2256326c73a..647f11cf165d 100644
--- a/drivers/acpi/viot.c
+++ b/drivers/acpi/viot.c
@@ -248,6 +248,26 @@ static int __init viot_parse_node(const struct acpi_viot_header *hdr)
return ret;
}
+/**
+ * acpi_viot_early_init - Test the presence of VIOT and enable ACS
+ *
+ * If the VIOT does exist, ACS must be enabled. This cannot be
+ * done in acpi_viot_init() which is called after the bus scan
+ */
+void __init acpi_viot_early_init(void)
+{
+#ifdef CONFIG_PCI
+ acpi_status status;
+ struct acpi_table_header *hdr;
+
+ status = acpi_get_table(ACPI_SIG_VIOT, 0, &hdr);
+ if (ACPI_FAILURE(status))
+ return;
+ pci_request_acs();
+ acpi_put_table(hdr);
+#endif
+}
+
/**
* acpi_viot_init - Parse the VIOT table
*
@@ -319,12 +339,6 @@ static int viot_pci_dev_iommu_init(struct pci_dev *pdev, u16 dev_id, void *data)
epid = ((domain_nr - ep->segment_start) << 16) +
dev_id - ep->bdf_start + ep->endpoint_id;
- /*
- * If we found a PCI range managed by the viommu, we're
- * the one that has to request ACS.
- */
- pci_request_acs();
-
return viot_dev_iommu_init(&pdev->dev, ep->viommu,
epid);
}
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index f3b639e89dd8..5243fe0eb402 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -170,8 +170,32 @@ static inline void binder_stats_created(enum binder_stat_types type)
atomic_inc(&binder_stats.obj_created[type]);
}
-struct binder_transaction_log binder_transaction_log;
-struct binder_transaction_log binder_transaction_log_failed;
+struct binder_transaction_log_entry {
+ int debug_id;
+ int debug_id_done;
+ int call_type;
+ int from_proc;
+ int from_thread;
+ int target_handle;
+ int to_proc;
+ int to_thread;
+ int to_node;
+ int data_size;
+ int offsets_size;
+ int return_error_line;
+ uint32_t return_error;
+ uint32_t return_error_param;
+ char context_name[BINDERFS_MAX_NAME + 1];
+};
+
+struct binder_transaction_log {
+ atomic_t cur;
+ bool full;
+ struct binder_transaction_log_entry entry[32];
+};
+
+static struct binder_transaction_log binder_transaction_log;
+static struct binder_transaction_log binder_transaction_log_failed;
static struct binder_transaction_log_entry *binder_transaction_log_add(
struct binder_transaction_log *log)
@@ -6084,8 +6108,7 @@ static void print_binder_proc_stats(struct seq_file *m,
print_binder_stats(m, " ", &proc->stats);
}
-
-int binder_state_show(struct seq_file *m, void *unused)
+static int state_show(struct seq_file *m, void *unused)
{
struct binder_proc *proc;
struct binder_node *node;
@@ -6124,7 +6147,7 @@ int binder_state_show(struct seq_file *m, void *unused)
return 0;
}
-int binder_stats_show(struct seq_file *m, void *unused)
+static int stats_show(struct seq_file *m, void *unused)
{
struct binder_proc *proc;
@@ -6140,7 +6163,7 @@ int binder_stats_show(struct seq_file *m, void *unused)
return 0;
}
-int binder_transactions_show(struct seq_file *m, void *unused)
+static int transactions_show(struct seq_file *m, void *unused)
{
struct binder_proc *proc;
@@ -6196,7 +6219,7 @@ static void print_binder_transaction_log_entry(struct seq_file *m,
"\n" : " (incomplete)\n");
}
-int binder_transaction_log_show(struct seq_file *m, void *unused)
+static int transaction_log_show(struct seq_file *m, void *unused)
{
struct binder_transaction_log *log = m->private;
unsigned int log_cur = atomic_read(&log->cur);
@@ -6228,6 +6251,45 @@ const struct file_operations binder_fops = {
.release = binder_release,
};
+DEFINE_SHOW_ATTRIBUTE(state);
+DEFINE_SHOW_ATTRIBUTE(stats);
+DEFINE_SHOW_ATTRIBUTE(transactions);
+DEFINE_SHOW_ATTRIBUTE(transaction_log);
+
+const struct binder_debugfs_entry binder_debugfs_entries[] = {
+ {
+ .name = "state",
+ .mode = 0444,
+ .fops = &state_fops,
+ .data = NULL,
+ },
+ {
+ .name = "stats",
+ .mode = 0444,
+ .fops = &stats_fops,
+ .data = NULL,
+ },
+ {
+ .name = "transactions",
+ .mode = 0444,
+ .fops = &transactions_fops,
+ .data = NULL,
+ },
+ {
+ .name = "transaction_log",
+ .mode = 0444,
+ .fops = &transaction_log_fops,
+ .data = &binder_transaction_log,
+ },
+ {
+ .name = "failed_transaction_log",
+ .mode = 0444,
+ .fops = &transaction_log_fops,
+ .data = &binder_transaction_log_failed,
+ },
+ {} /* terminator */
+};
+
static int __init init_binder_device(const char *name)
{
int ret;
@@ -6273,36 +6335,18 @@ static int __init binder_init(void)
atomic_set(&binder_transaction_log_failed.cur, ~0U);
binder_debugfs_dir_entry_root = debugfs_create_dir("binder", NULL);
- if (binder_debugfs_dir_entry_root)
+ if (binder_debugfs_dir_entry_root) {
+ const struct binder_debugfs_entry *db_entry;
+
+ binder_for_each_debugfs_entry(db_entry)
+ debugfs_create_file(db_entry->name,
+ db_entry->mode,
+ binder_debugfs_dir_entry_root,
+ db_entry->data,
+ db_entry->fops);
+
binder_debugfs_dir_entry_proc = debugfs_create_dir("proc",
binder_debugfs_dir_entry_root);
-
- if (binder_debugfs_dir_entry_root) {
- debugfs_create_file("state",
- 0444,
- binder_debugfs_dir_entry_root,
- NULL,
- &binder_state_fops);
- debugfs_create_file("stats",
- 0444,
- binder_debugfs_dir_entry_root,
- NULL,
- &binder_stats_fops);
- debugfs_create_file("transactions",
- 0444,
- binder_debugfs_dir_entry_root,
- NULL,
- &binder_transactions_fops);
- debugfs_create_file("transaction_log",
- 0444,
- binder_debugfs_dir_entry_root,
- &binder_transaction_log,
- &binder_transaction_log_fops);
- debugfs_create_file("failed_transaction_log",
- 0444,
- binder_debugfs_dir_entry_root,
- &binder_transaction_log_failed,
- &binder_transaction_log_fops);
}
if (!IS_ENABLED(CONFIG_ANDROID_BINDERFS) &&
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 2ac1008a5f39..a93a7d2a8caf 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -213,7 +213,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
if (mm) {
mmap_read_lock(mm);
- vma = alloc->vma;
+ vma = vma_lookup(mm, alloc->vma_addr);
}
if (!vma && need_mm) {
@@ -313,16 +313,15 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
static inline void binder_alloc_set_vma(struct binder_alloc *alloc,
struct vm_area_struct *vma)
{
- if (vma)
+ unsigned long vm_start = 0;
+
+ if (vma) {
+ vm_start = vma->vm_start;
alloc->vma_vm_mm = vma->vm_mm;
- /*
- * If we see alloc->vma is not NULL, buffer data structures set up
- * completely. Look at smp_rmb side binder_alloc_get_vma.
- * We also want to guarantee new alloc->vma_vm_mm is always visible
- * if alloc->vma is set.
- */
- smp_wmb();
- alloc->vma = vma;
+ }
+
+ mmap_assert_write_locked(alloc->vma_vm_mm);
+ alloc->vma_addr = vm_start;
}
static inline struct vm_area_struct *binder_alloc_get_vma(
@@ -330,11 +329,9 @@ static inline struct vm_area_struct *binder_alloc_get_vma(
{
struct vm_area_struct *vma = NULL;
- if (alloc->vma) {
- /* Look at description in binder_alloc_set_vma */
- smp_rmb();
- vma = alloc->vma;
- }
+ if (alloc->vma_addr)
+ vma = vma_lookup(alloc->vma_vm_mm, alloc->vma_addr);
+
return vma;
}
@@ -817,7 +814,8 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
buffers = 0;
mutex_lock(&alloc->mutex);
- BUG_ON(alloc->vma);
+ BUG_ON(alloc->vma_addr &&
+ vma_lookup(alloc->vma_vm_mm, alloc->vma_addr));
while ((n = rb_first(&alloc->allocated_buffers))) {
buffer = rb_entry(n, struct binder_buffer, rb_node);
diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
index 7dea57a84c79..1e4fd37af5e0 100644
--- a/drivers/android/binder_alloc.h
+++ b/drivers/android/binder_alloc.h
@@ -100,7 +100,7 @@ struct binder_lru_page {
*/
struct binder_alloc {
struct mutex mutex;
- struct vm_area_struct *vma;
+ unsigned long vma_addr;
struct mm_struct *vma_vm_mm;
void __user *buffer;
struct list_head buffers;
diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c
index c2b323bc3b3a..43a881073a42 100644
--- a/drivers/android/binder_alloc_selftest.c
+++ b/drivers/android/binder_alloc_selftest.c
@@ -287,7 +287,7 @@ void binder_selftest_alloc(struct binder_alloc *alloc)
if (!binder_selftest_run)
return;
mutex_lock(&binder_selftest_lock);
- if (!binder_selftest_run || !alloc->vma)
+ if (!binder_selftest_run || !alloc->vma_addr)
goto done;
pr_info("STARTED\n");
binder_selftest_alloc_offset(alloc, end_offset, 0);
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index d6b6b8cb7346..1ade9799c8d5 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -107,41 +107,19 @@ static inline int __init init_binderfs(void)
}
#endif
-int binder_stats_show(struct seq_file *m, void *unused);
-DEFINE_SHOW_ATTRIBUTE(binder_stats);
-
-int binder_state_show(struct seq_file *m, void *unused);
-DEFINE_SHOW_ATTRIBUTE(binder_state);
-
-int binder_transactions_show(struct seq_file *m, void *unused);
-DEFINE_SHOW_ATTRIBUTE(binder_transactions);
-
-int binder_transaction_log_show(struct seq_file *m, void *unused);
-DEFINE_SHOW_ATTRIBUTE(binder_transaction_log);
-
-struct binder_transaction_log_entry {
- int debug_id;
- int debug_id_done;
- int call_type;
- int from_proc;
- int from_thread;
- int target_handle;
- int to_proc;
- int to_thread;
- int to_node;
- int data_size;
- int offsets_size;
- int return_error_line;
- uint32_t return_error;
- uint32_t return_error_param;
- char context_name[BINDERFS_MAX_NAME + 1];
+struct binder_debugfs_entry {
+ const char *name;
+ umode_t mode;
+ const struct file_operations *fops;
+ void *data;
};
-struct binder_transaction_log {
- atomic_t cur;
- bool full;
- struct binder_transaction_log_entry entry[32];
-};
+extern const struct binder_debugfs_entry binder_debugfs_entries[];
+
+#define binder_for_each_debugfs_entry(entry) \
+ for ((entry) = binder_debugfs_entries; \
+ (entry)->name; \
+ (entry)++)
enum binder_stat_types {
BINDER_STAT_PROC,
@@ -575,6 +553,4 @@ struct binder_object {
};
};
-extern struct binder_transaction_log binder_transaction_log;
-extern struct binder_transaction_log binder_transaction_log_failed;
#endif /* _LINUX_BINDER_INTERNAL_H */
diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c
index e3605cdd4335..6d717ed76766 100644
--- a/drivers/android/binderfs.c
+++ b/drivers/android/binderfs.c
@@ -621,6 +621,7 @@ static int init_binder_features(struct super_block *sb)
static int init_binder_logs(struct super_block *sb)
{
struct dentry *binder_logs_root_dir, *dentry, *proc_log_dir;
+ const struct binder_debugfs_entry *db_entry;
struct binderfs_info *info;
int ret = 0;
@@ -631,43 +632,15 @@ static int init_binder_logs(struct super_block *sb)
goto out;
}
- dentry = binderfs_create_file(binder_logs_root_dir, "stats",
- &binder_stats_fops, NULL);
- if (IS_ERR(dentry)) {
- ret = PTR_ERR(dentry);
- goto out;
- }
-
- dentry = binderfs_create_file(binder_logs_root_dir, "state",
- &binder_state_fops, NULL);
- if (IS_ERR(dentry)) {
- ret = PTR_ERR(dentry);
- goto out;
- }
-
- dentry = binderfs_create_file(binder_logs_root_dir, "transactions",
- &binder_transactions_fops, NULL);
- if (IS_ERR(dentry)) {
- ret = PTR_ERR(dentry);
- goto out;
- }
-
- dentry = binderfs_create_file(binder_logs_root_dir,
- "transaction_log",
- &binder_transaction_log_fops,
- &binder_transaction_log);
- if (IS_ERR(dentry)) {
- ret = PTR_ERR(dentry);
- goto out;
- }
-
- dentry = binderfs_create_file(binder_logs_root_dir,
- "failed_transaction_log",
- &binder_transaction_log_fops,
- &binder_transaction_log_failed);
- if (IS_ERR(dentry)) {
- ret = PTR_ERR(dentry);
- goto out;
+ binder_for_each_debugfs_entry(db_entry) {
+ dentry = binderfs_create_file(binder_logs_root_dir,
+ db_entry->name,
+ db_entry->fops,
+ db_entry->data);
+ if (IS_ERR(dentry)) {
+ ret = PTR_ERR(dentry);
+ goto out;
+ }
}
proc_log_dir = binderfs_create_dir(binder_logs_root_dir, "proc");
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index d6980f33afc4..822dfa6561c2 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -1091,6 +1091,7 @@ static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie)
static int __driver_attach(struct device *dev, void *data)
{
struct device_driver *drv = data;
+ bool async = false;
int ret;
/*
@@ -1129,9 +1130,11 @@ static int __driver_attach(struct device *dev, void *data)
if (!dev->driver) {
get_device(dev);
dev->p->async_driver = drv;
- async_schedule_dev(__driver_attach_async_helper, dev);
+ async = true;
}
device_unlock(dev);
+ if (async)
+ async_schedule_dev(__driver_attach_async_helper, dev);
return 0;
}
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 0ac6376ef7a1..eb0f43784c2b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -45,7 +45,7 @@ static inline ssize_t cpumap_read(struct file *file, struct kobject *kobj,
return n;
}
-static BIN_ATTR_RO(cpumap, 0);
+static BIN_ATTR_RO(cpumap, CPUMAP_FILE_MAX_BYTES);
static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
@@ -66,7 +66,7 @@ static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
return n;
}
-static BIN_ATTR_RO(cpulist, 0);
+static BIN_ATTR_RO(cpulist, CPULIST_FILE_MAX_BYTES);
/**
* struct node_access_nodes - Access class device to hold user visible
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index f0e4b0ea93e8..59a706a126e5 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -219,6 +219,9 @@ static void genpd_debug_remove(struct generic_pm_domain *genpd)
{
struct dentry *d;
+ if (!genpd_debugfs_dir)
+ return;
+
d = debugfs_lookup(genpd->name, genpd_debugfs_dir);
debugfs_remove(d);
}
diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index ac6ad9ab67f9..89f98be5c5b9 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -62,47 +62,47 @@ define_id_show_func(ppin, "0x%llx");
static DEVICE_ATTR_ADMIN_RO(ppin);
define_siblings_read_func(thread_siblings, sibling_cpumask);
-static BIN_ATTR_RO(thread_siblings, 0);
-static BIN_ATTR_RO(thread_siblings_list, 0);
+static BIN_ATTR_RO(thread_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(thread_siblings_list, CPULIST_FILE_MAX_BYTES);
define_siblings_read_func(core_cpus, sibling_cpumask);
-static BIN_ATTR_RO(core_cpus, 0);
-static BIN_ATTR_RO(core_cpus_list, 0);
+static BIN_ATTR_RO(core_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_cpus_list, CPULIST_FILE_MAX_BYTES);
define_siblings_read_func(core_siblings, core_cpumask);
-static BIN_ATTR_RO(core_siblings, 0);
-static BIN_ATTR_RO(core_siblings_list, 0);
+static BIN_ATTR_RO(core_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(core_siblings_list, CPULIST_FILE_MAX_BYTES);
#ifdef TOPOLOGY_CLUSTER_SYSFS
define_siblings_read_func(cluster_cpus, cluster_cpumask);
-static BIN_ATTR_RO(cluster_cpus, 0);
-static BIN_ATTR_RO(cluster_cpus_list, 0);
+static BIN_ATTR_RO(cluster_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(cluster_cpus_list, CPULIST_FILE_MAX_BYTES);
#endif
#ifdef TOPOLOGY_DIE_SYSFS
define_siblings_read_func(die_cpus, die_cpumask);
-static BIN_ATTR_RO(die_cpus, 0);
-static BIN_ATTR_RO(die_cpus_list, 0);
+static BIN_ATTR_RO(die_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(die_cpus_list, CPULIST_FILE_MAX_BYTES);
#endif
define_siblings_read_func(package_cpus, core_cpumask);
-static BIN_ATTR_RO(package_cpus, 0);
-static BIN_ATTR_RO(package_cpus_list, 0);
+static BIN_ATTR_RO(package_cpus, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(package_cpus_list, CPULIST_FILE_MAX_BYTES);
#ifdef TOPOLOGY_BOOK_SYSFS
define_id_show_func(book_id, "%d");
static DEVICE_ATTR_RO(book_id);
define_siblings_read_func(book_siblings, book_cpumask);
-static BIN_ATTR_RO(book_siblings, 0);
-static BIN_ATTR_RO(book_siblings_list, 0);
+static BIN_ATTR_RO(book_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(book_siblings_list, CPULIST_FILE_MAX_BYTES);
#endif
#ifdef TOPOLOGY_DRAWER_SYSFS
define_id_show_func(drawer_id, "%d");
static DEVICE_ATTR_RO(drawer_id);
define_siblings_read_func(drawer_siblings, drawer_cpumask);
-static BIN_ATTR_RO(drawer_siblings, 0);
-static BIN_ATTR_RO(drawer_siblings_list, 0);
+static BIN_ATTR_RO(drawer_siblings, CPUMAP_FILE_MAX_BYTES);
+static BIN_ATTR_RO(drawer_siblings_list, CPULIST_FILE_MAX_BYTES);
#endif
static struct bin_attribute *bin_attrs[] = {
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index c441a4972064..3e59e824ed1f 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -2044,8 +2044,13 @@ static int null_add_dev(struct nullb_device *dev)
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, nullb->q);
mutex_lock(&lock);
- nullb->index = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
- dev->index = nullb->index;
+ rv = ida_simple_get(&nullb_indexes, 0, 0, GFP_KERNEL);
+ if (rv < 0) {
+ mutex_unlock(&lock);
+ goto out_cleanup_zone;
+ }
+ nullb->index = rv;
+ dev->index = rv;
mutex_unlock(&lock);
blk_queue_logical_block_size(nullb->q, dev->blocksize);
@@ -2065,13 +2070,16 @@ static int null_add_dev(struct nullb_device *dev)
rv = null_gendisk_register(nullb);
if (rv)
- goto out_cleanup_zone;
+ goto out_ida_free;
mutex_lock(&lock);
list_add_tail(&nullb->list, &nullb_list);
mutex_unlock(&lock);
return 0;
+
+out_ida_free:
+ ida_free(&nullb_indexes, nullb->index);
out_cleanup_zone:
null_free_zoned_dev(dev);
out_cleanup_disk:
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index f04df6294650..963df70ea026 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -323,10 +323,11 @@ void rnbd_srv_sess_dev_force_close(struct rnbd_srv_sess_dev *sess_dev,
{
struct rnbd_srv_session *sess = sess_dev->sess;
- sess_dev->keep_id = true;
/* It is already started to close by client's close message. */
if (!mutex_trylock(&sess->lock))
return;
+
+ sess_dev->keep_id = true;
/* first remove sysfs itself to avoid deadlock */
sysfs_remove_file_self(&sess_dev->kobj, &attr->attr);
rnbd_srv_destroy_dev_session_sysfs(sess_dev);
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index f09040435e2e..7e0d1eb29d8f 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -157,6 +157,11 @@ static int xen_blkif_alloc_rings(struct xen_blkif *blkif)
return 0;
}
+/* Enable the persistent grants feature. */
+static bool feature_persistent = true;
+module_param(feature_persistent, bool, 0644);
+MODULE_PARM_DESC(feature_persistent, "Enables the persistent grants feature");
+
static struct xen_blkif *xen_blkif_alloc(domid_t domid)
{
struct xen_blkif *blkif;
@@ -472,12 +477,6 @@ static void xen_vbd_free(struct xen_vbd *vbd)
vbd->bdev = NULL;
}
-/* Enable the persistent grants feature. */
-static bool feature_persistent = true;
-module_param(feature_persistent, bool, 0644);
-MODULE_PARM_DESC(feature_persistent,
- "Enables the persistent grants feature");
-
static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
unsigned major, unsigned minor, int readonly,
int cdrom)
@@ -523,8 +522,6 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
if (q && blk_queue_secure_erase(q))
vbd->discard_secure = true;
- vbd->feature_gnt_persistent = feature_persistent;
-
pr_debug("Successful creation of handle=%04x (dom=%u)\n",
handle, blkif->domid);
return 0;
@@ -1091,10 +1088,9 @@ static int connect_ring(struct backend_info *be)
xenbus_dev_fatal(dev, err, "unknown fe protocol %s", protocol);
return -ENOSYS;
}
- if (blkif->vbd.feature_gnt_persistent)
- blkif->vbd.feature_gnt_persistent =
- xenbus_read_unsigned(dev->otherend,
- "feature-persistent", 0);
+
+ blkif->vbd.feature_gnt_persistent = feature_persistent &&
+ xenbus_read_unsigned(dev->otherend, "feature-persistent", 0);
blkif->vbd.overflow_max_grants = 0;
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index cf9cfc40a028..6aa1bcca8645 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2011,8 +2011,6 @@ static int blkfront_probe(struct xenbus_device *dev,
info->vdevice = vdevice;
info->connected = BLKIF_STATE_DISCONNECTED;
- info->feature_persistent = feature_persistent;
-
/* Front end dir is a number, which is used as the id. */
info->handle = simple_strtoul(strrchr(dev->nodename, '/')+1, NULL, 0);
dev_set_drvdata(&dev->dev, info);
@@ -2306,7 +2304,7 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
if (xenbus_read_unsigned(info->xbdev->otherend, "feature-discard", 0))
blkfront_setup_discard(info);
- if (info->feature_persistent)
+ if (feature_persistent)
info->feature_persistent =
!!xenbus_read_unsigned(info->xbdev->otherend,
"feature-persistent", 0);
diff --git a/drivers/bluetooth/hci_intel.c b/drivers/bluetooth/hci_intel.c
index 7249b91d9b91..78afb9a348e7 100644
--- a/drivers/bluetooth/hci_intel.c
+++ b/drivers/bluetooth/hci_intel.c
@@ -1217,7 +1217,11 @@ static struct platform_driver intel_driver = {
int __init intel_init(void)
{
- platform_driver_register(&intel_driver);
+ int err;
+
+ err = platform_driver_register(&intel_driver);
+ if (err)
+ return err;
return hci_uart_register_proto(&intel_proto);
}
diff --git a/drivers/bluetooth/hci_serdev.c b/drivers/bluetooth/hci_serdev.c
index 4cda890ce647..c0e5f42ec6b7 100644
--- a/drivers/bluetooth/hci_serdev.c
+++ b/drivers/bluetooth/hci_serdev.c
@@ -231,6 +231,15 @@ static int hci_uart_setup(struct hci_dev *hdev)
return 0;
}
+/* Check if the device is wakeable */
+static bool hci_uart_wakeup(struct hci_dev *hdev)
+{
+ /* HCI UART devices are assumed to be wakeable by default.
+ * Implement wakeup callback to override this behavior.
+ */
+ return true;
+}
+
/** hci_uart_write_wakeup - transmit buffer wakeup
* @serdev: serial device
*
@@ -342,6 +351,8 @@ int hci_uart_register_device(struct hci_uart *hu,
hdev->flush = hci_uart_flush;
hdev->send = hci_uart_send_frame;
hdev->setup = hci_uart_setup;
+ if (!hdev->wakeup)
+ hdev->wakeup = hci_uart_wakeup;
SET_HCIDEV_DEV(hdev, &hu->serdev->dev);
if (test_bit(HCI_UART_NO_SUSPEND_NOTIFIER, &hu->flags))
diff --git a/drivers/bus/hisi_lpc.c b/drivers/bus/hisi_lpc.c
index 378f5d62a991..e7eaa8784fee 100644
--- a/drivers/bus/hisi_lpc.c
+++ b/drivers/bus/hisi_lpc.c
@@ -503,13 +503,13 @@ static int hisi_lpc_acpi_probe(struct device *hostdev)
{
struct acpi_device *adev = ACPI_COMPANION(hostdev);
struct acpi_device *child;
+ struct platform_device *pdev;
int ret;
/* Only consider the children of the host */
list_for_each_entry(child, &adev->children, node) {
const char *hid = acpi_device_hid(child);
const struct hisi_lpc_acpi_cell *cell;
- struct platform_device *pdev;
const struct resource *res;
bool found = false;
int num_res;
@@ -571,22 +571,24 @@ static int hisi_lpc_acpi_probe(struct device *hostdev)
ret = platform_device_add_resources(pdev, res, num_res);
if (ret)
- goto fail;
+ goto fail_put_device;
ret = platform_device_add_data(pdev, cell->pdata,
cell->pdata_size);
if (ret)
- goto fail;
+ goto fail_put_device;
ret = platform_device_add(pdev);
if (ret)
- goto fail;
+ goto fail_put_device;
acpi_device_set_enumerated(child);
}
return 0;
+fail_put_device:
+ platform_device_put(pdev);
fail:
hisi_lpc_acpi_remove(hostdev);
return ret;
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index 04a3e23a4afc..4419593d9531 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -752,6 +752,12 @@ int tpm2_auto_startup(struct tpm_chip *chip)
}
rc = tpm2_get_cc_attrs_tbl(chip);
+ if (rc == TPM2_RC_FAILURE || (rc < 0 && rc != -ENOMEM)) {
+ dev_info(&chip->dev,
+ "TPM in field failure mode, requires firmware upgrade\n");
+ chip->flags |= TPM_CHIP_FLAG_FIRMWARE_UPGRADE;
+ rc = 0;
+ }
out:
if (rc == TPM2_RC_UPGRADE) {
diff --git a/drivers/clk/imx/clk-fracn-gppll.c b/drivers/clk/imx/clk-fracn-gppll.c
index 71c102d950ab..025b73229cdd 100644
--- a/drivers/clk/imx/clk-fracn-gppll.c
+++ b/drivers/clk/imx/clk-fracn-gppll.c
@@ -64,10 +64,10 @@ struct clk_fracn_gppll {
* Fout = Fvco / (rdiv * odiv)
*/
static const struct imx_fracn_gppll_rate_table fracn_tbl[] = {
- PLL_FRACN_GP(650000000U, 81, 0, 0, 0, 3),
- PLL_FRACN_GP(594000000U, 198, 0, 0, 0, 8),
- PLL_FRACN_GP(560000000U, 70, 0, 0, 0, 3),
- PLL_FRACN_GP(400000000U, 50, 0, 0, 0, 3),
+ PLL_FRACN_GP(650000000U, 81, 0, 1, 0, 3),
+ PLL_FRACN_GP(594000000U, 198, 0, 1, 0, 8),
+ PLL_FRACN_GP(560000000U, 70, 0, 1, 0, 3),
+ PLL_FRACN_GP(400000000U, 50, 0, 1, 0, 3),
PLL_FRACN_GP(393216000U, 81, 92, 100, 0, 5)
};
@@ -131,18 +131,7 @@ static unsigned long clk_fracn_gppll_recalc_rate(struct clk_hw *hw, unsigned lon
mfi = FIELD_GET(PLL_MFI_MASK, pll_div);
rdiv = FIELD_GET(PLL_RDIV_MASK, pll_div);
- rdiv = rdiv + 1;
odiv = FIELD_GET(PLL_ODIV_MASK, pll_div);
- switch (odiv) {
- case 0:
- odiv = 2;
- break;
- case 1:
- odiv = 3;
- break;
- default:
- break;
- }
/*
* Sometimes, the recalculated rate has deviation due to
@@ -160,6 +149,20 @@ static unsigned long clk_fracn_gppll_recalc_rate(struct clk_hw *hw, unsigned lon
if (rate)
return (unsigned long)rate;
+ if (!rdiv)
+ rdiv = rdiv + 1;
+
+ switch (odiv) {
+ case 0:
+ odiv = 2;
+ break;
+ case 1:
+ odiv = 3;
+ break;
+ default:
+ break;
+ }
+
/* Fvco = Fref * (MFI + MFN / MFD) */
fvco = fvco * mfi * mfd + fvco * mfn;
do_div(fvco, mfd * rdiv * odiv);
diff --git a/drivers/clk/imx/clk-imx93.c b/drivers/clk/imx/clk-imx93.c
index edcc87661d1f..26885bd3971c 100644
--- a/drivers/clk/imx/clk-imx93.c
+++ b/drivers/clk/imx/clk-imx93.c
@@ -150,7 +150,7 @@ static const struct imx93_clk_ccgr {
{ IMX93_CLK_A55_GATE, "a55", "a55_root", 0x8000, },
/* M33 critical clk for system run */
{ IMX93_CLK_CM33_GATE, "cm33", "m33_root", 0x8040, CLK_IS_CRITICAL },
- { IMX93_CLK_ADC1_GATE, "adc1", "osc_24m", 0x82c0, },
+ { IMX93_CLK_ADC1_GATE, "adc1", "adc_root", 0x82c0, },
{ IMX93_CLK_WDOG1_GATE, "wdog1", "osc_24m", 0x8300, },
{ IMX93_CLK_WDOG2_GATE, "wdog2", "osc_24m", 0x8340, },
{ IMX93_CLK_WDOG3_GATE, "wdog3", "osc_24m", 0x8380, },
@@ -219,7 +219,7 @@ static const struct imx93_clk_ccgr {
{ IMX93_CLK_LCDIF_GATE, "lcdif", "media_apb_root", 0x9640, },
{ IMX93_CLK_PXP_GATE, "pxp", "media_apb_root", 0x9680, },
{ IMX93_CLK_ISI_GATE, "isi", "media_apb_root", 0x96c0, },
- { IMX93_CLK_NIC_MEDIA_GATE, "nic_media", "media_apb_root", 0x9700, },
+ { IMX93_CLK_NIC_MEDIA_GATE, "nic_media", "media_axi_root", 0x9700, },
{ IMX93_CLK_USB_CONTROLLER_GATE, "usb_controller", "hsio_root", 0x9a00, },
{ IMX93_CLK_USB_TEST_60M_GATE, "usb_test_60m", "hsio_usb_test_60m_root", 0x9a40, },
{ IMX93_CLK_HSIO_TROUT_24M_GATE, "hsio_trout_24m", "osc_24m", 0x9a80, },
diff --git a/drivers/clk/mediatek/reset.c b/drivers/clk/mediatek/reset.c
index bcec4b89f449..834d26e9bdfd 100644
--- a/drivers/clk/mediatek/reset.c
+++ b/drivers/clk/mediatek/reset.c
@@ -25,7 +25,7 @@ static int mtk_reset_assert_set_clr(struct reset_controller_dev *rcdev,
struct mtk_reset *data = container_of(rcdev, struct mtk_reset, rcdev);
unsigned int reg = data->regofs + ((id / 32) << 4);
- return regmap_write(data->regmap, reg, 1);
+ return regmap_write(data->regmap, reg, BIT(id % 32));
}
static int mtk_reset_deassert_set_clr(struct reset_controller_dev *rcdev,
@@ -34,7 +34,7 @@ static int mtk_reset_deassert_set_clr(struct reset_controller_dev *rcdev,
struct mtk_reset *data = container_of(rcdev, struct mtk_reset, rcdev);
unsigned int reg = data->regofs + ((id / 32) << 4) + 0x4;
- return regmap_write(data->regmap, reg, 1);
+ return regmap_write(data->regmap, reg, BIT(id % 32));
}
static int mtk_reset_assert(struct reset_controller_dev *rcdev,
diff --git a/drivers/clk/qcom/camcc-sdm845.c b/drivers/clk/qcom/camcc-sdm845.c
index be3f95326965..27d44188a7ab 100644
--- a/drivers/clk/qcom/camcc-sdm845.c
+++ b/drivers/clk/qcom/camcc-sdm845.c
@@ -1534,6 +1534,8 @@ static struct clk_branch cam_cc_sys_tmr_clk = {
},
};
+static struct gdsc titan_top_gdsc;
+
static struct gdsc bps_gdsc = {
.gdscr = 0x6004,
.pd = {
@@ -1567,6 +1569,7 @@ static struct gdsc ife_0_gdsc = {
.name = "ife_0_gdsc",
},
.flags = POLL_CFG_GDSCR,
+ .parent = &titan_top_gdsc.pd,
.pwrsts = PWRSTS_OFF_ON,
};
@@ -1576,6 +1579,7 @@ static struct gdsc ife_1_gdsc = {
.name = "ife_1_gdsc",
},
.flags = POLL_CFG_GDSCR,
+ .parent = &titan_top_gdsc.pd,
.pwrsts = PWRSTS_OFF_ON,
};
diff --git a/drivers/clk/qcom/camcc-sm8250.c b/drivers/clk/qcom/camcc-sm8250.c
index 439eaafdcc86..9b32c56a5bc5 100644
--- a/drivers/clk/qcom/camcc-sm8250.c
+++ b/drivers/clk/qcom/camcc-sm8250.c
@@ -2205,6 +2205,8 @@ static struct clk_branch cam_cc_sleep_clk = {
},
};
+static struct gdsc titan_top_gdsc;
+
static struct gdsc bps_gdsc = {
.gdscr = 0x7004,
.pd = {
@@ -2238,6 +2240,7 @@ static struct gdsc ife_0_gdsc = {
.name = "ife_0_gdsc",
},
.flags = POLL_CFG_GDSCR,
+ .parent = &titan_top_gdsc.pd,
.pwrsts = PWRSTS_OFF_ON,
};
@@ -2247,6 +2250,7 @@ static struct gdsc ife_1_gdsc = {
.name = "ife_1_gdsc",
},
.flags = POLL_CFG_GDSCR,
+ .parent = &titan_top_gdsc.pd,
.pwrsts = PWRSTS_OFF_ON,
};
@@ -2440,17 +2444,7 @@ static struct platform_driver cam_cc_sm8250_driver = {
},
};
-static int __init cam_cc_sm8250_init(void)
-{
- return platform_driver_register(&cam_cc_sm8250_driver);
-}
-subsys_initcall(cam_cc_sm8250_init);
-
-static void __exit cam_cc_sm8250_exit(void)
-{
- platform_driver_unregister(&cam_cc_sm8250_driver);
-}
-module_exit(cam_cc_sm8250_exit);
+module_platform_driver(cam_cc_sm8250_driver);
MODULE_DESCRIPTION("QTI CAMCC SM8250 Driver");
MODULE_LICENSE("GPL v2");
diff --git a/drivers/clk/qcom/clk-krait.c b/drivers/clk/qcom/clk-krait.c
index 59f1af415b58..90046428693c 100644
--- a/drivers/clk/qcom/clk-krait.c
+++ b/drivers/clk/qcom/clk-krait.c
@@ -32,11 +32,16 @@ static void __krait_mux_set_sel(struct krait_mux_clk *mux, int sel)
regval |= (sel & mux->mask) << (mux->shift + LPL_SHIFT);
}
krait_set_l2_indirect_reg(mux->offset, regval);
- spin_unlock_irqrestore(&krait_clock_reg_lock, flags);
/* Wait for switch to complete. */
mb();
udelay(1);
+
+ /*
+ * Unlock now to make sure the mux register is not
+ * modified while switching to the new parent.
+ */
+ spin_unlock_irqrestore(&krait_clock_reg_lock, flags);
}
static int krait_mux_set_parent(struct clk_hw *hw, u8 index)
diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
index e9c357309fd9..caafc35eecfd 100644
--- a/drivers/clk/qcom/clk-rcg2.c
+++ b/drivers/clk/qcom/clk-rcg2.c
@@ -13,6 +13,7 @@
#include <linux/rational.h>
#include <linux/regmap.h>
#include <linux/math64.h>
+#include <linux/minmax.h>
#include <linux/slab.h>
#include <asm/div64.h>
@@ -405,7 +406,7 @@ static int clk_rcg2_get_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
static int clk_rcg2_set_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
{
struct clk_rcg2 *rcg = to_clk_rcg2(hw);
- u32 notn_m, n, m, d, not2d, mask, duty_per;
+ u32 notn_m, n, m, d, not2d, mask, duty_per, cfg;
int ret;
/* Duty-cycle cannot be modified for non-MND RCGs */
@@ -416,6 +417,11 @@ static int clk_rcg2_set_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
regmap_read(rcg->clkr.regmap, RCG_N_OFFSET(rcg), ¬n_m);
regmap_read(rcg->clkr.regmap, RCG_M_OFFSET(rcg), &m);
+ regmap_read(rcg->clkr.regmap, RCG_CFG_OFFSET(rcg), &cfg);
+
+ /* Duty-cycle cannot be modified if MND divider is in bypass mode. */
+ if (!(cfg & CFG_MODE_MASK))
+ return -EINVAL;
n = (~(notn_m) + m) & mask;
@@ -424,9 +430,11 @@ static int clk_rcg2_set_duty_cycle(struct clk_hw *hw, struct clk_duty *duty)
/* Calculate 2d value */
d = DIV_ROUND_CLOSEST(n * duty_per * 2, 100);
- /* Check bit widths of 2d. If D is too big reduce duty cycle. */
- if (d > mask)
- d = mask;
+ /*
+ * Check bit widths of 2d. If D is too big reduce duty cycle.
+ * Also make sure it is never zero.
+ */
+ d = clamp_val(d, 1, mask);
if ((d / 2) > (n - m))
d = (n - m) * 2;
diff --git a/drivers/clk/qcom/dispcc-sm8250.c b/drivers/clk/qcom/dispcc-sm8250.c
index db9379634fb2..f646fdfe6f15 100644
--- a/drivers/clk/qcom/dispcc-sm8250.c
+++ b/drivers/clk/qcom/dispcc-sm8250.c
@@ -1134,7 +1134,6 @@ static struct gdsc mdss_gdsc = {
},
.pwrsts = PWRSTS_OFF_ON,
.flags = HW_CTRL,
- .supply = "mmcx",
};
static struct clk_regmap *disp_cc_sm8250_clocks[] = {
diff --git a/drivers/clk/qcom/gcc-ipq8074.c b/drivers/clk/qcom/gcc-ipq8074.c
index 541016db3c4b..2c2ecfc5e61f 100644
--- a/drivers/clk/qcom/gcc-ipq8074.c
+++ b/drivers/clk/qcom/gcc-ipq8074.c
@@ -1788,8 +1788,10 @@ static struct clk_regmap_div nss_port4_tx_div_clk_src = {
static const struct freq_tbl ftbl_nss_port5_rx_clk_src[] = {
F(19200000, P_XO, 1, 0, 0),
F(25000000, P_UNIPHY1_RX, 12.5, 0, 0),
+ F(25000000, P_UNIPHY0_RX, 5, 0, 0),
F(78125000, P_UNIPHY1_RX, 4, 0, 0),
F(125000000, P_UNIPHY1_RX, 2.5, 0, 0),
+ F(125000000, P_UNIPHY0_RX, 1, 0, 0),
F(156250000, P_UNIPHY1_RX, 2, 0, 0),
F(312500000, P_UNIPHY1_RX, 1, 0, 0),
{ }
@@ -1828,8 +1830,10 @@ static struct clk_regmap_div nss_port5_rx_div_clk_src = {
static const struct freq_tbl ftbl_nss_port5_tx_clk_src[] = {
F(19200000, P_XO, 1, 0, 0),
F(25000000, P_UNIPHY1_TX, 12.5, 0, 0),
+ F(25000000, P_UNIPHY0_TX, 5, 0, 0),
F(78125000, P_UNIPHY1_TX, 4, 0, 0),
F(125000000, P_UNIPHY1_TX, 2.5, 0, 0),
+ F(125000000, P_UNIPHY0_TX, 1, 0, 0),
F(156250000, P_UNIPHY1_TX, 2, 0, 0),
F(312500000, P_UNIPHY1_TX, 1, 0, 0),
{ }
@@ -1867,8 +1871,10 @@ static struct clk_regmap_div nss_port5_tx_div_clk_src = {
static const struct freq_tbl ftbl_nss_port6_rx_clk_src[] = {
F(19200000, P_XO, 1, 0, 0),
+ F(25000000, P_UNIPHY2_RX, 5, 0, 0),
F(25000000, P_UNIPHY2_RX, 12.5, 0, 0),
F(78125000, P_UNIPHY2_RX, 4, 0, 0),
+ F(125000000, P_UNIPHY2_RX, 1, 0, 0),
F(125000000, P_UNIPHY2_RX, 2.5, 0, 0),
F(156250000, P_UNIPHY2_RX, 2, 0, 0),
F(312500000, P_UNIPHY2_RX, 1, 0, 0),
@@ -1907,8 +1913,10 @@ static struct clk_regmap_div nss_port6_rx_div_clk_src = {
static const struct freq_tbl ftbl_nss_port6_tx_clk_src[] = {
F(19200000, P_XO, 1, 0, 0),
+ F(25000000, P_UNIPHY2_TX, 5, 0, 0),
F(25000000, P_UNIPHY2_TX, 12.5, 0, 0),
F(78125000, P_UNIPHY2_TX, 4, 0, 0),
+ F(125000000, P_UNIPHY2_TX, 1, 0, 0),
F(125000000, P_UNIPHY2_TX, 2.5, 0, 0),
F(156250000, P_UNIPHY2_TX, 2, 0, 0),
F(312500000, P_UNIPHY2_TX, 1, 0, 0),
@@ -3346,6 +3354,7 @@ static struct clk_branch gcc_nssnoc_ubi1_ahb_clk = {
static struct clk_branch gcc_ubi0_ahb_clk = {
.halt_reg = 0x6820c,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x6820c,
.enable_mask = BIT(0),
@@ -3363,6 +3372,7 @@ static struct clk_branch gcc_ubi0_ahb_clk = {
static struct clk_branch gcc_ubi0_axi_clk = {
.halt_reg = 0x68200,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68200,
.enable_mask = BIT(0),
@@ -3380,6 +3390,7 @@ static struct clk_branch gcc_ubi0_axi_clk = {
static struct clk_branch gcc_ubi0_nc_axi_clk = {
.halt_reg = 0x68204,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68204,
.enable_mask = BIT(0),
@@ -3397,6 +3408,7 @@ static struct clk_branch gcc_ubi0_nc_axi_clk = {
static struct clk_branch gcc_ubi0_core_clk = {
.halt_reg = 0x68210,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68210,
.enable_mask = BIT(0),
@@ -3414,6 +3426,7 @@ static struct clk_branch gcc_ubi0_core_clk = {
static struct clk_branch gcc_ubi0_mpt_clk = {
.halt_reg = 0x68208,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68208,
.enable_mask = BIT(0),
@@ -3431,6 +3444,7 @@ static struct clk_branch gcc_ubi0_mpt_clk = {
static struct clk_branch gcc_ubi1_ahb_clk = {
.halt_reg = 0x6822c,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x6822c,
.enable_mask = BIT(0),
@@ -3448,6 +3462,7 @@ static struct clk_branch gcc_ubi1_ahb_clk = {
static struct clk_branch gcc_ubi1_axi_clk = {
.halt_reg = 0x68220,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68220,
.enable_mask = BIT(0),
@@ -3465,6 +3480,7 @@ static struct clk_branch gcc_ubi1_axi_clk = {
static struct clk_branch gcc_ubi1_nc_axi_clk = {
.halt_reg = 0x68224,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68224,
.enable_mask = BIT(0),
@@ -3482,6 +3498,7 @@ static struct clk_branch gcc_ubi1_nc_axi_clk = {
static struct clk_branch gcc_ubi1_core_clk = {
.halt_reg = 0x68230,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68230,
.enable_mask = BIT(0),
@@ -3499,6 +3516,7 @@ static struct clk_branch gcc_ubi1_core_clk = {
static struct clk_branch gcc_ubi1_mpt_clk = {
.halt_reg = 0x68228,
+ .halt_check = BRANCH_HALT_DELAY,
.clkr = {
.enable_reg = 0x68228,
.enable_mask = BIT(0),
@@ -4371,6 +4389,33 @@ static struct clk_branch gcc_pcie0_axi_s_bridge_clk = {
},
};
+static const struct alpha_pll_config ubi32_pll_config = {
+ .l = 0x4e,
+ .config_ctl_val = 0x200d4aa8,
+ .config_ctl_hi_val = 0x3c2,
+ .main_output_mask = BIT(0),
+ .aux_output_mask = BIT(1),
+ .pre_div_val = 0x0,
+ .pre_div_mask = BIT(12),
+ .post_div_val = 0x0,
+ .post_div_mask = GENMASK(9, 8),
+};
+
+static const struct alpha_pll_config nss_crypto_pll_config = {
+ .l = 0x3e,
+ .alpha = 0x0,
+ .alpha_hi = 0x80,
+ .config_ctl_val = 0x4001055b,
+ .main_output_mask = BIT(0),
+ .pre_div_val = 0x0,
+ .pre_div_mask = GENMASK(14, 12),
+ .post_div_val = 0x1 << 8,
+ .post_div_mask = GENMASK(11, 8),
+ .vco_mask = GENMASK(21, 20),
+ .vco_val = 0x0,
+ .alpha_en_mask = BIT(24),
+};
+
static struct clk_hw *gcc_ipq8074_hws[] = {
&gpll0_out_main_div2.hw,
&gpll6_out_main_div2.hw,
@@ -4772,7 +4817,20 @@ static const struct qcom_cc_desc gcc_ipq8074_desc = {
static int gcc_ipq8074_probe(struct platform_device *pdev)
{
- return qcom_cc_probe(pdev, &gcc_ipq8074_desc);
+ struct regmap *regmap;
+
+ regmap = qcom_cc_map(pdev, &gcc_ipq8074_desc);
+ if (IS_ERR(regmap))
+ return PTR_ERR(regmap);
+
+ /* SW Workaround for UBI32 Huayra PLL */
+ regmap_update_bits(regmap, 0x2501c, BIT(26), BIT(26));
+
+ clk_alpha_pll_configure(&ubi32_pll_main, regmap, &ubi32_pll_config);
+ clk_alpha_pll_configure(&nss_crypto_pll_main, regmap,
+ &nss_crypto_pll_config);
+
+ return qcom_cc_really_probe(pdev, &gcc_ipq8074_desc, regmap);
}
static struct platform_driver gcc_ipq8074_driver = {
diff --git a/drivers/clk/qcom/gcc-msm8939.c b/drivers/clk/qcom/gcc-msm8939.c
index 39ebb443ae3d..de0022e5450d 100644
--- a/drivers/clk/qcom/gcc-msm8939.c
+++ b/drivers/clk/qcom/gcc-msm8939.c
@@ -632,7 +632,7 @@ static struct clk_rcg2 system_noc_bfdcd_clk_src = {
};
static struct clk_rcg2 bimc_ddr_clk_src = {
- .cmd_rcgr = 0x32004,
+ .cmd_rcgr = 0x32024,
.hid_width = 5,
.parent_map = gcc_xo_gpll0_bimc_map,
.clkr.hw.init = &(struct clk_init_data){
@@ -644,6 +644,18 @@ static struct clk_rcg2 bimc_ddr_clk_src = {
},
};
+static struct clk_rcg2 system_mm_noc_bfdcd_clk_src = {
+ .cmd_rcgr = 0x2600c,
+ .hid_width = 5,
+ .parent_map = gcc_xo_gpll0_gpll6a_map,
+ .clkr.hw.init = &(struct clk_init_data){
+ .name = "system_mm_noc_bfdcd_clk_src",
+ .parent_data = gcc_xo_gpll0_gpll6a_parent_data,
+ .num_parents = 3,
+ .ops = &clk_rcg2_ops,
+ },
+};
+
static const struct freq_tbl ftbl_gcc_camss_ahb_clk[] = {
F(40000000, P_GPLL0, 10, 1, 2),
F(80000000, P_GPLL0, 10, 0, 0),
@@ -1002,7 +1014,7 @@ static struct clk_rcg2 blsp1_uart2_apps_clk_src = {
};
static const struct freq_tbl ftbl_gcc_camss_cci_clk[] = {
- F(19200000, P_XO, 1, 0, 0),
+ F(19200000, P_XO, 1, 0, 0),
{ }
};
@@ -2441,7 +2453,7 @@ static struct clk_branch gcc_camss_jpeg_axi_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_camss_jpeg_axi_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -2645,7 +2657,7 @@ static struct clk_branch gcc_camss_vfe_axi_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_camss_vfe_axi_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -2801,7 +2813,7 @@ static struct clk_branch gcc_mdss_axi_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_mdss_axi_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -3193,7 +3205,7 @@ static struct clk_branch gcc_mdp_tbu_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_mdp_tbu_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -3211,7 +3223,7 @@ static struct clk_branch gcc_venus_tbu_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_venus_tbu_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -3229,7 +3241,7 @@ static struct clk_branch gcc_vfe_tbu_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_vfe_tbu_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -3247,7 +3259,7 @@ static struct clk_branch gcc_jpeg_tbu_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_jpeg_tbu_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -3484,7 +3496,7 @@ static struct clk_branch gcc_venus0_axi_clk = {
.hw.init = &(struct clk_init_data){
.name = "gcc_venus0_axi_clk",
.parent_data = &(const struct clk_parent_data){
- .hw = &system_noc_bfdcd_clk_src.clkr.hw,
+ .hw = &system_mm_noc_bfdcd_clk_src.clkr.hw,
},
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
@@ -3623,6 +3635,7 @@ static struct clk_regmap *gcc_msm8939_clocks[] = {
[GPLL2_VOTE] = &gpll2_vote,
[PCNOC_BFDCD_CLK_SRC] = &pcnoc_bfdcd_clk_src.clkr,
[SYSTEM_NOC_BFDCD_CLK_SRC] = &system_noc_bfdcd_clk_src.clkr,
+ [SYSTEM_MM_NOC_BFDCD_CLK_SRC] = &system_mm_noc_bfdcd_clk_src.clkr,
[CAMSS_AHB_CLK_SRC] = &camss_ahb_clk_src.clkr,
[APSS_AHB_CLK_SRC] = &apss_ahb_clk_src.clkr,
[CSI0_CLK_SRC] = &csi0_clk_src.clkr,
diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
index 44520efc6c72..2db0938f8dd3 100644
--- a/drivers/clk/qcom/gdsc.c
+++ b/drivers/clk/qcom/gdsc.c
@@ -420,6 +420,14 @@ static int gdsc_init(struct gdsc *sc)
return ret;
}
+ /* ...and the power-domain */
+ ret = gdsc_pm_runtime_get(sc);
+ if (ret) {
+ if (sc->rsupply)
+ regulator_disable(sc->rsupply);
+ return ret;
+ }
+
/*
* Votable GDSCs can be ON due to Vote from other masters.
* If a Votable GDSC is ON, make sure we have a Vote.
diff --git a/drivers/clk/qcom/videocc-sm8250.c b/drivers/clk/qcom/videocc-sm8250.c
index 8617454e4a77..f28f2cb051d7 100644
--- a/drivers/clk/qcom/videocc-sm8250.c
+++ b/drivers/clk/qcom/videocc-sm8250.c
@@ -277,7 +277,6 @@ static struct gdsc mvs0c_gdsc = {
},
.flags = 0,
.pwrsts = PWRSTS_OFF_ON,
- .supply = "mmcx",
};
static struct gdsc mvs1c_gdsc = {
@@ -287,7 +286,6 @@ static struct gdsc mvs1c_gdsc = {
},
.flags = 0,
.pwrsts = PWRSTS_OFF_ON,
- .supply = "mmcx",
};
static struct gdsc mvs0_gdsc = {
@@ -297,7 +295,6 @@ static struct gdsc mvs0_gdsc = {
},
.flags = HW_CTRL,
.pwrsts = PWRSTS_OFF_ON,
- .supply = "mmcx",
};
static struct gdsc mvs1_gdsc = {
@@ -307,7 +304,6 @@ static struct gdsc mvs1_gdsc = {
},
.flags = HW_CTRL,
.pwrsts = PWRSTS_OFF_ON,
- .supply = "mmcx",
};
static struct clk_regmap *video_cc_sm8250_clocks[] = {
diff --git a/drivers/clk/renesas/r9a06g032-clocks.c b/drivers/clk/renesas/r9a06g032-clocks.c
index c99942f0e4d4..abc0891fd96d 100644
--- a/drivers/clk/renesas/r9a06g032-clocks.c
+++ b/drivers/clk/renesas/r9a06g032-clocks.c
@@ -286,8 +286,8 @@ static const struct r9a06g032_clkdesc r9a06g032_clocks[] = {
.name = "uart_group_012",
.type = K_BITSEL,
.source = 1 + R9A06G032_DIV_UART,
- /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG1_PR2 */
- .dual.sel = ((0xec / 4) << 5) | 24,
+ /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG0_0 */
+ .dual.sel = ((0x34 / 4) << 5) | 30,
.dual.group = 0,
},
{
@@ -295,8 +295,8 @@ static const struct r9a06g032_clkdesc r9a06g032_clocks[] = {
.name = "uart_group_34567",
.type = K_BITSEL,
.source = 1 + R9A06G032_DIV_P2_PG,
- /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG0_0 */
- .dual.sel = ((0x34 / 4) << 5) | 30,
+ /* R9A06G032_SYSCTRL_REG_PWRCTRL_PG1_PR2 */
+ .dual.sel = ((0xec / 4) << 5) | 24,
.dual.group = 1,
},
D_UGATE(CLK_UART0, "clk_uart0", UART_GROUP_012, 0, 0, 0x1b2, 0x1b3, 0x1b4, 0x1b5),
diff --git a/drivers/clk/renesas/rzg2l-cpg.c b/drivers/clk/renesas/rzg2l-cpg.c
index 486d0656c58a..1068058a3865 100644
--- a/drivers/clk/renesas/rzg2l-cpg.c
+++ b/drivers/clk/renesas/rzg2l-cpg.c
@@ -744,7 +744,7 @@ static int rzg2l_cpg_status(struct reset_controller_dev *rcdev,
unsigned int reg = info->resets[id].off;
u32 bitmask = BIT(info->resets[id].bit);
- return !(readl(priv->base + CLK_MRST_R(reg)) & bitmask);
+ return !!(readl(priv->base + CLK_MRST_R(reg)) & bitmask);
}
static const struct reset_control_ops rzg2l_cpg_reset_ops = {
diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
index 70e2e6e37389..3c46ad8c3a1c 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
@@ -151,6 +151,7 @@ static int sun8i_ss_setup_ivs(struct skcipher_request *areq)
while (i >= 0) {
dma_unmap_single(ss->dev, rctx->p_iv[i], ivsize, DMA_TO_DEVICE);
memzero_explicit(sf->iv[i], ivsize);
+ i--;
}
return err;
}
diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
index 657530578643..47b5828e35c3 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-core.c
@@ -476,14 +476,32 @@ static int allocate_flows(struct sun8i_ss_dev *ss)
ss->flows[i].biv = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
GFP_KERNEL | GFP_DMA);
- if (!ss->flows[i].biv)
+ if (!ss->flows[i].biv) {
+ err = -ENOMEM;
goto error_engine;
+ }
for (j = 0; j < MAX_SG; j++) {
ss->flows[i].iv[j] = devm_kmalloc(ss->dev, AES_BLOCK_SIZE,
GFP_KERNEL | GFP_DMA);
- if (!ss->flows[i].iv[j])
+ if (!ss->flows[i].iv[j]) {
+ err = -ENOMEM;
goto error_engine;
+ }
+ }
+
+ /* the padding could be up to two block. */
+ ss->flows[i].pad = devm_kmalloc(ss->dev, SHA256_BLOCK_SIZE * 2,
+ GFP_KERNEL | GFP_DMA);
+ if (!ss->flows[i].pad) {
+ err = -ENOMEM;
+ goto error_engine;
+ }
+ ss->flows[i].result = devm_kmalloc(ss->dev, SHA256_DIGEST_SIZE,
+ GFP_KERNEL | GFP_DMA);
+ if (!ss->flows[i].result) {
+ err = -ENOMEM;
+ goto error_engine;
}
ss->flows[i].engine = crypto_engine_alloc_init(ss->dev, true);
diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
index ca4f280af35d..f89a580618aa 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-hash.c
@@ -342,18 +342,11 @@ int sun8i_ss_hash_run(struct crypto_engine *engine, void *breq)
if (digestsize == SHA224_DIGEST_SIZE)
digestsize = SHA256_DIGEST_SIZE;
- /* the padding could be up to two block. */
- pad = kzalloc(algt->alg.hash.halg.base.cra_blocksize * 2, GFP_KERNEL | GFP_DMA);
- if (!pad)
- return -ENOMEM;
+ result = ss->flows[rctx->flow].result;
+ pad = ss->flows[rctx->flow].pad;
+ memset(pad, 0, algt->alg.hash.halg.base.cra_blocksize * 2);
bf = (__le32 *)pad;
- result = kzalloc(digestsize, GFP_KERNEL | GFP_DMA);
- if (!result) {
- kfree(pad);
- return -ENOMEM;
- }
-
for (i = 0; i < MAX_SG; i++) {
rctx->t_dst[i].addr = 0;
rctx->t_dst[i].len = 0;
@@ -449,8 +442,6 @@ int sun8i_ss_hash_run(struct crypto_engine *engine, void *breq)
memcpy(areq->result, result, algt->alg.hash.halg.digestsize);
theend:
- kfree(pad);
- kfree(result);
local_bh_disable();
crypto_finalize_hash_request(engine, breq, err);
local_bh_enable();
diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
index 57ada8653855..eb82ee5345ae 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h
@@ -123,6 +123,8 @@ struct sginfo {
* @stat_req: number of request done by this flow
* @iv: list of IV to use for each step
* @biv: buffer which contain the backuped IV
+ * @pad: padding buffer for hash operations
+ * @result: buffer for storing the result of hash operations
*/
struct sun8i_ss_flow {
struct crypto_engine *engine;
@@ -130,6 +132,8 @@ struct sun8i_ss_flow {
int status;
u8 *iv[MAX_SG];
u8 *biv;
+ void *pad;
+ void *result;
#ifdef CONFIG_CRYPTO_DEV_SUN8I_SS_DEBUG
unsigned long stat_req;
#endif
diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 3aefb177715e..95dbab0ffab2 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -503,7 +503,7 @@ static int __sev_platform_shutdown_locked(int *error)
struct sev_device *sev = psp_master->sev_data;
int ret;
- if (sev->state == SEV_STATE_UNINIT)
+ if (!sev || sev->state == SEV_STATE_UNINIT)
return 0;
ret = __sev_do_cmd_locked(SEV_CMD_SHUTDOWN, NULL, error);
@@ -577,6 +577,8 @@ static int sev_ioctl_do_platform_status(struct sev_issue_cmd *argp)
struct sev_user_data_status data;
int ret;
+ memset(&data, 0, sizeof(data));
+
ret = __sev_do_cmd_locked(SEV_CMD_PLATFORM_STATUS, &data, &argp->error);
if (ret)
return ret;
@@ -630,7 +632,7 @@ static int sev_ioctl_do_pek_csr(struct sev_issue_cmd *argp, bool writable)
if (input.length > SEV_FW_BLOB_MAX_SIZE)
return -EFAULT;
- blob = kmalloc(input.length, GFP_KERNEL);
+ blob = kzalloc(input.length, GFP_KERNEL);
if (!blob)
return -ENOMEM;
@@ -854,7 +856,7 @@ static int sev_ioctl_do_get_id2(struct sev_issue_cmd *argp)
input_address = (void __user *)input.address;
if (input.address && input.length) {
- id_blob = kmalloc(input.length, GFP_KERNEL);
+ id_blob = kzalloc(input.length, GFP_KERNEL);
if (!id_blob)
return -ENOMEM;
@@ -973,14 +975,14 @@ static int sev_ioctl_do_pdh_export(struct sev_issue_cmd *argp, bool writable)
if (input.cert_chain_len > SEV_FW_BLOB_MAX_SIZE)
return -EFAULT;
- pdh_blob = kmalloc(input.pdh_cert_len, GFP_KERNEL);
+ pdh_blob = kzalloc(input.pdh_cert_len, GFP_KERNEL);
if (!pdh_blob)
return -ENOMEM;
data.pdh_cert_address = __psp_pa(pdh_blob);
data.pdh_cert_len = input.pdh_cert_len;
- cert_blob = kmalloc(input.cert_chain_len, GFP_KERNEL);
+ cert_blob = kzalloc(input.cert_chain_len, GFP_KERNEL);
if (!cert_blob) {
ret = -ENOMEM;
goto e_free_pdh;
diff --git a/drivers/crypto/hisilicon/hpre/hpre_crypto.c b/drivers/crypto/hisilicon/hpre/hpre_crypto.c
index 97d54c1465c2..3ba6f15deafc 100644
--- a/drivers/crypto/hisilicon/hpre/hpre_crypto.c
+++ b/drivers/crypto/hisilicon/hpre/hpre_crypto.c
@@ -252,7 +252,7 @@ static int hpre_prepare_dma_buf(struct hpre_asym_request *hpre_req,
if (unlikely(shift < 0))
return -EINVAL;
- ptr = dma_alloc_coherent(dev, ctx->key_sz, tmp, GFP_KERNEL);
+ ptr = dma_alloc_coherent(dev, ctx->key_sz, tmp, GFP_ATOMIC);
if (unlikely(!ptr))
return -ENOMEM;
diff --git a/drivers/crypto/hisilicon/sec/sec_algs.c b/drivers/crypto/hisilicon/sec/sec_algs.c
index 0a3c8f019b02..490e1542305e 100644
--- a/drivers/crypto/hisilicon/sec/sec_algs.c
+++ b/drivers/crypto/hisilicon/sec/sec_algs.c
@@ -449,7 +449,7 @@ static void sec_skcipher_alg_callback(struct sec_bd_info *sec_resp,
*/
}
- mutex_lock(&ctx->queue->queuelock);
+ spin_lock_bh(&ctx->queue->queuelock);
/* Put the IV in place for chained cases */
switch (ctx->cipher_alg) {
case SEC_C_AES_CBC_128:
@@ -509,7 +509,7 @@ static void sec_skcipher_alg_callback(struct sec_bd_info *sec_resp,
list_del(&backlog_req->backlog_head);
}
}
- mutex_unlock(&ctx->queue->queuelock);
+ spin_unlock_bh(&ctx->queue->queuelock);
mutex_lock(&sec_req->lock);
list_del(&sec_req_el->head);
@@ -798,7 +798,7 @@ static int sec_alg_skcipher_crypto(struct skcipher_request *skreq,
*/
/* Grab a big lock for a long time to avoid concurrency issues */
- mutex_lock(&queue->queuelock);
+ spin_lock_bh(&queue->queuelock);
/*
* Can go on to queue if we have space in either:
@@ -814,15 +814,15 @@ static int sec_alg_skcipher_crypto(struct skcipher_request *skreq,
ret = -EBUSY;
if ((skreq->base.flags & CRYPTO_TFM_REQ_MAY_BACKLOG)) {
list_add_tail(&sec_req->backlog_head, &ctx->backlog);
- mutex_unlock(&queue->queuelock);
+ spin_unlock_bh(&queue->queuelock);
goto out;
}
- mutex_unlock(&queue->queuelock);
+ spin_unlock_bh(&queue->queuelock);
goto err_free_elements;
}
ret = sec_send_request(sec_req, queue);
- mutex_unlock(&queue->queuelock);
+ spin_unlock_bh(&queue->queuelock);
if (ret)
goto err_free_elements;
@@ -881,7 +881,7 @@ static int sec_alg_skcipher_init(struct crypto_skcipher *tfm)
if (IS_ERR(ctx->queue))
return PTR_ERR(ctx->queue);
- mutex_init(&ctx->queue->queuelock);
+ spin_lock_init(&ctx->queue->queuelock);
ctx->queue->havesoftqueue = false;
return 0;
diff --git a/drivers/crypto/hisilicon/sec/sec_drv.h b/drivers/crypto/hisilicon/sec/sec_drv.h
index 179a8250d691..e2a50bf2234b 100644
--- a/drivers/crypto/hisilicon/sec/sec_drv.h
+++ b/drivers/crypto/hisilicon/sec/sec_drv.h
@@ -347,7 +347,7 @@ struct sec_queue {
DECLARE_BITMAP(unprocessed, SEC_QUEUE_LEN);
DECLARE_KFIFO_PTR(softqueue, typeof(struct sec_request_el *));
bool havesoftqueue;
- struct mutex queuelock;
+ spinlock_t queuelock;
void *shadow[SEC_QUEUE_LEN];
};
diff --git a/drivers/crypto/hisilicon/sec2/sec.h b/drivers/crypto/hisilicon/sec2/sec.h
index c2e9b01187a7..a44c8dba3cda 100644
--- a/drivers/crypto/hisilicon/sec2/sec.h
+++ b/drivers/crypto/hisilicon/sec2/sec.h
@@ -119,7 +119,7 @@ struct sec_qp_ctx {
struct idr req_idr;
struct sec_alg_res res[QM_Q_DEPTH];
struct sec_ctx *ctx;
- struct mutex req_lock;
+ spinlock_t req_lock;
struct list_head backlog;
struct hisi_acc_sgl_pool *c_in_pool;
struct hisi_acc_sgl_pool *c_out_pool;
diff --git a/drivers/crypto/hisilicon/sec2/sec_crypto.c b/drivers/crypto/hisilicon/sec2/sec_crypto.c
index a91635c348b5..dbaa6c918cfd 100644
--- a/drivers/crypto/hisilicon/sec2/sec_crypto.c
+++ b/drivers/crypto/hisilicon/sec2/sec_crypto.c
@@ -127,11 +127,11 @@ static int sec_alloc_req_id(struct sec_req *req, struct sec_qp_ctx *qp_ctx)
{
int req_id;
- mutex_lock(&qp_ctx->req_lock);
+ spin_lock_bh(&qp_ctx->req_lock);
req_id = idr_alloc_cyclic(&qp_ctx->req_idr, NULL,
0, QM_Q_DEPTH, GFP_ATOMIC);
- mutex_unlock(&qp_ctx->req_lock);
+ spin_unlock_bh(&qp_ctx->req_lock);
if (unlikely(req_id < 0)) {
dev_err(req->ctx->dev, "alloc req id fail!\n");
return req_id;
@@ -156,9 +156,9 @@ static void sec_free_req_id(struct sec_req *req)
qp_ctx->req_list[req_id] = NULL;
req->qp_ctx = NULL;
- mutex_lock(&qp_ctx->req_lock);
+ spin_lock_bh(&qp_ctx->req_lock);
idr_remove(&qp_ctx->req_idr, req_id);
- mutex_unlock(&qp_ctx->req_lock);
+ spin_unlock_bh(&qp_ctx->req_lock);
}
static u8 pre_parse_finished_bd(struct bd_status *status, void *resp)
@@ -273,7 +273,7 @@ static int sec_bd_send(struct sec_ctx *ctx, struct sec_req *req)
!(req->flag & CRYPTO_TFM_REQ_MAY_BACKLOG))
return -EBUSY;
- mutex_lock(&qp_ctx->req_lock);
+ spin_lock_bh(&qp_ctx->req_lock);
ret = hisi_qp_send(qp_ctx->qp, &req->sec_sqe);
if (ctx->fake_req_limit <=
@@ -281,10 +281,10 @@ static int sec_bd_send(struct sec_ctx *ctx, struct sec_req *req)
list_add_tail(&req->backlog_head, &qp_ctx->backlog);
atomic64_inc(&ctx->sec->debug.dfx.send_cnt);
atomic64_inc(&ctx->sec->debug.dfx.send_busy_cnt);
- mutex_unlock(&qp_ctx->req_lock);
+ spin_unlock_bh(&qp_ctx->req_lock);
return -EBUSY;
}
- mutex_unlock(&qp_ctx->req_lock);
+ spin_unlock_bh(&qp_ctx->req_lock);
if (unlikely(ret == -EBUSY))
return -ENOBUFS;
@@ -487,7 +487,7 @@ static int sec_create_qp_ctx(struct hisi_qm *qm, struct sec_ctx *ctx,
qp->req_cb = sec_req_cb;
- mutex_init(&qp_ctx->req_lock);
+ spin_lock_init(&qp_ctx->req_lock);
idr_init(&qp_ctx->req_idr);
INIT_LIST_HEAD(&qp_ctx->backlog);
@@ -620,7 +620,7 @@ static int sec_auth_init(struct sec_ctx *ctx)
{
struct sec_auth_ctx *a_ctx = &ctx->a_ctx;
- a_ctx->a_key = dma_alloc_coherent(ctx->dev, SEC_MAX_KEY_SIZE,
+ a_ctx->a_key = dma_alloc_coherent(ctx->dev, SEC_MAX_AKEY_SIZE,
&a_ctx->a_key_dma, GFP_KERNEL);
if (!a_ctx->a_key)
return -ENOMEM;
@@ -632,8 +632,8 @@ static void sec_auth_uninit(struct sec_ctx *ctx)
{
struct sec_auth_ctx *a_ctx = &ctx->a_ctx;
- memzero_explicit(a_ctx->a_key, SEC_MAX_KEY_SIZE);
- dma_free_coherent(ctx->dev, SEC_MAX_KEY_SIZE,
+ memzero_explicit(a_ctx->a_key, SEC_MAX_AKEY_SIZE);
+ dma_free_coherent(ctx->dev, SEC_MAX_AKEY_SIZE,
a_ctx->a_key, a_ctx->a_key_dma);
}
@@ -1382,7 +1382,7 @@ static struct sec_req *sec_back_req_clear(struct sec_ctx *ctx,
{
struct sec_req *backlog_req = NULL;
- mutex_lock(&qp_ctx->req_lock);
+ spin_lock_bh(&qp_ctx->req_lock);
if (ctx->fake_req_limit >=
atomic_read(&qp_ctx->qp->qp_status.used) &&
!list_empty(&qp_ctx->backlog)) {
@@ -1390,7 +1390,7 @@ static struct sec_req *sec_back_req_clear(struct sec_ctx *ctx,
typeof(*backlog_req), backlog_head);
list_del(&backlog_req->backlog_head);
}
- mutex_unlock(&qp_ctx->req_lock);
+ spin_unlock_bh(&qp_ctx->req_lock);
return backlog_req;
}
diff --git a/drivers/crypto/hisilicon/sec2/sec_crypto.h b/drivers/crypto/hisilicon/sec2/sec_crypto.h
index 5e039b50e9d4..d033f63b583f 100644
--- a/drivers/crypto/hisilicon/sec2/sec_crypto.h
+++ b/drivers/crypto/hisilicon/sec2/sec_crypto.h
@@ -7,6 +7,7 @@
#define SEC_AIV_SIZE 12
#define SEC_IV_SIZE 24
#define SEC_MAX_KEY_SIZE 64
+#define SEC_MAX_AKEY_SIZE 128
#define SEC_COMM_SCENE 0
#define SEC_MIN_BLOCK_SZ 1
diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
index 9ff885d50edf..389a7b51f1f3 100644
--- a/drivers/crypto/inside-secure/safexcel.c
+++ b/drivers/crypto/inside-secure/safexcel.c
@@ -1831,6 +1831,8 @@ static const struct of_device_id safexcel_of_match_table[] = {
{},
};
+MODULE_DEVICE_TABLE(of, safexcel_of_match_table);
+
static struct platform_driver crypto_safexcel = {
.probe = safexcel_probe,
.remove = safexcel_remove,
diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
index 468d1097a1ec..f23569e4b0bd 100644
--- a/drivers/dma/dw-edma/dw-edma-core.c
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -423,7 +423,7 @@ dw_edma_device_transfer(struct dw_edma_transfer *xfer)
chunk->ll_region.sz += burst->sz;
desc->alloc_sz += burst->sz;
- if (chan->dir == EDMA_DIR_WRITE) {
+ if (dir == DMA_DEV_TO_MEM) {
burst->sar = src_addr;
if (xfer->type == EDMA_XFER_CYCLIC) {
burst->dar = xfer->xfer.cyclic.paddr;
diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c
index 2ddc31e64db0..da31e73d24d4 100644
--- a/drivers/dma/imx-dma.c
+++ b/drivers/dma/imx-dma.c
@@ -1047,7 +1047,7 @@ static int __init imxdma_probe(struct platform_device *pdev)
return -ENOMEM;
imxdma->dev = &pdev->dev;
- imxdma->devtype = (enum imx_dma_type)of_device_get_match_data(&pdev->dev);
+ imxdma->devtype = (uintptr_t)of_device_get_match_data(&pdev->dev);
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
imxdma->base = devm_ioremap_resource(&pdev->dev, res);
diff --git a/drivers/dma/sf-pdma/sf-pdma.c b/drivers/dma/sf-pdma/sf-pdma.c
index f12606aeff87..ab0ad7a2f201 100644
--- a/drivers/dma/sf-pdma/sf-pdma.c
+++ b/drivers/dma/sf-pdma/sf-pdma.c
@@ -52,16 +52,6 @@ static inline struct sf_pdma_desc *to_sf_pdma_desc(struct virt_dma_desc *vd)
static struct sf_pdma_desc *sf_pdma_alloc_desc(struct sf_pdma_chan *chan)
{
struct sf_pdma_desc *desc;
- unsigned long flags;
-
- spin_lock_irqsave(&chan->lock, flags);
-
- if (chan->desc && !chan->desc->in_use) {
- spin_unlock_irqrestore(&chan->lock, flags);
- return chan->desc;
- }
-
- spin_unlock_irqrestore(&chan->lock, flags);
desc = kzalloc(sizeof(*desc), GFP_NOWAIT);
if (!desc)
@@ -111,7 +101,6 @@ sf_pdma_prep_dma_memcpy(struct dma_chan *dchan, dma_addr_t dest, dma_addr_t src,
desc->async_tx = vchan_tx_prep(&chan->vchan, &desc->vdesc, flags);
spin_lock_irqsave(&chan->vchan.lock, iflags);
- chan->desc = desc;
sf_pdma_fill_desc(desc, dest, src, len);
spin_unlock_irqrestore(&chan->vchan.lock, iflags);
@@ -170,11 +159,17 @@ static size_t sf_pdma_desc_residue(struct sf_pdma_chan *chan,
unsigned long flags;
u64 residue = 0;
struct sf_pdma_desc *desc;
- struct dma_async_tx_descriptor *tx;
+ struct dma_async_tx_descriptor *tx = NULL;
spin_lock_irqsave(&chan->vchan.lock, flags);
- tx = &chan->desc->vdesc.tx;
+ list_for_each_entry(vd, &chan->vchan.desc_submitted, node)
+ if (vd->tx.cookie == cookie)
+ tx = &vd->tx;
+
+ if (!tx)
+ goto out;
+
if (cookie == tx->chan->completed_cookie)
goto out;
@@ -241,6 +236,19 @@ static void sf_pdma_enable_request(struct sf_pdma_chan *chan)
writel(v, regs->ctrl);
}
+static struct sf_pdma_desc *sf_pdma_get_first_pending_desc(struct sf_pdma_chan *chan)
+{
+ struct virt_dma_chan *vchan = &chan->vchan;
+ struct virt_dma_desc *vdesc;
+
+ if (list_empty(&vchan->desc_issued))
+ return NULL;
+
+ vdesc = list_first_entry(&vchan->desc_issued, struct virt_dma_desc, node);
+
+ return container_of(vdesc, struct sf_pdma_desc, vdesc);
+}
+
static void sf_pdma_xfer_desc(struct sf_pdma_chan *chan)
{
struct sf_pdma_desc *desc = chan->desc;
@@ -268,8 +276,11 @@ static void sf_pdma_issue_pending(struct dma_chan *dchan)
spin_lock_irqsave(&chan->vchan.lock, flags);
- if (vchan_issue_pending(&chan->vchan) && chan->desc)
+ if (!chan->desc && vchan_issue_pending(&chan->vchan)) {
+ /* vchan_issue_pending has made a check that desc in not NULL */
+ chan->desc = sf_pdma_get_first_pending_desc(chan);
sf_pdma_xfer_desc(chan);
+ }
spin_unlock_irqrestore(&chan->vchan.lock, flags);
}
@@ -298,6 +309,11 @@ static void sf_pdma_donebh_tasklet(struct tasklet_struct *t)
spin_lock_irqsave(&chan->vchan.lock, flags);
list_del(&chan->desc->vdesc.node);
vchan_cookie_complete(&chan->desc->vdesc);
+
+ chan->desc = sf_pdma_get_first_pending_desc(chan);
+ if (chan->desc)
+ sf_pdma_xfer_desc(chan);
+
spin_unlock_irqrestore(&chan->vchan.lock, flags);
}
diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
index ddf0b9ff9e15..435d0e2658a4 100644
--- a/drivers/firmware/arm_scpi.c
+++ b/drivers/firmware/arm_scpi.c
@@ -815,7 +815,7 @@ static int scpi_init_versions(struct scpi_drvinfo *info)
info->firmware_version = le32_to_cpu(caps.platform_version);
}
/* Ignore error if not implemented */
- if (scpi_info->is_legacy && ret == -EOPNOTSUPP)
+ if (info->is_legacy && ret == -EOPNOTSUPP)
return 0;
return ret;
@@ -913,13 +913,14 @@ static int scpi_probe(struct platform_device *pdev)
struct resource res;
struct device *dev = &pdev->dev;
struct device_node *np = dev->of_node;
+ struct scpi_drvinfo *scpi_drvinfo;
- scpi_info = devm_kzalloc(dev, sizeof(*scpi_info), GFP_KERNEL);
- if (!scpi_info)
+ scpi_drvinfo = devm_kzalloc(dev, sizeof(*scpi_drvinfo), GFP_KERNEL);
+ if (!scpi_drvinfo)
return -ENOMEM;
if (of_match_device(legacy_scpi_of_match, &pdev->dev))
- scpi_info->is_legacy = true;
+ scpi_drvinfo->is_legacy = true;
count = of_count_phandle_with_args(np, "mboxes", "#mbox-cells");
if (count < 0) {
@@ -927,19 +928,19 @@ static int scpi_probe(struct platform_device *pdev)
return -ENODEV;
}
- scpi_info->channels = devm_kcalloc(dev, count, sizeof(struct scpi_chan),
- GFP_KERNEL);
- if (!scpi_info->channels)
+ scpi_drvinfo->channels =
+ devm_kcalloc(dev, count, sizeof(struct scpi_chan), GFP_KERNEL);
+ if (!scpi_drvinfo->channels)
return -ENOMEM;
- ret = devm_add_action(dev, scpi_free_channels, scpi_info);
+ ret = devm_add_action(dev, scpi_free_channels, scpi_drvinfo);
if (ret)
return ret;
- for (; scpi_info->num_chans < count; scpi_info->num_chans++) {
+ for (; scpi_drvinfo->num_chans < count; scpi_drvinfo->num_chans++) {
resource_size_t size;
- int idx = scpi_info->num_chans;
- struct scpi_chan *pchan = scpi_info->channels + idx;
+ int idx = scpi_drvinfo->num_chans;
+ struct scpi_chan *pchan = scpi_drvinfo->channels + idx;
struct mbox_client *cl = &pchan->cl;
struct device_node *shmem = of_parse_phandle(np, "shmem", idx);
@@ -986,45 +987,53 @@ static int scpi_probe(struct platform_device *pdev)
return ret;
}
- scpi_info->commands = scpi_std_commands;
+ scpi_drvinfo->commands = scpi_std_commands;
- platform_set_drvdata(pdev, scpi_info);
+ platform_set_drvdata(pdev, scpi_drvinfo);
- if (scpi_info->is_legacy) {
+ if (scpi_drvinfo->is_legacy) {
/* Replace with legacy variants */
scpi_ops.clk_set_val = legacy_scpi_clk_set_val;
- scpi_info->commands = scpi_legacy_commands;
+ scpi_drvinfo->commands = scpi_legacy_commands;
/* Fill priority bitmap */
for (idx = 0; idx < ARRAY_SIZE(legacy_hpriority_cmds); idx++)
set_bit(legacy_hpriority_cmds[idx],
- scpi_info->cmd_priority);
+ scpi_drvinfo->cmd_priority);
}
- ret = scpi_init_versions(scpi_info);
+ scpi_info = scpi_drvinfo;
+
+ ret = scpi_init_versions(scpi_drvinfo);
if (ret) {
dev_err(dev, "incorrect or no SCP firmware found\n");
+ scpi_info = NULL;
return ret;
}
- if (scpi_info->is_legacy && !scpi_info->protocol_version &&
- !scpi_info->firmware_version)
+ if (scpi_drvinfo->is_legacy && !scpi_drvinfo->protocol_version &&
+ !scpi_drvinfo->firmware_version)
dev_info(dev, "SCP Protocol legacy pre-1.0 firmware\n");
else
dev_info(dev, "SCP Protocol %lu.%lu Firmware %lu.%lu.%lu version\n",
FIELD_GET(PROTO_REV_MAJOR_MASK,
- scpi_info->protocol_version),
+ scpi_drvinfo->protocol_version),
FIELD_GET(PROTO_REV_MINOR_MASK,
- scpi_info->protocol_version),
+ scpi_drvinfo->protocol_version),
FIELD_GET(FW_REV_MAJOR_MASK,
- scpi_info->firmware_version),
+ scpi_drvinfo->firmware_version),
FIELD_GET(FW_REV_MINOR_MASK,
- scpi_info->firmware_version),
+ scpi_drvinfo->firmware_version),
FIELD_GET(FW_REV_PATCH_MASK,
- scpi_info->firmware_version));
- scpi_info->scpi_ops = &scpi_ops;
+ scpi_drvinfo->firmware_version));
+
+ scpi_drvinfo->scpi_ops = &scpi_ops;
- return devm_of_platform_populate(dev);
+ ret = devm_of_platform_populate(dev);
+ if (ret)
+ scpi_info = NULL;
+
+ return ret;
}
static const struct of_device_id scpi_of_match[] = {
diff --git a/drivers/firmware/tegra/bpmp-debugfs.c b/drivers/firmware/tegra/bpmp-debugfs.c
index fd89899aeeed..0c440afd5224 100644
--- a/drivers/firmware/tegra/bpmp-debugfs.c
+++ b/drivers/firmware/tegra/bpmp-debugfs.c
@@ -474,7 +474,7 @@ static int bpmp_populate_debugfs_inband(struct tegra_bpmp *bpmp,
mode |= attrs & DEBUGFS_S_IWUSR ? 0200 : 0;
dentry = debugfs_create_file(name, mode, parent, bpmp,
&bpmp_debug_fops);
- if (!dentry) {
+ if (IS_ERR(dentry)) {
err = -ENOMEM;
goto out;
}
@@ -725,7 +725,7 @@ static int bpmp_populate_dir(struct tegra_bpmp *bpmp, struct seqbuf *seqbuf,
if (t & DEBUGFS_S_ISDIR) {
dentry = debugfs_create_dir(name, parent);
- if (!dentry)
+ if (IS_ERR(dentry))
return -ENOMEM;
err = bpmp_populate_dir(bpmp, seqbuf, dentry, depth+1);
if (err < 0)
@@ -738,7 +738,7 @@ static int bpmp_populate_dir(struct tegra_bpmp *bpmp, struct seqbuf *seqbuf,
dentry = debugfs_create_file(name, mode,
parent, bpmp,
&debugfs_fops);
- if (!dentry)
+ if (IS_ERR(dentry))
return -ENOMEM;
}
}
@@ -788,11 +788,11 @@ int tegra_bpmp_init_debugfs(struct tegra_bpmp *bpmp)
return 0;
root = debugfs_create_dir("bpmp", NULL);
- if (!root)
+ if (IS_ERR(root))
return -ENOMEM;
bpmp->debugfs_mirror = debugfs_create_dir("debug", root);
- if (!bpmp->debugfs_mirror) {
+ if (IS_ERR(bpmp->debugfs_mirror)) {
err = -ENOMEM;
goto out;
}
diff --git a/drivers/fpga/altera-pr-ip-core.c b/drivers/fpga/altera-pr-ip-core.c
index be0667968d33..df8671af4a92 100644
--- a/drivers/fpga/altera-pr-ip-core.c
+++ b/drivers/fpga/altera-pr-ip-core.c
@@ -108,7 +108,7 @@ static int alt_pr_fpga_write(struct fpga_manager *mgr, const char *buf,
u32 *buffer_32 = (u32 *)buf;
size_t i = 0;
- if (count <= 0)
+ if (!count)
return -EINVAL;
/* Write out the complete 32-bit chunks */
diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
index 6dec81b1f24b..5cabb9a9c272 100644
--- a/drivers/gpio/gpiolib-of.c
+++ b/drivers/gpio/gpiolib-of.c
@@ -861,7 +861,8 @@ int of_mm_gpiochip_add_data(struct device_node *np,
if (mm_gc->save_regs)
mm_gc->save_regs(mm_gc);
- mm_gc->gc.of_node = np;
+ of_node_put(mm_gc->gc.of_node);
+ mm_gc->gc.of_node = of_node_get(np);
ret = gpiochip_add_data(gc, data);
if (ret)
@@ -869,6 +870,7 @@ int of_mm_gpiochip_add_data(struct device_node *np,
return 0;
err2:
+ of_node_put(np);
iounmap(mm_gc->regs);
err1:
kfree(gc->label);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index f4509656ea8c..e9a8eb070766 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1401,16 +1401,10 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
struct amdgpu_vm *vm)
{
struct amdkfd_process_info *process_info = vm->process_info;
- struct amdgpu_bo *pd = vm->root.bo;
if (!process_info)
return;
- /* Release eviction fence from PD */
- amdgpu_bo_reserve(pd, false);
- amdgpu_bo_fence(pd, NULL, false);
- amdgpu_bo_unreserve(pd);
-
/* Update process info */
mutex_lock(&process_info->lock);
process_info->n_vms--;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 2019622191b5..ee0cbc6ccbfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1260,7 +1260,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
p->fence = dma_fence_get(&job->base.s_fence->finished);
- amdgpu_ctx_add_fence(p->ctx, entity, p->fence, &seq);
+ seq = amdgpu_ctx_add_fence(p->ctx, entity, p->fence);
amdgpu_cs_post_dependencies(p);
if ((job->preamble_status & AMDGPU_PREAMBLE_IB_PRESENT) &&
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index c317078d1afd..95ed528e6ec5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -135,9 +135,9 @@ static enum amdgpu_ring_priority_level amdgpu_ctx_sched_prio_to_ring_prio(int32_
static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
{
- struct amdgpu_device *adev = ctx->adev;
- int32_t ctx_prio;
+ struct amdgpu_device *adev = ctx->mgr->adev;
unsigned int hw_prio;
+ int32_t ctx_prio;
ctx_prio = (ctx->override_priority == AMDGPU_CTX_PRIORITY_UNSET) ?
ctx->init_priority : ctx->override_priority;
@@ -166,7 +166,7 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct amdgpu_ctx *ctx, u32 hw_ip)
static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
const u32 ring)
{
- struct amdgpu_device *adev = ctx->adev;
+ struct amdgpu_device *adev = ctx->mgr->adev;
struct amdgpu_ctx_entity *entity;
struct drm_gpu_scheduler **scheds = NULL, *sched = NULL;
unsigned num_scheds = 0;
@@ -220,35 +220,6 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
return r;
}
-static int amdgpu_ctx_init(struct amdgpu_device *adev,
- int32_t priority,
- struct drm_file *filp,
- struct amdgpu_ctx *ctx)
-{
- int r;
-
- r = amdgpu_ctx_priority_permit(filp, priority);
- if (r)
- return r;
-
- memset(ctx, 0, sizeof(*ctx));
-
- ctx->adev = adev;
-
- kref_init(&ctx->refcount);
- spin_lock_init(&ctx->ring_lock);
- mutex_init(&ctx->lock);
-
- ctx->reset_counter = atomic_read(&adev->gpu_reset_counter);
- ctx->reset_counter_query = ctx->reset_counter;
- ctx->vram_lost_counter = atomic_read(&adev->vram_lost_counter);
- ctx->init_priority = priority;
- ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET;
- ctx->stable_pstate = AMDGPU_CTX_STABLE_PSTATE_NONE;
-
- return 0;
-}
-
static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity)
{
@@ -266,7 +237,7 @@ static void amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity)
static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx,
u32 *stable_pstate)
{
- struct amdgpu_device *adev = ctx->adev;
+ struct amdgpu_device *adev = ctx->mgr->adev;
enum amd_dpm_forced_level current_level;
current_level = amdgpu_dpm_get_performance_level(adev);
@@ -291,10 +262,42 @@ static int amdgpu_ctx_get_stable_pstate(struct amdgpu_ctx *ctx,
return 0;
}
+static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, int32_t priority,
+ struct drm_file *filp, struct amdgpu_ctx *ctx)
+{
+ u32 current_stable_pstate;
+ int r;
+
+ r = amdgpu_ctx_priority_permit(filp, priority);
+ if (r)
+ return r;
+
+ memset(ctx, 0, sizeof(*ctx));
+
+ kref_init(&ctx->refcount);
+ ctx->mgr = mgr;
+ spin_lock_init(&ctx->ring_lock);
+ mutex_init(&ctx->lock);
+
+ ctx->reset_counter = atomic_read(&mgr->adev->gpu_reset_counter);
+ ctx->reset_counter_query = ctx->reset_counter;
+ ctx->vram_lost_counter = atomic_read(&mgr->adev->vram_lost_counter);
+ ctx->init_priority = priority;
+ ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET;
+
+ r = amdgpu_ctx_get_stable_pstate(ctx, ¤t_stable_pstate);
+ if (r)
+ return r;
+
+ ctx->stable_pstate = current_stable_pstate;
+
+ return 0;
+}
+
static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx,
u32 stable_pstate)
{
- struct amdgpu_device *adev = ctx->adev;
+ struct amdgpu_device *adev = ctx->mgr->adev;
enum amd_dpm_forced_level level;
u32 current_stable_pstate;
int r;
@@ -345,7 +348,8 @@ static int amdgpu_ctx_set_stable_pstate(struct amdgpu_ctx *ctx,
static void amdgpu_ctx_fini(struct kref *ref)
{
struct amdgpu_ctx *ctx = container_of(ref, struct amdgpu_ctx, refcount);
- struct amdgpu_device *adev = ctx->adev;
+ struct amdgpu_ctx_mgr *mgr = ctx->mgr;
+ struct amdgpu_device *adev = mgr->adev;
unsigned i, j, idx;
if (!adev)
@@ -359,7 +363,7 @@ static void amdgpu_ctx_fini(struct kref *ref)
}
if (drm_dev_enter(&adev->ddev, &idx)) {
- amdgpu_ctx_set_stable_pstate(ctx, AMDGPU_CTX_STABLE_PSTATE_NONE);
+ amdgpu_ctx_set_stable_pstate(ctx, ctx->stable_pstate);
drm_dev_exit(idx);
}
@@ -421,7 +425,7 @@ static int amdgpu_ctx_alloc(struct amdgpu_device *adev,
}
*id = (uint32_t)r;
- r = amdgpu_ctx_init(adev, priority, filp, ctx);
+ r = amdgpu_ctx_init(mgr, priority, filp, ctx);
if (r) {
idr_remove(&mgr->ctx_handles, *id);
*id = 0;
@@ -671,9 +675,9 @@ int amdgpu_ctx_put(struct amdgpu_ctx *ctx)
return 0;
}
-void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
- struct drm_sched_entity *entity,
- struct dma_fence *fence, uint64_t *handle)
+uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
+ struct drm_sched_entity *entity,
+ struct dma_fence *fence)
{
struct amdgpu_ctx_entity *centity = to_amdgpu_ctx_entity(entity);
uint64_t seq = centity->sequence;
@@ -682,8 +686,7 @@ void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
idx = seq & (amdgpu_sched_jobs - 1);
other = centity->fences[idx];
- if (other)
- BUG_ON(!dma_fence_is_signaled(other));
+ WARN_ON(other && !dma_fence_is_signaled(other));
dma_fence_get(fence);
@@ -693,8 +696,7 @@ void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
spin_unlock(&ctx->ring_lock);
dma_fence_put(other);
- if (handle)
- *handle = seq;
+ return seq;
}
struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
@@ -731,7 +733,7 @@ static void amdgpu_ctx_set_entity_priority(struct amdgpu_ctx *ctx,
int hw_ip,
int32_t priority)
{
- struct amdgpu_device *adev = ctx->adev;
+ struct amdgpu_device *adev = ctx->mgr->adev;
unsigned int hw_prio;
struct drm_gpu_scheduler **scheds = NULL;
unsigned num_scheds;
@@ -796,8 +798,10 @@ int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx,
return r;
}
-void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr)
+void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr,
+ struct amdgpu_device *adev)
{
+ mgr->adev = adev;
mutex_init(&mgr->lock);
idr_init(&mgr->ctx_handles);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
index 142f2f87d44c..681050bc828c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h
@@ -40,7 +40,7 @@ struct amdgpu_ctx_entity {
struct amdgpu_ctx {
struct kref refcount;
- struct amdgpu_device *adev;
+ struct amdgpu_ctx_mgr *mgr;
unsigned reset_counter;
unsigned reset_counter_query;
uint32_t vram_lost_counter;
@@ -70,9 +70,9 @@ int amdgpu_ctx_put(struct amdgpu_ctx *ctx);
int amdgpu_ctx_get_entity(struct amdgpu_ctx *ctx, u32 hw_ip, u32 instance,
u32 ring, struct drm_sched_entity **entity);
-void amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
- struct drm_sched_entity *entity,
- struct dma_fence *fence, uint64_t *seq);
+uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
+ struct drm_sched_entity *entity,
+ struct dma_fence *fence);
struct dma_fence *amdgpu_ctx_get_fence(struct amdgpu_ctx *ctx,
struct drm_sched_entity *entity,
uint64_t seq);
@@ -85,7 +85,8 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data,
int amdgpu_ctx_wait_prev_fence(struct amdgpu_ctx *ctx,
struct drm_sched_entity *entity);
-void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr);
+void amdgpu_ctx_mgr_init(struct amdgpu_ctx_mgr *mgr,
+ struct amdgpu_device *adev);
void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr);
long amdgpu_ctx_mgr_entity_flush(struct amdgpu_ctx_mgr *mgr, long timeout);
void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index e4fcbb385a62..d68db66d969b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1891,12 +1891,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
break;
case IP_VERSION(7, 4, 0):
case IP_VERSION(7, 4, 1):
- adev->nbio.funcs = &nbio_v7_4_funcs;
- adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
- break;
case IP_VERSION(7, 4, 4):
adev->nbio.funcs = &nbio_v7_4_funcs;
- adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg_ald;
+ adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
break;
case IP_VERSION(7, 2, 0):
case IP_VERSION(7, 2, 1):
@@ -1910,15 +1907,12 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
case IP_VERSION(2, 3, 0):
case IP_VERSION(2, 3, 1):
case IP_VERSION(2, 3, 2):
- adev->nbio.funcs = &nbio_v2_3_funcs;
- adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
- break;
case IP_VERSION(3, 3, 0):
case IP_VERSION(3, 3, 1):
case IP_VERSION(3, 3, 2):
case IP_VERSION(3, 3, 3):
adev->nbio.funcs = &nbio_v2_3_funcs;
- adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg_sc;
+ adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
break;
default:
break;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 49c55d82cba8..20a432a774c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -1142,7 +1142,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
mutex_init(&fpriv->bo_list_lock);
idr_init(&fpriv->bo_list_handles);
- amdgpu_ctx_mgr_init(&fpriv->ctx_mgr);
+ amdgpu_ctx_mgr_init(&fpriv->ctx_mgr, adev);
file_priv->driver_priv = fpriv;
goto out_suspend;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 940752488330..916e2396d263 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -882,6 +882,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
if (WARN_ON_ONCE(min_offset > max_offset))
return -EINVAL;
+ /* Check domain to be pinned to against preferred domains */
+ if (bo->preferred_domains & domain)
+ domain = bo->preferred_domains & domain;
+
/* A shared bo cannot be migrated to VRAM */
if (bo->tbo.base.import_attach) {
if (domain & AMDGPU_GEM_DOMAIN_GTT)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
index ee7cab37dfd5..dae78a1e88e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
@@ -328,27 +328,6 @@ const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg = {
.ref_and_mask_sdma1 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__SDMA1_MASK,
};
-const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc = {
- .ref_and_mask_cp0 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP0_MASK,
- .ref_and_mask_cp1 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP1_MASK,
- .ref_and_mask_cp2 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP2_MASK,
- .ref_and_mask_cp3 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP3_MASK,
- .ref_and_mask_cp4 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP4_MASK,
- .ref_and_mask_cp5 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP5_MASK,
- .ref_and_mask_cp6 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP6_MASK,
- .ref_and_mask_cp7 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP7_MASK,
- .ref_and_mask_cp8 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP8_MASK,
- .ref_and_mask_cp9 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP9_MASK,
- .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
- .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
- .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
- .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
- .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
- .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
- .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
- .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
-};
-
static void nbio_v2_3_init_registers(struct amdgpu_device *adev)
{
uint32_t def, data;
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
index 6074dd3a1ed8..a43b60acf7f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
@@ -27,7 +27,6 @@
#include "soc15_common.h"
extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg;
-extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc;
extern const struct amdgpu_nbio_funcs nbio_v2_3_funcs;
#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
index c2357e83a8c4..09c0def356a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
@@ -339,27 +339,6 @@ const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg = {
.ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__SDMA1_MASK,
};
-const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg_ald = {
- .ref_and_mask_cp0 = GPU_HDP_FLUSH_DONE__CP0_MASK,
- .ref_and_mask_cp1 = GPU_HDP_FLUSH_DONE__CP1_MASK,
- .ref_and_mask_cp2 = GPU_HDP_FLUSH_DONE__CP2_MASK,
- .ref_and_mask_cp3 = GPU_HDP_FLUSH_DONE__CP3_MASK,
- .ref_and_mask_cp4 = GPU_HDP_FLUSH_DONE__CP4_MASK,
- .ref_and_mask_cp5 = GPU_HDP_FLUSH_DONE__CP5_MASK,
- .ref_and_mask_cp6 = GPU_HDP_FLUSH_DONE__CP6_MASK,
- .ref_and_mask_cp7 = GPU_HDP_FLUSH_DONE__CP7_MASK,
- .ref_and_mask_cp8 = GPU_HDP_FLUSH_DONE__CP8_MASK,
- .ref_and_mask_cp9 = GPU_HDP_FLUSH_DONE__CP9_MASK,
- .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
- .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
- .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
- .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
- .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
- .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
- .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
- .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
-};
-
static void nbio_v7_4_init_registers(struct amdgpu_device *adev)
{
uint32_t baco_cntl;
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
index 7490022d79d4..f27c41728822 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
@@ -27,7 +27,6 @@
#include "soc15_common.h"
extern const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg;
-extern const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg_ald;
extern const struct amdgpu_nbio_funcs nbio_v7_4_funcs;
extern struct amdgpu_nbio_ras nbio_v7_4_ras;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index f5f39984702f..a3282ddbff86 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -571,7 +571,7 @@ static bool execute_synaptics_rc_command(struct drm_dp_aux *aux,
unsigned char rc_cmd = 0;
unsigned char rc_result = 0xFF;
unsigned char i = 0;
- uint8_t ret = 0;
+ int ret;
if (is_write_cmd) {
// write rc data
diff --git a/drivers/gpu/drm/bridge/Kconfig b/drivers/gpu/drm/bridge/Kconfig
index be2fc4791c1d..7d152cc6f837 100644
--- a/drivers/gpu/drm/bridge/Kconfig
+++ b/drivers/gpu/drm/bridge/Kconfig
@@ -82,6 +82,8 @@ config DRM_ITE_IT6505
select DRM_KMS_HELPER
select DRM_DP_HELPER
select EXTCON
+ select CRYPTO
+ select CRYPTO_HASH
help
ITE IT6505 DisplayPort bridge chip driver.
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
index 668dcefbae17..f4d6359f6219 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
@@ -1060,6 +1060,10 @@ static int adv7511_init_cec_regmap(struct adv7511 *adv)
ADV7511_CEC_I2C_ADDR_DEFAULT);
if (IS_ERR(adv->i2c_cec))
return PTR_ERR(adv->i2c_cec);
+
+ regmap_write(adv->regmap, ADV7511_REG_CEC_I2C_ADDR,
+ adv->i2c_cec->addr << 1);
+
i2c_set_clientdata(adv->i2c_cec, adv);
adv->regmap_cec = devm_regmap_init_i2c(adv->i2c_cec,
@@ -1264,9 +1268,6 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id)
if (ret)
goto err_i2c_unregister_packet;
- regmap_write(adv7511->regmap, ADV7511_REG_CEC_I2C_ADDR,
- adv7511->i2c_cec->addr << 1);
-
INIT_WORK(&adv7511->hpd_work, adv7511_hpd_work);
if (i2c->irq) {
@@ -1383,10 +1384,21 @@ static struct i2c_driver adv7511_driver = {
static int __init adv7511_init(void)
{
- if (IS_ENABLED(CONFIG_DRM_MIPI_DSI))
- mipi_dsi_driver_register(&adv7533_dsi_driver);
+ int ret;
- return i2c_add_driver(&adv7511_driver);
+ if (IS_ENABLED(CONFIG_DRM_MIPI_DSI)) {
+ ret = mipi_dsi_driver_register(&adv7533_dsi_driver);
+ if (ret)
+ return ret;
+ }
+
+ ret = i2c_add_driver(&adv7511_driver);
+ if (ret) {
+ if (IS_ENABLED(CONFIG_DRM_MIPI_DSI))
+ mipi_dsi_driver_unregister(&adv7533_dsi_driver);
+ }
+
+ return ret;
}
module_init(adv7511_init);
diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
index 060849f8ad8b..c527a3a02eac 100644
--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
@@ -2651,14 +2651,6 @@ static int anx7625_i2c_probe(struct i2c_client *client,
platform->aux.dev = dev;
platform->aux.transfer = anx7625_aux_transfer;
drm_dp_aux_init(&platform->aux);
- devm_of_dp_aux_populate_ep_devices(&platform->aux);
-
- ret = anx7625_parse_dt(dev, pdata);
- if (ret) {
- if (ret != -EPROBE_DEFER)
- DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
- goto free_wq;
- }
if (anx7625_register_i2c_dummy_clients(platform, client) != 0) {
ret = -ENOMEM;
@@ -2674,6 +2666,15 @@ static int anx7625_i2c_probe(struct i2c_client *client,
if (ret)
goto free_wq;
+ devm_of_dp_aux_populate_ep_devices(&platform->aux);
+
+ ret = anx7625_parse_dt(dev, pdata);
+ if (ret) {
+ if (ret != -EPROBE_DEFER)
+ DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
+ goto free_wq;
+ }
+
if (!platform->pdata.low_power_mode) {
anx7625_disable_pd_protocol(platform);
pm_runtime_get_sync(dev);
diff --git a/drivers/gpu/drm/bridge/lontium-lt9611.c b/drivers/gpu/drm/bridge/lontium-lt9611.c
index 63df2e8a8abc..21d1e1465981 100644
--- a/drivers/gpu/drm/bridge/lontium-lt9611.c
+++ b/drivers/gpu/drm/bridge/lontium-lt9611.c
@@ -586,7 +586,7 @@ lt9611_connector_detect(struct drm_connector *connector, bool force)
int connected = 0;
regmap_read(lt9611->regmap, 0x825e, ®_val);
- connected = (reg_val & BIT(0));
+ connected = (reg_val & (BIT(2) | BIT(0)));
lt9611->status = connected ? connector_status_connected :
connector_status_disconnected;
diff --git a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
index 3d62e6bf6892..310b3b194491 100644
--- a/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
+++ b/drivers/gpu/drm/bridge/lontium-lt9611uxc.c
@@ -982,7 +982,7 @@ static int lt9611uxc_remove(struct i2c_client *client)
struct lt9611uxc *lt9611uxc = i2c_get_clientdata(client);
disable_irq(client->irq);
- flush_scheduled_work();
+ cancel_work_sync(<9611uxc->work);
lt9611uxc_audio_exit(lt9611uxc);
drm_bridge_remove(<9611uxc->bridge);
diff --git a/drivers/gpu/drm/bridge/sil-sii8620.c b/drivers/gpu/drm/bridge/sil-sii8620.c
index ec7745c31da0..ab0bce4a988c 100644
--- a/drivers/gpu/drm/bridge/sil-sii8620.c
+++ b/drivers/gpu/drm/bridge/sil-sii8620.c
@@ -605,7 +605,7 @@ static void *sii8620_burst_get_tx_buf(struct sii8620 *ctx, int len)
u8 *buf = &ctx->burst.tx_buf[ctx->burst.tx_count];
int size = len + 2;
- if (ctx->burst.tx_count + size > ARRAY_SIZE(ctx->burst.tx_buf)) {
+ if (ctx->burst.tx_count + size >= ARRAY_SIZE(ctx->burst.tx_buf)) {
dev_err(ctx->dev, "TX-BLK buffer exhausted\n");
ctx->error = -EINVAL;
return NULL;
@@ -622,7 +622,7 @@ static u8 *sii8620_burst_get_rx_buf(struct sii8620 *ctx, int len)
u8 *buf = &ctx->burst.rx_buf[ctx->burst.rx_count];
int size = len + 1;
- if (ctx->burst.tx_count + size > ARRAY_SIZE(ctx->burst.tx_buf)) {
+ if (ctx->burst.rx_count + size >= ARRAY_SIZE(ctx->burst.rx_buf)) {
dev_err(ctx->dev, "RX-BLK buffer exhausted\n");
ctx->error = -EINVAL;
return NULL;
diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c
index c23e0abc65e8..da59b041fec5 100644
--- a/drivers/gpu/drm/bridge/tc358767.c
+++ b/drivers/gpu/drm/bridge/tc358767.c
@@ -1549,19 +1549,12 @@ static irqreturn_t tc_irq_handler(int irq, void *arg)
return IRQ_HANDLED;
}
-static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
+static int tc_probe_edp_bridge_endpoint(struct tc_data *tc)
{
- struct device *dev = &client->dev;
+ struct device *dev = tc->dev;
struct drm_panel *panel;
- struct tc_data *tc;
int ret;
- tc = devm_kzalloc(dev, sizeof(*tc), GFP_KERNEL);
- if (!tc)
- return -ENOMEM;
-
- tc->dev = dev;
-
/* port@2 is the output port */
ret = drm_of_find_panel_or_bridge(dev->of_node, 2, 0, &panel, NULL);
if (ret && ret != -ENODEV)
@@ -1580,6 +1573,50 @@ static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
tc->bridge.type = DRM_MODE_CONNECTOR_DisplayPort;
}
+ return 0;
+}
+
+static void tc_clk_disable(void *data)
+{
+ struct clk *refclk = data;
+
+ clk_disable_unprepare(refclk);
+}
+
+static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
+{
+ struct device *dev = &client->dev;
+ struct tc_data *tc;
+ int ret;
+
+ tc = devm_kzalloc(dev, sizeof(*tc), GFP_KERNEL);
+ if (!tc)
+ return -ENOMEM;
+
+ tc->dev = dev;
+
+ ret = tc_probe_edp_bridge_endpoint(tc);
+ if (ret)
+ return ret;
+
+ tc->refclk = devm_clk_get(dev, "ref");
+ if (IS_ERR(tc->refclk)) {
+ ret = PTR_ERR(tc->refclk);
+ dev_err(dev, "Failed to get refclk: %d\n", ret);
+ return ret;
+ }
+
+ ret = clk_prepare_enable(tc->refclk);
+ if (ret)
+ return ret;
+
+ ret = devm_add_action_or_reset(dev, tc_clk_disable, tc->refclk);
+ if (ret)
+ return ret;
+
+ /* tRSTW = 100 cycles , at 13 MHz that is ~7.69 us */
+ usleep_range(10, 15);
+
/* Shut down GPIO is optional */
tc->sd_gpio = devm_gpiod_get_optional(dev, "shutdown", GPIOD_OUT_HIGH);
if (IS_ERR(tc->sd_gpio))
@@ -1600,13 +1637,6 @@ static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
usleep_range(5000, 10000);
}
- tc->refclk = devm_clk_get(dev, "ref");
- if (IS_ERR(tc->refclk)) {
- ret = PTR_ERR(tc->refclk);
- dev_err(dev, "Failed to get refclk: %d\n", ret);
- return ret;
- }
-
tc->regmap = devm_regmap_init_i2c(client, &tc_regmap_config);
if (IS_ERR(tc->regmap)) {
ret = PTR_ERR(tc->regmap);
diff --git a/drivers/gpu/drm/dp/drm_dp_aux_bus.c b/drivers/gpu/drm/dp/drm_dp_aux_bus.c
index 415afce3cf96..5f06bc0a34ac 100644
--- a/drivers/gpu/drm/dp/drm_dp_aux_bus.c
+++ b/drivers/gpu/drm/dp/drm_dp_aux_bus.c
@@ -66,7 +66,6 @@ static int dp_aux_ep_probe(struct device *dev)
* @dev: The device to remove.
*
* Calls through to the endpoint driver remove.
- *
*/
static void dp_aux_ep_remove(struct device *dev)
{
@@ -120,8 +119,6 @@ ATTRIBUTE_GROUPS(dp_aux_ep_dev);
/**
* dp_aux_ep_dev_release() - Free memory for the dp_aux_ep device
* @dev: The device to free.
- *
- * Return: 0 if no error or negative error code.
*/
static void dp_aux_ep_dev_release(struct device *dev)
{
@@ -256,6 +253,7 @@ int of_dp_aux_populate_ep_devices(struct drm_dp_aux *aux)
return 0;
}
+EXPORT_SYMBOL_GPL(of_dp_aux_populate_ep_devices);
static void of_dp_aux_depopulate_ep_devices_void(void *data)
{
diff --git a/drivers/gpu/drm/dp/drm_dp_mst_topology.c b/drivers/gpu/drm/dp/drm_dp_mst_topology.c
index 7a7cc44686f9..96869875390f 100644
--- a/drivers/gpu/drm/dp/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/dp/drm_dp_mst_topology.c
@@ -3861,9 +3861,7 @@ int drm_dp_mst_topology_mgr_resume(struct drm_dp_mst_topology_mgr *mgr,
if (!mgr->mst_primary)
goto out_fail;
- ret = drm_dp_dpcd_read(mgr->aux, DP_DPCD_REV, mgr->dpcd,
- DP_RECEIVER_CAP_SIZE);
- if (ret != DP_RECEIVER_CAP_SIZE) {
+ if (drm_dp_read_dpcd_caps(mgr->aux, mgr->dpcd) < 0) {
drm_dbg_kms(mgr->dev, "dpcd read failed - undocked during suspend?\n");
goto out_fail;
}
@@ -4912,8 +4910,7 @@ void drm_dp_mst_dump_topology(struct seq_file *m,
u8 buf[DP_PAYLOAD_TABLE_SIZE];
int ret;
- ret = drm_dp_dpcd_read(mgr->aux, DP_DPCD_REV, buf, DP_RECEIVER_CAP_SIZE);
- if (ret) {
+ if (drm_dp_read_dpcd_caps(mgr->aux, buf) < 0) {
seq_printf(m, "dpcd read failed\n");
goto out;
}
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 56fb87885146..e2a810cfce45 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1225,7 +1225,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count,
ret = dma_resv_lock_slow_interruptible(obj->resv,
acquire_ctx);
if (ret) {
- ww_acquire_done(acquire_ctx);
+ ww_acquire_fini(acquire_ctx);
return ret;
}
}
@@ -1250,7 +1250,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count,
goto retry;
}
- ww_acquire_done(acquire_ctx);
+ ww_acquire_fini(acquire_ctx);
return ret;
}
}
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 8ad0e02991ca..904fc893c905 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -302,6 +302,7 @@ static int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
if (!ret) {
if (WARN_ON(map->is_iomem)) {
+ dma_buf_vunmap(obj->import_attach->dmabuf, map);
ret = -EIO;
goto err_put_pages;
}
diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
index 9314f2ead79f..09e4edb5a992 100644
--- a/drivers/gpu/drm/drm_mipi_dbi.c
+++ b/drivers/gpu/drm/drm_mipi_dbi.c
@@ -1199,6 +1199,13 @@ int mipi_dbi_spi_transfer(struct spi_device *spi, u32 speed_hz,
size_t chunk;
int ret;
+ /* In __spi_validate, there's a validation that no partial transfers
+ * are accepted (xfer->len % w_size must be zero).
+ * Here we align max_chunk to multiple of 2 (16bits),
+ * to prevent transfers from being rejected.
+ */
+ max_chunk = ALIGN_DOWN(max_chunk, 2);
+
spi_message_init_with_transfers(&m, &tr, 1);
while (len) {
diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
index c04264f70ad1..3c31405600f0 100644
--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
@@ -800,31 +800,40 @@ static int exynos7_decon_resume(struct device *dev)
if (ret < 0) {
DRM_DEV_ERROR(dev, "Failed to prepare_enable the pclk [%d]\n",
ret);
- return ret;
+ goto err_pclk_enable;
}
ret = clk_prepare_enable(ctx->aclk);
if (ret < 0) {
DRM_DEV_ERROR(dev, "Failed to prepare_enable the aclk [%d]\n",
ret);
- return ret;
+ goto err_aclk_enable;
}
ret = clk_prepare_enable(ctx->eclk);
if (ret < 0) {
DRM_DEV_ERROR(dev, "Failed to prepare_enable the eclk [%d]\n",
ret);
- return ret;
+ goto err_eclk_enable;
}
ret = clk_prepare_enable(ctx->vclk);
if (ret < 0) {
DRM_DEV_ERROR(dev, "Failed to prepare_enable the vclk [%d]\n",
ret);
- return ret;
+ goto err_vclk_enable;
}
return 0;
+
+err_vclk_enable:
+ clk_disable_unprepare(ctx->eclk);
+err_eclk_enable:
+ clk_disable_unprepare(ctx->aclk);
+err_aclk_enable:
+ clk_disable_unprepare(ctx->pclk);
+err_pclk_enable:
+ return ret;
}
#endif
diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
index e82b815f83a6..5a9358dcd9e5 100644
--- a/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
+++ b/drivers/gpu/drm/hyperv/hyperv_drm_modeset.c
@@ -7,9 +7,11 @@
#include <drm/drm_damage_helper.h>
#include <drm/drm_drv.h>
+#include <drm/drm_edid.h>
#include <drm/drm_fb_helper.h>
#include <drm/drm_format_helper.h>
#include <drm/drm_fourcc.h>
+#include <drm/drm_framebuffer.h>
#include <drm/drm_gem_atomic_helper.h>
#include <drm/drm_gem_framebuffer_helper.h>
#include <drm/drm_gem_shmem_helper.h>
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index ac52b49bf901..ca9068bac748 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -69,6 +69,7 @@ struct jz_soc_info {
bool map_noncoherent;
bool use_extended_hwdesc;
bool plane_f0_not_working;
+ u32 max_burst;
unsigned int max_width, max_height;
const u32 *formats_f0, *formats_f1;
unsigned int num_formats_f0, num_formats_f1;
@@ -308,8 +309,9 @@ static void ingenic_drm_crtc_update_timings(struct ingenic_drm *priv,
regmap_write(priv->map, JZ_REG_LCD_REV, mode->htotal << 16);
}
- regmap_set_bits(priv->map, JZ_REG_LCD_CTRL,
- JZ_LCD_CTRL_OFUP | JZ_LCD_CTRL_BURST_16);
+ regmap_update_bits(priv->map, JZ_REG_LCD_CTRL,
+ JZ_LCD_CTRL_OFUP | JZ_LCD_CTRL_BURST_MASK,
+ JZ_LCD_CTRL_OFUP | priv->soc_info->max_burst);
/*
* IPU restart - specify how much time the LCDC will wait before
@@ -1480,6 +1482,7 @@ static const struct jz_soc_info jz4740_soc_info = {
.map_noncoherent = false,
.max_width = 800,
.max_height = 600,
+ .max_burst = JZ_LCD_CTRL_BURST_16,
.formats_f1 = jz4740_formats,
.num_formats_f1 = ARRAY_SIZE(jz4740_formats),
/* JZ4740 has only one plane */
@@ -1491,6 +1494,7 @@ static const struct jz_soc_info jz4725b_soc_info = {
.map_noncoherent = false,
.max_width = 800,
.max_height = 600,
+ .max_burst = JZ_LCD_CTRL_BURST_16,
.formats_f1 = jz4725b_formats_f1,
.num_formats_f1 = ARRAY_SIZE(jz4725b_formats_f1),
.formats_f0 = jz4725b_formats_f0,
@@ -1503,6 +1507,7 @@ static const struct jz_soc_info jz4770_soc_info = {
.map_noncoherent = true,
.max_width = 1280,
.max_height = 720,
+ .max_burst = JZ_LCD_CTRL_BURST_64,
.formats_f1 = jz4770_formats_f1,
.num_formats_f1 = ARRAY_SIZE(jz4770_formats_f1),
.formats_f0 = jz4770_formats_f0,
@@ -1517,6 +1522,7 @@ static const struct jz_soc_info jz4780_soc_info = {
.plane_f0_not_working = true, /* REVISIT */
.max_width = 4096,
.max_height = 2048,
+ .max_burst = JZ_LCD_CTRL_BURST_64,
.formats_f1 = jz4770_formats_f1,
.num_formats_f1 = ARRAY_SIZE(jz4770_formats_f1),
.formats_f0 = jz4770_formats_f0,
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm.h b/drivers/gpu/drm/ingenic/ingenic-drm.h
index cb1d09b62588..e5bd007ea93d 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm.h
+++ b/drivers/gpu/drm/ingenic/ingenic-drm.h
@@ -106,6 +106,9 @@
#define JZ_LCD_CTRL_BURST_4 (0x0 << 28)
#define JZ_LCD_CTRL_BURST_8 (0x1 << 28)
#define JZ_LCD_CTRL_BURST_16 (0x2 << 28)
+#define JZ_LCD_CTRL_BURST_32 (0x3 << 28)
+#define JZ_LCD_CTRL_BURST_64 (0x4 << 28)
+#define JZ_LCD_CTRL_BURST_MASK (0x7 << 28)
#define JZ_LCD_CTRL_RGB555 BIT(27)
#define JZ_LCD_CTRL_OFUP BIT(26)
#define JZ_LCD_CTRL_FRC_GRAYSCALE_16 (0x0 << 24)
diff --git a/drivers/gpu/drm/mcde/mcde_dsi.c b/drivers/gpu/drm/mcde/mcde_dsi.c
index 5651734ce977..9f9ac8699310 100644
--- a/drivers/gpu/drm/mcde/mcde_dsi.c
+++ b/drivers/gpu/drm/mcde/mcde_dsi.c
@@ -1111,6 +1111,7 @@ static int mcde_dsi_bind(struct device *dev, struct device *master,
bridge = of_drm_find_bridge(child);
if (!bridge) {
dev_err(dev, "failed to find bridge\n");
+ of_node_put(child);
return -EINVAL;
}
}
diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
index e61cd67b978f..41c783349321 100644
--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
@@ -54,13 +54,7 @@ enum mtk_dpi_out_channel_swap {
};
enum mtk_dpi_out_color_format {
- MTK_DPI_COLOR_FORMAT_RGB,
- MTK_DPI_COLOR_FORMAT_RGB_FULL,
- MTK_DPI_COLOR_FORMAT_YCBCR_444,
- MTK_DPI_COLOR_FORMAT_YCBCR_422,
- MTK_DPI_COLOR_FORMAT_XV_YCC,
- MTK_DPI_COLOR_FORMAT_YCBCR_444_FULL,
- MTK_DPI_COLOR_FORMAT_YCBCR_422_FULL
+ MTK_DPI_COLOR_FORMAT_RGB
};
struct mtk_dpi {
@@ -364,24 +358,11 @@ static void mtk_dpi_config_disable_edge(struct mtk_dpi *dpi)
static void mtk_dpi_config_color_format(struct mtk_dpi *dpi,
enum mtk_dpi_out_color_format format)
{
- if ((format == MTK_DPI_COLOR_FORMAT_YCBCR_444) ||
- (format == MTK_DPI_COLOR_FORMAT_YCBCR_444_FULL)) {
- mtk_dpi_config_yuv422_enable(dpi, false);
- mtk_dpi_config_csc_enable(dpi, true);
- mtk_dpi_config_swap_input(dpi, false);
- mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_BGR);
- } else if ((format == MTK_DPI_COLOR_FORMAT_YCBCR_422) ||
- (format == MTK_DPI_COLOR_FORMAT_YCBCR_422_FULL)) {
- mtk_dpi_config_yuv422_enable(dpi, true);
- mtk_dpi_config_csc_enable(dpi, true);
- mtk_dpi_config_swap_input(dpi, true);
- mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_RGB);
- } else {
- mtk_dpi_config_yuv422_enable(dpi, false);
- mtk_dpi_config_csc_enable(dpi, false);
- mtk_dpi_config_swap_input(dpi, false);
- mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_RGB);
- }
+ /* only support RGB888 */
+ mtk_dpi_config_yuv422_enable(dpi, false);
+ mtk_dpi_config_csc_enable(dpi, false);
+ mtk_dpi_config_swap_input(dpi, false);
+ mtk_dpi_config_channel_swap(dpi, MTK_DPI_OUT_CHANNEL_SWAP_RGB);
}
static void mtk_dpi_dual_edge(struct mtk_dpi *dpi)
@@ -436,7 +417,6 @@ static int mtk_dpi_power_on(struct mtk_dpi *dpi)
if (dpi->pinctrl && dpi->pins_dpi)
pinctrl_select_state(dpi->pinctrl, dpi->pins_dpi);
- mtk_dpi_enable(dpi);
return 0;
err_pixel:
@@ -658,6 +638,7 @@ static void mtk_dpi_bridge_enable(struct drm_bridge *bridge)
mtk_dpi_power_on(dpi);
mtk_dpi_set_display_mode(dpi, &dpi->mode);
+ mtk_dpi_enable(dpi);
}
static enum drm_mode_status
diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
index ccb0511b9cd5..e0a2d5ea40af 100644
--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -203,6 +203,7 @@ struct mtk_dsi {
struct mtk_phy_timing phy_timing;
int refcount;
bool enabled;
+ bool lanes_ready;
u32 irq_data;
wait_queue_head_t irq_wait_queue;
const struct mtk_dsi_driver_data *driver_data;
@@ -649,18 +650,11 @@ static int mtk_dsi_poweron(struct mtk_dsi *dsi)
mtk_dsi_reset_engine(dsi);
mtk_dsi_phy_timconfig(dsi);
- mtk_dsi_rxtx_control(dsi);
- usleep_range(30, 100);
- mtk_dsi_reset_dphy(dsi);
mtk_dsi_ps_control_vact(dsi);
mtk_dsi_set_vm_cmd(dsi);
mtk_dsi_config_vdo_timing(dsi);
mtk_dsi_set_interrupt_enable(dsi);
- mtk_dsi_clk_ulp_mode_leave(dsi);
- mtk_dsi_lane0_ulp_mode_leave(dsi);
- mtk_dsi_clk_hs_mode(dsi, 0);
-
return 0;
err_disable_engine_clk:
clk_disable_unprepare(dsi->engine_clk);
@@ -679,19 +673,11 @@ static void mtk_dsi_poweroff(struct mtk_dsi *dsi)
if (--dsi->refcount != 0)
return;
- /*
- * mtk_dsi_stop() and mtk_dsi_start() is asymmetric, since
- * mtk_dsi_stop() should be called after mtk_drm_crtc_atomic_disable(),
- * which needs irq for vblank, and mtk_dsi_stop() will disable irq.
- * mtk_dsi_start() needs to be called in mtk_output_dsi_enable(),
- * after dsi is fully set.
- */
- mtk_dsi_stop(dsi);
-
- mtk_dsi_switch_to_cmd_mode(dsi, VM_DONE_INT_FLAG, 500);
mtk_dsi_reset_engine(dsi);
mtk_dsi_lane0_ulp_mode_enter(dsi);
mtk_dsi_clk_ulp_mode_enter(dsi);
+ /* set the lane number as 0 to pull down mipi */
+ writel(0, dsi->regs + DSI_TXRX_CTRL);
mtk_dsi_disable(dsi);
@@ -699,21 +685,31 @@ static void mtk_dsi_poweroff(struct mtk_dsi *dsi)
clk_disable_unprepare(dsi->digital_clk);
phy_power_off(dsi->phy);
+
+ dsi->lanes_ready = false;
}
-static void mtk_output_dsi_enable(struct mtk_dsi *dsi)
+static void mtk_dsi_lane_ready(struct mtk_dsi *dsi)
{
- int ret;
+ if (!dsi->lanes_ready) {
+ dsi->lanes_ready = true;
+ mtk_dsi_rxtx_control(dsi);
+ usleep_range(30, 100);
+ mtk_dsi_reset_dphy(dsi);
+ mtk_dsi_clk_ulp_mode_leave(dsi);
+ mtk_dsi_lane0_ulp_mode_leave(dsi);
+ mtk_dsi_clk_hs_mode(dsi, 0);
+ msleep(20);
+ /* The reaction time after pulling up the mipi signal for dsi_rx */
+ }
+}
+static void mtk_output_dsi_enable(struct mtk_dsi *dsi)
+{
if (dsi->enabled)
return;
- ret = mtk_dsi_poweron(dsi);
- if (ret < 0) {
- DRM_ERROR("failed to power on dsi\n");
- return;
- }
-
+ mtk_dsi_lane_ready(dsi);
mtk_dsi_set_mode(dsi);
mtk_dsi_clk_hs_mode(dsi, 1);
@@ -727,7 +723,16 @@ static void mtk_output_dsi_disable(struct mtk_dsi *dsi)
if (!dsi->enabled)
return;
- mtk_dsi_poweroff(dsi);
+ /*
+ * mtk_dsi_stop() and mtk_dsi_start() is asymmetric, since
+ * mtk_dsi_stop() should be called after mtk_drm_crtc_atomic_disable(),
+ * which needs irq for vblank, and mtk_dsi_stop() will disable irq.
+ * mtk_dsi_start() needs to be called in mtk_output_dsi_enable(),
+ * after dsi is fully set.
+ */
+ mtk_dsi_stop(dsi);
+
+ mtk_dsi_switch_to_cmd_mode(dsi, VM_DONE_INT_FLAG, 500);
dsi->enabled = false;
}
@@ -751,24 +756,50 @@ static void mtk_dsi_bridge_mode_set(struct drm_bridge *bridge,
drm_display_mode_to_videomode(adjusted, &dsi->vm);
}
-static void mtk_dsi_bridge_disable(struct drm_bridge *bridge)
+static void mtk_dsi_bridge_atomic_disable(struct drm_bridge *bridge,
+ struct drm_bridge_state *old_bridge_state)
{
struct mtk_dsi *dsi = bridge_to_dsi(bridge);
mtk_output_dsi_disable(dsi);
}
-static void mtk_dsi_bridge_enable(struct drm_bridge *bridge)
+static void mtk_dsi_bridge_atomic_enable(struct drm_bridge *bridge,
+ struct drm_bridge_state *old_bridge_state)
{
struct mtk_dsi *dsi = bridge_to_dsi(bridge);
+ if (dsi->refcount == 0)
+ return;
+
mtk_output_dsi_enable(dsi);
}
+static void mtk_dsi_bridge_atomic_pre_enable(struct drm_bridge *bridge,
+ struct drm_bridge_state *old_bridge_state)
+{
+ struct mtk_dsi *dsi = bridge_to_dsi(bridge);
+ int ret;
+
+ ret = mtk_dsi_poweron(dsi);
+ if (ret < 0)
+ DRM_ERROR("failed to power on dsi\n");
+}
+
+static void mtk_dsi_bridge_atomic_post_disable(struct drm_bridge *bridge,
+ struct drm_bridge_state *old_bridge_state)
+{
+ struct mtk_dsi *dsi = bridge_to_dsi(bridge);
+
+ mtk_dsi_poweroff(dsi);
+}
+
static const struct drm_bridge_funcs mtk_dsi_bridge_funcs = {
.attach = mtk_dsi_bridge_attach,
- .disable = mtk_dsi_bridge_disable,
- .enable = mtk_dsi_bridge_enable,
+ .atomic_disable = mtk_dsi_bridge_atomic_disable,
+ .atomic_enable = mtk_dsi_bridge_atomic_enable,
+ .atomic_pre_enable = mtk_dsi_bridge_atomic_pre_enable,
+ .atomic_post_disable = mtk_dsi_bridge_atomic_post_disable,
.mode_set = mtk_dsi_bridge_mode_set,
};
@@ -988,6 +1019,8 @@ static ssize_t mtk_dsi_host_transfer(struct mipi_dsi_host *host,
if (MTK_DSI_HOST_IS_READ(msg->type))
irq_flag |= LPRX_RD_RDY_INT_FLAG;
+ mtk_dsi_lane_ready(dsi);
+
ret = mtk_dsi_host_send_cmd(dsi, msg, irq_flag);
if (ret)
goto restore_dsi_mode;
diff --git a/drivers/gpu/drm/meson/meson_encoder_cvbs.c b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
index fd8db97ba8ba..8110a6e39320 100644
--- a/drivers/gpu/drm/meson/meson_encoder_cvbs.c
+++ b/drivers/gpu/drm/meson/meson_encoder_cvbs.c
@@ -238,6 +238,7 @@ int meson_encoder_cvbs_init(struct meson_drm *priv)
}
meson_encoder_cvbs->next_bridge = of_drm_find_bridge(remote);
+ of_node_put(remote);
if (!meson_encoder_cvbs->next_bridge) {
dev_err(priv->dev, "Failed to find CVBS Connector bridge\n");
return -EPROBE_DEFER;
diff --git a/drivers/gpu/drm/meson/meson_encoder_hdmi.c b/drivers/gpu/drm/meson/meson_encoder_hdmi.c
index 5e306de6f485..a7692584487c 100644
--- a/drivers/gpu/drm/meson/meson_encoder_hdmi.c
+++ b/drivers/gpu/drm/meson/meson_encoder_hdmi.c
@@ -365,7 +365,8 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
meson_encoder_hdmi->next_bridge = of_drm_find_bridge(remote);
if (!meson_encoder_hdmi->next_bridge) {
dev_err(priv->dev, "Failed to find HDMI transceiver bridge\n");
- return -EPROBE_DEFER;
+ ret = -EPROBE_DEFER;
+ goto err_put_node;
}
/* HDMI Encoder Bridge */
@@ -383,7 +384,7 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
DRM_MODE_ENCODER_TMDS);
if (ret) {
dev_err(priv->dev, "Failed to init HDMI encoder: %d\n", ret);
- return ret;
+ goto err_put_node;
}
meson_encoder_hdmi->encoder.possible_crtcs = BIT(0);
@@ -393,7 +394,7 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
DRM_BRIDGE_ATTACH_NO_CONNECTOR);
if (ret) {
dev_err(priv->dev, "Failed to attach bridge: %d\n", ret);
- return ret;
+ goto err_put_node;
}
/* Initialize & attach Bridge Connector */
@@ -401,7 +402,8 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
&meson_encoder_hdmi->encoder);
if (IS_ERR(meson_encoder_hdmi->connector)) {
dev_err(priv->dev, "Unable to create HDMI bridge connector\n");
- return PTR_ERR(meson_encoder_hdmi->connector);
+ ret = PTR_ERR(meson_encoder_hdmi->connector);
+ goto err_put_node;
}
drm_connector_attach_encoder(meson_encoder_hdmi->connector,
&meson_encoder_hdmi->encoder);
@@ -428,6 +430,7 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
meson_encoder_hdmi->connector->ycbcr_420_allowed = true;
pdev = of_find_device_by_node(remote);
+ of_node_put(remote);
if (pdev) {
struct cec_connector_info conn_info;
struct cec_notifier *notifier;
@@ -435,8 +438,10 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
cec_fill_conn_info_from_drm(&conn_info, meson_encoder_hdmi->connector);
notifier = cec_notifier_conn_register(&pdev->dev, NULL, &conn_info);
- if (!notifier)
+ if (!notifier) {
+ put_device(&pdev->dev);
return -ENOMEM;
+ }
meson_encoder_hdmi->cec_notifier = notifier;
}
@@ -444,4 +449,8 @@ int meson_encoder_hdmi_init(struct meson_drm *priv)
dev_dbg(priv->dev, "HDMI encoder initialized\n");
return 0;
+
+err_put_node:
+ of_node_put(remote);
+ return ret;
}
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 217615e0e850..213f75fc98e8 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1666,18 +1666,10 @@ static u64 a5xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
{
u64 busy_cycles;
- /* Only read the gpu busy if the hardware is already active */
- if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0) {
- *out_sample_rate = 1;
- return 0;
- }
-
busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
*out_sample_rate = clk_get_rate(gpu->core_clk);
- pm_runtime_put(&gpu->pdev->dev);
-
return busy_cycles;
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 3e325e2a2b1b..1863908694dc 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -102,7 +102,8 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
}
-void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
+void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
+ bool suspended)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
@@ -127,15 +128,16 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
/*
* This can get called from devfreq while the hardware is idle. Don't
- * bring up the power if it isn't already active
+ * bring up the power if it isn't already active. All we're doing here
+ * is updating the frequency so that when we come back online we're at
+ * the right rate.
*/
- if (pm_runtime_get_if_in_use(gmu->dev) == 0)
+ if (suspended)
return;
if (!gmu->legacy) {
a6xx_hfi_set_freq(gmu, perf_index);
dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
- pm_runtime_put(gmu->dev);
return;
}
@@ -159,7 +161,6 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret);
dev_pm_opp_set_opp(&gpu->pdev->dev, opp);
- pm_runtime_put(gmu->dev);
}
unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu)
@@ -895,7 +896,7 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu)
return;
gmu->freq = 0; /* so a6xx_gmu_set_freq() doesn't exit early */
- a6xx_gmu_set_freq(gpu, gpu_opp);
+ a6xx_gmu_set_freq(gpu, gpu_opp, false);
dev_pm_opp_put(gpu_opp);
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 40fb92becc78..a8a0d798d17f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1658,27 +1658,21 @@ static u64 a6xx_gpu_busy(struct msm_gpu *gpu, unsigned long *out_sample_rate)
/* 19.2MHz */
*out_sample_rate = 19200000;
- /* Only read the gpu busy if the hardware is already active */
- if (pm_runtime_get_if_in_use(a6xx_gpu->gmu.dev) == 0)
- return 0;
-
busy_cycles = gmu_read64(&a6xx_gpu->gmu,
REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L,
REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H);
-
- pm_runtime_put(a6xx_gpu->gmu.dev);
-
return busy_cycles;
}
-static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp)
+static void a6xx_gpu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
+ bool suspended)
{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
mutex_lock(&a6xx_gpu->gmu.lock);
- a6xx_gmu_set_freq(gpu, opp);
+ a6xx_gmu_set_freq(gpu, opp, suspended);
mutex_unlock(&a6xx_gpu->gmu.lock);
}
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 86e0a7c3fe6d..ab853f61db63 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -77,7 +77,8 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state);
int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node);
void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
-void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp);
+void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp,
+ bool suspended);
unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu);
void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state *state,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 16ba9f9b9a78..32715399a7c1 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -361,6 +361,9 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc *crtc,
if (!state)
continue;
+ if (!state->visible)
+ continue;
+
pstate = to_dpu_plane_state(state);
fb = state->fb;
@@ -1125,6 +1128,9 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
if (cnt >= DPU_STAGE_MAX * 4)
continue;
+ if (!pstate->visible)
+ continue;
+
pstates[cnt].dpu_pstate = dpu_pstate;
pstates[cnt].drm_pstate = pstate;
pstates[cnt].stage = pstate->normalized_zpos;
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
index a4f5cb90f3e8..e4b8a789835a 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_pipe.c
@@ -123,12 +123,13 @@ int mdp5_pipe_release(struct drm_atomic_state *s, struct mdp5_hw_pipe *hwpipe)
{
struct msm_drm_private *priv = s->dev->dev_private;
struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(priv->kms));
- struct mdp5_global_state *state = mdp5_get_global_state(s);
+ struct mdp5_global_state *state;
struct mdp5_hw_pipe_state *new_state;
if (!hwpipe)
return 0;
+ state = mdp5_get_global_state(s);
if (IS_ERR(state))
return PTR_ERR(state);
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index f6229262dcb0..4ce0b4c41e49 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -180,6 +180,9 @@ static struct hdmi *msm_hdmi_init(struct platform_device *pdev)
goto fail;
}
+ for (i = 0; i < config->pwr_reg_cnt; i++)
+ hdmi->pwr_regs[i].supply = config->pwr_reg_names[i];
+
ret = devm_regulator_bulk_get(&pdev->dev, config->pwr_reg_cnt, hdmi->pwr_regs);
if (ret) {
DRM_DEV_ERROR(&pdev->dev, "failed to get pwr regulator: %d\n", ret);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 143c56f5185b..a4d3abc4f3e1 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -63,11 +63,14 @@ struct msm_gpu_funcs {
/* for generation specific debugfs: */
void (*debugfs_init)(struct msm_gpu *gpu, struct drm_minor *minor);
#endif
+ /* note: gpu_busy() can assume that we have been pm_resumed */
u64 (*gpu_busy)(struct msm_gpu *gpu, unsigned long *out_sample_rate);
struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu);
int (*gpu_state_put)(struct msm_gpu_state *state);
unsigned long (*gpu_get_freq)(struct msm_gpu *gpu);
- void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp);
+ /* note: gpu_set_freq() can assume that we have been pm_resumed */
+ void (*gpu_set_freq)(struct msm_gpu *gpu, struct dev_pm_opp *opp,
+ bool suspended);
struct msm_gem_address_space *(*create_address_space)
(struct msm_gpu *gpu, struct platform_device *pdev);
struct msm_gem_address_space *(*create_private_address_space)
@@ -91,6 +94,9 @@ struct msm_gpu_devfreq {
/** devfreq: devfreq instance */
struct devfreq *devfreq;
+ /** lock: lock for "suspended", "busy_cycles", and "time" */
+ struct mutex lock;
+
/**
* idle_constraint:
*
@@ -134,6 +140,9 @@ struct msm_gpu_devfreq {
* elapsed
*/
struct msm_hrtimer_work boost_work;
+
+ /** suspended: tracks if we're suspended */
+ bool suspended;
};
struct msm_gpu {
diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
index c7dbaa4b1926..01248d340124 100644
--- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
+++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
@@ -20,6 +20,7 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
{
struct msm_gpu *gpu = dev_to_gpu(dev);
+ struct msm_gpu_devfreq *df = &gpu->devfreq;
struct dev_pm_opp *opp;
/*
@@ -32,10 +33,13 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
trace_msm_gpu_freq_change(dev_pm_opp_get_freq(opp));
- if (gpu->funcs->gpu_set_freq)
- gpu->funcs->gpu_set_freq(gpu, opp);
- else
+ if (gpu->funcs->gpu_set_freq) {
+ mutex_lock(&df->lock);
+ gpu->funcs->gpu_set_freq(gpu, opp, df->suspended);
+ mutex_unlock(&df->lock);
+ } else {
clk_set_rate(gpu->core_clk, *freq);
+ }
dev_pm_opp_put(opp);
@@ -58,15 +62,24 @@ static void get_raw_dev_status(struct msm_gpu *gpu,
unsigned long sample_rate;
ktime_t time;
+ mutex_lock(&df->lock);
+
status->current_frequency = get_freq(gpu);
- busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
time = ktime_get();
-
- busy_time = busy_cycles - df->busy_cycles;
status->total_time = ktime_us_delta(time, df->time);
+ df->time = time;
+ if (df->suspended) {
+ mutex_unlock(&df->lock);
+ status->busy_time = 0;
+ return;
+ }
+
+ busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
+ busy_time = busy_cycles - df->busy_cycles;
df->busy_cycles = busy_cycles;
- df->time = time;
+
+ mutex_unlock(&df->lock);
busy_time *= USEC_PER_SEC;
do_div(busy_time, sample_rate);
@@ -175,6 +188,8 @@ void msm_devfreq_init(struct msm_gpu *gpu)
if (!gpu->funcs->gpu_busy)
return;
+ mutex_init(&df->lock);
+
dev_pm_qos_add_request(&gpu->pdev->dev, &df->idle_freq,
DEV_PM_QOS_MAX_FREQUENCY,
PM_QOS_MAX_FREQUENCY_DEFAULT_VALUE);
@@ -244,12 +259,16 @@ void msm_devfreq_cleanup(struct msm_gpu *gpu)
void msm_devfreq_resume(struct msm_gpu *gpu)
{
struct msm_gpu_devfreq *df = &gpu->devfreq;
+ unsigned long sample_rate;
if (!has_devfreq(gpu))
return;
- df->busy_cycles = 0;
+ mutex_lock(&df->lock);
+ df->busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
df->time = ktime_get();
+ df->suspended = false;
+ mutex_unlock(&df->lock);
devfreq_resume_device(df->devfreq);
}
@@ -261,6 +280,10 @@ void msm_devfreq_suspend(struct msm_gpu *gpu)
if (!has_devfreq(gpu))
return;
+ mutex_lock(&df->lock);
+ df->suspended = true;
+ mutex_unlock(&df->lock);
+
devfreq_suspend_device(df->devfreq);
cancel_idle_work(df);
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index 22b83a6577eb..df83c4654e26 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -1361,13 +1361,11 @@ nouveau_connector_create(struct drm_device *dev,
snprintf(aux_name, sizeof(aux_name), "sor-%04x-%04x",
dcbe->hasht, dcbe->hashm);
nv_connector->aux.name = kstrdup(aux_name, GFP_KERNEL);
- drm_dp_aux_init(&nv_connector->aux);
- if (ret) {
- NV_ERROR(drm, "Failed to init AUX adapter for sor-%04x-%04x: %d\n",
- dcbe->hasht, dcbe->hashm, ret);
+ if (!nv_connector->aux.name) {
kfree(nv_connector);
- return ERR_PTR(ret);
+ return ERR_PTR(-ENOMEM);
}
+ drm_dp_aux_init(&nv_connector->aux);
fallthrough;
default:
funcs = &nouveau_connector_funcs;
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 2cd0932b3d68..a2f5df568ca5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -515,7 +515,7 @@ nouveau_display_hpd_work(struct work_struct *work)
pm_runtime_mark_last_busy(drm->dev->dev);
noop:
- pm_runtime_put_sync(drm->dev->dev);
+ pm_runtime_put_autosuspend(dev->dev);
}
#ifdef CONFIG_ACPI
@@ -537,7 +537,7 @@ nouveau_display_acpi_ntfy(struct notifier_block *nb, unsigned long val,
* it's own hotplug events.
*/
pm_runtime_put_autosuspend(drm->dev->dev);
- } else if (ret == 0) {
+ } else if (ret == 0 || ret == -EINPROGRESS) {
/* We've started resuming the GPU already, so
* it will handle scheduling a full reprobe
* itself
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 4f9b3aa5deda..20ac1ce2c0f1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -466,7 +466,7 @@ nouveau_fbcon_set_suspend_work(struct work_struct *work)
if (state == FBINFO_STATE_RUNNING) {
nouveau_fbcon_hotplug_resume(drm->fbcon);
pm_runtime_mark_last_busy(drm->dev->dev);
- pm_runtime_put_sync(drm->dev->dev);
+ pm_runtime_put_autosuspend(drm->dev->dev);
}
}
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c
index 64e423dddd9e..6c318e41bde0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bios/base.c
@@ -33,7 +33,7 @@ nvbios_addr(struct nvkm_bios *bios, u32 *addr, u8 size)
{
u32 p = *addr;
- if (*addr > bios->image0_size && bios->imaged_addr) {
+ if (*addr >= bios->image0_size && bios->imaged_addr) {
*addr -= bios->image0_size;
*addr += bios->imaged_addr;
}
diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
index ddf5f38e8731..abc5271af4eb 100644
--- a/drivers/gpu/drm/panel/Kconfig
+++ b/drivers/gpu/drm/panel/Kconfig
@@ -428,6 +428,8 @@ config DRM_PANEL_SAMSUNG_ATNA33XC20
depends on OF
depends on BACKLIGHT_CLASS_DEVICE
depends on PM
+ select DRM_DISPLAY_DP_HELPER
+ select DRM_DISPLAY_HELPER
select DRM_DP_AUX_BUS
help
DRM panel driver for the Samsung ATNA33XC20 panel. This panel can't
diff --git a/drivers/gpu/drm/radeon/.gitignore b/drivers/gpu/drm/radeon/.gitignore
index 9c1a94153983..d8777383a64a 100644
--- a/drivers/gpu/drm/radeon/.gitignore
+++ b/drivers/gpu/drm/radeon/.gitignore
@@ -1,4 +1,4 @@
-# SPDX-License-Identifier: GPL-2.0-only
+# SPDX-License-Identifier: MIT
mkregtable
*_reg_safe.h
diff --git a/drivers/gpu/drm/radeon/Kconfig b/drivers/gpu/drm/radeon/Kconfig
index 6f60f4840cc5..52819e7f1fca 100644
--- a/drivers/gpu/drm/radeon/Kconfig
+++ b/drivers/gpu/drm/radeon/Kconfig
@@ -1,4 +1,4 @@
-# SPDX-License-Identifier: GPL-2.0-only
+# SPDX-License-Identifier: MIT
config DRM_RADEON_USERPTR
bool "Always enable userptr support"
depends on DRM_RADEON
diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 11c97edde54d..3d502f1bbfcb 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -1,4 +1,4 @@
-# SPDX-License-Identifier: GPL-2.0
+# SPDX-License-Identifier: MIT
#
# Makefile for the drm device driver. This driver provides support for the
# Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
diff --git a/drivers/gpu/drm/radeon/ni_dpm.c b/drivers/gpu/drm/radeon/ni_dpm.c
index 769f666335ac..672d2239293e 100644
--- a/drivers/gpu/drm/radeon/ni_dpm.c
+++ b/drivers/gpu/drm/radeon/ni_dpm.c
@@ -2741,10 +2741,10 @@ static int ni_set_mc_special_registers(struct radeon_device *rdev,
table->mc_reg_table_entry[k].mc_data[j] |= 0x100;
}
j++;
- if (j > SMC_NISLANDS_MC_REGISTER_ARRAY_SIZE)
- return -EINVAL;
break;
case MC_SEQ_RESERVE_M >> 2:
+ if (j >= SMC_NISLANDS_MC_REGISTER_ARRAY_SIZE)
+ return -EINVAL;
temp_reg = RREG32(MC_PMG_CMD_MRS1);
table->mc_reg_address[j].s1 = MC_PMG_CMD_MRS1 >> 2;
table->mc_reg_address[j].s0 = MC_SEQ_PMG_CMD_MRS1_LP >> 2;
@@ -2753,8 +2753,6 @@ static int ni_set_mc_special_registers(struct radeon_device *rdev,
(temp_reg & 0xffff0000) |
(table->mc_reg_table_entry[k].mc_data[i] & 0x0000ffff);
j++;
- if (j > SMC_NISLANDS_MC_REGISTER_ARRAY_SIZE)
- return -EINVAL;
break;
default:
break;
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 15692cb241fc..429644d5ddc6 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1113,7 +1113,7 @@ static int radeon_gart_size_auto(enum radeon_family family)
static void radeon_check_arguments(struct radeon_device *rdev)
{
/* vramlimit must be a power of two */
- if (!is_power_of_2(radeon_vram_limit)) {
+ if (radeon_vram_limit != 0 && !is_power_of_2(radeon_vram_limit)) {
dev_warn(rdev->dev, "vram limit (%d) must be a power of 2\n",
radeon_vram_limit);
radeon_vram_limit = 0;
diff --git a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
index c82901d9a9cc..8a9bb618f872 100644
--- a/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
+++ b/drivers/gpu/drm/rockchip/analogix_dp-rockchip.c
@@ -398,7 +398,15 @@ static int rockchip_dp_probe(struct platform_device *pdev)
if (IS_ERR(dp->adp))
return PTR_ERR(dp->adp);
- return component_add(dev, &rockchip_dp_component_ops);
+ ret = component_add(dev, &rockchip_dp_component_ops);
+ if (ret)
+ goto err_dp_remove;
+
+ return 0;
+
+err_dp_remove:
+ analogix_dp_remove(dp->adp);
+ return ret;
}
static int rockchip_dp_remove(struct platform_device *pdev)
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index d53037531f40..26e24f62f75f 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -1552,6 +1552,9 @@ static struct drm_crtc_state *vop_crtc_duplicate_state(struct drm_crtc *crtc)
{
struct rockchip_crtc_state *rockchip_state;
+ if (WARN_ON(!crtc->state))
+ return NULL;
+
rockchip_state = kzalloc(sizeof(*rockchip_state), GFP_KERNEL);
if (!rockchip_state)
return NULL;
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 7c7dd84e6db8..81991090adcc 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -704,14 +704,23 @@ static int tegra_gem_prime_vmap(struct dma_buf *buf, struct iosys_map *map)
{
struct drm_gem_object *gem = buf->priv;
struct tegra_bo *bo = to_tegra_bo(gem);
+ void *vaddr;
- iosys_map_set_vaddr(map, bo->vaddr);
+ vaddr = tegra_bo_mmap(&bo->base);
+ if (IS_ERR(vaddr))
+ return PTR_ERR(vaddr);
+
+ iosys_map_set_vaddr(map, vaddr);
return 0;
}
static void tegra_gem_prime_vunmap(struct dma_buf *buf, struct iosys_map *map)
{
+ struct drm_gem_object *gem = buf->priv;
+ struct tegra_bo *bo = to_tegra_bo(gem);
+
+ tegra_bo_munmap(&bo->base, map->vaddr);
}
static const struct dma_buf_ops tegra_gem_prime_dmabuf_ops = {
diff --git a/drivers/gpu/drm/tiny/st7735r.c b/drivers/gpu/drm/tiny/st7735r.c
index 29d618093e94..e0f02d367d88 100644
--- a/drivers/gpu/drm/tiny/st7735r.c
+++ b/drivers/gpu/drm/tiny/st7735r.c
@@ -174,6 +174,7 @@ MODULE_DEVICE_TABLE(of, st7735r_of_match);
static const struct spi_device_id st7735r_id[] = {
{ "jd-t18003-t01", (uintptr_t)&jd_t18003_t01_cfg },
+ { "rh128128t", (uintptr_t)&rh128128t_cfg },
{ },
};
MODULE_DEVICE_TABLE(spi, st7735r_id);
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index 477b3c5ad089..18e2a246f7c1 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -314,10 +314,13 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
struct drm_crtc_state *crtc_state = crtc->state;
struct drm_display_mode *mode = &crtc_state->adjusted_mode;
bool interlace = mode->flags & DRM_MODE_FLAG_INTERLACE;
- u32 pixel_rep = (mode->flags & DRM_MODE_FLAG_DBLCLK) ? 2 : 1;
+ bool is_hdmi = vc4_encoder->type == VC4_ENCODER_TYPE_HDMI0 ||
+ vc4_encoder->type == VC4_ENCODER_TYPE_HDMI1;
+ u32 pixel_rep = ((mode->flags & DRM_MODE_FLAG_DBLCLK) && !is_hdmi) ? 2 : 1;
bool is_dsi = (vc4_encoder->type == VC4_ENCODER_TYPE_DSI0 ||
vc4_encoder->type == VC4_ENCODER_TYPE_DSI1);
- u32 format = is_dsi ? PV_CONTROL_FORMAT_DSIV_24 : PV_CONTROL_FORMAT_24;
+ bool is_dsi1 = vc4_encoder->type == VC4_ENCODER_TYPE_DSI1;
+ u32 format = is_dsi1 ? PV_CONTROL_FORMAT_DSIV_24 : PV_CONTROL_FORMAT_24;
u8 ppc = pv_data->pixels_per_clock;
bool debug_dump_regs = false;
@@ -343,7 +346,8 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
PV_HORZB_HACTIVE));
CRTC_WRITE(PV_VERTA,
- VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end,
+ VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end +
+ interlace,
PV_VERTA_VBP) |
VC4_SET_FIELD(mode->crtc_vsync_end - mode->crtc_vsync_start,
PV_VERTA_VSYNC));
@@ -355,7 +359,7 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
if (interlace) {
CRTC_WRITE(PV_VERTA_EVEN,
VC4_SET_FIELD(mode->crtc_vtotal -
- mode->crtc_vsync_end - 1,
+ mode->crtc_vsync_end,
PV_VERTA_VBP) |
VC4_SET_FIELD(mode->crtc_vsync_end -
mode->crtc_vsync_start,
@@ -375,7 +379,7 @@ static void vc4_crtc_config_pv(struct drm_crtc *crtc, struct drm_encoder *encode
PV_VCONTROL_CONTINUOUS |
(is_dsi ? PV_VCONTROL_DSI : 0) |
PV_VCONTROL_INTERLACE |
- VC4_SET_FIELD(mode->htotal * pixel_rep / 2,
+ VC4_SET_FIELD(mode->htotal * pixel_rep / (2 * ppc),
PV_VCONTROL_ODD_DELAY));
CRTC_WRITE(PV_VSYNCD_EVEN, 0);
} else {
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index 162bc18e7497..680eaef7287f 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -209,6 +209,15 @@ static void vc4_match_add_drivers(struct device *dev,
}
}
+static const struct of_device_id vc4_dma_range_matches[] = {
+ { .compatible = "brcm,bcm2711-hvs" },
+ { .compatible = "brcm,bcm2835-hvs" },
+ { .compatible = "brcm,bcm2835-v3d" },
+ { .compatible = "brcm,cygnus-v3d" },
+ { .compatible = "brcm,vc4-v3d" },
+ {}
+};
+
static int vc4_drm_bind(struct device *dev)
{
struct platform_device *pdev = to_platform_device(dev);
@@ -227,6 +236,16 @@ static int vc4_drm_bind(struct device *dev)
vc4_drm_driver.driver_features &= ~DRIVER_RENDER;
of_node_put(node);
+ node = of_find_matching_node_and_match(NULL, vc4_dma_range_matches,
+ NULL);
+ if (node) {
+ ret = of_dma_configure(dev, node, true);
+ of_node_put(node);
+
+ if (ret)
+ return ret;
+ }
+
vc4 = devm_drm_dev_alloc(dev, &vc4_drm_driver, struct vc4_dev, base);
if (IS_ERR(vc4))
return PTR_ERR(vc4);
diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
index 98308a17e4ed..b7b2c76770dc 100644
--- a/drivers/gpu/drm/vc4/vc4_dsi.c
+++ b/drivers/gpu/drm/vc4/vc4_dsi.c
@@ -181,8 +181,50 @@
#define DSI0_TXPKT_PIX_FIFO 0x20 /* AKA PIX_FIFO */
-#define DSI0_INT_STAT 0x24
-#define DSI0_INT_EN 0x28
+#define DSI0_INT_STAT 0x24
+#define DSI0_INT_EN 0x28
+# define DSI0_INT_FIFO_ERR BIT(25)
+# define DSI0_INT_CMDC_DONE_MASK VC4_MASK(24, 23)
+# define DSI0_INT_CMDC_DONE_SHIFT 23
+# define DSI0_INT_CMDC_DONE_NO_REPEAT 1
+# define DSI0_INT_CMDC_DONE_REPEAT 3
+# define DSI0_INT_PHY_DIR_RTF BIT(22)
+# define DSI0_INT_PHY_D1_ULPS BIT(21)
+# define DSI0_INT_PHY_D1_STOP BIT(20)
+# define DSI0_INT_PHY_RXLPDT BIT(19)
+# define DSI0_INT_PHY_RXTRIG BIT(18)
+# define DSI0_INT_PHY_D0_ULPS BIT(17)
+# define DSI0_INT_PHY_D0_LPDT BIT(16)
+# define DSI0_INT_PHY_D0_FTR BIT(15)
+# define DSI0_INT_PHY_D0_STOP BIT(14)
+/* Signaled when the clock lane enters the given state. */
+# define DSI0_INT_PHY_CLK_ULPS BIT(13)
+# define DSI0_INT_PHY_CLK_HS BIT(12)
+# define DSI0_INT_PHY_CLK_FTR BIT(11)
+/* Signaled on timeouts */
+# define DSI0_INT_PR_TO BIT(10)
+# define DSI0_INT_TA_TO BIT(9)
+# define DSI0_INT_LPRX_TO BIT(8)
+# define DSI0_INT_HSTX_TO BIT(7)
+/* Contention on a line when trying to drive the line low */
+# define DSI0_INT_ERR_CONT_LP1 BIT(6)
+# define DSI0_INT_ERR_CONT_LP0 BIT(5)
+/* Control error: incorrect line state sequence on data lane 0. */
+# define DSI0_INT_ERR_CONTROL BIT(4)
+# define DSI0_INT_ERR_SYNC_ESC BIT(3)
+# define DSI0_INT_RX2_PKT BIT(2)
+# define DSI0_INT_RX1_PKT BIT(1)
+# define DSI0_INT_CMD_PKT BIT(0)
+
+#define DSI0_INTERRUPTS_ALWAYS_ENABLED (DSI0_INT_ERR_SYNC_ESC | \
+ DSI0_INT_ERR_CONTROL | \
+ DSI0_INT_ERR_CONT_LP0 | \
+ DSI0_INT_ERR_CONT_LP1 | \
+ DSI0_INT_HSTX_TO | \
+ DSI0_INT_LPRX_TO | \
+ DSI0_INT_TA_TO | \
+ DSI0_INT_PR_TO)
+
# define DSI1_INT_PHY_D3_ULPS BIT(30)
# define DSI1_INT_PHY_D3_STOP BIT(29)
# define DSI1_INT_PHY_D2_ULPS BIT(28)
@@ -761,6 +803,9 @@ static void vc4_dsi_encoder_disable(struct drm_encoder *encoder)
list_for_each_entry_reverse(iter, &dsi->bridge_chain, chain_node) {
if (iter->funcs->disable)
iter->funcs->disable(iter);
+
+ if (iter == dsi->bridge)
+ break;
}
vc4_dsi_ulps(dsi, true);
@@ -805,11 +850,9 @@ static bool vc4_dsi_encoder_mode_fixup(struct drm_encoder *encoder,
/* Find what divider gets us a faster clock than the requested
* pixel clock.
*/
- for (divider = 1; divider < 8; divider++) {
- if (parent_rate / divider < pll_clock) {
- divider--;
+ for (divider = 1; divider < 255; divider++) {
+ if (parent_rate / (divider + 1) < pll_clock)
break;
- }
}
/* Now that we've picked a PLL divider, calculate back to its
@@ -894,6 +937,9 @@ static void vc4_dsi_encoder_enable(struct drm_encoder *encoder)
DSI_PORT_WRITE(PHY_AFEC0, afec0);
+ /* AFEC reset hold time */
+ mdelay(1);
+
DSI_PORT_WRITE(PHY_AFEC1,
VC4_SET_FIELD(6, DSI0_PHY_AFEC1_IDR_DLANE1) |
VC4_SET_FIELD(6, DSI0_PHY_AFEC1_IDR_DLANE0) |
@@ -1060,12 +1106,9 @@ static void vc4_dsi_encoder_enable(struct drm_encoder *encoder)
DSI_PORT_WRITE(CTRL, DSI_PORT_READ(CTRL) | DSI1_CTRL_EN);
/* Bring AFE out of reset. */
- if (dsi->variant->port == 0) {
- } else {
- DSI_PORT_WRITE(PHY_AFEC0,
- DSI_PORT_READ(PHY_AFEC0) &
- ~DSI1_PHY_AFEC0_RESET);
- }
+ DSI_PORT_WRITE(PHY_AFEC0,
+ DSI_PORT_READ(PHY_AFEC0) &
+ ~DSI_PORT_BIT(PHY_AFEC0_RESET));
vc4_dsi_ulps(dsi, false);
@@ -1184,13 +1227,28 @@ static ssize_t vc4_dsi_host_transfer(struct mipi_dsi_host *host,
/* Enable the appropriate interrupt for the transfer completion. */
dsi->xfer_result = 0;
reinit_completion(&dsi->xfer_completion);
- DSI_PORT_WRITE(INT_STAT, DSI1_INT_TXPKT1_DONE | DSI1_INT_PHY_DIR_RTF);
- if (msg->rx_len) {
- DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
- DSI1_INT_PHY_DIR_RTF));
+ if (dsi->variant->port == 0) {
+ DSI_PORT_WRITE(INT_STAT,
+ DSI0_INT_CMDC_DONE_MASK | DSI1_INT_PHY_DIR_RTF);
+ if (msg->rx_len) {
+ DSI_PORT_WRITE(INT_EN, (DSI0_INTERRUPTS_ALWAYS_ENABLED |
+ DSI0_INT_PHY_DIR_RTF));
+ } else {
+ DSI_PORT_WRITE(INT_EN,
+ (DSI0_INTERRUPTS_ALWAYS_ENABLED |
+ VC4_SET_FIELD(DSI0_INT_CMDC_DONE_NO_REPEAT,
+ DSI0_INT_CMDC_DONE)));
+ }
} else {
- DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
- DSI1_INT_TXPKT1_DONE));
+ DSI_PORT_WRITE(INT_STAT,
+ DSI1_INT_TXPKT1_DONE | DSI1_INT_PHY_DIR_RTF);
+ if (msg->rx_len) {
+ DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
+ DSI1_INT_PHY_DIR_RTF));
+ } else {
+ DSI_PORT_WRITE(INT_EN, (DSI1_INTERRUPTS_ALWAYS_ENABLED |
+ DSI1_INT_TXPKT1_DONE));
+ }
}
/* Send the packet. */
@@ -1207,7 +1265,7 @@ static ssize_t vc4_dsi_host_transfer(struct mipi_dsi_host *host,
ret = dsi->xfer_result;
}
- DSI_PORT_WRITE(INT_EN, DSI1_INTERRUPTS_ALWAYS_ENABLED);
+ DSI_PORT_WRITE(INT_EN, DSI_PORT_BIT(INTERRUPTS_ALWAYS_ENABLED));
if (ret)
goto reset_fifo_and_return;
@@ -1253,7 +1311,7 @@ static ssize_t vc4_dsi_host_transfer(struct mipi_dsi_host *host,
DSI_PORT_BIT(CTRL_RESET_FIFOS));
DSI_PORT_WRITE(TXPKT1C, 0);
- DSI_PORT_WRITE(INT_EN, DSI1_INTERRUPTS_ALWAYS_ENABLED);
+ DSI_PORT_WRITE(INT_EN, DSI_PORT_BIT(INTERRUPTS_ALWAYS_ENABLED));
return ret;
}
@@ -1390,26 +1448,28 @@ static irqreturn_t vc4_dsi_irq_handler(int irq, void *data)
DSI_PORT_WRITE(INT_STAT, stat);
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_ERR_SYNC_ESC, "LPDT sync");
+ DSI_PORT_BIT(INT_ERR_SYNC_ESC), "LPDT sync");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_ERR_CONTROL, "data lane 0 sequence");
+ DSI_PORT_BIT(INT_ERR_CONTROL), "data lane 0 sequence");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_ERR_CONT_LP0, "LP0 contention");
+ DSI_PORT_BIT(INT_ERR_CONT_LP0), "LP0 contention");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_ERR_CONT_LP1, "LP1 contention");
+ DSI_PORT_BIT(INT_ERR_CONT_LP1), "LP1 contention");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_HSTX_TO, "HSTX timeout");
+ DSI_PORT_BIT(INT_HSTX_TO), "HSTX timeout");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_LPRX_TO, "LPRX timeout");
+ DSI_PORT_BIT(INT_LPRX_TO), "LPRX timeout");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_TA_TO, "turnaround timeout");
+ DSI_PORT_BIT(INT_TA_TO), "turnaround timeout");
dsi_handle_error(dsi, &ret, stat,
- DSI1_INT_PR_TO, "peripheral reset timeout");
+ DSI_PORT_BIT(INT_PR_TO), "peripheral reset timeout");
- if (stat & (DSI1_INT_TXPKT1_DONE | DSI1_INT_PHY_DIR_RTF)) {
+ if (stat & ((dsi->variant->port ? DSI1_INT_TXPKT1_DONE :
+ DSI0_INT_CMDC_DONE_MASK) |
+ DSI_PORT_BIT(INT_PHY_DIR_RTF))) {
complete(&dsi->xfer_completion);
ret = IRQ_HANDLED;
- } else if (stat & DSI1_INT_HSTX_TO) {
+ } else if (stat & DSI_PORT_BIT(INT_HSTX_TO)) {
complete(&dsi->xfer_completion);
dsi->xfer_result = -ETIMEDOUT;
ret = IRQ_HANDLED;
@@ -1487,13 +1547,29 @@ vc4_dsi_init_phy_clocks(struct vc4_dsi *dsi)
dsi->clk_onecell);
}
+static void vc4_dsi_dma_mem_release(void *ptr)
+{
+ struct vc4_dsi *dsi = ptr;
+ struct device *dev = &dsi->pdev->dev;
+
+ dma_free_coherent(dev, 4, dsi->reg_dma_mem, dsi->reg_dma_paddr);
+ dsi->reg_dma_mem = NULL;
+}
+
+static void vc4_dsi_dma_chan_release(void *ptr)
+{
+ struct vc4_dsi *dsi = ptr;
+
+ dma_release_channel(dsi->reg_dma_chan);
+ dsi->reg_dma_chan = NULL;
+}
+
static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
{
struct platform_device *pdev = to_platform_device(dev);
struct drm_device *drm = dev_get_drvdata(master);
struct vc4_dsi *dsi = dev_get_drvdata(dev);
struct vc4_dsi_encoder *vc4_dsi_encoder;
- dma_cap_mask_t dma_mask;
int ret;
dsi->variant = of_device_get_match_data(dev);
@@ -1504,7 +1580,8 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
return -ENOMEM;
INIT_LIST_HEAD(&dsi->bridge_chain);
- vc4_dsi_encoder->base.type = VC4_ENCODER_TYPE_DSI1;
+ vc4_dsi_encoder->base.type = dsi->variant->port ?
+ VC4_ENCODER_TYPE_DSI1 : VC4_ENCODER_TYPE_DSI0;
vc4_dsi_encoder->dsi = dsi;
dsi->encoder = &vc4_dsi_encoder->base.base;
@@ -1527,6 +1604,8 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
* so set up a channel for talking to it.
*/
if (dsi->variant->broken_axi_workaround) {
+ dma_cap_mask_t dma_mask;
+
dsi->reg_dma_mem = dma_alloc_coherent(dev, 4,
&dsi->reg_dma_paddr,
GFP_KERNEL);
@@ -1535,8 +1614,13 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
return -ENOMEM;
}
+ ret = devm_add_action_or_reset(dev, vc4_dsi_dma_mem_release, dsi);
+ if (ret)
+ return ret;
+
dma_cap_zero(dma_mask);
dma_cap_set(DMA_MEMCPY, dma_mask);
+
dsi->reg_dma_chan = dma_request_chan_by_mask(&dma_mask);
if (IS_ERR(dsi->reg_dma_chan)) {
ret = PTR_ERR(dsi->reg_dma_chan);
@@ -1546,6 +1630,10 @@ static int vc4_dsi_bind(struct device *dev, struct device *master, void *data)
return ret;
}
+ ret = devm_add_action_or_reset(dev, vc4_dsi_dma_chan_release, dsi);
+ if (ret)
+ return ret;
+
/* Get the physical address of the device's registers. The
* struct resource for the regs gives us the bus address
* instead.
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 98b78ec6b37d..8b1e52145082 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -79,6 +79,11 @@
#define VC5_HDMI_VERTB_VSPO_SHIFT 16
#define VC5_HDMI_VERTB_VSPO_MASK VC4_MASK(29, 16)
+#define VC4_HDMI_MISC_CONTROL_PIXEL_REP_SHIFT 0
+#define VC4_HDMI_MISC_CONTROL_PIXEL_REP_MASK VC4_MASK(3, 0)
+#define VC5_HDMI_MISC_CONTROL_PIXEL_REP_SHIFT 0
+#define VC5_HDMI_MISC_CONTROL_PIXEL_REP_MASK VC4_MASK(3, 0)
+
#define VC5_HDMI_SCRAMBLER_CTL_ENABLE BIT(0)
#define VC5_HDMI_DEEP_COLOR_CONFIG_1_INIT_PACK_PHASE_SHIFT 8
@@ -122,6 +127,12 @@ static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused)
drm_print_regset32(&p, &vc4_hdmi->hdmi_regset);
drm_print_regset32(&p, &vc4_hdmi->hd_regset);
+ drm_print_regset32(&p, &vc4_hdmi->cec_regset);
+ drm_print_regset32(&p, &vc4_hdmi->csc_regset);
+ drm_print_regset32(&p, &vc4_hdmi->dvp_regset);
+ drm_print_regset32(&p, &vc4_hdmi->phy_regset);
+ drm_print_regset32(&p, &vc4_hdmi->ram_regset);
+ drm_print_regset32(&p, &vc4_hdmi->rm_regset);
return 0;
}
@@ -433,9 +444,11 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder *encoder,
const struct vc4_hdmi_register *ram_packet_start =
&vc4_hdmi->variant->registers[HDMI_RAM_PACKET_START];
u32 packet_reg = ram_packet_start->offset + VC4_HDMI_PACKET_STRIDE * packet_id;
+ u32 packet_reg_next = ram_packet_start->offset +
+ VC4_HDMI_PACKET_STRIDE * (packet_id + 1);
void __iomem *base = __vc4_hdmi_get_field_base(vc4_hdmi,
ram_packet_start->reg);
- uint8_t buffer[VC4_HDMI_PACKET_STRIDE];
+ uint8_t buffer[VC4_HDMI_PACKET_STRIDE] = {};
unsigned long flags;
ssize_t len, i;
int ret;
@@ -471,6 +484,13 @@ static void vc4_hdmi_write_infoframe(struct drm_encoder *encoder,
packet_reg += 4;
}
+ /*
+ * clear remainder of packet ram as it's included in the
+ * infoframe and triggers a checksum error on hdmi analyser
+ */
+ for (; packet_reg < packet_reg_next; packet_reg += 4)
+ writel(0, base + packet_reg);
+
HDMI_WRITE(HDMI_RAM_PACKET_CONFIG,
HDMI_READ(HDMI_RAM_PACKET_CONFIG) | BIT(packet_id));
@@ -852,14 +872,15 @@ static void vc4_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
VC4_HDMI_VERTA_VFP) |
VC4_SET_FIELD(mode->crtc_vdisplay, VC4_HDMI_VERTA_VAL));
u32 vertb = (VC4_SET_FIELD(0, VC4_HDMI_VERTB_VSPO) |
- VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end,
+ VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end +
+ interlaced,
VC4_HDMI_VERTB_VBP));
u32 vertb_even = (VC4_SET_FIELD(0, VC4_HDMI_VERTB_VSPO) |
VC4_SET_FIELD(mode->crtc_vtotal -
- mode->crtc_vsync_end -
- interlaced,
+ mode->crtc_vsync_end,
VC4_HDMI_VERTB_VBP));
unsigned long flags;
+ u32 reg;
spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
@@ -886,6 +907,11 @@ static void vc4_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
HDMI_WRITE(HDMI_VERTB0, vertb_even);
HDMI_WRITE(HDMI_VERTB1, vertb);
+ reg = HDMI_READ(HDMI_MISC_CONTROL);
+ reg &= ~VC4_HDMI_MISC_CONTROL_PIXEL_REP_MASK;
+ reg |= VC4_SET_FIELD(pixel_rep - 1, VC4_HDMI_MISC_CONTROL_PIXEL_REP);
+ HDMI_WRITE(HDMI_MISC_CONTROL, reg);
+
spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
}
@@ -902,13 +928,13 @@ static void vc5_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
VC4_SET_FIELD(mode->crtc_vsync_start - mode->crtc_vdisplay,
VC5_HDMI_VERTA_VFP) |
VC4_SET_FIELD(mode->crtc_vdisplay, VC5_HDMI_VERTA_VAL));
- u32 vertb = (VC4_SET_FIELD(0, VC5_HDMI_VERTB_VSPO) |
+ u32 vertb = (VC4_SET_FIELD(mode->htotal >> (2 - pixel_rep),
+ VC5_HDMI_VERTB_VSPO) |
VC4_SET_FIELD(mode->crtc_vtotal - mode->crtc_vsync_end,
VC4_HDMI_VERTB_VBP));
u32 vertb_even = (VC4_SET_FIELD(0, VC5_HDMI_VERTB_VSPO) |
VC4_SET_FIELD(mode->crtc_vtotal -
- mode->crtc_vsync_end -
- interlaced,
+ mode->crtc_vsync_end - interlaced,
VC4_HDMI_VERTB_VBP));
unsigned long flags;
unsigned char gcp;
@@ -973,6 +999,11 @@ static void vc5_hdmi_set_timings(struct vc4_hdmi *vc4_hdmi,
reg |= gcp_en ? VC5_HDMI_GCP_CONFIG_GCP_ENABLE : 0;
HDMI_WRITE(HDMI_GCP_CONFIG, reg);
+ reg = HDMI_READ(HDMI_MISC_CONTROL);
+ reg &= ~VC5_HDMI_MISC_CONTROL_PIXEL_REP_MASK;
+ reg |= VC4_SET_FIELD(pixel_rep - 1, VC5_HDMI_MISC_CONTROL_PIXEL_REP);
+ HDMI_WRITE(HDMI_MISC_CONTROL, reg);
+
HDMI_WRITE(HDMI_CLOCK_STOP, 0);
spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
@@ -1253,11 +1284,25 @@ static int vc4_hdmi_encoder_atomic_check(struct drm_encoder *encoder,
unsigned long long pixel_rate = mode->clock * 1000;
unsigned long long tmds_rate;
- if (vc4_hdmi->variant->unsupported_odd_h_timings &&
- !(mode->flags & DRM_MODE_FLAG_DBLCLK) &&
- ((mode->hdisplay % 2) || (mode->hsync_start % 2) ||
- (mode->hsync_end % 2) || (mode->htotal % 2)))
- return -EINVAL;
+ if (vc4_hdmi->variant->unsupported_odd_h_timings) {
+ if (mode->flags & DRM_MODE_FLAG_DBLCLK) {
+ /* Only try to fixup DBLCLK modes to get 480i and 576i
+ * working.
+ * A generic solution for all modes with odd horizontal
+ * timing values seems impossible based on trying to
+ * solve it for 1366x768 monitors.
+ */
+ if ((mode->hsync_start - mode->hdisplay) & 1)
+ mode->hsync_start--;
+ if ((mode->hsync_end - mode->hsync_start) & 1)
+ mode->hsync_end--;
+ }
+
+ /* Now check whether we still have odd values remaining */
+ if ((mode->hdisplay % 2) || (mode->hsync_start % 2) ||
+ (mode->hsync_end % 2) || (mode->htotal % 2))
+ return -EINVAL;
+ }
/*
* The 1440p@60 pixel rate is in the same range than the first
@@ -1611,10 +1656,10 @@ static int vc4_hdmi_audio_prepare(struct device *dev, void *data,
/* Set the MAI threshold */
HDMI_WRITE(HDMI_MAI_THR,
- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_PANICHIGH) |
- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_PANICLOW) |
- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_DREQHIGH) |
- VC4_SET_FIELD(0x10, VC4_HD_MAI_THR_DREQLOW));
+ VC4_SET_FIELD(0x08, VC4_HD_MAI_THR_PANICHIGH) |
+ VC4_SET_FIELD(0x08, VC4_HD_MAI_THR_PANICLOW) |
+ VC4_SET_FIELD(0x06, VC4_HD_MAI_THR_DREQHIGH) |
+ VC4_SET_FIELD(0x08, VC4_HD_MAI_THR_DREQLOW));
HDMI_WRITE(HDMI_MAI_CONFIG,
VC4_HDMI_MAI_CONFIG_BIT_REVERSE |
@@ -1705,12 +1750,12 @@ static int vc4_hdmi_audio_init(struct vc4_hdmi *vc4_hdmi)
struct device *dev = &vc4_hdmi->pdev->dev;
struct platform_device *codec_pdev;
const __be32 *addr;
- int index;
+ int index, len;
int ret;
- if (!of_find_property(dev->of_node, "dmas", NULL)) {
+ if (!of_find_property(dev->of_node, "dmas", &len) || !len) {
dev_warn(dev,
- "'dmas' DT property is missing, no HDMI audio\n");
+ "'dmas' DT property is missing or empty, no HDMI audio\n");
return 0;
}
@@ -2191,8 +2236,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
struct cec_connector_info conn_info;
struct platform_device *pdev = vc4_hdmi->pdev;
struct device *dev = &pdev->dev;
- unsigned long flags;
- u32 value;
int ret;
if (!of_find_property(dev->of_node, "interrupts", NULL)) {
@@ -2211,15 +2254,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
cec_fill_conn_info_from_drm(&conn_info, &vc4_hdmi->connector);
cec_s_conn_info(vc4_hdmi->cec_adap, &conn_info);
- spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
- value = HDMI_READ(HDMI_CEC_CNTRL_1);
- /* Set the logical address to Unregistered */
- value |= VC4_HDMI_CEC_ADDR_MASK;
- HDMI_WRITE(HDMI_CEC_CNTRL_1, value);
- spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
-
- vc4_hdmi_cec_update_clk_div(vc4_hdmi);
-
if (vc4_hdmi->variant->external_irq_controller) {
ret = request_threaded_irq(platform_get_irq_byname(pdev, "cec-rx"),
vc4_cec_irq_handler_rx_bare,
@@ -2235,10 +2269,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
if (ret)
goto err_remove_cec_rx_handler;
} else {
- spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
- HDMI_WRITE(HDMI_CEC_CPU_MASK_SET, 0xffffffff);
- spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
-
ret = request_threaded_irq(platform_get_irq(pdev, 0),
vc4_cec_irq_handler,
vc4_cec_irq_handler_thread, 0,
@@ -2289,7 +2319,6 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi)
}
static void vc4_hdmi_cec_exit(struct vc4_hdmi *vc4_hdmi) {};
-
#endif
static int vc4_hdmi_build_regset(struct vc4_hdmi *vc4_hdmi,
@@ -2374,6 +2403,7 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi *vc4_hdmi)
struct platform_device *pdev = vc4_hdmi->pdev;
struct device *dev = &pdev->dev;
struct resource *res;
+ int ret;
res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "hdmi");
if (!res)
@@ -2470,6 +2500,38 @@ static int vc5_hdmi_init_resources(struct vc4_hdmi *vc4_hdmi)
return PTR_ERR(vc4_hdmi->reset);
}
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->hdmi_regset, VC4_HDMI);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->hd_regset, VC4_HD);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->cec_regset, VC5_CEC);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->csc_regset, VC5_CSC);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->dvp_regset, VC5_DVP);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->phy_regset, VC5_PHY);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->ram_regset, VC5_RAM);
+ if (ret)
+ return ret;
+
+ ret = vc4_hdmi_build_regset(vc4_hdmi, &vc4_hdmi->rm_regset, VC5_RM);
+ if (ret)
+ return ret;
+
return 0;
}
@@ -2485,12 +2547,34 @@ static int __maybe_unused vc4_hdmi_runtime_suspend(struct device *dev)
static int vc4_hdmi_runtime_resume(struct device *dev)
{
struct vc4_hdmi *vc4_hdmi = dev_get_drvdata(dev);
+ unsigned long __maybe_unused flags;
+ u32 __maybe_unused value;
int ret;
ret = clk_prepare_enable(vc4_hdmi->hsm_clock);
if (ret)
return ret;
+ if (vc4_hdmi->variant->reset)
+ vc4_hdmi->variant->reset(vc4_hdmi);
+
+#ifdef CONFIG_DRM_VC4_HDMI_CEC
+ spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
+ value = HDMI_READ(HDMI_CEC_CNTRL_1);
+ /* Set the logical address to Unregistered */
+ value |= VC4_HDMI_CEC_ADDR_MASK;
+ HDMI_WRITE(HDMI_CEC_CNTRL_1, value);
+ spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+
+ vc4_hdmi_cec_update_clk_div(vc4_hdmi);
+
+ if (!vc4_hdmi->variant->external_irq_controller) {
+ spin_lock_irqsave(&vc4_hdmi->hw_lock, flags);
+ HDMI_WRITE(HDMI_CEC_CPU_MASK_SET, 0xffffffff);
+ spin_unlock_irqrestore(&vc4_hdmi->hw_lock, flags);
+ }
+#endif
+
return 0;
}
@@ -2593,9 +2677,6 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
pm_runtime_set_active(dev);
pm_runtime_enable(dev);
- if (vc4_hdmi->variant->reset)
- vc4_hdmi->variant->reset(vc4_hdmi);
-
if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) &&
HDMI_READ(HDMI_VID_CTL) & VC4_HD_VID_CTL_ENABLE) {
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
index 1076faeab616..2b9f5ca15a40 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
@@ -184,6 +184,14 @@ struct vc4_hdmi {
struct debugfs_regset32 hdmi_regset;
struct debugfs_regset32 hd_regset;
+ /* VC5 only */
+ struct debugfs_regset32 cec_regset;
+ struct debugfs_regset32 csc_regset;
+ struct debugfs_regset32 dvp_regset;
+ struct debugfs_regset32 phy_regset;
+ struct debugfs_regset32 ram_regset;
+ struct debugfs_regset32 rm_regset;
+
/**
* @hw_lock: Spinlock protecting device register access.
*/
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi_regs.h b/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
index fc971506bd4f..72b769412482 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_hdmi_regs.h
@@ -125,6 +125,7 @@ enum vc4_hdmi_field {
HDMI_VERTB0,
HDMI_VERTB1,
HDMI_VID_CTL,
+ HDMI_MISC_CONTROL,
};
struct vc4_hdmi_register {
@@ -235,6 +236,7 @@ static const struct vc4_hdmi_register __maybe_unused vc5_hdmi_hdmi0_fields[] = {
VC4_HDMI_REG(HDMI_VERTB0, 0x0f0),
VC4_HDMI_REG(HDMI_VERTA1, 0x0f4),
VC4_HDMI_REG(HDMI_VERTB1, 0x0f8),
+ VC4_HDMI_REG(HDMI_MISC_CONTROL, 0x100),
VC4_HDMI_REG(HDMI_MAI_CHANNEL_MAP, 0x09c),
VC4_HDMI_REG(HDMI_MAI_CONFIG, 0x0a0),
VC4_HDMI_REG(HDMI_DEEP_COLOR_CONFIG_1, 0x170),
@@ -315,6 +317,7 @@ static const struct vc4_hdmi_register __maybe_unused vc5_hdmi_hdmi1_fields[] = {
VC4_HDMI_REG(HDMI_VERTB0, 0x0f0),
VC4_HDMI_REG(HDMI_VERTA1, 0x0f4),
VC4_HDMI_REG(HDMI_VERTB1, 0x0f8),
+ VC4_HDMI_REG(HDMI_MISC_CONTROL, 0x100),
VC4_HDMI_REG(HDMI_MAI_CHANNEL_MAP, 0x09c),
VC4_HDMI_REG(HDMI_MAI_CONFIG, 0x0a0),
VC4_HDMI_REG(HDMI_DEEP_COLOR_CONFIG_1, 0x170),
@@ -414,7 +417,7 @@ static inline u32 vc4_hdmi_read(struct vc4_hdmi *hdmi,
const struct vc4_hdmi_variant *variant = hdmi->variant;
void __iomem *base;
- WARN_ON(!pm_runtime_active(&hdmi->pdev->dev));
+ WARN_ON(pm_runtime_status_suspended(&hdmi->pdev->dev));
if (reg >= variant->num_registers) {
dev_warn(&hdmi->pdev->dev,
@@ -444,7 +447,7 @@ static inline void vc4_hdmi_write(struct vc4_hdmi *hdmi,
lockdep_assert_held(&hdmi->hw_lock);
- WARN_ON(!pm_runtime_active(&hdmi->pdev->dev));
+ WARN_ON(pm_runtime_status_suspended(&hdmi->pdev->dev));
if (reg >= variant->num_registers) {
dev_warn(&hdmi->pdev->dev,
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 992d6a240002..dba23ae2e65e 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -890,7 +890,9 @@ vc4_core_clock_atomic_check(struct drm_atomic_state *state)
continue;
num_outputs++;
- cob_rate += hvs_new_state->fifo_state[i].fifo_load;
+ cob_rate = max_t(unsigned long,
+ hvs_new_state->fifo_state[i].fifo_load,
+ cob_rate);
}
pixel_rate = load_state->hvs_load;
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index 920a9eefe426..a82a0b1190eb 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -310,16 +310,16 @@ static int vc4_plane_margins_adj(struct drm_plane_state *pstate)
adjhdisplay,
crtc_state->mode.hdisplay);
vc4_pstate->crtc_x += left;
- if (vc4_pstate->crtc_x > crtc_state->mode.hdisplay - left)
- vc4_pstate->crtc_x = crtc_state->mode.hdisplay - left;
+ if (vc4_pstate->crtc_x > crtc_state->mode.hdisplay - right)
+ vc4_pstate->crtc_x = crtc_state->mode.hdisplay - right;
adjvdisplay = crtc_state->mode.vdisplay - (top + bottom);
vc4_pstate->crtc_y = DIV_ROUND_CLOSEST(vc4_pstate->crtc_y *
adjvdisplay,
crtc_state->mode.vdisplay);
vc4_pstate->crtc_y += top;
- if (vc4_pstate->crtc_y > crtc_state->mode.vdisplay - top)
- vc4_pstate->crtc_y = crtc_state->mode.vdisplay - top;
+ if (vc4_pstate->crtc_y > crtc_state->mode.vdisplay - bottom)
+ vc4_pstate->crtc_y = crtc_state->mode.vdisplay - bottom;
vc4_pstate->crtc_w = DIV_ROUND_CLOSEST(vc4_pstate->crtc_w *
adjhdisplay,
@@ -339,7 +339,6 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state)
struct vc4_plane_state *vc4_state = to_vc4_plane_state(state);
struct drm_framebuffer *fb = state->fb;
struct drm_gem_cma_object *bo = drm_fb_cma_get_gem_obj(fb, 0);
- u32 subpixel_src_mask = (1 << 16) - 1;
int num_planes = fb->format->num_planes;
struct drm_crtc_state *crtc_state;
u32 h_subsample = fb->format->hsub;
@@ -361,18 +360,15 @@ static int vc4_plane_setup_clipping_and_scaling(struct drm_plane_state *state)
for (i = 0; i < num_planes; i++)
vc4_state->offsets[i] = bo->paddr + fb->offsets[i];
- /* We don't support subpixel source positioning for scaling. */
- if ((state->src.x1 & subpixel_src_mask) ||
- (state->src.x2 & subpixel_src_mask) ||
- (state->src.y1 & subpixel_src_mask) ||
- (state->src.y2 & subpixel_src_mask)) {
- return -EINVAL;
- }
-
- vc4_state->src_x = state->src.x1 >> 16;
- vc4_state->src_y = state->src.y1 >> 16;
- vc4_state->src_w[0] = (state->src.x2 - state->src.x1) >> 16;
- vc4_state->src_h[0] = (state->src.y2 - state->src.y1) >> 16;
+ /*
+ * We don't support subpixel source positioning for scaling,
+ * but fractional coordinates can be generated by clipping
+ * so just round for now
+ */
+ vc4_state->src_x = DIV_ROUND_CLOSEST(state->src.x1, 1 << 16);
+ vc4_state->src_y = DIV_ROUND_CLOSEST(state->src.y1, 1 << 16);
+ vc4_state->src_w[0] = DIV_ROUND_CLOSEST(state->src.x2, 1 << 16) - vc4_state->src_x;
+ vc4_state->src_h[0] = DIV_ROUND_CLOSEST(state->src.y2, 1 << 16) - vc4_state->src_y;
vc4_state->crtc_x = state->dst.x1;
vc4_state->crtc_y = state->dst.y1;
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index c708bab555c6..10f080060923 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -579,8 +579,10 @@ static int virtio_gpu_get_caps_ioctl(struct drm_device *dev,
spin_unlock(&vgdev->display_info_lock);
/* not in cache - need to talk to hw */
- virtio_gpu_cmd_get_capset(vgdev, found_valid, args->cap_set_ver,
- &cache_ent);
+ ret = virtio_gpu_cmd_get_capset(vgdev, found_valid, args->cap_set_ver,
+ &cache_ent);
+ if (ret)
+ return ret;
virtio_gpu_notify(vgdev);
copy_exit:
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c
index f293e6ad52da..1cc8f3fc8e4b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -168,9 +168,9 @@ static int virtio_gpu_object_shmem_init(struct virtio_gpu_device *vgdev,
* since virtio_gpu doesn't support dma-buf import from other devices.
*/
shmem->pages = drm_gem_shmem_get_sg_table(&bo->base);
- if (!shmem->pages) {
+ if (IS_ERR(shmem->pages)) {
drm_gem_shmem_unpin(&bo->base);
- return -EINVAL;
+ return PTR_ERR(shmem->pages);
}
if (use_dma_api) {
diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
index c6a1036bf2ea..b47ac170108c 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -157,7 +157,7 @@ static void compose_plane(struct vkms_composer *primary_composer,
void *vaddr;
void (*pixel_blend)(const u8 *p_src, u8 *p_dst);
- if (WARN_ON(iosys_map_is_null(&primary_composer->map[0])))
+ if (WARN_ON(iosys_map_is_null(&plane_composer->map[0])))
return;
vaddr = plane_composer->map[0].vaddr;
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
index 444acd9e2cd6..8be33cf3b6c2 100644
--- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
@@ -155,6 +155,8 @@ int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
dev = &privdata->pdev->dev;
cl_data->num_hid_devices = amd_mp2_get_sensor_num(privdata, &cl_data->sensor_idx[0]);
+ if (cl_data->num_hid_devices == 0)
+ return -ENODEV;
INIT_DELAYED_WORK(&cl_data->work, amd_sfh_work);
INIT_DELAYED_WORK(&cl_data->work_buffer, amd_sfh_work_buffer);
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
index e2a9679e32be..54fe224050f9 100644
--- a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
+++ b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c
@@ -100,11 +100,15 @@ static int amdtp_wait_for_response(struct hid_device *hid)
void amdtp_hid_wakeup(struct hid_device *hid)
{
- struct amdtp_hid_data *hid_data = hid->driver_data;
- struct amdtp_cl_data *cli_data = hid_data->cli_data;
+ struct amdtp_hid_data *hid_data;
+ struct amdtp_cl_data *cli_data;
- cli_data->request_done[cli_data->cur_hid_dev] = true;
- wake_up_interruptible(&hid_data->hid_wait);
+ if (hid) {
+ hid_data = hid->driver_data;
+ cli_data = hid_data->cli_data;
+ cli_data->request_done[cli_data->cur_hid_dev] = true;
+ wake_up_interruptible(&hid_data->hid_wait);
+ }
}
static struct hid_ll_driver amdtp_hid_ll_driver = {
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
index e18a4efd8839..390298a6fc85 100644
--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
@@ -327,7 +327,8 @@ static int amd_mp2_pci_probe(struct pci_dev *pdev, const struct pci_device_id *i
rc = amd_sfh_hid_client_init(privdata);
if (rc) {
amd_sfh_clear_intr(privdata);
- dev_err(&pdev->dev, "amd_sfh_hid_client_init failed\n");
+ if (rc != -EOPNOTSUPP)
+ dev_err(&pdev->dev, "amd_sfh_hid_client_init failed\n");
return rc;
}
diff --git a/drivers/hid/hid-alps.c b/drivers/hid/hid-alps.c
index 2b986d0dbde4..db146d0f7937 100644
--- a/drivers/hid/hid-alps.c
+++ b/drivers/hid/hid-alps.c
@@ -830,6 +830,8 @@ static const struct hid_device_id alps_id[] = {
USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_U1_DUAL) },
{ HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY,
USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_U1) },
+ { HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY,
+ USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_U1_UNICORN_LEGACY) },
{ HID_DEVICE(HID_BUS_ANY, HID_GROUP_ANY,
USB_VENDOR_ID_ALPS_JP, HID_DEVICE_ID_ALPS_T4_BTNLESS) },
{ }
diff --git a/drivers/hid/hid-cp2112.c b/drivers/hid/hid-cp2112.c
index ece147d1a278..1e16b0fa310d 100644
--- a/drivers/hid/hid-cp2112.c
+++ b/drivers/hid/hid-cp2112.c
@@ -790,6 +790,11 @@ static int cp2112_xfer(struct i2c_adapter *adap, u16 addr,
data->word = le16_to_cpup((__le16 *)buf);
break;
case I2C_SMBUS_I2C_BLOCK_DATA:
+ if (read_length > I2C_SMBUS_BLOCK_MAX) {
+ ret = -EINVAL;
+ goto power_normal;
+ }
+
memcpy(data->block + 1, buf, read_length);
break;
case I2C_SMBUS_BLOCK_DATA:
diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index c297c63f3ec5..e233726d9a74 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -413,6 +413,7 @@
#define USB_DEVICE_ID_ASUS_UX550VE_TOUCHSCREEN 0x2544
#define USB_DEVICE_ID_ASUS_UX550_TOUCHSCREEN 0x2706
#define I2C_DEVICE_ID_SURFACE_GO_TOUCHSCREEN 0x261A
+#define I2C_DEVICE_ID_SURFACE_GO2_TOUCHSCREEN 0x2A1C
#define USB_VENDOR_ID_ELECOM 0x056e
#define USB_DEVICE_ID_ELECOM_BM084 0x0061
diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
index c6b27aab9041..48c1c02c69f4 100644
--- a/drivers/hid/hid-input.c
+++ b/drivers/hid/hid-input.c
@@ -381,6 +381,8 @@ static const struct hid_device_id hid_battery_quirks[] = {
HID_BATTERY_QUIRK_IGNORE },
{ HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, I2C_DEVICE_ID_SURFACE_GO_TOUCHSCREEN),
HID_BATTERY_QUIRK_IGNORE },
+ { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, I2C_DEVICE_ID_SURFACE_GO2_TOUCHSCREEN),
+ HID_BATTERY_QUIRK_IGNORE },
{}
};
diff --git a/drivers/hid/hid-mcp2221.c b/drivers/hid/hid-mcp2221.c
index 4211b9839209..de52e9f7bb8c 100644
--- a/drivers/hid/hid-mcp2221.c
+++ b/drivers/hid/hid-mcp2221.c
@@ -385,6 +385,9 @@ static int mcp_smbus_write(struct mcp2221 *mcp, u16 addr,
data_len = 7;
break;
default:
+ if (len > I2C_SMBUS_BLOCK_MAX)
+ return -EINVAL;
+
memcpy(&mcp->txbuf[5], buf, len);
data_len = len + 5;
}
diff --git a/drivers/hid/hid-nintendo.c b/drivers/hid/hid-nintendo.c
index 2204de889739..4b1173957c17 100644
--- a/drivers/hid/hid-nintendo.c
+++ b/drivers/hid/hid-nintendo.c
@@ -1586,6 +1586,7 @@ static const unsigned int joycon_button_inputs_r[] = {
/* We report joy-con d-pad inputs as buttons and pro controller as a hat. */
static const unsigned int joycon_dpad_inputs_jc[] = {
BTN_DPAD_UP, BTN_DPAD_DOWN, BTN_DPAD_LEFT, BTN_DPAD_RIGHT,
+ 0 /* 0 signals end of array */
};
static int joycon_input_create(struct joycon_ctlr *ctlr)
diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c
index 066c567dbaa2..9a81e63c330e 100644
--- a/drivers/hid/wacom_sys.c
+++ b/drivers/hid/wacom_sys.c
@@ -2121,7 +2121,7 @@ static int wacom_register_inputs(struct wacom *wacom)
error = wacom_setup_pad_input_capabilities(pad_input_dev, wacom_wac);
if (error) {
- /* no pad in use on this interface */
+ /* no pad events using this interface */
input_free_device(pad_input_dev);
wacom_wac->pad_input = NULL;
pad_input_dev = NULL;
diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
index a7176fc0635d..c454231afec8 100644
--- a/drivers/hid/wacom_wac.c
+++ b/drivers/hid/wacom_wac.c
@@ -638,9 +638,26 @@ static int wacom_intuos_id_mangle(int tool_id)
return (tool_id & ~0xFFF) << 4 | (tool_id & 0xFFF);
}
+static bool wacom_is_art_pen(int tool_id)
+{
+ bool is_art_pen = false;
+
+ switch (tool_id) {
+ case 0x885: /* Intuos3 Marker Pen */
+ case 0x804: /* Intuos4/5 13HD/24HD Marker Pen */
+ case 0x10804: /* Intuos4/5 13HD/24HD Art Pen */
+ is_art_pen = true;
+ break;
+ }
+ return is_art_pen;
+}
+
static int wacom_intuos_get_tool_type(int tool_id)
{
- int tool_type;
+ int tool_type = BTN_TOOL_PEN;
+
+ if (wacom_is_art_pen(tool_id))
+ return tool_type;
switch (tool_id) {
case 0x812: /* Inking pen */
@@ -655,12 +672,9 @@ static int wacom_intuos_get_tool_type(int tool_id)
case 0x852:
case 0x823: /* Intuos3 Grip Pen */
case 0x813: /* Intuos3 Classic Pen */
- case 0x885: /* Intuos3 Marker Pen */
case 0x802: /* Intuos4/5 13HD/24HD General Pen */
- case 0x804: /* Intuos4/5 13HD/24HD Marker Pen */
case 0x8e2: /* IntuosHT2 pen */
case 0x022:
- case 0x10804: /* Intuos4/5 13HD/24HD Art Pen */
case 0x10842: /* MobileStudio Pro Pro Pen slim */
case 0x14802: /* Intuos4/5 13HD/24HD Classic Pen */
case 0x16802: /* Cintiq 13HD Pro Pen */
@@ -718,10 +732,6 @@ static int wacom_intuos_get_tool_type(int tool_id)
case 0x10902: /* Intuos4/5 13HD/24HD Airbrush */
tool_type = BTN_TOOL_AIRBRUSH;
break;
-
- default: /* Unknown tool */
- tool_type = BTN_TOOL_PEN;
- break;
}
return tool_type;
}
@@ -2007,7 +2017,6 @@ static void wacom_wac_pad_usage_mapping(struct hid_device *hdev,
wacom_wac->has_mute_touch_switch = true;
usage->type = EV_SW;
usage->code = SW_MUTE_DEVICE;
- features->device_type |= WACOM_DEVICETYPE_PAD;
break;
case WACOM_HID_WD_TOUCHSTRIP:
wacom_map_usage(input, usage, field, EV_ABS, ABS_RX, 0);
@@ -2087,6 +2096,30 @@ static void wacom_wac_pad_event(struct hid_device *hdev, struct hid_field *field
wacom_wac->hid_data.inrange_state |= value;
}
+ /* Process touch switch state first since it is reported through touch interface,
+ * which is indepentent of pad interface. In the case when there are no other pad
+ * events, the pad interface will not even be created.
+ */
+ if ((equivalent_usage == WACOM_HID_WD_MUTE_DEVICE) ||
+ (equivalent_usage == WACOM_HID_WD_TOUCHONOFF)) {
+ if (wacom_wac->shared->touch_input) {
+ bool *is_touch_on = &wacom_wac->shared->is_touch_on;
+
+ if (equivalent_usage == WACOM_HID_WD_MUTE_DEVICE && value)
+ *is_touch_on = !(*is_touch_on);
+ else if (equivalent_usage == WACOM_HID_WD_TOUCHONOFF)
+ *is_touch_on = value;
+
+ input_report_switch(wacom_wac->shared->touch_input,
+ SW_MUTE_DEVICE, !(*is_touch_on));
+ input_sync(wacom_wac->shared->touch_input);
+ }
+ return;
+ }
+
+ if (!input)
+ return;
+
switch (equivalent_usage) {
case WACOM_HID_WD_TOUCHRING:
/*
@@ -2122,22 +2155,6 @@ static void wacom_wac_pad_event(struct hid_device *hdev, struct hid_field *field
input_event(input, usage->type, usage->code, 0);
break;
- case WACOM_HID_WD_MUTE_DEVICE:
- case WACOM_HID_WD_TOUCHONOFF:
- if (wacom_wac->shared->touch_input) {
- bool *is_touch_on = &wacom_wac->shared->is_touch_on;
-
- if (equivalent_usage == WACOM_HID_WD_MUTE_DEVICE && value)
- *is_touch_on = !(*is_touch_on);
- else if (equivalent_usage == WACOM_HID_WD_TOUCHONOFF)
- *is_touch_on = value;
-
- input_report_switch(wacom_wac->shared->touch_input,
- SW_MUTE_DEVICE, !(*is_touch_on));
- input_sync(wacom_wac->shared->touch_input);
- }
- break;
-
case WACOM_HID_WD_MODE_CHANGE:
if (wacom_wac->is_direct_mode != value) {
wacom_wac->is_direct_mode = value;
@@ -2323,6 +2340,9 @@ static void wacom_wac_pen_event(struct hid_device *hdev, struct hid_field *field
}
return;
case HID_DG_TWIST:
+ /* don't modify the value if the pen doesn't support the feature */
+ if (!wacom_is_art_pen(wacom_wac->id[0])) return;
+
/*
* Userspace expects pen twist to have its zero point when
* the buttons/finger is on the tablet's left. HID values
@@ -2795,7 +2815,7 @@ void wacom_wac_event(struct hid_device *hdev, struct hid_field *field,
/* usage tests must precede field tests */
if (WACOM_BATTERY_USAGE(usage))
wacom_wac_battery_event(hdev, field, usage, value);
- else if (WACOM_PAD_FIELD(field) && wacom->wacom_wac.pad_input)
+ else if (WACOM_PAD_FIELD(field))
wacom_wac_pad_event(hdev, field, usage, value);
else if (WACOM_PEN_FIELD(field) && wacom->wacom_wac.pen_input)
wacom_wac_pen_event(hdev, field, usage, value);
diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index 84cb1ede7bc0..a0df9a0abeb9 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -1262,6 +1262,14 @@ static const struct dmi_system_id i8k_whitelist_fan_control[] __initconst = {
},
.driver_data = (void *)&i8k_fan_control_data[I8K_FAN_34A3_35A3],
},
+ {
+ .ident = "Dell XPS 13 7390",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "XPS 13 7390"),
+ },
+ .driver_data = (void *)&i8k_fan_control_data[I8K_FAN_34A3_35A3],
+ },
{ }
};
diff --git a/drivers/hwmon/drivetemp.c b/drivers/hwmon/drivetemp.c
index 1eb37106a220..5bac2b0fc7bb 100644
--- a/drivers/hwmon/drivetemp.c
+++ b/drivers/hwmon/drivetemp.c
@@ -621,3 +621,4 @@ module_exit(drivetemp_exit);
MODULE_AUTHOR("Guenter Roeck <linus@roeck-us.net>");
MODULE_DESCRIPTION("Hard drive temperature monitor");
MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:drivetemp");
diff --git a/drivers/hwmon/sch56xx-common.c b/drivers/hwmon/sch56xx-common.c
index 3ece53adabd6..de3a0886c2f7 100644
--- a/drivers/hwmon/sch56xx-common.c
+++ b/drivers/hwmon/sch56xx-common.c
@@ -523,6 +523,28 @@ static int __init sch56xx_device_add(int address, const char *name)
return PTR_ERR_OR_ZERO(sch56xx_pdev);
}
+static const struct dmi_system_id sch56xx_dmi_override_table[] __initconst = {
+ {
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "CELSIUS W380"),
+ },
+ },
+ {
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "ESPRIMO P710"),
+ },
+ },
+ {
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "ESPRIMO E9900"),
+ },
+ },
+ { }
+};
+
/* For autoloading only */
static const struct dmi_system_id sch56xx_dmi_table[] __initconst = {
{
@@ -543,16 +565,18 @@ static int __init sch56xx_init(void)
if (!dmi_check_system(sch56xx_dmi_table))
return -ENODEV;
- /*
- * Some machines like the Esprimo P720 and Esprimo C700 have
- * onboard devices named " Antiope"/" Theseus" instead of
- * "Antiope"/"Theseus", so we need to check for both.
- */
- if (!dmi_find_device(DMI_DEV_TYPE_OTHER, "Antiope", NULL) &&
- !dmi_find_device(DMI_DEV_TYPE_OTHER, " Antiope", NULL) &&
- !dmi_find_device(DMI_DEV_TYPE_OTHER, "Theseus", NULL) &&
- !dmi_find_device(DMI_DEV_TYPE_OTHER, " Theseus", NULL))
- return -ENODEV;
+ if (!dmi_check_system(sch56xx_dmi_override_table)) {
+ /*
+ * Some machines like the Esprimo P720 and Esprimo C700 have
+ * onboard devices named " Antiope"/" Theseus" instead of
+ * "Antiope"/"Theseus", so we need to check for both.
+ */
+ if (!dmi_find_device(DMI_DEV_TYPE_OTHER, "Antiope", NULL) &&
+ !dmi_find_device(DMI_DEV_TYPE_OTHER, " Antiope", NULL) &&
+ !dmi_find_device(DMI_DEV_TYPE_OTHER, "Theseus", NULL) &&
+ !dmi_find_device(DMI_DEV_TYPE_OTHER, " Theseus", NULL))
+ return -ENODEV;
+ }
}
/*
diff --git a/drivers/hwmon/sht15.c b/drivers/hwmon/sht15.c
index 7f4a63959730..ae4d14257a11 100644
--- a/drivers/hwmon/sht15.c
+++ b/drivers/hwmon/sht15.c
@@ -1020,25 +1020,20 @@ static int sht15_probe(struct platform_device *pdev)
static int sht15_remove(struct platform_device *pdev)
{
struct sht15_data *data = platform_get_drvdata(pdev);
+ int ret;
- /*
- * Make sure any reads from the device are done and
- * prevent new ones beginning
- */
- mutex_lock(&data->read_lock);
- if (sht15_soft_reset(data)) {
- mutex_unlock(&data->read_lock);
- return -EFAULT;
- }
hwmon_device_unregister(data->hwmon_dev);
sysfs_remove_group(&pdev->dev.kobj, &sht15_attr_group);
+
+ ret = sht15_soft_reset(data);
+ if (ret)
+ dev_err(&pdev->dev, "Failed to reset device (%pe)\n", ERR_PTR(ret));
+
if (!IS_ERR(data->reg)) {
regulator_unregister_notifier(data->reg, &data->nb);
regulator_disable(data->reg);
}
- mutex_unlock(&data->read_lock);
-
return 0;
}
diff --git a/drivers/hwtracing/coresight/coresight-config.h b/drivers/hwtracing/coresight/coresight-config.h
index 2e1670523461..6ba013975741 100644
--- a/drivers/hwtracing/coresight/coresight-config.h
+++ b/drivers/hwtracing/coresight/coresight-config.h
@@ -134,6 +134,7 @@ struct cscfg_feature_desc {
* @active_cnt: ref count for activate on this configuration.
* @load_owner: handle to load owner for dynamic load and unload of configs.
* @fs_group: reference to configfs group for dynamic unload.
+ * @available: config can be activated - multi-stage load sets true on completion.
*/
struct cscfg_config_desc {
const char *name;
@@ -148,6 +149,7 @@ struct cscfg_config_desc {
atomic_t active_cnt;
void *load_owner;
struct config_group *fs_group;
+ bool available;
};
/**
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index ee6ce92ab4c3..1edfec1e9d18 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -1424,6 +1424,7 @@ static int coresight_remove_match(struct device *dev, void *data)
* platform data.
*/
fwnode_handle_put(conn->child_fwnode);
+ conn->child_fwnode = NULL;
/* No need to continue */
break;
}
diff --git a/drivers/hwtracing/coresight/coresight-syscfg.c b/drivers/hwtracing/coresight/coresight-syscfg.c
index 11850fd8c3b5..11138a9762b0 100644
--- a/drivers/hwtracing/coresight/coresight-syscfg.c
+++ b/drivers/hwtracing/coresight/coresight-syscfg.c
@@ -414,6 +414,27 @@ static void cscfg_remove_owned_csdev_features(struct coresight_device *csdev, vo
}
}
+/*
+ * Unregister all configuration and features from configfs owned by load_owner.
+ * Although this is called without the list mutex being held, it is in the
+ * context of an unload operation which are strictly serialised,
+ * so the lists cannot change during this call.
+ */
+static void cscfg_fs_unregister_cfgs_feats(void *load_owner)
+{
+ struct cscfg_config_desc *config_desc;
+ struct cscfg_feature_desc *feat_desc;
+
+ list_for_each_entry(config_desc, &cscfg_mgr->config_desc_list, item) {
+ if (config_desc->load_owner == load_owner)
+ cscfg_configfs_del_config(config_desc);
+ }
+ list_for_each_entry(feat_desc, &cscfg_mgr->feat_desc_list, item) {
+ if (feat_desc->load_owner == load_owner)
+ cscfg_configfs_del_feature(feat_desc);
+ }
+}
+
/*
* removal is relatively easy - just remove from all lists, anything that
* matches the owner. Memory for the descriptors will be managed by the owner,
@@ -426,6 +447,8 @@ static void cscfg_unload_owned_cfgs_feats(void *load_owner)
struct cscfg_feature_desc *feat_desc, *feat_tmp;
struct cscfg_registered_csdev *csdev_item;
+ lockdep_assert_held(&cscfg_mutex);
+
/* remove from each csdev instance feature and config lists */
list_for_each_entry(csdev_item, &cscfg_mgr->csdev_desc_list, item) {
/*
@@ -439,7 +462,6 @@ static void cscfg_unload_owned_cfgs_feats(void *load_owner)
/* remove from the config descriptor lists */
list_for_each_entry_safe(config_desc, cfg_tmp, &cscfg_mgr->config_desc_list, item) {
if (config_desc->load_owner == load_owner) {
- cscfg_configfs_del_config(config_desc);
etm_perf_del_symlink_cscfg(config_desc);
list_del(&config_desc->item);
}
@@ -448,12 +470,90 @@ static void cscfg_unload_owned_cfgs_feats(void *load_owner)
/* remove from the feature descriptor lists */
list_for_each_entry_safe(feat_desc, feat_tmp, &cscfg_mgr->feat_desc_list, item) {
if (feat_desc->load_owner == load_owner) {
- cscfg_configfs_del_feature(feat_desc);
list_del(&feat_desc->item);
}
}
}
+/*
+ * load the features and configs to the lists - called with list mutex held
+ */
+static int cscfg_load_owned_cfgs_feats(struct cscfg_config_desc **config_descs,
+ struct cscfg_feature_desc **feat_descs,
+ struct cscfg_load_owner_info *owner_info)
+{
+ int i, err;
+
+ lockdep_assert_held(&cscfg_mutex);
+
+ /* load features first */
+ if (feat_descs) {
+ for (i = 0; feat_descs[i]; i++) {
+ err = cscfg_load_feat(feat_descs[i]);
+ if (err) {
+ pr_err("coresight-syscfg: Failed to load feature %s\n",
+ feat_descs[i]->name);
+ return err;
+ }
+ feat_descs[i]->load_owner = owner_info;
+ }
+ }
+
+ /* next any configurations to check feature dependencies */
+ if (config_descs) {
+ for (i = 0; config_descs[i]; i++) {
+ err = cscfg_load_config(config_descs[i]);
+ if (err) {
+ pr_err("coresight-syscfg: Failed to load configuration %s\n",
+ config_descs[i]->name);
+ return err;
+ }
+ config_descs[i]->load_owner = owner_info;
+ config_descs[i]->available = false;
+ }
+ }
+ return 0;
+}
+
+/* set configurations as available to activate at the end of the load process */
+static void cscfg_set_configs_available(struct cscfg_config_desc **config_descs)
+{
+ int i;
+
+ lockdep_assert_held(&cscfg_mutex);
+
+ if (config_descs) {
+ for (i = 0; config_descs[i]; i++)
+ config_descs[i]->available = true;
+ }
+}
+
+/*
+ * Create and register each of the configurations and features with configfs.
+ * Called without mutex being held.
+ */
+static int cscfg_fs_register_cfgs_feats(struct cscfg_config_desc **config_descs,
+ struct cscfg_feature_desc **feat_descs)
+{
+ int i, err;
+
+ if (feat_descs) {
+ for (i = 0; feat_descs[i]; i++) {
+ err = cscfg_configfs_add_feature(feat_descs[i]);
+ if (err)
+ return err;
+ }
+ }
+ if (config_descs) {
+ for (i = 0; config_descs[i]; i++) {
+ err = cscfg_configfs_add_config(config_descs[i]);
+ if (err)
+ return err;
+ }
+ }
+ return 0;
+}
+
/**
* cscfg_load_config_sets - API function to load feature and config sets.
*
@@ -476,57 +576,63 @@ int cscfg_load_config_sets(struct cscfg_config_desc **config_descs,
struct cscfg_feature_desc **feat_descs,
struct cscfg_load_owner_info *owner_info)
{
- int err = 0, i = 0;
+ int err = 0;
mutex_lock(&cscfg_mutex);
-
- /* load features first */
- if (feat_descs) {
- while (feat_descs[i]) {
- err = cscfg_load_feat(feat_descs[i]);
- if (!err)
- err = cscfg_configfs_add_feature(feat_descs[i]);
- if (err) {
- pr_err("coresight-syscfg: Failed to load feature %s\n",
- feat_descs[i]->name);
- cscfg_unload_owned_cfgs_feats(owner_info);
- goto exit_unlock;
- }
- feat_descs[i]->load_owner = owner_info;
- i++;
- }
+ if (cscfg_mgr->load_state != CSCFG_NONE) {
+ mutex_unlock(&cscfg_mutex);
+ return -EBUSY;
}
+ cscfg_mgr->load_state = CSCFG_LOAD;
- /* next any configurations to check feature dependencies */
- i = 0;
- if (config_descs) {
- while (config_descs[i]) {
- err = cscfg_load_config(config_descs[i]);
- if (!err)
- err = cscfg_configfs_add_config(config_descs[i]);
- if (err) {
- pr_err("coresight-syscfg: Failed to load configuration %s\n",
- config_descs[i]->name);
- cscfg_unload_owned_cfgs_feats(owner_info);
- goto exit_unlock;
- }
- config_descs[i]->load_owner = owner_info;
- i++;
- }
- }
+ /* first load and add to the lists */
+ err = cscfg_load_owned_cfgs_feats(config_descs, feat_descs, owner_info);
+ if (err)
+ goto err_clean_load;
/* add the load owner to the load order list */
list_add_tail(&owner_info->item, &cscfg_mgr->load_order_list);
if (!list_is_singular(&cscfg_mgr->load_order_list)) {
/* lock previous item in load order list */
err = cscfg_owner_get(list_prev_entry(owner_info, item));
- if (err) {
- cscfg_unload_owned_cfgs_feats(owner_info);
- list_del(&owner_info->item);
- }
+ if (err)
+ goto err_clean_owner_list;
}
+ /*
+ * make visible to configfs - configfs manipulation must occur outside
+ * the list mutex lock to avoid circular lockdep issues with configfs
+ * built in mutexes and semaphores. This is safe as it is not possible
+ * to start a new load/unload operation till the current one is done.
+ */
+ mutex_unlock(&cscfg_mutex);
+
+ /* create the configfs elements */
+ err = cscfg_fs_register_cfgs_feats(config_descs, feat_descs);
+ mutex_lock(&cscfg_mutex);
+
+ if (err)
+ goto err_clean_cfs;
+
+ /* mark any new configs as available for activation */
+ cscfg_set_configs_available(config_descs);
+ goto exit_unlock;
+
+err_clean_cfs:
+ /* cleanup after error registering with configfs */
+ cscfg_fs_unregister_cfgs_feats(owner_info);
+
+ if (!list_is_singular(&cscfg_mgr->load_order_list))
+ cscfg_owner_put(list_prev_entry(owner_info, item));
+
+err_clean_owner_list:
+ list_del(&owner_info->item);
+
+err_clean_load:
+ cscfg_unload_owned_cfgs_feats(owner_info);
+
exit_unlock:
+ cscfg_mgr->load_state = CSCFG_NONE;
mutex_unlock(&cscfg_mutex);
return err;
}
@@ -543,6 +649,9 @@ EXPORT_SYMBOL_GPL(cscfg_load_config_sets);
* 1) no configurations are active.
* 2) the set being unloaded was the last to be loaded to maintain dependencies.
*
+ * Once the unload operation commences, we disallow any configuration being
+ * made active until it is complete.
+ *
* @owner_info: Information on owner for set being unloaded.
*/
int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
@@ -551,6 +660,13 @@ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
struct cscfg_load_owner_info *load_list_item = NULL;
mutex_lock(&cscfg_mutex);
+ if (cscfg_mgr->load_state != CSCFG_NONE) {
+ mutex_unlock(&cscfg_mutex);
+ return -EBUSY;
+ }
+
+ /* unload op in progress also prevents activation of any config */
+ cscfg_mgr->load_state = CSCFG_UNLOAD;
/* cannot unload if anything is active */
if (atomic_read(&cscfg_mgr->sys_active_cnt)) {
@@ -571,7 +687,12 @@ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
goto exit_unlock;
}
- /* unload all belonging to load_owner */
+ /* remove from configfs - again outside the scope of the list mutex */
+ mutex_unlock(&cscfg_mutex);
+ cscfg_fs_unregister_cfgs_feats(owner_info);
+ mutex_lock(&cscfg_mutex);
+
+ /* unload everything from lists belonging to load_owner */
cscfg_unload_owned_cfgs_feats(owner_info);
/* remove from load order list */
@@ -582,6 +703,7 @@ int cscfg_unload_config_sets(struct cscfg_load_owner_info *owner_info)
list_del(&owner_info->item);
exit_unlock:
+ cscfg_mgr->load_state = CSCFG_NONE;
mutex_unlock(&cscfg_mutex);
return err;
}
@@ -759,8 +881,15 @@ static int _cscfg_activate_config(unsigned long cfg_hash)
struct cscfg_config_desc *config_desc;
int err = -EINVAL;
+ if (cscfg_mgr->load_state == CSCFG_UNLOAD)
+ return -EBUSY;
+
list_for_each_entry(config_desc, &cscfg_mgr->config_desc_list, item) {
if ((unsigned long)config_desc->event_ea->var == cfg_hash) {
+ /* if we happen upon a partly loaded config, can't use it */
+ if (config_desc->available == false)
+ return -EBUSY;
+
/* must ensure that config cannot be unloaded in use */
err = cscfg_owner_get(config_desc->load_owner);
if (err)
@@ -1022,8 +1151,10 @@ struct device *cscfg_device(void)
/* Must have a release function or the kernel will complain on module unload */
static void cscfg_dev_release(struct device *dev)
{
+ mutex_lock(&cscfg_mutex);
kfree(cscfg_mgr);
cscfg_mgr = NULL;
+ mutex_unlock(&cscfg_mutex);
}
/* a device is needed to "own" some kernel elements such as sysfs entries. */
@@ -1042,6 +1173,14 @@ static int cscfg_create_device(void)
if (!cscfg_mgr)
goto create_dev_exit_unlock;
+ /* initialise the cscfg_mgr structure */
+ INIT_LIST_HEAD(&cscfg_mgr->csdev_desc_list);
+ INIT_LIST_HEAD(&cscfg_mgr->feat_desc_list);
+ INIT_LIST_HEAD(&cscfg_mgr->config_desc_list);
+ INIT_LIST_HEAD(&cscfg_mgr->load_order_list);
+ atomic_set(&cscfg_mgr->sys_active_cnt, 0);
+ cscfg_mgr->load_state = CSCFG_NONE;
+
/* setup the device */
dev = cscfg_device();
dev->release = cscfg_dev_release;
@@ -1056,17 +1195,73 @@ static int cscfg_create_device(void)
return err;
}
-static void cscfg_clear_device(void)
+/*
+ * Loading and unloading is generally on user discretion.
+ * If exiting due to coresight module unload, we need to unload any configurations that remain,
+ * before we unregister the configfs intrastructure.
+ *
+ * Do this by walking the load_owner list and taking appropriate action, depending on the load
+ * owner type.
+ */
+static void cscfg_unload_cfgs_on_exit(void)
{
- struct cscfg_config_desc *cfg_desc;
+ struct cscfg_load_owner_info *owner_info = NULL;
+ /*
+ * grab the mutex - even though we are exiting, some configfs files
+ * may still be live till we dump them, so ensure list data is
+ * protected from a race condition.
+ */
mutex_lock(&cscfg_mutex);
- list_for_each_entry(cfg_desc, &cscfg_mgr->config_desc_list, item) {
- etm_perf_del_symlink_cscfg(cfg_desc);
+ while (!list_empty(&cscfg_mgr->load_order_list)) {
+
+ /* remove in reverse order of loading */
+ owner_info = list_last_entry(&cscfg_mgr->load_order_list,
+ struct cscfg_load_owner_info, item);
+
+ /* action according to type */
+ switch (owner_info->type) {
+ case CSCFG_OWNER_PRELOAD:
+ /*
+ * preloaded descriptors are statically allocated in
+ * this module - just need to unload dynamic items from
+ * csdev lists, and remove from configfs directories.
+ */
+ pr_info("cscfg: unloading preloaded configurations\n");
+ break;
+
+ case CSCFG_OWNER_MODULE:
+ /*
+ * this is an error - the loadable module must have been unloaded prior
+ * to the coresight module unload. Therefore that module has not
+ * correctly unloaded configs in its own exit code.
+ * Nothing to do other than emit an error string as the static descriptor
+ * references we need to unload will have disappeared with the module.
+ */
+ pr_err("cscfg: ERROR: prior module failed to unload configuration\n");
+ goto list_remove;
+ }
+
+ /* remove from configfs - outside the scope of the list mutex */
+ mutex_unlock(&cscfg_mutex);
+ cscfg_fs_unregister_cfgs_feats(owner_info);
+ mutex_lock(&cscfg_mutex);
+
+ /* Next unload from csdev lists. */
+ cscfg_unload_owned_cfgs_feats(owner_info);
+
+list_remove:
+ /* remove from load order list */
+ list_del(&owner_info->item);
}
+ mutex_unlock(&cscfg_mutex);
+}
+
+static void cscfg_clear_device(void)
+{
+ cscfg_unload_cfgs_on_exit();
cscfg_configfs_release(cscfg_mgr);
device_unregister(cscfg_device());
- mutex_unlock(&cscfg_mutex);
}
/* Initialise system config management API device */
@@ -1074,20 +1269,16 @@ int __init cscfg_init(void)
{
int err = 0;
+ /* create the device and init cscfg_mgr */
err = cscfg_create_device();
if (err)
return err;
+ /* initialise configfs subsystem */
err = cscfg_configfs_init(cscfg_mgr);
if (err)
goto exit_err;
- INIT_LIST_HEAD(&cscfg_mgr->csdev_desc_list);
- INIT_LIST_HEAD(&cscfg_mgr->feat_desc_list);
- INIT_LIST_HEAD(&cscfg_mgr->config_desc_list);
- INIT_LIST_HEAD(&cscfg_mgr->load_order_list);
- atomic_set(&cscfg_mgr->sys_active_cnt, 0);
-
/* preload built-in configurations */
err = cscfg_preload(THIS_MODULE);
if (err)
diff --git a/drivers/hwtracing/coresight/coresight-syscfg.h b/drivers/hwtracing/coresight/coresight-syscfg.h
index 9106ffab4833..66e2db890d82 100644
--- a/drivers/hwtracing/coresight/coresight-syscfg.h
+++ b/drivers/hwtracing/coresight/coresight-syscfg.h
@@ -12,6 +12,17 @@
#include "coresight-config.h"
+/*
+ * Load operation types.
+ * When loading or unloading, another load operation cannot be run.
+ * When unloading configurations cannot be activated.
+ */
+enum cscfg_load_ops {
+ CSCFG_NONE,
+ CSCFG_LOAD,
+ CSCFG_UNLOAD
+};
+
/**
* System configuration manager device.
*
@@ -30,6 +41,7 @@
* @cfgfs_subsys: configfs subsystem used to manage configurations.
* @sysfs_active_config:Active config hash used if CoreSight controlled from sysfs.
* @sysfs_active_preset:Active preset index used if CoreSight controlled from sysfs.
+ * @load_state: A multi-stage load/unload operation is in progress.
*/
struct cscfg_manager {
struct device dev;
@@ -41,6 +53,7 @@ struct cscfg_manager {
struct configfs_subsystem cfgfs_subsys;
u32 sysfs_active_config;
int sysfs_active_preset;
+ enum cscfg_load_ops load_state;
};
/* get reference to dev in cscfg_manager */
diff --git a/drivers/hwtracing/intel_th/msu-sink.c b/drivers/hwtracing/intel_th/msu-sink.c
index 2c7f5116be12..891b28ea25fe 100644
--- a/drivers/hwtracing/intel_th/msu-sink.c
+++ b/drivers/hwtracing/intel_th/msu-sink.c
@@ -71,6 +71,9 @@ static int msu_sink_alloc_window(void *data, struct sg_table **sgt, size_t size)
block = dma_alloc_coherent(priv->dev->parent->parent,
PAGE_SIZE, &sg_dma_address(sg_ptr),
GFP_KERNEL);
+ if (!block)
+ return -ENOMEM;
+
sg_set_buf(sg_ptr, block, PAGE_SIZE);
}
diff --git a/drivers/hwtracing/intel_th/msu.c b/drivers/hwtracing/intel_th/msu.c
index 70a07b4e9967..6c8215a47a60 100644
--- a/drivers/hwtracing/intel_th/msu.c
+++ b/drivers/hwtracing/intel_th/msu.c
@@ -1067,6 +1067,16 @@ msc_buffer_set_uc(struct msc *msc) {}
static inline void msc_buffer_set_wb(struct msc *msc) {}
#endif /* CONFIG_X86 */
+static struct page *msc_sg_page(struct scatterlist *sg)
+{
+ void *addr = sg_virt(sg);
+
+ if (is_vmalloc_addr(addr))
+ return vmalloc_to_page(addr);
+
+ return sg_page(sg);
+}
+
/**
* msc_buffer_win_alloc() - alloc a window for a multiblock mode
* @msc: MSC device
@@ -1137,7 +1147,7 @@ static void __msc_buffer_win_free(struct msc *msc, struct msc_window *win)
int i;
for_each_sg(win->sgt->sgl, sg, win->nr_segs, i) {
- struct page *page = sg_page(sg);
+ struct page *page = msc_sg_page(sg);
page->mapping = NULL;
dma_free_coherent(msc_dev(win->msc)->parent->parent, PAGE_SIZE,
@@ -1401,7 +1411,7 @@ static struct page *msc_buffer_get_page(struct msc *msc, unsigned long pgoff)
pgoff -= win->pgoff;
for_each_sg(win->sgt->sgl, sg, win->nr_segs, blk) {
- struct page *page = sg_page(sg);
+ struct page *page = msc_sg_page(sg);
size_t pgsz = PFN_DOWN(sg->length);
if (pgoff < pgsz)
diff --git a/drivers/hwtracing/intel_th/pci.c b/drivers/hwtracing/intel_th/pci.c
index 7da4f298ed01..147d338c191e 100644
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -100,8 +100,10 @@ static int intel_th_pci_probe(struct pci_dev *pdev,
}
th = intel_th_alloc(&pdev->dev, drvdata, resource, r);
- if (IS_ERR(th))
- return PTR_ERR(th);
+ if (IS_ERR(th)) {
+ err = PTR_ERR(th);
+ goto err_free_irq;
+ }
th->activate = intel_th_pci_activate;
th->deactivate = intel_th_pci_deactivate;
@@ -109,6 +111,10 @@ static int intel_th_pci_probe(struct pci_dev *pdev,
pci_set_master(pdev);
return 0;
+
+err_free_irq:
+ pci_free_irq_vectors(pdev);
+ return err;
}
static void intel_th_pci_remove(struct pci_dev *pdev)
@@ -278,6 +284,21 @@ static const struct pci_device_id intel_th_pci_id_table[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x54a6),
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
+ {
+ /* Meteor Lake-P */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7e24),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
+ /* Raptor Lake-S */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7a26),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
+ /* Raptor Lake-S CPU */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa76f),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
{
/* Alder Lake CPU */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f),
diff --git a/drivers/i2c/busses/i2c-cadence.c b/drivers/i2c/busses/i2c-cadence.c
index 630cfa4ddd46..33f5588a50c0 100644
--- a/drivers/i2c/busses/i2c-cadence.c
+++ b/drivers/i2c/busses/i2c-cadence.c
@@ -573,8 +573,13 @@ static void cdns_i2c_mrecv(struct cdns_i2c *id)
ctrl_reg = cdns_i2c_readreg(CDNS_I2C_CR_OFFSET);
ctrl_reg |= CDNS_I2C_CR_RW | CDNS_I2C_CR_CLR_FIFO;
+ /*
+ * Receive up to I2C_SMBUS_BLOCK_MAX data bytes, plus one message length
+ * byte, plus one checksum byte if PEC is enabled. p_msg->len will be 2 if
+ * PEC is enabled, otherwise 1.
+ */
if (id->p_msg->flags & I2C_M_RECV_LEN)
- id->recv_count = I2C_SMBUS_BLOCK_MAX + 1;
+ id->recv_count = I2C_SMBUS_BLOCK_MAX + id->p_msg->len;
id->curr_recv_count = id->recv_count;
@@ -789,6 +794,9 @@ static int cdns_i2c_process_msg(struct cdns_i2c *id, struct i2c_msg *msg,
if (id->err_status & CDNS_I2C_IXR_ARB_LOST)
return -EAGAIN;
+ if (msg->flags & I2C_M_RECV_LEN)
+ msg->len += min_t(unsigned int, msg->buf[0], I2C_SMBUS_BLOCK_MAX);
+
return 0;
}
diff --git a/drivers/i2c/busses/i2c-mxs.c b/drivers/i2c/busses/i2c-mxs.c
index 864a3f1bd4e1..68f67d084c63 100644
--- a/drivers/i2c/busses/i2c-mxs.c
+++ b/drivers/i2c/busses/i2c-mxs.c
@@ -799,7 +799,7 @@ static int mxs_i2c_probe(struct platform_device *pdev)
if (!i2c)
return -ENOMEM;
- i2c->dev_type = (enum mxs_i2c_devtype)of_device_get_match_data(&pdev->dev);
+ i2c->dev_type = (uintptr_t)of_device_get_match_data(&pdev->dev);
i2c->regs = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(i2c->regs))
diff --git a/drivers/i2c/busses/i2c-npcm7xx.c b/drivers/i2c/busses/i2c-npcm7xx.c
index 743ac20a405c..9f0462fb2504 100644
--- a/drivers/i2c/busses/i2c-npcm7xx.c
+++ b/drivers/i2c/busses/i2c-npcm7xx.c
@@ -123,11 +123,11 @@ enum i2c_addr {
* Since the addr regs are sprinkled all over the address space,
* use this array to get the address or each register.
*/
-#define I2C_NUM_OWN_ADDR 10
+#define I2C_NUM_OWN_ADDR 2
+#define I2C_NUM_OWN_ADDR_SUPPORTED 2
+
static const int npcm_i2caddr[I2C_NUM_OWN_ADDR] = {
- NPCM_I2CADDR1, NPCM_I2CADDR2, NPCM_I2CADDR3, NPCM_I2CADDR4,
- NPCM_I2CADDR5, NPCM_I2CADDR6, NPCM_I2CADDR7, NPCM_I2CADDR8,
- NPCM_I2CADDR9, NPCM_I2CADDR10,
+ NPCM_I2CADDR1, NPCM_I2CADDR2,
};
#endif
@@ -391,14 +391,10 @@ static void npcm_i2c_disable(struct npcm_i2c *bus)
#if IS_ENABLED(CONFIG_I2C_SLAVE)
int i;
- /* select bank 0 for I2C addresses */
- npcm_i2c_select_bank(bus, I2C_BANK_0);
-
/* Slave addresses removal */
- for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR; i++)
+ for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR_SUPPORTED; i++)
iowrite8(0, bus->reg + npcm_i2caddr[i]);
- npcm_i2c_select_bank(bus, I2C_BANK_1);
#endif
/* Disable module */
i2cctl2 = ioread8(bus->reg + NPCM_I2CCTL2);
@@ -603,8 +599,7 @@ static int npcm_i2c_slave_enable(struct npcm_i2c *bus, enum i2c_addr addr_type,
i2cctl1 &= ~NPCM_I2CCTL1_GCMEN;
iowrite8(i2cctl1, bus->reg + NPCM_I2CCTL1);
return 0;
- }
- if (addr_type == I2C_ARP_ADDR) {
+ } else if (addr_type == I2C_ARP_ADDR) {
i2cctl3 = ioread8(bus->reg + NPCM_I2CCTL3);
if (enable)
i2cctl3 |= I2CCTL3_ARPMEN;
@@ -613,16 +608,16 @@ static int npcm_i2c_slave_enable(struct npcm_i2c *bus, enum i2c_addr addr_type,
iowrite8(i2cctl3, bus->reg + NPCM_I2CCTL3);
return 0;
}
+ if (addr_type > I2C_SLAVE_ADDR2 && addr_type <= I2C_SLAVE_ADDR10)
+ dev_err(bus->dev, "try to enable more than 2 SA not supported\n");
+
if (addr_type >= I2C_ARP_ADDR)
return -EFAULT;
- /* select bank 0 for address 3 to 10 */
- if (addr_type > I2C_SLAVE_ADDR2)
- npcm_i2c_select_bank(bus, I2C_BANK_0);
+
/* Set and enable the address */
iowrite8(sa_reg, bus->reg + npcm_i2caddr[addr_type]);
npcm_i2c_slave_int_enable(bus, enable);
- if (addr_type > I2C_SLAVE_ADDR2)
- npcm_i2c_select_bank(bus, I2C_BANK_1);
+
return 0;
}
#endif
@@ -843,15 +838,11 @@ static u8 npcm_i2c_get_slave_addr(struct npcm_i2c *bus, enum i2c_addr addr_type)
{
u8 slave_add;
- /* select bank 0 for address 3 to 10 */
- if (addr_type > I2C_SLAVE_ADDR2)
- npcm_i2c_select_bank(bus, I2C_BANK_0);
+ if (addr_type > I2C_SLAVE_ADDR2 && addr_type <= I2C_SLAVE_ADDR10)
+ dev_err(bus->dev, "get slave: try to use more than 2 SA not supported\n");
slave_add = ioread8(bus->reg + npcm_i2caddr[(int)addr_type]);
- if (addr_type > I2C_SLAVE_ADDR2)
- npcm_i2c_select_bank(bus, I2C_BANK_1);
-
return slave_add;
}
@@ -861,12 +852,12 @@ static int npcm_i2c_remove_slave_addr(struct npcm_i2c *bus, u8 slave_add)
/* Set the enable bit */
slave_add |= 0x80;
- npcm_i2c_select_bank(bus, I2C_BANK_0);
- for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR; i++) {
+
+ for (i = I2C_SLAVE_ADDR1; i < I2C_NUM_OWN_ADDR_SUPPORTED; i++) {
if (ioread8(bus->reg + npcm_i2caddr[i]) == slave_add)
iowrite8(0, bus->reg + npcm_i2caddr[i]);
}
- npcm_i2c_select_bank(bus, I2C_BANK_1);
+
return 0;
}
@@ -921,11 +912,15 @@ static int npcm_i2c_slave_get_wr_buf(struct npcm_i2c *bus)
for (i = 0; i < I2C_HW_FIFO_SIZE; i++) {
if (bus->slv_wr_size >= I2C_HW_FIFO_SIZE)
break;
- i2c_slave_event(bus->slave, I2C_SLAVE_READ_REQUESTED, &value);
+ if (bus->state == I2C_SLAVE_MATCH) {
+ i2c_slave_event(bus->slave, I2C_SLAVE_READ_REQUESTED, &value);
+ bus->state = I2C_OPER_STARTED;
+ } else {
+ i2c_slave_event(bus->slave, I2C_SLAVE_READ_PROCESSED, &value);
+ }
ind = (bus->slv_wr_ind + bus->slv_wr_size) % I2C_HW_FIFO_SIZE;
bus->slv_wr_buf[ind] = value;
bus->slv_wr_size++;
- i2c_slave_event(bus->slave, I2C_SLAVE_READ_PROCESSED, &value);
}
return I2C_HW_FIFO_SIZE - ret;
}
@@ -973,7 +968,6 @@ static void npcm_i2c_slave_xmit(struct npcm_i2c *bus, u16 nwrite,
if (nwrite == 0)
return;
- bus->state = I2C_OPER_STARTED;
bus->operation = I2C_WRITE_OPER;
/* get the next buffer */
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index 5b920f0fc7dd..a4eef8de054c 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -688,7 +688,7 @@ static int geni_i2c_xfer(struct i2c_adapter *adap,
pm_runtime_put_autosuspend(gi2c->se.dev);
gi2c->cur = NULL;
gi2c->err = 0;
- return num;
+ return ret;
}
static u32 geni_i2c_func(struct i2c_adapter *adap)
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index d43db2c3876e..19a317fdcf5b 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -2467,8 +2467,9 @@ void i2c_put_adapter(struct i2c_adapter *adap)
if (!adap)
return;
- put_device(&adap->dev);
module_put(adap->owner);
+ /* Should be last, otherwise we risk use-after-free with 'adap' */
+ put_device(&adap->dev);
}
EXPORT_SYMBOL(i2c_put_adapter);
diff --git a/drivers/i2c/muxes/i2c-mux-gpmux.c b/drivers/i2c/muxes/i2c-mux-gpmux.c
index d3acd8d66c32..33024acaac02 100644
--- a/drivers/i2c/muxes/i2c-mux-gpmux.c
+++ b/drivers/i2c/muxes/i2c-mux-gpmux.c
@@ -134,6 +134,7 @@ static int i2c_mux_probe(struct platform_device *pdev)
return 0;
err_children:
+ of_node_put(child);
i2c_mux_del_adapters(muxc);
err_parent:
i2c_put_adapter(parent);
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 47b68c6071be..9515a3146dc9 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -812,15 +812,105 @@ static struct cpuidle_state icx_cstates[] __initdata = {
};
/*
- * On Sapphire Rapids Xeon C1 has to be disabled if C1E is enabled, and vice
- * versa. On SPR C1E is enabled only if "C1E promotion" bit is set in
- * MSR_IA32_POWER_CTL. But in this case there effectively no C1, because C1
- * requests are promoted to C1E. If the "C1E promotion" bit is cleared, then
- * both C1 and C1E requests end up with C1, so there is effectively no C1E.
+ * On AlderLake C1 has to be disabled if C1E is enabled, and vice versa.
+ * C1E is enabled only if "C1E promotion" bit is set in MSR_IA32_POWER_CTL.
+ * But in this case there is effectively no C1, because C1 requests are
+ * promoted to C1E. If the "C1E promotion" bit is cleared, then both C1
+ * and C1E requests end up with C1, so there is effectively no C1E.
*
- * By default we enable C1 and disable C1E by marking it with
+ * By default we enable C1E and disable C1 by marking it with
* 'CPUIDLE_FLAG_UNUSABLE'.
*/
+static struct cpuidle_state adl_cstates[] __initdata = {
+ {
+ .name = "C1",
+ .desc = "MWAIT 0x00",
+ .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_UNUSABLE,
+ .exit_latency = 1,
+ .target_residency = 1,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C1E",
+ .desc = "MWAIT 0x01",
+ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
+ .exit_latency = 2,
+ .target_residency = 4,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C6",
+ .desc = "MWAIT 0x20",
+ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 220,
+ .target_residency = 600,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C8",
+ .desc = "MWAIT 0x40",
+ .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 280,
+ .target_residency = 800,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C10",
+ .desc = "MWAIT 0x60",
+ .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 680,
+ .target_residency = 2000,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .enter = NULL }
+};
+
+static struct cpuidle_state adl_l_cstates[] __initdata = {
+ {
+ .name = "C1",
+ .desc = "MWAIT 0x00",
+ .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_UNUSABLE,
+ .exit_latency = 1,
+ .target_residency = 1,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C1E",
+ .desc = "MWAIT 0x01",
+ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
+ .exit_latency = 2,
+ .target_residency = 4,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C6",
+ .desc = "MWAIT 0x20",
+ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 170,
+ .target_residency = 500,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C8",
+ .desc = "MWAIT 0x40",
+ .flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 200,
+ .target_residency = 600,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C10",
+ .desc = "MWAIT 0x60",
+ .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 230,
+ .target_residency = 700,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .enter = NULL }
+};
+
static struct cpuidle_state spr_cstates[] __initdata = {
{
.name = "C1",
@@ -833,8 +923,7 @@ static struct cpuidle_state spr_cstates[] __initdata = {
{
.name = "C1E",
.desc = "MWAIT 0x01",
- .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE |
- CPUIDLE_FLAG_UNUSABLE,
+ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
.exit_latency = 2,
.target_residency = 4,
.enter = &intel_idle,
@@ -1194,6 +1283,14 @@ static const struct idle_cpu idle_cpu_icx __initconst = {
.use_acpi = true,
};
+static const struct idle_cpu idle_cpu_adl __initconst = {
+ .state_table = adl_cstates,
+};
+
+static const struct idle_cpu idle_cpu_adl_l __initconst = {
+ .state_table = adl_l_cstates,
+};
+
static const struct idle_cpu idle_cpu_spr __initconst = {
.state_table = spr_cstates,
.disable_promotion_to_c1e = true,
@@ -1262,6 +1359,8 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &idle_cpu_icx),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &idle_cpu_icx),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &idle_cpu_adl),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &idle_cpu_adl_l),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &idle_cpu_spr),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &idle_cpu_knl),
@@ -1620,6 +1719,25 @@ static void __init skx_idle_state_table_update(void)
}
}
+/**
+ * adl_idle_state_table_update - Adjust AlderLake idle states table.
+ */
+static void __init adl_idle_state_table_update(void)
+{
+ /* Check if user prefers C1 over C1E. */
+ if (preferred_states_mask & BIT(1) && !(preferred_states_mask & BIT(2))) {
+ cpuidle_state_table[0].flags &= ~CPUIDLE_FLAG_UNUSABLE;
+ cpuidle_state_table[1].flags |= CPUIDLE_FLAG_UNUSABLE;
+
+ /* Disable C1E by clearing the "C1E promotion" bit. */
+ c1e_promotion = C1E_PROMOTION_DISABLE;
+ return;
+ }
+
+ /* Make sure C1E is enabled by default */
+ c1e_promotion = C1E_PROMOTION_ENABLE;
+}
+
/**
* spr_idle_state_table_update - Adjust Sapphire Rapids idle states table.
*/
@@ -1627,17 +1745,6 @@ static void __init spr_idle_state_table_update(void)
{
unsigned long long msr;
- /* Check if user prefers C1E over C1. */
- if ((preferred_states_mask & BIT(2)) &&
- !(preferred_states_mask & BIT(1))) {
- /* Disable C1 and enable C1E. */
- spr_cstates[0].flags |= CPUIDLE_FLAG_UNUSABLE;
- spr_cstates[1].flags &= ~CPUIDLE_FLAG_UNUSABLE;
-
- /* Enable C1E using the "C1E promotion" bit. */
- c1e_promotion = C1E_PROMOTION_ENABLE;
- }
-
/*
* By default, the C6 state assumes the worst-case scenario of package
* C6. However, if PC6 is disabled, we update the numbers to match
@@ -1689,6 +1796,10 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
case INTEL_FAM6_SAPPHIRERAPIDS_X:
spr_idle_state_table_update();
break;
+ case INTEL_FAM6_ALDERLAKE:
+ case INTEL_FAM6_ALDERLAKE_L:
+ adl_idle_state_table_update();
+ break;
}
for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
diff --git a/drivers/iio/accel/adxl313_core.c b/drivers/iio/accel/adxl313_core.c
index 9e4193e64765..afeef779e1d0 100644
--- a/drivers/iio/accel/adxl313_core.c
+++ b/drivers/iio/accel/adxl313_core.c
@@ -46,7 +46,7 @@ EXPORT_SYMBOL_NS_GPL(adxl313_writable_regs_table, IIO_ADXL313);
struct adxl313_data {
struct regmap *regmap;
struct mutex lock; /* lock to protect transf_buf */
- __le16 transf_buf ____cacheline_aligned;
+ __le16 transf_buf __aligned(IIO_DMA_MINALIGN);
};
static const int adxl313_odr_freqs[][2] = {
diff --git a/drivers/iio/accel/adxl355_core.c b/drivers/iio/accel/adxl355_core.c
index e9c10c8c32f0..2dfe780d6144 100644
--- a/drivers/iio/accel/adxl355_core.c
+++ b/drivers/iio/accel/adxl355_core.c
@@ -177,7 +177,7 @@ struct adxl355_data {
u8 buf[14];
s64 ts;
} buffer;
- } ____cacheline_aligned;
+ } __aligned(IIO_DMA_MINALIGN);
};
static int adxl355_set_op_mode(struct adxl355_data *data,
diff --git a/drivers/iio/accel/adxl367.c b/drivers/iio/accel/adxl367.c
index 62960134ea19..d680bec05efc 100644
--- a/drivers/iio/accel/adxl367.c
+++ b/drivers/iio/accel/adxl367.c
@@ -179,7 +179,7 @@ struct adxl367_state {
unsigned int fifo_set_size;
unsigned int fifo_watermark;
- __be16 fifo_buf[ADXL367_FIFO_SIZE] ____cacheline_aligned;
+ __be16 fifo_buf[ADXL367_FIFO_SIZE] __aligned(IIO_DMA_MINALIGN);
__be16 sample_buf;
u8 act_threshold_buf[2];
u8 inact_time_buf[2];
diff --git a/drivers/iio/accel/adxl367_spi.c b/drivers/iio/accel/adxl367_spi.c
index 26dfc821ebbe..118c894015a5 100644
--- a/drivers/iio/accel/adxl367_spi.c
+++ b/drivers/iio/accel/adxl367_spi.c
@@ -9,6 +9,8 @@
#include <linux/regmap.h>
#include <linux/spi/spi.h>
+#include <linux/iio/iio.h>
+
#include "adxl367.h"
#define ADXL367_SPI_WRITE_COMMAND 0x0A
@@ -28,10 +30,10 @@ struct adxl367_spi_state {
struct spi_transfer fifo_xfer[2];
/*
- * DMA (thus cache coherency maintenance) requires the
- * transfer buffers to live in their own cache lines.
+ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers live in their own cache lines.
*/
- u8 reg_write_tx_buf[1] ____cacheline_aligned;
+ u8 reg_write_tx_buf[1] __aligned(IIO_DMA_MINALIGN);
u8 reg_read_tx_buf[2];
u8 fifo_tx_buf[1];
};
diff --git a/drivers/iio/accel/bma220_spi.c b/drivers/iio/accel/bma220_spi.c
index 74024d7ce5ac..b6d9ab8e2054 100644
--- a/drivers/iio/accel/bma220_spi.c
+++ b/drivers/iio/accel/bma220_spi.c
@@ -67,7 +67,7 @@ struct bma220_data {
/* Ensure timestamp is naturally aligned. */
s64 timestamp __aligned(8);
} scan;
- u8 tx_buf[2] ____cacheline_aligned;
+ u8 tx_buf[2] __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_chan_spec bma220_channels[] = {
diff --git a/drivers/iio/accel/bma400.h b/drivers/iio/accel/bma400.h
index c4c8d74155c2..1c8c47a9a317 100644
--- a/drivers/iio/accel/bma400.h
+++ b/drivers/iio/accel/bma400.h
@@ -83,8 +83,27 @@
#define BMA400_ACC_ODR_MIN_WHOLE_HZ 25
#define BMA400_ACC_ODR_MIN_HZ 12
-#define BMA400_SCALE_MIN 38357
-#define BMA400_SCALE_MAX 306864
+/*
+ * BMA400_SCALE_MIN macro value represents m/s^2 for 1 LSB before
+ * converting to micro values for +-2g range.
+ *
+ * For +-2g - 1 LSB = 0.976562 milli g = 0.009576 m/s^2
+ * For +-4g - 1 LSB = 1.953125 milli g = 0.019153 m/s^2
+ * For +-16g - 1 LSB = 7.8125 milli g = 0.076614 m/s^2
+ *
+ * The raw value which is used to select the different ranges is determined
+ * by the first bit set position from the scale value, so BMA400_SCALE_MIN
+ * should be odd.
+ *
+ * Scale values for +-2g, +-4g, +-8g and +-16g are populated into bma400_scales
+ * array by left shifting BMA400_SCALE_MIN.
+ * e.g.:
+ * To select +-2g = 9577 << 0 = raw value to write is 0.
+ * To select +-8g = 9577 << 2 = raw value to write is 2.
+ * To select +-16g = 9577 << 3 = raw value to write is 3.
+ */
+#define BMA400_SCALE_MIN 9577
+#define BMA400_SCALE_MAX 76617
#define BMA400_NUM_REGULATORS 2
#define BMA400_VDD_REGULATOR 0
@@ -94,6 +113,4 @@ extern const struct regmap_config bma400_regmap_config;
int bma400_probe(struct device *dev, struct regmap *regmap, const char *name);
-void bma400_remove(struct device *dev);
-
#endif
diff --git a/drivers/iio/accel/bma400_core.c b/drivers/iio/accel/bma400_core.c
index 043002fe6f63..07674d89d978 100644
--- a/drivers/iio/accel/bma400_core.c
+++ b/drivers/iio/accel/bma400_core.c
@@ -13,14 +13,14 @@
#include <linux/bitops.h>
#include <linux/device.h>
-#include <linux/iio/iio.h>
-#include <linux/iio/sysfs.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/regmap.h>
#include <linux/regulator/consumer.h>
+#include <linux/iio/iio.h>
+
#include "bma400.h"
/*
@@ -560,6 +560,26 @@ static void bma400_init_tables(void)
}
}
+static void bma400_regulators_disable(void *data_ptr)
+{
+ struct bma400_data *data = data_ptr;
+
+ regulator_bulk_disable(ARRAY_SIZE(data->regulators), data->regulators);
+}
+
+static void bma400_power_disable(void *data_ptr)
+{
+ struct bma400_data *data = data_ptr;
+ int ret;
+
+ mutex_lock(&data->mutex);
+ ret = bma400_set_power_mode(data, POWER_MODE_SLEEP);
+ mutex_unlock(&data->mutex);
+ if (ret)
+ dev_warn(data->dev, "Failed to put device into sleep mode (%pe)\n",
+ ERR_PTR(ret));
+}
+
static int bma400_init(struct bma400_data *data)
{
unsigned int val;
@@ -569,13 +589,12 @@ static int bma400_init(struct bma400_data *data)
ret = regmap_read(data->regmap, BMA400_CHIP_ID_REG, &val);
if (ret) {
dev_err(data->dev, "Failed to read chip id register\n");
- goto out;
+ return ret;
}
if (val != BMA400_ID_REG_VAL) {
dev_err(data->dev, "Chip ID mismatch\n");
- ret = -ENODEV;
- goto out;
+ return -ENODEV;
}
data->regulators[BMA400_VDD_REGULATOR].supply = "vdd";
@@ -589,27 +608,31 @@ static int bma400_init(struct bma400_data *data)
"Failed to get regulators: %d\n",
ret);
- goto out;
+ return ret;
}
ret = regulator_bulk_enable(ARRAY_SIZE(data->regulators),
data->regulators);
if (ret) {
dev_err(data->dev, "Failed to enable regulators: %d\n",
ret);
- goto out;
+ return ret;
}
+ ret = devm_add_action_or_reset(data->dev, bma400_regulators_disable, data);
+ if (ret)
+ return ret;
+
ret = bma400_get_power_mode(data);
if (ret) {
dev_err(data->dev, "Failed to get the initial power-mode\n");
- goto err_reg_disable;
+ return ret;
}
if (data->power_mode != POWER_MODE_NORMAL) {
ret = bma400_set_power_mode(data, POWER_MODE_NORMAL);
if (ret) {
dev_err(data->dev, "Failed to wake up the device\n");
- goto err_reg_disable;
+ return ret;
}
/*
* TODO: The datasheet waits 1500us here in the example, but
@@ -618,19 +641,23 @@ static int bma400_init(struct bma400_data *data)
usleep_range(1500, 2000);
}
+ ret = devm_add_action_or_reset(data->dev, bma400_power_disable, data);
+ if (ret)
+ return ret;
+
bma400_init_tables();
ret = bma400_get_accel_output_data_rate(data);
if (ret)
- goto err_reg_disable;
+ return ret;
ret = bma400_get_accel_oversampling_ratio(data);
if (ret)
- goto err_reg_disable;
+ return ret;
ret = bma400_get_accel_scale(data);
if (ret)
- goto err_reg_disable;
+ return ret;
/*
* Once the interrupt engine is supported we might use the
@@ -639,12 +666,6 @@ static int bma400_init(struct bma400_data *data)
* channel.
*/
return regmap_write(data->regmap, BMA400_ACC_CONFIG2_REG, 0x00);
-
-err_reg_disable:
- regulator_bulk_disable(ARRAY_SIZE(data->regulators),
- data->regulators);
-out:
- return ret;
}
static int bma400_read_raw(struct iio_dev *indio_dev,
@@ -822,32 +843,10 @@ int bma400_probe(struct device *dev, struct regmap *regmap, const char *name)
indio_dev->num_channels = ARRAY_SIZE(bma400_channels);
indio_dev->modes = INDIO_DIRECT_MODE;
- dev_set_drvdata(dev, indio_dev);
-
- return iio_device_register(indio_dev);
+ return devm_iio_device_register(dev, indio_dev);
}
EXPORT_SYMBOL_NS(bma400_probe, IIO_BMA400);
-void bma400_remove(struct device *dev)
-{
- struct iio_dev *indio_dev = dev_get_drvdata(dev);
- struct bma400_data *data = iio_priv(indio_dev);
- int ret;
-
- mutex_lock(&data->mutex);
- ret = bma400_set_power_mode(data, POWER_MODE_SLEEP);
- mutex_unlock(&data->mutex);
-
- if (ret)
- dev_warn(dev, "Failed to put device into sleep mode (%pe)\n", ERR_PTR(ret));
-
- regulator_bulk_disable(ARRAY_SIZE(data->regulators),
- data->regulators);
-
- iio_device_unregister(indio_dev);
-}
-EXPORT_SYMBOL_NS(bma400_remove, IIO_BMA400);
-
MODULE_AUTHOR("Dan Robertson <dan@dlrobertson.com>");
MODULE_DESCRIPTION("Bosch BMA400 triaxial acceleration sensor core");
MODULE_LICENSE("GPL");
diff --git a/drivers/iio/accel/bma400_i2c.c b/drivers/iio/accel/bma400_i2c.c
index da104ffd3fe0..4f6e01a3b3a1 100644
--- a/drivers/iio/accel/bma400_i2c.c
+++ b/drivers/iio/accel/bma400_i2c.c
@@ -27,13 +27,6 @@ static int bma400_i2c_probe(struct i2c_client *client,
return bma400_probe(&client->dev, regmap, id->name);
}
-static int bma400_i2c_remove(struct i2c_client *client)
-{
- bma400_remove(&client->dev);
-
- return 0;
-}
-
static const struct i2c_device_id bma400_i2c_ids[] = {
{ "bma400", 0 },
{ }
@@ -52,7 +45,6 @@ static struct i2c_driver bma400_i2c_driver = {
.of_match_table = bma400_of_i2c_match,
},
.probe = bma400_i2c_probe,
- .remove = bma400_i2c_remove,
.id_table = bma400_i2c_ids,
};
diff --git a/drivers/iio/accel/bma400_spi.c b/drivers/iio/accel/bma400_spi.c
index 51f23bdc0ea5..28e240400a3f 100644
--- a/drivers/iio/accel/bma400_spi.c
+++ b/drivers/iio/accel/bma400_spi.c
@@ -87,11 +87,6 @@ static int bma400_spi_probe(struct spi_device *spi)
return bma400_probe(&spi->dev, regmap, id->name);
}
-static void bma400_spi_remove(struct spi_device *spi)
-{
- bma400_remove(&spi->dev);
-}
-
static const struct spi_device_id bma400_spi_ids[] = {
{ "bma400", 0 },
{ }
@@ -110,7 +105,6 @@ static struct spi_driver bma400_spi_driver = {
.of_match_table = bma400_of_spi_match,
},
.probe = bma400_spi_probe,
- .remove = bma400_spi_remove,
.id_table = bma400_spi_ids,
};
diff --git a/drivers/iio/accel/cros_ec_accel_legacy.c b/drivers/iio/accel/cros_ec_accel_legacy.c
index b6f3471b62dc..3b77fded2dc0 100644
--- a/drivers/iio/accel/cros_ec_accel_legacy.c
+++ b/drivers/iio/accel/cros_ec_accel_legacy.c
@@ -215,7 +215,7 @@ static int cros_ec_accel_legacy_probe(struct platform_device *pdev)
return -ENOMEM;
ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
- cros_ec_sensors_capture, NULL);
+ cros_ec_sensors_capture);
if (ret)
return ret;
@@ -235,7 +235,7 @@ static int cros_ec_accel_legacy_probe(struct platform_device *pdev)
state->sign[CROS_EC_SENSOR_Z] = -1;
}
- return devm_iio_device_register(dev, indio_dev);
+ return cros_ec_sensors_core_register(dev, indio_dev, NULL);
}
static struct platform_driver cros_ec_accel_platform_driver = {
diff --git a/drivers/iio/accel/sca3000.c b/drivers/iio/accel/sca3000.c
index 83c81072511e..2eecd2ab72dd 100644
--- a/drivers/iio/accel/sca3000.c
+++ b/drivers/iio/accel/sca3000.c
@@ -167,8 +167,8 @@ struct sca3000_state {
int mo_det_use_count;
struct mutex lock;
/* Can these share a cacheline ? */
- u8 rx[384] ____cacheline_aligned;
- u8 tx[6] ____cacheline_aligned;
+ u8 rx[384] __aligned(IIO_DMA_MINALIGN);
+ u8 tx[6] __aligned(IIO_DMA_MINALIGN);
};
/**
diff --git a/drivers/iio/accel/sca3300.c b/drivers/iio/accel/sca3300.c
index f7ef8ecfd34a..39e0c24364ae 100644
--- a/drivers/iio/accel/sca3300.c
+++ b/drivers/iio/accel/sca3300.c
@@ -115,7 +115,7 @@ struct sca3300_data {
s16 channels[4];
s64 ts __aligned(sizeof(s64));
} scan;
- u8 txbuf[4] ____cacheline_aligned;
+ u8 txbuf[4] __aligned(IIO_DMA_MINALIGN);
u8 rxbuf[4];
};
diff --git a/drivers/iio/adc/ad7266.c b/drivers/iio/adc/ad7266.c
index c17d9b5fbaf6..53c83e04dde5 100644
--- a/drivers/iio/adc/ad7266.c
+++ b/drivers/iio/adc/ad7266.c
@@ -37,7 +37,7 @@ struct ad7266_state {
struct gpio_desc *gpios[3];
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* The buffer needs to be large enough to hold two samples (4 bytes) and
* the naturally aligned timestamp (8 bytes).
@@ -45,7 +45,7 @@ struct ad7266_state {
struct {
__be16 sample[2];
s64 timestamp;
- } data ____cacheline_aligned;
+ } data __aligned(IIO_DMA_MINALIGN);
};
static int ad7266_wakeup(struct ad7266_state *st)
diff --git a/drivers/iio/adc/ad7280a.c b/drivers/iio/adc/ad7280a.c
index ec9acbf12b9a..047a9d0dbcf7 100644
--- a/drivers/iio/adc/ad7280a.c
+++ b/drivers/iio/adc/ad7280a.c
@@ -183,7 +183,7 @@ struct ad7280_state {
unsigned char cb_mask[AD7280A_MAX_CHAIN];
struct mutex lock; /* protect sensor state */
- __be32 tx ____cacheline_aligned;
+ __be32 tx __aligned(IIO_DMA_MINALIGN);
__be32 rx;
};
diff --git a/drivers/iio/adc/ad7292.c b/drivers/iio/adc/ad7292.c
index 3271a31afde1..92c68d467c50 100644
--- a/drivers/iio/adc/ad7292.c
+++ b/drivers/iio/adc/ad7292.c
@@ -80,7 +80,7 @@ struct ad7292_state {
struct regulator *reg;
unsigned short vref_mv;
- __be16 d16 ____cacheline_aligned;
+ __be16 d16 __aligned(IIO_DMA_MINALIGN);
u8 d8[2];
};
diff --git a/drivers/iio/adc/ad7298.c b/drivers/iio/adc/ad7298.c
index 3f4e73f7d35a..c0430f71f592 100644
--- a/drivers/iio/adc/ad7298.c
+++ b/drivers/iio/adc/ad7298.c
@@ -49,7 +49,7 @@ struct ad7298_state {
* DMA (thus cache coherency maintenance) requires the
* transfer buffers to live in their own cache lines.
*/
- __be16 rx_buf[12] ____cacheline_aligned;
+ __be16 rx_buf[12] __aligned(IIO_DMA_MINALIGN);
__be16 tx_buf[2];
};
diff --git a/drivers/iio/adc/ad7476.c b/drivers/iio/adc/ad7476.c
index a1e8b32671cf..94776f696290 100644
--- a/drivers/iio/adc/ad7476.c
+++ b/drivers/iio/adc/ad7476.c
@@ -44,13 +44,12 @@ struct ad7476_state {
struct spi_transfer xfer;
struct spi_message msg;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* Make the buffer large enough for one 16 bit sample and one 64 bit
* aligned 64 bit timestamp.
*/
- unsigned char data[ALIGN(2, sizeof(s64)) + sizeof(s64)]
- ____cacheline_aligned;
+ unsigned char data[ALIGN(2, sizeof(s64)) + sizeof(s64)] __aligned(IIO_DMA_MINALIGN);
};
enum ad7476_supported_device_ids {
diff --git a/drivers/iio/adc/ad7606.h b/drivers/iio/adc/ad7606.h
index 4f82d7c9acfd..2dc4f599f9df 100644
--- a/drivers/iio/adc/ad7606.h
+++ b/drivers/iio/adc/ad7606.h
@@ -116,11 +116,11 @@ struct ad7606_state {
struct completion completion;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* 16 * 16-bit samples + 64-bit timestamp
*/
- unsigned short data[20] ____cacheline_aligned;
+ unsigned short data[20] __aligned(IIO_DMA_MINALIGN);
__be16 d16[2];
};
diff --git a/drivers/iio/adc/ad7766.c b/drivers/iio/adc/ad7766.c
index 51ee9482e0df..3079a0872947 100644
--- a/drivers/iio/adc/ad7766.c
+++ b/drivers/iio/adc/ad7766.c
@@ -45,13 +45,12 @@ struct ad7766 {
struct spi_message msg;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* Make the buffer large enough for one 24 bit sample and one 64 bit
* aligned 64 bit timestamp.
*/
- unsigned char data[ALIGN(3, sizeof(s64)) + sizeof(s64)]
- ____cacheline_aligned;
+ unsigned char data[ALIGN(3, sizeof(s64)) + sizeof(s64)] __aligned(IIO_DMA_MINALIGN);
};
/*
diff --git a/drivers/iio/adc/ad7768-1.c b/drivers/iio/adc/ad7768-1.c
index aa42ba759fa1..60f394da4640 100644
--- a/drivers/iio/adc/ad7768-1.c
+++ b/drivers/iio/adc/ad7768-1.c
@@ -163,7 +163,7 @@ struct ad7768_state {
struct gpio_desc *gpio_sync_in;
const char *labels[ARRAY_SIZE(ad7768_channels)];
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
@@ -173,7 +173,7 @@ struct ad7768_state {
} scan;
__be32 d32;
u8 d8[2];
- } data ____cacheline_aligned;
+ } data __aligned(IIO_DMA_MINALIGN);
};
static int ad7768_spi_reg_read(struct ad7768_state *st, unsigned int addr,
diff --git a/drivers/iio/adc/ad7887.c b/drivers/iio/adc/ad7887.c
index f64999714a4d..965bdc8aa696 100644
--- a/drivers/iio/adc/ad7887.c
+++ b/drivers/iio/adc/ad7887.c
@@ -66,13 +66,12 @@ struct ad7887_state {
unsigned char tx_cmd_buf[4];
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* Buffer needs to be large enough to hold two 16 bit samples and a
* 64 bit aligned 64 bit timestamp.
*/
- unsigned char data[ALIGN(4, sizeof(s64)) + sizeof(s64)]
- ____cacheline_aligned;
+ unsigned char data[ALIGN(4, sizeof(s64)) + sizeof(s64)] __aligned(IIO_DMA_MINALIGN);
};
enum ad7887_supported_device_ids {
diff --git a/drivers/iio/adc/ad7923.c b/drivers/iio/adc/ad7923.c
index 069b561ee768..edad1f30121d 100644
--- a/drivers/iio/adc/ad7923.c
+++ b/drivers/iio/adc/ad7923.c
@@ -57,12 +57,12 @@ struct ad7923_state {
unsigned int settings;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* Ensure rx_buf can be directly used in iio_push_to_buffers_with_timetamp
* Length = 8 channels + 4 extra for 8 byte timestamp
*/
- __be16 rx_buf[12] ____cacheline_aligned;
+ __be16 rx_buf[12] __aligned(IIO_DMA_MINALIGN);
__be16 tx_buf[4];
};
diff --git a/drivers/iio/adc/ad7949.c b/drivers/iio/adc/ad7949.c
index 44bb5fde83de..ed4c1656ca75 100644
--- a/drivers/iio/adc/ad7949.c
+++ b/drivers/iio/adc/ad7949.c
@@ -86,7 +86,7 @@ struct ad7949_adc_chip {
u8 resolution;
u16 cfg;
unsigned int current_channel;
- u16 buffer ____cacheline_aligned;
+ u16 buffer __aligned(IIO_DMA_MINALIGN);
__be16 buf8b;
};
diff --git a/drivers/iio/adc/adi-axi-adc.c b/drivers/iio/adc/adi-axi-adc.c
index a9e655e69eaa..8ffabdaf841e 100644
--- a/drivers/iio/adc/adi-axi-adc.c
+++ b/drivers/iio/adc/adi-axi-adc.c
@@ -84,7 +84,8 @@ void *adi_axi_adc_conv_priv(struct adi_axi_adc_conv *conv)
{
struct adi_axi_adc_client *cl = conv_to_client(conv);
- return (char *)cl + ALIGN(sizeof(struct adi_axi_adc_client), IIO_ALIGN);
+ return (char *)cl + ALIGN(sizeof(struct adi_axi_adc_client),
+ IIO_DMA_MINALIGN);
}
EXPORT_SYMBOL_GPL(adi_axi_adc_conv_priv);
@@ -169,9 +170,9 @@ static struct adi_axi_adc_conv *adi_axi_adc_conv_register(struct device *dev,
struct adi_axi_adc_client *cl;
size_t alloc_size;
- alloc_size = ALIGN(sizeof(struct adi_axi_adc_client), IIO_ALIGN);
+ alloc_size = ALIGN(sizeof(struct adi_axi_adc_client), IIO_DMA_MINALIGN);
if (sizeof_priv)
- alloc_size += ALIGN(sizeof_priv, IIO_ALIGN);
+ alloc_size += ALIGN(sizeof_priv, IIO_DMA_MINALIGN);
cl = kzalloc(alloc_size, GFP_KERNEL);
if (!cl)
diff --git a/drivers/iio/adc/hi8435.c b/drivers/iio/adc/hi8435.c
index 8eb0140df133..771fa12bdc02 100644
--- a/drivers/iio/adc/hi8435.c
+++ b/drivers/iio/adc/hi8435.c
@@ -49,7 +49,7 @@ struct hi8435_priv {
unsigned threshold_lo[2]; /* GND-Open and Supply-Open thresholds */
unsigned threshold_hi[2]; /* GND-Open and Supply-Open thresholds */
- u8 reg_buffer[3] ____cacheline_aligned;
+ u8 reg_buffer[3] __aligned(IIO_DMA_MINALIGN);
};
static int hi8435_readb(struct hi8435_priv *priv, u8 reg, u8 *val)
diff --git a/drivers/iio/adc/ltc2496.c b/drivers/iio/adc/ltc2496.c
index 5a55f79f2574..dfb3bb5997e5 100644
--- a/drivers/iio/adc/ltc2496.c
+++ b/drivers/iio/adc/ltc2496.c
@@ -24,10 +24,10 @@ struct ltc2496_driverdata {
struct spi_device *spi;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- unsigned char rxbuf[3] ____cacheline_aligned;
+ unsigned char rxbuf[3] __aligned(IIO_DMA_MINALIGN);
unsigned char txbuf[3];
};
diff --git a/drivers/iio/adc/ltc2497.c b/drivers/iio/adc/ltc2497.c
index 1adddf5a88a9..f7c786f37ceb 100644
--- a/drivers/iio/adc/ltc2497.c
+++ b/drivers/iio/adc/ltc2497.c
@@ -20,10 +20,10 @@ struct ltc2497_driverdata {
struct ltc2497core_driverdata common_ddata;
struct i2c_client *client;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- __be32 buf ____cacheline_aligned;
+ __be32 buf __aligned(IIO_DMA_MINALIGN);
};
static int ltc2497_result_and_measure(struct ltc2497core_driverdata *ddata,
diff --git a/drivers/iio/adc/max1027.c b/drivers/iio/adc/max1027.c
index 4daf1d576c4e..136fcf753837 100644
--- a/drivers/iio/adc/max1027.c
+++ b/drivers/iio/adc/max1027.c
@@ -272,7 +272,7 @@ struct max1027_state {
struct mutex lock;
struct completion complete;
- u8 reg ____cacheline_aligned;
+ u8 reg __aligned(IIO_DMA_MINALIGN);
};
static int max1027_wait_eoc(struct iio_dev *indio_dev)
@@ -349,8 +349,7 @@ static int max1027_read_single_value(struct iio_dev *indio_dev,
if (ret < 0) {
dev_err(&indio_dev->dev,
"Failed to configure conversion register\n");
- iio_device_release_direct_mode(indio_dev);
- return ret;
+ goto release;
}
/*
@@ -360,11 +359,12 @@ static int max1027_read_single_value(struct iio_dev *indio_dev,
*/
ret = max1027_wait_eoc(indio_dev);
if (ret)
- return ret;
+ goto release;
/* Read result */
ret = spi_read(st->spi, st->buffer, (chan->type == IIO_TEMP) ? 4 : 2);
+release:
iio_device_release_direct_mode(indio_dev);
if (ret < 0)
diff --git a/drivers/iio/adc/max11100.c b/drivers/iio/adc/max11100.c
index eb1ce6a0315c..49e38dca8fe2 100644
--- a/drivers/iio/adc/max11100.c
+++ b/drivers/iio/adc/max11100.c
@@ -33,10 +33,10 @@ struct max11100_state {
struct spi_device *spi;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- u8 buffer[3] ____cacheline_aligned;
+ u8 buffer[3] __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_chan_spec max11100_channels[] = {
diff --git a/drivers/iio/adc/max1118.c b/drivers/iio/adc/max1118.c
index a41bc570be21..75ab57d9aef7 100644
--- a/drivers/iio/adc/max1118.c
+++ b/drivers/iio/adc/max1118.c
@@ -42,7 +42,7 @@ struct max1118 {
s64 ts __aligned(8);
} scan;
- u8 data ____cacheline_aligned;
+ u8 data __aligned(IIO_DMA_MINALIGN);
};
#define MAX1118_CHANNEL(ch) \
diff --git a/drivers/iio/adc/max1241.c b/drivers/iio/adc/max1241.c
index a5afd84af58b..a815ad1f6913 100644
--- a/drivers/iio/adc/max1241.c
+++ b/drivers/iio/adc/max1241.c
@@ -26,7 +26,7 @@ struct max1241 {
struct regulator *vref;
struct gpio_desc *shutdown;
- __be16 data ____cacheline_aligned;
+ __be16 data __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_chan_spec max1241_channels[] = {
diff --git a/drivers/iio/adc/mcp320x.c b/drivers/iio/adc/mcp320x.c
index b4c69acb33e3..f3b81798b3c9 100644
--- a/drivers/iio/adc/mcp320x.c
+++ b/drivers/iio/adc/mcp320x.c
@@ -92,7 +92,7 @@ struct mcp320x {
struct mutex lock;
const struct mcp320x_chip_info *chip_info;
- u8 tx_buf ____cacheline_aligned;
+ u8 tx_buf __aligned(IIO_DMA_MINALIGN);
u8 rx_buf[4];
};
diff --git a/drivers/iio/adc/ti-adc0832.c b/drivers/iio/adc/ti-adc0832.c
index fb5e72600b96..b11ce555ba3b 100644
--- a/drivers/iio/adc/ti-adc0832.c
+++ b/drivers/iio/adc/ti-adc0832.c
@@ -36,7 +36,7 @@ struct adc0832 {
*/
u8 data[24] __aligned(8);
- u8 tx_buf[2] ____cacheline_aligned;
+ u8 tx_buf[2] __aligned(IIO_DMA_MINALIGN);
u8 rx_buf[2];
};
diff --git a/drivers/iio/adc/ti-adc084s021.c b/drivers/iio/adc/ti-adc084s021.c
index c9b5d9aec3dc..1f6e53832e06 100644
--- a/drivers/iio/adc/ti-adc084s021.c
+++ b/drivers/iio/adc/ti-adc084s021.c
@@ -32,10 +32,10 @@ struct adc084s021 {
s64 ts __aligned(8);
} scan;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache line.
*/
- u16 tx_buf[4] ____cacheline_aligned;
+ u16 tx_buf[4] __aligned(IIO_DMA_MINALIGN);
__be16 rx_buf[5]; /* First 16-bits are trash */
};
diff --git a/drivers/iio/adc/ti-adc108s102.c b/drivers/iio/adc/ti-adc108s102.c
index c8e48881c37f..c82a161630e1 100644
--- a/drivers/iio/adc/ti-adc108s102.c
+++ b/drivers/iio/adc/ti-adc108s102.c
@@ -77,8 +77,8 @@ struct adc108s102_state {
* tx_buf: 8 channel read commands, plus 1 dummy command
* rx_buf: 1 dummy response, 8 channel responses
*/
- __be16 rx_buf[9] ____cacheline_aligned;
- __be16 tx_buf[9] ____cacheline_aligned;
+ __be16 rx_buf[9] __aligned(IIO_DMA_MINALIGN);
+ __be16 tx_buf[9] __aligned(IIO_DMA_MINALIGN);
};
#define ADC108S102_V_CHAN(index) \
diff --git a/drivers/iio/adc/ti-adc12138.c b/drivers/iio/adc/ti-adc12138.c
index 59d75d09604f..c0a72d72f3a9 100644
--- a/drivers/iio/adc/ti-adc12138.c
+++ b/drivers/iio/adc/ti-adc12138.c
@@ -55,7 +55,7 @@ struct adc12138 {
*/
__be16 data[20] __aligned(8);
- u8 tx_buf[2] ____cacheline_aligned;
+ u8 tx_buf[2] __aligned(IIO_DMA_MINALIGN);
u8 rx_buf[2];
};
diff --git a/drivers/iio/adc/ti-adc128s052.c b/drivers/iio/adc/ti-adc128s052.c
index 8e7adec87755..622fd384983c 100644
--- a/drivers/iio/adc/ti-adc128s052.c
+++ b/drivers/iio/adc/ti-adc128s052.c
@@ -29,7 +29,7 @@ struct adc128 {
struct regulator *reg;
struct mutex lock;
- u8 buffer[2] ____cacheline_aligned;
+ u8 buffer[2] __aligned(IIO_DMA_MINALIGN);
};
static int adc128_adc_conversion(struct adc128 *adc, u8 channel)
diff --git a/drivers/iio/adc/ti-adc161s626.c b/drivers/iio/adc/ti-adc161s626.c
index 75ca7f1c8726..b789891dcf49 100644
--- a/drivers/iio/adc/ti-adc161s626.c
+++ b/drivers/iio/adc/ti-adc161s626.c
@@ -71,7 +71,7 @@ struct ti_adc_data {
u8 read_size;
u8 shift;
- u8 buffer[16] ____cacheline_aligned;
+ u8 buffer[16] __aligned(IIO_DMA_MINALIGN);
};
static int ti_adc_read_measurement(struct ti_adc_data *data,
diff --git a/drivers/iio/adc/ti-ads124s08.c b/drivers/iio/adc/ti-ads124s08.c
index 767b3b634809..64833156c199 100644
--- a/drivers/iio/adc/ti-ads124s08.c
+++ b/drivers/iio/adc/ti-ads124s08.c
@@ -106,7 +106,7 @@ struct ads124s_private {
* timestamp is maintained.
*/
u32 buffer[ADS124S08_MAX_CHANNELS + sizeof(s64)/sizeof(u32)] __aligned(8);
- u8 data[5] ____cacheline_aligned;
+ u8 data[5] __aligned(IIO_DMA_MINALIGN);
};
#define ADS124S08_CHAN(index) \
diff --git a/drivers/iio/adc/ti-ads131e08.c b/drivers/iio/adc/ti-ads131e08.c
index 80a09817c119..32237cacc9a3 100644
--- a/drivers/iio/adc/ti-ads131e08.c
+++ b/drivers/iio/adc/ti-ads131e08.c
@@ -105,7 +105,7 @@ struct ads131e08_state {
s64 ts __aligned(8);
} tmp_buf;
- u8 tx_buf[3] ____cacheline_aligned;
+ u8 tx_buf[3] __aligned(IIO_DMA_MINALIGN);
/*
* Add extra one padding byte to be able to access the last channel
* value using u32 pointer
diff --git a/drivers/iio/adc/ti-ads7950.c b/drivers/iio/adc/ti-ads7950.c
index e3658b969c5b..2cc9a9bd9db6 100644
--- a/drivers/iio/adc/ti-ads7950.c
+++ b/drivers/iio/adc/ti-ads7950.c
@@ -102,11 +102,11 @@ struct ti_ads7950_state {
unsigned int gpio_cmd_settings_bitmask;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
u16 rx_buf[TI_ADS7950_MAX_CHAN + 2 + TI_ADS7950_TIMESTAMP_SIZE]
- ____cacheline_aligned;
+ __aligned(IIO_DMA_MINALIGN);
u16 tx_buf[TI_ADS7950_MAX_CHAN + 2];
u16 single_tx;
u16 single_rx;
diff --git a/drivers/iio/adc/ti-ads8344.c b/drivers/iio/adc/ti-ads8344.c
index c96d2a9ba924..bbd85cb47f81 100644
--- a/drivers/iio/adc/ti-ads8344.c
+++ b/drivers/iio/adc/ti-ads8344.c
@@ -28,7 +28,7 @@ struct ads8344 {
*/
struct mutex lock;
- u8 tx_buf ____cacheline_aligned;
+ u8 tx_buf __aligned(IIO_DMA_MINALIGN);
u8 rx_buf[3];
};
diff --git a/drivers/iio/adc/ti-ads8688.c b/drivers/iio/adc/ti-ads8688.c
index 22c2583eedd0..7ee14c6d9f46 100644
--- a/drivers/iio/adc/ti-ads8688.c
+++ b/drivers/iio/adc/ti-ads8688.c
@@ -71,7 +71,7 @@ struct ads8688_state {
union {
__be32 d32;
u8 d8[4];
- } data[2] ____cacheline_aligned;
+ } data[2] __aligned(IIO_DMA_MINALIGN);
};
enum ads8688_id {
diff --git a/drivers/iio/adc/ti-tlc4541.c b/drivers/iio/adc/ti-tlc4541.c
index 2406eda9dfc6..30f629a553a1 100644
--- a/drivers/iio/adc/ti-tlc4541.c
+++ b/drivers/iio/adc/ti-tlc4541.c
@@ -37,12 +37,12 @@ struct tlc4541_state {
struct spi_message scan_single_msg;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* 2 bytes data + 6 bytes padding + 8 bytes timestamp when
* call iio_push_to_buffers_with_timestamp.
*/
- __be16 rx_buf[8] ____cacheline_aligned;
+ __be16 rx_buf[8] __aligned(IIO_DMA_MINALIGN);
};
struct tlc4541_chip_info {
diff --git a/drivers/iio/addac/ad74413r.c b/drivers/iio/addac/ad74413r.c
index acd230a6af35..6a66d7a65db7 100644
--- a/drivers/iio/addac/ad74413r.c
+++ b/drivers/iio/addac/ad74413r.c
@@ -77,13 +77,13 @@ struct ad74413r_state {
struct spi_transfer adc_samples_xfer[AD74413R_CHANNEL_MAX + 1];
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
struct {
u8 rx_buf[AD74413R_FRAME_SIZE * AD74413R_CHANNEL_MAX];
s64 timestamp;
- } adc_samples_buf ____cacheline_aligned;
+ } adc_samples_buf __aligned(IIO_DMA_MINALIGN);
u8 adc_samples_tx_buf[AD74413R_FRAME_SIZE * AD74413R_CHANNEL_MAX];
u8 reg_tx_buf[AD74413R_FRAME_SIZE];
diff --git a/drivers/iio/amplifiers/ad8366.c b/drivers/iio/amplifiers/ad8366.c
index 1134ae12e531..f2c2ea79a07f 100644
--- a/drivers/iio/amplifiers/ad8366.c
+++ b/drivers/iio/amplifiers/ad8366.c
@@ -45,10 +45,10 @@ struct ad8366_state {
enum ad8366_type type;
struct ad8366_info *info;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- unsigned char data[2] ____cacheline_aligned;
+ unsigned char data[2] __aligned(IIO_DMA_MINALIGN);
};
static struct ad8366_info ad8366_infos[] = {
diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c b/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c
index af801e203623..02d3cf36acb0 100644
--- a/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c
+++ b/drivers/iio/common/cros_ec_sensors/cros_ec_lid_angle.c
@@ -97,7 +97,7 @@ static int cros_ec_lid_angle_probe(struct platform_device *pdev)
if (!indio_dev)
return -ENOMEM;
- ret = cros_ec_sensors_core_init(pdev, indio_dev, false, NULL, NULL);
+ ret = cros_ec_sensors_core_init(pdev, indio_dev, false, NULL);
if (ret)
return ret;
@@ -113,7 +113,7 @@ static int cros_ec_lid_angle_probe(struct platform_device *pdev)
if (ret)
return ret;
- return devm_iio_device_register(dev, indio_dev);
+ return cros_ec_sensors_core_register(dev, indio_dev, NULL);
}
static const struct platform_device_id cros_ec_lid_angle_ids[] = {
diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c
index 376a5b30010a..5cce34fdff02 100644
--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c
+++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors.c
@@ -235,8 +235,7 @@ static int cros_ec_sensors_probe(struct platform_device *pdev)
return -ENOMEM;
ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
- cros_ec_sensors_capture,
- cros_ec_sensors_push_data);
+ cros_ec_sensors_capture);
if (ret)
return ret;
@@ -297,7 +296,8 @@ static int cros_ec_sensors_probe(struct platform_device *pdev)
else
state->core.read_ec_sensors_data = cros_ec_sensors_read_cmd;
- return devm_iio_device_register(dev, indio_dev);
+ return cros_ec_sensors_core_register(dev, indio_dev,
+ cros_ec_sensors_push_data);
}
static const struct platform_device_id cros_ec_sensors_ids[] = {
diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
index b2725c6adc7f..ed9716832161 100644
--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
@@ -234,21 +234,18 @@ static void cros_ec_sensors_core_clean(void *arg)
/**
* cros_ec_sensors_core_init() - basic initialization of the core structure
- * @pdev: platform device created for the sensors
+ * @pdev: platform device created for the sensor
* @indio_dev: iio device structure of the device
* @physical_device: true if the device refers to a physical device
* @trigger_capture: function pointer to call buffer is triggered,
* for backward compatibility.
- * @push_data: function to call when cros_ec_sensorhub receives
- * a sample for that sensor.
*
* Return: 0 on success, -errno on failure.
*/
int cros_ec_sensors_core_init(struct platform_device *pdev,
struct iio_dev *indio_dev,
bool physical_device,
- cros_ec_sensors_capture_t trigger_capture,
- cros_ec_sensorhub_push_data_cb_t push_data)
+ cros_ec_sensors_capture_t trigger_capture)
{
struct device *dev = &pdev->dev;
struct cros_ec_sensors_core_state *state = iio_priv(indio_dev);
@@ -339,17 +336,6 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
if (ret)
return ret;
- ret = cros_ec_sensorhub_register_push_data(
- sensor_hub, sensor_platform->sensor_num,
- indio_dev, push_data);
- if (ret)
- return ret;
-
- ret = devm_add_action_or_reset(
- dev, cros_ec_sensors_core_clean, pdev);
- if (ret)
- return ret;
-
/* Timestamp coming from FIFO are in ns since boot. */
ret = iio_device_set_clock(indio_dev, CLOCK_BOOTTIME);
if (ret)
@@ -371,6 +357,46 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
}
EXPORT_SYMBOL_GPL(cros_ec_sensors_core_init);
+/**
+ * cros_ec_sensors_core_register() - Register callback to FIFO and IIO when
+ * sensor is ready.
+ * It must be called at the end of the sensor probe routine.
+ * @dev: device created for the sensor
+ * @indio_dev: iio device structure of the device
+ * @push_data: function to call when cros_ec_sensorhub receives
+ * a sample for that sensor.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int cros_ec_sensors_core_register(struct device *dev,
+ struct iio_dev *indio_dev,
+ cros_ec_sensorhub_push_data_cb_t push_data)
+{
+ struct cros_ec_sensor_platform *sensor_platform = dev_get_platdata(dev);
+ struct cros_ec_sensorhub *sensor_hub = dev_get_drvdata(dev->parent);
+ struct platform_device *pdev = to_platform_device(dev);
+ struct cros_ec_dev *ec = sensor_hub->ec;
+ int ret;
+
+ ret = devm_iio_device_register(dev, indio_dev);
+ if (ret)
+ return ret;
+
+ if (!push_data ||
+ !cros_ec_check_features(ec, EC_FEATURE_MOTION_SENSE_FIFO))
+ return 0;
+
+ ret = cros_ec_sensorhub_register_push_data(
+ sensor_hub, sensor_platform->sensor_num,
+ indio_dev, push_data);
+ if (ret)
+ return ret;
+
+ return devm_add_action_or_reset(
+ dev, cros_ec_sensors_core_clean, pdev);
+}
+EXPORT_SYMBOL_GPL(cros_ec_sensors_core_register);
+
/**
* cros_ec_motion_send_host_cmd() - send motion sense host command
* @state: pointer to state information for device
diff --git a/drivers/iio/common/ssp_sensors/ssp.h b/drivers/iio/common/ssp_sensors/ssp.h
index abb832795619..f649cdecc277 100644
--- a/drivers/iio/common/ssp_sensors/ssp.h
+++ b/drivers/iio/common/ssp_sensors/ssp.h
@@ -221,8 +221,7 @@ struct ssp_data {
struct iio_dev *sensor_devs[SSP_SENSOR_MAX];
atomic_t enable_refcount;
- __le16 header_buffer[SSP_HEADER_BUFFER_SIZE / sizeof(__le16)]
- ____cacheline_aligned;
+ __le16 header_buffer[SSP_HEADER_BUFFER_SIZE / sizeof(__le16)] __aligned(IIO_DMA_MINALIGN);
};
void ssp_clean_pending_list(struct ssp_data *data);
diff --git a/drivers/iio/dac/ad5064.c b/drivers/iio/dac/ad5064.c
index 27ee2c63c5d4..e3b1add34f46 100644
--- a/drivers/iio/dac/ad5064.c
+++ b/drivers/iio/dac/ad5064.c
@@ -115,13 +115,13 @@ struct ad5064_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
u8 i2c[3];
__be32 spi;
- } data ____cacheline_aligned;
+ } data __aligned(IIO_DMA_MINALIGN);
};
enum ad5064_type {
diff --git a/drivers/iio/dac/ad5360.c b/drivers/iio/dac/ad5360.c
index ecbc6a51d60f..1bde696a572c 100644
--- a/drivers/iio/dac/ad5360.c
+++ b/drivers/iio/dac/ad5360.c
@@ -79,13 +79,13 @@ struct ad5360_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
__be32 d32;
u8 d8[4];
- } data[2] ____cacheline_aligned;
+ } data[2] __aligned(IIO_DMA_MINALIGN);
};
enum ad5360_type {
diff --git a/drivers/iio/dac/ad5421.c b/drivers/iio/dac/ad5421.c
index eedf661d32b2..7644acfd879e 100644
--- a/drivers/iio/dac/ad5421.c
+++ b/drivers/iio/dac/ad5421.c
@@ -72,13 +72,13 @@ struct ad5421_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
__be32 d32;
u8 d8[4];
- } data[2] ____cacheline_aligned;
+ } data[2] __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_event_spec ad5421_current_event[] = {
diff --git a/drivers/iio/dac/ad5449.c b/drivers/iio/dac/ad5449.c
index bad9bdaafa94..4572d6f49275 100644
--- a/drivers/iio/dac/ad5449.c
+++ b/drivers/iio/dac/ad5449.c
@@ -68,10 +68,10 @@ struct ad5449 {
uint16_t dac_cache[AD5449_MAX_CHANNELS];
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- __be16 data[2] ____cacheline_aligned;
+ __be16 data[2] __aligned(IIO_DMA_MINALIGN);
};
enum ad5449_type {
diff --git a/drivers/iio/dac/ad5504.c b/drivers/iio/dac/ad5504.c
index 8507573aa13e..e472c9564edf 100644
--- a/drivers/iio/dac/ad5504.c
+++ b/drivers/iio/dac/ad5504.c
@@ -54,7 +54,7 @@ struct ad5504_state {
unsigned pwr_down_mask;
unsigned pwr_down_mode;
- __be16 data[2] ____cacheline_aligned;
+ __be16 data[2] __aligned(IIO_DMA_MINALIGN);
};
/*
diff --git a/drivers/iio/dac/ad5592r-base.h b/drivers/iio/dac/ad5592r-base.h
index 2a22ef691996..cc7be426cbc8 100644
--- a/drivers/iio/dac/ad5592r-base.h
+++ b/drivers/iio/dac/ad5592r-base.h
@@ -14,6 +14,8 @@
#include <linux/mutex.h>
#include <linux/gpio/driver.h>
+#include <linux/iio/iio.h>
+
struct device;
struct ad5592r_state;
@@ -65,7 +67,7 @@ struct ad5592r_state {
u8 gpio_in;
u8 gpio_val;
- __be16 spi_msg ____cacheline_aligned;
+ __be16 spi_msg __aligned(IIO_DMA_MINALIGN);
__be16 spi_msg_nop;
};
diff --git a/drivers/iio/dac/ad5686.h b/drivers/iio/dac/ad5686.h
index cd5fff9e9d53..b7ade3a6b9b6 100644
--- a/drivers/iio/dac/ad5686.h
+++ b/drivers/iio/dac/ad5686.h
@@ -13,6 +13,8 @@
#include <linux/mutex.h>
#include <linux/kernel.h>
+#include <linux/iio/iio.h>
+
#define AD5310_CMD(x) ((x) << 12)
#define AD5683_DATA(x) ((x) << 4)
@@ -137,7 +139,7 @@ struct ad5686_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
@@ -145,7 +147,7 @@ struct ad5686_state {
__be32 d32;
__be16 d16;
u8 d8[4];
- } data[3] ____cacheline_aligned;
+ } data[3] __aligned(IIO_DMA_MINALIGN);
};
diff --git a/drivers/iio/dac/ad5755.c b/drivers/iio/dac/ad5755.c
index 7a62e6e1d5f1..bbb345323b69 100644
--- a/drivers/iio/dac/ad5755.c
+++ b/drivers/iio/dac/ad5755.c
@@ -189,14 +189,14 @@ struct ad5755_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
__be32 d32;
u8 d8[4];
- } data[2] ____cacheline_aligned;
+ } data[2] __aligned(IIO_DMA_MINALIGN);
};
enum ad5755_type {
diff --git a/drivers/iio/dac/ad5761.c b/drivers/iio/dac/ad5761.c
index 4cb8471db81e..6aa1a068adb0 100644
--- a/drivers/iio/dac/ad5761.c
+++ b/drivers/iio/dac/ad5761.c
@@ -70,13 +70,13 @@ struct ad5761_state {
enum ad5761_voltage_range range;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
__be32 d32;
u8 d8[4];
- } data[3] ____cacheline_aligned;
+ } data[3] __aligned(IIO_DMA_MINALIGN);
};
static const struct ad5761_range_params ad5761_range_params[] = {
diff --git a/drivers/iio/dac/ad5764.c b/drivers/iio/dac/ad5764.c
index d235a8047ba0..26c049d5b73a 100644
--- a/drivers/iio/dac/ad5764.c
+++ b/drivers/iio/dac/ad5764.c
@@ -56,13 +56,13 @@ struct ad5764_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
union {
__be32 d32;
u8 d8[4];
- } data[2] ____cacheline_aligned;
+ } data[2] __aligned(IIO_DMA_MINALIGN);
};
enum ad5764_type {
diff --git a/drivers/iio/dac/ad5766.c b/drivers/iio/dac/ad5766.c
index 43189af2fb1f..899894523752 100644
--- a/drivers/iio/dac/ad5766.c
+++ b/drivers/iio/dac/ad5766.c
@@ -123,7 +123,7 @@ struct ad5766_state {
u32 d32;
u16 w16[2];
u8 b8[4];
- } data[3] ____cacheline_aligned;
+ } data[3] __aligned(IIO_DMA_MINALIGN);
};
struct ad5766_span_tbl {
diff --git a/drivers/iio/dac/ad5770r.c b/drivers/iio/dac/ad5770r.c
index 7e2fd32e993a..f66d67402e43 100644
--- a/drivers/iio/dac/ad5770r.c
+++ b/drivers/iio/dac/ad5770r.c
@@ -140,7 +140,7 @@ struct ad5770r_state {
bool ch_pwr_down[AD5770R_MAX_CHANNELS];
bool internal_ref;
bool external_res;
- u8 transf_buf[2] ____cacheline_aligned;
+ u8 transf_buf[2] __aligned(IIO_DMA_MINALIGN);
};
static const struct regmap_config ad5770r_spi_regmap_config = {
diff --git a/drivers/iio/dac/ad5791.c b/drivers/iio/dac/ad5791.c
index 2b14914b4050..ee7e89774d9d 100644
--- a/drivers/iio/dac/ad5791.c
+++ b/drivers/iio/dac/ad5791.c
@@ -95,7 +95,7 @@ struct ad5791_state {
union {
__be32 d32;
u8 d8[4];
- } data[3] ____cacheline_aligned;
+ } data[3] __aligned(IIO_DMA_MINALIGN);
};
enum ad5791_supported_device_ids {
diff --git a/drivers/iio/dac/ad7293.c b/drivers/iio/dac/ad7293.c
index 59a38ca4c3c7..06f05750d921 100644
--- a/drivers/iio/dac/ad7293.c
+++ b/drivers/iio/dac/ad7293.c
@@ -144,7 +144,7 @@ struct ad7293_state {
struct regulator *reg_avdd;
struct regulator *reg_vdrive;
u8 page_select;
- u8 data[3] ____cacheline_aligned;
+ u8 data[3] __aligned(IIO_DMA_MINALIGN);
};
static int ad7293_page_select(struct ad7293_state *st, unsigned int reg)
diff --git a/drivers/iio/dac/ad7303.c b/drivers/iio/dac/ad7303.c
index 91eaaf793b3e..558af7926a89 100644
--- a/drivers/iio/dac/ad7303.c
+++ b/drivers/iio/dac/ad7303.c
@@ -44,10 +44,10 @@ struct ad7303_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- __be16 data ____cacheline_aligned;
+ __be16 data __aligned(IIO_DMA_MINALIGN);
};
static int ad7303_write(struct ad7303_state *st, unsigned int chan,
diff --git a/drivers/iio/dac/ad8801.c b/drivers/iio/dac/ad8801.c
index 6be35c92d435..919e8c880697 100644
--- a/drivers/iio/dac/ad8801.c
+++ b/drivers/iio/dac/ad8801.c
@@ -26,7 +26,7 @@ struct ad8801_state {
struct regulator *vrefh_reg;
struct regulator *vrefl_reg;
- __be16 data ____cacheline_aligned;
+ __be16 data __aligned(IIO_DMA_MINALIGN);
};
static int ad8801_spi_write(struct ad8801_state *state,
diff --git a/drivers/iio/dac/ltc2688.c b/drivers/iio/dac/ltc2688.c
index 2f9c384885f4..04ec38b1cad1 100644
--- a/drivers/iio/dac/ltc2688.c
+++ b/drivers/iio/dac/ltc2688.c
@@ -91,10 +91,10 @@ struct ltc2688_state {
struct mutex lock;
int vref;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- u8 tx_data[6] ____cacheline_aligned;
+ u8 tx_data[6] __aligned(IIO_DMA_MINALIGN);
u8 rx_data[3];
};
diff --git a/drivers/iio/dac/mcp4922.c b/drivers/iio/dac/mcp4922.c
index cb9e60e71b91..6c0e31032c57 100644
--- a/drivers/iio/dac/mcp4922.c
+++ b/drivers/iio/dac/mcp4922.c
@@ -29,7 +29,7 @@ struct mcp4922_state {
unsigned int value[MCP4922_NUM_CHANNELS];
unsigned int vref_mv;
struct regulator *vref_reg;
- u8 mosi[2] ____cacheline_aligned;
+ u8 mosi[2] __aligned(IIO_DMA_MINALIGN);
};
#define MCP4922_CHAN(chan, bits) { \
diff --git a/drivers/iio/dac/ti-dac082s085.c b/drivers/iio/dac/ti-dac082s085.c
index 4e1156e6deb2..f52ad8fa7747 100644
--- a/drivers/iio/dac/ti-dac082s085.c
+++ b/drivers/iio/dac/ti-dac082s085.c
@@ -55,7 +55,7 @@ struct ti_dac_chip {
bool powerdown;
u8 powerdown_mode;
u8 resolution;
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
#define WRITE_NOT_UPDATE(chan) (0x00 | (chan) << 6)
diff --git a/drivers/iio/dac/ti-dac5571.c b/drivers/iio/dac/ti-dac5571.c
index 0b775f943db3..ba68ea1e3455 100644
--- a/drivers/iio/dac/ti-dac5571.c
+++ b/drivers/iio/dac/ti-dac5571.c
@@ -52,7 +52,7 @@ struct dac5571_data {
struct dac5571_spec const *spec;
int (*dac5571_cmd)(struct dac5571_data *data, int channel, u16 val);
int (*dac5571_pwrdwn)(struct dac5571_data *data, int channel, u8 pwrdwn);
- u8 buf[3] ____cacheline_aligned;
+ u8 buf[3] __aligned(IIO_DMA_MINALIGN);
};
#define DAC5571_POWERDOWN(mode) ((mode) + 1)
diff --git a/drivers/iio/dac/ti-dac7311.c b/drivers/iio/dac/ti-dac7311.c
index e10d17e60ed3..dd0a7b77838d 100644
--- a/drivers/iio/dac/ti-dac7311.c
+++ b/drivers/iio/dac/ti-dac7311.c
@@ -52,7 +52,7 @@ struct ti_dac_chip {
bool powerdown;
u8 powerdown_mode;
u8 resolution;
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
static u8 ti_dac_get_power(struct ti_dac_chip *ti_dac, bool powerdown)
diff --git a/drivers/iio/dac/ti-dac7612.c b/drivers/iio/dac/ti-dac7612.c
index 4c0f4b5e9ff4..8195815de26f 100644
--- a/drivers/iio/dac/ti-dac7612.c
+++ b/drivers/iio/dac/ti-dac7612.c
@@ -31,10 +31,10 @@ struct dac7612 {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- uint8_t data[2] ____cacheline_aligned;
+ uint8_t data[2] __aligned(IIO_DMA_MINALIGN);
};
static int dac7612_cmd_single(struct dac7612 *priv, int channel, u16 val)
diff --git a/drivers/iio/frequency/ad9523.c b/drivers/iio/frequency/ad9523.c
index a0f92c336fc4..31c97f9f2c1b 100644
--- a/drivers/iio/frequency/ad9523.c
+++ b/drivers/iio/frequency/ad9523.c
@@ -287,13 +287,13 @@ struct ad9523_state {
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
- * transfer buffers to live in their own cache lines.
+ * DMA (thus cache coherency maintenance) may require that
+ * transfer buffers live in their own cache lines.
*/
union {
__be32 d32;
u8 d8[4];
- } data[2] ____cacheline_aligned;
+ } data[2] __aligned(IIO_DMA_MINALIGN);
};
static int ad9523_read(struct iio_dev *indio_dev, unsigned int addr)
diff --git a/drivers/iio/frequency/adf4350.c b/drivers/iio/frequency/adf4350.c
index be1218d86291..85e289700c3c 100644
--- a/drivers/iio/frequency/adf4350.c
+++ b/drivers/iio/frequency/adf4350.c
@@ -56,10 +56,10 @@ struct adf4350_state {
*/
struct mutex lock;
/*
- * DMA (thus cache coherency maintenance) requires the
- * transfer buffers to live in their own cache lines.
+ * DMA (thus cache coherency maintenance) may require that
+ * transfer buffers live in their own cache lines.
*/
- __be32 val ____cacheline_aligned;
+ __be32 val __aligned(IIO_DMA_MINALIGN);
};
static struct adf4350_platform_data default_pdata = {
diff --git a/drivers/iio/frequency/adf4371.c b/drivers/iio/frequency/adf4371.c
index ecd5e18995ad..135c8cedc33d 100644
--- a/drivers/iio/frequency/adf4371.c
+++ b/drivers/iio/frequency/adf4371.c
@@ -175,7 +175,7 @@ struct adf4371_state {
unsigned int mod2;
unsigned int rf_div_sel;
unsigned int ref_div_factor;
- u8 buf[10] ____cacheline_aligned;
+ u8 buf[10] __aligned(IIO_DMA_MINALIGN);
};
static unsigned long long adf4371_pll_fract_n_get_rate(struct adf4371_state *st,
diff --git a/drivers/iio/frequency/admv1013.c b/drivers/iio/frequency/admv1013.c
index b0e1f6571afb..ed8167271358 100644
--- a/drivers/iio/frequency/admv1013.c
+++ b/drivers/iio/frequency/admv1013.c
@@ -100,7 +100,7 @@ struct admv1013_state {
unsigned int input_mode;
unsigned int quad_se_mode;
bool det_en;
- u8 data[3] ____cacheline_aligned;
+ u8 data[3] __aligned(IIO_DMA_MINALIGN);
};
static int __admv1013_spi_read(struct admv1013_state *st, unsigned int reg,
diff --git a/drivers/iio/frequency/admv1014.c b/drivers/iio/frequency/admv1014.c
index a7994f8e6b9b..d1ccaa7ed5fe 100644
--- a/drivers/iio/frequency/admv1014.c
+++ b/drivers/iio/frequency/admv1014.c
@@ -127,7 +127,7 @@ struct admv1014_state {
unsigned int quad_se_mode;
unsigned int p1db_comp;
bool det_en;
- u8 data[3] ____cacheline_aligned;
+ u8 data[3] __aligned(IIO_DMA_MINALIGN);
};
static const int mixer_vgate_table[] = {106, 107, 108, 110, 111, 112, 113, 114,
diff --git a/drivers/iio/frequency/admv4420.c b/drivers/iio/frequency/admv4420.c
index 51134aee8510..863ba8e98c95 100644
--- a/drivers/iio/frequency/admv4420.c
+++ b/drivers/iio/frequency/admv4420.c
@@ -113,7 +113,7 @@ struct admv4420_state {
struct admv4420_n_counter n_counter;
enum admv4420_mux_sel mux_sel;
struct mutex lock;
- u8 transf_buf[4] ____cacheline_aligned;
+ u8 transf_buf[4] __aligned(IIO_DMA_MINALIGN);
};
static const struct regmap_config admv4420_regmap_config = {
diff --git a/drivers/iio/frequency/adrf6780.c b/drivers/iio/frequency/adrf6780.c
index 8255ffd174f6..21878bad0909 100644
--- a/drivers/iio/frequency/adrf6780.c
+++ b/drivers/iio/frequency/adrf6780.c
@@ -86,7 +86,7 @@ struct adrf6780_state {
bool uc_bias_en;
bool lo_sideband;
bool vdet_out_en;
- u8 data[3] ____cacheline_aligned;
+ u8 data[3] __aligned(IIO_DMA_MINALIGN);
};
static int __adrf6780_spi_read(struct adrf6780_state *st, unsigned int reg,
diff --git a/drivers/iio/gyro/adis16080.c b/drivers/iio/gyro/adis16080.c
index acef59d822b1..14b3abf6dce9 100644
--- a/drivers/iio/gyro/adis16080.c
+++ b/drivers/iio/gyro/adis16080.c
@@ -45,7 +45,7 @@ struct adis16080_state {
const struct adis16080_chip_info *info;
struct mutex lock;
- __be16 buf ____cacheline_aligned;
+ __be16 buf __aligned(IIO_DMA_MINALIGN);
};
static int adis16080_read_sample(struct iio_dev *indio_dev,
diff --git a/drivers/iio/gyro/adis16130.c b/drivers/iio/gyro/adis16130.c
index b9c952e65b55..33cde9e6fca5 100644
--- a/drivers/iio/gyro/adis16130.c
+++ b/drivers/iio/gyro/adis16130.c
@@ -41,7 +41,7 @@
struct adis16130_state {
struct spi_device *us;
struct mutex buf_lock;
- u8 buf[4] ____cacheline_aligned;
+ u8 buf[4] __aligned(IIO_DMA_MINALIGN);
};
static int adis16130_spi_read(struct iio_dev *indio_dev, u8 reg_addr, u32 *val)
diff --git a/drivers/iio/gyro/adxrs450.c b/drivers/iio/gyro/adxrs450.c
index 04f350025215..f84438e0c42c 100644
--- a/drivers/iio/gyro/adxrs450.c
+++ b/drivers/iio/gyro/adxrs450.c
@@ -73,7 +73,7 @@ enum {
struct adxrs450_state {
struct spi_device *us;
struct mutex buf_lock;
- __be32 tx ____cacheline_aligned;
+ __be32 tx __aligned(IIO_DMA_MINALIGN);
__be32 rx;
};
diff --git a/drivers/iio/gyro/fxas21002c_core.c b/drivers/iio/gyro/fxas21002c_core.c
index 410e5e9f2672..7a459a823f6e 100644
--- a/drivers/iio/gyro/fxas21002c_core.c
+++ b/drivers/iio/gyro/fxas21002c_core.c
@@ -150,10 +150,10 @@ struct fxas21002c_data {
struct regulator *vddio;
/*
- * DMA (thus cache coherency maintenance) requires the
- * transfer buffers to live in their own cache lines.
+ * DMA (thus cache coherency maintenance) may require the
+ * transfer buffers live in their own cache lines.
*/
- s16 buffer[8] ____cacheline_aligned;
+ s16 buffer[8] __aligned(IIO_DMA_MINALIGN);
};
enum fxas21002c_channel_index {
diff --git a/drivers/iio/imu/fxos8700_core.c b/drivers/iio/imu/fxos8700_core.c
index ab288186f36e..423cfe526f2a 100644
--- a/drivers/iio/imu/fxos8700_core.c
+++ b/drivers/iio/imu/fxos8700_core.c
@@ -167,7 +167,7 @@
struct fxos8700_data {
struct regmap *regmap;
struct iio_trigger *trig;
- __be16 buf[FXOS8700_DATA_BUF_SIZE] ____cacheline_aligned;
+ __be16 buf[FXOS8700_DATA_BUF_SIZE] __aligned(IIO_DMA_MINALIGN);
};
/* Regmap info */
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600.h b/drivers/iio/imu/inv_icm42600/inv_icm42600.h
index 995a9dc06521..3d91469beccb 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600.h
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600.h
@@ -141,7 +141,7 @@ struct inv_icm42600_state {
struct inv_icm42600_suspended suspended;
struct iio_dev *indio_gyro;
struct iio_dev *indio_accel;
- uint8_t buffer[2] ____cacheline_aligned;
+ uint8_t buffer[2] __aligned(IIO_DMA_MINALIGN);
struct inv_icm42600_fifo fifo;
struct {
int64_t gyro;
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h b/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
index de2a3949dcc7..8b85ee333bf8 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_buffer.h
@@ -39,7 +39,7 @@ struct inv_icm42600_fifo {
size_t accel;
size_t total;
} nb;
- uint8_t data[2080] ____cacheline_aligned;
+ uint8_t data[2080] __aligned(IIO_DMA_MINALIGN);
};
/* FIFO data packet */
diff --git a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
index c6aa36ee966a..32b58b797d57 100644
--- a/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
+++ b/drivers/iio/imu/inv_mpu6050/inv_mpu_iio.h
@@ -203,7 +203,7 @@ struct inv_mpu6050_state {
s32 magn_raw_to_gauss[3];
struct iio_mount_matrix magn_orient;
unsigned int suspended_sensors;
- u8 data[INV_MPU6050_OUTPUT_DATA_SIZE] ____cacheline_aligned;
+ u8 data[INV_MPU6050_OUTPUT_DATA_SIZE] __aligned(IIO_DMA_MINALIGN);
};
/*register and associated bit definition*/
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index b078eb2f3c9d..01369973329a 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -915,7 +915,7 @@ static int iio_verify_update(struct iio_dev *indio_dev,
if (scan_mask == NULL)
return -EINVAL;
} else {
- scan_mask = compound_mask;
+ scan_mask = compound_mask;
}
config->scan_bytes = iio_compute_scan_bytes(indio_dev,
@@ -1649,7 +1649,7 @@ static int __iio_buffer_alloc_sysfs_and_mask(struct iio_buffer *buffer,
}
attrn = buffer_attrcount + scan_el_attrcount + ARRAY_SIZE(iio_buffer_attrs);
- attr = kcalloc(attrn + 1, sizeof(* attr), GFP_KERNEL);
+ attr = kcalloc(attrn + 1, sizeof(*attr), GFP_KERNEL);
if (!attr) {
ret = -ENOMEM;
goto error_free_scan_mask;
diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
index e1ed44dec2ab..e5aed96de8f3 100644
--- a/drivers/iio/industrialio-core.c
+++ b/drivers/iio/industrialio-core.c
@@ -821,7 +821,23 @@ static ssize_t iio_format_avail_list(char *buf, const int *vals,
static ssize_t iio_format_avail_range(char *buf, const int *vals, int type)
{
- return iio_format_list(buf, vals, type, 3, "[", "]");
+ int length;
+
+ /*
+ * length refers to the array size , not the number of elements.
+ * The purpose is to print the range [min , step ,max] so length should
+ * be 3 in case of int, and 6 for other types.
+ */
+ switch (type) {
+ case IIO_VAL_INT:
+ length = 3;
+ break;
+ default:
+ length = 6;
+ break;
+ }
+
+ return iio_format_list(buf, vals, type, length, "[", "]");
}
static ssize_t iio_read_channel_info_avail(struct device *dev,
@@ -892,8 +908,7 @@ static int __iio_str_to_fixpoint(const char *str, int fract_mult,
} else if (*str == '\n') {
if (*(str + 1) == '\0')
break;
- else
- return -EINVAL;
+ return -EINVAL;
} else if (!strncmp(str, " dB", sizeof(" dB") - 1) && scale_db) {
/* Ignore the dB suffix */
str += sizeof(" dB") - 1;
@@ -1640,7 +1655,7 @@ struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv)
alloc_size = sizeof(struct iio_dev_opaque);
if (sizeof_priv) {
- alloc_size = ALIGN(alloc_size, IIO_ALIGN);
+ alloc_size = ALIGN(alloc_size, IIO_DMA_MINALIGN);
alloc_size += sizeof_priv;
}
@@ -1650,7 +1665,7 @@ struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv)
indio_dev = &iio_dev_opaque->indio_dev;
indio_dev->priv = (char *)iio_dev_opaque +
- ALIGN(sizeof(struct iio_dev_opaque), IIO_ALIGN);
+ ALIGN(sizeof(struct iio_dev_opaque), IIO_DMA_MINALIGN);
indio_dev->dev.parent = parent;
indio_dev->dev.type = &iio_device_type;
diff --git a/drivers/iio/light/cros_ec_light_prox.c b/drivers/iio/light/cros_ec_light_prox.c
index de472f23d1cb..16b893bae388 100644
--- a/drivers/iio/light/cros_ec_light_prox.c
+++ b/drivers/iio/light/cros_ec_light_prox.c
@@ -181,8 +181,7 @@ static int cros_ec_light_prox_probe(struct platform_device *pdev)
return -ENOMEM;
ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
- cros_ec_sensors_capture,
- cros_ec_sensors_push_data);
+ cros_ec_sensors_capture);
if (ret)
return ret;
@@ -240,7 +239,8 @@ static int cros_ec_light_prox_probe(struct platform_device *pdev)
state->core.read_ec_sensors_data = cros_ec_sensors_read_cmd;
- return devm_iio_device_register(dev, indio_dev);
+ return cros_ec_sensors_core_register(dev, indio_dev,
+ cros_ec_sensors_push_data);
}
static const struct platform_device_id cros_ec_light_prox_ids[] = {
diff --git a/drivers/iio/light/isl29028.c b/drivers/iio/light/isl29028.c
index 9de3262aa688..a62787f5d5e7 100644
--- a/drivers/iio/light/isl29028.c
+++ b/drivers/iio/light/isl29028.c
@@ -625,7 +625,7 @@ static int isl29028_probe(struct i2c_client *client,
ISL29028_POWER_OFF_DELAY_MS);
pm_runtime_use_autosuspend(&client->dev);
- ret = devm_iio_device_register(indio_dev->dev.parent, indio_dev);
+ ret = iio_device_register(indio_dev);
if (ret < 0) {
dev_err(&client->dev,
"%s(): iio registration failed with error %d\n",
diff --git a/drivers/iio/potentiometer/ad5110.c b/drivers/iio/potentiometer/ad5110.c
index d4eeedae56e5..8fbcce482989 100644
--- a/drivers/iio/potentiometer/ad5110.c
+++ b/drivers/iio/potentiometer/ad5110.c
@@ -63,10 +63,10 @@ struct ad5110_data {
struct mutex lock;
const struct ad5110_cfg *cfg;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
*/
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_chan_spec ad5110_channels[] = {
diff --git a/drivers/iio/potentiometer/ad5272.c b/drivers/iio/potentiometer/ad5272.c
index d8cbd170262f..ed5fc0b50fe9 100644
--- a/drivers/iio/potentiometer/ad5272.c
+++ b/drivers/iio/potentiometer/ad5272.c
@@ -50,7 +50,7 @@ struct ad5272_data {
struct i2c_client *client;
struct mutex lock;
const struct ad5272_cfg *cfg;
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_chan_spec ad5272_channel = {
diff --git a/drivers/iio/potentiometer/max5481.c b/drivers/iio/potentiometer/max5481.c
index 098d144a8fdd..b40e5ac218d7 100644
--- a/drivers/iio/potentiometer/max5481.c
+++ b/drivers/iio/potentiometer/max5481.c
@@ -44,7 +44,7 @@ static const struct max5481_cfg max5481_cfg[] = {
struct max5481_data {
struct spi_device *spi;
const struct max5481_cfg *cfg;
- u8 msg[3] ____cacheline_aligned;
+ u8 msg[3] __aligned(IIO_DMA_MINALIGN);
};
#define MAX5481_CHANNEL { \
diff --git a/drivers/iio/potentiometer/mcp41010.c b/drivers/iio/potentiometer/mcp41010.c
index 30a4594d4e11..2b73c7540209 100644
--- a/drivers/iio/potentiometer/mcp41010.c
+++ b/drivers/iio/potentiometer/mcp41010.c
@@ -60,7 +60,7 @@ struct mcp41010_data {
const struct mcp41010_cfg *cfg;
struct mutex lock; /* Protect write sequences */
unsigned int value[MCP41010_MAX_WIPERS]; /* Cache wiper values */
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
#define MCP41010_CHANNEL(ch) { \
diff --git a/drivers/iio/potentiometer/mcp4131.c b/drivers/iio/potentiometer/mcp4131.c
index 7c8c18ab8764..7890c0993ec4 100644
--- a/drivers/iio/potentiometer/mcp4131.c
+++ b/drivers/iio/potentiometer/mcp4131.c
@@ -129,7 +129,7 @@ struct mcp4131_data {
struct spi_device *spi;
const struct mcp4131_cfg *cfg;
struct mutex lock;
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
#define MCP4131_CHANNEL(ch) { \
diff --git a/drivers/iio/pressure/cros_ec_baro.c b/drivers/iio/pressure/cros_ec_baro.c
index 2f882e109423..0511edbf868d 100644
--- a/drivers/iio/pressure/cros_ec_baro.c
+++ b/drivers/iio/pressure/cros_ec_baro.c
@@ -138,8 +138,7 @@ static int cros_ec_baro_probe(struct platform_device *pdev)
return -ENOMEM;
ret = cros_ec_sensors_core_init(pdev, indio_dev, true,
- cros_ec_sensors_capture,
- cros_ec_sensors_push_data);
+ cros_ec_sensors_capture);
if (ret)
return ret;
@@ -186,7 +185,8 @@ static int cros_ec_baro_probe(struct platform_device *pdev)
state->core.read_ec_sensors_data = cros_ec_sensors_read_cmd;
- return devm_iio_device_register(dev, indio_dev);
+ return cros_ec_sensors_core_register(dev, indio_dev,
+ cros_ec_sensors_push_data);
}
static const struct platform_device_id cros_ec_baro_ids[] = {
diff --git a/drivers/iio/proximity/as3935.c b/drivers/iio/proximity/as3935.c
index 67891ce2bd09..ebc95cf8f5f4 100644
--- a/drivers/iio/proximity/as3935.c
+++ b/drivers/iio/proximity/as3935.c
@@ -65,7 +65,7 @@ struct as3935_state {
u8 chan;
s64 timestamp __aligned(8);
} scan;
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
static const struct iio_chan_spec as3935_channels[] = {
diff --git a/drivers/iio/proximity/sx9324.c b/drivers/iio/proximity/sx9324.c
index 63fbcaa4cac8..a30ac8007a3d 100644
--- a/drivers/iio/proximity/sx9324.c
+++ b/drivers/iio/proximity/sx9324.c
@@ -93,7 +93,7 @@
#define SX9324_REG_PROX_CTRL4_AVGNEGFILT_MASK GENMASK(5, 3)
#define SX9324_REG_PROX_CTRL4_AVGNEG_FILT_2 0x08
#define SX9324_REG_PROX_CTRL4_AVGPOSFILT_MASK GENMASK(2, 0)
-#define SX9324_REG_PROX_CTRL3_AVGPOS_FILT_256 0x04
+#define SX9324_REG_PROX_CTRL4_AVGPOS_FILT_256 0x04
#define SX9324_REG_PROX_CTRL5 0x35
#define SX9324_REG_PROX_CTRL5_HYST_MASK GENMASK(5, 4)
#define SX9324_REG_PROX_CTRL5_CLOSE_DEBOUNCE_MASK GENMASK(3, 2)
@@ -810,7 +810,7 @@ static const struct sx_common_reg_default sx9324_default_regs[] = {
{ SX9324_REG_PROX_CTRL3, SX9324_REG_PROX_CTRL3_AVGDEB_2SAMPLES |
SX9324_REG_PROX_CTRL3_AVGPOS_THRESH_16K },
{ SX9324_REG_PROX_CTRL4, SX9324_REG_PROX_CTRL4_AVGNEG_FILT_2 |
- SX9324_REG_PROX_CTRL3_AVGPOS_FILT_256 },
+ SX9324_REG_PROX_CTRL4_AVGPOS_FILT_256 },
{ SX9324_REG_PROX_CTRL5, 0x00 },
{ SX9324_REG_PROX_CTRL6, SX9324_REG_PROX_CTRL6_PROXTHRESH_32 },
{ SX9324_REG_PROX_CTRL7, SX9324_REG_PROX_CTRL6_PROXTHRESH_32 },
diff --git a/drivers/iio/resolver/ad2s1200.c b/drivers/iio/resolver/ad2s1200.c
index 9746bd935628..9d95241bdf8f 100644
--- a/drivers/iio/resolver/ad2s1200.c
+++ b/drivers/iio/resolver/ad2s1200.c
@@ -41,7 +41,7 @@ struct ad2s1200_state {
struct spi_device *sdev;
struct gpio_desc *sample;
struct gpio_desc *rdvel;
- __be16 rx ____cacheline_aligned;
+ __be16 rx __aligned(IIO_DMA_MINALIGN);
};
static int ad2s1200_read_raw(struct iio_dev *indio_dev,
diff --git a/drivers/iio/resolver/ad2s90.c b/drivers/iio/resolver/ad2s90.c
index d6a91f137e13..be6836e55376 100644
--- a/drivers/iio/resolver/ad2s90.c
+++ b/drivers/iio/resolver/ad2s90.c
@@ -24,7 +24,7 @@
struct ad2s90_state {
struct mutex lock; /* lock to protect rx buffer */
struct spi_device *sdev;
- u8 rx[2] ____cacheline_aligned;
+ u8 rx[2] __aligned(IIO_DMA_MINALIGN);
};
static int ad2s90_read_raw(struct iio_dev *indio_dev,
diff --git a/drivers/iio/temperature/ltc2983.c b/drivers/iio/temperature/ltc2983.c
index 301c3f13fb26..1b8252d86889 100644
--- a/drivers/iio/temperature/ltc2983.c
+++ b/drivers/iio/temperature/ltc2983.c
@@ -200,11 +200,11 @@ struct ltc2983_data {
u8 num_channels;
u8 iio_channels;
/*
- * DMA (thus cache coherency maintenance) requires the
+ * DMA (thus cache coherency maintenance) may require the
* transfer buffers to live in their own cache lines.
* Holds the converted temperature
*/
- __be32 temp ____cacheline_aligned;
+ __be32 temp __aligned(IIO_DMA_MINALIGN);
};
struct ltc2983_sensor {
diff --git a/drivers/iio/temperature/max31865.c b/drivers/iio/temperature/max31865.c
index 86c3f3509a26..f0079e51d8d2 100644
--- a/drivers/iio/temperature/max31865.c
+++ b/drivers/iio/temperature/max31865.c
@@ -53,7 +53,7 @@ struct max31865_data {
struct mutex lock;
bool filter_50hz;
bool three_wire;
- u8 buf[2] ____cacheline_aligned;
+ u8 buf[2] __aligned(IIO_DMA_MINALIGN);
};
static int max31865_read(struct max31865_data *data, u8 reg,
diff --git a/drivers/iio/temperature/maxim_thermocouple.c b/drivers/iio/temperature/maxim_thermocouple.c
index 98c41cddc6f0..c28a7a6dea5f 100644
--- a/drivers/iio/temperature/maxim_thermocouple.c
+++ b/drivers/iio/temperature/maxim_thermocouple.c
@@ -122,7 +122,7 @@ struct maxim_thermocouple_data {
struct spi_device *spi;
const struct maxim_thermocouple_chip *chip;
- u8 buffer[16] ____cacheline_aligned;
+ u8 buffer[16] __aligned(IIO_DMA_MINALIGN);
char tc_type;
};
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 3ebdd42fec36..686d170a5947 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -1179,8 +1179,10 @@ static int setup_base_ctxt(struct hfi1_filedata *fd,
goto done;
ret = init_user_ctxt(fd, uctxt);
- if (ret)
+ if (ret) {
+ hfi1_free_ctxt_rcv_groups(uctxt);
goto done;
+ }
user_init(uctxt);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 86f6a4aae1e5..6e7dadc3386c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -6125,8 +6125,8 @@ static irqreturn_t hns_roce_v2_msix_interrupt_abn(int irq, void *dev_id)
dev_err(dev, "AEQ overflow!\n");
- int_st |= 1 << HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S;
- roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG, int_st);
+ roce_write(hr_dev, ROCEE_VF_ABN_INT_ST_REG,
+ 1 << HNS_ROCE_V2_VF_INT_ST_AEQ_OVERFLOW_S);
/* Set reset level for reset_event() */
if (ops->set_default_reset_request)
diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
index 646fa8677490..7b086fe63a24 100644
--- a/drivers/infiniband/hw/irdma/cm.c
+++ b/drivers/infiniband/hw/irdma/cm.c
@@ -1477,12 +1477,13 @@ irdma_find_listener(struct irdma_cm_core *cm_core, u32 *dst_addr, u16 dst_port,
list_for_each_entry (listen_node, &cm_core->listen_list, list) {
memcpy(listen_addr, listen_node->loc_addr, sizeof(listen_addr));
listen_port = listen_node->loc_port;
+ if (listen_port != dst_port ||
+ !(listener_state & listen_node->listener_state))
+ continue;
/* compare node pair, return node handle if a match */
- if ((!memcmp(listen_addr, dst_addr, sizeof(listen_addr)) ||
- !memcmp(listen_addr, ip_zero, sizeof(listen_addr))) &&
- listen_port == dst_port &&
- vlan_id == listen_node->vlan_id &&
- (listener_state & listen_node->listener_state)) {
+ if (!memcmp(listen_addr, ip_zero, sizeof(listen_addr)) ||
+ (!memcmp(listen_addr, dst_addr, sizeof(listen_addr)) &&
+ vlan_id == listen_node->vlan_id)) {
refcount_inc(&listen_node->refcnt);
spin_unlock_irqrestore(&cm_core->listen_list_lock,
flags);
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
index 3dc9b5801da1..c1b402a06e17 100644
--- a/drivers/infiniband/hw/irdma/hw.c
+++ b/drivers/infiniband/hw/irdma/hw.c
@@ -257,10 +257,6 @@ static void irdma_process_aeq(struct irdma_pci_f *rf)
iwqp->last_aeq = info->ae_id;
spin_unlock_irqrestore(&iwqp->lock, flags);
ctx_info = &iwqp->ctx_info;
- if (rdma_protocol_roce(&iwqp->iwdev->ibdev, 1))
- ctx_info->roce_info->err_rq_idx_valid = true;
- else
- ctx_info->iwarp_info->err_rq_idx_valid = true;
} else {
if (info->ae_id != IRDMA_AE_CQ_OPERATION_ERROR)
continue;
@@ -370,16 +366,12 @@ static void irdma_process_aeq(struct irdma_pci_f *rf)
case IRDMA_AE_LCE_FUNCTION_CATASTROPHIC:
case IRDMA_AE_LCE_CQ_CATASTROPHIC:
case IRDMA_AE_UDA_XMIT_DGRAM_TOO_LONG:
- if (rdma_protocol_roce(&iwdev->ibdev, 1))
- ctx_info->roce_info->err_rq_idx_valid = false;
- else
- ctx_info->iwarp_info->err_rq_idx_valid = false;
- fallthrough;
default:
ibdev_err(&iwdev->ibdev, "abnormal ae_id = 0x%x bool qp=%d qp_id = %d\n",
info->ae_id, info->qp, info->qp_cq_id);
if (rdma_protocol_roce(&iwdev->ibdev, 1)) {
- if (!info->sq && ctx_info->roce_info->err_rq_idx_valid) {
+ ctx_info->roce_info->err_rq_idx_valid = info->rq;
+ if (info->rq) {
ctx_info->roce_info->err_rq_idx = info->wqe_idx;
irdma_sc_qp_setctx_roce(&iwqp->sc_qp, iwqp->host_ctx.va,
ctx_info);
@@ -388,7 +380,8 @@ static void irdma_process_aeq(struct irdma_pci_f *rf)
irdma_cm_disconn(iwqp);
break;
}
- if (!info->sq && ctx_info->iwarp_info->err_rq_idx_valid) {
+ ctx_info->iwarp_info->err_rq_idx_valid = info->rq;
+ if (info->rq) {
ctx_info->iwarp_info->err_rq_idx = info->wqe_idx;
ctx_info->tcp_info_valid = false;
ctx_info->iwarp_info_valid = true;
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 6daa149dcbda..b29631f6659a 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -1760,11 +1760,11 @@ static int irdma_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
spin_unlock_irqrestore(&iwcq->lock, flags);
irdma_cq_wq_destroy(iwdev->rf, cq);
- irdma_cq_free_rsrc(iwdev->rf, iwcq);
spin_lock_irqsave(&iwceq->ce_lock, flags);
irdma_sc_cleanup_ceqes(cq, ceq);
spin_unlock_irqrestore(&iwceq->ce_lock, flags);
+ irdma_cq_free_rsrc(iwdev->rf, iwcq);
return 0;
}
diff --git a/drivers/infiniband/hw/mlx5/fs.c b/drivers/infiniband/hw/mlx5/fs.c
index 661ed2b44508..6092118de672 100644
--- a/drivers/infiniband/hw/mlx5/fs.c
+++ b/drivers/infiniband/hw/mlx5/fs.c
@@ -2265,12 +2265,10 @@ static int mlx5_ib_matcher_ns(struct uverbs_attr_bundle *attrs,
if (err)
return err;
- if (flags) {
- mlx5_ib_ft_type_to_namespace(
+ if (flags)
+ return mlx5_ib_ft_type_to_namespace(
MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX,
&obj->ns_type);
- return 0;
- }
}
obj->ns_type = MLX5_FLOW_NAMESPACE_BYPASS;
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index df4d7970c1ad..cc99b293f0be 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -3083,7 +3083,7 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
else
DP_ERR(dev, "roce alloc tid returned error %d\n", rc);
- goto err0;
+ goto err1;
}
/* Index only, 18 bit long, lkey = itid << 8 | key */
@@ -3107,7 +3107,7 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
rc = dev->ops->rdma_register_tid(dev->rdma_ctx, &mr->hw_mr);
if (rc) {
DP_ERR(dev, "roce register tid returned an error %d\n", rc);
- goto err1;
+ goto err2;
}
mr->ibmr.lkey = mr->hw_mr.itid << 8 | mr->hw_mr.key;
@@ -3116,8 +3116,10 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
DP_DEBUG(dev, QEDR_MSG_MR, "alloc frmr: %x\n", mr->ibmr.lkey);
return mr;
-err1:
+err2:
dev->ops->rdma_free_tid(dev->rdma_ctx, mr->hw_mr.itid);
+err1:
+ qedr_free_pbl(dev, &mr->info.pbl_info, mr->info.pbl_table);
err0:
kfree(mr);
return ERR_PTR(rc);
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 138b3e7d3a5f..ec671e171f13 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -114,6 +114,8 @@ void retransmit_timer(struct timer_list *t)
{
struct rxe_qp *qp = from_timer(qp, t, retrans_timer);
+ pr_debug("%s: fired for qp#%d\n", __func__, qp->elem.index);
+
if (qp->valid) {
qp->comp.timeout = 1;
rxe_run_task(&qp->comp.task, 1);
@@ -729,11 +731,15 @@ int rxe_completer(void *arg)
break;
case COMPST_RNR_RETRY:
+ /* we come here if we received an RNR NAK */
if (qp->comp.rnr_retry > 0) {
if (qp->comp.rnr_retry != 7)
qp->comp.rnr_retry--;
- qp->req.need_retry = 1;
+ /* don't start a retry flow until the
+ * rnr timer has fired
+ */
+ qp->req.wait_for_rnr_timer = 1;
pr_debug("qp#%d set rnr nak timer\n",
qp_num(qp));
mod_timer(&qp->rnr_nak_timer,
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 2ffbe3390668..2b0edf360474 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -77,7 +77,7 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
enum rxe_mr_lookup_type type);
int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
-int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey);
+int rxe_invalidate_mr(struct rxe_qp *qp, u32 key);
int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr);
int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 60a31b718774..76d6498e83f9 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -576,22 +576,22 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
return mr;
}
-int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
+int rxe_invalidate_mr(struct rxe_qp *qp, u32 key)
{
struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
struct rxe_mr *mr;
int ret;
- mr = rxe_pool_get_index(&rxe->mr_pool, rkey >> 8);
+ mr = rxe_pool_get_index(&rxe->mr_pool, key >> 8);
if (!mr) {
- pr_err("%s: No MR for rkey %#x\n", __func__, rkey);
+ pr_err("%s: No MR for key %#x\n", __func__, key);
ret = -EINVAL;
goto err;
}
- if (rkey != mr->rkey) {
- pr_err("%s: rkey (%#x) doesn't match mr->rkey (%#x)\n",
- __func__, rkey, mr->rkey);
+ if (mr->rkey ? (key != mr->rkey) : (key != mr->lkey)) {
+ pr_err("%s: wr key (%#x) doesn't match mr key (%#x)\n",
+ __func__, key, (mr->rkey ? mr->rkey : mr->lkey));
ret = -EINVAL;
goto err_drop_ref;
}
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index c86b2efd58f2..fc1416fc7feb 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -69,8 +69,6 @@ int rxe_dealloc_mw(struct ib_mw *ibmw)
static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
struct rxe_mw *mw, struct rxe_mr *mr)
{
- u32 key = wqe->wr.wr.mw.rkey & 0xff;
-
if (mw->ibmw.type == IB_MW_TYPE_1) {
if (unlikely(mw->state != RXE_MW_STATE_VALID)) {
pr_err_once(
@@ -108,11 +106,6 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
}
}
- if (unlikely(key == (mw->rkey & 0xff))) {
- pr_err_once("attempt to bind MW with same key\n");
- return -EINVAL;
- }
-
/* remaining checks only apply to a nonzero MR */
if (!mr)
return 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 87066d04ed18..69db28944567 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -140,7 +140,7 @@ void *rxe_alloc(struct rxe_pool *pool)
err = xa_alloc_cyclic(&pool->xa, &elem->index, elem, pool->limit,
&pool->next, GFP_KERNEL);
- if (err)
+ if (err < 0)
goto err_free;
return obj;
@@ -168,7 +168,7 @@ int __rxe_add_to_pool(struct rxe_pool *pool, struct rxe_pool_elem *elem)
err = xa_alloc_cyclic(&pool->xa, &elem->index, elem, pool->limit,
&pool->next, GFP_KERNEL);
- if (err)
+ if (err < 0)
goto err_cnt;
return 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 62acf890af6c..4889dcae0cc8 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -186,6 +186,14 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
spin_lock_init(&qp->state_lock);
+ spin_lock_init(&qp->req.task.state_lock);
+ spin_lock_init(&qp->resp.task.state_lock);
+ spin_lock_init(&qp->comp.task.state_lock);
+
+ spin_lock_init(&qp->sq.sq_lock);
+ spin_lock_init(&qp->rq.producer_lock);
+ spin_lock_init(&qp->rq.consumer_lock);
+
atomic_set(&qp->ssn, 0);
atomic_set(&qp->skb_out, 0);
}
@@ -245,7 +253,6 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
qp->req.opcode = -1;
qp->comp.opcode = -1;
- spin_lock_init(&qp->sq.sq_lock);
skb_queue_head_init(&qp->req_pkts);
rxe_init_task(rxe, &qp->req.task, qp,
@@ -296,9 +303,6 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
}
}
- spin_lock_init(&qp->rq.producer_lock);
- spin_lock_init(&qp->rq.consumer_lock);
-
skb_queue_head_init(&qp->resp_pkts);
rxe_init_task(rxe, &qp->resp.task, qp,
@@ -513,6 +517,7 @@ static void rxe_qp_reset(struct rxe_qp *qp)
atomic_set(&qp->ssn, 0);
qp->req.opcode = -1;
qp->req.need_retry = 0;
+ qp->req.wait_for_rnr_timer = 0;
qp->req.noack_pkts = 0;
qp->resp.msn = 0;
qp->resp.opcode = -1;
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 8a1cff80a68e..90669b3c56af 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -103,7 +103,11 @@ void rnr_nak_timer(struct timer_list *t)
{
struct rxe_qp *qp = from_timer(qp, t, rnr_nak_timer);
- pr_debug("qp#%d rnr nak timer fired\n", qp_num(qp));
+ pr_debug("%s: fired for qp#%d\n", __func__, qp_num(qp));
+
+ /* request a send queue retry */
+ qp->req.need_retry = 1;
+ qp->req.wait_for_rnr_timer = 0;
rxe_run_task(&qp->req.task, 1);
}
@@ -586,9 +590,11 @@ static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
wqe->status = IB_WC_SUCCESS;
qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
- if ((wqe->wr.send_flags & IB_SEND_SIGNALED) ||
- qp->sq_sig_type == IB_SIGNAL_ALL_WR)
- rxe_run_task(&qp->comp.task, 1);
+ /* There is no ack coming for local work requests
+ * which can lead to a deadlock. So go ahead and complete
+ * it now.
+ */
+ rxe_run_task(&qp->comp.task, 1);
return 0;
}
@@ -624,10 +630,17 @@ int rxe_requester(void *arg)
qp->req.need_rd_atomic = 0;
qp->req.wait_psn = 0;
qp->req.need_retry = 0;
+ qp->req.wait_for_rnr_timer = 0;
goto exit;
}
- if (unlikely(qp->req.need_retry)) {
+ /* we come here if the retransmot timer has fired
+ * or if the rnr timer has fired. If the retransmit
+ * timer fires while we are processing an RNR NAK wait
+ * until the rnr timer has fired before starting the
+ * retry flow
+ */
+ if (unlikely(qp->req.need_retry && !qp->req.wait_for_rnr_timer)) {
req_retry(qp);
qp->req.need_retry = 0;
}
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index e7eff1ca75e9..33e8d0547553 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -123,6 +123,7 @@ struct rxe_req_info {
int need_rd_atomic;
int wait_psn;
int need_retry;
+ int wait_for_rnr_timer;
int noack_pkts;
struct rxe_task task;
};
diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 17f34d584cd9..f88d2971c2c6 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -725,11 +725,11 @@ static int siw_proc_mpareply(struct siw_cep *cep)
enum mpa_v2_ctrl mpa_p2p_mode = MPA_V2_RDMA_NO_RTR;
rv = siw_recv_mpa_rr(cep);
- if (rv != -EAGAIN)
- siw_cancel_mpatimer(cep);
if (rv)
goto out_err;
+ siw_cancel_mpatimer(cep);
+
rep = &cep->mpa.hdr;
if (__mpa_rr_revision(rep->params.bits) > MPA_REVISION_2) {
@@ -895,7 +895,8 @@ static int siw_proc_mpareply(struct siw_cep *cep)
}
out_err:
- siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
+ if (rv != -EAGAIN)
+ siw_cm_upcall(cep, IW_CM_EVENT_CONNECT_REPLY, -EINVAL);
return rv;
}
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index f8d0bab4424c..e36036b8f386 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -568,7 +568,7 @@ static void iscsi_iser_session_destroy(struct iscsi_cls_session *cls_session)
struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
iscsi_session_teardown(cls_session);
- iscsi_host_remove(shost);
+ iscsi_host_remove(shost, false);
iscsi_host_free(shost);
}
@@ -685,7 +685,7 @@ iscsi_iser_session_create(struct iscsi_endpoint *ep,
return cls_session;
remove_host:
- iscsi_host_remove(shost);
+ iscsi_host_remove(shost, false);
free_host:
iscsi_host_free(shost);
return NULL;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index c2c860d0c56e..65bad1dd917b 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -740,25 +740,25 @@ struct path_it {
struct rtrs_clt_path *(*next_path)(struct path_it *it);
};
-/**
- * list_next_or_null_rr_rcu - get next list element in round-robin fashion.
+/*
+ * rtrs_clt_get_next_path_or_null - get clt path from the list or return NULL
* @head: the head for the list.
- * @ptr: the list head to take the next element from.
- * @type: the type of the struct this is embedded in.
- * @memb: the name of the list_head within the struct.
+ * @clt_path: The element to take the next clt_path from.
*
- * Next element returned in round-robin fashion, i.e. head will be skipped,
+ * Next clt path returned in round-robin fashion, i.e. head will be skipped,
* but if list is observed as empty, NULL will be returned.
*
- * This primitive may safely run concurrently with the _rcu list-mutation
+ * This function may safely run concurrently with the _rcu list-mutation
* primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
*/
-#define list_next_or_null_rr_rcu(head, ptr, type, memb) \
-({ \
- list_next_or_null_rcu(head, ptr, type, memb) ?: \
- list_next_or_null_rcu(head, READ_ONCE((ptr)->next), \
- type, memb); \
-})
+static inline struct rtrs_clt_path *
+rtrs_clt_get_next_path_or_null(struct list_head *head, struct rtrs_clt_path *clt_path)
+{
+ return list_next_or_null_rcu(head, &clt_path->s.entry, typeof(*clt_path), s.entry) ?:
+ list_next_or_null_rcu(head,
+ READ_ONCE((&clt_path->s.entry)->next),
+ typeof(*clt_path), s.entry);
+}
/**
* get_next_path_rr() - Returns path in round-robin fashion.
@@ -789,10 +789,8 @@ static struct rtrs_clt_path *get_next_path_rr(struct path_it *it)
path = list_first_or_null_rcu(&clt->paths_list,
typeof(*path), s.entry);
else
- path = list_next_or_null_rr_rcu(&clt->paths_list,
- &path->s.entry,
- typeof(*path),
- s.entry);
+ path = rtrs_clt_get_next_path_or_null(&clt->paths_list, path);
+
rcu_assign_pointer(*ppcpu_path, path);
return path;
@@ -2277,8 +2275,7 @@ static void rtrs_clt_remove_path_from_arr(struct rtrs_clt_path *clt_path)
* removed. If @sess is the last element, then @next is NULL.
*/
rcu_read_lock();
- next = list_next_or_null_rr_rcu(&clt->paths_list, &clt_path->s.entry,
- typeof(*next), s.entry);
+ next = rtrs_clt_get_next_path_or_null(&clt->paths_list, clt_path);
rcu_read_unlock();
/*
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
index 9a1e5c2ae55c..ac0df734eba8 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
@@ -23,6 +23,17 @@
#define RTRS_PROTO_VER_STRING __stringify(RTRS_PROTO_VER_MAJOR) "." \
__stringify(RTRS_PROTO_VER_MINOR)
+/*
+ * Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS)
+ * and the minimum chunk size is 4096 (2^12).
+ * So the maximum sess_queue_depth is 65536 (2^16) in theory.
+ * But mempool_create, create_qp and ib_post_send fail with
+ * "cannot allocate memory" error if sess_queue_depth is too big.
+ * Therefore the pratical max value of sess_queue_depth is
+ * somewhere between 1 and 65534 and it depends on the system.
+ */
+#define MAX_SESS_QUEUE_DEPTH 65535
+
enum rtrs_imm_const {
MAX_IMM_TYPE_BITS = 4,
MAX_IMM_TYPE_MASK = ((1 << MAX_IMM_TYPE_BITS) - 1),
@@ -46,16 +57,6 @@ enum {
MAX_PATHS_NUM = 128,
- /*
- * Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS)
- * and the minimum chunk size is 4096 (2^12).
- * So the maximum sess_queue_depth is 65536 (2^16) in theory.
- * But mempool_create, create_qp and ib_post_send fail with
- * "cannot allocate memory" error if sess_queue_depth is too big.
- * Therefore the pratical max value of sess_queue_depth is
- * somewhere between 1 and 65534 and it depends on the system.
- */
- MAX_SESS_QUEUE_DEPTH = 65535,
MIN_CHUNK_SIZE = 8192,
RTRS_HB_INTERVAL_MS = 5000,
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index f86ee1c4b970..c3036aeac89e 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -565,12 +565,9 @@ static int srpt_refresh_port(struct srpt_port *sport)
if (ret)
return ret;
- sport->port_guid_id.wwn.priv = sport;
- srpt_format_guid(sport->port_guid_id.name,
- sizeof(sport->port_guid_id.name),
+ srpt_format_guid(sport->guid_name, ARRAY_SIZE(sport->guid_name),
&sport->gid.global.interface_id);
- sport->port_gid_id.wwn.priv = sport;
- snprintf(sport->port_gid_id.name, sizeof(sport->port_gid_id.name),
+ snprintf(sport->gid_name, ARRAY_SIZE(sport->gid_name),
"0x%016llx%016llx",
be64_to_cpu(sport->gid.global.subnet_prefix),
be64_to_cpu(sport->gid.global.interface_id));
@@ -2314,31 +2311,35 @@ static int srpt_cm_req_recv(struct srpt_device *const sdev,
tag_num = ch->rq_size;
tag_size = 1; /* ib_srpt does not use se_sess->sess_cmd_map */
- mutex_lock(&sport->port_guid_id.mutex);
- list_for_each_entry(stpg, &sport->port_guid_id.tpg_list, entry) {
- if (!IS_ERR_OR_NULL(ch->sess))
- break;
- ch->sess = target_setup_session(&stpg->tpg, tag_num,
+ if (sport->guid_id) {
+ mutex_lock(&sport->guid_id->mutex);
+ list_for_each_entry(stpg, &sport->guid_id->tpg_list, entry) {
+ if (!IS_ERR_OR_NULL(ch->sess))
+ break;
+ ch->sess = target_setup_session(&stpg->tpg, tag_num,
tag_size, TARGET_PROT_NORMAL,
ch->sess_name, ch, NULL);
+ }
+ mutex_unlock(&sport->guid_id->mutex);
}
- mutex_unlock(&sport->port_guid_id.mutex);
- mutex_lock(&sport->port_gid_id.mutex);
- list_for_each_entry(stpg, &sport->port_gid_id.tpg_list, entry) {
- if (!IS_ERR_OR_NULL(ch->sess))
- break;
- ch->sess = target_setup_session(&stpg->tpg, tag_num,
+ if (sport->gid_id) {
+ mutex_lock(&sport->gid_id->mutex);
+ list_for_each_entry(stpg, &sport->gid_id->tpg_list, entry) {
+ if (!IS_ERR_OR_NULL(ch->sess))
+ break;
+ ch->sess = target_setup_session(&stpg->tpg, tag_num,
tag_size, TARGET_PROT_NORMAL, i_port_id,
ch, NULL);
- if (!IS_ERR_OR_NULL(ch->sess))
- break;
- /* Retry without leading "0x" */
- ch->sess = target_setup_session(&stpg->tpg, tag_num,
+ if (!IS_ERR_OR_NULL(ch->sess))
+ break;
+ /* Retry without leading "0x" */
+ ch->sess = target_setup_session(&stpg->tpg, tag_num,
tag_size, TARGET_PROT_NORMAL,
i_port_id + 2, ch, NULL);
+ }
+ mutex_unlock(&sport->gid_id->mutex);
}
- mutex_unlock(&sport->port_gid_id.mutex);
if (IS_ERR_OR_NULL(ch->sess)) {
WARN_ON_ONCE(ch->sess == NULL);
@@ -2983,7 +2984,12 @@ static int srpt_release_sport(struct srpt_port *sport)
return 0;
}
-static struct se_wwn *__srpt_lookup_wwn(const char *name)
+struct port_and_port_id {
+ struct srpt_port *sport;
+ struct srpt_port_id **port_id;
+};
+
+static struct port_and_port_id __srpt_lookup_port(const char *name)
{
struct ib_device *dev;
struct srpt_device *sdev;
@@ -2998,25 +3004,38 @@ static struct se_wwn *__srpt_lookup_wwn(const char *name)
for (i = 0; i < dev->phys_port_cnt; i++) {
sport = &sdev->port[i];
- if (strcmp(sport->port_guid_id.name, name) == 0)
- return &sport->port_guid_id.wwn;
- if (strcmp(sport->port_gid_id.name, name) == 0)
- return &sport->port_gid_id.wwn;
+ if (strcmp(sport->guid_name, name) == 0) {
+ kref_get(&sdev->refcnt);
+ return (struct port_and_port_id){
+ sport, &sport->guid_id};
+ }
+ if (strcmp(sport->gid_name, name) == 0) {
+ kref_get(&sdev->refcnt);
+ return (struct port_and_port_id){
+ sport, &sport->gid_id};
+ }
}
}
- return NULL;
+ return (struct port_and_port_id){};
}
-static struct se_wwn *srpt_lookup_wwn(const char *name)
+/**
+ * srpt_lookup_port() - Look up an RDMA port by name
+ * @name: ASCII port name
+ *
+ * Increments the RDMA port reference count if an RDMA port pointer is returned.
+ * The caller must drop that reference count by calling srpt_port_put_ref().
+ */
+static struct port_and_port_id srpt_lookup_port(const char *name)
{
- struct se_wwn *wwn;
+ struct port_and_port_id papi;
spin_lock(&srpt_dev_lock);
- wwn = __srpt_lookup_wwn(name);
+ papi = __srpt_lookup_port(name);
spin_unlock(&srpt_dev_lock);
- return wwn;
+ return papi;
}
static void srpt_free_srq(struct srpt_device *sdev)
@@ -3101,6 +3120,18 @@ static int srpt_use_srq(struct srpt_device *sdev, bool use_srq)
return ret;
}
+static void srpt_free_sdev(struct kref *refcnt)
+{
+ struct srpt_device *sdev = container_of(refcnt, typeof(*sdev), refcnt);
+
+ kfree(sdev);
+}
+
+static void srpt_sdev_put(struct srpt_device *sdev)
+{
+ kref_put(&sdev->refcnt, srpt_free_sdev);
+}
+
/**
* srpt_add_one - InfiniBand device addition callback function
* @device: Describes a HCA.
@@ -3119,6 +3150,7 @@ static int srpt_add_one(struct ib_device *device)
if (!sdev)
return -ENOMEM;
+ kref_init(&sdev->refcnt);
sdev->device = device;
mutex_init(&sdev->sdev_mutex);
@@ -3182,10 +3214,6 @@ static int srpt_add_one(struct ib_device *device)
sport->port_attrib.srp_sq_size = DEF_SRPT_SQ_SIZE;
sport->port_attrib.use_srq = false;
INIT_WORK(&sport->work, srpt_refresh_port_work);
- mutex_init(&sport->port_guid_id.mutex);
- INIT_LIST_HEAD(&sport->port_guid_id.tpg_list);
- mutex_init(&sport->port_gid_id.mutex);
- INIT_LIST_HEAD(&sport->port_gid_id.tpg_list);
ret = srpt_refresh_port(sport);
if (ret) {
@@ -3214,7 +3242,7 @@ static int srpt_add_one(struct ib_device *device)
srpt_free_srq(sdev);
ib_dealloc_pd(sdev->pd);
free_dev:
- kfree(sdev);
+ srpt_sdev_put(sdev);
pr_info("%s(%s) failed.\n", __func__, dev_name(&device->dev));
return ret;
}
@@ -3258,7 +3286,7 @@ static void srpt_remove_one(struct ib_device *device, void *client_data)
ib_dealloc_pd(sdev->pd);
- kfree(sdev);
+ srpt_sdev_put(sdev);
}
static struct ib_client srpt_client = {
@@ -3286,10 +3314,10 @@ static struct srpt_port_id *srpt_wwn_to_sport_id(struct se_wwn *wwn)
{
struct srpt_port *sport = wwn->priv;
- if (wwn == &sport->port_guid_id.wwn)
- return &sport->port_guid_id;
- if (wwn == &sport->port_gid_id.wwn)
- return &sport->port_gid_id;
+ if (sport->guid_id && &sport->guid_id->wwn == wwn)
+ return sport->guid_id;
+ if (sport->gid_id && &sport->gid_id->wwn == wwn)
+ return sport->gid_id;
WARN_ON_ONCE(true);
return NULL;
}
@@ -3774,7 +3802,31 @@ static struct se_wwn *srpt_make_tport(struct target_fabric_configfs *tf,
struct config_group *group,
const char *name)
{
- return srpt_lookup_wwn(name) ? : ERR_PTR(-EINVAL);
+ struct port_and_port_id papi = srpt_lookup_port(name);
+ struct srpt_port *sport = papi.sport;
+ struct srpt_port_id *port_id;
+
+ if (!papi.port_id)
+ return ERR_PTR(-EINVAL);
+ if (*papi.port_id) {
+ /* Attempt to create a directory that already exists. */
+ WARN_ON_ONCE(true);
+ return &(*papi.port_id)->wwn;
+ }
+ port_id = kzalloc(sizeof(*port_id), GFP_KERNEL);
+ if (!port_id) {
+ srpt_sdev_put(sport->sdev);
+ return ERR_PTR(-ENOMEM);
+ }
+ mutex_init(&port_id->mutex);
+ INIT_LIST_HEAD(&port_id->tpg_list);
+ port_id->wwn.priv = sport;
+ memcpy(port_id->name, port_id == sport->guid_id ? sport->guid_name :
+ sport->gid_name, ARRAY_SIZE(port_id->name));
+
+ *papi.port_id = port_id;
+
+ return &port_id->wwn;
}
/**
@@ -3783,6 +3835,18 @@ static struct se_wwn *srpt_make_tport(struct target_fabric_configfs *tf,
*/
static void srpt_drop_tport(struct se_wwn *wwn)
{
+ struct srpt_port_id *port_id = container_of(wwn, typeof(*port_id), wwn);
+ struct srpt_port *sport = wwn->priv;
+
+ if (sport->guid_id == port_id)
+ sport->guid_id = NULL;
+ else if (sport->gid_id == port_id)
+ sport->gid_id = NULL;
+ else
+ WARN_ON_ONCE(true);
+
+ srpt_sdev_put(sport->sdev);
+ kfree(port_id);
}
static ssize_t srpt_wwn_version_show(struct config_item *item, char *buf)
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h
index 76e66f630c17..4c46b301eea1 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
@@ -376,7 +376,7 @@ struct srpt_tpg {
};
/**
- * struct srpt_port_id - information about an RDMA port name
+ * struct srpt_port_id - LIO RDMA port information
* @mutex: Protects @tpg_list changes.
* @tpg_list: TPGs associated with the RDMA port name.
* @wwn: WWN associated with the RDMA port name.
@@ -393,7 +393,7 @@ struct srpt_port_id {
};
/**
- * struct srpt_port - information associated by SRPT with a single IB port
+ * struct srpt_port - SRPT RDMA port information
* @sdev: backpointer to the HCA information.
* @mad_agent: per-port management datagram processing information.
* @enabled: Whether or not this target port is enabled.
@@ -402,8 +402,10 @@ struct srpt_port_id {
* @lid: cached value of the port's lid.
* @gid: cached value of the port's gid.
* @work: work structure for refreshing the aforementioned cached values.
- * @port_guid_id: target port GUID
- * @port_gid_id: target port GID
+ * @guid_name: port name in GUID format.
+ * @guid_id: LIO target port information for the port name in GUID format.
+ * @gid_name: port name in GID format.
+ * @gid_id: LIO target port information for the port name in GID format.
* @port_attrib: Port attributes that can be accessed through configfs.
* @refcount: Number of objects associated with this port.
* @freed_channels: Completion that will be signaled once @refcount becomes 0.
@@ -419,8 +421,10 @@ struct srpt_port {
u32 lid;
union ib_gid gid;
struct work_struct work;
- struct srpt_port_id port_guid_id;
- struct srpt_port_id port_gid_id;
+ char guid_name[64];
+ struct srpt_port_id *guid_id;
+ char gid_name[64];
+ struct srpt_port_id *gid_id;
struct srpt_port_attrib port_attrib;
atomic_t refcount;
struct completion *freed_channels;
@@ -430,6 +434,7 @@ struct srpt_port {
/**
* struct srpt_device - information associated by SRPT with a single HCA
+ * @refcnt: Reference count for this device.
* @device: Backpointer to the struct ib_device managed by the IB core.
* @pd: IB protection domain.
* @lkey: L_Key (local key) with write access to all local memory.
@@ -445,6 +450,7 @@ struct srpt_port {
* @port: Information about the ports owned by this HCA.
*/
struct srpt_device {
+ struct kref refcnt;
struct ib_device *device;
struct ib_pd *pd;
u32 lkey;
diff --git a/drivers/input/serio/gscps2.c b/drivers/input/serio/gscps2.c
index a9065c6ab550..da2c67cb8642 100644
--- a/drivers/input/serio/gscps2.c
+++ b/drivers/input/serio/gscps2.c
@@ -350,6 +350,10 @@ static int __init gscps2_probe(struct parisc_device *dev)
ps2port->port = serio;
ps2port->padev = dev;
ps2port->addr = ioremap(hpa, GSC_STATUS + 4);
+ if (!ps2port->addr) {
+ ret = -ENOMEM;
+ goto fail_nomem;
+ }
spin_lock_init(&ps2port->lock);
gscps2_reset(ps2port);
diff --git a/drivers/interconnect/imx/imx.c b/drivers/interconnect/imx/imx.c
index 249ca25d1d55..4406ec45fa90 100644
--- a/drivers/interconnect/imx/imx.c
+++ b/drivers/interconnect/imx/imx.c
@@ -234,16 +234,16 @@ int imx_icc_register(struct platform_device *pdev,
struct device *dev = &pdev->dev;
struct icc_onecell_data *data;
struct icc_provider *provider;
- int max_node_id;
+ int num_nodes;
int ret;
/* icc_onecell_data is indexed by node_id, unlike nodes param */
- max_node_id = get_max_node_id(nodes, nodes_count);
- data = devm_kzalloc(dev, struct_size(data, nodes, max_node_id),
+ num_nodes = get_max_node_id(nodes, nodes_count) + 1;
+ data = devm_kzalloc(dev, struct_size(data, nodes, num_nodes),
GFP_KERNEL);
if (!data)
return -ENOMEM;
- data->num_nodes = max_node_id;
+ data->num_nodes = num_nodes;
provider = devm_kzalloc(dev, sizeof(*provider), GFP_KERNEL);
if (!provider)
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 4c077c38fbd6..c11d2c2cbb62 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -750,9 +750,12 @@ static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
{
struct device_node *child;
- for_each_child_of_node(qcom_iommu->dev->of_node, child)
- if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+ for_each_child_of_node(qcom_iommu->dev->of_node, child) {
+ if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec")) {
+ of_node_put(child);
return true;
+ }
+ }
return false;
}
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 71f2018e23fe..cd4b889d5537 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -630,7 +630,7 @@ static int exynos_sysmmu_probe(struct platform_device *pdev)
ret = iommu_device_register(&data->iommu, &exynos_iommu_ops, dev);
if (ret)
- return ret;
+ goto err_iommu_register;
platform_set_drvdata(pdev, data);
@@ -657,6 +657,10 @@ static int exynos_sysmmu_probe(struct platform_device *pdev)
pm_runtime_enable(dev);
return 0;
+
+err_iommu_register:
+ iommu_device_sysfs_remove(&data->iommu);
+ return ret;
}
static int __maybe_unused exynos_sysmmu_suspend(struct device *dev)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 497c5bd95caf..2a10c9b54064 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -495,7 +495,7 @@ static int dmar_parse_one_rhsa(struct acpi_dmar_header *header, void *arg)
if (drhd->reg_base_addr == rhsa->base_address) {
int node = pxm_to_node(rhsa->proximity_domain);
- if (!node_online(node))
+ if (node != NUMA_NO_NODE && !node_online(node))
node = NUMA_NO_NODE;
drhd->iommu->node = node;
return 0;
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 15edb9a6fcae..1af26c409cb1 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -177,7 +177,7 @@ config MADERA_IRQ
config IRQ_MIPS_CPU
bool
select GENERIC_IRQ_CHIP
- select GENERIC_IRQ_IPI if SYS_SUPPORTS_MULTITHREADING
+ select GENERIC_IRQ_IPI if SMP && SYS_SUPPORTS_MULTITHREADING
select IRQ_DOMAIN
select GENERIC_IRQ_EFFECTIVE_AFF_MASK
@@ -310,7 +310,8 @@ config KEYSTONE_IRQ
config MIPS_GIC
bool
- select GENERIC_IRQ_IPI
+ select GENERIC_IRQ_IPI if SMP
+ select IRQ_DOMAIN_HIERARCHY
select MIPS_CM
config INGENIC_IRQ
diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
index ff89b36267dd..1ba0f1555c80 100644
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -52,13 +52,15 @@ static DEFINE_PER_CPU_READ_MOSTLY(unsigned long[GIC_MAX_LONGS], pcpu_masks);
static DEFINE_SPINLOCK(gic_lock);
static struct irq_domain *gic_irq_domain;
-static struct irq_domain *gic_ipi_domain;
static int gic_shared_intrs;
static unsigned int gic_cpu_pin;
static unsigned int timer_cpu_pin;
static struct irq_chip gic_level_irq_controller, gic_edge_irq_controller;
+
+#ifdef CONFIG_GENERIC_IRQ_IPI
static DECLARE_BITMAP(ipi_resrv, GIC_MAX_INTRS);
static DECLARE_BITMAP(ipi_available, GIC_MAX_INTRS);
+#endif /* CONFIG_GENERIC_IRQ_IPI */
static struct gic_all_vpes_chip_data {
u32 map;
@@ -472,9 +474,11 @@ static int gic_irq_domain_map(struct irq_domain *d, unsigned int virq,
u32 map;
if (hwirq >= GIC_SHARED_HWIRQ_BASE) {
+#ifdef CONFIG_GENERIC_IRQ_IPI
/* verify that shared irqs don't conflict with an IPI irq */
if (test_bit(GIC_HWIRQ_TO_SHARED(hwirq), ipi_resrv))
return -EBUSY;
+#endif /* CONFIG_GENERIC_IRQ_IPI */
err = irq_domain_set_hwirq_and_chip(d, virq, hwirq,
&gic_level_irq_controller,
@@ -567,6 +571,8 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
.map = gic_irq_domain_map,
};
+#ifdef CONFIG_GENERIC_IRQ_IPI
+
static int gic_ipi_domain_xlate(struct irq_domain *d, struct device_node *ctrlr,
const u32 *intspec, unsigned int intsize,
irq_hw_number_t *out_hwirq,
@@ -670,6 +676,48 @@ static const struct irq_domain_ops gic_ipi_domain_ops = {
.match = gic_ipi_domain_match,
};
+static int gic_register_ipi_domain(struct device_node *node)
+{
+ struct irq_domain *gic_ipi_domain;
+ unsigned int v[2], num_ipis;
+
+ gic_ipi_domain = irq_domain_add_hierarchy(gic_irq_domain,
+ IRQ_DOMAIN_FLAG_IPI_PER_CPU,
+ GIC_NUM_LOCAL_INTRS + gic_shared_intrs,
+ node, &gic_ipi_domain_ops, NULL);
+ if (!gic_ipi_domain) {
+ pr_err("Failed to add IPI domain");
+ return -ENXIO;
+ }
+
+ irq_domain_update_bus_token(gic_ipi_domain, DOMAIN_BUS_IPI);
+
+ if (node &&
+ !of_property_read_u32_array(node, "mti,reserved-ipi-vectors", v, 2)) {
+ bitmap_set(ipi_resrv, v[0], v[1]);
+ } else {
+ /*
+ * Reserve 2 interrupts per possible CPU/VP for use as IPIs,
+ * meeting the requirements of arch/mips SMP.
+ */
+ num_ipis = 2 * num_possible_cpus();
+ bitmap_set(ipi_resrv, gic_shared_intrs - num_ipis, num_ipis);
+ }
+
+ bitmap_copy(ipi_available, ipi_resrv, GIC_MAX_INTRS);
+
+ return 0;
+}
+
+#else /* !CONFIG_GENERIC_IRQ_IPI */
+
+static inline int gic_register_ipi_domain(struct device_node *node)
+{
+ return 0;
+}
+
+#endif /* !CONFIG_GENERIC_IRQ_IPI */
+
static int gic_cpu_startup(unsigned int cpu)
{
/* Enable or disable EIC */
@@ -688,11 +736,12 @@ static int gic_cpu_startup(unsigned int cpu)
static int __init gic_of_init(struct device_node *node,
struct device_node *parent)
{
- unsigned int cpu_vec, i, gicconfig, v[2], num_ipis;
+ unsigned int cpu_vec, i, gicconfig;
unsigned long reserved;
phys_addr_t gic_base;
struct resource res;
size_t gic_len;
+ int ret;
/* Find the first available CPU vector. */
i = 0;
@@ -734,6 +783,10 @@ static int __init gic_of_init(struct device_node *node,
}
mips_gic_base = ioremap(gic_base, gic_len);
+ if (!mips_gic_base) {
+ pr_err("Failed to ioremap gic_base\n");
+ return -ENOMEM;
+ }
gicconfig = read_gic_config();
gic_shared_intrs = FIELD_GET(GIC_CONFIG_NUMINTERRUPTS, gicconfig);
@@ -780,30 +833,9 @@ static int __init gic_of_init(struct device_node *node,
return -ENXIO;
}
- gic_ipi_domain = irq_domain_add_hierarchy(gic_irq_domain,
- IRQ_DOMAIN_FLAG_IPI_PER_CPU,
- GIC_NUM_LOCAL_INTRS + gic_shared_intrs,
- node, &gic_ipi_domain_ops, NULL);
- if (!gic_ipi_domain) {
- pr_err("Failed to add IPI domain");
- return -ENXIO;
- }
-
- irq_domain_update_bus_token(gic_ipi_domain, DOMAIN_BUS_IPI);
-
- if (node &&
- !of_property_read_u32_array(node, "mti,reserved-ipi-vectors", v, 2)) {
- bitmap_set(ipi_resrv, v[0], v[1]);
- } else {
- /*
- * Reserve 2 interrupts per possible CPU/VP for use as IPIs,
- * meeting the requirements of arch/mips SMP.
- */
- num_ipis = 2 * num_possible_cpus();
- bitmap_set(ipi_resrv, gic_shared_intrs - num_ipis, num_ipis);
- }
-
- bitmap_copy(ipi_available, ipi_resrv, GIC_MAX_INTRS);
+ ret = gic_register_ipi_domain(node);
+ if (ret)
+ return ret;
board_bind_eic_interrupt = &gic_bind_eic_interrupt;
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index e362a7471512..a55d6f6f294b 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3514,7 +3514,7 @@ static void raid_status(struct dm_target *ti, status_type_t type,
{
struct raid_set *rs = ti->private;
struct mddev *mddev = &rs->md;
- struct r5conf *conf = mddev->private;
+ struct r5conf *conf = rs_is_raid456(rs) ? mddev->private : NULL;
int i, max_nr_stripes = conf ? conf->max_nr_stripes : 0;
unsigned long recovery;
unsigned int raid_param_cnt = 1; /* at least 1 for chunksize */
@@ -3824,7 +3824,7 @@ static void attempt_restore_of_faulty_devices(struct raid_set *rs)
memset(cleared_failed_devices, 0, sizeof(cleared_failed_devices));
- for (i = 0; i < mddev->raid_disks; i++) {
+ for (i = 0; i < rs->raid_disks; i++) {
r = &rs->dev[i].rdev;
/* HM FIXME: enhance journal device recovery processing */
if (test_bit(Journal, &r->flags))
diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index 2db7030aba00..a27395c8621f 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -2045,10 +2045,13 @@ int dm_pool_register_metadata_threshold(struct dm_pool_metadata *pmd,
dm_sm_threshold_fn fn,
void *context)
{
- int r;
+ int r = -EINVAL;
pmd_write_lock_in_core(pmd);
- r = dm_sm_register_threshold_callback(pmd->metadata_sm, threshold, fn, context);
+ if (!pmd->fail_io) {
+ r = dm_sm_register_threshold_callback(pmd->metadata_sm,
+ threshold, fn, context);
+ }
pmd_write_unlock(pmd);
return r;
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 4d25d0e27031..53ac6ae870ac 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -3382,8 +3382,10 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
calc_metadata_threshold(pt),
metadata_low_callback,
pool);
- if (r)
+ if (r) {
+ ti->error = "Error registering metadata threshold";
goto out_flags_changed;
+ }
dm_pool_register_pre_commit_callback(pool->pmd,
metadata_pre_commit_callback, pool);
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 5630b470ba42..27557b852c94 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -22,7 +22,7 @@
#define HIGH_WATERMARK 50
#define LOW_WATERMARK 45
-#define MAX_WRITEBACK_JOBS 0
+#define MAX_WRITEBACK_JOBS min(0x10000000 / PAGE_SIZE, totalram_pages() / 16)
#define ENDIO_LATENCY 16
#define WRITEBACK_LATENCY 64
#define AUTOCOMMIT_BLOCKS_SSD 65536
@@ -1328,8 +1328,8 @@ enum wc_map_op {
WC_MAP_ERROR,
};
-static enum wc_map_op writecache_map_remap_origin(struct dm_writecache *wc, struct bio *bio,
- struct wc_entry *e)
+static void writecache_map_remap_origin(struct dm_writecache *wc, struct bio *bio,
+ struct wc_entry *e)
{
if (e) {
sector_t next_boundary =
@@ -1337,8 +1337,6 @@ static enum wc_map_op writecache_map_remap_origin(struct dm_writecache *wc, stru
if (next_boundary < bio->bi_iter.bi_size >> SECTOR_SHIFT)
dm_accept_partial_bio(bio, next_boundary);
}
-
- return WC_MAP_REMAP_ORIGIN;
}
static enum wc_map_op writecache_map_read(struct dm_writecache *wc, struct bio *bio)
@@ -1365,14 +1363,16 @@ static enum wc_map_op writecache_map_read(struct dm_writecache *wc, struct bio *
map_op = WC_MAP_REMAP;
}
} else {
- map_op = writecache_map_remap_origin(wc, bio, e);
+ writecache_map_remap_origin(wc, bio, e);
+ wc->stats.reads += (bio->bi_iter.bi_size - wc->block_size) >> wc->block_size_bits;
+ map_op = WC_MAP_REMAP_ORIGIN;
}
return map_op;
}
-static enum wc_map_op writecache_bio_copy_ssd(struct dm_writecache *wc, struct bio *bio,
- struct wc_entry *e, bool search_used)
+static void writecache_bio_copy_ssd(struct dm_writecache *wc, struct bio *bio,
+ struct wc_entry *e, bool search_used)
{
unsigned bio_size = wc->block_size;
sector_t start_cache_sec = cache_sector(wc, e);
@@ -1412,14 +1412,15 @@ static enum wc_map_op writecache_bio_copy_ssd(struct dm_writecache *wc, struct b
bio->bi_iter.bi_sector = start_cache_sec;
dm_accept_partial_bio(bio, bio_size >> SECTOR_SHIFT);
+ wc->stats.writes += bio->bi_iter.bi_size >> wc->block_size_bits;
+ wc->stats.writes_allocate += (bio->bi_iter.bi_size - wc->block_size) >> wc->block_size_bits;
+
if (unlikely(wc->uncommitted_blocks >= wc->autocommit_blocks)) {
wc->uncommitted_blocks = 0;
queue_work(wc->writeback_wq, &wc->flush_work);
} else {
writecache_schedule_autocommit(wc);
}
-
- return WC_MAP_REMAP;
}
static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio *bio)
@@ -1429,9 +1430,10 @@ static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio
do {
bool found_entry = false;
bool search_used = false;
- wc->stats.writes++;
- if (writecache_has_error(wc))
+ if (writecache_has_error(wc)) {
+ wc->stats.writes += bio->bi_iter.bi_size >> wc->block_size_bits;
return WC_MAP_ERROR;
+ }
e = writecache_find_entry(wc, bio->bi_iter.bi_sector, 0);
if (e) {
if (!writecache_entry_is_committed(wc, e)) {
@@ -1455,9 +1457,11 @@ static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio
if (unlikely(!e)) {
if (!WC_MODE_PMEM(wc) && !found_entry) {
direct_write:
- wc->stats.writes_around++;
e = writecache_find_entry(wc, bio->bi_iter.bi_sector, WFE_RETURN_FOLLOWING);
- return writecache_map_remap_origin(wc, bio, e);
+ writecache_map_remap_origin(wc, bio, e);
+ wc->stats.writes_around += bio->bi_iter.bi_size >> wc->block_size_bits;
+ wc->stats.writes += bio->bi_iter.bi_size >> wc->block_size_bits;
+ return WC_MAP_REMAP_ORIGIN;
}
wc->stats.writes_blocked_on_freelist++;
writecache_wait_on_freelist(wc);
@@ -1468,10 +1472,13 @@ static enum wc_map_op writecache_map_write(struct dm_writecache *wc, struct bio
wc->uncommitted_blocks++;
wc->stats.writes_allocate++;
bio_copy:
- if (WC_MODE_PMEM(wc))
+ if (WC_MODE_PMEM(wc)) {
bio_copy_block(wc, bio, memory_data(wc, e));
- else
- return writecache_bio_copy_ssd(wc, bio, e, search_used);
+ wc->stats.writes++;
+ } else {
+ writecache_bio_copy_ssd(wc, bio, e, search_used);
+ return WC_MAP_REMAP;
+ }
} while (bio->bi_iter.bi_size);
if (unlikely(bio->bi_opf & REQ_FUA || wc->uncommitted_blocks >= wc->autocommit_blocks))
@@ -1506,7 +1513,7 @@ static enum wc_map_op writecache_map_flush(struct dm_writecache *wc, struct bio
static enum wc_map_op writecache_map_discard(struct dm_writecache *wc, struct bio *bio)
{
- wc->stats.discards++;
+ wc->stats.discards += bio->bi_iter.bi_size >> wc->block_size_bits;
if (writecache_has_error(wc))
return WC_MAP_ERROR;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f01d33bc3613..e0e28a3203f8 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2955,6 +2955,11 @@ static int dm_call_pr(struct block_device *bdev, iterate_devices_callout_fn fn,
goto out;
ti = dm_table_get_target(table, 0);
+ if (dm_suspended_md(md)) {
+ ret = -EAGAIN;
+ goto out;
+ }
+
ret = -EINVAL;
if (!ti->type->iterate_devices)
goto out;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index f79cab8c7700..134c12e481a5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6259,11 +6259,11 @@ static void mddev_detach(struct mddev *mddev)
static void __md_stop(struct mddev *mddev)
{
struct md_personality *pers = mddev->pers;
- md_bitmap_destroy(mddev);
mddev_detach(mddev);
/* Ensure ->event_work is done */
if (mddev->event_work.func)
flush_workqueue(md_misc_wq);
+ md_bitmap_destroy(mddev);
spin_lock(&mddev->lock);
mddev->pers = NULL;
spin_unlock(&mddev->lock);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index dfe7d62d3fbd..e9648d643dd1 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2157,9 +2157,12 @@ static int raid10_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
int err = 0;
int number = rdev->raid_disk;
struct md_rdev **rdevp;
- struct raid10_info *p = conf->mirrors + number;
+ struct raid10_info *p;
print_conf(conf);
+ if (unlikely(number >= mddev->raid_disks))
+ return 0;
+ p = conf->mirrors + number;
if (rdev == p->rdev)
rdevp = &p->rdev;
else if (rdev == p->replacement)
diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
index 2b20aa6c37b1..c926e5d43820 100644
--- a/drivers/media/i2c/Kconfig
+++ b/drivers/media/i2c/Kconfig
@@ -1178,6 +1178,7 @@ config VIDEO_ISL7998X
depends on OF_GPIO
select MEDIA_CONTROLLER
select VIDEO_V4L2_SUBDEV_API
+ select V4L2_FWNODE
help
Support for Intersil ISL7998x analog to MIPI-CSI2 or
BT.656 decoder.
diff --git a/drivers/media/pci/sta2x11/Kconfig b/drivers/media/pci/sta2x11/Kconfig
index a96e170ab04e..118b922c08c3 100644
--- a/drivers/media/pci/sta2x11/Kconfig
+++ b/drivers/media/pci/sta2x11/Kconfig
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
config STA2X11_VIP
tristate "STA2X11 VIP Video For Linux"
- depends on PCI && VIDEO_DEV && VIRT_TO_BUS && I2C
+ depends on PCI && VIDEO_DEV && I2C
depends on STA2X11 || COMPILE_TEST
select GPIOLIB if MEDIA_SUBDRV_AUTOSELECT
select VIDEO_ADV7180 if MEDIA_SUBDRV_AUTOSELECT
diff --git a/drivers/media/pci/tw686x/tw686x-core.c b/drivers/media/pci/tw686x/tw686x-core.c
index 6676e069b515..384d38754a4b 100644
--- a/drivers/media/pci/tw686x/tw686x-core.c
+++ b/drivers/media/pci/tw686x/tw686x-core.c
@@ -315,13 +315,6 @@ static int tw686x_probe(struct pci_dev *pci_dev,
spin_lock_init(&dev->lock);
- err = request_irq(pci_dev->irq, tw686x_irq, IRQF_SHARED,
- dev->name, dev);
- if (err < 0) {
- dev_err(&pci_dev->dev, "unable to request interrupt\n");
- goto iounmap;
- }
-
timer_setup(&dev->dma_delay_timer, tw686x_dma_delay, 0);
/*
@@ -333,18 +326,23 @@ static int tw686x_probe(struct pci_dev *pci_dev,
err = tw686x_video_init(dev);
if (err) {
dev_err(&pci_dev->dev, "can't register video\n");
- goto free_irq;
+ goto iounmap;
}
err = tw686x_audio_init(dev);
if (err)
dev_warn(&pci_dev->dev, "can't register audio\n");
+ err = request_irq(pci_dev->irq, tw686x_irq, IRQF_SHARED,
+ dev->name, dev);
+ if (err < 0) {
+ dev_err(&pci_dev->dev, "unable to request interrupt\n");
+ goto iounmap;
+ }
+
pci_set_drvdata(pci_dev, dev);
return 0;
-free_irq:
- free_irq(pci_dev->irq, dev);
iounmap:
pci_iounmap(pci_dev, dev->mmio);
free_region:
diff --git a/drivers/media/pci/tw686x/tw686x-video.c b/drivers/media/pci/tw686x/tw686x-video.c
index b227e9e78ebd..37a20fe24241 100644
--- a/drivers/media/pci/tw686x/tw686x-video.c
+++ b/drivers/media/pci/tw686x/tw686x-video.c
@@ -1282,8 +1282,10 @@ int tw686x_video_init(struct tw686x_dev *dev)
video_set_drvdata(vdev, vc);
err = video_register_device(vdev, VFL_TYPE_VIDEO, -1);
- if (err < 0)
+ if (err < 0) {
+ video_device_release(vdev);
goto error;
+ }
vc->num = vdev->num;
}
diff --git a/drivers/media/platform/amphion/vdec.c b/drivers/media/platform/amphion/vdec.c
index c0dfede11ab7..36450a2e85c9 100644
--- a/drivers/media/platform/amphion/vdec.c
+++ b/drivers/media/platform/amphion/vdec.c
@@ -26,8 +26,8 @@
#include "vpu_cmds.h"
#include "vpu_rpc.h"
-#define VDEC_FRAME_DEPTH 256
#define VDEC_MIN_BUFFER_CAP 8
+#define VDEC_MIN_BUFFER_OUT 8
struct vdec_fs_info {
char name[8];
@@ -63,8 +63,7 @@ struct vdec_t {
bool is_source_changed;
u32 source_change;
u32 drain;
- u32 ts_pre_count;
- u32 frame_depth;
+ bool aborting;
};
static const struct vpu_format vdec_formats[] = {
@@ -106,7 +105,6 @@ static const struct vpu_format vdec_formats[] = {
.pixfmt = V4L2_PIX_FMT_VC1_ANNEX_L,
.num_planes = 1,
.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
- .flags = V4L2_FMT_FLAG_DYN_RESOLUTION
},
{
.pixfmt = V4L2_PIX_FMT_MPEG2,
@@ -174,16 +172,6 @@ static int vdec_ctrl_init(struct vpu_inst *inst)
return 0;
}
-static void vdec_set_last_buffer_dequeued(struct vpu_inst *inst)
-{
- struct vdec_t *vdec = inst->priv;
-
- if (vdec->eos_received) {
- if (!vpu_set_last_buffer_dequeued(inst))
- vdec->eos_received--;
- }
-}
-
static void vdec_handle_resolution_change(struct vpu_inst *inst)
{
struct vdec_t *vdec = inst->priv;
@@ -230,6 +218,21 @@ static int vdec_update_state(struct vpu_inst *inst, enum vpu_codec_state state,
return 0;
}
+static void vdec_set_last_buffer_dequeued(struct vpu_inst *inst)
+{
+ struct vdec_t *vdec = inst->priv;
+
+ if (inst->state == VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE)
+ return;
+
+ if (vdec->eos_received) {
+ if (!vpu_set_last_buffer_dequeued(inst)) {
+ vdec->eos_received--;
+ vdec_update_state(inst, VPU_CODEC_STATE_DRAIN, 0);
+ }
+ }
+}
+
static int vdec_querycap(struct file *file, void *fh, struct v4l2_capability *cap)
{
strscpy(cap->driver, "amphion-vpu", sizeof(cap->driver));
@@ -470,7 +473,7 @@ static int vdec_drain(struct vpu_inst *inst)
if (!vdec->drain)
return 0;
- if (v4l2_m2m_num_src_bufs_ready(inst->fh.m2m_ctx))
+ if (!vpu_is_source_empty(inst))
return 0;
if (!vdec->params.frame_count) {
@@ -489,6 +492,8 @@ static int vdec_drain(struct vpu_inst *inst)
static int vdec_cmd_start(struct vpu_inst *inst)
{
+ struct vdec_t *vdec = inst->priv;
+
switch (inst->state) {
case VPU_CODEC_STATE_STARTED:
case VPU_CODEC_STATE_DRAIN:
@@ -499,6 +504,8 @@ static int vdec_cmd_start(struct vpu_inst *inst)
break;
}
vpu_process_capture_buffer(inst);
+ if (vdec->eos_received)
+ vdec_set_last_buffer_dequeued(inst);
return 0;
}
@@ -589,11 +596,8 @@ static bool vdec_check_ready(struct vpu_inst *inst, unsigned int type)
{
struct vdec_t *vdec = inst->priv;
- if (V4L2_TYPE_IS_OUTPUT(type)) {
- if (vdec->ts_pre_count >= vdec->frame_depth)
- return false;
+ if (V4L2_TYPE_IS_OUTPUT(type))
return true;
- }
if (vdec->req_frame_count)
return true;
@@ -601,12 +605,21 @@ static bool vdec_check_ready(struct vpu_inst *inst, unsigned int type)
return false;
}
+static struct vb2_v4l2_buffer *vdec_get_src_buffer(struct vpu_inst *inst, u32 count)
+{
+ if (count > 1)
+ vpu_skip_frame(inst, count - 1);
+
+ return vpu_next_src_buf(inst);
+}
+
static int vdec_frame_decoded(struct vpu_inst *inst, void *arg)
{
struct vdec_t *vdec = inst->priv;
struct vpu_dec_pic_info *info = arg;
struct vpu_vb2_buffer *vpu_buf;
struct vb2_v4l2_buffer *vbuf;
+ struct vb2_v4l2_buffer *src_buf;
int ret = 0;
if (!info || info->id >= ARRAY_SIZE(vdec->slots))
@@ -620,14 +633,21 @@ static int vdec_frame_decoded(struct vpu_inst *inst, void *arg)
goto exit;
}
vbuf = &vpu_buf->m2m_buf.vb;
+ src_buf = vdec_get_src_buffer(inst, info->consumed_count);
+ if (src_buf) {
+ v4l2_m2m_buf_copy_metadata(src_buf, vbuf, true);
+ if (info->consumed_count) {
+ v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx);
+ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE);
+ v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+ } else {
+ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_DECODED);
+ }
+ }
if (vpu_get_buffer_state(vbuf) == VPU_BUF_STATE_DECODED)
dev_info(inst->dev, "[%d] buf[%d] has been decoded\n", inst->id, info->id);
vpu_set_buffer_state(vbuf, VPU_BUF_STATE_DECODED);
vdec->decoded_frame_count++;
- if (vdec->ts_pre_count >= info->consumed_count)
- vdec->ts_pre_count -= info->consumed_count;
- else
- vdec->ts_pre_count = 0;
exit:
vpu_inst_unlock(inst);
@@ -683,10 +703,9 @@ static void vdec_buf_done(struct vpu_inst *inst, struct vpu_frame_info *frame)
vpu_set_buffer_state(vbuf, VPU_BUF_STATE_READY);
vb2_set_plane_payload(&vbuf->vb2_buf, 0, inst->cap_format.sizeimage[0]);
vb2_set_plane_payload(&vbuf->vb2_buf, 1, inst->cap_format.sizeimage[1]);
- vbuf->vb2_buf.timestamp = frame->timestamp;
vbuf->field = inst->cap_format.field;
vbuf->sequence = sequence;
- dev_dbg(inst->dev, "[%d][OUTPUT TS]%32lld\n", inst->id, frame->timestamp);
+ dev_dbg(inst->dev, "[%d][OUTPUT TS]%32lld\n", inst->id, vbuf->vb2_buf.timestamp);
v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
vpu_inst_lock(inst);
@@ -708,7 +727,6 @@ static void vdec_stop_done(struct vpu_inst *inst)
vdec->fixed_fmt = false;
vdec->params.end_flag = 0;
vdec->drain = 0;
- vdec->ts_pre_count = 0;
vdec->params.frame_count = 0;
vdec->decoded_frame_count = 0;
vdec->display_frame_count = 0;
@@ -716,6 +734,7 @@ static void vdec_stop_done(struct vpu_inst *inst)
vdec->eos_received = 0;
vdec->is_source_changed = false;
vdec->source_change = 0;
+ inst->total_input_count = 0;
vpu_inst_unlock(inst);
}
@@ -924,6 +943,9 @@ static int vdec_response_frame(struct vpu_inst *inst, struct vb2_v4l2_buffer *vb
if (inst->state != VPU_CODEC_STATE_ACTIVE)
return -EINVAL;
+ if (vdec->aborting)
+ return -EINVAL;
+
if (!vdec->req_frame_count)
return -EINVAL;
@@ -1033,6 +1055,8 @@ static void vdec_clear_slots(struct vpu_inst *inst)
vpu_buf = vdec->slots[i];
vbuf = &vpu_buf->m2m_buf.vb;
+ vpu_trace(inst->dev, "clear slot %d\n", i);
+ vdec_response_fs_release(inst, i, vpu_buf->tag);
vdec_recycle_buffer(inst, vbuf);
vdec->slots[i]->state = VPU_BUF_STATE_IDLE;
vdec->slots[i] = NULL;
@@ -1188,7 +1212,6 @@ static void vdec_event_eos(struct vpu_inst *inst)
vdec->eos_received++;
vdec->fixed_fmt = false;
inst->min_buffer_cap = VDEC_MIN_BUFFER_CAP;
- vdec_update_state(inst, VPU_CODEC_STATE_DRAIN, 0);
vdec_set_last_buffer_dequeued(inst);
vpu_inst_unlock(inst);
}
@@ -1244,18 +1267,14 @@ static int vdec_process_output(struct vpu_inst *inst, struct vb2_buffer *vb)
if (free_space < vb2_get_plane_payload(vb, 0) + 0x40000)
return -ENOMEM;
+ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_INUSE);
ret = vpu_iface_input_frame(inst, vb);
if (ret < 0)
return -ENOMEM;
dev_dbg(inst->dev, "[%d][INPUT TS]%32lld\n", inst->id, vb->timestamp);
- vdec->ts_pre_count++;
vdec->params.frame_count++;
- v4l2_m2m_src_buf_remove_by_buf(inst->fh.m2m_ctx, vbuf);
- vpu_set_buffer_state(vbuf, VPU_BUF_STATE_IDLE);
- v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
-
if (vdec->drain)
vdec_drain(inst);
@@ -1299,6 +1318,8 @@ static void vdec_abort(struct vpu_inst *inst)
int ret;
vpu_trace(inst->dev, "[%d] state = %d\n", inst->id, inst->state);
+
+ vdec->aborting = true;
vpu_iface_add_scode(inst, SCODE_PADDING_ABORT);
vdec->params.end_flag = 1;
vpu_iface_set_decode_params(inst, &vdec->params, 1);
@@ -1318,11 +1339,11 @@ static void vdec_abort(struct vpu_inst *inst)
vdec->sequence);
vdec->params.end_flag = 0;
vdec->drain = 0;
- vdec->ts_pre_count = 0;
vdec->params.frame_count = 0;
vdec->decoded_frame_count = 0;
vdec->display_frame_count = 0;
vdec->sequence = 0;
+ vdec->aborting = false;
}
static void vdec_stop(struct vpu_inst *inst, bool free)
@@ -1470,10 +1491,10 @@ static int vdec_stop_session(struct vpu_inst *inst, u32 type)
vdec_update_state(inst, VPU_CODEC_STATE_SEEK, 0);
vdec->drain = 0;
} else {
- if (inst->state != VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE)
+ if (inst->state != VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE) {
vdec_abort(inst);
-
- vdec->eos_received = 0;
+ vdec->eos_received = 0;
+ }
vdec_clear_slots(inst);
}
@@ -1525,10 +1546,6 @@ static int vdec_get_debug_info(struct vpu_inst *inst, char *str, u32 size, u32 i
vdec->drain, vdec->eos_received, vdec->source_change);
break;
case 8:
- num = scnprintf(str, size, "ts_pre_count = %d, frame_depth = %d\n",
- vdec->ts_pre_count, vdec->frame_depth);
- break;
- case 9:
num = scnprintf(str, size, "fps = %d/%d\n",
vdec->codec_info.frame_rate.numerator,
vdec->codec_info.frame_rate.denominator);
@@ -1562,12 +1579,8 @@ static struct vpu_inst_ops vdec_inst_ops = {
static void vdec_init(struct file *file)
{
struct vpu_inst *inst = to_inst(file);
- struct vdec_t *vdec;
struct v4l2_format f;
- vdec = inst->priv;
- vdec->frame_depth = VDEC_FRAME_DEPTH;
-
memset(&f, 0, sizeof(f));
f.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
f.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_H264;
@@ -1612,36 +1625,18 @@ static int vdec_open(struct file *file)
vdec->fixed_fmt = false;
inst->min_buffer_cap = VDEC_MIN_BUFFER_CAP;
+ inst->min_buffer_out = VDEC_MIN_BUFFER_OUT;
vdec_init(file);
return 0;
}
-static __poll_t vdec_poll(struct file *file, poll_table *wait)
-{
- struct vpu_inst *inst = to_inst(file);
- struct vb2_queue *src_q, *dst_q;
- __poll_t ret;
-
- ret = v4l2_m2m_fop_poll(file, wait);
- src_q = v4l2_m2m_get_src_vq(inst->fh.m2m_ctx);
- dst_q = v4l2_m2m_get_dst_vq(inst->fh.m2m_ctx);
- if (vb2_is_streaming(src_q) && !vb2_is_streaming(dst_q))
- ret &= (~EPOLLERR);
- if (!src_q->error && !dst_q->error &&
- (vb2_is_streaming(src_q) && list_empty(&src_q->queued_list)) &&
- (vb2_is_streaming(dst_q) && list_empty(&dst_q->queued_list)))
- ret &= (~EPOLLERR);
-
- return ret;
-}
-
static const struct v4l2_file_operations vdec_fops = {
.owner = THIS_MODULE,
.open = vdec_open,
.release = vpu_v4l2_close,
.unlocked_ioctl = video_ioctl2,
- .poll = vdec_poll,
+ .poll = v4l2_m2m_fop_poll,
.mmap = v4l2_m2m_fop_mmap,
};
diff --git a/drivers/media/platform/amphion/vpu.h b/drivers/media/platform/amphion/vpu.h
index e56b96a7e5d3..f914de6ed81e 100644
--- a/drivers/media/platform/amphion/vpu.h
+++ b/drivers/media/platform/amphion/vpu.h
@@ -258,6 +258,7 @@ struct vpu_inst {
struct vpu_format cap_format;
u32 min_buffer_cap;
u32 min_buffer_out;
+ u32 total_input_count;
struct v4l2_rect crop;
u32 colorspace;
diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
index 68ad183925fd..51a764713159 100644
--- a/drivers/media/platform/amphion/vpu_core.c
+++ b/drivers/media/platform/amphion/vpu_core.c
@@ -455,8 +455,13 @@ int vpu_inst_unregister(struct vpu_inst *inst)
}
vpu_core_check_hang(core);
if (core->state == VPU_CORE_HANG && !core->instance_mask) {
+ int err;
+
dev_info(core->dev, "reset hang core\n");
- if (!vpu_core_sw_reset(core)) {
+ mutex_unlock(&core->lock);
+ err = vpu_core_sw_reset(core);
+ mutex_lock(&core->lock);
+ if (!err) {
core->state = VPU_CORE_ACTIVE;
core->hang_mask = 0;
}
diff --git a/drivers/media/platform/amphion/vpu_malone.c b/drivers/media/platform/amphion/vpu_malone.c
index 446a9de0cc11..7f591805c48c 100644
--- a/drivers/media/platform/amphion/vpu_malone.c
+++ b/drivers/media/platform/amphion/vpu_malone.c
@@ -609,6 +609,8 @@ static int vpu_malone_set_params(struct vpu_shared_addr *shared,
enum vpu_malone_format malone_format;
malone_format = vpu_malone_format_remap(params->codec_format);
+ if (WARN_ON(malone_format == MALONE_FMT_NULL))
+ return -EINVAL;
iface->udata_buffer[instance].base = params->udata.base;
iface->udata_buffer[instance].slot_size = params->udata.size;
@@ -1294,6 +1296,8 @@ static int vpu_malone_insert_scode_vc1_l_seq(struct malone_scode_t *scode)
int size = 0;
u8 rcv_seqhdr[MALONE_VC1_RCV_SEQ_HEADER_LEN];
+ if (scode->inst->total_input_count)
+ return 0;
scode->need_data = 0;
ret = vpu_malone_insert_scode_seq(scode, MALONE_CODEC_ID_VC1_SIMPLE, sizeof(rcv_seqhdr));
@@ -1556,7 +1560,7 @@ int vpu_malone_input_frame(struct vpu_shared_addr *shared,
* merge the data to next frame
*/
vbuf = to_vb2_v4l2_buffer(vb);
- if (vpu_vb_is_codecconfig(vbuf) && (s64)vb->timestamp < 0) {
+ if (vpu_vb_is_codecconfig(vbuf)) {
inst->extra_size += size;
return 0;
}
diff --git a/drivers/media/platform/amphion/vpu_msgs.c b/drivers/media/platform/amphion/vpu_msgs.c
index 58502c51ddb3..077644bc1d7c 100644
--- a/drivers/media/platform/amphion/vpu_msgs.c
+++ b/drivers/media/platform/amphion/vpu_msgs.c
@@ -150,7 +150,12 @@ static void vpu_session_handle_eos(struct vpu_inst *inst, struct vpu_rpc_event *
static void vpu_session_handle_error(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
{
- dev_err(inst->dev, "unsupported stream\n");
+ char *str = (char *)pkt->data;
+
+ if (strlen(str))
+ dev_err(inst->dev, "instance %d firmware error : %s\n", inst->id, str);
+ else
+ dev_err(inst->dev, "instance %d is unsupported stream\n", inst->id);
call_void_vop(inst, event_notify, VPU_MSG_ID_UNSUPPORTED, NULL);
vpu_v4l2_set_error(inst);
}
diff --git a/drivers/media/platform/amphion/vpu_rpc.h b/drivers/media/platform/amphion/vpu_rpc.h
index 25119e5e807e..7eb6f01e6ab5 100644
--- a/drivers/media/platform/amphion/vpu_rpc.h
+++ b/drivers/media/platform/amphion/vpu_rpc.h
@@ -312,11 +312,16 @@ static inline int vpu_iface_input_frame(struct vpu_inst *inst,
struct vb2_buffer *vb)
{
struct vpu_iface_ops *ops = vpu_core_get_iface(inst->core);
+ int ret;
if (!ops || !ops->input_frame)
return -EINVAL;
- return ops->input_frame(inst->core->iface, inst, vb);
+ ret = ops->input_frame(inst->core->iface, inst, vb);
+ if (ret < 0)
+ return ret;
+ inst->total_input_count++;
+ return ret;
}
static inline int vpu_iface_config_memory_resource(struct vpu_inst *inst,
diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
index 9c0704cd5766..0f3726c80967 100644
--- a/drivers/media/platform/amphion/vpu_v4l2.c
+++ b/drivers/media/platform/amphion/vpu_v4l2.c
@@ -127,6 +127,19 @@ int vpu_set_last_buffer_dequeued(struct vpu_inst *inst)
return 0;
}
+bool vpu_is_source_empty(struct vpu_inst *inst)
+{
+ struct v4l2_m2m_buffer *buf = NULL;
+
+ if (!inst->fh.m2m_ctx)
+ return true;
+ v4l2_m2m_for_each_src_buf(inst->fh.m2m_ctx, buf) {
+ if (vpu_get_buffer_state(&buf->vb) == VPU_BUF_STATE_IDLE)
+ return false;
+ }
+ return true;
+}
+
const struct vpu_format *vpu_try_fmt_common(struct vpu_inst *inst, struct v4l2_format *f)
{
struct v4l2_pix_format_mplane *pixmp = &f->fmt.pix_mp;
@@ -234,6 +247,49 @@ int vpu_process_capture_buffer(struct vpu_inst *inst)
return call_vop(inst, process_capture, &vbuf->vb2_buf);
}
+struct vb2_v4l2_buffer *vpu_next_src_buf(struct vpu_inst *inst)
+{
+ struct vb2_v4l2_buffer *src_buf = v4l2_m2m_next_src_buf(inst->fh.m2m_ctx);
+
+ if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE)
+ return NULL;
+
+ while (vpu_vb_is_codecconfig(src_buf)) {
+ v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx);
+ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE);
+ v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
+
+ src_buf = v4l2_m2m_next_src_buf(inst->fh.m2m_ctx);
+ if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE)
+ return NULL;
+ }
+
+ return src_buf;
+}
+
+void vpu_skip_frame(struct vpu_inst *inst, int count)
+{
+ struct vb2_v4l2_buffer *src_buf;
+ enum vb2_buffer_state state;
+ int i = 0;
+
+ if (count <= 0)
+ return;
+
+ while (i < count) {
+ src_buf = v4l2_m2m_src_buf_remove(inst->fh.m2m_ctx);
+ if (!src_buf || vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_IDLE)
+ return;
+ if (vpu_get_buffer_state(src_buf) == VPU_BUF_STATE_DECODED)
+ state = VB2_BUF_STATE_DONE;
+ else
+ state = VB2_BUF_STATE_ERROR;
+ i++;
+ vpu_set_buffer_state(src_buf, VPU_BUF_STATE_IDLE);
+ v4l2_m2m_buf_done(src_buf, state);
+ }
+}
+
struct vb2_v4l2_buffer *vpu_find_buf_by_sequence(struct vpu_inst *inst, u32 type, u32 sequence)
{
struct v4l2_m2m_buffer *buf = NULL;
@@ -440,10 +496,12 @@ static int vpu_vb2_start_streaming(struct vb2_queue *q, unsigned int count)
fmt->sizeimage[1], fmt->bytesperline[1],
fmt->sizeimage[2], fmt->bytesperline[2],
q->num_buffers);
- call_void_vop(inst, start, q->type);
vb2_clear_last_buffer_dequeued(q);
+ ret = call_vop(inst, start, q->type);
+ if (ret)
+ vpu_vb2_buffers_return(inst, q->type, VB2_BUF_STATE_QUEUED);
- return 0;
+ return ret;
}
static void vpu_vb2_stop_streaming(struct vb2_queue *q)
diff --git a/drivers/media/platform/amphion/vpu_v4l2.h b/drivers/media/platform/amphion/vpu_v4l2.h
index 90fa7ea67495..795ca33a6a50 100644
--- a/drivers/media/platform/amphion/vpu_v4l2.h
+++ b/drivers/media/platform/amphion/vpu_v4l2.h
@@ -19,6 +19,8 @@ int vpu_v4l2_close(struct file *file);
const struct vpu_format *vpu_try_fmt_common(struct vpu_inst *inst, struct v4l2_format *f);
int vpu_process_output_buffer(struct vpu_inst *inst);
int vpu_process_capture_buffer(struct vpu_inst *inst);
+struct vb2_v4l2_buffer *vpu_next_src_buf(struct vpu_inst *inst);
+void vpu_skip_frame(struct vpu_inst *inst, int count);
struct vb2_v4l2_buffer *vpu_find_buf_by_sequence(struct vpu_inst *inst, u32 type, u32 sequence);
struct vb2_v4l2_buffer *vpu_find_buf_by_idx(struct vpu_inst *inst, u32 type, u32 idx);
void vpu_v4l2_set_error(struct vpu_inst *inst);
@@ -27,6 +29,7 @@ int vpu_notify_source_change(struct vpu_inst *inst);
int vpu_set_last_buffer_dequeued(struct vpu_inst *inst);
void vpu_vb2_buffers_return(struct vpu_inst *inst, unsigned int type, enum vb2_buffer_state state);
int vpu_get_num_buffers(struct vpu_inst *inst, u32 type);
+bool vpu_is_source_empty(struct vpu_inst *inst);
dma_addr_t vpu_get_vb_phy_addr(struct vb2_buffer *vb, u32 plane_no);
unsigned int vpu_get_vb_length(struct vb2_buffer *vb, u32 plane_no);
diff --git a/drivers/media/platform/atmel/atmel-sama7g5-isc.c b/drivers/media/platform/atmel/atmel-sama7g5-isc.c
index 07a80b08bc54..e292a8d790a3 100644
--- a/drivers/media/platform/atmel/atmel-sama7g5-isc.c
+++ b/drivers/media/platform/atmel/atmel-sama7g5-isc.c
@@ -612,11 +612,13 @@ static const struct dev_pm_ops microchip_xisc_dev_pm_ops = {
SET_RUNTIME_PM_OPS(xisc_runtime_suspend, xisc_runtime_resume, NULL)
};
+#if IS_ENABLED(CONFIG_OF)
static const struct of_device_id microchip_xisc_of_match[] = {
{ .compatible = "microchip,sama7g5-isc" },
{ }
};
MODULE_DEVICE_TABLE(of, microchip_xisc_of_match);
+#endif
static struct platform_driver microchip_xisc_driver = {
.probe = microchip_xisc_probe,
diff --git a/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h b/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h
index 2cb8cecb3077..b810c96695c8 100644
--- a/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h
+++ b/drivers/media/platform/mediatek/mdp/mtk_mdp_ipi.h
@@ -40,12 +40,14 @@ struct mdp_ipi_init {
* @ipi_id : IPI_MDP
* @ap_inst : AP mtk_mdp_vpu address
* @vpu_inst_addr : VPU MDP instance address
+ * @padding : Alignment padding
*/
struct mdp_ipi_comm {
uint32_t msg_id;
uint32_t ipi_id;
uint64_t ap_inst;
uint32_t vpu_inst_addr;
+ uint32_t padding;
};
/**
diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
index c8ee5e2b4f69..54489f64533e 100644
--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec.c
@@ -112,8 +112,6 @@ void mtk_vcodec_dec_set_default_params(struct mtk_vcodec_ctx *ctx)
{
struct mtk_q_data *q_data;
- ctx->dev->vdec_pdata->init_vdec_params(ctx);
-
ctx->m2m_ctx->q_lock = &ctx->dev->dev_mutex;
ctx->fh.m2m_ctx = ctx->m2m_ctx;
ctx->fh.ctrl_handler = &ctx->ctrl_hdl;
@@ -196,6 +194,11 @@ static int vidioc_vdec_querycap(struct file *file, void *priv,
static int vidioc_vdec_subscribe_evt(struct v4l2_fh *fh,
const struct v4l2_event_subscription *sub)
{
+ struct mtk_vcodec_ctx *ctx = fh_to_ctx(fh);
+
+ if (ctx->dev->vdec_pdata->uses_stateless_api)
+ return v4l2_ctrl_subscribe_event(fh, sub);
+
switch (sub->type) {
case V4L2_EVENT_EOS:
return v4l2_event_subscribe(fh, sub, 2, NULL);
diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
index fe7b2f1739b1..ff8f2a4829a3 100644
--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_dec_drv.c
@@ -211,9 +211,12 @@ static int fops_vcodec_open(struct file *file)
dev->dec_capability =
mtk_vcodec_fw_get_vdec_capa(dev->fw_handler);
+
mtk_v4l2_debug(0, "decoder capability %x", dev->dec_capability);
}
+ ctx->dev->vdec_pdata->init_vdec_params(ctx);
+
list_add(&ctx->list, &dev->ctx_list);
mutex_unlock(&dev->dev_mutex);
@@ -391,6 +394,8 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
mtk_v4l2_err("Main device of_platform_populate failed.");
goto err_reg_cont;
}
+ } else {
+ set_bit(MTK_VDEC_CORE, dev->subdev_bitmap);
}
ret = video_register_device(vfd_dec, VFL_TYPE_VIDEO, -1);
diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c
index 29c604b1b179..718b7b08f93e 100644
--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c
+++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.c
@@ -79,6 +79,11 @@ void mxc_jpeg_enable_irq(void __iomem *reg, int slot)
writel(0xFFFFFFFF, reg + MXC_SLOT_OFFSET(slot, SLOT_IRQ_EN));
}
+void mxc_jpeg_disable_irq(void __iomem *reg, int slot)
+{
+ writel(0x0, reg + MXC_SLOT_OFFSET(slot, SLOT_IRQ_EN));
+}
+
void mxc_jpeg_sw_reset(void __iomem *reg)
{
/*
diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
index ae70d3a0dc24..bf4e1973a066 100644
--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
+++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
@@ -53,10 +53,10 @@
#define CAST_REC_REGS_SEL CAST_STATUS4
#define CAST_LUMTH CAST_STATUS5
#define CAST_CHRTH CAST_STATUS6
-#define CAST_NOMFRSIZE_LO CAST_STATUS7
-#define CAST_NOMFRSIZE_HI CAST_STATUS8
-#define CAST_OFBSIZE_LO CAST_STATUS9
-#define CAST_OFBSIZE_HI CAST_STATUS10
+#define CAST_NOMFRSIZE_LO CAST_STATUS16
+#define CAST_NOMFRSIZE_HI CAST_STATUS17
+#define CAST_OFBSIZE_LO CAST_STATUS18
+#define CAST_OFBSIZE_HI CAST_STATUS19
#define MXC_MAX_SLOTS 1 /* TODO use all 4 slots*/
/* JPEG-Decoder Wrapper Slot Registers 0..3 */
@@ -125,6 +125,7 @@ u32 mxc_jpeg_get_offset(void __iomem *reg, int slot);
void mxc_jpeg_enable_slot(void __iomem *reg, int slot);
void mxc_jpeg_set_l_endian(void __iomem *reg, int le);
void mxc_jpeg_enable_irq(void __iomem *reg, int slot);
+void mxc_jpeg_disable_irq(void __iomem *reg, int slot);
int mxc_jpeg_set_input(void __iomem *reg, u32 in_buf, u32 bufsize);
int mxc_jpeg_set_output(void __iomem *reg, u16 out_pitch, u32 out_buf,
u16 w, u16 h);
diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
index d1ec1f4b506b..55c4b29d3e4e 100644
--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
+++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
@@ -82,6 +82,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
.h_align = 3,
.v_align = 3,
.flags = MXC_JPEG_FMT_TYPE_RAW,
+ .precision = 8,
},
{
.name = "ARGB", /* ARGBARGB packed format */
@@ -93,6 +94,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
.h_align = 3,
.v_align = 3,
.flags = MXC_JPEG_FMT_TYPE_RAW,
+ .precision = 8,
},
{
.name = "YUV420", /* 1st plane = Y, 2nd plane = UV */
@@ -104,6 +106,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
.h_align = 4,
.v_align = 4,
.flags = MXC_JPEG_FMT_TYPE_RAW,
+ .precision = 8,
},
{
.name = "YUV422", /* YUYV */
@@ -115,6 +118,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
.h_align = 4,
.v_align = 3,
.flags = MXC_JPEG_FMT_TYPE_RAW,
+ .precision = 8,
},
{
.name = "YUV444", /* YUVYUV */
@@ -126,6 +130,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
.h_align = 3,
.v_align = 3,
.flags = MXC_JPEG_FMT_TYPE_RAW,
+ .precision = 8,
},
{
.name = "Gray", /* Gray (Y8/Y12) or Single Comp */
@@ -137,6 +142,7 @@ static const struct mxc_jpeg_fmt mxc_formats[] = {
.h_align = 3,
.v_align = 3,
.flags = MXC_JPEG_FMT_TYPE_RAW,
+ .precision = 8,
},
};
@@ -309,6 +315,9 @@ struct mxc_jpeg_src_buf {
/* mxc-jpeg specific */
bool dht_needed;
bool jpeg_parse_error;
+ const struct mxc_jpeg_fmt *fmt;
+ int w;
+ int h;
};
static inline struct mxc_jpeg_src_buf *vb2_to_mxc_buf(struct vb2_buffer *vb)
@@ -321,6 +330,9 @@ static unsigned int debug;
module_param(debug, int, 0644);
MODULE_PARM_DESC(debug, "Debug level (0-3)");
+static void mxc_jpeg_bytesperline(struct mxc_jpeg_q_data *q, u32 precision);
+static void mxc_jpeg_sizeimage(struct mxc_jpeg_q_data *q);
+
static void _bswap16(u16 *a)
{
*a = ((*a & 0x00FF) << 8) | ((*a & 0xFF00) >> 8);
@@ -508,6 +520,7 @@ static bool mxc_jpeg_alloc_slot_data(struct mxc_jpeg_dev *jpeg,
GFP_ATOMIC);
if (!cfg_stm)
goto err;
+ memset(cfg_stm, 0, MXC_JPEG_MAX_CFG_STREAM);
jpeg->slot_data[slot].cfg_stream_vaddr = cfg_stm;
skip_alloc:
@@ -546,6 +559,18 @@ static void mxc_jpeg_free_slot_data(struct mxc_jpeg_dev *jpeg,
jpeg->slot_data[slot].used = false;
}
+static void mxc_jpeg_check_and_set_last_buffer(struct mxc_jpeg_ctx *ctx,
+ struct vb2_v4l2_buffer *src_buf,
+ struct vb2_v4l2_buffer *dst_buf)
+{
+ if (v4l2_m2m_is_last_draining_src_buf(ctx->fh.m2m_ctx, src_buf)) {
+ dst_buf->flags |= V4L2_BUF_FLAG_LAST;
+ v4l2_m2m_mark_stopped(ctx->fh.m2m_ctx);
+ notify_eos(ctx);
+ ctx->header_parsed = false;
+ }
+}
+
static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
{
struct mxc_jpeg_dev *jpeg = priv;
@@ -568,15 +593,8 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
dev_dbg(dev, "Irq %d on slot %d.\n", irq, slot);
ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev);
- if (!ctx) {
- dev_err(dev,
- "Instance released before the end of transaction.\n");
- /* soft reset only resets internal state, not registers */
- mxc_jpeg_sw_reset(reg);
- /* clear all interrupts */
- writel(0xFFFFFFFF, reg + MXC_SLOT_OFFSET(slot, SLOT_STATUS));
+ if (WARN_ON(!ctx))
goto job_unlock;
- }
if (slot != ctx->slot) {
/* TODO investigate when adding multi-instance support */
@@ -620,6 +638,7 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
dev_dbg(dev, "Decoder DHT cfg finished. Start decoding...\n");
goto job_unlock;
}
+
if (jpeg->mode == MXC_JPEG_ENCODE) {
payload = readl(reg + MXC_SLOT_OFFSET(slot, SLOT_BUF_PTR));
vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload);
@@ -647,7 +666,9 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
buf_state = VB2_BUF_STATE_DONE;
buffers_done:
+ mxc_jpeg_disable_irq(reg, ctx->slot);
jpeg->slot_data[slot].used = false; /* unused, but don't free */
+ mxc_jpeg_check_and_set_last_buffer(ctx, src_buf, dst_buf);
v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
v4l2_m2m_buf_done(src_buf, buf_state);
@@ -743,7 +764,13 @@ static unsigned int mxc_jpeg_setup_cfg_stream(void *cfg_stream_vaddr,
u32 fourcc,
u16 w, u16 h)
{
- unsigned int offset = 0;
+ /*
+ * There is a hardware issue that first 128 bytes of configuration data
+ * can't be loaded correctly.
+ * To avoid this issue, we need to write the configuration from
+ * an offset which should be no less than 0x80 (128 bytes).
+ */
+ unsigned int offset = 0x80;
u8 *cfg = (u8 *)cfg_stream_vaddr;
struct mxc_jpeg_sof *sof;
struct mxc_jpeg_sos *sos;
@@ -875,8 +902,8 @@ static void mxc_jpeg_config_enc_desc(struct vb2_buffer *out_buf,
jpeg->slot_data[slot].cfg_stream_size =
mxc_jpeg_setup_cfg_stream(cfg_stream_vaddr,
q_data->fmt->fourcc,
- q_data->w_adjusted,
- q_data->h_adjusted);
+ q_data->w,
+ q_data->h);
/* chain the config descriptor with the encoding descriptor */
cfg_desc->next_descpt_ptr = desc_handle | MXC_NXT_DESCPT_EN;
@@ -916,6 +943,67 @@ static void mxc_jpeg_config_enc_desc(struct vb2_buffer *out_buf,
mxc_jpeg_set_desc(cfg_desc_handle, reg, slot);
}
+static bool mxc_jpeg_source_change(struct mxc_jpeg_ctx *ctx,
+ struct mxc_jpeg_src_buf *jpeg_src_buf)
+{
+ struct device *dev = ctx->mxc_jpeg->dev;
+ struct mxc_jpeg_q_data *q_data_cap;
+
+ if (!jpeg_src_buf->fmt)
+ return false;
+
+ q_data_cap = mxc_jpeg_get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE);
+ if (q_data_cap->fmt != jpeg_src_buf->fmt ||
+ q_data_cap->w != jpeg_src_buf->w ||
+ q_data_cap->h != jpeg_src_buf->h) {
+ dev_dbg(dev, "Detected jpeg res=(%dx%d)->(%dx%d), pixfmt=%c%c%c%c\n",
+ q_data_cap->w, q_data_cap->h,
+ jpeg_src_buf->w, jpeg_src_buf->h,
+ (jpeg_src_buf->fmt->fourcc & 0xff),
+ (jpeg_src_buf->fmt->fourcc >> 8) & 0xff,
+ (jpeg_src_buf->fmt->fourcc >> 16) & 0xff,
+ (jpeg_src_buf->fmt->fourcc >> 24) & 0xff);
+
+ /*
+ * set-up the capture queue with the pixelformat and resolution
+ * detected from the jpeg output stream
+ */
+ q_data_cap->w = jpeg_src_buf->w;
+ q_data_cap->h = jpeg_src_buf->h;
+ q_data_cap->fmt = jpeg_src_buf->fmt;
+ q_data_cap->w_adjusted = q_data_cap->w;
+ q_data_cap->h_adjusted = q_data_cap->h;
+
+ /*
+ * align up the resolution for CAST IP,
+ * but leave the buffer resolution unchanged
+ */
+ v4l_bound_align_image(&q_data_cap->w_adjusted,
+ q_data_cap->w_adjusted, /* adjust up */
+ MXC_JPEG_MAX_WIDTH,
+ q_data_cap->fmt->h_align,
+ &q_data_cap->h_adjusted,
+ q_data_cap->h_adjusted, /* adjust up */
+ MXC_JPEG_MAX_HEIGHT,
+ 0,
+ 0);
+
+ /* setup bytesperline/sizeimage for capture queue */
+ mxc_jpeg_bytesperline(q_data_cap, jpeg_src_buf->fmt->precision);
+ mxc_jpeg_sizeimage(q_data_cap);
+ notify_src_chg(ctx);
+ ctx->source_change = 1;
+ }
+ return ctx->source_change ? true : false;
+}
+
+static int mxc_jpeg_job_ready(void *priv)
+{
+ struct mxc_jpeg_ctx *ctx = priv;
+
+ return ctx->source_change ? 0 : 1;
+}
+
static void mxc_jpeg_device_run(void *priv)
{
struct mxc_jpeg_ctx *ctx = priv;
@@ -954,6 +1042,7 @@ static void mxc_jpeg_device_run(void *priv)
jpeg_src_buf->jpeg_parse_error = true;
}
if (jpeg_src_buf->jpeg_parse_error) {
+ mxc_jpeg_check_and_set_last_buffer(ctx, src_buf, dst_buf);
v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
@@ -963,6 +1052,13 @@ static void mxc_jpeg_device_run(void *priv)
return;
}
+ if (ctx->mxc_jpeg->mode == MXC_JPEG_DECODE) {
+ if (ctx->source_change || mxc_jpeg_source_change(ctx, jpeg_src_buf)) {
+ spin_unlock_irqrestore(&ctx->mxc_jpeg->hw_lock, flags);
+ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+ return;
+ }
+ }
mxc_jpeg_enable(reg);
mxc_jpeg_set_l_endian(reg, 1);
@@ -997,44 +1093,33 @@ static void mxc_jpeg_device_run(void *priv)
spin_unlock_irqrestore(&ctx->mxc_jpeg->hw_lock, flags);
}
-static void mxc_jpeg_set_last_buffer_dequeued(struct mxc_jpeg_ctx *ctx)
-{
- struct vb2_queue *q;
-
- ctx->stopped = 1;
- q = v4l2_m2m_get_dst_vq(ctx->fh.m2m_ctx);
- if (!list_empty(&q->done_list))
- return;
-
- q->last_buffer_dequeued = true;
- wake_up(&q->done_wq);
- ctx->stopped = 0;
-}
-
static int mxc_jpeg_decoder_cmd(struct file *file, void *priv,
struct v4l2_decoder_cmd *cmd)
{
struct v4l2_fh *fh = file->private_data;
struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(fh);
- struct device *dev = ctx->mxc_jpeg->dev;
int ret;
ret = v4l2_m2m_ioctl_try_decoder_cmd(file, fh, cmd);
if (ret < 0)
return ret;
- if (cmd->cmd == V4L2_DEC_CMD_STOP) {
- dev_dbg(dev, "Received V4L2_DEC_CMD_STOP");
- if (v4l2_m2m_num_src_bufs_ready(fh->m2m_ctx) == 0) {
- /* No more src bufs, notify app EOS */
- notify_eos(ctx);
- mxc_jpeg_set_last_buffer_dequeued(ctx);
- } else {
- /* will send EOS later*/
- ctx->stopping = 1;
- }
+ if (!vb2_is_streaming(v4l2_m2m_get_src_vq(fh->m2m_ctx)))
+ return 0;
+
+ ret = v4l2_m2m_ioctl_decoder_cmd(file, priv, cmd);
+ if (ret < 0)
+ return ret;
+
+ if (cmd->cmd == V4L2_DEC_CMD_STOP &&
+ v4l2_m2m_has_stopped(fh->m2m_ctx)) {
+ notify_eos(ctx);
+ ctx->header_parsed = false;
}
+ if (cmd->cmd == V4L2_DEC_CMD_START &&
+ v4l2_m2m_has_stopped(fh->m2m_ctx))
+ vb2_clear_last_buffer_dequeued(&fh->m2m_ctx->cap_q_ctx.q);
return 0;
}
@@ -1043,24 +1128,27 @@ static int mxc_jpeg_encoder_cmd(struct file *file, void *priv,
{
struct v4l2_fh *fh = file->private_data;
struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(fh);
- struct device *dev = ctx->mxc_jpeg->dev;
int ret;
ret = v4l2_m2m_ioctl_try_encoder_cmd(file, fh, cmd);
if (ret < 0)
return ret;
- if (cmd->cmd == V4L2_ENC_CMD_STOP) {
- dev_dbg(dev, "Received V4L2_ENC_CMD_STOP");
- if (v4l2_m2m_num_src_bufs_ready(fh->m2m_ctx) == 0) {
- /* No more src bufs, notify app EOS */
- notify_eos(ctx);
- mxc_jpeg_set_last_buffer_dequeued(ctx);
- } else {
- /* will send EOS later*/
- ctx->stopping = 1;
- }
- }
+ if (!vb2_is_streaming(v4l2_m2m_get_src_vq(fh->m2m_ctx)) ||
+ !vb2_is_streaming(v4l2_m2m_get_dst_vq(fh->m2m_ctx)))
+ return 0;
+
+ ret = v4l2_m2m_ioctl_encoder_cmd(file, fh, cmd);
+ if (ret < 0)
+ return 0;
+
+ if (cmd->cmd == V4L2_ENC_CMD_STOP &&
+ v4l2_m2m_has_stopped(fh->m2m_ctx))
+ notify_eos(ctx);
+
+ if (cmd->cmd == V4L2_ENC_CMD_START &&
+ v4l2_m2m_has_stopped(fh->m2m_ctx))
+ vb2_clear_last_buffer_dequeued(&fh->m2m_ctx->cap_q_ctx.q);
return 0;
}
@@ -1073,16 +1161,28 @@ static int mxc_jpeg_queue_setup(struct vb2_queue *q,
{
struct mxc_jpeg_ctx *ctx = vb2_get_drv_priv(q);
struct mxc_jpeg_q_data *q_data = NULL;
+ struct mxc_jpeg_q_data tmp_q;
int i;
q_data = mxc_jpeg_get_q_data(ctx, q->type);
if (!q_data)
return -EINVAL;
+ tmp_q.fmt = q_data->fmt;
+ tmp_q.w = q_data->w_adjusted;
+ tmp_q.h = q_data->h_adjusted;
+ for (i = 0; i < MXC_JPEG_MAX_PLANES; i++) {
+ tmp_q.bytesperline[i] = q_data->bytesperline[i];
+ tmp_q.sizeimage[i] = q_data->sizeimage[i];
+ }
+ mxc_jpeg_sizeimage(&tmp_q);
+ for (i = 0; i < MXC_JPEG_MAX_PLANES; i++)
+ tmp_q.sizeimage[i] = max(tmp_q.sizeimage[i], q_data->sizeimage[i]);
+
/* Handle CREATE_BUFS situation - *nplanes != 0 */
if (*nplanes) {
for (i = 0; i < *nplanes; i++) {
- if (sizes[i] < q_data->sizeimage[i])
+ if (sizes[i] < tmp_q.sizeimage[i])
return -EINVAL;
}
return 0;
@@ -1091,7 +1191,7 @@ static int mxc_jpeg_queue_setup(struct vb2_queue *q,
/* Handle REQBUFS situation */
*nplanes = q_data->fmt->colplanes;
for (i = 0; i < *nplanes; i++)
- sizes[i] = q_data->sizeimage[i];
+ sizes[i] = tmp_q.sizeimage[i];
return 0;
}
@@ -1102,6 +1202,10 @@ static int mxc_jpeg_start_streaming(struct vb2_queue *q, unsigned int count)
struct mxc_jpeg_q_data *q_data = mxc_jpeg_get_q_data(ctx, q->type);
int ret;
+ v4l2_m2m_update_start_streaming_state(ctx->fh.m2m_ctx, q);
+
+ if (ctx->mxc_jpeg->mode == MXC_JPEG_DECODE && V4L2_TYPE_IS_CAPTURE(q->type))
+ ctx->source_change = 0;
dev_dbg(ctx->mxc_jpeg->dev, "Start streaming ctx=%p", ctx);
q_data->sequence = 0;
@@ -1131,11 +1235,15 @@ static void mxc_jpeg_stop_streaming(struct vb2_queue *q)
break;
v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_ERROR);
}
- pm_runtime_put_sync(&ctx->mxc_jpeg->pdev->dev);
- if (V4L2_TYPE_IS_OUTPUT(q->type)) {
- ctx->stopping = 0;
- ctx->stopped = 0;
+
+ v4l2_m2m_update_stop_streaming_state(ctx->fh.m2m_ctx, q);
+ if (V4L2_TYPE_IS_OUTPUT(q->type) &&
+ v4l2_m2m_has_stopped(ctx->fh.m2m_ctx)) {
+ notify_eos(ctx);
+ ctx->header_parsed = false;
}
+
+ pm_runtime_put_sync(&ctx->mxc_jpeg->pdev->dev);
}
static int mxc_jpeg_valid_comp_id(struct device *dev,
@@ -1175,14 +1283,17 @@ static u32 mxc_jpeg_get_image_format(struct device *dev,
for (i = 0; i < MXC_JPEG_NUM_FORMATS; i++)
if (mxc_formats[i].subsampling == header->frame.subsampling &&
- mxc_formats[i].nc == header->frame.num_components) {
+ mxc_formats[i].nc == header->frame.num_components &&
+ mxc_formats[i].precision == header->frame.precision) {
fourcc = mxc_formats[i].fourcc;
break;
}
if (fourcc == 0) {
- dev_err(dev, "Could not identify image format nc=%d, subsampling=%d\n",
+ dev_err(dev,
+ "Could not identify image format nc=%d, subsampling=%d, precision=%d\n",
header->frame.num_components,
- header->frame.subsampling);
+ header->frame.subsampling,
+ header->frame.precision);
return fourcc;
}
/*
@@ -1200,26 +1311,29 @@ static u32 mxc_jpeg_get_image_format(struct device *dev,
return fourcc;
}
-static void mxc_jpeg_bytesperline(struct mxc_jpeg_q_data *q,
- u32 precision)
+static void mxc_jpeg_bytesperline(struct mxc_jpeg_q_data *q, u32 precision)
{
/* Bytes distance between the leftmost pixels in two adjacent lines */
if (q->fmt->fourcc == V4L2_PIX_FMT_JPEG) {
/* bytesperline unused for compressed formats */
q->bytesperline[0] = 0;
q->bytesperline[1] = 0;
- } else if (q->fmt->fourcc == V4L2_PIX_FMT_NV12M) {
+ } else if (q->fmt->subsampling == V4L2_JPEG_CHROMA_SUBSAMPLING_420) {
/* When the image format is planar the bytesperline value
* applies to the first plane and is divided by the same factor
* as the width field for the other planes
*/
- q->bytesperline[0] = q->w * (precision / 8) *
- (q->fmt->depth / 8);
+ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8);
q->bytesperline[1] = q->bytesperline[0];
+ } else if (q->fmt->subsampling == V4L2_JPEG_CHROMA_SUBSAMPLING_422) {
+ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8) * 2;
+ q->bytesperline[1] = 0;
+ } else if (q->fmt->subsampling == V4L2_JPEG_CHROMA_SUBSAMPLING_444) {
+ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8) * q->fmt->nc;
+ q->bytesperline[1] = 0;
} else {
- /* single plane formats */
- q->bytesperline[0] = q->w * (precision / 8) *
- (q->fmt->depth / 8);
+ /* grayscale */
+ q->bytesperline[0] = q->w * DIV_ROUND_UP(precision, 8);
q->bytesperline[1] = 0;
}
}
@@ -1245,17 +1359,17 @@ static void mxc_jpeg_sizeimage(struct mxc_jpeg_q_data *q)
}
}
-static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
- u8 *src_addr, u32 size, bool *dht_needed)
+static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx, struct vb2_buffer *vb)
{
struct device *dev = ctx->mxc_jpeg->dev;
- struct mxc_jpeg_q_data *q_data_out, *q_data_cap;
- enum v4l2_buf_type cap_type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
- bool src_chg = false;
+ struct mxc_jpeg_q_data *q_data_out;
u32 fourcc;
struct v4l2_jpeg_header header;
struct mxc_jpeg_sof *psof = NULL;
struct mxc_jpeg_sos *psos = NULL;
+ struct mxc_jpeg_src_buf *jpeg_src_buf = vb2_to_mxc_buf(vb);
+ u8 *src_addr = (u8 *)vb2_plane_vaddr(vb, 0);
+ u32 size = vb2_get_plane_payload(vb, 0);
int ret;
memset(&header, 0, sizeof(header));
@@ -1266,7 +1380,7 @@ static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
}
/* if DHT marker present, no need to inject default one */
- *dht_needed = (header.num_dht == 0);
+ jpeg_src_buf->dht_needed = (header.num_dht == 0);
q_data_out = mxc_jpeg_get_q_data(ctx,
V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
@@ -1274,21 +1388,15 @@ static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
dev_warn(dev, "Invalid user resolution 0x0");
dev_warn(dev, "Keeping resolution from JPEG: %dx%d",
header.frame.width, header.frame.height);
- q_data_out->w = header.frame.width;
- q_data_out->h = header.frame.height;
} else if (header.frame.width != q_data_out->w ||
header.frame.height != q_data_out->h) {
dev_err(dev,
"Resolution mismatch: %dx%d (JPEG) versus %dx%d(user)",
header.frame.width, header.frame.height,
q_data_out->w, q_data_out->h);
- return -EINVAL;
- }
- if (header.frame.width % 8 != 0 || header.frame.height % 8 != 0) {
- dev_err(dev, "JPEG width or height not multiple of 8: %dx%d\n",
- header.frame.width, header.frame.height);
- return -EINVAL;
}
+ q_data_out->w = header.frame.width;
+ q_data_out->h = header.frame.height;
if (header.frame.width > MXC_JPEG_MAX_WIDTH ||
header.frame.height > MXC_JPEG_MAX_HEIGHT) {
dev_err(dev, "JPEG width or height should be <= 8192: %dx%d\n",
@@ -1316,51 +1424,13 @@ static int mxc_jpeg_parse(struct mxc_jpeg_ctx *ctx,
if (fourcc == 0)
return -EINVAL;
- /*
- * set-up the capture queue with the pixelformat and resolution
- * detected from the jpeg output stream
- */
- q_data_cap = mxc_jpeg_get_q_data(ctx, cap_type);
- if (q_data_cap->w != header.frame.width ||
- q_data_cap->h != header.frame.height)
- src_chg = true;
- q_data_cap->w = header.frame.width;
- q_data_cap->h = header.frame.height;
- q_data_cap->fmt = mxc_jpeg_find_format(ctx, fourcc);
- q_data_cap->w_adjusted = q_data_cap->w;
- q_data_cap->h_adjusted = q_data_cap->h;
- /*
- * align up the resolution for CAST IP,
- * but leave the buffer resolution unchanged
- */
- v4l_bound_align_image(&q_data_cap->w_adjusted,
- q_data_cap->w_adjusted, /* adjust up */
- MXC_JPEG_MAX_WIDTH,
- q_data_cap->fmt->h_align,
- &q_data_cap->h_adjusted,
- q_data_cap->h_adjusted, /* adjust up */
- MXC_JPEG_MAX_HEIGHT,
- q_data_cap->fmt->v_align,
- 0);
- dev_dbg(dev, "Detected jpeg res=(%dx%d)->(%dx%d), pixfmt=%c%c%c%c\n",
- q_data_cap->w, q_data_cap->h,
- q_data_cap->w_adjusted, q_data_cap->h_adjusted,
- (fourcc & 0xff),
- (fourcc >> 8) & 0xff,
- (fourcc >> 16) & 0xff,
- (fourcc >> 24) & 0xff);
-
- /* setup bytesperline/sizeimage for capture queue */
- mxc_jpeg_bytesperline(q_data_cap, header.frame.precision);
- mxc_jpeg_sizeimage(q_data_cap);
+ jpeg_src_buf->fmt = mxc_jpeg_find_format(ctx, fourcc);
+ jpeg_src_buf->w = header.frame.width;
+ jpeg_src_buf->h = header.frame.height;
+ ctx->header_parsed = true;
- /*
- * if the CAPTURE format was updated with new values, regardless of
- * whether they match the values set by the client or not, signal
- * a source change event
- */
- if (src_chg)
- notify_src_chg(ctx);
+ if (!v4l2_m2m_num_src_bufs_ready(ctx->fh.m2m_ctx))
+ mxc_jpeg_source_change(ctx, jpeg_src_buf);
return 0;
}
@@ -1372,6 +1442,20 @@ static void mxc_jpeg_buf_queue(struct vb2_buffer *vb)
struct mxc_jpeg_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
struct mxc_jpeg_src_buf *jpeg_src_buf;
+ if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type) &&
+ vb2_is_streaming(vb->vb2_queue) &&
+ v4l2_m2m_dst_buf_is_last(ctx->fh.m2m_ctx)) {
+ struct mxc_jpeg_q_data *q_data;
+
+ q_data = mxc_jpeg_get_q_data(ctx, vb->vb2_queue->type);
+ vbuf->field = V4L2_FIELD_NONE;
+ vbuf->sequence = q_data->sequence++;
+ v4l2_m2m_last_buffer_done(ctx->fh.m2m_ctx, vbuf);
+ notify_eos(ctx);
+ ctx->header_parsed = false;
+ return;
+ }
+
if (vb->vb2_queue->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
goto end;
@@ -1381,10 +1465,7 @@ static void mxc_jpeg_buf_queue(struct vb2_buffer *vb)
jpeg_src_buf = vb2_to_mxc_buf(vb);
jpeg_src_buf->jpeg_parse_error = false;
- ret = mxc_jpeg_parse(ctx,
- (u8 *)vb2_plane_vaddr(vb, 0),
- vb2_get_plane_payload(vb, 0),
- &jpeg_src_buf->dht_needed);
+ ret = mxc_jpeg_parse(ctx, vb);
if (ret)
jpeg_src_buf->jpeg_parse_error = true;
@@ -1424,23 +1505,11 @@ static int mxc_jpeg_buf_prepare(struct vb2_buffer *vb)
}
vb2_set_plane_payload(vb, i, sizeimage);
}
- return 0;
-}
-
-static void mxc_jpeg_buf_finish(struct vb2_buffer *vb)
-{
- struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
- struct mxc_jpeg_ctx *ctx = vb2_get_drv_priv(vb->vb2_queue);
- struct vb2_queue *q = vb->vb2_queue;
-
- if (V4L2_TYPE_IS_OUTPUT(vb->type))
- return;
- if (!ctx->stopped)
- return;
- if (list_empty(&q->done_list)) {
- vbuf->flags |= V4L2_BUF_FLAG_LAST;
- ctx->stopped = 0;
+ if (V4L2_TYPE_IS_CAPTURE(vb->vb2_queue->type)) {
+ vb2_set_plane_payload(vb, 0, 0);
+ vb2_set_plane_payload(vb, 1, 0);
}
+ return 0;
}
static const struct vb2_ops mxc_jpeg_qops = {
@@ -1449,7 +1518,6 @@ static const struct vb2_ops mxc_jpeg_qops = {
.wait_finish = vb2_ops_wait_finish,
.buf_out_validate = mxc_jpeg_buf_out_validate,
.buf_prepare = mxc_jpeg_buf_prepare,
- .buf_finish = mxc_jpeg_buf_finish,
.start_streaming = mxc_jpeg_start_streaming,
.stop_streaming = mxc_jpeg_stop_streaming,
.buf_queue = mxc_jpeg_buf_queue,
@@ -1510,7 +1578,7 @@ static void mxc_jpeg_set_default_params(struct mxc_jpeg_ctx *ctx)
q[i]->h = MXC_JPEG_DEFAULT_HEIGHT;
q[i]->w_adjusted = MXC_JPEG_DEFAULT_WIDTH;
q[i]->h_adjusted = MXC_JPEG_DEFAULT_HEIGHT;
- mxc_jpeg_bytesperline(q[i], 8);
+ mxc_jpeg_bytesperline(q[i], q[i]->fmt->precision);
mxc_jpeg_sizeimage(q[i]);
}
}
@@ -1585,26 +1653,42 @@ static int mxc_jpeg_enum_fmt_vid_cap(struct file *file, void *priv,
struct v4l2_fmtdesc *f)
{
struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(priv);
+ struct mxc_jpeg_q_data *q_data = mxc_jpeg_get_q_data(ctx, f->type);
- if (ctx->mxc_jpeg->mode == MXC_JPEG_ENCODE)
+ if (ctx->mxc_jpeg->mode == MXC_JPEG_ENCODE) {
return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
MXC_JPEG_FMT_TYPE_ENC);
- else
+ } else if (!ctx->header_parsed) {
return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
MXC_JPEG_FMT_TYPE_RAW);
+ } else {
+ /* For the decoder CAPTURE queue, only enumerate the raw formats
+ * supported for the format currently active on OUTPUT
+ * (more precisely what was propagated on capture queue
+ * after jpeg parse on the output buffer)
+ */
+ if (f->index)
+ return -EINVAL;
+ f->pixelformat = q_data->fmt->fourcc;
+ strscpy(f->description, q_data->fmt->name, sizeof(f->description));
+ return 0;
+ }
}
static int mxc_jpeg_enum_fmt_vid_out(struct file *file, void *priv,
struct v4l2_fmtdesc *f)
{
struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(priv);
+ u32 type = ctx->mxc_jpeg->mode == MXC_JPEG_DECODE ? MXC_JPEG_FMT_TYPE_ENC :
+ MXC_JPEG_FMT_TYPE_RAW;
+ int ret;
+ ret = enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f, type);
+ if (ret)
+ return ret;
if (ctx->mxc_jpeg->mode == MXC_JPEG_DECODE)
- return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
- MXC_JPEG_FMT_TYPE_ENC);
- else
- return enum_fmt(mxc_formats, MXC_JPEG_NUM_FORMATS, f,
- MXC_JPEG_FMT_TYPE_RAW);
+ f->flags = V4L2_FMT_FLAG_DYN_RESOLUTION;
+ return 0;
}
static int mxc_jpeg_try_fmt(struct v4l2_format *f, const struct mxc_jpeg_fmt *fmt,
@@ -1624,22 +1708,17 @@ static int mxc_jpeg_try_fmt(struct v4l2_format *f, const struct mxc_jpeg_fmt *fm
pix_mp->num_planes = fmt->colplanes;
pix_mp->pixelformat = fmt->fourcc;
- /*
- * use MXC_JPEG_H_ALIGN instead of fmt->v_align, for vertical
- * alignment, to loosen up the alignment to multiple of 8,
- * otherwise NV12-1080p fails as 1080 is not a multiple of 16
- */
+ pix_mp->width = w;
+ pix_mp->height = h;
v4l_bound_align_image(&w,
- MXC_JPEG_MIN_WIDTH,
- w, /* adjust downwards*/
+ w, /* adjust upwards*/
+ MXC_JPEG_MAX_WIDTH,
fmt->h_align,
&h,
- MXC_JPEG_MIN_HEIGHT,
- h, /* adjust downwards*/
- MXC_JPEG_H_ALIGN,
+ h, /* adjust upwards*/
+ MXC_JPEG_MAX_HEIGHT,
+ 0,
0);
- pix_mp->width = w; /* negotiate the width */
- pix_mp->height = h; /* negotiate the height */
/* get user input into the tmp_q */
tmp_q.w = w;
@@ -1652,7 +1731,7 @@ static int mxc_jpeg_try_fmt(struct v4l2_format *f, const struct mxc_jpeg_fmt *fm
}
/* calculate bytesperline & sizeimage into the tmp_q */
- mxc_jpeg_bytesperline(&tmp_q, 8);
+ mxc_jpeg_bytesperline(&tmp_q, fmt->precision);
mxc_jpeg_sizeimage(&tmp_q);
/* adjust user format according to our calculations */
@@ -1765,35 +1844,19 @@ static int mxc_jpeg_s_fmt(struct mxc_jpeg_ctx *ctx,
q_data->w_adjusted = q_data->w;
q_data->h_adjusted = q_data->h;
- if (jpeg->mode == MXC_JPEG_DECODE) {
- /*
- * align up the resolution for CAST IP,
- * but leave the buffer resolution unchanged
- */
- v4l_bound_align_image(&q_data->w_adjusted,
- q_data->w_adjusted, /* adjust upwards */
- MXC_JPEG_MAX_WIDTH,
- q_data->fmt->h_align,
- &q_data->h_adjusted,
- q_data->h_adjusted, /* adjust upwards */
- MXC_JPEG_MAX_HEIGHT,
- q_data->fmt->v_align,
- 0);
- } else {
- /*
- * align down the resolution for CAST IP,
- * but leave the buffer resolution unchanged
- */
- v4l_bound_align_image(&q_data->w_adjusted,
- MXC_JPEG_MIN_WIDTH,
- q_data->w_adjusted, /* adjust downwards*/
- q_data->fmt->h_align,
- &q_data->h_adjusted,
- MXC_JPEG_MIN_HEIGHT,
- q_data->h_adjusted, /* adjust downwards*/
- q_data->fmt->v_align,
- 0);
- }
+ /*
+ * align up the resolution for CAST IP,
+ * but leave the buffer resolution unchanged
+ */
+ v4l_bound_align_image(&q_data->w_adjusted,
+ q_data->w_adjusted, /* adjust upwards */
+ MXC_JPEG_MAX_WIDTH,
+ q_data->fmt->h_align,
+ &q_data->h_adjusted,
+ q_data->h_adjusted, /* adjust upwards */
+ MXC_JPEG_MAX_HEIGHT,
+ q_data->fmt->v_align,
+ 0);
for (i = 0; i < pix_mp->num_planes; i++) {
q_data->bytesperline[i] = pix_mp->plane_fmt[i].bytesperline;
@@ -1875,27 +1938,6 @@ static int mxc_jpeg_subscribe_event(struct v4l2_fh *fh,
}
}
-static int mxc_jpeg_dqbuf(struct file *file, void *priv,
- struct v4l2_buffer *buf)
-{
- struct v4l2_fh *fh = file->private_data;
- struct mxc_jpeg_ctx *ctx = mxc_jpeg_fh_to_ctx(priv);
- struct device *dev = ctx->mxc_jpeg->dev;
- int num_src_ready = v4l2_m2m_num_src_bufs_ready(fh->m2m_ctx);
- int ret;
-
- dev_dbg(dev, "DQBUF type=%d, index=%d", buf->type, buf->index);
- if (ctx->stopping == 1 && num_src_ready == 0) {
- /* No more src bufs, notify app EOS */
- notify_eos(ctx);
- ctx->stopping = 0;
- mxc_jpeg_set_last_buffer_dequeued(ctx);
- }
-
- ret = v4l2_m2m_dqbuf(file, fh->m2m_ctx, buf);
- return ret;
-}
-
static const struct v4l2_ioctl_ops mxc_jpeg_ioctl_ops = {
.vidioc_querycap = mxc_jpeg_querycap,
.vidioc_enum_fmt_vid_cap = mxc_jpeg_enum_fmt_vid_cap,
@@ -1919,7 +1961,7 @@ static const struct v4l2_ioctl_ops mxc_jpeg_ioctl_ops = {
.vidioc_encoder_cmd = mxc_jpeg_encoder_cmd,
.vidioc_qbuf = v4l2_m2m_ioctl_qbuf,
- .vidioc_dqbuf = mxc_jpeg_dqbuf,
+ .vidioc_dqbuf = v4l2_m2m_ioctl_dqbuf,
.vidioc_create_bufs = v4l2_m2m_ioctl_create_bufs,
.vidioc_prepare_buf = v4l2_m2m_ioctl_prepare_buf,
@@ -1962,6 +2004,7 @@ static const struct v4l2_file_operations mxc_jpeg_fops = {
};
static const struct v4l2_m2m_ops mxc_jpeg_m2m_ops = {
+ .job_ready = mxc_jpeg_job_ready,
.device_run = mxc_jpeg_device_run,
};
@@ -2078,12 +2121,14 @@ static int mxc_jpeg_probe(struct platform_device *pdev)
jpeg->clk_ipg = devm_clk_get(dev, "ipg");
if (IS_ERR(jpeg->clk_ipg)) {
dev_err(dev, "failed to get clock: ipg\n");
+ ret = PTR_ERR(jpeg->clk_ipg);
goto err_clk;
}
jpeg->clk_per = devm_clk_get(dev, "per");
if (IS_ERR(jpeg->clk_per)) {
dev_err(dev, "failed to get clock: per\n");
+ ret = PTR_ERR(jpeg->clk_per);
goto err_clk;
}
diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
index f53f004ba851..542993eb8d5b 100644
--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
+++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
@@ -49,6 +49,7 @@ enum mxc_jpeg_mode {
* @h_align: horizontal alignment order (align to 2^h_align)
* @v_align: vertical alignment order (align to 2^v_align)
* @flags: flags describing format applicability
+ * @precision: jpeg sample precision
*/
struct mxc_jpeg_fmt {
const char *name;
@@ -60,6 +61,7 @@ struct mxc_jpeg_fmt {
int h_align;
int v_align;
u32 flags;
+ u8 precision;
};
struct mxc_jpeg_desc {
@@ -90,9 +92,9 @@ struct mxc_jpeg_ctx {
struct mxc_jpeg_q_data cap_q;
struct v4l2_fh fh;
enum mxc_jpeg_enc_state enc_state;
- unsigned int stopping;
- unsigned int stopped;
unsigned int slot;
+ unsigned int source_change;
+ bool header_parsed;
};
struct mxc_jpeg_slot_data {
diff --git a/drivers/media/platform/qcom/camss/camss-csid.c b/drivers/media/platform/qcom/camss/camss-csid.c
index f993f349b66b..80628801cf09 100644
--- a/drivers/media/platform/qcom/camss/camss-csid.c
+++ b/drivers/media/platform/qcom/camss/camss-csid.c
@@ -666,7 +666,7 @@ int msm_csid_subdev_init(struct camss *camss, struct csid_device *csid,
if (csid->num_supplies) {
csid->supplies = devm_kmalloc_array(camss->dev,
csid->num_supplies,
- sizeof(csid->supplies),
+ sizeof(*csid->supplies),
GFP_KERNEL);
if (!csid->supplies)
return -ENOMEM;
diff --git a/drivers/media/platform/renesas/rcar-vin/rcar-core.c b/drivers/media/platform/renesas/rcar-vin/rcar-core.c
index 64cb05b3907c..ed3ddf36f906 100644
--- a/drivers/media/platform/renesas/rcar-vin/rcar-core.c
+++ b/drivers/media/platform/renesas/rcar-vin/rcar-core.c
@@ -1264,7 +1264,7 @@ static const struct rvin_info rcar_info_r8a77980 = {
};
static const struct rvin_group_route rcar_info_r8a77990_routes[] = {
- { .master = 0, .csi = RVIN_CSI40, .chsel = 0x03 },
+ { .master = 4, .csi = RVIN_CSI40, .chsel = 0x03 },
{ /* Sentinel */ }
};
diff --git a/drivers/media/usb/hdpvr/hdpvr-video.c b/drivers/media/usb/hdpvr/hdpvr-video.c
index 60e57e0f1927..fd7d2a9d0449 100644
--- a/drivers/media/usb/hdpvr/hdpvr-video.c
+++ b/drivers/media/usb/hdpvr/hdpvr-video.c
@@ -409,7 +409,7 @@ static ssize_t hdpvr_read(struct file *file, char __user *buffer, size_t count,
struct hdpvr_device *dev = video_drvdata(file);
struct hdpvr_buffer *buf = NULL;
struct urb *urb;
- unsigned int ret = 0;
+ int ret = 0;
int rem, cnt;
if (*pos)
diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 675e22895ebe..094e1815209b 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -924,7 +924,7 @@ static __poll_t v4l2_m2m_poll_for_data(struct file *file,
if ((!src_q->streaming || src_q->error ||
list_empty(&src_q->queued_list)) &&
(!dst_q->streaming || dst_q->error ||
- list_empty(&dst_q->queued_list)))
+ (list_empty(&dst_q->queued_list) && !dst_q->last_buffer_dequeued)))
return EPOLLERR;
spin_lock_irqsave(&src_q->done_lock, flags);
diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c
index 3993bdd4b519..f8fdf88fb240 100644
--- a/drivers/memstick/core/ms_block.c
+++ b/drivers/memstick/core/ms_block.c
@@ -1341,17 +1341,17 @@ static int msb_ftl_initialize(struct msb_data *msb)
msb->zone_count = msb->block_count / MS_BLOCKS_IN_ZONE;
msb->logical_block_count = msb->zone_count * 496 - 2;
- msb->used_blocks_bitmap = kzalloc(msb->block_count / 8, GFP_KERNEL);
- msb->erased_blocks_bitmap = kzalloc(msb->block_count / 8, GFP_KERNEL);
+ msb->used_blocks_bitmap = bitmap_zalloc(msb->block_count, GFP_KERNEL);
+ msb->erased_blocks_bitmap = bitmap_zalloc(msb->block_count, GFP_KERNEL);
msb->lba_to_pba_table =
kmalloc_array(msb->logical_block_count, sizeof(u16),
GFP_KERNEL);
if (!msb->used_blocks_bitmap || !msb->lba_to_pba_table ||
!msb->erased_blocks_bitmap) {
- kfree(msb->used_blocks_bitmap);
+ bitmap_free(msb->used_blocks_bitmap);
+ bitmap_free(msb->erased_blocks_bitmap);
kfree(msb->lba_to_pba_table);
- kfree(msb->erased_blocks_bitmap);
return -ENOMEM;
}
@@ -1946,7 +1946,8 @@ static DEFINE_MUTEX(msb_disk_lock); /* protects against races in open/release */
static void msb_data_clear(struct msb_data *msb)
{
kfree(msb->boot_page);
- kfree(msb->used_blocks_bitmap);
+ bitmap_free(msb->used_blocks_bitmap);
+ bitmap_free(msb->erased_blocks_bitmap);
kfree(msb->lba_to_pba_table);
kfree(msb->cache);
msb->card = NULL;
diff --git a/drivers/mfd/max77620.c b/drivers/mfd/max77620.c
index fec2096474ad..a6661e07035b 100644
--- a/drivers/mfd/max77620.c
+++ b/drivers/mfd/max77620.c
@@ -419,9 +419,11 @@ static int max77620_initialise_fps(struct max77620_chip *chip)
ret = max77620_config_fps(chip, fps_child);
if (ret < 0) {
of_node_put(fps_child);
+ of_node_put(fps_np);
return ret;
}
}
+ of_node_put(fps_np);
config = chip->enable_global_lpm ? MAX77620_ONOFFCNFG2_SLP_LPM_MSK : 0;
ret = regmap_update_bits(chip->rmap, MAX77620_REG_ONOFFCNFG2,
diff --git a/drivers/mfd/t7l66xb.c b/drivers/mfd/t7l66xb.c
index 5369c67e3280..663ffd4b8570 100644
--- a/drivers/mfd/t7l66xb.c
+++ b/drivers/mfd/t7l66xb.c
@@ -397,11 +397,8 @@ static int t7l66xb_probe(struct platform_device *dev)
static int t7l66xb_remove(struct platform_device *dev)
{
- struct t7l66xb_platform_data *pdata = dev_get_platdata(&dev->dev);
struct t7l66xb *t7l66xb = platform_get_drvdata(dev);
- int ret;
- ret = pdata->disable(dev);
clk_disable_unprepare(t7l66xb->clk48m);
clk_put(t7l66xb->clk48m);
clk_disable_unprepare(t7l66xb->clk32k);
@@ -412,8 +409,7 @@ static int t7l66xb_remove(struct platform_device *dev)
mfd_remove_devices(&dev->dev);
kfree(t7l66xb);
- return ret;
-
+ return 0;
}
static struct platform_driver t7l66xb_platform_driver = {
diff --git a/drivers/misc/cardreader/rtsx_pcr.c b/drivers/misc/cardreader/rtsx_pcr.c
index 2a2619e3c72c..f001d99bf366 100644
--- a/drivers/misc/cardreader/rtsx_pcr.c
+++ b/drivers/misc/cardreader/rtsx_pcr.c
@@ -1507,7 +1507,7 @@ static int rtsx_pci_probe(struct pci_dev *pcidev,
pcr->remap_addr = ioremap(base, len);
if (!pcr->remap_addr) {
ret = -ENOMEM;
- goto free_handle;
+ goto free_idr;
}
pcr->rtsx_resv_buf = dma_alloc_coherent(&(pcidev->dev),
@@ -1570,6 +1570,10 @@ static int rtsx_pci_probe(struct pci_dev *pcidev,
pcr->rtsx_resv_buf, pcr->rtsx_resv_buf_addr);
unmap:
iounmap(pcr->remap_addr);
+free_idr:
+ spin_lock(&rtsx_pci_lock);
+ idr_remove(&rtsx_pci_idr, pcr->id);
+ spin_unlock(&rtsx_pci_lock);
free_handle:
kfree(handle);
free_pcr:
diff --git a/drivers/misc/eeprom/idt_89hpesx.c b/drivers/misc/eeprom/idt_89hpesx.c
index b0cff4b152da..7f430742ce2b 100644
--- a/drivers/misc/eeprom/idt_89hpesx.c
+++ b/drivers/misc/eeprom/idt_89hpesx.c
@@ -909,14 +909,18 @@ static ssize_t idt_dbgfs_csr_write(struct file *filep, const char __user *ubuf,
u32 csraddr, csrval;
char *buf;
+ if (*offp)
+ return 0;
+
/* Copy data from User-space */
buf = kmalloc(count + 1, GFP_KERNEL);
if (!buf)
return -ENOMEM;
- ret = simple_write_to_buffer(buf, count, offp, ubuf, count);
- if (ret < 0)
+ if (copy_from_user(buf, ubuf, count)) {
+ ret = -EFAULT;
goto free_buf;
+ }
buf[count] = 0;
/* Find position of colon in the buffer */
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 23fcb7a16605..0b8d3d25ba7a 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -175,7 +175,7 @@ static inline int mmc_blk_part_switch(struct mmc_card *card,
unsigned int part_type);
static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
struct mmc_card *card,
- int disable_multi,
+ int recovery_mode,
struct mmc_queue *mq);
static void mmc_blk_hsq_req_done(struct mmc_request *mrq);
@@ -1285,7 +1285,7 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
}
static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
- int disable_multi, bool *do_rel_wr_p,
+ int recovery_mode, bool *do_rel_wr_p,
bool *do_data_tag_p)
{
struct mmc_blk_data *md = mq->blkdata;
@@ -1351,12 +1351,12 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
brq->data.blocks--;
/*
- * After a read error, we redo the request one sector
+ * After a read error, we redo the request one (native) sector
* at a time in order to accurately determine which
* sectors can be read successfully.
*/
- if (disable_multi)
- brq->data.blocks = 1;
+ if (recovery_mode)
+ brq->data.blocks = queue_physical_block_size(mq->queue) >> 9;
/*
* Some controllers have HW issues while operating
@@ -1573,7 +1573,7 @@ static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
struct mmc_card *card,
- int disable_multi,
+ int recovery_mode,
struct mmc_queue *mq)
{
u32 readcmd, writecmd;
@@ -1582,7 +1582,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
struct mmc_blk_data *md = mq->blkdata;
bool do_rel_wr, do_data_tag;
- mmc_blk_data_prep(mq, mqrq, disable_multi, &do_rel_wr, &do_data_tag);
+ mmc_blk_data_prep(mq, mqrq, recovery_mode, &do_rel_wr, &do_data_tag);
brq->mrq.cmd = &brq->cmd;
@@ -1673,7 +1673,7 @@ static int mmc_blk_fix_state(struct mmc_card *card, struct request *req)
#define MMC_READ_SINGLE_RETRIES 2
-/* Single sector read during recovery */
+/* Single (native) sector read during recovery */
static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
{
struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
@@ -1681,6 +1681,7 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
struct mmc_card *card = mq->card;
struct mmc_host *host = card->host;
blk_status_t error = BLK_STS_OK;
+ size_t bytes_per_read = queue_physical_block_size(mq->queue);
do {
u32 status;
@@ -1715,13 +1716,13 @@ static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
else
error = BLK_STS_OK;
- } while (blk_update_request(req, error, 512));
+ } while (blk_update_request(req, error, bytes_per_read));
return;
error_exit:
mrq->data->bytes_xfered = 0;
- blk_update_request(req, BLK_STS_IOERR, 512);
+ blk_update_request(req, BLK_STS_IOERR, bytes_per_read);
/* Let it try the remaining request again */
if (mqrq->retries > MMC_MAX_RETRIES - 1)
mqrq->retries = MMC_MAX_RETRIES - 1;
@@ -1862,10 +1863,9 @@ static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
return;
}
- /* FIXME: Missing single sector read for large sector size */
- if (!mmc_large_sector(card) && rq_data_dir(req) == READ &&
- brq->data.blocks > 1) {
- /* Read one sector at a time */
+ if (rq_data_dir(req) == READ && brq->data.blocks >
+ queue_physical_block_size(mq->queue) >> 9) {
+ /* Read one (native) sector at a time */
mmc_blk_read_single(mq, req);
return;
}
diff --git a/drivers/mmc/core/quirks.h b/drivers/mmc/core/quirks.h
index f879dc63d936..be4393988086 100644
--- a/drivers/mmc/core/quirks.h
+++ b/drivers/mmc/core/quirks.h
@@ -163,8 +163,10 @@ static inline bool mmc_fixup_of_compatible_match(struct mmc_card *card,
struct device_node *np;
for_each_child_of_node(mmc_dev(card->host)->of_node, np) {
- if (of_device_is_compatible(np, compatible))
+ if (of_device_is_compatible(np, compatible)) {
+ of_node_put(np);
return true;
+ }
}
return false;
diff --git a/drivers/mmc/host/cavium-octeon.c b/drivers/mmc/host/cavium-octeon.c
index 2c4b2df52adb..12dca91a8ef6 100644
--- a/drivers/mmc/host/cavium-octeon.c
+++ b/drivers/mmc/host/cavium-octeon.c
@@ -277,6 +277,7 @@ static int octeon_mmc_probe(struct platform_device *pdev)
if (ret) {
dev_err(&pdev->dev, "Error populating slots\n");
octeon_mmc_set_shared_power(host, 0);
+ of_node_put(cn);
goto error;
}
i++;
diff --git a/drivers/mmc/host/cavium-thunderx.c b/drivers/mmc/host/cavium-thunderx.c
index 76013bbbcff3..202b1d6da678 100644
--- a/drivers/mmc/host/cavium-thunderx.c
+++ b/drivers/mmc/host/cavium-thunderx.c
@@ -142,8 +142,10 @@ static int thunder_mmc_probe(struct pci_dev *pdev,
continue;
ret = cvm_mmc_of_slot_probe(&host->slot_pdev[i]->dev, host);
- if (ret)
+ if (ret) {
+ of_node_put(child_node);
goto error;
+ }
}
i++;
}
diff --git a/drivers/mmc/host/mxcmmc.c b/drivers/mmc/host/mxcmmc.c
index 40b6878bea6c..b5940083a108 100644
--- a/drivers/mmc/host/mxcmmc.c
+++ b/drivers/mmc/host/mxcmmc.c
@@ -1025,7 +1025,7 @@ static int mxcmci_probe(struct platform_device *pdev)
mmc->max_req_size = mmc->max_blk_size * mmc->max_blk_count;
mmc->max_seg_size = mmc->max_req_size;
- host->devtype = (enum mxcmci_type)of_device_get_match_data(&pdev->dev);
+ host->devtype = (uintptr_t)of_device_get_match_data(&pdev->dev);
/* adjust max_segs after devtype detection */
if (!is_mpc512x_mmc(host))
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index ddb5ca2f559e..4fb49306275e 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -940,6 +940,10 @@ int renesas_sdhi_probe(struct platform_device *pdev,
if (IS_ERR(priv->clk_cd))
return dev_err_probe(&pdev->dev, PTR_ERR(priv->clk_cd), "cannot get cd clock");
+ priv->rstc = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL);
+ if (IS_ERR(priv->rstc))
+ return PTR_ERR(priv->rstc);
+
priv->pinctrl = devm_pinctrl_get(&pdev->dev);
if (!IS_ERR(priv->pinctrl)) {
priv->pins_default = pinctrl_lookup_state(priv->pinctrl,
@@ -1032,10 +1036,6 @@ int renesas_sdhi_probe(struct platform_device *pdev,
if (ret)
goto efree;
- priv->rstc = devm_reset_control_get_optional_exclusive(&pdev->dev, NULL);
- if (IS_ERR(priv->rstc))
- return PTR_ERR(priv->rstc);
-
ver = sd_ctrl_read16(host, CTL_VERSION);
/* GEN2_SDR104 is first known SDHI to use 32bit block count */
if (ver < SDHI_VER_GEN2_SDR104 && mmc_data->max_blk_count > U16_MAX)
diff --git a/drivers/mmc/host/sdhci-of-at91.c b/drivers/mmc/host/sdhci-of-at91.c
index 10fb4cb2c731..cd0134580a90 100644
--- a/drivers/mmc/host/sdhci-of-at91.c
+++ b/drivers/mmc/host/sdhci-of-at91.c
@@ -100,8 +100,13 @@ static void sdhci_at91_set_clock(struct sdhci_host *host, unsigned int clock)
static void sdhci_at91_set_uhs_signaling(struct sdhci_host *host,
unsigned int timing)
{
- if (timing == MMC_TIMING_MMC_DDR52)
- sdhci_writeb(host, SDMMC_MC1R_DDR, SDMMC_MC1R);
+ u8 mc1r;
+
+ if (timing == MMC_TIMING_MMC_DDR52) {
+ mc1r = sdhci_readb(host, SDMMC_MC1R);
+ mc1r |= SDMMC_MC1R_DDR;
+ sdhci_writeb(host, mc1r, SDMMC_MC1R);
+ }
sdhci_set_uhs_signaling(host, timing);
}
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index d9dc41143bb3..8b3d8119f388 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -904,6 +904,7 @@ static int esdhc_signal_voltage_switch(struct mmc_host *mmc,
scfg_node = of_find_matching_node(NULL, scfg_device_ids);
if (scfg_node)
scfg_base = of_iomap(scfg_node, 0);
+ of_node_put(scfg_node);
if (scfg_base) {
sdhciovselcr = SDHCIOVSELCR_TGLEN |
SDHCIOVSELCR_VSELVAL;
diff --git a/drivers/mtd/devices/mtd_dataflash.c b/drivers/mtd/devices/mtd_dataflash.c
index 134e27328597..25bad4318305 100644
--- a/drivers/mtd/devices/mtd_dataflash.c
+++ b/drivers/mtd/devices/mtd_dataflash.c
@@ -112,6 +112,13 @@ static const struct of_device_id dataflash_dt_ids[] = {
MODULE_DEVICE_TABLE(of, dataflash_dt_ids);
#endif
+static const struct spi_device_id dataflash_spi_ids[] = {
+ { .name = "at45", },
+ { .name = "dataflash", },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(spi, dataflash_spi_ids);
+
/* ......................................................................... */
/*
@@ -936,6 +943,7 @@ static struct spi_driver dataflash_driver = {
.probe = dataflash_probe,
.remove = dataflash_remove,
+ .id_table = dataflash_spi_ids,
/* FIXME: investigate suspend and resume... */
};
diff --git a/drivers/mtd/devices/st_spi_fsm.c b/drivers/mtd/devices/st_spi_fsm.c
index 983999c020d6..48bda2dd1bb5 100644
--- a/drivers/mtd/devices/st_spi_fsm.c
+++ b/drivers/mtd/devices/st_spi_fsm.c
@@ -2115,10 +2115,12 @@ static int stfsm_probe(struct platform_device *pdev)
(long long)fsm->mtd.size, (long long)(fsm->mtd.size >> 20),
fsm->mtd.erasesize, (fsm->mtd.erasesize >> 10));
- return mtd_device_register(&fsm->mtd, NULL, 0);
-
+ ret = mtd_device_register(&fsm->mtd, NULL, 0);
+ if (ret) {
err_clk_unprepare:
- clk_disable_unprepare(fsm->clk);
+ clk_disable_unprepare(fsm->clk);
+ }
+
return ret;
}
diff --git a/drivers/mtd/hyperbus/rpc-if.c b/drivers/mtd/hyperbus/rpc-if.c
index 6e08ec1d4f09..b70d259e48a7 100644
--- a/drivers/mtd/hyperbus/rpc-if.c
+++ b/drivers/mtd/hyperbus/rpc-if.c
@@ -134,7 +134,7 @@ static int rpcif_hb_probe(struct platform_device *pdev)
error = rpcif_hw_init(&hyperbus->rpc, true);
if (error)
- return error;
+ goto out_disable_rpm;
hyperbus->hbdev.map.size = hyperbus->rpc.size;
hyperbus->hbdev.map.virt = hyperbus->rpc.dirmap;
@@ -145,8 +145,12 @@ static int rpcif_hb_probe(struct platform_device *pdev)
hyperbus->hbdev.np = of_get_next_child(pdev->dev.parent->of_node, NULL);
error = hyperbus_register_device(&hyperbus->hbdev);
if (error)
- rpcif_disable_rpm(&hyperbus->rpc);
+ goto out_disable_rpm;
+
+ return 0;
+out_disable_rpm:
+ rpcif_disable_rpm(&hyperbus->rpc);
return error;
}
diff --git a/drivers/mtd/maps/physmap-versatile.c b/drivers/mtd/maps/physmap-versatile.c
index ad7cd9cfaee0..a1b8b7b25f88 100644
--- a/drivers/mtd/maps/physmap-versatile.c
+++ b/drivers/mtd/maps/physmap-versatile.c
@@ -93,6 +93,7 @@ static int ap_flash_init(struct platform_device *pdev)
return -ENODEV;
}
ebi_base = of_iomap(ebi, 0);
+ of_node_put(ebi);
if (!ebi_base)
return -ENODEV;
@@ -207,6 +208,7 @@ int of_flash_probe_versatile(struct platform_device *pdev,
versatile_flashprot = (enum versatile_flashprot)devid->data;
rmap = syscon_node_to_regmap(sysnp);
+ of_node_put(sysnp);
if (IS_ERR(rmap))
return PTR_ERR(rmap);
diff --git a/drivers/mtd/nand/raw/arasan-nand-controller.c b/drivers/mtd/nand/raw/arasan-nand-controller.c
index 53bd10738418..296fb16c8dc3 100644
--- a/drivers/mtd/nand/raw/arasan-nand-controller.c
+++ b/drivers/mtd/nand/raw/arasan-nand-controller.c
@@ -347,17 +347,17 @@ static int anfc_select_target(struct nand_chip *chip, int target)
/* Update clock frequency */
if (nfc->cur_clk != anand->clk) {
- clk_disable_unprepare(nfc->controller_clk);
- ret = clk_set_rate(nfc->controller_clk, anand->clk);
+ clk_disable_unprepare(nfc->bus_clk);
+ ret = clk_set_rate(nfc->bus_clk, anand->clk);
if (ret) {
dev_err(nfc->dev, "Failed to change clock rate\n");
return ret;
}
- ret = clk_prepare_enable(nfc->controller_clk);
+ ret = clk_prepare_enable(nfc->bus_clk);
if (ret) {
dev_err(nfc->dev,
- "Failed to re-enable the controller clock\n");
+ "Failed to re-enable the bus clock\n");
return ret;
}
@@ -1043,7 +1043,13 @@ static int anfc_setup_interface(struct nand_chip *chip, int target,
DQS_BUFF_SEL_OUT(dqs_mode);
}
- anand->clk = ANFC_XLNX_SDR_DFLT_CORE_CLK;
+ if (nand_interface_is_sdr(conf)) {
+ anand->clk = ANFC_XLNX_SDR_DFLT_CORE_CLK;
+ } else {
+ /* ONFI timings are defined in picoseconds */
+ anand->clk = div_u64((u64)NSEC_PER_SEC * 1000,
+ conf->timings.nvddr.tCK_min);
+ }
/*
* Due to a hardware bug in the ZynqMP SoC, SDR timing modes 0-1 work
diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
index ac3be92872d0..032180183339 100644
--- a/drivers/mtd/nand/raw/meson_nand.c
+++ b/drivers/mtd/nand/raw/meson_nand.c
@@ -1307,7 +1307,6 @@ static int meson_nfc_nand_chip_cleanup(struct meson_nfc *nfc)
if (ret)
return ret;
- meson_nfc_free_buffer(&meson_chip->nand);
nand_cleanup(&meson_chip->nand);
list_del(&meson_chip->node);
}
diff --git a/drivers/mtd/parsers/ofpart_bcm4908.c b/drivers/mtd/parsers/ofpart_bcm4908.c
index 0eddef4c198e..bb072a0940e4 100644
--- a/drivers/mtd/parsers/ofpart_bcm4908.c
+++ b/drivers/mtd/parsers/ofpart_bcm4908.c
@@ -35,12 +35,15 @@ static long long bcm4908_partitions_fw_offset(void)
err = kstrtoul(s + len + 1, 0, &offset);
if (err) {
pr_err("failed to parse %s\n", s + len + 1);
+ of_node_put(root);
return err;
}
+ of_node_put(root);
return offset << 10;
}
+ of_node_put(root);
return -ENOENT;
}
diff --git a/drivers/mtd/parsers/redboot.c b/drivers/mtd/parsers/redboot.c
index feb44a573d44..a16b42a88581 100644
--- a/drivers/mtd/parsers/redboot.c
+++ b/drivers/mtd/parsers/redboot.c
@@ -58,6 +58,7 @@ static void parse_redboot_of(struct mtd_info *master)
return;
ret = of_property_read_u32(npart, "fis-index-block", &dirblock);
+ of_node_put(npart);
if (ret)
return;
diff --git a/drivers/mtd/sm_ftl.c b/drivers/mtd/sm_ftl.c
index 0cff2cda1b5a..7f955fade838 100644
--- a/drivers/mtd/sm_ftl.c
+++ b/drivers/mtd/sm_ftl.c
@@ -1111,9 +1111,9 @@ static void sm_release(struct mtd_blktrans_dev *dev)
{
struct sm_ftl *ftl = dev->priv;
- mutex_lock(&ftl->mutex);
del_timer_sync(&ftl->timer);
cancel_work_sync(&ftl->flush_work);
+ mutex_lock(&ftl->mutex);
sm_cache_flush(ftl);
mutex_unlock(&ftl->mutex);
}
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index c1630131c734..170182eb431e 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -177,7 +177,7 @@ int spi_nor_controller_ops_write_reg(struct spi_nor *nor, u8 opcode,
static int spi_nor_controller_ops_erase(struct spi_nor *nor, loff_t offs)
{
- if (spi_nor_protocol_is_dtr(nor->write_proto))
+ if (spi_nor_protocol_is_dtr(nor->reg_proto))
return -EOPNOTSUPP;
return nor->controller_ops->erase(nor, offs);
@@ -976,7 +976,7 @@ static int spi_nor_erase_chip(struct spi_nor *nor)
SPI_MEM_OP_NO_DUMMY,
SPI_MEM_OP_NO_DATA);
- spi_nor_spimem_setup_op(nor, &op, nor->write_proto);
+ spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
@@ -1121,7 +1121,7 @@ int spi_nor_erase_sector(struct spi_nor *nor, u32 addr)
SPI_MEM_OP_NO_DUMMY,
SPI_MEM_OP_NO_DATA);
- spi_nor_spimem_setup_op(nor, &op, nor->write_proto);
+ spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
return spi_mem_exec_op(nor->spimem, &op);
} else if (nor->controller_ops->erase) {
diff --git a/drivers/net/can/dev/netlink.c b/drivers/net/can/dev/netlink.c
index 7633d98e3912..037824011266 100644
--- a/drivers/net/can/dev/netlink.c
+++ b/drivers/net/can/dev/netlink.c
@@ -176,7 +176,8 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[],
* directly via do_set_bitrate(). Bail out if neither
* is given.
*/
- if (!priv->bittiming_const && !priv->do_set_bittiming)
+ if (!priv->bittiming_const && !priv->do_set_bittiming &&
+ !priv->bitrate_const)
return -EOPNOTSUPP;
memcpy(&bt, nla_data(data[IFLA_CAN_BITTIMING]), sizeof(bt));
@@ -278,7 +279,8 @@ static int can_changelink(struct net_device *dev, struct nlattr *tb[],
* directly via do_set_bitrate(). Bail out if neither
* is given.
*/
- if (!priv->data_bittiming_const && !priv->do_set_data_bittiming)
+ if (!priv->data_bittiming_const && !priv->do_set_data_bittiming &&
+ !priv->data_bitrate_const)
return -EOPNOTSUPP;
memcpy(&dbt, nla_data(data[IFLA_CAN_DATA_BITTIMING]),
diff --git a/drivers/net/can/pch_can.c b/drivers/net/can/pch_can.c
index 888bef03de09..17f8d67ddb18 100644
--- a/drivers/net/can/pch_can.c
+++ b/drivers/net/can/pch_can.c
@@ -489,6 +489,7 @@ static void pch_can_error(struct net_device *ndev, u32 status)
if (!skb)
return;
+ errc = ioread32(&priv->regs->errc);
if (status & PCH_BUS_OFF) {
pch_can_set_tx_all(priv, 0);
pch_can_set_rx_all(priv, 0);
@@ -496,9 +497,11 @@ static void pch_can_error(struct net_device *ndev, u32 status)
cf->can_id |= CAN_ERR_BUSOFF;
priv->can.can_stats.bus_off++;
can_bus_off(ndev);
+ } else {
+ cf->data[6] = errc & PCH_TEC;
+ cf->data[7] = (errc & PCH_REC) >> 8;
}
- errc = ioread32(&priv->regs->errc);
/* Warning interrupt. */
if (status & PCH_EWARN) {
state = CAN_STATE_ERROR_WARNING;
@@ -556,9 +559,6 @@ static void pch_can_error(struct net_device *ndev, u32 status)
break;
}
- cf->data[6] = errc & PCH_TEC;
- cf->data[7] = (errc & PCH_REC) >> 8;
-
priv->can.state = state;
netif_receive_skb(skb);
}
diff --git a/drivers/net/can/rcar/rcar_can.c b/drivers/net/can/rcar/rcar_can.c
index 33e37395379d..4ca5b09f1e5d 100644
--- a/drivers/net/can/rcar/rcar_can.c
+++ b/drivers/net/can/rcar/rcar_can.c
@@ -233,11 +233,8 @@ static void rcar_can_error(struct net_device *ndev)
if (eifr & (RCAR_CAN_EIFR_EWIF | RCAR_CAN_EIFR_EPIF)) {
txerr = readb(&priv->regs->tecr);
rxerr = readb(&priv->regs->recr);
- if (skb) {
+ if (skb)
cf->can_id |= CAN_ERR_CRTL;
- cf->data[6] = txerr;
- cf->data[7] = rxerr;
- }
}
if (eifr & RCAR_CAN_EIFR_BEIF) {
int rx_errors = 0, tx_errors = 0;
@@ -337,6 +334,9 @@ static void rcar_can_error(struct net_device *ndev)
can_bus_off(ndev);
if (skb)
cf->can_id |= CAN_ERR_BUSOFF;
+ } else if (skb) {
+ cf->data[6] = txerr;
+ cf->data[7] = rxerr;
}
if (eifr & RCAR_CAN_EIFR_ORIF) {
netdev_dbg(priv->ndev, "Receive overrun error interrupt\n");
diff --git a/drivers/net/can/sja1000/sja1000.c b/drivers/net/can/sja1000/sja1000.c
index 966316479485..366a369080ef 100644
--- a/drivers/net/can/sja1000/sja1000.c
+++ b/drivers/net/can/sja1000/sja1000.c
@@ -405,9 +405,6 @@ static int sja1000_err(struct net_device *dev, uint8_t isrc, uint8_t status)
txerr = priv->read_reg(priv, SJA1000_TXERR);
rxerr = priv->read_reg(priv, SJA1000_RXERR);
- cf->data[6] = txerr;
- cf->data[7] = rxerr;
-
if (isrc & IRQ_DOI) {
/* data overrun interrupt */
netdev_dbg(dev, "data overrun interrupt\n");
@@ -429,6 +426,10 @@ static int sja1000_err(struct net_device *dev, uint8_t isrc, uint8_t status)
else
state = CAN_STATE_ERROR_ACTIVE;
}
+ if (state != CAN_STATE_BUS_OFF) {
+ cf->data[6] = txerr;
+ cf->data[7] = rxerr;
+ }
if (isrc & IRQ_BEI) {
/* bus error interrupt */
priv->can.can_stats.bus_error++;
diff --git a/drivers/net/can/spi/hi311x.c b/drivers/net/can/spi/hi311x.c
index a5b2952b8d0f..7dc88a9eca0f 100644
--- a/drivers/net/can/spi/hi311x.c
+++ b/drivers/net/can/spi/hi311x.c
@@ -672,8 +672,6 @@ static irqreturn_t hi3110_can_ist(int irq, void *dev_id)
txerr = hi3110_read(spi, HI3110_READ_TEC);
rxerr = hi3110_read(spi, HI3110_READ_REC);
- cf->data[6] = txerr;
- cf->data[7] = rxerr;
tx_state = txerr >= rxerr ? new_state : 0;
rx_state = txerr <= rxerr ? new_state : 0;
can_change_state(net, cf, tx_state, rx_state);
@@ -686,6 +684,9 @@ static irqreturn_t hi3110_can_ist(int irq, void *dev_id)
hi3110_hw_sleep(spi);
break;
}
+ } else {
+ cf->data[6] = txerr;
+ cf->data[7] = rxerr;
}
}
diff --git a/drivers/net/can/sun4i_can.c b/drivers/net/can/sun4i_can.c
index 25d6d81ab4f4..f3ad585e766f 100644
--- a/drivers/net/can/sun4i_can.c
+++ b/drivers/net/can/sun4i_can.c
@@ -538,11 +538,6 @@ static int sun4i_can_err(struct net_device *dev, u8 isrc, u8 status)
rxerr = (errc >> 16) & 0xFF;
txerr = errc & 0xFF;
- if (skb) {
- cf->data[6] = txerr;
- cf->data[7] = rxerr;
- }
-
if (isrc & SUN4I_INT_DATA_OR) {
/* data overrun interrupt */
netdev_dbg(dev, "data overrun interrupt\n");
@@ -573,6 +568,10 @@ static int sun4i_can_err(struct net_device *dev, u8 isrc, u8 status)
else
state = CAN_STATE_ERROR_ACTIVE;
}
+ if (skb && state != CAN_STATE_BUS_OFF) {
+ cf->data[6] = txerr;
+ cf->data[7] = rxerr;
+ }
if (isrc & SUN4I_INT_BUS_ERR) {
/* bus error interrupt */
netdev_dbg(dev, "bus error interrupt\n");
diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
index 5d70844ac030..404093468b2f 100644
--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
+++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c
@@ -917,8 +917,10 @@ static void kvaser_usb_hydra_update_state(struct kvaser_usb_net_priv *priv,
new_state < CAN_STATE_BUS_OFF)
priv->can.can_stats.restarts++;
- cf->data[6] = bec->txerr;
- cf->data[7] = bec->rxerr;
+ if (new_state != CAN_STATE_BUS_OFF) {
+ cf->data[6] = bec->txerr;
+ cf->data[7] = bec->rxerr;
+ }
netif_rx(skb);
}
@@ -1069,8 +1071,10 @@ kvaser_usb_hydra_error_frame(struct kvaser_usb_net_priv *priv,
shhwtstamps->hwtstamp = hwtstamp;
cf->can_id |= CAN_ERR_BUSERROR;
- cf->data[6] = bec.txerr;
- cf->data[7] = bec.rxerr;
+ if (new_state != CAN_STATE_BUS_OFF) {
+ cf->data[6] = bec.txerr;
+ cf->data[7] = bec.rxerr;
+ }
netif_rx(skb);
diff --git a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
index cc809ecd1e62..f551fde16a70 100644
--- a/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
+++ b/drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
@@ -853,8 +853,10 @@ static void kvaser_usb_leaf_rx_error(const struct kvaser_usb *dev,
break;
}
- cf->data[6] = es->txerr;
- cf->data[7] = es->rxerr;
+ if (new_state != CAN_STATE_BUS_OFF) {
+ cf->data[6] = es->txerr;
+ cf->data[7] = es->rxerr;
+ }
netif_rx(skb);
}
diff --git a/drivers/net/can/usb/usb_8dev.c b/drivers/net/can/usb/usb_8dev.c
index b638604bf1ee..ea63dc687fdb 100644
--- a/drivers/net/can/usb/usb_8dev.c
+++ b/drivers/net/can/usb/usb_8dev.c
@@ -439,9 +439,10 @@ static void usb_8dev_rx_err_msg(struct usb_8dev_priv *priv,
if (rx_errors)
stats->rx_errors++;
-
- cf->data[6] = txerr;
- cf->data[7] = rxerr;
+ if (priv->can.state != CAN_STATE_BUS_OFF) {
+ cf->data[6] = txerr;
+ cf->data[7] = rxerr;
+ }
priv->bec.txerr = txerr;
priv->bec.rxerr = rxerr;
diff --git a/drivers/net/dsa/ocelot/Kconfig b/drivers/net/dsa/ocelot/Kconfig
index 220b0b027b55..08db9cf76818 100644
--- a/drivers/net/dsa/ocelot/Kconfig
+++ b/drivers/net/dsa/ocelot/Kconfig
@@ -6,6 +6,7 @@ config NET_DSA_MSCC_FELIX
depends on NET_VENDOR_FREESCALE
depends on HAS_IOMEM
depends on PTP_1588_CLOCK_OPTIONAL
+ depends on NET_SCH_TAPRIO || NET_SCH_TAPRIO=n
select MSCC_OCELOT_SWITCH_LIB
select NET_DSA_TAG_OCELOT_8021Q
select NET_DSA_TAG_OCELOT
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
index faccfb3f0158..d95c2c0a64bd 100644
--- a/drivers/net/dsa/ocelot/felix.c
+++ b/drivers/net/dsa/ocelot/felix.c
@@ -1613,9 +1613,18 @@ static void felix_txtstamp(struct dsa_switch *ds, int port,
static int felix_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
{
struct ocelot *ocelot = ds->priv;
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
+ struct felix *felix = ocelot_to_felix(ocelot);
ocelot_port_set_maxlen(ocelot, port, new_mtu);
+ mutex_lock(&ocelot->tas_lock);
+
+ if (ocelot_port->taprio && felix->info->tas_guard_bands_update)
+ felix->info->tas_guard_bands_update(ocelot, port);
+
+ mutex_unlock(&ocelot->tas_lock);
+
return 0;
}
diff --git a/drivers/net/dsa/ocelot/felix.h b/drivers/net/dsa/ocelot/felix.h
index f083b06fdfe9..b0e4635a8cc2 100644
--- a/drivers/net/dsa/ocelot/felix.h
+++ b/drivers/net/dsa/ocelot/felix.h
@@ -53,6 +53,7 @@ struct felix_info {
struct phylink_link_state *state);
int (*port_setup_tc)(struct dsa_switch *ds, int port,
enum tc_setup_type type, void *type_data);
+ void (*tas_guard_bands_update)(struct ocelot *ocelot, int port);
void (*port_sched_speed_set)(struct ocelot *ocelot, int port,
u32 speed);
struct regmap *(*init_regmap)(struct ocelot *ocelot,
diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
index 4a071f96ea28..d6e8549c12c4 100644
--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
+++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
@@ -1124,9 +1124,212 @@ static void vsc9959_mdio_bus_free(struct ocelot *ocelot)
mdiobus_free(felix->imdio);
}
+/* Extract shortest continuous gate open intervals in ns for each traffic class
+ * of a cyclic tc-taprio schedule. If a gate is always open, the duration is
+ * considered U64_MAX. If the gate is always closed, it is considered 0.
+ */
+static void vsc9959_tas_min_gate_lengths(struct tc_taprio_qopt_offload *taprio,
+ u64 min_gate_len[OCELOT_NUM_TC])
+{
+ struct tc_taprio_sched_entry *entry;
+ u64 gate_len[OCELOT_NUM_TC];
+ u8 gates_ever_opened = 0;
+ int tc, i, n;
+
+ /* Initialize arrays */
+ for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+ min_gate_len[tc] = U64_MAX;
+ gate_len[tc] = 0;
+ }
+
+ /* If we don't have taprio, consider all gates as permanently open */
+ if (!taprio)
+ return;
+
+ n = taprio->num_entries;
+
+ /* Walk through the gate list twice to determine the length
+ * of consecutively open gates for a traffic class, including
+ * open gates that wrap around. We are just interested in the
+ * minimum window size, and this doesn't change what the
+ * minimum is (if the gate never closes, min_gate_len will
+ * remain U64_MAX).
+ */
+ for (i = 0; i < 2 * n; i++) {
+ entry = &taprio->entries[i % n];
+
+ for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+ if (entry->gate_mask & BIT(tc)) {
+ gate_len[tc] += entry->interval;
+ gates_ever_opened |= BIT(tc);
+ } else {
+ /* Gate closes now, record a potential new
+ * minimum and reinitialize length
+ */
+ if (min_gate_len[tc] > gate_len[tc] &&
+ gate_len[tc])
+ min_gate_len[tc] = gate_len[tc];
+ gate_len[tc] = 0;
+ }
+ }
+ }
+
+ /* min_gate_len[tc] actually tracks minimum *open* gate time, so for
+ * permanently closed gates, min_gate_len[tc] will still be U64_MAX.
+ * Therefore they are currently indistinguishable from permanently
+ * open gates. Overwrite the gate len with 0 when we know they're
+ * actually permanently closed, i.e. after the loop above.
+ */
+ for (tc = 0; tc < OCELOT_NUM_TC; tc++)
+ if (!(gates_ever_opened & BIT(tc)))
+ min_gate_len[tc] = 0;
+}
+
+/* Update QSYS_PORT_MAX_SDU to make sure the static guard bands added by the
+ * switch (see the ALWAYS_GUARD_BAND_SCH_Q comment) are correct at all MTU
+ * values (the default value is 1518). Also, for traffic class windows smaller
+ * than one MTU sized frame, update QSYS_QMAXSDU_CFG to enable oversized frame
+ * dropping, such that these won't hang the port, as they will never be sent.
+ */
+static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
+{
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
+ u64 min_gate_len[OCELOT_NUM_TC];
+ int speed, picos_per_byte;
+ u64 needed_bit_time_ps;
+ u32 val, maxlen;
+ u8 tas_speed;
+ int tc;
+
+ lockdep_assert_held(&ocelot->tas_lock);
+
+ val = ocelot_read_rix(ocelot, QSYS_TAG_CONFIG, port);
+ tas_speed = QSYS_TAG_CONFIG_LINK_SPEED_X(val);
+
+ switch (tas_speed) {
+ case OCELOT_SPEED_10:
+ speed = SPEED_10;
+ break;
+ case OCELOT_SPEED_100:
+ speed = SPEED_100;
+ break;
+ case OCELOT_SPEED_1000:
+ speed = SPEED_1000;
+ break;
+ case OCELOT_SPEED_2500:
+ speed = SPEED_2500;
+ break;
+ default:
+ return;
+ }
+
+ picos_per_byte = (USEC_PER_SEC * 8) / speed;
+
+ val = ocelot_port_readl(ocelot_port, DEV_MAC_MAXLEN_CFG);
+ /* MAXLEN_CFG accounts automatically for VLAN. We need to include it
+ * manually in the bit time calculation, plus the preamble and SFD.
+ */
+ maxlen = val + 2 * VLAN_HLEN;
+ /* Consider the standard Ethernet overhead of 8 octets preamble+SFD,
+ * 4 octets FCS, 12 octets IFG.
+ */
+ needed_bit_time_ps = (maxlen + 24) * picos_per_byte;
+
+ dev_dbg(ocelot->dev,
+ "port %d: max frame size %d needs %llu ps at speed %d\n",
+ port, maxlen, needed_bit_time_ps, speed);
+
+ vsc9959_tas_min_gate_lengths(ocelot_port->taprio, min_gate_len);
+
+ for (tc = 0; tc < OCELOT_NUM_TC; tc++) {
+ u32 max_sdu;
+
+ if (min_gate_len[tc] == U64_MAX /* Gate always open */ ||
+ min_gate_len[tc] * 1000 > needed_bit_time_ps) {
+ /* Setting QMAXSDU_CFG to 0 disables oversized frame
+ * dropping.
+ */
+ max_sdu = 0;
+ dev_dbg(ocelot->dev,
+ "port %d tc %d min gate len %llu"
+ ", sending all frames\n",
+ port, tc, min_gate_len[tc]);
+ } else {
+ /* If traffic class doesn't support a full MTU sized
+ * frame, make sure to enable oversize frame dropping
+ * for frames larger than the smallest that would fit.
+ */
+ max_sdu = div_u64(min_gate_len[tc] * 1000,
+ picos_per_byte);
+ /* A TC gate may be completely closed, which is a
+ * special case where all packets are oversized.
+ * Any limit smaller than 64 octets accomplishes this
+ */
+ if (!max_sdu)
+ max_sdu = 1;
+ /* Take L1 overhead into account, but just don't allow
+ * max_sdu to go negative or to 0. Here we use 20
+ * because QSYS_MAXSDU_CFG_* already counts the 4 FCS
+ * octets as part of packet size.
+ */
+ if (max_sdu > 20)
+ max_sdu -= 20;
+ dev_info(ocelot->dev,
+ "port %d tc %d min gate length %llu"
+ " ns not enough for max frame size %d at %d"
+ " Mbps, dropping frames over %d"
+ " octets including FCS\n",
+ port, tc, min_gate_len[tc], maxlen, speed,
+ max_sdu);
+ }
+
+ /* ocelot_write_rix is a macro that concatenates
+ * QSYS_MAXSDU_CFG_* with _RSZ, so we need to spell out
+ * the writes to each traffic class
+ */
+ switch (tc) {
+ case 0:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_0,
+ port);
+ break;
+ case 1:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_1,
+ port);
+ break;
+ case 2:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_2,
+ port);
+ break;
+ case 3:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_3,
+ port);
+ break;
+ case 4:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_4,
+ port);
+ break;
+ case 5:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_5,
+ port);
+ break;
+ case 6:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_6,
+ port);
+ break;
+ case 7:
+ ocelot_write_rix(ocelot, max_sdu, QSYS_QMAXSDU_CFG_7,
+ port);
+ break;
+ }
+ }
+
+ ocelot_write_rix(ocelot, maxlen, QSYS_PORT_MAX_SDU, port);
+}
+
static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
u32 speed)
{
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
u8 tas_speed;
switch (speed) {
@@ -1151,6 +1354,13 @@ static void vsc9959_sched_speed_set(struct ocelot *ocelot, int port,
QSYS_TAG_CONFIG_LINK_SPEED(tas_speed),
QSYS_TAG_CONFIG_LINK_SPEED_M,
QSYS_TAG_CONFIG, port);
+
+ mutex_lock(&ocelot->tas_lock);
+
+ if (ocelot_port->taprio)
+ vsc9959_tas_guard_bands_update(ocelot, port);
+
+ mutex_unlock(&ocelot->tas_lock);
}
static void vsc9959_new_base_time(struct ocelot *ocelot, ktime_t base_time,
@@ -1193,10 +1403,13 @@ static void vsc9959_tas_gcl_set(struct ocelot *ocelot, const u32 gcl_ix,
static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
struct tc_taprio_qopt_offload *taprio)
{
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
struct timespec64 base_ts;
int ret, i;
u32 val;
+ mutex_lock(&ocelot->tas_lock);
+
if (!taprio->enable) {
ocelot_rmw_rix(ocelot,
QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF),
@@ -1204,15 +1417,25 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
QSYS_TAG_CONFIG, port);
+ taprio_offload_free(ocelot_port->taprio);
+ ocelot_port->taprio = NULL;
+
+ vsc9959_tas_guard_bands_update(ocelot, port);
+
+ mutex_unlock(&ocelot->tas_lock);
return 0;
}
if (taprio->cycle_time > NSEC_PER_SEC ||
- taprio->cycle_time_extension >= NSEC_PER_SEC)
- return -EINVAL;
+ taprio->cycle_time_extension >= NSEC_PER_SEC) {
+ ret = -EINVAL;
+ goto err;
+ }
- if (taprio->num_entries > VSC9959_TAS_GCL_ENTRY_MAX)
- return -ERANGE;
+ if (taprio->num_entries > VSC9959_TAS_GCL_ENTRY_MAX) {
+ ret = -ERANGE;
+ goto err;
+ }
/* Enable guard band. The switch will schedule frames without taking
* their length into account. Thus we'll always need to enable the
@@ -1233,8 +1456,10 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
* config is pending, need reset the TAS module
*/
val = ocelot_read(ocelot, QSYS_PARAM_STATUS_REG_8);
- if (val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING)
- return -EBUSY;
+ if (val & QSYS_PARAM_STATUS_REG_8_CONFIG_PENDING) {
+ ret = -EBUSY;
+ goto err;
+ }
ocelot_rmw_rix(ocelot,
QSYS_TAG_CONFIG_ENABLE |
@@ -1267,10 +1492,71 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
ret = readx_poll_timeout(vsc9959_tas_read_cfg_status, ocelot, val,
!(val & QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE),
10, 100000);
+ if (ret)
+ goto err;
+
+ ocelot_port->taprio = taprio_offload_get(taprio);
+ vsc9959_tas_guard_bands_update(ocelot, port);
+
+err:
+ mutex_unlock(&ocelot->tas_lock);
return ret;
}
+static void vsc9959_tas_clock_adjust(struct ocelot *ocelot)
+{
+ struct tc_taprio_qopt_offload *taprio;
+ struct ocelot_port *ocelot_port;
+ struct timespec64 base_ts;
+ int port;
+ u32 val;
+
+ mutex_lock(&ocelot->tas_lock);
+
+ for (port = 0; port < ocelot->num_phys_ports; port++) {
+ ocelot_port = ocelot->ports[port];
+ taprio = ocelot_port->taprio;
+ if (!taprio)
+ continue;
+
+ ocelot_rmw(ocelot,
+ QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM(port),
+ QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM_M,
+ QSYS_TAS_PARAM_CFG_CTRL);
+
+ ocelot_rmw_rix(ocelot,
+ QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF),
+ QSYS_TAG_CONFIG_ENABLE |
+ QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
+ QSYS_TAG_CONFIG, port);
+
+ vsc9959_new_base_time(ocelot, taprio->base_time,
+ taprio->cycle_time, &base_ts);
+
+ ocelot_write(ocelot, base_ts.tv_nsec, QSYS_PARAM_CFG_REG_1);
+ ocelot_write(ocelot, lower_32_bits(base_ts.tv_sec),
+ QSYS_PARAM_CFG_REG_2);
+ val = upper_32_bits(base_ts.tv_sec);
+ ocelot_rmw(ocelot,
+ QSYS_PARAM_CFG_REG_3_BASE_TIME_SEC_MSB(val),
+ QSYS_PARAM_CFG_REG_3_BASE_TIME_SEC_MSB_M,
+ QSYS_PARAM_CFG_REG_3);
+
+ ocelot_rmw(ocelot, QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE,
+ QSYS_TAS_PARAM_CFG_CTRL_CONFIG_CHANGE,
+ QSYS_TAS_PARAM_CFG_CTRL);
+
+ ocelot_rmw_rix(ocelot,
+ QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) |
+ QSYS_TAG_CONFIG_ENABLE,
+ QSYS_TAG_CONFIG_ENABLE |
+ QSYS_TAG_CONFIG_INIT_GATE_STATE_M,
+ QSYS_TAG_CONFIG, port);
+ }
+ mutex_unlock(&ocelot->tas_lock);
+}
+
static int vsc9959_qos_port_cbs_set(struct dsa_switch *ds, int port,
struct tc_cbs_qopt_offload *cbs_qopt)
{
@@ -2210,6 +2496,7 @@ static const struct ocelot_ops vsc9959_ops = {
.psfp_filter_del = vsc9959_psfp_filter_del,
.psfp_stats_get = vsc9959_psfp_stats_get,
.cut_through_fwd = vsc9959_cut_through_fwd,
+ .tas_clock_adjust = vsc9959_tas_clock_adjust,
};
static const struct felix_info felix_info_vsc9959 = {
@@ -2237,6 +2524,7 @@ static const struct felix_info felix_info_vsc9959 = {
.port_modes = vsc9959_port_modes,
.port_setup_tc = vsc9959_port_setup_tc,
.port_sched_speed_set = vsc9959_sched_speed_set,
+ .tas_guard_bands_update = vsc9959_tas_guard_bands_update,
.init_regmap = ocelot_regmap_init,
};
diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
index ec167af0e3b2..5041265f4226 100644
--- a/drivers/net/ethernet/atheros/ag71xx.c
+++ b/drivers/net/ethernet/atheros/ag71xx.c
@@ -946,7 +946,7 @@ static unsigned int ag71xx_max_frame_len(unsigned int mtu)
return ETH_HLEN + VLAN_HLEN + mtu + ETH_FCS_LEN;
}
-static void ag71xx_hw_set_macaddr(struct ag71xx *ag, unsigned char *mac)
+static void ag71xx_hw_set_macaddr(struct ag71xx *ag, const unsigned char *mac)
{
u32 t;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_dev.h b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
index fb3e89141a0d..a4fbf44f944c 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_dev.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
@@ -95,9 +95,6 @@ struct hinic_dev {
u16 sq_depth;
u16 rq_depth;
- struct hinic_txq_stats tx_stats;
- struct hinic_rxq_stats rx_stats;
-
u8 rss_tmpl_idx;
u8 rss_hash_engine;
u16 num_rss;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c
index 05329292d940..c23ee2ddbce3 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
@@ -62,8 +62,6 @@ MODULE_PARM_DESC(rx_weight, "Number Rx packets for NAPI budget (default=64)");
#define HINIC_LRO_RX_TIMER_DEFAULT 16
-#define VLAN_BITMAP_SIZE(nic_dev) (ALIGN(VLAN_N_VID, 8) / 8)
-
#define work_to_rx_mode_work(work) \
container_of(work, struct hinic_rx_mode_work, work)
@@ -82,56 +80,44 @@ static int set_features(struct hinic_dev *nic_dev,
netdev_features_t pre_features,
netdev_features_t features, bool force_change);
-static void update_rx_stats(struct hinic_dev *nic_dev, struct hinic_rxq *rxq)
+static void gather_rx_stats(struct hinic_rxq_stats *nic_rx_stats, struct hinic_rxq *rxq)
{
- struct hinic_rxq_stats *nic_rx_stats = &nic_dev->rx_stats;
struct hinic_rxq_stats rx_stats;
- u64_stats_init(&rx_stats.syncp);
-
hinic_rxq_get_stats(rxq, &rx_stats);
- u64_stats_update_begin(&nic_rx_stats->syncp);
nic_rx_stats->bytes += rx_stats.bytes;
nic_rx_stats->pkts += rx_stats.pkts;
nic_rx_stats->errors += rx_stats.errors;
nic_rx_stats->csum_errors += rx_stats.csum_errors;
nic_rx_stats->other_errors += rx_stats.other_errors;
- u64_stats_update_end(&nic_rx_stats->syncp);
-
- hinic_rxq_clean_stats(rxq);
}
-static void update_tx_stats(struct hinic_dev *nic_dev, struct hinic_txq *txq)
+static void gather_tx_stats(struct hinic_txq_stats *nic_tx_stats, struct hinic_txq *txq)
{
- struct hinic_txq_stats *nic_tx_stats = &nic_dev->tx_stats;
struct hinic_txq_stats tx_stats;
- u64_stats_init(&tx_stats.syncp);
-
hinic_txq_get_stats(txq, &tx_stats);
- u64_stats_update_begin(&nic_tx_stats->syncp);
nic_tx_stats->bytes += tx_stats.bytes;
nic_tx_stats->pkts += tx_stats.pkts;
nic_tx_stats->tx_busy += tx_stats.tx_busy;
nic_tx_stats->tx_wake += tx_stats.tx_wake;
nic_tx_stats->tx_dropped += tx_stats.tx_dropped;
nic_tx_stats->big_frags_pkts += tx_stats.big_frags_pkts;
- u64_stats_update_end(&nic_tx_stats->syncp);
-
- hinic_txq_clean_stats(txq);
}
-static void update_nic_stats(struct hinic_dev *nic_dev)
+static void gather_nic_stats(struct hinic_dev *nic_dev,
+ struct hinic_rxq_stats *nic_rx_stats,
+ struct hinic_txq_stats *nic_tx_stats)
{
int i, num_qps = hinic_hwdev_num_qps(nic_dev->hwdev);
for (i = 0; i < num_qps; i++)
- update_rx_stats(nic_dev, &nic_dev->rxqs[i]);
+ gather_rx_stats(nic_rx_stats, &nic_dev->rxqs[i]);
for (i = 0; i < num_qps; i++)
- update_tx_stats(nic_dev, &nic_dev->txqs[i]);
+ gather_tx_stats(nic_tx_stats, &nic_dev->txqs[i]);
}
/**
@@ -560,8 +546,6 @@ int hinic_close(struct net_device *netdev)
netif_carrier_off(netdev);
netif_tx_disable(netdev);
- update_nic_stats(nic_dev);
-
up(&nic_dev->mgmt_lock);
if (!HINIC_IS_VF(nic_dev->hwdev->hwif))
@@ -855,26 +839,19 @@ static void hinic_get_stats64(struct net_device *netdev,
struct rtnl_link_stats64 *stats)
{
struct hinic_dev *nic_dev = netdev_priv(netdev);
- struct hinic_rxq_stats *nic_rx_stats;
- struct hinic_txq_stats *nic_tx_stats;
-
- nic_rx_stats = &nic_dev->rx_stats;
- nic_tx_stats = &nic_dev->tx_stats;
-
- down(&nic_dev->mgmt_lock);
+ struct hinic_rxq_stats nic_rx_stats = {};
+ struct hinic_txq_stats nic_tx_stats = {};
if (nic_dev->flags & HINIC_INTF_UP)
- update_nic_stats(nic_dev);
-
- up(&nic_dev->mgmt_lock);
+ gather_nic_stats(nic_dev, &nic_rx_stats, &nic_tx_stats);
- stats->rx_bytes = nic_rx_stats->bytes;
- stats->rx_packets = nic_rx_stats->pkts;
- stats->rx_errors = nic_rx_stats->errors;
+ stats->rx_bytes = nic_rx_stats.bytes;
+ stats->rx_packets = nic_rx_stats.pkts;
+ stats->rx_errors = nic_rx_stats.errors;
- stats->tx_bytes = nic_tx_stats->bytes;
- stats->tx_packets = nic_tx_stats->pkts;
- stats->tx_errors = nic_tx_stats->tx_dropped;
+ stats->tx_bytes = nic_tx_stats.bytes;
+ stats->tx_packets = nic_tx_stats.pkts;
+ stats->tx_errors = nic_tx_stats.tx_dropped;
}
static int hinic_set_features(struct net_device *netdev,
@@ -1173,8 +1150,6 @@ static void hinic_free_intr_coalesce(struct hinic_dev *nic_dev)
static int nic_dev_init(struct pci_dev *pdev)
{
struct hinic_rx_mode_work *rx_mode_work;
- struct hinic_txq_stats *tx_stats;
- struct hinic_rxq_stats *rx_stats;
struct hinic_dev *nic_dev;
struct net_device *netdev;
struct hinic_hwdev *hwdev;
@@ -1236,15 +1211,8 @@ static int nic_dev_init(struct pci_dev *pdev)
sema_init(&nic_dev->mgmt_lock, 1);
- tx_stats = &nic_dev->tx_stats;
- rx_stats = &nic_dev->rx_stats;
-
- u64_stats_init(&tx_stats->syncp);
- u64_stats_init(&rx_stats->syncp);
-
- nic_dev->vlan_bitmap = devm_kzalloc(&pdev->dev,
- VLAN_BITMAP_SIZE(nic_dev),
- GFP_KERNEL);
+ nic_dev->vlan_bitmap = devm_bitmap_zalloc(&pdev->dev, VLAN_N_VID,
+ GFP_KERNEL);
if (!nic_dev->vlan_bitmap) {
err = -ENOMEM;
goto err_vlan_bitmap;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
index b33ed4d92b71..b6bce622a6a8 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -73,7 +73,6 @@ void hinic_rxq_get_stats(struct hinic_rxq *rxq, struct hinic_rxq_stats *stats)
struct hinic_rxq_stats *rxq_stats = &rxq->rxq_stats;
unsigned int start;
- u64_stats_update_begin(&stats->syncp);
do {
start = u64_stats_fetch_begin(&rxq_stats->syncp);
stats->pkts = rxq_stats->pkts;
@@ -83,7 +82,6 @@ void hinic_rxq_get_stats(struct hinic_rxq *rxq, struct hinic_rxq_stats *stats)
stats->csum_errors = rxq_stats->csum_errors;
stats->other_errors = rxq_stats->other_errors;
} while (u64_stats_fetch_retry(&rxq_stats->syncp, start));
- u64_stats_update_end(&stats->syncp);
}
/**
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_tx.c b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
index 8d59babbf476..082eb2a5088d 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_tx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_tx.c
@@ -98,7 +98,6 @@ void hinic_txq_get_stats(struct hinic_txq *txq, struct hinic_txq_stats *stats)
struct hinic_txq_stats *txq_stats = &txq->txq_stats;
unsigned int start;
- u64_stats_update_begin(&stats->syncp);
do {
start = u64_stats_fetch_begin(&txq_stats->syncp);
stats->pkts = txq_stats->pkts;
@@ -108,7 +107,6 @@ void hinic_txq_get_stats(struct hinic_txq *txq, struct hinic_txq_stats *stats)
stats->tx_dropped = txq_stats->tx_dropped;
stats->big_frags_pkts = txq_stats->big_frags_pkts;
} while (u64_stats_fetch_retry(&txq_stats->syncp, start));
- u64_stats_update_end(&stats->syncp);
}
/**
diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
index 0ea0361cd86b..a988c08e906f 100644
--- a/drivers/net/ethernet/intel/iavf/iavf.h
+++ b/drivers/net/ethernet/intel/iavf/iavf.h
@@ -92,6 +92,7 @@ struct iavf_vsi {
#define IAVF_HKEY_ARRAY_SIZE ((IAVF_VFQF_HKEY_MAX_INDEX + 1) * 4)
#define IAVF_HLUT_ARRAY_SIZE ((IAVF_VFQF_HLUT_MAX_INDEX + 1) * 4)
#define IAVF_MBPS_DIVISOR 125000 /* divisor to convert to Mbps */
+#define IAVF_MBPS_QUANTA 50
#define IAVF_VIRTCHNL_VF_RESOURCE_SIZE (sizeof(struct virtchnl_vf_resource) + \
(IAVF_MAX_VF_VSI * \
@@ -430,6 +431,11 @@ struct iavf_adapter {
/* lock to protect access to the cloud filter list */
spinlock_t cloud_filter_list_lock;
u16 num_cloud_filters;
+ /* snapshot of "num_active_queues" before setup_tc for qdisc add
+ * is invoked. This information is useful during qdisc del flow,
+ * to restore correct number of queues
+ */
+ int orig_num_active_queues;
#define IAVF_MAX_FDIR_FILTERS 128 /* max allowed Flow Director filters */
u16 fdir_active_fltr;
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 2e2c153ce46a..3dbfaead2ac7 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3322,6 +3322,7 @@ static int iavf_validate_ch_config(struct iavf_adapter *adapter,
struct tc_mqprio_qopt_offload *mqprio_qopt)
{
u64 total_max_rate = 0;
+ u32 tx_rate_rem = 0;
int i, num_qps = 0;
u64 tx_rate = 0;
int ret = 0;
@@ -3336,12 +3337,32 @@ static int iavf_validate_ch_config(struct iavf_adapter *adapter,
return -EINVAL;
if (mqprio_qopt->min_rate[i]) {
dev_err(&adapter->pdev->dev,
- "Invalid min tx rate (greater than 0) specified\n");
+ "Invalid min tx rate (greater than 0) specified for TC%d\n",
+ i);
return -EINVAL;
}
- /*convert to Mbps */
+
+ /* convert to Mbps */
tx_rate = div_u64(mqprio_qopt->max_rate[i],
IAVF_MBPS_DIVISOR);
+
+ if (mqprio_qopt->max_rate[i] &&
+ tx_rate < IAVF_MBPS_QUANTA) {
+ dev_err(&adapter->pdev->dev,
+ "Invalid max tx rate for TC%d, minimum %dMbps\n",
+ i, IAVF_MBPS_QUANTA);
+ return -EINVAL;
+ }
+
+ (void)div_u64_rem(tx_rate, IAVF_MBPS_QUANTA, &tx_rate_rem);
+
+ if (tx_rate_rem != 0) {
+ dev_err(&adapter->pdev->dev,
+ "Invalid max tx rate for TC%d, not divisible by %d\n",
+ i, IAVF_MBPS_QUANTA);
+ return -EINVAL;
+ }
+
total_max_rate += tx_rate;
num_qps += mqprio_qopt->qopt.count[i];
}
@@ -3408,6 +3429,7 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
netif_tx_disable(netdev);
iavf_del_all_cloud_filters(adapter);
adapter->aq_required = IAVF_FLAG_AQ_DISABLE_CHANNELS;
+ total_qps = adapter->orig_num_active_queues;
goto exit;
} else {
return -EINVAL;
@@ -3451,7 +3473,21 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
adapter->ch_config.ch_info[i].offset = 0;
}
}
+
+ /* Take snapshot of original config such as "num_active_queues"
+ * It is used later when delete ADQ flow is exercised, so that
+ * once delete ADQ flow completes, VF shall go back to its
+ * original queue configuration
+ */
+
+ adapter->orig_num_active_queues = adapter->num_active_queues;
+
+ /* Store queue info based on TC so that VF gets configured
+ * with correct number of queues when VF completes ADQ config
+ * flow
+ */
adapter->ch_config.total_qps = total_qps;
+
netif_tx_stop_all_queues(netdev);
netif_tx_disable(netdev);
adapter->aq_required |= IAVF_FLAG_AQ_ENABLE_CHANNELS;
@@ -3468,6 +3504,12 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
}
}
exit:
+ if (test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
+ return 0;
+
+ netif_set_real_num_rx_queues(netdev, total_qps);
+ netif_set_real_num_tx_queues(netdev, total_qps);
+
return ret;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 522462f41067..f48938af960c 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -419,7 +419,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
IFF_PROMISC;
goto out_promisc;
}
- if (vsi->current_netdev_flags &
+ if (vsi->netdev->features &
NETIF_F_HW_VLAN_CTAG_FILTER)
vlan_ops->ena_rx_filtering(vsi);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
index 25b8f6f726eb..73960c8f9dba 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.c
+++ b/drivers/net/ethernet/intel/ice/ice_switch.c
@@ -4874,7 +4874,7 @@ ice_find_free_recp_res_idx(struct ice_hw *hw, const unsigned long *profiles,
bitmap_zero(recipes, ICE_MAX_NUM_RECIPES);
bitmap_zero(used_idx, ICE_MAX_FV_WORDS);
- bitmap_set(possible_idx, 0, ICE_MAX_FV_WORDS);
+ bitmap_fill(possible_idx, ICE_MAX_FV_WORDS);
/* For each profile we are going to associate the recipe with, add the
* recipes that are associated with that profile. This will give us
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index d0d14325a0d9..3ddf76aea3f1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -109,7 +109,7 @@ struct page_pool;
#define MLX5E_REQUIRED_WQE_MTTS (MLX5_ALIGN_MTTS(MLX5_MPWRQ_PAGES_PER_WQE + 1))
#define MLX5E_REQUIRED_MTTS(wqes) (wqes * MLX5E_REQUIRED_WQE_MTTS)
#define MLX5E_MAX_RQ_NUM_MTTS \
- ((1 << 16) * 2) /* So that MLX5_MTT_OCTW(num_mtts) fits into u16 */
+ (ALIGN_DOWN(U16_MAX, 4) * 2) /* So that MLX5_MTT_OCTW(num_mtts) fits into u16 */
#define MLX5E_ORDER2_MAX_PACKET_MTU (order_base_2(10 * 1024))
#define MLX5E_PARAMS_MAXIMUM_LOG_RQ_SIZE_MPW \
(ilog2(MLX5E_MAX_RQ_NUM_MTTS / MLX5E_REQUIRED_WQE_MTTS))
@@ -174,8 +174,8 @@ struct page_pool;
ALIGN_DOWN(MLX5E_KLM_MAX_ENTRIES_PER_WQE(wqe_size), MLX5_UMR_KLM_ALIGNMENT)
#define MLX5E_MAX_KLM_PER_WQE(mdev) \
- MLX5E_KLM_ENTRIES_PER_WQE(mlx5e_get_sw_max_sq_mpw_wqebbs(mlx5e_get_max_sq_wqebbs(mdev)) \
- << MLX5_MKEY_BSF_OCTO_SIZE)
+ MLX5E_KLM_ENTRIES_PER_WQE(MLX5_SEND_WQE_BB * \
+ mlx5e_get_sw_max_sq_mpw_wqebbs(mlx5e_get_max_sq_wqebbs(mdev)))
#define MLX5E_MSG_LEVEL NETIF_MSG_LINK
@@ -233,7 +233,7 @@ static inline u16 mlx5e_get_max_sq_wqebbs(struct mlx5_core_dev *mdev)
MLX5_CAP_GEN(mdev, max_wqe_sz_sq) / MLX5_SEND_WQE_BB);
}
-static inline u16 mlx5e_get_sw_max_sq_mpw_wqebbs(u16 max_sq_wqebbs)
+static inline u8 mlx5e_get_sw_max_sq_mpw_wqebbs(u8 max_sq_wqebbs)
{
/* The return value will be multiplied by MLX5_SEND_WQEBB_NUM_DS.
* Since max_sq_wqebbs may be up to MLX5_SEND_WQE_MAX_WQEBBS == 16,
@@ -242,11 +242,12 @@ static inline u16 mlx5e_get_sw_max_sq_mpw_wqebbs(u16 max_sq_wqebbs)
* than MLX5_SEND_WQE_MAX_WQEBBS to let a full-session WQE be
* cache-aligned.
*/
-#if L1_CACHE_BYTES < 128
- return min_t(u16, max_sq_wqebbs, MLX5_SEND_WQE_MAX_WQEBBS - 1);
-#else
- return min_t(u16, max_sq_wqebbs, MLX5_SEND_WQE_MAX_WQEBBS - 2);
+ u8 wqebbs = min_t(u8, max_sq_wqebbs, MLX5_SEND_WQE_MAX_WQEBBS - 1);
+
+#if L1_CACHE_BYTES >= 128
+ wqebbs = ALIGN_DOWN(wqebbs, 2);
#endif
+ return wqebbs;
}
struct mlx5e_tx_wqe {
@@ -456,7 +457,7 @@ struct mlx5e_txqsq {
struct netdev_queue *txq;
u32 sqn;
u16 stop_room;
- u16 max_sq_mpw_wqebbs;
+ u8 max_sq_mpw_wqebbs;
u8 min_inline_mode;
struct device *pdev;
__be32 mkey_be;
@@ -571,7 +572,7 @@ struct mlx5e_xdpsq {
struct device *pdev;
__be32 mkey_be;
u16 stop_room;
- u16 max_sq_mpw_wqebbs;
+ u8 max_sq_mpw_wqebbs;
u8 min_inline_mode;
unsigned long state;
unsigned int hw_mtu;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 08fd1370a8b0..75da12e1b0c7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -797,8 +797,20 @@ static u8 mlx5e_build_icosq_log_wq_sz(struct mlx5_core_dev *mdev,
return MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE;
wqebbs = MLX5E_UMR_WQEBBS * BIT(mlx5e_get_rq_log_wq_sz(rqp->rqc));
+
+ /* If XDP program is attached, XSK may be turned on at any time without
+ * restarting the channel. ICOSQ must be big enough to fit UMR WQEs of
+ * both regular RQ and XSK RQ.
+ * Although mlx5e_mpwqe_get_log_rq_size accepts mlx5e_xsk_param, it
+ * doesn't affect its return value, as long as params->xdp_prog != NULL,
+ * so we can just multiply by 2.
+ */
+ if (params->xdp_prog)
+ wqebbs *= 2;
+
if (params->packet_merge.type == MLX5E_PACKET_MERGE_SHAMPO)
wqebbs += mlx5e_shampo_icosq_sz(mdev, params, rqp);
+
return max_t(u8, MLX5E_PARAMS_MINIMUM_LOG_SQ_SIZE, order_base_2(wqebbs));
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
index dea137dd744b..2b64dd557b5d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
@@ -128,6 +128,7 @@ mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *at
post_attr->inner_match_level = MLX5_MATCH_NONE;
post_attr->outer_match_level = MLX5_MATCH_NONE;
post_attr->action &= ~MLX5_FLOW_CONTEXT_ACTION_DECAP;
+ post_attr->flags |= MLX5_ATTR_FLAG_NO_IN_PORT;
handle->ns_type = post_act->ns_type;
/* Splits were handled before post action */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
index 7f88ccf67fdd..8b56cb8b4743 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
@@ -7,6 +7,8 @@
#include "en.h"
#include <net/xdp_sock_drv.h>
+#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL
+
/* RX data path */
struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
@@ -22,6 +24,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
struct mlx5e_dma_info *dma_info)
{
+retry:
dma_info->xsk = xsk_buff_alloc(rq->xsk_pool);
if (!dma_info->xsk)
return -ENOMEM;
@@ -33,6 +36,17 @@ static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
*/
dma_info->addr = xsk_buff_xdp_get_frame_dma(dma_info->xsk);
+ /* MTT page mapping has alignment requirements. If they are not
+ * satisfied, leak the descriptor so that it won't come again, and try
+ * to allocate a new one.
+ */
+ if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
+ if (unlikely(dma_info->addr & ~MLX5E_MTT_PTAG_MASK)) {
+ xsk_buff_discard(dma_info->xsk);
+ goto retry;
+ }
+ }
+
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
index d93aadbf10da..90ea78239d40 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
@@ -16,7 +16,7 @@ static int mlx5e_ktls_add(struct net_device *netdev, struct sock *sk,
struct mlx5_core_dev *mdev = priv->mdev;
int err;
- if (WARN_ON(!mlx5e_ktls_type_check(mdev, crypto_info)))
+ if (!mlx5e_ktls_type_check(mdev, crypto_info))
return -EOPNOTSUPP;
if (direction == TLS_OFFLOAD_CTX_DIR_TX)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 796d97bcf1aa..c6546613f7d8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -230,10 +230,8 @@ esw_setup_ft_dest(struct mlx5_flow_destination *dest,
}
static void
-esw_setup_slow_path_dest(struct mlx5_flow_destination *dest,
- struct mlx5_flow_act *flow_act,
- struct mlx5_fs_chains *chains,
- int i)
+esw_setup_accept_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *flow_act,
+ struct mlx5_fs_chains *chains, int i)
{
if (mlx5_chains_ignore_flow_level_supported(chains))
flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
@@ -241,6 +239,16 @@ esw_setup_slow_path_dest(struct mlx5_flow_destination *dest,
dest[i].ft = mlx5_chains_get_tc_end_ft(chains);
}
+static void
+esw_setup_slow_path_dest(struct mlx5_flow_destination *dest, struct mlx5_flow_act *flow_act,
+ struct mlx5_eswitch *esw, int i)
+{
+ if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ignore_flow_level))
+ flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
+ dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+ dest[i].ft = esw->fdb_table.offloads.slow_fdb;
+}
+
static int
esw_setup_chain_dest(struct mlx5_flow_destination *dest,
struct mlx5_flow_act *flow_act,
@@ -473,8 +481,11 @@ esw_setup_dests(struct mlx5_flow_destination *dest,
} else if (attr->dest_ft) {
esw_setup_ft_dest(dest, flow_act, esw, attr, spec, *i);
(*i)++;
- } else if (mlx5e_tc_attr_flags_skip(attr->flags)) {
- esw_setup_slow_path_dest(dest, flow_act, chains, *i);
+ } else if (attr->flags & MLX5_ATTR_FLAG_SLOW_PATH) {
+ esw_setup_slow_path_dest(dest, flow_act, esw, *i);
+ (*i)++;
+ } else if (attr->flags & MLX5_ATTR_FLAG_ACCEPT) {
+ esw_setup_accept_dest(dest, flow_act, chains, *i);
(*i)++;
} else if (attr->dest_chain) {
err = esw_setup_chain_dest(dest, flow_act, chains, attr->dest_chain,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
index d758848d34d0..696e45e2bd06 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
@@ -32,20 +32,17 @@ static void tout_set(struct mlx5_core_dev *dev, u64 val, enum mlx5_timeouts_type
dev->timeouts->to[type] = val;
}
-void mlx5_tout_set_def_val(struct mlx5_core_dev *dev)
+int mlx5_tout_init(struct mlx5_core_dev *dev)
{
int i;
- for (i = 0; i < MAX_TIMEOUT_TYPES; i++)
- tout_set(dev, tout_def_sw_val[i], i);
-}
-
-int mlx5_tout_init(struct mlx5_core_dev *dev)
-{
dev->timeouts = kmalloc(sizeof(*dev->timeouts), GFP_KERNEL);
if (!dev->timeouts)
return -ENOMEM;
+ for (i = 0; i < MAX_TIMEOUT_TYPES; i++)
+ tout_set(dev, tout_def_sw_val[i], i);
+
return 0;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
index 257c03eeab36..bc9e9aeda847 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
@@ -35,7 +35,6 @@ int mlx5_tout_init(struct mlx5_core_dev *dev);
void mlx5_tout_cleanup(struct mlx5_core_dev *dev);
void mlx5_tout_query_iseg(struct mlx5_core_dev *dev);
int mlx5_tout_query_dtor(struct mlx5_core_dev *dev);
-void mlx5_tout_set_def_val(struct mlx5_core_dev *dev);
u64 _mlx5_tout_ms(struct mlx5_core_dev *dev, enum mlx5_timeouts_types type);
#define mlx5_tout_ms(dev, type) _mlx5_tout_ms(dev, MLX5_TO_##type##_MS)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 8b5263699994..75d216246955 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -526,7 +526,7 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
/* Check log_max_qp from HCA caps to set in current profile */
if (prof->log_max_qp == LOG_MAX_SUPPORTED_QPS) {
- prof->log_max_qp = min_t(u8, 17, MLX5_CAP_GEN_MAX(dev, log_max_qp));
+ prof->log_max_qp = min_t(u8, 18, MLX5_CAP_GEN_MAX(dev, log_max_qp));
} else if (MLX5_CAP_GEN_MAX(dev, log_max_qp) < prof->log_max_qp) {
mlx5_core_warn(dev, "log_max_qp value in current profile is %d, changing it to HCA capability limit (%d)\n",
prof->log_max_qp,
@@ -1025,8 +1025,6 @@ static int mlx5_function_setup(struct mlx5_core_dev *dev, u64 timeout)
if (mlx5_core_is_pf(dev))
pcie_print_link_status(dev->pdev);
- mlx5_tout_set_def_val(dev);
-
/* wait for firmware to accept initialization segments configurations
*/
err = wait_fw_init(dev, timeout,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index 887ee0f729d1..2935614f6fa9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -87,6 +87,11 @@ static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
enable_vfs_hca:
num_msix_count = mlx5_get_default_msix_vec_count(dev, num_vfs);
for (vf = 0; vf < num_vfs; vf++) {
+ /* Notify the VF before its enablement to let it set
+ * some stuff.
+ */
+ blocking_notifier_call_chain(&sriov->vfs_ctx[vf].notifier,
+ MLX5_PF_NOTIFY_ENABLE_VF, dev);
err = mlx5_core_enable_hca(dev, vf + 1);
if (err) {
mlx5_core_warn(dev, "failed to enable VF %d (%d)\n", vf, err);
@@ -127,6 +132,11 @@ mlx5_device_disable_sriov(struct mlx5_core_dev *dev, int num_vfs, bool clear_vf)
for (vf = num_vfs - 1; vf >= 0; vf--) {
if (!sriov->vfs_ctx[vf].enabled)
continue;
+ /* Notify the VF before its disablement to let it clean
+ * some resources.
+ */
+ blocking_notifier_call_chain(&sriov->vfs_ctx[vf].notifier,
+ MLX5_PF_NOTIFY_DISABLE_VF, dev);
err = mlx5_core_disable_hca(dev, vf + 1);
if (err) {
mlx5_core_warn(dev, "failed to disable VF %d\n", vf);
@@ -257,7 +267,7 @@ int mlx5_sriov_init(struct mlx5_core_dev *dev)
{
struct mlx5_core_sriov *sriov = &dev->priv.sriov;
struct pci_dev *pdev = dev->pdev;
- int total_vfs;
+ int total_vfs, i;
if (!mlx5_core_is_pf(dev))
return 0;
@@ -269,6 +279,9 @@ int mlx5_sriov_init(struct mlx5_core_dev *dev)
if (!sriov->vfs_ctx)
return -ENOMEM;
+ for (i = 0; i < total_vfs; i++)
+ BLOCKING_INIT_NOTIFIER_HEAD(&sriov->vfs_ctx[i].notifier);
+
return 0;
}
@@ -281,3 +294,53 @@ void mlx5_sriov_cleanup(struct mlx5_core_dev *dev)
kfree(sriov->vfs_ctx);
}
+
+/**
+ * mlx5_sriov_blocking_notifier_unregister - Unregister a VF from
+ * a notification block chain.
+ *
+ * @mdev: The mlx5 core device.
+ * @vf_id: The VF id.
+ * @nb: The notifier block to be unregistered.
+ */
+void mlx5_sriov_blocking_notifier_unregister(struct mlx5_core_dev *mdev,
+ int vf_id,
+ struct notifier_block *nb)
+{
+ struct mlx5_vf_context *vfs_ctx;
+ struct mlx5_core_sriov *sriov;
+
+ sriov = &mdev->priv.sriov;
+ if (WARN_ON(vf_id < 0 || vf_id >= sriov->num_vfs))
+ return;
+
+ vfs_ctx = &sriov->vfs_ctx[vf_id];
+ blocking_notifier_chain_unregister(&vfs_ctx->notifier, nb);
+}
+EXPORT_SYMBOL(mlx5_sriov_blocking_notifier_unregister);
+
+/**
+ * mlx5_sriov_blocking_notifier_register - Register a VF notification
+ * block chain.
+ *
+ * @mdev: The mlx5 core device.
+ * @vf_id: The VF id.
+ * @nb: The notifier block to be called upon the VF events.
+ *
+ * Returns 0 on success or an error code.
+ */
+int mlx5_sriov_blocking_notifier_register(struct mlx5_core_dev *mdev,
+ int vf_id,
+ struct notifier_block *nb)
+{
+ struct mlx5_vf_context *vfs_ctx;
+ struct mlx5_core_sriov *sriov;
+
+ sriov = &mdev->priv.sriov;
+ if (vf_id < 0 || vf_id >= sriov->num_vfs)
+ return -EINVAL;
+
+ vfs_ctx = &sriov->vfs_ctx[vf_id];
+ return blocking_notifier_chain_register(&vfs_ctx->notifier, nb);
+}
+EXPORT_SYMBOL(mlx5_sriov_blocking_notifier_register);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
index d5998ef59be4..7adcf0eec13b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c
@@ -21,10 +21,11 @@ enum dr_dump_rec_type {
DR_DUMP_REC_TYPE_TABLE_TX = 3102,
DR_DUMP_REC_TYPE_MATCHER = 3200,
- DR_DUMP_REC_TYPE_MATCHER_MASK = 3201,
+ DR_DUMP_REC_TYPE_MATCHER_MASK_DEPRECATED = 3201,
DR_DUMP_REC_TYPE_MATCHER_RX = 3202,
DR_DUMP_REC_TYPE_MATCHER_TX = 3203,
DR_DUMP_REC_TYPE_MATCHER_BUILDER = 3204,
+ DR_DUMP_REC_TYPE_MATCHER_MASK = 3205,
DR_DUMP_REC_TYPE_RULE = 3300,
DR_DUMP_REC_TYPE_RULE_RX_ENTRY_V0 = 3301,
@@ -114,13 +115,15 @@ dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id,
break;
case DR_ACTION_TYP_FT:
if (action->dest_tbl->is_fw_tbl)
- seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n",
+ seq_printf(file, "%d,0x%llx,0x%llx,0x%x,0x%x\n",
DR_DUMP_REC_TYPE_ACTION_FT, action_id,
- rule_id, action->dest_tbl->fw_tbl.id);
+ rule_id, action->dest_tbl->fw_tbl.id,
+ -1);
else
- seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n",
+ seq_printf(file, "%d,0x%llx,0x%llx,0x%x,0x%llx\n",
DR_DUMP_REC_TYPE_ACTION_FT, action_id,
- rule_id, action->dest_tbl->tbl->table_id);
+ rule_id, action->dest_tbl->tbl->table_id,
+ DR_DBG_PTR_TO_ID(action->dest_tbl->tbl));
break;
case DR_ACTION_TYP_CTR:
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 20ceac81a2c2..497077726916 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -3245,6 +3245,7 @@ int ocelot_init(struct ocelot *ocelot)
mutex_init(&ocelot->ptp_lock);
mutex_init(&ocelot->mact_lock);
mutex_init(&ocelot->fwd_domain_lock);
+ mutex_init(&ocelot->tas_lock);
spin_lock_init(&ocelot->ptp_clock_lock);
spin_lock_init(&ocelot->ts_id_lock);
snprintf(queue_name, sizeof(queue_name), "%s-stats",
diff --git a/drivers/net/ethernet/mscc/ocelot_ptp.c b/drivers/net/ethernet/mscc/ocelot_ptp.c
index 87ad2137ba06..09c703efe946 100644
--- a/drivers/net/ethernet/mscc/ocelot_ptp.c
+++ b/drivers/net/ethernet/mscc/ocelot_ptp.c
@@ -72,6 +72,10 @@ int ocelot_ptp_settime64(struct ptp_clock_info *ptp,
ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+
+ if (ocelot->ops->tas_clock_adjust)
+ ocelot->ops->tas_clock_adjust(ocelot);
+
return 0;
}
EXPORT_SYMBOL(ocelot_ptp_settime64);
@@ -105,6 +109,9 @@ int ocelot_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
ocelot_write_rix(ocelot, val, PTP_PIN_CFG, TOD_ACC_PIN);
spin_unlock_irqrestore(&ocelot->ptp_clock_lock, flags);
+
+ if (ocelot->ops->tas_clock_adjust)
+ ocelot->ops->tas_clock_adjust(ocelot);
} else {
/* Fall back using ocelot_ptp_settime64 which is not exact. */
struct timespec64 ts;
@@ -117,6 +124,7 @@ int ocelot_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
ocelot_ptp_settime64(ptp, &ts);
}
+
return 0;
}
EXPORT_SYMBOL(ocelot_ptp_adjtime);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index f3568901eb91..1443f788ee37 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -1437,7 +1437,7 @@ static int ionic_set_nic_features(struct ionic_lif *lif,
if ((old_hw_features ^ lif->hw_features) & IONIC_ETH_HW_RX_HASH)
ionic_lif_rss_config(lif, lif->rss_types, NULL, NULL);
- if ((vlan_flags & features) &&
+ if ((vlan_flags & le64_to_cpu(ctx.cmd.lif_setattr.features)) &&
!(vlan_flags & le64_to_cpu(ctx.comp.lif_setattr.features)))
dev_info_once(lif->ionic->dev, "NIC is not supporting vlan offload, likely in SmartNIC mode\n");
diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c
index a43820212932..50854265864d 100644
--- a/drivers/net/netdevsim/bpf.c
+++ b/drivers/net/netdevsim/bpf.c
@@ -351,10 +351,12 @@ nsim_map_alloc_elem(struct bpf_offloaded_map *offmap, unsigned int idx)
{
struct nsim_bpf_bound_map *nmap = offmap->dev_priv;
- nmap->entry[idx].key = kmalloc(offmap->map.key_size, GFP_USER);
+ nmap->entry[idx].key = kmalloc(offmap->map.key_size,
+ GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
if (!nmap->entry[idx].key)
return -ENOMEM;
- nmap->entry[idx].value = kmalloc(offmap->map.value_size, GFP_USER);
+ nmap->entry[idx].value = kmalloc(offmap->map.value_size,
+ GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
if (!nmap->entry[idx].value) {
kfree(nmap->entry[idx].key);
nmap->entry[idx].key = NULL;
@@ -496,7 +498,7 @@ nsim_bpf_map_alloc(struct netdevsim *ns, struct bpf_offloaded_map *offmap)
if (offmap->map.map_flags)
return -EINVAL;
- nmap = kzalloc(sizeof(*nmap), GFP_USER);
+ nmap = kzalloc(sizeof(*nmap), GFP_KERNEL_ACCOUNT);
if (!nmap)
return -ENOMEM;
diff --git a/drivers/net/netdevsim/fib.c b/drivers/net/netdevsim/fib.c
index 378ee779061c..14787d17f703 100644
--- a/drivers/net/netdevsim/fib.c
+++ b/drivers/net/netdevsim/fib.c
@@ -53,6 +53,7 @@ struct nsim_fib_data {
struct rhashtable nexthop_ht;
struct devlink *devlink;
struct work_struct fib_event_work;
+ struct work_struct fib_flush_work;
struct list_head fib_event_queue;
spinlock_t fib_event_queue_lock; /* Protects fib event queue list */
struct mutex nh_lock; /* Protects NH HT */
@@ -977,7 +978,7 @@ static int nsim_fib_event_schedule_work(struct nsim_fib_data *data,
fib_event = kzalloc(sizeof(*fib_event), GFP_ATOMIC);
if (!fib_event)
- return NOTIFY_BAD;
+ goto err_fib_event_alloc;
fib_event->data = data;
fib_event->event = event;
@@ -1005,6 +1006,9 @@ static int nsim_fib_event_schedule_work(struct nsim_fib_data *data,
err_fib_prepare_event:
kfree(fib_event);
+err_fib_event_alloc:
+ if (event == FIB_EVENT_ENTRY_DEL)
+ schedule_work(&data->fib_flush_work);
return NOTIFY_BAD;
}
@@ -1482,6 +1486,24 @@ static void nsim_fib_event_work(struct work_struct *work)
mutex_unlock(&data->fib_lock);
}
+static void nsim_fib_flush_work(struct work_struct *work)
+{
+ struct nsim_fib_data *data = container_of(work, struct nsim_fib_data,
+ fib_flush_work);
+ struct nsim_fib_rt *fib_rt, *fib_rt_tmp;
+
+ /* Process pending work. */
+ flush_work(&data->fib_event_work);
+
+ mutex_lock(&data->fib_lock);
+ list_for_each_entry_safe(fib_rt, fib_rt_tmp, &data->fib_rt_list, list) {
+ rhashtable_remove_fast(&data->fib_rt_ht, &fib_rt->ht_node,
+ nsim_fib_rt_ht_params);
+ nsim_fib_rt_free(fib_rt, data);
+ }
+ mutex_unlock(&data->fib_lock);
+}
+
static int
nsim_fib_debugfs_init(struct nsim_fib_data *data, struct nsim_dev *nsim_dev)
{
@@ -1540,6 +1562,7 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
goto err_rhashtable_nexthop_destroy;
INIT_WORK(&data->fib_event_work, nsim_fib_event_work);
+ INIT_WORK(&data->fib_flush_work, nsim_fib_flush_work);
INIT_LIST_HEAD(&data->fib_event_queue);
spin_lock_init(&data->fib_event_queue_lock);
@@ -1586,6 +1609,7 @@ struct nsim_fib_data *nsim_fib_create(struct devlink *devlink,
err_nexthop_nb_unregister:
unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
err_rhashtable_fib_destroy:
+ cancel_work_sync(&data->fib_flush_work);
flush_work(&data->fib_event_work);
rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
data);
@@ -1615,6 +1639,7 @@ void nsim_fib_destroy(struct devlink *devlink, struct nsim_fib_data *data)
NSIM_RESOURCE_IPV4_FIB);
unregister_fib_notifier(devlink_net(devlink), &data->fib_nb);
unregister_nexthop_notifier(devlink_net(devlink), &data->nexthop_nb);
+ cancel_work_sync(&data->fib_flush_work);
flush_work(&data->fib_event_work);
rhashtable_free_and_destroy(&data->fib_rt_ht, nsim_fib_rt_free,
data);
diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index d8cac02a79b9..636b0907a598 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -110,7 +110,7 @@ static int smsc_phy_config_init(struct phy_device *phydev)
struct smsc_phy_priv *priv = phydev->priv;
int rc;
- if (!priv->energy_enable)
+ if (!priv->energy_enable || phydev->irq != PHY_POLL)
return 0;
rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
@@ -210,6 +210,8 @@ static int lan95xx_config_aneg_ext(struct phy_device *phydev)
* response on link pulses to detect presence of plugged Ethernet cable.
* The Energy Detect Power-Down mode is enabled again in the end of procedure to
* save approximately 220 mW of power if cable is unplugged.
+ * The workaround is only applicable to poll mode. Energy Detect Power-Down may
+ * not be used in interrupt mode lest link change detection becomes unreliable.
*/
static int lan87xx_read_status(struct phy_device *phydev)
{
@@ -217,7 +219,7 @@ static int lan87xx_read_status(struct phy_device *phydev)
int err = genphy_read_status(phydev);
- if (!phydev->link && priv->energy_enable) {
+ if (!phydev->link && priv->energy_enable && phydev->irq == PHY_POLL) {
/* Disable EDPD to wake up PHY */
int rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
if (rc < 0)
diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
index e62fc4f2aee0..76659c1c525a 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -637,8 +637,9 @@ config USB_NET_AQC111
* Aquantia AQtion USB to 5GbE
config USB_RTL8153_ECM
- tristate "RTL8153 ECM support"
+ tristate
depends on USB_NET_CDCETHER && (USB_RTL8152 || USB_RTL8152=n)
+ default y
help
This option supports ECM mode for RTL8153 ethernet adapter, when
CONFIG_USB_RTL8152 is not set, or the RTL8153 device is not
diff --git a/drivers/net/usb/ax88179_178a.c b/drivers/net/usb/ax88179_178a.c
index dc1f6d8444ad..873f6deabbd1 100644
--- a/drivers/net/usb/ax88179_178a.c
+++ b/drivers/net/usb/ax88179_178a.c
@@ -1801,7 +1801,7 @@ static const struct driver_info ax88179_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1814,7 +1814,7 @@ static const struct driver_info ax88178a_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1827,7 +1827,7 @@ static const struct driver_info cypress_GX3_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1840,7 +1840,7 @@ static const struct driver_info dlink_dub1312_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1853,7 +1853,7 @@ static const struct driver_info sitecom_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1866,7 +1866,7 @@ static const struct driver_info samsung_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1879,7 +1879,7 @@ static const struct driver_info lenovo_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1892,7 +1892,7 @@ static const struct driver_info belkin_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1905,7 +1905,7 @@ static const struct driver_info toshiba_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1918,7 +1918,7 @@ static const struct driver_info mct_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1931,7 +1931,7 @@ static const struct driver_info at_umc2000_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1944,7 +1944,7 @@ static const struct driver_info at_umc200_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
@@ -1957,7 +1957,7 @@ static const struct driver_info at_umc2000sp_info = {
.link_reset = ax88179_link_reset,
.reset = ax88179_reset,
.stop = ax88179_stop,
- .flags = FLAG_ETHER | FLAG_FRAMING_AX | FLAG_SEND_ZLP,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
.rx_fixup = ax88179_rx_fixup,
.tx_fixup = ax88179_tx_fixup,
};
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index edf0492ad489..515363d74078 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -18,6 +18,8 @@
#include <linux/usb/usbnet.h>
#include <linux/slab.h>
#include <linux/of_net.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
#include <linux/mdio.h>
#include <linux/phy.h>
#include <net/selftests.h>
@@ -53,6 +55,9 @@
#define SUSPEND_ALLMODES (SUSPEND_SUSPEND0 | SUSPEND_SUSPEND1 | \
SUSPEND_SUSPEND2 | SUSPEND_SUSPEND3)
+#define SMSC95XX_NR_IRQS (1) /* raise to 12 for GPIOs */
+#define PHY_HWIRQ (SMSC95XX_NR_IRQS - 1)
+
struct smsc95xx_priv {
u32 mac_cr;
u32 hash_hi;
@@ -61,8 +66,12 @@ struct smsc95xx_priv {
spinlock_t mac_cr_lock;
u8 features;
u8 suspend_flags;
+ struct irq_chip irqchip;
+ struct irq_domain *irqdomain;
+ struct fwnode_handle *irqfwnode;
struct mii_bus *mdiobus;
struct phy_device *phydev;
+ struct task_struct *pm_task;
};
static bool turbo_mode = true;
@@ -72,13 +81,14 @@ MODULE_PARM_DESC(turbo_mode, "Enable multiple frames per Rx transaction");
static int __must_check __smsc95xx_read_reg(struct usbnet *dev, u32 index,
u32 *data, int in_pm)
{
+ struct smsc95xx_priv *pdata = dev->driver_priv;
u32 buf;
int ret;
int (*fn)(struct usbnet *, u8, u8, u16, u16, void *, u16);
BUG_ON(!dev);
- if (!in_pm)
+ if (current != pdata->pm_task)
fn = usbnet_read_cmd;
else
fn = usbnet_read_cmd_nopm;
@@ -102,13 +112,14 @@ static int __must_check __smsc95xx_read_reg(struct usbnet *dev, u32 index,
static int __must_check __smsc95xx_write_reg(struct usbnet *dev, u32 index,
u32 data, int in_pm)
{
+ struct smsc95xx_priv *pdata = dev->driver_priv;
u32 buf;
int ret;
int (*fn)(struct usbnet *, u8, u8, u16, u16, const void *, u16);
BUG_ON(!dev);
- if (!in_pm)
+ if (current != pdata->pm_task)
fn = usbnet_write_cmd;
else
fn = usbnet_write_cmd_nopm;
@@ -566,16 +577,12 @@ static int smsc95xx_phy_update_flowcontrol(struct usbnet *dev)
return smsc95xx_write_reg(dev, AFC_CFG, afc_cfg);
}
-static int smsc95xx_link_reset(struct usbnet *dev)
+static void smsc95xx_mac_update_fullduplex(struct usbnet *dev)
{
struct smsc95xx_priv *pdata = dev->driver_priv;
unsigned long flags;
int ret;
- ret = smsc95xx_write_reg(dev, INT_STS, INT_STS_CLEAR_ALL_);
- if (ret < 0)
- return ret;
-
spin_lock_irqsave(&pdata->mac_cr_lock, flags);
if (pdata->phydev->duplex != DUPLEX_FULL) {
pdata->mac_cr &= ~MAC_CR_FDPX_;
@@ -587,18 +594,22 @@ static int smsc95xx_link_reset(struct usbnet *dev)
spin_unlock_irqrestore(&pdata->mac_cr_lock, flags);
ret = smsc95xx_write_reg(dev, MAC_CR, pdata->mac_cr);
- if (ret < 0)
- return ret;
+ if (ret < 0) {
+ if (ret != -ENODEV)
+ netdev_warn(dev->net,
+ "Error updating MAC full duplex mode\n");
+ return;
+ }
ret = smsc95xx_phy_update_flowcontrol(dev);
if (ret < 0)
netdev_warn(dev->net, "Error updating PHY flow control\n");
-
- return ret;
}
static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
{
+ struct smsc95xx_priv *pdata = dev->driver_priv;
+ unsigned long flags;
u32 intdata;
if (urb->actual_length != 4) {
@@ -610,11 +621,15 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
intdata = get_unaligned_le32(urb->transfer_buffer);
netif_dbg(dev, link, dev->net, "intdata: 0x%08X\n", intdata);
+ local_irq_save(flags);
+
if (intdata & INT_ENP_PHY_INT_)
- usbnet_defer_kevent(dev, EVENT_LINK_RESET);
+ generic_handle_domain_irq(pdata->irqdomain, PHY_HWIRQ);
else
netdev_warn(dev->net, "unexpected interrupt, intdata=0x%08X\n",
intdata);
+
+ local_irq_restore(flags);
}
/* Enable or disable Tx & Rx checksum offload engines */
@@ -1092,6 +1107,7 @@ static void smsc95xx_handle_link_change(struct net_device *net)
struct usbnet *dev = netdev_priv(net);
phy_print_status(net->phydev);
+ smsc95xx_mac_update_fullduplex(dev);
usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
}
@@ -1099,8 +1115,9 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
{
struct smsc95xx_priv *pdata;
bool is_internal_phy;
+ char usb_path[64];
+ int ret, phy_irq;
u32 val;
- int ret;
printk(KERN_INFO SMSC_CHIPNAME " v" SMSC_DRIVER_VERSION "\n");
@@ -1140,10 +1157,38 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
if (ret)
goto free_pdata;
+ /* create irq domain for use by PHY driver and GPIO consumers */
+ usb_make_path(dev->udev, usb_path, sizeof(usb_path));
+ pdata->irqfwnode = irq_domain_alloc_named_fwnode(usb_path);
+ if (!pdata->irqfwnode) {
+ ret = -ENOMEM;
+ goto free_pdata;
+ }
+
+ pdata->irqdomain = irq_domain_create_linear(pdata->irqfwnode,
+ SMSC95XX_NR_IRQS,
+ &irq_domain_simple_ops,
+ pdata);
+ if (!pdata->irqdomain) {
+ ret = -ENOMEM;
+ goto free_irqfwnode;
+ }
+
+ phy_irq = irq_create_mapping(pdata->irqdomain, PHY_HWIRQ);
+ if (!phy_irq) {
+ ret = -ENOENT;
+ goto remove_irqdomain;
+ }
+
+ pdata->irqchip = dummy_irq_chip;
+ pdata->irqchip.name = SMSC_CHIPNAME;
+ irq_set_chip_and_handler_name(phy_irq, &pdata->irqchip,
+ handle_simple_irq, "phy");
+
pdata->mdiobus = mdiobus_alloc();
if (!pdata->mdiobus) {
ret = -ENOMEM;
- goto free_pdata;
+ goto dispose_irq;
}
ret = smsc95xx_read_reg(dev, HW_CFG, &val);
@@ -1176,6 +1221,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
goto unregister_mdio;
}
+ pdata->phydev->irq = phy_irq;
pdata->phydev->is_internal = is_internal_phy;
/* detect device revision as different features may be available */
@@ -1218,6 +1264,15 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
free_mdio:
mdiobus_free(pdata->mdiobus);
+dispose_irq:
+ irq_dispose_mapping(phy_irq);
+
+remove_irqdomain:
+ irq_domain_remove(pdata->irqdomain);
+
+free_irqfwnode:
+ irq_domain_free_fwnode(pdata->irqfwnode);
+
free_pdata:
kfree(pdata);
return ret;
@@ -1230,6 +1285,9 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
phy_disconnect(dev->net->phydev);
mdiobus_unregister(pdata->mdiobus);
mdiobus_free(pdata->mdiobus);
+ irq_dispose_mapping(irq_find_mapping(pdata->irqdomain, PHY_HWIRQ));
+ irq_domain_remove(pdata->irqdomain);
+ irq_domain_free_fwnode(pdata->irqfwnode);
netif_dbg(dev, ifdown, dev->net, "free pdata\n");
kfree(pdata);
}
@@ -1254,29 +1312,6 @@ static u32 smsc_crc(const u8 *buffer, size_t len, int filter)
return crc << ((filter % 2) * 16);
}
-static int smsc95xx_enable_phy_wakeup_interrupts(struct usbnet *dev, u16 mask)
-{
- int ret;
-
- netdev_dbg(dev->net, "enabling PHY wakeup interrupts\n");
-
- /* read to clear */
- ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_SRC);
- if (ret < 0)
- return ret;
-
- /* enable interrupt source */
- ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_MASK);
- if (ret < 0)
- return ret;
-
- ret |= mask;
-
- smsc95xx_mdio_write_nopm(dev, PHY_INT_MASK, ret);
-
- return 0;
-}
-
static int smsc95xx_link_ok_nopm(struct usbnet *dev)
{
int ret;
@@ -1443,7 +1478,6 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
{
struct smsc95xx_priv *pdata = dev->driver_priv;
- int ret;
if (!netif_running(dev->net)) {
/* interface is ifconfig down so fully power down hw */
@@ -1462,27 +1496,10 @@ static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
}
netdev_dbg(dev->net, "autosuspend entering SUSPEND1\n");
-
- /* enable PHY wakeup events for if cable is attached */
- ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
- PHY_INT_MASK_ANEG_COMP_);
- if (ret < 0) {
- netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
- return ret;
- }
-
netdev_info(dev->net, "entering SUSPEND1 mode\n");
return smsc95xx_enter_suspend1(dev);
}
- /* enable PHY wakeup events so we remote wakeup if cable is pulled */
- ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
- PHY_INT_MASK_LINK_DOWN_);
- if (ret < 0) {
- netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
- return ret;
- }
-
netdev_dbg(dev->net, "autosuspend entering SUSPEND3\n");
return smsc95xx_enter_suspend3(dev);
}
@@ -1494,9 +1511,12 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
u32 val, link_up;
int ret;
+ pdata->pm_task = current;
+
ret = usbnet_suspend(intf, message);
if (ret < 0) {
netdev_warn(dev->net, "usbnet_suspend error\n");
+ pdata->pm_task = NULL;
return ret;
}
@@ -1548,13 +1568,6 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
}
if (pdata->wolopts & WAKE_PHY) {
- ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
- (PHY_INT_MASK_ANEG_COMP_ | PHY_INT_MASK_LINK_DOWN_));
- if (ret < 0) {
- netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
- goto done;
- }
-
/* if link is down then configure EDPD and enter SUSPEND1,
* otherwise enter SUSPEND0 below
*/
@@ -1743,6 +1756,7 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
if (ret && PMSG_IS_AUTO(message))
usbnet_resume(intf);
+ pdata->pm_task = NULL;
return ret;
}
@@ -1763,45 +1777,53 @@ static int smsc95xx_resume(struct usb_interface *intf)
/* do this first to ensure it's cleared even in error case */
pdata->suspend_flags = 0;
+ pdata->pm_task = current;
+
if (suspend_flags & SUSPEND_ALLMODES) {
/* clear wake-up sources */
ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
if (ret < 0)
- return ret;
+ goto done;
val &= ~(WUCSR_WAKE_EN_ | WUCSR_MPEN_);
ret = smsc95xx_write_reg_nopm(dev, WUCSR, val);
if (ret < 0)
- return ret;
+ goto done;
/* clear wake-up status */
ret = smsc95xx_read_reg_nopm(dev, PM_CTRL, &val);
if (ret < 0)
- return ret;
+ goto done;
val &= ~PM_CTL_WOL_EN_;
val |= PM_CTL_WUPS_;
ret = smsc95xx_write_reg_nopm(dev, PM_CTRL, val);
if (ret < 0)
- return ret;
+ goto done;
}
+ phy_init_hw(pdata->phydev);
+
ret = usbnet_resume(intf);
if (ret < 0)
netdev_warn(dev->net, "usbnet_resume error\n");
- phy_init_hw(pdata->phydev);
+done:
+ pdata->pm_task = NULL;
return ret;
}
static int smsc95xx_reset_resume(struct usb_interface *intf)
{
struct usbnet *dev = usb_get_intfdata(intf);
+ struct smsc95xx_priv *pdata = dev->driver_priv;
int ret;
+ pdata->pm_task = current;
ret = smsc95xx_reset(dev);
+ pdata->pm_task = NULL;
if (ret < 0)
return ret;
@@ -1997,7 +2019,6 @@ static const struct driver_info smsc95xx_info = {
.description = "smsc95xx USB 2.0 Ethernet",
.bind = smsc95xx_bind,
.unbind = smsc95xx_unbind,
- .link_reset = smsc95xx_link_reset,
.reset = smsc95xx_reset,
.check_connect = smsc95xx_start_phy,
.stop = smsc95xx_stop,
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 9b2bd09628a3..502e6abbc8e2 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -849,13 +849,11 @@ int usbnet_stop (struct net_device *net)
mpn = !test_and_clear_bit(EVENT_NO_RUNTIME_PM, &dev->flags);
- /* deferred work (task, timer, softirq) must also stop.
- * can't flush_scheduled_work() until we drop rtnl (later),
- * else workers could deadlock; so make workers a NOP.
- */
+ /* deferred work (timer, softirq, task) must also stop */
dev->flags = 0;
del_timer_sync (&dev->delay);
tasklet_kill (&dev->bh);
+ cancel_work_sync(&dev->kevent);
if (!pm)
usb_autopm_put_interface(dev->intf);
@@ -1619,8 +1617,6 @@ void usbnet_disconnect (struct usb_interface *intf)
net = dev->net;
unregister_netdev (net);
- cancel_work_sync(&dev->kevent);
-
usb_scuttle_anchored_urbs(&dev->deferred);
if (dev->driver_info->unbind)
diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
index 9a4c8ff32d9d..5bf7822c53f1 100644
--- a/drivers/net/wireguard/allowedips.c
+++ b/drivers/net/wireguard/allowedips.c
@@ -6,6 +6,8 @@
#include "allowedips.h"
#include "peer.h"
+enum { MAX_ALLOWEDIPS_BITS = 128 };
+
static struct kmem_cache *node_cache;
static void swap_endian(u8 *dst, const u8 *src, u8 bits)
@@ -40,7 +42,8 @@ static void push_rcu(struct allowedips_node **stack,
struct allowedips_node __rcu *p, unsigned int *len)
{
if (rcu_access_pointer(p)) {
- WARN_ON(IS_ENABLED(DEBUG) && *len >= 128);
+ if (WARN_ON(IS_ENABLED(DEBUG) && *len >= MAX_ALLOWEDIPS_BITS))
+ return;
stack[(*len)++] = rcu_dereference_raw(p);
}
}
@@ -52,7 +55,7 @@ static void node_free_rcu(struct rcu_head *rcu)
static void root_free_rcu(struct rcu_head *rcu)
{
- struct allowedips_node *node, *stack[128] = {
+ struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = {
container_of(rcu, struct allowedips_node, rcu) };
unsigned int len = 1;
@@ -65,7 +68,7 @@ static void root_free_rcu(struct rcu_head *rcu)
static void root_remove_peer_lists(struct allowedips_node *root)
{
- struct allowedips_node *node, *stack[128] = { root };
+ struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = { root };
unsigned int len = 1;
while (len > 0 && (node = stack[--len])) {
diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c
index e173204ae7d7..41db10f9be49 100644
--- a/drivers/net/wireguard/selftest/allowedips.c
+++ b/drivers/net/wireguard/selftest/allowedips.c
@@ -593,10 +593,10 @@ bool __init wg_allowedips_selftest(void)
wg_allowedips_remove_by_peer(&t, a, &mutex);
test_negative(4, a, 192, 168, 0, 1);
- /* These will hit the WARN_ON(len >= 128) in free_node if something
- * goes wrong.
+ /* These will hit the WARN_ON(len >= MAX_ALLOWEDIPS_BITS) in free_node
+ * if something goes wrong.
*/
- for (i = 0; i < 128; ++i) {
+ for (i = 0; i < MAX_ALLOWEDIPS_BITS; ++i) {
part = cpu_to_be64(~(1LLU << (i % 64)));
memset(&ip, 0xff, 16);
memcpy((u8 *)&ip + (i < 64) * 8, &part, 8);
diff --git a/drivers/net/wireguard/selftest/ratelimiter.c b/drivers/net/wireguard/selftest/ratelimiter.c
index 007cd4457c5f..ba87d294604f 100644
--- a/drivers/net/wireguard/selftest/ratelimiter.c
+++ b/drivers/net/wireguard/selftest/ratelimiter.c
@@ -6,28 +6,29 @@
#ifdef DEBUG
#include <linux/jiffies.h>
+#include <linux/hrtimer.h>
static const struct {
bool result;
- unsigned int msec_to_sleep_before;
+ u64 nsec_to_sleep_before;
} expected_results[] __initconst = {
[0 ... PACKETS_BURSTABLE - 1] = { true, 0 },
[PACKETS_BURSTABLE] = { false, 0 },
- [PACKETS_BURSTABLE + 1] = { true, MSEC_PER_SEC / PACKETS_PER_SECOND },
+ [PACKETS_BURSTABLE + 1] = { true, NSEC_PER_SEC / PACKETS_PER_SECOND },
[PACKETS_BURSTABLE + 2] = { false, 0 },
- [PACKETS_BURSTABLE + 3] = { true, (MSEC_PER_SEC / PACKETS_PER_SECOND) * 2 },
+ [PACKETS_BURSTABLE + 3] = { true, (NSEC_PER_SEC / PACKETS_PER_SECOND) * 2 },
[PACKETS_BURSTABLE + 4] = { true, 0 },
[PACKETS_BURSTABLE + 5] = { false, 0 }
};
static __init unsigned int maximum_jiffies_at_index(int index)
{
- unsigned int total_msecs = 2 * MSEC_PER_SEC / PACKETS_PER_SECOND / 3;
+ u64 total_nsecs = 2 * NSEC_PER_SEC / PACKETS_PER_SECOND / 3;
int i;
for (i = 0; i <= index; ++i)
- total_msecs += expected_results[i].msec_to_sleep_before;
- return msecs_to_jiffies(total_msecs);
+ total_nsecs += expected_results[i].nsec_to_sleep_before;
+ return nsecs_to_jiffies(total_nsecs);
}
static __init int timings_test(struct sk_buff *skb4, struct iphdr *hdr4,
@@ -42,8 +43,12 @@ static __init int timings_test(struct sk_buff *skb4, struct iphdr *hdr4,
loop_start_time = jiffies;
for (i = 0; i < ARRAY_SIZE(expected_results); ++i) {
- if (expected_results[i].msec_to_sleep_before)
- msleep(expected_results[i].msec_to_sleep_before);
+ if (expected_results[i].nsec_to_sleep_before) {
+ ktime_t timeout = ktime_add(ktime_add_ns(ktime_get_coarse_boottime(), TICK_NSEC * 4 / 3),
+ ns_to_ktime(expected_results[i].nsec_to_sleep_before));
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ schedule_hrtimeout_range_clock(&timeout, 0, HRTIMER_MODE_ABS, CLOCK_BOOTTIME);
+ }
if (time_is_before_jiffies(loop_start_time +
maximum_jiffies_at_index(i)))
@@ -127,7 +132,7 @@ bool __init wg_ratelimiter_selftest(void)
if (IS_ENABLED(CONFIG_KASAN) || IS_ENABLED(CONFIG_UBSAN))
return true;
- BUILD_BUG_ON(MSEC_PER_SEC % PACKETS_PER_SECOND != 0);
+ BUILD_BUG_ON(NSEC_PER_SEC % PACKETS_PER_SECOND != 0);
if (wg_ratelimiter_init())
goto out;
@@ -176,7 +181,6 @@ bool __init wg_ratelimiter_selftest(void)
test += test_count;
goto err;
}
- msleep(500);
continue;
} else if (ret < 0) {
test += test_count;
@@ -195,7 +199,6 @@ bool __init wg_ratelimiter_selftest(void)
test += test_count;
goto err;
}
- msleep(50);
continue;
}
test += test_count;
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 8328966a0471..603dc7bebc0c 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -1249,13 +1249,12 @@ static void ath10k_snoc_init_napi(struct ath10k *ar)
static int ath10k_snoc_request_irq(struct ath10k *ar)
{
struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
- int irqflags = IRQF_TRIGGER_RISING;
int ret, id;
for (id = 0; id < CE_COUNT_MAX; id++) {
ret = request_irq(ar_snoc->ce_irqs[id].irq_line,
- ath10k_snoc_per_engine_handler,
- irqflags, ce_name[id], ar);
+ ath10k_snoc_per_engine_handler, 0,
+ ce_name[id], ar);
if (ret) {
ath10k_err(ar,
"failed to register IRQ handler for CE %d: %d\n",
diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
index 90a5df1fbdbd..0106e529f3c5 100644
--- a/drivers/net/wireless/ath/ath11k/core.c
+++ b/drivers/net/wireless/ath/ath11k/core.c
@@ -903,23 +903,23 @@ static int ath11k_core_pdev_create(struct ath11k_base *ab)
return ret;
}
- ret = ath11k_mac_register(ab);
+ ret = ath11k_dp_pdev_alloc(ab);
if (ret) {
- ath11k_err(ab, "failed register the radio with mac80211: %d\n", ret);
+ ath11k_err(ab, "failed to attach DP pdev: %d\n", ret);
goto err_pdev_debug;
}
- ret = ath11k_dp_pdev_alloc(ab);
+ ret = ath11k_mac_register(ab);
if (ret) {
- ath11k_err(ab, "failed to attach DP pdev: %d\n", ret);
- goto err_mac_unregister;
+ ath11k_err(ab, "failed register the radio with mac80211: %d\n", ret);
+ goto err_dp_pdev_free;
}
ret = ath11k_thermal_register(ab);
if (ret) {
ath11k_err(ab, "could not register thermal device: %d\n",
ret);
- goto err_dp_pdev_free;
+ goto err_mac_unregister;
}
ret = ath11k_spectral_init(ab);
@@ -932,10 +932,10 @@ static int ath11k_core_pdev_create(struct ath11k_base *ab)
err_thermal_unregister:
ath11k_thermal_unregister(ab);
-err_dp_pdev_free:
- ath11k_dp_pdev_free(ab);
err_mac_unregister:
ath11k_mac_unregister(ab);
+err_dp_pdev_free:
+ ath11k_dp_pdev_free(ab);
err_pdev_debug:
ath11k_debugfs_pdev_destroy(ab);
diff --git a/drivers/net/wireless/ath/ath11k/debug.h b/drivers/net/wireless/ath/ath11k/debug.h
index fbbd5fe02aa8..91545640c47b 100644
--- a/drivers/net/wireless/ath/ath11k/debug.h
+++ b/drivers/net/wireless/ath/ath11k/debug.h
@@ -23,8 +23,8 @@ enum ath11k_debug_mask {
ATH11K_DBG_TESTMODE = 0x00000400,
ATH11k_DBG_HAL = 0x00000800,
ATH11K_DBG_PCI = 0x00001000,
- ATH11K_DBG_DP_TX = 0x00001000,
- ATH11K_DBG_DP_RX = 0x00002000,
+ ATH11K_DBG_DP_TX = 0x00002000,
+ ATH11K_DBG_DP_RX = 0x00004000,
ATH11K_DBG_ANY = 0xffffffff,
};
diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c
index 049774cc158c..b3e133add1ce 100644
--- a/drivers/net/wireless/ath/ath11k/dp_rx.c
+++ b/drivers/net/wireless/ath/ath11k/dp_rx.c
@@ -835,8 +835,9 @@ void ath11k_peer_rx_tid_delete(struct ath11k *ar,
HAL_REO_CMD_UPDATE_RX_QUEUE, &cmd,
ath11k_dp_rx_tid_del_func);
if (ret) {
- ath11k_err(ar->ab, "failed to send HAL_REO_CMD_UPDATE_RX_QUEUE cmd, tid %d (%d)\n",
- tid, ret);
+ if (ret != -ESHUTDOWN)
+ ath11k_err(ar->ab, "failed to send HAL_REO_CMD_UPDATE_RX_QUEUE cmd, tid %d (%d)\n",
+ tid, ret);
dma_unmap_single(ar->ab->dev, rx_tid->paddr, rx_tid->size,
DMA_BIDIRECTIONAL);
kfree(rx_tid->vaddr);
diff --git a/drivers/net/wireless/ath/ath11k/htc.c b/drivers/net/wireless/ath/ath11k/htc.c
index 6913b7494b9b..2de1e953a539 100644
--- a/drivers/net/wireless/ath/ath11k/htc.c
+++ b/drivers/net/wireless/ath/ath11k/htc.c
@@ -258,8 +258,10 @@ void ath11k_htc_tx_completion_handler(struct ath11k_base *ab,
u8 eid;
eid = ATH11K_SKB_CB(skb)->eid;
- if (eid >= ATH11K_HTC_EP_COUNT)
+ if (eid >= ATH11K_HTC_EP_COUNT) {
+ dev_kfree_skb_any(skb);
return;
+ }
ep = &htc->endpoint[eid];
spin_lock_bh(&htc->tx_lock);
diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
index 8a3ff12057e8..e2382c8595b6 100644
--- a/drivers/net/wireless/ath/ath11k/pci.c
+++ b/drivers/net/wireless/ath/ath11k/pci.c
@@ -1566,7 +1566,9 @@ static void ath11k_pci_remove(struct pci_dev *pdev)
static void ath11k_pci_shutdown(struct pci_dev *pdev)
{
struct ath11k_base *ab = pci_get_drvdata(pdev);
+ struct ath11k_pci *ab_pci = ath11k_pci_priv(ab);
+ ath11k_pci_set_irq_affinity_hint(ab_pci, NULL);
ath11k_pci_power_down(ab);
}
diff --git a/drivers/net/wireless/ath/ath9k/htc.h b/drivers/net/wireless/ath/ath9k/htc.h
index 6b45e63fae4b..e3d546ef71dd 100644
--- a/drivers/net/wireless/ath/ath9k/htc.h
+++ b/drivers/net/wireless/ath/ath9k/htc.h
@@ -327,11 +327,11 @@ static inline struct ath9k_htc_tx_ctl *HTC_SKB_CB(struct sk_buff *skb)
}
#ifdef CONFIG_ATH9K_HTC_DEBUGFS
-
-#define TX_STAT_INC(c) (hif_dev->htc_handle->drv_priv->debug.tx_stats.c++)
-#define TX_STAT_ADD(c, a) (hif_dev->htc_handle->drv_priv->debug.tx_stats.c += a)
-#define RX_STAT_INC(c) (hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c++)
-#define RX_STAT_ADD(c, a) (hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c += a)
+#define __STAT_SAFE(expr) (hif_dev->htc_handle->drv_priv ? (expr) : 0)
+#define TX_STAT_INC(c) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.tx_stats.c++)
+#define TX_STAT_ADD(c, a) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.tx_stats.c += a)
+#define RX_STAT_INC(c) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c++)
+#define RX_STAT_ADD(c, a) __STAT_SAFE(hif_dev->htc_handle->drv_priv->debug.skbrx_stats.c += a)
#define CAB_STAT_INC priv->debug.tx_stats.cab_queued++
#define TX_QSTAT_INC(q) (priv->debug.tx_stats.queue_stats[q]++)
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_init.c b/drivers/net/wireless/ath/ath9k/htc_drv_init.c
index ff61ae34ecdf..07ac88fb1c57 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_init.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_init.c
@@ -944,7 +944,6 @@ int ath9k_htc_probe_device(struct htc_target *htc_handle, struct device *dev,
priv->hw = hw;
priv->htc = htc_handle;
priv->dev = dev;
- htc_handle->drv_priv = priv;
SET_IEEE80211_DEV(hw, priv->dev);
ret = ath9k_htc_wait_for_target(priv);
@@ -965,6 +964,8 @@ int ath9k_htc_probe_device(struct htc_target *htc_handle, struct device *dev,
if (ret)
goto err_init;
+ htc_handle->drv_priv = priv;
+
return 0;
err_init:
diff --git a/drivers/net/wireless/ath/wil6210/debugfs.c b/drivers/net/wireless/ath/wil6210/debugfs.c
index 4c944e595978..ac7787e1a7f6 100644
--- a/drivers/net/wireless/ath/wil6210/debugfs.c
+++ b/drivers/net/wireless/ath/wil6210/debugfs.c
@@ -1010,20 +1010,14 @@ static ssize_t wil_write_file_wmi(struct file *file, const char __user *buf,
void *cmd;
int cmdlen = len - sizeof(struct wmi_cmd_hdr);
u16 cmdid;
- int rc, rc1;
+ int rc1;
- if (cmdlen < 0)
+ if (cmdlen < 0 || *ppos != 0)
return -EINVAL;
- wmi = kmalloc(len, GFP_KERNEL);
- if (!wmi)
- return -ENOMEM;
-
- rc = simple_write_to_buffer(wmi, len, ppos, buf, len);
- if (rc < 0) {
- kfree(wmi);
- return rc;
- }
+ wmi = memdup_user(buf, len);
+ if (IS_ERR(wmi))
+ return PTR_ERR(wmi);
cmd = (cmdlen > 0) ? &wmi[1] : NULL;
cmdid = le16_to_cpu(wmi->command_id);
@@ -1033,7 +1027,7 @@ static ssize_t wil_write_file_wmi(struct file *file, const char __user *buf,
wil_info(wil, "0x%04x[%d] -> %d\n", cmdid, cmdlen, rc1);
- return rc;
+ return len;
}
static const struct file_operations fops_wmi = {
diff --git a/drivers/net/wireless/intel/iwlegacy/4965-rs.c b/drivers/net/wireless/intel/iwlegacy/4965-rs.c
index 9a491e5db75b..532e3b91777d 100644
--- a/drivers/net/wireless/intel/iwlegacy/4965-rs.c
+++ b/drivers/net/wireless/intel/iwlegacy/4965-rs.c
@@ -2403,7 +2403,7 @@ il4965_rs_fill_link_cmd(struct il_priv *il, struct il_lq_sta *lq_sta,
/* Repeat initial/next rate.
* For legacy IL_NUMBER_TRY == 1, this loop will not execute.
* For HT IL_HT_NUMBER_TRY == 3, this executes twice. */
- while (repeat_rate > 0 && idx < LINK_QUAL_MAX_RETRY_NUM) {
+ while (repeat_rate > 0) {
if (is_legacy(tbl_type.lq_type)) {
if (ant_toggle_cnt < NUM_TRY_BEFORE_ANT_TOGGLE)
ant_toggle_cnt++;
@@ -2422,6 +2422,8 @@ il4965_rs_fill_link_cmd(struct il_priv *il, struct il_lq_sta *lq_sta,
cpu_to_le32(new_rate);
repeat_rate--;
idx++;
+ if (idx >= LINK_QUAL_MAX_RETRY_NUM)
+ goto out;
}
il4965_rs_get_tbl_info_from_mcs(new_rate, lq_sta->band,
@@ -2466,6 +2468,7 @@ il4965_rs_fill_link_cmd(struct il_priv *il, struct il_lq_sta *lq_sta,
repeat_rate--;
}
+out:
lq_cmd->agg_params.agg_frame_cnt_limit = LINK_QUAL_AGG_FRAME_LIMIT_DEF;
lq_cmd->agg_params.agg_dis_start_th = LINK_QUAL_AGG_DISABLE_START_DEF;
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
index c7f9d3870f21..8a38d1bfe9b3 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
@@ -1862,6 +1862,7 @@ static void iwl_mvm_disable_sta_queues(struct iwl_mvm *mvm,
iwl_mvm_txq_from_mac80211(sta->txq[i]);
mvmtxq->txq_id = IWL_MVM_INVALID_QUEUE;
+ list_del_init(&mvmtxq->list);
}
}
diff --git a/drivers/net/wireless/intersil/p54/main.c b/drivers/net/wireless/intersil/p54/main.c
index a3ca6620dc0c..8fa3ec71603e 100644
--- a/drivers/net/wireless/intersil/p54/main.c
+++ b/drivers/net/wireless/intersil/p54/main.c
@@ -682,7 +682,7 @@ static void p54_flush(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
* queues have already been stopped and no new frames can sneak
* up from behind.
*/
- while ((total = p54_flush_count(priv) && i--)) {
+ while ((total = p54_flush_count(priv)) && i--) {
/* waste time */
msleep(20);
}
diff --git a/drivers/net/wireless/intersil/p54/p54spi.c b/drivers/net/wireless/intersil/p54/p54spi.c
index f99b7ba69fc3..19152fd449ba 100644
--- a/drivers/net/wireless/intersil/p54/p54spi.c
+++ b/drivers/net/wireless/intersil/p54/p54spi.c
@@ -164,7 +164,7 @@ static int p54spi_request_firmware(struct ieee80211_hw *dev)
ret = p54_parse_firmware(dev, priv->firmware);
if (ret) {
- release_firmware(priv->firmware);
+ /* the firmware is released by the caller */
return ret;
}
@@ -659,6 +659,7 @@ static int p54spi_probe(struct spi_device *spi)
return 0;
err_free_common:
+ release_firmware(priv->firmware);
free_irq(gpio_to_irq(p54spi_gpio_irq), spi);
err_free_gpio_irq:
gpio_free(p54spi_gpio_irq);
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index e9ec63e0e395..1f39f2ed633d 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -680,7 +680,7 @@ struct mac80211_hwsim_data {
bool ps_poll_pending;
struct dentry *debugfs;
- uintptr_t pending_cookie;
+ atomic_t pending_cookie;
struct sk_buff_head pending; /* packets pending */
/*
* Only radios in the same group can communicate together (the
@@ -1416,8 +1416,7 @@ static void mac80211_hwsim_tx_frame_nl(struct ieee80211_hw *hw,
goto nla_put_failure;
/* We create a cookie to identify this skb */
- data->pending_cookie++;
- cookie = data->pending_cookie;
+ cookie = atomic_inc_return(&data->pending_cookie);
info->rate_driver_data[0] = (void *)cookie;
if (nla_put_u64_64bit(skb, HWSIM_ATTR_COOKIE, cookie, HWSIM_ATTR_PAD))
goto nla_put_failure;
@@ -4080,6 +4079,7 @@ static int hwsim_tx_info_frame_received_nl(struct sk_buff *skb_2,
const u8 *src;
unsigned int hwsim_flags;
int i;
+ unsigned long flags;
bool found = false;
if (!info->attrs[HWSIM_ATTR_ADDR_TRANSMITTER] ||
@@ -4107,18 +4107,20 @@ static int hwsim_tx_info_frame_received_nl(struct sk_buff *skb_2,
}
/* look for the skb matching the cookie passed back from user */
+ spin_lock_irqsave(&data2->pending.lock, flags);
skb_queue_walk_safe(&data2->pending, skb, tmp) {
- u64 skb_cookie;
+ uintptr_t skb_cookie;
txi = IEEE80211_SKB_CB(skb);
- skb_cookie = (u64)(uintptr_t)txi->rate_driver_data[0];
+ skb_cookie = (uintptr_t)txi->rate_driver_data[0];
if (skb_cookie == ret_skb_cookie) {
- skb_unlink(skb, &data2->pending);
+ __skb_unlink(skb, &data2->pending);
found = true;
break;
}
}
+ spin_unlock_irqrestore(&data2->pending.lock, flags);
/* not found */
if (!found)
diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
index 5d6dc1dd050d..32fdc4150b60 100644
--- a/drivers/net/wireless/marvell/libertas/if_usb.c
+++ b/drivers/net/wireless/marvell/libertas/if_usb.c
@@ -287,6 +287,7 @@ static int if_usb_probe(struct usb_interface *intf,
return 0;
err_get_fw:
+ usb_put_dev(udev);
lbs_remove_card(priv);
err_add_card:
if_usb_reset_device(cardp);
diff --git a/drivers/net/wireless/mediatek/mt76/eeprom.c b/drivers/net/wireless/mediatek/mt76/eeprom.c
index a499861918fa..9bc8758573fc 100644
--- a/drivers/net/wireless/mediatek/mt76/eeprom.c
+++ b/drivers/net/wireless/mediatek/mt76/eeprom.c
@@ -162,10 +162,13 @@ mt76_find_power_limits_node(struct mt76_dev *dev)
}
if (mt76_string_prop_find(country, dev->alpha2) ||
- mt76_string_prop_find(regd, region_name))
+ mt76_string_prop_find(regd, region_name)) {
+ of_node_put(np);
return cur;
+ }
}
+ of_node_put(np);
return fallback;
}
diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
index 8a2fedbb1451..2cd7b3f1db64 100644
--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
+++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
@@ -210,6 +210,7 @@ static int mt76_led_init(struct mt76_dev *dev)
if (!of_property_read_u32(np, "led-sources", &led_pin))
dev->led_pin = led_pin;
dev->led_al = of_property_read_bool(np, "led-active-low");
+ of_node_put(np);
}
return led_classdev_register(dev->dev, &dev->led_cdev);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
index bd687f7de628..9e832b27170f 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/mac.c
@@ -2282,6 +2282,7 @@ mt7615_dfs_init_radar_specs(struct mt7615_phy *phy)
int mt7615_dfs_init_radar_detector(struct mt7615_phy *phy)
{
+ struct cfg80211_chan_def *chandef = &phy->mt76->chandef;
struct mt7615_dev *dev = phy->dev;
bool ext_phy = phy != &dev->phy;
enum mt76_dfs_state dfs_state, prev_state;
@@ -2292,13 +2293,13 @@ int mt7615_dfs_init_radar_detector(struct mt7615_phy *phy)
prev_state = phy->mt76->dfs_state;
dfs_state = mt76_phy_dfs_state(phy->mt76);
+ if ((chandef->chan->flags & IEEE80211_CHAN_RADAR) &&
+ dfs_state < MT_DFS_STATE_CAC)
+ dfs_state = MT_DFS_STATE_ACTIVE;
if (prev_state == dfs_state)
return 0;
- if (prev_state == MT_DFS_STATE_UNKNOWN)
- mt7615_dfs_stop_radar_detector(phy);
-
if (dfs_state == MT_DFS_STATE_DISABLED)
goto stop;
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/main.c b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
index 6b8e3e7ae4a2..36990637a8a2 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/main.c
@@ -282,26 +282,6 @@ static void mt7615_remove_interface(struct ieee80211_hw *hw,
mt76_packet_id_flush(&dev->mt76, &mvif->sta.wcid);
}
-static void mt7615_init_dfs_state(struct mt7615_phy *phy)
-{
- struct mt76_phy *mphy = phy->mt76;
- struct ieee80211_hw *hw = mphy->hw;
- struct cfg80211_chan_def *chandef = &hw->conf.chandef;
-
- if (hw->conf.flags & IEEE80211_CONF_OFFCHANNEL)
- return;
-
- if (!(chandef->chan->flags & IEEE80211_CHAN_RADAR) &&
- !(mphy->chandef.chan->flags & IEEE80211_CHAN_RADAR))
- return;
-
- if (mphy->chandef.chan->center_freq == chandef->chan->center_freq &&
- mphy->chandef.width == chandef->width)
- return;
-
- phy->dfs_state = -1;
-}
-
int mt7615_set_channel(struct mt7615_phy *phy)
{
struct mt7615_dev *dev = phy->dev;
@@ -314,7 +294,6 @@ int mt7615_set_channel(struct mt7615_phy *phy)
set_bit(MT76_RESET, &phy->mt76->state);
- mt7615_init_dfs_state(phy);
mt76_set_channel(phy->mt76);
if (is_mt7615(&dev->mt76) && dev->flash_eeprom) {
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
index 97e2a85cb728..7127d6007ae0 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
@@ -350,10 +350,11 @@ static int mt7615_mcu_fw_pmctrl(struct mt7615_dev *dev)
}
mt7622_trigger_hif_int(dev, false);
-
- pm->stats.last_doze_event = jiffies;
- pm->stats.awake_time += pm->stats.last_doze_event -
- pm->stats.last_wake_event;
+ if (!err) {
+ pm->stats.last_doze_event = jiffies;
+ pm->stats.awake_time += pm->stats.last_doze_event -
+ pm->stats.last_wake_event;
+ }
out:
mutex_unlock(&pm->mutex);
@@ -402,6 +403,9 @@ mt7615_mcu_rx_radar_detected(struct mt7615_dev *dev, struct sk_buff *skb)
if (r->band_idx && dev->mt76.phy2)
mphy = dev->mt76.phy2;
+ if (mt76_phy_dfs_state(mphy) < MT_DFS_STATE_CAC)
+ return;
+
ieee80211_radar_detected(mphy->hw);
dev->hw_pattern++;
}
diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h b/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
index 2e91f6a27d0f..082c73b571ae 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/mt7615.h
@@ -177,7 +177,6 @@ struct mt7615_phy {
u8 chfreq;
u8 rdd_state;
- int dfs_state;
u32 rx_ampdu_ts;
u32 ampdu_ref;
diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
index 2953df7d8388..c6c16fe8ee85 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
@@ -108,7 +108,7 @@ __mt76x02u_mcu_send_msg(struct mt76_dev *dev, struct sk_buff *skb,
ret = mt76u_bulk_msg(dev, skb->data, skb->len, NULL, 500,
MT_EP_OUT_INBAND_CMD);
if (ret)
- return ret;
+ goto out;
if (wait_resp)
ret = mt76x02u_mcu_wait_resp(dev, seq);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/init.c b/drivers/net/wireless/mediatek/mt76/mt7921/init.c
index 91fc41922d95..37453e1c136f 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/init.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/init.c
@@ -49,8 +49,8 @@ mt7921_init_wiphy(struct ieee80211_hw *hw)
struct wiphy *wiphy = hw->wiphy;
hw->queues = 4;
- hw->max_rx_aggregation_subframes = 64;
- hw->max_tx_aggregation_subframes = 128;
+ hw->max_rx_aggregation_subframes = IEEE80211_MAX_AMPDU_BUF_HE;
+ hw->max_tx_aggregation_subframes = IEEE80211_MAX_AMPDU_BUF_HE;
hw->netdev_features = NETIF_F_RXCSUM;
hw->radiotap_timestamp.units_pos =
@@ -291,7 +291,7 @@ int mt7921_register_device(struct mt7921_dev *dev)
IEEE80211_HT_CAP_LDPC_CODING |
IEEE80211_HT_CAP_MAX_AMSDU;
dev->mphy.sband_5g.sband.vht_cap.cap |=
- IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_7991 |
+ IEEE80211_VHT_CAP_MAX_MPDU_LENGTH_11454 |
IEEE80211_VHT_CAP_MAX_A_MPDU_LENGTH_EXPONENT_MASK |
IEEE80211_VHT_CAP_SU_BEAMFORMEE_CAPABLE |
IEEE80211_VHT_CAP_MU_BEAMFORMEE_CAPABLE |
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
index da2be050ed7c..2a609b25561c 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
@@ -538,13 +538,6 @@ static int mt7921_load_patch(struct mt7921_dev *dev)
if (ret)
dev_err(dev->mt76.dev, "Failed to start patch\n");
- if (mt76_is_sdio(&dev->mt76)) {
- /* activate again */
- ret = __mt7921_mcu_fw_pmctrl(dev);
- if (!ret)
- ret = __mt7921_mcu_drv_pmctrl(dev);
- }
-
out:
sem = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false);
switch (sem) {
@@ -555,6 +548,14 @@ static int mt7921_load_patch(struct mt7921_dev *dev)
dev_err(dev->mt76.dev, "Failed to release patch semaphore\n");
break;
}
+
+ if (!ret && mt76_is_sdio(&dev->mt76)) {
+ /* activate again */
+ ret = __mt7921_mcu_fw_pmctrl(dev);
+ if (!ret)
+ ret = __mt7921_mcu_drv_pmctrl(dev);
+ }
+
release_firmware(fw);
return ret;
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c
index 36669e5aeef3..a1ab5f878f81 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci_mcu.c
@@ -102,7 +102,7 @@ int mt7921e_mcu_fw_pmctrl(struct mt7921_dev *dev)
{
struct mt76_phy *mphy = &dev->mt76.phy;
struct mt76_connac_pm *pm = &dev->pm;
- int i, err = 0;
+ int i;
for (i = 0; i < MT7921_DRV_OWN_RETRY_COUNT; i++) {
mt76_wr(dev, MT_CONN_ON_LPCTL, PCIE_LPCR_HOST_SET_OWN);
@@ -114,12 +114,12 @@ int mt7921e_mcu_fw_pmctrl(struct mt7921_dev *dev)
if (i == MT7921_DRV_OWN_RETRY_COUNT) {
dev_err(dev->mt76.dev, "firmware own failed\n");
clear_bit(MT76_STATE_PM, &mphy->state);
- err = -EIO;
+ return -EIO;
}
pm->stats.last_doze_event = jiffies;
pm->stats.awake_time += pm->stats.last_doze_event -
pm->stats.last_wake_event;
- return err;
+ return 0;
}
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c
index 54a5c712a3c3..c572a3107b8b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio_mcu.c
@@ -136,8 +136,8 @@ int mt7921s_mcu_fw_pmctrl(struct mt7921_dev *dev)
struct sdio_func *func = dev->mt76.sdio.func;
struct mt76_phy *mphy = &dev->mt76.phy;
struct mt76_connac_pm *pm = &dev->pm;
- int err = 0;
u32 status;
+ int err;
sdio_claim_host(func);
@@ -148,7 +148,7 @@ int mt7921s_mcu_fw_pmctrl(struct mt7921_dev *dev)
2000, 1000000);
if (err < 0) {
dev_err(dev->mt76.dev, "mailbox ACK not cleared\n");
- goto err;
+ goto out;
}
}
@@ -156,18 +156,18 @@ int mt7921s_mcu_fw_pmctrl(struct mt7921_dev *dev)
err = readx_poll_timeout(mt76s_read_pcr, &dev->mt76, status,
!(status & WHLPCR_IS_DRIVER_OWN), 2000, 1000000);
+out:
sdio_release_host(func);
-err:
if (err < 0) {
dev_err(dev->mt76.dev, "firmware own failed\n");
clear_bit(MT76_STATE_PM, &mphy->state);
- err = -EIO;
+ return -EIO;
}
pm->stats.last_doze_event = jiffies;
pm->stats.awake_time += pm->stats.last_doze_event -
pm->stats.last_wake_event;
- return err;
+ return 0;
}
diff --git a/drivers/net/wireless/microchip/wilc1000/spi.c b/drivers/net/wireless/microchip/wilc1000/spi.c
index 18420e954402..2ae8dd3411ac 100644
--- a/drivers/net/wireless/microchip/wilc1000/spi.c
+++ b/drivers/net/wireless/microchip/wilc1000/spi.c
@@ -191,11 +191,11 @@ static void wilc_wlan_power(struct wilc *wilc, bool on)
/* assert ENABLE: */
gpiod_set_value(gpios->enable, 1);
mdelay(5);
- /* deassert RESET: */
- gpiod_set_value(gpios->reset, 0);
- } else {
/* assert RESET: */
gpiod_set_value(gpios->reset, 1);
+ } else {
+ /* deassert RESET: */
+ gpiod_set_value(gpios->reset, 0);
/* deassert ENABLE: */
gpiod_set_value(gpios->enable, 0);
}
diff --git a/drivers/net/wireless/realtek/rtlwifi/debug.c b/drivers/net/wireless/realtek/rtlwifi/debug.c
index 901cdfe3723c..0b1bc04cb6ad 100644
--- a/drivers/net/wireless/realtek/rtlwifi/debug.c
+++ b/drivers/net/wireless/realtek/rtlwifi/debug.c
@@ -329,8 +329,8 @@ static ssize_t rtl_debugfs_set_write_h2c(struct file *filp,
tmp_len = (count > sizeof(tmp) - 1 ? sizeof(tmp) - 1 : count);
- if (!buffer || copy_from_user(tmp, buffer, tmp_len))
- return count;
+ if (copy_from_user(tmp, buffer, tmp_len))
+ return -EFAULT;
tmp[tmp_len] = '\0';
@@ -340,8 +340,8 @@ static ssize_t rtl_debugfs_set_write_h2c(struct file *filp,
&h2c_data[4], &h2c_data[5],
&h2c_data[6], &h2c_data[7]);
- if (h2c_len <= 0)
- return count;
+ if (h2c_len == 0)
+ return -EINVAL;
for (i = 0; i < h2c_len; i++)
h2c_data_packed[i] = (u8)h2c_data[i];
diff --git a/drivers/net/wireless/realtek/rtw88/main.c b/drivers/net/wireless/realtek/rtw88/main.c
index 8b9899e41b0b..ded952913ae6 100644
--- a/drivers/net/wireless/realtek/rtw88/main.c
+++ b/drivers/net/wireless/realtek/rtw88/main.c
@@ -1974,6 +1974,10 @@ int rtw_core_init(struct rtw_dev *rtwdev)
timer_setup(&rtwdev->tx_report.purge_timer,
rtw_tx_report_purge_timer, 0);
rtwdev->tx_wq = alloc_workqueue("rtw_tx_wq", WQ_UNBOUND | WQ_HIGHPRI, 0);
+ if (!rtwdev->tx_wq) {
+ rtw_warn(rtwdev, "alloc_workqueue rtw_tx_wq failed\n");
+ return -ENOMEM;
+ }
INIT_DELAYED_WORK(&rtwdev->watch_dog_work, rtw_watch_dog_work);
INIT_DELAYED_WORK(&coex->bt_relink_work, rtw_coex_bt_relink_work);
diff --git a/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c b/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c
index ad272854c442..8e4869eb3a24 100644
--- a/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c
+++ b/drivers/net/wireless/realtek/rtw89/rtw8852a_rfk.c
@@ -2330,8 +2330,8 @@ static u8 _dpk_pas_read(struct rtw89_dev *rtwdev, bool is_check)
val2_q = abs(sign_extend32(val2_q, 11));
rtw89_debug(rtwdev, RTW89_DBG_RFK, "[DPK] PAS_delta = 0x%x\n",
- (val1_i * val1_i + val1_q * val1_q) /
- (val2_i * val2_i + val2_q * val2_q));
+ phy_div(val1_i * val1_i + val1_q * val1_q,
+ val2_i * val2_i + val2_q * val2_q));
} else {
for (i = 0; i < 32; i++) {
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c9831daafbc6..a58a69999dbc 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1897,8 +1897,10 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
if (ns->head->ids.csi == NVME_CSI_ZNS) {
ret = nvme_update_zone_info(ns, lbaf);
- if (ret)
- goto out_unfreeze;
+ if (ret) {
+ blk_mq_unfreeze_queue(ns->disk->queue);
+ goto out;
+ }
}
set_disk_ro(ns->disk, (id->nsattr & NVME_NS_ATTR_RO) ||
@@ -1909,7 +1911,7 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
if (blk_queue_is_zoned(ns->queue)) {
ret = nvme_revalidate_zones(ns);
if (ret && !nvme_first_scan(ns->disk))
- return ret;
+ goto out;
}
if (nvme_ns_head_multipath(ns->head)) {
@@ -1924,9 +1926,9 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
disk_update_readahead(ns->head->disk);
blk_mq_unfreeze_queue(ns->head->disk->queue);
}
- return 0;
-out_unfreeze:
+ ret = 0;
+out:
/*
* If probing fails due an unsupported feature, hide the block device,
* but still allow other access.
@@ -1936,7 +1938,6 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_id_ns *id)
set_bit(NVME_NS_READY, &ns->flags);
ret = 0;
}
- blk_mq_unfreeze_queue(ns->disk->queue);
return ret;
}
@@ -2093,6 +2094,7 @@ static int nvme_report_zones(struct gendisk *disk, sector_t sector,
static const struct block_device_operations nvme_bdev_ops = {
.owner = THIS_MODULE,
.ioctl = nvme_ioctl,
+ .compat_ioctl = blkdev_compat_ptr_ioctl,
.open = nvme_open,
.release = nvme_release,
.getgeo = nvme_getgeo,
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 080f85f4105f..05f9da251758 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1899,6 +1899,24 @@ nvme_fc_ctrl_ioerr_work(struct work_struct *work)
nvme_fc_error_recovery(ctrl, "transport detected io error");
}
+/*
+ * nvme_fc_io_getuuid - Routine called to get the appid field
+ * associated with request by the lldd
+ * @req:IO request from nvme fc to driver
+ * Returns: UUID if there is an appid associated with VM or
+ * NULL if the user/libvirt has not set the appid to VM
+ */
+char *nvme_fc_io_getuuid(struct nvmefc_fcp_req *req)
+{
+ struct nvme_fc_fcp_op *op = fcp_req_to_fcp_op(req);
+ struct request *rq = op->rq;
+
+ if (!IS_ENABLED(CONFIG_BLK_CGROUP_FC_APPID) || !rq->bio)
+ return NULL;
+ return blkcg_get_fc_appid(rq->bio);
+}
+EXPORT_SYMBOL_GPL(nvme_fc_io_getuuid);
+
static void
nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
{
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index d464fdf978fb..b0fe23439c4a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -408,6 +408,7 @@ const struct block_device_operations nvme_ns_head_ops = {
.open = nvme_ns_head_open,
.release = nvme_ns_head_release,
.ioctl = nvme_ns_head_ioctl,
+ .compat_ioctl = blkdev_compat_ptr_ioctl,
.getgeo = nvme_getgeo,
.report_zones = nvme_ns_head_report_zones,
.pr_ops = &nvme_pr_ops,
diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
index 37c7f4c89f92..6f0eaf6a1528 100644
--- a/drivers/nvme/host/trace.h
+++ b/drivers/nvme/host/trace.h
@@ -98,7 +98,7 @@ TRACE_EVENT(nvme_complete_rq,
TP_fast_assign(
__entry->ctrl_id = nvme_req(req)->ctrl->instance;
__entry->qid = nvme_req_qid(req);
- __entry->cid = req->tag;
+ __entry->cid = nvme_req(req)->cmd->common.command_id;
__entry->result = le64_to_cpu(nvme_req(req)->result.u64);
__entry->retries = nvme_req(req)->retries;
__entry->flags = nvme_req(req)->flags;
diff --git a/drivers/nvme/target/zns.c b/drivers/nvme/target/zns.c
index e34718b09550..82b61acf7a72 100644
--- a/drivers/nvme/target/zns.c
+++ b/drivers/nvme/target/zns.c
@@ -34,8 +34,7 @@ static int validate_conv_zones_cb(struct blk_zone *z,
bool nvmet_bdev_zns_enable(struct nvmet_ns *ns)
{
- struct request_queue *q = ns->bdev->bd_disk->queue;
- u8 zasl = nvmet_zasl(queue_max_zone_append_sectors(q));
+ u8 zasl = nvmet_zasl(bdev_max_zone_append_sectors(ns->bdev));
struct gendisk *bd_disk = ns->bdev->bd_disk;
int ret;
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 874f031442dc..75b6cbffa755 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -81,8 +81,11 @@ of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
* restricted-dma-pool region is allowed.
*/
if (of_device_is_compatible(node, "restricted-dma-pool") &&
- of_device_is_available(node))
+ of_device_is_available(node)) {
+ of_node_put(node);
break;
+ }
+ of_node_put(node);
}
/*
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 0f30496ce80b..d624e8b185aa 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -246,7 +246,7 @@ static int populate_node(const void *blob,
}
*pnp = np;
- return true;
+ return 0;
}
static void reverse_nodes(struct device_node *parent)
diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
index 8d374cc552be..91b04b04eec4 100644
--- a/drivers/of/kexec.c
+++ b/drivers/of/kexec.c
@@ -126,6 +126,7 @@ int ima_get_kexec_buffer(void **addr, size_t *size)
{
int ret, len;
unsigned long tmp_addr;
+ unsigned long start_pfn, end_pfn;
size_t tmp_size;
const void *prop;
@@ -140,6 +141,22 @@ int ima_get_kexec_buffer(void **addr, size_t *size)
if (ret)
return ret;
+ /* Do some sanity on the returned size for the ima-kexec buffer */
+ if (!tmp_size)
+ return -ENOENT;
+
+ /*
+ * Calculate the PFNs for the buffer and ensure
+ * they are with in addressable memory.
+ */
+ start_pfn = PHYS_PFN(tmp_addr);
+ end_pfn = PHYS_PFN(tmp_addr + tmp_size - 1);
+ if (!page_is_ram(start_pfn) || !page_is_ram(end_pfn)) {
+ pr_warn("IMA buffer at 0x%lx, size = 0x%zx beyond memory\n",
+ tmp_addr, tmp_size);
+ return -EINVAL;
+ }
+
*addr = __va(tmp_addr);
*size = tmp_size;
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 740407252298..e154b18ec4b1 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -2413,8 +2413,8 @@ struct opp_table *dev_pm_opp_attach_genpd(struct device *dev,
}
virt_dev = dev_pm_domain_attach_by_name(dev, *name);
- if (IS_ERR(virt_dev)) {
- ret = PTR_ERR(virt_dev);
+ if (IS_ERR_OR_NULL(virt_dev)) {
+ ret = PTR_ERR(virt_dev) ? : -ENODEV;
dev_err(dev, "Couldn't attach to pm_domain: %d\n", ret);
goto err;
}
diff --git a/drivers/parisc/lba_pci.c b/drivers/parisc/lba_pci.c
index 732b516c7bf8..afc6e66ddc31 100644
--- a/drivers/parisc/lba_pci.c
+++ b/drivers/parisc/lba_pci.c
@@ -1476,9 +1476,13 @@ lba_driver_probe(struct parisc_device *dev)
u32 func_class;
void *tmp_obj;
char *version;
- void __iomem *addr = ioremap(dev->hpa.start, 4096);
+ void __iomem *addr;
int max;
+ addr = ioremap(dev->hpa.start, 4096);
+ if (addr == NULL)
+ return -ENOMEM;
+
/* Read HW Rev First */
func_class = READ_REG32(addr + LBA_FCLASS);
diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 0eda8236c125..13c2e73f0eaf 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -780,8 +780,9 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
ep->msi_mem = pci_epc_mem_alloc_addr(epc, &ep->msi_mem_phys,
epc->mem->window.page_size);
if (!ep->msi_mem) {
+ ret = -ENOMEM;
dev_err(dev, "Failed to reserve memory for MSI/MSI-X\n");
- return -ENOMEM;
+ goto err_exit_epc_mem;
}
if (ep->ops->get_features) {
@@ -790,6 +791,19 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
return 0;
}
- return dw_pcie_ep_init_complete(ep);
+ ret = dw_pcie_ep_init_complete(ep);
+ if (ret)
+ goto err_free_epc_mem;
+
+ return 0;
+
+err_free_epc_mem:
+ pci_epc_mem_free_addr(epc, ep->msi_mem_phys, ep->msi_mem,
+ epc->mem->window.page_size);
+
+err_exit_epc_mem:
+ pci_epc_mem_exit(epc);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(dw_pcie_ep_init);
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 9979302532b7..d0d768f22ac3 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -421,8 +421,14 @@ int dw_pcie_host_init(struct pcie_port *pp)
bridge->sysdata = pp;
ret = pci_host_probe(bridge);
- if (!ret)
- return 0;
+ if (ret)
+ goto err_stop_link;
+
+ return 0;
+
+err_stop_link:
+ if (pci->ops && pci->ops->stop_link)
+ pci->ops->stop_link(pci);
err_free_msi:
if (pp->has_msi_ctrl)
@@ -433,8 +439,14 @@ EXPORT_SYMBOL_GPL(dw_pcie_host_init);
void dw_pcie_host_deinit(struct pcie_port *pp)
{
+ struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+
pci_stop_root_bus(pp->bridge->bus);
pci_remove_root_bus(pp->bridge->bus);
+
+ if (pci->ops && pci->ops->stop_link)
+ pci->ops->stop_link(pci);
+
if (pp->has_msi_ctrl)
dw_pcie_free_msi(pp);
}
@@ -531,7 +543,6 @@ static struct pci_ops dw_pcie_ops = {
void dw_pcie_setup_rc(struct pcie_port *pp)
{
- int i;
u32 val, ctrl, num_ctrls;
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
@@ -582,19 +593,22 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
PCI_COMMAND_MASTER | PCI_COMMAND_SERR;
dw_pcie_writel_dbi(pci, PCI_COMMAND, val);
- /* Ensure all outbound windows are disabled so there are multiple matches */
- for (i = 0; i < pci->num_ob_windows; i++)
- dw_pcie_disable_atu(pci, i, DW_PCIE_REGION_OUTBOUND);
-
/*
* If the platform provides its own child bus config accesses, it means
* the platform uses its own address translation component rather than
* ATU, so we should not program the ATU here.
*/
if (pp->bridge->child_ops == &dw_child_pcie_ops) {
- int atu_idx = 0;
+ int i, atu_idx = 0;
struct resource_entry *entry;
+ /*
+ * Disable all outbound windows to make sure a transaction
+ * can't match multiple windows.
+ */
+ for (i = 0; i < pci->num_ob_windows; i++)
+ dw_pcie_disable_atu(pci, i, DW_PCIE_REGION_OUTBOUND);
+
/* Get last memory resource entry */
resource_list_for_each_entry(entry, &pp->bridge->windows) {
if (resource_type(entry->res) != IORESOURCE_MEM)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index d92c8a25094f..5848cc520b52 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -287,8 +287,8 @@ static void dw_pcie_prog_outbound_atu_unroll(struct dw_pcie *pci, u8 func_no,
dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_UPPER_TARGET,
upper_32_bits(pci_addr));
val = type | PCIE_ATU_FUNC_NUM(func_no);
- val = upper_32_bits(size - 1) ?
- val | PCIE_ATU_INCREASE_REGION_SIZE : val;
+ if (upper_32_bits(limit_addr) > upper_32_bits(cpu_addr))
+ val |= PCIE_ATU_INCREASE_REGION_SIZE;
if (pci->version == 0x490A)
val = dw_pcie_enable_ecrc(val);
dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL1, val);
@@ -315,6 +315,7 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
u64 pci_addr, u64 size)
{
u32 retries, val;
+ u64 limit_addr;
if (pci->ops && pci->ops->cpu_addr_fixup)
cpu_addr = pci->ops->cpu_addr_fixup(pci, cpu_addr);
@@ -325,6 +326,8 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
return;
}
+ limit_addr = cpu_addr + size - 1;
+
dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT,
PCIE_ATU_REGION_OUTBOUND | index);
dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_BASE,
@@ -332,17 +335,18 @@ static void __dw_pcie_prog_outbound_atu(struct dw_pcie *pci, u8 func_no,
dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_BASE,
upper_32_bits(cpu_addr));
dw_pcie_writel_dbi(pci, PCIE_ATU_LIMIT,
- lower_32_bits(cpu_addr + size - 1));
+ lower_32_bits(limit_addr));
if (pci->version >= 0x460A)
dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_LIMIT,
- upper_32_bits(cpu_addr + size - 1));
+ upper_32_bits(limit_addr));
dw_pcie_writel_dbi(pci, PCIE_ATU_LOWER_TARGET,
lower_32_bits(pci_addr));
dw_pcie_writel_dbi(pci, PCIE_ATU_UPPER_TARGET,
upper_32_bits(pci_addr));
val = type | PCIE_ATU_FUNC_NUM(func_no);
- val = ((upper_32_bits(size - 1)) && (pci->version >= 0x460A)) ?
- val | PCIE_ATU_INCREASE_REGION_SIZE : val;
+ if (upper_32_bits(limit_addr) > upper_32_bits(cpu_addr) &&
+ pci->version >= 0x460A)
+ val |= PCIE_ATU_INCREASE_REGION_SIZE;
if (pci->version == 0x490A)
val = dw_pcie_enable_ecrc(val);
dw_pcie_writel_dbi(pci, PCIE_ATU_CR1, val);
@@ -491,7 +495,7 @@ int dw_pcie_prog_inbound_atu(struct dw_pcie *pci, u8 func_no, int index,
void dw_pcie_disable_atu(struct dw_pcie *pci, int index,
enum dw_pcie_region_type type)
{
- int region;
+ u32 region;
switch (type) {
case DW_PCIE_REGION_INBOUND:
@@ -504,8 +508,18 @@ void dw_pcie_disable_atu(struct dw_pcie *pci, int index,
return;
}
- dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT, region | index);
- dw_pcie_writel_dbi(pci, PCIE_ATU_CR2, ~(u32)PCIE_ATU_ENABLE);
+ if (pci->iatu_unroll_enabled) {
+ if (region == PCIE_ATU_REGION_INBOUND) {
+ dw_pcie_writel_ib_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL2,
+ ~(u32)PCIE_ATU_ENABLE);
+ } else {
+ dw_pcie_writel_ob_unroll(pci, index, PCIE_ATU_UNR_REGION_CTRL2,
+ ~(u32)PCIE_ATU_ENABLE);
+ }
+ } else {
+ dw_pcie_writel_dbi(pci, PCIE_ATU_VIEWPORT, region | index);
+ dw_pcie_writel_dbi(pci, PCIE_ATU_CR2, ~(u32)PCIE_ATU_ENABLE);
+ }
}
int dw_pcie_wait_for_link(struct dw_pcie *pci)
@@ -726,6 +740,13 @@ void dw_pcie_setup(struct dw_pcie *pci)
val |= PORT_LINK_DLL_LINK_EN;
dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val);
+ if (of_property_read_bool(np, "snps,enable-cdm-check")) {
+ val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
+ val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS |
+ PCIE_PL_CHK_REG_CHK_REG_START;
+ dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
+ }
+
of_property_read_u32(np, "num-lanes", &pci->num_lanes);
if (!pci->num_lanes) {
dev_dbg(pci->dev, "Using h/w default number of lanes\n");
@@ -772,11 +793,4 @@ void dw_pcie_setup(struct dw_pcie *pci)
break;
}
dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val);
-
- if (of_property_read_bool(np, "snps,enable-cdm-check")) {
- val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
- val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS |
- PCIE_PL_CHK_REG_CHK_REG_START;
- dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
- }
}
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index ed55421eb9ba..340542aab8a5 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -337,8 +337,6 @@ static int qcom_pcie_init_2_1_0(struct qcom_pcie *pcie)
reset_control_assert(res->ext_reset);
reset_control_assert(res->phy_reset);
- writel(1, pcie->parf + PCIE20_PARF_PHY_CTRL);
-
ret = regulator_bulk_enable(ARRAY_SIZE(res->supplies), res->supplies);
if (ret < 0) {
dev_err(dev, "cannot enable regulators\n");
@@ -381,15 +379,15 @@ static int qcom_pcie_init_2_1_0(struct qcom_pcie *pcie)
goto err_deassert_axi;
}
- ret = clk_bulk_prepare_enable(ARRAY_SIZE(res->clks), res->clks);
- if (ret)
- goto err_clks;
-
/* enable PCIe clocks and resets */
val = readl(pcie->parf + PCIE20_PARF_PHY_CTRL);
val &= ~BIT(0);
writel(val, pcie->parf + PCIE20_PARF_PHY_CTRL);
+ ret = clk_bulk_prepare_enable(ARRAY_SIZE(res->clks), res->clks);
+ if (ret)
+ goto err_clks;
+
if (of_device_is_compatible(node, "qcom,pcie-ipq8064") ||
of_device_is_compatible(node, "qcom,pcie-ipq8064-v2")) {
writel(PCS_DEEMPH_TX_DEEMPH_GEN1(24) |
@@ -1038,9 +1036,7 @@ static int qcom_pcie_init_2_3_3(struct qcom_pcie *pcie)
struct qcom_pcie_resources_2_3_3 *res = &pcie->res.v2_3_3;
struct dw_pcie *pci = pcie->pci;
struct device *dev = pci->dev;
- u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
int i, ret;
- u32 val;
for (i = 0; i < ARRAY_SIZE(res->rst); i++) {
ret = reset_control_assert(res->rst[i]);
@@ -1097,6 +1093,33 @@ static int qcom_pcie_init_2_3_3(struct qcom_pcie *pcie)
goto err_clk_aux;
}
+ return 0;
+
+err_clk_aux:
+ clk_disable_unprepare(res->ahb_clk);
+err_clk_ahb:
+ clk_disable_unprepare(res->axi_s_clk);
+err_clk_axi_s:
+ clk_disable_unprepare(res->axi_m_clk);
+err_clk_axi_m:
+ clk_disable_unprepare(res->iface);
+err_clk_iface:
+ /*
+ * Not checking for failure, will anyway return
+ * the original failure in 'ret'.
+ */
+ for (i = 0; i < ARRAY_SIZE(res->rst); i++)
+ reset_control_assert(res->rst[i]);
+
+ return ret;
+}
+
+static int qcom_pcie_post_init_2_3_3(struct qcom_pcie *pcie)
+{
+ struct dw_pcie *pci = pcie->pci;
+ u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+ u32 val;
+
writel(SLV_ADDR_SPACE_SZ,
pcie->parf + PCIE20_v3_PARF_SLV_ADDR_SPACE_SIZE);
@@ -1124,24 +1147,6 @@ static int qcom_pcie_init_2_3_3(struct qcom_pcie *pcie)
PCI_EXP_DEVCTL2);
return 0;
-
-err_clk_aux:
- clk_disable_unprepare(res->ahb_clk);
-err_clk_ahb:
- clk_disable_unprepare(res->axi_s_clk);
-err_clk_axi_s:
- clk_disable_unprepare(res->axi_m_clk);
-err_clk_axi_m:
- clk_disable_unprepare(res->iface);
-err_clk_iface:
- /*
- * Not checking for failure, will anyway return
- * the original failure in 'ret'.
- */
- for (i = 0; i < ARRAY_SIZE(res->rst); i++)
- reset_control_assert(res->rst[i]);
-
- return ret;
}
static int qcom_pcie_get_resources_2_7_0(struct qcom_pcie *pcie)
@@ -1467,6 +1472,7 @@ static const struct qcom_pcie_ops ops_2_4_0 = {
static const struct qcom_pcie_ops ops_2_3_3 = {
.get_resources = qcom_pcie_get_resources_2_3_3,
.init = qcom_pcie_init_2_3_3,
+ .post_init = qcom_pcie_post_init_2_3_3,
.deinit = qcom_pcie_deinit_2_3_3,
.ltssm_enable = qcom_pcie_2_3_2_ltssm_enable,
};
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index b1b5f836a806..a479e58cc8fb 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -352,15 +352,14 @@ static irqreturn_t tegra_pcie_rp_irq_handler(int irq, void *arg)
struct tegra194_pcie *pcie = arg;
struct dw_pcie *pci = &pcie->pci;
struct pcie_port *pp = &pci->pp;
- u32 val, tmp;
+ u32 val, status_l0, status_l1;
u16 val_w;
- val = appl_readl(pcie, APPL_INTR_STATUS_L0);
- if (val & APPL_INTR_STATUS_L0_LINK_STATE_INT) {
- val = appl_readl(pcie, APPL_INTR_STATUS_L1_0_0);
- if (val & APPL_INTR_STATUS_L1_0_0_LINK_REQ_RST_NOT_CHGED) {
- appl_writel(pcie, val, APPL_INTR_STATUS_L1_0_0);
-
+ status_l0 = appl_readl(pcie, APPL_INTR_STATUS_L0);
+ if (status_l0 & APPL_INTR_STATUS_L0_LINK_STATE_INT) {
+ status_l1 = appl_readl(pcie, APPL_INTR_STATUS_L1_0_0);
+ appl_writel(pcie, status_l1, APPL_INTR_STATUS_L1_0_0);
+ if (status_l1 & APPL_INTR_STATUS_L1_0_0_LINK_REQ_RST_NOT_CHGED) {
/* SBR & Surprise Link Down WAR */
val = appl_readl(pcie, APPL_CAR_RESET_OVRD);
val &= ~APPL_CAR_RESET_OVRD_CYA_OVERRIDE_CORE_RST_N;
@@ -376,15 +375,15 @@ static irqreturn_t tegra_pcie_rp_irq_handler(int irq, void *arg)
}
}
- if (val & APPL_INTR_STATUS_L0_INT_INT) {
- val = appl_readl(pcie, APPL_INTR_STATUS_L1_8_0);
- if (val & APPL_INTR_STATUS_L1_8_0_AUTO_BW_INT_STS) {
+ if (status_l0 & APPL_INTR_STATUS_L0_INT_INT) {
+ status_l1 = appl_readl(pcie, APPL_INTR_STATUS_L1_8_0);
+ if (status_l1 & APPL_INTR_STATUS_L1_8_0_AUTO_BW_INT_STS) {
appl_writel(pcie,
APPL_INTR_STATUS_L1_8_0_AUTO_BW_INT_STS,
APPL_INTR_STATUS_L1_8_0);
apply_bad_link_workaround(pp);
}
- if (val & APPL_INTR_STATUS_L1_8_0_BW_MGT_INT_STS) {
+ if (status_l1 & APPL_INTR_STATUS_L1_8_0_BW_MGT_INT_STS) {
appl_writel(pcie,
APPL_INTR_STATUS_L1_8_0_BW_MGT_INT_STS,
APPL_INTR_STATUS_L1_8_0);
@@ -396,25 +395,24 @@ static irqreturn_t tegra_pcie_rp_irq_handler(int irq, void *arg)
}
}
- val = appl_readl(pcie, APPL_INTR_STATUS_L0);
- if (val & APPL_INTR_STATUS_L0_CDM_REG_CHK_INT) {
- val = appl_readl(pcie, APPL_INTR_STATUS_L1_18);
- tmp = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
- if (val & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMPLT) {
+ if (status_l0 & APPL_INTR_STATUS_L0_CDM_REG_CHK_INT) {
+ status_l1 = appl_readl(pcie, APPL_INTR_STATUS_L1_18);
+ val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
+ if (status_l1 & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMPLT) {
dev_info(pci->dev, "CDM check complete\n");
- tmp |= PCIE_PL_CHK_REG_CHK_REG_COMPLETE;
+ val |= PCIE_PL_CHK_REG_CHK_REG_COMPLETE;
}
- if (val & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMP_ERR) {
+ if (status_l1 & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_CMP_ERR) {
dev_err(pci->dev, "CDM comparison mismatch\n");
- tmp |= PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR;
+ val |= PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR;
}
- if (val & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_LOGIC_ERR) {
+ if (status_l1 & APPL_INTR_STATUS_L1_18_CDM_REG_CHK_LOGIC_ERR) {
dev_err(pci->dev, "CDM Logic error\n");
- tmp |= PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERROR;
+ val |= PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERROR;
}
- dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, tmp);
- tmp = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_ERR_ADDR);
- dev_err(pci->dev, "CDM Error Address Offset = 0x%08X\n", tmp);
+ dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
+ val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_ERR_ADDR);
+ dev_err(pci->dev, "CDM Error Address Offset = 0x%08X\n", val);
}
return IRQ_HANDLED;
@@ -980,7 +978,7 @@ static int tegra194_pcie_start_link(struct dw_pcie *pci)
offset = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_ID_DLF);
val = dw_pcie_readl_dbi(pci, offset + PCI_DLF_CAP);
val &= ~PCI_DLF_EXCHANGE_ENABLE;
- dw_pcie_writel_dbi(pci, offset, val);
+ dw_pcie_writel_dbi(pci, offset + PCI_DLF_CAP, val);
tegra194_pcie_host_init(pp);
dw_pcie_setup_rc(pp);
@@ -1951,6 +1949,7 @@ static int tegra_pcie_config_ep(struct tegra194_pcie *pcie,
if (ret) {
dev_err(dev, "Failed to initialize DWC Endpoint subsystem: %d\n",
ret);
+ pm_runtime_disable(dev);
return ret;
}
diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
index 5d9fd36b02d1..a02c466a597c 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -600,7 +600,8 @@ static int mtk_pcie_init_irq_domains(struct mtk_gen3_pcie *pcie)
&intx_domain_ops, pcie);
if (!pcie->intx_domain) {
dev_err(dev, "failed to create INTx IRQ domain\n");
- return -ENODEV;
+ ret = -ENODEV;
+ goto out_put_node;
}
/* Setup MSI */
@@ -623,13 +624,15 @@ static int mtk_pcie_init_irq_domains(struct mtk_gen3_pcie *pcie)
goto err_msi_domain;
}
+ of_node_put(intc_node);
return 0;
err_msi_domain:
irq_domain_remove(pcie->msi_bottom_domain);
err_msi_bottom_domain:
irq_domain_remove(pcie->intx_domain);
-
+out_put_node:
+ of_node_put(intc_node);
return ret;
}
diff --git a/drivers/pci/controller/pcie-microchip-host.c b/drivers/pci/controller/pcie-microchip-host.c
index 2c52a8cef726..932b1c182149 100644
--- a/drivers/pci/controller/pcie-microchip-host.c
+++ b/drivers/pci/controller/pcie-microchip-host.c
@@ -904,6 +904,7 @@ static int mc_pcie_init_irq_domains(struct mc_pcie *port)
&event_domain_ops, port);
if (!port->event_domain) {
dev_err(dev, "failed to get event domain\n");
+ of_node_put(pcie_intc_node);
return -ENOMEM;
}
@@ -913,6 +914,7 @@ static int mc_pcie_init_irq_domains(struct mc_pcie *port)
&intx_domain_ops, port);
if (!port->intx_domain) {
dev_err(dev, "failed to get an INTx IRQ domain\n");
+ of_node_put(pcie_intc_node);
return -ENOMEM;
}
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index 5b833f00e980..a5ed779b0a51 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -627,7 +627,6 @@ static void pci_epf_test_unbind(struct pci_epf *epf)
cancel_delayed_work(&epf_test->cmd_handler);
pci_epf_test_clean_dma_chan(epf_test);
- pci_epc_stop(epc);
for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
epf_bar = &epf->bar[bar];
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 7952e5efd6cf..a1e38ca93cd9 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -538,7 +538,7 @@ static const char *aer_agent_string[] = {
u64 *stats = pdev->aer_stats->stats_array; \
size_t len = 0; \
\
- for (i = 0; i < ARRAY_SIZE(strings_array); i++) { \
+ for (i = 0; i < ARRAY_SIZE(pdev->aer_stats->stats_array); i++) {\
if (strings_array[i]) \
len += sysfs_emit_at(buf, len, "%s %llu\n", \
strings_array[i], \
@@ -1347,6 +1347,11 @@ static int aer_probe(struct pcie_device *dev)
struct device *device = &dev->device;
struct pci_dev *port = dev->port;
+ BUILD_BUG_ON(ARRAY_SIZE(aer_correctable_error_string) <
+ AER_MAX_TYPEOF_COR_ERRS);
+ BUILD_BUG_ON(ARRAY_SIZE(aer_uncorrectable_error_string) <
+ AER_MAX_TYPEOF_UNCOR_ERRS);
+
/* Limit to Root Ports or Root Complex Event Collectors */
if ((pci_pcie_type(port) != PCI_EXP_TYPE_RC_EC) &&
(pci_pcie_type(port) != PCI_EXP_TYPE_ROOT_PORT))
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 604feeb84ee4..1ac7fec47d6f 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -222,15 +222,8 @@ static int get_port_device_capability(struct pci_dev *dev)
#ifdef CONFIG_PCIEAER
if (dev->aer_cap && pci_aer_available() &&
- (pcie_ports_native || host->native_aer)) {
+ (pcie_ports_native || host->native_aer))
services |= PCIE_PORT_SERVICE_AER;
-
- /*
- * Disable AER on this port in case it's been enabled by the
- * BIOS (the AER service driver will enable it when necessary).
- */
- pci_disable_pcie_error_reporting(dev);
- }
#endif
/* Root Ports and Root Complex Event Collectors may generate PMEs */
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index d44bcc29d99c..cd5945e17fdf 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -39,6 +39,24 @@
#include <asm/mmu.h>
#include <asm/sysreg.h>
+/*
+ * Cache if the event is allowed to trace Context information.
+ * This allows us to perform the check, i.e, perfmon_capable(),
+ * in the context of the event owner, once, during the event_init().
+ */
+#define SPE_PMU_HW_FLAGS_CX BIT(0)
+
+static void set_spe_event_has_cx(struct perf_event *event)
+{
+ if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+ event->hw.flags |= SPE_PMU_HW_FLAGS_CX;
+}
+
+static bool get_spe_event_has_cx(struct perf_event *event)
+{
+ return !!(event->hw.flags & SPE_PMU_HW_FLAGS_CX);
+}
+
#define ARM_SPE_BUF_PAD_BYTE 0
struct arm_spe_pmu_buf {
@@ -272,7 +290,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
if (!attr->exclude_kernel)
reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
- if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+ if (get_spe_event_has_cx(event))
reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
return reg;
@@ -709,10 +727,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
return -EOPNOTSUPP;
+ set_spe_event_has_cx(event);
reg = arm_spe_event_to_pmscr(event);
if (!perfmon_capable() &&
(reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
- BIT(SYS_PMSCR_EL1_CX_SHIFT) |
BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
return -EACCES;
diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu.c
index b2b8d2074ed0..130b9f1a40e0 100644
--- a/drivers/perf/riscv_pmu.c
+++ b/drivers/perf/riscv_pmu.c
@@ -170,7 +170,6 @@ int riscv_pmu_event_set_period(struct perf_event *event)
left = (max_period >> 1);
local64_set(&hwc->prev_count, (u64)-left);
- perf_event_update_userpage(event);
return overflow;
}
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_sbi.c
index a1317a483512..77b826d6c05b 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_sbi.c
@@ -274,8 +274,13 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
cflags |= SBI_PMU_CFG_FLAG_SET_UINH;
/* retrieve the available counter index */
+#if defined(CONFIG_32BIT)
+ ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, cmask,
+ cflags, hwc->event_base, hwc->config, hwc->config >> 32);
+#else
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase, cmask,
cflags, hwc->event_base, hwc->config, 0);
+#endif
if (ret.error) {
pr_debug("Not able to find a counter for event %lx config %llx\n",
hwc->event_base, hwc->config);
@@ -417,8 +422,13 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
struct hw_perf_event *hwc = &event->hw;
unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE;
+#if defined(CONFIG_32BIT)
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, hwc->idx,
1, flag, ival, ival >> 32, 0);
+#else
+ ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, hwc->idx,
+ 1, flag, ival, 0, 0);
+#endif
if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
pr_err("Starting counter idx %d failed with error %d\n",
hwc->idx, sbi_err_map_linux_errno(ret.error));
@@ -525,8 +535,14 @@ static inline void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
hwc = &event->hw;
max_period = riscv_pmu_ctr_get_width_mask(event);
init_val = local64_read(&hwc->prev_count) & max_period;
+#if defined(CONFIG_32BIT)
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
+ flag, init_val, init_val >> 32, 0);
+#else
sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
flag, init_val, 0, 0);
+#endif
+ perf_event_update_userpage(event);
}
ctr_ovf_mask = ctr_ovf_mask >> 1;
idx++;
@@ -666,12 +682,15 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde
child = of_get_compatible_child(cpu, "riscv,cpu-intc");
if (!child) {
pr_err("Failed to find INTC node\n");
+ of_node_put(cpu);
return -ENODEV;
}
domain = irq_find_host(child);
of_node_put(child);
- if (domain)
+ if (domain) {
+ of_node_put(cpu);
break;
+ }
}
if (!domain) {
pr_err("Failed to find INTC IRQ root domain\n");
diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.h b/drivers/phy/qualcomm/phy-qcom-qmp.h
index 06b2556ed93a..b9a91520439c 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp.h
+++ b/drivers/phy/qualcomm/phy-qcom-qmp.h
@@ -1116,7 +1116,8 @@
#define QSERDES_V5_COM_CORE_CLK_EN 0x174
#define QSERDES_V5_COM_CMN_CONFIG 0x17c
#define QSERDES_V5_COM_CMN_MISC1 0x19c
-#define QSERDES_V5_COM_CMN_MODE 0x1a4
+#define QSERDES_V5_COM_CMN_MODE 0x1a0
+#define QSERDES_V5_COM_CMN_MODE_CONTD 0x1a4
#define QSERDES_V5_COM_VCO_DC_LEVEL_CTRL 0x1a8
#define QSERDES_V5_COM_BIN_VCOCAL_CMP_CODE1_MODE0 0x1ac
#define QSERDES_V5_COM_BIN_VCOCAL_CMP_CODE2_MODE0 0x1b0
diff --git a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
index cba5c32cbaee..0c6548ac9d8a 100644
--- a/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
+++ b/drivers/phy/rockchip/phy-rockchip-inno-usb2.c
@@ -942,7 +942,9 @@ static irqreturn_t rockchip_usb2phy_irq(int irq, void *data)
switch (rport->port_id) {
case USB2PHY_PORT_OTG:
- ret |= rockchip_usb2phy_otg_mux_irq(irq, rport);
+ if (rport->mode != USB_DR_MODE_HOST &&
+ rport->mode != USB_DR_MODE_UNKNOWN)
+ ret |= rockchip_usb2phy_otg_mux_irq(irq, rport);
break;
case USB2PHY_PORT_HOST:
ret |= rockchip_usb2phy_linestate_irq(irq, rport);
diff --git a/drivers/phy/samsung/phy-exynosautov9-ufs.c b/drivers/phy/samsung/phy-exynosautov9-ufs.c
index 36398a15c2db..d043dfdb598a 100644
--- a/drivers/phy/samsung/phy-exynosautov9-ufs.c
+++ b/drivers/phy/samsung/phy-exynosautov9-ufs.c
@@ -31,22 +31,22 @@ static const struct samsung_ufs_phy_cfg exynosautov9_pre_init_cfg[] = {
PHY_COMN_REG_CFG(0x023, 0xc0, PWR_MODE_ANY),
PHY_COMN_REG_CFG(0x023, 0x00, PWR_MODE_ANY),
- PHY_TRSV_REG_CFG(0x042, 0x5d, PWR_MODE_ANY),
- PHY_TRSV_REG_CFG(0x043, 0x80, PWR_MODE_ANY),
+ PHY_TRSV_REG_CFG_AUTOV9(0x042, 0x5d, PWR_MODE_ANY),
+ PHY_TRSV_REG_CFG_AUTOV9(0x043, 0x80, PWR_MODE_ANY),
END_UFS_PHY_CFG,
};
/* Calibration for HS mode series A/B */
static const struct samsung_ufs_phy_cfg exynosautov9_pre_pwr_hs_cfg[] = {
- PHY_TRSV_REG_CFG(0x032, 0xbc, PWR_MODE_HS_ANY),
- PHY_TRSV_REG_CFG(0x03c, 0x7f, PWR_MODE_HS_ANY),
- PHY_TRSV_REG_CFG(0x048, 0xc0, PWR_MODE_HS_ANY),
+ PHY_TRSV_REG_CFG_AUTOV9(0x032, 0xbc, PWR_MODE_HS_ANY),
+ PHY_TRSV_REG_CFG_AUTOV9(0x03c, 0x7f, PWR_MODE_HS_ANY),
+ PHY_TRSV_REG_CFG_AUTOV9(0x048, 0xc0, PWR_MODE_HS_ANY),
- PHY_TRSV_REG_CFG(0x04a, 0x00, PWR_MODE_HS_G3_SER_B),
- PHY_TRSV_REG_CFG(0x04b, 0x10, PWR_MODE_HS_G1_SER_B |
- PWR_MODE_HS_G3_SER_B),
- PHY_TRSV_REG_CFG(0x04d, 0x63, PWR_MODE_HS_G3_SER_B),
+ PHY_TRSV_REG_CFG_AUTOV9(0x04a, 0x00, PWR_MODE_HS_G3_SER_B),
+ PHY_TRSV_REG_CFG_AUTOV9(0x04b, 0x10, PWR_MODE_HS_G1_SER_B |
+ PWR_MODE_HS_G3_SER_B),
+ PHY_TRSV_REG_CFG_AUTOV9(0x04d, 0x63, PWR_MODE_HS_G3_SER_B),
END_UFS_PHY_CFG,
};
diff --git a/drivers/phy/st/phy-stm32-usbphyc.c b/drivers/phy/st/phy-stm32-usbphyc.c
index 007a23c78d56..a98c911cc37a 100644
--- a/drivers/phy/st/phy-stm32-usbphyc.c
+++ b/drivers/phy/st/phy-stm32-usbphyc.c
@@ -358,7 +358,9 @@ static int stm32_usbphyc_phy_init(struct phy *phy)
return 0;
pll_disable:
- return stm32_usbphyc_pll_disable(usbphyc);
+ stm32_usbphyc_pll_disable(usbphyc);
+
+ return ret;
}
static int stm32_usbphyc_phy_exit(struct phy *phy)
diff --git a/drivers/phy/ti/phy-tusb1210.c b/drivers/phy/ti/phy-tusb1210.c
index c3ab4b69ea68..669c13d6e402 100644
--- a/drivers/phy/ti/phy-tusb1210.c
+++ b/drivers/phy/ti/phy-tusb1210.c
@@ -105,8 +105,9 @@ static int tusb1210_power_on(struct phy *phy)
msleep(TUSB1210_RESET_TIME_MS);
/* Restore the optional eye diagram optimization value */
- return tusb1210_ulpi_write(tusb, TUSB1210_VENDOR_SPECIFIC2,
- tusb->vendor_specific2);
+ tusb1210_ulpi_write(tusb, TUSB1210_VENDOR_SPECIFIC2, tusb->vendor_specific2);
+
+ return 0;
}
static int tusb1210_power_off(struct phy *phy)
diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig
index f52960d2dfbe..bff144c97e66 100644
--- a/drivers/pinctrl/Kconfig
+++ b/drivers/pinctrl/Kconfig
@@ -32,7 +32,7 @@ config DEBUG_PINCTRL
Say Y here to add some extra checks and diagnostics to PINCTRL calls.
config PINCTRL_AMD
- tristate "AMD GPIO pin control"
+ bool "AMD GPIO pin control"
depends on HAS_IOMEM
depends on ACPI || COMPILE_TEST
select GPIOLIB
diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
index a5cc8f24299e..4168d7f5f273 100644
--- a/drivers/platform/chrome/cros_ec.c
+++ b/drivers/platform/chrome/cros_ec.c
@@ -135,16 +135,16 @@ static int cros_ec_sleep_event(struct cros_ec_device *ec_dev, u8 sleep_event)
buf.msg.command = EC_CMD_HOST_SLEEP_EVENT;
ret = cros_ec_cmd_xfer_status(ec_dev, &buf.msg);
-
- /* For now, report failure to transition to S0ix with a warning. */
+ /* Report failure to transition to system wide suspend with a warning. */
if (ret >= 0 && ec_dev->host_sleep_v1 &&
- (sleep_event == HOST_SLEEP_EVENT_S0IX_RESUME)) {
+ (sleep_event == HOST_SLEEP_EVENT_S0IX_RESUME ||
+ sleep_event == HOST_SLEEP_EVENT_S3_RESUME)) {
ec_dev->last_resume_result =
buf.u.resp1.resume_response.sleep_transitions;
WARN_ONCE(buf.u.resp1.resume_response.sleep_transitions &
EC_HOST_RESUME_SLEEP_TIMEOUT,
- "EC detected sleep transition timeout. Total slp_s0 transitions: %d",
+ "EC detected sleep transition timeout. Total sleep transitions: %d",
buf.u.resp1.resume_response.sleep_transitions &
EC_HOST_RESUME_SLEEP_TRANSITIONS_MASK);
}
diff --git a/drivers/platform/mellanox/mlxreg-lc.c b/drivers/platform/mellanox/mlxreg-lc.c
index c897a2f15840..55834ccb4ac7 100644
--- a/drivers/platform/mellanox/mlxreg-lc.c
+++ b/drivers/platform/mellanox/mlxreg-lc.c
@@ -716,8 +716,12 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
switch (regval) {
case MLXREG_LC_SN4800_C16:
err = mlxreg_lc_sn4800_c16_config_init(mlxreg_lc, regmap, data);
- if (err)
+ if (err) {
+ dev_err(dev, "Failed to config client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr,
+ data->hpdev.brdinfo->addr);
return err;
+ }
break;
default:
return -ENODEV;
@@ -730,8 +734,11 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
mlxreg_lc->mux = platform_device_register_resndata(dev, "i2c-mux-mlxcpld", data->hpdev.nr,
NULL, 0, mlxreg_lc->mux_data,
sizeof(*mlxreg_lc->mux_data));
- if (IS_ERR(mlxreg_lc->mux))
+ if (IS_ERR(mlxreg_lc->mux)) {
+ dev_err(dev, "Failed to create mux infra for client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
return PTR_ERR(mlxreg_lc->mux);
+ }
/* Register IO access driver. */
if (mlxreg_lc->io_data) {
@@ -740,6 +747,9 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
platform_device_register_resndata(dev, "mlxreg-io", data->hpdev.nr, NULL, 0,
mlxreg_lc->io_data, sizeof(*mlxreg_lc->io_data));
if (IS_ERR(mlxreg_lc->io_regs)) {
+ dev_err(dev, "Failed to create regio for client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr,
+ data->hpdev.brdinfo->addr);
err = PTR_ERR(mlxreg_lc->io_regs);
goto fail_register_io;
}
@@ -753,6 +763,9 @@ mlxreg_lc_config_init(struct mlxreg_lc *mlxreg_lc, void *regmap,
mlxreg_lc->led_data,
sizeof(*mlxreg_lc->led_data));
if (IS_ERR(mlxreg_lc->led)) {
+ dev_err(dev, "Failed to create LED objects for client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr,
+ data->hpdev.brdinfo->addr);
err = PTR_ERR(mlxreg_lc->led);
goto fail_register_led;
}
@@ -809,7 +822,8 @@ static int mlxreg_lc_probe(struct platform_device *pdev)
if (!data->hpdev.adapter) {
dev_err(&pdev->dev, "Failed to get adapter for bus %d\n",
data->hpdev.nr);
- return -EFAULT;
+ err = -EFAULT;
+ goto i2c_get_adapter_fail;
}
/* Create device at the top of line card I2C tree.*/
@@ -818,32 +832,40 @@ static int mlxreg_lc_probe(struct platform_device *pdev)
if (IS_ERR(data->hpdev.client)) {
dev_err(&pdev->dev, "Failed to create client %s at bus %d at addr 0x%02x\n",
data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
-
- i2c_put_adapter(data->hpdev.adapter);
- data->hpdev.adapter = NULL;
- return PTR_ERR(data->hpdev.client);
+ err = PTR_ERR(data->hpdev.client);
+ goto i2c_new_device_fail;
}
regmap = devm_regmap_init_i2c(data->hpdev.client,
&mlxreg_lc_regmap_conf);
if (IS_ERR(regmap)) {
+ dev_err(&pdev->dev, "Failed to create regmap for client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
err = PTR_ERR(regmap);
- goto mlxreg_lc_probe_fail;
+ goto devm_regmap_init_i2c_fail;
}
/* Set default registers. */
for (i = 0; i < mlxreg_lc_regmap_conf.num_reg_defaults; i++) {
err = regmap_write(regmap, mlxreg_lc_regmap_default[i].reg,
mlxreg_lc_regmap_default[i].def);
- if (err)
- goto mlxreg_lc_probe_fail;
+ if (err) {
+ dev_err(&pdev->dev, "Failed to set default regmap %d for client %s at bus %d at addr 0x%02x\n",
+ i, data->hpdev.brdinfo->type, data->hpdev.nr,
+ data->hpdev.brdinfo->addr);
+ goto regmap_write_fail;
+ }
}
/* Sync registers with hardware. */
regcache_mark_dirty(regmap);
err = regcache_sync(regmap);
- if (err)
- goto mlxreg_lc_probe_fail;
+ if (err) {
+ dev_err(&pdev->dev, "Failed to sync regmap for client %s at bus %d at addr 0x%02x\n",
+ data->hpdev.brdinfo->type, data->hpdev.nr, data->hpdev.brdinfo->addr);
+ err = PTR_ERR(regmap);
+ goto regcache_sync_fail;
+ }
par_pdata = data->hpdev.brdinfo->platform_data;
mlxreg_lc->par_regmap = par_pdata->regmap;
@@ -854,12 +876,27 @@ static int mlxreg_lc_probe(struct platform_device *pdev)
/* Configure line card. */
err = mlxreg_lc_config_init(mlxreg_lc, regmap, data);
if (err)
- goto mlxreg_lc_probe_fail;
+ goto mlxreg_lc_config_init_fail;
return err;
-mlxreg_lc_probe_fail:
+mlxreg_lc_config_init_fail:
+regcache_sync_fail:
+regmap_write_fail:
+devm_regmap_init_i2c_fail:
+ if (data->hpdev.client) {
+ i2c_unregister_device(data->hpdev.client);
+ data->hpdev.client = NULL;
+ }
+i2c_new_device_fail:
i2c_put_adapter(data->hpdev.adapter);
+ data->hpdev.adapter = NULL;
+i2c_get_adapter_fail:
+ /* Clear event notification callback and handle. */
+ if (data->notifier) {
+ data->notifier->user_handler = NULL;
+ data->notifier->handle = NULL;
+ }
return err;
}
@@ -868,11 +905,18 @@ static int mlxreg_lc_remove(struct platform_device *pdev)
struct mlxreg_core_data *data = dev_get_platdata(&pdev->dev);
struct mlxreg_lc *mlxreg_lc = platform_get_drvdata(pdev);
- /* Clear event notification callback. */
- if (data->notifier) {
- data->notifier->user_handler = NULL;
- data->notifier->handle = NULL;
- }
+ /*
+ * Probing and removing are invoked by hotplug events raised upon line card insertion and
+ * removing. If probing procedure fails all data is cleared. However, hotplug event still
+ * will be raised on line card removing and activate removing procedure. In this case there
+ * is nothing to remove.
+ */
+ if (!data->notifier || !data->notifier->handle)
+ return 0;
+
+ /* Clear event notification callback and handle. */
+ data->notifier->user_handler = NULL;
+ data->notifier->handle = NULL;
/* Destroy static I2C device feeding by main power. */
mlxreg_lc_destroy_static_devices(mlxreg_lc, mlxreg_lc->main_devs,
diff --git a/drivers/platform/olpc/olpc-ec.c b/drivers/platform/olpc/olpc-ec.c
index 4ff5c3a12991..921520475ff6 100644
--- a/drivers/platform/olpc/olpc-ec.c
+++ b/drivers/platform/olpc/olpc-ec.c
@@ -264,7 +264,7 @@ static ssize_t ec_dbgfs_cmd_write(struct file *file, const char __user *buf,
int i, m;
unsigned char ec_cmd[EC_MAX_CMD_ARGS];
unsigned int ec_cmd_int[EC_MAX_CMD_ARGS];
- char cmdbuf[64];
+ char cmdbuf[64] = "";
int ec_cmd_bytes;
mutex_lock(&ec_dbgfs_lock);
diff --git a/drivers/platform/x86/pmc_atom.c b/drivers/platform/x86/pmc_atom.c
index a40fae6edc84..f24ab24f2927 100644
--- a/drivers/platform/x86/pmc_atom.c
+++ b/drivers/platform/x86/pmc_atom.c
@@ -402,21 +402,16 @@ static const struct dmi_system_id critclk_systems[] = {
},
},
{
- /* pmc_plt_clk0 - 3 are used for the 4 ethernet controllers */
- .ident = "Lex 3I380D",
+ /*
+ * Lex System / Lex Computech Co. makes a lot of Bay Trail
+ * based embedded boards which often come with multiple
+ * ethernet controllers using multiple pmc_plt_clks. See:
+ * https://www.lex.com.tw/products/embedded-ipc-board/
+ */
+ .ident = "Lex BayTrail",
.callback = dmi_callback,
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Lex BayTrail"),
- DMI_MATCH(DMI_PRODUCT_NAME, "3I380D"),
- },
- },
- {
- /* pmc_plt_clk* - are used for ethernet controllers */
- .ident = "Lex 2I385SW",
- .callback = dmi_callback,
- .matches = {
- DMI_MATCH(DMI_SYS_VENDOR, "Lex BayTrail"),
- DMI_MATCH(DMI_PRODUCT_NAME, "2I385SW"),
},
},
{
diff --git a/drivers/pwm/pwm-lpc18xx-sct.c b/drivers/pwm/pwm-lpc18xx-sct.c
index b909096dba2f..43b5509dde51 100644
--- a/drivers/pwm/pwm-lpc18xx-sct.c
+++ b/drivers/pwm/pwm-lpc18xx-sct.c
@@ -98,7 +98,7 @@ struct lpc18xx_pwm_chip {
unsigned long clk_rate;
unsigned int period_ns;
unsigned int min_period_ns;
- unsigned int max_period_ns;
+ u64 max_period_ns;
unsigned int period_event;
unsigned long event_map;
struct mutex res_lock;
@@ -145,40 +145,48 @@ static void lpc18xx_pwm_set_conflict_res(struct lpc18xx_pwm_chip *lpc18xx_pwm,
mutex_unlock(&lpc18xx_pwm->res_lock);
}
-static void lpc18xx_pwm_config_period(struct pwm_chip *chip, int period_ns)
+static void lpc18xx_pwm_config_period(struct pwm_chip *chip, u64 period_ns)
{
struct lpc18xx_pwm_chip *lpc18xx_pwm = to_lpc18xx_pwm_chip(chip);
- u64 val;
+ u32 val;
- val = (u64)period_ns * lpc18xx_pwm->clk_rate;
- do_div(val, NSEC_PER_SEC);
+ /*
+ * With clk_rate < NSEC_PER_SEC this cannot overflow.
+ * With period_ns < max_period_ns this also fits into an u32.
+ * As period_ns >= min_period_ns = DIV_ROUND_UP(NSEC_PER_SEC, lpc18xx_pwm->clk_rate);
+ * we have val >= 1.
+ */
+ val = mul_u64_u64_div_u64(period_ns, lpc18xx_pwm->clk_rate, NSEC_PER_SEC);
lpc18xx_pwm_writel(lpc18xx_pwm,
LPC18XX_PWM_MATCH(lpc18xx_pwm->period_event),
- (u32)val - 1);
+ val - 1);
lpc18xx_pwm_writel(lpc18xx_pwm,
LPC18XX_PWM_MATCHREL(lpc18xx_pwm->period_event),
- (u32)val - 1);
+ val - 1);
}
static void lpc18xx_pwm_config_duty(struct pwm_chip *chip,
- struct pwm_device *pwm, int duty_ns)
+ struct pwm_device *pwm, u64 duty_ns)
{
struct lpc18xx_pwm_chip *lpc18xx_pwm = to_lpc18xx_pwm_chip(chip);
struct lpc18xx_pwm_data *lpc18xx_data = &lpc18xx_pwm->channeldata[pwm->hwpwm];
- u64 val;
+ u32 val;
- val = (u64)duty_ns * lpc18xx_pwm->clk_rate;
- do_div(val, NSEC_PER_SEC);
+ /*
+ * With clk_rate < NSEC_PER_SEC this cannot overflow.
+ * With duty_ns <= period_ns < max_period_ns this also fits into an u32.
+ */
+ val = mul_u64_u64_div_u64(duty_ns, lpc18xx_pwm->clk_rate, NSEC_PER_SEC);
lpc18xx_pwm_writel(lpc18xx_pwm,
LPC18XX_PWM_MATCH(lpc18xx_data->duty_event),
- (u32)val);
+ val);
lpc18xx_pwm_writel(lpc18xx_pwm,
LPC18XX_PWM_MATCHREL(lpc18xx_data->duty_event),
- (u32)val);
+ val);
}
static int lpc18xx_pwm_config(struct pwm_chip *chip, struct pwm_device *pwm,
@@ -360,12 +368,27 @@ static int lpc18xx_pwm_probe(struct platform_device *pdev)
goto disable_pwmclk;
}
+ /*
+ * If clkrate is too fast, the calculations in .apply() might overflow.
+ */
+ if (lpc18xx_pwm->clk_rate > NSEC_PER_SEC) {
+ ret = dev_err_probe(&pdev->dev, -EINVAL, "pwm clock to fast\n");
+ goto disable_pwmclk;
+ }
+
+ /*
+ * If clkrate is too fast, the calculations in .apply() might overflow.
+ */
+ if (lpc18xx_pwm->clk_rate > NSEC_PER_SEC) {
+ ret = dev_err_probe(&pdev->dev, -EINVAL, "pwm clock to fast\n");
+ goto disable_pwmclk;
+ }
+
mutex_init(&lpc18xx_pwm->res_lock);
mutex_init(&lpc18xx_pwm->period_lock);
- val = (u64)NSEC_PER_SEC * LPC18XX_PWM_TIMER_MAX;
- do_div(val, lpc18xx_pwm->clk_rate);
- lpc18xx_pwm->max_period_ns = val;
+ lpc18xx_pwm->max_period_ns =
+ mul_u64_u64_div_u64(NSEC_PER_SEC, LPC18XX_PWM_TIMER_MAX, lpc18xx_pwm->clk_rate);
lpc18xx_pwm->min_period_ns = DIV_ROUND_UP(NSEC_PER_SEC,
lpc18xx_pwm->clk_rate);
diff --git a/drivers/pwm/pwm-sifive.c b/drivers/pwm/pwm-sifive.c
index 253c4a17d255..58347fcd4812 100644
--- a/drivers/pwm/pwm-sifive.c
+++ b/drivers/pwm/pwm-sifive.c
@@ -23,7 +23,7 @@
#define PWM_SIFIVE_PWMCFG 0x0
#define PWM_SIFIVE_PWMCOUNT 0x8
#define PWM_SIFIVE_PWMS 0x10
-#define PWM_SIFIVE_PWMCMP0 0x20
+#define PWM_SIFIVE_PWMCMP(i) (0x20 + 4 * (i))
/* PWMCFG fields */
#define PWM_SIFIVE_PWMCFG_SCALE GENMASK(3, 0)
@@ -36,8 +36,6 @@
#define PWM_SIFIVE_PWMCFG_GANG BIT(24)
#define PWM_SIFIVE_PWMCFG_IP BIT(28)
-/* PWM_SIFIVE_SIZE_PWMCMP is used to calculate offset for pwmcmpX registers */
-#define PWM_SIFIVE_SIZE_PWMCMP 4
#define PWM_SIFIVE_CMPWIDTH 16
#define PWM_SIFIVE_DEFAULT_PERIOD 10000000
@@ -112,8 +110,7 @@ static void pwm_sifive_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
struct pwm_sifive_ddata *ddata = pwm_sifive_chip_to_ddata(chip);
u32 duty, val;
- duty = readl(ddata->regs + PWM_SIFIVE_PWMCMP0 +
- pwm->hwpwm * PWM_SIFIVE_SIZE_PWMCMP);
+ duty = readl(ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
state->enabled = duty > 0;
@@ -194,8 +191,7 @@ static int pwm_sifive_apply(struct pwm_chip *chip, struct pwm_device *pwm,
pwm_sifive_update_clock(ddata, clk_get_rate(ddata->clk));
}
- writel(frac, ddata->regs + PWM_SIFIVE_PWMCMP0 +
- pwm->hwpwm * PWM_SIFIVE_SIZE_PWMCMP);
+ writel(frac, ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
if (state->enabled != enabled)
pwm_sifive_enable(chip, state->enabled);
@@ -233,6 +229,8 @@ static int pwm_sifive_probe(struct platform_device *pdev)
struct pwm_sifive_ddata *ddata;
struct pwm_chip *chip;
int ret;
+ u32 val;
+ unsigned int enabled_pwms = 0, enabled_clks = 1;
ddata = devm_kzalloc(dev, sizeof(*ddata), GFP_KERNEL);
if (!ddata)
@@ -259,6 +257,33 @@ static int pwm_sifive_probe(struct platform_device *pdev)
return ret;
}
+ val = readl(ddata->regs + PWM_SIFIVE_PWMCFG);
+ if (val & PWM_SIFIVE_PWMCFG_EN_ALWAYS) {
+ unsigned int i;
+
+ for (i = 0; i < chip->npwm; ++i) {
+ val = readl(ddata->regs + PWM_SIFIVE_PWMCMP(i));
+ if (val > 0)
+ ++enabled_pwms;
+ }
+ }
+
+ /* The clk should be on once for each running PWM. */
+ if (enabled_pwms) {
+ while (enabled_clks < enabled_pwms) {
+ /* This is not expected to fail as the clk is already on */
+ ret = clk_enable(ddata->clk);
+ if (unlikely(ret)) {
+ dev_err_probe(dev, ret, "Failed to enable clk\n");
+ goto disable_clk;
+ }
+ ++enabled_clks;
+ }
+ } else {
+ clk_disable(ddata->clk);
+ enabled_clks = 0;
+ }
+
/* Watch for changes to underlying clock frequency */
ddata->notifier.notifier_call = pwm_sifive_clock_notifier;
ret = clk_notifier_register(ddata->clk, &ddata->notifier);
@@ -281,7 +306,11 @@ static int pwm_sifive_probe(struct platform_device *pdev)
unregister_clk:
clk_notifier_unregister(ddata->clk, &ddata->notifier);
disable_clk:
- clk_disable_unprepare(ddata->clk);
+ while (enabled_clks) {
+ clk_disable(ddata->clk);
+ --enabled_clks;
+ }
+ clk_unprepare(ddata->clk);
return ret;
}
@@ -289,23 +318,19 @@ static int pwm_sifive_probe(struct platform_device *pdev)
static int pwm_sifive_remove(struct platform_device *dev)
{
struct pwm_sifive_ddata *ddata = platform_get_drvdata(dev);
- bool is_enabled = false;
struct pwm_device *pwm;
int ch;
+ pwmchip_remove(&ddata->chip);
+ clk_notifier_unregister(ddata->clk, &ddata->notifier);
+
for (ch = 0; ch < ddata->chip.npwm; ch++) {
pwm = &ddata->chip.pwms[ch];
- if (pwm->state.enabled) {
- is_enabled = true;
- break;
- }
+ if (pwm->state.enabled)
+ clk_disable(ddata->clk);
}
- if (is_enabled)
- clk_disable(ddata->clk);
- clk_disable_unprepare(ddata->clk);
- pwmchip_remove(&ddata->chip);
- clk_notifier_unregister(ddata->clk, &ddata->notifier);
+ clk_unprepare(ddata->clk);
return 0;
}
diff --git a/drivers/regulator/of_regulator.c b/drivers/regulator/of_regulator.c
index f54d4f176882..e12b681c72e5 100644
--- a/drivers/regulator/of_regulator.c
+++ b/drivers/regulator/of_regulator.c
@@ -264,8 +264,12 @@ static int of_get_regulation_constraints(struct device *dev,
}
suspend_np = of_get_child_by_name(np, regulator_states[i]);
- if (!suspend_np || !suspend_state)
+ if (!suspend_np)
continue;
+ if (!suspend_state) {
+ of_node_put(suspend_np);
+ continue;
+ }
if (!of_property_read_u32(suspend_np, "regulator-mode",
&pval)) {
diff --git a/drivers/regulator/qcom_smd-regulator.c b/drivers/regulator/qcom_smd-regulator.c
index 7dff94a2eb7e..0af8286e1b10 100644
--- a/drivers/regulator/qcom_smd-regulator.c
+++ b/drivers/regulator/qcom_smd-regulator.c
@@ -357,10 +357,10 @@ static const struct regulator_desc pm8941_switch = {
static const struct regulator_desc pm8916_pldo = {
.linear_ranges = (struct linear_range[]) {
- REGULATOR_LINEAR_RANGE(750000, 0, 208, 12500),
+ REGULATOR_LINEAR_RANGE(1750000, 0, 127, 12500),
},
.n_linear_ranges = 1,
- .n_voltages = 209,
+ .n_voltages = 128,
.ops = &rpm_smps_ldo_ops,
};
diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c
index 91eb037089ef..f17bb41a6551 100644
--- a/drivers/remoteproc/imx_rproc.c
+++ b/drivers/remoteproc/imx_rproc.c
@@ -562,16 +562,17 @@ static int imx_rproc_addr_init(struct imx_rproc *priv,
node = of_parse_phandle(np, "memory-region", a);
/* Not map vdevbuffer, vdevring region */
- if (!strncmp(node->name, "vdev", strlen("vdev")))
+ if (!strncmp(node->name, "vdev", strlen("vdev"))) {
+ of_node_put(node);
continue;
+ }
err = of_address_to_resource(node, 0, &res);
+ of_node_put(node);
if (err) {
dev_err(dev, "unable to resolve memory region\n");
return err;
}
- of_node_put(node);
-
if (b >= IMX_RPROC_MEM_MAX)
break;
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
index 1ae47cc153e5..e7765bbe17a0 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -87,6 +87,9 @@ static void adsp_minidump(struct rproc *rproc)
{
struct qcom_adsp *adsp = rproc->priv;
+ if (rproc->dump_conf == RPROC_COREDUMP_DISABLED)
+ return;
+
qcom_minidump(rproc, adsp->minidump_id);
}
diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
index 9fca81492863..a9f04dd83ab6 100644
--- a/drivers/remoteproc/qcom_sysmon.c
+++ b/drivers/remoteproc/qcom_sysmon.c
@@ -41,6 +41,7 @@ struct qcom_sysmon {
struct completion comp;
struct completion ind_comp;
struct completion shutdown_comp;
+ struct completion ssctl_comp;
struct mutex lock;
bool ssr_ack;
@@ -445,6 +446,8 @@ static int ssctl_new_server(struct qmi_handle *qmi, struct qmi_service *svc)
svc->priv = sysmon;
+ complete(&sysmon->ssctl_comp);
+
return 0;
}
@@ -501,6 +504,7 @@ static int sysmon_start(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_AFTER_POWERUP
};
+ reinit_completion(&sysmon->ssctl_comp);
mutex_lock(&sysmon->state_lock);
sysmon->state = SSCTL_SSR_EVENT_AFTER_POWERUP;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
@@ -545,6 +549,11 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
if (crashed)
return;
+ if (sysmon->ssctl_instance) {
+ if (!wait_for_completion_timeout(&sysmon->ssctl_comp, HZ / 2))
+ dev_err(sysmon->dev, "timeout waiting for ssctl service\n");
+ }
+
if (sysmon->ssctl_version)
sysmon->shutdown_acked = ssctl_request_shutdown(sysmon);
else if (sysmon->ept)
@@ -631,6 +640,7 @@ struct qcom_sysmon *qcom_add_sysmon_subdev(struct rproc *rproc,
init_completion(&sysmon->comp);
init_completion(&sysmon->ind_comp);
init_completion(&sysmon->shutdown_comp);
+ init_completion(&sysmon->ssctl_comp);
mutex_init(&sysmon->lock);
mutex_init(&sysmon->state_lock);
diff --git a/drivers/remoteproc/qcom_wcnss.c b/drivers/remoteproc/qcom_wcnss.c
index 9a223d394087..68f37296b151 100644
--- a/drivers/remoteproc/qcom_wcnss.c
+++ b/drivers/remoteproc/qcom_wcnss.c
@@ -467,6 +467,7 @@ static int wcnss_request_irq(struct qcom_wcnss *wcnss,
irq_handler_t thread_fn)
{
int ret;
+ int irq_number;
ret = platform_get_irq_byname(pdev, name);
if (ret < 0 && optional) {
@@ -477,14 +478,19 @@ static int wcnss_request_irq(struct qcom_wcnss *wcnss,
return ret;
}
+ irq_number = ret;
+
ret = devm_request_threaded_irq(&pdev->dev, ret,
NULL, thread_fn,
IRQF_TRIGGER_RISING | IRQF_ONESHOT,
"wcnss", wcnss);
- if (ret)
+ if (ret) {
dev_err(&pdev->dev, "request %s IRQ failed\n", name);
+ return ret;
+ }
- return ret;
+ /* Return the IRQ number if the IRQ was successfully acquired */
+ return irq_number;
}
static int wcnss_alloc_memory_region(struct qcom_wcnss *wcnss)
diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
index 4840ad906018..0481926c6975 100644
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
@@ -1655,6 +1655,7 @@ static int k3_r5_cluster_of_init(struct platform_device *pdev)
if (!cpdev) {
ret = -ENODEV;
dev_err(dev, "could not get R5 core platform device\n");
+ of_node_put(child);
goto fail;
}
@@ -1663,6 +1664,7 @@ static int k3_r5_cluster_of_init(struct platform_device *pdev)
dev_err(dev, "k3_r5_core_of_init failed, ret = %d\n",
ret);
put_device(&cpdev->dev);
+ of_node_put(child);
goto fail;
}
diff --git a/drivers/rpmsg/mtk_rpmsg.c b/drivers/rpmsg/mtk_rpmsg.c
index 5b4404b8be4c..d1213c33da20 100644
--- a/drivers/rpmsg/mtk_rpmsg.c
+++ b/drivers/rpmsg/mtk_rpmsg.c
@@ -234,7 +234,9 @@ static void mtk_register_device_work_function(struct work_struct *register_work)
if (info->registered)
continue;
+ mutex_unlock(&subdev->channels_lock);
ret = mtk_rpmsg_register_device(subdev, &info->info);
+ mutex_lock(&subdev->channels_lock);
if (ret) {
dev_err(&pdev->dev, "Can't create rpmsg_device\n");
continue;
diff --git a/drivers/rpmsg/qcom_smd.c b/drivers/rpmsg/qcom_smd.c
index 1957b27c4cf3..f7af53891ef9 100644
--- a/drivers/rpmsg/qcom_smd.c
+++ b/drivers/rpmsg/qcom_smd.c
@@ -1383,6 +1383,7 @@ static int qcom_smd_parse_edge(struct device *dev,
}
edge->ipc_regmap = syscon_node_to_regmap(syscon_np);
+ of_node_put(syscon_np);
if (IS_ERR(edge->ipc_regmap)) {
ret = PTR_ERR(edge->ipc_regmap);
goto put_node;
diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
index b6183d4f62a2..4f2189111494 100644
--- a/drivers/rpmsg/rpmsg_char.c
+++ b/drivers/rpmsg/rpmsg_char.c
@@ -120,8 +120,11 @@ static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
struct rpmsg_device *rpdev = eptdev->rpdev;
struct device *dev = &eptdev->dev;
- if (eptdev->ept)
+ mutex_lock(&eptdev->ept_lock);
+ if (eptdev->ept) {
+ mutex_unlock(&eptdev->ept_lock);
return -EBUSY;
+ }
get_device(dev);
@@ -137,11 +140,13 @@ static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
if (!ept) {
dev_err(dev, "failed to open %s\n", eptdev->chinfo.name);
put_device(dev);
+ mutex_unlock(&eptdev->ept_lock);
return -EINVAL;
}
eptdev->ept = ept;
filp->private_data = eptdev;
+ mutex_unlock(&eptdev->ept_lock);
return 0;
}
diff --git a/drivers/rtc/rtc-rx8025.c b/drivers/rtc/rtc-rx8025.c
index 5bfdd34a72ff..753c79892aa6 100644
--- a/drivers/rtc/rtc-rx8025.c
+++ b/drivers/rtc/rtc-rx8025.c
@@ -55,6 +55,8 @@
#define RX8025_BIT_CTRL2_XST BIT(5)
#define RX8025_BIT_CTRL2_VDET BIT(6)
+#define RX8035_BIT_HOUR_1224 BIT(7)
+
/* Clock precision adjustment */
#define RX8025_ADJ_RESOLUTION 3050 /* in ppb */
#define RX8025_ADJ_DATA_MAX 62
@@ -78,6 +80,7 @@ struct rx8025_data {
struct rtc_device *rtc;
enum rx_model model;
u8 ctrl1;
+ int is_24;
};
static s32 rx8025_read_reg(const struct i2c_client *client, u8 number)
@@ -226,7 +229,7 @@ static int rx8025_get_time(struct device *dev, struct rtc_time *dt)
dt->tm_sec = bcd2bin(date[RX8025_REG_SEC] & 0x7f);
dt->tm_min = bcd2bin(date[RX8025_REG_MIN] & 0x7f);
- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
+ if (rx8025->is_24)
dt->tm_hour = bcd2bin(date[RX8025_REG_HOUR] & 0x3f);
else
dt->tm_hour = bcd2bin(date[RX8025_REG_HOUR] & 0x1f) % 12
@@ -254,7 +257,7 @@ static int rx8025_set_time(struct device *dev, struct rtc_time *dt)
*/
date[RX8025_REG_SEC] = bin2bcd(dt->tm_sec);
date[RX8025_REG_MIN] = bin2bcd(dt->tm_min);
- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
+ if (rx8025->is_24)
date[RX8025_REG_HOUR] = bin2bcd(dt->tm_hour);
else
date[RX8025_REG_HOUR] = (dt->tm_hour >= 12 ? 0x20 : 0)
@@ -279,6 +282,7 @@ static int rx8025_init_client(struct i2c_client *client)
struct rx8025_data *rx8025 = i2c_get_clientdata(client);
u8 ctrl[2], ctrl2;
int need_clear = 0;
+ int hour_reg;
int err;
err = rx8025_read_regs(client, RX8025_REG_CTRL1, 2, ctrl);
@@ -303,6 +307,16 @@ static int rx8025_init_client(struct i2c_client *client)
err = rx8025_write_reg(client, RX8025_REG_CTRL2, ctrl2);
}
+
+ if (rx8025->model == model_rx_8035) {
+ /* In RX-8035, 12/24 flag is in the hour register */
+ hour_reg = rx8025_read_reg(client, RX8025_REG_HOUR);
+ if (hour_reg < 0)
+ return hour_reg;
+ rx8025->is_24 = (hour_reg & RX8035_BIT_HOUR_1224);
+ } else {
+ rx8025->is_24 = (ctrl[1] & RX8025_BIT_CTRL1_1224);
+ }
out:
return err;
}
@@ -329,7 +343,7 @@ static int rx8025_read_alarm(struct device *dev, struct rtc_wkalrm *t)
/* Hardware alarms precision is 1 minute! */
t->time.tm_sec = 0;
t->time.tm_min = bcd2bin(ald[0] & 0x7f);
- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
+ if (rx8025->is_24)
t->time.tm_hour = bcd2bin(ald[1] & 0x3f);
else
t->time.tm_hour = bcd2bin(ald[1] & 0x1f) % 12
@@ -350,7 +364,7 @@ static int rx8025_set_alarm(struct device *dev, struct rtc_wkalrm *t)
int err;
ald[0] = bin2bcd(t->time.tm_min);
- if (rx8025->ctrl1 & RX8025_BIT_CTRL1_1224)
+ if (rx8025->is_24)
ald[1] = bin2bcd(t->time.tm_hour);
else
ald[1] = (t->time.tm_hour >= 12 ? 0x20 : 0)
diff --git a/drivers/s390/char/zcore.c b/drivers/s390/char/zcore.c
index 516783ba950f..92b32ce645b9 100644
--- a/drivers/s390/char/zcore.c
+++ b/drivers/s390/char/zcore.c
@@ -50,6 +50,7 @@ static struct dentry *zcore_reipl_file;
static struct dentry *zcore_hsa_file;
static struct ipl_parameter_block *zcore_ipl_block;
+static DEFINE_MUTEX(hsa_buf_mutex);
static char hsa_buf[PAGE_SIZE] __aligned(PAGE_SIZE);
/*
@@ -66,19 +67,24 @@ int memcpy_hsa_user(void __user *dest, unsigned long src, size_t count)
if (!hsa_available)
return -ENODATA;
+ mutex_lock(&hsa_buf_mutex);
while (count) {
if (sclp_sdias_copy(hsa_buf, src / PAGE_SIZE + 2, 1)) {
TRACE("sclp_sdias_copy() failed\n");
+ mutex_unlock(&hsa_buf_mutex);
return -EIO;
}
offset = src % PAGE_SIZE;
bytes = min(PAGE_SIZE - offset, count);
- if (copy_to_user(dest, hsa_buf + offset, bytes))
+ if (copy_to_user(dest, hsa_buf + offset, bytes)) {
+ mutex_unlock(&hsa_buf_mutex);
return -EFAULT;
+ }
src += bytes;
dest += bytes;
count -= bytes;
}
+ mutex_unlock(&hsa_buf_mutex);
return 0;
}
@@ -96,9 +102,11 @@ int memcpy_hsa_kernel(void *dest, unsigned long src, size_t count)
if (!hsa_available)
return -ENODATA;
+ mutex_lock(&hsa_buf_mutex);
while (count) {
if (sclp_sdias_copy(hsa_buf, src / PAGE_SIZE + 2, 1)) {
TRACE("sclp_sdias_copy() failed\n");
+ mutex_unlock(&hsa_buf_mutex);
return -EIO;
}
offset = src % PAGE_SIZE;
@@ -108,6 +116,7 @@ int memcpy_hsa_kernel(void *dest, unsigned long src, size_t count)
dest += bytes;
count -= bytes;
}
+ mutex_unlock(&hsa_buf_mutex);
return 0;
}
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index ee182cfb467d..e76a4a3cf1ce 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -107,9 +107,10 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
/*
* Reset to IDLE only if processing of a channel program
* has finished. Do not overwrite a possible processing
- * state if the final interrupt was for HSCH or CSCH.
+ * state if the interrupt was unsolicited, or if the final
+ * interrupt was for HSCH or CSCH.
*/
- if (private->mdev && cp_is_finished)
+ if (cp_is_finished)
private->state = VFIO_CCW_STATE_IDLE;
if (private->io_trigger)
@@ -301,19 +302,11 @@ static int vfio_ccw_sch_event(struct subchannel *sch, int process)
if (work_pending(&sch->todo_work))
goto out_unlock;
- if (cio_update_schib(sch)) {
- vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_NOT_OPER);
- rc = 0;
- goto out_unlock;
- }
-
- private = dev_get_drvdata(&sch->dev);
- if (private->state == VFIO_CCW_STATE_NOT_OPER) {
- private->state = private->mdev ? VFIO_CCW_STATE_IDLE :
- VFIO_CCW_STATE_STANDBY;
- }
rc = 0;
+ if (cio_update_schib(sch))
+ vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_NOT_OPER);
+
out_unlock:
spin_unlock_irqrestore(sch->lock, flags);
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index d8589afac272..8d76f5b26e2b 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -146,7 +146,7 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
vfio_uninit_group_dev(&private->vdev);
atomic_inc(&private->avail);
private->mdev = NULL;
- private->state = VFIO_CCW_STATE_IDLE;
+ private->state = VFIO_CCW_STATE_STANDBY;
return ret;
}
diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
index 511bf8e0a436..b61acbb09be3 100644
--- a/drivers/s390/scsi/zfcp_fc.c
+++ b/drivers/s390/scsi/zfcp_fc.c
@@ -145,27 +145,33 @@ void zfcp_fc_enqueue_event(struct zfcp_adapter *adapter,
static int zfcp_fc_wka_port_get(struct zfcp_fc_wka_port *wka_port)
{
+ int ret = -EIO;
+
if (mutex_lock_interruptible(&wka_port->mutex))
return -ERESTARTSYS;
if (wka_port->status == ZFCP_FC_WKA_PORT_OFFLINE ||
wka_port->status == ZFCP_FC_WKA_PORT_CLOSING) {
wka_port->status = ZFCP_FC_WKA_PORT_OPENING;
- if (zfcp_fsf_open_wka_port(wka_port))
+ if (zfcp_fsf_open_wka_port(wka_port)) {
+ /* could not even send request, nothing to wait for */
wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
+ goto out;
+ }
}
- mutex_unlock(&wka_port->mutex);
-
- wait_event(wka_port->completion_wq,
+ wait_event(wka_port->opened,
wka_port->status == ZFCP_FC_WKA_PORT_ONLINE ||
wka_port->status == ZFCP_FC_WKA_PORT_OFFLINE);
if (wka_port->status == ZFCP_FC_WKA_PORT_ONLINE) {
atomic_inc(&wka_port->refcount);
- return 0;
+ ret = 0;
+ goto out;
}
- return -EIO;
+out:
+ mutex_unlock(&wka_port->mutex);
+ return ret;
}
static void zfcp_fc_wka_port_offline(struct work_struct *work)
@@ -181,9 +187,12 @@ static void zfcp_fc_wka_port_offline(struct work_struct *work)
wka_port->status = ZFCP_FC_WKA_PORT_CLOSING;
if (zfcp_fsf_close_wka_port(wka_port)) {
+ /* could not even send request, nothing to wait for */
wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
- wake_up(&wka_port->completion_wq);
+ goto out;
}
+ wait_event(wka_port->closed,
+ wka_port->status == ZFCP_FC_WKA_PORT_OFFLINE);
out:
mutex_unlock(&wka_port->mutex);
}
@@ -193,13 +202,15 @@ static void zfcp_fc_wka_port_put(struct zfcp_fc_wka_port *wka_port)
if (atomic_dec_return(&wka_port->refcount) != 0)
return;
/* wait 10 milliseconds, other reqs might pop in */
- schedule_delayed_work(&wka_port->work, HZ / 100);
+ queue_delayed_work(wka_port->adapter->work_queue, &wka_port->work,
+ msecs_to_jiffies(10));
}
static void zfcp_fc_wka_port_init(struct zfcp_fc_wka_port *wka_port, u32 d_id,
struct zfcp_adapter *adapter)
{
- init_waitqueue_head(&wka_port->completion_wq);
+ init_waitqueue_head(&wka_port->opened);
+ init_waitqueue_head(&wka_port->closed);
wka_port->adapter = adapter;
wka_port->d_id = d_id;
diff --git a/drivers/s390/scsi/zfcp_fc.h b/drivers/s390/scsi/zfcp_fc.h
index 8aaf409ce9cb..97755407ce1b 100644
--- a/drivers/s390/scsi/zfcp_fc.h
+++ b/drivers/s390/scsi/zfcp_fc.h
@@ -185,7 +185,8 @@ enum zfcp_fc_wka_status {
/**
* struct zfcp_fc_wka_port - representation of well-known-address (WKA) FC port
* @adapter: Pointer to adapter structure this WKA port belongs to
- * @completion_wq: Wait for completion of open/close command
+ * @opened: Wait for completion of open command
+ * @closed: Wait for completion of close command
* @status: Current status of WKA port
* @refcount: Reference count to keep port open as long as it is in use
* @d_id: FC destination id or well-known-address
@@ -195,7 +196,8 @@ enum zfcp_fc_wka_status {
*/
struct zfcp_fc_wka_port {
struct zfcp_adapter *adapter;
- wait_queue_head_t completion_wq;
+ wait_queue_head_t opened;
+ wait_queue_head_t closed;
enum zfcp_fc_wka_status status;
atomic_t refcount;
u32 d_id;
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 4f1e4385ce58..19223b075568 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -1907,7 +1907,7 @@ static void zfcp_fsf_open_wka_port_handler(struct zfcp_fsf_req *req)
wka_port->status = ZFCP_FC_WKA_PORT_ONLINE;
}
out:
- wake_up(&wka_port->completion_wq);
+ wake_up(&wka_port->opened);
}
/**
@@ -1966,7 +1966,7 @@ static void zfcp_fsf_close_wka_port_handler(struct zfcp_fsf_req *req)
}
wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
- wake_up(&wka_port->completion_wq);
+ wake_up(&wka_port->closed);
}
/**
diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index 3bb0adefbe06..02026476c39c 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -5745,7 +5745,7 @@ static void beiscsi_remove(struct pci_dev *pcidev)
cancel_work_sync(&phba->sess_work);
beiscsi_iface_destroy_default(phba);
- iscsi_host_remove(phba->shost);
+ iscsi_host_remove(phba->shost, false);
beiscsi_disable_port(phba, 1);
/* after cancelling boot_work */
diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c b/drivers/scsi/bnx2i/bnx2i_iscsi.c
index 15fbd09baa94..a3c800e04a2e 100644
--- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
+++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
@@ -909,7 +909,7 @@ void bnx2i_free_hba(struct bnx2i_hba *hba)
{
struct Scsi_Host *shost = hba->shost;
- iscsi_host_remove(shost);
+ iscsi_host_remove(shost, false);
INIT_LIST_HEAD(&hba->ep_ofld_list);
INIT_LIST_HEAD(&hba->ep_active_list);
INIT_LIST_HEAD(&hba->ep_destroy_list);
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index 4365d52c6430..32abdf0fa9aa 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -328,7 +328,7 @@ void cxgbi_hbas_remove(struct cxgbi_device *cdev)
chba = cdev->hbas[i];
if (chba) {
cdev->hbas[i] = NULL;
- iscsi_host_remove(chba->shost);
+ iscsi_host_remove(chba->shost, false);
pci_dev_put(cdev->pdev);
iscsi_host_free(chba->shost);
}
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 9fee70d6434a..52c6f70d60ec 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -898,7 +898,7 @@ iscsi_sw_tcp_session_create(struct iscsi_endpoint *ep, uint16_t cmds_max,
remove_session:
iscsi_session_teardown(cls_session);
remove_host:
- iscsi_host_remove(shost);
+ iscsi_host_remove(shost, false);
free_host:
iscsi_host_free(shost);
return NULL;
@@ -915,7 +915,7 @@ static void iscsi_sw_tcp_session_destroy(struct iscsi_cls_session *cls_session)
iscsi_tcp_r2tpool_free(cls_session->dd_data);
iscsi_session_teardown(cls_session);
- iscsi_host_remove(shost);
+ iscsi_host_remove(shost, false);
iscsi_host_free(shost);
}
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 797abf4f5399..3ddb701cd29c 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2828,11 +2828,12 @@ static void iscsi_notify_host_removed(struct iscsi_cls_session *cls_session)
/**
* iscsi_host_remove - remove host and sessions
* @shost: scsi host
+ * @is_shutdown: true if called from a driver shutdown callout
*
* If there are any sessions left, this will initiate the removal and wait
* for the completion.
*/
-void iscsi_host_remove(struct Scsi_Host *shost)
+void iscsi_host_remove(struct Scsi_Host *shost, bool is_shutdown)
{
struct iscsi_host *ihost = shost_priv(shost);
unsigned long flags;
@@ -2841,7 +2842,11 @@ void iscsi_host_remove(struct Scsi_Host *shost)
ihost->state = ISCSI_HOST_REMOVED;
spin_unlock_irqrestore(&ihost->lock, flags);
- iscsi_host_for_each_session(shost, iscsi_notify_host_removed);
+ if (!is_shutdown)
+ iscsi_host_for_each_session(shost, iscsi_notify_host_removed);
+ else
+ iscsi_host_for_each_session(shost, iscsi_force_destroy_session);
+
wait_event_interruptible(ihost->session_removal_wq,
ihost->num_sessions == 0);
if (signal_pending(current))
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index add238dc599b..baed4f0146f5 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -5710,7 +5710,6 @@ lpfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd)
cur_iocbq->cmd_flag |= LPFC_IO_VMID;
}
}
- atomic_inc(&ndlp->cmd_pending);
#ifdef CONFIG_SCSI_LPFC_DEBUG_FS
if (unlikely(phba->hdwqstat_on & LPFC_CHECK_SCSI_IO))
diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index 83ffba7f51da..780d975c85b5 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -2414,9 +2414,12 @@ static void __qedi_remove(struct pci_dev *pdev, int mode)
int rval;
u16 retry = 10;
- if (mode == QEDI_MODE_NORMAL || mode == QEDI_MODE_SHUTDOWN) {
- iscsi_host_remove(qedi->shost);
+ if (mode == QEDI_MODE_NORMAL)
+ iscsi_host_remove(qedi->shost, false);
+ else if (mode == QEDI_MODE_SHUTDOWN)
+ iscsi_host_remove(qedi->shost, true);
+ if (mode == QEDI_MODE_NORMAL || mode == QEDI_MODE_SHUTDOWN) {
if (qedi->tmf_thread) {
destroy_workqueue(qedi->tmf_thread);
qedi->tmf_thread = NULL;
@@ -2791,7 +2794,7 @@ static int __qedi_probe(struct pci_dev *pdev, int mode)
#ifdef CONFIG_DEBUG_FS
qedi_dbg_host_exit(&qedi->dbg_ctx);
#endif
- iscsi_host_remove(qedi->shost);
+ iscsi_host_remove(qedi->shost, false);
stop_iscsi_func:
qedi_ops->stop(qedi->cdev);
stop_slowpath:
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
index 3b3e4234f37a..412ad888bdc1 100644
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -2716,17 +2716,24 @@ qla2x00_dev_loss_tmo_callbk(struct fc_rport *rport)
if (!fcport)
return;
- /* Now that the rport has been deleted, set the fcport state to
- FCS_DEVICE_DEAD */
- qla2x00_set_fcport_state(fcport, FCS_DEVICE_DEAD);
+
+ /*
+ * Now that the rport has been deleted, set the fcport state to
+ * FCS_DEVICE_DEAD, if the fcport is still lost.
+ */
+ if (fcport->scan_state != QLA_FCPORT_FOUND)
+ qla2x00_set_fcport_state(fcport, FCS_DEVICE_DEAD);
/*
* Transport has effectively 'deleted' the rport, clear
* all local references.
*/
spin_lock_irqsave(host->host_lock, flags);
- fcport->rport = fcport->drport = NULL;
- *((fc_port_t **)rport->dd_data) = NULL;
+ /* Confirm port has not reappeared before clearing pointers. */
+ if (rport->port_state != FC_PORTSTATE_ONLINE) {
+ fcport->rport = fcport->drport = NULL;
+ *((fc_port_t **)rport->dd_data) = NULL;
+ }
spin_unlock_irqrestore(host->host_lock, flags);
if (test_bit(ABORT_ISP_ACTIVE, &fcport->vha->dpc_flags))
@@ -2759,9 +2766,12 @@ qla2x00_terminate_rport_io(struct fc_rport *rport)
/*
* At this point all fcport's software-states are cleared. Perform any
* final cleanup of firmware resources (PCBs and XCBs).
+ *
+ * Attempt to cleanup only lost devices.
*/
if (fcport->loop_id != FC_NO_LOOP_ID) {
- if (IS_FWI2_CAPABLE(fcport->vha->hw)) {
+ if (IS_FWI2_CAPABLE(fcport->vha->hw) &&
+ fcport->scan_state != QLA_FCPORT_FOUND) {
if (fcport->loop_id != FC_NO_LOOP_ID)
fcport->logout_on_delete = 1;
@@ -2771,7 +2781,7 @@ qla2x00_terminate_rport_io(struct fc_rport *rport)
__LINE__);
qlt_schedule_sess_for_deletion(fcport);
}
- } else {
+ } else if (!IS_FWI2_CAPABLE(fcport->vha->hw)) {
qla2x00_port_logout(fcport->vha, fcport);
}
}
diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
index c2f00f076f79..726af9e40572 100644
--- a/drivers/scsi/qla2xxx/qla_bsg.c
+++ b/drivers/scsi/qla2xxx/qla_bsg.c
@@ -2975,6 +2975,13 @@ qla24xx_bsg_timeout(struct bsg_job *bsg_job)
ql_log(ql_log_info, vha, 0x708b, "%s CMD timeout. bsg ptr %p.\n",
__func__, bsg_job);
+
+ if (qla2x00_isp_reg_stat(ha)) {
+ ql_log(ql_log_info, vha, 0x9007,
+ "PCI/Register disconnect.\n");
+ qla_pci_set_eeh_busy(vha);
+ }
+
/* find the bsg job from the active list of commands */
spin_lock_irqsave(&ha->hardware_lock, flags);
for (que = 0; que < ha->max_req_queues; que++) {
@@ -2992,7 +2999,8 @@ qla24xx_bsg_timeout(struct bsg_job *bsg_job)
sp->u.bsg_job == bsg_job) {
req->outstanding_cmds[cnt] = NULL;
spin_unlock_irqrestore(&ha->hardware_lock, flags);
- if (ha->isp_ops->abort_command(sp)) {
+
+ if (!ha->flags.eeh_busy && ha->isp_ops->abort_command(sp)) {
ql_log(ql_log_warn, vha, 0x7089,
"mbx abort_command failed.\n");
bsg_reply->result = -EIO;
diff --git a/drivers/scsi/qla2xxx/qla_dbg.h b/drivers/scsi/qla2xxx/qla_dbg.h
index f1f6c740bdcd..feeb1666227f 100644
--- a/drivers/scsi/qla2xxx/qla_dbg.h
+++ b/drivers/scsi/qla2xxx/qla_dbg.h
@@ -383,5 +383,5 @@ ql_mask_match(uint level)
if (ql2xextended_error_logging == 1)
ql2xextended_error_logging = QL_DBG_DEFAULT1_MASK;
- return (level & ql2xextended_error_logging) == level;
+ return level && ((level & ql2xextended_error_logging) == level);
}
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index e8f69c486be1..01cdd5f8723c 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -2158,6 +2158,11 @@ typedef struct {
#define CS_IOCB_ERROR 0x31 /* Generic error for IOCB request
failure */
#define CS_REJECT_RECEIVED 0x4E /* Reject received */
+#define CS_EDIF_AUTH_ERROR 0x63 /* decrypt error */
+#define CS_EDIF_PAD_LEN_ERROR 0x65 /* pad > frame size, not 4byte align */
+#define CS_EDIF_INV_REQ 0x66 /* invalid request */
+#define CS_EDIF_SPI_ERROR 0x67 /* rx frame unable to locate sa */
+#define CS_EDIF_HDR_ERROR 0x69 /* data frame != expected len */
#define CS_BAD_PAYLOAD 0x80 /* Driver defined */
#define CS_UNKNOWN 0x81 /* Driver defined */
#define CS_RETRY 0x82 /* Driver defined */
@@ -2626,7 +2631,6 @@ typedef struct fc_port {
struct {
uint32_t enable:1; /* device is edif enabled/req'd */
uint32_t app_stop:2;
- uint32_t app_started:1;
uint32_t aes_gmac:1;
uint32_t app_sess_online:1;
uint32_t tx_sa_set:1;
@@ -2637,6 +2641,7 @@ typedef struct fc_port {
uint32_t rx_rekey_cnt;
uint64_t tx_bytes;
uint64_t rx_bytes;
+ uint8_t sess_down_acked;
uint8_t auth_state;
uint16_t authok:1;
uint16_t rekey_cnt;
@@ -3204,6 +3209,8 @@ struct ct_sns_rsp {
#define GFF_NVME_OFFSET 23 /* type = 28h */
struct {
uint8_t fc4_features[128];
+#define FC4_FF_TARGET BIT_0
+#define FC4_FF_INITIATOR BIT_1
} gff_id;
struct {
uint8_t reserved;
@@ -3975,6 +3982,7 @@ struct qla_hw_data {
/* SRB cache. */
#define SRB_MIN_REQ 128
mempool_t *srb_mempool;
+ u8 port_name[WWN_SIZE];
volatile struct {
uint32_t mbox_int :1;
@@ -4040,6 +4048,9 @@ struct qla_hw_data {
uint32_t n2n_fw_acc_sec:1;
uint32_t plogi_template_valid:1;
uint32_t port_isolated:1;
+ uint32_t eeh_flush:2;
+#define EEH_FLUSH_RDY 1
+#define EEH_FLUSH_DONE 2
} flags;
uint16_t max_exchg;
@@ -4074,6 +4085,7 @@ struct qla_hw_data {
uint32_t rsp_que_len;
uint32_t req_que_off;
uint32_t rsp_que_off;
+ unsigned long eeh_jif;
/* Multi queue data structs */
device_reg_t *mqiobase;
@@ -4256,8 +4268,8 @@ struct qla_hw_data {
#define IS_OEM_001(ha) ((ha)->device_type & DT_OEM_001)
#define HAS_EXTENDED_IDS(ha) ((ha)->device_type & DT_EXTENDED_IDS)
#define IS_CT6_SUPPORTED(ha) ((ha)->device_type & DT_CT6_SUPPORTED)
-#define IS_MQUE_CAPABLE(ha) ((ha)->mqenable || IS_QLA83XX(ha) || \
- IS_QLA27XX(ha) || IS_QLA28XX(ha))
+#define IS_MQUE_CAPABLE(ha) (IS_QLA83XX(ha) || IS_QLA27XX(ha) || \
+ IS_QLA28XX(ha))
#define IS_BIDI_CAPABLE(ha) \
(IS_QLA25XX(ha) || IS_QLA2031(ha) || IS_QLA27XX(ha) || IS_QLA28XX(ha))
/* Bit 21 of fw_attributes decides the MCTP capabilities */
diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
index 0628633c7c7e..dbe8ef887a9d 100644
--- a/drivers/scsi/qla2xxx/qla_edif.c
+++ b/drivers/scsi/qla2xxx/qla_edif.c
@@ -52,6 +52,31 @@ const char *sc_to_str(uint16_t cmd)
return "unknown";
}
+static struct edb_node *qla_edb_getnext(scsi_qla_host_t *vha)
+{
+ unsigned long flags;
+ struct edb_node *edbnode = NULL;
+
+ spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
+
+ /* db nodes are fifo - no qualifications done */
+ if (!list_empty(&vha->e_dbell.head)) {
+ edbnode = list_first_entry(&vha->e_dbell.head,
+ struct edb_node, list);
+ list_del_init(&edbnode->list);
+ }
+
+ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+
+ return edbnode;
+}
+
+static void qla_edb_node_free(scsi_qla_host_t *vha, struct edb_node *node)
+{
+ list_del_init(&node->list);
+ kfree(node);
+}
+
static struct edif_list_entry *qla_edif_list_find_sa_index(fc_port_t *fcport,
uint16_t handle)
{
@@ -257,14 +282,8 @@ qla2x00_find_fcport_by_pid(scsi_qla_host_t *vha, port_id_t *id)
f = NULL;
list_for_each_entry_safe(f, tf, &vha->vp_fcports, list) {
- if ((f->flags & FCF_FCSP_DEVICE)) {
- ql_dbg(ql_dbg_edif + ql_dbg_verbose, vha, 0x2058,
- "Found secure fcport - nn %8phN pn %8phN portid=0x%x, 0x%x.\n",
- f->node_name, f->port_name,
- f->d_id.b24, id->b24);
- if (f->d_id.b24 == id->b24)
- return f;
- }
+ if (f->d_id.b24 == id->b24)
+ return f;
}
return NULL;
}
@@ -280,14 +299,19 @@ qla_edif_app_check(scsi_qla_host_t *vha, struct app_id appid)
{
/* check that the app is allow/known to the driver */
- if (appid.app_vid == EDIF_APP_ID) {
- ql_dbg(ql_dbg_edif + ql_dbg_verbose, vha, 0x911d, "%s app id ok\n", __func__);
- return true;
+ if (appid.app_vid != EDIF_APP_ID) {
+ ql_dbg(ql_dbg_edif, vha, 0x911d, "%s app id not ok (%x)",
+ __func__, appid.app_vid);
+ return false;
+ }
+
+ if (appid.version != EDIF_VERSION1) {
+ ql_dbg(ql_dbg_edif, vha, 0x911d, "%s app version is not ok (%x)",
+ __func__, appid.version);
+ return false;
}
- ql_dbg(ql_dbg_edif, vha, 0x911d, "%s app id not ok (%x)",
- __func__, appid.app_vid);
- return false;
+ return true;
}
static void
@@ -486,16 +510,35 @@ qla_edif_app_start(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
/* mark doorbell as active since an app is now present */
vha->e_dbell.db_flags |= EDB_ACTIVE;
} else {
- ql_dbg(ql_dbg_edif, vha, 0x911e, "%s doorbell already active\n",
- __func__);
+ goto out;
}
if (N2N_TOPO(vha->hw)) {
- if (vha->hw->flags.n2n_fw_acc_sec)
- set_bit(N2N_LINK_RESET, &vha->dpc_flags);
- else
+ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list)
+ fcport->n2n_link_reset_cnt = 0;
+
+ if (vha->hw->flags.n2n_fw_acc_sec) {
+ list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list)
+ qla_edif_sa_ctl_init(vha, fcport);
+
+ /*
+ * While authentication app was not running, remote device
+ * could still try to login with this local port. Let's
+ * clear the state and try again.
+ */
+ qla2x00_wait_for_sess_deletion(vha);
+
+ /* bounce the link to get the other guy to relogin */
+ if (!vha->hw->flags.n2n_bigger) {
+ set_bit(N2N_LINK_RESET, &vha->dpc_flags);
+ qla2xxx_wake_dpc(vha);
+ }
+ } else {
+ qla2x00_wait_for_hba_online(vha);
set_bit(ISP_ABORT_NEEDED, &vha->dpc_flags);
- qla2xxx_wake_dpc(vha);
+ qla2xxx_wake_dpc(vha);
+ qla2x00_wait_for_hba_online(vha);
+ }
} else {
list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
ql_dbg(ql_dbg_edif, vha, 0x2058,
@@ -517,19 +560,31 @@ qla_edif_app_start(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
if (atomic_read(&vha->loop_state) == LOOP_DOWN)
break;
- fcport->edif.app_started = 1;
fcport->login_retry = vha->hw->login_retry_count;
- /* no activity */
fcport->edif.app_stop = 0;
+ fcport->edif.app_sess_online = 0;
+
+ if (fcport->scan_state != QLA_FCPORT_FOUND)
+ continue;
+
+ if (fcport->port_type == FCT_UNKNOWN &&
+ !fcport->fc4_features)
+ rval = qla24xx_async_gffid(vha, fcport, true);
+
+ if (!rval && !(fcport->fc4_features & FC4_FF_TARGET ||
+ fcport->port_type & (FCT_TARGET|FCT_NVME_TARGET)))
+ continue;
+
+ rval = 0;
ql_dbg(ql_dbg_edif, vha, 0x911e,
"%s wwpn %8phC calling qla_edif_reset_auth_wait\n",
__func__, fcport->port_name);
- fcport->edif.app_sess_online = 0;
qlt_schedule_sess_for_deletion(fcport);
qla_edif_sa_ctl_init(vha, fcport);
}
+ set_bit(RELOGIN_NEEDED, &vha->dpc_flags);
}
if (vha->pur_cinfo.enode_flags != ENODE_ACTIVE) {
@@ -540,9 +595,11 @@ qla_edif_app_start(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
__func__);
}
+out:
appreply.host_support_edif = vha->hw->flags.edif_enabled;
appreply.edif_enode_active = vha->pur_cinfo.enode_flags;
appreply.edif_edb_active = vha->e_dbell.db_flags;
+ appreply.version = EDIF_VERSION1;
bsg_job->reply_len = sizeof(struct fc_bsg_reply);
@@ -610,9 +667,6 @@ qla_edif_app_stop(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
fcport->send_els_logo = 1;
qlt_schedule_sess_for_deletion(fcport);
-
- /* qla_edif_flush_sa_ctl_lists(fcport); */
- fcport->edif.app_started = 0;
}
}
@@ -673,6 +727,7 @@ qla_edif_app_authok(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
portid.b.area = appplogiok.u.d_id.b.area;
portid.b.al_pa = appplogiok.u.d_id.b.al_pa;
+ appplogireply.version = EDIF_VERSION1;
switch (appplogiok.type) {
case PL_TYPE_WWPN:
fcport = qla2x00_find_fcport_by_wwpn(vha,
@@ -865,6 +920,8 @@ qla_edif_app_getfcinfo(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
} else {
struct fc_port *fcport = NULL, *tf;
+ app_reply->version = EDIF_VERSION1;
+
list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
if (!(fcport->flags & FCF_FCSP_DEVICE))
continue;
@@ -881,9 +938,25 @@ qla_edif_app_getfcinfo(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
if (tdid.b24 != 0 && tdid.b24 != fcport->d_id.b24)
continue;
- app_reply->ports[pcnt].rekey_count =
- fcport->edif.rekey_cnt;
+ if (!N2N_TOPO(vha->hw)) {
+ if (fcport->scan_state != QLA_FCPORT_FOUND)
+ continue;
+
+ if (fcport->port_type == FCT_UNKNOWN &&
+ !fcport->fc4_features)
+ rval = qla24xx_async_gffid(vha, fcport,
+ true);
+
+ if (!rval &&
+ !(fcport->fc4_features & FC4_FF_TARGET ||
+ fcport->port_type &
+ (FCT_TARGET | FCT_NVME_TARGET)))
+ continue;
+ }
+
+ rval = 0;
+ app_reply->ports[pcnt].version = EDIF_VERSION1;
app_reply->ports[pcnt].remote_type =
VND_CMD_RTYPE_UNKNOWN;
if (fcport->port_type & (FCT_NVME_TARGET | FCT_TARGET))
@@ -980,6 +1053,8 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
} else {
struct fc_port *fcport = NULL, *tf;
+ app_reply->version = EDIF_VERSION1;
+
list_for_each_entry_safe(fcport, tf, &vha->vp_fcports, list) {
if (fcport->edif.enable) {
if (pcnt > app_req.num_ports)
@@ -1013,6 +1088,164 @@ qla_edif_app_getstats(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
return rval;
}
+static int32_t
+qla_edif_ack(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+{
+ struct fc_port *fcport;
+ struct aen_complete_cmd ack;
+ struct fc_bsg_reply *bsg_reply = bsg_job->reply;
+
+ sg_copy_to_buffer(bsg_job->request_payload.sg_list,
+ bsg_job->request_payload.sg_cnt, &ack, sizeof(ack));
+
+ ql_dbg(ql_dbg_edif, vha, 0x70cf,
+ "%s: %06x event_code %x\n",
+ __func__, ack.port_id.b24, ack.event_code);
+
+ fcport = qla2x00_find_fcport_by_pid(vha, &ack.port_id);
+ SET_DID_STATUS(bsg_reply->result, DID_OK);
+
+ if (!fcport) {
+ ql_dbg(ql_dbg_edif, vha, 0x70cf,
+ "%s: unable to find fcport %06x \n",
+ __func__, ack.port_id.b24);
+ return 0;
+ }
+
+ switch (ack.event_code) {
+ case VND_CMD_AUTH_STATE_SESSION_SHUTDOWN:
+ fcport->edif.sess_down_acked = 1;
+ break;
+ default:
+ break;
+ }
+ return 0;
+}
+
+static int qla_edif_consume_dbell(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+{
+ struct fc_bsg_reply *bsg_reply = bsg_job->reply;
+ u32 sg_skip, reply_payload_len;
+ bool keep;
+ struct edb_node *dbnode = NULL;
+ struct edif_app_dbell ap;
+ int dat_size = 0;
+
+ sg_skip = 0;
+ reply_payload_len = bsg_job->reply_payload.payload_len;
+
+ while ((reply_payload_len - sg_skip) >= sizeof(struct edb_node)) {
+ dbnode = qla_edb_getnext(vha);
+ if (dbnode) {
+ keep = true;
+ dat_size = 0;
+ ap.event_code = dbnode->ntype;
+ switch (dbnode->ntype) {
+ case VND_CMD_AUTH_STATE_SESSION_SHUTDOWN:
+ case VND_CMD_AUTH_STATE_NEEDED:
+ ap.port_id = dbnode->u.plogi_did;
+ dat_size += sizeof(ap.port_id);
+ break;
+ case VND_CMD_AUTH_STATE_ELS_RCVD:
+ ap.port_id = dbnode->u.els_sid;
+ dat_size += sizeof(ap.port_id);
+ break;
+ case VND_CMD_AUTH_STATE_SAUPDATE_COMPL:
+ ap.port_id = dbnode->u.sa_aen.port_id;
+ memcpy(&ap.event_data, &dbnode->u,
+ sizeof(struct edif_sa_update_aen));
+ dat_size += sizeof(struct edif_sa_update_aen);
+ break;
+ default:
+ keep = false;
+ ql_log(ql_log_warn, vha, 0x09102,
+ "%s unknown DB type=%d %p\n",
+ __func__, dbnode->ntype, dbnode);
+ break;
+ }
+ ap.event_data_size = dat_size;
+ /* 8 = sizeof(ap.event_code + ap.event_data_size) */
+ dat_size += 8;
+ if (keep)
+ sg_skip += sg_copy_buffer(bsg_job->reply_payload.sg_list,
+ bsg_job->reply_payload.sg_cnt,
+ &ap, dat_size, sg_skip, false);
+
+ ql_dbg(ql_dbg_edif, vha, 0x09102,
+ "%s Doorbell consumed : type=%d %p\n",
+ __func__, dbnode->ntype, dbnode);
+
+ kfree(dbnode);
+ } else {
+ break;
+ }
+ }
+
+ SET_DID_STATUS(bsg_reply->result, DID_OK);
+ bsg_reply->reply_payload_rcv_len = sg_skip;
+ bsg_job->reply_len = sizeof(struct fc_bsg_reply);
+
+ return 0;
+}
+
+static void __qla_edif_dbell_bsg_done(scsi_qla_host_t *vha, struct bsg_job *bsg_job,
+ u32 delay)
+{
+ struct fc_bsg_reply *bsg_reply = bsg_job->reply;
+
+ /* small sleep for doorbell events to accumulate */
+ if (delay)
+ msleep(delay);
+
+ qla_edif_consume_dbell(vha, bsg_job);
+
+ bsg_job_done(bsg_job, bsg_reply->result, bsg_reply->reply_payload_rcv_len);
+}
+
+static void qla_edif_dbell_bsg_done(scsi_qla_host_t *vha)
+{
+ unsigned long flags;
+ struct bsg_job *prev_bsg_job = NULL;
+
+ spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
+ if (vha->e_dbell.dbell_bsg_job) {
+ prev_bsg_job = vha->e_dbell.dbell_bsg_job;
+ vha->e_dbell.dbell_bsg_job = NULL;
+ }
+ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+
+ if (prev_bsg_job)
+ __qla_edif_dbell_bsg_done(vha, prev_bsg_job, 0);
+}
+
+static int
+qla_edif_dbell_bsg(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
+{
+ unsigned long flags;
+ bool return_bsg = false;
+
+ /* flush previous dbell bsg */
+ qla_edif_dbell_bsg_done(vha);
+
+ spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
+ if (list_empty(&vha->e_dbell.head) && DBELL_ACTIVE(vha)) {
+ /*
+ * when the next db event happens, bsg_job will return.
+ * Otherwise, timer will return it.
+ */
+ vha->e_dbell.dbell_bsg_job = bsg_job;
+ vha->e_dbell.bsg_expire = jiffies + 10 * HZ;
+ } else {
+ return_bsg = true;
+ }
+ spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
+
+ if (return_bsg)
+ __qla_edif_dbell_bsg_done(vha, bsg_job, 1);
+
+ return 0;
+}
+
int32_t
qla_edif_app_mgmt(struct bsg_job *bsg_job)
{
@@ -1024,8 +1257,13 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
bool done = true;
int32_t rval = 0;
uint32_t vnd_sc = bsg_request->rqst_data.h_vendor.vendor_cmd[1];
+ u32 level = ql_dbg_edif;
+
+ /* doorbell is high traffic */
+ if (vnd_sc == QL_VND_SC_READ_DBELL)
+ level = 0;
- ql_dbg(ql_dbg_edif, vha, 0x911d, "%s vnd subcmd=%x\n",
+ ql_dbg(level, vha, 0x911d, "%s vnd subcmd=%x\n",
__func__, vnd_sc);
sg_copy_to_buffer(bsg_job->request_payload.sg_list,
@@ -1034,7 +1272,7 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
if (!vha->hw->flags.edif_enabled ||
test_bit(VPORT_DELETE, &vha->dpc_flags)) {
- ql_dbg(ql_dbg_edif, vha, 0x911d,
+ ql_dbg(level, vha, 0x911d,
"%s edif not enabled or vp delete. bsg ptr done %p. dpc_flags %lx\n",
__func__, bsg_job, vha->dpc_flags);
@@ -1043,7 +1281,7 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
}
if (!qla_edif_app_check(vha, appcheck)) {
- ql_dbg(ql_dbg_edif, vha, 0x911d,
+ ql_dbg(level, vha, 0x911d,
"%s app checked failed.\n",
__func__);
@@ -1075,6 +1313,13 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
case QL_VND_SC_GET_STATS:
rval = qla_edif_app_getstats(vha, bsg_job);
break;
+ case QL_VND_SC_AEN_COMPLETE:
+ rval = qla_edif_ack(vha, bsg_job);
+ break;
+ case QL_VND_SC_READ_DBELL:
+ rval = qla_edif_dbell_bsg(vha, bsg_job);
+ done = false;
+ break;
default:
ql_dbg(ql_dbg_edif, vha, 0x911d, "%s unknown cmd=%x\n",
__func__,
@@ -1086,7 +1331,7 @@ qla_edif_app_mgmt(struct bsg_job *bsg_job)
done:
if (done) {
- ql_dbg(ql_dbg_user, vha, 0x7009,
+ ql_dbg(level, vha, 0x7009,
"%s: %d bsg ptr done %p\n", __func__, __LINE__, bsg_job);
bsg_job_done(bsg_job, bsg_reply->result,
bsg_reply->reply_payload_rcv_len);
@@ -1248,6 +1493,8 @@ qla24xx_check_sadb_avail_slot(struct bsg_job *bsg_job, fc_port_t *fcport,
#define QLA_SA_UPDATE_FLAGS_RX_KEY 0x0
#define QLA_SA_UPDATE_FLAGS_TX_KEY 0x2
+#define EDIF_MSLEEP_INTERVAL 100
+#define EDIF_RETRY_COUNT 50
int
qla24xx_sadb_update(struct bsg_job *bsg_job)
@@ -1260,7 +1507,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
struct edif_list_entry *edif_entry = NULL;
int found = 0;
int rval = 0;
- int result = 0;
+ int result = 0, cnt;
struct qla_sa_update_frame sa_frame;
struct srb_iocb *iocb_cmd;
port_id_t portid;
@@ -1501,11 +1748,23 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
sp->done = qla2x00_bsg_job_done;
iocb_cmd = &sp->u.iocb_cmd;
iocb_cmd->u.sa_update.sa_frame = sa_frame;
-
+ cnt = 0;
+retry:
rval = qla2x00_start_sp(sp);
- if (rval != QLA_SUCCESS) {
+ switch (rval) {
+ case QLA_SUCCESS:
+ break;
+ case EAGAIN:
+ msleep(EDIF_MSLEEP_INTERVAL);
+ cnt++;
+ if (cnt < EDIF_RETRY_COUNT)
+ goto retry;
+
+ fallthrough;
+ default:
ql_log(ql_dbg_edif, vha, 0x70e3,
- "qla2x00_start_sp failed=%d.\n", rval);
+ "%s qla2x00_start_sp failed=%d.\n",
+ __func__, rval);
qla2x00_rel_sp(sp);
rval = -EIO;
@@ -1798,30 +2057,6 @@ qla_edb_init(scsi_qla_host_t *vha)
/* initialize lock which protects doorbell & init list */
spin_lock_init(&vha->e_dbell.db_lock);
INIT_LIST_HEAD(&vha->e_dbell.head);
-
- /* create and initialize doorbell */
- init_completion(&vha->e_dbell.dbell);
-}
-
-static void
-qla_edb_node_free(scsi_qla_host_t *vha, struct edb_node *node)
-{
- /*
- * releases the space held by this edb node entry
- * this function does _not_ free the edb node itself
- * NB: the edb node entry passed should not be on any list
- *
- * currently for doorbell there's no additional cleanup
- * needed, but here as a placeholder for furture use.
- */
-
- if (!node) {
- ql_dbg(ql_dbg_edif, vha, 0x09122,
- "%s error - no valid node passed\n", __func__);
- return;
- }
-
- node->ntype = N_UNDEF;
}
static void qla_edb_clear(scsi_qla_host_t *vha, port_id_t portid)
@@ -1868,11 +2103,8 @@ static void qla_edb_clear(scsi_qla_host_t *vha, port_id_t portid)
}
spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
- list_for_each_entry_safe(e, tmp, &edb_list, list) {
+ list_for_each_entry_safe(e, tmp, &edb_list, list)
qla_edb_node_free(vha, e);
- list_del_init(&e->list);
- kfree(e);
- }
}
/* function called when app is stopping */
@@ -1900,14 +2132,10 @@ qla_edb_stop(scsi_qla_host_t *vha)
"%s freeing edb_node type=%x\n",
__func__, node->ntype);
qla_edb_node_free(vha, node);
- list_del(&node->list);
-
- kfree(node);
}
spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
- /* wake up doorbell waiters - they'll be dismissed with error code */
- complete_all(&vha->e_dbell.dbell);
+ qla_edif_dbell_bsg_done(vha);
}
static struct edb_node *
@@ -1945,9 +2173,6 @@ qla_edb_node_add(scsi_qla_host_t *vha, struct edb_node *ptr)
list_add_tail(&ptr->list, &vha->e_dbell.head);
spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
- /* ring doorbell for waiters */
- complete(&vha->e_dbell.dbell);
-
return true;
}
@@ -2011,47 +2236,29 @@ qla_edb_eventcreate(scsi_qla_host_t *vha, uint32_t dbtype,
edbnode->u.sa_aen.port_id = fcport->d_id;
edbnode->u.sa_aen.status = data;
edbnode->u.sa_aen.key_type = data2;
+ edbnode->u.sa_aen.version = EDIF_VERSION1;
break;
default:
ql_dbg(ql_dbg_edif, vha, 0x09102,
"%s unknown type: %x\n", __func__, dbtype);
- qla_edb_node_free(vha, edbnode);
kfree(edbnode);
edbnode = NULL;
break;
}
- if (edbnode && (!qla_edb_node_add(vha, edbnode))) {
+ if (edbnode) {
+ if (!qla_edb_node_add(vha, edbnode)) {
+ ql_dbg(ql_dbg_edif, vha, 0x09102,
+ "%s unable to add dbnode\n", __func__);
+ kfree(edbnode);
+ return;
+ }
ql_dbg(ql_dbg_edif, vha, 0x09102,
- "%s unable to add dbnode\n", __func__);
- qla_edb_node_free(vha, edbnode);
- kfree(edbnode);
- return;
- }
- if (edbnode && fcport)
- fcport->edif.auth_state = dbtype;
- ql_dbg(ql_dbg_edif, vha, 0x09102,
- "%s Doorbell produced : type=%d %p\n", __func__, dbtype, edbnode);
-}
-
-static struct edb_node *
-qla_edb_getnext(scsi_qla_host_t *vha)
-{
- unsigned long flags;
- struct edb_node *edbnode = NULL;
-
- spin_lock_irqsave(&vha->e_dbell.db_lock, flags);
-
- /* db nodes are fifo - no qualifications done */
- if (!list_empty(&vha->e_dbell.head)) {
- edbnode = list_first_entry(&vha->e_dbell.head,
- struct edb_node, list);
- list_del(&edbnode->list);
+ "%s Doorbell produced : type=%d %p\n", __func__, dbtype, edbnode);
+ qla_edif_dbell_bsg_done(vha);
+ if (fcport)
+ fcport->edif.auth_state = dbtype;
}
-
- spin_unlock_irqrestore(&vha->e_dbell.db_lock, flags);
-
- return edbnode;
}
void
@@ -2079,6 +2286,9 @@ qla_edif_timer(scsi_qla_host_t *vha)
ha->edif_post_stop_cnt_down = 60;
}
}
+
+ if (vha->e_dbell.dbell_bsg_job && time_after_eq(jiffies, vha->e_dbell.bsg_expire))
+ qla_edif_dbell_bsg_done(vha);
}
/*
@@ -2146,7 +2356,6 @@ edif_doorbell_show(struct device *dev, struct device_attribute *attr,
"%s Doorbell consumed : type=%d %p\n",
__func__, dbnode->ntype, dbnode);
/* we're done with the db node, so free it up */
- qla_edb_node_free(vha, dbnode);
kfree(dbnode);
} else {
break;
@@ -2162,6 +2371,7 @@ edif_doorbell_show(struct device *dev, struct device_attribute *attr,
static void qla_noop_sp_done(srb_t *sp, int res)
{
+ sp->fcport->flags &= ~(FCF_ASYNC_SENT | FCF_ASYNC_ACTIVE);
/* ref: INIT */
kref_put(&sp->cmd_kref, qla2x00_sp_release);
}
@@ -2186,7 +2396,8 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
if (!sa_ctl) {
ql_dbg(ql_dbg_edif, vha, 0x70e6,
"sa_ctl allocation failed\n");
- return -ENOMEM;
+ rval = -ENOMEM;
+ goto done;
}
fcport = sa_ctl->fcport;
@@ -2196,7 +2407,8 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
if (!sp) {
ql_dbg(ql_dbg_edif, vha, 0x70e6,
"SRB allocation failed\n");
- return -ENOMEM;
+ rval = -ENOMEM;
+ goto done;
}
fcport->flags |= FCF_ASYNC_SENT;
@@ -2225,9 +2437,16 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
rval = qla2x00_start_sp(sp);
- if (rval != QLA_SUCCESS)
- rval = QLA_FUNCTION_FAILED;
+ if (rval != QLA_SUCCESS) {
+ goto done_free_sp;
+ }
+ return rval;
+done_free_sp:
+ kref_put(&sp->cmd_kref, qla2x00_sp_release);
+ fcport->flags &= ~FCF_ASYNC_SENT;
+done:
+ fcport->flags &= ~FCF_ASYNC_ACTIVE;
return rval;
}
@@ -2447,8 +2666,7 @@ void qla24xx_auth_els(scsi_qla_host_t *vha, void **pkt, struct rsp_que **rsp)
fcport = qla2x00_find_fcport_by_pid(host, &purex->pur_info.pur_sid);
- if (DBELL_INACTIVE(vha) ||
- (fcport && EDIF_SESSION_DOWN(fcport))) {
+ if (DBELL_INACTIVE(vha)) {
ql_dbg(ql_dbg_edif, host, 0x0910c, "%s e_dbell.db_flags =%x %06x\n",
__func__, host->e_dbell.db_flags,
fcport ? fcport->d_id.b24 : 0);
@@ -2458,6 +2676,22 @@ void qla24xx_auth_els(scsi_qla_host_t *vha, void **pkt, struct rsp_que **rsp)
return;
}
+ if (fcport && EDIF_SESSION_DOWN(fcport)) {
+ ql_dbg(ql_dbg_edif, host, 0x13b6,
+ "%s terminate exchange. Send logo to 0x%x\n",
+ __func__, a.did.b24);
+
+ a.tx_byte_count = a.tx_len = 0;
+ a.tx_addr = 0;
+ a.control_flags = EPD_RX_XCHG; /* EPD_RX_XCHG = terminate cmd */
+ qla_els_reject_iocb(host, (*rsp)->qpair, &a);
+ qla_enode_free(host, ptr);
+ /* send logo to let remote port knows to tear down session */
+ fcport->send_els_logo = 1;
+ qlt_schedule_sess_for_deletion(fcport);
+ return;
+ }
+
/* add the local enode to the list */
qla_enode_add(host, ptr);
@@ -3350,10 +3584,14 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
fc_port_t *fcport = NULL;
struct qla_hw_data *ha = vha->hw;
srb_t *sp;
- int rval = (DID_ERROR << 16);
+ int rval = (DID_ERROR << 16), cnt;
port_id_t d_id;
struct qla_bsg_auth_els_request *p =
(struct qla_bsg_auth_els_request *)bsg_job->request;
+ struct qla_bsg_auth_els_reply *rpl =
+ (struct qla_bsg_auth_els_reply *)bsg_job->reply;
+
+ rpl->version = EDIF_VERSION1;
d_id.b.al_pa = bsg_request->rqst_data.h_els.port_id[2];
d_id.b.area = bsg_request->rqst_data.h_els.port_id[1];
@@ -3372,7 +3610,7 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
if (qla_bsg_check(vha, bsg_job, fcport))
return 0;
- if (fcport->loop_id == FC_NO_LOOP_ID) {
+ if (EDIF_SESS_DELETE(fcport)) {
ql_dbg(ql_dbg_edif, vha, 0x910d,
"%s ELS code %x, no loop id.\n", __func__,
bsg_request->rqst_data.r_els.els_code);
@@ -3441,17 +3679,26 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
sp->free = qla2x00_bsg_sp_free;
sp->done = qla2x00_bsg_job_done;
+ cnt = 0;
+retry:
rval = qla2x00_start_sp(sp);
-
- ql_dbg(ql_dbg_edif, vha, 0x700a,
- "%s %s %8phN xchg %x ctlflag %x hdl %x reqlen %xh bsg ptr %p\n",
- __func__, sc_to_str(p->e.sub_cmd), fcport->port_name,
- p->e.extra_rx_xchg_address, p->e.extra_control_flags,
- sp->handle, sp->remap.req.len, bsg_job);
-
- if (rval != QLA_SUCCESS) {
+ switch (rval) {
+ case QLA_SUCCESS:
+ ql_dbg(ql_dbg_edif, vha, 0x700a,
+ "%s %s %8phN xchg %x ctlflag %x hdl %x reqlen %xh bsg ptr %p\n",
+ __func__, sc_to_str(p->e.sub_cmd), fcport->port_name,
+ p->e.extra_rx_xchg_address, p->e.extra_control_flags,
+ sp->handle, sp->remap.req.len, bsg_job);
+ break;
+ case EAGAIN:
+ msleep(EDIF_MSLEEP_INTERVAL);
+ cnt++;
+ if (cnt < EDIF_RETRY_COUNT)
+ goto retry;
+ fallthrough;
+ default:
ql_log(ql_log_warn, vha, 0x700e,
- "qla2x00_start_sp failed = %d\n", rval);
+ "%s qla2x00_start_sp failed = %d\n", __func__, rval);
SET_DID_STATUS(bsg_reply->result, DID_IMM_RETRY);
rval = -EIO;
goto done_free_remap_rsp;
@@ -3473,14 +3720,29 @@ int qla_edif_process_els(scsi_qla_host_t *vha, struct bsg_job *bsg_job)
void qla_edif_sess_down(struct scsi_qla_host *vha, struct fc_port *sess)
{
+ u16 cnt = 0;
+
if (sess->edif.app_sess_online && DBELL_ACTIVE(vha)) {
ql_dbg(ql_dbg_disc, vha, 0xf09c,
"%s: sess %8phN send port_offline event\n",
__func__, sess->port_name);
sess->edif.app_sess_online = 0;
+ sess->edif.sess_down_acked = 0;
qla_edb_eventcreate(vha, VND_CMD_AUTH_STATE_SESSION_SHUTDOWN,
sess->d_id.b24, 0, sess);
qla2x00_post_aen_work(vha, FCH_EVT_PORT_OFFLINE, sess->d_id.b24);
+
+ while (!READ_ONCE(sess->edif.sess_down_acked) &&
+ !test_bit(VPORT_DELETE, &vha->dpc_flags)) {
+ msleep(100);
+ cnt++;
+ if (cnt > 100)
+ break;
+ }
+ sess->edif.sess_down_acked = 0;
+ ql_dbg(ql_dbg_disc, vha, 0xf09c,
+ "%s: sess %8phN port_offline event completed\n",
+ __func__, sess->port_name);
}
}
diff --git a/drivers/scsi/qla2xxx/qla_edif.h b/drivers/scsi/qla2xxx/qla_edif.h
index a965ca8e47ce..7cdb89ccdc6e 100644
--- a/drivers/scsi/qla2xxx/qla_edif.h
+++ b/drivers/scsi/qla2xxx/qla_edif.h
@@ -51,7 +51,8 @@ struct edif_dbell {
enum db_flags_t db_flags;
spinlock_t db_lock;
struct list_head head;
- struct completion dbell;
+ struct bsg_job *dbell_bsg_job;
+ unsigned long bsg_expire;
};
#define SA_UPDATE_IOCB_TYPE 0x71 /* Security Association Update IOCB entry */
@@ -140,4 +141,8 @@ struct enode {
(DBELL_ACTIVE(_fcport->vha) && \
(_fcport->disc_state == DSC_LOGIN_AUTH_PEND))
+#define EDIF_SESS_DELETE(_s) \
+ (qla_ini_mode_enabled(_s->vha) && (_s->disc_state == DSC_DELETE_PEND || \
+ _s->disc_state == DSC_DELETED))
+
#endif /* __QLA_EDIF_H */
diff --git a/drivers/scsi/qla2xxx/qla_edif_bsg.h b/drivers/scsi/qla2xxx/qla_edif_bsg.h
index 5a26c77157da..0931f4e4e127 100644
--- a/drivers/scsi/qla2xxx/qla_edif_bsg.h
+++ b/drivers/scsi/qla2xxx/qla_edif_bsg.h
@@ -7,13 +7,15 @@
#ifndef __QLA_EDIF_BSG_H
#define __QLA_EDIF_BSG_H
+#define EDIF_VERSION1 1
+
/* BSG Vendor specific commands */
#define ELS_MAX_PAYLOAD 2112
#ifndef WWN_SIZE
#define WWN_SIZE 8
#endif
-#define VND_CMD_APP_RESERVED_SIZE 32
-
+#define VND_CMD_APP_RESERVED_SIZE 28
+#define VND_CMD_PAD_SIZE 3
enum auth_els_sub_cmd {
SEND_ELS = 0,
SEND_ELS_REPLY,
@@ -28,7 +30,9 @@ struct extra_auth_els {
#define BSG_CTL_FLAG_LS_ACC 1
#define BSG_CTL_FLAG_LS_RJT 2
#define BSG_CTL_FLAG_TRM 3
- uint8_t extra_rsvd[3];
+ uint8_t version;
+ uint8_t pad[2];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
struct qla_bsg_auth_els_request {
@@ -39,51 +43,46 @@ struct qla_bsg_auth_els_request {
struct qla_bsg_auth_els_reply {
struct fc_bsg_reply r;
uint32_t rx_xchg_address;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
};
struct app_id {
int app_vid;
- uint8_t app_key[32];
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
struct app_start_reply {
uint32_t host_support_edif;
uint32_t edif_enode_active;
uint32_t edif_edb_active;
- uint32_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
struct app_start {
struct app_id app_info;
- uint32_t prli_to;
- uint32_t key_shred;
uint8_t app_start_flags;
- uint8_t reserved[VND_CMD_APP_RESERVED_SIZE - 1];
+ uint8_t version;
+ uint8_t pad[2];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
struct app_stop {
struct app_id app_info;
- char buf[16];
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
struct app_plogi_reply {
uint32_t prli_status;
- uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
-} __packed;
-
-#define RECFG_TIME 1
-#define RECFG_BYTES 2
-
-struct app_rekey_cfg {
- struct app_id app_info;
- uint8_t rekey_mode;
- port_id_t d_id;
- uint8_t force;
- union {
- int64_t bytes;
- int64_t time;
- } rky_units;
-
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
@@ -91,7 +90,9 @@ struct app_pinfo_req {
struct app_id app_info;
uint8_t num_ports;
port_id_t remote_pid;
- uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
struct app_pinfo {
@@ -103,11 +104,8 @@ struct app_pinfo {
#define VND_CMD_RTYPE_INITIATOR 2
uint8_t remote_state;
uint8_t auth_state;
- uint8_t rekey_mode;
- int64_t rekey_count;
- int64_t rekey_config_value;
- int64_t rekey_consumed_value;
-
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
@@ -120,6 +118,8 @@ struct app_pinfo {
struct app_pinfo_reply {
uint8_t port_count;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
struct app_pinfo ports[];
} __packed;
@@ -127,6 +127,8 @@ struct app_pinfo_reply {
struct app_sinfo_req {
struct app_id app_info;
uint8_t num_ports;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
@@ -140,6 +142,9 @@ struct app_sinfo {
struct app_stats_reply {
uint8_t elem_count;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
struct app_sinfo elem[];
} __packed;
@@ -163,9 +168,11 @@ struct qla_sa_update_frame {
uint8_t node_name[WWN_SIZE];
uint8_t port_name[WWN_SIZE];
port_id_t port_id;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved2[VND_CMD_APP_RESERVED_SIZE];
} __packed;
-// used for edif mgmt bsg interface
#define QL_VND_SC_UNDEF 0
#define QL_VND_SC_SA_UPDATE 1
#define QL_VND_SC_APP_START 2
@@ -175,6 +182,22 @@ struct qla_sa_update_frame {
#define QL_VND_SC_REKEY_CONFIG 6
#define QL_VND_SC_GET_FCINFO 7
#define QL_VND_SC_GET_STATS 8
+#define QL_VND_SC_AEN_COMPLETE 9
+#define QL_VND_SC_READ_DBELL 10
+
+/*
+ * bsg caller to provide empty buffer for doorbell events.
+ *
+ * sg_io_v4.din_xferp = empty buffer for door bell events
+ * sg_io_v4.dout_xferp = struct edif_read_dbell *buf
+ */
+struct edif_read_dbell {
+ struct app_id app_info;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+};
+
/* Application interface data structure for rtn data */
#define EXT_DEF_EVENT_DATA_SIZE 64
@@ -191,7 +214,9 @@ struct edif_sa_update_aen {
port_id_t port_id;
uint32_t key_type; /* Tx (1) or RX (2) */
uint32_t status; /* 0 succes, 1 failed, 2 timeout , 3 error */
- uint8_t reserved[16];
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
#define QL_VND_SA_STAT_SUCCESS 0
@@ -212,9 +237,22 @@ struct auth_complete_cmd {
uint8_t wwpn[WWN_SIZE];
port_id_t d_id;
} u;
- uint32_t reserved[VND_CMD_APP_RESERVED_SIZE];
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
+} __packed;
+
+struct aen_complete_cmd {
+ struct app_id app_info;
+ port_id_t port_id;
+ uint32_t event_code;
+ uint8_t version;
+ uint8_t pad[VND_CMD_PAD_SIZE];
+ uint8_t reserved[VND_CMD_APP_RESERVED_SIZE];
} __packed;
#define RX_DELAY_DELETE_TIMEOUT 20
+#define FCH_EVT_VENDOR_UNIQUE_VPORT_DOWN 1
+
#endif /* QLA_EDIF_BSG_H */
diff --git a/drivers/scsi/qla2xxx/qla_fw.h b/drivers/scsi/qla2xxx/qla_fw.h
index 0bb1d562f0bf..361015b5763e 100644
--- a/drivers/scsi/qla2xxx/qla_fw.h
+++ b/drivers/scsi/qla2xxx/qla_fw.h
@@ -807,7 +807,7 @@ struct els_entry_24xx {
#define EPD_ELS_COMMAND (0 << 13)
#define EPD_ELS_ACC (1 << 13)
#define EPD_ELS_RJT (2 << 13)
-#define EPD_RX_XCHG (3 << 13)
+#define EPD_RX_XCHG (3 << 13) /* terminate exchange */
#define ECF_CLR_PASSTHRU_PEND BIT_12
#define ECF_INCL_FRAME_HDR BIT_11
#define ECF_SEC_LOGIN BIT_3
diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index dac27b5ff0ac..2e5b65072b75 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -335,6 +335,7 @@ extern int qla24xx_configure_prot_mode(srb_t *, uint16_t *);
extern int qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha,
struct qla_work_evt *e);
void qla2x00_sp_release(struct kref *kref);
+void qla2x00_els_dcmd2_iocb_timeout(void *data);
/*
* Global Function Prototypes in qla_mbx.c source file.
@@ -433,7 +434,8 @@ extern int
qla2x00_get_resource_cnts(scsi_qla_host_t *);
extern int
-qla2x00_get_fcal_position_map(scsi_qla_host_t *ha, char *pos_map);
+qla2x00_get_fcal_position_map(scsi_qla_host_t *ha, char *pos_map,
+ u8 *num_entries);
extern int
qla2x00_get_link_status(scsi_qla_host_t *, uint16_t, struct link_statistics *,
@@ -727,7 +729,7 @@ int qla24xx_async_gpsc(scsi_qla_host_t *, fc_port_t *);
void qla24xx_handle_gpsc_event(scsi_qla_host_t *, struct event_arg *);
int qla2x00_mgmt_svr_login(scsi_qla_host_t *);
void qla24xx_handle_gffid_event(scsi_qla_host_t *vha, struct event_arg *ea);
-int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport);
+int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport, bool);
int qla24xx_async_gpnft(scsi_qla_host_t *, u8, srb_t *);
void qla24xx_async_gpnft_done(scsi_qla_host_t *, srb_t *);
void qla24xx_async_gnnft_done(scsi_qla_host_t *, srb_t *);
diff --git a/drivers/scsi/qla2xxx/qla_gs.c b/drivers/scsi/qla2xxx/qla_gs.c
index e811de2f6a25..7ca734337000 100644
--- a/drivers/scsi/qla2xxx/qla_gs.c
+++ b/drivers/scsi/qla2xxx/qla_gs.c
@@ -1596,7 +1596,6 @@ qla2x00_hba_attributes(scsi_qla_host_t *vha, void *entries,
unsigned int callopt)
{
struct qla_hw_data *ha = vha->hw;
- struct init_cb_24xx *icb24 = (void *)ha->init_cb;
struct new_utsname *p_sysid = utsname();
struct ct_fdmi_hba_attr *eiter;
uint16_t alen;
@@ -1758,8 +1757,8 @@ qla2x00_hba_attributes(scsi_qla_host_t *vha, void *entries,
/* MAX CT Payload Length */
eiter = entries + size;
eiter->type = cpu_to_be16(FDMI_HBA_MAXIMUM_CT_PAYLOAD_LENGTH);
- eiter->a.max_ct_len = cpu_to_be32(le16_to_cpu(IS_FWI2_CAPABLE(ha) ?
- icb24->frame_payload_size : ha->init_cb->frame_payload_size));
+ eiter->a.max_ct_len = cpu_to_be32(ha->frame_payload_size >> 2);
+
alen = sizeof(eiter->a.max_ct_len);
alen += FDMI_ATTR_TYPELEN(eiter);
eiter->len = cpu_to_be16(alen);
@@ -1851,7 +1850,6 @@ qla2x00_port_attributes(scsi_qla_host_t *vha, void *entries,
unsigned int callopt)
{
struct qla_hw_data *ha = vha->hw;
- struct init_cb_24xx *icb24 = (void *)ha->init_cb;
struct new_utsname *p_sysid = utsname();
char *hostname = p_sysid ?
p_sysid->nodename : fc_host_system_hostname(vha->host);
@@ -1903,8 +1901,7 @@ qla2x00_port_attributes(scsi_qla_host_t *vha, void *entries,
/* Max frame size. */
eiter = entries + size;
eiter->type = cpu_to_be16(FDMI_PORT_MAX_FRAME_SIZE);
- eiter->a.max_frame_size = cpu_to_be32(le16_to_cpu(IS_FWI2_CAPABLE(ha) ?
- icb24->frame_payload_size : ha->init_cb->frame_payload_size));
+ eiter->a.max_frame_size = cpu_to_be32(ha->frame_payload_size);
alen = sizeof(eiter->a.max_frame_size);
alen += FDMI_ATTR_TYPELEN(eiter);
eiter->len = cpu_to_be16(alen);
@@ -3280,19 +3277,12 @@ int qla24xx_async_gpnid(scsi_qla_host_t *vha, port_id_t *id)
return rval;
}
-void qla24xx_handle_gffid_event(scsi_qla_host_t *vha, struct event_arg *ea)
-{
- fc_port_t *fcport = ea->fcport;
-
- qla24xx_post_gnl_work(vha, fcport);
-}
void qla24xx_async_gffid_sp_done(srb_t *sp, int res)
{
struct scsi_qla_host *vha = sp->vha;
fc_port_t *fcport = sp->fcport;
struct ct_sns_rsp *ct_rsp;
- struct event_arg ea;
uint8_t fc4_scsi_feat;
uint8_t fc4_nvme_feat;
@@ -3300,10 +3290,10 @@ void qla24xx_async_gffid_sp_done(srb_t *sp, int res)
"Async done-%s res %x ID %x. %8phC\n",
sp->name, res, fcport->d_id.b24, fcport->port_name);
- fcport->flags &= ~FCF_ASYNC_SENT;
- ct_rsp = &fcport->ct_desc.ct_sns->p.rsp;
+ ct_rsp = sp->u.iocb_cmd.u.ctarg.rsp;
fc4_scsi_feat = ct_rsp->rsp.gff_id.fc4_features[GFF_FCP_SCSI_OFFSET];
fc4_nvme_feat = ct_rsp->rsp.gff_id.fc4_features[GFF_NVME_OFFSET];
+ sp->rc = res;
/*
* FC-GS-7, 5.2.3.12 FC-4 Features - format
@@ -3324,24 +3314,42 @@ void qla24xx_async_gffid_sp_done(srb_t *sp, int res)
}
}
- memset(&ea, 0, sizeof(ea));
- ea.sp = sp;
- ea.fcport = sp->fcport;
- ea.rc = res;
+ if (sp->flags & SRB_WAKEUP_ON_COMP) {
+ complete(sp->comp);
+ } else {
+ if (sp->u.iocb_cmd.u.ctarg.req) {
+ dma_free_coherent(&vha->hw->pdev->dev,
+ sp->u.iocb_cmd.u.ctarg.req_allocated_size,
+ sp->u.iocb_cmd.u.ctarg.req,
+ sp->u.iocb_cmd.u.ctarg.req_dma);
+ sp->u.iocb_cmd.u.ctarg.req = NULL;
+ }
- qla24xx_handle_gffid_event(vha, &ea);
- /* ref: INIT */
- kref_put(&sp->cmd_kref, qla2x00_sp_release);
+ if (sp->u.iocb_cmd.u.ctarg.rsp) {
+ dma_free_coherent(&vha->hw->pdev->dev,
+ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size,
+ sp->u.iocb_cmd.u.ctarg.rsp,
+ sp->u.iocb_cmd.u.ctarg.rsp_dma);
+ sp->u.iocb_cmd.u.ctarg.rsp = NULL;
+ }
+
+ /* ref: INIT */
+ kref_put(&sp->cmd_kref, qla2x00_sp_release);
+ /* we should not be here */
+ dump_stack();
+ }
}
/* Get FC4 Feature with Nport ID. */
-int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport)
+int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport, bool wait)
{
int rval = QLA_FUNCTION_FAILED;
struct ct_sns_req *ct_req;
srb_t *sp;
+ DECLARE_COMPLETION_ONSTACK(comp);
- if (!vha->flags.online || (fcport->flags & FCF_ASYNC_SENT))
+ /* this routine does not have handling for no wait */
+ if (!vha->flags.online || !wait)
return rval;
/* ref: INIT */
@@ -3349,43 +3357,86 @@ int qla24xx_async_gffid(scsi_qla_host_t *vha, fc_port_t *fcport)
if (!sp)
return rval;
- fcport->flags |= FCF_ASYNC_SENT;
sp->type = SRB_CT_PTHRU_CMD;
sp->name = "gffid";
sp->gen1 = fcport->rscn_gen;
sp->gen2 = fcport->login_gen;
qla2x00_init_async_sp(sp, qla2x00_get_async_timeout(vha) + 2,
qla24xx_async_gffid_sp_done);
+ sp->comp = ∁
+ sp->u.iocb_cmd.timeout = qla2x00_els_dcmd2_iocb_timeout;
+
+ if (wait)
+ sp->flags = SRB_WAKEUP_ON_COMP;
+
+ sp->u.iocb_cmd.u.ctarg.req_allocated_size = sizeof(struct ct_sns_pkt);
+ sp->u.iocb_cmd.u.ctarg.req = dma_alloc_coherent(&vha->hw->pdev->dev,
+ sp->u.iocb_cmd.u.ctarg.req_allocated_size,
+ &sp->u.iocb_cmd.u.ctarg.req_dma,
+ GFP_KERNEL);
+ if (!sp->u.iocb_cmd.u.ctarg.req) {
+ ql_log(ql_log_warn, vha, 0xd041,
+ "%s: Failed to allocate ct_sns request.\n",
+ __func__);
+ goto done_free_sp;
+ }
+
+ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size = sizeof(struct ct_sns_pkt);
+ sp->u.iocb_cmd.u.ctarg.rsp = dma_alloc_coherent(&vha->hw->pdev->dev,
+ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size,
+ &sp->u.iocb_cmd.u.ctarg.rsp_dma,
+ GFP_KERNEL);
+ if (!sp->u.iocb_cmd.u.ctarg.rsp) {
+ ql_log(ql_log_warn, vha, 0xd041,
+ "%s: Failed to allocate ct_sns response.\n",
+ __func__);
+ goto done_free_sp;
+ }
/* CT_IU preamble */
- ct_req = qla2x00_prep_ct_req(fcport->ct_desc.ct_sns, GFF_ID_CMD,
- GFF_ID_RSP_SIZE);
+ ct_req = qla2x00_prep_ct_req(sp->u.iocb_cmd.u.ctarg.req, GFF_ID_CMD, GFF_ID_RSP_SIZE);
ct_req->req.gff_id.port_id[0] = fcport->d_id.b.domain;
ct_req->req.gff_id.port_id[1] = fcport->d_id.b.area;
ct_req->req.gff_id.port_id[2] = fcport->d_id.b.al_pa;
- sp->u.iocb_cmd.u.ctarg.req = fcport->ct_desc.ct_sns;
- sp->u.iocb_cmd.u.ctarg.req_dma = fcport->ct_desc.ct_sns_dma;
- sp->u.iocb_cmd.u.ctarg.rsp = fcport->ct_desc.ct_sns;
- sp->u.iocb_cmd.u.ctarg.rsp_dma = fcport->ct_desc.ct_sns_dma;
sp->u.iocb_cmd.u.ctarg.req_size = GFF_ID_REQ_SIZE;
sp->u.iocb_cmd.u.ctarg.rsp_size = GFF_ID_RSP_SIZE;
sp->u.iocb_cmd.u.ctarg.nport_handle = NPH_SNS;
- ql_dbg(ql_dbg_disc, vha, 0x2132,
- "Async-%s hdl=%x %8phC.\n", sp->name,
- sp->handle, fcport->port_name);
-
rval = qla2x00_start_sp(sp);
- if (rval != QLA_SUCCESS)
+
+ if (rval != QLA_SUCCESS) {
+ rval = QLA_FUNCTION_FAILED;
goto done_free_sp;
+ } else {
+ ql_dbg(ql_dbg_disc, vha, 0x3074,
+ "Async-%s hdl=%x portid %06x\n",
+ sp->name, sp->handle, fcport->d_id.b24);
+ }
+
+ wait_for_completion(sp->comp);
+ rval = sp->rc;
- return rval;
done_free_sp:
+ if (sp->u.iocb_cmd.u.ctarg.req) {
+ dma_free_coherent(&vha->hw->pdev->dev,
+ sp->u.iocb_cmd.u.ctarg.req_allocated_size,
+ sp->u.iocb_cmd.u.ctarg.req,
+ sp->u.iocb_cmd.u.ctarg.req_dma);
+ sp->u.iocb_cmd.u.ctarg.req = NULL;
+ }
+
+ if (sp->u.iocb_cmd.u.ctarg.rsp) {
+ dma_free_coherent(&vha->hw->pdev->dev,
+ sp->u.iocb_cmd.u.ctarg.rsp_allocated_size,
+ sp->u.iocb_cmd.u.ctarg.rsp,
+ sp->u.iocb_cmd.u.ctarg.rsp_dma);
+ sp->u.iocb_cmd.u.ctarg.rsp = NULL;
+ }
+
/* ref: INIT */
kref_put(&sp->cmd_kref, qla2x00_sp_release);
- fcport->flags &= ~FCF_ASYNC_SENT;
return rval;
}
@@ -3578,7 +3629,7 @@ void qla24xx_async_gnnft_done(scsi_qla_host_t *vha, srb_t *sp)
do_delete) {
if (fcport->loop_id != FC_NO_LOOP_ID) {
if (fcport->flags & FCF_FCP2_DEVICE)
- fcport->logout_on_delete = 0;
+ continue;
ql_log(ql_log_warn, vha, 0x20f0,
"%s %d %8phC post del sess\n",
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 3f3417a3e891..51503a316b10 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -47,6 +47,7 @@ qla2x00_sp_timeout(struct timer_list *t)
{
srb_t *sp = from_timer(sp, t, u.iocb_cmd.timer);
struct srb_iocb *iocb;
+ scsi_qla_host_t *vha = sp->vha;
WARN_ON(irqs_disabled());
iocb = &sp->u.iocb_cmd;
@@ -54,6 +55,12 @@ qla2x00_sp_timeout(struct timer_list *t)
/* ref: TMR */
kref_put(&sp->cmd_kref, qla2x00_sp_release);
+
+ if (vha && qla2x00_isp_reg_stat(vha->hw)) {
+ ql_log(ql_log_info, vha, 0x9008,
+ "PCI/Register disconnect.\n");
+ qla_pci_set_eeh_busy(vha);
+ }
}
void qla2x00_sp_free(srb_t *sp)
@@ -161,6 +168,7 @@ int qla24xx_async_abort_cmd(srb_t *cmd_sp, bool wait)
struct srb_iocb *abt_iocb;
srb_t *sp;
int rval = QLA_FUNCTION_FAILED;
+ uint8_t bail;
/* ref: INIT for ABTS command */
sp = qla2xxx_get_qpair_sp(cmd_sp->vha, cmd_sp->qpair, cmd_sp->fcport,
@@ -168,6 +176,7 @@ int qla24xx_async_abort_cmd(srb_t *cmd_sp, bool wait)
if (!sp)
return QLA_MEMORY_ALLOC_FAILED;
+ QLA_VHA_MARK_BUSY(vha, bail);
abt_iocb = &sp->u.iocb_cmd;
sp->type = SRB_ABT_CMD;
sp->name = "abort";
@@ -1480,7 +1489,6 @@ static int qla_chk_secure_login(scsi_qla_host_t *vha, fc_port_t *fcport,
ql_dbg(ql_dbg_disc, vha, 0x20ef,
"%s %d %8phC EDIF: post DB_AUTH: AUTH needed\n",
__func__, __LINE__, fcport->port_name);
- fcport->edif.app_started = 1;
fcport->edif.app_sess_online = 1;
qla_edb_eventcreate(vha, VND_CMD_AUTH_STATE_NEEDED,
@@ -1763,8 +1771,16 @@ int qla24xx_fcport_handle_login(struct scsi_qla_host *vha, fc_port_t *fcport)
break;
case DSC_LOGIN_PEND:
- if (fcport->fw_login_state == DSC_LS_PLOGI_COMP)
+ if (vha->hw->flags.edif_enabled)
+ break;
+
+ if (fcport->fw_login_state == DSC_LS_PLOGI_COMP) {
+ ql_dbg(ql_dbg_disc, vha, 0x2118,
+ "%s %d %8phC post %s PRLI\n",
+ __func__, __LINE__, fcport->port_name,
+ NVME_TARGET(vha->hw, fcport) ? "NVME" : "FC");
qla24xx_post_prli_work(vha, fcport);
+ }
break;
case DSC_UPD_FCPORT:
@@ -1818,7 +1834,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
case RSCN_PORT_ADDR:
fcport = qla2x00_find_fcport_by_nportid(vha, &ea->id, 1);
if (fcport) {
- if (fcport->flags & FCF_FCP2_DEVICE) {
+ if (fcport->flags & FCF_FCP2_DEVICE &&
+ atomic_read(&fcport->state) == FCS_ONLINE) {
ql_dbg(ql_dbg_disc, vha, 0x2115,
"Delaying session delete for FCP2 portid=%06x %8phC ",
fcport->d_id.b24, fcport->port_name);
@@ -1850,7 +1867,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
break;
case RSCN_AREA_ADDR:
list_for_each_entry(fcport, &vha->vp_fcports, list) {
- if (fcport->flags & FCF_FCP2_DEVICE)
+ if (fcport->flags & FCF_FCP2_DEVICE &&
+ atomic_read(&fcport->state) == FCS_ONLINE)
continue;
if ((ea->id.b24 & 0xffff00) == (fcport->d_id.b24 & 0xffff00)) {
@@ -1861,7 +1879,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
break;
case RSCN_DOM_ADDR:
list_for_each_entry(fcport, &vha->vp_fcports, list) {
- if (fcport->flags & FCF_FCP2_DEVICE)
+ if (fcport->flags & FCF_FCP2_DEVICE &&
+ atomic_read(&fcport->state) == FCS_ONLINE)
continue;
if ((ea->id.b24 & 0xff0000) == (fcport->d_id.b24 & 0xff0000)) {
@@ -1873,7 +1892,8 @@ void qla2x00_handle_rscn(scsi_qla_host_t *vha, struct event_arg *ea)
case RSCN_FAB_ADDR:
default:
list_for_each_entry(fcport, &vha->vp_fcports, list) {
- if (fcport->flags & FCF_FCP2_DEVICE)
+ if (fcport->flags & FCF_FCP2_DEVICE &&
+ atomic_read(&fcport->state) == FCS_ONLINE)
continue;
fcport->scan_needed = 1;
@@ -2000,12 +2020,14 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint32_t lun,
struct srb_iocb *tm_iocb;
srb_t *sp;
int rval = QLA_FUNCTION_FAILED;
+ uint8_t bail;
/* ref: INIT */
sp = qla2x00_get_sp(vha, fcport, GFP_KERNEL);
if (!sp)
goto done;
+ QLA_VHA_MARK_BUSY(vha, bail);
sp->type = SRB_TM_CMD;
sp->name = "tmf";
qla2x00_init_async_sp(sp, qla2x00_get_async_timeout(vha),
@@ -2124,6 +2146,13 @@ qla24xx_handle_prli_done_event(struct scsi_qla_host *vha, struct event_arg *ea)
}
if (N2N_TOPO(vha->hw)) {
+ if (ea->fcport->n2n_link_reset_cnt ==
+ vha->hw->login_retry_count &&
+ ea->fcport->flags & FCF_FCSP_DEVICE) {
+ /* remote authentication app just started */
+ ea->fcport->n2n_link_reset_cnt = 0;
+ }
+
if (ea->fcport->n2n_link_reset_cnt <
vha->hw->login_retry_count) {
ea->fcport->n2n_link_reset_cnt++;
@@ -4509,6 +4538,8 @@ qla2x00_init_rings(scsi_qla_host_t *vha)
BIT_6) != 0;
ql_dbg(ql_dbg_init, vha, 0x00bc, "FA-WWPN Support: %s.\n",
(ha->flags.fawwpn_enabled) ? "enabled" : "disabled");
+ /* Init_cb will be reused for other command(s). Save a backup copy of port_name */
+ memcpy(ha->port_name, ha->init_cb->port_name, WWN_SIZE);
}
/* ELS pass through payload is limit by frame size. */
@@ -5273,9 +5304,6 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vha, gfp_t flags)
INIT_LIST_HEAD(&fcport->edif.tx_sa_list);
INIT_LIST_HEAD(&fcport->edif.rx_sa_list);
- if (vha->e_dbell.db_flags == EDB_ACTIVE)
- fcport->edif.app_started = 1;
-
spin_lock_init(&fcport->edif.indx_list_lock);
INIT_LIST_HEAD(&fcport->edif.edif_indx_list);
@@ -5488,6 +5516,22 @@ static int qla2x00_configure_n2n_loop(scsi_qla_host_t *vha)
return QLA_FUNCTION_FAILED;
}
+static void
+qla_reinitialize_link(scsi_qla_host_t *vha)
+{
+ int rval;
+
+ atomic_set(&vha->loop_state, LOOP_DOWN);
+ atomic_set(&vha->loop_down_timer, LOOP_DOWN_TIME);
+ rval = qla2x00_full_login_lip(vha);
+ if (rval == QLA_SUCCESS) {
+ ql_dbg(ql_dbg_disc, vha, 0xd050, "Link reinitialized\n");
+ } else {
+ ql_dbg(ql_dbg_disc, vha, 0xd051,
+ "Link reinitialization failed (%d)\n", rval);
+ }
+}
+
/*
* qla2x00_configure_local_loop
* Updates Fibre Channel Device Database with local loop devices.
@@ -5539,6 +5583,19 @@ qla2x00_configure_local_loop(scsi_qla_host_t *vha)
spin_unlock_irqrestore(&vha->work_lock, flags);
if (vha->scan.scan_retry < MAX_SCAN_RETRIES) {
+ u8 loop_map_entries = 0;
+ int rc;
+
+ rc = qla2x00_get_fcal_position_map(vha, NULL,
+ &loop_map_entries);
+ if (rc == QLA_SUCCESS && loop_map_entries > 1) {
+ /*
+ * There are devices that are still not logged
+ * in. Reinitialize to give them a chance.
+ */
+ qla_reinitialize_link(vha);
+ return QLA_FUNCTION_FAILED;
+ }
set_bit(LOCAL_LOOP_UPDATE, &vha->dpc_flags);
set_bit(LOOP_RESYNC_NEEDED, &vha->dpc_flags);
}
@@ -5767,8 +5824,6 @@ qla2x00_reg_remote_port(scsi_qla_host_t *vha, fc_port_t *fcport)
if (atomic_read(&fcport->state) == FCS_ONLINE)
return;
- qla2x00_set_fcport_state(fcport, FCS_ONLINE);
-
rport_ids.node_name = wwn_to_u64(fcport->node_name);
rport_ids.port_name = wwn_to_u64(fcport->port_name);
rport_ids.port_id = fcport->d_id.b.domain << 16 |
@@ -5869,7 +5924,6 @@ qla2x00_update_fcport(scsi_qla_host_t *vha, fc_port_t *fcport)
qla2x00_reg_remote_port(vha, fcport);
break;
case MODE_TARGET:
- qla2x00_set_fcport_state(fcport, FCS_ONLINE);
if (!vha->vha_tgt.qla_tgt->tgt_stop &&
!vha->vha_tgt.qla_tgt->tgt_stopped)
qlt_fc_port_added(vha, fcport);
@@ -5887,6 +5941,8 @@ qla2x00_update_fcport(scsi_qla_host_t *vha, fc_port_t *fcport)
if (NVME_TARGET(vha->hw, fcport))
qla_nvme_register_remote(vha, fcport);
+ qla2x00_set_fcport_state(fcport, FCS_ONLINE);
+
if (IS_IIDMA_CAPABLE(vha->hw) && vha->hw->flags.gpsc_supported) {
if (fcport->id_changed) {
fcport->id_changed = 0;
@@ -9657,6 +9713,12 @@ int qla2xxx_disable_port(struct Scsi_Host *host)
vha->hw->flags.port_isolated = 1;
+ if (qla2x00_isp_reg_stat(vha->hw)) {
+ ql_log(ql_log_info, vha, 0x9006,
+ "PCI/Register disconnect, exiting.\n");
+ qla_pci_set_eeh_busy(vha);
+ return FAILED;
+ }
if (qla2x00_chip_is_down(vha))
return 0;
@@ -9672,6 +9734,13 @@ int qla2xxx_enable_port(struct Scsi_Host *host)
{
scsi_qla_host_t *vha = shost_priv(host);
+ if (qla2x00_isp_reg_stat(vha->hw)) {
+ ql_log(ql_log_info, vha, 0x9001,
+ "PCI/Register disconnect, exiting.\n");
+ qla_pci_set_eeh_busy(vha);
+ return FAILED;
+ }
+
vha->hw->flags.port_isolated = 0;
/* Set the flag to 1, so that isp_abort can proceed */
vha->flags.online = 1;
diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
index e0fe9ddb4bd2..42ce4e1fe744 100644
--- a/drivers/scsi/qla2xxx/qla_iocb.c
+++ b/drivers/scsi/qla2xxx/qla_iocb.c
@@ -2819,7 +2819,7 @@ qla24xx_els_logo_iocb(srb_t *sp, struct els_entry_24xx *els_iocb)
sp->vha->qla_stats.control_requests++;
}
-static void
+void
qla2x00_els_dcmd2_iocb_timeout(void *data)
{
srb_t *sp = data;
@@ -2882,6 +2882,9 @@ static void qla2x00_els_dcmd2_sp_done(srb_t *sp, int res)
sp->name, res, sp->handle, fcport->d_id.b24, fcport->port_name);
fcport->flags &= ~(FCF_ASYNC_SENT|FCF_ASYNC_ACTIVE);
+ /* For edif, set logout on delete to ensure any residual key from FW is flushed.*/
+ fcport->logout_on_delete = 1;
+ fcport->chip_reset = vha->hw->base_qpair->chip_reset;
if (sp->flags & SRB_WAKEUP_ON_COMP)
complete(&lio->u.els_plogi.comp);
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 21b31d6359c8..de348628aa53 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -1354,9 +1354,7 @@ qla2x00_async_event(scsi_qla_host_t *vha, struct rsp_que *rsp, uint16_t *mb)
if (!vha->vp_idx) {
if (ha->flags.fawwpn_enabled &&
(ha->current_topology == ISP_CFG_F)) {
- void *wwpn = ha->init_cb->port_name;
-
- memcpy(vha->port_name, wwpn, WWN_SIZE);
+ memcpy(vha->port_name, ha->port_name, WWN_SIZE);
fc_host_port_name(vha->host) =
wwn_to_u64(vha->port_name);
ql_dbg(ql_dbg_init + ql_dbg_verbose,
@@ -2639,7 +2637,7 @@ static void qla24xx_nvme_iocb_entry(scsi_qla_host_t *vha, struct req_que *req,
}
if (unlikely(logit))
- ql_log(ql_dbg_io, fcport->vha, 0x5060,
+ ql_dbg(ql_dbg_io, fcport->vha, 0x5060,
"NVME-%s ERR Handling - hdl=%x status(%x) tr_len:%x resid=%x ox_id=%x\n",
sp->name, sp->handle, comp_status,
fd->transferred_length, le32_to_cpu(sts->residual_len),
@@ -3426,6 +3424,7 @@ qla2x00_status_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, void *pkt)
case CS_PORT_UNAVAILABLE:
case CS_TIMEOUT:
case CS_RESET:
+ case CS_EDIF_INV_REQ:
/*
* We are going to have the fc class block the rport
@@ -3496,7 +3495,7 @@ qla2x00_status_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, void *pkt)
out:
if (logit)
- ql_log(ql_dbg_io, fcport->vha, 0x3022,
+ ql_dbg(ql_dbg_io, fcport->vha, 0x3022,
"FCP command status: 0x%x-0x%x (0x%x) nexus=%ld:%d:%llu portid=%02x%02x%02x oxid=0x%x cdb=%10phN len=0x%x rsp_info=0x%x resid=0x%x fw_resid=0x%x sp=%p cp=%p.\n",
comp_status, scsi_status, res, vha->host_no,
cp->device->id, cp->device->lun, fcport->d_id.b.domain,
@@ -4420,16 +4419,12 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
}
/* Enable MSI-X vector for response queue update for queue 0 */
- if (IS_QLA83XX(ha) || IS_QLA27XX(ha) || IS_QLA28XX(ha)) {
- if (ha->msixbase && ha->mqiobase &&
- (ha->max_rsp_queues > 1 || ha->max_req_queues > 1 ||
- ql2xmqsupport))
- ha->mqenable = 1;
- } else
- if (ha->mqiobase &&
- (ha->max_rsp_queues > 1 || ha->max_req_queues > 1 ||
- ql2xmqsupport))
- ha->mqenable = 1;
+ if (IS_MQUE_CAPABLE(ha) &&
+ (ha->msixbase && ha->mqiobase && ha->max_qpairs))
+ ha->mqenable = 1;
+ else
+ ha->mqenable = 0;
+
ql_dbg(ql_dbg_multiq, vha, 0xc005,
"mqiobase=%p, max_rsp_queues=%d, max_req_queues=%d.\n",
ha->mqiobase, ha->max_rsp_queues, ha->max_req_queues);
diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index 892caf2475df..86d8c455c07a 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -238,6 +238,8 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
ql_dbg(ql_dbg_mbx, vha, 0x1112,
"mbox[%d]<-0x%04x\n", cnt, *iptr);
wrt_reg_word(optr, *iptr);
+ } else {
+ wrt_reg_word(optr, 0);
}
mboxes >>= 1;
@@ -274,6 +276,12 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
atomic_inc(&ha->num_pend_mbx_stage3);
if (!wait_for_completion_timeout(&ha->mbx_intr_comp,
mcp->tov * HZ)) {
+ ql_dbg(ql_dbg_mbx, vha, 0x117a,
+ "cmd=%x Timeout.\n", command);
+ spin_lock_irqsave(&ha->hardware_lock, flags);
+ clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
+ spin_unlock_irqrestore(&ha->hardware_lock, flags);
+
if (chip_reset != ha->chip_reset) {
eeh_delay = ha->flags.eeh_busy ? 1 : 0;
@@ -286,12 +294,6 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
rval = QLA_ABORTED;
goto premature_exit;
}
- ql_dbg(ql_dbg_mbx, vha, 0x117a,
- "cmd=%x Timeout.\n", command);
- spin_lock_irqsave(&ha->hardware_lock, flags);
- clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
- spin_unlock_irqrestore(&ha->hardware_lock, flags);
-
} else if (ha->flags.purge_mbox ||
chip_reset != ha->chip_reset) {
eeh_delay = ha->flags.eeh_busy ? 1 : 0;
@@ -3066,7 +3068,8 @@ qla2x00_get_resource_cnts(scsi_qla_host_t *vha)
* Kernel context.
*/
int
-qla2x00_get_fcal_position_map(scsi_qla_host_t *vha, char *pos_map)
+qla2x00_get_fcal_position_map(scsi_qla_host_t *vha, char *pos_map,
+ u8 *num_entries)
{
int rval;
mbx_cmd_t mc;
@@ -3106,6 +3109,8 @@ qla2x00_get_fcal_position_map(scsi_qla_host_t *vha, char *pos_map)
if (pos_map)
memcpy(pos_map, pmap, FCAL_MAP_SIZE);
+ if (num_entries)
+ *num_entries = pmap[0];
}
dma_pool_free(ha->s_dma_pool, pmap, pmap_dma);
diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c
index e6b5c4ccce97..eb43a5f1b399 100644
--- a/drivers/scsi/qla2xxx/qla_mid.c
+++ b/drivers/scsi/qla2xxx/qla_mid.c
@@ -166,9 +166,13 @@ qla24xx_disable_vp(scsi_qla_host_t *vha)
int ret = QLA_SUCCESS;
fc_port_t *fcport;
- if (vha->hw->flags.edif_enabled)
+ if (vha->hw->flags.edif_enabled) {
+ if (DBELL_ACTIVE(vha))
+ qla2x00_post_aen_work(vha, FCH_EVT_VENDOR_UNIQUE,
+ FCH_EVT_VENDOR_UNIQUE_VPORT_DOWN);
/* delete sessions and flush sa_indexes */
qla2x00_wait_for_sess_deletion(vha);
+ }
if (vha->hw->flags.fw_started)
ret = qla24xx_control_vp(vha, VCE_COMMAND_DISABLE_VPS_LOGO_ALL);
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 87c9404aa401..7450c3458be7 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -37,11 +37,6 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
(fcport->nvme_flag & NVME_FLAG_REGISTERED))
return 0;
- if (atomic_read(&fcport->state) == FCS_ONLINE)
- return 0;
-
- qla2x00_set_fcport_state(fcport, FCS_ONLINE);
-
fcport->nvme_flag &= ~NVME_FLAG_RESETTING;
memset(&req, 0, sizeof(struct nvme_fc_port_info));
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 762229d495a8..f9ad0847782d 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -333,6 +333,11 @@ MODULE_PARM_DESC(ql2xabts_wait_nvme,
"To wait for ABTS response on I/O timeouts for NVMe. (default: 1)");
+u32 ql2xdelay_before_pci_error_handling = 5;
+module_param(ql2xdelay_before_pci_error_handling, uint, 0644);
+MODULE_PARM_DESC(ql2xdelay_before_pci_error_handling,
+ "Number of seconds delayed before qla begin PCI error self-handling (default: 5).\n");
+
static void qla2x00_clear_drv_active(struct qla_hw_data *);
static void qla2x00_free_device(scsi_qla_host_t *);
static int qla2xxx_map_queues(struct Scsi_Host *shost);
@@ -1337,21 +1342,20 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
/*
* Returns: QLA_SUCCESS or QLA_FUNCTION_FAILED.
*/
-int
-qla2x00_eh_wait_for_pending_commands(scsi_qla_host_t *vha, unsigned int t,
- uint64_t l, enum nexus_wait_type type)
+static int
+__qla2x00_eh_wait_for_pending_commands(struct qla_qpair *qpair, unsigned int t,
+ uint64_t l, enum nexus_wait_type type)
{
int cnt, match, status;
unsigned long flags;
- struct qla_hw_data *ha = vha->hw;
- struct req_que *req;
+ scsi_qla_host_t *vha = qpair->vha;
+ struct req_que *req = qpair->req;
srb_t *sp;
struct scsi_cmnd *cmd;
status = QLA_SUCCESS;
- spin_lock_irqsave(&ha->hardware_lock, flags);
- req = vha->req;
+ spin_lock_irqsave(qpair->qp_lock_ptr, flags);
for (cnt = 1; status == QLA_SUCCESS &&
cnt < req->num_outstanding_cmds; cnt++) {
sp = req->outstanding_cmds[cnt];
@@ -1378,15 +1382,35 @@ qla2x00_eh_wait_for_pending_commands(scsi_qla_host_t *vha, unsigned int t,
if (!match)
continue;
- spin_unlock_irqrestore(&ha->hardware_lock, flags);
+ spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
status = qla2x00_eh_wait_on_command(cmd);
- spin_lock_irqsave(&ha->hardware_lock, flags);
+ spin_lock_irqsave(qpair->qp_lock_ptr, flags);
}
- spin_unlock_irqrestore(&ha->hardware_lock, flags);
+ spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
return status;
}
+int
+qla2x00_eh_wait_for_pending_commands(scsi_qla_host_t *vha, unsigned int t,
+ uint64_t l, enum nexus_wait_type type)
+{
+ struct qla_qpair *qpair;
+ struct qla_hw_data *ha = vha->hw;
+ int i, status = QLA_SUCCESS;
+
+ status = __qla2x00_eh_wait_for_pending_commands(ha->base_qpair, t, l,
+ type);
+ for (i = 0; status == QLA_SUCCESS && i < ha->max_qpairs; i++) {
+ qpair = ha->queue_pair_map[i];
+ if (!qpair)
+ continue;
+ status = __qla2x00_eh_wait_for_pending_commands(qpair, t, l,
+ type);
+ }
+ return status;
+}
+
static char *reset_errors[] = {
"HBA not online",
"HBA not ready",
@@ -1420,7 +1444,7 @@ qla2xxx_eh_device_reset(struct scsi_cmnd *cmd)
return err;
if (fcport->deleted)
- return SUCCESS;
+ return FAILED;
ql_log(ql_log_info, vha, 0x8009,
"DEVICE RESET ISSUED nexus=%ld:%d:%llu cmd=%p.\n", vha->host_no,
@@ -1488,7 +1512,7 @@ qla2xxx_eh_target_reset(struct scsi_cmnd *cmd)
return err;
if (fcport->deleted)
- return SUCCESS;
+ return FAILED;
ql_log(ql_log_info, vha, 0x8009,
"TARGET RESET ISSUED nexus=%ld:%d cmd=%p.\n", vha->host_no,
@@ -5473,7 +5497,7 @@ qla2x00_do_work(struct scsi_qla_host *vha)
e->u.fcport.fcport, false);
break;
case QLA_EVT_SA_REPLACE:
- qla24xx_issue_sa_replace_iocb(vha, e);
+ rc = qla24xx_issue_sa_replace_iocb(vha, e);
break;
}
@@ -7239,6 +7263,44 @@ static void qla_heart_beat(struct scsi_qla_host *vha, u16 dpc_started)
}
}
+static void qla_wind_down_chip(scsi_qla_host_t *vha)
+{
+ struct qla_hw_data *ha = vha->hw;
+
+ if (!ha->flags.eeh_busy)
+ return;
+ if (ha->pci_error_state)
+ /* system is trying to recover */
+ return;
+
+ /*
+ * Current system is not handling PCIE error. At this point, this is
+ * best effort to wind down the adapter.
+ */
+ if (time_after_eq(jiffies, ha->eeh_jif + ql2xdelay_before_pci_error_handling * HZ) &&
+ !ha->flags.eeh_flush) {
+ ql_log(ql_log_info, vha, 0x9009,
+ "PCI Error detected, attempting to reset hardware.\n");
+
+ ha->isp_ops->reset_chip(vha);
+ ha->isp_ops->disable_intrs(ha);
+
+ ha->flags.eeh_flush = EEH_FLUSH_RDY;
+ ha->eeh_jif = jiffies;
+
+ } else if (ha->flags.eeh_flush == EEH_FLUSH_RDY &&
+ time_after_eq(jiffies, ha->eeh_jif + 5 * HZ)) {
+ pci_clear_master(ha->pdev);
+
+ /* flush all command */
+ qla2x00_abort_isp_cleanup(vha);
+ ha->flags.eeh_flush = EEH_FLUSH_DONE;
+
+ ql_log(ql_log_info, vha, 0x900a,
+ "PCI Error handling complete, all IOs aborted.\n");
+ }
+}
+
/**************************************************************************
* qla2x00_timer
*
@@ -7262,6 +7324,8 @@ qla2x00_timer(struct timer_list *t)
fc_port_t *fcport = NULL;
if (ha->flags.eeh_busy) {
+ qla_wind_down_chip(vha);
+
ql_dbg(ql_dbg_timer, vha, 0x6000,
"EEH = %d, restarting timer.\n",
ha->flags.eeh_busy);
@@ -7842,6 +7906,9 @@ void qla_pci_set_eeh_busy(struct scsi_qla_host *vha)
spin_lock_irqsave(&base_vha->work_lock, flags);
if (!ha->flags.eeh_busy) {
+ ha->eeh_jif = jiffies;
+ ha->flags.eeh_flush = 0;
+
ha->flags.eeh_busy = 1;
do_cleanup = true;
}
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index 6dfcfd8e7337..34b85c80233f 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -988,22 +988,6 @@ void qlt_free_session_done(struct work_struct *work)
sess->send_els_logo);
if (!IS_SW_RESV_ADDR(sess->d_id)) {
- if (ha->flags.edif_enabled &&
- (!own || own->iocb.u.isp24.status_subcode == ELS_PLOGI)) {
- sess->edif.authok = 0;
- if (!ha->flags.host_shutting_down) {
- ql_dbg(ql_dbg_edif, vha, 0x911e,
- "%s wwpn %8phC calling qla2x00_release_all_sadb\n",
- __func__, sess->port_name);
- qla2x00_release_all_sadb(vha, sess);
- } else {
- ql_dbg(ql_dbg_edif, vha, 0x911e,
- "%s bypassing release_all_sadb\n",
- __func__);
- }
- qla_edif_clear_appdata(vha, sess);
- qla_edif_sess_down(vha, sess);
- }
qla2x00_mark_device_lost(vha, sess, 0);
if (sess->send_els_logo) {
@@ -1049,6 +1033,25 @@ void qlt_free_session_done(struct work_struct *work)
sess->nvme_flag |= NVME_FLAG_DELETING;
qla_nvme_unregister_remote_port(sess);
}
+
+ if (ha->flags.edif_enabled &&
+ (!own || (own &&
+ own->iocb.u.isp24.status_subcode == ELS_PLOGI))) {
+ sess->edif.authok = 0;
+ if (!ha->flags.host_shutting_down) {
+ ql_dbg(ql_dbg_edif, vha, 0x911e,
+ "%s wwpn %8phC calling qla2x00_release_all_sadb\n",
+ __func__, sess->port_name);
+ qla2x00_release_all_sadb(vha, sess);
+ } else {
+ ql_dbg(ql_dbg_edif, vha, 0x911e,
+ "%s bypassing release_all_sadb\n",
+ __func__);
+ }
+
+ qla_edif_clear_appdata(vha, sess);
+ qla_edif_sess_down(vha, sess);
+ }
}
/*
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 5d21f07456c6..2a38cd2d24ef 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2264,16 +2264,8 @@ static void iscsi_if_disconnect_bound_ep(struct iscsi_cls_conn *conn,
}
}
-static int iscsi_if_stop_conn(struct iscsi_transport *transport,
- struct iscsi_uevent *ev)
+static int iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag)
{
- int flag = ev->u.stop_conn.flag;
- struct iscsi_cls_conn *conn;
-
- conn = iscsi_conn_lookup(ev->u.stop_conn.sid, ev->u.stop_conn.cid);
- if (!conn)
- return -EINVAL;
-
ISCSI_DBG_TRANS_CONN(conn, "iscsi if conn stop.\n");
/*
* If this is a termination we have to call stop_conn with that flag
@@ -2349,6 +2341,55 @@ static void iscsi_cleanup_conn_work_fn(struct work_struct *work)
ISCSI_DBG_TRANS_CONN(conn, "cleanup done.\n");
}
+static int iscsi_iter_force_destroy_conn_fn(struct device *dev, void *data)
+{
+ struct iscsi_transport *transport;
+ struct iscsi_cls_conn *conn;
+
+ if (!iscsi_is_conn_dev(dev))
+ return 0;
+
+ conn = iscsi_dev_to_conn(dev);
+ transport = conn->transport;
+
+ if (READ_ONCE(conn->state) != ISCSI_CONN_DOWN)
+ iscsi_if_stop_conn(conn, STOP_CONN_TERM);
+
+ transport->destroy_conn(conn);
+ return 0;
+}
+
+/**
+ * iscsi_force_destroy_session - destroy a session from the kernel
+ * @session: session to destroy
+ *
+ * Force the destruction of a session from the kernel. This should only be
+ * used when userspace is no longer running during system shutdown.
+ */
+void iscsi_force_destroy_session(struct iscsi_cls_session *session)
+{
+ struct iscsi_transport *transport = session->transport;
+ unsigned long flags;
+
+ WARN_ON_ONCE(system_state == SYSTEM_RUNNING);
+
+ spin_lock_irqsave(&sesslock, flags);
+ if (list_empty(&session->sess_list)) {
+ spin_unlock_irqrestore(&sesslock, flags);
+ /*
+ * Conn/ep is already freed. Session is being torn down via
+ * async path. For shutdown we don't care about it so return.
+ */
+ return;
+ }
+ spin_unlock_irqrestore(&sesslock, flags);
+
+ device_for_each_child(&session->dev, NULL,
+ iscsi_iter_force_destroy_conn_fn);
+ transport->destroy_session(session);
+}
+EXPORT_SYMBOL_GPL(iscsi_force_destroy_session);
+
void iscsi_free_session(struct iscsi_cls_session *session)
{
ISCSI_DBG_TRANS_SESSION(session, "Freeing session\n");
@@ -3720,7 +3761,12 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport,
case ISCSI_UEVENT_DESTROY_CONN:
return iscsi_if_destroy_conn(transport, ev);
case ISCSI_UEVENT_STOP_CONN:
- return iscsi_if_stop_conn(transport, ev);
+ conn = iscsi_conn_lookup(ev->u.stop_conn.sid,
+ ev->u.stop_conn.cid);
+ if (!conn)
+ return -EINVAL;
+
+ return iscsi_if_stop_conn(conn, ev->u.stop_conn.flag);
}
/*
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index cbffa712b9f3..508cb855386d 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -195,7 +195,7 @@ static void sg_link_reserve(Sg_fd * sfp, Sg_request * srp, int size);
static void sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp);
static Sg_fd *sg_add_sfp(Sg_device * sdp);
static void sg_remove_sfp(struct kref *);
-static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id);
+static Sg_request *sg_get_rq_mark(Sg_fd * sfp, int pack_id, bool *busy);
static Sg_request *sg_add_request(Sg_fd * sfp);
static int sg_remove_request(Sg_fd * sfp, Sg_request * srp);
static Sg_device *sg_get_dev(int dev);
@@ -444,6 +444,7 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
Sg_fd *sfp;
Sg_request *srp;
int req_pack_id = -1;
+ bool busy;
sg_io_hdr_t *hp;
struct sg_header *old_hdr;
int retval;
@@ -466,20 +467,16 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos)
if (retval)
return retval;
- srp = sg_get_rq_mark(sfp, req_pack_id);
+ srp = sg_get_rq_mark(sfp, req_pack_id, &busy);
if (!srp) { /* now wait on packet to arrive */
- if (atomic_read(&sdp->detaching))
- return -ENODEV;
if (filp->f_flags & O_NONBLOCK)
return -EAGAIN;
retval = wait_event_interruptible(sfp->read_wait,
- (atomic_read(&sdp->detaching) ||
- (srp = sg_get_rq_mark(sfp, req_pack_id))));
- if (atomic_read(&sdp->detaching))
- return -ENODEV;
- if (retval)
- /* -ERESTARTSYS as signal hit process */
- return retval;
+ ((srp = sg_get_rq_mark(sfp, req_pack_id, &busy)) ||
+ (!busy && atomic_read(&sdp->detaching))));
+ if (!srp)
+ /* signal or detaching */
+ return retval ? retval : -ENODEV;
}
if (srp->header.interface_id != '\0')
return sg_new_read(sfp, buf, count, srp);
@@ -939,9 +936,7 @@ sg_ioctl_common(struct file *filp, Sg_device *sdp, Sg_fd *sfp,
if (result < 0)
return result;
result = wait_event_interruptible(sfp->read_wait,
- (srp_done(sfp, srp) || atomic_read(&sdp->detaching)));
- if (atomic_read(&sdp->detaching))
- return -ENODEV;
+ srp_done(sfp, srp));
write_lock_irq(&sfp->rq_list_lock);
if (srp->done) {
srp->done = 2;
@@ -2078,19 +2073,28 @@ sg_unlink_reserve(Sg_fd * sfp, Sg_request * srp)
}
static Sg_request *
-sg_get_rq_mark(Sg_fd * sfp, int pack_id)
+sg_get_rq_mark(Sg_fd * sfp, int pack_id, bool *busy)
{
Sg_request *resp;
unsigned long iflags;
+ *busy = false;
write_lock_irqsave(&sfp->rq_list_lock, iflags);
list_for_each_entry(resp, &sfp->rq_list, entry) {
- /* look for requests that are ready + not SG_IO owned */
- if ((1 == resp->done) && (!resp->sg_io_owned) &&
+ /* look for requests that are not SG_IO owned */
+ if ((!resp->sg_io_owned) &&
((-1 == pack_id) || (resp->header.pack_id == pack_id))) {
- resp->done = 2; /* guard against other readers */
- write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
- return resp;
+ switch (resp->done) {
+ case 0: /* request active */
+ *busy = true;
+ break;
+ case 1: /* request done; response ready to return */
+ resp->done = 2; /* guard against other readers */
+ write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
+ return resp;
+ case 2: /* response already being returned */
+ break;
+ }
}
}
write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
@@ -2144,6 +2148,15 @@ sg_remove_request(Sg_fd * sfp, Sg_request * srp)
res = 1;
}
write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
+
+ /*
+ * If the device is detaching, wakeup any readers in case we just
+ * removed the last response, which would leave nothing for them to
+ * return other than -ENODEV.
+ */
+ if (unlikely(atomic_read(&sfp->parentdp->detaching)))
+ wake_up_interruptible_all(&sfp->read_wait);
+
return res;
}
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 7c0d069a3158..e1fc6f5b9612 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5484,10 +5484,10 @@ static int pqi_raid_submit_scsi_cmd_with_io_request(
}
switch (scmd->sc_data_direction) {
- case DMA_TO_DEVICE:
+ case DMA_FROM_DEVICE:
request->data_direction = SOP_READ_FLAG;
break;
- case DMA_FROM_DEVICE:
+ case DMA_TO_DEVICE:
request->data_direction = SOP_WRITE_FLAG;
break;
case DMA_NONE:
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 874490f7f5e7..9e6f45aea20e 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -9500,12 +9500,8 @@ EXPORT_SYMBOL(ufshcd_runtime_resume);
int ufshcd_shutdown(struct ufs_hba *hba)
{
if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
- goto out;
-
- pm_runtime_get_sync(hba->dev);
+ ufshcd_suspend(hba);
- ufshcd_suspend(hba);
-out:
hba->is_powered = false;
/* allow force shutdown even in case of errors */
return 0;
diff --git a/drivers/soc/amlogic/meson-mx-socinfo.c b/drivers/soc/amlogic/meson-mx-socinfo.c
index 78f0f1aeca57..92125dd65f33 100644
--- a/drivers/soc/amlogic/meson-mx-socinfo.c
+++ b/drivers/soc/amlogic/meson-mx-socinfo.c
@@ -126,6 +126,7 @@ static int __init meson_mx_socinfo_init(void)
np = of_find_matching_node(NULL, meson_mx_socinfo_analog_top_ids);
if (np) {
analog_top_regmap = syscon_node_to_regmap(np);
+ of_node_put(np);
if (IS_ERR(analog_top_regmap))
return PTR_ERR(analog_top_regmap);
diff --git a/drivers/soc/amlogic/meson-secure-pwrc.c b/drivers/soc/amlogic/meson-secure-pwrc.c
index a10a417a87db..e93518763526 100644
--- a/drivers/soc/amlogic/meson-secure-pwrc.c
+++ b/drivers/soc/amlogic/meson-secure-pwrc.c
@@ -152,8 +152,10 @@ static int meson_secure_pwrc_probe(struct platform_device *pdev)
}
pwrc = devm_kzalloc(&pdev->dev, sizeof(*pwrc), GFP_KERNEL);
- if (!pwrc)
+ if (!pwrc) {
+ of_node_put(sm_np);
return -ENOMEM;
+ }
pwrc->fw = meson_sm_get(sm_np);
of_node_put(sm_np);
diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
index 5ed2fc1c53a0..be18d46c7b0f 100644
--- a/drivers/soc/fsl/guts.c
+++ b/drivers/soc/fsl/guts.c
@@ -140,7 +140,7 @@ static int fsl_guts_probe(struct platform_device *pdev)
struct device_node *root, *np = pdev->dev.of_node;
struct device *dev = &pdev->dev;
const struct fsl_soc_die_attr *soc_die;
- const char *machine;
+ const char *machine = NULL;
u32 svr;
/* Initialize guts */
diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index e718b8735444..4472fb22ba04 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -129,6 +129,7 @@ config QCOM_RPMHPD
config QCOM_RPMPD
tristate "Qualcomm RPM Power domain driver"
+ depends on PM
depends on QCOM_SMD_RPM
help
QCOM RPM Power domain driver to support power-domains with
diff --git a/drivers/soc/qcom/ocmem.c b/drivers/soc/qcom/ocmem.c
index 97fd24c178f8..c92d26b73e6f 100644
--- a/drivers/soc/qcom/ocmem.c
+++ b/drivers/soc/qcom/ocmem.c
@@ -194,14 +194,17 @@ struct ocmem *of_get_ocmem(struct device *dev)
devnode = of_parse_phandle(dev->of_node, "sram", 0);
if (!devnode || !devnode->parent) {
dev_err(dev, "Cannot look up sram phandle\n");
+ of_node_put(devnode);
return ERR_PTR(-ENODEV);
}
pdev = of_find_device_by_node(devnode->parent);
if (!pdev) {
dev_err(dev, "Cannot find device node %s\n", devnode->name);
+ of_node_put(devnode);
return ERR_PTR(-EPROBE_DEFER);
}
+ of_node_put(devnode);
ocmem = platform_get_drvdata(pdev);
if (!ocmem) {
diff --git a/drivers/soc/qcom/qcom_aoss.c b/drivers/soc/qcom/qcom_aoss.c
index a59bb34e5eba..18c856056475 100644
--- a/drivers/soc/qcom/qcom_aoss.c
+++ b/drivers/soc/qcom/qcom_aoss.c
@@ -399,8 +399,10 @@ static int qmp_cooling_devices_register(struct qmp *qmp)
continue;
ret = qmp_cooling_device_add(qmp, &qmp->cooling_devs[count++],
child);
- if (ret)
+ if (ret) {
+ of_node_put(child);
goto unroll;
+ }
}
if (!count)
diff --git a/drivers/soc/qcom/socinfo.c b/drivers/soc/qcom/socinfo.c
index 8b38d134720a..f6879983da69 100644
--- a/drivers/soc/qcom/socinfo.c
+++ b/drivers/soc/qcom/socinfo.c
@@ -328,7 +328,8 @@ static const struct soc_id soc_id[] = {
{ 455, "QRB5165" },
{ 457, "SM8450" },
{ 459, "SM7225" },
- { 460, "SA8540P" },
+ { 460, "SA8295P" },
+ { 461, "SA8540P" },
{ 480, "SM8450" },
};
diff --git a/drivers/soc/renesas/r8a779a0-sysc.c b/drivers/soc/renesas/r8a779a0-sysc.c
index fdfc857df334..04f1bc322ae7 100644
--- a/drivers/soc/renesas/r8a779a0-sysc.c
+++ b/drivers/soc/renesas/r8a779a0-sysc.c
@@ -57,11 +57,11 @@ static struct rcar_gen4_sysc_area r8a779a0_areas[] __initdata = {
{ "a2cv6", R8A779A0_PD_A2CV6, R8A779A0_PD_A3IR },
{ "a2cn2", R8A779A0_PD_A2CN2, R8A779A0_PD_A3IR },
{ "a2imp23", R8A779A0_PD_A2IMP23, R8A779A0_PD_A3IR },
- { "a2dp1", R8A779A0_PD_A2DP0, R8A779A0_PD_A3IR },
- { "a2cv2", R8A779A0_PD_A2CV0, R8A779A0_PD_A3IR },
- { "a2cv3", R8A779A0_PD_A2CV1, R8A779A0_PD_A3IR },
- { "a2cv5", R8A779A0_PD_A2CV4, R8A779A0_PD_A3IR },
- { "a2cv7", R8A779A0_PD_A2CV6, R8A779A0_PD_A3IR },
+ { "a2dp1", R8A779A0_PD_A2DP1, R8A779A0_PD_A3IR },
+ { "a2cv2", R8A779A0_PD_A2CV2, R8A779A0_PD_A3IR },
+ { "a2cv3", R8A779A0_PD_A2CV3, R8A779A0_PD_A3IR },
+ { "a2cv5", R8A779A0_PD_A2CV5, R8A779A0_PD_A3IR },
+ { "a2cv7", R8A779A0_PD_A2CV7, R8A779A0_PD_A3IR },
{ "a2cn1", R8A779A0_PD_A2CN1, R8A779A0_PD_A3IR },
{ "a1cnn0", R8A779A0_PD_A1CNN0, R8A779A0_PD_A2CN0 },
{ "a1cnn2", R8A779A0_PD_A1CNN2, R8A779A0_PD_A2CN2 },
diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index 354d3f89366f..f1cf78c6d477 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -7,6 +7,7 @@
#include <linux/pm_runtime.h>
#include <linux/soundwire/sdw_registers.h>
#include <linux/soundwire/sdw.h>
+#include <linux/soundwire/sdw_type.h>
#include "bus.h"
#include "sysfs_local.h"
@@ -846,15 +847,21 @@ static int sdw_slave_clk_stop_callback(struct sdw_slave *slave,
enum sdw_clk_stop_mode mode,
enum sdw_clk_stop_type type)
{
- int ret;
+ int ret = 0;
- if (slave->ops && slave->ops->clk_stop) {
- ret = slave->ops->clk_stop(slave, mode, type);
- if (ret < 0)
- return ret;
+ mutex_lock(&slave->sdw_dev_lock);
+
+ if (slave->probed) {
+ struct device *dev = &slave->dev;
+ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
+
+ if (drv->ops && drv->ops->clk_stop)
+ ret = drv->ops->clk_stop(slave, mode, type);
}
- return 0;
+ mutex_unlock(&slave->sdw_dev_lock);
+
+ return ret;
}
static int sdw_slave_clk_stop_prepare(struct sdw_slave *slave,
@@ -1616,14 +1623,24 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
}
/* Update the Slave driver */
- if (slave_notify && slave->ops &&
- slave->ops->interrupt_callback) {
- slave_intr.sdca_cascade = sdca_cascade;
- slave_intr.control_port = clear;
- memcpy(slave_intr.port, &port_status,
- sizeof(slave_intr.port));
-
- slave->ops->interrupt_callback(slave, &slave_intr);
+ if (slave_notify) {
+ mutex_lock(&slave->sdw_dev_lock);
+
+ if (slave->probed) {
+ struct device *dev = &slave->dev;
+ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
+
+ if (drv->ops && drv->ops->interrupt_callback) {
+ slave_intr.sdca_cascade = sdca_cascade;
+ slave_intr.control_port = clear;
+ memcpy(slave_intr.port, &port_status,
+ sizeof(slave_intr.port));
+
+ drv->ops->interrupt_callback(slave, &slave_intr);
+ }
+ }
+
+ mutex_unlock(&slave->sdw_dev_lock);
}
/* Ack interrupt */
@@ -1697,29 +1714,21 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
static int sdw_update_slave_status(struct sdw_slave *slave,
enum sdw_slave_status status)
{
- unsigned long time;
+ int ret = 0;
- if (!slave->probed) {
- /*
- * the slave status update is typically handled in an
- * interrupt thread, which can race with the driver
- * probe, e.g. when a module needs to be loaded.
- *
- * make sure the probe is complete before updating
- * status.
- */
- time = wait_for_completion_timeout(&slave->probe_complete,
- msecs_to_jiffies(DEFAULT_PROBE_TIMEOUT));
- if (!time) {
- dev_err(&slave->dev, "Probe not complete, timed out\n");
- return -ETIMEDOUT;
- }
+ mutex_lock(&slave->sdw_dev_lock);
+
+ if (slave->probed) {
+ struct device *dev = &slave->dev;
+ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
+
+ if (drv->ops && drv->ops->update_status)
+ ret = drv->ops->update_status(slave, status);
}
- if (!slave->ops || !slave->ops->update_status)
- return 0;
+ mutex_unlock(&slave->sdw_dev_lock);
- return slave->ops->update_status(slave, status);
+ return ret;
}
/**
diff --git a/drivers/soundwire/bus_type.c b/drivers/soundwire/bus_type.c
index 893296f3fe39..04b3529f8929 100644
--- a/drivers/soundwire/bus_type.c
+++ b/drivers/soundwire/bus_type.c
@@ -98,8 +98,6 @@ static int sdw_drv_probe(struct device *dev)
if (!id)
return -ENODEV;
- slave->ops = drv->ops;
-
/*
* attach to power domain but don't turn on (last arg)
*/
@@ -107,19 +105,23 @@ static int sdw_drv_probe(struct device *dev)
if (ret)
return ret;
+ mutex_lock(&slave->sdw_dev_lock);
+
ret = drv->probe(slave, id);
if (ret) {
name = drv->name;
if (!name)
name = drv->driver.name;
+ mutex_unlock(&slave->sdw_dev_lock);
+
dev_err(dev, "Probe of %s failed: %d\n", name, ret);
dev_pm_domain_detach(dev, false);
return ret;
}
/* device is probed so let's read the properties now */
- if (slave->ops && slave->ops->read_prop)
- slave->ops->read_prop(slave);
+ if (drv->ops && drv->ops->read_prop)
+ drv->ops->read_prop(slave);
/* init the sysfs as we have properties now */
ret = sdw_slave_sysfs_init(slave);
@@ -139,7 +141,19 @@ static int sdw_drv_probe(struct device *dev)
slave->prop.clk_stop_timeout);
slave->probed = true;
- complete(&slave->probe_complete);
+
+ /*
+ * if the probe happened after the bus was started, notify the codec driver
+ * of the current hardware status to e.g. start the initialization.
+ * Errors are only logged as warnings to avoid failing the probe.
+ */
+ if (drv->ops && drv->ops->update_status) {
+ ret = drv->ops->update_status(slave, slave->status);
+ if (ret < 0)
+ dev_warn(dev, "%s: update_status failed with status %d\n", __func__, ret);
+ }
+
+ mutex_unlock(&slave->sdw_dev_lock);
dev_dbg(dev, "probe complete\n");
@@ -152,9 +166,15 @@ static int sdw_drv_remove(struct device *dev)
struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
int ret = 0;
+ mutex_lock(&slave->sdw_dev_lock);
+
+ slave->probed = false;
+
if (drv->remove)
ret = drv->remove(slave);
+ mutex_unlock(&slave->sdw_dev_lock);
+
dev_pm_domain_detach(dev, false);
return ret;
@@ -193,12 +213,8 @@ int __sdw_register_driver(struct sdw_driver *drv, struct module *owner)
drv->driver.owner = owner;
drv->driver.probe = sdw_drv_probe;
-
- if (drv->remove)
- drv->driver.remove = sdw_drv_remove;
-
- if (drv->shutdown)
- drv->driver.shutdown = sdw_drv_shutdown;
+ drv->driver.remove = sdw_drv_remove;
+ drv->driver.shutdown = sdw_drv_shutdown;
return driver_register(&drv->driver);
}
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
index 7b8ef45abee4..eea9d0f338f5 100644
--- a/drivers/soundwire/qcom.c
+++ b/drivers/soundwire/qcom.c
@@ -471,6 +471,10 @@ static int qcom_swrm_enumerate(struct sdw_bus *bus)
char *buf1 = (char *)&val1, *buf2 = (char *)&val2;
for (i = 1; i <= SDW_MAX_DEVICES; i++) {
+ /* do not continue if the status is Not Present */
+ if (!ctrl->status[i])
+ continue;
+
/*SCP_Devid5 - Devid 4*/
ctrl->reg_read(ctrl, SWRM_ENUMERATOR_SLAVE_DEV_ID_1(i), &val1);
diff --git a/drivers/soundwire/slave.c b/drivers/soundwire/slave.c
index 669d7573320b..25e76b5d4a1a 100644
--- a/drivers/soundwire/slave.c
+++ b/drivers/soundwire/slave.c
@@ -12,6 +12,7 @@ static void sdw_slave_release(struct device *dev)
{
struct sdw_slave *slave = dev_to_sdw_dev(dev);
+ mutex_destroy(&slave->sdw_dev_lock);
kfree(slave);
}
@@ -58,9 +59,9 @@ int sdw_slave_add(struct sdw_bus *bus,
init_completion(&slave->enumeration_complete);
init_completion(&slave->initialization_complete);
slave->dev_num = 0;
- init_completion(&slave->probe_complete);
slave->probed = false;
slave->first_interrupt_done = false;
+ mutex_init(&slave->sdw_dev_lock);
for (i = 0; i < SDW_MAX_PORTS; i++)
init_completion(&slave->port_ready[i]);
diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
index f273459b2023..6963e5f7ea6d 100644
--- a/drivers/soundwire/stream.c
+++ b/drivers/soundwire/stream.c
@@ -13,6 +13,7 @@
#include <linux/slab.h>
#include <linux/soundwire/sdw_registers.h>
#include <linux/soundwire/sdw.h>
+#include <linux/soundwire/sdw_type.h>
#include <sound/soc.h>
#include "bus.h"
@@ -401,20 +402,26 @@ static int sdw_do_port_prep(struct sdw_slave_runtime *s_rt,
struct sdw_prepare_ch prep_ch,
enum sdw_port_prep_ops cmd)
{
- const struct sdw_slave_ops *ops = s_rt->slave->ops;
- int ret;
+ int ret = 0;
+ struct sdw_slave *slave = s_rt->slave;
- if (ops->port_prep) {
- ret = ops->port_prep(s_rt->slave, &prep_ch, cmd);
- if (ret < 0) {
- dev_err(&s_rt->slave->dev,
- "Slave Port Prep cmd %d failed: %d\n",
- cmd, ret);
- return ret;
+ mutex_lock(&slave->sdw_dev_lock);
+
+ if (slave->probed) {
+ struct device *dev = &slave->dev;
+ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
+
+ if (drv->ops && drv->ops->port_prep) {
+ ret = drv->ops->port_prep(slave, &prep_ch, cmd);
+ if (ret < 0)
+ dev_err(dev, "Slave Port Prep cmd %d failed: %d\n",
+ cmd, ret);
}
}
- return 0;
+ mutex_unlock(&slave->sdw_dev_lock);
+
+ return ret;
}
static int sdw_prep_deprep_slave_ports(struct sdw_bus *bus,
@@ -578,7 +585,7 @@ static int sdw_notify_config(struct sdw_master_runtime *m_rt)
struct sdw_slave_runtime *s_rt;
struct sdw_bus *bus = m_rt->bus;
struct sdw_slave *slave;
- int ret = 0;
+ int ret;
if (bus->ops->set_bus_conf) {
ret = bus->ops->set_bus_conf(bus, &bus->params);
@@ -589,17 +596,27 @@ static int sdw_notify_config(struct sdw_master_runtime *m_rt)
list_for_each_entry(s_rt, &m_rt->slave_rt_list, m_rt_node) {
slave = s_rt->slave;
- if (slave->ops->bus_config) {
- ret = slave->ops->bus_config(slave, &bus->params);
- if (ret < 0) {
- dev_err(bus->dev, "Notify Slave: %d failed\n",
- slave->dev_num);
- return ret;
+ mutex_lock(&slave->sdw_dev_lock);
+
+ if (slave->probed) {
+ struct device *dev = &slave->dev;
+ struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);
+
+ if (drv->ops && drv->ops->bus_config) {
+ ret = drv->ops->bus_config(slave, &bus->params);
+ if (ret < 0) {
+ dev_err(dev, "Notify Slave: %d failed\n",
+ slave->dev_num);
+ mutex_unlock(&slave->sdw_dev_lock);
+ return ret;
+ }
}
}
+
+ mutex_unlock(&slave->sdw_dev_lock);
}
- return ret;
+ return 0;
}
/**
diff --git a/drivers/spi/spi-altera-dfl.c b/drivers/spi/spi-altera-dfl.c
index ca40923258af..596e181ae136 100644
--- a/drivers/spi/spi-altera-dfl.c
+++ b/drivers/spi/spi-altera-dfl.c
@@ -128,9 +128,9 @@ static int dfl_spi_altera_probe(struct dfl_device *dfl_dev)
struct spi_master *master;
struct altera_spi *hw;
void __iomem *base;
- int err = -ENODEV;
+ int err;
- master = spi_alloc_master(dev, sizeof(struct altera_spi));
+ master = devm_spi_alloc_master(dev, sizeof(struct altera_spi));
if (!master)
return -ENOMEM;
@@ -159,10 +159,9 @@ static int dfl_spi_altera_probe(struct dfl_device *dfl_dev)
altera_spi_init_master(master);
err = devm_spi_register_master(dev, master);
- if (err) {
- dev_err(dev, "%s failed to register spi master %d\n", __func__, err);
- goto exit;
- }
+ if (err)
+ return dev_err_probe(dev, err, "%s failed to register spi master\n",
+ __func__);
if (dfl_dev->revision == FME_FEATURE_REV_MAX10_SPI_N5010)
strscpy(board_info.modalias, "m10-n5010", SPI_NAME_SIZE);
@@ -179,9 +178,6 @@ static int dfl_spi_altera_probe(struct dfl_device *dfl_dev)
}
return 0;
-exit:
- spi_master_put(master);
- return err;
}
static const struct dfl_device_id dfl_spi_altera_ids[] = {
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index d5ee5130601e..79d853f6d192 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -23,7 +23,7 @@
((_dws)->ip == DW_ ## _ip ## _ID)
#define __dw_spi_ver_cmp(_dws, _ip, _ver, _op) \
- (dw_spi_ip_is(_dws, _ip) && (_dws)->ver _op DW_ ## _ip ## _ver)
+ (dw_spi_ip_is(_dws, _ip) && (_dws)->ver _op DW_ ## _ip ## _ ## _ver)
#define dw_spi_ver_is(_dws, _ip, _ver) __dw_spi_ver_cmp(_dws, _ip, _ver, ==)
diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
index 7a014eeec2d0..411b1307b7fd 100644
--- a/drivers/spi/spi-rspi.c
+++ b/drivers/spi/spi-rspi.c
@@ -613,6 +613,10 @@ static int rspi_dma_transfer(struct rspi_data *rspi, struct sg_table *tx,
rspi->dma_callbacked, HZ);
if (ret > 0 && rspi->dma_callbacked) {
ret = 0;
+ if (tx)
+ dmaengine_synchronize(rspi->ctlr->dma_tx);
+ if (rx)
+ dmaengine_synchronize(rspi->ctlr->dma_rx);
} else {
if (!ret) {
dev_err(&rspi->ctlr->dev, "DMA timeout\n");
diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
index c26440e9058d..8fa21afc6a35 100644
--- a/drivers/spi/spi-s3c64xx.c
+++ b/drivers/spi/spi-s3c64xx.c
@@ -1413,7 +1413,7 @@ static const struct s3c64xx_spi_port_config exynos5433_spi_port_config = {
.quirks = S3C64XX_SPI_QUIRK_CS_AUTO,
};
-static struct s3c64xx_spi_port_config fsd_spi_port_config = {
+static const struct s3c64xx_spi_port_config fsd_spi_port_config = {
.fifo_lvl_mask = { 0x7f, 0x7f, 0x7f, 0x7f, 0x7f},
.rx_lvl_offset = 15,
.tx_st_done = 25,
diff --git a/drivers/spi/spi-synquacer.c b/drivers/spi/spi-synquacer.c
index ea706d9629cb..47cbe73137c2 100644
--- a/drivers/spi/spi-synquacer.c
+++ b/drivers/spi/spi-synquacer.c
@@ -783,6 +783,7 @@ static int __maybe_unused synquacer_spi_resume(struct device *dev)
ret = synquacer_spi_enable(master);
if (ret) {
+ clk_disable_unprepare(sspi->clk);
dev_err(dev, "failed to enable spi (%d)\n", ret);
return ret;
}
diff --git a/drivers/spi/spi-tegra20-slink.c b/drivers/spi/spi-tegra20-slink.c
index 80c3787deea9..c91c226be69e 100644
--- a/drivers/spi/spi-tegra20-slink.c
+++ b/drivers/spi/spi-tegra20-slink.c
@@ -1137,7 +1137,7 @@ static int tegra_slink_probe(struct platform_device *pdev)
static int tegra_slink_remove(struct platform_device *pdev)
{
- struct spi_master *master = platform_get_drvdata(pdev);
+ struct spi_master *master = spi_master_get(platform_get_drvdata(pdev));
struct tegra_slink_data *tspi = spi_master_get_devdata(master);
spi_unregister_master(master);
@@ -1152,6 +1152,7 @@ static int tegra_slink_remove(struct platform_device *pdev)
if (tspi->rx_dma_chan)
tegra_slink_deinit_dma_param(tspi, true);
+ spi_master_put(master);
return 0;
}
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 2e6d6bbeb784..6a8850e62078 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -2416,7 +2416,7 @@ static int acpi_spi_add_resource(struct acpi_resource *ares, void *data)
ctlr = acpi_spi_find_controller_by_adev(adev);
if (!ctlr)
- return -ENODEV;
+ return -EPROBE_DEFER;
lookup->ctlr = ctlr;
}
@@ -3068,9 +3068,9 @@ int spi_register_controller(struct spi_controller *ctlr)
}
EXPORT_SYMBOL_GPL(spi_register_controller);
-static void devm_spi_unregister(void *ctlr)
+static void devm_spi_unregister(struct device *dev, void *res)
{
- spi_unregister_controller(ctlr);
+ spi_unregister_controller(*(struct spi_controller **)res);
}
/**
@@ -3089,13 +3089,22 @@ static void devm_spi_unregister(void *ctlr)
int devm_spi_register_controller(struct device *dev,
struct spi_controller *ctlr)
{
+ struct spi_controller **ptr;
int ret;
+ ptr = devres_alloc(devm_spi_unregister, sizeof(*ptr), GFP_KERNEL);
+ if (!ptr)
+ return -ENOMEM;
+
ret = spi_register_controller(ctlr);
- if (ret)
- return ret;
+ if (!ret) {
+ *ptr = ctlr;
+ devres_add(dev, ptr);
+ } else {
+ devres_free(ptr);
+ }
- return devm_add_action_or_reset(dev, devm_spi_unregister, ctlr);
+ return ret;
}
EXPORT_SYMBOL_GPL(devm_spi_register_controller);
diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c
index 9c4d797e7ae4..4137c1a51e1b 100644
--- a/drivers/staging/fbtft/fbtft-core.c
+++ b/drivers/staging/fbtft/fbtft-core.c
@@ -656,7 +656,6 @@ struct fb_info *fbtft_framebuffer_alloc(struct fbtft_display *display,
fbdefio->delay = HZ / fps;
fbdefio->sort_pagelist = true;
fbdefio->deferred_io = fbtft_deferred_io;
- fb_deferred_io_init(info);
snprintf(info->fix.id, sizeof(info->fix.id), "%s", dev->driver->name);
info->fix.type = FB_TYPE_PACKED_PIXELS;
@@ -667,6 +666,7 @@ struct fb_info *fbtft_framebuffer_alloc(struct fbtft_display *display,
info->fix.line_length = width * bpp / 8;
info->fix.accel = FB_ACCEL_NONE;
info->fix.smem_len = vmem_size;
+ fb_deferred_io_init(info);
info->var.rotate = pdata->rotate;
info->var.xres = width;
diff --git a/drivers/staging/media/atomisp/pci/atomisp_cmd.c b/drivers/staging/media/atomisp/pci/atomisp_cmd.c
index 97d5a528969b..0da0b69a4637 100644
--- a/drivers/staging/media/atomisp/pci/atomisp_cmd.c
+++ b/drivers/staging/media/atomisp/pci/atomisp_cmd.c
@@ -901,9 +901,9 @@ void atomisp_buf_done(struct atomisp_sub_device *asd, int error,
int err;
unsigned long irqflags;
struct ia_css_frame *frame = NULL;
- struct atomisp_s3a_buf *s3a_buf = NULL, *_s3a_buf_tmp;
- struct atomisp_dis_buf *dis_buf = NULL, *_dis_buf_tmp;
- struct atomisp_metadata_buf *md_buf = NULL, *_md_buf_tmp;
+ struct atomisp_s3a_buf *s3a_buf = NULL, *_s3a_buf_tmp, *s3a_iter;
+ struct atomisp_dis_buf *dis_buf = NULL, *_dis_buf_tmp, *dis_iter;
+ struct atomisp_metadata_buf *md_buf = NULL, *_md_buf_tmp, *md_iter;
enum atomisp_metadata_type md_type;
struct atomisp_device *isp = asd->isp;
struct v4l2_control ctrl;
@@ -942,60 +942,75 @@ void atomisp_buf_done(struct atomisp_sub_device *asd, int error,
switch (buf_type) {
case IA_CSS_BUFFER_TYPE_3A_STATISTICS:
- list_for_each_entry_safe(s3a_buf, _s3a_buf_tmp,
+ list_for_each_entry_safe(s3a_iter, _s3a_buf_tmp,
&asd->s3a_stats_in_css, list) {
- if (s3a_buf->s3a_data ==
+ if (s3a_iter->s3a_data ==
buffer.css_buffer.data.stats_3a) {
- list_del_init(&s3a_buf->list);
- list_add_tail(&s3a_buf->list,
+ list_del_init(&s3a_iter->list);
+ list_add_tail(&s3a_iter->list,
&asd->s3a_stats_ready);
+ s3a_buf = s3a_iter;
break;
}
}
asd->s3a_bufs_in_css[css_pipe_id]--;
atomisp_3a_stats_ready_event(asd, buffer.css_buffer.exp_id);
- dev_dbg(isp->dev, "%s: s3a stat with exp_id %d is ready\n",
- __func__, s3a_buf->s3a_data->exp_id);
+ if (s3a_buf)
+ dev_dbg(isp->dev, "%s: s3a stat with exp_id %d is ready\n",
+ __func__, s3a_buf->s3a_data->exp_id);
+ else
+ dev_dbg(isp->dev, "%s: s3a stat is ready with no exp_id found\n",
+ __func__);
break;
case IA_CSS_BUFFER_TYPE_METADATA:
if (error)
break;
md_type = atomisp_get_metadata_type(asd, css_pipe_id);
- list_for_each_entry_safe(md_buf, _md_buf_tmp,
+ list_for_each_entry_safe(md_iter, _md_buf_tmp,
&asd->metadata_in_css[md_type], list) {
- if (md_buf->metadata ==
+ if (md_iter->metadata ==
buffer.css_buffer.data.metadata) {
- list_del_init(&md_buf->list);
- list_add_tail(&md_buf->list,
+ list_del_init(&md_iter->list);
+ list_add_tail(&md_iter->list,
&asd->metadata_ready[md_type]);
+ md_buf = md_iter;
break;
}
}
asd->metadata_bufs_in_css[stream_id][css_pipe_id]--;
atomisp_metadata_ready_event(asd, md_type);
- dev_dbg(isp->dev, "%s: metadata with exp_id %d is ready\n",
- __func__, md_buf->metadata->exp_id);
+ if (md_buf)
+ dev_dbg(isp->dev, "%s: metadata with exp_id %d is ready\n",
+ __func__, md_buf->metadata->exp_id);
+ else
+ dev_dbg(isp->dev, "%s: metadata is ready with no exp_id found\n",
+ __func__);
break;
case IA_CSS_BUFFER_TYPE_DIS_STATISTICS:
- list_for_each_entry_safe(dis_buf, _dis_buf_tmp,
+ list_for_each_entry_safe(dis_iter, _dis_buf_tmp,
&asd->dis_stats_in_css, list) {
- if (dis_buf->dis_data ==
+ if (dis_iter->dis_data ==
buffer.css_buffer.data.stats_dvs) {
spin_lock_irqsave(&asd->dis_stats_lock,
irqflags);
- list_del_init(&dis_buf->list);
- list_add(&dis_buf->list, &asd->dis_stats);
+ list_del_init(&dis_iter->list);
+ list_add(&dis_iter->list, &asd->dis_stats);
asd->params.dis_proj_data_valid = true;
spin_unlock_irqrestore(&asd->dis_stats_lock,
irqflags);
+ dis_buf = dis_iter;
break;
}
}
asd->dis_bufs_in_css--;
- dev_dbg(isp->dev, "%s: dis stat with exp_id %d is ready\n",
- __func__, dis_buf->dis_data->exp_id);
+ if (dis_buf)
+ dev_dbg(isp->dev, "%s: dis stat with exp_id %d is ready\n",
+ __func__, dis_buf->dis_data->exp_id);
+ else
+ dev_dbg(isp->dev, "%s: dis stat is ready with no exp_id found\n",
+ __func__);
break;
case IA_CSS_BUFFER_TYPE_VF_OUTPUT_FRAME:
case IA_CSS_BUFFER_TYPE_SEC_VF_OUTPUT_FRAME:
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index bd7d11032c94..ac232b5f7825 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -638,6 +638,7 @@ static const struct of_device_id of_hantro_match[] = {
{ .compatible = "rockchip,rk3288-vpu", .data = &rk3288_vpu_variant, },
{ .compatible = "rockchip,rk3328-vpu", .data = &rk3328_vpu_variant, },
{ .compatible = "rockchip,rk3399-vpu", .data = &rk3399_vpu_variant, },
+ { .compatible = "rockchip,rk3568-vpu", .data = &rk3568_vpu_variant, },
#endif
#ifdef CONFIG_VIDEO_HANTRO_IMX8M
{ .compatible = "nxp,imx8mm-vpu-g1", .data = &imx8mm_vpu_g1_variant, },
diff --git a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
index 5f3178bac9c8..d28653d04d20 100644
--- a/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
+++ b/drivers/staging/media/hantro/hantro_g2_hevc_dec.c
@@ -8,6 +8,20 @@
#include "hantro_hw.h"
#include "hantro_g2_regs.h"
+#define G2_ALIGN 16
+
+static size_t hantro_hevc_chroma_offset(struct hantro_ctx *ctx)
+{
+ return ctx->dst_fmt.width * ctx->dst_fmt.height;
+}
+
+static size_t hantro_hevc_motion_vectors_offset(struct hantro_ctx *ctx)
+{
+ size_t cr_offset = hantro_hevc_chroma_offset(ctx);
+
+ return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
+}
+
static void prepare_tile_info_buffer(struct hantro_ctx *ctx)
{
struct hantro_dev *vpu = ctx->dev;
@@ -330,7 +344,6 @@ static void set_ref_pic_list(struct hantro_ctx *ctx)
static int set_ref(struct hantro_ctx *ctx)
{
const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
- const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
const struct v4l2_ctrl_hevc_pps *pps = ctrls->pps;
const struct v4l2_ctrl_hevc_decode_params *decode_params = ctrls->decode_params;
const struct v4l2_hevc_dpb_entry *dpb = decode_params->dpb;
@@ -338,8 +351,8 @@ static int set_ref(struct hantro_ctx *ctx)
struct hantro_dev *vpu = ctx->dev;
struct vb2_v4l2_buffer *vb2_dst;
struct hantro_decoded_buffer *dst;
- size_t cr_offset = hantro_hevc_chroma_offset(sps);
- size_t mv_offset = hantro_hevc_motion_vectors_offset(sps);
+ size_t cr_offset = hantro_hevc_chroma_offset(ctx);
+ size_t mv_offset = hantro_hevc_motion_vectors_offset(ctx);
u32 max_ref_frames;
u16 dpb_longterm_e;
static const struct hantro_reg cur_poc[] = {
@@ -377,11 +390,10 @@ static int set_ref(struct hantro_ctx *ctx)
!!(pps->flags & V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED));
/*
- * Write POC count diff from current pic. For frame decoding only compute
- * pic_order_cnt[0] and ignore pic_order_cnt[1] used in field-coding.
+ * Write POC count diff from current pic.
*/
for (i = 0; i < decode_params->num_active_dpb_entries && i < ARRAY_SIZE(cur_poc); i++) {
- char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt[0];
+ char poc_diff = decode_params->pic_order_cnt_val - dpb[i].pic_order_cnt_val;
hantro_reg_write(vpu, &cur_poc[i], poc_diff);
}
@@ -401,14 +413,14 @@ static int set_ref(struct hantro_ctx *ctx)
set_ref_pic_list(ctx);
- /* We will only keep the references picture that are still used */
- ctx->hevc_dec.ref_bufs_used = 0;
+ /* We will only keep the reference pictures that are still used */
+ hantro_hevc_ref_init(ctx);
/* Set up addresses of DPB buffers */
dpb_longterm_e = 0;
for (i = 0; i < decode_params->num_active_dpb_entries &&
i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
- luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt[0]);
+ luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt_val);
if (!luma_addr)
return -ENOMEM;
@@ -443,8 +455,6 @@ static int set_ref(struct hantro_ctx *ctx)
hantro_write_addr(vpu, G2_OUT_CHROMA_ADDR, chroma_addr);
hantro_write_addr(vpu, G2_OUT_MV_ADDR, mv_addr);
- hantro_hevc_ref_remove_unused(ctx);
-
for (; i < V4L2_HEVC_DPB_ENTRIES_NUM_MAX; i++) {
hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), 0);
hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), 0);
diff --git a/drivers/staging/media/hantro/hantro_g2_regs.h b/drivers/staging/media/hantro/hantro_g2_regs.h
index b7c6f9877b9d..0f43516d0805 100644
--- a/drivers/staging/media/hantro/hantro_g2_regs.h
+++ b/drivers/staging/media/hantro/hantro_g2_regs.h
@@ -107,7 +107,7 @@
#define g2_start_code_e G2_DEC_REG(10, 31, 0x1)
#define g2_init_qp_old G2_DEC_REG(10, 25, 0x3f)
-#define g2_init_qp G2_DEC_REG(10, 24, 0x3f)
+#define g2_init_qp G2_DEC_REG(10, 24, 0x7f)
#define g2_num_tile_cols_old G2_DEC_REG(10, 20, 0x1f)
#define g2_num_tile_cols G2_DEC_REG(10, 19, 0x1f)
#define g2_num_tile_rows_old G2_DEC_REG(10, 15, 0x1f)
diff --git a/drivers/staging/media/hantro/hantro_hevc.c b/drivers/staging/media/hantro/hantro_hevc.c
index b49a41d7ae91..df1f81952bba 100644
--- a/drivers/staging/media/hantro/hantro_hevc.c
+++ b/drivers/staging/media/hantro/hantro_hevc.c
@@ -25,41 +25,20 @@
#define MAX_TILE_COLS 20
#define MAX_TILE_ROWS 22
-#define UNUSED_REF -1
-
-#define G2_ALIGN 16
-
-size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps)
-{
- int bytes_per_pixel = sps->bit_depth_luma_minus8 == 0 ? 1 : 2;
-
- return sps->pic_width_in_luma_samples *
- sps->pic_height_in_luma_samples * bytes_per_pixel;
-}
-
-size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps)
-{
- size_t cr_offset = hantro_hevc_chroma_offset(sps);
-
- return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
-}
-
-static void hantro_hevc_ref_init(struct hantro_ctx *ctx)
+void hantro_hevc_ref_init(struct hantro_ctx *ctx)
{
struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
- int i;
- for (i = 0; i < NUM_REF_PICTURES; i++)
- hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
+ hevc_dec->ref_bufs_used = 0;
}
dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
- int poc)
+ s32 poc)
{
struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
int i;
- /* Find the reference buffer in already know ones */
+ /* Find the reference buffer in already known ones */
for (i = 0; i < NUM_REF_PICTURES; i++) {
if (hevc_dec->ref_bufs_poc[i] == poc) {
hevc_dec->ref_bufs_used |= 1 << i;
@@ -77,7 +56,7 @@ int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr)
/* Add a new reference buffer */
for (i = 0; i < NUM_REF_PICTURES; i++) {
- if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF) {
+ if (!(hevc_dec->ref_bufs_used & 1 << i)) {
hevc_dec->ref_bufs_used |= 1 << i;
hevc_dec->ref_bufs_poc[i] = poc;
hevc_dec->ref_bufs[i].dma = addr;
@@ -88,23 +67,6 @@ int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr)
return -EINVAL;
}
-void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx)
-{
- struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
- int i;
-
- /* Just tag buffer as unused, do not free them */
- for (i = 0; i < NUM_REF_PICTURES; i++) {
- if (hevc_dec->ref_bufs_poc[i] == UNUSED_REF)
- continue;
-
- if (hevc_dec->ref_bufs_used & (1 << i))
- continue;
-
- hevc_dec->ref_bufs_poc[i] = UNUSED_REF;
- }
-}
-
static int tile_buffer_reallocate(struct hantro_ctx *ctx)
{
struct hantro_dev *vpu = ctx->dev;
@@ -192,6 +154,25 @@ static int tile_buffer_reallocate(struct hantro_ctx *ctx)
return -ENOMEM;
}
+static int hantro_hevc_validate_sps(struct hantro_ctx *ctx, const struct v4l2_ctrl_hevc_sps *sps)
+{
+ /*
+ * for tile pixel format check if the width and height match
+ * hardware constraints
+ */
+ if (ctx->vpu_dst_fmt->fourcc == V4L2_PIX_FMT_NV12_4L4) {
+ if (ctx->dst_fmt.width !=
+ ALIGN(sps->pic_width_in_luma_samples, ctx->vpu_dst_fmt->frmsize.step_width))
+ return -EINVAL;
+
+ if (ctx->dst_fmt.height !=
+ ALIGN(sps->pic_height_in_luma_samples, ctx->vpu_dst_fmt->frmsize.step_height))
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
{
struct hantro_hevc_dec_hw_ctx *hevc_ctx = &ctx->hevc_dec;
@@ -215,6 +196,10 @@ int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx)
if (WARN_ON(!ctrls->sps))
return -EINVAL;
+ ret = hantro_hevc_validate_sps(ctx, ctrls->sps);
+ if (ret)
+ return ret;
+
ctrls->pps =
hantro_get_ctrl(ctx, V4L2_CID_MPEG_VIDEO_HEVC_PPS);
if (WARN_ON(!ctrls->pps))
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index ed018e293ba0..68c313864b06 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -18,9 +18,21 @@
#define DEC_8190_ALIGN_MASK 0x07U
#define MB_DIM 16
+#define TILE_MB_DIM 4
#define MB_WIDTH(w) DIV_ROUND_UP(w, MB_DIM)
#define MB_HEIGHT(h) DIV_ROUND_UP(h, MB_DIM)
+#define FMT_MIN_WIDTH 48
+#define FMT_MIN_HEIGHT 48
+#define FMT_HD_WIDTH 1280
+#define FMT_HD_HEIGHT 720
+#define FMT_FHD_WIDTH 1920
+#define FMT_FHD_HEIGHT 1088
+#define FMT_UHD_WIDTH 3840
+#define FMT_UHD_HEIGHT 2160
+#define FMT_4K_WIDTH 4096
+#define FMT_4K_HEIGHT 2304
+
#define NUM_REF_PICTURES (V4L2_HEVC_DPB_ENTRIES_NUM_MAX + 1)
struct hantro_dev;
@@ -131,7 +143,7 @@ struct hantro_hevc_dec_hw_ctx {
struct hantro_aux_buf tile_bsd;
struct hantro_aux_buf ref_bufs[NUM_REF_PICTURES];
struct hantro_aux_buf scaling_lists;
- int ref_bufs_poc[NUM_REF_PICTURES];
+ s32 ref_bufs_poc[NUM_REF_PICTURES];
u32 ref_bufs_used;
struct hantro_hevc_dec_ctrls ctrls;
unsigned int num_tile_cols_allocated;
@@ -300,6 +312,7 @@ extern const struct hantro_variant rk3066_vpu_variant;
extern const struct hantro_variant rk3288_vpu_variant;
extern const struct hantro_variant rk3328_vpu_variant;
extern const struct hantro_variant rk3399_vpu_variant;
+extern const struct hantro_variant rk3568_vpu_variant;
extern const struct hantro_variant sama5d4_vdec_variant;
extern const struct hantro_variant sunxi_vpu_variant;
@@ -337,11 +350,10 @@ int hantro_hevc_dec_init(struct hantro_ctx *ctx);
void hantro_hevc_dec_exit(struct hantro_ctx *ctx);
int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx);
int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx);
-dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, int poc);
+void hantro_hevc_ref_init(struct hantro_ctx *ctx);
+dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, s32 poc);
int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr);
-void hantro_hevc_ref_remove_unused(struct hantro_ctx *ctx);
-size_t hantro_hevc_chroma_offset(const struct v4l2_ctrl_hevc_sps *sps);
-size_t hantro_hevc_motion_vectors_offset(const struct v4l2_ctrl_hevc_sps *sps);
+
static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension)
{
diff --git a/drivers/staging/media/hantro/hantro_v4l2.c b/drivers/staging/media/hantro/hantro_v4l2.c
index 71a6279750bf..93d0dcf69f4a 100644
--- a/drivers/staging/media/hantro/hantro_v4l2.c
+++ b/drivers/staging/media/hantro/hantro_v4l2.c
@@ -260,7 +260,7 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
} else if (ctx->is_encoder) {
vpu_fmt = ctx->vpu_dst_fmt;
} else {
- vpu_fmt = ctx->vpu_src_fmt;
+ vpu_fmt = fmt;
/*
* Width/height on the CAPTURE end of a decoder are ignored and
* replaced by the OUTPUT ones.
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index 9802508bade2..77f574fdfa77 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -83,6 +83,14 @@ static const struct hantro_fmt imx8m_vpu_postproc_fmts[] = {
.fourcc = V4L2_PIX_FMT_YUYV,
.codec_mode = HANTRO_MODE_NONE,
.postprocessed = true,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
};
@@ -90,17 +98,25 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
{
.fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
.codec_mode = HANTRO_MODE_MPEG2_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -109,11 +125,11 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_VP8_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2160,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -122,11 +138,11 @@ static const struct hantro_fmt imx8m_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_H264_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2160,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -137,6 +153,14 @@ static const struct hantro_fmt imx8m_vpu_g2_postproc_fmts[] = {
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
.postprocessed = true,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
};
@@ -144,18 +168,26 @@ static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12_4L4,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = TILE_MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = TILE_MB_DIM,
+ },
},
{
.fourcc = V4L2_PIX_FMT_HEVC_SLICE,
.codec_mode = HANTRO_MODE_HEVC_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
- .step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2160,
- .step_height = MB_DIM,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = TILE_MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = TILE_MB_DIM,
},
},
{
@@ -163,12 +195,12 @@ static const struct hantro_fmt imx8m_vpu_g2_dec_fmts[] = {
.codec_mode = HANTRO_MODE_VP9_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
- .step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2160,
- .step_height = MB_DIM,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = TILE_MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = TILE_MB_DIM,
},
},
};
diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c b/drivers/staging/media/hantro/rockchip_vpu_hw.c
index 163cf92eafca..26e16b5a6a70 100644
--- a/drivers/staging/media/hantro/rockchip_vpu_hw.c
+++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c
@@ -63,6 +63,14 @@ static const struct hantro_fmt rockchip_vpu1_postproc_fmts[] = {
.fourcc = V4L2_PIX_FMT_YUYV,
.codec_mode = HANTRO_MODE_NONE,
.postprocessed = true,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
};
@@ -70,17 +78,25 @@ static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
{
.fourcc = V4L2_PIX_FMT_H264_SLICE,
.codec_mode = HANTRO_MODE_H264_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -89,11 +105,11 @@ static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_MPEG2_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -102,11 +118,11 @@ static const struct hantro_fmt rk3066_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_VP8_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -116,17 +132,25 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_4K_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_4K_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
{
.fourcc = V4L2_PIX_FMT_H264_SLICE,
.codec_mode = HANTRO_MODE_H264_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 4096,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_4K_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2304,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_4K_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -135,11 +159,11 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_MPEG2_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -148,31 +172,80 @@ static const struct hantro_fmt rk3288_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_VP8_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2160,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
.step_height = MB_DIM,
},
},
};
-static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
+static const struct hantro_fmt rockchip_vdpu2_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
{
.fourcc = V4L2_PIX_FMT_H264_SLICE,
.codec_mode = HANTRO_MODE_H264_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
+ .codec_mode = HANTRO_MODE_MPEG2_DEC,
+ .max_depth = 2,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+ {
+ .fourcc = V4L2_PIX_FMT_VP8_FRAME,
+ .codec_mode = HANTRO_MODE_VP8_DEC,
+ .max_depth = 2,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = MB_DIM,
+ },
+ },
+};
+
+static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
+ {
+ .fourcc = V4L2_PIX_FMT_NV12,
+ .codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -181,11 +254,11 @@ static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_MPEG2_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1920,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_FHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 1088,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_FHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -194,11 +267,11 @@ static const struct hantro_fmt rk3399_vpu_dec_fmts[] = {
.codec_mode = HANTRO_MODE_VP8_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 2160,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -516,8 +589,8 @@ const struct hantro_variant rk3288_vpu_variant = {
const struct hantro_variant rk3328_vpu_variant = {
.dec_offset = 0x400,
- .dec_fmts = rk3399_vpu_dec_fmts,
- .num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
+ .dec_fmts = rockchip_vdpu2_dec_fmts,
+ .num_dec_fmts = ARRAY_SIZE(rockchip_vdpu2_dec_fmts),
.codec = HANTRO_MPEG2_DECODER | HANTRO_VP8_DECODER |
HANTRO_H264_DECODER,
.codec_ops = rk3399_vpu_codec_ops,
@@ -528,6 +601,11 @@ const struct hantro_variant rk3328_vpu_variant = {
.num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names),
};
+/*
+ * H.264 decoding explicitly disabled in RK3399.
+ * This ensures userspace applications use the Rockchip VDEC core,
+ * which has better performance.
+ */
const struct hantro_variant rk3399_vpu_variant = {
.enc_offset = 0x0,
.enc_fmts = rockchip_vpu_enc_fmts,
@@ -545,13 +623,27 @@ const struct hantro_variant rk3399_vpu_variant = {
.num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names)
};
+const struct hantro_variant rk3568_vpu_variant = {
+ .dec_offset = 0x400,
+ .dec_fmts = rockchip_vdpu2_dec_fmts,
+ .num_dec_fmts = ARRAY_SIZE(rockchip_vdpu2_dec_fmts),
+ .codec = HANTRO_MPEG2_DECODER |
+ HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
+ .codec_ops = rk3399_vpu_codec_ops,
+ .irqs = rockchip_vdpu2_irqs,
+ .num_irqs = ARRAY_SIZE(rockchip_vdpu2_irqs),
+ .init = rockchip_vpu_hw_init,
+ .clk_names = rockchip_vpu_clk_names,
+ .num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names)
+};
+
const struct hantro_variant px30_vpu_variant = {
.enc_offset = 0x0,
.enc_fmts = rockchip_vpu_enc_fmts,
.num_enc_fmts = ARRAY_SIZE(rockchip_vpu_enc_fmts),
.dec_offset = 0x400,
- .dec_fmts = rk3399_vpu_dec_fmts,
- .num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
+ .dec_fmts = rockchip_vdpu2_dec_fmts,
+ .num_dec_fmts = ARRAY_SIZE(rockchip_vdpu2_dec_fmts),
.codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
.codec_ops = rk3399_vpu_codec_ops,
diff --git a/drivers/staging/media/hantro/sama5d4_vdec_hw.c b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
index b2fc1c5613e1..b205e2db5b04 100644
--- a/drivers/staging/media/hantro/sama5d4_vdec_hw.c
+++ b/drivers/staging/media/hantro/sama5d4_vdec_hw.c
@@ -16,6 +16,14 @@ static const struct hantro_fmt sama5d4_vdec_postproc_fmts[] = {
.fourcc = V4L2_PIX_FMT_YUYV,
.codec_mode = HANTRO_MODE_NONE,
.postprocessed = true,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_HD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_HD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
};
@@ -23,17 +31,25 @@ static const struct hantro_fmt sama5d4_vdec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_HD_WIDTH,
+ .step_width = MB_DIM,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_HD_HEIGHT,
+ .step_height = MB_DIM,
+ },
},
{
.fourcc = V4L2_PIX_FMT_MPEG2_SLICE,
.codec_mode = HANTRO_MODE_MPEG2_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1280,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_HD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 720,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_HD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -42,11 +58,11 @@ static const struct hantro_fmt sama5d4_vdec_fmts[] = {
.codec_mode = HANTRO_MODE_VP8_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1280,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_HD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 720,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_HD_HEIGHT,
.step_height = MB_DIM,
},
},
@@ -55,11 +71,11 @@ static const struct hantro_fmt sama5d4_vdec_fmts[] = {
.codec_mode = HANTRO_MODE_H264_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 1280,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_HD_WIDTH,
.step_width = MB_DIM,
- .min_height = 48,
- .max_height = 720,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_HD_HEIGHT,
.step_height = MB_DIM,
},
},
diff --git a/drivers/staging/media/hantro/sunxi_vpu_hw.c b/drivers/staging/media/hantro/sunxi_vpu_hw.c
index c0edd5856a0c..fbeac81e59e1 100644
--- a/drivers/staging/media/hantro/sunxi_vpu_hw.c
+++ b/drivers/staging/media/hantro/sunxi_vpu_hw.c
@@ -14,6 +14,14 @@ static const struct hantro_fmt sunxi_vpu_postproc_fmts[] = {
.fourcc = V4L2_PIX_FMT_NV12,
.codec_mode = HANTRO_MODE_NONE,
.postprocessed = true,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = 32,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = 32,
+ },
},
};
@@ -21,17 +29,25 @@ static const struct hantro_fmt sunxi_vpu_dec_fmts[] = {
{
.fourcc = V4L2_PIX_FMT_NV12_4L4,
.codec_mode = HANTRO_MODE_NONE,
+ .frmsize = {
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
+ .step_width = 32,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
+ .step_height = 32,
+ },
},
{
.fourcc = V4L2_PIX_FMT_VP9_FRAME,
.codec_mode = HANTRO_MODE_VP9_DEC,
.max_depth = 2,
.frmsize = {
- .min_width = 48,
- .max_width = 3840,
+ .min_width = FMT_MIN_WIDTH,
+ .max_width = FMT_UHD_WIDTH,
.step_width = 32,
- .min_height = 48,
- .max_height = 2160,
+ .min_height = FMT_MIN_HEIGHT,
+ .max_height = FMT_UHD_HEIGHT,
.step_height = 32,
},
},
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
index 44f385be9f6c..04419381ea56 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
@@ -143,10 +143,13 @@ static void cedrus_h265_frame_info_write_dpb(struct cedrus_ctx *ctx,
for (i = 0; i < num_active_dpb_entries; i++) {
int buffer_index = vb2_find_timestamp(vq, dpb[i].timestamp, 0);
u32 pic_order_cnt[2] = {
- dpb[i].pic_order_cnt[0],
- dpb[i].pic_order_cnt[1]
+ dpb[i].pic_order_cnt_val,
+ dpb[i].pic_order_cnt_val
};
+ if (buffer_index < 0)
+ continue;
+
cedrus_h265_frame_info_write_single(ctx, i, dpb[i].field_pic,
pic_order_cnt,
buffer_index);
@@ -301,6 +304,31 @@ static void cedrus_h265_write_scaling_list(struct cedrus_ctx *ctx,
}
}
+static int cedrus_h265_is_low_delay(struct cedrus_run *run)
+{
+ const struct v4l2_ctrl_hevc_slice_params *slice_params;
+ const struct v4l2_hevc_dpb_entry *dpb;
+ s32 poc;
+ int i;
+
+ slice_params = run->h265.slice_params;
+ poc = run->h265.decode_params->pic_order_cnt_val;
+ dpb = run->h265.decode_params->dpb;
+
+ for (i = 0; i < slice_params->num_ref_idx_l0_active_minus1 + 1; i++)
+ if (dpb[slice_params->ref_idx_l0[i]].pic_order_cnt_val > poc)
+ return 1;
+
+ if (slice_params->slice_type != V4L2_HEVC_SLICE_TYPE_B)
+ return 0;
+
+ for (i = 0; i < slice_params->num_ref_idx_l1_active_minus1 + 1; i++)
+ if (dpb[slice_params->ref_idx_l1[i]].pic_order_cnt_val > poc)
+ return 1;
+
+ return 0;
+}
+
static void cedrus_h265_setup(struct cedrus_ctx *ctx,
struct cedrus_run *run)
{
@@ -559,7 +587,6 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
reg = VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_TC_OFFSET_DIV2(slice_params->slice_tc_offset_div2) |
VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_BETA_OFFSET_DIV2(slice_params->slice_beta_offset_div2) |
- VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(decode_params->num_poc_st_curr_after == 0) |
VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CR_QP_OFFSET(slice_params->slice_cr_qp_offset) |
VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CB_QP_OFFSET(slice_params->slice_cb_qp_offset) |
VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_QP_DELTA(slice_params->slice_qp_delta);
@@ -572,6 +599,9 @@ static void cedrus_h265_setup(struct cedrus_ctx *ctx,
V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_LOOP_FILTER_ACROSS_SLICES_ENABLED,
slice_params->flags);
+ if (slice_params->slice_type != V4L2_HEVC_SLICE_TYPE_I && !cedrus_h265_is_low_delay(run))
+ reg |= VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_NOT_LOW_DELAY;
+
cedrus_write(dev, VE_DEC_H265_DEC_SLICE_HDR_INFO1, reg);
chroma_log2_weight_denom = pred_weight_table->luma_log2_weight_denom +
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
index bdb062ad8682..d81f7513ade0 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_regs.h
@@ -377,13 +377,12 @@
#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_DEBLOCKING_FILTER_DISABLED BIT(23)
#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_LOOP_FILTER_ACROSS_SLICES_ENABLED BIT(22)
+#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_FLAG_SLICE_NOT_LOW_DELAY BIT(21)
#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_TC_OFFSET_DIV2(v) \
SHIFT_AND_MASK_BITS(v, 31, 28)
#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_BETA_OFFSET_DIV2(v) \
SHIFT_AND_MASK_BITS(v, 27, 24)
-#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_POC_BIGEST_IN_RPS_ST(v) \
- ((v) ? BIT(21) : 0)
#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CR_QP_OFFSET(v) \
SHIFT_AND_MASK_BITS(v, 20, 16)
#define VE_DEC_H265_DEC_SLICE_HDR_INFO1_SLICE_CB_QP_OFFSET(v) \
diff --git a/drivers/staging/rtl8192u/r8192U.h b/drivers/staging/rtl8192u/r8192U.h
index 14ca00a2789b..1942cb849374 100644
--- a/drivers/staging/rtl8192u/r8192U.h
+++ b/drivers/staging/rtl8192u/r8192U.h
@@ -1013,7 +1013,7 @@ typedef struct r8192_priv {
bool bis_any_nonbepkts;
bool bcurrent_turbo_EDCA;
bool bis_cur_rdlstate;
- struct timer_list fsync_timer;
+ struct delayed_work fsync_work;
bool bfsync_processing; /* 500ms Fsync timer is active or not */
u32 rate_record;
u32 rateCountDiffRecord;
diff --git a/drivers/staging/rtl8192u/r8192U_dm.c b/drivers/staging/rtl8192u/r8192U_dm.c
index 725bf5ca9e34..0fcfcaa6500b 100644
--- a/drivers/staging/rtl8192u/r8192U_dm.c
+++ b/drivers/staging/rtl8192u/r8192U_dm.c
@@ -2578,19 +2578,20 @@ static void dm_init_fsync(struct net_device *dev)
priv->ieee80211->fsync_seconddiff_ratethreshold = 200;
priv->ieee80211->fsync_state = Default_Fsync;
priv->framesyncMonitor = 1; /* current default 0xc38 monitor on */
- timer_setup(&priv->fsync_timer, dm_fsync_timer_callback, 0);
+ INIT_DELAYED_WORK(&priv->fsync_work, dm_fsync_work_callback);
}
static void dm_deInit_fsync(struct net_device *dev)
{
struct r8192_priv *priv = ieee80211_priv(dev);
- del_timer_sync(&priv->fsync_timer);
+ cancel_delayed_work_sync(&priv->fsync_work);
}
-void dm_fsync_timer_callback(struct timer_list *t)
+void dm_fsync_work_callback(struct work_struct *work)
{
- struct r8192_priv *priv = from_timer(priv, t, fsync_timer);
+ struct r8192_priv *priv =
+ container_of(work, struct r8192_priv, fsync_work.work);
struct net_device *dev = priv->ieee80211->dev;
u32 rate_index, rate_count = 0, rate_count_diff = 0;
bool bSwitchFromCountDiff = false;
@@ -2657,17 +2658,16 @@ void dm_fsync_timer_callback(struct timer_list *t)
}
}
if (bDoubleTimeInterval) {
- if (timer_pending(&priv->fsync_timer))
- del_timer_sync(&priv->fsync_timer);
- priv->fsync_timer.expires = jiffies +
- msecs_to_jiffies(priv->ieee80211->fsync_time_interval*priv->ieee80211->fsync_multiple_timeinterval);
- add_timer(&priv->fsync_timer);
+ cancel_delayed_work_sync(&priv->fsync_work);
+ schedule_delayed_work(&priv->fsync_work,
+ msecs_to_jiffies(priv
+ ->ieee80211->fsync_time_interval *
+ priv->ieee80211->fsync_multiple_timeinterval));
} else {
- if (timer_pending(&priv->fsync_timer))
- del_timer_sync(&priv->fsync_timer);
- priv->fsync_timer.expires = jiffies +
- msecs_to_jiffies(priv->ieee80211->fsync_time_interval);
- add_timer(&priv->fsync_timer);
+ cancel_delayed_work_sync(&priv->fsync_work);
+ schedule_delayed_work(&priv->fsync_work,
+ msecs_to_jiffies(priv
+ ->ieee80211->fsync_time_interval));
}
} else {
/* Let Register return to default value; */
@@ -2695,7 +2695,7 @@ static void dm_EndSWFsync(struct net_device *dev)
struct r8192_priv *priv = ieee80211_priv(dev);
RT_TRACE(COMP_HALDM, "%s\n", __func__);
- del_timer_sync(&(priv->fsync_timer));
+ cancel_delayed_work_sync(&priv->fsync_work);
/* Let Register return to default value; */
if (priv->bswitch_fsync) {
@@ -2736,11 +2736,9 @@ static void dm_StartSWFsync(struct net_device *dev)
if (priv->ieee80211->fsync_rate_bitmap & rateBitmap)
priv->rate_record += priv->stats.received_rate_histogram[1][rateIndex];
}
- if (timer_pending(&priv->fsync_timer))
- del_timer_sync(&priv->fsync_timer);
- priv->fsync_timer.expires = jiffies +
- msecs_to_jiffies(priv->ieee80211->fsync_time_interval);
- add_timer(&priv->fsync_timer);
+ cancel_delayed_work_sync(&priv->fsync_work);
+ schedule_delayed_work(&priv->fsync_work,
+ msecs_to_jiffies(priv->ieee80211->fsync_time_interval));
write_nic_dword(dev, rOFDM0_RxDetector2, 0x465c12cd);
}
diff --git a/drivers/staging/rtl8192u/r8192U_dm.h b/drivers/staging/rtl8192u/r8192U_dm.h
index 0b2a1c688597..2159018b4e38 100644
--- a/drivers/staging/rtl8192u/r8192U_dm.h
+++ b/drivers/staging/rtl8192u/r8192U_dm.h
@@ -166,7 +166,7 @@ void dm_force_tx_fw_info(struct net_device *dev,
void dm_init_edca_turbo(struct net_device *dev);
void dm_rf_operation_test_callback(unsigned long data);
void dm_rf_pathcheck_workitemcallback(struct work_struct *work);
-void dm_fsync_timer_callback(struct timer_list *t);
+void dm_fsync_work_callback(struct work_struct *work);
void dm_cck_txpower_adjust(struct net_device *dev, bool binch14);
void dm_shadow_init(struct net_device *dev);
void dm_initialize_txpower_tracking(struct net_device *dev);
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index 1c4aac8464a7..1e5a78131aba 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -813,12 +813,13 @@ static const struct attribute_group cooling_device_stats_attr_group = {
static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
{
+ const struct attribute_group *stats_attr_group = NULL;
struct cooling_dev_stats *stats;
unsigned long states;
int var;
if (cdev->ops->get_max_state(cdev, &states))
- return;
+ goto out;
states++; /* Total number of states is highest state + 1 */
@@ -828,7 +829,7 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
stats = kzalloc(var, GFP_KERNEL);
if (!stats)
- return;
+ goto out;
stats->time_in_state = (ktime_t *)(stats + 1);
stats->trans_table = (unsigned int *)(stats->time_in_state + states);
@@ -838,9 +839,12 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
spin_lock_init(&stats->lock);
+ stats_attr_group = &cooling_device_stats_attr_group;
+
+out:
/* Fill the empty slot left in cooling_device_attr_groups */
var = ARRAY_SIZE(cooling_device_attr_groups) - 2;
- cooling_device_attr_groups[var] = &cooling_device_stats_attr_group;
+ cooling_device_attr_groups[var] = stats_attr_group;
}
static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev)
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 9f2a8c0e1e33..3dd71ede4cd4 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -5,6 +5,14 @@
*
* * THIS IS A DEVELOPMENT SNAPSHOT IT IS NOT A FINAL RELEASE *
*
+ * Outgoing path:
+ * tty -> DLCI fifo -> scheduler -> GSM MUX data queue ---o-> ldisc
+ * control message -> GSM MUX control queue --´
+ *
+ * Incoming path:
+ * ldisc -> gsm_queue() -o--> tty
+ * `-> gsm_control_response()
+ *
* TO DO:
* Mostly done: ioctls for setting modes/timing
* Partly done: hooks so you can pull off frames to non tty devs
@@ -210,6 +218,9 @@ struct gsm_mux {
/* Events on the GSM channel */
wait_queue_head_t event;
+ /* ldisc send work */
+ struct work_struct tx_work;
+
/* Bits for GSM mode decoding */
/* Framing Layer */
@@ -235,14 +246,17 @@ struct gsm_mux {
struct gsm_dlci *dlci[NUM_DLCI];
int old_c_iflag; /* termios c_iflag value before attach */
bool constipated; /* Asked by remote to shut up */
+ bool has_devices; /* Devices were registered */
spinlock_t tx_lock;
unsigned int tx_bytes; /* TX data outstanding */
#define TX_THRESH_HI 8192
#define TX_THRESH_LO 2048
- struct list_head tx_list; /* Pending data packets */
+ struct list_head tx_ctrl_list; /* Pending control packets */
+ struct list_head tx_data_list; /* Pending data packets */
/* Control messages */
+ struct timer_list kick_timer; /* Kick TX queuing on timeout */
struct timer_list t2_timer; /* Retransmit timer for commands */
int cretries; /* Command retry counter */
struct gsm_control *pending_cmd;/* Our current pending command */
@@ -369,6 +383,11 @@ static const u8 gsm_fcs8[256] = {
static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len);
static int gsm_modem_update(struct gsm_dlci *dlci, u8 brk);
+static struct gsm_msg *gsm_data_alloc(struct gsm_mux *gsm, u8 addr, int len,
+ u8 ctrl);
+static int gsm_send_packet(struct gsm_mux *gsm, struct gsm_msg *msg);
+static void gsmld_write_trigger(struct gsm_mux *gsm);
+static void gsmld_write_task(struct work_struct *work);
/**
* gsm_fcs_add - update FCS
@@ -419,6 +438,27 @@ static int gsm_read_ea(unsigned int *val, u8 c)
return c & EA;
}
+/**
+ * gsm_read_ea_val - read a value until EA
+ * @val: variable holding value
+ * @data: buffer of data
+ * @dlen: length of data
+ *
+ * Processes an EA value. Updates the passed variable and
+ * returns the processed data length.
+ */
+static unsigned int gsm_read_ea_val(unsigned int *val, const u8 *data, int dlen)
+{
+ unsigned int len = 0;
+
+ for (; dlen > 0; dlen--) {
+ len++;
+ if (gsm_read_ea(val, *data++))
+ break;
+ }
+ return len;
+}
+
/**
* gsm_encode_modem - encode modem data bits
* @dlci: DLCI to encode from
@@ -463,6 +503,68 @@ static void gsm_hex_dump_bytes(const char *fname, const u8 *data,
kfree(prefix);
}
+/**
+ * gsm_register_devices - register all tty devices for a given mux index
+ *
+ * @driver: the tty driver that describes the tty devices
+ * @index: the mux number is used to calculate the minor numbers of the
+ * ttys for this mux and may differ from the position in the
+ * mux array.
+ */
+static int gsm_register_devices(struct tty_driver *driver, unsigned int index)
+{
+ struct device *dev;
+ int i;
+ unsigned int base;
+
+ if (!driver || index >= MAX_MUX)
+ return -EINVAL;
+
+ base = index * NUM_DLCI; /* first minor for this index */
+ for (i = 1; i < NUM_DLCI; i++) {
+ /* Don't register device 0 - this is the control channel
+ * and not a usable tty interface
+ */
+ dev = tty_register_device(gsm_tty_driver, base + i, NULL);
+ if (IS_ERR(dev)) {
+ if (debug & 8)
+ pr_info("%s failed to register device minor %u",
+ __func__, base + i);
+ for (i--; i >= 1; i--)
+ tty_unregister_device(gsm_tty_driver, base + i);
+ return PTR_ERR(dev);
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * gsm_unregister_devices - unregister all tty devices for a given mux index
+ *
+ * @driver: the tty driver that describes the tty devices
+ * @index: the mux number is used to calculate the minor numbers of the
+ * ttys for this mux and may differ from the position in the
+ * mux array.
+ */
+static void gsm_unregister_devices(struct tty_driver *driver,
+ unsigned int index)
+{
+ int i;
+ unsigned int base;
+
+ if (!driver || index >= MAX_MUX)
+ return;
+
+ base = index * NUM_DLCI; /* first minor for this index */
+ for (i = 1; i < NUM_DLCI; i++) {
+ /* Don't unregister device 0 - this is the control
+ * channel and not a usable tty interface
+ */
+ tty_unregister_device(gsm_tty_driver, base + i);
+ }
+}
+
/**
* gsm_print_packet - display a frame for debug
* @hdr: header to print before decode
@@ -570,57 +672,73 @@ static int gsm_stuff_frame(const u8 *input, u8 *output, int len)
* @cr: command/response bit seen as initiator
* @control: control byte including PF bit
*
- * Format up and transmit a control frame. These do not go via the
- * queueing logic as they should be transmitted ahead of data when
- * they are needed.
- *
- * FIXME: Lock versus data TX path
+ * Format up and transmit a control frame. These should be transmitted
+ * ahead of data when they are needed.
*/
-
-static void gsm_send(struct gsm_mux *gsm, int addr, int cr, int control)
+static int gsm_send(struct gsm_mux *gsm, int addr, int cr, int control)
{
- int len;
- u8 cbuf[10];
- u8 ibuf[3];
+ struct gsm_msg *msg;
+ u8 *dp;
int ocr;
+ unsigned long flags;
+
+ msg = gsm_data_alloc(gsm, addr, 0, control);
+ if (!msg)
+ return -ENOMEM;
/* toggle C/R coding if not initiator */
ocr = cr ^ (gsm->initiator ? 0 : 1);
- switch (gsm->encoding) {
- case 0:
- cbuf[0] = GSM0_SOF;
- cbuf[1] = (addr << 2) | (ocr << 1) | EA;
- cbuf[2] = control;
- cbuf[3] = EA; /* Length of data = 0 */
- cbuf[4] = 0xFF - gsm_fcs_add_block(INIT_FCS, cbuf + 1, 3);
- cbuf[5] = GSM0_SOF;
- len = 6;
- break;
- case 1:
- case 2:
- /* Control frame + packing (but not frame stuffing) in mode 1 */
- ibuf[0] = (addr << 2) | (ocr << 1) | EA;
- ibuf[1] = control;
- ibuf[2] = 0xFF - gsm_fcs_add_block(INIT_FCS, ibuf, 2);
- /* Stuffing may double the size worst case */
- len = gsm_stuff_frame(ibuf, cbuf + 1, 3);
- /* Now add the SOF markers */
- cbuf[0] = GSM1_SOF;
- cbuf[len + 1] = GSM1_SOF;
- /* FIXME: we can omit the lead one in many cases */
- len += 2;
- break;
- default:
- WARN_ON(1);
- return;
- }
- gsmld_output(gsm, cbuf, len);
- if (!gsm->initiator) {
- cr = cr & gsm->initiator;
- control = control & ~PF;
+ msg->data -= 3;
+ dp = msg->data;
+ *dp++ = (addr << 2) | (ocr << 1) | EA;
+ *dp++ = control;
+
+ if (gsm->encoding == 0)
+ *dp++ = EA; /* Length of data = 0 */
+
+ *dp = 0xFF - gsm_fcs_add_block(INIT_FCS, msg->data, dp - msg->data);
+ msg->len = (dp - msg->data) + 1;
+
+ gsm_print_packet("Q->", addr, cr, control, NULL, 0);
+
+ spin_lock_irqsave(&gsm->tx_lock, flags);
+ list_add_tail(&msg->list, &gsm->tx_ctrl_list);
+ gsm->tx_bytes += msg->len;
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+ gsmld_write_trigger(gsm);
+
+ return 0;
+}
+
+/**
+ * gsm_dlci_clear_queues - remove outstanding data for a DLCI
+ * @gsm: mux
+ * @dlci: clear for this DLCI
+ *
+ * Clears the data queues for a given DLCI.
+ */
+static void gsm_dlci_clear_queues(struct gsm_mux *gsm, struct gsm_dlci *dlci)
+{
+ struct gsm_msg *msg, *nmsg;
+ int addr = dlci->addr;
+ unsigned long flags;
+
+ /* Clear DLCI write fifo first */
+ spin_lock_irqsave(&dlci->lock, flags);
+ kfifo_reset(&dlci->fifo);
+ spin_unlock_irqrestore(&dlci->lock, flags);
+
+ /* Clear data packets in MUX write queue */
+ spin_lock_irqsave(&gsm->tx_lock, flags);
+ list_for_each_entry_safe(msg, nmsg, &gsm->tx_data_list, list) {
+ if (msg->addr != addr)
+ continue;
+ gsm->tx_bytes -= msg->len;
+ list_del(&msg->list);
+ kfree(msg);
}
- gsm_print_packet("-->", addr, cr, control, NULL, 0);
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
}
/**
@@ -683,59 +801,151 @@ static struct gsm_msg *gsm_data_alloc(struct gsm_mux *gsm, u8 addr, int len,
}
/**
- * gsm_data_kick - poke the queue
+ * gsm_send_packet - sends a single packet
* @gsm: GSM Mux
- * @dlci: DLCI sending the data
+ * @msg: packet to send
*
- * The tty device has called us to indicate that room has appeared in
- * the transmit queue. Ram more data into the pipe if we have any
- * If we have been flow-stopped by a CMD_FCOFF, then we can only
- * send messages on DLCI0 until CMD_FCON
+ * The given packet is encoded and sent out. No memory is freed.
+ * The caller must hold the gsm tx lock.
+ */
+static int gsm_send_packet(struct gsm_mux *gsm, struct gsm_msg *msg)
+{
+ int len, ret;
+
+
+ if (gsm->encoding == 0) {
+ gsm->txframe[0] = GSM0_SOF;
+ memcpy(gsm->txframe + 1, msg->data, msg->len);
+ gsm->txframe[msg->len + 1] = GSM0_SOF;
+ len = msg->len + 2;
+ } else {
+ gsm->txframe[0] = GSM1_SOF;
+ len = gsm_stuff_frame(msg->data, gsm->txframe + 1, msg->len);
+ gsm->txframe[len + 1] = GSM1_SOF;
+ len += 2;
+ }
+
+ if (debug & 4)
+ gsm_hex_dump_bytes(__func__, gsm->txframe, len);
+ gsm_print_packet("-->", msg->addr, gsm->initiator, msg->ctrl, msg->data,
+ msg->len);
+
+ ret = gsmld_output(gsm, gsm->txframe, len);
+ if (ret <= 0)
+ return ret;
+ /* FIXME: Can eliminate one SOF in many more cases */
+ gsm->tx_bytes -= msg->len;
+
+ return 0;
+}
+
+/**
+ * gsm_is_flow_ctrl_msg - checks if flow control message
+ * @msg: message to check
*
- * FIXME: lock against link layer control transmissions
+ * Returns true if the given message is a flow control command of the
+ * control channel. False is returned in any other case.
*/
+static bool gsm_is_flow_ctrl_msg(struct gsm_msg *msg)
+{
+ unsigned int cmd;
+
+ if (msg->addr > 0)
+ return false;
+
+ switch (msg->ctrl & ~PF) {
+ case UI:
+ case UIH:
+ cmd = 0;
+ if (gsm_read_ea_val(&cmd, msg->data + 2, msg->len - 2) < 1)
+ break;
+ switch (cmd & ~PF) {
+ case CMD_FCOFF:
+ case CMD_FCON:
+ return true;
+ }
+ break;
+ }
+
+ return false;
+}
-static void gsm_data_kick(struct gsm_mux *gsm, struct gsm_dlci *dlci)
+/**
+ * gsm_data_kick - poke the queue
+ * @gsm: GSM Mux
+ *
+ * The tty device has called us to indicate that room has appeared in
+ * the transmit queue. Ram more data into the pipe if we have any.
+ * If we have been flow-stopped by a CMD_FCOFF, then we can only
+ * send messages on DLCI0 until CMD_FCON. The caller must hold
+ * the gsm tx lock.
+ */
+static int gsm_data_kick(struct gsm_mux *gsm)
{
struct gsm_msg *msg, *nmsg;
- int len;
+ struct gsm_dlci *dlci;
+ int ret;
- list_for_each_entry_safe(msg, nmsg, &gsm->tx_list, list) {
- if (gsm->constipated && msg->addr)
- continue;
- if (gsm->encoding != 0) {
- gsm->txframe[0] = GSM1_SOF;
- len = gsm_stuff_frame(msg->data,
- gsm->txframe + 1, msg->len);
- gsm->txframe[len + 1] = GSM1_SOF;
- len += 2;
- } else {
- gsm->txframe[0] = GSM0_SOF;
- memcpy(gsm->txframe + 1 , msg->data, msg->len);
- gsm->txframe[msg->len + 1] = GSM0_SOF;
- len = msg->len + 2;
- }
+ clear_bit(TTY_DO_WRITE_WAKEUP, &gsm->tty->flags);
- if (debug & 4)
- gsm_hex_dump_bytes(__func__, gsm->txframe, len);
- if (gsmld_output(gsm, gsm->txframe, len) <= 0)
+ /* Serialize control messages and control channel messages first */
+ list_for_each_entry_safe(msg, nmsg, &gsm->tx_ctrl_list, list) {
+ if (gsm->constipated && !gsm_is_flow_ctrl_msg(msg))
+ continue;
+ ret = gsm_send_packet(gsm, msg);
+ switch (ret) {
+ case -ENOSPC:
+ return -ENOSPC;
+ case -ENODEV:
+ /* ldisc not open */
+ gsm->tx_bytes -= msg->len;
+ list_del(&msg->list);
+ kfree(msg);
+ continue;
+ default:
+ if (ret >= 0) {
+ list_del(&msg->list);
+ kfree(msg);
+ }
break;
- /* FIXME: Can eliminate one SOF in many more cases */
- gsm->tx_bytes -= msg->len;
-
- list_del(&msg->list);
- kfree(msg);
+ }
+ }
- if (dlci) {
- tty_port_tty_wakeup(&dlci->port);
- } else {
- int i = 0;
+ if (gsm->constipated)
+ return -EAGAIN;
- for (i = 0; i < NUM_DLCI; i++)
- if (gsm->dlci[i])
- tty_port_tty_wakeup(&gsm->dlci[i]->port);
+ /* Serialize other channels */
+ if (list_empty(&gsm->tx_data_list))
+ return 0;
+ list_for_each_entry_safe(msg, nmsg, &gsm->tx_data_list, list) {
+ dlci = gsm->dlci[msg->addr];
+ /* Send only messages for DLCIs with valid state */
+ if (dlci->state != DLCI_OPEN) {
+ gsm->tx_bytes -= msg->len;
+ list_del(&msg->list);
+ kfree(msg);
+ continue;
+ }
+ ret = gsm_send_packet(gsm, msg);
+ switch (ret) {
+ case -ENOSPC:
+ return -ENOSPC;
+ case -ENODEV:
+ /* ldisc not open */
+ gsm->tx_bytes -= msg->len;
+ list_del(&msg->list);
+ kfree(msg);
+ continue;
+ default:
+ if (ret >= 0) {
+ list_del(&msg->list);
+ kfree(msg);
+ }
+ break;
}
}
+
+ return 1;
}
/**
@@ -784,9 +994,22 @@ static void __gsm_data_queue(struct gsm_dlci *dlci, struct gsm_msg *msg)
msg->data = dp;
/* Add to the actual output queue */
- list_add_tail(&msg->list, &gsm->tx_list);
+ switch (msg->ctrl & ~PF) {
+ case UI:
+ case UIH:
+ if (msg->addr > 0) {
+ list_add_tail(&msg->list, &gsm->tx_data_list);
+ break;
+ }
+ fallthrough;
+ default:
+ list_add_tail(&msg->list, &gsm->tx_ctrl_list);
+ break;
+ }
gsm->tx_bytes += msg->len;
- gsm_data_kick(gsm, dlci);
+
+ gsmld_write_trigger(gsm);
+ mod_timer(&gsm->kick_timer, jiffies + 10 * gsm->t1 * HZ / 100);
}
/**
@@ -823,41 +1046,48 @@ static int gsm_dlci_data_output(struct gsm_mux *gsm, struct gsm_dlci *dlci)
{
struct gsm_msg *msg;
u8 *dp;
- int len, total_size, size;
- int h = dlci->adaption - 1;
+ int h, len, size;
- total_size = 0;
- while (1) {
- len = kfifo_len(&dlci->fifo);
- if (len == 0)
- return total_size;
-
- /* MTU/MRU count only the data bits */
- if (len > gsm->mtu)
- len = gsm->mtu;
-
- size = len + h;
-
- msg = gsm_data_alloc(gsm, dlci->addr, size, gsm->ftype);
- /* FIXME: need a timer or something to kick this so it can't
- get stuck with no work outstanding and no buffer free */
- if (msg == NULL)
- return -ENOMEM;
- dp = msg->data;
- switch (dlci->adaption) {
- case 1: /* Unstructured */
- break;
- case 2: /* Unstructed with modem bits.
- Always one byte as we never send inline break data */
- *dp++ = (gsm_encode_modem(dlci) << 1) | EA;
- break;
- }
- WARN_ON(kfifo_out_locked(&dlci->fifo, dp , len, &dlci->lock) != len);
- __gsm_data_queue(dlci, msg);
- total_size += size;
+ /* for modem bits without break data */
+ h = ((dlci->adaption == 1) ? 0 : 1);
+
+ len = kfifo_len(&dlci->fifo);
+ if (len == 0)
+ return 0;
+
+ /* MTU/MRU count only the data bits but watch adaption mode */
+ if ((len + h) > gsm->mtu)
+ len = gsm->mtu - h;
+
+ size = len + h;
+
+ msg = gsm_data_alloc(gsm, dlci->addr, size, gsm->ftype);
+ if (!msg)
+ return -ENOMEM;
+ dp = msg->data;
+ switch (dlci->adaption) {
+ case 1: /* Unstructured */
+ break;
+ case 2: /* Unstructured with modem bits.
+ * Always one byte as we never send inline break data
+ */
+ *dp++ = (gsm_encode_modem(dlci) << 1) | EA;
+ break;
+ default:
+ pr_err("%s: unsupported adaption %d\n", __func__,
+ dlci->adaption);
+ break;
}
+
+ WARN_ON(len != kfifo_out_locked(&dlci->fifo, dp, len,
+ &dlci->lock));
+
+ /* Notify upper layer about available send space. */
+ tty_port_tty_wakeup(&dlci->port);
+
+ __gsm_data_queue(dlci, msg);
/* Bytes of data we used up */
- return total_size;
+ return size;
}
/**
@@ -908,9 +1138,6 @@ static int gsm_dlci_data_output_framed(struct gsm_mux *gsm,
size = len + overhead;
msg = gsm_data_alloc(gsm, dlci->addr, size, gsm->ftype);
-
- /* FIXME: need a timer or something to kick this so it can't
- get stuck with no work outstanding and no buffer free */
if (msg == NULL) {
skb_queue_tail(&dlci->skb_list, dlci->skb);
dlci->skb = NULL;
@@ -1006,32 +1233,43 @@ static int gsm_dlci_modem_output(struct gsm_mux *gsm, struct gsm_dlci *dlci,
* renegotiate DLCI priorities with optional stuff. Needs optimising.
*/
-static void gsm_dlci_data_sweep(struct gsm_mux *gsm)
+static int gsm_dlci_data_sweep(struct gsm_mux *gsm)
{
- int len;
/* Priority ordering: We should do priority with RR of the groups */
- int i = 1;
-
- while (i < NUM_DLCI) {
- struct gsm_dlci *dlci;
+ int i, len, ret = 0;
+ bool sent;
+ struct gsm_dlci *dlci;
- if (gsm->tx_bytes > TX_THRESH_HI)
- break;
- dlci = gsm->dlci[i];
- if (dlci == NULL || dlci->constipated) {
- i++;
- continue;
+ while (gsm->tx_bytes < TX_THRESH_HI) {
+ for (sent = false, i = 1; i < NUM_DLCI; i++) {
+ dlci = gsm->dlci[i];
+ /* skip unused or blocked channel */
+ if (!dlci || dlci->constipated)
+ continue;
+ /* skip channels with invalid state */
+ if (dlci->state != DLCI_OPEN)
+ continue;
+ /* count the sent data per adaption */
+ if (dlci->adaption < 3 && !dlci->net)
+ len = gsm_dlci_data_output(gsm, dlci);
+ else
+ len = gsm_dlci_data_output_framed(gsm, dlci);
+ /* on error exit */
+ if (len < 0)
+ return ret;
+ if (len > 0) {
+ ret++;
+ sent = true;
+ /* The lower DLCs can starve the higher DLCs! */
+ break;
+ }
+ /* try next */
}
- if (dlci->adaption < 3 && !dlci->net)
- len = gsm_dlci_data_output(gsm, dlci);
- else
- len = gsm_dlci_data_output_framed(gsm, dlci);
- if (len < 0)
+ if (!sent)
break;
- /* DLCI empty - try the next */
- if (len == 0)
- i++;
- }
+ };
+
+ return ret;
}
/**
@@ -1277,7 +1515,6 @@ static void gsm_control_message(struct gsm_mux *gsm, unsigned int command,
const u8 *data, int clen)
{
u8 buf[1];
- unsigned long flags;
switch (command) {
case CMD_CLD: {
@@ -1299,9 +1536,7 @@ static void gsm_control_message(struct gsm_mux *gsm, unsigned int command,
gsm->constipated = false;
gsm_control_reply(gsm, CMD_FCON, NULL, 0);
/* Kick the link in case it is idling */
- spin_lock_irqsave(&gsm->tx_lock, flags);
- gsm_data_kick(gsm, NULL);
- spin_unlock_irqrestore(&gsm->tx_lock, flags);
+ gsmld_write_trigger(gsm);
break;
case CMD_FCOFF:
/* Modem wants us to STFU */
@@ -1407,7 +1642,7 @@ static void gsm_control_retransmit(struct timer_list *t)
spin_lock_irqsave(&gsm->control_lock, flags);
ctrl = gsm->pending_cmd;
if (ctrl) {
- if (gsm->cretries == 0) {
+ if (gsm->cretries == 0 || !gsm->dlci[0] || gsm->dlci[0]->dead) {
gsm->pending_cmd = NULL;
ctrl->error = -ETIMEDOUT;
ctrl->done = 1;
@@ -1504,25 +1739,24 @@ static int gsm_control_wait(struct gsm_mux *gsm, struct gsm_control *control)
static void gsm_dlci_close(struct gsm_dlci *dlci)
{
- unsigned long flags;
-
del_timer(&dlci->t1);
if (debug & 8)
pr_debug("DLCI %d goes closed.\n", dlci->addr);
dlci->state = DLCI_CLOSED;
+ /* Prevent us from sending data before the link is up again */
+ dlci->constipated = true;
if (dlci->addr != 0) {
tty_port_tty_hangup(&dlci->port, false);
- spin_lock_irqsave(&dlci->lock, flags);
- kfifo_reset(&dlci->fifo);
- spin_unlock_irqrestore(&dlci->lock, flags);
+ gsm_dlci_clear_queues(dlci->gsm, dlci);
/* Ensure that gsmtty_open() can return. */
tty_port_set_initialized(&dlci->port, 0);
wake_up_interruptible(&dlci->port.open_wait);
} else
dlci->gsm->dead = true;
- wake_up(&dlci->gsm->event);
/* A DLCI 0 close is a MUX termination so we need to kick that
back to userspace somehow */
+ gsm_dlci_data_kick(dlci);
+ wake_up(&dlci->gsm->event);
}
/**
@@ -1539,11 +1773,13 @@ static void gsm_dlci_open(struct gsm_dlci *dlci)
del_timer(&dlci->t1);
/* This will let a tty open continue */
dlci->state = DLCI_OPEN;
+ dlci->constipated = false;
if (debug & 8)
pr_debug("DLCI %d goes open.\n", dlci->addr);
/* Send current modem state */
if (dlci->addr)
gsm_modem_update(dlci, 0);
+ gsm_dlci_data_kick(dlci);
wake_up(&dlci->gsm->event);
}
@@ -1569,8 +1805,8 @@ static void gsm_dlci_t1(struct timer_list *t)
switch (dlci->state) {
case DLCI_OPENING:
- dlci->retries--;
if (dlci->retries) {
+ dlci->retries--;
gsm_command(dlci->gsm, dlci->addr, SABM|PF);
mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
} else if (!dlci->addr && gsm->control == (DM | PF)) {
@@ -1585,8 +1821,8 @@ static void gsm_dlci_t1(struct timer_list *t)
break;
case DLCI_CLOSING:
- dlci->retries--;
if (dlci->retries) {
+ dlci->retries--;
gsm_command(dlci->gsm, dlci->addr, DISC|PF);
mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
} else
@@ -1619,6 +1855,25 @@ static void gsm_dlci_begin_open(struct gsm_dlci *dlci)
mod_timer(&dlci->t1, jiffies + gsm->t1 * HZ / 100);
}
+/**
+ * gsm_dlci_set_opening - change state to opening
+ * @dlci: DLCI to open
+ *
+ * Change internal state to wait for DLCI open from initiator side.
+ * We set off timers and responses upon reception of an SABM.
+ */
+static void gsm_dlci_set_opening(struct gsm_dlci *dlci)
+{
+ switch (dlci->state) {
+ case DLCI_CLOSED:
+ case DLCI_CLOSING:
+ dlci->state = DLCI_OPENING;
+ break;
+ default:
+ break;
+ }
+}
+
/**
* gsm_dlci_begin_close - start channel open procedure
* @dlci: DLCI to open
@@ -1728,6 +1983,30 @@ static void gsm_dlci_command(struct gsm_dlci *dlci, const u8 *data, int len)
}
}
+/**
+ * gsm_kick_timer - transmit if possible
+ * @t: timer contained in our gsm object
+ *
+ * Transmit data from DLCIs if the queue is empty. We can't rely on
+ * a tty wakeup except when we filled the pipe so we need to fire off
+ * new data ourselves in other cases.
+ */
+static void gsm_kick_timer(struct timer_list *t)
+{
+ struct gsm_mux *gsm = from_timer(gsm, t, kick_timer);
+ unsigned long flags;
+ int sent = 0;
+
+ spin_lock_irqsave(&gsm->tx_lock, flags);
+ /* If we have nothing running then we need to fire up */
+ if (gsm->tx_bytes < TX_THRESH_LO)
+ sent = gsm_dlci_data_sweep(gsm);
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+
+ if (sent && debug & 4)
+ pr_info("%s TX queue stalled\n", __func__);
+}
+
/*
* Allocate/Free DLCI channels
*/
@@ -1762,10 +2041,13 @@ static struct gsm_dlci *gsm_dlci_alloc(struct gsm_mux *gsm, int addr)
dlci->addr = addr;
dlci->adaption = gsm->adaption;
dlci->state = DLCI_CLOSED;
- if (addr)
+ if (addr) {
dlci->data = gsm_dlci_data;
- else
+ /* Prevent us from sending data before the link is up */
+ dlci->constipated = true;
+ } else {
dlci->data = gsm_dlci_command;
+ }
gsm->dlci[addr] = dlci;
return dlci;
}
@@ -1929,7 +2211,7 @@ static void gsm_queue(struct gsm_mux *gsm)
goto invalid;
#endif
if (dlci == NULL || dlci->state != DLCI_OPEN) {
- gsm_command(gsm, address, DM|PF);
+ gsm_response(gsm, address, DM|PF);
return;
}
dlci->data(dlci, gsm->buf, gsm->len);
@@ -2052,7 +2334,7 @@ static void gsm1_receive(struct gsm_mux *gsm, unsigned char c)
} else if ((c & ISO_IEC_646_MASK) == XOFF) {
gsm->constipated = false;
/* Kick the link in case it is idling */
- gsm_data_kick(gsm, NULL);
+ gsmld_write_trigger(gsm);
return;
}
if (c == GSM1_SOF) {
@@ -2180,18 +2462,29 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
}
/* Finish outstanding timers, making sure they are done */
+ del_timer_sync(&gsm->kick_timer);
del_timer_sync(&gsm->t2_timer);
+ /* Finish writing to ldisc */
+ flush_work(&gsm->tx_work);
+
/* Free up any link layer users and finally the control channel */
+ if (gsm->has_devices) {
+ gsm_unregister_devices(gsm_tty_driver, gsm->num);
+ gsm->has_devices = false;
+ }
for (i = NUM_DLCI - 1; i >= 0; i--)
if (gsm->dlci[i])
gsm_dlci_release(gsm->dlci[i]);
mutex_unlock(&gsm->mutex);
/* Now wipe the queues */
tty_ldisc_flush(gsm->tty);
- list_for_each_entry_safe(txq, ntxq, &gsm->tx_list, list)
+ list_for_each_entry_safe(txq, ntxq, &gsm->tx_ctrl_list, list)
+ kfree(txq);
+ INIT_LIST_HEAD(&gsm->tx_ctrl_list);
+ list_for_each_entry_safe(txq, ntxq, &gsm->tx_data_list, list)
kfree(txq);
- INIT_LIST_HEAD(&gsm->tx_list);
+ INIT_LIST_HEAD(&gsm->tx_data_list);
}
/**
@@ -2206,8 +2499,15 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
static int gsm_activate_mux(struct gsm_mux *gsm)
{
struct gsm_dlci *dlci;
+ int ret;
+
+ dlci = gsm_dlci_alloc(gsm, 0);
+ if (dlci == NULL)
+ return -ENOMEM;
+ timer_setup(&gsm->kick_timer, gsm_kick_timer, 0);
timer_setup(&gsm->t2_timer, gsm_control_retransmit, 0);
+ INIT_WORK(&gsm->tx_work, gsmld_write_task);
init_waitqueue_head(&gsm->event);
spin_lock_init(&gsm->control_lock);
spin_lock_init(&gsm->tx_lock);
@@ -2217,9 +2517,11 @@ static int gsm_activate_mux(struct gsm_mux *gsm)
else
gsm->receive = gsm1_receive;
- dlci = gsm_dlci_alloc(gsm, 0);
- if (dlci == NULL)
- return -ENOMEM;
+ ret = gsm_register_devices(gsm_tty_driver, gsm->num);
+ if (ret)
+ return ret;
+
+ gsm->has_devices = true;
gsm->dead = false; /* Tty opens are now permissible */
return 0;
}
@@ -2312,7 +2614,8 @@ static struct gsm_mux *gsm_alloc_mux(void)
spin_lock_init(&gsm->lock);
mutex_init(&gsm->mutex);
kref_init(&gsm->ref);
- INIT_LIST_HEAD(&gsm->tx_list);
+ INIT_LIST_HEAD(&gsm->tx_ctrl_list);
+ INIT_LIST_HEAD(&gsm->tx_data_list);
gsm->t1 = T1;
gsm->t2 = T2;
@@ -2469,6 +2772,47 @@ static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len)
return gsm->tty->ops->write(gsm->tty, data, len);
}
+
+/**
+ * gsmld_write_trigger - schedule ldisc write task
+ * @gsm: our mux
+ */
+static void gsmld_write_trigger(struct gsm_mux *gsm)
+{
+ if (!gsm || !gsm->dlci[0] || gsm->dlci[0]->dead)
+ return;
+ schedule_work(&gsm->tx_work);
+}
+
+
+/**
+ * gsmld_write_task - ldisc write task
+ * @work: our tx write work
+ *
+ * Writes out data to the ldisc if possible. We are doing this here to
+ * avoid dead-locking. This returns if no space or data is left for output.
+ */
+static void gsmld_write_task(struct work_struct *work)
+{
+ struct gsm_mux *gsm = container_of(work, struct gsm_mux, tx_work);
+ unsigned long flags;
+ int i, ret;
+
+ /* All outstanding control channel and control messages and one data
+ * frame is sent.
+ */
+ ret = -ENODEV;
+ spin_lock_irqsave(&gsm->tx_lock, flags);
+ if (gsm->tty)
+ ret = gsm_data_kick(gsm);
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+
+ if (ret >= 0)
+ for (i = 0; i < NUM_DLCI; i++)
+ if (gsm->dlci[i])
+ tty_port_tty_wakeup(&gsm->dlci[i]->port);
+}
+
/**
* gsmld_attach_gsm - mode set up
* @tty: our tty structure
@@ -2479,39 +2823,14 @@ static int gsmld_output(struct gsm_mux *gsm, u8 *data, int len)
* will need moving to an ioctl path.
*/
-static int gsmld_attach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
+static void gsmld_attach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
{
- unsigned int base;
- int ret, i;
-
gsm->tty = tty_kref_get(tty);
/* Turn off tty XON/XOFF handling to handle it explicitly. */
gsm->old_c_iflag = tty->termios.c_iflag;
tty->termios.c_iflag &= (IXON | IXOFF);
- ret = gsm_activate_mux(gsm);
- if (ret != 0)
- tty_kref_put(gsm->tty);
- else {
- /* Don't register device 0 - this is the control channel and not
- a usable tty interface */
- base = mux_num_to_base(gsm); /* Base for this MUX */
- for (i = 1; i < NUM_DLCI; i++) {
- struct device *dev;
-
- dev = tty_register_device(gsm_tty_driver,
- base + i, NULL);
- if (IS_ERR(dev)) {
- for (i--; i >= 1; i--)
- tty_unregister_device(gsm_tty_driver,
- base + i);
- return PTR_ERR(dev);
- }
- }
- }
- return ret;
}
-
/**
* gsmld_detach_gsm - stop doing 0710 mux
* @tty: tty attached to the mux
@@ -2522,12 +2841,7 @@ static int gsmld_attach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
static void gsmld_detach_gsm(struct tty_struct *tty, struct gsm_mux *gsm)
{
- unsigned int base = mux_num_to_base(gsm); /* Base for this MUX */
- int i;
-
WARN_ON(tty != gsm->tty);
- for (i = 1; i < NUM_DLCI; i++)
- tty_unregister_device(gsm_tty_driver, base + i);
/* Restore tty XON/XOFF handling. */
gsm->tty->termios.c_iflag = gsm->old_c_iflag;
tty_kref_put(gsm->tty);
@@ -2619,7 +2933,6 @@ static void gsmld_close(struct tty_struct *tty)
static int gsmld_open(struct tty_struct *tty)
{
struct gsm_mux *gsm;
- int ret;
if (tty->ops->write == NULL)
return -EINVAL;
@@ -2635,12 +2948,13 @@ static int gsmld_open(struct tty_struct *tty)
/* Attach the initial passive connection */
gsm->encoding = 1;
- ret = gsmld_attach_gsm(tty, gsm);
- if (ret != 0) {
- gsm_cleanup_mux(gsm, false);
- mux_put(gsm);
- }
- return ret;
+ gsmld_attach_gsm(tty, gsm);
+
+ timer_setup(&gsm->kick_timer, gsm_kick_timer, 0);
+ timer_setup(&gsm->t2_timer, gsm_control_retransmit, 0);
+ INIT_WORK(&gsm->tx_work, gsmld_write_task);
+
+ return 0;
}
/**
@@ -2655,16 +2969,9 @@ static int gsmld_open(struct tty_struct *tty)
static void gsmld_write_wakeup(struct tty_struct *tty)
{
struct gsm_mux *gsm = tty->disc_data;
- unsigned long flags;
/* Queue poll */
- clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
- spin_lock_irqsave(&gsm->tx_lock, flags);
- gsm_data_kick(gsm, NULL);
- if (gsm->tx_bytes < TX_THRESH_LO) {
- gsm_dlci_data_sweep(gsm);
- }
- spin_unlock_irqrestore(&gsm->tx_lock, flags);
+ gsmld_write_trigger(gsm);
}
/**
@@ -2708,11 +3015,24 @@ static ssize_t gsmld_read(struct tty_struct *tty, struct file *file,
static ssize_t gsmld_write(struct tty_struct *tty, struct file *file,
const unsigned char *buf, size_t nr)
{
- int space = tty_write_room(tty);
+ struct gsm_mux *gsm = tty->disc_data;
+ unsigned long flags;
+ int space;
+ int ret;
+
+ if (!gsm)
+ return -ENODEV;
+
+ ret = -ENOBUFS;
+ spin_lock_irqsave(&gsm->tx_lock, flags);
+ space = tty_write_room(tty);
if (space >= nr)
- return tty->ops->write(tty, buf, nr);
- set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
- return -ENOBUFS;
+ ret = tty->ops->write(tty, buf, nr);
+ else
+ set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+
+ return ret;
}
/**
@@ -2737,12 +3057,15 @@ static __poll_t gsmld_poll(struct tty_struct *tty, struct file *file,
poll_wait(file, &tty->read_wait, wait);
poll_wait(file, &tty->write_wait, wait);
+
+ if (gsm->dead)
+ mask |= EPOLLHUP;
if (tty_hung_up_p(file))
mask |= EPOLLHUP;
+ if (test_bit(TTY_OTHER_CLOSED, &tty->flags))
+ mask |= EPOLLHUP;
if (!tty_is_writelocked(tty) && tty_write_room(tty) > 0)
mask |= EPOLLOUT | EPOLLWRNORM;
- if (gsm->dead)
- mask |= EPOLLHUP;
return mask;
}
@@ -3178,6 +3501,8 @@ static int gsmtty_open(struct tty_struct *tty, struct file *filp)
/* Start sending off SABM messages */
if (gsm->initiator)
gsm_dlci_begin_open(dlci);
+ else
+ gsm_dlci_set_opening(dlci);
/* And wait for virtual carrier */
return tty_port_block_til_ready(port, tty, filp);
}
diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
index db784ace25d8..467372534d1c 100644
--- a/drivers/tty/serial/8250/8250.h
+++ b/drivers/tty/serial/8250/8250.h
@@ -120,6 +120,28 @@ static inline void serial_out(struct uart_8250_port *up, int offset, int value)
up->port.serial_out(&up->port, offset, value);
}
+/*
+ * For the 16C950
+ */
+static void serial_icr_write(struct uart_8250_port *up, int offset, int value)
+{
+ serial_out(up, UART_SCR, offset);
+ serial_out(up, UART_ICR, value);
+}
+
+static unsigned int __maybe_unused serial_icr_read(struct uart_8250_port *up,
+ int offset)
+{
+ unsigned int value;
+
+ serial_icr_write(up, UART_ACR, up->acr | UART_ACR_ICRRD);
+ serial_out(up, UART_SCR, offset);
+ value = serial_in(up, UART_ICR);
+ serial_icr_write(up, UART_ACR, up->acr);
+
+ return value;
+}
+
void serial8250_clear_and_reinit_fifos(struct uart_8250_port *p);
static inline int serial_dl_read(struct uart_8250_port *up)
diff --git a/drivers/tty/serial/8250/8250_bcm2835aux.c b/drivers/tty/serial/8250/8250_bcm2835aux.c
index 2a1226a78a0c..21939bb44613 100644
--- a/drivers/tty/serial/8250/8250_bcm2835aux.c
+++ b/drivers/tty/serial/8250/8250_bcm2835aux.c
@@ -166,8 +166,10 @@ static int bcm2835aux_serial_probe(struct platform_device *pdev)
uartclk = clk_get_rate(data->clk);
if (!uartclk) {
ret = device_property_read_u32(&pdev->dev, "clock-frequency", &uartclk);
- if (ret)
- return dev_err_probe(&pdev->dev, ret, "could not get clk rate\n");
+ if (ret) {
+ dev_err_probe(&pdev->dev, ret, "could not get clk rate\n");
+ goto dis_clk;
+ }
}
/* the HW-clock divider for bcm2835aux is 8,
diff --git a/drivers/tty/serial/8250/8250_bcm7271.c b/drivers/tty/serial/8250/8250_bcm7271.c
index 9b878d023dac..8efdc271eb75 100644
--- a/drivers/tty/serial/8250/8250_bcm7271.c
+++ b/drivers/tty/serial/8250/8250_bcm7271.c
@@ -1139,16 +1139,19 @@ static int __maybe_unused brcmuart_suspend(struct device *dev)
struct brcmuart_priv *priv = dev_get_drvdata(dev);
struct uart_8250_port *up = serial8250_get_port(priv->line);
struct uart_port *port = &up->port;
-
- serial8250_suspend_port(priv->line);
- clk_disable_unprepare(priv->baud_mux_clk);
+ unsigned long flags;
/*
* This will prevent resume from enabling RTS before the
- * baud rate has been resored.
+ * baud rate has been restored.
*/
+ spin_lock_irqsave(&port->lock, flags);
priv->saved_mctrl = port->mctrl;
- port->mctrl = 0;
+ port->mctrl &= ~TIOCM_RTS;
+ spin_unlock_irqrestore(&port->lock, flags);
+
+ serial8250_suspend_port(priv->line);
+ clk_disable_unprepare(priv->baud_mux_clk);
return 0;
}
@@ -1158,6 +1161,7 @@ static int __maybe_unused brcmuart_resume(struct device *dev)
struct brcmuart_priv *priv = dev_get_drvdata(dev);
struct uart_8250_port *up = serial8250_get_port(priv->line);
struct uart_port *port = &up->port;
+ unsigned long flags;
int ret;
ret = clk_prepare_enable(priv->baud_mux_clk);
@@ -1180,7 +1184,15 @@ static int __maybe_unused brcmuart_resume(struct device *dev)
start_rx_dma(serial8250_get_port(priv->line));
}
serial8250_resume_port(priv->line);
- port->mctrl = priv->saved_mctrl;
+
+ if (priv->saved_mctrl & TIOCM_RTS) {
+ /* Restore RTS */
+ spin_lock_irqsave(&port->lock, flags);
+ port->mctrl |= TIOCM_RTS;
+ port->ops->set_mctrl(port, port->mctrl);
+ spin_unlock_irqrestore(&port->lock, flags);
+ }
+
return 0;
}
diff --git a/drivers/tty/serial/8250/8250_fsl.c b/drivers/tty/serial/8250/8250_fsl.c
index 9c01c531349d..71ce43685797 100644
--- a/drivers/tty/serial/8250/8250_fsl.c
+++ b/drivers/tty/serial/8250/8250_fsl.c
@@ -77,7 +77,7 @@ int fsl8250_handle_irq(struct uart_port *port)
if ((lsr & UART_LSR_THRE) && (up->ier & UART_IER_THRI))
serial8250_tx_chars(up);
- up->lsr_saved_flags = orig_lsr;
+ up->lsr_saved_flags |= orig_lsr & UART_LSR_BI;
uart_unlock_and_check_sysrq_irqrestore(&up->port, flags);
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index a293e9f107d0..aeac20f7cbb2 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -11,6 +11,7 @@
#include <linux/pci.h>
#include <linux/string.h>
#include <linux/kernel.h>
+#include <linux/math.h>
#include <linux/slab.h>
#include <linux/delay.h>
#include <linux/tty.h>
@@ -994,41 +995,29 @@ static void pci_ite887x_exit(struct pci_dev *dev)
}
/*
- * EndRun Technologies.
- * Determine the number of ports available on the device.
+ * Oxford Semiconductor Inc.
+ * Check if an OxSemi device is part of the Tornado range of devices.
*/
#define PCI_VENDOR_ID_ENDRUN 0x7401
#define PCI_DEVICE_ID_ENDRUN_1588 0xe100
-static int pci_endrun_init(struct pci_dev *dev)
+static bool pci_oxsemi_tornado_p(struct pci_dev *dev)
{
- u8 __iomem *p;
- unsigned long deviceID;
- unsigned int number_uarts = 0;
+ /* OxSemi Tornado devices are all 0xCxxx */
+ if (dev->vendor == PCI_VENDOR_ID_OXSEMI &&
+ (dev->device & 0xf000) != 0xc000)
+ return false;
- /* EndRun device is all 0xexxx */
+ /* EndRun devices are all 0xExxx */
if (dev->vendor == PCI_VENDOR_ID_ENDRUN &&
- (dev->device & 0xf000) != 0xe000)
- return 0;
-
- p = pci_iomap(dev, 0, 5);
- if (p == NULL)
- return -ENOMEM;
+ (dev->device & 0xf000) != 0xe000)
+ return false;
- deviceID = ioread32(p);
- /* EndRun device */
- if (deviceID == 0x07000200) {
- number_uarts = ioread8(p + 4);
- pci_dbg(dev, "%d ports detected on EndRun PCI Express device\n", number_uarts);
- }
- pci_iounmap(dev, p);
- return number_uarts;
+ return true;
}
/*
- * Oxford Semiconductor Inc.
- * Check that device is part of the Tornado range of devices, then determine
- * the number of ports available on the device.
+ * Determine the number of ports available on a Tornado device.
*/
static int pci_oxsemi_tornado_init(struct pci_dev *dev)
{
@@ -1036,9 +1025,7 @@ static int pci_oxsemi_tornado_init(struct pci_dev *dev)
unsigned long deviceID;
unsigned int number_uarts = 0;
- /* OxSemi Tornado devices are all 0xCxxx */
- if (dev->vendor == PCI_VENDOR_ID_OXSEMI &&
- (dev->device & 0xF000) != 0xC000)
+ if (!pci_oxsemi_tornado_p(dev))
return 0;
p = pci_iomap(dev, 0, 5);
@@ -1049,12 +1036,217 @@ static int pci_oxsemi_tornado_init(struct pci_dev *dev)
/* Tornado device */
if (deviceID == 0x07000200) {
number_uarts = ioread8(p + 4);
- pci_dbg(dev, "%d ports detected on Oxford PCI Express device\n", number_uarts);
+ pci_dbg(dev, "%d ports detected on %s PCI Express device\n",
+ number_uarts,
+ dev->vendor == PCI_VENDOR_ID_ENDRUN ?
+ "EndRun" : "Oxford");
}
pci_iounmap(dev, p);
return number_uarts;
}
+/* Tornado-specific constants for the TCR and CPR registers; see below. */
+#define OXSEMI_TORNADO_TCR_MASK 0xf
+#define OXSEMI_TORNADO_CPR_MASK 0x1ff
+#define OXSEMI_TORNADO_CPR_MIN 0x008
+#define OXSEMI_TORNADO_CPR_DEF 0x10f
+
+/*
+ * Determine the oversampling rate, the clock prescaler, and the clock
+ * divisor for the requested baud rate. The clock rate is 62.5 MHz,
+ * which is four times the baud base, and the prescaler increments in
+ * steps of 1/8. Therefore to make calculations on integers we need
+ * to use a scaled clock rate, which is the baud base multiplied by 32
+ * (or our assumed UART clock rate multiplied by 2).
+ *
+ * The allowed oversampling rates are from 4 up to 16 inclusive (values
+ * from 0 to 3 inclusive map to 16). Likewise the clock prescaler allows
+ * values between 1.000 and 63.875 inclusive (operation for values from
+ * 0.000 to 0.875 has not been specified). The clock divisor is the usual
+ * unsigned 16-bit integer.
+ *
+ * For the most accurate baud rate we use a table of predetermined
+ * oversampling rates and clock prescalers that records all possible
+ * products of the two parameters in the range from 4 up to 255 inclusive,
+ * and additionally 335 for the 1500000bps rate, with the prescaler scaled
+ * by 8. The table is sorted by the decreasing value of the oversampling
+ * rate and ties are resolved by sorting by the decreasing value of the
+ * product. This way preference is given to higher oversampling rates.
+ *
+ * We iterate over the table and choose the product of an oversampling
+ * rate and a clock prescaler that gives the lowest integer division
+ * result deviation, or if an exact integer divider is found we stop
+ * looking for it right away. We do some fixup if the resulting clock
+ * divisor required would be out of its unsigned 16-bit integer range.
+ *
+ * Finally we abuse the supposed fractional part returned to encode the
+ * 4-bit value of the oversampling rate and the 9-bit value of the clock
+ * prescaler which will end up in the TCR and CPR/CPR2 registers.
+ */
+static unsigned int pci_oxsemi_tornado_get_divisor(struct uart_port *port,
+ unsigned int baud,
+ unsigned int *frac)
+{
+ static u8 p[][2] = {
+ { 16, 14, }, { 16, 13, }, { 16, 12, }, { 16, 11, },
+ { 16, 10, }, { 16, 9, }, { 16, 8, }, { 15, 17, },
+ { 15, 16, }, { 15, 15, }, { 15, 14, }, { 15, 13, },
+ { 15, 12, }, { 15, 11, }, { 15, 10, }, { 15, 9, },
+ { 15, 8, }, { 14, 18, }, { 14, 17, }, { 14, 14, },
+ { 14, 13, }, { 14, 12, }, { 14, 11, }, { 14, 10, },
+ { 14, 9, }, { 14, 8, }, { 13, 19, }, { 13, 18, },
+ { 13, 17, }, { 13, 13, }, { 13, 12, }, { 13, 11, },
+ { 13, 10, }, { 13, 9, }, { 13, 8, }, { 12, 19, },
+ { 12, 18, }, { 12, 17, }, { 12, 11, }, { 12, 9, },
+ { 12, 8, }, { 11, 23, }, { 11, 22, }, { 11, 21, },
+ { 11, 20, }, { 11, 19, }, { 11, 18, }, { 11, 17, },
+ { 11, 11, }, { 11, 10, }, { 11, 9, }, { 11, 8, },
+ { 10, 25, }, { 10, 23, }, { 10, 20, }, { 10, 19, },
+ { 10, 17, }, { 10, 10, }, { 10, 9, }, { 10, 8, },
+ { 9, 27, }, { 9, 23, }, { 9, 21, }, { 9, 19, },
+ { 9, 18, }, { 9, 17, }, { 9, 9, }, { 9, 8, },
+ { 8, 31, }, { 8, 29, }, { 8, 23, }, { 8, 19, },
+ { 8, 17, }, { 8, 8, }, { 7, 35, }, { 7, 31, },
+ { 7, 29, }, { 7, 25, }, { 7, 23, }, { 7, 21, },
+ { 7, 19, }, { 7, 17, }, { 7, 15, }, { 7, 14, },
+ { 7, 13, }, { 7, 12, }, { 7, 11, }, { 7, 10, },
+ { 7, 9, }, { 7, 8, }, { 6, 41, }, { 6, 37, },
+ { 6, 31, }, { 6, 29, }, { 6, 23, }, { 6, 19, },
+ { 6, 17, }, { 6, 13, }, { 6, 11, }, { 6, 10, },
+ { 6, 9, }, { 6, 8, }, { 5, 67, }, { 5, 47, },
+ { 5, 43, }, { 5, 41, }, { 5, 37, }, { 5, 31, },
+ { 5, 29, }, { 5, 25, }, { 5, 23, }, { 5, 19, },
+ { 5, 17, }, { 5, 15, }, { 5, 13, }, { 5, 11, },
+ { 5, 10, }, { 5, 9, }, { 5, 8, }, { 4, 61, },
+ { 4, 59, }, { 4, 53, }, { 4, 47, }, { 4, 43, },
+ { 4, 41, }, { 4, 37, }, { 4, 31, }, { 4, 29, },
+ { 4, 23, }, { 4, 19, }, { 4, 17, }, { 4, 13, },
+ { 4, 9, }, { 4, 8, },
+ };
+ /* Scale the quotient for comparison to get the fractional part. */
+ const unsigned int quot_scale = 65536;
+ unsigned int sclk = port->uartclk * 2;
+ unsigned int sdiv = DIV_ROUND_CLOSEST(sclk, baud);
+ unsigned int best_squot;
+ unsigned int squot;
+ unsigned int quot;
+ u16 cpr;
+ u8 tcr;
+ int i;
+
+ /* Old custom speed handling. */
+ if (baud == 38400 && (port->flags & UPF_SPD_MASK) == UPF_SPD_CUST) {
+ unsigned int cust_div = port->custom_divisor;
+
+ quot = cust_div & UART_DIV_MAX;
+ tcr = (cust_div >> 16) & OXSEMI_TORNADO_TCR_MASK;
+ cpr = (cust_div >> 20) & OXSEMI_TORNADO_CPR_MASK;
+ if (cpr < OXSEMI_TORNADO_CPR_MIN)
+ cpr = OXSEMI_TORNADO_CPR_DEF;
+ } else {
+ best_squot = quot_scale;
+ for (i = 0; i < ARRAY_SIZE(p); i++) {
+ unsigned int spre;
+ unsigned int srem;
+ u8 cp;
+ u8 tc;
+
+ tc = p[i][0];
+ cp = p[i][1];
+ spre = tc * cp;
+
+ srem = sdiv % spre;
+ if (srem > spre / 2)
+ srem = spre - srem;
+ squot = DIV_ROUND_CLOSEST(srem * quot_scale, spre);
+
+ if (srem == 0) {
+ tcr = tc;
+ cpr = cp;
+ quot = sdiv / spre;
+ break;
+ } else if (squot < best_squot) {
+ best_squot = squot;
+ tcr = tc;
+ cpr = cp;
+ quot = DIV_ROUND_CLOSEST(sdiv, spre);
+ }
+ }
+ while (tcr <= (OXSEMI_TORNADO_TCR_MASK + 1) >> 1 &&
+ quot % 2 == 0) {
+ quot >>= 1;
+ tcr <<= 1;
+ }
+ while (quot > UART_DIV_MAX) {
+ if (tcr <= (OXSEMI_TORNADO_TCR_MASK + 1) >> 1) {
+ quot >>= 1;
+ tcr <<= 1;
+ } else if (cpr <= OXSEMI_TORNADO_CPR_MASK >> 1) {
+ quot >>= 1;
+ cpr <<= 1;
+ } else {
+ quot = quot * cpr / OXSEMI_TORNADO_CPR_MASK;
+ cpr = OXSEMI_TORNADO_CPR_MASK;
+ }
+ }
+ }
+
+ *frac = (cpr << 8) | (tcr & OXSEMI_TORNADO_TCR_MASK);
+ return quot;
+}
+
+/*
+ * Set the oversampling rate in the transmitter clock cycle register (TCR),
+ * the clock prescaler in the clock prescaler register (CPR and CPR2), and
+ * the clock divisor in the divisor latch (DLL and DLM). Note that for
+ * backwards compatibility any write to CPR clears CPR2 and therefore CPR
+ * has to be written first, followed by CPR2, which occupies the location
+ * of CKS used with earlier UART designs.
+ */
+static void pci_oxsemi_tornado_set_divisor(struct uart_port *port,
+ unsigned int baud,
+ unsigned int quot,
+ unsigned int quot_frac)
+{
+ struct uart_8250_port *up = up_to_u8250p(port);
+ u8 cpr2 = quot_frac >> 16;
+ u8 cpr = quot_frac >> 8;
+ u8 tcr = quot_frac;
+
+ serial_icr_write(up, UART_TCR, tcr);
+ serial_icr_write(up, UART_CPR, cpr);
+ serial_icr_write(up, UART_CKS, cpr2);
+ serial8250_do_set_divisor(port, baud, quot, 0);
+}
+
+/*
+ * For Tornado devices we force MCR[7] set for the Divide-by-M N/8 baud rate
+ * generator prescaler (CPR and CPR2). Otherwise no prescaler would be used.
+ */
+static void pci_oxsemi_tornado_set_mctrl(struct uart_port *port,
+ unsigned int mctrl)
+{
+ struct uart_8250_port *up = up_to_u8250p(port);
+
+ up->mcr |= UART_MCR_CLKSEL;
+ serial8250_do_set_mctrl(port, mctrl);
+}
+
+static int pci_oxsemi_tornado_setup(struct serial_private *priv,
+ const struct pciserial_board *board,
+ struct uart_8250_port *up, int idx)
+{
+ struct pci_dev *dev = priv->dev;
+
+ if (pci_oxsemi_tornado_p(dev)) {
+ up->port.get_divisor = pci_oxsemi_tornado_get_divisor;
+ up->port.set_divisor = pci_oxsemi_tornado_set_divisor;
+ up->port.set_mctrl = pci_oxsemi_tornado_set_mctrl;
+ }
+
+ return pci_default_setup(priv, board, up, idx);
+}
+
static int pci_asix_setup(struct serial_private *priv,
const struct pciserial_board *board,
struct uart_8250_port *port, int idx)
@@ -2244,7 +2436,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
.device = PCI_ANY_ID,
.subvendor = PCI_ANY_ID,
.subdevice = PCI_ANY_ID,
- .init = pci_endrun_init,
+ .init = pci_oxsemi_tornado_init,
.setup = pci_default_setup,
},
/*
@@ -2256,7 +2448,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
.subvendor = PCI_ANY_ID,
.subdevice = PCI_ANY_ID,
.init = pci_oxsemi_tornado_init,
- .setup = pci_default_setup,
+ .setup = pci_oxsemi_tornado_setup,
},
{
.vendor = PCI_VENDOR_ID_MAINPINE,
@@ -2264,7 +2456,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
.subvendor = PCI_ANY_ID,
.subdevice = PCI_ANY_ID,
.init = pci_oxsemi_tornado_init,
- .setup = pci_default_setup,
+ .setup = pci_oxsemi_tornado_setup,
},
{
.vendor = PCI_VENDOR_ID_DIGI,
@@ -2272,7 +2464,7 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
.subvendor = PCI_SUBVENDOR_ID_IBM,
.subdevice = PCI_ANY_ID,
.init = pci_oxsemi_tornado_init,
- .setup = pci_default_setup,
+ .setup = pci_oxsemi_tornado_setup,
},
{
.vendor = PCI_VENDOR_ID_INTEL,
@@ -2589,7 +2781,7 @@ enum pci_board_num_t {
pbn_b0_2_1843200,
pbn_b0_4_1843200,
- pbn_b0_1_3906250,
+ pbn_b0_1_15625000,
pbn_b0_bt_1_115200,
pbn_b0_bt_2_115200,
@@ -2667,12 +2859,11 @@ enum pci_board_num_t {
pbn_panacom2,
pbn_panacom4,
pbn_plx_romulus,
- pbn_endrun_2_3906250,
pbn_oxsemi,
- pbn_oxsemi_1_3906250,
- pbn_oxsemi_2_3906250,
- pbn_oxsemi_4_3906250,
- pbn_oxsemi_8_3906250,
+ pbn_oxsemi_1_15625000,
+ pbn_oxsemi_2_15625000,
+ pbn_oxsemi_4_15625000,
+ pbn_oxsemi_8_15625000,
pbn_intel_i960,
pbn_sgi_ioc3,
pbn_computone_4,
@@ -2815,10 +3006,10 @@ static struct pciserial_board pci_boards[] = {
.uart_offset = 8,
},
- [pbn_b0_1_3906250] = {
+ [pbn_b0_1_15625000] = {
.flags = FL_BASE0,
.num_ports = 1,
- .base_baud = 3906250,
+ .base_baud = 15625000,
.uart_offset = 8,
},
@@ -3189,20 +3380,6 @@ static struct pciserial_board pci_boards[] = {
.first_offset = 0x03,
},
- /*
- * EndRun Technologies
- * Uses the size of PCI Base region 0 to
- * signal now many ports are available
- * 2 port 952 Uart support
- */
- [pbn_endrun_2_3906250] = {
- .flags = FL_BASE0,
- .num_ports = 2,
- .base_baud = 3906250,
- .uart_offset = 0x200,
- .first_offset = 0x1000,
- },
-
/*
* This board uses the size of PCI Base region 0 to
* signal now many ports are available
@@ -3213,31 +3390,31 @@ static struct pciserial_board pci_boards[] = {
.base_baud = 115200,
.uart_offset = 8,
},
- [pbn_oxsemi_1_3906250] = {
+ [pbn_oxsemi_1_15625000] = {
.flags = FL_BASE0,
.num_ports = 1,
- .base_baud = 3906250,
+ .base_baud = 15625000,
.uart_offset = 0x200,
.first_offset = 0x1000,
},
- [pbn_oxsemi_2_3906250] = {
+ [pbn_oxsemi_2_15625000] = {
.flags = FL_BASE0,
.num_ports = 2,
- .base_baud = 3906250,
+ .base_baud = 15625000,
.uart_offset = 0x200,
.first_offset = 0x1000,
},
- [pbn_oxsemi_4_3906250] = {
+ [pbn_oxsemi_4_15625000] = {
.flags = FL_BASE0,
.num_ports = 4,
- .base_baud = 3906250,
+ .base_baud = 15625000,
.uart_offset = 0x200,
.first_offset = 0x1000,
},
- [pbn_oxsemi_8_3906250] = {
+ [pbn_oxsemi_8_15625000] = {
.flags = FL_BASE0,
.num_ports = 8,
- .base_baud = 3906250,
+ .base_baud = 15625000,
.uart_offset = 0x200,
.first_offset = 0x1000,
},
@@ -4109,13 +4286,6 @@ static const struct pci_device_id serial_pci_tbl[] = {
{ PCI_VENDOR_ID_PLX, PCI_DEVICE_ID_PLX_ROMULUS,
0x10b5, 0x106a, 0, 0,
pbn_plx_romulus },
- /*
- * EndRun Technologies. PCI express device range.
- * EndRun PTP/1588 has 2 Native UARTs.
- */
- { PCI_VENDOR_ID_ENDRUN, PCI_DEVICE_ID_ENDRUN_1588,
- PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_endrun_2_3906250 },
/*
* Quatech cards. These actually have configurable clocks but for
* now we just use the default.
@@ -4225,158 +4395,165 @@ static const struct pci_device_id serial_pci_tbl[] = {
*/
{ PCI_VENDOR_ID_OXSEMI, 0xc101, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc105, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc11b, /* OXPCIe952 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc11f, /* OXPCIe952 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc120, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc124, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc138, /* OXPCIe952 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc13d, /* OXPCIe952 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc140, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc141, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc144, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc145, /* OXPCIe952 1 Legacy UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_b0_1_3906250 },
+ pbn_b0_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc158, /* OXPCIe952 2 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_2_3906250 },
+ pbn_oxsemi_2_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc15d, /* OXPCIe952 2 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_2_3906250 },
+ pbn_oxsemi_2_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc208, /* OXPCIe954 4 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_4_3906250 },
+ pbn_oxsemi_4_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc20d, /* OXPCIe954 4 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_4_3906250 },
+ pbn_oxsemi_4_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc308, /* OXPCIe958 8 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_8_3906250 },
+ pbn_oxsemi_8_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc30d, /* OXPCIe958 8 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_8_3906250 },
+ pbn_oxsemi_8_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc40b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc40f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc41b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc41f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc42b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc42f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc43b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc43f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc44b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc44f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc45b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc45f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc46b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc46f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc47b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc47f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc48b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc48f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc49b, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc49f, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc4ab, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc4af, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc4bb, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc4bf, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc4cb, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_OXSEMI, 0xc4cf, /* OXPCIe200 1 Native UART */
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
/*
* Mainpine Inc. IQ Express "Rev3" utilizing OxSemi Tornado
*/
{ PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 1 Port V.34 Super-G3 Fax */
PCI_VENDOR_ID_MAINPINE, 0x4001, 0, 0,
- pbn_oxsemi_1_3906250 },
+ pbn_oxsemi_1_15625000 },
{ PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 2 Port V.34 Super-G3 Fax */
PCI_VENDOR_ID_MAINPINE, 0x4002, 0, 0,
- pbn_oxsemi_2_3906250 },
+ pbn_oxsemi_2_15625000 },
{ PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 4 Port V.34 Super-G3 Fax */
PCI_VENDOR_ID_MAINPINE, 0x4004, 0, 0,
- pbn_oxsemi_4_3906250 },
+ pbn_oxsemi_4_15625000 },
{ PCI_VENDOR_ID_MAINPINE, 0x4000, /* IQ Express 8 Port V.34 Super-G3 Fax */
PCI_VENDOR_ID_MAINPINE, 0x4008, 0, 0,
- pbn_oxsemi_8_3906250 },
+ pbn_oxsemi_8_15625000 },
/*
* Digi/IBM PCIe 2-port Async EIA-232 Adapter utilizing OxSemi Tornado
*/
{ PCI_VENDOR_ID_DIGI, PCIE_DEVICE_ID_NEO_2_OX_IBM,
PCI_SUBVENDOR_ID_IBM, PCI_ANY_ID, 0, 0,
- pbn_oxsemi_2_3906250 },
+ pbn_oxsemi_2_15625000 },
+ /*
+ * EndRun Technologies. PCI express device range.
+ * EndRun PTP/1588 has 2 Native UARTs utilizing OxSemi 952.
+ */
+ { PCI_VENDOR_ID_ENDRUN, PCI_DEVICE_ID_ENDRUN_1588,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ pbn_oxsemi_2_15625000 },
/*
* SBS Technologies, Inc. P-Octal and PMC-OCTPRO cards,
@@ -4886,6 +5063,115 @@ static const struct pci_device_id serial_pci_tbl[] = {
PCI_ANY_ID, PCI_ANY_ID,
0, 0,
pbn_b2_4_115200 },
+ /*
+ * Brainboxes PX-101
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4005,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b0_2_115200 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x4019,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_2_15625000 },
+ /*
+ * Brainboxes PX-235/246
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4004,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b0_1_115200 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x4016,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_1_15625000 },
+ /*
+ * Brainboxes PX-203/PX-257
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4006,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b0_2_115200 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x4015,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_4_15625000 },
+ /*
+ * Brainboxes PX-260/PX-701
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x400A,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_4_15625000 },
+ /*
+ * Brainboxes PX-310
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x400E,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_2_15625000 },
+ /*
+ * Brainboxes PX-313
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x400C,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_2_15625000 },
+ /*
+ * Brainboxes PX-320/324/PX-376/PX-387
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x400B,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_1_15625000 },
+ /*
+ * Brainboxes PX-335/346
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x400F,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_4_15625000 },
+ /*
+ * Brainboxes PX-368
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4010,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_4_15625000 },
+ /*
+ * Brainboxes PX-420
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4000,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b0_4_115200 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x4011,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_4_15625000 },
+ /*
+ * Brainboxes PX-803
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4009,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b0_1_115200 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x401E,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_1_15625000 },
+ /*
+ * Brainboxes PX-846
+ */
+ { PCI_VENDOR_ID_INTASHIELD, 0x4008,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_b0_1_115200 },
+ { PCI_VENDOR_ID_INTASHIELD, 0x4017,
+ PCI_ANY_ID, PCI_ANY_ID,
+ 0, 0,
+ pbn_oxsemi_1_15625000 },
+
/*
* Perle PCI-RAS cards
*/
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index ddaf35daf316..291fd99bd7f1 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -537,27 +537,6 @@ serial_port_out_sync(struct uart_port *p, int offset, int value)
}
}
-/*
- * For the 16C950
- */
-static void serial_icr_write(struct uart_8250_port *up, int offset, int value)
-{
- serial_out(up, UART_SCR, offset);
- serial_out(up, UART_ICR, value);
-}
-
-static unsigned int serial_icr_read(struct uart_8250_port *up, int offset)
-{
- unsigned int value;
-
- serial_icr_write(up, UART_ACR, up->acr | UART_ACR_ICRRD);
- serial_out(up, UART_SCR, offset);
- value = serial_in(up, UART_ICR);
- serial_icr_write(up, UART_ACR, up->acr);
-
- return value;
-}
-
/*
* FIFO support.
*/
diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
index 2cb89491dd09..65b76adf107c 100644
--- a/drivers/tty/serial/fsl_lpuart.c
+++ b/drivers/tty/serial/fsl_lpuart.c
@@ -990,12 +990,12 @@ static void lpuart32_rxint(struct lpuart_port *sport)
if (sr & (UARTSTAT_PE | UARTSTAT_OR | UARTSTAT_FE)) {
if (sr & UARTSTAT_PE) {
+ sport->port.icount.parity++;
+ } else if (sr & UARTSTAT_FE) {
if (is_break)
sport->port.icount.brk++;
else
- sport->port.icount.parity++;
- } else if (sr & UARTSTAT_FE) {
- sport->port.icount.frame++;
+ sport->port.icount.frame++;
}
if (sr & UARTSTAT_OR)
@@ -1010,12 +1010,12 @@ static void lpuart32_rxint(struct lpuart_port *sport)
sr &= sport->port.read_status_mask;
if (sr & UARTSTAT_PE) {
+ flg = TTY_PARITY;
+ } else if (sr & UARTSTAT_FE) {
if (is_break)
flg = TTY_BREAK;
else
- flg = TTY_PARITY;
- } else if (sr & UARTSTAT_FE) {
- flg = TTY_FRAME;
+ flg = TTY_FRAME;
}
if (sr & UARTSTAT_OR)
diff --git a/drivers/tty/serial/mvebu-uart.c b/drivers/tty/serial/mvebu-uart.c
index 93489fe334d0..65eaecd10b7c 100644
--- a/drivers/tty/serial/mvebu-uart.c
+++ b/drivers/tty/serial/mvebu-uart.c
@@ -265,6 +265,7 @@ static void mvebu_uart_rx_chars(struct uart_port *port, unsigned int status)
struct tty_port *tport = &port->state->port;
unsigned char ch = 0;
char flag = 0;
+ int ret;
do {
if (status & STAT_RX_RDY(port)) {
@@ -277,6 +278,16 @@ static void mvebu_uart_rx_chars(struct uart_port *port, unsigned int status)
port->icount.parity++;
}
+ /*
+ * For UART2, error bits are not cleared on buffer read.
+ * This causes interrupt loop and system hang.
+ */
+ if (IS_EXTENDED(port) && (status & STAT_BRK_ERR)) {
+ ret = readl(port->membase + UART_STAT);
+ ret |= STAT_BRK_ERR;
+ writel(ret, port->membase + UART_STAT);
+ }
+
if (status & STAT_BRK_DET) {
port->icount.brk++;
status &= ~(STAT_FRM_ERR | STAT_PAR_ERR);
diff --git a/drivers/tty/serial/pic32_uart.c b/drivers/tty/serial/pic32_uart.c
index b7a3a1b959b1..a967b586a0a0 100644
--- a/drivers/tty/serial/pic32_uart.c
+++ b/drivers/tty/serial/pic32_uart.c
@@ -427,7 +427,7 @@ static int pic32_uart_startup(struct uart_port *port)
if (!sport->irq_fault_name) {
dev_err(port->dev, "%s: kasprintf err!", __func__);
ret = -ENOMEM;
- goto out_done;
+ goto out_disable_clk;
}
irq_set_status_flags(sport->irq_fault, IRQ_NOAUTOEN);
ret = request_irq(sport->irq_fault, pic32_uart_fault_interrupt,
@@ -493,14 +493,16 @@ static int pic32_uart_startup(struct uart_port *port)
return 0;
out_t:
- kfree(sport->irq_tx_name);
free_irq(sport->irq_tx, port);
+ kfree(sport->irq_tx_name);
out_r:
- kfree(sport->irq_rx_name);
free_irq(sport->irq_rx, port);
+ kfree(sport->irq_rx_name);
out_f:
- kfree(sport->irq_fault_name);
free_irq(sport->irq_fault, port);
+ kfree(sport->irq_fault_name);
+out_disable_clk:
+ clk_disable_unprepare(sport->clk);
out_done:
return ret;
}
@@ -519,8 +521,11 @@ static void pic32_uart_shutdown(struct uart_port *port)
/* free all 3 interrupts for this UART */
free_irq(sport->irq_fault, port);
+ kfree(sport->irq_fault_name);
free_irq(sport->irq_tx, port);
+ kfree(sport->irq_tx_name);
free_irq(sport->irq_rx, port);
+ kfree(sport->irq_rx_name);
}
/* serial core request to change current uart setting */
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index dfc1f4b445f3..6eaf8eb84661 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -344,7 +344,7 @@ static struct uni_screen *vc_uniscr_alloc(unsigned int cols, unsigned int rows)
/* allocate everything in one go */
memsize = cols * rows * sizeof(char32_t);
memsize += rows * sizeof(char32_t *);
- p = vmalloc(memsize);
+ p = vzalloc(memsize);
if (!p)
return NULL;
diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index d6d515d598dc..ae049eb28b93 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -2280,11 +2280,16 @@ static int cdns3_gadget_ep_enable(struct usb_ep *ep,
int ret = 0;
int val;
+ if (!ep) {
+ pr_debug("usbss: ep not configured?\n");
+ return -EINVAL;
+ }
+
priv_ep = ep_to_cdns3_ep(ep);
priv_dev = priv_ep->cdns3_dev;
comp_desc = priv_ep->endpoint.comp_desc;
- if (!ep || !desc || desc->bDescriptorType != USB_DT_ENDPOINT) {
+ if (!desc || desc->bDescriptorType != USB_DT_ENDPOINT) {
dev_dbg(priv_dev->dev, "usbss: invalid parameters\n");
return -EINVAL;
}
@@ -2596,7 +2601,7 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
struct usb_request *request)
{
struct cdns3_endpoint *priv_ep = ep_to_cdns3_ep(ep);
- struct cdns3_device *priv_dev = priv_ep->cdns3_dev;
+ struct cdns3_device *priv_dev;
struct usb_request *req, *req_temp;
struct cdns3_request *priv_req;
struct cdns3_trb *link_trb;
@@ -2607,6 +2612,8 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
if (!ep || !request || !ep->desc)
return -EINVAL;
+ priv_dev = priv_ep->cdns3_dev;
+
spin_lock_irqsave(&priv_dev->lock, flags);
priv_req = to_cdns3_request(request);
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index 06eea8848ccc..11c8ea0cccc8 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1691,7 +1691,6 @@ static void usb_giveback_urb_bh(struct tasklet_struct *t)
spin_lock_irq(&bh->lock);
bh->running = true;
- restart:
list_replace_init(&bh->head, &local_list);
spin_unlock_irq(&bh->lock);
@@ -1705,10 +1704,17 @@ static void usb_giveback_urb_bh(struct tasklet_struct *t)
bh->completing_ep = NULL;
}
- /* check if there are new URBs to giveback */
+ /*
+ * giveback new URBs next time to prevent this function
+ * from not exiting for a long time.
+ */
spin_lock_irq(&bh->lock);
- if (!list_empty(&bh->head))
- goto restart;
+ if (!list_empty(&bh->head)) {
+ if (bh->high_prio)
+ tasklet_hi_schedule(&bh->bh);
+ else
+ tasklet_schedule(&bh->bh);
+ }
bh->running = false;
spin_unlock_irq(&bh->lock);
}
@@ -1737,7 +1743,7 @@ static void usb_giveback_urb_bh(struct tasklet_struct *t)
void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
{
struct giveback_urb_bh *bh;
- bool running, high_prio_bh;
+ bool running;
/* pass status to tasklet via unlinked */
if (likely(!urb->unlinked))
@@ -1748,13 +1754,10 @@ void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
return;
}
- if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe)) {
+ if (usb_pipeisoc(urb->pipe) || usb_pipeint(urb->pipe))
bh = &hcd->high_prio_bh;
- high_prio_bh = true;
- } else {
+ else
bh = &hcd->low_prio_bh;
- high_prio_bh = false;
- }
spin_lock(&bh->lock);
list_add_tail(&urb->urb_list, &bh->head);
@@ -1763,7 +1766,7 @@ void usb_hcd_giveback_urb(struct usb_hcd *hcd, struct urb *urb, int status)
if (running)
;
- else if (high_prio_bh)
+ else if (bh->high_prio)
tasklet_hi_schedule(&bh->bh);
else
tasklet_schedule(&bh->bh);
@@ -2959,6 +2962,7 @@ int usb_add_hcd(struct usb_hcd *hcd,
/* initialize tasklets */
init_giveback_urb_bh(&hcd->high_prio_bh);
+ hcd->high_prio_bh.high_prio = true;
init_giveback_urb_bh(&hcd->low_prio_bh);
/* enable irqs just before we start the controller,
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index d28cd1a6709b..ecec9cdfa0a6 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -157,8 +157,13 @@ static void __dwc3_set_mode(struct work_struct *work)
break;
}
- /* For DRD host or device mode only */
- if (dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG) {
+ /*
+ * When current_dr_role is not set, there's no role switching.
+ * Only perform GCTL.CoreSoftReset when there's DRD role switching.
+ */
+ if (dwc->current_dr_role && ((DWC3_IP_IS(DWC3) ||
+ DWC3_VER_IS_PRIOR(DWC31, 190A)) &&
+ dwc->desired_dr_role != DWC3_GCTL_PRTCAP_OTG)) {
reg = dwc3_readl(dwc->regs, DWC3_GCTL);
reg |= DWC3_GCTL_CORESOFTRESET;
dwc3_writel(dwc->regs, DWC3_GCTL, reg);
diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
index 6cba990da32e..3582fd6dfa14 100644
--- a/drivers/usb/dwc3/dwc3-qcom.c
+++ b/drivers/usb/dwc3/dwc3-qcom.c
@@ -443,9 +443,9 @@ static int dwc3_qcom_get_irq(struct platform_device *pdev,
int ret;
if (np)
- ret = platform_get_irq_byname(pdev_irq, name);
+ ret = platform_get_irq_byname_optional(pdev_irq, name);
else
- ret = platform_get_irq(pdev_irq, num);
+ ret = platform_get_irq_optional(pdev_irq, num);
return ret;
}
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index f46fc02669a8..d3be2b424244 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1181,17 +1181,49 @@ static u32 dwc3_calc_trbs_left(struct dwc3_ep *dep)
return trbs_left;
}
-static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
- dma_addr_t dma, unsigned int length, unsigned int chain,
- unsigned int node, unsigned int stream_id,
- unsigned int short_not_ok, unsigned int no_interrupt,
- unsigned int is_last, bool must_interrupt)
+/**
+ * dwc3_prepare_one_trb - setup one TRB from one request
+ * @dep: endpoint for which this request is prepared
+ * @req: dwc3_request pointer
+ * @trb_length: buffer size of the TRB
+ * @chain: should this TRB be chained to the next?
+ * @node: only for isochronous endpoints. First TRB needs different type.
+ * @use_bounce_buffer: set to use bounce buffer
+ * @must_interrupt: set to interrupt on TRB completion
+ */
+static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
+ struct dwc3_request *req, unsigned int trb_length,
+ unsigned int chain, unsigned int node, bool use_bounce_buffer,
+ bool must_interrupt)
{
+ struct dwc3_trb *trb;
+ dma_addr_t dma;
+ unsigned int stream_id = req->request.stream_id;
+ unsigned int short_not_ok = req->request.short_not_ok;
+ unsigned int no_interrupt = req->request.no_interrupt;
+ unsigned int is_last = req->request.is_last;
struct dwc3 *dwc = dep->dwc;
struct usb_gadget *gadget = dwc->gadget;
enum usb_device_speed speed = gadget->speed;
- trb->size = DWC3_TRB_SIZE_LENGTH(length);
+ if (use_bounce_buffer)
+ dma = dep->dwc->bounce_addr;
+ else if (req->request.num_sgs > 0)
+ dma = sg_dma_address(req->start_sg);
+ else
+ dma = req->request.dma;
+
+ trb = &dep->trb_pool[dep->trb_enqueue];
+
+ if (!req->trb) {
+ dwc3_gadget_move_started_request(req);
+ req->trb = trb;
+ req->trb_dma = dwc3_trb_dma_offset(dep, trb);
+ }
+
+ req->num_trbs++;
+
+ trb->size = DWC3_TRB_SIZE_LENGTH(trb_length);
trb->bpl = lower_32_bits(dma);
trb->bph = upper_32_bits(dma);
@@ -1231,10 +1263,10 @@ static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
unsigned int mult = 2;
unsigned int maxp = usb_endpoint_maxp(ep->desc);
- if (length <= (2 * maxp))
+ if (req->request.length <= (2 * maxp))
mult--;
- if (length <= maxp)
+ if (req->request.length <= maxp)
mult--;
trb->size |= DWC3_TRB_SIZE_PCM1(mult);
@@ -1308,50 +1340,6 @@ static void __dwc3_prepare_one_trb(struct dwc3_ep *dep, struct dwc3_trb *trb,
trace_dwc3_prepare_trb(dep, trb);
}
-/**
- * dwc3_prepare_one_trb - setup one TRB from one request
- * @dep: endpoint for which this request is prepared
- * @req: dwc3_request pointer
- * @trb_length: buffer size of the TRB
- * @chain: should this TRB be chained to the next?
- * @node: only for isochronous endpoints. First TRB needs different type.
- * @use_bounce_buffer: set to use bounce buffer
- * @must_interrupt: set to interrupt on TRB completion
- */
-static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
- struct dwc3_request *req, unsigned int trb_length,
- unsigned int chain, unsigned int node, bool use_bounce_buffer,
- bool must_interrupt)
-{
- struct dwc3_trb *trb;
- dma_addr_t dma;
- unsigned int stream_id = req->request.stream_id;
- unsigned int short_not_ok = req->request.short_not_ok;
- unsigned int no_interrupt = req->request.no_interrupt;
- unsigned int is_last = req->request.is_last;
-
- if (use_bounce_buffer)
- dma = dep->dwc->bounce_addr;
- else if (req->request.num_sgs > 0)
- dma = sg_dma_address(req->start_sg);
- else
- dma = req->request.dma;
-
- trb = &dep->trb_pool[dep->trb_enqueue];
-
- if (!req->trb) {
- dwc3_gadget_move_started_request(req);
- req->trb = trb;
- req->trb_dma = dwc3_trb_dma_offset(dep, trb);
- }
-
- req->num_trbs++;
-
- __dwc3_prepare_one_trb(dep, trb, dma, trb_length, chain, node,
- stream_id, short_not_ok, no_interrupt, is_last,
- must_interrupt);
-}
-
static bool dwc3_needs_extra_trb(struct dwc3_ep *dep, struct dwc3_request *req)
{
unsigned int maxp = usb_endpoint_maxp(dep->endpoint.desc);
diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
index 3a77bca0ebe1..e884f295504f 100644
--- a/drivers/usb/gadget/function/f_mass_storage.c
+++ b/drivers/usb/gadget/function/f_mass_storage.c
@@ -1192,13 +1192,14 @@ static int do_read_toc(struct fsg_common *common, struct fsg_buffhd *bh)
u8 format;
int i, len;
+ format = common->cmnd[2] & 0xf;
+
if ((common->cmnd[1] & ~0x02) != 0 || /* Mask away MSF */
- start_track > 1) {
+ (start_track > 1 && format != 0x1)) {
curlun->sense_data = SS_INVALID_FIELD_IN_CDB;
return -EINVAL;
}
- format = common->cmnd[2] & 0xf;
/*
* Check if CDB is old style SFF-8020i
* i.e. format is in 2 MSBs of byte 9
@@ -1208,8 +1209,8 @@ static int do_read_toc(struct fsg_common *common, struct fsg_buffhd *bh)
format = (common->cmnd[9] >> 6) & 0x3;
switch (format) {
- case 0:
- /* Formatted TOC */
+ case 0: /* Formatted TOC */
+ case 1: /* Multi-session info */
len = 4 + 2*8; /* 4 byte header + 2 descriptors */
memset(buf, 0, len);
buf[1] = len - 2; /* TOC Length excludes length field */
@@ -1250,7 +1251,7 @@ static int do_read_toc(struct fsg_common *common, struct fsg_buffhd *bh)
return len;
default:
- /* Multi-session, PMA, ATIP, CD-TEXT not supported/required */
+ /* PMA, ATIP, CD-TEXT not supported/required */
curlun->sense_data = SS_INVALID_FIELD_IN_CDB;
return -EINVAL;
}
diff --git a/drivers/usb/gadget/udc/Kconfig b/drivers/usb/gadget/udc/Kconfig
index 69394dc1cdfb..2cdd37be165a 100644
--- a/drivers/usb/gadget/udc/Kconfig
+++ b/drivers/usb/gadget/udc/Kconfig
@@ -311,7 +311,7 @@ source "drivers/usb/gadget/udc/bdc/Kconfig"
config USB_AMD5536UDC
tristate "AMD5536 UDC"
- depends on USB_PCI
+ depends on USB_PCI && HAS_DMA
select USB_SNP_CORE
help
The AMD5536 UDC is part of the AMD Geode CS5536, an x86 southbridge.
diff --git a/drivers/usb/gadget/udc/aspeed-vhub/hub.c b/drivers/usb/gadget/udc/aspeed-vhub/hub.c
index 65cd4e46f031..e2207d014620 100644
--- a/drivers/usb/gadget/udc/aspeed-vhub/hub.c
+++ b/drivers/usb/gadget/udc/aspeed-vhub/hub.c
@@ -1059,8 +1059,10 @@ static int ast_vhub_init_desc(struct ast_vhub *vhub)
/* Initialize vhub String Descriptors. */
INIT_LIST_HEAD(&vhub->vhub_str_desc);
desc_np = of_get_child_by_name(vhub_np, "vhub-strings");
- if (desc_np)
+ if (desc_np) {
ret = ast_vhub_of_parse_str_desc(vhub, desc_np);
+ of_node_put(desc_np);
+ }
else
ret = ast_vhub_str_alloc_add(vhub, &ast_vhub_strings);
diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c
index d9c406bdb680..c25765628625 100644
--- a/drivers/usb/gadget/udc/tegra-xudc.c
+++ b/drivers/usb/gadget/udc/tegra-xudc.c
@@ -3691,15 +3691,15 @@ static int tegra_xudc_powerdomain_init(struct tegra_xudc *xudc)
int err;
xudc->genpd_dev_device = dev_pm_domain_attach_by_name(dev, "dev");
- if (IS_ERR(xudc->genpd_dev_device)) {
- err = PTR_ERR(xudc->genpd_dev_device);
+ if (IS_ERR_OR_NULL(xudc->genpd_dev_device)) {
+ err = PTR_ERR(xudc->genpd_dev_device) ? : -ENODATA;
dev_err(dev, "failed to get device power domain: %d\n", err);
return err;
}
xudc->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "ss");
- if (IS_ERR(xudc->genpd_dev_ss)) {
- err = PTR_ERR(xudc->genpd_dev_ss);
+ if (IS_ERR_OR_NULL(xudc->genpd_dev_ss)) {
+ err = PTR_ERR(xudc->genpd_dev_ss) ? : -ENODATA;
dev_err(dev, "failed to get SuperSpeed power domain: %d\n", err);
return err;
}
diff --git a/drivers/usb/host/ehci-ppc-of.c b/drivers/usb/host/ehci-ppc-of.c
index 6bbaee74f7e7..28a19693c19f 100644
--- a/drivers/usb/host/ehci-ppc-of.c
+++ b/drivers/usb/host/ehci-ppc-of.c
@@ -148,6 +148,7 @@ static int ehci_hcd_ppc_of_probe(struct platform_device *op)
} else {
ehci->has_amcc_usb23 = 1;
}
+ of_node_put(np);
}
if (of_get_property(dn, "big-endian", NULL)) {
diff --git a/drivers/usb/host/ohci-nxp.c b/drivers/usb/host/ohci-nxp.c
index 85878e8ad331..106a6bcefb08 100644
--- a/drivers/usb/host/ohci-nxp.c
+++ b/drivers/usb/host/ohci-nxp.c
@@ -164,6 +164,7 @@ static int ohci_hcd_nxp_probe(struct platform_device *pdev)
}
isp1301_i2c_client = isp1301_get_client(isp1301_node);
+ of_node_put(isp1301_node);
if (!isp1301_i2c_client)
return -EPROBE_DEFER;
diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
index 996958a6565c..bdb776553826 100644
--- a/drivers/usb/host/xhci-tegra.c
+++ b/drivers/usb/host/xhci-tegra.c
@@ -1010,15 +1010,15 @@ static int tegra_xusb_powerdomain_init(struct device *dev,
int err;
tegra->genpd_dev_host = dev_pm_domain_attach_by_name(dev, "xusb_host");
- if (IS_ERR(tegra->genpd_dev_host)) {
- err = PTR_ERR(tegra->genpd_dev_host);
+ if (IS_ERR_OR_NULL(tegra->genpd_dev_host)) {
+ err = PTR_ERR(tegra->genpd_dev_host) ? : -ENODATA;
dev_err(dev, "failed to get host pm-domain: %d\n", err);
return err;
}
tegra->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "xusb_ss");
- if (IS_ERR(tegra->genpd_dev_ss)) {
- err = PTR_ERR(tegra->genpd_dev_ss);
+ if (IS_ERR_OR_NULL(tegra->genpd_dev_ss)) {
+ err = PTR_ERR(tegra->genpd_dev_ss) ? : -ENODATA;
dev_err(dev, "failed to get superspeed pm-domain: %d\n", err);
return err;
}
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 1f3f311d9951..1d60e62752f3 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -2393,7 +2393,7 @@ static inline const char *xhci_decode_trb(char *str, size_t size,
field3 & TRB_CYCLE ? 'C' : 'c');
break;
case TRB_STOP_RING:
- sprintf(str,
+ snprintf(str, size,
"%s: slot %d sp %d ep %d flags %c",
xhci_trb_type_string(type),
TRB_TO_SLOT_ID(field3),
diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index 9d56138133a9..ef6a2891f290 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -737,7 +737,8 @@ static void sierra_close(struct usb_serial_port *port)
/*
* Need to take susp_lock to make sure port is not already being
- * resumed, but no need to hold it due to initialized
+ * resumed, but no need to hold it due to the tty-port initialized
+ * flag.
*/
spin_lock_irq(&intfdata->susp_lock);
if (--intfdata->open_ports == 0)
diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c
index 24101bd7fcad..e35bea2235c1 100644
--- a/drivers/usb/serial/usb-serial.c
+++ b/drivers/usb/serial/usb-serial.c
@@ -295,7 +295,7 @@ static int serial_open(struct tty_struct *tty, struct file *filp)
*
* Shut down a USB serial port. Serialized against activate by the
* tport mutex and kept to matching open/close pairs
- * of calls by the initialized flag.
+ * of calls by the tty-port initialized flag.
*
* Not called if tty is console.
*/
diff --git a/drivers/usb/serial/usb_wwan.c b/drivers/usb/serial/usb_wwan.c
index dab38b63eaf7..cc81ab7ef4da 100644
--- a/drivers/usb/serial/usb_wwan.c
+++ b/drivers/usb/serial/usb_wwan.c
@@ -388,7 +388,8 @@ void usb_wwan_close(struct usb_serial_port *port)
/*
* Need to take susp_lock to make sure port is not already being
- * resumed, but no need to hold it due to initialized
+ * resumed, but no need to hold it due to the tty-port initialized
+ * flag.
*/
spin_lock_irq(&intfdata->susp_lock);
if (--intfdata->open_ports == 0)
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index a6045aef0d04..668880c3a472 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -76,6 +76,10 @@ static int ucsi_read_error(struct ucsi *ucsi)
if (ret)
return ret;
+ ret = ucsi_acknowledge_command(ucsi);
+ if (ret)
+ return ret;
+
switch (error) {
case UCSI_ERROR_INCOMPATIBLE_PARTNER:
return -EOPNOTSUPP;
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 767b5d47631a..e92376837b29 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -337,6 +337,14 @@ static int vf_qm_cache_wb(struct hisi_qm *qm)
return 0;
}
+static struct hisi_acc_vf_core_device *hssi_acc_drvdata(struct pci_dev *pdev)
+{
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
+
+ return container_of(core_device, struct hisi_acc_vf_core_device,
+ core_device);
+}
+
static void vf_qm_fun_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev,
struct hisi_qm *qm)
{
@@ -962,7 +970,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = dev_get_drvdata(&pdev->dev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hssi_acc_drvdata(pdev);
if (hisi_acc_vdev->core_device.vdev.migration_flags !=
VFIO_MIGRATION_STOP_COPY)
@@ -1274,11 +1282,10 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
&hisi_acc_vfio_pci_ops);
}
+ dev_set_drvdata(&pdev->dev, &hisi_acc_vdev->core_device);
ret = vfio_pci_core_register_device(&hisi_acc_vdev->core_device);
if (ret)
goto out_free;
-
- dev_set_drvdata(&pdev->dev, hisi_acc_vdev);
return 0;
out_free:
@@ -1289,7 +1296,7 @@ static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device
static void hisi_acc_vfio_pci_remove(struct pci_dev *pdev)
{
- struct hisi_acc_vf_core_device *hisi_acc_vdev = dev_get_drvdata(&pdev->dev);
+ struct hisi_acc_vf_core_device *hisi_acc_vdev = hssi_acc_drvdata(pdev);
vfio_pci_core_unregister_device(&hisi_acc_vdev->core_device);
vfio_pci_core_uninit_device(&hisi_acc_vdev->core_device);
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index bbec5d288fee..9f59f5807b8a 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -39,6 +39,14 @@ struct mlx5vf_pci_core_device {
struct mlx5_vf_migration_file *saving_migf;
};
+static struct mlx5vf_pci_core_device *mlx5vf_drvdata(struct pci_dev *pdev)
+{
+ struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
+
+ return container_of(core_device, struct mlx5vf_pci_core_device,
+ core_device);
+}
+
static struct page *
mlx5vf_get_migration_page(struct mlx5_vf_migration_file *migf,
unsigned long offset)
@@ -505,7 +513,7 @@ static int mlx5vf_pci_get_device_state(struct vfio_device *vdev,
static void mlx5vf_pci_aer_reset_done(struct pci_dev *pdev)
{
- struct mlx5vf_pci_core_device *mvdev = dev_get_drvdata(&pdev->dev);
+ struct mlx5vf_pci_core_device *mvdev = mlx5vf_drvdata(pdev);
if (!mvdev->migrate_cap)
return;
@@ -614,11 +622,10 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
}
}
+ dev_set_drvdata(&pdev->dev, &mvdev->core_device);
ret = vfio_pci_core_register_device(&mvdev->core_device);
if (ret)
goto out_free;
-
- dev_set_drvdata(&pdev->dev, mvdev);
return 0;
out_free:
@@ -629,7 +636,7 @@ static int mlx5vf_pci_probe(struct pci_dev *pdev,
static void mlx5vf_pci_remove(struct pci_dev *pdev)
{
- struct mlx5vf_pci_core_device *mvdev = dev_get_drvdata(&pdev->dev);
+ struct mlx5vf_pci_core_device *mvdev = mlx5vf_drvdata(pdev);
vfio_pci_core_unregister_device(&mvdev->core_device);
vfio_pci_core_uninit_device(&mvdev->core_device);
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 2b047469e02f..8c990a1a7def 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -151,10 +151,10 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return -ENOMEM;
vfio_pci_core_init_device(vdev, pdev, &vfio_pci_ops);
+ dev_set_drvdata(&pdev->dev, vdev);
ret = vfio_pci_core_register_device(vdev);
if (ret)
goto out_free;
- dev_set_drvdata(&pdev->dev, vdev);
return 0;
out_free:
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 06b6f3594a13..65587fd5c021 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1821,6 +1821,10 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev)
struct pci_dev *pdev = vdev->pdev;
int ret;
+ /* Drivers must set the vfio_pci_core_device to their drvdata */
+ if (WARN_ON(vdev != dev_get_drvdata(&vdev->pdev->dev)))
+ return -EINVAL;
+
if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
return -EINVAL;
diff --git a/drivers/video/fbdev/amba-clcd.c b/drivers/video/fbdev/amba-clcd.c
index 8080116aea84..f65c96d1394d 100644
--- a/drivers/video/fbdev/amba-clcd.c
+++ b/drivers/video/fbdev/amba-clcd.c
@@ -698,16 +698,18 @@ static int clcdfb_of_init_display(struct clcd_fb *fb)
return -ENODEV;
panel = of_graph_get_remote_port_parent(endpoint);
- if (!panel)
- return -ENODEV;
+ if (!panel) {
+ err = -ENODEV;
+ goto out_endpoint_put;
+ }
err = clcdfb_of_get_backlight(&fb->dev->dev, fb->panel);
if (err)
- return err;
+ goto out_panel_put;
err = clcdfb_of_get_mode(&fb->dev->dev, panel, fb->panel);
if (err)
- return err;
+ goto out_panel_put;
err = of_property_read_u32(fb->dev->dev.of_node, "max-memory-bandwidth",
&max_bandwidth);
@@ -736,11 +738,21 @@ static int clcdfb_of_init_display(struct clcd_fb *fb)
if (of_property_read_u32_array(endpoint,
"arm,pl11x,tft-r0g0b0-pads",
- tft_r0b0g0, ARRAY_SIZE(tft_r0b0g0)) != 0)
- return -ENOENT;
+ tft_r0b0g0, ARRAY_SIZE(tft_r0b0g0)) != 0) {
+ err = -ENOENT;
+ goto out_panel_put;
+ }
+
+ of_node_put(panel);
+ of_node_put(endpoint);
return clcdfb_of_init_tft_panel(fb, tft_r0b0g0[0],
tft_r0b0g0[1], tft_r0b0g0[2]);
+out_panel_put:
+ of_node_put(panel);
+out_endpoint_put:
+ of_node_put(endpoint);
+ return err;
}
static int clcdfb_of_vram_setup(struct clcd_fb *fb)
diff --git a/drivers/video/fbdev/arkfb.c b/drivers/video/fbdev/arkfb.c
index eb3e47c58c5f..a2a381631628 100644
--- a/drivers/video/fbdev/arkfb.c
+++ b/drivers/video/fbdev/arkfb.c
@@ -781,7 +781,12 @@ static int arkfb_set_par(struct fb_info *info)
return -EINVAL;
}
- ark_set_pixclock(info, (hdiv * info->var.pixclock) / hmul);
+ value = (hdiv * info->var.pixclock) / hmul;
+ if (!value) {
+ fb_dbg(info, "invalid pixclock\n");
+ value = 1;
+ }
+ ark_set_pixclock(info, value);
svga_set_timings(par->state.vgabase, &ark_timing_regs, &(info->var), hmul, hdiv,
(info->var.vmode & FB_VMODE_DOUBLE) ? 2 : 1,
(info->var.vmode & FB_VMODE_INTERLACED) ? 2 : 1,
@@ -792,6 +797,8 @@ static int arkfb_set_par(struct fb_info *info)
value = ((value * hmul / hdiv) / 8) - 5;
vga_wcrt(par->state.vgabase, 0x42, (value + 1) / 2);
+ if (screen_size > info->screen_size)
+ screen_size = info->screen_size;
memset_io(info->screen_base, 0x00, screen_size);
/* Device and screen back on */
svga_wcrt_mask(par->state.vgabase, 0x17, 0x80, 0x80);
diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 526a7d2de491..d597f0c31578 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -115,8 +115,8 @@ static int logo_lines;
enums. */
static int logo_shown = FBCON_LOGO_CANSHOW;
/* console mappings */
-static int first_fb_vc;
-static int last_fb_vc = MAX_NR_CONSOLES - 1;
+static unsigned int first_fb_vc;
+static unsigned int last_fb_vc = MAX_NR_CONSOLES - 1;
static int fbcon_is_default = 1;
static int primary_device = -1;
static int fbcon_has_console_bind;
@@ -464,10 +464,12 @@ static int __init fb_console_setup(char *this_opt)
options += 3;
if (*options)
first_fb_vc = simple_strtoul(options, &options, 10) - 1;
- if (first_fb_vc < 0)
+ if (first_fb_vc >= MAX_NR_CONSOLES)
first_fb_vc = 0;
if (*options++ == '-')
last_fb_vc = simple_strtoul(options, &options, 10) - 1;
+ if (last_fb_vc < first_fb_vc || last_fb_vc >= MAX_NR_CONSOLES)
+ last_fb_vc = MAX_NR_CONSOLES - 1;
fbcon_is_default = 0;
continue;
}
@@ -1704,8 +1706,6 @@ static bool fbcon_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
case SM_UP:
if (count > vc->vc_rows) /* Maximum realistic size */
count = vc->vc_rows;
- if (logo_shown >= 0)
- goto redraw_up;
switch (fb_scrollmode(p)) {
case SCROLL_MOVE:
fbcon_redraw_blit(vc, info, p, t, b - t - count,
@@ -1794,8 +1794,6 @@ static bool fbcon_scroll(struct vc_data *vc, unsigned int t, unsigned int b,
case SM_DOWN:
if (count > vc->vc_rows) /* Maximum realistic size */
count = vc->vc_rows;
- if (logo_shown >= 0)
- goto redraw_down;
switch (fb_scrollmode(p)) {
case SCROLL_MOVE:
fbcon_redraw_blit(vc, info, p, b - 1, b - t - count,
diff --git a/drivers/video/fbdev/s3fb.c b/drivers/video/fbdev/s3fb.c
index b93c8eb02336..5069f6f67923 100644
--- a/drivers/video/fbdev/s3fb.c
+++ b/drivers/video/fbdev/s3fb.c
@@ -905,6 +905,8 @@ static int s3fb_set_par(struct fb_info *info)
value = clamp((htotal + hsstart + 1) / 2 + 2, hsstart + 4, htotal + 1);
svga_wcrt_multi(par->state.vgabase, s3_dtpc_regs, value);
+ if (screen_size > info->screen_size)
+ screen_size = info->screen_size;
memset_io(info->screen_base, 0x00, screen_size);
/* Device and screen back on */
svga_wcrt_mask(par->state.vgabase, 0x17, 0x80, 0x80);
diff --git a/drivers/video/fbdev/sis/init.c b/drivers/video/fbdev/sis/init.c
index b568c646a76c..2ba91d62af92 100644
--- a/drivers/video/fbdev/sis/init.c
+++ b/drivers/video/fbdev/sis/init.c
@@ -355,12 +355,12 @@ SiS_GetModeID(int VGAEngine, unsigned int VBFlags, int HDisplay, int VDisplay,
}
break;
case 400:
- if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 800) && (LCDwidth >= 600))) {
+ if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 800) && (LCDheight >= 600))) {
if(VDisplay == 300) ModeIndex = ModeIndex_400x300[Depth];
}
break;
case 512:
- if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 1024) && (LCDwidth >= 768))) {
+ if((!(VBFlags & CRT1_LCDA)) || ((LCDwidth >= 1024) && (LCDheight >= 768))) {
if(VDisplay == 384) ModeIndex = ModeIndex_512x384[Depth];
}
break;
diff --git a/drivers/video/fbdev/vt8623fb.c b/drivers/video/fbdev/vt8623fb.c
index a92a8c670cf0..4274c6efb249 100644
--- a/drivers/video/fbdev/vt8623fb.c
+++ b/drivers/video/fbdev/vt8623fb.c
@@ -507,6 +507,8 @@ static int vt8623fb_set_par(struct fb_info *info)
(info->var.vmode & FB_VMODE_DOUBLE) ? 2 : 1, 1,
1, info->node);
+ if (screen_size > info->screen_size)
+ screen_size = info->screen_size;
memset_io(info->screen_base, 0x00, screen_size);
/* Device and screen back on */
diff --git a/drivers/watchdog/armada_37xx_wdt.c b/drivers/watchdog/armada_37xx_wdt.c
index 1635f421ef2c..854b1cc723cb 100644
--- a/drivers/watchdog/armada_37xx_wdt.c
+++ b/drivers/watchdog/armada_37xx_wdt.c
@@ -274,6 +274,8 @@ static int armada_37xx_wdt_probe(struct platform_device *pdev)
if (!res)
return -ENODEV;
dev->reg = devm_ioremap(&pdev->dev, res->start, resource_size(res));
+ if (!dev->reg)
+ return -ENOMEM;
/* init clock */
dev->clk = devm_clk_get(&pdev->dev, NULL);
diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c
index 7f59c680de25..6a16d3d0bb1e 100644
--- a/drivers/watchdog/f71808e_wdt.c
+++ b/drivers/watchdog/f71808e_wdt.c
@@ -634,7 +634,9 @@ static int __init fintek_wdt_init(void)
pdata.type = ret;
- platform_driver_register(&fintek_wdt_driver);
+ ret = platform_driver_register(&fintek_wdt_driver);
+ if (ret)
+ return ret;
wdt_res.name = "superio port";
wdt_res.flags = IORESOURCE_IO;
diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c
index 86ffb58fbc85..ae54dd33e233 100644
--- a/drivers/watchdog/sp5100_tco.c
+++ b/drivers/watchdog/sp5100_tco.c
@@ -402,6 +402,7 @@ static int sp5100_tco_setupdevice_mmio(struct device *dev,
iounmap(addr);
release_resource(res);
+ kfree(res);
return ret;
}
diff --git a/fs/Makefile b/fs/Makefile
index 208a74e0b00e..93b80529f8e8 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -34,8 +34,6 @@ obj-$(CONFIG_TIMERFD) += timerfd.o
obj-$(CONFIG_EVENTFD) += eventfd.o
obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
obj-$(CONFIG_AIO) += aio.o
-obj-$(CONFIG_IO_URING) += io_uring.o
-obj-$(CONFIG_IO_WQ) += io-wq.o
obj-$(CONFIG_FS_DAX) += dax.o
obj-$(CONFIG_FS_ENCRYPTION) += crypto/
obj-$(CONFIG_FS_VERITY) += verity/
diff --git a/fs/attr.c b/fs/attr.c
index dbe996b0dedf..f581c4d00897 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -184,6 +184,8 @@ EXPORT_SYMBOL(setattr_prepare);
*/
int inode_newsize_ok(const struct inode *inode, loff_t offset)
{
+ if (offset < 0)
+ return -EINVAL;
if (inode->i_size < offset) {
unsigned long limit;
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 667b7025d503..0c7fe3142d7c 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1033,8 +1033,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
< block_group->zone_unusable);
WARN_ON(block_group->space_info->disk_total
< block_group->length * factor);
+ WARN_ON(block_group->zone_is_active &&
+ block_group->space_info->active_total_bytes
+ < block_group->length);
}
block_group->space_info->total_bytes -= block_group->length;
+ if (block_group->zone_is_active)
+ block_group->space_info->active_total_bytes -= block_group->length;
block_group->space_info->bytes_readonly -=
(block_group->length - block_group->zone_unusable);
block_group->space_info->bytes_zone_unusable -=
@@ -2102,7 +2107,8 @@ static int read_one_block_group(struct btrfs_fs_info *info,
trace_btrfs_add_block_group(info, cache, 0);
btrfs_update_space_info(info, cache->flags, cache->length,
cache->used, cache->bytes_super,
- cache->zone_unusable, &space_info);
+ cache->zone_unusable, cache->zone_is_active,
+ &space_info);
cache->space_info = space_info;
@@ -2172,7 +2178,7 @@ static int fill_dummy_bgs(struct btrfs_fs_info *fs_info)
}
btrfs_update_space_info(fs_info, bg->flags, em->len, em->len,
- 0, 0, &space_info);
+ 0, 0, false, &space_info);
bg->space_info = space_info;
link_block_group(bg);
@@ -2553,7 +2559,7 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
trace_btrfs_add_block_group(fs_info, cache, 1);
btrfs_update_space_info(fs_info, cache->flags, size, bytes_used,
cache->bytes_super, cache->zone_unusable,
- &cache->space_info);
+ cache->zone_is_active, &cache->space_info);
btrfs_update_global_block_rsv(fs_info);
link_block_group(cache);
@@ -2653,6 +2659,14 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
if (ret < 0)
goto out;
+ /*
+ * We have allocated a new chunk. We also need to activate that chunk to
+ * grant metadata tickets for zoned filesystem.
+ */
+ ret = btrfs_zoned_activate_one_bg(fs_info, cache->space_info, true);
+ if (ret < 0)
+ goto out;
+
ret = inc_block_group_ro(cache, 0);
if (ret == -ETXTBSY)
goto unlock_out;
@@ -3724,6 +3738,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
* attempt.
*/
wait_for_alloc = true;
+ force = CHUNK_ALLOC_NO_FORCE;
spin_unlock(&space_info->lock);
mutex_lock(&fs_info->chunk_mutex);
mutex_unlock(&fs_info->chunk_mutex);
@@ -3846,6 +3861,14 @@ static void reserve_chunk_space(struct btrfs_trans_handle *trans,
if (IS_ERR(bg)) {
ret = PTR_ERR(bg);
} else {
+ /*
+ * We have a new chunk. We also need to activate it for
+ * zoned filesystem.
+ */
+ ret = btrfs_zoned_activate_one_bg(fs_info, info, true);
+ if (ret < 0)
+ return;
+
/*
* If we fail to add the chunk item here, we end up
* trying again at phase 2 of chunk allocation, at
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 077c95e9baa5..4ce0977ec9e1 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -107,14 +107,6 @@ struct btrfs_ioctl_encoded_io_args;
#define BTRFS_STAT_CURR 0
#define BTRFS_STAT_PREV 1
-/*
- * Count how many BTRFS_MAX_EXTENT_SIZE cover the @size
- */
-static inline u32 count_max_extents(u64 size)
-{
- return div_u64(size + BTRFS_MAX_EXTENT_SIZE - 1, BTRFS_MAX_EXTENT_SIZE);
-}
-
static inline unsigned long btrfs_chunk_item_size(int num_stripes)
{
BUG_ON(num_stripes == 0);
@@ -635,6 +627,9 @@ enum {
/* Indicate we have half completed snapshot deletions pending. */
BTRFS_FS_UNFINISHED_DROPS,
+ /* Indicate we have to finish a zone to do next allocation. */
+ BTRFS_FS_NEED_ZONE_FINISH,
+
#if BITS_PER_LONG == 32
/* Indicate if we have error/warn message printed on 32bit systems */
BTRFS_FS_32BIT_ERROR,
@@ -1032,6 +1027,12 @@ struct btrfs_fs_info {
u32 csums_per_leaf;
u32 stripesize;
+ /*
+ * Maximum size of an extent. BTRFS_MAX_EXTENT_SIZE on regular
+ * filesystem, on zoned it depends on the device constraints.
+ */
+ u64 max_extent_size;
+
/* Block groups and devices containing active swapfiles. */
spinlock_t swapfile_pins_lock;
struct rb_root swapfile_pins;
@@ -1050,6 +1051,8 @@ struct btrfs_fs_info {
u64 zoned;
};
+ /* Max size to emit ZONE_APPEND write command */
+ u64 max_zone_append_size;
struct mutex zoned_meta_io_lock;
spinlock_t treelog_bg_lock;
u64 treelog_bg;
@@ -1066,6 +1069,8 @@ struct btrfs_fs_info {
spinlock_t zone_active_bgs_lock;
struct list_head zone_active_bgs;
+ /* Waiters when BTRFS_FS_NEED_ZONE_FINISH is set */
+ wait_queue_head_t zone_finish_wait;
#ifdef CONFIG_BTRFS_FS_REF_VERIFY
spinlock_t ref_verify_lock;
@@ -3932,6 +3937,19 @@ static inline bool btrfs_is_zoned(const struct btrfs_fs_info *fs_info)
return fs_info->zoned != 0;
}
+/*
+ * Count how many fs_info->max_extent_size cover the @size
+ */
+static inline u32 count_max_extents(struct btrfs_fs_info *fs_info, u64 size)
+{
+#ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
+ if (!fs_info)
+ return div_u64(size + BTRFS_MAX_EXTENT_SIZE - 1, BTRFS_MAX_EXTENT_SIZE);
+#endif
+
+ return div_u64(size + fs_info->max_extent_size - 1, fs_info->max_extent_size);
+}
+
static inline bool btrfs_is_data_reloc_root(const struct btrfs_root *root)
{
return root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID;
diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index bd8267c4687d..a224f7e52bf7 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -273,7 +273,7 @@ static void calc_inode_reservations(struct btrfs_fs_info *fs_info,
u64 num_bytes, u64 disk_num_bytes,
u64 *meta_reserve, u64 *qgroup_reserve)
{
- u64 nr_extents = count_max_extents(num_bytes);
+ u64 nr_extents = count_max_extents(fs_info, num_bytes);
u64 csum_leaves = btrfs_csum_bytes_to_leaves(fs_info, disk_num_bytes);
u64 inode_update = btrfs_calc_metadata_size(fs_info, 1);
@@ -349,7 +349,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes,
* needs to free the reservation we just made.
*/
spin_lock(&inode->lock);
- nr_extents = count_max_extents(num_bytes);
+ nr_extents = count_max_extents(fs_info, num_bytes);
btrfs_mod_outstanding_extents(inode, nr_extents);
inode->csum_bytes += disk_num_bytes;
btrfs_calculate_inode_block_rsv_size(fs_info, inode);
@@ -412,7 +412,7 @@ void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes)
unsigned num_extents;
spin_lock(&inode->lock);
- num_extents = count_max_extents(num_bytes);
+ num_extents = count_max_extents(fs_info, num_bytes);
btrfs_mod_outstanding_extents(inode, -num_extents);
btrfs_calculate_inode_block_rsv_size(fs_info, inode);
spin_unlock(&inode->lock);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6f30413ed9a9..59fa7bf3a2e5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3239,6 +3239,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
init_waitqueue_head(&fs_info->transaction_blocked_wait);
init_waitqueue_head(&fs_info->async_submit_wait);
init_waitqueue_head(&fs_info->delayed_iputs_wait);
+ init_waitqueue_head(&fs_info->zone_finish_wait);
/* Usable values until the real ones are cached from the superblock */
fs_info->nodesize = 4096;
@@ -3246,6 +3247,8 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
fs_info->sectorsize_bits = ilog2(4096);
fs_info->stripesize = 4096;
+ fs_info->max_extent_size = BTRFS_MAX_EXTENT_SIZE;
+
spin_lock_init(&fs_info->swapfile_pins_lock);
fs_info->swapfile_pins = RB_ROOT;
@@ -3577,16 +3580,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
*/
fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
- /*
- * Flag our filesystem as having big metadata blocks if they are bigger
- * than the page size.
- */
- if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
- if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
- btrfs_info(fs_info,
- "flagging fs with big metadata feature");
- features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
- }
/* Set up fs_info before parsing mount options */
nodesize = btrfs_super_nodesize(disk_super);
@@ -3627,6 +3620,17 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
if (features & BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
btrfs_info(fs_info, "has skinny extents");
+ /*
+ * Flag our filesystem as having big metadata blocks if they are bigger
+ * than the page size.
+ */
+ if (btrfs_super_nodesize(disk_super) > PAGE_SIZE) {
+ if (!(features & BTRFS_FEATURE_INCOMPAT_BIG_METADATA))
+ btrfs_info(fs_info,
+ "flagging fs with big metadata feature");
+ features |= BTRFS_FEATURE_INCOMPAT_BIG_METADATA;
+ }
+
/*
* mixed block groups end up with duplicate but slightly offset
* extent buffers for the same range. It leads to corruptions
@@ -3654,6 +3658,20 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
err = -EINVAL;
goto fail_alloc;
}
+ /*
+ * We have unsupported RO compat features, although RO mounted, we
+ * should not cause any metadata write, including log replay.
+ * Or we could screw up whatever the new feature requires.
+ */
+ if (unlikely(features && btrfs_super_log_root(disk_super) &&
+ !btrfs_test_opt(fs_info, NOLOGREPLAY))) {
+ btrfs_err(fs_info,
+"cannot replay dirty log with unsupported compat_ro features (0x%llx), try rescue=nologreplay",
+ features);
+ err = -EINVAL;
+ goto fail_alloc;
+ }
+
if (sectorsize < PAGE_SIZE) {
struct btrfs_subpage_info *subpage_info;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f45ecd939a2c..eee68a6f2be7 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3803,8 +3803,7 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
/* Check RO and no space case before trying to activate it */
spin_lock(&block_group->lock);
- if (block_group->ro ||
- block_group->alloc_offset == block_group->zone_capacity) {
+ if (block_group->ro || btrfs_zoned_bg_is_full(block_group)) {
ret = 1;
/*
* May need to clear fs_info->{treelog,data_reloc}_bg.
@@ -3985,23 +3984,63 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
}
}
-static bool can_allocate_chunk(struct btrfs_fs_info *fs_info,
- struct find_free_extent_ctl *ffe_ctl)
+static int can_allocate_chunk_zoned(struct btrfs_fs_info *fs_info,
+ struct find_free_extent_ctl *ffe_ctl)
+{
+ /* If we can activate new zone, just allocate a chunk and use it */
+ if (btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->flags))
+ return 0;
+
+ /*
+ * We already reached the max active zones. Try to finish one block
+ * group to make a room for a new block group. This is only possible
+ * for a data block group because btrfs_zone_finish() may need to wait
+ * for a running transaction which can cause a deadlock for metadata
+ * allocation.
+ */
+ if (ffe_ctl->flags & BTRFS_BLOCK_GROUP_DATA) {
+ int ret = btrfs_zone_finish_one_bg(fs_info);
+
+ if (ret == 1)
+ return 0;
+ else if (ret < 0)
+ return ret;
+ }
+
+ /*
+ * If we have enough free space left in an already active block group
+ * and we can't activate any other zone now, do not allow allocating a
+ * new chunk and let find_free_extent() retry with a smaller size.
+ */
+ if (ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size)
+ return -ENOSPC;
+
+ /*
+ * Even min_alloc_size is not left in any block groups. Since we cannot
+ * activate a new block group, allocating it may not help. Let's tell a
+ * caller to try again and hope it progress something by writing some
+ * parts of the region. That is only possible for data block groups,
+ * where a part of the region can be written.
+ */
+ if (ffe_ctl->flags & BTRFS_BLOCK_GROUP_DATA)
+ return -EAGAIN;
+
+ /*
+ * We cannot activate a new block group and no enough space left in any
+ * block groups. So, allocating a new block group may not help. But,
+ * there is nothing to do anyway, so let's go with it.
+ */
+ return 0;
+}
+
+static int can_allocate_chunk(struct btrfs_fs_info *fs_info,
+ struct find_free_extent_ctl *ffe_ctl)
{
switch (ffe_ctl->policy) {
case BTRFS_EXTENT_ALLOC_CLUSTERED:
- return true;
+ return 0;
case BTRFS_EXTENT_ALLOC_ZONED:
- /*
- * If we have enough free space left in an already
- * active block group and we can't activate any other
- * zone now, do not allow allocating a new chunk and
- * let find_free_extent() retry with a smaller size.
- */
- if (ffe_ctl->max_extent_size >= ffe_ctl->min_alloc_size &&
- !btrfs_can_activate_zone(fs_info->fs_devices, ffe_ctl->flags))
- return false;
- return true;
+ return can_allocate_chunk_zoned(fs_info, ffe_ctl);
default:
BUG();
}
@@ -4083,8 +4122,9 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
int exist = 0;
/*Check if allocation policy allows to create a new chunk */
- if (!can_allocate_chunk(fs_info, ffe_ctl))
- return -ENOSPC;
+ ret = can_allocate_chunk(fs_info, ffe_ctl);
+ if (ret)
+ return ret;
trans = current->journal_info;
if (trans)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 68ddd90685d9..bfc7d5b31156 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1992,10 +1992,12 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode,
struct page *locked_page, u64 *start,
u64 *end)
{
+ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
const u64 orig_start = *start;
const u64 orig_end = *end;
- u64 max_bytes = BTRFS_MAX_EXTENT_SIZE;
+ /* The sanity tests may not set a valid fs_info. */
+ u64 max_bytes = fs_info ? fs_info->max_extent_size : BTRFS_MAX_EXTENT_SIZE;
u64 delalloc_start;
u64 delalloc_end;
bool found;
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 153920acd226..2d24f2dcc0ea 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2344,7 +2344,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
btrfs_release_log_ctx_extents(&ctx);
if (ret < 0) {
/* Fallthrough and commit/free transaction. */
- ret = 1;
+ ret = BTRFS_LOG_FORCE_COMMIT;
}
/* we've logged all the items and now have a consistent
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 01a408db5683..ef84bc5030cd 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2630,16 +2630,19 @@ int __btrfs_add_free_space(struct btrfs_block_group *block_group,
static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group,
u64 bytenr, u64 size, bool used)
{
- struct btrfs_fs_info *fs_info = block_group->fs_info;
+ struct btrfs_space_info *sinfo = block_group->space_info;
struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl;
u64 offset = bytenr - block_group->start;
u64 to_free, to_unusable;
- const int bg_reclaim_threshold = READ_ONCE(fs_info->bg_reclaim_threshold);
+ int bg_reclaim_threshold = 0;
bool initial = (size == block_group->length);
u64 reclaimable_unusable;
WARN_ON(!initial && offset + size > block_group->zone_capacity);
+ if (!initial)
+ bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
+
spin_lock(&ctl->tree_lock);
if (!used)
to_free = size;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5d15e374d032..6cf4b3146667 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -92,7 +92,8 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent);
static noinline int cow_file_range(struct btrfs_inode *inode,
struct page *locked_page,
u64 start, u64 end, int *page_started,
- unsigned long *nr_written, int unlock);
+ unsigned long *nr_written, int unlock,
+ u64 *done_offset);
static struct extent_map *create_io_em(struct btrfs_inode *inode, u64 start,
u64 len, u64 orig_start, u64 block_start,
u64 block_len, u64 orig_block_len,
@@ -884,15 +885,25 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
* can directly submit them without interruption.
*/
ret = cow_file_range(inode, locked_page, start, end, &page_started,
- &nr_written, 0);
+ &nr_written, 0, NULL);
/* Inline extent inserted, page gets unlocked and everything is done */
if (page_started) {
ret = 0;
goto out;
}
if (ret < 0) {
- if (locked_page)
+ btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
+ if (locked_page) {
+ const u64 page_start = page_offset(locked_page);
+ const u64 page_end = page_start + PAGE_SIZE - 1;
+
+ btrfs_page_set_error(inode->root->fs_info, locked_page,
+ page_start, PAGE_SIZE);
+ set_page_writeback(locked_page);
+ end_page_writeback(locked_page);
+ end_extent_writepage(locked_page, ret, page_start, page_end);
unlock_page(locked_page);
+ }
goto out;
}
@@ -1097,15 +1108,39 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
* *page_started is set to one if we unlock locked_page and do everything
* required to start IO on it. It may be clean and already done with
* IO when we return.
+ *
+ * When unlock == 1, we unlock the pages in successfully allocated regions.
+ * When unlock == 0, we leave them locked for writing them out.
+ *
+ * However, we unlock all the pages except @locked_page in case of failure.
+ *
+ * In summary, page locking state will be as follow:
+ *
+ * - page_started == 1 (return value)
+ * - All the pages are unlocked. IO is started.
+ * - Note that this can happen only on success
+ * - unlock == 1
+ * - All the pages except @locked_page are unlocked in any case
+ * - unlock == 0
+ * - On success, all the pages are locked for writing out them
+ * - On failure, all the pages except @locked_page are unlocked
+ *
+ * When a failure happens in the second or later iteration of the
+ * while-loop, the ordered extents created in previous iterations are kept
+ * intact. So, the caller must clean them up by calling
+ * btrfs_cleanup_ordered_extents(). See btrfs_run_delalloc_range() for
+ * example.
*/
static noinline int cow_file_range(struct btrfs_inode *inode,
struct page *locked_page,
u64 start, u64 end, int *page_started,
- unsigned long *nr_written, int unlock)
+ unsigned long *nr_written, int unlock,
+ u64 *done_offset)
{
struct btrfs_root *root = inode->root;
struct btrfs_fs_info *fs_info = root->fs_info;
u64 alloc_hint = 0;
+ u64 orig_start = start;
u64 num_bytes;
unsigned long ram_size;
u64 cur_alloc_size = 0;
@@ -1293,18 +1328,62 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
btrfs_dec_block_group_reservations(fs_info, ins.objectid);
btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1);
out_unlock:
+ /*
+ * If done_offset is non-NULL and ret == -EAGAIN, we expect the
+ * caller to write out the successfully allocated region and retry.
+ */
+ if (done_offset && ret == -EAGAIN) {
+ if (orig_start < start)
+ *done_offset = start - 1;
+ else
+ *done_offset = start;
+ return ret;
+ } else if (ret == -EAGAIN) {
+ /* Convert to -ENOSPC since the caller cannot retry. */
+ ret = -ENOSPC;
+ }
+
+ /*
+ * Now, we have three regions to clean up:
+ *
+ * |-------(1)----|---(2)---|-------------(3)----------|
+ * `- orig_start `- start `- start + cur_alloc_size `- end
+ *
+ * We process each region below.
+ */
+
clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
+
/*
- * If we reserved an extent for our delalloc range (or a subrange) and
- * failed to create the respective ordered extent, then it means that
- * when we reserved the extent we decremented the extent's size from
- * the data space_info's bytes_may_use counter and incremented the
- * space_info's bytes_reserved counter by the same amount. We must make
- * sure extent_clear_unlock_delalloc() does not try to decrement again
- * the data space_info's bytes_may_use counter, therefore we do not pass
- * it the flag EXTENT_CLEAR_DATA_RESV.
+ * For the range (1). We have already instantiated the ordered extents
+ * for this region. They are cleaned up by
+ * btrfs_cleanup_ordered_extents() in e.g,
+ * btrfs_run_delalloc_range(). EXTENT_LOCKED | EXTENT_DELALLOC are
+ * already cleared in the above loop. And, EXTENT_DELALLOC_NEW |
+ * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup
+ * function.
+ *
+ * However, in case of unlock == 0, we still need to unlock the pages
+ * (except @locked_page) to ensure all the pages are unlocked.
+ */
+ if (!unlock && orig_start < start) {
+ if (!locked_page)
+ mapping_set_error(inode->vfs_inode.i_mapping, ret);
+ extent_clear_unlock_delalloc(inode, orig_start, start - 1,
+ locked_page, 0, page_ops);
+ }
+
+ /*
+ * For the range (2). If we reserved an extent for our delalloc range
+ * (or a subrange) and failed to create the respective ordered extent,
+ * then it means that when we reserved the extent we decremented the
+ * extent's size from the data space_info's bytes_may_use counter and
+ * incremented the space_info's bytes_reserved counter by the same
+ * amount. We must make sure extent_clear_unlock_delalloc() does not try
+ * to decrement again the data space_info's bytes_may_use counter,
+ * therefore we do not pass it the flag EXTENT_CLEAR_DATA_RESV.
*/
if (extent_reserved) {
extent_clear_unlock_delalloc(inode, start,
@@ -1316,6 +1395,13 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
if (start >= end)
goto out;
}
+
+ /*
+ * For the range (3). We never touched the region. In addition to the
+ * clear_bits above, we add EXTENT_CLEAR_DATA_RESV to release the data
+ * space_info's bytes_may_use counter, reserved in
+ * btrfs_check_data_free_space().
+ */
extent_clear_unlock_delalloc(inode, start, end, locked_page,
clear_bits | EXTENT_CLEAR_DATA_RESV,
page_ops);
@@ -1502,19 +1588,42 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
u64 end, int *page_started,
unsigned long *nr_written)
{
+ u64 done_offset = end;
int ret;
+ bool locked_page_done = false;
- ret = cow_file_range(inode, locked_page, start, end, page_started,
- nr_written, 0);
- if (ret)
- return ret;
+ while (start <= end) {
+ ret = cow_file_range(inode, locked_page, start, end, page_started,
+ nr_written, 0, &done_offset);
+ if (ret && ret != -EAGAIN)
+ return ret;
- if (*page_started)
- return 0;
+ if (*page_started) {
+ ASSERT(ret == 0);
+ return 0;
+ }
+
+ if (ret == 0)
+ done_offset = end;
+
+ if (done_offset == start) {
+ struct btrfs_fs_info *info = inode->root->fs_info;
+
+ wait_var_event(&info->zone_finish_wait,
+ !test_bit(BTRFS_FS_NEED_ZONE_FINISH, &info->flags));
+ continue;
+ }
+
+ if (!locked_page_done) {
+ __set_page_dirty_nobuffers(locked_page);
+ account_page_redirty(locked_page);
+ }
+ locked_page_done = true;
+ extent_write_locked_range(&inode->vfs_inode, start, done_offset);
+
+ start = done_offset + 1;
+ }
- __set_page_dirty_nobuffers(locked_page);
- account_page_redirty(locked_page);
- extent_write_locked_range(&inode->vfs_inode, start, end);
*page_started = 1;
return 0;
@@ -1606,7 +1715,7 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page,
}
return cow_file_range(inode, locked_page, start, end, page_started,
- nr_written, 1);
+ nr_written, 1, NULL);
}
/*
@@ -2017,7 +2126,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
page_started, nr_written);
else
ret = cow_file_range(inode, locked_page, start, end,
- page_started, nr_written, 1);
+ page_started, nr_written, 1, NULL);
} else {
set_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, &inode->runtime_flags);
ret = cow_file_range_async(inode, wbc, locked_page, start, end,
@@ -2033,6 +2142,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
void btrfs_split_delalloc_extent(struct inode *inode,
struct extent_state *orig, u64 split)
{
+ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
u64 size;
/* not delalloc, ignore it */
@@ -2040,7 +2150,7 @@ void btrfs_split_delalloc_extent(struct inode *inode,
return;
size = orig->end - orig->start + 1;
- if (size > BTRFS_MAX_EXTENT_SIZE) {
+ if (size > fs_info->max_extent_size) {
u32 num_extents;
u64 new_size;
@@ -2049,10 +2159,10 @@ void btrfs_split_delalloc_extent(struct inode *inode,
* applies here, just in reverse.
*/
new_size = orig->end - split + 1;
- num_extents = count_max_extents(new_size);
+ num_extents = count_max_extents(fs_info, new_size);
new_size = split - orig->start;
- num_extents += count_max_extents(new_size);
- if (count_max_extents(size) >= num_extents)
+ num_extents += count_max_extents(fs_info, new_size);
+ if (count_max_extents(fs_info, size) >= num_extents)
return;
}
@@ -2069,6 +2179,7 @@ void btrfs_split_delalloc_extent(struct inode *inode,
void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new,
struct extent_state *other)
{
+ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
u64 new_size, old_size;
u32 num_extents;
@@ -2082,7 +2193,7 @@ void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new,
new_size = other->end - new->start + 1;
/* we're not bigger than the max, unreserve the space and go */
- if (new_size <= BTRFS_MAX_EXTENT_SIZE) {
+ if (new_size <= fs_info->max_extent_size) {
spin_lock(&BTRFS_I(inode)->lock);
btrfs_mod_outstanding_extents(BTRFS_I(inode), -1);
spin_unlock(&BTRFS_I(inode)->lock);
@@ -2108,10 +2219,10 @@ void btrfs_merge_delalloc_extent(struct inode *inode, struct extent_state *new,
* this case.
*/
old_size = other->end - other->start + 1;
- num_extents = count_max_extents(old_size);
+ num_extents = count_max_extents(fs_info, old_size);
old_size = new->end - new->start + 1;
- num_extents += count_max_extents(old_size);
- if (count_max_extents(new_size) >= num_extents)
+ num_extents += count_max_extents(fs_info, old_size);
+ if (count_max_extents(fs_info, new_size) >= num_extents)
return;
spin_lock(&BTRFS_I(inode)->lock);
@@ -2190,7 +2301,7 @@ void btrfs_set_delalloc_extent(struct inode *inode, struct extent_state *state,
if (!(state->state & EXTENT_DELALLOC) && (*bits & EXTENT_DELALLOC)) {
struct btrfs_root *root = BTRFS_I(inode)->root;
u64 len = state->end + 1 - state->start;
- u32 num_extents = count_max_extents(len);
+ u32 num_extents = count_max_extents(fs_info, len);
bool do_list = !btrfs_is_free_space_inode(BTRFS_I(inode));
spin_lock(&BTRFS_I(inode)->lock);
@@ -2232,7 +2343,7 @@ void btrfs_clear_delalloc_extent(struct inode *vfs_inode,
struct btrfs_inode *inode = BTRFS_I(vfs_inode);
struct btrfs_fs_info *fs_info = btrfs_sb(vfs_inode->i_sb);
u64 len = state->end + 1 - state->start;
- u32 num_extents = count_max_extents(len);
+ u32 num_extents = count_max_extents(fs_info, len);
if ((state->state & EXTENT_DEFRAG) && (*bits & EXTENT_DEFRAG)) {
spin_lock(&inode->lock);
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b87931a458eb..104cbc901c0e 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -9,6 +9,7 @@
#include "ordered-data.h"
#include "transaction.h"
#include "block-group.h"
+#include "zoned.h"
/*
* HOW DOES SPACE RESERVATION WORK
@@ -181,6 +182,43 @@ void btrfs_clear_space_info_full(struct btrfs_fs_info *info)
found->full = 0;
}
+/*
+ * Block groups with more than this value (percents) of unusable space will be
+ * scheduled for background reclaim.
+ */
+#define BTRFS_DEFAULT_ZONED_RECLAIM_THRESH (75)
+
+/*
+ * Calculate chunk size depending on volume type (regular or zoned).
+ */
+static u64 calc_chunk_size(const struct btrfs_fs_info *fs_info, u64 flags)
+{
+ if (btrfs_is_zoned(fs_info))
+ return fs_info->zone_size;
+
+ ASSERT(flags & BTRFS_BLOCK_GROUP_TYPE_MASK);
+
+ if (flags & BTRFS_BLOCK_GROUP_DATA)
+ return SZ_1G;
+ else if (flags & BTRFS_BLOCK_GROUP_SYSTEM)
+ return SZ_32M;
+
+ /* Handle BTRFS_BLOCK_GROUP_METADATA */
+ if (fs_info->fs_devices->total_rw_bytes > 50ULL * SZ_1G)
+ return SZ_1G;
+
+ return SZ_256M;
+}
+
+/*
+ * Update default chunk size.
+ */
+void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
+ u64 chunk_size)
+{
+ WRITE_ONCE(space_info->chunk_size, chunk_size);
+}
+
static int create_space_info(struct btrfs_fs_info *info, u64 flags)
{
@@ -202,6 +240,10 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
INIT_LIST_HEAD(&space_info->tickets);
INIT_LIST_HEAD(&space_info->priority_tickets);
space_info->clamp = 1;
+ btrfs_update_space_info_chunk_size(space_info, calc_chunk_size(info, flags));
+
+ if (btrfs_is_zoned(info))
+ space_info->bg_reclaim_threshold = BTRFS_DEFAULT_ZONED_RECLAIM_THRESH;
ret = btrfs_sysfs_add_space_info_type(info, space_info);
if (ret)
@@ -254,7 +296,7 @@ int btrfs_init_space_info(struct btrfs_fs_info *fs_info)
void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
u64 total_bytes, u64 bytes_used,
u64 bytes_readonly, u64 bytes_zone_unusable,
- struct btrfs_space_info **space_info)
+ bool active, struct btrfs_space_info **space_info)
{
struct btrfs_space_info *found;
int factor;
@@ -265,6 +307,8 @@ void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
ASSERT(found);
spin_lock(&found->lock);
found->total_bytes += total_bytes;
+ if (active)
+ found->active_total_bytes += total_bytes;
found->disk_total += total_bytes * factor;
found->bytes_used += bytes_used;
found->disk_used += bytes_used * factor;
@@ -328,6 +372,22 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
return avail;
}
+static inline u64 writable_total_bytes(struct btrfs_fs_info *fs_info,
+ struct btrfs_space_info *space_info)
+{
+ /*
+ * On regular filesystem, all total_bytes are always writable. On zoned
+ * filesystem, there may be a limitation imposed by max_active_zones.
+ * For metadata allocation, we cannot finish an existing active block
+ * group to avoid a deadlock. Thus, we need to consider only the active
+ * groups to be writable for metadata space.
+ */
+ if (!btrfs_is_zoned(fs_info) || (space_info->flags & BTRFS_BLOCK_GROUP_DATA))
+ return space_info->total_bytes;
+
+ return space_info->active_total_bytes;
+}
+
int btrfs_can_overcommit(struct btrfs_fs_info *fs_info,
struct btrfs_space_info *space_info, u64 bytes,
enum btrfs_reserve_flush_enum flush)
@@ -340,9 +400,12 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info,
return 0;
used = btrfs_space_info_used(space_info, true);
- avail = calc_available_free_space(fs_info, space_info, flush);
+ if (btrfs_is_zoned(fs_info) && (space_info->flags & BTRFS_BLOCK_GROUP_METADATA))
+ avail = 0;
+ else
+ avail = calc_available_free_space(fs_info, space_info, flush);
- if (used + bytes < space_info->total_bytes + avail)
+ if (used + bytes < writable_total_bytes(fs_info, space_info) + avail)
return 1;
return 0;
}
@@ -378,7 +441,7 @@ void btrfs_try_granting_tickets(struct btrfs_fs_info *fs_info,
ticket = list_first_entry(head, struct reserve_ticket, list);
/* Check and see if our ticket can be satisfied now. */
- if ((used + ticket->bytes <= space_info->total_bytes) ||
+ if ((used + ticket->bytes <= writable_total_bytes(fs_info, space_info)) ||
btrfs_can_overcommit(fs_info, space_info, ticket->bytes,
flush)) {
btrfs_space_info_update_bytes_may_use(fs_info,
@@ -662,6 +725,18 @@ static void flush_space(struct btrfs_fs_info *fs_info,
break;
case ALLOC_CHUNK:
case ALLOC_CHUNK_FORCE:
+ /*
+ * For metadata space on zoned filesystem, reaching here means we
+ * don't have enough space left in active_total_bytes. Try to
+ * activate a block group first, because we may have inactive
+ * block group already allocated.
+ */
+ ret = btrfs_zoned_activate_one_bg(fs_info, space_info, false);
+ if (ret < 0)
+ break;
+ else if (ret == 1)
+ break;
+
trans = btrfs_join_transaction(root);
if (IS_ERR(trans)) {
ret = PTR_ERR(trans);
@@ -672,6 +747,23 @@ static void flush_space(struct btrfs_fs_info *fs_info,
(state == ALLOC_CHUNK) ? CHUNK_ALLOC_NO_FORCE :
CHUNK_ALLOC_FORCE);
btrfs_end_transaction(trans);
+
+ /*
+ * For metadata space on zoned filesystem, allocating a new chunk
+ * is not enough. We still need to activate the block * group.
+ * Active the newly allocated block group by (maybe) finishing
+ * a block group.
+ */
+ if (ret == 1) {
+ ret = btrfs_zoned_activate_one_bg(fs_info, space_info, true);
+ /*
+ * Revert to the original ret regardless we could finish
+ * one block group or not.
+ */
+ if (ret >= 0)
+ ret = 1;
+ }
+
if (ret > 0 || ret == -ENOSPC)
ret = 0;
break;
@@ -709,6 +801,7 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
{
u64 used;
u64 avail;
+ u64 total;
u64 to_reclaim = space_info->reclaim_size;
lockdep_assert_held(&space_info->lock);
@@ -723,8 +816,9 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
* space. If that's the case add in our overage so we make sure to put
* appropriate pressure on the flushing state machine.
*/
- if (space_info->total_bytes + avail < used)
- to_reclaim += used - (space_info->total_bytes + avail);
+ total = writable_total_bytes(fs_info, space_info);
+ if (total + avail < used)
+ to_reclaim += used - (total + avail);
return to_reclaim;
}
@@ -734,9 +828,12 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
{
u64 global_rsv_size = fs_info->global_block_rsv.reserved;
u64 ordered, delalloc;
- u64 thresh = div_factor_fine(space_info->total_bytes, 90);
+ u64 total = writable_total_bytes(fs_info, space_info);
+ u64 thresh;
u64 used;
+ thresh = div_factor_fine(total, 90);
+
lockdep_assert_held(&space_info->lock);
/* If we're just plain full then async reclaim just slows us down. */
@@ -798,8 +895,8 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
BTRFS_RESERVE_FLUSH_ALL);
used = space_info->bytes_used + space_info->bytes_reserved +
space_info->bytes_readonly + global_rsv_size;
- if (used < space_info->total_bytes)
- thresh += space_info->total_bytes - used;
+ if (used < total)
+ thresh += total - used;
thresh >>= space_info->clamp;
used = space_info->bytes_pinned;
@@ -1516,7 +1613,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
* can_overcommit() to ensure we can overcommit to continue.
*/
if (!pending_tickets &&
- ((used + orig_bytes <= space_info->total_bytes) ||
+ ((used + orig_bytes <= writable_total_bytes(fs_info, space_info)) ||
btrfs_can_overcommit(fs_info, space_info, orig_bytes, flush))) {
btrfs_space_info_update_bytes_may_use(fs_info, space_info,
orig_bytes);
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index d841fed73492..b8cee27df213 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -17,12 +17,22 @@ struct btrfs_space_info {
u64 bytes_may_use; /* number of bytes that may be used for
delalloc/allocations */
u64 bytes_readonly; /* total bytes that are read only */
+ /* Total bytes in the space, but only accounts active block groups. */
+ u64 active_total_bytes;
u64 bytes_zone_unusable; /* total bytes that are unusable until
resetting the device zone */
u64 max_extent_size; /* This will hold the maximum extent size of
the space info if we had an ENOSPC in the
allocator. */
+ /* Chunk size in bytes */
+ u64 chunk_size;
+
+ /*
+ * Once a block group drops below this threshold (percents) we'll
+ * schedule it for reclaim.
+ */
+ int bg_reclaim_threshold;
int clamp; /* Used to scale our threshold for preemptive
flushing. The value is >> clamp, so turns
@@ -114,7 +124,9 @@ int btrfs_init_space_info(struct btrfs_fs_info *fs_info);
void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
u64 total_bytes, u64 bytes_used,
u64 bytes_readonly, u64 bytes_zone_unusable,
- struct btrfs_space_info **space_info);
+ bool active, struct btrfs_space_info **space_info);
+void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
+ u64 chunk_size);
struct btrfs_space_info *btrfs_find_space_info(struct btrfs_fs_info *info,
u64 flags);
u64 __pure btrfs_space_info_used(struct btrfs_space_info *s_info,
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index ba78ca5aabbb..43845cae0c74 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -722,6 +722,42 @@ SPACE_INFO_ATTR(bytes_zone_unusable);
SPACE_INFO_ATTR(disk_used);
SPACE_INFO_ATTR(disk_total);
+static ssize_t btrfs_sinfo_bg_reclaim_threshold_show(struct kobject *kobj,
+ struct kobj_attribute *a,
+ char *buf)
+{
+ struct btrfs_space_info *space_info = to_space_info(kobj);
+ ssize_t ret;
+
+ ret = sysfs_emit(buf, "%d\n", READ_ONCE(space_info->bg_reclaim_threshold));
+
+ return ret;
+}
+
+static ssize_t btrfs_sinfo_bg_reclaim_threshold_store(struct kobject *kobj,
+ struct kobj_attribute *a,
+ const char *buf, size_t len)
+{
+ struct btrfs_space_info *space_info = to_space_info(kobj);
+ int thresh;
+ int ret;
+
+ ret = kstrtoint(buf, 10, &thresh);
+ if (ret)
+ return ret;
+
+ if (thresh != 0 && (thresh <= 50 || thresh > 100))
+ return -EINVAL;
+
+ WRITE_ONCE(space_info->bg_reclaim_threshold, thresh);
+
+ return len;
+}
+
+BTRFS_ATTR_RW(space_info, bg_reclaim_threshold,
+ btrfs_sinfo_bg_reclaim_threshold_show,
+ btrfs_sinfo_bg_reclaim_threshold_store);
+
/*
* Allocation information about block group types.
*
@@ -738,6 +774,7 @@ static struct attribute *space_info_attrs[] = {
BTRFS_ATTR_PTR(space_info, bytes_zone_unusable),
BTRFS_ATTR_PTR(space_info, disk_used),
BTRFS_ATTR_PTR(space_info, disk_total),
+ BTRFS_ATTR_PTR(space_info, bg_reclaim_threshold),
NULL,
};
ATTRIBUTE_GROUPS(space_info);
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index e65633686378..2cc376775ae4 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -171,7 +171,7 @@ static int start_log_trans(struct btrfs_trans_handle *trans,
int index = (root->log_transid + 1) % 2;
if (btrfs_need_log_full_commit(trans)) {
- ret = -EAGAIN;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto out;
}
@@ -194,7 +194,7 @@ static int start_log_trans(struct btrfs_trans_handle *trans,
* writing.
*/
if (zoned && !created) {
- ret = -EAGAIN;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto out;
}
@@ -3122,7 +3122,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
/* bail out if we need to do a full commit */
if (btrfs_need_log_full_commit(trans)) {
- ret = -EAGAIN;
+ ret = BTRFS_LOG_FORCE_COMMIT;
mutex_unlock(&root->log_mutex);
goto out;
}
@@ -3223,7 +3223,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
}
btrfs_wait_tree_log_extents(log, mark);
mutex_unlock(&log_root_tree->log_mutex);
- ret = -EAGAIN;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto out;
}
@@ -3262,7 +3262,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
blk_finish_plug(&plug);
btrfs_wait_tree_log_extents(log, mark);
mutex_unlock(&log_root_tree->log_mutex);
- ret = -EAGAIN;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto out_wake_log_root;
}
@@ -5849,7 +5849,7 @@ static int btrfs_log_inode(struct btrfs_trans_handle *trans,
inode_only == LOG_INODE_ALL &&
inode->last_unlink_trans >= trans->transid) {
btrfs_set_log_full_commit(trans);
- ret = 1;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto out_unlock;
}
@@ -6563,12 +6563,12 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
bool log_dentries = false;
if (btrfs_test_opt(fs_info, NOTREELOG)) {
- ret = 1;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto end_no_trans;
}
if (btrfs_root_refs(&root->root_item) == 0) {
- ret = 1;
+ ret = BTRFS_LOG_FORCE_COMMIT;
goto end_no_trans;
}
@@ -6666,7 +6666,7 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
end_trans:
if (ret < 0) {
btrfs_set_log_full_commit(trans);
- ret = 1;
+ ret = BTRFS_LOG_FORCE_COMMIT;
}
if (ret)
@@ -7030,8 +7030,15 @@ void btrfs_log_new_name(struct btrfs_trans_handle *trans,
* anyone from syncing the log until we have updated both inodes
* in the log.
*/
+ ret = join_running_log_trans(root);
+ /*
+ * At least one of the inodes was logged before, so this should
+ * not fail, but if it does, it's not serious, just bail out and
+ * mark the log for a full commit.
+ */
+ if (WARN_ON_ONCE(ret < 0))
+ goto out;
log_pinned = true;
- btrfs_pin_log_trans(root);
path = btrfs_alloc_path();
if (!path) {
diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
index 1620f8170629..57ab5f3b8dc7 100644
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -12,6 +12,9 @@
/* return value for btrfs_log_dentry_safe that means we don't need to log it at all */
#define BTRFS_NO_LOG_SYNC 256
+/* We can't use the tree log for whatever reason, force a transaction commit */
+#define BTRFS_LOG_FORCE_COMMIT (1)
+
struct btrfs_log_ctx {
int log_ret;
int log_transid;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 659575526e9f..4bc97e7d8e46 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5091,26 +5091,16 @@ static void init_alloc_chunk_ctl_policy_regular(
struct btrfs_fs_devices *fs_devices,
struct alloc_chunk_ctl *ctl)
{
- u64 type = ctl->type;
+ struct btrfs_space_info *space_info;
- if (type & BTRFS_BLOCK_GROUP_DATA) {
- ctl->max_stripe_size = SZ_1G;
- ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
- } else if (type & BTRFS_BLOCK_GROUP_METADATA) {
- /* For larger filesystems, use larger metadata chunks */
- if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
- ctl->max_stripe_size = SZ_1G;
- else
- ctl->max_stripe_size = SZ_256M;
- ctl->max_chunk_size = ctl->max_stripe_size;
- } else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
- ctl->max_stripe_size = SZ_32M;
- ctl->max_chunk_size = 2 * ctl->max_stripe_size;
- ctl->devs_max = min_t(int, ctl->devs_max,
- BTRFS_MAX_DEVS_SYS_CHUNK);
- } else {
- BUG();
- }
+ space_info = btrfs_find_space_info(fs_devices->fs_info, ctl->type);
+ ASSERT(space_info);
+
+ ctl->max_chunk_size = READ_ONCE(space_info->chunk_size);
+ ctl->max_stripe_size = ctl->max_chunk_size;
+
+ if (ctl->type & BTRFS_BLOCK_GROUP_SYSTEM)
+ ctl->devs_max = min_t(int, ctl->devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
/* We don't want a chunk larger than 10% of writable space */
ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 84b6d39509bd..45e29b8c705c 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -407,6 +407,16 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
nr_sectors = bdev_nr_sectors(bdev);
zone_info->zone_size_shift = ilog2(zone_info->zone_size);
zone_info->nr_zones = nr_sectors >> ilog2(zone_sectors);
+ /*
+ * We limit max_zone_append_size also by max_segments *
+ * PAGE_SIZE. Technically, we can have multiple pages per segment. But,
+ * since btrfs adds the pages one by one to a bio, and btrfs cannot
+ * increase the metadata reservation even if it increases the number of
+ * extents, it is safe to stick with the limit.
+ */
+ zone_info->max_zone_append_size =
+ min_t(u64, (u64)bdev_max_zone_append_sectors(bdev) << SECTOR_SHIFT,
+ (u64)bdev_max_segments(bdev) << PAGE_SHIFT);
if (!IS_ALIGNED(nr_sectors, zone_sectors))
zone_info->nr_zones++;
@@ -632,6 +642,7 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
u64 zoned_devices = 0;
u64 nr_devices = 0;
u64 zone_size = 0;
+ u64 max_zone_append_size = 0;
const bool incompat_zoned = btrfs_fs_incompat(fs_info, ZONED);
int ret = 0;
@@ -666,6 +677,11 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
ret = -EINVAL;
goto out;
}
+ if (!max_zone_append_size ||
+ (zone_info->max_zone_append_size &&
+ zone_info->max_zone_append_size < max_zone_append_size))
+ max_zone_append_size =
+ zone_info->max_zone_append_size;
}
nr_devices++;
}
@@ -715,7 +731,11 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
}
fs_info->zone_size = zone_size;
+ fs_info->max_zone_append_size = ALIGN_DOWN(max_zone_append_size,
+ fs_info->sectorsize);
fs_info->fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_ZONED;
+ if (fs_info->max_zone_append_size < fs_info->max_extent_size)
+ fs_info->max_extent_size = fs_info->max_zone_append_size;
/*
* Check mount options here, because we might change fs_info->zoned
@@ -1821,6 +1841,7 @@ struct btrfs_device *btrfs_zoned_get_device(struct btrfs_fs_info *fs_info,
bool btrfs_zone_activate(struct btrfs_block_group *block_group)
{
struct btrfs_fs_info *fs_info = block_group->fs_info;
+ struct btrfs_space_info *space_info = block_group->space_info;
struct map_lookup *map;
struct btrfs_device *device;
u64 physical;
@@ -1832,6 +1853,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
map = block_group->physical_map;
+ spin_lock(&space_info->lock);
spin_lock(&block_group->lock);
if (block_group->zone_is_active) {
ret = true;
@@ -1839,7 +1861,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
}
/* No space left */
- if (block_group->alloc_offset == block_group->zone_capacity) {
+ if (btrfs_zoned_bg_is_full(block_group)) {
ret = false;
goto out_unlock;
}
@@ -1860,7 +1882,10 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
/* Successfully activated all the zones */
block_group->zone_is_active = 1;
+ space_info->active_total_bytes += block_group->length;
spin_unlock(&block_group->lock);
+ btrfs_try_granting_tickets(fs_info, space_info);
+ spin_unlock(&space_info->lock);
/* For the active block group list */
btrfs_get_block_group(block_group);
@@ -1873,6 +1898,7 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
out_unlock:
spin_unlock(&block_group->lock);
+ spin_unlock(&space_info->lock);
return ret;
}
@@ -1967,6 +1993,9 @@ int btrfs_zone_finish(struct btrfs_block_group *block_group)
/* For active_bg_list */
btrfs_put_block_group(block_group);
+ clear_bit(BTRFS_FS_NEED_ZONE_FINISH, &fs_info->flags);
+ wake_up_all(&fs_info->zone_finish_wait);
+
return 0;
}
@@ -1995,6 +2024,9 @@ bool btrfs_can_activate_zone(struct btrfs_fs_devices *fs_devices, u64 flags)
}
mutex_unlock(&fs_info->chunk_mutex);
+ if (!ret)
+ set_bit(BTRFS_FS_NEED_ZONE_FINISH, &fs_info->flags);
+
return ret;
}
@@ -2156,3 +2188,96 @@ void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info, u64 logica
spin_unlock(&block_group->lock);
btrfs_put_block_group(block_group);
}
+
+int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
+{
+ struct btrfs_block_group *block_group;
+ struct btrfs_block_group *min_bg = NULL;
+ u64 min_avail = U64_MAX;
+ int ret;
+
+ spin_lock(&fs_info->zone_active_bgs_lock);
+ list_for_each_entry(block_group, &fs_info->zone_active_bgs,
+ active_bg_list) {
+ u64 avail;
+
+ spin_lock(&block_group->lock);
+ if (block_group->reserved ||
+ (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) {
+ spin_unlock(&block_group->lock);
+ continue;
+ }
+
+ avail = block_group->zone_capacity - block_group->alloc_offset;
+ if (min_avail > avail) {
+ if (min_bg)
+ btrfs_put_block_group(min_bg);
+ min_bg = block_group;
+ min_avail = avail;
+ btrfs_get_block_group(min_bg);
+ }
+ spin_unlock(&block_group->lock);
+ }
+ spin_unlock(&fs_info->zone_active_bgs_lock);
+
+ if (!min_bg)
+ return 0;
+
+ ret = btrfs_zone_finish(min_bg);
+ btrfs_put_block_group(min_bg);
+
+ return ret < 0 ? ret : 1;
+}
+
+int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info,
+ struct btrfs_space_info *space_info,
+ bool do_finish)
+{
+ struct btrfs_block_group *bg;
+ int index;
+
+ if (!btrfs_is_zoned(fs_info) || (space_info->flags & BTRFS_BLOCK_GROUP_DATA))
+ return 0;
+
+ /* No more block groups to activate */
+ if (space_info->active_total_bytes == space_info->total_bytes)
+ return 0;
+
+ for (;;) {
+ int ret;
+ bool need_finish = false;
+
+ down_read(&space_info->groups_sem);
+ for (index = 0; index < BTRFS_NR_RAID_TYPES; index++) {
+ list_for_each_entry(bg, &space_info->block_groups[index],
+ list) {
+ if (!spin_trylock(&bg->lock))
+ continue;
+ if (btrfs_zoned_bg_is_full(bg) || bg->zone_is_active) {
+ spin_unlock(&bg->lock);
+ continue;
+ }
+ spin_unlock(&bg->lock);
+
+ if (btrfs_zone_activate(bg)) {
+ up_read(&space_info->groups_sem);
+ return 1;
+ }
+
+ need_finish = true;
+ }
+ }
+ up_read(&space_info->groups_sem);
+
+ if (!do_finish || !need_finish)
+ break;
+
+ ret = btrfs_zone_finish_one_bg(fs_info);
+ if (ret == 0)
+ break;
+ if (ret < 0)
+ return ret;
+ }
+
+ return 0;
+}
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index cf6320feef46..1cac32266276 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -10,11 +10,7 @@
#include "block-group.h"
#include "btrfs_inode.h"
-/*
- * Block groups with more than this value (percents) of unusable space will be
- * scheduled for background reclaim.
- */
-#define BTRFS_DEFAULT_RECLAIM_THRESH 75
+#define BTRFS_DEFAULT_RECLAIM_THRESH (75)
struct btrfs_zoned_device_info {
/*
@@ -23,6 +19,7 @@ struct btrfs_zoned_device_info {
*/
u64 zone_size;
u8 zone_size_shift;
+ u64 max_zone_append_size;
u32 nr_zones;
unsigned int max_active_zones;
atomic_t active_zones_left;
@@ -82,6 +79,9 @@ void btrfs_clear_data_reloc_bg(struct btrfs_block_group *bg);
void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info);
void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info, u64 logical,
u64 length);
+int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info);
+int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info,
+ struct btrfs_space_info *space_info, bool do_finish);
#else /* CONFIG_BLK_DEV_ZONED */
static inline int btrfs_get_dev_zone(struct btrfs_device *device, u64 pos,
struct blk_zone *zone)
@@ -246,6 +246,20 @@ static inline void btrfs_free_zone_cache(struct btrfs_fs_info *fs_info) { }
static inline void btrfs_zoned_release_data_reloc_bg(struct btrfs_fs_info *fs_info,
u64 logical, u64 length) { }
+
+static inline int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info)
+{
+ return 1;
+}
+
+static inline int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info,
+ struct btrfs_space_info *space_info,
+ bool do_finish)
+{
+ /* Consider all the block groups are active */
+ return 0;
+}
+
#endif
static inline bool btrfs_dev_is_sequential(struct btrfs_device *device, u64 pos)
@@ -380,4 +394,10 @@ static inline void btrfs_zoned_data_reloc_unlock(struct btrfs_inode *inode)
mutex_unlock(&root->fs_info->zoned_data_reloc_io_lock);
}
+static inline bool btrfs_zoned_bg_is_full(const struct btrfs_block_group *bg)
+{
+ ASSERT(btrfs_is_zoned(bg->fs_info));
+ return (bg->alloc_offset == bg->zone_capacity);
+}
+
#endif
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 58dce567ceaf..6aeca5b301c8 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -4456,10 +4456,10 @@ static void cifs_readahead(struct readahead_control *ractl)
* TODO: Send a whole batch of pages to be read
* by the cache.
*/
- page = readahead_page(ractl);
- last_batch_size = 1 << thp_order(page);
+ struct folio *folio = readahead_folio(ractl);
+ last_batch_size = folio_nr_pages(folio);
if (cifs_readpage_from_fscache(ractl->mapping->host,
- page) < 0) {
+ &folio->page) < 0) {
/*
* TODO: Deal with cache read failure
* here, but for the moment, delegate
@@ -4467,7 +4467,7 @@ static void cifs_readahead(struct readahead_control *ractl)
*/
caching = false;
}
- unlock_page(page);
+ folio_unlock(folio);
next_cached++;
cache_nr_pages--;
if (cache_nr_pages == 0)
@@ -4807,8 +4807,6 @@ void cifs_oplock_break(struct work_struct *work)
struct TCP_Server_Info *server = tcon->ses->server;
int rc = 0;
bool purge_cache = false;
- bool is_deferred = false;
- struct cifs_deferred_close *dclose;
wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
TASK_UNINTERRUPTIBLE);
@@ -4844,22 +4842,6 @@ void cifs_oplock_break(struct work_struct *work)
cifs_dbg(VFS, "Push locks rc = %d\n", rc);
oplock_break_ack:
- /*
- * When oplock break is received and there are no active
- * file handles but cached, then schedule deferred close immediately.
- * So, new open will not use cached handle.
- */
- spin_lock(&CIFS_I(inode)->deferred_lock);
- is_deferred = cifs_is_deferred_close(cfile, &dclose);
- spin_unlock(&CIFS_I(inode)->deferred_lock);
- if (is_deferred &&
- cfile->deferred_close_scheduled &&
- delayed_work_pending(&cfile->deferred)) {
- if (cancel_delayed_work(&cfile->deferred)) {
- _cifsFileInfo_put(cfile, false, false);
- goto oplock_break_done;
- }
- }
/*
* releasing stale oplock after recent reconnect of smb session using
* a now incorrect file handle is not a data integrity issue but do
@@ -4871,7 +4853,7 @@ void cifs_oplock_break(struct work_struct *work)
cinode);
cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
}
-oplock_break_done:
+
_cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
cifs_done_oplock_break(cinode);
}
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 0e0d1fc0f130..9bbd9a59426b 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -93,14 +93,18 @@ static int z_erofs_lz4_prepare_dstpages(struct z_erofs_lz4_decompress_ctx *ctx,
if (page) {
__clear_bit(j, bounced);
- if (kaddr) {
- if (kaddr + PAGE_SIZE == page_address(page))
+ if (!PageHighMem(page)) {
+ if (!i) {
+ kaddr = page_address(page);
+ continue;
+ }
+ if (kaddr &&
+ kaddr + PAGE_SIZE == page_address(page)) {
kaddr += PAGE_SIZE;
- else
- kaddr = NULL;
- } else if (!i) {
- kaddr = page_address(page);
+ continue;
+ }
}
+ kaddr = NULL;
continue;
}
kaddr = NULL;
diff --git a/fs/erofs/decompressor_lzma.c b/fs/erofs/decompressor_lzma.c
index 05a3063cf2bc..5e59b3f523eb 100644
--- a/fs/erofs/decompressor_lzma.c
+++ b/fs/erofs/decompressor_lzma.c
@@ -143,6 +143,7 @@ int z_erofs_load_lzma_config(struct super_block *sb,
DBG_BUGON(z_erofs_lzma_head);
z_erofs_lzma_head = head;
spin_unlock(&z_erofs_lzma_lock);
+ wake_up_all(&z_erofs_lzma_wq);
z_erofs_lzma_max_dictsize = dict_size;
mutex_unlock(&lzma_resize_mutex);
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index e2daa940ebce..8b56b94e2f56 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1747,6 +1747,21 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
return to;
}
+/*
+ * autoremove_wake_function, but remove even on failure to wake up, because we
+ * know that default_wake_function/ttwu will only fail if the thread is already
+ * woken, and in that case the ep_poll loop will remove the entry anyways, not
+ * try to reuse it.
+ */
+static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
+ unsigned int mode, int sync, void *key)
+{
+ int ret = default_wake_function(wq_entry, mode, sync, key);
+
+ list_del_init(&wq_entry->entry);
+ return ret;
+}
+
/**
* ep_poll - Retrieves ready events, and delivers them to the caller-supplied
* event buffer.
@@ -1828,8 +1843,15 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
* normal wakeup path no need to call __remove_wait_queue()
* explicitly, thus ep->lock is not taken, which halts the
* event delivery.
+ *
+ * In fact, we now use an even more aggressive function that
+ * unconditionally removes, because we don't reuse the wait
+ * entry between loop iterations. This lets us also avoid the
+ * performance issue if a process is killed, causing all of its
+ * threads to wake up without being removed normally.
*/
init_wait(&wait);
+ wait.func = ep_autoremove_wake_function;
write_lock_irq(&ep->lock);
/*
diff --git a/fs/exec.c b/fs/exec.c
index 5a75e92b1a0a..a9f5acf8f0ec 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1297,6 +1297,9 @@ int begin_new_exec(struct linux_binprm * bprm)
bprm->mm = NULL;
#ifdef CONFIG_POSIX_TIMERS
+ spin_lock_irq(&me->sighand->siglock);
+ posix_cpu_timers_exit(me);
+ spin_unlock_irq(&me->sighand->siglock);
exit_itimers(me);
flush_itimer_signals();
#endif
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index f6a19f6d9f6d..cdffa2a041af 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -1059,9 +1059,10 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
sbi->s_frags_per_group);
goto failed_mount;
}
- if (sbi->s_inodes_per_group > sb->s_blocksize * 8) {
+ if (sbi->s_inodes_per_group < sbi->s_inodes_per_block ||
+ sbi->s_inodes_per_group > sb->s_blocksize * 8) {
ext2_msg(sb, KERN_ERR,
- "error: #inodes per group too big: %lu",
+ "error: invalid #inodes per group: %lu",
sbi->s_inodes_per_group);
goto failed_mount;
}
@@ -1071,6 +1072,13 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
sbi->s_groups_count = ((le32_to_cpu(es->s_blocks_count) -
le32_to_cpu(es->s_first_data_block) - 1)
/ EXT2_BLOCKS_PER_GROUP(sb)) + 1;
+ if ((u64)sbi->s_groups_count * sbi->s_inodes_per_group !=
+ le32_to_cpu(es->s_inodes_count)) {
+ ext2_msg(sb, KERN_ERR, "error: invalid #inodes: %u vs computed %llu",
+ le32_to_cpu(es->s_inodes_count),
+ (u64)sbi->s_groups_count * sbi->s_inodes_per_group);
+ goto failed_mount;
+ }
db_count = (sbi->s_groups_count + EXT2_DESC_PER_BLOCK(sb) - 1) /
EXT2_DESC_PER_BLOCK(sb);
sbi->s_group_desc = kmalloc_array(db_count,
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index e9ef5cf30969..84fcd06a8e8a 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -35,6 +35,9 @@ static int get_max_inline_xattr_value_size(struct inode *inode,
struct ext4_inode *raw_inode;
int free, min_offs;
+ if (!EXT4_INODE_HAS_XATTR_SPACE(inode))
+ return 0;
+
min_offs = EXT4_SB(inode->i_sb)->s_inode_size -
EXT4_GOOD_OLD_INODE_SIZE -
EXT4_I(inode)->i_extra_isize -
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index beed9e32571c..e94ec798dce1 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -178,6 +178,8 @@ void ext4_evict_inode(struct inode *inode)
trace_ext4_evict_inode(inode);
+ if (EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)
+ ext4_evict_ea_inode(inode);
if (inode->i_nlink) {
/*
* When journalling data dirty buffers are tracked only in the
@@ -1559,7 +1561,14 @@ static void mpage_release_unused_pages(struct mpage_da_data *mpd,
ext4_lblk_t start, last;
start = index << (PAGE_SHIFT - inode->i_blkbits);
last = end << (PAGE_SHIFT - inode->i_blkbits);
+
+ /*
+ * avoid racing with extent status tree scans made by
+ * ext4_insert_delayed_block()
+ */
+ down_write(&EXT4_I(inode)->i_data_sem);
ext4_es_remove_extent(inode, start, last - start + 1);
+ up_write(&EXT4_I(inode)->i_data_sem);
}
pagevec_init(&pvec);
@@ -3130,13 +3139,15 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
{
struct inode *inode = mapping->host;
journal_t *journal;
+ sector_t ret = 0;
int err;
+ inode_lock_shared(inode);
/*
* We can get here for an inline file via the FIBMAP ioctl
*/
if (ext4_has_inline_data(inode))
- return 0;
+ goto out;
if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
test_opt(inode->i_sb, DELALLOC)) {
@@ -3175,10 +3186,14 @@ static sector_t ext4_bmap(struct address_space *mapping, sector_t block)
jbd2_journal_unlock_updates(journal);
if (err)
- return 0;
+ goto out;
}
- return iomap_bmap(mapping, block, &ext4_iomap_ops);
+ ret = iomap_bmap(mapping, block, &ext4_iomap_ops);
+
+out:
+ inode_unlock_shared(inode);
+ return ret;
}
static int ext4_readpage(struct file *file, struct page *page)
@@ -4674,8 +4689,7 @@ static inline int ext4_iget_extra_inode(struct inode *inode,
__le32 *magic = (void *)raw_inode +
EXT4_GOOD_OLD_INODE_SIZE + ei->i_extra_isize;
- if (EXT4_GOOD_OLD_INODE_SIZE + ei->i_extra_isize + sizeof(__le32) <=
- EXT4_INODE_SIZE(inode->i_sb) &&
+ if (EXT4_INODE_HAS_XATTR_SPACE(inode) &&
*magic == cpu_to_le32(EXT4_XATTR_MAGIC)) {
ext4_set_inode_state(inode, EXT4_STATE_XATTR);
return ext4_find_inline_data_nolock(inode);
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index 7a5353a8cfd7..f2c4d9eee475 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -417,7 +417,7 @@ int ext4_ext_migrate(struct inode *inode)
struct inode *tmp_inode = NULL;
struct migrate_struct lb;
unsigned long max_entries;
- __u32 goal;
+ __u32 goal, tmp_csum_seed;
uid_t owner[2];
/*
@@ -465,6 +465,7 @@ int ext4_ext_migrate(struct inode *inode)
* the migration.
*/
ei = EXT4_I(inode);
+ tmp_csum_seed = EXT4_I(tmp_inode)->i_csum_seed;
EXT4_I(tmp_inode)->i_csum_seed = ei->i_csum_seed;
i_size_write(tmp_inode, i_size_read(inode));
/*
@@ -575,6 +576,7 @@ int ext4_ext_migrate(struct inode *inode)
* the inode is not visible to user space.
*/
tmp_inode->i_blocks = 0;
+ EXT4_I(tmp_inode)->i_csum_seed = tmp_csum_seed;
/* Reset the extent details */
ext4_ext_tree_init(handle, tmp_inode);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 4f0420b1ff3e..13b6265848c2 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -54,6 +54,7 @@ static struct buffer_head *ext4_append(handle_t *handle,
struct inode *inode,
ext4_lblk_t *block)
{
+ struct ext4_map_blocks map;
struct buffer_head *bh;
int err;
@@ -63,6 +64,21 @@ static struct buffer_head *ext4_append(handle_t *handle,
return ERR_PTR(-ENOSPC);
*block = inode->i_size >> inode->i_sb->s_blocksize_bits;
+ map.m_lblk = *block;
+ map.m_len = 1;
+
+ /*
+ * We're appending new directory block. Make sure the block is not
+ * allocated yet, otherwise we will end up corrupting the
+ * directory.
+ */
+ err = ext4_map_blocks(NULL, inode, &map, 0);
+ if (err < 0)
+ return ERR_PTR(err);
+ if (err) {
+ EXT4_ERROR_INODE(inode, "Logical block already allocated");
+ return ERR_PTR(-EFSCORRUPTED);
+ }
bh = ext4_bread(handle, inode, *block, EXT4_GET_BLOCKS_CREATE);
if (IS_ERR(bh))
@@ -110,6 +126,13 @@ static struct buffer_head *__ext4_read_dirblock(struct inode *inode,
struct ext4_dir_entry *dirent;
int is_dx_block = 0;
+ if (block >= inode->i_size) {
+ ext4_error_inode(inode, func, line, block,
+ "Attempting to read directory block (%u) that is past i_size (%llu)",
+ block, inode->i_size);
+ return ERR_PTR(-EFSCORRUPTED);
+ }
+
if (ext4_simulate_fail(inode->i_sb, EXT4_SIM_DIRBLOCK_EIO))
bh = ERR_PTR(-EIO);
else
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 8b70a4701293..e5c2713aa11a 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1484,6 +1484,7 @@ static void ext4_update_super(struct super_block *sb,
* Update the fs overhead information
*/
ext4_calculate_overhead(sb);
+ es->s_overhead_clusters = cpu_to_le32(sbi->s_overhead);
if (test_opt(sb, DEBUG))
printk(KERN_DEBUG "EXT4-fs: added group %u:"
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 042325349098..533216e80fa2 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -436,6 +436,21 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino,
return err;
}
+/* Remove entry from mbcache when EA inode is getting evicted */
+void ext4_evict_ea_inode(struct inode *inode)
+{
+ struct mb_cache_entry *oe;
+
+ if (!EA_INODE_CACHE(inode))
+ return;
+ /* Wait for entry to get unused so that we can remove it */
+ while ((oe = mb_cache_entry_delete_or_get(EA_INODE_CACHE(inode),
+ ext4_xattr_inode_get_hash(inode), inode->i_ino))) {
+ mb_cache_entry_wait_unused(oe);
+ mb_cache_entry_put(EA_INODE_CACHE(inode), oe);
+ }
+}
+
static int
ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
struct ext4_xattr_entry *entry, void *buffer,
@@ -976,10 +991,8 @@ int __ext4_xattr_set_credits(struct super_block *sb, struct inode *inode,
static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
int ref_change)
{
- struct mb_cache *ea_inode_cache = EA_INODE_CACHE(ea_inode);
struct ext4_iloc iloc;
s64 ref_count;
- u32 hash;
int ret;
inode_lock(ea_inode);
@@ -1002,14 +1015,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
set_nlink(ea_inode, 1);
ext4_orphan_del(handle, ea_inode);
-
- if (ea_inode_cache) {
- hash = ext4_xattr_inode_get_hash(ea_inode);
- mb_cache_entry_create(ea_inode_cache,
- GFP_NOFS, hash,
- ea_inode->i_ino,
- true /* reusable */);
- }
}
} else {
WARN_ONCE(ref_count < 0, "EA inode %lu ref_count=%lld",
@@ -1022,12 +1027,6 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode,
clear_nlink(ea_inode);
ext4_orphan_add(handle, ea_inode);
-
- if (ea_inode_cache) {
- hash = ext4_xattr_inode_get_hash(ea_inode);
- mb_cache_entry_delete(ea_inode_cache, hash,
- ea_inode->i_ino);
- }
}
}
@@ -1237,6 +1236,7 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
if (error)
goto out;
+retry_ref:
lock_buffer(bh);
hash = le32_to_cpu(BHDR(bh)->h_hash);
ref = le32_to_cpu(BHDR(bh)->h_refcount);
@@ -1246,9 +1246,18 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
* This must happen under buffer lock for
* ext4_xattr_block_set() to reliably detect freed block
*/
- if (ea_block_cache)
- mb_cache_entry_delete(ea_block_cache, hash,
- bh->b_blocknr);
+ if (ea_block_cache) {
+ struct mb_cache_entry *oe;
+
+ oe = mb_cache_entry_delete_or_get(ea_block_cache, hash,
+ bh->b_blocknr);
+ if (oe) {
+ unlock_buffer(bh);
+ mb_cache_entry_wait_unused(oe);
+ mb_cache_entry_put(ea_block_cache, oe);
+ goto retry_ref;
+ }
+ }
get_bh(bh);
unlock_buffer(bh);
@@ -1858,6 +1867,8 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
#define header(x) ((struct ext4_xattr_header *)(x))
if (s->base) {
+ int offset = (char *)s->here - bs->bh->b_data;
+
BUFFER_TRACE(bs->bh, "get_write_access");
error = ext4_journal_get_write_access(handle, sb, bs->bh,
EXT4_JTR_NONE);
@@ -1873,9 +1884,20 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
* ext4_xattr_block_set() to reliably detect modified
* block
*/
- if (ea_block_cache)
- mb_cache_entry_delete(ea_block_cache, hash,
- bs->bh->b_blocknr);
+ if (ea_block_cache) {
+ struct mb_cache_entry *oe;
+
+ oe = mb_cache_entry_delete_or_get(ea_block_cache,
+ hash, bs->bh->b_blocknr);
+ if (oe) {
+ /*
+ * Xattr block is getting reused. Leave
+ * it alone.
+ */
+ mb_cache_entry_put(ea_block_cache, oe);
+ goto clone_block;
+ }
+ }
ea_bdebug(bs->bh, "modifying in-place");
error = ext4_xattr_set_entry(i, s, handle, inode,
true /* is_block */);
@@ -1890,50 +1912,47 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
if (error)
goto cleanup;
goto inserted;
- } else {
- int offset = (char *)s->here - bs->bh->b_data;
+ }
+clone_block:
+ unlock_buffer(bs->bh);
+ ea_bdebug(bs->bh, "cloning");
+ s->base = kmemdup(BHDR(bs->bh), bs->bh->b_size, GFP_NOFS);
+ error = -ENOMEM;
+ if (s->base == NULL)
+ goto cleanup;
+ s->first = ENTRY(header(s->base)+1);
+ header(s->base)->h_refcount = cpu_to_le32(1);
+ s->here = ENTRY(s->base + offset);
+ s->end = s->base + bs->bh->b_size;
- unlock_buffer(bs->bh);
- ea_bdebug(bs->bh, "cloning");
- s->base = kmalloc(bs->bh->b_size, GFP_NOFS);
- error = -ENOMEM;
- if (s->base == NULL)
+ /*
+ * If existing entry points to an xattr inode, we need
+ * to prevent ext4_xattr_set_entry() from decrementing
+ * ref count on it because the reference belongs to the
+ * original block. In this case, make the entry look
+ * like it has an empty value.
+ */
+ if (!s->not_found && s->here->e_value_inum) {
+ ea_ino = le32_to_cpu(s->here->e_value_inum);
+ error = ext4_xattr_inode_iget(inode, ea_ino,
+ le32_to_cpu(s->here->e_hash),
+ &tmp_inode);
+ if (error)
goto cleanup;
- memcpy(s->base, BHDR(bs->bh), bs->bh->b_size);
- s->first = ENTRY(header(s->base)+1);
- header(s->base)->h_refcount = cpu_to_le32(1);
- s->here = ENTRY(s->base + offset);
- s->end = s->base + bs->bh->b_size;
-
- /*
- * If existing entry points to an xattr inode, we need
- * to prevent ext4_xattr_set_entry() from decrementing
- * ref count on it because the reference belongs to the
- * original block. In this case, make the entry look
- * like it has an empty value.
- */
- if (!s->not_found && s->here->e_value_inum) {
- ea_ino = le32_to_cpu(s->here->e_value_inum);
- error = ext4_xattr_inode_iget(inode, ea_ino,
- le32_to_cpu(s->here->e_hash),
- &tmp_inode);
- if (error)
- goto cleanup;
-
- if (!ext4_test_inode_state(tmp_inode,
- EXT4_STATE_LUSTRE_EA_INODE)) {
- /*
- * Defer quota free call for previous
- * inode until success is guaranteed.
- */
- old_ea_inode_quota = le32_to_cpu(
- s->here->e_value_size);
- }
- iput(tmp_inode);
- s->here->e_value_inum = 0;
- s->here->e_value_size = 0;
+ if (!ext4_test_inode_state(tmp_inode,
+ EXT4_STATE_LUSTRE_EA_INODE)) {
+ /*
+ * Defer quota free call for previous
+ * inode until success is guaranteed.
+ */
+ old_ea_inode_quota = le32_to_cpu(
+ s->here->e_value_size);
}
+ iput(tmp_inode);
+
+ s->here->e_value_inum = 0;
+ s->here->e_value_size = 0;
}
} else {
/* Allocate a buffer where we construct the new block. */
@@ -2000,18 +2019,13 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
lock_buffer(new_bh);
/*
* We have to be careful about races with
- * freeing, rehashing or adding references to
- * xattr block. Once we hold buffer lock xattr
- * block's state is stable so we can check
- * whether the block got freed / rehashed or
- * not. Since we unhash mbcache entry under
- * buffer lock when freeing / rehashing xattr
- * block, checking whether entry is still
- * hashed is reliable. Same rules hold for
- * e_reusable handling.
+ * adding references to xattr block. Once we
+ * hold buffer lock xattr block's state is
+ * stable so we can check the additional
+ * reference fits.
*/
- if (hlist_bl_unhashed(&ce->e_hash_list) ||
- !ce->e_reusable) {
+ ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
+ if (ref > EXT4_XATTR_REFCOUNT_MAX) {
/*
* Undo everything and check mbcache
* again.
@@ -2026,9 +2040,8 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
new_bh = NULL;
goto inserted;
}
- ref = le32_to_cpu(BHDR(new_bh)->h_refcount) + 1;
BHDR(new_bh)->h_refcount = cpu_to_le32(ref);
- if (ref >= EXT4_XATTR_REFCOUNT_MAX)
+ if (ref == EXT4_XATTR_REFCOUNT_MAX)
ce->e_reusable = 0;
ea_bdebug(new_bh, "reusing; refcount now=%d",
ref);
@@ -2176,8 +2189,9 @@ int ext4_xattr_ibody_find(struct inode *inode, struct ext4_xattr_info *i,
struct ext4_inode *raw_inode;
int error;
- if (EXT4_I(inode)->i_extra_isize == 0)
+ if (!EXT4_INODE_HAS_XATTR_SPACE(inode))
return 0;
+
raw_inode = ext4_raw_inode(&is->iloc);
header = IHDR(inode, raw_inode);
is->s.base = is->s.first = IFIRST(header);
@@ -2205,8 +2219,9 @@ int ext4_xattr_ibody_set(handle_t *handle, struct inode *inode,
struct ext4_xattr_search *s = &is->s;
int error;
- if (EXT4_I(inode)->i_extra_isize == 0)
+ if (!EXT4_INODE_HAS_XATTR_SPACE(inode))
return -ENOSPC;
+
error = ext4_xattr_set_entry(i, s, handle, inode, false /* is_block */);
if (error)
return error;
diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
index 77efb9a627ad..e5e36bd11f05 100644
--- a/fs/ext4/xattr.h
+++ b/fs/ext4/xattr.h
@@ -95,6 +95,19 @@ struct ext4_xattr_entry {
#define EXT4_ZERO_XATTR_VALUE ((void *)-1)
+/*
+ * If we want to add an xattr to the inode, we should make sure that
+ * i_extra_isize is not 0 and that the inode size is not less than
+ * EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad.
+ * EXT4_GOOD_OLD_INODE_SIZE extra_isize header entry pad data
+ * |--------------------------|------------|------|---------|---|-------|
+ */
+#define EXT4_INODE_HAS_XATTR_SPACE(inode) \
+ ((EXT4_I(inode)->i_extra_isize != 0) && \
+ (EXT4_GOOD_OLD_INODE_SIZE + EXT4_I(inode)->i_extra_isize + \
+ sizeof(struct ext4_xattr_ibody_header) + EXT4_XATTR_PAD <= \
+ EXT4_INODE_SIZE((inode)->i_sb)))
+
struct ext4_xattr_info {
const char *name;
const void *value;
@@ -178,6 +191,7 @@ extern void ext4_xattr_inode_array_free(struct ext4_xattr_inode_array *array);
extern int ext4_expand_extra_isize_ea(struct inode *inode, int new_extra_isize,
struct ext4_inode *raw_inode, handle_t *handle);
+extern void ext4_evict_ea_inode(struct inode *inode);
extern const struct xattr_handler *ext4_xattr_handlers[];
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index beceac9885c3..63ab8c9d674e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1004,9 +1004,7 @@ static void __add_dirty_inode(struct inode *inode, enum inode_type type)
return;
set_inode_flag(inode, flag);
- if (!f2fs_is_volatile_file(inode))
- list_add_tail(&F2FS_I(inode)->dirty_list,
- &sbi->inode_list[type]);
+ list_add_tail(&F2FS_I(inode)->dirty_list, &sbi->inode_list[type]);
stat_inc_dirty_inode(sbi, type);
}
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 9a1a526f2092..07522e82c7c4 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -69,8 +69,7 @@ static bool __is_cp_guaranteed(struct page *page)
if (f2fs_is_compressed_page(page))
return false;
- if ((S_ISREG(inode->i_mode) &&
- (f2fs_is_atomic_file(inode) || IS_NOQUOTA(inode))) ||
+ if ((S_ISREG(inode->i_mode) && IS_NOQUOTA(inode)) ||
page_private_gcing(page))
return true;
return false;
@@ -1436,9 +1435,12 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
*map->m_next_extent = pgofs + map->m_len;
/* for hardware encryption, but to avoid potential issue in future */
- if (flag == F2FS_GET_BLOCK_DIO)
+ if (flag == F2FS_GET_BLOCK_DIO) {
f2fs_wait_on_block_writeback_range(inode,
map->m_pblk, map->m_len);
+ invalidate_mapping_pages(META_MAPPING(sbi),
+ map->m_pblk, map->m_pblk + map->m_len - 1);
+ }
if (map->m_multidev_dio) {
block_t blk_addr = map->m_pblk;
@@ -1655,7 +1657,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
f2fs_wait_on_block_writeback_range(inode,
map->m_pblk, map->m_len);
invalidate_mapping_pages(META_MAPPING(sbi),
- map->m_pblk, map->m_pblk);
+ map->m_pblk, map->m_pblk + map->m_len - 1);
if (map->m_multidev_dio) {
block_t blk_addr = map->m_pblk;
@@ -2563,7 +2565,12 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
bool ipu_force = false;
int err = 0;
- set_new_dnode(&dn, inode, NULL, NULL, 0);
+ /* Use COW inode to make dnode_of_data for atomic write */
+ if (f2fs_is_atomic_file(inode))
+ set_new_dnode(&dn, F2FS_I(inode)->cow_inode, NULL, NULL, 0);
+ else
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+
if (need_inplace_update(fio) &&
f2fs_lookup_extent_cache(inode, page->index, &ei)) {
fio->old_blkaddr = ei.blk + page->index - ei.fofs;
@@ -2600,6 +2607,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
err = -EFSCORRUPTED;
goto out_writepage;
}
+
/*
* If current allocation needs SSR,
* it had better in-place writes for updated data.
@@ -2736,11 +2744,6 @@ int f2fs_write_single_data_page(struct page *page, int *submitted,
write:
if (f2fs_is_drop_cache(inode))
goto out;
- /* we should not write 0'th page having journal header */
- if (f2fs_is_volatile_file(inode) && (!page->index ||
- (!wbc->for_reclaim &&
- f2fs_available_free_memory(sbi, BASE_CHECK))))
- goto redirty_out;
/* Dentry/quota blocks are controlled by checkpoint */
if (S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) {
@@ -3313,6 +3316,100 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
return err;
}
+static int __find_data_block(struct inode *inode, pgoff_t index,
+ block_t *blk_addr)
+{
+ struct dnode_of_data dn;
+ struct page *ipage;
+ struct extent_info ei = {0, };
+ int err = 0;
+
+ ipage = f2fs_get_node_page(F2FS_I_SB(inode), inode->i_ino);
+ if (IS_ERR(ipage))
+ return PTR_ERR(ipage);
+
+ set_new_dnode(&dn, inode, ipage, ipage, 0);
+
+ if (f2fs_lookup_extent_cache(inode, index, &ei)) {
+ dn.data_blkaddr = ei.blk + index - ei.fofs;
+ } else {
+ /* hole case */
+ err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE);
+ if (err) {
+ dn.data_blkaddr = NULL_ADDR;
+ err = 0;
+ }
+ }
+ *blk_addr = dn.data_blkaddr;
+ f2fs_put_dnode(&dn);
+ return err;
+}
+
+static int __reserve_data_block(struct inode *inode, pgoff_t index,
+ block_t *blk_addr, bool *node_changed)
+{
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct dnode_of_data dn;
+ struct page *ipage;
+ int err = 0;
+
+ f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true);
+
+ ipage = f2fs_get_node_page(sbi, inode->i_ino);
+ if (IS_ERR(ipage)) {
+ err = PTR_ERR(ipage);
+ goto unlock_out;
+ }
+ set_new_dnode(&dn, inode, ipage, ipage, 0);
+
+ err = f2fs_get_block(&dn, index);
+
+ *blk_addr = dn.data_blkaddr;
+ *node_changed = dn.node_changed;
+ f2fs_put_dnode(&dn);
+
+unlock_out:
+ f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false);
+ return err;
+}
+
+static int prepare_atomic_write_begin(struct f2fs_sb_info *sbi,
+ struct page *page, loff_t pos, unsigned int len,
+ block_t *blk_addr, bool *node_changed)
+{
+ struct inode *inode = page->mapping->host;
+ struct inode *cow_inode = F2FS_I(inode)->cow_inode;
+ pgoff_t index = page->index;
+ int err = 0;
+ block_t ori_blk_addr;
+
+ /* If pos is beyond the end of file, reserve a new block in COW inode */
+ if ((pos & PAGE_MASK) >= i_size_read(inode))
+ return __reserve_data_block(cow_inode, index, blk_addr,
+ node_changed);
+
+ /* Look for the block in COW inode first */
+ err = __find_data_block(cow_inode, index, blk_addr);
+ if (err)
+ return err;
+ else if (*blk_addr != NULL_ADDR)
+ return 0;
+
+ /* Look for the block in the original inode */
+ err = __find_data_block(inode, index, &ori_blk_addr);
+ if (err)
+ return err;
+
+ /* Finally, we should reserve a new block in COW inode for the update */
+ err = __reserve_data_block(cow_inode, index, blk_addr, node_changed);
+ if (err)
+ return err;
+
+ if (ori_blk_addr != NULL_ADDR)
+ *blk_addr = ori_blk_addr;
+ return 0;
+}
+
static int f2fs_write_begin(struct file *file, struct address_space *mapping,
loff_t pos, unsigned len, unsigned flags,
struct page **pagep, void **fsdata)
@@ -3321,7 +3418,7 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct page *page = NULL;
pgoff_t index = ((unsigned long long) pos) >> PAGE_SHIFT;
- bool need_balance = false, drop_atomic = false;
+ bool need_balance = false;
block_t blkaddr = NULL_ADDR;
int err = 0;
@@ -3332,14 +3429,6 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
goto fail;
}
- if ((f2fs_is_atomic_file(inode) &&
- !f2fs_available_free_memory(sbi, INMEM_PAGES)) ||
- is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
- err = -ENOMEM;
- drop_atomic = true;
- goto fail;
- }
-
/*
* We should check this at this moment to avoid deadlock on inode page
* and #0 page. The locking rule for inline_data conversion should be:
@@ -3387,7 +3476,11 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
*pagep = page;
- err = prepare_write_begin(sbi, page, pos, len,
+ if (f2fs_is_atomic_file(inode))
+ err = prepare_atomic_write_begin(sbi, page, pos, len,
+ &blkaddr, &need_balance);
+ else
+ err = prepare_write_begin(sbi, page, pos, len,
&blkaddr, &need_balance);
if (err)
goto fail;
@@ -3443,8 +3536,6 @@ static int f2fs_write_begin(struct file *file, struct address_space *mapping,
fail:
f2fs_put_page(page, 1);
f2fs_write_failed(inode, pos + len);
- if (drop_atomic)
- f2fs_drop_inmem_pages_all(sbi, false);
return err;
}
@@ -3488,8 +3579,12 @@ static int f2fs_write_end(struct file *file,
set_page_dirty(page);
if (pos + copied > i_size_read(inode) &&
- !f2fs_verity_in_progress(inode))
+ !f2fs_verity_in_progress(inode)) {
f2fs_i_size_write(inode, pos + copied);
+ if (f2fs_is_atomic_file(inode))
+ f2fs_i_size_write(F2FS_I(inode)->cow_inode,
+ pos + copied);
+ }
unlock_out:
f2fs_put_page(page, 1);
f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
@@ -3522,9 +3617,6 @@ void f2fs_invalidate_folio(struct folio *folio, size_t offset, size_t length)
inode->i_ino == F2FS_COMPRESS_INO(sbi))
clear_page_private_data(&folio->page);
- if (page_private_atomic(&folio->page))
- return f2fs_drop_inmem_page(inode, &folio->page);
-
folio_detach_private(folio);
}
@@ -3534,10 +3626,6 @@ int f2fs_release_page(struct page *page, gfp_t wait)
if (PageDirty(page))
return 0;
- /* This is atomic written page, keep Private */
- if (page_private_atomic(page))
- return 0;
-
if (test_opt(F2FS_P_SB(page), COMPRESS_CACHE)) {
struct inode *inode = page->mapping->host;
@@ -3563,18 +3651,6 @@ static bool f2fs_dirty_data_folio(struct address_space *mapping,
folio_mark_uptodate(folio);
BUG_ON(folio_test_swapcache(folio));
- if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
- if (!page_private_atomic(&folio->page)) {
- f2fs_register_inmem_page(inode, &folio->page);
- return true;
- }
- /*
- * Previously, this page has been registered, we just
- * return here.
- */
- return false;
- }
-
if (!folio_test_dirty(folio)) {
filemap_dirty_folio(mapping, folio);
f2fs_update_dirty_folio(inode, folio);
@@ -3654,42 +3730,14 @@ static sector_t f2fs_bmap(struct address_space *mapping, sector_t block)
int f2fs_migrate_page(struct address_space *mapping,
struct page *newpage, struct page *page, enum migrate_mode mode)
{
- int rc, extra_count;
- struct f2fs_inode_info *fi = F2FS_I(mapping->host);
- bool atomic_written = page_private_atomic(page);
+ int rc, extra_count = 0;
BUG_ON(PageWriteback(page));
- /* migrating an atomic written page is safe with the inmem_lock hold */
- if (atomic_written) {
- if (mode != MIGRATE_SYNC)
- return -EBUSY;
- if (!mutex_trylock(&fi->inmem_lock))
- return -EAGAIN;
- }
-
- /* one extra reference was held for atomic_write page */
- extra_count = atomic_written ? 1 : 0;
rc = migrate_page_move_mapping(mapping, newpage,
page, extra_count);
- if (rc != MIGRATEPAGE_SUCCESS) {
- if (atomic_written)
- mutex_unlock(&fi->inmem_lock);
+ if (rc != MIGRATEPAGE_SUCCESS)
return rc;
- }
-
- if (atomic_written) {
- struct inmem_pages *cur;
-
- list_for_each_entry(cur, &fi->inmem_pages, list)
- if (cur->page == page) {
- cur->page = newpage;
- break;
- }
- mutex_unlock(&fi->inmem_lock);
- put_page(page);
- get_page(newpage);
- }
/* guarantee to start from no stale private field */
set_page_private(newpage, 0);
diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index fcdf253cd211..c92625ef16d0 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -91,11 +91,8 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->ndirty_files = sbi->ndirty_inode[FILE_INODE];
si->nquota_files = sbi->nquota_files;
si->ndirty_all = sbi->ndirty_inode[DIRTY_META];
- si->inmem_pages = get_pages(sbi, F2FS_INMEM_PAGES);
si->aw_cnt = sbi->atomic_files;
- si->vw_cnt = atomic_read(&sbi->vw_cnt);
si->max_aw_cnt = atomic_read(&sbi->max_aw_cnt);
- si->max_vw_cnt = atomic_read(&sbi->max_vw_cnt);
si->nr_dio_read = get_pages(sbi, F2FS_DIO_READ);
si->nr_dio_write = get_pages(sbi, F2FS_DIO_WRITE);
si->nr_wb_cp_data = get_pages(sbi, F2FS_WB_CP_DATA);
@@ -167,8 +164,6 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->alloc_nids = NM_I(sbi)->nid_cnt[PREALLOC_NID];
si->io_skip_bggc = sbi->io_skip_bggc;
si->other_skip_bggc = sbi->other_skip_bggc;
- si->skipped_atomic_files[BG_GC] = sbi->skipped_atomic_files[BG_GC];
- si->skipped_atomic_files[FG_GC] = sbi->skipped_atomic_files[FG_GC];
si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
* 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
/ 2;
@@ -296,7 +291,6 @@ static void update_mem_info(struct f2fs_sb_info *sbi)
sizeof(struct nat_entry);
si->cache_mem += NM_I(sbi)->nat_cnt[DIRTY_NAT] *
sizeof(struct nat_entry_set);
- si->cache_mem += si->inmem_pages * sizeof(struct inmem_pages);
for (i = 0; i < MAX_INO_ENTRY; i++)
si->cache_mem += sbi->im[i].ino_num * sizeof(struct ino_entry);
si->cache_mem += atomic_read(&sbi->total_ext_tree) *
@@ -491,10 +485,6 @@ static int stat_show(struct seq_file *s, void *v)
si->bg_data_blks);
seq_printf(s, " - node blocks : %d (%d)\n", si->node_blks,
si->bg_node_blks);
- seq_printf(s, "Skipped : atomic write %llu (%llu)\n",
- si->skipped_atomic_files[BG_GC] +
- si->skipped_atomic_files[FG_GC],
- si->skipped_atomic_files[BG_GC]);
seq_printf(s, "BG skip : IO: %u, Other: %u\n",
si->io_skip_bggc, si->other_skip_bggc);
seq_puts(s, "\nExtent Cache:\n");
@@ -519,10 +509,8 @@ static int stat_show(struct seq_file *s, void *v)
si->flush_list_empty,
si->nr_discarding, si->nr_discarded,
si->nr_discard_cmd, si->undiscard_blks);
- seq_printf(s, " - inmem: %4d, atomic IO: %4d (Max. %4d), "
- "volatile IO: %4d (Max. %4d)\n",
- si->inmem_pages, si->aw_cnt, si->max_aw_cnt,
- si->vw_cnt, si->max_vw_cnt);
+ seq_printf(s, " - atomic IO: %4d (Max. %4d)\n",
+ si->aw_cnt, si->max_aw_cnt);
seq_printf(s, " - compress: %4d, hit:%8d\n", si->compress_pages, si->compress_page_hit);
seq_printf(s, " - nodes: %4d in %4d\n",
si->ndirty_node, si->node_pages);
@@ -623,9 +611,7 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
for (i = META_CP; i < META_MAX; i++)
atomic_set(&sbi->meta_count[i], 0);
- atomic_set(&sbi->vw_cnt, 0);
atomic_set(&sbi->max_aw_cnt, 0);
- atomic_set(&sbi->max_vw_cnt, 0);
raw_spin_lock_irqsave(&f2fs_stat_lock, flags);
list_add_tail(&si->stat_list, &f2fs_stat_list);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 9b89f26af1f3..e2022fad1574 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -716,7 +716,6 @@ enum {
enum {
GC_FAILURE_PIN,
- GC_FAILURE_ATOMIC,
MAX_GC_FAILURE
};
@@ -738,8 +737,6 @@ enum {
FI_UPDATE_WRITE, /* inode has in-place-update data */
FI_NEED_IPU, /* used for ipu per file */
FI_ATOMIC_FILE, /* indicate atomic file */
- FI_ATOMIC_COMMIT, /* indicate the state of atomical committing */
- FI_VOLATILE_FILE, /* indicate volatile file */
FI_FIRST_BLOCK_WRITTEN, /* indicate #0 data block was written */
FI_DROP_CACHE, /* drop dirty page cache */
FI_DATA_EXIST, /* indicate data exists */
@@ -752,7 +749,6 @@ enum {
FI_EXTRA_ATTR, /* indicate file has extra attribute */
FI_PROJ_INHERIT, /* indicate file inherits projectid */
FI_PIN_FILE, /* indicate file should not be gced */
- FI_ATOMIC_REVOKE_REQUEST, /* request to drop atomic data */
FI_VERITY_IN_PROGRESS, /* building fs-verity Merkle tree */
FI_COMPRESSED_FILE, /* indicate file's data can be compressed */
FI_COMPRESS_CORRUPT, /* indicate compressed cluster is corrupted */
@@ -760,6 +756,7 @@ enum {
FI_ENABLE_COMPRESS, /* enable compression in "user" compression mode */
FI_COMPRESS_RELEASED, /* compressed blocks were released */
FI_ALIGNED_WRITE, /* enable aligned write */
+ FI_COW_FILE, /* indicate COW file */
FI_MAX, /* max flag, never be used */
};
@@ -794,11 +791,9 @@ struct f2fs_inode_info {
#endif
struct list_head dirty_list; /* dirty list for dirs and files */
struct list_head gdirty_list; /* linked in global dirty list */
- struct list_head inmem_ilist; /* list for inmem inodes */
- struct list_head inmem_pages; /* inmemory pages managed by f2fs */
- struct task_struct *inmem_task; /* store inmemory task */
- struct mutex inmem_lock; /* lock for inmemory pages */
+ struct task_struct *atomic_write_task; /* store atomic write task */
struct extent_tree *extent_tree; /* cached extent_tree entry */
+ struct inode *cow_inode; /* copy-on-write inode for atomic write */
/* avoid racing between foreground op and gc */
struct f2fs_rwsem i_gc_rwsem[2];
@@ -1092,7 +1087,6 @@ enum count_type {
F2FS_DIRTY_QDATA,
F2FS_DIRTY_NODES,
F2FS_DIRTY_META,
- F2FS_INMEM_PAGES,
F2FS_DIRTY_IMETA,
F2FS_WB_CP_DATA,
F2FS_WB_DATA,
@@ -1122,11 +1116,7 @@ enum page_type {
META,
NR_PAGE_TYPE,
META_FLUSH,
- INMEM, /* the below types are used by tracepoints only. */
- INMEM_DROP,
- INMEM_INVALIDATE,
- INMEM_REVOKE,
- IPU,
+ IPU, /* the below types are used by tracepoints only. */
OPU,
};
@@ -1718,7 +1708,6 @@ struct f2fs_sb_info {
/* for skip statistic */
unsigned int atomic_files; /* # of opened atomic file */
- unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
unsigned long long skipped_gc_rwsem; /* FG_GC only */
/* threshold for gc trials on pinned files */
@@ -1749,9 +1738,7 @@ struct f2fs_sb_info {
atomic_t inline_dir; /* # of inline_dentry inodes */
atomic_t compr_inode; /* # of compressed inodes */
atomic64_t compr_blocks; /* # of compressed blocks */
- atomic_t vw_cnt; /* # of volatile writes */
atomic_t max_aw_cnt; /* max # of atomic writes */
- atomic_t max_vw_cnt; /* max # of volatile writes */
unsigned int io_skip_bggc; /* skip background gc for in-flight IO */
unsigned int other_skip_bggc; /* skip background gc for other reasons */
unsigned int ndirty_inode[NR_INODE_TYPE]; /* # of dirty inodes */
@@ -3202,14 +3189,9 @@ static inline bool f2fs_is_atomic_file(struct inode *inode)
return is_inode_flag_set(inode, FI_ATOMIC_FILE);
}
-static inline bool f2fs_is_commit_atomic_write(struct inode *inode)
+static inline bool f2fs_is_cow_file(struct inode *inode)
{
- return is_inode_flag_set(inode, FI_ATOMIC_COMMIT);
-}
-
-static inline bool f2fs_is_volatile_file(struct inode *inode)
-{
- return is_inode_flag_set(inode, FI_VOLATILE_FILE);
+ return is_inode_flag_set(inode, FI_COW_FILE);
}
static inline bool f2fs_is_first_block_written(struct inode *inode)
@@ -3444,6 +3426,8 @@ void f2fs_handle_failed_inode(struct inode *inode);
int f2fs_update_extension_list(struct f2fs_sb_info *sbi, const char *name,
bool hot, bool set);
struct dentry *f2fs_get_parent(struct dentry *child);
+int f2fs_get_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+ struct inode **new_inode);
/*
* dir.c
@@ -3579,11 +3563,8 @@ void f2fs_destroy_node_manager_caches(void);
* segment.c
*/
bool f2fs_need_SSR(struct f2fs_sb_info *sbi);
-void f2fs_register_inmem_page(struct inode *inode, struct page *page);
-void f2fs_drop_inmem_pages_all(struct f2fs_sb_info *sbi, bool gc_failure);
-void f2fs_drop_inmem_pages(struct inode *inode);
-void f2fs_drop_inmem_page(struct inode *inode, struct page *page);
-int f2fs_commit_inmem_pages(struct inode *inode);
+int f2fs_commit_atomic_write(struct inode *inode);
+void f2fs_abort_atomic_write(struct inode *inode, bool clean);
void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need);
void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi, bool from_bg);
int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino);
@@ -3815,7 +3796,6 @@ struct f2fs_stat_info {
int ext_tree, zombie_tree, ext_node;
int ndirty_node, ndirty_dent, ndirty_meta, ndirty_imeta;
int ndirty_data, ndirty_qdata;
- int inmem_pages;
unsigned int ndirty_dirs, ndirty_files, nquota_files, ndirty_all;
int nats, dirty_nats, sits, dirty_sits;
int free_nids, avail_nids, alloc_nids;
@@ -3833,7 +3813,7 @@ struct f2fs_stat_info {
int inline_xattr, inline_inode, inline_dir, append, update, orphans;
int compr_inode;
unsigned long long compr_blocks;
- int aw_cnt, max_aw_cnt, vw_cnt, max_vw_cnt;
+ int aw_cnt, max_aw_cnt;
unsigned int valid_count, valid_node_count, valid_inode_count, discard_blks;
unsigned int bimodal, avg_vblocks;
int util_free, util_valid, util_invalid;
@@ -3845,7 +3825,6 @@ struct f2fs_stat_info {
int bg_node_segs, bg_data_segs;
int tot_blks, data_blks, node_blks;
int bg_data_blks, bg_node_blks;
- unsigned long long skipped_atomic_files[2];
int curseg[NR_CURSEG_TYPE];
int cursec[NR_CURSEG_TYPE];
int curzone[NR_CURSEG_TYPE];
@@ -3945,17 +3924,6 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
if (cur > max) \
atomic_set(&F2FS_I_SB(inode)->max_aw_cnt, cur); \
} while (0)
-#define stat_inc_volatile_write(inode) \
- (atomic_inc(&F2FS_I_SB(inode)->vw_cnt))
-#define stat_dec_volatile_write(inode) \
- (atomic_dec(&F2FS_I_SB(inode)->vw_cnt))
-#define stat_update_max_volatile_write(inode) \
- do { \
- int cur = atomic_read(&F2FS_I_SB(inode)->vw_cnt); \
- int max = atomic_read(&F2FS_I_SB(inode)->max_vw_cnt); \
- if (cur > max) \
- atomic_set(&F2FS_I_SB(inode)->max_vw_cnt, cur); \
- } while (0)
#define stat_inc_seg_count(sbi, type, gc_type) \
do { \
struct f2fs_stat_info *si = F2FS_STAT(sbi); \
@@ -4017,9 +3985,6 @@ void f2fs_update_sit_info(struct f2fs_sb_info *sbi);
#define stat_add_compr_blocks(inode, blocks) do { } while (0)
#define stat_sub_compr_blocks(inode, blocks) do { } while (0)
#define stat_update_max_atomic_write(inode) do { } while (0)
-#define stat_inc_volatile_write(inode) do { } while (0)
-#define stat_dec_volatile_write(inode) do { } while (0)
-#define stat_update_max_volatile_write(inode) do { } while (0)
#define stat_inc_meta_count(sbi, blkaddr) do { } while (0)
#define stat_inc_seg_type(sbi, curseg) do { } while (0)
#define stat_inc_block_count(sbi, curseg) do { } while (0)
@@ -4423,8 +4388,7 @@ static inline bool f2fs_lfs_mode(struct f2fs_sb_info *sbi)
static inline bool f2fs_may_compress(struct inode *inode)
{
if (IS_SWAPFILE(inode) || f2fs_is_pinned_file(inode) ||
- f2fs_is_atomic_file(inode) ||
- f2fs_is_volatile_file(inode))
+ f2fs_is_atomic_file(inode) || f2fs_has_inline_data(inode))
return false;
return S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode);
}
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 5d1b97e852e7..d3415aca5736 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1816,16 +1816,8 @@ static int f2fs_release_file(struct inode *inode, struct file *filp)
atomic_read(&inode->i_writecount) != 1)
return 0;
- /* some remained atomic pages should discarded */
if (f2fs_is_atomic_file(inode))
- f2fs_drop_inmem_pages(inode);
- if (f2fs_is_volatile_file(inode)) {
- set_inode_flag(inode, FI_DROP_CACHE);
- filemap_fdatawrite(inode->i_mapping);
- clear_inode_flag(inode, FI_DROP_CACHE);
- clear_inode_flag(inode, FI_VOLATILE_FILE);
- stat_dec_volatile_write(inode);
- }
+ f2fs_abort_atomic_write(inode, true);
return 0;
}
@@ -1840,8 +1832,8 @@ static int f2fs_file_flush(struct file *file, fl_owner_t id)
* before dropping file lock, it needs to do in ->flush.
*/
if (f2fs_is_atomic_file(inode) &&
- F2FS_I(inode)->inmem_task == current)
- f2fs_drop_inmem_pages(inode);
+ F2FS_I(inode)->atomic_write_task == current)
+ f2fs_abort_atomic_write(inode, true);
return 0;
}
@@ -1875,10 +1867,7 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
if (masked_flags & F2FS_COMPR_FL) {
if (!f2fs_disable_compressed_file(inode))
return -EINVAL;
- }
- if (iflags & F2FS_NOCOMP_FL)
- return -EINVAL;
- if (iflags & F2FS_COMPR_FL) {
+ } else {
if (!f2fs_may_compress(inode))
return -EINVAL;
if (S_ISREG(inode->i_mode) && inode->i_size)
@@ -1887,10 +1876,6 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
set_compress_context(inode);
}
}
- if ((iflags ^ masked_flags) & F2FS_NOCOMP_FL) {
- if (masked_flags & F2FS_COMPR_FL)
- return -EINVAL;
- }
fi->i_flags = iflags | (fi->i_flags & ~mask);
f2fs_bug_on(F2FS_I_SB(inode), (fi->i_flags & F2FS_COMPR_FL) &&
@@ -2004,6 +1989,7 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
struct f2fs_inode_info *fi = F2FS_I(inode);
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct inode *pinode;
int ret;
if (!inode_owner_or_capable(mnt_userns, inode))
@@ -2026,11 +2012,8 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
goto out;
}
- if (f2fs_is_atomic_file(inode)) {
- if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST))
- ret = -EINVAL;
+ if (f2fs_is_atomic_file(inode))
goto out;
- }
ret = f2fs_convert_inline_inode(inode);
if (ret)
@@ -2051,19 +2034,33 @@ static int f2fs_ioc_start_atomic_write(struct file *filp)
goto out;
}
+ /* Create a COW inode for atomic write */
+ pinode = f2fs_iget(inode->i_sb, fi->i_pino);
+ if (IS_ERR(pinode)) {
+ f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+ ret = PTR_ERR(pinode);
+ goto out;
+ }
+
+ ret = f2fs_get_tmpfile(mnt_userns, pinode, &fi->cow_inode);
+ iput(pinode);
+ if (ret) {
+ f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
+ goto out;
+ }
+ f2fs_i_size_write(fi->cow_inode, i_size_read(inode));
+
spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
- if (list_empty(&fi->inmem_ilist))
- list_add_tail(&fi->inmem_ilist, &sbi->inode_list[ATOMIC_FILE]);
sbi->atomic_files++;
spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
- /* add inode in inmem_list first and set atomic_file */
set_inode_flag(inode, FI_ATOMIC_FILE);
- clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
+ set_inode_flag(fi->cow_inode, FI_COW_FILE);
+ clear_inode_flag(fi->cow_inode, FI_INLINE_DATA);
f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
- F2FS_I(inode)->inmem_task = current;
+ F2FS_I(inode)->atomic_write_task = current;
stat_update_max_atomic_write(inode);
out:
inode_unlock(inode);
@@ -2088,99 +2085,24 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
inode_lock(inode);
- if (f2fs_is_volatile_file(inode)) {
- ret = -EINVAL;
- goto err_out;
- }
-
if (f2fs_is_atomic_file(inode)) {
- ret = f2fs_commit_inmem_pages(inode);
+ ret = f2fs_commit_atomic_write(inode);
if (ret)
- goto err_out;
+ goto unlock_out;
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
if (!ret)
- f2fs_drop_inmem_pages(inode);
+ f2fs_abort_atomic_write(inode, false);
} else {
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
-err_out:
- if (is_inode_flag_set(inode, FI_ATOMIC_REVOKE_REQUEST)) {
- clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
- ret = -EINVAL;
- }
+unlock_out:
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
}
-static int f2fs_ioc_start_volatile_write(struct file *filp)
-{
- struct inode *inode = file_inode(filp);
- struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
- int ret;
-
- if (!inode_owner_or_capable(mnt_userns, inode))
- return -EACCES;
-
- if (!S_ISREG(inode->i_mode))
- return -EINVAL;
-
- ret = mnt_want_write_file(filp);
- if (ret)
- return ret;
-
- inode_lock(inode);
-
- if (f2fs_is_volatile_file(inode))
- goto out;
-
- ret = f2fs_convert_inline_inode(inode);
- if (ret)
- goto out;
-
- stat_inc_volatile_write(inode);
- stat_update_max_volatile_write(inode);
-
- set_inode_flag(inode, FI_VOLATILE_FILE);
- f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
-out:
- inode_unlock(inode);
- mnt_drop_write_file(filp);
- return ret;
-}
-
-static int f2fs_ioc_release_volatile_write(struct file *filp)
-{
- struct inode *inode = file_inode(filp);
- struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
- int ret;
-
- if (!inode_owner_or_capable(mnt_userns, inode))
- return -EACCES;
-
- ret = mnt_want_write_file(filp);
- if (ret)
- return ret;
-
- inode_lock(inode);
-
- if (!f2fs_is_volatile_file(inode))
- goto out;
-
- if (!f2fs_is_first_block_written(inode)) {
- ret = truncate_partial_data_page(inode, 0, true);
- goto out;
- }
-
- ret = punch_hole(inode, 0, F2FS_BLKSIZE);
-out:
- inode_unlock(inode);
- mnt_drop_write_file(filp);
- return ret;
-}
-
-static int f2fs_ioc_abort_volatile_write(struct file *filp)
+static int f2fs_ioc_abort_atomic_write(struct file *filp)
{
struct inode *inode = file_inode(filp);
struct user_namespace *mnt_userns = file_mnt_user_ns(filp);
@@ -2196,14 +2118,7 @@ static int f2fs_ioc_abort_volatile_write(struct file *filp)
inode_lock(inode);
if (f2fs_is_atomic_file(inode))
- f2fs_drop_inmem_pages(inode);
- if (f2fs_is_volatile_file(inode)) {
- clear_inode_flag(inode, FI_VOLATILE_FILE);
- stat_dec_volatile_write(inode);
- ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true);
- }
-
- clear_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
+ f2fs_abort_atomic_write(inode, true);
inode_unlock(inode);
@@ -4024,8 +3939,8 @@ static int f2fs_ioc_decompress_file(struct file *filp, unsigned long arg)
goto out;
}
- if (f2fs_is_mmap_file(inode)) {
- ret = -EBUSY;
+ if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) {
+ ret = -EINVAL;
goto out;
}
@@ -4096,8 +4011,8 @@ static int f2fs_ioc_compress_file(struct file *filp, unsigned long arg)
goto out;
}
- if (f2fs_is_mmap_file(inode)) {
- ret = -EBUSY;
+ if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) {
+ ret = -EINVAL;
goto out;
}
@@ -4149,12 +4064,11 @@ static long __f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
return f2fs_ioc_start_atomic_write(filp);
case F2FS_IOC_COMMIT_ATOMIC_WRITE:
return f2fs_ioc_commit_atomic_write(filp);
+ case F2FS_IOC_ABORT_ATOMIC_WRITE:
+ return f2fs_ioc_abort_atomic_write(filp);
case F2FS_IOC_START_VOLATILE_WRITE:
- return f2fs_ioc_start_volatile_write(filp);
case F2FS_IOC_RELEASE_VOLATILE_WRITE:
- return f2fs_ioc_release_volatile_write(filp);
- case F2FS_IOC_ABORT_VOLATILE_WRITE:
- return f2fs_ioc_abort_volatile_write(filp);
+ return -EOPNOTSUPP;
case F2FS_IOC_SHUTDOWN:
return f2fs_ioc_shutdown(filp, arg);
case FITRIM:
@@ -4778,7 +4692,7 @@ long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case F2FS_IOC_COMMIT_ATOMIC_WRITE:
case F2FS_IOC_START_VOLATILE_WRITE:
case F2FS_IOC_RELEASE_VOLATILE_WRITE:
- case F2FS_IOC_ABORT_VOLATILE_WRITE:
+ case F2FS_IOC_ABORT_ATOMIC_WRITE:
case F2FS_IOC_SHUTDOWN:
case FITRIM:
case FS_IOC_SET_ENCRYPTION_POLICY:
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index ea5b93b689cd..ba8e93e517be 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -646,6 +646,54 @@ static void release_victim_entry(struct f2fs_sb_info *sbi)
f2fs_bug_on(sbi, !list_empty(&am->victim_list));
}
+static bool f2fs_pin_section(struct f2fs_sb_info *sbi, unsigned int segno)
+{
+ struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
+ unsigned int secno = GET_SEC_FROM_SEG(sbi, segno);
+
+ if (!dirty_i->enable_pin_section)
+ return false;
+ if (!test_and_set_bit(secno, dirty_i->pinned_secmap))
+ dirty_i->pinned_secmap_cnt++;
+ return true;
+}
+
+static bool f2fs_pinned_section_exists(struct dirty_seglist_info *dirty_i)
+{
+ return dirty_i->pinned_secmap_cnt;
+}
+
+static bool f2fs_section_is_pinned(struct dirty_seglist_info *dirty_i,
+ unsigned int secno)
+{
+ return dirty_i->enable_pin_section &&
+ f2fs_pinned_section_exists(dirty_i) &&
+ test_bit(secno, dirty_i->pinned_secmap);
+}
+
+static void f2fs_unpin_all_sections(struct f2fs_sb_info *sbi, bool enable)
+{
+ unsigned int bitmap_size = f2fs_bitmap_size(MAIN_SECS(sbi));
+
+ if (f2fs_pinned_section_exists(DIRTY_I(sbi))) {
+ memset(DIRTY_I(sbi)->pinned_secmap, 0, bitmap_size);
+ DIRTY_I(sbi)->pinned_secmap_cnt = 0;
+ }
+ DIRTY_I(sbi)->enable_pin_section = enable;
+}
+
+static int f2fs_gc_pinned_control(struct inode *inode, int gc_type,
+ unsigned int segno)
+{
+ if (!f2fs_is_pinned_file(inode))
+ return 0;
+ if (gc_type != FG_GC)
+ return -EBUSY;
+ if (!f2fs_pin_section(F2FS_I_SB(inode), segno))
+ f2fs_pin_file_control(inode, true);
+ return -EAGAIN;
+}
+
/*
* This function is called from two paths.
* One is garbage collection and the other is SSR segment selection.
@@ -787,6 +835,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
goto next;
+ if (gc_type == FG_GC && f2fs_section_is_pinned(dirty_i, secno))
+ goto next;
+
if (is_atgc) {
add_victim_entry(sbi, &p, segno);
goto next;
@@ -1194,18 +1245,9 @@ static int move_data_block(struct inode *inode, block_t bidx,
goto out;
}
- if (f2fs_is_atomic_file(inode)) {
- F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC]++;
- F2FS_I_SB(inode)->skipped_atomic_files[gc_type]++;
- err = -EAGAIN;
- goto out;
- }
-
- if (f2fs_is_pinned_file(inode)) {
- f2fs_pin_file_control(inode, true);
- err = -EAGAIN;
+ err = f2fs_gc_pinned_control(inode, gc_type, segno);
+ if (err)
goto out;
- }
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = f2fs_get_dnode_of_data(&dn, bidx, LOOKUP_NODE);
@@ -1344,18 +1386,9 @@ static int move_data_page(struct inode *inode, block_t bidx, int gc_type,
goto out;
}
- if (f2fs_is_atomic_file(inode)) {
- F2FS_I(inode)->i_gc_failures[GC_FAILURE_ATOMIC]++;
- F2FS_I_SB(inode)->skipped_atomic_files[gc_type]++;
- err = -EAGAIN;
- goto out;
- }
- if (f2fs_is_pinned_file(inode)) {
- if (gc_type == FG_GC)
- f2fs_pin_file_control(inode, true);
- err = -EAGAIN;
+ err = f2fs_gc_pinned_control(inode, gc_type, segno);
+ if (err)
goto out;
- }
if (gc_type == BG_GC) {
if (PageWriteback(page)) {
@@ -1475,11 +1508,19 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
ofs_in_node = le16_to_cpu(entry->ofs_in_node);
if (phase == 3) {
+ int err;
+
inode = f2fs_iget(sb, dni.ino);
if (IS_ERR(inode) || is_bad_inode(inode) ||
special_file(inode->i_mode))
continue;
+ err = f2fs_gc_pinned_control(inode, gc_type, segno);
+ if (err == -EAGAIN) {
+ iput(inode);
+ return submitted;
+ }
+
if (!f2fs_down_write_trylock(
&F2FS_I(inode)->i_gc_rwsem[WRITE])) {
iput(inode);
@@ -1711,8 +1752,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
.ilist = LIST_HEAD_INIT(gc_list.ilist),
.iroot = RADIX_TREE_INIT(gc_list.iroot, GFP_NOFS),
};
- unsigned long long last_skipped = sbi->skipped_atomic_files[FG_GC];
- unsigned long long first_skipped;
unsigned int skipped_round = 0, round = 0;
trace_f2fs_gc_begin(sbi->sb, sync, background,
@@ -1726,7 +1765,6 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
cpc.reason = __get_cp_reason(sbi);
sbi->skipped_gc_rwsem = 0;
- first_skipped = last_skipped;
gc_more:
if (unlikely(!(sbi->sb->s_flags & SB_ACTIVE))) {
ret = -EINVAL;
@@ -1758,9 +1796,17 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
ret = -EINVAL;
goto stop;
}
+retry:
ret = __get_victim(sbi, &segno, gc_type);
- if (ret)
+ if (ret) {
+ /* allow to search victim from sections has pinned data */
+ if (ret == -ENODATA && gc_type == FG_GC &&
+ f2fs_pinned_section_exists(DIRTY_I(sbi))) {
+ f2fs_unpin_all_sections(sbi, false);
+ goto retry;
+ }
goto stop;
+ }
seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type, force);
if (gc_type == FG_GC &&
@@ -1769,10 +1815,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
total_freed += seg_freed;
if (gc_type == FG_GC) {
- if (sbi->skipped_atomic_files[FG_GC] > last_skipped ||
- sbi->skipped_gc_rwsem)
+ if (sbi->skipped_gc_rwsem)
skipped_round++;
- last_skipped = sbi->skipped_atomic_files[FG_GC];
round++;
}
@@ -1782,27 +1826,31 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
if (sync)
goto stop;
- if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
- if (skipped_round <= MAX_SKIP_GC_COUNT ||
- skipped_round * 2 < round) {
- segno = NULL_SEGNO;
- goto gc_more;
- }
+ if (!has_not_enough_free_secs(sbi, sec_freed, 0))
+ goto stop;
- if (first_skipped < last_skipped &&
- (last_skipped - first_skipped) >
- sbi->skipped_gc_rwsem) {
- f2fs_drop_inmem_pages_all(sbi, true);
- segno = NULL_SEGNO;
- goto gc_more;
- }
- if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
+ if (skipped_round <= MAX_SKIP_GC_COUNT || skipped_round * 2 < round) {
+
+ /* Write checkpoint to reclaim prefree segments */
+ if (free_sections(sbi) < NR_CURSEG_PERSIST_TYPE &&
+ prefree_segments(sbi) &&
+ !is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
ret = f2fs_write_checkpoint(sbi, &cpc);
+ if (ret)
+ goto stop;
+ }
+ segno = NULL_SEGNO;
+ goto gc_more;
}
+ if (gc_type == FG_GC && !is_sbi_flag_set(sbi, SBI_CP_DISABLED))
+ ret = f2fs_write_checkpoint(sbi, &cpc);
stop:
SIT_I(sbi)->last_victim[ALLOC_NEXT] = 0;
SIT_I(sbi)->last_victim[FLUSH_DEVICE] = init_segno;
+ if (gc_type == FG_GC)
+ f2fs_unpin_all_sections(sbi, true);
+
trace_f2fs_gc_end(sbi->sb, ret, total_freed, sec_freed,
get_pages(sbi, F2FS_DIRTY_NODES),
get_pages(sbi, F2FS_DIRTY_DENTS),
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index e9818723103c..938961a9084e 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -744,9 +744,8 @@ void f2fs_evict_inode(struct inode *inode)
nid_t xnid = F2FS_I(inode)->i_xattr_nid;
int err = 0;
- /* some remained atomic pages should discarded */
if (f2fs_is_atomic_file(inode))
- f2fs_drop_inmem_pages(inode);
+ f2fs_abort_atomic_write(inode, true);
trace_f2fs_evict_inode(inode);
truncate_inode_pages_final(&inode->i_data);
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 3764e12f19db..28d5981443a7 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -848,8 +848,8 @@ static int f2fs_mknod(struct user_namespace *mnt_userns, struct inode *dir,
}
static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
- struct dentry *dentry, umode_t mode,
- struct inode **whiteout)
+ struct dentry *dentry, umode_t mode, bool is_whiteout,
+ struct inode **new_inode)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
struct inode *inode;
@@ -863,7 +863,7 @@ static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
if (IS_ERR(inode))
return PTR_ERR(inode);
- if (whiteout) {
+ if (is_whiteout) {
init_special_inode(inode, inode->i_mode, WHITEOUT_DEV);
inode->i_op = &f2fs_special_inode_operations;
} else {
@@ -888,21 +888,25 @@ static int __f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
f2fs_add_orphan_inode(inode);
f2fs_alloc_nid_done(sbi, inode->i_ino);
- if (whiteout) {
+ if (is_whiteout) {
f2fs_i_links_write(inode, false);
spin_lock(&inode->i_lock);
inode->i_state |= I_LINKABLE;
spin_unlock(&inode->i_lock);
-
- *whiteout = inode;
} else {
- d_tmpfile(dentry, inode);
+ if (dentry)
+ d_tmpfile(dentry, inode);
+ else
+ f2fs_i_links_write(inode, false);
}
/* link_count was changed by d_tmpfile as well. */
f2fs_unlock_op(sbi);
unlock_new_inode(inode);
+ if (new_inode)
+ *new_inode = inode;
+
f2fs_balance_fs(sbi, true);
return 0;
@@ -923,7 +927,7 @@ static int f2fs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
if (!f2fs_is_checkpoint_ready(sbi))
return -ENOSPC;
- return __f2fs_tmpfile(mnt_userns, dir, dentry, mode, NULL);
+ return __f2fs_tmpfile(mnt_userns, dir, dentry, mode, false, NULL);
}
static int f2fs_create_whiteout(struct user_namespace *mnt_userns,
@@ -933,7 +937,13 @@ static int f2fs_create_whiteout(struct user_namespace *mnt_userns,
return -EIO;
return __f2fs_tmpfile(mnt_userns, dir, NULL,
- S_IFCHR | WHITEOUT_MODE, whiteout);
+ S_IFCHR | WHITEOUT_MODE, true, whiteout);
+}
+
+int f2fs_get_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
+ struct inode **new_inode)
+{
+ return __f2fs_tmpfile(mnt_userns, dir, NULL, S_IFREG, false, new_inode);
}
static int f2fs_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index aedc3d334113..9dabf99eedc5 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -90,10 +90,6 @@ bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type)
atomic_read(&sbi->total_ext_node) *
sizeof(struct extent_node)) >> PAGE_SHIFT;
res = mem_size < ((avail_ram * nm_i->ram_thresh / 100) >> 1);
- } else if (type == INMEM_PAGES) {
- /* it allows 20% / total_ram for inmemory pages */
- mem_size = get_pages(sbi, F2FS_INMEM_PAGES);
- res = mem_size < (val.totalram / 5);
} else if (type == DISCARD_CACHE) {
mem_size = (atomic_read(&dcc->discard_cmd_cnt) *
sizeof(struct discard_cmd)) >> PAGE_SHIFT;
diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
index 4c1d34bfea78..3c09cae058b0 100644
--- a/fs/f2fs/node.h
+++ b/fs/f2fs/node.h
@@ -147,7 +147,6 @@ enum mem_type {
DIRTY_DENTS, /* indicates dirty dentry pages */
INO_ENTRIES, /* indicates inode entries */
EXTENT_CACHE, /* indicates extent cache */
- INMEM_PAGES, /* indicates inmemory pages */
DISCARD_CACHE, /* indicates memory of cached discard cmds */
COMPRESS_PAGE, /* indicates memory of cached compressed pages */
BASE_CHECK, /* check kernel status */
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index aa0162664a1e..331ad46722b2 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -30,7 +30,7 @@
static struct kmem_cache *discard_entry_slab;
static struct kmem_cache *discard_cmd_slab;
static struct kmem_cache *sit_entry_set_slab;
-static struct kmem_cache *inmem_entry_slab;
+static struct kmem_cache *revoke_entry_slab;
static unsigned long __reverse_ulong(unsigned char *str)
{
@@ -185,304 +185,180 @@ bool f2fs_need_SSR(struct f2fs_sb_info *sbi)
SM_I(sbi)->min_ssr_sections + reserved_sections(sbi));
}
-void f2fs_register_inmem_page(struct inode *inode, struct page *page)
+void f2fs_abort_atomic_write(struct inode *inode, bool clean)
{
- struct inmem_pages *new;
-
- set_page_private_atomic(page);
-
- new = f2fs_kmem_cache_alloc(inmem_entry_slab,
- GFP_NOFS, true, NULL);
-
- /* add atomic page indices to the list */
- new->page = page;
- INIT_LIST_HEAD(&new->list);
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct f2fs_inode_info *fi = F2FS_I(inode);
- /* increase reference count with clean state */
- get_page(page);
- mutex_lock(&F2FS_I(inode)->inmem_lock);
- list_add_tail(&new->list, &F2FS_I(inode)->inmem_pages);
- inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
- mutex_unlock(&F2FS_I(inode)->inmem_lock);
+ if (f2fs_is_atomic_file(inode)) {
+ if (clean)
+ truncate_inode_pages_final(inode->i_mapping);
+ clear_inode_flag(fi->cow_inode, FI_COW_FILE);
+ iput(fi->cow_inode);
+ fi->cow_inode = NULL;
+ clear_inode_flag(inode, FI_ATOMIC_FILE);
- trace_f2fs_register_inmem_page(page, INMEM);
+ spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
+ sbi->atomic_files--;
+ spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+ }
}
-static int __revoke_inmem_pages(struct inode *inode,
- struct list_head *head, bool drop, bool recover,
- bool trylock)
+static int __replace_atomic_write_block(struct inode *inode, pgoff_t index,
+ block_t new_addr, block_t *old_addr, bool recover)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
- struct inmem_pages *cur, *tmp;
- int err = 0;
-
- list_for_each_entry_safe(cur, tmp, head, list) {
- struct page *page = cur->page;
-
- if (drop)
- trace_f2fs_commit_inmem_page(page, INMEM_DROP);
-
- if (trylock) {
- /*
- * to avoid deadlock in between page lock and
- * inmem_lock.
- */
- if (!trylock_page(page))
- continue;
- } else {
- lock_page(page);
- }
-
- f2fs_wait_on_page_writeback(page, DATA, true, true);
-
- if (recover) {
- struct dnode_of_data dn;
- struct node_info ni;
+ struct dnode_of_data dn;
+ struct node_info ni;
+ int err;
- trace_f2fs_commit_inmem_page(page, INMEM_REVOKE);
retry:
- set_new_dnode(&dn, inode, NULL, NULL, 0);
- err = f2fs_get_dnode_of_data(&dn, page->index,
- LOOKUP_NODE);
- if (err) {
- if (err == -ENOMEM) {
- memalloc_retry_wait(GFP_NOFS);
- goto retry;
- }
- err = -EAGAIN;
- goto next;
- }
-
- err = f2fs_get_node_info(sbi, dn.nid, &ni, false);
- if (err) {
- f2fs_put_dnode(&dn);
- return err;
- }
-
- if (cur->old_addr == NEW_ADDR) {
- f2fs_invalidate_blocks(sbi, dn.data_blkaddr);
- f2fs_update_data_blkaddr(&dn, NEW_ADDR);
- } else
- f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
- cur->old_addr, ni.version, true, true);
- f2fs_put_dnode(&dn);
- }
-next:
- /* we don't need to invalidate this in the sccessful status */
- if (drop || recover) {
- ClearPageUptodate(page);
- clear_page_private_gcing(page);
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = f2fs_get_dnode_of_data(&dn, index, LOOKUP_NODE_RA);
+ if (err) {
+ if (err == -ENOMEM) {
+ f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
+ goto retry;
}
- detach_page_private(page);
- set_page_private(page, 0);
- f2fs_put_page(page, 1);
-
- list_del(&cur->list);
- kmem_cache_free(inmem_entry_slab, cur);
- dec_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
+ return err;
}
- return err;
-}
-void f2fs_drop_inmem_pages_all(struct f2fs_sb_info *sbi, bool gc_failure)
-{
- struct list_head *head = &sbi->inode_list[ATOMIC_FILE];
- struct inode *inode;
- struct f2fs_inode_info *fi;
- unsigned int count = sbi->atomic_files;
- unsigned int looped = 0;
-next:
- spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
- if (list_empty(head)) {
- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
- return;
+ err = f2fs_get_node_info(sbi, dn.nid, &ni, false);
+ if (err) {
+ f2fs_put_dnode(&dn);
+ return err;
}
- fi = list_first_entry(head, struct f2fs_inode_info, inmem_ilist);
- inode = igrab(&fi->vfs_inode);
- if (inode)
- list_move_tail(&fi->inmem_ilist, head);
- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
- if (inode) {
- if (gc_failure) {
- if (!fi->i_gc_failures[GC_FAILURE_ATOMIC])
- goto skip;
+ if (recover) {
+ /* dn.data_blkaddr is always valid */
+ if (!__is_valid_data_blkaddr(new_addr)) {
+ if (new_addr == NULL_ADDR)
+ dec_valid_block_count(sbi, inode, 1);
+ f2fs_invalidate_blocks(sbi, dn.data_blkaddr);
+ f2fs_update_data_blkaddr(&dn, new_addr);
+ } else {
+ f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
+ new_addr, ni.version, true, true);
}
- set_inode_flag(inode, FI_ATOMIC_REVOKE_REQUEST);
- f2fs_drop_inmem_pages(inode);
-skip:
- iput(inode);
- }
- f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
- if (gc_failure) {
- if (++looped >= count)
- return;
- }
- goto next;
-}
-
-void f2fs_drop_inmem_pages(struct inode *inode)
-{
- struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
- struct f2fs_inode_info *fi = F2FS_I(inode);
+ } else {
+ blkcnt_t count = 1;
- do {
- mutex_lock(&fi->inmem_lock);
- if (list_empty(&fi->inmem_pages)) {
- fi->i_gc_failures[GC_FAILURE_ATOMIC] = 0;
-
- spin_lock(&sbi->inode_lock[ATOMIC_FILE]);
- if (!list_empty(&fi->inmem_ilist))
- list_del_init(&fi->inmem_ilist);
- if (f2fs_is_atomic_file(inode)) {
- clear_inode_flag(inode, FI_ATOMIC_FILE);
- sbi->atomic_files--;
- }
- spin_unlock(&sbi->inode_lock[ATOMIC_FILE]);
+ *old_addr = dn.data_blkaddr;
+ f2fs_truncate_data_blocks_range(&dn, 1);
+ dec_valid_block_count(sbi, F2FS_I(inode)->cow_inode, count);
+ inc_valid_block_count(sbi, inode, &count);
+ f2fs_replace_block(sbi, &dn, dn.data_blkaddr, new_addr,
+ ni.version, true, false);
+ }
- mutex_unlock(&fi->inmem_lock);
- break;
- }
- __revoke_inmem_pages(inode, &fi->inmem_pages,
- true, false, true);
- mutex_unlock(&fi->inmem_lock);
- } while (1);
+ f2fs_put_dnode(&dn);
+ return 0;
}
-void f2fs_drop_inmem_page(struct inode *inode, struct page *page)
+static void __complete_revoke_list(struct inode *inode, struct list_head *head,
+ bool revoke)
{
- struct f2fs_inode_info *fi = F2FS_I(inode);
- struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
- struct list_head *head = &fi->inmem_pages;
- struct inmem_pages *cur = NULL;
- struct inmem_pages *tmp;
-
- f2fs_bug_on(sbi, !page_private_atomic(page));
+ struct revoke_entry *cur, *tmp;
- mutex_lock(&fi->inmem_lock);
- list_for_each_entry(tmp, head, list) {
- if (tmp->page == page) {
- cur = tmp;
- break;
- }
+ list_for_each_entry_safe(cur, tmp, head, list) {
+ if (revoke)
+ __replace_atomic_write_block(inode, cur->index,
+ cur->old_addr, NULL, true);
+ list_del(&cur->list);
+ kmem_cache_free(revoke_entry_slab, cur);
}
-
- f2fs_bug_on(sbi, !cur);
- list_del(&cur->list);
- mutex_unlock(&fi->inmem_lock);
-
- dec_page_count(sbi, F2FS_INMEM_PAGES);
- kmem_cache_free(inmem_entry_slab, cur);
-
- ClearPageUptodate(page);
- clear_page_private_atomic(page);
- f2fs_put_page(page, 0);
-
- detach_page_private(page);
- set_page_private(page, 0);
-
- trace_f2fs_commit_inmem_page(page, INMEM_INVALIDATE);
}
-static int __f2fs_commit_inmem_pages(struct inode *inode)
+static int __f2fs_commit_atomic_write(struct inode *inode)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct f2fs_inode_info *fi = F2FS_I(inode);
- struct inmem_pages *cur, *tmp;
- struct f2fs_io_info fio = {
- .sbi = sbi,
- .ino = inode->i_ino,
- .type = DATA,
- .op = REQ_OP_WRITE,
- .op_flags = REQ_SYNC | REQ_PRIO,
- .io_type = FS_DATA_IO,
- };
+ struct inode *cow_inode = fi->cow_inode;
+ struct revoke_entry *new;
struct list_head revoke_list;
- bool submit_bio = false;
- int err = 0;
+ block_t blkaddr;
+ struct dnode_of_data dn;
+ pgoff_t len = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+ pgoff_t off = 0, blen, index;
+ int ret = 0, i;
INIT_LIST_HEAD(&revoke_list);
- list_for_each_entry_safe(cur, tmp, &fi->inmem_pages, list) {
- struct page *page = cur->page;
+ while (len) {
+ blen = min_t(pgoff_t, ADDRS_PER_BLOCK(cow_inode), len);
- lock_page(page);
- if (page->mapping == inode->i_mapping) {
- trace_f2fs_commit_inmem_page(page, INMEM);
+ set_new_dnode(&dn, cow_inode, NULL, NULL, 0);
+ ret = f2fs_get_dnode_of_data(&dn, off, LOOKUP_NODE_RA);
+ if (ret && ret != -ENOENT) {
+ goto out;
+ } else if (ret == -ENOENT) {
+ ret = 0;
+ if (dn.max_level == 0)
+ goto out;
+ goto next;
+ }
- f2fs_wait_on_page_writeback(page, DATA, true, true);
+ blen = min((pgoff_t)ADDRS_PER_PAGE(dn.node_page, cow_inode),
+ len);
+ index = off;
+ for (i = 0; i < blen; i++, dn.ofs_in_node++, index++) {
+ blkaddr = f2fs_data_blkaddr(&dn);
- set_page_dirty(page);
- if (clear_page_dirty_for_io(page)) {
- inode_dec_dirty_pages(inode);
- f2fs_remove_dirty_inode(inode);
- }
-retry:
- fio.page = page;
- fio.old_blkaddr = NULL_ADDR;
- fio.encrypted_page = NULL;
- fio.need_lock = LOCK_DONE;
- err = f2fs_do_write_data_page(&fio);
- if (err) {
- if (err == -ENOMEM) {
- memalloc_retry_wait(GFP_NOFS);
- goto retry;
- }
- unlock_page(page);
- break;
+ if (!__is_valid_data_blkaddr(blkaddr)) {
+ continue;
+ } else if (!f2fs_is_valid_blkaddr(sbi, blkaddr,
+ DATA_GENERIC_ENHANCE)) {
+ f2fs_put_dnode(&dn);
+ ret = -EFSCORRUPTED;
+ goto out;
}
- /* record old blkaddr for revoking */
- cur->old_addr = fio.old_blkaddr;
- submit_bio = true;
- }
- unlock_page(page);
- list_move_tail(&cur->list, &revoke_list);
- }
- if (submit_bio)
- f2fs_submit_merged_write_cond(sbi, inode, NULL, 0, DATA);
+ new = f2fs_kmem_cache_alloc(revoke_entry_slab, GFP_NOFS,
+ true, NULL);
+ if (!new) {
+ f2fs_put_dnode(&dn);
+ ret = -ENOMEM;
+ goto out;
+ }
- if (err) {
- /*
- * try to revoke all committed pages, but still we could fail
- * due to no memory or other reason, if that happened, EAGAIN
- * will be returned, which means in such case, transaction is
- * already not integrity, caller should use journal to do the
- * recovery or rewrite & commit last transaction. For other
- * error number, revoking was done by filesystem itself.
- */
- err = __revoke_inmem_pages(inode, &revoke_list,
- false, true, false);
+ ret = __replace_atomic_write_block(inode, index, blkaddr,
+ &new->old_addr, false);
+ if (ret) {
+ f2fs_put_dnode(&dn);
+ kmem_cache_free(revoke_entry_slab, new);
+ goto out;
+ }
- /* drop all uncommitted pages */
- __revoke_inmem_pages(inode, &fi->inmem_pages,
- true, false, false);
- } else {
- __revoke_inmem_pages(inode, &revoke_list,
- false, false, false);
+ f2fs_update_data_blkaddr(&dn, NULL_ADDR);
+ new->index = index;
+ list_add_tail(&new->list, &revoke_list);
+ }
+ f2fs_put_dnode(&dn);
+next:
+ off += blen;
+ len -= blen;
}
- return err;
+out:
+ __complete_revoke_list(inode, &revoke_list, ret ? true : false);
+
+ return ret;
}
-int f2fs_commit_inmem_pages(struct inode *inode)
+int f2fs_commit_atomic_write(struct inode *inode)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct f2fs_inode_info *fi = F2FS_I(inode);
int err;
- f2fs_balance_fs(sbi, true);
+ err = filemap_write_and_wait_range(inode->i_mapping, 0, LLONG_MAX);
+ if (err)
+ return err;
f2fs_down_write(&fi->i_gc_rwsem[WRITE]);
-
f2fs_lock_op(sbi);
- set_inode_flag(inode, FI_ATOMIC_COMMIT);
-
- mutex_lock(&fi->inmem_lock);
- err = __f2fs_commit_inmem_pages(inode);
- mutex_unlock(&fi->inmem_lock);
- clear_inode_flag(inode, FI_ATOMIC_COMMIT);
+ err = __f2fs_commit_atomic_write(inode);
f2fs_unlock_op(sbi);
f2fs_up_write(&fi->i_gc_rwsem[WRITE]);
@@ -3291,8 +3167,7 @@ static int __get_segment_type_6(struct f2fs_io_info *fio)
return CURSEG_COLD_DATA;
if (file_is_hot(inode) ||
is_inode_flag_set(inode, FI_HOT_DATA) ||
- f2fs_is_atomic_file(inode) ||
- f2fs_is_volatile_file(inode))
+ f2fs_is_cow_file(inode))
return CURSEG_HOT_DATA;
return f2fs_rw_hint_to_seg_type(inode->i_write_hint);
} else {
@@ -4653,6 +4528,13 @@ static int init_victim_secmap(struct f2fs_sb_info *sbi)
dirty_i->victim_secmap = f2fs_kvzalloc(sbi, bitmap_size, GFP_KERNEL);
if (!dirty_i->victim_secmap)
return -ENOMEM;
+
+ dirty_i->pinned_secmap = f2fs_kvzalloc(sbi, bitmap_size, GFP_KERNEL);
+ if (!dirty_i->pinned_secmap)
+ return -ENOMEM;
+
+ dirty_i->pinned_secmap_cnt = 0;
+ dirty_i->enable_pin_section = true;
return 0;
}
@@ -5241,6 +5123,7 @@ static void destroy_victim_secmap(struct f2fs_sb_info *sbi)
{
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
+ kvfree(dirty_i->pinned_secmap);
kvfree(dirty_i->victim_secmap);
}
@@ -5351,9 +5234,9 @@ int __init f2fs_create_segment_manager_caches(void)
if (!sit_entry_set_slab)
goto destroy_discard_cmd;
- inmem_entry_slab = f2fs_kmem_cache_create("f2fs_inmem_page_entry",
- sizeof(struct inmem_pages));
- if (!inmem_entry_slab)
+ revoke_entry_slab = f2fs_kmem_cache_create("f2fs_revoke_entry",
+ sizeof(struct revoke_entry));
+ if (!revoke_entry_slab)
goto destroy_sit_entry_set;
return 0;
@@ -5372,5 +5255,5 @@ void f2fs_destroy_segment_manager_caches(void)
kmem_cache_destroy(sit_entry_set_slab);
kmem_cache_destroy(discard_cmd_slab);
kmem_cache_destroy(discard_entry_slab);
- kmem_cache_destroy(inmem_entry_slab);
+ kmem_cache_destroy(revoke_entry_slab);
}
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 1fa26a9603cb..3f277dfcb131 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -225,10 +225,10 @@ struct segment_allocation {
#define MAX_SKIP_GC_COUNT 16
-struct inmem_pages {
+struct revoke_entry {
struct list_head list;
- struct page *page;
block_t old_addr; /* for revoking when fail to commit */
+ pgoff_t index;
};
struct sit_info {
@@ -295,6 +295,9 @@ struct dirty_seglist_info {
struct mutex seglist_lock; /* lock for segment bitmaps */
int nr_dirty[NR_DIRTY_TYPE]; /* # of dirty segments */
unsigned long *victim_secmap; /* background GC victims */
+ unsigned long *pinned_secmap; /* pinned victims from foreground GC */
+ unsigned int pinned_secmap_cnt; /* count of victims which has pinned data */
+ bool enable_pin_section; /* enable pinning section */
};
/* victim selection function for cleaning and SSR */
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 1b59f95606c7..e3539c027a6c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1339,9 +1339,6 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
spin_lock_init(&fi->i_size_lock);
INIT_LIST_HEAD(&fi->dirty_list);
INIT_LIST_HEAD(&fi->gdirty_list);
- INIT_LIST_HEAD(&fi->inmem_ilist);
- INIT_LIST_HEAD(&fi->inmem_pages);
- mutex_init(&fi->inmem_lock);
init_f2fs_rwsem(&fi->i_gc_rwsem[READ]);
init_f2fs_rwsem(&fi->i_gc_rwsem[WRITE]);
init_f2fs_rwsem(&fi->i_xattr_sem);
@@ -1382,9 +1379,8 @@ static int f2fs_drop_inode(struct inode *inode)
atomic_inc(&inode->i_count);
spin_unlock(&inode->i_lock);
- /* some remained atomic pages should discarded */
if (f2fs_is_atomic_file(inode))
- f2fs_drop_inmem_pages(inode);
+ f2fs_abort_atomic_write(inode, true);
/* should remain fi->extent_tree for writepage */
f2fs_destroy_extent_node(inode);
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 3d793202cc9f..5ac7e756a1bb 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -128,7 +128,7 @@ static int f2fs_begin_enable_verity(struct file *filp)
if (f2fs_verity_in_progress(inode))
return -EBUSY;
- if (f2fs_is_atomic_file(inode) || f2fs_is_volatile_file(inode))
+ if (f2fs_is_atomic_file(inode))
return -EOPNOTSUPP;
/*
diff --git a/fs/fuse/control.c b/fs/fuse/control.c
index 7cede9a3bc96..247ef4f76761 100644
--- a/fs/fuse/control.c
+++ b/fs/fuse/control.c
@@ -258,7 +258,7 @@ int fuse_ctl_add_conn(struct fuse_conn *fc)
struct dentry *parent;
char name[32];
- if (!fuse_control_sb)
+ if (!fuse_control_sb || fc->no_control)
return 0;
parent = fuse_control_sb->s_root;
@@ -296,7 +296,7 @@ void fuse_ctl_remove_conn(struct fuse_conn *fc)
{
int i;
- if (!fuse_control_sb)
+ if (!fuse_control_sb || fc->no_control)
return;
for (i = fc->ctl_ndents - 1; i >= 0; i--) {
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 9ff27b8a9782..9dfee44e97ad 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -537,6 +537,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
struct fuse_file *ff;
void *security_ctx = NULL;
u32 security_ctxlen;
+ bool trunc = flags & O_TRUNC;
/* Userspace expects S_IFREG in create mode */
BUG_ON((mode & S_IFMT) != S_IFREG);
@@ -561,7 +562,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
inarg.mode = mode;
inarg.umask = current_umask();
- if (fm->fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
+ if (fm->fc->handle_killpriv_v2 && trunc &&
!(flags & O_EXCL) && !capable(CAP_FSETID)) {
inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
}
@@ -623,6 +624,10 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
} else {
file->private_data = ff;
fuse_finish_open(inode, file);
+ if (fm->fc->atomic_o_trunc && trunc)
+ truncate_pagecache(inode, 0);
+ else if (!(ff->open_flags & FOPEN_KEEP_CACHE))
+ invalidate_inode_pages2(inode->i_mapping);
}
return err;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index f18d14d5fea1..9b64e2ff1c96 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -210,13 +210,9 @@ void fuse_finish_open(struct inode *inode, struct file *file)
fi->attr_version = atomic64_inc_return(&fc->attr_version);
i_size_write(inode, 0);
spin_unlock(&fi->lock);
- truncate_pagecache(inode, 0);
file_update_time(file);
fuse_invalidate_attr_mask(inode, FUSE_STATX_MODSIZE);
- } else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) {
- invalidate_inode_pages2(inode->i_mapping);
}
-
if ((file->f_mode & FMODE_WRITE) && fc->writeback_cache)
fuse_link_write_file(file);
}
@@ -239,30 +235,38 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
if (err)
return err;
- if (is_wb_truncate || dax_truncate) {
+ if (is_wb_truncate || dax_truncate)
inode_lock(inode);
- fuse_set_nowrite(inode);
- }
if (dax_truncate) {
filemap_invalidate_lock(inode->i_mapping);
err = fuse_dax_break_layouts(inode, 0, 0);
if (err)
- goto out;
+ goto out_inode_unlock;
}
+ if (is_wb_truncate || dax_truncate)
+ fuse_set_nowrite(inode);
+
err = fuse_do_open(fm, get_node_id(inode), file, isdir);
if (!err)
fuse_finish_open(inode, file);
-out:
+ if (is_wb_truncate || dax_truncate)
+ fuse_release_nowrite(inode);
+ if (!err) {
+ struct fuse_file *ff = file->private_data;
+
+ if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC))
+ truncate_pagecache(inode, 0);
+ else if (!(ff->open_flags & FOPEN_KEEP_CACHE))
+ invalidate_inode_pages2(inode->i_mapping);
+ }
if (dax_truncate)
filemap_invalidate_unlock(inode->i_mapping);
-
- if (is_wb_truncate | dax_truncate) {
- fuse_release_nowrite(inode);
+out_inode_unlock:
+ if (is_wb_truncate || dax_truncate)
inode_unlock(inode);
- }
return err;
}
@@ -338,6 +342,15 @@ static int fuse_open(struct inode *inode, struct file *file)
static int fuse_release(struct inode *inode, struct file *file)
{
+ struct fuse_conn *fc = get_fuse_conn(inode);
+
+ /*
+ * Dirty pages might remain despite write_inode_now() call from
+ * fuse_flush() due to writes racing with the close.
+ */
+ if (fc->writeback_cache)
+ write_inode_now(inode, 1);
+
fuse_release_common(file, false);
/* return value is ignored by VFS */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 8c0665c5dff8..7c290089e693 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -180,6 +180,12 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
inode->i_uid = make_kuid(fc->user_ns, attr->uid);
inode->i_gid = make_kgid(fc->user_ns, attr->gid);
inode->i_blocks = attr->blocks;
+
+ /* Sanitize nsecs */
+ attr->atimensec = min_t(u32, attr->atimensec, NSEC_PER_SEC - 1);
+ attr->mtimensec = min_t(u32, attr->mtimensec, NSEC_PER_SEC - 1);
+ attr->ctimensec = min_t(u32, attr->ctimensec, NSEC_PER_SEC - 1);
+
inode->i_atime.tv_sec = attr->atime;
inode->i_atime.tv_nsec = attr->atimensec;
/* mtime from server may be stale due to local buffered write */
diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
index 33cde4bbccdc..61d8afcb10a3 100644
--- a/fs/fuse/ioctl.c
+++ b/fs/fuse/ioctl.c
@@ -9,6 +9,17 @@
#include <linux/compat.h>
#include <linux/fileattr.h>
+static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args)
+{
+ ssize_t ret = fuse_simple_request(fm, args);
+
+ /* Translate ENOSYS, which shouldn't be returned from fs */
+ if (ret == -ENOSYS)
+ ret = -ENOTTY;
+
+ return ret;
+}
+
/*
* CUSE servers compiled on 32bit broke on 64bit kernels because the
* ABI was defined to be 'struct iovec' which is different on 32bit
@@ -259,7 +270,7 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
ap.args.out_pages = true;
ap.args.out_argvar = true;
- transferred = fuse_simple_request(fm, &ap.args);
+ transferred = fuse_send_ioctl(fm, &ap.args);
err = transferred;
if (transferred < 0)
goto out;
@@ -393,7 +404,7 @@ static int fuse_priv_ioctl(struct inode *inode, struct fuse_file *ff,
args.out_args[1].size = inarg.out_size;
args.out_args[1].value = ptr;
- err = fuse_simple_request(fm, &args);
+ err = fuse_send_ioctl(fm, &args);
if (!err) {
if (outarg.result < 0)
err = outarg.result;
diff --git a/fs/io-wq.c b/fs/io-wq.c
deleted file mode 100644
index 32aeb2c581c5..000000000000
--- a/fs/io-wq.c
+++ /dev/null
@@ -1,1424 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Basic worker thread pool for io_uring
- *
- * Copyright (C) 2019 Jens Axboe
- *
- */
-#include <linux/kernel.h>
-#include <linux/init.h>
-#include <linux/errno.h>
-#include <linux/sched/signal.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <linux/rculist_nulls.h>
-#include <linux/cpu.h>
-#include <linux/task_work.h>
-#include <linux/audit.h>
-#include <uapi/linux/io_uring.h>
-
-#include "io-wq.h"
-
-#define WORKER_IDLE_TIMEOUT (5 * HZ)
-
-enum {
- IO_WORKER_F_UP = 1, /* up and active */
- IO_WORKER_F_RUNNING = 2, /* account as running */
- IO_WORKER_F_FREE = 4, /* worker on free list */
- IO_WORKER_F_BOUND = 8, /* is doing bounded work */
-};
-
-enum {
- IO_WQ_BIT_EXIT = 0, /* wq exiting */
-};
-
-enum {
- IO_ACCT_STALLED_BIT = 0, /* stalled on hash */
-};
-
-/*
- * One for each thread in a wqe pool
- */
-struct io_worker {
- refcount_t ref;
- unsigned flags;
- struct hlist_nulls_node nulls_node;
- struct list_head all_list;
- struct task_struct *task;
- struct io_wqe *wqe;
-
- struct io_wq_work *cur_work;
- struct io_wq_work *next_work;
- raw_spinlock_t lock;
-
- struct completion ref_done;
-
- unsigned long create_state;
- struct callback_head create_work;
- int create_index;
-
- union {
- struct rcu_head rcu;
- struct work_struct work;
- };
-};
-
-#if BITS_PER_LONG == 64
-#define IO_WQ_HASH_ORDER 6
-#else
-#define IO_WQ_HASH_ORDER 5
-#endif
-
-#define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER)
-
-struct io_wqe_acct {
- unsigned nr_workers;
- unsigned max_workers;
- int index;
- atomic_t nr_running;
- raw_spinlock_t lock;
- struct io_wq_work_list work_list;
- unsigned long flags;
-};
-
-enum {
- IO_WQ_ACCT_BOUND,
- IO_WQ_ACCT_UNBOUND,
- IO_WQ_ACCT_NR,
-};
-
-/*
- * Per-node worker thread pool
- */
-struct io_wqe {
- raw_spinlock_t lock;
- struct io_wqe_acct acct[IO_WQ_ACCT_NR];
-
- int node;
-
- struct hlist_nulls_head free_list;
- struct list_head all_list;
-
- struct wait_queue_entry wait;
-
- struct io_wq *wq;
- struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS];
-
- cpumask_var_t cpu_mask;
-};
-
-/*
- * Per io_wq state
- */
-struct io_wq {
- unsigned long state;
-
- free_work_fn *free_work;
- io_wq_work_fn *do_work;
-
- struct io_wq_hash *hash;
-
- atomic_t worker_refs;
- struct completion worker_done;
-
- struct hlist_node cpuhp_node;
-
- struct task_struct *task;
-
- struct io_wqe *wqes[];
-};
-
-static enum cpuhp_state io_wq_online;
-
-struct io_cb_cancel_data {
- work_cancel_fn *fn;
- void *data;
- int nr_running;
- int nr_pending;
- bool cancel_all;
-};
-
-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index);
-static void io_wqe_dec_running(struct io_worker *worker);
-static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
- struct io_wqe_acct *acct,
- struct io_cb_cancel_data *match);
-static void create_worker_cb(struct callback_head *cb);
-static void io_wq_cancel_tw_create(struct io_wq *wq);
-
-static bool io_worker_get(struct io_worker *worker)
-{
- return refcount_inc_not_zero(&worker->ref);
-}
-
-static void io_worker_release(struct io_worker *worker)
-{
- if (refcount_dec_and_test(&worker->ref))
- complete(&worker->ref_done);
-}
-
-static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound)
-{
- return &wqe->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
-}
-
-static inline struct io_wqe_acct *io_work_get_acct(struct io_wqe *wqe,
- struct io_wq_work *work)
-{
- return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND));
-}
-
-static inline struct io_wqe_acct *io_wqe_get_acct(struct io_worker *worker)
-{
- return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND);
-}
-
-static void io_worker_ref_put(struct io_wq *wq)
-{
- if (atomic_dec_and_test(&wq->worker_refs))
- complete(&wq->worker_done);
-}
-
-static void io_worker_cancel_cb(struct io_worker *worker)
-{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
-
- atomic_dec(&acct->nr_running);
- raw_spin_lock(&worker->wqe->lock);
- acct->nr_workers--;
- raw_spin_unlock(&worker->wqe->lock);
- io_worker_ref_put(wq);
- clear_bit_unlock(0, &worker->create_state);
- io_worker_release(worker);
-}
-
-static bool io_task_worker_match(struct callback_head *cb, void *data)
-{
- struct io_worker *worker;
-
- if (cb->func != create_worker_cb)
- return false;
- worker = container_of(cb, struct io_worker, create_work);
- return worker == data;
-}
-
-static void io_worker_exit(struct io_worker *worker)
-{
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
-
- while (1) {
- struct callback_head *cb = task_work_cancel_match(wq->task,
- io_task_worker_match, worker);
-
- if (!cb)
- break;
- io_worker_cancel_cb(worker);
- }
-
- io_worker_release(worker);
- wait_for_completion(&worker->ref_done);
-
- raw_spin_lock(&wqe->lock);
- if (worker->flags & IO_WORKER_F_FREE)
- hlist_nulls_del_rcu(&worker->nulls_node);
- list_del_rcu(&worker->all_list);
- raw_spin_unlock(&wqe->lock);
- io_wqe_dec_running(worker);
- worker->flags = 0;
- preempt_disable();
- current->flags &= ~PF_IO_WORKER;
- preempt_enable();
-
- kfree_rcu(worker, rcu);
- io_worker_ref_put(wqe->wq);
- do_exit(0);
-}
-
-static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
-{
- bool ret = false;
-
- raw_spin_lock(&acct->lock);
- if (!wq_list_empty(&acct->work_list) &&
- !test_bit(IO_ACCT_STALLED_BIT, &acct->flags))
- ret = true;
- raw_spin_unlock(&acct->lock);
-
- return ret;
-}
-
-/*
- * Check head of free list for an available worker. If one isn't available,
- * caller must create one.
- */
-static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
- struct io_wqe_acct *acct)
- __must_hold(RCU)
-{
- struct hlist_nulls_node *n;
- struct io_worker *worker;
-
- /*
- * Iterate free_list and see if we can find an idle worker to
- * activate. If a given worker is on the free_list but in the process
- * of exiting, keep trying.
- */
- hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
- if (!io_worker_get(worker))
- continue;
- if (io_wqe_get_acct(worker) != acct) {
- io_worker_release(worker);
- continue;
- }
- if (wake_up_process(worker->task)) {
- io_worker_release(worker);
- return true;
- }
- io_worker_release(worker);
- }
-
- return false;
-}
-
-/*
- * We need a worker. If we find a free one, we're good. If not, and we're
- * below the max number of workers, create one.
- */
-static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
-{
- /*
- * Most likely an attempt to queue unbounded work on an io_wq that
- * wasn't setup with any unbounded workers.
- */
- if (unlikely(!acct->max_workers))
- pr_warn_once("io-wq is not configured for unbound workers");
-
- raw_spin_lock(&wqe->lock);
- if (acct->nr_workers >= acct->max_workers) {
- raw_spin_unlock(&wqe->lock);
- return true;
- }
- acct->nr_workers++;
- raw_spin_unlock(&wqe->lock);
- atomic_inc(&acct->nr_running);
- atomic_inc(&wqe->wq->worker_refs);
- return create_io_worker(wqe->wq, wqe, acct->index);
-}
-
-static void io_wqe_inc_running(struct io_worker *worker)
-{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
-
- atomic_inc(&acct->nr_running);
-}
-
-static void create_worker_cb(struct callback_head *cb)
-{
- struct io_worker *worker;
- struct io_wq *wq;
- struct io_wqe *wqe;
- struct io_wqe_acct *acct;
- bool do_create = false;
-
- worker = container_of(cb, struct io_worker, create_work);
- wqe = worker->wqe;
- wq = wqe->wq;
- acct = &wqe->acct[worker->create_index];
- raw_spin_lock(&wqe->lock);
- if (acct->nr_workers < acct->max_workers) {
- acct->nr_workers++;
- do_create = true;
- }
- raw_spin_unlock(&wqe->lock);
- if (do_create) {
- create_io_worker(wq, wqe, worker->create_index);
- } else {
- atomic_dec(&acct->nr_running);
- io_worker_ref_put(wq);
- }
- clear_bit_unlock(0, &worker->create_state);
- io_worker_release(worker);
-}
-
-static bool io_queue_worker_create(struct io_worker *worker,
- struct io_wqe_acct *acct,
- task_work_func_t func)
-{
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
-
- /* raced with exit, just ignore create call */
- if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
- goto fail;
- if (!io_worker_get(worker))
- goto fail;
- /*
- * create_state manages ownership of create_work/index. We should
- * only need one entry per worker, as the worker going to sleep
- * will trigger the condition, and waking will clear it once it
- * runs the task_work.
- */
- if (test_bit(0, &worker->create_state) ||
- test_and_set_bit_lock(0, &worker->create_state))
- goto fail_release;
-
- atomic_inc(&wq->worker_refs);
- init_task_work(&worker->create_work, func);
- worker->create_index = acct->index;
- if (!task_work_add(wq->task, &worker->create_work, TWA_SIGNAL)) {
- /*
- * EXIT may have been set after checking it above, check after
- * adding the task_work and remove any creation item if it is
- * now set. wq exit does that too, but we can have added this
- * work item after we canceled in io_wq_exit_workers().
- */
- if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
- io_wq_cancel_tw_create(wq);
- io_worker_ref_put(wq);
- return true;
- }
- io_worker_ref_put(wq);
- clear_bit_unlock(0, &worker->create_state);
-fail_release:
- io_worker_release(worker);
-fail:
- atomic_dec(&acct->nr_running);
- io_worker_ref_put(wq);
- return false;
-}
-
-static void io_wqe_dec_running(struct io_worker *worker)
-{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
-
- if (!(worker->flags & IO_WORKER_F_UP))
- return;
-
- if (!atomic_dec_and_test(&acct->nr_running))
- return;
- if (!io_acct_run_queue(acct))
- return;
-
- atomic_inc(&acct->nr_running);
- atomic_inc(&wqe->wq->worker_refs);
- io_queue_worker_create(worker, acct, create_worker_cb);
-}
-
-/*
- * Worker will start processing some work. Move it to the busy list, if
- * it's currently on the freelist
- */
-static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker)
-{
- if (worker->flags & IO_WORKER_F_FREE) {
- worker->flags &= ~IO_WORKER_F_FREE;
- raw_spin_lock(&wqe->lock);
- hlist_nulls_del_init_rcu(&worker->nulls_node);
- raw_spin_unlock(&wqe->lock);
- }
-}
-
-/*
- * No work, worker going to sleep. Move to freelist, and unuse mm if we
- * have one attached. Dropping the mm may potentially sleep, so we drop
- * the lock in that case and return success. Since the caller has to
- * retry the loop in that case (we changed task state), we don't regrab
- * the lock if we return success.
- */
-static void __io_worker_idle(struct io_wqe *wqe, struct io_worker *worker)
- __must_hold(wqe->lock)
-{
- if (!(worker->flags & IO_WORKER_F_FREE)) {
- worker->flags |= IO_WORKER_F_FREE;
- hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
- }
-}
-
-static inline unsigned int io_get_work_hash(struct io_wq_work *work)
-{
- return work->flags >> IO_WQ_HASH_SHIFT;
-}
-
-static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
-{
- struct io_wq *wq = wqe->wq;
- bool ret = false;
-
- spin_lock_irq(&wq->hash->wait.lock);
- if (list_empty(&wqe->wait.entry)) {
- __add_wait_queue(&wq->hash->wait, &wqe->wait);
- if (!test_bit(hash, &wq->hash->map)) {
- __set_current_state(TASK_RUNNING);
- list_del_init(&wqe->wait.entry);
- ret = true;
- }
- }
- spin_unlock_irq(&wq->hash->wait.lock);
- return ret;
-}
-
-static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
- struct io_worker *worker)
- __must_hold(acct->lock)
-{
- struct io_wq_work_node *node, *prev;
- struct io_wq_work *work, *tail;
- unsigned int stall_hash = -1U;
- struct io_wqe *wqe = worker->wqe;
-
- wq_list_for_each(node, prev, &acct->work_list) {
- unsigned int hash;
-
- work = container_of(node, struct io_wq_work, list);
-
- /* not hashed, can run anytime */
- if (!io_wq_is_hashed(work)) {
- wq_list_del(&acct->work_list, node, prev);
- return work;
- }
-
- hash = io_get_work_hash(work);
- /* all items with this hash lie in [work, tail] */
- tail = wqe->hash_tail[hash];
-
- /* hashed, can run if not already running */
- if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
- wqe->hash_tail[hash] = NULL;
- wq_list_cut(&acct->work_list, &tail->list, prev);
- return work;
- }
- if (stall_hash == -1U)
- stall_hash = hash;
- /* fast forward to a next hash, for-each will fix up @prev */
- node = &tail->list;
- }
-
- if (stall_hash != -1U) {
- bool unstalled;
-
- /*
- * Set this before dropping the lock to avoid racing with new
- * work being added and clearing the stalled bit.
- */
- set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
- raw_spin_unlock(&acct->lock);
- unstalled = io_wait_on_hash(wqe, stall_hash);
- raw_spin_lock(&acct->lock);
- if (unstalled) {
- clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
- if (wq_has_sleeper(&wqe->wq->hash->wait))
- wake_up(&wqe->wq->hash->wait);
- }
- }
-
- return NULL;
-}
-
-static bool io_flush_signals(void)
-{
- if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) {
- __set_current_state(TASK_RUNNING);
- clear_notify_signal();
- if (task_work_pending(current))
- task_work_run();
- return true;
- }
- return false;
-}
-
-static void io_assign_current_work(struct io_worker *worker,
- struct io_wq_work *work)
-{
- if (work) {
- io_flush_signals();
- cond_resched();
- }
-
- raw_spin_lock(&worker->lock);
- worker->cur_work = work;
- worker->next_work = NULL;
- raw_spin_unlock(&worker->lock);
-}
-
-static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
-
-static void io_worker_handle_work(struct io_worker *worker)
-{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
- bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
-
- do {
- struct io_wq_work *work;
-
- /*
- * If we got some work, mark us as busy. If we didn't, but
- * the list isn't empty, it means we stalled on hashed work.
- * Mark us stalled so we don't keep looking for work when we
- * can't make progress, any work completion or insertion will
- * clear the stalled flag.
- */
- raw_spin_lock(&acct->lock);
- work = io_get_next_work(acct, worker);
- raw_spin_unlock(&acct->lock);
- if (work) {
- __io_worker_busy(wqe, worker);
-
- /*
- * Make sure cancelation can find this, even before
- * it becomes the active work. That avoids a window
- * where the work has been removed from our general
- * work list, but isn't yet discoverable as the
- * current work item for this worker.
- */
- raw_spin_lock(&worker->lock);
- worker->next_work = work;
- raw_spin_unlock(&worker->lock);
- } else {
- break;
- }
- io_assign_current_work(worker, work);
- __set_current_state(TASK_RUNNING);
-
- /* handle a whole dependent link */
- do {
- struct io_wq_work *next_hashed, *linked;
- unsigned int hash = io_get_work_hash(work);
-
- next_hashed = wq_next_work(work);
-
- if (unlikely(do_kill) && (work->flags & IO_WQ_WORK_UNBOUND))
- work->flags |= IO_WQ_WORK_CANCEL;
- wq->do_work(work);
- io_assign_current_work(worker, NULL);
-
- linked = wq->free_work(work);
- work = next_hashed;
- if (!work && linked && !io_wq_is_hashed(linked)) {
- work = linked;
- linked = NULL;
- }
- io_assign_current_work(worker, work);
- if (linked)
- io_wqe_enqueue(wqe, linked);
-
- if (hash != -1U && !next_hashed) {
- /* serialize hash clear with wake_up() */
- spin_lock_irq(&wq->hash->wait.lock);
- clear_bit(hash, &wq->hash->map);
- clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
- spin_unlock_irq(&wq->hash->wait.lock);
- if (wq_has_sleeper(&wq->hash->wait))
- wake_up(&wq->hash->wait);
- }
- } while (work);
- } while (1);
-}
-
-static int io_wqe_worker(void *data)
-{
- struct io_worker *worker = data;
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
- bool last_timeout = false;
- char buf[TASK_COMM_LEN];
-
- worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
-
- snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
- set_task_comm(current, buf);
-
- audit_alloc_kernel(current);
-
- while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
- long ret;
-
- set_current_state(TASK_INTERRUPTIBLE);
- while (io_acct_run_queue(acct))
- io_worker_handle_work(worker);
-
- raw_spin_lock(&wqe->lock);
- /* timed out, exit unless we're the last worker */
- if (last_timeout && acct->nr_workers > 1) {
- acct->nr_workers--;
- raw_spin_unlock(&wqe->lock);
- __set_current_state(TASK_RUNNING);
- break;
- }
- last_timeout = false;
- __io_worker_idle(wqe, worker);
- raw_spin_unlock(&wqe->lock);
- if (io_flush_signals())
- continue;
- ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
- if (signal_pending(current)) {
- struct ksignal ksig;
-
- if (!get_signal(&ksig))
- continue;
- break;
- }
- last_timeout = !ret;
- }
-
- if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
- io_worker_handle_work(worker);
-
- audit_free(current);
- io_worker_exit(worker);
- return 0;
-}
-
-/*
- * Called when a worker is scheduled in. Mark us as currently running.
- */
-void io_wq_worker_running(struct task_struct *tsk)
-{
- struct io_worker *worker = tsk->worker_private;
-
- if (!worker)
- return;
- if (!(worker->flags & IO_WORKER_F_UP))
- return;
- if (worker->flags & IO_WORKER_F_RUNNING)
- return;
- worker->flags |= IO_WORKER_F_RUNNING;
- io_wqe_inc_running(worker);
-}
-
-/*
- * Called when worker is going to sleep. If there are no workers currently
- * running and we have work pending, wake up a free one or create a new one.
- */
-void io_wq_worker_sleeping(struct task_struct *tsk)
-{
- struct io_worker *worker = tsk->worker_private;
-
- if (!worker)
- return;
- if (!(worker->flags & IO_WORKER_F_UP))
- return;
- if (!(worker->flags & IO_WORKER_F_RUNNING))
- return;
-
- worker->flags &= ~IO_WORKER_F_RUNNING;
- io_wqe_dec_running(worker);
-}
-
-static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
- struct task_struct *tsk)
-{
- tsk->worker_private = worker;
- worker->task = tsk;
- set_cpus_allowed_ptr(tsk, wqe->cpu_mask);
- tsk->flags |= PF_NO_SETAFFINITY;
-
- raw_spin_lock(&wqe->lock);
- hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
- list_add_tail_rcu(&worker->all_list, &wqe->all_list);
- worker->flags |= IO_WORKER_F_FREE;
- raw_spin_unlock(&wqe->lock);
- wake_up_new_task(tsk);
-}
-
-static bool io_wq_work_match_all(struct io_wq_work *work, void *data)
-{
- return true;
-}
-
-static inline bool io_should_retry_thread(long err)
-{
- /*
- * Prevent perpetual task_work retry, if the task (or its group) is
- * exiting.
- */
- if (fatal_signal_pending(current))
- return false;
-
- switch (err) {
- case -EAGAIN:
- case -ERESTARTSYS:
- case -ERESTARTNOINTR:
- case -ERESTARTNOHAND:
- return true;
- default:
- return false;
- }
-}
-
-static void create_worker_cont(struct callback_head *cb)
-{
- struct io_worker *worker;
- struct task_struct *tsk;
- struct io_wqe *wqe;
-
- worker = container_of(cb, struct io_worker, create_work);
- clear_bit_unlock(0, &worker->create_state);
- wqe = worker->wqe;
- tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
- if (!IS_ERR(tsk)) {
- io_init_new_worker(wqe, worker, tsk);
- io_worker_release(worker);
- return;
- } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
-
- atomic_dec(&acct->nr_running);
- raw_spin_lock(&wqe->lock);
- acct->nr_workers--;
- if (!acct->nr_workers) {
- struct io_cb_cancel_data match = {
- .fn = io_wq_work_match_all,
- .cancel_all = true,
- };
-
- raw_spin_unlock(&wqe->lock);
- while (io_acct_cancel_pending_work(wqe, acct, &match))
- ;
- } else {
- raw_spin_unlock(&wqe->lock);
- }
- io_worker_ref_put(wqe->wq);
- kfree(worker);
- return;
- }
-
- /* re-create attempts grab a new worker ref, drop the existing one */
- io_worker_release(worker);
- schedule_work(&worker->work);
-}
-
-static void io_workqueue_create(struct work_struct *work)
-{
- struct io_worker *worker = container_of(work, struct io_worker, work);
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
-
- if (!io_queue_worker_create(worker, acct, create_worker_cont))
- kfree(worker);
-}
-
-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
-{
- struct io_wqe_acct *acct = &wqe->acct[index];
- struct io_worker *worker;
- struct task_struct *tsk;
-
- __set_current_state(TASK_RUNNING);
-
- worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, wqe->node);
- if (!worker) {
-fail:
- atomic_dec(&acct->nr_running);
- raw_spin_lock(&wqe->lock);
- acct->nr_workers--;
- raw_spin_unlock(&wqe->lock);
- io_worker_ref_put(wq);
- return false;
- }
-
- refcount_set(&worker->ref, 1);
- worker->wqe = wqe;
- raw_spin_lock_init(&worker->lock);
- init_completion(&worker->ref_done);
-
- if (index == IO_WQ_ACCT_BOUND)
- worker->flags |= IO_WORKER_F_BOUND;
-
- tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
- if (!IS_ERR(tsk)) {
- io_init_new_worker(wqe, worker, tsk);
- } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
- kfree(worker);
- goto fail;
- } else {
- INIT_WORK(&worker->work, io_workqueue_create);
- schedule_work(&worker->work);
- }
-
- return true;
-}
-
-/*
- * Iterate the passed in list and call the specific function for each
- * worker that isn't exiting
- */
-static bool io_wq_for_each_worker(struct io_wqe *wqe,
- bool (*func)(struct io_worker *, void *),
- void *data)
-{
- struct io_worker *worker;
- bool ret = false;
-
- list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
- if (io_worker_get(worker)) {
- /* no task if node is/was offline */
- if (worker->task)
- ret = func(worker, data);
- io_worker_release(worker);
- if (ret)
- break;
- }
- }
-
- return ret;
-}
-
-static bool io_wq_worker_wake(struct io_worker *worker, void *data)
-{
- set_notify_signal(worker->task);
- wake_up_process(worker->task);
- return false;
-}
-
-static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
-{
- struct io_wq *wq = wqe->wq;
-
- do {
- work->flags |= IO_WQ_WORK_CANCEL;
- wq->do_work(work);
- work = wq->free_work(work);
- } while (work);
-}
-
-static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
-{
- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
- unsigned int hash;
- struct io_wq_work *tail;
-
- if (!io_wq_is_hashed(work)) {
-append:
- wq_list_add_tail(&work->list, &acct->work_list);
- return;
- }
-
- hash = io_get_work_hash(work);
- tail = wqe->hash_tail[hash];
- wqe->hash_tail[hash] = work;
- if (!tail)
- goto append;
-
- wq_list_add_after(&work->list, &tail->list, &acct->work_list);
-}
-
-static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
-{
- return work == data;
-}
-
-static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
-{
- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
- struct io_cb_cancel_data match;
- unsigned work_flags = work->flags;
- bool do_create;
-
- /*
- * If io-wq is exiting for this task, or if the request has explicitly
- * been marked as one that should not get executed, cancel it here.
- */
- if (test_bit(IO_WQ_BIT_EXIT, &wqe->wq->state) ||
- (work->flags & IO_WQ_WORK_CANCEL)) {
- io_run_cancel(work, wqe);
- return;
- }
-
- raw_spin_lock(&acct->lock);
- io_wqe_insert_work(wqe, work);
- clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
- raw_spin_unlock(&acct->lock);
-
- raw_spin_lock(&wqe->lock);
- rcu_read_lock();
- do_create = !io_wqe_activate_free_worker(wqe, acct);
- rcu_read_unlock();
-
- raw_spin_unlock(&wqe->lock);
-
- if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) ||
- !atomic_read(&acct->nr_running))) {
- bool did_create;
-
- did_create = io_wqe_create_worker(wqe, acct);
- if (likely(did_create))
- return;
-
- raw_spin_lock(&wqe->lock);
- if (acct->nr_workers) {
- raw_spin_unlock(&wqe->lock);
- return;
- }
- raw_spin_unlock(&wqe->lock);
-
- /* fatal condition, failed to create the first worker */
- match.fn = io_wq_work_match_item,
- match.data = work,
- match.cancel_all = false,
-
- io_acct_cancel_pending_work(wqe, acct, &match);
- }
-}
-
-void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
-{
- struct io_wqe *wqe = wq->wqes[numa_node_id()];
-
- io_wqe_enqueue(wqe, work);
-}
-
-/*
- * Work items that hash to the same value will not be done in parallel.
- * Used to limit concurrent writes, generally hashed by inode.
- */
-void io_wq_hash_work(struct io_wq_work *work, void *val)
-{
- unsigned int bit;
-
- bit = hash_ptr(val, IO_WQ_HASH_ORDER);
- work->flags |= (IO_WQ_WORK_HASHED | (bit << IO_WQ_HASH_SHIFT));
-}
-
-static bool __io_wq_worker_cancel(struct io_worker *worker,
- struct io_cb_cancel_data *match,
- struct io_wq_work *work)
-{
- if (work && match->fn(work, match->data)) {
- work->flags |= IO_WQ_WORK_CANCEL;
- set_notify_signal(worker->task);
- return true;
- }
-
- return false;
-}
-
-static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
-{
- struct io_cb_cancel_data *match = data;
-
- /*
- * Hold the lock to avoid ->cur_work going out of scope, caller
- * may dereference the passed in work.
- */
- raw_spin_lock(&worker->lock);
- if (__io_wq_worker_cancel(worker, match, worker->cur_work) ||
- __io_wq_worker_cancel(worker, match, worker->next_work))
- match->nr_running++;
- raw_spin_unlock(&worker->lock);
-
- return match->nr_running && !match->cancel_all;
-}
-
-static inline void io_wqe_remove_pending(struct io_wqe *wqe,
- struct io_wq_work *work,
- struct io_wq_work_node *prev)
-{
- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
- unsigned int hash = io_get_work_hash(work);
- struct io_wq_work *prev_work = NULL;
-
- if (io_wq_is_hashed(work) && work == wqe->hash_tail[hash]) {
- if (prev)
- prev_work = container_of(prev, struct io_wq_work, list);
- if (prev_work && io_get_work_hash(prev_work) == hash)
- wqe->hash_tail[hash] = prev_work;
- else
- wqe->hash_tail[hash] = NULL;
- }
- wq_list_del(&acct->work_list, &work->list, prev);
-}
-
-static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
- struct io_wqe_acct *acct,
- struct io_cb_cancel_data *match)
-{
- struct io_wq_work_node *node, *prev;
- struct io_wq_work *work;
-
- raw_spin_lock(&acct->lock);
- wq_list_for_each(node, prev, &acct->work_list) {
- work = container_of(node, struct io_wq_work, list);
- if (!match->fn(work, match->data))
- continue;
- io_wqe_remove_pending(wqe, work, prev);
- raw_spin_unlock(&acct->lock);
- io_run_cancel(work, wqe);
- match->nr_pending++;
- /* not safe to continue after unlock */
- return true;
- }
- raw_spin_unlock(&acct->lock);
-
- return false;
-}
-
-static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
- struct io_cb_cancel_data *match)
-{
- int i;
-retry:
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
-
- if (io_acct_cancel_pending_work(wqe, acct, match)) {
- if (match->cancel_all)
- goto retry;
- break;
- }
- }
-}
-
-static void io_wqe_cancel_running_work(struct io_wqe *wqe,
- struct io_cb_cancel_data *match)
-{
- rcu_read_lock();
- io_wq_for_each_worker(wqe, io_wq_worker_cancel, match);
- rcu_read_unlock();
-}
-
-enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
- void *data, bool cancel_all)
-{
- struct io_cb_cancel_data match = {
- .fn = cancel,
- .data = data,
- .cancel_all = cancel_all,
- };
- int node;
-
- /*
- * First check pending list, if we're lucky we can just remove it
- * from there. CANCEL_OK means that the work is returned as-new,
- * no completion will be posted for it.
- *
- * Then check if a free (going busy) or busy worker has the work
- * currently running. If we find it there, we'll return CANCEL_RUNNING
- * as an indication that we attempt to signal cancellation. The
- * completion will run normally in this case.
- *
- * Do both of these while holding the wqe->lock, to ensure that
- * we'll find a work item regardless of state.
- */
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
-
- io_wqe_cancel_pending_work(wqe, &match);
- if (match.nr_pending && !match.cancel_all)
- return IO_WQ_CANCEL_OK;
-
- raw_spin_lock(&wqe->lock);
- io_wqe_cancel_running_work(wqe, &match);
- raw_spin_unlock(&wqe->lock);
- if (match.nr_running && !match.cancel_all)
- return IO_WQ_CANCEL_RUNNING;
- }
-
- if (match.nr_running)
- return IO_WQ_CANCEL_RUNNING;
- if (match.nr_pending)
- return IO_WQ_CANCEL_OK;
- return IO_WQ_CANCEL_NOTFOUND;
-}
-
-static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
- int sync, void *key)
-{
- struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
- int i;
-
- list_del_init(&wait->entry);
-
- rcu_read_lock();
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- struct io_wqe_acct *acct = &wqe->acct[i];
-
- if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
- io_wqe_activate_free_worker(wqe, acct);
- }
- rcu_read_unlock();
- return 1;
-}
-
-struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
-{
- int ret, node, i;
- struct io_wq *wq;
-
- if (WARN_ON_ONCE(!data->free_work || !data->do_work))
- return ERR_PTR(-EINVAL);
- if (WARN_ON_ONCE(!bounded))
- return ERR_PTR(-EINVAL);
-
- wq = kzalloc(struct_size(wq, wqes, nr_node_ids), GFP_KERNEL);
- if (!wq)
- return ERR_PTR(-ENOMEM);
- ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node);
- if (ret)
- goto err_wq;
-
- refcount_inc(&data->hash->refs);
- wq->hash = data->hash;
- wq->free_work = data->free_work;
- wq->do_work = data->do_work;
-
- ret = -ENOMEM;
- for_each_node(node) {
- struct io_wqe *wqe;
- int alloc_node = node;
-
- if (!node_online(alloc_node))
- alloc_node = NUMA_NO_NODE;
- wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
- if (!wqe)
- goto err;
- if (!alloc_cpumask_var(&wqe->cpu_mask, GFP_KERNEL))
- goto err;
- cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
- wq->wqes[node] = wqe;
- wqe->node = alloc_node;
- wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
- wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
- task_rlimit(current, RLIMIT_NPROC);
- INIT_LIST_HEAD(&wqe->wait.entry);
- wqe->wait.func = io_wqe_hash_wake;
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- struct io_wqe_acct *acct = &wqe->acct[i];
-
- acct->index = i;
- atomic_set(&acct->nr_running, 0);
- INIT_WQ_LIST(&acct->work_list);
- raw_spin_lock_init(&acct->lock);
- }
- wqe->wq = wq;
- raw_spin_lock_init(&wqe->lock);
- INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
- INIT_LIST_HEAD(&wqe->all_list);
- }
-
- wq->task = get_task_struct(data->task);
- atomic_set(&wq->worker_refs, 1);
- init_completion(&wq->worker_done);
- return wq;
-err:
- io_wq_put_hash(data->hash);
- cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
- for_each_node(node) {
- if (!wq->wqes[node])
- continue;
- free_cpumask_var(wq->wqes[node]->cpu_mask);
- kfree(wq->wqes[node]);
- }
-err_wq:
- kfree(wq);
- return ERR_PTR(ret);
-}
-
-static bool io_task_work_match(struct callback_head *cb, void *data)
-{
- struct io_worker *worker;
-
- if (cb->func != create_worker_cb && cb->func != create_worker_cont)
- return false;
- worker = container_of(cb, struct io_worker, create_work);
- return worker->wqe->wq == data;
-}
-
-void io_wq_exit_start(struct io_wq *wq)
-{
- set_bit(IO_WQ_BIT_EXIT, &wq->state);
-}
-
-static void io_wq_cancel_tw_create(struct io_wq *wq)
-{
- struct callback_head *cb;
-
- while ((cb = task_work_cancel_match(wq->task, io_task_work_match, wq)) != NULL) {
- struct io_worker *worker;
-
- worker = container_of(cb, struct io_worker, create_work);
- io_worker_cancel_cb(worker);
- }
-}
-
-static void io_wq_exit_workers(struct io_wq *wq)
-{
- int node;
-
- if (!wq->task)
- return;
-
- io_wq_cancel_tw_create(wq);
-
- rcu_read_lock();
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
-
- io_wq_for_each_worker(wqe, io_wq_worker_wake, NULL);
- }
- rcu_read_unlock();
- io_worker_ref_put(wq);
- wait_for_completion(&wq->worker_done);
-
- for_each_node(node) {
- spin_lock_irq(&wq->hash->wait.lock);
- list_del_init(&wq->wqes[node]->wait.entry);
- spin_unlock_irq(&wq->hash->wait.lock);
- }
- put_task_struct(wq->task);
- wq->task = NULL;
-}
-
-static void io_wq_destroy(struct io_wq *wq)
-{
- int node;
-
- cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
-
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
- struct io_cb_cancel_data match = {
- .fn = io_wq_work_match_all,
- .cancel_all = true,
- };
- io_wqe_cancel_pending_work(wqe, &match);
- free_cpumask_var(wqe->cpu_mask);
- kfree(wqe);
- }
- io_wq_put_hash(wq->hash);
- kfree(wq);
-}
-
-void io_wq_put_and_exit(struct io_wq *wq)
-{
- WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state));
-
- io_wq_exit_workers(wq);
- io_wq_destroy(wq);
-}
-
-struct online_data {
- unsigned int cpu;
- bool online;
-};
-
-static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
-{
- struct online_data *od = data;
-
- if (od->online)
- cpumask_set_cpu(od->cpu, worker->wqe->cpu_mask);
- else
- cpumask_clear_cpu(od->cpu, worker->wqe->cpu_mask);
- return false;
-}
-
-static int __io_wq_cpu_online(struct io_wq *wq, unsigned int cpu, bool online)
-{
- struct online_data od = {
- .cpu = cpu,
- .online = online
- };
- int i;
-
- rcu_read_lock();
- for_each_node(i)
- io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, &od);
- rcu_read_unlock();
- return 0;
-}
-
-static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node)
-{
- struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
-
- return __io_wq_cpu_online(wq, cpu, true);
-}
-
-static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
-{
- struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
-
- return __io_wq_cpu_online(wq, cpu, false);
-}
-
-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
-{
- int i;
-
- rcu_read_lock();
- for_each_node(i) {
- struct io_wqe *wqe = wq->wqes[i];
-
- if (mask)
- cpumask_copy(wqe->cpu_mask, mask);
- else
- cpumask_copy(wqe->cpu_mask, cpumask_of_node(i));
- }
- rcu_read_unlock();
- return 0;
-}
-
-/*
- * Set max number of unbounded workers, returns old value. If new_count is 0,
- * then just return the old value.
- */
-int io_wq_max_workers(struct io_wq *wq, int *new_count)
-{
- int prev[IO_WQ_ACCT_NR];
- bool first_node = true;
- int i, node;
-
- BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND != (int) IO_WQ_BOUND);
- BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND);
- BUILD_BUG_ON((int) IO_WQ_ACCT_NR != 2);
-
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- if (new_count[i] > task_rlimit(current, RLIMIT_NPROC))
- new_count[i] = task_rlimit(current, RLIMIT_NPROC);
- }
-
- for (i = 0; i < IO_WQ_ACCT_NR; i++)
- prev[i] = 0;
-
- rcu_read_lock();
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
- struct io_wqe_acct *acct;
-
- raw_spin_lock(&wqe->lock);
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- acct = &wqe->acct[i];
- if (first_node)
- prev[i] = max_t(int, acct->max_workers, prev[i]);
- if (new_count[i])
- acct->max_workers = new_count[i];
- }
- raw_spin_unlock(&wqe->lock);
- first_node = false;
- }
- rcu_read_unlock();
-
- for (i = 0; i < IO_WQ_ACCT_NR; i++)
- new_count[i] = prev[i];
-
- return 0;
-}
-
-static __init int io_wq_init(void)
-{
- int ret;
-
- ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "io-wq/online",
- io_wq_cpu_online, io_wq_cpu_offline);
- if (ret < 0)
- return ret;
- io_wq_online = ret;
- return 0;
-}
-subsys_initcall(io_wq_init);
diff --git a/fs/io-wq.h b/fs/io-wq.h
deleted file mode 100644
index dbecd27656c7..000000000000
--- a/fs/io-wq.h
+++ /dev/null
@@ -1,227 +0,0 @@
-#ifndef INTERNAL_IO_WQ_H
-#define INTERNAL_IO_WQ_H
-
-#include <linux/refcount.h>
-
-struct io_wq;
-
-enum {
- IO_WQ_WORK_CANCEL = 1,
- IO_WQ_WORK_HASHED = 2,
- IO_WQ_WORK_UNBOUND = 4,
- IO_WQ_WORK_CONCURRENT = 16,
-
- IO_WQ_HASH_SHIFT = 24, /* upper 8 bits are used for hash key */
-};
-
-enum io_wq_cancel {
- IO_WQ_CANCEL_OK, /* cancelled before started */
- IO_WQ_CANCEL_RUNNING, /* found, running, and attempted cancelled */
- IO_WQ_CANCEL_NOTFOUND, /* work not found */
-};
-
-struct io_wq_work_node {
- struct io_wq_work_node *next;
-};
-
-struct io_wq_work_list {
- struct io_wq_work_node *first;
- struct io_wq_work_node *last;
-};
-
-#define wq_list_for_each(pos, prv, head) \
- for (pos = (head)->first, prv = NULL; pos; prv = pos, pos = (pos)->next)
-
-#define wq_list_for_each_resume(pos, prv) \
- for (; pos; prv = pos, pos = (pos)->next)
-
-#define wq_list_empty(list) (READ_ONCE((list)->first) == NULL)
-#define INIT_WQ_LIST(list) do { \
- (list)->first = NULL; \
-} while (0)
-
-static inline void wq_list_add_after(struct io_wq_work_node *node,
- struct io_wq_work_node *pos,
- struct io_wq_work_list *list)
-{
- struct io_wq_work_node *next = pos->next;
-
- pos->next = node;
- node->next = next;
- if (!next)
- list->last = node;
-}
-
-/**
- * wq_list_merge - merge the second list to the first one.
- * @list0: the first list
- * @list1: the second list
- * Return the first node after mergence.
- */
-static inline struct io_wq_work_node *wq_list_merge(struct io_wq_work_list *list0,
- struct io_wq_work_list *list1)
-{
- struct io_wq_work_node *ret;
-
- if (!list0->first) {
- ret = list1->first;
- } else {
- ret = list0->first;
- list0->last->next = list1->first;
- }
- INIT_WQ_LIST(list0);
- INIT_WQ_LIST(list1);
- return ret;
-}
-
-static inline void wq_list_add_tail(struct io_wq_work_node *node,
- struct io_wq_work_list *list)
-{
- node->next = NULL;
- if (!list->first) {
- list->last = node;
- WRITE_ONCE(list->first, node);
- } else {
- list->last->next = node;
- list->last = node;
- }
-}
-
-static inline void wq_list_add_head(struct io_wq_work_node *node,
- struct io_wq_work_list *list)
-{
- node->next = list->first;
- if (!node->next)
- list->last = node;
- WRITE_ONCE(list->first, node);
-}
-
-static inline void wq_list_cut(struct io_wq_work_list *list,
- struct io_wq_work_node *last,
- struct io_wq_work_node *prev)
-{
- /* first in the list, if prev==NULL */
- if (!prev)
- WRITE_ONCE(list->first, last->next);
- else
- prev->next = last->next;
-
- if (last == list->last)
- list->last = prev;
- last->next = NULL;
-}
-
-static inline void __wq_list_splice(struct io_wq_work_list *list,
- struct io_wq_work_node *to)
-{
- list->last->next = to->next;
- to->next = list->first;
- INIT_WQ_LIST(list);
-}
-
-static inline bool wq_list_splice(struct io_wq_work_list *list,
- struct io_wq_work_node *to)
-{
- if (!wq_list_empty(list)) {
- __wq_list_splice(list, to);
- return true;
- }
- return false;
-}
-
-static inline void wq_stack_add_head(struct io_wq_work_node *node,
- struct io_wq_work_node *stack)
-{
- node->next = stack->next;
- stack->next = node;
-}
-
-static inline void wq_list_del(struct io_wq_work_list *list,
- struct io_wq_work_node *node,
- struct io_wq_work_node *prev)
-{
- wq_list_cut(list, node, prev);
-}
-
-static inline
-struct io_wq_work_node *wq_stack_extract(struct io_wq_work_node *stack)
-{
- struct io_wq_work_node *node = stack->next;
-
- stack->next = node->next;
- return node;
-}
-
-struct io_wq_work {
- struct io_wq_work_node list;
- unsigned flags;
-};
-
-static inline struct io_wq_work *wq_next_work(struct io_wq_work *work)
-{
- if (!work->list.next)
- return NULL;
-
- return container_of(work->list.next, struct io_wq_work, list);
-}
-
-typedef struct io_wq_work *(free_work_fn)(struct io_wq_work *);
-typedef void (io_wq_work_fn)(struct io_wq_work *);
-
-struct io_wq_hash {
- refcount_t refs;
- unsigned long map;
- struct wait_queue_head wait;
-};
-
-static inline void io_wq_put_hash(struct io_wq_hash *hash)
-{
- if (refcount_dec_and_test(&hash->refs))
- kfree(hash);
-}
-
-struct io_wq_data {
- struct io_wq_hash *hash;
- struct task_struct *task;
- io_wq_work_fn *do_work;
- free_work_fn *free_work;
-};
-
-struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data);
-void io_wq_exit_start(struct io_wq *wq);
-void io_wq_put_and_exit(struct io_wq *wq);
-
-void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
-void io_wq_hash_work(struct io_wq_work *work, void *val);
-
-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
-int io_wq_max_workers(struct io_wq *wq, int *new_count);
-
-static inline bool io_wq_is_hashed(struct io_wq_work *work)
-{
- return work->flags & IO_WQ_WORK_HASHED;
-}
-
-typedef bool (work_cancel_fn)(struct io_wq_work *, void *);
-
-enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
- void *data, bool cancel_all);
-
-#if defined(CONFIG_IO_WQ)
-extern void io_wq_worker_sleeping(struct task_struct *);
-extern void io_wq_worker_running(struct task_struct *);
-#else
-static inline void io_wq_worker_sleeping(struct task_struct *tsk)
-{
-}
-static inline void io_wq_worker_running(struct task_struct *tsk)
-{
-}
-#endif
-
-static inline bool io_wq_current_is_worker(void)
-{
- return in_task() && (current->flags & PF_IO_WORKER) &&
- current->worker_private;
-}
-#endif
diff --git a/fs/io_uring.c b/fs/io_uring.c
deleted file mode 100644
index 3d97372e811e..000000000000
--- a/fs/io_uring.c
+++ /dev/null
@@ -1,11915 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Shared application/kernel submission and completion ring pairs, for
- * supporting fast/efficient IO.
- *
- * A note on the read/write ordering memory barriers that are matched between
- * the application and kernel side.
- *
- * After the application reads the CQ ring tail, it must use an
- * appropriate smp_rmb() to pair with the smp_wmb() the kernel uses
- * before writing the tail (using smp_load_acquire to read the tail will
- * do). It also needs a smp_mb() before updating CQ head (ordering the
- * entry load(s) with the head store), pairing with an implicit barrier
- * through a control-dependency in io_get_cqe (smp_store_release to
- * store head will do). Failure to do so could lead to reading invalid
- * CQ entries.
- *
- * Likewise, the application must use an appropriate smp_wmb() before
- * writing the SQ tail (ordering SQ entry stores with the tail store),
- * which pairs with smp_load_acquire in io_get_sqring (smp_store_release
- * to store the tail will do). And it needs a barrier ordering the SQ
- * head load before writing new SQ entries (smp_load_acquire to read
- * head will do).
- *
- * When using the SQ poll thread (IORING_SETUP_SQPOLL), the application
- * needs to check the SQ flags for IORING_SQ_NEED_WAKEUP *after*
- * updating the SQ tail; a full memory barrier smp_mb() is needed
- * between.
- *
- * Also see the examples in the liburing library:
- *
- * git://git.kernel.dk/liburing
- *
- * io_uring also uses READ/WRITE_ONCE() for _any_ store or load that happens
- * from data shared between the kernel and application. This is done both
- * for ordering purposes, but also to ensure that once a value is loaded from
- * data that the application could potentially modify, it remains stable.
- *
- * Copyright (C) 2018-2019 Jens Axboe
- * Copyright (c) 2018-2019 Christoph Hellwig
- */
-#include <linux/kernel.h>
-#include <linux/init.h>
-#include <linux/errno.h>
-#include <linux/syscalls.h>
-#include <linux/compat.h>
-#include <net/compat.h>
-#include <linux/refcount.h>
-#include <linux/uio.h>
-#include <linux/bits.h>
-
-#include <linux/sched/signal.h>
-#include <linux/fs.h>
-#include <linux/file.h>
-#include <linux/fdtable.h>
-#include <linux/mm.h>
-#include <linux/mman.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-#include <linux/blk-mq.h>
-#include <linux/bvec.h>
-#include <linux/net.h>
-#include <net/sock.h>
-#include <net/af_unix.h>
-#include <net/scm.h>
-#include <linux/anon_inodes.h>
-#include <linux/sched/mm.h>
-#include <linux/uaccess.h>
-#include <linux/nospec.h>
-#include <linux/sizes.h>
-#include <linux/hugetlb.h>
-#include <linux/highmem.h>
-#include <linux/namei.h>
-#include <linux/fsnotify.h>
-#include <linux/fadvise.h>
-#include <linux/eventpoll.h>
-#include <linux/splice.h>
-#include <linux/task_work.h>
-#include <linux/pagemap.h>
-#include <linux/io_uring.h>
-#include <linux/audit.h>
-#include <linux/security.h>
-
-#define CREATE_TRACE_POINTS
-#include <trace/events/io_uring.h>
-
-#include <uapi/linux/io_uring.h>
-
-#include "internal.h"
-#include "io-wq.h"
-
-#define IORING_MAX_ENTRIES 32768
-#define IORING_MAX_CQ_ENTRIES (2 * IORING_MAX_ENTRIES)
-#define IORING_SQPOLL_CAP_ENTRIES_VALUE 8
-
-/* only define max */
-#define IORING_MAX_FIXED_FILES (1U << 15)
-#define IORING_MAX_RESTRICTIONS (IORING_RESTRICTION_LAST + \
- IORING_REGISTER_LAST + IORING_OP_LAST)
-
-#define IO_RSRC_TAG_TABLE_SHIFT (PAGE_SHIFT - 3)
-#define IO_RSRC_TAG_TABLE_MAX (1U << IO_RSRC_TAG_TABLE_SHIFT)
-#define IO_RSRC_TAG_TABLE_MASK (IO_RSRC_TAG_TABLE_MAX - 1)
-
-#define IORING_MAX_REG_BUFFERS (1U << 14)
-
-#define SQE_COMMON_FLAGS (IOSQE_FIXED_FILE | IOSQE_IO_LINK | \
- IOSQE_IO_HARDLINK | IOSQE_ASYNC)
-
-#define SQE_VALID_FLAGS (SQE_COMMON_FLAGS | IOSQE_BUFFER_SELECT | \
- IOSQE_IO_DRAIN | IOSQE_CQE_SKIP_SUCCESS)
-
-#define IO_REQ_CLEAN_FLAGS (REQ_F_BUFFER_SELECTED | REQ_F_NEED_CLEANUP | \
- REQ_F_POLLED | REQ_F_INFLIGHT | REQ_F_CREDS | \
- REQ_F_ASYNC_DATA)
-
-#define IO_TCTX_REFS_CACHE_NR (1U << 10)
-
-struct io_uring {
- u32 head ____cacheline_aligned_in_smp;
- u32 tail ____cacheline_aligned_in_smp;
-};
-
-/*
- * This data is shared with the application through the mmap at offsets
- * IORING_OFF_SQ_RING and IORING_OFF_CQ_RING.
- *
- * The offsets to the member fields are published through struct
- * io_sqring_offsets when calling io_uring_setup.
- */
-struct io_rings {
- /*
- * Head and tail offsets into the ring; the offsets need to be
- * masked to get valid indices.
- *
- * The kernel controls head of the sq ring and the tail of the cq ring,
- * and the application controls tail of the sq ring and the head of the
- * cq ring.
- */
- struct io_uring sq, cq;
- /*
- * Bitmasks to apply to head and tail offsets (constant, equals
- * ring_entries - 1)
- */
- u32 sq_ring_mask, cq_ring_mask;
- /* Ring sizes (constant, power of 2) */
- u32 sq_ring_entries, cq_ring_entries;
- /*
- * Number of invalid entries dropped by the kernel due to
- * invalid index stored in array
- *
- * Written by the kernel, shouldn't be modified by the
- * application (i.e. get number of "new events" by comparing to
- * cached value).
- *
- * After a new SQ head value was read by the application this
- * counter includes all submissions that were dropped reaching
- * the new SQ head (and possibly more).
- */
- u32 sq_dropped;
- /*
- * Runtime SQ flags
- *
- * Written by the kernel, shouldn't be modified by the
- * application.
- *
- * The application needs a full memory barrier before checking
- * for IORING_SQ_NEED_WAKEUP after updating the sq tail.
- */
- u32 sq_flags;
- /*
- * Runtime CQ flags
- *
- * Written by the application, shouldn't be modified by the
- * kernel.
- */
- u32 cq_flags;
- /*
- * Number of completion events lost because the queue was full;
- * this should be avoided by the application by making sure
- * there are not more requests pending than there is space in
- * the completion queue.
- *
- * Written by the kernel, shouldn't be modified by the
- * application (i.e. get number of "new events" by comparing to
- * cached value).
- *
- * As completion events come in out of order this counter is not
- * ordered with any other data.
- */
- u32 cq_overflow;
- /*
- * Ring buffer of completion events.
- *
- * The kernel writes completion events fresh every time they are
- * produced, so the application is allowed to modify pending
- * entries.
- */
- struct io_uring_cqe cqes[] ____cacheline_aligned_in_smp;
-};
-
-enum io_uring_cmd_flags {
- IO_URING_F_COMPLETE_DEFER = 1,
- IO_URING_F_UNLOCKED = 2,
- /* int's last bit, sign checks are usually faster than a bit test */
- IO_URING_F_NONBLOCK = INT_MIN,
-};
-
-struct io_mapped_ubuf {
- u64 ubuf;
- u64 ubuf_end;
- unsigned int nr_bvecs;
- unsigned long acct_pages;
- struct bio_vec bvec[];
-};
-
-struct io_ring_ctx;
-
-struct io_overflow_cqe {
- struct io_uring_cqe cqe;
- struct list_head list;
-};
-
-struct io_fixed_file {
- /* file * with additional FFS_* flags */
- unsigned long file_ptr;
-};
-
-struct io_rsrc_put {
- struct list_head list;
- u64 tag;
- union {
- void *rsrc;
- struct file *file;
- struct io_mapped_ubuf *buf;
- };
-};
-
-struct io_file_table {
- struct io_fixed_file *files;
-};
-
-struct io_rsrc_node {
- struct percpu_ref refs;
- struct list_head node;
- struct list_head rsrc_list;
- struct io_rsrc_data *rsrc_data;
- struct llist_node llist;
- bool done;
-};
-
-typedef void (rsrc_put_fn)(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc);
-
-struct io_rsrc_data {
- struct io_ring_ctx *ctx;
-
- u64 **tags;
- unsigned int nr;
- rsrc_put_fn *do_put;
- atomic_t refs;
- struct completion done;
- bool quiesce;
-};
-
-struct io_buffer_list {
- struct list_head list;
- struct list_head buf_list;
- __u16 bgid;
-};
-
-struct io_buffer {
- struct list_head list;
- __u64 addr;
- __u32 len;
- __u16 bid;
- __u16 bgid;
-};
-
-struct io_restriction {
- DECLARE_BITMAP(register_op, IORING_REGISTER_LAST);
- DECLARE_BITMAP(sqe_op, IORING_OP_LAST);
- u8 sqe_flags_allowed;
- u8 sqe_flags_required;
- bool registered;
-};
-
-enum {
- IO_SQ_THREAD_SHOULD_STOP = 0,
- IO_SQ_THREAD_SHOULD_PARK,
-};
-
-struct io_sq_data {
- refcount_t refs;
- atomic_t park_pending;
- struct mutex lock;
-
- /* ctx's that are using this sqd */
- struct list_head ctx_list;
-
- struct task_struct *thread;
- struct wait_queue_head wait;
-
- unsigned sq_thread_idle;
- int sq_cpu;
- pid_t task_pid;
- pid_t task_tgid;
-
- unsigned long state;
- struct completion exited;
-};
-
-#define IO_COMPL_BATCH 32
-#define IO_REQ_CACHE_SIZE 32
-#define IO_REQ_ALLOC_BATCH 8
-
-struct io_submit_link {
- struct io_kiocb *head;
- struct io_kiocb *last;
-};
-
-struct io_submit_state {
- /* inline/task_work completion list, under ->uring_lock */
- struct io_wq_work_node free_list;
- /* batch completion logic */
- struct io_wq_work_list compl_reqs;
- struct io_submit_link link;
-
- bool plug_started;
- bool need_plug;
- bool flush_cqes;
- unsigned short submit_nr;
- struct blk_plug plug;
-};
-
-struct io_ev_fd {
- struct eventfd_ctx *cq_ev_fd;
- unsigned int eventfd_async: 1;
- struct rcu_head rcu;
-};
-
-#define IO_BUFFERS_HASH_BITS 5
-
-struct io_ring_ctx {
- /* const or read-mostly hot data */
- struct {
- struct percpu_ref refs;
-
- struct io_rings *rings;
- unsigned int flags;
- unsigned int compat: 1;
- unsigned int drain_next: 1;
- unsigned int restricted: 1;
- unsigned int off_timeout_used: 1;
- unsigned int drain_active: 1;
- unsigned int drain_disabled: 1;
- unsigned int has_evfd: 1;
- } ____cacheline_aligned_in_smp;
-
- /* submission data */
- struct {
- struct mutex uring_lock;
-
- /*
- * Ring buffer of indices into array of io_uring_sqe, which is
- * mmapped by the application using the IORING_OFF_SQES offset.
- *
- * This indirection could e.g. be used to assign fixed
- * io_uring_sqe entries to operations and only submit them to
- * the queue when needed.
- *
- * The kernel modifies neither the indices array nor the entries
- * array.
- */
- u32 *sq_array;
- struct io_uring_sqe *sq_sqes;
- unsigned cached_sq_head;
- unsigned sq_entries;
- struct list_head defer_list;
-
- /*
- * Fixed resources fast path, should be accessed only under
- * uring_lock, and updated through io_uring_register(2)
- */
- struct io_rsrc_node *rsrc_node;
- int rsrc_cached_refs;
- struct io_file_table file_table;
- unsigned nr_user_files;
- unsigned nr_user_bufs;
- struct io_mapped_ubuf **user_bufs;
-
- struct io_submit_state submit_state;
- struct list_head timeout_list;
- struct list_head ltimeout_list;
- struct list_head cq_overflow_list;
- struct list_head *io_buffers;
- struct list_head io_buffers_cache;
- struct list_head apoll_cache;
- struct xarray personalities;
- u32 pers_next;
- unsigned sq_thread_idle;
- } ____cacheline_aligned_in_smp;
-
- /* IRQ completion list, under ->completion_lock */
- struct io_wq_work_list locked_free_list;
- unsigned int locked_free_nr;
-
- const struct cred *sq_creds; /* cred used for __io_sq_thread() */
- struct io_sq_data *sq_data; /* if using sq thread polling */
-
- struct wait_queue_head sqo_sq_wait;
- struct list_head sqd_list;
-
- unsigned long check_cq_overflow;
-
- struct {
- unsigned cached_cq_tail;
- unsigned cq_entries;
- struct io_ev_fd __rcu *io_ev_fd;
- struct wait_queue_head cq_wait;
- unsigned cq_extra;
- atomic_t cq_timeouts;
- unsigned cq_last_tm_flush;
- } ____cacheline_aligned_in_smp;
-
- struct {
- spinlock_t completion_lock;
-
- spinlock_t timeout_lock;
-
- /*
- * ->iopoll_list is protected by the ctx->uring_lock for
- * io_uring instances that don't use IORING_SETUP_SQPOLL.
- * For SQPOLL, only the single threaded io_sq_thread() will
- * manipulate the list, hence no extra locking is needed there.
- */
- struct io_wq_work_list iopoll_list;
- struct hlist_head *cancel_hash;
- unsigned cancel_hash_bits;
- bool poll_multi_queue;
-
- struct list_head io_buffers_comp;
- } ____cacheline_aligned_in_smp;
-
- struct io_restriction restrictions;
-
- /* slow path rsrc auxilary data, used by update/register */
- struct {
- struct io_rsrc_node *rsrc_backup_node;
- struct io_mapped_ubuf *dummy_ubuf;
- struct io_rsrc_data *file_data;
- struct io_rsrc_data *buf_data;
-
- struct delayed_work rsrc_put_work;
- struct llist_head rsrc_put_llist;
- struct list_head rsrc_ref_list;
- spinlock_t rsrc_ref_lock;
-
- struct list_head io_buffers_pages;
- };
-
- /* Keep this last, we don't need it for the fast path */
- struct {
- #if defined(CONFIG_UNIX)
- struct socket *ring_sock;
- #endif
- /* hashed buffered write serialization */
- struct io_wq_hash *hash_map;
-
- /* Only used for accounting purposes */
- struct user_struct *user;
- struct mm_struct *mm_account;
-
- /* ctx exit and cancelation */
- struct llist_head fallback_llist;
- struct delayed_work fallback_work;
- struct work_struct exit_work;
- struct list_head tctx_list;
- struct completion ref_comp;
- u32 iowq_limits[2];
- bool iowq_limits_set;
- };
-};
-
-/*
- * Arbitrary limit, can be raised if need be
- */
-#define IO_RINGFD_REG_MAX 16
-
-struct io_uring_task {
- /* submission side */
- int cached_refs;
- struct xarray xa;
- struct wait_queue_head wait;
- const struct io_ring_ctx *last;
- struct io_wq *io_wq;
- struct percpu_counter inflight;
- atomic_t inflight_tracked;
- atomic_t in_idle;
-
- spinlock_t task_lock;
- struct io_wq_work_list task_list;
- struct io_wq_work_list prior_task_list;
- struct callback_head task_work;
- struct file **registered_rings;
- bool task_running;
-};
-
-/*
- * First field must be the file pointer in all the
- * iocb unions! See also 'struct kiocb' in <linux/fs.h>
- */
-struct io_poll_iocb {
- struct file *file;
- struct wait_queue_head *head;
- __poll_t events;
- struct wait_queue_entry wait;
-};
-
-struct io_poll_update {
- struct file *file;
- u64 old_user_data;
- u64 new_user_data;
- __poll_t events;
- bool update_events;
- bool update_user_data;
-};
-
-struct io_close {
- struct file *file;
- int fd;
- u32 file_slot;
-};
-
-struct io_timeout_data {
- struct io_kiocb *req;
- struct hrtimer timer;
- struct timespec64 ts;
- enum hrtimer_mode mode;
- u32 flags;
-};
-
-struct io_accept {
- struct file *file;
- struct sockaddr __user *addr;
- int __user *addr_len;
- int flags;
- u32 file_slot;
- unsigned long nofile;
-};
-
-struct io_sync {
- struct file *file;
- loff_t len;
- loff_t off;
- int flags;
- int mode;
-};
-
-struct io_cancel {
- struct file *file;
- u64 addr;
-};
-
-struct io_timeout {
- struct file *file;
- u32 off;
- u32 target_seq;
- struct list_head list;
- /* head of the link, used by linked timeouts only */
- struct io_kiocb *head;
- /* for linked completions */
- struct io_kiocb *prev;
-};
-
-struct io_timeout_rem {
- struct file *file;
- u64 addr;
-
- /* timeout update */
- struct timespec64 ts;
- u32 flags;
- bool ltimeout;
-};
-
-struct io_rw {
- /* NOTE: kiocb has the file as the first member, so don't do it here */
- struct kiocb kiocb;
- u64 addr;
- u32 len;
- u32 flags;
-};
-
-struct io_connect {
- struct file *file;
- struct sockaddr __user *addr;
- int addr_len;
-};
-
-struct io_sr_msg {
- struct file *file;
- union {
- struct compat_msghdr __user *umsg_compat;
- struct user_msghdr __user *umsg;
- void __user *buf;
- };
- int msg_flags;
- int bgid;
- size_t len;
- size_t done_io;
-};
-
-struct io_open {
- struct file *file;
- int dfd;
- u32 file_slot;
- struct filename *filename;
- struct open_how how;
- unsigned long nofile;
-};
-
-struct io_rsrc_update {
- struct file *file;
- u64 arg;
- u32 nr_args;
- u32 offset;
-};
-
-struct io_fadvise {
- struct file *file;
- u64 offset;
- u32 len;
- u32 advice;
-};
-
-struct io_madvise {
- struct file *file;
- u64 addr;
- u32 len;
- u32 advice;
-};
-
-struct io_epoll {
- struct file *file;
- int epfd;
- int op;
- int fd;
- struct epoll_event event;
-};
-
-struct io_splice {
- struct file *file_out;
- loff_t off_out;
- loff_t off_in;
- u64 len;
- int splice_fd_in;
- unsigned int flags;
-};
-
-struct io_provide_buf {
- struct file *file;
- __u64 addr;
- __u32 len;
- __u32 bgid;
- __u16 nbufs;
- __u16 bid;
-};
-
-struct io_statx {
- struct file *file;
- int dfd;
- unsigned int mask;
- unsigned int flags;
- struct filename *filename;
- struct statx __user *buffer;
-};
-
-struct io_shutdown {
- struct file *file;
- int how;
-};
-
-struct io_rename {
- struct file *file;
- int old_dfd;
- int new_dfd;
- struct filename *oldpath;
- struct filename *newpath;
- int flags;
-};
-
-struct io_unlink {
- struct file *file;
- int dfd;
- int flags;
- struct filename *filename;
-};
-
-struct io_mkdir {
- struct file *file;
- int dfd;
- umode_t mode;
- struct filename *filename;
-};
-
-struct io_symlink {
- struct file *file;
- int new_dfd;
- struct filename *oldpath;
- struct filename *newpath;
-};
-
-struct io_hardlink {
- struct file *file;
- int old_dfd;
- int new_dfd;
- struct filename *oldpath;
- struct filename *newpath;
- int flags;
-};
-
-struct io_msg {
- struct file *file;
- u64 user_data;
- u32 len;
-};
-
-struct io_async_connect {
- struct sockaddr_storage address;
-};
-
-struct io_async_msghdr {
- struct iovec fast_iov[UIO_FASTIOV];
- /* points to an allocated iov, if NULL we use fast_iov instead */
- struct iovec *free_iov;
- struct sockaddr __user *uaddr;
- struct msghdr msg;
- struct sockaddr_storage addr;
-};
-
-struct io_rw_state {
- struct iov_iter iter;
- struct iov_iter_state iter_state;
- struct iovec fast_iov[UIO_FASTIOV];
-};
-
-struct io_async_rw {
- struct io_rw_state s;
- const struct iovec *free_iovec;
- size_t bytes_done;
- struct wait_page_queue wpq;
-};
-
-enum {
- REQ_F_FIXED_FILE_BIT = IOSQE_FIXED_FILE_BIT,
- REQ_F_IO_DRAIN_BIT = IOSQE_IO_DRAIN_BIT,
- REQ_F_LINK_BIT = IOSQE_IO_LINK_BIT,
- REQ_F_HARDLINK_BIT = IOSQE_IO_HARDLINK_BIT,
- REQ_F_FORCE_ASYNC_BIT = IOSQE_ASYNC_BIT,
- REQ_F_BUFFER_SELECT_BIT = IOSQE_BUFFER_SELECT_BIT,
- REQ_F_CQE_SKIP_BIT = IOSQE_CQE_SKIP_SUCCESS_BIT,
-
- /* first byte is taken by user flags, shift it to not overlap */
- REQ_F_FAIL_BIT = 8,
- REQ_F_INFLIGHT_BIT,
- REQ_F_CUR_POS_BIT,
- REQ_F_NOWAIT_BIT,
- REQ_F_LINK_TIMEOUT_BIT,
- REQ_F_NEED_CLEANUP_BIT,
- REQ_F_POLLED_BIT,
- REQ_F_BUFFER_SELECTED_BIT,
- REQ_F_COMPLETE_INLINE_BIT,
- REQ_F_REISSUE_BIT,
- REQ_F_CREDS_BIT,
- REQ_F_REFCOUNT_BIT,
- REQ_F_ARM_LTIMEOUT_BIT,
- REQ_F_ASYNC_DATA_BIT,
- REQ_F_SKIP_LINK_CQES_BIT,
- REQ_F_SINGLE_POLL_BIT,
- REQ_F_DOUBLE_POLL_BIT,
- REQ_F_PARTIAL_IO_BIT,
- /* keep async read/write and isreg together and in order */
- REQ_F_SUPPORT_NOWAIT_BIT,
- REQ_F_ISREG_BIT,
-
- /* not a real bit, just to check we're not overflowing the space */
- __REQ_F_LAST_BIT,
-};
-
-enum {
- /* ctx owns file */
- REQ_F_FIXED_FILE = BIT(REQ_F_FIXED_FILE_BIT),
- /* drain existing IO first */
- REQ_F_IO_DRAIN = BIT(REQ_F_IO_DRAIN_BIT),
- /* linked sqes */
- REQ_F_LINK = BIT(REQ_F_LINK_BIT),
- /* doesn't sever on completion < 0 */
- REQ_F_HARDLINK = BIT(REQ_F_HARDLINK_BIT),
- /* IOSQE_ASYNC */
- REQ_F_FORCE_ASYNC = BIT(REQ_F_FORCE_ASYNC_BIT),
- /* IOSQE_BUFFER_SELECT */
- REQ_F_BUFFER_SELECT = BIT(REQ_F_BUFFER_SELECT_BIT),
- /* IOSQE_CQE_SKIP_SUCCESS */
- REQ_F_CQE_SKIP = BIT(REQ_F_CQE_SKIP_BIT),
-
- /* fail rest of links */
- REQ_F_FAIL = BIT(REQ_F_FAIL_BIT),
- /* on inflight list, should be cancelled and waited on exit reliably */
- REQ_F_INFLIGHT = BIT(REQ_F_INFLIGHT_BIT),
- /* read/write uses file position */
- REQ_F_CUR_POS = BIT(REQ_F_CUR_POS_BIT),
- /* must not punt to workers */
- REQ_F_NOWAIT = BIT(REQ_F_NOWAIT_BIT),
- /* has or had linked timeout */
- REQ_F_LINK_TIMEOUT = BIT(REQ_F_LINK_TIMEOUT_BIT),
- /* needs cleanup */
- REQ_F_NEED_CLEANUP = BIT(REQ_F_NEED_CLEANUP_BIT),
- /* already went through poll handler */
- REQ_F_POLLED = BIT(REQ_F_POLLED_BIT),
- /* buffer already selected */
- REQ_F_BUFFER_SELECTED = BIT(REQ_F_BUFFER_SELECTED_BIT),
- /* completion is deferred through io_comp_state */
- REQ_F_COMPLETE_INLINE = BIT(REQ_F_COMPLETE_INLINE_BIT),
- /* caller should reissue async */
- REQ_F_REISSUE = BIT(REQ_F_REISSUE_BIT),
- /* supports async reads/writes */
- REQ_F_SUPPORT_NOWAIT = BIT(REQ_F_SUPPORT_NOWAIT_BIT),
- /* regular file */
- REQ_F_ISREG = BIT(REQ_F_ISREG_BIT),
- /* has creds assigned */
- REQ_F_CREDS = BIT(REQ_F_CREDS_BIT),
- /* skip refcounting if not set */
- REQ_F_REFCOUNT = BIT(REQ_F_REFCOUNT_BIT),
- /* there is a linked timeout that has to be armed */
- REQ_F_ARM_LTIMEOUT = BIT(REQ_F_ARM_LTIMEOUT_BIT),
- /* ->async_data allocated */
- REQ_F_ASYNC_DATA = BIT(REQ_F_ASYNC_DATA_BIT),
- /* don't post CQEs while failing linked requests */
- REQ_F_SKIP_LINK_CQES = BIT(REQ_F_SKIP_LINK_CQES_BIT),
- /* single poll may be active */
- REQ_F_SINGLE_POLL = BIT(REQ_F_SINGLE_POLL_BIT),
- /* double poll may active */
- REQ_F_DOUBLE_POLL = BIT(REQ_F_DOUBLE_POLL_BIT),
- /* request has already done partial IO */
- REQ_F_PARTIAL_IO = BIT(REQ_F_PARTIAL_IO_BIT),
-};
-
-struct async_poll {
- struct io_poll_iocb poll;
- struct io_poll_iocb *double_poll;
-};
-
-typedef void (*io_req_tw_func_t)(struct io_kiocb *req, bool *locked);
-
-struct io_task_work {
- union {
- struct io_wq_work_node node;
- struct llist_node fallback_node;
- };
- io_req_tw_func_t func;
-};
-
-enum {
- IORING_RSRC_FILE = 0,
- IORING_RSRC_BUFFER = 1,
-};
-
-/*
- * NOTE! Each of the iocb union members has the file pointer
- * as the first entry in their struct definition. So you can
- * access the file pointer through any of the sub-structs,
- * or directly as just 'file' in this struct.
- */
-struct io_kiocb {
- union {
- struct file *file;
- struct io_rw rw;
- struct io_poll_iocb poll;
- struct io_poll_update poll_update;
- struct io_accept accept;
- struct io_sync sync;
- struct io_cancel cancel;
- struct io_timeout timeout;
- struct io_timeout_rem timeout_rem;
- struct io_connect connect;
- struct io_sr_msg sr_msg;
- struct io_open open;
- struct io_close close;
- struct io_rsrc_update rsrc_update;
- struct io_fadvise fadvise;
- struct io_madvise madvise;
- struct io_epoll epoll;
- struct io_splice splice;
- struct io_provide_buf pbuf;
- struct io_statx statx;
- struct io_shutdown shutdown;
- struct io_rename rename;
- struct io_unlink unlink;
- struct io_mkdir mkdir;
- struct io_symlink symlink;
- struct io_hardlink hardlink;
- struct io_msg msg;
- };
-
- u8 opcode;
- /* polled IO has completed */
- u8 iopoll_completed;
- u16 buf_index;
- unsigned int flags;
-
- u64 user_data;
- u32 result;
- /* fd initially, then cflags for completion */
- union {
- u32 cflags;
- int fd;
- };
-
- struct io_ring_ctx *ctx;
- struct task_struct *task;
-
- struct percpu_ref *fixed_rsrc_refs;
- /* store used ubuf, so we can prevent reloading */
- struct io_mapped_ubuf *imu;
-
- union {
- /* used by request caches, completion batching and iopoll */
- struct io_wq_work_node comp_list;
- /* cache ->apoll->events */
- __poll_t apoll_events;
- };
- atomic_t refs;
- atomic_t poll_refs;
- struct io_task_work io_task_work;
- /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
- struct hlist_node hash_node;
- /* internal polling, see IORING_FEAT_FAST_POLL */
- struct async_poll *apoll;
- /* opcode allocated if it needs to store data for async defer */
- void *async_data;
- /* stores selected buf, valid IFF REQ_F_BUFFER_SELECTED is set */
- struct io_buffer *kbuf;
- /* linked requests, IFF REQ_F_HARDLINK or REQ_F_LINK are set */
- struct io_kiocb *link;
- /* custom credentials, valid IFF REQ_F_CREDS is set */
- const struct cred *creds;
- struct io_wq_work work;
-};
-
-struct io_tctx_node {
- struct list_head ctx_node;
- struct task_struct *task;
- struct io_ring_ctx *ctx;
-};
-
-struct io_defer_entry {
- struct list_head list;
- struct io_kiocb *req;
- u32 seq;
-};
-
-struct io_op_def {
- /* needs req->file assigned */
- unsigned needs_file : 1;
- /* should block plug */
- unsigned plug : 1;
- /* hash wq insertion if file is a regular file */
- unsigned hash_reg_file : 1;
- /* unbound wq insertion if file is a non-regular file */
- unsigned unbound_nonreg_file : 1;
- /* set if opcode supports polled "wait" */
- unsigned pollin : 1;
- unsigned pollout : 1;
- unsigned poll_exclusive : 1;
- /* op supports buffer selection */
- unsigned buffer_select : 1;
- /* do prep async if is going to be punted */
- unsigned needs_async_setup : 1;
- /* opcode is not supported by this kernel */
- unsigned not_supported : 1;
- /* skip auditing */
- unsigned audit_skip : 1;
- /* size of async data needed, if any */
- unsigned short async_size;
-};
-
-static const struct io_op_def io_op_defs[] = {
- [IORING_OP_NOP] = {},
- [IORING_OP_READV] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollin = 1,
- .buffer_select = 1,
- .needs_async_setup = 1,
- .plug = 1,
- .audit_skip = 1,
- .async_size = sizeof(struct io_async_rw),
- },
- [IORING_OP_WRITEV] = {
- .needs_file = 1,
- .hash_reg_file = 1,
- .unbound_nonreg_file = 1,
- .pollout = 1,
- .needs_async_setup = 1,
- .plug = 1,
- .audit_skip = 1,
- .async_size = sizeof(struct io_async_rw),
- },
- [IORING_OP_FSYNC] = {
- .needs_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_READ_FIXED] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollin = 1,
- .plug = 1,
- .audit_skip = 1,
- .async_size = sizeof(struct io_async_rw),
- },
- [IORING_OP_WRITE_FIXED] = {
- .needs_file = 1,
- .hash_reg_file = 1,
- .unbound_nonreg_file = 1,
- .pollout = 1,
- .plug = 1,
- .audit_skip = 1,
- .async_size = sizeof(struct io_async_rw),
- },
- [IORING_OP_POLL_ADD] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_POLL_REMOVE] = {
- .audit_skip = 1,
- },
- [IORING_OP_SYNC_FILE_RANGE] = {
- .needs_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_SENDMSG] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollout = 1,
- .needs_async_setup = 1,
- .async_size = sizeof(struct io_async_msghdr),
- },
- [IORING_OP_RECVMSG] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollin = 1,
- .buffer_select = 1,
- .needs_async_setup = 1,
- .async_size = sizeof(struct io_async_msghdr),
- },
- [IORING_OP_TIMEOUT] = {
- .audit_skip = 1,
- .async_size = sizeof(struct io_timeout_data),
- },
- [IORING_OP_TIMEOUT_REMOVE] = {
- /* used by timeout updates' prep() */
- .audit_skip = 1,
- },
- [IORING_OP_ACCEPT] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollin = 1,
- .poll_exclusive = 1,
- },
- [IORING_OP_ASYNC_CANCEL] = {
- .audit_skip = 1,
- },
- [IORING_OP_LINK_TIMEOUT] = {
- .audit_skip = 1,
- .async_size = sizeof(struct io_timeout_data),
- },
- [IORING_OP_CONNECT] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollout = 1,
- .needs_async_setup = 1,
- .async_size = sizeof(struct io_async_connect),
- },
- [IORING_OP_FALLOCATE] = {
- .needs_file = 1,
- },
- [IORING_OP_OPENAT] = {},
- [IORING_OP_CLOSE] = {},
- [IORING_OP_FILES_UPDATE] = {
- .audit_skip = 1,
- },
- [IORING_OP_STATX] = {
- .audit_skip = 1,
- },
- [IORING_OP_READ] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollin = 1,
- .buffer_select = 1,
- .plug = 1,
- .audit_skip = 1,
- .async_size = sizeof(struct io_async_rw),
- },
- [IORING_OP_WRITE] = {
- .needs_file = 1,
- .hash_reg_file = 1,
- .unbound_nonreg_file = 1,
- .pollout = 1,
- .plug = 1,
- .audit_skip = 1,
- .async_size = sizeof(struct io_async_rw),
- },
- [IORING_OP_FADVISE] = {
- .needs_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_MADVISE] = {},
- [IORING_OP_SEND] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollout = 1,
- .audit_skip = 1,
- },
- [IORING_OP_RECV] = {
- .needs_file = 1,
- .unbound_nonreg_file = 1,
- .pollin = 1,
- .buffer_select = 1,
- .audit_skip = 1,
- },
- [IORING_OP_OPENAT2] = {
- },
- [IORING_OP_EPOLL_CTL] = {
- .unbound_nonreg_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_SPLICE] = {
- .needs_file = 1,
- .hash_reg_file = 1,
- .unbound_nonreg_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_PROVIDE_BUFFERS] = {
- .audit_skip = 1,
- },
- [IORING_OP_REMOVE_BUFFERS] = {
- .audit_skip = 1,
- },
- [IORING_OP_TEE] = {
- .needs_file = 1,
- .hash_reg_file = 1,
- .unbound_nonreg_file = 1,
- .audit_skip = 1,
- },
- [IORING_OP_SHUTDOWN] = {
- .needs_file = 1,
- },
- [IORING_OP_RENAMEAT] = {},
- [IORING_OP_UNLINKAT] = {},
- [IORING_OP_MKDIRAT] = {},
- [IORING_OP_SYMLINKAT] = {},
- [IORING_OP_LINKAT] = {},
- [IORING_OP_MSG_RING] = {
- .needs_file = 1,
- },
-};
-
-/* requests with any of those set should undergo io_disarm_next() */
-#define IO_DISARM_MASK (REQ_F_ARM_LTIMEOUT | REQ_F_LINK_TIMEOUT | REQ_F_FAIL)
-
-static bool io_disarm_next(struct io_kiocb *req);
-static void io_uring_del_tctx_node(unsigned long index);
-static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
- struct task_struct *task,
- bool cancel_all);
-static void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd);
-
-static void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags);
-
-static void io_put_req(struct io_kiocb *req);
-static void io_put_req_deferred(struct io_kiocb *req);
-static void io_dismantle_req(struct io_kiocb *req);
-static void io_queue_linked_timeout(struct io_kiocb *req);
-static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
- struct io_uring_rsrc_update2 *up,
- unsigned nr_args);
-static void io_clean_op(struct io_kiocb *req);
-static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
- unsigned issue_flags);
-static inline struct file *io_file_get_normal(struct io_kiocb *req, int fd);
-static void __io_queue_sqe(struct io_kiocb *req);
-static void io_rsrc_put_work(struct work_struct *work);
-
-static void io_req_task_queue(struct io_kiocb *req);
-static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
-static int io_req_prep_async(struct io_kiocb *req);
-
-static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
- unsigned int issue_flags, u32 slot_index);
-static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags);
-
-static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer);
-static void io_eventfd_signal(struct io_ring_ctx *ctx);
-
-static struct kmem_cache *req_cachep;
-
-static const struct file_operations io_uring_fops;
-
-struct sock *io_uring_get_socket(struct file *file)
-{
-#if defined(CONFIG_UNIX)
- if (file->f_op == &io_uring_fops) {
- struct io_ring_ctx *ctx = file->private_data;
-
- return ctx->ring_sock->sk;
- }
-#endif
- return NULL;
-}
-EXPORT_SYMBOL(io_uring_get_socket);
-
-static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
-{
- if (!*locked) {
- mutex_lock(&ctx->uring_lock);
- *locked = true;
- }
-}
-
-#define io_for_each_link(pos, head) \
- for (pos = (head); pos; pos = pos->link)
-
-/*
- * Shamelessly stolen from the mm implementation of page reference checking,
- * see commit f958d7b528b1 for details.
- */
-#define req_ref_zero_or_close_to_overflow(req) \
- ((unsigned int) atomic_read(&(req->refs)) + 127u <= 127u)
-
-static inline bool req_ref_inc_not_zero(struct io_kiocb *req)
-{
- WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
- return atomic_inc_not_zero(&req->refs);
-}
-
-static inline bool req_ref_put_and_test(struct io_kiocb *req)
-{
- if (likely(!(req->flags & REQ_F_REFCOUNT)))
- return true;
-
- WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
- return atomic_dec_and_test(&req->refs);
-}
-
-static inline void req_ref_get(struct io_kiocb *req)
-{
- WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
- WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
- atomic_inc(&req->refs);
-}
-
-static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
-{
- if (!wq_list_empty(&ctx->submit_state.compl_reqs))
- __io_submit_flush_completions(ctx);
-}
-
-static inline void __io_req_set_refcount(struct io_kiocb *req, int nr)
-{
- if (!(req->flags & REQ_F_REFCOUNT)) {
- req->flags |= REQ_F_REFCOUNT;
- atomic_set(&req->refs, nr);
- }
-}
-
-static inline void io_req_set_refcount(struct io_kiocb *req)
-{
- __io_req_set_refcount(req, 1);
-}
-
-#define IO_RSRC_REF_BATCH 100
-
-static inline void io_req_put_rsrc_locked(struct io_kiocb *req,
- struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
-{
- struct percpu_ref *ref = req->fixed_rsrc_refs;
-
- if (ref) {
- if (ref == &ctx->rsrc_node->refs)
- ctx->rsrc_cached_refs++;
- else
- percpu_ref_put(ref);
- }
-}
-
-static inline void io_req_put_rsrc(struct io_kiocb *req, struct io_ring_ctx *ctx)
-{
- if (req->fixed_rsrc_refs)
- percpu_ref_put(req->fixed_rsrc_refs);
-}
-
-static __cold void io_rsrc_refs_drop(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
-{
- if (ctx->rsrc_cached_refs) {
- percpu_ref_put_many(&ctx->rsrc_node->refs, ctx->rsrc_cached_refs);
- ctx->rsrc_cached_refs = 0;
- }
-}
-
-static void io_rsrc_refs_refill(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
-{
- ctx->rsrc_cached_refs += IO_RSRC_REF_BATCH;
- percpu_ref_get_many(&ctx->rsrc_node->refs, IO_RSRC_REF_BATCH);
-}
-
-static inline void io_req_set_rsrc_node(struct io_kiocb *req,
- struct io_ring_ctx *ctx,
- unsigned int issue_flags)
-{
- if (!req->fixed_rsrc_refs) {
- req->fixed_rsrc_refs = &ctx->rsrc_node->refs;
-
- if (!(issue_flags & IO_URING_F_UNLOCKED)) {
- lockdep_assert_held(&ctx->uring_lock);
- ctx->rsrc_cached_refs--;
- if (unlikely(ctx->rsrc_cached_refs < 0))
- io_rsrc_refs_refill(ctx);
- } else {
- percpu_ref_get(req->fixed_rsrc_refs);
- }
- }
-}
-
-static unsigned int __io_put_kbuf(struct io_kiocb *req, struct list_head *list)
-{
- struct io_buffer *kbuf = req->kbuf;
- unsigned int cflags;
-
- cflags = IORING_CQE_F_BUFFER | (kbuf->bid << IORING_CQE_BUFFER_SHIFT);
- req->flags &= ~REQ_F_BUFFER_SELECTED;
- list_add(&kbuf->list, list);
- req->kbuf = NULL;
- return cflags;
-}
-
-static inline unsigned int io_put_kbuf_comp(struct io_kiocb *req)
-{
- lockdep_assert_held(&req->ctx->completion_lock);
-
- if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
- return 0;
- return __io_put_kbuf(req, &req->ctx->io_buffers_comp);
-}
-
-static inline unsigned int io_put_kbuf(struct io_kiocb *req,
- unsigned issue_flags)
-{
- unsigned int cflags;
-
- if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
- return 0;
-
- /*
- * We can add this buffer back to two lists:
- *
- * 1) The io_buffers_cache list. This one is protected by the
- * ctx->uring_lock. If we already hold this lock, add back to this
- * list as we can grab it from issue as well.
- * 2) The io_buffers_comp list. This one is protected by the
- * ctx->completion_lock.
- *
- * We migrate buffers from the comp_list to the issue cache list
- * when we need one.
- */
- if (issue_flags & IO_URING_F_UNLOCKED) {
- struct io_ring_ctx *ctx = req->ctx;
-
- spin_lock(&ctx->completion_lock);
- cflags = __io_put_kbuf(req, &ctx->io_buffers_comp);
- spin_unlock(&ctx->completion_lock);
- } else {
- lockdep_assert_held(&req->ctx->uring_lock);
-
- cflags = __io_put_kbuf(req, &req->ctx->io_buffers_cache);
- }
-
- return cflags;
-}
-
-static struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
- unsigned int bgid)
-{
- struct list_head *hash_list;
- struct io_buffer_list *bl;
-
- hash_list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
- list_for_each_entry(bl, hash_list, list)
- if (bl->bgid == bgid || bgid == -1U)
- return bl;
-
- return NULL;
-}
-
-static void io_kbuf_recycle(struct io_kiocb *req, unsigned issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_buffer_list *bl;
- struct io_buffer *buf;
-
- if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
- return;
- /* don't recycle if we already did IO to this buffer */
- if (req->flags & REQ_F_PARTIAL_IO)
- return;
-
- if (issue_flags & IO_URING_F_UNLOCKED)
- mutex_lock(&ctx->uring_lock);
-
- lockdep_assert_held(&ctx->uring_lock);
-
- buf = req->kbuf;
- bl = io_buffer_get_list(ctx, buf->bgid);
- list_add(&buf->list, &bl->buf_list);
- req->flags &= ~REQ_F_BUFFER_SELECTED;
- req->kbuf = NULL;
-
- if (issue_flags & IO_URING_F_UNLOCKED)
- mutex_unlock(&ctx->uring_lock);
-}
-
-static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
- bool cancel_all)
- __must_hold(&req->ctx->timeout_lock)
-{
- struct io_kiocb *req;
-
- if (task && head->task != task)
- return false;
- if (cancel_all)
- return true;
-
- io_for_each_link(req, head) {
- if (req->flags & REQ_F_INFLIGHT)
- return true;
- }
- return false;
-}
-
-static bool io_match_linked(struct io_kiocb *head)
-{
- struct io_kiocb *req;
-
- io_for_each_link(req, head) {
- if (req->flags & REQ_F_INFLIGHT)
- return true;
- }
- return false;
-}
-
-/*
- * As io_match_task() but protected against racing with linked timeouts.
- * User must not hold timeout_lock.
- */
-static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
- bool cancel_all)
-{
- bool matched;
-
- if (task && head->task != task)
- return false;
- if (cancel_all)
- return true;
-
- if (head->flags & REQ_F_LINK_TIMEOUT) {
- struct io_ring_ctx *ctx = head->ctx;
-
- /* protect against races with linked timeouts */
- spin_lock_irq(&ctx->timeout_lock);
- matched = io_match_linked(head);
- spin_unlock_irq(&ctx->timeout_lock);
- } else {
- matched = io_match_linked(head);
- }
- return matched;
-}
-
-static inline bool req_has_async_data(struct io_kiocb *req)
-{
- return req->flags & REQ_F_ASYNC_DATA;
-}
-
-static inline void req_set_fail(struct io_kiocb *req)
-{
- req->flags |= REQ_F_FAIL;
- if (req->flags & REQ_F_CQE_SKIP) {
- req->flags &= ~REQ_F_CQE_SKIP;
- req->flags |= REQ_F_SKIP_LINK_CQES;
- }
-}
-
-static inline void req_fail_link_node(struct io_kiocb *req, int res)
-{
- req_set_fail(req);
- req->result = res;
-}
-
-static __cold void io_ring_ctx_ref_free(struct percpu_ref *ref)
-{
- struct io_ring_ctx *ctx = container_of(ref, struct io_ring_ctx, refs);
-
- complete(&ctx->ref_comp);
-}
-
-static inline bool io_is_timeout_noseq(struct io_kiocb *req)
-{
- return !req->timeout.off;
-}
-
-static __cold void io_fallback_req_func(struct work_struct *work)
-{
- struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
- fallback_work.work);
- struct llist_node *node = llist_del_all(&ctx->fallback_llist);
- struct io_kiocb *req, *tmp;
- bool locked = false;
-
- percpu_ref_get(&ctx->refs);
- llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
- req->io_task_work.func(req, &locked);
-
- if (locked) {
- io_submit_flush_completions(ctx);
- mutex_unlock(&ctx->uring_lock);
- }
- percpu_ref_put(&ctx->refs);
-}
-
-static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
-{
- struct io_ring_ctx *ctx;
- int i, hash_bits;
-
- ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
- if (!ctx)
- return NULL;
-
- /*
- * Use 5 bits less than the max cq entries, that should give us around
- * 32 entries per hash list if totally full and uniformly spread.
- */
- hash_bits = ilog2(p->cq_entries);
- hash_bits -= 5;
- if (hash_bits <= 0)
- hash_bits = 1;
- ctx->cancel_hash_bits = hash_bits;
- ctx->cancel_hash = kmalloc((1U << hash_bits) * sizeof(struct hlist_head),
- GFP_KERNEL);
- if (!ctx->cancel_hash)
- goto err;
- __hash_init(ctx->cancel_hash, 1U << hash_bits);
-
- ctx->dummy_ubuf = kzalloc(sizeof(*ctx->dummy_ubuf), GFP_KERNEL);
- if (!ctx->dummy_ubuf)
- goto err;
- /* set invalid range, so io_import_fixed() fails meeting it */
- ctx->dummy_ubuf->ubuf = -1UL;
-
- ctx->io_buffers = kcalloc(1U << IO_BUFFERS_HASH_BITS,
- sizeof(struct list_head), GFP_KERNEL);
- if (!ctx->io_buffers)
- goto err;
- for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++)
- INIT_LIST_HEAD(&ctx->io_buffers[i]);
-
- if (percpu_ref_init(&ctx->refs, io_ring_ctx_ref_free,
- PERCPU_REF_ALLOW_REINIT, GFP_KERNEL))
- goto err;
-
- ctx->flags = p->flags;
- init_waitqueue_head(&ctx->sqo_sq_wait);
- INIT_LIST_HEAD(&ctx->sqd_list);
- INIT_LIST_HEAD(&ctx->cq_overflow_list);
- INIT_LIST_HEAD(&ctx->io_buffers_cache);
- INIT_LIST_HEAD(&ctx->apoll_cache);
- init_completion(&ctx->ref_comp);
- xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
- mutex_init(&ctx->uring_lock);
- init_waitqueue_head(&ctx->cq_wait);
- spin_lock_init(&ctx->completion_lock);
- spin_lock_init(&ctx->timeout_lock);
- INIT_WQ_LIST(&ctx->iopoll_list);
- INIT_LIST_HEAD(&ctx->io_buffers_pages);
- INIT_LIST_HEAD(&ctx->io_buffers_comp);
- INIT_LIST_HEAD(&ctx->defer_list);
- INIT_LIST_HEAD(&ctx->timeout_list);
- INIT_LIST_HEAD(&ctx->ltimeout_list);
- spin_lock_init(&ctx->rsrc_ref_lock);
- INIT_LIST_HEAD(&ctx->rsrc_ref_list);
- INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
- init_llist_head(&ctx->rsrc_put_llist);
- INIT_LIST_HEAD(&ctx->tctx_list);
- ctx->submit_state.free_list.next = NULL;
- INIT_WQ_LIST(&ctx->locked_free_list);
- INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
- INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
- return ctx;
-err:
- kfree(ctx->dummy_ubuf);
- kfree(ctx->cancel_hash);
- kfree(ctx->io_buffers);
- kfree(ctx);
- return NULL;
-}
-
-static void io_account_cq_overflow(struct io_ring_ctx *ctx)
-{
- struct io_rings *r = ctx->rings;
-
- WRITE_ONCE(r->cq_overflow, READ_ONCE(r->cq_overflow) + 1);
- ctx->cq_extra--;
-}
-
-static bool req_need_defer(struct io_kiocb *req, u32 seq)
-{
- if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
- struct io_ring_ctx *ctx = req->ctx;
-
- return seq + READ_ONCE(ctx->cq_extra) != ctx->cached_cq_tail;
- }
-
- return false;
-}
-
-#define FFS_NOWAIT 0x1UL
-#define FFS_ISREG 0x2UL
-#define FFS_MASK ~(FFS_NOWAIT|FFS_ISREG)
-
-static inline bool io_req_ffs_set(struct io_kiocb *req)
-{
- return req->flags & REQ_F_FIXED_FILE;
-}
-
-static inline void io_req_track_inflight(struct io_kiocb *req)
-{
- if (!(req->flags & REQ_F_INFLIGHT)) {
- req->flags |= REQ_F_INFLIGHT;
- atomic_inc(&req->task->io_uring->inflight_tracked);
- }
-}
-
-static struct io_kiocb *__io_prep_linked_timeout(struct io_kiocb *req)
-{
- if (WARN_ON_ONCE(!req->link))
- return NULL;
-
- req->flags &= ~REQ_F_ARM_LTIMEOUT;
- req->flags |= REQ_F_LINK_TIMEOUT;
-
- /* linked timeouts should have two refs once prep'ed */
- io_req_set_refcount(req);
- __io_req_set_refcount(req->link, 2);
- return req->link;
-}
-
-static inline struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
-{
- if (likely(!(req->flags & REQ_F_ARM_LTIMEOUT)))
- return NULL;
- return __io_prep_linked_timeout(req);
-}
-
-static void io_prep_async_work(struct io_kiocb *req)
-{
- const struct io_op_def *def = &io_op_defs[req->opcode];
- struct io_ring_ctx *ctx = req->ctx;
-
- if (!(req->flags & REQ_F_CREDS)) {
- req->flags |= REQ_F_CREDS;
- req->creds = get_current_cred();
- }
-
- req->work.list.next = NULL;
- req->work.flags = 0;
- if (req->flags & REQ_F_FORCE_ASYNC)
- req->work.flags |= IO_WQ_WORK_CONCURRENT;
-
- if (req->flags & REQ_F_ISREG) {
- if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
- io_wq_hash_work(&req->work, file_inode(req->file));
- } else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) {
- if (def->unbound_nonreg_file)
- req->work.flags |= IO_WQ_WORK_UNBOUND;
- }
-}
-
-static void io_prep_async_link(struct io_kiocb *req)
-{
- struct io_kiocb *cur;
-
- if (req->flags & REQ_F_LINK_TIMEOUT) {
- struct io_ring_ctx *ctx = req->ctx;
-
- spin_lock_irq(&ctx->timeout_lock);
- io_for_each_link(cur, req)
- io_prep_async_work(cur);
- spin_unlock_irq(&ctx->timeout_lock);
- } else {
- io_for_each_link(cur, req)
- io_prep_async_work(cur);
- }
-}
-
-static inline void io_req_add_compl_list(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_submit_state *state = &ctx->submit_state;
-
- if (!(req->flags & REQ_F_CQE_SKIP))
- ctx->submit_state.flush_cqes = true;
- wq_list_add_tail(&req->comp_list, &state->compl_reqs);
-}
-
-static void io_queue_async_work(struct io_kiocb *req, bool *dont_use)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_kiocb *link = io_prep_linked_timeout(req);
- struct io_uring_task *tctx = req->task->io_uring;
-
- BUG_ON(!tctx);
- BUG_ON(!tctx->io_wq);
-
- /* init ->work of the whole link before punting */
- io_prep_async_link(req);
-
- /*
- * Not expected to happen, but if we do have a bug where this _can_
- * happen, catch it here and ensure the request is marked as
- * canceled. That will make io-wq go through the usual work cancel
- * procedure rather than attempt to run this request (or create a new
- * worker for it).
- */
- if (WARN_ON_ONCE(!same_thread_group(req->task, current)))
- req->work.flags |= IO_WQ_WORK_CANCEL;
-
- trace_io_uring_queue_async_work(ctx, req, req->user_data, req->opcode, req->flags,
- &req->work, io_wq_is_hashed(&req->work));
- io_wq_enqueue(tctx->io_wq, &req->work);
- if (link)
- io_queue_linked_timeout(link);
-}
-
-static void io_kill_timeout(struct io_kiocb *req, int status)
- __must_hold(&req->ctx->completion_lock)
- __must_hold(&req->ctx->timeout_lock)
-{
- struct io_timeout_data *io = req->async_data;
-
- if (hrtimer_try_to_cancel(&io->timer) != -1) {
- if (status)
- req_set_fail(req);
- atomic_set(&req->ctx->cq_timeouts,
- atomic_read(&req->ctx->cq_timeouts) + 1);
- list_del_init(&req->timeout.list);
- io_fill_cqe_req(req, status, 0);
- io_put_req_deferred(req);
- }
-}
-
-static __cold void io_queue_deferred(struct io_ring_ctx *ctx)
-{
- while (!list_empty(&ctx->defer_list)) {
- struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
- struct io_defer_entry, list);
-
- if (req_need_defer(de->req, de->seq))
- break;
- list_del_init(&de->list);
- io_req_task_queue(de->req);
- kfree(de);
- }
-}
-
-static __cold void io_flush_timeouts(struct io_ring_ctx *ctx)
- __must_hold(&ctx->completion_lock)
-{
- u32 seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
- struct io_kiocb *req, *tmp;
-
- spin_lock_irq(&ctx->timeout_lock);
- list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
- u32 events_needed, events_got;
-
- if (io_is_timeout_noseq(req))
- break;
-
- /*
- * Since seq can easily wrap around over time, subtract
- * the last seq at which timeouts were flushed before comparing.
- * Assuming not more than 2^31-1 events have happened since,
- * these subtractions won't have wrapped, so we can check if
- * target is in [last_seq, current_seq] by comparing the two.
- */
- events_needed = req->timeout.target_seq - ctx->cq_last_tm_flush;
- events_got = seq - ctx->cq_last_tm_flush;
- if (events_got < events_needed)
- break;
-
- io_kill_timeout(req, 0);
- }
- ctx->cq_last_tm_flush = seq;
- spin_unlock_irq(&ctx->timeout_lock);
-}
-
-static inline void io_commit_cqring(struct io_ring_ctx *ctx)
-{
- /* order cqe stores with ring update */
- smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
-}
-
-static void __io_commit_cqring_flush(struct io_ring_ctx *ctx)
-{
- if (ctx->off_timeout_used || ctx->drain_active) {
- spin_lock(&ctx->completion_lock);
- if (ctx->off_timeout_used)
- io_flush_timeouts(ctx);
- if (ctx->drain_active)
- io_queue_deferred(ctx);
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- }
- if (ctx->has_evfd)
- io_eventfd_signal(ctx);
-}
-
-static inline bool io_sqring_full(struct io_ring_ctx *ctx)
-{
- struct io_rings *r = ctx->rings;
-
- return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
-}
-
-static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
-{
- return ctx->cached_cq_tail - READ_ONCE(ctx->rings->cq.head);
-}
-
-static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
-{
- struct io_rings *rings = ctx->rings;
- unsigned tail, mask = ctx->cq_entries - 1;
-
- /*
- * writes to the cq entry need to come after reading head; the
- * control dependency is enough as we're using WRITE_ONCE to
- * fill the cq entry
- */
- if (__io_cqring_events(ctx) == ctx->cq_entries)
- return NULL;
-
- tail = ctx->cached_cq_tail++;
- return &rings->cqes[tail & mask];
-}
-
-static void io_eventfd_signal(struct io_ring_ctx *ctx)
-{
- struct io_ev_fd *ev_fd;
-
- rcu_read_lock();
- /*
- * rcu_dereference ctx->io_ev_fd once and use it for both for checking
- * and eventfd_signal
- */
- ev_fd = rcu_dereference(ctx->io_ev_fd);
-
- /*
- * Check again if ev_fd exists incase an io_eventfd_unregister call
- * completed between the NULL check of ctx->io_ev_fd at the start of
- * the function and rcu_read_lock.
- */
- if (unlikely(!ev_fd))
- goto out;
- if (READ_ONCE(ctx->rings->cq_flags) & IORING_CQ_EVENTFD_DISABLED)
- goto out;
-
- if (!ev_fd->eventfd_async || io_wq_current_is_worker())
- eventfd_signal(ev_fd->cq_ev_fd, 1);
-out:
- rcu_read_unlock();
-}
-
-static inline void io_cqring_wake(struct io_ring_ctx *ctx)
-{
- /*
- * wake_up_all() may seem excessive, but io_wake_function() and
- * io_should_wake() handle the termination of the loop and only
- * wake as many waiters as we need to.
- */
- if (wq_has_sleeper(&ctx->cq_wait))
- wake_up_all(&ctx->cq_wait);
-}
-
-/*
- * This should only get called when at least one event has been posted.
- * Some applications rely on the eventfd notification count only changing
- * IFF a new CQE has been added to the CQ ring. There's no depedency on
- * 1:1 relationship between how many times this function is called (and
- * hence the eventfd count) and number of CQEs posted to the CQ ring.
- */
-static inline void io_cqring_ev_posted(struct io_ring_ctx *ctx)
-{
- if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
- ctx->has_evfd))
- __io_commit_cqring_flush(ctx);
-
- io_cqring_wake(ctx);
-}
-
-static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
-{
- if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
- ctx->has_evfd))
- __io_commit_cqring_flush(ctx);
-
- if (ctx->flags & IORING_SETUP_SQPOLL)
- io_cqring_wake(ctx);
-}
-
-/* Returns true if there are no backlogged entries after the flush */
-static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
-{
- bool all_flushed, posted;
-
- if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
- return false;
-
- posted = false;
- spin_lock(&ctx->completion_lock);
- while (!list_empty(&ctx->cq_overflow_list)) {
- struct io_uring_cqe *cqe = io_get_cqe(ctx);
- struct io_overflow_cqe *ocqe;
-
- if (!cqe && !force)
- break;
- ocqe = list_first_entry(&ctx->cq_overflow_list,
- struct io_overflow_cqe, list);
- if (cqe)
- memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
- else
- io_account_cq_overflow(ctx);
-
- posted = true;
- list_del(&ocqe->list);
- kfree(ocqe);
- }
-
- all_flushed = list_empty(&ctx->cq_overflow_list);
- if (all_flushed) {
- clear_bit(0, &ctx->check_cq_overflow);
- WRITE_ONCE(ctx->rings->sq_flags,
- ctx->rings->sq_flags & ~IORING_SQ_CQ_OVERFLOW);
- }
-
- if (posted)
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- if (posted)
- io_cqring_ev_posted(ctx);
- return all_flushed;
-}
-
-static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
-{
- bool ret = true;
-
- if (test_bit(0, &ctx->check_cq_overflow)) {
- /* iopoll syncs against uring_lock, not completion_lock */
- if (ctx->flags & IORING_SETUP_IOPOLL)
- mutex_lock(&ctx->uring_lock);
- ret = __io_cqring_overflow_flush(ctx, false);
- if (ctx->flags & IORING_SETUP_IOPOLL)
- mutex_unlock(&ctx->uring_lock);
- }
-
- return ret;
-}
-
-/* must to be called somewhat shortly after putting a request */
-static inline void io_put_task(struct task_struct *task, int nr)
-{
- struct io_uring_task *tctx = task->io_uring;
-
- if (likely(task == current)) {
- tctx->cached_refs += nr;
- } else {
- percpu_counter_sub(&tctx->inflight, nr);
- if (unlikely(atomic_read(&tctx->in_idle)))
- wake_up(&tctx->wait);
- put_task_struct_many(task, nr);
- }
-}
-
-static void io_task_refs_refill(struct io_uring_task *tctx)
-{
- unsigned int refill = -tctx->cached_refs + IO_TCTX_REFS_CACHE_NR;
-
- percpu_counter_add(&tctx->inflight, refill);
- refcount_add(refill, ¤t->usage);
- tctx->cached_refs += refill;
-}
-
-static inline void io_get_task_refs(int nr)
-{
- struct io_uring_task *tctx = current->io_uring;
-
- tctx->cached_refs -= nr;
- if (unlikely(tctx->cached_refs < 0))
- io_task_refs_refill(tctx);
-}
-
-static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
-{
- struct io_uring_task *tctx = task->io_uring;
- unsigned int refs = tctx->cached_refs;
-
- if (refs) {
- tctx->cached_refs = 0;
- percpu_counter_sub(&tctx->inflight, refs);
- put_task_struct_many(task, refs);
- }
-}
-
-static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
- s32 res, u32 cflags)
-{
- struct io_overflow_cqe *ocqe;
-
- ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
- if (!ocqe) {
- /*
- * If we're in ring overflow flush mode, or in task cancel mode,
- * or cannot allocate an overflow entry, then we need to drop it
- * on the floor.
- */
- io_account_cq_overflow(ctx);
- return false;
- }
- if (list_empty(&ctx->cq_overflow_list)) {
- set_bit(0, &ctx->check_cq_overflow);
- WRITE_ONCE(ctx->rings->sq_flags,
- ctx->rings->sq_flags | IORING_SQ_CQ_OVERFLOW);
-
- }
- ocqe->cqe.user_data = user_data;
- ocqe->cqe.res = res;
- ocqe->cqe.flags = cflags;
- list_add_tail(&ocqe->list, &ctx->cq_overflow_list);
- return true;
-}
-
-static inline bool __io_fill_cqe(struct io_ring_ctx *ctx, u64 user_data,
- s32 res, u32 cflags)
-{
- struct io_uring_cqe *cqe;
-
- /*
- * If we can't get a cq entry, userspace overflowed the
- * submission (by quite a lot). Increment the overflow count in
- * the ring.
- */
- cqe = io_get_cqe(ctx);
- if (likely(cqe)) {
- WRITE_ONCE(cqe->user_data, user_data);
- WRITE_ONCE(cqe->res, res);
- WRITE_ONCE(cqe->flags, cflags);
- return true;
- }
- return io_cqring_event_overflow(ctx, user_data, res, cflags);
-}
-
-static inline bool __io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
-{
- trace_io_uring_complete(req->ctx, req, req->user_data, res, cflags);
- return __io_fill_cqe(req->ctx, req->user_data, res, cflags);
-}
-
-static noinline void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
-{
- if (!(req->flags & REQ_F_CQE_SKIP))
- __io_fill_cqe_req(req, res, cflags);
-}
-
-static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
- s32 res, u32 cflags)
-{
- ctx->cq_extra++;
- trace_io_uring_complete(ctx, NULL, user_data, res, cflags);
- return __io_fill_cqe(ctx, user_data, res, cflags);
-}
-
-static void __io_req_complete_post(struct io_kiocb *req, s32 res,
- u32 cflags)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- if (!(req->flags & REQ_F_CQE_SKIP))
- __io_fill_cqe_req(req, res, cflags);
- /*
- * If we're the last reference to this request, add to our locked
- * free_list cache.
- */
- if (req_ref_put_and_test(req)) {
- if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
- if (req->flags & IO_DISARM_MASK)
- io_disarm_next(req);
- if (req->link) {
- io_req_task_queue(req->link);
- req->link = NULL;
- }
- }
- io_req_put_rsrc(req, ctx);
- /*
- * Selected buffer deallocation in io_clean_op() assumes that
- * we don't hold ->completion_lock. Clean them here to avoid
- * deadlocks.
- */
- io_put_kbuf_comp(req);
- io_dismantle_req(req);
- io_put_task(req->task, 1);
- wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
- ctx->locked_free_nr++;
- }
-}
-
-static void io_req_complete_post(struct io_kiocb *req, s32 res,
- u32 cflags)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- spin_lock(&ctx->completion_lock);
- __io_req_complete_post(req, res, cflags);
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- io_cqring_ev_posted(ctx);
-}
-
-static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
- u32 cflags)
-{
- req->result = res;
- req->cflags = cflags;
- req->flags |= REQ_F_COMPLETE_INLINE;
-}
-
-static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
- s32 res, u32 cflags)
-{
- if (issue_flags & IO_URING_F_COMPLETE_DEFER)
- io_req_complete_state(req, res, cflags);
- else
- io_req_complete_post(req, res, cflags);
-}
-
-static inline void io_req_complete(struct io_kiocb *req, s32 res)
-{
- __io_req_complete(req, 0, res, 0);
-}
-
-static void io_req_complete_failed(struct io_kiocb *req, s32 res)
-{
- req_set_fail(req);
- io_req_complete_post(req, res, io_put_kbuf(req, IO_URING_F_UNLOCKED));
-}
-
-static void io_req_complete_fail_submit(struct io_kiocb *req)
-{
- /*
- * We don't submit, fail them all, for that replace hardlinks with
- * normal links. Extra REQ_F_LINK is tolerated.
- */
- req->flags &= ~REQ_F_HARDLINK;
- req->flags |= REQ_F_LINK;
- io_req_complete_failed(req, req->result);
-}
-
-/*
- * Don't initialise the fields below on every allocation, but do that in
- * advance and keep them valid across allocations.
- */
-static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
-{
- req->ctx = ctx;
- req->link = NULL;
- req->async_data = NULL;
- /* not necessary, but safer to zero */
- req->result = 0;
-}
-
-static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
- struct io_submit_state *state)
-{
- spin_lock(&ctx->completion_lock);
- wq_list_splice(&ctx->locked_free_list, &state->free_list);
- ctx->locked_free_nr = 0;
- spin_unlock(&ctx->completion_lock);
-}
-
-/* Returns true IFF there are requests in the cache */
-static bool io_flush_cached_reqs(struct io_ring_ctx *ctx)
-{
- struct io_submit_state *state = &ctx->submit_state;
-
- /*
- * If we have more than a batch's worth of requests in our IRQ side
- * locked cache, grab the lock and move them over to our submission
- * side cache.
- */
- if (READ_ONCE(ctx->locked_free_nr) > IO_COMPL_BATCH)
- io_flush_cached_locked_reqs(ctx, state);
- return !!state->free_list.next;
-}
-
-/*
- * A request might get retired back into the request caches even before opcode
- * handlers and io_issue_sqe() are done with it, e.g. inline completion path.
- * Because of that, io_alloc_req() should be called only under ->uring_lock
- * and with extra caution to not get a request that is still worked on.
- */
-static __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
-{
- struct io_submit_state *state = &ctx->submit_state;
- gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
- void *reqs[IO_REQ_ALLOC_BATCH];
- struct io_kiocb *req;
- int ret, i;
-
- if (likely(state->free_list.next || io_flush_cached_reqs(ctx)))
- return true;
-
- ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
-
- /*
- * Bulk alloc is all-or-nothing. If we fail to get a batch,
- * retry single alloc to be on the safe side.
- */
- if (unlikely(ret <= 0)) {
- reqs[0] = kmem_cache_alloc(req_cachep, gfp);
- if (!reqs[0])
- return false;
- ret = 1;
- }
-
- percpu_ref_get_many(&ctx->refs, ret);
- for (i = 0; i < ret; i++) {
- req = reqs[i];
-
- io_preinit_req(req, ctx);
- wq_stack_add_head(&req->comp_list, &state->free_list);
- }
- return true;
-}
-
-static inline bool io_alloc_req_refill(struct io_ring_ctx *ctx)
-{
- if (unlikely(!ctx->submit_state.free_list.next))
- return __io_alloc_req_refill(ctx);
- return true;
-}
-
-static inline struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
-{
- struct io_wq_work_node *node;
-
- node = wq_stack_extract(&ctx->submit_state.free_list);
- return container_of(node, struct io_kiocb, comp_list);
-}
-
-static inline void io_put_file(struct file *file)
-{
- if (file)
- fput(file);
-}
-
-static inline void io_dismantle_req(struct io_kiocb *req)
-{
- unsigned int flags = req->flags;
-
- if (unlikely(flags & IO_REQ_CLEAN_FLAGS))
- io_clean_op(req);
- if (!(flags & REQ_F_FIXED_FILE))
- io_put_file(req->file);
-}
-
-static __cold void __io_free_req(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- io_req_put_rsrc(req, ctx);
- io_dismantle_req(req);
- io_put_task(req->task, 1);
-
- spin_lock(&ctx->completion_lock);
- wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
- ctx->locked_free_nr++;
- spin_unlock(&ctx->completion_lock);
-}
-
-static inline void io_remove_next_linked(struct io_kiocb *req)
-{
- struct io_kiocb *nxt = req->link;
-
- req->link = nxt->link;
- nxt->link = NULL;
-}
-
-static bool io_kill_linked_timeout(struct io_kiocb *req)
- __must_hold(&req->ctx->completion_lock)
- __must_hold(&req->ctx->timeout_lock)
-{
- struct io_kiocb *link = req->link;
-
- if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
- struct io_timeout_data *io = link->async_data;
-
- io_remove_next_linked(req);
- link->timeout.head = NULL;
- if (hrtimer_try_to_cancel(&io->timer) != -1) {
- list_del(&link->timeout.list);
- /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
- io_fill_cqe_req(link, -ECANCELED, 0);
- io_put_req_deferred(link);
- return true;
- }
- }
- return false;
-}
-
-static void io_fail_links(struct io_kiocb *req)
- __must_hold(&req->ctx->completion_lock)
-{
- struct io_kiocb *nxt, *link = req->link;
- bool ignore_cqes = req->flags & REQ_F_SKIP_LINK_CQES;
-
- req->link = NULL;
- while (link) {
- long res = -ECANCELED;
-
- if (link->flags & REQ_F_FAIL)
- res = link->result;
-
- nxt = link->link;
- link->link = NULL;
-
- trace_io_uring_fail_link(req->ctx, req, req->user_data,
- req->opcode, link);
-
- if (!ignore_cqes) {
- link->flags &= ~REQ_F_CQE_SKIP;
- io_fill_cqe_req(link, res, 0);
- }
- io_put_req_deferred(link);
- link = nxt;
- }
-}
-
-static bool io_disarm_next(struct io_kiocb *req)
- __must_hold(&req->ctx->completion_lock)
-{
- bool posted = false;
-
- if (req->flags & REQ_F_ARM_LTIMEOUT) {
- struct io_kiocb *link = req->link;
-
- req->flags &= ~REQ_F_ARM_LTIMEOUT;
- if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
- io_remove_next_linked(req);
- /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
- io_fill_cqe_req(link, -ECANCELED, 0);
- io_put_req_deferred(link);
- posted = true;
- }
- } else if (req->flags & REQ_F_LINK_TIMEOUT) {
- struct io_ring_ctx *ctx = req->ctx;
-
- spin_lock_irq(&ctx->timeout_lock);
- posted = io_kill_linked_timeout(req);
- spin_unlock_irq(&ctx->timeout_lock);
- }
- if (unlikely((req->flags & REQ_F_FAIL) &&
- !(req->flags & REQ_F_HARDLINK))) {
- posted |= (req->link != NULL);
- io_fail_links(req);
- }
- return posted;
-}
-
-static void __io_req_find_next_prep(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
- bool posted;
-
- spin_lock(&ctx->completion_lock);
- posted = io_disarm_next(req);
- if (posted)
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- if (posted)
- io_cqring_ev_posted(ctx);
-}
-
-static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
-{
- struct io_kiocb *nxt;
-
- if (likely(!(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))))
- return NULL;
- /*
- * If LINK is set, we have dependent requests in this chain. If we
- * didn't fail this request, queue the first one up, moving any other
- * dependencies to the next request. In case of failure, fail the rest
- * of the chain.
- */
- if (unlikely(req->flags & IO_DISARM_MASK))
- __io_req_find_next_prep(req);
- nxt = req->link;
- req->link = NULL;
- return nxt;
-}
-
-static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
-{
- if (!ctx)
- return;
- if (*locked) {
- io_submit_flush_completions(ctx);
- mutex_unlock(&ctx->uring_lock);
- *locked = false;
- }
- percpu_ref_put(&ctx->refs);
-}
-
-static inline void ctx_commit_and_unlock(struct io_ring_ctx *ctx)
-{
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- io_cqring_ev_posted(ctx);
-}
-
-static void handle_prev_tw_list(struct io_wq_work_node *node,
- struct io_ring_ctx **ctx, bool *uring_locked)
-{
- if (*ctx && !*uring_locked)
- spin_lock(&(*ctx)->completion_lock);
-
- do {
- struct io_wq_work_node *next = node->next;
- struct io_kiocb *req = container_of(node, struct io_kiocb,
- io_task_work.node);
-
- prefetch(container_of(next, struct io_kiocb, io_task_work.node));
-
- if (req->ctx != *ctx) {
- if (unlikely(!*uring_locked && *ctx))
- ctx_commit_and_unlock(*ctx);
-
- ctx_flush_and_put(*ctx, uring_locked);
- *ctx = req->ctx;
- /* if not contended, grab and improve batching */
- *uring_locked = mutex_trylock(&(*ctx)->uring_lock);
- percpu_ref_get(&(*ctx)->refs);
- if (unlikely(!*uring_locked))
- spin_lock(&(*ctx)->completion_lock);
- }
- if (likely(*uring_locked))
- req->io_task_work.func(req, uring_locked);
- else
- __io_req_complete_post(req, req->result,
- io_put_kbuf_comp(req));
- node = next;
- } while (node);
-
- if (unlikely(!*uring_locked))
- ctx_commit_and_unlock(*ctx);
-}
-
-static void handle_tw_list(struct io_wq_work_node *node,
- struct io_ring_ctx **ctx, bool *locked)
-{
- do {
- struct io_wq_work_node *next = node->next;
- struct io_kiocb *req = container_of(node, struct io_kiocb,
- io_task_work.node);
-
- prefetch(container_of(next, struct io_kiocb, io_task_work.node));
-
- if (req->ctx != *ctx) {
- ctx_flush_and_put(*ctx, locked);
- *ctx = req->ctx;
- /* if not contended, grab and improve batching */
- *locked = mutex_trylock(&(*ctx)->uring_lock);
- percpu_ref_get(&(*ctx)->refs);
- }
- req->io_task_work.func(req, locked);
- node = next;
- } while (node);
-}
-
-static void tctx_task_work(struct callback_head *cb)
-{
- bool uring_locked = false;
- struct io_ring_ctx *ctx = NULL;
- struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
- task_work);
-
- while (1) {
- struct io_wq_work_node *node1, *node2;
-
- if (!tctx->task_list.first &&
- !tctx->prior_task_list.first && uring_locked)
- io_submit_flush_completions(ctx);
-
- spin_lock_irq(&tctx->task_lock);
- node1 = tctx->prior_task_list.first;
- node2 = tctx->task_list.first;
- INIT_WQ_LIST(&tctx->task_list);
- INIT_WQ_LIST(&tctx->prior_task_list);
- if (!node2 && !node1)
- tctx->task_running = false;
- spin_unlock_irq(&tctx->task_lock);
- if (!node2 && !node1)
- break;
-
- if (node1)
- handle_prev_tw_list(node1, &ctx, &uring_locked);
-
- if (node2)
- handle_tw_list(node2, &ctx, &uring_locked);
- cond_resched();
- }
-
- ctx_flush_and_put(ctx, &uring_locked);
-
- /* relaxed read is enough as only the task itself sets ->in_idle */
- if (unlikely(atomic_read(&tctx->in_idle)))
- io_uring_drop_tctx_refs(current);
-}
-
-static void io_req_task_work_add(struct io_kiocb *req, bool priority)
-{
- struct task_struct *tsk = req->task;
- struct io_uring_task *tctx = tsk->io_uring;
- enum task_work_notify_mode notify;
- struct io_wq_work_node *node;
- unsigned long flags;
- bool running;
-
- WARN_ON_ONCE(!tctx);
-
- spin_lock_irqsave(&tctx->task_lock, flags);
- if (priority)
- wq_list_add_tail(&req->io_task_work.node, &tctx->prior_task_list);
- else
- wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
- running = tctx->task_running;
- if (!running)
- tctx->task_running = true;
- spin_unlock_irqrestore(&tctx->task_lock, flags);
-
- /* task_work already pending, we're done */
- if (running)
- return;
-
- /*
- * SQPOLL kernel thread doesn't need notification, just a wakeup. For
- * all other cases, use TWA_SIGNAL unconditionally to ensure we're
- * processing task_work. There's no reliable way to tell if TWA_RESUME
- * will do the job.
- */
- notify = (req->ctx->flags & IORING_SETUP_SQPOLL) ? TWA_NONE : TWA_SIGNAL;
- if (likely(!task_work_add(tsk, &tctx->task_work, notify))) {
- if (notify == TWA_NONE)
- wake_up_process(tsk);
- return;
- }
-
- spin_lock_irqsave(&tctx->task_lock, flags);
- tctx->task_running = false;
- node = wq_list_merge(&tctx->prior_task_list, &tctx->task_list);
- spin_unlock_irqrestore(&tctx->task_lock, flags);
-
- while (node) {
- req = container_of(node, struct io_kiocb, io_task_work.node);
- node = node->next;
- if (llist_add(&req->io_task_work.fallback_node,
- &req->ctx->fallback_llist))
- schedule_delayed_work(&req->ctx->fallback_work, 1);
- }
-}
-
-static void io_req_task_cancel(struct io_kiocb *req, bool *locked)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- /* not needed for normal modes, but SQPOLL depends on it */
- io_tw_lock(ctx, locked);
- io_req_complete_failed(req, req->result);
-}
-
-static void io_req_task_submit(struct io_kiocb *req, bool *locked)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- io_tw_lock(ctx, locked);
- /* req->task == current here, checking PF_EXITING is safe */
- if (likely(!(req->task->flags & PF_EXITING)))
- __io_queue_sqe(req);
- else
- io_req_complete_failed(req, -EFAULT);
-}
-
-static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
-{
- req->result = ret;
- req->io_task_work.func = io_req_task_cancel;
- io_req_task_work_add(req, false);
-}
-
-static void io_req_task_queue(struct io_kiocb *req)
-{
- req->io_task_work.func = io_req_task_submit;
- io_req_task_work_add(req, false);
-}
-
-static void io_req_task_queue_reissue(struct io_kiocb *req)
-{
- req->io_task_work.func = io_queue_async_work;
- io_req_task_work_add(req, false);
-}
-
-static inline void io_queue_next(struct io_kiocb *req)
-{
- struct io_kiocb *nxt = io_req_find_next(req);
-
- if (nxt)
- io_req_task_queue(nxt);
-}
-
-static void io_free_req(struct io_kiocb *req)
-{
- io_queue_next(req);
- __io_free_req(req);
-}
-
-static void io_free_req_work(struct io_kiocb *req, bool *locked)
-{
- io_free_req(req);
-}
-
-static void io_free_batch_list(struct io_ring_ctx *ctx,
- struct io_wq_work_node *node)
- __must_hold(&ctx->uring_lock)
-{
- struct task_struct *task = NULL;
- int task_refs = 0;
-
- do {
- struct io_kiocb *req = container_of(node, struct io_kiocb,
- comp_list);
-
- if (unlikely(req->flags & REQ_F_REFCOUNT)) {
- node = req->comp_list.next;
- if (!req_ref_put_and_test(req))
- continue;
- }
-
- io_req_put_rsrc_locked(req, ctx);
- io_queue_next(req);
- io_dismantle_req(req);
-
- if (req->task != task) {
- if (task)
- io_put_task(task, task_refs);
- task = req->task;
- task_refs = 0;
- }
- task_refs++;
- node = req->comp_list.next;
- wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
- } while (node);
-
- if (task)
- io_put_task(task, task_refs);
-}
-
-static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
-{
- struct io_wq_work_node *node, *prev;
- struct io_submit_state *state = &ctx->submit_state;
-
- if (state->flush_cqes) {
- spin_lock(&ctx->completion_lock);
- wq_list_for_each(node, prev, &state->compl_reqs) {
- struct io_kiocb *req = container_of(node, struct io_kiocb,
- comp_list);
-
- if (!(req->flags & REQ_F_CQE_SKIP))
- __io_fill_cqe_req(req, req->result, req->cflags);
- if ((req->flags & REQ_F_POLLED) && req->apoll) {
- struct async_poll *apoll = req->apoll;
-
- if (apoll->double_poll)
- kfree(apoll->double_poll);
- list_add(&apoll->poll.wait.entry,
- &ctx->apoll_cache);
- req->flags &= ~REQ_F_POLLED;
- }
- }
-
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- io_cqring_ev_posted(ctx);
- state->flush_cqes = false;
- }
-
- io_free_batch_list(ctx, state->compl_reqs.first);
- INIT_WQ_LIST(&state->compl_reqs);
-}
-
-/*
- * Drop reference to request, return next in chain (if there is one) if this
- * was the last reference to this request.
- */
-static inline struct io_kiocb *io_put_req_find_next(struct io_kiocb *req)
-{
- struct io_kiocb *nxt = NULL;
-
- if (req_ref_put_and_test(req)) {
- nxt = io_req_find_next(req);
- __io_free_req(req);
- }
- return nxt;
-}
-
-static inline void io_put_req(struct io_kiocb *req)
-{
- if (req_ref_put_and_test(req))
- io_free_req(req);
-}
-
-static inline void io_put_req_deferred(struct io_kiocb *req)
-{
- if (req_ref_put_and_test(req)) {
- req->io_task_work.func = io_free_req_work;
- io_req_task_work_add(req, false);
- }
-}
-
-static unsigned io_cqring_events(struct io_ring_ctx *ctx)
-{
- /* See comment at the top of this file */
- smp_rmb();
- return __io_cqring_events(ctx);
-}
-
-static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
-{
- struct io_rings *rings = ctx->rings;
-
- /* make sure SQ entry isn't read before tail */
- return smp_load_acquire(&rings->sq.tail) - ctx->cached_sq_head;
-}
-
-static inline bool io_run_task_work(void)
-{
- if (test_thread_flag(TIF_NOTIFY_SIGNAL) || task_work_pending(current)) {
- __set_current_state(TASK_RUNNING);
- clear_notify_signal();
- if (task_work_pending(current))
- task_work_run();
- return true;
- }
-
- return false;
-}
-
-static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
-{
- struct io_wq_work_node *pos, *start, *prev;
- unsigned int poll_flags = BLK_POLL_NOSLEEP;
- DEFINE_IO_COMP_BATCH(iob);
- int nr_events = 0;
-
- /*
- * Only spin for completions if we don't have multiple devices hanging
- * off our complete list.
- */
- if (ctx->poll_multi_queue || force_nonspin)
- poll_flags |= BLK_POLL_ONESHOT;
-
- wq_list_for_each(pos, start, &ctx->iopoll_list) {
- struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
- struct kiocb *kiocb = &req->rw.kiocb;
- int ret;
-
- /*
- * Move completed and retryable entries to our local lists.
- * If we find a request that requires polling, break out
- * and complete those lists first, if we have entries there.
- */
- if (READ_ONCE(req->iopoll_completed))
- break;
-
- ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob, poll_flags);
- if (unlikely(ret < 0))
- return ret;
- else if (ret)
- poll_flags |= BLK_POLL_ONESHOT;
-
- /* iopoll may have completed current req */
- if (!rq_list_empty(iob.req_list) ||
- READ_ONCE(req->iopoll_completed))
- break;
- }
-
- if (!rq_list_empty(iob.req_list))
- iob.complete(&iob);
- else if (!pos)
- return 0;
-
- prev = start;
- wq_list_for_each_resume(pos, prev) {
- struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
-
- /* order with io_complete_rw_iopoll(), e.g. ->result updates */
- if (!smp_load_acquire(&req->iopoll_completed))
- break;
- nr_events++;
- if (unlikely(req->flags & REQ_F_CQE_SKIP))
- continue;
- __io_fill_cqe_req(req, req->result, io_put_kbuf(req, 0));
- }
-
- if (unlikely(!nr_events))
- return 0;
-
- io_commit_cqring(ctx);
- io_cqring_ev_posted_iopoll(ctx);
- pos = start ? start->next : ctx->iopoll_list.first;
- wq_list_cut(&ctx->iopoll_list, prev, start);
- io_free_batch_list(ctx, pos);
- return nr_events;
-}
-
-/*
- * We can't just wait for polled events to come to us, we have to actively
- * find and complete them.
- */
-static __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
-{
- if (!(ctx->flags & IORING_SETUP_IOPOLL))
- return;
-
- mutex_lock(&ctx->uring_lock);
- while (!wq_list_empty(&ctx->iopoll_list)) {
- /* let it sleep and repeat later if can't complete a request */
- if (io_do_iopoll(ctx, true) == 0)
- break;
- /*
- * Ensure we allow local-to-the-cpu processing to take place,
- * in this case we need to ensure that we reap all events.
- * Also let task_work, etc. to progress by releasing the mutex
- */
- if (need_resched()) {
- mutex_unlock(&ctx->uring_lock);
- cond_resched();
- mutex_lock(&ctx->uring_lock);
- }
- }
- mutex_unlock(&ctx->uring_lock);
-}
-
-static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
-{
- unsigned int nr_events = 0;
- int ret = 0;
-
- /*
- * We disallow the app entering submit/complete with polling, but we
- * still need to lock the ring to prevent racing with polled issue
- * that got punted to a workqueue.
- */
- mutex_lock(&ctx->uring_lock);
- /*
- * Don't enter poll loop if we already have events pending.
- * If we do, we can potentially be spinning for commands that
- * already triggered a CQE (eg in error).
- */
- if (test_bit(0, &ctx->check_cq_overflow))
- __io_cqring_overflow_flush(ctx, false);
- if (io_cqring_events(ctx))
- goto out;
- do {
- /*
- * If a submit got punted to a workqueue, we can have the
- * application entering polling for a command before it gets
- * issued. That app will hold the uring_lock for the duration
- * of the poll right here, so we need to take a breather every
- * now and then to ensure that the issue has a chance to add
- * the poll to the issued list. Otherwise we can spin here
- * forever, while the workqueue is stuck trying to acquire the
- * very same mutex.
- */
- if (wq_list_empty(&ctx->iopoll_list)) {
- u32 tail = ctx->cached_cq_tail;
-
- mutex_unlock(&ctx->uring_lock);
- io_run_task_work();
- mutex_lock(&ctx->uring_lock);
-
- /* some requests don't go through iopoll_list */
- if (tail != ctx->cached_cq_tail ||
- wq_list_empty(&ctx->iopoll_list))
- break;
- }
- ret = io_do_iopoll(ctx, !min);
- if (ret < 0)
- break;
- nr_events += ret;
- ret = 0;
- } while (nr_events < min && !need_resched());
-out:
- mutex_unlock(&ctx->uring_lock);
- return ret;
-}
-
-static void kiocb_end_write(struct io_kiocb *req)
-{
- /*
- * Tell lockdep we inherited freeze protection from submission
- * thread.
- */
- if (req->flags & REQ_F_ISREG) {
- struct super_block *sb = file_inode(req->file)->i_sb;
-
- __sb_writers_acquired(sb, SB_FREEZE_WRITE);
- sb_end_write(sb);
- }
-}
-
-#ifdef CONFIG_BLOCK
-static bool io_resubmit_prep(struct io_kiocb *req)
-{
- struct io_async_rw *rw = req->async_data;
-
- if (!req_has_async_data(req))
- return !io_req_prep_async(req);
- iov_iter_restore(&rw->s.iter, &rw->s.iter_state);
- return true;
-}
-
-static bool io_rw_should_reissue(struct io_kiocb *req)
-{
- umode_t mode = file_inode(req->file)->i_mode;
- struct io_ring_ctx *ctx = req->ctx;
-
- if (!S_ISBLK(mode) && !S_ISREG(mode))
- return false;
- if ((req->flags & REQ_F_NOWAIT) || (io_wq_current_is_worker() &&
- !(ctx->flags & IORING_SETUP_IOPOLL)))
- return false;
- /*
- * If ref is dying, we might be running poll reap from the exit work.
- * Don't attempt to reissue from that path, just let it fail with
- * -EAGAIN.
- */
- if (percpu_ref_is_dying(&ctx->refs))
- return false;
- /*
- * Play it safe and assume not safe to re-import and reissue if we're
- * not in the original thread group (or in task context).
- */
- if (!same_thread_group(req->task, current) || !in_task())
- return false;
- return true;
-}
-#else
-static bool io_resubmit_prep(struct io_kiocb *req)
-{
- return false;
-}
-static bool io_rw_should_reissue(struct io_kiocb *req)
-{
- return false;
-}
-#endif
-
-static bool __io_complete_rw_common(struct io_kiocb *req, long res)
-{
- if (req->rw.kiocb.ki_flags & IOCB_WRITE) {
- kiocb_end_write(req);
- fsnotify_modify(req->file);
- } else {
- fsnotify_access(req->file);
- }
- if (unlikely(res != req->result)) {
- if ((res == -EAGAIN || res == -EOPNOTSUPP) &&
- io_rw_should_reissue(req)) {
- req->flags |= REQ_F_REISSUE;
- return true;
- }
- req_set_fail(req);
- req->result = res;
- }
- return false;
-}
-
-static inline void io_req_task_complete(struct io_kiocb *req, bool *locked)
-{
- int res = req->result;
-
- if (*locked) {
- io_req_complete_state(req, res, io_put_kbuf(req, 0));
- io_req_add_compl_list(req);
- } else {
- io_req_complete_post(req, res,
- io_put_kbuf(req, IO_URING_F_UNLOCKED));
- }
-}
-
-static void __io_complete_rw(struct io_kiocb *req, long res,
- unsigned int issue_flags)
-{
- if (__io_complete_rw_common(req, res))
- return;
- __io_req_complete(req, issue_flags, req->result,
- io_put_kbuf(req, issue_flags));
-}
-
-static void io_complete_rw(struct kiocb *kiocb, long res)
-{
- struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
-
- if (__io_complete_rw_common(req, res))
- return;
- req->result = res;
- req->io_task_work.func = io_req_task_complete;
- io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
-}
-
-static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
-{
- struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
-
- if (kiocb->ki_flags & IOCB_WRITE)
- kiocb_end_write(req);
- if (unlikely(res != req->result)) {
- if (res == -EAGAIN && io_rw_should_reissue(req)) {
- req->flags |= REQ_F_REISSUE;
- return;
- }
- req->result = res;
- }
-
- /* order with io_iopoll_complete() checking ->iopoll_completed */
- smp_store_release(&req->iopoll_completed, 1);
-}
-
-/*
- * After the iocb has been issued, it's safe to be found on the poll list.
- * Adding the kiocb to the list AFTER submission ensures that we don't
- * find it from a io_do_iopoll() thread before the issuer is done
- * accessing the kiocb cookie.
- */
-static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- const bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
-
- /* workqueue context doesn't hold uring_lock, grab it now */
- if (unlikely(needs_lock))
- mutex_lock(&ctx->uring_lock);
-
- /*
- * Track whether we have multiple files in our lists. This will impact
- * how we do polling eventually, not spinning if we're on potentially
- * different devices.
- */
- if (wq_list_empty(&ctx->iopoll_list)) {
- ctx->poll_multi_queue = false;
- } else if (!ctx->poll_multi_queue) {
- struct io_kiocb *list_req;
-
- list_req = container_of(ctx->iopoll_list.first, struct io_kiocb,
- comp_list);
- if (list_req->file != req->file)
- ctx->poll_multi_queue = true;
- }
-
- /*
- * For fast devices, IO may have already completed. If it has, add
- * it to the front so we find it first.
- */
- if (READ_ONCE(req->iopoll_completed))
- wq_list_add_head(&req->comp_list, &ctx->iopoll_list);
- else
- wq_list_add_tail(&req->comp_list, &ctx->iopoll_list);
-
- if (unlikely(needs_lock)) {
- /*
- * If IORING_SETUP_SQPOLL is enabled, sqes are either handle
- * in sq thread task context or in io worker task context. If
- * current task context is sq thread, we don't need to check
- * whether should wake up sq thread.
- */
- if ((ctx->flags & IORING_SETUP_SQPOLL) &&
- wq_has_sleeper(&ctx->sq_data->wait))
- wake_up(&ctx->sq_data->wait);
-
- mutex_unlock(&ctx->uring_lock);
- }
-}
-
-static bool io_bdev_nowait(struct block_device *bdev)
-{
- return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
-}
-
-/*
- * If we tracked the file through the SCM inflight mechanism, we could support
- * any file. For now, just ensure that anything potentially problematic is done
- * inline.
- */
-static bool __io_file_supports_nowait(struct file *file, umode_t mode)
-{
- if (S_ISBLK(mode)) {
- if (IS_ENABLED(CONFIG_BLOCK) &&
- io_bdev_nowait(I_BDEV(file->f_mapping->host)))
- return true;
- return false;
- }
- if (S_ISSOCK(mode))
- return true;
- if (S_ISREG(mode)) {
- if (IS_ENABLED(CONFIG_BLOCK) &&
- io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
- file->f_op != &io_uring_fops)
- return true;
- return false;
- }
-
- /* any ->read/write should understand O_NONBLOCK */
- if (file->f_flags & O_NONBLOCK)
- return true;
- return file->f_mode & FMODE_NOWAIT;
-}
-
-/*
- * If we tracked the file through the SCM inflight mechanism, we could support
- * any file. For now, just ensure that anything potentially problematic is done
- * inline.
- */
-static unsigned int io_file_get_flags(struct file *file)
-{
- umode_t mode = file_inode(file)->i_mode;
- unsigned int res = 0;
-
- if (S_ISREG(mode))
- res |= FFS_ISREG;
- if (__io_file_supports_nowait(file, mode))
- res |= FFS_NOWAIT;
- return res;
-}
-
-static inline bool io_file_supports_nowait(struct io_kiocb *req)
-{
- return req->flags & REQ_F_SUPPORT_NOWAIT;
-}
-
-static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct kiocb *kiocb = &req->rw.kiocb;
- unsigned ioprio;
- int ret;
-
- kiocb->ki_pos = READ_ONCE(sqe->off);
- /* used for fixed read/write too - just read unconditionally */
- req->buf_index = READ_ONCE(sqe->buf_index);
- req->imu = NULL;
-
- if (req->opcode == IORING_OP_READ_FIXED ||
- req->opcode == IORING_OP_WRITE_FIXED) {
- struct io_ring_ctx *ctx = req->ctx;
- u16 index;
-
- if (unlikely(req->buf_index >= ctx->nr_user_bufs))
- return -EFAULT;
- index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
- req->imu = ctx->user_bufs[index];
- io_req_set_rsrc_node(req, ctx, 0);
- }
-
- ioprio = READ_ONCE(sqe->ioprio);
- if (ioprio) {
- ret = ioprio_check_cap(ioprio);
- if (ret)
- return ret;
-
- kiocb->ki_ioprio = ioprio;
- } else {
- kiocb->ki_ioprio = get_current_ioprio();
- }
-
- req->rw.addr = READ_ONCE(sqe->addr);
- req->rw.len = READ_ONCE(sqe->len);
- req->rw.flags = READ_ONCE(sqe->rw_flags);
- return 0;
-}
-
-static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
-{
- switch (ret) {
- case -EIOCBQUEUED:
- break;
- case -ERESTARTSYS:
- case -ERESTARTNOINTR:
- case -ERESTARTNOHAND:
- case -ERESTART_RESTARTBLOCK:
- /*
- * We can't just restart the syscall, since previously
- * submitted sqes may already be in progress. Just fail this
- * IO with EINTR.
- */
- ret = -EINTR;
- fallthrough;
- default:
- kiocb->ki_complete(kiocb, ret);
- }
-}
-
-static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
-{
- struct kiocb *kiocb = &req->rw.kiocb;
-
- if (kiocb->ki_pos != -1)
- return &kiocb->ki_pos;
-
- if (!(req->file->f_mode & FMODE_STREAM)) {
- req->flags |= REQ_F_CUR_POS;
- kiocb->ki_pos = req->file->f_pos;
- return &kiocb->ki_pos;
- }
-
- kiocb->ki_pos = 0;
- return NULL;
-}
-
-static void kiocb_done(struct io_kiocb *req, ssize_t ret,
- unsigned int issue_flags)
-{
- struct io_async_rw *io = req->async_data;
-
- /* add previously done IO, if any */
- if (req_has_async_data(req) && io->bytes_done > 0) {
- if (ret < 0)
- ret = io->bytes_done;
- else
- ret += io->bytes_done;
- }
-
- if (req->flags & REQ_F_CUR_POS)
- req->file->f_pos = req->rw.kiocb.ki_pos;
- if (ret >= 0 && (req->rw.kiocb.ki_complete == io_complete_rw))
- __io_complete_rw(req, ret, issue_flags);
- else
- io_rw_done(&req->rw.kiocb, ret);
-
- if (req->flags & REQ_F_REISSUE) {
- req->flags &= ~REQ_F_REISSUE;
- if (io_resubmit_prep(req))
- io_req_task_queue_reissue(req);
- else
- io_req_task_queue_fail(req, ret);
- }
-}
-
-static int __io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
- struct io_mapped_ubuf *imu)
-{
- size_t len = req->rw.len;
- u64 buf_end, buf_addr = req->rw.addr;
- size_t offset;
-
- if (unlikely(check_add_overflow(buf_addr, (u64)len, &buf_end)))
- return -EFAULT;
- /* not inside the mapped region */
- if (unlikely(buf_addr < imu->ubuf || buf_end > imu->ubuf_end))
- return -EFAULT;
-
- /*
- * May not be a start of buffer, set size appropriately
- * and advance us to the beginning.
- */
- offset = buf_addr - imu->ubuf;
- iov_iter_bvec(iter, rw, imu->bvec, imu->nr_bvecs, offset + len);
-
- if (offset) {
- /*
- * Don't use iov_iter_advance() here, as it's really slow for
- * using the latter parts of a big fixed buffer - it iterates
- * over each segment manually. We can cheat a bit here, because
- * we know that:
- *
- * 1) it's a BVEC iter, we set it up
- * 2) all bvecs are PAGE_SIZE in size, except potentially the
- * first and last bvec
- *
- * So just find our index, and adjust the iterator afterwards.
- * If the offset is within the first bvec (or the whole first
- * bvec, just use iov_iter_advance(). This makes it easier
- * since we can just skip the first segment, which may not
- * be PAGE_SIZE aligned.
- */
- const struct bio_vec *bvec = imu->bvec;
-
- if (offset <= bvec->bv_len) {
- iov_iter_advance(iter, offset);
- } else {
- unsigned long seg_skip;
-
- /* skip first vec */
- offset -= bvec->bv_len;
- seg_skip = 1 + (offset >> PAGE_SHIFT);
-
- iter->bvec = bvec + seg_skip;
- iter->nr_segs -= seg_skip;
- iter->count -= bvec->bv_len + offset;
- iter->iov_offset = offset & ~PAGE_MASK;
- }
- }
-
- return 0;
-}
-
-static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
- unsigned int issue_flags)
-{
- if (WARN_ON_ONCE(!req->imu))
- return -EFAULT;
- return __io_import_fixed(req, rw, iter, req->imu);
-}
-
-static void io_ring_submit_unlock(struct io_ring_ctx *ctx, bool needs_lock)
-{
- if (needs_lock)
- mutex_unlock(&ctx->uring_lock);
-}
-
-static void io_ring_submit_lock(struct io_ring_ctx *ctx, bool needs_lock)
-{
- /*
- * "Normal" inline submissions always hold the uring_lock, since we
- * grab it from the system call. Same is true for the SQPOLL offload.
- * The only exception is when we've detached the request and issue it
- * from an async worker thread, grab the lock for that case.
- */
- if (needs_lock)
- mutex_lock(&ctx->uring_lock);
-}
-
-static void io_buffer_add_list(struct io_ring_ctx *ctx,
- struct io_buffer_list *bl, unsigned int bgid)
-{
- struct list_head *list;
-
- list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
- INIT_LIST_HEAD(&bl->buf_list);
- bl->bgid = bgid;
- list_add(&bl->list, list);
-}
-
-static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
- int bgid, unsigned int issue_flags)
-{
- struct io_buffer *kbuf = req->kbuf;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
- struct io_ring_ctx *ctx = req->ctx;
- struct io_buffer_list *bl;
-
- if (req->flags & REQ_F_BUFFER_SELECTED)
- return kbuf;
-
- io_ring_submit_lock(ctx, needs_lock);
-
- lockdep_assert_held(&ctx->uring_lock);
-
- bl = io_buffer_get_list(ctx, bgid);
- if (bl && !list_empty(&bl->buf_list)) {
- kbuf = list_first_entry(&bl->buf_list, struct io_buffer, list);
- list_del(&kbuf->list);
- if (*len > kbuf->len)
- *len = kbuf->len;
- req->flags |= REQ_F_BUFFER_SELECTED;
- req->kbuf = kbuf;
- } else {
- kbuf = ERR_PTR(-ENOBUFS);
- }
-
- io_ring_submit_unlock(req->ctx, needs_lock);
- return kbuf;
-}
-
-static void __user *io_rw_buffer_select(struct io_kiocb *req, size_t *len,
- unsigned int issue_flags)
-{
- struct io_buffer *kbuf;
- u16 bgid;
-
- bgid = req->buf_index;
- kbuf = io_buffer_select(req, len, bgid, issue_flags);
- if (IS_ERR(kbuf))
- return kbuf;
- return u64_to_user_ptr(kbuf->addr);
-}
-
-#ifdef CONFIG_COMPAT
-static ssize_t io_compat_import(struct io_kiocb *req, struct iovec *iov,
- unsigned int issue_flags)
-{
- struct compat_iovec __user *uiov;
- compat_ssize_t clen;
- void __user *buf;
- ssize_t len;
-
- uiov = u64_to_user_ptr(req->rw.addr);
- if (!access_ok(uiov, sizeof(*uiov)))
- return -EFAULT;
- if (__get_user(clen, &uiov->iov_len))
- return -EFAULT;
- if (clen < 0)
- return -EINVAL;
-
- len = clen;
- buf = io_rw_buffer_select(req, &len, issue_flags);
- if (IS_ERR(buf))
- return PTR_ERR(buf);
- iov[0].iov_base = buf;
- iov[0].iov_len = (compat_size_t) len;
- return 0;
-}
-#endif
-
-static ssize_t __io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
- unsigned int issue_flags)
-{
- struct iovec __user *uiov = u64_to_user_ptr(req->rw.addr);
- void __user *buf;
- ssize_t len;
-
- if (copy_from_user(iov, uiov, sizeof(*uiov)))
- return -EFAULT;
-
- len = iov[0].iov_len;
- if (len < 0)
- return -EINVAL;
- buf = io_rw_buffer_select(req, &len, issue_flags);
- if (IS_ERR(buf))
- return PTR_ERR(buf);
- iov[0].iov_base = buf;
- iov[0].iov_len = len;
- return 0;
-}
-
-static ssize_t io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
- unsigned int issue_flags)
-{
- if (req->flags & REQ_F_BUFFER_SELECTED) {
- struct io_buffer *kbuf = req->kbuf;
-
- iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
- iov[0].iov_len = kbuf->len;
- return 0;
- }
- if (req->rw.len != 1)
- return -EINVAL;
-
-#ifdef CONFIG_COMPAT
- if (req->ctx->compat)
- return io_compat_import(req, iov, issue_flags);
-#endif
-
- return __io_iov_buffer_select(req, iov, issue_flags);
-}
-
-static inline bool io_do_buffer_select(struct io_kiocb *req)
-{
- if (!(req->flags & REQ_F_BUFFER_SELECT))
- return false;
- return !(req->flags & REQ_F_BUFFER_SELECTED);
-}
-
-static struct iovec *__io_import_iovec(int rw, struct io_kiocb *req,
- struct io_rw_state *s,
- unsigned int issue_flags)
-{
- struct iov_iter *iter = &s->iter;
- u8 opcode = req->opcode;
- struct iovec *iovec;
- void __user *buf;
- size_t sqe_len;
- ssize_t ret;
-
- if (opcode == IORING_OP_READ_FIXED || opcode == IORING_OP_WRITE_FIXED) {
- ret = io_import_fixed(req, rw, iter, issue_flags);
- if (ret)
- return ERR_PTR(ret);
- return NULL;
- }
-
- /* buffer index only valid with fixed read/write, or buffer select */
- if (unlikely(req->buf_index && !(req->flags & REQ_F_BUFFER_SELECT)))
- return ERR_PTR(-EINVAL);
-
- buf = u64_to_user_ptr(req->rw.addr);
- sqe_len = req->rw.len;
-
- if (opcode == IORING_OP_READ || opcode == IORING_OP_WRITE) {
- if (req->flags & REQ_F_BUFFER_SELECT) {
- buf = io_rw_buffer_select(req, &sqe_len, issue_flags);
- if (IS_ERR(buf))
- return ERR_CAST(buf);
- req->rw.len = sqe_len;
- }
-
- ret = import_single_range(rw, buf, sqe_len, s->fast_iov, iter);
- if (ret)
- return ERR_PTR(ret);
- return NULL;
- }
-
- iovec = s->fast_iov;
- if (req->flags & REQ_F_BUFFER_SELECT) {
- ret = io_iov_buffer_select(req, iovec, issue_flags);
- if (ret)
- return ERR_PTR(ret);
- iov_iter_init(iter, rw, iovec, 1, iovec->iov_len);
- return NULL;
- }
-
- ret = __import_iovec(rw, buf, sqe_len, UIO_FASTIOV, &iovec, iter,
- req->ctx->compat);
- if (unlikely(ret < 0))
- return ERR_PTR(ret);
- return iovec;
-}
-
-static inline int io_import_iovec(int rw, struct io_kiocb *req,
- struct iovec **iovec, struct io_rw_state *s,
- unsigned int issue_flags)
-{
- *iovec = __io_import_iovec(rw, req, s, issue_flags);
- if (unlikely(IS_ERR(*iovec)))
- return PTR_ERR(*iovec);
-
- iov_iter_save_state(&s->iter, &s->iter_state);
- return 0;
-}
-
-static inline loff_t *io_kiocb_ppos(struct kiocb *kiocb)
-{
- return (kiocb->ki_filp->f_mode & FMODE_STREAM) ? NULL : &kiocb->ki_pos;
-}
-
-/*
- * For files that don't have ->read_iter() and ->write_iter(), handle them
- * by looping over ->read() or ->write() manually.
- */
-static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
-{
- struct kiocb *kiocb = &req->rw.kiocb;
- struct file *file = req->file;
- ssize_t ret = 0;
- loff_t *ppos;
-
- /*
- * Don't support polled IO through this interface, and we can't
- * support non-blocking either. For the latter, this just causes
- * the kiocb to be handled from an async context.
- */
- if (kiocb->ki_flags & IOCB_HIPRI)
- return -EOPNOTSUPP;
- if ((kiocb->ki_flags & IOCB_NOWAIT) &&
- !(kiocb->ki_filp->f_flags & O_NONBLOCK))
- return -EAGAIN;
-
- ppos = io_kiocb_ppos(kiocb);
-
- while (iov_iter_count(iter)) {
- struct iovec iovec;
- ssize_t nr;
-
- if (!iov_iter_is_bvec(iter)) {
- iovec = iov_iter_iovec(iter);
- } else {
- iovec.iov_base = u64_to_user_ptr(req->rw.addr);
- iovec.iov_len = req->rw.len;
- }
-
- if (rw == READ) {
- nr = file->f_op->read(file, iovec.iov_base,
- iovec.iov_len, ppos);
- } else {
- nr = file->f_op->write(file, iovec.iov_base,
- iovec.iov_len, ppos);
- }
-
- if (nr < 0) {
- if (!ret)
- ret = nr;
- break;
- }
- ret += nr;
- if (!iov_iter_is_bvec(iter)) {
- iov_iter_advance(iter, nr);
- } else {
- req->rw.addr += nr;
- req->rw.len -= nr;
- if (!req->rw.len)
- break;
- }
- if (nr != iovec.iov_len)
- break;
- }
-
- return ret;
-}
-
-static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
- const struct iovec *fast_iov, struct iov_iter *iter)
-{
- struct io_async_rw *rw = req->async_data;
-
- memcpy(&rw->s.iter, iter, sizeof(*iter));
- rw->free_iovec = iovec;
- rw->bytes_done = 0;
- /* can only be fixed buffers, no need to do anything */
- if (iov_iter_is_bvec(iter))
- return;
- if (!iovec) {
- unsigned iov_off = 0;
-
- rw->s.iter.iov = rw->s.fast_iov;
- if (iter->iov != fast_iov) {
- iov_off = iter->iov - fast_iov;
- rw->s.iter.iov += iov_off;
- }
- if (rw->s.fast_iov != fast_iov)
- memcpy(rw->s.fast_iov + iov_off, fast_iov + iov_off,
- sizeof(struct iovec) * iter->nr_segs);
- } else {
- req->flags |= REQ_F_NEED_CLEANUP;
- }
-}
-
-static inline bool io_alloc_async_data(struct io_kiocb *req)
-{
- WARN_ON_ONCE(!io_op_defs[req->opcode].async_size);
- req->async_data = kmalloc(io_op_defs[req->opcode].async_size, GFP_KERNEL);
- if (req->async_data) {
- req->flags |= REQ_F_ASYNC_DATA;
- return false;
- }
- return true;
-}
-
-static int io_setup_async_rw(struct io_kiocb *req, const struct iovec *iovec,
- struct io_rw_state *s, bool force)
-{
- if (!force && !io_op_defs[req->opcode].needs_async_setup)
- return 0;
- if (!req_has_async_data(req)) {
- struct io_async_rw *iorw;
-
- if (io_alloc_async_data(req)) {
- kfree(iovec);
- return -ENOMEM;
- }
-
- io_req_map_rw(req, iovec, s->fast_iov, &s->iter);
- iorw = req->async_data;
- /* we've copied and mapped the iter, ensure state is saved */
- iov_iter_save_state(&iorw->s.iter, &iorw->s.iter_state);
- }
- return 0;
-}
-
-static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
-{
- struct io_async_rw *iorw = req->async_data;
- struct iovec *iov;
- int ret;
-
- /* submission path, ->uring_lock should already be taken */
- ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
- if (unlikely(ret < 0))
- return ret;
-
- iorw->bytes_done = 0;
- iorw->free_iovec = iov;
- if (iov)
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-/*
- * This is our waitqueue callback handler, registered through __folio_lock_async()
- * when we initially tried to do the IO with the iocb armed our waitqueue.
- * This gets called when the page is unlocked, and we generally expect that to
- * happen when the page IO is completed and the page is now uptodate. This will
- * queue a task_work based retry of the operation, attempting to copy the data
- * again. If the latter fails because the page was NOT uptodate, then we will
- * do a thread based blocking retry of the operation. That's the unexpected
- * slow path.
- */
-static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
- int sync, void *arg)
-{
- struct wait_page_queue *wpq;
- struct io_kiocb *req = wait->private;
- struct wait_page_key *key = arg;
-
- wpq = container_of(wait, struct wait_page_queue, wait);
-
- if (!wake_page_match(wpq, key))
- return 0;
-
- req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
- list_del_init(&wait->entry);
- io_req_task_queue(req);
- return 1;
-}
-
-/*
- * This controls whether a given IO request should be armed for async page
- * based retry. If we return false here, the request is handed to the async
- * worker threads for retry. If we're doing buffered reads on a regular file,
- * we prepare a private wait_page_queue entry and retry the operation. This
- * will either succeed because the page is now uptodate and unlocked, or it
- * will register a callback when the page is unlocked at IO completion. Through
- * that callback, io_uring uses task_work to setup a retry of the operation.
- * That retry will attempt the buffered read again. The retry will generally
- * succeed, or in rare cases where it fails, we then fall back to using the
- * async worker threads for a blocking retry.
- */
-static bool io_rw_should_retry(struct io_kiocb *req)
-{
- struct io_async_rw *rw = req->async_data;
- struct wait_page_queue *wait = &rw->wpq;
- struct kiocb *kiocb = &req->rw.kiocb;
-
- /* never retry for NOWAIT, we just complete with -EAGAIN */
- if (req->flags & REQ_F_NOWAIT)
- return false;
-
- /* Only for buffered IO */
- if (kiocb->ki_flags & (IOCB_DIRECT | IOCB_HIPRI))
- return false;
-
- /*
- * just use poll if we can, and don't attempt if the fs doesn't
- * support callback based unlocks
- */
- if (file_can_poll(req->file) || !(req->file->f_mode & FMODE_BUF_RASYNC))
- return false;
-
- wait->wait.func = io_async_buf_func;
- wait->wait.private = req;
- wait->wait.flags = 0;
- INIT_LIST_HEAD(&wait->wait.entry);
- kiocb->ki_flags |= IOCB_WAITQ;
- kiocb->ki_flags &= ~IOCB_NOWAIT;
- kiocb->ki_waitq = wait;
- return true;
-}
-
-static inline int io_iter_do_read(struct io_kiocb *req, struct iov_iter *iter)
-{
- if (likely(req->file->f_op->read_iter))
- return call_read_iter(req->file, &req->rw.kiocb, iter);
- else if (req->file->f_op->read)
- return loop_rw_iter(READ, req, iter);
- else
- return -EINVAL;
-}
-
-static bool need_read_all(struct io_kiocb *req)
-{
- return req->flags & REQ_F_ISREG ||
- S_ISBLK(file_inode(req->file)->i_mode);
-}
-
-static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
-{
- struct kiocb *kiocb = &req->rw.kiocb;
- struct io_ring_ctx *ctx = req->ctx;
- struct file *file = req->file;
- int ret;
-
- if (unlikely(!file || !(file->f_mode & mode)))
- return -EBADF;
-
- if (!io_req_ffs_set(req))
- req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
-
- kiocb->ki_flags = iocb_flags(file);
- ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
- if (unlikely(ret))
- return ret;
-
- /*
- * If the file is marked O_NONBLOCK, still allow retry for it if it
- * supports async. Otherwise it's impossible to use O_NONBLOCK files
- * reliably. If not, or it IOCB_NOWAIT is set, don't retry.
- */
- if ((kiocb->ki_flags & IOCB_NOWAIT) ||
- ((file->f_flags & O_NONBLOCK) && !io_file_supports_nowait(req)))
- req->flags |= REQ_F_NOWAIT;
-
- if (ctx->flags & IORING_SETUP_IOPOLL) {
- if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll)
- return -EOPNOTSUPP;
-
- kiocb->private = NULL;
- kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
- kiocb->ki_complete = io_complete_rw_iopoll;
- req->iopoll_completed = 0;
- } else {
- if (kiocb->ki_flags & IOCB_HIPRI)
- return -EINVAL;
- kiocb->ki_complete = io_complete_rw;
- }
-
- return 0;
-}
-
-static int io_read(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_rw_state __s, *s = &__s;
- struct iovec *iovec;
- struct kiocb *kiocb = &req->rw.kiocb;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
- struct io_async_rw *rw;
- ssize_t ret, ret2;
- loff_t *ppos;
-
- if (!req_has_async_data(req)) {
- ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
- if (unlikely(ret < 0))
- return ret;
- } else {
- rw = req->async_data;
- s = &rw->s;
-
- /*
- * Safe and required to re-import if we're using provided
- * buffers, as we dropped the selected one before retry.
- */
- if (io_do_buffer_select(req)) {
- ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
- if (unlikely(ret < 0))
- return ret;
- }
-
- /*
- * We come here from an earlier attempt, restore our state to
- * match in case it doesn't. It's cheap enough that we don't
- * need to make this conditional.
- */
- iov_iter_restore(&s->iter, &s->iter_state);
- iovec = NULL;
- }
- ret = io_rw_init_file(req, FMODE_READ);
- if (unlikely(ret)) {
- kfree(iovec);
- return ret;
- }
- req->result = iov_iter_count(&s->iter);
-
- if (force_nonblock) {
- /* If the file doesn't support async, just async punt */
- if (unlikely(!io_file_supports_nowait(req))) {
- ret = io_setup_async_rw(req, iovec, s, true);
- return ret ?: -EAGAIN;
- }
- kiocb->ki_flags |= IOCB_NOWAIT;
- } else {
- /* Ensure we clear previously set non-block flag */
- kiocb->ki_flags &= ~IOCB_NOWAIT;
- }
-
- ppos = io_kiocb_update_pos(req);
-
- ret = rw_verify_area(READ, req->file, ppos, req->result);
- if (unlikely(ret)) {
- kfree(iovec);
- return ret;
- }
-
- ret = io_iter_do_read(req, &s->iter);
-
- if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
- req->flags &= ~REQ_F_REISSUE;
- /* if we can poll, just do that */
- if (req->opcode == IORING_OP_READ && file_can_poll(req->file))
- return -EAGAIN;
- /* IOPOLL retry should happen for io-wq threads */
- if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
- goto done;
- /* no retry on NONBLOCK nor RWF_NOWAIT */
- if (req->flags & REQ_F_NOWAIT)
- goto done;
- ret = 0;
- } else if (ret == -EIOCBQUEUED) {
- goto out_free;
- } else if (ret == req->result || ret <= 0 || !force_nonblock ||
- (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
- /* read all, failed, already did sync or don't want to retry */
- goto done;
- }
-
- /*
- * Don't depend on the iter state matching what was consumed, or being
- * untouched in case of error. Restore it and we'll advance it
- * manually if we need to.
- */
- iov_iter_restore(&s->iter, &s->iter_state);
-
- ret2 = io_setup_async_rw(req, iovec, s, true);
- if (ret2)
- return ret2;
-
- iovec = NULL;
- rw = req->async_data;
- s = &rw->s;
- /*
- * Now use our persistent iterator and state, if we aren't already.
- * We've restored and mapped the iter to match.
- */
-
- do {
- /*
- * We end up here because of a partial read, either from
- * above or inside this loop. Advance the iter by the bytes
- * that were consumed.
- */
- iov_iter_advance(&s->iter, ret);
- if (!iov_iter_count(&s->iter))
- break;
- rw->bytes_done += ret;
- iov_iter_save_state(&s->iter, &s->iter_state);
-
- /* if we can retry, do so with the callbacks armed */
- if (!io_rw_should_retry(req)) {
- kiocb->ki_flags &= ~IOCB_WAITQ;
- return -EAGAIN;
- }
-
- /*
- * Now retry read with the IOCB_WAITQ parts set in the iocb. If
- * we get -EIOCBQUEUED, then we'll get a notification when the
- * desired page gets unlocked. We can also get a partial read
- * here, and if we do, then just retry at the new offset.
- */
- ret = io_iter_do_read(req, &s->iter);
- if (ret == -EIOCBQUEUED)
- return 0;
- /* we got some bytes, but not all. retry. */
- kiocb->ki_flags &= ~IOCB_WAITQ;
- iov_iter_restore(&s->iter, &s->iter_state);
- } while (ret > 0);
-done:
- kiocb_done(req, ret, issue_flags);
-out_free:
- /* it's faster to check here then delegate to kfree */
- if (iovec)
- kfree(iovec);
- return 0;
-}
-
-static int io_write(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_rw_state __s, *s = &__s;
- struct iovec *iovec;
- struct kiocb *kiocb = &req->rw.kiocb;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
- ssize_t ret, ret2;
- loff_t *ppos;
-
- if (!req_has_async_data(req)) {
- ret = io_import_iovec(WRITE, req, &iovec, s, issue_flags);
- if (unlikely(ret < 0))
- return ret;
- } else {
- struct io_async_rw *rw = req->async_data;
-
- s = &rw->s;
- iov_iter_restore(&s->iter, &s->iter_state);
- iovec = NULL;
- }
- ret = io_rw_init_file(req, FMODE_WRITE);
- if (unlikely(ret)) {
- kfree(iovec);
- return ret;
- }
- req->result = iov_iter_count(&s->iter);
-
- if (force_nonblock) {
- /* If the file doesn't support async, just async punt */
- if (unlikely(!io_file_supports_nowait(req)))
- goto copy_iov;
-
- /* file path doesn't support NOWAIT for non-direct_IO */
- if (force_nonblock && !(kiocb->ki_flags & IOCB_DIRECT) &&
- (req->flags & REQ_F_ISREG))
- goto copy_iov;
-
- kiocb->ki_flags |= IOCB_NOWAIT;
- } else {
- /* Ensure we clear previously set non-block flag */
- kiocb->ki_flags &= ~IOCB_NOWAIT;
- }
-
- ppos = io_kiocb_update_pos(req);
-
- ret = rw_verify_area(WRITE, req->file, ppos, req->result);
- if (unlikely(ret))
- goto out_free;
-
- /*
- * Open-code file_start_write here to grab freeze protection,
- * which will be released by another thread in
- * io_complete_rw(). Fool lockdep by telling it the lock got
- * released so that it doesn't complain about the held lock when
- * we return to userspace.
- */
- if (req->flags & REQ_F_ISREG) {
- sb_start_write(file_inode(req->file)->i_sb);
- __sb_writers_release(file_inode(req->file)->i_sb,
- SB_FREEZE_WRITE);
- }
- kiocb->ki_flags |= IOCB_WRITE;
-
- if (likely(req->file->f_op->write_iter))
- ret2 = call_write_iter(req->file, kiocb, &s->iter);
- else if (req->file->f_op->write)
- ret2 = loop_rw_iter(WRITE, req, &s->iter);
- else
- ret2 = -EINVAL;
-
- if (req->flags & REQ_F_REISSUE) {
- req->flags &= ~REQ_F_REISSUE;
- ret2 = -EAGAIN;
- }
-
- /*
- * Raw bdev writes will return -EOPNOTSUPP for IOCB_NOWAIT. Just
- * retry them without IOCB_NOWAIT.
- */
- if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
- ret2 = -EAGAIN;
- /* no retry on NONBLOCK nor RWF_NOWAIT */
- if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
- goto done;
- if (!force_nonblock || ret2 != -EAGAIN) {
- /* IOPOLL retry should happen for io-wq threads */
- if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
- goto copy_iov;
-done:
- kiocb_done(req, ret2, issue_flags);
- } else {
-copy_iov:
- iov_iter_restore(&s->iter, &s->iter_state);
- ret = io_setup_async_rw(req, iovec, s, false);
- return ret ?: -EAGAIN;
- }
-out_free:
- /* it's reportedly faster than delegating the null check to kfree() */
- if (iovec)
- kfree(iovec);
- return ret;
-}
-
-static int io_renameat_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_rename *ren = &req->rename;
- const char __user *oldf, *newf;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->flags & REQ_F_FIXED_FILE))
- return -EBADF;
-
- ren->old_dfd = READ_ONCE(sqe->fd);
- oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
- newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
- ren->new_dfd = READ_ONCE(sqe->len);
- ren->flags = READ_ONCE(sqe->rename_flags);
-
- ren->oldpath = getname(oldf);
- if (IS_ERR(ren->oldpath))
- return PTR_ERR(ren->oldpath);
-
- ren->newpath = getname(newf);
- if (IS_ERR(ren->newpath)) {
- putname(ren->oldpath);
- return PTR_ERR(ren->newpath);
- }
-
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_renameat(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_rename *ren = &req->rename;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = do_renameat2(ren->old_dfd, ren->oldpath, ren->new_dfd,
- ren->newpath, ren->flags);
-
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_unlinkat_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_unlink *un = &req->unlink;
- const char __user *fname;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->off || sqe->len || sqe->buf_index ||
- sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->flags & REQ_F_FIXED_FILE))
- return -EBADF;
-
- un->dfd = READ_ONCE(sqe->fd);
-
- un->flags = READ_ONCE(sqe->unlink_flags);
- if (un->flags & ~AT_REMOVEDIR)
- return -EINVAL;
-
- fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
- un->filename = getname(fname);
- if (IS_ERR(un->filename))
- return PTR_ERR(un->filename);
-
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_unlinkat(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_unlink *un = &req->unlink;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- if (un->flags & AT_REMOVEDIR)
- ret = do_rmdir(un->dfd, un->filename);
- else
- ret = do_unlinkat(un->dfd, un->filename);
-
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_mkdirat_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_mkdir *mkd = &req->mkdir;
- const char __user *fname;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->off || sqe->rw_flags || sqe->buf_index ||
- sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->flags & REQ_F_FIXED_FILE))
- return -EBADF;
-
- mkd->dfd = READ_ONCE(sqe->fd);
- mkd->mode = READ_ONCE(sqe->len);
-
- fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
- mkd->filename = getname(fname);
- if (IS_ERR(mkd->filename))
- return PTR_ERR(mkd->filename);
-
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_mkdirat(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_mkdir *mkd = &req->mkdir;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = do_mkdirat(mkd->dfd, mkd->filename, mkd->mode);
-
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_symlinkat_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_symlink *sl = &req->symlink;
- const char __user *oldpath, *newpath;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->len || sqe->rw_flags || sqe->buf_index ||
- sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->flags & REQ_F_FIXED_FILE))
- return -EBADF;
-
- sl->new_dfd = READ_ONCE(sqe->fd);
- oldpath = u64_to_user_ptr(READ_ONCE(sqe->addr));
- newpath = u64_to_user_ptr(READ_ONCE(sqe->addr2));
-
- sl->oldpath = getname(oldpath);
- if (IS_ERR(sl->oldpath))
- return PTR_ERR(sl->oldpath);
-
- sl->newpath = getname(newpath);
- if (IS_ERR(sl->newpath)) {
- putname(sl->oldpath);
- return PTR_ERR(sl->newpath);
- }
-
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_symlinkat(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_symlink *sl = &req->symlink;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = do_symlinkat(sl->oldpath, sl->new_dfd, sl->newpath);
-
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_linkat_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_hardlink *lnk = &req->hardlink;
- const char __user *oldf, *newf;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->flags & REQ_F_FIXED_FILE))
- return -EBADF;
-
- lnk->old_dfd = READ_ONCE(sqe->fd);
- lnk->new_dfd = READ_ONCE(sqe->len);
- oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
- newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
- lnk->flags = READ_ONCE(sqe->hardlink_flags);
-
- lnk->oldpath = getname(oldf);
- if (IS_ERR(lnk->oldpath))
- return PTR_ERR(lnk->oldpath);
-
- lnk->newpath = getname(newf);
- if (IS_ERR(lnk->newpath)) {
- putname(lnk->oldpath);
- return PTR_ERR(lnk->newpath);
- }
-
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_linkat(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_hardlink *lnk = &req->hardlink;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = do_linkat(lnk->old_dfd, lnk->oldpath, lnk->new_dfd,
- lnk->newpath, lnk->flags);
-
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_shutdown_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
-#if defined(CONFIG_NET)
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->ioprio || sqe->off || sqe->addr || sqe->rw_flags ||
- sqe->buf_index || sqe->splice_fd_in))
- return -EINVAL;
-
- req->shutdown.how = READ_ONCE(sqe->len);
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
-static int io_shutdown(struct io_kiocb *req, unsigned int issue_flags)
-{
-#if defined(CONFIG_NET)
- struct socket *sock;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- sock = sock_from_file(req->file);
- if (unlikely(!sock))
- return -ENOTSOCK;
-
- ret = __sys_shutdown_sock(sock, req->shutdown.how);
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
-static int __io_splice_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_splice *sp = &req->splice;
- unsigned int valid_flags = SPLICE_F_FD_IN_FIXED | SPLICE_F_ALL;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
-
- sp->len = READ_ONCE(sqe->len);
- sp->flags = READ_ONCE(sqe->splice_flags);
- if (unlikely(sp->flags & ~valid_flags))
- return -EINVAL;
- sp->splice_fd_in = READ_ONCE(sqe->splice_fd_in);
- return 0;
-}
-
-static int io_tee_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- if (READ_ONCE(sqe->splice_off_in) || READ_ONCE(sqe->off))
- return -EINVAL;
- return __io_splice_prep(req, sqe);
-}
-
-static int io_tee(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_splice *sp = &req->splice;
- struct file *out = sp->file_out;
- unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
- struct file *in;
- long ret = 0;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- if (sp->flags & SPLICE_F_FD_IN_FIXED)
- in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
- else
- in = io_file_get_normal(req, sp->splice_fd_in);
- if (!in) {
- ret = -EBADF;
- goto done;
- }
-
- if (sp->len)
- ret = do_tee(in, out, sp->len, flags);
-
- if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
- io_put_file(in);
-done:
- if (ret != sp->len)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_splice *sp = &req->splice;
-
- sp->off_in = READ_ONCE(sqe->splice_off_in);
- sp->off_out = READ_ONCE(sqe->off);
- return __io_splice_prep(req, sqe);
-}
-
-static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_splice *sp = &req->splice;
- struct file *out = sp->file_out;
- unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
- loff_t *poff_in, *poff_out;
- struct file *in;
- long ret = 0;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- if (sp->flags & SPLICE_F_FD_IN_FIXED)
- in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
- else
- in = io_file_get_normal(req, sp->splice_fd_in);
- if (!in) {
- ret = -EBADF;
- goto done;
- }
-
- poff_in = (sp->off_in == -1) ? NULL : &sp->off_in;
- poff_out = (sp->off_out == -1) ? NULL : &sp->off_out;
-
- if (sp->len)
- ret = do_splice(in, poff_in, out, poff_out, sp->len, flags);
-
- if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
- io_put_file(in);
-done:
- if (ret != sp->len)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-/*
- * IORING_OP_NOP just posts a completion event, nothing else.
- */
-static int io_nop(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
-
- __io_req_complete(req, issue_flags, 0, 0);
- return 0;
-}
-
-static int io_msg_ring_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- if (unlikely(sqe->addr || sqe->ioprio || sqe->rw_flags ||
- sqe->splice_fd_in || sqe->buf_index || sqe->personality))
- return -EINVAL;
-
- req->msg.user_data = READ_ONCE(sqe->off);
- req->msg.len = READ_ONCE(sqe->len);
- return 0;
-}
-
-static int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *target_ctx;
- struct io_msg *msg = &req->msg;
- bool filled;
- int ret;
-
- ret = -EBADFD;
- if (req->file->f_op != &io_uring_fops)
- goto done;
-
- ret = -EOVERFLOW;
- target_ctx = req->file->private_data;
-
- spin_lock(&target_ctx->completion_lock);
- filled = io_fill_cqe_aux(target_ctx, msg->user_data, msg->len, 0);
- io_commit_cqring(target_ctx);
- spin_unlock(&target_ctx->completion_lock);
-
- if (filled) {
- io_cqring_ev_posted(target_ctx);
- ret = 0;
- }
-
-done:
- if (ret < 0)
- req_set_fail(req);
- __io_req_complete(req, issue_flags, ret, 0);
- /* put file to avoid an attempt to IOPOLL the req */
- io_put_file(req->file);
- req->file = NULL;
- return 0;
-}
-
-static int io_fsync_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
- sqe->splice_fd_in))
- return -EINVAL;
-
- req->sync.flags = READ_ONCE(sqe->fsync_flags);
- if (unlikely(req->sync.flags & ~IORING_FSYNC_DATASYNC))
- return -EINVAL;
-
- req->sync.off = READ_ONCE(sqe->off);
- req->sync.len = READ_ONCE(sqe->len);
- return 0;
-}
-
-static int io_fsync(struct io_kiocb *req, unsigned int issue_flags)
-{
- loff_t end = req->sync.off + req->sync.len;
- int ret;
-
- /* fsync always requires a blocking context */
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = vfs_fsync_range(req->file, req->sync.off,
- end > 0 ? end : LLONG_MAX,
- req->sync.flags & IORING_FSYNC_DATASYNC);
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_fallocate_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- if (sqe->ioprio || sqe->buf_index || sqe->rw_flags ||
- sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
-
- req->sync.off = READ_ONCE(sqe->off);
- req->sync.len = READ_ONCE(sqe->addr);
- req->sync.mode = READ_ONCE(sqe->len);
- return 0;
-}
-
-static int io_fallocate(struct io_kiocb *req, unsigned int issue_flags)
-{
- int ret;
-
- /* fallocate always requiring blocking context */
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
- ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
- req->sync.len);
- if (ret < 0)
- req_set_fail(req);
- else
- fsnotify_modify(req->file);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- const char __user *fname;
- int ret;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->ioprio || sqe->buf_index))
- return -EINVAL;
- if (unlikely(req->flags & REQ_F_FIXED_FILE))
- return -EBADF;
-
- /* open.how should be already initialised */
- if (!(req->open.how.flags & O_PATH) && force_o_largefile())
- req->open.how.flags |= O_LARGEFILE;
-
- req->open.dfd = READ_ONCE(sqe->fd);
- fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
- req->open.filename = getname(fname);
- if (IS_ERR(req->open.filename)) {
- ret = PTR_ERR(req->open.filename);
- req->open.filename = NULL;
- return ret;
- }
-
- req->open.file_slot = READ_ONCE(sqe->file_index);
- if (req->open.file_slot && (req->open.how.flags & O_CLOEXEC))
- return -EINVAL;
-
- req->open.nofile = rlimit(RLIMIT_NOFILE);
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- u64 mode = READ_ONCE(sqe->len);
- u64 flags = READ_ONCE(sqe->open_flags);
-
- req->open.how = build_open_how(flags, mode);
- return __io_openat_prep(req, sqe);
-}
-
-static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct open_how __user *how;
- size_t len;
- int ret;
-
- how = u64_to_user_ptr(READ_ONCE(sqe->addr2));
- len = READ_ONCE(sqe->len);
- if (len < OPEN_HOW_SIZE_VER0)
- return -EINVAL;
-
- ret = copy_struct_from_user(&req->open.how, sizeof(req->open.how), how,
- len);
- if (ret)
- return ret;
-
- return __io_openat_prep(req, sqe);
-}
-
-static int io_openat2(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct open_flags op;
- struct file *file;
- bool resolve_nonblock, nonblock_set;
- bool fixed = !!req->open.file_slot;
- int ret;
-
- ret = build_open_flags(&req->open.how, &op);
- if (ret)
- goto err;
- nonblock_set = op.open_flag & O_NONBLOCK;
- resolve_nonblock = req->open.how.resolve & RESOLVE_CACHED;
- if (issue_flags & IO_URING_F_NONBLOCK) {
- /*
- * Don't bother trying for O_TRUNC, O_CREAT, or O_TMPFILE open,
- * it'll always -EAGAIN
- */
- if (req->open.how.flags & (O_TRUNC | O_CREAT | O_TMPFILE))
- return -EAGAIN;
- op.lookup_flags |= LOOKUP_CACHED;
- op.open_flag |= O_NONBLOCK;
- }
-
- if (!fixed) {
- ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
- if (ret < 0)
- goto err;
- }
-
- file = do_filp_open(req->open.dfd, req->open.filename, &op);
- if (IS_ERR(file)) {
- /*
- * We could hang on to this 'fd' on retrying, but seems like
- * marginal gain for something that is now known to be a slower
- * path. So just put it, and we'll get a new one when we retry.
- */
- if (!fixed)
- put_unused_fd(ret);
-
- ret = PTR_ERR(file);
- /* only retry if RESOLVE_CACHED wasn't already set by application */
- if (ret == -EAGAIN &&
- (!resolve_nonblock && (issue_flags & IO_URING_F_NONBLOCK)))
- return -EAGAIN;
- goto err;
- }
-
- if ((issue_flags & IO_URING_F_NONBLOCK) && !nonblock_set)
- file->f_flags &= ~O_NONBLOCK;
- fsnotify_open(file);
-
- if (!fixed)
- fd_install(ret, file);
- else
- ret = io_install_fixed_file(req, file, issue_flags,
- req->open.file_slot - 1);
-err:
- putname(req->open.filename);
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
- req_set_fail(req);
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_openat(struct io_kiocb *req, unsigned int issue_flags)
-{
- return io_openat2(req, issue_flags);
-}
-
-static int io_remove_buffers_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_provide_buf *p = &req->pbuf;
- u64 tmp;
-
- if (sqe->ioprio || sqe->rw_flags || sqe->addr || sqe->len || sqe->off ||
- sqe->splice_fd_in)
- return -EINVAL;
-
- tmp = READ_ONCE(sqe->fd);
- if (!tmp || tmp > USHRT_MAX)
- return -EINVAL;
-
- memset(p, 0, sizeof(*p));
- p->nbufs = tmp;
- p->bgid = READ_ONCE(sqe->buf_group);
- return 0;
-}
-
-static int __io_remove_buffers(struct io_ring_ctx *ctx,
- struct io_buffer_list *bl, unsigned nbufs)
-{
- unsigned i = 0;
-
- /* shouldn't happen */
- if (!nbufs)
- return 0;
-
- /* the head kbuf is the list itself */
- while (!list_empty(&bl->buf_list)) {
- struct io_buffer *nxt;
-
- nxt = list_first_entry(&bl->buf_list, struct io_buffer, list);
- list_del(&nxt->list);
- if (++i == nbufs)
- return i;
- cond_resched();
- }
- i++;
-
- return i;
-}
-
-static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_provide_buf *p = &req->pbuf;
- struct io_ring_ctx *ctx = req->ctx;
- struct io_buffer_list *bl;
- int ret = 0;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
-
- io_ring_submit_lock(ctx, needs_lock);
-
- lockdep_assert_held(&ctx->uring_lock);
-
- ret = -ENOENT;
- bl = io_buffer_get_list(ctx, p->bgid);
- if (bl)
- ret = __io_remove_buffers(ctx, bl, p->nbufs);
- if (ret < 0)
- req_set_fail(req);
-
- /* complete before unlock, IOPOLL may need the lock */
- __io_req_complete(req, issue_flags, ret, 0);
- io_ring_submit_unlock(ctx, needs_lock);
- return 0;
-}
-
-static int io_provide_buffers_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- unsigned long size, tmp_check;
- struct io_provide_buf *p = &req->pbuf;
- u64 tmp;
-
- if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
- return -EINVAL;
-
- tmp = READ_ONCE(sqe->fd);
- if (!tmp || tmp > USHRT_MAX)
- return -E2BIG;
- p->nbufs = tmp;
- p->addr = READ_ONCE(sqe->addr);
- p->len = READ_ONCE(sqe->len);
-
- if (check_mul_overflow((unsigned long)p->len, (unsigned long)p->nbufs,
- &size))
- return -EOVERFLOW;
- if (check_add_overflow((unsigned long)p->addr, size, &tmp_check))
- return -EOVERFLOW;
-
- size = (unsigned long)p->len * p->nbufs;
- if (!access_ok(u64_to_user_ptr(p->addr), size))
- return -EFAULT;
-
- p->bgid = READ_ONCE(sqe->buf_group);
- tmp = READ_ONCE(sqe->off);
- if (tmp > USHRT_MAX)
- return -E2BIG;
- p->bid = tmp;
- return 0;
-}
-
-static int io_refill_buffer_cache(struct io_ring_ctx *ctx)
-{
- struct io_buffer *buf;
- struct page *page;
- int bufs_in_page;
-
- /*
- * Completions that don't happen inline (eg not under uring_lock) will
- * add to ->io_buffers_comp. If we don't have any free buffers, check
- * the completion list and splice those entries first.
- */
- if (!list_empty_careful(&ctx->io_buffers_comp)) {
- spin_lock(&ctx->completion_lock);
- if (!list_empty(&ctx->io_buffers_comp)) {
- list_splice_init(&ctx->io_buffers_comp,
- &ctx->io_buffers_cache);
- spin_unlock(&ctx->completion_lock);
- return 0;
- }
- spin_unlock(&ctx->completion_lock);
- }
-
- /*
- * No free buffers and no completion entries either. Allocate a new
- * page worth of buffer entries and add those to our freelist.
- */
- page = alloc_page(GFP_KERNEL_ACCOUNT);
- if (!page)
- return -ENOMEM;
-
- list_add(&page->lru, &ctx->io_buffers_pages);
-
- buf = page_address(page);
- bufs_in_page = PAGE_SIZE / sizeof(*buf);
- while (bufs_in_page) {
- list_add_tail(&buf->list, &ctx->io_buffers_cache);
- buf++;
- bufs_in_page--;
- }
-
- return 0;
-}
-
-static int io_add_buffers(struct io_ring_ctx *ctx, struct io_provide_buf *pbuf,
- struct io_buffer_list *bl)
-{
- struct io_buffer *buf;
- u64 addr = pbuf->addr;
- int i, bid = pbuf->bid;
-
- for (i = 0; i < pbuf->nbufs; i++) {
- if (list_empty(&ctx->io_buffers_cache) &&
- io_refill_buffer_cache(ctx))
- break;
- buf = list_first_entry(&ctx->io_buffers_cache, struct io_buffer,
- list);
- list_move_tail(&buf->list, &bl->buf_list);
- buf->addr = addr;
- buf->len = min_t(__u32, pbuf->len, MAX_RW_COUNT);
- buf->bid = bid;
- buf->bgid = pbuf->bgid;
- addr += pbuf->len;
- bid++;
- cond_resched();
- }
-
- return i ? 0 : -ENOMEM;
-}
-
-static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_provide_buf *p = &req->pbuf;
- struct io_ring_ctx *ctx = req->ctx;
- struct io_buffer_list *bl;
- int ret = 0;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
-
- io_ring_submit_lock(ctx, needs_lock);
-
- lockdep_assert_held(&ctx->uring_lock);
-
- bl = io_buffer_get_list(ctx, p->bgid);
- if (unlikely(!bl)) {
- bl = kmalloc(sizeof(*bl), GFP_KERNEL);
- if (!bl) {
- ret = -ENOMEM;
- goto err;
- }
- io_buffer_add_list(ctx, bl, p->bgid);
- }
-
- ret = io_add_buffers(ctx, p, bl);
-err:
- if (ret < 0)
- req_set_fail(req);
- /* complete before unlock, IOPOLL may need the lock */
- __io_req_complete(req, issue_flags, ret, 0);
- io_ring_submit_unlock(ctx, needs_lock);
- return 0;
-}
-
-static int io_epoll_ctl_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
-#if defined(CONFIG_EPOLL)
- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
-
- req->epoll.epfd = READ_ONCE(sqe->fd);
- req->epoll.op = READ_ONCE(sqe->len);
- req->epoll.fd = READ_ONCE(sqe->off);
-
- if (ep_op_has_event(req->epoll.op)) {
- struct epoll_event __user *ev;
-
- ev = u64_to_user_ptr(READ_ONCE(sqe->addr));
- if (copy_from_user(&req->epoll.event, ev, sizeof(*ev)))
- return -EFAULT;
- }
-
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
-static int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
-{
-#if defined(CONFIG_EPOLL)
- struct io_epoll *ie = &req->epoll;
- int ret;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
-
- ret = do_epoll_ctl(ie->epfd, ie->op, ie->fd, &ie->event, force_nonblock);
- if (force_nonblock && ret == -EAGAIN)
- return -EAGAIN;
-
- if (ret < 0)
- req_set_fail(req);
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
-static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
-#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
- if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
-
- req->madvise.addr = READ_ONCE(sqe->addr);
- req->madvise.len = READ_ONCE(sqe->len);
- req->madvise.advice = READ_ONCE(sqe->fadvise_advice);
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
-static int io_madvise(struct io_kiocb *req, unsigned int issue_flags)
-{
-#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
- struct io_madvise *ma = &req->madvise;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = do_madvise(current->mm, ma->addr, ma->len, ma->advice);
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-#else
- return -EOPNOTSUPP;
-#endif
-}
-
-static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- if (sqe->ioprio || sqe->buf_index || sqe->addr || sqe->splice_fd_in)
- return -EINVAL;
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
-
- req->fadvise.offset = READ_ONCE(sqe->off);
- req->fadvise.len = READ_ONCE(sqe->len);
- req->fadvise.advice = READ_ONCE(sqe->fadvise_advice);
- return 0;
-}
-
-static int io_fadvise(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_fadvise *fa = &req->fadvise;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK) {
- switch (fa->advice) {
- case POSIX_FADV_NORMAL:
- case POSIX_FADV_RANDOM:
- case POSIX_FADV_SEQUENTIAL:
- break;
- default:
- return -EAGAIN;
- }
- }
-
- ret = vfs_fadvise(req->file, fa->offset, fa->len, fa->advice);
- if (ret < 0)
- req_set_fail(req);
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- const char __user *path;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
- return -EINVAL;
- if (req->flags & REQ_F_FIXED_FILE)
- return -EBADF;
-
- req->statx.dfd = READ_ONCE(sqe->fd);
- req->statx.mask = READ_ONCE(sqe->len);
- path = u64_to_user_ptr(READ_ONCE(sqe->addr));
- req->statx.buffer = u64_to_user_ptr(READ_ONCE(sqe->addr2));
- req->statx.flags = READ_ONCE(sqe->statx_flags);
-
- req->statx.filename = getname_flags(path,
- getname_statx_lookup_flags(req->statx.flags),
- NULL);
-
- if (IS_ERR(req->statx.filename)) {
- int ret = PTR_ERR(req->statx.filename);
-
- req->statx.filename = NULL;
- return ret;
- }
-
- req->flags |= REQ_F_NEED_CLEANUP;
- return 0;
-}
-
-static int io_statx(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_statx *ctx = &req->statx;
- int ret;
-
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = do_statx(ctx->dfd, ctx->filename, ctx->flags, ctx->mask,
- ctx->buffer);
-
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
- sqe->rw_flags || sqe->buf_index)
- return -EINVAL;
- if (req->flags & REQ_F_FIXED_FILE)
- return -EBADF;
-
- req->close.fd = READ_ONCE(sqe->fd);
- req->close.file_slot = READ_ONCE(sqe->file_index);
- if (req->close.file_slot && req->close.fd)
- return -EINVAL;
-
- return 0;
-}
-
-static int io_close(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct files_struct *files = current->files;
- struct io_close *close = &req->close;
- struct fdtable *fdt;
- struct file *file = NULL;
- int ret = -EBADF;
-
- if (req->close.file_slot) {
- ret = io_close_fixed(req, issue_flags);
- goto err;
- }
-
- spin_lock(&files->file_lock);
- fdt = files_fdtable(files);
- if (close->fd >= fdt->max_fds) {
- spin_unlock(&files->file_lock);
- goto err;
- }
- file = fdt->fd[close->fd];
- if (!file || file->f_op == &io_uring_fops) {
- spin_unlock(&files->file_lock);
- file = NULL;
- goto err;
- }
-
- /* if the file has a flush method, be safe and punt to async */
- if (file->f_op->flush && (issue_flags & IO_URING_F_NONBLOCK)) {
- spin_unlock(&files->file_lock);
- return -EAGAIN;
- }
-
- ret = __close_fd_get_file(close->fd, &file);
- spin_unlock(&files->file_lock);
- if (ret < 0) {
- if (ret == -ENOENT)
- ret = -EBADF;
- goto err;
- }
-
- /* No ->flush() or already async, safely close from here */
- ret = filp_close(file, current->files);
-err:
- if (ret < 0)
- req_set_fail(req);
- if (file)
- fput(file);
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_sfr_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
- sqe->splice_fd_in))
- return -EINVAL;
-
- req->sync.off = READ_ONCE(sqe->off);
- req->sync.len = READ_ONCE(sqe->len);
- req->sync.flags = READ_ONCE(sqe->sync_range_flags);
- return 0;
-}
-
-static int io_sync_file_range(struct io_kiocb *req, unsigned int issue_flags)
-{
- int ret;
-
- /* sync_file_range always requires a blocking context */
- if (issue_flags & IO_URING_F_NONBLOCK)
- return -EAGAIN;
-
- ret = sync_file_range(req->file, req->sync.off, req->sync.len,
- req->sync.flags);
- if (ret < 0)
- req_set_fail(req);
- io_req_complete(req, ret);
- return 0;
-}
-
-#if defined(CONFIG_NET)
-static int io_setup_async_msg(struct io_kiocb *req,
- struct io_async_msghdr *kmsg)
-{
- struct io_async_msghdr *async_msg = req->async_data;
-
- if (async_msg)
- return -EAGAIN;
- if (io_alloc_async_data(req)) {
- kfree(kmsg->free_iov);
- return -ENOMEM;
- }
- async_msg = req->async_data;
- req->flags |= REQ_F_NEED_CLEANUP;
- memcpy(async_msg, kmsg, sizeof(*kmsg));
- async_msg->msg.msg_name = &async_msg->addr;
- /* if were using fast_iov, set it to the new one */
- if (!async_msg->free_iov)
- async_msg->msg.msg_iter.iov = async_msg->fast_iov;
-
- return -EAGAIN;
-}
-
-static int io_sendmsg_copy_hdr(struct io_kiocb *req,
- struct io_async_msghdr *iomsg)
-{
- iomsg->msg.msg_name = &iomsg->addr;
- iomsg->free_iov = iomsg->fast_iov;
- return sendmsg_copy_msghdr(&iomsg->msg, req->sr_msg.umsg,
- req->sr_msg.msg_flags, &iomsg->free_iov);
-}
-
-static int io_sendmsg_prep_async(struct io_kiocb *req)
-{
- int ret;
-
- ret = io_sendmsg_copy_hdr(req, req->async_data);
- if (!ret)
- req->flags |= REQ_F_NEED_CLEANUP;
- return ret;
-}
-
-static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_sr_msg *sr = &req->sr_msg;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
- return -EINVAL;
-
- sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
- sr->len = READ_ONCE(sqe->len);
- sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
- if (sr->msg_flags & MSG_DONTWAIT)
- req->flags |= REQ_F_NOWAIT;
-
-#ifdef CONFIG_COMPAT
- if (req->ctx->compat)
- sr->msg_flags |= MSG_CMSG_COMPAT;
-#endif
- return 0;
-}
-
-static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_async_msghdr iomsg, *kmsg;
- struct socket *sock;
- unsigned flags;
- int min_ret = 0;
- int ret;
-
- sock = sock_from_file(req->file);
- if (unlikely(!sock))
- return -ENOTSOCK;
-
- if (req_has_async_data(req)) {
- kmsg = req->async_data;
- } else {
- ret = io_sendmsg_copy_hdr(req, &iomsg);
- if (ret)
- return ret;
- kmsg = &iomsg;
- }
-
- flags = req->sr_msg.msg_flags;
- if (issue_flags & IO_URING_F_NONBLOCK)
- flags |= MSG_DONTWAIT;
- if (flags & MSG_WAITALL)
- min_ret = iov_iter_count(&kmsg->msg.msg_iter);
-
- ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
-
- if (ret < min_ret) {
- if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
- return io_setup_async_msg(req, kmsg);
- if (ret == -ERESTARTSYS)
- ret = -EINTR;
- req_set_fail(req);
- }
- /* fast path, check for non-NULL to avoid function call */
- if (kmsg->free_iov)
- kfree(kmsg->free_iov);
- req->flags &= ~REQ_F_NEED_CLEANUP;
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_send(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_sr_msg *sr = &req->sr_msg;
- struct msghdr msg;
- struct iovec iov;
- struct socket *sock;
- unsigned flags;
- int min_ret = 0;
- int ret;
-
- sock = sock_from_file(req->file);
- if (unlikely(!sock))
- return -ENOTSOCK;
-
- ret = import_single_range(WRITE, sr->buf, sr->len, &iov, &msg.msg_iter);
- if (unlikely(ret))
- return ret;
-
- msg.msg_name = NULL;
- msg.msg_control = NULL;
- msg.msg_controllen = 0;
- msg.msg_namelen = 0;
-
- flags = req->sr_msg.msg_flags;
- if (issue_flags & IO_URING_F_NONBLOCK)
- flags |= MSG_DONTWAIT;
- if (flags & MSG_WAITALL)
- min_ret = iov_iter_count(&msg.msg_iter);
-
- msg.msg_flags = flags;
- ret = sock_sendmsg(sock, &msg);
- if (ret < min_ret) {
- if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
- return -EAGAIN;
- if (ret == -ERESTARTSYS)
- ret = -EINTR;
- req_set_fail(req);
- }
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
- struct io_async_msghdr *iomsg)
-{
- struct io_sr_msg *sr = &req->sr_msg;
- struct iovec __user *uiov;
- size_t iov_len;
- int ret;
-
- ret = __copy_msghdr_from_user(&iomsg->msg, sr->umsg,
- &iomsg->uaddr, &uiov, &iov_len);
- if (ret)
- return ret;
-
- if (req->flags & REQ_F_BUFFER_SELECT) {
- if (iov_len > 1)
- return -EINVAL;
- if (copy_from_user(iomsg->fast_iov, uiov, sizeof(*uiov)))
- return -EFAULT;
- sr->len = iomsg->fast_iov[0].iov_len;
- iomsg->free_iov = NULL;
- } else {
- iomsg->free_iov = iomsg->fast_iov;
- ret = __import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
- &iomsg->free_iov, &iomsg->msg.msg_iter,
- false);
- if (ret > 0)
- ret = 0;
- }
-
- return ret;
-}
-
-#ifdef CONFIG_COMPAT
-static int __io_compat_recvmsg_copy_hdr(struct io_kiocb *req,
- struct io_async_msghdr *iomsg)
-{
- struct io_sr_msg *sr = &req->sr_msg;
- struct compat_iovec __user *uiov;
- compat_uptr_t ptr;
- compat_size_t len;
- int ret;
-
- ret = __get_compat_msghdr(&iomsg->msg, sr->umsg_compat, &iomsg->uaddr,
- &ptr, &len);
- if (ret)
- return ret;
-
- uiov = compat_ptr(ptr);
- if (req->flags & REQ_F_BUFFER_SELECT) {
- compat_ssize_t clen;
-
- if (len > 1)
- return -EINVAL;
- if (!access_ok(uiov, sizeof(*uiov)))
- return -EFAULT;
- if (__get_user(clen, &uiov->iov_len))
- return -EFAULT;
- if (clen < 0)
- return -EINVAL;
- sr->len = clen;
- iomsg->free_iov = NULL;
- } else {
- iomsg->free_iov = iomsg->fast_iov;
- ret = __import_iovec(READ, (struct iovec __user *)uiov, len,
- UIO_FASTIOV, &iomsg->free_iov,
- &iomsg->msg.msg_iter, true);
- if (ret < 0)
- return ret;
- }
-
- return 0;
-}
-#endif
-
-static int io_recvmsg_copy_hdr(struct io_kiocb *req,
- struct io_async_msghdr *iomsg)
-{
- iomsg->msg.msg_name = &iomsg->addr;
-
-#ifdef CONFIG_COMPAT
- if (req->ctx->compat)
- return __io_compat_recvmsg_copy_hdr(req, iomsg);
-#endif
-
- return __io_recvmsg_copy_hdr(req, iomsg);
-}
-
-static struct io_buffer *io_recv_buffer_select(struct io_kiocb *req,
- unsigned int issue_flags)
-{
- struct io_sr_msg *sr = &req->sr_msg;
-
- return io_buffer_select(req, &sr->len, sr->bgid, issue_flags);
-}
-
-static int io_recvmsg_prep_async(struct io_kiocb *req)
-{
- int ret;
-
- ret = io_recvmsg_copy_hdr(req, req->async_data);
- if (!ret)
- req->flags |= REQ_F_NEED_CLEANUP;
- return ret;
-}
-
-static int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_sr_msg *sr = &req->sr_msg;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
- return -EINVAL;
-
- sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
- sr->len = READ_ONCE(sqe->len);
- sr->bgid = READ_ONCE(sqe->buf_group);
- sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
- if (sr->msg_flags & MSG_DONTWAIT)
- req->flags |= REQ_F_NOWAIT;
-
-#ifdef CONFIG_COMPAT
- if (req->ctx->compat)
- sr->msg_flags |= MSG_CMSG_COMPAT;
-#endif
- sr->done_io = 0;
- return 0;
-}
-
-static bool io_net_retry(struct socket *sock, int flags)
-{
- if (!(flags & MSG_WAITALL))
- return false;
- return sock->type == SOCK_STREAM || sock->type == SOCK_SEQPACKET;
-}
-
-static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_async_msghdr iomsg, *kmsg;
- struct io_sr_msg *sr = &req->sr_msg;
- struct socket *sock;
- struct io_buffer *kbuf;
- unsigned flags;
- int ret, min_ret = 0;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
-
- sock = sock_from_file(req->file);
- if (unlikely(!sock))
- return -ENOTSOCK;
-
- if (req_has_async_data(req)) {
- kmsg = req->async_data;
- } else {
- ret = io_recvmsg_copy_hdr(req, &iomsg);
- if (ret)
- return ret;
- kmsg = &iomsg;
- }
-
- if (req->flags & REQ_F_BUFFER_SELECT) {
- kbuf = io_recv_buffer_select(req, issue_flags);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
- kmsg->fast_iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
- kmsg->fast_iov[0].iov_len = req->sr_msg.len;
- iov_iter_init(&kmsg->msg.msg_iter, READ, kmsg->fast_iov,
- 1, req->sr_msg.len);
- }
-
- flags = req->sr_msg.msg_flags;
- if (force_nonblock)
- flags |= MSG_DONTWAIT;
- if (flags & MSG_WAITALL)
- min_ret = iov_iter_count(&kmsg->msg.msg_iter);
-
- ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
- kmsg->uaddr, flags);
- if (ret < min_ret) {
- if (ret == -EAGAIN && force_nonblock)
- return io_setup_async_msg(req, kmsg);
- if (ret == -ERESTARTSYS)
- ret = -EINTR;
- if (ret > 0 && io_net_retry(sock, flags)) {
- sr->done_io += ret;
- req->flags |= REQ_F_PARTIAL_IO;
- return io_setup_async_msg(req, kmsg);
- }
- req_set_fail(req);
- } else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
- req_set_fail(req);
- }
-
- /* fast path, check for non-NULL to avoid function call */
- if (kmsg->free_iov)
- kfree(kmsg->free_iov);
- req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret >= 0)
- ret += sr->done_io;
- else if (sr->done_io)
- ret = sr->done_io;
- __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
- return 0;
-}
-
-static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_buffer *kbuf;
- struct io_sr_msg *sr = &req->sr_msg;
- struct msghdr msg;
- void __user *buf = sr->buf;
- struct socket *sock;
- struct iovec iov;
- unsigned flags;
- int ret, min_ret = 0;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
-
- sock = sock_from_file(req->file);
- if (unlikely(!sock))
- return -ENOTSOCK;
-
- if (req->flags & REQ_F_BUFFER_SELECT) {
- kbuf = io_recv_buffer_select(req, issue_flags);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
- buf = u64_to_user_ptr(kbuf->addr);
- }
-
- ret = import_single_range(READ, buf, sr->len, &iov, &msg.msg_iter);
- if (unlikely(ret))
- goto out_free;
-
- msg.msg_name = NULL;
- msg.msg_control = NULL;
- msg.msg_controllen = 0;
- msg.msg_namelen = 0;
- msg.msg_iocb = NULL;
- msg.msg_flags = 0;
-
- flags = req->sr_msg.msg_flags;
- if (force_nonblock)
- flags |= MSG_DONTWAIT;
- if (flags & MSG_WAITALL)
- min_ret = iov_iter_count(&msg.msg_iter);
-
- ret = sock_recvmsg(sock, &msg, flags);
- if (ret < min_ret) {
- if (ret == -EAGAIN && force_nonblock)
- return -EAGAIN;
- if (ret == -ERESTARTSYS)
- ret = -EINTR;
- if (ret > 0 && io_net_retry(sock, flags)) {
- sr->len -= ret;
- sr->buf += ret;
- sr->done_io += ret;
- req->flags |= REQ_F_PARTIAL_IO;
- return -EAGAIN;
- }
- req_set_fail(req);
- } else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
-out_free:
- req_set_fail(req);
- }
-
- if (ret >= 0)
- ret += sr->done_io;
- else if (sr->done_io)
- ret = sr->done_io;
- __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
- return 0;
-}
-
-static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_accept *accept = &req->accept;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->len || sqe->buf_index)
- return -EINVAL;
-
- accept->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
- accept->addr_len = u64_to_user_ptr(READ_ONCE(sqe->addr2));
- accept->flags = READ_ONCE(sqe->accept_flags);
- accept->nofile = rlimit(RLIMIT_NOFILE);
-
- accept->file_slot = READ_ONCE(sqe->file_index);
- if (accept->file_slot && (accept->flags & SOCK_CLOEXEC))
- return -EINVAL;
- if (accept->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
- return -EINVAL;
- if (SOCK_NONBLOCK != O_NONBLOCK && (accept->flags & SOCK_NONBLOCK))
- accept->flags = (accept->flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
- return 0;
-}
-
-static int io_accept(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_accept *accept = &req->accept;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
- unsigned int file_flags = force_nonblock ? O_NONBLOCK : 0;
- bool fixed = !!accept->file_slot;
- struct file *file;
- int ret, fd;
-
- if (!fixed) {
- fd = __get_unused_fd_flags(accept->flags, accept->nofile);
- if (unlikely(fd < 0))
- return fd;
- }
- file = do_accept(req->file, file_flags, accept->addr, accept->addr_len,
- accept->flags);
- if (IS_ERR(file)) {
- if (!fixed)
- put_unused_fd(fd);
- ret = PTR_ERR(file);
- if (ret == -EAGAIN && force_nonblock)
- return -EAGAIN;
- if (ret == -ERESTARTSYS)
- ret = -EINTR;
- req_set_fail(req);
- } else if (!fixed) {
- fd_install(fd, file);
- ret = fd;
- } else {
- ret = io_install_fixed_file(req, file, issue_flags,
- accept->file_slot - 1);
- }
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_connect_prep_async(struct io_kiocb *req)
-{
- struct io_async_connect *io = req->async_data;
- struct io_connect *conn = &req->connect;
-
- return move_addr_to_kernel(conn->addr, conn->addr_len, &io->address);
-}
-
-static int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_connect *conn = &req->connect;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->len || sqe->buf_index || sqe->rw_flags ||
- sqe->splice_fd_in)
- return -EINVAL;
-
- conn->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
- conn->addr_len = READ_ONCE(sqe->addr2);
- return 0;
-}
-
-static int io_connect(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_async_connect __io, *io;
- unsigned file_flags;
- int ret;
- bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
-
- if (req_has_async_data(req)) {
- io = req->async_data;
- } else {
- ret = move_addr_to_kernel(req->connect.addr,
- req->connect.addr_len,
- &__io.address);
- if (ret)
- goto out;
- io = &__io;
- }
-
- file_flags = force_nonblock ? O_NONBLOCK : 0;
-
- ret = __sys_connect_file(req->file, &io->address,
- req->connect.addr_len, file_flags);
- if ((ret == -EAGAIN || ret == -EINPROGRESS) && force_nonblock) {
- if (req_has_async_data(req))
- return -EAGAIN;
- if (io_alloc_async_data(req)) {
- ret = -ENOMEM;
- goto out;
- }
- memcpy(req->async_data, &__io, sizeof(__io));
- return -EAGAIN;
- }
- if (ret == -ERESTARTSYS)
- ret = -EINTR;
-out:
- if (ret < 0)
- req_set_fail(req);
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-#else /* !CONFIG_NET */
-#define IO_NETOP_FN(op) \
-static int io_##op(struct io_kiocb *req, unsigned int issue_flags) \
-{ \
- return -EOPNOTSUPP; \
-}
-
-#define IO_NETOP_PREP(op) \
-IO_NETOP_FN(op) \
-static int io_##op##_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) \
-{ \
- return -EOPNOTSUPP; \
-} \
-
-#define IO_NETOP_PREP_ASYNC(op) \
-IO_NETOP_PREP(op) \
-static int io_##op##_prep_async(struct io_kiocb *req) \
-{ \
- return -EOPNOTSUPP; \
-}
-
-IO_NETOP_PREP_ASYNC(sendmsg);
-IO_NETOP_PREP_ASYNC(recvmsg);
-IO_NETOP_PREP_ASYNC(connect);
-IO_NETOP_PREP(accept);
-IO_NETOP_FN(send);
-IO_NETOP_FN(recv);
-#endif /* CONFIG_NET */
-
-struct io_poll_table {
- struct poll_table_struct pt;
- struct io_kiocb *req;
- int nr_entries;
- int error;
-};
-
-#define IO_POLL_CANCEL_FLAG BIT(31)
-#define IO_POLL_REF_MASK GENMASK(30, 0)
-
-/*
- * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
- * bump it and acquire ownership. It's disallowed to modify requests while not
- * owning it, that prevents from races for enqueueing task_work's and b/w
- * arming poll and wakeups.
- */
-static inline bool io_poll_get_ownership(struct io_kiocb *req)
-{
- return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
-}
-
-static void io_poll_mark_cancelled(struct io_kiocb *req)
-{
- atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
-}
-
-static struct io_poll_iocb *io_poll_get_double(struct io_kiocb *req)
-{
- /* pure poll stashes this in ->async_data, poll driven retry elsewhere */
- if (req->opcode == IORING_OP_POLL_ADD)
- return req->async_data;
- return req->apoll->double_poll;
-}
-
-static struct io_poll_iocb *io_poll_get_single(struct io_kiocb *req)
-{
- if (req->opcode == IORING_OP_POLL_ADD)
- return &req->poll;
- return &req->apoll->poll;
-}
-
-static void io_poll_req_insert(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct hlist_head *list;
-
- list = &ctx->cancel_hash[hash_long(req->user_data, ctx->cancel_hash_bits)];
- hlist_add_head(&req->hash_node, list);
-}
-
-static void io_init_poll_iocb(struct io_poll_iocb *poll, __poll_t events,
- wait_queue_func_t wake_func)
-{
- poll->head = NULL;
-#define IO_POLL_UNMASK (EPOLLERR|EPOLLHUP|EPOLLNVAL|EPOLLRDHUP)
- /* mask in events that we always want/need */
- poll->events = events | IO_POLL_UNMASK;
- INIT_LIST_HEAD(&poll->wait.entry);
- init_waitqueue_func_entry(&poll->wait, wake_func);
-}
-
-static inline void io_poll_remove_entry(struct io_poll_iocb *poll)
-{
- struct wait_queue_head *head = smp_load_acquire(&poll->head);
-
- if (head) {
- spin_lock_irq(&head->lock);
- list_del_init(&poll->wait.entry);
- poll->head = NULL;
- spin_unlock_irq(&head->lock);
- }
-}
-
-static void io_poll_remove_entries(struct io_kiocb *req)
-{
- /*
- * Nothing to do if neither of those flags are set. Avoid dipping
- * into the poll/apoll/double cachelines if we can.
- */
- if (!(req->flags & (REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL)))
- return;
-
- /*
- * While we hold the waitqueue lock and the waitqueue is nonempty,
- * wake_up_pollfree() will wait for us. However, taking the waitqueue
- * lock in the first place can race with the waitqueue being freed.
- *
- * We solve this as eventpoll does: by taking advantage of the fact that
- * all users of wake_up_pollfree() will RCU-delay the actual free. If
- * we enter rcu_read_lock() and see that the pointer to the queue is
- * non-NULL, we can then lock it without the memory being freed out from
- * under us.
- *
- * Keep holding rcu_read_lock() as long as we hold the queue lock, in
- * case the caller deletes the entry from the queue, leaving it empty.
- * In that case, only RCU prevents the queue memory from being freed.
- */
- rcu_read_lock();
- if (req->flags & REQ_F_SINGLE_POLL)
- io_poll_remove_entry(io_poll_get_single(req));
- if (req->flags & REQ_F_DOUBLE_POLL)
- io_poll_remove_entry(io_poll_get_double(req));
- rcu_read_unlock();
-}
-
-/*
- * All poll tw should go through this. Checks for poll events, manages
- * references, does rewait, etc.
- *
- * Returns a negative error on failure. >0 when no action require, which is
- * either spurious wakeup or multishot CQE is served. 0 when it's done with
- * the request, then the mask is stored in req->result.
- */
-static int io_poll_check_events(struct io_kiocb *req, bool locked)
-{
- struct io_ring_ctx *ctx = req->ctx;
- int v;
-
- /* req->task == current here, checking PF_EXITING is safe */
- if (unlikely(req->task->flags & PF_EXITING))
- io_poll_mark_cancelled(req);
-
- do {
- v = atomic_read(&req->poll_refs);
-
- /* tw handler should be the owner, and so have some references */
- if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
- return 0;
- if (v & IO_POLL_CANCEL_FLAG)
- return -ECANCELED;
-
- if (!req->result) {
- struct poll_table_struct pt = { ._key = req->apoll_events };
- req->result = vfs_poll(req->file, &pt) & req->apoll_events;
- }
-
- /* multishot, just fill an CQE and proceed */
- if (req->result && !(req->apoll_events & EPOLLONESHOT)) {
- __poll_t mask = mangle_poll(req->result & req->apoll_events);
- bool filled;
-
- spin_lock(&ctx->completion_lock);
- filled = io_fill_cqe_aux(ctx, req->user_data, mask,
- IORING_CQE_F_MORE);
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- if (unlikely(!filled))
- return -ECANCELED;
- io_cqring_ev_posted(ctx);
- } else if (req->result) {
- return 0;
- }
-
- /*
- * Release all references, retry if someone tried to restart
- * task_work while we were executing it.
- */
- } while (atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs));
-
- return 1;
-}
-
-static void io_poll_task_func(struct io_kiocb *req, bool *locked)
-{
- struct io_ring_ctx *ctx = req->ctx;
- int ret;
-
- ret = io_poll_check_events(req, *locked);
- if (ret > 0)
- return;
-
- if (!ret) {
- req->result = mangle_poll(req->result & req->poll.events);
- } else {
- req->result = ret;
- req_set_fail(req);
- }
-
- io_poll_remove_entries(req);
- spin_lock(&ctx->completion_lock);
- hash_del(&req->hash_node);
- __io_req_complete_post(req, req->result, 0);
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- io_cqring_ev_posted(ctx);
-}
-
-static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
-{
- struct io_ring_ctx *ctx = req->ctx;
- int ret;
-
- ret = io_poll_check_events(req, *locked);
- if (ret > 0)
- return;
-
- io_poll_remove_entries(req);
- spin_lock(&ctx->completion_lock);
- hash_del(&req->hash_node);
- spin_unlock(&ctx->completion_lock);
-
- if (!ret)
- io_req_task_submit(req, locked);
- else
- io_req_complete_failed(req, ret);
-}
-
-static void __io_poll_execute(struct io_kiocb *req, int mask,
- __poll_t __maybe_unused events)
-{
- req->result = mask;
- /*
- * This is useful for poll that is armed on behalf of another
- * request, and where the wakeup path could be on a different
- * CPU. We want to avoid pulling in req->apoll->events for that
- * case.
- */
- if (req->opcode == IORING_OP_POLL_ADD)
- req->io_task_work.func = io_poll_task_func;
- else
- req->io_task_work.func = io_apoll_task_func;
-
- trace_io_uring_task_add(req->ctx, req, req->user_data, req->opcode, mask);
- io_req_task_work_add(req, false);
-}
-
-static inline void io_poll_execute(struct io_kiocb *req, int res,
- __poll_t events)
-{
- if (io_poll_get_ownership(req))
- __io_poll_execute(req, res, events);
-}
-
-static void io_poll_cancel_req(struct io_kiocb *req)
-{
- io_poll_mark_cancelled(req);
- /* kick tw, which should complete the request */
- io_poll_execute(req, 0, 0);
-}
-
-#define wqe_to_req(wait) ((void *)((unsigned long) (wait)->private & ~1))
-#define wqe_is_double(wait) ((unsigned long) (wait)->private & 1)
-#define IO_ASYNC_POLL_COMMON (EPOLLONESHOT | POLLPRI)
-
-static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
- void *key)
-{
- struct io_kiocb *req = wqe_to_req(wait);
- struct io_poll_iocb *poll = container_of(wait, struct io_poll_iocb,
- wait);
- __poll_t mask = key_to_poll(key);
-
- if (unlikely(mask & POLLFREE)) {
- io_poll_mark_cancelled(req);
- /* we have to kick tw in case it's not already */
- io_poll_execute(req, 0, poll->events);
-
- /*
- * If the waitqueue is being freed early but someone is already
- * holds ownership over it, we have to tear down the request as
- * best we can. That means immediately removing the request from
- * its waitqueue and preventing all further accesses to the
- * waitqueue via the request.
- */
- list_del_init(&poll->wait.entry);
-
- /*
- * Careful: this *must* be the last step, since as soon
- * as req->head is NULL'ed out, the request can be
- * completed and freed, since aio_poll_complete_work()
- * will no longer need to take the waitqueue lock.
- */
- smp_store_release(&poll->head, NULL);
- return 1;
- }
-
- /* for instances that support it check for an event match first */
- if (mask && !(mask & (poll->events & ~IO_ASYNC_POLL_COMMON)))
- return 0;
-
- if (io_poll_get_ownership(req)) {
- /* optional, saves extra locking for removal in tw handler */
- if (mask && poll->events & EPOLLONESHOT) {
- list_del_init(&poll->wait.entry);
- poll->head = NULL;
- if (wqe_is_double(wait))
- req->flags &= ~REQ_F_DOUBLE_POLL;
- else
- req->flags &= ~REQ_F_SINGLE_POLL;
- }
- __io_poll_execute(req, mask, poll->events);
- }
- return 1;
-}
-
-static void __io_queue_proc(struct io_poll_iocb *poll, struct io_poll_table *pt,
- struct wait_queue_head *head,
- struct io_poll_iocb **poll_ptr)
-{
- struct io_kiocb *req = pt->req;
- unsigned long wqe_private = (unsigned long) req;
-
- /*
- * The file being polled uses multiple waitqueues for poll handling
- * (e.g. one for read, one for write). Setup a separate io_poll_iocb
- * if this happens.
- */
- if (unlikely(pt->nr_entries)) {
- struct io_poll_iocb *first = poll;
-
- /* double add on the same waitqueue head, ignore */
- if (first->head == head)
- return;
- /* already have a 2nd entry, fail a third attempt */
- if (*poll_ptr) {
- if ((*poll_ptr)->head == head)
- return;
- pt->error = -EINVAL;
- return;
- }
-
- poll = kmalloc(sizeof(*poll), GFP_ATOMIC);
- if (!poll) {
- pt->error = -ENOMEM;
- return;
- }
- /* mark as double wq entry */
- wqe_private |= 1;
- req->flags |= REQ_F_DOUBLE_POLL;
- io_init_poll_iocb(poll, first->events, first->wait.func);
- *poll_ptr = poll;
- if (req->opcode == IORING_OP_POLL_ADD)
- req->flags |= REQ_F_ASYNC_DATA;
- }
-
- req->flags |= REQ_F_SINGLE_POLL;
- pt->nr_entries++;
- poll->head = head;
- poll->wait.private = (void *) wqe_private;
-
- if (poll->events & EPOLLEXCLUSIVE)
- add_wait_queue_exclusive(head, &poll->wait);
- else
- add_wait_queue(head, &poll->wait);
-}
-
-static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
- struct poll_table_struct *p)
-{
- struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
-
- __io_queue_proc(&pt->req->poll, pt, head,
- (struct io_poll_iocb **) &pt->req->async_data);
-}
-
-static int __io_arm_poll_handler(struct io_kiocb *req,
- struct io_poll_iocb *poll,
- struct io_poll_table *ipt, __poll_t mask)
-{
- struct io_ring_ctx *ctx = req->ctx;
- int v;
-
- INIT_HLIST_NODE(&req->hash_node);
- io_init_poll_iocb(poll, mask, io_poll_wake);
- poll->file = req->file;
-
- req->apoll_events = poll->events;
-
- ipt->pt._key = mask;
- ipt->req = req;
- ipt->error = 0;
- ipt->nr_entries = 0;
-
- /*
- * Take the ownership to delay any tw execution up until we're done
- * with poll arming. see io_poll_get_ownership().
- */
- atomic_set(&req->poll_refs, 1);
- mask = vfs_poll(req->file, &ipt->pt) & poll->events;
-
- if (mask && (poll->events & EPOLLONESHOT)) {
- io_poll_remove_entries(req);
- /* no one else has access to the req, forget about the ref */
- return mask;
- }
- if (!mask && unlikely(ipt->error || !ipt->nr_entries)) {
- io_poll_remove_entries(req);
- if (!ipt->error)
- ipt->error = -EINVAL;
- return 0;
- }
-
- spin_lock(&ctx->completion_lock);
- io_poll_req_insert(req);
- spin_unlock(&ctx->completion_lock);
-
- if (mask) {
- /* can't multishot if failed, just queue the event we've got */
- if (unlikely(ipt->error || !ipt->nr_entries)) {
- poll->events |= EPOLLONESHOT;
- req->apoll_events |= EPOLLONESHOT;
- ipt->error = 0;
- }
- __io_poll_execute(req, mask, poll->events);
- return 0;
- }
-
- /*
- * Release ownership. If someone tried to queue a tw while it was
- * locked, kick it off for them.
- */
- v = atomic_dec_return(&req->poll_refs);
- if (unlikely(v & IO_POLL_REF_MASK))
- __io_poll_execute(req, 0, poll->events);
- return 0;
-}
-
-static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
- struct poll_table_struct *p)
-{
- struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
- struct async_poll *apoll = pt->req->apoll;
-
- __io_queue_proc(&apoll->poll, pt, head, &apoll->double_poll);
-}
-
-enum {
- IO_APOLL_OK,
- IO_APOLL_ABORTED,
- IO_APOLL_READY
-};
-
-static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
-{
- const struct io_op_def *def = &io_op_defs[req->opcode];
- struct io_ring_ctx *ctx = req->ctx;
- struct async_poll *apoll;
- struct io_poll_table ipt;
- __poll_t mask = IO_ASYNC_POLL_COMMON | POLLERR;
- int ret;
-
- if (!def->pollin && !def->pollout)
- return IO_APOLL_ABORTED;
- if (!file_can_poll(req->file) || (req->flags & REQ_F_POLLED))
- return IO_APOLL_ABORTED;
-
- if (def->pollin) {
- mask |= POLLIN | POLLRDNORM;
-
- /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
- if ((req->opcode == IORING_OP_RECVMSG) &&
- (req->sr_msg.msg_flags & MSG_ERRQUEUE))
- mask &= ~POLLIN;
- } else {
- mask |= POLLOUT | POLLWRNORM;
- }
- if (def->poll_exclusive)
- mask |= EPOLLEXCLUSIVE;
- if (!(issue_flags & IO_URING_F_UNLOCKED) &&
- !list_empty(&ctx->apoll_cache)) {
- apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
- poll.wait.entry);
- list_del_init(&apoll->poll.wait.entry);
- } else {
- apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
- if (unlikely(!apoll))
- return IO_APOLL_ABORTED;
- }
- apoll->double_poll = NULL;
- req->apoll = apoll;
- req->flags |= REQ_F_POLLED;
- ipt.pt._qproc = io_async_queue_proc;
-
- io_kbuf_recycle(req, issue_flags);
-
- ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask);
- if (ret || ipt.error)
- return ret ? IO_APOLL_READY : IO_APOLL_ABORTED;
-
- trace_io_uring_poll_arm(ctx, req, req->user_data, req->opcode,
- mask, apoll->poll.events);
- return IO_APOLL_OK;
-}
-
-/*
- * Returns true if we found and killed one or more poll requests
- */
-static __cold bool io_poll_remove_all(struct io_ring_ctx *ctx,
- struct task_struct *tsk, bool cancel_all)
-{
- struct hlist_node *tmp;
- struct io_kiocb *req;
- bool found = false;
- int i;
-
- spin_lock(&ctx->completion_lock);
- for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
- struct hlist_head *list;
-
- list = &ctx->cancel_hash[i];
- hlist_for_each_entry_safe(req, tmp, list, hash_node) {
- if (io_match_task_safe(req, tsk, cancel_all)) {
- hlist_del_init(&req->hash_node);
- io_poll_cancel_req(req);
- found = true;
- }
- }
- }
- spin_unlock(&ctx->completion_lock);
- return found;
-}
-
-static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, __u64 sqe_addr,
- bool poll_only)
- __must_hold(&ctx->completion_lock)
-{
- struct hlist_head *list;
- struct io_kiocb *req;
-
- list = &ctx->cancel_hash[hash_long(sqe_addr, ctx->cancel_hash_bits)];
- hlist_for_each_entry(req, list, hash_node) {
- if (sqe_addr != req->user_data)
- continue;
- if (poll_only && req->opcode != IORING_OP_POLL_ADD)
- continue;
- return req;
- }
- return NULL;
-}
-
-static bool io_poll_disarm(struct io_kiocb *req)
- __must_hold(&ctx->completion_lock)
-{
- if (!io_poll_get_ownership(req))
- return false;
- io_poll_remove_entries(req);
- hash_del(&req->hash_node);
- return true;
-}
-
-static int io_poll_cancel(struct io_ring_ctx *ctx, __u64 sqe_addr,
- bool poll_only)
- __must_hold(&ctx->completion_lock)
-{
- struct io_kiocb *req = io_poll_find(ctx, sqe_addr, poll_only);
-
- if (!req)
- return -ENOENT;
- io_poll_cancel_req(req);
- return 0;
-}
-
-static __poll_t io_poll_parse_events(const struct io_uring_sqe *sqe,
- unsigned int flags)
-{
- u32 events;
-
- events = READ_ONCE(sqe->poll32_events);
-#ifdef __BIG_ENDIAN
- events = swahw32(events);
-#endif
- if (!(flags & IORING_POLL_ADD_MULTI))
- events |= EPOLLONESHOT;
- return demangle_poll(events) | (events & (EPOLLEXCLUSIVE|EPOLLONESHOT));
-}
-
-static int io_poll_update_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_poll_update *upd = &req->poll_update;
- u32 flags;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
- return -EINVAL;
- flags = READ_ONCE(sqe->len);
- if (flags & ~(IORING_POLL_UPDATE_EVENTS | IORING_POLL_UPDATE_USER_DATA |
- IORING_POLL_ADD_MULTI))
- return -EINVAL;
- /* meaningless without update */
- if (flags == IORING_POLL_ADD_MULTI)
- return -EINVAL;
-
- upd->old_user_data = READ_ONCE(sqe->addr);
- upd->update_events = flags & IORING_POLL_UPDATE_EVENTS;
- upd->update_user_data = flags & IORING_POLL_UPDATE_USER_DATA;
-
- upd->new_user_data = READ_ONCE(sqe->off);
- if (!upd->update_user_data && upd->new_user_data)
- return -EINVAL;
- if (upd->update_events)
- upd->events = io_poll_parse_events(sqe, flags);
- else if (sqe->poll32_events)
- return -EINVAL;
-
- return 0;
-}
-
-static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- struct io_poll_iocb *poll = &req->poll;
- u32 flags;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->addr)
- return -EINVAL;
- flags = READ_ONCE(sqe->len);
- if (flags & ~IORING_POLL_ADD_MULTI)
- return -EINVAL;
- if ((flags & IORING_POLL_ADD_MULTI) && (req->flags & REQ_F_CQE_SKIP))
- return -EINVAL;
-
- io_req_set_refcount(req);
- poll->events = io_poll_parse_events(sqe, flags);
- return 0;
-}
-
-static int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_poll_iocb *poll = &req->poll;
- struct io_poll_table ipt;
- int ret;
-
- ipt.pt._qproc = io_poll_queue_proc;
-
- ret = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events);
- if (!ret && ipt.error)
- req_set_fail(req);
- ret = ret ?: ipt.error;
- if (ret)
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_poll_update(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_kiocb *preq;
- int ret2, ret = 0;
- bool locked;
-
- spin_lock(&ctx->completion_lock);
- preq = io_poll_find(ctx, req->poll_update.old_user_data, true);
- if (!preq || !io_poll_disarm(preq)) {
- spin_unlock(&ctx->completion_lock);
- ret = preq ? -EALREADY : -ENOENT;
- goto out;
- }
- spin_unlock(&ctx->completion_lock);
-
- if (req->poll_update.update_events || req->poll_update.update_user_data) {
- /* only mask one event flags, keep behavior flags */
- if (req->poll_update.update_events) {
- preq->poll.events &= ~0xffff;
- preq->poll.events |= req->poll_update.events & 0xffff;
- preq->poll.events |= IO_POLL_UNMASK;
- }
- if (req->poll_update.update_user_data)
- preq->user_data = req->poll_update.new_user_data;
-
- ret2 = io_poll_add(preq, issue_flags);
- /* successfully updated, don't complete poll request */
- if (!ret2)
- goto out;
- }
-
- req_set_fail(preq);
- preq->result = -ECANCELED;
- locked = !(issue_flags & IO_URING_F_UNLOCKED);
- io_req_task_complete(preq, &locked);
-out:
- if (ret < 0)
- req_set_fail(req);
- /* complete update request, we're done with it */
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
-{
- struct io_timeout_data *data = container_of(timer,
- struct io_timeout_data, timer);
- struct io_kiocb *req = data->req;
- struct io_ring_ctx *ctx = req->ctx;
- unsigned long flags;
-
- spin_lock_irqsave(&ctx->timeout_lock, flags);
- list_del_init(&req->timeout.list);
- atomic_set(&req->ctx->cq_timeouts,
- atomic_read(&req->ctx->cq_timeouts) + 1);
- spin_unlock_irqrestore(&ctx->timeout_lock, flags);
-
- if (!(data->flags & IORING_TIMEOUT_ETIME_SUCCESS))
- req_set_fail(req);
-
- req->result = -ETIME;
- req->io_task_work.func = io_req_task_complete;
- io_req_task_work_add(req, false);
- return HRTIMER_NORESTART;
-}
-
-static struct io_kiocb *io_timeout_extract(struct io_ring_ctx *ctx,
- __u64 user_data)
- __must_hold(&ctx->timeout_lock)
-{
- struct io_timeout_data *io;
- struct io_kiocb *req;
- bool found = false;
-
- list_for_each_entry(req, &ctx->timeout_list, timeout.list) {
- found = user_data == req->user_data;
- if (found)
- break;
- }
- if (!found)
- return ERR_PTR(-ENOENT);
-
- io = req->async_data;
- if (hrtimer_try_to_cancel(&io->timer) == -1)
- return ERR_PTR(-EALREADY);
- list_del_init(&req->timeout.list);
- return req;
-}
-
-static int io_timeout_cancel(struct io_ring_ctx *ctx, __u64 user_data)
- __must_hold(&ctx->completion_lock)
- __must_hold(&ctx->timeout_lock)
-{
- struct io_kiocb *req = io_timeout_extract(ctx, user_data);
-
- if (IS_ERR(req))
- return PTR_ERR(req);
- io_req_task_queue_fail(req, -ECANCELED);
- return 0;
-}
-
-static clockid_t io_timeout_get_clock(struct io_timeout_data *data)
-{
- switch (data->flags & IORING_TIMEOUT_CLOCK_MASK) {
- case IORING_TIMEOUT_BOOTTIME:
- return CLOCK_BOOTTIME;
- case IORING_TIMEOUT_REALTIME:
- return CLOCK_REALTIME;
- default:
- /* can't happen, vetted at prep time */
- WARN_ON_ONCE(1);
- fallthrough;
- case 0:
- return CLOCK_MONOTONIC;
- }
-}
-
-static int io_linked_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
- struct timespec64 *ts, enum hrtimer_mode mode)
- __must_hold(&ctx->timeout_lock)
-{
- struct io_timeout_data *io;
- struct io_kiocb *req;
- bool found = false;
-
- list_for_each_entry(req, &ctx->ltimeout_list, timeout.list) {
- found = user_data == req->user_data;
- if (found)
- break;
- }
- if (!found)
- return -ENOENT;
-
- io = req->async_data;
- if (hrtimer_try_to_cancel(&io->timer) == -1)
- return -EALREADY;
- hrtimer_init(&io->timer, io_timeout_get_clock(io), mode);
- io->timer.function = io_link_timeout_fn;
- hrtimer_start(&io->timer, timespec64_to_ktime(*ts), mode);
- return 0;
-}
-
-static int io_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
- struct timespec64 *ts, enum hrtimer_mode mode)
- __must_hold(&ctx->timeout_lock)
-{
- struct io_kiocb *req = io_timeout_extract(ctx, user_data);
- struct io_timeout_data *data;
-
- if (IS_ERR(req))
- return PTR_ERR(req);
-
- req->timeout.off = 0; /* noseq */
- data = req->async_data;
- list_add_tail(&req->timeout.list, &ctx->timeout_list);
- hrtimer_init(&data->timer, io_timeout_get_clock(data), mode);
- data->timer.function = io_timeout_fn;
- hrtimer_start(&data->timer, timespec64_to_ktime(*ts), mode);
- return 0;
-}
-
-static int io_timeout_remove_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- struct io_timeout_rem *tr = &req->timeout_rem;
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
- return -EINVAL;
- if (sqe->ioprio || sqe->buf_index || sqe->len || sqe->splice_fd_in)
- return -EINVAL;
-
- tr->ltimeout = false;
- tr->addr = READ_ONCE(sqe->addr);
- tr->flags = READ_ONCE(sqe->timeout_flags);
- if (tr->flags & IORING_TIMEOUT_UPDATE_MASK) {
- if (hweight32(tr->flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
- return -EINVAL;
- if (tr->flags & IORING_LINK_TIMEOUT_UPDATE)
- tr->ltimeout = true;
- if (tr->flags & ~(IORING_TIMEOUT_UPDATE_MASK|IORING_TIMEOUT_ABS))
- return -EINVAL;
- if (get_timespec64(&tr->ts, u64_to_user_ptr(sqe->addr2)))
- return -EFAULT;
- if (tr->ts.tv_sec < 0 || tr->ts.tv_nsec < 0)
- return -EINVAL;
- } else if (tr->flags) {
- /* timeout removal doesn't support flags */
- return -EINVAL;
- }
-
- return 0;
-}
-
-static inline enum hrtimer_mode io_translate_timeout_mode(unsigned int flags)
-{
- return (flags & IORING_TIMEOUT_ABS) ? HRTIMER_MODE_ABS
- : HRTIMER_MODE_REL;
-}
-
-/*
- * Remove or update an existing timeout command
- */
-static int io_timeout_remove(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_timeout_rem *tr = &req->timeout_rem;
- struct io_ring_ctx *ctx = req->ctx;
- int ret;
-
- if (!(req->timeout_rem.flags & IORING_TIMEOUT_UPDATE)) {
- spin_lock(&ctx->completion_lock);
- spin_lock_irq(&ctx->timeout_lock);
- ret = io_timeout_cancel(ctx, tr->addr);
- spin_unlock_irq(&ctx->timeout_lock);
- spin_unlock(&ctx->completion_lock);
- } else {
- enum hrtimer_mode mode = io_translate_timeout_mode(tr->flags);
-
- spin_lock_irq(&ctx->timeout_lock);
- if (tr->ltimeout)
- ret = io_linked_timeout_update(ctx, tr->addr, &tr->ts, mode);
- else
- ret = io_timeout_update(ctx, tr->addr, &tr->ts, mode);
- spin_unlock_irq(&ctx->timeout_lock);
- }
-
- if (ret < 0)
- req_set_fail(req);
- io_req_complete_post(req, ret, 0);
- return 0;
-}
-
-static int io_timeout_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
- bool is_timeout_link)
-{
- struct io_timeout_data *data;
- unsigned flags;
- u32 off = READ_ONCE(sqe->off);
-
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (sqe->ioprio || sqe->buf_index || sqe->len != 1 ||
- sqe->splice_fd_in)
- return -EINVAL;
- if (off && is_timeout_link)
- return -EINVAL;
- flags = READ_ONCE(sqe->timeout_flags);
- if (flags & ~(IORING_TIMEOUT_ABS | IORING_TIMEOUT_CLOCK_MASK |
- IORING_TIMEOUT_ETIME_SUCCESS))
- return -EINVAL;
- /* more than one clock specified is invalid, obviously */
- if (hweight32(flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
- return -EINVAL;
-
- INIT_LIST_HEAD(&req->timeout.list);
- req->timeout.off = off;
- if (unlikely(off && !req->ctx->off_timeout_used))
- req->ctx->off_timeout_used = true;
-
- if (WARN_ON_ONCE(req_has_async_data(req)))
- return -EFAULT;
- if (io_alloc_async_data(req))
- return -ENOMEM;
-
- data = req->async_data;
- data->req = req;
- data->flags = flags;
-
- if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
- return -EFAULT;
-
- if (data->ts.tv_sec < 0 || data->ts.tv_nsec < 0)
- return -EINVAL;
-
- INIT_LIST_HEAD(&req->timeout.list);
- data->mode = io_translate_timeout_mode(flags);
- hrtimer_init(&data->timer, io_timeout_get_clock(data), data->mode);
-
- if (is_timeout_link) {
- struct io_submit_link *link = &req->ctx->submit_state.link;
-
- if (!link->head)
- return -EINVAL;
- if (link->last->opcode == IORING_OP_LINK_TIMEOUT)
- return -EINVAL;
- req->timeout.head = link->last;
- link->last->flags |= REQ_F_ARM_LTIMEOUT;
- }
- return 0;
-}
-
-static int io_timeout(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_timeout_data *data = req->async_data;
- struct list_head *entry;
- u32 tail, off = req->timeout.off;
-
- spin_lock_irq(&ctx->timeout_lock);
-
- /*
- * sqe->off holds how many events that need to occur for this
- * timeout event to be satisfied. If it isn't set, then this is
- * a pure timeout request, sequence isn't used.
- */
- if (io_is_timeout_noseq(req)) {
- entry = ctx->timeout_list.prev;
- goto add;
- }
-
- tail = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
- req->timeout.target_seq = tail + off;
-
- /* Update the last seq here in case io_flush_timeouts() hasn't.
- * This is safe because ->completion_lock is held, and submissions
- * and completions are never mixed in the same ->completion_lock section.
- */
- ctx->cq_last_tm_flush = tail;
-
- /*
- * Insertion sort, ensuring the first entry in the list is always
- * the one we need first.
- */
- list_for_each_prev(entry, &ctx->timeout_list) {
- struct io_kiocb *nxt = list_entry(entry, struct io_kiocb,
- timeout.list);
-
- if (io_is_timeout_noseq(nxt))
- continue;
- /* nxt.seq is behind @tail, otherwise would've been completed */
- if (off >= nxt->timeout.target_seq - tail)
- break;
- }
-add:
- list_add(&req->timeout.list, entry);
- data->timer.function = io_timeout_fn;
- hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), data->mode);
- spin_unlock_irq(&ctx->timeout_lock);
- return 0;
-}
-
-struct io_cancel_data {
- struct io_ring_ctx *ctx;
- u64 user_data;
-};
-
-static bool io_cancel_cb(struct io_wq_work *work, void *data)
-{
- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
- struct io_cancel_data *cd = data;
-
- return req->ctx == cd->ctx && req->user_data == cd->user_data;
-}
-
-static int io_async_cancel_one(struct io_uring_task *tctx, u64 user_data,
- struct io_ring_ctx *ctx)
-{
- struct io_cancel_data data = { .ctx = ctx, .user_data = user_data, };
- enum io_wq_cancel cancel_ret;
- int ret = 0;
-
- if (!tctx || !tctx->io_wq)
- return -ENOENT;
-
- cancel_ret = io_wq_cancel_cb(tctx->io_wq, io_cancel_cb, &data, false);
- switch (cancel_ret) {
- case IO_WQ_CANCEL_OK:
- ret = 0;
- break;
- case IO_WQ_CANCEL_RUNNING:
- ret = -EALREADY;
- break;
- case IO_WQ_CANCEL_NOTFOUND:
- ret = -ENOENT;
- break;
- }
-
- return ret;
-}
-
-static int io_try_cancel_userdata(struct io_kiocb *req, u64 sqe_addr)
-{
- struct io_ring_ctx *ctx = req->ctx;
- int ret;
-
- WARN_ON_ONCE(!io_wq_current_is_worker() && req->task != current);
-
- ret = io_async_cancel_one(req->task->io_uring, sqe_addr, ctx);
- /*
- * Fall-through even for -EALREADY, as we may have poll armed
- * that need unarming.
- */
- if (!ret)
- return 0;
-
- spin_lock(&ctx->completion_lock);
- ret = io_poll_cancel(ctx, sqe_addr, false);
- if (ret != -ENOENT)
- goto out;
-
- spin_lock_irq(&ctx->timeout_lock);
- ret = io_timeout_cancel(ctx, sqe_addr);
- spin_unlock_irq(&ctx->timeout_lock);
-out:
- spin_unlock(&ctx->completion_lock);
- return ret;
-}
-
-static int io_async_cancel_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
- return -EINVAL;
- if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
- return -EINVAL;
- if (sqe->ioprio || sqe->off || sqe->len || sqe->cancel_flags ||
- sqe->splice_fd_in)
- return -EINVAL;
-
- req->cancel.addr = READ_ONCE(sqe->addr);
- return 0;
-}
-
-static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- u64 sqe_addr = req->cancel.addr;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
- struct io_tctx_node *node;
- int ret;
-
- ret = io_try_cancel_userdata(req, sqe_addr);
- if (ret != -ENOENT)
- goto done;
-
- /* slow path, try all io-wq's */
- io_ring_submit_lock(ctx, needs_lock);
- ret = -ENOENT;
- list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
- struct io_uring_task *tctx = node->task->io_uring;
-
- ret = io_async_cancel_one(tctx, req->cancel.addr, ctx);
- if (ret != -ENOENT)
- break;
- }
- io_ring_submit_unlock(ctx, needs_lock);
-done:
- if (ret < 0)
- req_set_fail(req);
- io_req_complete_post(req, ret, 0);
- return 0;
-}
-
-static int io_rsrc_update_prep(struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
-{
- if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
- return -EINVAL;
- if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
- return -EINVAL;
-
- req->rsrc_update.offset = READ_ONCE(sqe->off);
- req->rsrc_update.nr_args = READ_ONCE(sqe->len);
- if (!req->rsrc_update.nr_args)
- return -EINVAL;
- req->rsrc_update.arg = READ_ONCE(sqe->addr);
- return 0;
-}
-
-static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
- struct io_uring_rsrc_update2 up;
- int ret;
-
- up.offset = req->rsrc_update.offset;
- up.data = req->rsrc_update.arg;
- up.nr = 0;
- up.tags = 0;
- up.resv = 0;
- up.resv2 = 0;
-
- io_ring_submit_lock(ctx, needs_lock);
- ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
- &up, req->rsrc_update.nr_args);
- io_ring_submit_unlock(ctx, needs_lock);
-
- if (ret < 0)
- req_set_fail(req);
- __io_req_complete(req, issue_flags, ret, 0);
- return 0;
-}
-
-static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
- switch (req->opcode) {
- case IORING_OP_NOP:
- return 0;
- case IORING_OP_READV:
- case IORING_OP_READ_FIXED:
- case IORING_OP_READ:
- case IORING_OP_WRITEV:
- case IORING_OP_WRITE_FIXED:
- case IORING_OP_WRITE:
- return io_prep_rw(req, sqe);
- case IORING_OP_POLL_ADD:
- return io_poll_add_prep(req, sqe);
- case IORING_OP_POLL_REMOVE:
- return io_poll_update_prep(req, sqe);
- case IORING_OP_FSYNC:
- return io_fsync_prep(req, sqe);
- case IORING_OP_SYNC_FILE_RANGE:
- return io_sfr_prep(req, sqe);
- case IORING_OP_SENDMSG:
- case IORING_OP_SEND:
- return io_sendmsg_prep(req, sqe);
- case IORING_OP_RECVMSG:
- case IORING_OP_RECV:
- return io_recvmsg_prep(req, sqe);
- case IORING_OP_CONNECT:
- return io_connect_prep(req, sqe);
- case IORING_OP_TIMEOUT:
- return io_timeout_prep(req, sqe, false);
- case IORING_OP_TIMEOUT_REMOVE:
- return io_timeout_remove_prep(req, sqe);
- case IORING_OP_ASYNC_CANCEL:
- return io_async_cancel_prep(req, sqe);
- case IORING_OP_LINK_TIMEOUT:
- return io_timeout_prep(req, sqe, true);
- case IORING_OP_ACCEPT:
- return io_accept_prep(req, sqe);
- case IORING_OP_FALLOCATE:
- return io_fallocate_prep(req, sqe);
- case IORING_OP_OPENAT:
- return io_openat_prep(req, sqe);
- case IORING_OP_CLOSE:
- return io_close_prep(req, sqe);
- case IORING_OP_FILES_UPDATE:
- return io_rsrc_update_prep(req, sqe);
- case IORING_OP_STATX:
- return io_statx_prep(req, sqe);
- case IORING_OP_FADVISE:
- return io_fadvise_prep(req, sqe);
- case IORING_OP_MADVISE:
- return io_madvise_prep(req, sqe);
- case IORING_OP_OPENAT2:
- return io_openat2_prep(req, sqe);
- case IORING_OP_EPOLL_CTL:
- return io_epoll_ctl_prep(req, sqe);
- case IORING_OP_SPLICE:
- return io_splice_prep(req, sqe);
- case IORING_OP_PROVIDE_BUFFERS:
- return io_provide_buffers_prep(req, sqe);
- case IORING_OP_REMOVE_BUFFERS:
- return io_remove_buffers_prep(req, sqe);
- case IORING_OP_TEE:
- return io_tee_prep(req, sqe);
- case IORING_OP_SHUTDOWN:
- return io_shutdown_prep(req, sqe);
- case IORING_OP_RENAMEAT:
- return io_renameat_prep(req, sqe);
- case IORING_OP_UNLINKAT:
- return io_unlinkat_prep(req, sqe);
- case IORING_OP_MKDIRAT:
- return io_mkdirat_prep(req, sqe);
- case IORING_OP_SYMLINKAT:
- return io_symlinkat_prep(req, sqe);
- case IORING_OP_LINKAT:
- return io_linkat_prep(req, sqe);
- case IORING_OP_MSG_RING:
- return io_msg_ring_prep(req, sqe);
- }
-
- printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
- req->opcode);
- return -EINVAL;
-}
-
-static int io_req_prep_async(struct io_kiocb *req)
-{
- const struct io_op_def *def = &io_op_defs[req->opcode];
-
- /* assign early for deferred execution for non-fixed file */
- if (def->needs_file && !(req->flags & REQ_F_FIXED_FILE))
- req->file = io_file_get_normal(req, req->fd);
- if (!def->needs_async_setup)
- return 0;
- if (WARN_ON_ONCE(req_has_async_data(req)))
- return -EFAULT;
- if (io_alloc_async_data(req))
- return -EAGAIN;
-
- switch (req->opcode) {
- case IORING_OP_READV:
- return io_rw_prep_async(req, READ);
- case IORING_OP_WRITEV:
- return io_rw_prep_async(req, WRITE);
- case IORING_OP_SENDMSG:
- return io_sendmsg_prep_async(req);
- case IORING_OP_RECVMSG:
- return io_recvmsg_prep_async(req);
- case IORING_OP_CONNECT:
- return io_connect_prep_async(req);
- }
- printk_once(KERN_WARNING "io_uring: prep_async() bad opcode %d\n",
- req->opcode);
- return -EFAULT;
-}
-
-static u32 io_get_sequence(struct io_kiocb *req)
-{
- u32 seq = req->ctx->cached_sq_head;
-
- /* need original cached_sq_head, but it was increased for each req */
- io_for_each_link(req, req)
- seq--;
- return seq;
-}
-
-static __cold void io_drain_req(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_defer_entry *de;
- int ret;
- u32 seq = io_get_sequence(req);
-
- /* Still need defer if there is pending req in defer list. */
- spin_lock(&ctx->completion_lock);
- if (!req_need_defer(req, seq) && list_empty_careful(&ctx->defer_list)) {
- spin_unlock(&ctx->completion_lock);
-queue:
- ctx->drain_active = false;
- io_req_task_queue(req);
- return;
- }
- spin_unlock(&ctx->completion_lock);
-
- ret = io_req_prep_async(req);
- if (ret) {
-fail:
- io_req_complete_failed(req, ret);
- return;
- }
- io_prep_async_link(req);
- de = kmalloc(sizeof(*de), GFP_KERNEL);
- if (!de) {
- ret = -ENOMEM;
- goto fail;
- }
-
- spin_lock(&ctx->completion_lock);
- if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
- spin_unlock(&ctx->completion_lock);
- kfree(de);
- goto queue;
- }
-
- trace_io_uring_defer(ctx, req, req->user_data, req->opcode);
- de->req = req;
- de->seq = seq;
- list_add_tail(&de->list, &ctx->defer_list);
- spin_unlock(&ctx->completion_lock);
-}
-
-static void io_clean_op(struct io_kiocb *req)
-{
- if (req->flags & REQ_F_BUFFER_SELECTED) {
- spin_lock(&req->ctx->completion_lock);
- io_put_kbuf_comp(req);
- spin_unlock(&req->ctx->completion_lock);
- }
-
- if (req->flags & REQ_F_NEED_CLEANUP) {
- switch (req->opcode) {
- case IORING_OP_READV:
- case IORING_OP_READ_FIXED:
- case IORING_OP_READ:
- case IORING_OP_WRITEV:
- case IORING_OP_WRITE_FIXED:
- case IORING_OP_WRITE: {
- struct io_async_rw *io = req->async_data;
-
- kfree(io->free_iovec);
- break;
- }
- case IORING_OP_RECVMSG:
- case IORING_OP_SENDMSG: {
- struct io_async_msghdr *io = req->async_data;
-
- kfree(io->free_iov);
- break;
- }
- case IORING_OP_OPENAT:
- case IORING_OP_OPENAT2:
- if (req->open.filename)
- putname(req->open.filename);
- break;
- case IORING_OP_RENAMEAT:
- putname(req->rename.oldpath);
- putname(req->rename.newpath);
- break;
- case IORING_OP_UNLINKAT:
- putname(req->unlink.filename);
- break;
- case IORING_OP_MKDIRAT:
- putname(req->mkdir.filename);
- break;
- case IORING_OP_SYMLINKAT:
- putname(req->symlink.oldpath);
- putname(req->symlink.newpath);
- break;
- case IORING_OP_LINKAT:
- putname(req->hardlink.oldpath);
- putname(req->hardlink.newpath);
- break;
- case IORING_OP_STATX:
- if (req->statx.filename)
- putname(req->statx.filename);
- break;
- }
- }
- if ((req->flags & REQ_F_POLLED) && req->apoll) {
- kfree(req->apoll->double_poll);
- kfree(req->apoll);
- req->apoll = NULL;
- }
- if (req->flags & REQ_F_INFLIGHT) {
- struct io_uring_task *tctx = req->task->io_uring;
-
- atomic_dec(&tctx->inflight_tracked);
- }
- if (req->flags & REQ_F_CREDS)
- put_cred(req->creds);
- if (req->flags & REQ_F_ASYNC_DATA) {
- kfree(req->async_data);
- req->async_data = NULL;
- }
- req->flags &= ~IO_REQ_CLEAN_FLAGS;
-}
-
-static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
-{
- if (req->file || !io_op_defs[req->opcode].needs_file)
- return true;
-
- if (req->flags & REQ_F_FIXED_FILE)
- req->file = io_file_get_fixed(req, req->fd, issue_flags);
- else
- req->file = io_file_get_normal(req, req->fd);
- if (req->file)
- return true;
-
- req_set_fail(req);
- req->result = -EBADF;
- return false;
-}
-
-static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
-{
- const struct cred *creds = NULL;
- int ret;
-
- if (unlikely(!io_assign_file(req, issue_flags)))
- return -EBADF;
-
- if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred()))
- creds = override_creds(req->creds);
-
- if (!io_op_defs[req->opcode].audit_skip)
- audit_uring_entry(req->opcode);
-
- switch (req->opcode) {
- case IORING_OP_NOP:
- ret = io_nop(req, issue_flags);
- break;
- case IORING_OP_READV:
- case IORING_OP_READ_FIXED:
- case IORING_OP_READ:
- ret = io_read(req, issue_flags);
- break;
- case IORING_OP_WRITEV:
- case IORING_OP_WRITE_FIXED:
- case IORING_OP_WRITE:
- ret = io_write(req, issue_flags);
- break;
- case IORING_OP_FSYNC:
- ret = io_fsync(req, issue_flags);
- break;
- case IORING_OP_POLL_ADD:
- ret = io_poll_add(req, issue_flags);
- break;
- case IORING_OP_POLL_REMOVE:
- ret = io_poll_update(req, issue_flags);
- break;
- case IORING_OP_SYNC_FILE_RANGE:
- ret = io_sync_file_range(req, issue_flags);
- break;
- case IORING_OP_SENDMSG:
- ret = io_sendmsg(req, issue_flags);
- break;
- case IORING_OP_SEND:
- ret = io_send(req, issue_flags);
- break;
- case IORING_OP_RECVMSG:
- ret = io_recvmsg(req, issue_flags);
- break;
- case IORING_OP_RECV:
- ret = io_recv(req, issue_flags);
- break;
- case IORING_OP_TIMEOUT:
- ret = io_timeout(req, issue_flags);
- break;
- case IORING_OP_TIMEOUT_REMOVE:
- ret = io_timeout_remove(req, issue_flags);
- break;
- case IORING_OP_ACCEPT:
- ret = io_accept(req, issue_flags);
- break;
- case IORING_OP_CONNECT:
- ret = io_connect(req, issue_flags);
- break;
- case IORING_OP_ASYNC_CANCEL:
- ret = io_async_cancel(req, issue_flags);
- break;
- case IORING_OP_FALLOCATE:
- ret = io_fallocate(req, issue_flags);
- break;
- case IORING_OP_OPENAT:
- ret = io_openat(req, issue_flags);
- break;
- case IORING_OP_CLOSE:
- ret = io_close(req, issue_flags);
- break;
- case IORING_OP_FILES_UPDATE:
- ret = io_files_update(req, issue_flags);
- break;
- case IORING_OP_STATX:
- ret = io_statx(req, issue_flags);
- break;
- case IORING_OP_FADVISE:
- ret = io_fadvise(req, issue_flags);
- break;
- case IORING_OP_MADVISE:
- ret = io_madvise(req, issue_flags);
- break;
- case IORING_OP_OPENAT2:
- ret = io_openat2(req, issue_flags);
- break;
- case IORING_OP_EPOLL_CTL:
- ret = io_epoll_ctl(req, issue_flags);
- break;
- case IORING_OP_SPLICE:
- ret = io_splice(req, issue_flags);
- break;
- case IORING_OP_PROVIDE_BUFFERS:
- ret = io_provide_buffers(req, issue_flags);
- break;
- case IORING_OP_REMOVE_BUFFERS:
- ret = io_remove_buffers(req, issue_flags);
- break;
- case IORING_OP_TEE:
- ret = io_tee(req, issue_flags);
- break;
- case IORING_OP_SHUTDOWN:
- ret = io_shutdown(req, issue_flags);
- break;
- case IORING_OP_RENAMEAT:
- ret = io_renameat(req, issue_flags);
- break;
- case IORING_OP_UNLINKAT:
- ret = io_unlinkat(req, issue_flags);
- break;
- case IORING_OP_MKDIRAT:
- ret = io_mkdirat(req, issue_flags);
- break;
- case IORING_OP_SYMLINKAT:
- ret = io_symlinkat(req, issue_flags);
- break;
- case IORING_OP_LINKAT:
- ret = io_linkat(req, issue_flags);
- break;
- case IORING_OP_MSG_RING:
- ret = io_msg_ring(req, issue_flags);
- break;
- default:
- ret = -EINVAL;
- break;
- }
-
- if (!io_op_defs[req->opcode].audit_skip)
- audit_uring_exit(!ret, ret);
-
- if (creds)
- revert_creds(creds);
- if (ret)
- return ret;
- /* If the op doesn't have a file, we're not polling for it */
- if ((req->ctx->flags & IORING_SETUP_IOPOLL) && req->file)
- io_iopoll_req_issued(req, issue_flags);
-
- return 0;
-}
-
-static struct io_wq_work *io_wq_free_work(struct io_wq_work *work)
-{
- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
-
- req = io_put_req_find_next(req);
- return req ? &req->work : NULL;
-}
-
-static void io_wq_submit_work(struct io_wq_work *work)
-{
- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
- const struct io_op_def *def = &io_op_defs[req->opcode];
- unsigned int issue_flags = IO_URING_F_UNLOCKED;
- bool needs_poll = false;
- struct io_kiocb *timeout;
- int ret = 0, err = -ECANCELED;
-
- /* one will be dropped by ->io_free_work() after returning to io-wq */
- if (!(req->flags & REQ_F_REFCOUNT))
- __io_req_set_refcount(req, 2);
- else
- req_ref_get(req);
-
- timeout = io_prep_linked_timeout(req);
- if (timeout)
- io_queue_linked_timeout(timeout);
-
-
- /* either cancelled or io-wq is dying, so don't touch tctx->iowq */
- if (work->flags & IO_WQ_WORK_CANCEL) {
-fail:
- io_req_task_queue_fail(req, err);
- return;
- }
- if (!io_assign_file(req, issue_flags)) {
- err = -EBADF;
- work->flags |= IO_WQ_WORK_CANCEL;
- goto fail;
- }
-
- if (req->flags & REQ_F_FORCE_ASYNC) {
- bool opcode_poll = def->pollin || def->pollout;
-
- if (opcode_poll && file_can_poll(req->file)) {
- needs_poll = true;
- issue_flags |= IO_URING_F_NONBLOCK;
- }
- }
-
- do {
- ret = io_issue_sqe(req, issue_flags);
- if (ret != -EAGAIN)
- break;
- /*
- * We can get EAGAIN for iopolled IO even though we're
- * forcing a sync submission from here, since we can't
- * wait for request slots on the block side.
- */
- if (!needs_poll) {
- if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
- break;
- cond_resched();
- continue;
- }
-
- if (io_arm_poll_handler(req, issue_flags) == IO_APOLL_OK)
- return;
- /* aborted or ready, in either case retry blocking */
- needs_poll = false;
- issue_flags &= ~IO_URING_F_NONBLOCK;
- } while (1);
-
- /* avoid locking problems by failing it from a clean context */
- if (ret)
- io_req_task_queue_fail(req, ret);
-}
-
-static inline struct io_fixed_file *io_fixed_file_slot(struct io_file_table *table,
- unsigned i)
-{
- return &table->files[i];
-}
-
-static inline struct file *io_file_from_index(struct io_ring_ctx *ctx,
- int index)
-{
- struct io_fixed_file *slot = io_fixed_file_slot(&ctx->file_table, index);
-
- return (struct file *) (slot->file_ptr & FFS_MASK);
-}
-
-static void io_fixed_file_set(struct io_fixed_file *file_slot, struct file *file)
-{
- unsigned long file_ptr = (unsigned long) file;
-
- file_ptr |= io_file_get_flags(file);
- file_slot->file_ptr = file_ptr;
-}
-
-static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
- unsigned int issue_flags)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct file *file = NULL;
- unsigned long file_ptr;
-
- if (issue_flags & IO_URING_F_UNLOCKED)
- mutex_lock(&ctx->uring_lock);
-
- if (unlikely((unsigned int)fd >= ctx->nr_user_files))
- goto out;
- fd = array_index_nospec(fd, ctx->nr_user_files);
- file_ptr = io_fixed_file_slot(&ctx->file_table, fd)->file_ptr;
- file = (struct file *) (file_ptr & FFS_MASK);
- file_ptr &= ~FFS_MASK;
- /* mask in overlapping REQ_F and FFS bits */
- req->flags |= (file_ptr << REQ_F_SUPPORT_NOWAIT_BIT);
- io_req_set_rsrc_node(req, ctx, 0);
-out:
- if (issue_flags & IO_URING_F_UNLOCKED)
- mutex_unlock(&ctx->uring_lock);
- return file;
-}
-
-static struct file *io_file_get_normal(struct io_kiocb *req, int fd)
-{
- struct file *file = fget(fd);
-
- trace_io_uring_file_get(req->ctx, req, req->user_data, fd);
-
- /* we don't allow fixed io_uring files */
- if (file && file->f_op == &io_uring_fops)
- io_req_track_inflight(req);
- return file;
-}
-
-static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
-{
- struct io_kiocb *prev = req->timeout.prev;
- int ret = -ENOENT;
-
- if (prev) {
- if (!(req->task->flags & PF_EXITING))
- ret = io_try_cancel_userdata(req, prev->user_data);
- io_req_complete_post(req, ret ?: -ETIME, 0);
- io_put_req(prev);
- } else {
- io_req_complete_post(req, -ETIME, 0);
- }
-}
-
-static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
-{
- struct io_timeout_data *data = container_of(timer,
- struct io_timeout_data, timer);
- struct io_kiocb *prev, *req = data->req;
- struct io_ring_ctx *ctx = req->ctx;
- unsigned long flags;
-
- spin_lock_irqsave(&ctx->timeout_lock, flags);
- prev = req->timeout.head;
- req->timeout.head = NULL;
-
- /*
- * We don't expect the list to be empty, that will only happen if we
- * race with the completion of the linked work.
- */
- if (prev) {
- io_remove_next_linked(prev);
- if (!req_ref_inc_not_zero(prev))
- prev = NULL;
- }
- list_del(&req->timeout.list);
- req->timeout.prev = prev;
- spin_unlock_irqrestore(&ctx->timeout_lock, flags);
-
- req->io_task_work.func = io_req_task_link_timeout;
- io_req_task_work_add(req, false);
- return HRTIMER_NORESTART;
-}
-
-static void io_queue_linked_timeout(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
-
- spin_lock_irq(&ctx->timeout_lock);
- /*
- * If the back reference is NULL, then our linked request finished
- * before we got a chance to setup the timer
- */
- if (req->timeout.head) {
- struct io_timeout_data *data = req->async_data;
-
- data->timer.function = io_link_timeout_fn;
- hrtimer_start(&data->timer, timespec64_to_ktime(data->ts),
- data->mode);
- list_add_tail(&req->timeout.list, &ctx->ltimeout_list);
- }
- spin_unlock_irq(&ctx->timeout_lock);
- /* drop submission reference */
- io_put_req(req);
-}
-
-static void io_queue_sqe_arm_apoll(struct io_kiocb *req)
- __must_hold(&req->ctx->uring_lock)
-{
- struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
-
- switch (io_arm_poll_handler(req, 0)) {
- case IO_APOLL_READY:
- io_req_task_queue(req);
- break;
- case IO_APOLL_ABORTED:
- /*
- * Queued up for async execution, worker will release
- * submit reference when the iocb is actually submitted.
- */
- io_queue_async_work(req, NULL);
- break;
- case IO_APOLL_OK:
- break;
- }
-
- if (linked_timeout)
- io_queue_linked_timeout(linked_timeout);
-}
-
-static inline void __io_queue_sqe(struct io_kiocb *req)
- __must_hold(&req->ctx->uring_lock)
-{
- struct io_kiocb *linked_timeout;
- int ret;
-
- ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
-
- if (req->flags & REQ_F_COMPLETE_INLINE) {
- io_req_add_compl_list(req);
- return;
- }
- /*
- * We async punt it if the file wasn't marked NOWAIT, or if the file
- * doesn't support non-blocking read/write attempts
- */
- if (likely(!ret)) {
- linked_timeout = io_prep_linked_timeout(req);
- if (linked_timeout)
- io_queue_linked_timeout(linked_timeout);
- } else if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
- io_queue_sqe_arm_apoll(req);
- } else {
- io_req_complete_failed(req, ret);
- }
-}
-
-static void io_queue_sqe_fallback(struct io_kiocb *req)
- __must_hold(&req->ctx->uring_lock)
-{
- if (req->flags & REQ_F_FAIL) {
- io_req_complete_fail_submit(req);
- } else if (unlikely(req->ctx->drain_active)) {
- io_drain_req(req);
- } else {
- int ret = io_req_prep_async(req);
-
- if (unlikely(ret))
- io_req_complete_failed(req, ret);
- else
- io_queue_async_work(req, NULL);
- }
-}
-
-static inline void io_queue_sqe(struct io_kiocb *req)
- __must_hold(&req->ctx->uring_lock)
-{
- if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
- __io_queue_sqe(req);
- else
- io_queue_sqe_fallback(req);
-}
-
-/*
- * Check SQE restrictions (opcode and flags).
- *
- * Returns 'true' if SQE is allowed, 'false' otherwise.
- */
-static inline bool io_check_restriction(struct io_ring_ctx *ctx,
- struct io_kiocb *req,
- unsigned int sqe_flags)
-{
- if (!test_bit(req->opcode, ctx->restrictions.sqe_op))
- return false;
-
- if ((sqe_flags & ctx->restrictions.sqe_flags_required) !=
- ctx->restrictions.sqe_flags_required)
- return false;
-
- if (sqe_flags & ~(ctx->restrictions.sqe_flags_allowed |
- ctx->restrictions.sqe_flags_required))
- return false;
-
- return true;
-}
-
-static void io_init_req_drain(struct io_kiocb *req)
-{
- struct io_ring_ctx *ctx = req->ctx;
- struct io_kiocb *head = ctx->submit_state.link.head;
-
- ctx->drain_active = true;
- if (head) {
- /*
- * If we need to drain a request in the middle of a link, drain
- * the head request and the next request/link after the current
- * link. Considering sequential execution of links,
- * REQ_F_IO_DRAIN will be maintained for every request of our
- * link.
- */
- head->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
- ctx->drain_next = true;
- }
-}
-
-static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
- __must_hold(&ctx->uring_lock)
-{
- unsigned int sqe_flags;
- int personality;
- u8 opcode;
-
- /* req is partially pre-initialised, see io_preinit_req() */
- req->opcode = opcode = READ_ONCE(sqe->opcode);
- /* same numerical values with corresponding REQ_F_*, safe to copy */
- req->flags = sqe_flags = READ_ONCE(sqe->flags);
- req->user_data = READ_ONCE(sqe->user_data);
- req->file = NULL;
- req->fixed_rsrc_refs = NULL;
- req->task = current;
-
- if (unlikely(opcode >= IORING_OP_LAST)) {
- req->opcode = 0;
- return -EINVAL;
- }
- if (unlikely(sqe_flags & ~SQE_COMMON_FLAGS)) {
- /* enforce forwards compatibility on users */
- if (sqe_flags & ~SQE_VALID_FLAGS)
- return -EINVAL;
- if ((sqe_flags & IOSQE_BUFFER_SELECT) &&
- !io_op_defs[opcode].buffer_select)
- return -EOPNOTSUPP;
- if (sqe_flags & IOSQE_CQE_SKIP_SUCCESS)
- ctx->drain_disabled = true;
- if (sqe_flags & IOSQE_IO_DRAIN) {
- if (ctx->drain_disabled)
- return -EOPNOTSUPP;
- io_init_req_drain(req);
- }
- }
- if (unlikely(ctx->restricted || ctx->drain_active || ctx->drain_next)) {
- if (ctx->restricted && !io_check_restriction(ctx, req, sqe_flags))
- return -EACCES;
- /* knock it to the slow queue path, will be drained there */
- if (ctx->drain_active)
- req->flags |= REQ_F_FORCE_ASYNC;
- /* if there is no link, we're at "next" request and need to drain */
- if (unlikely(ctx->drain_next) && !ctx->submit_state.link.head) {
- ctx->drain_next = false;
- ctx->drain_active = true;
- req->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
- }
- }
-
- if (io_op_defs[opcode].needs_file) {
- struct io_submit_state *state = &ctx->submit_state;
-
- req->fd = READ_ONCE(sqe->fd);
-
- /*
- * Plug now if we have more than 2 IO left after this, and the
- * target is potentially a read/write to block based storage.
- */
- if (state->need_plug && io_op_defs[opcode].plug) {
- state->plug_started = true;
- state->need_plug = false;
- blk_start_plug_nr_ios(&state->plug, state->submit_nr);
- }
- }
-
- personality = READ_ONCE(sqe->personality);
- if (personality) {
- int ret;
-
- req->creds = xa_load(&ctx->personalities, personality);
- if (!req->creds)
- return -EINVAL;
- get_cred(req->creds);
- ret = security_uring_override_creds(req->creds);
- if (ret) {
- put_cred(req->creds);
- return ret;
- }
- req->flags |= REQ_F_CREDS;
- }
-
- return io_req_prep(req, sqe);
-}
-
-static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
- const struct io_uring_sqe *sqe)
- __must_hold(&ctx->uring_lock)
-{
- struct io_submit_link *link = &ctx->submit_state.link;
- int ret;
-
- ret = io_init_req(ctx, req, sqe);
- if (unlikely(ret)) {
- trace_io_uring_req_failed(sqe, ctx, req, ret);
-
- /* fail even hard links since we don't submit */
- if (link->head) {
- /*
- * we can judge a link req is failed or cancelled by if
- * REQ_F_FAIL is set, but the head is an exception since
- * it may be set REQ_F_FAIL because of other req's failure
- * so let's leverage req->result to distinguish if a head
- * is set REQ_F_FAIL because of its failure or other req's
- * failure so that we can set the correct ret code for it.
- * init result here to avoid affecting the normal path.
- */
- if (!(link->head->flags & REQ_F_FAIL))
- req_fail_link_node(link->head, -ECANCELED);
- } else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
- /*
- * the current req is a normal req, we should return
- * error and thus break the submittion loop.
- */
- io_req_complete_failed(req, ret);
- return ret;
- }
- req_fail_link_node(req, ret);
- }
-
- /* don't need @sqe from now on */
- trace_io_uring_submit_sqe(ctx, req, req->user_data, req->opcode,
- req->flags, true,
- ctx->flags & IORING_SETUP_SQPOLL);
-
- /*
- * If we already have a head request, queue this one for async
- * submittal once the head completes. If we don't have a head but
- * IOSQE_IO_LINK is set in the sqe, start a new head. This one will be
- * submitted sync once the chain is complete. If none of those
- * conditions are true (normal request), then just queue it.
- */
- if (link->head) {
- struct io_kiocb *head = link->head;
-
- if (!(req->flags & REQ_F_FAIL)) {
- ret = io_req_prep_async(req);
- if (unlikely(ret)) {
- req_fail_link_node(req, ret);
- if (!(head->flags & REQ_F_FAIL))
- req_fail_link_node(head, -ECANCELED);
- }
- }
- trace_io_uring_link(ctx, req, head);
- link->last->link = req;
- link->last = req;
-
- if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
- return 0;
- /* last request of a link, enqueue the link */
- link->head = NULL;
- req = head;
- } else if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
- link->head = req;
- link->last = req;
- return 0;
- }
-
- io_queue_sqe(req);
- return 0;
-}
-
-/*
- * Batched submission is done, ensure local IO is flushed out.
- */
-static void io_submit_state_end(struct io_ring_ctx *ctx)
-{
- struct io_submit_state *state = &ctx->submit_state;
-
- if (state->link.head)
- io_queue_sqe(state->link.head);
- /* flush only after queuing links as they can generate completions */
- io_submit_flush_completions(ctx);
- if (state->plug_started)
- blk_finish_plug(&state->plug);
-}
-
-/*
- * Start submission side cache.
- */
-static void io_submit_state_start(struct io_submit_state *state,
- unsigned int max_ios)
-{
- state->plug_started = false;
- state->need_plug = max_ios > 2;
- state->submit_nr = max_ios;
- /* set only head, no need to init link_last in advance */
- state->link.head = NULL;
-}
-
-static void io_commit_sqring(struct io_ring_ctx *ctx)
-{
- struct io_rings *rings = ctx->rings;
-
- /*
- * Ensure any loads from the SQEs are done at this point,
- * since once we write the new head, the application could
- * write new data to them.
- */
- smp_store_release(&rings->sq.head, ctx->cached_sq_head);
-}
-
-/*
- * Fetch an sqe, if one is available. Note this returns a pointer to memory
- * that is mapped by userspace. This means that care needs to be taken to
- * ensure that reads are stable, as we cannot rely on userspace always
- * being a good citizen. If members of the sqe are validated and then later
- * used, it's important that those reads are done through READ_ONCE() to
- * prevent a re-load down the line.
- */
-static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
-{
- unsigned head, mask = ctx->sq_entries - 1;
- unsigned sq_idx = ctx->cached_sq_head++ & mask;
-
- /*
- * The cached sq head (or cq tail) serves two purposes:
- *
- * 1) allows us to batch the cost of updating the user visible
- * head updates.
- * 2) allows the kernel side to track the head on its own, even
- * though the application is the one updating it.
- */
- head = READ_ONCE(ctx->sq_array[sq_idx]);
- if (likely(head < ctx->sq_entries))
- return &ctx->sq_sqes[head];
-
- /* drop invalid entries */
- ctx->cq_extra--;
- WRITE_ONCE(ctx->rings->sq_dropped,
- READ_ONCE(ctx->rings->sq_dropped) + 1);
- return NULL;
-}
-
-static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
- __must_hold(&ctx->uring_lock)
-{
- unsigned int entries = io_sqring_entries(ctx);
- int submitted = 0;
-
- if (unlikely(!entries))
- return 0;
- /* make sure SQ entry isn't read before tail */
- nr = min3(nr, ctx->sq_entries, entries);
- io_get_task_refs(nr);
-
- io_submit_state_start(&ctx->submit_state, nr);
- do {
- const struct io_uring_sqe *sqe;
- struct io_kiocb *req;
-
- if (unlikely(!io_alloc_req_refill(ctx))) {
- if (!submitted)
- submitted = -EAGAIN;
- break;
- }
- req = io_alloc_req(ctx);
- sqe = io_get_sqe(ctx);
- if (unlikely(!sqe)) {
- wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
- break;
- }
- /* will complete beyond this point, count as submitted */
- submitted++;
- if (io_submit_sqe(ctx, req, sqe)) {
- /*
- * Continue submitting even for sqe failure if the
- * ring was setup with IORING_SETUP_SUBMIT_ALL
- */
- if (!(ctx->flags & IORING_SETUP_SUBMIT_ALL))
- break;
- }
- } while (submitted < nr);
-
- if (unlikely(submitted != nr)) {
- int ref_used = (submitted == -EAGAIN) ? 0 : submitted;
- int unused = nr - ref_used;
-
- current->io_uring->cached_refs += unused;
- }
-
- io_submit_state_end(ctx);
- /* Commit SQ ring head once we've consumed and submitted all SQEs */
- io_commit_sqring(ctx);
-
- return submitted;
-}
-
-static inline bool io_sqd_events_pending(struct io_sq_data *sqd)
-{
- return READ_ONCE(sqd->state);
-}
-
-static inline void io_ring_set_wakeup_flag(struct io_ring_ctx *ctx)
-{
- /* Tell userspace we may need a wakeup call */
- spin_lock(&ctx->completion_lock);
- WRITE_ONCE(ctx->rings->sq_flags,
- ctx->rings->sq_flags | IORING_SQ_NEED_WAKEUP);
- spin_unlock(&ctx->completion_lock);
-}
-
-static inline void io_ring_clear_wakeup_flag(struct io_ring_ctx *ctx)
-{
- spin_lock(&ctx->completion_lock);
- WRITE_ONCE(ctx->rings->sq_flags,
- ctx->rings->sq_flags & ~IORING_SQ_NEED_WAKEUP);
- spin_unlock(&ctx->completion_lock);
-}
-
-static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
-{
- unsigned int to_submit;
- int ret = 0;
-
- to_submit = io_sqring_entries(ctx);
- /* if we're handling multiple rings, cap submit size for fairness */
- if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
- to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;
-
- if (!wq_list_empty(&ctx->iopoll_list) || to_submit) {
- const struct cred *creds = NULL;
-
- if (ctx->sq_creds != current_cred())
- creds = override_creds(ctx->sq_creds);
-
- mutex_lock(&ctx->uring_lock);
- if (!wq_list_empty(&ctx->iopoll_list))
- io_do_iopoll(ctx, true);
-
- /*
- * Don't submit if refs are dying, good for io_uring_register(),
- * but also it is relied upon by io_ring_exit_work()
- */
- if (to_submit && likely(!percpu_ref_is_dying(&ctx->refs)) &&
- !(ctx->flags & IORING_SETUP_R_DISABLED))
- ret = io_submit_sqes(ctx, to_submit);
- mutex_unlock(&ctx->uring_lock);
-
- if (to_submit && wq_has_sleeper(&ctx->sqo_sq_wait))
- wake_up(&ctx->sqo_sq_wait);
- if (creds)
- revert_creds(creds);
- }
-
- return ret;
-}
-
-static __cold void io_sqd_update_thread_idle(struct io_sq_data *sqd)
-{
- struct io_ring_ctx *ctx;
- unsigned sq_thread_idle = 0;
-
- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
- sq_thread_idle = max(sq_thread_idle, ctx->sq_thread_idle);
- sqd->sq_thread_idle = sq_thread_idle;
-}
-
-static bool io_sqd_handle_event(struct io_sq_data *sqd)
-{
- bool did_sig = false;
- struct ksignal ksig;
-
- if (test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) ||
- signal_pending(current)) {
- mutex_unlock(&sqd->lock);
- if (signal_pending(current))
- did_sig = get_signal(&ksig);
- cond_resched();
- mutex_lock(&sqd->lock);
- }
- return did_sig || test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
-}
-
-static int io_sq_thread(void *data)
-{
- struct io_sq_data *sqd = data;
- struct io_ring_ctx *ctx;
- unsigned long timeout = 0;
- char buf[TASK_COMM_LEN];
- DEFINE_WAIT(wait);
-
- snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
- set_task_comm(current, buf);
-
- if (sqd->sq_cpu != -1)
- set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
- else
- set_cpus_allowed_ptr(current, cpu_online_mask);
- current->flags |= PF_NO_SETAFFINITY;
-
- audit_alloc_kernel(current);
-
- mutex_lock(&sqd->lock);
- while (1) {
- bool cap_entries, sqt_spin = false;
-
- if (io_sqd_events_pending(sqd) || signal_pending(current)) {
- if (io_sqd_handle_event(sqd))
- break;
- timeout = jiffies + sqd->sq_thread_idle;
- }
-
- cap_entries = !list_is_singular(&sqd->ctx_list);
- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
- int ret = __io_sq_thread(ctx, cap_entries);
-
- if (!sqt_spin && (ret > 0 || !wq_list_empty(&ctx->iopoll_list)))
- sqt_spin = true;
- }
- if (io_run_task_work())
- sqt_spin = true;
-
- if (sqt_spin || !time_after(jiffies, timeout)) {
- cond_resched();
- if (sqt_spin)
- timeout = jiffies + sqd->sq_thread_idle;
- continue;
- }
-
- prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
- if (!io_sqd_events_pending(sqd) && !task_work_pending(current)) {
- bool needs_sched = true;
-
- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
- io_ring_set_wakeup_flag(ctx);
-
- if ((ctx->flags & IORING_SETUP_IOPOLL) &&
- !wq_list_empty(&ctx->iopoll_list)) {
- needs_sched = false;
- break;
- }
-
- /*
- * Ensure the store of the wakeup flag is not
- * reordered with the load of the SQ tail
- */
- smp_mb();
-
- if (io_sqring_entries(ctx)) {
- needs_sched = false;
- break;
- }
- }
-
- if (needs_sched) {
- mutex_unlock(&sqd->lock);
- schedule();
- mutex_lock(&sqd->lock);
- }
- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
- io_ring_clear_wakeup_flag(ctx);
- }
-
- finish_wait(&sqd->wait, &wait);
- timeout = jiffies + sqd->sq_thread_idle;
- }
-
- io_uring_cancel_generic(true, sqd);
- sqd->thread = NULL;
- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
- io_ring_set_wakeup_flag(ctx);
- io_run_task_work();
- mutex_unlock(&sqd->lock);
-
- audit_free(current);
-
- complete(&sqd->exited);
- do_exit(0);
-}
-
-struct io_wait_queue {
- struct wait_queue_entry wq;
- struct io_ring_ctx *ctx;
- unsigned cq_tail;
- unsigned nr_timeouts;
-};
-
-static inline bool io_should_wake(struct io_wait_queue *iowq)
-{
- struct io_ring_ctx *ctx = iowq->ctx;
- int dist = ctx->cached_cq_tail - (int) iowq->cq_tail;
-
- /*
- * Wake up if we have enough events, or if a timeout occurred since we
- * started waiting. For timeouts, we always want to return to userspace,
- * regardless of event count.
- */
- return dist >= 0 || atomic_read(&ctx->cq_timeouts) != iowq->nr_timeouts;
-}
-
-static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
- int wake_flags, void *key)
-{
- struct io_wait_queue *iowq = container_of(curr, struct io_wait_queue,
- wq);
-
- /*
- * Cannot safely flush overflowed CQEs from here, ensure we wake up
- * the task, and the next invocation will do it.
- */
- if (io_should_wake(iowq) || test_bit(0, &iowq->ctx->check_cq_overflow))
- return autoremove_wake_function(curr, mode, wake_flags, key);
- return -1;
-}
-
-static int io_run_task_work_sig(void)
-{
- if (io_run_task_work())
- return 1;
- if (test_thread_flag(TIF_NOTIFY_SIGNAL))
- return -ERESTARTSYS;
- if (task_sigpending(current))
- return -EINTR;
- return 0;
-}
-
-/* when returns >0, the caller should retry */
-static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
- struct io_wait_queue *iowq,
- ktime_t timeout)
-{
- int ret;
-
- /* make sure we run task_work before checking for signals */
- ret = io_run_task_work_sig();
- if (ret || io_should_wake(iowq))
- return ret;
- /* let the caller flush overflows, retry */
- if (test_bit(0, &ctx->check_cq_overflow))
- return 1;
-
- if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
- return -ETIME;
- return 1;
-}
-
-/*
- * Wait until events become available, if we don't already have some. The
- * application must reap them itself, as they reside on the shared cq ring.
- */
-static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
- const sigset_t __user *sig, size_t sigsz,
- struct __kernel_timespec __user *uts)
-{
- struct io_wait_queue iowq;
- struct io_rings *rings = ctx->rings;
- ktime_t timeout = KTIME_MAX;
- int ret;
-
- do {
- io_cqring_overflow_flush(ctx);
- if (io_cqring_events(ctx) >= min_events)
- return 0;
- if (!io_run_task_work())
- break;
- } while (1);
-
- if (sig) {
-#ifdef CONFIG_COMPAT
- if (in_compat_syscall())
- ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
- sigsz);
- else
-#endif
- ret = set_user_sigmask(sig, sigsz);
-
- if (ret)
- return ret;
- }
-
- if (uts) {
- struct timespec64 ts;
-
- if (get_timespec64(&ts, uts))
- return -EFAULT;
- timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
- }
-
- init_waitqueue_func_entry(&iowq.wq, io_wake_function);
- iowq.wq.private = current;
- INIT_LIST_HEAD(&iowq.wq.entry);
- iowq.ctx = ctx;
- iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
- iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
-
- trace_io_uring_cqring_wait(ctx, min_events);
- do {
- /* if we can't even flush overflow, don't wait for more */
- if (!io_cqring_overflow_flush(ctx)) {
- ret = -EBUSY;
- break;
- }
- prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
- TASK_INTERRUPTIBLE);
- ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
- finish_wait(&ctx->cq_wait, &iowq.wq);
- cond_resched();
- } while (ret > 0);
-
- restore_saved_sigmask_unless(ret == -EINTR);
-
- return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret : 0;
-}
-
-static void io_free_page_table(void **table, size_t size)
-{
- unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
-
- for (i = 0; i < nr_tables; i++)
- kfree(table[i]);
- kfree(table);
-}
-
-static __cold void **io_alloc_page_table(size_t size)
-{
- unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
- size_t init_size = size;
- void **table;
-
- table = kcalloc(nr_tables, sizeof(*table), GFP_KERNEL_ACCOUNT);
- if (!table)
- return NULL;
-
- for (i = 0; i < nr_tables; i++) {
- unsigned int this_size = min_t(size_t, size, PAGE_SIZE);
-
- table[i] = kzalloc(this_size, GFP_KERNEL_ACCOUNT);
- if (!table[i]) {
- io_free_page_table(table, init_size);
- return NULL;
- }
- size -= this_size;
- }
- return table;
-}
-
-static void io_rsrc_node_destroy(struct io_rsrc_node *ref_node)
-{
- percpu_ref_exit(&ref_node->refs);
- kfree(ref_node);
-}
-
-static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
-{
- struct io_rsrc_node *node = container_of(ref, struct io_rsrc_node, refs);
- struct io_ring_ctx *ctx = node->rsrc_data->ctx;
- unsigned long flags;
- bool first_add = false;
- unsigned long delay = HZ;
-
- spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
- node->done = true;
-
- /* if we are mid-quiesce then do not delay */
- if (node->rsrc_data->quiesce)
- delay = 0;
-
- while (!list_empty(&ctx->rsrc_ref_list)) {
- node = list_first_entry(&ctx->rsrc_ref_list,
- struct io_rsrc_node, node);
- /* recycle ref nodes in order */
- if (!node->done)
- break;
- list_del(&node->node);
- first_add |= llist_add(&node->llist, &ctx->rsrc_put_llist);
- }
- spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
-
- if (first_add)
- mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
-}
-
-static struct io_rsrc_node *io_rsrc_node_alloc(void)
-{
- struct io_rsrc_node *ref_node;
-
- ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
- if (!ref_node)
- return NULL;
-
- if (percpu_ref_init(&ref_node->refs, io_rsrc_node_ref_zero,
- 0, GFP_KERNEL)) {
- kfree(ref_node);
- return NULL;
- }
- INIT_LIST_HEAD(&ref_node->node);
- INIT_LIST_HEAD(&ref_node->rsrc_list);
- ref_node->done = false;
- return ref_node;
-}
-
-static void io_rsrc_node_switch(struct io_ring_ctx *ctx,
- struct io_rsrc_data *data_to_kill)
- __must_hold(&ctx->uring_lock)
-{
- WARN_ON_ONCE(!ctx->rsrc_backup_node);
- WARN_ON_ONCE(data_to_kill && !ctx->rsrc_node);
-
- io_rsrc_refs_drop(ctx);
-
- if (data_to_kill) {
- struct io_rsrc_node *rsrc_node = ctx->rsrc_node;
-
- rsrc_node->rsrc_data = data_to_kill;
- spin_lock_irq(&ctx->rsrc_ref_lock);
- list_add_tail(&rsrc_node->node, &ctx->rsrc_ref_list);
- spin_unlock_irq(&ctx->rsrc_ref_lock);
-
- atomic_inc(&data_to_kill->refs);
- percpu_ref_kill(&rsrc_node->refs);
- ctx->rsrc_node = NULL;
- }
-
- if (!ctx->rsrc_node) {
- ctx->rsrc_node = ctx->rsrc_backup_node;
- ctx->rsrc_backup_node = NULL;
- }
-}
-
-static int io_rsrc_node_switch_start(struct io_ring_ctx *ctx)
-{
- if (ctx->rsrc_backup_node)
- return 0;
- ctx->rsrc_backup_node = io_rsrc_node_alloc();
- return ctx->rsrc_backup_node ? 0 : -ENOMEM;
-}
-
-static __cold int io_rsrc_ref_quiesce(struct io_rsrc_data *data,
- struct io_ring_ctx *ctx)
-{
- int ret;
-
- /* As we may drop ->uring_lock, other task may have started quiesce */
- if (data->quiesce)
- return -ENXIO;
-
- data->quiesce = true;
- do {
- ret = io_rsrc_node_switch_start(ctx);
- if (ret)
- break;
- io_rsrc_node_switch(ctx, data);
-
- /* kill initial ref, already quiesced if zero */
- if (atomic_dec_and_test(&data->refs))
- break;
- mutex_unlock(&ctx->uring_lock);
- flush_delayed_work(&ctx->rsrc_put_work);
- ret = wait_for_completion_interruptible(&data->done);
- if (!ret) {
- mutex_lock(&ctx->uring_lock);
- if (atomic_read(&data->refs) > 0) {
- /*
- * it has been revived by another thread while
- * we were unlocked
- */
- mutex_unlock(&ctx->uring_lock);
- } else {
- break;
- }
- }
-
- atomic_inc(&data->refs);
- /* wait for all works potentially completing data->done */
- flush_delayed_work(&ctx->rsrc_put_work);
- reinit_completion(&data->done);
-
- ret = io_run_task_work_sig();
- mutex_lock(&ctx->uring_lock);
- } while (ret >= 0);
- data->quiesce = false;
-
- return ret;
-}
-
-static u64 *io_get_tag_slot(struct io_rsrc_data *data, unsigned int idx)
-{
- unsigned int off = idx & IO_RSRC_TAG_TABLE_MASK;
- unsigned int table_idx = idx >> IO_RSRC_TAG_TABLE_SHIFT;
-
- return &data->tags[table_idx][off];
-}
-
-static void io_rsrc_data_free(struct io_rsrc_data *data)
-{
- size_t size = data->nr * sizeof(data->tags[0][0]);
-
- if (data->tags)
- io_free_page_table((void **)data->tags, size);
- kfree(data);
-}
-
-static __cold int io_rsrc_data_alloc(struct io_ring_ctx *ctx, rsrc_put_fn *do_put,
- u64 __user *utags, unsigned nr,
- struct io_rsrc_data **pdata)
-{
- struct io_rsrc_data *data;
- int ret = -ENOMEM;
- unsigned i;
-
- data = kzalloc(sizeof(*data), GFP_KERNEL);
- if (!data)
- return -ENOMEM;
- data->tags = (u64 **)io_alloc_page_table(nr * sizeof(data->tags[0][0]));
- if (!data->tags) {
- kfree(data);
- return -ENOMEM;
- }
-
- data->nr = nr;
- data->ctx = ctx;
- data->do_put = do_put;
- if (utags) {
- ret = -EFAULT;
- for (i = 0; i < nr; i++) {
- u64 *tag_slot = io_get_tag_slot(data, i);
-
- if (copy_from_user(tag_slot, &utags[i],
- sizeof(*tag_slot)))
- goto fail;
- }
- }
-
- atomic_set(&data->refs, 1);
- init_completion(&data->done);
- *pdata = data;
- return 0;
-fail:
- io_rsrc_data_free(data);
- return ret;
-}
-
-static bool io_alloc_file_tables(struct io_file_table *table, unsigned nr_files)
-{
- table->files = kvcalloc(nr_files, sizeof(table->files[0]),
- GFP_KERNEL_ACCOUNT);
- return !!table->files;
-}
-
-static void io_free_file_tables(struct io_file_table *table)
-{
- kvfree(table->files);
- table->files = NULL;
-}
-
-static void __io_sqe_files_unregister(struct io_ring_ctx *ctx)
-{
-#if defined(CONFIG_UNIX)
- if (ctx->ring_sock) {
- struct sock *sock = ctx->ring_sock->sk;
- struct sk_buff *skb;
-
- while ((skb = skb_dequeue(&sock->sk_receive_queue)) != NULL)
- kfree_skb(skb);
- }
-#else
- int i;
-
- for (i = 0; i < ctx->nr_user_files; i++) {
- struct file *file;
-
- file = io_file_from_index(ctx, i);
- if (file)
- fput(file);
- }
-#endif
- io_free_file_tables(&ctx->file_table);
- io_rsrc_data_free(ctx->file_data);
- ctx->file_data = NULL;
- ctx->nr_user_files = 0;
-}
-
-static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
-{
- unsigned nr = ctx->nr_user_files;
- int ret;
-
- if (!ctx->file_data)
- return -ENXIO;
-
- /*
- * Quiesce may unlock ->uring_lock, and while it's not held
- * prevent new requests using the table.
- */
- ctx->nr_user_files = 0;
- ret = io_rsrc_ref_quiesce(ctx->file_data, ctx);
- ctx->nr_user_files = nr;
- if (!ret)
- __io_sqe_files_unregister(ctx);
- return ret;
-}
-
-static void io_sq_thread_unpark(struct io_sq_data *sqd)
- __releases(&sqd->lock)
-{
- WARN_ON_ONCE(sqd->thread == current);
-
- /*
- * Do the dance but not conditional clear_bit() because it'd race with
- * other threads incrementing park_pending and setting the bit.
- */
- clear_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
- if (atomic_dec_return(&sqd->park_pending))
- set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
- mutex_unlock(&sqd->lock);
-}
-
-static void io_sq_thread_park(struct io_sq_data *sqd)
- __acquires(&sqd->lock)
-{
- WARN_ON_ONCE(sqd->thread == current);
-
- atomic_inc(&sqd->park_pending);
- set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
- mutex_lock(&sqd->lock);
- if (sqd->thread)
- wake_up_process(sqd->thread);
-}
-
-static void io_sq_thread_stop(struct io_sq_data *sqd)
-{
- WARN_ON_ONCE(sqd->thread == current);
- WARN_ON_ONCE(test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state));
-
- set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
- mutex_lock(&sqd->lock);
- if (sqd->thread)
- wake_up_process(sqd->thread);
- mutex_unlock(&sqd->lock);
- wait_for_completion(&sqd->exited);
-}
-
-static void io_put_sq_data(struct io_sq_data *sqd)
-{
- if (refcount_dec_and_test(&sqd->refs)) {
- WARN_ON_ONCE(atomic_read(&sqd->park_pending));
-
- io_sq_thread_stop(sqd);
- kfree(sqd);
- }
-}
-
-static void io_sq_thread_finish(struct io_ring_ctx *ctx)
-{
- struct io_sq_data *sqd = ctx->sq_data;
-
- if (sqd) {
- io_sq_thread_park(sqd);
- list_del_init(&ctx->sqd_list);
- io_sqd_update_thread_idle(sqd);
- io_sq_thread_unpark(sqd);
-
- io_put_sq_data(sqd);
- ctx->sq_data = NULL;
- }
-}
-
-static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
-{
- struct io_ring_ctx *ctx_attach;
- struct io_sq_data *sqd;
- struct fd f;
-
- f = fdget(p->wq_fd);
- if (!f.file)
- return ERR_PTR(-ENXIO);
- if (f.file->f_op != &io_uring_fops) {
- fdput(f);
- return ERR_PTR(-EINVAL);
- }
-
- ctx_attach = f.file->private_data;
- sqd = ctx_attach->sq_data;
- if (!sqd) {
- fdput(f);
- return ERR_PTR(-EINVAL);
- }
- if (sqd->task_tgid != current->tgid) {
- fdput(f);
- return ERR_PTR(-EPERM);
- }
-
- refcount_inc(&sqd->refs);
- fdput(f);
- return sqd;
-}
-
-static struct io_sq_data *io_get_sq_data(struct io_uring_params *p,
- bool *attached)
-{
- struct io_sq_data *sqd;
-
- *attached = false;
- if (p->flags & IORING_SETUP_ATTACH_WQ) {
- sqd = io_attach_sq_data(p);
- if (!IS_ERR(sqd)) {
- *attached = true;
- return sqd;
- }
- /* fall through for EPERM case, setup new sqd/task */
- if (PTR_ERR(sqd) != -EPERM)
- return sqd;
- }
-
- sqd = kzalloc(sizeof(*sqd), GFP_KERNEL);
- if (!sqd)
- return ERR_PTR(-ENOMEM);
-
- atomic_set(&sqd->park_pending, 0);
- refcount_set(&sqd->refs, 1);
- INIT_LIST_HEAD(&sqd->ctx_list);
- mutex_init(&sqd->lock);
- init_waitqueue_head(&sqd->wait);
- init_completion(&sqd->exited);
- return sqd;
-}
-
-#if defined(CONFIG_UNIX)
-/*
- * Ensure the UNIX gc is aware of our file set, so we are certain that
- * the io_uring can be safely unregistered on process exit, even if we have
- * loops in the file referencing.
- */
-static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
-{
- struct sock *sk = ctx->ring_sock->sk;
- struct scm_fp_list *fpl;
- struct sk_buff *skb;
- int i, nr_files;
-
- fpl = kzalloc(sizeof(*fpl), GFP_KERNEL);
- if (!fpl)
- return -ENOMEM;
-
- skb = alloc_skb(0, GFP_KERNEL);
- if (!skb) {
- kfree(fpl);
- return -ENOMEM;
- }
-
- skb->sk = sk;
-
- nr_files = 0;
- fpl->user = get_uid(current_user());
- for (i = 0; i < nr; i++) {
- struct file *file = io_file_from_index(ctx, i + offset);
-
- if (!file)
- continue;
- fpl->fp[nr_files] = get_file(file);
- unix_inflight(fpl->user, fpl->fp[nr_files]);
- nr_files++;
- }
-
- if (nr_files) {
- fpl->max = SCM_MAX_FD;
- fpl->count = nr_files;
- UNIXCB(skb).fp = fpl;
- skb->destructor = unix_destruct_scm;
- refcount_add(skb->truesize, &sk->sk_wmem_alloc);
- skb_queue_head(&sk->sk_receive_queue, skb);
-
- for (i = 0; i < nr; i++) {
- struct file *file = io_file_from_index(ctx, i + offset);
-
- if (file)
- fput(file);
- }
- } else {
- kfree_skb(skb);
- free_uid(fpl->user);
- kfree(fpl);
- }
-
- return 0;
-}
-
-/*
- * If UNIX sockets are enabled, fd passing can cause a reference cycle which
- * causes regular reference counting to break down. We rely on the UNIX
- * garbage collection to take care of this problem for us.
- */
-static int io_sqe_files_scm(struct io_ring_ctx *ctx)
-{
- unsigned left, total;
- int ret = 0;
-
- total = 0;
- left = ctx->nr_user_files;
- while (left) {
- unsigned this_files = min_t(unsigned, left, SCM_MAX_FD);
-
- ret = __io_sqe_files_scm(ctx, this_files, total);
- if (ret)
- break;
- left -= this_files;
- total += this_files;
- }
-
- if (!ret)
- return 0;
-
- while (total < ctx->nr_user_files) {
- struct file *file = io_file_from_index(ctx, total);
-
- if (file)
- fput(file);
- total++;
- }
-
- return ret;
-}
-#else
-static int io_sqe_files_scm(struct io_ring_ctx *ctx)
-{
- return 0;
-}
-#endif
-
-static void io_rsrc_file_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
-{
- struct file *file = prsrc->file;
-#if defined(CONFIG_UNIX)
- struct sock *sock = ctx->ring_sock->sk;
- struct sk_buff_head list, *head = &sock->sk_receive_queue;
- struct sk_buff *skb;
- int i;
-
- __skb_queue_head_init(&list);
-
- /*
- * Find the skb that holds this file in its SCM_RIGHTS. When found,
- * remove this entry and rearrange the file array.
- */
- skb = skb_dequeue(head);
- while (skb) {
- struct scm_fp_list *fp;
-
- fp = UNIXCB(skb).fp;
- for (i = 0; i < fp->count; i++) {
- int left;
-
- if (fp->fp[i] != file)
- continue;
-
- unix_notinflight(fp->user, fp->fp[i]);
- left = fp->count - 1 - i;
- if (left) {
- memmove(&fp->fp[i], &fp->fp[i + 1],
- left * sizeof(struct file *));
- }
- fp->count--;
- if (!fp->count) {
- kfree_skb(skb);
- skb = NULL;
- } else {
- __skb_queue_tail(&list, skb);
- }
- fput(file);
- file = NULL;
- break;
- }
-
- if (!file)
- break;
-
- __skb_queue_tail(&list, skb);
-
- skb = skb_dequeue(head);
- }
-
- if (skb_peek(&list)) {
- spin_lock_irq(&head->lock);
- while ((skb = __skb_dequeue(&list)) != NULL)
- __skb_queue_tail(head, skb);
- spin_unlock_irq(&head->lock);
- }
-#else
- fput(file);
-#endif
-}
-
-static void __io_rsrc_put_work(struct io_rsrc_node *ref_node)
-{
- struct io_rsrc_data *rsrc_data = ref_node->rsrc_data;
- struct io_ring_ctx *ctx = rsrc_data->ctx;
- struct io_rsrc_put *prsrc, *tmp;
-
- list_for_each_entry_safe(prsrc, tmp, &ref_node->rsrc_list, list) {
- list_del(&prsrc->list);
-
- if (prsrc->tag) {
- bool lock_ring = ctx->flags & IORING_SETUP_IOPOLL;
-
- io_ring_submit_lock(ctx, lock_ring);
- spin_lock(&ctx->completion_lock);
- io_fill_cqe_aux(ctx, prsrc->tag, 0, 0);
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- io_cqring_ev_posted(ctx);
- io_ring_submit_unlock(ctx, lock_ring);
- }
-
- rsrc_data->do_put(ctx, prsrc);
- kfree(prsrc);
- }
-
- io_rsrc_node_destroy(ref_node);
- if (atomic_dec_and_test(&rsrc_data->refs))
- complete(&rsrc_data->done);
-}
-
-static void io_rsrc_put_work(struct work_struct *work)
-{
- struct io_ring_ctx *ctx;
- struct llist_node *node;
-
- ctx = container_of(work, struct io_ring_ctx, rsrc_put_work.work);
- node = llist_del_all(&ctx->rsrc_put_llist);
-
- while (node) {
- struct io_rsrc_node *ref_node;
- struct llist_node *next = node->next;
-
- ref_node = llist_entry(node, struct io_rsrc_node, llist);
- __io_rsrc_put_work(ref_node);
- node = next;
- }
-}
-
-static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
- unsigned nr_args, u64 __user *tags)
-{
- __s32 __user *fds = (__s32 __user *) arg;
- struct file *file;
- int fd, ret;
- unsigned i;
-
- if (ctx->file_data)
- return -EBUSY;
- if (!nr_args)
- return -EINVAL;
- if (nr_args > IORING_MAX_FIXED_FILES)
- return -EMFILE;
- if (nr_args > rlimit(RLIMIT_NOFILE))
- return -EMFILE;
- ret = io_rsrc_node_switch_start(ctx);
- if (ret)
- return ret;
- ret = io_rsrc_data_alloc(ctx, io_rsrc_file_put, tags, nr_args,
- &ctx->file_data);
- if (ret)
- return ret;
-
- ret = -ENOMEM;
- if (!io_alloc_file_tables(&ctx->file_table, nr_args))
- goto out_free;
-
- for (i = 0; i < nr_args; i++, ctx->nr_user_files++) {
- if (copy_from_user(&fd, &fds[i], sizeof(fd))) {
- ret = -EFAULT;
- goto out_fput;
- }
- /* allow sparse sets */
- if (fd == -1) {
- ret = -EINVAL;
- if (unlikely(*io_get_tag_slot(ctx->file_data, i)))
- goto out_fput;
- continue;
- }
-
- file = fget(fd);
- ret = -EBADF;
- if (unlikely(!file))
- goto out_fput;
-
- /*
- * Don't allow io_uring instances to be registered. If UNIX
- * isn't enabled, then this causes a reference cycle and this
- * instance can never get freed. If UNIX is enabled we'll
- * handle it just fine, but there's still no point in allowing
- * a ring fd as it doesn't support regular read/write anyway.
- */
- if (file->f_op == &io_uring_fops) {
- fput(file);
- goto out_fput;
- }
- io_fixed_file_set(io_fixed_file_slot(&ctx->file_table, i), file);
- }
-
- ret = io_sqe_files_scm(ctx);
- if (ret) {
- __io_sqe_files_unregister(ctx);
- return ret;
- }
-
- io_rsrc_node_switch(ctx, NULL);
- return ret;
-out_fput:
- for (i = 0; i < ctx->nr_user_files; i++) {
- file = io_file_from_index(ctx, i);
- if (file)
- fput(file);
- }
- io_free_file_tables(&ctx->file_table);
- ctx->nr_user_files = 0;
-out_free:
- io_rsrc_data_free(ctx->file_data);
- ctx->file_data = NULL;
- return ret;
-}
-
-static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file,
- int index)
-{
-#if defined(CONFIG_UNIX)
- struct sock *sock = ctx->ring_sock->sk;
- struct sk_buff_head *head = &sock->sk_receive_queue;
- struct sk_buff *skb;
-
- /*
- * See if we can merge this file into an existing skb SCM_RIGHTS
- * file set. If there's no room, fall back to allocating a new skb
- * and filling it in.
- */
- spin_lock_irq(&head->lock);
- skb = skb_peek(head);
- if (skb) {
- struct scm_fp_list *fpl = UNIXCB(skb).fp;
-
- if (fpl->count < SCM_MAX_FD) {
- __skb_unlink(skb, head);
- spin_unlock_irq(&head->lock);
- fpl->fp[fpl->count] = get_file(file);
- unix_inflight(fpl->user, fpl->fp[fpl->count]);
- fpl->count++;
- spin_lock_irq(&head->lock);
- __skb_queue_head(head, skb);
- } else {
- skb = NULL;
- }
- }
- spin_unlock_irq(&head->lock);
-
- if (skb) {
- fput(file);
- return 0;
- }
-
- return __io_sqe_files_scm(ctx, 1, index);
-#else
- return 0;
-#endif
-}
-
-static int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
- struct io_rsrc_node *node, void *rsrc)
-{
- u64 *tag_slot = io_get_tag_slot(data, idx);
- struct io_rsrc_put *prsrc;
-
- prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL);
- if (!prsrc)
- return -ENOMEM;
-
- prsrc->tag = *tag_slot;
- *tag_slot = 0;
- prsrc->rsrc = rsrc;
- list_add(&prsrc->list, &node->rsrc_list);
- return 0;
-}
-
-static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
- unsigned int issue_flags, u32 slot_index)
-{
- struct io_ring_ctx *ctx = req->ctx;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
- bool needs_switch = false;
- struct io_fixed_file *file_slot;
- int ret = -EBADF;
-
- io_ring_submit_lock(ctx, needs_lock);
- if (file->f_op == &io_uring_fops)
- goto err;
- ret = -ENXIO;
- if (!ctx->file_data)
- goto err;
- ret = -EINVAL;
- if (slot_index >= ctx->nr_user_files)
- goto err;
-
- slot_index = array_index_nospec(slot_index, ctx->nr_user_files);
- file_slot = io_fixed_file_slot(&ctx->file_table, slot_index);
-
- if (file_slot->file_ptr) {
- struct file *old_file;
-
- ret = io_rsrc_node_switch_start(ctx);
- if (ret)
- goto err;
-
- old_file = (struct file *)(file_slot->file_ptr & FFS_MASK);
- ret = io_queue_rsrc_removal(ctx->file_data, slot_index,
- ctx->rsrc_node, old_file);
- if (ret)
- goto err;
- file_slot->file_ptr = 0;
- needs_switch = true;
- }
-
- *io_get_tag_slot(ctx->file_data, slot_index) = 0;
- io_fixed_file_set(file_slot, file);
- ret = io_sqe_file_register(ctx, file, slot_index);
- if (ret) {
- file_slot->file_ptr = 0;
- goto err;
- }
-
- ret = 0;
-err:
- if (needs_switch)
- io_rsrc_node_switch(ctx, ctx->file_data);
- io_ring_submit_unlock(ctx, needs_lock);
- if (ret)
- fput(file);
- return ret;
-}
-
-static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
-{
- unsigned int offset = req->close.file_slot - 1;
- struct io_ring_ctx *ctx = req->ctx;
- bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
- struct io_fixed_file *file_slot;
- struct file *file;
- int ret;
-
- io_ring_submit_lock(ctx, needs_lock);
- ret = -ENXIO;
- if (unlikely(!ctx->file_data))
- goto out;
- ret = -EINVAL;
- if (offset >= ctx->nr_user_files)
- goto out;
- ret = io_rsrc_node_switch_start(ctx);
- if (ret)
- goto out;
-
- offset = array_index_nospec(offset, ctx->nr_user_files);
- file_slot = io_fixed_file_slot(&ctx->file_table, offset);
- ret = -EBADF;
- if (!file_slot->file_ptr)
- goto out;
-
- file = (struct file *)(file_slot->file_ptr & FFS_MASK);
- ret = io_queue_rsrc_removal(ctx->file_data, offset, ctx->rsrc_node, file);
- if (ret)
- goto out;
-
- file_slot->file_ptr = 0;
- io_rsrc_node_switch(ctx, ctx->file_data);
- ret = 0;
-out:
- io_ring_submit_unlock(ctx, needs_lock);
- return ret;
-}
-
-static int __io_sqe_files_update(struct io_ring_ctx *ctx,
- struct io_uring_rsrc_update2 *up,
- unsigned nr_args)
-{
- u64 __user *tags = u64_to_user_ptr(up->tags);
- __s32 __user *fds = u64_to_user_ptr(up->data);
- struct io_rsrc_data *data = ctx->file_data;
- struct io_fixed_file *file_slot;
- struct file *file;
- int fd, i, err = 0;
- unsigned int done;
- bool needs_switch = false;
-
- if (!ctx->file_data)
- return -ENXIO;
- if (up->offset + nr_args > ctx->nr_user_files)
- return -EINVAL;
-
- for (done = 0; done < nr_args; done++) {
- u64 tag = 0;
-
- if ((tags && copy_from_user(&tag, &tags[done], sizeof(tag))) ||
- copy_from_user(&fd, &fds[done], sizeof(fd))) {
- err = -EFAULT;
- break;
- }
- if ((fd == IORING_REGISTER_FILES_SKIP || fd == -1) && tag) {
- err = -EINVAL;
- break;
- }
- if (fd == IORING_REGISTER_FILES_SKIP)
- continue;
-
- i = array_index_nospec(up->offset + done, ctx->nr_user_files);
- file_slot = io_fixed_file_slot(&ctx->file_table, i);
-
- if (file_slot->file_ptr) {
- file = (struct file *)(file_slot->file_ptr & FFS_MASK);
- err = io_queue_rsrc_removal(data, i, ctx->rsrc_node, file);
- if (err)
- break;
- file_slot->file_ptr = 0;
- needs_switch = true;
- }
- if (fd != -1) {
- file = fget(fd);
- if (!file) {
- err = -EBADF;
- break;
- }
- /*
- * Don't allow io_uring instances to be registered. If
- * UNIX isn't enabled, then this causes a reference
- * cycle and this instance can never get freed. If UNIX
- * is enabled we'll handle it just fine, but there's
- * still no point in allowing a ring fd as it doesn't
- * support regular read/write anyway.
- */
- if (file->f_op == &io_uring_fops) {
- fput(file);
- err = -EBADF;
- break;
- }
- *io_get_tag_slot(data, i) = tag;
- io_fixed_file_set(file_slot, file);
- err = io_sqe_file_register(ctx, file, i);
- if (err) {
- file_slot->file_ptr = 0;
- fput(file);
- break;
- }
- }
- }
-
- if (needs_switch)
- io_rsrc_node_switch(ctx, data);
- return done ? done : err;
-}
-
-static struct io_wq *io_init_wq_offload(struct io_ring_ctx *ctx,
- struct task_struct *task)
-{
- struct io_wq_hash *hash;
- struct io_wq_data data;
- unsigned int concurrency;
-
- mutex_lock(&ctx->uring_lock);
- hash = ctx->hash_map;
- if (!hash) {
- hash = kzalloc(sizeof(*hash), GFP_KERNEL);
- if (!hash) {
- mutex_unlock(&ctx->uring_lock);
- return ERR_PTR(-ENOMEM);
- }
- refcount_set(&hash->refs, 1);
- init_waitqueue_head(&hash->wait);
- ctx->hash_map = hash;
- }
- mutex_unlock(&ctx->uring_lock);
-
- data.hash = hash;
- data.task = task;
- data.free_work = io_wq_free_work;
- data.do_work = io_wq_submit_work;
-
- /* Do QD, or 4 * CPUS, whatever is smallest */
- concurrency = min(ctx->sq_entries, 4 * num_online_cpus());
-
- return io_wq_create(concurrency, &data);
-}
-
-static __cold int io_uring_alloc_task_context(struct task_struct *task,
- struct io_ring_ctx *ctx)
-{
- struct io_uring_task *tctx;
- int ret;
-
- tctx = kzalloc(sizeof(*tctx), GFP_KERNEL);
- if (unlikely(!tctx))
- return -ENOMEM;
-
- tctx->registered_rings = kcalloc(IO_RINGFD_REG_MAX,
- sizeof(struct file *), GFP_KERNEL);
- if (unlikely(!tctx->registered_rings)) {
- kfree(tctx);
- return -ENOMEM;
- }
-
- ret = percpu_counter_init(&tctx->inflight, 0, GFP_KERNEL);
- if (unlikely(ret)) {
- kfree(tctx->registered_rings);
- kfree(tctx);
- return ret;
- }
-
- tctx->io_wq = io_init_wq_offload(ctx, task);
- if (IS_ERR(tctx->io_wq)) {
- ret = PTR_ERR(tctx->io_wq);
- percpu_counter_destroy(&tctx->inflight);
- kfree(tctx->registered_rings);
- kfree(tctx);
- return ret;
- }
-
- xa_init(&tctx->xa);
- init_waitqueue_head(&tctx->wait);
- atomic_set(&tctx->in_idle, 0);
- atomic_set(&tctx->inflight_tracked, 0);
- task->io_uring = tctx;
- spin_lock_init(&tctx->task_lock);
- INIT_WQ_LIST(&tctx->task_list);
- INIT_WQ_LIST(&tctx->prior_task_list);
- init_task_work(&tctx->task_work, tctx_task_work);
- return 0;
-}
-
-void __io_uring_free(struct task_struct *tsk)
-{
- struct io_uring_task *tctx = tsk->io_uring;
-
- WARN_ON_ONCE(!xa_empty(&tctx->xa));
- WARN_ON_ONCE(tctx->io_wq);
- WARN_ON_ONCE(tctx->cached_refs);
-
- kfree(tctx->registered_rings);
- percpu_counter_destroy(&tctx->inflight);
- kfree(tctx);
- tsk->io_uring = NULL;
-}
-
-static __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
- struct io_uring_params *p)
-{
- int ret;
-
- /* Retain compatibility with failing for an invalid attach attempt */
- if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
- IORING_SETUP_ATTACH_WQ) {
- struct fd f;
-
- f = fdget(p->wq_fd);
- if (!f.file)
- return -ENXIO;
- if (f.file->f_op != &io_uring_fops) {
- fdput(f);
- return -EINVAL;
- }
- fdput(f);
- }
- if (ctx->flags & IORING_SETUP_SQPOLL) {
- struct task_struct *tsk;
- struct io_sq_data *sqd;
- bool attached;
-
- ret = security_uring_sqpoll();
- if (ret)
- return ret;
-
- sqd = io_get_sq_data(p, &attached);
- if (IS_ERR(sqd)) {
- ret = PTR_ERR(sqd);
- goto err;
- }
-
- ctx->sq_creds = get_current_cred();
- ctx->sq_data = sqd;
- ctx->sq_thread_idle = msecs_to_jiffies(p->sq_thread_idle);
- if (!ctx->sq_thread_idle)
- ctx->sq_thread_idle = HZ;
-
- io_sq_thread_park(sqd);
- list_add(&ctx->sqd_list, &sqd->ctx_list);
- io_sqd_update_thread_idle(sqd);
- /* don't attach to a dying SQPOLL thread, would be racy */
- ret = (attached && !sqd->thread) ? -ENXIO : 0;
- io_sq_thread_unpark(sqd);
-
- if (ret < 0)
- goto err;
- if (attached)
- return 0;
-
- if (p->flags & IORING_SETUP_SQ_AFF) {
- int cpu = p->sq_thread_cpu;
-
- ret = -EINVAL;
- if (cpu >= nr_cpu_ids || !cpu_online(cpu))
- goto err_sqpoll;
- sqd->sq_cpu = cpu;
- } else {
- sqd->sq_cpu = -1;
- }
-
- sqd->task_pid = current->pid;
- sqd->task_tgid = current->tgid;
- tsk = create_io_thread(io_sq_thread, sqd, NUMA_NO_NODE);
- if (IS_ERR(tsk)) {
- ret = PTR_ERR(tsk);
- goto err_sqpoll;
- }
-
- sqd->thread = tsk;
- ret = io_uring_alloc_task_context(tsk, ctx);
- wake_up_new_task(tsk);
- if (ret)
- goto err;
- } else if (p->flags & IORING_SETUP_SQ_AFF) {
- /* Can't have SQ_AFF without SQPOLL */
- ret = -EINVAL;
- goto err;
- }
-
- return 0;
-err_sqpoll:
- complete(&ctx->sq_data->exited);
-err:
- io_sq_thread_finish(ctx);
- return ret;
-}
-
-static inline void __io_unaccount_mem(struct user_struct *user,
- unsigned long nr_pages)
-{
- atomic_long_sub(nr_pages, &user->locked_vm);
-}
-
-static inline int __io_account_mem(struct user_struct *user,
- unsigned long nr_pages)
-{
- unsigned long page_limit, cur_pages, new_pages;
-
- /* Don't allow more pages than we can safely lock */
- page_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-
- do {
- cur_pages = atomic_long_read(&user->locked_vm);
- new_pages = cur_pages + nr_pages;
- if (new_pages > page_limit)
- return -ENOMEM;
- } while (atomic_long_cmpxchg(&user->locked_vm, cur_pages,
- new_pages) != cur_pages);
-
- return 0;
-}
-
-static void io_unaccount_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
-{
- if (ctx->user)
- __io_unaccount_mem(ctx->user, nr_pages);
-
- if (ctx->mm_account)
- atomic64_sub(nr_pages, &ctx->mm_account->pinned_vm);
-}
-
-static int io_account_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
-{
- int ret;
-
- if (ctx->user) {
- ret = __io_account_mem(ctx->user, nr_pages);
- if (ret)
- return ret;
- }
-
- if (ctx->mm_account)
- atomic64_add(nr_pages, &ctx->mm_account->pinned_vm);
-
- return 0;
-}
-
-static void io_mem_free(void *ptr)
-{
- struct page *page;
-
- if (!ptr)
- return;
-
- page = virt_to_head_page(ptr);
- if (put_page_testzero(page))
- free_compound_page(page);
-}
-
-static void *io_mem_alloc(size_t size)
-{
- gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
-
- return (void *) __get_free_pages(gfp, get_order(size));
-}
-
-static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
- size_t *sq_offset)
-{
- struct io_rings *rings;
- size_t off, sq_array_size;
-
- off = struct_size(rings, cqes, cq_entries);
- if (off == SIZE_MAX)
- return SIZE_MAX;
-
-#ifdef CONFIG_SMP
- off = ALIGN(off, SMP_CACHE_BYTES);
- if (off == 0)
- return SIZE_MAX;
-#endif
-
- if (sq_offset)
- *sq_offset = off;
-
- sq_array_size = array_size(sizeof(u32), sq_entries);
- if (sq_array_size == SIZE_MAX)
- return SIZE_MAX;
-
- if (check_add_overflow(off, sq_array_size, &off))
- return SIZE_MAX;
-
- return off;
-}
-
-static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf **slot)
-{
- struct io_mapped_ubuf *imu = *slot;
- unsigned int i;
-
- if (imu != ctx->dummy_ubuf) {
- for (i = 0; i < imu->nr_bvecs; i++)
- unpin_user_page(imu->bvec[i].bv_page);
- if (imu->acct_pages)
- io_unaccount_mem(ctx, imu->acct_pages);
- kvfree(imu);
- }
- *slot = NULL;
-}
-
-static void io_rsrc_buf_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
-{
- io_buffer_unmap(ctx, &prsrc->buf);
- prsrc->buf = NULL;
-}
-
-static void __io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
-{
- unsigned int i;
-
- for (i = 0; i < ctx->nr_user_bufs; i++)
- io_buffer_unmap(ctx, &ctx->user_bufs[i]);
- kfree(ctx->user_bufs);
- io_rsrc_data_free(ctx->buf_data);
- ctx->user_bufs = NULL;
- ctx->buf_data = NULL;
- ctx->nr_user_bufs = 0;
-}
-
-static int io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
-{
- unsigned nr = ctx->nr_user_bufs;
- int ret;
-
- if (!ctx->buf_data)
- return -ENXIO;
-
- /*
- * Quiesce may unlock ->uring_lock, and while it's not held
- * prevent new requests using the table.
- */
- ctx->nr_user_bufs = 0;
- ret = io_rsrc_ref_quiesce(ctx->buf_data, ctx);
- ctx->nr_user_bufs = nr;
- if (!ret)
- __io_sqe_buffers_unregister(ctx);
- return ret;
-}
-
-static int io_copy_iov(struct io_ring_ctx *ctx, struct iovec *dst,
- void __user *arg, unsigned index)
-{
- struct iovec __user *src;
-
-#ifdef CONFIG_COMPAT
- if (ctx->compat) {
- struct compat_iovec __user *ciovs;
- struct compat_iovec ciov;
-
- ciovs = (struct compat_iovec __user *) arg;
- if (copy_from_user(&ciov, &ciovs[index], sizeof(ciov)))
- return -EFAULT;
-
- dst->iov_base = u64_to_user_ptr((u64)ciov.iov_base);
- dst->iov_len = ciov.iov_len;
- return 0;
- }
-#endif
- src = (struct iovec __user *) arg;
- if (copy_from_user(dst, &src[index], sizeof(*dst)))
- return -EFAULT;
- return 0;
-}
-
-/*
- * Not super efficient, but this is just a registration time. And we do cache
- * the last compound head, so generally we'll only do a full search if we don't
- * match that one.
- *
- * We check if the given compound head page has already been accounted, to
- * avoid double accounting it. This allows us to account the full size of the
- * page, not just the constituent pages of a huge page.
- */
-static bool headpage_already_acct(struct io_ring_ctx *ctx, struct page **pages,
- int nr_pages, struct page *hpage)
-{
- int i, j;
-
- /* check current page array */
- for (i = 0; i < nr_pages; i++) {
- if (!PageCompound(pages[i]))
- continue;
- if (compound_head(pages[i]) == hpage)
- return true;
- }
-
- /* check previously registered pages */
- for (i = 0; i < ctx->nr_user_bufs; i++) {
- struct io_mapped_ubuf *imu = ctx->user_bufs[i];
-
- for (j = 0; j < imu->nr_bvecs; j++) {
- if (!PageCompound(imu->bvec[j].bv_page))
- continue;
- if (compound_head(imu->bvec[j].bv_page) == hpage)
- return true;
- }
- }
-
- return false;
-}
-
-static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
- int nr_pages, struct io_mapped_ubuf *imu,
- struct page **last_hpage)
-{
- int i, ret;
-
- imu->acct_pages = 0;
- for (i = 0; i < nr_pages; i++) {
- if (!PageCompound(pages[i])) {
- imu->acct_pages++;
- } else {
- struct page *hpage;
-
- hpage = compound_head(pages[i]);
- if (hpage == *last_hpage)
- continue;
- *last_hpage = hpage;
- if (headpage_already_acct(ctx, pages, i, hpage))
- continue;
- imu->acct_pages += page_size(hpage) >> PAGE_SHIFT;
- }
- }
-
- if (!imu->acct_pages)
- return 0;
-
- ret = io_account_mem(ctx, imu->acct_pages);
- if (ret)
- imu->acct_pages = 0;
- return ret;
-}
-
-static int io_sqe_buffer_register(struct io_ring_ctx *ctx, struct iovec *iov,
- struct io_mapped_ubuf **pimu,
- struct page **last_hpage)
-{
- struct io_mapped_ubuf *imu = NULL;
- struct vm_area_struct **vmas = NULL;
- struct page **pages = NULL;
- unsigned long off, start, end, ubuf;
- size_t size;
- int ret, pret, nr_pages, i;
-
- if (!iov->iov_base) {
- *pimu = ctx->dummy_ubuf;
- return 0;
- }
-
- ubuf = (unsigned long) iov->iov_base;
- end = (ubuf + iov->iov_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
- start = ubuf >> PAGE_SHIFT;
- nr_pages = end - start;
-
- *pimu = NULL;
- ret = -ENOMEM;
-
- pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL);
- if (!pages)
- goto done;
-
- vmas = kvmalloc_array(nr_pages, sizeof(struct vm_area_struct *),
- GFP_KERNEL);
- if (!vmas)
- goto done;
-
- imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL);
- if (!imu)
- goto done;
-
- ret = 0;
- mmap_read_lock(current->mm);
- pret = pin_user_pages(ubuf, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
- pages, vmas);
- if (pret == nr_pages) {
- /* don't support file backed memory */
- for (i = 0; i < nr_pages; i++) {
- struct vm_area_struct *vma = vmas[i];
-
- if (vma_is_shmem(vma))
- continue;
- if (vma->vm_file &&
- !is_file_hugepages(vma->vm_file)) {
- ret = -EOPNOTSUPP;
- break;
- }
- }
- } else {
- ret = pret < 0 ? pret : -EFAULT;
- }
- mmap_read_unlock(current->mm);
- if (ret) {
- /*
- * if we did partial map, or found file backed vmas,
- * release any pages we did get
- */
- if (pret > 0)
- unpin_user_pages(pages, pret);
- goto done;
- }
-
- ret = io_buffer_account_pin(ctx, pages, pret, imu, last_hpage);
- if (ret) {
- unpin_user_pages(pages, pret);
- goto done;
- }
-
- off = ubuf & ~PAGE_MASK;
- size = iov->iov_len;
- for (i = 0; i < nr_pages; i++) {
- size_t vec_len;
-
- vec_len = min_t(size_t, size, PAGE_SIZE - off);
- imu->bvec[i].bv_page = pages[i];
- imu->bvec[i].bv_len = vec_len;
- imu->bvec[i].bv_offset = off;
- off = 0;
- size -= vec_len;
- }
- /* store original address for later verification */
- imu->ubuf = ubuf;
- imu->ubuf_end = ubuf + iov->iov_len;
- imu->nr_bvecs = nr_pages;
- *pimu = imu;
- ret = 0;
-done:
- if (ret)
- kvfree(imu);
- kvfree(pages);
- kvfree(vmas);
- return ret;
-}
-
-static int io_buffers_map_alloc(struct io_ring_ctx *ctx, unsigned int nr_args)
-{
- ctx->user_bufs = kcalloc(nr_args, sizeof(*ctx->user_bufs), GFP_KERNEL);
- return ctx->user_bufs ? 0 : -ENOMEM;
-}
-
-static int io_buffer_validate(struct iovec *iov)
-{
- unsigned long tmp, acct_len = iov->iov_len + (PAGE_SIZE - 1);
-
- /*
- * Don't impose further limits on the size and buffer
- * constraints here, we'll -EINVAL later when IO is
- * submitted if they are wrong.
- */
- if (!iov->iov_base)
- return iov->iov_len ? -EFAULT : 0;
- if (!iov->iov_len)
- return -EFAULT;
-
- /* arbitrary limit, but we need something */
- if (iov->iov_len > SZ_1G)
- return -EFAULT;
-
- if (check_add_overflow((unsigned long)iov->iov_base, acct_len, &tmp))
- return -EOVERFLOW;
-
- return 0;
-}
-
-static int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg,
- unsigned int nr_args, u64 __user *tags)
-{
- struct page *last_hpage = NULL;
- struct io_rsrc_data *data;
- int i, ret;
- struct iovec iov;
-
- if (ctx->user_bufs)
- return -EBUSY;
- if (!nr_args || nr_args > IORING_MAX_REG_BUFFERS)
- return -EINVAL;
- ret = io_rsrc_node_switch_start(ctx);
- if (ret)
- return ret;
- ret = io_rsrc_data_alloc(ctx, io_rsrc_buf_put, tags, nr_args, &data);
- if (ret)
- return ret;
- ret = io_buffers_map_alloc(ctx, nr_args);
- if (ret) {
- io_rsrc_data_free(data);
- return ret;
- }
-
- for (i = 0; i < nr_args; i++, ctx->nr_user_bufs++) {
- ret = io_copy_iov(ctx, &iov, arg, i);
- if (ret)
- break;
- ret = io_buffer_validate(&iov);
- if (ret)
- break;
- if (!iov.iov_base && *io_get_tag_slot(data, i)) {
- ret = -EINVAL;
- break;
- }
-
- ret = io_sqe_buffer_register(ctx, &iov, &ctx->user_bufs[i],
- &last_hpage);
- if (ret)
- break;
- }
-
- WARN_ON_ONCE(ctx->buf_data);
-
- ctx->buf_data = data;
- if (ret)
- __io_sqe_buffers_unregister(ctx);
- else
- io_rsrc_node_switch(ctx, NULL);
- return ret;
-}
-
-static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
- struct io_uring_rsrc_update2 *up,
- unsigned int nr_args)
-{
- u64 __user *tags = u64_to_user_ptr(up->tags);
- struct iovec iov, __user *iovs = u64_to_user_ptr(up->data);
- struct page *last_hpage = NULL;
- bool needs_switch = false;
- __u32 done;
- int i, err;
-
- if (!ctx->buf_data)
- return -ENXIO;
- if (up->offset + nr_args > ctx->nr_user_bufs)
- return -EINVAL;
-
- for (done = 0; done < nr_args; done++) {
- struct io_mapped_ubuf *imu;
- int offset = up->offset + done;
- u64 tag = 0;
-
- err = io_copy_iov(ctx, &iov, iovs, done);
- if (err)
- break;
- if (tags && copy_from_user(&tag, &tags[done], sizeof(tag))) {
- err = -EFAULT;
- break;
- }
- err = io_buffer_validate(&iov);
- if (err)
- break;
- if (!iov.iov_base && tag) {
- err = -EINVAL;
- break;
- }
- err = io_sqe_buffer_register(ctx, &iov, &imu, &last_hpage);
- if (err)
- break;
-
- i = array_index_nospec(offset, ctx->nr_user_bufs);
- if (ctx->user_bufs[i] != ctx->dummy_ubuf) {
- err = io_queue_rsrc_removal(ctx->buf_data, i,
- ctx->rsrc_node, ctx->user_bufs[i]);
- if (unlikely(err)) {
- io_buffer_unmap(ctx, &imu);
- break;
- }
- ctx->user_bufs[i] = NULL;
- needs_switch = true;
- }
-
- ctx->user_bufs[i] = imu;
- *io_get_tag_slot(ctx->buf_data, offset) = tag;
- }
-
- if (needs_switch)
- io_rsrc_node_switch(ctx, ctx->buf_data);
- return done ? done : err;
-}
-
-static int io_eventfd_register(struct io_ring_ctx *ctx, void __user *arg,
- unsigned int eventfd_async)
-{
- struct io_ev_fd *ev_fd;
- __s32 __user *fds = arg;
- int fd;
-
- ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
- lockdep_is_held(&ctx->uring_lock));
- if (ev_fd)
- return -EBUSY;
-
- if (copy_from_user(&fd, fds, sizeof(*fds)))
- return -EFAULT;
-
- ev_fd = kmalloc(sizeof(*ev_fd), GFP_KERNEL);
- if (!ev_fd)
- return -ENOMEM;
-
- ev_fd->cq_ev_fd = eventfd_ctx_fdget(fd);
- if (IS_ERR(ev_fd->cq_ev_fd)) {
- int ret = PTR_ERR(ev_fd->cq_ev_fd);
- kfree(ev_fd);
- return ret;
- }
- ev_fd->eventfd_async = eventfd_async;
- ctx->has_evfd = true;
- rcu_assign_pointer(ctx->io_ev_fd, ev_fd);
- return 0;
-}
-
-static void io_eventfd_put(struct rcu_head *rcu)
-{
- struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu);
-
- eventfd_ctx_put(ev_fd->cq_ev_fd);
- kfree(ev_fd);
-}
-
-static int io_eventfd_unregister(struct io_ring_ctx *ctx)
-{
- struct io_ev_fd *ev_fd;
-
- ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
- lockdep_is_held(&ctx->uring_lock));
- if (ev_fd) {
- ctx->has_evfd = false;
- rcu_assign_pointer(ctx->io_ev_fd, NULL);
- call_rcu(&ev_fd->rcu, io_eventfd_put);
- return 0;
- }
-
- return -ENXIO;
-}
-
-static void io_destroy_buffers(struct io_ring_ctx *ctx)
-{
- int i;
-
- for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++) {
- struct list_head *list = &ctx->io_buffers[i];
-
- while (!list_empty(list)) {
- struct io_buffer_list *bl;
-
- bl = list_first_entry(list, struct io_buffer_list, list);
- __io_remove_buffers(ctx, bl, -1U);
- list_del(&bl->list);
- kfree(bl);
- }
- }
-
- while (!list_empty(&ctx->io_buffers_pages)) {
- struct page *page;
-
- page = list_first_entry(&ctx->io_buffers_pages, struct page, lru);
- list_del_init(&page->lru);
- __free_page(page);
- }
-}
-
-static void io_req_caches_free(struct io_ring_ctx *ctx)
-{
- struct io_submit_state *state = &ctx->submit_state;
- int nr = 0;
-
- mutex_lock(&ctx->uring_lock);
- io_flush_cached_locked_reqs(ctx, state);
-
- while (state->free_list.next) {
- struct io_wq_work_node *node;
- struct io_kiocb *req;
-
- node = wq_stack_extract(&state->free_list);
- req = container_of(node, struct io_kiocb, comp_list);
- kmem_cache_free(req_cachep, req);
- nr++;
- }
- if (nr)
- percpu_ref_put_many(&ctx->refs, nr);
- mutex_unlock(&ctx->uring_lock);
-}
-
-static void io_wait_rsrc_data(struct io_rsrc_data *data)
-{
- if (data && !atomic_dec_and_test(&data->refs))
- wait_for_completion(&data->done);
-}
-
-static void io_flush_apoll_cache(struct io_ring_ctx *ctx)
-{
- struct async_poll *apoll;
-
- while (!list_empty(&ctx->apoll_cache)) {
- apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
- poll.wait.entry);
- list_del(&apoll->poll.wait.entry);
- kfree(apoll);
- }
-}
-
-static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
-{
- io_sq_thread_finish(ctx);
-
- if (ctx->mm_account) {
- mmdrop(ctx->mm_account);
- ctx->mm_account = NULL;
- }
-
- io_rsrc_refs_drop(ctx);
- /* __io_rsrc_put_work() may need uring_lock to progress, wait w/o it */
- io_wait_rsrc_data(ctx->buf_data);
- io_wait_rsrc_data(ctx->file_data);
-
- mutex_lock(&ctx->uring_lock);
- if (ctx->buf_data)
- __io_sqe_buffers_unregister(ctx);
- if (ctx->file_data)
- __io_sqe_files_unregister(ctx);
- if (ctx->rings)
- __io_cqring_overflow_flush(ctx, true);
- io_eventfd_unregister(ctx);
- io_flush_apoll_cache(ctx);
- mutex_unlock(&ctx->uring_lock);
- io_destroy_buffers(ctx);
- if (ctx->sq_creds)
- put_cred(ctx->sq_creds);
-
- /* there are no registered resources left, nobody uses it */
- if (ctx->rsrc_node)
- io_rsrc_node_destroy(ctx->rsrc_node);
- if (ctx->rsrc_backup_node)
- io_rsrc_node_destroy(ctx->rsrc_backup_node);
- flush_delayed_work(&ctx->rsrc_put_work);
- flush_delayed_work(&ctx->fallback_work);
-
- WARN_ON_ONCE(!list_empty(&ctx->rsrc_ref_list));
- WARN_ON_ONCE(!llist_empty(&ctx->rsrc_put_llist));
-
-#if defined(CONFIG_UNIX)
- if (ctx->ring_sock) {
- ctx->ring_sock->file = NULL; /* so that iput() is called */
- sock_release(ctx->ring_sock);
- }
-#endif
- WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
-
- io_mem_free(ctx->rings);
- io_mem_free(ctx->sq_sqes);
-
- percpu_ref_exit(&ctx->refs);
- free_uid(ctx->user);
- io_req_caches_free(ctx);
- if (ctx->hash_map)
- io_wq_put_hash(ctx->hash_map);
- kfree(ctx->cancel_hash);
- kfree(ctx->dummy_ubuf);
- kfree(ctx->io_buffers);
- kfree(ctx);
-}
-
-static __poll_t io_uring_poll(struct file *file, poll_table *wait)
-{
- struct io_ring_ctx *ctx = file->private_data;
- __poll_t mask = 0;
-
- poll_wait(file, &ctx->cq_wait, wait);
- /*
- * synchronizes with barrier from wq_has_sleeper call in
- * io_commit_cqring
- */
- smp_rmb();
- if (!io_sqring_full(ctx))
- mask |= EPOLLOUT | EPOLLWRNORM;
-
- /*
- * Don't flush cqring overflow list here, just do a simple check.
- * Otherwise there could possible be ABBA deadlock:
- * CPU0 CPU1
- * ---- ----
- * lock(&ctx->uring_lock);
- * lock(&ep->mtx);
- * lock(&ctx->uring_lock);
- * lock(&ep->mtx);
- *
- * Users may get EPOLLIN meanwhile seeing nothing in cqring, this
- * pushs them to do the flush.
- */
- if (io_cqring_events(ctx) || test_bit(0, &ctx->check_cq_overflow))
- mask |= EPOLLIN | EPOLLRDNORM;
-
- return mask;
-}
-
-static int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id)
-{
- const struct cred *creds;
-
- creds = xa_erase(&ctx->personalities, id);
- if (creds) {
- put_cred(creds);
- return 0;
- }
-
- return -EINVAL;
-}
-
-struct io_tctx_exit {
- struct callback_head task_work;
- struct completion completion;
- struct io_ring_ctx *ctx;
-};
-
-static __cold void io_tctx_exit_cb(struct callback_head *cb)
-{
- struct io_uring_task *tctx = current->io_uring;
- struct io_tctx_exit *work;
-
- work = container_of(cb, struct io_tctx_exit, task_work);
- /*
- * When @in_idle, we're in cancellation and it's racy to remove the
- * node. It'll be removed by the end of cancellation, just ignore it.
- */
- if (!atomic_read(&tctx->in_idle))
- io_uring_del_tctx_node((unsigned long)work->ctx);
- complete(&work->completion);
-}
-
-static __cold bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
-{
- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
-
- return req->ctx == data;
-}
-
-static __cold void io_ring_exit_work(struct work_struct *work)
-{
- struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
- unsigned long timeout = jiffies + HZ * 60 * 5;
- unsigned long interval = HZ / 20;
- struct io_tctx_exit exit;
- struct io_tctx_node *node;
- int ret;
-
- /*
- * If we're doing polled IO and end up having requests being
- * submitted async (out-of-line), then completions can come in while
- * we're waiting for refs to drop. We need to reap these manually,
- * as nobody else will be looking for them.
- */
- do {
- io_uring_try_cancel_requests(ctx, NULL, true);
- if (ctx->sq_data) {
- struct io_sq_data *sqd = ctx->sq_data;
- struct task_struct *tsk;
-
- io_sq_thread_park(sqd);
- tsk = sqd->thread;
- if (tsk && tsk->io_uring && tsk->io_uring->io_wq)
- io_wq_cancel_cb(tsk->io_uring->io_wq,
- io_cancel_ctx_cb, ctx, true);
- io_sq_thread_unpark(sqd);
- }
-
- io_req_caches_free(ctx);
-
- if (WARN_ON_ONCE(time_after(jiffies, timeout))) {
- /* there is little hope left, don't run it too often */
- interval = HZ * 60;
- }
- } while (!wait_for_completion_timeout(&ctx->ref_comp, interval));
-
- init_completion(&exit.completion);
- init_task_work(&exit.task_work, io_tctx_exit_cb);
- exit.ctx = ctx;
- /*
- * Some may use context even when all refs and requests have been put,
- * and they are free to do so while still holding uring_lock or
- * completion_lock, see io_req_task_submit(). Apart from other work,
- * this lock/unlock section also waits them to finish.
- */
- mutex_lock(&ctx->uring_lock);
- while (!list_empty(&ctx->tctx_list)) {
- WARN_ON_ONCE(time_after(jiffies, timeout));
-
- node = list_first_entry(&ctx->tctx_list, struct io_tctx_node,
- ctx_node);
- /* don't spin on a single task if cancellation failed */
- list_rotate_left(&ctx->tctx_list);
- ret = task_work_add(node->task, &exit.task_work, TWA_SIGNAL);
- if (WARN_ON_ONCE(ret))
- continue;
-
- mutex_unlock(&ctx->uring_lock);
- wait_for_completion(&exit.completion);
- mutex_lock(&ctx->uring_lock);
- }
- mutex_unlock(&ctx->uring_lock);
- spin_lock(&ctx->completion_lock);
- spin_unlock(&ctx->completion_lock);
-
- io_ring_ctx_free(ctx);
-}
-
-/* Returns true if we found and killed one or more timeouts */
-static __cold bool io_kill_timeouts(struct io_ring_ctx *ctx,
- struct task_struct *tsk, bool cancel_all)
-{
- struct io_kiocb *req, *tmp;
- int canceled = 0;
-
- spin_lock(&ctx->completion_lock);
- spin_lock_irq(&ctx->timeout_lock);
- list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
- if (io_match_task(req, tsk, cancel_all)) {
- io_kill_timeout(req, -ECANCELED);
- canceled++;
- }
- }
- spin_unlock_irq(&ctx->timeout_lock);
- if (canceled != 0)
- io_commit_cqring(ctx);
- spin_unlock(&ctx->completion_lock);
- if (canceled != 0)
- io_cqring_ev_posted(ctx);
- return canceled != 0;
-}
-
-static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
-{
- unsigned long index;
- struct creds *creds;
-
- mutex_lock(&ctx->uring_lock);
- percpu_ref_kill(&ctx->refs);
- if (ctx->rings)
- __io_cqring_overflow_flush(ctx, true);
- xa_for_each(&ctx->personalities, index, creds)
- io_unregister_personality(ctx, index);
- mutex_unlock(&ctx->uring_lock);
-
- io_kill_timeouts(ctx, NULL, true);
- io_poll_remove_all(ctx, NULL, true);
-
- /* if we failed setting up the ctx, we might not have any rings */
- io_iopoll_try_reap_events(ctx);
-
- INIT_WORK(&ctx->exit_work, io_ring_exit_work);
- /*
- * Use system_unbound_wq to avoid spawning tons of event kworkers
- * if we're exiting a ton of rings at the same time. It just adds
- * noise and overhead, there's no discernable change in runtime
- * over using system_wq.
- */
- queue_work(system_unbound_wq, &ctx->exit_work);
-}
-
-static int io_uring_release(struct inode *inode, struct file *file)
-{
- struct io_ring_ctx *ctx = file->private_data;
-
- file->private_data = NULL;
- io_ring_ctx_wait_and_kill(ctx);
- return 0;
-}
-
-struct io_task_cancel {
- struct task_struct *task;
- bool all;
-};
-
-static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
-{
- struct io_kiocb *req = container_of(work, struct io_kiocb, work);
- struct io_task_cancel *cancel = data;
-
- return io_match_task_safe(req, cancel->task, cancel->all);
-}
-
-static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
- struct task_struct *task,
- bool cancel_all)
-{
- struct io_defer_entry *de;
- LIST_HEAD(list);
-
- spin_lock(&ctx->completion_lock);
- list_for_each_entry_reverse(de, &ctx->defer_list, list) {
- if (io_match_task_safe(de->req, task, cancel_all)) {
- list_cut_position(&list, &ctx->defer_list, &de->list);
- break;
- }
- }
- spin_unlock(&ctx->completion_lock);
- if (list_empty(&list))
- return false;
-
- while (!list_empty(&list)) {
- de = list_first_entry(&list, struct io_defer_entry, list);
- list_del_init(&de->list);
- io_req_complete_failed(de->req, -ECANCELED);
- kfree(de);
- }
- return true;
-}
-
-static __cold bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
-{
- struct io_tctx_node *node;
- enum io_wq_cancel cret;
- bool ret = false;
-
- mutex_lock(&ctx->uring_lock);
- list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
- struct io_uring_task *tctx = node->task->io_uring;
-
- /*
- * io_wq will stay alive while we hold uring_lock, because it's
- * killed after ctx nodes, which requires to take the lock.
- */
- if (!tctx || !tctx->io_wq)
- continue;
- cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
- ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
- }
- mutex_unlock(&ctx->uring_lock);
-
- return ret;
-}
-
-static __cold void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
- struct task_struct *task,
- bool cancel_all)
-{
- struct io_task_cancel cancel = { .task = task, .all = cancel_all, };
- struct io_uring_task *tctx = task ? task->io_uring : NULL;
-
- while (1) {
- enum io_wq_cancel cret;
- bool ret = false;
-
- if (!task) {
- ret |= io_uring_try_cancel_iowq(ctx);
- } else if (tctx && tctx->io_wq) {
- /*
- * Cancels requests of all rings, not only @ctx, but
- * it's fine as the task is in exit/exec.
- */
- cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_task_cb,
- &cancel, true);
- ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
- }
-
- /* SQPOLL thread does its own polling */
- if ((!(ctx->flags & IORING_SETUP_SQPOLL) && cancel_all) ||
- (ctx->sq_data && ctx->sq_data->thread == current)) {
- while (!wq_list_empty(&ctx->iopoll_list)) {
- io_iopoll_try_reap_events(ctx);
- ret = true;
- }
- }
-
- ret |= io_cancel_defer_files(ctx, task, cancel_all);
- ret |= io_poll_remove_all(ctx, task, cancel_all);
- ret |= io_kill_timeouts(ctx, task, cancel_all);
- if (task)
- ret |= io_run_task_work();
- if (!ret)
- break;
- cond_resched();
- }
-}
-
-static int __io_uring_add_tctx_node(struct io_ring_ctx *ctx)
-{
- struct io_uring_task *tctx = current->io_uring;
- struct io_tctx_node *node;
- int ret;
-
- if (unlikely(!tctx)) {
- ret = io_uring_alloc_task_context(current, ctx);
- if (unlikely(ret))
- return ret;
-
- tctx = current->io_uring;
- if (ctx->iowq_limits_set) {
- unsigned int limits[2] = { ctx->iowq_limits[0],
- ctx->iowq_limits[1], };
-
- ret = io_wq_max_workers(tctx->io_wq, limits);
- if (ret)
- return ret;
- }
- }
- if (!xa_load(&tctx->xa, (unsigned long)ctx)) {
- node = kmalloc(sizeof(*node), GFP_KERNEL);
- if (!node)
- return -ENOMEM;
- node->ctx = ctx;
- node->task = current;
-
- ret = xa_err(xa_store(&tctx->xa, (unsigned long)ctx,
- node, GFP_KERNEL));
- if (ret) {
- kfree(node);
- return ret;
- }
-
- mutex_lock(&ctx->uring_lock);
- list_add(&node->ctx_node, &ctx->tctx_list);
- mutex_unlock(&ctx->uring_lock);
- }
- tctx->last = ctx;
- return 0;
-}
-
-/*
- * Note that this task has used io_uring. We use it for cancelation purposes.
- */
-static inline int io_uring_add_tctx_node(struct io_ring_ctx *ctx)
-{
- struct io_uring_task *tctx = current->io_uring;
-
- if (likely(tctx && tctx->last == ctx))
- return 0;
- return __io_uring_add_tctx_node(ctx);
-}
-
-/*
- * Remove this io_uring_file -> task mapping.
- */
-static __cold void io_uring_del_tctx_node(unsigned long index)
-{
- struct io_uring_task *tctx = current->io_uring;
- struct io_tctx_node *node;
-
- if (!tctx)
- return;
- node = xa_erase(&tctx->xa, index);
- if (!node)
- return;
-
- WARN_ON_ONCE(current != node->task);
- WARN_ON_ONCE(list_empty(&node->ctx_node));
-
- mutex_lock(&node->ctx->uring_lock);
- list_del(&node->ctx_node);
- mutex_unlock(&node->ctx->uring_lock);
-
- if (tctx->last == node->ctx)
- tctx->last = NULL;
- kfree(node);
-}
-
-static __cold void io_uring_clean_tctx(struct io_uring_task *tctx)
-{
- struct io_wq *wq = tctx->io_wq;
- struct io_tctx_node *node;
- unsigned long index;
-
- xa_for_each(&tctx->xa, index, node) {
- io_uring_del_tctx_node(index);
- cond_resched();
- }
- if (wq) {
- /*
- * Must be after io_uring_del_tctx_node() (removes nodes under
- * uring_lock) to avoid race with io_uring_try_cancel_iowq().
- */
- io_wq_put_and_exit(wq);
- tctx->io_wq = NULL;
- }
-}
-
-static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
-{
- if (tracked)
- return atomic_read(&tctx->inflight_tracked);
- return percpu_counter_sum(&tctx->inflight);
-}
-
-/*
- * Find any io_uring ctx that this task has registered or done IO on, and cancel
- * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
- */
-static __cold void io_uring_cancel_generic(bool cancel_all,
- struct io_sq_data *sqd)
-{
- struct io_uring_task *tctx = current->io_uring;
- struct io_ring_ctx *ctx;
- s64 inflight;
- DEFINE_WAIT(wait);
-
- WARN_ON_ONCE(sqd && sqd->thread != current);
-
- if (!current->io_uring)
- return;
- if (tctx->io_wq)
- io_wq_exit_start(tctx->io_wq);
-
- atomic_inc(&tctx->in_idle);
- do {
- io_uring_drop_tctx_refs(current);
- /* read completions before cancelations */
- inflight = tctx_inflight(tctx, !cancel_all);
- if (!inflight)
- break;
-
- if (!sqd) {
- struct io_tctx_node *node;
- unsigned long index;
-
- xa_for_each(&tctx->xa, index, node) {
- /* sqpoll task will cancel all its requests */
- if (node->ctx->sq_data)
- continue;
- io_uring_try_cancel_requests(node->ctx, current,
- cancel_all);
- }
- } else {
- list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
- io_uring_try_cancel_requests(ctx, current,
- cancel_all);
- }
-
- prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
- io_run_task_work();
- io_uring_drop_tctx_refs(current);
-
- /*
- * If we've seen completions, retry without waiting. This
- * avoids a race where a completion comes in before we did
- * prepare_to_wait().
- */
- if (inflight == tctx_inflight(tctx, !cancel_all))
- schedule();
- finish_wait(&tctx->wait, &wait);
- } while (1);
-
- io_uring_clean_tctx(tctx);
- if (cancel_all) {
- /*
- * We shouldn't run task_works after cancel, so just leave
- * ->in_idle set for normal exit.
- */
- atomic_dec(&tctx->in_idle);
- /* for exec all current's requests should be gone, kill tctx */
- __io_uring_free(current);
- }
-}
-
-void __io_uring_cancel(bool cancel_all)
-{
- io_uring_cancel_generic(cancel_all, NULL);
-}
-
-void io_uring_unreg_ringfd(void)
-{
- struct io_uring_task *tctx = current->io_uring;
- int i;
-
- for (i = 0; i < IO_RINGFD_REG_MAX; i++) {
- if (tctx->registered_rings[i]) {
- fput(tctx->registered_rings[i]);
- tctx->registered_rings[i] = NULL;
- }
- }
-}
-
-static int io_ring_add_registered_fd(struct io_uring_task *tctx, int fd,
- int start, int end)
-{
- struct file *file;
- int offset;
-
- for (offset = start; offset < end; offset++) {
- offset = array_index_nospec(offset, IO_RINGFD_REG_MAX);
- if (tctx->registered_rings[offset])
- continue;
-
- file = fget(fd);
- if (!file) {
- return -EBADF;
- } else if (file->f_op != &io_uring_fops) {
- fput(file);
- return -EOPNOTSUPP;
- }
- tctx->registered_rings[offset] = file;
- return offset;
- }
-
- return -EBUSY;
-}
-
-/*
- * Register a ring fd to avoid fdget/fdput for each io_uring_enter()
- * invocation. User passes in an array of struct io_uring_rsrc_update
- * with ->data set to the ring_fd, and ->offset given for the desired
- * index. If no index is desired, application may set ->offset == -1U
- * and we'll find an available index. Returns number of entries
- * successfully processed, or < 0 on error if none were processed.
- */
-static int io_ringfd_register(struct io_ring_ctx *ctx, void __user *__arg,
- unsigned nr_args)
-{
- struct io_uring_rsrc_update __user *arg = __arg;
- struct io_uring_rsrc_update reg;
- struct io_uring_task *tctx;
- int ret, i;
-
- if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
- return -EINVAL;
-
- mutex_unlock(&ctx->uring_lock);
- ret = io_uring_add_tctx_node(ctx);
- mutex_lock(&ctx->uring_lock);
- if (ret)
- return ret;
-
- tctx = current->io_uring;
- for (i = 0; i < nr_args; i++) {
- int start, end;
-
- if (copy_from_user(®, &arg[i], sizeof(reg))) {
- ret = -EFAULT;
- break;
- }
-
- if (reg.resv) {
- ret = -EINVAL;
- break;
- }
-
- if (reg.offset == -1U) {
- start = 0;
- end = IO_RINGFD_REG_MAX;
- } else {
- if (reg.offset >= IO_RINGFD_REG_MAX) {
- ret = -EINVAL;
- break;
- }
- start = reg.offset;
- end = start + 1;
- }
-
- ret = io_ring_add_registered_fd(tctx, reg.data, start, end);
- if (ret < 0)
- break;
-
- reg.offset = ret;
- if (copy_to_user(&arg[i], ®, sizeof(reg))) {
- fput(tctx->registered_rings[reg.offset]);
- tctx->registered_rings[reg.offset] = NULL;
- ret = -EFAULT;
- break;
- }
- }
-
- return i ? i : ret;
-}
-
-static int io_ringfd_unregister(struct io_ring_ctx *ctx, void __user *__arg,
- unsigned nr_args)
-{
- struct io_uring_rsrc_update __user *arg = __arg;
- struct io_uring_task *tctx = current->io_uring;
- struct io_uring_rsrc_update reg;
- int ret = 0, i;
-
- if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
- return -EINVAL;
- if (!tctx)
- return 0;
-
- for (i = 0; i < nr_args; i++) {
- if (copy_from_user(®, &arg[i], sizeof(reg))) {
- ret = -EFAULT;
- break;
- }
- if (reg.resv || reg.data || reg.offset >= IO_RINGFD_REG_MAX) {
- ret = -EINVAL;
- break;
- }
-
- reg.offset = array_index_nospec(reg.offset, IO_RINGFD_REG_MAX);
- if (tctx->registered_rings[reg.offset]) {
- fput(tctx->registered_rings[reg.offset]);
- tctx->registered_rings[reg.offset] = NULL;
- }
- }
-
- return i ? i : ret;
-}
-
-static void *io_uring_validate_mmap_request(struct file *file,
- loff_t pgoff, size_t sz)
-{
- struct io_ring_ctx *ctx = file->private_data;
- loff_t offset = pgoff << PAGE_SHIFT;
- struct page *page;
- void *ptr;
-
- switch (offset) {
- case IORING_OFF_SQ_RING:
- case IORING_OFF_CQ_RING:
- ptr = ctx->rings;
- break;
- case IORING_OFF_SQES:
- ptr = ctx->sq_sqes;
- break;
- default:
- return ERR_PTR(-EINVAL);
- }
-
- page = virt_to_head_page(ptr);
- if (sz > page_size(page))
- return ERR_PTR(-EINVAL);
-
- return ptr;
-}
-
-#ifdef CONFIG_MMU
-
-static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
-{
- size_t sz = vma->vm_end - vma->vm_start;
- unsigned long pfn;
- void *ptr;
-
- ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz);
- if (IS_ERR(ptr))
- return PTR_ERR(ptr);
-
- pfn = virt_to_phys(ptr) >> PAGE_SHIFT;
- return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
-}
-
-#else /* !CONFIG_MMU */
-
-static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
-{
- return vma->vm_flags & (VM_SHARED | VM_MAYSHARE) ? 0 : -EINVAL;
-}
-
-static unsigned int io_uring_nommu_mmap_capabilities(struct file *file)
-{
- return NOMMU_MAP_DIRECT | NOMMU_MAP_READ | NOMMU_MAP_WRITE;
-}
-
-static unsigned long io_uring_nommu_get_unmapped_area(struct file *file,
- unsigned long addr, unsigned long len,
- unsigned long pgoff, unsigned long flags)
-{
- void *ptr;
-
- ptr = io_uring_validate_mmap_request(file, pgoff, len);
- if (IS_ERR(ptr))
- return PTR_ERR(ptr);
-
- return (unsigned long) ptr;
-}
-
-#endif /* !CONFIG_MMU */
-
-static int io_sqpoll_wait_sq(struct io_ring_ctx *ctx)
-{
- DEFINE_WAIT(wait);
-
- do {
- if (!io_sqring_full(ctx))
- break;
- prepare_to_wait(&ctx->sqo_sq_wait, &wait, TASK_INTERRUPTIBLE);
-
- if (!io_sqring_full(ctx))
- break;
- schedule();
- } while (!signal_pending(current));
-
- finish_wait(&ctx->sqo_sq_wait, &wait);
- return 0;
-}
-
-static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz,
- struct __kernel_timespec __user **ts,
- const sigset_t __user **sig)
-{
- struct io_uring_getevents_arg arg;
-
- /*
- * If EXT_ARG isn't set, then we have no timespec and the argp pointer
- * is just a pointer to the sigset_t.
- */
- if (!(flags & IORING_ENTER_EXT_ARG)) {
- *sig = (const sigset_t __user *) argp;
- *ts = NULL;
- return 0;
- }
-
- /*
- * EXT_ARG is set - ensure we agree on the size of it and copy in our
- * timespec and sigset_t pointers if good.
- */
- if (*argsz != sizeof(arg))
- return -EINVAL;
- if (copy_from_user(&arg, argp, sizeof(arg)))
- return -EFAULT;
- if (arg.pad)
- return -EINVAL;
- *sig = u64_to_user_ptr(arg.sigmask);
- *argsz = arg.sigmask_sz;
- *ts = u64_to_user_ptr(arg.ts);
- return 0;
-}
-
-SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
- u32, min_complete, u32, flags, const void __user *, argp,
- size_t, argsz)
-{
- struct io_ring_ctx *ctx;
- int submitted = 0;
- struct fd f;
- long ret;
-
- io_run_task_work();
-
- if (unlikely(flags & ~(IORING_ENTER_GETEVENTS | IORING_ENTER_SQ_WAKEUP |
- IORING_ENTER_SQ_WAIT | IORING_ENTER_EXT_ARG |
- IORING_ENTER_REGISTERED_RING)))
- return -EINVAL;
-
- /*
- * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
- * need only dereference our task private array to find it.
- */
- if (flags & IORING_ENTER_REGISTERED_RING) {
- struct io_uring_task *tctx = current->io_uring;
-
- if (!tctx || fd >= IO_RINGFD_REG_MAX)
- return -EINVAL;
- fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
- f.file = tctx->registered_rings[fd];
- if (unlikely(!f.file))
- return -EBADF;
- } else {
- f = fdget(fd);
- if (unlikely(!f.file))
- return -EBADF;
- }
-
- ret = -EOPNOTSUPP;
- if (unlikely(f.file->f_op != &io_uring_fops))
- goto out_fput;
-
- ret = -ENXIO;
- ctx = f.file->private_data;
- if (unlikely(!percpu_ref_tryget(&ctx->refs)))
- goto out_fput;
-
- ret = -EBADFD;
- if (unlikely(ctx->flags & IORING_SETUP_R_DISABLED))
- goto out;
-
- /*
- * For SQ polling, the thread will do all submissions and completions.
- * Just return the requested submit count, and wake the thread if
- * we were asked to.
- */
- ret = 0;
- if (ctx->flags & IORING_SETUP_SQPOLL) {
- io_cqring_overflow_flush(ctx);
-
- if (unlikely(ctx->sq_data->thread == NULL)) {
- ret = -EOWNERDEAD;
- goto out;
- }
- if (flags & IORING_ENTER_SQ_WAKEUP)
- wake_up(&ctx->sq_data->wait);
- if (flags & IORING_ENTER_SQ_WAIT) {
- ret = io_sqpoll_wait_sq(ctx);
- if (ret)
- goto out;
- }
- submitted = to_submit;
- } else if (to_submit) {
- ret = io_uring_add_tctx_node(ctx);
- if (unlikely(ret))
- goto out;
- mutex_lock(&ctx->uring_lock);
- submitted = io_submit_sqes(ctx, to_submit);
- mutex_unlock(&ctx->uring_lock);
-
- if (submitted != to_submit)
- goto out;
- }
- if (flags & IORING_ENTER_GETEVENTS) {
- const sigset_t __user *sig;
- struct __kernel_timespec __user *ts;
-
- ret = io_get_ext_arg(flags, argp, &argsz, &ts, &sig);
- if (unlikely(ret))
- goto out;
-
- min_complete = min(min_complete, ctx->cq_entries);
-
- /*
- * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user
- * space applications don't need to do io completion events
- * polling again, they can rely on io_sq_thread to do polling
- * work, which can reduce cpu usage and uring_lock contention.
- */
- if (ctx->flags & IORING_SETUP_IOPOLL &&
- !(ctx->flags & IORING_SETUP_SQPOLL)) {
- ret = io_iopoll_check(ctx, min_complete);
- } else {
- ret = io_cqring_wait(ctx, min_complete, sig, argsz, ts);
- }
- }
-
-out:
- percpu_ref_put(&ctx->refs);
-out_fput:
- if (!(flags & IORING_ENTER_REGISTERED_RING))
- fdput(f);
- return submitted ? submitted : ret;
-}
-
-#ifdef CONFIG_PROC_FS
-static __cold int io_uring_show_cred(struct seq_file *m, unsigned int id,
- const struct cred *cred)
-{
- struct user_namespace *uns = seq_user_ns(m);
- struct group_info *gi;
- kernel_cap_t cap;
- unsigned __capi;
- int g;
-
- seq_printf(m, "%5d\n", id);
- seq_put_decimal_ull(m, "\tUid:\t", from_kuid_munged(uns, cred->uid));
- seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->euid));
- seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->suid));
- seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->fsuid));
- seq_put_decimal_ull(m, "\n\tGid:\t", from_kgid_munged(uns, cred->gid));
- seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->egid));
- seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->sgid));
- seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->fsgid));
- seq_puts(m, "\n\tGroups:\t");
- gi = cred->group_info;
- for (g = 0; g < gi->ngroups; g++) {
- seq_put_decimal_ull(m, g ? " " : "",
- from_kgid_munged(uns, gi->gid[g]));
- }
- seq_puts(m, "\n\tCapEff:\t");
- cap = cred->cap_effective;
- CAP_FOR_EACH_U32(__capi)
- seq_put_hex_ll(m, NULL, cap.cap[CAP_LAST_U32 - __capi], 8);
- seq_putc(m, '\n');
- return 0;
-}
-
-static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
- struct seq_file *m)
-{
- struct io_sq_data *sq = NULL;
- struct io_overflow_cqe *ocqe;
- struct io_rings *r = ctx->rings;
- unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1;
- unsigned int sq_head = READ_ONCE(r->sq.head);
- unsigned int sq_tail = READ_ONCE(r->sq.tail);
- unsigned int cq_head = READ_ONCE(r->cq.head);
- unsigned int cq_tail = READ_ONCE(r->cq.tail);
- unsigned int sq_entries, cq_entries;
- bool has_lock;
- unsigned int i;
-
- /*
- * we may get imprecise sqe and cqe info if uring is actively running
- * since we get cached_sq_head and cached_cq_tail without uring_lock
- * and sq_tail and cq_head are changed by userspace. But it's ok since
- * we usually use these info when it is stuck.
- */
- seq_printf(m, "SqMask:\t0x%x\n", sq_mask);
- seq_printf(m, "SqHead:\t%u\n", sq_head);
- seq_printf(m, "SqTail:\t%u\n", sq_tail);
- seq_printf(m, "CachedSqHead:\t%u\n", ctx->cached_sq_head);
- seq_printf(m, "CqMask:\t0x%x\n", cq_mask);
- seq_printf(m, "CqHead:\t%u\n", cq_head);
- seq_printf(m, "CqTail:\t%u\n", cq_tail);
- seq_printf(m, "CachedCqTail:\t%u\n", ctx->cached_cq_tail);
- seq_printf(m, "SQEs:\t%u\n", sq_tail - ctx->cached_sq_head);
- sq_entries = min(sq_tail - sq_head, ctx->sq_entries);
- for (i = 0; i < sq_entries; i++) {
- unsigned int entry = i + sq_head;
- unsigned int sq_idx = READ_ONCE(ctx->sq_array[entry & sq_mask]);
- struct io_uring_sqe *sqe;
-
- if (sq_idx > sq_mask)
- continue;
- sqe = &ctx->sq_sqes[sq_idx];
- seq_printf(m, "%5u: opcode:%d, fd:%d, flags:%x, user_data:%llu\n",
- sq_idx, sqe->opcode, sqe->fd, sqe->flags,
- sqe->user_data);
- }
- seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head);
- cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
- for (i = 0; i < cq_entries; i++) {
- unsigned int entry = i + cq_head;
- struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];
-
- seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
- entry & cq_mask, cqe->user_data, cqe->res,
- cqe->flags);
- }
-
- /*
- * Avoid ABBA deadlock between the seq lock and the io_uring mutex,
- * since fdinfo case grabs it in the opposite direction of normal use
- * cases. If we fail to get the lock, we just don't iterate any
- * structures that could be going away outside the io_uring mutex.
- */
- has_lock = mutex_trylock(&ctx->uring_lock);
-
- if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
- sq = ctx->sq_data;
- if (!sq->thread)
- sq = NULL;
- }
-
- seq_printf(m, "SqThread:\t%d\n", sq ? task_pid_nr(sq->thread) : -1);
- seq_printf(m, "SqThreadCpu:\t%d\n", sq ? task_cpu(sq->thread) : -1);
- seq_printf(m, "UserFiles:\t%u\n", ctx->nr_user_files);
- for (i = 0; has_lock && i < ctx->nr_user_files; i++) {
- struct file *f = io_file_from_index(ctx, i);
-
- if (f)
- seq_printf(m, "%5u: %s\n", i, file_dentry(f)->d_iname);
- else
- seq_printf(m, "%5u: <none>\n", i);
- }
- seq_printf(m, "UserBufs:\t%u\n", ctx->nr_user_bufs);
- for (i = 0; has_lock && i < ctx->nr_user_bufs; i++) {
- struct io_mapped_ubuf *buf = ctx->user_bufs[i];
- unsigned int len = buf->ubuf_end - buf->ubuf;
-
- seq_printf(m, "%5u: 0x%llx/%u\n", i, buf->ubuf, len);
- }
- if (has_lock && !xa_empty(&ctx->personalities)) {
- unsigned long index;
- const struct cred *cred;
-
- seq_printf(m, "Personalities:\n");
- xa_for_each(&ctx->personalities, index, cred)
- io_uring_show_cred(m, index, cred);
- }
- if (has_lock)
- mutex_unlock(&ctx->uring_lock);
-
- seq_puts(m, "PollList:\n");
- spin_lock(&ctx->completion_lock);
- for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
- struct hlist_head *list = &ctx->cancel_hash[i];
- struct io_kiocb *req;
-
- hlist_for_each_entry(req, list, hash_node)
- seq_printf(m, " op=%d, task_works=%d\n", req->opcode,
- task_work_pending(req->task));
- }
-
- seq_puts(m, "CqOverflowList:\n");
- list_for_each_entry(ocqe, &ctx->cq_overflow_list, list) {
- struct io_uring_cqe *cqe = &ocqe->cqe;
-
- seq_printf(m, " user_data=%llu, res=%d, flags=%x\n",
- cqe->user_data, cqe->res, cqe->flags);
-
- }
-
- spin_unlock(&ctx->completion_lock);
-}
-
-static __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
-{
- struct io_ring_ctx *ctx = f->private_data;
-
- if (percpu_ref_tryget(&ctx->refs)) {
- __io_uring_show_fdinfo(ctx, m);
- percpu_ref_put(&ctx->refs);
- }
-}
-#endif
-
-static const struct file_operations io_uring_fops = {
- .release = io_uring_release,
- .mmap = io_uring_mmap,
-#ifndef CONFIG_MMU
- .get_unmapped_area = io_uring_nommu_get_unmapped_area,
- .mmap_capabilities = io_uring_nommu_mmap_capabilities,
-#endif
- .poll = io_uring_poll,
-#ifdef CONFIG_PROC_FS
- .show_fdinfo = io_uring_show_fdinfo,
-#endif
-};
-
-static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
- struct io_uring_params *p)
-{
- struct io_rings *rings;
- size_t size, sq_array_offset;
-
- /* make sure these are sane, as we already accounted them */
- ctx->sq_entries = p->sq_entries;
- ctx->cq_entries = p->cq_entries;
-
- size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
- if (size == SIZE_MAX)
- return -EOVERFLOW;
-
- rings = io_mem_alloc(size);
- if (!rings)
- return -ENOMEM;
-
- ctx->rings = rings;
- ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
- rings->sq_ring_mask = p->sq_entries - 1;
- rings->cq_ring_mask = p->cq_entries - 1;
- rings->sq_ring_entries = p->sq_entries;
- rings->cq_ring_entries = p->cq_entries;
-
- size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
- if (size == SIZE_MAX) {
- io_mem_free(ctx->rings);
- ctx->rings = NULL;
- return -EOVERFLOW;
- }
-
- ctx->sq_sqes = io_mem_alloc(size);
- if (!ctx->sq_sqes) {
- io_mem_free(ctx->rings);
- ctx->rings = NULL;
- return -ENOMEM;
- }
-
- return 0;
-}
-
-static int io_uring_install_fd(struct io_ring_ctx *ctx, struct file *file)
-{
- int ret, fd;
-
- fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
- if (fd < 0)
- return fd;
-
- ret = io_uring_add_tctx_node(ctx);
- if (ret) {
- put_unused_fd(fd);
- return ret;
- }
- fd_install(fd, file);
- return fd;
-}
-
-/*
- * Allocate an anonymous fd, this is what constitutes the application
- * visible backing of an io_uring instance. The application mmaps this
- * fd to gain access to the SQ/CQ ring details. If UNIX sockets are enabled,
- * we have to tie this fd to a socket for file garbage collection purposes.
- */
-static struct file *io_uring_get_file(struct io_ring_ctx *ctx)
-{
- struct file *file;
-#if defined(CONFIG_UNIX)
- int ret;
-
- ret = sock_create_kern(&init_net, PF_UNIX, SOCK_RAW, IPPROTO_IP,
- &ctx->ring_sock);
- if (ret)
- return ERR_PTR(ret);
-#endif
-
- file = anon_inode_getfile_secure("[io_uring]", &io_uring_fops, ctx,
- O_RDWR | O_CLOEXEC, NULL);
-#if defined(CONFIG_UNIX)
- if (IS_ERR(file)) {
- sock_release(ctx->ring_sock);
- ctx->ring_sock = NULL;
- } else {
- ctx->ring_sock->file = file;
- }
-#endif
- return file;
-}
-
-static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
- struct io_uring_params __user *params)
-{
- struct io_ring_ctx *ctx;
- struct file *file;
- int ret;
-
- if (!entries)
- return -EINVAL;
- if (entries > IORING_MAX_ENTRIES) {
- if (!(p->flags & IORING_SETUP_CLAMP))
- return -EINVAL;
- entries = IORING_MAX_ENTRIES;
- }
-
- /*
- * Use twice as many entries for the CQ ring. It's possible for the
- * application to drive a higher depth than the size of the SQ ring,
- * since the sqes are only used at submission time. This allows for
- * some flexibility in overcommitting a bit. If the application has
- * set IORING_SETUP_CQSIZE, it will have passed in the desired number
- * of CQ ring entries manually.
- */
- p->sq_entries = roundup_pow_of_two(entries);
- if (p->flags & IORING_SETUP_CQSIZE) {
- /*
- * If IORING_SETUP_CQSIZE is set, we do the same roundup
- * to a power-of-two, if it isn't already. We do NOT impose
- * any cq vs sq ring sizing.
- */
- if (!p->cq_entries)
- return -EINVAL;
- if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
- if (!(p->flags & IORING_SETUP_CLAMP))
- return -EINVAL;
- p->cq_entries = IORING_MAX_CQ_ENTRIES;
- }
- p->cq_entries = roundup_pow_of_two(p->cq_entries);
- if (p->cq_entries < p->sq_entries)
- return -EINVAL;
- } else {
- p->cq_entries = 2 * p->sq_entries;
- }
-
- ctx = io_ring_ctx_alloc(p);
- if (!ctx)
- return -ENOMEM;
- ctx->compat = in_compat_syscall();
- if (!capable(CAP_IPC_LOCK))
- ctx->user = get_uid(current_user());
-
- /*
- * This is just grabbed for accounting purposes. When a process exits,
- * the mm is exited and dropped before the files, hence we need to hang
- * on to this mm purely for the purposes of being able to unaccount
- * memory (locked/pinned vm). It's not used for anything else.
- */
- mmgrab(current->mm);
- ctx->mm_account = current->mm;
-
- ret = io_allocate_scq_urings(ctx, p);
- if (ret)
- goto err;
-
- ret = io_sq_offload_create(ctx, p);
- if (ret)
- goto err;
- /* always set a rsrc node */
- ret = io_rsrc_node_switch_start(ctx);
- if (ret)
- goto err;
- io_rsrc_node_switch(ctx, NULL);
-
- memset(&p->sq_off, 0, sizeof(p->sq_off));
- p->sq_off.head = offsetof(struct io_rings, sq.head);
- p->sq_off.tail = offsetof(struct io_rings, sq.tail);
- p->sq_off.ring_mask = offsetof(struct io_rings, sq_ring_mask);
- p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
- p->sq_off.flags = offsetof(struct io_rings, sq_flags);
- p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
- p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
-
- memset(&p->cq_off, 0, sizeof(p->cq_off));
- p->cq_off.head = offsetof(struct io_rings, cq.head);
- p->cq_off.tail = offsetof(struct io_rings, cq.tail);
- p->cq_off.ring_mask = offsetof(struct io_rings, cq_ring_mask);
- p->cq_off.ring_entries = offsetof(struct io_rings, cq_ring_entries);
- p->cq_off.overflow = offsetof(struct io_rings, cq_overflow);
- p->cq_off.cqes = offsetof(struct io_rings, cqes);
- p->cq_off.flags = offsetof(struct io_rings, cq_flags);
-
- p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
- IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
- IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL |
- IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED |
- IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
- IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
- IORING_FEAT_LINKED_FILE;
-
- if (copy_to_user(params, p, sizeof(*p))) {
- ret = -EFAULT;
- goto err;
- }
-
- file = io_uring_get_file(ctx);
- if (IS_ERR(file)) {
- ret = PTR_ERR(file);
- goto err;
- }
-
- /*
- * Install ring fd as the very last thing, so we don't risk someone
- * having closed it before we finish setup
- */
- ret = io_uring_install_fd(ctx, file);
- if (ret < 0) {
- /* fput will clean it up */
- fput(file);
- return ret;
- }
-
- trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
- return ret;
-err:
- io_ring_ctx_wait_and_kill(ctx);
- return ret;
-}
-
-/*
- * Sets up an aio uring context, and returns the fd. Applications asks for a
- * ring size, we return the actual sq/cq ring sizes (among other things) in the
- * params structure passed in.
- */
-static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
-{
- struct io_uring_params p;
- int i;
-
- if (copy_from_user(&p, params, sizeof(p)))
- return -EFAULT;
- for (i = 0; i < ARRAY_SIZE(p.resv); i++) {
- if (p.resv[i])
- return -EINVAL;
- }
-
- if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
- IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
- IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
- IORING_SETUP_R_DISABLED | IORING_SETUP_SUBMIT_ALL))
- return -EINVAL;
-
- return io_uring_create(entries, &p, params);
-}
-
-SYSCALL_DEFINE2(io_uring_setup, u32, entries,
- struct io_uring_params __user *, params)
-{
- return io_uring_setup(entries, params);
-}
-
-static __cold int io_probe(struct io_ring_ctx *ctx, void __user *arg,
- unsigned nr_args)
-{
- struct io_uring_probe *p;
- size_t size;
- int i, ret;
-
- size = struct_size(p, ops, nr_args);
- if (size == SIZE_MAX)
- return -EOVERFLOW;
- p = kzalloc(size, GFP_KERNEL);
- if (!p)
- return -ENOMEM;
-
- ret = -EFAULT;
- if (copy_from_user(p, arg, size))
- goto out;
- ret = -EINVAL;
- if (memchr_inv(p, 0, size))
- goto out;
-
- p->last_op = IORING_OP_LAST - 1;
- if (nr_args > IORING_OP_LAST)
- nr_args = IORING_OP_LAST;
-
- for (i = 0; i < nr_args; i++) {
- p->ops[i].op = i;
- if (!io_op_defs[i].not_supported)
- p->ops[i].flags = IO_URING_OP_SUPPORTED;
- }
- p->ops_len = i;
-
- ret = 0;
- if (copy_to_user(arg, p, size))
- ret = -EFAULT;
-out:
- kfree(p);
- return ret;
-}
-
-static int io_register_personality(struct io_ring_ctx *ctx)
-{
- const struct cred *creds;
- u32 id;
- int ret;
-
- creds = get_current_cred();
-
- ret = xa_alloc_cyclic(&ctx->personalities, &id, (void *)creds,
- XA_LIMIT(0, USHRT_MAX), &ctx->pers_next, GFP_KERNEL);
- if (ret < 0) {
- put_cred(creds);
- return ret;
- }
- return id;
-}
-
-static __cold int io_register_restrictions(struct io_ring_ctx *ctx,
- void __user *arg, unsigned int nr_args)
-{
- struct io_uring_restriction *res;
- size_t size;
- int i, ret;
-
- /* Restrictions allowed only if rings started disabled */
- if (!(ctx->flags & IORING_SETUP_R_DISABLED))
- return -EBADFD;
-
- /* We allow only a single restrictions registration */
- if (ctx->restrictions.registered)
- return -EBUSY;
-
- if (!arg || nr_args > IORING_MAX_RESTRICTIONS)
- return -EINVAL;
-
- size = array_size(nr_args, sizeof(*res));
- if (size == SIZE_MAX)
- return -EOVERFLOW;
-
- res = memdup_user(arg, size);
- if (IS_ERR(res))
- return PTR_ERR(res);
-
- ret = 0;
-
- for (i = 0; i < nr_args; i++) {
- switch (res[i].opcode) {
- case IORING_RESTRICTION_REGISTER_OP:
- if (res[i].register_op >= IORING_REGISTER_LAST) {
- ret = -EINVAL;
- goto out;
- }
-
- __set_bit(res[i].register_op,
- ctx->restrictions.register_op);
- break;
- case IORING_RESTRICTION_SQE_OP:
- if (res[i].sqe_op >= IORING_OP_LAST) {
- ret = -EINVAL;
- goto out;
- }
-
- __set_bit(res[i].sqe_op, ctx->restrictions.sqe_op);
- break;
- case IORING_RESTRICTION_SQE_FLAGS_ALLOWED:
- ctx->restrictions.sqe_flags_allowed = res[i].sqe_flags;
- break;
- case IORING_RESTRICTION_SQE_FLAGS_REQUIRED:
- ctx->restrictions.sqe_flags_required = res[i].sqe_flags;
- break;
- default:
- ret = -EINVAL;
- goto out;
- }
- }
-
-out:
- /* Reset all restrictions if an error happened */
- if (ret != 0)
- memset(&ctx->restrictions, 0, sizeof(ctx->restrictions));
- else
- ctx->restrictions.registered = true;
-
- kfree(res);
- return ret;
-}
-
-static int io_register_enable_rings(struct io_ring_ctx *ctx)
-{
- if (!(ctx->flags & IORING_SETUP_R_DISABLED))
- return -EBADFD;
-
- if (ctx->restrictions.registered)
- ctx->restricted = 1;
-
- ctx->flags &= ~IORING_SETUP_R_DISABLED;
- if (ctx->sq_data && wq_has_sleeper(&ctx->sq_data->wait))
- wake_up(&ctx->sq_data->wait);
- return 0;
-}
-
-static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
- struct io_uring_rsrc_update2 *up,
- unsigned nr_args)
-{
- __u32 tmp;
- int err;
-
- if (check_add_overflow(up->offset, nr_args, &tmp))
- return -EOVERFLOW;
- err = io_rsrc_node_switch_start(ctx);
- if (err)
- return err;
-
- switch (type) {
- case IORING_RSRC_FILE:
- return __io_sqe_files_update(ctx, up, nr_args);
- case IORING_RSRC_BUFFER:
- return __io_sqe_buffers_update(ctx, up, nr_args);
- }
- return -EINVAL;
-}
-
-static int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg,
- unsigned nr_args)
-{
- struct io_uring_rsrc_update2 up;
-
- if (!nr_args)
- return -EINVAL;
- memset(&up, 0, sizeof(up));
- if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update)))
- return -EFAULT;
- if (up.resv || up.resv2)
- return -EINVAL;
- return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args);
-}
-
-static int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
- unsigned size, unsigned type)
-{
- struct io_uring_rsrc_update2 up;
-
- if (size != sizeof(up))
- return -EINVAL;
- if (copy_from_user(&up, arg, sizeof(up)))
- return -EFAULT;
- if (!up.nr || up.resv || up.resv2)
- return -EINVAL;
- return __io_register_rsrc_update(ctx, type, &up, up.nr);
-}
-
-static __cold int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
- unsigned int size, unsigned int type)
-{
- struct io_uring_rsrc_register rr;
-
- /* keep it extendible */
- if (size != sizeof(rr))
- return -EINVAL;
-
- memset(&rr, 0, sizeof(rr));
- if (copy_from_user(&rr, arg, size))
- return -EFAULT;
- if (!rr.nr || rr.resv || rr.resv2)
- return -EINVAL;
-
- switch (type) {
- case IORING_RSRC_FILE:
- return io_sqe_files_register(ctx, u64_to_user_ptr(rr.data),
- rr.nr, u64_to_user_ptr(rr.tags));
- case IORING_RSRC_BUFFER:
- return io_sqe_buffers_register(ctx, u64_to_user_ptr(rr.data),
- rr.nr, u64_to_user_ptr(rr.tags));
- }
- return -EINVAL;
-}
-
-static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
- void __user *arg, unsigned len)
-{
- struct io_uring_task *tctx = current->io_uring;
- cpumask_var_t new_mask;
- int ret;
-
- if (!tctx || !tctx->io_wq)
- return -EINVAL;
-
- if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
- return -ENOMEM;
-
- cpumask_clear(new_mask);
- if (len > cpumask_size())
- len = cpumask_size();
-
- if (in_compat_syscall()) {
- ret = compat_get_bitmap(cpumask_bits(new_mask),
- (const compat_ulong_t __user *)arg,
- len * 8 /* CHAR_BIT */);
- } else {
- ret = copy_from_user(new_mask, arg, len);
- }
-
- if (ret) {
- free_cpumask_var(new_mask);
- return -EFAULT;
- }
-
- ret = io_wq_cpu_affinity(tctx->io_wq, new_mask);
- free_cpumask_var(new_mask);
- return ret;
-}
-
-static __cold int io_unregister_iowq_aff(struct io_ring_ctx *ctx)
-{
- struct io_uring_task *tctx = current->io_uring;
-
- if (!tctx || !tctx->io_wq)
- return -EINVAL;
-
- return io_wq_cpu_affinity(tctx->io_wq, NULL);
-}
-
-static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
- void __user *arg)
- __must_hold(&ctx->uring_lock)
-{
- struct io_tctx_node *node;
- struct io_uring_task *tctx = NULL;
- struct io_sq_data *sqd = NULL;
- __u32 new_count[2];
- int i, ret;
-
- if (copy_from_user(new_count, arg, sizeof(new_count)))
- return -EFAULT;
- for (i = 0; i < ARRAY_SIZE(new_count); i++)
- if (new_count[i] > INT_MAX)
- return -EINVAL;
-
- if (ctx->flags & IORING_SETUP_SQPOLL) {
- sqd = ctx->sq_data;
- if (sqd) {
- /*
- * Observe the correct sqd->lock -> ctx->uring_lock
- * ordering. Fine to drop uring_lock here, we hold
- * a ref to the ctx.
- */
- refcount_inc(&sqd->refs);
- mutex_unlock(&ctx->uring_lock);
- mutex_lock(&sqd->lock);
- mutex_lock(&ctx->uring_lock);
- if (sqd->thread)
- tctx = sqd->thread->io_uring;
- }
- } else {
- tctx = current->io_uring;
- }
-
- BUILD_BUG_ON(sizeof(new_count) != sizeof(ctx->iowq_limits));
-
- for (i = 0; i < ARRAY_SIZE(new_count); i++)
- if (new_count[i])
- ctx->iowq_limits[i] = new_count[i];
- ctx->iowq_limits_set = true;
-
- if (tctx && tctx->io_wq) {
- ret = io_wq_max_workers(tctx->io_wq, new_count);
- if (ret)
- goto err;
- } else {
- memset(new_count, 0, sizeof(new_count));
- }
-
- if (sqd) {
- mutex_unlock(&sqd->lock);
- io_put_sq_data(sqd);
- }
-
- if (copy_to_user(arg, new_count, sizeof(new_count)))
- return -EFAULT;
-
- /* that's it for SQPOLL, only the SQPOLL task creates requests */
- if (sqd)
- return 0;
-
- /* now propagate the restriction to all registered users */
- list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
- struct io_uring_task *tctx = node->task->io_uring;
-
- if (WARN_ON_ONCE(!tctx->io_wq))
- continue;
-
- for (i = 0; i < ARRAY_SIZE(new_count); i++)
- new_count[i] = ctx->iowq_limits[i];
- /* ignore errors, it always returns zero anyway */
- (void)io_wq_max_workers(tctx->io_wq, new_count);
- }
- return 0;
-err:
- if (sqd) {
- mutex_unlock(&sqd->lock);
- io_put_sq_data(sqd);
- }
- return ret;
-}
-
-static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
- void __user *arg, unsigned nr_args)
- __releases(ctx->uring_lock)
- __acquires(ctx->uring_lock)
-{
- int ret;
-
- /*
- * We're inside the ring mutex, if the ref is already dying, then
- * someone else killed the ctx or is already going through
- * io_uring_register().
- */
- if (percpu_ref_is_dying(&ctx->refs))
- return -ENXIO;
-
- if (ctx->restricted) {
- if (opcode >= IORING_REGISTER_LAST)
- return -EINVAL;
- opcode = array_index_nospec(opcode, IORING_REGISTER_LAST);
- if (!test_bit(opcode, ctx->restrictions.register_op))
- return -EACCES;
- }
-
- switch (opcode) {
- case IORING_REGISTER_BUFFERS:
- ret = io_sqe_buffers_register(ctx, arg, nr_args, NULL);
- break;
- case IORING_UNREGISTER_BUFFERS:
- ret = -EINVAL;
- if (arg || nr_args)
- break;
- ret = io_sqe_buffers_unregister(ctx);
- break;
- case IORING_REGISTER_FILES:
- ret = io_sqe_files_register(ctx, arg, nr_args, NULL);
- break;
- case IORING_UNREGISTER_FILES:
- ret = -EINVAL;
- if (arg || nr_args)
- break;
- ret = io_sqe_files_unregister(ctx);
- break;
- case IORING_REGISTER_FILES_UPDATE:
- ret = io_register_files_update(ctx, arg, nr_args);
- break;
- case IORING_REGISTER_EVENTFD:
- ret = -EINVAL;
- if (nr_args != 1)
- break;
- ret = io_eventfd_register(ctx, arg, 0);
- break;
- case IORING_REGISTER_EVENTFD_ASYNC:
- ret = -EINVAL;
- if (nr_args != 1)
- break;
- ret = io_eventfd_register(ctx, arg, 1);
- break;
- case IORING_UNREGISTER_EVENTFD:
- ret = -EINVAL;
- if (arg || nr_args)
- break;
- ret = io_eventfd_unregister(ctx);
- break;
- case IORING_REGISTER_PROBE:
- ret = -EINVAL;
- if (!arg || nr_args > 256)
- break;
- ret = io_probe(ctx, arg, nr_args);
- break;
- case IORING_REGISTER_PERSONALITY:
- ret = -EINVAL;
- if (arg || nr_args)
- break;
- ret = io_register_personality(ctx);
- break;
- case IORING_UNREGISTER_PERSONALITY:
- ret = -EINVAL;
- if (arg)
- break;
- ret = io_unregister_personality(ctx, nr_args);
- break;
- case IORING_REGISTER_ENABLE_RINGS:
- ret = -EINVAL;
- if (arg || nr_args)
- break;
- ret = io_register_enable_rings(ctx);
- break;
- case IORING_REGISTER_RESTRICTIONS:
- ret = io_register_restrictions(ctx, arg, nr_args);
- break;
- case IORING_REGISTER_FILES2:
- ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_FILE);
- break;
- case IORING_REGISTER_FILES_UPDATE2:
- ret = io_register_rsrc_update(ctx, arg, nr_args,
- IORING_RSRC_FILE);
- break;
- case IORING_REGISTER_BUFFERS2:
- ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_BUFFER);
- break;
- case IORING_REGISTER_BUFFERS_UPDATE:
- ret = io_register_rsrc_update(ctx, arg, nr_args,
- IORING_RSRC_BUFFER);
- break;
- case IORING_REGISTER_IOWQ_AFF:
- ret = -EINVAL;
- if (!arg || !nr_args)
- break;
- ret = io_register_iowq_aff(ctx, arg, nr_args);
- break;
- case IORING_UNREGISTER_IOWQ_AFF:
- ret = -EINVAL;
- if (arg || nr_args)
- break;
- ret = io_unregister_iowq_aff(ctx);
- break;
- case IORING_REGISTER_IOWQ_MAX_WORKERS:
- ret = -EINVAL;
- if (!arg || nr_args != 2)
- break;
- ret = io_register_iowq_max_workers(ctx, arg);
- break;
- case IORING_REGISTER_RING_FDS:
- ret = io_ringfd_register(ctx, arg, nr_args);
- break;
- case IORING_UNREGISTER_RING_FDS:
- ret = io_ringfd_unregister(ctx, arg, nr_args);
- break;
- default:
- ret = -EINVAL;
- break;
- }
-
- return ret;
-}
-
-SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
- void __user *, arg, unsigned int, nr_args)
-{
- struct io_ring_ctx *ctx;
- long ret = -EBADF;
- struct fd f;
-
- f = fdget(fd);
- if (!f.file)
- return -EBADF;
-
- ret = -EOPNOTSUPP;
- if (f.file->f_op != &io_uring_fops)
- goto out_fput;
-
- ctx = f.file->private_data;
-
- io_run_task_work();
-
- mutex_lock(&ctx->uring_lock);
- ret = __io_uring_register(ctx, opcode, arg, nr_args);
- mutex_unlock(&ctx->uring_lock);
- trace_io_uring_register(ctx, opcode, ctx->nr_user_files, ctx->nr_user_bufs, ret);
-out_fput:
- fdput(f);
- return ret;
-}
-
-static int __init io_uring_init(void)
-{
-#define __BUILD_BUG_VERIFY_ELEMENT(stype, eoffset, etype, ename) do { \
- BUILD_BUG_ON(offsetof(stype, ename) != eoffset); \
- BUILD_BUG_ON(sizeof(etype) != sizeof_field(stype, ename)); \
-} while (0)
-
-#define BUILD_BUG_SQE_ELEM(eoffset, etype, ename) \
- __BUILD_BUG_VERIFY_ELEMENT(struct io_uring_sqe, eoffset, etype, ename)
- BUILD_BUG_ON(sizeof(struct io_uring_sqe) != 64);
- BUILD_BUG_SQE_ELEM(0, __u8, opcode);
- BUILD_BUG_SQE_ELEM(1, __u8, flags);
- BUILD_BUG_SQE_ELEM(2, __u16, ioprio);
- BUILD_BUG_SQE_ELEM(4, __s32, fd);
- BUILD_BUG_SQE_ELEM(8, __u64, off);
- BUILD_BUG_SQE_ELEM(8, __u64, addr2);
- BUILD_BUG_SQE_ELEM(16, __u64, addr);
- BUILD_BUG_SQE_ELEM(16, __u64, splice_off_in);
- BUILD_BUG_SQE_ELEM(24, __u32, len);
- BUILD_BUG_SQE_ELEM(28, __kernel_rwf_t, rw_flags);
- BUILD_BUG_SQE_ELEM(28, /* compat */ int, rw_flags);
- BUILD_BUG_SQE_ELEM(28, /* compat */ __u32, rw_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, fsync_flags);
- BUILD_BUG_SQE_ELEM(28, /* compat */ __u16, poll_events);
- BUILD_BUG_SQE_ELEM(28, __u32, poll32_events);
- BUILD_BUG_SQE_ELEM(28, __u32, sync_range_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, msg_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, timeout_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, accept_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, cancel_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, open_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, statx_flags);
- BUILD_BUG_SQE_ELEM(28, __u32, fadvise_advice);
- BUILD_BUG_SQE_ELEM(28, __u32, splice_flags);
- BUILD_BUG_SQE_ELEM(32, __u64, user_data);
- BUILD_BUG_SQE_ELEM(40, __u16, buf_index);
- BUILD_BUG_SQE_ELEM(40, __u16, buf_group);
- BUILD_BUG_SQE_ELEM(42, __u16, personality);
- BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
- BUILD_BUG_SQE_ELEM(44, __u32, file_index);
-
- BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
- sizeof(struct io_uring_rsrc_update));
- BUILD_BUG_ON(sizeof(struct io_uring_rsrc_update) >
- sizeof(struct io_uring_rsrc_update2));
-
- /* ->buf_index is u16 */
- BUILD_BUG_ON(IORING_MAX_REG_BUFFERS >= (1u << 16));
-
- /* should fit into one byte */
- BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
- BUILD_BUG_ON(SQE_COMMON_FLAGS >= (1 << 8));
- BUILD_BUG_ON((SQE_VALID_FLAGS | SQE_COMMON_FLAGS) != SQE_VALID_FLAGS);
-
- BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST);
- BUILD_BUG_ON(__REQ_F_LAST_BIT > 8 * sizeof(int));
-
- req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
- SLAB_ACCOUNT);
- return 0;
-};
-__initcall(io_uring_init);
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index ac7f067b7bdd..f306b52b8e0b 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -551,13 +551,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
*/
jbd2_journal_switch_revoke_table(journal);
+ write_lock(&journal->j_state_lock);
/*
* Reserved credits cannot be claimed anymore, free them
*/
atomic_sub(atomic_read(&journal->j_reserved_credits),
&commit_transaction->t_outstanding_credits);
- write_lock(&journal->j_state_lock);
trace_jbd2_commit_flushing(journal, commit_transaction);
stats.run.rs_flushing = jiffies;
stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked,
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index fcb9175016a5..49f109699933 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -1486,8 +1486,6 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
struct journal_head *jh;
int ret = 0;
- if (is_handle_aborted(handle))
- return -EROFS;
if (!buffer_jbd(bh))
return -EUCLEAN;
@@ -1534,6 +1532,18 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
journal = transaction->t_journal;
spin_lock(&jh->b_state_lock);
+ if (is_handle_aborted(handle)) {
+ /*
+ * Check journal aborting with @jh->b_state_lock locked,
+ * since 'jh->b_transaction' could be replaced with
+ * 'jh->b_next_transaction' during old transaction
+ * committing if journal aborted, which may fail
+ * assertion on 'jh->b_frozen_data == NULL'.
+ */
+ ret = -EROFS;
+ goto out_unlock_bh;
+ }
+
if (jh->b_modified == 0) {
/*
* This buffer's got modified and becoming part
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 6eca72cfa1f2..1cc88ba6de90 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1343,14 +1343,17 @@ static void __kernfs_remove(struct kernfs_node *kn)
{
struct kernfs_node *pos;
+ /* Short-circuit if non-root @kn has already finished removal. */
+ if (!kn)
+ return;
+
lockdep_assert_held_write(&kernfs_root(kn)->kernfs_rwsem);
/*
- * Short-circuit if non-root @kn has already finished removal.
* This is for kernfs_remove_self() which plays with active ref
* after removal.
*/
- if (!kn || (kn->parent && RB_EMPTY_NODE(&kn->rb)))
+ if (kn->parent && RB_EMPTY_NODE(&kn->rb))
return;
pr_debug("kernfs %s: removing\n", kn->name);
diff --git a/fs/ksmbd/connection.c b/fs/ksmbd/connection.c
index bc6050b67256..e8f476c5f189 100644
--- a/fs/ksmbd/connection.c
+++ b/fs/ksmbd/connection.c
@@ -205,31 +205,31 @@ int ksmbd_conn_write(struct ksmbd_work *work)
return 0;
}
-int ksmbd_conn_rdma_read(struct ksmbd_conn *conn, void *buf,
- unsigned int buflen, u32 remote_key, u64 remote_offset,
- u32 remote_len)
+int ksmbd_conn_rdma_read(struct ksmbd_conn *conn,
+ void *buf, unsigned int buflen,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len)
{
int ret = -EINVAL;
if (conn->transport->ops->rdma_read)
ret = conn->transport->ops->rdma_read(conn->transport,
buf, buflen,
- remote_key, remote_offset,
- remote_len);
+ desc, desc_len);
return ret;
}
-int ksmbd_conn_rdma_write(struct ksmbd_conn *conn, void *buf,
- unsigned int buflen, u32 remote_key,
- u64 remote_offset, u32 remote_len)
+int ksmbd_conn_rdma_write(struct ksmbd_conn *conn,
+ void *buf, unsigned int buflen,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len)
{
int ret = -EINVAL;
if (conn->transport->ops->rdma_write)
ret = conn->transport->ops->rdma_write(conn->transport,
buf, buflen,
- remote_key, remote_offset,
- remote_len);
+ desc, desc_len);
return ret;
}
diff --git a/fs/ksmbd/connection.h b/fs/ksmbd/connection.h
index 7a59aacb5daa..98c1cbe45ec9 100644
--- a/fs/ksmbd/connection.h
+++ b/fs/ksmbd/connection.h
@@ -122,11 +122,14 @@ struct ksmbd_transport_ops {
int (*writev)(struct ksmbd_transport *t, struct kvec *iovs, int niov,
int size, bool need_invalidate_rkey,
unsigned int remote_key);
- int (*rdma_read)(struct ksmbd_transport *t, void *buf, unsigned int len,
- u32 remote_key, u64 remote_offset, u32 remote_len);
- int (*rdma_write)(struct ksmbd_transport *t, void *buf,
- unsigned int len, u32 remote_key, u64 remote_offset,
- u32 remote_len);
+ int (*rdma_read)(struct ksmbd_transport *t,
+ void *buf, unsigned int len,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len);
+ int (*rdma_write)(struct ksmbd_transport *t,
+ void *buf, unsigned int len,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len);
};
struct ksmbd_transport {
@@ -148,12 +151,14 @@ struct ksmbd_conn *ksmbd_conn_alloc(void);
void ksmbd_conn_free(struct ksmbd_conn *conn);
bool ksmbd_conn_lookup_dialect(struct ksmbd_conn *c);
int ksmbd_conn_write(struct ksmbd_work *work);
-int ksmbd_conn_rdma_read(struct ksmbd_conn *conn, void *buf,
- unsigned int buflen, u32 remote_key, u64 remote_offset,
- u32 remote_len);
-int ksmbd_conn_rdma_write(struct ksmbd_conn *conn, void *buf,
- unsigned int buflen, u32 remote_key, u64 remote_offset,
- u32 remote_len);
+int ksmbd_conn_rdma_read(struct ksmbd_conn *conn,
+ void *buf, unsigned int buflen,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len);
+int ksmbd_conn_rdma_write(struct ksmbd_conn *conn,
+ void *buf, unsigned int buflen,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len);
void ksmbd_conn_enqueue_request(struct ksmbd_work *work);
int ksmbd_conn_try_dequeue_request(struct ksmbd_work *work);
void ksmbd_conn_init_server_callbacks(struct ksmbd_conn_ops *ops);
diff --git a/fs/ksmbd/ksmbd_netlink.h b/fs/ksmbd/ksmbd_netlink.h
index ebe6ca08467a..52aa0adeb951 100644
--- a/fs/ksmbd/ksmbd_netlink.h
+++ b/fs/ksmbd/ksmbd_netlink.h
@@ -104,7 +104,8 @@ struct ksmbd_startup_request {
*/
__u32 sub_auth[3]; /* Subauth value for Security ID */
__u32 smb2_max_credits; /* MAX credits */
- __u32 reserved[128]; /* Reserved room */
+ __u32 smbd_max_io_size; /* smbd read write size */
+ __u32 reserved[127]; /* Reserved room */
__u32 ifc_list_sz; /* interfaces list size */
__s8 ____payload[];
};
diff --git a/fs/ksmbd/smb2misc.c b/fs/ksmbd/smb2misc.c
index f8f456377a51..6e25ace36568 100644
--- a/fs/ksmbd/smb2misc.c
+++ b/fs/ksmbd/smb2misc.c
@@ -90,11 +90,6 @@ static int smb2_get_data_area_len(unsigned int *off, unsigned int *len,
*off = 0;
*len = 0;
- /* error reqeusts do not have data area */
- if (hdr->Status && hdr->Status != STATUS_MORE_PROCESSING_REQUIRED &&
- (((struct smb2_err_rsp *)hdr)->StructureSize) == SMB2_ERROR_STRUCTURE_SIZE2_LE)
- return ret;
-
/*
* Following commands have data areas so we have to get the location
* of the data buffer offset and data buffer length for the particular
@@ -136,8 +131,11 @@ static int smb2_get_data_area_len(unsigned int *off, unsigned int *len,
*len = le16_to_cpu(((struct smb2_read_req *)hdr)->ReadChannelInfoLength);
break;
case SMB2_WRITE:
- if (((struct smb2_write_req *)hdr)->DataOffset) {
- *off = le16_to_cpu(((struct smb2_write_req *)hdr)->DataOffset);
+ if (((struct smb2_write_req *)hdr)->DataOffset ||
+ ((struct smb2_write_req *)hdr)->Length) {
+ *off = max_t(unsigned int,
+ le16_to_cpu(((struct smb2_write_req *)hdr)->DataOffset),
+ offsetof(struct smb2_write_req, Buffer));
*len = le32_to_cpu(((struct smb2_write_req *)hdr)->Length);
break;
}
diff --git a/fs/ksmbd/smb2pdu.c b/fs/ksmbd/smb2pdu.c
index 31138e9be1dc..85a9ed7156ea 100644
--- a/fs/ksmbd/smb2pdu.c
+++ b/fs/ksmbd/smb2pdu.c
@@ -535,9 +535,10 @@ int smb2_allocate_rsp_buf(struct ksmbd_work *work)
struct smb2_query_info_req *req;
req = smb2_get_msg(work->request_buf);
- if (req->InfoType == SMB2_O_INFO_FILE &&
- (req->FileInfoClass == FILE_FULL_EA_INFORMATION ||
- req->FileInfoClass == FILE_ALL_INFORMATION))
+ if ((req->InfoType == SMB2_O_INFO_FILE &&
+ (req->FileInfoClass == FILE_FULL_EA_INFORMATION ||
+ req->FileInfoClass == FILE_ALL_INFORMATION)) ||
+ req->InfoType == SMB2_O_INFO_SECURITY)
sz = large_sz;
}
@@ -1139,12 +1140,16 @@ int smb2_handle_negotiate(struct ksmbd_work *work)
status);
rsp->hdr.Status = status;
rc = -EINVAL;
+ kfree(conn->preauth_info);
+ conn->preauth_info = NULL;
goto err_out;
}
rc = init_smb3_11_server(conn);
if (rc < 0) {
rsp->hdr.Status = STATUS_INVALID_PARAMETER;
+ kfree(conn->preauth_info);
+ conn->preauth_info = NULL;
goto err_out;
}
@@ -2039,6 +2044,7 @@ int smb2_tree_disconnect(struct ksmbd_work *work)
ksmbd_close_tree_conn_fds(work);
ksmbd_tree_conn_disconnect(sess, tcon);
+ work->tcon = NULL;
return 0;
}
@@ -2969,7 +2975,7 @@ int smb2_open(struct ksmbd_work *work)
goto err_out;
rc = build_sec_desc(user_ns,
- pntsd, NULL,
+ pntsd, NULL, 0,
OWNER_SECINFO |
GROUP_SECINFO |
DACL_SECINFO,
@@ -3814,6 +3820,15 @@ static int verify_info_level(int info_level)
return 0;
}
+static int smb2_resp_buf_len(struct ksmbd_work *work, unsigned short hdr2_len)
+{
+ int free_len;
+
+ free_len = (int)(work->response_sz -
+ (get_rfc1002_len(work->response_buf) + 4)) - hdr2_len;
+ return free_len;
+}
+
static int smb2_calc_max_out_buf_len(struct ksmbd_work *work,
unsigned short hdr2_len,
unsigned int out_buf_len)
@@ -3823,9 +3838,7 @@ static int smb2_calc_max_out_buf_len(struct ksmbd_work *work,
if (out_buf_len > work->conn->vals->max_trans_size)
return -EINVAL;
- free_len = (int)(work->response_sz -
- (get_rfc1002_len(work->response_buf) + 4)) -
- hdr2_len;
+ free_len = smb2_resp_buf_len(work, hdr2_len);
if (free_len < 0)
return -EINVAL;
@@ -5080,10 +5093,10 @@ static int smb2_get_info_sec(struct ksmbd_work *work,
struct smb_ntsd *pntsd = (struct smb_ntsd *)rsp->Buffer, *ppntsd = NULL;
struct smb_fattr fattr = {{0}};
struct inode *inode;
- __u32 secdesclen;
+ __u32 secdesclen = 0;
unsigned int id = KSMBD_NO_FID, pid = KSMBD_NO_FID;
int addition_info = le32_to_cpu(req->AdditionalInformation);
- int rc;
+ int rc = 0, ppntsd_size = 0;
if (addition_info & ~(OWNER_SECINFO | GROUP_SECINFO | DACL_SECINFO |
PROTECTED_DACL_SECINFO |
@@ -5129,11 +5142,14 @@ static int smb2_get_info_sec(struct ksmbd_work *work,
if (test_share_config_flag(work->tcon->share_conf,
KSMBD_SHARE_FLAG_ACL_XATTR))
- ksmbd_vfs_get_sd_xattr(work->conn, user_ns,
- fp->filp->f_path.dentry, &ppntsd);
-
- rc = build_sec_desc(user_ns, pntsd, ppntsd, addition_info,
- &secdesclen, &fattr);
+ ppntsd_size = ksmbd_vfs_get_sd_xattr(work->conn, user_ns,
+ fp->filp->f_path.dentry,
+ &ppntsd);
+
+ /* Check if sd buffer size exceeds response buffer size */
+ if (smb2_resp_buf_len(work, 8) > ppntsd_size)
+ rc = build_sec_desc(user_ns, pntsd, ppntsd, ppntsd_size,
+ addition_info, &secdesclen, &fattr);
posix_acl_release(fattr.cf_acls);
posix_acl_release(fattr.cf_dacls);
kfree(ppntsd);
@@ -6116,7 +6132,6 @@ static noinline int smb2_read_pipe(struct ksmbd_work *work)
static int smb2_set_remote_key_for_rdma(struct ksmbd_work *work,
struct smb2_buffer_desc_v1 *desc,
__le32 Channel,
- __le16 ChannelInfoOffset,
__le16 ChannelInfoLength)
{
unsigned int i, ch_count;
@@ -6142,7 +6157,8 @@ static int smb2_set_remote_key_for_rdma(struct ksmbd_work *work,
work->need_invalidate_rkey =
(Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE);
- work->remote_key = le32_to_cpu(desc->token);
+ if (Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE)
+ work->remote_key = le32_to_cpu(desc->token);
return 0;
}
@@ -6150,14 +6166,12 @@ static ssize_t smb2_read_rdma_channel(struct ksmbd_work *work,
struct smb2_read_req *req, void *data_buf,
size_t length)
{
- struct smb2_buffer_desc_v1 *desc =
- (struct smb2_buffer_desc_v1 *)&req->Buffer[0];
int err;
err = ksmbd_conn_rdma_write(work->conn, data_buf, length,
- le32_to_cpu(desc->token),
- le64_to_cpu(desc->offset),
- le32_to_cpu(desc->length));
+ (struct smb2_buffer_desc_v1 *)
+ ((char *)req + le16_to_cpu(req->ReadChannelInfoOffset)),
+ le16_to_cpu(req->ReadChannelInfoLength));
if (err)
return err;
@@ -6180,6 +6194,8 @@ int smb2_read(struct ksmbd_work *work)
size_t length, mincount;
ssize_t nbytes = 0, remain_bytes = 0;
int err = 0;
+ bool is_rdma_channel = false;
+ unsigned int max_read_size = conn->vals->max_read_size;
WORK_BUFFERS(work, req, rsp);
@@ -6191,6 +6207,11 @@ int smb2_read(struct ksmbd_work *work)
if (req->Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE ||
req->Channel == SMB2_CHANNEL_RDMA_V1) {
+ is_rdma_channel = true;
+ max_read_size = get_smbd_max_read_write_size();
+ }
+
+ if (is_rdma_channel == true) {
unsigned int ch_offset = le16_to_cpu(req->ReadChannelInfoOffset);
if (ch_offset < offsetof(struct smb2_read_req, Buffer)) {
@@ -6201,7 +6222,6 @@ int smb2_read(struct ksmbd_work *work)
(struct smb2_buffer_desc_v1 *)
((char *)req + ch_offset),
req->Channel,
- req->ReadChannelInfoOffset,
req->ReadChannelInfoLength);
if (err)
goto out;
@@ -6223,9 +6243,9 @@ int smb2_read(struct ksmbd_work *work)
length = le32_to_cpu(req->Length);
mincount = le32_to_cpu(req->MinimumCount);
- if (length > conn->vals->max_read_size) {
+ if (length > max_read_size) {
ksmbd_debug(SMB, "limiting read size to max size(%u)\n",
- conn->vals->max_read_size);
+ max_read_size);
err = -EINVAL;
goto out;
}
@@ -6257,8 +6277,7 @@ int smb2_read(struct ksmbd_work *work)
ksmbd_debug(SMB, "nbytes %zu, offset %lld mincount %zu\n",
nbytes, offset, mincount);
- if (req->Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE ||
- req->Channel == SMB2_CHANNEL_RDMA_V1) {
+ if (is_rdma_channel == true) {
/* write data to the client using rdma channel */
remain_bytes = smb2_read_rdma_channel(work, req,
work->aux_payload_buf,
@@ -6328,23 +6347,18 @@ static noinline int smb2_write_pipe(struct ksmbd_work *work)
length = le32_to_cpu(req->Length);
id = req->VolatileFileId;
- if (le16_to_cpu(req->DataOffset) ==
- offsetof(struct smb2_write_req, Buffer)) {
- data_buf = (char *)&req->Buffer[0];
- } else {
- if ((u64)le16_to_cpu(req->DataOffset) + length >
- get_rfc1002_len(work->request_buf)) {
- pr_err("invalid write data offset %u, smb_len %u\n",
- le16_to_cpu(req->DataOffset),
- get_rfc1002_len(work->request_buf));
- err = -EINVAL;
- goto out;
- }
-
- data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
- le16_to_cpu(req->DataOffset));
+ if ((u64)le16_to_cpu(req->DataOffset) + length >
+ get_rfc1002_len(work->request_buf)) {
+ pr_err("invalid write data offset %u, smb_len %u\n",
+ le16_to_cpu(req->DataOffset),
+ get_rfc1002_len(work->request_buf));
+ err = -EINVAL;
+ goto out;
}
+ data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
+ le16_to_cpu(req->DataOffset));
+
rpc_resp = ksmbd_rpc_write(work->sess, id, data_buf, length);
if (rpc_resp) {
if (rpc_resp->flags == KSMBD_RPC_ENOTIMPLEMENTED) {
@@ -6384,21 +6398,18 @@ static ssize_t smb2_write_rdma_channel(struct ksmbd_work *work,
struct ksmbd_file *fp,
loff_t offset, size_t length, bool sync)
{
- struct smb2_buffer_desc_v1 *desc;
char *data_buf;
int ret;
ssize_t nbytes;
- desc = (struct smb2_buffer_desc_v1 *)&req->Buffer[0];
-
data_buf = kvmalloc(length, GFP_KERNEL | __GFP_ZERO);
if (!data_buf)
return -ENOMEM;
ret = ksmbd_conn_rdma_read(work->conn, data_buf, length,
- le32_to_cpu(desc->token),
- le64_to_cpu(desc->offset),
- le32_to_cpu(desc->length));
+ (struct smb2_buffer_desc_v1 *)
+ ((char *)req + le16_to_cpu(req->WriteChannelInfoOffset)),
+ le16_to_cpu(req->WriteChannelInfoLength));
if (ret < 0) {
kvfree(data_buf);
return ret;
@@ -6427,8 +6438,9 @@ int smb2_write(struct ksmbd_work *work)
size_t length;
ssize_t nbytes;
char *data_buf;
- bool writethrough = false;
+ bool writethrough = false, is_rdma_channel = false;
int err = 0;
+ unsigned int max_write_size = work->conn->vals->max_write_size;
WORK_BUFFERS(work, req, rsp);
@@ -6437,8 +6449,17 @@ int smb2_write(struct ksmbd_work *work)
return smb2_write_pipe(work);
}
+ offset = le64_to_cpu(req->Offset);
+ length = le32_to_cpu(req->Length);
+
if (req->Channel == SMB2_CHANNEL_RDMA_V1 ||
req->Channel == SMB2_CHANNEL_RDMA_V1_INVALIDATE) {
+ is_rdma_channel = true;
+ max_write_size = get_smbd_max_read_write_size();
+ length = le32_to_cpu(req->RemainingBytes);
+ }
+
+ if (is_rdma_channel == true) {
unsigned int ch_offset = le16_to_cpu(req->WriteChannelInfoOffset);
if (req->Length != 0 || req->DataOffset != 0 ||
@@ -6450,7 +6471,6 @@ int smb2_write(struct ksmbd_work *work)
(struct smb2_buffer_desc_v1 *)
((char *)req + ch_offset),
req->Channel,
- req->WriteChannelInfoOffset,
req->WriteChannelInfoLength);
if (err)
goto out;
@@ -6474,12 +6494,9 @@ int smb2_write(struct ksmbd_work *work)
goto out;
}
- offset = le64_to_cpu(req->Offset);
- length = le32_to_cpu(req->Length);
-
- if (length > work->conn->vals->max_write_size) {
+ if (length > max_write_size) {
ksmbd_debug(SMB, "limiting write size to max size(%u)\n",
- work->conn->vals->max_write_size);
+ max_write_size);
err = -EINVAL;
goto out;
}
@@ -6487,25 +6504,16 @@ int smb2_write(struct ksmbd_work *work)
if (le32_to_cpu(req->Flags) & SMB2_WRITEFLAG_WRITE_THROUGH)
writethrough = true;
- if (req->Channel != SMB2_CHANNEL_RDMA_V1 &&
- req->Channel != SMB2_CHANNEL_RDMA_V1_INVALIDATE) {
- if (le16_to_cpu(req->DataOffset) ==
+ if (is_rdma_channel == false) {
+ if (le16_to_cpu(req->DataOffset) <
offsetof(struct smb2_write_req, Buffer)) {
- data_buf = (char *)&req->Buffer[0];
- } else {
- if ((u64)le16_to_cpu(req->DataOffset) + length >
- get_rfc1002_len(work->request_buf)) {
- pr_err("invalid write data offset %u, smb_len %u\n",
- le16_to_cpu(req->DataOffset),
- get_rfc1002_len(work->request_buf));
- err = -EINVAL;
- goto out;
- }
-
- data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
- le16_to_cpu(req->DataOffset));
+ err = -EINVAL;
+ goto out;
}
+ data_buf = (char *)(((char *)&req->hdr.ProtocolId) +
+ le16_to_cpu(req->DataOffset));
+
ksmbd_debug(SMB, "flags %u\n", le32_to_cpu(req->Flags));
if (le32_to_cpu(req->Flags) & SMB2_WRITEFLAG_WRITE_THROUGH)
writethrough = true;
@@ -6520,8 +6528,7 @@ int smb2_write(struct ksmbd_work *work)
/* read data from the client using rdma channel, and
* write the data.
*/
- nbytes = smb2_write_rdma_channel(work, req, fp, offset,
- le32_to_cpu(req->RemainingBytes),
+ nbytes = smb2_write_rdma_channel(work, req, fp, offset, length,
writethrough);
if (nbytes < 0) {
err = (int)nbytes;
diff --git a/fs/ksmbd/smbacl.c b/fs/ksmbd/smbacl.c
index 38f23bf981ac..3781bca2c8fc 100644
--- a/fs/ksmbd/smbacl.c
+++ b/fs/ksmbd/smbacl.c
@@ -690,6 +690,7 @@ static void set_posix_acl_entries_dacl(struct user_namespace *user_ns,
static void set_ntacl_dacl(struct user_namespace *user_ns,
struct smb_acl *pndacl,
struct smb_acl *nt_dacl,
+ unsigned int aces_size,
const struct smb_sid *pownersid,
const struct smb_sid *pgrpsid,
struct smb_fattr *fattr)
@@ -703,9 +704,19 @@ static void set_ntacl_dacl(struct user_namespace *user_ns,
if (nt_num_aces) {
ntace = (struct smb_ace *)((char *)nt_dacl + sizeof(struct smb_acl));
for (i = 0; i < nt_num_aces; i++) {
- memcpy((char *)pndace + size, ntace, le16_to_cpu(ntace->size));
- size += le16_to_cpu(ntace->size);
- ntace = (struct smb_ace *)((char *)ntace + le16_to_cpu(ntace->size));
+ unsigned short nt_ace_size;
+
+ if (offsetof(struct smb_ace, access_req) > aces_size)
+ break;
+
+ nt_ace_size = le16_to_cpu(ntace->size);
+ if (nt_ace_size > aces_size)
+ break;
+
+ memcpy((char *)pndace + size, ntace, nt_ace_size);
+ size += nt_ace_size;
+ aces_size -= nt_ace_size;
+ ntace = (struct smb_ace *)((char *)ntace + nt_ace_size);
num_aces++;
}
}
@@ -878,7 +889,7 @@ int parse_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd,
/* Convert permission bits from mode to equivalent CIFS ACL */
int build_sec_desc(struct user_namespace *user_ns,
struct smb_ntsd *pntsd, struct smb_ntsd *ppntsd,
- int addition_info, __u32 *secdesclen,
+ int ppntsd_size, int addition_info, __u32 *secdesclen,
struct smb_fattr *fattr)
{
int rc = 0;
@@ -938,15 +949,25 @@ int build_sec_desc(struct user_namespace *user_ns,
if (!ppntsd) {
set_mode_dacl(user_ns, dacl_ptr, fattr);
- } else if (!ppntsd->dacloffset) {
- goto out;
} else {
struct smb_acl *ppdacl_ptr;
+ unsigned int dacl_offset = le32_to_cpu(ppntsd->dacloffset);
+ int ppdacl_size, ntacl_size = ppntsd_size - dacl_offset;
+
+ if (!dacl_offset ||
+ (dacl_offset + sizeof(struct smb_acl) > ppntsd_size))
+ goto out;
+
+ ppdacl_ptr = (struct smb_acl *)((char *)ppntsd + dacl_offset);
+ ppdacl_size = le16_to_cpu(ppdacl_ptr->size);
+ if (ppdacl_size > ntacl_size ||
+ ppdacl_size < sizeof(struct smb_acl))
+ goto out;
- ppdacl_ptr = (struct smb_acl *)((char *)ppntsd +
- le32_to_cpu(ppntsd->dacloffset));
set_ntacl_dacl(user_ns, dacl_ptr, ppdacl_ptr,
- nowner_sid_ptr, ngroup_sid_ptr, fattr);
+ ntacl_size - sizeof(struct smb_acl),
+ nowner_sid_ptr, ngroup_sid_ptr,
+ fattr);
}
pntsd->dacloffset = cpu_to_le32(offset);
offset += le16_to_cpu(dacl_ptr->size);
@@ -980,24 +1001,31 @@ int smb_inherit_dacl(struct ksmbd_conn *conn,
struct smb_sid owner_sid, group_sid;
struct dentry *parent = path->dentry->d_parent;
struct user_namespace *user_ns = mnt_user_ns(path->mnt);
- int inherited_flags = 0, flags = 0, i, ace_cnt = 0, nt_size = 0;
- int rc = 0, num_aces, dacloffset, pntsd_type, acl_len;
+ int inherited_flags = 0, flags = 0, i, ace_cnt = 0, nt_size = 0, pdacl_size;
+ int rc = 0, num_aces, dacloffset, pntsd_type, pntsd_size, acl_len, aces_size;
char *aces_base;
bool is_dir = S_ISDIR(d_inode(path->dentry)->i_mode);
- acl_len = ksmbd_vfs_get_sd_xattr(conn, user_ns,
- parent, &parent_pntsd);
- if (acl_len <= 0)
+ pntsd_size = ksmbd_vfs_get_sd_xattr(conn, user_ns,
+ parent, &parent_pntsd);
+ if (pntsd_size <= 0)
return -ENOENT;
dacloffset = le32_to_cpu(parent_pntsd->dacloffset);
- if (!dacloffset) {
+ if (!dacloffset || (dacloffset + sizeof(struct smb_acl) > pntsd_size)) {
rc = -EINVAL;
goto free_parent_pntsd;
}
parent_pdacl = (struct smb_acl *)((char *)parent_pntsd + dacloffset);
+ acl_len = pntsd_size - dacloffset;
num_aces = le32_to_cpu(parent_pdacl->num_aces);
pntsd_type = le16_to_cpu(parent_pntsd->type);
+ pdacl_size = le16_to_cpu(parent_pdacl->size);
+
+ if (pdacl_size > acl_len || pdacl_size < sizeof(struct smb_acl)) {
+ rc = -EINVAL;
+ goto free_parent_pntsd;
+ }
aces_base = kmalloc(sizeof(struct smb_ace) * num_aces * 2, GFP_KERNEL);
if (!aces_base) {
@@ -1008,11 +1036,23 @@ int smb_inherit_dacl(struct ksmbd_conn *conn,
aces = (struct smb_ace *)aces_base;
parent_aces = (struct smb_ace *)((char *)parent_pdacl +
sizeof(struct smb_acl));
+ aces_size = acl_len - sizeof(struct smb_acl);
if (pntsd_type & DACL_AUTO_INHERITED)
inherited_flags = INHERITED_ACE;
for (i = 0; i < num_aces; i++) {
+ int pace_size;
+
+ if (offsetof(struct smb_ace, access_req) > aces_size)
+ break;
+
+ pace_size = le16_to_cpu(parent_aces->size);
+ if (pace_size > aces_size)
+ break;
+
+ aces_size -= pace_size;
+
flags = parent_aces->flags;
if (!smb_inherit_flags(flags, is_dir))
goto pass;
@@ -1057,8 +1097,7 @@ int smb_inherit_dacl(struct ksmbd_conn *conn,
aces = (struct smb_ace *)((char *)aces + le16_to_cpu(aces->size));
ace_cnt++;
pass:
- parent_aces =
- (struct smb_ace *)((char *)parent_aces + le16_to_cpu(parent_aces->size));
+ parent_aces = (struct smb_ace *)((char *)parent_aces + pace_size);
}
if (nt_size > 0) {
@@ -1153,7 +1192,7 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
struct smb_ntsd *pntsd = NULL;
struct smb_acl *pdacl;
struct posix_acl *posix_acls;
- int rc = 0, acl_size;
+ int rc = 0, pntsd_size, acl_size, aces_size, pdacl_size, dacl_offset;
struct smb_sid sid;
int granted = le32_to_cpu(*pdaccess & ~FILE_MAXIMAL_ACCESS_LE);
struct smb_ace *ace;
@@ -1162,37 +1201,33 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
struct smb_ace *others_ace = NULL;
struct posix_acl_entry *pa_entry;
unsigned int sid_type = SIDOWNER;
- char *end_of_acl;
+ unsigned short ace_size;
ksmbd_debug(SMB, "check permission using windows acl\n");
- acl_size = ksmbd_vfs_get_sd_xattr(conn, user_ns,
- path->dentry, &pntsd);
- if (acl_size <= 0 || !pntsd || !pntsd->dacloffset) {
- kfree(pntsd);
- return 0;
- }
+ pntsd_size = ksmbd_vfs_get_sd_xattr(conn, user_ns,
+ path->dentry, &pntsd);
+ if (pntsd_size <= 0 || !pntsd)
+ goto err_out;
+
+ dacl_offset = le32_to_cpu(pntsd->dacloffset);
+ if (!dacl_offset ||
+ (dacl_offset + sizeof(struct smb_acl) > pntsd_size))
+ goto err_out;
pdacl = (struct smb_acl *)((char *)pntsd + le32_to_cpu(pntsd->dacloffset));
- end_of_acl = ((char *)pntsd) + acl_size;
- if (end_of_acl <= (char *)pdacl) {
- kfree(pntsd);
- return 0;
- }
+ acl_size = pntsd_size - dacl_offset;
+ pdacl_size = le16_to_cpu(pdacl->size);
- if (end_of_acl < (char *)pdacl + le16_to_cpu(pdacl->size) ||
- le16_to_cpu(pdacl->size) < sizeof(struct smb_acl)) {
- kfree(pntsd);
- return 0;
- }
+ if (pdacl_size > acl_size || pdacl_size < sizeof(struct smb_acl))
+ goto err_out;
if (!pdacl->num_aces) {
- if (!(le16_to_cpu(pdacl->size) - sizeof(struct smb_acl)) &&
+ if (!(pdacl_size - sizeof(struct smb_acl)) &&
*pdaccess & ~(FILE_READ_CONTROL_LE | FILE_WRITE_DAC_LE)) {
rc = -EACCES;
goto err_out;
}
- kfree(pntsd);
- return 0;
+ goto err_out;
}
if (*pdaccess & FILE_MAXIMAL_ACCESS_LE) {
@@ -1200,11 +1235,16 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
DELETE;
ace = (struct smb_ace *)((char *)pdacl + sizeof(struct smb_acl));
+ aces_size = acl_size - sizeof(struct smb_acl);
for (i = 0; i < le32_to_cpu(pdacl->num_aces); i++) {
+ if (offsetof(struct smb_ace, access_req) > aces_size)
+ break;
+ ace_size = le16_to_cpu(ace->size);
+ if (ace_size > aces_size)
+ break;
+ aces_size -= ace_size;
granted |= le32_to_cpu(ace->access_req);
ace = (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size));
- if (end_of_acl < (char *)ace)
- goto err_out;
}
if (!pdacl->num_aces)
@@ -1216,7 +1256,15 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
id_to_sid(uid, sid_type, &sid);
ace = (struct smb_ace *)((char *)pdacl + sizeof(struct smb_acl));
+ aces_size = acl_size - sizeof(struct smb_acl);
for (i = 0; i < le32_to_cpu(pdacl->num_aces); i++) {
+ if (offsetof(struct smb_ace, access_req) > aces_size)
+ break;
+ ace_size = le16_to_cpu(ace->size);
+ if (ace_size > aces_size)
+ break;
+ aces_size -= ace_size;
+
if (!compare_sids(&sid, &ace->sid) ||
!compare_sids(&sid_unix_NFS_mode, &ace->sid)) {
found = 1;
@@ -1226,8 +1274,6 @@ int smb_check_perm_dacl(struct ksmbd_conn *conn, struct path *path,
others_ace = ace;
ace = (struct smb_ace *)((char *)ace + le16_to_cpu(ace->size));
- if (end_of_acl < (char *)ace)
- goto err_out;
}
if (*pdaccess & FILE_MAXIMAL_ACCESS_LE && found) {
diff --git a/fs/ksmbd/smbacl.h b/fs/ksmbd/smbacl.h
index 811af3309429..fcb2c83f2992 100644
--- a/fs/ksmbd/smbacl.h
+++ b/fs/ksmbd/smbacl.h
@@ -193,7 +193,7 @@ struct posix_acl_state {
int parse_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd,
int acl_len, struct smb_fattr *fattr);
int build_sec_desc(struct user_namespace *user_ns, struct smb_ntsd *pntsd,
- struct smb_ntsd *ppntsd, int addition_info,
+ struct smb_ntsd *ppntsd, int ppntsd_size, int addition_info,
__u32 *secdesclen, struct smb_fattr *fattr);
int init_acl_state(struct posix_acl_state *state, int cnt);
void free_acl_state(struct posix_acl_state *state);
diff --git a/fs/ksmbd/transport_ipc.c b/fs/ksmbd/transport_ipc.c
index 3ad6881e0f7e..7cb0eeb07c80 100644
--- a/fs/ksmbd/transport_ipc.c
+++ b/fs/ksmbd/transport_ipc.c
@@ -26,6 +26,7 @@
#include "mgmt/ksmbd_ida.h"
#include "connection.h"
#include "transport_tcp.h"
+#include "transport_rdma.h"
#define IPC_WAIT_TIMEOUT (2 * HZ)
@@ -303,6 +304,8 @@ static int ipc_server_config_on_startup(struct ksmbd_startup_request *req)
init_smb2_max_trans_size(req->smb2_max_trans);
if (req->smb2_max_credits)
init_smb2_max_credits(req->smb2_max_credits);
+ if (req->smbd_max_io_size)
+ init_smbd_max_io_size(req->smbd_max_io_size);
ret = ksmbd_set_netbios_name(req->netbios_name);
ret |= ksmbd_set_server_string(req->server_string);
diff --git a/fs/ksmbd/transport_rdma.c b/fs/ksmbd/transport_rdma.c
index 3f5d13571694..c6af8d89b7f7 100644
--- a/fs/ksmbd/transport_rdma.c
+++ b/fs/ksmbd/transport_rdma.c
@@ -80,9 +80,7 @@ static int smb_direct_max_fragmented_recv_size = 1024 * 1024;
/* The maximum single-message size which can be received */
static int smb_direct_max_receive_size = 8192;
-static int smb_direct_max_read_write_size = 524224;
-
-static int smb_direct_max_outstanding_rw_ops = 8;
+static int smb_direct_max_read_write_size = SMBD_DEFAULT_IOSIZE;
static LIST_HEAD(smb_direct_device_list);
static DEFINE_RWLOCK(smb_direct_device_lock);
@@ -147,10 +145,12 @@ struct smb_direct_transport {
atomic_t send_credits;
spinlock_t lock_new_recv_credits;
int new_recv_credits;
- atomic_t rw_avail_ops;
+ int max_rw_credits;
+ int pages_per_rw_credit;
+ atomic_t rw_credits;
wait_queue_head_t wait_send_credits;
- wait_queue_head_t wait_rw_avail_ops;
+ wait_queue_head_t wait_rw_credits;
mempool_t *sendmsg_mempool;
struct kmem_cache *sendmsg_cache;
@@ -214,6 +214,17 @@ struct smb_direct_rdma_rw_msg {
struct scatterlist sg_list[];
};
+void init_smbd_max_io_size(unsigned int sz)
+{
+ sz = clamp_val(sz, SMBD_MIN_IOSIZE, SMBD_MAX_IOSIZE);
+ smb_direct_max_read_write_size = sz;
+}
+
+unsigned int get_smbd_max_read_write_size(void)
+{
+ return smb_direct_max_read_write_size;
+}
+
static inline int get_buf_page_count(void *buf, int size)
{
return DIV_ROUND_UP((uintptr_t)buf + size, PAGE_SIZE) -
@@ -377,7 +388,7 @@ static struct smb_direct_transport *alloc_transport(struct rdma_cm_id *cm_id)
t->reassembly_queue_length = 0;
init_waitqueue_head(&t->wait_reassembly_queue);
init_waitqueue_head(&t->wait_send_credits);
- init_waitqueue_head(&t->wait_rw_avail_ops);
+ init_waitqueue_head(&t->wait_rw_credits);
spin_lock_init(&t->receive_credit_lock);
spin_lock_init(&t->recvmsg_queue_lock);
@@ -984,18 +995,19 @@ static int smb_direct_flush_send_list(struct smb_direct_transport *t,
}
static int wait_for_credits(struct smb_direct_transport *t,
- wait_queue_head_t *waitq, atomic_t *credits)
+ wait_queue_head_t *waitq, atomic_t *total_credits,
+ int needed)
{
int ret;
do {
- if (atomic_dec_return(credits) >= 0)
+ if (atomic_sub_return(needed, total_credits) >= 0)
return 0;
- atomic_inc(credits);
+ atomic_add(needed, total_credits);
ret = wait_event_interruptible(*waitq,
- atomic_read(credits) > 0 ||
- t->status != SMB_DIRECT_CS_CONNECTED);
+ atomic_read(total_credits) >= needed ||
+ t->status != SMB_DIRECT_CS_CONNECTED);
if (t->status != SMB_DIRECT_CS_CONNECTED)
return -ENOTCONN;
@@ -1016,7 +1028,19 @@ static int wait_for_send_credits(struct smb_direct_transport *t,
return ret;
}
- return wait_for_credits(t, &t->wait_send_credits, &t->send_credits);
+ return wait_for_credits(t, &t->wait_send_credits, &t->send_credits, 1);
+}
+
+static int wait_for_rw_credits(struct smb_direct_transport *t, int credits)
+{
+ return wait_for_credits(t, &t->wait_rw_credits, &t->rw_credits, credits);
+}
+
+static int calc_rw_credits(struct smb_direct_transport *t,
+ char *buf, unsigned int len)
+{
+ return DIV_ROUND_UP(get_buf_page_count(buf, len),
+ t->pages_per_rw_credit);
}
static int smb_direct_create_header(struct smb_direct_transport *t,
@@ -1332,8 +1356,8 @@ static void read_write_done(struct ib_cq *cq, struct ib_wc *wc,
smb_direct_disconnect_rdma_connection(t);
}
- if (atomic_inc_return(&t->rw_avail_ops) > 0)
- wake_up(&t->wait_rw_avail_ops);
+ if (atomic_inc_return(&t->rw_credits) > 0)
+ wake_up(&t->wait_rw_credits);
rdma_rw_ctx_destroy(&msg->rw_ctx, t->qp, t->qp->port,
msg->sg_list, msg->sgt.nents, dir);
@@ -1352,16 +1376,22 @@ static void write_done(struct ib_cq *cq, struct ib_wc *wc)
read_write_done(cq, wc, DMA_TO_DEVICE);
}
-static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
- int buf_len, u32 remote_key, u64 remote_offset,
- u32 remote_len, bool is_read)
+static int smb_direct_rdma_xmit(struct smb_direct_transport *t,
+ void *buf, int buf_len,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len,
+ bool is_read)
{
struct smb_direct_rdma_rw_msg *msg;
int ret;
DECLARE_COMPLETION_ONSTACK(completion);
struct ib_send_wr *first_wr = NULL;
+ u32 remote_key = le32_to_cpu(desc[0].token);
+ u64 remote_offset = le64_to_cpu(desc[0].offset);
+ int credits_needed;
- ret = wait_for_credits(t, &t->wait_rw_avail_ops, &t->rw_avail_ops);
+ credits_needed = calc_rw_credits(t, buf, buf_len);
+ ret = wait_for_rw_credits(t, credits_needed);
if (ret < 0)
return ret;
@@ -1369,7 +1399,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
msg = kmalloc(offsetof(struct smb_direct_rdma_rw_msg, sg_list) +
sizeof(struct scatterlist) * SG_CHUNK_SIZE, GFP_KERNEL);
if (!msg) {
- atomic_inc(&t->rw_avail_ops);
+ atomic_add(credits_needed, &t->rw_credits);
return -ENOMEM;
}
@@ -1378,7 +1408,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
get_buf_page_count(buf, buf_len),
msg->sg_list, SG_CHUNK_SIZE);
if (ret) {
- atomic_inc(&t->rw_avail_ops);
+ atomic_add(credits_needed, &t->rw_credits);
kfree(msg);
return -ENOMEM;
}
@@ -1414,7 +1444,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
return 0;
err:
- atomic_inc(&t->rw_avail_ops);
+ atomic_add(credits_needed, &t->rw_credits);
if (first_wr)
rdma_rw_ctx_destroy(&msg->rw_ctx, t->qp, t->qp->port,
msg->sg_list, msg->sgt.nents,
@@ -1424,22 +1454,22 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t, void *buf,
return ret;
}
-static int smb_direct_rdma_write(struct ksmbd_transport *t, void *buf,
- unsigned int buflen, u32 remote_key,
- u64 remote_offset, u32 remote_len)
+static int smb_direct_rdma_write(struct ksmbd_transport *t,
+ void *buf, unsigned int buflen,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len)
{
return smb_direct_rdma_xmit(smb_trans_direct_transfort(t), buf, buflen,
- remote_key, remote_offset,
- remote_len, false);
+ desc, desc_len, false);
}
-static int smb_direct_rdma_read(struct ksmbd_transport *t, void *buf,
- unsigned int buflen, u32 remote_key,
- u64 remote_offset, u32 remote_len)
+static int smb_direct_rdma_read(struct ksmbd_transport *t,
+ void *buf, unsigned int buflen,
+ struct smb2_buffer_desc_v1 *desc,
+ unsigned int desc_len)
{
return smb_direct_rdma_xmit(smb_trans_direct_transfort(t), buf, buflen,
- remote_key, remote_offset,
- remote_len, true);
+ desc, desc_len, true);
}
static void smb_direct_disconnect(struct ksmbd_transport *t)
@@ -1639,11 +1669,19 @@ static int smb_direct_prepare_negotiation(struct smb_direct_transport *t)
return ret;
}
+static unsigned int smb_direct_get_max_fr_pages(struct smb_direct_transport *t)
+{
+ return min_t(unsigned int,
+ t->cm_id->device->attrs.max_fast_reg_page_list_len,
+ 256);
+}
+
static int smb_direct_init_params(struct smb_direct_transport *t,
struct ib_qp_cap *cap)
{
struct ib_device *device = t->cm_id->device;
- int max_send_sges, max_pages, max_rw_wrs, max_send_wrs;
+ int max_send_sges, max_rw_wrs, max_send_wrs;
+ unsigned int max_sge_per_wr, wrs_per_credit;
/* need 2 more sge. because a SMB_DIRECT header will be mapped,
* and maybe a send buffer could be not page aligned.
@@ -1655,25 +1693,31 @@ static int smb_direct_init_params(struct smb_direct_transport *t,
return -EINVAL;
}
- /*
- * allow smb_direct_max_outstanding_rw_ops of in-flight RDMA
- * read/writes. HCA guarantees at least max_send_sge of sges for
- * a RDMA read/write work request, and if memory registration is used,
- * we need reg_mr, local_inv wrs for each read/write.
+ /* Calculate the number of work requests for RDMA R/W.
+ * The maximum number of pages which can be registered
+ * with one Memory region can be transferred with one
+ * R/W credit. And at least 4 work requests for each credit
+ * are needed for MR registration, RDMA R/W, local & remote
+ * MR invalidation.
*/
t->max_rdma_rw_size = smb_direct_max_read_write_size;
- max_pages = DIV_ROUND_UP(t->max_rdma_rw_size, PAGE_SIZE) + 1;
- max_rw_wrs = DIV_ROUND_UP(max_pages, SMB_DIRECT_MAX_SEND_SGES);
- max_rw_wrs += rdma_rw_mr_factor(device, t->cm_id->port_num,
- max_pages) * 2;
- max_rw_wrs *= smb_direct_max_outstanding_rw_ops;
+ t->pages_per_rw_credit = smb_direct_get_max_fr_pages(t);
+ t->max_rw_credits = DIV_ROUND_UP(t->max_rdma_rw_size,
+ (t->pages_per_rw_credit - 1) *
+ PAGE_SIZE);
+
+ max_sge_per_wr = min_t(unsigned int, device->attrs.max_send_sge,
+ device->attrs.max_sge_rd);
+ wrs_per_credit = max_t(unsigned int, 4,
+ DIV_ROUND_UP(t->pages_per_rw_credit,
+ max_sge_per_wr) + 1);
+ max_rw_wrs = t->max_rw_credits * wrs_per_credit;
max_send_wrs = smb_direct_send_credit_target + max_rw_wrs;
if (max_send_wrs > device->attrs.max_cqe ||
max_send_wrs > device->attrs.max_qp_wr) {
- pr_err("consider lowering send_credit_target = %d, or max_outstanding_rw_ops = %d\n",
- smb_direct_send_credit_target,
- smb_direct_max_outstanding_rw_ops);
+ pr_err("consider lowering send_credit_target = %d\n",
+ smb_direct_send_credit_target);
pr_err("Possible CQE overrun, device reporting max_cqe %d max_qp_wr %d\n",
device->attrs.max_cqe, device->attrs.max_qp_wr);
return -EINVAL;
@@ -1708,7 +1752,7 @@ static int smb_direct_init_params(struct smb_direct_transport *t,
t->send_credit_target = smb_direct_send_credit_target;
atomic_set(&t->send_credits, 0);
- atomic_set(&t->rw_avail_ops, smb_direct_max_outstanding_rw_ops);
+ atomic_set(&t->rw_credits, t->max_rw_credits);
t->max_send_size = smb_direct_max_send_size;
t->max_recv_size = smb_direct_max_receive_size;
@@ -1716,12 +1760,10 @@ static int smb_direct_init_params(struct smb_direct_transport *t,
cap->max_send_wr = max_send_wrs;
cap->max_recv_wr = t->recv_credit_max;
- cap->max_send_sge = SMB_DIRECT_MAX_SEND_SGES;
+ cap->max_send_sge = max_sge_per_wr;
cap->max_recv_sge = SMB_DIRECT_MAX_RECV_SGES;
cap->max_inline_data = 0;
- cap->max_rdma_ctxs =
- rdma_rw_mr_factor(device, t->cm_id->port_num, max_pages) *
- smb_direct_max_outstanding_rw_ops;
+ cap->max_rdma_ctxs = t->max_rw_credits;
return 0;
}
@@ -1814,7 +1856,8 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
}
t->send_cq = ib_alloc_cq(t->cm_id->device, t,
- t->send_credit_target, 0, IB_POLL_WORKQUEUE);
+ smb_direct_send_credit_target + cap->max_rdma_ctxs,
+ 0, IB_POLL_WORKQUEUE);
if (IS_ERR(t->send_cq)) {
pr_err("Can't create RDMA send CQ\n");
ret = PTR_ERR(t->send_cq);
@@ -1823,8 +1866,7 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
}
t->recv_cq = ib_alloc_cq(t->cm_id->device, t,
- cap->max_send_wr + cap->max_rdma_ctxs,
- 0, IB_POLL_WORKQUEUE);
+ t->recv_credit_max, 0, IB_POLL_WORKQUEUE);
if (IS_ERR(t->recv_cq)) {
pr_err("Can't create RDMA recv CQ\n");
ret = PTR_ERR(t->recv_cq);
@@ -1853,17 +1895,12 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
pages_per_rw = DIV_ROUND_UP(t->max_rdma_rw_size, PAGE_SIZE) + 1;
if (pages_per_rw > t->cm_id->device->attrs.max_sgl_rd) {
- int pages_per_mr, mr_count;
-
- pages_per_mr = min_t(int, pages_per_rw,
- t->cm_id->device->attrs.max_fast_reg_page_list_len);
- mr_count = DIV_ROUND_UP(pages_per_rw, pages_per_mr) *
- atomic_read(&t->rw_avail_ops);
- ret = ib_mr_pool_init(t->qp, &t->qp->rdma_mrs, mr_count,
- IB_MR_TYPE_MEM_REG, pages_per_mr, 0);
+ ret = ib_mr_pool_init(t->qp, &t->qp->rdma_mrs,
+ t->max_rw_credits, IB_MR_TYPE_MEM_REG,
+ t->pages_per_rw_credit, 0);
if (ret) {
pr_err("failed to init mr pool count %d pages %d\n",
- mr_count, pages_per_mr);
+ t->max_rw_credits, t->pages_per_rw_credit);
goto err;
}
}
diff --git a/fs/ksmbd/transport_rdma.h b/fs/ksmbd/transport_rdma.h
index 5567d93a6f96..77aee4e5c9dc 100644
--- a/fs/ksmbd/transport_rdma.h
+++ b/fs/ksmbd/transport_rdma.h
@@ -7,6 +7,10 @@
#ifndef __KSMBD_TRANSPORT_RDMA_H__
#define __KSMBD_TRANSPORT_RDMA_H__
+#define SMBD_DEFAULT_IOSIZE (8 * 1024 * 1024)
+#define SMBD_MIN_IOSIZE (512 * 1024)
+#define SMBD_MAX_IOSIZE (16 * 1024 * 1024)
+
/* SMB DIRECT negotiation request packet [MS-SMBD] 2.2.1 */
struct smb_direct_negotiate_req {
__le16 min_version;
@@ -52,10 +56,14 @@ struct smb_direct_data_transfer {
int ksmbd_rdma_init(void);
void ksmbd_rdma_destroy(void);
bool ksmbd_rdma_capable_netdev(struct net_device *netdev);
+void init_smbd_max_io_size(unsigned int sz);
+unsigned int get_smbd_max_read_write_size(void);
#else
static inline int ksmbd_rdma_init(void) { return 0; }
static inline int ksmbd_rdma_destroy(void) { return 0; }
static inline bool ksmbd_rdma_capable_netdev(struct net_device *netdev) { return false; }
+static inline void init_smbd_max_io_size(unsigned int sz) { }
+static inline unsigned int get_smbd_max_read_write_size(void) { return 0; }
#endif
#endif /* __KSMBD_TRANSPORT_RDMA_H__ */
diff --git a/fs/ksmbd/vfs.c b/fs/ksmbd/vfs.c
index 05efcdf7a4a7..201962f03772 100644
--- a/fs/ksmbd/vfs.c
+++ b/fs/ksmbd/vfs.c
@@ -1540,6 +1540,11 @@ int ksmbd_vfs_get_sd_xattr(struct ksmbd_conn *conn,
}
*pntsd = acl.sd_buf;
+ if (acl.sd_size < sizeof(struct smb_ntsd)) {
+ pr_err("sd size is invalid\n");
+ goto out_free;
+ }
+
(*pntsd)->osidoffset = cpu_to_le32(le32_to_cpu((*pntsd)->osidoffset) -
NDR_NTSD_OFFSETOF);
(*pntsd)->gsidoffset = cpu_to_le32(le32_to_cpu((*pntsd)->gsidoffset) -
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 176b468a61c7..e5adb524a445 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -32,6 +32,10 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
if (!nlmsvc_ops)
return nlm_lck_denied_nolocks;
+ if (lock->lock_start > OFFSET_MAX ||
+ (lock->lock_len && ((lock->lock_len - 1) > (OFFSET_MAX - lock->lock_start))))
+ return nlm4_fbig;
+
/* Obtain host handle */
if (!(host = nlmsvc_lookup_host(rqstp, lock->caller, lock->len))
|| (argp->monitor && nsm_monitor(host) < 0))
@@ -50,6 +54,10 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
/* Set up the missing parts of the file_lock structure */
lock->fl.fl_file = file->f_file[mode];
lock->fl.fl_pid = current->tgid;
+ lock->fl.fl_start = (loff_t)lock->lock_start;
+ lock->fl.fl_end = lock->lock_len ?
+ (loff_t)(lock->lock_start + lock->lock_len - 1) :
+ OFFSET_MAX;
lock->fl.fl_lmops = &nlmsvc_lock_operations;
nlmsvc_locks_init_private(&lock->fl, host, (pid_t)lock->svid);
if (!lock->fl.fl_owner) {
diff --git a/fs/lockd/xdr4.c b/fs/lockd/xdr4.c
index 856267c0864b..712fdfeb8ef0 100644
--- a/fs/lockd/xdr4.c
+++ b/fs/lockd/xdr4.c
@@ -20,13 +20,6 @@
#include "svcxdr.h"
-static inline loff_t
-s64_to_loff_t(__s64 offset)
-{
- return (loff_t)offset;
-}
-
-
static inline s64
loff_t_to_s64(loff_t offset)
{
@@ -70,8 +63,6 @@ static bool
svcxdr_decode_lock(struct xdr_stream *xdr, struct nlm_lock *lock)
{
struct file_lock *fl = &lock->fl;
- u64 len, start;
- s64 end;
if (!svcxdr_decode_string(xdr, &lock->caller, &lock->len))
return false;
@@ -81,20 +72,14 @@ svcxdr_decode_lock(struct xdr_stream *xdr, struct nlm_lock *lock)
return false;
if (xdr_stream_decode_u32(xdr, &lock->svid) < 0)
return false;
- if (xdr_stream_decode_u64(xdr, &start) < 0)
+ if (xdr_stream_decode_u64(xdr, &lock->lock_start) < 0)
return false;
- if (xdr_stream_decode_u64(xdr, &len) < 0)
+ if (xdr_stream_decode_u64(xdr, &lock->lock_len) < 0)
return false;
locks_init_lock(fl);
fl->fl_flags = FL_POSIX;
fl->fl_type = F_RDLCK;
- end = start + len - 1;
- fl->fl_start = s64_to_loff_t(start);
- if (len == 0 || end < 0)
- fl->fl_end = OFFSET_MAX;
- else
- fl->fl_end = s64_to_loff_t(end);
return true;
}
diff --git a/fs/mbcache.c b/fs/mbcache.c
index 97c54d3a2227..2010bc80a3f2 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -11,7 +11,7 @@
/*
* Mbcache is a simple key-value store. Keys need not be unique, however
* key-value pairs are expected to be unique (we use this fact in
- * mb_cache_entry_delete()).
+ * mb_cache_entry_delete_or_get()).
*
* Ext2 and ext4 use this cache for deduplication of extended attribute blocks.
* Ext4 also uses it for deduplication of xattr values stored in inodes.
@@ -125,6 +125,19 @@ void __mb_cache_entry_free(struct mb_cache_entry *entry)
}
EXPORT_SYMBOL(__mb_cache_entry_free);
+/*
+ * mb_cache_entry_wait_unused - wait to be the last user of the entry
+ *
+ * @entry - entry to work on
+ *
+ * Wait to be the last user of the entry.
+ */
+void mb_cache_entry_wait_unused(struct mb_cache_entry *entry)
+{
+ wait_var_event(&entry->e_refcnt, atomic_read(&entry->e_refcnt) <= 3);
+}
+EXPORT_SYMBOL(mb_cache_entry_wait_unused);
+
static struct mb_cache_entry *__entry_find(struct mb_cache *cache,
struct mb_cache_entry *entry,
u32 key)
@@ -217,7 +230,7 @@ struct mb_cache_entry *mb_cache_entry_get(struct mb_cache *cache, u32 key,
}
EXPORT_SYMBOL(mb_cache_entry_get);
-/* mb_cache_entry_delete - remove a cache entry
+/* mb_cache_entry_delete - try to remove a cache entry
* @cache - cache we work with
* @key - key
* @value - value
@@ -254,6 +267,55 @@ void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value)
}
EXPORT_SYMBOL(mb_cache_entry_delete);
+/* mb_cache_entry_delete_or_get - remove a cache entry if it has no users
+ * @cache - cache we work with
+ * @key - key
+ * @value - value
+ *
+ * Remove entry from cache @cache with key @key and value @value. The removal
+ * happens only if the entry is unused. The function returns NULL in case the
+ * entry was successfully removed or there's no entry in cache. Otherwise the
+ * function grabs reference of the entry that we failed to delete because it
+ * still has users and return it.
+ */
+struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache,
+ u32 key, u64 value)
+{
+ struct hlist_bl_node *node;
+ struct hlist_bl_head *head;
+ struct mb_cache_entry *entry;
+
+ head = mb_cache_entry_head(cache, key);
+ hlist_bl_lock(head);
+ hlist_bl_for_each_entry(entry, node, head, e_hash_list) {
+ if (entry->e_key == key && entry->e_value == value) {
+ if (atomic_read(&entry->e_refcnt) > 2) {
+ atomic_inc(&entry->e_refcnt);
+ hlist_bl_unlock(head);
+ return entry;
+ }
+ /* We keep hash list reference to keep entry alive */
+ hlist_bl_del_init(&entry->e_hash_list);
+ hlist_bl_unlock(head);
+ spin_lock(&cache->c_list_lock);
+ if (!list_empty(&entry->e_list)) {
+ list_del_init(&entry->e_list);
+ if (!WARN_ONCE(cache->c_entry_count == 0,
+ "mbcache: attempt to decrement c_entry_count past zero"))
+ cache->c_entry_count--;
+ atomic_dec(&entry->e_refcnt);
+ }
+ spin_unlock(&cache->c_list_lock);
+ mb_cache_entry_put(cache, entry);
+ return NULL;
+ }
+ }
+ hlist_bl_unlock(head);
+
+ return NULL;
+}
+EXPORT_SYMBOL(mb_cache_entry_delete_or_get);
+
/* mb_cache_entry_touch - cache entry got used
* @cache - cache the entry belongs to
* @entry - entry that got used
@@ -288,7 +350,7 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache,
while (nr_to_scan-- && !list_empty(&cache->c_list)) {
entry = list_first_entry(&cache->c_list,
struct mb_cache_entry, e_list);
- if (entry->e_referenced) {
+ if (entry->e_referenced || atomic_read(&entry->e_refcnt) > 2) {
entry->e_referenced = 0;
list_move_tail(&entry->e_list, &cache->c_list);
continue;
@@ -302,6 +364,14 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache,
spin_unlock(&cache->c_list_lock);
head = mb_cache_entry_head(cache, entry->e_key);
hlist_bl_lock(head);
+ /* Now a reliable check if the entry didn't get used... */
+ if (atomic_read(&entry->e_refcnt) > 2) {
+ hlist_bl_unlock(head);
+ spin_lock(&cache->c_list_lock);
+ list_add_tail(&entry->e_list, &cache->c_list);
+ cache->c_entry_count++;
+ continue;
+ }
if (!hlist_bl_unhashed(&entry->e_hash_list)) {
hlist_bl_del_init(&entry->e_hash_list);
atomic_dec(&entry->e_refcnt);
diff --git a/fs/namei.c b/fs/namei.c
index fd3c95ac261b..2fa412c5a082 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1511,6 +1511,8 @@ static bool __follow_mount_rcu(struct nameidata *nd, struct path *path,
* becoming unpinned.
*/
flags = dentry->d_flags;
+ if (read_seqretry(&mount_lock, nd->m_seq))
+ return false;
continue;
}
if (read_seqretry(&mount_lock, nd->m_seq))
@@ -3571,6 +3573,8 @@ struct dentry *vfs_tmpfile(struct user_namespace *mnt_userns,
child = d_alloc(dentry, &slash_name);
if (unlikely(!child))
goto out_err;
+ if (!IS_POSIXACL(dir))
+ mode &= ~current_umask();
error = dir->i_op->tmpfile(mnt_userns, dir, child, mode);
if (error)
goto out_err;
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index 604be402ae13..7d285561e59f 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -1131,6 +1131,8 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task,
case -EIO:
case -ETIMEDOUT:
case -EPIPE:
+ case -EPROTO:
+ case -ENODEV:
dprintk("%s DS connection error %d\n", __func__,
task->tk_status);
nfs4_delete_deviceid(devid->ld, devid->nfs_client,
@@ -1236,6 +1238,8 @@ static void ff_layout_io_track_ds_error(struct pnfs_layout_segment *lseg,
case -ENOBUFS:
case -EPIPE:
case -EPERM:
+ case -EPROTO:
+ case -ENODEV:
*op_status = status = NFS4ERR_NXIO;
break;
case -EACCES:
diff --git a/fs/nfs/nfs3client.c b/fs/nfs/nfs3client.c
index 5601e47360c2..b49359afac88 100644
--- a/fs/nfs/nfs3client.c
+++ b/fs/nfs/nfs3client.c
@@ -108,7 +108,6 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv,
if (mds_srv->flags & NFS_MOUNT_NORESVPORT)
__set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags);
- __set_bit(NFS_CS_NOPING, &cl_init.init_flags);
__set_bit(NFS_CS_DS, &cl_init.init_flags);
/* Use the MDS nfs_client cl_ipaddr. */
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 0326bdec5de7..a46260965cf1 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -183,12 +183,6 @@ nfsd_file_alloc(struct inode *inode, unsigned int may, unsigned int hashval,
nf->nf_hashval = hashval;
refcount_set(&nf->nf_ref, 1);
nf->nf_may = may & NFSD_FILE_MAY_MASK;
- if (may & NFSD_MAY_NOT_BREAK_LEASE) {
- if (may & NFSD_MAY_WRITE)
- __set_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags);
- if (may & NFSD_MAY_READ)
- __set_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
- }
nf->nf_mark = NULL;
trace_nfsd_file_alloc(nf);
}
@@ -954,21 +948,7 @@ nfsd_file_acquire(struct svc_rqst *rqstp, struct svc_fh *fhp,
this_cpu_inc(nfsd_file_cache_hits);
- if (!(may_flags & NFSD_MAY_NOT_BREAK_LEASE)) {
- bool write = (may_flags & NFSD_MAY_WRITE);
-
- if (test_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags) ||
- (test_bit(NFSD_FILE_BREAK_WRITE, &nf->nf_flags) && write)) {
- status = nfserrno(nfsd_open_break_lease(
- file_inode(nf->nf_file), may_flags));
- if (status == nfs_ok) {
- clear_bit(NFSD_FILE_BREAK_READ, &nf->nf_flags);
- if (write)
- clear_bit(NFSD_FILE_BREAK_WRITE,
- &nf->nf_flags);
- }
- }
- }
+ status = nfserrno(nfsd_open_break_lease(file_inode(nf->nf_file), may_flags));
out:
if (status == nfs_ok) {
*pnf = nf;
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 435ceab27897..63104be2865c 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -37,9 +37,7 @@ struct nfsd_file {
struct net *nf_net;
#define NFSD_FILE_HASHED (0)
#define NFSD_FILE_PENDING (1)
-#define NFSD_FILE_BREAK_READ (2)
-#define NFSD_FILE_BREAK_WRITE (3)
-#define NFSD_FILE_REFERENCED (4)
+#define NFSD_FILE_REFERENCED (2)
unsigned long nf_flags;
struct inode *nf_inode;
unsigned int nf_hashval;
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 242fa123e0e9..61ee50843f5c 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -692,18 +692,10 @@ DEFINE_CLID_EVENT(confirmed_r);
/*
* from fs/nfsd/filecache.h
*/
-TRACE_DEFINE_ENUM(NFSD_FILE_HASHED);
-TRACE_DEFINE_ENUM(NFSD_FILE_PENDING);
-TRACE_DEFINE_ENUM(NFSD_FILE_BREAK_READ);
-TRACE_DEFINE_ENUM(NFSD_FILE_BREAK_WRITE);
-TRACE_DEFINE_ENUM(NFSD_FILE_REFERENCED);
-
#define show_nf_flags(val) \
__print_flags(val, "|", \
{ 1 << NFSD_FILE_HASHED, "HASHED" }, \
{ 1 << NFSD_FILE_PENDING, "PENDING" }, \
- { 1 << NFSD_FILE_BREAK_READ, "BREAK_READ" }, \
- { 1 << NFSD_FILE_BREAK_WRITE, "BREAK_WRITE" }, \
{ 1 << NFSD_FILE_REFERENCED, "REFERENCED"})
DECLARE_EVENT_CLASS(nfsd_file_class,
diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
index ebde05c9cf62..dbb944b5f81e 100644
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@@ -259,7 +259,7 @@ static int ovl_encode_fh(struct inode *inode, u32 *fid, int *max_len,
return FILEID_INVALID;
dentry = d_find_any_alias(inode);
- if (WARN_ON(!dentry))
+ if (!dentry)
return FILEID_INVALID;
bytes = ovl_dentry_to_fid(ofs, dentry, fid, buflen);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index c1031843cc6a..2f0595e8ec4a 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1885,7 +1885,7 @@ void proc_pid_evict_inode(struct proc_inode *ei)
put_pid(pid);
}
-struct inode *proc_pid_make_inode(struct super_block * sb,
+struct inode *proc_pid_make_inode(struct super_block *sb,
struct task_struct *task, umode_t mode)
{
struct inode * inode;
@@ -1914,11 +1914,6 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
/* Let the pid remember us for quick removal */
ei->pid = pid;
- if (S_ISDIR(mode)) {
- spin_lock(&pid->lock);
- hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
- spin_unlock(&pid->lock);
- }
task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid);
security_task_to_inode(task, inode);
@@ -1931,6 +1926,39 @@ struct inode *proc_pid_make_inode(struct super_block * sb,
return NULL;
}
+/*
+ * Generating an inode and adding it into @pid->inodes, so that task will
+ * invalidate inode's dentry before being released.
+ *
+ * This helper is used for creating dir-type entries under '/proc' and
+ * '/proc/<tgid>/task'. Other entries(eg. fd, stat) under '/proc/<tgid>'
+ * can be released by invalidating '/proc/<tgid>' dentry.
+ * In theory, dentries under '/proc/<tgid>/task' can also be released by
+ * invalidating '/proc/<tgid>' dentry, we reserve it to handle single
+ * thread exiting situation: Any one of threads should invalidate its
+ * '/proc/<tgid>/task/<pid>' dentry before released.
+ */
+static struct inode *proc_pid_make_base_inode(struct super_block *sb,
+ struct task_struct *task, umode_t mode)
+{
+ struct inode *inode;
+ struct proc_inode *ei;
+ struct pid *pid;
+
+ inode = proc_pid_make_inode(sb, task, mode);
+ if (!inode)
+ return NULL;
+
+ /* Let proc_flush_pid find this directory inode */
+ ei = PROC_I(inode);
+ pid = ei->pid;
+ spin_lock(&pid->lock);
+ hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes);
+ spin_unlock(&pid->lock);
+
+ return inode;
+}
+
int pid_getattr(struct user_namespace *mnt_userns, const struct path *path,
struct kstat *stat, u32 request_mask, unsigned int query_flags)
{
@@ -3350,7 +3378,8 @@ static struct dentry *proc_pid_instantiate(struct dentry * dentry,
{
struct inode *inode;
- inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
+ inode = proc_pid_make_base_inode(dentry->d_sb, task,
+ S_IFDIR | S_IRUGO | S_IXUGO);
if (!inode)
return ERR_PTR(-ENOENT);
@@ -3649,7 +3678,8 @@ static struct dentry *proc_task_instantiate(struct dentry *dentry,
struct task_struct *task, const void *ptr)
{
struct inode *inode;
- inode = proc_pid_make_inode(dentry->d_sb, task, S_IFDIR | S_IRUGO | S_IXUGO);
+ inode = proc_pid_make_base_inode(dentry->d_sb, task,
+ S_IFDIR | S_IRUGO | S_IXUGO);
if (!inode)
return ERR_PTR(-ENOENT);
diff --git a/fs/splice.c b/fs/splice.c
index 047b79db8eb5..93a2c9bf6249 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -814,17 +814,15 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
{
struct pipe_inode_info *pipe;
long ret, bytes;
- umode_t i_mode;
size_t len;
int i, flags, more;
/*
- * We require the input being a regular file, as we don't want to
- * randomly drop data for eg socket -> socket splicing. Use the
- * piped splicing for that!
+ * We require the input to be seekable, as we don't want to randomly
+ * drop data for eg socket -> socket splicing. Use the piped splicing
+ * for that!
*/
- i_mode = file_inode(in)->i_mode;
- if (unlikely(!S_ISREG(i_mode) && !S_ISBLK(i_mode)))
+ if (unlikely(!(in->f_mode & FMODE_LSEEK)))
return -EINVAL;
/*
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 15a4c7c07a3b..b68798a572fc 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -723,13 +723,12 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
struct inode *inode = file_inode(iocb->ki_filp);
struct zonefs_inode_info *zi = ZONEFS_I(inode);
struct block_device *bdev = inode->i_sb->s_bdev;
- unsigned int max;
+ unsigned int max = bdev_max_zone_append_sectors(bdev);
struct bio *bio;
ssize_t size;
int nr_pages;
ssize_t ret;
- max = queue_max_zone_append_sectors(bdev_get_queue(bdev));
max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
iov_iter_truncate(from, max);
diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
index 181907349b49..a76f8c6b732d 100644
--- a/include/acpi/cppc_acpi.h
+++ b/include/acpi/cppc_acpi.h
@@ -17,7 +17,7 @@
#include <acpi/pcc.h>
#include <acpi/processor.h>
-/* Support CPPCv2 and CPPCv3 */
+/* CPPCv2 and CPPCv3 support */
#define CPPC_V2_REV 2
#define CPPC_V3_REV 3
#define CPPC_V2_NUM_ENT 21
diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h
index 52363eee2b20..506d56530ca9 100644
--- a/include/crypto/internal/blake2s.h
+++ b/include/crypto/internal/blake2s.h
@@ -8,7 +8,6 @@
#define _CRYPTO_INTERNAL_BLAKE2S_H
#include <crypto/blake2s.h>
-#include <crypto/internal/hash.h>
#include <linux/string.h>
void blake2s_compress_generic(struct blake2s_state *state, const u8 *block,
@@ -19,111 +18,4 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
bool blake2s_selftest(void);
-static inline void blake2s_set_lastblock(struct blake2s_state *state)
-{
- state->f[0] = -1;
-}
-
-/* Helper functions for BLAKE2s shared by the library and shash APIs */
-
-static __always_inline void
-__blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen,
- bool force_generic)
-{
- const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
-
- if (unlikely(!inlen))
- return;
- if (inlen > fill) {
- memcpy(state->buf + state->buflen, in, fill);
- if (force_generic)
- blake2s_compress_generic(state, state->buf, 1,
- BLAKE2S_BLOCK_SIZE);
- else
- blake2s_compress(state, state->buf, 1,
- BLAKE2S_BLOCK_SIZE);
- state->buflen = 0;
- in += fill;
- inlen -= fill;
- }
- if (inlen > BLAKE2S_BLOCK_SIZE) {
- const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
- /* Hash one less (full) block than strictly possible */
- if (force_generic)
- blake2s_compress_generic(state, in, nblocks - 1,
- BLAKE2S_BLOCK_SIZE);
- else
- blake2s_compress(state, in, nblocks - 1,
- BLAKE2S_BLOCK_SIZE);
- in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
- inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
- }
- memcpy(state->buf + state->buflen, in, inlen);
- state->buflen += inlen;
-}
-
-static __always_inline void
-__blake2s_final(struct blake2s_state *state, u8 *out, bool force_generic)
-{
- blake2s_set_lastblock(state);
- memset(state->buf + state->buflen, 0,
- BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
- if (force_generic)
- blake2s_compress_generic(state, state->buf, 1, state->buflen);
- else
- blake2s_compress(state, state->buf, 1, state->buflen);
- cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
- memcpy(out, state->h, state->outlen);
-}
-
-/* Helper functions for shash implementations of BLAKE2s */
-
-struct blake2s_tfm_ctx {
- u8 key[BLAKE2S_KEY_SIZE];
- unsigned int keylen;
-};
-
-static inline int crypto_blake2s_setkey(struct crypto_shash *tfm,
- const u8 *key, unsigned int keylen)
-{
- struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm);
-
- if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE)
- return -EINVAL;
-
- memcpy(tctx->key, key, keylen);
- tctx->keylen = keylen;
-
- return 0;
-}
-
-static inline int crypto_blake2s_init(struct shash_desc *desc)
-{
- const struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
- struct blake2s_state *state = shash_desc_ctx(desc);
- unsigned int outlen = crypto_shash_digestsize(desc->tfm);
-
- __blake2s_init(state, outlen, tctx->key, tctx->keylen);
- return 0;
-}
-
-static inline int crypto_blake2s_update(struct shash_desc *desc,
- const u8 *in, unsigned int inlen,
- bool force_generic)
-{
- struct blake2s_state *state = shash_desc_ctx(desc);
-
- __blake2s_update(state, in, inlen, force_generic);
- return 0;
-}
-
-static inline int crypto_blake2s_final(struct shash_desc *desc, u8 *out,
- bool force_generic)
-{
- struct blake2s_state *state = shash_desc_ctx(desc);
-
- __blake2s_final(state, out, force_generic);
- return 0;
-}
-
#endif /* _CRYPTO_INTERNAL_BLAKE2S_H */
diff --git a/include/dt-bindings/clock/qcom,gcc-msm8939.h b/include/dt-bindings/clock/qcom,gcc-msm8939.h
index 0634467c4ce5..2d545ed0d35a 100644
--- a/include/dt-bindings/clock/qcom,gcc-msm8939.h
+++ b/include/dt-bindings/clock/qcom,gcc-msm8939.h
@@ -192,6 +192,7 @@
#define GCC_VENUS0_CORE0_VCODEC0_CLK 183
#define GCC_VENUS0_CORE1_VCODEC0_CLK 184
#define GCC_OXILI_TIMER_CLK 185
+#define SYSTEM_MM_NOC_BFDCD_CLK_SRC 186
/* Indexes for GDSCs */
#define BIMC_GDSC 0
diff --git a/include/linux/acpi_viot.h b/include/linux/acpi_viot.h
index 1eb8ee5b0e5f..a5a122431563 100644
--- a/include/linux/acpi_viot.h
+++ b/include/linux/acpi_viot.h
@@ -6,9 +6,11 @@
#include <linux/acpi.h>
#ifdef CONFIG_ACPI_VIOT
+void __init acpi_viot_early_init(void);
void __init acpi_viot_init(void);
int viot_iommu_configure(struct device *dev);
#else
+static inline void acpi_viot_early_init(void) {}
static inline void acpi_viot_init(void) {}
static inline int viot_iommu_configure(struct device *dev)
{
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 108e3d114bfc..7927480b9cf7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -466,7 +466,6 @@ struct request_queue {
#endif /* CONFIG_BLK_DEV_ZONED */
int node;
- struct mutex debugfs_mutex;
#ifdef CONFIG_BLK_DEV_IO_TRACE
struct blk_trace __rcu *blk_trace;
#endif
@@ -510,11 +509,12 @@ struct request_queue {
struct bio_set bio_split;
struct dentry *debugfs_dir;
-
-#ifdef CONFIG_BLK_DEBUG_FS
struct dentry *sched_debugfs_dir;
struct dentry *rqos_debugfs_dir;
-#endif
+ /*
+ * Serializes all debugfs metadata operations using the above dentries.
+ */
+ struct mutex debugfs_mutex;
bool mq_sysfs_init_done;
@@ -1190,6 +1190,17 @@ static inline unsigned int queue_max_zone_append_sectors(const struct request_qu
return min(l->max_zone_append_sectors, l->max_sectors);
}
+static inline unsigned int
+bdev_max_zone_append_sectors(struct block_device *bdev)
+{
+ return queue_max_zone_append_sectors(bdev_get_queue(bdev));
+}
+
+static inline unsigned int bdev_max_segments(struct block_device *bdev)
+{
+ return queue_max_segments(bdev_get_queue(bdev));
+}
+
static inline unsigned queue_logical_block_size(const struct request_queue *q)
{
int retval = 512;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 83bd5598ec4d..a5f5172eb52c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -23,6 +23,7 @@
#include <linux/slab.h>
#include <linux/percpu-refcount.h>
#include <linux/bpfptr.h>
+#include <linux/btf.h>
struct bpf_verifier_env;
struct bpf_verifier_log;
@@ -155,6 +156,41 @@ struct bpf_map_ops {
const struct bpf_iter_seq_info *iter_seq_info;
};
+enum {
+ /* Support at most 8 pointers in a BPF map value */
+ BPF_MAP_VALUE_OFF_MAX = 8,
+ BPF_MAP_OFF_ARR_MAX = BPF_MAP_VALUE_OFF_MAX +
+ 1 + /* for bpf_spin_lock */
+ 1, /* for bpf_timer */
+};
+
+enum bpf_kptr_type {
+ BPF_KPTR_UNREF,
+ BPF_KPTR_REF,
+};
+
+struct bpf_map_value_off_desc {
+ u32 offset;
+ enum bpf_kptr_type type;
+ struct {
+ struct btf *btf;
+ struct module *module;
+ btf_dtor_kfunc_t dtor;
+ u32 btf_id;
+ } kptr;
+};
+
+struct bpf_map_value_off {
+ u32 nr_off;
+ struct bpf_map_value_off_desc off[];
+};
+
+struct bpf_map_off_arr {
+ u32 cnt;
+ u32 field_off[BPF_MAP_OFF_ARR_MAX];
+ u8 field_sz[BPF_MAP_OFF_ARR_MAX];
+};
+
struct bpf_map {
/* The first two cachelines with read-mostly members of which some
* are also accessed in fast-path (e.g. ops, max_entries).
@@ -171,6 +207,7 @@ struct bpf_map {
u64 map_extra; /* any per-map-type extra fields */
u32 map_flags;
int spin_lock_off; /* >=0 valid offset, <0 error */
+ struct bpf_map_value_off *kptr_off_tab;
int timer_off; /* >=0 valid offset, <0 error */
u32 id;
int numa_node;
@@ -182,10 +219,7 @@ struct bpf_map {
struct mem_cgroup *memcg;
#endif
char name[BPF_OBJ_NAME_LEN];
- bool bypass_spec_v1;
- bool frozen; /* write-once; write-protected by freeze_mutex */
- /* 14 bytes hole */
-
+ struct bpf_map_off_arr *off_arr;
/* The 3rd and 4th cacheline with misc members to avoid false sharing
* particularly with refcounting.
*/
@@ -205,6 +239,8 @@ struct bpf_map {
bool jited;
bool xdp_has_frags;
} owner;
+ bool bypass_spec_v1;
+ bool frozen; /* write-once; write-protected by freeze_mutex */
};
static inline bool map_value_has_spin_lock(const struct bpf_map *map)
@@ -217,43 +253,44 @@ static inline bool map_value_has_timer(const struct bpf_map *map)
return map->timer_off >= 0;
}
+static inline bool map_value_has_kptrs(const struct bpf_map *map)
+{
+ return !IS_ERR_OR_NULL(map->kptr_off_tab);
+}
+
static inline void check_and_init_map_value(struct bpf_map *map, void *dst)
{
if (unlikely(map_value_has_spin_lock(map)))
memset(dst + map->spin_lock_off, 0, sizeof(struct bpf_spin_lock));
if (unlikely(map_value_has_timer(map)))
memset(dst + map->timer_off, 0, sizeof(struct bpf_timer));
+ if (unlikely(map_value_has_kptrs(map))) {
+ struct bpf_map_value_off *tab = map->kptr_off_tab;
+ int i;
+
+ for (i = 0; i < tab->nr_off; i++)
+ *(u64 *)(dst + tab->off[i].offset) = 0;
+ }
}
/* copy everything but bpf_spin_lock and bpf_timer. There could be one of each. */
static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
{
- u32 s_off = 0, s_sz = 0, t_off = 0, t_sz = 0;
+ u32 curr_off = 0;
+ int i;
- if (unlikely(map_value_has_spin_lock(map))) {
- s_off = map->spin_lock_off;
- s_sz = sizeof(struct bpf_spin_lock);
- }
- if (unlikely(map_value_has_timer(map))) {
- t_off = map->timer_off;
- t_sz = sizeof(struct bpf_timer);
+ if (likely(!map->off_arr)) {
+ memcpy(dst, src, map->value_size);
+ return;
}
- if (unlikely(s_sz || t_sz)) {
- if (s_off < t_off || !s_sz) {
- swap(s_off, t_off);
- swap(s_sz, t_sz);
- }
- memcpy(dst, src, t_off);
- memcpy(dst + t_off + t_sz,
- src + t_off + t_sz,
- s_off - t_off - t_sz);
- memcpy(dst + s_off + s_sz,
- src + s_off + s_sz,
- map->value_size - s_off - s_sz);
- } else {
- memcpy(dst, src, map->value_size);
+ for (i = 0; i < map->off_arr->cnt; i++) {
+ u32 next_off = map->off_arr->field_off[i];
+
+ memcpy(dst + curr_off, src + curr_off, next_off - curr_off);
+ curr_off += map->off_arr->field_sz[i];
}
+ memcpy(dst + curr_off, src + curr_off, map->value_size - curr_off);
}
void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
bool lock_src);
@@ -342,7 +379,10 @@ enum bpf_type_flag {
*/
MEM_PERCPU = BIT(4 + BPF_BASE_TYPE_BITS),
- __BPF_TYPE_LAST_FLAG = MEM_PERCPU,
+ /* Indicates that the argument will be released. */
+ OBJ_RELEASE = BIT(5 + BPF_BASE_TYPE_BITS),
+
+ __BPF_TYPE_LAST_FLAG = OBJ_RELEASE,
};
/* Max number of base types. */
@@ -391,6 +431,7 @@ enum bpf_arg_type {
ARG_PTR_TO_STACK, /* pointer to stack */
ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */
ARG_PTR_TO_TIMER, /* pointer to bpf_timer */
+ ARG_PTR_TO_KPTR, /* pointer to referenced kptr */
__BPF_ARG_TYPE_MAX,
/* Extended arg_types. */
@@ -400,6 +441,7 @@ enum bpf_arg_type {
ARG_PTR_TO_SOCKET_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_SOCKET,
ARG_PTR_TO_ALLOC_MEM_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_ALLOC_MEM,
ARG_PTR_TO_STACK_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_STACK,
+ ARG_PTR_TO_BTF_ID_OR_NULL = PTR_MAYBE_NULL | ARG_PTR_TO_BTF_ID,
/* This must be the last entry. Its purpose is to ensure the enum is
* wide enough to hold the higher bits reserved for bpf_type_flag.
@@ -674,11 +716,11 @@ struct btf_func_model {
/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
* bytes on x86.
*/
-#define BPF_MAX_TRAMP_PROGS 38
+#define BPF_MAX_TRAMP_LINKS 38
-struct bpf_tramp_progs {
- struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS];
- int nr_progs;
+struct bpf_tramp_links {
+ struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
+ int nr_links;
};
/* Different use cases for BPF trampoline:
@@ -704,7 +746,7 @@ struct bpf_tramp_progs {
struct bpf_tramp_image;
int arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_tramp_progs *tprogs,
+ struct bpf_tramp_links *tlinks,
void *orig_call);
/* these two functions are called from generated trampoline */
u64 notrace __bpf_prog_enter(struct bpf_prog *prog);
@@ -803,9 +845,10 @@ static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
{
return bpf_func(ctx, insnsi);
}
+
#ifdef CONFIG_BPF_JIT
-int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr);
-int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr);
+int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
+int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr);
struct bpf_trampoline *bpf_trampoline_get(u64 key,
struct bpf_attach_target_info *tgt_info);
void bpf_trampoline_put(struct bpf_trampoline *tr);
@@ -856,12 +899,12 @@ int bpf_jit_charge_modmem(u32 size);
void bpf_jit_uncharge_modmem(u32 size);
bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
#else
-static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
+static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr)
{
return -ENOTSUPP;
}
-static inline int bpf_trampoline_unlink_prog(struct bpf_prog *prog,
+static inline int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr)
{
return -ENOTSUPP;
@@ -959,8 +1002,6 @@ struct bpf_prog_aux {
bool sleepable;
bool tail_call_reachable;
bool xdp_has_frags;
- bool use_bpf_prog_pack;
- struct hlist_node tramp_hlist;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
/* function name for valid attach_btf_id */
@@ -1047,6 +1088,18 @@ struct bpf_link_ops {
struct bpf_link_info *info);
};
+struct bpf_tramp_link {
+ struct bpf_link link;
+ struct hlist_node tramp_hlist;
+};
+
+struct bpf_tracing_link {
+ struct bpf_tramp_link link;
+ enum bpf_attach_type attach_type;
+ struct bpf_trampoline *trampoline;
+ struct bpf_prog *tgt_prog;
+};
+
struct bpf_link_primer {
struct bpf_link *link;
struct file *file;
@@ -1084,8 +1137,8 @@ bool bpf_struct_ops_get(const void *kdata);
void bpf_struct_ops_put(const void *kdata);
int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, void *key,
void *value);
-int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_progs *tprogs,
- struct bpf_prog *prog,
+int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
+ struct bpf_tramp_link *link,
const struct btf_func_model *model,
void *image, void *image_end);
static inline bool bpf_try_module_get(const void *data, struct module *owner)
@@ -1396,6 +1449,12 @@ void bpf_prog_put(struct bpf_prog *prog);
void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock);
void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock);
+struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset);
+void bpf_map_free_kptr_off_tab(struct bpf_map *map);
+struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map);
+bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b);
+void bpf_map_free_kptrs(struct bpf_map *map, void *map_value);
+
struct bpf_map *bpf_map_get(u32 ufd);
struct bpf_map *bpf_map_get_with_uref(u32 ufd);
struct bpf_map *__bpf_map_get(struct fd f);
@@ -2169,6 +2228,7 @@ extern const struct bpf_func_proto bpf_find_vma_proto;
extern const struct bpf_func_proto bpf_loop_proto;
extern const struct bpf_func_proto bpf_strncmp_proto;
extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
+extern const struct bpf_func_proto bpf_kptr_xchg_proto;
const struct bpf_func_proto *tracing_prog_func_proto(
enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 3e24ad0c4b3c..2b9112b80171 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -141,3 +141,4 @@ BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
#endif
BPF_LINK_TYPE(BPF_LINK_TYPE_KPROBE_MULTI, kprobe_multi)
+BPF_LINK_TYPE(BPF_LINK_TYPE_STRUCT_OPS, struct_ops)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 3a9d2d7cc6b7..1f1e7f2ea967 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -523,8 +523,7 @@ int check_ptr_off_reg(struct bpf_verifier_env *env,
const struct bpf_reg_state *reg, int regno);
int check_func_arg_reg_off(struct bpf_verifier_env *env,
const struct bpf_reg_state *reg, int regno,
- enum bpf_arg_type arg_type,
- bool is_release_func);
+ enum bpf_arg_type arg_type);
int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
u32 regno);
int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 36bc09b8e890..f70625dd5bb4 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -40,6 +40,13 @@ struct btf_kfunc_id_set {
};
};
+struct btf_id_dtor_kfunc {
+ u32 btf_id;
+ u32 kfunc_btf_id;
+};
+
+typedef void (*btf_dtor_kfunc_t)(void *);
+
extern const struct file_operations btf_fops;
void btf_get(struct btf *btf);
@@ -123,6 +130,8 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
u32 expected_offset, u32 expected_size);
int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t);
int btf_find_timer(const struct btf *btf, const struct btf_type *t);
+struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf,
+ const struct btf_type *t);
bool btf_type_is_void(const struct btf_type *t);
s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
@@ -344,6 +353,9 @@ bool btf_kfunc_id_set_contains(const struct btf *btf,
enum btf_kfunc_type type, u32 kfunc_btf_id);
int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
const struct btf_kfunc_id_set *s);
+s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id);
+int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_cnt,
+ struct module *owner);
#else
static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
u32 type_id)
@@ -367,6 +379,15 @@ static inline int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
{
return 0;
}
+static inline s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id)
+{
+ return -ENOENT;
+}
+static inline int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors,
+ u32 add_cnt, struct module *owner)
+{
+ return 0;
+}
#endif
#endif
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index bcb4fe9b8575..8e4552460910 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -117,7 +117,6 @@ static __always_inline int test_clear_buffer_##name(struct buffer_head *bh) \
* of the form "mark_buffer_foo()". These are higher-level functions which
* do something in addition to setting a b_state bit.
*/
-BUFFER_FNS(Uptodate, uptodate)
BUFFER_FNS(Dirty, dirty)
TAS_BUFFER_FNS(Dirty, dirty)
BUFFER_FNS(Lock, locked)
@@ -135,6 +134,30 @@ BUFFER_FNS(Meta, meta)
BUFFER_FNS(Prio, prio)
BUFFER_FNS(Defer_Completion, defer_completion)
+static __always_inline void set_buffer_uptodate(struct buffer_head *bh)
+{
+ /*
+ * make it consistent with folio_mark_uptodate
+ * pairs with smp_load_acquire in buffer_uptodate
+ */
+ smp_mb__before_atomic();
+ set_bit(BH_Uptodate, &bh->b_state);
+}
+
+static __always_inline void clear_buffer_uptodate(struct buffer_head *bh)
+{
+ clear_bit(BH_Uptodate, &bh->b_state);
+}
+
+static __always_inline int buffer_uptodate(const struct buffer_head *bh)
+{
+ /*
+ * make it consistent with folio_test_uptodate
+ * pairs with smp_mb__before_atomic in set_buffer_uptodate
+ */
+ return (smp_load_acquire(&bh->b_state) & (1UL << BH_Uptodate)) != 0;
+}
+
#define bh_offset(bh) ((unsigned long)(bh)->b_data & ~PAGE_MASK)
/* If we *know* page->private refers to buffer_heads */
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index fe29ac7cc469..4592d0845941 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -1071,4 +1071,22 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
[0] = 1UL \
} }
+/*
+ * Provide a valid theoretical max size for cpumap and cpulist sysfs files
+ * to avoid breaking userspace which may allocate a buffer based on the size
+ * reported by e.g. fstat.
+ *
+ * for cpumap NR_CPUS * 9/32 - 1 should be an exact length.
+ *
+ * For cpulist 7 is (ceil(log10(NR_CPUS)) + 1) allowing for NR_CPUS to be up
+ * to 2 orders of magnitude larger than 8192. And then we divide by 2 to
+ * cover a worst-case of every other cpu being on one of two nodes for a
+ * very large NR_CPUS.
+ *
+ * Use PAGE_SIZE as a minimum for smaller configurations.
+ */
+#define CPUMAP_FILE_MAX_BYTES ((((NR_CPUS * 9)/32 - 1) > PAGE_SIZE) \
+ ? (NR_CPUS * 9)/32 - 1 : PAGE_SIZE)
+#define CPULIST_FILE_MAX_BYTES (((NR_CPUS * 7)/2 > PAGE_SIZE) ? (NR_CPUS * 7)/2 : PAGE_SIZE)
+
#endif /* __LINUX_CPUMASK_H */
diff --git a/include/linux/filter.h b/include/linux/filter.h
index ed0c0ff42ad5..8fd2e2f58eeb 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -948,6 +948,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
void bpf_jit_compile(struct bpf_prog *prog);
bool bpf_jit_needs_zext(void);
+bool bpf_jit_supports_subprog_tailcalls(void);
bool bpf_jit_supports_kfunc_call(void);
bool bpf_helper_changes_pkt_data(void *func);
@@ -1060,6 +1061,14 @@ u64 bpf_jit_alloc_exec_limit(void);
void *bpf_jit_alloc_exec(unsigned long size);
void bpf_jit_free_exec(void *addr);
void bpf_jit_free(struct bpf_prog *fp);
+struct bpf_binary_header *
+bpf_jit_binary_pack_hdr(const struct bpf_prog *fp);
+
+static inline bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
+{
+ return list_empty(&fp->aux->ksym.lnode) ||
+ fp->aux->ksym.lnode.prev == LIST_POISON2;
+}
struct bpf_binary_header *
bpf_jit_binary_pack_alloc(unsigned int proglen, u8 **ro_image,
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 39bb9b47fa9c..7037c6bed55c 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -218,6 +218,16 @@ static inline void clear_highpage(struct page *page)
kunmap_local(kaddr);
}
+static inline void clear_highpage_kasan_tagged(struct page *page)
+{
+ u8 tag;
+
+ tag = page_kasan_tag(page);
+ page_kasan_tag_reset(page);
+ clear_highpage(page);
+ page_kasan_tag_set(page, tag);
+}
+
#ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
static inline void tag_clear_highpage(struct page *page)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ac2a1d758a80..fbf4a6509fb3 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -167,7 +167,7 @@ bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
vm_flags_t vm_flags);
long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
long freed);
-bool isolate_huge_page(struct page *page, struct list_head *list);
+int isolate_hugetlb(struct page *page, struct list_head *list);
int get_hwpoison_huge_page(struct page *page, bool *hugetlb);
int get_huge_page_for_hwpoison(unsigned long pfn, int flags);
void putback_active_hugepage(struct page *page);
@@ -369,9 +369,9 @@ static inline pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
return NULL;
}
-static inline bool isolate_huge_page(struct page *page, struct list_head *list)
+static inline int isolate_hugetlb(struct page *page, struct list_head *list)
{
- return false;
+ return -EBUSY;
}
static inline int get_hwpoison_huge_page(struct page *page, bool *hugetlb)
diff --git a/include/linux/iio/common/cros_ec_sensors_core.h b/include/linux/iio/common/cros_ec_sensors_core.h
index c582e1a14232..7b5dbd749995 100644
--- a/include/linux/iio/common/cros_ec_sensors_core.h
+++ b/include/linux/iio/common/cros_ec_sensors_core.h
@@ -95,8 +95,11 @@ int cros_ec_sensors_read_cmd(struct iio_dev *indio_dev, unsigned long scan_mask,
struct platform_device;
int cros_ec_sensors_core_init(struct platform_device *pdev,
struct iio_dev *indio_dev, bool physical_device,
- cros_ec_sensors_capture_t trigger_capture,
- cros_ec_sensorhub_push_data_cb_t push_data);
+ cros_ec_sensors_capture_t trigger_capture);
+
+int cros_ec_sensors_core_register(struct device *dev,
+ struct iio_dev *indio_dev,
+ cros_ec_sensorhub_push_data_cb_t push_data);
irqreturn_t cros_ec_sensors_capture(int irq, void *p);
int cros_ec_sensors_push_data(struct iio_dev *indio_dev,
diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
index faf00f2c0be6..c4ce02293f1f 100644
--- a/include/linux/iio/iio.h
+++ b/include/linux/iio/iio.h
@@ -9,6 +9,7 @@
#include <linux/device.h>
#include <linux/cdev.h>
+#include <linux/slab.h>
#include <linux/iio/types.h>
#include <linux/of.h>
/* IIO TODO LIST */
@@ -657,8 +658,13 @@ static inline void *iio_device_get_drvdata(const struct iio_dev *indio_dev)
return dev_get_drvdata(&indio_dev->dev);
}
-/* Can we make this smaller? */
-#define IIO_ALIGN L1_CACHE_BYTES
+/*
+ * Used to ensure the iio_priv() structure is aligned to allow that structure
+ * to in turn include IIO_DMA_MINALIGN'd elements such as buffers which
+ * must not share cachelines with the rest of the structure, thus making
+ * them safe for use with non-coherent DMA.
+ */
+#define IIO_DMA_MINALIGN ARCH_KMALLOC_MINALIGN
struct iio_dev *iio_device_alloc(struct device *parent, int sizeof_priv);
/* The information at the returned address is guaranteed to be cacheline aligned */
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 8d573baaab29..f3e7680befcc 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -188,21 +188,48 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
void *buf, unsigned int size,
bool get_value);
void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
+void *kexec_image_load_default(struct kimage *image);
-/* Architectures may override the below functions */
-int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
- unsigned long buf_len);
-void *arch_kexec_kernel_image_load(struct kimage *image);
-int arch_kimage_file_post_load_cleanup(struct kimage *image);
-#ifdef CONFIG_KEXEC_SIG
-int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
- unsigned long buf_len);
+#ifndef arch_kexec_kernel_image_probe
+static inline int
+arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned long buf_len)
+{
+ return kexec_image_probe_default(image, buf, buf_len);
+}
+#endif
+
+#ifndef arch_kimage_file_post_load_cleanup
+static inline int arch_kimage_file_post_load_cleanup(struct kimage *image)
+{
+ return kexec_image_post_load_cleanup_default(image);
+}
+#endif
+
+#ifndef arch_kexec_kernel_image_load
+static inline void *arch_kexec_kernel_image_load(struct kimage *image)
+{
+ return kexec_image_load_default(image);
+}
#endif
-int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
extern int kexec_add_buffer(struct kexec_buf *kbuf);
int kexec_locate_mem_hole(struct kexec_buf *kbuf);
+#ifndef arch_kexec_locate_mem_hole
+/**
+ * arch_kexec_locate_mem_hole - Find free memory to place the segments.
+ * @kbuf: Parameters for the memory search.
+ *
+ * On success, kbuf->mem will have the start address of the memory region found.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+static inline int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
+{
+ return kexec_locate_mem_hole(kbuf);
+}
+#endif
+
/* Alignment required for elf header segment */
#define ELF_CORE_HEADER_ALIGN 4096
diff --git a/include/linux/kfifo.h b/include/linux/kfifo.h
index 86249476b57f..0b35a41440ff 100644
--- a/include/linux/kfifo.h
+++ b/include/linux/kfifo.h
@@ -688,7 +688,7 @@ __kfifo_uint_must_check_helper( \
* writer, you don't need extra locking to use these macro.
*/
#define kfifo_to_user(fifo, to, len, copied) \
-__kfifo_uint_must_check_helper( \
+__kfifo_int_must_check_helper( \
({ \
typeof((fifo) + 1) __tmp = (fifo); \
void __user *__to = (to); \
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index ac1ebb37a0ff..f328a01db4fe 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -19,6 +19,7 @@ struct kvm_memslots;
enum kvm_mr_change;
#include <linux/bits.h>
+#include <linux/mutex.h>
#include <linux/types.h>
#include <linux/spinlock_types.h>
@@ -69,6 +70,7 @@ struct gfn_to_pfn_cache {
struct kvm_vcpu *vcpu;
struct list_head list;
rwlock_t lock;
+ struct mutex refresh_lock;
void *khva;
kvm_pfn_t pfn;
enum pfn_cache_usage usage;
diff --git a/include/linux/lockd/xdr.h b/include/linux/lockd/xdr.h
index 398f70093cd3..67e4a2c5500b 100644
--- a/include/linux/lockd/xdr.h
+++ b/include/linux/lockd/xdr.h
@@ -41,6 +41,8 @@ struct nlm_lock {
struct nfs_fh fh;
struct xdr_netobj oh;
u32 svid;
+ u64 lock_start;
+ u64 lock_len;
struct file_lock fl;
};
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 467b94257105..0a9afe84ea44 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -192,7 +192,7 @@ static inline void
lockdep_init_map_waits(struct lockdep_map *lock, const char *name,
struct lock_class_key *key, int subclass, u8 inner, u8 outer)
{
- lockdep_init_map_type(lock, name, key, subclass, inner, LD_WAIT_INV, LD_LOCK_NORMAL);
+ lockdep_init_map_type(lock, name, key, subclass, inner, outer, LD_LOCK_NORMAL);
}
static inline void
@@ -215,24 +215,28 @@ static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
* or they are too narrow (they suffer from a false class-split):
*/
#define lockdep_set_class(lock, key) \
- lockdep_init_map_waits(&(lock)->dep_map, #key, key, 0, \
- (lock)->dep_map.wait_type_inner, \
- (lock)->dep_map.wait_type_outer)
+ lockdep_init_map_type(&(lock)->dep_map, #key, key, 0, \
+ (lock)->dep_map.wait_type_inner, \
+ (lock)->dep_map.wait_type_outer, \
+ (lock)->dep_map.lock_type)
#define lockdep_set_class_and_name(lock, key, name) \
- lockdep_init_map_waits(&(lock)->dep_map, name, key, 0, \
- (lock)->dep_map.wait_type_inner, \
- (lock)->dep_map.wait_type_outer)
+ lockdep_init_map_type(&(lock)->dep_map, name, key, 0, \
+ (lock)->dep_map.wait_type_inner, \
+ (lock)->dep_map.wait_type_outer, \
+ (lock)->dep_map.lock_type)
#define lockdep_set_class_and_subclass(lock, key, sub) \
- lockdep_init_map_waits(&(lock)->dep_map, #key, key, sub,\
- (lock)->dep_map.wait_type_inner, \
- (lock)->dep_map.wait_type_outer)
+ lockdep_init_map_type(&(lock)->dep_map, #key, key, sub, \
+ (lock)->dep_map.wait_type_inner, \
+ (lock)->dep_map.wait_type_outer, \
+ (lock)->dep_map.lock_type)
#define lockdep_set_subclass(lock, sub) \
- lockdep_init_map_waits(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
- (lock)->dep_map.wait_type_inner, \
- (lock)->dep_map.wait_type_outer)
+ lockdep_init_map_type(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
+ (lock)->dep_map.wait_type_inner, \
+ (lock)->dep_map.wait_type_outer, \
+ (lock)->dep_map.lock_type)
#define lockdep_set_novalidate_class(lock) \
lockdep_set_class_and_name(lock, &__lockdep_no_validate__, #lock)
diff --git a/include/linux/mbcache.h b/include/linux/mbcache.h
index 20f1e3ff6013..8eca7f25c432 100644
--- a/include/linux/mbcache.h
+++ b/include/linux/mbcache.h
@@ -30,15 +30,23 @@ void mb_cache_destroy(struct mb_cache *cache);
int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key,
u64 value, bool reusable);
void __mb_cache_entry_free(struct mb_cache_entry *entry);
+void mb_cache_entry_wait_unused(struct mb_cache_entry *entry);
static inline int mb_cache_entry_put(struct mb_cache *cache,
struct mb_cache_entry *entry)
{
- if (!atomic_dec_and_test(&entry->e_refcnt))
+ unsigned int cnt = atomic_dec_return(&entry->e_refcnt);
+
+ if (cnt > 0) {
+ if (cnt <= 3)
+ wake_up_var(&entry->e_refcnt);
return 0;
+ }
__mb_cache_entry_free(entry);
return 1;
}
+struct mb_cache_entry *mb_cache_entry_delete_or_get(struct mb_cache *cache,
+ u32 key, u64 value);
void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value);
struct mb_cache_entry *mb_cache_entry_get(struct mb_cache *cache, u32 key,
u64 value);
diff --git a/include/linux/mfd/t7l66xb.h b/include/linux/mfd/t7l66xb.h
index 69632c1b07bd..ae3e7a5c5219 100644
--- a/include/linux/mfd/t7l66xb.h
+++ b/include/linux/mfd/t7l66xb.h
@@ -12,7 +12,6 @@
struct t7l66xb_platform_data {
int (*enable)(struct platform_device *dev);
- int (*disable)(struct platform_device *dev);
int (*suspend)(struct platform_device *dev);
int (*resume)(struct platform_device *dev);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 9424503eb8d3..3d1594bad4ec 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -445,6 +445,11 @@ struct mlx5_qp_table {
struct radix_tree_root tree;
};
+enum {
+ MLX5_PF_NOTIFY_DISABLE_VF,
+ MLX5_PF_NOTIFY_ENABLE_VF,
+};
+
struct mlx5_vf_context {
int enabled;
u64 port_guid;
@@ -455,6 +460,7 @@ struct mlx5_vf_context {
u8 port_guid_valid:1;
u8 node_guid_valid:1;
enum port_state_policy policy;
+ struct blocking_notifier_head notifier;
};
struct mlx5_core_sriov {
@@ -1155,6 +1161,12 @@ int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type
struct mlx5_core_dev *mlx5_vf_get_core_dev(struct pci_dev *pdev);
void mlx5_vf_put_core_dev(struct mlx5_core_dev *mdev);
+int mlx5_sriov_blocking_notifier_register(struct mlx5_core_dev *mdev,
+ int vf_id,
+ struct notifier_block *nb);
+void mlx5_sriov_blocking_notifier_unregister(struct mlx5_core_dev *mdev,
+ int vf_id,
+ struct notifier_block *nb);
#ifdef CONFIG_MLX5_CORE_IPOIB
struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
struct ib_device *ibdev,
diff --git a/include/linux/nvme-fc-driver.h b/include/linux/nvme-fc-driver.h
index 5358a5facdee..fa092b9be2fd 100644
--- a/include/linux/nvme-fc-driver.h
+++ b/include/linux/nvme-fc-driver.h
@@ -564,6 +564,15 @@ int nvme_fc_rcv_ls_req(struct nvme_fc_remote_port *remoteport,
void *lsreqbuf, u32 lsreqbuf_len);
+/*
+ * Routine called to get the appid field associated with request by the lldd
+ *
+ * If the return value is NULL : the user/libvirt has not set the appid to VM
+ * If the return value is non-zero: Returns the appid associated with VM
+ *
+ * @req: IO request from nvme fc to driver
+ */
+char *nvme_fc_io_getuuid(struct nvmefc_fcp_req *req);
/*
* *************** LLDD FC-NVME Target/Subsystem API ***************
@@ -1048,5 +1057,10 @@ int nvmet_fc_rcv_fcp_req(struct nvmet_fc_target_port *tgtport,
void nvmet_fc_rcv_fcp_abort(struct nvmet_fc_target_port *tgtport,
struct nvmefc_tgt_fcp_req *fcpreq);
+/*
+ * add a define, visible to the compiler, that indicates support
+ * for feature. Allows for conditional compilation in LLDDs.
+ */
+#define NVME_FC_FEAT_UUID 0x0001
#endif /* _NVME_FC_DRIVER_H */
diff --git a/include/linux/once_lite.h b/include/linux/once_lite.h
index 861e606b820f..b7bce4983638 100644
--- a/include/linux/once_lite.h
+++ b/include/linux/once_lite.h
@@ -9,15 +9,27 @@
*/
#define DO_ONCE_LITE(func, ...) \
DO_ONCE_LITE_IF(true, func, ##__VA_ARGS__)
-#define DO_ONCE_LITE_IF(condition, func, ...) \
+
+#define __ONCE_LITE_IF(condition) \
({ \
static bool __section(".data.once") __already_done; \
- bool __ret_do_once = !!(condition); \
+ bool __ret_cond = !!(condition); \
+ bool __ret_once = false; \
\
- if (unlikely(__ret_do_once && !__already_done)) { \
+ if (unlikely(__ret_cond && !__already_done)) { \
__already_done = true; \
- func(__VA_ARGS__); \
+ __ret_once = true; \
} \
+ unlikely(__ret_once); \
+ })
+
+#define DO_ONCE_LITE_IF(condition, func, ...) \
+ ({ \
+ bool __ret_do_once = !!(condition); \
+ \
+ if (__ONCE_LITE_IF(__ret_do_once)) \
+ func(__VA_ARGS__); \
+ \
unlikely(__ret_do_once); \
})
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index af97dd427501..a411080d5169 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1063,6 +1063,22 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
data->txn = 0;
}
+/*
+ * Clear all bitfields in the perf_branch_entry.
+ * The to and from fields are not cleared because they are
+ * systematically modified by caller.
+ */
+static inline void perf_clear_branch_entry_bitfields(struct perf_branch_entry *br)
+{
+ br->mispred = 0;
+ br->predicted = 0;
+ br->in_tx = 0;
+ br->abort = 0;
+ br->cycles = 0;
+ br->type = 0;
+ br->reserved = 0;
+}
+
extern void perf_output_sample(struct perf_output_handle *handle,
struct perf_event_header *header,
struct perf_sample_data *data,
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index cb0fd633a610..4ea496924106 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -229,6 +229,15 @@ static inline bool pipe_buf_try_steal(struct pipe_inode_info *pipe,
return buf->ops->try_steal(pipe, buf);
}
+static inline void pipe_discard_from(struct pipe_inode_info *pipe,
+ unsigned int old_head)
+{
+ unsigned int mask = pipe->ring_size - 1;
+
+ while (pipe->head > old_head)
+ pipe_buf_release(pipe, &pipe->bufs[--pipe->head & mask]);
+}
+
/* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual
memory allocation, whereas PIPE_BUF makes atomicity guarantees. */
#define PIPE_SIZE PAGE_SIZE
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 17230c458341..a0c4a870bb48 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -220,8 +220,8 @@ struct page_vma_mapped_walk {
#define DEFINE_PAGE_VMA_WALK(name, _page, _vma, _address, _flags) \
struct page_vma_mapped_walk name = { \
.pfn = page_to_pfn(_page), \
- .nr_pages = compound_nr(page), \
- .pgoff = page_to_pgoff(page), \
+ .nr_pages = compound_nr(_page), \
+ .pgoff = page_to_pgoff(_page), \
.vma = _vma, \
.address = _address, \
.flags = _flags, \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a8911b1f35aa..d438e39fffe5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1812,7 +1812,7 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
}
extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
-extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed);
+extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_effective_cpus);
#ifdef CONFIG_SMP
extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
index e5af028c08b4..994c25640e15 100644
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -39,20 +39,12 @@ static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *p)
}
extern void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task);
extern void rt_mutex_adjust_pi(struct task_struct *p);
-static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
-{
- return tsk->pi_blocked_on != NULL;
-}
#else
static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *task)
{
return NULL;
}
# define rt_mutex_adjust_pi(p) do { } while (0)
-static inline bool tsk_is_pi_blocked(struct task_struct *tsk)
-{
- return false;
-}
#endif
extern void normalize_rt_tasks(void);
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 56cffe42abbc..816df6cc444e 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -81,6 +81,7 @@ struct sched_domain_shared {
atomic_t ref;
atomic_t nr_busy_cpus;
int has_idle_cores;
+ int nr_idle_scan;
};
struct sched_domain {
diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 76ce3f3ac0f2..bf6f0decb3f6 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -646,9 +646,6 @@ struct sdw_slave_ops {
* @dev_num: Current Device Number, values can be 0 or dev_num_sticky
* @dev_num_sticky: one-time static Device Number assigned by Bus
* @probed: boolean tracking driver state
- * @probe_complete: completion utility to control potential races
- * on startup between driver probe/initialization and SoundWire
- * Slave state changes/implementation-defined interrupts
* @enumeration_complete: completion utility to control potential races
* on startup between device enumeration and read/write access to the
* Slave device
@@ -663,6 +660,7 @@ struct sdw_slave_ops {
* for a Slave happens for the first time after enumeration
* @is_mockup_device: status flag used to squelch errors in the command/control
* protocol for SoundWire mockup devices
+ * @sdw_dev_lock: mutex used to protect callbacks/remove races
*/
struct sdw_slave {
struct sdw_slave_id id;
@@ -680,12 +678,12 @@ struct sdw_slave {
u16 dev_num;
u16 dev_num_sticky;
bool probed;
- struct completion probe_complete;
struct completion enumeration_complete;
struct completion initialization_complete;
u32 unattach_request;
bool first_interrupt_done;
bool is_mockup_device;
+ struct mutex sdw_dev_lock; /* protect callbacks/remove races */
};
#define dev_to_sdw_dev(_dev) container_of(_dev, struct sdw_slave, dev)
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index d356ab4047f7..f99f0bd85724 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -216,8 +216,10 @@ extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
spinlock_t *ptl);
extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
unsigned long address);
-extern void migration_entry_wait_huge(struct vm_area_struct *vma,
- struct mm_struct *mm, pte_t *pte);
+#ifdef CONFIG_HUGETLB_PAGE
+extern void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl);
+extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
+#endif
#else
static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
{
@@ -238,8 +240,10 @@ static inline void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
spinlock_t *ptl) { }
static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
unsigned long address) { }
-static inline void migration_entry_wait_huge(struct vm_area_struct *vma,
- struct mm_struct *mm, pte_t *pte) { }
+#ifdef CONFIG_HUGETLB_PAGE
+static inline void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl) { }
+static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
+#endif
static inline int is_writable_migration_entry(swp_entry_t entry)
{
return 0;
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h
index 739ba9a03ec1..20c0ff54b7a0 100644
--- a/include/linux/tpm_eventlog.h
+++ b/include/linux/tpm_eventlog.h
@@ -157,7 +157,7 @@ struct tcg_algorithm_info {
* Return: size of the event on success, 0 on failure
*/
-static inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
+static __always_inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
struct tcg_pcr_event *event_header,
bool do_mapping)
{
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index e6e95a9f07a5..b18759a673c6 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -916,6 +916,24 @@ perf_trace_buf_submit(void *raw_data, int size, int rctx, u16 type,
#endif
+#define TRACE_EVENT_STR_MAX 512
+
+/*
+ * gcc warns that you can not use a va_list in an inlined
+ * function. But lets me make it into a macro :-/
+ */
+#define __trace_event_vstr_len(fmt, va) \
+({ \
+ va_list __ap; \
+ int __ret; \
+ \
+ va_copy(__ap, *(va)); \
+ __ret = vsnprintf(NULL, 0, fmt, __ap) + 1; \
+ va_end(__ap); \
+ \
+ min(__ret, TRACE_EVENT_STR_MAX); \
+})
+
#endif /* _LINUX_TRACE_EVENT_H */
/*
diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
index 2c1fc9212cf2..98d1921f02b1 100644
--- a/include/linux/usb/hcd.h
+++ b/include/linux/usb/hcd.h
@@ -66,6 +66,7 @@
struct giveback_urb_bh {
bool running;
+ bool high_prio;
spinlock_t lock;
struct list_head head;
struct tasklet_struct bh;
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 851e07da2583..58cfbf81447c 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -544,10 +544,11 @@ do { \
\
hrtimer_init_sleeper_on_stack(&__t, CLOCK_MONOTONIC, \
HRTIMER_MODE_REL); \
- if ((timeout) != KTIME_MAX) \
- hrtimer_start_range_ns(&__t.timer, timeout, \
- current->timer_slack_ns, \
- HRTIMER_MODE_REL); \
+ if ((timeout) != KTIME_MAX) { \
+ hrtimer_set_expires_range_ns(&__t.timer, timeout, \
+ current->timer_slack_ns); \
+ hrtimer_sleeper_start_expires(&__t, HRTIMER_MODE_REL); \
+ } \
\
__ret = ___wait_event(wq_head, condition, state, 0, 0, \
if (!__t.task) { \
diff --git a/include/media/hevc-ctrls.h b/include/media/hevc-ctrls.h
index 01ccda48d8c5..88e804578cb1 100644
--- a/include/media/hevc-ctrls.h
+++ b/include/media/hevc-ctrls.h
@@ -135,7 +135,7 @@ struct v4l2_hevc_dpb_entry {
__u64 timestamp;
__u8 flags;
__u8 field_pic;
- __u16 pic_order_cnt[2];
+ __s32 pic_order_cnt_val;
__u8 padding[2];
};
@@ -178,7 +178,7 @@ struct v4l2_ctrl_hevc_slice_params {
/* ISO/IEC 23008-2, ITU-T Rec. H.265: General slice segment header */
__u8 slice_type;
__u8 colour_plane_id;
- __u16 slice_pic_order_cnt;
+ __s32 slice_pic_order_cnt;
__u8 num_ref_idx_l0_active_minus1;
__u8 num_ref_idx_l1_active_minus1;
__u8 collocated_ref_idx;
diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index ec1d1706f43c..cb78e0e33332 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -76,7 +76,7 @@ enum p9_req_status_t {
struct p9_req_t {
int status;
int t_err;
- struct kref refcount;
+ refcount_t refcount;
wait_queue_head_t wq;
struct p9_fcall tc;
struct p9_fcall rc;
@@ -227,15 +227,15 @@ struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag);
static inline void p9_req_get(struct p9_req_t *r)
{
- kref_get(&r->refcount);
+ refcount_inc(&r->refcount);
}
static inline int p9_req_try_get(struct p9_req_t *r)
{
- return kref_get_unless_zero(&r->refcount);
+ return refcount_inc_not_zero(&r->refcount);
}
-int p9_req_put(struct p9_req_t *r);
+int p9_req_put(struct p9_client *c, struct p9_req_t *r);
void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status);
diff --git a/include/net/ax25.h b/include/net/ax25.h
index a427a05672e2..f8cf3629a419 100644
--- a/include/net/ax25.h
+++ b/include/net/ax25.h
@@ -236,6 +236,7 @@ typedef struct ax25_cb {
ax25_address source_addr, dest_addr;
ax25_digi *digipeat;
ax25_dev *ax25_dev;
+ netdevice_tracker dev_tracker;
unsigned char iamdigi;
unsigned char state, modulus, pidincl;
unsigned short vs, vr, va;
diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 62a9bb022aed..5f2342205ae1 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -361,6 +361,7 @@ enum {
HCI_QUALITY_REPORT,
HCI_OFFLOAD_CODECS_ENABLED,
HCI_LE_SIMULTANEOUS_ROLES,
+ HCI_CMD_DRAIN_WORKQUEUE,
__HCI_NUM_FLAGS,
};
diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index 81b965953036..56f1286583d3 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -103,15 +103,24 @@ struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
const int dif);
int inet6_hash(struct sock *sk);
-#endif /* IS_ENABLED(CONFIG_IPV6) */
-#define INET6_MATCH(__sk, __net, __saddr, __daddr, __ports, __dif, __sdif) \
- (((__sk)->sk_portpair == (__ports)) && \
- ((__sk)->sk_family == AF_INET6) && \
- ipv6_addr_equal(&(__sk)->sk_v6_daddr, (__saddr)) && \
- ipv6_addr_equal(&(__sk)->sk_v6_rcv_saddr, (__daddr)) && \
- (((__sk)->sk_bound_dev_if == (__dif)) || \
- ((__sk)->sk_bound_dev_if == (__sdif))) && \
- net_eq(sock_net(__sk), (__net)))
+static inline bool inet6_match(struct net *net, const struct sock *sk,
+ const struct in6_addr *saddr,
+ const struct in6_addr *daddr,
+ const __portpair ports,
+ const int dif, const int sdif)
+{
+ if (!net_eq(sock_net(sk), net) ||
+ sk->sk_family != AF_INET6 ||
+ sk->sk_portpair != ports ||
+ !ipv6_addr_equal(&sk->sk_v6_daddr, saddr) ||
+ !ipv6_addr_equal(&sk->sk_v6_rcv_saddr, daddr))
+ return false;
+
+ /* READ_ONCE() paired with WRITE_ONCE() in sock_bindtoindex_locked() */
+ return inet_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif,
+ sdif);
+}
+#endif /* IS_ENABLED(CONFIG_IPV6) */
#endif /* _INET6_HASHTABLES_H */
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 749bb1e46087..53c22b64e972 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -203,17 +203,6 @@ static inline void inet_ehash_locks_free(struct inet_hashinfo *hashinfo)
hashinfo->ehash_locks = NULL;
}
-static inline bool inet_sk_bound_dev_eq(struct net *net, int bound_dev_if,
- int dif, int sdif)
-{
-#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
- return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept),
- bound_dev_if, dif, sdif);
-#else
- return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
-#endif
-}
-
struct inet_bind_bucket *
inet_bind_bucket_create(struct kmem_cache *cachep, struct net *net,
struct inet_bind_hashbucket *head,
@@ -295,7 +284,6 @@ static inline struct sock *inet_lookup_listener(struct net *net,
((__force __portpair)(((__u32)(__dport) << 16) | (__force __u32)(__be16)(__sport)))
#endif
-#if (BITS_PER_LONG == 64)
#ifdef __BIG_ENDIAN
#define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
const __addrpair __name = (__force __addrpair) ( \
@@ -307,24 +295,20 @@ static inline struct sock *inet_lookup_listener(struct net *net,
(((__force __u64)(__be32)(__daddr)) << 32) | \
((__force __u64)(__be32)(__saddr)))
#endif /* __BIG_ENDIAN */
-#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \
- (((__sk)->sk_portpair == (__ports)) && \
- ((__sk)->sk_addrpair == (__cookie)) && \
- (((__sk)->sk_bound_dev_if == (__dif)) || \
- ((__sk)->sk_bound_dev_if == (__sdif))) && \
- net_eq(sock_net(__sk), (__net)))
-#else /* 32-bit arch */
-#define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
- const int __name __deprecated __attribute__((unused))
-
-#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \
- (((__sk)->sk_portpair == (__ports)) && \
- ((__sk)->sk_daddr == (__saddr)) && \
- ((__sk)->sk_rcv_saddr == (__daddr)) && \
- (((__sk)->sk_bound_dev_if == (__dif)) || \
- ((__sk)->sk_bound_dev_if == (__sdif))) && \
- net_eq(sock_net(__sk), (__net)))
-#endif /* 64-bit arch */
+
+static inline bool INET_MATCH(struct net *net, const struct sock *sk,
+ const __addrpair cookie, const __portpair ports,
+ int dif, int sdif)
+{
+ if (!net_eq(sock_net(sk), net) ||
+ sk->sk_portpair != ports ||
+ sk->sk_addrpair != cookie)
+ return false;
+
+ /* READ_ONCE() paired with WRITE_ONCE() in sock_bindtoindex_locked() */
+ return inet_sk_bound_dev_eq(net, READ_ONCE(sk->sk_bound_dev_if), dif,
+ sdif);
+}
/* Sockets in TCP_CLOSE state are _always_ taken out of the hash, so we need
* not check it for lookups anymore, thanks Alexey. -DaveM
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 6395f6b9a5d2..bf5654ce711e 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -149,6 +149,17 @@ static inline bool inet_bound_dev_eq(bool l3mdev_accept, int bound_dev_if,
return bound_dev_if == dif || bound_dev_if == sdif;
}
+static inline bool inet_sk_bound_dev_eq(struct net *net, int bound_dev_if,
+ int dif, int sdif)
+{
+#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
+ return inet_bound_dev_eq(!!READ_ONCE(net->ipv4.sysctl_tcp_l3mdev_accept),
+ bound_dev_if, dif, sdif);
+#else
+ return inet_bound_dev_eq(true, bound_dev_if, dif, sdif);
+#endif
+}
+
struct inet_cork {
unsigned int flags;
__be32 addr;
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 44a35531952e..3372a1f67cf4 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -173,11 +173,28 @@ struct tc_taprio_qopt_offload {
struct tc_taprio_sched_entry entries[];
};
+#if IS_ENABLED(CONFIG_NET_SCH_TAPRIO)
+
/* Reference counting */
struct tc_taprio_qopt_offload *taprio_offload_get(struct tc_taprio_qopt_offload
*offload);
void taprio_offload_free(struct tc_taprio_qopt_offload *offload);
+#else
+
+/* Reference counting */
+static inline struct tc_taprio_qopt_offload *
+taprio_offload_get(struct tc_taprio_qopt_offload *offload)
+{
+ return NULL;
+}
+
+static inline void taprio_offload_free(struct tc_taprio_qopt_offload *offload)
+{
+}
+
+#endif
+
/* Ensure skb_mstamp_ns, which might have been populated with the txtime, is
* not mistaken for a software timestamp, because this will otherwise prevent
* the dispatch of hardware timestamps to the socket.
diff --git a/include/net/raw.h b/include/net/raw.h
index c51a635671a7..537d9d1df890 100644
--- a/include/net/raw.h
+++ b/include/net/raw.h
@@ -20,9 +20,8 @@
extern struct proto raw_prot;
extern struct raw_hashinfo raw_v4_hashinfo;
-struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
- unsigned short num, __be32 raddr,
- __be32 laddr, int dif, int sdif);
+bool raw_v4_match(struct net *net, struct sock *sk, unsigned short num,
+ __be32 raddr, __be32 laddr, int dif, int sdif);
int raw_abort(struct sock *sk, int err);
void raw_icmp_error(struct sk_buff *, int, u32);
@@ -34,9 +33,18 @@ int raw_rcv(struct sock *, struct sk_buff *);
struct raw_hashinfo {
rwlock_t lock;
- struct hlist_head ht[RAW_HTABLE_SIZE];
+ struct hlist_nulls_head ht[RAW_HTABLE_SIZE];
};
+static inline void raw_hashinfo_init(struct raw_hashinfo *hashinfo)
+{
+ int i;
+
+ rwlock_init(&hashinfo->lock);
+ for (i = 0; i < RAW_HTABLE_SIZE; i++)
+ INIT_HLIST_NULLS_HEAD(&hashinfo->ht[i], i);
+}
+
#ifdef CONFIG_PROC_FS
int raw_proc_init(void);
void raw_proc_exit(void);
diff --git a/include/net/rawv6.h b/include/net/rawv6.h
index 53d86b6055e8..bc70909625f6 100644
--- a/include/net/rawv6.h
+++ b/include/net/rawv6.h
@@ -3,11 +3,12 @@
#define _NET_RAWV6_H
#include <net/protocol.h>
+#include <net/raw.h>
extern struct raw_hashinfo raw_v6_hashinfo;
-struct sock *__raw_v6_lookup(struct net *net, struct sock *sk,
- unsigned short num, const struct in6_addr *loc_addr,
- const struct in6_addr *rmt_addr, int dif, int sdif);
+bool raw_v6_match(struct net *net, struct sock *sk, unsigned short num,
+ const struct in6_addr *loc_addr,
+ const struct in6_addr *rmt_addr, int dif, int sdif);
int raw_abort(struct sock *sk, int err);
diff --git a/include/net/sock.h b/include/net/sock.h
index 9563a093fdfc..810cb96b7f39 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -161,9 +161,6 @@ typedef __u64 __bitwise __addrpair;
* for struct sock and struct inet_timewait_sock.
*/
struct sock_common {
- /* skc_daddr and skc_rcv_saddr must be grouped on a 8 bytes aligned
- * address on 64bit arches : cf INET_MATCH()
- */
union {
__addrpair skc_addrpair;
struct {
@@ -1557,19 +1554,23 @@ static inline bool sk_has_account(struct sock *sk)
static inline bool sk_wmem_schedule(struct sock *sk, int size)
{
+ int delta;
+
if (!sk_has_account(sk))
return true;
- return size <= sk->sk_forward_alloc ||
- __sk_mem_schedule(sk, size, SK_MEM_SEND);
+ delta = size - sk->sk_forward_alloc;
+ return delta <= 0 || __sk_mem_schedule(sk, delta, SK_MEM_SEND);
}
static inline bool
sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
{
+ int delta;
+
if (!sk_has_account(sk))
return true;
- return size <= sk->sk_forward_alloc ||
- __sk_mem_schedule(sk, size, SK_MEM_RECV) ||
+ delta = size - sk->sk_forward_alloc;
+ return delta <= 0 || __sk_mem_schedule(sk, delta, SK_MEM_RECV) ||
skb_pfmemalloc(skb);
}
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 4aa031849668..0774ce97c2f1 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -95,6 +95,13 @@ static inline void xsk_buff_free(struct xdp_buff *xdp)
xp_free(xskb);
}
+static inline void xsk_buff_discard(struct xdp_buff *xdp)
+{
+ struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp);
+
+ xp_release(xskb);
+}
+
static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
{
xdp->data = xdp->data_hard_start + XDP_PACKET_HEADROOM;
@@ -238,6 +245,10 @@ static inline void xsk_buff_free(struct xdp_buff *xdp)
{
}
+static inline void xsk_buff_discard(struct xdp_buff *xdp)
+{
+}
+
static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
{
}
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index c0703cd20a99..9758a4a9923f 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -411,7 +411,7 @@ extern int iscsi_host_add(struct Scsi_Host *shost, struct device *pdev);
extern struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
int dd_data_size,
bool xmit_can_sleep);
-extern void iscsi_host_remove(struct Scsi_Host *shost);
+extern void iscsi_host_remove(struct Scsi_Host *shost, bool is_shutdown);
extern void iscsi_host_free(struct Scsi_Host *shost);
extern int iscsi_target_alloc(struct scsi_target *starget);
extern int iscsi_host_get_max_scsi_cmds(struct Scsi_Host *shost,
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 9acb8422f680..d6eab7cb221a 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -442,6 +442,7 @@ extern struct iscsi_cls_session *iscsi_create_session(struct Scsi_Host *shost,
struct iscsi_transport *t,
int dd_size,
unsigned int target_id);
+extern void iscsi_force_destroy_session(struct iscsi_cls_session *session);
extern void iscsi_remove_session(struct iscsi_cls_session *session);
extern void iscsi_free_session(struct iscsi_cls_session *session);
extern struct iscsi_cls_conn *iscsi_alloc_conn(struct iscsi_cls_session *sess,
diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
index 9b4e6c78d0f4..c90a9a2f77a9 100644
--- a/include/soc/mscc/ocelot.h
+++ b/include/soc/mscc/ocelot.h
@@ -568,6 +568,7 @@ struct ocelot_ops {
int (*psfp_stats_get)(struct ocelot *ocelot, struct flow_cls_offload *f,
struct flow_stats *stats);
void (*cut_through_fwd)(struct ocelot *ocelot);
+ void (*tas_clock_adjust)(struct ocelot *ocelot);
};
struct ocelot_vcap_policer {
@@ -652,29 +653,32 @@ struct ocelot_port {
struct regmap *target;
- bool vlan_aware;
+ struct net_device *bond;
+ struct net_device *bridge;
+
/* VLAN that untagged frames are classified to, on ingress */
const struct ocelot_bridge_vlan *pvid_vlan;
+ struct tc_taprio_qopt_offload *taprio;
+
+ phy_interface_t phy_mode;
+
unsigned int ptp_skbs_in_flight;
- u8 ptp_cmd;
struct sk_buff_head tx_skbs;
- u8 ts_id;
- phy_interface_t phy_mode;
+ u16 mrp_ring_id;
- u8 *xmit_template;
+ u8 ptp_cmd;
+ u8 ts_id;
+
+ u8 stp_state;
+ bool vlan_aware;
bool is_dsa_8021q_cpu;
bool learn_ena;
- struct net_device *bond;
bool lag_tx_active;
- u16 mrp_ring_id;
-
- struct net_device *bridge;
int bridge_num;
- u8 stp_state;
int speed;
};
@@ -743,6 +747,9 @@ struct ocelot {
/* Lock for serializing forwarding domain changes */
struct mutex fwd_domain_lock;
+ /* Lock for serializing Time-Aware Shaper changes */
+ struct mutex tas_lock;
+
struct workqueue_struct *owq;
u8 ptp:1;
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 1779e133cea0..d91370377523 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -15,10 +15,6 @@ TRACE_DEFINE_ENUM(NODE);
TRACE_DEFINE_ENUM(DATA);
TRACE_DEFINE_ENUM(META);
TRACE_DEFINE_ENUM(META_FLUSH);
-TRACE_DEFINE_ENUM(INMEM);
-TRACE_DEFINE_ENUM(INMEM_DROP);
-TRACE_DEFINE_ENUM(INMEM_INVALIDATE);
-TRACE_DEFINE_ENUM(INMEM_REVOKE);
TRACE_DEFINE_ENUM(IPU);
TRACE_DEFINE_ENUM(OPU);
TRACE_DEFINE_ENUM(HOT);
@@ -59,10 +55,6 @@ TRACE_DEFINE_ENUM(CP_RESIZE);
{ DATA, "DATA" }, \
{ META, "META" }, \
{ META_FLUSH, "META_FLUSH" }, \
- { INMEM, "INMEM" }, \
- { INMEM_DROP, "INMEM_DROP" }, \
- { INMEM_INVALIDATE, "INMEM_INVALIDATE" }, \
- { INMEM_REVOKE, "INMEM_REVOKE" }, \
{ IPU, "IN-PLACE" }, \
{ OPU, "OUT-OF-PLACE" })
@@ -1289,20 +1281,6 @@ DEFINE_EVENT(f2fs__page, f2fs_vm_page_mkwrite,
TP_ARGS(page, type)
);
-DEFINE_EVENT(f2fs__page, f2fs_register_inmem_page,
-
- TP_PROTO(struct page *page, int type),
-
- TP_ARGS(page, type)
-);
-
-DEFINE_EVENT(f2fs__page, f2fs_commit_inmem_page,
-
- TP_PROTO(struct page *page, int type),
-
- TP_ARGS(page, type)
-);
-
TRACE_EVENT(f2fs_filemap_fault,
TP_PROTO(struct inode *inode, pgoff_t index, unsigned long ret),
diff --git a/include/trace/events/spmi.h b/include/trace/events/spmi.h
index 8b60efe18ba6..a6819fd85cdf 100644
--- a/include/trace/events/spmi.h
+++ b/include/trace/events/spmi.h
@@ -21,15 +21,15 @@ TRACE_EVENT(spmi_write_begin,
__field ( u8, sid )
__field ( u16, addr )
__field ( u8, len )
- __dynamic_array ( u8, buf, len + 1 )
+ __dynamic_array ( u8, buf, len )
),
TP_fast_assign(
__entry->opcode = opcode;
__entry->sid = sid;
__entry->addr = addr;
- __entry->len = len + 1;
- memcpy(__get_dynamic_array(buf), buf, len + 1);
+ __entry->len = len;
+ memcpy(__get_dynamic_array(buf), buf, len);
),
TP_printk("opc=%d sid=%02d addr=0x%04x len=%d buf=0x[%*phD]",
@@ -92,7 +92,7 @@ TRACE_EVENT(spmi_read_end,
__field ( u16, addr )
__field ( int, ret )
__field ( u8, len )
- __dynamic_array ( u8, buf, len + 1 )
+ __dynamic_array ( u8, buf, len )
),
TP_fast_assign(
@@ -100,8 +100,8 @@ TRACE_EVENT(spmi_read_end,
__entry->sid = sid;
__entry->addr = addr;
__entry->ret = ret;
- __entry->len = len + 1;
- memcpy(__get_dynamic_array(buf), buf, len + 1);
+ __entry->len = len;
+ memcpy(__get_dynamic_array(buf), buf, len);
),
TP_printk("opc=%d sid=%02d addr=0x%04x ret=%d len=%02d buf=0x[%*phD]",
diff --git a/include/trace/stages/stage1_struct_define.h b/include/trace/stages/stage1_struct_define.h
index a16783419687..1b7bab60434c 100644
--- a/include/trace/stages/stage1_struct_define.h
+++ b/include/trace/stages/stage1_struct_define.h
@@ -26,6 +26,9 @@
#undef __string_len
#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+#undef __vstring
+#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
+
#undef __bitmask
#define __bitmask(item, nr_bits) __dynamic_array(char, item, -1)
diff --git a/include/trace/stages/stage2_data_offsets.h b/include/trace/stages/stage2_data_offsets.h
index 42fd1e8813ec..1b7a8f764fdd 100644
--- a/include/trace/stages/stage2_data_offsets.h
+++ b/include/trace/stages/stage2_data_offsets.h
@@ -32,6 +32,9 @@
#undef __string_len
#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+#undef __vstring
+#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
+
#undef __bitmask
#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
diff --git a/include/trace/stages/stage4_event_fields.h b/include/trace/stages/stage4_event_fields.h
index e80cdc397a43..80d34f396555 100644
--- a/include/trace/stages/stage4_event_fields.h
+++ b/include/trace/stages/stage4_event_fields.h
@@ -2,16 +2,18 @@
/* Stage 4 definitions for creating trace events */
+#define ALIGN_STRUCTFIELD(type) ((int)(offsetof(struct {char a; type b;}, b)))
+
#undef __field_ext
#define __field_ext(_type, _item, _filter_type) { \
.type = #_type, .name = #_item, \
- .size = sizeof(_type), .align = __alignof__(_type), \
+ .size = sizeof(_type), .align = ALIGN_STRUCTFIELD(_type), \
.is_signed = is_signed_type(_type), .filter_type = _filter_type },
#undef __field_struct_ext
#define __field_struct_ext(_type, _item, _filter_type) { \
.type = #_type, .name = #_item, \
- .size = sizeof(_type), .align = __alignof__(_type), \
+ .size = sizeof(_type), .align = ALIGN_STRUCTFIELD(_type), \
0, .filter_type = _filter_type },
#undef __field
@@ -23,7 +25,7 @@
#undef __array
#define __array(_type, _item, _len) { \
.type = #_type"["__stringify(_len)"]", .name = #_item, \
- .size = sizeof(_type[_len]), .align = __alignof__(_type), \
+ .size = sizeof(_type[_len]), .align = ALIGN_STRUCTFIELD(_type), \
.is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },
#undef __dynamic_array
@@ -38,6 +40,9 @@
#undef __string_len
#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+#undef __vstring
+#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
+
#undef __bitmask
#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
diff --git a/include/trace/stages/stage5_get_offsets.h b/include/trace/stages/stage5_get_offsets.h
index 7ee5931300e6..fba4c24ed9e6 100644
--- a/include/trace/stages/stage5_get_offsets.h
+++ b/include/trace/stages/stage5_get_offsets.h
@@ -39,6 +39,10 @@
#undef __string_len
#define __string_len(item, src, len) __dynamic_array(char, item, (len) + 1)
+#undef __vstring
+#define __vstring(item, fmt, ap) __dynamic_array(char, item, \
+ __trace_event_vstr_len(fmt, ap))
+
#undef __rel_dynamic_array
#define __rel_dynamic_array(type, item, len) \
__item_length = (len) * sizeof(type); \
diff --git a/include/trace/stages/stage6_event_callback.h b/include/trace/stages/stage6_event_callback.h
index e1724f73594b..3c554a585320 100644
--- a/include/trace/stages/stage6_event_callback.h
+++ b/include/trace/stages/stage6_event_callback.h
@@ -24,6 +24,9 @@
#undef __string_len
#define __string_len(item, src, len) __dynamic_array(char, item, -1)
+#undef __vstring
+#define __vstring(item, fmt, ap) __dynamic_array(char, item, -1)
+
#undef __assign_str
#define __assign_str(dst, src) \
strcpy(__get_str(dst), (src) ? (const char *)(src) : "(null)");
@@ -35,6 +38,15 @@
__get_str(dst)[len] = '\0'; \
} while(0)
+#undef __assign_vstr
+#define __assign_vstr(dst, fmt, va) \
+ do { \
+ va_list __cp_va; \
+ va_copy(__cp_va, *(va)); \
+ vsnprintf(__get_str(dst), TRACE_EVENT_STR_MAX, fmt, __cp_va); \
+ va_end(__cp_va); \
+ } while (0)
+
#undef __bitmask
#define __bitmask(item, nr_bits) __dynamic_array(unsigned long, item, -1)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d14b10b85e51..ff9af73859ca 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1013,6 +1013,7 @@ enum bpf_link_type {
BPF_LINK_TYPE_XDP = 6,
BPF_LINK_TYPE_PERF_EVENT = 7,
BPF_LINK_TYPE_KPROBE_MULTI = 8,
+ BPF_LINK_TYPE_STRUCT_OPS = 9,
MAX_BPF_LINK_TYPE,
};
@@ -5143,6 +5144,17 @@ union bpf_attr {
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
* invalid arguments are passed.
+ *
+ * void *bpf_kptr_xchg(void *map_value, void *ptr)
+ * Description
+ * Exchange kptr at pointer *map_value* with *ptr*, and return the
+ * old value. *ptr* can be NULL, otherwise it must be a referenced
+ * pointer which will be released when this helper is called.
+ * Return
+ * The old value of kptr (which can be NULL). The returned pointer
+ * if not NULL, is a reference which must be released using its
+ * corresponding release function, or moved into a BPF map before
+ * program exit.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -5339,6 +5351,7 @@ union bpf_attr {
FN(copy_from_user_task), \
FN(skb_set_tstamp), \
FN(ima_file_hash), \
+ FN(kptr_xchg), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/include/uapi/linux/can/error.h b/include/uapi/linux/can/error.h
index 34633283de64..a1000cb63063 100644
--- a/include/uapi/linux/can/error.h
+++ b/include/uapi/linux/can/error.h
@@ -120,6 +120,9 @@
#define CAN_ERR_TRX_CANL_SHORT_TO_GND 0x70 /* 0111 0000 */
#define CAN_ERR_TRX_CANL_SHORT_TO_CANH 0x80 /* 1000 0000 */
-/* controller specific additional information / data[5..7] */
+/* data[5] is reserved (do not use) */
+
+/* TX error counter / data[6] */
+/* RX error counter / data[7] */
#endif /* _UAPI_CAN_ERROR_H */
diff --git a/include/uapi/linux/f2fs.h b/include/uapi/linux/f2fs.h
index 352a822d4370..3121d127d5aa 100644
--- a/include/uapi/linux/f2fs.h
+++ b/include/uapi/linux/f2fs.h
@@ -13,7 +13,7 @@
#define F2FS_IOC_COMMIT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 2)
#define F2FS_IOC_START_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 3)
#define F2FS_IOC_RELEASE_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 4)
-#define F2FS_IOC_ABORT_VOLATILE_WRITE _IO(F2FS_IOCTL_MAGIC, 5)
+#define F2FS_IOC_ABORT_ATOMIC_WRITE _IO(F2FS_IOCTL_MAGIC, 5)
#define F2FS_IOC_GARBAGE_COLLECT _IOW(F2FS_IOCTL_MAGIC, 6, __u32)
#define F2FS_IOC_WRITE_CHECKPOINT _IO(F2FS_IOCTL_MAGIC, 7)
#define F2FS_IOC_DEFRAGMENT _IOWR(F2FS_IOCTL_MAGIC, 8, \
diff --git a/include/uapi/linux/netfilter/xt_IDLETIMER.h b/include/uapi/linux/netfilter/xt_IDLETIMER.h
index 49ddcdc61c09..7bfb31a66fc9 100644
--- a/include/uapi/linux/netfilter/xt_IDLETIMER.h
+++ b/include/uapi/linux/netfilter/xt_IDLETIMER.h
@@ -1,6 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
/*
- * linux/include/linux/netfilter/xt_IDLETIMER.h
- *
* Header file for Xtables timer target module.
*
* Copyright (C) 2004, 2010 Nokia Corporation
@@ -10,20 +9,6 @@
* by Luciano Coelho <luciano.coelho@nokia.com>
*
* Contact: Luciano Coelho <luciano.coelho@nokia.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * version 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
- * 02110-1301 USA
*/
#ifndef _XT_IDLETIMER_H
diff --git a/init/main.c b/init/main.c
index 80bd217dd35e..1e866b31bf19 100644
--- a/init/main.c
+++ b/init/main.c
@@ -99,6 +99,7 @@
#include <linux/kcsan.h>
#include <linux/init_syscalls.h>
#include <linux/stackdepot.h>
+#include <linux/randomize_kstack.h>
#include <net/net_namespace.h>
#include <asm/io.h>
diff --git a/io_uring/Makefile b/io_uring/Makefile
new file mode 100644
index 000000000000..3680425df947
--- /dev/null
+++ b/io_uring/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for io_uring
+
+obj-$(CONFIG_IO_URING) += io_uring.o
+obj-$(CONFIG_IO_WQ) += io-wq.o
diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
new file mode 100644
index 000000000000..32aeb2c581c5
--- /dev/null
+++ b/io_uring/io-wq.c
@@ -0,0 +1,1424 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Basic worker thread pool for io_uring
+ *
+ * Copyright (C) 2019 Jens Axboe
+ *
+ */
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/sched/signal.h>
+#include <linux/percpu.h>
+#include <linux/slab.h>
+#include <linux/rculist_nulls.h>
+#include <linux/cpu.h>
+#include <linux/task_work.h>
+#include <linux/audit.h>
+#include <uapi/linux/io_uring.h>
+
+#include "io-wq.h"
+
+#define WORKER_IDLE_TIMEOUT (5 * HZ)
+
+enum {
+ IO_WORKER_F_UP = 1, /* up and active */
+ IO_WORKER_F_RUNNING = 2, /* account as running */
+ IO_WORKER_F_FREE = 4, /* worker on free list */
+ IO_WORKER_F_BOUND = 8, /* is doing bounded work */
+};
+
+enum {
+ IO_WQ_BIT_EXIT = 0, /* wq exiting */
+};
+
+enum {
+ IO_ACCT_STALLED_BIT = 0, /* stalled on hash */
+};
+
+/*
+ * One for each thread in a wqe pool
+ */
+struct io_worker {
+ refcount_t ref;
+ unsigned flags;
+ struct hlist_nulls_node nulls_node;
+ struct list_head all_list;
+ struct task_struct *task;
+ struct io_wqe *wqe;
+
+ struct io_wq_work *cur_work;
+ struct io_wq_work *next_work;
+ raw_spinlock_t lock;
+
+ struct completion ref_done;
+
+ unsigned long create_state;
+ struct callback_head create_work;
+ int create_index;
+
+ union {
+ struct rcu_head rcu;
+ struct work_struct work;
+ };
+};
+
+#if BITS_PER_LONG == 64
+#define IO_WQ_HASH_ORDER 6
+#else
+#define IO_WQ_HASH_ORDER 5
+#endif
+
+#define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER)
+
+struct io_wqe_acct {
+ unsigned nr_workers;
+ unsigned max_workers;
+ int index;
+ atomic_t nr_running;
+ raw_spinlock_t lock;
+ struct io_wq_work_list work_list;
+ unsigned long flags;
+};
+
+enum {
+ IO_WQ_ACCT_BOUND,
+ IO_WQ_ACCT_UNBOUND,
+ IO_WQ_ACCT_NR,
+};
+
+/*
+ * Per-node worker thread pool
+ */
+struct io_wqe {
+ raw_spinlock_t lock;
+ struct io_wqe_acct acct[IO_WQ_ACCT_NR];
+
+ int node;
+
+ struct hlist_nulls_head free_list;
+ struct list_head all_list;
+
+ struct wait_queue_entry wait;
+
+ struct io_wq *wq;
+ struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS];
+
+ cpumask_var_t cpu_mask;
+};
+
+/*
+ * Per io_wq state
+ */
+struct io_wq {
+ unsigned long state;
+
+ free_work_fn *free_work;
+ io_wq_work_fn *do_work;
+
+ struct io_wq_hash *hash;
+
+ atomic_t worker_refs;
+ struct completion worker_done;
+
+ struct hlist_node cpuhp_node;
+
+ struct task_struct *task;
+
+ struct io_wqe *wqes[];
+};
+
+static enum cpuhp_state io_wq_online;
+
+struct io_cb_cancel_data {
+ work_cancel_fn *fn;
+ void *data;
+ int nr_running;
+ int nr_pending;
+ bool cancel_all;
+};
+
+static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index);
+static void io_wqe_dec_running(struct io_worker *worker);
+static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
+ struct io_wqe_acct *acct,
+ struct io_cb_cancel_data *match);
+static void create_worker_cb(struct callback_head *cb);
+static void io_wq_cancel_tw_create(struct io_wq *wq);
+
+static bool io_worker_get(struct io_worker *worker)
+{
+ return refcount_inc_not_zero(&worker->ref);
+}
+
+static void io_worker_release(struct io_worker *worker)
+{
+ if (refcount_dec_and_test(&worker->ref))
+ complete(&worker->ref_done);
+}
+
+static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound)
+{
+ return &wqe->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
+}
+
+static inline struct io_wqe_acct *io_work_get_acct(struct io_wqe *wqe,
+ struct io_wq_work *work)
+{
+ return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND));
+}
+
+static inline struct io_wqe_acct *io_wqe_get_acct(struct io_worker *worker)
+{
+ return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND);
+}
+
+static void io_worker_ref_put(struct io_wq *wq)
+{
+ if (atomic_dec_and_test(&wq->worker_refs))
+ complete(&wq->worker_done);
+}
+
+static void io_worker_cancel_cb(struct io_worker *worker)
+{
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wqe *wqe = worker->wqe;
+ struct io_wq *wq = wqe->wq;
+
+ atomic_dec(&acct->nr_running);
+ raw_spin_lock(&worker->wqe->lock);
+ acct->nr_workers--;
+ raw_spin_unlock(&worker->wqe->lock);
+ io_worker_ref_put(wq);
+ clear_bit_unlock(0, &worker->create_state);
+ io_worker_release(worker);
+}
+
+static bool io_task_worker_match(struct callback_head *cb, void *data)
+{
+ struct io_worker *worker;
+
+ if (cb->func != create_worker_cb)
+ return false;
+ worker = container_of(cb, struct io_worker, create_work);
+ return worker == data;
+}
+
+static void io_worker_exit(struct io_worker *worker)
+{
+ struct io_wqe *wqe = worker->wqe;
+ struct io_wq *wq = wqe->wq;
+
+ while (1) {
+ struct callback_head *cb = task_work_cancel_match(wq->task,
+ io_task_worker_match, worker);
+
+ if (!cb)
+ break;
+ io_worker_cancel_cb(worker);
+ }
+
+ io_worker_release(worker);
+ wait_for_completion(&worker->ref_done);
+
+ raw_spin_lock(&wqe->lock);
+ if (worker->flags & IO_WORKER_F_FREE)
+ hlist_nulls_del_rcu(&worker->nulls_node);
+ list_del_rcu(&worker->all_list);
+ raw_spin_unlock(&wqe->lock);
+ io_wqe_dec_running(worker);
+ worker->flags = 0;
+ preempt_disable();
+ current->flags &= ~PF_IO_WORKER;
+ preempt_enable();
+
+ kfree_rcu(worker, rcu);
+ io_worker_ref_put(wqe->wq);
+ do_exit(0);
+}
+
+static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
+{
+ bool ret = false;
+
+ raw_spin_lock(&acct->lock);
+ if (!wq_list_empty(&acct->work_list) &&
+ !test_bit(IO_ACCT_STALLED_BIT, &acct->flags))
+ ret = true;
+ raw_spin_unlock(&acct->lock);
+
+ return ret;
+}
+
+/*
+ * Check head of free list for an available worker. If one isn't available,
+ * caller must create one.
+ */
+static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
+ struct io_wqe_acct *acct)
+ __must_hold(RCU)
+{
+ struct hlist_nulls_node *n;
+ struct io_worker *worker;
+
+ /*
+ * Iterate free_list and see if we can find an idle worker to
+ * activate. If a given worker is on the free_list but in the process
+ * of exiting, keep trying.
+ */
+ hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
+ if (!io_worker_get(worker))
+ continue;
+ if (io_wqe_get_acct(worker) != acct) {
+ io_worker_release(worker);
+ continue;
+ }
+ if (wake_up_process(worker->task)) {
+ io_worker_release(worker);
+ return true;
+ }
+ io_worker_release(worker);
+ }
+
+ return false;
+}
+
+/*
+ * We need a worker. If we find a free one, we're good. If not, and we're
+ * below the max number of workers, create one.
+ */
+static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
+{
+ /*
+ * Most likely an attempt to queue unbounded work on an io_wq that
+ * wasn't setup with any unbounded workers.
+ */
+ if (unlikely(!acct->max_workers))
+ pr_warn_once("io-wq is not configured for unbound workers");
+
+ raw_spin_lock(&wqe->lock);
+ if (acct->nr_workers >= acct->max_workers) {
+ raw_spin_unlock(&wqe->lock);
+ return true;
+ }
+ acct->nr_workers++;
+ raw_spin_unlock(&wqe->lock);
+ atomic_inc(&acct->nr_running);
+ atomic_inc(&wqe->wq->worker_refs);
+ return create_io_worker(wqe->wq, wqe, acct->index);
+}
+
+static void io_wqe_inc_running(struct io_worker *worker)
+{
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+
+ atomic_inc(&acct->nr_running);
+}
+
+static void create_worker_cb(struct callback_head *cb)
+{
+ struct io_worker *worker;
+ struct io_wq *wq;
+ struct io_wqe *wqe;
+ struct io_wqe_acct *acct;
+ bool do_create = false;
+
+ worker = container_of(cb, struct io_worker, create_work);
+ wqe = worker->wqe;
+ wq = wqe->wq;
+ acct = &wqe->acct[worker->create_index];
+ raw_spin_lock(&wqe->lock);
+ if (acct->nr_workers < acct->max_workers) {
+ acct->nr_workers++;
+ do_create = true;
+ }
+ raw_spin_unlock(&wqe->lock);
+ if (do_create) {
+ create_io_worker(wq, wqe, worker->create_index);
+ } else {
+ atomic_dec(&acct->nr_running);
+ io_worker_ref_put(wq);
+ }
+ clear_bit_unlock(0, &worker->create_state);
+ io_worker_release(worker);
+}
+
+static bool io_queue_worker_create(struct io_worker *worker,
+ struct io_wqe_acct *acct,
+ task_work_func_t func)
+{
+ struct io_wqe *wqe = worker->wqe;
+ struct io_wq *wq = wqe->wq;
+
+ /* raced with exit, just ignore create call */
+ if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
+ goto fail;
+ if (!io_worker_get(worker))
+ goto fail;
+ /*
+ * create_state manages ownership of create_work/index. We should
+ * only need one entry per worker, as the worker going to sleep
+ * will trigger the condition, and waking will clear it once it
+ * runs the task_work.
+ */
+ if (test_bit(0, &worker->create_state) ||
+ test_and_set_bit_lock(0, &worker->create_state))
+ goto fail_release;
+
+ atomic_inc(&wq->worker_refs);
+ init_task_work(&worker->create_work, func);
+ worker->create_index = acct->index;
+ if (!task_work_add(wq->task, &worker->create_work, TWA_SIGNAL)) {
+ /*
+ * EXIT may have been set after checking it above, check after
+ * adding the task_work and remove any creation item if it is
+ * now set. wq exit does that too, but we can have added this
+ * work item after we canceled in io_wq_exit_workers().
+ */
+ if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
+ io_wq_cancel_tw_create(wq);
+ io_worker_ref_put(wq);
+ return true;
+ }
+ io_worker_ref_put(wq);
+ clear_bit_unlock(0, &worker->create_state);
+fail_release:
+ io_worker_release(worker);
+fail:
+ atomic_dec(&acct->nr_running);
+ io_worker_ref_put(wq);
+ return false;
+}
+
+static void io_wqe_dec_running(struct io_worker *worker)
+{
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wqe *wqe = worker->wqe;
+
+ if (!(worker->flags & IO_WORKER_F_UP))
+ return;
+
+ if (!atomic_dec_and_test(&acct->nr_running))
+ return;
+ if (!io_acct_run_queue(acct))
+ return;
+
+ atomic_inc(&acct->nr_running);
+ atomic_inc(&wqe->wq->worker_refs);
+ io_queue_worker_create(worker, acct, create_worker_cb);
+}
+
+/*
+ * Worker will start processing some work. Move it to the busy list, if
+ * it's currently on the freelist
+ */
+static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker)
+{
+ if (worker->flags & IO_WORKER_F_FREE) {
+ worker->flags &= ~IO_WORKER_F_FREE;
+ raw_spin_lock(&wqe->lock);
+ hlist_nulls_del_init_rcu(&worker->nulls_node);
+ raw_spin_unlock(&wqe->lock);
+ }
+}
+
+/*
+ * No work, worker going to sleep. Move to freelist, and unuse mm if we
+ * have one attached. Dropping the mm may potentially sleep, so we drop
+ * the lock in that case and return success. Since the caller has to
+ * retry the loop in that case (we changed task state), we don't regrab
+ * the lock if we return success.
+ */
+static void __io_worker_idle(struct io_wqe *wqe, struct io_worker *worker)
+ __must_hold(wqe->lock)
+{
+ if (!(worker->flags & IO_WORKER_F_FREE)) {
+ worker->flags |= IO_WORKER_F_FREE;
+ hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
+ }
+}
+
+static inline unsigned int io_get_work_hash(struct io_wq_work *work)
+{
+ return work->flags >> IO_WQ_HASH_SHIFT;
+}
+
+static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
+{
+ struct io_wq *wq = wqe->wq;
+ bool ret = false;
+
+ spin_lock_irq(&wq->hash->wait.lock);
+ if (list_empty(&wqe->wait.entry)) {
+ __add_wait_queue(&wq->hash->wait, &wqe->wait);
+ if (!test_bit(hash, &wq->hash->map)) {
+ __set_current_state(TASK_RUNNING);
+ list_del_init(&wqe->wait.entry);
+ ret = true;
+ }
+ }
+ spin_unlock_irq(&wq->hash->wait.lock);
+ return ret;
+}
+
+static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
+ struct io_worker *worker)
+ __must_hold(acct->lock)
+{
+ struct io_wq_work_node *node, *prev;
+ struct io_wq_work *work, *tail;
+ unsigned int stall_hash = -1U;
+ struct io_wqe *wqe = worker->wqe;
+
+ wq_list_for_each(node, prev, &acct->work_list) {
+ unsigned int hash;
+
+ work = container_of(node, struct io_wq_work, list);
+
+ /* not hashed, can run anytime */
+ if (!io_wq_is_hashed(work)) {
+ wq_list_del(&acct->work_list, node, prev);
+ return work;
+ }
+
+ hash = io_get_work_hash(work);
+ /* all items with this hash lie in [work, tail] */
+ tail = wqe->hash_tail[hash];
+
+ /* hashed, can run if not already running */
+ if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
+ wqe->hash_tail[hash] = NULL;
+ wq_list_cut(&acct->work_list, &tail->list, prev);
+ return work;
+ }
+ if (stall_hash == -1U)
+ stall_hash = hash;
+ /* fast forward to a next hash, for-each will fix up @prev */
+ node = &tail->list;
+ }
+
+ if (stall_hash != -1U) {
+ bool unstalled;
+
+ /*
+ * Set this before dropping the lock to avoid racing with new
+ * work being added and clearing the stalled bit.
+ */
+ set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+ raw_spin_unlock(&acct->lock);
+ unstalled = io_wait_on_hash(wqe, stall_hash);
+ raw_spin_lock(&acct->lock);
+ if (unstalled) {
+ clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+ if (wq_has_sleeper(&wqe->wq->hash->wait))
+ wake_up(&wqe->wq->hash->wait);
+ }
+ }
+
+ return NULL;
+}
+
+static bool io_flush_signals(void)
+{
+ if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) {
+ __set_current_state(TASK_RUNNING);
+ clear_notify_signal();
+ if (task_work_pending(current))
+ task_work_run();
+ return true;
+ }
+ return false;
+}
+
+static void io_assign_current_work(struct io_worker *worker,
+ struct io_wq_work *work)
+{
+ if (work) {
+ io_flush_signals();
+ cond_resched();
+ }
+
+ raw_spin_lock(&worker->lock);
+ worker->cur_work = work;
+ worker->next_work = NULL;
+ raw_spin_unlock(&worker->lock);
+}
+
+static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
+
+static void io_worker_handle_work(struct io_worker *worker)
+{
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wqe *wqe = worker->wqe;
+ struct io_wq *wq = wqe->wq;
+ bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
+
+ do {
+ struct io_wq_work *work;
+
+ /*
+ * If we got some work, mark us as busy. If we didn't, but
+ * the list isn't empty, it means we stalled on hashed work.
+ * Mark us stalled so we don't keep looking for work when we
+ * can't make progress, any work completion or insertion will
+ * clear the stalled flag.
+ */
+ raw_spin_lock(&acct->lock);
+ work = io_get_next_work(acct, worker);
+ raw_spin_unlock(&acct->lock);
+ if (work) {
+ __io_worker_busy(wqe, worker);
+
+ /*
+ * Make sure cancelation can find this, even before
+ * it becomes the active work. That avoids a window
+ * where the work has been removed from our general
+ * work list, but isn't yet discoverable as the
+ * current work item for this worker.
+ */
+ raw_spin_lock(&worker->lock);
+ worker->next_work = work;
+ raw_spin_unlock(&worker->lock);
+ } else {
+ break;
+ }
+ io_assign_current_work(worker, work);
+ __set_current_state(TASK_RUNNING);
+
+ /* handle a whole dependent link */
+ do {
+ struct io_wq_work *next_hashed, *linked;
+ unsigned int hash = io_get_work_hash(work);
+
+ next_hashed = wq_next_work(work);
+
+ if (unlikely(do_kill) && (work->flags & IO_WQ_WORK_UNBOUND))
+ work->flags |= IO_WQ_WORK_CANCEL;
+ wq->do_work(work);
+ io_assign_current_work(worker, NULL);
+
+ linked = wq->free_work(work);
+ work = next_hashed;
+ if (!work && linked && !io_wq_is_hashed(linked)) {
+ work = linked;
+ linked = NULL;
+ }
+ io_assign_current_work(worker, work);
+ if (linked)
+ io_wqe_enqueue(wqe, linked);
+
+ if (hash != -1U && !next_hashed) {
+ /* serialize hash clear with wake_up() */
+ spin_lock_irq(&wq->hash->wait.lock);
+ clear_bit(hash, &wq->hash->map);
+ clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+ spin_unlock_irq(&wq->hash->wait.lock);
+ if (wq_has_sleeper(&wq->hash->wait))
+ wake_up(&wq->hash->wait);
+ }
+ } while (work);
+ } while (1);
+}
+
+static int io_wqe_worker(void *data)
+{
+ struct io_worker *worker = data;
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wqe *wqe = worker->wqe;
+ struct io_wq *wq = wqe->wq;
+ bool last_timeout = false;
+ char buf[TASK_COMM_LEN];
+
+ worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
+
+ snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
+ set_task_comm(current, buf);
+
+ audit_alloc_kernel(current);
+
+ while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
+ long ret;
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ while (io_acct_run_queue(acct))
+ io_worker_handle_work(worker);
+
+ raw_spin_lock(&wqe->lock);
+ /* timed out, exit unless we're the last worker */
+ if (last_timeout && acct->nr_workers > 1) {
+ acct->nr_workers--;
+ raw_spin_unlock(&wqe->lock);
+ __set_current_state(TASK_RUNNING);
+ break;
+ }
+ last_timeout = false;
+ __io_worker_idle(wqe, worker);
+ raw_spin_unlock(&wqe->lock);
+ if (io_flush_signals())
+ continue;
+ ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
+ if (signal_pending(current)) {
+ struct ksignal ksig;
+
+ if (!get_signal(&ksig))
+ continue;
+ break;
+ }
+ last_timeout = !ret;
+ }
+
+ if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
+ io_worker_handle_work(worker);
+
+ audit_free(current);
+ io_worker_exit(worker);
+ return 0;
+}
+
+/*
+ * Called when a worker is scheduled in. Mark us as currently running.
+ */
+void io_wq_worker_running(struct task_struct *tsk)
+{
+ struct io_worker *worker = tsk->worker_private;
+
+ if (!worker)
+ return;
+ if (!(worker->flags & IO_WORKER_F_UP))
+ return;
+ if (worker->flags & IO_WORKER_F_RUNNING)
+ return;
+ worker->flags |= IO_WORKER_F_RUNNING;
+ io_wqe_inc_running(worker);
+}
+
+/*
+ * Called when worker is going to sleep. If there are no workers currently
+ * running and we have work pending, wake up a free one or create a new one.
+ */
+void io_wq_worker_sleeping(struct task_struct *tsk)
+{
+ struct io_worker *worker = tsk->worker_private;
+
+ if (!worker)
+ return;
+ if (!(worker->flags & IO_WORKER_F_UP))
+ return;
+ if (!(worker->flags & IO_WORKER_F_RUNNING))
+ return;
+
+ worker->flags &= ~IO_WORKER_F_RUNNING;
+ io_wqe_dec_running(worker);
+}
+
+static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
+ struct task_struct *tsk)
+{
+ tsk->worker_private = worker;
+ worker->task = tsk;
+ set_cpus_allowed_ptr(tsk, wqe->cpu_mask);
+ tsk->flags |= PF_NO_SETAFFINITY;
+
+ raw_spin_lock(&wqe->lock);
+ hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
+ list_add_tail_rcu(&worker->all_list, &wqe->all_list);
+ worker->flags |= IO_WORKER_F_FREE;
+ raw_spin_unlock(&wqe->lock);
+ wake_up_new_task(tsk);
+}
+
+static bool io_wq_work_match_all(struct io_wq_work *work, void *data)
+{
+ return true;
+}
+
+static inline bool io_should_retry_thread(long err)
+{
+ /*
+ * Prevent perpetual task_work retry, if the task (or its group) is
+ * exiting.
+ */
+ if (fatal_signal_pending(current))
+ return false;
+
+ switch (err) {
+ case -EAGAIN:
+ case -ERESTARTSYS:
+ case -ERESTARTNOINTR:
+ case -ERESTARTNOHAND:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static void create_worker_cont(struct callback_head *cb)
+{
+ struct io_worker *worker;
+ struct task_struct *tsk;
+ struct io_wqe *wqe;
+
+ worker = container_of(cb, struct io_worker, create_work);
+ clear_bit_unlock(0, &worker->create_state);
+ wqe = worker->wqe;
+ tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
+ if (!IS_ERR(tsk)) {
+ io_init_new_worker(wqe, worker, tsk);
+ io_worker_release(worker);
+ return;
+ } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+
+ atomic_dec(&acct->nr_running);
+ raw_spin_lock(&wqe->lock);
+ acct->nr_workers--;
+ if (!acct->nr_workers) {
+ struct io_cb_cancel_data match = {
+ .fn = io_wq_work_match_all,
+ .cancel_all = true,
+ };
+
+ raw_spin_unlock(&wqe->lock);
+ while (io_acct_cancel_pending_work(wqe, acct, &match))
+ ;
+ } else {
+ raw_spin_unlock(&wqe->lock);
+ }
+ io_worker_ref_put(wqe->wq);
+ kfree(worker);
+ return;
+ }
+
+ /* re-create attempts grab a new worker ref, drop the existing one */
+ io_worker_release(worker);
+ schedule_work(&worker->work);
+}
+
+static void io_workqueue_create(struct work_struct *work)
+{
+ struct io_worker *worker = container_of(work, struct io_worker, work);
+ struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+
+ if (!io_queue_worker_create(worker, acct, create_worker_cont))
+ kfree(worker);
+}
+
+static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
+{
+ struct io_wqe_acct *acct = &wqe->acct[index];
+ struct io_worker *worker;
+ struct task_struct *tsk;
+
+ __set_current_state(TASK_RUNNING);
+
+ worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, wqe->node);
+ if (!worker) {
+fail:
+ atomic_dec(&acct->nr_running);
+ raw_spin_lock(&wqe->lock);
+ acct->nr_workers--;
+ raw_spin_unlock(&wqe->lock);
+ io_worker_ref_put(wq);
+ return false;
+ }
+
+ refcount_set(&worker->ref, 1);
+ worker->wqe = wqe;
+ raw_spin_lock_init(&worker->lock);
+ init_completion(&worker->ref_done);
+
+ if (index == IO_WQ_ACCT_BOUND)
+ worker->flags |= IO_WORKER_F_BOUND;
+
+ tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
+ if (!IS_ERR(tsk)) {
+ io_init_new_worker(wqe, worker, tsk);
+ } else if (!io_should_retry_thread(PTR_ERR(tsk))) {
+ kfree(worker);
+ goto fail;
+ } else {
+ INIT_WORK(&worker->work, io_workqueue_create);
+ schedule_work(&worker->work);
+ }
+
+ return true;
+}
+
+/*
+ * Iterate the passed in list and call the specific function for each
+ * worker that isn't exiting
+ */
+static bool io_wq_for_each_worker(struct io_wqe *wqe,
+ bool (*func)(struct io_worker *, void *),
+ void *data)
+{
+ struct io_worker *worker;
+ bool ret = false;
+
+ list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
+ if (io_worker_get(worker)) {
+ /* no task if node is/was offline */
+ if (worker->task)
+ ret = func(worker, data);
+ io_worker_release(worker);
+ if (ret)
+ break;
+ }
+ }
+
+ return ret;
+}
+
+static bool io_wq_worker_wake(struct io_worker *worker, void *data)
+{
+ set_notify_signal(worker->task);
+ wake_up_process(worker->task);
+ return false;
+}
+
+static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
+{
+ struct io_wq *wq = wqe->wq;
+
+ do {
+ work->flags |= IO_WQ_WORK_CANCEL;
+ wq->do_work(work);
+ work = wq->free_work(work);
+ } while (work);
+}
+
+static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
+{
+ struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+ unsigned int hash;
+ struct io_wq_work *tail;
+
+ if (!io_wq_is_hashed(work)) {
+append:
+ wq_list_add_tail(&work->list, &acct->work_list);
+ return;
+ }
+
+ hash = io_get_work_hash(work);
+ tail = wqe->hash_tail[hash];
+ wqe->hash_tail[hash] = work;
+ if (!tail)
+ goto append;
+
+ wq_list_add_after(&work->list, &tail->list, &acct->work_list);
+}
+
+static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
+{
+ return work == data;
+}
+
+static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
+{
+ struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+ struct io_cb_cancel_data match;
+ unsigned work_flags = work->flags;
+ bool do_create;
+
+ /*
+ * If io-wq is exiting for this task, or if the request has explicitly
+ * been marked as one that should not get executed, cancel it here.
+ */
+ if (test_bit(IO_WQ_BIT_EXIT, &wqe->wq->state) ||
+ (work->flags & IO_WQ_WORK_CANCEL)) {
+ io_run_cancel(work, wqe);
+ return;
+ }
+
+ raw_spin_lock(&acct->lock);
+ io_wqe_insert_work(wqe, work);
+ clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
+ raw_spin_unlock(&acct->lock);
+
+ raw_spin_lock(&wqe->lock);
+ rcu_read_lock();
+ do_create = !io_wqe_activate_free_worker(wqe, acct);
+ rcu_read_unlock();
+
+ raw_spin_unlock(&wqe->lock);
+
+ if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) ||
+ !atomic_read(&acct->nr_running))) {
+ bool did_create;
+
+ did_create = io_wqe_create_worker(wqe, acct);
+ if (likely(did_create))
+ return;
+
+ raw_spin_lock(&wqe->lock);
+ if (acct->nr_workers) {
+ raw_spin_unlock(&wqe->lock);
+ return;
+ }
+ raw_spin_unlock(&wqe->lock);
+
+ /* fatal condition, failed to create the first worker */
+ match.fn = io_wq_work_match_item,
+ match.data = work,
+ match.cancel_all = false,
+
+ io_acct_cancel_pending_work(wqe, acct, &match);
+ }
+}
+
+void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
+{
+ struct io_wqe *wqe = wq->wqes[numa_node_id()];
+
+ io_wqe_enqueue(wqe, work);
+}
+
+/*
+ * Work items that hash to the same value will not be done in parallel.
+ * Used to limit concurrent writes, generally hashed by inode.
+ */
+void io_wq_hash_work(struct io_wq_work *work, void *val)
+{
+ unsigned int bit;
+
+ bit = hash_ptr(val, IO_WQ_HASH_ORDER);
+ work->flags |= (IO_WQ_WORK_HASHED | (bit << IO_WQ_HASH_SHIFT));
+}
+
+static bool __io_wq_worker_cancel(struct io_worker *worker,
+ struct io_cb_cancel_data *match,
+ struct io_wq_work *work)
+{
+ if (work && match->fn(work, match->data)) {
+ work->flags |= IO_WQ_WORK_CANCEL;
+ set_notify_signal(worker->task);
+ return true;
+ }
+
+ return false;
+}
+
+static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
+{
+ struct io_cb_cancel_data *match = data;
+
+ /*
+ * Hold the lock to avoid ->cur_work going out of scope, caller
+ * may dereference the passed in work.
+ */
+ raw_spin_lock(&worker->lock);
+ if (__io_wq_worker_cancel(worker, match, worker->cur_work) ||
+ __io_wq_worker_cancel(worker, match, worker->next_work))
+ match->nr_running++;
+ raw_spin_unlock(&worker->lock);
+
+ return match->nr_running && !match->cancel_all;
+}
+
+static inline void io_wqe_remove_pending(struct io_wqe *wqe,
+ struct io_wq_work *work,
+ struct io_wq_work_node *prev)
+{
+ struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+ unsigned int hash = io_get_work_hash(work);
+ struct io_wq_work *prev_work = NULL;
+
+ if (io_wq_is_hashed(work) && work == wqe->hash_tail[hash]) {
+ if (prev)
+ prev_work = container_of(prev, struct io_wq_work, list);
+ if (prev_work && io_get_work_hash(prev_work) == hash)
+ wqe->hash_tail[hash] = prev_work;
+ else
+ wqe->hash_tail[hash] = NULL;
+ }
+ wq_list_del(&acct->work_list, &work->list, prev);
+}
+
+static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
+ struct io_wqe_acct *acct,
+ struct io_cb_cancel_data *match)
+{
+ struct io_wq_work_node *node, *prev;
+ struct io_wq_work *work;
+
+ raw_spin_lock(&acct->lock);
+ wq_list_for_each(node, prev, &acct->work_list) {
+ work = container_of(node, struct io_wq_work, list);
+ if (!match->fn(work, match->data))
+ continue;
+ io_wqe_remove_pending(wqe, work, prev);
+ raw_spin_unlock(&acct->lock);
+ io_run_cancel(work, wqe);
+ match->nr_pending++;
+ /* not safe to continue after unlock */
+ return true;
+ }
+ raw_spin_unlock(&acct->lock);
+
+ return false;
+}
+
+static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
+ struct io_cb_cancel_data *match)
+{
+ int i;
+retry:
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
+
+ if (io_acct_cancel_pending_work(wqe, acct, match)) {
+ if (match->cancel_all)
+ goto retry;
+ break;
+ }
+ }
+}
+
+static void io_wqe_cancel_running_work(struct io_wqe *wqe,
+ struct io_cb_cancel_data *match)
+{
+ rcu_read_lock();
+ io_wq_for_each_worker(wqe, io_wq_worker_cancel, match);
+ rcu_read_unlock();
+}
+
+enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
+ void *data, bool cancel_all)
+{
+ struct io_cb_cancel_data match = {
+ .fn = cancel,
+ .data = data,
+ .cancel_all = cancel_all,
+ };
+ int node;
+
+ /*
+ * First check pending list, if we're lucky we can just remove it
+ * from there. CANCEL_OK means that the work is returned as-new,
+ * no completion will be posted for it.
+ *
+ * Then check if a free (going busy) or busy worker has the work
+ * currently running. If we find it there, we'll return CANCEL_RUNNING
+ * as an indication that we attempt to signal cancellation. The
+ * completion will run normally in this case.
+ *
+ * Do both of these while holding the wqe->lock, to ensure that
+ * we'll find a work item regardless of state.
+ */
+ for_each_node(node) {
+ struct io_wqe *wqe = wq->wqes[node];
+
+ io_wqe_cancel_pending_work(wqe, &match);
+ if (match.nr_pending && !match.cancel_all)
+ return IO_WQ_CANCEL_OK;
+
+ raw_spin_lock(&wqe->lock);
+ io_wqe_cancel_running_work(wqe, &match);
+ raw_spin_unlock(&wqe->lock);
+ if (match.nr_running && !match.cancel_all)
+ return IO_WQ_CANCEL_RUNNING;
+ }
+
+ if (match.nr_running)
+ return IO_WQ_CANCEL_RUNNING;
+ if (match.nr_pending)
+ return IO_WQ_CANCEL_OK;
+ return IO_WQ_CANCEL_NOTFOUND;
+}
+
+static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
+ int sync, void *key)
+{
+ struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
+ int i;
+
+ list_del_init(&wait->entry);
+
+ rcu_read_lock();
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ struct io_wqe_acct *acct = &wqe->acct[i];
+
+ if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
+ io_wqe_activate_free_worker(wqe, acct);
+ }
+ rcu_read_unlock();
+ return 1;
+}
+
+struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
+{
+ int ret, node, i;
+ struct io_wq *wq;
+
+ if (WARN_ON_ONCE(!data->free_work || !data->do_work))
+ return ERR_PTR(-EINVAL);
+ if (WARN_ON_ONCE(!bounded))
+ return ERR_PTR(-EINVAL);
+
+ wq = kzalloc(struct_size(wq, wqes, nr_node_ids), GFP_KERNEL);
+ if (!wq)
+ return ERR_PTR(-ENOMEM);
+ ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node);
+ if (ret)
+ goto err_wq;
+
+ refcount_inc(&data->hash->refs);
+ wq->hash = data->hash;
+ wq->free_work = data->free_work;
+ wq->do_work = data->do_work;
+
+ ret = -ENOMEM;
+ for_each_node(node) {
+ struct io_wqe *wqe;
+ int alloc_node = node;
+
+ if (!node_online(alloc_node))
+ alloc_node = NUMA_NO_NODE;
+ wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
+ if (!wqe)
+ goto err;
+ if (!alloc_cpumask_var(&wqe->cpu_mask, GFP_KERNEL))
+ goto err;
+ cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
+ wq->wqes[node] = wqe;
+ wqe->node = alloc_node;
+ wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
+ wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
+ task_rlimit(current, RLIMIT_NPROC);
+ INIT_LIST_HEAD(&wqe->wait.entry);
+ wqe->wait.func = io_wqe_hash_wake;
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ struct io_wqe_acct *acct = &wqe->acct[i];
+
+ acct->index = i;
+ atomic_set(&acct->nr_running, 0);
+ INIT_WQ_LIST(&acct->work_list);
+ raw_spin_lock_init(&acct->lock);
+ }
+ wqe->wq = wq;
+ raw_spin_lock_init(&wqe->lock);
+ INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
+ INIT_LIST_HEAD(&wqe->all_list);
+ }
+
+ wq->task = get_task_struct(data->task);
+ atomic_set(&wq->worker_refs, 1);
+ init_completion(&wq->worker_done);
+ return wq;
+err:
+ io_wq_put_hash(data->hash);
+ cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
+ for_each_node(node) {
+ if (!wq->wqes[node])
+ continue;
+ free_cpumask_var(wq->wqes[node]->cpu_mask);
+ kfree(wq->wqes[node]);
+ }
+err_wq:
+ kfree(wq);
+ return ERR_PTR(ret);
+}
+
+static bool io_task_work_match(struct callback_head *cb, void *data)
+{
+ struct io_worker *worker;
+
+ if (cb->func != create_worker_cb && cb->func != create_worker_cont)
+ return false;
+ worker = container_of(cb, struct io_worker, create_work);
+ return worker->wqe->wq == data;
+}
+
+void io_wq_exit_start(struct io_wq *wq)
+{
+ set_bit(IO_WQ_BIT_EXIT, &wq->state);
+}
+
+static void io_wq_cancel_tw_create(struct io_wq *wq)
+{
+ struct callback_head *cb;
+
+ while ((cb = task_work_cancel_match(wq->task, io_task_work_match, wq)) != NULL) {
+ struct io_worker *worker;
+
+ worker = container_of(cb, struct io_worker, create_work);
+ io_worker_cancel_cb(worker);
+ }
+}
+
+static void io_wq_exit_workers(struct io_wq *wq)
+{
+ int node;
+
+ if (!wq->task)
+ return;
+
+ io_wq_cancel_tw_create(wq);
+
+ rcu_read_lock();
+ for_each_node(node) {
+ struct io_wqe *wqe = wq->wqes[node];
+
+ io_wq_for_each_worker(wqe, io_wq_worker_wake, NULL);
+ }
+ rcu_read_unlock();
+ io_worker_ref_put(wq);
+ wait_for_completion(&wq->worker_done);
+
+ for_each_node(node) {
+ spin_lock_irq(&wq->hash->wait.lock);
+ list_del_init(&wq->wqes[node]->wait.entry);
+ spin_unlock_irq(&wq->hash->wait.lock);
+ }
+ put_task_struct(wq->task);
+ wq->task = NULL;
+}
+
+static void io_wq_destroy(struct io_wq *wq)
+{
+ int node;
+
+ cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
+
+ for_each_node(node) {
+ struct io_wqe *wqe = wq->wqes[node];
+ struct io_cb_cancel_data match = {
+ .fn = io_wq_work_match_all,
+ .cancel_all = true,
+ };
+ io_wqe_cancel_pending_work(wqe, &match);
+ free_cpumask_var(wqe->cpu_mask);
+ kfree(wqe);
+ }
+ io_wq_put_hash(wq->hash);
+ kfree(wq);
+}
+
+void io_wq_put_and_exit(struct io_wq *wq)
+{
+ WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state));
+
+ io_wq_exit_workers(wq);
+ io_wq_destroy(wq);
+}
+
+struct online_data {
+ unsigned int cpu;
+ bool online;
+};
+
+static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
+{
+ struct online_data *od = data;
+
+ if (od->online)
+ cpumask_set_cpu(od->cpu, worker->wqe->cpu_mask);
+ else
+ cpumask_clear_cpu(od->cpu, worker->wqe->cpu_mask);
+ return false;
+}
+
+static int __io_wq_cpu_online(struct io_wq *wq, unsigned int cpu, bool online)
+{
+ struct online_data od = {
+ .cpu = cpu,
+ .online = online
+ };
+ int i;
+
+ rcu_read_lock();
+ for_each_node(i)
+ io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, &od);
+ rcu_read_unlock();
+ return 0;
+}
+
+static int io_wq_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+ struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
+
+ return __io_wq_cpu_online(wq, cpu, true);
+}
+
+static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+ struct io_wq *wq = hlist_entry_safe(node, struct io_wq, cpuhp_node);
+
+ return __io_wq_cpu_online(wq, cpu, false);
+}
+
+int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
+{
+ int i;
+
+ rcu_read_lock();
+ for_each_node(i) {
+ struct io_wqe *wqe = wq->wqes[i];
+
+ if (mask)
+ cpumask_copy(wqe->cpu_mask, mask);
+ else
+ cpumask_copy(wqe->cpu_mask, cpumask_of_node(i));
+ }
+ rcu_read_unlock();
+ return 0;
+}
+
+/*
+ * Set max number of unbounded workers, returns old value. If new_count is 0,
+ * then just return the old value.
+ */
+int io_wq_max_workers(struct io_wq *wq, int *new_count)
+{
+ int prev[IO_WQ_ACCT_NR];
+ bool first_node = true;
+ int i, node;
+
+ BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND != (int) IO_WQ_BOUND);
+ BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND);
+ BUILD_BUG_ON((int) IO_WQ_ACCT_NR != 2);
+
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ if (new_count[i] > task_rlimit(current, RLIMIT_NPROC))
+ new_count[i] = task_rlimit(current, RLIMIT_NPROC);
+ }
+
+ for (i = 0; i < IO_WQ_ACCT_NR; i++)
+ prev[i] = 0;
+
+ rcu_read_lock();
+ for_each_node(node) {
+ struct io_wqe *wqe = wq->wqes[node];
+ struct io_wqe_acct *acct;
+
+ raw_spin_lock(&wqe->lock);
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ acct = &wqe->acct[i];
+ if (first_node)
+ prev[i] = max_t(int, acct->max_workers, prev[i]);
+ if (new_count[i])
+ acct->max_workers = new_count[i];
+ }
+ raw_spin_unlock(&wqe->lock);
+ first_node = false;
+ }
+ rcu_read_unlock();
+
+ for (i = 0; i < IO_WQ_ACCT_NR; i++)
+ new_count[i] = prev[i];
+
+ return 0;
+}
+
+static __init int io_wq_init(void)
+{
+ int ret;
+
+ ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "io-wq/online",
+ io_wq_cpu_online, io_wq_cpu_offline);
+ if (ret < 0)
+ return ret;
+ io_wq_online = ret;
+ return 0;
+}
+subsys_initcall(io_wq_init);
diff --git a/io_uring/io-wq.h b/io_uring/io-wq.h
new file mode 100644
index 000000000000..dbecd27656c7
--- /dev/null
+++ b/io_uring/io-wq.h
@@ -0,0 +1,227 @@
+#ifndef INTERNAL_IO_WQ_H
+#define INTERNAL_IO_WQ_H
+
+#include <linux/refcount.h>
+
+struct io_wq;
+
+enum {
+ IO_WQ_WORK_CANCEL = 1,
+ IO_WQ_WORK_HASHED = 2,
+ IO_WQ_WORK_UNBOUND = 4,
+ IO_WQ_WORK_CONCURRENT = 16,
+
+ IO_WQ_HASH_SHIFT = 24, /* upper 8 bits are used for hash key */
+};
+
+enum io_wq_cancel {
+ IO_WQ_CANCEL_OK, /* cancelled before started */
+ IO_WQ_CANCEL_RUNNING, /* found, running, and attempted cancelled */
+ IO_WQ_CANCEL_NOTFOUND, /* work not found */
+};
+
+struct io_wq_work_node {
+ struct io_wq_work_node *next;
+};
+
+struct io_wq_work_list {
+ struct io_wq_work_node *first;
+ struct io_wq_work_node *last;
+};
+
+#define wq_list_for_each(pos, prv, head) \
+ for (pos = (head)->first, prv = NULL; pos; prv = pos, pos = (pos)->next)
+
+#define wq_list_for_each_resume(pos, prv) \
+ for (; pos; prv = pos, pos = (pos)->next)
+
+#define wq_list_empty(list) (READ_ONCE((list)->first) == NULL)
+#define INIT_WQ_LIST(list) do { \
+ (list)->first = NULL; \
+} while (0)
+
+static inline void wq_list_add_after(struct io_wq_work_node *node,
+ struct io_wq_work_node *pos,
+ struct io_wq_work_list *list)
+{
+ struct io_wq_work_node *next = pos->next;
+
+ pos->next = node;
+ node->next = next;
+ if (!next)
+ list->last = node;
+}
+
+/**
+ * wq_list_merge - merge the second list to the first one.
+ * @list0: the first list
+ * @list1: the second list
+ * Return the first node after mergence.
+ */
+static inline struct io_wq_work_node *wq_list_merge(struct io_wq_work_list *list0,
+ struct io_wq_work_list *list1)
+{
+ struct io_wq_work_node *ret;
+
+ if (!list0->first) {
+ ret = list1->first;
+ } else {
+ ret = list0->first;
+ list0->last->next = list1->first;
+ }
+ INIT_WQ_LIST(list0);
+ INIT_WQ_LIST(list1);
+ return ret;
+}
+
+static inline void wq_list_add_tail(struct io_wq_work_node *node,
+ struct io_wq_work_list *list)
+{
+ node->next = NULL;
+ if (!list->first) {
+ list->last = node;
+ WRITE_ONCE(list->first, node);
+ } else {
+ list->last->next = node;
+ list->last = node;
+ }
+}
+
+static inline void wq_list_add_head(struct io_wq_work_node *node,
+ struct io_wq_work_list *list)
+{
+ node->next = list->first;
+ if (!node->next)
+ list->last = node;
+ WRITE_ONCE(list->first, node);
+}
+
+static inline void wq_list_cut(struct io_wq_work_list *list,
+ struct io_wq_work_node *last,
+ struct io_wq_work_node *prev)
+{
+ /* first in the list, if prev==NULL */
+ if (!prev)
+ WRITE_ONCE(list->first, last->next);
+ else
+ prev->next = last->next;
+
+ if (last == list->last)
+ list->last = prev;
+ last->next = NULL;
+}
+
+static inline void __wq_list_splice(struct io_wq_work_list *list,
+ struct io_wq_work_node *to)
+{
+ list->last->next = to->next;
+ to->next = list->first;
+ INIT_WQ_LIST(list);
+}
+
+static inline bool wq_list_splice(struct io_wq_work_list *list,
+ struct io_wq_work_node *to)
+{
+ if (!wq_list_empty(list)) {
+ __wq_list_splice(list, to);
+ return true;
+ }
+ return false;
+}
+
+static inline void wq_stack_add_head(struct io_wq_work_node *node,
+ struct io_wq_work_node *stack)
+{
+ node->next = stack->next;
+ stack->next = node;
+}
+
+static inline void wq_list_del(struct io_wq_work_list *list,
+ struct io_wq_work_node *node,
+ struct io_wq_work_node *prev)
+{
+ wq_list_cut(list, node, prev);
+}
+
+static inline
+struct io_wq_work_node *wq_stack_extract(struct io_wq_work_node *stack)
+{
+ struct io_wq_work_node *node = stack->next;
+
+ stack->next = node->next;
+ return node;
+}
+
+struct io_wq_work {
+ struct io_wq_work_node list;
+ unsigned flags;
+};
+
+static inline struct io_wq_work *wq_next_work(struct io_wq_work *work)
+{
+ if (!work->list.next)
+ return NULL;
+
+ return container_of(work->list.next, struct io_wq_work, list);
+}
+
+typedef struct io_wq_work *(free_work_fn)(struct io_wq_work *);
+typedef void (io_wq_work_fn)(struct io_wq_work *);
+
+struct io_wq_hash {
+ refcount_t refs;
+ unsigned long map;
+ struct wait_queue_head wait;
+};
+
+static inline void io_wq_put_hash(struct io_wq_hash *hash)
+{
+ if (refcount_dec_and_test(&hash->refs))
+ kfree(hash);
+}
+
+struct io_wq_data {
+ struct io_wq_hash *hash;
+ struct task_struct *task;
+ io_wq_work_fn *do_work;
+ free_work_fn *free_work;
+};
+
+struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data);
+void io_wq_exit_start(struct io_wq *wq);
+void io_wq_put_and_exit(struct io_wq *wq);
+
+void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
+void io_wq_hash_work(struct io_wq_work *work, void *val);
+
+int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
+int io_wq_max_workers(struct io_wq *wq, int *new_count);
+
+static inline bool io_wq_is_hashed(struct io_wq_work *work)
+{
+ return work->flags & IO_WQ_WORK_HASHED;
+}
+
+typedef bool (work_cancel_fn)(struct io_wq_work *, void *);
+
+enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
+ void *data, bool cancel_all);
+
+#if defined(CONFIG_IO_WQ)
+extern void io_wq_worker_sleeping(struct task_struct *);
+extern void io_wq_worker_running(struct task_struct *);
+#else
+static inline void io_wq_worker_sleeping(struct task_struct *tsk)
+{
+}
+static inline void io_wq_worker_running(struct task_struct *tsk)
+{
+}
+#endif
+
+static inline bool io_wq_current_is_worker(void)
+{
+ return in_task() && (current->flags & PF_IO_WORKER) &&
+ current->worker_private;
+}
+#endif
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
new file mode 100644
index 000000000000..ec4e31c67e07
--- /dev/null
+++ b/io_uring/io_uring.c
@@ -0,0 +1,11915 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Shared application/kernel submission and completion ring pairs, for
+ * supporting fast/efficient IO.
+ *
+ * A note on the read/write ordering memory barriers that are matched between
+ * the application and kernel side.
+ *
+ * After the application reads the CQ ring tail, it must use an
+ * appropriate smp_rmb() to pair with the smp_wmb() the kernel uses
+ * before writing the tail (using smp_load_acquire to read the tail will
+ * do). It also needs a smp_mb() before updating CQ head (ordering the
+ * entry load(s) with the head store), pairing with an implicit barrier
+ * through a control-dependency in io_get_cqe (smp_store_release to
+ * store head will do). Failure to do so could lead to reading invalid
+ * CQ entries.
+ *
+ * Likewise, the application must use an appropriate smp_wmb() before
+ * writing the SQ tail (ordering SQ entry stores with the tail store),
+ * which pairs with smp_load_acquire in io_get_sqring (smp_store_release
+ * to store the tail will do). And it needs a barrier ordering the SQ
+ * head load before writing new SQ entries (smp_load_acquire to read
+ * head will do).
+ *
+ * When using the SQ poll thread (IORING_SETUP_SQPOLL), the application
+ * needs to check the SQ flags for IORING_SQ_NEED_WAKEUP *after*
+ * updating the SQ tail; a full memory barrier smp_mb() is needed
+ * between.
+ *
+ * Also see the examples in the liburing library:
+ *
+ * git://git.kernel.dk/liburing
+ *
+ * io_uring also uses READ/WRITE_ONCE() for _any_ store or load that happens
+ * from data shared between the kernel and application. This is done both
+ * for ordering purposes, but also to ensure that once a value is loaded from
+ * data that the application could potentially modify, it remains stable.
+ *
+ * Copyright (C) 2018-2019 Jens Axboe
+ * Copyright (c) 2018-2019 Christoph Hellwig
+ */
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/syscalls.h>
+#include <linux/compat.h>
+#include <net/compat.h>
+#include <linux/refcount.h>
+#include <linux/uio.h>
+#include <linux/bits.h>
+
+#include <linux/sched/signal.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/fdtable.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/percpu.h>
+#include <linux/slab.h>
+#include <linux/blk-mq.h>
+#include <linux/bvec.h>
+#include <linux/net.h>
+#include <net/sock.h>
+#include <net/af_unix.h>
+#include <net/scm.h>
+#include <linux/anon_inodes.h>
+#include <linux/sched/mm.h>
+#include <linux/uaccess.h>
+#include <linux/nospec.h>
+#include <linux/sizes.h>
+#include <linux/hugetlb.h>
+#include <linux/highmem.h>
+#include <linux/namei.h>
+#include <linux/fsnotify.h>
+#include <linux/fadvise.h>
+#include <linux/eventpoll.h>
+#include <linux/splice.h>
+#include <linux/task_work.h>
+#include <linux/pagemap.h>
+#include <linux/io_uring.h>
+#include <linux/audit.h>
+#include <linux/security.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/io_uring.h>
+
+#include <uapi/linux/io_uring.h>
+
+#include "../fs/internal.h"
+#include "io-wq.h"
+
+#define IORING_MAX_ENTRIES 32768
+#define IORING_MAX_CQ_ENTRIES (2 * IORING_MAX_ENTRIES)
+#define IORING_SQPOLL_CAP_ENTRIES_VALUE 8
+
+/* only define max */
+#define IORING_MAX_FIXED_FILES (1U << 15)
+#define IORING_MAX_RESTRICTIONS (IORING_RESTRICTION_LAST + \
+ IORING_REGISTER_LAST + IORING_OP_LAST)
+
+#define IO_RSRC_TAG_TABLE_SHIFT (PAGE_SHIFT - 3)
+#define IO_RSRC_TAG_TABLE_MAX (1U << IO_RSRC_TAG_TABLE_SHIFT)
+#define IO_RSRC_TAG_TABLE_MASK (IO_RSRC_TAG_TABLE_MAX - 1)
+
+#define IORING_MAX_REG_BUFFERS (1U << 14)
+
+#define SQE_COMMON_FLAGS (IOSQE_FIXED_FILE | IOSQE_IO_LINK | \
+ IOSQE_IO_HARDLINK | IOSQE_ASYNC)
+
+#define SQE_VALID_FLAGS (SQE_COMMON_FLAGS | IOSQE_BUFFER_SELECT | \
+ IOSQE_IO_DRAIN | IOSQE_CQE_SKIP_SUCCESS)
+
+#define IO_REQ_CLEAN_FLAGS (REQ_F_BUFFER_SELECTED | REQ_F_NEED_CLEANUP | \
+ REQ_F_POLLED | REQ_F_INFLIGHT | REQ_F_CREDS | \
+ REQ_F_ASYNC_DATA)
+
+#define IO_TCTX_REFS_CACHE_NR (1U << 10)
+
+struct io_uring {
+ u32 head ____cacheline_aligned_in_smp;
+ u32 tail ____cacheline_aligned_in_smp;
+};
+
+/*
+ * This data is shared with the application through the mmap at offsets
+ * IORING_OFF_SQ_RING and IORING_OFF_CQ_RING.
+ *
+ * The offsets to the member fields are published through struct
+ * io_sqring_offsets when calling io_uring_setup.
+ */
+struct io_rings {
+ /*
+ * Head and tail offsets into the ring; the offsets need to be
+ * masked to get valid indices.
+ *
+ * The kernel controls head of the sq ring and the tail of the cq ring,
+ * and the application controls tail of the sq ring and the head of the
+ * cq ring.
+ */
+ struct io_uring sq, cq;
+ /*
+ * Bitmasks to apply to head and tail offsets (constant, equals
+ * ring_entries - 1)
+ */
+ u32 sq_ring_mask, cq_ring_mask;
+ /* Ring sizes (constant, power of 2) */
+ u32 sq_ring_entries, cq_ring_entries;
+ /*
+ * Number of invalid entries dropped by the kernel due to
+ * invalid index stored in array
+ *
+ * Written by the kernel, shouldn't be modified by the
+ * application (i.e. get number of "new events" by comparing to
+ * cached value).
+ *
+ * After a new SQ head value was read by the application this
+ * counter includes all submissions that were dropped reaching
+ * the new SQ head (and possibly more).
+ */
+ u32 sq_dropped;
+ /*
+ * Runtime SQ flags
+ *
+ * Written by the kernel, shouldn't be modified by the
+ * application.
+ *
+ * The application needs a full memory barrier before checking
+ * for IORING_SQ_NEED_WAKEUP after updating the sq tail.
+ */
+ u32 sq_flags;
+ /*
+ * Runtime CQ flags
+ *
+ * Written by the application, shouldn't be modified by the
+ * kernel.
+ */
+ u32 cq_flags;
+ /*
+ * Number of completion events lost because the queue was full;
+ * this should be avoided by the application by making sure
+ * there are not more requests pending than there is space in
+ * the completion queue.
+ *
+ * Written by the kernel, shouldn't be modified by the
+ * application (i.e. get number of "new events" by comparing to
+ * cached value).
+ *
+ * As completion events come in out of order this counter is not
+ * ordered with any other data.
+ */
+ u32 cq_overflow;
+ /*
+ * Ring buffer of completion events.
+ *
+ * The kernel writes completion events fresh every time they are
+ * produced, so the application is allowed to modify pending
+ * entries.
+ */
+ struct io_uring_cqe cqes[] ____cacheline_aligned_in_smp;
+};
+
+enum io_uring_cmd_flags {
+ IO_URING_F_COMPLETE_DEFER = 1,
+ IO_URING_F_UNLOCKED = 2,
+ /* int's last bit, sign checks are usually faster than a bit test */
+ IO_URING_F_NONBLOCK = INT_MIN,
+};
+
+struct io_mapped_ubuf {
+ u64 ubuf;
+ u64 ubuf_end;
+ unsigned int nr_bvecs;
+ unsigned long acct_pages;
+ struct bio_vec bvec[];
+};
+
+struct io_ring_ctx;
+
+struct io_overflow_cqe {
+ struct io_uring_cqe cqe;
+ struct list_head list;
+};
+
+struct io_fixed_file {
+ /* file * with additional FFS_* flags */
+ unsigned long file_ptr;
+};
+
+struct io_rsrc_put {
+ struct list_head list;
+ u64 tag;
+ union {
+ void *rsrc;
+ struct file *file;
+ struct io_mapped_ubuf *buf;
+ };
+};
+
+struct io_file_table {
+ struct io_fixed_file *files;
+};
+
+struct io_rsrc_node {
+ struct percpu_ref refs;
+ struct list_head node;
+ struct list_head rsrc_list;
+ struct io_rsrc_data *rsrc_data;
+ struct llist_node llist;
+ bool done;
+};
+
+typedef void (rsrc_put_fn)(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc);
+
+struct io_rsrc_data {
+ struct io_ring_ctx *ctx;
+
+ u64 **tags;
+ unsigned int nr;
+ rsrc_put_fn *do_put;
+ atomic_t refs;
+ struct completion done;
+ bool quiesce;
+};
+
+struct io_buffer_list {
+ struct list_head list;
+ struct list_head buf_list;
+ __u16 bgid;
+};
+
+struct io_buffer {
+ struct list_head list;
+ __u64 addr;
+ __u32 len;
+ __u16 bid;
+ __u16 bgid;
+};
+
+struct io_restriction {
+ DECLARE_BITMAP(register_op, IORING_REGISTER_LAST);
+ DECLARE_BITMAP(sqe_op, IORING_OP_LAST);
+ u8 sqe_flags_allowed;
+ u8 sqe_flags_required;
+ bool registered;
+};
+
+enum {
+ IO_SQ_THREAD_SHOULD_STOP = 0,
+ IO_SQ_THREAD_SHOULD_PARK,
+};
+
+struct io_sq_data {
+ refcount_t refs;
+ atomic_t park_pending;
+ struct mutex lock;
+
+ /* ctx's that are using this sqd */
+ struct list_head ctx_list;
+
+ struct task_struct *thread;
+ struct wait_queue_head wait;
+
+ unsigned sq_thread_idle;
+ int sq_cpu;
+ pid_t task_pid;
+ pid_t task_tgid;
+
+ unsigned long state;
+ struct completion exited;
+};
+
+#define IO_COMPL_BATCH 32
+#define IO_REQ_CACHE_SIZE 32
+#define IO_REQ_ALLOC_BATCH 8
+
+struct io_submit_link {
+ struct io_kiocb *head;
+ struct io_kiocb *last;
+};
+
+struct io_submit_state {
+ /* inline/task_work completion list, under ->uring_lock */
+ struct io_wq_work_node free_list;
+ /* batch completion logic */
+ struct io_wq_work_list compl_reqs;
+ struct io_submit_link link;
+
+ bool plug_started;
+ bool need_plug;
+ bool flush_cqes;
+ unsigned short submit_nr;
+ struct blk_plug plug;
+};
+
+struct io_ev_fd {
+ struct eventfd_ctx *cq_ev_fd;
+ unsigned int eventfd_async: 1;
+ struct rcu_head rcu;
+};
+
+#define IO_BUFFERS_HASH_BITS 5
+
+struct io_ring_ctx {
+ /* const or read-mostly hot data */
+ struct {
+ struct percpu_ref refs;
+
+ struct io_rings *rings;
+ unsigned int flags;
+ unsigned int compat: 1;
+ unsigned int drain_next: 1;
+ unsigned int restricted: 1;
+ unsigned int off_timeout_used: 1;
+ unsigned int drain_active: 1;
+ unsigned int drain_disabled: 1;
+ unsigned int has_evfd: 1;
+ } ____cacheline_aligned_in_smp;
+
+ /* submission data */
+ struct {
+ struct mutex uring_lock;
+
+ /*
+ * Ring buffer of indices into array of io_uring_sqe, which is
+ * mmapped by the application using the IORING_OFF_SQES offset.
+ *
+ * This indirection could e.g. be used to assign fixed
+ * io_uring_sqe entries to operations and only submit them to
+ * the queue when needed.
+ *
+ * The kernel modifies neither the indices array nor the entries
+ * array.
+ */
+ u32 *sq_array;
+ struct io_uring_sqe *sq_sqes;
+ unsigned cached_sq_head;
+ unsigned sq_entries;
+ struct list_head defer_list;
+
+ /*
+ * Fixed resources fast path, should be accessed only under
+ * uring_lock, and updated through io_uring_register(2)
+ */
+ struct io_rsrc_node *rsrc_node;
+ int rsrc_cached_refs;
+ struct io_file_table file_table;
+ unsigned nr_user_files;
+ unsigned nr_user_bufs;
+ struct io_mapped_ubuf **user_bufs;
+
+ struct io_submit_state submit_state;
+ struct list_head timeout_list;
+ struct list_head ltimeout_list;
+ struct list_head cq_overflow_list;
+ struct list_head *io_buffers;
+ struct list_head io_buffers_cache;
+ struct list_head apoll_cache;
+ struct xarray personalities;
+ u32 pers_next;
+ unsigned sq_thread_idle;
+ } ____cacheline_aligned_in_smp;
+
+ /* IRQ completion list, under ->completion_lock */
+ struct io_wq_work_list locked_free_list;
+ unsigned int locked_free_nr;
+
+ const struct cred *sq_creds; /* cred used for __io_sq_thread() */
+ struct io_sq_data *sq_data; /* if using sq thread polling */
+
+ struct wait_queue_head sqo_sq_wait;
+ struct list_head sqd_list;
+
+ unsigned long check_cq_overflow;
+
+ struct {
+ unsigned cached_cq_tail;
+ unsigned cq_entries;
+ struct io_ev_fd __rcu *io_ev_fd;
+ struct wait_queue_head cq_wait;
+ unsigned cq_extra;
+ atomic_t cq_timeouts;
+ unsigned cq_last_tm_flush;
+ } ____cacheline_aligned_in_smp;
+
+ struct {
+ spinlock_t completion_lock;
+
+ spinlock_t timeout_lock;
+
+ /*
+ * ->iopoll_list is protected by the ctx->uring_lock for
+ * io_uring instances that don't use IORING_SETUP_SQPOLL.
+ * For SQPOLL, only the single threaded io_sq_thread() will
+ * manipulate the list, hence no extra locking is needed there.
+ */
+ struct io_wq_work_list iopoll_list;
+ struct hlist_head *cancel_hash;
+ unsigned cancel_hash_bits;
+ bool poll_multi_queue;
+
+ struct list_head io_buffers_comp;
+ } ____cacheline_aligned_in_smp;
+
+ struct io_restriction restrictions;
+
+ /* slow path rsrc auxilary data, used by update/register */
+ struct {
+ struct io_rsrc_node *rsrc_backup_node;
+ struct io_mapped_ubuf *dummy_ubuf;
+ struct io_rsrc_data *file_data;
+ struct io_rsrc_data *buf_data;
+
+ struct delayed_work rsrc_put_work;
+ struct llist_head rsrc_put_llist;
+ struct list_head rsrc_ref_list;
+ spinlock_t rsrc_ref_lock;
+
+ struct list_head io_buffers_pages;
+ };
+
+ /* Keep this last, we don't need it for the fast path */
+ struct {
+ #if defined(CONFIG_UNIX)
+ struct socket *ring_sock;
+ #endif
+ /* hashed buffered write serialization */
+ struct io_wq_hash *hash_map;
+
+ /* Only used for accounting purposes */
+ struct user_struct *user;
+ struct mm_struct *mm_account;
+
+ /* ctx exit and cancelation */
+ struct llist_head fallback_llist;
+ struct delayed_work fallback_work;
+ struct work_struct exit_work;
+ struct list_head tctx_list;
+ struct completion ref_comp;
+ u32 iowq_limits[2];
+ bool iowq_limits_set;
+ };
+};
+
+/*
+ * Arbitrary limit, can be raised if need be
+ */
+#define IO_RINGFD_REG_MAX 16
+
+struct io_uring_task {
+ /* submission side */
+ int cached_refs;
+ struct xarray xa;
+ struct wait_queue_head wait;
+ const struct io_ring_ctx *last;
+ struct io_wq *io_wq;
+ struct percpu_counter inflight;
+ atomic_t inflight_tracked;
+ atomic_t in_idle;
+
+ spinlock_t task_lock;
+ struct io_wq_work_list task_list;
+ struct io_wq_work_list prior_task_list;
+ struct callback_head task_work;
+ struct file **registered_rings;
+ bool task_running;
+};
+
+/*
+ * First field must be the file pointer in all the
+ * iocb unions! See also 'struct kiocb' in <linux/fs.h>
+ */
+struct io_poll_iocb {
+ struct file *file;
+ struct wait_queue_head *head;
+ __poll_t events;
+ struct wait_queue_entry wait;
+};
+
+struct io_poll_update {
+ struct file *file;
+ u64 old_user_data;
+ u64 new_user_data;
+ __poll_t events;
+ bool update_events;
+ bool update_user_data;
+};
+
+struct io_close {
+ struct file *file;
+ int fd;
+ u32 file_slot;
+};
+
+struct io_timeout_data {
+ struct io_kiocb *req;
+ struct hrtimer timer;
+ struct timespec64 ts;
+ enum hrtimer_mode mode;
+ u32 flags;
+};
+
+struct io_accept {
+ struct file *file;
+ struct sockaddr __user *addr;
+ int __user *addr_len;
+ int flags;
+ u32 file_slot;
+ unsigned long nofile;
+};
+
+struct io_sync {
+ struct file *file;
+ loff_t len;
+ loff_t off;
+ int flags;
+ int mode;
+};
+
+struct io_cancel {
+ struct file *file;
+ u64 addr;
+};
+
+struct io_timeout {
+ struct file *file;
+ u32 off;
+ u32 target_seq;
+ struct list_head list;
+ /* head of the link, used by linked timeouts only */
+ struct io_kiocb *head;
+ /* for linked completions */
+ struct io_kiocb *prev;
+};
+
+struct io_timeout_rem {
+ struct file *file;
+ u64 addr;
+
+ /* timeout update */
+ struct timespec64 ts;
+ u32 flags;
+ bool ltimeout;
+};
+
+struct io_rw {
+ /* NOTE: kiocb has the file as the first member, so don't do it here */
+ struct kiocb kiocb;
+ u64 addr;
+ u32 len;
+ u32 flags;
+};
+
+struct io_connect {
+ struct file *file;
+ struct sockaddr __user *addr;
+ int addr_len;
+};
+
+struct io_sr_msg {
+ struct file *file;
+ union {
+ struct compat_msghdr __user *umsg_compat;
+ struct user_msghdr __user *umsg;
+ void __user *buf;
+ };
+ int msg_flags;
+ int bgid;
+ size_t len;
+ size_t done_io;
+};
+
+struct io_open {
+ struct file *file;
+ int dfd;
+ u32 file_slot;
+ struct filename *filename;
+ struct open_how how;
+ unsigned long nofile;
+};
+
+struct io_rsrc_update {
+ struct file *file;
+ u64 arg;
+ u32 nr_args;
+ u32 offset;
+};
+
+struct io_fadvise {
+ struct file *file;
+ u64 offset;
+ u32 len;
+ u32 advice;
+};
+
+struct io_madvise {
+ struct file *file;
+ u64 addr;
+ u32 len;
+ u32 advice;
+};
+
+struct io_epoll {
+ struct file *file;
+ int epfd;
+ int op;
+ int fd;
+ struct epoll_event event;
+};
+
+struct io_splice {
+ struct file *file_out;
+ loff_t off_out;
+ loff_t off_in;
+ u64 len;
+ int splice_fd_in;
+ unsigned int flags;
+};
+
+struct io_provide_buf {
+ struct file *file;
+ __u64 addr;
+ __u32 len;
+ __u32 bgid;
+ __u16 nbufs;
+ __u16 bid;
+};
+
+struct io_statx {
+ struct file *file;
+ int dfd;
+ unsigned int mask;
+ unsigned int flags;
+ struct filename *filename;
+ struct statx __user *buffer;
+};
+
+struct io_shutdown {
+ struct file *file;
+ int how;
+};
+
+struct io_rename {
+ struct file *file;
+ int old_dfd;
+ int new_dfd;
+ struct filename *oldpath;
+ struct filename *newpath;
+ int flags;
+};
+
+struct io_unlink {
+ struct file *file;
+ int dfd;
+ int flags;
+ struct filename *filename;
+};
+
+struct io_mkdir {
+ struct file *file;
+ int dfd;
+ umode_t mode;
+ struct filename *filename;
+};
+
+struct io_symlink {
+ struct file *file;
+ int new_dfd;
+ struct filename *oldpath;
+ struct filename *newpath;
+};
+
+struct io_hardlink {
+ struct file *file;
+ int old_dfd;
+ int new_dfd;
+ struct filename *oldpath;
+ struct filename *newpath;
+ int flags;
+};
+
+struct io_msg {
+ struct file *file;
+ u64 user_data;
+ u32 len;
+};
+
+struct io_async_connect {
+ struct sockaddr_storage address;
+};
+
+struct io_async_msghdr {
+ struct iovec fast_iov[UIO_FASTIOV];
+ /* points to an allocated iov, if NULL we use fast_iov instead */
+ struct iovec *free_iov;
+ struct sockaddr __user *uaddr;
+ struct msghdr msg;
+ struct sockaddr_storage addr;
+};
+
+struct io_rw_state {
+ struct iov_iter iter;
+ struct iov_iter_state iter_state;
+ struct iovec fast_iov[UIO_FASTIOV];
+};
+
+struct io_async_rw {
+ struct io_rw_state s;
+ const struct iovec *free_iovec;
+ size_t bytes_done;
+ struct wait_page_queue wpq;
+};
+
+enum {
+ REQ_F_FIXED_FILE_BIT = IOSQE_FIXED_FILE_BIT,
+ REQ_F_IO_DRAIN_BIT = IOSQE_IO_DRAIN_BIT,
+ REQ_F_LINK_BIT = IOSQE_IO_LINK_BIT,
+ REQ_F_HARDLINK_BIT = IOSQE_IO_HARDLINK_BIT,
+ REQ_F_FORCE_ASYNC_BIT = IOSQE_ASYNC_BIT,
+ REQ_F_BUFFER_SELECT_BIT = IOSQE_BUFFER_SELECT_BIT,
+ REQ_F_CQE_SKIP_BIT = IOSQE_CQE_SKIP_SUCCESS_BIT,
+
+ /* first byte is taken by user flags, shift it to not overlap */
+ REQ_F_FAIL_BIT = 8,
+ REQ_F_INFLIGHT_BIT,
+ REQ_F_CUR_POS_BIT,
+ REQ_F_NOWAIT_BIT,
+ REQ_F_LINK_TIMEOUT_BIT,
+ REQ_F_NEED_CLEANUP_BIT,
+ REQ_F_POLLED_BIT,
+ REQ_F_BUFFER_SELECTED_BIT,
+ REQ_F_COMPLETE_INLINE_BIT,
+ REQ_F_REISSUE_BIT,
+ REQ_F_CREDS_BIT,
+ REQ_F_REFCOUNT_BIT,
+ REQ_F_ARM_LTIMEOUT_BIT,
+ REQ_F_ASYNC_DATA_BIT,
+ REQ_F_SKIP_LINK_CQES_BIT,
+ REQ_F_SINGLE_POLL_BIT,
+ REQ_F_DOUBLE_POLL_BIT,
+ REQ_F_PARTIAL_IO_BIT,
+ /* keep async read/write and isreg together and in order */
+ REQ_F_SUPPORT_NOWAIT_BIT,
+ REQ_F_ISREG_BIT,
+
+ /* not a real bit, just to check we're not overflowing the space */
+ __REQ_F_LAST_BIT,
+};
+
+enum {
+ /* ctx owns file */
+ REQ_F_FIXED_FILE = BIT(REQ_F_FIXED_FILE_BIT),
+ /* drain existing IO first */
+ REQ_F_IO_DRAIN = BIT(REQ_F_IO_DRAIN_BIT),
+ /* linked sqes */
+ REQ_F_LINK = BIT(REQ_F_LINK_BIT),
+ /* doesn't sever on completion < 0 */
+ REQ_F_HARDLINK = BIT(REQ_F_HARDLINK_BIT),
+ /* IOSQE_ASYNC */
+ REQ_F_FORCE_ASYNC = BIT(REQ_F_FORCE_ASYNC_BIT),
+ /* IOSQE_BUFFER_SELECT */
+ REQ_F_BUFFER_SELECT = BIT(REQ_F_BUFFER_SELECT_BIT),
+ /* IOSQE_CQE_SKIP_SUCCESS */
+ REQ_F_CQE_SKIP = BIT(REQ_F_CQE_SKIP_BIT),
+
+ /* fail rest of links */
+ REQ_F_FAIL = BIT(REQ_F_FAIL_BIT),
+ /* on inflight list, should be cancelled and waited on exit reliably */
+ REQ_F_INFLIGHT = BIT(REQ_F_INFLIGHT_BIT),
+ /* read/write uses file position */
+ REQ_F_CUR_POS = BIT(REQ_F_CUR_POS_BIT),
+ /* must not punt to workers */
+ REQ_F_NOWAIT = BIT(REQ_F_NOWAIT_BIT),
+ /* has or had linked timeout */
+ REQ_F_LINK_TIMEOUT = BIT(REQ_F_LINK_TIMEOUT_BIT),
+ /* needs cleanup */
+ REQ_F_NEED_CLEANUP = BIT(REQ_F_NEED_CLEANUP_BIT),
+ /* already went through poll handler */
+ REQ_F_POLLED = BIT(REQ_F_POLLED_BIT),
+ /* buffer already selected */
+ REQ_F_BUFFER_SELECTED = BIT(REQ_F_BUFFER_SELECTED_BIT),
+ /* completion is deferred through io_comp_state */
+ REQ_F_COMPLETE_INLINE = BIT(REQ_F_COMPLETE_INLINE_BIT),
+ /* caller should reissue async */
+ REQ_F_REISSUE = BIT(REQ_F_REISSUE_BIT),
+ /* supports async reads/writes */
+ REQ_F_SUPPORT_NOWAIT = BIT(REQ_F_SUPPORT_NOWAIT_BIT),
+ /* regular file */
+ REQ_F_ISREG = BIT(REQ_F_ISREG_BIT),
+ /* has creds assigned */
+ REQ_F_CREDS = BIT(REQ_F_CREDS_BIT),
+ /* skip refcounting if not set */
+ REQ_F_REFCOUNT = BIT(REQ_F_REFCOUNT_BIT),
+ /* there is a linked timeout that has to be armed */
+ REQ_F_ARM_LTIMEOUT = BIT(REQ_F_ARM_LTIMEOUT_BIT),
+ /* ->async_data allocated */
+ REQ_F_ASYNC_DATA = BIT(REQ_F_ASYNC_DATA_BIT),
+ /* don't post CQEs while failing linked requests */
+ REQ_F_SKIP_LINK_CQES = BIT(REQ_F_SKIP_LINK_CQES_BIT),
+ /* single poll may be active */
+ REQ_F_SINGLE_POLL = BIT(REQ_F_SINGLE_POLL_BIT),
+ /* double poll may active */
+ REQ_F_DOUBLE_POLL = BIT(REQ_F_DOUBLE_POLL_BIT),
+ /* request has already done partial IO */
+ REQ_F_PARTIAL_IO = BIT(REQ_F_PARTIAL_IO_BIT),
+};
+
+struct async_poll {
+ struct io_poll_iocb poll;
+ struct io_poll_iocb *double_poll;
+};
+
+typedef void (*io_req_tw_func_t)(struct io_kiocb *req, bool *locked);
+
+struct io_task_work {
+ union {
+ struct io_wq_work_node node;
+ struct llist_node fallback_node;
+ };
+ io_req_tw_func_t func;
+};
+
+enum {
+ IORING_RSRC_FILE = 0,
+ IORING_RSRC_BUFFER = 1,
+};
+
+/*
+ * NOTE! Each of the iocb union members has the file pointer
+ * as the first entry in their struct definition. So you can
+ * access the file pointer through any of the sub-structs,
+ * or directly as just 'file' in this struct.
+ */
+struct io_kiocb {
+ union {
+ struct file *file;
+ struct io_rw rw;
+ struct io_poll_iocb poll;
+ struct io_poll_update poll_update;
+ struct io_accept accept;
+ struct io_sync sync;
+ struct io_cancel cancel;
+ struct io_timeout timeout;
+ struct io_timeout_rem timeout_rem;
+ struct io_connect connect;
+ struct io_sr_msg sr_msg;
+ struct io_open open;
+ struct io_close close;
+ struct io_rsrc_update rsrc_update;
+ struct io_fadvise fadvise;
+ struct io_madvise madvise;
+ struct io_epoll epoll;
+ struct io_splice splice;
+ struct io_provide_buf pbuf;
+ struct io_statx statx;
+ struct io_shutdown shutdown;
+ struct io_rename rename;
+ struct io_unlink unlink;
+ struct io_mkdir mkdir;
+ struct io_symlink symlink;
+ struct io_hardlink hardlink;
+ struct io_msg msg;
+ };
+
+ u8 opcode;
+ /* polled IO has completed */
+ u8 iopoll_completed;
+ u16 buf_index;
+ unsigned int flags;
+
+ u64 user_data;
+ u32 result;
+ /* fd initially, then cflags for completion */
+ union {
+ u32 cflags;
+ int fd;
+ };
+
+ struct io_ring_ctx *ctx;
+ struct task_struct *task;
+
+ struct percpu_ref *fixed_rsrc_refs;
+ /* store used ubuf, so we can prevent reloading */
+ struct io_mapped_ubuf *imu;
+
+ union {
+ /* used by request caches, completion batching and iopoll */
+ struct io_wq_work_node comp_list;
+ /* cache ->apoll->events */
+ __poll_t apoll_events;
+ };
+ atomic_t refs;
+ atomic_t poll_refs;
+ struct io_task_work io_task_work;
+ /* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
+ struct hlist_node hash_node;
+ /* internal polling, see IORING_FEAT_FAST_POLL */
+ struct async_poll *apoll;
+ /* opcode allocated if it needs to store data for async defer */
+ void *async_data;
+ /* stores selected buf, valid IFF REQ_F_BUFFER_SELECTED is set */
+ struct io_buffer *kbuf;
+ /* linked requests, IFF REQ_F_HARDLINK or REQ_F_LINK are set */
+ struct io_kiocb *link;
+ /* custom credentials, valid IFF REQ_F_CREDS is set */
+ const struct cred *creds;
+ struct io_wq_work work;
+};
+
+struct io_tctx_node {
+ struct list_head ctx_node;
+ struct task_struct *task;
+ struct io_ring_ctx *ctx;
+};
+
+struct io_defer_entry {
+ struct list_head list;
+ struct io_kiocb *req;
+ u32 seq;
+};
+
+struct io_op_def {
+ /* needs req->file assigned */
+ unsigned needs_file : 1;
+ /* should block plug */
+ unsigned plug : 1;
+ /* hash wq insertion if file is a regular file */
+ unsigned hash_reg_file : 1;
+ /* unbound wq insertion if file is a non-regular file */
+ unsigned unbound_nonreg_file : 1;
+ /* set if opcode supports polled "wait" */
+ unsigned pollin : 1;
+ unsigned pollout : 1;
+ unsigned poll_exclusive : 1;
+ /* op supports buffer selection */
+ unsigned buffer_select : 1;
+ /* do prep async if is going to be punted */
+ unsigned needs_async_setup : 1;
+ /* opcode is not supported by this kernel */
+ unsigned not_supported : 1;
+ /* skip auditing */
+ unsigned audit_skip : 1;
+ /* size of async data needed, if any */
+ unsigned short async_size;
+};
+
+static const struct io_op_def io_op_defs[] = {
+ [IORING_OP_NOP] = {},
+ [IORING_OP_READV] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollin = 1,
+ .buffer_select = 1,
+ .needs_async_setup = 1,
+ .plug = 1,
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_async_rw),
+ },
+ [IORING_OP_WRITEV] = {
+ .needs_file = 1,
+ .hash_reg_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollout = 1,
+ .needs_async_setup = 1,
+ .plug = 1,
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_async_rw),
+ },
+ [IORING_OP_FSYNC] = {
+ .needs_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_READ_FIXED] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollin = 1,
+ .plug = 1,
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_async_rw),
+ },
+ [IORING_OP_WRITE_FIXED] = {
+ .needs_file = 1,
+ .hash_reg_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollout = 1,
+ .plug = 1,
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_async_rw),
+ },
+ [IORING_OP_POLL_ADD] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_POLL_REMOVE] = {
+ .audit_skip = 1,
+ },
+ [IORING_OP_SYNC_FILE_RANGE] = {
+ .needs_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_SENDMSG] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollout = 1,
+ .needs_async_setup = 1,
+ .async_size = sizeof(struct io_async_msghdr),
+ },
+ [IORING_OP_RECVMSG] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollin = 1,
+ .buffer_select = 1,
+ .needs_async_setup = 1,
+ .async_size = sizeof(struct io_async_msghdr),
+ },
+ [IORING_OP_TIMEOUT] = {
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_timeout_data),
+ },
+ [IORING_OP_TIMEOUT_REMOVE] = {
+ /* used by timeout updates' prep() */
+ .audit_skip = 1,
+ },
+ [IORING_OP_ACCEPT] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollin = 1,
+ .poll_exclusive = 1,
+ },
+ [IORING_OP_ASYNC_CANCEL] = {
+ .audit_skip = 1,
+ },
+ [IORING_OP_LINK_TIMEOUT] = {
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_timeout_data),
+ },
+ [IORING_OP_CONNECT] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollout = 1,
+ .needs_async_setup = 1,
+ .async_size = sizeof(struct io_async_connect),
+ },
+ [IORING_OP_FALLOCATE] = {
+ .needs_file = 1,
+ },
+ [IORING_OP_OPENAT] = {},
+ [IORING_OP_CLOSE] = {},
+ [IORING_OP_FILES_UPDATE] = {
+ .audit_skip = 1,
+ },
+ [IORING_OP_STATX] = {
+ .audit_skip = 1,
+ },
+ [IORING_OP_READ] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollin = 1,
+ .buffer_select = 1,
+ .plug = 1,
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_async_rw),
+ },
+ [IORING_OP_WRITE] = {
+ .needs_file = 1,
+ .hash_reg_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollout = 1,
+ .plug = 1,
+ .audit_skip = 1,
+ .async_size = sizeof(struct io_async_rw),
+ },
+ [IORING_OP_FADVISE] = {
+ .needs_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_MADVISE] = {},
+ [IORING_OP_SEND] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollout = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_RECV] = {
+ .needs_file = 1,
+ .unbound_nonreg_file = 1,
+ .pollin = 1,
+ .buffer_select = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_OPENAT2] = {
+ },
+ [IORING_OP_EPOLL_CTL] = {
+ .unbound_nonreg_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_SPLICE] = {
+ .needs_file = 1,
+ .hash_reg_file = 1,
+ .unbound_nonreg_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_PROVIDE_BUFFERS] = {
+ .audit_skip = 1,
+ },
+ [IORING_OP_REMOVE_BUFFERS] = {
+ .audit_skip = 1,
+ },
+ [IORING_OP_TEE] = {
+ .needs_file = 1,
+ .hash_reg_file = 1,
+ .unbound_nonreg_file = 1,
+ .audit_skip = 1,
+ },
+ [IORING_OP_SHUTDOWN] = {
+ .needs_file = 1,
+ },
+ [IORING_OP_RENAMEAT] = {},
+ [IORING_OP_UNLINKAT] = {},
+ [IORING_OP_MKDIRAT] = {},
+ [IORING_OP_SYMLINKAT] = {},
+ [IORING_OP_LINKAT] = {},
+ [IORING_OP_MSG_RING] = {
+ .needs_file = 1,
+ },
+};
+
+/* requests with any of those set should undergo io_disarm_next() */
+#define IO_DISARM_MASK (REQ_F_ARM_LTIMEOUT | REQ_F_LINK_TIMEOUT | REQ_F_FAIL)
+
+static bool io_disarm_next(struct io_kiocb *req);
+static void io_uring_del_tctx_node(unsigned long index);
+static void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
+ struct task_struct *task,
+ bool cancel_all);
+static void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd);
+
+static void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags);
+
+static void io_put_req(struct io_kiocb *req);
+static void io_put_req_deferred(struct io_kiocb *req);
+static void io_dismantle_req(struct io_kiocb *req);
+static void io_queue_linked_timeout(struct io_kiocb *req);
+static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
+ struct io_uring_rsrc_update2 *up,
+ unsigned nr_args);
+static void io_clean_op(struct io_kiocb *req);
+static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
+ unsigned issue_flags);
+static inline struct file *io_file_get_normal(struct io_kiocb *req, int fd);
+static void __io_queue_sqe(struct io_kiocb *req);
+static void io_rsrc_put_work(struct work_struct *work);
+
+static void io_req_task_queue(struct io_kiocb *req);
+static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
+static int io_req_prep_async(struct io_kiocb *req);
+
+static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
+ unsigned int issue_flags, u32 slot_index);
+static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags);
+
+static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer);
+static void io_eventfd_signal(struct io_ring_ctx *ctx);
+
+static struct kmem_cache *req_cachep;
+
+static const struct file_operations io_uring_fops;
+
+struct sock *io_uring_get_socket(struct file *file)
+{
+#if defined(CONFIG_UNIX)
+ if (file->f_op == &io_uring_fops) {
+ struct io_ring_ctx *ctx = file->private_data;
+
+ return ctx->ring_sock->sk;
+ }
+#endif
+ return NULL;
+}
+EXPORT_SYMBOL(io_uring_get_socket);
+
+static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
+{
+ if (!*locked) {
+ mutex_lock(&ctx->uring_lock);
+ *locked = true;
+ }
+}
+
+#define io_for_each_link(pos, head) \
+ for (pos = (head); pos; pos = pos->link)
+
+/*
+ * Shamelessly stolen from the mm implementation of page reference checking,
+ * see commit f958d7b528b1 for details.
+ */
+#define req_ref_zero_or_close_to_overflow(req) \
+ ((unsigned int) atomic_read(&(req->refs)) + 127u <= 127u)
+
+static inline bool req_ref_inc_not_zero(struct io_kiocb *req)
+{
+ WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
+ return atomic_inc_not_zero(&req->refs);
+}
+
+static inline bool req_ref_put_and_test(struct io_kiocb *req)
+{
+ if (likely(!(req->flags & REQ_F_REFCOUNT)))
+ return true;
+
+ WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
+ return atomic_dec_and_test(&req->refs);
+}
+
+static inline void req_ref_get(struct io_kiocb *req)
+{
+ WARN_ON_ONCE(!(req->flags & REQ_F_REFCOUNT));
+ WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
+ atomic_inc(&req->refs);
+}
+
+static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
+{
+ if (!wq_list_empty(&ctx->submit_state.compl_reqs))
+ __io_submit_flush_completions(ctx);
+}
+
+static inline void __io_req_set_refcount(struct io_kiocb *req, int nr)
+{
+ if (!(req->flags & REQ_F_REFCOUNT)) {
+ req->flags |= REQ_F_REFCOUNT;
+ atomic_set(&req->refs, nr);
+ }
+}
+
+static inline void io_req_set_refcount(struct io_kiocb *req)
+{
+ __io_req_set_refcount(req, 1);
+}
+
+#define IO_RSRC_REF_BATCH 100
+
+static inline void io_req_put_rsrc_locked(struct io_kiocb *req,
+ struct io_ring_ctx *ctx)
+ __must_hold(&ctx->uring_lock)
+{
+ struct percpu_ref *ref = req->fixed_rsrc_refs;
+
+ if (ref) {
+ if (ref == &ctx->rsrc_node->refs)
+ ctx->rsrc_cached_refs++;
+ else
+ percpu_ref_put(ref);
+ }
+}
+
+static inline void io_req_put_rsrc(struct io_kiocb *req, struct io_ring_ctx *ctx)
+{
+ if (req->fixed_rsrc_refs)
+ percpu_ref_put(req->fixed_rsrc_refs);
+}
+
+static __cold void io_rsrc_refs_drop(struct io_ring_ctx *ctx)
+ __must_hold(&ctx->uring_lock)
+{
+ if (ctx->rsrc_cached_refs) {
+ percpu_ref_put_many(&ctx->rsrc_node->refs, ctx->rsrc_cached_refs);
+ ctx->rsrc_cached_refs = 0;
+ }
+}
+
+static void io_rsrc_refs_refill(struct io_ring_ctx *ctx)
+ __must_hold(&ctx->uring_lock)
+{
+ ctx->rsrc_cached_refs += IO_RSRC_REF_BATCH;
+ percpu_ref_get_many(&ctx->rsrc_node->refs, IO_RSRC_REF_BATCH);
+}
+
+static inline void io_req_set_rsrc_node(struct io_kiocb *req,
+ struct io_ring_ctx *ctx,
+ unsigned int issue_flags)
+{
+ if (!req->fixed_rsrc_refs) {
+ req->fixed_rsrc_refs = &ctx->rsrc_node->refs;
+
+ if (!(issue_flags & IO_URING_F_UNLOCKED)) {
+ lockdep_assert_held(&ctx->uring_lock);
+ ctx->rsrc_cached_refs--;
+ if (unlikely(ctx->rsrc_cached_refs < 0))
+ io_rsrc_refs_refill(ctx);
+ } else {
+ percpu_ref_get(req->fixed_rsrc_refs);
+ }
+ }
+}
+
+static unsigned int __io_put_kbuf(struct io_kiocb *req, struct list_head *list)
+{
+ struct io_buffer *kbuf = req->kbuf;
+ unsigned int cflags;
+
+ cflags = IORING_CQE_F_BUFFER | (kbuf->bid << IORING_CQE_BUFFER_SHIFT);
+ req->flags &= ~REQ_F_BUFFER_SELECTED;
+ list_add(&kbuf->list, list);
+ req->kbuf = NULL;
+ return cflags;
+}
+
+static inline unsigned int io_put_kbuf_comp(struct io_kiocb *req)
+{
+ lockdep_assert_held(&req->ctx->completion_lock);
+
+ if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
+ return 0;
+ return __io_put_kbuf(req, &req->ctx->io_buffers_comp);
+}
+
+static inline unsigned int io_put_kbuf(struct io_kiocb *req,
+ unsigned issue_flags)
+{
+ unsigned int cflags;
+
+ if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
+ return 0;
+
+ /*
+ * We can add this buffer back to two lists:
+ *
+ * 1) The io_buffers_cache list. This one is protected by the
+ * ctx->uring_lock. If we already hold this lock, add back to this
+ * list as we can grab it from issue as well.
+ * 2) The io_buffers_comp list. This one is protected by the
+ * ctx->completion_lock.
+ *
+ * We migrate buffers from the comp_list to the issue cache list
+ * when we need one.
+ */
+ if (issue_flags & IO_URING_F_UNLOCKED) {
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock(&ctx->completion_lock);
+ cflags = __io_put_kbuf(req, &ctx->io_buffers_comp);
+ spin_unlock(&ctx->completion_lock);
+ } else {
+ lockdep_assert_held(&req->ctx->uring_lock);
+
+ cflags = __io_put_kbuf(req, &req->ctx->io_buffers_cache);
+ }
+
+ return cflags;
+}
+
+static struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx *ctx,
+ unsigned int bgid)
+{
+ struct list_head *hash_list;
+ struct io_buffer_list *bl;
+
+ hash_list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
+ list_for_each_entry(bl, hash_list, list)
+ if (bl->bgid == bgid || bgid == -1U)
+ return bl;
+
+ return NULL;
+}
+
+static void io_kbuf_recycle(struct io_kiocb *req, unsigned issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_buffer_list *bl;
+ struct io_buffer *buf;
+
+ if (likely(!(req->flags & REQ_F_BUFFER_SELECTED)))
+ return;
+ /* don't recycle if we already did IO to this buffer */
+ if (req->flags & REQ_F_PARTIAL_IO)
+ return;
+
+ if (issue_flags & IO_URING_F_UNLOCKED)
+ mutex_lock(&ctx->uring_lock);
+
+ lockdep_assert_held(&ctx->uring_lock);
+
+ buf = req->kbuf;
+ bl = io_buffer_get_list(ctx, buf->bgid);
+ list_add(&buf->list, &bl->buf_list);
+ req->flags &= ~REQ_F_BUFFER_SELECTED;
+ req->kbuf = NULL;
+
+ if (issue_flags & IO_URING_F_UNLOCKED)
+ mutex_unlock(&ctx->uring_lock);
+}
+
+static bool io_match_task(struct io_kiocb *head, struct task_struct *task,
+ bool cancel_all)
+ __must_hold(&req->ctx->timeout_lock)
+{
+ struct io_kiocb *req;
+
+ if (task && head->task != task)
+ return false;
+ if (cancel_all)
+ return true;
+
+ io_for_each_link(req, head) {
+ if (req->flags & REQ_F_INFLIGHT)
+ return true;
+ }
+ return false;
+}
+
+static bool io_match_linked(struct io_kiocb *head)
+{
+ struct io_kiocb *req;
+
+ io_for_each_link(req, head) {
+ if (req->flags & REQ_F_INFLIGHT)
+ return true;
+ }
+ return false;
+}
+
+/*
+ * As io_match_task() but protected against racing with linked timeouts.
+ * User must not hold timeout_lock.
+ */
+static bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
+ bool cancel_all)
+{
+ bool matched;
+
+ if (task && head->task != task)
+ return false;
+ if (cancel_all)
+ return true;
+
+ if (head->flags & REQ_F_LINK_TIMEOUT) {
+ struct io_ring_ctx *ctx = head->ctx;
+
+ /* protect against races with linked timeouts */
+ spin_lock_irq(&ctx->timeout_lock);
+ matched = io_match_linked(head);
+ spin_unlock_irq(&ctx->timeout_lock);
+ } else {
+ matched = io_match_linked(head);
+ }
+ return matched;
+}
+
+static inline bool req_has_async_data(struct io_kiocb *req)
+{
+ return req->flags & REQ_F_ASYNC_DATA;
+}
+
+static inline void req_set_fail(struct io_kiocb *req)
+{
+ req->flags |= REQ_F_FAIL;
+ if (req->flags & REQ_F_CQE_SKIP) {
+ req->flags &= ~REQ_F_CQE_SKIP;
+ req->flags |= REQ_F_SKIP_LINK_CQES;
+ }
+}
+
+static inline void req_fail_link_node(struct io_kiocb *req, int res)
+{
+ req_set_fail(req);
+ req->result = res;
+}
+
+static __cold void io_ring_ctx_ref_free(struct percpu_ref *ref)
+{
+ struct io_ring_ctx *ctx = container_of(ref, struct io_ring_ctx, refs);
+
+ complete(&ctx->ref_comp);
+}
+
+static inline bool io_is_timeout_noseq(struct io_kiocb *req)
+{
+ return !req->timeout.off;
+}
+
+static __cold void io_fallback_req_func(struct work_struct *work)
+{
+ struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx,
+ fallback_work.work);
+ struct llist_node *node = llist_del_all(&ctx->fallback_llist);
+ struct io_kiocb *req, *tmp;
+ bool locked = false;
+
+ percpu_ref_get(&ctx->refs);
+ llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
+ req->io_task_work.func(req, &locked);
+
+ if (locked) {
+ io_submit_flush_completions(ctx);
+ mutex_unlock(&ctx->uring_lock);
+ }
+ percpu_ref_put(&ctx->refs);
+}
+
+static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
+{
+ struct io_ring_ctx *ctx;
+ int i, hash_bits;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx)
+ return NULL;
+
+ /*
+ * Use 5 bits less than the max cq entries, that should give us around
+ * 32 entries per hash list if totally full and uniformly spread.
+ */
+ hash_bits = ilog2(p->cq_entries);
+ hash_bits -= 5;
+ if (hash_bits <= 0)
+ hash_bits = 1;
+ ctx->cancel_hash_bits = hash_bits;
+ ctx->cancel_hash = kmalloc((1U << hash_bits) * sizeof(struct hlist_head),
+ GFP_KERNEL);
+ if (!ctx->cancel_hash)
+ goto err;
+ __hash_init(ctx->cancel_hash, 1U << hash_bits);
+
+ ctx->dummy_ubuf = kzalloc(sizeof(*ctx->dummy_ubuf), GFP_KERNEL);
+ if (!ctx->dummy_ubuf)
+ goto err;
+ /* set invalid range, so io_import_fixed() fails meeting it */
+ ctx->dummy_ubuf->ubuf = -1UL;
+
+ ctx->io_buffers = kcalloc(1U << IO_BUFFERS_HASH_BITS,
+ sizeof(struct list_head), GFP_KERNEL);
+ if (!ctx->io_buffers)
+ goto err;
+ for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++)
+ INIT_LIST_HEAD(&ctx->io_buffers[i]);
+
+ if (percpu_ref_init(&ctx->refs, io_ring_ctx_ref_free,
+ 0, GFP_KERNEL))
+ goto err;
+
+ ctx->flags = p->flags;
+ init_waitqueue_head(&ctx->sqo_sq_wait);
+ INIT_LIST_HEAD(&ctx->sqd_list);
+ INIT_LIST_HEAD(&ctx->cq_overflow_list);
+ INIT_LIST_HEAD(&ctx->io_buffers_cache);
+ INIT_LIST_HEAD(&ctx->apoll_cache);
+ init_completion(&ctx->ref_comp);
+ xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
+ mutex_init(&ctx->uring_lock);
+ init_waitqueue_head(&ctx->cq_wait);
+ spin_lock_init(&ctx->completion_lock);
+ spin_lock_init(&ctx->timeout_lock);
+ INIT_WQ_LIST(&ctx->iopoll_list);
+ INIT_LIST_HEAD(&ctx->io_buffers_pages);
+ INIT_LIST_HEAD(&ctx->io_buffers_comp);
+ INIT_LIST_HEAD(&ctx->defer_list);
+ INIT_LIST_HEAD(&ctx->timeout_list);
+ INIT_LIST_HEAD(&ctx->ltimeout_list);
+ spin_lock_init(&ctx->rsrc_ref_lock);
+ INIT_LIST_HEAD(&ctx->rsrc_ref_list);
+ INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
+ init_llist_head(&ctx->rsrc_put_llist);
+ INIT_LIST_HEAD(&ctx->tctx_list);
+ ctx->submit_state.free_list.next = NULL;
+ INIT_WQ_LIST(&ctx->locked_free_list);
+ INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
+ INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
+ return ctx;
+err:
+ kfree(ctx->dummy_ubuf);
+ kfree(ctx->cancel_hash);
+ kfree(ctx->io_buffers);
+ kfree(ctx);
+ return NULL;
+}
+
+static void io_account_cq_overflow(struct io_ring_ctx *ctx)
+{
+ struct io_rings *r = ctx->rings;
+
+ WRITE_ONCE(r->cq_overflow, READ_ONCE(r->cq_overflow) + 1);
+ ctx->cq_extra--;
+}
+
+static bool req_need_defer(struct io_kiocb *req, u32 seq)
+{
+ if (unlikely(req->flags & REQ_F_IO_DRAIN)) {
+ struct io_ring_ctx *ctx = req->ctx;
+
+ return seq + READ_ONCE(ctx->cq_extra) != ctx->cached_cq_tail;
+ }
+
+ return false;
+}
+
+#define FFS_NOWAIT 0x1UL
+#define FFS_ISREG 0x2UL
+#define FFS_MASK ~(FFS_NOWAIT|FFS_ISREG)
+
+static inline bool io_req_ffs_set(struct io_kiocb *req)
+{
+ return req->flags & REQ_F_FIXED_FILE;
+}
+
+static inline void io_req_track_inflight(struct io_kiocb *req)
+{
+ if (!(req->flags & REQ_F_INFLIGHT)) {
+ req->flags |= REQ_F_INFLIGHT;
+ atomic_inc(&req->task->io_uring->inflight_tracked);
+ }
+}
+
+static struct io_kiocb *__io_prep_linked_timeout(struct io_kiocb *req)
+{
+ if (WARN_ON_ONCE(!req->link))
+ return NULL;
+
+ req->flags &= ~REQ_F_ARM_LTIMEOUT;
+ req->flags |= REQ_F_LINK_TIMEOUT;
+
+ /* linked timeouts should have two refs once prep'ed */
+ io_req_set_refcount(req);
+ __io_req_set_refcount(req->link, 2);
+ return req->link;
+}
+
+static inline struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
+{
+ if (likely(!(req->flags & REQ_F_ARM_LTIMEOUT)))
+ return NULL;
+ return __io_prep_linked_timeout(req);
+}
+
+static void io_prep_async_work(struct io_kiocb *req)
+{
+ const struct io_op_def *def = &io_op_defs[req->opcode];
+ struct io_ring_ctx *ctx = req->ctx;
+
+ if (!(req->flags & REQ_F_CREDS)) {
+ req->flags |= REQ_F_CREDS;
+ req->creds = get_current_cred();
+ }
+
+ req->work.list.next = NULL;
+ req->work.flags = 0;
+ if (req->flags & REQ_F_FORCE_ASYNC)
+ req->work.flags |= IO_WQ_WORK_CONCURRENT;
+
+ if (req->flags & REQ_F_ISREG) {
+ if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
+ io_wq_hash_work(&req->work, file_inode(req->file));
+ } else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) {
+ if (def->unbound_nonreg_file)
+ req->work.flags |= IO_WQ_WORK_UNBOUND;
+ }
+}
+
+static void io_prep_async_link(struct io_kiocb *req)
+{
+ struct io_kiocb *cur;
+
+ if (req->flags & REQ_F_LINK_TIMEOUT) {
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock_irq(&ctx->timeout_lock);
+ io_for_each_link(cur, req)
+ io_prep_async_work(cur);
+ spin_unlock_irq(&ctx->timeout_lock);
+ } else {
+ io_for_each_link(cur, req)
+ io_prep_async_work(cur);
+ }
+}
+
+static inline void io_req_add_compl_list(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_submit_state *state = &ctx->submit_state;
+
+ if (!(req->flags & REQ_F_CQE_SKIP))
+ ctx->submit_state.flush_cqes = true;
+ wq_list_add_tail(&req->comp_list, &state->compl_reqs);
+}
+
+static void io_queue_async_work(struct io_kiocb *req, bool *dont_use)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_kiocb *link = io_prep_linked_timeout(req);
+ struct io_uring_task *tctx = req->task->io_uring;
+
+ BUG_ON(!tctx);
+ BUG_ON(!tctx->io_wq);
+
+ /* init ->work of the whole link before punting */
+ io_prep_async_link(req);
+
+ /*
+ * Not expected to happen, but if we do have a bug where this _can_
+ * happen, catch it here and ensure the request is marked as
+ * canceled. That will make io-wq go through the usual work cancel
+ * procedure rather than attempt to run this request (or create a new
+ * worker for it).
+ */
+ if (WARN_ON_ONCE(!same_thread_group(req->task, current)))
+ req->work.flags |= IO_WQ_WORK_CANCEL;
+
+ trace_io_uring_queue_async_work(ctx, req, req->user_data, req->opcode, req->flags,
+ &req->work, io_wq_is_hashed(&req->work));
+ io_wq_enqueue(tctx->io_wq, &req->work);
+ if (link)
+ io_queue_linked_timeout(link);
+}
+
+static void io_kill_timeout(struct io_kiocb *req, int status)
+ __must_hold(&req->ctx->completion_lock)
+ __must_hold(&req->ctx->timeout_lock)
+{
+ struct io_timeout_data *io = req->async_data;
+
+ if (hrtimer_try_to_cancel(&io->timer) != -1) {
+ if (status)
+ req_set_fail(req);
+ atomic_set(&req->ctx->cq_timeouts,
+ atomic_read(&req->ctx->cq_timeouts) + 1);
+ list_del_init(&req->timeout.list);
+ io_fill_cqe_req(req, status, 0);
+ io_put_req_deferred(req);
+ }
+}
+
+static __cold void io_queue_deferred(struct io_ring_ctx *ctx)
+{
+ while (!list_empty(&ctx->defer_list)) {
+ struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
+ struct io_defer_entry, list);
+
+ if (req_need_defer(de->req, de->seq))
+ break;
+ list_del_init(&de->list);
+ io_req_task_queue(de->req);
+ kfree(de);
+ }
+}
+
+static __cold void io_flush_timeouts(struct io_ring_ctx *ctx)
+ __must_hold(&ctx->completion_lock)
+{
+ u32 seq = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
+ struct io_kiocb *req, *tmp;
+
+ spin_lock_irq(&ctx->timeout_lock);
+ list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
+ u32 events_needed, events_got;
+
+ if (io_is_timeout_noseq(req))
+ break;
+
+ /*
+ * Since seq can easily wrap around over time, subtract
+ * the last seq at which timeouts were flushed before comparing.
+ * Assuming not more than 2^31-1 events have happened since,
+ * these subtractions won't have wrapped, so we can check if
+ * target is in [last_seq, current_seq] by comparing the two.
+ */
+ events_needed = req->timeout.target_seq - ctx->cq_last_tm_flush;
+ events_got = seq - ctx->cq_last_tm_flush;
+ if (events_got < events_needed)
+ break;
+
+ io_kill_timeout(req, 0);
+ }
+ ctx->cq_last_tm_flush = seq;
+ spin_unlock_irq(&ctx->timeout_lock);
+}
+
+static inline void io_commit_cqring(struct io_ring_ctx *ctx)
+{
+ /* order cqe stores with ring update */
+ smp_store_release(&ctx->rings->cq.tail, ctx->cached_cq_tail);
+}
+
+static void __io_commit_cqring_flush(struct io_ring_ctx *ctx)
+{
+ if (ctx->off_timeout_used || ctx->drain_active) {
+ spin_lock(&ctx->completion_lock);
+ if (ctx->off_timeout_used)
+ io_flush_timeouts(ctx);
+ if (ctx->drain_active)
+ io_queue_deferred(ctx);
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ }
+ if (ctx->has_evfd)
+ io_eventfd_signal(ctx);
+}
+
+static inline bool io_sqring_full(struct io_ring_ctx *ctx)
+{
+ struct io_rings *r = ctx->rings;
+
+ return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries;
+}
+
+static inline unsigned int __io_cqring_events(struct io_ring_ctx *ctx)
+{
+ return ctx->cached_cq_tail - READ_ONCE(ctx->rings->cq.head);
+}
+
+static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
+{
+ struct io_rings *rings = ctx->rings;
+ unsigned tail, mask = ctx->cq_entries - 1;
+
+ /*
+ * writes to the cq entry need to come after reading head; the
+ * control dependency is enough as we're using WRITE_ONCE to
+ * fill the cq entry
+ */
+ if (__io_cqring_events(ctx) == ctx->cq_entries)
+ return NULL;
+
+ tail = ctx->cached_cq_tail++;
+ return &rings->cqes[tail & mask];
+}
+
+static void io_eventfd_signal(struct io_ring_ctx *ctx)
+{
+ struct io_ev_fd *ev_fd;
+
+ rcu_read_lock();
+ /*
+ * rcu_dereference ctx->io_ev_fd once and use it for both for checking
+ * and eventfd_signal
+ */
+ ev_fd = rcu_dereference(ctx->io_ev_fd);
+
+ /*
+ * Check again if ev_fd exists incase an io_eventfd_unregister call
+ * completed between the NULL check of ctx->io_ev_fd at the start of
+ * the function and rcu_read_lock.
+ */
+ if (unlikely(!ev_fd))
+ goto out;
+ if (READ_ONCE(ctx->rings->cq_flags) & IORING_CQ_EVENTFD_DISABLED)
+ goto out;
+
+ if (!ev_fd->eventfd_async || io_wq_current_is_worker())
+ eventfd_signal(ev_fd->cq_ev_fd, 1);
+out:
+ rcu_read_unlock();
+}
+
+static inline void io_cqring_wake(struct io_ring_ctx *ctx)
+{
+ /*
+ * wake_up_all() may seem excessive, but io_wake_function() and
+ * io_should_wake() handle the termination of the loop and only
+ * wake as many waiters as we need to.
+ */
+ if (wq_has_sleeper(&ctx->cq_wait))
+ wake_up_all(&ctx->cq_wait);
+}
+
+/*
+ * This should only get called when at least one event has been posted.
+ * Some applications rely on the eventfd notification count only changing
+ * IFF a new CQE has been added to the CQ ring. There's no depedency on
+ * 1:1 relationship between how many times this function is called (and
+ * hence the eventfd count) and number of CQEs posted to the CQ ring.
+ */
+static inline void io_cqring_ev_posted(struct io_ring_ctx *ctx)
+{
+ if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
+ ctx->has_evfd))
+ __io_commit_cqring_flush(ctx);
+
+ io_cqring_wake(ctx);
+}
+
+static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
+{
+ if (unlikely(ctx->off_timeout_used || ctx->drain_active ||
+ ctx->has_evfd))
+ __io_commit_cqring_flush(ctx);
+
+ if (ctx->flags & IORING_SETUP_SQPOLL)
+ io_cqring_wake(ctx);
+}
+
+/* Returns true if there are no backlogged entries after the flush */
+static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
+{
+ bool all_flushed, posted;
+
+ if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
+ return false;
+
+ posted = false;
+ spin_lock(&ctx->completion_lock);
+ while (!list_empty(&ctx->cq_overflow_list)) {
+ struct io_uring_cqe *cqe = io_get_cqe(ctx);
+ struct io_overflow_cqe *ocqe;
+
+ if (!cqe && !force)
+ break;
+ ocqe = list_first_entry(&ctx->cq_overflow_list,
+ struct io_overflow_cqe, list);
+ if (cqe)
+ memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
+ else
+ io_account_cq_overflow(ctx);
+
+ posted = true;
+ list_del(&ocqe->list);
+ kfree(ocqe);
+ }
+
+ all_flushed = list_empty(&ctx->cq_overflow_list);
+ if (all_flushed) {
+ clear_bit(0, &ctx->check_cq_overflow);
+ WRITE_ONCE(ctx->rings->sq_flags,
+ ctx->rings->sq_flags & ~IORING_SQ_CQ_OVERFLOW);
+ }
+
+ if (posted)
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ if (posted)
+ io_cqring_ev_posted(ctx);
+ return all_flushed;
+}
+
+static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
+{
+ bool ret = true;
+
+ if (test_bit(0, &ctx->check_cq_overflow)) {
+ /* iopoll syncs against uring_lock, not completion_lock */
+ if (ctx->flags & IORING_SETUP_IOPOLL)
+ mutex_lock(&ctx->uring_lock);
+ ret = __io_cqring_overflow_flush(ctx, false);
+ if (ctx->flags & IORING_SETUP_IOPOLL)
+ mutex_unlock(&ctx->uring_lock);
+ }
+
+ return ret;
+}
+
+/* must to be called somewhat shortly after putting a request */
+static inline void io_put_task(struct task_struct *task, int nr)
+{
+ struct io_uring_task *tctx = task->io_uring;
+
+ if (likely(task == current)) {
+ tctx->cached_refs += nr;
+ } else {
+ percpu_counter_sub(&tctx->inflight, nr);
+ if (unlikely(atomic_read(&tctx->in_idle)))
+ wake_up(&tctx->wait);
+ put_task_struct_many(task, nr);
+ }
+}
+
+static void io_task_refs_refill(struct io_uring_task *tctx)
+{
+ unsigned int refill = -tctx->cached_refs + IO_TCTX_REFS_CACHE_NR;
+
+ percpu_counter_add(&tctx->inflight, refill);
+ refcount_add(refill, ¤t->usage);
+ tctx->cached_refs += refill;
+}
+
+static inline void io_get_task_refs(int nr)
+{
+ struct io_uring_task *tctx = current->io_uring;
+
+ tctx->cached_refs -= nr;
+ if (unlikely(tctx->cached_refs < 0))
+ io_task_refs_refill(tctx);
+}
+
+static __cold void io_uring_drop_tctx_refs(struct task_struct *task)
+{
+ struct io_uring_task *tctx = task->io_uring;
+ unsigned int refs = tctx->cached_refs;
+
+ if (refs) {
+ tctx->cached_refs = 0;
+ percpu_counter_sub(&tctx->inflight, refs);
+ put_task_struct_many(task, refs);
+ }
+}
+
+static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
+ s32 res, u32 cflags)
+{
+ struct io_overflow_cqe *ocqe;
+
+ ocqe = kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT);
+ if (!ocqe) {
+ /*
+ * If we're in ring overflow flush mode, or in task cancel mode,
+ * or cannot allocate an overflow entry, then we need to drop it
+ * on the floor.
+ */
+ io_account_cq_overflow(ctx);
+ return false;
+ }
+ if (list_empty(&ctx->cq_overflow_list)) {
+ set_bit(0, &ctx->check_cq_overflow);
+ WRITE_ONCE(ctx->rings->sq_flags,
+ ctx->rings->sq_flags | IORING_SQ_CQ_OVERFLOW);
+
+ }
+ ocqe->cqe.user_data = user_data;
+ ocqe->cqe.res = res;
+ ocqe->cqe.flags = cflags;
+ list_add_tail(&ocqe->list, &ctx->cq_overflow_list);
+ return true;
+}
+
+static inline bool __io_fill_cqe(struct io_ring_ctx *ctx, u64 user_data,
+ s32 res, u32 cflags)
+{
+ struct io_uring_cqe *cqe;
+
+ /*
+ * If we can't get a cq entry, userspace overflowed the
+ * submission (by quite a lot). Increment the overflow count in
+ * the ring.
+ */
+ cqe = io_get_cqe(ctx);
+ if (likely(cqe)) {
+ WRITE_ONCE(cqe->user_data, user_data);
+ WRITE_ONCE(cqe->res, res);
+ WRITE_ONCE(cqe->flags, cflags);
+ return true;
+ }
+ return io_cqring_event_overflow(ctx, user_data, res, cflags);
+}
+
+static inline bool __io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
+{
+ trace_io_uring_complete(req->ctx, req, req->user_data, res, cflags);
+ return __io_fill_cqe(req->ctx, req->user_data, res, cflags);
+}
+
+static noinline void io_fill_cqe_req(struct io_kiocb *req, s32 res, u32 cflags)
+{
+ if (!(req->flags & REQ_F_CQE_SKIP))
+ __io_fill_cqe_req(req, res, cflags);
+}
+
+static noinline bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data,
+ s32 res, u32 cflags)
+{
+ ctx->cq_extra++;
+ trace_io_uring_complete(ctx, NULL, user_data, res, cflags);
+ return __io_fill_cqe(ctx, user_data, res, cflags);
+}
+
+static void __io_req_complete_post(struct io_kiocb *req, s32 res,
+ u32 cflags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ if (!(req->flags & REQ_F_CQE_SKIP))
+ __io_fill_cqe_req(req, res, cflags);
+ /*
+ * If we're the last reference to this request, add to our locked
+ * free_list cache.
+ */
+ if (req_ref_put_and_test(req)) {
+ if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
+ if (req->flags & IO_DISARM_MASK)
+ io_disarm_next(req);
+ if (req->link) {
+ io_req_task_queue(req->link);
+ req->link = NULL;
+ }
+ }
+ io_req_put_rsrc(req, ctx);
+ /*
+ * Selected buffer deallocation in io_clean_op() assumes that
+ * we don't hold ->completion_lock. Clean them here to avoid
+ * deadlocks.
+ */
+ io_put_kbuf_comp(req);
+ io_dismantle_req(req);
+ io_put_task(req->task, 1);
+ wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
+ ctx->locked_free_nr++;
+ }
+}
+
+static void io_req_complete_post(struct io_kiocb *req, s32 res,
+ u32 cflags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock(&ctx->completion_lock);
+ __io_req_complete_post(req, res, cflags);
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_cqring_ev_posted(ctx);
+}
+
+static inline void io_req_complete_state(struct io_kiocb *req, s32 res,
+ u32 cflags)
+{
+ req->result = res;
+ req->cflags = cflags;
+ req->flags |= REQ_F_COMPLETE_INLINE;
+}
+
+static inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags,
+ s32 res, u32 cflags)
+{
+ if (issue_flags & IO_URING_F_COMPLETE_DEFER)
+ io_req_complete_state(req, res, cflags);
+ else
+ io_req_complete_post(req, res, cflags);
+}
+
+static inline void io_req_complete(struct io_kiocb *req, s32 res)
+{
+ __io_req_complete(req, 0, res, 0);
+}
+
+static void io_req_complete_failed(struct io_kiocb *req, s32 res)
+{
+ req_set_fail(req);
+ io_req_complete_post(req, res, io_put_kbuf(req, IO_URING_F_UNLOCKED));
+}
+
+static void io_req_complete_fail_submit(struct io_kiocb *req)
+{
+ /*
+ * We don't submit, fail them all, for that replace hardlinks with
+ * normal links. Extra REQ_F_LINK is tolerated.
+ */
+ req->flags &= ~REQ_F_HARDLINK;
+ req->flags |= REQ_F_LINK;
+ io_req_complete_failed(req, req->result);
+}
+
+/*
+ * Don't initialise the fields below on every allocation, but do that in
+ * advance and keep them valid across allocations.
+ */
+static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx)
+{
+ req->ctx = ctx;
+ req->link = NULL;
+ req->async_data = NULL;
+ /* not necessary, but safer to zero */
+ req->result = 0;
+}
+
+static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx,
+ struct io_submit_state *state)
+{
+ spin_lock(&ctx->completion_lock);
+ wq_list_splice(&ctx->locked_free_list, &state->free_list);
+ ctx->locked_free_nr = 0;
+ spin_unlock(&ctx->completion_lock);
+}
+
+/* Returns true IFF there are requests in the cache */
+static bool io_flush_cached_reqs(struct io_ring_ctx *ctx)
+{
+ struct io_submit_state *state = &ctx->submit_state;
+
+ /*
+ * If we have more than a batch's worth of requests in our IRQ side
+ * locked cache, grab the lock and move them over to our submission
+ * side cache.
+ */
+ if (READ_ONCE(ctx->locked_free_nr) > IO_COMPL_BATCH)
+ io_flush_cached_locked_reqs(ctx, state);
+ return !!state->free_list.next;
+}
+
+/*
+ * A request might get retired back into the request caches even before opcode
+ * handlers and io_issue_sqe() are done with it, e.g. inline completion path.
+ * Because of that, io_alloc_req() should be called only under ->uring_lock
+ * and with extra caution to not get a request that is still worked on.
+ */
+static __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
+ __must_hold(&ctx->uring_lock)
+{
+ struct io_submit_state *state = &ctx->submit_state;
+ gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
+ void *reqs[IO_REQ_ALLOC_BATCH];
+ struct io_kiocb *req;
+ int ret, i;
+
+ if (likely(state->free_list.next || io_flush_cached_reqs(ctx)))
+ return true;
+
+ ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
+
+ /*
+ * Bulk alloc is all-or-nothing. If we fail to get a batch,
+ * retry single alloc to be on the safe side.
+ */
+ if (unlikely(ret <= 0)) {
+ reqs[0] = kmem_cache_alloc(req_cachep, gfp);
+ if (!reqs[0])
+ return false;
+ ret = 1;
+ }
+
+ percpu_ref_get_many(&ctx->refs, ret);
+ for (i = 0; i < ret; i++) {
+ req = reqs[i];
+
+ io_preinit_req(req, ctx);
+ wq_stack_add_head(&req->comp_list, &state->free_list);
+ }
+ return true;
+}
+
+static inline bool io_alloc_req_refill(struct io_ring_ctx *ctx)
+{
+ if (unlikely(!ctx->submit_state.free_list.next))
+ return __io_alloc_req_refill(ctx);
+ return true;
+}
+
+static inline struct io_kiocb *io_alloc_req(struct io_ring_ctx *ctx)
+{
+ struct io_wq_work_node *node;
+
+ node = wq_stack_extract(&ctx->submit_state.free_list);
+ return container_of(node, struct io_kiocb, comp_list);
+}
+
+static inline void io_put_file(struct file *file)
+{
+ if (file)
+ fput(file);
+}
+
+static inline void io_dismantle_req(struct io_kiocb *req)
+{
+ unsigned int flags = req->flags;
+
+ if (unlikely(flags & IO_REQ_CLEAN_FLAGS))
+ io_clean_op(req);
+ if (!(flags & REQ_F_FIXED_FILE))
+ io_put_file(req->file);
+}
+
+static __cold void __io_free_req(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_req_put_rsrc(req, ctx);
+ io_dismantle_req(req);
+ io_put_task(req->task, 1);
+
+ spin_lock(&ctx->completion_lock);
+ wq_list_add_head(&req->comp_list, &ctx->locked_free_list);
+ ctx->locked_free_nr++;
+ spin_unlock(&ctx->completion_lock);
+}
+
+static inline void io_remove_next_linked(struct io_kiocb *req)
+{
+ struct io_kiocb *nxt = req->link;
+
+ req->link = nxt->link;
+ nxt->link = NULL;
+}
+
+static bool io_kill_linked_timeout(struct io_kiocb *req)
+ __must_hold(&req->ctx->completion_lock)
+ __must_hold(&req->ctx->timeout_lock)
+{
+ struct io_kiocb *link = req->link;
+
+ if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
+ struct io_timeout_data *io = link->async_data;
+
+ io_remove_next_linked(req);
+ link->timeout.head = NULL;
+ if (hrtimer_try_to_cancel(&io->timer) != -1) {
+ list_del(&link->timeout.list);
+ /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
+ io_fill_cqe_req(link, -ECANCELED, 0);
+ io_put_req_deferred(link);
+ return true;
+ }
+ }
+ return false;
+}
+
+static void io_fail_links(struct io_kiocb *req)
+ __must_hold(&req->ctx->completion_lock)
+{
+ struct io_kiocb *nxt, *link = req->link;
+ bool ignore_cqes = req->flags & REQ_F_SKIP_LINK_CQES;
+
+ req->link = NULL;
+ while (link) {
+ long res = -ECANCELED;
+
+ if (link->flags & REQ_F_FAIL)
+ res = link->result;
+
+ nxt = link->link;
+ link->link = NULL;
+
+ trace_io_uring_fail_link(req->ctx, req, req->user_data,
+ req->opcode, link);
+
+ if (!ignore_cqes) {
+ link->flags &= ~REQ_F_CQE_SKIP;
+ io_fill_cqe_req(link, res, 0);
+ }
+ io_put_req_deferred(link);
+ link = nxt;
+ }
+}
+
+static bool io_disarm_next(struct io_kiocb *req)
+ __must_hold(&req->ctx->completion_lock)
+{
+ bool posted = false;
+
+ if (req->flags & REQ_F_ARM_LTIMEOUT) {
+ struct io_kiocb *link = req->link;
+
+ req->flags &= ~REQ_F_ARM_LTIMEOUT;
+ if (link && link->opcode == IORING_OP_LINK_TIMEOUT) {
+ io_remove_next_linked(req);
+ /* leave REQ_F_CQE_SKIP to io_fill_cqe_req */
+ io_fill_cqe_req(link, -ECANCELED, 0);
+ io_put_req_deferred(link);
+ posted = true;
+ }
+ } else if (req->flags & REQ_F_LINK_TIMEOUT) {
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock_irq(&ctx->timeout_lock);
+ posted = io_kill_linked_timeout(req);
+ spin_unlock_irq(&ctx->timeout_lock);
+ }
+ if (unlikely((req->flags & REQ_F_FAIL) &&
+ !(req->flags & REQ_F_HARDLINK))) {
+ posted |= (req->link != NULL);
+ io_fail_links(req);
+ }
+ return posted;
+}
+
+static void __io_req_find_next_prep(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ bool posted;
+
+ spin_lock(&ctx->completion_lock);
+ posted = io_disarm_next(req);
+ if (posted)
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ if (posted)
+ io_cqring_ev_posted(ctx);
+}
+
+static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
+{
+ struct io_kiocb *nxt;
+
+ if (likely(!(req->flags & (REQ_F_LINK|REQ_F_HARDLINK))))
+ return NULL;
+ /*
+ * If LINK is set, we have dependent requests in this chain. If we
+ * didn't fail this request, queue the first one up, moving any other
+ * dependencies to the next request. In case of failure, fail the rest
+ * of the chain.
+ */
+ if (unlikely(req->flags & IO_DISARM_MASK))
+ __io_req_find_next_prep(req);
+ nxt = req->link;
+ req->link = NULL;
+ return nxt;
+}
+
+static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
+{
+ if (!ctx)
+ return;
+ if (*locked) {
+ io_submit_flush_completions(ctx);
+ mutex_unlock(&ctx->uring_lock);
+ *locked = false;
+ }
+ percpu_ref_put(&ctx->refs);
+}
+
+static inline void ctx_commit_and_unlock(struct io_ring_ctx *ctx)
+{
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_cqring_ev_posted(ctx);
+}
+
+static void handle_prev_tw_list(struct io_wq_work_node *node,
+ struct io_ring_ctx **ctx, bool *uring_locked)
+{
+ if (*ctx && !*uring_locked)
+ spin_lock(&(*ctx)->completion_lock);
+
+ do {
+ struct io_wq_work_node *next = node->next;
+ struct io_kiocb *req = container_of(node, struct io_kiocb,
+ io_task_work.node);
+
+ prefetch(container_of(next, struct io_kiocb, io_task_work.node));
+
+ if (req->ctx != *ctx) {
+ if (unlikely(!*uring_locked && *ctx))
+ ctx_commit_and_unlock(*ctx);
+
+ ctx_flush_and_put(*ctx, uring_locked);
+ *ctx = req->ctx;
+ /* if not contended, grab and improve batching */
+ *uring_locked = mutex_trylock(&(*ctx)->uring_lock);
+ percpu_ref_get(&(*ctx)->refs);
+ if (unlikely(!*uring_locked))
+ spin_lock(&(*ctx)->completion_lock);
+ }
+ if (likely(*uring_locked))
+ req->io_task_work.func(req, uring_locked);
+ else
+ __io_req_complete_post(req, req->result,
+ io_put_kbuf_comp(req));
+ node = next;
+ } while (node);
+
+ if (unlikely(!*uring_locked))
+ ctx_commit_and_unlock(*ctx);
+}
+
+static void handle_tw_list(struct io_wq_work_node *node,
+ struct io_ring_ctx **ctx, bool *locked)
+{
+ do {
+ struct io_wq_work_node *next = node->next;
+ struct io_kiocb *req = container_of(node, struct io_kiocb,
+ io_task_work.node);
+
+ prefetch(container_of(next, struct io_kiocb, io_task_work.node));
+
+ if (req->ctx != *ctx) {
+ ctx_flush_and_put(*ctx, locked);
+ *ctx = req->ctx;
+ /* if not contended, grab and improve batching */
+ *locked = mutex_trylock(&(*ctx)->uring_lock);
+ percpu_ref_get(&(*ctx)->refs);
+ }
+ req->io_task_work.func(req, locked);
+ node = next;
+ } while (node);
+}
+
+static void tctx_task_work(struct callback_head *cb)
+{
+ bool uring_locked = false;
+ struct io_ring_ctx *ctx = NULL;
+ struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
+ task_work);
+
+ while (1) {
+ struct io_wq_work_node *node1, *node2;
+
+ if (!tctx->task_list.first &&
+ !tctx->prior_task_list.first && uring_locked)
+ io_submit_flush_completions(ctx);
+
+ spin_lock_irq(&tctx->task_lock);
+ node1 = tctx->prior_task_list.first;
+ node2 = tctx->task_list.first;
+ INIT_WQ_LIST(&tctx->task_list);
+ INIT_WQ_LIST(&tctx->prior_task_list);
+ if (!node2 && !node1)
+ tctx->task_running = false;
+ spin_unlock_irq(&tctx->task_lock);
+ if (!node2 && !node1)
+ break;
+
+ if (node1)
+ handle_prev_tw_list(node1, &ctx, &uring_locked);
+
+ if (node2)
+ handle_tw_list(node2, &ctx, &uring_locked);
+ cond_resched();
+ }
+
+ ctx_flush_and_put(ctx, &uring_locked);
+
+ /* relaxed read is enough as only the task itself sets ->in_idle */
+ if (unlikely(atomic_read(&tctx->in_idle)))
+ io_uring_drop_tctx_refs(current);
+}
+
+static void io_req_task_work_add(struct io_kiocb *req, bool priority)
+{
+ struct task_struct *tsk = req->task;
+ struct io_uring_task *tctx = tsk->io_uring;
+ enum task_work_notify_mode notify;
+ struct io_wq_work_node *node;
+ unsigned long flags;
+ bool running;
+
+ WARN_ON_ONCE(!tctx);
+
+ spin_lock_irqsave(&tctx->task_lock, flags);
+ if (priority)
+ wq_list_add_tail(&req->io_task_work.node, &tctx->prior_task_list);
+ else
+ wq_list_add_tail(&req->io_task_work.node, &tctx->task_list);
+ running = tctx->task_running;
+ if (!running)
+ tctx->task_running = true;
+ spin_unlock_irqrestore(&tctx->task_lock, flags);
+
+ /* task_work already pending, we're done */
+ if (running)
+ return;
+
+ /*
+ * SQPOLL kernel thread doesn't need notification, just a wakeup. For
+ * all other cases, use TWA_SIGNAL unconditionally to ensure we're
+ * processing task_work. There's no reliable way to tell if TWA_RESUME
+ * will do the job.
+ */
+ notify = (req->ctx->flags & IORING_SETUP_SQPOLL) ? TWA_NONE : TWA_SIGNAL;
+ if (likely(!task_work_add(tsk, &tctx->task_work, notify))) {
+ if (notify == TWA_NONE)
+ wake_up_process(tsk);
+ return;
+ }
+
+ spin_lock_irqsave(&tctx->task_lock, flags);
+ tctx->task_running = false;
+ node = wq_list_merge(&tctx->prior_task_list, &tctx->task_list);
+ spin_unlock_irqrestore(&tctx->task_lock, flags);
+
+ while (node) {
+ req = container_of(node, struct io_kiocb, io_task_work.node);
+ node = node->next;
+ if (llist_add(&req->io_task_work.fallback_node,
+ &req->ctx->fallback_llist))
+ schedule_delayed_work(&req->ctx->fallback_work, 1);
+ }
+}
+
+static void io_req_task_cancel(struct io_kiocb *req, bool *locked)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ /* not needed for normal modes, but SQPOLL depends on it */
+ io_tw_lock(ctx, locked);
+ io_req_complete_failed(req, req->result);
+}
+
+static void io_req_task_submit(struct io_kiocb *req, bool *locked)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ io_tw_lock(ctx, locked);
+ /* req->task == current here, checking PF_EXITING is safe */
+ if (likely(!(req->task->flags & PF_EXITING)))
+ __io_queue_sqe(req);
+ else
+ io_req_complete_failed(req, -EFAULT);
+}
+
+static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
+{
+ req->result = ret;
+ req->io_task_work.func = io_req_task_cancel;
+ io_req_task_work_add(req, false);
+}
+
+static void io_req_task_queue(struct io_kiocb *req)
+{
+ req->io_task_work.func = io_req_task_submit;
+ io_req_task_work_add(req, false);
+}
+
+static void io_req_task_queue_reissue(struct io_kiocb *req)
+{
+ req->io_task_work.func = io_queue_async_work;
+ io_req_task_work_add(req, false);
+}
+
+static inline void io_queue_next(struct io_kiocb *req)
+{
+ struct io_kiocb *nxt = io_req_find_next(req);
+
+ if (nxt)
+ io_req_task_queue(nxt);
+}
+
+static void io_free_req(struct io_kiocb *req)
+{
+ io_queue_next(req);
+ __io_free_req(req);
+}
+
+static void io_free_req_work(struct io_kiocb *req, bool *locked)
+{
+ io_free_req(req);
+}
+
+static void io_free_batch_list(struct io_ring_ctx *ctx,
+ struct io_wq_work_node *node)
+ __must_hold(&ctx->uring_lock)
+{
+ struct task_struct *task = NULL;
+ int task_refs = 0;
+
+ do {
+ struct io_kiocb *req = container_of(node, struct io_kiocb,
+ comp_list);
+
+ if (unlikely(req->flags & REQ_F_REFCOUNT)) {
+ node = req->comp_list.next;
+ if (!req_ref_put_and_test(req))
+ continue;
+ }
+
+ io_req_put_rsrc_locked(req, ctx);
+ io_queue_next(req);
+ io_dismantle_req(req);
+
+ if (req->task != task) {
+ if (task)
+ io_put_task(task, task_refs);
+ task = req->task;
+ task_refs = 0;
+ }
+ task_refs++;
+ node = req->comp_list.next;
+ wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
+ } while (node);
+
+ if (task)
+ io_put_task(task, task_refs);
+}
+
+static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
+ __must_hold(&ctx->uring_lock)
+{
+ struct io_wq_work_node *node, *prev;
+ struct io_submit_state *state = &ctx->submit_state;
+
+ if (state->flush_cqes) {
+ spin_lock(&ctx->completion_lock);
+ wq_list_for_each(node, prev, &state->compl_reqs) {
+ struct io_kiocb *req = container_of(node, struct io_kiocb,
+ comp_list);
+
+ if (!(req->flags & REQ_F_CQE_SKIP))
+ __io_fill_cqe_req(req, req->result, req->cflags);
+ if ((req->flags & REQ_F_POLLED) && req->apoll) {
+ struct async_poll *apoll = req->apoll;
+
+ if (apoll->double_poll)
+ kfree(apoll->double_poll);
+ list_add(&apoll->poll.wait.entry,
+ &ctx->apoll_cache);
+ req->flags &= ~REQ_F_POLLED;
+ }
+ }
+
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_cqring_ev_posted(ctx);
+ state->flush_cqes = false;
+ }
+
+ io_free_batch_list(ctx, state->compl_reqs.first);
+ INIT_WQ_LIST(&state->compl_reqs);
+}
+
+/*
+ * Drop reference to request, return next in chain (if there is one) if this
+ * was the last reference to this request.
+ */
+static inline struct io_kiocb *io_put_req_find_next(struct io_kiocb *req)
+{
+ struct io_kiocb *nxt = NULL;
+
+ if (req_ref_put_and_test(req)) {
+ nxt = io_req_find_next(req);
+ __io_free_req(req);
+ }
+ return nxt;
+}
+
+static inline void io_put_req(struct io_kiocb *req)
+{
+ if (req_ref_put_and_test(req))
+ io_free_req(req);
+}
+
+static inline void io_put_req_deferred(struct io_kiocb *req)
+{
+ if (req_ref_put_and_test(req)) {
+ req->io_task_work.func = io_free_req_work;
+ io_req_task_work_add(req, false);
+ }
+}
+
+static unsigned io_cqring_events(struct io_ring_ctx *ctx)
+{
+ /* See comment at the top of this file */
+ smp_rmb();
+ return __io_cqring_events(ctx);
+}
+
+static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
+{
+ struct io_rings *rings = ctx->rings;
+
+ /* make sure SQ entry isn't read before tail */
+ return smp_load_acquire(&rings->sq.tail) - ctx->cached_sq_head;
+}
+
+static inline bool io_run_task_work(void)
+{
+ if (test_thread_flag(TIF_NOTIFY_SIGNAL) || task_work_pending(current)) {
+ __set_current_state(TASK_RUNNING);
+ clear_notify_signal();
+ if (task_work_pending(current))
+ task_work_run();
+ return true;
+ }
+
+ return false;
+}
+
+static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
+{
+ struct io_wq_work_node *pos, *start, *prev;
+ unsigned int poll_flags = BLK_POLL_NOSLEEP;
+ DEFINE_IO_COMP_BATCH(iob);
+ int nr_events = 0;
+
+ /*
+ * Only spin for completions if we don't have multiple devices hanging
+ * off our complete list.
+ */
+ if (ctx->poll_multi_queue || force_nonspin)
+ poll_flags |= BLK_POLL_ONESHOT;
+
+ wq_list_for_each(pos, start, &ctx->iopoll_list) {
+ struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
+ struct kiocb *kiocb = &req->rw.kiocb;
+ int ret;
+
+ /*
+ * Move completed and retryable entries to our local lists.
+ * If we find a request that requires polling, break out
+ * and complete those lists first, if we have entries there.
+ */
+ if (READ_ONCE(req->iopoll_completed))
+ break;
+
+ ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob, poll_flags);
+ if (unlikely(ret < 0))
+ return ret;
+ else if (ret)
+ poll_flags |= BLK_POLL_ONESHOT;
+
+ /* iopoll may have completed current req */
+ if (!rq_list_empty(iob.req_list) ||
+ READ_ONCE(req->iopoll_completed))
+ break;
+ }
+
+ if (!rq_list_empty(iob.req_list))
+ iob.complete(&iob);
+ else if (!pos)
+ return 0;
+
+ prev = start;
+ wq_list_for_each_resume(pos, prev) {
+ struct io_kiocb *req = container_of(pos, struct io_kiocb, comp_list);
+
+ /* order with io_complete_rw_iopoll(), e.g. ->result updates */
+ if (!smp_load_acquire(&req->iopoll_completed))
+ break;
+ nr_events++;
+ if (unlikely(req->flags & REQ_F_CQE_SKIP))
+ continue;
+ __io_fill_cqe_req(req, req->result, io_put_kbuf(req, 0));
+ }
+
+ if (unlikely(!nr_events))
+ return 0;
+
+ io_commit_cqring(ctx);
+ io_cqring_ev_posted_iopoll(ctx);
+ pos = start ? start->next : ctx->iopoll_list.first;
+ wq_list_cut(&ctx->iopoll_list, prev, start);
+ io_free_batch_list(ctx, pos);
+ return nr_events;
+}
+
+/*
+ * We can't just wait for polled events to come to us, we have to actively
+ * find and complete them.
+ */
+static __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
+{
+ if (!(ctx->flags & IORING_SETUP_IOPOLL))
+ return;
+
+ mutex_lock(&ctx->uring_lock);
+ while (!wq_list_empty(&ctx->iopoll_list)) {
+ /* let it sleep and repeat later if can't complete a request */
+ if (io_do_iopoll(ctx, true) == 0)
+ break;
+ /*
+ * Ensure we allow local-to-the-cpu processing to take place,
+ * in this case we need to ensure that we reap all events.
+ * Also let task_work, etc. to progress by releasing the mutex
+ */
+ if (need_resched()) {
+ mutex_unlock(&ctx->uring_lock);
+ cond_resched();
+ mutex_lock(&ctx->uring_lock);
+ }
+ }
+ mutex_unlock(&ctx->uring_lock);
+}
+
+static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
+{
+ unsigned int nr_events = 0;
+ int ret = 0;
+
+ /*
+ * We disallow the app entering submit/complete with polling, but we
+ * still need to lock the ring to prevent racing with polled issue
+ * that got punted to a workqueue.
+ */
+ mutex_lock(&ctx->uring_lock);
+ /*
+ * Don't enter poll loop if we already have events pending.
+ * If we do, we can potentially be spinning for commands that
+ * already triggered a CQE (eg in error).
+ */
+ if (test_bit(0, &ctx->check_cq_overflow))
+ __io_cqring_overflow_flush(ctx, false);
+ if (io_cqring_events(ctx))
+ goto out;
+ do {
+ /*
+ * If a submit got punted to a workqueue, we can have the
+ * application entering polling for a command before it gets
+ * issued. That app will hold the uring_lock for the duration
+ * of the poll right here, so we need to take a breather every
+ * now and then to ensure that the issue has a chance to add
+ * the poll to the issued list. Otherwise we can spin here
+ * forever, while the workqueue is stuck trying to acquire the
+ * very same mutex.
+ */
+ if (wq_list_empty(&ctx->iopoll_list)) {
+ u32 tail = ctx->cached_cq_tail;
+
+ mutex_unlock(&ctx->uring_lock);
+ io_run_task_work();
+ mutex_lock(&ctx->uring_lock);
+
+ /* some requests don't go through iopoll_list */
+ if (tail != ctx->cached_cq_tail ||
+ wq_list_empty(&ctx->iopoll_list))
+ break;
+ }
+ ret = io_do_iopoll(ctx, !min);
+ if (ret < 0)
+ break;
+ nr_events += ret;
+ ret = 0;
+ } while (nr_events < min && !need_resched());
+out:
+ mutex_unlock(&ctx->uring_lock);
+ return ret;
+}
+
+static void kiocb_end_write(struct io_kiocb *req)
+{
+ /*
+ * Tell lockdep we inherited freeze protection from submission
+ * thread.
+ */
+ if (req->flags & REQ_F_ISREG) {
+ struct super_block *sb = file_inode(req->file)->i_sb;
+
+ __sb_writers_acquired(sb, SB_FREEZE_WRITE);
+ sb_end_write(sb);
+ }
+}
+
+#ifdef CONFIG_BLOCK
+static bool io_resubmit_prep(struct io_kiocb *req)
+{
+ struct io_async_rw *rw = req->async_data;
+
+ if (!req_has_async_data(req))
+ return !io_req_prep_async(req);
+ iov_iter_restore(&rw->s.iter, &rw->s.iter_state);
+ return true;
+}
+
+static bool io_rw_should_reissue(struct io_kiocb *req)
+{
+ umode_t mode = file_inode(req->file)->i_mode;
+ struct io_ring_ctx *ctx = req->ctx;
+
+ if (!S_ISBLK(mode) && !S_ISREG(mode))
+ return false;
+ if ((req->flags & REQ_F_NOWAIT) || (io_wq_current_is_worker() &&
+ !(ctx->flags & IORING_SETUP_IOPOLL)))
+ return false;
+ /*
+ * If ref is dying, we might be running poll reap from the exit work.
+ * Don't attempt to reissue from that path, just let it fail with
+ * -EAGAIN.
+ */
+ if (percpu_ref_is_dying(&ctx->refs))
+ return false;
+ /*
+ * Play it safe and assume not safe to re-import and reissue if we're
+ * not in the original thread group (or in task context).
+ */
+ if (!same_thread_group(req->task, current) || !in_task())
+ return false;
+ return true;
+}
+#else
+static bool io_resubmit_prep(struct io_kiocb *req)
+{
+ return false;
+}
+static bool io_rw_should_reissue(struct io_kiocb *req)
+{
+ return false;
+}
+#endif
+
+static bool __io_complete_rw_common(struct io_kiocb *req, long res)
+{
+ if (req->rw.kiocb.ki_flags & IOCB_WRITE) {
+ kiocb_end_write(req);
+ fsnotify_modify(req->file);
+ } else {
+ fsnotify_access(req->file);
+ }
+ if (unlikely(res != req->result)) {
+ if ((res == -EAGAIN || res == -EOPNOTSUPP) &&
+ io_rw_should_reissue(req)) {
+ req->flags |= REQ_F_REISSUE;
+ return true;
+ }
+ req_set_fail(req);
+ req->result = res;
+ }
+ return false;
+}
+
+static inline void io_req_task_complete(struct io_kiocb *req, bool *locked)
+{
+ int res = req->result;
+
+ if (*locked) {
+ io_req_complete_state(req, res, io_put_kbuf(req, 0));
+ io_req_add_compl_list(req);
+ } else {
+ io_req_complete_post(req, res,
+ io_put_kbuf(req, IO_URING_F_UNLOCKED));
+ }
+}
+
+static void __io_complete_rw(struct io_kiocb *req, long res,
+ unsigned int issue_flags)
+{
+ if (__io_complete_rw_common(req, res))
+ return;
+ __io_req_complete(req, issue_flags, req->result,
+ io_put_kbuf(req, issue_flags));
+}
+
+static void io_complete_rw(struct kiocb *kiocb, long res)
+{
+ struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
+
+ if (__io_complete_rw_common(req, res))
+ return;
+ req->result = res;
+ req->io_task_work.func = io_req_task_complete;
+ io_req_task_work_add(req, !!(req->ctx->flags & IORING_SETUP_SQPOLL));
+}
+
+static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
+{
+ struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
+
+ if (kiocb->ki_flags & IOCB_WRITE)
+ kiocb_end_write(req);
+ if (unlikely(res != req->result)) {
+ if (res == -EAGAIN && io_rw_should_reissue(req)) {
+ req->flags |= REQ_F_REISSUE;
+ return;
+ }
+ req->result = res;
+ }
+
+ /* order with io_iopoll_complete() checking ->iopoll_completed */
+ smp_store_release(&req->iopoll_completed, 1);
+}
+
+/*
+ * After the iocb has been issued, it's safe to be found on the poll list.
+ * Adding the kiocb to the list AFTER submission ensures that we don't
+ * find it from a io_do_iopoll() thread before the issuer is done
+ * accessing the kiocb cookie.
+ */
+static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ const bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+
+ /* workqueue context doesn't hold uring_lock, grab it now */
+ if (unlikely(needs_lock))
+ mutex_lock(&ctx->uring_lock);
+
+ /*
+ * Track whether we have multiple files in our lists. This will impact
+ * how we do polling eventually, not spinning if we're on potentially
+ * different devices.
+ */
+ if (wq_list_empty(&ctx->iopoll_list)) {
+ ctx->poll_multi_queue = false;
+ } else if (!ctx->poll_multi_queue) {
+ struct io_kiocb *list_req;
+
+ list_req = container_of(ctx->iopoll_list.first, struct io_kiocb,
+ comp_list);
+ if (list_req->file != req->file)
+ ctx->poll_multi_queue = true;
+ }
+
+ /*
+ * For fast devices, IO may have already completed. If it has, add
+ * it to the front so we find it first.
+ */
+ if (READ_ONCE(req->iopoll_completed))
+ wq_list_add_head(&req->comp_list, &ctx->iopoll_list);
+ else
+ wq_list_add_tail(&req->comp_list, &ctx->iopoll_list);
+
+ if (unlikely(needs_lock)) {
+ /*
+ * If IORING_SETUP_SQPOLL is enabled, sqes are either handle
+ * in sq thread task context or in io worker task context. If
+ * current task context is sq thread, we don't need to check
+ * whether should wake up sq thread.
+ */
+ if ((ctx->flags & IORING_SETUP_SQPOLL) &&
+ wq_has_sleeper(&ctx->sq_data->wait))
+ wake_up(&ctx->sq_data->wait);
+
+ mutex_unlock(&ctx->uring_lock);
+ }
+}
+
+static bool io_bdev_nowait(struct block_device *bdev)
+{
+ return !bdev || blk_queue_nowait(bdev_get_queue(bdev));
+}
+
+/*
+ * If we tracked the file through the SCM inflight mechanism, we could support
+ * any file. For now, just ensure that anything potentially problematic is done
+ * inline.
+ */
+static bool __io_file_supports_nowait(struct file *file, umode_t mode)
+{
+ if (S_ISBLK(mode)) {
+ if (IS_ENABLED(CONFIG_BLOCK) &&
+ io_bdev_nowait(I_BDEV(file->f_mapping->host)))
+ return true;
+ return false;
+ }
+ if (S_ISSOCK(mode))
+ return true;
+ if (S_ISREG(mode)) {
+ if (IS_ENABLED(CONFIG_BLOCK) &&
+ io_bdev_nowait(file->f_inode->i_sb->s_bdev) &&
+ file->f_op != &io_uring_fops)
+ return true;
+ return false;
+ }
+
+ /* any ->read/write should understand O_NONBLOCK */
+ if (file->f_flags & O_NONBLOCK)
+ return true;
+ return file->f_mode & FMODE_NOWAIT;
+}
+
+/*
+ * If we tracked the file through the SCM inflight mechanism, we could support
+ * any file. For now, just ensure that anything potentially problematic is done
+ * inline.
+ */
+static unsigned int io_file_get_flags(struct file *file)
+{
+ umode_t mode = file_inode(file)->i_mode;
+ unsigned int res = 0;
+
+ if (S_ISREG(mode))
+ res |= FFS_ISREG;
+ if (__io_file_supports_nowait(file, mode))
+ res |= FFS_NOWAIT;
+ return res;
+}
+
+static inline bool io_file_supports_nowait(struct io_kiocb *req)
+{
+ return req->flags & REQ_F_SUPPORT_NOWAIT;
+}
+
+static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct kiocb *kiocb = &req->rw.kiocb;
+ unsigned ioprio;
+ int ret;
+
+ kiocb->ki_pos = READ_ONCE(sqe->off);
+ /* used for fixed read/write too - just read unconditionally */
+ req->buf_index = READ_ONCE(sqe->buf_index);
+ req->imu = NULL;
+
+ if (req->opcode == IORING_OP_READ_FIXED ||
+ req->opcode == IORING_OP_WRITE_FIXED) {
+ struct io_ring_ctx *ctx = req->ctx;
+ u16 index;
+
+ if (unlikely(req->buf_index >= ctx->nr_user_bufs))
+ return -EFAULT;
+ index = array_index_nospec(req->buf_index, ctx->nr_user_bufs);
+ req->imu = ctx->user_bufs[index];
+ io_req_set_rsrc_node(req, ctx, 0);
+ }
+
+ ioprio = READ_ONCE(sqe->ioprio);
+ if (ioprio) {
+ ret = ioprio_check_cap(ioprio);
+ if (ret)
+ return ret;
+
+ kiocb->ki_ioprio = ioprio;
+ } else {
+ kiocb->ki_ioprio = get_current_ioprio();
+ }
+
+ req->rw.addr = READ_ONCE(sqe->addr);
+ req->rw.len = READ_ONCE(sqe->len);
+ req->rw.flags = READ_ONCE(sqe->rw_flags);
+ return 0;
+}
+
+static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
+{
+ switch (ret) {
+ case -EIOCBQUEUED:
+ break;
+ case -ERESTARTSYS:
+ case -ERESTARTNOINTR:
+ case -ERESTARTNOHAND:
+ case -ERESTART_RESTARTBLOCK:
+ /*
+ * We can't just restart the syscall, since previously
+ * submitted sqes may already be in progress. Just fail this
+ * IO with EINTR.
+ */
+ ret = -EINTR;
+ fallthrough;
+ default:
+ kiocb->ki_complete(kiocb, ret);
+ }
+}
+
+static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req)
+{
+ struct kiocb *kiocb = &req->rw.kiocb;
+
+ if (kiocb->ki_pos != -1)
+ return &kiocb->ki_pos;
+
+ if (!(req->file->f_mode & FMODE_STREAM)) {
+ req->flags |= REQ_F_CUR_POS;
+ kiocb->ki_pos = req->file->f_pos;
+ return &kiocb->ki_pos;
+ }
+
+ kiocb->ki_pos = 0;
+ return NULL;
+}
+
+static void kiocb_done(struct io_kiocb *req, ssize_t ret,
+ unsigned int issue_flags)
+{
+ struct io_async_rw *io = req->async_data;
+
+ /* add previously done IO, if any */
+ if (req_has_async_data(req) && io->bytes_done > 0) {
+ if (ret < 0)
+ ret = io->bytes_done;
+ else
+ ret += io->bytes_done;
+ }
+
+ if (req->flags & REQ_F_CUR_POS)
+ req->file->f_pos = req->rw.kiocb.ki_pos;
+ if (ret >= 0 && (req->rw.kiocb.ki_complete == io_complete_rw))
+ __io_complete_rw(req, ret, issue_flags);
+ else
+ io_rw_done(&req->rw.kiocb, ret);
+
+ if (req->flags & REQ_F_REISSUE) {
+ req->flags &= ~REQ_F_REISSUE;
+ if (io_resubmit_prep(req))
+ io_req_task_queue_reissue(req);
+ else
+ io_req_task_queue_fail(req, ret);
+ }
+}
+
+static int __io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
+ struct io_mapped_ubuf *imu)
+{
+ size_t len = req->rw.len;
+ u64 buf_end, buf_addr = req->rw.addr;
+ size_t offset;
+
+ if (unlikely(check_add_overflow(buf_addr, (u64)len, &buf_end)))
+ return -EFAULT;
+ /* not inside the mapped region */
+ if (unlikely(buf_addr < imu->ubuf || buf_end > imu->ubuf_end))
+ return -EFAULT;
+
+ /*
+ * May not be a start of buffer, set size appropriately
+ * and advance us to the beginning.
+ */
+ offset = buf_addr - imu->ubuf;
+ iov_iter_bvec(iter, rw, imu->bvec, imu->nr_bvecs, offset + len);
+
+ if (offset) {
+ /*
+ * Don't use iov_iter_advance() here, as it's really slow for
+ * using the latter parts of a big fixed buffer - it iterates
+ * over each segment manually. We can cheat a bit here, because
+ * we know that:
+ *
+ * 1) it's a BVEC iter, we set it up
+ * 2) all bvecs are PAGE_SIZE in size, except potentially the
+ * first and last bvec
+ *
+ * So just find our index, and adjust the iterator afterwards.
+ * If the offset is within the first bvec (or the whole first
+ * bvec, just use iov_iter_advance(). This makes it easier
+ * since we can just skip the first segment, which may not
+ * be PAGE_SIZE aligned.
+ */
+ const struct bio_vec *bvec = imu->bvec;
+
+ if (offset <= bvec->bv_len) {
+ iov_iter_advance(iter, offset);
+ } else {
+ unsigned long seg_skip;
+
+ /* skip first vec */
+ offset -= bvec->bv_len;
+ seg_skip = 1 + (offset >> PAGE_SHIFT);
+
+ iter->bvec = bvec + seg_skip;
+ iter->nr_segs -= seg_skip;
+ iter->count -= bvec->bv_len + offset;
+ iter->iov_offset = offset & ~PAGE_MASK;
+ }
+ }
+
+ return 0;
+}
+
+static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter,
+ unsigned int issue_flags)
+{
+ if (WARN_ON_ONCE(!req->imu))
+ return -EFAULT;
+ return __io_import_fixed(req, rw, iter, req->imu);
+}
+
+static void io_ring_submit_unlock(struct io_ring_ctx *ctx, bool needs_lock)
+{
+ if (needs_lock)
+ mutex_unlock(&ctx->uring_lock);
+}
+
+static void io_ring_submit_lock(struct io_ring_ctx *ctx, bool needs_lock)
+{
+ /*
+ * "Normal" inline submissions always hold the uring_lock, since we
+ * grab it from the system call. Same is true for the SQPOLL offload.
+ * The only exception is when we've detached the request and issue it
+ * from an async worker thread, grab the lock for that case.
+ */
+ if (needs_lock)
+ mutex_lock(&ctx->uring_lock);
+}
+
+static void io_buffer_add_list(struct io_ring_ctx *ctx,
+ struct io_buffer_list *bl, unsigned int bgid)
+{
+ struct list_head *list;
+
+ list = &ctx->io_buffers[hash_32(bgid, IO_BUFFERS_HASH_BITS)];
+ INIT_LIST_HEAD(&bl->buf_list);
+ bl->bgid = bgid;
+ list_add(&bl->list, list);
+}
+
+static struct io_buffer *io_buffer_select(struct io_kiocb *req, size_t *len,
+ int bgid, unsigned int issue_flags)
+{
+ struct io_buffer *kbuf = req->kbuf;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_buffer_list *bl;
+
+ if (req->flags & REQ_F_BUFFER_SELECTED)
+ return kbuf;
+
+ io_ring_submit_lock(ctx, needs_lock);
+
+ lockdep_assert_held(&ctx->uring_lock);
+
+ bl = io_buffer_get_list(ctx, bgid);
+ if (bl && !list_empty(&bl->buf_list)) {
+ kbuf = list_first_entry(&bl->buf_list, struct io_buffer, list);
+ list_del(&kbuf->list);
+ if (*len > kbuf->len)
+ *len = kbuf->len;
+ req->flags |= REQ_F_BUFFER_SELECTED;
+ req->kbuf = kbuf;
+ } else {
+ kbuf = ERR_PTR(-ENOBUFS);
+ }
+
+ io_ring_submit_unlock(req->ctx, needs_lock);
+ return kbuf;
+}
+
+static void __user *io_rw_buffer_select(struct io_kiocb *req, size_t *len,
+ unsigned int issue_flags)
+{
+ struct io_buffer *kbuf;
+ u16 bgid;
+
+ bgid = req->buf_index;
+ kbuf = io_buffer_select(req, len, bgid, issue_flags);
+ if (IS_ERR(kbuf))
+ return kbuf;
+ return u64_to_user_ptr(kbuf->addr);
+}
+
+#ifdef CONFIG_COMPAT
+static ssize_t io_compat_import(struct io_kiocb *req, struct iovec *iov,
+ unsigned int issue_flags)
+{
+ struct compat_iovec __user *uiov;
+ compat_ssize_t clen;
+ void __user *buf;
+ ssize_t len;
+
+ uiov = u64_to_user_ptr(req->rw.addr);
+ if (!access_ok(uiov, sizeof(*uiov)))
+ return -EFAULT;
+ if (__get_user(clen, &uiov->iov_len))
+ return -EFAULT;
+ if (clen < 0)
+ return -EINVAL;
+
+ len = clen;
+ buf = io_rw_buffer_select(req, &len, issue_flags);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+ iov[0].iov_base = buf;
+ iov[0].iov_len = (compat_size_t) len;
+ return 0;
+}
+#endif
+
+static ssize_t __io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
+ unsigned int issue_flags)
+{
+ struct iovec __user *uiov = u64_to_user_ptr(req->rw.addr);
+ void __user *buf;
+ ssize_t len;
+
+ if (copy_from_user(iov, uiov, sizeof(*uiov)))
+ return -EFAULT;
+
+ len = iov[0].iov_len;
+ if (len < 0)
+ return -EINVAL;
+ buf = io_rw_buffer_select(req, &len, issue_flags);
+ if (IS_ERR(buf))
+ return PTR_ERR(buf);
+ iov[0].iov_base = buf;
+ iov[0].iov_len = len;
+ return 0;
+}
+
+static ssize_t io_iov_buffer_select(struct io_kiocb *req, struct iovec *iov,
+ unsigned int issue_flags)
+{
+ if (req->flags & REQ_F_BUFFER_SELECTED) {
+ struct io_buffer *kbuf = req->kbuf;
+
+ iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
+ iov[0].iov_len = kbuf->len;
+ return 0;
+ }
+ if (req->rw.len != 1)
+ return -EINVAL;
+
+#ifdef CONFIG_COMPAT
+ if (req->ctx->compat)
+ return io_compat_import(req, iov, issue_flags);
+#endif
+
+ return __io_iov_buffer_select(req, iov, issue_flags);
+}
+
+static inline bool io_do_buffer_select(struct io_kiocb *req)
+{
+ if (!(req->flags & REQ_F_BUFFER_SELECT))
+ return false;
+ return !(req->flags & REQ_F_BUFFER_SELECTED);
+}
+
+static struct iovec *__io_import_iovec(int rw, struct io_kiocb *req,
+ struct io_rw_state *s,
+ unsigned int issue_flags)
+{
+ struct iov_iter *iter = &s->iter;
+ u8 opcode = req->opcode;
+ struct iovec *iovec;
+ void __user *buf;
+ size_t sqe_len;
+ ssize_t ret;
+
+ if (opcode == IORING_OP_READ_FIXED || opcode == IORING_OP_WRITE_FIXED) {
+ ret = io_import_fixed(req, rw, iter, issue_flags);
+ if (ret)
+ return ERR_PTR(ret);
+ return NULL;
+ }
+
+ /* buffer index only valid with fixed read/write, or buffer select */
+ if (unlikely(req->buf_index && !(req->flags & REQ_F_BUFFER_SELECT)))
+ return ERR_PTR(-EINVAL);
+
+ buf = u64_to_user_ptr(req->rw.addr);
+ sqe_len = req->rw.len;
+
+ if (opcode == IORING_OP_READ || opcode == IORING_OP_WRITE) {
+ if (req->flags & REQ_F_BUFFER_SELECT) {
+ buf = io_rw_buffer_select(req, &sqe_len, issue_flags);
+ if (IS_ERR(buf))
+ return ERR_CAST(buf);
+ req->rw.len = sqe_len;
+ }
+
+ ret = import_single_range(rw, buf, sqe_len, s->fast_iov, iter);
+ if (ret)
+ return ERR_PTR(ret);
+ return NULL;
+ }
+
+ iovec = s->fast_iov;
+ if (req->flags & REQ_F_BUFFER_SELECT) {
+ ret = io_iov_buffer_select(req, iovec, issue_flags);
+ if (ret)
+ return ERR_PTR(ret);
+ iov_iter_init(iter, rw, iovec, 1, iovec->iov_len);
+ return NULL;
+ }
+
+ ret = __import_iovec(rw, buf, sqe_len, UIO_FASTIOV, &iovec, iter,
+ req->ctx->compat);
+ if (unlikely(ret < 0))
+ return ERR_PTR(ret);
+ return iovec;
+}
+
+static inline int io_import_iovec(int rw, struct io_kiocb *req,
+ struct iovec **iovec, struct io_rw_state *s,
+ unsigned int issue_flags)
+{
+ *iovec = __io_import_iovec(rw, req, s, issue_flags);
+ if (unlikely(IS_ERR(*iovec)))
+ return PTR_ERR(*iovec);
+
+ iov_iter_save_state(&s->iter, &s->iter_state);
+ return 0;
+}
+
+static inline loff_t *io_kiocb_ppos(struct kiocb *kiocb)
+{
+ return (kiocb->ki_filp->f_mode & FMODE_STREAM) ? NULL : &kiocb->ki_pos;
+}
+
+/*
+ * For files that don't have ->read_iter() and ->write_iter(), handle them
+ * by looping over ->read() or ->write() manually.
+ */
+static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
+{
+ struct kiocb *kiocb = &req->rw.kiocb;
+ struct file *file = req->file;
+ ssize_t ret = 0;
+ loff_t *ppos;
+
+ /*
+ * Don't support polled IO through this interface, and we can't
+ * support non-blocking either. For the latter, this just causes
+ * the kiocb to be handled from an async context.
+ */
+ if (kiocb->ki_flags & IOCB_HIPRI)
+ return -EOPNOTSUPP;
+ if ((kiocb->ki_flags & IOCB_NOWAIT) &&
+ !(kiocb->ki_filp->f_flags & O_NONBLOCK))
+ return -EAGAIN;
+
+ ppos = io_kiocb_ppos(kiocb);
+
+ while (iov_iter_count(iter)) {
+ struct iovec iovec;
+ ssize_t nr;
+
+ if (!iov_iter_is_bvec(iter)) {
+ iovec = iov_iter_iovec(iter);
+ } else {
+ iovec.iov_base = u64_to_user_ptr(req->rw.addr);
+ iovec.iov_len = req->rw.len;
+ }
+
+ if (rw == READ) {
+ nr = file->f_op->read(file, iovec.iov_base,
+ iovec.iov_len, ppos);
+ } else {
+ nr = file->f_op->write(file, iovec.iov_base,
+ iovec.iov_len, ppos);
+ }
+
+ if (nr < 0) {
+ if (!ret)
+ ret = nr;
+ break;
+ }
+ ret += nr;
+ if (!iov_iter_is_bvec(iter)) {
+ iov_iter_advance(iter, nr);
+ } else {
+ req->rw.addr += nr;
+ req->rw.len -= nr;
+ if (!req->rw.len)
+ break;
+ }
+ if (nr != iovec.iov_len)
+ break;
+ }
+
+ return ret;
+}
+
+static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
+ const struct iovec *fast_iov, struct iov_iter *iter)
+{
+ struct io_async_rw *rw = req->async_data;
+
+ memcpy(&rw->s.iter, iter, sizeof(*iter));
+ rw->free_iovec = iovec;
+ rw->bytes_done = 0;
+ /* can only be fixed buffers, no need to do anything */
+ if (iov_iter_is_bvec(iter))
+ return;
+ if (!iovec) {
+ unsigned iov_off = 0;
+
+ rw->s.iter.iov = rw->s.fast_iov;
+ if (iter->iov != fast_iov) {
+ iov_off = iter->iov - fast_iov;
+ rw->s.iter.iov += iov_off;
+ }
+ if (rw->s.fast_iov != fast_iov)
+ memcpy(rw->s.fast_iov + iov_off, fast_iov + iov_off,
+ sizeof(struct iovec) * iter->nr_segs);
+ } else {
+ req->flags |= REQ_F_NEED_CLEANUP;
+ }
+}
+
+static inline bool io_alloc_async_data(struct io_kiocb *req)
+{
+ WARN_ON_ONCE(!io_op_defs[req->opcode].async_size);
+ req->async_data = kmalloc(io_op_defs[req->opcode].async_size, GFP_KERNEL);
+ if (req->async_data) {
+ req->flags |= REQ_F_ASYNC_DATA;
+ return false;
+ }
+ return true;
+}
+
+static int io_setup_async_rw(struct io_kiocb *req, const struct iovec *iovec,
+ struct io_rw_state *s, bool force)
+{
+ if (!force && !io_op_defs[req->opcode].needs_async_setup)
+ return 0;
+ if (!req_has_async_data(req)) {
+ struct io_async_rw *iorw;
+
+ if (io_alloc_async_data(req)) {
+ kfree(iovec);
+ return -ENOMEM;
+ }
+
+ io_req_map_rw(req, iovec, s->fast_iov, &s->iter);
+ iorw = req->async_data;
+ /* we've copied and mapped the iter, ensure state is saved */
+ iov_iter_save_state(&iorw->s.iter, &iorw->s.iter_state);
+ }
+ return 0;
+}
+
+static inline int io_rw_prep_async(struct io_kiocb *req, int rw)
+{
+ struct io_async_rw *iorw = req->async_data;
+ struct iovec *iov;
+ int ret;
+
+ /* submission path, ->uring_lock should already be taken */
+ ret = io_import_iovec(rw, req, &iov, &iorw->s, 0);
+ if (unlikely(ret < 0))
+ return ret;
+
+ iorw->bytes_done = 0;
+ iorw->free_iovec = iov;
+ if (iov)
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+/*
+ * This is our waitqueue callback handler, registered through __folio_lock_async()
+ * when we initially tried to do the IO with the iocb armed our waitqueue.
+ * This gets called when the page is unlocked, and we generally expect that to
+ * happen when the page IO is completed and the page is now uptodate. This will
+ * queue a task_work based retry of the operation, attempting to copy the data
+ * again. If the latter fails because the page was NOT uptodate, then we will
+ * do a thread based blocking retry of the operation. That's the unexpected
+ * slow path.
+ */
+static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
+ int sync, void *arg)
+{
+ struct wait_page_queue *wpq;
+ struct io_kiocb *req = wait->private;
+ struct wait_page_key *key = arg;
+
+ wpq = container_of(wait, struct wait_page_queue, wait);
+
+ if (!wake_page_match(wpq, key))
+ return 0;
+
+ req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
+ list_del_init(&wait->entry);
+ io_req_task_queue(req);
+ return 1;
+}
+
+/*
+ * This controls whether a given IO request should be armed for async page
+ * based retry. If we return false here, the request is handed to the async
+ * worker threads for retry. If we're doing buffered reads on a regular file,
+ * we prepare a private wait_page_queue entry and retry the operation. This
+ * will either succeed because the page is now uptodate and unlocked, or it
+ * will register a callback when the page is unlocked at IO completion. Through
+ * that callback, io_uring uses task_work to setup a retry of the operation.
+ * That retry will attempt the buffered read again. The retry will generally
+ * succeed, or in rare cases where it fails, we then fall back to using the
+ * async worker threads for a blocking retry.
+ */
+static bool io_rw_should_retry(struct io_kiocb *req)
+{
+ struct io_async_rw *rw = req->async_data;
+ struct wait_page_queue *wait = &rw->wpq;
+ struct kiocb *kiocb = &req->rw.kiocb;
+
+ /* never retry for NOWAIT, we just complete with -EAGAIN */
+ if (req->flags & REQ_F_NOWAIT)
+ return false;
+
+ /* Only for buffered IO */
+ if (kiocb->ki_flags & (IOCB_DIRECT | IOCB_HIPRI))
+ return false;
+
+ /*
+ * just use poll if we can, and don't attempt if the fs doesn't
+ * support callback based unlocks
+ */
+ if (file_can_poll(req->file) || !(req->file->f_mode & FMODE_BUF_RASYNC))
+ return false;
+
+ wait->wait.func = io_async_buf_func;
+ wait->wait.private = req;
+ wait->wait.flags = 0;
+ INIT_LIST_HEAD(&wait->wait.entry);
+ kiocb->ki_flags |= IOCB_WAITQ;
+ kiocb->ki_flags &= ~IOCB_NOWAIT;
+ kiocb->ki_waitq = wait;
+ return true;
+}
+
+static inline int io_iter_do_read(struct io_kiocb *req, struct iov_iter *iter)
+{
+ if (likely(req->file->f_op->read_iter))
+ return call_read_iter(req->file, &req->rw.kiocb, iter);
+ else if (req->file->f_op->read)
+ return loop_rw_iter(READ, req, iter);
+ else
+ return -EINVAL;
+}
+
+static bool need_read_all(struct io_kiocb *req)
+{
+ return req->flags & REQ_F_ISREG ||
+ S_ISBLK(file_inode(req->file)->i_mode);
+}
+
+static int io_rw_init_file(struct io_kiocb *req, fmode_t mode)
+{
+ struct kiocb *kiocb = &req->rw.kiocb;
+ struct io_ring_ctx *ctx = req->ctx;
+ struct file *file = req->file;
+ int ret;
+
+ if (unlikely(!file || !(file->f_mode & mode)))
+ return -EBADF;
+
+ if (!io_req_ffs_set(req))
+ req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
+
+ kiocb->ki_flags = iocb_flags(file);
+ ret = kiocb_set_rw_flags(kiocb, req->rw.flags);
+ if (unlikely(ret))
+ return ret;
+
+ /*
+ * If the file is marked O_NONBLOCK, still allow retry for it if it
+ * supports async. Otherwise it's impossible to use O_NONBLOCK files
+ * reliably. If not, or it IOCB_NOWAIT is set, don't retry.
+ */
+ if ((kiocb->ki_flags & IOCB_NOWAIT) ||
+ ((file->f_flags & O_NONBLOCK) && !io_file_supports_nowait(req)))
+ req->flags |= REQ_F_NOWAIT;
+
+ if (ctx->flags & IORING_SETUP_IOPOLL) {
+ if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll)
+ return -EOPNOTSUPP;
+
+ kiocb->private = NULL;
+ kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
+ kiocb->ki_complete = io_complete_rw_iopoll;
+ req->iopoll_completed = 0;
+ } else {
+ if (kiocb->ki_flags & IOCB_HIPRI)
+ return -EINVAL;
+ kiocb->ki_complete = io_complete_rw;
+ }
+
+ return 0;
+}
+
+static int io_read(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_rw_state __s, *s = &__s;
+ struct iovec *iovec;
+ struct kiocb *kiocb = &req->rw.kiocb;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+ struct io_async_rw *rw;
+ ssize_t ret, ret2;
+ loff_t *ppos;
+
+ if (!req_has_async_data(req)) {
+ ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
+ if (unlikely(ret < 0))
+ return ret;
+ } else {
+ rw = req->async_data;
+ s = &rw->s;
+
+ /*
+ * Safe and required to re-import if we're using provided
+ * buffers, as we dropped the selected one before retry.
+ */
+ if (io_do_buffer_select(req)) {
+ ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
+ if (unlikely(ret < 0))
+ return ret;
+ }
+
+ /*
+ * We come here from an earlier attempt, restore our state to
+ * match in case it doesn't. It's cheap enough that we don't
+ * need to make this conditional.
+ */
+ iov_iter_restore(&s->iter, &s->iter_state);
+ iovec = NULL;
+ }
+ ret = io_rw_init_file(req, FMODE_READ);
+ if (unlikely(ret)) {
+ kfree(iovec);
+ return ret;
+ }
+ req->result = iov_iter_count(&s->iter);
+
+ if (force_nonblock) {
+ /* If the file doesn't support async, just async punt */
+ if (unlikely(!io_file_supports_nowait(req))) {
+ ret = io_setup_async_rw(req, iovec, s, true);
+ return ret ?: -EAGAIN;
+ }
+ kiocb->ki_flags |= IOCB_NOWAIT;
+ } else {
+ /* Ensure we clear previously set non-block flag */
+ kiocb->ki_flags &= ~IOCB_NOWAIT;
+ }
+
+ ppos = io_kiocb_update_pos(req);
+
+ ret = rw_verify_area(READ, req->file, ppos, req->result);
+ if (unlikely(ret)) {
+ kfree(iovec);
+ return ret;
+ }
+
+ ret = io_iter_do_read(req, &s->iter);
+
+ if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
+ req->flags &= ~REQ_F_REISSUE;
+ /* if we can poll, just do that */
+ if (req->opcode == IORING_OP_READ && file_can_poll(req->file))
+ return -EAGAIN;
+ /* IOPOLL retry should happen for io-wq threads */
+ if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
+ goto done;
+ /* no retry on NONBLOCK nor RWF_NOWAIT */
+ if (req->flags & REQ_F_NOWAIT)
+ goto done;
+ ret = 0;
+ } else if (ret == -EIOCBQUEUED) {
+ goto out_free;
+ } else if (ret == req->result || ret <= 0 || !force_nonblock ||
+ (req->flags & REQ_F_NOWAIT) || !need_read_all(req)) {
+ /* read all, failed, already did sync or don't want to retry */
+ goto done;
+ }
+
+ /*
+ * Don't depend on the iter state matching what was consumed, or being
+ * untouched in case of error. Restore it and we'll advance it
+ * manually if we need to.
+ */
+ iov_iter_restore(&s->iter, &s->iter_state);
+
+ ret2 = io_setup_async_rw(req, iovec, s, true);
+ if (ret2)
+ return ret2;
+
+ iovec = NULL;
+ rw = req->async_data;
+ s = &rw->s;
+ /*
+ * Now use our persistent iterator and state, if we aren't already.
+ * We've restored and mapped the iter to match.
+ */
+
+ do {
+ /*
+ * We end up here because of a partial read, either from
+ * above or inside this loop. Advance the iter by the bytes
+ * that were consumed.
+ */
+ iov_iter_advance(&s->iter, ret);
+ if (!iov_iter_count(&s->iter))
+ break;
+ rw->bytes_done += ret;
+ iov_iter_save_state(&s->iter, &s->iter_state);
+
+ /* if we can retry, do so with the callbacks armed */
+ if (!io_rw_should_retry(req)) {
+ kiocb->ki_flags &= ~IOCB_WAITQ;
+ return -EAGAIN;
+ }
+
+ /*
+ * Now retry read with the IOCB_WAITQ parts set in the iocb. If
+ * we get -EIOCBQUEUED, then we'll get a notification when the
+ * desired page gets unlocked. We can also get a partial read
+ * here, and if we do, then just retry at the new offset.
+ */
+ ret = io_iter_do_read(req, &s->iter);
+ if (ret == -EIOCBQUEUED)
+ return 0;
+ /* we got some bytes, but not all. retry. */
+ kiocb->ki_flags &= ~IOCB_WAITQ;
+ iov_iter_restore(&s->iter, &s->iter_state);
+ } while (ret > 0);
+done:
+ kiocb_done(req, ret, issue_flags);
+out_free:
+ /* it's faster to check here then delegate to kfree */
+ if (iovec)
+ kfree(iovec);
+ return 0;
+}
+
+static int io_write(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_rw_state __s, *s = &__s;
+ struct iovec *iovec;
+ struct kiocb *kiocb = &req->rw.kiocb;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+ ssize_t ret, ret2;
+ loff_t *ppos;
+
+ if (!req_has_async_data(req)) {
+ ret = io_import_iovec(WRITE, req, &iovec, s, issue_flags);
+ if (unlikely(ret < 0))
+ return ret;
+ } else {
+ struct io_async_rw *rw = req->async_data;
+
+ s = &rw->s;
+ iov_iter_restore(&s->iter, &s->iter_state);
+ iovec = NULL;
+ }
+ ret = io_rw_init_file(req, FMODE_WRITE);
+ if (unlikely(ret)) {
+ kfree(iovec);
+ return ret;
+ }
+ req->result = iov_iter_count(&s->iter);
+
+ if (force_nonblock) {
+ /* If the file doesn't support async, just async punt */
+ if (unlikely(!io_file_supports_nowait(req)))
+ goto copy_iov;
+
+ /* file path doesn't support NOWAIT for non-direct_IO */
+ if (force_nonblock && !(kiocb->ki_flags & IOCB_DIRECT) &&
+ (req->flags & REQ_F_ISREG))
+ goto copy_iov;
+
+ kiocb->ki_flags |= IOCB_NOWAIT;
+ } else {
+ /* Ensure we clear previously set non-block flag */
+ kiocb->ki_flags &= ~IOCB_NOWAIT;
+ }
+
+ ppos = io_kiocb_update_pos(req);
+
+ ret = rw_verify_area(WRITE, req->file, ppos, req->result);
+ if (unlikely(ret))
+ goto out_free;
+
+ /*
+ * Open-code file_start_write here to grab freeze protection,
+ * which will be released by another thread in
+ * io_complete_rw(). Fool lockdep by telling it the lock got
+ * released so that it doesn't complain about the held lock when
+ * we return to userspace.
+ */
+ if (req->flags & REQ_F_ISREG) {
+ sb_start_write(file_inode(req->file)->i_sb);
+ __sb_writers_release(file_inode(req->file)->i_sb,
+ SB_FREEZE_WRITE);
+ }
+ kiocb->ki_flags |= IOCB_WRITE;
+
+ if (likely(req->file->f_op->write_iter))
+ ret2 = call_write_iter(req->file, kiocb, &s->iter);
+ else if (req->file->f_op->write)
+ ret2 = loop_rw_iter(WRITE, req, &s->iter);
+ else
+ ret2 = -EINVAL;
+
+ if (req->flags & REQ_F_REISSUE) {
+ req->flags &= ~REQ_F_REISSUE;
+ ret2 = -EAGAIN;
+ }
+
+ /*
+ * Raw bdev writes will return -EOPNOTSUPP for IOCB_NOWAIT. Just
+ * retry them without IOCB_NOWAIT.
+ */
+ if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
+ ret2 = -EAGAIN;
+ /* no retry on NONBLOCK nor RWF_NOWAIT */
+ if (ret2 == -EAGAIN && (req->flags & REQ_F_NOWAIT))
+ goto done;
+ if (!force_nonblock || ret2 != -EAGAIN) {
+ /* IOPOLL retry should happen for io-wq threads */
+ if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
+ goto copy_iov;
+done:
+ kiocb_done(req, ret2, issue_flags);
+ } else {
+copy_iov:
+ iov_iter_restore(&s->iter, &s->iter_state);
+ ret = io_setup_async_rw(req, iovec, s, false);
+ return ret ?: -EAGAIN;
+ }
+out_free:
+ /* it's reportedly faster than delegating the null check to kfree() */
+ if (iovec)
+ kfree(iovec);
+ return ret;
+}
+
+static int io_renameat_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_rename *ren = &req->rename;
+ const char __user *oldf, *newf;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ ren->old_dfd = READ_ONCE(sqe->fd);
+ oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+ ren->new_dfd = READ_ONCE(sqe->len);
+ ren->flags = READ_ONCE(sqe->rename_flags);
+
+ ren->oldpath = getname(oldf);
+ if (IS_ERR(ren->oldpath))
+ return PTR_ERR(ren->oldpath);
+
+ ren->newpath = getname(newf);
+ if (IS_ERR(ren->newpath)) {
+ putname(ren->oldpath);
+ return PTR_ERR(ren->newpath);
+ }
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_renameat(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_rename *ren = &req->rename;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = do_renameat2(ren->old_dfd, ren->oldpath, ren->new_dfd,
+ ren->newpath, ren->flags);
+
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_unlinkat_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_unlink *un = &req->unlink;
+ const char __user *fname;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->off || sqe->len || sqe->buf_index ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ un->dfd = READ_ONCE(sqe->fd);
+
+ un->flags = READ_ONCE(sqe->unlink_flags);
+ if (un->flags & ~AT_REMOVEDIR)
+ return -EINVAL;
+
+ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ un->filename = getname(fname);
+ if (IS_ERR(un->filename))
+ return PTR_ERR(un->filename);
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_unlinkat(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_unlink *un = &req->unlink;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ if (un->flags & AT_REMOVEDIR)
+ ret = do_rmdir(un->dfd, un->filename);
+ else
+ ret = do_unlinkat(un->dfd, un->filename);
+
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_mkdirat_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_mkdir *mkd = &req->mkdir;
+ const char __user *fname;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->off || sqe->rw_flags || sqe->buf_index ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ mkd->dfd = READ_ONCE(sqe->fd);
+ mkd->mode = READ_ONCE(sqe->len);
+
+ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ mkd->filename = getname(fname);
+ if (IS_ERR(mkd->filename))
+ return PTR_ERR(mkd->filename);
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_mkdirat(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_mkdir *mkd = &req->mkdir;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = do_mkdirat(mkd->dfd, mkd->filename, mkd->mode);
+
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_symlinkat_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_symlink *sl = &req->symlink;
+ const char __user *oldpath, *newpath;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->len || sqe->rw_flags || sqe->buf_index ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ sl->new_dfd = READ_ONCE(sqe->fd);
+ oldpath = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ newpath = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+
+ sl->oldpath = getname(oldpath);
+ if (IS_ERR(sl->oldpath))
+ return PTR_ERR(sl->oldpath);
+
+ sl->newpath = getname(newpath);
+ if (IS_ERR(sl->newpath)) {
+ putname(sl->oldpath);
+ return PTR_ERR(sl->newpath);
+ }
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_symlinkat(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_symlink *sl = &req->symlink;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = do_symlinkat(sl->oldpath, sl->new_dfd, sl->newpath);
+
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_linkat_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_hardlink *lnk = &req->hardlink;
+ const char __user *oldf, *newf;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ lnk->old_dfd = READ_ONCE(sqe->fd);
+ lnk->new_dfd = READ_ONCE(sqe->len);
+ oldf = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ newf = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+ lnk->flags = READ_ONCE(sqe->hardlink_flags);
+
+ lnk->oldpath = getname(oldf);
+ if (IS_ERR(lnk->oldpath))
+ return PTR_ERR(lnk->oldpath);
+
+ lnk->newpath = getname(newf);
+ if (IS_ERR(lnk->newpath)) {
+ putname(lnk->oldpath);
+ return PTR_ERR(lnk->newpath);
+ }
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_linkat(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_hardlink *lnk = &req->hardlink;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = do_linkat(lnk->old_dfd, lnk->oldpath, lnk->new_dfd,
+ lnk->newpath, lnk->flags);
+
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_shutdown_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+#if defined(CONFIG_NET)
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(sqe->ioprio || sqe->off || sqe->addr || sqe->rw_flags ||
+ sqe->buf_index || sqe->splice_fd_in))
+ return -EINVAL;
+
+ req->shutdown.how = READ_ONCE(sqe->len);
+ return 0;
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
+static int io_shutdown(struct io_kiocb *req, unsigned int issue_flags)
+{
+#if defined(CONFIG_NET)
+ struct socket *sock;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ sock = sock_from_file(req->file);
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ ret = __sys_shutdown_sock(sock, req->shutdown.how);
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
+static int __io_splice_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_splice *sp = &req->splice;
+ unsigned int valid_flags = SPLICE_F_FD_IN_FIXED | SPLICE_F_ALL;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+
+ sp->len = READ_ONCE(sqe->len);
+ sp->flags = READ_ONCE(sqe->splice_flags);
+ if (unlikely(sp->flags & ~valid_flags))
+ return -EINVAL;
+ sp->splice_fd_in = READ_ONCE(sqe->splice_fd_in);
+ return 0;
+}
+
+static int io_tee_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ if (READ_ONCE(sqe->splice_off_in) || READ_ONCE(sqe->off))
+ return -EINVAL;
+ return __io_splice_prep(req, sqe);
+}
+
+static int io_tee(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_splice *sp = &req->splice;
+ struct file *out = sp->file_out;
+ unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
+ struct file *in;
+ long ret = 0;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ if (sp->flags & SPLICE_F_FD_IN_FIXED)
+ in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
+ else
+ in = io_file_get_normal(req, sp->splice_fd_in);
+ if (!in) {
+ ret = -EBADF;
+ goto done;
+ }
+
+ if (sp->len)
+ ret = do_tee(in, out, sp->len, flags);
+
+ if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
+ io_put_file(in);
+done:
+ if (ret != sp->len)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_splice_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_splice *sp = &req->splice;
+
+ sp->off_in = READ_ONCE(sqe->splice_off_in);
+ sp->off_out = READ_ONCE(sqe->off);
+ return __io_splice_prep(req, sqe);
+}
+
+static int io_splice(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_splice *sp = &req->splice;
+ struct file *out = sp->file_out;
+ unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
+ loff_t *poff_in, *poff_out;
+ struct file *in;
+ long ret = 0;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ if (sp->flags & SPLICE_F_FD_IN_FIXED)
+ in = io_file_get_fixed(req, sp->splice_fd_in, issue_flags);
+ else
+ in = io_file_get_normal(req, sp->splice_fd_in);
+ if (!in) {
+ ret = -EBADF;
+ goto done;
+ }
+
+ poff_in = (sp->off_in == -1) ? NULL : &sp->off_in;
+ poff_out = (sp->off_out == -1) ? NULL : &sp->off_out;
+
+ if (sp->len)
+ ret = do_splice(in, poff_in, out, poff_out, sp->len, flags);
+
+ if (!(sp->flags & SPLICE_F_FD_IN_FIXED))
+ io_put_file(in);
+done:
+ if (ret != sp->len)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+/*
+ * IORING_OP_NOP just posts a completion event, nothing else.
+ */
+static int io_nop(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+
+ __io_req_complete(req, issue_flags, 0, 0);
+ return 0;
+}
+
+static int io_msg_ring_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ if (unlikely(sqe->addr || sqe->ioprio || sqe->rw_flags ||
+ sqe->splice_fd_in || sqe->buf_index || sqe->personality))
+ return -EINVAL;
+
+ req->msg.user_data = READ_ONCE(sqe->off);
+ req->msg.len = READ_ONCE(sqe->len);
+ return 0;
+}
+
+static int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *target_ctx;
+ struct io_msg *msg = &req->msg;
+ bool filled;
+ int ret;
+
+ ret = -EBADFD;
+ if (req->file->f_op != &io_uring_fops)
+ goto done;
+
+ ret = -EOVERFLOW;
+ target_ctx = req->file->private_data;
+
+ spin_lock(&target_ctx->completion_lock);
+ filled = io_fill_cqe_aux(target_ctx, msg->user_data, msg->len, 0);
+ io_commit_cqring(target_ctx);
+ spin_unlock(&target_ctx->completion_lock);
+
+ if (filled) {
+ io_cqring_ev_posted(target_ctx);
+ ret = 0;
+ }
+
+done:
+ if (ret < 0)
+ req_set_fail(req);
+ __io_req_complete(req, issue_flags, ret, 0);
+ /* put file to avoid an attempt to IOPOLL the req */
+ io_put_file(req->file);
+ req->file = NULL;
+ return 0;
+}
+
+static int io_fsync_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
+ sqe->splice_fd_in))
+ return -EINVAL;
+
+ req->sync.flags = READ_ONCE(sqe->fsync_flags);
+ if (unlikely(req->sync.flags & ~IORING_FSYNC_DATASYNC))
+ return -EINVAL;
+
+ req->sync.off = READ_ONCE(sqe->off);
+ req->sync.len = READ_ONCE(sqe->len);
+ return 0;
+}
+
+static int io_fsync(struct io_kiocb *req, unsigned int issue_flags)
+{
+ loff_t end = req->sync.off + req->sync.len;
+ int ret;
+
+ /* fsync always requires a blocking context */
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = vfs_fsync_range(req->file, req->sync.off,
+ end > 0 ? end : LLONG_MAX,
+ req->sync.flags & IORING_FSYNC_DATASYNC);
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_fallocate_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ if (sqe->ioprio || sqe->buf_index || sqe->rw_flags ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+
+ req->sync.off = READ_ONCE(sqe->off);
+ req->sync.len = READ_ONCE(sqe->addr);
+ req->sync.mode = READ_ONCE(sqe->len);
+ return 0;
+}
+
+static int io_fallocate(struct io_kiocb *req, unsigned int issue_flags)
+{
+ int ret;
+
+ /* fallocate always requiring blocking context */
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+ ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
+ req->sync.len);
+ if (ret < 0)
+ req_set_fail(req);
+ else
+ fsnotify_modify(req->file);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ const char __user *fname;
+ int ret;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(sqe->ioprio || sqe->buf_index))
+ return -EINVAL;
+ if (unlikely(req->flags & REQ_F_FIXED_FILE))
+ return -EBADF;
+
+ /* open.how should be already initialised */
+ if (!(req->open.how.flags & O_PATH) && force_o_largefile())
+ req->open.how.flags |= O_LARGEFILE;
+
+ req->open.dfd = READ_ONCE(sqe->fd);
+ fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ req->open.filename = getname(fname);
+ if (IS_ERR(req->open.filename)) {
+ ret = PTR_ERR(req->open.filename);
+ req->open.filename = NULL;
+ return ret;
+ }
+
+ req->open.file_slot = READ_ONCE(sqe->file_index);
+ if (req->open.file_slot && (req->open.how.flags & O_CLOEXEC))
+ return -EINVAL;
+
+ req->open.nofile = rlimit(RLIMIT_NOFILE);
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ u64 mode = READ_ONCE(sqe->len);
+ u64 flags = READ_ONCE(sqe->open_flags);
+
+ req->open.how = build_open_how(flags, mode);
+ return __io_openat_prep(req, sqe);
+}
+
+static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct open_how __user *how;
+ size_t len;
+ int ret;
+
+ how = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+ len = READ_ONCE(sqe->len);
+ if (len < OPEN_HOW_SIZE_VER0)
+ return -EINVAL;
+
+ ret = copy_struct_from_user(&req->open.how, sizeof(req->open.how), how,
+ len);
+ if (ret)
+ return ret;
+
+ return __io_openat_prep(req, sqe);
+}
+
+static int io_openat2(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct open_flags op;
+ struct file *file;
+ bool resolve_nonblock, nonblock_set;
+ bool fixed = !!req->open.file_slot;
+ int ret;
+
+ ret = build_open_flags(&req->open.how, &op);
+ if (ret)
+ goto err;
+ nonblock_set = op.open_flag & O_NONBLOCK;
+ resolve_nonblock = req->open.how.resolve & RESOLVE_CACHED;
+ if (issue_flags & IO_URING_F_NONBLOCK) {
+ /*
+ * Don't bother trying for O_TRUNC, O_CREAT, or O_TMPFILE open,
+ * it'll always -EAGAIN
+ */
+ if (req->open.how.flags & (O_TRUNC | O_CREAT | O_TMPFILE))
+ return -EAGAIN;
+ op.lookup_flags |= LOOKUP_CACHED;
+ op.open_flag |= O_NONBLOCK;
+ }
+
+ if (!fixed) {
+ ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
+ if (ret < 0)
+ goto err;
+ }
+
+ file = do_filp_open(req->open.dfd, req->open.filename, &op);
+ if (IS_ERR(file)) {
+ /*
+ * We could hang on to this 'fd' on retrying, but seems like
+ * marginal gain for something that is now known to be a slower
+ * path. So just put it, and we'll get a new one when we retry.
+ */
+ if (!fixed)
+ put_unused_fd(ret);
+
+ ret = PTR_ERR(file);
+ /* only retry if RESOLVE_CACHED wasn't already set by application */
+ if (ret == -EAGAIN &&
+ (!resolve_nonblock && (issue_flags & IO_URING_F_NONBLOCK)))
+ return -EAGAIN;
+ goto err;
+ }
+
+ if ((issue_flags & IO_URING_F_NONBLOCK) && !nonblock_set)
+ file->f_flags &= ~O_NONBLOCK;
+ fsnotify_open(file);
+
+ if (!fixed)
+ fd_install(ret, file);
+ else
+ ret = io_install_fixed_file(req, file, issue_flags,
+ req->open.file_slot - 1);
+err:
+ putname(req->open.filename);
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret < 0)
+ req_set_fail(req);
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_openat(struct io_kiocb *req, unsigned int issue_flags)
+{
+ return io_openat2(req, issue_flags);
+}
+
+static int io_remove_buffers_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_provide_buf *p = &req->pbuf;
+ u64 tmp;
+
+ if (sqe->ioprio || sqe->rw_flags || sqe->addr || sqe->len || sqe->off ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+
+ tmp = READ_ONCE(sqe->fd);
+ if (!tmp || tmp > USHRT_MAX)
+ return -EINVAL;
+
+ memset(p, 0, sizeof(*p));
+ p->nbufs = tmp;
+ p->bgid = READ_ONCE(sqe->buf_group);
+ return 0;
+}
+
+static int __io_remove_buffers(struct io_ring_ctx *ctx,
+ struct io_buffer_list *bl, unsigned nbufs)
+{
+ unsigned i = 0;
+
+ /* shouldn't happen */
+ if (!nbufs)
+ return 0;
+
+ /* the head kbuf is the list itself */
+ while (!list_empty(&bl->buf_list)) {
+ struct io_buffer *nxt;
+
+ nxt = list_first_entry(&bl->buf_list, struct io_buffer, list);
+ list_del(&nxt->list);
+ if (++i == nbufs)
+ return i;
+ cond_resched();
+ }
+ i++;
+
+ return i;
+}
+
+static int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_provide_buf *p = &req->pbuf;
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_buffer_list *bl;
+ int ret = 0;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+
+ io_ring_submit_lock(ctx, needs_lock);
+
+ lockdep_assert_held(&ctx->uring_lock);
+
+ ret = -ENOENT;
+ bl = io_buffer_get_list(ctx, p->bgid);
+ if (bl)
+ ret = __io_remove_buffers(ctx, bl, p->nbufs);
+ if (ret < 0)
+ req_set_fail(req);
+
+ /* complete before unlock, IOPOLL may need the lock */
+ __io_req_complete(req, issue_flags, ret, 0);
+ io_ring_submit_unlock(ctx, needs_lock);
+ return 0;
+}
+
+static int io_provide_buffers_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ unsigned long size, tmp_check;
+ struct io_provide_buf *p = &req->pbuf;
+ u64 tmp;
+
+ if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
+ return -EINVAL;
+
+ tmp = READ_ONCE(sqe->fd);
+ if (!tmp || tmp > USHRT_MAX)
+ return -E2BIG;
+ p->nbufs = tmp;
+ p->addr = READ_ONCE(sqe->addr);
+ p->len = READ_ONCE(sqe->len);
+
+ if (check_mul_overflow((unsigned long)p->len, (unsigned long)p->nbufs,
+ &size))
+ return -EOVERFLOW;
+ if (check_add_overflow((unsigned long)p->addr, size, &tmp_check))
+ return -EOVERFLOW;
+
+ size = (unsigned long)p->len * p->nbufs;
+ if (!access_ok(u64_to_user_ptr(p->addr), size))
+ return -EFAULT;
+
+ p->bgid = READ_ONCE(sqe->buf_group);
+ tmp = READ_ONCE(sqe->off);
+ if (tmp > USHRT_MAX)
+ return -E2BIG;
+ p->bid = tmp;
+ return 0;
+}
+
+static int io_refill_buffer_cache(struct io_ring_ctx *ctx)
+{
+ struct io_buffer *buf;
+ struct page *page;
+ int bufs_in_page;
+
+ /*
+ * Completions that don't happen inline (eg not under uring_lock) will
+ * add to ->io_buffers_comp. If we don't have any free buffers, check
+ * the completion list and splice those entries first.
+ */
+ if (!list_empty_careful(&ctx->io_buffers_comp)) {
+ spin_lock(&ctx->completion_lock);
+ if (!list_empty(&ctx->io_buffers_comp)) {
+ list_splice_init(&ctx->io_buffers_comp,
+ &ctx->io_buffers_cache);
+ spin_unlock(&ctx->completion_lock);
+ return 0;
+ }
+ spin_unlock(&ctx->completion_lock);
+ }
+
+ /*
+ * No free buffers and no completion entries either. Allocate a new
+ * page worth of buffer entries and add those to our freelist.
+ */
+ page = alloc_page(GFP_KERNEL_ACCOUNT);
+ if (!page)
+ return -ENOMEM;
+
+ list_add(&page->lru, &ctx->io_buffers_pages);
+
+ buf = page_address(page);
+ bufs_in_page = PAGE_SIZE / sizeof(*buf);
+ while (bufs_in_page) {
+ list_add_tail(&buf->list, &ctx->io_buffers_cache);
+ buf++;
+ bufs_in_page--;
+ }
+
+ return 0;
+}
+
+static int io_add_buffers(struct io_ring_ctx *ctx, struct io_provide_buf *pbuf,
+ struct io_buffer_list *bl)
+{
+ struct io_buffer *buf;
+ u64 addr = pbuf->addr;
+ int i, bid = pbuf->bid;
+
+ for (i = 0; i < pbuf->nbufs; i++) {
+ if (list_empty(&ctx->io_buffers_cache) &&
+ io_refill_buffer_cache(ctx))
+ break;
+ buf = list_first_entry(&ctx->io_buffers_cache, struct io_buffer,
+ list);
+ list_move_tail(&buf->list, &bl->buf_list);
+ buf->addr = addr;
+ buf->len = min_t(__u32, pbuf->len, MAX_RW_COUNT);
+ buf->bid = bid;
+ buf->bgid = pbuf->bgid;
+ addr += pbuf->len;
+ bid++;
+ cond_resched();
+ }
+
+ return i ? 0 : -ENOMEM;
+}
+
+static int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_provide_buf *p = &req->pbuf;
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_buffer_list *bl;
+ int ret = 0;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+
+ io_ring_submit_lock(ctx, needs_lock);
+
+ lockdep_assert_held(&ctx->uring_lock);
+
+ bl = io_buffer_get_list(ctx, p->bgid);
+ if (unlikely(!bl)) {
+ bl = kzalloc(sizeof(*bl), GFP_KERNEL_ACCOUNT);
+ if (!bl) {
+ ret = -ENOMEM;
+ goto err;
+ }
+ io_buffer_add_list(ctx, bl, p->bgid);
+ }
+
+ ret = io_add_buffers(ctx, p, bl);
+err:
+ if (ret < 0)
+ req_set_fail(req);
+ /* complete before unlock, IOPOLL may need the lock */
+ __io_req_complete(req, issue_flags, ret, 0);
+ io_ring_submit_unlock(ctx, needs_lock);
+ return 0;
+}
+
+static int io_epoll_ctl_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+#if defined(CONFIG_EPOLL)
+ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+
+ req->epoll.epfd = READ_ONCE(sqe->fd);
+ req->epoll.op = READ_ONCE(sqe->len);
+ req->epoll.fd = READ_ONCE(sqe->off);
+
+ if (ep_op_has_event(req->epoll.op)) {
+ struct epoll_event __user *ev;
+
+ ev = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ if (copy_from_user(&req->epoll.event, ev, sizeof(*ev)))
+ return -EFAULT;
+ }
+
+ return 0;
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
+static int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
+{
+#if defined(CONFIG_EPOLL)
+ struct io_epoll *ie = &req->epoll;
+ int ret;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+
+ ret = do_epoll_ctl(ie->epfd, ie->op, ie->fd, &ie->event, force_nonblock);
+ if (force_nonblock && ret == -EAGAIN)
+ return -EAGAIN;
+
+ if (ret < 0)
+ req_set_fail(req);
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
+static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
+ if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+
+ req->madvise.addr = READ_ONCE(sqe->addr);
+ req->madvise.len = READ_ONCE(sqe->len);
+ req->madvise.advice = READ_ONCE(sqe->fadvise_advice);
+ return 0;
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
+static int io_madvise(struct io_kiocb *req, unsigned int issue_flags)
+{
+#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
+ struct io_madvise *ma = &req->madvise;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = do_madvise(current->mm, ma->addr, ma->len, ma->advice);
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+#else
+ return -EOPNOTSUPP;
+#endif
+}
+
+static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ if (sqe->ioprio || sqe->buf_index || sqe->addr || sqe->splice_fd_in)
+ return -EINVAL;
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+
+ req->fadvise.offset = READ_ONCE(sqe->off);
+ req->fadvise.len = READ_ONCE(sqe->len);
+ req->fadvise.advice = READ_ONCE(sqe->fadvise_advice);
+ return 0;
+}
+
+static int io_fadvise(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_fadvise *fa = &req->fadvise;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK) {
+ switch (fa->advice) {
+ case POSIX_FADV_NORMAL:
+ case POSIX_FADV_RANDOM:
+ case POSIX_FADV_SEQUENTIAL:
+ break;
+ default:
+ return -EAGAIN;
+ }
+ }
+
+ ret = vfs_fadvise(req->file, fa->offset, fa->len, fa->advice);
+ if (ret < 0)
+ req_set_fail(req);
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ const char __user *path;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+ return -EINVAL;
+ if (req->flags & REQ_F_FIXED_FILE)
+ return -EBADF;
+
+ req->statx.dfd = READ_ONCE(sqe->fd);
+ req->statx.mask = READ_ONCE(sqe->len);
+ path = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ req->statx.buffer = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+ req->statx.flags = READ_ONCE(sqe->statx_flags);
+
+ req->statx.filename = getname_flags(path,
+ getname_statx_lookup_flags(req->statx.flags),
+ NULL);
+
+ if (IS_ERR(req->statx.filename)) {
+ int ret = PTR_ERR(req->statx.filename);
+
+ req->statx.filename = NULL;
+ return ret;
+ }
+
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return 0;
+}
+
+static int io_statx(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_statx *ctx = &req->statx;
+ int ret;
+
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = do_statx(ctx->dfd, ctx->filename, ctx->flags, ctx->mask,
+ ctx->buffer);
+
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
+ sqe->rw_flags || sqe->buf_index)
+ return -EINVAL;
+ if (req->flags & REQ_F_FIXED_FILE)
+ return -EBADF;
+
+ req->close.fd = READ_ONCE(sqe->fd);
+ req->close.file_slot = READ_ONCE(sqe->file_index);
+ if (req->close.file_slot && req->close.fd)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int io_close(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct files_struct *files = current->files;
+ struct io_close *close = &req->close;
+ struct fdtable *fdt;
+ struct file *file = NULL;
+ int ret = -EBADF;
+
+ if (req->close.file_slot) {
+ ret = io_close_fixed(req, issue_flags);
+ goto err;
+ }
+
+ spin_lock(&files->file_lock);
+ fdt = files_fdtable(files);
+ if (close->fd >= fdt->max_fds) {
+ spin_unlock(&files->file_lock);
+ goto err;
+ }
+ file = fdt->fd[close->fd];
+ if (!file || file->f_op == &io_uring_fops) {
+ spin_unlock(&files->file_lock);
+ file = NULL;
+ goto err;
+ }
+
+ /* if the file has a flush method, be safe and punt to async */
+ if (file->f_op->flush && (issue_flags & IO_URING_F_NONBLOCK)) {
+ spin_unlock(&files->file_lock);
+ return -EAGAIN;
+ }
+
+ ret = __close_fd_get_file(close->fd, &file);
+ spin_unlock(&files->file_lock);
+ if (ret < 0) {
+ if (ret == -ENOENT)
+ ret = -EBADF;
+ goto err;
+ }
+
+ /* No ->flush() or already async, safely close from here */
+ ret = filp_close(file, current->files);
+err:
+ if (ret < 0)
+ req_set_fail(req);
+ if (file)
+ fput(file);
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_sfr_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(sqe->addr || sqe->ioprio || sqe->buf_index ||
+ sqe->splice_fd_in))
+ return -EINVAL;
+
+ req->sync.off = READ_ONCE(sqe->off);
+ req->sync.len = READ_ONCE(sqe->len);
+ req->sync.flags = READ_ONCE(sqe->sync_range_flags);
+ return 0;
+}
+
+static int io_sync_file_range(struct io_kiocb *req, unsigned int issue_flags)
+{
+ int ret;
+
+ /* sync_file_range always requires a blocking context */
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ return -EAGAIN;
+
+ ret = sync_file_range(req->file, req->sync.off, req->sync.len,
+ req->sync.flags);
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete(req, ret);
+ return 0;
+}
+
+#if defined(CONFIG_NET)
+static int io_setup_async_msg(struct io_kiocb *req,
+ struct io_async_msghdr *kmsg)
+{
+ struct io_async_msghdr *async_msg = req->async_data;
+
+ if (async_msg)
+ return -EAGAIN;
+ if (io_alloc_async_data(req)) {
+ kfree(kmsg->free_iov);
+ return -ENOMEM;
+ }
+ async_msg = req->async_data;
+ req->flags |= REQ_F_NEED_CLEANUP;
+ memcpy(async_msg, kmsg, sizeof(*kmsg));
+ async_msg->msg.msg_name = &async_msg->addr;
+ /* if were using fast_iov, set it to the new one */
+ if (!async_msg->free_iov)
+ async_msg->msg.msg_iter.iov = async_msg->fast_iov;
+
+ return -EAGAIN;
+}
+
+static int io_sendmsg_copy_hdr(struct io_kiocb *req,
+ struct io_async_msghdr *iomsg)
+{
+ iomsg->msg.msg_name = &iomsg->addr;
+ iomsg->free_iov = iomsg->fast_iov;
+ return sendmsg_copy_msghdr(&iomsg->msg, req->sr_msg.umsg,
+ req->sr_msg.msg_flags, &iomsg->free_iov);
+}
+
+static int io_sendmsg_prep_async(struct io_kiocb *req)
+{
+ int ret;
+
+ ret = io_sendmsg_copy_hdr(req, req->async_data);
+ if (!ret)
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return ret;
+}
+
+static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_sr_msg *sr = &req->sr_msg;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
+ return -EINVAL;
+
+ sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ sr->len = READ_ONCE(sqe->len);
+ sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
+ if (sr->msg_flags & MSG_DONTWAIT)
+ req->flags |= REQ_F_NOWAIT;
+
+#ifdef CONFIG_COMPAT
+ if (req->ctx->compat)
+ sr->msg_flags |= MSG_CMSG_COMPAT;
+#endif
+ return 0;
+}
+
+static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_async_msghdr iomsg, *kmsg;
+ struct socket *sock;
+ unsigned flags;
+ int min_ret = 0;
+ int ret;
+
+ sock = sock_from_file(req->file);
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ if (req_has_async_data(req)) {
+ kmsg = req->async_data;
+ } else {
+ ret = io_sendmsg_copy_hdr(req, &iomsg);
+ if (ret)
+ return ret;
+ kmsg = &iomsg;
+ }
+
+ flags = req->sr_msg.msg_flags;
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ flags |= MSG_DONTWAIT;
+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+
+ ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
+
+ if (ret < min_ret) {
+ if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
+ return io_setup_async_msg(req, kmsg);
+ if (ret == -ERESTARTSYS)
+ ret = -EINTR;
+ req_set_fail(req);
+ }
+ /* fast path, check for non-NULL to avoid function call */
+ if (kmsg->free_iov)
+ kfree(kmsg->free_iov);
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_send(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_sr_msg *sr = &req->sr_msg;
+ struct msghdr msg;
+ struct iovec iov;
+ struct socket *sock;
+ unsigned flags;
+ int min_ret = 0;
+ int ret;
+
+ sock = sock_from_file(req->file);
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ ret = import_single_range(WRITE, sr->buf, sr->len, &iov, &msg.msg_iter);
+ if (unlikely(ret))
+ return ret;
+
+ msg.msg_name = NULL;
+ msg.msg_control = NULL;
+ msg.msg_controllen = 0;
+ msg.msg_namelen = 0;
+
+ flags = req->sr_msg.msg_flags;
+ if (issue_flags & IO_URING_F_NONBLOCK)
+ flags |= MSG_DONTWAIT;
+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&msg.msg_iter);
+
+ msg.msg_flags = flags;
+ ret = sock_sendmsg(sock, &msg);
+ if (ret < min_ret) {
+ if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK))
+ return -EAGAIN;
+ if (ret == -ERESTARTSYS)
+ ret = -EINTR;
+ req_set_fail(req);
+ }
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int __io_recvmsg_copy_hdr(struct io_kiocb *req,
+ struct io_async_msghdr *iomsg)
+{
+ struct io_sr_msg *sr = &req->sr_msg;
+ struct iovec __user *uiov;
+ size_t iov_len;
+ int ret;
+
+ ret = __copy_msghdr_from_user(&iomsg->msg, sr->umsg,
+ &iomsg->uaddr, &uiov, &iov_len);
+ if (ret)
+ return ret;
+
+ if (req->flags & REQ_F_BUFFER_SELECT) {
+ if (iov_len > 1)
+ return -EINVAL;
+ if (copy_from_user(iomsg->fast_iov, uiov, sizeof(*uiov)))
+ return -EFAULT;
+ sr->len = iomsg->fast_iov[0].iov_len;
+ iomsg->free_iov = NULL;
+ } else {
+ iomsg->free_iov = iomsg->fast_iov;
+ ret = __import_iovec(READ, uiov, iov_len, UIO_FASTIOV,
+ &iomsg->free_iov, &iomsg->msg.msg_iter,
+ false);
+ if (ret > 0)
+ ret = 0;
+ }
+
+ return ret;
+}
+
+#ifdef CONFIG_COMPAT
+static int __io_compat_recvmsg_copy_hdr(struct io_kiocb *req,
+ struct io_async_msghdr *iomsg)
+{
+ struct io_sr_msg *sr = &req->sr_msg;
+ struct compat_iovec __user *uiov;
+ compat_uptr_t ptr;
+ compat_size_t len;
+ int ret;
+
+ ret = __get_compat_msghdr(&iomsg->msg, sr->umsg_compat, &iomsg->uaddr,
+ &ptr, &len);
+ if (ret)
+ return ret;
+
+ uiov = compat_ptr(ptr);
+ if (req->flags & REQ_F_BUFFER_SELECT) {
+ compat_ssize_t clen;
+
+ if (len > 1)
+ return -EINVAL;
+ if (!access_ok(uiov, sizeof(*uiov)))
+ return -EFAULT;
+ if (__get_user(clen, &uiov->iov_len))
+ return -EFAULT;
+ if (clen < 0)
+ return -EINVAL;
+ sr->len = clen;
+ iomsg->free_iov = NULL;
+ } else {
+ iomsg->free_iov = iomsg->fast_iov;
+ ret = __import_iovec(READ, (struct iovec __user *)uiov, len,
+ UIO_FASTIOV, &iomsg->free_iov,
+ &iomsg->msg.msg_iter, true);
+ if (ret < 0)
+ return ret;
+ }
+
+ return 0;
+}
+#endif
+
+static int io_recvmsg_copy_hdr(struct io_kiocb *req,
+ struct io_async_msghdr *iomsg)
+{
+ iomsg->msg.msg_name = &iomsg->addr;
+
+#ifdef CONFIG_COMPAT
+ if (req->ctx->compat)
+ return __io_compat_recvmsg_copy_hdr(req, iomsg);
+#endif
+
+ return __io_recvmsg_copy_hdr(req, iomsg);
+}
+
+static struct io_buffer *io_recv_buffer_select(struct io_kiocb *req,
+ unsigned int issue_flags)
+{
+ struct io_sr_msg *sr = &req->sr_msg;
+
+ return io_buffer_select(req, &sr->len, sr->bgid, issue_flags);
+}
+
+static int io_recvmsg_prep_async(struct io_kiocb *req)
+{
+ int ret;
+
+ ret = io_recvmsg_copy_hdr(req, req->async_data);
+ if (!ret)
+ req->flags |= REQ_F_NEED_CLEANUP;
+ return ret;
+}
+
+static int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_sr_msg *sr = &req->sr_msg;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(sqe->addr2 || sqe->file_index || sqe->ioprio))
+ return -EINVAL;
+
+ sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ sr->len = READ_ONCE(sqe->len);
+ sr->bgid = READ_ONCE(sqe->buf_group);
+ sr->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL;
+ if (sr->msg_flags & MSG_DONTWAIT)
+ req->flags |= REQ_F_NOWAIT;
+
+#ifdef CONFIG_COMPAT
+ if (req->ctx->compat)
+ sr->msg_flags |= MSG_CMSG_COMPAT;
+#endif
+ sr->done_io = 0;
+ return 0;
+}
+
+static bool io_net_retry(struct socket *sock, int flags)
+{
+ if (!(flags & MSG_WAITALL))
+ return false;
+ return sock->type == SOCK_STREAM || sock->type == SOCK_SEQPACKET;
+}
+
+static int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_async_msghdr iomsg, *kmsg;
+ struct io_sr_msg *sr = &req->sr_msg;
+ struct socket *sock;
+ struct io_buffer *kbuf;
+ unsigned flags;
+ int ret, min_ret = 0;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+
+ sock = sock_from_file(req->file);
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ if (req_has_async_data(req)) {
+ kmsg = req->async_data;
+ } else {
+ ret = io_recvmsg_copy_hdr(req, &iomsg);
+ if (ret)
+ return ret;
+ kmsg = &iomsg;
+ }
+
+ if (req->flags & REQ_F_BUFFER_SELECT) {
+ kbuf = io_recv_buffer_select(req, issue_flags);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+ kmsg->fast_iov[0].iov_base = u64_to_user_ptr(kbuf->addr);
+ kmsg->fast_iov[0].iov_len = req->sr_msg.len;
+ iov_iter_init(&kmsg->msg.msg_iter, READ, kmsg->fast_iov,
+ 1, req->sr_msg.len);
+ }
+
+ flags = req->sr_msg.msg_flags;
+ if (force_nonblock)
+ flags |= MSG_DONTWAIT;
+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+
+ ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
+ kmsg->uaddr, flags);
+ if (ret < min_ret) {
+ if (ret == -EAGAIN && force_nonblock)
+ return io_setup_async_msg(req, kmsg);
+ if (ret == -ERESTARTSYS)
+ ret = -EINTR;
+ if (ret > 0 && io_net_retry(sock, flags)) {
+ sr->done_io += ret;
+ req->flags |= REQ_F_PARTIAL_IO;
+ return io_setup_async_msg(req, kmsg);
+ }
+ req_set_fail(req);
+ } else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+ req_set_fail(req);
+ }
+
+ /* fast path, check for non-NULL to avoid function call */
+ if (kmsg->free_iov)
+ kfree(kmsg->free_iov);
+ req->flags &= ~REQ_F_NEED_CLEANUP;
+ if (ret >= 0)
+ ret += sr->done_io;
+ else if (sr->done_io)
+ ret = sr->done_io;
+ __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
+ return 0;
+}
+
+static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_buffer *kbuf;
+ struct io_sr_msg *sr = &req->sr_msg;
+ struct msghdr msg;
+ void __user *buf = sr->buf;
+ struct socket *sock;
+ struct iovec iov;
+ unsigned flags;
+ int ret, min_ret = 0;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+
+ sock = sock_from_file(req->file);
+ if (unlikely(!sock))
+ return -ENOTSOCK;
+
+ if (req->flags & REQ_F_BUFFER_SELECT) {
+ kbuf = io_recv_buffer_select(req, issue_flags);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+ buf = u64_to_user_ptr(kbuf->addr);
+ }
+
+ ret = import_single_range(READ, buf, sr->len, &iov, &msg.msg_iter);
+ if (unlikely(ret))
+ goto out_free;
+
+ msg.msg_name = NULL;
+ msg.msg_control = NULL;
+ msg.msg_controllen = 0;
+ msg.msg_namelen = 0;
+ msg.msg_iocb = NULL;
+ msg.msg_flags = 0;
+
+ flags = req->sr_msg.msg_flags;
+ if (force_nonblock)
+ flags |= MSG_DONTWAIT;
+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&msg.msg_iter);
+
+ ret = sock_recvmsg(sock, &msg, flags);
+ if (ret < min_ret) {
+ if (ret == -EAGAIN && force_nonblock)
+ return -EAGAIN;
+ if (ret == -ERESTARTSYS)
+ ret = -EINTR;
+ if (ret > 0 && io_net_retry(sock, flags)) {
+ sr->len -= ret;
+ sr->buf += ret;
+ sr->done_io += ret;
+ req->flags |= REQ_F_PARTIAL_IO;
+ return -EAGAIN;
+ }
+ req_set_fail(req);
+ } else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+out_free:
+ req_set_fail(req);
+ }
+
+ if (ret >= 0)
+ ret += sr->done_io;
+ else if (sr->done_io)
+ ret = sr->done_io;
+ __io_req_complete(req, issue_flags, ret, io_put_kbuf(req, issue_flags));
+ return 0;
+}
+
+static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_accept *accept = &req->accept;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->len || sqe->buf_index)
+ return -EINVAL;
+
+ accept->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ accept->addr_len = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+ accept->flags = READ_ONCE(sqe->accept_flags);
+ accept->nofile = rlimit(RLIMIT_NOFILE);
+
+ accept->file_slot = READ_ONCE(sqe->file_index);
+ if (accept->file_slot && (accept->flags & SOCK_CLOEXEC))
+ return -EINVAL;
+ if (accept->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
+ return -EINVAL;
+ if (SOCK_NONBLOCK != O_NONBLOCK && (accept->flags & SOCK_NONBLOCK))
+ accept->flags = (accept->flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
+ return 0;
+}
+
+static int io_accept(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_accept *accept = &req->accept;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+ unsigned int file_flags = force_nonblock ? O_NONBLOCK : 0;
+ bool fixed = !!accept->file_slot;
+ struct file *file;
+ int ret, fd;
+
+ if (!fixed) {
+ fd = __get_unused_fd_flags(accept->flags, accept->nofile);
+ if (unlikely(fd < 0))
+ return fd;
+ }
+ file = do_accept(req->file, file_flags, accept->addr, accept->addr_len,
+ accept->flags);
+ if (IS_ERR(file)) {
+ if (!fixed)
+ put_unused_fd(fd);
+ ret = PTR_ERR(file);
+ if (ret == -EAGAIN && force_nonblock)
+ return -EAGAIN;
+ if (ret == -ERESTARTSYS)
+ ret = -EINTR;
+ req_set_fail(req);
+ } else if (!fixed) {
+ fd_install(fd, file);
+ ret = fd;
+ } else {
+ ret = io_install_fixed_file(req, file, issue_flags,
+ accept->file_slot - 1);
+ }
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_connect_prep_async(struct io_kiocb *req)
+{
+ struct io_async_connect *io = req->async_data;
+ struct io_connect *conn = &req->connect;
+
+ return move_addr_to_kernel(conn->addr, conn->addr_len, &io->address);
+}
+
+static int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_connect *conn = &req->connect;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->len || sqe->buf_index || sqe->rw_flags ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+
+ conn->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
+ conn->addr_len = READ_ONCE(sqe->addr2);
+ return 0;
+}
+
+static int io_connect(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_async_connect __io, *io;
+ unsigned file_flags;
+ int ret;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+
+ if (req_has_async_data(req)) {
+ io = req->async_data;
+ } else {
+ ret = move_addr_to_kernel(req->connect.addr,
+ req->connect.addr_len,
+ &__io.address);
+ if (ret)
+ goto out;
+ io = &__io;
+ }
+
+ file_flags = force_nonblock ? O_NONBLOCK : 0;
+
+ ret = __sys_connect_file(req->file, &io->address,
+ req->connect.addr_len, file_flags);
+ if ((ret == -EAGAIN || ret == -EINPROGRESS) && force_nonblock) {
+ if (req_has_async_data(req))
+ return -EAGAIN;
+ if (io_alloc_async_data(req)) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ memcpy(req->async_data, &__io, sizeof(__io));
+ return -EAGAIN;
+ }
+ if (ret == -ERESTARTSYS)
+ ret = -EINTR;
+out:
+ if (ret < 0)
+ req_set_fail(req);
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+#else /* !CONFIG_NET */
+#define IO_NETOP_FN(op) \
+static int io_##op(struct io_kiocb *req, unsigned int issue_flags) \
+{ \
+ return -EOPNOTSUPP; \
+}
+
+#define IO_NETOP_PREP(op) \
+IO_NETOP_FN(op) \
+static int io_##op##_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) \
+{ \
+ return -EOPNOTSUPP; \
+} \
+
+#define IO_NETOP_PREP_ASYNC(op) \
+IO_NETOP_PREP(op) \
+static int io_##op##_prep_async(struct io_kiocb *req) \
+{ \
+ return -EOPNOTSUPP; \
+}
+
+IO_NETOP_PREP_ASYNC(sendmsg);
+IO_NETOP_PREP_ASYNC(recvmsg);
+IO_NETOP_PREP_ASYNC(connect);
+IO_NETOP_PREP(accept);
+IO_NETOP_FN(send);
+IO_NETOP_FN(recv);
+#endif /* CONFIG_NET */
+
+struct io_poll_table {
+ struct poll_table_struct pt;
+ struct io_kiocb *req;
+ int nr_entries;
+ int error;
+};
+
+#define IO_POLL_CANCEL_FLAG BIT(31)
+#define IO_POLL_REF_MASK GENMASK(30, 0)
+
+/*
+ * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
+ * bump it and acquire ownership. It's disallowed to modify requests while not
+ * owning it, that prevents from races for enqueueing task_work's and b/w
+ * arming poll and wakeups.
+ */
+static inline bool io_poll_get_ownership(struct io_kiocb *req)
+{
+ return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+}
+
+static void io_poll_mark_cancelled(struct io_kiocb *req)
+{
+ atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
+}
+
+static struct io_poll_iocb *io_poll_get_double(struct io_kiocb *req)
+{
+ /* pure poll stashes this in ->async_data, poll driven retry elsewhere */
+ if (req->opcode == IORING_OP_POLL_ADD)
+ return req->async_data;
+ return req->apoll->double_poll;
+}
+
+static struct io_poll_iocb *io_poll_get_single(struct io_kiocb *req)
+{
+ if (req->opcode == IORING_OP_POLL_ADD)
+ return &req->poll;
+ return &req->apoll->poll;
+}
+
+static void io_poll_req_insert(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct hlist_head *list;
+
+ list = &ctx->cancel_hash[hash_long(req->user_data, ctx->cancel_hash_bits)];
+ hlist_add_head(&req->hash_node, list);
+}
+
+static void io_init_poll_iocb(struct io_poll_iocb *poll, __poll_t events,
+ wait_queue_func_t wake_func)
+{
+ poll->head = NULL;
+#define IO_POLL_UNMASK (EPOLLERR|EPOLLHUP|EPOLLNVAL|EPOLLRDHUP)
+ /* mask in events that we always want/need */
+ poll->events = events | IO_POLL_UNMASK;
+ INIT_LIST_HEAD(&poll->wait.entry);
+ init_waitqueue_func_entry(&poll->wait, wake_func);
+}
+
+static inline void io_poll_remove_entry(struct io_poll_iocb *poll)
+{
+ struct wait_queue_head *head = smp_load_acquire(&poll->head);
+
+ if (head) {
+ spin_lock_irq(&head->lock);
+ list_del_init(&poll->wait.entry);
+ poll->head = NULL;
+ spin_unlock_irq(&head->lock);
+ }
+}
+
+static void io_poll_remove_entries(struct io_kiocb *req)
+{
+ /*
+ * Nothing to do if neither of those flags are set. Avoid dipping
+ * into the poll/apoll/double cachelines if we can.
+ */
+ if (!(req->flags & (REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL)))
+ return;
+
+ /*
+ * While we hold the waitqueue lock and the waitqueue is nonempty,
+ * wake_up_pollfree() will wait for us. However, taking the waitqueue
+ * lock in the first place can race with the waitqueue being freed.
+ *
+ * We solve this as eventpoll does: by taking advantage of the fact that
+ * all users of wake_up_pollfree() will RCU-delay the actual free. If
+ * we enter rcu_read_lock() and see that the pointer to the queue is
+ * non-NULL, we can then lock it without the memory being freed out from
+ * under us.
+ *
+ * Keep holding rcu_read_lock() as long as we hold the queue lock, in
+ * case the caller deletes the entry from the queue, leaving it empty.
+ * In that case, only RCU prevents the queue memory from being freed.
+ */
+ rcu_read_lock();
+ if (req->flags & REQ_F_SINGLE_POLL)
+ io_poll_remove_entry(io_poll_get_single(req));
+ if (req->flags & REQ_F_DOUBLE_POLL)
+ io_poll_remove_entry(io_poll_get_double(req));
+ rcu_read_unlock();
+}
+
+/*
+ * All poll tw should go through this. Checks for poll events, manages
+ * references, does rewait, etc.
+ *
+ * Returns a negative error on failure. >0 when no action require, which is
+ * either spurious wakeup or multishot CQE is served. 0 when it's done with
+ * the request, then the mask is stored in req->result.
+ */
+static int io_poll_check_events(struct io_kiocb *req, bool locked)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ int v;
+
+ /* req->task == current here, checking PF_EXITING is safe */
+ if (unlikely(req->task->flags & PF_EXITING))
+ io_poll_mark_cancelled(req);
+
+ do {
+ v = atomic_read(&req->poll_refs);
+
+ /* tw handler should be the owner, and so have some references */
+ if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
+ return 0;
+ if (v & IO_POLL_CANCEL_FLAG)
+ return -ECANCELED;
+
+ if (!req->result) {
+ struct poll_table_struct pt = { ._key = req->apoll_events };
+ req->result = vfs_poll(req->file, &pt) & req->apoll_events;
+ }
+
+ /* multishot, just fill an CQE and proceed */
+ if (req->result && !(req->apoll_events & EPOLLONESHOT)) {
+ __poll_t mask = mangle_poll(req->result & req->apoll_events);
+ bool filled;
+
+ spin_lock(&ctx->completion_lock);
+ filled = io_fill_cqe_aux(ctx, req->user_data, mask,
+ IORING_CQE_F_MORE);
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ if (unlikely(!filled))
+ return -ECANCELED;
+ io_cqring_ev_posted(ctx);
+ } else if (req->result) {
+ return 0;
+ }
+
+ /*
+ * Release all references, retry if someone tried to restart
+ * task_work while we were executing it.
+ */
+ } while (atomic_sub_return(v & IO_POLL_REF_MASK, &req->poll_refs));
+
+ return 1;
+}
+
+static void io_poll_task_func(struct io_kiocb *req, bool *locked)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ int ret;
+
+ ret = io_poll_check_events(req, *locked);
+ if (ret > 0)
+ return;
+
+ if (!ret) {
+ req->result = mangle_poll(req->result & req->poll.events);
+ } else {
+ req->result = ret;
+ req_set_fail(req);
+ }
+
+ io_poll_remove_entries(req);
+ spin_lock(&ctx->completion_lock);
+ hash_del(&req->hash_node);
+ __io_req_complete_post(req, req->result, 0);
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_cqring_ev_posted(ctx);
+}
+
+static void io_apoll_task_func(struct io_kiocb *req, bool *locked)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ int ret;
+
+ ret = io_poll_check_events(req, *locked);
+ if (ret > 0)
+ return;
+
+ io_poll_remove_entries(req);
+ spin_lock(&ctx->completion_lock);
+ hash_del(&req->hash_node);
+ spin_unlock(&ctx->completion_lock);
+
+ if (!ret)
+ io_req_task_submit(req, locked);
+ else
+ io_req_complete_failed(req, ret);
+}
+
+static void __io_poll_execute(struct io_kiocb *req, int mask,
+ __poll_t __maybe_unused events)
+{
+ req->result = mask;
+ /*
+ * This is useful for poll that is armed on behalf of another
+ * request, and where the wakeup path could be on a different
+ * CPU. We want to avoid pulling in req->apoll->events for that
+ * case.
+ */
+ if (req->opcode == IORING_OP_POLL_ADD)
+ req->io_task_work.func = io_poll_task_func;
+ else
+ req->io_task_work.func = io_apoll_task_func;
+
+ trace_io_uring_task_add(req->ctx, req, req->user_data, req->opcode, mask);
+ io_req_task_work_add(req, false);
+}
+
+static inline void io_poll_execute(struct io_kiocb *req, int res,
+ __poll_t events)
+{
+ if (io_poll_get_ownership(req))
+ __io_poll_execute(req, res, events);
+}
+
+static void io_poll_cancel_req(struct io_kiocb *req)
+{
+ io_poll_mark_cancelled(req);
+ /* kick tw, which should complete the request */
+ io_poll_execute(req, 0, 0);
+}
+
+#define wqe_to_req(wait) ((void *)((unsigned long) (wait)->private & ~1))
+#define wqe_is_double(wait) ((unsigned long) (wait)->private & 1)
+#define IO_ASYNC_POLL_COMMON (EPOLLONESHOT | POLLPRI)
+
+static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
+ void *key)
+{
+ struct io_kiocb *req = wqe_to_req(wait);
+ struct io_poll_iocb *poll = container_of(wait, struct io_poll_iocb,
+ wait);
+ __poll_t mask = key_to_poll(key);
+
+ if (unlikely(mask & POLLFREE)) {
+ io_poll_mark_cancelled(req);
+ /* we have to kick tw in case it's not already */
+ io_poll_execute(req, 0, poll->events);
+
+ /*
+ * If the waitqueue is being freed early but someone is already
+ * holds ownership over it, we have to tear down the request as
+ * best we can. That means immediately removing the request from
+ * its waitqueue and preventing all further accesses to the
+ * waitqueue via the request.
+ */
+ list_del_init(&poll->wait.entry);
+
+ /*
+ * Careful: this *must* be the last step, since as soon
+ * as req->head is NULL'ed out, the request can be
+ * completed and freed, since aio_poll_complete_work()
+ * will no longer need to take the waitqueue lock.
+ */
+ smp_store_release(&poll->head, NULL);
+ return 1;
+ }
+
+ /* for instances that support it check for an event match first */
+ if (mask && !(mask & (poll->events & ~IO_ASYNC_POLL_COMMON)))
+ return 0;
+
+ if (io_poll_get_ownership(req)) {
+ /* optional, saves extra locking for removal in tw handler */
+ if (mask && poll->events & EPOLLONESHOT) {
+ list_del_init(&poll->wait.entry);
+ poll->head = NULL;
+ if (wqe_is_double(wait))
+ req->flags &= ~REQ_F_DOUBLE_POLL;
+ else
+ req->flags &= ~REQ_F_SINGLE_POLL;
+ }
+ __io_poll_execute(req, mask, poll->events);
+ }
+ return 1;
+}
+
+static void __io_queue_proc(struct io_poll_iocb *poll, struct io_poll_table *pt,
+ struct wait_queue_head *head,
+ struct io_poll_iocb **poll_ptr)
+{
+ struct io_kiocb *req = pt->req;
+ unsigned long wqe_private = (unsigned long) req;
+
+ /*
+ * The file being polled uses multiple waitqueues for poll handling
+ * (e.g. one for read, one for write). Setup a separate io_poll_iocb
+ * if this happens.
+ */
+ if (unlikely(pt->nr_entries)) {
+ struct io_poll_iocb *first = poll;
+
+ /* double add on the same waitqueue head, ignore */
+ if (first->head == head)
+ return;
+ /* already have a 2nd entry, fail a third attempt */
+ if (*poll_ptr) {
+ if ((*poll_ptr)->head == head)
+ return;
+ pt->error = -EINVAL;
+ return;
+ }
+
+ poll = kmalloc(sizeof(*poll), GFP_ATOMIC);
+ if (!poll) {
+ pt->error = -ENOMEM;
+ return;
+ }
+ /* mark as double wq entry */
+ wqe_private |= 1;
+ req->flags |= REQ_F_DOUBLE_POLL;
+ io_init_poll_iocb(poll, first->events, first->wait.func);
+ *poll_ptr = poll;
+ if (req->opcode == IORING_OP_POLL_ADD)
+ req->flags |= REQ_F_ASYNC_DATA;
+ }
+
+ req->flags |= REQ_F_SINGLE_POLL;
+ pt->nr_entries++;
+ poll->head = head;
+ poll->wait.private = (void *) wqe_private;
+
+ if (poll->events & EPOLLEXCLUSIVE)
+ add_wait_queue_exclusive(head, &poll->wait);
+ else
+ add_wait_queue(head, &poll->wait);
+}
+
+static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
+ struct poll_table_struct *p)
+{
+ struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
+
+ __io_queue_proc(&pt->req->poll, pt, head,
+ (struct io_poll_iocb **) &pt->req->async_data);
+}
+
+static int __io_arm_poll_handler(struct io_kiocb *req,
+ struct io_poll_iocb *poll,
+ struct io_poll_table *ipt, __poll_t mask)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ int v;
+
+ INIT_HLIST_NODE(&req->hash_node);
+ io_init_poll_iocb(poll, mask, io_poll_wake);
+ poll->file = req->file;
+
+ req->apoll_events = poll->events;
+
+ ipt->pt._key = mask;
+ ipt->req = req;
+ ipt->error = 0;
+ ipt->nr_entries = 0;
+
+ /*
+ * Take the ownership to delay any tw execution up until we're done
+ * with poll arming. see io_poll_get_ownership().
+ */
+ atomic_set(&req->poll_refs, 1);
+ mask = vfs_poll(req->file, &ipt->pt) & poll->events;
+
+ if (mask && (poll->events & EPOLLONESHOT)) {
+ io_poll_remove_entries(req);
+ /* no one else has access to the req, forget about the ref */
+ return mask;
+ }
+ if (!mask && unlikely(ipt->error || !ipt->nr_entries)) {
+ io_poll_remove_entries(req);
+ if (!ipt->error)
+ ipt->error = -EINVAL;
+ return 0;
+ }
+
+ spin_lock(&ctx->completion_lock);
+ io_poll_req_insert(req);
+ spin_unlock(&ctx->completion_lock);
+
+ if (mask) {
+ /* can't multishot if failed, just queue the event we've got */
+ if (unlikely(ipt->error || !ipt->nr_entries)) {
+ poll->events |= EPOLLONESHOT;
+ req->apoll_events |= EPOLLONESHOT;
+ ipt->error = 0;
+ }
+ __io_poll_execute(req, mask, poll->events);
+ return 0;
+ }
+
+ /*
+ * Release ownership. If someone tried to queue a tw while it was
+ * locked, kick it off for them.
+ */
+ v = atomic_dec_return(&req->poll_refs);
+ if (unlikely(v & IO_POLL_REF_MASK))
+ __io_poll_execute(req, 0, poll->events);
+ return 0;
+}
+
+static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
+ struct poll_table_struct *p)
+{
+ struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
+ struct async_poll *apoll = pt->req->apoll;
+
+ __io_queue_proc(&apoll->poll, pt, head, &apoll->double_poll);
+}
+
+enum {
+ IO_APOLL_OK,
+ IO_APOLL_ABORTED,
+ IO_APOLL_READY
+};
+
+static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
+{
+ const struct io_op_def *def = &io_op_defs[req->opcode];
+ struct io_ring_ctx *ctx = req->ctx;
+ struct async_poll *apoll;
+ struct io_poll_table ipt;
+ __poll_t mask = IO_ASYNC_POLL_COMMON | POLLERR;
+ int ret;
+
+ if (!def->pollin && !def->pollout)
+ return IO_APOLL_ABORTED;
+ if (!file_can_poll(req->file) || (req->flags & REQ_F_POLLED))
+ return IO_APOLL_ABORTED;
+
+ if (def->pollin) {
+ mask |= POLLIN | POLLRDNORM;
+
+ /* If reading from MSG_ERRQUEUE using recvmsg, ignore POLLIN */
+ if ((req->opcode == IORING_OP_RECVMSG) &&
+ (req->sr_msg.msg_flags & MSG_ERRQUEUE))
+ mask &= ~POLLIN;
+ } else {
+ mask |= POLLOUT | POLLWRNORM;
+ }
+ if (def->poll_exclusive)
+ mask |= EPOLLEXCLUSIVE;
+ if (!(issue_flags & IO_URING_F_UNLOCKED) &&
+ !list_empty(&ctx->apoll_cache)) {
+ apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
+ poll.wait.entry);
+ list_del_init(&apoll->poll.wait.entry);
+ } else {
+ apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
+ if (unlikely(!apoll))
+ return IO_APOLL_ABORTED;
+ }
+ apoll->double_poll = NULL;
+ req->apoll = apoll;
+ req->flags |= REQ_F_POLLED;
+ ipt.pt._qproc = io_async_queue_proc;
+
+ io_kbuf_recycle(req, issue_flags);
+
+ ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask);
+ if (ret || ipt.error)
+ return ret ? IO_APOLL_READY : IO_APOLL_ABORTED;
+
+ trace_io_uring_poll_arm(ctx, req, req->user_data, req->opcode,
+ mask, apoll->poll.events);
+ return IO_APOLL_OK;
+}
+
+/*
+ * Returns true if we found and killed one or more poll requests
+ */
+static __cold bool io_poll_remove_all(struct io_ring_ctx *ctx,
+ struct task_struct *tsk, bool cancel_all)
+{
+ struct hlist_node *tmp;
+ struct io_kiocb *req;
+ bool found = false;
+ int i;
+
+ spin_lock(&ctx->completion_lock);
+ for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
+ struct hlist_head *list;
+
+ list = &ctx->cancel_hash[i];
+ hlist_for_each_entry_safe(req, tmp, list, hash_node) {
+ if (io_match_task_safe(req, tsk, cancel_all)) {
+ hlist_del_init(&req->hash_node);
+ io_poll_cancel_req(req);
+ found = true;
+ }
+ }
+ }
+ spin_unlock(&ctx->completion_lock);
+ return found;
+}
+
+static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx, __u64 sqe_addr,
+ bool poll_only)
+ __must_hold(&ctx->completion_lock)
+{
+ struct hlist_head *list;
+ struct io_kiocb *req;
+
+ list = &ctx->cancel_hash[hash_long(sqe_addr, ctx->cancel_hash_bits)];
+ hlist_for_each_entry(req, list, hash_node) {
+ if (sqe_addr != req->user_data)
+ continue;
+ if (poll_only && req->opcode != IORING_OP_POLL_ADD)
+ continue;
+ return req;
+ }
+ return NULL;
+}
+
+static bool io_poll_disarm(struct io_kiocb *req)
+ __must_hold(&ctx->completion_lock)
+{
+ if (!io_poll_get_ownership(req))
+ return false;
+ io_poll_remove_entries(req);
+ hash_del(&req->hash_node);
+ return true;
+}
+
+static int io_poll_cancel(struct io_ring_ctx *ctx, __u64 sqe_addr,
+ bool poll_only)
+ __must_hold(&ctx->completion_lock)
+{
+ struct io_kiocb *req = io_poll_find(ctx, sqe_addr, poll_only);
+
+ if (!req)
+ return -ENOENT;
+ io_poll_cancel_req(req);
+ return 0;
+}
+
+static __poll_t io_poll_parse_events(const struct io_uring_sqe *sqe,
+ unsigned int flags)
+{
+ u32 events;
+
+ events = READ_ONCE(sqe->poll32_events);
+#ifdef __BIG_ENDIAN
+ events = swahw32(events);
+#endif
+ if (!(flags & IORING_POLL_ADD_MULTI))
+ events |= EPOLLONESHOT;
+ return demangle_poll(events) | (events & (EPOLLEXCLUSIVE|EPOLLONESHOT));
+}
+
+static int io_poll_update_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_poll_update *upd = &req->poll_update;
+ u32 flags;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->buf_index || sqe->splice_fd_in)
+ return -EINVAL;
+ flags = READ_ONCE(sqe->len);
+ if (flags & ~(IORING_POLL_UPDATE_EVENTS | IORING_POLL_UPDATE_USER_DATA |
+ IORING_POLL_ADD_MULTI))
+ return -EINVAL;
+ /* meaningless without update */
+ if (flags == IORING_POLL_ADD_MULTI)
+ return -EINVAL;
+
+ upd->old_user_data = READ_ONCE(sqe->addr);
+ upd->update_events = flags & IORING_POLL_UPDATE_EVENTS;
+ upd->update_user_data = flags & IORING_POLL_UPDATE_USER_DATA;
+
+ upd->new_user_data = READ_ONCE(sqe->off);
+ if (!upd->update_user_data && upd->new_user_data)
+ return -EINVAL;
+ if (upd->update_events)
+ upd->events = io_poll_parse_events(sqe, flags);
+ else if (sqe->poll32_events)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ struct io_poll_iocb *poll = &req->poll;
+ u32 flags;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->buf_index || sqe->off || sqe->addr)
+ return -EINVAL;
+ flags = READ_ONCE(sqe->len);
+ if (flags & ~IORING_POLL_ADD_MULTI)
+ return -EINVAL;
+ if ((flags & IORING_POLL_ADD_MULTI) && (req->flags & REQ_F_CQE_SKIP))
+ return -EINVAL;
+
+ io_req_set_refcount(req);
+ poll->events = io_poll_parse_events(sqe, flags);
+ return 0;
+}
+
+static int io_poll_add(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_poll_iocb *poll = &req->poll;
+ struct io_poll_table ipt;
+ int ret;
+
+ ipt.pt._qproc = io_poll_queue_proc;
+
+ ret = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events);
+ if (!ret && ipt.error)
+ req_set_fail(req);
+ ret = ret ?: ipt.error;
+ if (ret)
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_poll_update(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_kiocb *preq;
+ int ret2, ret = 0;
+ bool locked;
+
+ spin_lock(&ctx->completion_lock);
+ preq = io_poll_find(ctx, req->poll_update.old_user_data, true);
+ if (!preq || !io_poll_disarm(preq)) {
+ spin_unlock(&ctx->completion_lock);
+ ret = preq ? -EALREADY : -ENOENT;
+ goto out;
+ }
+ spin_unlock(&ctx->completion_lock);
+
+ if (req->poll_update.update_events || req->poll_update.update_user_data) {
+ /* only mask one event flags, keep behavior flags */
+ if (req->poll_update.update_events) {
+ preq->poll.events &= ~0xffff;
+ preq->poll.events |= req->poll_update.events & 0xffff;
+ preq->poll.events |= IO_POLL_UNMASK;
+ }
+ if (req->poll_update.update_user_data)
+ preq->user_data = req->poll_update.new_user_data;
+
+ ret2 = io_poll_add(preq, issue_flags);
+ /* successfully updated, don't complete poll request */
+ if (!ret2)
+ goto out;
+ }
+
+ req_set_fail(preq);
+ preq->result = -ECANCELED;
+ locked = !(issue_flags & IO_URING_F_UNLOCKED);
+ io_req_task_complete(preq, &locked);
+out:
+ if (ret < 0)
+ req_set_fail(req);
+ /* complete update request, we're done with it */
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static enum hrtimer_restart io_timeout_fn(struct hrtimer *timer)
+{
+ struct io_timeout_data *data = container_of(timer,
+ struct io_timeout_data, timer);
+ struct io_kiocb *req = data->req;
+ struct io_ring_ctx *ctx = req->ctx;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ctx->timeout_lock, flags);
+ list_del_init(&req->timeout.list);
+ atomic_set(&req->ctx->cq_timeouts,
+ atomic_read(&req->ctx->cq_timeouts) + 1);
+ spin_unlock_irqrestore(&ctx->timeout_lock, flags);
+
+ if (!(data->flags & IORING_TIMEOUT_ETIME_SUCCESS))
+ req_set_fail(req);
+
+ req->result = -ETIME;
+ req->io_task_work.func = io_req_task_complete;
+ io_req_task_work_add(req, false);
+ return HRTIMER_NORESTART;
+}
+
+static struct io_kiocb *io_timeout_extract(struct io_ring_ctx *ctx,
+ __u64 user_data)
+ __must_hold(&ctx->timeout_lock)
+{
+ struct io_timeout_data *io;
+ struct io_kiocb *req;
+ bool found = false;
+
+ list_for_each_entry(req, &ctx->timeout_list, timeout.list) {
+ found = user_data == req->user_data;
+ if (found)
+ break;
+ }
+ if (!found)
+ return ERR_PTR(-ENOENT);
+
+ io = req->async_data;
+ if (hrtimer_try_to_cancel(&io->timer) == -1)
+ return ERR_PTR(-EALREADY);
+ list_del_init(&req->timeout.list);
+ return req;
+}
+
+static int io_timeout_cancel(struct io_ring_ctx *ctx, __u64 user_data)
+ __must_hold(&ctx->completion_lock)
+ __must_hold(&ctx->timeout_lock)
+{
+ struct io_kiocb *req = io_timeout_extract(ctx, user_data);
+
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+ io_req_task_queue_fail(req, -ECANCELED);
+ return 0;
+}
+
+static clockid_t io_timeout_get_clock(struct io_timeout_data *data)
+{
+ switch (data->flags & IORING_TIMEOUT_CLOCK_MASK) {
+ case IORING_TIMEOUT_BOOTTIME:
+ return CLOCK_BOOTTIME;
+ case IORING_TIMEOUT_REALTIME:
+ return CLOCK_REALTIME;
+ default:
+ /* can't happen, vetted at prep time */
+ WARN_ON_ONCE(1);
+ fallthrough;
+ case 0:
+ return CLOCK_MONOTONIC;
+ }
+}
+
+static int io_linked_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
+ struct timespec64 *ts, enum hrtimer_mode mode)
+ __must_hold(&ctx->timeout_lock)
+{
+ struct io_timeout_data *io;
+ struct io_kiocb *req;
+ bool found = false;
+
+ list_for_each_entry(req, &ctx->ltimeout_list, timeout.list) {
+ found = user_data == req->user_data;
+ if (found)
+ break;
+ }
+ if (!found)
+ return -ENOENT;
+
+ io = req->async_data;
+ if (hrtimer_try_to_cancel(&io->timer) == -1)
+ return -EALREADY;
+ hrtimer_init(&io->timer, io_timeout_get_clock(io), mode);
+ io->timer.function = io_link_timeout_fn;
+ hrtimer_start(&io->timer, timespec64_to_ktime(*ts), mode);
+ return 0;
+}
+
+static int io_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
+ struct timespec64 *ts, enum hrtimer_mode mode)
+ __must_hold(&ctx->timeout_lock)
+{
+ struct io_kiocb *req = io_timeout_extract(ctx, user_data);
+ struct io_timeout_data *data;
+
+ if (IS_ERR(req))
+ return PTR_ERR(req);
+
+ req->timeout.off = 0; /* noseq */
+ data = req->async_data;
+ list_add_tail(&req->timeout.list, &ctx->timeout_list);
+ hrtimer_init(&data->timer, io_timeout_get_clock(data), mode);
+ data->timer.function = io_timeout_fn;
+ hrtimer_start(&data->timer, timespec64_to_ktime(*ts), mode);
+ return 0;
+}
+
+static int io_timeout_remove_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ struct io_timeout_rem *tr = &req->timeout_rem;
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->buf_index || sqe->len || sqe->splice_fd_in)
+ return -EINVAL;
+
+ tr->ltimeout = false;
+ tr->addr = READ_ONCE(sqe->addr);
+ tr->flags = READ_ONCE(sqe->timeout_flags);
+ if (tr->flags & IORING_TIMEOUT_UPDATE_MASK) {
+ if (hweight32(tr->flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
+ return -EINVAL;
+ if (tr->flags & IORING_LINK_TIMEOUT_UPDATE)
+ tr->ltimeout = true;
+ if (tr->flags & ~(IORING_TIMEOUT_UPDATE_MASK|IORING_TIMEOUT_ABS))
+ return -EINVAL;
+ if (get_timespec64(&tr->ts, u64_to_user_ptr(sqe->addr2)))
+ return -EFAULT;
+ if (tr->ts.tv_sec < 0 || tr->ts.tv_nsec < 0)
+ return -EINVAL;
+ } else if (tr->flags) {
+ /* timeout removal doesn't support flags */
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static inline enum hrtimer_mode io_translate_timeout_mode(unsigned int flags)
+{
+ return (flags & IORING_TIMEOUT_ABS) ? HRTIMER_MODE_ABS
+ : HRTIMER_MODE_REL;
+}
+
+/*
+ * Remove or update an existing timeout command
+ */
+static int io_timeout_remove(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_timeout_rem *tr = &req->timeout_rem;
+ struct io_ring_ctx *ctx = req->ctx;
+ int ret;
+
+ if (!(req->timeout_rem.flags & IORING_TIMEOUT_UPDATE)) {
+ spin_lock(&ctx->completion_lock);
+ spin_lock_irq(&ctx->timeout_lock);
+ ret = io_timeout_cancel(ctx, tr->addr);
+ spin_unlock_irq(&ctx->timeout_lock);
+ spin_unlock(&ctx->completion_lock);
+ } else {
+ enum hrtimer_mode mode = io_translate_timeout_mode(tr->flags);
+
+ spin_lock_irq(&ctx->timeout_lock);
+ if (tr->ltimeout)
+ ret = io_linked_timeout_update(ctx, tr->addr, &tr->ts, mode);
+ else
+ ret = io_timeout_update(ctx, tr->addr, &tr->ts, mode);
+ spin_unlock_irq(&ctx->timeout_lock);
+ }
+
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete_post(req, ret, 0);
+ return 0;
+}
+
+static int io_timeout_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
+ bool is_timeout_link)
+{
+ struct io_timeout_data *data;
+ unsigned flags;
+ u32 off = READ_ONCE(sqe->off);
+
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->buf_index || sqe->len != 1 ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+ if (off && is_timeout_link)
+ return -EINVAL;
+ flags = READ_ONCE(sqe->timeout_flags);
+ if (flags & ~(IORING_TIMEOUT_ABS | IORING_TIMEOUT_CLOCK_MASK |
+ IORING_TIMEOUT_ETIME_SUCCESS))
+ return -EINVAL;
+ /* more than one clock specified is invalid, obviously */
+ if (hweight32(flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
+ return -EINVAL;
+
+ INIT_LIST_HEAD(&req->timeout.list);
+ req->timeout.off = off;
+ if (unlikely(off && !req->ctx->off_timeout_used))
+ req->ctx->off_timeout_used = true;
+
+ if (WARN_ON_ONCE(req_has_async_data(req)))
+ return -EFAULT;
+ if (io_alloc_async_data(req))
+ return -ENOMEM;
+
+ data = req->async_data;
+ data->req = req;
+ data->flags = flags;
+
+ if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
+ return -EFAULT;
+
+ if (data->ts.tv_sec < 0 || data->ts.tv_nsec < 0)
+ return -EINVAL;
+
+ INIT_LIST_HEAD(&req->timeout.list);
+ data->mode = io_translate_timeout_mode(flags);
+ hrtimer_init(&data->timer, io_timeout_get_clock(data), data->mode);
+
+ if (is_timeout_link) {
+ struct io_submit_link *link = &req->ctx->submit_state.link;
+
+ if (!link->head)
+ return -EINVAL;
+ if (link->last->opcode == IORING_OP_LINK_TIMEOUT)
+ return -EINVAL;
+ req->timeout.head = link->last;
+ link->last->flags |= REQ_F_ARM_LTIMEOUT;
+ }
+ return 0;
+}
+
+static int io_timeout(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_timeout_data *data = req->async_data;
+ struct list_head *entry;
+ u32 tail, off = req->timeout.off;
+
+ spin_lock_irq(&ctx->timeout_lock);
+
+ /*
+ * sqe->off holds how many events that need to occur for this
+ * timeout event to be satisfied. If it isn't set, then this is
+ * a pure timeout request, sequence isn't used.
+ */
+ if (io_is_timeout_noseq(req)) {
+ entry = ctx->timeout_list.prev;
+ goto add;
+ }
+
+ tail = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
+ req->timeout.target_seq = tail + off;
+
+ /* Update the last seq here in case io_flush_timeouts() hasn't.
+ * This is safe because ->completion_lock is held, and submissions
+ * and completions are never mixed in the same ->completion_lock section.
+ */
+ ctx->cq_last_tm_flush = tail;
+
+ /*
+ * Insertion sort, ensuring the first entry in the list is always
+ * the one we need first.
+ */
+ list_for_each_prev(entry, &ctx->timeout_list) {
+ struct io_kiocb *nxt = list_entry(entry, struct io_kiocb,
+ timeout.list);
+
+ if (io_is_timeout_noseq(nxt))
+ continue;
+ /* nxt.seq is behind @tail, otherwise would've been completed */
+ if (off >= nxt->timeout.target_seq - tail)
+ break;
+ }
+add:
+ list_add(&req->timeout.list, entry);
+ data->timer.function = io_timeout_fn;
+ hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), data->mode);
+ spin_unlock_irq(&ctx->timeout_lock);
+ return 0;
+}
+
+struct io_cancel_data {
+ struct io_ring_ctx *ctx;
+ u64 user_data;
+};
+
+static bool io_cancel_cb(struct io_wq_work *work, void *data)
+{
+ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+ struct io_cancel_data *cd = data;
+
+ return req->ctx == cd->ctx && req->user_data == cd->user_data;
+}
+
+static int io_async_cancel_one(struct io_uring_task *tctx, u64 user_data,
+ struct io_ring_ctx *ctx)
+{
+ struct io_cancel_data data = { .ctx = ctx, .user_data = user_data, };
+ enum io_wq_cancel cancel_ret;
+ int ret = 0;
+
+ if (!tctx || !tctx->io_wq)
+ return -ENOENT;
+
+ cancel_ret = io_wq_cancel_cb(tctx->io_wq, io_cancel_cb, &data, false);
+ switch (cancel_ret) {
+ case IO_WQ_CANCEL_OK:
+ ret = 0;
+ break;
+ case IO_WQ_CANCEL_RUNNING:
+ ret = -EALREADY;
+ break;
+ case IO_WQ_CANCEL_NOTFOUND:
+ ret = -ENOENT;
+ break;
+ }
+
+ return ret;
+}
+
+static int io_try_cancel_userdata(struct io_kiocb *req, u64 sqe_addr)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ int ret;
+
+ WARN_ON_ONCE(!io_wq_current_is_worker() && req->task != current);
+
+ ret = io_async_cancel_one(req->task->io_uring, sqe_addr, ctx);
+ /*
+ * Fall-through even for -EALREADY, as we may have poll armed
+ * that need unarming.
+ */
+ if (!ret)
+ return 0;
+
+ spin_lock(&ctx->completion_lock);
+ ret = io_poll_cancel(ctx, sqe_addr, false);
+ if (ret != -ENOENT)
+ goto out;
+
+ spin_lock_irq(&ctx->timeout_lock);
+ ret = io_timeout_cancel(ctx, sqe_addr);
+ spin_unlock_irq(&ctx->timeout_lock);
+out:
+ spin_unlock(&ctx->completion_lock);
+ return ret;
+}
+
+static int io_async_cancel_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+ return -EINVAL;
+ if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->off || sqe->len || sqe->cancel_flags ||
+ sqe->splice_fd_in)
+ return -EINVAL;
+
+ req->cancel.addr = READ_ONCE(sqe->addr);
+ return 0;
+}
+
+static int io_async_cancel(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ u64 sqe_addr = req->cancel.addr;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+ struct io_tctx_node *node;
+ int ret;
+
+ ret = io_try_cancel_userdata(req, sqe_addr);
+ if (ret != -ENOENT)
+ goto done;
+
+ /* slow path, try all io-wq's */
+ io_ring_submit_lock(ctx, needs_lock);
+ ret = -ENOENT;
+ list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
+ struct io_uring_task *tctx = node->task->io_uring;
+
+ ret = io_async_cancel_one(tctx, req->cancel.addr, ctx);
+ if (ret != -ENOENT)
+ break;
+ }
+ io_ring_submit_unlock(ctx, needs_lock);
+done:
+ if (ret < 0)
+ req_set_fail(req);
+ io_req_complete_post(req, ret, 0);
+ return 0;
+}
+
+static int io_rsrc_update_prep(struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+{
+ if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
+ return -EINVAL;
+ if (sqe->ioprio || sqe->rw_flags || sqe->splice_fd_in)
+ return -EINVAL;
+
+ req->rsrc_update.offset = READ_ONCE(sqe->off);
+ req->rsrc_update.nr_args = READ_ONCE(sqe->len);
+ if (!req->rsrc_update.nr_args)
+ return -EINVAL;
+ req->rsrc_update.arg = READ_ONCE(sqe->addr);
+ return 0;
+}
+
+static int io_files_update(struct io_kiocb *req, unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+ struct io_uring_rsrc_update2 up;
+ int ret;
+
+ up.offset = req->rsrc_update.offset;
+ up.data = req->rsrc_update.arg;
+ up.nr = 0;
+ up.tags = 0;
+ up.resv = 0;
+ up.resv2 = 0;
+
+ io_ring_submit_lock(ctx, needs_lock);
+ ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE,
+ &up, req->rsrc_update.nr_args);
+ io_ring_submit_unlock(ctx, needs_lock);
+
+ if (ret < 0)
+ req_set_fail(req);
+ __io_req_complete(req, issue_flags, ret, 0);
+ return 0;
+}
+
+static int io_req_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+ switch (req->opcode) {
+ case IORING_OP_NOP:
+ return 0;
+ case IORING_OP_READV:
+ case IORING_OP_READ_FIXED:
+ case IORING_OP_READ:
+ case IORING_OP_WRITEV:
+ case IORING_OP_WRITE_FIXED:
+ case IORING_OP_WRITE:
+ return io_prep_rw(req, sqe);
+ case IORING_OP_POLL_ADD:
+ return io_poll_add_prep(req, sqe);
+ case IORING_OP_POLL_REMOVE:
+ return io_poll_update_prep(req, sqe);
+ case IORING_OP_FSYNC:
+ return io_fsync_prep(req, sqe);
+ case IORING_OP_SYNC_FILE_RANGE:
+ return io_sfr_prep(req, sqe);
+ case IORING_OP_SENDMSG:
+ case IORING_OP_SEND:
+ return io_sendmsg_prep(req, sqe);
+ case IORING_OP_RECVMSG:
+ case IORING_OP_RECV:
+ return io_recvmsg_prep(req, sqe);
+ case IORING_OP_CONNECT:
+ return io_connect_prep(req, sqe);
+ case IORING_OP_TIMEOUT:
+ return io_timeout_prep(req, sqe, false);
+ case IORING_OP_TIMEOUT_REMOVE:
+ return io_timeout_remove_prep(req, sqe);
+ case IORING_OP_ASYNC_CANCEL:
+ return io_async_cancel_prep(req, sqe);
+ case IORING_OP_LINK_TIMEOUT:
+ return io_timeout_prep(req, sqe, true);
+ case IORING_OP_ACCEPT:
+ return io_accept_prep(req, sqe);
+ case IORING_OP_FALLOCATE:
+ return io_fallocate_prep(req, sqe);
+ case IORING_OP_OPENAT:
+ return io_openat_prep(req, sqe);
+ case IORING_OP_CLOSE:
+ return io_close_prep(req, sqe);
+ case IORING_OP_FILES_UPDATE:
+ return io_rsrc_update_prep(req, sqe);
+ case IORING_OP_STATX:
+ return io_statx_prep(req, sqe);
+ case IORING_OP_FADVISE:
+ return io_fadvise_prep(req, sqe);
+ case IORING_OP_MADVISE:
+ return io_madvise_prep(req, sqe);
+ case IORING_OP_OPENAT2:
+ return io_openat2_prep(req, sqe);
+ case IORING_OP_EPOLL_CTL:
+ return io_epoll_ctl_prep(req, sqe);
+ case IORING_OP_SPLICE:
+ return io_splice_prep(req, sqe);
+ case IORING_OP_PROVIDE_BUFFERS:
+ return io_provide_buffers_prep(req, sqe);
+ case IORING_OP_REMOVE_BUFFERS:
+ return io_remove_buffers_prep(req, sqe);
+ case IORING_OP_TEE:
+ return io_tee_prep(req, sqe);
+ case IORING_OP_SHUTDOWN:
+ return io_shutdown_prep(req, sqe);
+ case IORING_OP_RENAMEAT:
+ return io_renameat_prep(req, sqe);
+ case IORING_OP_UNLINKAT:
+ return io_unlinkat_prep(req, sqe);
+ case IORING_OP_MKDIRAT:
+ return io_mkdirat_prep(req, sqe);
+ case IORING_OP_SYMLINKAT:
+ return io_symlinkat_prep(req, sqe);
+ case IORING_OP_LINKAT:
+ return io_linkat_prep(req, sqe);
+ case IORING_OP_MSG_RING:
+ return io_msg_ring_prep(req, sqe);
+ }
+
+ printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
+ req->opcode);
+ return -EINVAL;
+}
+
+static int io_req_prep_async(struct io_kiocb *req)
+{
+ const struct io_op_def *def = &io_op_defs[req->opcode];
+
+ /* assign early for deferred execution for non-fixed file */
+ if (def->needs_file && !(req->flags & REQ_F_FIXED_FILE))
+ req->file = io_file_get_normal(req, req->fd);
+ if (!def->needs_async_setup)
+ return 0;
+ if (WARN_ON_ONCE(req_has_async_data(req)))
+ return -EFAULT;
+ if (io_alloc_async_data(req))
+ return -EAGAIN;
+
+ switch (req->opcode) {
+ case IORING_OP_READV:
+ return io_rw_prep_async(req, READ);
+ case IORING_OP_WRITEV:
+ return io_rw_prep_async(req, WRITE);
+ case IORING_OP_SENDMSG:
+ return io_sendmsg_prep_async(req);
+ case IORING_OP_RECVMSG:
+ return io_recvmsg_prep_async(req);
+ case IORING_OP_CONNECT:
+ return io_connect_prep_async(req);
+ }
+ printk_once(KERN_WARNING "io_uring: prep_async() bad opcode %d\n",
+ req->opcode);
+ return -EFAULT;
+}
+
+static u32 io_get_sequence(struct io_kiocb *req)
+{
+ u32 seq = req->ctx->cached_sq_head;
+
+ /* need original cached_sq_head, but it was increased for each req */
+ io_for_each_link(req, req)
+ seq--;
+ return seq;
+}
+
+static __cold void io_drain_req(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_defer_entry *de;
+ int ret;
+ u32 seq = io_get_sequence(req);
+
+ /* Still need defer if there is pending req in defer list. */
+ spin_lock(&ctx->completion_lock);
+ if (!req_need_defer(req, seq) && list_empty_careful(&ctx->defer_list)) {
+ spin_unlock(&ctx->completion_lock);
+queue:
+ ctx->drain_active = false;
+ io_req_task_queue(req);
+ return;
+ }
+ spin_unlock(&ctx->completion_lock);
+
+ ret = io_req_prep_async(req);
+ if (ret) {
+fail:
+ io_req_complete_failed(req, ret);
+ return;
+ }
+ io_prep_async_link(req);
+ de = kmalloc(sizeof(*de), GFP_KERNEL);
+ if (!de) {
+ ret = -ENOMEM;
+ goto fail;
+ }
+
+ spin_lock(&ctx->completion_lock);
+ if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
+ spin_unlock(&ctx->completion_lock);
+ kfree(de);
+ goto queue;
+ }
+
+ trace_io_uring_defer(ctx, req, req->user_data, req->opcode);
+ de->req = req;
+ de->seq = seq;
+ list_add_tail(&de->list, &ctx->defer_list);
+ spin_unlock(&ctx->completion_lock);
+}
+
+static void io_clean_op(struct io_kiocb *req)
+{
+ if (req->flags & REQ_F_BUFFER_SELECTED) {
+ spin_lock(&req->ctx->completion_lock);
+ io_put_kbuf_comp(req);
+ spin_unlock(&req->ctx->completion_lock);
+ }
+
+ if (req->flags & REQ_F_NEED_CLEANUP) {
+ switch (req->opcode) {
+ case IORING_OP_READV:
+ case IORING_OP_READ_FIXED:
+ case IORING_OP_READ:
+ case IORING_OP_WRITEV:
+ case IORING_OP_WRITE_FIXED:
+ case IORING_OP_WRITE: {
+ struct io_async_rw *io = req->async_data;
+
+ kfree(io->free_iovec);
+ break;
+ }
+ case IORING_OP_RECVMSG:
+ case IORING_OP_SENDMSG: {
+ struct io_async_msghdr *io = req->async_data;
+
+ kfree(io->free_iov);
+ break;
+ }
+ case IORING_OP_OPENAT:
+ case IORING_OP_OPENAT2:
+ if (req->open.filename)
+ putname(req->open.filename);
+ break;
+ case IORING_OP_RENAMEAT:
+ putname(req->rename.oldpath);
+ putname(req->rename.newpath);
+ break;
+ case IORING_OP_UNLINKAT:
+ putname(req->unlink.filename);
+ break;
+ case IORING_OP_MKDIRAT:
+ putname(req->mkdir.filename);
+ break;
+ case IORING_OP_SYMLINKAT:
+ putname(req->symlink.oldpath);
+ putname(req->symlink.newpath);
+ break;
+ case IORING_OP_LINKAT:
+ putname(req->hardlink.oldpath);
+ putname(req->hardlink.newpath);
+ break;
+ case IORING_OP_STATX:
+ if (req->statx.filename)
+ putname(req->statx.filename);
+ break;
+ }
+ }
+ if ((req->flags & REQ_F_POLLED) && req->apoll) {
+ kfree(req->apoll->double_poll);
+ kfree(req->apoll);
+ req->apoll = NULL;
+ }
+ if (req->flags & REQ_F_INFLIGHT) {
+ struct io_uring_task *tctx = req->task->io_uring;
+
+ atomic_dec(&tctx->inflight_tracked);
+ }
+ if (req->flags & REQ_F_CREDS)
+ put_cred(req->creds);
+ if (req->flags & REQ_F_ASYNC_DATA) {
+ kfree(req->async_data);
+ req->async_data = NULL;
+ }
+ req->flags &= ~IO_REQ_CLEAN_FLAGS;
+}
+
+static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags)
+{
+ if (req->file || !io_op_defs[req->opcode].needs_file)
+ return true;
+
+ if (req->flags & REQ_F_FIXED_FILE)
+ req->file = io_file_get_fixed(req, req->fd, issue_flags);
+ else
+ req->file = io_file_get_normal(req, req->fd);
+ if (req->file)
+ return true;
+
+ req_set_fail(req);
+ req->result = -EBADF;
+ return false;
+}
+
+static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
+{
+ const struct cred *creds = NULL;
+ int ret;
+
+ if (unlikely(!io_assign_file(req, issue_flags)))
+ return -EBADF;
+
+ if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred()))
+ creds = override_creds(req->creds);
+
+ if (!io_op_defs[req->opcode].audit_skip)
+ audit_uring_entry(req->opcode);
+
+ switch (req->opcode) {
+ case IORING_OP_NOP:
+ ret = io_nop(req, issue_flags);
+ break;
+ case IORING_OP_READV:
+ case IORING_OP_READ_FIXED:
+ case IORING_OP_READ:
+ ret = io_read(req, issue_flags);
+ break;
+ case IORING_OP_WRITEV:
+ case IORING_OP_WRITE_FIXED:
+ case IORING_OP_WRITE:
+ ret = io_write(req, issue_flags);
+ break;
+ case IORING_OP_FSYNC:
+ ret = io_fsync(req, issue_flags);
+ break;
+ case IORING_OP_POLL_ADD:
+ ret = io_poll_add(req, issue_flags);
+ break;
+ case IORING_OP_POLL_REMOVE:
+ ret = io_poll_update(req, issue_flags);
+ break;
+ case IORING_OP_SYNC_FILE_RANGE:
+ ret = io_sync_file_range(req, issue_flags);
+ break;
+ case IORING_OP_SENDMSG:
+ ret = io_sendmsg(req, issue_flags);
+ break;
+ case IORING_OP_SEND:
+ ret = io_send(req, issue_flags);
+ break;
+ case IORING_OP_RECVMSG:
+ ret = io_recvmsg(req, issue_flags);
+ break;
+ case IORING_OP_RECV:
+ ret = io_recv(req, issue_flags);
+ break;
+ case IORING_OP_TIMEOUT:
+ ret = io_timeout(req, issue_flags);
+ break;
+ case IORING_OP_TIMEOUT_REMOVE:
+ ret = io_timeout_remove(req, issue_flags);
+ break;
+ case IORING_OP_ACCEPT:
+ ret = io_accept(req, issue_flags);
+ break;
+ case IORING_OP_CONNECT:
+ ret = io_connect(req, issue_flags);
+ break;
+ case IORING_OP_ASYNC_CANCEL:
+ ret = io_async_cancel(req, issue_flags);
+ break;
+ case IORING_OP_FALLOCATE:
+ ret = io_fallocate(req, issue_flags);
+ break;
+ case IORING_OP_OPENAT:
+ ret = io_openat(req, issue_flags);
+ break;
+ case IORING_OP_CLOSE:
+ ret = io_close(req, issue_flags);
+ break;
+ case IORING_OP_FILES_UPDATE:
+ ret = io_files_update(req, issue_flags);
+ break;
+ case IORING_OP_STATX:
+ ret = io_statx(req, issue_flags);
+ break;
+ case IORING_OP_FADVISE:
+ ret = io_fadvise(req, issue_flags);
+ break;
+ case IORING_OP_MADVISE:
+ ret = io_madvise(req, issue_flags);
+ break;
+ case IORING_OP_OPENAT2:
+ ret = io_openat2(req, issue_flags);
+ break;
+ case IORING_OP_EPOLL_CTL:
+ ret = io_epoll_ctl(req, issue_flags);
+ break;
+ case IORING_OP_SPLICE:
+ ret = io_splice(req, issue_flags);
+ break;
+ case IORING_OP_PROVIDE_BUFFERS:
+ ret = io_provide_buffers(req, issue_flags);
+ break;
+ case IORING_OP_REMOVE_BUFFERS:
+ ret = io_remove_buffers(req, issue_flags);
+ break;
+ case IORING_OP_TEE:
+ ret = io_tee(req, issue_flags);
+ break;
+ case IORING_OP_SHUTDOWN:
+ ret = io_shutdown(req, issue_flags);
+ break;
+ case IORING_OP_RENAMEAT:
+ ret = io_renameat(req, issue_flags);
+ break;
+ case IORING_OP_UNLINKAT:
+ ret = io_unlinkat(req, issue_flags);
+ break;
+ case IORING_OP_MKDIRAT:
+ ret = io_mkdirat(req, issue_flags);
+ break;
+ case IORING_OP_SYMLINKAT:
+ ret = io_symlinkat(req, issue_flags);
+ break;
+ case IORING_OP_LINKAT:
+ ret = io_linkat(req, issue_flags);
+ break;
+ case IORING_OP_MSG_RING:
+ ret = io_msg_ring(req, issue_flags);
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ if (!io_op_defs[req->opcode].audit_skip)
+ audit_uring_exit(!ret, ret);
+
+ if (creds)
+ revert_creds(creds);
+ if (ret)
+ return ret;
+ /* If the op doesn't have a file, we're not polling for it */
+ if ((req->ctx->flags & IORING_SETUP_IOPOLL) && req->file)
+ io_iopoll_req_issued(req, issue_flags);
+
+ return 0;
+}
+
+static struct io_wq_work *io_wq_free_work(struct io_wq_work *work)
+{
+ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+
+ req = io_put_req_find_next(req);
+ return req ? &req->work : NULL;
+}
+
+static void io_wq_submit_work(struct io_wq_work *work)
+{
+ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+ const struct io_op_def *def = &io_op_defs[req->opcode];
+ unsigned int issue_flags = IO_URING_F_UNLOCKED;
+ bool needs_poll = false;
+ struct io_kiocb *timeout;
+ int ret = 0, err = -ECANCELED;
+
+ /* one will be dropped by ->io_free_work() after returning to io-wq */
+ if (!(req->flags & REQ_F_REFCOUNT))
+ __io_req_set_refcount(req, 2);
+ else
+ req_ref_get(req);
+
+ timeout = io_prep_linked_timeout(req);
+ if (timeout)
+ io_queue_linked_timeout(timeout);
+
+
+ /* either cancelled or io-wq is dying, so don't touch tctx->iowq */
+ if (work->flags & IO_WQ_WORK_CANCEL) {
+fail:
+ io_req_task_queue_fail(req, err);
+ return;
+ }
+ if (!io_assign_file(req, issue_flags)) {
+ err = -EBADF;
+ work->flags |= IO_WQ_WORK_CANCEL;
+ goto fail;
+ }
+
+ if (req->flags & REQ_F_FORCE_ASYNC) {
+ bool opcode_poll = def->pollin || def->pollout;
+
+ if (opcode_poll && file_can_poll(req->file)) {
+ needs_poll = true;
+ issue_flags |= IO_URING_F_NONBLOCK;
+ }
+ }
+
+ do {
+ ret = io_issue_sqe(req, issue_flags);
+ if (ret != -EAGAIN)
+ break;
+ /*
+ * We can get EAGAIN for iopolled IO even though we're
+ * forcing a sync submission from here, since we can't
+ * wait for request slots on the block side.
+ */
+ if (!needs_poll) {
+ if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
+ break;
+ cond_resched();
+ continue;
+ }
+
+ if (io_arm_poll_handler(req, issue_flags) == IO_APOLL_OK)
+ return;
+ /* aborted or ready, in either case retry blocking */
+ needs_poll = false;
+ issue_flags &= ~IO_URING_F_NONBLOCK;
+ } while (1);
+
+ /* avoid locking problems by failing it from a clean context */
+ if (ret)
+ io_req_task_queue_fail(req, ret);
+}
+
+static inline struct io_fixed_file *io_fixed_file_slot(struct io_file_table *table,
+ unsigned i)
+{
+ return &table->files[i];
+}
+
+static inline struct file *io_file_from_index(struct io_ring_ctx *ctx,
+ int index)
+{
+ struct io_fixed_file *slot = io_fixed_file_slot(&ctx->file_table, index);
+
+ return (struct file *) (slot->file_ptr & FFS_MASK);
+}
+
+static void io_fixed_file_set(struct io_fixed_file *file_slot, struct file *file)
+{
+ unsigned long file_ptr = (unsigned long) file;
+
+ file_ptr |= io_file_get_flags(file);
+ file_slot->file_ptr = file_ptr;
+}
+
+static inline struct file *io_file_get_fixed(struct io_kiocb *req, int fd,
+ unsigned int issue_flags)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct file *file = NULL;
+ unsigned long file_ptr;
+
+ if (issue_flags & IO_URING_F_UNLOCKED)
+ mutex_lock(&ctx->uring_lock);
+
+ if (unlikely((unsigned int)fd >= ctx->nr_user_files))
+ goto out;
+ fd = array_index_nospec(fd, ctx->nr_user_files);
+ file_ptr = io_fixed_file_slot(&ctx->file_table, fd)->file_ptr;
+ file = (struct file *) (file_ptr & FFS_MASK);
+ file_ptr &= ~FFS_MASK;
+ /* mask in overlapping REQ_F and FFS bits */
+ req->flags |= (file_ptr << REQ_F_SUPPORT_NOWAIT_BIT);
+ io_req_set_rsrc_node(req, ctx, 0);
+out:
+ if (issue_flags & IO_URING_F_UNLOCKED)
+ mutex_unlock(&ctx->uring_lock);
+ return file;
+}
+
+static struct file *io_file_get_normal(struct io_kiocb *req, int fd)
+{
+ struct file *file = fget(fd);
+
+ trace_io_uring_file_get(req->ctx, req, req->user_data, fd);
+
+ /* we don't allow fixed io_uring files */
+ if (file && file->f_op == &io_uring_fops)
+ io_req_track_inflight(req);
+ return file;
+}
+
+static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
+{
+ struct io_kiocb *prev = req->timeout.prev;
+ int ret = -ENOENT;
+
+ if (prev) {
+ if (!(req->task->flags & PF_EXITING))
+ ret = io_try_cancel_userdata(req, prev->user_data);
+ io_req_complete_post(req, ret ?: -ETIME, 0);
+ io_put_req(prev);
+ } else {
+ io_req_complete_post(req, -ETIME, 0);
+ }
+}
+
+static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
+{
+ struct io_timeout_data *data = container_of(timer,
+ struct io_timeout_data, timer);
+ struct io_kiocb *prev, *req = data->req;
+ struct io_ring_ctx *ctx = req->ctx;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ctx->timeout_lock, flags);
+ prev = req->timeout.head;
+ req->timeout.head = NULL;
+
+ /*
+ * We don't expect the list to be empty, that will only happen if we
+ * race with the completion of the linked work.
+ */
+ if (prev) {
+ io_remove_next_linked(prev);
+ if (!req_ref_inc_not_zero(prev))
+ prev = NULL;
+ }
+ list_del(&req->timeout.list);
+ req->timeout.prev = prev;
+ spin_unlock_irqrestore(&ctx->timeout_lock, flags);
+
+ req->io_task_work.func = io_req_task_link_timeout;
+ io_req_task_work_add(req, false);
+ return HRTIMER_NORESTART;
+}
+
+static void io_queue_linked_timeout(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+
+ spin_lock_irq(&ctx->timeout_lock);
+ /*
+ * If the back reference is NULL, then our linked request finished
+ * before we got a chance to setup the timer
+ */
+ if (req->timeout.head) {
+ struct io_timeout_data *data = req->async_data;
+
+ data->timer.function = io_link_timeout_fn;
+ hrtimer_start(&data->timer, timespec64_to_ktime(data->ts),
+ data->mode);
+ list_add_tail(&req->timeout.list, &ctx->ltimeout_list);
+ }
+ spin_unlock_irq(&ctx->timeout_lock);
+ /* drop submission reference */
+ io_put_req(req);
+}
+
+static void io_queue_sqe_arm_apoll(struct io_kiocb *req)
+ __must_hold(&req->ctx->uring_lock)
+{
+ struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
+
+ switch (io_arm_poll_handler(req, 0)) {
+ case IO_APOLL_READY:
+ io_req_task_queue(req);
+ break;
+ case IO_APOLL_ABORTED:
+ /*
+ * Queued up for async execution, worker will release
+ * submit reference when the iocb is actually submitted.
+ */
+ io_queue_async_work(req, NULL);
+ break;
+ case IO_APOLL_OK:
+ break;
+ }
+
+ if (linked_timeout)
+ io_queue_linked_timeout(linked_timeout);
+}
+
+static inline void __io_queue_sqe(struct io_kiocb *req)
+ __must_hold(&req->ctx->uring_lock)
+{
+ struct io_kiocb *linked_timeout;
+ int ret;
+
+ ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
+
+ if (req->flags & REQ_F_COMPLETE_INLINE) {
+ io_req_add_compl_list(req);
+ return;
+ }
+ /*
+ * We async punt it if the file wasn't marked NOWAIT, or if the file
+ * doesn't support non-blocking read/write attempts
+ */
+ if (likely(!ret)) {
+ linked_timeout = io_prep_linked_timeout(req);
+ if (linked_timeout)
+ io_queue_linked_timeout(linked_timeout);
+ } else if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) {
+ io_queue_sqe_arm_apoll(req);
+ } else {
+ io_req_complete_failed(req, ret);
+ }
+}
+
+static void io_queue_sqe_fallback(struct io_kiocb *req)
+ __must_hold(&req->ctx->uring_lock)
+{
+ if (req->flags & REQ_F_FAIL) {
+ io_req_complete_fail_submit(req);
+ } else if (unlikely(req->ctx->drain_active)) {
+ io_drain_req(req);
+ } else {
+ int ret = io_req_prep_async(req);
+
+ if (unlikely(ret))
+ io_req_complete_failed(req, ret);
+ else
+ io_queue_async_work(req, NULL);
+ }
+}
+
+static inline void io_queue_sqe(struct io_kiocb *req)
+ __must_hold(&req->ctx->uring_lock)
+{
+ if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL))))
+ __io_queue_sqe(req);
+ else
+ io_queue_sqe_fallback(req);
+}
+
+/*
+ * Check SQE restrictions (opcode and flags).
+ *
+ * Returns 'true' if SQE is allowed, 'false' otherwise.
+ */
+static inline bool io_check_restriction(struct io_ring_ctx *ctx,
+ struct io_kiocb *req,
+ unsigned int sqe_flags)
+{
+ if (!test_bit(req->opcode, ctx->restrictions.sqe_op))
+ return false;
+
+ if ((sqe_flags & ctx->restrictions.sqe_flags_required) !=
+ ctx->restrictions.sqe_flags_required)
+ return false;
+
+ if (sqe_flags & ~(ctx->restrictions.sqe_flags_allowed |
+ ctx->restrictions.sqe_flags_required))
+ return false;
+
+ return true;
+}
+
+static void io_init_req_drain(struct io_kiocb *req)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_kiocb *head = ctx->submit_state.link.head;
+
+ ctx->drain_active = true;
+ if (head) {
+ /*
+ * If we need to drain a request in the middle of a link, drain
+ * the head request and the next request/link after the current
+ * link. Considering sequential execution of links,
+ * REQ_F_IO_DRAIN will be maintained for every request of our
+ * link.
+ */
+ head->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
+ ctx->drain_next = true;
+ }
+}
+
+static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+ __must_hold(&ctx->uring_lock)
+{
+ unsigned int sqe_flags;
+ int personality;
+ u8 opcode;
+
+ /* req is partially pre-initialised, see io_preinit_req() */
+ req->opcode = opcode = READ_ONCE(sqe->opcode);
+ /* same numerical values with corresponding REQ_F_*, safe to copy */
+ req->flags = sqe_flags = READ_ONCE(sqe->flags);
+ req->user_data = READ_ONCE(sqe->user_data);
+ req->file = NULL;
+ req->fixed_rsrc_refs = NULL;
+ req->task = current;
+
+ if (unlikely(opcode >= IORING_OP_LAST)) {
+ req->opcode = 0;
+ return -EINVAL;
+ }
+ if (unlikely(sqe_flags & ~SQE_COMMON_FLAGS)) {
+ /* enforce forwards compatibility on users */
+ if (sqe_flags & ~SQE_VALID_FLAGS)
+ return -EINVAL;
+ if ((sqe_flags & IOSQE_BUFFER_SELECT) &&
+ !io_op_defs[opcode].buffer_select)
+ return -EOPNOTSUPP;
+ if (sqe_flags & IOSQE_CQE_SKIP_SUCCESS)
+ ctx->drain_disabled = true;
+ if (sqe_flags & IOSQE_IO_DRAIN) {
+ if (ctx->drain_disabled)
+ return -EOPNOTSUPP;
+ io_init_req_drain(req);
+ }
+ }
+ if (unlikely(ctx->restricted || ctx->drain_active || ctx->drain_next)) {
+ if (ctx->restricted && !io_check_restriction(ctx, req, sqe_flags))
+ return -EACCES;
+ /* knock it to the slow queue path, will be drained there */
+ if (ctx->drain_active)
+ req->flags |= REQ_F_FORCE_ASYNC;
+ /* if there is no link, we're at "next" request and need to drain */
+ if (unlikely(ctx->drain_next) && !ctx->submit_state.link.head) {
+ ctx->drain_next = false;
+ ctx->drain_active = true;
+ req->flags |= REQ_F_IO_DRAIN | REQ_F_FORCE_ASYNC;
+ }
+ }
+
+ if (io_op_defs[opcode].needs_file) {
+ struct io_submit_state *state = &ctx->submit_state;
+
+ req->fd = READ_ONCE(sqe->fd);
+
+ /*
+ * Plug now if we have more than 2 IO left after this, and the
+ * target is potentially a read/write to block based storage.
+ */
+ if (state->need_plug && io_op_defs[opcode].plug) {
+ state->plug_started = true;
+ state->need_plug = false;
+ blk_start_plug_nr_ios(&state->plug, state->submit_nr);
+ }
+ }
+
+ personality = READ_ONCE(sqe->personality);
+ if (personality) {
+ int ret;
+
+ req->creds = xa_load(&ctx->personalities, personality);
+ if (!req->creds)
+ return -EINVAL;
+ get_cred(req->creds);
+ ret = security_uring_override_creds(req->creds);
+ if (ret) {
+ put_cred(req->creds);
+ return ret;
+ }
+ req->flags |= REQ_F_CREDS;
+ }
+
+ return io_req_prep(req, sqe);
+}
+
+static int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
+ const struct io_uring_sqe *sqe)
+ __must_hold(&ctx->uring_lock)
+{
+ struct io_submit_link *link = &ctx->submit_state.link;
+ int ret;
+
+ ret = io_init_req(ctx, req, sqe);
+ if (unlikely(ret)) {
+ trace_io_uring_req_failed(sqe, ctx, req, ret);
+
+ /* fail even hard links since we don't submit */
+ if (link->head) {
+ /*
+ * we can judge a link req is failed or cancelled by if
+ * REQ_F_FAIL is set, but the head is an exception since
+ * it may be set REQ_F_FAIL because of other req's failure
+ * so let's leverage req->result to distinguish if a head
+ * is set REQ_F_FAIL because of its failure or other req's
+ * failure so that we can set the correct ret code for it.
+ * init result here to avoid affecting the normal path.
+ */
+ if (!(link->head->flags & REQ_F_FAIL))
+ req_fail_link_node(link->head, -ECANCELED);
+ } else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
+ /*
+ * the current req is a normal req, we should return
+ * error and thus break the submittion loop.
+ */
+ io_req_complete_failed(req, ret);
+ return ret;
+ }
+ req_fail_link_node(req, ret);
+ }
+
+ /* don't need @sqe from now on */
+ trace_io_uring_submit_sqe(ctx, req, req->user_data, req->opcode,
+ req->flags, true,
+ ctx->flags & IORING_SETUP_SQPOLL);
+
+ /*
+ * If we already have a head request, queue this one for async
+ * submittal once the head completes. If we don't have a head but
+ * IOSQE_IO_LINK is set in the sqe, start a new head. This one will be
+ * submitted sync once the chain is complete. If none of those
+ * conditions are true (normal request), then just queue it.
+ */
+ if (link->head) {
+ struct io_kiocb *head = link->head;
+
+ if (!(req->flags & REQ_F_FAIL)) {
+ ret = io_req_prep_async(req);
+ if (unlikely(ret)) {
+ req_fail_link_node(req, ret);
+ if (!(head->flags & REQ_F_FAIL))
+ req_fail_link_node(head, -ECANCELED);
+ }
+ }
+ trace_io_uring_link(ctx, req, head);
+ link->last->link = req;
+ link->last = req;
+
+ if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
+ return 0;
+ /* last request of a link, enqueue the link */
+ link->head = NULL;
+ req = head;
+ } else if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK)) {
+ link->head = req;
+ link->last = req;
+ return 0;
+ }
+
+ io_queue_sqe(req);
+ return 0;
+}
+
+/*
+ * Batched submission is done, ensure local IO is flushed out.
+ */
+static void io_submit_state_end(struct io_ring_ctx *ctx)
+{
+ struct io_submit_state *state = &ctx->submit_state;
+
+ if (state->link.head)
+ io_queue_sqe(state->link.head);
+ /* flush only after queuing links as they can generate completions */
+ io_submit_flush_completions(ctx);
+ if (state->plug_started)
+ blk_finish_plug(&state->plug);
+}
+
+/*
+ * Start submission side cache.
+ */
+static void io_submit_state_start(struct io_submit_state *state,
+ unsigned int max_ios)
+{
+ state->plug_started = false;
+ state->need_plug = max_ios > 2;
+ state->submit_nr = max_ios;
+ /* set only head, no need to init link_last in advance */
+ state->link.head = NULL;
+}
+
+static void io_commit_sqring(struct io_ring_ctx *ctx)
+{
+ struct io_rings *rings = ctx->rings;
+
+ /*
+ * Ensure any loads from the SQEs are done at this point,
+ * since once we write the new head, the application could
+ * write new data to them.
+ */
+ smp_store_release(&rings->sq.head, ctx->cached_sq_head);
+}
+
+/*
+ * Fetch an sqe, if one is available. Note this returns a pointer to memory
+ * that is mapped by userspace. This means that care needs to be taken to
+ * ensure that reads are stable, as we cannot rely on userspace always
+ * being a good citizen. If members of the sqe are validated and then later
+ * used, it's important that those reads are done through READ_ONCE() to
+ * prevent a re-load down the line.
+ */
+static const struct io_uring_sqe *io_get_sqe(struct io_ring_ctx *ctx)
+{
+ unsigned head, mask = ctx->sq_entries - 1;
+ unsigned sq_idx = ctx->cached_sq_head++ & mask;
+
+ /*
+ * The cached sq head (or cq tail) serves two purposes:
+ *
+ * 1) allows us to batch the cost of updating the user visible
+ * head updates.
+ * 2) allows the kernel side to track the head on its own, even
+ * though the application is the one updating it.
+ */
+ head = READ_ONCE(ctx->sq_array[sq_idx]);
+ if (likely(head < ctx->sq_entries))
+ return &ctx->sq_sqes[head];
+
+ /* drop invalid entries */
+ ctx->cq_extra--;
+ WRITE_ONCE(ctx->rings->sq_dropped,
+ READ_ONCE(ctx->rings->sq_dropped) + 1);
+ return NULL;
+}
+
+static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
+ __must_hold(&ctx->uring_lock)
+{
+ unsigned int entries = io_sqring_entries(ctx);
+ int submitted = 0;
+
+ if (unlikely(!entries))
+ return 0;
+ /* make sure SQ entry isn't read before tail */
+ nr = min3(nr, ctx->sq_entries, entries);
+ io_get_task_refs(nr);
+
+ io_submit_state_start(&ctx->submit_state, nr);
+ do {
+ const struct io_uring_sqe *sqe;
+ struct io_kiocb *req;
+
+ if (unlikely(!io_alloc_req_refill(ctx))) {
+ if (!submitted)
+ submitted = -EAGAIN;
+ break;
+ }
+ req = io_alloc_req(ctx);
+ sqe = io_get_sqe(ctx);
+ if (unlikely(!sqe)) {
+ wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
+ break;
+ }
+ /* will complete beyond this point, count as submitted */
+ submitted++;
+ if (io_submit_sqe(ctx, req, sqe)) {
+ /*
+ * Continue submitting even for sqe failure if the
+ * ring was setup with IORING_SETUP_SUBMIT_ALL
+ */
+ if (!(ctx->flags & IORING_SETUP_SUBMIT_ALL))
+ break;
+ }
+ } while (submitted < nr);
+
+ if (unlikely(submitted != nr)) {
+ int ref_used = (submitted == -EAGAIN) ? 0 : submitted;
+ int unused = nr - ref_used;
+
+ current->io_uring->cached_refs += unused;
+ }
+
+ io_submit_state_end(ctx);
+ /* Commit SQ ring head once we've consumed and submitted all SQEs */
+ io_commit_sqring(ctx);
+
+ return submitted;
+}
+
+static inline bool io_sqd_events_pending(struct io_sq_data *sqd)
+{
+ return READ_ONCE(sqd->state);
+}
+
+static inline void io_ring_set_wakeup_flag(struct io_ring_ctx *ctx)
+{
+ /* Tell userspace we may need a wakeup call */
+ spin_lock(&ctx->completion_lock);
+ WRITE_ONCE(ctx->rings->sq_flags,
+ ctx->rings->sq_flags | IORING_SQ_NEED_WAKEUP);
+ spin_unlock(&ctx->completion_lock);
+}
+
+static inline void io_ring_clear_wakeup_flag(struct io_ring_ctx *ctx)
+{
+ spin_lock(&ctx->completion_lock);
+ WRITE_ONCE(ctx->rings->sq_flags,
+ ctx->rings->sq_flags & ~IORING_SQ_NEED_WAKEUP);
+ spin_unlock(&ctx->completion_lock);
+}
+
+static int __io_sq_thread(struct io_ring_ctx *ctx, bool cap_entries)
+{
+ unsigned int to_submit;
+ int ret = 0;
+
+ to_submit = io_sqring_entries(ctx);
+ /* if we're handling multiple rings, cap submit size for fairness */
+ if (cap_entries && to_submit > IORING_SQPOLL_CAP_ENTRIES_VALUE)
+ to_submit = IORING_SQPOLL_CAP_ENTRIES_VALUE;
+
+ if (!wq_list_empty(&ctx->iopoll_list) || to_submit) {
+ const struct cred *creds = NULL;
+
+ if (ctx->sq_creds != current_cred())
+ creds = override_creds(ctx->sq_creds);
+
+ mutex_lock(&ctx->uring_lock);
+ if (!wq_list_empty(&ctx->iopoll_list))
+ io_do_iopoll(ctx, true);
+
+ /*
+ * Don't submit if refs are dying, good for io_uring_register(),
+ * but also it is relied upon by io_ring_exit_work()
+ */
+ if (to_submit && likely(!percpu_ref_is_dying(&ctx->refs)) &&
+ !(ctx->flags & IORING_SETUP_R_DISABLED))
+ ret = io_submit_sqes(ctx, to_submit);
+ mutex_unlock(&ctx->uring_lock);
+
+ if (to_submit && wq_has_sleeper(&ctx->sqo_sq_wait))
+ wake_up(&ctx->sqo_sq_wait);
+ if (creds)
+ revert_creds(creds);
+ }
+
+ return ret;
+}
+
+static __cold void io_sqd_update_thread_idle(struct io_sq_data *sqd)
+{
+ struct io_ring_ctx *ctx;
+ unsigned sq_thread_idle = 0;
+
+ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+ sq_thread_idle = max(sq_thread_idle, ctx->sq_thread_idle);
+ sqd->sq_thread_idle = sq_thread_idle;
+}
+
+static bool io_sqd_handle_event(struct io_sq_data *sqd)
+{
+ bool did_sig = false;
+ struct ksignal ksig;
+
+ if (test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) ||
+ signal_pending(current)) {
+ mutex_unlock(&sqd->lock);
+ if (signal_pending(current))
+ did_sig = get_signal(&ksig);
+ cond_resched();
+ mutex_lock(&sqd->lock);
+ }
+ return did_sig || test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
+}
+
+static int io_sq_thread(void *data)
+{
+ struct io_sq_data *sqd = data;
+ struct io_ring_ctx *ctx;
+ unsigned long timeout = 0;
+ char buf[TASK_COMM_LEN];
+ DEFINE_WAIT(wait);
+
+ snprintf(buf, sizeof(buf), "iou-sqp-%d", sqd->task_pid);
+ set_task_comm(current, buf);
+
+ if (sqd->sq_cpu != -1)
+ set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
+ else
+ set_cpus_allowed_ptr(current, cpu_online_mask);
+ current->flags |= PF_NO_SETAFFINITY;
+
+ audit_alloc_kernel(current);
+
+ mutex_lock(&sqd->lock);
+ while (1) {
+ bool cap_entries, sqt_spin = false;
+
+ if (io_sqd_events_pending(sqd) || signal_pending(current)) {
+ if (io_sqd_handle_event(sqd))
+ break;
+ timeout = jiffies + sqd->sq_thread_idle;
+ }
+
+ cap_entries = !list_is_singular(&sqd->ctx_list);
+ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+ int ret = __io_sq_thread(ctx, cap_entries);
+
+ if (!sqt_spin && (ret > 0 || !wq_list_empty(&ctx->iopoll_list)))
+ sqt_spin = true;
+ }
+ if (io_run_task_work())
+ sqt_spin = true;
+
+ if (sqt_spin || !time_after(jiffies, timeout)) {
+ cond_resched();
+ if (sqt_spin)
+ timeout = jiffies + sqd->sq_thread_idle;
+ continue;
+ }
+
+ prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
+ if (!io_sqd_events_pending(sqd) && !task_work_pending(current)) {
+ bool needs_sched = true;
+
+ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) {
+ io_ring_set_wakeup_flag(ctx);
+
+ if ((ctx->flags & IORING_SETUP_IOPOLL) &&
+ !wq_list_empty(&ctx->iopoll_list)) {
+ needs_sched = false;
+ break;
+ }
+
+ /*
+ * Ensure the store of the wakeup flag is not
+ * reordered with the load of the SQ tail
+ */
+ smp_mb();
+
+ if (io_sqring_entries(ctx)) {
+ needs_sched = false;
+ break;
+ }
+ }
+
+ if (needs_sched) {
+ mutex_unlock(&sqd->lock);
+ schedule();
+ mutex_lock(&sqd->lock);
+ }
+ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+ io_ring_clear_wakeup_flag(ctx);
+ }
+
+ finish_wait(&sqd->wait, &wait);
+ timeout = jiffies + sqd->sq_thread_idle;
+ }
+
+ io_uring_cancel_generic(true, sqd);
+ sqd->thread = NULL;
+ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+ io_ring_set_wakeup_flag(ctx);
+ io_run_task_work();
+ mutex_unlock(&sqd->lock);
+
+ audit_free(current);
+
+ complete(&sqd->exited);
+ do_exit(0);
+}
+
+struct io_wait_queue {
+ struct wait_queue_entry wq;
+ struct io_ring_ctx *ctx;
+ unsigned cq_tail;
+ unsigned nr_timeouts;
+};
+
+static inline bool io_should_wake(struct io_wait_queue *iowq)
+{
+ struct io_ring_ctx *ctx = iowq->ctx;
+ int dist = ctx->cached_cq_tail - (int) iowq->cq_tail;
+
+ /*
+ * Wake up if we have enough events, or if a timeout occurred since we
+ * started waiting. For timeouts, we always want to return to userspace,
+ * regardless of event count.
+ */
+ return dist >= 0 || atomic_read(&ctx->cq_timeouts) != iowq->nr_timeouts;
+}
+
+static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
+ int wake_flags, void *key)
+{
+ struct io_wait_queue *iowq = container_of(curr, struct io_wait_queue,
+ wq);
+
+ /*
+ * Cannot safely flush overflowed CQEs from here, ensure we wake up
+ * the task, and the next invocation will do it.
+ */
+ if (io_should_wake(iowq) || test_bit(0, &iowq->ctx->check_cq_overflow))
+ return autoremove_wake_function(curr, mode, wake_flags, key);
+ return -1;
+}
+
+static int io_run_task_work_sig(void)
+{
+ if (io_run_task_work())
+ return 1;
+ if (test_thread_flag(TIF_NOTIFY_SIGNAL))
+ return -ERESTARTSYS;
+ if (task_sigpending(current))
+ return -EINTR;
+ return 0;
+}
+
+/* when returns >0, the caller should retry */
+static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+ struct io_wait_queue *iowq,
+ ktime_t timeout)
+{
+ int ret;
+
+ /* make sure we run task_work before checking for signals */
+ ret = io_run_task_work_sig();
+ if (ret || io_should_wake(iowq))
+ return ret;
+ /* let the caller flush overflows, retry */
+ if (test_bit(0, &ctx->check_cq_overflow))
+ return 1;
+
+ if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS))
+ return -ETIME;
+ return 1;
+}
+
+/*
+ * Wait until events become available, if we don't already have some. The
+ * application must reap them itself, as they reside on the shared cq ring.
+ */
+static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
+ const sigset_t __user *sig, size_t sigsz,
+ struct __kernel_timespec __user *uts)
+{
+ struct io_wait_queue iowq;
+ struct io_rings *rings = ctx->rings;
+ ktime_t timeout = KTIME_MAX;
+ int ret;
+
+ do {
+ io_cqring_overflow_flush(ctx);
+ if (io_cqring_events(ctx) >= min_events)
+ return 0;
+ if (!io_run_task_work())
+ break;
+ } while (1);
+
+ if (sig) {
+#ifdef CONFIG_COMPAT
+ if (in_compat_syscall())
+ ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
+ sigsz);
+ else
+#endif
+ ret = set_user_sigmask(sig, sigsz);
+
+ if (ret)
+ return ret;
+ }
+
+ if (uts) {
+ struct timespec64 ts;
+
+ if (get_timespec64(&ts, uts))
+ return -EFAULT;
+ timeout = ktime_add_ns(timespec64_to_ktime(ts), ktime_get_ns());
+ }
+
+ init_waitqueue_func_entry(&iowq.wq, io_wake_function);
+ iowq.wq.private = current;
+ INIT_LIST_HEAD(&iowq.wq.entry);
+ iowq.ctx = ctx;
+ iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
+ iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
+
+ trace_io_uring_cqring_wait(ctx, min_events);
+ do {
+ /* if we can't even flush overflow, don't wait for more */
+ if (!io_cqring_overflow_flush(ctx)) {
+ ret = -EBUSY;
+ break;
+ }
+ prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
+ TASK_INTERRUPTIBLE);
+ ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
+ finish_wait(&ctx->cq_wait, &iowq.wq);
+ cond_resched();
+ } while (ret > 0);
+
+ restore_saved_sigmask_unless(ret == -EINTR);
+
+ return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret : 0;
+}
+
+static void io_free_page_table(void **table, size_t size)
+{
+ unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
+
+ for (i = 0; i < nr_tables; i++)
+ kfree(table[i]);
+ kfree(table);
+}
+
+static __cold void **io_alloc_page_table(size_t size)
+{
+ unsigned i, nr_tables = DIV_ROUND_UP(size, PAGE_SIZE);
+ size_t init_size = size;
+ void **table;
+
+ table = kcalloc(nr_tables, sizeof(*table), GFP_KERNEL_ACCOUNT);
+ if (!table)
+ return NULL;
+
+ for (i = 0; i < nr_tables; i++) {
+ unsigned int this_size = min_t(size_t, size, PAGE_SIZE);
+
+ table[i] = kzalloc(this_size, GFP_KERNEL_ACCOUNT);
+ if (!table[i]) {
+ io_free_page_table(table, init_size);
+ return NULL;
+ }
+ size -= this_size;
+ }
+ return table;
+}
+
+static void io_rsrc_node_destroy(struct io_rsrc_node *ref_node)
+{
+ percpu_ref_exit(&ref_node->refs);
+ kfree(ref_node);
+}
+
+static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
+{
+ struct io_rsrc_node *node = container_of(ref, struct io_rsrc_node, refs);
+ struct io_ring_ctx *ctx = node->rsrc_data->ctx;
+ unsigned long flags;
+ bool first_add = false;
+ unsigned long delay = HZ;
+
+ spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
+ node->done = true;
+
+ /* if we are mid-quiesce then do not delay */
+ if (node->rsrc_data->quiesce)
+ delay = 0;
+
+ while (!list_empty(&ctx->rsrc_ref_list)) {
+ node = list_first_entry(&ctx->rsrc_ref_list,
+ struct io_rsrc_node, node);
+ /* recycle ref nodes in order */
+ if (!node->done)
+ break;
+ list_del(&node->node);
+ first_add |= llist_add(&node->llist, &ctx->rsrc_put_llist);
+ }
+ spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
+
+ if (first_add)
+ mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
+}
+
+static struct io_rsrc_node *io_rsrc_node_alloc(void)
+{
+ struct io_rsrc_node *ref_node;
+
+ ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
+ if (!ref_node)
+ return NULL;
+
+ if (percpu_ref_init(&ref_node->refs, io_rsrc_node_ref_zero,
+ 0, GFP_KERNEL)) {
+ kfree(ref_node);
+ return NULL;
+ }
+ INIT_LIST_HEAD(&ref_node->node);
+ INIT_LIST_HEAD(&ref_node->rsrc_list);
+ ref_node->done = false;
+ return ref_node;
+}
+
+static void io_rsrc_node_switch(struct io_ring_ctx *ctx,
+ struct io_rsrc_data *data_to_kill)
+ __must_hold(&ctx->uring_lock)
+{
+ WARN_ON_ONCE(!ctx->rsrc_backup_node);
+ WARN_ON_ONCE(data_to_kill && !ctx->rsrc_node);
+
+ io_rsrc_refs_drop(ctx);
+
+ if (data_to_kill) {
+ struct io_rsrc_node *rsrc_node = ctx->rsrc_node;
+
+ rsrc_node->rsrc_data = data_to_kill;
+ spin_lock_irq(&ctx->rsrc_ref_lock);
+ list_add_tail(&rsrc_node->node, &ctx->rsrc_ref_list);
+ spin_unlock_irq(&ctx->rsrc_ref_lock);
+
+ atomic_inc(&data_to_kill->refs);
+ percpu_ref_kill(&rsrc_node->refs);
+ ctx->rsrc_node = NULL;
+ }
+
+ if (!ctx->rsrc_node) {
+ ctx->rsrc_node = ctx->rsrc_backup_node;
+ ctx->rsrc_backup_node = NULL;
+ }
+}
+
+static int io_rsrc_node_switch_start(struct io_ring_ctx *ctx)
+{
+ if (ctx->rsrc_backup_node)
+ return 0;
+ ctx->rsrc_backup_node = io_rsrc_node_alloc();
+ return ctx->rsrc_backup_node ? 0 : -ENOMEM;
+}
+
+static __cold int io_rsrc_ref_quiesce(struct io_rsrc_data *data,
+ struct io_ring_ctx *ctx)
+{
+ int ret;
+
+ /* As we may drop ->uring_lock, other task may have started quiesce */
+ if (data->quiesce)
+ return -ENXIO;
+
+ data->quiesce = true;
+ do {
+ ret = io_rsrc_node_switch_start(ctx);
+ if (ret)
+ break;
+ io_rsrc_node_switch(ctx, data);
+
+ /* kill initial ref, already quiesced if zero */
+ if (atomic_dec_and_test(&data->refs))
+ break;
+ mutex_unlock(&ctx->uring_lock);
+ flush_delayed_work(&ctx->rsrc_put_work);
+ ret = wait_for_completion_interruptible(&data->done);
+ if (!ret) {
+ mutex_lock(&ctx->uring_lock);
+ if (atomic_read(&data->refs) > 0) {
+ /*
+ * it has been revived by another thread while
+ * we were unlocked
+ */
+ mutex_unlock(&ctx->uring_lock);
+ } else {
+ break;
+ }
+ }
+
+ atomic_inc(&data->refs);
+ /* wait for all works potentially completing data->done */
+ flush_delayed_work(&ctx->rsrc_put_work);
+ reinit_completion(&data->done);
+
+ ret = io_run_task_work_sig();
+ mutex_lock(&ctx->uring_lock);
+ } while (ret >= 0);
+ data->quiesce = false;
+
+ return ret;
+}
+
+static u64 *io_get_tag_slot(struct io_rsrc_data *data, unsigned int idx)
+{
+ unsigned int off = idx & IO_RSRC_TAG_TABLE_MASK;
+ unsigned int table_idx = idx >> IO_RSRC_TAG_TABLE_SHIFT;
+
+ return &data->tags[table_idx][off];
+}
+
+static void io_rsrc_data_free(struct io_rsrc_data *data)
+{
+ size_t size = data->nr * sizeof(data->tags[0][0]);
+
+ if (data->tags)
+ io_free_page_table((void **)data->tags, size);
+ kfree(data);
+}
+
+static __cold int io_rsrc_data_alloc(struct io_ring_ctx *ctx, rsrc_put_fn *do_put,
+ u64 __user *utags, unsigned nr,
+ struct io_rsrc_data **pdata)
+{
+ struct io_rsrc_data *data;
+ int ret = -ENOMEM;
+ unsigned i;
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+ data->tags = (u64 **)io_alloc_page_table(nr * sizeof(data->tags[0][0]));
+ if (!data->tags) {
+ kfree(data);
+ return -ENOMEM;
+ }
+
+ data->nr = nr;
+ data->ctx = ctx;
+ data->do_put = do_put;
+ if (utags) {
+ ret = -EFAULT;
+ for (i = 0; i < nr; i++) {
+ u64 *tag_slot = io_get_tag_slot(data, i);
+
+ if (copy_from_user(tag_slot, &utags[i],
+ sizeof(*tag_slot)))
+ goto fail;
+ }
+ }
+
+ atomic_set(&data->refs, 1);
+ init_completion(&data->done);
+ *pdata = data;
+ return 0;
+fail:
+ io_rsrc_data_free(data);
+ return ret;
+}
+
+static bool io_alloc_file_tables(struct io_file_table *table, unsigned nr_files)
+{
+ table->files = kvcalloc(nr_files, sizeof(table->files[0]),
+ GFP_KERNEL_ACCOUNT);
+ return !!table->files;
+}
+
+static void io_free_file_tables(struct io_file_table *table)
+{
+ kvfree(table->files);
+ table->files = NULL;
+}
+
+static void __io_sqe_files_unregister(struct io_ring_ctx *ctx)
+{
+#if defined(CONFIG_UNIX)
+ if (ctx->ring_sock) {
+ struct sock *sock = ctx->ring_sock->sk;
+ struct sk_buff *skb;
+
+ while ((skb = skb_dequeue(&sock->sk_receive_queue)) != NULL)
+ kfree_skb(skb);
+ }
+#else
+ int i;
+
+ for (i = 0; i < ctx->nr_user_files; i++) {
+ struct file *file;
+
+ file = io_file_from_index(ctx, i);
+ if (file)
+ fput(file);
+ }
+#endif
+ io_free_file_tables(&ctx->file_table);
+ io_rsrc_data_free(ctx->file_data);
+ ctx->file_data = NULL;
+ ctx->nr_user_files = 0;
+}
+
+static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
+{
+ unsigned nr = ctx->nr_user_files;
+ int ret;
+
+ if (!ctx->file_data)
+ return -ENXIO;
+
+ /*
+ * Quiesce may unlock ->uring_lock, and while it's not held
+ * prevent new requests using the table.
+ */
+ ctx->nr_user_files = 0;
+ ret = io_rsrc_ref_quiesce(ctx->file_data, ctx);
+ ctx->nr_user_files = nr;
+ if (!ret)
+ __io_sqe_files_unregister(ctx);
+ return ret;
+}
+
+static void io_sq_thread_unpark(struct io_sq_data *sqd)
+ __releases(&sqd->lock)
+{
+ WARN_ON_ONCE(sqd->thread == current);
+
+ /*
+ * Do the dance but not conditional clear_bit() because it'd race with
+ * other threads incrementing park_pending and setting the bit.
+ */
+ clear_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
+ if (atomic_dec_return(&sqd->park_pending))
+ set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
+ mutex_unlock(&sqd->lock);
+}
+
+static void io_sq_thread_park(struct io_sq_data *sqd)
+ __acquires(&sqd->lock)
+{
+ WARN_ON_ONCE(sqd->thread == current);
+
+ atomic_inc(&sqd->park_pending);
+ set_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state);
+ mutex_lock(&sqd->lock);
+ if (sqd->thread)
+ wake_up_process(sqd->thread);
+}
+
+static void io_sq_thread_stop(struct io_sq_data *sqd)
+{
+ WARN_ON_ONCE(sqd->thread == current);
+ WARN_ON_ONCE(test_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state));
+
+ set_bit(IO_SQ_THREAD_SHOULD_STOP, &sqd->state);
+ mutex_lock(&sqd->lock);
+ if (sqd->thread)
+ wake_up_process(sqd->thread);
+ mutex_unlock(&sqd->lock);
+ wait_for_completion(&sqd->exited);
+}
+
+static void io_put_sq_data(struct io_sq_data *sqd)
+{
+ if (refcount_dec_and_test(&sqd->refs)) {
+ WARN_ON_ONCE(atomic_read(&sqd->park_pending));
+
+ io_sq_thread_stop(sqd);
+ kfree(sqd);
+ }
+}
+
+static void io_sq_thread_finish(struct io_ring_ctx *ctx)
+{
+ struct io_sq_data *sqd = ctx->sq_data;
+
+ if (sqd) {
+ io_sq_thread_park(sqd);
+ list_del_init(&ctx->sqd_list);
+ io_sqd_update_thread_idle(sqd);
+ io_sq_thread_unpark(sqd);
+
+ io_put_sq_data(sqd);
+ ctx->sq_data = NULL;
+ }
+}
+
+static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
+{
+ struct io_ring_ctx *ctx_attach;
+ struct io_sq_data *sqd;
+ struct fd f;
+
+ f = fdget(p->wq_fd);
+ if (!f.file)
+ return ERR_PTR(-ENXIO);
+ if (f.file->f_op != &io_uring_fops) {
+ fdput(f);
+ return ERR_PTR(-EINVAL);
+ }
+
+ ctx_attach = f.file->private_data;
+ sqd = ctx_attach->sq_data;
+ if (!sqd) {
+ fdput(f);
+ return ERR_PTR(-EINVAL);
+ }
+ if (sqd->task_tgid != current->tgid) {
+ fdput(f);
+ return ERR_PTR(-EPERM);
+ }
+
+ refcount_inc(&sqd->refs);
+ fdput(f);
+ return sqd;
+}
+
+static struct io_sq_data *io_get_sq_data(struct io_uring_params *p,
+ bool *attached)
+{
+ struct io_sq_data *sqd;
+
+ *attached = false;
+ if (p->flags & IORING_SETUP_ATTACH_WQ) {
+ sqd = io_attach_sq_data(p);
+ if (!IS_ERR(sqd)) {
+ *attached = true;
+ return sqd;
+ }
+ /* fall through for EPERM case, setup new sqd/task */
+ if (PTR_ERR(sqd) != -EPERM)
+ return sqd;
+ }
+
+ sqd = kzalloc(sizeof(*sqd), GFP_KERNEL);
+ if (!sqd)
+ return ERR_PTR(-ENOMEM);
+
+ atomic_set(&sqd->park_pending, 0);
+ refcount_set(&sqd->refs, 1);
+ INIT_LIST_HEAD(&sqd->ctx_list);
+ mutex_init(&sqd->lock);
+ init_waitqueue_head(&sqd->wait);
+ init_completion(&sqd->exited);
+ return sqd;
+}
+
+#if defined(CONFIG_UNIX)
+/*
+ * Ensure the UNIX gc is aware of our file set, so we are certain that
+ * the io_uring can be safely unregistered on process exit, even if we have
+ * loops in the file referencing.
+ */
+static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
+{
+ struct sock *sk = ctx->ring_sock->sk;
+ struct scm_fp_list *fpl;
+ struct sk_buff *skb;
+ int i, nr_files;
+
+ fpl = kzalloc(sizeof(*fpl), GFP_KERNEL);
+ if (!fpl)
+ return -ENOMEM;
+
+ skb = alloc_skb(0, GFP_KERNEL);
+ if (!skb) {
+ kfree(fpl);
+ return -ENOMEM;
+ }
+
+ skb->sk = sk;
+
+ nr_files = 0;
+ fpl->user = get_uid(current_user());
+ for (i = 0; i < nr; i++) {
+ struct file *file = io_file_from_index(ctx, i + offset);
+
+ if (!file)
+ continue;
+ fpl->fp[nr_files] = get_file(file);
+ unix_inflight(fpl->user, fpl->fp[nr_files]);
+ nr_files++;
+ }
+
+ if (nr_files) {
+ fpl->max = SCM_MAX_FD;
+ fpl->count = nr_files;
+ UNIXCB(skb).fp = fpl;
+ skb->destructor = unix_destruct_scm;
+ refcount_add(skb->truesize, &sk->sk_wmem_alloc);
+ skb_queue_head(&sk->sk_receive_queue, skb);
+
+ for (i = 0; i < nr; i++) {
+ struct file *file = io_file_from_index(ctx, i + offset);
+
+ if (file)
+ fput(file);
+ }
+ } else {
+ kfree_skb(skb);
+ free_uid(fpl->user);
+ kfree(fpl);
+ }
+
+ return 0;
+}
+
+/*
+ * If UNIX sockets are enabled, fd passing can cause a reference cycle which
+ * causes regular reference counting to break down. We rely on the UNIX
+ * garbage collection to take care of this problem for us.
+ */
+static int io_sqe_files_scm(struct io_ring_ctx *ctx)
+{
+ unsigned left, total;
+ int ret = 0;
+
+ total = 0;
+ left = ctx->nr_user_files;
+ while (left) {
+ unsigned this_files = min_t(unsigned, left, SCM_MAX_FD);
+
+ ret = __io_sqe_files_scm(ctx, this_files, total);
+ if (ret)
+ break;
+ left -= this_files;
+ total += this_files;
+ }
+
+ if (!ret)
+ return 0;
+
+ while (total < ctx->nr_user_files) {
+ struct file *file = io_file_from_index(ctx, total);
+
+ if (file)
+ fput(file);
+ total++;
+ }
+
+ return ret;
+}
+#else
+static int io_sqe_files_scm(struct io_ring_ctx *ctx)
+{
+ return 0;
+}
+#endif
+
+static void io_rsrc_file_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
+{
+ struct file *file = prsrc->file;
+#if defined(CONFIG_UNIX)
+ struct sock *sock = ctx->ring_sock->sk;
+ struct sk_buff_head list, *head = &sock->sk_receive_queue;
+ struct sk_buff *skb;
+ int i;
+
+ __skb_queue_head_init(&list);
+
+ /*
+ * Find the skb that holds this file in its SCM_RIGHTS. When found,
+ * remove this entry and rearrange the file array.
+ */
+ skb = skb_dequeue(head);
+ while (skb) {
+ struct scm_fp_list *fp;
+
+ fp = UNIXCB(skb).fp;
+ for (i = 0; i < fp->count; i++) {
+ int left;
+
+ if (fp->fp[i] != file)
+ continue;
+
+ unix_notinflight(fp->user, fp->fp[i]);
+ left = fp->count - 1 - i;
+ if (left) {
+ memmove(&fp->fp[i], &fp->fp[i + 1],
+ left * sizeof(struct file *));
+ }
+ fp->count--;
+ if (!fp->count) {
+ kfree_skb(skb);
+ skb = NULL;
+ } else {
+ __skb_queue_tail(&list, skb);
+ }
+ fput(file);
+ file = NULL;
+ break;
+ }
+
+ if (!file)
+ break;
+
+ __skb_queue_tail(&list, skb);
+
+ skb = skb_dequeue(head);
+ }
+
+ if (skb_peek(&list)) {
+ spin_lock_irq(&head->lock);
+ while ((skb = __skb_dequeue(&list)) != NULL)
+ __skb_queue_tail(head, skb);
+ spin_unlock_irq(&head->lock);
+ }
+#else
+ fput(file);
+#endif
+}
+
+static void __io_rsrc_put_work(struct io_rsrc_node *ref_node)
+{
+ struct io_rsrc_data *rsrc_data = ref_node->rsrc_data;
+ struct io_ring_ctx *ctx = rsrc_data->ctx;
+ struct io_rsrc_put *prsrc, *tmp;
+
+ list_for_each_entry_safe(prsrc, tmp, &ref_node->rsrc_list, list) {
+ list_del(&prsrc->list);
+
+ if (prsrc->tag) {
+ bool lock_ring = ctx->flags & IORING_SETUP_IOPOLL;
+
+ io_ring_submit_lock(ctx, lock_ring);
+ spin_lock(&ctx->completion_lock);
+ io_fill_cqe_aux(ctx, prsrc->tag, 0, 0);
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ io_cqring_ev_posted(ctx);
+ io_ring_submit_unlock(ctx, lock_ring);
+ }
+
+ rsrc_data->do_put(ctx, prsrc);
+ kfree(prsrc);
+ }
+
+ io_rsrc_node_destroy(ref_node);
+ if (atomic_dec_and_test(&rsrc_data->refs))
+ complete(&rsrc_data->done);
+}
+
+static void io_rsrc_put_work(struct work_struct *work)
+{
+ struct io_ring_ctx *ctx;
+ struct llist_node *node;
+
+ ctx = container_of(work, struct io_ring_ctx, rsrc_put_work.work);
+ node = llist_del_all(&ctx->rsrc_put_llist);
+
+ while (node) {
+ struct io_rsrc_node *ref_node;
+ struct llist_node *next = node->next;
+
+ ref_node = llist_entry(node, struct io_rsrc_node, llist);
+ __io_rsrc_put_work(ref_node);
+ node = next;
+ }
+}
+
+static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned nr_args, u64 __user *tags)
+{
+ __s32 __user *fds = (__s32 __user *) arg;
+ struct file *file;
+ int fd, ret;
+ unsigned i;
+
+ if (ctx->file_data)
+ return -EBUSY;
+ if (!nr_args)
+ return -EINVAL;
+ if (nr_args > IORING_MAX_FIXED_FILES)
+ return -EMFILE;
+ if (nr_args > rlimit(RLIMIT_NOFILE))
+ return -EMFILE;
+ ret = io_rsrc_node_switch_start(ctx);
+ if (ret)
+ return ret;
+ ret = io_rsrc_data_alloc(ctx, io_rsrc_file_put, tags, nr_args,
+ &ctx->file_data);
+ if (ret)
+ return ret;
+
+ ret = -ENOMEM;
+ if (!io_alloc_file_tables(&ctx->file_table, nr_args))
+ goto out_free;
+
+ for (i = 0; i < nr_args; i++, ctx->nr_user_files++) {
+ if (copy_from_user(&fd, &fds[i], sizeof(fd))) {
+ ret = -EFAULT;
+ goto out_fput;
+ }
+ /* allow sparse sets */
+ if (fd == -1) {
+ ret = -EINVAL;
+ if (unlikely(*io_get_tag_slot(ctx->file_data, i)))
+ goto out_fput;
+ continue;
+ }
+
+ file = fget(fd);
+ ret = -EBADF;
+ if (unlikely(!file))
+ goto out_fput;
+
+ /*
+ * Don't allow io_uring instances to be registered. If UNIX
+ * isn't enabled, then this causes a reference cycle and this
+ * instance can never get freed. If UNIX is enabled we'll
+ * handle it just fine, but there's still no point in allowing
+ * a ring fd as it doesn't support regular read/write anyway.
+ */
+ if (file->f_op == &io_uring_fops) {
+ fput(file);
+ goto out_fput;
+ }
+ io_fixed_file_set(io_fixed_file_slot(&ctx->file_table, i), file);
+ }
+
+ ret = io_sqe_files_scm(ctx);
+ if (ret) {
+ __io_sqe_files_unregister(ctx);
+ return ret;
+ }
+
+ io_rsrc_node_switch(ctx, NULL);
+ return ret;
+out_fput:
+ for (i = 0; i < ctx->nr_user_files; i++) {
+ file = io_file_from_index(ctx, i);
+ if (file)
+ fput(file);
+ }
+ io_free_file_tables(&ctx->file_table);
+ ctx->nr_user_files = 0;
+out_free:
+ io_rsrc_data_free(ctx->file_data);
+ ctx->file_data = NULL;
+ return ret;
+}
+
+static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file,
+ int index)
+{
+#if defined(CONFIG_UNIX)
+ struct sock *sock = ctx->ring_sock->sk;
+ struct sk_buff_head *head = &sock->sk_receive_queue;
+ struct sk_buff *skb;
+
+ /*
+ * See if we can merge this file into an existing skb SCM_RIGHTS
+ * file set. If there's no room, fall back to allocating a new skb
+ * and filling it in.
+ */
+ spin_lock_irq(&head->lock);
+ skb = skb_peek(head);
+ if (skb) {
+ struct scm_fp_list *fpl = UNIXCB(skb).fp;
+
+ if (fpl->count < SCM_MAX_FD) {
+ __skb_unlink(skb, head);
+ spin_unlock_irq(&head->lock);
+ fpl->fp[fpl->count] = get_file(file);
+ unix_inflight(fpl->user, fpl->fp[fpl->count]);
+ fpl->count++;
+ spin_lock_irq(&head->lock);
+ __skb_queue_head(head, skb);
+ } else {
+ skb = NULL;
+ }
+ }
+ spin_unlock_irq(&head->lock);
+
+ if (skb) {
+ fput(file);
+ return 0;
+ }
+
+ return __io_sqe_files_scm(ctx, 1, index);
+#else
+ return 0;
+#endif
+}
+
+static int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
+ struct io_rsrc_node *node, void *rsrc)
+{
+ u64 *tag_slot = io_get_tag_slot(data, idx);
+ struct io_rsrc_put *prsrc;
+
+ prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL);
+ if (!prsrc)
+ return -ENOMEM;
+
+ prsrc->tag = *tag_slot;
+ *tag_slot = 0;
+ prsrc->rsrc = rsrc;
+ list_add(&prsrc->list, &node->rsrc_list);
+ return 0;
+}
+
+static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
+ unsigned int issue_flags, u32 slot_index)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+ bool needs_switch = false;
+ struct io_fixed_file *file_slot;
+ int ret = -EBADF;
+
+ io_ring_submit_lock(ctx, needs_lock);
+ if (file->f_op == &io_uring_fops)
+ goto err;
+ ret = -ENXIO;
+ if (!ctx->file_data)
+ goto err;
+ ret = -EINVAL;
+ if (slot_index >= ctx->nr_user_files)
+ goto err;
+
+ slot_index = array_index_nospec(slot_index, ctx->nr_user_files);
+ file_slot = io_fixed_file_slot(&ctx->file_table, slot_index);
+
+ if (file_slot->file_ptr) {
+ struct file *old_file;
+
+ ret = io_rsrc_node_switch_start(ctx);
+ if (ret)
+ goto err;
+
+ old_file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+ ret = io_queue_rsrc_removal(ctx->file_data, slot_index,
+ ctx->rsrc_node, old_file);
+ if (ret)
+ goto err;
+ file_slot->file_ptr = 0;
+ needs_switch = true;
+ }
+
+ *io_get_tag_slot(ctx->file_data, slot_index) = 0;
+ io_fixed_file_set(file_slot, file);
+ ret = io_sqe_file_register(ctx, file, slot_index);
+ if (ret) {
+ file_slot->file_ptr = 0;
+ goto err;
+ }
+
+ ret = 0;
+err:
+ if (needs_switch)
+ io_rsrc_node_switch(ctx, ctx->file_data);
+ io_ring_submit_unlock(ctx, needs_lock);
+ if (ret)
+ fput(file);
+ return ret;
+}
+
+static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags)
+{
+ unsigned int offset = req->close.file_slot - 1;
+ struct io_ring_ctx *ctx = req->ctx;
+ bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;
+ struct io_fixed_file *file_slot;
+ struct file *file;
+ int ret;
+
+ io_ring_submit_lock(ctx, needs_lock);
+ ret = -ENXIO;
+ if (unlikely(!ctx->file_data))
+ goto out;
+ ret = -EINVAL;
+ if (offset >= ctx->nr_user_files)
+ goto out;
+ ret = io_rsrc_node_switch_start(ctx);
+ if (ret)
+ goto out;
+
+ offset = array_index_nospec(offset, ctx->nr_user_files);
+ file_slot = io_fixed_file_slot(&ctx->file_table, offset);
+ ret = -EBADF;
+ if (!file_slot->file_ptr)
+ goto out;
+
+ file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+ ret = io_queue_rsrc_removal(ctx->file_data, offset, ctx->rsrc_node, file);
+ if (ret)
+ goto out;
+
+ file_slot->file_ptr = 0;
+ io_rsrc_node_switch(ctx, ctx->file_data);
+ ret = 0;
+out:
+ io_ring_submit_unlock(ctx, needs_lock);
+ return ret;
+}
+
+static int __io_sqe_files_update(struct io_ring_ctx *ctx,
+ struct io_uring_rsrc_update2 *up,
+ unsigned nr_args)
+{
+ u64 __user *tags = u64_to_user_ptr(up->tags);
+ __s32 __user *fds = u64_to_user_ptr(up->data);
+ struct io_rsrc_data *data = ctx->file_data;
+ struct io_fixed_file *file_slot;
+ struct file *file;
+ int fd, i, err = 0;
+ unsigned int done;
+ bool needs_switch = false;
+
+ if (!ctx->file_data)
+ return -ENXIO;
+ if (up->offset + nr_args > ctx->nr_user_files)
+ return -EINVAL;
+
+ for (done = 0; done < nr_args; done++) {
+ u64 tag = 0;
+
+ if ((tags && copy_from_user(&tag, &tags[done], sizeof(tag))) ||
+ copy_from_user(&fd, &fds[done], sizeof(fd))) {
+ err = -EFAULT;
+ break;
+ }
+ if ((fd == IORING_REGISTER_FILES_SKIP || fd == -1) && tag) {
+ err = -EINVAL;
+ break;
+ }
+ if (fd == IORING_REGISTER_FILES_SKIP)
+ continue;
+
+ i = array_index_nospec(up->offset + done, ctx->nr_user_files);
+ file_slot = io_fixed_file_slot(&ctx->file_table, i);
+
+ if (file_slot->file_ptr) {
+ file = (struct file *)(file_slot->file_ptr & FFS_MASK);
+ err = io_queue_rsrc_removal(data, i, ctx->rsrc_node, file);
+ if (err)
+ break;
+ file_slot->file_ptr = 0;
+ needs_switch = true;
+ }
+ if (fd != -1) {
+ file = fget(fd);
+ if (!file) {
+ err = -EBADF;
+ break;
+ }
+ /*
+ * Don't allow io_uring instances to be registered. If
+ * UNIX isn't enabled, then this causes a reference
+ * cycle and this instance can never get freed. If UNIX
+ * is enabled we'll handle it just fine, but there's
+ * still no point in allowing a ring fd as it doesn't
+ * support regular read/write anyway.
+ */
+ if (file->f_op == &io_uring_fops) {
+ fput(file);
+ err = -EBADF;
+ break;
+ }
+ *io_get_tag_slot(data, i) = tag;
+ io_fixed_file_set(file_slot, file);
+ err = io_sqe_file_register(ctx, file, i);
+ if (err) {
+ file_slot->file_ptr = 0;
+ fput(file);
+ break;
+ }
+ }
+ }
+
+ if (needs_switch)
+ io_rsrc_node_switch(ctx, data);
+ return done ? done : err;
+}
+
+static struct io_wq *io_init_wq_offload(struct io_ring_ctx *ctx,
+ struct task_struct *task)
+{
+ struct io_wq_hash *hash;
+ struct io_wq_data data;
+ unsigned int concurrency;
+
+ mutex_lock(&ctx->uring_lock);
+ hash = ctx->hash_map;
+ if (!hash) {
+ hash = kzalloc(sizeof(*hash), GFP_KERNEL);
+ if (!hash) {
+ mutex_unlock(&ctx->uring_lock);
+ return ERR_PTR(-ENOMEM);
+ }
+ refcount_set(&hash->refs, 1);
+ init_waitqueue_head(&hash->wait);
+ ctx->hash_map = hash;
+ }
+ mutex_unlock(&ctx->uring_lock);
+
+ data.hash = hash;
+ data.task = task;
+ data.free_work = io_wq_free_work;
+ data.do_work = io_wq_submit_work;
+
+ /* Do QD, or 4 * CPUS, whatever is smallest */
+ concurrency = min(ctx->sq_entries, 4 * num_online_cpus());
+
+ return io_wq_create(concurrency, &data);
+}
+
+static __cold int io_uring_alloc_task_context(struct task_struct *task,
+ struct io_ring_ctx *ctx)
+{
+ struct io_uring_task *tctx;
+ int ret;
+
+ tctx = kzalloc(sizeof(*tctx), GFP_KERNEL);
+ if (unlikely(!tctx))
+ return -ENOMEM;
+
+ tctx->registered_rings = kcalloc(IO_RINGFD_REG_MAX,
+ sizeof(struct file *), GFP_KERNEL);
+ if (unlikely(!tctx->registered_rings)) {
+ kfree(tctx);
+ return -ENOMEM;
+ }
+
+ ret = percpu_counter_init(&tctx->inflight, 0, GFP_KERNEL);
+ if (unlikely(ret)) {
+ kfree(tctx->registered_rings);
+ kfree(tctx);
+ return ret;
+ }
+
+ tctx->io_wq = io_init_wq_offload(ctx, task);
+ if (IS_ERR(tctx->io_wq)) {
+ ret = PTR_ERR(tctx->io_wq);
+ percpu_counter_destroy(&tctx->inflight);
+ kfree(tctx->registered_rings);
+ kfree(tctx);
+ return ret;
+ }
+
+ xa_init(&tctx->xa);
+ init_waitqueue_head(&tctx->wait);
+ atomic_set(&tctx->in_idle, 0);
+ atomic_set(&tctx->inflight_tracked, 0);
+ task->io_uring = tctx;
+ spin_lock_init(&tctx->task_lock);
+ INIT_WQ_LIST(&tctx->task_list);
+ INIT_WQ_LIST(&tctx->prior_task_list);
+ init_task_work(&tctx->task_work, tctx_task_work);
+ return 0;
+}
+
+void __io_uring_free(struct task_struct *tsk)
+{
+ struct io_uring_task *tctx = tsk->io_uring;
+
+ WARN_ON_ONCE(!xa_empty(&tctx->xa));
+ WARN_ON_ONCE(tctx->io_wq);
+ WARN_ON_ONCE(tctx->cached_refs);
+
+ kfree(tctx->registered_rings);
+ percpu_counter_destroy(&tctx->inflight);
+ kfree(tctx);
+ tsk->io_uring = NULL;
+}
+
+static __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
+ struct io_uring_params *p)
+{
+ int ret;
+
+ /* Retain compatibility with failing for an invalid attach attempt */
+ if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
+ IORING_SETUP_ATTACH_WQ) {
+ struct fd f;
+
+ f = fdget(p->wq_fd);
+ if (!f.file)
+ return -ENXIO;
+ if (f.file->f_op != &io_uring_fops) {
+ fdput(f);
+ return -EINVAL;
+ }
+ fdput(f);
+ }
+ if (ctx->flags & IORING_SETUP_SQPOLL) {
+ struct task_struct *tsk;
+ struct io_sq_data *sqd;
+ bool attached;
+
+ ret = security_uring_sqpoll();
+ if (ret)
+ return ret;
+
+ sqd = io_get_sq_data(p, &attached);
+ if (IS_ERR(sqd)) {
+ ret = PTR_ERR(sqd);
+ goto err;
+ }
+
+ ctx->sq_creds = get_current_cred();
+ ctx->sq_data = sqd;
+ ctx->sq_thread_idle = msecs_to_jiffies(p->sq_thread_idle);
+ if (!ctx->sq_thread_idle)
+ ctx->sq_thread_idle = HZ;
+
+ io_sq_thread_park(sqd);
+ list_add(&ctx->sqd_list, &sqd->ctx_list);
+ io_sqd_update_thread_idle(sqd);
+ /* don't attach to a dying SQPOLL thread, would be racy */
+ ret = (attached && !sqd->thread) ? -ENXIO : 0;
+ io_sq_thread_unpark(sqd);
+
+ if (ret < 0)
+ goto err;
+ if (attached)
+ return 0;
+
+ if (p->flags & IORING_SETUP_SQ_AFF) {
+ int cpu = p->sq_thread_cpu;
+
+ ret = -EINVAL;
+ if (cpu >= nr_cpu_ids || !cpu_online(cpu))
+ goto err_sqpoll;
+ sqd->sq_cpu = cpu;
+ } else {
+ sqd->sq_cpu = -1;
+ }
+
+ sqd->task_pid = current->pid;
+ sqd->task_tgid = current->tgid;
+ tsk = create_io_thread(io_sq_thread, sqd, NUMA_NO_NODE);
+ if (IS_ERR(tsk)) {
+ ret = PTR_ERR(tsk);
+ goto err_sqpoll;
+ }
+
+ sqd->thread = tsk;
+ ret = io_uring_alloc_task_context(tsk, ctx);
+ wake_up_new_task(tsk);
+ if (ret)
+ goto err;
+ } else if (p->flags & IORING_SETUP_SQ_AFF) {
+ /* Can't have SQ_AFF without SQPOLL */
+ ret = -EINVAL;
+ goto err;
+ }
+
+ return 0;
+err_sqpoll:
+ complete(&ctx->sq_data->exited);
+err:
+ io_sq_thread_finish(ctx);
+ return ret;
+}
+
+static inline void __io_unaccount_mem(struct user_struct *user,
+ unsigned long nr_pages)
+{
+ atomic_long_sub(nr_pages, &user->locked_vm);
+}
+
+static inline int __io_account_mem(struct user_struct *user,
+ unsigned long nr_pages)
+{
+ unsigned long page_limit, cur_pages, new_pages;
+
+ /* Don't allow more pages than we can safely lock */
+ page_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+
+ do {
+ cur_pages = atomic_long_read(&user->locked_vm);
+ new_pages = cur_pages + nr_pages;
+ if (new_pages > page_limit)
+ return -ENOMEM;
+ } while (atomic_long_cmpxchg(&user->locked_vm, cur_pages,
+ new_pages) != cur_pages);
+
+ return 0;
+}
+
+static void io_unaccount_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
+{
+ if (ctx->user)
+ __io_unaccount_mem(ctx->user, nr_pages);
+
+ if (ctx->mm_account)
+ atomic64_sub(nr_pages, &ctx->mm_account->pinned_vm);
+}
+
+static int io_account_mem(struct io_ring_ctx *ctx, unsigned long nr_pages)
+{
+ int ret;
+
+ if (ctx->user) {
+ ret = __io_account_mem(ctx->user, nr_pages);
+ if (ret)
+ return ret;
+ }
+
+ if (ctx->mm_account)
+ atomic64_add(nr_pages, &ctx->mm_account->pinned_vm);
+
+ return 0;
+}
+
+static void io_mem_free(void *ptr)
+{
+ struct page *page;
+
+ if (!ptr)
+ return;
+
+ page = virt_to_head_page(ptr);
+ if (put_page_testzero(page))
+ free_compound_page(page);
+}
+
+static void *io_mem_alloc(size_t size)
+{
+ gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
+
+ return (void *) __get_free_pages(gfp, get_order(size));
+}
+
+static unsigned long rings_size(unsigned sq_entries, unsigned cq_entries,
+ size_t *sq_offset)
+{
+ struct io_rings *rings;
+ size_t off, sq_array_size;
+
+ off = struct_size(rings, cqes, cq_entries);
+ if (off == SIZE_MAX)
+ return SIZE_MAX;
+
+#ifdef CONFIG_SMP
+ off = ALIGN(off, SMP_CACHE_BYTES);
+ if (off == 0)
+ return SIZE_MAX;
+#endif
+
+ if (sq_offset)
+ *sq_offset = off;
+
+ sq_array_size = array_size(sizeof(u32), sq_entries);
+ if (sq_array_size == SIZE_MAX)
+ return SIZE_MAX;
+
+ if (check_add_overflow(off, sq_array_size, &off))
+ return SIZE_MAX;
+
+ return off;
+}
+
+static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf **slot)
+{
+ struct io_mapped_ubuf *imu = *slot;
+ unsigned int i;
+
+ if (imu != ctx->dummy_ubuf) {
+ for (i = 0; i < imu->nr_bvecs; i++)
+ unpin_user_page(imu->bvec[i].bv_page);
+ if (imu->acct_pages)
+ io_unaccount_mem(ctx, imu->acct_pages);
+ kvfree(imu);
+ }
+ *slot = NULL;
+}
+
+static void io_rsrc_buf_put(struct io_ring_ctx *ctx, struct io_rsrc_put *prsrc)
+{
+ io_buffer_unmap(ctx, &prsrc->buf);
+ prsrc->buf = NULL;
+}
+
+static void __io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
+{
+ unsigned int i;
+
+ for (i = 0; i < ctx->nr_user_bufs; i++)
+ io_buffer_unmap(ctx, &ctx->user_bufs[i]);
+ kfree(ctx->user_bufs);
+ io_rsrc_data_free(ctx->buf_data);
+ ctx->user_bufs = NULL;
+ ctx->buf_data = NULL;
+ ctx->nr_user_bufs = 0;
+}
+
+static int io_sqe_buffers_unregister(struct io_ring_ctx *ctx)
+{
+ unsigned nr = ctx->nr_user_bufs;
+ int ret;
+
+ if (!ctx->buf_data)
+ return -ENXIO;
+
+ /*
+ * Quiesce may unlock ->uring_lock, and while it's not held
+ * prevent new requests using the table.
+ */
+ ctx->nr_user_bufs = 0;
+ ret = io_rsrc_ref_quiesce(ctx->buf_data, ctx);
+ ctx->nr_user_bufs = nr;
+ if (!ret)
+ __io_sqe_buffers_unregister(ctx);
+ return ret;
+}
+
+static int io_copy_iov(struct io_ring_ctx *ctx, struct iovec *dst,
+ void __user *arg, unsigned index)
+{
+ struct iovec __user *src;
+
+#ifdef CONFIG_COMPAT
+ if (ctx->compat) {
+ struct compat_iovec __user *ciovs;
+ struct compat_iovec ciov;
+
+ ciovs = (struct compat_iovec __user *) arg;
+ if (copy_from_user(&ciov, &ciovs[index], sizeof(ciov)))
+ return -EFAULT;
+
+ dst->iov_base = u64_to_user_ptr((u64)ciov.iov_base);
+ dst->iov_len = ciov.iov_len;
+ return 0;
+ }
+#endif
+ src = (struct iovec __user *) arg;
+ if (copy_from_user(dst, &src[index], sizeof(*dst)))
+ return -EFAULT;
+ return 0;
+}
+
+/*
+ * Not super efficient, but this is just a registration time. And we do cache
+ * the last compound head, so generally we'll only do a full search if we don't
+ * match that one.
+ *
+ * We check if the given compound head page has already been accounted, to
+ * avoid double accounting it. This allows us to account the full size of the
+ * page, not just the constituent pages of a huge page.
+ */
+static bool headpage_already_acct(struct io_ring_ctx *ctx, struct page **pages,
+ int nr_pages, struct page *hpage)
+{
+ int i, j;
+
+ /* check current page array */
+ for (i = 0; i < nr_pages; i++) {
+ if (!PageCompound(pages[i]))
+ continue;
+ if (compound_head(pages[i]) == hpage)
+ return true;
+ }
+
+ /* check previously registered pages */
+ for (i = 0; i < ctx->nr_user_bufs; i++) {
+ struct io_mapped_ubuf *imu = ctx->user_bufs[i];
+
+ for (j = 0; j < imu->nr_bvecs; j++) {
+ if (!PageCompound(imu->bvec[j].bv_page))
+ continue;
+ if (compound_head(imu->bvec[j].bv_page) == hpage)
+ return true;
+ }
+ }
+
+ return false;
+}
+
+static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
+ int nr_pages, struct io_mapped_ubuf *imu,
+ struct page **last_hpage)
+{
+ int i, ret;
+
+ imu->acct_pages = 0;
+ for (i = 0; i < nr_pages; i++) {
+ if (!PageCompound(pages[i])) {
+ imu->acct_pages++;
+ } else {
+ struct page *hpage;
+
+ hpage = compound_head(pages[i]);
+ if (hpage == *last_hpage)
+ continue;
+ *last_hpage = hpage;
+ if (headpage_already_acct(ctx, pages, i, hpage))
+ continue;
+ imu->acct_pages += page_size(hpage) >> PAGE_SHIFT;
+ }
+ }
+
+ if (!imu->acct_pages)
+ return 0;
+
+ ret = io_account_mem(ctx, imu->acct_pages);
+ if (ret)
+ imu->acct_pages = 0;
+ return ret;
+}
+
+static int io_sqe_buffer_register(struct io_ring_ctx *ctx, struct iovec *iov,
+ struct io_mapped_ubuf **pimu,
+ struct page **last_hpage)
+{
+ struct io_mapped_ubuf *imu = NULL;
+ struct vm_area_struct **vmas = NULL;
+ struct page **pages = NULL;
+ unsigned long off, start, end, ubuf;
+ size_t size;
+ int ret, pret, nr_pages, i;
+
+ if (!iov->iov_base) {
+ *pimu = ctx->dummy_ubuf;
+ return 0;
+ }
+
+ ubuf = (unsigned long) iov->iov_base;
+ end = (ubuf + iov->iov_len + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ start = ubuf >> PAGE_SHIFT;
+ nr_pages = end - start;
+
+ *pimu = NULL;
+ ret = -ENOMEM;
+
+ pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL);
+ if (!pages)
+ goto done;
+
+ vmas = kvmalloc_array(nr_pages, sizeof(struct vm_area_struct *),
+ GFP_KERNEL);
+ if (!vmas)
+ goto done;
+
+ imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL);
+ if (!imu)
+ goto done;
+
+ ret = 0;
+ mmap_read_lock(current->mm);
+ pret = pin_user_pages(ubuf, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
+ pages, vmas);
+ if (pret == nr_pages) {
+ /* don't support file backed memory */
+ for (i = 0; i < nr_pages; i++) {
+ struct vm_area_struct *vma = vmas[i];
+
+ if (vma_is_shmem(vma))
+ continue;
+ if (vma->vm_file &&
+ !is_file_hugepages(vma->vm_file)) {
+ ret = -EOPNOTSUPP;
+ break;
+ }
+ }
+ } else {
+ ret = pret < 0 ? pret : -EFAULT;
+ }
+ mmap_read_unlock(current->mm);
+ if (ret) {
+ /*
+ * if we did partial map, or found file backed vmas,
+ * release any pages we did get
+ */
+ if (pret > 0)
+ unpin_user_pages(pages, pret);
+ goto done;
+ }
+
+ ret = io_buffer_account_pin(ctx, pages, pret, imu, last_hpage);
+ if (ret) {
+ unpin_user_pages(pages, pret);
+ goto done;
+ }
+
+ off = ubuf & ~PAGE_MASK;
+ size = iov->iov_len;
+ for (i = 0; i < nr_pages; i++) {
+ size_t vec_len;
+
+ vec_len = min_t(size_t, size, PAGE_SIZE - off);
+ imu->bvec[i].bv_page = pages[i];
+ imu->bvec[i].bv_len = vec_len;
+ imu->bvec[i].bv_offset = off;
+ off = 0;
+ size -= vec_len;
+ }
+ /* store original address for later verification */
+ imu->ubuf = ubuf;
+ imu->ubuf_end = ubuf + iov->iov_len;
+ imu->nr_bvecs = nr_pages;
+ *pimu = imu;
+ ret = 0;
+done:
+ if (ret)
+ kvfree(imu);
+ kvfree(pages);
+ kvfree(vmas);
+ return ret;
+}
+
+static int io_buffers_map_alloc(struct io_ring_ctx *ctx, unsigned int nr_args)
+{
+ ctx->user_bufs = kcalloc(nr_args, sizeof(*ctx->user_bufs), GFP_KERNEL);
+ return ctx->user_bufs ? 0 : -ENOMEM;
+}
+
+static int io_buffer_validate(struct iovec *iov)
+{
+ unsigned long tmp, acct_len = iov->iov_len + (PAGE_SIZE - 1);
+
+ /*
+ * Don't impose further limits on the size and buffer
+ * constraints here, we'll -EINVAL later when IO is
+ * submitted if they are wrong.
+ */
+ if (!iov->iov_base)
+ return iov->iov_len ? -EFAULT : 0;
+ if (!iov->iov_len)
+ return -EFAULT;
+
+ /* arbitrary limit, but we need something */
+ if (iov->iov_len > SZ_1G)
+ return -EFAULT;
+
+ if (check_add_overflow((unsigned long)iov->iov_base, acct_len, &tmp))
+ return -EOVERFLOW;
+
+ return 0;
+}
+
+static int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned int nr_args, u64 __user *tags)
+{
+ struct page *last_hpage = NULL;
+ struct io_rsrc_data *data;
+ int i, ret;
+ struct iovec iov;
+
+ if (ctx->user_bufs)
+ return -EBUSY;
+ if (!nr_args || nr_args > IORING_MAX_REG_BUFFERS)
+ return -EINVAL;
+ ret = io_rsrc_node_switch_start(ctx);
+ if (ret)
+ return ret;
+ ret = io_rsrc_data_alloc(ctx, io_rsrc_buf_put, tags, nr_args, &data);
+ if (ret)
+ return ret;
+ ret = io_buffers_map_alloc(ctx, nr_args);
+ if (ret) {
+ io_rsrc_data_free(data);
+ return ret;
+ }
+
+ for (i = 0; i < nr_args; i++, ctx->nr_user_bufs++) {
+ ret = io_copy_iov(ctx, &iov, arg, i);
+ if (ret)
+ break;
+ ret = io_buffer_validate(&iov);
+ if (ret)
+ break;
+ if (!iov.iov_base && *io_get_tag_slot(data, i)) {
+ ret = -EINVAL;
+ break;
+ }
+
+ ret = io_sqe_buffer_register(ctx, &iov, &ctx->user_bufs[i],
+ &last_hpage);
+ if (ret)
+ break;
+ }
+
+ WARN_ON_ONCE(ctx->buf_data);
+
+ ctx->buf_data = data;
+ if (ret)
+ __io_sqe_buffers_unregister(ctx);
+ else
+ io_rsrc_node_switch(ctx, NULL);
+ return ret;
+}
+
+static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
+ struct io_uring_rsrc_update2 *up,
+ unsigned int nr_args)
+{
+ u64 __user *tags = u64_to_user_ptr(up->tags);
+ struct iovec iov, __user *iovs = u64_to_user_ptr(up->data);
+ struct page *last_hpage = NULL;
+ bool needs_switch = false;
+ __u32 done;
+ int i, err;
+
+ if (!ctx->buf_data)
+ return -ENXIO;
+ if (up->offset + nr_args > ctx->nr_user_bufs)
+ return -EINVAL;
+
+ for (done = 0; done < nr_args; done++) {
+ struct io_mapped_ubuf *imu;
+ int offset = up->offset + done;
+ u64 tag = 0;
+
+ err = io_copy_iov(ctx, &iov, iovs, done);
+ if (err)
+ break;
+ if (tags && copy_from_user(&tag, &tags[done], sizeof(tag))) {
+ err = -EFAULT;
+ break;
+ }
+ err = io_buffer_validate(&iov);
+ if (err)
+ break;
+ if (!iov.iov_base && tag) {
+ err = -EINVAL;
+ break;
+ }
+ err = io_sqe_buffer_register(ctx, &iov, &imu, &last_hpage);
+ if (err)
+ break;
+
+ i = array_index_nospec(offset, ctx->nr_user_bufs);
+ if (ctx->user_bufs[i] != ctx->dummy_ubuf) {
+ err = io_queue_rsrc_removal(ctx->buf_data, i,
+ ctx->rsrc_node, ctx->user_bufs[i]);
+ if (unlikely(err)) {
+ io_buffer_unmap(ctx, &imu);
+ break;
+ }
+ ctx->user_bufs[i] = NULL;
+ needs_switch = true;
+ }
+
+ ctx->user_bufs[i] = imu;
+ *io_get_tag_slot(ctx->buf_data, offset) = tag;
+ }
+
+ if (needs_switch)
+ io_rsrc_node_switch(ctx, ctx->buf_data);
+ return done ? done : err;
+}
+
+static int io_eventfd_register(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned int eventfd_async)
+{
+ struct io_ev_fd *ev_fd;
+ __s32 __user *fds = arg;
+ int fd;
+
+ ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
+ lockdep_is_held(&ctx->uring_lock));
+ if (ev_fd)
+ return -EBUSY;
+
+ if (copy_from_user(&fd, fds, sizeof(*fds)))
+ return -EFAULT;
+
+ ev_fd = kmalloc(sizeof(*ev_fd), GFP_KERNEL);
+ if (!ev_fd)
+ return -ENOMEM;
+
+ ev_fd->cq_ev_fd = eventfd_ctx_fdget(fd);
+ if (IS_ERR(ev_fd->cq_ev_fd)) {
+ int ret = PTR_ERR(ev_fd->cq_ev_fd);
+ kfree(ev_fd);
+ return ret;
+ }
+ ev_fd->eventfd_async = eventfd_async;
+ ctx->has_evfd = true;
+ rcu_assign_pointer(ctx->io_ev_fd, ev_fd);
+ return 0;
+}
+
+static void io_eventfd_put(struct rcu_head *rcu)
+{
+ struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu);
+
+ eventfd_ctx_put(ev_fd->cq_ev_fd);
+ kfree(ev_fd);
+}
+
+static int io_eventfd_unregister(struct io_ring_ctx *ctx)
+{
+ struct io_ev_fd *ev_fd;
+
+ ev_fd = rcu_dereference_protected(ctx->io_ev_fd,
+ lockdep_is_held(&ctx->uring_lock));
+ if (ev_fd) {
+ ctx->has_evfd = false;
+ rcu_assign_pointer(ctx->io_ev_fd, NULL);
+ call_rcu(&ev_fd->rcu, io_eventfd_put);
+ return 0;
+ }
+
+ return -ENXIO;
+}
+
+static void io_destroy_buffers(struct io_ring_ctx *ctx)
+{
+ int i;
+
+ for (i = 0; i < (1U << IO_BUFFERS_HASH_BITS); i++) {
+ struct list_head *list = &ctx->io_buffers[i];
+
+ while (!list_empty(list)) {
+ struct io_buffer_list *bl;
+
+ bl = list_first_entry(list, struct io_buffer_list, list);
+ __io_remove_buffers(ctx, bl, -1U);
+ list_del(&bl->list);
+ kfree(bl);
+ }
+ }
+
+ while (!list_empty(&ctx->io_buffers_pages)) {
+ struct page *page;
+
+ page = list_first_entry(&ctx->io_buffers_pages, struct page, lru);
+ list_del_init(&page->lru);
+ __free_page(page);
+ }
+}
+
+static void io_req_caches_free(struct io_ring_ctx *ctx)
+{
+ struct io_submit_state *state = &ctx->submit_state;
+ int nr = 0;
+
+ mutex_lock(&ctx->uring_lock);
+ io_flush_cached_locked_reqs(ctx, state);
+
+ while (state->free_list.next) {
+ struct io_wq_work_node *node;
+ struct io_kiocb *req;
+
+ node = wq_stack_extract(&state->free_list);
+ req = container_of(node, struct io_kiocb, comp_list);
+ kmem_cache_free(req_cachep, req);
+ nr++;
+ }
+ if (nr)
+ percpu_ref_put_many(&ctx->refs, nr);
+ mutex_unlock(&ctx->uring_lock);
+}
+
+static void io_wait_rsrc_data(struct io_rsrc_data *data)
+{
+ if (data && !atomic_dec_and_test(&data->refs))
+ wait_for_completion(&data->done);
+}
+
+static void io_flush_apoll_cache(struct io_ring_ctx *ctx)
+{
+ struct async_poll *apoll;
+
+ while (!list_empty(&ctx->apoll_cache)) {
+ apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
+ poll.wait.entry);
+ list_del(&apoll->poll.wait.entry);
+ kfree(apoll);
+ }
+}
+
+static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
+{
+ io_sq_thread_finish(ctx);
+
+ if (ctx->mm_account) {
+ mmdrop(ctx->mm_account);
+ ctx->mm_account = NULL;
+ }
+
+ io_rsrc_refs_drop(ctx);
+ /* __io_rsrc_put_work() may need uring_lock to progress, wait w/o it */
+ io_wait_rsrc_data(ctx->buf_data);
+ io_wait_rsrc_data(ctx->file_data);
+
+ mutex_lock(&ctx->uring_lock);
+ if (ctx->buf_data)
+ __io_sqe_buffers_unregister(ctx);
+ if (ctx->file_data)
+ __io_sqe_files_unregister(ctx);
+ if (ctx->rings)
+ __io_cqring_overflow_flush(ctx, true);
+ io_eventfd_unregister(ctx);
+ io_flush_apoll_cache(ctx);
+ mutex_unlock(&ctx->uring_lock);
+ io_destroy_buffers(ctx);
+ if (ctx->sq_creds)
+ put_cred(ctx->sq_creds);
+
+ /* there are no registered resources left, nobody uses it */
+ if (ctx->rsrc_node)
+ io_rsrc_node_destroy(ctx->rsrc_node);
+ if (ctx->rsrc_backup_node)
+ io_rsrc_node_destroy(ctx->rsrc_backup_node);
+ flush_delayed_work(&ctx->rsrc_put_work);
+ flush_delayed_work(&ctx->fallback_work);
+
+ WARN_ON_ONCE(!list_empty(&ctx->rsrc_ref_list));
+ WARN_ON_ONCE(!llist_empty(&ctx->rsrc_put_llist));
+
+#if defined(CONFIG_UNIX)
+ if (ctx->ring_sock) {
+ ctx->ring_sock->file = NULL; /* so that iput() is called */
+ sock_release(ctx->ring_sock);
+ }
+#endif
+ WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
+
+ io_mem_free(ctx->rings);
+ io_mem_free(ctx->sq_sqes);
+
+ percpu_ref_exit(&ctx->refs);
+ free_uid(ctx->user);
+ io_req_caches_free(ctx);
+ if (ctx->hash_map)
+ io_wq_put_hash(ctx->hash_map);
+ kfree(ctx->cancel_hash);
+ kfree(ctx->dummy_ubuf);
+ kfree(ctx->io_buffers);
+ kfree(ctx);
+}
+
+static __poll_t io_uring_poll(struct file *file, poll_table *wait)
+{
+ struct io_ring_ctx *ctx = file->private_data;
+ __poll_t mask = 0;
+
+ poll_wait(file, &ctx->cq_wait, wait);
+ /*
+ * synchronizes with barrier from wq_has_sleeper call in
+ * io_commit_cqring
+ */
+ smp_rmb();
+ if (!io_sqring_full(ctx))
+ mask |= EPOLLOUT | EPOLLWRNORM;
+
+ /*
+ * Don't flush cqring overflow list here, just do a simple check.
+ * Otherwise there could possible be ABBA deadlock:
+ * CPU0 CPU1
+ * ---- ----
+ * lock(&ctx->uring_lock);
+ * lock(&ep->mtx);
+ * lock(&ctx->uring_lock);
+ * lock(&ep->mtx);
+ *
+ * Users may get EPOLLIN meanwhile seeing nothing in cqring, this
+ * pushs them to do the flush.
+ */
+ if (io_cqring_events(ctx) || test_bit(0, &ctx->check_cq_overflow))
+ mask |= EPOLLIN | EPOLLRDNORM;
+
+ return mask;
+}
+
+static int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id)
+{
+ const struct cred *creds;
+
+ creds = xa_erase(&ctx->personalities, id);
+ if (creds) {
+ put_cred(creds);
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
+struct io_tctx_exit {
+ struct callback_head task_work;
+ struct completion completion;
+ struct io_ring_ctx *ctx;
+};
+
+static __cold void io_tctx_exit_cb(struct callback_head *cb)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ struct io_tctx_exit *work;
+
+ work = container_of(cb, struct io_tctx_exit, task_work);
+ /*
+ * When @in_idle, we're in cancellation and it's racy to remove the
+ * node. It'll be removed by the end of cancellation, just ignore it.
+ */
+ if (!atomic_read(&tctx->in_idle))
+ io_uring_del_tctx_node((unsigned long)work->ctx);
+ complete(&work->completion);
+}
+
+static __cold bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
+{
+ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+
+ return req->ctx == data;
+}
+
+static __cold void io_ring_exit_work(struct work_struct *work)
+{
+ struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
+ unsigned long timeout = jiffies + HZ * 60 * 5;
+ unsigned long interval = HZ / 20;
+ struct io_tctx_exit exit;
+ struct io_tctx_node *node;
+ int ret;
+
+ /*
+ * If we're doing polled IO and end up having requests being
+ * submitted async (out-of-line), then completions can come in while
+ * we're waiting for refs to drop. We need to reap these manually,
+ * as nobody else will be looking for them.
+ */
+ do {
+ io_uring_try_cancel_requests(ctx, NULL, true);
+ if (ctx->sq_data) {
+ struct io_sq_data *sqd = ctx->sq_data;
+ struct task_struct *tsk;
+
+ io_sq_thread_park(sqd);
+ tsk = sqd->thread;
+ if (tsk && tsk->io_uring && tsk->io_uring->io_wq)
+ io_wq_cancel_cb(tsk->io_uring->io_wq,
+ io_cancel_ctx_cb, ctx, true);
+ io_sq_thread_unpark(sqd);
+ }
+
+ io_req_caches_free(ctx);
+
+ if (WARN_ON_ONCE(time_after(jiffies, timeout))) {
+ /* there is little hope left, don't run it too often */
+ interval = HZ * 60;
+ }
+ } while (!wait_for_completion_timeout(&ctx->ref_comp, interval));
+
+ init_completion(&exit.completion);
+ init_task_work(&exit.task_work, io_tctx_exit_cb);
+ exit.ctx = ctx;
+ /*
+ * Some may use context even when all refs and requests have been put,
+ * and they are free to do so while still holding uring_lock or
+ * completion_lock, see io_req_task_submit(). Apart from other work,
+ * this lock/unlock section also waits them to finish.
+ */
+ mutex_lock(&ctx->uring_lock);
+ while (!list_empty(&ctx->tctx_list)) {
+ WARN_ON_ONCE(time_after(jiffies, timeout));
+
+ node = list_first_entry(&ctx->tctx_list, struct io_tctx_node,
+ ctx_node);
+ /* don't spin on a single task if cancellation failed */
+ list_rotate_left(&ctx->tctx_list);
+ ret = task_work_add(node->task, &exit.task_work, TWA_SIGNAL);
+ if (WARN_ON_ONCE(ret))
+ continue;
+
+ mutex_unlock(&ctx->uring_lock);
+ wait_for_completion(&exit.completion);
+ mutex_lock(&ctx->uring_lock);
+ }
+ mutex_unlock(&ctx->uring_lock);
+ spin_lock(&ctx->completion_lock);
+ spin_unlock(&ctx->completion_lock);
+
+ io_ring_ctx_free(ctx);
+}
+
+/* Returns true if we found and killed one or more timeouts */
+static __cold bool io_kill_timeouts(struct io_ring_ctx *ctx,
+ struct task_struct *tsk, bool cancel_all)
+{
+ struct io_kiocb *req, *tmp;
+ int canceled = 0;
+
+ spin_lock(&ctx->completion_lock);
+ spin_lock_irq(&ctx->timeout_lock);
+ list_for_each_entry_safe(req, tmp, &ctx->timeout_list, timeout.list) {
+ if (io_match_task(req, tsk, cancel_all)) {
+ io_kill_timeout(req, -ECANCELED);
+ canceled++;
+ }
+ }
+ spin_unlock_irq(&ctx->timeout_lock);
+ if (canceled != 0)
+ io_commit_cqring(ctx);
+ spin_unlock(&ctx->completion_lock);
+ if (canceled != 0)
+ io_cqring_ev_posted(ctx);
+ return canceled != 0;
+}
+
+static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
+{
+ unsigned long index;
+ struct creds *creds;
+
+ mutex_lock(&ctx->uring_lock);
+ percpu_ref_kill(&ctx->refs);
+ if (ctx->rings)
+ __io_cqring_overflow_flush(ctx, true);
+ xa_for_each(&ctx->personalities, index, creds)
+ io_unregister_personality(ctx, index);
+ mutex_unlock(&ctx->uring_lock);
+
+ io_kill_timeouts(ctx, NULL, true);
+ io_poll_remove_all(ctx, NULL, true);
+
+ /* if we failed setting up the ctx, we might not have any rings */
+ io_iopoll_try_reap_events(ctx);
+
+ INIT_WORK(&ctx->exit_work, io_ring_exit_work);
+ /*
+ * Use system_unbound_wq to avoid spawning tons of event kworkers
+ * if we're exiting a ton of rings at the same time. It just adds
+ * noise and overhead, there's no discernable change in runtime
+ * over using system_wq.
+ */
+ queue_work(system_unbound_wq, &ctx->exit_work);
+}
+
+static int io_uring_release(struct inode *inode, struct file *file)
+{
+ struct io_ring_ctx *ctx = file->private_data;
+
+ file->private_data = NULL;
+ io_ring_ctx_wait_and_kill(ctx);
+ return 0;
+}
+
+struct io_task_cancel {
+ struct task_struct *task;
+ bool all;
+};
+
+static bool io_cancel_task_cb(struct io_wq_work *work, void *data)
+{
+ struct io_kiocb *req = container_of(work, struct io_kiocb, work);
+ struct io_task_cancel *cancel = data;
+
+ return io_match_task_safe(req, cancel->task, cancel->all);
+}
+
+static __cold bool io_cancel_defer_files(struct io_ring_ctx *ctx,
+ struct task_struct *task,
+ bool cancel_all)
+{
+ struct io_defer_entry *de;
+ LIST_HEAD(list);
+
+ spin_lock(&ctx->completion_lock);
+ list_for_each_entry_reverse(de, &ctx->defer_list, list) {
+ if (io_match_task_safe(de->req, task, cancel_all)) {
+ list_cut_position(&list, &ctx->defer_list, &de->list);
+ break;
+ }
+ }
+ spin_unlock(&ctx->completion_lock);
+ if (list_empty(&list))
+ return false;
+
+ while (!list_empty(&list)) {
+ de = list_first_entry(&list, struct io_defer_entry, list);
+ list_del_init(&de->list);
+ io_req_complete_failed(de->req, -ECANCELED);
+ kfree(de);
+ }
+ return true;
+}
+
+static __cold bool io_uring_try_cancel_iowq(struct io_ring_ctx *ctx)
+{
+ struct io_tctx_node *node;
+ enum io_wq_cancel cret;
+ bool ret = false;
+
+ mutex_lock(&ctx->uring_lock);
+ list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
+ struct io_uring_task *tctx = node->task->io_uring;
+
+ /*
+ * io_wq will stay alive while we hold uring_lock, because it's
+ * killed after ctx nodes, which requires to take the lock.
+ */
+ if (!tctx || !tctx->io_wq)
+ continue;
+ cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_ctx_cb, ctx, true);
+ ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
+ }
+ mutex_unlock(&ctx->uring_lock);
+
+ return ret;
+}
+
+static __cold void io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
+ struct task_struct *task,
+ bool cancel_all)
+{
+ struct io_task_cancel cancel = { .task = task, .all = cancel_all, };
+ struct io_uring_task *tctx = task ? task->io_uring : NULL;
+
+ while (1) {
+ enum io_wq_cancel cret;
+ bool ret = false;
+
+ if (!task) {
+ ret |= io_uring_try_cancel_iowq(ctx);
+ } else if (tctx && tctx->io_wq) {
+ /*
+ * Cancels requests of all rings, not only @ctx, but
+ * it's fine as the task is in exit/exec.
+ */
+ cret = io_wq_cancel_cb(tctx->io_wq, io_cancel_task_cb,
+ &cancel, true);
+ ret |= (cret != IO_WQ_CANCEL_NOTFOUND);
+ }
+
+ /* SQPOLL thread does its own polling */
+ if ((!(ctx->flags & IORING_SETUP_SQPOLL) && cancel_all) ||
+ (ctx->sq_data && ctx->sq_data->thread == current)) {
+ while (!wq_list_empty(&ctx->iopoll_list)) {
+ io_iopoll_try_reap_events(ctx);
+ ret = true;
+ }
+ }
+
+ ret |= io_cancel_defer_files(ctx, task, cancel_all);
+ ret |= io_poll_remove_all(ctx, task, cancel_all);
+ ret |= io_kill_timeouts(ctx, task, cancel_all);
+ if (task)
+ ret |= io_run_task_work();
+ if (!ret)
+ break;
+ cond_resched();
+ }
+}
+
+static int __io_uring_add_tctx_node(struct io_ring_ctx *ctx)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ struct io_tctx_node *node;
+ int ret;
+
+ if (unlikely(!tctx)) {
+ ret = io_uring_alloc_task_context(current, ctx);
+ if (unlikely(ret))
+ return ret;
+
+ tctx = current->io_uring;
+ if (ctx->iowq_limits_set) {
+ unsigned int limits[2] = { ctx->iowq_limits[0],
+ ctx->iowq_limits[1], };
+
+ ret = io_wq_max_workers(tctx->io_wq, limits);
+ if (ret)
+ return ret;
+ }
+ }
+ if (!xa_load(&tctx->xa, (unsigned long)ctx)) {
+ node = kmalloc(sizeof(*node), GFP_KERNEL);
+ if (!node)
+ return -ENOMEM;
+ node->ctx = ctx;
+ node->task = current;
+
+ ret = xa_err(xa_store(&tctx->xa, (unsigned long)ctx,
+ node, GFP_KERNEL));
+ if (ret) {
+ kfree(node);
+ return ret;
+ }
+
+ mutex_lock(&ctx->uring_lock);
+ list_add(&node->ctx_node, &ctx->tctx_list);
+ mutex_unlock(&ctx->uring_lock);
+ }
+ tctx->last = ctx;
+ return 0;
+}
+
+/*
+ * Note that this task has used io_uring. We use it for cancelation purposes.
+ */
+static inline int io_uring_add_tctx_node(struct io_ring_ctx *ctx)
+{
+ struct io_uring_task *tctx = current->io_uring;
+
+ if (likely(tctx && tctx->last == ctx))
+ return 0;
+ return __io_uring_add_tctx_node(ctx);
+}
+
+/*
+ * Remove this io_uring_file -> task mapping.
+ */
+static __cold void io_uring_del_tctx_node(unsigned long index)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ struct io_tctx_node *node;
+
+ if (!tctx)
+ return;
+ node = xa_erase(&tctx->xa, index);
+ if (!node)
+ return;
+
+ WARN_ON_ONCE(current != node->task);
+ WARN_ON_ONCE(list_empty(&node->ctx_node));
+
+ mutex_lock(&node->ctx->uring_lock);
+ list_del(&node->ctx_node);
+ mutex_unlock(&node->ctx->uring_lock);
+
+ if (tctx->last == node->ctx)
+ tctx->last = NULL;
+ kfree(node);
+}
+
+static __cold void io_uring_clean_tctx(struct io_uring_task *tctx)
+{
+ struct io_wq *wq = tctx->io_wq;
+ struct io_tctx_node *node;
+ unsigned long index;
+
+ xa_for_each(&tctx->xa, index, node) {
+ io_uring_del_tctx_node(index);
+ cond_resched();
+ }
+ if (wq) {
+ /*
+ * Must be after io_uring_del_tctx_node() (removes nodes under
+ * uring_lock) to avoid race with io_uring_try_cancel_iowq().
+ */
+ io_wq_put_and_exit(wq);
+ tctx->io_wq = NULL;
+ }
+}
+
+static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
+{
+ if (tracked)
+ return atomic_read(&tctx->inflight_tracked);
+ return percpu_counter_sum(&tctx->inflight);
+}
+
+/*
+ * Find any io_uring ctx that this task has registered or done IO on, and cancel
+ * requests. @sqd should be not-null IFF it's an SQPOLL thread cancellation.
+ */
+static __cold void io_uring_cancel_generic(bool cancel_all,
+ struct io_sq_data *sqd)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ struct io_ring_ctx *ctx;
+ s64 inflight;
+ DEFINE_WAIT(wait);
+
+ WARN_ON_ONCE(sqd && sqd->thread != current);
+
+ if (!current->io_uring)
+ return;
+ if (tctx->io_wq)
+ io_wq_exit_start(tctx->io_wq);
+
+ atomic_inc(&tctx->in_idle);
+ do {
+ io_uring_drop_tctx_refs(current);
+ /* read completions before cancelations */
+ inflight = tctx_inflight(tctx, !cancel_all);
+ if (!inflight)
+ break;
+
+ if (!sqd) {
+ struct io_tctx_node *node;
+ unsigned long index;
+
+ xa_for_each(&tctx->xa, index, node) {
+ /* sqpoll task will cancel all its requests */
+ if (node->ctx->sq_data)
+ continue;
+ io_uring_try_cancel_requests(node->ctx, current,
+ cancel_all);
+ }
+ } else {
+ list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
+ io_uring_try_cancel_requests(ctx, current,
+ cancel_all);
+ }
+
+ prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
+ io_run_task_work();
+ io_uring_drop_tctx_refs(current);
+
+ /*
+ * If we've seen completions, retry without waiting. This
+ * avoids a race where a completion comes in before we did
+ * prepare_to_wait().
+ */
+ if (inflight == tctx_inflight(tctx, !cancel_all))
+ schedule();
+ finish_wait(&tctx->wait, &wait);
+ } while (1);
+
+ io_uring_clean_tctx(tctx);
+ if (cancel_all) {
+ /*
+ * We shouldn't run task_works after cancel, so just leave
+ * ->in_idle set for normal exit.
+ */
+ atomic_dec(&tctx->in_idle);
+ /* for exec all current's requests should be gone, kill tctx */
+ __io_uring_free(current);
+ }
+}
+
+void __io_uring_cancel(bool cancel_all)
+{
+ io_uring_cancel_generic(cancel_all, NULL);
+}
+
+void io_uring_unreg_ringfd(void)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ int i;
+
+ for (i = 0; i < IO_RINGFD_REG_MAX; i++) {
+ if (tctx->registered_rings[i]) {
+ fput(tctx->registered_rings[i]);
+ tctx->registered_rings[i] = NULL;
+ }
+ }
+}
+
+static int io_ring_add_registered_fd(struct io_uring_task *tctx, int fd,
+ int start, int end)
+{
+ struct file *file;
+ int offset;
+
+ for (offset = start; offset < end; offset++) {
+ offset = array_index_nospec(offset, IO_RINGFD_REG_MAX);
+ if (tctx->registered_rings[offset])
+ continue;
+
+ file = fget(fd);
+ if (!file) {
+ return -EBADF;
+ } else if (file->f_op != &io_uring_fops) {
+ fput(file);
+ return -EOPNOTSUPP;
+ }
+ tctx->registered_rings[offset] = file;
+ return offset;
+ }
+
+ return -EBUSY;
+}
+
+/*
+ * Register a ring fd to avoid fdget/fdput for each io_uring_enter()
+ * invocation. User passes in an array of struct io_uring_rsrc_update
+ * with ->data set to the ring_fd, and ->offset given for the desired
+ * index. If no index is desired, application may set ->offset == -1U
+ * and we'll find an available index. Returns number of entries
+ * successfully processed, or < 0 on error if none were processed.
+ */
+static int io_ringfd_register(struct io_ring_ctx *ctx, void __user *__arg,
+ unsigned nr_args)
+{
+ struct io_uring_rsrc_update __user *arg = __arg;
+ struct io_uring_rsrc_update reg;
+ struct io_uring_task *tctx;
+ int ret, i;
+
+ if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
+ return -EINVAL;
+
+ mutex_unlock(&ctx->uring_lock);
+ ret = io_uring_add_tctx_node(ctx);
+ mutex_lock(&ctx->uring_lock);
+ if (ret)
+ return ret;
+
+ tctx = current->io_uring;
+ for (i = 0; i < nr_args; i++) {
+ int start, end;
+
+ if (copy_from_user(®, &arg[i], sizeof(reg))) {
+ ret = -EFAULT;
+ break;
+ }
+
+ if (reg.resv) {
+ ret = -EINVAL;
+ break;
+ }
+
+ if (reg.offset == -1U) {
+ start = 0;
+ end = IO_RINGFD_REG_MAX;
+ } else {
+ if (reg.offset >= IO_RINGFD_REG_MAX) {
+ ret = -EINVAL;
+ break;
+ }
+ start = reg.offset;
+ end = start + 1;
+ }
+
+ ret = io_ring_add_registered_fd(tctx, reg.data, start, end);
+ if (ret < 0)
+ break;
+
+ reg.offset = ret;
+ if (copy_to_user(&arg[i], ®, sizeof(reg))) {
+ fput(tctx->registered_rings[reg.offset]);
+ tctx->registered_rings[reg.offset] = NULL;
+ ret = -EFAULT;
+ break;
+ }
+ }
+
+ return i ? i : ret;
+}
+
+static int io_ringfd_unregister(struct io_ring_ctx *ctx, void __user *__arg,
+ unsigned nr_args)
+{
+ struct io_uring_rsrc_update __user *arg = __arg;
+ struct io_uring_task *tctx = current->io_uring;
+ struct io_uring_rsrc_update reg;
+ int ret = 0, i;
+
+ if (!nr_args || nr_args > IO_RINGFD_REG_MAX)
+ return -EINVAL;
+ if (!tctx)
+ return 0;
+
+ for (i = 0; i < nr_args; i++) {
+ if (copy_from_user(®, &arg[i], sizeof(reg))) {
+ ret = -EFAULT;
+ break;
+ }
+ if (reg.resv || reg.data || reg.offset >= IO_RINGFD_REG_MAX) {
+ ret = -EINVAL;
+ break;
+ }
+
+ reg.offset = array_index_nospec(reg.offset, IO_RINGFD_REG_MAX);
+ if (tctx->registered_rings[reg.offset]) {
+ fput(tctx->registered_rings[reg.offset]);
+ tctx->registered_rings[reg.offset] = NULL;
+ }
+ }
+
+ return i ? i : ret;
+}
+
+static void *io_uring_validate_mmap_request(struct file *file,
+ loff_t pgoff, size_t sz)
+{
+ struct io_ring_ctx *ctx = file->private_data;
+ loff_t offset = pgoff << PAGE_SHIFT;
+ struct page *page;
+ void *ptr;
+
+ switch (offset) {
+ case IORING_OFF_SQ_RING:
+ case IORING_OFF_CQ_RING:
+ ptr = ctx->rings;
+ break;
+ case IORING_OFF_SQES:
+ ptr = ctx->sq_sqes;
+ break;
+ default:
+ return ERR_PTR(-EINVAL);
+ }
+
+ page = virt_to_head_page(ptr);
+ if (sz > page_size(page))
+ return ERR_PTR(-EINVAL);
+
+ return ptr;
+}
+
+#ifdef CONFIG_MMU
+
+static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ size_t sz = vma->vm_end - vma->vm_start;
+ unsigned long pfn;
+ void *ptr;
+
+ ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz);
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+
+ pfn = virt_to_phys(ptr) >> PAGE_SHIFT;
+ return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
+}
+
+#else /* !CONFIG_MMU */
+
+static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ return vma->vm_flags & (VM_SHARED | VM_MAYSHARE) ? 0 : -EINVAL;
+}
+
+static unsigned int io_uring_nommu_mmap_capabilities(struct file *file)
+{
+ return NOMMU_MAP_DIRECT | NOMMU_MAP_READ | NOMMU_MAP_WRITE;
+}
+
+static unsigned long io_uring_nommu_get_unmapped_area(struct file *file,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags)
+{
+ void *ptr;
+
+ ptr = io_uring_validate_mmap_request(file, pgoff, len);
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+
+ return (unsigned long) ptr;
+}
+
+#endif /* !CONFIG_MMU */
+
+static int io_sqpoll_wait_sq(struct io_ring_ctx *ctx)
+{
+ DEFINE_WAIT(wait);
+
+ do {
+ if (!io_sqring_full(ctx))
+ break;
+ prepare_to_wait(&ctx->sqo_sq_wait, &wait, TASK_INTERRUPTIBLE);
+
+ if (!io_sqring_full(ctx))
+ break;
+ schedule();
+ } while (!signal_pending(current));
+
+ finish_wait(&ctx->sqo_sq_wait, &wait);
+ return 0;
+}
+
+static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz,
+ struct __kernel_timespec __user **ts,
+ const sigset_t __user **sig)
+{
+ struct io_uring_getevents_arg arg;
+
+ /*
+ * If EXT_ARG isn't set, then we have no timespec and the argp pointer
+ * is just a pointer to the sigset_t.
+ */
+ if (!(flags & IORING_ENTER_EXT_ARG)) {
+ *sig = (const sigset_t __user *) argp;
+ *ts = NULL;
+ return 0;
+ }
+
+ /*
+ * EXT_ARG is set - ensure we agree on the size of it and copy in our
+ * timespec and sigset_t pointers if good.
+ */
+ if (*argsz != sizeof(arg))
+ return -EINVAL;
+ if (copy_from_user(&arg, argp, sizeof(arg)))
+ return -EFAULT;
+ if (arg.pad)
+ return -EINVAL;
+ *sig = u64_to_user_ptr(arg.sigmask);
+ *argsz = arg.sigmask_sz;
+ *ts = u64_to_user_ptr(arg.ts);
+ return 0;
+}
+
+SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
+ u32, min_complete, u32, flags, const void __user *, argp,
+ size_t, argsz)
+{
+ struct io_ring_ctx *ctx;
+ int submitted = 0;
+ struct fd f;
+ long ret;
+
+ io_run_task_work();
+
+ if (unlikely(flags & ~(IORING_ENTER_GETEVENTS | IORING_ENTER_SQ_WAKEUP |
+ IORING_ENTER_SQ_WAIT | IORING_ENTER_EXT_ARG |
+ IORING_ENTER_REGISTERED_RING)))
+ return -EINVAL;
+
+ /*
+ * Ring fd has been registered via IORING_REGISTER_RING_FDS, we
+ * need only dereference our task private array to find it.
+ */
+ if (flags & IORING_ENTER_REGISTERED_RING) {
+ struct io_uring_task *tctx = current->io_uring;
+
+ if (!tctx || fd >= IO_RINGFD_REG_MAX)
+ return -EINVAL;
+ fd = array_index_nospec(fd, IO_RINGFD_REG_MAX);
+ f.file = tctx->registered_rings[fd];
+ if (unlikely(!f.file))
+ return -EBADF;
+ } else {
+ f = fdget(fd);
+ if (unlikely(!f.file))
+ return -EBADF;
+ }
+
+ ret = -EOPNOTSUPP;
+ if (unlikely(f.file->f_op != &io_uring_fops))
+ goto out_fput;
+
+ ret = -ENXIO;
+ ctx = f.file->private_data;
+ if (unlikely(!percpu_ref_tryget(&ctx->refs)))
+ goto out_fput;
+
+ ret = -EBADFD;
+ if (unlikely(ctx->flags & IORING_SETUP_R_DISABLED))
+ goto out;
+
+ /*
+ * For SQ polling, the thread will do all submissions and completions.
+ * Just return the requested submit count, and wake the thread if
+ * we were asked to.
+ */
+ ret = 0;
+ if (ctx->flags & IORING_SETUP_SQPOLL) {
+ io_cqring_overflow_flush(ctx);
+
+ if (unlikely(ctx->sq_data->thread == NULL)) {
+ ret = -EOWNERDEAD;
+ goto out;
+ }
+ if (flags & IORING_ENTER_SQ_WAKEUP)
+ wake_up(&ctx->sq_data->wait);
+ if (flags & IORING_ENTER_SQ_WAIT) {
+ ret = io_sqpoll_wait_sq(ctx);
+ if (ret)
+ goto out;
+ }
+ submitted = to_submit;
+ } else if (to_submit) {
+ ret = io_uring_add_tctx_node(ctx);
+ if (unlikely(ret))
+ goto out;
+ mutex_lock(&ctx->uring_lock);
+ submitted = io_submit_sqes(ctx, to_submit);
+ mutex_unlock(&ctx->uring_lock);
+
+ if (submitted != to_submit)
+ goto out;
+ }
+ if (flags & IORING_ENTER_GETEVENTS) {
+ const sigset_t __user *sig;
+ struct __kernel_timespec __user *ts;
+
+ ret = io_get_ext_arg(flags, argp, &argsz, &ts, &sig);
+ if (unlikely(ret))
+ goto out;
+
+ min_complete = min(min_complete, ctx->cq_entries);
+
+ /*
+ * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user
+ * space applications don't need to do io completion events
+ * polling again, they can rely on io_sq_thread to do polling
+ * work, which can reduce cpu usage and uring_lock contention.
+ */
+ if (ctx->flags & IORING_SETUP_IOPOLL &&
+ !(ctx->flags & IORING_SETUP_SQPOLL)) {
+ ret = io_iopoll_check(ctx, min_complete);
+ } else {
+ ret = io_cqring_wait(ctx, min_complete, sig, argsz, ts);
+ }
+ }
+
+out:
+ percpu_ref_put(&ctx->refs);
+out_fput:
+ if (!(flags & IORING_ENTER_REGISTERED_RING))
+ fdput(f);
+ return submitted ? submitted : ret;
+}
+
+#ifdef CONFIG_PROC_FS
+static __cold int io_uring_show_cred(struct seq_file *m, unsigned int id,
+ const struct cred *cred)
+{
+ struct user_namespace *uns = seq_user_ns(m);
+ struct group_info *gi;
+ kernel_cap_t cap;
+ unsigned __capi;
+ int g;
+
+ seq_printf(m, "%5d\n", id);
+ seq_put_decimal_ull(m, "\tUid:\t", from_kuid_munged(uns, cred->uid));
+ seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->euid));
+ seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->suid));
+ seq_put_decimal_ull(m, "\t\t", from_kuid_munged(uns, cred->fsuid));
+ seq_put_decimal_ull(m, "\n\tGid:\t", from_kgid_munged(uns, cred->gid));
+ seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->egid));
+ seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->sgid));
+ seq_put_decimal_ull(m, "\t\t", from_kgid_munged(uns, cred->fsgid));
+ seq_puts(m, "\n\tGroups:\t");
+ gi = cred->group_info;
+ for (g = 0; g < gi->ngroups; g++) {
+ seq_put_decimal_ull(m, g ? " " : "",
+ from_kgid_munged(uns, gi->gid[g]));
+ }
+ seq_puts(m, "\n\tCapEff:\t");
+ cap = cred->cap_effective;
+ CAP_FOR_EACH_U32(__capi)
+ seq_put_hex_ll(m, NULL, cap.cap[CAP_LAST_U32 - __capi], 8);
+ seq_putc(m, '\n');
+ return 0;
+}
+
+static __cold void __io_uring_show_fdinfo(struct io_ring_ctx *ctx,
+ struct seq_file *m)
+{
+ struct io_sq_data *sq = NULL;
+ struct io_overflow_cqe *ocqe;
+ struct io_rings *r = ctx->rings;
+ unsigned int sq_mask = ctx->sq_entries - 1, cq_mask = ctx->cq_entries - 1;
+ unsigned int sq_head = READ_ONCE(r->sq.head);
+ unsigned int sq_tail = READ_ONCE(r->sq.tail);
+ unsigned int cq_head = READ_ONCE(r->cq.head);
+ unsigned int cq_tail = READ_ONCE(r->cq.tail);
+ unsigned int sq_entries, cq_entries;
+ bool has_lock;
+ unsigned int i;
+
+ /*
+ * we may get imprecise sqe and cqe info if uring is actively running
+ * since we get cached_sq_head and cached_cq_tail without uring_lock
+ * and sq_tail and cq_head are changed by userspace. But it's ok since
+ * we usually use these info when it is stuck.
+ */
+ seq_printf(m, "SqMask:\t0x%x\n", sq_mask);
+ seq_printf(m, "SqHead:\t%u\n", sq_head);
+ seq_printf(m, "SqTail:\t%u\n", sq_tail);
+ seq_printf(m, "CachedSqHead:\t%u\n", ctx->cached_sq_head);
+ seq_printf(m, "CqMask:\t0x%x\n", cq_mask);
+ seq_printf(m, "CqHead:\t%u\n", cq_head);
+ seq_printf(m, "CqTail:\t%u\n", cq_tail);
+ seq_printf(m, "CachedCqTail:\t%u\n", ctx->cached_cq_tail);
+ seq_printf(m, "SQEs:\t%u\n", sq_tail - ctx->cached_sq_head);
+ sq_entries = min(sq_tail - sq_head, ctx->sq_entries);
+ for (i = 0; i < sq_entries; i++) {
+ unsigned int entry = i + sq_head;
+ unsigned int sq_idx = READ_ONCE(ctx->sq_array[entry & sq_mask]);
+ struct io_uring_sqe *sqe;
+
+ if (sq_idx > sq_mask)
+ continue;
+ sqe = &ctx->sq_sqes[sq_idx];
+ seq_printf(m, "%5u: opcode:%d, fd:%d, flags:%x, user_data:%llu\n",
+ sq_idx, sqe->opcode, sqe->fd, sqe->flags,
+ sqe->user_data);
+ }
+ seq_printf(m, "CQEs:\t%u\n", cq_tail - cq_head);
+ cq_entries = min(cq_tail - cq_head, ctx->cq_entries);
+ for (i = 0; i < cq_entries; i++) {
+ unsigned int entry = i + cq_head;
+ struct io_uring_cqe *cqe = &r->cqes[entry & cq_mask];
+
+ seq_printf(m, "%5u: user_data:%llu, res:%d, flag:%x\n",
+ entry & cq_mask, cqe->user_data, cqe->res,
+ cqe->flags);
+ }
+
+ /*
+ * Avoid ABBA deadlock between the seq lock and the io_uring mutex,
+ * since fdinfo case grabs it in the opposite direction of normal use
+ * cases. If we fail to get the lock, we just don't iterate any
+ * structures that could be going away outside the io_uring mutex.
+ */
+ has_lock = mutex_trylock(&ctx->uring_lock);
+
+ if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) {
+ sq = ctx->sq_data;
+ if (!sq->thread)
+ sq = NULL;
+ }
+
+ seq_printf(m, "SqThread:\t%d\n", sq ? task_pid_nr(sq->thread) : -1);
+ seq_printf(m, "SqThreadCpu:\t%d\n", sq ? task_cpu(sq->thread) : -1);
+ seq_printf(m, "UserFiles:\t%u\n", ctx->nr_user_files);
+ for (i = 0; has_lock && i < ctx->nr_user_files; i++) {
+ struct file *f = io_file_from_index(ctx, i);
+
+ if (f)
+ seq_printf(m, "%5u: %s\n", i, file_dentry(f)->d_iname);
+ else
+ seq_printf(m, "%5u: <none>\n", i);
+ }
+ seq_printf(m, "UserBufs:\t%u\n", ctx->nr_user_bufs);
+ for (i = 0; has_lock && i < ctx->nr_user_bufs; i++) {
+ struct io_mapped_ubuf *buf = ctx->user_bufs[i];
+ unsigned int len = buf->ubuf_end - buf->ubuf;
+
+ seq_printf(m, "%5u: 0x%llx/%u\n", i, buf->ubuf, len);
+ }
+ if (has_lock && !xa_empty(&ctx->personalities)) {
+ unsigned long index;
+ const struct cred *cred;
+
+ seq_printf(m, "Personalities:\n");
+ xa_for_each(&ctx->personalities, index, cred)
+ io_uring_show_cred(m, index, cred);
+ }
+ if (has_lock)
+ mutex_unlock(&ctx->uring_lock);
+
+ seq_puts(m, "PollList:\n");
+ spin_lock(&ctx->completion_lock);
+ for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
+ struct hlist_head *list = &ctx->cancel_hash[i];
+ struct io_kiocb *req;
+
+ hlist_for_each_entry(req, list, hash_node)
+ seq_printf(m, " op=%d, task_works=%d\n", req->opcode,
+ task_work_pending(req->task));
+ }
+
+ seq_puts(m, "CqOverflowList:\n");
+ list_for_each_entry(ocqe, &ctx->cq_overflow_list, list) {
+ struct io_uring_cqe *cqe = &ocqe->cqe;
+
+ seq_printf(m, " user_data=%llu, res=%d, flags=%x\n",
+ cqe->user_data, cqe->res, cqe->flags);
+
+ }
+
+ spin_unlock(&ctx->completion_lock);
+}
+
+static __cold void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
+{
+ struct io_ring_ctx *ctx = f->private_data;
+
+ if (percpu_ref_tryget(&ctx->refs)) {
+ __io_uring_show_fdinfo(ctx, m);
+ percpu_ref_put(&ctx->refs);
+ }
+}
+#endif
+
+static const struct file_operations io_uring_fops = {
+ .release = io_uring_release,
+ .mmap = io_uring_mmap,
+#ifndef CONFIG_MMU
+ .get_unmapped_area = io_uring_nommu_get_unmapped_area,
+ .mmap_capabilities = io_uring_nommu_mmap_capabilities,
+#endif
+ .poll = io_uring_poll,
+#ifdef CONFIG_PROC_FS
+ .show_fdinfo = io_uring_show_fdinfo,
+#endif
+};
+
+static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
+ struct io_uring_params *p)
+{
+ struct io_rings *rings;
+ size_t size, sq_array_offset;
+
+ /* make sure these are sane, as we already accounted them */
+ ctx->sq_entries = p->sq_entries;
+ ctx->cq_entries = p->cq_entries;
+
+ size = rings_size(p->sq_entries, p->cq_entries, &sq_array_offset);
+ if (size == SIZE_MAX)
+ return -EOVERFLOW;
+
+ rings = io_mem_alloc(size);
+ if (!rings)
+ return -ENOMEM;
+
+ ctx->rings = rings;
+ ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
+ rings->sq_ring_mask = p->sq_entries - 1;
+ rings->cq_ring_mask = p->cq_entries - 1;
+ rings->sq_ring_entries = p->sq_entries;
+ rings->cq_ring_entries = p->cq_entries;
+
+ size = array_size(sizeof(struct io_uring_sqe), p->sq_entries);
+ if (size == SIZE_MAX) {
+ io_mem_free(ctx->rings);
+ ctx->rings = NULL;
+ return -EOVERFLOW;
+ }
+
+ ctx->sq_sqes = io_mem_alloc(size);
+ if (!ctx->sq_sqes) {
+ io_mem_free(ctx->rings);
+ ctx->rings = NULL;
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static int io_uring_install_fd(struct io_ring_ctx *ctx, struct file *file)
+{
+ int ret, fd;
+
+ fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
+ if (fd < 0)
+ return fd;
+
+ ret = io_uring_add_tctx_node(ctx);
+ if (ret) {
+ put_unused_fd(fd);
+ return ret;
+ }
+ fd_install(fd, file);
+ return fd;
+}
+
+/*
+ * Allocate an anonymous fd, this is what constitutes the application
+ * visible backing of an io_uring instance. The application mmaps this
+ * fd to gain access to the SQ/CQ ring details. If UNIX sockets are enabled,
+ * we have to tie this fd to a socket for file garbage collection purposes.
+ */
+static struct file *io_uring_get_file(struct io_ring_ctx *ctx)
+{
+ struct file *file;
+#if defined(CONFIG_UNIX)
+ int ret;
+
+ ret = sock_create_kern(&init_net, PF_UNIX, SOCK_RAW, IPPROTO_IP,
+ &ctx->ring_sock);
+ if (ret)
+ return ERR_PTR(ret);
+#endif
+
+ file = anon_inode_getfile_secure("[io_uring]", &io_uring_fops, ctx,
+ O_RDWR | O_CLOEXEC, NULL);
+#if defined(CONFIG_UNIX)
+ if (IS_ERR(file)) {
+ sock_release(ctx->ring_sock);
+ ctx->ring_sock = NULL;
+ } else {
+ ctx->ring_sock->file = file;
+ }
+#endif
+ return file;
+}
+
+static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
+ struct io_uring_params __user *params)
+{
+ struct io_ring_ctx *ctx;
+ struct file *file;
+ int ret;
+
+ if (!entries)
+ return -EINVAL;
+ if (entries > IORING_MAX_ENTRIES) {
+ if (!(p->flags & IORING_SETUP_CLAMP))
+ return -EINVAL;
+ entries = IORING_MAX_ENTRIES;
+ }
+
+ /*
+ * Use twice as many entries for the CQ ring. It's possible for the
+ * application to drive a higher depth than the size of the SQ ring,
+ * since the sqes are only used at submission time. This allows for
+ * some flexibility in overcommitting a bit. If the application has
+ * set IORING_SETUP_CQSIZE, it will have passed in the desired number
+ * of CQ ring entries manually.
+ */
+ p->sq_entries = roundup_pow_of_two(entries);
+ if (p->flags & IORING_SETUP_CQSIZE) {
+ /*
+ * If IORING_SETUP_CQSIZE is set, we do the same roundup
+ * to a power-of-two, if it isn't already. We do NOT impose
+ * any cq vs sq ring sizing.
+ */
+ if (!p->cq_entries)
+ return -EINVAL;
+ if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
+ if (!(p->flags & IORING_SETUP_CLAMP))
+ return -EINVAL;
+ p->cq_entries = IORING_MAX_CQ_ENTRIES;
+ }
+ p->cq_entries = roundup_pow_of_two(p->cq_entries);
+ if (p->cq_entries < p->sq_entries)
+ return -EINVAL;
+ } else {
+ p->cq_entries = 2 * p->sq_entries;
+ }
+
+ ctx = io_ring_ctx_alloc(p);
+ if (!ctx)
+ return -ENOMEM;
+ ctx->compat = in_compat_syscall();
+ if (!capable(CAP_IPC_LOCK))
+ ctx->user = get_uid(current_user());
+
+ /*
+ * This is just grabbed for accounting purposes. When a process exits,
+ * the mm is exited and dropped before the files, hence we need to hang
+ * on to this mm purely for the purposes of being able to unaccount
+ * memory (locked/pinned vm). It's not used for anything else.
+ */
+ mmgrab(current->mm);
+ ctx->mm_account = current->mm;
+
+ ret = io_allocate_scq_urings(ctx, p);
+ if (ret)
+ goto err;
+
+ ret = io_sq_offload_create(ctx, p);
+ if (ret)
+ goto err;
+ /* always set a rsrc node */
+ ret = io_rsrc_node_switch_start(ctx);
+ if (ret)
+ goto err;
+ io_rsrc_node_switch(ctx, NULL);
+
+ memset(&p->sq_off, 0, sizeof(p->sq_off));
+ p->sq_off.head = offsetof(struct io_rings, sq.head);
+ p->sq_off.tail = offsetof(struct io_rings, sq.tail);
+ p->sq_off.ring_mask = offsetof(struct io_rings, sq_ring_mask);
+ p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
+ p->sq_off.flags = offsetof(struct io_rings, sq_flags);
+ p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
+ p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
+
+ memset(&p->cq_off, 0, sizeof(p->cq_off));
+ p->cq_off.head = offsetof(struct io_rings, cq.head);
+ p->cq_off.tail = offsetof(struct io_rings, cq.tail);
+ p->cq_off.ring_mask = offsetof(struct io_rings, cq_ring_mask);
+ p->cq_off.ring_entries = offsetof(struct io_rings, cq_ring_entries);
+ p->cq_off.overflow = offsetof(struct io_rings, cq_overflow);
+ p->cq_off.cqes = offsetof(struct io_rings, cqes);
+ p->cq_off.flags = offsetof(struct io_rings, cq_flags);
+
+ p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
+ IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
+ IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL |
+ IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED |
+ IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
+ IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
+ IORING_FEAT_LINKED_FILE;
+
+ if (copy_to_user(params, p, sizeof(*p))) {
+ ret = -EFAULT;
+ goto err;
+ }
+
+ file = io_uring_get_file(ctx);
+ if (IS_ERR(file)) {
+ ret = PTR_ERR(file);
+ goto err;
+ }
+
+ /*
+ * Install ring fd as the very last thing, so we don't risk someone
+ * having closed it before we finish setup
+ */
+ ret = io_uring_install_fd(ctx, file);
+ if (ret < 0) {
+ /* fput will clean it up */
+ fput(file);
+ return ret;
+ }
+
+ trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
+ return ret;
+err:
+ io_ring_ctx_wait_and_kill(ctx);
+ return ret;
+}
+
+/*
+ * Sets up an aio uring context, and returns the fd. Applications asks for a
+ * ring size, we return the actual sq/cq ring sizes (among other things) in the
+ * params structure passed in.
+ */
+static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
+{
+ struct io_uring_params p;
+ int i;
+
+ if (copy_from_user(&p, params, sizeof(p)))
+ return -EFAULT;
+ for (i = 0; i < ARRAY_SIZE(p.resv); i++) {
+ if (p.resv[i])
+ return -EINVAL;
+ }
+
+ if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
+ IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
+ IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ |
+ IORING_SETUP_R_DISABLED | IORING_SETUP_SUBMIT_ALL))
+ return -EINVAL;
+
+ return io_uring_create(entries, &p, params);
+}
+
+SYSCALL_DEFINE2(io_uring_setup, u32, entries,
+ struct io_uring_params __user *, params)
+{
+ return io_uring_setup(entries, params);
+}
+
+static __cold int io_probe(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned nr_args)
+{
+ struct io_uring_probe *p;
+ size_t size;
+ int i, ret;
+
+ size = struct_size(p, ops, nr_args);
+ if (size == SIZE_MAX)
+ return -EOVERFLOW;
+ p = kzalloc(size, GFP_KERNEL);
+ if (!p)
+ return -ENOMEM;
+
+ ret = -EFAULT;
+ if (copy_from_user(p, arg, size))
+ goto out;
+ ret = -EINVAL;
+ if (memchr_inv(p, 0, size))
+ goto out;
+
+ p->last_op = IORING_OP_LAST - 1;
+ if (nr_args > IORING_OP_LAST)
+ nr_args = IORING_OP_LAST;
+
+ for (i = 0; i < nr_args; i++) {
+ p->ops[i].op = i;
+ if (!io_op_defs[i].not_supported)
+ p->ops[i].flags = IO_URING_OP_SUPPORTED;
+ }
+ p->ops_len = i;
+
+ ret = 0;
+ if (copy_to_user(arg, p, size))
+ ret = -EFAULT;
+out:
+ kfree(p);
+ return ret;
+}
+
+static int io_register_personality(struct io_ring_ctx *ctx)
+{
+ const struct cred *creds;
+ u32 id;
+ int ret;
+
+ creds = get_current_cred();
+
+ ret = xa_alloc_cyclic(&ctx->personalities, &id, (void *)creds,
+ XA_LIMIT(0, USHRT_MAX), &ctx->pers_next, GFP_KERNEL);
+ if (ret < 0) {
+ put_cred(creds);
+ return ret;
+ }
+ return id;
+}
+
+static __cold int io_register_restrictions(struct io_ring_ctx *ctx,
+ void __user *arg, unsigned int nr_args)
+{
+ struct io_uring_restriction *res;
+ size_t size;
+ int i, ret;
+
+ /* Restrictions allowed only if rings started disabled */
+ if (!(ctx->flags & IORING_SETUP_R_DISABLED))
+ return -EBADFD;
+
+ /* We allow only a single restrictions registration */
+ if (ctx->restrictions.registered)
+ return -EBUSY;
+
+ if (!arg || nr_args > IORING_MAX_RESTRICTIONS)
+ return -EINVAL;
+
+ size = array_size(nr_args, sizeof(*res));
+ if (size == SIZE_MAX)
+ return -EOVERFLOW;
+
+ res = memdup_user(arg, size);
+ if (IS_ERR(res))
+ return PTR_ERR(res);
+
+ ret = 0;
+
+ for (i = 0; i < nr_args; i++) {
+ switch (res[i].opcode) {
+ case IORING_RESTRICTION_REGISTER_OP:
+ if (res[i].register_op >= IORING_REGISTER_LAST) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ __set_bit(res[i].register_op,
+ ctx->restrictions.register_op);
+ break;
+ case IORING_RESTRICTION_SQE_OP:
+ if (res[i].sqe_op >= IORING_OP_LAST) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ __set_bit(res[i].sqe_op, ctx->restrictions.sqe_op);
+ break;
+ case IORING_RESTRICTION_SQE_FLAGS_ALLOWED:
+ ctx->restrictions.sqe_flags_allowed = res[i].sqe_flags;
+ break;
+ case IORING_RESTRICTION_SQE_FLAGS_REQUIRED:
+ ctx->restrictions.sqe_flags_required = res[i].sqe_flags;
+ break;
+ default:
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
+out:
+ /* Reset all restrictions if an error happened */
+ if (ret != 0)
+ memset(&ctx->restrictions, 0, sizeof(ctx->restrictions));
+ else
+ ctx->restrictions.registered = true;
+
+ kfree(res);
+ return ret;
+}
+
+static int io_register_enable_rings(struct io_ring_ctx *ctx)
+{
+ if (!(ctx->flags & IORING_SETUP_R_DISABLED))
+ return -EBADFD;
+
+ if (ctx->restrictions.registered)
+ ctx->restricted = 1;
+
+ ctx->flags &= ~IORING_SETUP_R_DISABLED;
+ if (ctx->sq_data && wq_has_sleeper(&ctx->sq_data->wait))
+ wake_up(&ctx->sq_data->wait);
+ return 0;
+}
+
+static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type,
+ struct io_uring_rsrc_update2 *up,
+ unsigned nr_args)
+{
+ __u32 tmp;
+ int err;
+
+ if (check_add_overflow(up->offset, nr_args, &tmp))
+ return -EOVERFLOW;
+ err = io_rsrc_node_switch_start(ctx);
+ if (err)
+ return err;
+
+ switch (type) {
+ case IORING_RSRC_FILE:
+ return __io_sqe_files_update(ctx, up, nr_args);
+ case IORING_RSRC_BUFFER:
+ return __io_sqe_buffers_update(ctx, up, nr_args);
+ }
+ return -EINVAL;
+}
+
+static int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned nr_args)
+{
+ struct io_uring_rsrc_update2 up;
+
+ if (!nr_args)
+ return -EINVAL;
+ memset(&up, 0, sizeof(up));
+ if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update)))
+ return -EFAULT;
+ if (up.resv || up.resv2)
+ return -EINVAL;
+ return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args);
+}
+
+static int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned size, unsigned type)
+{
+ struct io_uring_rsrc_update2 up;
+
+ if (size != sizeof(up))
+ return -EINVAL;
+ if (copy_from_user(&up, arg, sizeof(up)))
+ return -EFAULT;
+ if (!up.nr || up.resv || up.resv2)
+ return -EINVAL;
+ return __io_register_rsrc_update(ctx, type, &up, up.nr);
+}
+
+static __cold int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
+ unsigned int size, unsigned int type)
+{
+ struct io_uring_rsrc_register rr;
+
+ /* keep it extendible */
+ if (size != sizeof(rr))
+ return -EINVAL;
+
+ memset(&rr, 0, sizeof(rr));
+ if (copy_from_user(&rr, arg, size))
+ return -EFAULT;
+ if (!rr.nr || rr.resv || rr.resv2)
+ return -EINVAL;
+
+ switch (type) {
+ case IORING_RSRC_FILE:
+ return io_sqe_files_register(ctx, u64_to_user_ptr(rr.data),
+ rr.nr, u64_to_user_ptr(rr.tags));
+ case IORING_RSRC_BUFFER:
+ return io_sqe_buffers_register(ctx, u64_to_user_ptr(rr.data),
+ rr.nr, u64_to_user_ptr(rr.tags));
+ }
+ return -EINVAL;
+}
+
+static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
+ void __user *arg, unsigned len)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ cpumask_var_t new_mask;
+ int ret;
+
+ if (!tctx || !tctx->io_wq)
+ return -EINVAL;
+
+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
+ return -ENOMEM;
+
+ cpumask_clear(new_mask);
+ if (len > cpumask_size())
+ len = cpumask_size();
+
+ if (in_compat_syscall()) {
+ ret = compat_get_bitmap(cpumask_bits(new_mask),
+ (const compat_ulong_t __user *)arg,
+ len * 8 /* CHAR_BIT */);
+ } else {
+ ret = copy_from_user(new_mask, arg, len);
+ }
+
+ if (ret) {
+ free_cpumask_var(new_mask);
+ return -EFAULT;
+ }
+
+ ret = io_wq_cpu_affinity(tctx->io_wq, new_mask);
+ free_cpumask_var(new_mask);
+ return ret;
+}
+
+static __cold int io_unregister_iowq_aff(struct io_ring_ctx *ctx)
+{
+ struct io_uring_task *tctx = current->io_uring;
+
+ if (!tctx || !tctx->io_wq)
+ return -EINVAL;
+
+ return io_wq_cpu_affinity(tctx->io_wq, NULL);
+}
+
+static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
+ void __user *arg)
+ __must_hold(&ctx->uring_lock)
+{
+ struct io_tctx_node *node;
+ struct io_uring_task *tctx = NULL;
+ struct io_sq_data *sqd = NULL;
+ __u32 new_count[2];
+ int i, ret;
+
+ if (copy_from_user(new_count, arg, sizeof(new_count)))
+ return -EFAULT;
+ for (i = 0; i < ARRAY_SIZE(new_count); i++)
+ if (new_count[i] > INT_MAX)
+ return -EINVAL;
+
+ if (ctx->flags & IORING_SETUP_SQPOLL) {
+ sqd = ctx->sq_data;
+ if (sqd) {
+ /*
+ * Observe the correct sqd->lock -> ctx->uring_lock
+ * ordering. Fine to drop uring_lock here, we hold
+ * a ref to the ctx.
+ */
+ refcount_inc(&sqd->refs);
+ mutex_unlock(&ctx->uring_lock);
+ mutex_lock(&sqd->lock);
+ mutex_lock(&ctx->uring_lock);
+ if (sqd->thread)
+ tctx = sqd->thread->io_uring;
+ }
+ } else {
+ tctx = current->io_uring;
+ }
+
+ BUILD_BUG_ON(sizeof(new_count) != sizeof(ctx->iowq_limits));
+
+ for (i = 0; i < ARRAY_SIZE(new_count); i++)
+ if (new_count[i])
+ ctx->iowq_limits[i] = new_count[i];
+ ctx->iowq_limits_set = true;
+
+ if (tctx && tctx->io_wq) {
+ ret = io_wq_max_workers(tctx->io_wq, new_count);
+ if (ret)
+ goto err;
+ } else {
+ memset(new_count, 0, sizeof(new_count));
+ }
+
+ if (sqd) {
+ mutex_unlock(&sqd->lock);
+ io_put_sq_data(sqd);
+ }
+
+ if (copy_to_user(arg, new_count, sizeof(new_count)))
+ return -EFAULT;
+
+ /* that's it for SQPOLL, only the SQPOLL task creates requests */
+ if (sqd)
+ return 0;
+
+ /* now propagate the restriction to all registered users */
+ list_for_each_entry(node, &ctx->tctx_list, ctx_node) {
+ struct io_uring_task *tctx = node->task->io_uring;
+
+ if (WARN_ON_ONCE(!tctx->io_wq))
+ continue;
+
+ for (i = 0; i < ARRAY_SIZE(new_count); i++)
+ new_count[i] = ctx->iowq_limits[i];
+ /* ignore errors, it always returns zero anyway */
+ (void)io_wq_max_workers(tctx->io_wq, new_count);
+ }
+ return 0;
+err:
+ if (sqd) {
+ mutex_unlock(&sqd->lock);
+ io_put_sq_data(sqd);
+ }
+ return ret;
+}
+
+static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
+ void __user *arg, unsigned nr_args)
+ __releases(ctx->uring_lock)
+ __acquires(ctx->uring_lock)
+{
+ int ret;
+
+ /*
+ * We're inside the ring mutex, if the ref is already dying, then
+ * someone else killed the ctx or is already going through
+ * io_uring_register().
+ */
+ if (percpu_ref_is_dying(&ctx->refs))
+ return -ENXIO;
+
+ if (ctx->restricted) {
+ if (opcode >= IORING_REGISTER_LAST)
+ return -EINVAL;
+ opcode = array_index_nospec(opcode, IORING_REGISTER_LAST);
+ if (!test_bit(opcode, ctx->restrictions.register_op))
+ return -EACCES;
+ }
+
+ switch (opcode) {
+ case IORING_REGISTER_BUFFERS:
+ ret = io_sqe_buffers_register(ctx, arg, nr_args, NULL);
+ break;
+ case IORING_UNREGISTER_BUFFERS:
+ ret = -EINVAL;
+ if (arg || nr_args)
+ break;
+ ret = io_sqe_buffers_unregister(ctx);
+ break;
+ case IORING_REGISTER_FILES:
+ ret = io_sqe_files_register(ctx, arg, nr_args, NULL);
+ break;
+ case IORING_UNREGISTER_FILES:
+ ret = -EINVAL;
+ if (arg || nr_args)
+ break;
+ ret = io_sqe_files_unregister(ctx);
+ break;
+ case IORING_REGISTER_FILES_UPDATE:
+ ret = io_register_files_update(ctx, arg, nr_args);
+ break;
+ case IORING_REGISTER_EVENTFD:
+ ret = -EINVAL;
+ if (nr_args != 1)
+ break;
+ ret = io_eventfd_register(ctx, arg, 0);
+ break;
+ case IORING_REGISTER_EVENTFD_ASYNC:
+ ret = -EINVAL;
+ if (nr_args != 1)
+ break;
+ ret = io_eventfd_register(ctx, arg, 1);
+ break;
+ case IORING_UNREGISTER_EVENTFD:
+ ret = -EINVAL;
+ if (arg || nr_args)
+ break;
+ ret = io_eventfd_unregister(ctx);
+ break;
+ case IORING_REGISTER_PROBE:
+ ret = -EINVAL;
+ if (!arg || nr_args > 256)
+ break;
+ ret = io_probe(ctx, arg, nr_args);
+ break;
+ case IORING_REGISTER_PERSONALITY:
+ ret = -EINVAL;
+ if (arg || nr_args)
+ break;
+ ret = io_register_personality(ctx);
+ break;
+ case IORING_UNREGISTER_PERSONALITY:
+ ret = -EINVAL;
+ if (arg)
+ break;
+ ret = io_unregister_personality(ctx, nr_args);
+ break;
+ case IORING_REGISTER_ENABLE_RINGS:
+ ret = -EINVAL;
+ if (arg || nr_args)
+ break;
+ ret = io_register_enable_rings(ctx);
+ break;
+ case IORING_REGISTER_RESTRICTIONS:
+ ret = io_register_restrictions(ctx, arg, nr_args);
+ break;
+ case IORING_REGISTER_FILES2:
+ ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_FILE);
+ break;
+ case IORING_REGISTER_FILES_UPDATE2:
+ ret = io_register_rsrc_update(ctx, arg, nr_args,
+ IORING_RSRC_FILE);
+ break;
+ case IORING_REGISTER_BUFFERS2:
+ ret = io_register_rsrc(ctx, arg, nr_args, IORING_RSRC_BUFFER);
+ break;
+ case IORING_REGISTER_BUFFERS_UPDATE:
+ ret = io_register_rsrc_update(ctx, arg, nr_args,
+ IORING_RSRC_BUFFER);
+ break;
+ case IORING_REGISTER_IOWQ_AFF:
+ ret = -EINVAL;
+ if (!arg || !nr_args)
+ break;
+ ret = io_register_iowq_aff(ctx, arg, nr_args);
+ break;
+ case IORING_UNREGISTER_IOWQ_AFF:
+ ret = -EINVAL;
+ if (arg || nr_args)
+ break;
+ ret = io_unregister_iowq_aff(ctx);
+ break;
+ case IORING_REGISTER_IOWQ_MAX_WORKERS:
+ ret = -EINVAL;
+ if (!arg || nr_args != 2)
+ break;
+ ret = io_register_iowq_max_workers(ctx, arg);
+ break;
+ case IORING_REGISTER_RING_FDS:
+ ret = io_ringfd_register(ctx, arg, nr_args);
+ break;
+ case IORING_UNREGISTER_RING_FDS:
+ ret = io_ringfd_unregister(ctx, arg, nr_args);
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+SYSCALL_DEFINE4(io_uring_register, unsigned int, fd, unsigned int, opcode,
+ void __user *, arg, unsigned int, nr_args)
+{
+ struct io_ring_ctx *ctx;
+ long ret = -EBADF;
+ struct fd f;
+
+ f = fdget(fd);
+ if (!f.file)
+ return -EBADF;
+
+ ret = -EOPNOTSUPP;
+ if (f.file->f_op != &io_uring_fops)
+ goto out_fput;
+
+ ctx = f.file->private_data;
+
+ io_run_task_work();
+
+ mutex_lock(&ctx->uring_lock);
+ ret = __io_uring_register(ctx, opcode, arg, nr_args);
+ mutex_unlock(&ctx->uring_lock);
+ trace_io_uring_register(ctx, opcode, ctx->nr_user_files, ctx->nr_user_bufs, ret);
+out_fput:
+ fdput(f);
+ return ret;
+}
+
+static int __init io_uring_init(void)
+{
+#define __BUILD_BUG_VERIFY_ELEMENT(stype, eoffset, etype, ename) do { \
+ BUILD_BUG_ON(offsetof(stype, ename) != eoffset); \
+ BUILD_BUG_ON(sizeof(etype) != sizeof_field(stype, ename)); \
+} while (0)
+
+#define BUILD_BUG_SQE_ELEM(eoffset, etype, ename) \
+ __BUILD_BUG_VERIFY_ELEMENT(struct io_uring_sqe, eoffset, etype, ename)
+ BUILD_BUG_ON(sizeof(struct io_uring_sqe) != 64);
+ BUILD_BUG_SQE_ELEM(0, __u8, opcode);
+ BUILD_BUG_SQE_ELEM(1, __u8, flags);
+ BUILD_BUG_SQE_ELEM(2, __u16, ioprio);
+ BUILD_BUG_SQE_ELEM(4, __s32, fd);
+ BUILD_BUG_SQE_ELEM(8, __u64, off);
+ BUILD_BUG_SQE_ELEM(8, __u64, addr2);
+ BUILD_BUG_SQE_ELEM(16, __u64, addr);
+ BUILD_BUG_SQE_ELEM(16, __u64, splice_off_in);
+ BUILD_BUG_SQE_ELEM(24, __u32, len);
+ BUILD_BUG_SQE_ELEM(28, __kernel_rwf_t, rw_flags);
+ BUILD_BUG_SQE_ELEM(28, /* compat */ int, rw_flags);
+ BUILD_BUG_SQE_ELEM(28, /* compat */ __u32, rw_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, fsync_flags);
+ BUILD_BUG_SQE_ELEM(28, /* compat */ __u16, poll_events);
+ BUILD_BUG_SQE_ELEM(28, __u32, poll32_events);
+ BUILD_BUG_SQE_ELEM(28, __u32, sync_range_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, msg_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, timeout_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, accept_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, cancel_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, open_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, statx_flags);
+ BUILD_BUG_SQE_ELEM(28, __u32, fadvise_advice);
+ BUILD_BUG_SQE_ELEM(28, __u32, splice_flags);
+ BUILD_BUG_SQE_ELEM(32, __u64, user_data);
+ BUILD_BUG_SQE_ELEM(40, __u16, buf_index);
+ BUILD_BUG_SQE_ELEM(40, __u16, buf_group);
+ BUILD_BUG_SQE_ELEM(42, __u16, personality);
+ BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
+ BUILD_BUG_SQE_ELEM(44, __u32, file_index);
+
+ BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
+ sizeof(struct io_uring_rsrc_update));
+ BUILD_BUG_ON(sizeof(struct io_uring_rsrc_update) >
+ sizeof(struct io_uring_rsrc_update2));
+
+ /* ->buf_index is u16 */
+ BUILD_BUG_ON(IORING_MAX_REG_BUFFERS >= (1u << 16));
+
+ /* should fit into one byte */
+ BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
+ BUILD_BUG_ON(SQE_COMMON_FLAGS >= (1 << 8));
+ BUILD_BUG_ON((SQE_VALID_FLAGS | SQE_COMMON_FLAGS) != SQE_VALID_FLAGS);
+
+ BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST);
+ BUILD_BUG_ON(__REQ_F_LAST_BIT > 8 * sizeof(int));
+
+ req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
+ SLAB_ACCOUNT);
+ return 0;
+};
+__initcall(io_uring_init);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 7f145aefbff8..d015fce67865 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -155,6 +155,11 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
return &array->map;
}
+static void *array_map_elem_ptr(struct bpf_array* array, u32 index)
+{
+ return array->value + (u64)array->elem_size * index;
+}
+
/* Called from syscall or from eBPF program */
static void *array_map_lookup_elem(struct bpf_map *map, void *key)
{
@@ -164,7 +169,7 @@ static void *array_map_lookup_elem(struct bpf_map *map, void *key)
if (unlikely(index >= array->map.max_entries))
return NULL;
- return array->value + array->elem_size * (index & array->index_mask);
+ return array->value + (u64)array->elem_size * (index & array->index_mask);
}
static int array_map_direct_value_addr(const struct bpf_map *map, u64 *imm,
@@ -287,10 +292,12 @@ static int array_map_get_next_key(struct bpf_map *map, void *key, void *next_key
return 0;
}
-static void check_and_free_timer_in_array(struct bpf_array *arr, void *val)
+static void check_and_free_fields(struct bpf_array *arr, void *val)
{
- if (unlikely(map_value_has_timer(&arr->map)))
+ if (map_value_has_timer(&arr->map))
bpf_timer_cancel_and_free(val + arr->map.timer_off);
+ if (map_value_has_kptrs(&arr->map))
+ bpf_map_free_kptrs(&arr->map, val);
}
/* Called from syscall or from eBPF program */
@@ -322,12 +329,12 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value,
value, map->value_size);
} else {
val = array->value +
- array->elem_size * (index & array->index_mask);
+ (u64)array->elem_size * (index & array->index_mask);
if (map_flags & BPF_F_LOCK)
copy_map_value_locked(map, val, value, false);
else
copy_map_value(map, val, value);
- check_and_free_timer_in_array(array, val);
+ check_and_free_fields(array, val);
}
return 0;
}
@@ -386,18 +393,25 @@ static void array_map_free_timers(struct bpf_map *map)
struct bpf_array *array = container_of(map, struct bpf_array, map);
int i;
- if (likely(!map_value_has_timer(map)))
+ /* We don't reset or free kptr on uref dropping to zero. */
+ if (!map_value_has_timer(map))
return;
for (i = 0; i < array->map.max_entries; i++)
- bpf_timer_cancel_and_free(array->value + array->elem_size * i +
- map->timer_off);
+ bpf_timer_cancel_and_free(array_map_elem_ptr(array, i) + map->timer_off);
}
/* Called when map->refcnt goes to zero, either from workqueue or from syscall */
static void array_map_free(struct bpf_map *map)
{
struct bpf_array *array = container_of(map, struct bpf_array, map);
+ int i;
+
+ if (map_value_has_kptrs(map)) {
+ for (i = 0; i < array->map.max_entries; i++)
+ bpf_map_free_kptrs(map, array_map_elem_ptr(array, i));
+ bpf_map_free_kptr_off_tab(map);
+ }
if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
bpf_array_free_percpu(array);
@@ -531,7 +545,7 @@ static void *bpf_array_map_seq_start(struct seq_file *seq, loff_t *pos)
index = info->index & array->index_mask;
if (info->percpu_value_buf)
return array->pptrs[index];
- return array->value + array->elem_size * index;
+ return array_map_elem_ptr(array, index);
}
static void *bpf_array_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
@@ -550,7 +564,7 @@ static void *bpf_array_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
index = info->index & array->index_mask;
if (info->percpu_value_buf)
return array->pptrs[index];
- return array->value + array->elem_size * index;
+ return array_map_elem_ptr(array, index);
}
static int __bpf_array_map_seq_show(struct seq_file *seq, void *v)
@@ -665,7 +679,7 @@ static int bpf_for_each_array_elem(struct bpf_map *map, bpf_callback_t callback_
if (is_percpu)
val = this_cpu_ptr(array->pptrs[i]);
else
- val = array->value + array->elem_size * i;
+ val = array_map_elem_ptr(array, i);
num_elems++;
key = i;
ret = callback_fn((u64)(long)map, (u64)(long)&key,
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 21069dbe9138..310b0591d91f 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -32,15 +32,15 @@ struct bpf_struct_ops_map {
const struct bpf_struct_ops *st_ops;
/* protect map_update */
struct mutex lock;
- /* progs has all the bpf_prog that is populated
+ /* link has all the bpf_links that is populated
* to the func ptr of the kernel's struct
* (in kvalue.data).
*/
- struct bpf_prog **progs;
+ struct bpf_link **links;
/* image is a page that has all the trampolines
* that stores the func args before calling the bpf_prog.
* A PAGE_SIZE "image" is enough to store all trampoline for
- * "progs[]".
+ * "links[]".
*/
void *image;
/* uvalue->data stores the kernel struct
@@ -282,9 +282,9 @@ static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
u32 i;
for (i = 0; i < btf_type_vlen(t); i++) {
- if (st_map->progs[i]) {
- bpf_prog_put(st_map->progs[i]);
- st_map->progs[i] = NULL;
+ if (st_map->links[i]) {
+ bpf_link_put(st_map->links[i]);
+ st_map->links[i] = NULL;
}
}
}
@@ -315,18 +315,34 @@ static int check_zero_holes(const struct btf_type *t, void *data)
return 0;
}
-int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_progs *tprogs,
- struct bpf_prog *prog,
+static void bpf_struct_ops_link_release(struct bpf_link *link)
+{
+}
+
+static void bpf_struct_ops_link_dealloc(struct bpf_link *link)
+{
+ struct bpf_tramp_link *tlink = container_of(link, struct bpf_tramp_link, link);
+
+ kfree(tlink);
+}
+
+const struct bpf_link_ops bpf_struct_ops_link_lops = {
+ .release = bpf_struct_ops_link_release,
+ .dealloc = bpf_struct_ops_link_dealloc,
+};
+
+int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
+ struct bpf_tramp_link *link,
const struct btf_func_model *model,
void *image, void *image_end)
{
u32 flags;
- tprogs[BPF_TRAMP_FENTRY].progs[0] = prog;
- tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
+ tlinks[BPF_TRAMP_FENTRY].links[0] = link;
+ tlinks[BPF_TRAMP_FENTRY].nr_links = 1;
flags = model->ret_size > 0 ? BPF_TRAMP_F_RET_FENTRY_RET : 0;
return arch_prepare_bpf_trampoline(NULL, image, image_end,
- model, flags, tprogs, NULL);
+ model, flags, tlinks, NULL);
}
static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
@@ -337,7 +353,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
struct bpf_struct_ops_value *uvalue, *kvalue;
const struct btf_member *member;
const struct btf_type *t = st_ops->type;
- struct bpf_tramp_progs *tprogs = NULL;
+ struct bpf_tramp_links *tlinks = NULL;
void *udata, *kdata;
int prog_fd, err = 0;
void *image, *image_end;
@@ -361,8 +377,8 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
if (uvalue->state || refcount_read(&uvalue->refcnt))
return -EINVAL;
- tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(*tprogs), GFP_KERNEL);
- if (!tprogs)
+ tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
+ if (!tlinks)
return -ENOMEM;
uvalue = (struct bpf_struct_ops_value *)st_map->uvalue;
@@ -385,6 +401,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
for_each_member(i, t, member) {
const struct btf_type *mtype, *ptype;
struct bpf_prog *prog;
+ struct bpf_tramp_link *link;
u32 moff;
moff = __btf_member_bit_offset(t, member) / 8;
@@ -438,16 +455,26 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
err = PTR_ERR(prog);
goto reset_unlock;
}
- st_map->progs[i] = prog;
if (prog->type != BPF_PROG_TYPE_STRUCT_OPS ||
prog->aux->attach_btf_id != st_ops->type_id ||
prog->expected_attach_type != i) {
+ bpf_prog_put(prog);
err = -EINVAL;
goto reset_unlock;
}
- err = bpf_struct_ops_prepare_trampoline(tprogs, prog,
+ link = kzalloc(sizeof(*link), GFP_USER);
+ if (!link) {
+ bpf_prog_put(prog);
+ err = -ENOMEM;
+ goto reset_unlock;
+ }
+ bpf_link_init(&link->link, BPF_LINK_TYPE_STRUCT_OPS,
+ &bpf_struct_ops_link_lops, prog);
+ st_map->links[i] = &link->link;
+
+ err = bpf_struct_ops_prepare_trampoline(tlinks, link,
&st_ops->func_models[i],
image, image_end);
if (err < 0)
@@ -490,7 +517,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
memset(uvalue, 0, map->value_size);
memset(kvalue, 0, map->value_size);
unlock:
- kfree(tprogs);
+ kfree(tlinks);
mutex_unlock(&st_map->lock);
return err;
}
@@ -545,9 +572,9 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
{
struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
- if (st_map->progs)
+ if (st_map->links)
bpf_struct_ops_map_put_progs(st_map);
- bpf_map_area_free(st_map->progs);
+ bpf_map_area_free(st_map->links);
bpf_jit_free_exec(st_map->image);
bpf_map_area_free(st_map->uvalue);
bpf_map_area_free(st_map);
@@ -596,11 +623,11 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
map = &st_map->map;
st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
- st_map->progs =
- bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_prog *),
+ st_map->links =
+ bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_links *),
NUMA_NO_NODE);
st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
- if (!st_map->uvalue || !st_map->progs || !st_map->image) {
+ if (!st_map->uvalue || !st_map->links || !st_map->image) {
bpf_struct_ops_map_free(map);
return ERR_PTR(-ENOMEM);
}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index feef799884d1..7a593ecfbeec 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -207,12 +207,18 @@ enum btf_kfunc_hook {
enum {
BTF_KFUNC_SET_MAX_CNT = 32,
+ BTF_DTOR_KFUNC_MAX_CNT = 256,
};
struct btf_kfunc_set_tab {
struct btf_id_set *sets[BTF_KFUNC_HOOK_MAX][BTF_KFUNC_TYPE_MAX];
};
+struct btf_id_dtor_kfunc_tab {
+ u32 cnt;
+ struct btf_id_dtor_kfunc dtors[];
+};
+
struct btf {
void *data;
struct btf_type **types;
@@ -228,6 +234,7 @@ struct btf {
u32 id;
struct rcu_head rcu;
struct btf_kfunc_set_tab *kfunc_set_tab;
+ struct btf_id_dtor_kfunc_tab *dtor_kfunc_tab;
/* split BTF support */
struct btf *base_btf;
@@ -1616,8 +1623,19 @@ static void btf_free_kfunc_set_tab(struct btf *btf)
btf->kfunc_set_tab = NULL;
}
+static void btf_free_dtor_kfunc_tab(struct btf *btf)
+{
+ struct btf_id_dtor_kfunc_tab *tab = btf->dtor_kfunc_tab;
+
+ if (!tab)
+ return;
+ kfree(tab);
+ btf->dtor_kfunc_tab = NULL;
+}
+
static void btf_free(struct btf *btf)
{
+ btf_free_dtor_kfunc_tab(btf);
btf_free_kfunc_set_tab(btf);
kvfree(btf->types);
kvfree(btf->resolved_sizes);
@@ -3163,24 +3181,86 @@ static void btf_struct_log(struct btf_verifier_env *env,
btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t));
}
+enum btf_field_type {
+ BTF_FIELD_SPIN_LOCK,
+ BTF_FIELD_TIMER,
+ BTF_FIELD_KPTR,
+};
+
+enum {
+ BTF_FIELD_IGNORE = 0,
+ BTF_FIELD_FOUND = 1,
+};
+
+struct btf_field_info {
+ u32 type_id;
+ u32 off;
+ enum bpf_kptr_type type;
+};
+
+static int btf_find_struct(const struct btf *btf, const struct btf_type *t,
+ u32 off, int sz, struct btf_field_info *info)
+{
+ if (!__btf_type_is_struct(t))
+ return BTF_FIELD_IGNORE;
+ if (t->size != sz)
+ return BTF_FIELD_IGNORE;
+ info->off = off;
+ return BTF_FIELD_FOUND;
+}
+
+static int btf_find_kptr(const struct btf *btf, const struct btf_type *t,
+ u32 off, int sz, struct btf_field_info *info)
+{
+ enum bpf_kptr_type type;
+ u32 res_id;
+
+ /* For PTR, sz is always == 8 */
+ if (!btf_type_is_ptr(t))
+ return BTF_FIELD_IGNORE;
+ t = btf_type_by_id(btf, t->type);
+
+ if (!btf_type_is_type_tag(t))
+ return BTF_FIELD_IGNORE;
+ /* Reject extra tags */
+ if (btf_type_is_type_tag(btf_type_by_id(btf, t->type)))
+ return -EINVAL;
+ if (!strcmp("kptr", __btf_name_by_offset(btf, t->name_off)))
+ type = BPF_KPTR_UNREF;
+ else if (!strcmp("kptr_ref", __btf_name_by_offset(btf, t->name_off)))
+ type = BPF_KPTR_REF;
+ else
+ return -EINVAL;
+
+ /* Get the base type */
+ t = btf_type_skip_modifiers(btf, t->type, &res_id);
+ /* Only pointer to struct is allowed */
+ if (!__btf_type_is_struct(t))
+ return -EINVAL;
+
+ info->type_id = res_id;
+ info->off = off;
+ info->type = type;
+ return BTF_FIELD_FOUND;
+}
+
static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t,
- const char *name, int sz, int align)
+ const char *name, int sz, int align,
+ enum btf_field_type field_type,
+ struct btf_field_info *info, int info_cnt)
{
const struct btf_member *member;
- u32 i, off = -ENOENT;
+ struct btf_field_info tmp;
+ int ret, idx = 0;
+ u32 i, off;
for_each_member(i, t, member) {
const struct btf_type *member_type = btf_type_by_id(btf,
member->type);
- if (!__btf_type_is_struct(member_type))
- continue;
- if (member_type->size != sz)
- continue;
- if (strcmp(__btf_name_by_offset(btf, member_type->name_off), name))
+
+ if (name && strcmp(__btf_name_by_offset(btf, member_type->name_off), name))
continue;
- if (off != -ENOENT)
- /* only one such field is allowed */
- return -E2BIG;
+
off = __btf_member_bit_offset(t, member);
if (off % 8)
/* valid C code cannot generate such BTF */
@@ -3188,46 +3268,115 @@ static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t
off /= 8;
if (off % align)
return -EINVAL;
+
+ switch (field_type) {
+ case BTF_FIELD_SPIN_LOCK:
+ case BTF_FIELD_TIMER:
+ ret = btf_find_struct(btf, member_type, off, sz,
+ idx < info_cnt ? &info[idx] : &tmp);
+ if (ret < 0)
+ return ret;
+ break;
+ case BTF_FIELD_KPTR:
+ ret = btf_find_kptr(btf, member_type, off, sz,
+ idx < info_cnt ? &info[idx] : &tmp);
+ if (ret < 0)
+ return ret;
+ break;
+ default:
+ return -EFAULT;
+ }
+
+ if (ret == BTF_FIELD_IGNORE)
+ continue;
+ if (idx >= info_cnt)
+ return -E2BIG;
+ ++idx;
}
- return off;
+ return idx;
}
static int btf_find_datasec_var(const struct btf *btf, const struct btf_type *t,
- const char *name, int sz, int align)
+ const char *name, int sz, int align,
+ enum btf_field_type field_type,
+ struct btf_field_info *info, int info_cnt)
{
const struct btf_var_secinfo *vsi;
- u32 i, off = -ENOENT;
+ struct btf_field_info tmp;
+ int ret, idx = 0;
+ u32 i, off;
for_each_vsi(i, t, vsi) {
const struct btf_type *var = btf_type_by_id(btf, vsi->type);
const struct btf_type *var_type = btf_type_by_id(btf, var->type);
- if (!__btf_type_is_struct(var_type))
- continue;
- if (var_type->size != sz)
+ off = vsi->offset;
+
+ if (name && strcmp(__btf_name_by_offset(btf, var_type->name_off), name))
continue;
if (vsi->size != sz)
continue;
- if (strcmp(__btf_name_by_offset(btf, var_type->name_off), name))
- continue;
- if (off != -ENOENT)
- /* only one such field is allowed */
- return -E2BIG;
- off = vsi->offset;
if (off % align)
return -EINVAL;
+
+ switch (field_type) {
+ case BTF_FIELD_SPIN_LOCK:
+ case BTF_FIELD_TIMER:
+ ret = btf_find_struct(btf, var_type, off, sz,
+ idx < info_cnt ? &info[idx] : &tmp);
+ if (ret < 0)
+ return ret;
+ break;
+ case BTF_FIELD_KPTR:
+ ret = btf_find_kptr(btf, var_type, off, sz,
+ idx < info_cnt ? &info[idx] : &tmp);
+ if (ret < 0)
+ return ret;
+ break;
+ default:
+ return -EFAULT;
+ }
+
+ if (ret == BTF_FIELD_IGNORE)
+ continue;
+ if (idx >= info_cnt)
+ return -E2BIG;
+ ++idx;
}
- return off;
+ return idx;
}
static int btf_find_field(const struct btf *btf, const struct btf_type *t,
- const char *name, int sz, int align)
+ enum btf_field_type field_type,
+ struct btf_field_info *info, int info_cnt)
{
+ const char *name;
+ int sz, align;
+
+ switch (field_type) {
+ case BTF_FIELD_SPIN_LOCK:
+ name = "bpf_spin_lock";
+ sz = sizeof(struct bpf_spin_lock);
+ align = __alignof__(struct bpf_spin_lock);
+ break;
+ case BTF_FIELD_TIMER:
+ name = "bpf_timer";
+ sz = sizeof(struct bpf_timer);
+ align = __alignof__(struct bpf_timer);
+ break;
+ case BTF_FIELD_KPTR:
+ name = NULL;
+ sz = sizeof(u64);
+ align = 8;
+ break;
+ default:
+ return -EFAULT;
+ }
if (__btf_type_is_struct(t))
- return btf_find_struct_field(btf, t, name, sz, align);
+ return btf_find_struct_field(btf, t, name, sz, align, field_type, info, info_cnt);
else if (btf_type_is_datasec(t))
- return btf_find_datasec_var(btf, t, name, sz, align);
+ return btf_find_datasec_var(btf, t, name, sz, align, field_type, info, info_cnt);
return -EINVAL;
}
@@ -3237,16 +3386,130 @@ static int btf_find_field(const struct btf *btf, const struct btf_type *t,
*/
int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t)
{
- return btf_find_field(btf, t, "bpf_spin_lock",
- sizeof(struct bpf_spin_lock),
- __alignof__(struct bpf_spin_lock));
+ struct btf_field_info info;
+ int ret;
+
+ ret = btf_find_field(btf, t, BTF_FIELD_SPIN_LOCK, &info, 1);
+ if (ret < 0)
+ return ret;
+ if (!ret)
+ return -ENOENT;
+ return info.off;
}
int btf_find_timer(const struct btf *btf, const struct btf_type *t)
{
- return btf_find_field(btf, t, "bpf_timer",
- sizeof(struct bpf_timer),
- __alignof__(struct bpf_timer));
+ struct btf_field_info info;
+ int ret;
+
+ ret = btf_find_field(btf, t, BTF_FIELD_TIMER, &info, 1);
+ if (ret < 0)
+ return ret;
+ if (!ret)
+ return -ENOENT;
+ return info.off;
+}
+
+struct bpf_map_value_off *btf_parse_kptrs(const struct btf *btf,
+ const struct btf_type *t)
+{
+ struct btf_field_info info_arr[BPF_MAP_VALUE_OFF_MAX];
+ struct bpf_map_value_off *tab;
+ struct btf *kernel_btf = NULL;
+ struct module *mod = NULL;
+ int ret, i, nr_off;
+
+ ret = btf_find_field(btf, t, BTF_FIELD_KPTR, info_arr, ARRAY_SIZE(info_arr));
+ if (ret < 0)
+ return ERR_PTR(ret);
+ if (!ret)
+ return NULL;
+
+ nr_off = ret;
+ tab = kzalloc(offsetof(struct bpf_map_value_off, off[nr_off]), GFP_KERNEL | __GFP_NOWARN);
+ if (!tab)
+ return ERR_PTR(-ENOMEM);
+
+ for (i = 0; i < nr_off; i++) {
+ const struct btf_type *t;
+ s32 id;
+
+ /* Find type in map BTF, and use it to look up the matching type
+ * in vmlinux or module BTFs, by name and kind.
+ */
+ t = btf_type_by_id(btf, info_arr[i].type_id);
+ id = bpf_find_btf_id(__btf_name_by_offset(btf, t->name_off), BTF_INFO_KIND(t->info),
+ &kernel_btf);
+ if (id < 0) {
+ ret = id;
+ goto end;
+ }
+
+ /* Find and stash the function pointer for the destruction function that
+ * needs to be eventually invoked from the map free path.
+ */
+ if (info_arr[i].type == BPF_KPTR_REF) {
+ const struct btf_type *dtor_func;
+ const char *dtor_func_name;
+ unsigned long addr;
+ s32 dtor_btf_id;
+
+ /* This call also serves as a whitelist of allowed objects that
+ * can be used as a referenced pointer and be stored in a map at
+ * the same time.
+ */
+ dtor_btf_id = btf_find_dtor_kfunc(kernel_btf, id);
+ if (dtor_btf_id < 0) {
+ ret = dtor_btf_id;
+ goto end_btf;
+ }
+
+ dtor_func = btf_type_by_id(kernel_btf, dtor_btf_id);
+ if (!dtor_func) {
+ ret = -ENOENT;
+ goto end_btf;
+ }
+
+ if (btf_is_module(kernel_btf)) {
+ mod = btf_try_get_module(kernel_btf);
+ if (!mod) {
+ ret = -ENXIO;
+ goto end_btf;
+ }
+ }
+
+ /* We already verified dtor_func to be btf_type_is_func
+ * in register_btf_id_dtor_kfuncs.
+ */
+ dtor_func_name = __btf_name_by_offset(kernel_btf, dtor_func->name_off);
+ addr = kallsyms_lookup_name(dtor_func_name);
+ if (!addr) {
+ ret = -EINVAL;
+ goto end_mod;
+ }
+ tab->off[i].kptr.dtor = (void *)addr;
+ }
+
+ tab->off[i].offset = info_arr[i].off;
+ tab->off[i].type = info_arr[i].type;
+ tab->off[i].kptr.btf_id = id;
+ tab->off[i].kptr.btf = kernel_btf;
+ tab->off[i].kptr.module = mod;
+ }
+ tab->nr_off = nr_off;
+ return tab;
+end_mod:
+ module_put(mod);
+end_btf:
+ btf_put(kernel_btf);
+end:
+ while (i--) {
+ btf_put(tab->off[i].kptr.btf);
+ if (tab->off[i].kptr.module)
+ module_put(tab->off[i].kptr.module);
+ }
+ kfree(tab);
+ return ERR_PTR(ret);
}
static void __btf_struct_show(const struct btf *btf, const struct btf_type *t,
@@ -5811,6 +6074,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
* verifier sees.
*/
for (i = 0; i < nargs; i++) {
+ enum bpf_arg_type arg_type = ARG_DONTCARE;
u32 regno = i + 1;
struct bpf_reg_state *reg = ®s[regno];
@@ -5831,7 +6095,9 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
ref_tname = btf_name_by_offset(btf, ref_t->name_off);
- ret = check_func_arg_reg_off(env, reg, regno, ARG_DONTCARE, rel);
+ if (rel && reg->ref_obj_id)
+ arg_type |= OBJ_RELEASE;
+ ret = check_func_arg_reg_off(env, reg, regno, arg_type);
if (ret < 0)
return ret;
@@ -5862,11 +6128,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
if (reg->type == PTR_TO_BTF_ID) {
reg_btf = reg->btf;
reg_ref_id = reg->btf_id;
- /* Ensure only one argument is referenced
- * PTR_TO_BTF_ID, check_func_arg_reg_off relies
- * on only one referenced register being allowed
- * for kfuncs.
- */
+ /* Ensure only one argument is referenced PTR_TO_BTF_ID */
if (reg->ref_obj_id) {
if (ref_obj_id) {
bpf_log(log, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
@@ -6832,6 +7094,138 @@ int register_btf_kfunc_id_set(enum bpf_prog_type prog_type,
}
EXPORT_SYMBOL_GPL(register_btf_kfunc_id_set);
+s32 btf_find_dtor_kfunc(struct btf *btf, u32 btf_id)
+{
+ struct btf_id_dtor_kfunc_tab *tab = btf->dtor_kfunc_tab;
+ struct btf_id_dtor_kfunc *dtor;
+
+ if (!tab)
+ return -ENOENT;
+ /* Even though the size of tab->dtors[0] is > sizeof(u32), we only need
+ * to compare the first u32 with btf_id, so we can reuse btf_id_cmp_func.
+ */
+ BUILD_BUG_ON(offsetof(struct btf_id_dtor_kfunc, btf_id) != 0);
+ dtor = bsearch(&btf_id, tab->dtors, tab->cnt, sizeof(tab->dtors[0]), btf_id_cmp_func);
+ if (!dtor)
+ return -ENOENT;
+ return dtor->kfunc_btf_id;
+}
+
+static int btf_check_dtor_kfuncs(struct btf *btf, const struct btf_id_dtor_kfunc *dtors, u32 cnt)
+{
+ const struct btf_type *dtor_func, *dtor_func_proto, *t;
+ const struct btf_param *args;
+ s32 dtor_btf_id;
+ u32 nr_args, i;
+
+ for (i = 0; i < cnt; i++) {
+ dtor_btf_id = dtors[i].kfunc_btf_id;
+
+ dtor_func = btf_type_by_id(btf, dtor_btf_id);
+ if (!dtor_func || !btf_type_is_func(dtor_func))
+ return -EINVAL;
+
+ dtor_func_proto = btf_type_by_id(btf, dtor_func->type);
+ if (!dtor_func_proto || !btf_type_is_func_proto(dtor_func_proto))
+ return -EINVAL;
+
+ /* Make sure the prototype of the destructor kfunc is 'void func(type *)' */
+ t = btf_type_by_id(btf, dtor_func_proto->type);
+ if (!t || !btf_type_is_void(t))
+ return -EINVAL;
+
+ nr_args = btf_type_vlen(dtor_func_proto);
+ if (nr_args != 1)
+ return -EINVAL;
+ args = btf_params(dtor_func_proto);
+ t = btf_type_by_id(btf, args[0].type);
+ /* Allow any pointer type, as width on targets Linux supports
+ * will be same for all pointer types (i.e. sizeof(void *))
+ */
+ if (!t || !btf_type_is_ptr(t))
+ return -EINVAL;
+ }
+ return 0;
+}
+
+/* This function must be invoked only from initcalls/module init functions */
+int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_cnt,
+ struct module *owner)
+{
+ struct btf_id_dtor_kfunc_tab *tab;
+ struct btf *btf;
+ u32 tab_cnt;
+ int ret;
+
+ btf = btf_get_module_btf(owner);
+ if (!btf) {
+ if (!owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) {
+ pr_err("missing vmlinux BTF, cannot register dtor kfuncs\n");
+ return -ENOENT;
+ }
+ if (owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
+ pr_err("missing module BTF, cannot register dtor kfuncs\n");
+ return -ENOENT;
+ }
+ return 0;
+ }
+ if (IS_ERR(btf))
+ return PTR_ERR(btf);
+
+ if (add_cnt >= BTF_DTOR_KFUNC_MAX_CNT) {
+ pr_err("cannot register more than %d kfunc destructors\n", BTF_DTOR_KFUNC_MAX_CNT);
+ ret = -E2BIG;
+ goto end;
+ }
+
+ /* Ensure that the prototype of dtor kfuncs being registered is sane */
+ ret = btf_check_dtor_kfuncs(btf, dtors, add_cnt);
+ if (ret < 0)
+ goto end;
+
+ tab = btf->dtor_kfunc_tab;
+ /* Only one call allowed for modules */
+ if (WARN_ON_ONCE(tab && btf_is_module(btf))) {
+ ret = -EINVAL;
+ goto end;
+ }
+
+ tab_cnt = tab ? tab->cnt : 0;
+ if (tab_cnt > U32_MAX - add_cnt) {
+ ret = -EOVERFLOW;
+ goto end;
+ }
+ if (tab_cnt + add_cnt >= BTF_DTOR_KFUNC_MAX_CNT) {
+ pr_err("cannot register more than %d kfunc destructors\n", BTF_DTOR_KFUNC_MAX_CNT);
+ ret = -E2BIG;
+ goto end;
+ }
+
+ tab = krealloc(btf->dtor_kfunc_tab,
+ offsetof(struct btf_id_dtor_kfunc_tab, dtors[tab_cnt + add_cnt]),
+ GFP_KERNEL | __GFP_NOWARN);
+ if (!tab) {
+ ret = -ENOMEM;
+ goto end;
+ }
+
+ if (!btf->dtor_kfunc_tab)
+ tab->cnt = 0;
+ btf->dtor_kfunc_tab = tab;
+
+ memcpy(tab->dtors + tab->cnt, dtors, add_cnt * sizeof(tab->dtors[0]));
+ tab->cnt += add_cnt;
+
+ sort(tab->dtors, tab->cnt, sizeof(tab->dtors[0]), btf_id_cmp_func, NULL);
+
+ return 0;
+end:
+ btf_free_dtor_kfunc_tab(btf);
+ btf_put(btf);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(register_btf_id_dtor_kfuncs);
+
#define MAX_TYPES_ARE_COMPAT_DEPTH 2
static
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 0cb6211fcb58..e7af3e582b4c 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -747,6 +747,60 @@ static struct bpf_prog_list *find_detach_entry(struct list_head *progs,
return ERR_PTR(-ENOENT);
}
+/**
+ * purge_effective_progs() - After compute_effective_progs fails to alloc new
+ * cgrp->bpf.inactive table we can recover by
+ * recomputing the array in place.
+ *
+ * @cgrp: The cgroup which descendants to travers
+ * @prog: A program to detach or NULL
+ * @link: A link to detach or NULL
+ * @atype: Type of detach operation
+ */
+static void purge_effective_progs(struct cgroup *cgrp, struct bpf_prog *prog,
+ struct bpf_cgroup_link *link,
+ enum cgroup_bpf_attach_type atype)
+{
+ struct cgroup_subsys_state *css;
+ struct bpf_prog_array *progs;
+ struct bpf_prog_list *pl;
+ struct list_head *head;
+ struct cgroup *cg;
+ int pos;
+
+ /* recompute effective prog array in place */
+ css_for_each_descendant_pre(css, &cgrp->self) {
+ struct cgroup *desc = container_of(css, struct cgroup, self);
+
+ if (percpu_ref_is_zero(&desc->bpf.refcnt))
+ continue;
+
+ /* find position of link or prog in effective progs array */
+ for (pos = 0, cg = desc; cg; cg = cgroup_parent(cg)) {
+ if (pos && !(cg->bpf.flags[atype] & BPF_F_ALLOW_MULTI))
+ continue;
+
+ head = &cg->bpf.progs[atype];
+ list_for_each_entry(pl, head, node) {
+ if (!prog_list_prog(pl))
+ continue;
+ if (pl->prog == prog && pl->link == link)
+ goto found;
+ pos++;
+ }
+ }
+found:
+ BUG_ON(!cg);
+ progs = rcu_dereference_protected(
+ desc->bpf.effective[atype],
+ lockdep_is_held(&cgroup_mutex));
+
+ /* Remove the program from the array */
+ WARN_ONCE(bpf_prog_array_delete_safe_at(progs, pos),
+ "Failed to purge a prog from array at index %d", pos);
+ }
+}
+
/**
* __cgroup_bpf_detach() - Detach the program or link from a cgroup, and
* propagate the change to descendants
@@ -766,7 +820,6 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
struct bpf_prog_list *pl;
struct list_head *progs;
u32 flags;
- int err;
atype = to_cgroup_bpf_attach_type(type);
if (atype < 0)
@@ -788,9 +841,12 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
pl->prog = NULL;
pl->link = NULL;
- err = update_effective_progs(cgrp, atype);
- if (err)
- goto cleanup;
+ if (update_effective_progs(cgrp, atype)) {
+ /* if update effective array failed replace the prog with a dummy prog*/
+ pl->prog = old_prog;
+ pl->link = link;
+ purge_effective_progs(cgrp, old_prog, link, atype);
+ }
/* now can actually delete it from this cgroup list */
list_del(&pl->node);
@@ -802,12 +858,6 @@ static int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
bpf_prog_put(old_prog);
static_branch_dec(&cgroup_bpf_enabled_key[atype]);
return 0;
-
-cleanup:
- /* restore back prog or link */
- pl->prog = old_prog;
- pl->link = link;
- return err;
}
static int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 3adff3831c04..483bee45ead5 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -649,12 +649,6 @@ static bool bpf_prog_kallsyms_candidate(const struct bpf_prog *fp)
return fp->jited && !bpf_prog_was_classic(fp);
}
-static bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
-{
- return list_empty(&fp->aux->ksym.lnode) ||
- fp->aux->ksym.lnode.prev == LIST_POISON2;
-}
-
void bpf_prog_kallsyms_add(struct bpf_prog *fp)
{
if (!bpf_prog_kallsyms_candidate(fp) ||
@@ -1149,7 +1143,6 @@ int bpf_jit_binary_pack_finalize(struct bpf_prog *prog,
bpf_prog_pack_free(ro_header);
return PTR_ERR(ptr);
}
- prog->aux->use_bpf_prog_pack = true;
return 0;
}
@@ -1173,17 +1166,23 @@ void bpf_jit_binary_pack_free(struct bpf_binary_header *ro_header,
bpf_jit_uncharge_modmem(size);
}
+struct bpf_binary_header *
+bpf_jit_binary_pack_hdr(const struct bpf_prog *fp)
+{
+ unsigned long real_start = (unsigned long)fp->bpf_func;
+ unsigned long addr;
+
+ addr = real_start & BPF_PROG_CHUNK_MASK;
+ return (void *)addr;
+}
+
static inline struct bpf_binary_header *
bpf_jit_binary_hdr(const struct bpf_prog *fp)
{
unsigned long real_start = (unsigned long)fp->bpf_func;
unsigned long addr;
- if (fp->aux->use_bpf_prog_pack)
- addr = real_start & BPF_PROG_CHUNK_MASK;
- else
- addr = real_start & PAGE_MASK;
-
+ addr = real_start & PAGE_MASK;
return (void *)addr;
}
@@ -1196,11 +1195,7 @@ void __weak bpf_jit_free(struct bpf_prog *fp)
if (fp->jited) {
struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp);
- if (fp->aux->use_bpf_prog_pack)
- bpf_jit_binary_pack_free(hdr, NULL /* rw_buffer */);
- else
- bpf_jit_binary_free(hdr);
-
+ bpf_jit_binary_free(hdr);
WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(fp));
}
@@ -2712,6 +2707,12 @@ bool __weak bpf_jit_needs_zext(void)
return false;
}
+/* Return TRUE if the JIT backend supports mixing bpf2bpf and tailcalls. */
+bool __weak bpf_jit_supports_subprog_tailcalls(void)
+{
+ return false;
+}
+
bool __weak bpf_jit_supports_kfunc_call(void)
{
return false;
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 65877967f414..ea99c91f72b6 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -238,7 +238,7 @@ static void htab_free_prealloced_timers(struct bpf_htab *htab)
u32 num_entries = htab->map.max_entries;
int i;
- if (likely(!map_value_has_timer(&htab->map)))
+ if (!map_value_has_timer(&htab->map))
return;
if (htab_has_extra_elems(htab))
num_entries += num_possible_cpus();
@@ -254,6 +254,25 @@ static void htab_free_prealloced_timers(struct bpf_htab *htab)
}
}
+static void htab_free_prealloced_kptrs(struct bpf_htab *htab)
+{
+ u32 num_entries = htab->map.max_entries;
+ int i;
+
+ if (!map_value_has_kptrs(&htab->map))
+ return;
+ if (htab_has_extra_elems(htab))
+ num_entries += num_possible_cpus();
+
+ for (i = 0; i < num_entries; i++) {
+ struct htab_elem *elem;
+
+ elem = get_htab_elem(htab, i);
+ bpf_map_free_kptrs(&htab->map, elem->key + round_up(htab->map.key_size, 8));
+ cond_resched();
+ }
+}
+
static void htab_free_elems(struct bpf_htab *htab)
{
int i;
@@ -725,12 +744,15 @@ static int htab_lru_map_gen_lookup(struct bpf_map *map,
return insn - insn_buf;
}
-static void check_and_free_timer(struct bpf_htab *htab, struct htab_elem *elem)
+static void check_and_free_fields(struct bpf_htab *htab,
+ struct htab_elem *elem)
{
- if (unlikely(map_value_has_timer(&htab->map)))
- bpf_timer_cancel_and_free(elem->key +
- round_up(htab->map.key_size, 8) +
- htab->map.timer_off);
+ void *map_value = elem->key + round_up(htab->map.key_size, 8);
+
+ if (map_value_has_timer(&htab->map))
+ bpf_timer_cancel_and_free(map_value + htab->map.timer_off);
+ if (map_value_has_kptrs(&htab->map))
+ bpf_map_free_kptrs(&htab->map, map_value);
}
/* It is called from the bpf_lru_list when the LRU needs to delete
@@ -757,7 +779,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
if (l == tgt_l) {
hlist_nulls_del_rcu(&l->hash_node);
- check_and_free_timer(htab, l);
+ check_and_free_fields(htab, l);
break;
}
@@ -829,7 +851,7 @@ static void htab_elem_free(struct bpf_htab *htab, struct htab_elem *l)
{
if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH)
free_percpu(htab_elem_get_ptr(l, htab->map.key_size));
- check_and_free_timer(htab, l);
+ check_and_free_fields(htab, l);
kfree(l);
}
@@ -857,7 +879,7 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
htab_put_fd_value(htab, l);
if (htab_is_prealloc(htab)) {
- check_and_free_timer(htab, l);
+ check_and_free_fields(htab, l);
__pcpu_freelist_push(&htab->freelist, &l->fnode);
} else {
atomic_dec(&htab->count);
@@ -1104,7 +1126,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
if (!htab_is_prealloc(htab))
free_htab_elem(htab, l_old);
else
- check_and_free_timer(htab, l_old);
+ check_and_free_fields(htab, l_old);
}
ret = 0;
err:
@@ -1114,7 +1136,7 @@ static int htab_map_update_elem(struct bpf_map *map, void *key, void *value,
static void htab_lru_push_free(struct bpf_htab *htab, struct htab_elem *elem)
{
- check_and_free_timer(htab, elem);
+ check_and_free_fields(htab, elem);
bpf_lru_push_free(&htab->lru, &elem->lru_node);
}
@@ -1419,8 +1441,14 @@ static void htab_free_malloced_timers(struct bpf_htab *htab)
struct hlist_nulls_node *n;
struct htab_elem *l;
- hlist_nulls_for_each_entry(l, n, head, hash_node)
- check_and_free_timer(htab, l);
+ hlist_nulls_for_each_entry(l, n, head, hash_node) {
+ /* We don't reset or free kptr on uref dropping to zero,
+ * hence just free timer.
+ */
+ bpf_timer_cancel_and_free(l->key +
+ round_up(htab->map.key_size, 8) +
+ htab->map.timer_off);
+ }
cond_resched_rcu();
}
rcu_read_unlock();
@@ -1430,7 +1458,8 @@ static void htab_map_free_timers(struct bpf_map *map)
{
struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
- if (likely(!map_value_has_timer(&htab->map)))
+ /* We don't reset or free kptr on uref dropping to zero. */
+ if (!map_value_has_timer(&htab->map))
return;
if (!htab_is_prealloc(htab))
htab_free_malloced_timers(htab);
@@ -1453,11 +1482,14 @@ static void htab_map_free(struct bpf_map *map)
* not have executed. Wait for them.
*/
rcu_barrier();
- if (!htab_is_prealloc(htab))
+ if (!htab_is_prealloc(htab)) {
delete_all_elements(htab);
- else
+ } else {
+ htab_free_prealloced_kptrs(htab);
prealloc_destroy(htab);
+ }
+ bpf_map_free_kptr_off_tab(map);
free_percpu(htab->extra_elems);
bpf_map_area_free(htab->buckets);
for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 315053ef6a75..3e709fed5306 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1374,6 +1374,28 @@ void bpf_timer_cancel_and_free(void *val)
kfree(t);
}
+BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr)
+{
+ unsigned long *kptr = map_value;
+
+ return xchg(kptr, (unsigned long)ptr);
+}
+
+/* Unlike other PTR_TO_BTF_ID helpers the btf_id in bpf_kptr_xchg()
+ * helper is determined dynamically by the verifier.
+ */
+#define BPF_PTR_POISON ((void *)((0xeB9FUL << 2) + POISON_POINTER_DELTA))
+
+const struct bpf_func_proto bpf_kptr_xchg_proto = {
+ .func = bpf_kptr_xchg,
+ .gpl_only = false,
+ .ret_type = RET_PTR_TO_BTF_ID_OR_NULL,
+ .ret_btf_id = BPF_PTR_POISON,
+ .arg1_type = ARG_PTR_TO_KPTR,
+ .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL | OBJ_RELEASE,
+ .arg2_btf_id = BPF_PTR_POISON,
+};
+
const struct bpf_func_proto bpf_get_current_task_proto __weak;
const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1452,6 +1474,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
return &bpf_timer_start_proto;
case BPF_FUNC_timer_cancel:
return &bpf_timer_cancel_proto;
+ case BPF_FUNC_kptr_xchg:
+ return &bpf_kptr_xchg_proto;
default:
break;
}
diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c
index 5cd8f5277279..135205d0d560 100644
--- a/kernel/bpf/map_in_map.c
+++ b/kernel/bpf/map_in_map.c
@@ -52,6 +52,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
inner_map_meta->max_entries = inner_map->max_entries;
inner_map_meta->spin_lock_off = inner_map->spin_lock_off;
inner_map_meta->timer_off = inner_map->timer_off;
+ inner_map_meta->kptr_off_tab = bpf_map_copy_kptr_off_tab(inner_map);
if (inner_map->btf) {
btf_get(inner_map->btf);
inner_map_meta->btf = inner_map->btf;
@@ -71,6 +72,7 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
void bpf_map_meta_free(struct bpf_map *map_meta)
{
+ bpf_map_free_kptr_off_tab(map_meta);
btf_put(map_meta->btf);
kfree(map_meta);
}
@@ -83,7 +85,8 @@ bool bpf_map_meta_equal(const struct bpf_map *meta0,
meta0->key_size == meta1->key_size &&
meta0->value_size == meta1->value_size &&
meta0->timer_off == meta1->timer_off &&
- meta0->map_flags == meta1->map_flags;
+ meta0->map_flags == meta1->map_flags &&
+ bpf_map_equal_kptr_off_tab(meta0, meta1);
}
void *bpf_map_fd_get_ptr(struct bpf_map *map,
diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index 710ba9de12ce..5173fd37590f 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -404,7 +404,7 @@ BPF_CALL_2(bpf_ringbuf_submit, void *, sample, u64, flags)
const struct bpf_func_proto bpf_ringbuf_submit_proto = {
.func = bpf_ringbuf_submit,
.ret_type = RET_VOID,
- .arg1_type = ARG_PTR_TO_ALLOC_MEM,
+ .arg1_type = ARG_PTR_TO_ALLOC_MEM | OBJ_RELEASE,
.arg2_type = ARG_ANYTHING,
};
@@ -417,7 +417,7 @@ BPF_CALL_2(bpf_ringbuf_discard, void *, sample, u64, flags)
const struct bpf_func_proto bpf_ringbuf_discard_proto = {
.func = bpf_ringbuf_discard,
.ret_type = RET_VOID,
- .arg1_type = ARG_PTR_TO_ALLOC_MEM,
+ .arg1_type = ARG_PTR_TO_ALLOC_MEM | OBJ_RELEASE,
.arg2_type = ARG_ANYTHING,
};
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index cdaa1152436a..f4d1f974a8cd 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -6,6 +6,7 @@
#include <linux/bpf_trace.h>
#include <linux/bpf_lirc.h>
#include <linux/bpf_verifier.h>
+#include <linux/bsearch.h>
#include <linux/btf.h>
#include <linux/syscalls.h>
#include <linux/slab.h>
@@ -29,6 +30,7 @@
#include <linux/pgtable.h>
#include <linux/bpf_lsm.h>
#include <linux/poll.h>
+#include <linux/sort.h>
#include <linux/bpf-netns.h>
#include <linux/rcupdate_trace.h>
#include <linux/memcontrol.h>
@@ -473,14 +475,128 @@ static void bpf_map_release_memcg(struct bpf_map *map)
}
#endif
+static int bpf_map_kptr_off_cmp(const void *a, const void *b)
+{
+ const struct bpf_map_value_off_desc *off_desc1 = a, *off_desc2 = b;
+
+ if (off_desc1->offset < off_desc2->offset)
+ return -1;
+ else if (off_desc1->offset > off_desc2->offset)
+ return 1;
+ return 0;
+}
+
+struct bpf_map_value_off_desc *bpf_map_kptr_off_contains(struct bpf_map *map, u32 offset)
+{
+ /* Since members are iterated in btf_find_field in increasing order,
+ * offsets appended to kptr_off_tab are in increasing order, so we can
+ * do bsearch to find exact match.
+ */
+ struct bpf_map_value_off *tab;
+
+ if (!map_value_has_kptrs(map))
+ return NULL;
+ tab = map->kptr_off_tab;
+ return bsearch(&offset, tab->off, tab->nr_off, sizeof(tab->off[0]), bpf_map_kptr_off_cmp);
+}
+
+void bpf_map_free_kptr_off_tab(struct bpf_map *map)
+{
+ struct bpf_map_value_off *tab = map->kptr_off_tab;
+ int i;
+
+ if (!map_value_has_kptrs(map))
+ return;
+ for (i = 0; i < tab->nr_off; i++) {
+ if (tab->off[i].kptr.module)
+ module_put(tab->off[i].kptr.module);
+ btf_put(tab->off[i].kptr.btf);
+ }
+ kfree(tab);
+ map->kptr_off_tab = NULL;
+}
+
+struct bpf_map_value_off *bpf_map_copy_kptr_off_tab(const struct bpf_map *map)
+{
+ struct bpf_map_value_off *tab = map->kptr_off_tab, *new_tab;
+ int size, i;
+
+ if (!map_value_has_kptrs(map))
+ return ERR_PTR(-ENOENT);
+ size = offsetof(struct bpf_map_value_off, off[tab->nr_off]);
+ new_tab = kmemdup(tab, size, GFP_KERNEL | __GFP_NOWARN);
+ if (!new_tab)
+ return ERR_PTR(-ENOMEM);
+ /* Do a deep copy of the kptr_off_tab */
+ for (i = 0; i < tab->nr_off; i++) {
+ btf_get(tab->off[i].kptr.btf);
+ if (tab->off[i].kptr.module && !try_module_get(tab->off[i].kptr.module)) {
+ while (i--) {
+ if (tab->off[i].kptr.module)
+ module_put(tab->off[i].kptr.module);
+ btf_put(tab->off[i].kptr.btf);
+ }
+ kfree(new_tab);
+ return ERR_PTR(-ENXIO);
+ }
+ }
+ return new_tab;
+}
+
+bool bpf_map_equal_kptr_off_tab(const struct bpf_map *map_a, const struct bpf_map *map_b)
+{
+ struct bpf_map_value_off *tab_a = map_a->kptr_off_tab, *tab_b = map_b->kptr_off_tab;
+ bool a_has_kptr = map_value_has_kptrs(map_a), b_has_kptr = map_value_has_kptrs(map_b);
+ int size;
+
+ if (!a_has_kptr && !b_has_kptr)
+ return true;
+ if (a_has_kptr != b_has_kptr)
+ return false;
+ if (tab_a->nr_off != tab_b->nr_off)
+ return false;
+ size = offsetof(struct bpf_map_value_off, off[tab_a->nr_off]);
+ return !memcmp(tab_a, tab_b, size);
+}
+
+/* Caller must ensure map_value_has_kptrs is true. Note that this function can
+ * be called on a map value while the map_value is visible to BPF programs, as
+ * it ensures the correct synchronization, and we already enforce the same using
+ * the bpf_kptr_xchg helper on the BPF program side for referenced kptrs.
+ */
+void bpf_map_free_kptrs(struct bpf_map *map, void *map_value)
+{
+ struct bpf_map_value_off *tab = map->kptr_off_tab;
+ unsigned long *btf_id_ptr;
+ int i;
+
+ for (i = 0; i < tab->nr_off; i++) {
+ struct bpf_map_value_off_desc *off_desc = &tab->off[i];
+ unsigned long old_ptr;
+
+ btf_id_ptr = map_value + off_desc->offset;
+ if (off_desc->type == BPF_KPTR_UNREF) {
+ u64 *p = (u64 *)btf_id_ptr;
+
+ WRITE_ONCE(p, 0);
+ continue;
+ }
+ old_ptr = xchg(btf_id_ptr, 0);
+ off_desc->kptr.dtor((void *)old_ptr);
+ }
+}
+
/* called from workqueue */
static void bpf_map_free_deferred(struct work_struct *work)
{
struct bpf_map *map = container_of(work, struct bpf_map, work);
security_bpf_map_free(map);
+ kfree(map->off_arr);
bpf_map_release_memcg(map);
- /* implementation dependent freeing */
+ /* implementation dependent freeing, map_free callback also does
+ * bpf_map_free_kptr_off_tab, if needed.
+ */
map->ops->map_free(map);
}
@@ -640,7 +756,7 @@ static int bpf_map_mmap(struct file *filp, struct vm_area_struct *vma)
int err;
if (!map->ops->map_mmap || map_value_has_spin_lock(map) ||
- map_value_has_timer(map))
+ map_value_has_timer(map) || map_value_has_kptrs(map))
return -ENOTSUPP;
if (!(vma->vm_flags & VM_SHARED))
@@ -767,6 +883,84 @@ int map_check_no_btf(const struct bpf_map *map,
return -ENOTSUPP;
}
+static int map_off_arr_cmp(const void *_a, const void *_b, const void *priv)
+{
+ const u32 a = *(const u32 *)_a;
+ const u32 b = *(const u32 *)_b;
+
+ if (a < b)
+ return -1;
+ else if (a > b)
+ return 1;
+ return 0;
+}
+
+static void map_off_arr_swap(void *_a, void *_b, int size, const void *priv)
+{
+ struct bpf_map *map = (struct bpf_map *)priv;
+ u32 *off_base = map->off_arr->field_off;
+ u32 *a = _a, *b = _b;
+ u8 *sz_a, *sz_b;
+
+ sz_a = map->off_arr->field_sz + (a - off_base);
+ sz_b = map->off_arr->field_sz + (b - off_base);
+
+ swap(*a, *b);
+ swap(*sz_a, *sz_b);
+}
+
+static int bpf_map_alloc_off_arr(struct bpf_map *map)
+{
+ bool has_spin_lock = map_value_has_spin_lock(map);
+ bool has_timer = map_value_has_timer(map);
+ bool has_kptrs = map_value_has_kptrs(map);
+ struct bpf_map_off_arr *off_arr;
+ u32 i;
+
+ if (!has_spin_lock && !has_timer && !has_kptrs) {
+ map->off_arr = NULL;
+ return 0;
+ }
+
+ off_arr = kmalloc(sizeof(*map->off_arr), GFP_KERNEL | __GFP_NOWARN);
+ if (!off_arr)
+ return -ENOMEM;
+ map->off_arr = off_arr;
+
+ off_arr->cnt = 0;
+ if (has_spin_lock) {
+ i = off_arr->cnt;
+
+ off_arr->field_off[i] = map->spin_lock_off;
+ off_arr->field_sz[i] = sizeof(struct bpf_spin_lock);
+ off_arr->cnt++;
+ }
+ if (has_timer) {
+ i = off_arr->cnt;
+
+ off_arr->field_off[i] = map->timer_off;
+ off_arr->field_sz[i] = sizeof(struct bpf_timer);
+ off_arr->cnt++;
+ }
+ if (has_kptrs) {
+ struct bpf_map_value_off *tab = map->kptr_off_tab;
+ u32 *off = &off_arr->field_off[off_arr->cnt];
+ u8 *sz = &off_arr->field_sz[off_arr->cnt];
+
+ for (i = 0; i < tab->nr_off; i++) {
+ *off++ = tab->off[i].offset;
+ *sz++ = sizeof(u64);
+ }
+ off_arr->cnt += tab->nr_off;
+ }
+
+ if (off_arr->cnt == 1)
+ return 0;
+ sort_r(off_arr->field_off, off_arr->cnt, sizeof(off_arr->field_off[0]),
+ map_off_arr_cmp, map_off_arr_swap, map);
+ return 0;
+}
+
static int map_check_btf(struct bpf_map *map, const struct btf *btf,
u32 btf_key_id, u32 btf_value_id)
{
@@ -820,9 +1014,33 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
return -EOPNOTSUPP;
}
- if (map->ops->map_check_btf)
+ map->kptr_off_tab = btf_parse_kptrs(btf, value_type);
+ if (map_value_has_kptrs(map)) {
+ if (!bpf_capable()) {
+ ret = -EPERM;
+ goto free_map_tab;
+ }
+ if (map->map_flags & (BPF_F_RDONLY_PROG | BPF_F_WRONLY_PROG)) {
+ ret = -EACCES;
+ goto free_map_tab;
+ }
+ if (map->map_type != BPF_MAP_TYPE_HASH &&
+ map->map_type != BPF_MAP_TYPE_LRU_HASH &&
+ map->map_type != BPF_MAP_TYPE_ARRAY) {
+ ret = -EOPNOTSUPP;
+ goto free_map_tab;
+ }
+ }
+
+ if (map->ops->map_check_btf) {
ret = map->ops->map_check_btf(map, btf, key_type, value_type);
+ if (ret < 0)
+ goto free_map_tab;
+ }
+ return ret;
+free_map_tab:
+ bpf_map_free_kptr_off_tab(map);
return ret;
}
@@ -912,10 +1130,14 @@ static int map_create(union bpf_attr *attr)
attr->btf_vmlinux_value_type_id;
}
- err = security_bpf_map_alloc(map);
+ err = bpf_map_alloc_off_arr(map);
if (err)
goto free_map;
+ err = security_bpf_map_alloc(map);
+ if (err)
+ goto free_map_off_arr;
+
err = bpf_map_alloc_id(map);
if (err)
goto free_map_sec;
@@ -938,6 +1160,8 @@ static int map_create(union bpf_attr *attr)
free_map_sec:
security_bpf_map_free(map);
+free_map_off_arr:
+ kfree(map->off_arr);
free_map:
btf_put(map->btf);
map->ops->map_free(map);
@@ -1639,7 +1863,7 @@ static int map_freeze(const union bpf_attr *attr)
return PTR_ERR(map);
if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS ||
- map_value_has_timer(map)) {
+ map_value_has_timer(map) || map_value_has_kptrs(map)) {
fdput(f);
return -ENOTSUPP;
}
@@ -2640,19 +2864,12 @@ struct bpf_link *bpf_link_get_from_fd(u32 ufd)
}
EXPORT_SYMBOL(bpf_link_get_from_fd);
-struct bpf_tracing_link {
- struct bpf_link link;
- enum bpf_attach_type attach_type;
- struct bpf_trampoline *trampoline;
- struct bpf_prog *tgt_prog;
-};
-
static void bpf_tracing_link_release(struct bpf_link *link)
{
struct bpf_tracing_link *tr_link =
- container_of(link, struct bpf_tracing_link, link);
+ container_of(link, struct bpf_tracing_link, link.link);
- WARN_ON_ONCE(bpf_trampoline_unlink_prog(link->prog,
+ WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
tr_link->trampoline));
bpf_trampoline_put(tr_link->trampoline);
@@ -2665,7 +2882,7 @@ static void bpf_tracing_link_release(struct bpf_link *link)
static void bpf_tracing_link_dealloc(struct bpf_link *link)
{
struct bpf_tracing_link *tr_link =
- container_of(link, struct bpf_tracing_link, link);
+ container_of(link, struct bpf_tracing_link, link.link);
kfree(tr_link);
}
@@ -2674,7 +2891,7 @@ static void bpf_tracing_link_show_fdinfo(const struct bpf_link *link,
struct seq_file *seq)
{
struct bpf_tracing_link *tr_link =
- container_of(link, struct bpf_tracing_link, link);
+ container_of(link, struct bpf_tracing_link, link.link);
seq_printf(seq,
"attach_type:\t%d\n",
@@ -2685,7 +2902,7 @@ static int bpf_tracing_link_fill_link_info(const struct bpf_link *link,
struct bpf_link_info *info)
{
struct bpf_tracing_link *tr_link =
- container_of(link, struct bpf_tracing_link, link);
+ container_of(link, struct bpf_tracing_link, link.link);
info->tracing.attach_type = tr_link->attach_type;
bpf_trampoline_unpack_key(tr_link->trampoline->key,
@@ -2766,7 +2983,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
err = -ENOMEM;
goto out_put_prog;
}
- bpf_link_init(&link->link, BPF_LINK_TYPE_TRACING,
+ bpf_link_init(&link->link.link, BPF_LINK_TYPE_TRACING,
&bpf_tracing_link_lops, prog);
link->attach_type = prog->expected_attach_type;
@@ -2836,11 +3053,11 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
tgt_prog = prog->aux->dst_prog;
}
- err = bpf_link_prime(&link->link, &link_primer);
+ err = bpf_link_prime(&link->link.link, &link_primer);
if (err)
goto out_unlock;
- err = bpf_trampoline_link_prog(prog, tr);
+ err = bpf_trampoline_link_prog(&link->link, tr);
if (err) {
bpf_link_cleanup(&link_primer);
link = NULL;
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 5d8bfb5ef239..e3bcad5a7c68 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -168,30 +168,30 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
return ret;
}
-static struct bpf_tramp_progs *
+static struct bpf_tramp_links *
bpf_trampoline_get_progs(const struct bpf_trampoline *tr, int *total, bool *ip_arg)
{
- const struct bpf_prog_aux *aux;
- struct bpf_tramp_progs *tprogs;
- struct bpf_prog **progs;
+ struct bpf_tramp_link *link;
+ struct bpf_tramp_links *tlinks;
+ struct bpf_tramp_link **links;
int kind;
*total = 0;
- tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(*tprogs), GFP_KERNEL);
- if (!tprogs)
+ tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
+ if (!tlinks)
return ERR_PTR(-ENOMEM);
for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
- tprogs[kind].nr_progs = tr->progs_cnt[kind];
+ tlinks[kind].nr_links = tr->progs_cnt[kind];
*total += tr->progs_cnt[kind];
- progs = tprogs[kind].progs;
+ links = tlinks[kind].links;
- hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist) {
- *ip_arg |= aux->prog->call_get_func_ip;
- *progs++ = aux->prog;
+ hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) {
+ *ip_arg |= link->link.prog->call_get_func_ip;
+ *links++ = link;
}
}
- return tprogs;
+ return tlinks;
}
static void __bpf_tramp_image_put_deferred(struct work_struct *work)
@@ -330,14 +330,14 @@ static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx)
static int bpf_trampoline_update(struct bpf_trampoline *tr)
{
struct bpf_tramp_image *im;
- struct bpf_tramp_progs *tprogs;
+ struct bpf_tramp_links *tlinks;
u32 flags = BPF_TRAMP_F_RESTORE_REGS;
bool ip_arg = false;
int err, total;
- tprogs = bpf_trampoline_get_progs(tr, &total, &ip_arg);
- if (IS_ERR(tprogs))
- return PTR_ERR(tprogs);
+ tlinks = bpf_trampoline_get_progs(tr, &total, &ip_arg);
+ if (IS_ERR(tlinks))
+ return PTR_ERR(tlinks);
if (total == 0) {
err = unregister_fentry(tr, tr->cur_image->image);
@@ -353,15 +353,15 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
goto out;
}
- if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
- tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
+ if (tlinks[BPF_TRAMP_FEXIT].nr_links ||
+ tlinks[BPF_TRAMP_MODIFY_RETURN].nr_links)
flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
if (ip_arg)
flags |= BPF_TRAMP_F_IP_ARG;
err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
- &tr->func.model, flags, tprogs,
+ &tr->func.model, flags, tlinks,
tr->func.addr);
if (err < 0)
goto out;
@@ -381,7 +381,7 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
tr->cur_image = im;
tr->selector++;
out:
- kfree(tprogs);
+ kfree(tlinks);
return err;
}
@@ -407,13 +407,14 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
}
}
-int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
{
enum bpf_tramp_prog_type kind;
+ struct bpf_tramp_link *link_exiting;
int err = 0;
int cnt = 0, i;
- kind = bpf_attach_type_to_tramp(prog);
+ kind = bpf_attach_type_to_tramp(link->link.prog);
mutex_lock(&tr->mutex);
if (tr->extension_prog) {
/* cannot attach fentry/fexit if extension prog is attached.
@@ -432,25 +433,33 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
err = -EBUSY;
goto out;
}
- tr->extension_prog = prog;
+ tr->extension_prog = link->link.prog;
err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
- prog->bpf_func);
+ link->link.prog->bpf_func);
goto out;
}
- if (cnt >= BPF_MAX_TRAMP_PROGS) {
+ if (cnt >= BPF_MAX_TRAMP_LINKS) {
err = -E2BIG;
goto out;
}
- if (!hlist_unhashed(&prog->aux->tramp_hlist)) {
+ if (!hlist_unhashed(&link->tramp_hlist)) {
/* prog already linked */
err = -EBUSY;
goto out;
}
- hlist_add_head(&prog->aux->tramp_hlist, &tr->progs_hlist[kind]);
+ hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
+ if (link_exiting->link.prog != link->link.prog)
+ continue;
+ /* prog already linked */
+ err = -EBUSY;
+ goto out;
+ }
+
+ hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]);
tr->progs_cnt[kind]++;
err = bpf_trampoline_update(tr);
if (err) {
- hlist_del_init(&prog->aux->tramp_hlist);
+ hlist_del_init(&link->tramp_hlist);
tr->progs_cnt[kind]--;
}
out:
@@ -459,12 +468,12 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
}
/* bpf_trampoline_unlink_prog() should never fail. */
-int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
+int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr)
{
enum bpf_tramp_prog_type kind;
int err;
- kind = bpf_attach_type_to_tramp(prog);
+ kind = bpf_attach_type_to_tramp(link->link.prog);
mutex_lock(&tr->mutex);
if (kind == BPF_TRAMP_REPLACE) {
WARN_ON_ONCE(!tr->extension_prog);
@@ -473,7 +482,7 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
tr->extension_prog = NULL;
goto out;
}
- hlist_del_init(&prog->aux->tramp_hlist);
+ hlist_del_init(&link->tramp_hlist);
tr->progs_cnt[kind]--;
err = bpf_trampoline_update(tr);
out:
@@ -641,7 +650,7 @@ void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
int __weak
arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_tramp_progs *tprogs,
+ struct bpf_tramp_links *tlinks,
void *orig_call)
{
return -ENOTSUPP;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d04147a5efa5..839340d4cc29 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -245,6 +245,7 @@ struct bpf_call_arg_meta {
struct bpf_map *map_ptr;
bool raw_mode;
bool pkt_access;
+ u8 release_regno;
int regno;
int access_size;
int mem_size;
@@ -257,6 +258,7 @@ struct bpf_call_arg_meta {
struct btf *ret_btf;
u32 ret_btf_id;
u32 subprogno;
+ struct bpf_map_value_off_desc *kptr_off_desc;
};
struct btf *btf_vmlinux;
@@ -471,17 +473,6 @@ static bool type_may_be_null(u32 type)
return type & PTR_MAYBE_NULL;
}
-/* Determine whether the function releases some resources allocated by another
- * function call. The first reference type argument will be assumed to be
- * released by release_reference().
- */
-static bool is_release_function(enum bpf_func_id func_id)
-{
- return func_id == BPF_FUNC_sk_release ||
- func_id == BPF_FUNC_ringbuf_submit ||
- func_id == BPF_FUNC_ringbuf_discard;
-}
-
static bool may_be_acquire_function(enum bpf_func_id func_id)
{
return func_id == BPF_FUNC_sk_lookup_tcp ||
@@ -499,7 +490,8 @@ static bool is_acquire_function(enum bpf_func_id func_id,
if (func_id == BPF_FUNC_sk_lookup_tcp ||
func_id == BPF_FUNC_sk_lookup_udp ||
func_id == BPF_FUNC_skc_lookup_tcp ||
- func_id == BPF_FUNC_ringbuf_reserve)
+ func_id == BPF_FUNC_ringbuf_reserve ||
+ func_id == BPF_FUNC_kptr_xchg)
return true;
if (func_id == BPF_FUNC_map_lookup_elem &&
@@ -3210,7 +3202,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
return 0;
}
-enum stack_access_src {
+enum bpf_access_src {
ACCESS_DIRECT = 1, /* the access is performed by an instruction */
ACCESS_HELPER = 2, /* the access is performed by a helper */
};
@@ -3218,7 +3210,7 @@ enum stack_access_src {
static int check_stack_range_initialized(struct bpf_verifier_env *env,
int regno, int off, int access_size,
bool zero_size_allowed,
- enum stack_access_src type,
+ enum bpf_access_src type,
struct bpf_call_arg_meta *meta);
static struct bpf_reg_state *reg_state(struct bpf_verifier_env *env, int regno)
@@ -3468,9 +3460,159 @@ static int check_mem_region_access(struct bpf_verifier_env *env, u32 regno,
return 0;
}
+static int __check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno,
+ bool fixed_off_ok)
+{
+ /* Access to this pointer-typed register or passing it to a helper
+ * is only allowed in its original, unmodified form.
+ */
+
+ if (reg->off < 0) {
+ verbose(env, "negative offset %s ptr R%d off=%d disallowed\n",
+ reg_type_str(env, reg->type), regno, reg->off);
+ return -EACCES;
+ }
+
+ if (!fixed_off_ok && reg->off) {
+ verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
+ reg_type_str(env, reg->type), regno, reg->off);
+ return -EACCES;
+ }
+
+ if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
+ char tn_buf[48];
+
+ tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
+ verbose(env, "variable %s access var_off=%s disallowed\n",
+ reg_type_str(env, reg->type), tn_buf);
+ return -EACCES;
+ }
+
+ return 0;
+}
+
+int check_ptr_off_reg(struct bpf_verifier_env *env,
+ const struct bpf_reg_state *reg, int regno)
+{
+ return __check_ptr_off_reg(env, reg, regno, false);
+}
+
+static int map_kptr_match_type(struct bpf_verifier_env *env,
+ struct bpf_map_value_off_desc *off_desc,
+ struct bpf_reg_state *reg, u32 regno)
+{
+ const char *targ_name = kernel_type_name(off_desc->kptr.btf, off_desc->kptr.btf_id);
+ const char *reg_name = "";
+
+ if (base_type(reg->type) != PTR_TO_BTF_ID || type_flag(reg->type) != PTR_MAYBE_NULL)
+ goto bad_type;
+
+ if (!btf_is_kernel(reg->btf)) {
+ verbose(env, "R%d must point to kernel BTF\n", regno);
+ return -EINVAL;
+ }
+ /* We need to verify reg->type and reg->btf, before accessing reg->btf */
+ reg_name = kernel_type_name(reg->btf, reg->btf_id);
+
+ /* For ref_ptr case, release function check should ensure we get one
+ * referenced PTR_TO_BTF_ID, and that its fixed offset is 0. For the
+ * normal store of unreferenced kptr, we must ensure var_off is zero.
+ * Since ref_ptr cannot be accessed directly by BPF insns, checks for
+ * reg->off and reg->ref_obj_id are not needed here.
+ */
+ if (__check_ptr_off_reg(env, reg, regno, true))
+ return -EACCES;
+
+ /* A full type match is needed, as BTF can be vmlinux or module BTF, and
+ * we also need to take into account the reg->off.
+ *
+ * We want to support cases like:
+ *
+ * struct foo {
+ * struct bar br;
+ * struct baz bz;
+ * };
+ *
+ * struct foo *v;
+ * v = func(); // PTR_TO_BTF_ID
+ * val->foo = v; // reg->off is zero, btf and btf_id match type
+ * val->bar = &v->br; // reg->off is still zero, but we need to retry with
+ * // first member type of struct after comparison fails
+ * val->baz = &v->bz; // reg->off is non-zero, so struct needs to be walked
+ * // to match type
+ *
+ * In the kptr_ref case, check_func_arg_reg_off already ensures reg->off
+ * is zero.
+ */
+ if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off,
+ off_desc->kptr.btf, off_desc->kptr.btf_id))
+ goto bad_type;
+ return 0;
+bad_type:
+ verbose(env, "invalid kptr access, R%d type=%s%s ", regno,
+ reg_type_str(env, reg->type), reg_name);
+ verbose(env, "expected=%s%s\n", reg_type_str(env, PTR_TO_BTF_ID), targ_name);
+ return -EINVAL;
+}
+
+static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,
+ int value_regno, int insn_idx,
+ struct bpf_map_value_off_desc *off_desc)
+{
+ struct bpf_insn *insn = &env->prog->insnsi[insn_idx];
+ int class = BPF_CLASS(insn->code);
+ struct bpf_reg_state *val_reg;
+
+ /* Things we already checked for in check_map_access and caller:
+ * - Reject cases where variable offset may touch kptr
+ * - size of access (must be BPF_DW)
+ * - tnum_is_const(reg->var_off)
+ * - off_desc->offset == off + reg->var_off.value
+ */
+ /* Only BPF_[LDX,STX,ST] | BPF_MEM | BPF_DW is supported */
+ if (BPF_MODE(insn->code) != BPF_MEM) {
+ verbose(env, "kptr in map can only be accessed using BPF_MEM instruction mode\n");
+ return -EACCES;
+ }
+
+ /* We cannot directly access kptr_ref */
+ if (off_desc->type == BPF_KPTR_REF) {
+ verbose(env, "accessing referenced kptr disallowed\n");
+ return -EACCES;
+ }
+
+ if (class == BPF_LDX) {
+ val_reg = reg_state(env, value_regno);
+ /* We can simply mark the value_regno receiving the pointer
+ * value from map as PTR_TO_BTF_ID, with the correct type.
+ */
+ mark_btf_ld_reg(env, cur_regs(env), value_regno, PTR_TO_BTF_ID, off_desc->kptr.btf,
+ off_desc->kptr.btf_id, PTR_MAYBE_NULL);
+ /* For mark_ptr_or_null_reg */
+ val_reg->id = ++env->id_gen;
+ } else if (class == BPF_STX) {
+ val_reg = reg_state(env, value_regno);
+ if (!register_is_null(val_reg) &&
+ map_kptr_match_type(env, off_desc, val_reg, value_regno))
+ return -EACCES;
+ } else if (class == BPF_ST) {
+ if (insn->imm) {
+ verbose(env, "BPF_ST imm must be 0 when storing to kptr at off=%u\n",
+ off_desc->offset);
+ return -EACCES;
+ }
+ } else {
+ verbose(env, "kptr in map can only be accessed using BPF_LDX/BPF_STX/BPF_ST\n");
+ return -EACCES;
+ }
+ return 0;
+}
+
/* check read/write into a map element with possible variable offset */
static int check_map_access(struct bpf_verifier_env *env, u32 regno,
- int off, int size, bool zero_size_allowed)
+ int off, int size, bool zero_size_allowed,
+ enum bpf_access_src src)
{
struct bpf_verifier_state *vstate = env->cur_state;
struct bpf_func_state *state = vstate->frame[vstate->curframe];
@@ -3506,6 +3648,36 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
return -EACCES;
}
}
+ if (map_value_has_kptrs(map)) {
+ struct bpf_map_value_off *tab = map->kptr_off_tab;
+ int i;
+
+ for (i = 0; i < tab->nr_off; i++) {
+ u32 p = tab->off[i].offset;
+
+ if (reg->smin_value + off < p + sizeof(u64) &&
+ p < reg->umax_value + off + size) {
+ if (src != ACCESS_DIRECT) {
+ verbose(env, "kptr cannot be accessed indirectly by helper\n");
+ return -EACCES;
+ }
+ if (!tnum_is_const(reg->var_off)) {
+ verbose(env, "kptr access cannot have variable offset\n");
+ return -EACCES;
+ }
+ if (p != off + reg->var_off.value) {
+ verbose(env, "kptr access misaligned expected=%u off=%llu\n",
+ p, off + reg->var_off.value);
+ return -EACCES;
+ }
+ if (size != bpf_size_to_bytes(BPF_DW)) {
+ verbose(env, "kptr access size must be BPF_DW\n");
+ return -EACCES;
+ }
+ break;
+ }
+ }
+ }
return err;
}
@@ -3979,44 +4151,6 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
}
#endif
-static int __check_ptr_off_reg(struct bpf_verifier_env *env,
- const struct bpf_reg_state *reg, int regno,
- bool fixed_off_ok)
-{
- /* Access to this pointer-typed register or passing it to a helper
- * is only allowed in its original, unmodified form.
- */
-
- if (reg->off < 0) {
- verbose(env, "negative offset %s ptr R%d off=%d disallowed\n",
- reg_type_str(env, reg->type), regno, reg->off);
- return -EACCES;
- }
-
- if (!fixed_off_ok && reg->off) {
- verbose(env, "dereference of modified %s ptr R%d off=%d disallowed\n",
- reg_type_str(env, reg->type), regno, reg->off);
- return -EACCES;
- }
-
- if (!tnum_is_const(reg->var_off) || reg->var_off.value) {
- char tn_buf[48];
-
- tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off);
- verbose(env, "variable %s access var_off=%s disallowed\n",
- reg_type_str(env, reg->type), tn_buf);
- return -EACCES;
- }
-
- return 0;
-}
-
-int check_ptr_off_reg(struct bpf_verifier_env *env,
- const struct bpf_reg_state *reg, int regno)
-{
- return __check_ptr_off_reg(env, reg, regno, false);
-}
-
static int __check_buffer_access(struct bpf_verifier_env *env,
const char *buf_info,
const struct bpf_reg_state *reg,
@@ -4315,7 +4449,7 @@ static int check_stack_slot_within_bounds(int off,
static int check_stack_access_within_bounds(
struct bpf_verifier_env *env,
int regno, int off, int access_size,
- enum stack_access_src src, enum bpf_access_type type)
+ enum bpf_access_src src, enum bpf_access_type type)
{
struct bpf_reg_state *regs = cur_regs(env);
struct bpf_reg_state *reg = regs + regno;
@@ -4411,6 +4545,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
if (value_regno >= 0)
mark_reg_unknown(env, regs, value_regno);
} else if (reg->type == PTR_TO_MAP_VALUE) {
+ struct bpf_map_value_off_desc *kptr_off_desc = NULL;
+
if (t == BPF_WRITE && value_regno >= 0 &&
is_pointer_value(env, value_regno)) {
verbose(env, "R%d leaks addr into map\n", value_regno);
@@ -4419,8 +4555,16 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
err = check_map_access_type(env, regno, off, size, t);
if (err)
return err;
- err = check_map_access(env, regno, off, size, false);
- if (!err && t == BPF_READ && value_regno >= 0) {
+ err = check_map_access(env, regno, off, size, false, ACCESS_DIRECT);
+ if (err)
+ return err;
+ if (tnum_is_const(reg->var_off))
+ kptr_off_desc = bpf_map_kptr_off_contains(reg->map_ptr,
+ off + reg->var_off.value);
+ if (kptr_off_desc) {
+ err = check_map_kptr_access(env, regno, value_regno, insn_idx,
+ kptr_off_desc);
+ } else if (t == BPF_READ && value_regno >= 0) {
struct bpf_map *map = reg->map_ptr;
/* if map is read-only, track its contents as scalars */
@@ -4723,7 +4867,7 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
static int check_stack_range_initialized(
struct bpf_verifier_env *env, int regno, int off,
int access_size, bool zero_size_allowed,
- enum stack_access_src type, struct bpf_call_arg_meta *meta)
+ enum bpf_access_src type, struct bpf_call_arg_meta *meta)
{
struct bpf_reg_state *reg = reg_state(env, regno);
struct bpf_func_state *state = func(env, reg);
@@ -4873,7 +5017,7 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
BPF_READ))
return -EACCES;
return check_map_access(env, regno, reg->off, access_size,
- zero_size_allowed);
+ zero_size_allowed, ACCESS_HELPER);
case PTR_TO_MEM:
if (type_is_rdonly_mem(reg->type)) {
if (meta && meta->raw_mode) {
@@ -5162,6 +5306,53 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno,
return 0;
}
+static int process_kptr_func(struct bpf_verifier_env *env, int regno,
+ struct bpf_call_arg_meta *meta)
+{
+ struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno];
+ struct bpf_map_value_off_desc *off_desc;
+ struct bpf_map *map_ptr = reg->map_ptr;
+ u32 kptr_off;
+ int ret;
+
+ if (!tnum_is_const(reg->var_off)) {
+ verbose(env,
+ "R%d doesn't have constant offset. kptr has to be at the constant offset\n",
+ regno);
+ return -EINVAL;
+ }
+ if (!map_ptr->btf) {
+ verbose(env, "map '%s' has to have BTF in order to use bpf_kptr_xchg\n",
+ map_ptr->name);
+ return -EINVAL;
+ }
+ if (!map_value_has_kptrs(map_ptr)) {
+ ret = PTR_ERR_OR_ZERO(map_ptr->kptr_off_tab);
+ if (ret == -E2BIG)
+ verbose(env, "map '%s' has more than %d kptr\n", map_ptr->name,
+ BPF_MAP_VALUE_OFF_MAX);
+ else if (ret == -EEXIST)
+ verbose(env, "map '%s' has repeating kptr BTF tags\n", map_ptr->name);
+ else
+ verbose(env, "map '%s' has no valid kptr\n", map_ptr->name);
+ return -EINVAL;
+ }
+
+ meta->map_ptr = map_ptr;
+ kptr_off = reg->off + reg->var_off.value;
+ off_desc = bpf_map_kptr_off_contains(map_ptr, kptr_off);
+ if (!off_desc) {
+ verbose(env, "off=%d doesn't point to kptr\n", kptr_off);
+ return -EACCES;
+ }
+ if (off_desc->type != BPF_KPTR_REF) {
+ verbose(env, "off=%d kptr isn't referenced kptr\n", kptr_off);
+ return -EACCES;
+ }
+ meta->kptr_off_desc = off_desc;
+ return 0;
+}
+
static bool arg_type_is_mem_ptr(enum bpf_arg_type type)
{
return base_type(type) == ARG_PTR_TO_MEM ||
@@ -5185,6 +5376,11 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
type == ARG_PTR_TO_LONG;
}
+static bool arg_type_is_release(enum bpf_arg_type type)
+{
+ return type & OBJ_RELEASE;
+}
+
static int int_ptr_type_to_size(enum bpf_arg_type type)
{
if (type == ARG_PTR_TO_INT)
@@ -5297,6 +5493,7 @@ static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } };
static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } };
static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } };
static const struct bpf_reg_types timer_types = { .types = { PTR_TO_MAP_VALUE } };
+static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } };
static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_MAP_KEY] = &map_key_value_types,
@@ -5324,11 +5521,13 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
[ARG_PTR_TO_STACK] = &stack_ptr_types,
[ARG_PTR_TO_CONST_STR] = &const_str_ptr_types,
[ARG_PTR_TO_TIMER] = &timer_types,
+ [ARG_PTR_TO_KPTR] = &kptr_types,
};
static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
enum bpf_arg_type arg_type,
- const u32 *arg_btf_id)
+ const u32 *arg_btf_id,
+ struct bpf_call_arg_meta *meta)
{
struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno];
enum bpf_reg_type expected, type = reg->type;
@@ -5381,8 +5580,11 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
arg_btf_id = compatible->btf_id;
}
- if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off,
- btf_vmlinux, *arg_btf_id)) {
+ if (meta->func_id == BPF_FUNC_kptr_xchg) {
+ if (map_kptr_match_type(env, meta->kptr_off_desc, reg, regno))
+ return -EACCES;
+ } else if (!btf_struct_ids_match(&env->log, reg->btf, reg->btf_id, reg->off,
+ btf_vmlinux, *arg_btf_id)) {
verbose(env, "R%d is of type %s but %s is expected\n",
regno, kernel_type_name(reg->btf, reg->btf_id),
kernel_type_name(btf_vmlinux, *arg_btf_id));
@@ -5395,11 +5597,10 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
int check_func_arg_reg_off(struct bpf_verifier_env *env,
const struct bpf_reg_state *reg, int regno,
- enum bpf_arg_type arg_type,
- bool is_release_func)
+ enum bpf_arg_type arg_type)
{
- bool fixed_off_ok = false, release_reg;
enum bpf_reg_type type = reg->type;
+ bool fixed_off_ok = false;
switch ((u32)type) {
case SCALAR_VALUE:
@@ -5417,7 +5618,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
/* Some of the argument types nevertheless require a
* zero register offset.
*/
- if (arg_type != ARG_PTR_TO_ALLOC_MEM)
+ if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
return 0;
break;
/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
@@ -5425,19 +5626,17 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
*/
case PTR_TO_BTF_ID:
/* When referenced PTR_TO_BTF_ID is passed to release function,
- * it's fixed offset must be 0. We rely on the property that
- * only one referenced register can be passed to BPF helpers and
- * kfuncs. In the other cases, fixed offset can be non-zero.
+ * it's fixed offset must be 0. In the other cases, fixed offset
+ * can be non-zero.
*/
- release_reg = is_release_func && reg->ref_obj_id;
- if (release_reg && reg->off) {
+ if (arg_type_is_release(arg_type) && reg->off) {
verbose(env, "R%d must have zero offset when passed to release func\n",
regno);
return -EINVAL;
}
- /* For release_reg == true, fixed_off_ok must be false, but we
- * already checked and rejected reg->off != 0 above, so set to
- * true to allow fixed offset for all other cases.
+ /* For arg is release pointer, fixed_off_ok must be false, but
+ * we already checked and rejected reg->off != 0 above, so set
+ * to true to allow fixed offset for all other cases.
*/
fixed_off_ok = true;
break;
@@ -5492,18 +5691,28 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
*/
goto skip_type_check;
- err = check_reg_type(env, regno, arg_type, fn->arg_btf_id[arg]);
+ err = check_reg_type(env, regno, arg_type, fn->arg_btf_id[arg], meta);
if (err)
return err;
- err = check_func_arg_reg_off(env, reg, regno, arg_type, is_release_function(meta->func_id));
+ err = check_func_arg_reg_off(env, reg, regno, arg_type);
if (err)
return err;
skip_type_check:
- /* check_func_arg_reg_off relies on only one referenced register being
- * allowed for BPF helpers.
- */
+ if (arg_type_is_release(arg_type)) {
+ if (!reg->ref_obj_id && !register_is_null(reg)) {
+ verbose(env, "R%d must be referenced when passed to release function\n",
+ regno);
+ return -EINVAL;
+ }
+ if (meta->release_regno) {
+ verbose(env, "verifier internal error: more than one release argument\n");
+ return -EFAULT;
+ }
+ meta->release_regno = regno;
+ }
+
if (reg->ref_obj_id) {
if (meta->ref_obj_id) {
verbose(env, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
@@ -5641,7 +5850,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
}
err = check_map_access(env, regno, reg->off,
- map->value_size - reg->off, false);
+ map->value_size - reg->off, false,
+ ACCESS_HELPER);
if (err)
return err;
@@ -5657,6 +5867,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
verbose(env, "string is not zero-terminated\n");
return -EINVAL;
}
+ } else if (arg_type == ARG_PTR_TO_KPTR) {
+ if (process_kptr_func(env, regno, meta))
+ return -EACCES;
}
return err;
@@ -5696,7 +5909,8 @@ static bool may_update_sockmap(struct bpf_verifier_env *env, int func_id)
static bool allow_tail_call_in_subprogs(struct bpf_verifier_env *env)
{
- return env->prog->jit_requested && IS_ENABLED(CONFIG_X86_64);
+ return env->prog->jit_requested &&
+ bpf_jit_supports_subprog_tailcalls();
}
static int check_map_func_compatibility(struct bpf_verifier_env *env,
@@ -5999,17 +6213,18 @@ static bool check_btf_id_ok(const struct bpf_func_proto *fn)
int i;
for (i = 0; i < ARRAY_SIZE(fn->arg_type); i++) {
- if (fn->arg_type[i] == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
+ if (base_type(fn->arg_type[i]) == ARG_PTR_TO_BTF_ID && !fn->arg_btf_id[i])
return false;
- if (fn->arg_type[i] != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
+ if (base_type(fn->arg_type[i]) != ARG_PTR_TO_BTF_ID && fn->arg_btf_id[i])
return false;
}
return true;
}
-static int check_func_proto(const struct bpf_func_proto *fn, int func_id)
+static int check_func_proto(const struct bpf_func_proto *fn, int func_id,
+ struct bpf_call_arg_meta *meta)
{
return check_raw_mode_ok(fn) &&
check_arg_pair_ok(fn) &&
@@ -6691,7 +6906,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
memset(&meta, 0, sizeof(meta));
meta.pkt_access = fn->pkt_access;
- err = check_func_proto(fn, func_id);
+ err = check_func_proto(fn, func_id, &meta);
if (err) {
verbose(env, "kernel subsystem misconfigured func %s#%d\n",
func_id_name(func_id), func_id);
@@ -6724,8 +6939,17 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
return err;
}
- if (is_release_function(func_id)) {
- err = release_reference(env, meta.ref_obj_id);
+ regs = cur_regs(env);
+
+ if (meta.release_regno) {
+ err = -EINVAL;
+ if (meta.ref_obj_id)
+ err = release_reference(env, meta.ref_obj_id);
+ /* meta.ref_obj_id can only be 0 if register that is meant to be
+ * released is NULL, which must be > R0.
+ */
+ else if (register_is_null(®s[meta.release_regno]))
+ err = 0;
if (err) {
verbose(env, "func %s#%d reference has not been acquired before\n",
func_id_name(func_id), func_id);
@@ -6733,8 +6957,6 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
}
}
- regs = cur_regs(env);
-
switch (func_id) {
case BPF_FUNC_tail_call:
err = check_reference_leak(env);
@@ -6858,21 +7080,25 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
regs[BPF_REG_0].btf_id = meta.ret_btf_id;
}
} else if (base_type(ret_type) == RET_PTR_TO_BTF_ID) {
+ struct btf *ret_btf;
int ret_btf_id;
mark_reg_known_zero(env, regs, BPF_REG_0);
regs[BPF_REG_0].type = PTR_TO_BTF_ID | ret_flag;
- ret_btf_id = *fn->ret_btf_id;
+ if (func_id == BPF_FUNC_kptr_xchg) {
+ ret_btf = meta.kptr_off_desc->kptr.btf;
+ ret_btf_id = meta.kptr_off_desc->kptr.btf_id;
+ } else {
+ ret_btf = btf_vmlinux;
+ ret_btf_id = *fn->ret_btf_id;
+ }
if (ret_btf_id == 0) {
verbose(env, "invalid return type %u of func %s#%d\n",
base_type(ret_type), func_id_name(func_id),
func_id);
return -EINVAL;
}
- /* current BPF helper definitions are only coming from
- * built-in code with type IDs from vmlinux BTF
- */
- regs[BPF_REG_0].btf = btf_vmlinux;
+ regs[BPF_REG_0].btf = ret_btf;
regs[BPF_REG_0].btf_id = ret_btf_id;
} else {
verbose(env, "unknown return type %u of func %s#%d\n",
@@ -7459,7 +7685,7 @@ static int sanitize_check_bounds(struct bpf_verifier_env *env,
return -EACCES;
break;
case PTR_TO_MAP_VALUE:
- if (check_map_access(env, dst, dst_reg->off, 1, false)) {
+ if (check_map_access(env, dst, dst_reg->off, 1, false, ACCESS_HELPER)) {
verbose(env, "R%d pointer arithmetic of map value goes out of range, "
"prohibited for !root\n", dst);
return -EACCES;
@@ -13015,6 +13241,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
/* Below members will be freed only at prog->aux */
func[i]->aux->btf = prog->aux->btf;
func[i]->aux->func_info = prog->aux->func_info;
+ func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
func[i]->aux->poke_tab = prog->aux->poke_tab;
func[i]->aux->size_poke_tab = prog->aux->size_poke_tab;
@@ -13027,9 +13254,6 @@ static int jit_subprogs(struct bpf_verifier_env *env)
poke->aux = func[i]->aux;
}
- /* Use bpf_prog_F_tag to indicate functions in stack traces.
- * Long term would need debug info to populate names
- */
func[i]->aux->name[0] = 'F';
func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
func[i]->jit_requested = 1;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 71a418858a5e..58aadfda9b8b 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2239,7 +2239,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
goto out_unlock;
cgroup_taskset_for_each(task, css, tset) {
- ret = task_can_attach(task, cs->cpus_allowed);
+ ret = task_can_attach(task, cs->effective_cpus);
if (ret)
goto out_unlock;
ret = security_task_setscheduler(task);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 73a41cec9e38..90e5f5c92fdc 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -588,7 +588,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
int index;
phys_addr_t tlb_addr;
- if (!mem)
+ if (!mem || !mem->nslabs)
panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 10929eda9825..fc760d064a65 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -82,6 +82,7 @@ config IRQ_FASTEOI_HIERARCHY_HANDLERS
# Generic IRQ IPI support
config GENERIC_IRQ_IPI
bool
+ depends on SMP
select IRQ_DOMAIN_HIERARCHY
# Generic MSI interrupt support
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 54af0deb239b..eb921485930f 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1513,7 +1513,8 @@ int irq_chip_request_resources_parent(struct irq_data *data)
if (data->chip->irq_request_resources)
return data->chip->irq_request_resources(data);
- return -ENOSYS;
+ /* no error on missing optional irq_chip::irq_request_resources */
+ return 0;
}
EXPORT_SYMBOL_GPL(irq_chip_request_resources_parent);
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index d5ce96510549..481abb885d61 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -910,6 +910,8 @@ struct irq_desc *__irq_resolve_mapping(struct irq_domain *domain,
data = irq_domain_get_irq_data(domain, hwirq);
if (data && data->hwirq == hwirq)
desc = irq_data_to_desc(data);
+ if (irq && desc)
+ *irq = hwirq;
}
return desc;
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index bb0fb63f563c..ad005cd184a4 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -62,14 +62,7 @@ int kexec_image_probe_default(struct kimage *image, void *buf,
return ret;
}
-/* Architectures can provide this probe function */
-int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
- unsigned long buf_len)
-{
- return kexec_image_probe_default(image, buf, buf_len);
-}
-
-static void *kexec_image_load_default(struct kimage *image)
+void *kexec_image_load_default(struct kimage *image)
{
if (!image->fops || !image->fops->load)
return ERR_PTR(-ENOEXEC);
@@ -80,11 +73,6 @@ static void *kexec_image_load_default(struct kimage *image)
image->cmdline_buf_len);
}
-void * __weak arch_kexec_kernel_image_load(struct kimage *image)
-{
- return kexec_image_load_default(image);
-}
-
int kexec_image_post_load_cleanup_default(struct kimage *image)
{
if (!image->fops || !image->fops->cleanup)
@@ -93,30 +81,6 @@ int kexec_image_post_load_cleanup_default(struct kimage *image)
return image->fops->cleanup(image->image_loader_data);
}
-int __weak arch_kimage_file_post_load_cleanup(struct kimage *image)
-{
- return kexec_image_post_load_cleanup_default(image);
-}
-
-#ifdef CONFIG_KEXEC_SIG
-static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
- unsigned long buf_len)
-{
- if (!image->fops || !image->fops->verify_sig) {
- pr_debug("kernel loader does not support signature verification.\n");
- return -EKEYREJECTED;
- }
-
- return image->fops->verify_sig(buf, buf_len);
-}
-
-int __weak arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
- unsigned long buf_len)
-{
- return kexec_image_verify_sig_default(image, buf, buf_len);
-}
-#endif
-
/*
* Free up memory used by kernel, initrd, and command line. This is temporary
* memory allocation which is not needed any more after these buffers have
@@ -159,13 +123,24 @@ void kimage_file_post_load_cleanup(struct kimage *image)
}
#ifdef CONFIG_KEXEC_SIG
+static int kexec_image_verify_sig(struct kimage *image, void *buf,
+ unsigned long buf_len)
+{
+ if (!image->fops || !image->fops->verify_sig) {
+ pr_debug("kernel loader does not support signature verification.\n");
+ return -EKEYREJECTED;
+ }
+
+ return image->fops->verify_sig(buf, buf_len);
+}
+
static int
kimage_validate_signature(struct kimage *image)
{
int ret;
- ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
- image->kernel_buf_len);
+ ret = kexec_image_verify_sig(image, image->kernel_buf,
+ image->kernel_buf_len);
if (ret) {
if (sig_enforce) {
@@ -621,19 +596,6 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
-/**
- * arch_kexec_locate_mem_hole - Find free memory to place the segments.
- * @kbuf: Parameters for the memory search.
- *
- * On success, kbuf->mem will have the start address of the memory region found.
- *
- * Return: 0 on success, negative errno on error.
- */
-int __weak arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
-{
- return kexec_locate_mem_hole(kbuf);
-}
-
/**
* kexec_add_buffer - place a buffer in a kexec segment
* @kbuf: Buffer contents and memory parameters.
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f214f8c088ed..80697e5e03e4 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1560,7 +1560,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
preempt_disable();
/* Ensure it is not in reserved area nor out of text */
- if (!kernel_text_address((unsigned long) p->addr) ||
+ if (!(core_kernel_text((unsigned long) p->addr) ||
+ is_module_text_address((unsigned long) p->addr)) ||
within_kprobe_blacklist((unsigned long) p->addr) ||
jump_label_text_reserved(p->addr, p->addr) ||
static_call_text_reserved(p->addr, p->addr) ||
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c06cab6546ed..e45ec82b42e6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5214,9 +5214,10 @@ __lock_set_class(struct lockdep_map *lock, const char *name,
return 0;
}
- lockdep_init_map_waits(lock, name, key, 0,
- lock->wait_type_inner,
- lock->wait_type_outer);
+ lockdep_init_map_type(lock, name, key, 0,
+ lock->wait_type_inner,
+ lock->wait_type_outer,
+ lock->lock_type);
class = register_lock_class(lock, subclass, 0);
hlock->class_idx = class - lock_classes;
diff --git a/kernel/power/user.c b/kernel/power/user.c
index ad241b4ff64c..d43c2aa583b2 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -26,6 +26,7 @@
#include "power.h"
+static bool need_wait;
static struct snapshot_data {
struct snapshot_handle handle;
@@ -78,7 +79,7 @@ static int snapshot_open(struct inode *inode, struct file *filp)
* Resuming. We may need to wait for the image device to
* appear.
*/
- wait_for_device_probe();
+ need_wait = true;
data->swap = -1;
data->mode = O_WRONLY;
@@ -168,6 +169,11 @@ static ssize_t snapshot_write(struct file *filp, const char __user *buf,
ssize_t res;
loff_t pg_offp = *offp & ~PAGE_MASK;
+ if (need_wait) {
+ wait_for_device_probe();
+ need_wait = false;
+ }
+
lock_system_sleep();
data = filp->private_data;
@@ -244,6 +250,11 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
loff_t size;
sector_t offset;
+ if (need_wait) {
+ wait_for_device_probe();
+ need_wait = false;
+ }
+
if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC)
return -ENOTTY;
if (_IOC_NR(cmd) > SNAPSHOT_IOC_MAXNR)
diff --git a/kernel/profile.c b/kernel/profile.c
index 37640a0bd8a3..ae82ddfc6a68 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -109,6 +109,13 @@ int __ref profile_init(void)
/* only text is profiled */
prof_len = (_etext - _stext) >> prof_shift;
+
+ if (!prof_len) {
+ pr_warn("profiling shift: %u too large\n", prof_shift);
+ prof_on = 0;
+ return -EINVAL;
+ }
+
buffer_bytes = prof_len*sizeof(atomic_t);
if (!alloc_cpumask_var(&prof_cpu_mask, GFP_KERNEL))
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 55d049c39608..7c3a7f8c828b 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2030,6 +2030,19 @@ static int rcutorture_booster_init(unsigned int cpu)
if (boost_tasks[cpu] != NULL)
return 0; /* Already created, nothing more to do. */
+ // Testing RCU priority boosting requires rcutorture do
+ // some serious abuse. Counter this by running ksoftirqd
+ // at higher priority.
+ if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
+ struct sched_param sp;
+ struct task_struct *t;
+
+ t = per_cpu(ksoftirqd, cpu);
+ WARN_ON_ONCE(!t);
+ sp.sched_priority = 2;
+ sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+ }
+
/* Don't allow time recalculation while creating a new task. */
mutex_lock(&boost_mutex);
rcu_torture_disable_rt_throttle();
@@ -3281,21 +3294,6 @@ rcu_torture_init(void)
rcutor_hp = firsterr;
if (torture_init_error(firsterr))
goto unwind;
-
- // Testing RCU priority boosting requires rcutorture do
- // some serious abuse. Counter this by running ksoftirqd
- // at higher priority.
- if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
- for_each_online_cpu(cpu) {
- struct sched_param sp;
- struct task_struct *t;
-
- t = per_cpu(ksoftirqd, cpu);
- WARN_ON_ONCE(!t);
- sp.sched_priority = 2;
- sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
- }
- }
}
shutdown_jiffies = jiffies + shutdown_secs * HZ;
firsterr = torture_shutdown_init(shutdown_secs, rcu_torture_cleanup);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dd11daa7a84b..9671796a11cc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -88,7 +88,7 @@
#include "stats.h"
#include "../workqueue_internal.h"
-#include "../../fs/io-wq.h"
+#include "../../io_uring/io-wq.h"
#include "../smpboot.h"
/*
@@ -3811,7 +3811,7 @@ bool cpus_share_cache(int this_cpu, int that_cpu)
return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
}
-static inline bool ttwu_queue_cond(int cpu, int wake_flags)
+static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
{
/*
* Do not complicate things with the async wake_list while the CPU is
@@ -3820,6 +3820,10 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
if (!cpu_active(cpu))
return false;
+ /* Ensure the task will still be allowed to run on the CPU. */
+ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
+ return false;
+
/*
* If the CPU does not share cache, then queue the task on the
* remote rqs wakelist to avoid accessing remote data.
@@ -3827,13 +3831,21 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
if (!cpus_share_cache(smp_processor_id(), cpu))
return true;
+ if (cpu == smp_processor_id())
+ return false;
+
/*
- * If the task is descheduling and the only running task on the
- * CPU then use the wakelist to offload the task activation to
- * the soon-to-be-idle CPU as the current CPU is likely busy.
- * nr_running is checked to avoid unnecessary task stacking.
+ * If the wakee cpu is idle, or the task is descheduling and the
+ * only running task on the CPU, then use the wakelist to offload
+ * the task activation to the idle (or soon-to-be-idle) CPU as
+ * the current CPU is likely busy. nr_running is checked to
+ * avoid unnecessary task stacking.
+ *
+ * Note that we can only get here with (wakee) p->on_rq=0,
+ * p->on_cpu can be whatever, we've done the dequeue, so
+ * the wakee has been accounted out of ->nr_running.
*/
- if ((wake_flags & WF_ON_CPU) && cpu_rq(cpu)->nr_running <= 1)
+ if (!cpu_rq(cpu)->nr_running)
return true;
return false;
@@ -3841,10 +3853,7 @@ static inline bool ttwu_queue_cond(int cpu, int wake_flags)
static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
{
- if (sched_feat(TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags)) {
- if (WARN_ON_ONCE(cpu == smp_processor_id()))
- return false;
-
+ if (sched_feat(TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
sched_clock_cpu(cpu); /* Sync clocks across CPUs */
__ttwu_queue_wakelist(p, cpu, wake_flags);
return true;
@@ -4166,7 +4175,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
* scheduling.
*/
if (smp_load_acquire(&p->on_cpu) &&
- ttwu_queue_wakelist(p, task_cpu(p), wake_flags | WF_ON_CPU))
+ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
goto unlock;
/*
@@ -4710,7 +4719,8 @@ static inline void prepare_task(struct task_struct *next)
* Claim the task as running, we do this before switching to it
* such that any running task will have this set.
*
- * See the ttwu() WF_ON_CPU case and its ordering comment.
+ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
+ * its ordering comment.
*/
WRITE_ONCE(next->on_cpu, 1);
#endif
@@ -6460,8 +6470,12 @@ static inline void sched_submit_work(struct task_struct *tsk)
io_wq_worker_sleeping(tsk);
}
- if (tsk_is_pi_blocked(tsk))
- return;
+ /*
+ * spinlock and rwlock must not flush block requests. This will
+ * deadlock if the callback attempts to acquire a lock which is
+ * already acquired.
+ */
+ SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
/*
* If we are going to sleep and we have plugged IO queued,
@@ -8895,7 +8909,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
}
int task_can_attach(struct task_struct *p,
- const struct cpumask *cs_cpus_allowed)
+ const struct cpumask *cs_effective_cpus)
{
int ret = 0;
@@ -8914,9 +8928,11 @@ int task_can_attach(struct task_struct *p,
}
if (dl_task(p) && !cpumask_intersects(task_rq(p)->rd->span,
- cs_cpus_allowed)) {
- int cpu = cpumask_any_and(cpu_active_mask, cs_cpus_allowed);
+ cs_effective_cpus)) {
+ int cpu = cpumask_any_and(cpu_active_mask, cs_effective_cpus);
+ if (unlikely(cpu >= nr_cpu_ids))
+ return -EINVAL;
ret = dl_cpu_busy(cpu, p);
}
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cc8daa3dcc8b..46f6674a0979 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6324,6 +6324,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
{
struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
int i, cpu, idle_cpu = -1, nr = INT_MAX;
+ struct sched_domain_shared *sd_share;
struct rq *this_rq = this_rq();
int this = smp_processor_id();
struct sched_domain *this_sd;
@@ -6363,6 +6364,17 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
time = cpu_clock(this);
}
+ if (sched_feat(SIS_UTIL)) {
+ sd_share = rcu_dereference(per_cpu(sd_llc_shared, target));
+ if (sd_share) {
+ /* because !--nr is the condition to stop scan */
+ nr = READ_ONCE(sd_share->nr_idle_scan) + 1;
+ /* overloaded LLC is unlikely to have idle cpu/core */
+ if (nr == 1)
+ return -1;
+ }
+ }
+
for_each_cpu_wrap(cpu, cpus, target + 1) {
if (has_idle_core) {
i = select_idle_core(p, cpu, cpus, &idle_cpu);
@@ -7616,8 +7628,8 @@ enum group_type {
*/
group_fully_busy,
/*
- * SD_ASYM_CPUCAPACITY only: One task doesn't fit with CPU's capacity
- * and must be migrated to a more powerful CPU.
+ * One task doesn't fit with CPU's capacity and must be migrated to a
+ * more powerful CPU.
*/
group_misfit_task,
/*
@@ -8700,6 +8712,19 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs
return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
}
+static inline bool
+sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
+{
+ /*
+ * When there is more than 1 task, the group_overloaded case already
+ * takes care of cpu with reduced capacity
+ */
+ if (rq->cfs.h_nr_running != 1)
+ return false;
+
+ return check_cpu_capacity(rq, sd);
+}
+
/**
* update_sg_lb_stats - Update sched_group's statistics for load balancing.
* @env: The load balancing environment.
@@ -8722,8 +8747,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
for_each_cpu_and(i, sched_group_span(group), env->cpus) {
struct rq *rq = cpu_rq(i);
+ unsigned long load = cpu_load(rq);
- sgs->group_load += cpu_load(rq);
+ sgs->group_load += load;
sgs->group_util += cpu_util_cfs(i);
sgs->group_runnable += cpu_runnable(rq);
sgs->sum_h_nr_running += rq->cfs.h_nr_running;
@@ -8753,11 +8779,17 @@ static inline void update_sg_lb_stats(struct lb_env *env,
if (local_group)
continue;
- /* Check for a misfit task on the cpu */
- if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
- sgs->group_misfit_task_load < rq->misfit_task_load) {
- sgs->group_misfit_task_load = rq->misfit_task_load;
- *sg_status |= SG_OVERLOAD;
+ if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
+ /* Check for a misfit task on the cpu */
+ if (sgs->group_misfit_task_load < rq->misfit_task_load) {
+ sgs->group_misfit_task_load = rq->misfit_task_load;
+ *sg_status |= SG_OVERLOAD;
+ }
+ } else if ((env->idle != CPU_NOT_IDLE) &&
+ sched_reduced_capacity(rq, env->sd)) {
+ /* Check for a task running on a CPU with reduced capacity */
+ if (sgs->group_misfit_task_load < load)
+ sgs->group_misfit_task_load = load;
}
}
@@ -8810,7 +8842,8 @@ static bool update_sd_pick_busiest(struct lb_env *env,
* CPUs in the group should either be possible to resolve
* internally or be covered by avg_load imbalance (eventually).
*/
- if (sgs->group_type == group_misfit_task &&
+ if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
+ (sgs->group_type == group_misfit_task) &&
(!capacity_greater(capacity_of(env->dst_cpu), sg->sgc->max_capacity) ||
sds->local_stat.group_type != group_has_spare))
return false;
@@ -9253,6 +9286,77 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
return idlest;
}
+static void update_idle_cpu_scan(struct lb_env *env,
+ unsigned long sum_util)
+{
+ struct sched_domain_shared *sd_share;
+ int llc_weight, pct;
+ u64 x, y, tmp;
+ /*
+ * Update the number of CPUs to scan in LLC domain, which could
+ * be used as a hint in select_idle_cpu(). The update of sd_share
+ * could be expensive because it is within a shared cache line.
+ * So the write of this hint only occurs during periodic load
+ * balancing, rather than CPU_NEWLY_IDLE, because the latter
+ * can fire way more frequently than the former.
+ */
+ if (!sched_feat(SIS_UTIL) || env->idle == CPU_NEWLY_IDLE)
+ return;
+
+ llc_weight = per_cpu(sd_llc_size, env->dst_cpu);
+ if (env->sd->span_weight != llc_weight)
+ return;
+
+ sd_share = rcu_dereference(per_cpu(sd_llc_shared, env->dst_cpu));
+ if (!sd_share)
+ return;
+
+ /*
+ * The number of CPUs to search drops as sum_util increases, when
+ * sum_util hits 85% or above, the scan stops.
+ * The reason to choose 85% as the threshold is because this is the
+ * imbalance_pct(117) when a LLC sched group is overloaded.
+ *
+ * let y = SCHED_CAPACITY_SCALE - p * x^2 [1]
+ * and y'= y / SCHED_CAPACITY_SCALE
+ *
+ * x is the ratio of sum_util compared to the CPU capacity:
+ * x = sum_util / (llc_weight * SCHED_CAPACITY_SCALE)
+ * y' is the ratio of CPUs to be scanned in the LLC domain,
+ * and the number of CPUs to scan is calculated by:
+ *
+ * nr_scan = llc_weight * y' [2]
+ *
+ * When x hits the threshold of overloaded, AKA, when
+ * x = 100 / pct, y drops to 0. According to [1],
+ * p should be SCHED_CAPACITY_SCALE * pct^2 / 10000
+ *
+ * Scale x by SCHED_CAPACITY_SCALE:
+ * x' = sum_util / llc_weight; [3]
+ *
+ * and finally [1] becomes:
+ * y = SCHED_CAPACITY_SCALE -
+ * x'^2 * pct^2 / (10000 * SCHED_CAPACITY_SCALE) [4]
+ *
+ */
+ /* equation [3] */
+ x = sum_util;
+ do_div(x, llc_weight);
+
+ /* equation [4] */
+ pct = env->sd->imbalance_pct;
+ tmp = x * x * pct * pct;
+ do_div(tmp, 10000 * SCHED_CAPACITY_SCALE);
+ tmp = min_t(long, tmp, SCHED_CAPACITY_SCALE);
+ y = SCHED_CAPACITY_SCALE - tmp;
+
+ /* equation [2] */
+ y *= llc_weight;
+ do_div(y, SCHED_CAPACITY_SCALE);
+ if ((int)y != sd_share->nr_idle_scan)
+ WRITE_ONCE(sd_share->nr_idle_scan, (int)y);
+}
+
/**
* update_sd_lb_stats - Update sched_domain's statistics for load balancing.
* @env: The load balancing environment.
@@ -9265,6 +9369,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
struct sched_group *sg = env->sd->groups;
struct sg_lb_stats *local = &sds->local_stat;
struct sg_lb_stats tmp_sgs;
+ unsigned long sum_util = 0;
int sg_status = 0;
do {
@@ -9297,6 +9402,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
sds->total_load += sgs->group_load;
sds->total_capacity += sgs->group_capacity;
+ sum_util += sgs->group_util;
sg = sg->next;
} while (sg != env->sd->groups);
@@ -9322,6 +9428,8 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
WRITE_ONCE(rd->overutilized, SG_OVERUTILIZED);
trace_sched_overutilized_tp(rd, SG_OVERUTILIZED);
}
+
+ update_idle_cpu_scan(env, sum_util);
}
#define NUMA_IMBALANCE_MIN 2
@@ -9356,9 +9464,18 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
busiest = &sds->busiest_stat;
if (busiest->group_type == group_misfit_task) {
- /* Set imbalance to allow misfit tasks to be balanced. */
- env->migration_type = migrate_misfit;
- env->imbalance = 1;
+ if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
+ /* Set imbalance to allow misfit tasks to be balanced. */
+ env->migration_type = migrate_misfit;
+ env->imbalance = 1;
+ } else {
+ /*
+ * Set load imbalance to allow moving task from cpu
+ * with reduced capacity.
+ */
+ env->migration_type = migrate_load;
+ env->imbalance = busiest->group_misfit_task_load;
+ }
return;
}
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 1cf435bbcd9c..ee7f23c76bd3 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -60,7 +60,8 @@ SCHED_FEAT(TTWU_QUEUE, true)
/*
* When doing wakeups, attempt to limit superfluous scans of the LLC domain.
*/
-SCHED_FEAT(SIS_PROP, true)
+SCHED_FEAT(SIS_PROP, false)
+SCHED_FEAT(SIS_UTIL, true)
/*
* Issue a WARN when we do multiple update_rq_clock() calls
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7891c0f0e1ff..907a5d7507a8 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -430,7 +430,7 @@ static inline void rt_queue_push_tasks(struct rq *rq)
#endif /* CONFIG_SMP */
static void enqueue_top_rt_rq(struct rt_rq *rt_rq);
-static void dequeue_top_rt_rq(struct rt_rq *rt_rq);
+static void dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count);
static inline int on_rt_rq(struct sched_rt_entity *rt_se)
{
@@ -551,7 +551,7 @@ static void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
rt_se = rt_rq->tg->rt_se[cpu];
if (!rt_se) {
- dequeue_top_rt_rq(rt_rq);
+ dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running);
/* Kick cpufreq (see the comment in kernel/sched/sched.h). */
cpufreq_update_util(rq_of_rt_rq(rt_rq), 0);
}
@@ -637,7 +637,7 @@ static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq)
{
- dequeue_top_rt_rq(rt_rq);
+ dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running);
}
static inline int rt_rq_throttled(struct rt_rq *rt_rq)
@@ -1039,7 +1039,7 @@ static void update_curr_rt(struct rq *rq)
}
static void
-dequeue_top_rt_rq(struct rt_rq *rt_rq)
+dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count)
{
struct rq *rq = rq_of_rt_rq(rt_rq);
@@ -1050,7 +1050,7 @@ dequeue_top_rt_rq(struct rt_rq *rt_rq)
BUG_ON(!rq->nr_running);
- sub_nr_running(rq, rt_rq->rt_nr_running);
+ sub_nr_running(rq, count);
rt_rq->rt_queued = 0;
}
@@ -1436,18 +1436,21 @@ static void __dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flag
static void dequeue_rt_stack(struct sched_rt_entity *rt_se, unsigned int flags)
{
struct sched_rt_entity *back = NULL;
+ unsigned int rt_nr_running;
for_each_sched_rt_entity(rt_se) {
rt_se->back = back;
back = rt_se;
}
- dequeue_top_rt_rq(rt_rq_of_se(back));
+ rt_nr_running = rt_rq_of_se(back)->rt_nr_running;
for (rt_se = back; rt_se; rt_se = rt_se->back) {
if (on_rt_rq(rt_se))
__dequeue_rt_entity(rt_se, flags);
}
+
+ dequeue_top_rt_rq(rt_rq_of_se(back), rt_nr_running);
}
static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 84bba67c92dc..08fdb9ccd14d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2042,7 +2042,6 @@ static inline int task_on_rq_migrating(struct task_struct *p)
#define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
-#define WF_ON_CPU 0x40 /* Wakee is on_cpu */
#ifdef CONFIG_SMP
static_assert(WF_EXEC == SD_BALANCE_EXEC);
diff --git a/kernel/smp.c b/kernel/smp.c
index 65a630f62363..381eb15cd28f 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -174,9 +174,9 @@ static int __init csdlock_debug(char *str)
if (val)
static_branch_enable(&csdlock_debug_enabled);
- return 0;
+ return 1;
}
-early_param("csdlock_debug", csdlock_debug);
+__setup("csdlock_debug=", csdlock_debug);
static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 0ea8702eb516..23af5eca11b1 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2311,6 +2311,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
return !t.task ? 0 : -EINTR;
}
+EXPORT_SYMBOL_GPL(schedule_hrtimeout_range_clock);
/**
* schedule_hrtimeout_range - sleep until timeout
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 871c912860ed..d6a0ff68df41 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -23,6 +23,7 @@
#include <linux/pvclock_gtod.h>
#include <linux/compiler.h>
#include <linux/audit.h>
+#include <linux/random.h>
#include "tick-internal.h"
#include "ntp_internal.h"
@@ -1326,8 +1327,10 @@ int do_settimeofday64(const struct timespec64 *ts)
/* Signal hrtimers about time change */
clock_was_set(CLOCK_SET_WALL);
- if (!ret)
+ if (!ret) {
audit_tk_injoffset(ts_delta);
+ add_device_randomness(ts, sizeof(*ts));
+ }
return ret;
}
@@ -2413,6 +2416,7 @@ int do_adjtimex(struct __kernel_timex *txc)
ret = timekeeping_validate_timex(txc);
if (ret)
return ret;
+ add_device_randomness(txc, sizeof(*txc));
if (txc->modes & ADJ_SETOFFSET) {
struct timespec64 delta;
@@ -2430,6 +2434,7 @@ int do_adjtimex(struct __kernel_timex *txc)
audit_ntp_init(&ad);
ktime_get_real_ts64(&ts);
+ add_device_randomness(&ts, sizeof(ts));
raw_spin_lock_irqsave(&timekeeper_lock, flags);
write_seqcount_begin(&tk_core.seq);
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 4d5629196d01..f0500b5cfefe 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -770,14 +770,11 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
**/
void blk_trace_shutdown(struct request_queue *q)
{
- mutex_lock(&q->debugfs_mutex);
if (rcu_dereference_protected(q->blk_trace,
lockdep_is_held(&q->debugfs_mutex))) {
__blk_trace_startstop(q, 0);
__blk_trace_remove(q);
}
-
- mutex_unlock(&q->debugfs_mutex);
}
#ifdef CONFIG_BLK_CGROUP
@@ -1059,7 +1056,7 @@ static void blk_add_trace_rq_remap(void *ignore, struct request *rq, dev_t dev,
r.sector_from = cpu_to_be64(from);
__blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq),
- rq_data_dir(rq), 0, BLK_TA_REMAP, 0,
+ req_op(rq), rq->cmd_flags, BLK_TA_REMAP, 0,
sizeof(r), &r, blk_trace_request_get_cgid(rq));
rcu_read_unlock();
}
diff --git a/lib/crypto/blake2s-selftest.c b/lib/crypto/blake2s-selftest.c
index 409e4b728770..7d77dea15587 100644
--- a/lib/crypto/blake2s-selftest.c
+++ b/lib/crypto/blake2s-selftest.c
@@ -4,6 +4,8 @@
*/
#include <crypto/internal/blake2s.h>
+#include <linux/kernel.h>
+#include <linux/random.h>
#include <linux/string.h>
/*
@@ -587,5 +589,44 @@ bool __init blake2s_selftest(void)
}
}
+ for (i = 0; i < 32; ++i) {
+ enum { TEST_ALIGNMENT = 16 };
+ u8 unaligned_block[BLAKE2S_BLOCK_SIZE + TEST_ALIGNMENT - 1]
+ __aligned(TEST_ALIGNMENT);
+ u8 blocks[BLAKE2S_BLOCK_SIZE * 2];
+ struct blake2s_state state1, state2;
+
+ get_random_bytes(blocks, sizeof(blocks));
+ get_random_bytes(&state, sizeof(state));
+
+#if defined(CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC) && \
+ defined(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)
+ memcpy(&state1, &state, sizeof(state1));
+ memcpy(&state2, &state, sizeof(state2));
+ blake2s_compress(&state1, blocks, 2, BLAKE2S_BLOCK_SIZE);
+ blake2s_compress_generic(&state2, blocks, 2, BLAKE2S_BLOCK_SIZE);
+ if (memcmp(&state1, &state2, sizeof(state1))) {
+ pr_err("blake2s random compress self-test %d: FAIL\n",
+ i + 1);
+ success = false;
+ }
+#endif
+
+ memcpy(&state1, &state, sizeof(state1));
+ blake2s_compress(&state1, blocks, 1, BLAKE2S_BLOCK_SIZE);
+ for (l = 1; l < TEST_ALIGNMENT; ++l) {
+ memcpy(unaligned_block + l, blocks,
+ BLAKE2S_BLOCK_SIZE);
+ memcpy(&state2, &state, sizeof(state2));
+ blake2s_compress(&state2, unaligned_block + l, 1,
+ BLAKE2S_BLOCK_SIZE);
+ if (memcmp(&state1, &state2, sizeof(state1))) {
+ pr_err("blake2s random compress align %d self-test %d: FAIL\n",
+ l, i + 1);
+ success = false;
+ }
+ }
+ }
+
return success;
}
diff --git a/lib/crypto/blake2s.c b/lib/crypto/blake2s.c
index c71c09621c09..98e688c6d891 100644
--- a/lib/crypto/blake2s.c
+++ b/lib/crypto/blake2s.c
@@ -16,16 +16,44 @@
#include <linux/init.h>
#include <linux/bug.h>
+static inline void blake2s_set_lastblock(struct blake2s_state *state)
+{
+ state->f[0] = -1;
+}
+
void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen)
{
- __blake2s_update(state, in, inlen, false);
+ const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen;
+
+ if (unlikely(!inlen))
+ return;
+ if (inlen > fill) {
+ memcpy(state->buf + state->buflen, in, fill);
+ blake2s_compress(state, state->buf, 1, BLAKE2S_BLOCK_SIZE);
+ state->buflen = 0;
+ in += fill;
+ inlen -= fill;
+ }
+ if (inlen > BLAKE2S_BLOCK_SIZE) {
+ const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE);
+ blake2s_compress(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE);
+ in += BLAKE2S_BLOCK_SIZE * (nblocks - 1);
+ inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1);
+ }
+ memcpy(state->buf + state->buflen, in, inlen);
+ state->buflen += inlen;
}
EXPORT_SYMBOL(blake2s_update);
void blake2s_final(struct blake2s_state *state, u8 *out)
{
WARN_ON(IS_ENABLED(DEBUG) && !out);
- __blake2s_final(state, out, false);
+ blake2s_set_lastblock(state);
+ memset(state->buf + state->buflen, 0,
+ BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */
+ blake2s_compress(state, state->buf, 1, state->buflen);
+ cpu_to_le32_array(state->h, ARRAY_SIZE(state->h));
+ memcpy(out, state->h, state->outlen);
memzero_explicit(state, sizeof(*state));
}
EXPORT_SYMBOL(blake2s_final);
@@ -38,12 +66,7 @@ static int __init blake2s_mod_init(void)
return 0;
}
-static void __exit blake2s_mod_exit(void)
-{
-}
-
module_init(blake2s_mod_init);
-module_exit(blake2s_mod_exit);
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("BLAKE2s hash function");
MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 0b64695ab632..2bf20b48a04a 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -689,6 +689,7 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
struct pipe_inode_info *pipe = i->pipe;
unsigned int p_mask = pipe->ring_size - 1;
unsigned int i_head;
+ unsigned int valid = pipe->head;
size_t n, off, xfer = 0;
if (!sanity(i))
@@ -702,11 +703,17 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
rem = copy_mc_to_kernel(p + off, addr + xfer, chunk);
chunk -= rem;
kunmap_local(p);
- i->head = i_head;
- i->iov_offset = off + chunk;
- xfer += chunk;
- if (rem)
+ if (chunk) {
+ i->head = i_head;
+ i->iov_offset = off + chunk;
+ xfer += chunk;
+ valid = i_head + 1;
+ }
+ if (rem) {
+ pipe->bufs[i_head & p_mask].len -= rem;
+ pipe_discard_from(pipe, valid);
break;
+ }
n -= chunk;
off = 0;
i_head++;
diff --git a/lib/kunit/executor.c b/lib/kunit/executor.c
index 96f96e42ce06..16fb88c0aca3 100644
--- a/lib/kunit/executor.c
+++ b/lib/kunit/executor.c
@@ -76,8 +76,10 @@ kunit_filter_tests(struct kunit_suite *const suite, const char *test_glob)
memcpy(copy, suite, sizeof(*copy));
filtered = kcalloc(n + 1, sizeof(*filtered), GFP_KERNEL);
- if (!filtered)
+ if (!filtered) {
+ kfree(copy);
return ERR_PTR(-ENOMEM);
+ }
n = 0;
kunit_suite_for_each_test_case(suite, test_case) {
diff --git a/lib/livepatch/test_klp_callbacks_busy.c b/lib/livepatch/test_klp_callbacks_busy.c
index 7ac845f65be5..133929e0ce8f 100644
--- a/lib/livepatch/test_klp_callbacks_busy.c
+++ b/lib/livepatch/test_klp_callbacks_busy.c
@@ -16,10 +16,12 @@ MODULE_PARM_DESC(block_transition, "block_transition (default=false)");
static void busymod_work_func(struct work_struct *work);
static DECLARE_WORK(work, busymod_work_func);
+static DECLARE_COMPLETION(busymod_work_started);
static void busymod_work_func(struct work_struct *work)
{
pr_info("%s enter\n", __func__);
+ complete(&busymod_work_started);
while (READ_ONCE(block_transition)) {
/*
@@ -37,6 +39,12 @@ static int test_klp_callbacks_busy_init(void)
pr_info("%s\n", __func__);
schedule_work(&work);
+ /*
+ * To synchronize kernel messages, hold the init function from
+ * exiting until the work function's entry message has printed.
+ */
+ wait_for_completion(&busymod_work_started);
+
if (!block_transition) {
/*
* Serialize output: print all messages from the work
diff --git a/lib/overflow_kunit.c b/lib/overflow_kunit.c
index 475f0c064bf6..7e3e43679b73 100644
--- a/lib/overflow_kunit.c
+++ b/lib/overflow_kunit.c
@@ -91,6 +91,7 @@ DEFINE_TEST_ARRAY(u32) = {
{-4U, 5U, 1U, -9U, -20U, true, false, true},
};
+#if BITS_PER_LONG == 64
DEFINE_TEST_ARRAY(u64) = {
{0, 0, 0, 0, 0, false, false, false},
{1, 1, 2, 0, 1, false, false, false},
@@ -114,6 +115,7 @@ DEFINE_TEST_ARRAY(u64) = {
false, true, false},
{-15ULL, 10ULL, -5ULL, -25ULL, -150ULL, false, false, true},
};
+#endif
DEFINE_TEST_ARRAY(s8) = {
{0, 0, 0, 0, 0, false, false, false},
@@ -188,6 +190,8 @@ DEFINE_TEST_ARRAY(s32) = {
{S32_MIN, S32_MIN, 0, 0, 0, true, false, true},
{S32_MAX, S32_MAX, -2, 0, 1, true, false, true},
};
+
+#if BITS_PER_LONG == 64
DEFINE_TEST_ARRAY(s64) = {
{0, 0, 0, 0, 0, false, false, false},
@@ -216,6 +220,7 @@ DEFINE_TEST_ARRAY(s64) = {
{-128, -1, -129, -127, 128, false, false, false},
{0, -S64_MAX, -S64_MAX, S64_MAX, 0, false, false, false},
};
+#endif
#define check_one_op(t, fmt, op, sym, a, b, r, of) do { \
t _r; \
@@ -650,6 +655,7 @@ static struct kunit_case overflow_test_cases[] = {
KUNIT_CASE(s16_overflow_test),
KUNIT_CASE(u32_overflow_test),
KUNIT_CASE(s32_overflow_test),
+/* Clang 13 and earlier generate unwanted libcalls on 32-bit. */
#if BITS_PER_LONG == 64
KUNIT_CASE(u64_overflow_test),
KUNIT_CASE(s64_overflow_test),
diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
index 046ac6297c78..a2bb7738c373 100644
--- a/lib/smp_processor_id.c
+++ b/lib/smp_processor_id.c
@@ -47,9 +47,9 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2)
printk("caller is %pS\n", __builtin_return_address(0));
dump_stack();
- instrumentation_end();
out_enable:
+ instrumentation_end();
preempt_enable_no_resched_notrace();
out:
return this_cpu;
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 0c5cb2d6436a..1a00d0247f70 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -14456,9 +14456,9 @@ static struct skb_segment_test skb_segment_tests[] __initconst = {
.build_skb = build_test_skb_linear_no_head_frag,
.features = NETIF_F_SG | NETIF_F_FRAGLIST |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_GSO |
- NETIF_F_LLTX_BIT | NETIF_F_GRO |
+ NETIF_F_LLTX | NETIF_F_GRO |
NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM |
- NETIF_F_HW_VLAN_STAG_TX_BIT
+ NETIF_F_HW_VLAN_STAG_TX
}
};
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index cfe632047839..f2c3015c5c82 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -732,7 +732,7 @@ static int dmirror_exclusive(struct dmirror *dmirror,
mmap_read_lock(mm);
for (addr = start; addr < end; addr = next) {
- unsigned long mapped;
+ unsigned long mapped = 0;
int i;
if (end < addr + (ARRAY_SIZE(pages) << PAGE_SHIFT))
@@ -741,7 +741,13 @@ static int dmirror_exclusive(struct dmirror *dmirror,
next = addr + (ARRAY_SIZE(pages) << PAGE_SHIFT);
ret = make_device_exclusive_range(mm, addr, next, pages, NULL);
- mapped = dmirror_atomic_map(addr, next, pages, dmirror);
+ /*
+ * Do dmirror_atomic_map() iff all pages are marked for
+ * exclusive access to avoid accessing uninitialized
+ * fields of pages.
+ */
+ if (ret == (next - addr) >> PAGE_SHIFT)
+ mapped = dmirror_atomic_map(addr, next, pages, dmirror);
for (i = 0; i < ret; i++) {
if (pages[i]) {
unlock_page(pages[i]);
diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index ad880231dfa8..630e0c31d7c2 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -131,6 +131,7 @@ static void kmalloc_oob_right(struct kunit *test)
ptr = kmalloc(size, GFP_KERNEL);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
/*
* An unaligned access past the requested kmalloc size.
* Only generic KASAN can precisely detect these.
@@ -159,6 +160,7 @@ static void kmalloc_oob_left(struct kunit *test)
ptr = kmalloc(size, GFP_KERNEL);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
KUNIT_EXPECT_KASAN_FAIL(test, *ptr = *(ptr - 1));
kfree(ptr);
}
@@ -171,6 +173,7 @@ static void kmalloc_node_oob_right(struct kunit *test)
ptr = kmalloc_node(size, GFP_KERNEL, 0);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
KUNIT_EXPECT_KASAN_FAIL(test, ptr[0] = ptr[size]);
kfree(ptr);
}
@@ -191,6 +194,7 @@ static void kmalloc_pagealloc_oob_right(struct kunit *test)
ptr = kmalloc(size, GFP_KERNEL);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
KUNIT_EXPECT_KASAN_FAIL(test, ptr[size + OOB_TAG_OFF] = 0);
kfree(ptr);
@@ -271,6 +275,7 @@ static void kmalloc_large_oob_right(struct kunit *test)
ptr = kmalloc(size, GFP_KERNEL);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
KUNIT_EXPECT_KASAN_FAIL(test, ptr[size] = 0);
kfree(ptr);
}
@@ -410,6 +415,8 @@ static void kmalloc_oob_16(struct kunit *test)
ptr2 = kmalloc(sizeof(*ptr2), GFP_KERNEL);
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr2);
+ OPTIMIZER_HIDE_VAR(ptr1);
+ OPTIMIZER_HIDE_VAR(ptr2);
KUNIT_EXPECT_KASAN_FAIL(test, *ptr1 = *ptr2);
kfree(ptr1);
kfree(ptr2);
@@ -756,6 +763,8 @@ static void ksize_unpoisons_memory(struct kunit *test)
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
real_size = ksize(ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
+
/* This access shouldn't trigger a KASAN report. */
ptr[size] = 'x';
@@ -778,6 +787,7 @@ static void ksize_uaf(struct kunit *test)
KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
kfree(ptr);
+ OPTIMIZER_HIDE_VAR(ptr);
KUNIT_EXPECT_KASAN_FAIL(test, ksize(ptr));
KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[0]);
KUNIT_EXPECT_KASAN_FAIL(test, ((volatile char *)ptr)[size]);
diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index e34c4d0c4d93..11982685508e 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -384,8 +384,10 @@ static int __init damon_reclaim_init(void)
if (!ctx)
return -ENOMEM;
- if (damon_select_ops(ctx, DAMON_OPS_PADDR))
+ if (damon_select_ops(ctx, DAMON_OPS_PADDR)) {
+ damon_destroy_ctx(ctx);
return -EINVAL;
+ }
ctx->callback.after_aggregation = damon_reclaim_after_aggregation;
diff --git a/mm/gup.c b/mm/gup.c
index c5d076d43d9b..414f0c4b4caa 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1795,7 +1795,7 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
* Try to move out any movable page before pinning the range.
*/
if (folio_test_hugetlb(folio)) {
- if (!isolate_huge_page(&folio->page,
+ if (isolate_hugetlb(&folio->page,
&movable_page_list))
isolation_error_count++;
continue;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b7d0697b1f26..e0c1cf3168d7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -18,6 +18,7 @@
#include <linux/shrinker.h>
#include <linux/mm_inline.h>
#include <linux/swapops.h>
+#include <linux/backing-dev.h>
#include <linux/dax.h>
#include <linux/khugepaged.h>
#include <linux/freezer.h>
@@ -2389,11 +2390,15 @@ static void __split_huge_page(struct page *page, struct list_head *list,
__split_huge_page_tail(head, i, lruvec, list);
/* Some pages can be beyond EOF: drop them from page cache */
if (head[i].index >= end) {
- ClearPageDirty(head + i);
- __delete_from_page_cache(head + i, NULL);
+ struct folio *tail = page_folio(head + i);
+
if (shmem_mapping(head->mapping))
shmem_uncharge(head->mapping->host, 1);
- put_page(head + i);
+ else if (folio_test_clear_dirty(tail))
+ folio_account_cleaned(tail,
+ inode_to_wb(folio->mapping->host));
+ __filemap_remove_folio(tail, NULL);
+ folio_put(tail);
} else if (!PageAnon(page)) {
__xa_store(&head->mapping->i_pages, head[i].index,
head + i, 0);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 859cfcaecddb..19063dbed332 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2758,8 +2758,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
* Fail with -EBUSY if not possible.
*/
spin_unlock_irq(&hugetlb_lock);
- if (!isolate_huge_page(old_page, list))
- ret = -EBUSY;
+ ret = isolate_hugetlb(old_page, list);
spin_lock_irq(&hugetlb_lock);
goto free_new;
} else if (!HPageFreed(old_page)) {
@@ -2835,7 +2834,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
if (hstate_is_gigantic(h))
return -ENOMEM;
- if (page_count(head) && isolate_huge_page(head, list))
+ if (page_count(head) && !isolate_hugetlb(head, list))
ret = 0;
else if (!page_count(head))
ret = alloc_and_dissolve_huge_page(h, head, list);
@@ -5603,7 +5602,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
*/
entry = huge_ptep_get(ptep);
if (unlikely(is_hugetlb_entry_migration(entry))) {
- migration_entry_wait_huge(vma, mm, ptep);
+ migration_entry_wait_huge(vma, ptep);
return 0;
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
@@ -6726,7 +6725,7 @@ follow_huge_pmd(struct mm_struct *mm, unsigned long address,
} else {
if (is_hugetlb_entry_migration(pte)) {
spin_unlock(ptl);
- __migration_entry_wait(mm, (pte_t *)pmd, ptl);
+ __migration_entry_wait_huge((pte_t *)pmd, ptl);
goto retry;
}
/*
@@ -6758,15 +6757,15 @@ follow_huge_pgd(struct mm_struct *mm, unsigned long address, pgd_t *pgd, int fla
return pte_page(*(pte_t *)pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
}
-bool isolate_huge_page(struct page *page, struct list_head *list)
+int isolate_hugetlb(struct page *page, struct list_head *list)
{
- bool ret = true;
+ int ret = 0;
spin_lock_irq(&hugetlb_lock);
if (!PageHeadHuge(page) ||
!HPageMigratable(page) ||
!get_page_unless_zero(page)) {
- ret = false;
+ ret = -EBUSY;
goto unlock;
}
ClearHPageMigratable(page);
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index f9942841df18..c86691c431fd 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -772,6 +772,7 @@ static void __init __hugetlb_cgroup_file_dfl_init(int idx)
/* Add the numa stat file */
cft = &h->cgroup_files_dfl[6];
snprintf(cft->name, MAX_CFTYPE_NAME, "%s.numa_stat", buf);
+ cft->private = MEMFILE_PRIVATE(idx, 0);
cft->seq_show = hugetlb_cgroup_read_numa_stat;
cft->flags = CFTYPE_NOT_ON_ROOT;
diff --git a/mm/kasan/hw_tags.c b/mm/kasan/hw_tags.c
index 9e1b6544bfa8..9ad8eff71b28 100644
--- a/mm/kasan/hw_tags.c
+++ b/mm/kasan/hw_tags.c
@@ -257,27 +257,37 @@ static void unpoison_vmalloc_pages(const void *addr, u8 tag)
}
}
+static void init_vmalloc_pages(const void *start, unsigned long size)
+{
+ const void *addr;
+
+ for (addr = start; addr < start + size; addr += PAGE_SIZE) {
+ struct page *page = virt_to_page(addr);
+
+ clear_highpage_kasan_tagged(page);
+ }
+}
+
void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
kasan_vmalloc_flags_t flags)
{
u8 tag;
unsigned long redzone_start, redzone_size;
- if (!kasan_vmalloc_enabled())
- return (void *)start;
-
- if (!is_vmalloc_or_module_addr(start))
+ if (!kasan_vmalloc_enabled() || !is_vmalloc_or_module_addr(start)) {
+ if (flags & KASAN_VMALLOC_INIT)
+ init_vmalloc_pages(start, size);
return (void *)start;
+ }
/*
- * Skip unpoisoning and assigning a pointer tag for non-VM_ALLOC
- * mappings as:
+ * Don't tag non-VM_ALLOC mappings, as:
*
* 1. Unlike the software KASAN modes, hardware tag-based KASAN only
* supports tagging physical memory. Therefore, it can only tag a
* single mapping of normal physical pages.
* 2. Hardware tag-based KASAN can only tag memory mapped with special
- * mapping protection bits, see arch_vmalloc_pgprot_modify().
+ * mapping protection bits, see arch_vmap_pgprot_tagged().
* As non-VM_ALLOC mappings can be mapped outside of vmalloc code,
* providing these bits would require tracking all non-VM_ALLOC
* mappers.
@@ -289,15 +299,19 @@ void *__kasan_unpoison_vmalloc(const void *start, unsigned long size,
*
* For non-VM_ALLOC allocations, page_alloc memory is tagged as usual.
*/
- if (!(flags & KASAN_VMALLOC_VM_ALLOC))
+ if (!(flags & KASAN_VMALLOC_VM_ALLOC)) {
+ WARN_ON(flags & KASAN_VMALLOC_INIT);
return (void *)start;
+ }
/*
* Don't tag executable memory.
* The kernel doesn't tolerate having the PC register tagged.
*/
- if (!(flags & KASAN_VMALLOC_PROT_NORMAL))
+ if (!(flags & KASAN_VMALLOC_PROT_NORMAL)) {
+ WARN_ON(flags & KASAN_VMALLOC_INIT);
return (void *)start;
+ }
tag = kasan_random_tag();
start = set_tag(start, tag);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 94dac77f5eba..fb8c9b0b53ab 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2211,7 +2211,7 @@ static bool isolate_page(struct page *page, struct list_head *pagelist)
bool lru = PageLRU(page);
if (PageHuge(page)) {
- isolated = isolate_huge_page(page, pagelist);
+ isolated = !isolate_hugetlb(page, pagelist);
} else {
if (lru)
isolated = !isolate_lru_page(page);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 416b38ca8def..3a60c1444bc7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1627,7 +1627,7 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
if (PageHuge(page)) {
pfn = page_to_pfn(head) + compound_nr(head) - 1;
- isolate_huge_page(head, &source);
+ isolate_hugetlb(head, &source);
continue;
} else if (PageTransHuge(page))
pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ea6dee61bc9d..c1ccc89845f4 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -607,7 +607,7 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
if (flags & (MPOL_MF_MOVE_ALL) ||
(flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
- if (!isolate_huge_page(page, qp->pagelist) &&
+ if (isolate_hugetlb(page, qp->pagelist) &&
(flags & MPOL_MF_STRICT))
/*
* Failed to isolate page but allow migrating pages
@@ -1387,7 +1387,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
unsigned long bits = min_t(unsigned long, maxnode, BITS_PER_LONG);
unsigned long t;
- if (get_bitmap(&t, &nmask[maxnode / BITS_PER_LONG], bits))
+ if (get_bitmap(&t, &nmask[(maxnode - 1) / BITS_PER_LONG], bits))
return -EFAULT;
if (maxnode - bits >= MAX_NUMNODES) {
diff --git a/mm/memremap.c b/mm/memremap.c
index e11653fd348c..2c1130486d28 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -141,10 +141,10 @@ void memunmap_pages(struct dev_pagemap *pgmap)
for (i = 0; i < pgmap->nr_range; i++)
percpu_ref_put_many(&pgmap->ref, pfn_len(pgmap, i));
wait_for_completion(&pgmap->done);
- percpu_ref_exit(&pgmap->ref);
for (i = 0; i < pgmap->nr_range; i++)
pageunmap_range(pgmap, i);
+ percpu_ref_exit(&pgmap->ref);
WARN_ONCE(pgmap->altmap.alloc, "failed to free all reserved pages\n");
devmap_managed_enable_put(pgmap);
diff --git a/mm/migrate.c b/mm/migrate.c
index 6c31ee1e1c9b..5aba3fb612d4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -133,7 +133,7 @@ static void putback_movable_page(struct page *page)
*
* This function shall be used whenever the isolated pageset has been
* built from lru, balloon, hugetlbfs page. See isolate_migratepages_range()
- * and isolate_huge_page().
+ * and isolate_hugetlb().
*/
void putback_movable_pages(struct list_head *l)
{
@@ -309,13 +309,28 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
__migration_entry_wait(mm, ptep, ptl);
}
-void migration_entry_wait_huge(struct vm_area_struct *vma,
- struct mm_struct *mm, pte_t *pte)
+#ifdef CONFIG_HUGETLB_PAGE
+void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
{
- spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), mm, pte);
- __migration_entry_wait(mm, pte, ptl);
+ pte_t pte;
+
+ spin_lock(ptl);
+ pte = huge_ptep_get(ptep);
+
+ if (unlikely(!is_hugetlb_entry_migration(pte)))
+ spin_unlock(ptl);
+ else
+ migration_entry_wait_on_locked(pte_to_swp_entry(pte), NULL, ptl);
}
+void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
+{
+ spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
+
+ __migration_entry_wait_huge(pte, ptl);
+}
+#endif
+
#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
void pmd_migration_entry_wait(struct mm_struct *mm, pmd_t *pmd)
{
@@ -1631,8 +1646,9 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
if (PageHuge(page)) {
if (PageHead(page)) {
- isolate_huge_page(page, pagelist);
- err = 1;
+ err = isolate_hugetlb(page, pagelist);
+ if (!err)
+ err = 1;
}
} else {
struct page *head;
diff --git a/mm/mmap.c b/mm/mmap.c
index 313b57d55a63..6d9bfd9c94ca 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1882,7 +1882,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
/* Undo any partial mapping done by a device driver. */
unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
- charged = 0;
if (vm_flags & VM_SHARED)
mapping_unmap_writable(file->f_mapping);
free_vma:
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 135a081edb82..a8bd356a4d1e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1287,12 +1287,8 @@ static void kernel_init_free_pages(struct page *page, int numpages)
/* s390's use of memset() could override KASAN redzones. */
kasan_disable_current();
- for (i = 0; i < numpages; i++) {
- u8 tag = page_kasan_tag(page + i);
- page_kasan_tag_reset(page + i);
- clear_highpage(page + i);
- page_kasan_tag_set(page + i, tag);
- }
+ for (i = 0; i < numpages; i++)
+ clear_highpage_kasan_tagged(page + i);
kasan_enable_current();
}
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index cadfbb5155ea..06c555fc5be0 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3170,15 +3170,15 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
/*
* Mark the pages as accessible, now that they are mapped.
- * The init condition should match the one in post_alloc_hook()
- * (except for the should_skip_init() check) to make sure that memory
- * is initialized under the same conditions regardless of the enabled
- * KASAN mode.
+ * The condition for setting KASAN_VMALLOC_INIT should complement the
+ * one in post_alloc_hook() with regards to the __GFP_SKIP_ZERO check
+ * to make sure that memory is initialized under the same conditions.
* Tag-based KASAN modes only assign tags to normal non-executable
* allocations, see __kasan_unpoison_vmalloc().
*/
kasan_flags |= KASAN_VMALLOC_VM_ALLOC;
- if (!want_init_on_free() && want_init_on_alloc(gfp_mask))
+ if (!want_init_on_free() && want_init_on_alloc(gfp_mask) &&
+ (gfp_mask & __GFP_SKIP_ZERO))
kasan_flags |= KASAN_VMALLOC_INIT;
/* KASAN_VMALLOC_PROT_NORMAL already set if required. */
area->addr = kasan_unpoison_vmalloc(area->addr, real_size, kasan_flags);
diff --git a/net/9p/client.c b/net/9p/client.c
index 8bba0d9cf975..87cde948f628 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -305,7 +305,7 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
* callback), so p9_client_cb eats the second ref there
* as the pointer is duplicated directly by virtqueue_add_sgs()
*/
- refcount_set(&req->refcount.refcount, 2);
+ refcount_set(&req->refcount, 2);
return req;
@@ -341,7 +341,7 @@ struct p9_req_t *p9_tag_lookup(struct p9_client *c, u16 tag)
if (!p9_req_try_get(req))
goto again;
if (req->tc.tag != tag) {
- p9_req_put(req);
+ p9_req_put(c, req);
goto again;
}
}
@@ -367,21 +367,18 @@ static int p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
spin_lock_irqsave(&c->lock, flags);
idr_remove(&c->reqs, tag);
spin_unlock_irqrestore(&c->lock, flags);
- return p9_req_put(r);
+ return p9_req_put(c, r);
}
-static void p9_req_free(struct kref *ref)
+int p9_req_put(struct p9_client *c, struct p9_req_t *r)
{
- struct p9_req_t *r = container_of(ref, struct p9_req_t, refcount);
-
- p9_fcall_fini(&r->tc);
- p9_fcall_fini(&r->rc);
- kmem_cache_free(p9_req_cache, r);
-}
-
-int p9_req_put(struct p9_req_t *r)
-{
- return kref_put(&r->refcount, p9_req_free);
+ if (refcount_dec_and_test(&r->refcount)) {
+ p9_fcall_fini(&r->tc);
+ p9_fcall_fini(&r->rc);
+ kmem_cache_free(p9_req_cache, r);
+ return 1;
+ }
+ return 0;
}
EXPORT_SYMBOL(p9_req_put);
@@ -426,7 +423,7 @@ void p9_client_cb(struct p9_client *c, struct p9_req_t *req, int status)
wake_up(&req->wq);
p9_debug(P9_DEBUG_MUX, "wakeup: %d\n", req->tc.tag);
- p9_req_put(req);
+ p9_req_put(c, req);
}
EXPORT_SYMBOL(p9_client_cb);
@@ -709,7 +706,7 @@ static struct p9_req_t *p9_client_prepare_req(struct p9_client *c,
reterr:
p9_tag_remove(c, req);
/* We have to put also the 2nd reference as it won't be used */
- p9_req_put(req);
+ p9_req_put(c, req);
return ERR_PTR(err);
}
@@ -746,7 +743,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
err = c->trans_mod->request(c, req);
if (err < 0) {
/* write won't happen */
- p9_req_put(req);
+ p9_req_put(c, req);
if (err != -ERESTARTSYS && err != -EFAULT)
c->status = Disconnected;
goto recalc_sigpending;
@@ -889,16 +886,13 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt)
struct p9_fid *fid;
p9_debug(P9_DEBUG_FID, "clnt %p\n", clnt);
- fid = kmalloc(sizeof(*fid), GFP_KERNEL);
+ fid = kzalloc(sizeof(*fid), GFP_KERNEL);
if (!fid)
return NULL;
- memset(&fid->qid, 0, sizeof(fid->qid));
fid->mode = -1;
fid->uid = current_fsuid();
fid->clnt = clnt;
- fid->rdir = NULL;
- fid->fid = 0;
refcount_set(&fid->count, 1);
idr_preload(GFP_KERNEL);
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 8f8f95e39b03..e758978b44be 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -343,6 +343,7 @@ static void p9_read_work(struct work_struct *work)
p9_debug(P9_DEBUG_ERROR,
"No recv fcall for tag %d (req %p), disconnecting!\n",
m->rc.tag, m->rreq);
+ p9_req_put(m->client, m->rreq);
m->rreq = NULL;
err = -EIO;
goto error;
@@ -378,7 +379,7 @@ static void p9_read_work(struct work_struct *work)
m->rc.sdata = NULL;
m->rc.offset = 0;
m->rc.capacity = 0;
- p9_req_put(m->rreq);
+ p9_req_put(m->client, m->rreq);
m->rreq = NULL;
}
@@ -492,7 +493,7 @@ static void p9_write_work(struct work_struct *work)
m->wpos += err;
if (m->wpos == m->wsize) {
m->wpos = m->wsize = 0;
- p9_req_put(m->wreq);
+ p9_req_put(m->client, m->wreq);
m->wreq = NULL;
}
@@ -695,7 +696,7 @@ static int p9_fd_cancel(struct p9_client *client, struct p9_req_t *req)
if (req->status == REQ_STATUS_UNSENT) {
list_del(&req->req_list);
req->status = REQ_STATUS_FLSHD;
- p9_req_put(req);
+ p9_req_put(client, req);
ret = 0;
}
spin_unlock(&client->lock);
@@ -722,7 +723,7 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
list_del(&req->req_list);
req->status = REQ_STATUS_FLSHD;
spin_unlock(&client->lock);
- p9_req_put(req);
+ p9_req_put(client, req);
return 0;
}
@@ -883,12 +884,12 @@ static void p9_conn_destroy(struct p9_conn *m)
p9_mux_poll_stop(m);
cancel_work_sync(&m->rq);
if (m->rreq) {
- p9_req_put(m->rreq);
+ p9_req_put(m->client, m->rreq);
m->rreq = NULL;
}
cancel_work_sync(&m->wq);
if (m->wreq) {
- p9_req_put(m->wreq);
+ p9_req_put(m->client, m->wreq);
m->wreq = NULL;
}
diff --git a/net/9p/trans_rdma.c b/net/9p/trans_rdma.c
index 88e563826674..d817d3745238 100644
--- a/net/9p/trans_rdma.c
+++ b/net/9p/trans_rdma.c
@@ -350,7 +350,7 @@ send_done(struct ib_cq *cq, struct ib_wc *wc)
c->busa, c->req->tc.size,
DMA_TO_DEVICE);
up(&rdma->sq_sem);
- p9_req_put(c->req);
+ p9_req_put(client, c->req);
kfree(c);
}
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index b24a4fb0f0a2..147972bf2e79 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -199,7 +199,7 @@ static int p9_virtio_cancel(struct p9_client *client, struct p9_req_t *req)
/* Reply won't come, so drop req ref */
static int p9_virtio_cancelled(struct p9_client *client, struct p9_req_t *req)
{
- p9_req_put(req);
+ p9_req_put(client, req);
return 0;
}
@@ -523,7 +523,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
kvfree(out_pages);
if (!kicked) {
/* reply won't come */
- p9_req_put(req);
+ p9_req_put(client, req);
}
return err;
}
diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 77883b6788cd..4cf0c78d4d22 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -163,7 +163,7 @@ static int p9_xen_request(struct p9_client *client, struct p9_req_t *p9_req)
ring->intf->out_prod = prod;
spin_unlock_irqrestore(&ring->lock, flags);
notify_remote_via_irq(ring->irq);
- p9_req_put(p9_req);
+ p9_req_put(client, p9_req);
return 0;
}
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 4c7030ed8d33..5b5363c99ed5 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1065,7 +1065,7 @@ static int ax25_release(struct socket *sock)
del_timer_sync(&ax25->t3timer);
del_timer_sync(&ax25->idletimer);
}
- dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker);
+ dev_put_track(ax25_dev->dev, &ax25->dev_tracker);
ax25_dev_put(ax25_dev);
}
@@ -1146,7 +1146,7 @@ static int ax25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
if (ax25_dev) {
ax25_fillin_cb(ax25, ax25_dev);
- dev_hold_track(ax25_dev->dev, &ax25_dev->dev_tracker, GFP_ATOMIC);
+ dev_hold_track(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
}
done:
diff --git a/net/batman-adv/trace.h b/net/batman-adv/trace.h
index d673ebdd0426..31c8f922651d 100644
--- a/net/batman-adv/trace.h
+++ b/net/batman-adv/trace.h
@@ -28,8 +28,6 @@
#endif /* CONFIG_BATMAN_ADV_TRACING */
-#define BATADV_MAX_MSG_LEN 256
-
TRACE_EVENT(batadv_dbg,
TP_PROTO(struct batadv_priv *bat_priv,
@@ -40,16 +38,13 @@ TRACE_EVENT(batadv_dbg,
TP_STRUCT__entry(
__string(device, bat_priv->soft_iface->name)
__string(driver, KBUILD_MODNAME)
- __dynamic_array(char, msg, BATADV_MAX_MSG_LEN)
+ __vstring(msg, vaf->fmt, vaf->va)
),
TP_fast_assign(
__assign_str(device, bat_priv->soft_iface->name);
__assign_str(driver, KBUILD_MODNAME);
- WARN_ON_ONCE(vsnprintf(__get_dynamic_array(msg),
- BATADV_MAX_MSG_LEN,
- vaf->fmt,
- *vaf->va) >= BATADV_MAX_MSG_LEN);
+ __assign_vstr(msg, vaf->fmt, vaf->va);
),
TP_printk(
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 19df3905c5f8..2b8b51b89452 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -593,6 +593,11 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
skb_queue_purge(&hdev->rx_q);
skb_queue_purge(&hdev->cmd_q);
+ /* Cancel these to avoid queueing non-chained pending work */
+ hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
+ cancel_delayed_work(&hdev->cmd_timer);
+ cancel_delayed_work(&hdev->ncmd_timer);
+
/* Avoid potential lockdep warnings from the *_flush() calls by
* ensuring the workqueue is empty up front.
*/
@@ -606,6 +611,8 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
if (hdev->flush)
hdev->flush(hdev);
+ hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
+
atomic_set(&hdev->cmd_cnt, 1);
hdev->acl_cnt = 0; hdev->sco_cnt = 0; hdev->le_cnt = 0;
@@ -3863,7 +3870,8 @@ static void hci_cmd_work(struct work_struct *work)
if (res < 0)
__hci_cmd_sync_cancel(hdev, -res);
- if (test_bit(HCI_RESET, &hdev->flags))
+ if (test_bit(HCI_RESET, &hdev->flags) ||
+ hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
cancel_delayed_work(&hdev->cmd_timer);
else
schedule_delayed_work(&hdev->cmd_timer,
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index af17dfb20e01..7cb956d3abb2 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -3768,8 +3768,9 @@ static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd)
cancel_delayed_work(&hdev->ncmd_timer);
atomic_set(&hdev->cmd_cnt, 1);
} else {
- schedule_delayed_work(&hdev->ncmd_timer,
- HCI_NCMD_TIMEOUT);
+ if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
+ schedule_delayed_work(&hdev->ncmd_timer,
+ HCI_NCMD_TIMEOUT);
}
}
}
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index 9e2a42299fc0..6f901398132e 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -1612,6 +1612,9 @@ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev,
bacpy(&cp.bdaddr, ¶ms->addr);
memcpy(cp.peer_irk, irk->val, 16);
+ /* Default privacy mode is always Network */
+ params->privacy_mode = HCI_NETWORK_PRIVACY;
+
done:
if (hci_dev_test_flag(hdev, HCI_PRIVACY))
memcpy(cp.local_irk, hdev->irk, 16);
@@ -5008,13 +5011,13 @@ static int hci_resume_scan_sync(struct hci_dev *hdev)
if (!hdev->scanning_paused)
return 0;
+ hdev->scanning_paused = false;
+
hci_update_scan_sync(hdev);
/* Reset passive scanning to normal */
hci_update_passive_scan_sync(hdev);
- hdev->scanning_paused = false;
-
return 0;
}
@@ -5033,7 +5036,6 @@ int hci_resume_sync(struct hci_dev *hdev)
return 0;
hdev->suspended = false;
- hdev->scanning_paused = false;
/* Restore event mask */
hci_set_event_mask_sync(hdev);
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 52668662ae8d..f18d0c72713f 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -1969,11 +1969,11 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
bdaddr_t *dst,
u8 link_type)
{
- struct l2cap_chan *c, *c1 = NULL;
+ struct l2cap_chan *c, *tmp, *c1 = NULL;
read_lock(&chan_list_lock);
- list_for_each_entry(c, &chan_list, global_l) {
+ list_for_each_entry_safe(c, tmp, &chan_list, global_l) {
if (state && c->state != state)
continue;
@@ -1992,11 +1992,10 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
dst_match = !bacmp(&c->dst, dst);
if (src_match && dst_match) {
c = l2cap_chan_hold_unless_zero(c);
- if (!c)
- continue;
-
- read_unlock(&chan_list_lock);
- return c;
+ if (c) {
+ read_unlock(&chan_list_lock);
+ return c;
+ }
}
/* Closest match */
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index ae758ab1b558..12c1ecdba3f3 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -6821,11 +6821,14 @@ static int get_conn_info(struct sock *sk, struct hci_dev *hdev, void *data,
cmd = mgmt_pending_new(sk, MGMT_OP_GET_CONN_INFO, hdev, data,
len);
- if (!cmd)
+ if (!cmd) {
err = -ENOMEM;
- else
+ } else {
+ hci_conn_hold(conn);
+ cmd->user_data = hci_conn_get(conn);
err = hci_cmd_sync_queue(hdev, get_conn_info_sync,
cmd, get_conn_info_complete);
+ }
if (err < 0) {
mgmt_cmd_complete(sk, hdev->id, MGMT_OP_GET_CONN_INFO,
@@ -6837,9 +6840,6 @@ static int get_conn_info(struct sock *sk, struct hci_dev *hdev, void *data,
goto unlock;
}
- hci_conn_hold(conn);
- cmd->user_data = hci_conn_get(conn);
-
conn->conn_info_timestamp = jiffies;
} else {
/* Cache is valid, just reply with values cached in hci_conn */
diff --git a/net/bpf/bpf_dummy_struct_ops.c b/net/bpf/bpf_dummy_struct_ops.c
index d0e54e30658a..e78dadfc5829 100644
--- a/net/bpf/bpf_dummy_struct_ops.c
+++ b/net/bpf/bpf_dummy_struct_ops.c
@@ -72,13 +72,16 @@ static int dummy_ops_call_op(void *image, struct bpf_dummy_ops_test_args *args)
args->args[3], args->args[4]);
}
+extern const struct bpf_link_ops bpf_struct_ops_link_lops;
+
int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
union bpf_attr __user *uattr)
{
const struct bpf_struct_ops *st_ops = &bpf_bpf_dummy_ops;
const struct btf_type *func_proto;
struct bpf_dummy_ops_test_args *args;
- struct bpf_tramp_progs *tprogs;
+ struct bpf_tramp_links *tlinks;
+ struct bpf_tramp_link *link = NULL;
void *image = NULL;
unsigned int op_idx;
int prog_ret;
@@ -92,8 +95,8 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
if (IS_ERR(args))
return PTR_ERR(args);
- tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(*tprogs), GFP_KERNEL);
- if (!tprogs) {
+ tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
+ if (!tlinks) {
err = -ENOMEM;
goto out;
}
@@ -105,8 +108,17 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
}
set_vm_flush_reset_perms(image);
+ link = kzalloc(sizeof(*link), GFP_USER);
+ if (!link) {
+ err = -ENOMEM;
+ goto out;
+ }
+ /* prog doesn't take the ownership of the reference from caller */
+ bpf_prog_inc(prog);
+ bpf_link_init(&link->link, BPF_LINK_TYPE_STRUCT_OPS, &bpf_struct_ops_link_lops, prog);
+
op_idx = prog->expected_attach_type;
- err = bpf_struct_ops_prepare_trampoline(tprogs, prog,
+ err = bpf_struct_ops_prepare_trampoline(tlinks, link,
&st_ops->func_models[op_idx],
image, image + PAGE_SIZE);
if (err < 0)
@@ -124,7 +136,9 @@ int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
out:
kfree(args);
bpf_jit_free_exec(image);
- kfree(tprogs);
+ if (link)
+ bpf_link_put(&link->link);
+ kfree(tlinks);
return err;
}
diff --git a/net/core/filter.c b/net/core/filter.c
index d0b0c163d3f3..a98f34cb5aee 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3917,7 +3917,7 @@ static void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len)
offset -= frag_size;
}
out:
- return offset + len < size ? addr + offset : NULL;
+ return offset + len <= size ? addr + offset : NULL;
}
BPF_CALL_4(bpf_xdp_load_bytes, struct xdp_buff *, xdp, u32, offset,
@@ -6642,7 +6642,7 @@ static const struct bpf_func_proto bpf_sk_release_proto = {
.func = bpf_sk_release,
.gpl_only = false,
.ret_type = RET_INTEGER,
- .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+ .arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON | OBJ_RELEASE,
};
BPF_CALL_5(bpf_xdp_sk_lookup_udp, struct xdp_buff *, ctx,
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index ede0af308f40..f50f8d95b628 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -462,7 +462,7 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
if (copied == len)
break;
- } while (i != msg_rx->sg.end);
+ } while (!sg_is_last(sge));
if (unlikely(peek)) {
msg_rx = sk_psock_next_msg(psock, msg_rx);
@@ -472,7 +472,7 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
}
msg_rx->sg.start = i;
- if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) {
+ if (!sge->length && sg_is_last(sge)) {
msg_rx = sk_psock_dequeue_msg(psock);
kfree_sk_msg(msg_rx);
}
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index a976b4d29892..0ee62506105a 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -736,11 +736,6 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
lock_sock(sk);
- if (dccp_qpolicy_full(sk)) {
- rc = -EAGAIN;
- goto out_release;
- }
-
timeo = sock_sndtimeo(sk, noblock);
/*
@@ -759,6 +754,11 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
if (skb == NULL)
goto out_release;
+ if (dccp_qpolicy_full(sk)) {
+ rc = -EAGAIN;
+ goto out_discard;
+ }
+
if (sk->sk_state == DCCP_CLOSED) {
rc = -ENOTCONN;
goto out_discard;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 5c207367b3b4..9a70032625af 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1920,6 +1920,8 @@ static int __init inet_init(void)
sock_skb_cb_check_size(sizeof(struct inet_skb_parm));
+ raw_hashinfo_init(&raw_v4_hashinfo);
+
rc = proto_register(&tcp_prot, 1);
if (rc)
goto out;
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 55654e335d43..d631198e0938 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -410,13 +410,11 @@ struct sock *__inet_lookup_established(struct net *net,
sk_nulls_for_each_rcu(sk, node, &head->chain) {
if (sk->sk_hash != hash)
continue;
- if (likely(INET_MATCH(sk, net, acookie,
- saddr, daddr, ports, dif, sdif))) {
+ if (likely(INET_MATCH(net, sk, acookie, ports, dif, sdif))) {
if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
goto out;
- if (unlikely(!INET_MATCH(sk, net, acookie,
- saddr, daddr, ports,
- dif, sdif))) {
+ if (unlikely(!INET_MATCH(net, sk, acookie,
+ ports, dif, sdif))) {
sock_gen_put(sk);
goto begin;
}
@@ -465,8 +463,7 @@ static int __inet_check_established(struct inet_timewait_death_row *death_row,
if (sk2->sk_hash != hash)
continue;
- if (likely(INET_MATCH(sk2, net, acookie,
- saddr, daddr, ports, dif, sdif))) {
+ if (likely(INET_MATCH(net, sk2, acookie, ports, dif, sdif))) {
if (sk2->sk_state == TCP_TIME_WAIT) {
tw = inet_twsk(sk2);
if (twsk_unique(sk, sk2, twp))
@@ -532,16 +529,14 @@ static bool inet_ehash_lookup_by_sk(struct sock *sk,
if (esk->sk_hash != sk->sk_hash)
continue;
if (sk->sk_family == AF_INET) {
- if (unlikely(INET_MATCH(esk, net, acookie,
- sk->sk_daddr,
- sk->sk_rcv_saddr,
+ if (unlikely(INET_MATCH(net, esk, acookie,
ports, dif, sdif))) {
return true;
}
}
#if IS_ENABLED(CONFIG_IPV6)
else if (sk->sk_family == AF_INET6) {
- if (unlikely(INET6_MATCH(esk, net,
+ if (unlikely(inet6_match(net, esk,
&sk->sk_v6_daddr,
&sk->sk_v6_rcv_saddr,
ports, dif, sdif))) {
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index c9dd9603f2e7..d269b85630b2 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -85,20 +85,19 @@ struct raw_frag_vec {
int hlen;
};
-struct raw_hashinfo raw_v4_hashinfo = {
- .lock = __RW_LOCK_UNLOCKED(raw_v4_hashinfo.lock),
-};
+struct raw_hashinfo raw_v4_hashinfo;
EXPORT_SYMBOL_GPL(raw_v4_hashinfo);
int raw_hash_sk(struct sock *sk)
{
struct raw_hashinfo *h = sk->sk_prot->h.raw_hash;
- struct hlist_head *head;
+ struct hlist_nulls_head *hlist;
- head = &h->ht[inet_sk(sk)->inet_num & (RAW_HTABLE_SIZE - 1)];
+ hlist = &h->ht[inet_sk(sk)->inet_num & (RAW_HTABLE_SIZE - 1)];
write_lock_bh(&h->lock);
- sk_add_node(sk, head);
+ hlist_nulls_add_head_rcu(&sk->sk_nulls_node, hlist);
+ sock_set_flag(sk, SOCK_RCU_FREE);
write_unlock_bh(&h->lock);
sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
@@ -111,30 +110,25 @@ void raw_unhash_sk(struct sock *sk)
struct raw_hashinfo *h = sk->sk_prot->h.raw_hash;
write_lock_bh(&h->lock);
- if (sk_del_node_init(sk))
+ if (__sk_nulls_del_node_init_rcu(sk))
sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
write_unlock_bh(&h->lock);
}
EXPORT_SYMBOL_GPL(raw_unhash_sk);
-struct sock *__raw_v4_lookup(struct net *net, struct sock *sk,
- unsigned short num, __be32 raddr, __be32 laddr,
- int dif, int sdif)
+bool raw_v4_match(struct net *net, struct sock *sk, unsigned short num,
+ __be32 raddr, __be32 laddr, int dif, int sdif)
{
- sk_for_each_from(sk) {
- struct inet_sock *inet = inet_sk(sk);
-
- if (net_eq(sock_net(sk), net) && inet->inet_num == num &&
- !(inet->inet_daddr && inet->inet_daddr != raddr) &&
- !(inet->inet_rcv_saddr && inet->inet_rcv_saddr != laddr) &&
- raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
- goto found; /* gotcha */
- }
- sk = NULL;
-found:
- return sk;
+ struct inet_sock *inet = inet_sk(sk);
+
+ if (net_eq(sock_net(sk), net) && inet->inet_num == num &&
+ !(inet->inet_daddr && inet->inet_daddr != raddr) &&
+ !(inet->inet_rcv_saddr && inet->inet_rcv_saddr != laddr) &&
+ raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if, dif, sdif))
+ return true;
+ return false;
}
-EXPORT_SYMBOL_GPL(__raw_v4_lookup);
+EXPORT_SYMBOL_GPL(raw_v4_match);
/*
* 0 - deliver
@@ -168,23 +162,20 @@ static int icmp_filter(const struct sock *sk, const struct sk_buff *skb)
*/
static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash)
{
+ struct net *net = dev_net(skb->dev);
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
int sdif = inet_sdif(skb);
int dif = inet_iif(skb);
- struct sock *sk;
- struct hlist_head *head;
int delivered = 0;
- struct net *net;
-
- read_lock(&raw_v4_hashinfo.lock);
- head = &raw_v4_hashinfo.ht[hash];
- if (hlist_empty(head))
- goto out;
-
- net = dev_net(skb->dev);
- sk = __raw_v4_lookup(net, __sk_head(head), iph->protocol,
- iph->saddr, iph->daddr, dif, sdif);
+ struct sock *sk;
- while (sk) {
+ hlist = &raw_v4_hashinfo.ht[hash];
+ rcu_read_lock();
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ if (!raw_v4_match(net, sk, iph->protocol,
+ iph->saddr, iph->daddr, dif, sdif))
+ continue;
delivered = 1;
if ((iph->protocol != IPPROTO_ICMP || !icmp_filter(sk, skb)) &&
ip_mc_sf_allow(sk, iph->daddr, iph->saddr,
@@ -195,31 +186,16 @@ static int raw_v4_input(struct sk_buff *skb, const struct iphdr *iph, int hash)
if (clone)
raw_rcv(sk, clone);
}
- sk = __raw_v4_lookup(net, sk_next(sk), iph->protocol,
- iph->saddr, iph->daddr,
- dif, sdif);
}
-out:
- read_unlock(&raw_v4_hashinfo.lock);
+ rcu_read_unlock();
return delivered;
}
int raw_local_deliver(struct sk_buff *skb, int protocol)
{
- int hash;
- struct sock *raw_sk;
-
- hash = protocol & (RAW_HTABLE_SIZE - 1);
- raw_sk = sk_head(&raw_v4_hashinfo.ht[hash]);
-
- /* If there maybe a raw socket we must check - if not we
- * don't care less
- */
- if (raw_sk && !raw_v4_input(skb, ip_hdr(skb), hash))
- raw_sk = NULL;
-
- return raw_sk != NULL;
+ int hash = protocol & (RAW_HTABLE_SIZE - 1);
+ return raw_v4_input(skb, ip_hdr(skb), hash);
}
static void raw_err(struct sock *sk, struct sk_buff *skb, u32 info)
@@ -286,31 +262,27 @@ static void raw_err(struct sock *sk, struct sk_buff *skb, u32 info)
void raw_icmp_error(struct sk_buff *skb, int protocol, u32 info)
{
- int hash;
- struct sock *raw_sk;
+ struct net *net = dev_net(skb->dev);
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
+ int dif = skb->dev->ifindex;
+ int sdif = inet_sdif(skb);
const struct iphdr *iph;
- struct net *net;
+ struct sock *sk;
+ int hash;
hash = protocol & (RAW_HTABLE_SIZE - 1);
+ hlist = &raw_v4_hashinfo.ht[hash];
- read_lock(&raw_v4_hashinfo.lock);
- raw_sk = sk_head(&raw_v4_hashinfo.ht[hash]);
- if (raw_sk) {
- int dif = skb->dev->ifindex;
- int sdif = inet_sdif(skb);
-
+ rcu_read_lock();
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
iph = (const struct iphdr *)skb->data;
- net = dev_net(skb->dev);
-
- while ((raw_sk = __raw_v4_lookup(net, raw_sk, protocol,
- iph->daddr, iph->saddr,
- dif, sdif)) != NULL) {
- raw_err(raw_sk, skb, info);
- raw_sk = sk_next(raw_sk);
- iph = (const struct iphdr *)skb->data;
- }
+ if (!raw_v4_match(net, sk, iph->protocol,
+ iph->daddr, iph->saddr, dif, sdif))
+ continue;
+ raw_err(sk, skb, info);
}
- read_unlock(&raw_v4_hashinfo.lock);
+ rcu_read_unlock();
}
static int raw_rcv_skb(struct sock *sk, struct sk_buff *skb)
@@ -972,44 +944,41 @@ struct proto raw_prot = {
};
#ifdef CONFIG_PROC_FS
-static struct sock *raw_get_first(struct seq_file *seq)
+static struct sock *raw_get_first(struct seq_file *seq, int bucket)
{
- struct sock *sk;
struct raw_hashinfo *h = pde_data(file_inode(seq->file));
struct raw_iter_state *state = raw_seq_private(seq);
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
+ struct sock *sk;
- for (state->bucket = 0; state->bucket < RAW_HTABLE_SIZE;
+ for (state->bucket = bucket; state->bucket < RAW_HTABLE_SIZE;
++state->bucket) {
- sk_for_each(sk, &h->ht[state->bucket])
+ hlist = &h->ht[state->bucket];
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
if (sock_net(sk) == seq_file_net(seq))
- goto found;
+ return sk;
+ }
}
- sk = NULL;
-found:
- return sk;
+ return NULL;
}
static struct sock *raw_get_next(struct seq_file *seq, struct sock *sk)
{
- struct raw_hashinfo *h = pde_data(file_inode(seq->file));
struct raw_iter_state *state = raw_seq_private(seq);
do {
- sk = sk_next(sk);
-try_again:
- ;
+ sk = sk_nulls_next(sk);
} while (sk && sock_net(sk) != seq_file_net(seq));
- if (!sk && ++state->bucket < RAW_HTABLE_SIZE) {
- sk = sk_head(&h->ht[state->bucket]);
- goto try_again;
- }
+ if (!sk)
+ return raw_get_first(seq, state->bucket + 1);
return sk;
}
static struct sock *raw_get_idx(struct seq_file *seq, loff_t pos)
{
- struct sock *sk = raw_get_first(seq);
+ struct sock *sk = raw_get_first(seq, 0);
if (sk)
while (pos && (sk = raw_get_next(seq, sk)) != NULL)
@@ -1018,11 +987,9 @@ static struct sock *raw_get_idx(struct seq_file *seq, loff_t pos)
}
void *raw_seq_start(struct seq_file *seq, loff_t *pos)
- __acquires(&h->lock)
+ __acquires(RCU)
{
- struct raw_hashinfo *h = pde_data(file_inode(seq->file));
-
- read_lock(&h->lock);
+ rcu_read_lock();
return *pos ? raw_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
}
EXPORT_SYMBOL_GPL(raw_seq_start);
@@ -1032,7 +999,7 @@ void *raw_seq_next(struct seq_file *seq, void *v, loff_t *pos)
struct sock *sk;
if (v == SEQ_START_TOKEN)
- sk = raw_get_first(seq);
+ sk = raw_get_first(seq, 0);
else
sk = raw_get_next(seq, v);
++*pos;
@@ -1041,11 +1008,9 @@ void *raw_seq_next(struct seq_file *seq, void *v, loff_t *pos)
EXPORT_SYMBOL_GPL(raw_seq_next);
void raw_seq_stop(struct seq_file *seq, void *v)
- __releases(&h->lock)
+ __releases(RCU)
{
- struct raw_hashinfo *h = pde_data(file_inode(seq->file));
-
- read_unlock(&h->lock);
+ rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(raw_seq_stop);
@@ -1107,6 +1072,7 @@ static __net_initdata struct pernet_operations raw_net_ops = {
int __init raw_proc_init(void)
{
+
return register_pernet_subsys(&raw_net_ops);
}
diff --git a/net/ipv4/raw_diag.c b/net/ipv4/raw_diag.c
index ccacbde30a2c..5f208e840d85 100644
--- a/net/ipv4/raw_diag.c
+++ b/net/ipv4/raw_diag.c
@@ -34,57 +34,57 @@ raw_get_hashinfo(const struct inet_diag_req_v2 *r)
* use helper to figure it out.
*/
-static struct sock *raw_lookup(struct net *net, struct sock *from,
- const struct inet_diag_req_v2 *req)
+static bool raw_lookup(struct net *net, struct sock *sk,
+ const struct inet_diag_req_v2 *req)
{
struct inet_diag_req_raw *r = (void *)req;
- struct sock *sk = NULL;
if (r->sdiag_family == AF_INET)
- sk = __raw_v4_lookup(net, from, r->sdiag_raw_protocol,
- r->id.idiag_dst[0],
- r->id.idiag_src[0],
- r->id.idiag_if, 0);
+ return raw_v4_match(net, sk, r->sdiag_raw_protocol,
+ r->id.idiag_dst[0],
+ r->id.idiag_src[0],
+ r->id.idiag_if, 0);
#if IS_ENABLED(CONFIG_IPV6)
else
- sk = __raw_v6_lookup(net, from, r->sdiag_raw_protocol,
- (const struct in6_addr *)r->id.idiag_src,
- (const struct in6_addr *)r->id.idiag_dst,
- r->id.idiag_if, 0);
+ return raw_v6_match(net, sk, r->sdiag_raw_protocol,
+ (const struct in6_addr *)r->id.idiag_src,
+ (const struct in6_addr *)r->id.idiag_dst,
+ r->id.idiag_if, 0);
#endif
- return sk;
+ return false;
}
static struct sock *raw_sock_get(struct net *net, const struct inet_diag_req_v2 *r)
{
struct raw_hashinfo *hashinfo = raw_get_hashinfo(r);
- struct sock *sk = NULL, *s;
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
+ struct sock *sk;
int slot;
if (IS_ERR(hashinfo))
return ERR_CAST(hashinfo);
- read_lock(&hashinfo->lock);
+ rcu_read_lock();
for (slot = 0; slot < RAW_HTABLE_SIZE; slot++) {
- sk_for_each(s, &hashinfo->ht[slot]) {
- sk = raw_lookup(net, s, r);
- if (sk) {
+ hlist = &hashinfo->ht[slot];
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
+ if (raw_lookup(net, sk, r)) {
/*
* Grab it and keep until we fill
- * diag meaage to be reported, so
+ * diag message to be reported, so
* caller should call sock_put then.
- * We can do that because we're keeping
- * hashinfo->lock here.
*/
- sock_hold(sk);
- goto out_unlock;
+ if (refcount_inc_not_zero(&sk->sk_refcnt))
+ goto out_unlock;
}
}
}
+ sk = ERR_PTR(-ENOENT);
out_unlock:
- read_unlock(&hashinfo->lock);
+ rcu_read_unlock();
- return sk ? sk : ERR_PTR(-ENOENT);
+ return sk;
}
static int raw_diag_dump_one(struct netlink_callback *cb,
@@ -142,6 +142,8 @@ static void raw_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
struct raw_hashinfo *hashinfo = raw_get_hashinfo(r);
struct net *net = sock_net(skb->sk);
struct inet_diag_dump_data *cb_data;
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
int num, s_num, slot, s_slot;
struct sock *sk = NULL;
struct nlattr *bc;
@@ -158,7 +160,8 @@ static void raw_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
for (slot = s_slot; slot < RAW_HTABLE_SIZE; s_num = 0, slot++) {
num = 0;
- sk_for_each(sk, &hashinfo->ht[slot]) {
+ hlist = &hashinfo->ht[slot];
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
struct inet_sock *inet = inet_sk(sk);
if (!net_eq(sock_net(sk), net))
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 91735d631a28..51116166e3d2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -953,6 +953,23 @@ static int tcp_downgrade_zcopy_pure(struct sock *sk, struct sk_buff *skb)
return 0;
}
+static int tcp_wmem_schedule(struct sock *sk, int copy)
+{
+ int left;
+
+ if (likely(sk_wmem_schedule(sk, copy)))
+ return copy;
+
+ /* We could be in trouble if we have nothing queued.
+ * Use whatever is left in sk->sk_forward_alloc and tcp_wmem[0]
+ * to guarantee some progress.
+ */
+ left = sock_net(sk)->ipv4.sysctl_tcp_wmem[0] - sk->sk_wmem_queued;
+ if (left > 0)
+ sk_forced_mem_schedule(sk, min(left, copy));
+ return min(copy, sk->sk_forward_alloc);
+}
+
static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
struct page *page, int offset, size_t *size)
{
@@ -988,7 +1005,11 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
tcp_mark_push(tp, skb);
goto new_segment;
}
- if (tcp_downgrade_zcopy_pure(sk, skb) || !sk_wmem_schedule(sk, copy))
+ if (tcp_downgrade_zcopy_pure(sk, skb))
+ return NULL;
+
+ copy = tcp_wmem_schedule(sk, copy);
+ if (!copy)
return NULL;
if (can_coalesce) {
@@ -1337,8 +1358,11 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
copy = min_t(int, copy, pfrag->size - pfrag->offset);
- if (tcp_downgrade_zcopy_pure(sk, skb) ||
- !sk_wmem_schedule(sk, copy))
+ if (tcp_downgrade_zcopy_pure(sk, skb))
+ goto wait_for_space;
+
+ copy = tcp_wmem_schedule(sk, copy);
+ if (!copy)
goto wait_for_space;
err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
@@ -1365,7 +1389,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
skb_shinfo(skb)->flags |= SKBFL_PURE_ZEROCOPY;
if (!skb_zcopy_pure(skb)) {
- if (!sk_wmem_schedule(sk, copy))
+ copy = tcp_wmem_schedule(sk, copy);
+ if (!copy)
goto wait_for_space;
}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a7f0a1f0c2a3..b19090b81ff0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3140,7 +3140,7 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
struct tcp_sock *tp = tcp_sk(sk);
unsigned int cur_mss;
int diff, len, err;
-
+ int avail_wnd;
/* Inconclusive MTU probe */
if (icsk->icsk_mtup.probe_size)
@@ -3162,17 +3162,25 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
return -EHOSTUNREACH; /* Routing failure or similar. */
cur_mss = tcp_current_mss(sk);
+ avail_wnd = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;
/* If receiver has shrunk his window, and skb is out of
* new window, do not retransmit it. The exception is the
* case, when window is shrunk to zero. In this case
- * our retransmit serves as a zero window probe.
+ * our retransmit of one segment serves as a zero window probe.
*/
- if (!before(TCP_SKB_CB(skb)->seq, tcp_wnd_end(tp)) &&
- TCP_SKB_CB(skb)->seq != tp->snd_una)
- return -EAGAIN;
+ if (avail_wnd <= 0) {
+ if (TCP_SKB_CB(skb)->seq != tp->snd_una)
+ return -EAGAIN;
+ avail_wnd = cur_mss;
+ }
len = cur_mss * segs;
+ if (len > avail_wnd) {
+ len = rounddown(avail_wnd, cur_mss);
+ if (!len)
+ len = avail_wnd;
+ }
if (skb->len > len) {
if (tcp_fragment(sk, TCP_FRAG_IN_RTX_QUEUE, skb, len,
cur_mss, GFP_ATOMIC))
@@ -3186,8 +3194,9 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
diff -= tcp_skb_pcount(skb);
if (diff)
tcp_adjust_pcount(sk, skb, diff);
- if (skb->len < cur_mss)
- tcp_retrans_try_collapse(sk, skb, cur_mss);
+ avail_wnd = min_t(int, avail_wnd, cur_mss);
+ if (skb->len < avail_wnd)
+ tcp_retrans_try_collapse(sk, skb, avail_wnd);
}
/* RFC3168, section 6.1.1.1. ECN fallback */
@@ -3358,11 +3367,12 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
*/
void sk_forced_mem_schedule(struct sock *sk, int size)
{
- int amt;
+ int delta, amt;
- if (size <= sk->sk_forward_alloc)
+ delta = size - sk->sk_forward_alloc;
+ if (delta <= 0)
return;
- amt = sk_mem_pages(size);
+ amt = sk_mem_pages(delta);
sk->sk_forward_alloc += amt * SK_MEM_QUANTUM;
sk_memory_allocated_add(sk, amt);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 6b4d8361560f..201e2533acb2 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2564,8 +2564,7 @@ static struct sock *__udp4_lib_demux_lookup(struct net *net,
struct sock *sk;
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
- if (INET_MATCH(sk, net, acookie, rmt_addr,
- loc_addr, ports, dif, sdif))
+ if (INET_MATCH(net, sk, acookie, ports, dif, sdif))
return sk;
/* Only check first socket in chain */
break;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index ef1e6545d869..095298c5727a 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -63,6 +63,7 @@
#include <net/compat.h>
#include <net/xfrm.h>
#include <net/ioam6.h>
+#include <net/rawv6.h>
#include <linux/uaccess.h>
#include <linux/mroute6.h>
@@ -1074,6 +1075,8 @@ static int __init inet6_init(void)
goto out;
}
+ raw_hashinfo_init(&raw_v6_hashinfo);
+
err = proto_register(&tcpv6_prot, 1);
if (err)
goto out;
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 32ccac10bd62..bb4939e36840 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -71,12 +71,12 @@ struct sock *__inet6_lookup_established(struct net *net,
sk_nulls_for_each_rcu(sk, node, &head->chain) {
if (sk->sk_hash != hash)
continue;
- if (!INET6_MATCH(sk, net, saddr, daddr, ports, dif, sdif))
+ if (!inet6_match(net, sk, saddr, daddr, ports, dif, sdif))
continue;
if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
goto out;
- if (unlikely(!INET6_MATCH(sk, net, saddr, daddr, ports, dif, sdif))) {
+ if (unlikely(!inet6_match(net, sk, saddr, daddr, ports, dif, sdif))) {
sock_gen_put(sk);
goto begin;
}
@@ -269,7 +269,7 @@ static int __inet6_check_established(struct inet_timewait_death_row *death_row,
if (sk2->sk_hash != hash)
continue;
- if (likely(INET6_MATCH(sk2, net, saddr, daddr, ports,
+ if (likely(inet6_match(net, sk2, saddr, daddr, ports,
dif, sdif))) {
if (sk2->sk_state == TCP_TIME_WAIT) {
tw = inet_twsk(sk2);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8bb41f3b246a..62f0a3d08d4d 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -61,46 +61,30 @@
#define ICMPV6_HDRLEN 4 /* ICMPv6 header, RFC 4443 Section 2.1 */
-struct raw_hashinfo raw_v6_hashinfo = {
- .lock = __RW_LOCK_UNLOCKED(raw_v6_hashinfo.lock),
-};
+struct raw_hashinfo raw_v6_hashinfo;
EXPORT_SYMBOL_GPL(raw_v6_hashinfo);
-struct sock *__raw_v6_lookup(struct net *net, struct sock *sk,
- unsigned short num, const struct in6_addr *loc_addr,
- const struct in6_addr *rmt_addr, int dif, int sdif)
+bool raw_v6_match(struct net *net, struct sock *sk, unsigned short num,
+ const struct in6_addr *loc_addr,
+ const struct in6_addr *rmt_addr, int dif, int sdif)
{
- bool is_multicast = ipv6_addr_is_multicast(loc_addr);
-
- sk_for_each_from(sk)
- if (inet_sk(sk)->inet_num == num) {
-
- if (!net_eq(sock_net(sk), net))
- continue;
-
- if (!ipv6_addr_any(&sk->sk_v6_daddr) &&
- !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr))
- continue;
-
- if (!raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
- dif, sdif))
- continue;
-
- if (!ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
- if (ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr))
- goto found;
- if (is_multicast &&
- inet6_mc_check(sk, loc_addr, rmt_addr))
- goto found;
- continue;
- }
- goto found;
- }
- sk = NULL;
-found:
- return sk;
+ if (inet_sk(sk)->inet_num != num ||
+ !net_eq(sock_net(sk), net) ||
+ (!ipv6_addr_any(&sk->sk_v6_daddr) &&
+ !ipv6_addr_equal(&sk->sk_v6_daddr, rmt_addr)) ||
+ !raw_sk_bound_dev_eq(net, sk->sk_bound_dev_if,
+ dif, sdif))
+ return false;
+
+ if (ipv6_addr_any(&sk->sk_v6_rcv_saddr) ||
+ ipv6_addr_equal(&sk->sk_v6_rcv_saddr, loc_addr) ||
+ (ipv6_addr_is_multicast(loc_addr) &&
+ inet6_mc_check(sk, loc_addr, rmt_addr)))
+ return true;
+
+ return false;
}
-EXPORT_SYMBOL_GPL(__raw_v6_lookup);
+EXPORT_SYMBOL_GPL(raw_v6_match);
/*
* 0 - deliver
@@ -156,31 +140,27 @@ EXPORT_SYMBOL(rawv6_mh_filter_unregister);
*/
static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
{
+ struct net *net = dev_net(skb->dev);
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
const struct in6_addr *saddr;
const struct in6_addr *daddr;
struct sock *sk;
bool delivered = false;
__u8 hash;
- struct net *net;
saddr = &ipv6_hdr(skb)->saddr;
daddr = saddr + 1;
hash = nexthdr & (RAW_HTABLE_SIZE - 1);
-
- read_lock(&raw_v6_hashinfo.lock);
- sk = sk_head(&raw_v6_hashinfo.ht[hash]);
-
- if (!sk)
- goto out;
-
- net = dev_net(skb->dev);
- sk = __raw_v6_lookup(net, sk, nexthdr, daddr, saddr,
- inet6_iif(skb), inet6_sdif(skb));
-
- while (sk) {
+ hlist = &raw_v6_hashinfo.ht[hash];
+ rcu_read_lock();
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
int filtered;
+ if (!raw_v6_match(net, sk, nexthdr, daddr, saddr,
+ inet6_iif(skb), inet6_sdif(skb)))
+ continue;
delivered = true;
switch (nexthdr) {
case IPPROTO_ICMPV6:
@@ -219,23 +199,14 @@ static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
rawv6_rcv(sk, clone);
}
}
- sk = __raw_v6_lookup(net, sk_next(sk), nexthdr, daddr, saddr,
- inet6_iif(skb), inet6_sdif(skb));
}
-out:
- read_unlock(&raw_v6_hashinfo.lock);
+ rcu_read_unlock();
return delivered;
}
bool raw6_local_deliver(struct sk_buff *skb, int nexthdr)
{
- struct sock *raw_sk;
-
- raw_sk = sk_head(&raw_v6_hashinfo.ht[nexthdr & (RAW_HTABLE_SIZE - 1)]);
- if (raw_sk && !ipv6_raw_deliver(skb, nexthdr))
- raw_sk = NULL;
-
- return raw_sk != NULL;
+ return ipv6_raw_deliver(skb, nexthdr);
}
/* This cleans up af_inet6 a bit. -DaveM */
@@ -361,30 +332,25 @@ static void rawv6_err(struct sock *sk, struct sk_buff *skb,
void raw6_icmp_error(struct sk_buff *skb, int nexthdr,
u8 type, u8 code, int inner_offset, __be32 info)
{
+ struct net *net = dev_net(skb->dev);
+ struct hlist_nulls_head *hlist;
+ struct hlist_nulls_node *hnode;
struct sock *sk;
int hash;
- const struct in6_addr *saddr, *daddr;
- struct net *net;
hash = nexthdr & (RAW_HTABLE_SIZE - 1);
-
- read_lock(&raw_v6_hashinfo.lock);
- sk = sk_head(&raw_v6_hashinfo.ht[hash]);
- if (sk) {
+ hlist = &raw_v6_hashinfo.ht[hash];
+ rcu_read_lock();
+ hlist_nulls_for_each_entry(sk, hnode, hlist, sk_nulls_node) {
/* Note: ipv6_hdr(skb) != skb->data */
const struct ipv6hdr *ip6h = (const struct ipv6hdr *)skb->data;
- saddr = &ip6h->saddr;
- daddr = &ip6h->daddr;
- net = dev_net(skb->dev);
-
- while ((sk = __raw_v6_lookup(net, sk, nexthdr, saddr, daddr,
- inet6_iif(skb), inet6_iif(skb)))) {
- rawv6_err(sk, skb, NULL, type, code,
- inner_offset, info);
- sk = sk_next(sk);
- }
+
+ if (!raw_v6_match(net, sk, nexthdr, &ip6h->saddr, &ip6h->daddr,
+ inet6_iif(skb), inet6_iif(skb)))
+ continue;
+ rawv6_err(sk, skb, NULL, type, code, inner_offset, info);
}
- read_unlock(&raw_v6_hashinfo.lock);
+ rcu_read_unlock();
}
static inline int rawv6_rcv_skb(struct sock *sk, struct sk_buff *skb)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index aea28bf701be..544842fc831e 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1044,7 +1044,7 @@ static struct sock *__udp6_lib_demux_lookup(struct net *net,
udp_portaddr_for_each_entry_rcu(sk, &hslot2->head) {
if (sk->sk_state == TCP_ESTABLISHED &&
- INET6_MATCH(sk, net, rmt_addr, loc_addr, ports, dif, sdif))
+ inet6_match(net, sk, rmt_addr, loc_addr, ports, dif, sdif))
return sk;
/* Only check first socket in chain */
break;
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 07b5a2044cab..9c25e6a9a279 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -323,9 +323,10 @@ static bool mptcp_rmem_schedule(struct sock *sk, struct sock *ssk, int size)
struct mptcp_sock *msk = mptcp_sk(sk);
int amt, amount;
- if (size < msk->rmem_fwd_alloc)
+ if (size <= msk->rmem_fwd_alloc)
return true;
+ size -= msk->rmem_fwd_alloc;
amt = sk_mem_pages(size);
amount = amt << SK_MEM_QUANTUM_SHIFT;
msk->rmem_fwd_alloc += amount;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index de3dc35ce609..2b60923cf87b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -153,6 +153,7 @@ static struct nft_trans *nft_trans_alloc_gfp(const struct nft_ctx *ctx,
if (trans == NULL)
return NULL;
+ INIT_LIST_HEAD(&trans->list);
trans->msg_type = msg_type;
trans->ctx = *ctx;
@@ -2472,6 +2473,7 @@ static int nf_tables_updchain(struct nft_ctx *ctx, u8 genmask, u8 policy,
}
static struct nft_chain *nft_chain_lookup_byid(const struct net *net,
+ const struct nft_table *table,
const struct nlattr *nla)
{
struct nftables_pernet *nft_net = nft_pernet(net);
@@ -2482,6 +2484,7 @@ static struct nft_chain *nft_chain_lookup_byid(const struct net *net,
struct nft_chain *chain = trans->ctx.chain;
if (trans->msg_type == NFT_MSG_NEWCHAIN &&
+ chain->table == table &&
id == nft_trans_chain_id(trans))
return chain;
}
@@ -3369,6 +3372,7 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
}
static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
+ const struct nft_chain *chain,
const struct nlattr *nla);
#define NFT_RULE_MAXEXPRS 128
@@ -3415,7 +3419,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
return -EOPNOTSUPP;
} else if (nla[NFTA_RULE_CHAIN_ID]) {
- chain = nft_chain_lookup_byid(net, nla[NFTA_RULE_CHAIN_ID]);
+ chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID]);
if (IS_ERR(chain)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN_ID]);
return PTR_ERR(chain);
@@ -3457,7 +3461,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
return PTR_ERR(old_rule);
}
} else if (nla[NFTA_RULE_POSITION_ID]) {
- old_rule = nft_rule_lookup_byid(net, nla[NFTA_RULE_POSITION_ID]);
+ old_rule = nft_rule_lookup_byid(net, chain, nla[NFTA_RULE_POSITION_ID]);
if (IS_ERR(old_rule)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_POSITION_ID]);
return PTR_ERR(old_rule);
@@ -3602,6 +3606,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
}
static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
+ const struct nft_chain *chain,
const struct nlattr *nla)
{
struct nftables_pernet *nft_net = nft_pernet(net);
@@ -3612,6 +3617,7 @@ static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
struct nft_rule *rule = nft_trans_rule(trans);
if (trans->msg_type == NFT_MSG_NEWRULE &&
+ trans->ctx.chain == chain &&
id == nft_trans_rule_id(trans))
return rule;
}
@@ -3661,7 +3667,7 @@ static int nf_tables_delrule(struct sk_buff *skb, const struct nfnl_info *info,
err = nft_delrule(&ctx, rule);
} else if (nla[NFTA_RULE_ID]) {
- rule = nft_rule_lookup_byid(net, nla[NFTA_RULE_ID]);
+ rule = nft_rule_lookup_byid(net, chain, nla[NFTA_RULE_ID]);
if (IS_ERR(rule)) {
NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_ID]);
return PTR_ERR(rule);
@@ -3840,6 +3846,7 @@ static struct nft_set *nft_set_lookup_byhandle(const struct nft_table *table,
}
static struct nft_set *nft_set_lookup_byid(const struct net *net,
+ const struct nft_table *table,
const struct nlattr *nla, u8 genmask)
{
struct nftables_pernet *nft_net = nft_pernet(net);
@@ -3851,6 +3858,7 @@ static struct nft_set *nft_set_lookup_byid(const struct net *net,
struct nft_set *set = nft_trans_set(trans);
if (id == nft_trans_set_id(trans) &&
+ set->table == table &&
nft_active_genmask(set, genmask))
return set;
}
@@ -3871,7 +3879,7 @@ struct nft_set *nft_set_lookup_global(const struct net *net,
if (!nla_set_id)
return set;
- set = nft_set_lookup_byid(net, nla_set_id, genmask);
+ set = nft_set_lookup_byid(net, table, nla_set_id, genmask);
}
return set;
}
@@ -9595,7 +9603,7 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
tb[NFTA_VERDICT_CHAIN],
genmask);
} else if (tb[NFTA_VERDICT_CHAIN_ID]) {
- chain = nft_chain_lookup_byid(ctx->net,
+ chain = nft_chain_lookup_byid(ctx->net, ctx->table,
tb[NFTA_VERDICT_CHAIN_ID]);
if (IS_ERR(chain))
return PTR_ERR(chain);
diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
index 15e4b7640dc0..da29e92c03e2 100644
--- a/net/netfilter/nft_queue.c
+++ b/net/netfilter/nft_queue.c
@@ -68,6 +68,31 @@ static void nft_queue_sreg_eval(const struct nft_expr *expr,
regs->verdict.code = ret;
}
+static int nft_queue_validate(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nft_data **data)
+{
+ static const unsigned int supported_hooks = ((1 << NF_INET_PRE_ROUTING) |
+ (1 << NF_INET_LOCAL_IN) |
+ (1 << NF_INET_FORWARD) |
+ (1 << NF_INET_LOCAL_OUT) |
+ (1 << NF_INET_POST_ROUTING));
+
+ switch (ctx->family) {
+ case NFPROTO_IPV4:
+ case NFPROTO_IPV6:
+ case NFPROTO_INET:
+ case NFPROTO_BRIDGE:
+ break;
+ case NFPROTO_NETDEV: /* lacks okfn */
+ fallthrough;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ return nft_chain_validate_hooks(ctx->chain, supported_hooks);
+}
+
static const struct nla_policy nft_queue_policy[NFTA_QUEUE_MAX + 1] = {
[NFTA_QUEUE_NUM] = { .type = NLA_U16 },
[NFTA_QUEUE_TOTAL] = { .type = NLA_U16 },
@@ -164,6 +189,7 @@ static const struct nft_expr_ops nft_queue_ops = {
.eval = nft_queue_eval,
.init = nft_queue_init,
.dump = nft_queue_dump,
+ .validate = nft_queue_validate,
.reduce = NFT_REDUCE_READONLY,
};
@@ -173,6 +199,7 @@ static const struct nft_expr_ops nft_queue_sreg_ops = {
.eval = nft_queue_sreg_eval,
.init = nft_queue_sreg_init,
.dump = nft_queue_sreg_dump,
+ .validate = nft_queue_validate,
.reduce = NFT_REDUCE_READONLY,
};
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index bf2d986a6bc3..a8e3ec800a9c 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -192,6 +192,7 @@ static void rose_kill_by_device(struct net_device *dev)
rose_disconnect(s, ENETUNREACH, ROSE_OUT_OF_ORDER, 0);
if (rose->neighbour)
rose->neighbour->use--;
+ dev_put(rose->device);
rose->device = NULL;
}
}
@@ -592,6 +593,8 @@ static struct sock *rose_make_new(struct sock *osk)
rose->idle = orose->idle;
rose->defer = orose->defer;
rose->device = orose->device;
+ if (rose->device)
+ dev_hold(rose->device);
rose->qbitincl = orose->qbitincl;
return sk;
@@ -645,6 +648,7 @@ static int rose_release(struct socket *sock)
break;
}
+ dev_put(rose->device);
sock->sk = NULL;
release_sock(sk);
sock_put(sk);
@@ -721,7 +725,6 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le
struct rose_sock *rose = rose_sk(sk);
struct sockaddr_rose *addr = (struct sockaddr_rose *)uaddr;
unsigned char cause, diagnostic;
- struct net_device *dev;
ax25_uid_assoc *user;
int n, err = 0;
@@ -778,9 +781,12 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le
}
if (sock_flag(sk, SOCK_ZAPPED)) { /* Must bind first - autobinding in this may or may not work */
+ struct net_device *dev;
+
sock_reset_flag(sk, SOCK_ZAPPED);
- if ((dev = rose_dev_first()) == NULL) {
+ dev = rose_dev_first();
+ if (!dev) {
err = -ENETUNREACH;
goto out_release;
}
@@ -788,6 +794,7 @@ static int rose_connect(struct socket *sock, struct sockaddr *uaddr, int addr_le
user = ax25_findbyuid(current_euid());
if (!user) {
err = -EINVAL;
+ dev_put(dev);
goto out_release;
}
diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c
index 5e93510fa3d2..09ad2d9e63c9 100644
--- a/net/rose/rose_route.c
+++ b/net/rose/rose_route.c
@@ -615,6 +615,8 @@ struct net_device *rose_dev_first(void)
if (first == NULL || strncmp(dev->name, first->name, 3) < 0)
first = dev;
}
+ if (first)
+ dev_hold(first);
rcu_read_unlock();
return first;
diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
index a35ab8c27866..3f935cbbaff6 100644
--- a/net/sched/cls_route.c
+++ b/net/sched/cls_route.c
@@ -526,7 +526,7 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
rcu_assign_pointer(f->next, f1);
rcu_assign_pointer(*fp, f);
- if (fold && fold->handle && f->handle != fold->handle) {
+ if (fold) {
th = to_hash(fold->handle);
h = from_hash(fold->handle >> 16);
b = rtnl_dereference(head->table[th]);
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 48585c4d04ad..0273bf7375e2 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -87,8 +87,7 @@ obj := $(KBUILD_EXTMOD)
src := $(obj)
# Include the module's Makefile to find KBUILD_EXTRA_SYMBOLS
-include $(if $(wildcard $(KBUILD_EXTMOD)/Kbuild), \
- $(KBUILD_EXTMOD)/Kbuild, $(KBUILD_EXTMOD)/Makefile)
+include $(if $(wildcard $(src)/Kbuild), $(src)/Kbuild, $(src)/Makefile)
# modpost option for external modules
MODPOST += -e
diff --git a/scripts/faddr2line b/scripts/faddr2line
index 94ed98dd899f..57099687e5e1 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -112,7 +112,9 @@ __faddr2line() {
# section offsets.
local file_type=$(${READELF} --file-header $objfile |
${AWK} '$1 == "Type:" { print $2; exit }')
- [[ $file_type = "EXEC" ]] && is_vmlinux=1
+ if [[ $file_type = "EXEC" ]] || [[ $file_type == "DYN" ]]; then
+ is_vmlinux=1
+ fi
# Go through each of the object's symbols which match the func name.
# In rare cases there might be duplicates, in which case we print all
diff --git a/scripts/gdb/linux/dmesg.py b/scripts/gdb/linux/dmesg.py
index d5983cf3db7d..c771831eb077 100644
--- a/scripts/gdb/linux/dmesg.py
+++ b/scripts/gdb/linux/dmesg.py
@@ -22,7 +22,6 @@ prb_desc_type = utils.CachedType("struct prb_desc")
prb_desc_ring_type = utils.CachedType("struct prb_desc_ring")
prb_data_ring_type = utils.CachedType("struct prb_data_ring")
printk_ringbuffer_type = utils.CachedType("struct printk_ringbuffer")
-atomic_long_type = utils.CachedType("atomic_long_t")
class LxDmesg(gdb.Command):
"""Print Linux kernel log buffer."""
@@ -68,8 +67,6 @@ class LxDmesg(gdb.Command):
off = prb_data_ring_type.get_type()['data'].bitpos // 8
text_data_addr = utils.read_ulong(text_data_ring, off)
- counter_off = atomic_long_type.get_type()['counter'].bitpos // 8
-
sv_off = prb_desc_type.get_type()['state_var'].bitpos // 8
off = prb_desc_type.get_type()['text_blk_lpos'].bitpos // 8
@@ -89,9 +86,9 @@ class LxDmesg(gdb.Command):
# read in tail and head descriptor ids
off = prb_desc_ring_type.get_type()['tail_id'].bitpos // 8
- tail_id = utils.read_u64(desc_ring, off + counter_off)
+ tail_id = utils.read_atomic_long(desc_ring, off)
off = prb_desc_ring_type.get_type()['head_id'].bitpos // 8
- head_id = utils.read_u64(desc_ring, off + counter_off)
+ head_id = utils.read_atomic_long(desc_ring, off)
did = tail_id
while True:
@@ -102,7 +99,7 @@ class LxDmesg(gdb.Command):
desc = utils.read_memoryview(inf, desc_addr + desc_off, desc_sz).tobytes()
# skip non-committed record
- state = 3 & (utils.read_u64(desc, sv_off + counter_off) >> desc_flags_shift)
+ state = 3 & (utils.read_atomic_long(desc, sv_off) >> desc_flags_shift)
if state != desc_committed and state != desc_finalized:
if did == head_id:
break
diff --git a/scripts/gdb/linux/utils.py b/scripts/gdb/linux/utils.py
index ff7c1799d588..1553f68716cc 100644
--- a/scripts/gdb/linux/utils.py
+++ b/scripts/gdb/linux/utils.py
@@ -35,13 +35,12 @@ class CachedType:
long_type = CachedType("long")
-
+atomic_long_type = CachedType("atomic_long_t")
def get_long_type():
global long_type
return long_type.get_type()
-
def offset_of(typeobj, field):
element = gdb.Value(0).cast(typeobj)
return int(str(element[field].address).split()[0], 16)
@@ -129,6 +128,17 @@ def read_ulong(buffer, offset):
else:
return read_u32(buffer, offset)
+atomic_long_counter_offset = atomic_long_type.get_type()['counter'].bitpos
+atomic_long_counter_sizeof = atomic_long_type.get_type()['counter'].type.sizeof
+
+def read_atomic_long(buffer, offset):
+ global atomic_long_counter_offset
+ global atomic_long_counter_sizeof
+
+ if atomic_long_counter_sizeof == 8:
+ return read_u64(buffer, offset + atomic_long_counter_offset)
+ else:
+ return read_u32(buffer, offset + atomic_long_counter_offset)
target_arch = None
diff --git a/security/selinux/ss/policydb.h b/security/selinux/ss/policydb.h
index c24d4e1063ea..ffc4e7bad205 100644
--- a/security/selinux/ss/policydb.h
+++ b/security/selinux/ss/policydb.h
@@ -370,6 +370,8 @@ static inline int put_entry(const void *buf, size_t bytes, int num, struct polic
{
size_t len = bytes * num;
+ if (len > fp->len)
+ return -EINVAL;
memcpy(fp->data, buf, len);
fp->data += len;
fp->len -= len;
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 6901dc07680d..cad54f454d01 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -4049,6 +4049,7 @@ int security_read_policy(struct selinux_state *state,
int security_read_state_kernel(struct selinux_state *state,
void **data, size_t *len)
{
+ int err;
struct selinux_policy *policy;
policy = rcu_dereference_protected(
@@ -4061,5 +4062,11 @@ int security_read_state_kernel(struct selinux_state *state,
if (!*data)
return -ENOMEM;
- return __security_read_policy(policy, *data, len);
+ err = __security_read_policy(policy, *data, len);
+ if (err) {
+ vfree(*data);
+ *data = NULL;
+ *len = 0;
+ }
+ return err;
}
diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index 678fbcaf2a3b..6807b4708a17 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -395,6 +395,7 @@ static const struct snd_pci_quirk cs420x_fixup_tbl[] = {
/* codec SSID */
SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
+ SND_PCI_QUIRK(0x106b, 0x0900, "iMac 12,1", CS420X_IMAC27_122),
SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index 8c68e6e1387e..2bc9274e0960 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -222,6 +222,7 @@ enum {
CXT_PINCFG_LEMOTE_A1205,
CXT_PINCFG_COMPAQ_CQ60,
CXT_FIXUP_STEREO_DMIC,
+ CXT_PINCFG_LENOVO_NOTEBOOK,
CXT_FIXUP_INC_MIC_BOOST,
CXT_FIXUP_HEADPHONE_MIC_PIN,
CXT_FIXUP_HEADPHONE_MIC,
@@ -772,6 +773,14 @@ static const struct hda_fixup cxt_fixups[] = {
.type = HDA_FIXUP_FUNC,
.v.func = cxt_fixup_stereo_dmic,
},
+ [CXT_PINCFG_LENOVO_NOTEBOOK] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x1a, 0x05d71030 },
+ { }
+ },
+ .chain_id = CXT_FIXUP_STEREO_DMIC,
+ },
[CXT_FIXUP_INC_MIC_BOOST] = {
.type = HDA_FIXUP_FUNC,
.v.func = cxt5066_increase_mic_boost,
@@ -971,7 +980,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
SND_PCI_QUIRK(0x17aa, 0x3905, "Lenovo G50-30", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x390b, "Lenovo G50-80", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC),
- SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_PINCFG_LENOVO_NOTEBOOK),
SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo G50-70", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK_VENDOR(0x17aa, "Thinkpad", CXT_FIXUP_THINKPAD_ACPI),
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index b7f1f2fb60cb..9c7335721cd0 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6839,6 +6839,43 @@ static void alc_fixup_dell4_mic_no_presence_quiet(struct hda_codec *codec,
}
}
+static void alc287_fixup_yoga9_14iap7_bass_spk_pin(struct hda_codec *codec,
+ const struct hda_fixup *fix, int action)
+{
+ /*
+ * The Pin Complex 0x17 for the bass speakers is wrongly reported as
+ * unconnected.
+ */
+ static const struct hda_pintbl pincfgs[] = {
+ { 0x17, 0x90170121 },
+ { }
+ };
+ /*
+ * Avoid DAC 0x06 and 0x08, as they have no volume controls.
+ * DAC 0x02 and 0x03 would be fine.
+ */
+ static const hda_nid_t conn[] = { 0x02, 0x03 };
+ /*
+ * Prefer both speakerbar (0x14) and bass speakers (0x17) connected to DAC 0x02.
+ * Headphones (0x21) are connected to DAC 0x03.
+ */
+ static const hda_nid_t preferred_pairs[] = {
+ 0x14, 0x02,
+ 0x17, 0x02,
+ 0x21, 0x03,
+ 0
+ };
+ struct alc_spec *spec = codec->spec;
+
+ switch (action) {
+ case HDA_FIXUP_ACT_PRE_PROBE:
+ snd_hda_apply_pincfgs(codec, pincfgs);
+ snd_hda_override_conn_list(codec, 0x17, ARRAY_SIZE(conn), conn);
+ spec->gen.preferred_dacs = preferred_pairs;
+ break;
+ }
+}
+
enum {
ALC269_FIXUP_GPIO2,
ALC269_FIXUP_SONY_VAIO,
@@ -6894,6 +6931,7 @@ enum {
ALC269_FIXUP_LIMIT_INT_MIC_BOOST,
ALC269VB_FIXUP_ASUS_ZENBOOK,
ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A,
+ ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE,
ALC269_FIXUP_LIMIT_INT_MIC_BOOST_MUTE_LED,
ALC269VB_FIXUP_ORDISSIMO_EVE2,
ALC283_FIXUP_CHROME_BOOK,
@@ -7075,6 +7113,8 @@ enum {
ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED,
ALC285_FIXUP_HP_SPEAKERS_MICMUTE_LED,
ALC295_FIXUP_FRAMEWORK_LAPTOP_MIC_NO_PRESENCE,
+ ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK,
+ ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN,
};
/* A special fixup for Lenovo C940 and Yoga Duet 7;
@@ -7479,6 +7519,15 @@ static const struct hda_fixup alc269_fixups[] = {
.chained = true,
.chain_id = ALC269VB_FIXUP_ASUS_ZENBOOK,
},
+ [ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x18, 0x01a110f0 }, /* use as headset mic */
+ { }
+ },
+ .chained = true,
+ .chain_id = ALC269_FIXUP_HEADSET_MIC
+ },
[ALC269_FIXUP_LIMIT_INT_MIC_BOOST_MUTE_LED] = {
.type = HDA_FIXUP_FUNC,
.v.func = alc269_fixup_limit_int_mic_boost,
@@ -8917,6 +8966,74 @@ static const struct hda_fixup alc269_fixups[] = {
.chained = true,
.chain_id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC
},
+ [ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK] = {
+ .type = HDA_FIXUP_VERBS,
+ .v.verbs = (const struct hda_verb[]) {
+ // enable left speaker
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x41 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xc },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x1a },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xf },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x42 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x10 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x40 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x2 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ // enable right speaker
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x24 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x46 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xc },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x2a },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xf },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x46 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x10 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x44 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x26 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x2 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x0 },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0xb020 },
+
+ { },
+ },
+ },
+ [ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc287_fixup_yoga9_14iap7_bass_spk_pin,
+ .chained = true,
+ .chain_id = ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK,
+ },
};
static const struct snd_pci_quirk alc269_fixup_tbl[] = {
@@ -9096,6 +9213,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x103c, 0x861f, "HP Elite Dragonfly G1", ALC285_FIXUP_HP_GPIO_AMP_INIT),
SND_PCI_QUIRK(0x103c, 0x869d, "HP", ALC236_FIXUP_HP_MUTE_LED),
SND_PCI_QUIRK(0x103c, 0x86c7, "HP Envy AiO 32", ALC274_FIXUP_HP_ENVY_GPIO),
+ SND_PCI_QUIRK(0x103c, 0x86e7, "HP Spectre x360 15-eb0xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
+ SND_PCI_QUIRK(0x103c, 0x86e8, "HP Spectre x360 15-eb0xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
SND_PCI_QUIRK(0x103c, 0x8716, "HP Elite Dragonfly G2 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
SND_PCI_QUIRK(0x103c, 0x8720, "HP EliteBook x360 1040 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
SND_PCI_QUIRK(0x103c, 0x8724, "HP EliteBook 850 G7", ALC285_FIXUP_HP_GPIO_LED),
@@ -9111,6 +9230,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
ALC285_FIXUP_HP_GPIO_AMP_INIT),
SND_PCI_QUIRK(0x103c, 0x8783, "HP ZBook Fury 15 G7 Mobile Workstation",
ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8786, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
SND_PCI_QUIRK(0x103c, 0x8787, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
SND_PCI_QUIRK(0x103c, 0x8788, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED),
SND_PCI_QUIRK(0x103c, 0x87c8, "HP", ALC287_FIXUP_HP_GPIO_LED),
@@ -9180,6 +9300,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1043, 0x12a0, "ASUS X441UV", ALC233_FIXUP_EAPD_COEF_AND_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x12e0, "ASUS X541SA", ALC256_FIXUP_ASUS_MIC),
SND_PCI_QUIRK(0x1043, 0x12f0, "ASUS X541UV", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1313, "Asus K42JZ", ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC),
SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK),
SND_PCI_QUIRK(0x1043, 0x1517, "Asus Zenbook UX31A", ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A),
@@ -9255,6 +9376,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1558, 0x4018, "Clevo NV40M[BE]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x4019, "Clevo NV40MZ", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x4020, "Clevo NV40MB", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x4041, "Clevo NV4[15]PZ", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x40a1, "Clevo NL40GU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x40c1, "Clevo NL40[CZ]U", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x40d1, "Clevo NL41DU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
@@ -9367,6 +9489,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x17aa, 0x3176, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x17aa, 0x31af, "ThinkCentre Station", ALC623_FIXUP_LENOVO_THINKSTATION_P340),
+ SND_PCI_QUIRK(0x17aa, 0x3801, "Lenovo Yoga9 14IAP7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN),
SND_PCI_QUIRK(0x17aa, 0x3802, "Lenovo Yoga DuetITL 2021", ALC287_FIXUP_YOGA7_14ITL_SPEAKERS),
SND_PCI_QUIRK(0x17aa, 0x3813, "Legion 7i 15IMHG05", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS),
SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940 / Yoga Duet 7", ALC298_FIXUP_LENOVO_C940_DUET7),
@@ -9612,6 +9735,7 @@ static const struct hda_model_fixup alc269_fixup_models[] = {
{.id = ALC285_FIXUP_HP_SPECTRE_X360, .name = "alc285-hp-spectre-x360"},
{.id = ALC285_FIXUP_HP_SPECTRE_X360_EB1, .name = "alc285-hp-spectre-x360-eb1"},
{.id = ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP, .name = "alc287-ideapad-bass-spk-amp"},
+ {.id = ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN, .name = "alc287-yoga9-bass-spk-pin"},
{.id = ALC623_FIXUP_LENOVO_THINKSTATION_P340, .name = "alc623-lenovo-thinkstation-p340"},
{.id = ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, .name = "alc255-acer-headphone-and-mic"},
{.id = ALC285_FIXUP_HP_GPIO_AMP_INIT, .name = "alc285-hp-amp-init"},
diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c
index 959b70e8baf2..7d3d7d33fa20 100644
--- a/sound/soc/amd/yc/acp6x-mach.c
+++ b/sound/soc/amd/yc/acp6x-mach.c
@@ -104,28 +104,14 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
.driver_data = &acp6x_card,
.matches = {
DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_NAME, "21AW"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CM"),
}
},
{
.driver_data = &acp6x_card,
.matches = {
DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_NAME, "21AX"),
- }
- },
- {
- .driver_data = &acp6x_card,
- .matches = {
- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_NAME, "21BN"),
- }
- },
- {
- .driver_data = &acp6x_card,
- .matches = {
- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_NAME, "21BQ"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21CN"),
}
},
{
@@ -156,20 +142,6 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "21CL"),
}
},
- {
- .driver_data = &acp6x_card,
- .matches = {
- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_NAME, "21D8"),
- }
- },
- {
- .driver_data = &acp6x_card,
- .matches = {
- DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
- DMI_MATCH(DMI_PRODUCT_NAME, "21D9"),
- }
- },
{}
};
diff --git a/sound/soc/atmel/mchp-spdifrx.c b/sound/soc/atmel/mchp-spdifrx.c
index 5fc968483f2c..a7baa0385ec5 100644
--- a/sound/soc/atmel/mchp-spdifrx.c
+++ b/sound/soc/atmel/mchp-spdifrx.c
@@ -288,15 +288,17 @@ static void mchp_spdifrx_isr_blockend_en(struct mchp_spdifrx_dev *dev)
spin_unlock_irqrestore(&dev->blockend_lock, flags);
}
-/* called from atomic context only */
+/* called from atomic/non-atomic context */
static void mchp_spdifrx_isr_blockend_dis(struct mchp_spdifrx_dev *dev)
{
- spin_lock(&dev->blockend_lock);
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev->blockend_lock, flags);
dev->blockend_refcount--;
/* don't enable BLOCKEND interrupt if it's already enabled */
if (dev->blockend_refcount == 0)
regmap_write(dev->regmap, SPDIFRX_IDR, SPDIFRX_IR_BLOCKEND);
- spin_unlock(&dev->blockend_lock);
+ spin_unlock_irqrestore(&dev->blockend_lock, flags);
}
static irqreturn_t mchp_spdif_interrupt(int irq, void *dev_id)
@@ -575,6 +577,7 @@ static int mchp_spdifrx_subcode_ch_get(struct mchp_spdifrx_dev *dev,
if (ret <= 0) {
dev_dbg(dev->dev, "user data for channel %d timeout\n",
channel);
+ mchp_spdifrx_isr_blockend_dis(dev);
return ret;
}
diff --git a/sound/soc/codecs/cros_ec_codec.c b/sound/soc/codecs/cros_ec_codec.c
index 9b92e1a0d1a3..43561ff1bb8d 100644
--- a/sound/soc/codecs/cros_ec_codec.c
+++ b/sound/soc/codecs/cros_ec_codec.c
@@ -994,6 +994,7 @@ static int cros_ec_codec_platform_probe(struct platform_device *pdev)
dev_dbg(dev, "ap_shm_phys_addr=%#llx len=%#x\n",
priv->ap_shm_phys_addr, priv->ap_shm_len);
}
+ of_node_put(node);
}
#endif
diff --git a/sound/soc/codecs/da7210.c b/sound/soc/codecs/da7210.c
index 8af344b2fdbf..d75d15006f64 100644
--- a/sound/soc/codecs/da7210.c
+++ b/sound/soc/codecs/da7210.c
@@ -1336,6 +1336,8 @@ static int __init da7210_modinit(void)
int ret = 0;
#if IS_ENABLED(CONFIG_I2C)
ret = i2c_add_driver(&da7210_i2c_driver);
+ if (ret)
+ return ret;
#endif
#if defined(CONFIG_SPI_MASTER)
ret = spi_register_driver(&da7210_spi_driver);
diff --git a/sound/soc/codecs/msm8916-wcd-digital.c b/sound/soc/codecs/msm8916-wcd-digital.c
index 20a07c92b2fc..098a58990f07 100644
--- a/sound/soc/codecs/msm8916-wcd-digital.c
+++ b/sound/soc/codecs/msm8916-wcd-digital.c
@@ -328,8 +328,8 @@ static const struct snd_kcontrol_new rx1_mix2_inp1_mux = SOC_DAPM_ENUM(
static const struct snd_kcontrol_new rx2_mix2_inp1_mux = SOC_DAPM_ENUM(
"RX2 MIX2 INP1 Mux", rx2_mix2_inp1_chain_enum);
-/* Digital Gain control -38.4 dB to +38.4 dB in 0.3 dB steps */
-static const DECLARE_TLV_DB_SCALE(digital_gain, -3840, 30, 0);
+/* Digital Gain control -84 dB to +40 dB in 1 dB steps */
+static const DECLARE_TLV_DB_SCALE(digital_gain, -8400, 100, -8400);
/* Cutoff Freq for High Pass Filter at -3dB */
static const char * const hpf_cutoff_text[] = {
@@ -510,15 +510,15 @@ static int wcd_iir_filter_info(struct snd_kcontrol *kcontrol,
static const struct snd_kcontrol_new msm8916_wcd_digital_snd_controls[] = {
SOC_SINGLE_S8_TLV("RX1 Digital Volume", LPASS_CDC_RX1_VOL_CTL_B2_CTL,
- -128, 127, digital_gain),
+ -84, 40, digital_gain),
SOC_SINGLE_S8_TLV("RX2 Digital Volume", LPASS_CDC_RX2_VOL_CTL_B2_CTL,
- -128, 127, digital_gain),
+ -84, 40, digital_gain),
SOC_SINGLE_S8_TLV("RX3 Digital Volume", LPASS_CDC_RX3_VOL_CTL_B2_CTL,
- -128, 127, digital_gain),
+ -84, 40, digital_gain),
SOC_SINGLE_S8_TLV("TX1 Digital Volume", LPASS_CDC_TX1_VOL_CTL_GAIN,
- -128, 127, digital_gain),
+ -84, 40, digital_gain),
SOC_SINGLE_S8_TLV("TX2 Digital Volume", LPASS_CDC_TX2_VOL_CTL_GAIN,
- -128, 127, digital_gain),
+ -84, 40, digital_gain),
SOC_ENUM("TX1 HPF Cutoff", tx1_hpf_cutoff_enum),
SOC_ENUM("TX2 HPF Cutoff", tx2_hpf_cutoff_enum),
SOC_SINGLE("TX1 HPF Switch", LPASS_CDC_TX1_MUX_CTL, 3, 1, 0),
@@ -553,22 +553,22 @@ static const struct snd_kcontrol_new msm8916_wcd_digital_snd_controls[] = {
WCD_IIR_FILTER_CTL("IIR2 Band3", IIR2, BAND3),
WCD_IIR_FILTER_CTL("IIR2 Band4", IIR2, BAND4),
WCD_IIR_FILTER_CTL("IIR2 Band5", IIR2, BAND5),
- SOC_SINGLE_SX_TLV("IIR1 INP1 Volume", LPASS_CDC_IIR1_GAIN_B1_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR1 INP2 Volume", LPASS_CDC_IIR1_GAIN_B2_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR1 INP3 Volume", LPASS_CDC_IIR1_GAIN_B3_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR1 INP4 Volume", LPASS_CDC_IIR1_GAIN_B4_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR2 INP1 Volume", LPASS_CDC_IIR2_GAIN_B1_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR2 INP2 Volume", LPASS_CDC_IIR2_GAIN_B2_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR2 INP3 Volume", LPASS_CDC_IIR2_GAIN_B3_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("IIR2 INP4 Volume", LPASS_CDC_IIR2_GAIN_B4_CTL,
- 0, -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR1 INP1 Volume", LPASS_CDC_IIR1_GAIN_B1_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR1 INP2 Volume", LPASS_CDC_IIR1_GAIN_B2_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR1 INP3 Volume", LPASS_CDC_IIR1_GAIN_B3_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR1 INP4 Volume", LPASS_CDC_IIR1_GAIN_B4_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR2 INP1 Volume", LPASS_CDC_IIR2_GAIN_B1_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR2 INP2 Volume", LPASS_CDC_IIR2_GAIN_B2_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR2 INP3 Volume", LPASS_CDC_IIR2_GAIN_B3_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("IIR2 INP4 Volume", LPASS_CDC_IIR2_GAIN_B4_CTL,
+ -84, 40, digital_gain),
};
diff --git a/sound/soc/codecs/mt6359-accdet.c b/sound/soc/codecs/mt6359-accdet.c
index 6d3d170144a0..c190628e2905 100644
--- a/sound/soc/codecs/mt6359-accdet.c
+++ b/sound/soc/codecs/mt6359-accdet.c
@@ -675,6 +675,7 @@ static int mt6359_accdet_parse_dt(struct mt6359_accdet *priv)
sizeof(struct three_key_threshold));
}
+ of_node_put(node);
dev_warn(priv->dev, "accdet caps=%x\n", priv->caps);
return 0;
diff --git a/sound/soc/codecs/mt6359.c b/sound/soc/codecs/mt6359.c
index f8532aa7e4aa..9a9c8555f720 100644
--- a/sound/soc/codecs/mt6359.c
+++ b/sound/soc/codecs/mt6359.c
@@ -2780,6 +2780,7 @@ static int mt6359_parse_dt(struct mt6359_priv *priv)
ret = of_property_read_u32(np, "mediatek,mic-type-2",
&priv->mux_select[MUX_MIC_TYPE_2]);
+ of_node_put(np);
if (ret) {
dev_info(priv->dev,
"%s() failed to read mic-type-2, use default (%d)\n",
diff --git a/sound/soc/codecs/wcd9335.c b/sound/soc/codecs/wcd9335.c
index aa685980a97b..b7c5bfc44127 100644
--- a/sound/soc/codecs/wcd9335.c
+++ b/sound/soc/codecs/wcd9335.c
@@ -2259,51 +2259,42 @@ static int wcd9335_rx_hph_mode_put(struct snd_kcontrol *kc,
static const struct snd_kcontrol_new wcd9335_snd_controls[] = {
/* -84dB min - 40dB max */
- SOC_SINGLE_SX_TLV("RX0 Digital Volume", WCD9335_CDC_RX0_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX1 Digital Volume", WCD9335_CDC_RX1_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX2 Digital Volume", WCD9335_CDC_RX2_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX3 Digital Volume", WCD9335_CDC_RX3_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX4 Digital Volume", WCD9335_CDC_RX4_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX5 Digital Volume", WCD9335_CDC_RX5_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX6 Digital Volume", WCD9335_CDC_RX6_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX7 Digital Volume", WCD9335_CDC_RX7_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX8 Digital Volume", WCD9335_CDC_RX8_RX_VOL_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX0 Mix Digital Volume",
- WCD9335_CDC_RX0_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX1 Mix Digital Volume",
- WCD9335_CDC_RX1_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX2 Mix Digital Volume",
- WCD9335_CDC_RX2_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX3 Mix Digital Volume",
- WCD9335_CDC_RX3_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX4 Mix Digital Volume",
- WCD9335_CDC_RX4_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX5 Mix Digital Volume",
- WCD9335_CDC_RX5_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX6 Mix Digital Volume",
- WCD9335_CDC_RX6_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX7 Mix Digital Volume",
- WCD9335_CDC_RX7_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
- SOC_SINGLE_SX_TLV("RX8 Mix Digital Volume",
- WCD9335_CDC_RX8_RX_VOL_MIX_CTL,
- 0, -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX0 Digital Volume", WCD9335_CDC_RX0_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX1 Digital Volume", WCD9335_CDC_RX1_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX2 Digital Volume", WCD9335_CDC_RX2_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX3 Digital Volume", WCD9335_CDC_RX3_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX4 Digital Volume", WCD9335_CDC_RX4_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX5 Digital Volume", WCD9335_CDC_RX5_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX6 Digital Volume", WCD9335_CDC_RX6_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX7 Digital Volume", WCD9335_CDC_RX7_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX8 Digital Volume", WCD9335_CDC_RX8_RX_VOL_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX0 Mix Digital Volume", WCD9335_CDC_RX0_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX1 Mix Digital Volume", WCD9335_CDC_RX1_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX2 Mix Digital Volume", WCD9335_CDC_RX2_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX3 Mix Digital Volume", WCD9335_CDC_RX3_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX4 Mix Digital Volume", WCD9335_CDC_RX4_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX5 Mix Digital Volume", WCD9335_CDC_RX5_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX6 Mix Digital Volume", WCD9335_CDC_RX6_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX7 Mix Digital Volume", WCD9335_CDC_RX7_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
+ SOC_SINGLE_S8_TLV("RX8 Mix Digital Volume", WCD9335_CDC_RX8_RX_VOL_MIX_CTL,
+ -84, 40, digital_gain),
SOC_ENUM("RX INT0_1 HPF cut off", cf_int0_1_enum),
SOC_ENUM("RX INT0_2 HPF cut off", cf_int0_2_enum),
SOC_ENUM("RX INT1_1 HPF cut off", cf_int1_1_enum),
diff --git a/sound/soc/codecs/wsa881x.c b/sound/soc/codecs/wsa881x.c
index 616b26c70c3b..649c7e73f774 100644
--- a/sound/soc/codecs/wsa881x.c
+++ b/sound/soc/codecs/wsa881x.c
@@ -1174,11 +1174,17 @@ static int __maybe_unused wsa881x_runtime_resume(struct device *dev)
struct sdw_slave *slave = dev_to_sdw_dev(dev);
struct regmap *regmap = dev_get_regmap(dev, NULL);
struct wsa881x_priv *wsa881x = dev_get_drvdata(dev);
+ unsigned long time;
gpiod_direction_output(wsa881x->sd_n, 1);
- wait_for_completion_timeout(&slave->initialization_complete,
- msecs_to_jiffies(WSA881X_PROBE_TIMEOUT));
+ time = wait_for_completion_timeout(&slave->initialization_complete,
+ msecs_to_jiffies(WSA881X_PROBE_TIMEOUT));
+ if (!time) {
+ dev_err(dev, "Initialization not complete, timed out\n");
+ gpiod_direction_output(wsa881x->sd_n, 0);
+ return -ETIMEDOUT;
+ }
regcache_cache_only(regmap, false);
regcache_sync(regmap);
diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c
index d9a0d4768c4d..c836848ef0a6 100644
--- a/sound/soc/fsl/fsl-asoc-card.c
+++ b/sound/soc/fsl/fsl-asoc-card.c
@@ -537,6 +537,7 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
struct device *codec_dev = NULL;
const char *codec_dai_name;
const char *codec_dev_name;
+ u32 asrc_fmt = 0;
u32 width;
int ret;
@@ -829,8 +830,8 @@ static int fsl_asoc_card_probe(struct platform_device *pdev)
goto asrc_fail;
}
- ret = of_property_read_u32(asrc_np, "fsl,asrc-format",
- &priv->asrc_format);
+ ret = of_property_read_u32(asrc_np, "fsl,asrc-format", &asrc_fmt);
+ priv->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
if (ret) {
/* Fallback to old binding; translate to asrc_format */
ret = of_property_read_u32(asrc_np, "fsl,asrc-width",
diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index d7d1536a4f37..44dcbf49456c 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -1066,6 +1066,7 @@ static int fsl_asrc_probe(struct platform_device *pdev)
struct resource *res;
void __iomem *regs;
int irq, ret, i;
+ u32 asrc_fmt = 0;
u32 map_idx;
char tmp[16];
u32 width;
@@ -1174,7 +1175,8 @@ static int fsl_asrc_probe(struct platform_device *pdev)
return ret;
}
- ret = of_property_read_u32(np, "fsl,asrc-format", &asrc->asrc_format);
+ ret = of_property_read_u32(np, "fsl,asrc-format", &asrc_fmt);
+ asrc->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
if (ret) {
ret = of_property_read_u32(np, "fsl,asrc-width", &width);
if (ret) {
@@ -1197,7 +1199,7 @@ static int fsl_asrc_probe(struct platform_device *pdev)
}
}
- if (!(FSL_ASRC_FORMATS & (1ULL << asrc->asrc_format))) {
+ if (!(FSL_ASRC_FORMATS & pcm_format_to_bits(asrc->asrc_format))) {
dev_warn(&pdev->dev, "unsupported width, use default S24_LE\n");
asrc->asrc_format = SNDRV_PCM_FORMAT_S24_LE;
}
diff --git a/sound/soc/fsl/fsl_easrc.c b/sound/soc/fsl/fsl_easrc.c
index be14f84796cb..cf0e10d17dbe 100644
--- a/sound/soc/fsl/fsl_easrc.c
+++ b/sound/soc/fsl/fsl_easrc.c
@@ -476,7 +476,8 @@ static int fsl_easrc_prefilter_config(struct fsl_asrc *easrc,
struct fsl_asrc_pair *ctx;
struct device *dev;
u32 inrate, outrate, offset = 0;
- u32 in_s_rate, out_s_rate, in_s_fmt, out_s_fmt;
+ u32 in_s_rate, out_s_rate;
+ snd_pcm_format_t in_s_fmt, out_s_fmt;
int ret, i;
if (!easrc)
@@ -1873,6 +1874,7 @@ static int fsl_easrc_probe(struct platform_device *pdev)
struct resource *res;
struct device_node *np;
void __iomem *regs;
+ u32 asrc_fmt = 0;
int ret, irq;
easrc = devm_kzalloc(dev, sizeof(*easrc), GFP_KERNEL);
@@ -1933,13 +1935,14 @@ static int fsl_easrc_probe(struct platform_device *pdev)
return ret;
}
- ret = of_property_read_u32(np, "fsl,asrc-format", &easrc->asrc_format);
+ ret = of_property_read_u32(np, "fsl,asrc-format", &asrc_fmt);
+ easrc->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
if (ret) {
dev_err(dev, "failed to asrc format\n");
return ret;
}
- if (!(FSL_EASRC_FORMATS & (1ULL << easrc->asrc_format))) {
+ if (!(FSL_EASRC_FORMATS & (pcm_format_to_bits(easrc->asrc_format)))) {
dev_warn(dev, "unsupported format, switching to S24_LE\n");
easrc->asrc_format = SNDRV_PCM_FORMAT_S24_LE;
}
diff --git a/sound/soc/fsl/fsl_easrc.h b/sound/soc/fsl/fsl_easrc.h
index 30620d56252c..5b8469757c12 100644
--- a/sound/soc/fsl/fsl_easrc.h
+++ b/sound/soc/fsl/fsl_easrc.h
@@ -569,7 +569,7 @@ struct fsl_easrc_io_params {
unsigned int access_len;
unsigned int fifo_wtmk;
unsigned int sample_rate;
- unsigned int sample_format;
+ snd_pcm_format_t sample_format;
unsigned int norm_rate;
};
diff --git a/sound/soc/fsl/imx-audmux.c b/sound/soc/fsl/imx-audmux.c
index dfa05d40b276..a8e5e0f57faf 100644
--- a/sound/soc/fsl/imx-audmux.c
+++ b/sound/soc/fsl/imx-audmux.c
@@ -298,7 +298,7 @@ static int imx_audmux_probe(struct platform_device *pdev)
audmux_clk = NULL;
}
- audmux_type = (enum imx_audmux_type)of_device_get_match_data(&pdev->dev);
+ audmux_type = (uintptr_t)of_device_get_match_data(&pdev->dev);
switch (audmux_type) {
case IMX31_AUDMUX:
diff --git a/sound/soc/fsl/imx-card.c b/sound/soc/fsl/imx-card.c
index 6f8efd838fcc..4a8609b0d700 100644
--- a/sound/soc/fsl/imx-card.c
+++ b/sound/soc/fsl/imx-card.c
@@ -17,6 +17,9 @@
#include "fsl_sai.h"
+#define IMX_CARD_MCLK_22P5792MHZ 22579200
+#define IMX_CARD_MCLK_24P576MHZ 24576000
+
enum codec_type {
CODEC_DUMMY = 0,
CODEC_AK5558 = 1,
@@ -115,7 +118,7 @@ struct imx_card_data {
struct snd_soc_card card;
int num_dapm_routes;
u32 asrc_rate;
- u32 asrc_format;
+ snd_pcm_format_t asrc_format;
};
static struct imx_akcodec_fs_mul ak4458_fs_mul[] = {
@@ -353,9 +356,14 @@ static int imx_aif_hw_params(struct snd_pcm_substream *substream,
mclk_freq = akcodec_get_mclk_rate(substream, params, slots, slot_width);
else
mclk_freq = params_rate(params) * slots * slot_width;
- /* Use the maximum freq from DSD512 (512*44100 = 22579200) */
- if (format_is_dsd(params))
- mclk_freq = 22579200;
+
+ if (format_is_dsd(params)) {
+ /* Use the maximum freq from DSD512 (512*44100 = 22579200) */
+ if (!(params_rate(params) % 11025))
+ mclk_freq = IMX_CARD_MCLK_22P5792MHZ;
+ else
+ mclk_freq = IMX_CARD_MCLK_24P576MHZ;
+ }
ret = snd_soc_dai_set_sysclk(cpu_dai, link_data->cpu_sysclk_id, mclk_freq,
SND_SOC_CLOCK_OUT);
@@ -466,7 +474,7 @@ static int be_hw_params_fixup(struct snd_soc_pcm_runtime *rtd,
mask = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT);
snd_mask_none(mask);
- snd_mask_set(mask, data->asrc_format);
+ snd_mask_set(mask, (__force unsigned int)data->asrc_format);
return 0;
}
@@ -485,6 +493,7 @@ static int imx_card_parse_of(struct imx_card_data *data)
struct dai_link_data *link_data;
struct of_phandle_args args;
int ret, num_links;
+ u32 asrc_fmt = 0;
u32 width;
ret = snd_soc_of_parse_card_name(card, "model");
@@ -631,7 +640,8 @@ static int imx_card_parse_of(struct imx_card_data *data)
goto err;
}
- ret = of_property_read_u32(args.np, "fsl,asrc-format", &data->asrc_format);
+ ret = of_property_read_u32(args.np, "fsl,asrc-format", &asrc_fmt);
+ data->asrc_format = (__force snd_pcm_format_t)asrc_fmt;
if (ret) {
/* Fallback to old binding; translate to asrc_format */
ret = of_property_read_u32(args.np, "fsl,asrc-width", &width);
diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c
index 2b598af8feef..b327372f2e4a 100644
--- a/sound/soc/generic/audio-graph-card.c
+++ b/sound/soc/generic/audio-graph-card.c
@@ -158,8 +158,10 @@ static int asoc_simple_parse_dai(struct device_node *ep,
* if he unbinded CPU or Codec.
*/
ret = snd_soc_get_dai_name(&args, &dlc->dai_name);
- if (ret < 0)
+ if (ret < 0) {
+ of_node_put(node);
return ret;
+ }
dlc->of_node = node;
diff --git a/sound/soc/generic/audio-graph-card2.c b/sound/soc/generic/audio-graph-card2.c
index c0f3907a01fd..da782403316f 100644
--- a/sound/soc/generic/audio-graph-card2.c
+++ b/sound/soc/generic/audio-graph-card2.c
@@ -229,7 +229,8 @@ enum graph_type {
static enum graph_type __graph_get_type(struct device_node *lnk)
{
- struct device_node *np;
+ struct device_node *np, *parent_np;
+ enum graph_type ret;
/*
* target {
@@ -240,19 +241,33 @@ static enum graph_type __graph_get_type(struct device_node *lnk)
* };
*/
np = of_get_parent(lnk);
- if (of_node_name_eq(np, "ports"))
- np = of_get_parent(np);
+ if (of_node_name_eq(np, "ports")) {
+ parent_np = of_get_parent(np);
+ of_node_put(np);
+ np = parent_np;
+ }
+
+ if (of_node_name_eq(np, GRAPH_NODENAME_MULTI)) {
+ ret = GRAPH_MULTI;
+ goto out_put;
+ }
+
+ if (of_node_name_eq(np, GRAPH_NODENAME_DPCM)) {
+ ret = GRAPH_DPCM;
+ goto out_put;
+ }
- if (of_node_name_eq(np, GRAPH_NODENAME_MULTI))
- return GRAPH_MULTI;
+ if (of_node_name_eq(np, GRAPH_NODENAME_C2C)) {
+ ret = GRAPH_C2C;
+ goto out_put;
+ }
- if (of_node_name_eq(np, GRAPH_NODENAME_DPCM))
- return GRAPH_DPCM;
+ ret = GRAPH_NORMAL;
- if (of_node_name_eq(np, GRAPH_NODENAME_C2C))
- return GRAPH_C2C;
+out_put:
+ of_node_put(np);
+ return ret;
- return GRAPH_NORMAL;
}
static enum graph_type graph_get_type(struct asoc_simple_priv *priv,
@@ -430,8 +445,10 @@ static int asoc_simple_parse_dai(struct device_node *ep,
* if he unbinded CPU or Codec.
*/
ret = snd_soc_get_dai_name(&args, &dlc->dai_name);
- if (ret < 0)
+ if (ret < 0) {
+ of_node_put(node);
return ret;
+ }
dlc->of_node = node;
@@ -856,7 +873,7 @@ int audio_graph2_link_c2c(struct asoc_simple_priv *priv,
struct device_node *port0, *port1, *ports;
struct device_node *codec0_port, *codec1_port;
struct device_node *ep0, *ep1;
- u32 val;
+ u32 val = 0;
int ret = -EINVAL;
/*
@@ -880,7 +897,8 @@ int audio_graph2_link_c2c(struct asoc_simple_priv *priv,
ports = of_get_parent(port0);
port1 = of_get_next_child(ports, lnk);
- if (!of_get_property(ports, "rate", &val)) {
+ of_property_read_u32(ports, "rate", &val);
+ if (!val) {
struct device *dev = simple_priv_to_dev(priv);
dev_err(dev, "Codec2Codec needs rate settings\n");
diff --git a/sound/soc/intel/boards/sof_rt5682.c b/sound/soc/intel/boards/sof_rt5682.c
index 7126fcb63d90..d9f83b04be87 100644
--- a/sound/soc/intel/boards/sof_rt5682.c
+++ b/sound/soc/intel/boards/sof_rt5682.c
@@ -435,6 +435,15 @@ static int sof_card_late_probe(struct snd_soc_card *card)
int err;
int i = 0;
+ if (sof_rt5682_quirk & SOF_MAX98373_SPEAKER_AMP_PRESENT) {
+ /* Disable Left and Right Spk pin after boot */
+ snd_soc_dapm_disable_pin(dapm, "Left Spk");
+ snd_soc_dapm_disable_pin(dapm, "Right Spk");
+ err = snd_soc_dapm_sync(dapm);
+ if (err < 0)
+ return err;
+ }
+
/* HDMI is not supported by SOF on Baytrail/CherryTrail */
if (is_legacy_cpu || !ctx->idisp_codec)
return 0;
@@ -468,15 +477,6 @@ static int sof_card_late_probe(struct snd_soc_card *card)
i++;
}
- if (sof_rt5682_quirk & SOF_MAX98373_SPEAKER_AMP_PRESENT) {
- /* Disable Left and Right Spk pin after boot */
- snd_soc_dapm_disable_pin(dapm, "Left Spk");
- snd_soc_dapm_disable_pin(dapm, "Right Spk");
- err = snd_soc_dapm_sync(dapm);
- if (err < 0)
- return err;
- }
-
return hdac_hdmi_jack_port_init(component, &card->dapm);
}
diff --git a/sound/soc/mediatek/mt6797/mt6797-mt6351.c b/sound/soc/mediatek/mt6797/mt6797-mt6351.c
index 496f32bcfb5e..d2f6213a6bfc 100644
--- a/sound/soc/mediatek/mt6797/mt6797-mt6351.c
+++ b/sound/soc/mediatek/mt6797/mt6797-mt6351.c
@@ -217,7 +217,8 @@ static int mt6797_mt6351_dev_probe(struct platform_device *pdev)
if (!codec_node) {
dev_err(&pdev->dev,
"Property 'audio-codec' missing or invalid\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto put_platform_node;
}
for_each_card_prelinks(card, i, dai_link) {
if (dai_link->codecs->name)
@@ -230,6 +231,9 @@ static int mt6797_mt6351_dev_probe(struct platform_device *pdev)
dev_err(&pdev->dev, "%s snd_soc_register_card fail %d\n",
__func__, ret);
+ of_node_put(codec_node);
+put_platform_node:
+ of_node_put(platform_node);
return ret;
}
diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
index 5716d9299066..c1e8633a74a7 100644
--- a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
+++ b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
@@ -256,14 +256,16 @@ static int mt8173_rt5650_rt5676_dev_probe(struct platform_device *pdev)
if (!mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[0].of_node) {
dev_err(&pdev->dev,
"Property 'audio-codec' missing or invalid\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto put_node;
}
mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node =
of_parse_phandle(pdev->dev.of_node, "mediatek,audio-codec", 1);
if (!mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node) {
dev_err(&pdev->dev,
"Property 'audio-codec' missing or invalid\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto put_node;
}
mt8173_rt5650_rt5676_codec_conf[0].dlc.of_node =
mt8173_rt5650_rt5676_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node;
@@ -276,13 +278,15 @@ static int mt8173_rt5650_rt5676_dev_probe(struct platform_device *pdev)
if (!mt8173_rt5650_rt5676_dais[DAI_LINK_HDMI_I2S].codecs->of_node) {
dev_err(&pdev->dev,
"Property 'audio-codec' missing or invalid\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto put_node;
}
card->dev = &pdev->dev;
ret = devm_snd_soc_register_card(&pdev->dev, card);
+put_node:
of_node_put(platform_node);
return ret;
}
diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650.c b/sound/soc/mediatek/mt8173/mt8173-rt5650.c
index fc164f4f95f8..487807ee7019 100644
--- a/sound/soc/mediatek/mt8173/mt8173-rt5650.c
+++ b/sound/soc/mediatek/mt8173/mt8173-rt5650.c
@@ -280,7 +280,8 @@ static int mt8173_rt5650_dev_probe(struct platform_device *pdev)
if (!mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[0].of_node) {
dev_err(&pdev->dev,
"Property 'audio-codec' missing or invalid\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto put_platform_node;
}
mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[1].of_node =
mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[0].of_node;
@@ -293,7 +294,7 @@ static int mt8173_rt5650_dev_probe(struct platform_device *pdev)
dev_err(&pdev->dev,
"%s codec_capture_dai name fail %d\n",
__func__, ret);
- return ret;
+ goto put_platform_node;
}
mt8173_rt5650_dais[DAI_LINK_CODEC_I2S].codecs[1].dai_name =
codec_capture_dai;
@@ -315,12 +316,14 @@ static int mt8173_rt5650_dev_probe(struct platform_device *pdev)
if (!mt8173_rt5650_dais[DAI_LINK_HDMI_I2S].codecs->of_node) {
dev_err(&pdev->dev,
"Property 'audio-codec' missing or invalid\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto put_platform_node;
}
card->dev = &pdev->dev;
ret = devm_snd_soc_register_card(&pdev->dev, card);
+put_platform_node:
of_node_put(platform_node);
return ret;
}
diff --git a/sound/soc/qcom/lpass-cpu.c b/sound/soc/qcom/lpass-cpu.c
index e6846ad2b5fa..964eb07f46d6 100644
--- a/sound/soc/qcom/lpass-cpu.c
+++ b/sound/soc/qcom/lpass-cpu.c
@@ -1090,6 +1090,7 @@ int asoc_qcom_lpass_cpu_platform_probe(struct platform_device *pdev)
dsp_of_node = of_parse_phandle(pdev->dev.of_node, "qcom,adsp", 0);
if (dsp_of_node) {
dev_err(dev, "DSP exists and holds audio resources\n");
+ of_node_put(dsp_of_node);
return -EBUSY;
}
diff --git a/sound/soc/qcom/qdsp6/q6adm.c b/sound/soc/qcom/qdsp6/q6adm.c
index 72c5719f1d25..a0678e8cf20a 100644
--- a/sound/soc/qcom/qdsp6/q6adm.c
+++ b/sound/soc/qcom/qdsp6/q6adm.c
@@ -217,7 +217,7 @@ static struct q6copp *q6adm_alloc_copp(struct q6adm *adm, int port_idx)
idx = find_first_zero_bit(&adm->copp_bitmap[port_idx],
MAX_COPPS_PER_PORT);
- if (idx > MAX_COPPS_PER_PORT)
+ if (idx >= MAX_COPPS_PER_PORT)
return ERR_PTR(-EBUSY);
c = kzalloc(sizeof(*c), GFP_ATOMIC);
diff --git a/sound/soc/samsung/aries_wm8994.c b/sound/soc/samsung/aries_wm8994.c
index 83acbe57b248..a0825da9fff9 100644
--- a/sound/soc/samsung/aries_wm8994.c
+++ b/sound/soc/samsung/aries_wm8994.c
@@ -628,8 +628,10 @@ static int aries_audio_probe(struct platform_device *pdev)
return -EINVAL;
codec = of_get_child_by_name(dev->of_node, "codec");
- if (!codec)
- return -EINVAL;
+ if (!codec) {
+ ret = -EINVAL;
+ goto out;
+ }
for_each_card_prelinks(card, i, dai_link) {
dai_link->codecs->of_node = of_parse_phandle(codec,
diff --git a/sound/soc/samsung/h1940_uda1380.c b/sound/soc/samsung/h1940_uda1380.c
index c994e67d1eaf..ca086243fcfd 100644
--- a/sound/soc/samsung/h1940_uda1380.c
+++ b/sound/soc/samsung/h1940_uda1380.c
@@ -8,7 +8,7 @@
// Based on version from Arnaud Patard <arnaud.patard@rtp-net.org>
#include <linux/types.h>
-#include <linux/gpio.h>
+#include <linux/gpio/consumer.h>
#include <linux/module.h>
#include <sound/soc.h>
diff --git a/sound/soc/samsung/rx1950_uda1380.c b/sound/soc/samsung/rx1950_uda1380.c
index 6ea1c8cc9167..2820097b00b9 100644
--- a/sound/soc/samsung/rx1950_uda1380.c
+++ b/sound/soc/samsung/rx1950_uda1380.c
@@ -128,7 +128,7 @@ static int rx1950_startup(struct snd_pcm_substream *substream)
&hw_rates);
}
-struct gpio_desc *gpiod_speaker_power;
+static struct gpio_desc *gpiod_speaker_power;
static int rx1950_spk_power(struct snd_soc_dapm_widget *w,
struct snd_kcontrol *kcontrol, int event)
@@ -227,7 +227,7 @@ static int rx1950_probe(struct platform_device *pdev)
return devm_snd_soc_register_card(dev, &rx1950_asoc);
}
-struct platform_driver rx1950_audio = {
+static struct platform_driver rx1950_audio = {
.driver = {
.name = "rx1950-audio",
.pm = &snd_soc_pm_ops,
diff --git a/sound/soc/sof/ipc3-topology.c b/sound/soc/sof/ipc3-topology.c
index 80fb82ece38d..74c230fbcf68 100644
--- a/sound/soc/sof/ipc3-topology.c
+++ b/sound/soc/sof/ipc3-topology.c
@@ -1629,6 +1629,7 @@ static int sof_ipc3_control_load_bytes(struct snd_sof_dev *sdev, struct snd_sof_
return 0;
err:
kfree(scontrol->ipc_control_data);
+ scontrol->ipc_control_data = NULL;
return ret;
}
diff --git a/sound/soc/sof/mediatek/mt8195/mt8195-loader.c b/sound/soc/sof/mediatek/mt8195/mt8195-loader.c
index ed18d6379e92..ef2664c3cd47 100644
--- a/sound/soc/sof/mediatek/mt8195/mt8195-loader.c
+++ b/sound/soc/sof/mediatek/mt8195/mt8195-loader.c
@@ -21,7 +21,7 @@ void sof_hifixdsp_boot_sequence(struct snd_sof_dev *sdev, u32 boot_addr)
/* pull high StatVectorSel to use AltResetVec (set bit4 to 1) */
snd_sof_dsp_update_bits(sdev, DSP_REG_BAR, DSP_RESET_SW,
- DSP_RESET_SW, DSP_RESET_SW);
+ STATVECTOR_SEL, STATVECTOR_SEL);
/* toggle DReset & BReset */
/* pull high DReset & BReset */
diff --git a/sound/soc/sof/sof-priv.h b/sound/soc/sof/sof-priv.h
index c856f0d84e49..f369242dd975 100644
--- a/sound/soc/sof/sof-priv.h
+++ b/sound/soc/sof/sof-priv.h
@@ -364,8 +364,8 @@ struct snd_sof_ipc_msg {
/**
* struct sof_ipc_pm_ops - IPC-specific PM ops
- * @ctx_save: Function pointer for context save
- * @ctx_restore: Function pointer for context restore
+ * @ctx_save: Optional function pointer for context save
+ * @ctx_restore: Optional function pointer for context restore
*/
struct sof_ipc_pm_ops {
int (*ctx_save)(struct snd_sof_dev *sdev);
diff --git a/sound/usb/bcd2000/bcd2000.c b/sound/usb/bcd2000/bcd2000.c
index cd4a0bc6d278..7aec0a95c609 100644
--- a/sound/usb/bcd2000/bcd2000.c
+++ b/sound/usb/bcd2000/bcd2000.c
@@ -348,7 +348,8 @@ static int bcd2000_init_midi(struct bcd2000 *bcd2k)
static void bcd2000_free_usb_related_resources(struct bcd2000 *bcd2k,
struct usb_interface *interface)
{
- /* usb_kill_urb not necessary, urb is aborted automatically */
+ usb_kill_urb(bcd2k->midi_out_urb);
+ usb_kill_urb(bcd2k->midi_in_urb);
usb_free_urb(bcd2k->midi_out_urb);
usb_free_urb(bcd2k->midi_in_urb);
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
index 968d90caeefa..168fd802d70b 100644
--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -1843,6 +1843,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER),
DEVICE_FLG(0x1395, 0x740a, /* Sennheiser DECT */
QUIRK_FLAG_GET_SAMPLE_RATE),
+ DEVICE_FLG(0x1397, 0x0507, /* Behringer UMC202HD */
+ QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB),
DEVICE_FLG(0x1397, 0x0508, /* Behringer UMC204HD */
QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB),
DEVICE_FLG(0x1397, 0x0509, /* Behringer UMC404HD */
diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index 97dec81950e5..6353a789322b 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -20,6 +20,10 @@ static const char * const link_type_name[] = {
[BPF_LINK_TYPE_CGROUP] = "cgroup",
[BPF_LINK_TYPE_ITER] = "iter",
[BPF_LINK_TYPE_NETNS] = "netns",
+ [BPF_LINK_TYPE_XDP] = "xdp",
+ [BPF_LINK_TYPE_PERF_EVENT] = "perf_event",
+ [BPF_LINK_TYPE_KPROBE_MULTI] = "kprobe_multi",
+ [BPF_LINK_TYPE_STRUCT_OPS] = "struct_ops",
};
static struct hashmap *link_table;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d14b10b85e51..ff9af73859ca 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1013,6 +1013,7 @@ enum bpf_link_type {
BPF_LINK_TYPE_XDP = 6,
BPF_LINK_TYPE_PERF_EVENT = 7,
BPF_LINK_TYPE_KPROBE_MULTI = 8,
+ BPF_LINK_TYPE_STRUCT_OPS = 9,
MAX_BPF_LINK_TYPE,
};
@@ -5143,6 +5144,17 @@ union bpf_attr {
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
* invalid arguments are passed.
+ *
+ * void *bpf_kptr_xchg(void *map_value, void *ptr)
+ * Description
+ * Exchange kptr at pointer *map_value* with *ptr*, and return the
+ * old value. *ptr* can be NULL, otherwise it must be a referenced
+ * pointer which will be released when this helper is called.
+ * Return
+ * The old value of kptr (which can be NULL). The returned pointer
+ * if not NULL, is a reference which must be released using its
+ * corresponding release function, or moved into a BPF map before
+ * program exit.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -5339,6 +5351,7 @@ union bpf_attr {
FN(copy_from_user_task), \
FN(skb_set_tstamp), \
FN(ima_file_hash), \
+ FN(kptr_xchg), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index e3a8c947e89f..479f514f7411 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -227,7 +227,7 @@ struct pt_regs___arm64 {
#define __PT_PARM5_REG a4
#define __PT_RET_REG ra
#define __PT_FP_REG s0
-#define __PT_RC_REG a5
+#define __PT_RC_REG a0
#define __PT_SP_REG sp
#define __PT_IP_REG pc
/* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index 927745b08014..23f5c46708f8 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -533,7 +533,7 @@ void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *attach_name,
gen->attach_kind = kind;
ret = snprintf(gen->attach_target, sizeof(gen->attach_target), "%s%s",
prefix, attach_name);
- if (ret == sizeof(gen->attach_target))
+ if (ret >= sizeof(gen->attach_target))
gen->error = -ENOSPC;
}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 881ea905ca81..2323d3e38a8e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4271,7 +4271,7 @@ static int bpf_get_map_info_from_fdinfo(int fd, struct bpf_map_info *info)
int bpf_map__reuse_fd(struct bpf_map *map, int fd)
{
struct bpf_map_info info = {};
- __u32 len = sizeof(info);
+ __u32 len = sizeof(info), name_len;
int new_fd, err;
char *new_name;
@@ -4281,7 +4281,12 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd)
if (err)
return libbpf_err(err);
- new_name = strdup(info.name);
+ name_len = strlen(info.name);
+ if (name_len == BPF_OBJ_NAME_LEN - 1 && strncmp(map->name, info.name, name_len) == 0)
+ new_name = strdup(map->name);
+ else
+ new_name = strdup(info.name);
+
if (!new_name)
return libbpf_err(-errno);
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index af136f73b09d..67dc010e9fe3 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -1147,8 +1147,6 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
goto out_mmap_tx;
}
- ctx->prog_fd = -1;
-
if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
err = __xsk_setup_xdp_prog(xsk, NULL);
if (err)
@@ -1229,7 +1227,10 @@ void xsk_socket__delete(struct xsk_socket *xsk)
ctx = xsk->ctx;
umem = ctx->umem;
- if (ctx->prog_fd != -1) {
+
+ xsk_put_ctx(ctx, true);
+
+ if (!ctx->refcount) {
xsk_delete_bpf_maps(xsk);
close(ctx->prog_fd);
if (ctx->has_bpf_link)
@@ -1248,8 +1249,6 @@ void xsk_socket__delete(struct xsk_socket *xsk)
}
}
- xsk_put_ctx(ctx, true);
-
umem->refcount--;
/* Do not close an fd that also has an associated umem connected
* to it.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f058e8cddfa8..e15fd6f9f7ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1663,12 +1663,6 @@ static int add_default_attributes(void)
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
-};
- struct perf_event_attr default_sw_attrs[] = {
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
};
/*
@@ -1910,30 +1904,6 @@ static int add_default_attributes(void)
}
if (!evsel_list->core.nr_entries) {
- if (perf_pmu__has_hybrid()) {
- struct parse_events_error errinfo;
- const char *hybrid_str = "cycles,instructions,branches,branch-misses";
-
- if (target__has_cpu(&target))
- default_sw_attrs[0].config = PERF_COUNT_SW_CPU_CLOCK;
-
- if (evlist__add_default_attrs(evsel_list,
- default_sw_attrs) < 0) {
- return -1;
- }
-
- parse_events_error__init(&errinfo);
- err = parse_events(evsel_list, hybrid_str, &errinfo);
- if (err) {
- fprintf(stderr,
- "Cannot set up hybrid events %s: %d\n",
- hybrid_str, err);
- parse_events_error__print(&errinfo, hybrid_str);
- }
- parse_events_error__exit(&errinfo);
- return err ? -1 : 0;
- }
-
if (target__has_cpu(&target))
default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c
index b97366f77bbf..2bd23e4cf19e 100644
--- a/tools/perf/util/dsos.c
+++ b/tools/perf/util/dsos.c
@@ -23,8 +23,19 @@ static int __dso_id__cmp(struct dso_id *a, struct dso_id *b)
if (a->ino > b->ino) return -1;
if (a->ino < b->ino) return 1;
- if (a->ino_generation > b->ino_generation) return -1;
- if (a->ino_generation < b->ino_generation) return 1;
+ /*
+ * Synthesized MMAP events have zero ino_generation, avoid comparing
+ * them with MMAP events with actual ino_generation.
+ *
+ * I found it harmful because the mismatch resulted in a new
+ * dso that did not have a build ID whereas the original dso did have a
+ * build ID. The build ID was essential because the object was not found
+ * otherwise. - Adrian
+ */
+ if (a->ino_generation && b->ino_generation) {
+ if (a->ino_generation > b->ino_generation) return -1;
+ if (a->ino_generation < b->ino_generation) return 1;
+ }
return 0;
}
diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
index aed49806a09b..953338b9e887 100644
--- a/tools/perf/util/genelf.c
+++ b/tools/perf/util/genelf.c
@@ -30,7 +30,11 @@
#define BUILD_ID_URANDOM /* different uuid for each run */
-#ifdef HAVE_LIBCRYPTO
+// FIXME, remove this and fix the deprecation warnings before its removed and
+// We'll break for good here...
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
+
+#ifdef HAVE_LIBCRYPTO_SUPPORT
#define BUILD_ID_MD5
#undef BUILD_ID_SHA /* does not seem to work well when linked with Java */
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index ef6ced5c5746..cb7b24493782 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1294,16 +1294,29 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
if (elf_read_program_header(syms_ss->elf,
(u64)sym.st_value, &phdr)) {
- pr_warning("%s: failed to find program header for "
+ pr_debug4("%s: failed to find program header for "
"symbol: %s st_value: %#" PRIx64 "\n",
__func__, elf_name, (u64)sym.st_value);
- continue;
+ pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
+ "sh_addr: %#" PRIx64 " sh_offset: %#" PRIx64 "\n",
+ __func__, (u64)sym.st_value, (u64)shdr.sh_addr,
+ (u64)shdr.sh_offset);
+ /*
+ * Fail to find program header, let's rollback
+ * to use shdr.sh_addr and shdr.sh_offset to
+ * calibrate symbol's file address, though this
+ * is not necessary for normal C ELF file, we
+ * still need to handle java JIT symbols in this
+ * case.
+ */
+ sym.st_value -= shdr.sh_addr - shdr.sh_offset;
+ } else {
+ pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
+ "p_vaddr: %#" PRIx64 " p_offset: %#" PRIx64 "\n",
+ __func__, (u64)sym.st_value, (u64)phdr.p_vaddr,
+ (u64)phdr.p_offset);
+ sym.st_value -= phdr.p_vaddr - phdr.p_offset;
}
- pr_debug4("%s: adjusting symbol: st_value: %#" PRIx64 " "
- "p_vaddr: %#" PRIx64 " p_offset: %#" PRIx64 "\n",
- __func__, (u64)sym.st_value, (u64)phdr.p_vaddr,
- (u64)phdr.p_offset);
- sym.st_value -= phdr.p_vaddr - phdr.p_offset;
}
demangled = demangle_sym(dso, kmodule, elf_name);
diff --git a/tools/power/x86/intel-speed-select/isst-daemon.c b/tools/power/x86/intel-speed-select/isst-daemon.c
index dd372924bc82..d0400c6684ba 100644
--- a/tools/power/x86/intel-speed-select/isst-daemon.c
+++ b/tools/power/x86/intel-speed-select/isst-daemon.c
@@ -41,7 +41,7 @@ void process_level_change(int cpu)
time_t tm;
int ret;
- if (pkg_id >= MAX_PACKAGE_COUNT || die_id > MAX_DIE_PER_PACKAGE) {
+ if (pkg_id >= MAX_PACKAGE_COUNT || die_id >= MAX_DIE_PER_PACKAGE) {
debug_printf("Invalid package/die info for cpu:%d\n", cpu);
return;
}
diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
index ec823561b912..a294176f8a9d 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf.c
@@ -5226,7 +5226,7 @@ static void do_test_pprint(int test_num)
ret = snprintf(pin_path, sizeof(pin_path), "%s/%s",
"/sys/fs/bpf", test->map_name);
- if (CHECK(ret == sizeof(pin_path), "pin_path %s/%s is too long",
+ if (CHECK(ret >= sizeof(pin_path), "pin_path %s/%s is too long",
"/sys/fs/bpf", test->map_name)) {
err = -1;
goto done;
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
index 3ee2107bbf7a..58b03d1a70c8 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_stress.c
@@ -7,11 +7,9 @@
void test_fexit_stress(void)
{
- char test_skb[128] = {};
int fexit_fd[CNT] = {};
int link_fd[CNT] = {};
- char error[4096];
- int err, i, filter_fd;
+ int err, i;
const struct bpf_insn trace_program[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
@@ -20,25 +18,9 @@ void test_fexit_stress(void)
LIBBPF_OPTS(bpf_prog_load_opts, trace_opts,
.expected_attach_type = BPF_TRACE_FEXIT,
- .log_buf = error,
- .log_size = sizeof(error),
);
- const struct bpf_insn skb_program[] = {
- BPF_MOV64_IMM(BPF_REG_0, 0),
- BPF_EXIT_INSN(),
- };
-
- LIBBPF_OPTS(bpf_prog_load_opts, skb_opts,
- .log_buf = error,
- .log_size = sizeof(error),
- );
-
- LIBBPF_OPTS(bpf_test_run_opts, topts,
- .data_in = test_skb,
- .data_size_in = sizeof(test_skb),
- .repeat = 1,
- );
+ LIBBPF_OPTS(bpf_test_run_opts, topts);
err = libbpf_find_vmlinux_btf_id("bpf_fentry_test1",
trace_opts.expected_attach_type);
@@ -58,15 +40,9 @@ void test_fexit_stress(void)
goto out;
}
- filter_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL",
- skb_program, sizeof(skb_program) / sizeof(struct bpf_insn),
- &skb_opts);
- if (!ASSERT_GE(filter_fd, 0, "test_program_loaded"))
- goto out;
+ err = bpf_prog_test_run_opts(fexit_fd[0], &topts);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
- err = bpf_prog_test_run_opts(filter_fd, &topts);
- close(filter_fd);
- CHECK_FAIL(err);
out:
for (i = 0; i < CNT; i++) {
if (link_fd[i])
diff --git a/tools/testing/selftests/bpf/prog_tests/sock_fields.c b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
index 9d211b5c22c4..7d23166c77af 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_fields.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_fields.c
@@ -394,7 +394,6 @@ void serial_test_sock_fields(void)
test();
done:
- test_sock_fields__detach(skel);
test_sock_fields__destroy(skel);
if (child_cg_fd >= 0)
close(child_cg_fd);
diff --git a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
index 7ad66a247c02..b2e415647bd7 100644
--- a/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
+++ b/tools/testing/selftests/bpf/prog_tests/tc_redirect.c
@@ -646,7 +646,7 @@ static void test_tcp_clear_dtime(struct test_tc_dtime *skel)
__u32 *errs = skel->bss->errs[t];
skel->bss->test = t;
- test_inet_dtime(AF_INET6, SOCK_STREAM, IP6_DST, 0);
+ test_inet_dtime(AF_INET6, SOCK_STREAM, IP6_DST, 50000 + t);
ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
dtime_cnt_str(t, INGRESS_FWDNS_P100));
@@ -683,7 +683,7 @@ static void test_tcp_dtime(struct test_tc_dtime *skel, int family, bool bpf_fwd)
errs = skel->bss->errs[t];
skel->bss->test = t;
- test_inet_dtime(family, SOCK_STREAM, addr, 0);
+ test_inet_dtime(family, SOCK_STREAM, addr, 50000 + t);
/* fwdns_prio100 prog does not read delivery_time_type, so
* kernel puts the (rcv) timetamp in __sk_buff->tstamp
@@ -715,13 +715,13 @@ static void test_udp_dtime(struct test_tc_dtime *skel, int family, bool bpf_fwd)
errs = skel->bss->errs[t];
skel->bss->test = t;
- test_inet_dtime(family, SOCK_DGRAM, addr, 0);
+ test_inet_dtime(family, SOCK_DGRAM, addr, 50000 + t);
ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
dtime_cnt_str(t, INGRESS_FWDNS_P100));
/* non mono delivery time is not forwarded */
ASSERT_EQ(dtimes[INGRESS_FWDNS_P101], 0,
- dtime_cnt_str(t, INGRESS_FWDNS_P100));
+ dtime_cnt_str(t, INGRESS_FWDNS_P101));
for (i = EGRESS_FWDNS_P100; i < SET_DTIME; i++)
ASSERT_GT(dtimes[i], 0, dtime_cnt_str(t, i));
diff --git a/tools/testing/selftests/bpf/progs/test_tc_dtime.c b/tools/testing/selftests/bpf/progs/test_tc_dtime.c
index 06f300d06dbd..b596479a9ebe 100644
--- a/tools/testing/selftests/bpf/progs/test_tc_dtime.c
+++ b/tools/testing/selftests/bpf/progs/test_tc_dtime.c
@@ -11,6 +11,8 @@
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include <sys/socket.h>
@@ -115,6 +117,19 @@ static bool bpf_fwd(void)
return test < TCP_IP4_RT_FWD;
}
+static __u8 get_proto(void)
+{
+ switch (test) {
+ case UDP_IP4:
+ case UDP_IP6:
+ case UDP_IP4_RT_FWD:
+ case UDP_IP6_RT_FWD:
+ return IPPROTO_UDP;
+ default:
+ return IPPROTO_TCP;
+ }
+}
+
/* -1: parse error: TC_ACT_SHOT
* 0: not testing traffic: TC_ACT_OK
* >0: first byte is the inet_proto, second byte has the netns
@@ -122,11 +137,16 @@ static bool bpf_fwd(void)
*/
static int skb_get_type(struct __sk_buff *skb)
{
+ __u16 dst_ns_port = __bpf_htons(50000 + test);
void *data_end = ctx_ptr(skb->data_end);
void *data = ctx_ptr(skb->data);
__u8 inet_proto = 0, ns = 0;
struct ipv6hdr *ip6h;
+ __u16 sport, dport;
struct iphdr *iph;
+ struct tcphdr *th;
+ struct udphdr *uh;
+ void *trans;
switch (skb->protocol) {
case __bpf_htons(ETH_P_IP):
@@ -138,6 +158,7 @@ static int skb_get_type(struct __sk_buff *skb)
else if (iph->saddr == ip4_dst)
ns = DST_NS;
inet_proto = iph->protocol;
+ trans = iph + 1;
break;
case __bpf_htons(ETH_P_IPV6):
ip6h = data + sizeof(struct ethhdr);
@@ -148,15 +169,43 @@ static int skb_get_type(struct __sk_buff *skb)
else if (v6_equal(ip6h->saddr, (struct in6_addr)ip6_dst))
ns = DST_NS;
inet_proto = ip6h->nexthdr;
+ trans = ip6h + 1;
break;
default:
return 0;
}
- if ((inet_proto != IPPROTO_TCP && inet_proto != IPPROTO_UDP) || !ns)
+ /* skb is not from src_ns or dst_ns.
+ * skb is not the testing IPPROTO.
+ */
+ if (!ns || inet_proto != get_proto())
return 0;
- return (ns << 8 | inet_proto);
+ switch (inet_proto) {
+ case IPPROTO_TCP:
+ th = trans;
+ if (th + 1 > data_end)
+ return -1;
+ sport = th->source;
+ dport = th->dest;
+ break;
+ case IPPROTO_UDP:
+ uh = trans;
+ if (uh + 1 > data_end)
+ return -1;
+ sport = uh->source;
+ dport = uh->dest;
+ break;
+ default:
+ return 0;
+ }
+
+ /* The skb is the testing traffic */
+ if ((ns == SRC_NS && dport == dst_ns_port) ||
+ (ns == DST_NS && sport == dst_ns_port))
+ return (ns << 8 | inet_proto);
+
+ return 0;
}
/* format: direction@iface@netns
diff --git a/tools/testing/selftests/bpf/verifier/ref_tracking.c b/tools/testing/selftests/bpf/verifier/ref_tracking.c
index fbd682520e47..57a83d763ec1 100644
--- a/tools/testing/selftests/bpf/verifier/ref_tracking.c
+++ b/tools/testing/selftests/bpf/verifier/ref_tracking.c
@@ -796,7 +796,7 @@
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = REJECT,
- .errstr = "reference has not been acquired before",
+ .errstr = "R1 must be referenced when passed to release function",
},
{
/* !bpf_sk_fullsock(sk) is checked but !bpf_tcp_sock(sk) is not checked */
diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c
index 86b24cad27a7..d11d0b28be41 100644
--- a/tools/testing/selftests/bpf/verifier/sock.c
+++ b/tools/testing/selftests/bpf/verifier/sock.c
@@ -417,7 +417,7 @@
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = REJECT,
- .errstr = "reference has not been acquired before",
+ .errstr = "R1 must be referenced when passed to release function",
},
{
"bpf_sk_release(bpf_sk_fullsock(skb->sk))",
@@ -436,7 +436,7 @@
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = REJECT,
- .errstr = "reference has not been acquired before",
+ .errstr = "R1 must be referenced when passed to release function",
},
{
"bpf_sk_release(bpf_tcp_sock(skb->sk))",
@@ -455,7 +455,7 @@
},
.prog_type = BPF_PROG_TYPE_SCHED_CLS,
.result = REJECT,
- .errstr = "reference has not been acquired before",
+ .errstr = "R1 must be referenced when passed to release function",
},
{
"sk_storage_get(map, skb->sk, NULL, 0): value == NULL",
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 33ea5e9955d9..b9984124a294 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -1423,7 +1423,7 @@ uint64_t kvm_hypercall(uint64_t nr, uint64_t a0, uint64_t a1, uint64_t a2,
asm volatile("vmcall"
: "=a"(r)
- : "b"(a0), "c"(a1), "d"(a2), "S"(a3));
+ : "a"(nr), "b"(a0), "c"(a1), "d"(a2), "S"(a3));
return r;
}
diff --git a/tools/testing/selftests/powerpc/papr_attributes/attr_test.c b/tools/testing/selftests/powerpc/papr_attributes/attr_test.c
index bab0dc06e90b..9b655be641c9 100644
--- a/tools/testing/selftests/powerpc/papr_attributes/attr_test.c
+++ b/tools/testing/selftests/powerpc/papr_attributes/attr_test.c
@@ -7,6 +7,7 @@
* Copyright 2022, Pratik Rajesh Sampat, IBM Corp.
*/
+#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <dirent.h>
@@ -32,7 +33,7 @@ enum type {
NUM_VAL
};
-int value_type(int id)
+static int value_type(int id)
{
int val_type;
@@ -54,15 +55,21 @@ int value_type(int id)
return val_type;
}
-int verify_energy_info(void)
+static int verify_energy_info(void)
{
const char *path = "/sys/firmware/papr/energy_scale_info";
struct dirent *entry;
struct stat s;
DIR *dirp;
- if (stat(path, &s) || !S_ISDIR(s.st_mode))
- return -1;
+ errno = 0;
+ if (stat(path, &s)) {
+ SKIP_IF(errno == ENOENT);
+ FAIL_IF(errno);
+ }
+
+ FAIL_IF(!S_ISDIR(s.st_mode));
+
dirp = opendir(path);
while ((entry = readdir(dirp)) != NULL) {
@@ -76,25 +83,24 @@ int verify_energy_info(void)
id = atoi(entry->d_name);
attr_type = value_type(id);
- if (attr_type == INVALID)
- return -1;
+ FAIL_IF(attr_type == INVALID);
/* Check if the files exist and have data in them */
sprintf(file_name, "%s/%d/desc", path, id);
f = fopen(file_name, "r");
- if (!f || fgetc(f) == EOF)
- return -1;
+ FAIL_IF(!f);
+ FAIL_IF(fgetc(f) == EOF);
sprintf(file_name, "%s/%d/value", path, id);
f = fopen(file_name, "r");
- if (!f || fgetc(f) == EOF)
- return -1;
+ FAIL_IF(!f);
+ FAIL_IF(fgetc(f) == EOF);
if (attr_type == STR_VAL) {
sprintf(file_name, "%s/%d/value_desc", path, id);
f = fopen(file_name, "r");
- if (!f || fgetc(f) == EOF)
- return -1;
+ FAIL_IF(!f);
+ FAIL_IF(fgetc(f) == EOF);
}
}
diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index 55b2c1533282..f2768994d861 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -44,6 +44,7 @@ TORTURE_KCONFIG_KASAN_ARG=""
TORTURE_KCONFIG_KCSAN_ARG=""
TORTURE_KMAKE_ARG=""
TORTURE_QEMU_MEM=512
+torture_qemu_mem_default=1
TORTURE_REMOTE=
TORTURE_SHUTDOWN_GRACE=180
TORTURE_SUITE=rcu
@@ -163,7 +164,7 @@ do
shift
;;
--gdb)
- TORTURE_KCONFIG_GDB_ARG="CONFIG_DEBUG_INFO=y"; export TORTURE_KCONFIG_GDB_ARG
+ TORTURE_KCONFIG_GDB_ARG="CONFIG_DEBUG_INFO_NONE=n CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y"; export TORTURE_KCONFIG_GDB_ARG
TORTURE_BOOT_GDB_ARG="nokaslr"; export TORTURE_BOOT_GDB_ARG
TORTURE_QEMU_GDB_ARG="-s -S"; export TORTURE_QEMU_GDB_ARG
;;
@@ -179,7 +180,11 @@ do
shift
;;
--kasan)
- TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
+ TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO_NONE=n CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
+ if test -n "$torture_qemu_mem_default"
+ then
+ TORTURE_QEMU_MEM=2G
+ fi
;;
--kconfig|--kconfigs)
checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$' '^error$'
@@ -187,7 +192,7 @@ do
shift
;;
--kcsan)
- TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KCSAN=y CONFIG_KCSAN_STRICT=y CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y"; export TORTURE_KCONFIG_KCSAN_ARG
+ TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO_NONE=n CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y CONFIG_KCSAN=y CONFIG_KCSAN_STRICT=y CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y"; export TORTURE_KCONFIG_KCSAN_ARG
;;
--kmake-arg|--kmake-args)
checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
@@ -202,6 +207,7 @@ do
--memory)
checkarg --memory "(memory size)" $# "$2" '^[0-9]\+[MG]\?$' error
TORTURE_QEMU_MEM=$2
+ torture_qemu_mem_default=
shift
;;
--no-initrd)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 313bb0cbfb1e..6f65eeb6a3dd 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -802,7 +802,7 @@ void kill_thread_or_group(struct __test_metadata *_metadata,
.len = (unsigned short)ARRAY_SIZE(filter_thread),
.filter = filter_thread,
};
- int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAAA;
+ int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAA;
struct sock_filter filter_process[] = {
BPF_STMT(BPF_LD|BPF_W|BPF_ABS,
offsetof(struct seccomp_data, nr)),
diff --git a/tools/testing/selftests/timers/clocksource-switch.c b/tools/testing/selftests/timers/clocksource-switch.c
index ef8eb3604595..b57f0a9be490 100644
--- a/tools/testing/selftests/timers/clocksource-switch.c
+++ b/tools/testing/selftests/timers/clocksource-switch.c
@@ -110,10 +110,10 @@ int run_tests(int secs)
sprintf(buf, "./inconsistency-check -t %i", secs);
ret = system(buf);
- if (ret)
- return ret;
+ if (WIFEXITED(ret) && WEXITSTATUS(ret))
+ return WEXITSTATUS(ret);
ret = system("./nanosleep");
- return ret;
+ return WIFEXITED(ret) ? WEXITSTATUS(ret) : 0;
}
diff --git a/tools/testing/selftests/timers/valid-adjtimex.c b/tools/testing/selftests/timers/valid-adjtimex.c
index 5397de708d3c..48b9a803235a 100644
--- a/tools/testing/selftests/timers/valid-adjtimex.c
+++ b/tools/testing/selftests/timers/valid-adjtimex.c
@@ -40,7 +40,7 @@
#define ADJ_SETOFFSET 0x0100
#include <sys/syscall.h>
-static int clock_adjtime(clockid_t id, struct timex *tx)
+int clock_adjtime(clockid_t id, struct timex *tx)
{
return syscall(__NR_clock_adjtime, id, tx);
}
diff --git a/tools/testing/selftests/vm/hugepage-mremap.c b/tools/testing/selftests/vm/hugepage-mremap.c
index 1d689084a54b..22546f279534 100644
--- a/tools/testing/selftests/vm/hugepage-mremap.c
+++ b/tools/testing/selftests/vm/hugepage-mremap.c
@@ -107,7 +107,7 @@ static void register_region_with_uffd(char *addr, size_t len)
int main(int argc, char *argv[])
{
- size_t length;
+ size_t length = 0;
if (argc != 2 && argc != 3) {
printf("Usage: %s [length_in_MB] <hugetlb_file>\n", argv[0]);
diff --git a/tools/testing/selftests/vm/hugetlb-madvise.c b/tools/testing/selftests/vm/hugetlb-madvise.c
index 6c6af40f5747..3c9943131881 100644
--- a/tools/testing/selftests/vm/hugetlb-madvise.c
+++ b/tools/testing/selftests/vm/hugetlb-madvise.c
@@ -89,10 +89,11 @@ void write_fault_pages(void *addr, unsigned long nr_pages)
void read_fault_pages(void *addr, unsigned long nr_pages)
{
- unsigned long i, tmp;
+ unsigned long dummy = 0;
+ unsigned long i;
for (i = 0; i < nr_pages; i++)
- tmp += *((unsigned long *)(addr + (i * huge_page_size)));
+ dummy += *((unsigned long *)(addr + (i * huge_page_size)));
}
int main(int argc, char **argv)
diff --git a/tools/testing/selftests/wireguard/qemu/arch/riscv32.config b/tools/testing/selftests/wireguard/qemu/arch/riscv32.config
index 0bd0e72d95d4..2fc36efb166d 100644
--- a/tools/testing/selftests/wireguard/qemu/arch/riscv32.config
+++ b/tools/testing/selftests/wireguard/qemu/arch/riscv32.config
@@ -1,3 +1,4 @@
+CONFIG_NONPORTABLE=y
CONFIG_ARCH_RV32I=y
CONFIG_MMU=y
CONFIG_FPU=y
diff --git a/tools/thermal/tmon/sysfs.c b/tools/thermal/tmon/sysfs.c
index b00b1bfd9d8e..cb1108bc9249 100644
--- a/tools/thermal/tmon/sysfs.c
+++ b/tools/thermal/tmon/sysfs.c
@@ -13,6 +13,7 @@
#include <stdint.h>
#include <dirent.h>
#include <libintl.h>
+#include <limits.h>
#include <ctype.h>
#include <time.h>
#include <syslog.h>
@@ -33,9 +34,9 @@ int sysfs_set_ulong(char *path, char *filename, unsigned long val)
{
FILE *fd;
int ret = -1;
- char filepath[256];
+ char filepath[PATH_MAX + 2]; /* NUL and '/' */
- snprintf(filepath, 256, "%s/%s", path, filename);
+ snprintf(filepath, sizeof(filepath), "%s/%s", path, filename);
fd = fopen(filepath, "w");
if (!fd) {
@@ -57,9 +58,9 @@ static int sysfs_get_ulong(char *path, char *filename, unsigned long *p_ulong)
{
FILE *fd;
int ret = -1;
- char filepath[256];
+ char filepath[PATH_MAX + 2]; /* NUL and '/' */
- snprintf(filepath, 256, "%s/%s", path, filename);
+ snprintf(filepath, sizeof(filepath), "%s/%s", path, filename);
fd = fopen(filepath, "r");
if (!fd) {
@@ -76,9 +77,9 @@ static int sysfs_get_string(char *path, char *filename, char *str)
{
FILE *fd;
int ret = -1;
- char filepath[256];
+ char filepath[PATH_MAX + 2]; /* NUL and '/' */
- snprintf(filepath, 256, "%s/%s", path, filename);
+ snprintf(filepath, sizeof(filepath), "%s/%s", path, filename);
fd = fopen(filepath, "r");
if (!fd) {
@@ -199,8 +200,8 @@ static int find_tzone_cdev(struct dirent *nl, char *tz_name,
{
unsigned long trip_instance = 0;
char cdev_name_linked[256];
- char cdev_name[256];
- char cdev_trip_name[256];
+ char cdev_name[PATH_MAX];
+ char cdev_trip_name[PATH_MAX];
int cdev_id;
if (nl->d_type == DT_LNK) {
@@ -213,7 +214,8 @@ static int find_tzone_cdev(struct dirent *nl, char *tz_name,
return -EINVAL;
}
/* find the link to real cooling device record binding */
- snprintf(cdev_name, 256, "%s/%s", tz_name, nl->d_name);
+ snprintf(cdev_name, sizeof(cdev_name) - 2, "%s/%s",
+ tz_name, nl->d_name);
memset(cdev_name_linked, 0, sizeof(cdev_name_linked));
if (readlink(cdev_name, cdev_name_linked,
sizeof(cdev_name_linked) - 1) != -1) {
@@ -226,8 +228,8 @@ static int find_tzone_cdev(struct dirent *nl, char *tz_name,
/* find the trip point in which the cdev is binded to
* in this tzone
*/
- snprintf(cdev_trip_name, 256, "%s%s", nl->d_name,
- "_trip_point");
+ snprintf(cdev_trip_name, sizeof(cdev_trip_name) - 1,
+ "%s%s", nl->d_name, "_trip_point");
sysfs_get_ulong(tz_name, cdev_trip_name,
&trip_instance);
/* validate trip point range, e.g. trip could return -1
diff --git a/tools/thermal/tmon/tmon.h b/tools/thermal/tmon/tmon.h
index c9066ec104dd..44d16d778f04 100644
--- a/tools/thermal/tmon/tmon.h
+++ b/tools/thermal/tmon/tmon.h
@@ -27,6 +27,9 @@
#define NR_LINES_TZDATA 1
#define TMON_LOG_FILE "/var/tmp/tmon.log"
+#include <sys/time.h>
+#include <pthread.h>
+
extern unsigned long ticktime;
extern double time_elapsed;
extern unsigned long target_temp_user;
diff --git a/tools/tracing/rtla/Makefile b/tools/tracing/rtla/Makefile
index 3822f4ea5f49..1bea2d16d4c1 100644
--- a/tools/tracing/rtla/Makefile
+++ b/tools/tracing/rtla/Makefile
@@ -1,6 +1,6 @@
NAME := rtla
# Follow the kernel version
-VERSION := $(shell cat VERSION 2> /dev/null || make -sC ../../.. kernelversion)
+VERSION := $(shell cat VERSION 2> /dev/null || make -sC ../../.. kernelversion | grep -v make)
# From libtracefs:
# Makefiles suck: This macro sets a default value of $(2) for the
diff --git a/tools/tracing/rtla/src/trace.c b/tools/tracing/rtla/src/trace.c
index 5784c9f9e570..e1ba6d9f4265 100644
--- a/tools/tracing/rtla/src/trace.c
+++ b/tools/tracing/rtla/src/trace.c
@@ -134,13 +134,18 @@ void trace_instance_destroy(struct trace_instance *trace)
if (trace->inst) {
disable_tracer(trace->inst);
destroy_instance(trace->inst);
+ trace->inst = NULL;
}
- if (trace->seq)
+ if (trace->seq) {
free(trace->seq);
+ trace->seq = NULL;
+ }
- if (trace->tep)
+ if (trace->tep) {
tep_free(trace->tep);
+ trace->tep = NULL;
+ }
}
/*
diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c
index 5352167a1e75..5ae2fa96fde1 100644
--- a/tools/tracing/rtla/src/utils.c
+++ b/tools/tracing/rtla/src/utils.c
@@ -106,8 +106,9 @@ int parse_cpu_list(char *cpu_list, char **monitored_cpus)
nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
- mon_cpus = malloc(nr_cpus * sizeof(char));
- memset(mon_cpus, 0, (nr_cpus * sizeof(char)));
+ mon_cpus = calloc(nr_cpus, sizeof(char));
+ if (!mon_cpus)
+ goto err;
for (p = cpu_list; *p; ) {
cpu = atoi(p);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7f1d19689701..843396ed4ad3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -724,6 +724,15 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
kvm->mn_active_invalidate_count++;
spin_unlock(&kvm->mn_invalidate_lock);
+ /*
+ * Invalidate pfn caches _before_ invalidating the secondary MMUs, i.e.
+ * before acquiring mmu_lock, to avoid holding mmu_lock while acquiring
+ * each cache's lock. There are relatively few caches in existence at
+ * any given time, and the caches themselves can check for hva overlap,
+ * i.e. don't need to rely on memslot overlap checks for performance.
+ * Because this runs without holding mmu_lock, the pfn caches must use
+ * mn_active_invalidate_count (see above) instead of mmu_notifier_count.
+ */
gfn_to_pfn_cache_invalidate_start(kvm, range->start, range->end,
hva_range.may_block);
@@ -2843,16 +2852,28 @@ void kvm_release_pfn_dirty(kvm_pfn_t pfn)
}
EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
+static bool kvm_is_ad_tracked_pfn(kvm_pfn_t pfn)
+{
+ if (!pfn_valid(pfn))
+ return false;
+
+ /*
+ * Per page-flags.h, pages tagged PG_reserved "should in general not be
+ * touched (e.g. set dirty) except by its owner".
+ */
+ return !PageReserved(pfn_to_page(pfn));
+}
+
void kvm_set_pfn_dirty(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
+ if (kvm_is_ad_tracked_pfn(pfn))
SetPageDirty(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
void kvm_set_pfn_accessed(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
+ if (kvm_is_ad_tracked_pfn(pfn))
mark_page_accessed(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index dd84676615f1..b0b678367376 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -95,7 +95,7 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
}
EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_check);
-static void __release_gpc(struct kvm *kvm, kvm_pfn_t pfn, void *khva, gpa_t gpa)
+static void gpc_release_pfn_and_khva(struct kvm *kvm, kvm_pfn_t pfn, void *khva)
{
/* Unmap the old page if it was mapped before, and release it */
if (!is_error_noslot_pfn(pfn)) {
@@ -112,31 +112,122 @@ static void __release_gpc(struct kvm *kvm, kvm_pfn_t pfn, void *khva, gpa_t gpa)
}
}
-static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, unsigned long uhva)
+static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_seq)
{
+ /*
+ * mn_active_invalidate_count acts for all intents and purposes
+ * like mmu_notifier_count here; but the latter cannot be used
+ * here because the invalidation of caches in the mmu_notifier
+ * event occurs _before_ mmu_notifier_count is elevated.
+ *
+ * Note, it does not matter that mn_active_invalidate_count
+ * is not protected by gpc->lock. It is guaranteed to
+ * be elevated before the mmu_notifier acquires gpc->lock, and
+ * isn't dropped until after mmu_notifier_seq is updated.
+ */
+ if (kvm->mn_active_invalidate_count)
+ return true;
+
+ /*
+ * Ensure mn_active_invalidate_count is read before
+ * mmu_notifier_seq. This pairs with the smp_wmb() in
+ * mmu_notifier_invalidate_range_end() to guarantee either the
+ * old (non-zero) value of mn_active_invalidate_count or the
+ * new (incremented) value of mmu_notifier_seq is observed.
+ */
+ smp_rmb();
+ return kvm->mmu_notifier_seq != mmu_seq;
+}
+
+static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
+{
+ /* Note, the new page offset may be different than the old! */
+ void *old_khva = gpc->khva - offset_in_page(gpc->khva);
+ kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
+ void *new_khva = NULL;
unsigned long mmu_seq;
- kvm_pfn_t new_pfn;
- int retry;
+
+ lockdep_assert_held(&gpc->refresh_lock);
+
+ lockdep_assert_held_write(&gpc->lock);
+
+ /*
+ * Invalidate the cache prior to dropping gpc->lock, the gpa=>uhva
+ * assets have already been updated and so a concurrent check() from a
+ * different task may not fail the gpa/uhva/generation checks.
+ */
+ gpc->valid = false;
do {
mmu_seq = kvm->mmu_notifier_seq;
smp_rmb();
+ write_unlock_irq(&gpc->lock);
+
+ /*
+ * If the previous iteration "failed" due to an mmu_notifier
+ * event, release the pfn and unmap the kernel virtual address
+ * from the previous attempt. Unmapping might sleep, so this
+ * needs to be done after dropping the lock. Opportunistically
+ * check for resched while the lock isn't held.
+ */
+ if (new_pfn != KVM_PFN_ERR_FAULT) {
+ /*
+ * Keep the mapping if the previous iteration reused
+ * the existing mapping and didn't create a new one.
+ */
+ if (new_khva == old_khva)
+ new_khva = NULL;
+
+ gpc_release_pfn_and_khva(kvm, new_pfn, new_khva);
+
+ cond_resched();
+ }
+
/* We always request a writeable mapping */
- new_pfn = hva_to_pfn(uhva, false, NULL, true, NULL);
+ new_pfn = hva_to_pfn(gpc->uhva, false, NULL, true, NULL);
if (is_error_noslot_pfn(new_pfn))
- break;
+ goto out_error;
+
+ /*
+ * Obtain a new kernel mapping if KVM itself will access the
+ * pfn. Note, kmap() and memremap() can both sleep, so this
+ * too must be done outside of gpc->lock!
+ */
+ if (gpc->usage & KVM_HOST_USES_PFN) {
+ if (new_pfn == gpc->pfn) {
+ new_khva = old_khva;
+ } else if (pfn_valid(new_pfn)) {
+ new_khva = kmap(pfn_to_page(new_pfn));
+#ifdef CONFIG_HAS_IOMEM
+ } else {
+ new_khva = memremap(pfn_to_hpa(new_pfn), PAGE_SIZE, MEMREMAP_WB);
+#endif
+ }
+ if (!new_khva) {
+ kvm_release_pfn_clean(new_pfn);
+ goto out_error;
+ }
+ }
- KVM_MMU_READ_LOCK(kvm);
- retry = mmu_notifier_retry_hva(kvm, mmu_seq, uhva);
- KVM_MMU_READ_UNLOCK(kvm);
- if (!retry)
- break;
+ write_lock_irq(&gpc->lock);
- cond_resched();
- } while (1);
+ /*
+ * Other tasks must wait for _this_ refresh to complete before
+ * attempting to refresh.
+ */
+ WARN_ON_ONCE(gpc->valid);
+ } while (mmu_notifier_retry_cache(kvm, mmu_seq));
- return new_pfn;
+ gpc->valid = true;
+ gpc->pfn = new_pfn;
+ gpc->khva = new_khva + (gpc->gpa & ~PAGE_MASK);
+ return 0;
+
+out_error:
+ write_lock_irq(&gpc->lock);
+
+ return -EFAULT;
}
int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
@@ -146,9 +237,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
unsigned long page_offset = gpa & ~PAGE_MASK;
kvm_pfn_t old_pfn, new_pfn;
unsigned long old_uhva;
- gpa_t old_gpa;
void *old_khva;
- bool old_valid;
int ret = 0;
/*
@@ -158,13 +247,18 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
if (page_offset + len > PAGE_SIZE)
return -EINVAL;
+ /*
+ * If another task is refreshing the cache, wait for it to complete.
+ * There is no guarantee that concurrent refreshes will see the same
+ * gpa, memslots generation, etc..., so they must be fully serialized.
+ */
+ mutex_lock(&gpc->refresh_lock);
+
write_lock_irq(&gpc->lock);
- old_gpa = gpc->gpa;
old_pfn = gpc->pfn;
old_khva = gpc->khva - offset_in_page(gpc->khva);
old_uhva = gpc->uhva;
- old_valid = gpc->valid;
/* If the userspace HVA is invalid, refresh that first */
if (gpc->gpa != gpa || gpc->generation != slots->generation ||
@@ -177,64 +271,17 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
gpc->uhva = gfn_to_hva_memslot(gpc->memslot, gfn);
if (kvm_is_error_hva(gpc->uhva)) {
- gpc->pfn = KVM_PFN_ERR_FAULT;
ret = -EFAULT;
goto out;
}
-
- gpc->uhva += page_offset;
}
/*
* If the userspace HVA changed or the PFN was already invalid,
* drop the lock and do the HVA to PFN lookup again.
*/
- if (!old_valid || old_uhva != gpc->uhva) {
- unsigned long uhva = gpc->uhva;
- void *new_khva = NULL;
-
- /* Placeholders for "hva is valid but not yet mapped" */
- gpc->pfn = KVM_PFN_ERR_FAULT;
- gpc->khva = NULL;
- gpc->valid = true;
-
- write_unlock_irq(&gpc->lock);
-
- new_pfn = hva_to_pfn_retry(kvm, uhva);
- if (is_error_noslot_pfn(new_pfn)) {
- ret = -EFAULT;
- goto map_done;
- }
-
- if (gpc->usage & KVM_HOST_USES_PFN) {
- if (new_pfn == old_pfn) {
- new_khva = old_khva;
- old_pfn = KVM_PFN_ERR_FAULT;
- old_khva = NULL;
- } else if (pfn_valid(new_pfn)) {
- new_khva = kmap(pfn_to_page(new_pfn));
-#ifdef CONFIG_HAS_IOMEM
- } else {
- new_khva = memremap(pfn_to_hpa(new_pfn), PAGE_SIZE, MEMREMAP_WB);
-#endif
- }
- if (new_khva)
- new_khva += page_offset;
- else
- ret = -EFAULT;
- }
-
- map_done:
- write_lock_irq(&gpc->lock);
- if (ret) {
- gpc->valid = false;
- gpc->pfn = KVM_PFN_ERR_FAULT;
- gpc->khva = NULL;
- } else {
- /* At this point, gpc->valid may already have been cleared */
- gpc->pfn = new_pfn;
- gpc->khva = new_khva;
- }
+ if (!gpc->valid || old_uhva != gpc->uhva) {
+ ret = hva_to_pfn_retry(kvm, gpc);
} else {
/* If the HVA→PFN mapping was already valid, don't unmap it. */
old_pfn = KVM_PFN_ERR_FAULT;
@@ -242,9 +289,26 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
}
out:
+ /*
+ * Invalidate the cache and purge the pfn/khva if the refresh failed.
+ * Some/all of the uhva, gpa, and memslot generation info may still be
+ * valid, leave it as is.
+ */
+ if (ret) {
+ gpc->valid = false;
+ gpc->pfn = KVM_PFN_ERR_FAULT;
+ gpc->khva = NULL;
+ }
+
+ /* Snapshot the new pfn before dropping the lock! */
+ new_pfn = gpc->pfn;
+
write_unlock_irq(&gpc->lock);
- __release_gpc(kvm, old_pfn, old_khva, old_gpa);
+ mutex_unlock(&gpc->refresh_lock);
+
+ if (old_pfn != new_pfn)
+ gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
return ret;
}
@@ -254,14 +318,13 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
{
void *old_khva;
kvm_pfn_t old_pfn;
- gpa_t old_gpa;
+ mutex_lock(&gpc->refresh_lock);
write_lock_irq(&gpc->lock);
gpc->valid = false;
old_khva = gpc->khva - offset_in_page(gpc->khva);
- old_gpa = gpc->gpa;
old_pfn = gpc->pfn;
/*
@@ -272,8 +335,9 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
gpc->pfn = KVM_PFN_ERR_FAULT;
write_unlock_irq(&gpc->lock);
+ mutex_unlock(&gpc->refresh_lock);
- __release_gpc(kvm, old_pfn, old_khva, old_gpa);
+ gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
}
EXPORT_SYMBOL_GPL(kvm_gfn_to_pfn_cache_unmap);
@@ -286,6 +350,7 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
if (!gpc->active) {
rwlock_init(&gpc->lock);
+ mutex_init(&gpc->refresh_lock);
gpc->khva = NULL;
gpc->pfn = KVM_PFN_ERR_FAULT;
prev parent reply other threads:[~2022-08-17 13:26 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-17 13:24 Linux 5.18.18 Greg Kroah-Hartman
2022-08-17 13:24 ` Greg Kroah-Hartman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1660742693200231@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=jslaby@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=lwn@lwn.net \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.