* [PATCH 00/11] fpga: change FPGA indirect article to an
From: trix @ 2021-06-08 21:23 UTC (permalink / raw)
To: mdf, robh+dt, hao.wu, corbet, fbarrat, ajd, bbrezillon, arno,
schalla, herbert, davem, gregkh, Sven.Auhagen, grandmaster
Cc: devicetree, linux-doc, Tom Rix, linux-fpga, linux-staging,
linux-kernel, linux-crypto, linuxppc-dev
From: Tom Rix <trix@redhat.com>
A treewide followup of
https://lore.kernel.org/linux-fpga/2faf6ccb-005b-063a-a2a3-e177082c4b3c@silicom.dk/
Change the use of 'a fpga' to 'an fpga'
Ref usage in wiki
https://en.wikipedia.org/wiki/Field-programmable_gate_array
and Intel's 'FPGAs For Dummies'
https://plan.seek.intel.com/PSG_WW_NC_LPCD_FR_2018_FPGAforDummiesbook
Change was mechanical
#!/bin/sh
for f in `find . -type f`; do
sed -i.bak 's/ a fpga/ an fpga/g' $f
sed -i.bak 's/ A fpga/ An fpga/g' $f
sed -i.bak 's/ a FPGA/ an FPGA/g' $f
sed -i.bak 's/ A FPGA/ An FPGA/g' $f
done
Tom Rix (11):
dt-bindings: fpga: fpga-region: change FPGA indirect article to an
Documentation: fpga: dfl: change FPGA indirect article to an
Documentation: ocxl.rst: change FPGA indirect article to an
crypto: marvell: cesa: change FPGA indirect article to an
fpga: change FPGA indirect article to an
fpga: bridge: change FPGA indirect article to an
fpga-mgr: change FPGA indirect article to an
fpga: region: change FPGA indirect article to an
fpga: of-fpga-region: change FPGA indirect article to an
fpga: stratix10-soc: change FPGA indirect article to an
staging: fpgaboot: change FPGA indirect article to an
.../devicetree/bindings/fpga/fpga-region.txt | 22 +++++++++----------
Documentation/fpga/dfl.rst | 4 ++--
.../userspace-api/accelerators/ocxl.rst | 2 +-
drivers/crypto/marvell/cesa/cesa.h | 2 +-
drivers/fpga/Kconfig | 4 ++--
drivers/fpga/fpga-bridge.c | 22 +++++++++----------
drivers/fpga/fpga-mgr.c | 22 +++++++++----------
drivers/fpga/fpga-region.c | 14 ++++++------
drivers/fpga/of-fpga-region.c | 8 +++----
drivers/fpga/stratix10-soc.c | 2 +-
drivers/staging/gs_fpgaboot/README | 2 +-
include/linux/fpga/fpga-bridge.h | 2 +-
include/linux/fpga/fpga-mgr.h | 2 +-
13 files changed, 54 insertions(+), 54 deletions(-)
--
2.26.3
^ permalink raw reply
* [PATCH 04/11] crypto: marvell: cesa: change FPGA indirect article to an
From: trix @ 2021-06-08 21:23 UTC (permalink / raw)
To: mdf, robh+dt, hao.wu, corbet, fbarrat, ajd, bbrezillon, arno,
schalla, herbert, davem, gregkh, Sven.Auhagen, grandmaster
Cc: devicetree, linux-doc, Tom Rix, linux-fpga, linux-staging,
linux-kernel, linux-crypto, linuxppc-dev
In-Reply-To: <20210608212350.3029742-1-trix@redhat.com>
From: Tom Rix <trix@redhat.com>
Change use of 'a fpga' to 'an fpga'
Signed-off-by: Tom Rix <trix@redhat.com>
---
drivers/crypto/marvell/cesa/cesa.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/marvell/cesa/cesa.h b/drivers/crypto/marvell/cesa/cesa.h
index c1007f2ba79c8..d215a6bed6bc7 100644
--- a/drivers/crypto/marvell/cesa/cesa.h
+++ b/drivers/crypto/marvell/cesa/cesa.h
@@ -66,7 +66,7 @@
#define CESA_SA_ST_ACT_1 BIT(1)
/*
- * CESA_SA_FPGA_INT_STATUS looks like a FPGA leftover and is documented only
+ * CESA_SA_FPGA_INT_STATUS looks like an FPGA leftover and is documented only
* in Errata 4.12. It looks like that it was part of an IRQ-controller in FPGA
* and someone forgot to remove it while switching to the core and moving to
* CESA_SA_INT_STATUS.
--
2.26.3
^ permalink raw reply related
* [PATCH 02/11] Documentation: fpga: dfl: change FPGA indirect article to an
From: trix @ 2021-06-08 21:23 UTC (permalink / raw)
To: mdf, robh+dt, hao.wu, corbet, fbarrat, ajd, bbrezillon, arno,
schalla, herbert, davem, gregkh, Sven.Auhagen, grandmaster
Cc: devicetree, linux-doc, Tom Rix, linux-fpga, linux-staging,
linux-kernel, linux-crypto, linuxppc-dev
In-Reply-To: <20210608212350.3029742-1-trix@redhat.com>
From: Tom Rix <trix@redhat.com>
Change use of 'a fpga' to 'an fpga'
Signed-off-by: Tom Rix <trix@redhat.com>
---
Documentation/fpga/dfl.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/fpga/dfl.rst b/Documentation/fpga/dfl.rst
index ccc33f199df2a..ef9eec71f6f3a 100644
--- a/Documentation/fpga/dfl.rst
+++ b/Documentation/fpga/dfl.rst
@@ -57,7 +57,7 @@ FPGA Interface Unit (FIU) represents a standalone functional unit for the
interface to FPGA, e.g. the FPGA Management Engine (FME) and Port (more
descriptions on FME and Port in later sections).
-Accelerated Function Unit (AFU) represents a FPGA programmable region and
+Accelerated Function Unit (AFU) represents an FPGA programmable region and
always connects to a FIU (e.g. a Port) as its child as illustrated above.
Private Features represent sub features of the FIU and AFU. They could be
@@ -311,7 +311,7 @@ The driver organization in virtualization case is illustrated below:
| PCI PF Device | | | PCI VF Device |
+---------------+ | +---------------+
-FPGA PCIe device driver is always loaded first once a FPGA PCIe PF or VF device
+FPGA PCIe device driver is always loaded first once an FPGA PCIe PF or VF device
is detected. It:
* Finishes enumeration on both FPGA PCIe PF and VF device using common
--
2.26.3
^ permalink raw reply related
* [PATCH 03/11] Documentation: ocxl.rst: change FPGA indirect article to an
From: trix @ 2021-06-08 21:23 UTC (permalink / raw)
To: mdf, robh+dt, hao.wu, corbet, fbarrat, ajd, bbrezillon, arno,
schalla, herbert, davem, gregkh, Sven.Auhagen, grandmaster
Cc: devicetree, linux-doc, Tom Rix, linux-fpga, linux-staging,
linux-kernel, linux-crypto, linuxppc-dev
In-Reply-To: <20210608212350.3029742-1-trix@redhat.com>
From: Tom Rix <trix@redhat.com>
Change use of 'a fpga' to 'an fpga'
Signed-off-by: Tom Rix <trix@redhat.com>
---
Documentation/userspace-api/accelerators/ocxl.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/userspace-api/accelerators/ocxl.rst b/Documentation/userspace-api/accelerators/ocxl.rst
index 14cefc020e2d5..db7570d5e50d1 100644
--- a/Documentation/userspace-api/accelerators/ocxl.rst
+++ b/Documentation/userspace-api/accelerators/ocxl.rst
@@ -6,7 +6,7 @@ OpenCAPI is an interface between processors and accelerators. It aims
at being low-latency and high-bandwidth. The specification is
developed by the `OpenCAPI Consortium <http://opencapi.org/>`_.
-It allows an accelerator (which could be a FPGA, ASICs, ...) to access
+It allows an accelerator (which could be an FPGA, ASICs, ...) to access
the host memory coherently, using virtual addresses. An OpenCAPI
device can also host its own memory, that can be accessed from the
host.
--
2.26.3
^ permalink raw reply related
* [PATCH 01/11] dt-bindings: fpga: fpga-region: change FPGA indirect article to an
From: trix @ 2021-06-08 21:23 UTC (permalink / raw)
To: mdf, robh+dt, hao.wu, corbet, fbarrat, ajd, bbrezillon, arno,
schalla, herbert, davem, gregkh, Sven.Auhagen, grandmaster
Cc: devicetree, linux-doc, Tom Rix, linux-fpga, linux-staging,
linux-kernel, linux-crypto, linuxppc-dev
In-Reply-To: <20210608212350.3029742-1-trix@redhat.com>
From: Tom Rix <trix@redhat.com>
Change use of 'a fpga' to 'an fpga'
Signed-off-by: Tom Rix <trix@redhat.com>
---
.../devicetree/bindings/fpga/fpga-region.txt | 22 +++++++++----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/Documentation/devicetree/bindings/fpga/fpga-region.txt b/Documentation/devicetree/bindings/fpga/fpga-region.txt
index d787d57491a1c..7d35152648387 100644
--- a/Documentation/devicetree/bindings/fpga/fpga-region.txt
+++ b/Documentation/devicetree/bindings/fpga/fpga-region.txt
@@ -38,7 +38,7 @@ Partial Reconfiguration (PR)
Partial Reconfiguration Region (PRR)
* Also called a "reconfigurable partition"
- * A PRR is a specific section of a FPGA reserved for reconfiguration.
+ * A PRR is a specific section of an FPGA reserved for reconfiguration.
* A base (or static) FPGA image may create a set of PRR's that later may
be independently reprogrammed many times.
* The size and specific location of each PRR is fixed.
@@ -105,7 +105,7 @@ reprogrammed independently while the rest of the system continues to function.
Sequence
========
-When a DT overlay that targets a FPGA Region is applied, the FPGA Region will
+When a DT overlay that targets an FPGA Region is applied, the FPGA Region will
do the following:
1. Disable appropriate FPGA bridges.
@@ -134,8 +134,8 @@ The intended use is that a Device Tree overlay (DTO) can be used to reprogram an
FPGA while an operating system is running.
An FPGA Region that exists in the live Device Tree reflects the current state.
-If the live tree shows a "firmware-name" property or child nodes under a FPGA
-Region, the FPGA already has been programmed. A DTO that targets a FPGA Region
+If the live tree shows a "firmware-name" property or child nodes under an FPGA
+Region, the FPGA already has been programmed. A DTO that targets an FPGA Region
and adds the "firmware-name" property is taken as a request to reprogram the
FPGA. After reprogramming is successful, the overlay is accepted into the live
tree.
@@ -152,9 +152,9 @@ These FPGA regions are children of FPGA bridges which are then children of the
base FPGA region. The "Full Reconfiguration to add PRR's" example below shows
this.
-If an FPGA Region does not specify a FPGA Manager, it will inherit the FPGA
+If an FPGA Region does not specify an FPGA Manager, it will inherit the FPGA
Manager specified by its ancestor FPGA Region. This supports both the case
-where the same FPGA Manager is used for all of a FPGA as well the case where
+where the same FPGA Manager is used for all of an FPGA as well the case where
a different FPGA Manager is used for each region.
FPGA Regions do not inherit their ancestor FPGA regions' bridges. This prevents
@@ -166,7 +166,7 @@ within the static image of the FPGA.
Required properties:
- compatible : should contain "fpga-region"
- fpga-mgr : should contain a phandle to an FPGA Manager. Child FPGA Regions
- inherit this property from their ancestor regions. A fpga-mgr property
+ inherit this property from their ancestor regions. An fpga-mgr property
in a region will override any inherited FPGA manager.
- #address-cells, #size-cells, ranges : must be present to handle address space
mapping for child nodes.
@@ -175,12 +175,12 @@ Optional properties:
- firmware-name : should contain the name of an FPGA image file located on the
firmware search path. If this property shows up in a live device tree
it indicates that the FPGA has already been programmed with this image.
- If this property is in an overlay targeting a FPGA region, it is a
+ If this property is in an overlay targeting an FPGA region, it is a
request to program the FPGA with that image.
- fpga-bridges : should contain a list of phandles to FPGA Bridges that must be
controlled during FPGA programming along with the parent FPGA bridge.
This property is optional if the FPGA Manager handles the bridges.
- If the fpga-region is the child of a fpga-bridge, the list should not
+ If the fpga-region is the child of an fpga-bridge, the list should not
contain the parent bridge.
- partial-fpga-config : boolean, set if partial reconfiguration is to be done,
otherwise full reconfiguration is done.
@@ -279,7 +279,7 @@ Supported Use Models
In all cases the live DT must have the FPGA Manager, FPGA Bridges (if any), and
a FPGA Region. The target of the Device Tree Overlay is the FPGA Region. Some
-uses are specific to a FPGA device.
+uses are specific to an FPGA device.
* No FPGA Bridges
In this case, the FPGA Manager which programs the FPGA also handles the
@@ -300,7 +300,7 @@ uses are specific to a FPGA device.
bridges need to exist in the FPGA that can gate the buses going to each FPGA
region while the buses are enabled for other sections. Before any partial
reconfiguration can be done, a base FPGA image must be loaded which includes
- PRR's with FPGA bridges. The device tree should have a FPGA region for each
+ PRR's with FPGA bridges. The device tree should have an FPGA region for each
PRR.
Device Tree Examples
--
2.26.3
^ permalink raw reply related
* Re: [PATCH] crash_core, vmcoreinfo: Append 'SECTION_SIZE_BITS' to vmcoreinfo
From: Andrew Morton @ 2021-06-08 21:14 UTC (permalink / raw)
To: Baoquan He
Cc: Mark Rutland, Kazuhito Hagio, Bhupesh Sharma, linux-arm-kernel,
Will Deacon, x86, kexec, linuxppc-dev, Pingfan Liu, linux-kernel,
Boris Petkov, Catalin Marinas, James Morse, Thomas Gleixner,
Dave Young, Ingo Molnar, Paul Mackerras, Dave Anderson
In-Reply-To: <20210608142432.GA587883@MiWiFi-R3L-srv>
On Tue, 8 Jun 2021 22:24:32 +0800 Baoquan He <bhe@redhat.com> wrote:
> On 06/08/21 at 06:33am, Pingfan Liu wrote:
> > As mentioned in kernel commit 1d50e5d0c505 ("crash_core, vmcoreinfo:
> > Append 'MAX_PHYSMEM_BITS' to vmcoreinfo"), SECTION_SIZE_BITS in the
> > formula:
> > #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
> >
> > Besides SECTIONS_SHIFT, SECTION_SIZE_BITS is also used to calculate
> > PAGES_PER_SECTION in makedumpfile just like kernel.
> >
> > Unfortunately, this arch-dependent macro SECTION_SIZE_BITS changes, e.g.
> > recently in kernel commit f0b13ee23241 ("arm64/sparsemem: reduce
> > SECTION_SIZE_BITS"). But user space wants a stable interface to get this
> > info. Such info cannot be deduced from a crashdump vmcore.
> > Hence append SECTION_SIZE_BITS to vmcoreinfo.
>
> ...
>
> Add the discussion of the original thread in kexec ML for reference:
> http://lists.infradead.org/pipermail/kexec/2021-June/022676.html
I added a Link: for this.
> This looks good to me.
>
> Acked-by: Baoquan He <bhe@redhat.com>
I'm thinking we should backport this at least to Fixes:f0b13ee23241.
But perhaps it's simpler to just backport it as far as possible, so I
added a bare cc:stable with no Fixes:. Thoughts?
^ permalink raw reply
* Re: [PATCH v7 01/11] mm/mremap: Fix race between MOVE_PMD mremap and pageout
From: Hugh Dickins @ 2021-06-08 20:39 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Linus Torvalds, Hugh Dickins, npiggin, linux-mm, kaleshsingh,
joel, Kirill A . Shutemov, akpm, linuxppc-dev
In-Reply-To: <87o8cgokso.fsf@linux.ibm.com>
On Tue, 8 Jun 2021, Aneesh Kumar K.V wrote:
>
> mm/mremap: hold the rmap lock in write mode when moving page table entries.
>
> To avoid a race between rmap walk and mremap, mremap does take_rmap_locks().
> The lock was taken to ensure that rmap walk doesn't miss a page table entry due to
> PTE moves via move_pagetables(). The kernel does further optimization of
> this lock such that if we are going to find the newly added vma after the
> old vma, the rmap lock is not taken. This is because rmap walk would find the
> vmas in the same order and if we don't find the page table attached to
> older vma we would find it with the new vma which we would iterate later.
> The actual lifetime of the page is still controlled by the PTE lock.
>
> This patch updates the locking requirement to handle another race condition
> explained below with optimized mremap::
>
> Optimized PMD move
>
> CPU 1 CPU 2 CPU 3
>
> mremap(old_addr, new_addr) page_shrinker/try_to_unmap_one
>
> mmap_write_lock_killable()
>
> addr = old_addr
> lock(pte_ptl)
> lock(pmd_ptl)
> pmd = *old_pmd
> pmd_clear(old_pmd)
> flush_tlb_range(old_addr)
>
> *new_pmd = pmd
> *new_addr = 10; and fills
> TLB with new addr
> and old pfn
>
> unlock(pmd_ptl)
> ptep_clear_flush()
> old pfn is free.
> Stale TLB entry
>
The PUD example below is mainly a waste of space and time:
"Optimized PUD move suffers from a similar race." would be better.
> Optimized PUD move:
>
> CPU 1 CPU 2 CPU 3
>
> mremap(old_addr, new_addr) page_shrinker/try_to_unmap_one
>
> mmap_write_lock_killable()
>
> addr = old_addr
> lock(pte_ptl)
> lock(pud_ptl)
> pud = *old_pud
> pud_clear(old_pud)
> flush_tlb_range(old_addr)
>
> *new_pud = pud
> *new_addr = 10; and fills
> TLB with new addr
> and old pfn
>
> unlock(pud_ptl)
> ptep_clear_flush()
> old pfn is free.
> Stale TLB entry
>
> Both of the above race conditions can be fixed if we force the mremap path to take the rmap lock.
>
Don't forget the Fixes and Link you had in the previous version:
Fixes: 2c91bd4a4e2e ("mm: speed up mremap by 20x on large regions")
Link: https://lore.kernel.org/linux-mm/CAHk-=wgXVR04eBNtxQfevontWnP6FDm+oj5vauQXP3S-huwbPw@mail.gmail.com
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Thanks, this is orders of magnitude better!
Acked-by: Hugh Dickins <hughd@google.com>
>
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 9cd352fb9cf8..f12df630fb37 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -517,7 +517,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
> } else if (IS_ENABLED(CONFIG_HAVE_MOVE_PUD) && extent == PUD_SIZE) {
>
> if (move_pgt_entry(NORMAL_PUD, vma, old_addr, new_addr,
> - old_pud, new_pud, need_rmap_locks))
> + old_pud, new_pud, true))
> continue;
> }
>
> @@ -544,7 +544,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
> * moving at the PMD level if possible.
> */
> if (move_pgt_entry(NORMAL_PMD, vma, old_addr, new_addr,
> - old_pmd, new_pmd, need_rmap_locks))
> + old_pmd, new_pmd, true))
> continue;
> }
>
>
^ permalink raw reply
* Re: [PATCH] powerpc/32: Remove __main()
From: Segher Boessenkool @ 2021-06-08 18:25 UTC (permalink / raw)
To: Christophe Leroy; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <d01028f8166b98584eec536b52f14c5e3f98ff6b.1623172922.git.christophe.leroy@csgroup.eu>
On Tue, Jun 08, 2021 at 05:22:51PM +0000, Christophe Leroy wrote:
> Comment says that __main() is there to make GCC happy.
>
> It's been there since the implementation of ppc arch in Linux 1.3.45.
>
> ppc32 is the only architecture having that. Even ppc64 doesn't have it.
>
> Seems like GCC is still happy without it.
>
> Drop it for good.
If you used G++ to build the kernel there could be a call to __main
inserted under some circumstances. It is used in functions called
"main" if there is no other way to do initialisations (this should not
happen if you use -ffreestanding, and there should not be a function
called "main" anyway, but who knows).
Either way, yup, this is ancient history :-)
Segher
^ permalink raw reply
* Re: [PATCH 13/16] block: use memcpy_from_bvec in bio_copy_kern_endio_read
From: Chaitanya Kulkarni @ 2021-06-08 18:26 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-14-hch@lst.de>
On 6/8/21 09:09, Christoph Hellwig wrote:
> Use memcpy_from_bvec instead of open coding the logic.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 12/16] block: use memcpy_to_bvec in copy_to_high_bio_irq
From: Chaitanya Kulkarni @ 2021-06-08 18:24 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-13-hch@lst.de>
On 6/8/21 09:08, Christoph Hellwig wrote:
> Use memcpy_to_bvec instead of open coding the logic.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 05/16] bvec: add memcpy_{from,to}_bvec and memzero_bvec helper
From: Chaitanya Kulkarni @ 2021-06-08 18:21 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-6-hch@lst.de>
On 6/8/21 09:07, Christoph Hellwig wrote:
> Add helpers to perform common memory operations on a bvec.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 06/16] block: use memzero_page in zero_fill_bio
From: Chaitanya Kulkarni @ 2021-06-08 18:19 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-7-hch@lst.de>
On 6/8/21 09:07, Christoph Hellwig wrote:
> Use memzero_bvec to zero each segment in the bio instead of manually
> mapping and zeroing the data.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 04/16] bvec: add a bvec_kmap_local helper
From: Chaitanya Kulkarni @ 2021-06-08 18:18 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-5-hch@lst.de>
On 6/8/21 09:06, Christoph Hellwig wrote:
> Add a helper to call kmap_local_page on a bvec. There is no need for
> an unmap helper given that kunmap_local accepts any address in the mapped
> page.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 01/16] mm: use kmap_local_page in memzero_page
From: Chaitanya Kulkarni @ 2021-06-08 18:17 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-2-hch@lst.de>
On 6/8/21 09:06, Christoph Hellwig wrote:
> No need for kmap_atomic here.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 03/16] bvec: fix the include guards for bvec.h
From: Chaitanya Kulkarni @ 2021-06-08 18:18 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev@lists.ozlabs.org, linux-mips@vger.kernel.org,
Dongsheng Yang, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, dm-devel@redhat.com, Ilya Dryomov,
Ira Weiny, ceph-devel@vger.kernel.org
In-Reply-To: <20210608160603.1535935-4-hch@lst.de>
On 6/8/21 09:06, Christoph Hellwig wrote:
> Fix the include guards to match the file naming.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
^ permalink raw reply
* Re: [PATCH 2/4] drivers/nvdimm: Add perf interface to expose nvdimm performance stats
From: Peter Zijlstra @ 2021-06-08 17:36 UTC (permalink / raw)
To: Kajol Jain
Cc: nvdimm, santosh, maddy, ira.weiny, rnsastry, linux-kernel,
atrajeev, aneesh.kumar, vaibhav, dan.j.williams, linuxppc-dev,
tglx
In-Reply-To: <20210608115700.85933-3-kjain@linux.ibm.com>
On Tue, Jun 08, 2021 at 05:26:58PM +0530, Kajol Jain wrote:
> +static int nvdimm_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> +{
> + struct nvdimm_pmu *nd_pmu;
> + u32 target;
> + int nodeid;
> + const struct cpumask *cpumask;
> +
> + nd_pmu = hlist_entry_safe(node, struct nvdimm_pmu, node);
> +
> + /* Clear it, in case given cpu is set in nd_pmu->arch_cpumask */
> + cpumask_test_and_clear_cpu(cpu, &nd_pmu->arch_cpumask);
> +
> + /*
> + * If given cpu is not same as current designated cpu for
> + * counter access, just return.
> + */
> + if (cpu != nd_pmu->cpu)
> + return 0;
> +
> + /* Check for any active cpu in nd_pmu->arch_cpumask */
> + target = cpumask_any(&nd_pmu->arch_cpumask);
> + nd_pmu->cpu = target;
> +
> + /*
> + * In case we don't have any active cpu in nd_pmu->arch_cpumask,
> + * check in given cpu's numa node list.
> + */
> + if (target >= nr_cpu_ids) {
> + nodeid = cpu_to_node(cpu);
> + cpumask = cpumask_of_node(nodeid);
> + target = cpumask_any_but(cpumask, cpu);
> + nd_pmu->cpu = target;
> +
> + if (target >= nr_cpu_ids)
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +static int nvdimm_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
> +{
> + struct nvdimm_pmu *nd_pmu;
> +
> + nd_pmu = hlist_entry_safe(node, struct nvdimm_pmu, node);
> +
> + if (nd_pmu->cpu >= nr_cpu_ids)
> + nd_pmu->cpu = cpu;
> +
> + return 0;
> +}
> +static int nvdimm_pmu_cpu_hotplug_init(struct nvdimm_pmu *nd_pmu)
> +{
> + int nodeid, rc;
> + const struct cpumask *cpumask;
> +
> + /*
> + * In case cpu hotplug is not handled by arch specific code
> + * they can still provide required cpumask which can be used
> + * to get designated cpu for counter access.
> + * Check for any active cpu in nd_pmu->arch_cpumask.
> + */
> + if (!cpumask_empty(&nd_pmu->arch_cpumask)) {
> + nd_pmu->cpu = cpumask_any(&nd_pmu->arch_cpumask);
> + } else {
> + /* pick active cpu from the cpumask of device numa node. */
> + nodeid = dev_to_node(nd_pmu->dev);
> + cpumask = cpumask_of_node(nodeid);
> + nd_pmu->cpu = cpumask_any(cpumask);
> + }
> +
> + rc = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "perf/nvdimm:online",
> + nvdimm_pmu_cpu_online, nvdimm_pmu_cpu_offline);
> +
Did you actually test this hotplug stuff?
That is, create a counter, unplug the CPU the counter was on, and
continue counting? "perf stat -I" is a good option for this, concurrent
with a hotplug.
Because I don't think it's actually correct. The thing is perf core is
strictly per-cpu, and it will place the event on a specific CPU context.
If you then unplug that CPU, nothing will touch the events on that CPU
anymore.
What drivers that span CPUs need to do is call
perf_pmu_migrate_context() whenever the CPU they were assigned to goes
away. Please have a look at arch/x86/events/rapl.c or
arch/x86/events/amd/power.c for relatively simple drivers that have this
property.
^ permalink raw reply
* [PATCH] powerpc/32: Remove __main()
From: Christophe Leroy @ 2021-06-08 17:22 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
Comment says that __main() is there to make GCC happy.
It's been there since the implementation of ppc arch in Linux 1.3.45.
ppc32 is the only architecture having that. Even ppc64 doesn't have it.
Seems like GCC is still happy without it.
Drop it for good.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/kernel/misc_32.S | 6 ------
1 file changed, 6 deletions(-)
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 6a076bef2932..39ab15419592 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -388,9 +388,3 @@ _GLOBAL(start_secondary_resume)
bl start_secondary
b .
#endif /* CONFIG_SMP */
-
-/*
- * This routine is just here to keep GCC happy - sigh...
- */
-_GLOBAL(__main)
- blr
--
2.25.0
^ permalink raw reply related
* Re: [PATCH v7 00/11] Speedup mremap on ppc64
From: Linus Torvalds @ 2021-06-08 17:10 UTC (permalink / raw)
To: Nick Piggin
Cc: Aneesh Kumar K.V, linux-mm@kvack.org, kaleshsingh@google.com,
joel@joelfernandes.org, Kirill A . Shutemov,
akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAPa8GCAmgUyqqAcuLC7KxDvDepkqhhvVcwgSGJh92PT+LoMQcw@mail.gmail.com>
On Mon, Jun 7, 2021 at 3:10 AM Nick Piggin <npiggin@gmail.com> wrote:
>
> I'd really rather not do this, I'm not sure if micro benchmark captures everything.
I don't much care what powerpc code does _internally_ for this
architecture-specific mis-design issue, but I really don't want to see
more complex generic interfaces unless you have better hard numbers
for them.
So far the numbers are: "no observable difference".
It would have to be not just observable, but actually meaningful for
me to go "ok, we'll add this crazy flag that nobody else cares about".
And honestly, from everything I've seen on page table walker caches:
they are great, but once you start remapping big ranges and
invalidating megabytes of TLB's, the walker caches just aren't going
to be your issue.
But: numbers talk. I'd take the sane generic interfaces as a first
cut. If somebody then has really compelling numbers, we can _then_
look at that "optimize for odd page table walker cache situation"
case.
And in the meantime, maybe you can talk to the hardware people and
tell them that you want the "flush range" capability to work right,
and that if the walker cache is _so_ important they shouldn't
have made it an all-or-nothing flush.
Linus
^ permalink raw reply
* Re: [RFC] powerpc/pseries: Interface to represent PAPR firmware attributes
From: Pratik Sampat @ 2021-06-08 16:42 UTC (permalink / raw)
To: mpe, benh, paulus, linuxppc-dev, kvm-ppc, linux-kernel,
pratik.r.sampat, Pratik Sampat
In-Reply-To: <20210604163501.51511-1-psampat@linux.ibm.com>
I've implemented a POC using this interface for the powerpc-utils'
ppc64_cpu --frequency command-line tool to utilize this information
in userspace.
The POC has been hosted here:
https://github.com/pratiksampat/powerpc-utils/tree/H_GET_ENERGY_SCALE_INFO
and based on comments and suggestions I can further improve the
parsing logic from this initial implementation.
Sample output from the powerpc-utils tool is as follows:
# ppc64_cpu --frequency
Power and Performance Mode: XXXX
Idle Power Saver Status : XXXX
Processor Folding Status : XXXX --> Printed if Idle power save status is supported
Platform reported frequencies --> Frequencies reported from the platform's H_CALL, i.e. the PAPR interface
min : NNNN GHz
max : NNNN GHz
static : NNNN GHz
Tool Computed frequencies
min : NNNN GHz (cpu XX)
max : NNNN GHz (cpu XX)
avg : NNNN GHz
On 04/06/21 10:05 pm, Pratik R. Sampat wrote:
> Adds a generic interface to represent the energy and frequency related
> PAPR attributes on the system using the new H_CALL
> "H_GET_ENERGY_SCALE_INFO".
>
> H_GET_EM_PARMS H_CALL was previously responsible for exporting this
> information in the lparcfg, however the H_GET_EM_PARMS H_CALL
> will be deprecated P10 onwards.
>
> The H_GET_ENERGY_SCALE_INFO H_CALL is of the following call format:
> hcall(
> uint64 H_GET_ENERGY_SCALE_INFO, // Get energy scale info
> uint64 flags, // Per the flag request
> uint64 firstAttributeId,// The attribute id
> uint64 bufferAddress, // The logical address of the output buffer
> uint64 bufferSize // The size in bytes of the output buffer
> );
>
> This H_CALL can query either all the attributes at once with
> firstAttributeId = 0, flags = 0 as well as query only one attribute
> at a time with firstAttributeId = id
>
> The output buffer consists of the following
> 1. number of attributes - 8 bytes
> 2. array offset to the data location - 8 bytes
> 3. version info - 1 byte
> 4. A data array of size num attributes, which contains the following:
> a. attribute ID - 8 bytes
> b. attribute value in number - 8 bytes
> c. attribute name in string - 64 bytes
> d. attribute value in string - 64 bytes
>
> The new H_CALL exports information in direct string value format, hence
> a new interface has been introduced in /sys/firmware/papr to export
> this information to userspace in an extensible pass-through format.
> The H_CALL returns the name, numeric value and string value. As string
> values are in human readable format, therefore if the string value
> exists then that is given precedence over the numeric value.
>
> The format of exposing the sysfs information is as follows:
> /sys/firmware/papr/
> |-- attr_0_name
> |-- attr_0_val
> |-- attr_1_name
> |-- attr_1_val
> ...
>
> The energy information that is exported is useful for userspace tools
> such as powerpc-utils. Currently these tools infer the
> "power_mode_data" value in the lparcfg, which in turn is obtained from
> the to be deprecated H_GET_EM_PARMS H_CALL.
> On future platforms, such userspace utilities will have to look at the
> data returned from the new H_CALL being populated in this new sysfs
> interface and report this information directly without the need of
> interpretation.
>
> Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
> ---
> Documentation/ABI/testing/sysfs-firmware-papr | 24 +++
> arch/powerpc/include/asm/hvcall.h | 21 +-
> arch/powerpc/kvm/trace_hv.h | 1 +
> arch/powerpc/platforms/pseries/Makefile | 3 +-
> .../pseries/papr_platform_attributes.c | 203 ++++++++++++++++++
> 5 files changed, 250 insertions(+), 2 deletions(-)
> create mode 100644 Documentation/ABI/testing/sysfs-firmware-papr
> create mode 100644 arch/powerpc/platforms/pseries/papr_platform_attributes.c
>
> diff --git a/Documentation/ABI/testing/sysfs-firmware-papr b/Documentation/ABI/testing/sysfs-firmware-papr
> new file mode 100644
> index 000000000000..1c040b44ac3b
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-papr
> @@ -0,0 +1,24 @@
> +What: /sys/firmware/papr
> +Date: June 2021
> +Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description : Directory hosting a set of platform attributes on Linux
> + running as a PAPR guest.
> +
> + Each file in a directory contains a platform
> + attribute pertaining to performance/energy-savings
> + mode and processor frequency.
> +
> +What: /sys/firmware/papr/attr_X_name
> + /sys/firmware/papr/attr_X_val
> +Date: June 2021
> +Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description: PAPR attributes directory for POWERVM servers
> +
> + This directory provides PAPR information. It
> + contains below sysfs attributes:
> +
> + - attr_X_name: File contains the name of
> + attribute X
> +
> + - attr_X_val: Numeric/string value of
> + attribute X
> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> index e3b29eda8074..19a2a8c77a49 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -316,7 +316,8 @@
> #define H_SCM_PERFORMANCE_STATS 0x418
> #define H_RPT_INVALIDATE 0x448
> #define H_SCM_FLUSH 0x44C
> -#define MAX_HCALL_OPCODE H_SCM_FLUSH
> +#define H_GET_ENERGY_SCALE_INFO 0x450
> +#define MAX_HCALL_OPCODE H_GET_ENERGY_SCALE_INFO
>
> /* Scope args for H_SCM_UNBIND_ALL */
> #define H_UNBIND_SCOPE_ALL (0x1)
> @@ -631,6 +632,24 @@ struct hv_gpci_request_buffer {
> uint8_t bytes[HGPCI_MAX_DATA_BYTES];
> } __packed;
>
> +#define MAX_EM_ATTRS 10
> +#define MAX_EM_DATA_BYTES \
> + (sizeof(struct energy_scale_attributes) * MAX_EM_ATTRS)
> +struct energy_scale_attributes {
> + __be64 attr_id;
> + __be64 attr_value;
> + unsigned char attr_desc[64];
> + unsigned char attr_value_desc[64];
> +} __packed;
> +
> +struct hv_energy_scale_buffer {
> + __be64 num_attr;
> + __be64 array_offset;
> + __u8 data_header_version;
> + unsigned char data[MAX_EM_DATA_BYTES];
> +} __packed;
> +
> +
> #endif /* __ASSEMBLY__ */
> #endif /* __KERNEL__ */
> #endif /* _ASM_POWERPC_HVCALL_H */
> diff --git a/arch/powerpc/kvm/trace_hv.h b/arch/powerpc/kvm/trace_hv.h
> index 830a126e095d..38cd0ed0a617 100644
> --- a/arch/powerpc/kvm/trace_hv.h
> +++ b/arch/powerpc/kvm/trace_hv.h
> @@ -115,6 +115,7 @@
> {H_VASI_STATE, "H_VASI_STATE"}, \
> {H_ENABLE_CRQ, "H_ENABLE_CRQ"}, \
> {H_GET_EM_PARMS, "H_GET_EM_PARMS"}, \
> + {H_GET_ENERGY_SCALE_INFO, "H_GET_ENERGY_SCALE_INFO"}, \
> {H_SET_MPP, "H_SET_MPP"}, \
> {H_GET_MPP, "H_GET_MPP"}, \
> {H_HOME_NODE_ASSOCIATIVITY, "H_HOME_NODE_ASSOCIATIVITY"}, \
> diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
> index c8a2b0b05ac0..d14fca89ac25 100644
> --- a/arch/powerpc/platforms/pseries/Makefile
> +++ b/arch/powerpc/platforms/pseries/Makefile
> @@ -6,7 +6,8 @@ obj-y := lpar.o hvCall.o nvram.o reconfig.o \
> of_helpers.o \
> setup.o iommu.o event_sources.o ras.o \
> firmware.o power.o dlpar.o mobility.o rng.o \
> - pci.o pci_dlpar.o eeh_pseries.o msi.o
> + pci.o pci_dlpar.o eeh_pseries.o msi.o \
> + papr_platform_attributes.o
> obj-$(CONFIG_SMP) += smp.o
> obj-$(CONFIG_SCANLOG) += scanlog.o
> obj-$(CONFIG_KEXEC_CORE) += kexec.o
> diff --git a/arch/powerpc/platforms/pseries/papr_platform_attributes.c b/arch/powerpc/platforms/pseries/papr_platform_attributes.c
> new file mode 100644
> index 000000000000..8818877ff47e
> --- /dev/null
> +++ b/arch/powerpc/platforms/pseries/papr_platform_attributes.c
> @@ -0,0 +1,203 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * PowerPC64 LPAR PAPR Information Driver
> + *
> + * This driver creates a sys file at /sys/firmware/papr/ which contains
> + * files keyword - value pairs that specify energy configuration of the system.
> + *
> + * Copyright 2021 IBM Corp.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/types.h>
> +#include <linux/errno.h>
> +#include <linux/init.h>
> +#include <linux/seq_file.h>
> +#include <linux/slab.h>
> +#include <linux/uaccess.h>
> +#include <linux/hugetlb.h>
> +#include <asm/lppaca.h>
> +#include <asm/hvcall.h>
> +#include <asm/firmware.h>
> +#include <asm/time.h>
> +#include <asm/prom.h>
> +#include <asm/vdso_datapage.h>
> +#include <asm/vio.h>
> +#include <asm/mmu.h>
> +#include <asm/machdep.h>
> +#include <asm/drmem.h>
> +
> +#include "pseries.h"
> +
> +#define MAX_KOBJ_ATTRS 2
> +
> +struct papr_attr {
> + u64 id;
> + struct kobj_attribute attr;
> +} *pgattrs;
> +
> +struct kobject *papr_kobj;
> +struct hv_energy_scale_buffer *em_buf;
> +struct energy_scale_attributes *ea;
> +
> +static ssize_t papr_show_name(struct kobject *kobj,
> + struct kobj_attribute *attr,
> + char *buf)
> +{
> + struct papr_attr *pattr = container_of(attr, struct papr_attr, attr);
> + int idx, ret = 0;
> +
> + /*
> + * We do not expect the name to change, hence use the old value
> + * and save a HCALL
> + */
> + for (idx = 0; idx < be64_to_cpu(em_buf->num_attr); idx++) {
> + if (pattr->id == be64_to_cpu(ea[idx].attr_id)) {
> + ret = sprintf(buf, "%s\n", ea[idx].attr_desc);
> + if (ret < 0)
> + ret = -EIO;
> + break;
> + }
> + }
> +
> + return ret;
> +}
> +
> +static ssize_t papr_show_val(struct kobject *kobj,
> + struct kobj_attribute *attr,
> + char *buf)
> +{
> + struct papr_attr *pattr = container_of(attr, struct papr_attr, attr);
> + struct hv_energy_scale_buffer *t_buf;
> + struct energy_scale_attributes *t_ea;
> + int data_offset, ret = 0;
> +
> + t_buf = kmalloc(sizeof(*t_buf), GFP_KERNEL);
> + if (t_buf == NULL)
> + return -ENOMEM;
> +
> + ret = plpar_hcall_norets(H_GET_ENERGY_SCALE_INFO, 0,
> + pattr->id, virt_to_phys(t_buf),
> + sizeof(*t_buf));
> +
> + if (ret != H_SUCCESS) {
> + pr_warn("hcall failed: H_GET_ENERGY_SCALE_INFO");
> + goto out;
> + }
> +
> + data_offset = be64_to_cpu(t_buf->array_offset) -
> + (sizeof(t_buf->num_attr) +
> + sizeof(t_buf->array_offset) +
> + sizeof(t_buf->data_header_version));
> +
> + t_ea = (struct energy_scale_attributes *) &t_buf->data[data_offset];
> +
> + /* Prioritize string values over numerical */
> + if (strlen(t_ea->attr_value_desc) != 0)
> + ret = sprintf(buf, "%s\n", t_ea->attr_value_desc);
> + else
> + ret = sprintf(buf, "%llu\n", be64_to_cpu(t_ea->attr_value));
> + if (ret < 0)
> + ret = -EIO;
> +out:
> + kfree(t_buf);
> + return ret;
> +}
> +
> +static struct papr_ops_info {
> + const char *attr_name;
> + ssize_t (*show)(struct kobject *kobj, struct kobj_attribute *attr,
> + char *buf);
> +} ops_info[MAX_KOBJ_ATTRS] = {
> + { "name", papr_show_name },
> + { "val", papr_show_val },
> +};
> +
> +static int __init papr_init(void)
> +{
> + uint64_t num_attr;
> + int ret, idx, i, data_offset;
> +
> + em_buf = kmalloc(sizeof(*em_buf), GFP_KERNEL);
> + if (em_buf == NULL)
> + return -ENOMEM;
> + /*
> + * hcall(
> + * uint64 H_GET_ENERGY_SCALE_INFO, // Get energy scale info
> + * uint64 flags, // Per the flag request
> + * uint64 firstAttributeId, // The attribute id
> + * uint64 bufferAddress, // The logical address of the output buffer
> + * uint64 bufferSize); // The size in bytes of the output buffer
> + */
> + ret = plpar_hcall_norets(H_GET_ENERGY_SCALE_INFO, 0, 0,
> + virt_to_phys(em_buf), sizeof(*em_buf));
> +
> + if (!firmware_has_feature(FW_FEATURE_LPAR) || ret != H_SUCCESS ||
> + em_buf->data_header_version != 0x1) {
> + pr_warn("hcall failed: H_GET_ENERGY_SCALE_INFO");
> + goto out;
> + }
> +
> + num_attr = be64_to_cpu(em_buf->num_attr);
> +
> + /*
> + * Typecast the energy buffer to the attribute structure at the offset
> + * specified in the buffer
> + */
> + data_offset = be64_to_cpu(em_buf->array_offset) -
> + (sizeof(em_buf->num_attr) +
> + sizeof(em_buf->array_offset) +
> + sizeof(em_buf->data_header_version));
> +
> + ea = (struct energy_scale_attributes *) &em_buf->data[data_offset];
> +
> + papr_kobj = kobject_create_and_add("papr", firmware_kobj);
> + if (!papr_kobj) {
> + pr_warn("kobject_create_and_add papr failed\n");
> + goto out_kobj;
> + }
> +
> + for (idx = 0; idx < num_attr; idx++) {
> + pgattrs = kcalloc(MAX_KOBJ_ATTRS,
> + sizeof(*pgattrs),
> + GFP_KERNEL);
> + if (!pgattrs)
> + goto out_kobj;
> +
> + /*
> + * Create the sysfs attribute hierarchy for each PAPR
> + * property found
> + */
> + for (i = 0; i < MAX_KOBJ_ATTRS; i++) {
> + char buf[20];
> +
> + pgattrs[i].id = be64_to_cpu(ea[idx].attr_id);
> + sysfs_attr_init(&pgattrs[i].attr.attr);
> + sprintf(buf, "%s_%d_%s", "attr", idx,
> + ops_info[i].attr_name);
> + pgattrs[i].attr.attr.name = buf;
> + pgattrs[i].attr.attr.mode = 0444;
> + pgattrs[i].attr.show = ops_info[i].show;
> +
> + if (sysfs_create_file(papr_kobj, &pgattrs[i].attr.attr)) {
> + pr_warn("Failed to create papr file %s\n",
> + pgattrs[i].attr.attr.name);
> + goto out_pgattrs;
> + }
> + }
> + }
> +
> + return 0;
> +
> +out_pgattrs:
> + for (i = 0; i < MAX_KOBJ_ATTRS; i++)
> + kfree(pgattrs);
> +out_kobj:
> + kobject_put(papr_kobj);
> +out:
> + kfree(em_buf);
> +
> + return -ENOMEM;
> +}
> +
> +machine_device_initcall(pseries, papr_init);
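The data_offset arithmetic in the quoted patch can be sketched in userspace as follows. This is only an illustration under stated assumptions: the demo_* names are hypothetical, the header struct mirrors the packed hv_energy_scale_buffer header from the patch (8 + 8 + 1 = 17 bytes), and the bounds checks are an addition for illustration that the quoted patch does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the buffer header in the quoted patch.
 * All multi-byte fields are big-endian in the buffer.
 */
struct demo_energy_scale_header {
	uint64_t num_attr;
	uint64_t array_offset;
	uint8_t data_header_version;
} __attribute__((packed));

/* Decode a big-endian 64-bit value byte by byte (independent of host endianness). */
static uint64_t demo_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Return the index into data[] of the first attribute, i.e. array_offset
 * minus the header size, or -1 if array_offset points into the header or
 * past the end of the buffer (checks assumed here, not taken from the patch).
 */
static long demo_first_attr_index(const unsigned char *buf, size_t buf_len)
{
	const size_t hdr = sizeof(struct demo_energy_scale_header); /* 17 when packed */
	uint64_t array_offset;

	assert(buf != NULL);
	if (buf_len < hdr)
		return -1;

	array_offset = demo_be64(buf +
			offsetof(struct demo_energy_scale_header, array_offset));
	if (array_offset < hdr || array_offset >= buf_len)
		return -1;

	return (long)(array_offset - hdr);
}
```

With a 64-byte buffer whose array_offset field holds 24, the first attribute would sit at data[7]; an array_offset of 16 would land inside the header and is rejected by this sketch.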
* Re: [PATCH 08/16] dm-writecache: use bvec_kmap_local instead of bvec_kmap_irq
From: Christoph Hellwig @ 2021-06-08 16:38 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, Thomas Bogendoerfer, Mike Snitzer, Geoff Levand,
linuxppc-dev, ceph-devel, linux-mips, Dongsheng Yang,
linux-kernel, linux-block, dm-devel, Ilya Dryomov, Ira Weiny,
Christoph Hellwig
In-Reply-To: <4c248453-713f-9da8-04e8-7939388be49a@acm.org>
On Tue, Jun 08, 2021 at 09:30:56AM -0700, Bart Van Assche wrote:
> From one of the functions called by kunmap_local():
>
> unsigned long addr = (unsigned long) vaddr & PAGE_MASK;
>
> This won't work well if bvec->bv_offset >= PAGE_SIZE I assume?
It won't indeed. Both the existing and new helpers operate on single
page bvecs only, and all callers only use those. I should have
probably mentioned that in the cover letter and documented the
assumptions in the code, though.
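The single-page assumption can be illustrated with a small userspace sketch of the address arithmetic. It assumes the map helper returns the page mapping plus bv_offset and that the unmap side recovers the page by masking with PAGE_MASK, as in the quoted line. The demo_* names and the 4096-byte page size are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PAGE_SIZE and PAGE_MASK. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Address a map helper would hand back: page mapping plus bv_offset. */
static uintptr_t demo_map(uintptr_t page_vaddr, unsigned long bv_offset)
{
	return page_vaddr + bv_offset;
}

/* Page address the unmap side recovers by masking off the low bits. */
static uintptr_t demo_unmap_base(uintptr_t vaddr)
{
	return vaddr & DEMO_PAGE_MASK;
}

/* Returns 1 if the unmap side finds the same page that was mapped. */
static int demo_roundtrip_ok(uintptr_t page_vaddr, unsigned long bv_offset)
{
	/* page_vaddr must be page aligned for the comparison to make sense */
	assert((page_vaddr & ~DEMO_PAGE_MASK) == 0);
	return demo_unmap_base(demo_map(page_vaddr, bv_offset)) == page_vaddr;
}
```

For any offset below the page size the round trip finds the mapped page; once the offset reaches the page size, masking yields the following page instead, which is why the helpers only make sense for single-page bvecs.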
* Re: [PATCH 08/16] dm-writecache: use bvec_kmap_local instead of bvec_kmap_irq
From: Bart Van Assche @ 2021-06-08 16:30 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand, linuxppc-dev,
linux-mips, Dongsheng Yang, linux-kernel, linux-block, dm-devel,
Ilya Dryomov, Ira Weiny, ceph-devel
In-Reply-To: <20210608160603.1535935-9-hch@lst.de>
On 6/8/21 9:05 AM, Christoph Hellwig wrote:
> diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
> index aecc246ade26..93ca454eaca9 100644
> --- a/drivers/md/dm-writecache.c
> +++ b/drivers/md/dm-writecache.c
> @@ -1205,14 +1205,13 @@ static void memcpy_flushcache_optimized(void *dest, void *source, size_t size)
> static void bio_copy_block(struct dm_writecache *wc, struct bio *bio, void *data)
> {
> void *buf;
> - unsigned long flags;
> unsigned size;
> int rw = bio_data_dir(bio);
> unsigned remaining_size = wc->block_size;
>
> do {
> struct bio_vec bv = bio_iter_iovec(bio, bio->bi_iter);
> - buf = bvec_kmap_irq(&bv, &flags);
> + buf = bvec_kmap_local(&bv);
> size = bv.bv_len;
> if (unlikely(size > remaining_size))
> size = remaining_size;
> @@ -1230,7 +1229,7 @@ static void bio_copy_block(struct dm_writecache *wc, struct bio *bio, void *data
> memcpy_flushcache_optimized(data, buf, size);
> }
>
> - bvec_kunmap_irq(buf, &flags);
> + kunmap_local(buf);
>
> data = (char *)data + size;
> remaining_size -= size;
From one of the functions called by kunmap_local():
unsigned long addr = (unsigned long) vaddr & PAGE_MASK;
This won't work well if bvec->bv_offset >= PAGE_SIZE I assume?
Thanks,
Bart.
* Re: [PATCH 03/16] bvec: fix the include guards for bvec.h
From: Bart Van Assche @ 2021-06-08 16:23 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand, linuxppc-dev,
linux-mips, Dongsheng Yang, linux-kernel, linux-block, dm-devel,
Ilya Dryomov, Ira Weiny, ceph-devel
In-Reply-To: <20210608160603.1535935-4-hch@lst.de>
On 6/8/21 9:05 AM, Christoph Hellwig wrote:
> Fix the include guards to match the file naming.
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
* Re: [PATCH 02/16] MIPS: don't include <linux/genhd.h> in <asm/mach-rc32434/rb.h>
From: Bart Van Assche @ 2021-06-08 16:23 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Thomas Bogendoerfer, Mike Snitzer, Geoff Levand, linuxppc-dev,
linux-mips, Dongsheng Yang, linux-kernel, linux-block, dm-devel,
Ilya Dryomov, Ira Weiny, ceph-devel
In-Reply-To: <20210608160603.1535935-3-hch@lst.de>
On 6/8/21 9:05 AM, Christoph Hellwig wrote:
> There is no need to include genhd.h from a random arch header, and not
> doing so prevents the possibility for nasty include loops.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> arch/mips/include/asm/mach-rc32434/rb.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/mips/include/asm/mach-rc32434/rb.h b/arch/mips/include/asm/mach-rc32434/rb.h
> index d502673a4f6c..34d179ca020b 100644
> --- a/arch/mips/include/asm/mach-rc32434/rb.h
> +++ b/arch/mips/include/asm/mach-rc32434/rb.h
> @@ -7,8 +7,6 @@
> #ifndef __ASM_RC32434_RB_H
> #define __ASM_RC32434_RB_H
>
> -#include <linux/genhd.h>
> -
> #define REGBASE 0x18000000
> #define IDT434_REG_BASE ((volatile void *) KSEG1ADDR(REGBASE))
> #define UART0BASE 0x58000
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
* Re: [PATCH v4 2/4] lazy tlb: allow lazy tlb mm refcounting to be configurable
From: Andy Lutomirski @ 2021-06-08 16:20 UTC (permalink / raw)
To: Nicholas Piggin, Andrew Morton
Cc: linux-arch, Randy Dunlap, linux-kernel, linux-mm, linuxppc-dev
In-Reply-To: <20210605014216.446867-3-npiggin@gmail.com>
On 6/4/21 6:42 PM, Nicholas Piggin wrote:
> Add CONFIG_MMU_TLB_REFCOUNT which enables refcounting of the lazy tlb mm
> when it is context switched. This can be disabled by architectures that
> don't require this refcounting if they clean up lazy tlb mms when the
> last refcount is dropped. Currently this is always enabled, which is
> what existing code does, so the patch is effectively a no-op.
>
> Rename rq->prev_mm to rq->prev_lazy_mm, because that's what it is.
I am in favor of this approach, but I would be a lot more comfortable
with the resulting code if task->active_mm were at least better
documented and possibly even guarded by ifdefs.
x86 bare metal currently does not need the core lazy mm refcounting, and
x86 bare metal *also* does not need ->active_mm. Under the x86 scheme,
if lazy mm refcounting were configured out, ->active_mm could become a
dangling pointer, and this makes me extremely uncomfortable.
So I tend to think that, depending on config, the core code should
either keep ->active_mm [1] alive or get rid of it entirely.
[1] I don't really think it belongs in task_struct at all. It's not a
property of the task. It's the *per-cpu* mm that the core code is
keeping alive for lazy purposes. How about consolidating it with the
copy in rq?
I guess the short summary of my opinion is that I like making this
configurable, but I do not like the state of the code.
--Andy