All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/1] selinux: Add xfs quota command types
From: Christoph Hellwig @ 2020-02-20 15:44 UTC (permalink / raw)
  To: Richard Haines; +Cc: darrick.wong, sds, paul, linux-xfs, selinux
In-Reply-To: <20200220153234.152426-2-richard_c_haines@btinternet.com>

On Thu, Feb 20, 2020 at 03:32:34PM +0000, Richard Haines wrote:
> Add Q_XQUOTAOFF, Q_XQUOTAON and Q_XSETQLIM to trigger filesystem quotamod
> permission check.
> 
> Add Q_XGETQUOTA, Q_XGETQSTAT, Q_XGETQSTATV and Q_XGETNEXTQUOTA to trigger
> filesystem quotaget permission check.
> 
> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply

* [PATCH 0/1] selinux: Add xfs quota command types
From: Richard Haines @ 2020-02-20 15:32 UTC (permalink / raw)
  To: darrick.wong, sds, paul; +Cc: linux-xfs, selinux, Richard Haines

Added these quota command types for SELinux checks on XFS quotas. I picked
those I thought useful. The selinux-testsuite will have tests for these
permission checks on XFS.

One point to note: XFS does not call dquot_quota_on() to trigger
security_quota_on(), therefore the 'file quotaon' permission is not tested
for SELinux

Richard Haines (1):
  selinux: Add xfs quota command types

 security/selinux/hooks.c | 7 +++++++
 1 file changed, 7 insertions(+)

-- 
2.24.1


^ permalink raw reply

* Re: [igt-dev] [PATCH i-g-t] intel-ci: add a pre-merge blacklist to reduce the testing queue
From: Chris Wilson @ 2020-02-20 15:44 UTC (permalink / raw)
  To: Martin Peres, igt-dev
In-Reply-To: <20200220153209.210767-1-martin.peres@linux.intel.com>

Quoting Martin Peres (2020-02-20 15:32:09)
> +###############################################################################
> +# Perf tests are for people using performance counters to get details about
> +# how the execution is performed (Observability Architecture). As such, the
> +# audience is very limited (game developers, driver developers), and it does
> +# not justify the overall execution time:
> +#
> +# - shard-skl: 0%
> +# - shard-kbl: 0%
> +# - shard-apl: 0%
> +# - shard-glk: 0%
> +# - shard-icl: 0%
> +# - shard-tgl: 1.7% (~3.5 minutes)
> +#
> +# Data acquired on 2020-02-20 by Martin Peres
> +###############################################################################
> +igt@perf@gen12-mi-rpc

Nak. That is just a straightforward BUG. The test is doing it's job in
reporting the interface is snafu. Then the test screws up by deadlocking
due to unclean exit handling after detecting the error.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply

* Re: [PATCH] mm: memcontrol: asynchronous reclaim for memory.high
From: Daniel Jordan @ 2020-02-20 15:45 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Daniel Jordan, Michal Hocko, Andrew Morton, Tejun Heo,
	Roman Gushchin, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, kernel-team-b10kYP2dOMg,
	Peter Zijlstra
In-Reply-To: <20200219220859.GF54486-druUgvl0LCNAfugRpC6u6w@public.gmane.org>

+Peter

On Wed, Feb 19, 2020 at 05:08:59PM -0500, Johannes Weiner wrote:
> On Wed, Feb 19, 2020 at 04:41:12PM -0500, Daniel Jordan wrote:
> > On Wed, Feb 19, 2020 at 08:53:32PM +0100, Michal Hocko wrote:
> > > On Wed 19-02-20 14:16:18, Johannes Weiner wrote:
> > > > On Wed, Feb 19, 2020 at 07:37:31PM +0100, Michal Hocko wrote:
> > > > > On Wed 19-02-20 13:12:19, Johannes Weiner wrote:
> > > > > > This patch adds asynchronous reclaim to the memory.high cgroup limit
> > > > > > while keeping direct reclaim as a fallback. In our testing, this
> > > > > > eliminated all direct reclaim from the affected workload.
> > > > > 
> > > > > Who is accounted for all the work? Unless I am missing something this
> > > > > just gets hidden in the system activity and that might hurt the
> > > > > isolation. I do see how moving the work to a different context is
> > > > > desirable but this work has to be accounted properly when it is going to
> > > > > become a normal mode of operation (rather than a rare exception like the
> > > > > existing irq context handling).
> > > > 
> > > > Yes, the plan is to account it to the cgroup on whose behalf we're
> > > > doing the work.
> > 
> > How are you planning to do that?
> > 
> > I've been thinking about how to account a kernel thread's CPU usage to a cgroup
> > on and off while working on the parallelizing Michal mentions below.  A few
> > approaches are described here:
> > 
> > https://lore.kernel.org/linux-mm/20200212224731.kmss6o6agekkg3mw-S51bK0XF4qpuJJETbFA3a0B3C2bhBk7L0E9HWUfgJXw@public.gmane.org/
> 
> What we do for the IO controller is execute the work unthrottled but
> charge the cgroup on whose behalf we are executing with whatever cost
> or time or bandwith that was incurred. The cgroup will pay off this
> debt when it requests more of that resource.
>
[snip code pointers]

Thanks!  Figuring out how the io controllers dealt with remote charging was on
my list, this makes it easier.

> The plan for the CPU controller is similar. When a remote execution
> begins, flush the current runtime accumulated (update_curr) and
> associate the current thread with another cgroup (similar to
> current->active_memcg); when remote execution is done, flush the
> runtime delta to that cgroup and unset the remote context.

Ok, consistency with io and memory is one advantage to doing it that way.
Creating kthreads in cgroups also seems viable so far, and it's unclear whether
either approach is significantly simpler or more maintainable than the other,
at least to me.

Is someone on your side working on remote charging right now?  I was planning
to post an RFD comparing these soon and it would make sense to include them.

^ permalink raw reply

* Re: [PATCH] mm: memcontrol: asynchronous reclaim for memory.high
From: Daniel Jordan @ 2020-02-20 15:45 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Daniel Jordan, Michal Hocko, Andrew Morton, Tejun Heo,
	Roman Gushchin, linux-mm, cgroups, linux-kernel, kernel-team,
	Peter Zijlstra
In-Reply-To: <20200219220859.GF54486@cmpxchg.org>

+Peter

On Wed, Feb 19, 2020 at 05:08:59PM -0500, Johannes Weiner wrote:
> On Wed, Feb 19, 2020 at 04:41:12PM -0500, Daniel Jordan wrote:
> > On Wed, Feb 19, 2020 at 08:53:32PM +0100, Michal Hocko wrote:
> > > On Wed 19-02-20 14:16:18, Johannes Weiner wrote:
> > > > On Wed, Feb 19, 2020 at 07:37:31PM +0100, Michal Hocko wrote:
> > > > > On Wed 19-02-20 13:12:19, Johannes Weiner wrote:
> > > > > > This patch adds asynchronous reclaim to the memory.high cgroup limit
> > > > > > while keeping direct reclaim as a fallback. In our testing, this
> > > > > > eliminated all direct reclaim from the affected workload.
> > > > > 
> > > > > Who is accounted for all the work? Unless I am missing something this
> > > > > just gets hidden in the system activity and that might hurt the
> > > > > isolation. I do see how moving the work to a different context is
> > > > > desirable but this work has to be accounted properly when it is going to
> > > > > become a normal mode of operation (rather than a rare exception like the
> > > > > existing irq context handling).
> > > > 
> > > > Yes, the plan is to account it to the cgroup on whose behalf we're
> > > > doing the work.
> > 
> > How are you planning to do that?
> > 
> > I've been thinking about how to account a kernel thread's CPU usage to a cgroup
> > on and off while working on the parallelizing Michal mentions below.  A few
> > approaches are described here:
> > 
> > https://lore.kernel.org/linux-mm/20200212224731.kmss6o6agekkg3mw@ca-dmjordan1.us.oracle.com/
> 
> What we do for the IO controller is execute the work unthrottled but
> charge the cgroup on whose behalf we are executing with whatever cost
> or time or bandwith that was incurred. The cgroup will pay off this
> debt when it requests more of that resource.
>
[snip code pointers]

Thanks!  Figuring out how the io controllers dealt with remote charging was on
my list, this makes it easier.

> The plan for the CPU controller is similar. When a remote execution
> begins, flush the current runtime accumulated (update_curr) and
> associate the current thread with another cgroup (similar to
> current->active_memcg); when remote execution is done, flush the
> runtime delta to that cgroup and unset the remote context.

Ok, consistency with io and memory is one advantage to doing it that way.
Creating kthreads in cgroups also seems viable so far, and it's unclear whether
either approach is significantly simpler or more maintainable than the other,
at least to me.

Is someone on your side working on remote charging right now?  I was planning
to post an RFD comparing these soon and it would make sense to include them.


^ permalink raw reply

* Re: [PATCH RFC] mmc: sdhci-msm: Toggle fifo write clk after ungating sdcc clk
From: Bjorn Andersson @ 2020-02-20 15:44 UTC (permalink / raw)
  To: Sayali Lokhande
  Cc: adrian.hunter, robh+dt, ulf.hansson, asutoshd, stummala, ppvk,
	rampraka, vbadigan, sboyd, georgi.djakov, mka, linux-mmc,
	linux-kernel, linux-arm-msm, devicetree, agross, linux-mmc-owner
In-Reply-To: <1582190446-4778-2-git-send-email-sayalil@codeaurora.org>

On Thu 20 Feb 01:20 PST 2020, Sayali Lokhande wrote:

> From: Ram Prakash Gupta <rampraka@codeaurora.org>
> 
> During GCC level clock gating of MCLK, the async FIFO
> gets into some hang condition, such that for the next
> transfer after MCLK ungating, first bit of CMD response
> doesn't get written in to the FIFO. This cause the CPSM
> to hang eventually leading to SW timeout.

Does this always happen, on what platforms does this happen? How does
this manifest itself? Can you please elaborate.

> 
> To fix the issue, toggle the FIFO write clock after
> MCLK ungated to get the FIFO pointers and flags to
> valid states.
> 
> Change-Id: Ibef2d1d283ac0b6983c609a4abc98bc574d31fa6

Please drop the Change-Id and please add

Cc: stable@vger.kernel.org

If this is a bug fix that should be backported to e.g. 5.4.

> Signed-off-by: Ram Prakash Gupta <rampraka@codeaurora.org>
> Signed-off-by: Sayali Lokhande <sayalil@codeaurora.org>
> ---
>  drivers/mmc/host/sdhci-msm.c | 43 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
> index c3a160c..eaa3e95 100644
> --- a/drivers/mmc/host/sdhci-msm.c
> +++ b/drivers/mmc/host/sdhci-msm.c
> @@ -127,6 +127,8 @@
>  #define CQHCI_VENDOR_CFG1	0xA00
>  #define CQHCI_VENDOR_DIS_RST_ON_CQ_EN	(0x3 << 13)
>  
> +#define RCLK_TOGGLE 0x2

Please use BIT(1) instead.

> +
>  struct sdhci_msm_offset {
>  	u32 core_hc_mode;
>  	u32 core_mci_data_cnt;
> @@ -1554,6 +1556,43 @@ static void __sdhci_msm_set_clock(struct sdhci_host *host, unsigned int clock)
>  	sdhci_enable_clk(host, clk);
>  }
>  
> +/*
> + * After MCLK ugating, toggle the FIFO write clock to get
> + * the FIFO pointers and flags to valid state.
> + */
> +static void sdhci_msm_toggle_fifo_write_clk(struct sdhci_host *host)
> +{
> +	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
> +	struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host);
> +	const struct sdhci_msm_offset *msm_offset =
> +					msm_host->offset;

This doesn't look to be > 80 chars, please unwrap.

> +	struct mmc_card *card = host->mmc->card;
> +
> +	if (msm_host->tuning_done ||
> +			(card && card->ext_csd.strobe_support &&
> +			card->host->ios.enhanced_strobe)) {
> +		/*
> +		 * set HC_REG_DLL_CONFIG_3[1] to select MCLK as
> +		 * DLL input clock

You can shorten this to /* Select MCLK as DLL input clock */ if you make
the below readl/writel a little bit easier to read.

> +		 */
> +		writel_relaxed(((readl_relaxed(host->ioaddr +
> +			msm_offset->core_dll_config_3))
> +			| RCLK_TOGGLE), host->ioaddr +
> +			msm_offset->core_dll_config_3);

Please use a local variable and write this out as:
		val = readl(addr);
		val |= RCLK_TOGGLE;
		writel(val, addr);

> +		/* ensure above write as toggling same bit quickly */
> +		wmb();

This ensures ordering of writes, if you want to make sure the write has
hit the hardware before the delay perform a readl() on the address.

> +		udelay(2);
> +		/*
> +		 * clear HC_REG_DLL_CONFIG_3[1] to select RCLK as
> +		 * DLL input clock
> +		 */

		/* Select RCLK as DLL input clock */

> +		writel_relaxed(((readl_relaxed(host->ioaddr +
> +			msm_offset->core_dll_config_3))
> +			& ~RCLK_TOGGLE), host->ioaddr +
> +			msm_offset->core_dll_config_3);

Same as above, readl(); val &= ~RCLK_TOGGLE; writel(); will make this
easier on the eyes.

> +	}
> +}
> +
>  /* sdhci_msm_set_clock - Called with (host->lock) spinlock held. */
>  static void sdhci_msm_set_clock(struct sdhci_host *host, unsigned int clock)
>  {
> @@ -2149,6 +2188,10 @@ static __maybe_unused int sdhci_msm_runtime_resume(struct device *dev)
>  				       msm_host->bulk_clks);
>  	if (ret)
>  		return ret;

An empty line please.

> +	if (host->mmc &&

Afaict host->mmc can't be NULL, can you please confirm that you need
this check.

> +			(host->mmc->ios.timing == MMC_TIMING_MMC_HS400))
> +		sdhci_msm_toggle_fifo_write_clk(host);
> +

Regards,
Bjorn

>  	/*
>  	 * Whenever core-clock is gated dynamically, it's needed to
>  	 * restore the SDR DLL settings when the clock is ungated.
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project

^ permalink raw reply

* Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation
From: Roger Pau Monné @ 2020-02-20 15:45 UTC (permalink / raw)
  To: Durrant, Paul
  Cc: Agarwal, Anchal, Valentin, Eduardo, len.brown@intel.com,
	peterz@infradead.org, benh@kernel.crashing.org, x86@kernel.org,
	linux-mm@kvack.org, pavel@ucw.cz, hpa@zytor.com,
	tglx@linutronix.de, sstabellini@kernel.org, fllinden@amaozn.com,
	Kamata, Munehisa, mingo@redhat.com,
	xen-devel@lists.xenproject.org, Singh, Balbir, axboe@kernel.dk,
	konrad.wilk@oracle.com, bp@alien8.de, boris.ostrovsky@oracle.com,
	jgross@suse.com, netdev@vger.kernel.org, linux-pm@vger.kernel.org,
	rjw@rjwysocki.net, linux-kernel@vger.kernel.org,
	vkuznets@redhat.com, davem@davemloft.net, Woodhouse, David
In-Reply-To: <f986b845491b47cc8469d88e2e65e2a7@EX13D32EUC003.ant.amazon.com>

On Thu, Feb 20, 2020 at 08:54:36AM +0000, Durrant, Paul wrote:
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> > Roger Pau Monné
> > Sent: 20 February 2020 08:39
> > To: Agarwal, Anchal <anchalag@amazon.com>
> > Cc: Valentin, Eduardo <eduval@amazon.com>; len.brown@intel.com;
> > peterz@infradead.org; benh@kernel.crashing.org; x86@kernel.org; linux-
> > mm@kvack.org; pavel@ucw.cz; hpa@zytor.com; tglx@linutronix.de;
> > sstabellini@kernel.org; fllinden@amaozn.com; Kamata, Munehisa
> > <kamatam@amazon.com>; mingo@redhat.com; xen-devel@lists.xenproject.org;
> > Singh, Balbir <sblbir@amazon.com>; axboe@kernel.dk;
> > konrad.wilk@oracle.com; bp@alien8.de; boris.ostrovsky@oracle.com;
> > jgross@suse.com; netdev@vger.kernel.org; linux-pm@vger.kernel.org;
> > rjw@rjwysocki.net; linux-kernel@vger.kernel.org; vkuznets@redhat.com;
> > davem@davemloft.net; Woodhouse, David <dwmw@amazon.co.uk>
> > Subject: Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks
> > for PM suspend and hibernation
> > 
> > Thanks for this work, please see below.
> > 
> > On Wed, Feb 19, 2020 at 06:04:24PM +0000, Anchal Agarwal wrote:
> > > On Tue, Feb 18, 2020 at 10:16:11AM +0100, Roger Pau Monné wrote:
> > > > On Mon, Feb 17, 2020 at 11:05:53PM +0000, Anchal Agarwal wrote:
> > > > > On Mon, Feb 17, 2020 at 11:05:09AM +0100, Roger Pau Monné wrote:
> > > > > > On Fri, Feb 14, 2020 at 11:25:34PM +0000, Anchal Agarwal wrote:
> > > > > Quiescing the queue seemed a better option here as we want to make
> > sure ongoing
> > > > > requests dispatches are totally drained.
> > > > > I should accept that some of these notion is borrowed from how nvme
> > freeze/unfreeze
> > > > > is done although its not apple to apple comparison.
> > > >
> > > > That's fine, but I would still like to requests that you use the same
> > > > logic (as much as possible) for both the Xen and the PM initiated
> > > > suspension.
> > > >
> > > > So you either apply this freeze/unfreeze to the Xen suspension (and
> > > > drop the re-issuing of requests on resume) or adapt the same approach
> > > > as the Xen initiated suspension. Keeping two completely different
> > > > approaches to suspension / resume on blkfront is not suitable long
> > > > term.
> > > >
> > > I agree with you on overhaul of xen suspend/resume wrt blkfront is a
> > good
> > > idea however, IMO that is a work for future and this patch series should
> > > not be blocked for it. What do you think?
> > 
> > It's not so much that I think an overhaul of suspend/resume in
> > blkfront is needed, it's just that I don't want to have two completely
> > different suspend/resume paths inside blkfront.
> > 
> > So from my PoV I think the right solution is to either use the same
> > code (as much as possible) as it's currently used by Xen initiated
> > suspend/resume, or to also switch Xen initiated suspension to use the
> > newly introduced code.
> > 
> > Having two different approaches to suspend/resume in the same driver
> > is a recipe for disaster IMO: it adds complexity by forcing developers
> > to take into account two different suspend/resume approaches when
> > there's no need for it.
> 
> I disagree. S3 or S4 suspend/resume (or perhaps we should call them power state transitions to avoid confusion) are quite different from Xen suspend/resume.
> Power state transitions ought to be, and indeed are, visible to the software running inside the guest. Applications, as well as drivers, can receive notification and take whatever action they deem appropriate.
> Xen suspend/resume OTOH is used when a guest is migrated and the code should go to all lengths possible to make any software running inside the guest (other than Xen specific enlightened code, such as PV drivers) completely unaware that anything has actually happened.

So from what you say above PM state transitions are notified to all
drivers, and Xen suspend/resume is only notified to PV drivers, and
here we are speaking about blkfront which is a PV driver, and should
get notified in both cases. So I'm unsure why the same (or at least
very similar) approach can't be used in both cases.

The suspend/resume approach proposed by this patch is completely
different than the one used by a xenbus initiated suspend/resume, and
I don't see a technical reason that warrants this difference.

I'm not saying that the approach used here is wrong, it's just that I
don't see the point in having two different ways to do suspend/resume
in the same driver, unless there's a technical reason for it, which I
don't think has been provided.

I would be fine with switching xenbus initiated suspend/resume to also
use the approach proposed here: freeze the queues and drain the shared
rings before suspending.

> So, whilst it may be possible to use common routines to, for example, re-establish PV frontend/backend communication, PV frontend code should be acutely aware of the circumstances they are operating in. I can cite example code in the Windows PV driver, which have supported guest S3/S4 power state transitions since day 1.

Hm, please bear with me, as I'm not sure I fully understand. Why isn't
the current suspend/resume logic suitable for PM transitions?

As said above, I'm happy to switch xenbus initiated suspend/resume to
use the logic in this patch, but unless there's a technical reason for
it I don't see why blkfront should have two completely different
approaches to suspend/resume depending on whether it's a PM or a
xenbus state change.

Thanks, Roger.

^ permalink raw reply

* Re: mvneta: comphy regression with SolidRun ClearFog
From: Russell King - ARM Linux admin @ 2020-02-20 15:45 UTC (permalink / raw)
  To: Joel Johnson
  Cc: David S. Miller, Baruch Siach, Gregory Clement, Thomas Petazzoni,
	Rob Herring, netdev
In-Reply-To: <9c61fda15f89a69989c0d80fda33ea47@lixil.net>

On Thu, Feb 20, 2020 at 08:34:07AM -0700, Joel Johnson wrote:
> On 2020-02-20 03:12, Russell King - ARM Linux admin wrote:
> > On Wed, Feb 19, 2020 at 06:49:51AM -0700, Joel Johnson wrote:
> > > On 2020-02-19 02:22, Russell King - ARM Linux admin wrote:
> > > > On Tue, Feb 18, 2020 at 10:14:48PM -0700, Joel Johnson wrote:
> > > > Does debian have support for the comphy enabled in their kernel,
> > > > which is controlled by CONFIG_PHY_MVEBU_A38X_COMPHY ?
> > > 
> > > Well, doh! I stared at the patch series for way to long, but the added
> > > Kconfig symbol failed to register mentally somehow. I had been using
> > > the
> > > last known good Debian config with make olddefconfig, but it obviously
> > > wasn't included in earlier configs and not enabled by default.
> > > 
> > > Many thanks to you and Willy Tarreau for pointing out my glaring
> > > omission!
> > 
> > Thanks for letting us know that you've fixed it now.
> 
> Sure thing, I've submitted a Debian patch, and was pointed to an existing
> Debian bug with the same issue and patch, so hopefully that will get
> incorporated soon. I'll also keep an eye on OpenWRT when they move to an
> affected kernel version to make sure it's included.
> 
> One lingering question that wasn't clear to me is the apparent inconsistency
> in default enablement for PHYs in drivers/phy/marvell/Kconfig. Is there a
> technical reason why PHY_MVEBU_A3700_COMPHY defaults to 'y' but
> PHY_MVEBU_A38X_COMPHY (and PHY_MVEBU_CP110_COMPHY) default to 'n', or is it
> just an artifact of being added at different times? Similarly, is there a
> reason that PHY_MVEBU_A3700_COMPHY and PHY_MVEBU_A3700_UTMI default to 'y'
> instead of 'm' for all ARCH_MVEBU builds? In my testing, building with
> PHY_MVEBU_A38X_COMPHY as a module still seemed to autoload the module as
> needed on boot, so modules for different platforms seems off-hand more
> lightweight that building the driver in for all MVEBU boards which don't use
> all drivers.
> 
> With the current defaults, it seems like PHY_MVEBU_CP110_COMPHY may be
> affected in Debian the same way as PHY_MVEBU_A38X_COMPHY, but I don't have
> available Armada 7K/8K hardware yet to confirm.

There is no clear answer to whether should something default to Y,
M or N - different people have different ideas and different levels
of frustration with which-ever are picked.

The root problem is that there are way too many configuration
options that it's become quite time consuming to put together the
proper kernel configuration for any particular platform, and things
get even more interesting when it comes to a kernel supporting
multiple platforms, where one may wish to avoid having a huge
kernel image, but want to use modules for the platform specific
bits.

I wonder whether we ought to be considering something like:

menuconfig ARCH_MVEBU_DEFAULTS
	tristate "Marvell Engineering Business Unit (MVEBU) SoCs"

config ARCH_MVEBU
	def_bool y if ARCH_MVEBU_DEFAULTS
	...

and then have mvebu drivers default to the state of
ARCH_MVEBU_DEFAULTS.  That means, if you want to build the
platform with modular drivers, or built-in drivers there's one top
level config that will default all the appropriate options that way,
and any new drivers added later can pick up on the state of that
option.

Just a thought.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation
From: Roger Pau Monné @ 2020-02-20 15:45 UTC (permalink / raw)
  To: Durrant, Paul
  Cc: Valentin, Eduardo, peterz@infradead.org, benh@kernel.crashing.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, pavel@ucw.cz,
	hpa@zytor.com, tglx@linutronix.de, sstabellini@kernel.org,
	fllinden@amaozn.com, x86@kernel.org, mingo@redhat.com,
	xen-devel@lists.xenproject.org, Singh, Balbir,
	len.brown@intel.com, jgross@suse.com, konrad.wilk@oracle.com,
	Agarwal, Anchal, bp@alien8.de, boris.ostrovsky@oracle.com,
	axboe@kernel.dk, netdev@vger.kernel.org, linux-pm@vger.kernel.org,
	rjw@rjwysocki.net, Kamata, Munehisa, vkuznets@redhat.com,
	davem@davemloft.net, Woodhouse, David
In-Reply-To: <f986b845491b47cc8469d88e2e65e2a7@EX13D32EUC003.ant.amazon.com>

On Thu, Feb 20, 2020 at 08:54:36AM +0000, Durrant, Paul wrote:
> > -----Original Message-----
> > From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> > Roger Pau Monné
> > Sent: 20 February 2020 08:39
> > To: Agarwal, Anchal <anchalag@amazon.com>
> > Cc: Valentin, Eduardo <eduval@amazon.com>; len.brown@intel.com;
> > peterz@infradead.org; benh@kernel.crashing.org; x86@kernel.org; linux-
> > mm@kvack.org; pavel@ucw.cz; hpa@zytor.com; tglx@linutronix.de;
> > sstabellini@kernel.org; fllinden@amaozn.com; Kamata, Munehisa
> > <kamatam@amazon.com>; mingo@redhat.com; xen-devel@lists.xenproject.org;
> > Singh, Balbir <sblbir@amazon.com>; axboe@kernel.dk;
> > konrad.wilk@oracle.com; bp@alien8.de; boris.ostrovsky@oracle.com;
> > jgross@suse.com; netdev@vger.kernel.org; linux-pm@vger.kernel.org;
> > rjw@rjwysocki.net; linux-kernel@vger.kernel.org; vkuznets@redhat.com;
> > davem@davemloft.net; Woodhouse, David <dwmw@amazon.co.uk>
> > Subject: Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks
> > for PM suspend and hibernation
> > 
> > Thanks for this work, please see below.
> > 
> > On Wed, Feb 19, 2020 at 06:04:24PM +0000, Anchal Agarwal wrote:
> > > On Tue, Feb 18, 2020 at 10:16:11AM +0100, Roger Pau Monné wrote:
> > > > On Mon, Feb 17, 2020 at 11:05:53PM +0000, Anchal Agarwal wrote:
> > > > > On Mon, Feb 17, 2020 at 11:05:09AM +0100, Roger Pau Monné wrote:
> > > > > > On Fri, Feb 14, 2020 at 11:25:34PM +0000, Anchal Agarwal wrote:
> > > > > Quiescing the queue seemed a better option here as we want to make
> > sure ongoing
> > > > > requests dispatches are totally drained.
> > > > > I should accept that some of these notion is borrowed from how nvme
> > freeze/unfreeze
> > > > > is done although its not apple to apple comparison.
> > > >
> > > > That's fine, but I would still like to requests that you use the same
> > > > logic (as much as possible) for both the Xen and the PM initiated
> > > > suspension.
> > > >
> > > > So you either apply this freeze/unfreeze to the Xen suspension (and
> > > > drop the re-issuing of requests on resume) or adapt the same approach
> > > > as the Xen initiated suspension. Keeping two completely different
> > > > approaches to suspension / resume on blkfront is not suitable long
> > > > term.
> > > >
> > > I agree with you on overhaul of xen suspend/resume wrt blkfront is a
> > good
> > > idea however, IMO that is a work for future and this patch series should
> > > not be blocked for it. What do you think?
> > 
> > It's not so much that I think an overhaul of suspend/resume in
> > blkfront is needed, it's just that I don't want to have two completely
> > different suspend/resume paths inside blkfront.
> > 
> > So from my PoV I think the right solution is to either use the same
> > code (as much as possible) as it's currently used by Xen initiated
> > suspend/resume, or to also switch Xen initiated suspension to use the
> > newly introduced code.
> > 
> > Having two different approaches to suspend/resume in the same driver
> > is a recipe for disaster IMO: it adds complexity by forcing developers
> > to take into account two different suspend/resume approaches when
> > there's no need for it.
> 
> I disagree. S3 or S4 suspend/resume (or perhaps we should call them power state transitions to avoid confusion) are quite different from Xen suspend/resume.
> Power state transitions ought to be, and indeed are, visible to the software running inside the guest. Applications, as well as drivers, can receive notification and take whatever action they deem appropriate.
> Xen suspend/resume OTOH is used when a guest is migrated and the code should go to all lengths possible to make any software running inside the guest (other than Xen specific enlightened code, such as PV drivers) completely unaware that anything has actually happened.

So from what you say above PM state transitions are notified to all
drivers, and Xen suspend/resume is only notified to PV drivers, and
here we are speaking about blkfront which is a PV driver, and should
get notified in both cases. So I'm unsure why the same (or at least
very similar) approach can't be used in both cases.

The suspend/resume approach proposed by this patch is completely
different than the one used by a xenbus initiated suspend/resume, and
I don't see a technical reason that warrants this difference.

I'm not saying that the approach used here is wrong, it's just that I
don't see the point in having two different ways to do suspend/resume
in the same driver, unless there's a technical reason for it, which I
don't think has been provided.

I would be fine with switching xenbus initiated suspend/resume to also
use the approach proposed here: freeze the queues and drain the shared
rings before suspending.

> So, whilst it may be possible to use common routines to, for example, re-establish PV frontend/backend communication, PV frontend code should be acutely aware of the circumstances they are operating in. I can cite example code in the Windows PV driver, which have supported guest S3/S4 power state transitions since day 1.

Hm, please bear with me, as I'm not sure I fully understand. Why isn't
the current suspend/resume logic suitable for PM transitions?

As said above, I'm happy to switch xenbus initiated suspend/resume to
use the logic in this patch, but unless there's a technical reason for
it I don't see why blkfront should have two completely different
approaches to suspend/resume depending on whether it's a PM or a
xenbus state change.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply

* Re: [igt-dev] [PATCH i-g-t] intel-ci: add a pre-merge blacklist to reduce the testing queue
From: Chris Wilson @ 2020-02-20 15:46 UTC (permalink / raw)
  To: Martin Peres, igt-dev
In-Reply-To: <20200220153209.210767-1-martin.peres@linux.intel.com>

Quoting Martin Peres (2020-02-20 15:32:09)
> +###############################################################################
> +# Modern userspace does not depend on the GTT anymore, so let's drop the
> +# slowest tests from pre-merge testing:
> +#
> +# - shard-skl: 2.7% (~6.5 minutes)
> +# - shard-kbl: 2% (~2.5 minutes)
> +# - shard-apl: 4.7% (~8.5 minutes)
> +# - shard-glk: 3.5% (~8 minutes)
> +# - shard-icl: 4.2% (~8.5 minutes)
> +# - shard-tgl: 2.5% (~4.5 minutes)
> +#
> +# Data acquired on 2020-02-20 by Martin Peres
> +###############################################################################
> +igt@gem_fence_thrash@bo-write-verify-threaded-[xy]
> +igt@gem_tiled_(wc|(|blits|fence_blits)@(normal|interruptible))

So where's the pre-merge regression testing then?
Just speed the tests up.
-Chris
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply

* Re: [PATCH v3 2/2] qemu-cpu-models.rst: Document -noTSX, mds-no, taa-no, and tsx-ctrl
From: Peter Maydell @ 2020-02-20 15:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Richard Henderson, Daniel P. Berrange, QEMU Developers,
	Eduardo Habkost, Kashyap Chamarthy
In-Reply-To: <4c3f3d85-9499-2e48-124b-18cc0dc36c8a@redhat.com>

On Thu, 20 Feb 2020 at 14:52, Paolo Bonzini <pbonzini@redhat.com> wrote:
> ... refer to Intel's `deep dive into MDS <...>`_.
>
> (I don't know what the trailing underscore is for.

It's because the external-hyperlink syntax is a complication
of the simpler format for within-document references. Inside
a document, you can have a simple link like this:

  This is a link to the target_ defined below.

  .. _target:

  This is where the link target defined above goes to.

So trailing underscore is for words that go to somewhere else,
and leading underscore for words that come from somewhere else.

Syntax complication 1 is that instead of word_ you can
say `some phrase with spaces`_ if you want the link to
span more than one word. (The docutils spec calls these
"phrase-references").

Syntax complication 2 is that instead of having to define
the target of an external URL separately from the place
you wanted to refer to it, like this:

  .. _target: http://somewhere-else.org/

you can directly embed it inside a phrase-reference:

  Go to `somewhere external <http://somewhere-else.org/>`_

which is more convenient if you only wanted to use the URL once.
But the _ is still there because it's still the markup that
indicates "this is going be a link to go somewhere".

> I reaffirm my definition of rST as the Perl of markup formats).

Not going to argue with that :-)  But like Perl, there's
usually some kind of a rationale lurking under the surface.

thanks
-- PMM


^ permalink raw reply

* Re: [PATCH] usbnet: optimize barrier usage for Rmw atomic bitops
From: Davidlohr Bueso @ 2020-02-20 15:35 UTC (permalink / raw)
  To: oneukum; +Cc: linux-usb, linux-kernel, Davidlohr Bueso
In-Reply-To: <20200129181646.25487-1-dave@stgolabs.net>

ping?

^ permalink raw reply

* Re: [cip-dev] [PATCH 4.19.y-cip 15/23] usb: typec: driver for TI HD3SS3220 USB Type-C DRP port controller
From: Pavel Machek @ 2020-02-20 15:46 UTC (permalink / raw)
  To: Marian-Cristian Rotariu; +Cc: cip-dev@lists.cip-project.org
In-Reply-To: <OSAPR01MB502894683145EEC15BD1DDD7AE130@OSAPR01MB5028.jpnprd01.prod.outlook.com>


[-- Attachment #1.1: Type: text/plain, Size: 1617 bytes --]

Hi!

> > > +static enum usb_role hd3ss3220_get_attached_state(struct hd3ss3220
> > > +*hd3ss3220) {
> > > +	unsigned int reg_val;
> > > +	enum usb_role attached_state;
> > > +	int ret;
> > > +
> > > +	ret = regmap_read(hd3ss3220->regmap,
> > HD3SS3220_REG_CN_STAT_CTRL,
> > > +			  &reg_val);
> > > +	if (ret < 0)
> > > +		return ret;
> > 
> > This function claims to return "enum usb_role", but here it returns errno
> > from regmap_read.

######### HERE

> > > +static int hd3ss3220_dr_set(const struct typec_capability *cap,
> > > +			    enum typec_data_role role)
> > > +{
> > ...
> > > +	ret = hd3ss3220_set_source_pref(hd3ss3220, pref);
> > > +	usleep_range(10, 100);
> > 
> > Would udelay() make more sense here? Are your CPUs / timer subsystem so
> > fast that sleeping for 10usec is possible and reasonable to do?
> 
> I think Biju followed the overall indications from:
> Documentation/timers-howto.txt (.rst)
> 
> According to the documentation the inflection point is at 10us, so this is kind
> of a grey area. Just to put it in context, the RZ/G2E has 2x1.2Ghz ARM Cortex A53.
> Therefore this is kind of a high-performance embedded platform. At least, we
> like to advertise it like that 😊
> 
> So, I am inclined to keep this as it is if you agree. 

It is not a big deal to keep it. What about the "returning -errno but
return type is enum" issue above? (Marked with "HERE").

Best regards,
								Pavel

-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
cip-dev mailing list
cip-dev@lists.cip-project.org
https://lists.cip-project.org/mailman/listinfo/cip-dev

^ permalink raw reply

* [Cluster-devel] [PATCH v7 14/24] btrfs: Convert from readpages to readahead
From: Christoph Hellwig @ 2020-02-20 15:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com
In-Reply-To: <20200220134849.GV24185@bombadil.infradead.org>

On Thu, Feb 20, 2020 at 05:48:49AM -0800, Matthew Wilcox wrote:
> btrfs: Convert from readpages to readahead
>   
> Implement the new readahead method in btrfs.  Add a readahead_page_batch()
> to optimise fetching a batch of pages at once.

Shouldn't this readahead_page_batch heper go into a separate patch so
that it clearly stands out?




^ permalink raw reply

* [Ocfs2-devel] [PATCH v7 14/24] btrfs: Convert from readpages to readahead
From: Christoph Hellwig @ 2020-02-20 15:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Johannes Thumshirn, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org
In-Reply-To: <20200220134849.GV24185@bombadil.infradead.org>

On Thu, Feb 20, 2020 at 05:48:49AM -0800, Matthew Wilcox wrote:
> btrfs: Convert from readpages to readahead
>   
> Implement the new readahead method in btrfs.  Add a readahead_page_batch()
> to optimise fetching a batch of pages at once.

Shouldn't this readahead_page_batch heper go into a separate patch so
that it clearly stands out?

^ permalink raw reply

* Re: [PATCH v7 14/24] btrfs: Convert from readpages to readahead
From: Christoph Hellwig @ 2020-02-20 15:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Johannes Thumshirn, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	ocfs2-devel@oss.oracle.com, linux-xfs@vger.kernel.org
In-Reply-To: <20200220134849.GV24185@bombadil.infradead.org>

On Thu, Feb 20, 2020 at 05:48:49AM -0800, Matthew Wilcox wrote:
> btrfs: Convert from readpages to readahead
>   
> Implement the new readahead method in btrfs.  Add a readahead_page_batch()
> to optimise fetching a batch of pages at once.

Shouldn't this readahead_page_batch heper go into a separate patch so
that it clearly stands out?

^ permalink raw reply

* Re: [dpdk-dev] [PATCH v2] app/testpmd: fix return value in portlist parser
From: Govindharajan, Hariprasad @ 2020-02-20 15:46 UTC (permalink / raw)
  To: Lu, Wenzhuo, Wu, Jingjing, Iremonger, Bernard
  Cc: dev@dpdk.org, Yigit, Ferruh, stephen@networkplumber.org,
	david.marchand@redhat.com
In-Reply-To: <1582213402-18941-1-git-send-email-hariprasad.govindharajan@intel.com>



> -----Original Message-----
> From: Govindharajan, Hariprasad <hariprasad.govindharajan@intel.com>
> Sent: Thursday, February 20, 2020 3:43 PM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Iremonger, Bernard
> <bernard.iremonger@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>;
> stephen@networkplumber.org; david.marchand@redhat.com;
> Govindharajan, Hariprasad <hariprasad.govindharajan@intel.com>
> Subject: [PATCH v2] app/testpmd: fix return value in portlist parser
> 
> The function parse_port_list() is designed to return unsigned int value. After
> sanitizing the inputs, it is returning -1. Changed it to return 0.
> 
> Fixes: 2df00d562d20 ("app/testpmd: add --portlist option")
> Cc: hariprasad.govindharajan@intel.com
> 
> Signed-off-by: Hariprasad Govindharajan
> <hariprasad.govindharajan@intel.com>
> ---
>  app/test-pmd/config.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c index
> 9d95202..91db508 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -2642,7 +2642,7 @@ parse_port_list(const char *list, unsigned int
> *values, unsigned int maxsize)
>  	unsigned int marked[maxsize];
> 
>  	if (list == NULL || values == NULL)
> -		return -1;
> +		return 0;
> 
>  	for (i = 0; i < (int)maxsize; i++)
>  		marked[i] = 0;
> --
> 2.7.4

This patch was acked previously by Bernard Iremonger.
Just changed the patch head line to correct format.

Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>

^ permalink raw reply

* Re: [Xen-devel] [PATCH] x86/vpt: update last_guest_time with cmpxchg and drop pl_time_lock
From: Jan Beulich @ 2020-02-20 15:47 UTC (permalink / raw)
  To: Igor Druzhinin; +Cc: xen-devel, roger.pau, wl, andrew.cooper3
In-Reply-To: <eb6156eb-6a6d-28f5-c8ec-081f81444b99@citrix.com>

On 20.02.2020 16:37, Igor Druzhinin wrote:
> On 20/02/2020 08:27, Jan Beulich wrote:
>> On 19.02.2020 19:52, Igor Druzhinin wrote:
>>> On 19/02/2020 07:48, Jan Beulich wrote:
>>>> On 20.12.2019 22:39, Igor Druzhinin wrote:
>>>>> @@ -38,24 +37,22 @@ void hvm_init_guest_time(struct domain *d)
>>>>>  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc)
>>>>>  {
>>>>>      struct pl_time *pl = v->domain->arch.hvm.pl_time;
>>>>> -    u64 now;
>>>>> +    s_time_t old, new, now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>>  
>>>>>      /* Called from device models shared with PV guests. Be careful. */
>>>>>      ASSERT(is_hvm_vcpu(v));
>>>>>  
>>>>> -    spin_lock(&pl->pl_time_lock);
>>>>> -    now = get_s_time_fixed(at_tsc) + pl->stime_offset;
>>>>> -
>>>>>      if ( !at_tsc )
>>>>>      {
>>>>> -        if ( (int64_t)(now - pl->last_guest_time) > 0 )
>>>>> -            pl->last_guest_time = now;
>>>>> -        else
>>>>> -            now = ++pl->last_guest_time;
>>>>> +        do {
>>>>> +            old = pl->last_guest_time;
>>>>> +            new = now > pl->last_guest_time ? now : old + 1;
>>>>> +        } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>>>
>>>> I wonder whether you wouldn't better re-invoke get_s_time() in
>>>> case you need to retry here. See how the function previously
>>>> was called only after the lock was already acquired.
>>>
>>> If there is a concurrent writer, wouldn't it just update pl->last_guest_time
>>> with the new get_s_time() and then we subsequently would just use the new
>>> time on retry?
>>
>> Yes, it would, but the latency until the retry actually occurs
>> is unknown (in particular if Xen itself runs virtualized). I.e.
>> in the at_tsc == 0 case I think the value would better be
>> re-calculated on every iteration.
> 
> Why does it need to be recalculated if a concurrent writer did this
> for us already anyway and (get_s_time_fixed(at_tsc) + pl->stime_offset)
> value is common for all of vCPUs? Yes, it might reduce jitter slightly
> but overall latency could come from any point (especially in case of
> rinning virtualized) and it's important just to preserve invariant that
> the value is monotonic across vCPUs.

I'm afraid I don't follow: If we rely on remote CPUs updating
pl->last_guest_time, then what we'd return is whatever was put
there plus one. Whereas the correct value might be dozens of
clocks further ahead.

>> Anther thing I notice only now are the multiple reads of
>> pl->last_guest_time. Wouldn't you better do
>>
>>         do {
>>             old = ACCESS_ONCE(pl->last_guest_time);
>>             new = now > old ? now : old + 1;
>>         } while ( cmpxchg(&pl->last_guest_time, old, new) != old );
>>
>> ?
> 
> Fair enough, although even reading it multiple times wouldn't cause
> any harm as any inconsistency would be resolved by cmpxchg op.

Afaics "new", if calculated from a value latched _earlier_
than "old", could cause time to actually move backwards. Reads
can be re-ordered, after all.

> I'd
> prefer to make it in a separate commit to unify it with pv_soft_rdtsc().

I'd be fine if you changed pv_soft_rdtsc() first, and then
made the code here match. But I don't think the code should be
introduced in other than its (for the time being) final shape.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply

* Re: [PATCH v7 14/24] btrfs: Convert from readpages to readahead
From: Christoph Hellwig @ 2020-02-20 15:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-xfs@vger.kernel.org, Johannes Thumshirn,
	linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	linux-mm@kvack.org, ocfs2-devel@oss.oracle.com,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-erofs@lists.ozlabs.org, linux-btrfs@vger.kernel.org
In-Reply-To: <20200220134849.GV24185@bombadil.infradead.org>

On Thu, Feb 20, 2020 at 05:48:49AM -0800, Matthew Wilcox wrote:
> btrfs: Convert from readpages to readahead
>   
> Implement the new readahead method in btrfs.  Add a readahead_page_batch()
> to optimise fetching a batch of pages at once.

Shouldn't this readahead_page_batch heper go into a separate patch so
that it clearly stands out?

^ permalink raw reply

* [Cluster-devel] [PATCH v7 21/24] iomap: Restructure iomap_readpages_actor
From: Christoph Hellwig @ 2020-02-20 15:47 UTC (permalink / raw)
  To: cluster-devel.redhat.com
In-Reply-To: <20200219210103.32400-22-willy@infradead.org>

On Wed, Feb 19, 2020 at 01:01:00PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> By putting the 'have we reached the end of the page' condition at the end
> of the loop instead of the beginning, we can remove the 'submit the last
> page' code from iomap_readpages().  Also check that iomap_readpage_actor()
> didn't return 0, which would lead to an endless loop.

I'm obviously biassed a I wrote the original code, but I find the new
very much harder to understand (not that the previous one was easy, this
is tricky code..).




^ permalink raw reply

* [Ocfs2-devel] [PATCH v7 21/24] iomap: Restructure iomap_readpages_actor
From: Christoph Hellwig @ 2020-02-20 15:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs
In-Reply-To: <20200219210103.32400-22-willy@infradead.org>

On Wed, Feb 19, 2020 at 01:01:00PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> By putting the 'have we reached the end of the page' condition at the end
> of the loop instead of the beginning, we can remove the 'submit the last
> page' code from iomap_readpages().  Also check that iomap_readpage_actor()
> didn't return 0, which would lead to an endless loop.

I'm obviously biassed a I wrote the original code, but I find the new
very much harder to understand (not that the previous one was easy, this
is tricky code..).

^ permalink raw reply

* Re: [PATCH v7 21/24] iomap: Restructure iomap_readpages_actor
From: Christoph Hellwig @ 2020-02-20 15:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-fsdevel, linux-mm, linux-kernel, linux-btrfs, linux-erofs,
	linux-ext4, linux-f2fs-devel, cluster-devel, ocfs2-devel,
	linux-xfs
In-Reply-To: <20200219210103.32400-22-willy@infradead.org>

On Wed, Feb 19, 2020 at 01:01:00PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> By putting the 'have we reached the end of the page' condition at the end
> of the loop instead of the beginning, we can remove the 'submit the last
> page' code from iomap_readpages().  Also check that iomap_readpage_actor()
> didn't return 0, which would lead to an endless loop.

I'm obviously biassed a I wrote the original code, but I find the new
very much harder to understand (not that the previous one was easy, this
is tricky code..).

^ permalink raw reply

* Re: [PATCH v7 21/24] iomap: Restructure iomap_readpages_actor
From: Christoph Hellwig @ 2020-02-20 15:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-xfs, linux-kernel, linux-f2fs-devel, cluster-devel,
	linux-mm, ocfs2-devel, linux-fsdevel, linux-ext4, linux-erofs,
	linux-btrfs
In-Reply-To: <20200219210103.32400-22-willy@infradead.org>

On Wed, Feb 19, 2020 at 01:01:00PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> By putting the 'have we reached the end of the page' condition at the end
> of the loop instead of the beginning, we can remove the 'submit the last
> page' code from iomap_readpages().  Also check that iomap_readpage_actor()
> didn't return 0, which would lead to an endless loop.

I'm obviously biassed a I wrote the original code, but I find the new
very much harder to understand (not that the previous one was easy, this
is tricky code..).

^ permalink raw reply

* Re: [PATCH 3/8] btrfs: move the root freeing stuff into btrfs_put_root
From: Josef Bacik @ 2020-02-20 15:48 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team
In-Reply-To: <058f94f8-7fb6-9dfb-61e0-21dc989e22bc@suse.com>

On 2/19/20 10:10 AM, Nikolay Borisov wrote:
> 
> 
> On 14.02.20 г. 23:11 ч., Josef Bacik wrote:
>> There are a few different ways to free roots, either you allocated them
>> yourself and you just do
>>
>> free_extent_buffer(root->node);
>> free_extent_buffer(root->commit_node);
>> btrfs_put_root(root);
>>
>> Which is the pattern for log roots.  Or for snapshots/subvolumes that
>> are being dropped you simply call btrfs_free_fs_root() which does all
>> the cleanup for you.
>>
>> Unify this all into btrfs_put_root(), so that we don't free up things
>> associated with the root until the last reference is dropped.  This
>> makes the root freeing code much more significant.
>>
>> The only caveat is at close_ctree() time we have to free the extent
>> buffers for all of our main roots (extent_root, chunk_root, etc) because
>> we have to drop the btree_inode and we'll run into issues if we hold
>> onto those nodes until ->kill_sb() time.  This will be addressed in the
>> future when we kill the btree_inode.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Nit: This patch obsoleted the last comment in btrfs_init_fs_root, namely:
> 
> /* The caller is responsible to call btrfs_free_fs_root */
> 
>> ---
>>   fs/btrfs/disk-io.c           | 64 ++++++++++++++++++------------------
>>   fs/btrfs/disk-io.h           | 16 +--------
>>   fs/btrfs/extent-tree.c       |  7 ++--
>>   fs/btrfs/extent_io.c         | 16 +++++++--
>>   fs/btrfs/free-space-tree.c   |  2 --
>>   fs/btrfs/qgroup.c            |  7 +---
>>   fs/btrfs/relocation.c        |  4 ---
>>   fs/btrfs/tests/btrfs-tests.c |  5 +--
>>   fs/btrfs/tree-log.c          |  6 ----
>>   9 files changed, 50 insertions(+), 77 deletions(-)
>>
> 
> <snip>
> 
>> @@ -4795,7 +4803,6 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>>   
>>   static void __free_extent_buffer(struct extent_buffer *eb)
>>   {
>> -	btrfs_leak_debug_del(&eb->fs_info->eb_leak_lock, &eb->leak_list);
>>   	kmem_cache_free(extent_buffer_cache, eb);
>>   }
> 
> This function becomes a trivial wrapper so should be eliminated altogether.
> 
> <snip>
> 
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 034f5f151a74..4fb7e3cc2aca 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -2549,10 +2549,6 @@ void free_reloc_roots(struct list_head *list)
>>   		reloc_root = list_entry(list->next, struct btrfs_root,
>>   					root_list);
>>   		__del_reloc_root(reloc_root);
>> -		free_extent_buffer(reloc_root->node);
>> -		free_extent_buffer(reloc_root->commit_root);
>> -		reloc_root->node = NULL;
>> -		reloc_root->commit_root = NULL;
> 
> Shouldn't you do btrfs_put_root(reloc_root) here ?

No, but I can see how this is confusing.  The reloc root is actually cleaned up 
in clean_dirty_subvols(), so it's final put happens there.  Thanks,

Josef

^ permalink raw reply

* Re: [dpdk-dev] [PATCH] net/mlx5: fix metadata split with encap action
From: Raslan Darawsheh @ 2020-02-20 15:48 UTC (permalink / raw)
  To: Matan Azrad, dev@dpdk.org; +Cc: Slava Ovsiienko, stable@dpdk.org
In-Reply-To: <1582209810-20546-1-git-send-email-matan@mellanox.com>

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Matan Azrad
> Sent: Thursday, February 20, 2020 4:44 PM
> To: dev@dpdk.org
> Cc: Slava Ovsiienko <viacheslavo@mellanox.com>; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH] net/mlx5: fix metadata split with encap action
> 
> In order to move the mbuf metadata from the WQE to the FDB steering
> domain, the PMD add for each NIC TX flow a new action to copy the
> metadata register REG_A to REG_C_0.
> 
> This copy action is considered as modify header action from HW
> perspective.
> 
> The HW doesn't support to do modify header action after ant
> encapsulation action.
> 
> The split metadata function wrongly added the copy action in the end of
> the original actions list, hence, NIC egress flow with encapapsulation
> action failed when the PMD worked with dv_xmeta_en mode.
> 
> Move the copy action to be before and back to back with the
> encapsulation action for the aforementioned case.
> 
> Fixes: 71e254bc0294 ("net/mlx5: split Rx flows to provide metadata copy")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 66
> ++++++++++++++++++++++++++++++++++----------
>  1 file changed, 51 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index fa58546..60aab16 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -2744,7 +2744,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  }
> 
>  /**
> - * Get QUEUE/RSS action from the action list.
> + * Get metadata split action information.
>   *
>   * @param[in] actions
>   *   Pointer to the list of actions.
> @@ -2753,18 +2753,38 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>   * @param[out] qrss_type
>   *   Pointer to the action type to return. RTE_FLOW_ACTION_TYPE_END is
> returned
>   *   if no QUEUE/RSS is found.
> + * @param[out] encap_idx
> + *   Pointer to the index of the encap action if exists, otherwise the last
> + *   action index.
>   *
>   * @return
>   *   Total number of actions.
>   */
>  static int
> -flow_parse_qrss_action(const struct rte_flow_action actions[],
> -		       const struct rte_flow_action **qrss)
> +flow_parse_metadata_split_actions_info(const struct rte_flow_action
> actions[],
> +				       const struct rte_flow_action **qrss,
> +				       int *encap_idx)
>  {
> +	const struct rte_flow_action_raw_encap *raw_encap;
>  	int actions_n = 0;
> +	int raw_decap_idx = -1;
> 
> +	*encap_idx = -1;
>  	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
>  		switch (actions->type) {
> +		case RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP:
> +		case RTE_FLOW_ACTION_TYPE_NVGRE_ENCAP:
> +			*encap_idx = actions_n;
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_RAW_DECAP:
> +			raw_decap_idx = actions_n;
> +			break;
> +		case RTE_FLOW_ACTION_TYPE_RAW_ENCAP:
> +			raw_encap = actions->conf;
> +			if (raw_encap->size >
> MLX5_ENCAPSULATION_DECISION_SIZE)
> +				*encap_idx = raw_decap_idx != -1 ?
> +						      raw_decap_idx :
> actions_n;
> +			break;
>  		case RTE_FLOW_ACTION_TYPE_QUEUE:
>  		case RTE_FLOW_ACTION_TYPE_RSS:
>  			*qrss = actions;
> @@ -2774,6 +2794,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  		}
>  		actions_n++;
>  	}
> +	if (*encap_idx == -1)
> +		*encap_idx = actions_n;
>  	/* Count RTE_FLOW_ACTION_TYPE_END. */
>  	return actions_n + 1;
>  }
> @@ -3739,6 +3761,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>   *   Number of actions in the list.
>   * @param[out] error
>   *   Perform verbose error reporting if not NULL.
> + * @param[in] encap_idx
> + *   The encap action inndex.
>   *
>   * @return
>   *   0 on success, negative value otherwise
> @@ -3747,7 +3771,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  flow_mreg_tx_copy_prep(struct rte_eth_dev *dev,
>  		       struct rte_flow_action *ext_actions,
>  		       const struct rte_flow_action *actions,
> -		       int actions_n, struct rte_flow_error *error)
> +		       int actions_n, struct rte_flow_error *error,
> +		       int encap_idx)
>  {
>  	struct mlx5_flow_action_copy_mreg *cp_mreg =
>  		(struct mlx5_flow_action_copy_mreg *)
> @@ -3762,15 +3787,24 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  	if (ret < 0)
>  		return ret;
>  	cp_mreg->src = ret;
> -	memcpy(ext_actions, actions,
> -			sizeof(*ext_actions) * actions_n);
> -	ext_actions[actions_n - 1] = (struct rte_flow_action){
> -		.type = MLX5_RTE_FLOW_ACTION_TYPE_COPY_MREG,
> -		.conf = cp_mreg,
> -	};
> -	ext_actions[actions_n] = (struct rte_flow_action){
> -		.type = RTE_FLOW_ACTION_TYPE_END,
> -	};
> +	if (encap_idx != 0)
> +		memcpy(ext_actions, actions, sizeof(*ext_actions) *
> encap_idx);
> +	if (encap_idx == actions_n - 1) {
> +		ext_actions[actions_n - 1] = (struct rte_flow_action){
> +			.type =
> MLX5_RTE_FLOW_ACTION_TYPE_COPY_MREG,
> +			.conf = cp_mreg,
> +		};
> +		ext_actions[actions_n] = (struct rte_flow_action){
> +			.type = RTE_FLOW_ACTION_TYPE_END,
> +		};
> +	} else {
> +		ext_actions[encap_idx] = (struct rte_flow_action){
> +			.type =
> MLX5_RTE_FLOW_ACTION_TYPE_COPY_MREG,
> +			.conf = cp_mreg,
> +		};
> +		memcpy(ext_actions + encap_idx + 1, actions + encap_idx,
> +				sizeof(*ext_actions) * (actions_n -
> encap_idx));
> +	}
>  	return 0;
>  }
> 
> @@ -3821,6 +3855,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  	int mtr_sfx = 0;
>  	size_t act_size;
>  	int actions_n;
> +	int encap_idx;
>  	int ret;
> 
>  	/* Check whether extensive metadata feature is engaged. */
> @@ -3830,7 +3865,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  		return flow_create_split_inner(dev, flow, NULL,
> prefix_layers,
>  					       attr, items, actions, external,
>  					       error);
> -	actions_n = flow_parse_qrss_action(actions, &qrss);
> +	actions_n = flow_parse_metadata_split_actions_info(actions, &qrss,
> +							   &encap_idx);
>  	if (qrss) {
>  		/* Exclude hairpin flows from splitting. */
>  		if (qrss->type == RTE_FLOW_ACTION_TYPE_QUEUE) {
> @@ -3905,7 +3941,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> rte_eth_dev *dev, int32_t priority,
>  						  "metadata flow");
>  		/* Create the action list appended with copy register. */
>  		ret = flow_mreg_tx_copy_prep(dev, ext_actions, actions,
> -					     actions_n, error);
> +					     actions_n, error, encap_idx);
>  		if (ret < 0)
>  			goto exit;
>  	}
> --
> 1.8.3.1


Fixed typo in commit log,

And patch applied to next-net-mlx,

Kindest regards
Raslan Darawsheh

^ permalink raw reply


This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.