* [PATCH v9 2/4] x86/elf: use HWCAP2 to expose ring 3 MWAIT
From: Grzegorz Andrejczuk @ 2016-11-09 13:46 UTC
  To: tglx, mingo, hpa, x86
  Cc: bp, dave.hansen, lukasz.daniluk, james.h.cownie, jacob.jun.pan,
	Piotr.Luc, linux-kernel, Grzegorz Andrejczuk
In-Reply-To: <1478699194-30946-1-git-send-email-grzegorz.andrejczuk@intel.com>

Add HWCAP2 for x86 and reserve its bit 0 to expose
ring 3 MONITOR/MWAIT to userspace.

Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
---
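Note for reviewers: a minimal user-space sketch of how an application
might consume the new bit, assuming glibc's getauxval() and AT_HWCAP2
from <sys/auxv.h>. HWCAP2_RING3MWAIT is redefined locally to mirror the
value in this patch; both helper names are hypothetical and not part of
the series.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/* Mirrors the definition added in uapi/asm/hwcap2.h by this patch. */
#define HWCAP2_RING3MWAIT (1 << 0)

/* Hypothetical helper: test the ring 3 MWAIT bit in a raw HWCAP2 word. */
static inline bool hwcap2_has_ring3_mwait(unsigned long hwcap2)
{
	return (hwcap2 & HWCAP2_RING3MWAIT) != 0;
}

/* Hypothetical helper: query the auxiliary vector of the current process. */
static inline bool ring3_mwait_available(void)
{
	/* getauxval() returns 0 when the AT_HWCAP2 entry is not present. */
	return hwcap2_has_ring3_mwait(getauxval(AT_HWCAP2));
}
```

An application would call ring3_mwait_available() once at startup and
fall back to a different wait mechanism when it returns false.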
 arch/x86/include/asm/elf.h         | 9 +++++++++
 arch/x86/include/uapi/asm/hwcap2.h | 7 +++++++
 arch/x86/kernel/cpu/common.c       | 3 +++
 3 files changed, 19 insertions(+)
 create mode 100644 arch/x86/include/uapi/asm/hwcap2.h

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index e7f155c..59703aa 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -258,6 +258,15 @@ extern int force_personality32;
 
 #define ELF_HWCAP		(boot_cpu_data.x86_capability[CPUID_1_EDX])
 
+extern unsigned int elf_hwcap2;
+
+/*
+ * HWCAP2 supplies mask with kernel enabled CPU features, so that
+ * the application can discover that it can safely use them.
+ * The bits are defined in uapi/asm/hwcap2.h.
+ */
+#define ELF_HWCAP2		elf_hwcap2
+
 /* This yields a string that ld.so will use to load implementation
    specific libraries for optimization.  This is more specific in
    intent than poking at uname or /proc/cpuinfo.
diff --git a/arch/x86/include/uapi/asm/hwcap2.h b/arch/x86/include/uapi/asm/hwcap2.h
new file mode 100644
index 0000000..116cab3
--- /dev/null
+++ b/arch/x86/include/uapi/asm/hwcap2.h
@@ -0,0 +1,7 @@
+#ifndef _ASM_X86_HWCAP2_H
+#define _ASM_X86_HWCAP2_H
+
+/* Kernel enabled Ring 3 MONITOR/MWAIT*/
+#define HWCAP2_RING3MWAIT		(1 << 0)
+
+#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bcc9ccc..fdbf708 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -35,6 +35,7 @@
 #include <asm/desc.h>
 #include <asm/fpu/internal.h>
 #include <asm/mtrr.h>
+#include <asm/hwcap2.h>
 #include <linux/numa.h>
 #include <asm/asm.h>
 #include <asm/bugs.h>
@@ -51,6 +52,8 @@
 
 #include "cpu.h"
 
+unsigned int elf_hwcap2 __read_mostly;
+
 /* all of these masks are initialized in setup_cpu_local_masks() */
 cpumask_var_t cpu_initialized_mask;
 cpumask_var_t cpu_callout_mask;
-- 
2.5.1


* [PATCH v9 1/4] x86/msr: add MSR_MISC_FEATURE_ENABLES and RING3MWAIT bit
From: Grzegorz Andrejczuk @ 2016-11-09 13:46 UTC
  To: tglx, mingo, hpa, x86
  Cc: bp, dave.hansen, lukasz.daniluk, james.h.cownie, jacob.jun.pan,
	Piotr.Luc, linux-kernel, Grzegorz Andrejczuk
In-Reply-To: <1478699194-30946-1-git-send-email-grzegorz.andrejczuk@intel.com>

Intel Xeon Phi x200 (codenamed Knights Landing) allows the MONITOR and
MWAIT instructions to be enabled outside of ring 0.

The feature is controlled by MSR MISC_FEATURE_ENABLES (0x140).
Setting bit 1 of this register enables it, so MONITOR and MWAIT
instructions do not cause invalid-opcode exceptions when invoked
outside of ring 0.
The feature MSR is not yet documented in the SDM. Here is
the relevant documentation:

Hex   Dec  Name                    Scope
140H  320  MISC_FEATURE_ENABLES    Thread
           0    Reserved
           1    if set to 1, the MONITOR and MWAIT instructions do not
                cause invalid-opcode exceptions when executed with CPL > 0
                or in virtual-8086 mode. If MWAIT is executed when CPL > 0
                or in virtual-8086 mode, and if EAX indicates a C-state
                other than C0 or C1, the instruction operates as if EAX
                indicated the C-state C1.
           63:2 Reserved

Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
---
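Note for reviewers: the bit layout described above can be exercised in
plain C. This is a minimal sketch that operates on an ordinary 64-bit
value rather than the real MSR (accessing the MSR needs ring 0). The
macros mirror the ones added below; the two helper names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the definitions added to msr-index.h by this patch. */
#define MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT	1
#define MSR_MISC_FEATURE_ENABLES_RING3MWAIT \
	(1ULL << MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT)

/* Hypothetical helper: return val with the ring 3 MWAIT bit set or cleared. */
static inline uint64_t ring3mwait_update(uint64_t val, bool enable)
{
	if (enable)
		return val | MSR_MISC_FEATURE_ENABLES_RING3MWAIT;
	return val & ~MSR_MISC_FEATURE_ENABLES_RING3MWAIT;
}

/* Hypothetical helper: report whether the bit is set in val. */
static inline bool ring3mwait_enabled(uint64_t val)
{
	return (val & MSR_MISC_FEATURE_ENABLES_RING3MWAIT) != 0;
}
```

Only bit 1 is touched, so the reserved bits 0 and 63:2 keep whatever
value was read.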
 arch/x86/include/asm/msr-index.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 56f4c66..c95da90 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -540,6 +540,11 @@
 #define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT	39
 #define MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE		(1ULL << MSR_IA32_MISC_ENABLE_IP_PREF_DISABLE_BIT)
 
+/* Intel Xeon Phi x200 ring 3 MONITOR/MWAIT */
+#define MSR_MISC_FEATURE_ENABLES	0x00000140
+#define MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT	1
+#define MSR_MISC_FEATURE_ENABLES_RING3MWAIT	(1ULL << MSR_MISC_FEATURE_ENABLES_RING3MWAIT_BIT)
+
 #define MSR_IA32_TSC_DEADLINE		0x000006E0
 
 /* P4/Xeon+ specific */
-- 
2.5.1


* [PATCH v9 0/4] Enabling Ring 3 MONITOR/MWAIT feature for Knights Landing
From: Grzegorz Andrejczuk @ 2016-11-09 13:46 UTC
  To: tglx, mingo, hpa, x86
  Cc: bp, dave.hansen, lukasz.daniluk, james.h.cownie, jacob.jun.pan,
	Piotr.Luc, linux-kernel, Grzegorz Andrejczuk

These patches enable the Intel Xeon Phi x200 feature that allows the
MONITOR/MWAIT instructions to be used in ring 3 (userspace). The patches
set the feature bit in MSR 0x140 on all logical CPUs, expose it as a CPU
feature, and introduce the ELF HWCAP2 capability for x86.
Reference:
https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait

v9:
Removed PHI from defines

v8:
Fixed commit messages
Removed logging
Used msr_set/clear_bit functions instead of wrmsrl
Fixed documentation
Renamed HWCAP2_PHIR3MWAIT to HWCAP2_RING3MWAIT

v7:
Changed the order of the patches; the code looks cleaner this way.
Changed the name of MSR to MSR_MISC_FEATURE_ENABLES.
Used Word 3 25th bit to expose feature.

v6: 

v5:
When phir3mwait=disable is given on the cmdline, switch off the ring 3 MWAIT feature
Fix typos

v4:
Wrapped the enabling code by CONFIG_X86_64
Add documentation for phir3mwait=disable cmdline switch
Move probe_ function call from early_intel_init to intel_init
Fixed commit messages

v3:
Included Dave's and Thomas's comments

v2:
Check MSR before wrmsrl
Shortened names
Used Word 3 for feature init_scattered_cpuid_features()
Fixed commit messages

Grzegorz Andrejczuk (4):
  x86/msr: Add MSR_MISC_FEATURE_ENABLES and RING3MWAIT bit
  x86/elf: Use HWCAP2 to expose ring 3 MWAIT
  x86/cpufeature: Add RING3MWAIT to CPU features
  x86/cpufeatures: Handle RING3MWAIT on Xeon Phi models

 Documentation/kernel-parameters.txt       |  5 ++++
 Documentation/x86/x86_64/boot-options.txt |  5 ++++
 arch/x86/include/asm/cpufeatures.h        |  2 +-
 arch/x86/include/asm/elf.h                |  9 +++++++
 arch/x86/include/asm/msr-index.h          |  5 ++++
 arch/x86/include/uapi/asm/hwcap2.h        |  7 ++++++
 arch/x86/kernel/cpu/common.c              |  3 +++
 arch/x86/kernel/cpu/intel.c               | 39 +++++++++++++++++++++++++++++++
 8 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/uapi/asm/hwcap2.h

-- 
2.5.1


* Re: [PATCH v3 1/4] drm: Add a new connector property for link status
From: Jani Nikula @ 2016-11-09 13:45 UTC
  To: intel-gfx; +Cc: Manasi Navare, Daniel Vetter, dri-devel
In-Reply-To: <1477949394-20483-1-git-send-email-manasi.d.navare@intel.com>

On Mon, 31 Oct 2016, Manasi Navare <manasi.d.navare@intel.com> wrote:
> A new default connector property is added for keeping
> track of whether the link is good (link training passed) or
> link is bad (link training  failed). If the link status property
> is not good, then userspace should fire off a new modeset at the current
> mode even if there have not been any changes in the mode list
> or connector status.
> Also add link status connector member corresponding to the
> decoded value of link status property.

I think it would be good to expand on this, both in the commit message
and documentation. This is UABI after all.

When is the property changed, who changes it, what is userspace
expected to do when the status goes from good to bad, and how does
userspace notice it's gone bad (uevents)?

While we want this for handling DP link training failures during mode
setting according to the DP spec, in a way that can pass the CTS, it's
also needed for async mode sets.

BR,
Jani.



>
> v3:
> * Drop "link training" from description since this is
> not specific to DP (Jani Nikula)
> * Add link status member to store property value locally
> (Ville Syrjala)
> v2:
> * Make this a default connector property (Daniel Vetter)
>
> Cc: dri-devel@lists.freedesktop.org
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
> ---
>  drivers/gpu/drm/drm_connector.c | 17 +++++++++++++++++
>  include/drm/drm_connector.h     |  7 ++++++-
>  include/drm/drm_crtc.h          |  5 +++++
>  include/uapi/drm/drm_mode.h     |  4 ++++
>  4 files changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index 2db7fb5..d4e852f 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -243,6 +243,10 @@ int drm_connector_init(struct drm_device *dev,
>  	drm_object_attach_property(&connector->base,
>  				      config->dpms_property, 0);
>  
> +	drm_object_attach_property(&connector->base,
> +				   config->link_status_property,
> +				   0);
> +
>  	if (drm_core_check_feature(dev, DRIVER_ATOMIC)) {
>  		drm_object_attach_property(&connector->base, config->prop_crtc_id, 0);
>  	}
> @@ -506,6 +510,12 @@ const char *drm_get_subpixel_order_name(enum subpixel_order order)
>  };
>  DRM_ENUM_NAME_FN(drm_get_dpms_name, drm_dpms_enum_list)
>  
> +static const struct drm_prop_enum_list drm_link_status_enum_list[] = {
> +	{ DRM_MODE_LINK_STATUS_GOOD, "Good" },
> +	{ DRM_MODE_LINK_STATUS_BAD, "Bad" },
> +};
> +DRM_ENUM_NAME_FN(drm_get_link_status_name, drm_link_status_enum_list)
> +
>  /**
>   * drm_display_info_set_bus_formats - set the supported bus formats
>   * @info: display info to store bus formats in
> @@ -622,6 +632,13 @@ int drm_connector_create_standard_properties(struct drm_device *dev)
>  		return -ENOMEM;
>  	dev->mode_config.tile_property = prop;
>  
> +	prop = drm_property_create_enum(dev, 0, "link-status",
> +					drm_link_status_enum_list,
> +					ARRAY_SIZE(drm_link_status_enum_list));
> +	if (!prop)
> +		return -ENOMEM;
> +	dev->mode_config.link_status_property = prop;
> +
>  	return 0;
>  }
>  
> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index ac9d7d8..5c335e8 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -682,6 +682,12 @@ struct drm_connector {
>  	uint8_t num_h_tile, num_v_tile;
>  	uint8_t tile_h_loc, tile_v_loc;
>  	uint16_t tile_h_size, tile_v_size;
> +
> +	/* Connector Link status
> +	 * 0: If the link is Good
> +	 * 1: If the link is Bad
> +	 */
> +	int link_status;
>  };
>  
>  #define obj_to_connector(x) container_of(x, struct drm_connector, base)
> @@ -754,7 +760,6 @@ int drm_mode_create_tv_properties(struct drm_device *dev,
>  int drm_mode_create_scaling_mode_property(struct drm_device *dev);
>  int drm_mode_create_aspect_ratio_property(struct drm_device *dev);
>  int drm_mode_create_suggested_offset_properties(struct drm_device *dev);
> -
>  int drm_mode_connector_set_path_property(struct drm_connector *connector,
>  					 const char *path);
>  int drm_mode_connector_set_tile_property(struct drm_connector *connector);
> diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
> index fa1aa21..737f4d3 100644
> --- a/include/drm/drm_crtc.h
> +++ b/include/drm/drm_crtc.h
> @@ -1151,6 +1151,11 @@ struct drm_mode_config {
>  	 */
>  	struct drm_property *tile_property;
>  	/**
> +	 * @link_status_property: Default connector property for link status
> +	 * of a connector
> +	 */
> +	struct drm_property *link_status_property;
> +	/**
>  	 * @plane_type_property: Default plane property to differentiate
>  	 * CURSOR, PRIMARY and OVERLAY legacy uses of planes.
>  	 */
> diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
> index 084b50a..f1b0afd 100644
> --- a/include/uapi/drm/drm_mode.h
> +++ b/include/uapi/drm/drm_mode.h
> @@ -121,6 +121,10 @@
>  #define DRM_MODE_DIRTY_ON       1
>  #define DRM_MODE_DIRTY_ANNOTATE 2
>  
> +/* Link Status options */
> +#define DRM_MODE_LINK_STATUS_GOOD	0
> +#define DRM_MODE_LINK_STATUS_BAD	1
> +
>  struct drm_mode_modeinfo {
>  	__u32 clock;
>  	__u16 hdisplay;

-- 
Jani Nikula, Intel Open Source Technology Center


* Re: [bug report] [media] vcodec: mediatek: Add Mediatek V4L2 Video Decoder Driver
From: Hans Verkuil @ 2016-11-09 13:45 UTC
  To: Dan Carpenter, tiffany.lin; +Cc: linux-media, linux-mediatek
In-Reply-To: <20161109132820.GA26677@mwanda>

On 11/09/16 14:28, Dan Carpenter wrote:
> Hello Tiffany Lin,
>
> The patch 590577a4e525: "[media] vcodec: mediatek: Add Mediatek V4L2
> Video Decoder Driver" from Sep 2, 2016, leads to the following static
> checker warning:
>
> 	drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c:536 vidioc_vdec_qbuf()
> 	error: buffer overflow 'vq->bufs' 32 <= u32max
>
> drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
>    520  static int vidioc_vdec_qbuf(struct file *file, void *priv,
>    521                              struct v4l2_buffer *buf)
>    522  {
>    523          struct mtk_vcodec_ctx *ctx = fh_to_ctx(priv);
>    524          struct vb2_queue *vq;
>    525          struct vb2_buffer *vb;
>    526          struct mtk_video_dec_buf *mtkbuf;
>    527          struct vb2_v4l2_buffer  *vb2_v4l2;
>    528
>    529          if (ctx->state == MTK_STATE_ABORT) {
>    530                  mtk_v4l2_err("[%d] Call on QBUF after unrecoverable error",
>    531                                  ctx->id);
>    532                  return -EIO;
>    533          }
>    534
>    535          vq = v4l2_m2m_get_vq(ctx->m2m_ctx, buf->type);
>    536          vb = vq->bufs[buf->index];
>
> Smatch thinks that "buf->index" comes straight from the user without
> being checked and that this is a buffer overflow.  It seems simple
> enough to analyse the call tree.
>
> __video_do_ioctl()
> ->  v4l_qbuf()
>   -> vidioc_vdec_qbuf()
>
> It seems like Smatch is correct.  I looked at a different implementation
> of this and that one wasn't checked either so maybe there is something
> I am not seeing.
>
> This has obvious security implications.  Can someone take a look at
> this?

This is indeed wrong.

The v4l2_m2m_qbuf() call at the end of this function calls in turn vb2_qbuf
which will check the index. But if you override vidioc_qbuf (or
vidioc_prepare), then you need to check the index value.
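
The missing check could look roughly like the sketch below. It uses
simplified stand-in types (fake_queue, fake_buffer) rather than the real
vb2_queue and v4l2_buffer, so all names and fields here are assumptions;
the point is only that the index is validated before it is used to index
the buffer array.

```c
#include <errno.h>
#include <stddef.h>

#define FAKE_MAX_FRAME 32	/* stand-in for the 32-entry bufs[] array */

/* Simplified stand-ins for the queue and the user-supplied buffer. */
struct fake_queue {
	void *bufs[FAKE_MAX_FRAME];
	unsigned int num_buffers;	/* buffers actually allocated */
};

struct fake_buffer {
	unsigned int index;		/* comes straight from userspace */
};

/* Store the buffer in *out and return 0, or return -EINVAL for a bad index. */
static inline int fake_lookup_buf(const struct fake_queue *q,
				  const struct fake_buffer *buf, void **out)
{
	if (buf->index >= q->num_buffers || buf->index >= FAKE_MAX_FRAME)
		return -EINVAL;

	*out = q->bufs[buf->index];
	return 0;
}
```

In the driver the equivalent test would have to come before the
vq->bufs[buf->index] access on line 536.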

I double-checked all cases where vidioc_qbuf was set to a driver-specific
function and this is the only driver that doesn't check the index field. In
all other cases it is either checked, or it is not used before calling into
the vb1/vb2 framework which checks this.

So luckily this only concerns this driver.

Regards,

	Hans

>
>    537          vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, vb2_buf);
>    538          mtkbuf = container_of(vb2_v4l2, struct mtk_video_dec_buf, vb);
>    539
>    540          if ((buf->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) &&
>    541              (buf->m.planes[0].bytesused == 0)) {
>    542                  mtkbuf->lastframe = true;
>    543                  mtk_v4l2_debug(1, "[%d] (%d) id=%d lastframe=%d (%d,%d, %d) vb=%p",
>    544                           ctx->id, buf->type, buf->index,
>    545                           mtkbuf->lastframe, buf->bytesused,
>    546                           buf->m.planes[0].bytesused, buf->length,
>    547                           vb);
>    548          }
>    549
>    550          return v4l2_m2m_qbuf(file, ctx->m2m_ctx, buf);
>    551  }
>
> regards,
> dan carpenter


* Re: [PATCH 2/2] i2c: octeon: Fix waiting for operation completion
From: Jan Glauber @ 2016-11-09 13:41 UTC
  To: Paul Burton; +Cc: linux-i2c, linux-mips, David Daney, Peter Swain, Wolfram Sang
In-Reply-To: <20161107200921.30284-2-paul.burton@imgtec.com>

Hi Paul,

I think we should revert commit "70121f7 i2c: octeon: thunderx: Limit
register access retries". With debugging enabled I'm getting:

[   78.871568] ipmi_ssif: Trying hotmod-specified SSIF interface at i2c address 0x12, adapter Cavium ThunderX i2c adapter at 0000:01:09.4, slave address 0x0
[   78.886107] do not call blocking ops when !TASK_RUNNING; state=2 set at [<fffffc00080e0088>] prepare_to_wait_event+0x58/0x10c
[   78.897436] ------------[ cut here ]------------
[   78.902050] WARNING: CPU: 6 PID: 2235 at kernel/sched/core.c:7718 __might_sleep+0x80/0x88
[   78.910218] Modules linked in: ipmi_ssif i2c_thunderx i2c_smbus nicvf nicpf thunder_bgx thunder_xcv mdio_thunder

[   78.921916] CPU: 6 PID: 2235 Comm: bash Tainted: G        W       4.9.0-rc3-jang+ #17
[   78.929737] Hardware name: www.cavium.com ThunderX CRB-1S/ThunderX CRB-1S, BIOS 0.3 Aug 24 2016
[   78.938426] task: fffffe1fdd554500 task.stack: fffffe1fe384c000
[   78.944338] PC is at __might_sleep+0x80/0x88
[   78.948601] LR is at __might_sleep+0x80/0x88
[   78.952863] pc : [<fffffc00080c3aac>] lr : [<fffffc00080c3aac>] pstate: 80000145
[   78.960250] sp : fffffe1fe384f600
[   78.963557] x29: fffffe1fe384f600 x28: fffffe1fe384f860
[   78.968875] x27: fffffe1fd07fa018 x26: fffffe1fe384f968
[   78.974193] x25: fffffc0009a2b000 x24: 00000000ffff26d6
[   78.979510] x23: fffffe1fe384f860 x22: fffffe1fe384f860
[   78.984827] x21: 0000000000000000 x20: 00000000000000b1
[   78.990144] x19: fffffc0000e330b8 x18: 0000000000005bb0
[   78.995461] x17: fffffc0009669ca8 x16: 0000000000000000
[   79.000779] x15: 0000000000000539 x14: 66663c5b20746120
[   79.006097] x13: 74657320323d6574 x12: 617473203b474e49
[   79.011415] x11: 4e4e55525f4b5341 x10: 5421206e65687720
[   79.016732] x9 : 73706f20676e696b x8 : 0000000000000000
[   79.022049] x7 : fffffc00080f5740 x6 : fffffc00080f5740
[   79.027367] x5 : ffffffffffffff80 x4 : 0000000000000060
[   79.032684] x3 : 0000000000000000 x2 : 0000000000000001
[   79.038001] x1 : 0000000000000000 x0 : 0000000000000071

[   79.044803] ---[ end trace d8af6005f683d444 ]---
[   79.049413] Call trace:
[   79.051853] Exception stack(0xfffffe1fe384f420 to 0xfffffe1fe384f550)
[   79.058287] f420: fffffc0000e330b8 0000040000000000 fffffe1fe384f600 fffffc00080c3aac
[   79.066109] f440: 0000000080000145 000000000000003d 0000000000000000 fffffc0008853920
[   79.073931] f460: 0000040000000000 0000000100000001 fffffe1fe384f520 fffffc00080f60d8
[   79.081752] f480: fffffc0000e330b8 00000000000000b1 0000000000000000 fffffe1fe384f860
[   79.089574] f4a0: fffffe1fe384f860 00000000ffff26d6 fffffc0009a2b000 fffffe1fe384f968
[   79.097396] f4c0: fffffe1fd07fa018 fffffe1fe384f860 0000000000000071 0000000000000000
[   79.105218] f4e0: 0000000000000001 0000000000000000 0000000000000060 ffffffffffffff80
[   79.113040] f500: fffffc00080f5740 fffffc00080f5740 0000000000000000 73706f20676e696b
[   79.120861] f520: 5421206e65687720 4e4e55525f4b5341 617473203b474e49 74657320323d6574
[   79.128683] f540: 66663c5b20746120 0000000000000539
[   79.133553] [<fffffc00080c3aac>] __might_sleep+0x80/0x88
[   79.138862] [<fffffc0000e30138>] octeon_i2c_test_iflg+0x4c/0xbc [i2c_thunderx]
[   79.146077] [<fffffc0000e30958>] octeon_i2c_test_ready+0x18/0x70 [i2c_thunderx]
[   79.153379] [<fffffc0000e30b04>] octeon_i2c_wait+0x154/0x1a4 [i2c_thunderx]
[   79.160334] [<fffffc0000e310bc>] octeon_i2c_xfer+0xf4/0xf60 [i2c_thunderx]

This is not caused by the usleep inside the wait_event but by readq_poll_timeout().
Could you check whether it works for you if you revert only this patch?

Thanks,
Jan


* Re: disable hugepages
From: Christian Ehrhardt @ 2016-11-09 13:40 UTC
  To: Keren Hochman; +Cc: dev
In-Reply-To: <CAJq3SQ6Wjueqs9sCF0Kn4S_O0sZqZ_4hV6ztAeA8a583OZQc9w@mail.gmail.com>

On Wed, Nov 9, 2016 at 1:55 PM, Keren Hochman <keren.hochman@lightcyber.com>
wrote:

> How can I create a mempool without hugepages? My application runs on a
> pcap file, so no hugepages are needed.
>

Not sure if that is what you really want (it is for debug use only), but in
general --no-huge is available as an EAL argument.

From http://pktgen.readthedocs.io/en/latest/usage_eal.html :

EAL options for DEBUG use only:
  --no-huge           : Use malloc instead of hugetlbfs


* [PATCH net-next v5] cadence: Add LSO support.
From: Rafal Ozieblo @ 2016-11-09 13:41 UTC
  To: nicolas.ferre, netdev, linux-kernel; +Cc: Rafal Ozieblo
In-Reply-To: <1478612463-15076-1-git-send-email-rafalo@cadence.com>

New Cadence GEM hardware supports Large Segment Offload (LSO):
TCP segmentation offload (TSO) as well as UDP fragmentation
offload (UFO). This patch adds support for those features to the driver.

Signed-off-by: Rafal Ozieblo <rafalo@cadence.com>
---
Changed in v2:
macb_lso_check_compatibility() changed to macb_features_check()
(with small modifications) and bound to .ndo_features_check
(as suggested by Eric Dumazet).
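As an illustration of the rule that macb_features_check() enforces:
every payload buffer except the last has to be a multiple of 8 bytes for
LSO. The sketch below uses plain integers instead of an skb;
lso_buffers_ok() and its parameters are hypothetical names, not part of
the driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define MACB_TX_LEN_ALIGN 8	/* same value as in the patch */

/*
 * Hypothetical helper. first_payload is the payload length in the linear
 * part (skb_headlen() minus the header length); frags[] holds the sizes
 * of the nfrags page fragments (nfrags >= 1 is assumed). Returns true if
 * every buffer except the last fragment is a multiple of
 * MACB_TX_LEN_ALIGN.
 */
static inline bool lso_buffers_ok(unsigned int first_payload,
				  const unsigned int *frags, size_t nfrags)
{
	size_t f;

	if (first_payload % MACB_TX_LEN_ALIGN)
		return false;

	/* The last fragment may have any size. */
	for (f = 0; f + 1 < nfrags; f++)
		if (frags[f] % MACB_TX_LEN_ALIGN)
			return false;

	return true;
}
```

When the helper would return false, the driver clears NETIF_F_TSO and
NETIF_F_UFO for that skb so the stack segments it in software.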
---
Changed in v3:
Respin to net-next.
---
Changed in v4:
(struct iphdr*)skb_network_header(skb) changed to ip_hdr(skb)
---
Changed in v5:
Changes after Florian Fainelli's comments
---
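A worked example of the descriptor accounting done in macb_start_xmit()
in this version: for LSO the headers get their own descriptor, and the
rest of the linear area and each fragment are split by the maximum
buffer length. The sketch uses plain integers; tx_desc_count() is a
hypothetical name, and 2040 below is only an example value for
max_tx_length.

```c
#include <stdbool.h>
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper mirroring the desc_cnt calculation in the patch. */
static inline unsigned int tx_desc_count(bool is_lso, unsigned int headlen,
					 unsigned int hdrlen,
					 const unsigned int *frags,
					 size_t nfrags,
					 unsigned int max_tx_length)
{
	unsigned int desc_cnt;
	size_t f;

	if (is_lso && headlen > hdrlen)
		/* one descriptor for the headers, the rest for linear payload */
		desc_cnt = DIV_ROUND_UP(headlen - hdrlen, max_tx_length) + 1;
	else
		desc_cnt = DIV_ROUND_UP(headlen, max_tx_length);

	/* each page fragment is split by the maximum buffer length */
	for (f = 0; f < nfrags; f++)
		desc_cnt += DIV_ROUND_UP(frags[f], max_tx_length);

	return desc_cnt;
}
```

With a 54-byte header, 1000 bytes of linear payload and fragments of
4096 and 100 bytes, an LSO frame needs 2 + 3 + 1 = 6 descriptors for
max_tx_length = 2040; the same skb without LSO needs 1 + 3 + 1 = 5.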
 drivers/net/ethernet/cadence/macb.c | 142 +++++++++++++++++++++++++++++++++---
 drivers/net/ethernet/cadence/macb.h |  14 ++++
 2 files changed, 144 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index e1847ce..dd38ef7 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -32,7 +32,9 @@
 #include <linux/of_gpio.h>
 #include <linux/of_mdio.h>
 #include <linux/of_net.h>
-
+#include <linux/ip.h>
+#include <linux/udp.h>
+#include <linux/tcp.h>
 #include "macb.h"
 
 #define MACB_RX_BUFFER_SIZE	128
@@ -60,10 +62,13 @@
 					| MACB_BIT(TXERR))
 #define MACB_TX_INT_FLAGS	(MACB_TX_ERR_FLAGS | MACB_BIT(TCOMP))
 
-#define MACB_MAX_TX_LEN		((unsigned int)((1 << MACB_TX_FRMLEN_SIZE) - 1))
-#define GEM_MAX_TX_LEN		((unsigned int)((1 << GEM_TX_FRMLEN_SIZE) - 1))
+/* Max length of transmit frame must be a multiple of 8 bytes */
+#define MACB_TX_LEN_ALIGN	8
+#define MACB_MAX_TX_LEN		((unsigned int)((1 << MACB_TX_FRMLEN_SIZE) - 1) & ~((unsigned int)(MACB_TX_LEN_ALIGN - 1)))
+#define GEM_MAX_TX_LEN		((unsigned int)((1 << GEM_TX_FRMLEN_SIZE) - 1) & ~((unsigned int)(MACB_TX_LEN_ALIGN - 1)))
 
 #define GEM_MTU_MIN_SIZE	ETH_MIN_MTU
+#define MACB_NETIF_LSO		(NETIF_F_TSO | NETIF_F_UFO)
 
 #define MACB_WOL_HAS_MAGIC_PACKET	(0x1 << 0)
 #define MACB_WOL_ENABLED		(0x1 << 1)
@@ -1223,7 +1228,8 @@ static void macb_poll_controller(struct net_device *dev)
 
 static unsigned int macb_tx_map(struct macb *bp,
 				struct macb_queue *queue,
-				struct sk_buff *skb)
+				struct sk_buff *skb,
+				unsigned int hdrlen)
 {
 	dma_addr_t mapping;
 	unsigned int len, entry, i, tx_head = queue->tx_head;
@@ -1231,14 +1237,27 @@ static unsigned int macb_tx_map(struct macb *bp,
 	struct macb_dma_desc *desc;
 	unsigned int offset, size, count = 0;
 	unsigned int f, nr_frags = skb_shinfo(skb)->nr_frags;
-	unsigned int eof = 1;
-	u32 ctrl;
+	unsigned int eof = 1, mss_mfs = 0;
+	u32 ctrl, lso_ctrl = 0, seq_ctrl = 0;
+
+	/* LSO */
+	if (skb_shinfo(skb)->gso_size != 0) {
+		if (ip_hdr(skb)->protocol == IPPROTO_UDP)
+			/* UDP - UFO */
+			lso_ctrl = MACB_LSO_UFO_ENABLE;
+		else
+			/* TCP - TSO */
+			lso_ctrl = MACB_LSO_TSO_ENABLE;
+	}
 
 	/* First, map non-paged data */
 	len = skb_headlen(skb);
+
+	/* first buffer length */
+	size = hdrlen;
+
 	offset = 0;
 	while (len) {
-		size = min(len, bp->max_tx_length);
 		entry = macb_tx_ring_wrap(bp, tx_head);
 		tx_skb = &queue->tx_skb[entry];
 
@@ -1258,6 +1277,8 @@ static unsigned int macb_tx_map(struct macb *bp,
 		offset += size;
 		count++;
 		tx_head++;
+
+		size = min(len, bp->max_tx_length);
 	}
 
 	/* Then, map paged data from fragments */
@@ -1311,6 +1332,21 @@ static unsigned int macb_tx_map(struct macb *bp,
 	desc = &queue->tx_ring[entry];
 	desc->ctrl = ctrl;
 
+	if (lso_ctrl) {
+		if (lso_ctrl == MACB_LSO_UFO_ENABLE)
+			/* include header and FCS in value given to h/w */
+			mss_mfs = skb_shinfo(skb)->gso_size +
+					skb_transport_offset(skb) +
+					ETH_FCS_LEN;
+		else /* TSO */ {
+			mss_mfs = skb_shinfo(skb)->gso_size;
+			/* TCP Sequence Number Source Select
+			 * can be set only for TSO
+			 */
+			seq_ctrl = 0;
+		}
+	}
+
 	do {
 		i--;
 		entry = macb_tx_ring_wrap(bp, i);
@@ -1325,6 +1361,16 @@ static unsigned int macb_tx_map(struct macb *bp,
 		if (unlikely(entry == (bp->tx_ring_size - 1)))
 			ctrl |= MACB_BIT(TX_WRAP);
 
+		/* First descriptor is header descriptor */
+		if (i == queue->tx_head) {
+			ctrl |= MACB_BF(TX_LSO, lso_ctrl);
+			ctrl |= MACB_BF(TX_TCP_SEQ_SRC, seq_ctrl);
+		} else
+			/* Only set MSS/MFS on payload descriptors
+			 * (second or later descriptor)
+			 */
+			ctrl |= MACB_BF(MSS_MFS, mss_mfs);
+
 		/* Set TX buffer descriptor */
 		macb_set_addr(desc, tx_skb->mapping);
 		/* desc->addr must be visible to hardware before clearing
@@ -1350,6 +1396,43 @@ static unsigned int macb_tx_map(struct macb *bp,
 	return 0;
 }
 
+static netdev_features_t macb_features_check(struct sk_buff *skb,
+					     struct net_device *dev,
+					     netdev_features_t features)
+{
+	unsigned int nr_frags, f;
+	unsigned int hdrlen;
+
+	/* Validate LSO compatibility */
+
+	/* there is only one buffer */
+	if (!skb_is_nonlinear(skb))
+		return features;
+
+	/* length of header */
+	hdrlen = skb_transport_offset(skb);
+	if (ip_hdr(skb)->protocol == IPPROTO_TCP)
+		hdrlen += tcp_hdrlen(skb);
+
+	/* For LSO:
+	 * When software supplies two or more payload buffers all payload buffers
+	 * apart from the last must be a multiple of 8 bytes in size.
+	 */
+	if (!IS_ALIGNED(skb_headlen(skb) - hdrlen, MACB_TX_LEN_ALIGN))
+		return features & ~MACB_NETIF_LSO;
+
+	nr_frags = skb_shinfo(skb)->nr_frags;
+	/* No need to check last fragment */
+	nr_frags--;
+	for (f = 0; f < nr_frags; f++) {
+		const skb_frag_t *frag = &skb_shinfo(skb)->frags[f];
+
+		if (!IS_ALIGNED(skb_frag_size(frag), MACB_TX_LEN_ALIGN))
+			return features & ~MACB_NETIF_LSO;
+	}
+	return features;
+}
+
 static inline int macb_clear_csum(struct sk_buff *skb)
 {
 	/* no change for packets without checksum offloading */
@@ -1374,7 +1457,28 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct macb *bp = netdev_priv(dev);
 	struct macb_queue *queue = &bp->queues[queue_index];
 	unsigned long flags;
-	unsigned int count, nr_frags, frag_size, f;
+	unsigned int desc_cnt, nr_frags, frag_size, f;
+	unsigned int hdrlen;
+	bool is_lso, is_udp = 0;
+
+	is_lso = (skb_shinfo(skb)->gso_size != 0);
+
+	if (is_lso) {
+		is_udp = !!(ip_hdr(skb)->protocol == IPPROTO_UDP);
+
+		/* length of headers */
+		if (is_udp)
+			/* only queue eth + ip headers separately for UDP */
+			hdrlen = skb_transport_offset(skb);
+		else
+			hdrlen = skb_transport_offset(skb) + tcp_hdrlen(skb);
+		if (skb_headlen(skb) < hdrlen) {
+			netdev_err(bp->dev, "Error - LSO headers fragmented!!!\n");
+			/* if this is required, would need to copy to single buffer */
+			return NETDEV_TX_BUSY;
+		}
+	} else
+		hdrlen = min(skb_headlen(skb), bp->max_tx_length);
 
 #if defined(DEBUG) && defined(VERBOSE_DEBUG)
 	netdev_vdbg(bp->dev,
@@ -1389,18 +1493,22 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * socket buffer: skb fragments of jumbo frames may need to be
 	 * split into many buffer descriptors.
 	 */
-	count = DIV_ROUND_UP(skb_headlen(skb), bp->max_tx_length);
+	if (is_lso && (skb_headlen(skb) > hdrlen))
+		/* extra header descriptor if also payload in first buffer */
+		desc_cnt = DIV_ROUND_UP((skb_headlen(skb) - hdrlen), bp->max_tx_length) + 1;
+	else
+		desc_cnt = DIV_ROUND_UP(skb_headlen(skb), bp->max_tx_length);
 	nr_frags = skb_shinfo(skb)->nr_frags;
 	for (f = 0; f < nr_frags; f++) {
 		frag_size = skb_frag_size(&skb_shinfo(skb)->frags[f]);
-		count += DIV_ROUND_UP(frag_size, bp->max_tx_length);
+		desc_cnt += DIV_ROUND_UP(frag_size, bp->max_tx_length);
 	}
 
 	spin_lock_irqsave(&bp->lock, flags);
 
 	/* This is a hard error, log it. */
 	if (CIRC_SPACE(queue->tx_head, queue->tx_tail,
-		       bp->tx_ring_size) < count) {
+		       bp->tx_ring_size) < desc_cnt) {
 		netif_stop_subqueue(dev, queue_index);
 		spin_unlock_irqrestore(&bp->lock, flags);
 		netdev_dbg(bp->dev, "tx_head = %u, tx_tail = %u\n",
@@ -1408,13 +1516,17 @@ static int macb_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_BUSY;
 	}
 
+	if (is_udp) /* is_udp is only set when (is_lso) is checked */
+		/* zero UDP checksum, not calculated by h/w for UFO */
+		udp_hdr(skb)->check = 0;
+
 	if (macb_clear_csum(skb)) {
 		dev_kfree_skb_any(skb);
 		goto unlock;
 	}
 
 	/* Map socket buffer for DMA transfer */
-	if (!macb_tx_map(bp, queue, skb)) {
+	if (!macb_tx_map(bp, queue, skb, hdrlen)) {
 		dev_kfree_skb_any(skb);
 		goto unlock;
 	}
@@ -2354,6 +2466,7 @@ static const struct net_device_ops macb_netdev_ops = {
 	.ndo_poll_controller	= macb_poll_controller,
 #endif
 	.ndo_set_features	= macb_set_features,
+	.ndo_features_check	= macb_features_check,
 };
 
 /* Configure peripheral capabilities according to device tree
@@ -2560,6 +2673,11 @@ static int macb_init(struct platform_device *pdev)
 
 	/* Set features */
 	dev->hw_features = NETIF_F_SG;
+
+	/* Check LSO capability */
+	if (GEM_BFEXT(PBUF_LSO, gem_readl(bp, DCFG6)))
+		dev->hw_features |= MACB_NETIF_LSO;
+
 	/* Checksum offload is only available on gem with packet buffer */
 	if (macb_is_gem(bp) && !(bp->caps & MACB_CAPS_FIFO_MODE))
 		dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 1216950..d67adad 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -382,6 +382,10 @@
 #define GEM_TX_PKT_BUFF_OFFSET			21
 #define GEM_TX_PKT_BUFF_SIZE			1
 
+/* Bitfields in DCFG6. */
+#define GEM_PBUF_LSO_OFFSET			27
+#define GEM_PBUF_LSO_SIZE			1
+
 /* Constants for CLK */
 #define MACB_CLK_DIV8				0
 #define MACB_CLK_DIV16				1
@@ -414,6 +418,10 @@
 #define MACB_CAPS_SG_DISABLED			0x40000000
 #define MACB_CAPS_MACB_IS_GEM			0x80000000
 
+/* LSO settings */
+#define MACB_LSO_UFO_ENABLE			0x01
+#define MACB_LSO_TSO_ENABLE			0x02
+
 /* Bit manipulation macros */
 #define MACB_BIT(name)					\
 	(1 << MACB_##name##_OFFSET)
@@ -545,6 +553,12 @@ struct macb_dma_desc {
 #define MACB_TX_LAST_SIZE			1
 #define MACB_TX_NOCRC_OFFSET			16
 #define MACB_TX_NOCRC_SIZE			1
+#define MACB_MSS_MFS_OFFSET			16
+#define MACB_MSS_MFS_SIZE			14
+#define MACB_TX_LSO_OFFSET			17
+#define MACB_TX_LSO_SIZE			2
+#define MACB_TX_TCP_SEQ_SRC_OFFSET		19
+#define MACB_TX_TCP_SEQ_SRC_SIZE		1
 #define MACB_TX_BUF_EXHAUSTED_OFFSET		27
 #define MACB_TX_BUF_EXHAUSTED_SIZE		1
 #define MACB_TX_UNDERRUN_OFFSET			28
-- 
2.4.5

[   79.073931] f460: 0000040000000000 0000000100000001 fffffe1fe384f520 fffffc00080f60d8
[   79.081752] f480: fffffc0000e330b8 00000000000000b1 0000000000000000 fffffe1fe384f860
[   79.089574] f4a0: fffffe1fe384f860 00000000ffff26d6 fffffc0009a2b000 fffffe1fe384f968
[   79.097396] f4c0: fffffe1fd07fa018 fffffe1fe384f860 0000000000000071 0000000000000000
[   79.105218] f4e0: 0000000000000001 0000000000000000 0000000000000060 ffffffffffffff80
[   79.113040] f500: fffffc00080f5740 fffffc00080f5740 0000000000000000 73706f20676e696b
[   79.120861] f520: 5421206e65687720 4e4e55525f4b5341 617473203b474e49 74657320323d6574
[   79.128683] f540: 66663c5b20746120 0000000000000539
[   79.133553] [<fffffc00080c3aac>] __might_sleep+0x80/0x88
[   79.138862] [<fffffc0000e30138>] octeon_i2c_test_iflg+0x4c/0xbc [i2c_thunderx]
[   79.146077] [<fffffc0000e30958>] octeon_i2c_test_ready+0x18/0x70 [i2c_thunderx]
[   79.153379] [<fffffc0000e30b04>] octeon_i2c_wait+0x154/0x1a4 [i2c_thunderx]
[   79.160334] [<fffffc0000e310bc>] octeon_i2c_xfer+0xf4/0xf60 [i2c_thunderx]

This is not caused by the usleep inside the wait_event but by readq_poll_timeout().
Could you try if it works for you if you only revert this patch?

Thanks,
Jan

^ permalink raw reply

* Re: [Qemu-devel] [PATCH] vhost: Use vbus var instead of VIRTIO_BUS() macro
From: Felipe Franciosi @ 2016-11-09 13:24 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Felipe Franciosi, Stefan Hajnoczi, Michael S. Tsirkin,
	qemu-devel@nongnu.org
In-Reply-To: <22f2ee5c-8dc4-c142-f0a9-130fde997c3f@redhat.com>


> On 9 Nov 2016, at 14:22, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> 
> 
> On 09/11/2016 14:18, Felipe Franciosi wrote:
>> Recent changes on vhost_dev_enable/disable_notifiers() produced a
>> VirtioBusState vbus variable which can be used instead of the
>> VIRTIO_BUS() macro. This commit just makes the code a little bit cleaner
>> and more consistent.
>> 
>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
> 
> Michael, what do you think?  Perhaps it's simplest to just squash the
> two patches (v2 of "vhost: Update 'ioeventfd_started' with host
> notifiers" and this one).
> 
> Paolo

Please, do. I just realised the repeated use of VIRTIO_BUS() after sending the first patch (Update 'ioeventfd_started' with host notifiers) and decided to ship this one too to get them both in if possible.

Thanks,
Felipe

> 
>> ---
>> hw/virtio/vhost.c | 14 ++++++--------
>> 1 file changed, 6 insertions(+), 8 deletions(-)
>> 
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index 1290963..7d29dad 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -1198,20 +1198,18 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
>> 
>>     virtio_device_stop_ioeventfd(vdev);
>>     for (i = 0; i < hdev->nvqs; ++i) {
>> -        r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
>> -                                         true);
>> +        r = virtio_bus_set_host_notifier(vbus, hdev->vq_index + i, true);
>>         if (r < 0) {
>>             error_report("vhost VQ %d notifier binding failed: %d", i, -r);
>>             goto fail_vq;
>>         }
>>     }
>> -    VIRTIO_BUS(qbus)->ioeventfd_started = true;
>> +    vbus->ioeventfd_started = true;
>> 
>>     return 0;
>> fail_vq:
>>     while (--i >= 0) {
>> -        e = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
>> -                                         false);
>> +        e = virtio_bus_set_host_notifier(vbus, hdev->vq_index + i, false);
>>         if (e < 0) {
>>             error_report("vhost VQ %d notifier cleanup error: %d", i, -r);
>>         }
>> @@ -1230,17 +1228,17 @@ fail:
>> void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
>> {
>>     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
>> +    VirtioBusState *vbus = VIRTIO_BUS(qbus);
>>     int i, r;
>> 
>>     for (i = 0; i < hdev->nvqs; ++i) {
>> -        r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
>> -                                         false);
>> +        r = virtio_bus_set_host_notifier(vbus, hdev->vq_index + i, false);
>>         if (r < 0) {
>>             error_report("vhost VQ %d notifier cleanup failed: %d", i, -r);
>>         }
>>         assert (r >= 0);
>>     }
>> -    VIRTIO_BUS(qbus)->ioeventfd_started = false;
>> +    vbus->ioeventfd_started = false;
>>     virtio_device_start_ioeventfd(vdev);
>> }
>> 
>> 

^ permalink raw reply

* Re: [PATCH 2/2] ASoC: axentia: tse850: add ASoC driver for the Axentia TSE-850
From: Mark Brown @ 2016-11-09 13:38 UTC (permalink / raw)
  To: Peter Rosin
  Cc: linux-kernel, Liam Girdwood, Rob Herring, Mark Rutland,
	Jaroslav Kysela, Takashi Iwai, alsa-devel, devicetree
In-Reply-To: <1478622057-12426-3-git-send-email-peda@axentia.se>

[-- Attachment #1: Type: text/plain, Size: 2753 bytes --]

On Tue, Nov 08, 2016 at 05:20:57PM +0100, Peter Rosin wrote:

> +++ b/sound/soc/axentia/Kconfig
> @@ -0,0 +1,10 @@
> +config SND_SOC_AXENTIA_TSE850_PCM5142
> +	tristate "ASoC driver for the Axentia TSE-850"
> +	depends on ARCH_AT91 && OF
> +	select ATMEL_SSC
> +	select SND_ATMEL_SOC
> +	select SND_ATMEL_SOC_SSC_DMA
> +	select SND_SOC_PCM512x_I2C
> +	help
> +	  Say Y if you want to add support for the ASoC driver for the
> +	  Axentia TSE-850 with a PCM5142 codec.

This just looks like a normal machine driver for an Atmel system which
would usually go in the atmel directory - why is a new directory being
created?

> +static int tse850_get_mux2(struct snd_kcontrol *kctrl,
> +			   struct snd_ctl_elem_value *ucontrol)
> +{
> +	struct snd_soc_dapm_context *dapm = snd_soc_dapm_kcontrol_dapm(kctrl);
> +	struct snd_soc_card *card = dapm->card;
> +	struct tse850_priv *tse850 = snd_soc_card_get_drvdata(card);
> +	int ret;
> +
> +	ret = gpiod_get_value(tse850->loop2);
> +	if (ret < 0)
> +		return ret;

We can't reliably read the value of output GPIOs (though in practice the
majority do support it) so it'd be better practice to use a state
variable to remember what we set.  I'd also expect this to use the
_cansleep() GPIO calls as it's not in a context where sleeping would be
a problem.
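
A minimal userspace sketch of the state-variable approach suggested above.
Everything here is a stand-in for illustration: struct fake_gpio,
fake_gpio_set(), struct mux_priv, mux2_put() and mux2_get() are hypothetical
names, not part of the driver under review. In a real driver the setter would
be the sleeping GPIO call (gpiod_set_value_cansleep()), and the cache would
have to be initialised to whatever level the line is driven to at probe time.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the output line; a driver would hold a GPIO descriptor. */
struct fake_gpio {
	int level;
};

/* Stand-in for the sleeping GPIO setter used outside atomic context. */
void fake_gpio_set(struct fake_gpio *gpio, int value)
{
	gpio->level = value;
}

/* Hypothetical private data: the line plus the last value we drove. */
struct mux_priv {
	struct fake_gpio *loop2;
	bool loop2_cache;
};

/*
 * "put" side: drive the line and remember what was set.
 * Returns 1 if the value changed and 0 otherwise, which is the usual
 * convention for ALSA control put handlers.
 */
int mux2_put(struct mux_priv *priv, bool on)
{
	if (priv->loop2_cache == on)
		return 0;

	fake_gpio_set(priv->loop2, on);
	priv->loop2_cache = on;
	return 1;
}

/* "get" side: report the cached state, never read the output line back. */
bool mux2_get(const struct mux_priv *priv)
{
	return priv->loop2_cache;
}
```

The get path no longer depends on whether the GPIO controller can read back
an output, which is the reliability concern raised in the review.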

> +int tse850_get_ana(struct snd_kcontrol *kctrl,
> +		   struct snd_ctl_elem_value *ucontrol)
> +{
> +	struct snd_soc_dapm_context *dapm = snd_soc_dapm_kcontrol_dapm(kctrl);
> +	struct snd_soc_card *card = dapm->card;
> +	struct tse850_priv *tse850 = snd_soc_card_get_drvdata(card);
> +	int ret;
> +
> +	ret = regulator_get_voltage(tse850->ana);
> +	if (ret < 0)
> +		return ret;
> +
> +	if (ret < 11000000)
> +		ret = 11000000;
> +	else if (ret > 20000000)
> +		ret = 20000000;

This needs some comments...

> +	struct snd_soc_pcm_runtime *rtd = substream->private_data;
> +	struct device *dev = rtd->dev;
> +	struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
> +	int dir = substream->stream != SNDRV_PCM_STREAM_PLAYBACK;
> +	int div_id = dir ? ATMEL_SSC_RCMR_PERIOD : ATMEL_SSC_TCMR_PERIOD;
> +	int period = snd_soc_params_to_frame_size(params) / 2 - 1;

Please write the logic out as normal if statements for legibility.  It's
a bit concerning that we even need this function, it looks like pretty
basic stuff that I'd expect the CPU DAI to just be doing - why can't
this be the default behaviour of the CPU DAI?
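
As a rough illustration of the "normal if statements" request, the selection
in the quoted hunk could be written along these lines. The enums below are
stand-ins with arbitrary values for the kernel constants named in the hunk,
and pick_div_id() and frame_period() are hypothetical helper names; this is
a sketch, not the driver's code.

```c
#include <assert.h>

/* Stand-ins for the stream direction and SSC divider identifiers. */
enum stream_dir { STREAM_PLAYBACK, STREAM_CAPTURE };
enum ssc_div { DIV_TCMR_PERIOD, DIV_RCMR_PERIOD };

/* Pick the divider for a stream direction with a plain if statement. */
enum ssc_div pick_div_id(enum stream_dir stream)
{
	if (stream == STREAM_PLAYBACK)
		return DIV_TCMR_PERIOD;	/* playback uses the transmit side */

	/* As in the quoted hunk, anything else selects the receive side. */
	return DIV_RCMR_PERIOD;
}

/* Period value as computed in the quoted hunk: frame size / 2 - 1. */
int frame_period(int frame_size)
{
	return frame_size / 2 - 1;
}
```

Spelling out the branch makes it visible that every non-playback stream is
treated as capture, which the chained boolean and ternary expressions hide.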

> +static int tse850_init(struct snd_soc_pcm_runtime *rtd)
> +{
> +	struct snd_soc_dapm_context *dapm = &rtd->card->dapm;
> +
> +	return snd_soc_dapm_add_routes(dapm, tse850_intercon,
> +				       ARRAY_SIZE(tse850_intercon));

Set this up in the card data structure rather than open coding the call,
you can register DAPM routes there too.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply

* Re: master branch merges must pass unit tests
From: Alfredo Deza @ 2016-11-09 13:37 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel
In-Reply-To: <alpine.DEB.2.11.1611082147590.29278@piezo.us.to>

On Tue, Nov 8, 2016 at 4:49 PM, Sage Weil <sweil@redhat.com> wrote:
> I enabled the github check that the unit tests pass in order to merge to
> master.

Thank you!

> These tests still aren't completely reliable, but they're close,
> and we'll make better progress if we start enforcing it now.
>

I would like to see some effort put into making it more reliable. One of
the things that it has to do is reboot the box where it runs
to ensure that processes are really killed. I would love to see some
effort put into the targets to make them robust.

There is nothing too complicated going on at the CI level, we just
call run-make-check.sh:

https://github.com/ceph/ceph-build/blob/master/ceph-pull-requests/config/definitions/ceph-pull-requests.yml#L49

> Note that core developers can still override the check to merge if
> it's necessary, but I encourage you to avoid doing so!
>
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: A way to reduce compression overhead
From: Igor Fedotov @ 2016-11-09 13:35 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel
In-Reply-To: <alpine.DEB.2.11.1611082019530.29278@piezo.us.to>

Sage


On 11/8/2016 11:27 PM, Sage Weil wrote:
> On Tue, 8 Nov 2016, Igor Fedotov wrote:
>> Hi Sage, et al.
>>
>> Let me share some ideas about possible compression burden reduction on the
>> cluster.
>>
>> As known we perform block compression at BlueStore level for each replica
>> independently. This triples compression CPU overhead for the cluster. Looks
>> like significant CPU resource waste IMHO.
>>
>> We can probably eliminate this overhead by introduction write request
>> preprocessing performed at ObjectStore level synchronously. This preprocessing
>> parses transaction, detects write requests and transforms them into different
>> ones aligned with current store allocation unit. At the same time resulting
>> extents that span more than single AU are compressed if needed. I.e.
>> preprocessing do some of the job performed at BlueStore::_do_write_data that
>> splits write request into _do_write_small/_do_write_big calls. But after the
>> split and big blob compression preprocessor simply updates the transaction
>> with new write requests.
>>
>> E.g.
>>
>> with AU = 0x1000
>>
>> Write Request (1~0xffff) is transformed into the following sequence:
>>
>> WriteX 1~0xfff (uncompressed)
>>
>> WriteX 0x1000~0xE000 (compressed if needed)
>>
>> WriteX 0xf000~0xfff (uncompressed)
>>
>> Then updated transaction is passed to all replicas including the master one
>> using regular apply_/queue_transaction mechanics.
>>
>>
>> As a bonus one receives automatic payload compression when transporting
>> request to remote store replicas.
>> Regular write request path should be preserved for EC pools and other needs as
>> well.
>>
>> Please note that almost no latency is introduced for request handling.
>> Replicas receive modified transaction later but they do not spend time on
>> doing split/compress stuff.
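
The AU-aligned split in the quoted proposal can be sketched as below. This is
only an illustration under assumptions: split_write() and struct write_piece
are hypothetical names, not BlueStore code, and the offset~length reading of
the notation is assumed. Under that reading, the three pieces listed in the
example (ending at 0xffff) are what the sketch produces for a request of
1~0xfffe; a request of 1~0xffff would end exactly on an AU boundary and
leave no unaligned tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of a split write, offset~length as in the mail. */
struct write_piece {
	uint64_t off;
	uint64_t len;
	int au_aligned;	/* 1 if the piece covers whole allocation units */
};

/*
 * Split [off, off + len) at allocation-unit boundaries into an unaligned
 * head, an AU-aligned middle (the compression candidate) and an unaligned
 * tail. Returns the number of pieces stored in out[] (at most 3).
 * au must be a power of two. Overflow of off + len is not handled.
 */
size_t split_write(uint64_t off, uint64_t len, uint64_t au,
		   struct write_piece out[3])
{
	uint64_t end = off + len;
	uint64_t head_end, tail_start;
	size_t n = 0;

	assert(au != 0 && (au & (au - 1)) == 0);
	if (len == 0)
		return 0;

	head_end = (off + au - 1) & ~(au - 1);	/* off rounded up */
	tail_start = end & ~(au - 1);		/* end rounded down */

	/* No whole allocation unit inside the write: keep it as one piece. */
	if (head_end >= tail_start) {
		out[n++] = (struct write_piece){ off, len, 0 };
		return n;
	}

	if (off < head_end)
		out[n++] = (struct write_piece){ off, head_end - off, 0 };
	out[n++] = (struct write_piece){ head_end, tail_start - head_end, 1 };
	if (tail_start < end)
		out[n++] = (struct write_piece){ tail_start, end - tail_start, 0 };
	return n;
}
```

Only the middle piece is marked as a compression candidate; whether it is
actually compressed would still be up to the store's own policy.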
> I think this is pretty reasonable!  We have a couple options... we could
> (1) just expose a compression alignment via ObjectStore, (2) take
> compression alignment from a pool property, or (3) have an explicit
> per-write call into ObjectStore so that it can chunk it up however it
> likes.
>
> Whatever we choose, the tricky bit is that there may be different stores
> on different replicas.  Or we could let the primary just decide locally,
> given that this is primarily an optimization; in the worst case we
> compress something on the primary but one replica doesn't support
> compression and just decompresses it before doing the write (i.e., we get
> on-the-wire compression but no on-disk compression).
IMHO different stores on different replicas is rather a corner case, and
it's better (or simpler) to disable the compression optimization when it
occurs. Doing compression followed by decompression seems a bit ugly
unless we're talking about traffic compression only.
To disable compression preprocessing we can either have a manual switch
in the config or collect remote OSD capabilities at the primary and disable
preprocessing automatically. This can be done just once, hence it
wouldn't impact request handling performance.
> I lean toward the simplicity of get_compression_alignment() and
> get_compression_alg() (or similar) and just make a local (primary)
> decision.  Then we just have a simple compatibility write_compressed()
> implementation (or helper) that decompresses the payload so that we can do
> a normal write.
As for me, I always favor better encapsulation of functionality, hence
I'd prefer (3): the store does whatever it can and transparently passes
the results to replicas. This lets us modify or extend the logic smoothly,
e.g. optimize csum calculation for big chunks etc.
In contrast, with (1) we expose most of this functionality to the store's
client (i.e. the replicated backend stuff, not a real Ceph client). In fact,
for (1) we'll have 2 potentially evolving APIs:
- compressed (optimized) write request delivery
- store optimization description provided to the client (i.e. initially the
algorithm + alignment retrieval mentioned above).
The latter isn't needed for (3).

>
> Before getting too carried away, though, we should consider whether we're
> going to want to take a further step to allow clients to compress data
> before it's sent.  That isn't necessarily in conflict with this if we go
> with pool properties to inform the alignment and compression alg
> decision.  If we assume that the ObjectStore on the primary gets to decide
> everything it will work less well...
Firstly, let's agree on the terminology. Here we're talking about Ceph
cluster clients, whereas above it was store clients (PG backends).
Well, this case is a bit different compared to the above. (3) isn't a
viable option here. A Ceph client definitely relies on (1) or (2), if any
(I'm afraid bringing compression to the client will be a headache).
But at the same time, IMHO, it might be an argument against having (1) for
the store client. There would be three entities that are aware of the
compression optimization: the Ceph client, the store client (PG backend) and
the store itself. Not good...
In case of (1) + (3) the intermediate layer can probably be relieved of
that awareness - it simply has to pass compressed blocks transparently
from client to store and from primary store to replicas.
>> There is a potential conflict with the current garbage collection stuff though
>> - we can't perform GC during preprocessing due to possible race with preceding
>> unfinished transactions and consequently we're unable to merge and compress
>> merged data. Well, we can do that when applying transaction but this will
>> produce a sequence like this at each replica:
>>
>> decompress original request + decompress data to merge -> compress merged
>> data.
>>
>> Probably this limitation isn't that bad - IMHO it's better to have compressed
>> blobs aligned with original write requests.
>>
>> Moreover I have some ideas how to get rid of blob_depth notion that makes life
>> a bit easier. Will share shortly.
> I'm curious what you have in mind!  The blob_depth as currently
> implemented is not terribly reliable...
The general idea is to estimate the allocated vs. stored ratio for the
blob(s) under the extent being written, where stored and allocated are
measured in allocation units and can be calculated using the blobs' ref_map.
If that ratio is greater than 1 (+- some correction), we need to perform
GC for these blobs. Given that we do that after compression
preprocessing, it's expensive to merge the compressed extent being
written with the old shards. Hence those shards are written as standalone
extents, as opposed to the current implementation, where we try to merge both
new and existing extents into a single entity. Not a big drawback IMHO.
Evidently this is valid for new compressed extents (that are AU aligned)
only. Uncompressed ones can be merged in any fashion.
This is just a draft, hence comments are highly appreciated.

>
> sage


^ permalink raw reply

* Re: [PATCH v2 0/7] soc: renesas: Identify SoC and register with the SoC bus
From: Geert Uytterhoeven @ 2016-11-09 13:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Greg Kroah-Hartman, Yangbo Lu, Simon Horman, Magnus Damm,
	Rob Herring, Mark Rutland, Dirk Behme, Linux-Renesas,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, Pankaj Dubey,
	linux-samsung-soc@vger.kernel.org
In-Reply-To: <CAMuHMdV4HG0aOr4Qp_OZXU=3jLeOJ2QaMKp09a3v4489ABbRcA@mail.gmail.com>

Hi Arnd,

On Mon, Nov 7, 2016 at 10:35 AM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> On Mon, Oct 31, 2016 at 12:30 PM, Geert Uytterhoeven
> <geert+renesas@glider.be> wrote:
>> Some Renesas SoCs may exist in different revisions, providing slightly
>> different functionalities (e.g. R-Car H3 ES1.x and ES2.0), and behavior
>> (errata and quirks).  This needs to be catered for by drivers and/or
>> platform code.  The recently proposed soc_device_match() API seems like
>> a good fit to handle this.
>>
>> This patch series implements the core infrastructure to provide SoC and
>> revision information through the SoC bus for Renesas ARM SoCs. It
>> consists of 7 patches:
>>   - Patches 1-4 provide soc_device_match(), with some related fixes,
>>   - Patches 5-7 implement identification of Renesas SoCs and
>>     registration with the SoC bus,
>>
>> Changes compared to v1:
>>   - Add Acked-by,
>>   - New patches:
>>       - "[4/7] base: soc: Provide a dummy implementation of
>>                soc_device_match()",
>>       - "[5/7] ARM: shmobile: Document DT bindings for CCCR and PRR",
>>       - "[6/7] arm64: dts: r8a7795: Add device node for PRR"
>>         (more similar patches available, I'm not yet spamming you all
>>          with them),
>>   - Drop SoC families and family names; use fixed "Renesas" instead,
>>   - Drop EMEV2, which doesn't have a chip ID register, and doesn't share
>>     devices with other SoCs,
>>   - Drop RZ/A1H and R-CAR M1A, which don't have chip ID registers (for
>>     M1A: not accessible from the ARM core?),
>>   - On arm, move "select SOC_BUS" from ARCH_RENESAS to Kconfig symbols
>>     for SoCs that provide a chip ID register,
>>   - Build renesas-soc only if SOC_BUS is enabled,
>>   - Use "renesas,prr" and "renesas,cccr" device nodes in DT if
>>     available, else fall back to hardcoded addresses for compatibility
>>     with existing DTBs,
>>   - Remove verification of product IDs; just print the ID instead,
>>   - Don't register the SoC bus if the chip ID register is missing,
>>   - Change R-Mobile APE6 fallback to use PRR instead of CCCR (it has
>>     both).
>>
>> Merge strategy:
>>   - In theory, patches 1-4 should go through Greg's driver core tree.
>>     But it's a hard dependency for all users.
>>     If people agree, I can provide an immutable branch in my
>>     renesas-drivers repository, to be merged by all interested parties.
>>     So far I'm aware of Freescale/NXP, and Renesas.
>
> And Samsung.
> Shall I create the immutable branch now?

Arnd: are you happy with the new patches and changes?

Thanks again!

>>   - Patches 5-7 obviously have to go through Simon's Renesas tree (after
>>     merging the soc_device_match() core), and arm-soc.
>>
>> Tested on (machine, soc_id, optional revision):
>>     EMEV2 KZM9D Board, emev2
>>     Genmai, r7s72100
>>     APE6EVM, r8a73a4, ES1.0
>>     armadillo 800 eva, r8a7740, ES2.0
>>     bockw, r8a7778
>>     marzen, r8a7779, ES1.0
>>     Lager, r8a7790, ES1.0
>>     Koelsch, r8a7791, ES1.0
>>     Porter, r8a7791, ES3.0
>>     Blanche, r8a7792, ES1.1
>>     Gose, r8a7793, ES1.0
>>     Alt, r8a7794, ES1.0
>>     Renesas Salvator-X board based on r8a7795, r8a7795, ES1.0
>>     Renesas Salvator-X board based on r8a7795, r8a7795, ES1.1
>>     Renesas Salvator-X board based on r8a7796, r8a7796, ES1.0
>>     KZM-A9-GT, sh73a0, ES2.0
>>
>> For your convenience, this series (incl. more DT updates to add device
>> nodes for CCCR and PRR to all other Renesas ARM SoCs) is also available
>> in the topic/renesas-soc-id-v2 branch of my renesas-drivers git
>> repository at
>> git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git
>> Its first user is support for R-Car H3 ES2.0 in branch
>> topic/r8a7795-es2-v1-rebased2.
>>
>> Thanks for your comments!
>>
>> Arnd Bergmann (1):
>>   base: soc: Introduce soc_device_match() interface
>>
>> Geert Uytterhoeven (6):
>>   base: soc: Early register bus when needed
>>   base: soc: Check for NULL SoC device attributes
>>   base: soc: Provide a dummy implementation of soc_device_match()
>>   ARM: shmobile: Document DT bindings for CCCR and PRR
>>   arm64: dts: r8a7795: Add device node for PRR
>>   soc: renesas: Identify SoC and register with the SoC bus

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH v10 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
From: Thomas Gleixner @ 2016-11-09 13:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kyle Huey, Robert O'Callahan, Andy Lutomirski, Ingo Molnar,
	H. Peter Anvin, x86, Paolo Bonzini, Radim Krčmář,
	Jeff Dike, Richard Weinberger, Alexander Viro, Shuah Khan,
	Dave Hansen, Peter Zijlstra, Boris Ostrovsky, Len Brown,
	Rafael J. Wysocki, Dmitry Safonov, David Matlack, linux-kernel
In-Reply-To: <20161109132114.3ujq2wkhsm4kcytz@pd.tnic>

On Wed, 9 Nov 2016, Borislav Petkov wrote:

> On Tue, Nov 08, 2016 at 09:06:31PM +0100, Thomas Gleixner wrote:
> > The upcoming ring3 mwait stuff can add its magic to tweak that MSR into
> > this function.
> > 
> > Stick the call at the end of init_scattered_cpuid_features() for now. I
> > still need to figure out a proper place for it.
> 
> So Thomas and I discussed this more on IRC and I think we can get rid
> of the MSR iterating in scattered.c and integrate both the R3 MWAIT and
> CPUID faulting like this:

I agree mostly.
 
> ---
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index fcd484d2bb03..5c38a85af2e7 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -452,6 +457,39 @@ static void intel_bsp_resume(struct cpuinfo_x86 *c)
>  	init_intel_energy_perf(c);
>  }
>  
> +static void init_misc_enables(struct cpuinfo_x86 *c)
> +{
> +	u64 val, misc_en;
> +
> +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &misc_en))
> +		return;
> +
> +	misc_en &= ~MSR_MISC_ENABLES_CPUID_FAULT_ENABLE;

I'd rather write this MSR to 0 right away and just enable the bits
which we really support.

Whatever Intel comes up with next, e.g. faulting of random other
instructions or whatever (mis)feature they think is valuable, will lead to
a debugging nightmare if some incompetent BIOS writer sets the bit and the
kernel does not know about it.

Yes, I know that there might be bits forced to 1 at some point in the
future, but let's deal with that when it happens.

Right now I can enable the CPUID FAULT bit on my broadwell and watch user
space programs die unexpectedly without a hint why. Simply because it's not
documented in the SDM. So we'd rather be safe than surprised.
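
To make the difference between the two approaches concrete, here is a small
userspace sketch that only models the value that would be written back to the
MSR. The bit positions and the names MISC_CPUID_FAULT, MISC_RING3_MWAIT,
MISC_KNOWN_BITS, misc_clear_one() and misc_from_zero() are assumptions made
for illustration; the real layout belongs to the MSR definition in the kernel
headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit assignments, assumed for this sketch only. */
#define MISC_CPUID_FAULT	(1ULL << 0)
#define MISC_RING3_MWAIT	(1ULL << 1)

/* Bits this hypothetical kernel knows how to handle. */
#define MISC_KNOWN_BITS		(MISC_CPUID_FAULT | MISC_RING3_MWAIT)

/*
 * Read-modify-write as in the quoted hunk: keep whatever firmware left in
 * the register and clear one known bit. Unknown bits survive.
 */
uint64_t misc_clear_one(uint64_t firmware_val)
{
	return firmware_val & ~MISC_CPUID_FAULT;
}

/*
 * Start-from-zero as suggested in the reply: ignore the firmware value and
 * set only bits that are both wanted and known to the kernel.
 */
uint64_t misc_from_zero(uint64_t wanted)
{
	return wanted & MISC_KNOWN_BITS;
}
```

With a firmware value that has some unknown bit set, misc_clear_one() passes
that bit through, while misc_from_zero() cannot, which is the debugging
concern described above.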

@hpa: Thoughts?

Thanks,

	tglx

^ permalink raw reply

* Re: [PATCH v10 6/7] x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
From: Thomas Gleixner @ 2016-11-09 13:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kyle Huey, Robert O'Callahan, Andy Lutomirski, Ingo Molnar,
	H. Peter Anvin, x86, Paolo Bonzini, Radim Krčmář,
	Jeff Dike, Richard Weinberger, Alexander Viro, Shuah Khan,
	Dave Hansen, Peter Zijlstra, Boris Ostrovsky, Len Brown,
	Rafael J. Wysocki, Dmitry Safonov, David Matlack, linux-kernel,
	user-mode-linux-devel, user-mode-linux-user, linux-fsdevel,
	linux-kselftest, kvm
In-Reply-To: <20161109132114.3ujq2wkhsm4kcytz@pd.tnic>

On Wed, 9 Nov 2016, Borislav Petkov wrote:

> On Tue, Nov 08, 2016 at 09:06:31PM +0100, Thomas Gleixner wrote:
> > The upcoming ring3 mwait stuff can add its magic to tweak that MSR into
> > this function.
> > 
> > Stick the call at the end of init_scattered_cpuid_features() for now. I
> > still need to figure out a proper place for it.
> 
> So Thomas and I discussed this more on IRC and I think we can get rid
> of the MSR iterating in scattered.c and integrate both the R3 MWAIT and
> CPUID faulting like this:

I agree mostly.
 
> ---
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index fcd484d2bb03..5c38a85af2e7 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -452,6 +457,39 @@ static void intel_bsp_resume(struct cpuinfo_x86 *c)
>  	init_intel_energy_perf(c);
>  }
>  
> +static void init_misc_enables(struct cpuinfo_x86 *c)
> +{
> +	u64 val, misc_en;
> +
> +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &misc_en))
> +		return;
> +
> +	misc_en &= ~MSR_MISC_ENABLES_CPUID_FAULT_ENABLE;

I'd rather write this MSR to 0 right away and enable only the bits
which we really support.

Whatever Intel comes up with next, e.g. faulting of random other
instructions or whatever (mis)feature they think is valuable, will lead to
a debugging nightmare if some incompetent BIOS writer sets the bit and the
kernel does not know about it.

Yes, I know that there might be bits forced to 1 at some point in the
future, but let's deal with that when it happens.

Right now I can enable the CPUID FAULT bit on my broadwell and watch user
space programs die unexpectedly without a hint why. Simply because it's not
documented in the SDM. So we'd rather be safe than surprised.
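
A minimal sketch of the difference between the two approaches, written as
plain user-space C rather than kernel code. The bit positions and the names
MISC_CPUID_FAULT, MISC_RING3_MWAIT, misc_clear_one() and misc_from_zero()
are assumptions for illustration only, not the real MSR definitions:

```c
#include <stdint.h>

/* Hypothetical bit layout of the misc features MSR, for illustration. */
#define MISC_CPUID_FAULT	(1ULL << 0)
#define MISC_RING3_MWAIT	(1ULL << 1)

/* Everything the (imaginary) kernel knows how to handle. */
#define MISC_KNOWN_BITS		(MISC_CPUID_FAULT | MISC_RING3_MWAIT)

/*
 * Read-modify-write as in the quoted patch: only the one known bit is
 * cleared, so any unknown bit set by the BIOS survives.
 */
static uint64_t misc_clear_one(uint64_t bios_val)
{
	return bios_val & ~MISC_CPUID_FAULT;
}

/*
 * Start from zero: the value written contains only bits that are both
 * requested and known, whatever the BIOS left in the register.
 */
static uint64_t misc_from_zero(uint64_t wanted)
{
	return wanted & MISC_KNOWN_BITS;
}
```

With the first variant a stray BIOS-set bit stays enabled; with the second
it can never reach the register.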

@hpa: Thoughts?

Thanks,

	tglx


^ permalink raw reply

* Re: [PATCH] sched/rt: RT_RUNTIME_GREED sched feature
From: Daniel Bristot de Oliveira @ 2016-11-09 13:33 UTC (permalink / raw)
  To: Peter Zijlstra, Daniel Bristot de Oliveira
  Cc: Steven Rostedt, Ingo Molnar, Christoph Lameter, linux-rt-users,
	LKML, Tommaso Cucinotta
In-Reply-To: <20161108195015.GP3117@twins.programming.kicks-ass.net>



On 11/08/2016 08:50 PM, Peter Zijlstra wrote:
>> The problem is that using RT_RUNTIME_SHARE a CPU will almost always
>> > borrow enough runtime to make a CPU intensive rt task to run forever...
>> > well not forever, but until the system crash because a kworker starved
>> > in this CPU. Kworkers are sched fair by design and users do not always
>> > have a way to avoid them in an isolated CPU, for example.
>> > 
>> > The user then can disable RT_RUNTIME_SHARE, but then the user will have
>> > the CPU going idle for (period - runtime) at each period... throwing CPU
>> > time in the trash.
> So why is this a problem? You really should not be running that much
> FIFO tasks to begin with.

I agree that a spinning real-time task is not a good practice, but there
are people using it and they have their own reasons/metrics/evaluations
motivating them.

> So I'm willing to take out (or at least default disable
> RT_RUNTIME_SHARE). But other than this, this never really worked to
> begin with. So it cannot be a regression. And we've lived this long with
> the 'problem'.

I agree! It would work perfectly in the absence of tasks pinned to a
processor, but that is not the case.

To serve the users who want as much CPU time for -rt tasks as
possible, the proposed patch seems to be a better solution than
RT_RUNTIME_SHARE, and it is way simpler! Even though it is not as
good as a DL server would be in the future, it seems to be useful
until then...

> We really should be doing the right thing here, not make a bigger mess.

I see, agree and I am anxious to have it! :-). Tommaso and I discussed
DL servers implementing such rt throttling. The most complicated
point so far (as Rostedt pointed out in another e-mail) will be to have DL
servers with arbitrary affinity, or serving tasks with arbitrary
affinity. For example, one DL server pinned to each CPU providing
bandwidth for fair tasks to run for (rt_period - rt_runtime)us at each
rt_period... it will take some time until someone proposes it.
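
To put a number on that (rt_period - rt_runtime) window, here is a small
stand-alone calculation. It is only a sketch: fair_slice_us() is a
hypothetical helper, and the two constants are the usual defaults of
sched_rt_period_us and sched_rt_runtime_us, assumed here rather than read
from a running system:

```c
/* Usual defaults of sched_rt_period_us and sched_rt_runtime_us. */
#define RT_PERIOD_US_DEFAULT	1000000L
#define RT_RUNTIME_US_DEFAULT	 950000L

/*
 * Microseconds per period left for fair tasks on one CPU when rt tasks
 * use their whole budget and no runtime is borrowed from other CPUs.
 * A negative runtime means throttling is disabled, so nothing is reserved.
 */
static long fair_slice_us(long rt_period_us, long rt_runtime_us)
{
	if (rt_runtime_us < 0 || rt_runtime_us >= rt_period_us)
		return 0;
	return rt_period_us - rt_runtime_us;
}
```

With the defaults this gives 50000us per second, i.e. 5% of the CPU: the
time that goes idle when RT_RUNTIME_SHARE is disabled and no fair task is
runnable.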

-- Daniel

^ permalink raw reply

* Re: [PATCH v7 00/21] Introduce SoC device/driver framework for EAL
From: Shreyansh Jain @ 2016-11-09 13:36 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, viktorin, david.marchand
In-Reply-To: <15406748.TzsDne6LfD@xps13>

Hello Thomas,

On Wednesday 09 November 2016 03:47 PM, Thomas Monjalon wrote:
> Hi Shreyansh,
>
> I realize that we had a lot of off-list discussions on this topic
> and there was no public explanation of the status of this series.

Thanks for your email (and all the suggestions you have given 
offline/IRC etc.).
I was beginning to wonder whether only Jan and I were 
interested in this. Ironically, I felt that, these being EAL changes, a lot of 
people would come and be critical of it - giving an opportunity to get it 
widely accepted.

>
> 2016-10-28 17:56, Shreyansh Jain:
> [...]
>> As of now EAL is primarly focused on PCI initialization/probing.
>
> Right. DPDK was PCI centric.
> We must give PCI its right role: a bus as other ones.
> A first step was done in 16.11 (thanks to you Shreyansh, Jan and David)
> to have a better device model.
> The next step is to rework the bus abstraction.

Or, this change can be broken into multiple steps:
1. Create a PCI parallel layer for non-PCI devices. (call it SoC, or 
whatever else - doesn't really matter. More below).
2. Generalize this 'soc' changeset into common
3. Complete generalization by introducing a Linux like model of 
bus<=>device<=>driver.

That was my current approach - expecting that the SoC patchset 
would allow me to introduce the NXP PMD while, in parallel, I keep 
pushing generic EAL changes towards a generic bus architecture.

>
> It seems a bus can be defined with the properties scan/match/notify,
> leading to initialize the devices.

'notify' is something which I am not completely clear about - but, in 
principle, agree.

>
> More comments below your technical presentation.
>
> [...]
>> This patchset introduces SoC framework which would enable SoC drivers and
>> drivers to be plugged into EAL, very similar to how PCI drivers/devices are
>> done today.
>>
>> This is a stripped down version of PCI framework which allows the SoC PMDs
>> to implement their own routines for detecting devices and linking devices to
>> drivers.
>>
>> 1) Changes to EAL
>>  rte_eal_init()
>>   |- rte_eal_pci_init(): Find PCI devices from sysfs
>>   |- rte_eal_soc_init(): Calls PMDs->scan_fn
>>   |- ...
>>   |- rte_eal_memzone_init()
>>   |- ...
>>   |- rte_eal_pci_probe(): Driver<=>Device initialization, PMD->devinit()
>>   `- rte_eal_soc_probe(): Calls PMDs->match_fn and PMDs->devinit();
>>
>> 2) New device/driver structures:
>>   - rte_soc_driver (inheriting rte_driver)
>>   - rte_soc_device (inheriting rte_device)
>>   - rte_eth_dev and eth_driver embedded rte_soc_device and rte_soc_driver,
>>     respectively.
>>
>> 3) The SoC PMDs need to:
>>  - define rte_soc_driver with necessary scan and match callbacks
>>  - Register themselves using DRIVER_REGISTER_SOC()
>>  - Implement respective bus scanning in the scan callbacks to add necessary
>>    devices to SoC device list
>>  - Implement necessary eth_dev_init/uninint for ethernet instances
>
> These callbacks are not specific to a SoC.

Agree; this is just a note - it is exactly what a PCI PMD does.

> By the way a SoC defines nothing specific. You are using the SoC word
> as an equivalent of non-PCI.

Yes, primarily because SoC is a very broad word which can encompass a 
variety of devices (buses/drivers). So, we have the PCI set, the VDEV set, and 
everything else represented by 'SoC'.
The complete set is based on this principle: 'to have a generic subsystem 
_parallel_ to PCI (so as not to impact it) which can represent all 
devices not already included in the PCI set'. - Actually, it is _nothing_ 
specific.

> We must have a bus abstraction like the one you are defining for the SoC
> but it must be generic and defined on top of PCI, so we can plug any
> bus in it: PCI, vdev (as a software bus), any other well-defined bus,
> and some driver-specific bus which can be implemented directly in the
> driver (the NXP case AFAIK).

Indeed - that is an ideal approach. And honestly, to give true attribution, 
it is not my original idea. It is yours, from our IRC discussion.
And it was ironic as well because Declan came up with a similar suggestion 
much earlier but no one commented on it.

>
>> 4) Design considerations that are same as PCI:
>>  - SoC initialization is being done through rte_eal_init(), just after PCI
>>    initialization is done.
>>  - As in case of PCI, probe is done after rte_eal_pci_probe() to link the
>>    devices detected with the drivers registered.
>>  - Device attach/detach functions are available and have been designed on
>>    the lines of PCI framework.
>>  - PMDs register using DRIVER_REGISTER_SOC, very similar to
>>    DRIVER_REGISTER_PCI for PCI devices.
>>  - Linked list of SoC driver and devices exists independent of the other
>>    driver/device list, but inheriting rte_driver/rte_driver, these are
>>    also part of a global list.
>>
>> 5) Design considerations that are different from PCI:
>>  - Each driver implements its own scan and match function. PCI uses the BDF
>>    format to read the device from sysfs, but this _may_not_ be a case for a
>>    SoC ethernet device.
>>    = This is an important change from initial proposal by Jan in [2].
>>    Unlike his attempt to use /sys/bus/platform, this patch relies on the
>>    PMD to detect the devices. This is because SoC may require specific or
>>    additional info for device detection. Further, SoC may have embedded
>>    devices/MACs which require initialization which cannot be covered
>>    through sysfs parsing.
>>    `-> Point (6) below is a side note to above.
>>    = PCI based PMDs rely on EAL's capability to detect devices. This
>>    proposal puts the onus on PMD to detect devices, add to soc_device_list
>>    and wait for Probe. Matching, of device<=>driver is again PMD's
>>    callback.
>
> These PCI considerations can be described as a PCI bus implementation in EAL.

Yes. Or, we can continue as it is for PCI and allow more buses to plug 
in. And eventually, make the whole bus thing generic. Step-by-Step.

>
>> 6) Adding default scan and match helpers for PMDs
>>  - The design warrrants the PMDs implement their own scan of devices
>>    on bus, and match routines for probe implementation.
>>    This patch introduces helpers which can be used by PMDs for scan of
>>    the platform bus and matching devices against the compatible string
>>    extracted from the scan.
>>  - Intention is to make it easier to integrate known SoC which expose
>>    platform bus compliant information (compat, sys/bus/platform...).
>>  - PMDs which have deviations from this standard model can implement and
>>    hook their bus scanning and probe match callbacks while registering
>>    driver.
>
> Yes we can have EAL helpers to scan sys/bus/platform or other common formats.
> Then the PMD can use them to implement its specific bus.

And:
  - Framework/Developers should be able to 'register/unregister' buses.
  -- Just like drivers/net
  -- So, probably a drivers/bus/pci, drivers/bus/xxx folder structure
  -- and compatible compilation routine
  -- Or maybe librte_bus?
  --- I still haven't gotten around to solving this.
  - RTE_PMD_REGISTER_BUS probably.
  - just EAL helpers are not enough - or at least, not generic enough.

  - Drivers should be able to 'lookup' buses
  -- PCI PMDs find a 'pci' bus during registration, for example.
  -- Unlike Linux, a device tree doesn't exist to solve this issue of 
which PMD is connected to which bus. This needs to be solved using explicit 
APIs
  -- Once found, drivers can 'associate' with a bus.

  - Not just bus<->device<->driver, there is pending work with respect 
to ethernet devices.
  -- Removing eth_driver->pci_driver linkage. Making it generic.
  -- Multiple places assume eth device to be PCI, rewriting that part.
  -- device_insert changes for vdev
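
The register/unregister and lookup points above could look roughly like
the following. This is only a sketch to make the idea concrete: every name
in it (struct sketch_bus, sketch_bus_register() and so on) is hypothetical
and none of it is an existing DPDK API.

```c
#include <stddef.h>
#include <string.h>

/* A bus as discussed in this thread: a name plus scan/match callbacks. */
struct sketch_bus {
	const char *name;
	int (*scan)(void);                               /* discover devices */
	int (*match)(const char *drv, const char *dev);  /* 0 on match */
	struct sketch_bus *next;
};

/* Global singly linked list of registered buses. */
static struct sketch_bus *sketch_bus_list;

/* Add a bus to the head of the global list. */
static void sketch_bus_register(struct sketch_bus *bus)
{
	bus->next = sketch_bus_list;
	sketch_bus_list = bus;
}

/* Remove a bus from the list; does nothing if it was never registered. */
static void sketch_bus_unregister(struct sketch_bus *bus)
{
	struct sketch_bus **p;

	for (p = &sketch_bus_list; *p != NULL; p = &(*p)->next) {
		if (*p == bus) {
			*p = bus->next;
			bus->next = NULL;
			return;
		}
	}
}

/* What a driver would call to find the bus it wants to associate with. */
static struct sketch_bus *sketch_bus_lookup(const char *name)
{
	struct sketch_bus *b;

	for (b = sketch_bus_list; b != NULL; b = b->next)
		if (strcmp(b->name, name) == 0)
			return b;
	return NULL;
}
```

A PCI PMD would then look up "pci" at registration time and attach itself
to the returned bus; a driver-specific bus would register its own instance
in the same way.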

>
> I know that David, you and me are on the same page to agree on this generic
> design. I hope you will be able to achieve this work soon.

I agree with your suggestion in principle - bus model is nice to have. 
There is always a better design. It is about how to achieve it.

I still think that current SoC patchset is a decent first step without 
completely discounting any future changes. (Obviously, assuming it is 
not NACK'd by someone because of some currently unforeseen technical 
argument).
Because:
1. It doesn't break the existing PCI subsystem anywhere. It is disabled 
by default.
2. It is independent - in the sense that changes are limited to SoC 
specific files, with only a few changes in EAL. But, clean patches make 
it easier to validate.
3. It would allow more people to check/validate the overall proposal of 
non-PCI device existence.
4. It paves the way for a cleaner transition to the Bus architecture
  - rte_device/driver inheritance is complete after SoC patchset. Except 
changes in eth_dev/driver which is a completely parallel change with 
little overlap (but wide impact)
  - rte_soc_driver introduces 'scan/match' which can then be generalized 
to rte_driver in next step.
  -- This is very similar to what happened in rte_device/driver change 
where PCI things were generalized to common.

> The goal is to be able to plug a SoC driver in DPDK 17.02 with a clean design.

I am already working on that - now on a higher priority. Hopefully 
within the next few days I will send an RFC out. There are still some grey 
areas for me (lack of understanding of DPDK internals, probably) - but I am 
hoping I will get good support from others - at least this time.

Then, probably for 17.02, we can have either candidate - and we can 
integrate whatever looks best (in terms of design and impact, both).

> My advice to make reviews and contributions easier, is to split the work in
> few steps:
> 	- clean PCI code (generalize non-PCI stuff)
> 	- add generic bus functions
> 	- plug PCI in generic bus abstraction
> 	- plug vdev in generic bus abstraction
> 	- plug the new NXP driver

I agree with above.

>
> Thanks
>

Thank you for your time and suggestions.

-
Shreyansh

^ permalink raw reply

* Re: [RFC PATCH 13/24] ARM: vITS: handle CLEAR command
From: Julien Grall @ 2016-11-09 13:32 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel
In-Reply-To: <alpine.DEB.2.10.1611081608580.3491@sstabellini-ThinkPad-X260>

Hi,

On 09/11/16 00:39, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> This introduces the ITS command handler for the CLEAR command, which
>> clears the pending state of an LPI.
>> This removes a not-yet injected, but already queued IRQ from a VCPU.
>>
>> In addition this patch introduces the lookup function which translates
>> a given DeviceID/EventID pair into a pointer to our vITTE structure.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-its.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 115 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
>> index 875b992..99d9e9c 100644
>> --- a/xen/arch/arm/vgic-its.c
>> +++ b/xen/arch/arm/vgic-its.c
>> @@ -61,6 +61,73 @@ struct vits_itte
>>      uint64_t collection:16;
>>  };
>>
>> +#define UNMAPPED_COLLECTION      ((uint16_t)~0)
>> +
>> +/* Must be called with the ITS lock held. */
>> +static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
>> +{
>> +    uint16_t vcpu_id;
>> +
>> +    if ( collid >= its->max_collections )
>> +        return NULL;
>> +
>> +    vcpu_id = its->coll_table[collid];
>> +    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
>> +        return NULL;
>> +
>> +    return its->d->vcpu[vcpu_id];
>> +}
>> +
>> +#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
>> +#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK(7, 0)) + 1))
>> +#define DEV_TABLE_ENTRY(addr, bits)                     \
>> +        (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
>> +
>> +static paddr_t get_itte_address(struct virt_its *its,
>> +                                uint32_t devid, uint32_t evid)
>> +{
>> +    paddr_t addr;
>> +
>> +    if ( devid >= its->max_devices )
>> +        return ~0;
>
> Please #define the error

Technically this should be INVALID_PADDR here.

>
>> +    if ( evid >= DEV_TABLE_ITT_SIZE(its->dev_table[devid]) )
>> +        return ~0;
>
> same here

Ditto.
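
A minimal standalone sketch of the lookup with INVALID_PADDR in place of
the bare ~0. The typedef and the INVALID_PADDR, GENMASK64 and BIT64
definitions are stand-ins so that the snippet compiles outside Xen, and
itte_address() with its flattened argument list is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Xen definitions, so this builds on its own. */
typedef uint64_t paddr_t;
#define INVALID_PADDR   (~(paddr_t)0)

#define GENMASK64(h, l) ((~0ULL << (l)) & (~0ULL >> (63 - (h))))
#define BIT64(n)        (1ULL << (n))

/* Same encoding as in the quoted patch; assumes small size fields. */
#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK64(51, 8))
#define DEV_TABLE_ITT_SIZE(x) (BIT64(((x) & GENMASK64(7, 0)) + 1))
#define DEV_TABLE_ENTRY(addr, bits)                     \
        (((addr) & GENMASK64(51, 8)) | (((bits) - 1) & GENMASK64(7, 0)))

/* Returns the guest address of the ITTE, or INVALID_PADDR if out of range. */
static paddr_t itte_address(const uint64_t *dev_table, uint32_t max_devices,
                            uint32_t devid, uint32_t evid, size_t itte_size)
{
    if ( devid >= max_devices )
        return INVALID_PADDR;

    if ( evid >= DEV_TABLE_ITT_SIZE(dev_table[devid]) )
        return INVALID_PADDR;

    return DEV_TABLE_ITT_ADDR(dev_table[devid]) + (paddr_t)evid * itte_size;
}
```

The caller then compares against INVALID_PADDR instead of ~0.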

>
>
>> +    addr = DEV_TABLE_ITT_ADDR(its->dev_table[devid]);
>> +
>> +    return addr + evid * sizeof(struct vits_itte);
>> +}
>> +
>> +/* Looks up a given deviceID/eventID pair on an ITS and returns a pointer to
>> + * the corresponding ITTE. This maps the respective guest page into Xen.
>> + * Once finished with handling the ITTE, call put_devid_evid() to unmap
>> + * the page again.
>> + * Must be called with the ITS lock held.
>> + */
>> +static struct vits_itte *get_devid_evid(struct virt_its *its,
>> +                                        uint32_t devid, uint32_t evid)
>> +{
>> +    paddr_t addr = get_itte_address(its, devid, evid);
>> +    struct vits_itte *itte;
>> +
>> +    if (addr == ~0)
>> +        return NULL;
>> +
>> +    /* TODO: check locking for map_guest_pages() */
>> +    itte = map_guest_pages(its->d, addr & PAGE_MASK, 1);
>> +    if ( !itte )
>> +        return NULL;
>
> No need to use the vmap to map 1 page

But you do have to translate the IPA to a PA, so you cannot directly use 
__pa on it.

>
>> +    return itte + (addr & ~PAGE_MASK) / sizeof(struct vits_itte);
>
> Please use () around the div operation for clarity
>
>
>> +}
>> +
>> +/* Must be called with the ITS lock held. */
>> +static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
>> +{
>> +    unmap_guest_pages(itte, 1);
>
> No need for this, once you use __pa instead of the vmap

Well, we should at least use map_domain_page/unmap_domain_page even if 
they are a nop on ARM64. And not directly __pa.

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply

* Re: [PATCH 1/2] net: mvpp2: don't bring up on MAC address set
From: Thomas Petazzoni @ 2016-11-09 13:22 UTC (permalink / raw)
  To: Baruch Siach; +Cc: Marcin Wojtas, netdev, Gregory Clement
In-Reply-To: <ff17831771f3575f351c134703d3f153485b01c0.1478696194.git.baruch@tkos.co.il>

Hello,

On Wed,  9 Nov 2016 14:56:33 +0200, Baruch Siach wrote:
> Current .ndo_set_mac_address implementation brings up the interface when revert
> to original address after failure succeeds. Fix this.
> 
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

Indeed, this piece of code is not very smart.

> diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
> index 60227a3452a4..e427b4706726 100644
> --- a/drivers/net/ethernet/marvell/mvpp2.c
> +++ b/drivers/net/ethernet/marvell/mvpp2.c
> @@ -5686,9 +5686,8 @@ static int mvpp2_set_mac_address(struct net_device *dev, void *p)
>  		if (!err)
>  			return 0;
>  		/* Reconfigure parser to accept the original MAC address */
> -		err = mvpp2_prs_update_mac_da(dev, dev->dev_addr);
> -		if (err)
> -			goto error;
> +		mvpp2_prs_update_mac_da(dev, dev->dev_addr);
> +		goto error;

Wouldn't it make more sense to call mvpp2_prs_update_mac_da() under
the error: goto label?

But if you think beyond that, it is a bit crazy that to handle the
error case of mvpp2_prs_update_mac_da(), we have to call
mvpp2_prs_update_mac_da(), which is exactly the same function...

Perhaps it would be interesting to investigate what are the various
conditions for which mvpp2_prs_update_mac_da() fails, and see if we can
avoid them.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply

* Re: [v3 4/5] vfio: implement APIs to set/put kvm to/from vfio group
From: Paolo Bonzini @ 2016-11-09 13:31 UTC (permalink / raw)
  To: Xiao Guangrong, Jike Song
  Cc: Alex Williamson, kwankhede, cjia, kevin.tian, kvm
In-Reply-To: <3e5299bc-e08e-f072-1001-7e0432cb1ca8@linux.intel.com>



On 09/11/2016 14:06, Xiao Guangrong wrote:
> 
> 
> On 11/09/2016 08:49 PM, Jike Song wrote:
> 
>> +void vfio_group_attach_kvm(struct vfio_group *group, struct kvm *kvm,
>> +        void (*fn)(struct kvm *))
>> +{
>> +    mutex_lock(&group->udata.lock);
> 
> This lock is needed, please see below.

*not* needed I guess.

>> +
>> +    fn(kvm);
>> +    blocking_notifier_call_chain(&group->udata.notifier,
>> +                VFIO_GROUP_NOTIFY_ATTACH_KVM, kvm);
> 
> As this is a callback before KVM releases its last refcount, i do not
> think vendor driver need to get additional KVM refcount.

The *group* driver doesn't need it indeed.  The mdev vendor driver
however does, so it will use kvm_get_kvm under its own mutex.  That is:

- attach kvm

	mutex_lock(mdev_driver->lock);
	mdev_driver->kvm = opaque;
	kvm_get_kvm(mdev_driver->kvm);
	mutex_unlock(mdev_driver->lock);

- detach kvm

	mutex_lock(mdev_driver->lock);
	kvm_put_kvm(mdev_driver->kvm);
	WARN_ON(mdev_driver->kvm != opaque);
	mdev_driver->kvm = NULL;
	mutex_unlock(mdev_driver->lock);

- use kvm

	mutex_lock(mdev_driver->lock);
	kvm = mdev_driver->kvm;
	...
	mutex_unlock(mdev_driver->lock);

or if safe:

	mutex_lock(mdev_driver->lock);
	kvm = mdev_driver->kvm;
	kvm_get_kvm(kvm);
	mutex_unlock(mdev_driver->lock);
	...
	kvm_put_kvm(kvm);

Thanks,

Paolo

^ permalink raw reply

* [Buildroot] [PATCH] qt5webkit: Get sources from Qt-5-unofficial-builds
From: Jérôme Pouiller @ 2016-11-09 13:31 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <1478695085-4210-1-git-send-email-abrodkin@synopsys.com>

Hello Alexey,

On Wednesday 09 November 2016 15:38:05 Alexey Brodkin wrote:
[...]
> --- a/package/qt5/qt5.mk
> +++ b/package/qt5/qt5.mk
> @@ -1,6 +1,7 @@
>  QT5_VERSION_MAJOR = 5.6
>  QT5_VERSION = $(QT5_VERSION_MAJOR).2
>  QT5_SITE = http://download.qt.io/official_releases/qt/$(QT5_VERSION_MAJOR)/$(QT5_VERSION)/submodules
> +QT5_SNAPSHOTS_SITE = http://download.qt.io/snapshots/qt/$(QT5_VERSION_MAJOR)/$(QT5_VERSION)/latest_src/submodules

Is it possible to also use snapshots site for official modules?

[...]
> --- a/package/qt5/qt5webkit/qt5webkit.mk
> +++ b/package/qt5/qt5webkit/qt5webkit.mk
> @@ -4,10 +4,9 @@
>  #
>  ################################################################################
>  
> -QT5WEBKIT_VERSION = b35917bcb44d7f200af0f4ac68a126fa0aa8d93d
> -# Using GitHub since it supports downloading tarballs from random commits.
> -# The http://code.qt.io/cgit/qt/qtwebkit.git/ repo doesn't allow to do so.
> -QT5WEBKIT_SITE = $(call github,qtproject,qtwebkit,$(QT5WEBKIT_VERSION))
> +QT5WEBKIT_VERSION = $(QT5_VERSION)
> +QT5WEBKIT_SITE = $(QT5_SNAPSHOTS_SITE)
> +QT5WEBKIT_SOURCE = qtwebkit-opensource-src-$(QT5WEBKIT_EXAMPLES_VERSION).tar.xz
                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
You mean QT5WEBKIT_VERSION?

BR,

-- 
Jérôme Pouiller, Sysmic
Embedded Linux specialist
http://www.sysmic.fr

^ permalink raw reply

* Re: [PATCH v3][manpages 2/2] perf_event_open.2: Document write_backward
From: Michael Kerrisk (man-pages) @ 2016-11-09 13:30 UTC (permalink / raw)
  To: Wang Nan, vincent.weaver
  Cc: mtk.manpages, pi3orama, linux-kernel, lizefan, linux-man
In-Reply-To: <20161024065256.160703-3-wangnan0@huawei.com>

Hello Wang Nan,

On 10/24/2016 08:52 AM, Wang Nan wrote:
> Linux 4.7 (9ecda41acb971ebd07c8fb35faf24005c0baea12) introduces write_backward
> attribute to perf_event_attr. Document this feature.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>
> ---
>  man2/perf_event_open.2 | 57 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 55 insertions(+), 2 deletions(-)
> 
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index 561331c..fccde79 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -245,7 +245,8 @@ struct perf_event_attr {
>            use_clockid    :  1,  /* use clockid for time fields */
>            context_switch :  1,  /* context switch data */
>  
> -          __reserved_1   : 37;
> +          write_backward :  1,  /* Write ring buffer from end to beginning */
> +          __reserved_1   : 36;
>  
>      union {
>          __u32 wakeup_events;    /* wakeup every n events */
> @@ -1127,6 +1128,31 @@ The advantage of this method is that it will give full
>  information even with strict
>  .I perf_event_paranoid
>  settings.
> +.IR "write_backward" " (since Linux 4.7)"
> +.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
> +This makes the resuling event use a backward ring-buffer, which

s/resuling/resulting/

s/This/Setting this bit/ ?

> +writes samples from the end of the ring-buffer to the beginning.
> +
> +It is not allowed to connect events with backward and forward
> +ring-buffer settings together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +
> +Backward ring-buffer is useful for ring-buffers created by readonly
> +.BR mmap (2).
> +In this case,
> +.IR data_tail
> +is useless (because user space programs are not allowed to write to it).
> +.IR data_head
> +points to the head of the most recent sample. In a backward
> +ring-buffer, it is easy to iterate over the whole ring-buffer by reading
> +samples one by one from
> +.IR data_head
> +because size of a sample can be found from decoding its header.
> +
> +For a forward read only ring-buffer in contract,

What does "in contract" here mean? This needs to be clarified.

> +.IR data_head
> +points to the end of the most recent sample, but the size of a sample
> +can't be determined from the end of it.
>  .TP
>  .IR "wakeup_events" ", " "wakeup_watermark"
>  This union sets how many samples
> @@ -1671,7 +1697,9 @@ And vice versa:
>  .TP
>  .I data_head
>  This points to the head of the data section.
> -The value continuously increases, it does not wrap.
> +The value continuously increases (or decrease if
> +.IR write_backward
> +is set), it does not wrap.
>  The value needs to be manually wrapped by the size of the mmap buffer
>  before accessing the samples.
>  
> @@ -2736,6 +2764,24 @@ Starting with Linux 3.18,
>  .B POLL_HUP
>  is indicated if the event being monitored is attached to a different
>  process and that process exits.
> +.SS Reading from overwritable ring-buffer
> +Reader is unable to update
> +.IR data_tail
> +if the mapping is not
> +.BR PROT_WRITE .
> +In this case, kernel will overwrite data without considering whether
> +they are read or not, so ring-buffer is overwritable and
> +behaves like a flight recorder. To read from an overwritable
> +ring-buffer, setting
> +.IR write_backward
> +is suggested, or it would be hard to find a proper position to start
> +decoding. In addition, ring-buffer should be paused before reading
> +through
> +.BR ioctl (2)
> +with
> +.B PERF_EVENT_IOC_PAUSE_OUTPUT
> +to avoid racing between kernel and reader. Ring-buffer should be resumed
> +after finish reading.
>  .SS rdpmc instruction
>  Starting with Linux 3.4 on x86, you can use the
>  .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
> @@ -2848,6 +2894,13 @@ The file descriptors must all be on the same CPU.
>  
>  The argument specifies the desired file descriptor, or \-1 if
>  output should be ignored.
> +
> +Two events with different
> +.IR write_backward
> +settings are not allowed to be connected together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +.B EINVAL
> +is returned in this case.
>  .TP
>  .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
>  .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply

* Re: [PATCH v4 0/4] MIPS: Remote processor driver
From: Matt Redfearn @ 2016-11-09 13:30 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ralf Baechle, Bjorn Andersson, Ohad Ben-Cohen, linux-mips,
	linux-remoteproc, Lisa Parratt, LKML, Qais Yousef,
	Masahiro Yamada, Paul Gortmaker, Jason Cooper, James Hogan,
	Marc Zyngier, Paul Burton, Peter Zijlstra
In-Reply-To: <alpine.DEB.2.20.1611091111041.3501@nanos>

Hi Thomas,


On 09/11/16 10:15, Thomas Gleixner wrote:
> On Wed, 2 Nov 2016, Matt Redfearn wrote:
>> The MIPS remote processor driver allows non-Linux firmware to take
>> control of and execute on one of the systems VPEs. The CPU must be
>> offlined from Linux first. A sysfs interface is created which allows
>> firmware to be loaded and changed at runtime. A full description is
>> available at [1]. An example firmware that can be used with this driver
>> is available at [2].
>>
>> This is useful to allow running bare metal code, or an RTOS, on one or
>> more CPUs while allowing Linux to continue running on those remaining.
> And how is actually guaranteed that these two things are properly seperated
> (memory, devices, interrupts etc.) ?

Memory separation is primarily handled by the remoteproc subsystem, 
which will allocate and map memory as required by the firmware, though 
because the CPU is executing in kernel mode there is nothing preventing 
it accessing anything in the system. But that is of course the same as 
any root process which could do the same thing via /dev/mem. One must be 
root to offline the CPU from Linux and load firmware to it, so there is 
no greater hazard to the system than that firmware running as a root 
process within userland.

Separation of devices and interrupts is a system design issue, as this 
feature will find use in embedded systems where the system will be 
partitioned into Linux and bare metal components. This is done where 
there are requirements such as needing to run real time code as well as 
Linux, or enforce separation through firmware binaries running 
separately to Linux.
This would be useful, for example, for a modem driver running as bare 
metal code within one of the system VPEs and providing a virtio-net 
interface to the kernel. There would be no kernel driver present for 
such a device, therefore there would be no resource conflicts.

The only different thing about the MIPS implementation of remoteproc 
is that it turns one of the general purpose Linux CPUs into a remote 
processor, rather than there being a separate remote CPU within the SoC, 
as is the case with most remoteproc drivers. But unless there is some 
form of MMU between that CPU and the system bus, then it will have the 
same ability to access all system resources as is the case with this 
driver. Again I don't think there is any greater risk to the system here 
than there would be with any other remoteproc based system.

>
> We have rejected attempts to do exactly the same thing on x86 in the
> past. There is virtualization and NOHZ_FULL to do it proper and not just
> with a horrible hackery.

There is already a mechanism to do this in the upstream MIPS kernel - 
the VPE loader, which has been there since 2005 (commit 
e01402b115cccb6357f956649487aca2c6f7fbba). One user of the VPE loader 
was Lantiq, who used it to load a proprietary modem driver, for which 
there is no GPL driver.
What we are proposing here is to move from that MIPS specific mechanism 
of running bare metal code to the standardized remoteproc subsystem such 
that people wanting to design a MIPS based system with both real time 
firmware and general Linux processing tasks may do so using standardized 
kernel interfaces.

Thanks,
Matt

>
> Thanks,
>
> 	tglx
>
>

^ permalink raw reply

