Linux userland API discussions


* [PATCH 15/15] staging: media: Replace v4l2-mediabus.h inclusion with v4l2-mbus.h
From: Boris Brezillon @ 2014-11-04  9:55 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Hans Verkuil, Laurent Pinchart,
	linux-media-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-doc-u79uwXL29TY76Z2rM5mHXA, Guennadi Liakhovetski,
	Boris Brezillon
In-Reply-To: <1415094910-15899-1-git-send-email-boris.brezillon-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>

The v4l2-mediabus.h header is now deprecated and should be replaced with
v4l2-mbus.h.

Signed-off-by: Boris Brezillon <boris.brezillon-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
---
 drivers/staging/media/omap4iss/iss_csi2.c  | 2 +-
 drivers/staging/media/omap4iss/iss_video.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/omap4iss/iss_csi2.c b/drivers/staging/media/omap4iss/iss_csi2.c
index b72e530..f47e4e5 100644
--- a/drivers/staging/media/omap4iss/iss_csi2.c
+++ b/drivers/staging/media/omap4iss/iss_csi2.c
@@ -13,7 +13,7 @@
 
 #include <linux/delay.h>
 #include <media/v4l2-common.h>
-#include <linux/v4l2-mediabus.h>
+#include <linux/v4l2-mbus.h>
 #include <linux/mm.h>
 
 #include "iss.h"
diff --git a/drivers/staging/media/omap4iss/iss_video.h b/drivers/staging/media/omap4iss/iss_video.h
index cc8146b..a028b51 100644
--- a/drivers/staging/media/omap4iss/iss_video.h
+++ b/drivers/staging/media/omap4iss/iss_video.h
@@ -14,7 +14,7 @@
 #ifndef OMAP4_ISS_VIDEO_H
 #define OMAP4_ISS_VIDEO_H
 
-#include <linux/v4l2-mediabus.h>
+#include <linux/v4l2-mbus.h>
 #include <media/media-entity.h>
 #include <media/v4l2-dev.h>
 #include <media/v4l2-fh.h>
-- 
1.9.1

* Re: [PATCH net-next 3/7] bpf: add array type of eBPF maps
From: Daniel Borkmann @ 2014-11-04  9:58 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Ingo Molnar, Andy Lutomirski,
	Hannes Frederic Sowa, Eric Dumazet, linux-api, netdev,
	linux-kernel
In-Reply-To: <1415069656-14138-4-git-send-email-ast@plumgrid.com>

On 11/04/2014 03:54 AM, Alexei Starovoitov wrote:
> add new map type BPF_MAP_TYPE_ARRAY and its implementation
>
> - optimized for fastest possible lookup()
>    . in the future verifier/JIT may recognize lookup() with constant key
>      and optimize it into constant pointer. Can optimize non-constant
>      key into direct pointer arithmetic as well, since pointers and
>      value_size are constant for the life of the eBPF program.
>      In other words array_map_lookup_elem() may be 'inlined' by verifier/JIT
>      while preserving concurrent access to this map from user space
>
> - two main use cases for array type:
>    . 'global' eBPF variables: array of 1 element with key=0 and value is a
>      collection of 'global' variables which programs can use to keep the state
>      between events
>    . aggregation of tracing events into fixed set of buckets
>
> - all array elements pre-allocated and zero initialized at init time
>
> - key is an index into the array and can only be 4 bytes
>
> - map_delete_elem() returns EINVAL, since elements cannot be deleted
>
> - map_update_elem() replaces elements in a non-atomic way
>    (for atomic updates hashtable type should be used instead)
>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

...
> +/* Called from syscall or from eBPF program */
> +static int array_map_update_elem(struct bpf_map *map, void *key, void *value,
> +				 u64 map_flags)
> +{
> +	struct bpf_array *array = container_of(map, struct bpf_array, map);
> +	u32 index = *(u32 *)key;
> +
> +	if (map_flags > BPF_MAP_UPDATE_ONLY)
> +		/* unknown flags */
> +		return -EINVAL;
> +
> +	if (map_flags == BPF_MAP_CREATE_ONLY)
> +		return -EINVAL;
> +
> +	if (index >= array->map.max_entries)
> +		/* all elements were pre-allocated, cannot insert a new one */
> +		return -E2BIG;
> +
> +	memcpy(array->value + array->elem_size * index, value, array->elem_size);

What would protect this from concurrent updates?

> +	return 0;
> +}
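
For readers following the question above: the commit message states that
map_update_elem() replaces elements in a non-atomic way, and the quoted
function takes no lock around the memcpy(). Below is a minimal userspace
sketch of the same bounds and flag checks. All toy_ names and the TOY_ flag
values are assumptions made for illustration (the excerpt does not show the
values of the BPF_MAP_* flags); this is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values; the real BPF_MAP_* values are not shown above. */
enum { TOY_ANY = 0, TOY_CREATE_ONLY = 1, TOY_UPDATE_ONLY = 2 };

/* Hypothetical userspace model of a pre-allocated array map. */
struct toy_array {
	uint32_t max_entries;
	uint32_t elem_size;
	unsigned char *value;	/* max_entries * elem_size bytes */
};

/* All elements are allocated and zero-initialized up front. */
int toy_array_init(struct toy_array *a, uint32_t max_entries,
		   uint32_t elem_size)
{
	assert(a != NULL);
	a->value = calloc(max_entries, elem_size);
	if (!a->value)
		return -ENOMEM;
	a->max_entries = max_entries;
	a->elem_size = elem_size;
	return 0;
}

/* Mirrors the checks in the quoted function; the copy is not atomic. */
int toy_array_update(struct toy_array *a, uint32_t index, const void *value,
		     uint64_t flags)
{
	if (flags > TOY_UPDATE_ONLY)
		return -EINVAL;		/* unknown flags */
	if (flags == TOY_CREATE_ONLY)
		return -EINVAL;		/* every element already exists */
	if (index >= a->max_entries)
		return -E2BIG;		/* cannot grow a pre-allocated array */
	memcpy(a->value + (size_t)a->elem_size * index, value, a->elem_size);
	return 0;
}

/* Returns a pointer into the array, or NULL when the index is out of range. */
void *toy_array_lookup(struct toy_array *a, uint32_t index)
{
	if (index >= a->max_entries)
		return NULL;
	return a->value + (size_t)a->elem_size * index;
}
```

In this model a reader holding a pointer returned by toy_array_lookup() can
observe a partially written value while toy_array_update() is copying, which
is the kind of interleaving the question is about.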

* Re: [PATCH 01/15] [media] Move mediabus format definition to a more standard place
From: Hans Verkuil @ 2014-11-04 10:20 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, linux-media
  Cc: linux-arm-kernel, linux-api, devel, linux-kernel, linux-doc,
	Guennadi Liakhovetski
In-Reply-To: <1415094910-15899-2-git-send-email-boris.brezillon@free-electrons.com>

Hi Boris,

On 11/04/14 10:54, Boris Brezillon wrote:
> Rename mediabus formats and move the enum into a separate header file so
> that it can be used by DRM/KMS subsystem without any reference to the V4L2
> subsystem.
> 
> Old V4L2_MBUS_FMT_ definitions are now referencing MEDIA_BUS_FMT_ value.

I missed earlier that v4l2-mediabus.h contained a struct as well, so it can't be
deprecated and neither can a #warning be added.

The best approach, I think, is to use a macro in media-bus-format.h
that will either define just the MEDIA_BUS value when compiled in the kernel, or
define both MEDIA_BUS and V4L2_MBUS values when compiled for userspace.

E.g. something like this:

#ifdef __KERNEL__
#define MEDIA_BUS_FMT_ENTRY(name, val) MEDIA_BUS_FMT_ ## name = val
#else
/* Keep V4L2_MBUS_FMT for backwards compatibility */
#define MEDIA_BUS_FMT_ENTRY(name, val) \
	MEDIA_BUS_FMT_ ## name = val, \
	V4L2_MBUS_FMT_ ## name = val
#endif
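
A compilable sketch of how such an entry macro could expand, for
illustration only. The TOY_ prefixes and the TOY_KERNEL switch are
hypothetical stand-ins for the real prefixes and __KERNEL__; the three
values are taken from the enum in the quoted patch. Token pasting in the C
preprocessor is spelled `##`; a single `#` would stringify the argument
instead.

```c
#include <assert.h>

#ifdef TOY_KERNEL
/* In-kernel build: only the new name is defined. */
#define TOY_BUS_FMT_ENTRY(name, val) TOY_MEDIA_BUS_FMT_ ## name = val
#else
/* Userspace build: the legacy name is kept as an alias with the same value. */
#define TOY_BUS_FMT_ENTRY(name, val) \
	TOY_MEDIA_BUS_FMT_ ## name = val, \
	TOY_V4L2_MBUS_FMT_ ## name = val
#endif

enum toy_media_bus_format {
	TOY_BUS_FMT_ENTRY(FIXED, 0x0001),
	TOY_BUS_FMT_ENTRY(RGB888_1X24, 0x100a),
	TOY_BUS_FMT_ENTRY(Y8_1X8, 0x2001),
};

/* The new name exists in both build modes. */
static_assert(TOY_MEDIA_BUS_FMT_FIXED == 0x0001, "value must not change");
```

With this shape, each format is listed once and both spellings always carry
the same value in a userspace build.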

An alternative approach is to have v4l2-mediabus.h include media-bus-format.h,
put #ifndef __KERNEL__ around the enum v4l2_mbus_pixelcode and add a big comment
there that applications should use the defines from media-bus-format.h and that
this enum is frozen (i.e. new values are only added to media-bus-format.h).

But I think I like the macro idea best.

Regards,

	Hans

> 
> Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> ---
>  include/uapi/linux/Kbuild             |   1 +
>  include/uapi/linux/media-bus-format.h | 126 +++++++++++++++++++++++
>  include/uapi/linux/v4l2-mediabus.h    | 184 +++++++++++++++-------------------
>  3 files changed, 206 insertions(+), 105 deletions(-)
>  create mode 100644 include/uapi/linux/media-bus-format.h
> 
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index b70237e..b2c23f8 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -414,6 +414,7 @@ header-y += veth.h
>  header-y += vfio.h
>  header-y += vhost.h
>  header-y += videodev2.h
> +header-y += media-bus-format.h
>  header-y += virtio_9p.h
>  header-y += virtio_balloon.h
>  header-y += virtio_blk.h
> diff --git a/include/uapi/linux/media-bus-format.h b/include/uapi/linux/media-bus-format.h
> new file mode 100644
> index 0000000..2a826e9
> --- /dev/null
> +++ b/include/uapi/linux/media-bus-format.h
> @@ -0,0 +1,126 @@
> +/*
> + * Media Bus API header
> + *
> + * Copyright (C) 2009, Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __LINUX_MEDIA_BUS_FORMAT_H
> +#define __LINUX_MEDIA_BUS_FORMAT_H
> +
> +/*
> + * These bus formats uniquely identify data formats on the data bus. Format 0
> + * is reserved, MEDIA_BUS_FMT_FIXED shall be used by host-client pairs, where
> + * the data format is fixed. Additionally, "2X8" means that one pixel is
> + * transferred in two 8-bit samples, "BE" or "LE" specify in which order those
> + * samples are transferred over the bus: "LE" means that the least significant
> + * bits are transferred first, "BE" means that the most significant bits are
> + * transferred first, and "PADHI" and "PADLO" define which bits - low or high,
> + * in the incomplete high byte, are filled with padding bits.
> + *
> + * The bus formats are grouped by type, bus_width, bits per component, samples
> + * per pixel and order of subsamples. Numerical values are sorted using generic
> + * numerical sort order (8 thus comes before 10).
> + *
> + * As their value can't change when a new bus format is inserted in the
> + * enumeration, the bus formats are explicitly given a numerical value. The next
> + * free values for each category are listed below, update them when inserting
> + * new pixel codes.
> + */
> +enum media_bus_format {
> +	MEDIA_BUS_FMT_FIXED = 0x0001,
> +
> +	/* RGB - next is 0x100e */
> +	MEDIA_BUS_FMT_RGB444_2X8_PADHI_BE = 0x1001,
> +	MEDIA_BUS_FMT_RGB444_2X8_PADHI_LE = 0x1002,
> +	MEDIA_BUS_FMT_RGB555_2X8_PADHI_BE = 0x1003,
> +	MEDIA_BUS_FMT_RGB555_2X8_PADHI_LE = 0x1004,
> +	MEDIA_BUS_FMT_BGR565_2X8_BE = 0x1005,
> +	MEDIA_BUS_FMT_BGR565_2X8_LE = 0x1006,
> +	MEDIA_BUS_FMT_RGB565_2X8_BE = 0x1007,
> +	MEDIA_BUS_FMT_RGB565_2X8_LE = 0x1008,
> +	MEDIA_BUS_FMT_RGB666_1X18 = 0x1009,
> +	MEDIA_BUS_FMT_RGB888_1X24 = 0x100a,
> +	MEDIA_BUS_FMT_RGB888_2X12_BE = 0x100b,
> +	MEDIA_BUS_FMT_RGB888_2X12_LE = 0x100c,
> +	MEDIA_BUS_FMT_ARGB8888_1X32 = 0x100d,
> +
> +	/* YUV (including grey) - next is 0x2024 */
> +	MEDIA_BUS_FMT_Y8_1X8 = 0x2001,
> +	MEDIA_BUS_FMT_UV8_1X8 = 0x2015,
> +	MEDIA_BUS_FMT_UYVY8_1_5X8 = 0x2002,
> +	MEDIA_BUS_FMT_VYUY8_1_5X8 = 0x2003,
> +	MEDIA_BUS_FMT_YUYV8_1_5X8 = 0x2004,
> +	MEDIA_BUS_FMT_YVYU8_1_5X8 = 0x2005,
> +	MEDIA_BUS_FMT_UYVY8_2X8 = 0x2006,
> +	MEDIA_BUS_FMT_VYUY8_2X8 = 0x2007,
> +	MEDIA_BUS_FMT_YUYV8_2X8 = 0x2008,
> +	MEDIA_BUS_FMT_YVYU8_2X8 = 0x2009,
> +	MEDIA_BUS_FMT_Y10_1X10 = 0x200a,
> +	MEDIA_BUS_FMT_UYVY10_2X10 = 0x2018,
> +	MEDIA_BUS_FMT_VYUY10_2X10 = 0x2019,
> +	MEDIA_BUS_FMT_YUYV10_2X10 = 0x200b,
> +	MEDIA_BUS_FMT_YVYU10_2X10 = 0x200c,
> +	MEDIA_BUS_FMT_Y12_1X12 = 0x2013,
> +	MEDIA_BUS_FMT_UYVY8_1X16 = 0x200f,
> +	MEDIA_BUS_FMT_VYUY8_1X16 = 0x2010,
> +	MEDIA_BUS_FMT_YUYV8_1X16 = 0x2011,
> +	MEDIA_BUS_FMT_YVYU8_1X16 = 0x2012,
> +	MEDIA_BUS_FMT_YDYUYDYV8_1X16 = 0x2014,
> +	MEDIA_BUS_FMT_UYVY10_1X20 = 0x201a,
> +	MEDIA_BUS_FMT_VYUY10_1X20 = 0x201b,
> +	MEDIA_BUS_FMT_YUYV10_1X20 = 0x200d,
> +	MEDIA_BUS_FMT_YVYU10_1X20 = 0x200e,
> +	MEDIA_BUS_FMT_YUV10_1X30 = 0x2016,
> +	MEDIA_BUS_FMT_AYUV8_1X32 = 0x2017,
> +	MEDIA_BUS_FMT_UYVY12_2X12 = 0x201c,
> +	MEDIA_BUS_FMT_VYUY12_2X12 = 0x201d,
> +	MEDIA_BUS_FMT_YUYV12_2X12 = 0x201e,
> +	MEDIA_BUS_FMT_YVYU12_2X12 = 0x201f,
> +	MEDIA_BUS_FMT_UYVY12_1X24 = 0x2020,
> +	MEDIA_BUS_FMT_VYUY12_1X24 = 0x2021,
> +	MEDIA_BUS_FMT_YUYV12_1X24 = 0x2022,
> +	MEDIA_BUS_FMT_YVYU12_1X24 = 0x2023,
> +
> +	/* Bayer - next is 0x3019 */
> +	MEDIA_BUS_FMT_SBGGR8_1X8 = 0x3001,
> +	MEDIA_BUS_FMT_SGBRG8_1X8 = 0x3013,
> +	MEDIA_BUS_FMT_SGRBG8_1X8 = 0x3002,
> +	MEDIA_BUS_FMT_SRGGB8_1X8 = 0x3014,
> +	MEDIA_BUS_FMT_SBGGR10_ALAW8_1X8 = 0x3015,
> +	MEDIA_BUS_FMT_SGBRG10_ALAW8_1X8 = 0x3016,
> +	MEDIA_BUS_FMT_SGRBG10_ALAW8_1X8 = 0x3017,
> +	MEDIA_BUS_FMT_SRGGB10_ALAW8_1X8 = 0x3018,
> +	MEDIA_BUS_FMT_SBGGR10_DPCM8_1X8 = 0x300b,
> +	MEDIA_BUS_FMT_SGBRG10_DPCM8_1X8 = 0x300c,
> +	MEDIA_BUS_FMT_SGRBG10_DPCM8_1X8 = 0x3009,
> +	MEDIA_BUS_FMT_SRGGB10_DPCM8_1X8 = 0x300d,
> +	MEDIA_BUS_FMT_SBGGR10_2X8_PADHI_BE = 0x3003,
> +	MEDIA_BUS_FMT_SBGGR10_2X8_PADHI_LE = 0x3004,
> +	MEDIA_BUS_FMT_SBGGR10_2X8_PADLO_BE = 0x3005,
> +	MEDIA_BUS_FMT_SBGGR10_2X8_PADLO_LE = 0x3006,
> +	MEDIA_BUS_FMT_SBGGR10_1X10 = 0x3007,
> +	MEDIA_BUS_FMT_SGBRG10_1X10 = 0x300e,
> +	MEDIA_BUS_FMT_SGRBG10_1X10 = 0x300a,
> +	MEDIA_BUS_FMT_SRGGB10_1X10 = 0x300f,
> +	MEDIA_BUS_FMT_SBGGR12_1X12 = 0x3008,
> +	MEDIA_BUS_FMT_SGBRG12_1X12 = 0x3010,
> +	MEDIA_BUS_FMT_SGRBG12_1X12 = 0x3011,
> +	MEDIA_BUS_FMT_SRGGB12_1X12 = 0x3012,
> +
> +	/* JPEG compressed formats - next is 0x4002 */
> +	MEDIA_BUS_FMT_JPEG_1X8 = 0x4001,
> +
> +	/* Vendor specific formats - next is 0x5002 */
> +
> +	/* S5C73M3 sensor specific interleaved UYVY and JPEG */
> +	MEDIA_BUS_FMT_S5C_UYVY_JPEG_1X8 = 0x5001,
> +
> +	/* HSV - next is 0x6002 */
> +	MEDIA_BUS_FMT_AHSV8888_1X32 = 0x6001,
> +};
> +
> +#endif /* __LINUX_MEDIA_BUS_FORMAT_H */
> diff --git a/include/uapi/linux/v4l2-mediabus.h b/include/uapi/linux/v4l2-mediabus.h
> index 1445e85..f471064 100644
> --- a/include/uapi/linux/v4l2-mediabus.h
> +++ b/include/uapi/linux/v4l2-mediabus.h
> @@ -13,118 +13,92 @@
>  
>  #include <linux/types.h>
>  #include <linux/videodev2.h>
> +#include <linux/media-bus-format.h>
>  
> -/*
> - * These pixel codes uniquely identify data formats on the media bus. Mostly
> - * they correspond to similarly named V4L2_PIX_FMT_* formats, format 0 is
> - * reserved, V4L2_MBUS_FMT_FIXED shall be used by host-client pairs, where the
> - * data format is fixed. Additionally, "2X8" means that one pixel is transferred
> - * in two 8-bit samples, "BE" or "LE" specify in which order those samples are
> - * transferred over the bus: "LE" means that the least significant bits are
> - * transferred first, "BE" means that the most significant bits are transferred
> - * first, and "PADHI" and "PADLO" define which bits - low or high, in the
> - * incomplete high byte, are filled with padding bits.
> - *
> - * The pixel codes are grouped by type, bus_width, bits per component, samples
> - * per pixel and order of subsamples. Numerical values are sorted using generic
> - * numerical sort order (8 thus comes before 10).
> - *
> - * As their value can't change when a new pixel code is inserted in the
> - * enumeration, the pixel codes are explicitly given a numerical value. The next
> - * free values for each category are listed below, update them when inserting
> - * new pixel codes.
> - */
> -enum v4l2_mbus_pixelcode {
> -	V4L2_MBUS_FMT_FIXED = 0x0001,
> -
> -	/* RGB - next is 0x100e */
> -	V4L2_MBUS_FMT_RGB444_2X8_PADHI_BE = 0x1001,
> -	V4L2_MBUS_FMT_RGB444_2X8_PADHI_LE = 0x1002,
> -	V4L2_MBUS_FMT_RGB555_2X8_PADHI_BE = 0x1003,
> -	V4L2_MBUS_FMT_RGB555_2X8_PADHI_LE = 0x1004,
> -	V4L2_MBUS_FMT_BGR565_2X8_BE = 0x1005,
> -	V4L2_MBUS_FMT_BGR565_2X8_LE = 0x1006,
> -	V4L2_MBUS_FMT_RGB565_2X8_BE = 0x1007,
> -	V4L2_MBUS_FMT_RGB565_2X8_LE = 0x1008,
> -	V4L2_MBUS_FMT_RGB666_1X18 = 0x1009,
> -	V4L2_MBUS_FMT_RGB888_1X24 = 0x100a,
> -	V4L2_MBUS_FMT_RGB888_2X12_BE = 0x100b,
> -	V4L2_MBUS_FMT_RGB888_2X12_LE = 0x100c,
> -	V4L2_MBUS_FMT_ARGB8888_1X32 = 0x100d,
> +#define MEDIA_BUS_TO_V4L2_MBUS(x)	V4L2_MBUS_FMT_ ## x = MEDIA_BUS_FMT_ ## x
>  
> -	/* YUV (including grey) - next is 0x2024 */
> -	V4L2_MBUS_FMT_Y8_1X8 = 0x2001,
> -	V4L2_MBUS_FMT_UV8_1X8 = 0x2015,
> -	V4L2_MBUS_FMT_UYVY8_1_5X8 = 0x2002,
> -	V4L2_MBUS_FMT_VYUY8_1_5X8 = 0x2003,
> -	V4L2_MBUS_FMT_YUYV8_1_5X8 = 0x2004,
> -	V4L2_MBUS_FMT_YVYU8_1_5X8 = 0x2005,
> -	V4L2_MBUS_FMT_UYVY8_2X8 = 0x2006,
> -	V4L2_MBUS_FMT_VYUY8_2X8 = 0x2007,
> -	V4L2_MBUS_FMT_YUYV8_2X8 = 0x2008,
> -	V4L2_MBUS_FMT_YVYU8_2X8 = 0x2009,
> -	V4L2_MBUS_FMT_Y10_1X10 = 0x200a,
> -	V4L2_MBUS_FMT_UYVY10_2X10 = 0x2018,
> -	V4L2_MBUS_FMT_VYUY10_2X10 = 0x2019,
> -	V4L2_MBUS_FMT_YUYV10_2X10 = 0x200b,
> -	V4L2_MBUS_FMT_YVYU10_2X10 = 0x200c,
> -	V4L2_MBUS_FMT_Y12_1X12 = 0x2013,
> -	V4L2_MBUS_FMT_UYVY8_1X16 = 0x200f,
> -	V4L2_MBUS_FMT_VYUY8_1X16 = 0x2010,
> -	V4L2_MBUS_FMT_YUYV8_1X16 = 0x2011,
> -	V4L2_MBUS_FMT_YVYU8_1X16 = 0x2012,
> -	V4L2_MBUS_FMT_YDYUYDYV8_1X16 = 0x2014,
> -	V4L2_MBUS_FMT_UYVY10_1X20 = 0x201a,
> -	V4L2_MBUS_FMT_VYUY10_1X20 = 0x201b,
> -	V4L2_MBUS_FMT_YUYV10_1X20 = 0x200d,
> -	V4L2_MBUS_FMT_YVYU10_1X20 = 0x200e,
> -	V4L2_MBUS_FMT_YUV10_1X30 = 0x2016,
> -	V4L2_MBUS_FMT_AYUV8_1X32 = 0x2017,
> -	V4L2_MBUS_FMT_UYVY12_2X12 = 0x201c,
> -	V4L2_MBUS_FMT_VYUY12_2X12 = 0x201d,
> -	V4L2_MBUS_FMT_YUYV12_2X12 = 0x201e,
> -	V4L2_MBUS_FMT_YVYU12_2X12 = 0x201f,
> -	V4L2_MBUS_FMT_UYVY12_1X24 = 0x2020,
> -	V4L2_MBUS_FMT_VYUY12_1X24 = 0x2021,
> -	V4L2_MBUS_FMT_YUYV12_1X24 = 0x2022,
> -	V4L2_MBUS_FMT_YVYU12_1X24 = 0x2023,
> +enum v4l2_mbus_pixelcode {
> +	MEDIA_BUS_TO_V4L2_MBUS(FIXED),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB444_2X8_PADHI_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB444_2X8_PADHI_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB555_2X8_PADHI_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB555_2X8_PADHI_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(BGR565_2X8_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(BGR565_2X8_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB565_2X8_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB565_2X8_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB666_1X18),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB888_1X24),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB888_2X12_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(RGB888_2X12_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(ARGB8888_1X32),
>  
> -	/* Bayer - next is 0x3019 */
> -	V4L2_MBUS_FMT_SBGGR8_1X8 = 0x3001,
> -	V4L2_MBUS_FMT_SGBRG8_1X8 = 0x3013,
> -	V4L2_MBUS_FMT_SGRBG8_1X8 = 0x3002,
> -	V4L2_MBUS_FMT_SRGGB8_1X8 = 0x3014,
> -	V4L2_MBUS_FMT_SBGGR10_ALAW8_1X8 = 0x3015,
> -	V4L2_MBUS_FMT_SGBRG10_ALAW8_1X8 = 0x3016,
> -	V4L2_MBUS_FMT_SGRBG10_ALAW8_1X8 = 0x3017,
> -	V4L2_MBUS_FMT_SRGGB10_ALAW8_1X8 = 0x3018,
> -	V4L2_MBUS_FMT_SBGGR10_DPCM8_1X8 = 0x300b,
> -	V4L2_MBUS_FMT_SGBRG10_DPCM8_1X8 = 0x300c,
> -	V4L2_MBUS_FMT_SGRBG10_DPCM8_1X8 = 0x3009,
> -	V4L2_MBUS_FMT_SRGGB10_DPCM8_1X8 = 0x300d,
> -	V4L2_MBUS_FMT_SBGGR10_2X8_PADHI_BE = 0x3003,
> -	V4L2_MBUS_FMT_SBGGR10_2X8_PADHI_LE = 0x3004,
> -	V4L2_MBUS_FMT_SBGGR10_2X8_PADLO_BE = 0x3005,
> -	V4L2_MBUS_FMT_SBGGR10_2X8_PADLO_LE = 0x3006,
> -	V4L2_MBUS_FMT_SBGGR10_1X10 = 0x3007,
> -	V4L2_MBUS_FMT_SGBRG10_1X10 = 0x300e,
> -	V4L2_MBUS_FMT_SGRBG10_1X10 = 0x300a,
> -	V4L2_MBUS_FMT_SRGGB10_1X10 = 0x300f,
> -	V4L2_MBUS_FMT_SBGGR12_1X12 = 0x3008,
> -	V4L2_MBUS_FMT_SGBRG12_1X12 = 0x3010,
> -	V4L2_MBUS_FMT_SGRBG12_1X12 = 0x3011,
> -	V4L2_MBUS_FMT_SRGGB12_1X12 = 0x3012,
> +	MEDIA_BUS_TO_V4L2_MBUS(Y8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(UV8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY8_1_5X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY8_1_5X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV8_1_5X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU8_1_5X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY8_2X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY8_2X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV8_2X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU8_2X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(Y10_1X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY10_2X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY10_2X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV10_2X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU10_2X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(Y12_1X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY8_1X16),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY8_1X16),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV8_1X16),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU8_1X16),
> +	MEDIA_BUS_TO_V4L2_MBUS(YDYUYDYV8_1X16),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY10_1X20),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY10_1X20),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV10_1X20),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU10_1X20),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUV10_1X30),
> +	MEDIA_BUS_TO_V4L2_MBUS(AYUV8_1X32),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY12_2X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY12_2X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV12_2X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU12_2X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(UYVY12_1X24),
> +	MEDIA_BUS_TO_V4L2_MBUS(VYUY12_1X24),
> +	MEDIA_BUS_TO_V4L2_MBUS(YUYV12_1X24),
> +	MEDIA_BUS_TO_V4L2_MBUS(YVYU12_1X24),
>  
> -	/* JPEG compressed formats - next is 0x4002 */
> -	V4L2_MBUS_FMT_JPEG_1X8 = 0x4001,
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGBRG8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGRBG8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SRGGB8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_ALAW8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGBRG10_ALAW8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGRBG10_ALAW8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SRGGB10_ALAW8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_DPCM8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGBRG10_DPCM8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGRBG10_DPCM8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SRGGB10_DPCM8_1X8),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_2X8_PADHI_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_2X8_PADHI_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_2X8_PADLO_BE),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_2X8_PADLO_LE),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR10_1X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGBRG10_1X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGRBG10_1X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(SRGGB10_1X10),
> +	MEDIA_BUS_TO_V4L2_MBUS(SBGGR12_1X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGBRG12_1X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(SGRBG12_1X12),
> +	MEDIA_BUS_TO_V4L2_MBUS(SRGGB12_1X12),
>  
> -	/* Vendor specific formats - next is 0x5002 */
> +	MEDIA_BUS_TO_V4L2_MBUS(JPEG_1X8),
>  
> -	/* S5C73M3 sensor specific interleaved UYVY and JPEG */
> -	V4L2_MBUS_FMT_S5C_UYVY_JPEG_1X8 = 0x5001,
> +	MEDIA_BUS_TO_V4L2_MBUS(S5C_UYVY_JPEG_1X8),
>  
> -	/* HSV - next is 0x6002 */
> -	V4L2_MBUS_FMT_AHSV8888_1X32 = 0x6001,
> +	MEDIA_BUS_TO_V4L2_MBUS(AHSV8888_1X32),
>  };
>  
>  /**
> 

* Re: [PATCH 01/15] [media] Move mediabus format definition to a more standard place
From: Hans Verkuil @ 2014-11-04 10:22 UTC (permalink / raw)
  To: Boris Brezillon, Mauro Carvalho Chehab, Hans Verkuil,
	Laurent Pinchart, linux-media
  Cc: linux-arm-kernel, linux-api, devel, linux-kernel, linux-doc,
	Guennadi Liakhovetski
In-Reply-To: <5458A878.3010809@cisco.com>



On 11/04/14 11:20, Hans Verkuil wrote:
> Hi Boris,
> 
> On 11/04/14 10:54, Boris Brezillon wrote:
>> Rename mediabus formats and move the enum into a separate header file so
>> that it can be used by DRM/KMS subsystem without any reference to the V4L2
>> subsystem.
>>
>> Old V4L2_MBUS_FMT_ definitions are now referencing MEDIA_BUS_FMT_ value.
> 
> I missed earlier that v4l2-mediabus.h contained a struct as well, so it can't be
> deprecated and neither can a #warning be added.
> 
> The best approach, I think, is to use a macro in media-bus-format.h
> that will either define just the MEDIA_BUS value when compiled in the kernel, or
> define both MEDIA_BUS and V4L2_MBUS values when compiled for userspace.
> 
> E.g. something like this:
> 
> #ifdef __KERNEL__
> #define MEDIA_BUS_FMT_ENTRY(name, val) MEDIA_BUS_FMT_ ## name = val
> #else
> /* Keep V4L2_MBUS_FMT for backwards compatibility */
> #define MEDIA_BUS_FMT_ENTRY(name, val) \
> 	MEDIA_BUS_FMT_ ## name = val, \
> 	V4L2_MBUS_FMT_ ## name = val
> #endif

And v4l2-mediabus.h needs this as well:

#ifndef __KERNEL__
/* For backwards compatibility */
#define v4l2_mbus_pixelcode media_bus_format
#endif
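
A small sketch of what this alias buys, under stated assumptions: the toy_
and TOY_ names are hypothetical stand-ins, and TOY_KERNEL plays the role of
__KERNEL__. Because the old enum tag becomes a macro for the new one,
userspace code that still declares `enum v4l2_mbus_pixelcode` keeps
compiling against the renamed enum.

```c
#include <assert.h>

enum toy_media_bus_format {
	TOY_MEDIA_BUS_FMT_FIXED = 0x0001,
	TOY_MEDIA_BUS_FMT_JPEG_1X8 = 0x4001,
};

#ifndef TOY_KERNEL
/* For backwards compatibility: the old tag now names the new enum. */
#define toy_v4l2_mbus_pixelcode toy_media_bus_format

/* Both spellings refer to one and the same type. */
static_assert(sizeof(enum toy_v4l2_mbus_pixelcode) ==
	      sizeof(enum toy_media_bus_format), "alias must be the same type");

/* Legacy-style application code that still uses the old tag. */
int toy_is_jpeg(enum toy_v4l2_mbus_pixelcode code)
{
	return code == TOY_MEDIA_BUS_FMT_JPEG_1X8;
}
#endif
```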

Regards,

	Hans

* Re: [PATCH 01/15] [media] Move mediabus format definition to a more standard place
From: Boris Brezillon @ 2014-11-04 10:45 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: devel, linux-doc, linux-api, linux-kernel, Hans Verkuil,
	Laurent Pinchart, Mauro Carvalho Chehab, Guennadi Liakhovetski,
	linux-arm-kernel, linux-media
In-Reply-To: <5458A878.3010809@cisco.com>

Hi Hans,

On Tue, 04 Nov 2014 11:20:40 +0100
Hans Verkuil <hansverk@cisco.com> wrote:

> Hi Boris,
> 
> On 11/04/14 10:54, Boris Brezillon wrote:
> > Rename mediabus formats and move the enum into a separate header file so
> > that it can be used by DRM/KMS subsystem without any reference to the V4L2
> > subsystem.
> > 
> > Old V4L2_MBUS_FMT_ definitions are now referencing MEDIA_BUS_FMT_ value.
> 
> I missed earlier that v4l2-mediabus.h contained a struct as well, so it can't be
> deprecated and neither can a #warning be added.
> 
> The best approach, I think, is to use a macro in media-bus-format.h
> that will either define just the MEDIA_BUS value when compiled in the kernel, or
> define both MEDIA_BUS and V4L2_MBUS values when compiled for userspace.
> 
> E.g. something like this:
> 
> #ifdef __KERNEL__
> #define MEDIA_BUS_FMT_ENTRY(name, val) MEDIA_BUS_FMT_ ## name = val
> #else
> /* Keep V4L2_MBUS_FMT for backwards compatibility */
> #define MEDIA_BUS_FMT_ENTRY(name, val) \
> 	MEDIA_BUS_FMT_ ## name = val, \
> 	V4L2_MBUS_FMT_ ## name = val
> #endif

Okay, but this means we keep adding V4L2_MBUS_FMT_ definitions even for
new formats (which definitely doesn't encourage people to move on).
Moreover, we add a V4L2 prefix in what was supposed to be a subsystem
neutral header.

Anyway, these are just nitpicks, and if you prefer this approach
I'll rework my series :-).

> 
> An alternative approach is to have v4l2-mediabus.h include media-bus-format.h,
> put #ifndef __KERNEL__ around the enum v4l2_mbus_pixelcode and add a big comment
> there that applications should use the defines from media-bus-format.h and that
> this enum is frozen (i.e. new values are only added to media-bus-format.h).
> 
> But I think I like the macro idea best.

As you wish, my only intent is to use those bus format definitions in a
DRM driver :-).

Thanks,

Boris


-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

* Re: [PATCH v3 0/3] perf: User/kernel time correlation and event generation
From: Thomas Gleixner @ 2014-11-04 10:49 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Arnd Bergmann, John Stultz, Andy Lutomirski, Pawel Moll,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Paul Mackerras,
	Arnaldo Carvalho de Melo, Masami Hiramatsu, Christopher Covington,
	Namhyung Kim, David Ahern, Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux API,
	Pawel Moll
In-Reply-To: <20141104082728.GB4253-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>

On Tue, 4 Nov 2014, Richard Cochran wrote:

> On Tue, Nov 04, 2014 at 09:01:31AM +0100, Arnd Bergmann wrote:
> > On Monday 03 November 2014 17:11:53 John Stultz wrote:
> > > I've got some thoughts on what a possible interface that wouldn't be
> > > awful could look like, but I'm still hesitant because I don't really
> > > know if exposing this sort of data is actually a good idea long term.
> >  
> > I was also thinking (while working on an unrelated patch) we could use
> > a system call like
> > 
> > int clock_getoffset(clockid_t clkid, struct timespec *offs);

We might make *offs a timespec64 or u64 :)

> > that returns the current offset between CLOCK_REALTIME and the
> > requested timebase. It is of course racy, but so is every use
> > of CLOCK_REALTIME. We could also use a reference other than
> > CLOCK_REALTIME that might be more stable, but passing two arbitrary
> > clocks as input would make this much more complex to implement.
> 
> No, it is really easy to implement. Just drop the idea of "atomic". It
> really is not necessary or even possible.

If the two clocks have the same underlying hardware then you get an
'atomic' snapshot of their relationship. That's true for any
combination of CLOCK_REALTIME, CLOCK_MONOTONIC, CLOCK_BOOTTIME and
CLOCK_TAI. So we can and should expose these 'atomic' snapshots.

There is another reason why we want to support the notion of 'atomic'
snapshots: There exists hardware which gives you 'atomic' samples of
two different hardware clocks and there is more of that coming soon.

If that's not the case then you need two separate readouts which of
course cannot provide any guarantee, but I agree that we could do
something like what you do in the PTP_SYS_OFFSET ioctl and let user
space analyze the samples. But that should be optional.
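
To make the "two separate readouts" case concrete, here is a userspace
sketch that brackets a read of one clock between two reads of another and
keeps the sample with the tightest window. It is only loosely modelled on
the sampling idea mentioned above and offers no atomicity guarantee; the
toy_ names and the midpoint estimate are assumptions of this sketch, not an
existing kernel interface. Only the standard clock_gettime() call is used.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

struct toy_offset_sample {
	int64_t offset_ns;	/* clock b minus clock a, estimated */
	int64_t window_ns;	/* time spent between the two reads of a */
};

static int64_t toy_ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Read a, b, a up to `tries` times and keep the sample whose bracketing
 * window is smallest. Returns 0 on success, -1 on failure or if tries <= 0.
 */
int toy_sample_offset(clockid_t a, clockid_t b, int tries,
		      struct toy_offset_sample *best)
{
	int found = 0;

	assert(best != NULL);
	for (int i = 0; i < tries; i++) {
		struct timespec a1, bm, a2;

		if (clock_gettime(a, &a1) || clock_gettime(b, &bm) ||
		    clock_gettime(a, &a2))
			return -1;

		int64_t t1 = toy_ts_to_ns(&a1);
		int64_t window = toy_ts_to_ns(&a2) - t1;
		/* Assume b was read halfway through the window. */
		int64_t offset = toy_ts_to_ns(&bm) - (t1 + window / 2);

		if (!found || window < best->window_ns) {
			best->offset_ns = offset;
			best->window_ns = window;
			found = 1;
		}
	}
	return found ? 0 : -1;
}
```

Calling toy_sample_offset(CLOCK_MONOTONIC, CLOCK_REALTIME, 5, &s) gives an
estimate whose uncertainty is bounded by s.window_ns; using the monotonic
clock for the bracketing reads keeps the window non-negative.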

Thanks,

	tglx

* Re: [PATCH 01/15] [media] Move mediabus format definition to a more standard place
From: Hans Verkuil @ 2014-11-04 11:09 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: devel, linux-doc, linux-api, linux-kernel, Hans Verkuil,
	Laurent Pinchart, Mauro Carvalho Chehab, Guennadi Liakhovetski,
	linux-arm-kernel, linux-media
In-Reply-To: <20141104114503.309cb54f@bbrezillon>

Well, I gave two alternatives :-)

Both are fine as far as I am concerned, but it would be nice to hear
what others think.

Regards,

	Hans

On 11/04/14 11:45, Boris Brezillon wrote:
> Hi Hans,
> 
> On Tue, 04 Nov 2014 11:20:40 +0100
> Hans Verkuil <hansverk@cisco.com> wrote:
> 
>> Hi Boris,
>>
>> On 11/04/14 10:54, Boris Brezillon wrote:
>>> Rename mediabus formats and move the enum into a separate header file so
>>> that it can be used by DRM/KMS subsystem without any reference to the V4L2
>>> subsystem.
>>>
>>> Old V4L2_MBUS_FMT_ definitions are now referencing MEDIA_BUS_FMT_ value.
>>
>> I missed earlier that v4l2-mediabus.h contained a struct as well, so it can't be
>> deprecated and neither can a #warning be added.
>>
>> The best approach, I think, is to use a macro in media-bus-format.h
>> that will either define just the MEDIA_BUS value when compiled in the kernel, or
>> define both MEDIA_BUS and V4L2_MBUS values when compiled for userspace.
>>
>> E.g. something like this:
>>
>> #ifdef __KERNEL__
>> #define MEDIA_BUS_FMT_ENTRY(name, val) MEDIA_BUS_FMT_ ## name = val
>> #else
>> /* Keep V4L2_MBUS_FMT for backwards compatibility */
>> #define MEDIA_BUS_FMT_ENTRY(name, val) \
>> 	MEDIA_BUS_FMT_ ## name = val, \
>> 	V4L2_MBUS_FMT_ ## name = val
>> #endif
> 
> Okay, but this means we keep adding V4L2_MBUS_FMT_ definitions even for
> new formats (which definitely doesn't encourage people to move on).
> Moreover, we add a V4L2 prefix in what was supposed to be a subsystem
> neutral header.
> 
> Anyway, these are just nitpicks, and if you prefer this approach
> I'll rework my series :-).
> 
>>
>> An alternative approach is to have v4l2-mediabus.h include media-bus-format.h,
>> put #ifndef __KERNEL__ around the enum v4l2_mbus_pixelcode and add a big comment
>> there that applications should use the defines from media-bus-format.h and that
>> this enum is frozen (i.e. new values are only added to media-bus-format.h).
>>
>> But I think I like the macro idea best.
> 
> As you wish, my only intent is to use those bus format definitions in a
> DRM driver :-).
> 
> Thanks,
> 
> Boris
> 
> 

* Re: [PATCHv2 0/7] CGroup Namespaces
From: Vivek Goyal @ 2014-11-04 13:10 UTC (permalink / raw)
  To: Aditya Kali
  Cc: linux-api-u79uwXL29TY76Z2rM5mHXA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, luto-kltTT9wpgjJwATOyAt5JVQ,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, tj-DgEjT+Ai2ygdnm+yROfE0A,
	cgroups-u79uwXL29TY76Z2rM5mHXA, mingo-H+wXaHxf7aLQT0dZR+AlfA
In-Reply-To: <1414783141-6947-1-git-send-email-adityakali-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

On Fri, Oct 31, 2014 at 12:18:54PM -0700, Aditya Kali wrote:
[..]
>  fs/kernfs/dir.c                  | 194 ++++++++++++++++++++++++++++++++++-----
>  fs/kernfs/mount.c                |  48 ++++++++++
>  fs/proc/namespaces.c             |   1 +
>  include/linux/cgroup.h           |  41 ++++++++-
>  include/linux/cgroup_namespace.h |  36 ++++++++
>  include/linux/kernfs.h           |   5 +
>  include/linux/nsproxy.h          |   2 +
>  include/linux/proc_ns.h          |   4 +
>  include/uapi/linux/sched.h       |   3 +-
>  kernel/Makefile                  |   2 +-
>  kernel/cgroup.c                  | 108 +++++++++++++++++-----
>  kernel/cgroup_namespace.c        | 148 +++++++++++++++++++++++++++++
>  kernel/fork.c                    |   2 +-
>  kernel/nsproxy.c                 |  19 +++-

Hi Aditya,

Can we provide a documentation file for cgroup namespace behavior? Say,
Documentation/namespaces/cgroup-namespace.txt.

Namespaces are complicated and it might be a good idea to keep one .txt
file for each namespace.

Thanks
Vivek

* Re: [PATCHv2 7/7] cgroup: mount cgroupns-root when inside non-init cgroupns
From: Tejun Heo @ 2014-11-04 13:46 UTC (permalink / raw)
  To: Aditya Kali
  Cc: Eric W. Biederman, Li Zefan, Serge Hallyn, Andy Lutomirski,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux API,
	Ingo Molnar, Linux Containers, Rohit Jnagal
In-Reply-To: <CAGr1F2Hd_PS_AscBGMXdZC9qkHGRUp-MeQvJksDOQkRBB3RGoA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Hello, Aditya.

On Mon, Nov 03, 2014 at 02:43:47PM -0800, Aditya Kali wrote:
> I agree that this is effectively bind-mounting, but doing this in kernel
> makes it really convenient for the userspace. The process that sets up the
> container doesn't need to care whether it should bind-mount cgroupfs inside
> the container or not. The tasks inside the container can mount cgroupfs on
> as-needed basis. The root container manager can simply unshare cgroupns and
> forget about the internal setup. I think this is useful just for the reason
> that it makes life much simpler for userspace.

If it's okay to require userland to just do bind mounting, I'd be far
happier with that.  cgroup mount code is already overcomplicated
because of the dynamic matching of supers to mounts when it could just
have told userland to use bind mounting.  Doesn't the host side have
to set up some of the filesystem layouts anyway?  Does it really
matter that we require the host to set up cgroup hierarchy too?

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCHv2 7/7] cgroup: mount cgroupns-root when inside non-init cgroupns
From: Tejun Heo @ 2014-11-04 13:57 UTC (permalink / raw)
  To: Aditya Kali
  Cc: Andy Lutomirski, Li Zefan, Serge Hallyn, Eric W. Biederman,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux API,
	Ingo Molnar, Linux Containers, Rohit Jnagal
In-Reply-To: <CAGr1F2FuPQxLraYv7PstJ9c8H-XQsgawaAtj4AS77B+_0k2o+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Hello, Aditya.

On Mon, Nov 03, 2014 at 03:12:28PM -0800, Aditya Kali wrote:
> I think the sane-behavior flag is only temporary and will be removed
> anyways, right? So I didn't bother asking user to supply it. But I can
> make the change as you suggested. We just have to make sure that tasks
> inside cgroupns cannot mount non-default hierarchies as it would be a
> regression.

I'm not sure whether supporting mounting from inside a ns is even
necessary but, if it is, can't you just test against cgrp_dfl_root?
There's no reason to do anything differently for ns mounting.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCHv2 7/7] cgroup: mount cgroupns-root when inside non-init cgroupns
From: Andy Lutomirski @ 2014-11-04 15:00 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Aditya Kali, Eric W. Biederman, Li Zefan, Serge Hallyn,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux API,
	Ingo Molnar, Linux Containers, Rohit Jnagal
In-Reply-To: <20141104134633.GA14014-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>

On Tue, Nov 4, 2014 at 5:46 AM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> Hello, Aditya.
>
> On Mon, Nov 03, 2014 at 02:43:47PM -0800, Aditya Kali wrote:
>> I agree that this is effectively bind-mounting, but doing this in kernel
>> makes it really convenient for the userspace. The process that sets up the
>> container doesn't need to care whether it should bind-mount cgroupfs inside
>> the container or not. The tasks inside the container can mount cgroupfs on
>> as-needed basis. The root container manager can simply unshare cgroupns and
>> forget about the internal setup. I think this is useful just for the reason
>> that it makes life much simpler for userspace.
>
> If it's okay to require userland to just do bind mounting, I'd be far
> happier with that.  cgroup mount code is already overcomplicated
> because of the dynamic matching of supers to mounts when it could just
> have told userland to use bind mounting.  Doesn't the host side have
> to set up some of the filesystem layouts anyway?  Does it really
> matter that we require the host to set up cgroup hierarchy too?
>

Sort of, but only sort of.

You can create a container by unsharing namespaces, mounting
everything, and then calling pivot_root.  But this is unpleasant
because of the strange way that pid namespaces work -- you generally
have to fork first, so this gets tedious.  And it doesn't integrate
well with things like fstab or other container-side configuration
mechanisms.

It's nicer if you can unshare namespaces, mount the bare minimum,
pivot_root, and let the contained software do as much setup as
possible.
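
The fork requirement mentioned above can be illustrated with a small
userspace sketch. It is only an illustration under my own assumptions:
the function name pidns_needs_fork() is made up, it needs CAP_SYS_ADMIN
to get past unshare, and it deliberately leaves out the mount and
pivot_root steps.

```c
#define _GNU_SOURCE
#include <linux/sched.h>	/* CLONE_NEWPID */
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch set).
 * Returns 1 if a task forked after unshare(CLONE_NEWPID) saw itself as
 * PID 1, 0 if unshare was refused (e.g. no privilege), -1 on other errors.
 */
int pidns_needs_fork(void)
{
	pid_t child = fork();
	int status;

	if (child < 0)
		return -1;
	if (child == 0) {
		pid_t init;

		/* The caller stays in its old PID namespace after this call. */
		if (syscall(SYS_unshare, CLONE_NEWPID) != 0)
			_exit(2);
		/* Only tasks forked from now on live in the new namespace. */
		init = fork();
		if (init < 0)
			_exit(3);
		if (init == 0)
			_exit(getpid() == 1 ? 0 : 1);
		if (waitpid(init, &status, 0) < 0)
			_exit(3);
		_exit(WIFEXITED(status) ? WEXITSTATUS(status) : 3);
	}
	if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	switch (WEXITSTATUS(status)) {
	case 0:
		return 1;
	case 2:
		return 0;
	default:
		return -1;
	}
}
```

The unshare is done in a throwaway child so that the calling process
itself keeps forking into its original PID namespace.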

--Andy

^ permalink raw reply

* Re: [PATCH v3 0/3] perf: User/kernel time correlation and event generation
From: Pawel Moll @ 2014-11-04 15:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: John Stultz, Richard Cochran, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Paul Mackerras, Arnaldo Carvalho de Melo,
	Masami Hiramatsu, Christopher Covington, Namhyung Kim,
	David Ahern, Thomas Gleixner, Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux API
In-Reply-To: <CALCETrUAkXKyXzZy4xaYcW2f65Lh=APrU4cFU1zm-qmc6EwB8g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tue, 2014-11-04 at 01:25 +0000, Andy Lutomirski wrote:
> >> If you're going to add double-stamped packets, can you also add a
> >> syscall to read multiple clocks at once, atomically?  Or can you
> >> otherwise add a non-perf mechanism to get at this data?
> >
> > I've got some thoughts on what a possible interface that wouldn't be
> > awful could look like, but I'm still hesitant because I don't really
> > know if exposing this sort of data is actually a good idea long term.
> 
> My only real thought here is that, if perf is going to try to do this,
> then presumably it should be reasonably integrated w/ the core timing
> code.  I.e. if perf does this, then presumably the core code should
> know about it and there should be a core interface to it.

I think I understand where you're coming from. Arnd's idea for the API
seems reasonable, although I can't promise to implement a proposal
(don't let me stop you from doing it :-).

As to the perf-specific correlation, I'm assuming limited accuracy.
Others already mentioned that in the absence of hardware support, the
time values are never really "atomic". The best that can be done is to
access them as near to each other in the code as possible and make sure
it happens in a non-preemptible section. In my tests I've achieved, on
average, sub-microsecond accuracy, which was good enough from my
perspective, but it's far from the ideal 42ns resolution for my (just an
example) time source clocked at 24MHz.

Paweł

^ permalink raw reply

* Re: [PATCH v3 1/3] perf: Use monotonic clock as a source for timestamps
From: Pawel Moll @ 2014-11-04 15:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Richard Cochran, Steven Rostedt, Ingo Molnar, Paul Mackerras,
	Arnaldo Carvalho de Melo, John Stultz, Masami Hiramatsu,
	Christopher Covington, Namhyung Kim, David Ahern, Thomas Gleixner,
	Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20141104072308.GE10501-IIpfhp3q70z/8w/KjCw3T+5/BudmfyzbbVWyRVo5IupeoWH0uzbU5w@public.gmane.org>

On Tue, 2014-11-04 at 07:23 +0000, Peter Zijlstra wrote:
> On Tue, Nov 04, 2014 at 12:28:36AM +0000, Pawel Moll wrote:
> 
> > +int sysctl_perf_sample_time_clk_id = CLOCK_MONOTONIC;
> 
> const ?

Sure (unless we have to change it as mentioned below)

> >  /*
> >   * perf samples are done in some very critical code paths (NMIs).
> >   * If they take too much CPU time, the system can lock up and not
> > @@ -324,7 +326,7 @@ extern __weak const char *perf_pmu_name(void)
> >  
> >  static inline u64 perf_clock(void)
> >  {
> > -	return local_clock();
> > +	return ktime_get_mono_fast_ns();
> >  }
> 
> Do we maybe want to make it boot-time switchable back to local_clock for
> people with bad systems and or backwards compat issues?

Very good idea, should have come up with it myself :-)

Does __setup("perf_use_local_clock") sound reasonable? Then we have to
decide whether to hide the sysctl "perf_sample_time_clk_id" (my
preferred option, will see how difficult it is) or just provide an
invalid clock_id (eg. -1) in it.

Cheers!

Pawel

^ permalink raw reply

* Re: [PATCH v3 1/3] perf: Use monotonic clock as a source for timestamps
From: Peter Zijlstra @ 2014-11-04 15:30 UTC (permalink / raw)
  To: Pawel Moll
  Cc: Richard Cochran, Steven Rostedt, Ingo Molnar, Paul Mackerras,
	Arnaldo Carvalho de Melo, John Stultz, Masami Hiramatsu,
	Christopher Covington, Namhyung Kim, David Ahern, Thomas Gleixner,
	Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1415114727.24819.8.camel-5wv7dgnIgG8@public.gmane.org>

On Tue, Nov 04, 2014 at 03:25:27PM +0000, Pawel Moll wrote:
> Very good idea, should have come up with it myself :-)
> 
> Does __setup("perf_use_local_clock") sound reasonable?

Would not the 'module' already prefix a perf.? I never quite know how
all that works out. But sure, that works.

> Then we have to
> decide whether to hide the sysctl "perf_sample_time_clk_id" (my
> preferred option, will see how difficult it is) or just provide an
> invalid clock_id (eg. -1) in it.

invalid clock id is fine with me, although I suppose the clock people
might have an oh-pinion ;-)

^ permalink raw reply

* [PATCH] kernel, add panic_on_warn
From: Prarit Bhargava @ 2014-11-04 15:41 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Prarit Bhargava, Andi Kleen, Jonathan Corbet,
	kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Rusty Russell,
	linux-doc-u79uwXL29TY76Z2rM5mHXA, jbaron-JqFfY2XvxFXQT0dZR+AlfA,
	Fabian Frederick, isimatu.yasuaki-+CUm20s59erQFUHtdCDX3A,
	H. Peter Anvin, Masami Hiramatsu, Andrew Morton,
	linux-api-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

There have been several times where I have had to rebuild a kernel to
cause a panic when hitting a WARN() in the code in order to get a crash
dump from a system.  Sometimes this is easy to do, other times (such as
in the case of a remote admin) it is not trivial to send new images to the
user.

A much easier method would be a switch to change the WARN() over to a
panic.  This makes debugging easier in that I can now test the actual
image the WARN() was seen on and I do not have to engage in remote
debugging.

This patch adds a panic_on_warn kernel parameter and a
/proc/sys/kernel/panic_on_warn sysctl; when either is set, panic() is called
in the warn_slowpath_common() path.  The function will still print out the
location of the warning.
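
For the runtime case, a minimal userspace sketch of flipping the sysctl
could look like the following. The helper name write_int_sysctl() is an
assumption of mine and not part of this patch; only the
/proc/sys/kernel/panic_on_warn path comes from the patch itself.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: write an integer to a sysctl-style file.
 * Returns 0 on success or a negative errno value on failure.
 */
int write_int_sysctl(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;
	/* fclose() flushes the buffer, so a rejected write may only show up here. */
	if (fclose(f) != 0 && err == 0)
		err = -errno;
	return err;
}
```

Run as root, write_int_sysctl("/proc/sys/kernel/panic_on_warn", 1) would
be the equivalent of booting with the parameter set.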

An example of the panic_on_warn output:

The first line below comes from the WARN_ON() and shows its location.
After that the panic() output is displayed.

WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
Kernel panic - not syncing: panic_on_warn set ...

CPU: 30 PID: 11698 Comm: insmod Tainted: G        W  OE  3.17.0+ #57
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
 0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
 0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
 ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
Call Trace:
 [<ffffffff81665190>] dump_stack+0x46/0x58
 [<ffffffff8165e2ec>] panic+0xd0/0x204
 [<ffffffffa038e05f>] ? init_dummy+0x1f/0x30 [dummy_module]
 [<ffffffff81076b90>] warn_slowpath_common+0xd0/0xd0
 [<ffffffffa038e040>] ? dummy_greetings+0x40/0x40 [dummy_module]
 [<ffffffff81076c8a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa038e05f>] init_dummy+0x1f/0x30 [dummy_module]
 [<ffffffff81002144>] do_one_initcall+0xd4/0x210
 [<ffffffff811b52c2>] ? __vunmap+0xc2/0x110
 [<ffffffff810f8889>] load_module+0x16a9/0x1b30
 [<ffffffff810f3d30>] ? store_uevent+0x70/0x70
 [<ffffffff810f49b9>] ? copy_module_from_fd.isra.44+0x129/0x180
 [<ffffffff810f8ec6>] SyS_finit_module+0xa6/0xd0
 [<ffffffff8166cf29>] system_call_fastpath+0x12/0x17

Successfully tested by me.

Cc: Jonathan Corbet <corbet-T1hC0tSOHrs@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
Cc: "H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
Cc: Andi Kleen <ak-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt-FCd8Q96Dh0JBDgjK7y7TUQ@public.gmane.org>
Cc: Fabian Frederick <fabf-AgBVmzD5pcezQB+pC5nmwQ@public.gmane.org>
Cc: vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: isimatu.yasuaki-+CUm20s59erQFUHtdCDX3A@public.gmane.org
Cc: jbaron-JqFfY2XvxFXQT0dZR+AlfA@public.gmane.org
Cc: linux-doc-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: kexec-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Prarit Bhargava <prarit-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

[v2]: add /proc/sys/kernel/panic_on_warn, additional documentation, modify
      !slowpath cases
[v3]: use proc_dointvec_minmax() in sysctl handler
[v4]: remove !slowpath cases, and add __read_mostly
[v5]: change to panic_on_warn, re-alphabetize Documentation/sysctl/kernel.txt
[v6]: disable on kdump kernel to avoid bogus panics.
[v7]: switch to core param, and remove change from v6
---
 Documentation/kdump/kdump.txt       |    7 ++++++
 Documentation/kernel-parameters.txt |    3 +++
 Documentation/sysctl/kernel.txt     |   40 +++++++++++++++++++++++------------
 include/linux/kernel.h              |    1 +
 include/uapi/linux/sysctl.h         |    1 +
 kernel/panic.c                      |   15 ++++++++++++-
 kernel/sysctl.c                     |    9 ++++++++
 kernel/sysctl_binary.c              |    1 +
 8 files changed, 62 insertions(+), 15 deletions(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index 6c0b9f2..bc4bd5a 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -471,6 +471,13 @@ format. Crash is available on Dave Anderson's site at the following URL:
 
    http://people.redhat.com/~anderson/
 
+Trigger Kdump on WARN()
+=======================
+
+The kernel parameter, panic_on_warn, calls panic() in all WARN() paths.  This
+will cause a kdump to occur at the panic() call.  In cases where a user wants
+to specify this during runtime, /proc/sys/kernel/panic_on_warn can be set to 1
+to achieve the same behaviour.
 
 Contact
 =======
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 4c81a86..ea5d57c 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2509,6 +2509,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			timeout < 0: reboot immediately
 			Format: <timeout>
 
+	panic_on_warn	panic() instead of WARN().  Useful to cause kdump
+			on a WARN().
+
 	crash_kexec_post_notifiers
 			Run kdump after running panic-notifiers and dumping
 			kmsg. This only for the users who doubt kdump always
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 57baff5..b5d0c85 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -54,8 +54,9 @@ show up in /proc/sys/kernel:
 - overflowuid
 - panic
 - panic_on_oops
-- panic_on_unrecovered_nmi
 - panic_on_stackoverflow
+- panic_on_unrecovered_nmi
+- panic_on_warn
 - pid_max
 - powersave-nap               [ PPC only ]
 - printk
@@ -527,19 +528,6 @@ the recommended setting is 60.
 
 ==============================================================
 
-panic_on_unrecovered_nmi:
-
-The default Linux behaviour on an NMI of either memory or unknown is
-to continue operation. For many environments such as scientific
-computing it is preferable that the box is taken out and the error
-dealt with than an uncorrected parity/ECC error get propagated.
-
-A small number of systems do generate NMI's for bizarre random reasons
-such as power management so the default is off. That sysctl works like
-the existing panic controls already in that directory.
-
-==============================================================
-
 panic_on_oops:
 
 Controls the kernel's behaviour when an oops or BUG is encountered.
@@ -563,6 +551,30 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
 
 ==============================================================
 
+panic_on_unrecovered_nmi:
+
+The default Linux behaviour on an NMI of either memory or unknown is
+to continue operation. For many environments such as scientific
+computing it is preferable that the box is taken out and the error
+dealt with than an uncorrected parity/ECC error get propagated.
+
+A small number of systems do generate NMI's for bizarre random reasons
+such as power management so the default is off. That sysctl works like
+the existing panic controls already in that directory.
+
+==============================================================
+
+panic_on_warn:
+
+Calls panic() in the WARN() path when set to 1.  This is useful to avoid
+a kernel rebuild when attempting to kdump at the location of a WARN().
+
+0: only WARN(), default behaviour.
+
+1: call panic() after printing out WARN() location.
+
+==============================================================
+
 perf_cpu_time_max_percent:
 
 Hints to the kernel how much CPU time it should be allowed to
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 3d770f55..d60d31d 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -422,6 +422,7 @@ extern int panic_timeout;
 extern int panic_on_oops;
 extern int panic_on_unrecovered_nmi;
 extern int panic_on_io_nmi;
+extern int panic_on_warn;
 extern int sysctl_panic_on_stackoverflow;
 /*
  * Only to be used by arch init code. If the user over-wrote the default
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 43aaba1..0956373 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -153,6 +153,7 @@ enum
 	KERN_MAX_LOCK_DEPTH=74, /* int: rtmutex's maximum lock depth */
 	KERN_NMI_WATCHDOG=75, /* int: enable/disable nmi watchdog */
 	KERN_PANIC_ON_NMI=76, /* int: whether we will panic on an unrecovered */
+	KERN_PANIC_ON_WARN=77, /* int: call panic() in WARN() functions */
 };
 
 
diff --git a/kernel/panic.c b/kernel/panic.c
index d09dc5c..db37c35 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -23,6 +23,7 @@
 #include <linux/sysrq.h>
 #include <linux/init.h>
 #include <linux/nmi.h>
+#include <linux/crash_dump.h>
 
 #define PANIC_TIMER_STEP 100
 #define PANIC_BLINK_SPD 18
@@ -33,6 +34,7 @@ static int pause_on_oops;
 static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
 static bool crash_kexec_post_notifiers;
+int panic_on_warn __read_mostly;
 
 int panic_timeout = CONFIG_PANIC_TIMEOUT;
 EXPORT_SYMBOL_GPL(panic_timeout);
@@ -420,13 +422,23 @@ static void warn_slowpath_common(const char *file, int line, void *caller,
 {
 	disable_trace_on_warning();
 
-	pr_warn("------------[ cut here ]------------\n");
+	if (!panic_on_warn)
+		pr_warn("------------[ cut here ]------------\n");
 	pr_warn("WARNING: CPU: %d PID: %d at %s:%d %pS()\n",
 		raw_smp_processor_id(), current->pid, file, line, caller);
 
 	if (args)
 		vprintk(args->fmt, args->args);
 
+	if (panic_on_warn) {
+		/*
+		 * A flood of WARN()s may occur.  Prevent further WARN()s
+		 * from panicking the system.
+		 */
+		panic_on_warn = 0;
+		panic("panic_on_warn set ...\n");
+	}
+
 	print_modules();
 	dump_stack();
 	print_oops_end_marker();
@@ -484,6 +496,7 @@ EXPORT_SYMBOL(__stack_chk_fail);
 
 core_param(panic, panic_timeout, int, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
+core_param(panic_on_warn, panic_on_warn, int, 0644);
 
 static int __init setup_crash_kexec_post_notifiers(char *s)
 {
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 15f2511..7c54ff7 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1104,6 +1104,15 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
+	{
+		.procname	= "panic_on_warn",
+		.data		= &panic_on_warn,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
 	{ }
 };
 
diff --git a/kernel/sysctl_binary.c b/kernel/sysctl_binary.c
index 9a4f750..7e7746a 100644
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -137,6 +137,7 @@ static const struct bin_table bin_kern_table[] = {
 	{ CTL_INT,	KERN_COMPAT_LOG,		"compat-log" },
 	{ CTL_INT,	KERN_MAX_LOCK_DEPTH,		"max_lock_depth" },
 	{ CTL_INT,	KERN_PANIC_ON_NMI,		"panic_on_unrecovered_nmi" },
+	{ CTL_INT,	KERN_PANIC_ON_WARN,		"panic_on_warn" },
 	{}
 };
 
-- 
1.7.9.3

^ permalink raw reply related

* Re: [PATCHv2 7/7] cgroup: mount cgroupns-root when inside non-init cgroupns
From: Serge E. Hallyn @ 2014-11-04 15:50 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Linux API, Linux Containers, Serge Hallyn,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Ingo Molnar,
	Eric W. Biederman, Tejun Heo, cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CALCETrUggQCJyxsTWRNrjt3GM=R0VMU6RjMkU1aw3YUNMx1xEw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Quoting Andy Lutomirski (luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org):
> On Tue, Nov 4, 2014 at 5:46 AM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> > Hello, Aditya.
> >
> > On Mon, Nov 03, 2014 at 02:43:47PM -0800, Aditya Kali wrote:
> >> I agree that this is effectively bind-mounting, but doing this in kernel
> >> makes it really convenient for the userspace. The process that sets up the
> >> container doesn't need to care whether it should bind-mount cgroupfs inside
> >> the container or not. The tasks inside the container can mount cgroupfs on
> >> as-needed basis. The root container manager can simply unshare cgroupns and
> >> forget about the internal setup. I think this is useful just for the reason
> >> that it makes life much simpler for userspace.
> >
> > If it's okay to require userland to just do bind mounting, I'd be far
> > happier with that.  cgroup mount code is already overcomplicated
> > because of the dynamic matching of supers to mounts when it could just
> > have told userland to use bind mounting.  Doesn't the host side have
> > to set up some of the filesystem layouts anyway?  Does it really
> > matter that we require the host to set up cgroup hierarchy too?
> >
> 
> Sort of, but only sort of.
> 
> You can create a container by unsharing namespaces, mounting
> everything, and then calling pivot_root.  But this is unpleasant
> because of the strange way that pid namespaces work -- you generally
> have to fork first, so this gets tedious.  And it doesn't integrate
> well with things like fstab or other container-side configuration
> mechanisms.
> 
> It's nicer if you can unshare namespaces, mount the bare minimum,
> pivot_root, and let the contained software do as much setup as
> possible.

Also, the bind-mount requires the container manager to know where
the guest distro will want the cgroups mounted.

-serge

^ permalink raw reply

* Re: [PATCH v3 0/3] perf: User/kernel time correlation and event generation
From: Pawel Moll @ 2014-11-04 15:51 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Richard Cochran, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
	Paul Mackerras, Arnaldo Carvalho de Melo, John Stultz,
	Christopher Covington, Namhyung Kim, David Ahern, Thomas Gleixner,
	Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <54589B58.7080102-FCd8Q96Dh0JBDgjK7y7TUQ@public.gmane.org>

On Tue, 2014-11-04 at 09:24 +0000, Masami Hiramatsu wrote:
> What I'd like to do is the binary version of ftrace-marker, the text
> version is already supported by qemu (see below).
> https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg00505.html
> 
> But since that is just a string data (not structured data), it is hard to
> analyze via perf-script or some other useful filters/triggers in ftrace.
> 
> In my idea, the new event will be defined via a special file in debugfs like
> kprobe-events, like below.
> 
>   # cd $debugfs/tracing
>   # echo "newgrp/newevent signarg:s32 flag:u64" >> marker_events
>   # cat events/newgrp/newevent/format
>   name: newevent
>   ID: 2048
>   format:
>         field:unsigned short common_type;       offset:0;       size:2; signed:0;
>         field:unsigned char common_flags;       offset:2;       size:1; signed:0;
>         field:unsigned char common_preempt_count;       offset:3;       size:1;signed:0;
>         field:int common_pid;   offset:4;       size:4; signed:1;
> 
>         field:s32 signarg;      offset:8;      size:4; signed:1;
>         field:u64 flag; offset:12;      size:8; signed:0;
> 
>   print fmt: "signarg=%d flag=0x%Lx", REC->signarg, REC->flag
> 
> Then, users will write the data (excluded common fields) when the event happens
> via trace_marker which start with '\0'ID(in u32). Kernel just checks the ID and
> its data size, but doesn't parse, filter/trigger it and log it into the kernel buffer.

Very neat, I like it! Certainly useful with scripting. Any gut feeling
regarding the kernel version it will be ready for? 3.19 or later than
that?
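
To make the quoted write format a bit more concrete, here is a rough
sketch of how userspace might pack one record for the example
newgrp/newevent event before writing it to trace_marker. The exact wire
layout (no padding, native byte order) and the name pack_marker() are
my assumptions; the proposal above only says the data starts with '\0'
and the ID as u32, followed by the non-common fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical packing of one binary marker record:
 * '\0', event ID (u32), then signarg (s32) and flag (u64).
 * Returns the number of bytes used, or 0 if buf is too small.
 */
size_t pack_marker(uint8_t *buf, size_t len, uint32_t id,
		   int32_t signarg, uint64_t flag)
{
	size_t need = 1 + sizeof(id) + sizeof(signarg) + sizeof(flag);
	size_t off = 0;

	if (len < need)
		return 0;

	buf[off++] = '\0';		/* marks the record as binary */
	memcpy(buf + off, &id, sizeof(id));
	off += sizeof(id);
	memcpy(buf + off, &signarg, sizeof(signarg));
	off += sizeof(signarg);
	memcpy(buf + off, &flag, sizeof(flag));
	off += sizeof(flag);

	return off;
}
```

The resulting buffer would then go out in a single write() so the
kernel can check the ID and size in one go.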

> Of course, this has a downside that the user must have a privilege to access to debugfs.
> Thus maybe we need both of prctl() IF for perf and this IF for ftrace.

I don't have any particularly strong feelings about the solution as long
as I'm able to create this "synchronisation point" of mine in the perf
data. In one of this patch's previous incarnations I was also doing a
write() to the perf fd to achieve pretty much the same result.

In my personal use case root access to debugfs isn't a problem (I need
it for other ftrace operations anyway). However Ingo and some other guys
seemed interested in the prctl() approach because: 1. it's much simpler
to use, even compared with trace_marker's simple
open(path)/write()/close(), and 2. any process can do it at any time and
the results are quietly discarded if no one is listening. I also remember
that when I proposed a sort of "unification" between trace_marker and the
uevents, Ingo straight away "suggested" keeping it separate.

Pawel

^ permalink raw reply

* Re: [PATCH v3 0/3] perf: User/kernel time correlation and event generation
From: John Stultz @ 2014-11-04 16:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Andy Lutomirski, Pawel Moll, Richard Cochran, Steven Rostedt,
	Ingo Molnar, Peter Zijlstra, Paul Mackerras,
	Arnaldo Carvalho de Melo, Masami Hiramatsu, Christopher Covington,
	Namhyung Kim, David Ahern, Thomas Gleixner, Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Linux API,
	Pawel Moll
In-Reply-To: <3430954.VNaFmamXmP@wuerfel>

On Tue, Nov 4, 2014 at 12:01 AM, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote:
> On Monday 03 November 2014 17:11:53 John Stultz wrote:
>> I've got some thoughts on what a possible interface that wouldn't be
>> awful could look like, but I'm still hesitant because I don't really
>> know if exposing this sort of data is actually a good idea long term.
>
> I was also thinking (while working on an unrelated patch) we could use
> a system call like
>
> int clock_getoffset(clockid_t clkid, struct timespec *offs);
>
> that returns the current offset between CLOCK_REALTIME and the
> requested timebase. It is of course racy, but so is every use
> of CLOCK_REALTIME. We could also use a reference other than
> CLOCK_REALTIME that might be more stable, but passing two arbitrary
> clocks as input would make this much more complex to implement.

Yea, this is too racy for me, at least for it to be useful. You get an
offset, but you don't get any sense of what it was actually valid for.

I think to be at all useful, you'll have to return both a timestamp
for a given clockid, and an offset to the second clockid. That way you
can generate a valid point in time on two clocks (as best as possible,
given possible non-atomic reads of separately backed clockids).

But again, I'm not totally sure exposing this provides that much value
over userspace reading the two clocks itself (in ABA fashion) to sort
this out.
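
For reference, the userspace ABA read could be sketched roughly as
below. The struct and function names are made up for illustration; the
idea is just to bracket one read of the second clock with two reads of
the first, and report both a timestamp and an offset together with the
width of the window they are valid for.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical result type: a point on clock A plus the offset to clock B. */
struct clock_pair {
	int64_t a_ns;		/* midpoint of the two reads of clock A */
	int64_t offset_ns;	/* clock B minus clock A at that midpoint */
	int64_t window_ns;	/* time between the two reads of A (error bound) */
};

static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Read A, B, A again. Returns 0 on success, -1 if any clock read fails. */
int read_clock_pair(clockid_t a, clockid_t b, struct clock_pair *out)
{
	struct timespec a1, bm, a2;

	if (clock_gettime(a, &a1) || clock_gettime(b, &bm) ||
	    clock_gettime(a, &a2))
		return -1;

	out->window_ns = ts_to_ns(&a2) - ts_to_ns(&a1);
	out->a_ns = ts_to_ns(&a1) + out->window_ns / 2;
	out->offset_ns = ts_to_ns(&bm) - out->a_ns;
	return 0;
}
```

A caller that cares about accuracy would likely repeat the read a few
times and keep the sample with the smallest window_ns.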

And I also don't see it as particularly related to this perf extension
that Pawel is doing (since we are trying to avoid making the perf
clock a directly accessible clockid).

thanks
-john

^ permalink raw reply

* Re: [PATCH v3 0/3] perf: User/kernel time correlation and event generation
From: Arnd Bergmann @ 2014-11-04 16:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Richard Cochran, John Stultz, Andy Lutomirski, Pawel Moll,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, Paul Mackerras,
	Arnaldo Carvalho de Melo, Masami Hiramatsu, Christopher Covington,
	Namhyung Kim, David Ahern, Tomeu Vizoso,
	linux-kernel@vger.kernel.org, Linux API, Pawel Moll
In-Reply-To: <alpine.DEB.2.11.1411041111010.5308@nanos>

On Tuesday 04 November 2014 11:49:04 Thomas Gleixner wrote:
> On Tue, 4 Nov 2014, Richard Cochran wrote:
> 
> > On Tue, Nov 04, 2014 at 09:01:31AM +0100, Arnd Bergmann wrote:
> > > On Monday 03 November 2014 17:11:53 John Stultz wrote:
> > > > I've got some thoughts on what a possible interface that wouldn't be
> > > > awful could look like, but I'm still hesitant because I don't really
> > > > know if exposing this sort of data is actually a good idea long term.
> > >  
> > > I was also thinking (while working on an unrelated patch) we could use
> > > a system call like
> > > 
> > > int clock_getoffset(clockid_t clkid, struct timespec *offs);
> 
> We might make *offs a timespec64 or u64 

I don't think we are ready yet to introduce timespec64 in the uapi
headers, this needs some more careful planning. Otherwise I agree
it's bad to introduce syscalls that we already know will become
obsolete soon.

	Arnd

^ permalink raw reply

* Re: [PATCH v3 2/3] perf: Userspace event
From: Pawel Moll @ 2014-11-04 16:42 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Richard Cochran, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
	Paul Mackerras, Arnaldo Carvalho de Melo, John Stultz,
	Masami Hiramatsu, Christopher Covington, David Ahern,
	Thomas Gleixner, Tomeu Vizoso,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <87ppd35vbk.fsf-vfBCOVm4yAnB69T4xOojN9BPR1lH4CV8@public.gmane.org>

On Tue, 2014-11-04 at 06:33 +0000, Namhyung Kim wrote:
> Hi Pawel,
> 
> On Tue,  4 Nov 2014 00:28:37 +0000, Pawel Moll wrote:
> > +	/*
> > +	 * Data in userspace event record is transparent for the kernel
> > +	 *
> > +	 * Userspace perf tool code maintains a list of known types with
> > +	 * reference implementations of parsers for the data field.
> > +	 *
> > +	 * Overall size of the record (including type and size fields)
> > +	 * is always aligned to 8 bytes by adding padding after the data.
> > +	 *
> > +	 * struct {
> > +	 *	struct perf_event_header	header;
> > +	 *	u32				type;
> > +	 *	u32				size;
> 
> The struct perf_event_header also has 'size' field and it has the entire
> length of the record so it's redundant. 

Well, is it? Correct me if I'm wrong, but as far as I remember the
record size must always be aligned to 8 bytes. Thus you can't reliably
derive the data size from it - if my data is 3 bytes long, I have to
add 5 bytes of padding thus making the header.size = 24 (I'm ignoring
sample_id here), right? So now, decoding the record, all I can do is:
header.size - sizeof(header) - sizeof(type) - sizeof(size) = 24 - 8 - 8
= 8. So, based on the header.size the data is 8 bytes long. But only the
first 3 bytes are valid...
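
The arithmetic above can be written down as a small sketch. The helper
names are hypothetical, and sample_id is ignored just as in the text;
the 8-byte header size matches struct perf_event_header (u32 type,
u16 misc, u16 size).

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_SIZE 8u	/* sizeof(struct perf_event_header) */

/* Round n up to the next multiple of 8. */
static size_t align8(size_t n)
{
	return (n + 7u) & ~(size_t)7u;
}

/* Total record size (header.size) for a payload of data_len bytes. */
size_t uevent_record_size(size_t data_len)
{
	return align8(HDR_SIZE + sizeof(uint32_t)	/* type */
		      + sizeof(uint32_t)		/* size */
		      + data_len);
}

/* What a reader can infer from header.size alone: payload plus padding. */
size_t uevent_data_capacity(size_t header_size)
{
	return header_size - HDR_SIZE - 2 * sizeof(uint32_t);
}
```

With a 3-byte payload, uevent_record_size(3) gives 24 and
uevent_data_capacity(24) gives 8, so the 5 padding bytes cannot be told
apart from data without the extra size field.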

To summarize, there are three options:

1. I'm wrong and the record doesn't have to be padded to make it 8 bytes
aligned. Then I can drop the additional size field.

2. I could impose a limitation on the prctl API that the data size must
be 8 bytes aligned. Bad idea in my opinion, I'd rather not.

3. The additional size (for the data part) field stays. Notice that
PERF_SAMPLE_RAW has it as well :-)

>  Also there's 'misc' field in the
> perf_event_header and I guess it can be used as 'type' info as it's
> mostly for cpumode and we are in user mode by definition.

Hm. First of all, I don't really like the idea of "overloading" the misc
meaning. It's a set of flags and I'd rather see it staying like this.

Secondly, I'm not sure that it can be reused - we are in user mode,
true, but it can be either PERF_RECORD_MISC_USER or
PERF_RECORD_MISC_GUEST_USER.

Thirdly, misc is "only" 16 bits wide, and someone even asked for the
type to be 64 bits wide! (I suspect he wanted to use it in some special,
hacky way, though :-) A 32-bit width seems like a reasonable choice.

Do you feel that the "unnecessary" type field is a big problem?

Thanks for your time!

Pawel

^ permalink raw reply

* [PATCH 00/20] kselftest install target feature
From: Shuah Khan @ 2014-11-04 17:10 UTC (permalink / raw)
  To: gregkh, akpm, mmarek, davem, keescook, tranmanphong, dh.herrmann,
	hughd, bobby.prani, ebiederm, serge.hallyn
  Cc: Shuah Khan, linux-kbuild, linux-kernel, linux-api, netdev

This patch series adds a new kselftest_install make target
to enable selftest install. When make kselftest_install is
run, selftests are installed on the system. A new install
target is added to selftests Makefile which will install
targets for the tests that are specified in INSTALL_TARGETS.
During install, a script is generated to run tests that are
installed. This script will be installed in the selftest install
directory. Individual test Makefiles are changed to add to the
script. This will allow new tests to add install and run test
commands to the generated kselftest script.

Approach:

make kselftest_install:
-- exports kselftest INSTALL_KSFT_PATH
   default $(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)
-- exports path for kselftest.sh
-- runs selftests make install target

selftests make install target:
-- creates kselftest.sh script in install dir
-- runs install targets for all INSTALL_TARGETS

Individual test make install targets:
-- install test programs and/or scripts in install dir
-- append to the kselftest.sh file to add commands to run test
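
The three steps above can be sketched in plain shell. This is only an
illustration of the flow, not the Makefile code from the series: the
temporary directory stands in for INSTALL_KSFT_PATH, and install_test and
the demo_a/demo_b tests are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-ins for the variables the top-level Makefile exports.
INSTALL_KSFT_PATH=$(mktemp -d)
KSELFTEST=$INSTALL_KSFT_PATH/kselftest.sh

# Step 1 (selftests install target): create the runner script.
printf '#!/bin/sh\n# Generated during kselftest_install\n' > "$KSELFTEST"

# Step 2 (per-test install target): install the test into the install
# dir, then append the command that runs it to the runner script.
# install_test NAME BODY is an illustrative helper, not part of the series.
install_test() {
	printf '#!/bin/sh\n%s\n' "$2" > "$INSTALL_KSFT_PATH/$1.sh"
	chmod +x "$INSTALL_KSFT_PATH/$1.sh"
	echo "/bin/sh ./$1.sh" >> "$KSELFTEST"
}

install_test demo_a 'echo "demo_a: ok"'
install_test demo_b 'echo "demo_b: ok"'

# Step 3 (kselftest_install): make the runner executable.
chmod +x "$KSELFTEST"

# Run the generated script from the install directory.
(cd "$INSTALL_KSFT_PATH" && /bin/sh ./kselftest.sh)
```

The sketch writes the script header with printf rather than echo "...\n",
because whether echo interprets backslash escapes differs between shells.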

Shuah Khan (20):
  selftests/user: move test out of Makefile into a shell script
  selftests/net: move test out of Makefile into a shell script
  kbuild: kselftest_install - add a new make target to install selftests
  selftests: add install target to enable installing selftests
  selftests/breakpoints: add install target to enable installing test
  selftests/cpu-hotplug: add install target to enable installing test
  selftests/efivarfs: add install target to enable installing test
  selftests/firmware: add install target to enable installing test
  selftests/ipc: add install target to enable installing test
  selftests/kcmp: add install target to enable installing test
  selftests/memfd: add install target to enable installing test
  selftests/memory-hotplug: add install target to enable installing test
  selftests/mount: add install target to enable installing test
  selftests/mqueue: add install target to enable installing test
  selftests/net: add install target to enable installing test
  selftests/ptrace: add install target to enable installing test
  selftests/sysctl: add install target to enable installing test
  selftests/timers: add install target to enable installing test
  selftests/vm: add install target to enable installing test
  selftests/user: add install target to enable installing test

 Makefile                                        | 17 +++++++++++++++++
 tools/testing/selftests/Makefile                | 14 ++++++++++++++
 tools/testing/selftests/breakpoints/Makefile    | 12 ++++++++++++
 tools/testing/selftests/cpu-hotplug/Makefile    |  9 +++++++++
 tools/testing/selftests/efivarfs/Makefile       | 13 ++++++++++++-
 tools/testing/selftests/firmware/Makefile       | 20 ++++++++++++++++++++
 tools/testing/selftests/ipc/Makefile            | 11 +++++++++++
 tools/testing/selftests/kcmp/Makefile           | 12 ++++++++++++
 tools/testing/selftests/memfd/Makefile          | 10 ++++++++++
 tools/testing/selftests/memory-hotplug/Makefile |  9 +++++++++
 tools/testing/selftests/mount/Makefile          |  7 +++++++
 tools/testing/selftests/mqueue/Makefile         |  8 ++++++++
 tools/testing/selftests/net/Makefile            | 18 +++++++++++-------
 tools/testing/selftests/net/test_bpf.sh         | 10 ++++++++++
 tools/testing/selftests/ptrace/Makefile         | 11 +++++++++--
 tools/testing/selftests/sysctl/Makefile         | 10 ++++++++++
 tools/testing/selftests/timers/Makefile         |  7 +++++++
 tools/testing/selftests/user/Makefile           | 15 ++++++++-------
 tools/testing/selftests/user/test_user_copy.sh  | 10 ++++++++++
 tools/testing/selftests/vm/Makefile             |  7 +++++++
 20 files changed, 213 insertions(+), 17 deletions(-)
 create mode 100755 tools/testing/selftests/net/test_bpf.sh
 create mode 100755 tools/testing/selftests/user/test_user_copy.sh

-- 
1.9.1

^ permalink raw reply

* [PATCH 01/20] selftests/user: move test out of Makefile into a shell script
From: Shuah Khan @ 2014-11-04 17:10 UTC (permalink / raw)
  To: gregkh, akpm, mmarek, davem, keescook, tranmanphong, dh.herrmann,
	hughd, bobby.prani, ebiederm, serge.hallyn
  Cc: Shuah Khan, linux-kbuild, linux-kernel, linux-api, netdev
In-Reply-To: <cover.1415117102.git.shuahkh@osg.samsung.com>

Currently the user copy test is run from the Makefile. Move it out
of the Makefile into a shell script so that the test can be run
stand-alone, in addition to being run from a make target.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
---
 tools/testing/selftests/user/Makefile          |  8 +-------
 tools/testing/selftests/user/test_user_copy.sh | 10 ++++++++++
 2 files changed, 11 insertions(+), 7 deletions(-)
 create mode 100755 tools/testing/selftests/user/test_user_copy.sh

diff --git a/tools/testing/selftests/user/Makefile b/tools/testing/selftests/user/Makefile
index 396255b..12c9d15 100644
--- a/tools/testing/selftests/user/Makefile
+++ b/tools/testing/selftests/user/Makefile
@@ -4,10 +4,4 @@
 all:
 
 run_tests: all
-	@if /sbin/modprobe test_user_copy ; then \
-		rmmod test_user_copy; \
-		echo "user_copy: ok"; \
-	else \
-		echo "user_copy: [FAIL]"; \
-		exit 1; \
-	fi
+	./test_user_copy.sh
diff --git a/tools/testing/selftests/user/test_user_copy.sh b/tools/testing/selftests/user/test_user_copy.sh
new file mode 100755
index 0000000..350107f
--- /dev/null
+++ b/tools/testing/selftests/user/test_user_copy.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+# Runs copy_to/from_user infrastructure using test_user_copy kernel module
+
+if /sbin/modprobe -q test_user_copy; then
+	/sbin/modprobe -q -r test_user_copy
+	echo "user_copy: ok"
+else
+	echo "user_copy: [FAIL]"
+	exit 1
+fi
-- 
1.9.1

^ permalink raw reply related

* [PATCH 02/20] selftests/net: move test out of Makefile into a shell script
From: Shuah Khan @ 2014-11-04 17:10 UTC (permalink / raw)
  To: gregkh, akpm, mmarek, davem, keescook, tranmanphong, dh.herrmann,
	hughd, bobby.prani, ebiederm, serge.hallyn
  Cc: Shuah Khan, linux-kbuild, linux-kernel, linux-api, netdev
In-Reply-To: <cover.1415117102.git.shuahkh@osg.samsung.com>

Currently the bpf test is run from the Makefile. Move it out of the
Makefile into a shell script so that the test can be run
stand-alone, in addition to being run from a make target.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
---
 tools/testing/selftests/net/Makefile    |  8 +-------
 tools/testing/selftests/net/test_bpf.sh | 10 ++++++++++
 2 files changed, 11 insertions(+), 7 deletions(-)
 create mode 100755 tools/testing/selftests/net/test_bpf.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index c7493b8..62f22cc 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -14,12 +14,6 @@ all: $(NET_PROGS)
 run_tests: all
 	@/bin/sh ./run_netsocktests || echo "sockettests: [FAIL]"
 	@/bin/sh ./run_afpackettests || echo "afpackettests: [FAIL]"
-	@if /sbin/modprobe test_bpf ; then \
-		/sbin/rmmod test_bpf; \
-		echo "test_bpf: ok"; \
-	else \
-		echo "test_bpf: [FAIL]"; \
-		exit 1; \
-	fi
+	./test_bpf.sh
 clean:
 	$(RM) $(NET_PROGS)
diff --git a/tools/testing/selftests/net/test_bpf.sh b/tools/testing/selftests/net/test_bpf.sh
new file mode 100755
index 0000000..8b29796
--- /dev/null
+++ b/tools/testing/selftests/net/test_bpf.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+# Runs bpf test using test_bpf kernel module
+
+if /sbin/modprobe -q test_bpf ; then
+	/sbin/modprobe -q -r test_bpf;
+	echo "test_bpf: ok";
+else
+	echo "test_bpf: [FAIL]";
+	exit 1;
+fi
-- 
1.9.1
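
The two scripts added in patches 01 and 02 follow the same load/unload
pattern, which could be sketched as one helper. This is an assumption-laden
illustration, not something the series proposes: run_module_test and the
overridable MODPROBE variable are hypothetical, and the posted scripts call
/sbin/modprobe directly.

```shell
#!/bin/sh
# MODPROBE can be overridden so the sketch can be exercised without root
# or without the test modules being built (hypothetical knob).
MODPROBE=${MODPROBE:-/sbin/modprobe}

# run_module_test MODULE LABEL: load the module, unload it again, and
# report the result in the same format the posted scripts use.
run_module_test() {
	if $MODPROBE -q "$1"; then
		$MODPROBE -q -r "$1"
		echo "$2: ok"
	else
		echo "$2: [FAIL]"
		return 1
	fi
}
```

With such a helper, the bodies of the two scripts would reduce to
"run_module_test test_user_copy user_copy" and
"run_module_test test_bpf test_bpf" respectively.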


^ permalink raw reply related

* [PATCH 03/20] kbuild: kselftest_install - add a new make target to install selftests
From: Shuah Khan @ 2014-11-04 17:10 UTC (permalink / raw)
  To: gregkh, akpm, mmarek, davem, keescook, tranmanphong, dh.herrmann,
	hughd, bobby.prani, ebiederm, serge.hallyn
  Cc: Shuah Khan, linux-kbuild, linux-kernel, linux-api, netdev
In-Reply-To: <cover.1415117102.git.shuahkh@osg.samsung.com>

Add a new make target to install kernel selftests.
This new target will build and install selftests.

Approach:

make kselftest_install:
-- exports kselftest INSTALL_KSFT_PATH
   default $(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)
-- exports path for kselftest.sh
-- runs selftests make install target

selftests make install target:
-- creates kselftest.sh script in install dir
-- runs install targets for all INSTALL_TARGETS

Individual test make install targets:
-- install test programs and/or scripts in install dir
-- append to the kselftest.sh file to add commands to run test

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
---
 Makefile | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/Makefile b/Makefile
index 05d67af..291aff7 100644
--- a/Makefile
+++ b/Makefile
@@ -1078,6 +1078,20 @@ kselftest:
 	$(Q)$(MAKE) -C tools/testing/selftests run_tests
 
 # ---------------------------------------------------------------------------
+# Kernel selftest install
+INSTALL_KSFT_PATH=$(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)
+export INSTALL_KSFT_PATH
+KSELFTEST=$(INSTALL_KSFT_PATH)/kselftest.sh
+export KSELFTEST
+
+PHONY += kselftest_install
+kselftest_install:
+	@rm -rf $(INSTALL_KSFT_PATH)
+	@mkdir -p $(INSTALL_KSFT_PATH)
+	$(Q)$(MAKE) -C tools/testing/selftests install
+	chmod +x $(KSELFTEST)
+
+# ---------------------------------------------------------------------------
 # Modules
 
 ifdef CONFIG_MODULES
@@ -1285,6 +1299,9 @@ help:
 	@echo  '                    Build, install, and boot kernel before'
 	@echo  '                    running kselftest on it'
 	@echo  ''
+	@echo  '  kselftest_install - Install Kselftests to INSTALL_KSFT_PATH'
+	@echo  '                      default: $(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)'
+	@echo  ''
 	@echo  'Kernel packaging:'
 	@$(MAKE) $(build)=$(package-dir) help
 	@echo  ''
-- 
1.9.1

^ permalink raw reply related

* [PATCH 04/20] selftests: add install target to enable installing selftests
From: Shuah Khan @ 2014-11-04 17:11 UTC (permalink / raw)
  To: gregkh, akpm, mmarek, davem, keescook, tranmanphong, dh.herrmann,
	hughd, bobby.prani, ebiederm, serge.hallyn
  Cc: Shuah Khan, linux-kbuild, linux-kernel, linux-api, netdev
In-Reply-To: <cover.1415117102.git.shuahkh@osg.samsung.com>

Add a new make target to enable installing selftests. This
new target will call install targets for the tests that are
specified in INSTALL_TARGETS. During install, a script is
generated to run tests that are installed. This script will
be installed in the selftest install directory. Individual
test Makefiles are changed to add to the script. This will
allow new tests to add install and run test commands to the
generated kselftest script.

Approach:

make kselftest_install:
-- exports kselftest INSTALL_KSFT_PATH
   default $(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)
-- exports path for kselftest.sh
-- runs selftests make install target

selftests make install target:
-- creates kselftest.sh script in install dir
-- runs install targets for all INSTALL_TARGETS

Individual test make install targets:
-- install test programs and/or scripts in install dir
-- append to the kselftest.sh file to add commands to run test

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
---
 tools/testing/selftests/Makefile | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 45f145c..07b0244 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -16,6 +16,10 @@ TARGETS += sysctl
 TARGETS += firmware
 TARGETS += ftrace
 
+INSTALL_TARGETS = breakpoints cpu-hotplug efivarfs firmware ipc
+INSTALL_TARGETS += kcmp memfd memory-hotplug mqueue mount net
+INSTALL_TARGETS += ptrace sysctl timers user vm
+
 TARGETS_HOTPLUG = cpu-hotplug
 TARGETS_HOTPLUG += memory-hotplug
 
@@ -24,6 +28,16 @@ all:
 		make -C $$TARGET; \
 	done;
 
+install: all
+	echo "#!/bin/sh\n# Kselftest Run Tests ...." > $(KSELFTEST)
+	echo "# This file is generated during kselftest_install" >> $(KSELFTEST)
+	echo "# Please don't change it !!\n"  >> $(KSELFTEST)
+	echo "echo \"==============================\"" >> $(KSELFTEST)
+	for TARGET in $(INSTALL_TARGETS); do \
+		echo "\nInstalling $$TARGET"; \
+		make -C $$TARGET install; \
+	done;
+
 run_tests: all
 	for TARGET in $(TARGETS); do \
 		make -C $$TARGET run_tests; \
-- 
1.9.1

^ permalink raw reply related
