Linux Documentation
* [PATCH v4 1/2] perf: riscv: preliminary RISC-V support
From: Alan Kao @ 2018-04-18  2:12 UTC
  To: Palmer Dabbelt, Albert Ou, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Alex Solomatnikov, Jonathan Corbet, linux-riscv,
	linux-doc, linux-kernel
  Cc: Alan Kao, Nick Hu, Greentime Hu
In-Reply-To: <1524017523-25076-1-git-send-email-alankao@andestech.com>

This patch provides a basic PMU, riscv_base_pmu, which supports two
general hardware events, instructions and cycles.  Furthermore, this
PMU serves as a reference implementation to ease future ports.

riscv_base_pmu should be able to run on any RISC-V machine that
conforms to the Priv-Spec.  Note that the latest qemu model does not
yet fully implement Priv-Spec 1.10 behavior, but working around that
should be easy and needs only very small fixes.  Please check
https://github.com/riscv/riscv-qemu/pull/115 for future updates.

Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <greentime@andestech.com>
Signed-off-by: Alan Kao <alankao@andestech.com>
---
 arch/riscv/Kconfig                  |  13 +
 arch/riscv/include/asm/perf_event.h |  79 ++++-
 arch/riscv/kernel/Makefile          |   1 +
 arch/riscv/kernel/perf_event.c      | 482 ++++++++++++++++++++++++++++
 4 files changed, 571 insertions(+), 4 deletions(-)
 create mode 100644 arch/riscv/kernel/perf_event.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c22ebe08e902..90d9c8e50377 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -203,6 +203,19 @@ config RISCV_ISA_C
 config RISCV_ISA_A
 	def_bool y
 
+menu "supported PMU type"
+	depends on PERF_EVENTS
+
+config RISCV_BASE_PMU
+	bool "Base Performance Monitoring Unit"
+	def_bool y
+	help
+	  A base PMU that serves as a reference implementation and has limited
+	  perf features.  It can run on any RISC-V machine, so it serves as the
+	  fallback, but this option can also be disabled to reduce kernel size.
+
+endmenu
+
 endmenu
 
 menu "Kernel type"
diff --git a/arch/riscv/include/asm/perf_event.h b/arch/riscv/include/asm/perf_event.h
index e13d2ff29e83..0e638a0c3feb 100644
--- a/arch/riscv/include/asm/perf_event.h
+++ b/arch/riscv/include/asm/perf_event.h
@@ -1,13 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (C) 2018 SiFive
+ * Copyright (C) 2018 Andes Technology Corporation
  *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public Licence
- * as published by the Free Software Foundation; either version
- * 2 of the Licence, or (at your option) any later version.
  */
 
 #ifndef _ASM_RISCV_PERF_EVENT_H
 #define _ASM_RISCV_PERF_EVENT_H
 
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
+
+#define RISCV_BASE_COUNTERS	2
+
+/*
+ * The RISCV_MAX_COUNTERS parameter should be specified.
+ */
+
+#ifdef CONFIG_RISCV_BASE_PMU
+#define RISCV_MAX_COUNTERS	2
+#endif
+
+#ifndef RISCV_MAX_COUNTERS
+#error "Please provide a valid RISCV_MAX_COUNTERS for the PMU."
+#endif
+
+/*
+ * These are the indexes of bits in the counteren register *minus* 1,
+ * except for cycle.  It would be more coherent if they mapped directly
+ * to the counteren bit definitions, but there is a *time* register at
+ * counteren[1].  Per-cpu structures are a scarce resource here.
+ *
+ * According to the spec, an implementation can support counters up to
+ * mhpmcounter31, but many high-end processors have at most 6 general
+ * PMCs, so we only give definitions up to MHPMCOUNTER8 here.
+ */
+#define RISCV_PMU_CYCLE		0
+#define RISCV_PMU_INSTRET	1
+#define RISCV_PMU_MHPMCOUNTER3	2
+#define RISCV_PMU_MHPMCOUNTER4	3
+#define RISCV_PMU_MHPMCOUNTER5	4
+#define RISCV_PMU_MHPMCOUNTER6	5
+#define RISCV_PMU_MHPMCOUNTER7	6
+#define RISCV_PMU_MHPMCOUNTER8	7
+
+#define RISCV_OP_UNSUPP		(-EOPNOTSUPP)
+
+struct cpu_hw_events {
+	/* # currently enabled events */
+	int			n_events;
+	/* currently enabled events */
+	struct perf_event	*events[RISCV_MAX_COUNTERS];
+	/* vendor-defined PMU data */
+	void			*platform;
+};
+
+struct riscv_pmu {
+	struct pmu	*pmu;
+
+	/* generic hw/cache events table */
+	const int	*hw_events;
+	const int	(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+				       [PERF_COUNT_HW_CACHE_OP_MAX]
+				       [PERF_COUNT_HW_CACHE_RESULT_MAX];
+	/* method used to map hw/cache events */
+	int		(*map_hw_event)(u64 config);
+	int		(*map_cache_event)(u64 config);
+
+	/* max generic hw events in map */
+	int		max_events;
+	/* total number of counters, 2(base) + x(general) */
+	int		num_counters;
+	/* the width of the counter */
+	int		counter_width;
+
+	/* vendor-defined PMU features */
+	void		*platform;
+
+	irqreturn_t	(*handle_irq)(int irq_num, void *dev);
+	int		irq;
+};
+
 #endif /* _ASM_RISCV_PERF_EVENT_H */
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index ffa439d4a364..f50d19816757 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -39,5 +39,6 @@ obj-$(CONFIG_MODULE_SECTIONS)	+= module-sections.o
 obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o
 obj-$(CONFIG_DYNAMIC_FTRACE)	+= mcount-dyn.o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER)	+= ftrace.o
+obj-$(CONFIG_PERF_EVENTS)      += perf_event.o
 
 clean:
diff --git a/arch/riscv/kernel/perf_event.c b/arch/riscv/kernel/perf_event.c
new file mode 100644
index 000000000000..ba3192afc470
--- /dev/null
+++ b/arch/riscv/kernel/perf_event.c
@@ -0,0 +1,482 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2008 Thomas Gleixner <tglx@linutronix.de>
+ * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar
+ * Copyright (C) 2009 Jaswinder Singh Rajput
+ * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter
+ * Copyright (C) 2008-2009 Red Hat, Inc., Peter Zijlstra
+ * Copyright (C) 2009 Intel Corporation, <markus.t.metzger@intel.com>
+ * Copyright (C) 2009 Google, Inc., Stephane Eranian
+ * Copyright 2014 Tilera Corporation. All Rights Reserved.
+ * Copyright (C) 2018 Andes Technology Corporation
+ *
+ * Perf_events support for RISC-V platforms.
+ *
+ * Since the spec. (as of now, Priv-Spec 1.10) does not provide enough
+ * functionality for perf event to fully work, this file provides
+ * the very basic framework only.
+ *
+ * For platform ports, please check Documentation/riscv/pmu.txt.
+ *
+ * The Copyright lines include those from x86 and tile.
+ */
+
+#include <linux/kprobes.h>
+#include <linux/kernel.h>
+#include <linux/kdebug.h>
+#include <linux/mutex.h>
+#include <linux/bitmap.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/perf_event.h>
+#include <linux/atomic.h>
+#include <linux/of.h>
+#include <asm/perf_event.h>
+
+static const struct riscv_pmu *riscv_pmu __read_mostly;
+static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events);
+
+/*
+ * Hardware & cache maps and their methods
+ */
+
+static const int riscv_hw_event_map[] = {
+	[PERF_COUNT_HW_CPU_CYCLES]		= RISCV_PMU_CYCLE,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= RISCV_PMU_INSTRET,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_CACHE_MISSES]		= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BUS_CYCLES]		= RISCV_OP_UNSUPP,
+};
+
+#define C(x) PERF_COUNT_HW_CACHE_##x
+static const int riscv_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+[PERF_COUNT_HW_CACHE_OP_MAX]
+[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(L1D)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(L1I)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(DTLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] =  RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] =  RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(ITLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(BPU)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+};
+
+static int riscv_map_hw_event(u64 config)
+{
+	if (config >= riscv_pmu->max_events)
+		return -EINVAL;
+
+	return riscv_pmu->hw_events[config];
+}
+
+int riscv_map_cache_decode(u64 config, unsigned int *type,
+			   unsigned int *op, unsigned int *result)
+{
+	return -ENOENT;
+}
+
+static int riscv_map_cache_event(u64 config)
+{
+	unsigned int type, op, result;
+	int err = -ENOENT;
+	int code;
+
+	err = riscv_map_cache_decode(config, &type, &op, &result);
+	if (!riscv_pmu->cache_events || err)
+		return err;
+
+	if (type >= PERF_COUNT_HW_CACHE_MAX ||
+	    op >= PERF_COUNT_HW_CACHE_OP_MAX ||
+	    result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+		return -EINVAL;
+
+	code = (*riscv_pmu->cache_events)[type][op][result];
+	if (code == RISCV_OP_UNSUPP)
+		return -EINVAL;
+
+	return code;
+}
+
+/*
+ * Low-level functions: reading/writing counters
+ */
+
+static inline u64 read_counter(int idx)
+{
+	u64 val = 0;
+
+	switch (idx) {
+	case RISCV_PMU_CYCLE:
+		val = csr_read(cycle);
+		break;
+	case RISCV_PMU_INSTRET:
+		val = csr_read(instret);
+		break;
+	default:
+		WARN_ON_ONCE(idx < 0 || idx >= RISCV_MAX_COUNTERS);
+		return -EINVAL;
+	}
+
+	return val;
+}
+
+static inline void write_counter(int idx, u64 value)
+{
+	/* currently not supported */
+	WARN_ON_ONCE(1);
+}
+
+/*
+ * pmu->read: read and update the counter
+ *
+ * Other architectures' implementations often have an xxx_perf_event_update
+ * routine, which can return counter values when called from the IRQ handler,
+ * but returns void when called by the pmu->read method.
+ */
+static void riscv_pmu_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev_raw_count, new_raw_count;
+	u64 oldval;
+	int idx = hwc->idx;
+	u64 delta;
+
+	do {
+		prev_raw_count = local64_read(&hwc->prev_count);
+		new_raw_count = read_counter(idx);
+
+		oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+					 new_raw_count);
+	} while (oldval != prev_raw_count);
+
+	/*
+	 * delta is the value to update the counter we maintain in the kernel.
+	 */
+	delta = (new_raw_count - prev_raw_count) &
+		((1ULL << riscv_pmu->counter_width) - 1);
+	local64_add(delta, &event->count);
+	/*
+	 * Something like local64_sub(delta, &hwc->period_left) here is
+	 * needed if there is an interrupt for perf.
+	 */
+}
+
+/*
+ * State transition functions:
+ *
+ * stop()/start() & add()/del()
+ */
+
+/*
+ * pmu->stop: stop the counter
+ */
+static void riscv_pmu_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		riscv_pmu->pmu->read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+/*
+ * pmu->start: start the event.
+ */
+static void riscv_pmu_start(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+		return;
+
+	if (flags & PERF_EF_RELOAD) {
+		WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE));
+
+		/*
+		 * Set the counter to the period to the next interrupt here,
+		 * if you have any.
+		 */
+	}
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+
+	/*
+	 * Since we cannot write to counters, this serves as an initialization
+	 * to the delta-mechanism in pmu->read(); otherwise, the delta would be
+	 * wrong when pmu->read is called for the first time.
+	 */
+	local64_set(&hwc->prev_count, read_counter(hwc->idx));
+}
+
+/*
+ * pmu->add: add the event to PMU.
+ */
+static int riscv_pmu_add(struct perf_event *event, int flags)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (cpuc->n_events == riscv_pmu->num_counters)
+		return -ENOSPC;
+
+	/*
+	 * We don't have general counters, so no binding-event-to-counter
+	 * process here.
+	 *
+	 * Indexing using hwc->config generally does not work, since config may
+	 * contain extra information, but here the only info we have in
+	 * hwc->config is the event index.
+	 */
+	hwc->idx = hwc->config;
+	cpuc->events[hwc->idx] = event;
+	cpuc->n_events++;
+
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		riscv_pmu->pmu->start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+/*
+ * pmu->del: delete the event from PMU.
+ */
+static void riscv_pmu_del(struct perf_event *event, int flags)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	cpuc->events[hwc->idx] = NULL;
+	cpuc->n_events--;
+	riscv_pmu->pmu->stop(event, PERF_EF_UPDATE);
+	perf_event_update_userpage(event);
+}
+
+/*
+ * Interrupt: a skeleton for reference.
+ */
+
+static DEFINE_MUTEX(pmc_reserve_mutex);
+
+irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev)
+{
+	return IRQ_NONE;
+}
+
+static int reserve_pmc_hardware(void)
+{
+	int err = 0;
+
+	mutex_lock(&pmc_reserve_mutex);
+	if (riscv_pmu->irq >= 0 && riscv_pmu->handle_irq) {
+		err = request_irq(riscv_pmu->irq, riscv_pmu->handle_irq,
+				  IRQF_PERCPU, "riscv-base-perf", NULL);
+	}
+	mutex_unlock(&pmc_reserve_mutex);
+
+	return err;
+}
+
+void release_pmc_hardware(void)
+{
+	mutex_lock(&pmc_reserve_mutex);
+	if (riscv_pmu->irq >= 0) {
+		free_irq(riscv_pmu->irq, NULL);
+	}
+	mutex_unlock(&pmc_reserve_mutex);
+}
+
+/*
+ * Event Initialization/Finalization
+ */
+
+static atomic_t riscv_active_events = ATOMIC_INIT(0);
+
+static void riscv_event_destroy(struct perf_event *event)
+{
+	if (atomic_dec_return(&riscv_active_events) == 0)
+		release_pmc_hardware();
+}
+
+static int riscv_event_init(struct perf_event *event)
+{
+	struct perf_event_attr *attr = &event->attr;
+	struct hw_perf_event *hwc = &event->hw;
+	int err;
+	int code;
+
+	if (atomic_inc_return(&riscv_active_events) == 1) {
+		err = reserve_pmc_hardware();
+
+		if (err) {
+			pr_warn("PMC hardware not available\n");
+			atomic_dec(&riscv_active_events);
+			return -EBUSY;
+		}
+	}
+
+	switch (event->attr.type) {
+	case PERF_TYPE_HARDWARE:
+		code = riscv_pmu->map_hw_event(attr->config);
+		break;
+	case PERF_TYPE_HW_CACHE:
+		code = riscv_pmu->map_cache_event(attr->config);
+		break;
+	case PERF_TYPE_RAW:
+		return -EOPNOTSUPP;
+	default:
+		return -ENOENT;
+	}
+
+	event->destroy = riscv_event_destroy;
+	if (code < 0) {
+		event->destroy(event);
+		return code;
+	}
+
+	/*
+	 * idx is set to -1 because the index of a general event should not be
+	 * decided until binding to some counter in pmu->add().
+	 *
+	 * But since we don't have such support, later in pmu->add(), we just
+	 * use hwc->config as the index instead.
+	 */
+	hwc->config = code;
+	hwc->idx = -1;
+
+	return 0;
+}
+
+/*
+ * Initialization
+ */
+
+static struct pmu min_pmu = {
+	.name		= "riscv-base",
+	.event_init	= riscv_event_init,
+	.add		= riscv_pmu_add,
+	.del		= riscv_pmu_del,
+	.start		= riscv_pmu_start,
+	.stop		= riscv_pmu_stop,
+	.read		= riscv_pmu_read,
+};
+
+static const struct riscv_pmu riscv_base_pmu = {
+	.pmu = &min_pmu,
+	.max_events = ARRAY_SIZE(riscv_hw_event_map),
+	.map_hw_event = riscv_map_hw_event,
+	.hw_events = riscv_hw_event_map,
+	.map_cache_event = riscv_map_cache_event,
+	.cache_events = &riscv_cache_event_map,
+	.counter_width = 63,
+	.num_counters = RISCV_BASE_COUNTERS + 0,
+	.handle_irq = &riscv_base_pmu_handle_irq,
+
+	/* This means this PMU has no IRQ. */
+	.irq = -1,
+};
+
+static const struct of_device_id riscv_pmu_of_ids[] = {
+	{.compatible = "riscv,base-pmu",	.data = &riscv_base_pmu},
+	{ /* sentinel value */ }
+};
+
+int __init init_hw_perf_events(void)
+{
+	struct device_node *node = of_find_node_by_type(NULL, "pmu");
+	const struct of_device_id *of_id;
+
+	if (node && (of_id = of_match_node(riscv_pmu_of_ids, node)))
+		riscv_pmu = of_id->data;
+	else
+		riscv_pmu = &riscv_base_pmu;
+
+	perf_pmu_register(riscv_pmu->pmu, "cpu", PERF_TYPE_RAW);
+	return 0;
+}
+arch_initcall(init_hw_perf_events);
-- 
2.17.0

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: [PATCH v2 1/7] powerpc: Add TIDR CPU feature for Power9
From: Andrew Donnellan @ 2018-04-18  7:03 UTC
  To: Alastair D'Silva, linuxppc-dev
  Cc: linux-kernel, linux-doc, mikey, vaibhav, aneesh.kumar, malat,
	felix, pombredanne, sukadev, npiggin, gregkh, arnd, fbarrat,
	corbet, Alastair D'Silva
In-Reply-To: <20180418010810.30937-2-alastair@au1.ibm.com>

On 18/04/18 11:08, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> This patch adds a CPU feature bit to show whether the CPU has
> the TIDR register available, enabling as_notify/wait in userspace.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Per my previous email:

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>


-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited


* Re: [PATCH v2 2/7] powerpc: Use TIDR CPU feature to control TIDR allocation
From: Andrew Donnellan @ 2018-04-18  7:13 UTC
  To: Alastair D'Silva, linuxppc-dev
  Cc: linux-kernel, linux-doc, mikey, vaibhav, aneesh.kumar, malat,
	felix, pombredanne, sukadev, npiggin, gregkh, arnd, fbarrat,
	corbet, Alastair D'Silva
In-Reply-To: <20180418010810.30937-3-alastair@au1.ibm.com>

On 18/04/18 11:08, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Switch the use of TIDR on its CPU feature, rather than assuming it
> is available based on architecture.
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited


* Re: [PATCH v2 7/7] ocxl: Document new OCXL IOCTLs
From: Andrew Donnellan @ 2018-04-18  7:29 UTC
  To: Alastair D'Silva, linuxppc-dev
  Cc: linux-kernel, linux-doc, mikey, vaibhav, aneesh.kumar, malat,
	felix, pombredanne, sukadev, npiggin, gregkh, arnd, fbarrat,
	corbet, Alastair D'Silva
In-Reply-To: <20180418010810.30937-8-alastair@au1.ibm.com>

On 18/04/18 11:08, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
> 
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>

This looks better.

Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>   Documentation/accelerators/ocxl.rst | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst
> index 7904adcc07fd..3b8d3b99795c 100644
> --- a/Documentation/accelerators/ocxl.rst
> +++ b/Documentation/accelerators/ocxl.rst
> @@ -157,6 +157,17 @@ OCXL_IOCTL_GET_METADATA:
>     Obtains configuration information from the card, such at the size of
>     MMIO areas, the AFU version, and the PASID for the current context.
>   
> +OCXL_IOCTL_ENABLE_P9_WAIT:
> +
> +  Allows the AFU to wake a userspace thread executing 'wait'. Returns
> +  information to userspace to allow it to configure the AFU. Note that
> +  this is only available on Power 9.

Nitpicking time, if you do a v3 you should stay on brand and call it 
POWER9. :D

> +
> +OCXL_IOCTL_GET_FEATURES:
> +
> +  Reports on which CPU features that affect OpenCAPI are usable from
> +  userspace.
> +
>   mmap
>   ----
>   
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited


* [PATCH 0/7] docs/vm: start moving files to Documentation/admin-guide
From: Mike Rapoport @ 2018-04-18  8:07 UTC
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport

Hi,

These patches begin categorizing memory management documentation.  The
documents that describe userspace APIs and do not overload the reader with
implementation details can be moved to Documentation/admin-guide, so let's
do it :)

Mike Rapoport (7):
  docs/vm: hugetlbpage: minor improvements
  docs/vm: hugetlbpage: move section about kernel development to
    hugetlbfs_reserv
  docs/vm: pagemap: formatting and spelling updates
  docs/vm: pagemap: change document title
  docs/admin-guide: introduce basic index for mm documentation
  docs/admin-guide/mm: start moving here files from Documentation/vm
  docs/admin-guide/mm: convert plain text cross references to hyperlinks

 Documentation/ABI/stable/sysfs-devices-node        |  2 +-
 .../ABI/testing/sysfs-kernel-mm-hugepages          |  2 +-
 Documentation/admin-guide/index.rst                |  1 +
 .../{vm => admin-guide/mm}/hugetlbpage.rst         | 28 +++++++--------
 .../{vm => admin-guide/mm}/idle_page_tracking.rst  |  5 +--
 Documentation/admin-guide/mm/index.rst             | 28 +++++++++++++++
 Documentation/{vm => admin-guide/mm}/pagemap.rst   | 40 ++++++++++++----------
 .../{vm => admin-guide/mm}/soft-dirty.rst          |  0
 .../{vm => admin-guide/mm}/userfaultfd.rst         |  0
 Documentation/filesystems/proc.txt                 |  6 ++--
 Documentation/sysctl/vm.txt                        |  4 +--
 Documentation/vm/00-INDEX                          | 10 ------
 Documentation/vm/hugetlbfs_reserv.rst              |  8 +++++
 Documentation/vm/hwpoison.rst                      |  2 +-
 Documentation/vm/index.rst                         |  5 ---
 fs/Kconfig                                         |  2 +-
 fs/proc/task_mmu.c                                 |  4 +--
 mm/Kconfig                                         |  5 +--
 18 files changed, 89 insertions(+), 63 deletions(-)
 rename Documentation/{vm => admin-guide/mm}/hugetlbpage.rst (95%)
 rename Documentation/{vm => admin-guide/mm}/idle_page_tracking.rst (96%)
 create mode 100644 Documentation/admin-guide/mm/index.rst
 rename Documentation/{vm => admin-guide/mm}/pagemap.rst (83%)
 rename Documentation/{vm => admin-guide/mm}/soft-dirty.rst (100%)
 rename Documentation/{vm => admin-guide/mm}/userfaultfd.rst (100%)

-- 
2.7.4


* [PATCH 1/7] docs/vm: hugetlbpage: minor improvements
From: Mike Rapoport @ 2018-04-18  8:07 UTC
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

* fixed typos
* added internal cross-references for sections

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/vm/hugetlbpage.rst | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/Documentation/vm/hugetlbpage.rst b/Documentation/vm/hugetlbpage.rst
index a5da14b..99ad5d9 100644
--- a/Documentation/vm/hugetlbpage.rst
+++ b/Documentation/vm/hugetlbpage.rst
@@ -87,7 +87,7 @@ memory pressure.
 Once a number of huge pages have been pre-allocated to the kernel huge page
 pool, a user with appropriate privilege can use either the mmap system call
 or shared memory system calls to use the huge pages.  See the discussion of
-Using Huge Pages, below.
+:ref:`Using Huge Pages <using_huge_pages>`, below.
 
 The administrator can allocate persistent huge pages on the kernel boot
 command line by specifying the "hugepages=N" parameter, where 'N' = the
@@ -115,8 +115,9 @@ over all the set of allowed nodes specified by the NUMA memory policy of the
 task that modifies ``nr_hugepages``. The default for the allowed nodes--when the
 task has default memory policy--is all on-line nodes with memory.  Allowed
 nodes with insufficient available, contiguous memory for a huge page will be
-silently skipped when allocating persistent huge pages.  See the discussion
-below of the interaction of task memory policy, cpusets and per node attributes
+silently skipped when allocating persistent huge pages.  See the
+:ref:`discussion below <mem_policy_and_hp_alloc>`
+of the interaction of task memory policy, cpusets and per node attributes
 with the allocation and freeing of persistent huge pages.
 
 The success or failure of huge page allocation depends on the amount of
@@ -158,7 +159,7 @@ normal page pool.
 Caveat: Shrinking the persistent huge page pool via ``nr_hugepages`` such that
 it becomes less than the number of huge pages in use will convert the balance
 of the in-use huge pages to surplus huge pages.  This will occur even if
-the number of surplus pages it would exceed the overcommit value.  As long as
+the number of surplus pages would exceed the overcommit value.  As long as
 this condition holds--that is, until ``nr_hugepages+nr_overcommit_hugepages`` is
 increased sufficiently, or the surplus huge pages go out of use and are freed--
 no more surplus huge pages will be allowed to be allocated.
@@ -187,6 +188,7 @@ Inside each of these directories, the same set of files will exist::
 
 which function as described above for the default huge page-sized case.
 
+.. _mem_policy_and_hp_alloc:
 
 Interaction of Task Memory Policy with Huge Page Allocation/Freeing
 ===================================================================
@@ -282,6 +284,7 @@ Note that the number of overcommit and reserve pages remain global quantities,
 as we don't know until fault time, when the faulting task's mempolicy is
 applied, from which node the huge page allocation will be attempted.
 
+.. _using_huge_pages:
 
 Using Huge Pages
 ================
@@ -295,7 +298,7 @@ type hugetlbfs::
 	min_size=<value>,nr_inodes=<value> none /mnt/huge
 
 This command mounts a (pseudo) filesystem of type hugetlbfs on the directory
-``/mnt/huge``.  Any files created on ``/mnt/huge`` uses huge pages.
+``/mnt/huge``.  Any file created on ``/mnt/huge`` uses huge pages.
 
 The ``uid`` and ``gid`` options sets the owner and group of the root of the
 file system.  By default the ``uid`` and ``gid`` of the current process
@@ -345,8 +348,8 @@ applications are going to use only shmat/shmget system calls or mmap with
 MAP_HUGETLB.  For an example of how to use mmap with MAP_HUGETLB see
 :ref:`map_hugetlb <map_hugetlb>` below.
 
-Users who wish to use hugetlb memory via shared memory segment should be a
-member of a supplementary group and system admin needs to configure that gid
+Users who wish to use hugetlb memory via shared memory segment should be
+members of a supplementary group and system admin needs to configure that gid
 into ``/proc/sys/vm/hugetlb_shm_group``.  It is possible for same or different
 applications to use any combination of mmaps and shm* calls, though the mount of
 filesystem will be required for using mmap calls without MAP_HUGETLB.
-- 
2.7.4


* [PATCH 4/7] docs/vm: pagemap: change document title
From: Mike Rapoport @ 2018-04-18  8:07 UTC
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

"pagemap from the Userspace Perspective" is not very descriptive for
readers unfamiliar with pagemap. Since the document describes how to examine
a process's page tables, let's title it "Examining Process Page Tables".

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/vm/pagemap.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/vm/pagemap.rst b/Documentation/vm/pagemap.rst
index 9644bc0..7ba8cbd 100644
--- a/Documentation/vm/pagemap.rst
+++ b/Documentation/vm/pagemap.rst
@@ -1,8 +1,8 @@
 .. _pagemap:
 
-======================================
-pagemap from the Userspace Perspective
-======================================
+=============================
+Examining Process Page Tables
+=============================
 
 pagemap is a new (as of 2.6.25) set of interfaces in the kernel that allow
 userspace programs to examine the page tables and related information by
-- 
2.7.4


* Re: [PATCH v3 1/3] Documentation/i2c: whitespace cleanup
From: Wolfram Sang @ 2018-04-18  8:08 UTC
  To: Sam Hansen; +Cc: linux-i2c, corbet, linux-doc, linux-kernel
In-Reply-To: <20180413174257.139182-1-hansens@google.com>

On Fri, Apr 13, 2018 at 10:42:55AM -0700, Sam Hansen wrote:
> This strips trailing whitespace in Documentation/i2c/dev-interface.
> 
> Signed-off-by: Sam Hansen <hansens@google.com>

Applied to for-current, thanks!



* Re: [PATCH v3 2/3] Documentation/i2c: sync docs with current state of i2c-tools
From: Wolfram Sang @ 2018-04-18  8:09 UTC
  To: Sam Hansen; +Cc: linux-i2c, corbet, linux-doc, linux-kernel
In-Reply-To: <20180413174257.139182-2-hansens@google.com>

On Fri, Apr 13, 2018 at 10:42:56AM -0700, Sam Hansen wrote:
> Currently, Documentation/i2c/dev-interface describes the use of
> i2c_smbus_* helper routines as static inlined functions provided by
> linux/i2c-dev.h.  Work has been done to refactor the linux/i2c-dev.h file
> in the i2c-tools project out into its own library.  As a result, these
> docs have become stale.
> 
> This patch corrects the discrepancy and directs the reader to the
> i2c-tools project for more information.
> 
> Signed-off-by: Sam Hansen <hansens@google.com>

Applied to for-current, thanks!


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH 7/7] docs/admin-guide/mm: convert plain text cross references to hyperlinks
From: Mike Rapoport @ 2018-04-18  8:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/admin-guide/mm/hugetlbpage.rst        |  3 ++-
 Documentation/admin-guide/mm/idle_page_tracking.rst |  5 +++--
 Documentation/admin-guide/mm/pagemap.rst            | 18 +++++++++++-------
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 2b374d1..a8b0806 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -219,7 +219,8 @@ When adjusting the persistent hugepage count via ``nr_hugepages_mempolicy``, any
 memory policy mode--bind, preferred, local or interleave--may be used.  The
 resulting effect on persistent huge page allocation is as follows:
 
-#. Regardless of mempolicy mode [see Documentation/vm/numa_memory_policy.rst],
+#. Regardless of mempolicy mode [see
+   :ref:`Documentation/vm/numa_memory_policy.rst <numa_memory_policy>`],
    persistent huge pages will be distributed across the node or nodes
    specified in the mempolicy as if "interleave" had been specified.
    However, if a node in the policy does not contain sufficient contiguous
diff --git a/Documentation/admin-guide/mm/idle_page_tracking.rst b/Documentation/admin-guide/mm/idle_page_tracking.rst
index 92e3a25..6f7b7ca 100644
--- a/Documentation/admin-guide/mm/idle_page_tracking.rst
+++ b/Documentation/admin-guide/mm/idle_page_tracking.rst
@@ -65,8 +65,9 @@ workload one should:
     are not reclaimable, he or she can filter them out using
     ``/proc/kpageflags``.
 
-See Documentation/admin-guide/mm/pagemap.rst for more information about
-``/proc/pid/pagemap``, ``/proc/kpageflags``, and ``/proc/kpagecgroup``.
+See :ref:`Documentation/admin-guide/mm/pagemap.rst <pagemap>` for more
+information about ``/proc/pid/pagemap``, ``/proc/kpageflags``, and
+``/proc/kpagecgroup``.
 
 .. _impl_details:
 
diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
index 053ca64..577af85 100644
--- a/Documentation/admin-guide/mm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -18,7 +18,8 @@ There are four components to pagemap:
     * Bits 0-54  page frame number (PFN) if present
     * Bits 0-4   swap type if swapped
     * Bits 5-54  swap offset if swapped
-    * Bit  55    pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
+    * Bit  55    pte is soft-dirty (see
+      :ref:`Documentation/admin-guide/mm/soft-dirty.rst <soft_dirty>`)
     * Bit  56    page exclusively mapped (since 4.2)
     * Bits 57-60 zero
     * Bit  61    page is file-page or shared-anon (since 3.5)
@@ -97,9 +98,11 @@ Short descriptions to the page flags
     A compound page with order N consists of 2^N physically contiguous pages.
     A compound page with order 2 takes the form of "HTTT", where H donates its
     head page and T donates its tail page(s).  The major consumers of compound
-    pages are hugeTLB pages (Documentation/admin-guide/mm/hugetlbpage.rst), the SLUB etc.
-    memory allocators and various device drivers. However in this interface,
-    only huge/giga pages are made visible to end users.
+    pages are hugeTLB pages
+    (:ref:`Documentation/admin-guide/mm/hugetlbpage.rst <hugetlbpage>`),
+    the SLUB etc.  memory allocators and various device drivers.
+    However in this interface, only huge/giga pages are made visible
+    to end users.
 16 - COMPOUND_TAIL
     A compound page tail (see description above).
 17 - HUGE
@@ -118,9 +121,10 @@ Short descriptions to the page flags
     zero page for pfn_zero or huge_zero page
 25 - IDLE
     page has not been accessed since it was marked idle (see
-    Documentation/admin-guide/mm/idle_page_tracking.rst). Note that this flag may be
-    stale in case the page was accessed via a PTE. To make sure the flag
-    is up-to-date one has to read ``/sys/kernel/mm/page_idle/bitmap`` first.
+    :ref:`Documentation/admin-guide/mm/idle_page_tracking.rst <idle_page_tracking>`).
+    Note that this flag may be stale in case the page was accessed via
+    a PTE. To make sure the flag is up-to-date one has to read
+    ``/sys/kernel/mm/page_idle/bitmap`` first.
 
 IO related page flags
 ---------------------
-- 
2.7.4


^ permalink raw reply related
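
Editorial note: the pagemap entry layout quoted in the patch above can be
illustrated with a short decoder. This is only a sketch, not code from the
patch series. The function names are hypothetical, and bits 62 (swapped) and
63 (present) are not shown in the quoted hunk; they are assumed from the full
pagemap document.

```python
import os
import struct


def decode_pagemap_entry(entry):
    """Split one 64-bit /proc/pid/pagemap value into its documented fields."""
    present = bool((entry >> 63) & 1)   # assumed: bit 63 = page present
    swapped = bool((entry >> 62) & 1)   # assumed: bit 62 = page swapped
    info = {
        "present": present,
        "swapped": swapped,
        "file_or_shared_anon": bool((entry >> 61) & 1),  # bit 61
        "exclusive": bool((entry >> 56) & 1),            # bit 56
        "soft_dirty": bool((entry >> 55) & 1),           # bit 55
    }
    low55 = entry & ((1 << 55) - 1)     # bits 0-54
    if present:
        info["pfn"] = low55                 # bits 0-54: page frame number
    elif swapped:
        info["swap_type"] = low55 & 0x1F    # bits 0-4
        info["swap_offset"] = low55 >> 5    # bits 5-54
    return info


def read_pagemap_entry(pid, vaddr):
    """Read the raw entry for one virtual address (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    offset = (vaddr // page_size) * 8   # one 64-bit value per virtual page
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(offset)
        data = f.read(8)
    if len(data) != 8:
        raise ValueError("short read from pagemap")
    return struct.unpack("=Q", data)[0]  # native byte order
```

A caller would typically walk the ranges listed in /proc/pid/maps and call
read_pagemap_entry() only for mapped addresses, then pass each value to
decode_pagemap_entry(). Unprivileged readers may see a PFN of zero.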

* [PATCH 5/7] docs/admin-guide: introduce basic index for mm documentation
From: Mike Rapoport @ 2018-04-18  8:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/admin-guide/index.rst    |  1 +
 Documentation/admin-guide/mm/index.rst | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 Documentation/admin-guide/mm/index.rst

diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 5bb9161..cac906f 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -63,6 +63,7 @@ configure specific aspects of kernel behavior to your liking.
    pm/index
    thunderbolt
    LSM/index
+   mm/index
 
 .. only::  subproject and html
 
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
new file mode 100644
index 0000000..c47c16e
--- /dev/null
+++ b/Documentation/admin-guide/mm/index.rst
@@ -0,0 +1,19 @@
+=================
+Memory Management
+=================
+
+The Linux memory management subsystem is responsible, as the name implies,
+for managing the memory in the system. This includes implementation of
+virtual memory and demand paging, memory allocation both for kernel
+internal structures and user space programs, mapping of files into
+processes' address space and many other cool things.
+
+Linux memory management is a complex system with many configurable
+settings. Most of these settings are available via the ``/proc``
+filesystem and can be queried and adjusted using ``sysctl``. These APIs
+are described in Documentation/sysctl/vm.txt and in `man 5 proc`_.
+
+.. _man 5 proc: http://man7.org/linux/man-pages/man5/proc.5.html
+
+Here we document in detail how to interact with various mechanisms in
+the Linux memory management.
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH v3 3/3] Documentation/i2c: adopt kernel commenting style in examples
From: Wolfram Sang @ 2018-04-18  8:10 UTC (permalink / raw)
  To: Sam Hansen; +Cc: linux-i2c, corbet, linux-doc, linux-kernel
In-Reply-To: <20180413174257.139182-3-hansens@google.com>

[-- Attachment #1: Type: text/plain, Size: 246 bytes --]

On Fri, Apr 13, 2018 at 10:42:57AM -0700, Sam Hansen wrote:
> The example I2C code is rewritten to adopt the preferred kernel block
> commenting style.
> 
> Signed-off-by: Sam Hansen <hansens@google.com>

Applied to for-current, thanks!


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH 6/7] docs/admin-guide/mm: start moving here files from Documentation/vm
From: Mike Rapoport @ 2018-04-18  8:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

Several documents in Documentation/vm fit quite well into the "admin/user
guide" category. The documents that don't overload the reader with lots of
implementation details and provide a coherent description of a certain
feature can be moved to Documentation/admin-guide/mm.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/ABI/stable/sysfs-devices-node                 |  2 +-
 Documentation/ABI/testing/sysfs-kernel-mm-hugepages         |  2 +-
 Documentation/{vm => admin-guide/mm}/hugetlbpage.rst        |  0
 Documentation/{vm => admin-guide/mm}/idle_page_tracking.rst |  2 +-
 Documentation/admin-guide/mm/index.rst                      |  9 +++++++++
 Documentation/{vm => admin-guide/mm}/pagemap.rst            |  6 +++---
 Documentation/{vm => admin-guide/mm}/soft-dirty.rst         |  0
 Documentation/{vm => admin-guide/mm}/userfaultfd.rst        |  0
 Documentation/filesystems/proc.txt                          |  6 ++++--
 Documentation/sysctl/vm.txt                                 |  4 ++--
 Documentation/vm/00-INDEX                                   | 10 ----------
 Documentation/vm/hwpoison.rst                               |  2 +-
 Documentation/vm/index.rst                                  |  5 -----
 fs/Kconfig                                                  |  2 +-
 fs/proc/task_mmu.c                                          |  4 ++--
 mm/Kconfig                                                  |  5 +++--
 16 files changed, 28 insertions(+), 31 deletions(-)
 rename Documentation/{vm => admin-guide/mm}/hugetlbpage.rst (100%)
 rename Documentation/{vm => admin-guide/mm}/idle_page_tracking.rst (98%)
 rename Documentation/{vm => admin-guide/mm}/pagemap.rst (96%)
 rename Documentation/{vm => admin-guide/mm}/soft-dirty.rst (100%)
 rename Documentation/{vm => admin-guide/mm}/userfaultfd.rst (100%)

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index b38f4b7..3e90e1f 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -90,4 +90,4 @@ Date:		December 2009
 Contact:	Lee Schermerhorn <lee.schermerhorn@hp.com>
 Description:
 		The node's huge page size control/query attributes.
-		See Documentation/vm/hugetlbpage.rst
\ No newline at end of file
+		See Documentation/admin-guide/mm/hugetlbpage.rst
\ No newline at end of file
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-hugepages b/Documentation/ABI/testing/sysfs-kernel-mm-hugepages
index 5140b23..fdaa216 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-hugepages
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-hugepages
@@ -12,4 +12,4 @@ Description:
 			free_hugepages
 			surplus_hugepages
 			resv_hugepages
-		See Documentation/vm/hugetlbpage.rst for details.
+		See Documentation/admin-guide/mm/hugetlbpage.rst for details.
diff --git a/Documentation/vm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
similarity index 100%
rename from Documentation/vm/hugetlbpage.rst
rename to Documentation/admin-guide/mm/hugetlbpage.rst
diff --git a/Documentation/vm/idle_page_tracking.rst b/Documentation/admin-guide/mm/idle_page_tracking.rst
similarity index 98%
rename from Documentation/vm/idle_page_tracking.rst
rename to Documentation/admin-guide/mm/idle_page_tracking.rst
index d1c4609..92e3a25 100644
--- a/Documentation/vm/idle_page_tracking.rst
+++ b/Documentation/admin-guide/mm/idle_page_tracking.rst
@@ -65,7 +65,7 @@ workload one should:
     are not reclaimable, he or she can filter them out using
     ``/proc/kpageflags``.
 
-See Documentation/vm/pagemap.rst for more information about
+See Documentation/admin-guide/mm/pagemap.rst for more information about
 ``/proc/pid/pagemap``, ``/proc/kpageflags``, and ``/proc/kpagecgroup``.
 
 .. _impl_details:
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index c47c16e..6c8b554 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -17,3 +17,12 @@ are described in Documentation/sysctl/vm.txt and in `man 5 proc`_.
 
 Here we document in detail how to interact with various mechanisms in
 the Linux memory management.
+
+.. toctree::
+   :maxdepth: 1
+
+   hugetlbpage
+   idle_page_tracking
+   pagemap
+   soft-dirty
+   userfaultfd
diff --git a/Documentation/vm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
similarity index 96%
rename from Documentation/vm/pagemap.rst
rename to Documentation/admin-guide/mm/pagemap.rst
index 7ba8cbd..053ca64 100644
--- a/Documentation/vm/pagemap.rst
+++ b/Documentation/admin-guide/mm/pagemap.rst
@@ -18,7 +18,7 @@ There are four components to pagemap:
     * Bits 0-54  page frame number (PFN) if present
     * Bits 0-4   swap type if swapped
     * Bits 5-54  swap offset if swapped
-    * Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.rst)
+    * Bit  55    pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
     * Bit  56    page exclusively mapped (since 4.2)
     * Bits 57-60 zero
     * Bit  61    page is file-page or shared-anon (since 3.5)
@@ -97,7 +97,7 @@ Short descriptions to the page flags
     A compound page with order N consists of 2^N physically contiguous pages.
     A compound page with order 2 takes the form of "HTTT", where H donates its
     head page and T donates its tail page(s).  The major consumers of compound
-    pages are hugeTLB pages (Documentation/vm/hugetlbpage.rst), the SLUB etc.
+    pages are hugeTLB pages (Documentation/admin-guide/mm/hugetlbpage.rst), the SLUB etc.
     memory allocators and various device drivers. However in this interface,
     only huge/giga pages are made visible to end users.
 16 - COMPOUND_TAIL
@@ -118,7 +118,7 @@ Short descriptions to the page flags
     zero page for pfn_zero or huge_zero page
 25 - IDLE
     page has not been accessed since it was marked idle (see
-    Documentation/vm/idle_page_tracking.rst). Note that this flag may be
+    Documentation/admin-guide/mm/idle_page_tracking.rst). Note that this flag may be
     stale in case the page was accessed via a PTE. To make sure the flag
     is up-to-date one has to read ``/sys/kernel/mm/page_idle/bitmap`` first.
 
diff --git a/Documentation/vm/soft-dirty.rst b/Documentation/admin-guide/mm/soft-dirty.rst
similarity index 100%
rename from Documentation/vm/soft-dirty.rst
rename to Documentation/admin-guide/mm/soft-dirty.rst
diff --git a/Documentation/vm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
similarity index 100%
rename from Documentation/vm/userfaultfd.rst
rename to Documentation/admin-guide/mm/userfaultfd.rst
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2d3984c..ef53f80 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -515,7 +515,8 @@ guarantees:
 
 The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG
 bits on both physical and virtual pages associated with a process, and the
-soft-dirty bit on pte (see Documentation/vm/soft-dirty.rst for details).
+soft-dirty bit on pte (see Documentation/admin-guide/mm/soft-dirty.rst
+for details).
 To clear the bits for all the pages associated with the process
     > echo 1 > /proc/PID/clear_refs
 
@@ -536,7 +537,8 @@ Any other value written to /proc/PID/clear_refs will have no effect.
 
 The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags
 using /proc/kpageflags and number of times a page is mapped using
-/proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.rst.
+/proc/kpagecount. For detailed explanation, see
+Documentation/admin-guide/mm/pagemap.rst.
 
 The /proc/pid/numa_maps is an extension based on maps, showing the memory
 locality and binding policy, as well as the memory usage (in pages) of
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index c8e6d5b..697ef8c 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -515,7 +515,7 @@ nr_hugepages
 
 Change the minimum size of the hugepage pool.
 
-See Documentation/vm/hugetlbpage.rst
+See Documentation/admin-guide/mm/hugetlbpage.rst
 
 ==============================================================
 
@@ -524,7 +524,7 @@ nr_overcommit_hugepages
 Change the maximum size of the hugepage pool. The maximum is
 nr_hugepages + nr_overcommit_hugepages.
 
-See Documentation/vm/hugetlbpage.rst
+See Documentation/admin-guide/mm/hugetlbpage.rst
 
 ==============================================================
 
diff --git a/Documentation/vm/00-INDEX b/Documentation/vm/00-INDEX
index cda564d..f8a96ca 100644
--- a/Documentation/vm/00-INDEX
+++ b/Documentation/vm/00-INDEX
@@ -12,14 +12,10 @@ highmem.rst
 	- Outline of highmem and common issues.
 hmm.rst
 	- Documentation of heterogeneous memory management
-hugetlbpage.rst
-	- a brief summary of hugetlbpage support in the Linux kernel.
 hugetlbfs_reserv.rst
 	- A brief overview of hugetlbfs reservation design/implementation.
 hwpoison.rst
 	- explains what hwpoison is
-idle_page_tracking.rst
-	- description of the idle page tracking feature.
 ksm.rst
 	- how to use the Kernel Samepage Merging feature.
 mmu_notifier.rst
@@ -34,16 +30,12 @@ page_frags.rst
 	- description of page fragments allocator
 page_migration.rst
 	- description of page migration in NUMA systems.
-pagemap.rst
-	- pagemap, from the userspace perspective
 page_owner.rst
 	- tracking about who allocated each page
 remap_file_pages.rst
 	- a note about remap_file_pages() system call
 slub.rst
 	- a short users guide for SLUB.
-soft-dirty.rst
-	- short explanation for soft-dirty PTEs
 split_page_table_lock.rst
 	- Separate per-table lock to improve scalability of the old page_table_lock.
 swap_numa.rst
@@ -52,8 +44,6 @@ transhuge.rst
 	- Transparent Hugepage Support, alternative way of using hugepages.
 unevictable-lru.rst
 	- Unevictable LRU infrastructure
-userfaultfd.rst
-	- description of userfaultfd system call
 z3fold.txt
 	- outline of z3fold allocator for storing compressed pages
 zsmalloc.rst
diff --git a/Documentation/vm/hwpoison.rst b/Documentation/vm/hwpoison.rst
index 070aa1e..09bd24a 100644
--- a/Documentation/vm/hwpoison.rst
+++ b/Documentation/vm/hwpoison.rst
@@ -155,7 +155,7 @@ Testing
 	value).  This allows stress testing of many kinds of
 	pages. The page_flags are the same as in /proc/kpageflags. The
 	flag bits are defined in include/linux/kernel-page-flags.h and
-	documented in Documentation/vm/pagemap.rst
+	documented in Documentation/admin-guide/mm/pagemap.rst
 
 * Architecture specific MCE injector
 
diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
index 6c45142..ed58cb9 100644
--- a/Documentation/vm/index.rst
+++ b/Documentation/vm/index.rst
@@ -13,15 +13,10 @@ various features of the Linux memory management
 .. toctree::
    :maxdepth: 1
 
-   hugetlbpage
-   idle_page_tracking
    ksm
    numa_memory_policy
-   pagemap
    transhuge
-   soft-dirty
    swap_numa
-   userfaultfd
    zswap
 
 Kernel developers MM documentation
diff --git a/fs/Kconfig b/fs/Kconfig
index ba53dc2..ac4ac90 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -196,7 +196,7 @@ config HUGETLBFS
 	help
 	  hugetlbfs is a filesystem backing for HugeTLB pages, based on
 	  ramfs. For architectures that support it, say Y here and read
-	  <file:Documentation/vm/hugetlbpage.rst> for details.
+	  <file:Documentation/admin-guide/mm/hugetlbpage.rst> for details.
 
 	  If unsure, say N.
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 333cda8..ed48b6e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -937,7 +937,7 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
 	/*
 	 * The soft-dirty tracker uses #PF-s to catch writes
 	 * to pages, so write-protect the pte as well. See the
-	 * Documentation/vm/soft-dirty.rst for full description
+	 * Documentation/admin-guide/mm/soft-dirty.rst for full description
 	 * of how soft-dirty works.
 	 */
 	pte_t ptent = *pte;
@@ -1417,7 +1417,7 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
  * Bits 0-54  page frame number (PFN) if present
  * Bits 0-4   swap type if swapped
  * Bits 5-54  swap offset if swapped
- * Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.rst)
+ * Bit  55    pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
  * Bit  56    page exclusively mapped
  * Bits 57-60 zero
  * Bit  61    page is file-page or shared-anon
diff --git a/mm/Kconfig b/mm/Kconfig
index 9bdb018..2d7ef62 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -530,7 +530,7 @@ config MEM_SOFT_DIRTY
 	  into a page just as regular dirty bit, but unlike the latter
 	  it can be cleared by hands.
 
-	  See Documentation/vm/soft-dirty.rst for more details.
+	  See Documentation/admin-guide/mm/soft-dirty.rst for more details.
 
 config ZSWAP
 	bool "Compressed cache for swap pages (EXPERIMENTAL)"
@@ -656,7 +656,8 @@ config IDLE_PAGE_TRACKING
 	  be useful to tune memory cgroup limits and/or for job placement
 	  within a compute cluster.
 
-	  See Documentation/vm/idle_page_tracking.rst for more details.
+	  See Documentation/admin-guide/mm/idle_page_tracking.rst for
+	  more details.
 
 # arch_add_memory() comprehends device memory
 config ARCH_HAS_ZONE_DEVICE
-- 
2.7.4


^ permalink raw reply related

* [PATCH 2/7] docs/vm: hugetlbpage: move section about kernel development to hugetlbfs_reserv
From: Mike Rapoport @ 2018-04-18  8:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

The hugetlbpage document describes hugetlbfs from the user's perspective,
while the newer hugetlbfs_reserv document targets kernel developers. Hence
the section about hugetlbfs kernel development naturally belongs there.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/vm/hugetlbfs_reserv.rst | 8 ++++++++
 Documentation/vm/hugetlbpage.rst      | 8 --------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/Documentation/vm/hugetlbfs_reserv.rst b/Documentation/vm/hugetlbfs_reserv.rst
index 36a87a2..9d20076 100644
--- a/Documentation/vm/hugetlbfs_reserv.rst
+++ b/Documentation/vm/hugetlbfs_reserv.rst
@@ -583,5 +583,13 @@ of cpusets or memory policy there is no guarantee that huge pages will be
 available on the required nodes.  This is true even if there are a sufficient
 number of global reservations.
 
+Hugetlbfs regression testing
+============================
 
+The most complete set of hugetlb tests are in the libhugetlbfs repository.
+If you modify any hugetlb related code, use the libhugetlbfs test suite
+to check for regressions.  In addition, if you add any new hugetlb
+functionality, please add appropriate tests to libhugetlbfs.
+
+--
 Mike Kravetz, 7 April 2017
diff --git a/Documentation/vm/hugetlbpage.rst b/Documentation/vm/hugetlbpage.rst
index 99ad5d9..2b374d1 100644
--- a/Documentation/vm/hugetlbpage.rst
+++ b/Documentation/vm/hugetlbpage.rst
@@ -379,11 +379,3 @@ The `libhugetlbfs`_  library provides a wide range of userspace tools
 to help with huge page usability, environment setup, and control.
 
 .. _libhugetlbfs: https://github.com/libhugetlbfs/libhugetlbfs
-
-Kernel development regression testing
-=====================================
-
-The most complete set of hugetlb tests are in the libhugetlbfs repository.
-If you modify any hugetlb related code, use the libhugetlbfs test suite
-to check for regressions.  In addition, if you add any new hugetlb
-functionality, please add appropriate tests to libhugetlbfs.
-- 
2.7.4


^ permalink raw reply related

* [PATCH 3/7] docs/vm: pagemap: formatting and spelling updates
From: Mike Rapoport @ 2018-04-18  8:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Andrew Morton, Alexander Viro, Matthew Wilcox, linux-doc,
	linux-mm, linux-fsdevel, linux-kernel, Mike Rapoport
In-Reply-To: <1524038870-413-1-git-send-email-rppt@linux.vnet.ibm.com>

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 Documentation/vm/pagemap.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/Documentation/vm/pagemap.rst b/Documentation/vm/pagemap.rst
index d54b4bf..9644bc0 100644
--- a/Documentation/vm/pagemap.rst
+++ b/Documentation/vm/pagemap.rst
@@ -13,7 +13,7 @@ There are four components to pagemap:
  * ``/proc/pid/pagemap``.  This file lets a userspace process find out which
    physical frame each virtual page is mapped to.  It contains one 64-bit
    value for each virtual page, containing the following data (from
-   fs/proc/task_mmu.c, above pagemap_read):
+   ``fs/proc/task_mmu.c``, above pagemap_read):
 
     * Bits 0-54  page frame number (PFN) if present
     * Bits 0-4   swap type if swapped
@@ -36,7 +36,7 @@ There are four components to pagemap:
    precisely which pages are mapped (or in swap) and comparing mapped
    pages between processes.
 
-   Efficient users of this interface will use /proc/pid/maps to
+   Efficient users of this interface will use ``/proc/pid/maps`` to
    determine which areas of memory are actually mapped and llseek to
    skip over unmapped regions.
 
@@ -79,11 +79,11 @@ There are four components to pagemap:
    memory cgroup each page is charged to, indexed by PFN. Only available when
    CONFIG_MEMCG is set.
 
-Short descriptions to the page flags:
-=====================================
+Short descriptions to the page flags
+====================================
 
 0 - LOCKED
-   page is being locked for exclusive access, eg. by undergoing read/write IO
+   page is being locked for exclusive access, e.g. by undergoing read/write IO
 7 - SLAB
    page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
    When compound page is used, SLUB/SLQB will only set this flag on the head
@@ -132,7 +132,7 @@ IO related page flags
    ie. for file backed page: (in-memory data revision >= on-disk one)
 4 - DIRTY
    page has been written to, hence contains new data
-   ie. for file backed page: (in-memory data revision >  on-disk one)
+   i.e. for file backed page: (in-memory data revision >  on-disk one)
 8 - WRITEBACK
    page is being synced to disk
 
@@ -145,7 +145,7 @@ LRU related page flags
    page is in the active LRU list
 18 - UNEVICTABLE
    page is in the unevictable (non-)LRU list It is somehow pinned and
-   not a candidate for LRU page reclaims, eg. ramfs pages,
+   not a candidate for LRU page reclaims, e.g. ramfs pages,
    shmctl(SHM_LOCK) and mlock() memory segments
 2 - REFERENCED
    page has been referenced since last LRU list enqueue/requeue
@@ -156,7 +156,7 @@ LRU related page flags
 12 - ANON
    a memory mapped page that is not part of a file
 13 - SWAPCACHE
-   page is mapped to swap space, ie. has an associated swap entry
+   page is mapped to swap space, i.e. has an associated swap entry
 14 - SWAPBACKED
    page is backed by swap/RAM
 
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH v3 2/3] Documentation/i2c: sync docs with current state of i2c-tools
From: Wolfram Sang @ 2018-04-18  8:21 UTC (permalink / raw)
  To: Sam Hansen; +Cc: linux-i2c, corbet, linux-doc, linux-kernel
In-Reply-To: <20180413174257.139182-2-hansens@google.com>

[-- Attachment #1: Type: text/plain, Size: 436 bytes --]


> +The above functions are made available by linking against the libi2c library,
> +which is provided by the i2c-tools project.  See:
> +https://git.kernel.org/pub/scm/utils/i2c-tools/i2c-tools.git/.

In the beginning, we say that '#include <i2c/smbus.h>' is needed.
Shouldn't we mention i2c-tools there already and in what case it is
needed (only for SMBus)? I'd think so.

Sam, would you be open to doing this as an incremental patch?


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH] docs: ip-sysctl.txt: fix name of some ipv6 variables
From: Olivier Gayot @ 2018-04-18 10:31 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: linux-doc, Trivial Patch Monkey, Olivier Gayot

The names of the following proc/sysctl entries were incorrectly
documented:

    /proc/sys/net/ipv6/conf/<interface>/max_dst_opts_number
    /proc/sys/net/ipv6/conf/<interface>/max_hbh_opts_number
    /proc/sys/net/ipv6/conf/<interface>/max_dst_opts_length
    /proc/sys/net/ipv6/conf/<interface>/max_hbh_length

Their names were set to the names of the symbols in the .data field of the
control table instead of their .procname.

Signed-off-by: Olivier Gayot <olivier.gayot@sigexec.com>
---
 Documentation/networking/ip-sysctl.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 5dc1a040a2f1..b583a73cf95f 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1390,26 +1390,26 @@ mld_qrv - INTEGER
 	Default: 2 (as specified by RFC3810 9.1)
 	Minimum: 1 (as specified by RFC6636 4.5)
 
-max_dst_opts_cnt - INTEGER
+max_dst_opts_number - INTEGER
 	Maximum number of non-padding TLVs allowed in a Destination
 	options extension header. If this value is less than zero
 	then unknown options are disallowed and the number of known
 	TLVs allowed is the absolute value of this number.
 	Default: 8
 
-max_hbh_opts_cnt - INTEGER
+max_hbh_opts_number - INTEGER
 	Maximum number of non-padding TLVs allowed in a Hop-by-Hop
 	options extension header. If this value is less than zero
 	then unknown options are disallowed and the number of known
 	TLVs allowed is the absolute value of this number.
 	Default: 8
 
-max dst_opts_len - INTEGER
+max_dst_opts_length - INTEGER
 	Maximum length allowed for a Destination options extension
 	header.
 	Default: INT_MAX (unlimited)
 
-max hbh_opts_len - INTEGER
+max_hbh_length - INTEGER
 	Maximum length allowed for a Hop-by-Hop options extension
 	header.
 	Default: INT_MAX (unlimited)
-- 
2.17.0


^ permalink raw reply related
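
Editorial note: the sign convention documented above for
max_dst_opts_number and max_hbh_opts_number can be sketched as follows. This
is only an illustration of the documented semantics, not the kernel's
implementation; the function names and the (is_padding, is_known) tuple
representation of a TLV are hypothetical.

```python
def tlv_policy(max_opts_number):
    """Return (unknown_allowed, max_tlvs) for a max_*_opts_number setting.

    A negative value disallows unknown options and limits the number of
    known TLVs to the absolute value of the setting.
    """
    if max_opts_number < 0:
        return (False, -max_opts_number)
    return (True, max_opts_number)


def header_accepted(max_opts_number, tlvs):
    """Check a list of (is_padding, is_known) TLVs against the setting."""
    unknown_allowed, max_tlvs = tlv_policy(max_opts_number)
    count = 0
    for is_padding, is_known in tlvs:
        if is_padding:
            continue                # padding TLVs are not counted
        if not is_known and not unknown_allowed:
            return False            # unknown option while disallowed
        count += 1
        if count > max_tlvs:
            return False            # too many non-padding TLVs
    return True
```

With the documented default of 8, a header carrying up to eight non-padding
TLVs passes this check whether or not the options are known; with -8, a
single unknown option is enough to fail it.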

* Re: [PATCH bpf-next v3 8/8] bpf: add documentation for eBPF helpers (58-64)
From: Jesper Dangaard Brouer @ 2018-04-18 13:34 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: brouer, daniel, ast, netdev, oss-drivers, linux-doc, linux-man,
	John Fastabend
In-Reply-To: <20180417143438.7018-9-quentin.monnet@netronome.com>

On Tue, 17 Apr 2018 15:34:38 +0100
Quentin Monnet <quentin.monnet@netronome.com> wrote:

> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 350459c583de..3d329538498f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1276,6 +1276,50 @@ union bpf_attr {
>   * 	Return
>   * 		0 on success, or a negative error in case of failure.
>   *
> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> + * 	Description
> + * 		Redirect the packet to the endpoint referenced by *map* at
> + * 		index *key*. Depending on its type, his *map* can contain
                                                    ^^^

"his" -> "this"

> + * 		references to net devices (for forwarding packets through other
> + * 		ports), or to CPUs (for redirecting XDP frames to another CPU;
> + * 		but this is only implemented for native XDP (with driver
> + * 		support) as of this writing).
> + *
> + * 		All values for *flags* are reserved for future usage, and must
> + * 		be left at zero.
> + * 	Return
> + * 		**XDP_REDIRECT** on success, or **XDP_ABORT** on error.
> + *

"XDP_ABORT" -> "XDP_ABORTED"

I don't know if it's worth mentioning in the doc/man-page that, for XDP,
using bpf_redirect_map() is a HUGE performance advantage compared to
the bpf_redirect() call?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH] docs: ip-sysctl.txt: fix name of some ipv6 variables
From: Jonathan Corbet @ 2018-04-18 13:43 UTC (permalink / raw)
  To: Olivier Gayot; +Cc: linux-doc, Trivial Patch Monkey
In-Reply-To: <1524047494-32679-1-git-send-email-olivier.gayot@sigexec.com>

On Wed, 18 Apr 2018 12:31:34 +0200
Olivier Gayot <olivier.gayot@sigexec.com> wrote:

> The names of the following proc/sysctl entries were incorrectly
> documented:
> 
>     /proc/sys/net/ipv6/conf/<interface>/max_dst_opts_number
>     /proc/sys/net/ipv6/conf/<interface>/max_hbh_opts_number
>     /proc/sys/net/ipv6/conf/<interface>/max_dst_opts_length
>     /proc/sys/net/ipv6/conf/<interface>/max_hbh_length
> 
> Their name was set to the name of the symbol in the .data field of the
> control table instead of their .proc name.

The patch seems good, but can I suggest resending it to netdev?  Davem
likes to handle networking docs patches himself.

Thanks,

jon

^ permalink raw reply

* Re: [PATCH v3 04/10] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
From: Rob Herring @ 2018-04-18 13:59 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel@vger.kernel.org,
	linux-doc, devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist
In-Reply-To: <584aca6c-c87a-ff7a-2fdc-3c742236be60@linux.intel.com>

On Tue, Apr 17, 2018 at 5:06 PM, Jae Hyun Yoo
<jae.hyun.yoo@linux.intel.com> wrote:
> On 4/17/2018 11:16 AM, Jae Hyun Yoo wrote:
>>
>> On 4/17/2018 6:16 AM, Rob Herring wrote:
>>>
>>> On Mon, Apr 16, 2018 at 6:12 PM, Jae Hyun Yoo
>>> <jae.hyun.yoo@linux.intel.com> wrote:
>>>>
>>>> On 4/16/2018 11:10 AM, Rob Herring wrote:
>>>>>
>>>>>
>>>>> On Tue, Apr 10, 2018 at 11:32:06AM -0700, Jae Hyun Yoo wrote:
>>>>>>
>>>>>>
>>>>>> This commit adds a dt-bindings document of PECI adapter driver for
>>>>>> Aspeed
>>>>>> AST24xx/25xx SoCs.
>>>
>>>
>>> [...]
>>>
>>>>>> +- clocks            : Should contain clock source for PECI
>>>>>> controller.
>>>>>> +                     Should reference clkin.
>>>>>> +- clock_frequency   : Should contain the operation frequency of PECI
>>>>>> controller
>>>>>> +                     in units of Hz.
>>>>>> +                     187500 ~ 24000000
>>>>>
>>>>>
>>>>>
>>>>> This is the frequency of the bus or used to derive it? It would be
>>>>> better to specify the bus frequency instead and have the driver
>>>>> calculate its internal freq. And then use "bus-frequency" instead.
>>>>>
>>>>
>>>> I agree with you. Actually, it is being used for operation frequency
>>>> setting
>>>> of PECI controller module in SoC so it's different from the meaning of
>>>> "bus-frequency". I'll change it to "operation-frequency".
>>>
>>>
>>> No, now you've gone from a standard property name to something custom.
>>> Why do you need to set the frequency in DT if it is not related to the
>>> interface frequency?
>>>
>>> Rob
>>>
>>
>> Actually, the interface frequency is affected by the operation frequency
>> but there is no description of its relationship in datasheet. I'll check
>> again about the detail to ASPEED chip vendor and will use
>> 'bus-frequency' if available.
>>
>
> I investigated it more deeply. Basically, by the spec, PECI bus speed
> cannot be set as a fixed speed. A PECI bus can have a wide speed range
> from 2Kbps to 2Mbps which is dynamically set by a handshaking sequence
> between an originator and clients called 'timing negotiation' in spec.
> This timing negotiation behavior happens on every single transaction so the
> bus speed also can vary on every transactions. So I'm thinking a custom
> property name for it, 'peci-clk-frequency' if it is acceptable.

Okay, seems bus-frequency is not appropriate here. So use
'clock-frequency' (note the '-' not '_' as that is the standard
property).

Rob

^ permalink raw reply

* Re: [PATCH bpf-next v3 8/8] bpf: add documentation for eBPF helpers (58-64)
From: Quentin Monnet @ 2018-04-18 14:09 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: daniel, ast, netdev, oss-drivers, linux-doc, linux-man,
	John Fastabend
In-Reply-To: <20180418153448.574c6814@redhat.com>

2018-04-18 15:34 UTC+0200 ~ Jesper Dangaard Brouer <brouer@redhat.com>
> On Tue, 17 Apr 2018 15:34:38 +0100
> Quentin Monnet <quentin.monnet@netronome.com> wrote:
> 
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 350459c583de..3d329538498f 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -1276,6 +1276,50 @@ union bpf_attr {
>>   * 	Return
>>   * 		0 on success, or a negative error in case of failure.
>>   *
>> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
>> + * 	Description
>> + * 		Redirect the packet to the endpoint referenced by *map* at
>> + * 		index *key*. Depending on its type, his *map* can contain
>                                                     ^^^
> 
> "his" -> "this"

Thanks!

>> + * 		references to net devices (for forwarding packets through other
>> + * 		ports), or to CPUs (for redirecting XDP frames to another CPU;
>> + * 		but this is only implemented for native XDP (with driver
>> + * 		support) as of this writing).
>> + *
>> + * 		All values for *flags* are reserved for future usage, and must
>> + * 		be left at zero.
>> + * 	Return
>> + * 		**XDP_REDIRECT** on success, or **XDP_ABORT** on error.
>> + *
> 
> "XDP_ABORT" -> "XDP_ABORTED"

Ouch. And I did the same for bpf_redirect(). Thanks for the catch.

> 
> I don't know if it's worth mentioning in the doc/man-page; that for XDP
> using bpf_redirect_map() is a HUGE performance advantage, compared to
> the bpf_redirect() call ?

It seems worth it to me. How would you simply explain the reason for this
difference?

Quentin

^ permalink raw reply

* Re: [PATCH v3 07/10] Documentation: dt-bindings: Add documents for PECI hwmon client drivers
From: Rob Herring @ 2018-04-18 14:32 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: Alan Cox, Andrew Jeffery, Andrew Lunn, Andy Shevchenko,
	Arnd Bergmann, Benjamin Herrenschmidt, Fengguang Wu, Greg KH,
	Guenter Roeck, Haiyue Wang, James Feist, Jason M Biils,
	Jean Delvare, Joel Stanley, Julia Cartwright, Miguel Ojeda,
	Milton Miller II, Pavel Machek, Randy Dunlap, Stef van Os,
	Sumeet R Pawnikar, Vernon Mauery, linux-kernel@vger.kernel.org,
	linux-doc, devicetree, Linux HWMON List,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	OpenBMC Maillist
In-Reply-To: <6ff697e8-cd20-e551-da13-b614cc39f900@linux.intel.com>

On Tue, Apr 17, 2018 at 3:40 PM, Jae Hyun Yoo
<jae.hyun.yoo@linux.intel.com> wrote:
> On 4/16/2018 4:51 PM, Jae Hyun Yoo wrote:
>>
>> On 4/16/2018 4:22 PM, Jae Hyun Yoo wrote:
>>>
>>> On 4/16/2018 11:14 AM, Rob Herring wrote:
>>>>
>>>> On Tue, Apr 10, 2018 at 11:32:09AM -0700, Jae Hyun Yoo wrote:
>>>>>
>>>>> This commit adds dt-bindings documents for PECI cputemp and dimmtemp
>>>>> client
>>>>> drivers.
>>>>
>>>>
>
> [...]
>
>>>>> +Example:
>>>>> +    peci-bus@0 {
>>>>> +        #address-cells = <1>;
>>>>> +        #size-cells = <0>;
>>>>> +        < more properties >
>>>>> +
>>>>> +        peci-dimmtemp@cpu0 {
>>>>
>>>>
>>>> unit-address is wrong.
>>>>
>>>
>>> Will fix it using the reg value.
>>>
>>>> It is a different bus from cputemp? Otherwise, you have conflicting
>>>> addresses. If that's the case, probably should make it clear by showing
>>>> different host adapters for each example.
>>>>
>>>
>>> It could be the same bus with cputemp. Also, client address sharing is
>>> possible by PECI core if the functionality is different. I mean, cputemp and
>>> dimmtemp targeting the same client is possible case like this.
>>> peci-cputemp@30
>>> peci-dimmtemp@30
>>>
>>
>> Oh, I got your point. Probably, I should change these separate settings
>> into one like
>>
>> peci-client@30 {
>>      compatible = "intel,peci-client";
>>      reg = <0x30>;
>> };
>>
>> Then cputemp and dimmtemp drivers could refer the same compatible string.
>> Will rewrite it.
>>
>
> I've checked it again and realized that it should use function based node
> name like:
>
> peci-cputemp@30
> peci-dimmtemp@30
>
> If it use the same string like 'peci-client@30', the drivers cannot be
> selectively enabled. The client address sharing way is well handled in PECI
> core and this way would be better for the future implementations of other
> PECI functional drivers such as crash dump driver and so on. So I'm going
> change the unit-address only.

Two nodes at the same address is wrong (and soon dtc will warn you about
this). You have two potential options. The first is that you need additional
address information in the DT if these are in fact two independent
devices. This could be something like a function number, to borrow
something from PCI addressing. From what I found on PECI, it doesn't
seem to have anything like that. The second option is that you have a single
DT node which registers multiple hwmon devices. DT nodes and drivers
don't have to be 1-1. Don't design your DT nodes from how you want to
partition drivers in some OS.
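
The conflict described above, two sibling nodes sharing one unit-address,
can be illustrated with a small check in C. This is a hedged sketch and
not dtc's actual implementation: unit_address() and
has_duplicate_unit_address() are hypothetical helpers, and node names
are assumed to follow the usual "name@unit-address" form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the unit-address part of a node name
 * ("peci-cputemp@30" -> "30"), or NULL if the name has none.
 */
static const char *unit_address(const char *node_name)
{
	const char *at = strchr(node_name, '@');

	return at ? at + 1 : NULL;
}

/* Return true if any two sibling node names share a unit-address. */
static bool has_duplicate_unit_address(const char *const *siblings, size_t n)
{
	assert(siblings != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const char *a = unit_address(siblings[i]);

		if (!a)
			continue;
		for (size_t j = i + 1; j < n; j++) {
			const char *b = unit_address(siblings[j]);

			if (b && strcmp(a, b) == 0)
				return true;
		}
	}
	return false;
}
```

Under this check, the sibling pair peci-cputemp@30 and peci-dimmtemp@30
is flagged, while a single peci-client@30 node (which could register
several hwmon devices) is not.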

Rob

^ permalink raw reply

* Re: [PATCH] doc: dev-tools: kselftest.rst: update contributing new tests
From: Shuah Khan @ 2018-04-18 14:48 UTC (permalink / raw)
  To: Anders Roxell, corbet
  Cc: linux-kselftest, linux-doc, linux-kernel, Shuah Khan, Shuah Khan
In-Reply-To: <20180417084631.11242-1-anders.roxell@linaro.org>

On 04/17/2018 02:46 AM, Anders Roxell wrote:
> Add a description that the kernel headers should be used as far as it is
> possible and then the system headers.
> 
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> ---
>  Documentation/dev-tools/kselftest.rst | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
> index e80850eefe13..27f08d6ba91c 100644
> --- a/Documentation/dev-tools/kselftest.rst
> +++ b/Documentation/dev-tools/kselftest.rst
> @@ -151,6 +151,9 @@ Contributing new tests (details)
>     TEST_FILES, TEST_GEN_FILES mean it is the file which is used by
>     test.
>  
> + * First use the headers inside the kernel, and then the system headers. The
> +   internal headers should be the primary focus to be able to find regressions.

Clarifying the location of the headers might be helpful. This description uses
different terminology to describe kernel headers.

"First use the headers inside the kernel sources and/or git repo" would make this
clear. Instead of "internal headers", saying "headers for the kernel release, as
opposed to headers installed by the distro on the system" would make a clear
distinction.

thanks,
-- Shuah

^ permalink raw reply

* Re: [PATCH bpf-next v3 8/8] bpf: add documentation for eBPF helpers (58-64)
From: Jesper Dangaard Brouer @ 2018-04-18 15:43 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: daniel, ast, netdev, oss-drivers, linux-doc, linux-man,
	John Fastabend, brouer
In-Reply-To: <67e84a95-5e7b-1c2c-e90f-7bcc0026bd10@netronome.com>

On Wed, 18 Apr 2018 15:09:41 +0100
Quentin Monnet <quentin.monnet@netronome.com> wrote:

> 2018-04-18 15:34 UTC+0200 ~ Jesper Dangaard Brouer <brouer@redhat.com>
> > On Tue, 17 Apr 2018 15:34:38 +0100
> > Quentin Monnet <quentin.monnet@netronome.com> wrote:
> >   
> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >> index 350459c583de..3d329538498f 100644
> >> --- a/include/uapi/linux/bpf.h
> >> +++ b/include/uapi/linux/bpf.h
> >> @@ -1276,6 +1276,50 @@ union bpf_attr {
> >>   * 	Return
> >>   * 		0 on success, or a negative error in case of failure.
> >>   *
> >> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> >> + * 	Description
> >> + * 		Redirect the packet to the endpoint referenced by *map* at
> >> + * 		index *key*. Depending on its type, his *map* can contain  
> >                                                     ^^^
> > 
> > "his" -> "this"  
> 
> Thanks!
> 
> >> + * 		references to net devices (for forwarding packets through other
> >> + * 		ports), or to CPUs (for redirecting XDP frames to another CPU;
> >> + * 		but this is only implemented for native XDP (with driver
> >> + * 		support) as of this writing).
> >> + *
> >> + * 		All values for *flags* are reserved for future usage, and must
> >> + * 		be left at zero.
> >> + * 	Return
> >> + * 		**XDP_REDIRECT** on success, or **XDP_ABORT** on error.
> >> + *  
> > 
> > "XDP_ABORT" -> "XDP_ABORTED"  
> 
> Ouch. And I did the same for bpf_redirect(). Thanks for the catch.
> 
> > 
> > I don't know if it's worth mentioning in the doc/man-page; that for XDP
> > using bpf_redirect_map() is a HUGE performance advantage, compared to
> > the bpf_redirect() call ?  
> 
> It seems worth to me. How would you simply explain the reason for this
> difference?

The basic reason is "bulking effect", as devmap avoids the NIC
tailptr/doorbell update on every packet... how to write that in a doc
format?

I wrote about why XDP_REDIRECT with maps is smart here:
 http://vger.kernel.org/netconf2017_files/XDP_devel_update_NetConf2017_Seoul.pdf

Using maps for redirect hopefully makes XDP_REDIRECT the last driver
XDP action code we need, as new types of redirect can be introduced
without driver changes. Note that AF_XDP also uses a map.

It is more subtle, but maps also function as a sorting step. Imagine
your XDP program needs to redirect out of different interfaces (or CPUs in
the cpumap case), and packets arrive intermixed.  Packets get sorted into
the different map indexes, and xdp_do_flush_map() will trigger the
flush operation.
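
The bulking and sorting effects described above can be illustrated with
a small user-space simulation in C. This is only a sketch under stated
assumptions, not kernel or driver code: NUM_PORTS, BULK_SIZE and both
function names are hypothetical, and "doorbell" here just counts how
often a NIC tail pointer would have to be updated.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PORTS  4	/* hypothetical number of egress devices */
#define BULK_SIZE 16	/* hypothetical per-device bulk queue depth */

/* Without bulking: every transmitted packet costs one doorbell update. */
static size_t doorbells_per_packet(const int *dst_port, size_t n)
{
	(void)dst_port;
	return n;
}

/*
 * With bulking: packets are sorted into one queue per destination, and
 * the doorbell is updated only when a queue fills up or when the poll
 * cycle ends and the remaining packets are flushed.
 */
static size_t doorbells_bulked(const int *dst_port, size_t n)
{
	size_t queued[NUM_PORTS] = { 0 };
	size_t doorbells = 0;

	for (size_t i = 0; i < n; i++) {
		int port = dst_port[i];

		assert(port >= 0 && port < NUM_PORTS);
		if (++queued[port] == BULK_SIZE) {
			doorbells++;		/* full queue: flush it */
			queued[port] = 0;
		}
	}

	/* End of poll cycle: flush every partially filled queue. */
	for (int p = 0; p < NUM_PORTS; p++)
		if (queued[p] > 0)
			doorbells++;

	return doorbells;
}
```

In this model, 64 packets alternating between two ports need 64
doorbell updates without bulking but only 4 with it, even though the
packets arrive intermixed.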


I happened to have an i40e NIC benchmark setup, and ran a single-flow pktgen test.

Results with 'xdp_redirect_map'
 13589297 pps (13,589,297) 

Results with 'xdp_redirect' NOT using devmap:
  7567575 pps (7,567,575)

Just to point out the performance benefit of devmap...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH] docs: ip-sysctl.txt: fix name of some ipv6 variables
From: Olivier Gayot @ 2018-04-18 16:12 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: linux-doc, Trivial Patch Monkey
In-Reply-To: <20180418074356.01d61108@lwn.net>

Hi Jonathan,

On Wed, Apr 18, 2018 at 07:43:56AM -0600, Jonathan Corbet wrote:
> The patch seems good, but can I suggest resending it to netdev?  Davem
> likes to handle networking docs patches himself.
> 
> Thanks,
> 
> jon

Sure. Thanks for the feedback. I'll resend to netdev.

Olivier

^ permalink raw reply

