RFC: android logger feedback request

linux-embedded.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* RFC: android logger feedback request
@ 2011-12-21 22:59 Tim Bird
  2011-12-21 23:19 ` Greg KH
                   ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Tim Bird @ 2011-12-21 22:59 UTC (permalink / raw)
  To: linux-embedded
  Cc: linux kernel, Arnd Bergmann, john stultz, Greg KH, Brian Swetland

Hi all,

I'm looking for feedback on the Android logger code, to see what
it would take to make this code acceptable for inclusion in
the mainline kernel.

Information about the features of Android logging system
can be found at: http://elinux.org/Android_Logging_System

This system creates a new system-wide logging service, in
the kernel, for user-space message.  It is more comparable
to syslog than to the kernel log buffer, as it holds only
user-space messages.  It is optimized for write
performance, since most of the time the log is written to
and never read.  It creates multiple log channels, to prevent
an abundance of log messages in one channel from overwriting
messages in another channel.  The log channels have sizes
fixed at kernel compile-time.

Log messages are stored in very simple in-kernel buffers, that
overflow old messages upon wrapping.  A fixed set of attributes
(pid, tid, timestamp and message), is kept for each message.
By convention, Android puts a message priority and context tag
into each message.

In Android, this system uses a fixed set of device nodes with
well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
and /dev/log/system.

Operations on the log are done via a character device, using
standard file operations and some ioctls.

The code for this is below (I've moved it from linux-next
drivers/staging/android for my own testing).

Please let me know what issues you see with this code.

One specific question I have is where is the most appropriate
place for this code to live, in the kernel source tree?
Other embedded systems might want to use this system (it
is simpler than syslog, and superior in some ways), so I don't
think it should remain in an android-specific directory.

Thanks,
 -- Tim

P.S. Sorry to do this right before the holidays.  I'll be
away from my work machine for most of the next few weeks.  I
can answer informational questions, and gather feedback,
but I won't have a lot of time for testing new code until
after the New Year.


Here's the code:
diff --git a/drivers/misc/logger.c b/drivers/misc/logger.c
new file mode 100644
index 0000000..fa76ce7
--- /dev/null
+++ b/drivers/misc/logger.c
@@ -0,0 +1,616 @@
+/*
+ * drivers/misc/logger.c
+ *
+ * A Logging Subsystem
+ *
+ * Copyright (C) 2007-2008 Google, Inc.
+ *
+ * Robert Love <rlove@google.com>
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/sched.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/uaccess.h>
+#include <linux/poll.h>
+#include <linux/slab.h>
+#include <linux/time.h>
+#include "logger.h"
+
+#include <asm/ioctls.h>
+
+/*
+ * struct logger_log - represents a specific log, such as 'main' or 'radio'
+ *
+ * This structure lives from module insertion until module removal, so it does
+ * not need additional reference counting. The structure is protected by the
+ * mutex 'mutex'.
+ */
+struct logger_log {
+	unsigned char 		*buffer;/* the ring buffer itself */
+	struct miscdevice	misc;	/* misc device representing the log */
+	wait_queue_head_t	wq;	/* wait queue for readers */
+	struct list_head	readers; /* this log's readers */
+	struct mutex		mutex;	/* mutex protecting buffer */
+	size_t			w_off;	/* current write head offset */
+	size_t			head;	/* new readers start here */
+	size_t			size;	/* size of the log */
+};
+
+/*
+ * struct logger_reader - a logging device open for reading
+ *
+ * This object lives from open to release, so we don't need additional
+ * reference counting. The structure is protected by log->mutex.
+ */
+struct logger_reader {
+	struct logger_log	*log;	/* associated log */
+	struct list_head	list;	/* entry in logger_log's list */
+	size_t			r_off;	/* current read head offset */
+};
+
+/* logger_offset - returns index 'n' into the log via (optimized) modulus */
+#define logger_offset(n)	((n) & (log->size - 1))
+
+/*
+ * file_get_log - Given a file structure, return the associated log
+ *
+ * This isn't aesthetic. We have several goals:
+ *
+ * 	1) Need to quickly obtain the associated log during an I/O operation
+ * 	2) Readers need to maintain state (logger_reader)
+ * 	3) Writers need to be very fast (open() should be a near no-op)
+ *
+ * In the reader case, we can trivially go file->logger_reader->logger_log.
+ * For a writer, we don't want to maintain a logger_reader, so we just go
+ * file->logger_log. Thus what file->private_data points at depends on whether
+ * or not the file was opened for reading. This function hides that dirtiness.
+ */
+static inline struct logger_log *file_get_log(struct file *file)
+{
+	if (file->f_mode & FMODE_READ) {
+		struct logger_reader *reader = file->private_data;
+		return reader->log;
+	} else
+		return file->private_data;
+}
+
+/*
+ * get_entry_len - Grabs the length of the payload of the next entry starting
+ * from 'off'.
+ *
+ * Caller needs to hold log->mutex.
+ */
+static __u32 get_entry_len(struct logger_log *log, size_t off)
+{
+	__u16 val;
+
+	switch (log->size - off) {
+	case 1:
+		memcpy(&val, log->buffer + off, 1);
+		memcpy(((char *) &val) + 1, log->buffer, 1);
+		break;
+	default:
+		memcpy(&val, log->buffer + off, 2);
+	}
+
+	return sizeof(struct logger_entry) + val;
+}
+
+/*
+ * do_read_log_to_user - reads exactly 'count' bytes from 'log' into the
+ * user-space buffer 'buf'. Returns 'count' on success.
+ *
+ * Caller must hold log->mutex.
+ */
+static ssize_t do_read_log_to_user(struct logger_log *log,
+				   struct logger_reader *reader,
+				   char __user *buf,
+				   size_t count)
+{
+	size_t len;
+
+	/*
+	 * We read from the log in two disjoint operations. First, we read from
+	 * the current read head offset up to 'count' bytes or to the end of
+	 * the log, whichever comes first.
+	 */
+	len = min(count, log->size - reader->r_off);
+	if (copy_to_user(buf, log->buffer + reader->r_off, len))
+		return -EFAULT;
+
+	/*
+	 * Second, we read any remaining bytes, starting back at the head of
+	 * the log.
+	 */
+	if (count != len)
+		if (copy_to_user(buf + len, log->buffer, count - len))
+			return -EFAULT;
+
+	reader->r_off = logger_offset(reader->r_off + count);
+
+	return count;
+}
+
+/*
+ * logger_read - our log's read() method
+ *
+ * Behavior:
+ *
+ * 	- O_NONBLOCK works
+ * 	- If there are no log entries to read, blocks until log is written to
+ * 	- Atomically reads exactly one log entry
+ *
+ * Optimal read size is LOGGER_ENTRY_MAX_LEN. Will set errno to EINVAL if read
+ * buffer is insufficient to hold next entry.
+ */
+static ssize_t logger_read(struct file *file, char __user *buf,
+			   size_t count, loff_t *pos)
+{
+	struct logger_reader *reader = file->private_data;
+	struct logger_log *log = reader->log;
+	ssize_t ret;
+	DEFINE_WAIT(wait);
+
+start:
+	while (1) {
+		prepare_to_wait(&log->wq, &wait, TASK_INTERRUPTIBLE);
+
+		mutex_lock(&log->mutex);
+		ret = (log->w_off == reader->r_off);
+		mutex_unlock(&log->mutex);
+		if (!ret)
+			break;
+
+		if (file->f_flags & O_NONBLOCK) {
+			ret = -EAGAIN;
+			break;
+		}
+
+		if (signal_pending(current)) {
+			ret = -EINTR;
+			break;
+		}
+
+		schedule();
+	}
+
+	finish_wait(&log->wq, &wait);
+	if (ret)
+		return ret;
+
+	mutex_lock(&log->mutex);
+
+	/* is there still something to read or did we race? */
+	if (unlikely(log->w_off == reader->r_off)) {
+		mutex_unlock(&log->mutex);
+		goto start;
+	}
+
+	/* get the size of the next entry */
+	ret = get_entry_len(log, reader->r_off);
+	if (count < ret) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* get exactly one entry from the log */
+	ret = do_read_log_to_user(log, reader, buf, ret);
+
+out:
+	mutex_unlock(&log->mutex);
+
+	return ret;
+}
+
+/*
+ * get_next_entry - return the offset of the first valid entry at least 'len'
+ * bytes after 'off'.
+ *
+ * Caller must hold log->mutex.
+ */
+static size_t get_next_entry(struct logger_log *log, size_t off, size_t len)
+{
+	size_t count = 0;
+
+	do {
+		size_t nr = get_entry_len(log, off);
+		off = logger_offset(off + nr);
+		count += nr;
+	} while (count < len);
+
+	return off;
+}
+
+/*
+ * clock_interval - is a < c < b in mod-space? Put another way, does the line
+ * from a to b cross c?
+ */
+static inline int clock_interval(size_t a, size_t b, size_t c)
+{
+	if (b < a) {
+		if (a < c || b >= c)
+			return 1;
+	} else {
+		if (a < c && b >= c)
+			return 1;
+	}
+
+	return 0;
+}
+
+/*
+ * fix_up_readers - walk the list of all readers and "fix up" any who were
+ * lapped by the writer; also do the same for the default "start head".
+ * We do this by "pulling forward" the readers and start head to the first
+ * entry after the new write head.
+ *
+ * The caller needs to hold log->mutex.
+ */
+static void fix_up_readers(struct logger_log *log, size_t len)
+{
+	size_t old = log->w_off;
+	size_t new = logger_offset(old + len);
+	struct logger_reader *reader;
+
+	if (clock_interval(old, new, log->head))
+		log->head = get_next_entry(log, log->head, len);
+
+	list_for_each_entry(reader, &log->readers, list)
+		if (clock_interval(old, new, reader->r_off))
+			reader->r_off = get_next_entry(log, reader->r_off, len);
+}
+
+/*
+ * do_write_log - writes 'len' bytes from 'buf' to 'log'
+ *
+ * The caller needs to hold log->mutex.
+ */
+static void do_write_log(struct logger_log *log, const void *buf, size_t count)
+{
+	size_t len;
+
+	len = min(count, log->size - log->w_off);
+	memcpy(log->buffer + log->w_off, buf, len);
+
+	if (count != len)
+		memcpy(log->buffer, buf + len, count - len);
+
+	log->w_off = logger_offset(log->w_off + count);
+
+}
+
+/*
+ * do_write_log_user - writes 'len' bytes from the user-space buffer 'buf' to
+ * the log 'log'
+ *
+ * The caller needs to hold log->mutex.
+ *
+ * Returns 'count' on success, negative error code on failure.
+ */
+static ssize_t do_write_log_from_user(struct logger_log *log,
+				      const void __user *buf, size_t count)
+{
+	size_t len;
+
+	len = min(count, log->size - log->w_off);
+	if (len && copy_from_user(log->buffer + log->w_off, buf, len))
+		return -EFAULT;
+
+	if (count != len)
+		if (copy_from_user(log->buffer, buf + len, count - len))
+			return -EFAULT;
+
+	log->w_off = logger_offset(log->w_off + count);
+
+	return count;
+}
+
+/*
+ * logger_aio_write - our write method, implementing support for write(),
+ * writev(), and aio_write(). Writes are our fast path, and we try to optimize
+ * them above all else.
+ */
+ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov,
+			 unsigned long nr_segs, loff_t ppos)
+{
+	struct logger_log *log = file_get_log(iocb->ki_filp);
+	size_t orig = log->w_off;
+	struct logger_entry header;
+	struct timespec now;
+	ssize_t ret = 0;
+
+	now = current_kernel_time();
+
+	header.pid = current->tgid;
+	header.tid = current->pid;
+	header.sec = now.tv_sec;
+	header.nsec = now.tv_nsec;
+	header.len = min_t(size_t, iocb->ki_left, LOGGER_ENTRY_MAX_PAYLOAD);
+
+	/* null writes succeed, return zero */
+	if (unlikely(!header.len))
+		return 0;
+
+	mutex_lock(&log->mutex);
+
+	/*
+	 * Fix up any readers, pulling them forward to the first readable
+	 * entry after (what will be) the new write offset. We do this now
+	 * because if we partially fail, we can end up with clobbered log
+	 * entries that encroach on readable buffer.
+	 */
+	fix_up_readers(log, sizeof(struct logger_entry) + header.len);
+
+	do_write_log(log, &header, sizeof(struct logger_entry));
+
+	while (nr_segs-- > 0) {
+		size_t len;
+		ssize_t nr;
+
+		/* figure out how much of this vector we can keep */
+		len = min_t(size_t, iov->iov_len, header.len - ret);
+
+		/* write out this segment's payload */
+		nr = do_write_log_from_user(log, iov->iov_base, len);
+		if (unlikely(nr < 0)) {
+			log->w_off = orig;
+			mutex_unlock(&log->mutex);
+			return nr;
+		}
+
+		iov++;
+		ret += nr;
+	}
+
+	mutex_unlock(&log->mutex);
+
+	/* wake up any blocked readers */
+	wake_up_interruptible(&log->wq);
+
+	return ret;
+}
+
+static struct logger_log *get_log_from_minor(int);
+
+/*
+ * logger_open - the log's open() file operation
+ *
+ * Note how near a no-op this is in the write-only case. Keep it that way!
+ */
+static int logger_open(struct inode *inode, struct file *file)
+{
+	struct logger_log *log;
+	int ret;
+
+	ret = nonseekable_open(inode, file);
+	if (ret)
+		return ret;
+
+	log = get_log_from_minor(MINOR(inode->i_rdev));
+	if (!log)
+		return -ENODEV;
+
+	if (file->f_mode & FMODE_READ) {
+		struct logger_reader *reader;
+
+		reader = kmalloc(sizeof(struct logger_reader), GFP_KERNEL);
+		if (!reader)
+			return -ENOMEM;
+
+		reader->log = log;
+		INIT_LIST_HEAD(&reader->list);
+
+		mutex_lock(&log->mutex);
+		reader->r_off = log->head;
+		list_add_tail(&reader->list, &log->readers);
+		mutex_unlock(&log->mutex);
+
+		file->private_data = reader;
+	} else
+		file->private_data = log;
+
+	return 0;
+}
+
+/*
+ * logger_release - the log's release file operation
+ *
+ * Note this is a total no-op in the write-only case. Keep it that way!
+ */
+static int logger_release(struct inode *ignored, struct file *file)
+{
+	if (file->f_mode & FMODE_READ) {
+		struct logger_reader *reader = file->private_data;
+		list_del(&reader->list);
+		kfree(reader);
+	}
+
+	return 0;
+}
+
+/*
+ * logger_poll - the log's poll file operation, for poll/select/epoll
+ *
+ * Note we always return POLLOUT, because you can always write() to the log.
+ * Note also that, strictly speaking, a return value of POLLIN does not
+ * guarantee that the log is readable without blocking, as there is a small
+ * chance that the writer can lap the reader in the interim between poll()
+ * returning and the read() request.
+ */
+static unsigned int logger_poll(struct file *file, poll_table *wait)
+{
+	struct logger_reader *reader;
+	struct logger_log *log;
+	unsigned int ret = POLLOUT | POLLWRNORM;
+
+	if (!(file->f_mode & FMODE_READ))
+		return ret;
+
+	reader = file->private_data;
+	log = reader->log;
+
+	poll_wait(file, &log->wq, wait);
+
+	mutex_lock(&log->mutex);
+	if (log->w_off != reader->r_off)
+		ret |= POLLIN | POLLRDNORM;
+	mutex_unlock(&log->mutex);
+
+	return ret;
+}
+
+static long logger_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct logger_log *log = file_get_log(file);
+	struct logger_reader *reader;
+	long ret = -ENOTTY;
+
+	mutex_lock(&log->mutex);
+
+	switch (cmd) {
+	case LOGGER_GET_LOG_BUF_SIZE:
+		ret = log->size;
+		break;
+	case LOGGER_GET_LOG_LEN:
+		if (!(file->f_mode & FMODE_READ)) {
+			ret = -EBADF;
+			break;
+		}
+		reader = file->private_data;
+		if (log->w_off >= reader->r_off)
+			ret = log->w_off - reader->r_off;
+		else
+			ret = (log->size - reader->r_off) + log->w_off;
+		break;
+	case LOGGER_GET_NEXT_ENTRY_LEN:
+		if (!(file->f_mode & FMODE_READ)) {
+			ret = -EBADF;
+			break;
+		}
+		reader = file->private_data;
+		if (log->w_off != reader->r_off)
+			ret = get_entry_len(log, reader->r_off);
+		else
+			ret = 0;
+		break;
+	case LOGGER_FLUSH_LOG:
+		if (!(file->f_mode & FMODE_WRITE)) {
+			ret = -EBADF;
+			break;
+		}
+		list_for_each_entry(reader, &log->readers, list)
+			reader->r_off = log->w_off;
+		log->head = log->w_off;
+		ret = 0;
+		break;
+	}
+
+	mutex_unlock(&log->mutex);
+
+	return ret;
+}
+
+static const struct file_operations logger_fops = {
+	.owner = THIS_MODULE,
+	.read = logger_read,
+	.aio_write = logger_aio_write,
+	.poll = logger_poll,
+	.unlocked_ioctl = logger_ioctl,
+	.compat_ioctl = logger_ioctl,
+	.open = logger_open,
+	.release = logger_release,
+};
+
+/*
+ * Defines a log structure with name 'NAME' and a size of 'SIZE' bytes, which
+ * must be a power of two, greater than LOGGER_ENTRY_MAX_LEN, and less than
+ * LONG_MAX minus LOGGER_ENTRY_MAX_LEN.
+ */
+#define DEFINE_LOGGER_DEVICE(VAR, NAME, SIZE) \
+static unsigned char _buf_ ## VAR[SIZE]; \
+static struct logger_log VAR = { \
+	.buffer = _buf_ ## VAR, \
+	.misc = { \
+		.minor = MISC_DYNAMIC_MINOR, \
+		.name = NAME, \
+		.fops = &logger_fops, \
+		.parent = NULL, \
+	}, \
+	.wq = __WAIT_QUEUE_HEAD_INITIALIZER(VAR .wq), \
+	.readers = LIST_HEAD_INIT(VAR .readers), \
+	.mutex = __MUTEX_INITIALIZER(VAR .mutex), \
+	.w_off = 0, \
+	.head = 0, \
+	.size = SIZE, \
+};
+
+DEFINE_LOGGER_DEVICE(log_main, LOGGER_LOG_MAIN, 256*1024)
+DEFINE_LOGGER_DEVICE(log_events, LOGGER_LOG_EVENTS, 256*1024)
+DEFINE_LOGGER_DEVICE(log_radio, LOGGER_LOG_RADIO, 256*1024)
+DEFINE_LOGGER_DEVICE(log_system, LOGGER_LOG_SYSTEM, 256*1024)
+
+static struct logger_log *get_log_from_minor(int minor)
+{
+	if (log_main.misc.minor == minor)
+		return &log_main;
+	if (log_events.misc.minor == minor)
+		return &log_events;
+	if (log_radio.misc.minor == minor)
+		return &log_radio;
+	if (log_system.misc.minor == minor)
+		return &log_system;
+	return NULL;
+}
+
+static int __init init_log(struct logger_log *log)
+{
+	int ret;
+
+	ret = misc_register(&log->misc);
+	if (unlikely(ret)) {
+		printk(KERN_ERR "logger: failed to register misc "
+		       "device for log '%s'!\n", log->misc.name);
+		return ret;
+	}
+
+	printk(KERN_INFO "logger: created %luK log '%s'\n",
+	       (unsigned long) log->size >> 10, log->misc.name);
+
+	return 0;
+}
+
+static int __init logger_init(void)
+{
+	int ret;
+
+	ret = init_log(&log_main);
+	if (unlikely(ret))
+		goto out;
+
+	ret = init_log(&log_events);
+	if (unlikely(ret))
+		goto out;
+
+	ret = init_log(&log_radio);
+	if (unlikely(ret))
+		goto out;
+
+	ret = init_log(&log_system);
+	if (unlikely(ret))
+		goto out;
+
+out:
+	return ret;
+}
+device_initcall(logger_init);
diff --git a/drivers/misc/logger.h b/drivers/misc/logger.h
new file mode 100644
index 0000000..2cb06e9
--- /dev/null
+++ b/drivers/misc/logger.h
@@ -0,0 +1,49 @@
+/* include/linux/logger.h
+ *
+ * Copyright (C) 2007-2008 Google, Inc.
+ * Author: Robert Love <rlove@android.com>
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _LINUX_LOGGER_H
+#define _LINUX_LOGGER_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+struct logger_entry {
+	__u16		len;	/* length of the payload */
+	__u16		__pad;	/* no matter what, we get 2 bytes of padding */
+	__s32		pid;	/* generating process's pid */
+	__s32		tid;	/* generating process's tid */
+	__s32		sec;	/* seconds since Epoch */
+	__s32		nsec;	/* nanoseconds */
+	char		msg[0];	/* the entry's payload */
+};
+
+#define LOGGER_LOG_RADIO	"log_radio"	/* radio-related messages */
+#define LOGGER_LOG_EVENTS	"log_events"	/* system/hardware events */
+#define LOGGER_LOG_SYSTEM	"log_system"	/* system/framework messages */
+#define LOGGER_LOG_MAIN		"log_main"	/* everything else */
+
+#define LOGGER_ENTRY_MAX_LEN		(4*1024)
+#define LOGGER_ENTRY_MAX_PAYLOAD	\
+	(LOGGER_ENTRY_MAX_LEN - sizeof(struct logger_entry))
+
+#define __LOGGERIO	0xAE
+
+#define LOGGER_GET_LOG_BUF_SIZE		_IO(__LOGGERIO, 1) /* size of log */
+#define LOGGER_GET_LOG_LEN		_IO(__LOGGERIO, 2) /* used log len */
+#define LOGGER_GET_NEXT_ENTRY_LEN	_IO(__LOGGERIO, 3) /* next entry len */
+#define LOGGER_FLUSH_LOG		_IO(__LOGGERIO, 4) /* flush log */
+
+#endif /* _LINUX_LOGGER_H */

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-21 22:59 RFC: android logger feedback request Tim Bird
@ 2011-12-21 23:19 ` Greg KH
  2011-12-22  0:18   ` john stultz
  2011-12-22  0:36   ` Tim Bird
  2011-12-22  0:59 ` David Brown
  2011-12-29  0:39 ` Andrew Morton
  2 siblings, 2 replies; 53+ messages in thread
From: Greg KH @ 2011-12-21 23:19 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
> Hi all,
> 
> I'm looking for feedback on the Android logger code, to see what
> it would take to make this code acceptable for inclusion in
> the mainline kernel.
> 
> Information about the features of Android logging system
> can be found at: http://elinux.org/Android_Logging_System
> 
> This system creates a new system-wide logging service, in
> the kernel, for user-space message.  It is more comparable
> to syslog than to the kernel log buffer, as it holds only
> user-space messages.  It is optimized for write
> performance, since most of the time the log is written to
> and never read.  It creates multiple log channels, to prevent
> an abundance of log messages in one channel from overwriting
> messages in another channel.  The log channels have sizes
> fixed at kernel compile-time.
> 
> Log messages are stored in very simple in-kernel buffers, that
> overflow old messages upon wrapping.  A fixed set of attributes
> (pid, tid, timestamp and message), is kept for each message.
> By convention, Android puts a message priority and context tag
> into each message.
> 
> In Android, this system uses a fixed set of device nodes with
> well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
> and /dev/log/system.
> 
> Operations on the log are done via a character device, using
> standard file operations and some ioctls.
> 
> The code for this is below (I've moved it from linux-next
> drivers/staging/android for my own testing).
> 
> Please let me know what issues you see with this code.

That all describes the current code, but you haven't described what's
wrong with the existing syslog interface that requires this new driver
to be written.  And why can't the existing interface be fixed to address
these (potential) shortcomings?

> One specific question I have is where is the most appropriate
> place for this code to live, in the kernel source tree?
> Other embedded systems might want to use this system (it
> is simpler than syslog, and superior in some ways), so I don't
> think it should remain in an android-specific directory.

What way is it superior?  Again, why not extend syslog?  Why not "fix"
syslog if this really is a superior thing?  How does this tie into Kay
and Lennard's proposal for work in this area?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-21 23:19 ` Greg KH
@ 2011-12-22  0:18   ` john stultz
  2011-12-22  0:27     ` Greg KH
  2011-12-22  0:42     ` Tim Bird
  2011-12-22  0:36   ` Tim Bird
  1 sibling, 2 replies; 53+ messages in thread
From: john stultz @ 2011-12-22  0:18 UTC (permalink / raw)
  To: Greg KH
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
> On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
> > Hi all,
> > 
> > I'm looking for feedback on the Android logger code, to see what
> > it would take to make this code acceptable for inclusion in
> > the mainline kernel.
> > 
> > Information about the features of Android logging system
> > can be found at: http://elinux.org/Android_Logging_System
> > 
> > This system creates a new system-wide logging service, in
> > the kernel, for user-space message.  It is more comparable
> > to syslog than to the kernel log buffer, as it holds only
> > user-space messages.  It is optimized for write
> > performance, since most of the time the log is written to
> > and never read.  It creates multiple log channels, to prevent
> > an abundance of log messages in one channel from overwriting
> > messages in another channel.  The log channels have sizes
> > fixed at kernel compile-time.
> > 
> > Log messages are stored in very simple in-kernel buffers, that
> > overflow old messages upon wrapping.  A fixed set of attributes
> > (pid, tid, timestamp and message), is kept for each message.
> > By convention, Android puts a message priority and context tag
> > into each message.
> > 
> > In Android, this system uses a fixed set of device nodes with
> > well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
> > and /dev/log/system.
> > 
> > Operations on the log are done via a character device, using
> > standard file operations and some ioctls.
> > 
> > The code for this is below (I've moved it from linux-next
> > drivers/staging/android for my own testing).
> > 
> > Please let me know what issues you see with this code.
> 
> That all describes the current code, but you haven't described what's
> wrong with the existing syslog interface that requires this new driver
> to be written.  And why can't the existing interface be fixed to address
> these (potential) shortcomings?
> 
> > One specific question I have is where is the most appropriate
> > place for this code to live, in the kernel source tree?
> > Other embedded systems might want to use this system (it
> > is simpler than syslog, and superior in some ways), so I don't
> > think it should remain in an android-specific directory.
> 
> What way is it superior?  Again, why not extend syslog?  Why not "fix"
> syslog if this really is a superior thing?  How does this tie into Kay
> and Lennard's proposal for work in this area?

There is also some overlap functionality wise with pstore as well, as I
believe the logger is used as a known location in memory where messages
can be fetched from after a kernel panic or crash.

thanks
-john

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:18   ` john stultz
@ 2011-12-22  0:27     ` Greg KH
  2011-12-22  0:47       ` john stultz
  2011-12-22  0:42     ` Tim Bird
  1 sibling, 1 reply; 53+ messages in thread
From: Greg KH @ 2011-12-22  0:27 UTC (permalink / raw)
  To: john stultz
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 04:18:11PM -0800, john stultz wrote:
> On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
> > On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
> > > Hi all,
> > > 
> > > I'm looking for feedback on the Android logger code, to see what
> > > it would take to make this code acceptable for inclusion in
> > > the mainline kernel.
> > > 
> > > Information about the features of Android logging system
> > > can be found at: http://elinux.org/Android_Logging_System
> > > 
> > > This system creates a new system-wide logging service, in
> > > the kernel, for user-space message.  It is more comparable
> > > to syslog than to the kernel log buffer, as it holds only
> > > user-space messages.  It is optimized for write
> > > performance, since most of the time the log is written to
> > > and never read.  It creates multiple log channels, to prevent
> > > an abundance of log messages in one channel from overwriting
> > > messages in another channel.  The log channels have sizes
> > > fixed at kernel compile-time.
> > > 
> > > Log messages are stored in very simple in-kernel buffers, that
> > > overflow old messages upon wrapping.  A fixed set of attributes
> > > (pid, tid, timestamp and message), is kept for each message.
> > > By convention, Android puts a message priority and context tag
> > > into each message.
> > > 
> > > In Android, this system uses a fixed set of device nodes with
> > > well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
> > > and /dev/log/system.
> > > 
> > > Operations on the log are done via a character device, using
> > > standard file operations and some ioctls.
> > > 
> > > The code for this is below (I've moved it from linux-next
> > > drivers/staging/android for my own testing).
> > > 
> > > Please let me know what issues you see with this code.
> > 
> > That all describes the current code, but you haven't described what's
> > wrong with the existing syslog interface that requires this new driver
> > to be written.  And why can't the existing interface be fixed to address
> > these (potential) shortcomings?
> > 
> > > One specific question I have is where is the most appropriate
> > > place for this code to live, in the kernel source tree?
> > > Other embedded systems might want to use this system (it
> > > is simpler than syslog, and superior in some ways), so I don't
> > > think it should remain in an android-specific directory.
> > 
> > What way is it superior?  Again, why not extend syslog?  Why not "fix"
> > syslog if this really is a superior thing?  How does this tie into Kay
> > and Lennard's proposal for work in this area?
> 
> There is also some overlap functionality wise with pstore as well, as I
> believe the logger is used as a known location in memory where messages
> can be fetched from after a kernel panic or crash.

That sounds like a major overlap, can't pstore be used for this
functionality today instead of the logger code?

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:27     ` Greg KH
@ 2011-12-22  0:47       ` john stultz
  2011-12-22  1:09         ` john stultz
  0 siblings, 1 reply; 53+ messages in thread
From: john stultz @ 2011-12-22  0:47 UTC (permalink / raw)
  To: Greg KH
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, 2011-12-21 at 16:27 -0800, Greg KH wrote:
> On Wed, Dec 21, 2011 at 04:18:11PM -0800, john stultz wrote:
> > On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
> > > What way is it superior?  Again, why not extend syslog?  Why not "fix"
> > > syslog if this really is a superior thing?  How does this tie into Kay
> > > and Lennard's proposal for work in this area?
> > 
> > There is also some overlap functionality wise with pstore as well, as I
> > believe the logger is used as a known location in memory where messages
> > can be fetched from after a kernel panic or crash.
> 
> That sounds like a major overlap, can't pstore be used for this
> functionality today instead of the logger code?

It may be able to suffice for some portion of the functionality, it was
suggested that we consider doing something like try to make logger use
pstore as a backend.  I'm not really familiar with pstore, but I don't
know if its able to preserve messages from userland, so it may not
really be sufficient in of itself.

Also, if I understand correctly, logger also is able to store messages
from hardware modems or other hardware that run their own micro-os in a
central location. This may be closer to the pstore mce bits, but its not
clear yet to me how much overlap is there.

But I've not yet really spent too much time looking at logger yet, so I
can't really comment too strongly here. Just wanted to make sure folks
have looked at all the other similar work.

thanks
-john

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:47       ` john stultz
@ 2011-12-22  1:09         ` john stultz
  0 siblings, 0 replies; 53+ messages in thread
From: john stultz @ 2011-12-22  1:09 UTC (permalink / raw)
  To: Greg KH
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, 2011-12-21 at 16:47 -0800, john stultz wrote:
> Also, if I understand correctly, logger also is able to store messages
> from hardware modems or other hardware that run their own micro-os in a
> central location. This may be closer to the pstore mce bits, but its not
> clear yet to me how much overlap is there.

Looking at the driver, this point seems to be inaccurate as well.  So
sorry for adding to any confusion here. 

thanks
-john

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:18   ` john stultz
  2011-12-22  0:27     ` Greg KH
@ 2011-12-22  0:42     ` Tim Bird
  2011-12-22  0:49       ` john stultz
  1 sibling, 1 reply; 53+ messages in thread
From: Tim Bird @ 2011-12-22  0:42 UTC (permalink / raw)
  To: john stultz
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On 12/21/2011 04:18 PM, john stultz wrote:
> On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
>> On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
>>> Hi all,
>>>
>>> I'm looking for feedback on the Android logger code, to see what
>>> it would take to make this code acceptable for inclusion in
>>> the mainline kernel.
>>>
>>> Information about the features of Android logging system
>>> can be found at: http://elinux.org/Android_Logging_System
>>>
>>> This system creates a new system-wide logging service, in
>>> the kernel, for user-space message.  It is more comparable
>>> to syslog than to the kernel log buffer, as it holds only
>>> user-space messages.  It is optimized for write
>>> performance, since most of the time the log is written to
>>> and never read.  It creates multiple log channels, to prevent
>>> an abundance of log messages in one channel from overwriting
>>> messages in another channel.  The log channels have sizes
>>> fixed at kernel compile-time.
>>>
>>> Log messages are stored in very simple in-kernel buffers, that
>>> overflow old messages upon wrapping.  A fixed set of attributes
>>> (pid, tid, timestamp and message), is kept for each message.
>>> By convention, Android puts a message priority and context tag
>>> into each message.
>>>
>>> In Android, this system uses a fixed set of device nodes with
>>> well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
>>> and /dev/log/system.
>>>
>>> Operations on the log are done via a character device, using
>>> standard file operations and some ioctls.
>>>
>>> The code for this is below (I've moved it from linux-next
>>> drivers/staging/android for my own testing).
>>>
>>> Please let me know what issues you see with this code.
>>
>> That all describes the current code, but you haven't described what's
>> wrong with the existing syslog interface that requires this new driver
>> to be written.  And why can't the existing interface be fixed to address
>> these (potential) shortcomings?
>>
>>> One specific question I have is where is the most appropriate
>>> place for this code to live, in the kernel source tree?
>>> Other embedded systems might want to use this system (it
>>> is simpler than syslog, and superior in some ways), so I don't
>>> think it should remain in an android-specific directory.
>>
>> What way is it superior?  Again, why not extend syslog?  Why not "fix"
>> syslog if this really is a superior thing?  How does this tie into Kay
>> and Lennard's proposal for work in this area?
> 
> There is also some overlap functionality wise with pstore as well, as I
> believe the logger is used as a known location in memory where messages
> can be fetched from after a kernel panic or crash.

I don't know if that's true or not.  I think you may be thinking
of Android's RAM console feature.  If there's a way to save
application messages over a reboot using the logger buffers, I'm
unfamiliar with it.
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:42     ` Tim Bird
@ 2011-12-22  0:49       ` john stultz
  2011-12-22  1:00         ` john stultz
  0 siblings, 1 reply; 53+ messages in thread
From: john stultz @ 2011-12-22  0:49 UTC (permalink / raw)
  To: Tim Bird
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, 2011-12-21 at 16:42 -0800, Tim Bird wrote:
> On 12/21/2011 04:18 PM, john stultz wrote:
> > On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
> >> On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
> >>> Hi all,
> >>>
> >>> I'm looking for feedback on the Android logger code, to see what
> >>> it would take to make this code acceptable for inclusion in
> >>> the mainline kernel.
> >>>
> >>> Information about the features of Android logging system
> >>> can be found at: http://elinux.org/Android_Logging_System
> >>>
> >>> This system creates a new system-wide logging service, in
> >>> the kernel, for user-space message.  It is more comparable
> >>> to syslog than to the kernel log buffer, as it holds only
> >>> user-space messages.  It is optimized for write
> >>> performance, since most of the time the log is written to
> >>> and never read.  It creates multiple log channels, to prevent
> >>> an abundance of log messages in one channel from overwriting
> >>> messages in another channel.  The log channels have sizes
> >>> fixed at kernel compile-time.
> >>>
> >>> Log messages are stored in very simple in-kernel buffers, that
> >>> overflow old messages upon wrapping.  A fixed set of attributes
> >>> (pid, tid, timestamp and message), is kept for each message.
> >>> By convention, Android puts a message priority and context tag
> >>> into each message.
> >>>
> >>> In Android, this system uses a fixed set of device nodes with
> >>> well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
> >>> and /dev/log/system.
> >>>
> >>> Operations on the log are done via a character device, using
> >>> standard file operations and some ioctls.
> >>>
> >>> The code for this is below (I've moved it from linux-next
> >>> drivers/staging/android for my own testing).
> >>>
> >>> Please let me know what issues you see with this code.
> >>
> >> That all describes the current code, but you haven't described what's
> >> wrong with the existing syslog interface that requires this new driver
> >> to be written.  And why can't the existing interface be fixed to address
> >> these (potential) shortcomings?
> >>
> >>> One specific question I have is where is the most appropriate
> >>> place for this code to live, in the kernel source tree?
> >>> Other embedded systems might want to use this system (it
> >>> is simpler than syslog, and superior in some ways), so I don't
> >>> think it should remain in an android-specific directory.
> >>
> >> What way is it superior?  Again, why not extend syslog?  Why not "fix"
> >> syslog if this really is a superior thing?  How does this tie into Kay
> >> and Lennard's proposal for work in this area?
> > 
> > There is also some overlap functionality wise with pstore as well, as I
> > believe the logger is used as a known location in memory where messages
> > can be fetched from after a kernel panic or crash.
> 
> I don't know if that's true or not.  I think you may be thinking
> of Android's RAM console feature.  If there's a way to save
> application messages over a reboot using the logger buffers, I'm
> unfamiliar with it.

I may very well be confusing things. I had thought apanic pulled the
logger data, but maybe I'm wrong.

thanks
-john


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:49       ` john stultz
@ 2011-12-22  1:00         ` john stultz
  0 siblings, 0 replies; 53+ messages in thread
From: john stultz @ 2011-12-22  1:00 UTC (permalink / raw)
  To: Tim Bird
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, 2011-12-21 at 16:49 -0800, john stultz wrote:
> On Wed, 2011-12-21 at 16:42 -0800, Tim Bird wrote:
> > On 12/21/2011 04:18 PM, john stultz wrote:
> > > On Wed, 2011-12-21 at 15:19 -0800, Greg KH wrote:
> > >> On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:
> > >>> Hi all,
> > >>>
> > >>> I'm looking for feedback on the Android logger code, to see what
> > >>> it would take to make this code acceptable for inclusion in
> > >>> the mainline kernel.
> > >>>
> > >>> Information about the features of Android logging system
> > >>> can be found at: http://elinux.org/Android_Logging_System
> > >>>
> > >>> This system creates a new system-wide logging service, in
> > >>> the kernel, for user-space message.  It is more comparable
> > >>> to syslog than to the kernel log buffer, as it holds only
> > >>> user-space messages.  It is optimized for write
> > >>> performance, since most of the time the log is written to
> > >>> and never read.  It creates multiple log channels, to prevent
> > >>> an abundance of log messages in one channel from overwriting
> > >>> messages in another channel.  The log channels have sizes
> > >>> fixed at kernel compile-time.
> > >>>
> > >>> Log messages are stored in very simple in-kernel buffers, that
> > >>> overflow old messages upon wrapping.  A fixed set of attributes
> > >>> (pid, tid, timestamp and message), is kept for each message.
> > >>> By convention, Android puts a message priority and context tag
> > >>> into each message.
> > >>>
> > >>> In Android, this system uses a fixed set of device nodes with
> > >>> well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
> > >>> and /dev/log/system.
> > >>>
> > >>> Operations on the log are done via a character device, using
> > >>> standard file operations and some ioctls.
> > >>>
> > >>> The code for this is below (I've moved it from linux-next
> > >>> drivers/staging/android for my own testing).
> > >>>
> > >>> Please let me know what issues you see with this code.
> > >>
> > >> That all describes the current code, but you haven't described what's
> > >> wrong with the existing syslog interface that requires this new driver
> > >> to be written.  And why can't the existing interface be fixed to address
> > >> these (potential) shortcomings?
> > >>
> > >>> One specific question I have is where is the most appropriate
> > >>> place for this code to live, in the kernel source tree?
> > >>> Other embedded systems might want to use this system (it
> > >>> is simpler than syslog, and superior in some ways), so I don't
> > >>> think it should remain in an android-specific directory.
> > >>
> > >> What way is it superior?  Again, why not extend syslog?  Why not "fix"
> > >> syslog if this really is a superior thing?  How does this tie into Kay
> > >> and Lennard's proposal for work in this area?
> > > 
> > > There is also some overlap functionality wise with pstore as well, as I
> > > believe the logger is used as a known location in memory where messages
> > > can be fetched from after a kernel panic or crash.
> > 
> > I don't know if that's true or not.  I think you may be thinking
> > of Android's RAM console feature.  If there's a way to save
> > application messages over a reboot using the logger buffers, I'm
> > unfamiliar with it.
> 
> I may very well be confusing things. I had thought apanic pulled the
> logger data, but maybe I'm wrong.

Looks like I am wrong. Apologies. Thanks for pointing my error out.
-john

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-21 23:19 ` Greg KH
  2011-12-22  0:18   ` john stultz
@ 2011-12-22  0:36   ` Tim Bird
  2011-12-22  0:51     ` Greg KH
                       ` (2 more replies)
  1 sibling, 3 replies; 53+ messages in thread
From: Tim Bird @ 2011-12-22  0:36 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On 12/21/2011 03:19 PM, Greg KH wrote:
> That all describes the current code, but you haven't described what's
> wrong with the existing syslog interface that requires this new driver
> to be written.  And why can't the existing interface be fixed to address
> these (potential) shortcomings?


>> One specific question I have is where is the most appropriate
>> place for this code to live, in the kernel source tree?
>> Other embedded systems might want to use this system (it
>> is simpler than syslog, and superior in some ways), so I don't
>> think it should remain in an android-specific directory.
> 
> What way is it superior?

Here are some ways that this code is superior to syslog:

Speed and overhead
------------------
syslogd requires a separate user-space process to manage
the log area, where this code does not.  The overhead
for any user-space process is at least 16K, and usually much
more than this (not including the size of the log storage
area).  On one of my embedded systems, where syslogd is
provided by busybox, the unique set size for syslogd
is 96K.  This code, built in to the Linux kernel is less
than 4K code and data (again, not including the size of
the log storage area).

To deliver a message to syslog requires a socket operation
and a few context switches.  With the logger code,
the operation is a file operation (writev) to kernel memory,
with only one context switch into and out of the kernel.

The open and write paths through the Linux kernel are
arguably more optimized than the networking paths.
This is a consideration since the log operations should
optimized for the "create-a-log-entry" case (the open/write
path), since logs are mostly written and almost never read
on production devices.

No dependence on persistent storage
-----------------------------------
syslogd requires either persistent storage to store the log,
or a network connection to an outside device.  Being
purely memory-based, the logger requires neither of these.
With logger, persistence of the log data is left to the
implementor.  In Android, the data is delivered over a USB
connection via adb or to the console as ascii text, using
logcat.  In other embedded systems, other mechanisms might
be used if long-term storage of the messages is desired.
With logger, there is no automatic notion of on-device
persistent storage for the log data.

No dependence on networking kernel code
---------------------------------------
The syslog communication mechanism requires sockets.  This
prevents one from configuring the kernel with no networking
support, which is sometimes done in embedded systems to save
size.

Simpler constraint on log size
------------------------------
The busybox syslog daemon uses a log rotation feature to constrain
the size of the log in persistent storage.  This is overly
cumbersome in both speed and complexity compared to the logger's
simple ring buffer.

Licensing
---------
The code implementing library and command line tool support
for this logger (in user space) is available under an Apache license,
rather than a GPL license, which is desirable for some vendors.

> Again, why not extend syslog?  Why not "fix"
> syslog if this really is a superior thing?

"extend" syslog would not really the the right
direction.  This system is simpler than syslog,
while simultaneously having at least one valuable
extra feature (separate log channels).

syslog has a standard set of interfaces in libc
and various syslogd implementations, which are
heavier weight in nature than what is provided here.
It is unclear that an attempt at reducing these attributes
(such as memory overhead, number of context switches,
dependence on persistent storage, and socket utilization) would
yield a system substantially different from the logger.

> How does this tie into Kay
> and Lennard's proposal for work in this area?

It does not tie in at all.

Kay and Lennart's proposal for "the Journal" creates
a more complex system than syslog, and handles a number
of new interesting use cases.  This system is on the
opposite side of the spectrum from the journal, towards
simplicity and reduced footprint and overhead.

 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:36   ` Tim Bird
@ 2011-12-22  0:51     ` Greg KH
  2011-12-22  1:32       ` Tim Bird
  2011-12-22  1:20     ` NeilBrown
  2011-12-22  4:42     ` david
  2 siblings, 1 reply; 53+ messages in thread
From: Greg KH @ 2011-12-22  0:51 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
> On 12/21/2011 03:19 PM, Greg KH wrote:
> > That all describes the current code, but you haven't described what's
> > wrong with the existing syslog interface that requires this new driver
> > to be written.  And why can't the existing interface be fixed to address
> > these (potential) shortcomings?
> 
> 
> >> One specific question I have is where is the most appropriate
> >> place for this code to live, in the kernel source tree?
> >> Other embedded systems might want to use this system (it
> >> is simpler than syslog, and superior in some ways), so I don't
> >> think it should remain in an android-specific directory.
> > 
> > What way is it superior?
> 
> Here are some ways that this code is superior to syslog:
> 
> Speed and overhead
> ------------------
> syslogd requires a separate user-space process to manage
> the log area, where this code does not.  The overhead
> for any user-space process is at least 16K, and usually much
> more than this (not including the size of the log storage
> area).  On one of my embedded systems, where syslogd is
> provided by busybox, the unique set size for syslogd
> is 96K.  This code, built in to the Linux kernel is less
> than 4K code and data (again, not including the size of
> the log storage area).

Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
syscall we have.

This character interface seems very close to the syslog(2) api, but just
done in a character interface, with ioctls, which also require userspace
tools to manage properly, so I fail to see the big "gain" here.

What am I missing?

> To deliver a message to syslog requires a socket operation
> and a few context switches.

For syslog(2)?  What socket?

> With the logger code,
> the operation is a file operation (writev) to kernel memory,
> with only one context switch into and out of the kernel.

I think we are thinking of different apis here...

> No dependence on persistent storage
> -----------------------------------
> syslogd requires either persistent storage to store the log,
> or a network connection to an outside device.  Being
> purely memory-based, the logger requires neither of these.
> With logger, persistence of the log data is left to the
> implementor.  In Android, the data is delivered over a USB
> connection via adb or to the console as ascii text, using
> logcat.  In other embedded systems, other mechanisms might
> be used if long-term storage of the messages is desired.
> With logger, there is no automatic notion of on-device
> persistent storage for the log data.
> 
> No dependence on networking kernel code
> ---------------------------------------
> The syslog communication mechanism requires sockets.  This
> prevents one from configuring the kernel with no networking
> support, which is sometimes done in embedded systems to save
> size.
> 
> Simpler constraint on log size
> ------------------------------
> The busybox syslog daemon uses a log rotation feature to constrain
> the size of the log in persistent storage.  This is overly
> cumbersome in both speed and complexity compared to the logger's
> simple ring buffer.
> 
> Licensing
> ---------
> The code implementing library and command line tool support
> for this logger (in user space) is available under an Apache license,
> rather than a GPL license, which is desirable for some vendors.

Ok, care to rewrite all of this when thinking of syslog(2), the kernel
system call, and not syslogd, which is what I was not comparing this
kernel driver to at all?

> > How does this tie into Kay
> > and Lennard's proposal for work in this area?
> 
> It does not tie in at all.
> 
> Kay and Lennart's proposal for "the Journal" creates
> a more complex system than syslog, and handles a number
> of new interesting use cases.  This system is on the
> opposite side of the spectrum from the journal, towards
> simplicity and reduced footprint and overhead.

No, it's much the same, for the kernel side, which is what I am
referring to here.

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:51     ` Greg KH
@ 2011-12-22  1:32       ` Tim Bird
  2011-12-22  1:47         ` Greg KH
  0 siblings, 1 reply; 53+ messages in thread
From: Tim Bird @ 2011-12-22  1:32 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On 12/21/2011 04:51 PM, Greg KH wrote:
> On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
>> On 12/21/2011 03:19 PM, Greg KH wrote:

> Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
> syscall we have.

OK - switching gears.  Since the kernel log buffer isn't normally
used to store use-space messages, I thought you were referring
to syslog(3) and the associated (logger(1) and syslogd(8)).

> This character interface seems very close to the syslog(2) api, but just
> done in a character interface, with ioctls, which also require userspace
> tools to manage properly, so I fail to see the big "gain" here.
> 
> What am I missing?

syslog(2) would more aptly be named klogctrl() (and it is in glibc)

There's currently no operation in sys_sylog (the kernel function
implementing syslog(2)) for writing to the log.  The write operation
to the kernel log buffer is also done via a character interface
/dev/kmsg (via code in drivers/char/mem.c)  This is actually very
similar to what the Android logger code does.

But while the kernel log buffer has lots of similarities to the Android logger
there are some key differences which I think are important to isolate
from a user-space logging system.

Here's a stream-of-consciousness dump of the differences:

The printk interface in the kernel is almost always automatically drained
to the device console, at the time of the printk (after the message is dropped
into the log buffer itself).  This extra operation is not needed for most
application-level messages that go into the log, and incurs extra overhead
in the log buffer code.

The printk code is especially designed to be called from within any kernel
context (including interrupt code), and so has some locking avoidance code
paths and complexity that are not needed for code which handles strictly
user-space messages.

Oddly enough, the printk code paths in the kernel can end up doing
a fair amount of print formatting, which can be time-consuming.  The code
path in kmsg_writev() contains at least one kmalloc, which could fail
when running out of memory.  The code path in the logger is much simpler,
consisting really of only a data copy.

Timestamping is not automatically appended to messages going into the
kernel log buffer (but they can be optionally pre-pended, with control
configurable at runtime).  They are represented
as ASCII text, which consumes a little more than twice the overhead of
a 32-bit binary field.  PID and TID are not automatically preserved in
the log. The kernel keeps it's priority in text also, and has no convention
for contextual tagging.  I'm not sure that we should change the
kernel log buffer to support structured binary data, in addition to the
free-form ASCII data that the kernel uses now.

The kernel log buffer does not support separate channels for different
classes of log messages (indeed, there is only one channel, and it has
kernel messages).  A new system call (or some backwards-compatible tweak
to the existing syslog(2) call would have to be added to support
a channel ID in order to support this.

There *are* some benefits to intermingling the kernel log messages and the
user-space log messages, but I think they are outweighed by the
value in keeping these systems separate.  There might be the opportunity
for code reuse, but I suspect we'd end up with about the same amount
of code increase overall (and possibly an additional syscall), and add
some unneeded complexity to the prink code path to accomplish it.

I just read Neil Brown's suggestion for doing this via a filesystem rather
than a char device, and it's interesting.  I'll respond to that separately.

 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:32       ` Tim Bird
@ 2011-12-22  1:47         ` Greg KH
  2011-12-22  2:12           ` Tim Bird
  2011-12-22  2:34           ` Kay Sievers
  0 siblings, 2 replies; 53+ messages in thread
From: Greg KH @ 2011-12-22  1:47 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 05:32:36PM -0800, Tim Bird wrote:
> On 12/21/2011 04:51 PM, Greg KH wrote:
> > On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
> >> On 12/21/2011 03:19 PM, Greg KH wrote:
> 
> > Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
> > syscall we have.
> 
> OK - switching gears.  Since the kernel log buffer isn't normally
> used to store use-space messages, I thought you were referring
> to syslog(3) and the associated (logger(1) and syslogd(8)).

The kernel log buffer has been storing userspace messages for a while
now, look at your boot log on the latest Fedora and openSUSE releases
(or any other distro using systemd for booting).

> > This character interface seems very close to the syslog(2) api, but just
> > done in a character interface, with ioctls, which also require userspace
> > tools to manage properly, so I fail to see the big "gain" here.
> > 
> > What am I missing?
> 
> syslog(2) would more aptly be named klogctrl() (and it is in glibc)

Maybe, but this is a kernel mailing list, we don't worry about glibc
much here :)

> There's currently no operation in sys_sylog (the kernel function
> implementing syslog(2)) for writing to the log.  The write operation
> to the kernel log buffer is also done via a character interface
> /dev/kmsg (via code in drivers/char/mem.c)  This is actually very
> similar to what the Android logger code does.

Again, see above, this has been in the kernel for quite a while now...

> But while the kernel log buffer has lots of similarities to the Android logger
> there are some key differences which I think are important to isolate
> from a user-space logging system.
> 
> Here's a stream-of-consciousness dump of the differences:
> 
> The printk interface in the kernel is almost always automatically drained
> to the device console, at the time of the printk (after the message is dropped
> into the log buffer itself).  This extra operation is not needed for most
> application-level messages that go into the log, and incurs extra overhead
> in the log buffer code.
> 
> The printk code is especially designed to be called from within any kernel
> context (including interrupt code), and so has some locking avoidance code
> paths and complexity that are not needed for code which handles strictly
> user-space messages.
> 
> Oddly enough, the printk code paths in the kernel can end up doing
> a fair amount of print formatting, which can be time-consuming.  The code
> path in kmsg_writev() contains at least one kmalloc, which could fail
> when running out of memory.  The code path in the logger is much simpler,
> consisting really of only a data copy.
> 
> Timestamping is not automatically appended to messages going into the
> kernel log buffer (but they can be optionally pre-pended, with control
> configurable at runtime).  They are represented
> as ASCII text, which consumes a little more than twice the overhead of
> a 32-bit binary field.  PID and TID are not automatically preserved in
> the log. The kernel keeps it's priority in text also, and has no convention
> for contextual tagging.  I'm not sure that we should change the
> kernel log buffer to support structured binary data, in addition to the
> free-form ASCII data that the kernel uses now.
> 
> The kernel log buffer does not support separate channels for different
> classes of log messages (indeed, there is only one channel, and it has
> kernel messages).  A new system call (or some backwards-compatible tweak
> to the existing syslog(2) call would have to be added to support
> a channel ID in order to support this.
> 
> There *are* some benefits to intermingling the kernel log messages and the
> user-space log messages, but I think they are outweighed by the
> value in keeping these systems separate.  There might be the opportunity
> for code reuse, but I suspect we'd end up with about the same amount
> of code increase overall (and possibly an additional syscall), and add
> some unneeded complexity to the prink code path to accomplish it.

Again, please see what we are already doing in the kernel and userspace,
I think a lot of the above is already implemented.

Which brings me back to my original question, what does this code do,
that is not already present in the kernel, and why is a totally new
interface being proposed for it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:47         ` Greg KH
@ 2011-12-22  2:12           ` Tim Bird
  2011-12-22  3:44             ` Kay Sievers
                               ` (2 more replies)
  2011-12-22  2:34           ` Kay Sievers
  1 sibling, 3 replies; 53+ messages in thread
From: Tim Bird @ 2011-12-22  2:12 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On 12/21/2011 05:47 PM, Greg KH wrote:
> On Wed, Dec 21, 2011 at 05:32:36PM -0800, Tim Bird wrote:
>> On 12/21/2011 04:51 PM, Greg KH wrote:
>>> On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
>>>> On 12/21/2011 03:19 PM, Greg KH wrote:
>>
>>> Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
>>> syscall we have.
>>
>> OK - switching gears.  Since the kernel log buffer isn't normally
>> used to store use-space messages, I thought you were referring
>> to syslog(3) and the associated (logger(1) and syslogd(8)).
> 
> The kernel log buffer has been storing userspace messages for a while
> now, look at your boot log on the latest Fedora and openSUSE releases
> (or any other distro using systemd for booting).

Sorry - I don't have a distro that uses systemd.
I'm completely unaware of this use of the kernel log for user-space
message logging.  No wonder Lennart and Kay
are so hot on making a new logging system.  It seems sub-optimal
to me, to intermingle a structured log with a pure-ASCII log.

> Again, please see what we are already doing in the kernel and userspace,
> I think a lot of the above is already implemented.

I don't know what systemd has got going on in user-space.  I'm looking
at a very recent kernel, and I see no support for multiple log channels,
or an optimized open/write path.

Maybe Lennart could save me some time doing this research.

Lennart,

How does current systemd prevent user-space messages from crowding
out kernel messages?  (or vice-versa).  Since you've been doing
a lot of work on logging, do you have any existing metrics for logging
overhead via the kernel log buffer?

> Which brings me back to my original question, what does this code do,
> that is not already present in the kernel, and why is a totally new
> interface being proposed for it?

At the least, it supports multiple log channels.  Quite frankly my mind
has been blown a bit by the suggestion to overload the kernel log buffer
with user-space messages.  I would never have gone that route. But I'll have
to find out more about this systemd thing to answer your question.

Secondly, this is not a particularly new or radical interface. It is new
to legacy Linux, but it's been in the Android Linux kernel for some years
now, and it has worked well.
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  2:12           ` Tim Bird
@ 2011-12-22  3:44             ` Kay Sievers
  2011-12-22  3:45             ` Greg KH
  2011-12-22  3:47             ` Greg KH
  2 siblings, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-12-22  3:44 UTC (permalink / raw)
  To: Tim Bird
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Lennart Poettering

On Thu, Dec 22, 2011 at 03:12, Tim Bird <tim.bird@am.sony.com> wrote:
> On 12/21/2011 05:47 PM, Greg KH wrote:

>> Again, please see what we are already doing in the kernel and userspace,
>> I think a lot of the above is already implemented.
>
> I don't know what systemd has got going on in user-space.  I'm looking
> at a very recent kernel, and I see no support for multiple log channels,
> or an optimized open/write path.

> How does current systemd prevent user-space messages from crowding
> out kernel messages?  (or vice-versa).

It intentionally doesn't. Splitting messages in separate stores is not
what we want in the general use case.

Syslog has the 'facility' for that, and it works just good enough. We
think that filtering in the receiver is more flexible and useful, than
to have multiple stores of pretty much the same type.

If you have multiple stores, you always have to fight the ordering
problem, and that really creates problems sometimes.

We are very convinced, that there should be a single entry only for
logging, and not multiple, and the filtering should happen on the
receiver side.

> Since you've been doing
> a lot of work on logging, do you have any existing metrics for logging
> overhead via the kernel log buffer?

Which overhead? I don't really think that is too interesting. Writing
to a device node or socket and copy it to the kernel's ring buffer,
not sure if it has an interesting potential of being optimized, if you
don't use fancy consoles and such.

We have no numbers, but until I see numbers of a real world problem, I
would expect there is no actual problem.

>> Which brings me back to my original question, what does this code do,
>> that is not already present in the kernel, and why is a totally new
>> interface being proposed for it?
>
> At the least, it supports multiple log channels.  Quite frankly my mind
> has been blown a bit by the suggestion to overload the kernel log buffer
> with user-space messages.  I would never have gone that route. But I'll have
> to find out more about this systemd thing to answer your question.

It's just fine. And honestly, for early boot and debugging
interleaving is exactly what you want, because userspace is triggering
the actions in the kernel, and you want to see that in the same
channel.

Also you can use netconsole and all the weird stuff people support for
the kernel and get the messages from early boot userspace including
intramfs there.

> Secondly, this is not a particularly new or radical interface. It is new
> to legacy Linux, but it's been in the Android Linux kernel for some years
> now, and it has worked well.

Sure, it's understandable, it sounds like something that made sense
when it was invented, and which probably still make a lot of sense for
Android today. But I don't think it should be discussed as a general
purpose Linux use case, or something that replaces kmsg. It does not
look like too interesting for general needs today outside of Android.

We probably just want the simple stuff we have today, or something
that really solves the structured logging problem in a proper way.
Something slightly better, with almost the same model and limitations,
than what we have already, is probably not that important enough to
make changes.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  2:12           ` Tim Bird
  2011-12-22  3:44             ` Kay Sievers
@ 2011-12-22  3:45             ` Greg KH
  2011-12-22  3:47             ` Greg KH
  2 siblings, 0 replies; 53+ messages in thread
From: Greg KH @ 2011-12-22  3:45 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote:
> Secondly, this is not a particularly new or radical interface. It is new
> to legacy Linux, but it's been in the Android Linux kernel for some years
> now, and it has worked well.

Sorry, but you know full well that is not a justification for _any_
chunk of code to be included in the main kernel tree.  Staging is fine,
but to add a new kernel/user api that looks at first glance, to merely
duplicate the existing in-kernel apis needs a very good explaination for
why it needs to be included.

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  2:12           ` Tim Bird
  2011-12-22  3:44             ` Kay Sievers
  2011-12-22  3:45             ` Greg KH
@ 2011-12-22  3:47             ` Greg KH
  2011-12-22  4:12               ` Kay Sievers
  2 siblings, 1 reply; 53+ messages in thread
From: Greg KH @ 2011-12-22  3:47 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote:
> > Again, please see what we are already doing in the kernel and userspace,
> > I think a lot of the above is already implemented.
> 
> I don't know what systemd has got going on in user-space.  I'm looking
> at a very recent kernel, and I see no support for multiple log channels,
> or an optimized open/write path.

How is the existing syslog read path not "optimized"?  What kind of
speed and numbers are we talking about here?

Oh, and you do know about the userspace printk tty driver, right?  What
about using that combined with the existing syslog system call?

systemd doesn't use it, but I know of a few embedded systems that are
already using it today, and I think it might solve the android issues
already.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  3:47             ` Greg KH
@ 2011-12-22  4:12               ` Kay Sievers
  2011-12-22  4:22                 ` Brian Swetland
  2011-12-22  4:49                 ` david
  0 siblings, 2 replies; 53+ messages in thread
From: Kay Sievers @ 2011-12-22  4:12 UTC (permalink / raw)
  To: Greg KH
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Brian Swetland, Lennart Poettering

On Thu, Dec 22, 2011 at 04:47, Greg KH <gregkh@suse.de> wrote:
> On Wed, Dec 21, 2011 at 06:12:39PM -0800, Tim Bird wrote:
>> > Again, please see what we are already doing in the kernel and userspace,
>> > I think a lot of the above is already implemented.
>>
>> I don't know what systemd has got going on in user-space.  I'm looking
>> at a very recent kernel, and I see no support for multiple log channels,
>> or an optimized open/write path.
>
> How is the existing syslog read path not "optimized"?  What kind of
> speed and numbers are we talking about here?
>
> Oh, and you do know about the userspace printk tty driver, right?  What
> about using that combined with the existing syslog system call?
>
> systemd doesn't use it, but I know of a few embedded systems that are
> already using it today, and I think it might solve the android issues
> already.

We would like to have for the general Linux use case is a kmsg, that
supports structured data, but looks the same as it looks today to
current and legacy tools.

We would like to have every printk that does not start with a
KERN_CONT to create a record in the kmsg buffer. This record would
carry directly a timestamp, the log facility and the log level and a
reference to the usual human readable string.

The current kmsg prefixing of the timestamp would be a runtime switch,
and just added when the records are traversed, instead of the current
buffer printing.

In addition to this common data, every record can carry a
'dictionary', which basically looks like a process environment, means
KEY=value pairs. The additional data can carry all the stuff that is
needed for structured logging and provides the base for machine
readable log information.

We don't want to separate multiple stores to avoid ordering problems
when the messages need to be merged from the multiple streams on the
receiver side.

I think this is theoretically all possible with the current kmsg,
without breaking anything. The current interfaces would iterate over
the record list and print out the human readable parts and skip the
metadata, instead of just copying a linear byte stream. To tools, it
could look the same as it is currently. It's just that the ring buffer
is not a byte- but a record-buffer.

Advanced tools could get a new interface or switch the old interface
to pipe out the structured data too.

In the systemd journal, we have an ASCII-like stream format that
carries out structured data and is binary data capable.

If we want to think about any next generation logging, I'm convinced,
we need to support records, structured data and binary data; anything
else is unlikely worth to change the current stuff.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:12               ` Kay Sievers
@ 2011-12-22  4:22                 ` Brian Swetland
  2011-12-22  4:43                   ` Kay Sievers
                                     ` (2 more replies)
  2011-12-22  4:49                 ` david
  1 sibling, 3 replies; 53+ messages in thread
From: Brian Swetland @ 2011-12-22  4:22 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Greg KH, Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Lennart Poettering

On Wed, Dec 21, 2011 at 8:12 PM, Kay Sievers <kay.sievers@vrfy.org> wrote:
>
> We would like to have for the general Linux use case is a kmsg, that
> supports structured data, but looks the same as it looks today to
> current and legacy tools.
>
> We would like to have every printk that does not start with a
> KERN_CONT to create a record in the kmsg buffer. This record would
> carry directly a timestamp, the log facility and the log level and a
> reference to the usual human readable string.
>
> The current kmsg prefixing of the timestamp would be a runtime switch,
> and just added when the records are traversed, instead of the current
> buffer printing.
>
> In addition to this common data, every record can carry a
> 'dictionary', which basically looks like a process environment, means
> KEY=value pairs. The additional data can carry all the stuff that is
> needed for structured logging and provides the base for machine
> readable log information.
>
> We don't want to separate multiple stores to avoid ordering problems
> when the messages need to be merged from the multiple streams on the
> receiver side.
>
> I think this is theoretically all possible with the current kmsg,
> without breaking anything. The current interfaces would iterate over
> the record list and print out the human readable parts and skip the
> metadata, instead of just copying a linear byte stream. To tools, it
> could look the same as it is currently. It's just that the ring buffer
> is not a byte- but a record-buffer.
>
> Advanced tools could get a new interface or switch the old interface
> to pipe out the structured data too.
>
> In the systemd journal, we have an ASCII-like stream format that
> carries out structured data and is binary data capable.
>
> If we want to think about any next generation logging, I'm convinced,
> we need to support records, structured data and binary data; anything
> else is unlikely worth to change the current stuff.

Any thoughts as to how one could allow N userspace agents to log to a
single shared buffer without one agent, if buggy or malicious,
spamming out all the other contents of the log?  This is one of the
main reasons we maintain separate logs in Android today, and are
likely to continue doing so until we figure out how resolve the issue.

Having everything in one unified log definitely has benefits (it
certainly makes keeping everything ordered and reading everything back
simpler), however we can't control third party applications (by
design), so systems that require apps to "just don't do that" are not
really solving that problem for us.

Also, as I mentioned earlier, from a security standpoint it is highly
desirable to be able to restrict which records certain readers can
read.  Convincing third party app developers to not log sensitive or
PII data remains a challenge -- providing the ability for filtered
read access allows some mitigation (app can retrieve it's own logs for
a bug report but can't sift through all logs looking for interesting
tidbits).

Brian

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:22                 ` Brian Swetland
@ 2011-12-22  4:43                   ` Kay Sievers
  2011-12-22  4:47                   ` david
  2011-12-22 13:40                   ` Lennart Poettering
  2 siblings, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-12-22  4:43 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Greg KH, Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Lennart Poettering

On Thu, Dec 22, 2011 at 05:22, Brian Swetland <swetland@google.com> wrote:
> On Wed, Dec 21, 2011 at 8:12 PM, Kay Sievers <kay.sievers@vrfy.org> wrote:
>>
>> We would like to have for the general Linux use case is a kmsg, that
>> supports structured data, but looks the same as it looks today to
>> current and legacy tools.
>>
>> We would like to have every printk that does not start with a
>> KERN_CONT to create a record in the kmsg buffer. This record would
>> carry directly a timestamp, the log facility and the log level and a
>> reference to the usual human readable string.
>>
>> The current kmsg prefixing of the timestamp would be a runtime switch,
>> and just added when the records are traversed, instead of the current
>> buffer printing.
>>
>> In addition to this common data, every record can carry a
>> 'dictionary', which basically looks like a process environment, means
>> KEY=value pairs. The additional data can carry all the stuff that is
>> needed for structured logging and provides the base for machine
>> readable log information.
>>
>> We don't want to separate multiple stores to avoid ordering problems
>> when the messages need to be merged from the multiple streams on the
>> receiver side.
>>
>> I think this is theoretically all possible with the current kmsg,
>> without breaking anything. The current interfaces would iterate over
>> the record list and print out the human readable parts and skip the
>> metadata, instead of just copying a linear byte stream. To tools, it
>> could look the same as it is currently. It's just that the ring buffer
>> is not a byte- but a record-buffer.
>>
>> Advanced tools could get a new interface or switch the old interface
>> to pipe out the structured data too.
>>
>> In the systemd journal, we have an ASCII-like stream format that
>> carries out structured data and is binary data capable.
>>
>> If we want to think about any next generation logging, I'm convinced,
>> we need to support records, structured data and binary data; anything
>> else is unlikely worth to change the current stuff.
>
> Any thoughts as to how one could allow N userspace agents to log to a
> single shared buffer without one agent, if buggy or malicious,
> spamming out all the other contents of the log?

In systemd journal, users get separate databases, but still all
databases are written by the same process.

All input is rate-limited based on log priority and remaining space.

> This is one of the
> main reasons we maintain separate logs in Android today, and are
> likely to continue doing so until we figure out how resolve the issue.

Right, it's a problem, but we think it can be solved with still having
a single interface, and a single representation on the log consumer
side.

> Having everything in one unified log definitely has benefits (it
> certainly makes keeping everything ordered and reading everything back
> simpler), however we can't control third party applications (by
> design), so systems that require apps to "just don't do that" are not
> really solving that problem for us.

Yeah sure, but rate limiting and separating databases should make that possible.

> Also, as I mentioned earlier, from a security standpoint it is highly
> desirable to be able to restrict which records certain readers can
> read. Â Convincing third party app developers to not log sensitive or
> PII data remains a challenge -- providing the ability for filtered
> read access allows some mitigation (app can retrieve it's own logs for
> a bug report but can't sift through all logs looking for interesting
> tidbits).

Yes, Android is a bit different with the model of assigning uids to
apps. We focus mainly on the 1 system and the N logged-in users, which
all get their own database. I think the same model could work just
fine for the app-uids on Android. The important difference to separate
stores is that there is a single process writing all that, and that
can assign sequence numbers regardless in which database the record
will be stored for later merging/sorting of the records.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:22                 ` Brian Swetland
  2011-12-22  4:43                   ` Kay Sievers
@ 2011-12-22  4:47                   ` david
  2011-12-22  4:58                     ` Brian Swetland
  2011-12-22 13:40                   ` Lennart Poettering
  2 siblings, 1 reply; 53+ messages in thread
From: david @ 2011-12-22  4:47 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Kay Sievers, Greg KH, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Lennart Poettering

On Wed, 21 Dec 2011, Brian Swetland wrote:

> Any thoughts as to how one could allow N userspace agents to log to a
> single shared buffer without one agent, if buggy or malicious,
> spamming out all the other contents of the log?  This is one of the
> main reasons we maintain separate logs in Android today, and are
> likely to continue doing so until we figure out how resolve the issue.

note that you still don't prevent one app from blowing out the logs, all 
you do is separate the logs into a handfulof categories and limit the app 
(through standard filesystem permissions) to only blowing out one category 
of logs.

or do I misunderstand this and you actually keep a separate queue for each 
pid (which would seem to be a problem as you no longer know how much space 
you need)

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:47                   ` david
@ 2011-12-22  4:58                     ` Brian Swetland
  2011-12-22  5:07                       ` david
  2011-12-22  5:21                       ` david
  0 siblings, 2 replies; 53+ messages in thread
From: Brian Swetland @ 2011-12-22  4:58 UTC (permalink / raw)
  To: david
  Cc: Kay Sievers, Greg KH, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Lennart Poettering

On Wed, Dec 21, 2011 at 8:47 PM,  <david@lang.hm> wrote:
> On Wed, 21 Dec 2011, Brian Swetland wrote:
>> Any thoughts as to how one could allow N userspace agents to log to a
>> single shared buffer without one agent, if buggy or malicious,
>> spamming out all the other contents of the log? Â This is one of the
>> main reasons we maintain separate logs in Android today, and are
>> likely to continue doing so until we figure out how resolve the issue.
>
> note that you still don't prevent one app from blowing out the logs, all you
> do is separate the logs into a handfulof categories and limit the app
> (through standard filesystem permissions) to only blowing out one category
> of logs.
>
> or do I misunderstand this and you actually keep a separate queue for each
> pid (which would seem to be a problem as you no longer know how much space
> you need)

You're correct -- that's the model at present.  We'd like to move to a
model where we have better control and are kicking around ideas for
how to do this for the next platform version.

The rate at which apps push data into logs is pretty amazing at times.
 The system booting (maybe 10-20 services and 10+ apps starting up)
can blow through 256K of ringbuffer in seconds.

Having enough context available when something crashes to be able to
diagnose what went wrong, especially when problems could involve
interaction between multiple agents in multiple processes is important
for the frameworks and apps teams.

Trying to get individual developers to be frugal with their logging
has turned out to be mostly a losing battle.

Brian

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:58                     ` Brian Swetland
@ 2011-12-22  5:07                       ` david
  2011-12-22  5:21                       ` david
  1 sibling, 0 replies; 53+ messages in thread
From: david @ 2011-12-22  5:07 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Kay Sievers, Greg KH, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Lennart Poettering

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2125 bytes --]

On Wed, 21 Dec 2011, Brian Swetland wrote:

> On Wed, Dec 21, 2011 at 8:47 PM,  <david@lang.hm> wrote:
>> On Wed, 21 Dec 2011, Brian Swetland wrote:
>>> Any thoughts as to how one could allow N userspace agents to log to a
>>> single shared buffer without one agent, if buggy or malicious,
>>> spamming out all the other contents of the log?  This is one of the
>>> main reasons we maintain separate logs in Android today, and are
>>> likely to continue doing so until we figure out how resolve the issue.
>>
>> note that you still don't prevent one app from blowing out the logs, all you
>> do is separate the logs into a handfulof categories and limit the app
>> (through standard filesystem permissions) to only blowing out one category
>> of logs.
>>
>> or do I misunderstand this and you actually keep a separate queue for each
>> pid (which would seem to be a problem as you no longer know how much space
>> you need)
>
> You're correct -- that's the model at present.  We'd like to move to a
> model where we have better control and are kicking around ideas for
> how to do this for the next platform version.
>
> The rate at which apps push data into logs is pretty amazing at times.
> The system booting (maybe 10-20 services and 10+ apps starting up)
> can blow through 256K of ringbuffer in seconds.
>
> Having enough context available when something crashes to be able to
> diagnose what went wrong, especially when problems could involve
> interaction between multiple agents in multiple processes is important
> for the frameworks and apps teams.

this makes it sound more like this belongs in userspace instead of the 
kernel. if something crashes and they can flip a flag in userspace to say 
'keep more logs now' and try to duplicate the crash is going to be far 
easier than having to compile a new kernel to extend the size of the log.

> Trying to get individual developers to be frugal with their logging
> has turned out to be mostly a losing battle.

that's for sure, anything that's success depends on developers doing 
things right that aren't visible to the user are pretty much doomed.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:58                     ` Brian Swetland
  2011-12-22  5:07                       ` david
@ 2011-12-22  5:21                       ` david
  1 sibling, 0 replies; 53+ messages in thread
From: david @ 2011-12-22  5:21 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Kay Sievers, Greg KH, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Lennart Poettering

On Wed, 21 Dec 2011, Brian Swetland wrote:

> The rate at which apps push data into logs is pretty amazing at times.
> The system booting (maybe 10-20 services and 10+ apps starting up)
> can blow through 256K of ringbuffer in seconds.

a smidge more information here.

256K of logs in 'seconds'

are we talking single digit seconds, tens of seconds, what?

any idea the average message size

on my server farms, the average log size is ~256 byttes, so if this was 
over 10 seconds we would be talking ~100 logs/sec. "slow" syslog daemons 
can handle this sort of load trivially, they start to take significant 
amounts of CPU (10%+) around two orders of magnatude higher data rates.

now, it may be that  the logs on the android are smaller, and you hit the 
256K injust a couple of seconds, at which point things would be a little 
harder, but since the job is so much simpler it shouldn't be that hard.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:22                 ` Brian Swetland
  2011-12-22  4:43                   ` Kay Sievers
  2011-12-22  4:47                   ` david
@ 2011-12-22 13:40                   ` Lennart Poettering
  2 siblings, 0 replies; 53+ messages in thread
From: Lennart Poettering @ 2011-12-22 13:40 UTC (permalink / raw)
  To: Brian Swetland
  Cc: Kay Sievers, Greg KH, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz

On Wed, 21.12.11 20:22, Brian Swetland (swetland@google.com) wrote:

> > If we want to think about any next generation logging, I'm convinced,
> > we need to support records, structured data and binary data; anything
> > else is unlikely worth to change the current stuff.
> 
> Any thoughts as to how one could allow N userspace agents to log to a
> single shared buffer without one agent, if buggy or malicious,
> spamming out all the other contents of the log?  This is one of the
> main reasons we maintain separate logs in Android today, and are
> likely to continue doing so until we figure out how resolve the issue.

I have been working on some code to rate limit what clients can log to
the journal, with some nice tricks: the rate limiting is per-cgroup, so
that services can't flush out other services data so easily (assuming
they are in separate cgroups, which they are in a systemd world). Also,
different rates apply to to messages with different priorities
(i.e. we'll have a lower limit on debug messages, and allow more
important messages to be logged at a higher rate). And finally, we look
at the available disk space: the overall rate limits are increased if
there's more free space, and lowered if there's less.

> Also, as I mentioned earlier, from a security standpoint it is highly
> desirable to be able to restrict which records certain readers can
> read.  Convincing third party app developers to not log sensitive or
> PII data remains a challenge -- providing the ability for filtered
> read access allows some mitigation (app can retrieve it's own logs for
> a bug report but can't sift through all logs looking for interesting
> tidbits).

The journal splits up user data into sparate files and sets file access
permissions to ensure that users can access their own data but nothing
else. However, when root wants to read all data it can, and the data
will be interleaved automatically.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:12               ` Kay Sievers
  2011-12-22  4:22                 ` Brian Swetland
@ 2011-12-22  4:49                 ` david
  1 sibling, 0 replies; 53+ messages in thread
From: david @ 2011-12-22  4:49 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Greg KH, Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Brian Swetland, Lennart Poettering

On Thu, 22 Dec 2011, Kay Sievers wrote:

> If we want to think about any next generation logging, I'm convinced,
> we need to support records, structured data and binary data; anything
> else is unlikely worth to change the current stuff.

note that current generation syslog supports records, structured data, 
metadata capture of the sending process, everything you listed except 
unencoded binary data.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:47         ` Greg KH
  2011-12-22  2:12           ` Tim Bird
@ 2011-12-22  2:34           ` Kay Sievers
  1 sibling, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-12-22  2:34 UTC (permalink / raw)
  To: Greg KH
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Brian Swetland, Lennart Poettering

On Thu, Dec 22, 2011 at 02:47, Greg KH <gregkh@suse.de> wrote:
> On Wed, Dec 21, 2011 at 05:32:36PM -0800, Tim Bird wrote:
>> On 12/21/2011 04:51 PM, Greg KH wrote:
>> > On Wed, Dec 21, 2011 at 04:36:21PM -0800, Tim Bird wrote:
>> >> On 12/21/2011 03:19 PM, Greg KH wrote:
>>
>> > Huh, I'm not talking about syslogd, I'm talking about the syslog(2)
>> > syscall we have.
>>
>> OK - switching gears. Â Since the kernel log buffer isn't normally
>> used to store use-space messages, I thought you were referring
>> to syslog(3) and the associated (logger(1) and syslogd(8)).
>
> The kernel log buffer has been storing userspace messages for a while
> now, look at your boot log on the latest Fedora and openSUSE releases
> (or any other distro using systemd for booting).

Right, and the kernel's printk buffer support numbered facilities,
like the syslog protocol defines, just fine.

Stuff injected into the kernel from userspace can easily distinguished
from kernel generated messages when pulled out by userspace.

We use that currently to do early boot logging, when no syslog is running:
  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=9d90c8d9cde929cbc575098e825d7c29d9f45054

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:36   ` Tim Bird
  2011-12-22  0:51     ` Greg KH
@ 2011-12-22  1:20     ` NeilBrown
  2011-12-22  1:49       ` Greg KH
                         ` (4 more replies)
  2011-12-22  4:42     ` david
  2 siblings, 5 replies; 53+ messages in thread
From: NeilBrown @ 2011-12-22  1:20 UTC (permalink / raw)
  To: Tim Bird
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

[-- Attachment #1: Type: text/plain, Size: 1971 bytes --]

On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird <tim.bird@am.sony.com> wrote:

> On 12/21/2011 03:19 PM, Greg KH wrote:
> > That all describes the current code, but you haven't described what's
> > wrong with the existing syslog interface that requires this new driver
> > to be written.  And why can't the existing interface be fixed to address
> > these (potential) shortcomings?
> 
> 
> >> One specific question I have is where is the most appropriate
> >> place for this code to live, in the kernel source tree?
> >> Other embedded systems might want to use this system (it
> >> is simpler than syslog, and superior in some ways), so I don't
> >> think it should remain in an android-specific directory.
> > 
> > What way is it superior?
> 
> Here are some ways that this code is superior to syslog:

It is certainly nice and simple.  It really looks more like a filesystem than
a char device though...  though they aren't really files so much as lossy
pipes.  I don't think that's a problem though, lots of things in filesystems
don't behave exactly like files.

If you created a 'logbuf' filesystem that used libfs to provide a single
directory in which privileged processes could create files then you wouldn't
need the kernel to "know" the allowed logs: radio, events, main, system.
The size could be set by ftruncate() (by privileged used again) rather than
being hardcoded.

You would defined 'read' and 'write' much like you currently do to create a list of
datagrams in a circular buffer and replace the ioctls by more standard
interfaces:

LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
LOGGER_FLUSH_LOG could use ftruncate

The result would be much the same amount of code, but an interface which has
fewer details hard-coded and is generally more versatile and accessible.

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:20     ` NeilBrown
@ 2011-12-22  1:49       ` Greg KH
  2011-12-22  2:14       ` Tim Bird
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 53+ messages in thread
From: Greg KH @ 2011-12-22  1:49 UTC (permalink / raw)
  To: NeilBrown
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Brian Swetland, Kay Sievers, Lennart Poettering

On Thu, Dec 22, 2011 at 12:20:26PM +1100, NeilBrown wrote:
> On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird <tim.bird@am.sony.com> wrote:
> 
> > On 12/21/2011 03:19 PM, Greg KH wrote:
> > > That all describes the current code, but you haven't described what's
> > > wrong with the existing syslog interface that requires this new driver
> > > to be written.  And why can't the existing interface be fixed to address
> > > these (potential) shortcomings?
> > 
> > 
> > >> One specific question I have is where is the most appropriate
> > >> place for this code to live, in the kernel source tree?
> > >> Other embedded systems might want to use this system (it
> > >> is simpler than syslog, and superior in some ways), so I don't
> > >> think it should remain in an android-specific directory.
> > > 
> > > What way is it superior?
> > 
> > Here are some ways that this code is superior to syslog:
> 
> It is certainly nice and simple.  It really looks more like a filesystem than
> a char device though...  though they aren't really files so much as lossy
> pipes.  I don't think that's a problem though, lots of things in filesystems
> don't behave exactly like files.
> 
> If you created a 'logbuf' filesystem that used libfs to provide a single
> directory in which privileged processes could create files then you wouldn't
> need the kernel to "know" the allowed logs: radio, events, main, system.
> The size could be set by ftruncate() (by privileged used again) rather than
> being hardcoded.
> 
> You would defined 'read' and 'write' much like you currently do to create a list of
> datagrams in a circular buffer and replace the ioctls by more standard
> interfaces:
> 
> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
> LOGGER_FLUSH_LOG could use ftruncate
> 
> The result would be much the same amount of code, but an interface which has
> fewer details hard-coded and is generally more versatile and accessible.

But, almost all of this is already in the syslog system call today,
right?  So why create a new user api for something we have?

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:20     ` NeilBrown
  2011-12-22  1:49       ` Greg KH
@ 2011-12-22  2:14       ` Tim Bird
  2011-12-22  2:34       ` Brian Swetland
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 53+ messages in thread
From: Tim Bird @ 2011-12-22  2:14 UTC (permalink / raw)
  To: NeilBrown
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On 12/21/2011 05:20 PM, NeilBrown wrote:
> It is certainly nice and simple.  It really looks more like a filesystem than
> a char device though...  though they aren't really files so much as lossy
> pipes.  I don't think that's a problem though, lots of things in filesystems
> don't behave exactly like files.
> 
> If you created a 'logbuf' filesystem that used libfs to provide a single
> directory in which privileged processes could create files then you wouldn't
> need the kernel to "know" the allowed logs: radio, events, main, system.
> The size could be set by ftruncate() (by privileged used again) rather than
> being hardcoded.
> 
> You would defined 'read' and 'write' much like you currently do to create a list of
> datagrams in a circular buffer and replace the ioctls by more standard
> interfaces:
> 
> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
> LOGGER_FLUSH_LOG could use ftruncate
> 
> The result would be much the same amount of code, but an interface which has
> fewer details hard-coded and is generally more versatile and accessible.

This is a very interesting suggestion.  I think it would be good to
create a prototype of this and see how it affects the user-space code
as well as what the kernel-side issues would be (and to check whether
this would change any of the current logger semantics).

It would also have the side effect of allowing runtime modification of
the buffer size, though I'm not sure whether that's useful or not.

Thanks!
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:20     ` NeilBrown
  2011-12-22  1:49       ` Greg KH
  2011-12-22  2:14       ` Tim Bird
@ 2011-12-22  2:34       ` Brian Swetland
  2011-12-22  3:49         ` Greg KH
  2011-12-22  5:01         ` david
  2011-12-22  4:52       ` david
  2011-12-22 14:59       ` Arnd Bergmann
  4 siblings, 2 replies; 53+ messages in thread
From: Brian Swetland @ 2011-12-22  2:34 UTC (permalink / raw)
  To: NeilBrown
  Cc: Tim Bird, Greg KH, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 5:20 PM, NeilBrown <neilb@suse.de> wrote:
> On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird <tim.bird@am.sony.com> wrote:
>
>> On 12/21/2011 03:19 PM, Greg KH wrote:
>> > That all describes the current code, but you haven't described what's
>> > wrong with the existing syslog interface that requires this new driver
>> > to be written.  And why can't the existing interface be fixed to address
>> > these (potential) shortcomings?
>>
>> >> One specific question I have is where is the most appropriate
>> >> place for this code to live, in the kernel source tree?
>> >> Other embedded systems might want to use this system (it
>> >> is simpler than syslog, and superior in some ways), so I don't
>> >> think it should remain in an android-specific directory.
>> >
>> > What way is it superior?
>>
>> Here are some ways that this code is superior to syslog:
>
> It is certainly nice and simple.  It really looks more like a filesystem than
> a char device though...  though they aren't really files so much as lossy
> pipes.  I don't think that's a problem though, lots of things in filesystems
> don't behave exactly like files.
>
> If you created a 'logbuf' filesystem that used libfs to provide a single
> directory in which privileged processes could create files then you wouldn't
> need the kernel to "know" the allowed logs: radio, events, main, system.
> The size could be set by ftruncate() (by privileged used again) rather than
> being hardcoded.
>
> You would defined 'read' and 'write' much like you currently do to create a list of
> datagrams in a circular buffer and replace the ioctls by more standard
> interfaces:
>
> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
> LOGGER_FLUSH_LOG could use ftruncate
>
> The result would be much the same amount of code, but an interface which has
> fewer details hard-coded and is generally more versatile and accessible.

Moving away from hard coding the names/sizes of the logs in the driver
is something that has been on the todo list for a while.  One thing
we'd likely want to accomplish there is avoid creating a vector for
consuming large amounts of memory by creating new logs.

One planned change (likely to happen in the Android J release
timeframe) is to adjust permissions such that any process can write
messages, but unless they belong to the correct group they can only
read back messages written by their own PID.  This is to allow apps to
grab their own log output after a crash or during a user problem
report without needing to grant them the ability to read all log
messages.

Currently the logger driver does not provide a mechanism for allowing
logs to survive a reboot (unlike the ramconsole), but this is
functionality that we've thought about adding.  Generally the kernel
logs are most interesting after an unexpected panic/reboot, but
getting a picture of what userspace has been up to can be useful too.

The goals behind the logger driver have been:
- keep userland and kernel logging separate (so that spammy userland
logging doesn't make us lose critical kernel logs or the other way
round)
- make log writing very inexpensive -- avoid having to pass messages
between processes (more critical on ARM9 platforms where this implied
extra cache flushing), avoid having to make several syscalls to write
a log message (getting time of day, etc), and so on
- make log writing reliable -- don't trust userland to report its
timestamp, PID, or to correctly format the datagrams, etc
- allow a log watching process (logcat) to easily pull data from all
logs at once
- avoid committing a vast amount of memory to logging
- try to prevent clients from spamming each other out of log space
(only successful on a coarse granularity right now with the
main/system/radio/events logs)
- ensure logs are not lost at the moment an app crashes

On one hand, having each app (per PID) be able to create their own
logs up to a specified size limit could be really useful and is
something we've kicked around -- for one it would allow us to avoid
the ever present request from userspace developers to increase the log
size because of too much log spam ("reduce log spam" never seems to be
an answer that makes them happy) -- but we haven't come up with a
reasonable plan for dealing with "well if we allow 16KB of log per app
and the user installs 100 apps, they may be pinning up to 1.6MB of ram
worst case", and so on.

Brian

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  2:34       ` Brian Swetland
@ 2011-12-22  3:49         ` Greg KH
  2011-12-22  4:36           ` Kay Sievers
  2011-12-22  5:01         ` david
  1 sibling, 1 reply; 53+ messages in thread
From: Greg KH @ 2011-12-22  3:49 UTC (permalink / raw)
  To: Brian Swetland
  Cc: NeilBrown, Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 06:34:08PM -0800, Brian Swetland wrote:
> The goals behind the logger driver have been:
> - keep userland and kernel logging separate (so that spammy userland
> logging doesn't make us lose critical kernel logs or the other way
> round)

Wouldn't a simple userspace daemon solve this, writing the data to a
ramfs file?

> - make log writing very inexpensive -- avoid having to pass messages
> between processes (more critical on ARM9 platforms where this implied
> extra cache flushing), avoid having to make several syscalls to write
> a log message (getting time of day, etc), and so on

ramfs?

> - make log writing reliable -- don't trust userland to report its
> timestamp, PID, or to correctly format the datagrams, etc

existing userspace printk tty driver?

> - allow a log watching process (logcat) to easily pull data from all
> logs at once

what do you mean by "at once"?

> - avoid committing a vast amount of memory to logging

memory where, in code or in the data being logged?

> - try to prevent clients from spamming each other out of log space
> (only successful on a coarse granularity right now with the
> main/system/radio/events logs)
> - ensure logs are not lost at the moment an app crashes

Which logs?

> On one hand, having each app (per PID) be able to create their own
> logs up to a specified size limit could be really useful and is
> something we've kicked around -- for one it would allow us to avoid
> the ever present request from userspace developers to increase the log
> size because of too much log spam ("reduce log spam" never seems to be
> an answer that makes them happy) -- but we haven't come up with a
> reasonable plan for dealing with "well if we allow 16KB of log per app
> and the user installs 100 apps, they may be pinning up to 1.6MB of ram
> worst case", and so on.

I think the userspace printk and syslog might already handle most of
this today.  Tim, care to look into that and see if it does or not?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  3:49         ` Greg KH
@ 2011-12-22  4:36           ` Kay Sievers
  0 siblings, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-12-22  4:36 UTC (permalink / raw)
  To: Greg KH
  Cc: Brian Swetland, NeilBrown, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Lennart Poettering

On Thu, Dec 22, 2011 at 04:49, Greg KH <gregkh@suse.de> wrote:
> On Wed, Dec 21, 2011 at 06:34:08PM -0800, Brian Swetland wrote:
>> The goals behind the logger driver have been:
>> - keep userland and kernel logging separate (so that spammy userland
>> logging doesn't make us lose critical kernel logs or the other way
>> round)
>
> Wouldn't a simple userspace daemon solve this, writing the data to a
> ramfs file?

That's what the systemd journal does in /run when /var is not writable
or the /var/log/journal directory does not exist. All just lands on a
tmpfs until the next reboot. The journal database just acts as a
ringbuffer on a filesystem.

>> - make log writing very inexpensive -- avoid having to pass messages
>> between processes (more critical on ARM9 platforms where this implied
>> extra cache flushing), avoid having to make several syscalls to write
>> a log message (getting time of day, etc), and so on
>
> ramfs?

The receiver gets almost all of this from the socket with
SCM_TIMESTAP, and SCM_CREDENTIALS. Needed stuff can be added in a
similar way. This is trusted, reliable and pretty cheap to retrieve
data.

>> - make log writing reliable -- don't trust userland to report its
>> timestamp, PID, or to correctly format the datagrams, etc
>
> existing userspace printk tty driver?

Sockets provide all that today already in a trusted way.

>> - allow a log watching process (logcat) to easily pull data from all
>> logs at once
>
> what do you mean by "at once"?

/proc/kmsg provides that today, which is basically just a file wrapper
around the kernel's (misleadingly named) syslog() call.

>> - avoid committing a vast amount of memory to logging
>
> memory where, in code or in the data being logged?
>
>> - try to prevent clients from spamming each other out of log space
>> (only successful on a coarse granularity right now with the
>> main/system/radio/events logs)
>> - ensure logs are not lost at the moment an app crashes
>
> Which logs?

Things need rate limits and quota based on the remaining buffer or
disk space. The systemd journal maintains a separate journal for every
login-uid, so every logged in user gets his own journal database,
which can not overflow the systems database or the databases of the
other users.

Every user can log to the journal, but only the journal daemon can
write to the databases. The user can directly read back its own
journal database, but not write it.

>> On one hand, having each app (per PID) be able to create their own
>> logs up to a specified size limit could be really useful and is
>> something we've kicked around -- for one it would allow us to avoid
>> the ever present request from userspace developers to increase the log
>> size because of too much log spam ("reduce log spam" never seems to be
>> an answer that makes them happy) -- but we haven't come up with a
>> reasonable plan for dealing with "well if we allow 16KB of log per app
>> and the user installs 100 apps, they may be pinning up to 1.6MB of ram
>> worst case", and so on.
>
> I think the userspace printk and syslog might already handle most of
> this today. Â Tim, care to look into that and see if it does or not?

It's very simple what we have here in the kernel, and if syslog
facility prefixes are used, it can probably replace most of the
Android log solution just fine. We can not really protect one store to
flush the entire buffer though, that's what the 4 separated stores can
easily provide.

But as mentioned, I think we want the ability to work with structured
data, which both kernel tools are incapable of.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  2:34       ` Brian Swetland
  2011-12-22  3:49         ` Greg KH
@ 2011-12-22  5:01         ` david
  1 sibling, 0 replies; 53+ messages in thread
From: david @ 2011-12-22  5:01 UTC (permalink / raw)
  To: Brian Swetland
  Cc: NeilBrown, Tim Bird, Greg KH, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Kay Sievers, Lennart Poettering

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4788 bytes --]

On Wed, 21 Dec 2011, Brian Swetland wrote:

> On Wed, Dec 21, 2011 at 5:20 PM, NeilBrown <neilb@suse.de> wrote:
>> On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird <tim.bird@am.sony.com> wrote:
>>
>>> On 12/21/2011 03:19 PM, Greg KH wrote:
>>>> That all describes the current code, but you haven't described what's
>>>> wrong with the existing syslog interface that requires this new driver
>>>> to be written.  And why can't the existing interface be fixed to address
>>>> these (potential) shortcomings?
>>>
>>>>> One specific question I have is where is the most appropriate
>>>>> place for this code to live, in the kernel source tree?
>>>>> Other embedded systems might want to use this system (it
>>>>> is simpler than syslog, and superior in some ways), so I don't
>>>>> think it should remain in an android-specific directory.
>>>>
>>>> What way is it superior?
>>>
>>> Here are some ways that this code is superior to syslog:
>>
>> It is certainly nice and simple.  It really looks more like a filesystem than
>> a char device though...  though they aren't really files so much as lossy
>> pipes.  I don't think that's a problem though, lots of things in filesystems
>> don't behave exactly like files.
>>
>> If you created a 'logbuf' filesystem that used libfs to provide a single
>> directory in which privileged processes could create files then you wouldn't
>> need the kernel to "know" the allowed logs: radio, events, main, system.
>> The size could be set by ftruncate() (by privileged used again) rather than
>> being hardcoded.
<<SNIP>>
>> The result would be much the same amount of code, but an interface which has
>> fewer details hard-coded and is generally more versatile and accessible.
>
> Moving away from hard coding the names/sizes of the logs in the driver
> is something that has been on the todo list for a while.  One thing
> we'd likely want to accomplish there is avoid creating a vector for
> consuming large amounts of memory by creating new logs.
>
> One planned change (likely to happen in the Android J release
> timeframe) is to adjust permissions such that any process can write
> messages, but unless they belong to the correct group they can only
> read back messages written by their own PID.  This is to allow apps to
> grab their own log output after a crash or during a user problem
> report without needing to grant them the ability to read all log
> messages.
>
> Currently the logger driver does not provide a mechanism for allowing
> logs to survive a reboot (unlike the ramconsole), but this is
> functionality that we've thought about adding.  Generally the kernel
> logs are most interesting after an unexpected panic/reboot, but
> getting a picture of what userspace has been up to can be useful too.
>
> The goals behind the logger driver have been:
> - keep userland and kernel logging separate (so that spammy userland
> logging doesn't make us lose critical kernel logs or the other way
> round)
> - make log writing very inexpensive -- avoid having to pass messages
> between processes (more critical on ARM9 platforms where this implied
> extra cache flushing), avoid having to make several syscalls to write
> a log message (getting time of day, etc), and so on
> - make log writing reliable -- don't trust userland to report its
> timestamp, PID, or to correctly format the datagrams, etc
> - allow a log watching process (logcat) to easily pull data from all
> logs at once
> - avoid committing a vast amount of memory to logging
> - try to prevent clients from spamming each other out of log space
> (only successful on a coarse granularity right now with the
> main/system/radio/events logs)
> - ensure logs are not lost at the moment an app crashes
>
> On one hand, having each app (per PID) be able to create their own
> logs up to a specified size limit could be really useful and is
> something we've kicked around -- for one it would allow us to avoid
> the ever present request from userspace developers to increase the log
> size because of too much log spam ("reduce log spam" never seems to be
> an answer that makes them happy) -- but we haven't come up with a
> reasonable plan for dealing with "well if we allow 16KB of log per app
> and the user installs 100 apps, they may be pinning up to 1.6MB of ram
> worst case", and so on.

At this point you are starting to sound like something much closer to a 
traditional syslog daemon. you are adding so many variations, persistant 
storage, etc that you really don't want to have to have all this in the 
kernel, make these be interfaces into a userspace logging tool (ideally 
syslog compatible), and you have the option to easily have different 
policies, consuming different amounts of space, depending on the device 
and how resource limited it is.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:20     ` NeilBrown
                         ` (2 preceding siblings ...)
  2011-12-22  2:34       ` Brian Swetland
@ 2011-12-22  4:52       ` david
  2011-12-22  5:06         ` Brian Swetland
  2011-12-22 14:59       ` Arnd Bergmann
  4 siblings, 1 reply; 53+ messages in thread
From: david @ 2011-12-22  4:52 UTC (permalink / raw)
  To: NeilBrown
  Cc: Tim Bird, Greg KH, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Brian Swetland, Kay Sievers, Lennart Poettering

On Thu, 22 Dec 2011, NeilBrown wrote:

> On Wed, 21 Dec 2011 16:36:21 -0800 Tim Bird <tim.bird@am.sony.com> wrote:
>
>> On 12/21/2011 03:19 PM, Greg KH wrote:
>>> That all describes the current code, but you haven't described what's
>>> wrong with the existing syslog interface that requires this new driver
>>> to be written.  And why can't the existing interface be fixed to address
>>> these (potential) shortcomings?
>>
>>
>>>> One specific question I have is where is the most appropriate
>>>> place for this code to live, in the kernel source tree?
>>>> Other embedded systems might want to use this system (it
>>>> is simpler than syslog, and superior in some ways), so I don't
>>>> think it should remain in an android-specific directory.
>>>
>>> What way is it superior?
>>
>> Here are some ways that this code is superior to syslog:
>
> It is certainly nice and simple.  It really looks more like a filesystem than
> a char device though...  though they aren't really files so much as lossy
> pipes.  I don't think that's a problem though, lots of things in filesystems
> don't behave exactly like files.
>
> If you created a 'logbuf' filesystem that used libfs to provide a single
> directory in which privileged processes could create files then you wouldn't
> need the kernel to "know" the allowed logs: radio, events, main, system.
> The size could be set by ftruncate() (by privileged used again) rather than
> being hardcoded.
>
> You would defined 'read' and 'write' much like you currently do to create a list of
> datagrams in a circular buffer and replace the ioctls by more standard
> interfaces:
>
> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
> LOGGER_FLUSH_LOG could use ftruncate
>
> The result would be much the same amount of code, but an interface which has
> fewer details hard-coded and is generally more versatile and accessible.

That sounds better than what has been done in android, but it is still 
_far_ more limited than doing something that could be replaced by a fairly 
standard syslog daemon.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  4:52       ` david
@ 2011-12-22  5:06         ` Brian Swetland
  2011-12-22  5:14           ` david
  2011-12-22  7:05           ` NeilBrown
  0 siblings, 2 replies; 53+ messages in thread
From: Brian Swetland @ 2011-12-22  5:06 UTC (permalink / raw)
  To: david
  Cc: NeilBrown, Tim Bird, Greg KH, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 8:52 PM,  <david@lang.hm> wrote:
> On Thu, 22 Dec 2011, NeilBrown wrote:
>>
>> You would defined 'read' and 'write' much like you currently do to create
>> a list of
>> datagrams in a circular buffer and replace the ioctls by more standard
>> interfaces:
>>
>> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
>> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
>> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
>> LOGGER_FLUSH_LOG could use ftruncate
>>
>> The result would be much the same amount of code, but an interface which
>> has
>> fewer details hard-coded and is generally more versatile and accessible.
>
> That sounds better than what has been done in android, but it is still _far_
> more limited than doing something that could be replaced by a fairly
> standard syslog daemon.

We're really not interested in adding another daemon to the platform
-- the logging system we have has served us well, is integrated with
our existing development tools, and we're definitely interested in
improving it, but throwing it out and replacing it with a userspace
solution is not interesting to us right now.

Brian

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  5:06         ` Brian Swetland
@ 2011-12-22  5:14           ` david
  2011-12-22  5:25             ` Brian Swetland
  2011-12-22  7:05           ` NeilBrown
  1 sibling, 1 reply; 53+ messages in thread
From: david @ 2011-12-22  5:14 UTC (permalink / raw)
  To: Brian Swetland
  Cc: NeilBrown, Tim Bird, Greg KH, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Kay Sievers, Lennart Poettering

On Wed, 21 Dec 2011, Brian Swetland wrote:

>>> The result would be much the same amount of code, but an interface which
>>> has
>>> fewer details hard-coded and is generally more versatile and accessible.
>>
>> That sounds better than what has been done in android, but it is still _far_
>> more limited than doing something that could be replaced by a fairly
>> standard syslog daemon.
>
> We're really not interested in adding another daemon to the platform
> -- the logging system we have has served us well, is integrated with
> our existing development tools, and we're definitely interested in
> improving it, but throwing it out and replacing it with a userspace
> solution is not interesting to us right now.

Think very hard before you reject any possibility of doing this in 
userspace, especially with all the things that you are talking about doing 
with logging in the future. I really don't think aht a lot of your 
long-0term plans for logging are going to sit well with the kernel 
developers, and if your justification is "you don't want to change your 
build process", I really doubt that you will get this upstream.

It should be possible to do this without having to change the tools 
writing the logs, any change in logging will change what it takes to read 
the logs.

I am not saying that you need to have a logging daemon as heavyweight as 
syslog-ng or rsyslog for the low-end phones, but I do think that as 
android moves up the stack a bit into talets and netbooks (especially in 
applications where it will have wifi connectivity almost all the time) 
having the capability to move to a full-blown syslog daemon will make a 
huge amount of sense, so you should look at how big a minimalist daemon 
would be, and what sort of performance it would have.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  5:14           ` david
@ 2011-12-22  5:25             ` Brian Swetland
  2011-12-22  6:09               ` Greg KH
  0 siblings, 1 reply; 53+ messages in thread
From: Brian Swetland @ 2011-12-22  5:25 UTC (permalink / raw)
  To: david
  Cc: NeilBrown, Tim Bird, Greg KH, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 9:14 PM,  <david@lang.hm> wrote:
> On Wed, 21 Dec 2011, Brian Swetland wrote:
>>
>> We're really not interested in adding another daemon to the platform
>> -- the logging system we have has served us well, is integrated with
>> our existing development tools, and we're definitely interested in
>> improving it, but throwing it out and replacing it with a userspace
>> solution is not interesting to us right now.
>
> Think very hard before you reject any possibility of doing this in
> userspace, especially with all the things that you are talking about doing
> with logging in the future. I really don't think aht a lot of your
> long-0term plans for logging are going to sit well with the kernel
> developers, and if your justification is "you don't want to change your
> build process", I really doubt that you will get this upstream.
>
> It should be possible to do this without having to change the tools writing
> the logs, any change in logging will change what it takes to read the logs.
>
> I am not saying that you need to have a logging daemon as heavyweight as
> syslog-ng or rsyslog for the low-end phones, but I do think that as android
> moves up the stack a bit into talets and netbooks (especially in
> applications where it will have wifi connectivity almost all the time)
> having the capability to move to a full-blown syslog daemon will make a huge
> amount of sense, so you should look at how big a minimalist daemon would be,
> and what sort of performance it would have.

Long term things could certainly move around, but *right now* we
really don't see the value in gutting what we've got in favor of
writing a new userspace daemon, which is going to require somebody to
do that work, maintain it, chase down any behavioural quirks
introduced by the changes, etc.  Or to throw it out in favor of trying
to bolt our logging infrastructure on top of existing syslogds, etc.

We're happy to maintain the logger driver out of tree as we have for
the past four years.

If all discussions of bringing Android kernel support to mainline end
up as another round of "you should just rewrite it all in userspace"
or "you should use this other subsystem that doesn't actually solve
your problem but we think it's close enough", then there's not a lot
of point to having the discussions in the first place.

If somebody wants to go write a complete compatible replacement that
just drops in, we certainly could take a look at it and see how it
works, but given that the benefits are not clear to us, we're not
interested in going off and doing that work ourselves.

Brian

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  5:25             ` Brian Swetland
@ 2011-12-22  6:09               ` Greg KH
  2011-12-23 15:22                 ` Alan Cox
  0 siblings, 1 reply; 53+ messages in thread
From: Greg KH @ 2011-12-22  6:09 UTC (permalink / raw)
  To: Brian Swetland
  Cc: david, NeilBrown, Tim Bird, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Kay Sievers, Lennart Poettering

On Wed, Dec 21, 2011 at 09:25:52PM -0800, Brian Swetland wrote:
> Long term things could certainly move around, but *right now* we
> really don't see the value in gutting what we've got in favor of
> writing a new userspace daemon, which is going to require somebody to
> do that work, maintain it, chase down any behavioural quirks
> introduced by the changes, etc.  Or to throw it out in favor of trying
> to bolt our logging infrastructure on top of existing syslogds, etc.
> 
> We're happy to maintain the logger driver out of tree as we have for
> the past four years.

This all came about because Tim asked to merge this to the main tree
as-is, and so, people rightfully pointed out different options that
might be considered if it were to be merged that way.

I have no problem leaving the logger driver in staging, but it seems
that Tim is taking on the task to do the harder thing here, which
probably would entail work on both sides, which as a openhandset
alliance company member, he might have a change that someone like me
might not :)

Tim, it's now in your court it seems, have a great holiday,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  6:09               ` Greg KH
@ 2011-12-23 15:22                 ` Alan Cox
  2011-12-23 16:29                   ` Greg KH
  0 siblings, 1 reply; 53+ messages in thread
From: Alan Cox @ 2011-12-23 15:22 UTC (permalink / raw)
  To: Greg KH
  Cc: Brian Swetland, david, NeilBrown, Tim Bird, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering

> I have no problem leaving the logger driver in staging, but it seems
> that Tim is taking on the task to do the harder thing here, which
> probably would entail work on both sides, which as a openhandset
> alliance company member, he might have a change that someone like me
> might not :)

I'd like to see it in staging. There is a general discussion going on
about how such logging services should look (and indeed it goes back
before Android bits), and it looks like a good way to proceed is going to
be to "play" with the Android one - perhaps adding the fs model to it.

So we need something and the Android interface at the very least is a
prototype that has production proof of working.

Alan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-23 15:22                 ` Alan Cox
@ 2011-12-23 16:29                   ` Greg KH
  0 siblings, 0 replies; 53+ messages in thread
From: Greg KH @ 2011-12-23 16:29 UTC (permalink / raw)
  To: Alan Cox
  Cc: Brian Swetland, david, NeilBrown, Tim Bird, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering

On Fri, Dec 23, 2011 at 03:22:31PM +0000, Alan Cox wrote:
> > I have no problem leaving the logger driver in staging, but it seems
> > that Tim is taking on the task to do the harder thing here, which
> > probably would entail work on both sides, which as a openhandset
> > alliance company member, he might have a change that someone like me
> > might not :)
> 
> I'd like to see it in staging. There is a general discussion going on
> about how such logging services should look (and indeed it goes back
> before Android bits), and it looks like a good way to proceed is going to
> be to "play" with the Android one - perhaps adding the fs model to it.
> 
> So we need something and the Android interface at the very least is a
> prototype that has production proof of working.

It is already in staging, look in the staging-next branch of linux-next.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  5:06         ` Brian Swetland
  2011-12-22  5:14           ` david
@ 2011-12-22  7:05           ` NeilBrown
  2012-01-06 20:56             ` Tim Bird
  1 sibling, 1 reply; 53+ messages in thread
From: NeilBrown @ 2011-12-22  7:05 UTC (permalink / raw)
  To: Brian Swetland
  Cc: david, Tim Bird, Greg KH, linux-embedded, linux kernel,
	Arnd Bergmann, john stultz, Kay Sievers, Lennart Poettering

[-- Attachment #1: Type: text/plain, Size: 4613 bytes --]

On Wed, 21 Dec 2011 21:06:02 -0800 Brian Swetland <swetland@google.com> wrote:

> On Wed, Dec 21, 2011 at 8:52 PM,  <david@lang.hm> wrote:
> > On Thu, 22 Dec 2011, NeilBrown wrote:
> >>
> >> You would defined 'read' and 'write' much like you currently do to create
> >> a list of
> >> datagrams in a circular buffer and replace the ioctls by more standard
> >> interfaces:
> >>
> >> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
> >> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
> >> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
> >> LOGGER_FLUSH_LOG could use ftruncate
> >>
> >> The result would be much the same amount of code, but an interface which
> >> has
> >> fewer details hard-coded and is generally more versatile and accessible.
> >
> > That sounds better than what has been done in android, but it is still _far_
> > more limited than doing something that could be replaced by a fairly
> > standard syslog daemon.
> 
> We're really not interested in adding another daemon to the platform
> -- the logging system we have has served us well, is integrated with
> our existing development tools, and we're definitely interested in
> improving it, but throwing it out and replacing it with a userspace
> solution is not interesting to us right now.
> 
> Brian

Possibly it would be useful to be clear what we all *are* really interested
in, because I suspect there is a lot of variety.

You appear to be interested in providing a great platform for a phone, in
minimising unnecessary churn in that platform, and in having the freedom to
optimise various aspects however you see fit.  You appear to have little
problem with maintaining some components out-of-mainline. This is all
perfectly valid and very much in the spirit of "freedom" that binds us
together.

Others want to be able to run a main-line kernel underneath the Android
user-space - and that is perfectly reasonable as well.

I don't much care about either of those, but I want "Linux" to be "high
quality" (according to my own personal standards of course) which means
keeping bad stuff out (though you could argue that the horse has well and
truly bolted there) and including good and useful stuff in.  And I think
Android has something to offer on both sides there :-)

Weighing all that up, I don't think it is useful to set our goal on "getting
Android to use a mainline kernel" - that isn't going to happen.
Rather we should focus primarily on "making it *possible* to run android
user-space on mainline".

That could involve two things
1/ Making sure the interfaces that Android uses are abstracted at a suitable
  level so that when running a mainline kernel you can just slip in an
  alternate shared library with compatible functionality.
2/ Adding functionality to Linux so that it is possible to provide the
  functionality that Android needs.

Android should not *have* to use the interface that the "mainline
community" decides is preferred, but nor should mainline be required to
include the drivers that Android wants to use.  History shows us that isn't
going to happen.

But if there was a fairly low-level API that Android used, then those in the
community could who want Android on a mainline kernel could work to implement
that API with whatever mixture of user-space and kernel-space that achieves
consensus.
Android could of course change to use the community version eventually if
they found that was a good thing, or could keep using their own.

So: bringing this back to the android logger...
  What I would like to see is a low-level API which is used to access logging
  for which alternate implementations could be written.  Ideally this would
  be embodied in a single .so, but we have the source so that doesn't need to
  be a barrier.
  Then we could argue to our heart's content about how best to implement that
  API - Journal and nsyslogd and rsyslogd could all implement it in different
  ways and we could be one step closer to running Android on a mainline
  kernel, but the Android developers don't need to be involved (but can if
  they want to of course).

  I would be important that the API is specified clearly - neither under
  specified nor over-specified. That means that the Android implementation
  would need to explicitly forbid anything that isn't explicit permitted.
  This is because most testing will happen on the Android platform so it's
  actual behaviour will become the defacto standard.

  Could that be a possible way forward?

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  7:05           ` NeilBrown
@ 2012-01-06 20:56             ` Tim Bird
  2012-01-06 21:20               ` Greg KH
  0 siblings, 1 reply; 53+ messages in thread
From: Tim Bird @ 2012-01-06 20:56 UTC (permalink / raw)
  To: NeilBrown
  Cc: Brian Swetland, david@lang.hm, Greg KH, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering, Andrew Morton, david, Paul McKenney,
	Alan Stern, Alan Cox, Linus Torvalds

I'm back from vacation - sorry for the long delay in responding.

First, let me say thanks to all those who have responded with ideas
and suggestions.  I'll be following up on many of them in the new
few weeks.  This is a background task for me, so it will likely
go relatively slowly.  I have volunteer offers of assistance,
and some resources available for contract work, but it might take time
to set that up.  In general, though, things should be able to move forward.

I could respond to a lot of messages individually, but
to conserve space I'll focus on this one, and present an overall
idea for how I plan to proceed.

On 12/21/2011 11:05 PM, NeilBrown wrote:
> Possibly it would be useful to be clear what we all *are* really interested
> in, because I suspect there is a lot of variety.
> 
> You appear to be interested in providing a great platform for a phone, in
> minimising unnecessary churn in that platform, and in having the freedom to
> optimise various aspects however you see fit.  You appear to have little
> problem with maintaining some components out-of-mainline. This is all
> perfectly valid and very much in the spirit of "freedom" that binds us
> together.
> 
> Others want to be able to run a main-line kernel underneath the Android
> user-space - and that is perfectly reasonable as well.
> 
> I don't much care about either of those, but I want "Linux" to be "high
> quality" (according to my own personal standards of course) which means
> keeping bad stuff out (though you could argue that the horse has well and
> truly bolted there) and including good and useful stuff in.  And I think
> Android has something to offer on both sides there :-)

Thanks very much for this.  I think it *is* important to take a step
back and compare goals, to see how we can achieve the best result for
all parties.

> 
> Weighing all that up, I don't think it is useful to set our goal on "getting
> Android to use a mainline kernel" - that isn't going to happen.
> Rather we should focus primarily on "making it *possible* to run android
> user-space on mainline".
> 
> That could involve two things
> 1/ Making sure the interfaces that Android uses are abstracted at a suitable
>   level so that when running a mainline kernel you can just slip in an
>   alternate shared library with compatible functionality.
> 2/ Adding functionality to Linux so that it is possible to provide the
>   functionality that Android needs.

Agreed.

> Android should not *have* to use the interface that the "mainline
> community" decides is preferred, but nor should mainline be required to
> include the drivers that Android wants to use.  History shows us that isn't
> going to happen.

I agree with the sentiment in the first sentence, but I would like to
make a few observations about this code, and the problem in general,
for consideration.  I'll do this below because it's a bit philosophical,
and I'd rather focus on "the way forward" first.

> But if there was a fairly low-level API that Android used, then those in the
> community could who want Android on a mainline kernel could work to implement
> that API with whatever mixture of user-space and kernel-space that achieves
> consensus.
> Android could of course change to use the community version eventually if
> they found that was a good thing, or could keep using their own.
> 
> So: bringing this back to the android logger...
>   What I would like to see is a low-level API which is used to access logging
>   for which alternate implementations could be written.  Ideally this would
>   be embodied in a single .so, but we have the source so that doesn't need to
>   be a barrier.
>   Then we could argue to our heart's content about how best to implement that
>   API - Journal and nsyslogd and rsyslogd could all implement it in different
>   ways and we could be one step closer to running Android on a mainline
>   kernel, but the Android developers don't need to be involved (but can if
>   they want to of course).
> 
>   I would be important that the API is specified clearly - neither under
>   specified nor over-specified. That means that the Android implementation
>   would need to explicitly forbid anything that isn't explicit permitted.
>   This is because most testing will happen on the Android platform so it's
>   actual behaviour will become the defacto standard.
> 
>   Could that be a possible way forward?

I'd like to pursue this, as well as some of the minor code cleanups
suggested by Andrew Morton.  Here's what I'm thinking:

I'd like to implement Neil's file-system based solution as a test case,
and compare that with the existing code.  I like the elegance of using
existing filesystem semantics, and the removal of some hard-coded limits
and policy from the kernel.  A lower priority would be to also try
a user-space-only solution, as described by Arnd.  Optimally, it would
be nice to have all three systems to compare (char dev, log fs, and
ram fs) against each other.

I'll try to use these under the existing logger library API to see if the
current semantics expected by Android user space can be preserved.  This
would minimize the churn to Android user space.  Before this (or any other
changes for that matter) could be rolled out in an official Android
release, there would be a lot of testing required, to make sure something
hasn't been broken.  I want to avoid asking the Google developers to
do this, as they have something that works now, and there's little incentive
to change it.  I'm tackling this problem as something of a wider issue
for the industry.

Separately, I'd like to apply the changes requested by Andrew Morton
to the existing char dev driver, as just simple cleanups.  This, however,
also will require some testing to avoid regressions.

On a separate track, I want to compare the requirements discussed in this
thread with what systemd is doing (and how /dev/kmsg is being used) to see
if there are issues that inform a possible more general solution for
logging in the future.  To this end, it might be good to set up a
meeting at a future event (ELC or collab summit, depending on what
people plan to come to) to discuss things face-to-face.

I'm hoping that the existing code can live in staging while we sort
this out longer term.  This code has lived out of mainline for several
years, and I don't think we're in any kind of rush.  I plan to record
the requirements and the plan for this on a wiki page I've set up at:
http://elinux.org/Mainline_Android_logger_project, but I may also record
some of the simple cleanup requests in a TODO file in staging.

Also, separately, we'll likely do some benchmarking of the various
systems, as part of our overall comparison effort - to address questions
about the tradeoffs involved - particularly between implementing this
in or out of kernel and between existing and new kernel infrastructure.

[warning - next part is long...]

Now, having said all that, let me go off into some philosophical weeds...

This code is only about 700 lines long, and specializes the kernel for
an extremely popular (by usage) user space stack.  Code of
lower quality which specializes the kernel for much less-used hardware
has been admitted into the kernel with significantly less fuss than
this code has received.  That's not an argument to accept the code
as is, it's just an observation about the relative hurdle that this
code faces compared to lots of other code that goes into the kernel.
(And please don't interpret this as dissatisfaction with the feedback
received.)

If this code were a character driver for an obscure serial port
on a lesser-known chip architecture, I don't think it would get
any attention at all.  As it is, it's looking like at least a few
man months of work will be required, as well as some relatively
unneeded changes to Android user space, to get this feature into
a permanently acceptable state.  I wouldn't be surprised to see
this stretch into a few calendar years.

Code that specializes the kernel in weird ways is accepted into
the kernel all the time, and I've tried to figure out why this
particular bit of code is treated differently.  Especially since
this code is self-contained, configurable, and imposes no
perceivable long-term maintenance burden.  (That's not true of
all Android patches, but I believe it's true of this one).

I have a few theories:
1) this is not tied to hardware, and as such represents a general
feature (but people are not at all required to treat it as such,
just as they are not required to use other people's weird drivers).

2) people want to avoid duplication with other similar features
(again, since it's self-contained and configurable, I don't know
why it would bother people if this existed in tandem with other
solutions - especially since it's so small)

3) there is really no maintainer for this feature category, so
discussions get bogged down as varying requirements and solutions
are suggested, which can not easily be compared against each other
(especially for non-existent implementations)  In particular, it's
unclear who I have to get the approval of for this code or some
derivative of it to be accepted.  That makes the development task
a very open-ended one.

4) this is for a popular use case, as opposed to some minor
outlying thing, and so people perceive the need to get it
exactly right.  In this sense, the code would be a victim of
it's own success.

5) blocking this is perceived to be a way to accomplish a
larger, related goal (if this is true then it has lots of
interesting implications for the economics of open source
work)

In general, there is a tension between the normal nature of adapting
the kernel to the most general use cases, and the specialization
that is performed in developing an embedded product.  Often
times, solutions to embedded requirements are very finely tuned
to a particular field of use or situation, and don't lend themselves
easily to the type of generalization that mainlining usually requires.

Which brings me to my last point and a question.
Is it inconceivable for there to be a category of code in the
Linux kernel which supports ONLY Android user space, and no
other?  That is, must every Android patch be generalized in
some manner to a broader use case?

I suspect some of them (particularly binder) will not lend
themselves easily to this type of generalization.
Knowing the answer to this question would help me gauge the
amount of effort required for this project, and the
net value of continuing it.

Thanks and regards,
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2012-01-06 20:56             ` Tim Bird
@ 2012-01-06 21:20               ` Greg KH
  2012-01-06 22:41                 ` Tim Bird
  0 siblings, 1 reply; 53+ messages in thread
From: Greg KH @ 2012-01-06 21:20 UTC (permalink / raw)
  To: Tim Bird
  Cc: NeilBrown, Brian Swetland, david@lang.hm, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering, Andrew Morton, Paul McKenney, Alan Stern,
	Alan Cox, Linus Torvalds

On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote:
> Now, having said all that, let me go off into some philosophical weeds...

Ah, this should be fun :)

> This code is only about 700 lines long, and specializes the kernel for
> an extremely popular (by usage) user space stack.  Code of
> lower quality which specializes the kernel for much less-used hardware
> has been admitted into the kernel with significantly less fuss than
> this code has received.  That's not an argument to accept the code
> as is, it's just an observation about the relative hurdle that this
> code faces compared to lots of other code that goes into the kernel.
> (And please don't interpret this as dissatisfaction with the feedback
> received.)
> 
> If this code were a character driver for an obscure serial port
> on a lesser-known chip architecture, I don't think it would get
> any attention at all.  As it is, it's looking like at least a few
> man months of work will be required, as well as some relatively
> unneeded changes to Android user space, to get this feature into
> a permanently acceptable state.  I wouldn't be surprised to see
> this stretch into a few calendar years.
> 
> Code that specializes the kernel in weird ways is accepted into
> the kernel all the time, and I've tried to figure out why this
> particular bit of code is treated differently.  Especially since
> this code is self-contained, configurable, and imposes no
> perceivable long-term maintenance burden.  (That's not true of
> all Android patches, but I believe it's true of this one).
> 
> I have a few theories:
> 1) this is not tied to hardware, and as such represents a general
> feature (but people are not at all required to treat it as such,
> just as they are not required to use other people's weird drivers).

That is exactly true.

> 2) people want to avoid duplication with other similar features
> (again, since it's self-contained and configurable, I don't know
> why it would bother people if this existed in tandem with other
> solutions - especially since it's so small)

Again, very true in the duplication sense.

It is general code, not tied to hardware, so it better justify itself
somehow.

> 3) there is really no maintainer for this feature category, so
> discussions get bogged down as varying requirements and solutions
> are suggested, which can not easily be compared against each other
> (especially for non-existent implementations)  In particular, it's
> unclear who I have to get the approval of for this code or some
> derivative of it to be accepted.  That makes the development task
> a very open-ended one.

No, I don't think this is a problem, you have bigger ones to deal with
:)

> 4) this is for a popular use case, as opposed to some minor
> outlying thing, and so people perceive the need to get it
> exactly right.  In this sense, the code would be a victim of
> it's own success.

It's not the code's "success" here, it's the "this solves a general
problem in a very specific way" issue, that just happens to be a use
case a lot of different people want to see addressed.  And since it is
only solving it in a narrow way, that's a problem for all of those other
people.

> 5) blocking this is perceived to be a way to accomplish a
> larger, related goal (if this is true then it has lots of
> interesting implications for the economics of open source
> work)

No, I don't think that's the issue here.

> In general, there is a tension between the normal nature of adapting
> the kernel to the most general use cases, and the specialization
> that is performed in developing an embedded product.  Often
> times, solutions to embedded requirements are very finely tuned
> to a particular field of use or situation, and don't lend themselves
> easily to the type of generalization that mainlining usually requires.

Why do people get hung up on the "embedded is special" case here.  Fact,
it's not.  Do you somehow think that the work done for the HUGE
multiprocessor machines 10 years ago would have gotten away with hacks
saying, "this is a special case, only we care about this", when they
were dealing with 4 and 8 way machines?  No, they didn't, and as such,
there's nothing special with embedded here either.  Matter of fact, your
machines are now more powerful than those old 10 year old machines, and
are really general purpose computers, just used in a single/limited
manner, again, just like those original big SMP machines were.

So, I've said it many times before, and I'll say it again:

  Yes, you are special and unique, just like everyone else.

The next person who says the "embedded is different" phrase again, owes
me a beer of my choice.

> Which brings me to my last point and a question.
> Is it inconceivable for there to be a category of code in the
> Linux kernel which supports ONLY Android user space, and no
> other?  That is, must every Android patch be generalized in
> some manner to a broader use case?

No, that's fine, I have no problem with a drivers/android/ directory for
android only stuff, because:

> I suspect some of them (particularly binder) will not lend
> themselves easily to this type of generalization.

Exactly, that is where the binder code should eventually end up, and
probably a few other minor bits.  But for almost everything else (and
really, there isn't all that much android-only code we are talking about
here:
$ wc drivers/staging/android/* drivers/staging/android/switch/* -l | tail -n 1
   8429 total

And the switch code is already being worked on to be generalized by
others, so that means we only have:
$ wc drivers/staging/android/*  -l | tail -n 1
   8015 total
to worry about here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2012-01-06 21:20               ` Greg KH
@ 2012-01-06 22:41                 ` Tim Bird
  2012-01-06 23:17                   ` Greg KH
  2012-01-06 23:35                   ` Greg KH
  0 siblings, 2 replies; 53+ messages in thread
From: Tim Bird @ 2012-01-06 22:41 UTC (permalink / raw)
  To: Greg KH
  Cc: NeilBrown, Brian Swetland, david@lang.hm, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering, Andrew Morton, Paul McKenney, Alan Stern,
	Alan Cox, Linus Torvalds

On 01/06/2012 01:20 PM, Greg KH wrote:
> On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote:
>> 4) this is for a popular use case, as opposed to some minor
>> outlying thing, and so people perceive the need to get it
>> exactly right.  In this sense, the code would be a victim of
>> it's own success.
> 
> It's not the code's "success" here, it's the "this solves a general
> problem in a very specific way" issue, that just happens to be a use
> case a lot of different people want to see addressed.  And since it is
> only solving it in a narrow way, that's a problem for all of those other
> people.

It's only a problem in the sense that those different people
don't get something for free.  The presence of an isolated,
configurable driver somewhere in the kernel source, that
addresses one group's issues and not another's doesn't impose
any measurable burden on that other group.

It's more of an opportunity cost thing.  The opportunity
to develop something more general can be reduced when a narrow
solution is accepted.  But this is exactly related to
encouraging people to develop more general solutions, when
narrow ones accomplish what they need (see point 5).

I understand that the whole model is built on people contributing
code that not only addresses their own needs, but the needs of
others.  But of course there should be balance.  For example
I have concerns about integrating this into the printk code path,
because I think that code path is complex enough as it is,
and that combining this functionality with that is likely to
create more maintenance problems rather than less in the
long run.  (But I'll still plan to look at this option ...)

Sometimes, features should just remain separate.

> 
>> 5) blocking this is perceived to be a way to accomplish a
>> larger, related goal (if this is true then it has lots of
>> interesting implications for the economics of open source
>> work)
> 
> No, I don't think that's the issue here.

I don't think anyone is thinking: "If I block this
I can get Tim to do x".  But there *is* a desire to
create a more general logging solution.  And not
accepting this as is, is a way to encourage people to
work on that.

For some situations, I would object to that.  But in
this case I actually agree that there are more potential users
of this, and it would be nice to solve more problems than
just the Android case, with a consolidated code base.  That's
why I'm willing to invest in researching other loggers and
doing some experimental development.

> 
>> In general, there is a tension between the normal nature of adapting
>> the kernel to the most general use cases, and the specialization
>> that is performed in developing an embedded product.  Often
>> times, solutions to embedded requirements are very finely tuned
>> to a particular field of use or situation, and don't lend themselves
>> easily to the type of generalization that mainlining usually requires.

I should have left off "embedded product" for this thread.  There is
a tension in general between generalization and specialization.  And
that tension shows up in a lot of discussions about mainlining code.
My goal here is to determine where people think the balance lies
for this particular code.

> Why do people get hung up on the "embedded is special" case here.  Fact,
> it's not.  Do you somehow think that the work done for the HUGE
> multiprocessor machines 10 years ago would have gotten away with hacks
> saying, "this is a special case, only we care about this", when they
> were dealing with 4 and 8 way machines?  No, they didn't, and as such,
> there's nothing special with embedded here either.  Matter of fact, your
> machines are now more powerful than those old 10 year old machines, and
> are really general purpose computers, just used in a single/limited
> manner, again, just like those original big SMP machines were.
> 
> So, I've said it many times before, and I'll say it again:
> 
>   Yes, you are special and unique, just like everyone else.
> 
> The next person who says the "embedded is different" phrase again, owes
> me a beer of my choice.

Consider this your free beer token... :-)

I believe "embedded is different" because we throw away so much code,
that I just don't think desktop and enterprise do.  The number of
non-mainlined BSPs that have gone straight from product release to
bit bucket is truly staggering.

You're only looking at the Android case, where the machines are as
powerful as they were 10 years ago.  Broad swaths of embedded
are not on that kind of hardware.  It's different (there it is again,
is that two beers?) being at the bottom side of the scalability
spectrum rather than at the top.  Everyone expected that eventually
there would be more SMP machines in the world.  This is not true of
all embedded processors.  We're used to developing code that we'll
throw away after a single release.

But I think this is beside the point for this particular code.

I agree that it's worth some effort to try to generalize this
logger code. Indeed, when I first saw the logger code I thought,
"this would be nice to use on my other projects".  So we'll spend
some time trying to see if it can address more than a single group's
requirements, without gumming up the features that make it valuable
to Android as is.  I'll tell you in a few months whether I think
it's practical or worth it.

>> Which brings me to my last point and a question.
>> Is it inconceivable for there to be a category of code in the
>> Linux kernel which supports ONLY Android user space, and no
>> other?  That is, must every Android patch be generalized in
>> some manner to a broader use case?
> 
> No, that's fine, I have no problem with a drivers/android/ directory for
> android only stuff, because:
> 
>> I suspect some of them (particularly binder) will not lend
>> themselves easily to this type of generalization.
> 
> Exactly, that is where the binder code should eventually end up, and
> probably a few other minor bits.

That's good to hear.

> But for almost everything else (and
> really, there isn't all that much android-only code we are talking about
> here:
> $ wc drivers/staging/android/* drivers/staging/android/switch/* -l | tail -n 1
>    8429 total
> 
> And the switch code is already being worked on to be generalized by
> others, so that means we only have:
> $ wc drivers/staging/android/*  -l | tail -n 1
>    8015 total
> to worry about here.

Well, I'm not sure you've gotten it all (my count is closer to 20k,
some of it sprinkled in a few weird places), but I agree completely
that it's not a lot of code, which is encouraging.  Did you put
anything in besides what's in staging?
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Workgroup of the Linux Foundation
Senior Staff Engineer, Sony Network Entertainment
=============================

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2012-01-06 22:41                 ` Tim Bird
@ 2012-01-06 23:17                   ` Greg KH
  2012-01-06 23:35                   ` Greg KH
  1 sibling, 0 replies; 53+ messages in thread
From: Greg KH @ 2012-01-06 23:17 UTC (permalink / raw)
  To: Tim Bird
  Cc: NeilBrown, Brian Swetland, david@lang.hm, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering, Andrew Morton, Paul McKenney, Alan Stern,
	Alan Cox, Linus Torvalds

On Fri, Jan 06, 2012 at 02:41:52PM -0800, Tim Bird wrote:
> > But for almost everything else (and
> > really, there isn't all that much android-only code we are talking about
> > here:
> > $ wc drivers/staging/android/* drivers/staging/android/switch/* -l | tail -n 1
> >    8429 total
> > 
> > And the switch code is already being worked on to be generalized by
> > others, so that means we only have:
> > $ wc drivers/staging/android/*  -l | tail -n 1
> >    8015 total
> > to worry about here.
> 
> Well, I'm not sure you've gotten it all (my count is closer to 20k,
> some of it sprinkled in a few weird places), but I agree completely
> that it's not a lot of code, which is encouraging.  Did you put
> anything in besides what's in staging?

Yes, it should all be under drivers/staging/android in linux-next, soon
to go to Linus.

What do you think I have missed?  I only know of two missing bits:
	- android USB gadget drivers/stuff
	- wakelocks

The first is being addressed by the Google developers, see the post to
the linux-usb mailing list the end of last month with the code, and all
of the review comments on it.

The second is being worked on by someone else, as I'm sure you are
already aware.

Other than that, I don't see anything in the common.git android tree
that I am missing, other than some odd driver updates which should all
be upstream already from what I can tell.

So how are our line counts so far off?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2012-01-06 22:41                 ` Tim Bird
  2012-01-06 23:17                   ` Greg KH
@ 2012-01-06 23:35                   ` Greg KH
  1 sibling, 0 replies; 53+ messages in thread
From: Greg KH @ 2012-01-06 23:35 UTC (permalink / raw)
  To: Tim Bird
  Cc: NeilBrown, Brian Swetland, david@lang.hm, linux-embedded,
	linux kernel, Arnd Bergmann, john stultz, Kay Sievers,
	Lennart Poettering, Andrew Morton, Paul McKenney, Alan Stern,
	Alan Cox, Linus Torvalds

On Fri, Jan 06, 2012 at 02:41:52PM -0800, Tim Bird wrote:
> On 01/06/2012 01:20 PM, Greg KH wrote:
> > On Fri, Jan 06, 2012 at 12:56:27PM -0800, Tim Bird wrote:
> >> 4) this is for a popular use case, as opposed to some minor
> >> outlying thing, and so people perceive the need to get it
> >> exactly right.  In this sense, the code would be a victim of
> >> it's own success.
> > 
> > It's not the code's "success" here, it's the "this solves a general
> > problem in a very specific way" issue, that just happens to be a use
> > case a lot of different people want to see addressed.  And since it is
> > only solving it in a narrow way, that's a problem for all of those other
> > people.
> 
> It's only a problem in the sense that those different people
> don't get something for free.  The presence of an isolated,
> configurable driver somewhere in the kernel source, that
> addresses one group's issues and not another's doesn't impose
> any measurable burden on that other group.

That's been proven incorrect over the long-run as we all must now
maintain that user/kernel API for forever.

You could say the same thing for adding new ioctls for individual
drivers instead of creating a new syscall.  Sometimes, in rare cases, it
does make sense to create a new ioctl.  But that decision has to be
taken very carefully, as again, it needs to be standardized with all
other drivers of that time, and maintained for forever.

> It's more of an opportunity cost thing.  The opportunity
> to develop something more general can be reduced when a narrow
> solution is accepted.  But this is exactly related to
> encouraging people to develop more general solutions, when
> narrow ones accomplish what they need (see point 5).

When people work together on their narrow goals, they end up creating
something that works for way more than just themselves, and in fact, it
turns out it works better for people who originally wasn't even aware of
the issues involved.

Again, this has been proven over time with the Linux kernel development
process.  I wrote a whole chapter on this very topic in the book,
"Beautiful Code", if you want to hear more about it.

In fact, I would argue, that is one of the top reasons why Linux has
succeeded so well.  Again, you really aren't unique, others will have
the same issues/problems that you do, and if you solve them in a case
that works for more than one, it makes everything better.

> I understand that the whole model is built on people contributing
> code that not only addresses their own needs, but the needs of
> others.  But of course there should be balance.  For example
> I have concerns about integrating this into the printk code path,
> because I think that code path is complex enough as it is,
> and that combining this functionality with that is likely to
> create more maintenance problems rather than less in the
> long run.  (But I'll still plan to look at this option ...)

Complex code paths are tough, I will grant you that.  But if it was
easy, then, you wouldn't be the person for the job :)

> Sometimes, features should just remain separate.

When they do different things, of course.  When they do the same thing,
of course not.

> > Why do people get hung up on the "embedded is special" case here.  Fact,
> > it's not.  Do you somehow think that the work done for the HUGE
> > multiprocessor machines 10 years ago would have gotten away with hacks
> > saying, "this is a special case, only we care about this", when they
> > were dealing with 4 and 8 way machines?  No, they didn't, and as such,
> > there's nothing special with embedded here either.  Matter of fact, your
> > machines are now more powerful than those old 10 year old machines, and
> > are really general purpose computers, just used in a single/limited
> > manner, again, just like those original big SMP machines were.
> > 
> > So, I've said it many times before, and I'll say it again:
> > 
> >   Yes, you are special and unique, just like everyone else.
> > 
> > The next person who says the "embedded is different" phrase again, owes
> > me a beer of my choice.
> 
> Consider this your free beer token... :-)

Oooh, nice, I'll take you up on that at ELC in a few weeks :)

> I believe "embedded is different" because we throw away so much code,
> that I just don't think desktop and enterprise do.  The number of
> non-mainlined BSPs that have gone straight from product release to
> bit bucket is truly staggering.

Yes, and that's sad.  And maybe, if embedded feels that they are so
special, and that contributing to mainline doesn't help or matter, then
they shouldn't.

Yes, I mean it, why push stuff upstream if you aren't going to maintain
it and no one will ever build on top of it.

Oh wait, you will end up building on it with your next iteration, so if
you throw it away, oops, you just wasted time and money.

And again, this mirrors the super computer case EXACTLY!  In fact, I
would argue that there are less machines shipped for those types of
systems, than embedded, and do you see those developers saying that they
should just keep their code out of mainline?  Or do they actually
realize that it is cheaper, in time and money to mainline stuff.

Remember, Intel and IBM has said publically, numerous times, that it
makes business and money sense to do this.  Are the people at these
companies somehow wrong?  Do they not also support the "embedded"
markets?

> You're only looking at the Android case, where the machines are as
> powerful as they were 10 years ago.  Broad swaths of embedded
> are not on that kind of hardware.  It's different (there it is again,
> is that two beers?) being at the bottom side of the scalability
> spectrum rather than at the top.  Everyone expected that eventually
> there would be more SMP machines in the world.  This is not true of
> all embedded processors.  We're used to developing code that we'll
> throw away after a single release.

Again, same goes for super computers.

And again, if it really is going to be thrown away, then by all means,
throw it away.  But I will bet you the beer you owe me, odds are you
want that code again in the future to build on top of.

Unless it's an obsolete architecture like Voyager or something, then
just throw that away, no one would ever build on top of that :)

Oh wait, again, look at the benifits Linux overall has achieved from
supporting something "odd" and "obsolete" in the mainline kernel, to
help abstract core functionality to make things more flexable and
workable for {gasp} even the embedded folks.

Sorry, again, I don't buy your argument one bit.

> But I think this is beside the point for this particular code.

I agree.

thanks,

greg "beer's on Tim!" k-h

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  1:20     ` NeilBrown
                         ` (3 preceding siblings ...)
  2011-12-22  4:52       ` david
@ 2011-12-22 14:59       ` Arnd Bergmann
  2011-12-22 15:13         ` Kay Sievers
  4 siblings, 1 reply; 53+ messages in thread
From: Arnd Bergmann @ 2011-12-22 14:59 UTC (permalink / raw)
  To: NeilBrown
  Cc: Tim Bird, Greg KH, linux-embedded, linux kernel, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Thursday 22 December 2011, NeilBrown wrote:
> If you created a 'logbuf' filesystem that used libfs to provide a single
> directory in which privileged processes could create files then you wouldn't
> need the kernel to "know" the allowed logs: radio, events, main, system.
> The size could be set by ftruncate() (by privileged used again) rather than
> being hardcoded.
> 
> You would defined 'read' and 'write' much like you currently do to create a list of
> datagrams in a circular buffer and replace the ioctls by more standard
> interfaces:
> 
> LOGGER_GET_LOG_BUG_SIZE would use 'stat' and the st_blocks field
> LOGGER_GET_LOG_LEN would use 'stat' and the st_size field
> LOGGER_GET_NEXT_ENTRY_LEN could use the FIONREAD ioctl
> LOGGER_FLUSH_LOG could use ftruncate
> 
> The result would be much the same amount of code, but an interface which has
> fewer details hard-coded and is generally more versatile and accessible.

I like the idea and was going to suggest something very similar, but I wonder
if we could take the approach even further:

* Remove all kernel code for this and use a user space library together
  with tmpfs
* prepopulate the tmpfs at boot time with all the log buffers in the right
  size, and set the maximum file system size so that they cannot grow further.
* Have minimal formatting in the log buffer: A few bytes header (ring buffer
  start and end)
* Mandate that user space must use mmap and atomic operations to reserve space
  in the log and write to the files.
* Provide a tool to get the log data out of the buffer again in a race-free way.

Since any program that is allowed to write to the buffer can overwrite all
existing information in it anyway, I think we don't actually need any kernel
help in maintaining consistency of the contents either -- the reader will
simply discard any data. The main thing we would not be able to guarantee
without kernel help is proving the origin of individual messages, but I'm
not sure if that is a design goal.

	Arnd

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22 14:59       ` Arnd Bergmann
@ 2011-12-22 15:13         ` Kay Sievers
  0 siblings, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-12-22 15:13 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: NeilBrown, Tim Bird, Greg KH, linux-embedded, linux kernel,
	john stultz, Brian Swetland, Lennart Poettering

On Thu, Dec 22, 2011 at 15:59, Arnd Bergmann <arnd@arndb.de> wrote:

> * Remove all kernel code for this and use a user space library together
>  with tmpfs
> * prepopulate the tmpfs at boot time with all the log buffers in the right
>  size, and set the maximum file system size so that they cannot grow further.
> * Have minimal formatting in the log buffer: A few bytes header (ring buffer
>  start and end)
> * Mandate that user space must use mmap and atomic operations to reserve space
>  in the log and write to the files.
> * Provide a tool to get the log data out of the buffer again in a race-free way.
>
> Since any program that is allowed to write to the buffer can overwrite all
> existing information in it anyway, I think we don't actually need any kernel
> help in maintaining consistency of the contents either -- the reader will
> simply discard any data. The main thing we would not be able to guarantee
> without kernel help is proving the origin of individual messages, but I'm
> not sure if that is a design goal.

I'm very sure you want rate limiting, global sequence numbers and
trusted properties added to the log.

I can imagine that the kernel provides a single character device,
instantiates a tmpfs mountpoint and splits all stuff that travels into
that device into several files (ring buffers) in the tmpfs filesystem.
The files are probably just split by the uid of the incoming message
sender, which maps to an app at Android.

Only the character device would be able to write to the file, but the
user with the matching uid can read its own file back directly from
tmpfs.

I'm very convinced that we should not design a logging system today,
where we give the ownership of the data to the actual untrusted
process that logs.

The current kmsg buffer could just be one additional file in this tmpfs mount.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-22  0:36   ` Tim Bird
  2011-12-22  0:51     ` Greg KH
  2011-12-22  1:20     ` NeilBrown
@ 2011-12-22  4:42     ` david
  2 siblings, 0 replies; 53+ messages in thread
From: david @ 2011-12-22  4:42 UTC (permalink / raw)
  To: Tim Bird
  Cc: Greg KH, linux-embedded, linux kernel, Arnd Bergmann, john stultz,
	Brian Swetland, Kay Sievers, Lennart Poettering

On Wed, 21 Dec 2011, Tim Bird wrote:

> On 12/21/2011 03:19 PM, Greg KH wrote:
>> That all describes the current code, but you haven't described what's
>> wrong with the existing syslog interface that requires this new driver
>> to be written.  And why can't the existing interface be fixed to address
>> these (potential) shortcomings?
>
>
>>> One specific question I have is where is the most appropriate
>>> place for this code to live, in the kernel source tree?
>>> Other embedded systems might want to use this system (it
>>> is simpler than syslog, and superior in some ways), so I don't
>>> think it should remain in an android-specific directory.
>>
>> What way is it superior?
>
> Here are some ways that this code is superior to syslog:
>
> Speed and overhead
> ------------------
> syslogd requires a separate user-space process to manage
> the log area, where this code does not.  The overhead
> for any user-space process is at least 16K, and usually much
> more than this (not including the size of the log storage
> area).  On one of my embedded systems, where syslogd is
> provided by busybox, the unique set size for syslogd
> is 96K.  This code, built in to the Linux kernel is less
> than 4K code and data (again, not including the size of
> the log storage area).
>
> To deliver a message to syslog requires a socket operation
> and a few context switches.  With the logger code,
> the operation is a file operation (writev) to kernel memory,
> with only one context switch into and out of the kernel.
>
> The open and write paths through the Linux kernel are
> arguably more optimized than the networking paths.
> This is a consideration since the log operations should
> optimized for the "create-a-log-entry" case (the open/write
> path), since logs are mostly written and almost never read
> on production devices.

comparing this to a userspace syslog daemon, you are correct that doing 
things in the kernel is slightly faster, but just because putting 
something in the kernel may be faster, doesn't mean that it belongs there 
(the tux webserver anyone??). How much faster, and is this something that 
actually matters? it doens't matter if something is 1000x faster if it's 
only called infrequently. what rate of log messages do you expect to 
support (and note that userspace syslog daemons can handle up to a 
million logs/sec, on the right hardware)

> No dependence on persistent storage
> -----------------------------------
> syslogd requires either persistent storage to store the log,
> or a network connection to an outside device.  Being
> purely memory-based, the logger requires neither of these.
> With logger, persistence of the log data is left to the
> implementor.  In Android, the data is delivered over a USB
> connection via adb or to the console as ascii text, using
> logcat.  In other embedded systems, other mechanisms might
> be used if long-term storage of the messages is desired.
> With logger, there is no automatic notion of on-device
> persistent storage for the log data.
>
> No dependence on networking kernel code
> ---------------------------------------
> The syslog communication mechanism requires sockets.  This
> prevents one from configuring the kernel with no networking
> support, which is sometimes done in embedded systems to save
> size.
>
> Simpler constraint on log size
> ------------------------------
> The busybox syslog daemon uses a log rotation feature to constrain
> the size of the log in persistent storage.  This is overly
> cumbersome in both speed and complexity compared to the logger's
> simple ring buffer.
>
> Licensing
> ---------
> The code implementing library and command line tool support
> for this logger (in user space) is available under an Apache license,
> rather than a GPL license, which is desirable for some vendors.

All of these issues are only true if you consider the existing syslog 
daemones, but it sounds like it would be pretty trivial to write a new 
syslog daemon (or hack up a BSD licensed one) to listen on multiple 
sockets (which most of the existing syslgo daemond can do already), assign 
a priority to each different socket, and then keep the logs in it's own 
buffer, with an api to retrieve them.

it wouldn't be quite as small or as fast as doing it in the kernel, but it 
would not be large, and I have trouble believing that it would be slow 
enough to notice.

>> Again, why not extend syslog?  Why not "fix"
>> syslog if this really is a superior thing?
>
> "extend" syslog would not really the the right
> direction.  This system is simpler than syslog,
> while simultaneously having at least one valuable
> extra feature (separate log channels).
>
> syslog has a standard set of interfaces in libc
> and various syslogd implementations, which are
> heavier weight in nature than what is provided here.
> It is unclear that an attempt at reducing these attributes
> (such as memory overhead, number of context switches,
> dependence on persistent storage, and socket utilization) would
> yield a system substantially different from the logger.

I once saw a company spend 6 months of developer effort optimizing a 
readof a config file (at the start it took 30 seconds to read the config 
100,000 times at the end it to .5 seconds), this was a great speedup, 
until you looked at the numbers and realized that across 50 servers this 
function was called 50,000 times in an hour (configs for cgis), so this 6 
month development effort saved 15 sec of cpu time in a peak hour, spread 
across 50 machines. In other words a waste of time and diversion of 
resources that could have been far better employed fixing bigger issues.

I suspect that logging is in the same category, it just doesn't happen 
enough for the performance difference to really matter.

making this be a 'standard' syslog(3) interface, or even a very close 
cousin to it (if you want apps to be able to dump raw strings to it rather 
than formatted data for example) would not hurt the pure android on a 
low-end phone noticably, but would make it much easier for android on a 
high-end device (tablets, netbooks, etc) to plug in a full blown syslog 
daemon that could save the logs to disk, or to send them over the network, 
etc. these things are very useful to have and much easier to do in 
userspace than in the kernel.

David Lang

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-21 22:59 RFC: android logger feedback request Tim Bird
  2011-12-21 23:19 ` Greg KH
@ 2011-12-22  0:59 ` David Brown
  2011-12-29  0:39 ` Andrew Morton
  2 siblings, 0 replies; 53+ messages in thread
From: David Brown @ 2011-12-22  0:59 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz, Greg KH,
	Brian Swetland

On Wed, Dec 21, 2011 at 02:59:15PM -0800, Tim Bird wrote:

> In Android, this system uses a fixed set of device nodes with
> well-known names: /dev/log/main, /dev/log/events, /dev/log/radio
> and /dev/log/system.

These names seem very specific to Android's use case.  Would we want
the mechanism to be more general, or configurable, so that another
embedded-type system would be able to have their own log types.

But, the biggest question I have is to understand why this is a kernel
driver.  In essence, it is just shuttling data from various processes
to something that eventually needs to read that data.  In other words,
it is doing a subset of what syslog does.  If the concern is about a
userspace program crashing, wouldn't the userspace tool that reads
this data also be able to crash?

The driver clearly has no kernel API, since it exports no symbols.  I
could see more of an argument for it if certain kernel things were
able to write to the log, but then we are also duplicating other
functionality.

David

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-21 22:59 RFC: android logger feedback request Tim Bird
  2011-12-21 23:19 ` Greg KH
  2011-12-22  0:59 ` David Brown
@ 2011-12-29  0:39 ` Andrew Morton
  2012-01-04 15:34   ` Geunsik Lim
  2 siblings, 1 reply; 53+ messages in thread
From: Andrew Morton @ 2011-12-29  0:39 UTC (permalink / raw)
  To: Tim Bird
  Cc: linux-embedded, linux kernel, Arnd Bergmann, john stultz, Greg KH,
	Brian Swetland

On Wed, 21 Dec 2011 14:59:15 -0800
Tim Bird <tim.bird@am.sony.com> wrote:

> Hi all,
> 
> I'm looking for feedback on the Android logger code,

The code looks nice.

>
> ...
>
> +/* logger_offset - returns index 'n' into the log via (optimized) modulus */
> +#define logger_offset(n)	((n) & (log->size - 1))
> +

This macro depends upon the presence of a local variable called "log"
and is therefore all stinky.  Pass "log" in as an arg and convert it to
a regular C function.

>
> ...
>
> +/*
> + * get_entry_len - Grabs the length of the payload of the next entry starting
> + * from 'off'.
> + *
> + * Caller needs to hold log->mutex.
> + */
> +static __u32 get_entry_len(struct logger_log *log, size_t off)
> +{
> +	__u16 val;
> +
> +	switch (log->size - off) {
> +	case 1:
> +		memcpy(&val, log->buffer + off, 1);
> +		memcpy(((char *) &val) + 1, log->buffer, 1);

So numbers in the buffer are in host endian order.  That's worth
capturing in a comment somewhere.

There must be a way of doing the above more neatly, using
log->buffer[off] and log->buffer[0].  Perhaps using
include/linux/unaligned/packed_struct.h.

> +		break;
> +	default:
> +		memcpy(&val, log->buffer + off, 2);

get_unaligned()

> +	}
> +
> +	return sizeof(struct logger_entry) + val;
> +}
> +
>
> ...
>
> +static ssize_t logger_read(struct file *file, char __user *buf,
> +			   size_t count, loff_t *pos)
> +{
> +	struct logger_reader *reader = file->private_data;
> +	struct logger_log *log = reader->log;
> +	ssize_t ret;
> +	DEFINE_WAIT(wait);
> +
> +start:
> +	while (1) {
> +		prepare_to_wait(&log->wq, &wait, TASK_INTERRUPTIBLE);
> +
> +		mutex_lock(&log->mutex);

If mutex_lock() had to wait for the mutex, it will return in state
TASK_RUNNING, thus rubbing out the effects of prepare_to_wait().  We'll
just loop again so this is a benign buglet.

Could we lose a wakeup by running prepaer_to_wait() outside the lock? 
I don't think so, but I stopped thinking about it when I saw the above
bug.  These two lines should be switched around.

> +		ret = (log->w_off == reader->r_off);
> +		mutex_unlock(&log->mutex);
> +		if (!ret)
> +			break;
> +
> +		if (file->f_flags & O_NONBLOCK) {
> +			ret = -EAGAIN;
> +			break;
> +		}
> +
> +		if (signal_pending(current)) {
> +			ret = -EINTR;
> +			break;
> +		}
> +
> +		schedule();
> +	}
> +
> +	finish_wait(&log->wq, &wait);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&log->mutex);
> +
> +	/* is there still something to read or did we race? */
> +	if (unlikely(log->w_off == reader->r_off)) {
> +		mutex_unlock(&log->mutex);
> +		goto start;
> +	}
> +
> +	/* get the size of the next entry */
> +	ret = get_entry_len(log, reader->r_off);
> +	if (count < ret) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	/* get exactly one entry from the log */
> +	ret = do_read_log_to_user(log, reader, buf, ret);
> +
> +out:
> +	mutex_unlock(&log->mutex);
> +
> +	return ret;
> +}
> +
>
> ...
>
> +/*
> + * clock_interval - is a < c < b in mod-space? Put another way, does the line
> + * from a to b cross c?
> + */

This is where my little brain started to hurt.  I assume it's correct ;)

> +static inline int clock_interval(size_t a, size_t b, size_t c)
> +{
> +	if (b < a) {
> +		if (a < c || b >= c)
> +			return 1;
> +	} else {
> +		if (a < c && b >= c)
> +			return 1;
> +	}
> +
> +	return 0;
> +}

The explicit inline(s) are increaseingly old-fashioned.  gcc is good
about this now.

>
> ...
>
> +static ssize_t do_write_log_from_user(struct logger_log *log,
> +				      const void __user *buf, size_t count)
> +{
> +	size_t len;
> +
> +	len = min(count, log->size - log->w_off);
> +	if (len && copy_from_user(log->buffer + log->w_off, buf, len))
> +		return -EFAULT;
> +
> +	if (count != len)
> +		if (copy_from_user(log->buffer, buf + len, count - len))
> +			return -EFAULT;

Is it correct to simply return here without updating ->w_off? 
We've just copied "len" bytes in from userspace.

> +	log->w_off = logger_offset(log->w_off + count);
> +
> +	return count;
> +}
> +
> +/*
> + * logger_aio_write - our write method, implementing support for write(),
> + * writev(), and aio_write(). Writes are our fast path, and we try to optimize
> + * them above all else.
> + */
> +ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov,
> +			 unsigned long nr_segs, loff_t ppos)
> +{
> +	struct logger_log *log = file_get_log(iocb->ki_filp);
> +	size_t orig = log->w_off;
> +	struct logger_entry header;
> +	struct timespec now;
> +	ssize_t ret = 0;
> +
> +	now = current_kernel_time();
> +
> +	header.pid = current->tgid;
> +	header.tid = current->pid;

hm, I thought task_struct.pid was the pid.

Also, pids are not unique system-wide.  If the user is using PID
namespaces then the logs will contain what may be fatally confusing
duplicates.  This needs to be fixed, I expect.  

There are broader namespace issues which should be thought about.  Does
it make sense for readers in one container to be returned logs from a
different container?  Perhaps the do-it-via-a-filesystem proposals can
address this.

> +	header.sec = now.tv_sec;
> +	header.nsec = now.tv_nsec;
> +	header.len = min_t(size_t, iocb->ki_left, LOGGER_ENTRY_MAX_PAYLOAD);
> +
> +	/* null writes succeed, return zero */
> +	if (unlikely(!header.len))
> +		return 0;
> +
> +	mutex_lock(&log->mutex);
> +
> +	/*
> +	 * Fix up any readers, pulling them forward to the first readable
> +	 * entry after (what will be) the new write offset. We do this now
> +	 * because if we partially fail, we can end up with clobbered log
> +	 * entries that encroach on readable buffer.
> +	 */
> +	fix_up_readers(log, sizeof(struct logger_entry) + header.len);
> +
> +	do_write_log(log, &header, sizeof(struct logger_entry));
> +
> +	while (nr_segs-- > 0) {
> +		size_t len;
> +		ssize_t nr;
> +
> +		/* figure out how much of this vector we can keep */
> +		len = min_t(size_t, iov->iov_len, header.len - ret);
> +
> +		/* write out this segment's payload */
> +		nr = do_write_log_from_user(log, iov->iov_base, len);
> +		if (unlikely(nr < 0)) {
> +			log->w_off = orig;
> +			mutex_unlock(&log->mutex);
> +			return nr;
> +		}
> +
> +		iov++;
> +		ret += nr;
> +	}
> +
> +	mutex_unlock(&log->mutex);
> +
> +	/* wake up any blocked readers */
> +	wake_up_interruptible(&log->wq);
> +
> +	return ret;
> +}
> +
>
> ...
>
> +static int logger_release(struct inode *ignored, struct file *file)
> +{
> +	if (file->f_mode & FMODE_READ) {
> +		struct logger_reader *reader = file->private_data;
> +		list_del(&reader->list);

locking for reader->list?

> +		kfree(reader);
> +	}
> +
> +	return 0;
> +}
>
> ...
>
> +static long logger_ioctl(struct file *file, unsigned int cmd, unsigned long arg)

ew...

> +static struct logger_log *get_log_from_minor(int minor)
> +{
> +	if (log_main.misc.minor == minor)
> +		return &log_main;
> +	if (log_events.misc.minor == minor)
> +		return &log_events;
> +	if (log_radio.misc.minor == minor)
> +		return &log_radio;
> +	if (log_system.misc.minor == minor)
> +		return &log_system;
> +	return NULL;
> +}

ew...


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: RFC: android logger feedback request
  2011-12-29  0:39 ` Andrew Morton
@ 2012-01-04 15:34   ` Geunsik Lim
  0 siblings, 0 replies; 53+ messages in thread
From: Geunsik Lim @ 2012-01-04 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Tim Bird, linux-embedded, linux kernel, Arnd Bergmann,
	john stultz, Greg KH, Brian Swetland

On Thu, Dec 29, 2011 at 9:39 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 21 Dec 2011 14:59:15 -0800
> Tim Bird <tim.bird@am.sony.com> wrote:
>
>> Hi all,
>>
>> I'm looking for feedback on the Android logger code,
>
> The code looks nice.
>
>>
>> ...
>>
>> +/* logger_offset - returns index 'n' into the log via (optimized) modulus */
>> +#define logger_offset(n)     ((n) & (log->size - 1))
>> +
>
> This macro depends upon the presence of a local variable called "log"
> and is therefore all stinky.  Pass "log" in as an arg and convert it to
> a regular C function.
>
>>
>> ...
>>
>> +/*
>> + * get_entry_len - Grabs the length of the payload of the next entry starting
>> + * from 'off'.
>> + *
>> + * Caller needs to hold log->mutex.
>> + */
>> +static __u32 get_entry_len(struct logger_log *log, size_t off)
>> +{
>> +     __u16 val;
>> +
>> +     switch (log->size - off) {
>> +     case 1:
>> +             memcpy(&val, log->buffer + off, 1);
>> +             memcpy(((char *) &val) + 1, log->buffer, 1);
>
> So numbers in the buffer are in host endian order.  That's worth
> capturing in a comment somewhere.
>
> There must be a way of doing the above more neatly, using
> log->buffer[off] and log->buffer[0].  Perhaps using
> include/linux/unaligned/packed_struct.h.
>
>> +             break;
>> +     default:
>> +             memcpy(&val, log->buffer + off, 2);
>
> get_unaligned()
>
>> +     }
>> +
>> +     return sizeof(struct logger_entry) + val;
>> +}
>> +
>>
>> ...
>>
>> +static ssize_t logger_read(struct file *file, char __user *buf,
>> +                        size_t count, loff_t *pos)
>> +{
>> +     struct logger_reader *reader = file->private_data;
>> +     struct logger_log *log = reader->log;
>> +     ssize_t ret;
>> +     DEFINE_WAIT(wait);
>> +
>> +start:
>> +     while (1) {
>> +             prepare_to_wait(&log->wq, &wait, TASK_INTERRUPTIBLE);
>> +
>> +             mutex_lock(&log->mutex);
>
> If mutex_lock() had to wait for the mutex, it will return in state
> TASK_RUNNING, thus rubbing out the effects of prepare_to_wait().  We'll
> just loop again so this is a benign buglet.
>
> Could we lose a wakeup by running prepaer_to_wait() outside the lock?
> I don't think so, but I stopped thinking about it when I saw the above
> bug.  These two lines should be switched around.
>
>> +             ret = (log->w_off == reader->r_off);
>> +             mutex_unlock(&log->mutex);
>> +             if (!ret)
>> +                     break;
>> +
>> +             if (file->f_flags & O_NONBLOCK) {
>> +                     ret = -EAGAIN;
>> +                     break;
>> +             }
>> +
>> +             if (signal_pending(current)) {
>> +                     ret = -EINTR;
>> +                     break;
>> +             }
>> +
>> +             schedule();
>> +     }
>> +
>> +     finish_wait(&log->wq, &wait);
>> +     if (ret)
>> +             return ret;
>> +
>> +     mutex_lock(&log->mutex);
>> +
>> +     /* is there still something to read or did we race? */
>> +     if (unlikely(log->w_off == reader->r_off)) {
>> +             mutex_unlock(&log->mutex);
>> +             goto start;
>> +     }
>> +
>> +     /* get the size of the next entry */
>> +     ret = get_entry_len(log, reader->r_off);
>> +     if (count < ret) {
>> +             ret = -EINVAL;
>> +             goto out;
>> +     }
>> +
>> +     /* get exactly one entry from the log */
>> +     ret = do_read_log_to_user(log, reader, buf, ret);
>> +
>> +out:
>> +     mutex_unlock(&log->mutex);
>> +
>> +     return ret;
>> +}
>> +
>>
>> ...
>>
>> +/*
>> + * clock_interval - is a < c < b in mod-space? Put another way, does the line
>> + * from a to b cross c?
>> + */
>
> This is where my little brain started to hurt.  I assume it's correct ;)
>
>> +static inline int clock_interval(size_t a, size_t b, size_t c)
>> +{
>> +     if (b < a) {
>> +             if (a < c || b >= c)
>> +                     return 1;
>> +     } else {
>> +             if (a < c && b >= c)
>> +                     return 1;
>> +     }
>> +
>> +     return 0;
>> +}
>
> The explicit inline(s) are increaseingly old-fashioned.  gcc is good
> about this now.
>
>>
>> ...
>>
>> +static ssize_t do_write_log_from_user(struct logger_log *log,
>> +                                   const void __user *buf, size_t count)
>> +{
>> +     size_t len;
>> +
>> +     len = min(count, log->size - log->w_off);
>> +     if (len && copy_from_user(log->buffer + log->w_off, buf, len))
>> +             return -EFAULT;
>> +
>> +     if (count != len)
>> +             if (copy_from_user(log->buffer, buf + len, count - len))
>> +                     return -EFAULT;
>
> Is it correct to simply return here without updating ->w_off?
> We've just copied "len" bytes in from userspace.
>
>> +     log->w_off = logger_offset(log->w_off + count);
>> +
>> +     return count;
>> +}
>> +
>> +/*
>> + * logger_aio_write - our write method, implementing support for write(),
>> + * writev(), and aio_write(). Writes are our fast path, and we try to optimize
>> + * them above all else.
>> + */
>> +ssize_t logger_aio_write(struct kiocb *iocb, const struct iovec *iov,
>> +                      unsigned long nr_segs, loff_t ppos)
>> +{
>> +     struct logger_log *log = file_get_log(iocb->ki_filp);
>> +     size_t orig = log->w_off;
>> +     struct logger_entry header;
>> +     struct timespec now;
>> +     ssize_t ret = 0;
>> +
>> +     now = current_kernel_time();
>> +
>> +     header.pid = current->tgid;
>> +     header.tid = current->pid;
>
> hm, I thought task_struct.pid was the pid.
It's right.
>
> Also, pids are not unique system-wide.  If the user is using PID
I think that this code is  correct because  current->tgid means
process id of user-space
and current->pid means thread id of user-space actually.  Is it not make sense?

> namespaces then the logs will contain what may be fatally confusing duplicates.
> This needs to be fixed, I expect.
header.tid as a thread id of user-space is unique system-wide at least
although header.pid is not unqueue system-wide.
Why should we think/understand  that the logs contain duplicated
information to users
who run logcat command for collecting and viewing system debug output?
Thanks.
>
> There are broader namespace issues which should be thought about.  Does
> it make sense for readers in one container to be returned logs from a
> different container?  Perhaps the do-it-via-a-filesystem proposals can
> address this.
>
>> +     header.sec = now.tv_sec;
>> +     header.nsec = now.tv_nsec;
>> +     header.len = min_t(size_t, iocb->ki_left, LOGGER_ENTRY_MAX_PAYLOAD);
>> +
>> +     /* null writes succeed, return zero */
>> +     if (unlikely(!header.len))
>> +             return 0;
>> +
>> +     mutex_lock(&log->mutex);
>> +
>> +     /*
>> +      * Fix up any readers, pulling them forward to the first readable
>> +      * entry after (what will be) the new write offset. We do this now
>> +      * because if we partially fail, we can end up with clobbered log
>> +      * entries that encroach on readable buffer.
>> +      */
>> +     fix_up_readers(log, sizeof(struct logger_entry) + header.len);
>> +
>> +     do_write_log(log, &header, sizeof(struct logger_entry));
>> +
>> +     while (nr_segs-- > 0) {
>> +             size_t len;
>> +             ssize_t nr;
>> +
>> +             /* figure out how much of this vector we can keep */
>> +             len = min_t(size_t, iov->iov_len, header.len - ret);
>> +
>> +             /* write out this segment's payload */
>> +             nr = do_write_log_from_user(log, iov->iov_base, len);
>> +             if (unlikely(nr < 0)) {
>> +                     log->w_off = orig;
>> +                     mutex_unlock(&log->mutex);
>> +                     return nr;
>> +             }
>> +
>> +             iov++;
>> +             ret += nr;
>> +     }
>> +
>> +     mutex_unlock(&log->mutex);
>> +
>> +     /* wake up any blocked readers */
>> +     wake_up_interruptible(&log->wq);
>> +
>> +     return ret;
>> +}
>> +
>>
>> ...
>>
>> +static int logger_release(struct inode *ignored, struct file *file)
>> +{
>> +     if (file->f_mode & FMODE_READ) {
>> +             struct logger_reader *reader = file->private_data;
>> +             list_del(&reader->list);
>
> locking for reader->list?
>
>> +             kfree(reader);
>> +     }
>> +
>> +     return 0;
>> +}
>>
>> ...
>>
>> +static long logger_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
>
> ew...
>
>> +static struct logger_log *get_log_from_minor(int minor)
>> +{
>> +     if (log_main.misc.minor == minor)
>> +             return &log_main;
>> +     if (log_events.misc.minor == minor)
>> +             return &log_events;
>> +     if (log_radio.misc.minor == minor)
>> +             return &log_radio;
>> +     if (log_system.misc.minor == minor)
>> +             return &log_system;
>> +     return NULL;
>> +}
>
> ew...
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Best regards,
Geunsik Lim, Samsung Electronics
Homepage: http://leemgs.fedorapeople.org
-----------------------------------------------------------------------
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
-----------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2012-01-06 23:35 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-21 22:59 RFC: android logger feedback request Tim Bird
2011-12-21 23:19 ` Greg KH
2011-12-22  0:18   ` john stultz
2011-12-22  0:27     ` Greg KH
2011-12-22  0:47       ` john stultz
2011-12-22  1:09         ` john stultz
2011-12-22  0:42     ` Tim Bird
2011-12-22  0:49       ` john stultz
2011-12-22  1:00         ` john stultz
2011-12-22  0:36   ` Tim Bird
2011-12-22  0:51     ` Greg KH
2011-12-22  1:32       ` Tim Bird
2011-12-22  1:47         ` Greg KH
2011-12-22  2:12           ` Tim Bird
2011-12-22  3:44             ` Kay Sievers
2011-12-22  3:45             ` Greg KH
2011-12-22  3:47             ` Greg KH
2011-12-22  4:12               ` Kay Sievers
2011-12-22  4:22                 ` Brian Swetland
2011-12-22  4:43                   ` Kay Sievers
2011-12-22  4:47                   ` david
2011-12-22  4:58                     ` Brian Swetland
2011-12-22  5:07                       ` david
2011-12-22  5:21                       ` david
2011-12-22 13:40                   ` Lennart Poettering
2011-12-22  4:49                 ` david
2011-12-22  2:34           ` Kay Sievers
2011-12-22  1:20     ` NeilBrown
2011-12-22  1:49       ` Greg KH
2011-12-22  2:14       ` Tim Bird
2011-12-22  2:34       ` Brian Swetland
2011-12-22  3:49         ` Greg KH
2011-12-22  4:36           ` Kay Sievers
2011-12-22  5:01         ` david
2011-12-22  4:52       ` david
2011-12-22  5:06         ` Brian Swetland
2011-12-22  5:14           ` david
2011-12-22  5:25             ` Brian Swetland
2011-12-22  6:09               ` Greg KH
2011-12-23 15:22                 ` Alan Cox
2011-12-23 16:29                   ` Greg KH
2011-12-22  7:05           ` NeilBrown
2012-01-06 20:56             ` Tim Bird
2012-01-06 21:20               ` Greg KH
2012-01-06 22:41                 ` Tim Bird
2012-01-06 23:17                   ` Greg KH
2012-01-06 23:35                   ` Greg KH
2011-12-22 14:59       ` Arnd Bergmann
2011-12-22 15:13         ` Kay Sievers
2011-12-22  4:42     ` david
2011-12-22  0:59 ` David Brown
2011-12-29  0:39 ` Andrew Morton
2012-01-04 15:34   ` Geunsik Lim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).