All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG] sysfs on 2.5.48 unable to remove files while in use
@ 2002-11-21 15:39 Rusty Lynch
  2002-11-21 17:04 ` Patrick Mochel
  2002-11-21 20:45 ` Patrick Mochel
  0 siblings, 2 replies; 15+ messages in thread
From: Rusty Lynch @ 2002-11-21 15:39 UTC (permalink / raw)
  To: mochel; +Cc: Linux Kernel Mailing List

I have noticed some strange behavior on my 2.5.48 build
where sysfs is not able to handle file/directory removal
correctly when the file/directory is in use.

I can see this with a little kprobes example code that I have
been playing with that will create entries like:

/sysfsroot/noisy/0xc0107ae0/sys_fork

when someone uses that driver to insert a kernel probe
at 0xc0107ae0 that will printk "sys_fork".

What I have noticed, is that if I create a new probe
(which will create the sysfs entry), open a new shell and
cd to /sysfsroot/noisy/0xc0107ae0, and then use my
driver to remove the probe (which will remove the
sysfs entry), then /sysfsroot/noisy/0xc0107ae0 doesn't
go away after I cd out of the shell.

>From then on any attempts to create new sysfs entries
do not show up in /sysfsroot/ until I unload/load my driver
again.

It seems like this could be an issue with some real code
(not just this stupid play code of mine), like maybe hotswap
code that updates sysfs entries.

My patch to 2.5.48 (with kprobes already applied) can be
grabbed from:
http://www.stinkycat.com/patches/patch-kprobes_sample_with_sysfs-2.5.48.diff

 - or here is an inline version that $%^&^ outlook will surely screw up:

diff -urN linux-2.5.48-kprobes/arch/i386/kernel/i386_ksyms.c
linux-2.5.48-kprobes-patched/arch/i386/kernel/i386_ksyms.c
--- linux-2.5.48-kprobes/arch/i386/kernel/i386_ksyms.c	2002-11-17
20:29:57.000000000 -0800
+++ linux-2.5.48-kprobes-patched/arch/i386/kernel/i386_ksyms.c	2002-11-18
15:14:42.000000000 -0800
@@ -59,6 +59,8 @@
 extern unsigned long cpu_khz;
 extern unsigned long get_cmos_time(void);

+extern int valid_kernel_address(unsigned long addr);
+
 /* platform dependent support */
 EXPORT_SYMBOL(boot_cpu_data);
 #ifdef CONFIG_EISA
@@ -91,6 +93,7 @@
 EXPORT_SYMBOL(get_cmos_time);
 EXPORT_SYMBOL(cpu_khz);
 EXPORT_SYMBOL(apm_info);
+EXPORT_SYMBOL(valid_kernel_address);

 #ifdef CONFIG_DEBUG_IOVIRT
 EXPORT_SYMBOL(__io_virt_debug);
diff -urN linux-2.5.48-kprobes/arch/i386/kernel/traps.c
linux-2.5.48-kprobes-patched/arch/i386/kernel/traps.c
--- linux-2.5.48-kprobes/arch/i386/kernel/traps.c	2002-11-18
15:14:25.000000000 -0800
+++ linux-2.5.48-kprobes-patched/arch/i386/kernel/traps.c	2002-11-18
15:14:42.000000000 -0800
@@ -131,6 +131,11 @@

 #endif

+int valid_kernel_address(unsigned long addr)
+{
+	return kernel_text_address(addr);
+}
+
 void show_trace(unsigned long * stack)
 {
 	int i;
diff -urN linux-2.5.48-kprobes/drivers/char/Kconfig
linux-2.5.48-kprobes-patched/drivers/char/Kconfig
--- linux-2.5.48-kprobes/drivers/char/Kconfig	2002-11-17
20:29:21.000000000 -0800
+++ linux-2.5.48-kprobes-patched/drivers/char/Kconfig	2002-11-18
15:14:59.000000000 -0800
@@ -1270,5 +1270,20 @@
 	  Once bound, I/O against /dev/raw/rawN uses efficient zero-copy I/O.
 	  See the raw(8) manpage for more details.

+config NOISY
+	tristate "Noisy Interface Support"
+	---help---
+	  If you say Y here and create a character special file /dev/noisy with
+	  major number 10 and minor number 221 using mknod ("man mknod"), you
+	  will get access to an interface for inserting arbitrary printk's
+	  into executing kernel code.
+
+	  This driver is also available as a module ( = code which can be
+	  inserted in and removed from the running kernel whenever you want).
+	  The module is called noisy.o. If you want to compile it as a module,
+	  say M here and read <file:Documentation/modules.txt>.
+
+	  If unsure, say N.
+
 endmenu

diff -urN linux-2.5.48-kprobes/drivers/char/Makefile
linux-2.5.48-kprobes-patched/drivers/char/Makefile
--- linux-2.5.48-kprobes/drivers/char/Makefile	2002-11-17
20:29:56.000000000 -0800
+++ linux-2.5.48-kprobes-patched/drivers/char/Makefile	2002-11-18
15:14:59.000000000 -0800
@@ -102,6 +102,7 @@
 obj-$(CONFIG_AGP) += agp/
 obj-$(CONFIG_DRM) += drm/
 obj-$(CONFIG_PCMCIA) += pcmcia/
+obj-$(CONFIG_NOISY) += noisy.o

 # Files generated that shall be removed upon make clean
 clean-files := consolemap_deftbl.c defkeymap.c qtronixmap.c
diff -urN linux-2.5.48-kprobes/drivers/char/noisy.c
linux-2.5.48-kprobes-patched/drivers/char/noisy.c
--- linux-2.5.48-kprobes/drivers/char/noisy.c	1969-12-31
16:00:00.000000000 -0800
+++ linux-2.5.48-kprobes-patched/drivers/char/noisy.c	2002-11-20
18:28:41.000000000 -0800
@@ -0,0 +1,265 @@
+/*
+ * Noisy Interface for Linux
+ *
+ * This driver allows arbitrary printk's to be inserted into
+ * executing kernel code by using the new kprobes interface.
+ *
+ * Copyright (C) 2002 Rusty Lynch <rusty@linux.intel.com>
+ */
+
+#include <linux/module.h>
+#include <linux/miscdevice.h>
+#include <linux/init.h>
+#include <linux/kprobes.h>
+#include <linux/slab.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <asm/uaccess.h>
+
+/* exported by arch/YOURARCH/kernel/traps.c */
+extern int valid_kernel_address(unsigned long);
+
+#define MAX_MSG_SIZE 128
+
+/*
+ * Data structures for managing list of probe points
+ */
+static struct list_head probe_list;
+struct nprobe {
+	struct list_head list;
+	struct kprobe probe;
+	char message[MAX_MSG_SIZE + 1];
+	struct attribute attr;
+	struct kobject kobj;
+};
+
+/*
+ * sysfs stuff
+ */
+
+struct subsystem noisy_subsys = {
+	.kobj	= { .name = "noisy" },
+};
+
+/*
+ * Probe handler called before the execution of all probe points
+ */
+static void noisy_pre_handler(struct kprobe *p, struct pt_regs *r)
+{
+	struct nprobe *c = container_of(p, struct nprobe, probe);
+	printk(KERN_CRIT "%s: %s\n", __FUNCTION__, c->message);
+}
+
+/*
+ * Probe handler called just after the probed address is single stepped
+ */
+static void noisy_post_handler(struct kprobe *p, struct pt_regs *r,
+			       unsigned long flags)
+{
+}
+
+/*
+ * Fault handler that covers the pre_handler, single stepping, and
+ * post_handler executiion.
+ */
+static int noisy_fault_handler(struct kprobe *p, struct pt_regs *r, int
trapnr)
+{
+	/* Let the kernel handle this fault */
+	return 0;
+}
+
+/*
+ * Supported file operations
+ */
+static ssize_t noisy_read(struct file *, char *, size_t, loff_t *);
+static ssize_t noisy_write(struct file *, const char *, size_t, loff_t *);
+static int noisy_open(struct inode *, struct file *);
+static int noisy_release(struct inode *, struct file *);
+
+static struct file_operations noisy_fops = {
+	.owner          = THIS_MODULE,
+	.read           = noisy_read,
+	.write          = noisy_write,
+	.open           = noisy_open,
+	.release        = noisy_release,
+};
+
+/*
+ * To conserve major numbers, this device uses
+ * the miscdevice subsystem.
+ */
+static struct miscdevice noisy_dev =
+{
+	.minor        = NOISY_MINOR,
+	.name         = "noisy",
+	.fops         = &noisy_fops
+};
+
+static ssize_t noisy_read(struct file *file, char *buf,
+			size_t count, loff_t *ppos)
+{
+	struct nprobe *p;
+	list_for_each_entry(p, &probe_list, list) {
+		printk(KERN_CRIT "%p: %s\n", (p->probe).addr, p->message);
+	}
+
+	return 0;
+}
+
+/*
+ * A kprobe is created for each write operation where write is expecting
+ * a string of the form:
+ * 0xADDRESS MESSAGE
+ *
+ * The kprobe pre_handler will just do a printk using the MESSAGE passed
in.
+ *
+ * All kprobes are unregistered when the node is closed, so I use this
module
+ * like:
+ * $ mknod /dev/noisy c 10 221
+ * $ cat > /dev/noisy
+ * 0xc0107d50 sys_fork
+ *
+ * ... and then go do something that will trigger a sys_fork,
+ *     and then control-c to stop the cat process (which
+ *     will close the node and therefore stop syslog from further
+ *     DOS attacks from this driver)
+ */
+static ssize_t noisy_write(struct file *file, const char *buf, size_t
count,
+			   loff_t *ppos)
+{
+	struct nprobe *n = 0;
+	size_t ret = -ENOMEM;
+	char *tmp = 0;
+
+	if (count > MAX_MSG_SIZE) {
+		printk(KERN_CRIT
+		       "noisy: Input buffer (%i bytes) is too big!\n",
+		       count);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	tmp = kmalloc(count + 1, GFP_KERNEL);
+	if (!tmp) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	n = kmalloc(sizeof(struct nprobe), GFP_KERNEL);
+	if (!n) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	memset(n, '\0', sizeof(struct nprobe));
+
+	if (copy_from_user((void *)tmp, (void *)buf, count)) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	tmp[count] = '\0';
+
+	if (2 != sscanf(tmp, "0x%x %s", &(n->probe).addr, n->message)) {
+		ret = -EINVAL;
+		goto out;
+	}
+	(n->probe).pre_handler = noisy_pre_handler;
+	(n->probe).post_handler = noisy_post_handler;
+	(n->probe).fault_handler = noisy_fault_handler;
+
+
+	if (!valid_kernel_address((unsigned long)(n->probe).addr)) {
+		kfree(n);
+
+		ret = -EINVAL;
+		goto out;
+	}
+
+	kobject_init(&(n->kobj));
+	(n->kobj).subsys = &noisy_subsys;
+	snprintf((n->kobj).name, KOBJ_NAME_LEN, "0x%02x",
+		 (unsigned int)(n->probe).addr);
+
+	(n->attr).name = n->message;
+	(n->attr).mode = S_IRUGO;
+
+	if (register_kprobe(&(n->probe))) {
+		printk(KERN_CRIT "Unable to register probe at %p\n",
+		       (n->probe).addr);
+		kfree(n);
+
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (kobject_register(&(n->kobj))) {
+		printk(KERN_CRIT "Unable to add probe kobject!\n");
+		unregister_kprobe(&(n->probe));
+		kfree(n);
+
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (sysfs_create_file(&(n->kobj), &(n->attr))) {
+		printk(KERN_CRIT "Unable to add probe attr file!\n");
+		unregister_kprobe(&(n->probe));
+		kobject_unregister(&(n->kobj));
+		kfree(n);
+
+		ret = -EINVAL;
+		goto out;
+	}
+
+	list_add(&(n->list), &probe_list);
+	ret = count;
+
+ out:
+	kfree(tmp);
+        return ret;
+}
+
+static int noisy_open(struct inode *inode, struct file *file)
+{
+	return 0;
+}
+
+static int noisy_release(struct inode *inode, struct file *file)
+{
+	while(!list_empty(&probe_list)) {
+		struct list_head *n = probe_list.next;
+		struct nprobe *p = list_entry(n, struct nprobe, list);
+
+		printk("Releasing probe %p: %s\n",
+		       (p->probe).addr, p->message);
+		sysfs_remove_file(&(p->kobj), &(p->attr));
+		kobject_unregister(&(p->kobj));
+		unregister_kprobe(&(p->probe));
+
+		list_del(n);
+		kfree(p);
+	}
+	return 0;
+}
+
+static int __init noisy_init(void)
+{
+	if (misc_register(&noisy_dev))
+	{
+		return -ENODEV;
+	}
+	INIT_LIST_HEAD(&probe_list);
+	subsystem_register(&noisy_subsys);
+	return 0;
+}
+
+static void __exit noisy_exit (void)
+{
+	misc_deregister(&noisy_dev);
+	subsystem_unregister(&noisy_subsys);
+}
+
+module_init(noisy_init);
+module_exit(noisy_exit);
+
+MODULE_AUTHOR("Rusty Lynch");
+MODULE_LICENSE("GPL");
diff -urN linux-2.5.48-kprobes/include/linux/miscdevice.h
linux-2.5.48-kprobes-patched/include/linux/miscdevice.h
--- linux-2.5.48-kprobes/include/linux/miscdevice.h	2002-11-17
20:29:47.000000000 -0800
+++ linux-2.5.48-kprobes-patched/include/linux/miscdevice.h	2002-11-18
15:14:59.000000000 -0800
@@ -23,6 +23,7 @@
 #define MICROCODE_MINOR		184
 #define MWAVE_MINOR	219		/* ACP/Mwave Modem */
 #define MPT_MINOR	220
+#define NOISY_MINOR 221
 #define MISC_DYNAMIC_MINOR 255

 #define SGI_GRAPHICS_MINOR   146


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 15:39 [BUG] sysfs on 2.5.48 unable to remove files while in use Rusty Lynch
@ 2002-11-21 17:04 ` Patrick Mochel
  2002-11-21 17:20   ` Rusty Lynch
  2002-11-21 20:45 ` Patrick Mochel
  1 sibling, 1 reply; 15+ messages in thread
From: Patrick Mochel @ 2002-11-21 17:04 UTC (permalink / raw)
  To: Rusty Lynch; +Cc: Linux Kernel Mailing List


Hi there.

> My patch to 2.5.48 (with kprobes already applied) can be
> grabbed from:
> http://www.stinkycat.com/patches/patch-kprobes_sample_with_sysfs-2.5.48.diff

Where is the kprobes patch? 

Thanks,

	-pat


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 17:04 ` Patrick Mochel
@ 2002-11-21 17:20   ` Rusty Lynch
  0 siblings, 0 replies; 15+ messages in thread
From: Rusty Lynch @ 2002-11-21 17:20 UTC (permalink / raw)
  To: Patrick Mochel, Rusty Lynch; +Cc: Linux Kernel Mailing List

I used the one submitted by Vamsi in
http://marc.theaimsgroup.com/?l=linux-kernel&m=103761676628192&w=2

I think IBM has a kprobes webpage with a link to the patch, but I also have
copy of the one I used at:
http://www.stinkycat.com/patches/patch-kprobes-2.5.48.diff

    -rustyl

----- Original Message -----
From: "Patrick Mochel" <mochel@osdl.org>
To: "Rusty Lynch" <rusty@linux.co.intel.com>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Sent: Thursday, November 21, 2002 9:04 AM
Subject: Re: [BUG] sysfs on 2.5.48 unable to remove files while in use


>
> Hi there.
>
> > My patch to 2.5.48 (with kprobes already applied) can be
> > grabbed from:
> >
http://www.stinkycat.com/patches/patch-kprobes_sample_with_sysfs-2.5.48.diff
>
> Where is the kprobes patch?
>
> Thanks,
>
> -pat
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 15:39 [BUG] sysfs on 2.5.48 unable to remove files while in use Rusty Lynch
  2002-11-21 17:04 ` Patrick Mochel
@ 2002-11-21 20:45 ` Patrick Mochel
  2002-11-21 22:07   ` Rusty Lynch
  1 sibling, 1 reply; 15+ messages in thread
From: Patrick Mochel @ 2002-11-21 20:45 UTC (permalink / raw)
  To: Rusty Lynch; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 10187 bytes --]


Ok, I've had a chance to play with it a bit..

The kprobes patch didn't apply to current -bk. Attached is an updated 
patch. I also have some comments on your 'noisy' patch, but first..

> I can see this with a little kprobes example code that I have
> been playing with that will create entries like:
> 
> /sysfsroot/noisy/0xc0107ae0/sys_fork
> 
> when someone uses that driver to insert a kernel probe
> at 0xc0107ae0 that will printk "sys_fork".

It seems that having a pure sysfs implementation would be superior, 
instead of having to use a character device to write to. After looking 
into this, I realize that a couple of pieces of infrastructure are needed, 
so I'm working on that, and will post a modified version of your module 
once I'm done..

> What I have noticed, is that if I create a new probe
> (which will create the sysfs entry), open a new shell and
> cd to /sysfsroot/noisy/0xc0107ae0, and then use my
> driver to remove the probe (which will remove the
> sysfs entry), then /sysfsroot/noisy/0xc0107ae0 doesn't
> go away after I cd out of the shell.
> 
> >From then on any attempts to create new sysfs entries
> do not show up in /sysfsroot/ until I unload/load my driver
> again.
> 
> It seems like this could be an issue with some real code
> (not just this stupid play code of mine), like maybe hotswap
> code that updates sysfs entries.

Yes, it was a real bug. I've mucked with the method to do file and 
directory deletion, and it turns out that the way I was doing it was 
wrong. I've gone back to mimmicking vfs_unlink() and vfs_rmdir(), which I 
shouldn't have diverged from in the first place. (doing d_delete() after 
low-level unlinking, instead of d_invalidate() before it). 

Appended to this email is my current working patch to sysfs, including 
fixes discussed and posted yesterday. I've pushed these to Linus, though 
he appears to be away. I'll push again, with this fix in the next day or 
so. 

Please try this patch and let me know if you still experience the problem 
you're seeing.

Concerning your patch: 

> +config NOISY

'Noisy' seems like such a generic name, and the description doesn't seem 
to imply its dependence on kprobes. Maybe KPROBES_NOISY? And, you should 
put it under KPROBES, under DEBUG_KERNEL.

> diff -urN linux-2.5.48-kprobes/drivers/char/noisy.c
> linux-2.5.48-kprobes-patched/drivers/char/noisy.c

> +static struct list_head probe_list;
> +struct nprobe {
> +	struct list_head list;
> +	struct kprobe probe;
> +	char message[MAX_MSG_SIZE + 1];
> +	struct attribute attr;
> +	struct kobject kobj;
> +};

struct subsystem has a list, and kobject and entry, so you don't have to 
do it yourself..

> +static ssize_t noisy_write(struct file *file, const char *buf, size_t
> count,
> +			   loff_t *ppos)
> +{
> +	struct nprobe *n = 0;
> +	size_t ret = -ENOMEM;
> +	char *tmp = 0;
> +
> +	if (count > MAX_MSG_SIZE) {
> +		printk(KERN_CRIT
> +		       "noisy: Input buffer (%i bytes) is too big!\n",
> +		       count);
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	tmp = kmalloc(count + 1, GFP_KERNEL);
> +	if (!tmp) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}

This should be memset to 0. 

> +	n = kmalloc(sizeof(struct nprobe), GFP_KERNEL);
> +	if (!n) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +	memset(n, '\0', sizeof(struct nprobe));
> +
> +	if (copy_from_user((void *)tmp, (void *)buf, count)) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +	tmp[count] = '\0';
> +
> +	if (2 != sscanf(tmp, "0x%x %s", &(n->probe).addr, n->message)) {
> +		ret = -EINVAL;
> +		goto out;
> +	}

You don't free n if you get an error from copy_from_user() or sscanf(). 

> +	(n->attr).name = n->message;
> +	(n->attr).mode = S_IRUGO;

Instead of doing this, you should just declare a static attribute called
'message' and have the contents be the message. This will save you from 
initializing it each time, and save you 4 bytes in struct nprobe. 
Something like this should work:


struct noisy_attribute {
	struct attribute attr;
	ssize_t (*read)(struct nprobe *,char *,size_t,loff_t);
};

static ssize_t noisy_attr_show(struct kobject * kobj, struct attribute * 
attr,
			       char * page, size_t count, loff_t off)
{
	struct nprobe * n = container_of(kobj,struct nprobe,kobj);
	struct noisy_attribute * noisy_attr = container_of(attr,struct 
noisy_attribute,attr);
	ssize_t ret = 0;
	return noisy_attr->show ? noisy_attr->show(n,page,count,off);
}

static struct sysfs_ops noisy_sysfs_ops = {
	.show	= noisy_attr_show,
};


/* Default Attribute */
static ssize_t noisy_message_read(struct nprobe * n, char * page, size_t 
count, loff_t off)
{
	return off ? snprintf(page,MAX_MSG_SIZE,"%s",n->message);
}

static struct noisy_attribute attr_message = {
	.attr	= { .name = "message", .mode = S_IRUGO },
};

static struct attribute * default_attrs[] = {
	&attr_message.attr,
	NULL,
};

/*
 * sysfs stuff
 */

struct subsystem noisy_subsys = {
	.kobj	= { .name = "noisy" },
	.default_attrs	= default_attrs,
	.sysfs_ops	= noisy_sysfs_ops,
};


Note that this will also save you from manually having to create and 
teardown the file when the kobject is registered and unregistered. 


	-pat

--- linux-2.5-virgin/fs/sysfs/inode.c	Wed Nov 20 12:13:06 2002
+++ linux-2.5-core/fs/sysfs/inode.c	Thu Nov 21 13:51:32 2002
@@ -23,6 +23,8 @@
  * Please see Documentation/filesystems/sysfs.txt for more information.
  */
 
+#undef DEBUG 
+
 #include <linux/list.h>
 #include <linux/init.h>
 #include <linux/pagemap.h>
@@ -94,9 +96,10 @@
 
 	if (!dentry->d_inode) {
 		inode = sysfs_get_inode(dir->i_sb, mode, dev);
-		if (inode)
+		if (inode) {
 			d_instantiate(dentry, inode);
-		else
+			dget(dentry);
+		} else
 			error = -ENOSPC;
 	} else
 		error = -EEXIST;
@@ -142,17 +145,6 @@
 	return error;
 }
 
-static int sysfs_unlink(struct inode *dir, struct dentry *dentry)
-{
-	struct inode *inode = dentry->d_inode;
-	down(&inode->i_sem);
-	dentry->d_inode->i_nlink--;
-	up(&inode->i_sem);
-	d_invalidate(dentry);
-	dput(dentry);
-	return 0;
-}
-
 /**
  *	sysfs_read_file - read an attribute. 
  *	@file:	file pointer.
@@ -173,17 +165,11 @@
 sysfs_read_file(struct file *file, char *buf, size_t count, loff_t *ppos)
 {
 	struct attribute * attr = file->f_dentry->d_fsdata;
-	struct sysfs_ops * ops = NULL;
-	struct kobject * kobj;
+	struct sysfs_ops * ops = file->private_data;
+	struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
 	unsigned char *page;
 	ssize_t retval = 0;
 
-	kobj = file->f_dentry->d_parent->d_fsdata;
-	if (kobj && kobj->subsys)
-		ops = kobj->subsys->sysfs_ops;
-	if (!ops || !ops->show)
-		return 0;
-
 	if (count > PAGE_SIZE)
 		count = PAGE_SIZE;
 
@@ -234,16 +220,11 @@
 sysfs_write_file(struct file *file, const char *buf, size_t count, loff_t *ppos)
 {
 	struct attribute * attr = file->f_dentry->d_fsdata;
-	struct sysfs_ops * ops = NULL;
-	struct kobject * kobj;
+	struct sysfs_ops * ops = file->private_data;
+	struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
 	ssize_t retval = 0;
 	char * page;
 
-	kobj = file->f_dentry->d_parent->d_fsdata;
-	if (kobj && kobj->subsys)
-		ops = kobj->subsys->sysfs_ops;
-	if (!ops || !ops->store)
-		return -EINVAL;
 
 	page = (char *)__get_free_page(GFP_KERNEL);
 	if (!page)
@@ -275,21 +256,72 @@
 	return retval;
 }
 
-static int sysfs_open_file(struct inode * inode, struct file * filp)
+static int check_perm(struct inode * inode, struct file * file)
 {
-	struct kobject * kobj;
+	struct kobject * kobj = kobject_get(file->f_dentry->d_parent->d_fsdata);
+	struct attribute * attr = file->f_dentry->d_fsdata;
+	struct sysfs_ops * ops = NULL;
 	int error = 0;
 
-	kobj = filp->f_dentry->d_parent->d_fsdata;
-	if ((kobj = kobject_get(kobj))) {
-		struct attribute * attr = filp->f_dentry->d_fsdata;
-		if (!attr)
-			error = -EINVAL;
-	} else
-		error = -EINVAL;
+	if (!kobj || !attr)
+		goto Einval;
+	
+	if (kobj->subsys)
+		ops = kobj->subsys->sysfs_ops;
+
+	/* No sysfs operations, either from having no subsystem,
+	 * or the subsystem have no operations.
+	 */
+	if (!ops)
+		goto Eaccess;
+
+	/* File needs write support.
+	 * The inode's perms must say it's ok, 
+	 * and we must have a store method.
+	 */
+	if (file->f_mode & FMODE_WRITE) {
+
+		if (!(inode->i_mode & S_IWUGO))
+			goto Eperm;
+		if (!ops->store)
+			goto Eaccess;
+
+	}
+
+	/* File needs read support.
+	 * The inode's perms must say it's ok, and we there
+	 * must be a show method for it.
+	 */
+	if (file->f_mode & FMODE_READ) {
+		if (!(inode->i_mode & S_IRUGO))
+			goto Eperm;
+		if (!ops->show)
+			goto Eaccess;
+	}
+
+	/* No error? Great, store the ops in file->private_data
+	 * for easy access in the read/write functions.
+	 */
+	file->private_data = ops;
+	goto Done;
+
+ Einval:
+	error = -EINVAL;
+	goto Done;
+ Eaccess:
+	error = -EACCES;
+	goto Done;
+ Eperm:
+	error = -EPERM;
+ Done:
 	return error;
 }
 
+static int sysfs_open_file(struct inode * inode, struct file * filp)
+{
+	return check_perm(inode,filp);
+}
+
 static int sysfs_release(struct inode * inode, struct file * filp)
 {
 	struct kobject * kobj = filp->f_dentry->d_parent->d_fsdata;
@@ -541,7 +573,8 @@
 		/* make sure dentry is really there */
 		if (victim->d_inode && 
 		    (victim->d_parent->d_inode == dir->d_inode)) {
-			sysfs_unlink(dir->d_inode,victim);
+			simple_unlink(dir->d_inode,victim);
+			d_delete(victim);
 		}
 	}
 	up(&dir->d_inode->i_sem);
@@ -599,19 +632,16 @@
 	list_for_each_safe(node,next,&dentry->d_subdirs) {
 		struct dentry * d = list_entry(node,struct dentry,d_child);
 		/* make sure dentry is still there */
-		if (d->d_inode)
-			sysfs_unlink(dentry->d_inode,d);
+		if (d->d_inode) {
+			simple_unlink(dentry->d_inode,d);
+			d_delete(dentry);
+		}
 	}
 
-	d_invalidate(dentry);
-	if (simple_empty(dentry)) {
-		dentry->d_inode->i_nlink -= 2;
-		dentry->d_inode->i_flags |= S_DEAD;
-		parent->d_inode->i_nlink--;
-	}
 	up(&dentry->d_inode->i_sem);
-	dput(dentry);
-
+	d_invalidate(dentry);
+	simple_rmdir(parent->d_inode,dentry);
+	d_delete(dentry);
 	up(&parent->d_inode->i_sem);
 	dput(parent);
 }
@@ -622,4 +652,3 @@
 EXPORT_SYMBOL(sysfs_remove_link);
 EXPORT_SYMBOL(sysfs_create_dir);
 EXPORT_SYMBOL(sysfs_remove_dir);
-MODULE_LICENSE("GPL");

[-- Attachment #2: Type: TEXT/PLAIN, Size: 6494 bytes --]

===== arch/i386/Kconfig 1.10 vs edited =====
--- 1.10/arch/i386/Kconfig	Mon Nov 18 11:52:31 2002
+++ edited/arch/i386/Kconfig	Thu Nov 21 11:20:31 2002
@@ -1551,6 +1551,15 @@
 	  Say Y here if you are developing drivers or trying to debug and
 	  identify kernel problems.
 
+config KPROBES
+	bool "Kprobes"
+	depends on DEBUG_KERNEL
+	help
+	  Kprobes allows you to trap at almost any kernel address, using
+	  register_kprobe(), and providing a callback function.  This is useful
+	  for kernel debugging, non-intrusive instrumentation and testing.  If
+	  in doubt, say "N".
+
 config DEBUG_STACKOVERFLOW
 	bool "Check for stack overflows"
 	depends on DEBUG_KERNEL
===== arch/i386/kernel/Makefile 1.29 vs edited =====
--- 1.29/arch/i386/kernel/Makefile	Thu Nov  7 11:26:45 2002
+++ edited/arch/i386/kernel/Makefile	Thu Nov 21 11:20:31 2002
@@ -29,6 +29,7 @@
 obj-$(CONFIG_PROFILING)		+= profile.o
 obj-$(CONFIG_EDD)             	+= edd.o
 obj-$(CONFIG_MODULES)		+= module.o
+obj-$(CONFIG_KPROBES)		+= kprobes.o
 
 EXTRA_AFLAGS   := -traditional
 
===== arch/i386/kernel/entry.S 1.46 vs edited =====
--- 1.46/arch/i386/kernel/entry.S	Wed Nov 20 12:56:06 2002
+++ edited/arch/i386/kernel/entry.S	Thu Nov 21 11:20:31 2002
@@ -405,9 +405,16 @@
 	jmp ret_from_exception
 
 ENTRY(debug)
+	pushl $-1			# mark this as an int
+	SAVE_ALL
+	movl %esp,%edx
 	pushl $0
-	pushl $do_debug
-	jmp error_code
+	pushl %edx
+	call do_debug
+	addl $8,%esp
+	testl %eax,%eax 
+	jnz restore_all
+	jmp ret_from_exception
 
 ENTRY(nmi)
 	pushl %eax
@@ -420,9 +427,16 @@
 	RESTORE_ALL
 
 ENTRY(int3)
+	pushl $-1			# mark this as an int
+	SAVE_ALL
+	movl %esp,%edx
 	pushl $0
-	pushl $do_int3
-	jmp error_code
+	pushl %edx
+	call do_int3
+	addl $8,%esp
+	testl %eax,%eax 
+	jnz restore_all
+	jmp ret_from_exception
 
 ENTRY(overflow)
 	pushl $0
===== arch/i386/kernel/traps.c 1.36 vs edited =====
--- 1.36/arch/i386/kernel/traps.c	Mon Nov 18 12:10:45 2002
+++ edited/arch/i386/kernel/traps.c	Thu Nov 21 11:20:48 2002
@@ -24,6 +24,7 @@
 #include <linux/interrupt.h>
 #include <linux/highmem.h>
 #include <linux/kallsyms.h>
+#include <linux/kprobes.h>
 
 #ifdef CONFIG_EISA
 #include <linux/ioport.h>
@@ -404,7 +405,6 @@
 }
 
 DO_VM86_ERROR_INFO( 0, SIGFPE,  "divide error", divide_error, FPE_INTDIV, regs->eip)
-DO_VM86_ERROR( 3, SIGTRAP, "int3", int3)
 DO_VM86_ERROR( 4, SIGSEGV, "overflow", overflow)
 DO_VM86_ERROR( 5, SIGSEGV, "bounds", bounds)
 DO_ERROR_INFO( 6, SIGILL,  "invalid operand", invalid_op, ILL_ILLOPN, regs->eip)
@@ -420,6 +420,9 @@
 {
 	if (regs->eflags & VM_MASK)
 		goto gp_in_vm86;
+	
+	if (kprobe_running() && kprobe_fault_handler(regs, 13))
+		return;
 
 	if (!(regs->xcs & 3))
 		goto gp_in_kernel;
@@ -551,6 +554,17 @@
 	nmi_callback = dummy_nmi_callback;
 }
 
+asmlinkage int do_int3(struct pt_regs *regs, long error_code)
+{
+	if (kprobe_handler(regs))
+		return 1;
+	/* This is an interrupt gate, because kprobes wants interrupts
+           disabled.  Normal trap handlers don't. */
+	restore_interrupts(regs);
+	do_trap(3, SIGTRAP, "int3", 1, regs, error_code, NULL);
+	return 0;
+}
+
 /*
  * Our handling of the processor debug registers is non-trivial.
  * We do not clear them on entry and exit from the kernel. Therefore
@@ -573,7 +587,7 @@
  * find every occurrence of the TF bit that could be saved away even
  * by user code)
  */
-asmlinkage void do_debug(struct pt_regs * regs, long error_code)
+asmlinkage int do_debug(struct pt_regs * regs, long error_code)
 {
 	unsigned int condition;
 	struct task_struct *tsk = current;
@@ -581,6 +595,12 @@
 
 	__asm__ __volatile__("movl %%db6,%0" : "=r" (condition));
 
+	if (post_kprobe_handler(regs))
+		return 1;
+
+	/* Interrupts not disabled for normal trap handling. */
+	restore_interrupts(regs);
+
 	/* Mask out spurious debug traps due to lazy DR7 setting */
 	if (condition & (DR_TRAP0|DR_TRAP1|DR_TRAP2|DR_TRAP3)) {
 		if (!tsk->thread.debugreg[7])
@@ -631,15 +651,15 @@
 	__asm__("movl %0,%%db7"
 		: /* no output */
 		: "r" (0));
-	return;
+	return 0;
 
 debug_vm86:
 	handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code, 1);
-	return;
+	return 0;
 
 clear_TF:
 	regs->eflags &= ~TF_MASK;
-	return;
+	return 0;
 }
 
 /*
@@ -803,6 +823,8 @@
 	struct task_struct *tsk = current;
 	clts();		/* Allow maths ops (or we recurse) */
 
+	if (kprobe_running() && kprobe_fault_handler(&regs, 7))
+		return;
 	if (!tsk->used_math)
 		init_fpu(tsk);
 	restore_fpu(tsk);
@@ -896,9 +918,9 @@
 #endif
 
 	set_trap_gate(0,&divide_error);
-	set_trap_gate(1,&debug);
+	_set_gate(idt_table+1,14,3,&debug); /* debug trap for kprobes */
 	set_intr_gate(2,&nmi);
-	set_system_gate(3,&int3);	/* int3-5 can be called from all */
+	_set_gate(idt_table+3,14,3,&int3); /* int3-5 can be called from all */
 	set_system_gate(4,&overflow);
 	set_system_gate(5,&bounds);
 	set_trap_gate(6,&invalid_op);
===== arch/i386/mm/fault.c 1.20 vs edited =====
--- 1.20/arch/i386/mm/fault.c	Sat Oct 12 06:11:03 2002
+++ edited/arch/i386/mm/fault.c	Thu Nov 21 11:20:31 2002
@@ -19,6 +19,7 @@
 #include <linux/init.h>
 #include <linux/tty.h>
 #include <linux/vt_kern.h>		/* For unblank_screen() */
+#include <linux/kprobes.h>
 
 #include <asm/system.h>
 #include <asm/uaccess.h>
@@ -162,6 +163,9 @@
 
 	/* get the address */
 	__asm__("movl %%cr2,%0":"=r" (address));
+
+	if (kprobe_running() && kprobe_fault_handler(regs, 14))
+		return;
 
 	/* It's safe to allow irq's after cr2 has been saved */
 	if (regs->eflags & X86_EFLAGS_IF)
===== kernel/Makefile 1.21 vs edited =====
--- 1.21/kernel/Makefile	Mon Nov 18 22:51:14 2002
+++ edited/kernel/Makefile	Thu Nov 21 11:20:31 2002
@@ -4,7 +4,7 @@
 
 export-objs = signal.o sys.o kmod.o workqueue.o ksyms.o pm.o exec_domain.o \
 		printk.o platform.o suspend.o dma.o module.o cpufreq.o \
-		profile.o rcupdate.o intermodule.o
+		profile.o rcupdate.o intermodule.o kprobes.o
 
 obj-y     = sched.o fork.o exec_domain.o panic.o printk.o profile.o \
 	    exit.o itimer.o time.o softirq.o resource.o \
@@ -21,6 +21,7 @@
 obj-$(CONFIG_CPU_FREQ) += cpufreq.o
 obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
 obj-$(CONFIG_SOFTWARE_SUSPEND) += suspend.o
+obj-$(CONFIG_KPROBES) += kprobes.o
 
 ifneq ($(CONFIG_IA64),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 20:45 ` Patrick Mochel
@ 2002-11-21 22:07   ` Rusty Lynch
  2002-11-21 23:03     ` Patrick Mochel
  0 siblings, 1 reply; 15+ messages in thread
From: Rusty Lynch @ 2002-11-21 22:07 UTC (permalink / raw)
  To: Patrick Mochel, Rusty Lynch; +Cc: Linux Kernel Mailing List

>
> Ok, I've had a chance to play with it a bit..
>
> The kprobes patch didn't apply to current -bk. Attached is an updated
> patch. I also have some comments on your 'noisy' patch, but first..
>

Very cool.  Bitkeeper is one of those things I never bothered
with yet (mainly because I feel some comfortable with CVS.)
It looks like it might be worth going through ramp-up time
on bk to keep up with changes to the kernel.

> > I can see this with a little kprobes example code that I have
> > been playing with that will create entries like:
> >
> > /sysfsroot/noisy/0xc0107ae0/sys_fork
> >
> > when someone uses that driver to insert a kernel probe
> > at 0xc0107ae0 that will printk "sys_fork".
>
> It seems that having a pure sysfs implementation would be superior,
> instead of having to use a character device to write to. After looking
> into this, I realize that a couple of pieces of infrastructure are needed,
> so I'm working on that, and will post a modified version of your module
> once I'm done..

I look forward to seeing it.

>
> > What I have noticed, is that if I create a new probe
> > (which will create the sysfs entry), open a new shell and
> > cd to /sysfsroot/noisy/0xc0107ae0, and then use my
> > driver to remove the probe (which will remove the
> > sysfs entry), then /sysfsroot/noisy/0xc0107ae0 doesn't
> > go away after I cd out of the shell.
> >
> > >From then on any attempts to create new sysfs entries
> > do not show up in /sysfsroot/ until I unload/load my driver
> > again.
> >
> > It seems like this could be an issue with some real code
> > (not just this stupid play code of mine), like maybe hotswap
> > code that updates sysfs entries.
>
> Yes, it was a real bug. I've mucked with the method to do file and
> directory deletion, and it turns out that the way I was doing it was
> wrong. I've gone back to mimmicking vfs_unlink() and vfs_rmdir(), which I
> shouldn't have diverged from in the first place. (doing d_delete() after
> low-level unlinking, instead of d_invalidate() before it).
>
> Appended to this email is my current working patch to sysfs, including
> fixes discussed and posted yesterday. I've pushed these to Linus, though
> he appears to be away. I'll push again, with this fix in the next day or
> so.
>
> Please try this patch and let me know if you still experience the problem
> you're seeing.
>
It looks like the patch is against the bk tree, and does not apply cleanly
to
strait 2.5.48.  I don't know how much has changed to sysfs/inode.c, but
I can see where the last hunk is looking too far up, so I'll try it anyway.

In the meantime I'll be grabbing bk so I can see what everyone is looking
at.

> Concerning your patch:

Thanks for taking the time to look at my module.  The reason I created
it was to get familiar with the 2.5 kernel, so this feedback is very
helpful.

I'll take a stab at the changes.

>
> > +config NOISY
>
> 'Noisy' seems like such a generic name, and the description doesn't seem
> to imply its dependence on kprobes. Maybe KPROBES_NOISY? And, you should
> put it under KPROBES, under DEBUG_KERNEL.
>

yea, I don't put much thought into a name, or where would show up
in the config.  I'll change it.

> > diff -urN linux-2.5.48-kprobes/drivers/char/noisy.c
> > linux-2.5.48-kprobes-patched/drivers/char/noisy.c
>
> > +static struct list_head probe_list;
> > +struct nprobe {
> > + struct list_head list;
> > + struct kprobe probe;
> > + char message[MAX_MSG_SIZE + 1];
> > + struct attribute attr;
> > + struct kobject kobj;
> > +};
>
> struct subsystem has a list, and kobject and entry, so you don't have to
> do it yourself..
>
> > +static ssize_t noisy_write(struct file *file, const char *buf, size_t
> > count,
> > +    loff_t *ppos)
> > +{
> > + struct nprobe *n = 0;
> > + size_t ret = -ENOMEM;
> > + char *tmp = 0;
> > +
> > + if (count > MAX_MSG_SIZE) {
> > + printk(KERN_CRIT
> > +        "noisy: Input buffer (%i bytes) is too big!\n",
> > +        count);
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > +
> > + tmp = kmalloc(count + 1, GFP_KERNEL);
> > + if (!tmp) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
>
> This should be memset to 0.
>
> > + n = kmalloc(sizeof(struct nprobe), GFP_KERNEL);
> > + if (!n) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
> > + memset(n, '\0', sizeof(struct nprobe));
> > +
> > + if (copy_from_user((void *)tmp, (void *)buf, count)) {
> > + ret = -ENOMEM;
> > + goto out;
> > + }
> > + tmp[count] = '\0';
> > +
> > + if (2 != sscanf(tmp, "0x%x %s", &(n->probe).addr, n->message)) {
> > + ret = -EINVAL;
> > + goto out;
> > + }
>
> You don't free n if you get an error from copy_from_user() or sscanf().
>
> > + (n->attr).name = n->message;
> > + (n->attr).mode = S_IRUGO;
>
> Instead of doing this, you should just declare a static attribute called
> 'message' and have the contents be the message. This will save you from
> initializing it each time, and save you 4 bytes in struct nprobe.
> Something like this should work:
>
>
> struct noisy_attribute {
> struct attribute attr;
> ssize_t (*read)(struct nprobe *,char *,size_t,loff_t);
> };
>
> static ssize_t noisy_attr_show(struct kobject * kobj, struct attribute *
> attr,
>        char * page, size_t count, loff_t off)
> {
> struct nprobe * n = container_of(kobj,struct nprobe,kobj);
> struct noisy_attribute * noisy_attr = container_of(attr,struct
> noisy_attribute,attr);
> ssize_t ret = 0;
> return noisy_attr->show ? noisy_attr->show(n,page,count,off);
> }
>
> static struct sysfs_ops noisy_sysfs_ops = {
> .show = noisy_attr_show,
> };
>
>
> /* Default Attribute */
> static ssize_t noisy_message_read(struct nprobe * n, char * page, size_t
> count, loff_t off)
> {
> return off ? snprintf(page,MAX_MSG_SIZE,"%s",n->message);
> }
>
> static struct noisy_attribute attr_message = {
> .attr = { .name = "message", .mode = S_IRUGO },
> };
>
> static struct attribute * default_attrs[] = {
> &attr_message.attr,
> NULL,
> };
>
> /*
>  * sysfs stuff
>  */
>
> struct subsystem noisy_subsys = {
> .kobj = { .name = "noisy" },
> .default_attrs = default_attrs,
> .sysfs_ops = noisy_sysfs_ops,
> };
>
>
> Note that this will also save you from manually having to create and
> teardown the file when the kobject is registered and unregistered.
>
>
> -pat
>
> --- linux-2.5-virgin/fs/sysfs/inode.c Wed Nov 20 12:13:06 2002
> +++ linux-2.5-core/fs/sysfs/inode.c Thu Nov 21 13:51:32 2002
> @@ -23,6 +23,8 @@
>   * Please see Documentation/filesystems/sysfs.txt for more information.
>   */
>
> +#undef DEBUG
> +
>  #include <linux/list.h>
>  #include <linux/init.h>
>  #include <linux/pagemap.h>
> @@ -94,9 +96,10 @@
>
>   if (!dentry->d_inode) {
>   inode = sysfs_get_inode(dir->i_sb, mode, dev);
> - if (inode)
> + if (inode) {
>   d_instantiate(dentry, inode);
> - else
> + dget(dentry);
> + } else
>   error = -ENOSPC;
>   } else
>   error = -EEXIST;
> @@ -142,17 +145,6 @@
>   return error;
>  }
>
> -static int sysfs_unlink(struct inode *dir, struct dentry *dentry)
> -{
> - struct inode *inode = dentry->d_inode;
> - down(&inode->i_sem);
> - dentry->d_inode->i_nlink--;
> - up(&inode->i_sem);
> - d_invalidate(dentry);
> - dput(dentry);
> - return 0;
> -}
> -
>  /**
>   * sysfs_read_file - read an attribute.
>   * @file: file pointer.
> @@ -173,17 +165,11 @@
>  sysfs_read_file(struct file *file, char *buf, size_t count, loff_t *ppos)
>  {
>   struct attribute * attr = file->f_dentry->d_fsdata;
> - struct sysfs_ops * ops = NULL;
> - struct kobject * kobj;
> + struct sysfs_ops * ops = file->private_data;
> + struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
>   unsigned char *page;
>   ssize_t retval = 0;
>
> - kobj = file->f_dentry->d_parent->d_fsdata;
> - if (kobj && kobj->subsys)
> - ops = kobj->subsys->sysfs_ops;
> - if (!ops || !ops->show)
> - return 0;
> -
>   if (count > PAGE_SIZE)
>   count = PAGE_SIZE;
>
> @@ -234,16 +220,11 @@
>  sysfs_write_file(struct file *file, const char *buf, size_t count, loff_t
*ppos)
>  {
>   struct attribute * attr = file->f_dentry->d_fsdata;
> - struct sysfs_ops * ops = NULL;
> - struct kobject * kobj;
> + struct sysfs_ops * ops = file->private_data;
> + struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
>   ssize_t retval = 0;
>   char * page;
>
> - kobj = file->f_dentry->d_parent->d_fsdata;
> - if (kobj && kobj->subsys)
> - ops = kobj->subsys->sysfs_ops;
> - if (!ops || !ops->store)
> - return -EINVAL;
>
>   page = (char *)__get_free_page(GFP_KERNEL);
>   if (!page)
> @@ -275,21 +256,72 @@
>   return retval;
>  }
>
> -static int sysfs_open_file(struct inode * inode, struct file * filp)
> +static int check_perm(struct inode * inode, struct file * file)
>  {
> - struct kobject * kobj;
> + struct kobject * kobj = kobject_get(file->f_dentry->d_parent->d_fsdata);
> + struct attribute * attr = file->f_dentry->d_fsdata;
> + struct sysfs_ops * ops = NULL;
>   int error = 0;
>
> - kobj = filp->f_dentry->d_parent->d_fsdata;
> - if ((kobj = kobject_get(kobj))) {
> - struct attribute * attr = filp->f_dentry->d_fsdata;
> - if (!attr)
> - error = -EINVAL;
> - } else
> - error = -EINVAL;
> + if (!kobj || !attr)
> + goto Einval;
> +
> + if (kobj->subsys)
> + ops = kobj->subsys->sysfs_ops;
> +
> + /* No sysfs operations, either from having no subsystem,
> + * or the subsystem have no operations.
> + */
> + if (!ops)
> + goto Eaccess;
> +
> + /* File needs write support.
> + * The inode's perms must say it's ok,
> + * and we must have a store method.
> + */
> + if (file->f_mode & FMODE_WRITE) {
> +
> + if (!(inode->i_mode & S_IWUGO))
> + goto Eperm;
> + if (!ops->store)
> + goto Eaccess;
> +
> + }
> +
> + /* File needs read support.
> + * The inode's perms must say it's ok, and we there
> + * must be a show method for it.
> + */
> + if (file->f_mode & FMODE_READ) {
> + if (!(inode->i_mode & S_IRUGO))
> + goto Eperm;
> + if (!ops->show)
> + goto Eaccess;
> + }
> +
> + /* No error? Great, store the ops in file->private_data
> + * for easy access in the read/write functions.
> + */
> + file->private_data = ops;
> + goto Done;
> +
> + Einval:
> + error = -EINVAL;
> + goto Done;
> + Eaccess:
> + error = -EACCES;
> + goto Done;
> + Eperm:
> + error = -EPERM;
> + Done:
>   return error;
>  }
>
> +static int sysfs_open_file(struct inode * inode, struct file * filp)
> +{
> + return check_perm(inode,filp);
> +}
> +
>  static int sysfs_release(struct inode * inode, struct file * filp)
>  {
>   struct kobject * kobj = filp->f_dentry->d_parent->d_fsdata;
> @@ -541,7 +573,8 @@
>   /* make sure dentry is really there */
>   if (victim->d_inode &&
>       (victim->d_parent->d_inode == dir->d_inode)) {
> - sysfs_unlink(dir->d_inode,victim);
> + simple_unlink(dir->d_inode,victim);
> + d_delete(victim);
>   }
>   }
>   up(&dir->d_inode->i_sem);
> @@ -599,19 +632,16 @@
>   list_for_each_safe(node,next,&dentry->d_subdirs) {
>   struct dentry * d = list_entry(node,struct dentry,d_child);
>   /* make sure dentry is still there */
> - if (d->d_inode)
> - sysfs_unlink(dentry->d_inode,d);
> + if (d->d_inode) {
> + simple_unlink(dentry->d_inode,d);
> + d_delete(dentry);
> + }
>   }
>
> - d_invalidate(dentry);
> - if (simple_empty(dentry)) {
> - dentry->d_inode->i_nlink -= 2;
> - dentry->d_inode->i_flags |= S_DEAD;
> - parent->d_inode->i_nlink--;
> - }
>   up(&dentry->d_inode->i_sem);
> - dput(dentry);
> -
> + d_invalidate(dentry);
> + simple_rmdir(parent->d_inode,dentry);
> + d_delete(dentry);
>   up(&parent->d_inode->i_sem);
>   dput(parent);
>  }
> @@ -622,4 +652,3 @@
>  EXPORT_SYMBOL(sysfs_remove_link);
>  EXPORT_SYMBOL(sysfs_create_dir);
>  EXPORT_SYMBOL(sysfs_remove_dir);
> -MODULE_LICENSE("GPL");
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 22:07   ` Rusty Lynch
@ 2002-11-21 23:03     ` Patrick Mochel
  2002-11-23  0:32       ` Rusty Lynch
  2002-11-24 13:04       ` Werner Almesberger
  0 siblings, 2 replies; 15+ messages in thread
From: Patrick Mochel @ 2002-11-21 23:03 UTC (permalink / raw)
  To: Rusty Lynch; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 9360 bytes --]


> Very cool.  Bitkeeper is one of those things I never bothered
> with yet (mainly because I feel some comfortable with CVS.)
> It looks like it might be worth going through ramp-up time
> on bk to keep up with changes to the kernel.

There are also nightly snapshots, though I don't recall the URL off the 
top of my head..

> > It seems that having a pure sysfs implementation would be superior,
> > instead of having to use a character device to write to. After looking
> > into this, I realize that a couple of pieces of infrastructure are needed,
> > so I'm working on that, and will post a modified version of your module
> > once I'm done..
> 
> I look forward to seeing it.

Ok, there are basically two parts to it: the required modifications to 
sysfs and your updated module. 

The appended patch adds the support to sysfs be defining struct 
subsys_attribute for declaring attributes of subsystems themselves, as 
well as the needed helpers for creation/teardown and read/write. 

I've attached a replacement for noisy.c that creates a sysfs file named 
'ctl' that handles addition and deletion of nprobes, similar to the char 
device you had created. From the top of the file:

/*
 * Noisy Interface for Linux
 *
 * This driver allows arbitrary printk's to be inserted into
 * executing kernel code by using the new kprobes interface.
 * A message is attached to an address, and when that address
 * is reached, the message is printed. 
 *
 * This uses a sysfs control file to manage a list of probes. 
 * The sysfs directory is at
 *
 * /sys/noisy/
 *
 * and the control is named 'ctl'. 
 *
 * A Noisy Probe can be added by echoing into the file, like:
 *
 *	$ echo "add <address> <message>" > /sys/noisy/ctl
 *
 * where <address> is the address to break on, and <message> 
 * is the message to print when the address is reached. 
 *
 * Probes can be removed by doing:
 *
 *	$ echo "del <address>" > /sys/noisy/ctl
 *
 * where <address> is the address of the probe.
 *
 * The probes themselves get a directory under /sys/noisy/, and
 * the name of the directory is the address of the probe. Each
 * probe directory contains one file ('message') that contains
 * the message to be printed. (More may be added later).
 */

I've tried to comment the changes and the necessary steps in making this 
work. 

[ Note: While I'm generally happy with the way things work, I realize that 
it still requires a decent amount of overhead in using the sysfs interface 
(see the file). I'll be looking into shrinking this...]

Everything seems to work..

> It looks like the patch is against the bk tree, and does not apply cleanly
> to
> strait 2.5.48.  I don't know how much has changed to sysfs/inode.c, but
> I can see where the last hunk is looking too far up, so I'll try it anyway.

Ah yes, I forgot there was a patch applied to sysfs since 2.5.48. I've 
included everything since 2.5.48 in the one I've appended. 


	-pat

===== fs/sysfs/inode.c 1.60 vs 1.67 =====
--- 1.60/fs/sysfs/inode.c	Sat Nov 16 15:01:34 2002
+++ 1.67/fs/sysfs/inode.c	Thu Nov 21 17:01:53 2002
@@ -23,6 +23,8 @@
  * Please see Documentation/filesystems/sysfs.txt for more information.
  */
 
+#undef DEBUG 
+
 #include <linux/list.h>
 #include <linux/init.h>
 #include <linux/pagemap.h>
@@ -87,16 +89,17 @@
 	return inode;
 }
 
-static int sysfs_mknod(struct inode *dir, struct dentry *dentry, int mode, int dev)
+static int sysfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
 {
 	struct inode *inode;
 	int error = 0;
 
 	if (!dentry->d_inode) {
 		inode = sysfs_get_inode(dir->i_sb, mode, dev);
-		if (inode)
+		if (inode) {
 			d_instantiate(dentry, inode);
-		else
+			dget(dentry);
+		} else
 			error = -ENOSPC;
 	} else
 		error = -EEXIST;
@@ -142,17 +145,43 @@
 	return error;
 }
 
-static int sysfs_unlink(struct inode *dir, struct dentry *dentry)
+#define to_subsys(k) container_of(k,struct subsystem,kobj)
+#define to_sattr(a) container_of(a,struct subsys_attribute,attr)
+
+/**
+ * Subsystem file operations.
+ * These operations allow subsystems to have files that can be 
+ * read/written. 
+ */
+ssize_t subsys_attr_show(struct kobject * kobj, struct attribute * attr, 
+			 char * page, size_t count, loff_t off)
 {
-	struct inode *inode = dentry->d_inode;
-	down(&inode->i_sem);
-	dentry->d_inode->i_nlink--;
-	up(&inode->i_sem);
-	d_invalidate(dentry);
-	dput(dentry);
-	return 0;
+	struct subsystem * s = to_subsys(kobj);
+	struct subsys_attribute * sattr = to_sattr(attr);
+	ssize_t ret = 0;
+
+	if (sattr->show)
+		ret = sattr->show(s,page,count,off);
+	return ret;
 }
 
+ssize_t subsys_attr_store(struct kobject * kobj, struct attribute * attr,
+			  const char * page, size_t count, loff_t off)
+{
+	struct subsystem * s = to_subsys(kobj);
+	struct subsys_attribute * sattr = to_sattr(attr);
+	ssize_t ret = 0;
+
+	if (sattr->store)
+		ret = sattr->store(s,page,count,off);
+	return ret;
+}
+
+static struct sysfs_ops subsys_sysfs_ops = {
+	.show	= subsys_attr_show,
+	.store	= subsys_attr_store,
+};
+
 /**
  *	sysfs_read_file - read an attribute. 
  *	@file:	file pointer.
@@ -173,17 +202,11 @@
 sysfs_read_file(struct file *file, char *buf, size_t count, loff_t *ppos)
 {
 	struct attribute * attr = file->f_dentry->d_fsdata;
-	struct sysfs_ops * ops = NULL;
-	struct kobject * kobj;
+	struct sysfs_ops * ops = file->private_data;
+	struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
 	unsigned char *page;
 	ssize_t retval = 0;
 
-	kobj = file->f_dentry->d_parent->d_fsdata;
-	if (kobj && kobj->subsys)
-		ops = kobj->subsys->sysfs_ops;
-	if (!ops || !ops->show)
-		return 0;
-
 	if (count > PAGE_SIZE)
 		count = PAGE_SIZE;
 
@@ -234,16 +257,11 @@
 sysfs_write_file(struct file *file, const char *buf, size_t count, loff_t *ppos)
 {
 	struct attribute * attr = file->f_dentry->d_fsdata;
-	struct sysfs_ops * ops = NULL;
-	struct kobject * kobj;
+	struct sysfs_ops * ops = file->private_data;
+	struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
 	ssize_t retval = 0;
 	char * page;
 
-	kobj = file->f_dentry->d_parent->d_fsdata;
-	if (kobj && kobj->subsys)
-		ops = kobj->subsys->sysfs_ops;
-	if (!ops || !ops->store)
-		return 0;
 
 	page = (char *)__get_free_page(GFP_KERNEL);
 	if (!page)
@@ -275,21 +293,77 @@
 	return retval;
 }
 
-static int sysfs_open_file(struct inode * inode, struct file * filp)
+static int check_perm(struct inode * inode, struct file * file)
 {
-	struct kobject * kobj;
+	struct kobject * kobj = kobject_get(file->f_dentry->d_parent->d_fsdata);
+	struct attribute * attr = file->f_dentry->d_fsdata;
+	struct sysfs_ops * ops = NULL;
 	int error = 0;
 
-	kobj = filp->f_dentry->d_parent->d_fsdata;
-	if ((kobj = kobject_get(kobj))) {
-		struct attribute * attr = filp->f_dentry->d_fsdata;
-		if (!attr)
-			error = -EINVAL;
-	} else
-		error = -EINVAL;
+	if (!kobj || !attr)
+		goto Einval;
+
+	/* if the kobject has no subsystem, then it is a subsystem itself,
+	 * so give it the subsys_sysfs_ops.
+	 */
+	if (kobj->subsys)
+		ops = kobj->subsys->sysfs_ops;
+	else
+		ops = &subsys_sysfs_ops;
+
+	/* No sysfs operations, either from having no subsystem,
+	 * or the subsystem have no operations.
+	 */
+	if (!ops)
+		goto Eaccess;
+
+	/* File needs write support.
+	 * The inode's perms must say it's ok, 
+	 * and we must have a store method.
+	 */
+	if (file->f_mode & FMODE_WRITE) {
+
+		if (!(inode->i_mode & S_IWUGO))
+			goto Eperm;
+		if (!ops->store)
+			goto Eaccess;
+
+	}
+
+	/* File needs read support.
+	 * The inode's perms must say it's ok, and we there
+	 * must be a show method for it.
+	 */
+	if (file->f_mode & FMODE_READ) {
+		if (!(inode->i_mode & S_IRUGO))
+			goto Eperm;
+		if (!ops->show)
+			goto Eaccess;
+	}
+
+	/* No error? Great, store the ops in file->private_data
+	 * for easy access in the read/write functions.
+	 */
+	file->private_data = ops;
+	goto Done;
+
+ Einval:
+	error = -EINVAL;
+	goto Done;
+ Eaccess:
+	error = -EACCES;
+	goto Done;
+ Eperm:
+	error = -EPERM;
+ Done:
 	return error;
 }
 
+static int sysfs_open_file(struct inode * inode, struct file * filp)
+{
+	return check_perm(inode,filp);
+}
+
 static int sysfs_release(struct inode * inode, struct file * filp)
 {
 	struct kobject * kobj = filp->f_dentry->d_parent->d_fsdata;
@@ -541,7 +615,8 @@
 		/* make sure dentry is really there */
 		if (victim->d_inode && 
 		    (victim->d_parent->d_inode == dir->d_inode)) {
-			sysfs_unlink(dir->d_inode,victim);
+			simple_unlink(dir->d_inode,victim);
+			d_delete(victim);
 		}
 	}
 	up(&dir->d_inode->i_sem);
@@ -599,19 +674,16 @@
 	list_for_each_safe(node,next,&dentry->d_subdirs) {
 		struct dentry * d = list_entry(node,struct dentry,d_child);
 		/* make sure dentry is still there */
-		if (d->d_inode)
-			sysfs_unlink(dentry->d_inode,d);
+		if (d->d_inode) {
+			simple_unlink(dentry->d_inode,d);
+			d_delete(dentry);
+		}
 	}
 
-	d_invalidate(dentry);
-	if (simple_empty(dentry)) {
-		dentry->d_inode->i_nlink -= 2;
-		dentry->d_inode->i_flags |= S_DEAD;
-		parent->d_inode->i_nlink--;
-	}
 	up(&dentry->d_inode->i_sem);
-	dput(dentry);
-
+	d_invalidate(dentry);
+	simple_rmdir(parent->d_inode,dentry);
+	d_delete(dentry);
 	up(&parent->d_inode->i_sem);
 	dput(parent);
 }
@@ -622,4 +694,3 @@
 EXPORT_SYMBOL(sysfs_remove_link);
 EXPORT_SYMBOL(sysfs_create_dir);
 EXPORT_SYMBOL(sysfs_remove_dir);
-MODULE_LICENSE("GPL");

[-- Attachment #2: Type: TEXT/PLAIN, Size: 7403 bytes --]

/*
 * Noisy Interface for Linux
 *
 * This driver allows arbitrary printk's to be inserted into
 * executing kernel code by using the new kprobes interface.
 * A message is attached to an address, and when that address
 * is reached, the message is printed. 
 *
 * This uses a sysfs control file to manage a list of probes. 
 * The sysfs directory is at
 *
 * /sys/noisy/
 *
 * and the control is named 'ctl'. 
 *
 * A Noisy Probe can be added by echoing into the file, like:
 *
 *	$ echo "add <address> <message>" > /sys/noisy/ctl
 *
 * where <address> is the address to break on, and <message> 
 * is the message to print when the address is reached. 
 *
 * Probes can be removed by doing:
 *
 *	$ echo "del <address>" > /sys/noisy/ctl
 *
 * where <address> is the address of the probe.
 *
 * The probes themselves get a directory under /sys/noisy/, and
 * the name of the directory is the address of the probe. Each
 * probe directory contains one file ('message') that contains
 * the message to be printed. (More may be added later).
 *
 * Copyright (C) 2002 Rusty Lynch <rusty@linux.intel.com>
 */

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kprobes.h>
#include <linux/slab.h>
#include <linux/kobject.h>

/* exported by arch/YOURARCH/kernel/traps.c */
extern int valid_kernel_address(unsigned long);

#define MAX_MSG_SIZE 128

#define to_nprobe(entry) container_of(entry,struct nprobe,kobj.entry);

static DECLARE_MUTEX(noisy_sem);
static struct subsystem noisy_subsys;

/*
 * struct nrpobe: data structure for managing list of probe points
 */
struct nprobe {
	struct kprobe probe;
	char message[MAX_MSG_SIZE + 1];
	struct kobject kobj;
};

/* 
 * Probe handlers.
 * Only one is used (pre) to print the message out.
 */
static void noisy_pre_handler(struct kprobe *p, struct pt_regs *r)
{
	struct nprobe *c = container_of(p, struct nprobe, probe);
	printk(KERN_CRIT "%s: %s\n", __FUNCTION__, c->message);
}

static void noisy_post_handler(struct kprobe *p, struct pt_regs *r, 
			       unsigned long flags)
{ }

static int noisy_fault_handler(struct kprobe *p, struct pt_regs *r, int trapnr)
{
	/* Let the kernel handle this fault */
	return 0;
}


/*
 * struct noisy_attribute - used for defining probe attributes, with a 
 * typesafe show method.
 */
struct noisy_attribute {
	struct attribute attr;
	ssize_t (*show)(struct nprobe *,char *,size_t,loff_t);
};


/**
 *	noisy_attr_show - forward sysfs read call to proper handler.
 *	@kobj:	kobject of probe being acessed.
 *	@attr:	generic attribute portion of struct noisy_attribute.
 *	@page:	buffer to write into.
 *	@count:	number of bytes requested.
 *	@off:	offset into buffer.
 *
 *	This is called from sysfs and is necessary to convert the generic
 *	kobject into the right type, and to convert the attribute into the
 *	right attribute type.
 */

static ssize_t noisy_attr_show(struct kobject * kobj, struct attribute * attr,
			       char * page, size_t count, loff_t off)
{
	struct nprobe * n = container_of(kobj,struct nprobe,kobj);
	struct noisy_attribute * noisy_attr = container_of(attr,struct noisy_attribute,attr);
	return noisy_attr->show ? noisy_attr->show(n,page,count,off) : 0;
}

/*
 * noisy_sysfs_ops - sysfs operations for struct nprobes.
 */
static struct sysfs_ops noisy_sysfs_ops = {
	.show	= noisy_attr_show,
};


/* Default Attribute - the message to print out. */
static ssize_t noisy_message_read(struct nprobe * n, char * page, size_t count, loff_t off)
{
	return off ? 0: snprintf(page,MAX_MSG_SIZE,"%s\n",n->message);
}

static struct noisy_attribute attr_message = {
	.attr	= { .name = "message", .mode = S_IRUGO },
	.show	= noisy_message_read,
};

/* Declare array of default attributes to be added when an nprobe is added */
static struct attribute * default_attrs[] = {
	&attr_message.attr,
	NULL,
};


/* Declare noisy_subsys for addition to sysfs */
static struct subsystem noisy_subsys = {
	.kobj	= { .name = "noisy" },
	.default_attrs	= default_attrs,
	.sysfs_ops	= &noisy_sysfs_ops,
};


/*
 * noisy ctl attribute.
 * This is declared as an attribute of the subsystem, and added in 
 * noisy_init(). 
 * 
 * Reading this attribute dumps all the registered noisy probes.
 * Writing to it either adds or deletes a noisy probe, as described at 
 * the beginning of the file.
 */
static ssize_t ctl_show(struct subsystem * s, char * page, size_t count, loff_t off)
{
	char * str = page;
	int ret = 0;

	down(&noisy_sem);
	if (!off) {
		struct list_head * entry, * next;
		list_for_each_safe(entry,next,&noisy_subsys.list) {
			struct nprobe * n = to_nprobe(entry);
			if ((ret + MAX_MSG_SIZE) > PAGE_SIZE)
				break;
			str += snprintf(str,PAGE_SIZE - ret,"%p: %s\n",
					n->probe.addr,n->message);
			ret = str - page;
		}
	}
	up(&noisy_sem);
	return ret;
}

static int add(unsigned long addr, char * message)
{
	struct nprobe * n;
	int error = 0;

	if (!valid_kernel_address(addr))
		return -EFAULT;

	n = kmalloc(sizeof(struct nprobe),GFP_KERNEL);
	if (!n)
		return -ENOMEM;
	memset(n,0,sizeof(struct nprobe));

	n->probe.addr = (kprobe_opcode_t *)addr;
	strncpy(n->message,message,MAX_MSG_SIZE);
	n->probe.pre_handler = noisy_pre_handler;
	n->probe.post_handler = noisy_post_handler;
	n->probe.fault_handler = noisy_fault_handler;

	/* Doing this manually will be unnecessary soon. */
	kobject_init(&n->kobj);
	n->kobj.subsys = &noisy_subsys;
	snprintf(n->kobj.name, KOBJ_NAME_LEN, "%p", n->probe.addr);
	
	if ((error = register_kprobe(&n->probe))) {
		printk(KERN_ERR "Unable to register probe at %p\n", 
		       (n->probe).addr);
		goto Error;
	}
	if ((error = kobject_register(&n->kobj))) {
		unregister_kprobe(&n->probe);
		goto Error;
	}
	goto Done;
 Error:
	kfree(n);
 Done:
	return error;
}

static int del(unsigned long addr)
{
	struct list_head * entry;
	struct nprobe * n;

	list_for_each(entry,&noisy_subsys.list) {
		n = to_nprobe(entry);
		if ((unsigned long)(n->probe.addr) == addr) {
			kobject_unregister(&n->kobj);
			unregister_kprobe(&n->probe);
			return 0;
		}
	}
	return -EFAULT;
}

static ssize_t ctl_store(struct subsystem * s, const char * page, size_t count, loff_t off)
{
	char message[MAX_MSG_SIZE];
	char ctl[16];
	unsigned long addr;
	int num;
	int error;
	int ret = 0;

	down(&noisy_sem);
	if (off)
		goto Done;
	num = sscanf(page,"%15s 0x%lx %128s",ctl,&addr,message);
	if (!num) {
		error = -EINVAL;
		goto Done;
	}
	if (!strcmp(ctl,"add") && num == 3)
		error = add(addr,message);
	else if (!strcmp(ctl,"del") && num == 2)
		error = del(addr);
	else
		error = -EINVAL;
	ret = error ? error : count;
 Done:
	up(&noisy_sem);
	return ret;
}

static struct subsys_attribute subsys_attr_ctl = {
	.attr	= { .name = "ctl", .mode = 0644 },
	.show	= ctl_show,
	.store	= ctl_store,
};


static int __init noisy_init(void)
{
	subsystem_register(&noisy_subsys);
	subsys_create_file(&noisy_subsys,&subsys_attr_ctl);
	return 0;
}

static void __exit noisy_exit (void)
{
	subsys_remove_file(&noisy_subsys,&subsys_attr_ctl);
	subsystem_unregister(&noisy_subsys);
}

module_init(noisy_init);
module_exit(noisy_exit);

MODULE_AUTHOR("Rusty Lynch");
MODULE_LICENSE("GPL");

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 23:03     ` Patrick Mochel
@ 2002-11-23  0:32       ` Rusty Lynch
  2002-11-25 17:29         ` Patrick Mochel
  2002-11-24 13:04       ` Werner Almesberger
  1 sibling, 1 reply; 15+ messages in thread
From: Rusty Lynch @ 2002-11-23  0:32 UTC (permalink / raw)
  To: Patrick Mochel; +Cc: Linux Kernel Mailing List

Patrick,

Your changes added various subsys_* calls that do not seem to
exist in the latest bk tree, or in the patch you sent earlier in this
thread.

Do you have a local implementation of:

subsys_create_file()
subsys_remove_file()

and defines struct subsys_attribute?

    -rustyl

----- Original Message -----
From: "Patrick Mochel" <mochel@osdl.org>
To: "Rusty Lynch" <rusty@linux.co.intel.com>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Sent: Thursday, November 21, 2002 3:03 PM
Subject: Re: [BUG] sysfs on 2.5.48 unable to remove files while in use


>
> > Very cool.  Bitkeeper is one of those things I never bothered
> > with yet (mainly because I feel some comfortable with CVS.)
> > It looks like it might be worth going through ramp-up time
> > on bk to keep up with changes to the kernel.
>
> There are also nightly snapshots, though I don't recall the URL off the
> top of my head..
>
> > > It seems that having a pure sysfs implementation would be superior,
> > > instead of having to use a character device to write to. After looking
> > > into this, I realize that a couple of pieces of infrastructure are
needed,
> > > so I'm working on that, and will post a modified version of your
module
> > > once I'm done..
> >
> > I look forward to seeing it.
>
> Ok, there are basically two parts to it: the required modifications to
> sysfs and your updated module.
>
> The appended patch adds the support to sysfs be defining struct
> subsys_attribute for declaring attributes of subsystems themselves, as
> well as the needed helpers for creation/teardown and read/write.
>
> I've attached a replacement for noisy.c that creates a sysfs file named
> 'ctl' that handles addition and deletion of nprobes, similar to the char
> device you had created. From the top of the file:
>
> /*
>  * Noisy Interface for Linux
>  *
>  * This driver allows arbitrary printk's to be inserted into
>  * executing kernel code by using the new kprobes interface.
>  * A message is attached to an address, and when that address
>  * is reached, the message is printed.
>  *
>  * This uses a sysfs control file to manage a list of probes.
>  * The sysfs directory is at
>  *
>  * /sys/noisy/
>  *
>  * and the control is named 'ctl'.
>  *
>  * A Noisy Probe can be added by echoing into the file, like:
>  *
>  * $ echo "add <address> <message>" > /sys/noisy/ctl
>  *
>  * where <address> is the address to break on, and <message>
>  * is the message to print when the address is reached.
>  *
>  * Probes can be removed by doing:
>  *
>  * $ echo "del <address>" > /sys/noisy/ctl
>  *
>  * where <address> is the address of the probe.
>  *
>  * The probes themselves get a directory under /sys/noisy/, and
>  * the name of the directory is the address of the probe. Each
>  * probe directory contains one file ('message') that contains
>  * the message to be printed. (More may be added later).
>  */
>
> I've tried to comment the changes and the necessary steps in making this
> work.
>
> [ Note: While I'm generally happy with the way things work, I realize that
> it still requires a decent amount of overhead in using the sysfs interface
> (see the file). I'll be looking into shrinking this...]
>
> Everything seems to work..
>
> > It looks like the patch is against the bk tree, and does not apply
cleanly
> > to
> > strait 2.5.48.  I don't know how much has changed to sysfs/inode.c, but
> > I can see where the last hunk is looking too far up, so I'll try it
anyway.
>
> Ah yes, I forgot there was a patch applied to sysfs since 2.5.48. I've
> included everything since 2.5.48 in the one I've appended.
>
>
> -pat
>
> ===== fs/sysfs/inode.c 1.60 vs 1.67 =====
> --- 1.60/fs/sysfs/inode.c Sat Nov 16 15:01:34 2002
> +++ 1.67/fs/sysfs/inode.c Thu Nov 21 17:01:53 2002
> @@ -23,6 +23,8 @@
>   * Please see Documentation/filesystems/sysfs.txt for more information.
>   */
>
> +#undef DEBUG
> +
>  #include <linux/list.h>
>  #include <linux/init.h>
>  #include <linux/pagemap.h>
> @@ -87,16 +89,17 @@
>   return inode;
>  }
>
> -static int sysfs_mknod(struct inode *dir, struct dentry *dentry, int
mode, int dev)
> +static int sysfs_mknod(struct inode *dir, struct dentry *dentry, int
mode, dev_t dev)
>  {
>   struct inode *inode;
>   int error = 0;
>
>   if (!dentry->d_inode) {
>   inode = sysfs_get_inode(dir->i_sb, mode, dev);
> - if (inode)
> + if (inode) {
>   d_instantiate(dentry, inode);
> - else
> + dget(dentry);
> + } else
>   error = -ENOSPC;
>   } else
>   error = -EEXIST;
> @@ -142,17 +145,43 @@
>   return error;
>  }
>
> -static int sysfs_unlink(struct inode *dir, struct dentry *dentry)
> +#define to_subsys(k) container_of(k,struct subsystem,kobj)
> +#define to_sattr(a) container_of(a,struct subsys_attribute,attr)
> +
> +/**
> + * Subsystem file operations.
> + * These operations allow subsystems to have files that can be
> + * read/written.
> + */
> +ssize_t subsys_attr_show(struct kobject * kobj, struct attribute * attr,
> + char * page, size_t count, loff_t off)
>  {
> - struct inode *inode = dentry->d_inode;
> - down(&inode->i_sem);
> - dentry->d_inode->i_nlink--;
> - up(&inode->i_sem);
> - d_invalidate(dentry);
> - dput(dentry);
> - return 0;
> + struct subsystem * s = to_subsys(kobj);
> + struct subsys_attribute * sattr = to_sattr(attr);
> + ssize_t ret = 0;
> +
> + if (sattr->show)
> + ret = sattr->show(s,page,count,off);
> + return ret;
>  }
>
> +ssize_t subsys_attr_store(struct kobject * kobj, struct attribute * attr,
> +   const char * page, size_t count, loff_t off)
> +{
> + struct subsystem * s = to_subsys(kobj);
> + struct subsys_attribute * sattr = to_sattr(attr);
> + ssize_t ret = 0;
> +
> + if (sattr->store)
> + ret = sattr->store(s,page,count,off);
> + return ret;
> +}
> +
> +static struct sysfs_ops subsys_sysfs_ops = {
> + .show = subsys_attr_show,
> + .store = subsys_attr_store,
> +};
> +
>  /**
>   * sysfs_read_file - read an attribute.
>   * @file: file pointer.
> @@ -173,17 +202,11 @@
>  sysfs_read_file(struct file *file, char *buf, size_t count, loff_t *ppos)
>  {
>   struct attribute * attr = file->f_dentry->d_fsdata;
> - struct sysfs_ops * ops = NULL;
> - struct kobject * kobj;
> + struct sysfs_ops * ops = file->private_data;
> + struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
>   unsigned char *page;
>   ssize_t retval = 0;
>
> - kobj = file->f_dentry->d_parent->d_fsdata;
> - if (kobj && kobj->subsys)
> - ops = kobj->subsys->sysfs_ops;
> - if (!ops || !ops->show)
> - return 0;
> -
>   if (count > PAGE_SIZE)
>   count = PAGE_SIZE;
>
> @@ -234,16 +257,11 @@
>  sysfs_write_file(struct file *file, const char *buf, size_t count, loff_t
*ppos)
>  {
>   struct attribute * attr = file->f_dentry->d_fsdata;
> - struct sysfs_ops * ops = NULL;
> - struct kobject * kobj;
> + struct sysfs_ops * ops = file->private_data;
> + struct kobject * kobj = file->f_dentry->d_parent->d_fsdata;
>   ssize_t retval = 0;
>   char * page;
>
> - kobj = file->f_dentry->d_parent->d_fsdata;
> - if (kobj && kobj->subsys)
> - ops = kobj->subsys->sysfs_ops;
> - if (!ops || !ops->store)
> - return 0;
>
>   page = (char *)__get_free_page(GFP_KERNEL);
>   if (!page)
> @@ -275,21 +293,77 @@
>   return retval;
>  }
>
> -static int sysfs_open_file(struct inode * inode, struct file * filp)
> +static int check_perm(struct inode * inode, struct file * file)
>  {
> - struct kobject * kobj;
> + struct kobject * kobj = kobject_get(file->f_dentry->d_parent->d_fsdata);
> + struct attribute * attr = file->f_dentry->d_fsdata;
> + struct sysfs_ops * ops = NULL;
>   int error = 0;
>
> - kobj = filp->f_dentry->d_parent->d_fsdata;
> - if ((kobj = kobject_get(kobj))) {
> - struct attribute * attr = filp->f_dentry->d_fsdata;
> - if (!attr)
> - error = -EINVAL;
> - } else
> - error = -EINVAL;
> + if (!kobj || !attr)
> + goto Einval;
> +
> + /* if the kobject has no subsystem, then it is a subsystem itself,
> + * so give it the subsys_sysfs_ops.
> + */
> + if (kobj->subsys)
> + ops = kobj->subsys->sysfs_ops;
> + else
> + ops = &subsys_sysfs_ops;
> +
> + /* No sysfs operations, either from having no subsystem,
> + * or the subsystem have no operations.
> + */
> + if (!ops)
> + goto Eaccess;
> +
> + /* File needs write support.
> + * The inode's perms must say it's ok,
> + * and we must have a store method.
> + */
> + if (file->f_mode & FMODE_WRITE) {
> +
> + if (!(inode->i_mode & S_IWUGO))
> + goto Eperm;
> + if (!ops->store)
> + goto Eaccess;
> +
> + }
> +
> + /* File needs read support.
> + * The inode's perms must say it's ok, and we there
> + * must be a show method for it.
> + */
> + if (file->f_mode & FMODE_READ) {
> + if (!(inode->i_mode & S_IRUGO))
> + goto Eperm;
> + if (!ops->show)
> + goto Eaccess;
> + }
> +
> + /* No error? Great, store the ops in file->private_data
> + * for easy access in the read/write functions.
> + */
> + file->private_data = ops;
> + goto Done;
> +
> + Einval:
> + error = -EINVAL;
> + goto Done;
> + Eaccess:
> + error = -EACCES;
> + goto Done;
> + Eperm:
> + error = -EPERM;
> + Done:
>   return error;
>  }
>
> +static int sysfs_open_file(struct inode * inode, struct file * filp)
> +{
> + return check_perm(inode,filp);
> +}
> +
>  static int sysfs_release(struct inode * inode, struct file * filp)
>  {
>   struct kobject * kobj = filp->f_dentry->d_parent->d_fsdata;
> @@ -541,7 +615,8 @@
>   /* make sure dentry is really there */
>   if (victim->d_inode &&
>       (victim->d_parent->d_inode == dir->d_inode)) {
> - sysfs_unlink(dir->d_inode,victim);
> + simple_unlink(dir->d_inode,victim);
> + d_delete(victim);
>   }
>   }
>   up(&dir->d_inode->i_sem);
> @@ -599,19 +674,16 @@
>   list_for_each_safe(node,next,&dentry->d_subdirs) {
>   struct dentry * d = list_entry(node,struct dentry,d_child);
>   /* make sure dentry is still there */
> - if (d->d_inode)
> - sysfs_unlink(dentry->d_inode,d);
> + if (d->d_inode) {
> + simple_unlink(dentry->d_inode,d);
> + d_delete(dentry);
> + }
>   }
>
> - d_invalidate(dentry);
> - if (simple_empty(dentry)) {
> - dentry->d_inode->i_nlink -= 2;
> - dentry->d_inode->i_flags |= S_DEAD;
> - parent->d_inode->i_nlink--;
> - }
>   up(&dentry->d_inode->i_sem);
> - dput(dentry);
> -
> + d_invalidate(dentry);
> + simple_rmdir(parent->d_inode,dentry);
> + d_delete(dentry);
>   up(&parent->d_inode->i_sem);
>   dput(parent);
>  }
> @@ -622,4 +694,3 @@
>  EXPORT_SYMBOL(sysfs_remove_link);
>  EXPORT_SYMBOL(sysfs_create_dir);
>  EXPORT_SYMBOL(sysfs_remove_dir);
> -MODULE_LICENSE("GPL");
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-21 23:03     ` Patrick Mochel
  2002-11-23  0:32       ` Rusty Lynch
@ 2002-11-24 13:04       ` Werner Almesberger
  2002-11-24 13:38         ` Alexander Viro
  1 sibling, 1 reply; 15+ messages in thread
From: Werner Almesberger @ 2002-11-24 13:04 UTC (permalink / raw)
  To: Patrick Mochel; +Cc: Rusty Lynch, Linux Kernel Mailing List

Patrick Mochel wrote:
>  * This uses a sysfs control file to manage a list of probes. 
>  * The sysfs directory is at
>  *
>  * /sys/noisy/

I really like the idea of controlling kprobes through sysfs.
However, ...

>  * A Noisy Probe can be added by echoing into the file, like:
>  *
>  *	$ echo "add <address> <message>" > /sys/noisy/ctl

do you really need a "magic" file for this ? I don't know how
well sysfs supports mkdir/rmdir (if at all), but they would
seem to provide a much more natural interface. (VFS allows
rmdir to remove non-empty directories, so you wouldn't have
to rm -r.)

I don't think you need probe installation and message setting
to be atomic, so you could just assign a unique default
message, e.g. the probe address.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-24 13:04       ` Werner Almesberger
@ 2002-11-24 13:38         ` Alexander Viro
  2002-11-24 14:32           ` Werner Almesberger
  2002-11-24 14:51           ` Roman Zippel
  0 siblings, 2 replies; 15+ messages in thread
From: Alexander Viro @ 2002-11-24 13:38 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Patrick Mochel, Rusty Lynch, Linux Kernel Mailing List



On Sun, 24 Nov 2002, Werner Almesberger wrote:

> do you really need a "magic" file for this ? I don't know how
> well sysfs supports mkdir/rmdir (if at all), but they would
> seem to provide a much more natural interface. (VFS allows
> rmdir to remove non-empty directories, so you wouldn't have
> to rm -r.)

a) sysfs doesn't allow mkdir/rmdir and thus avoids an imperial buttload
of races - witness the crap in devfs.

b) rmdir of non-empty directory pretty much guarantees another buttload of
races.

c) mkdir creating non-empty directory or rmdir removing non-empty directory
is *ugly*.  BTW, Roman's "filesystem" for modules in its current form is
vetoed, as far as I'm concerned - this sort of magic is just plain wrong.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-24 13:38         ` Alexander Viro
@ 2002-11-24 14:32           ` Werner Almesberger
  2002-11-25 18:34             ` Patrick Mochel
  2002-11-24 14:51           ` Roman Zippel
  1 sibling, 1 reply; 15+ messages in thread
From: Werner Almesberger @ 2002-11-24 14:32 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Patrick Mochel, Rusty Lynch, Linux Kernel Mailing List

Alexander Viro wrote:
> a) sysfs doesn't allow mkdir/rmdir and thus avoids an imperial buttload
> of races - witness the crap in devfs.

But isn't one of the problems there that kernel and user space can
both initiate changes ? What I'm proposing is to let this be driven
by user space. You'd of course still have different policies in
different parts of the sysfs hierarchy, but would that really be a
problem ?

> b) rmdir of non-empty directory pretty much guarantees another buttload of
> races.

Hmm, if the sysfs user could veto file creation while the removal is
in progress, behaviour like rm -r would certainly be sufficient.
Even if it can't veto it, this may be good enough, since user space
is in charge of file creation (after mkdir) anyway.

The main reason why I think rmdir with rm -r functionality would be
useful in this case, is that, in order to implement a "remove probe"
functionality, the application wouldn't have to know what exactly
lives in that directory, and it also wouldn't have to implement a
real rm -r (or rm xxx/* && rmdir xxx), which I'd also consider a
more powerful operation than a simple rmdir.

I can see a potential issue with a proliferation of callbacks,
though.

> c) mkdir creating non-empty directory or rmdir removing non-empty directory
> is *ugly*.

Uglier than a "magic" file that then goes and creates/removes
directories and files in them ? Why don't we  echo mkdir foo >.
then ? ;-)

Besides, there's some precedent: . and .. are also handled
implicitly. (And do we really have no file systems that store
some meta-data in "magic" files that are created/removed
implicitly ?)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-24 13:38         ` Alexander Viro
  2002-11-24 14:32           ` Werner Almesberger
@ 2002-11-24 14:51           ` Roman Zippel
  1 sibling, 0 replies; 15+ messages in thread
From: Roman Zippel @ 2002-11-24 14:51 UTC (permalink / raw)
  To: Alexander Viro
  Cc: Werner Almesberger, Patrick Mochel, Rusty Lynch,
	Linux Kernel Mailing List

Hi,

On Sun, 24 Nov 2002, Alexander Viro wrote:

> c) mkdir creating non-empty directory or rmdir removing non-empty directory
> is *ugly*.  BTW, Roman's "filesystem" for modules in its current form is
> vetoed, as far as I'm concerned - this sort of magic is just plain wrong.

I'm open to alternative ideas.
If the contents of the dir again is controlled by the fs itself, I don't 
really see a problem. You need some way to tell the kernel "please create 
this object and give me an interface to control it".

bye, Roman


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-23  0:32       ` Rusty Lynch
@ 2002-11-25 17:29         ` Patrick Mochel
  0 siblings, 0 replies; 15+ messages in thread
From: Patrick Mochel @ 2002-11-25 17:29 UTC (permalink / raw)
  To: Rusty Lynch; +Cc: Linux Kernel Mailing List


On Fri, 22 Nov 2002, Rusty Lynch wrote:

> Patrick,
> 
> Your changes added various subsys_* calls that do not seem to
> exist in the latest bk tree, or in the patch you sent earlier in this
> thread.
> 
> Do you have a local implementation of:
> 
> subsys_create_file()
> subsys_remove_file()
> 
> and defines struct subsys_attribute?

Woops, I forgot to include that part of the patch. Attached is what I have 
against current BK, which should also be 2.5.49. This should be everything 
you need to make it work. Sorry about that.

	-pat

diff -Nru a/fs/sysfs/inode.c b/fs/sysfs/inode.c
--- a/fs/sysfs/inode.c	Mon Nov 25 11:28:31 2002
+++ b/fs/sysfs/inode.c	Mon Nov 25 11:28:31 2002
@@ -145,6 +145,43 @@
 	return error;
 }
 
+#define to_subsys(k) container_of(k,struct subsystem,kobj)
+#define to_sattr(a) container_of(a,struct subsys_attribute,attr)
+
+/**
+ * Subsystem file operations.
+ * These operations allow subsystems to have files that can be 
+ * read/written. 
+ */
+ssize_t subsys_attr_show(struct kobject * kobj, struct attribute * attr, 
+			 char * page, size_t count, loff_t off)
+{
+	struct subsystem * s = to_subsys(kobj);
+	struct subsys_attribute * sattr = to_sattr(attr);
+	ssize_t ret = 0;
+
+	if (sattr->show)
+		ret = sattr->show(s,page,count,off);
+	return ret;
+}
+
+ssize_t subsys_attr_store(struct kobject * kobj, struct attribute * attr,
+			  const char * page, size_t count, loff_t off)
+{
+	struct subsystem * s = to_subsys(kobj);
+	struct subsys_attribute * sattr = to_sattr(attr);
+	ssize_t ret = 0;
+
+	if (sattr->store)
+		ret = sattr->store(s,page,count,off);
+	return ret;
+}
+
+static struct sysfs_ops subsys_sysfs_ops = {
+	.show	= subsys_attr_show,
+	.store	= subsys_attr_store,
+};
+
 /**
  *	sysfs_read_file - read an attribute. 
  *	@file:	file pointer.
@@ -265,9 +302,14 @@
 
 	if (!kobj || !attr)
 		goto Einval;
-	
+
+	/* if the kobject has no subsystem, then it is a subsystem itself,
+	 * so give it the subsys_sysfs_ops.
+	 */
 	if (kobj->subsys)
 		ops = kobj->subsys->sysfs_ops;
+	else
+		ops = &subsys_sysfs_ops;
 
 	/* No sysfs operations, either from having no subsystem,
 	 * or the subsystem have no operations.
diff -Nru a/include/linux/kobject.h b/include/linux/kobject.h
--- a/include/linux/kobject.h	Mon Nov 25 11:28:31 2002
+++ b/include/linux/kobject.h	Mon Nov 25 11:28:31 2002
@@ -60,4 +60,13 @@
 	kobject_put(&s->kobj);
 }
 
+struct subsys_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct subsystem *, char *, size_t, loff_t);
+	ssize_t (*store)(struct subsystem *, const char *, size_t, loff_t); 
+};
+
+extern int subsys_create_file(struct subsystem * , struct subsys_attribute *);
+extern void subsys_remove_file(struct subsystem * , struct subsys_attribute *);
+
 #endif /* _KOBJECT_H_ */
diff -Nru a/lib/kobject.c b/lib/kobject.c
--- a/lib/kobject.c	Mon Nov 25 11:28:31 2002
+++ b/lib/kobject.c	Mon Nov 25 11:28:31 2002
@@ -229,6 +229,38 @@
 }
 
 
+/**
+ *	subsystem_create_file - export sysfs attribute file.
+ *	@s:	subsystem.
+ *	@a:	subsystem attribute descriptor.
+ */
+
+int subsys_create_file(struct subsystem * s, struct subsys_attribute * a)
+{
+	int error = 0;
+	if (subsys_get(s)) {
+		error = sysfs_create_file(&s->kobj,&a->attr);
+		subsys_put(s);
+	}
+	return error;
+}
+
+
+/**
+ *	subsystem_remove_file - remove sysfs attribute file.
+ *	@s:	subsystem.
+ *	@a:	attribute desciptor.
+ */
+
+void subsys_remove_file(struct subsystem * s, struct subsys_attribute * a)
+{
+	if (subsys_get(s)) {
+		sysfs_remove_file(&s->kobj,&a->attr);
+		subsys_put(s);
+	}
+}
+
+
 EXPORT_SYMBOL(kobject_init);
 EXPORT_SYMBOL(kobject_register);
 EXPORT_SYMBOL(kobject_unregister);
@@ -238,3 +270,5 @@
 EXPORT_SYMBOL(subsystem_init);
 EXPORT_SYMBOL(subsystem_register);
 EXPORT_SYMBOL(subsystem_unregister);
+EXPORT_SYMBOL(subsys_create_file);
+EXPORT_SYMBOL(subsys_remove_file);


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-24 14:32           ` Werner Almesberger
@ 2002-11-25 18:34             ` Patrick Mochel
  2002-11-25 19:13               ` Werner Almesberger
  2002-11-25 19:40               ` Roman Zippel
  0 siblings, 2 replies; 15+ messages in thread
From: Patrick Mochel @ 2002-11-25 18:34 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Linux Kernel Mailing List


On Sun, 24 Nov 2002, Werner Almesberger wrote:

> Alexander Viro wrote:
> > a) sysfs doesn't allow mkdir/rmdir and thus avoids an imperial buttload
> > of races - witness the crap in devfs.
> 
> But isn't one of the problems there that kernel and user space can
> both initiate changes ? What I'm proposing is to let this be driven
> by user space. You'd of course still have different policies in
> different parts of the sysfs hierarchy, but would that really be a
> problem ?

Yes. The fs hasn't allowed userspace to create and remove files and 
directories because it's too complex to be worth it. There are a number of 
races to be dealt with, plus you have to make the filesystem deal with the 
policy of interpreting the user's request properly. 

> > c) mkdir creating non-empty directory or rmdir removing non-empty directory
> > is *ugly*.
> 
> Uglier than a "magic" file that then goes and creates/removes
> directories and files in them ? Why don't we  echo mkdir foo >.
> then ? ;-)

On the surface, that's what's happening, and it's the same way procfs has
worked for ages.  It's not great, but it works well for specific purposes.

The difference is that sysfs directories are tied to kobjects. By writing
to the file with the specific syntax, you are telling the module to create
an object with the parameters you give. Once the object is registered, a 
directory is created for it, and it's only removed when the object is 
unregistered. We don't just randomly create directories. 

The object type is context dependent, as well as the parameters and the
syntax to write to the file.  That's the flexibility we can afford.

[ From a purist standpoint, it's still a little weird. Al has been telling
me for ages that the only proper way to do it is to have each object get a
mountpoint instead of a directory. According to him, which I generally
take as gospel, it's the only way to do in-kernel filesystems in a
race-free way. It's hard, and IIRC, there are several infrastructural
changes that must take place in order for it to happen. (I think I still
have the IRC log somewhere..) But, it might happen someday.. ]


	-pat


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-25 18:34             ` Patrick Mochel
@ 2002-11-25 19:13               ` Werner Almesberger
  2002-11-25 19:40               ` Roman Zippel
  1 sibling, 0 replies; 15+ messages in thread
From: Werner Almesberger @ 2002-11-25 19:13 UTC (permalink / raw)
  To: Patrick Mochel; +Cc: Linux Kernel Mailing List

Patrick Mochel wrote:
> The difference is that sysfs directories are tied to kobjects. By writing
> to the file with the specific syntax, you are telling the module to create
> an object with the parameters you give. Once the object is registered, a 
> directory is created for it, and it's only removed when the object is 
> unregistered. We don't just randomly create directories. 

Yes, and I think that's perfect. All I'm suggesting is to use
mkdir/rmdir to make the creation/removal request, and then use
whatever is convenient for ensuring that things stay unique (i.e.
resolve it either at the VFS, kprobes, or "glue" level).

I fully agree that creating interrelated objects at two distinct
places in the kernel and then trying to "synchronize" them leads
to madness. (*)

(*) Actually, someone in academia who works with protocol
    validations, and has a bit too much time on his or her hands,
    should once try to find an elegant way of linking this type of
    in-kernel dependencies to a validation tool like Spin.

    Right now, the process for validating such a mess is still to
    abstract the design into a model in a language like Promela
    (very slick, but decidedly "alien"), SyncC++ (C-ish, but still
    too far from real kernel code), etc., and to validate the
    model.

    I've done this on some occasions (and discovered interesting
    bugs in the process), but the pain threshold required is a bit
    too high to suggest this as a general procedure.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [BUG] sysfs on 2.5.48 unable to remove files while in use
  2002-11-25 18:34             ` Patrick Mochel
  2002-11-25 19:13               ` Werner Almesberger
@ 2002-11-25 19:40               ` Roman Zippel
  1 sibling, 0 replies; 15+ messages in thread
From: Roman Zippel @ 2002-11-25 19:40 UTC (permalink / raw)
  To: Patrick Mochel
  Cc: Werner Almesberger, Alexander Viro, Linux Kernel Mailing List

Hi,

On Mon, 25 Nov 2002, Patrick Mochel wrote:

> [ From a purist standpoint, it's still a little weird. Al has been telling
> me for ages that the only proper way to do it is to have each object get a
> mountpoint instead of a directory. According to him, which I generally
> take as gospel, it's the only way to do in-kernel filesystems in a
> race-free way. It's hard, and IIRC, there are several infrastructural
> changes that must take place in order for it to happen. (I think I still
> have the IRC log somewhere..) But, it might happen someday.. ]

What advantages do mountpoints have compared to normal directories?

bye, Roman


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2002-11-25 19:33 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-11-21 15:39 [BUG] sysfs on 2.5.48 unable to remove files while in use Rusty Lynch
2002-11-21 17:04 ` Patrick Mochel
2002-11-21 17:20   ` Rusty Lynch
2002-11-21 20:45 ` Patrick Mochel
2002-11-21 22:07   ` Rusty Lynch
2002-11-21 23:03     ` Patrick Mochel
2002-11-23  0:32       ` Rusty Lynch
2002-11-25 17:29         ` Patrick Mochel
2002-11-24 13:04       ` Werner Almesberger
2002-11-24 13:38         ` Alexander Viro
2002-11-24 14:32           ` Werner Almesberger
2002-11-25 18:34             ` Patrick Mochel
2002-11-25 19:13               ` Werner Almesberger
2002-11-25 19:40               ` Roman Zippel
2002-11-24 14:51           ` Roman Zippel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.