linux-hotplug.vger.kernel.org archive mirror
* Re: very slow NFS boot on linux-next and -mm
       [not found] <20090503105456.GA30449@localhost>
@ 2009-05-03 12:40 ` Wu Fengguang
  2009-05-03 12:56   ` Kay Sievers
  0 siblings, 1 reply; 5+ messages in thread
From: Wu Fengguang @ 2009-05-03 12:40 UTC
  To: linux-nfs; +Cc: Trond Myklebust, linux-hotplug

Ah, it's not really an NFS problem.

On Sun, May 03, 2009 at 06:54:56PM +0800, Wu Fengguang wrote:
> 
> My NFSROOT box sometimes will get stuck for a dozen seconds during boot.
> 
[snip]
> The NFSROOT client that got stuck is running latest linux-next, the
> server side is running 2.6.30-rc3.
> 
> Switching the client kernel to latest -mm makes it very reproducible.
> Here is another stack dump:
> 
> [  180.399845] udevd         D ffff8800280291e0  3056  1388      1

The udevd is doing this busy loop like mad:

        open("/etc/group", O_RDONLY|O_CLOEXEC)  = 7
        lseek(7, 0, SEEK_CUR)                   = 0
        fstat(7, {st_mode=S_IFREG|0644, st_size=797, ...}) = 0
        mmap(NULL, 797, PROT_READ, MAP_SHARED, 7, 0) = 0x7f6791601000
        lseek(7, 797, SEEK_SET)                 = 797
        munmap(0x7f6791601000, 797)             = 0
        close(7)                                = 0

udev version is 0.141-1.

Thanks,
Fengguang



* Re: very slow NFS boot on linux-next and -mm
  2009-05-03 12:40 ` very slow NFS boot on linux-next and -mm Wu Fengguang
@ 2009-05-03 12:56   ` Kay Sievers
       [not found]     ` <20090504133019.GA14300@localhost>
  0 siblings, 1 reply; 5+ messages in thread
From: Kay Sievers @ 2009-05-03 12:56 UTC
  To: Wu Fengguang; +Cc: linux-nfs, Trond Myklebust, linux-hotplug

On Sun, May 3, 2009 at 14:40, Wu Fengguang <fengguang.wu@intel.com> wrote:

> The udevd is doing this busy loop like mad:
>
>        open("/etc/group", O_RDONLY|O_CLOEXEC)  = 7
>        lseek(7, 0, SEEK_CUR)                   = 0
>        fstat(7, {st_mode=S_IFREG|0644, st_size=797, ...}) = 0
>        mmap(NULL, 797, PROT_READ, MAP_SHARED, 7, 0) = 0x7f6791601000
>        lseek(7, 797, SEEK_SET)                 = 797
>        munmap(0x7f6791601000, 797)             = 0
>        close(7)                                = 0
>
> udev version is 0.141-1.

Are you sure that's "busy loop" in that sense, and not just "load"?

If your config carries OWNER= or GROUP= rules in the udev rules, but
your system does not have these users or groups configured, glibc will
try to resolve these names to the uid/gid with every event that needs
these values.

If the user/group names configured in the rules can be resolved at
startup of udevd, they are cached and never looked up again.

Are you sure that your system is configured correctly with regard to
the rules in use and the system user/group names they reference?

Thanks,
Kay


* Re: very slow NFS boot on linux-next and -mm
       [not found]     ` <20090504133019.GA14300@localhost>
@ 2009-05-06 13:42       ` Kay Sievers
  2009-05-07 11:59         ` [PATCH] inotify: report rounded-up event size to user space Wu Fengguang
  0 siblings, 1 reply; 5+ messages in thread
From: Kay Sievers @ 2009-05-06 13:42 UTC
  To: Wu Fengguang
  Cc: linux-nfs@vger.kernel.org, Trond Myklebust,
	linux-hotplug@vger.kernel.org

On Mon, May 4, 2009 at 15:30, Wu Fengguang <fengguang.wu@intel.com> wrote:

> I tried removing every udev rule in /etc/udev/ and /lib/udev; the /etc/group
> accesses disappeared from strace, but udevd is still busy.

> ppoll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=3, events=POLLIN}], 3, NULL, [], 8) = 1 ([{fd=3, revents=POLLIN}])
> ioctl(3, FIONREAD, [39])                = 0
> read(3, 0x62ad60, 39)                   = -1 EINVAL (Invalid argument)

It seems you have issues with inotify on your NFS mount?

Inotify wakes up udevd to say that something in the rules directory has
changed, but it does not seem to return anything useful and keeps
waking us up. That causes an endless loop of parsing rules files.

Thanks,
Kay


* [PATCH] inotify: report rounded-up event size to user space
  2009-05-06 13:42       ` Kay Sievers
@ 2009-05-07 11:59         ` Wu Fengguang
  2009-05-07 12:16           ` Eric Paris
  0 siblings, 1 reply; 5+ messages in thread
From: Wu Fengguang @ 2009-05-07 11:59 UTC
  To: Andrew Morton, Kay Sievers
  Cc: linux-nfs@vger.kernel.org, Trond Myklebust,
	linux-hotplug@vger.kernel.org, Eric Paris, Al Viro,
	Christoph Hellwig

On Wed, May 06, 2009 at 09:42:58PM +0800, Kay Sievers wrote:
> On Mon, May 4, 2009 at 15:30, Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > I tried removing every udev rule in /etc/udev/ and /lib/udev; the /etc/group
> > accesses disappeared from strace, but udevd is still busy.
> 
> > ppoll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=3, events=POLLIN}], 3, NULL, [], 8) = 1 ([{fd=3, revents=POLLIN}])
> > ioctl(3, FIONREAD, [39])                = 0
> > read(3, 0x62ad60, 39)                   = -1 EINVAL (Invalid argument)
> 
> It seems you have issues with inotify on your NFS mount?
> 
> Inotify wakes up udevd to say that something in the rules directory has
> changed, but it does not seem to return anything useful and keeps
> waking us up. That causes an endless loop of parsing rules files.

Thanks for the tip. The failed inotify read() is caused by the size *roundup*
behavior introduced by the -mm commit 3b46cf7d5f3ca ("Reimplement inotify_user
using fsnotify"), which says:

+               /*
+                * We need to pad the filename so as to properly align an
+                * array of inotify_event structures.  Because the structure is
+                * small and the common case is a small filename, we just round
+                * up to the next multiple of the structure's sizeof.  This is
+                * simple and safe for all architectures.
+                */

The udev madness originates from this size check failing in the kernel:

[  756.569243] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.600103] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.630265] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.670862] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.701845] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.732899] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.763126] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.794829] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.824985] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.856760] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  761.608521] __ratelimit: 210 callbacks suppressed

These messages are printed by the following debugging patch:

--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
 static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
 					    size_t count)
 {
 	size_t event_size = sizeof(struct inotify_event);
 	struct fsnotify_event *event;
 
 	if (fsnotify_notify_queue_is_empty(group))
 		return NULL;
 
 	event = fsnotify_peek_notify_event(group);
 
 	event_size += roundup(event->name_len, event_size);
 
-	if (event_size > count)
+	if (event_size > count) {
+		if (printk_ratelimit())
+			printk("get_one_event: event_size=%d > count=%d, name_len=%d, name=%s\n",
+					(int)event_size, (int)count, (int)event->name_len, event->file_name);
 		return ERR_PTR(-EINVAL);
+	}


This can be fixed by making the FIONREAD ioctl report the rounded-up size
to user space.

Thanks,
Fengguang
---
inotify: report rounded-up event size to user space

Fix a udev madness problem, which falls into an endless loop:

(1) ppoll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=3, events=POLLIN}], 3, NULL, [], 8) = 1 ([{fd=3, revents=POLLIN}])
(2) ioctl(3, FIONREAD, [39])                = 0
(3) read(3, 0x62ad60, 39)                   = -1 EINVAL (Invalid argument)

In (2) we reported a small len, while in (3) we insist on a rounded up len,
leading to a failed inotify read(), which will be retried endlessly by udev.

[  756.569243] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.600103] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.630265] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.670862] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.701845] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.732899] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.763126] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.794829] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.824985] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  756.856760] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
[  761.608521] __ratelimit: 210 callbacks suppressed

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/notify/inotify/inotify_user.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- linux.orig/fs/notify/inotify/inotify_user.c
+++ linux/fs/notify/inotify/inotify_user.c
@@ -318,7 +318,9 @@ static long inotify_ioctl(struct file *f
 		mutex_lock(&group->notification_mutex);
 		list_for_each_entry(holder, &group->notification_list, event_list) {
 			event = holder->event;
-			send_len += sizeof(struct inotify_event) + event->name_len;
+			send_len += sizeof(struct inotify_event);
+			send_len += roundup(event->name_len,
+					sizeof(struct inotify_event));
 		}
 		mutex_unlock(&group->notification_mutex);
 		ret = put_user(send_len, (int __user *) p);


* Re: [PATCH] inotify: report rounded-up event size to user space
  2009-05-07 11:59         ` [PATCH] inotify: report rounded-up event size to user space Wu Fengguang
@ 2009-05-07 12:16           ` Eric Paris
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Paris @ 2009-05-07 12:16 UTC
  To: Wu Fengguang
  Cc: Andrew Morton, Kay Sievers, linux-nfs@vger.kernel.org,
	Trond Myklebust, linux-hotplug@vger.kernel.org, Al Viro,
	Christoph Hellwig

On Thu, 2009-05-07 at 19:59 +0800, Wu Fengguang wrote:

> ---
> inotify: report rounded-up event size to user space
> 
> Fix a udev madness problem, which falls into an endless loop:
> 
> (1) ppoll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=3, events=POLLIN}], 3, NULL, [], 8) = 1 ([{fd=3, revents=POLLIN}])
> (2) ioctl(3, FIONREAD, [39])                = 0
> (3) read(3, 0x62ad60, 39)                   = -1 EINVAL (Invalid argument)
> 
> In (2) we reported a small len, while in (3) we insist on a rounded up len,
> leading to a failed inotify read(), which will be retried endlessly by udev.
> 
> [  756.569243] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.600103] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.630265] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.670862] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.701845] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.732899] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.763126] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.794829] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.824985] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  756.856760] get_one_event: event_size=48 > count=38, name_len=22, name=61-dev-root-link.rules
> [  761.608521] __ratelimit: 210 callbacks suppressed
> 
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>

Gaaah should have remembered to do that with the ioctl.
 
Acked-by: Eric Paris <eparis@redhat.com>

Andrew, can you add this to your tree? I will roll it into my next patch
set.

-Eric

> ---
>  fs/notify/inotify/inotify_user.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> --- linux.orig/fs/notify/inotify/inotify_user.c
> +++ linux/fs/notify/inotify/inotify_user.c
> @@ -318,7 +318,9 @@ static long inotify_ioctl(struct file *f
>  		mutex_lock(&group->notification_mutex);
>  		list_for_each_entry(holder, &group->notification_list, event_list) {
>  			event = holder->event;
> -			send_len += sizeof(struct inotify_event) + event->name_len;
> +			send_len += sizeof(struct inotify_event);
> +			send_len += roundup(event->name_len,
> +					sizeof(struct inotify_event));
>  		}
>  		mutex_unlock(&group->notification_mutex);
>  		ret = put_user(send_len, (int __user *) p);



end of thread, other threads:[~2009-05-07 12:16 UTC | newest]

Thread overview: 5+ messages
     [not found] <20090503105456.GA30449@localhost>
2009-05-03 12:40 ` very slow NFS boot on linux-next and -mm Wu Fengguang
2009-05-03 12:56   ` Kay Sievers
     [not found]     ` <20090504133019.GA14300@localhost>
2009-05-06 13:42       ` Kay Sievers
2009-05-07 11:59         ` [PATCH] inotify: report rounded-up event size to user space Wu Fengguang
2009-05-07 12:16           ` Eric Paris
