* Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier)
@ 2015-03-19 1:19 Marek Marczykowski-Górecki
2015-03-19 10:38 ` Ian Campbell
2015-03-19 11:10 ` David Vrabel
0 siblings, 2 replies; 14+ messages in thread
From: Marek Marczykowski-Górecki @ 2015-03-19 1:19 UTC (permalink / raw)
To: xen-devel; +Cc: Boris Ostrovsky, David Vrabel
[-- Attachment #1.1: Type: text/plain, Size: 2358 bytes --]
Hi,
I've hit some deadlock in kernel xenstore client exposed via
/proc/xen/xenbus. Steps to reproduce are simple:
int main() {
struct xs_handle *xs;
xs = xs_open(0);
xs_watch(xs, "domid", "token");
xs_read(xs, 0, "name", NULL);
return 0;
}
xs_watch internally creates new thread, which uses read to wait for the
watch. And in the same time, the program tries to read some value,
but actually it hangs at sending the command (before even sending a path to be
read). Strace gives this (simplified for readability):
[pid 2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 160 = 16
[pid 2494] write(3, "domid\0", 6) = 6
[pid 2494] write(3, "token\0", 6) = 6
[pid 2495] read(3, <unfinished ...>
[pid 2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
[pid 2495] <... read resumed>
"\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16
[pid 2495] read(3, "domid\0token\0", 12) = 12
[pid 2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16
[pid 2495] read(3, "OK\0", 3) = 3
[pid 2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0,
{FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 2494] <... futex resumed> ) = 0
[pid 2495] <... futex resumed> ) = 1
[pid 2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2494] <... futex resumed> ) = -1 EAGAIN (Resource
temporarily unavailable)
[pid 2495] <... futex resumed> ) = 0
[pid 2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 2495] read(3, <unfinished ...>
[pid 2494] <... futex resumed> ) = 0
[pid 2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER,
0x7fc78c1488f0}, NULL, 8) = 0
[pid 2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER,
0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0
[pid 2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16
And thats all - 2494 is waiting on write, 2495 is waiting on read.
On 3.12.x it is working. On 3.17.0 and 3.18.7 it is broken. I haven't
checked versions in the middle.
Any ideas?
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
[-- Attachment #1.2: Type: application/pgp-signature, Size: 473 bytes --]
[-- Attachment #2: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 1:19 Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) Marek Marczykowski-Górecki @ 2015-03-19 10:38 ` Ian Campbell 2015-03-19 12:10 ` Iurii Konovalenko 2015-03-19 11:10 ` David Vrabel 1 sibling, 1 reply; 14+ messages in thread From: Ian Campbell @ 2015-03-19 10:38 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Iurii Konovalenko, Embedded-pv-devel, Boris Ostrovsky, David Vrabel, xen-devel On Thu, 2015-03-19 at 02:19 +0100, Marek Marczykowski-Górecki wrote: > Hi, > > I've hit some deadlock in kernel xenstore client exposed via > /proc/xen/xenbus. Sounds similar to what Iurii also reported last night in "Userspace PV backend hangs". Iurri's case was all 3.14 kernels, which is in your range too. > Steps to reproduce are simple: > int main() { > struct xs_handle *xs; > xs = xs_open(0); > xs_watch(xs, "domid", "token"); > xs_read(xs, 0, "name", NULL); > return 0; > } > > xs_watch internally creates new thread, which uses read to wait for the > watch. And in the same time, the program tries to read some value, > but actually it hangs at sending the command (before even sending a path to be > read). Strace gives this (simplified for readability): > [pid 2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 160 = 16 > [pid 2494] write(3, "domid\0", 6) = 6 > [pid 2494] write(3, "token\0", 6) = 6 > [pid 2495] read(3, <unfinished ...> > [pid 2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...> > [pid 2495] <... read resumed> > "\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16 > [pid 2495] read(3, "domid\0token\0", 12) = 12 > [pid 2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16 > [pid 2495] read(3, "OK\0", 3) = 3 > [pid 2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0, > {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...> > [pid 2494] <... futex resumed> ) = 0 > [pid 2495] <... futex resumed> ) = 1 > [pid 2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> > [pid 2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> > [pid 2494] <... futex resumed> ) = -1 EAGAIN (Resource > temporarily unavailable) > [pid 2495] <... futex resumed> ) = 0 > [pid 2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> > [pid 2495] read(3, <unfinished ...> > [pid 2494] <... futex resumed> ) = 0 > [pid 2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, > 0x7fc78c1488f0}, NULL, 8) = 0 > [pid 2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, > 0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0 > [pid 2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16 > > And thats all - 2494 is waiting on write, 2495 is waiting on read. > > On 3.12.x it is working. On 3.17.0 and 3.18.7 it is broken. I haven't > checked versions in the middle. > > Any ideas? > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 10:38 ` Ian Campbell @ 2015-03-19 12:10 ` Iurii Konovalenko 2015-03-19 12:32 ` [Embedded-pv-devel] " Vitaly Chernooky 2015-03-19 13:00 ` David Vrabel 0 siblings, 2 replies; 14+ messages in thread From: Iurii Konovalenko @ 2015-03-19 12:10 UTC (permalink / raw) To: Ian Campbell Cc: Embedded-pv-devel, Boris Ostrovsky, David Vrabel, Marek Marczykowski-Górecki, xen-devel [-- Attachment #1: Type: text/plain, Size: 3911 bytes --] Hi, guys! When I read, that I am not alone and that issue depends on kernel version, I decided to continue investigation. And I found why our threads locks on read/write operations. On Linux kernel 3.14+ syscalls of file read and write changed a bit: fdget() function was replaced by fdget_pos() - it is fdget() function plus additional position mutex lock for files with FMODE_ATOMIC_POS (files for inodes with S_IFREG flag set - regular nodes). As I thought our xen files are not regular and nonseekable, I hoped this flag is not set. But it is set. It is because our file system is created by function simple_fill_super(), and inside it this flag is hardly set: inode->i_mode = S_IFREG | files->mode; So, as a fast hack I made a patch: just made copy of this function for xen, which does not set this flag. It works for me. Could you please check if it works for you. Best regards. Iurii Konovalenko | Senior Software Engineer GlobalLogic P +3.8044.492.9695 M +38.099.932.2909 S yufuntik www.globallogic.com http://www.globallogic.com/email_disclaimer.txt On Thu, Mar 19, 2015 at 12:38 PM, Ian Campbell <ian.campbell@citrix.com> wrote: > On Thu, 2015-03-19 at 02:19 +0100, Marek Marczykowski-Górecki wrote: >> Hi, >> >> I've hit some deadlock in kernel xenstore client exposed via >> /proc/xen/xenbus. > > Sounds similar to what Iurii also reported last night in "Userspace PV > backend hangs". > > Iurri's case was all 3.14 kernels, which is in your range too. > >> Steps to reproduce are simple: >> int main() { >> struct xs_handle *xs; >> xs = xs_open(0); >> xs_watch(xs, "domid", "token"); >> xs_read(xs, 0, "name", NULL); >> return 0; >> } >> >> xs_watch internally creates new thread, which uses read to wait for the >> watch. And in the same time, the program tries to read some value, >> but actually it hangs at sending the command (before even sending a path to be >> read). Strace gives this (simplified for readability): >> [pid 2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 160 = 16 >> [pid 2494] write(3, "domid\0", 6) = 6 >> [pid 2494] write(3, "token\0", 6) = 6 >> [pid 2495] read(3, <unfinished ...> >> [pid 2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...> >> [pid 2495] <... read resumed> >> "\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16 >> [pid 2495] read(3, "domid\0token\0", 12) = 12 >> [pid 2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16 >> [pid 2495] read(3, "OK\0", 3) = 3 >> [pid 2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0, >> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...> >> [pid 2494] <... futex resumed> ) = 0 >> [pid 2495] <... futex resumed> ) = 1 >> [pid 2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> >> [pid 2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> >> [pid 2494] <... futex resumed> ) = -1 EAGAIN (Resource >> temporarily unavailable) >> [pid 2495] <... futex resumed> ) = 0 >> [pid 2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> >> [pid 2495] read(3, <unfinished ...> >> [pid 2494] <... futex resumed> ) = 0 >> [pid 2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, >> 0x7fc78c1488f0}, NULL, 8) = 0 >> [pid 2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, >> 0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0 >> [pid 2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16 >> >> And thats all - 2494 is waiting on write, 2495 is waiting on read. >> >> On 3.12.x it is working. On 3.17.0 and 3.18.7 it is broken. I haven't >> checked versions in the middle. >> >> Any ideas? >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > > [-- Attachment #2: 0001-arm-xen-hack-to-make-xen-files-unregular.patch --] [-- Type: text/x-patch, Size: 2954 bytes --] From ea5642c0ea537ef3c9da946d4f26a8f597b68622 Mon Sep 17 00:00:00 2001 From: Iurii Konovalenko <iurii.konovalenko@globallogic.com> Date: Thu, 19 Mar 2015 13:55:44 +0200 Subject: [PATCH] arm: xen: hack to make xen files unregular Change-Id: I0bc32867ca12dad78aa5f532a9c606fab9a3d1db Signed-off-by: Iurii Konovalenko <iurii.konovalenko@globallogic.com> --- drivers/xen/xenfs/super.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 70 insertions(+), 1 deletion(-) diff --git a/drivers/xen/xenfs/super.c b/drivers/xen/xenfs/super.c index 06092e0..1f7e74c 100644 --- a/drivers/xen/xenfs/super.c +++ b/drivers/xen/xenfs/super.c @@ -14,6 +14,7 @@ #include <linux/module.h> #include <linux/fs.h> #include <linux/magic.h> +#include <linux/pagemap.h> #include <xen/xen.h> @@ -42,6 +43,74 @@ static const struct file_operations capabilities_file_ops = { .llseek = default_llseek, }; +static const struct super_operations xen_simple_super_operations = { + .statfs = simple_statfs, +}; + +static int xen_simple_fill_super(struct super_block *s, unsigned long magic, + struct tree_descr *files) +{ +struct inode *inode; +struct dentry *root; +struct dentry *dentry; +int i; + +s->s_blocksize = PAGE_CACHE_SIZE; +s->s_blocksize_bits = PAGE_CACHE_SHIFT; +s->s_magic = magic; +s->s_op = &xen_simple_super_operations; +s->s_time_gran = 1; + +inode = new_inode(s); +if (!inode) + return -ENOMEM; +/* +* because the root inode is 1, the files array must not contain an +* entry at index 1 +*/ +inode->i_ino = 1; +inode->i_mode = S_IFDIR | 0755; +inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; +inode->i_op = &simple_dir_inode_operations; +inode->i_fop = &simple_dir_operations; +set_nlink(inode, 2); +root = d_make_root(inode); +if (!root) + return -ENOMEM; +for (i = 0; !files->name || files->name[0]; i++, files++) { + if (!files->name) + continue; + + /* warn if it tries to conflict with the root inode */ + if (unlikely(i == 1)) + printk(KERN_WARNING "%s: %s passed in a files array" + "with an index of 1!\n", __func__, + s->s_type->name); + + dentry = d_alloc_name(root, files->name); + if (!dentry) + goto out; + inode = new_inode(s); + if (!inode) { + dput(dentry); + goto out; + } + inode->i_mode = files->mode; + inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + inode->i_fop = files->ops; + inode->i_ino = i; + d_add(dentry, inode); +} +s->s_root = root; +return 0; +out: +d_genocide(root); +shrink_dcache_parent(root); +dput(root); +return -ENOMEM; +} + + static int xenfs_fill_super(struct super_block *sb, void *data, int silent) { static struct tree_descr xenfs_files[] = { @@ -60,7 +129,7 @@ static int xenfs_fill_super(struct super_block *sb, void *data, int silent) {""}, }; - return simple_fill_super(sb, XENFS_SUPER_MAGIC, + return xen_simple_fill_super(sb, XENFS_SUPER_MAGIC, xen_initial_domain() ? xenfs_init_files : xenfs_files); } -- 1.9.1 [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Embedded-pv-devel] Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 12:10 ` Iurii Konovalenko @ 2015-03-19 12:32 ` Vitaly Chernooky 2015-03-19 13:00 ` David Vrabel 1 sibling, 0 replies; 14+ messages in thread From: Vitaly Chernooky @ 2015-03-19 12:32 UTC (permalink / raw) To: Iurii Konovalenko Cc: Ian Campbell, Marek Marczykowski-Górecki, xen-devel, Embedded-pv-devel, David Vrabel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 4748 bytes --] Hi All! This issue is discussed in details here: https://lkml.org/lkml/2014/2/17/324 With best regards, On Thu, Mar 19, 2015 at 2:10 PM, Iurii Konovalenko < iurii.konovalenko@globallogic.com> wrote: > Hi, guys! > > When I read, that I am not alone and that issue depends on kernel > version, I decided to continue investigation. > And I found why our threads locks on read/write operations. > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > fdget() function was replaced by fdget_pos() - it is fdget() function > plus additional position mutex lock for files with FMODE_ATOMIC_POS > (files for inodes with S_IFREG flag set - regular nodes). As I thought > our xen files are not regular and nonseekable, I hoped this flag is > not set. But it is set. It is because our file system is created by > function simple_fill_super(), and inside it this flag is hardly set: > inode->i_mode = S_IFREG | files->mode; > So, as a fast hack I made a patch: just made copy of this function for > xen, which does not set this flag. It works for me. Could you please > check if it works for you. > Best regards. > > Iurii Konovalenko | Senior Software Engineer > GlobalLogic > P +3.8044.492.9695 M +38.099.932.2909 > S yufuntik > www.globallogic.com > http://www.globallogic.com/email_disclaimer.txt > > > On Thu, Mar 19, 2015 at 12:38 PM, Ian Campbell <ian.campbell@citrix.com> > wrote: > > On Thu, 2015-03-19 at 02:19 +0100, Marek Marczykowski-Górecki wrote: > >> Hi, > >> > >> I've hit some deadlock in kernel xenstore client exposed via > >> /proc/xen/xenbus. > > > > Sounds similar to what Iurii also reported last night in "Userspace PV > > backend hangs". > > > > Iurri's case was all 3.14 kernels, which is in your range too. > > > >> Steps to reproduce are simple: > >> int main() { > >> struct xs_handle *xs; > >> xs = xs_open(0); > >> xs_watch(xs, "domid", "token"); > >> xs_read(xs, 0, "name", NULL); > >> return 0; > >> } > >> > >> xs_watch internally creates new thread, which uses read to wait for the > >> watch. And in the same time, the program tries to read some value, > >> but actually it hangs at sending the command (before even sending a > path to be > >> read). Strace gives this (simplified for readability): > >> [pid 2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 160 = 16 > >> [pid 2494] write(3, "domid\0", 6) = 6 > >> [pid 2494] write(3, "token\0", 6) = 6 > >> [pid 2495] read(3, <unfinished ...> > >> [pid 2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...> > >> [pid 2495] <... read resumed> > >> "\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16 > >> [pid 2495] read(3, "domid\0token\0", 12) = 12 > >> [pid 2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16 > >> [pid 2495] read(3, "OK\0", 3) = 3 > >> [pid 2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0, > >> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...> > >> [pid 2494] <... futex resumed> ) = 0 > >> [pid 2495] <... futex resumed> ) = 1 > >> [pid 2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...> > >> [pid 2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> > >> [pid 2494] <... futex resumed> ) = -1 EAGAIN (Resource > >> temporarily unavailable) > >> [pid 2495] <... futex resumed> ) = 0 > >> [pid 2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> > >> [pid 2495] read(3, <unfinished ...> > >> [pid 2494] <... futex resumed> ) = 0 > >> [pid 2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, > >> 0x7fc78c1488f0}, NULL, 8) = 0 > >> [pid 2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, > >> 0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0 > >> [pid 2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16 > >> > >> And thats all - 2494 is waiting on write, 2495 is waiting on read. > >> > >> On 3.12.x it is working. On 3.17.0 and 3.18.7 it is broken. I haven't > >> checked versions in the middle. > >> > >> Any ideas? > >> > >> _______________________________________________ > >> Xen-devel mailing list > >> Xen-devel@lists.xen.org > >> http://lists.xen.org/xen-devel > > > > > > _______________________________________________ > Embedded-pv-devel mailing list > Embedded-pv-devel@lists.xenproject.org > http://lists.xenproject.org/cgi-bin/mailman/listinfo/embedded-pv-devel > -- *Vitaly Chernooky | Senior Developer - Product Engineering and Development* GlobalLogic P *+380.44.4929695 ext.1136* M *+380.63.6011802* S cvv_2k www.globallogic.com http://www.globallogic.com/email_disclaimer.txt [-- Attachment #1.2: Type: text/html, Size: 7126 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 12:10 ` Iurii Konovalenko 2015-03-19 12:32 ` [Embedded-pv-devel] " Vitaly Chernooky @ 2015-03-19 13:00 ` David Vrabel 2015-03-19 13:10 ` Vitaly Chernooky 1 sibling, 1 reply; 14+ messages in thread From: David Vrabel @ 2015-03-19 13:00 UTC (permalink / raw) To: Iurii Konovalenko, Ian Campbell Cc: Embedded-pv-devel, Boris Ostrovsky, xen-devel, David Vrabel, Marek Marczykowski-Górecki On 19/03/15 12:10, Iurii Konovalenko wrote: > Hi, guys! > > When I read, that I am not alone and that issue depends on kernel > version, I decided to continue investigation. > And I found why our threads locks on read/write operations. > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > fdget() function was replaced by fdget_pos() - it is fdget() function > plus additional position mutex lock for files with FMODE_ATOMIC_POS > (files for inodes with S_IFREG flag set - regular nodes). As I thought > our xen files are not regular and nonseekable, I hoped this flag is > not set. But it is set. It is because our file system is created by > function simple_fill_super(), and inside it this flag is hardly set: > inode->i_mode = S_IFREG | files->mode; > So, as a fast hack I made a patch: just made copy of this function for > xen, which does not set this flag. It works for me. Could you please > check if it works for you. I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS in xenbus_file_open() ? David ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 13:00 ` David Vrabel @ 2015-03-19 13:10 ` Vitaly Chernooky 2015-03-20 4:04 ` Marek Marczykowski-Górecki 0 siblings, 1 reply; 14+ messages in thread From: Vitaly Chernooky @ 2015-03-19 13:10 UTC (permalink / raw) To: David Vrabel Cc: Iurii Konovalenko, Ian Campbell, Marek Marczykowski-Górecki, xen-devel, Embedded-pv-devel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 1750 bytes --] David, On Thu, Mar 19, 2015 at 3:00 PM, David Vrabel <david.vrabel@citrix.com> wrote: > On 19/03/15 12:10, Iurii Konovalenko wrote: > > Hi, guys! > > > > When I read, that I am not alone and that issue depends on kernel > > version, I decided to continue investigation. > > And I found why our threads locks on read/write operations. > > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > > fdget() function was replaced by fdget_pos() - it is fdget() function > > plus additional position mutex lock for files with FMODE_ATOMIC_POS > > (files for inodes with S_IFREG flag set - regular nodes). As I thought > > our xen files are not regular and nonseekable, I hoped this flag is > > not set. But it is set. It is because our file system is created by > > function simple_fill_super(), and inside it this flag is hardly set: > > inode->i_mode = S_IFREG | files->mode; > > So, as a fast hack I made a patch: just made copy of this function for > > xen, which does not set this flag. It works for me. Could you please > > check if it works for you. > > I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS > in xenbus_file_open() ? > Because it is not the root of issue. FMODE_ATOMIC_POS is just one of results of bug. Iurii has fixed the root of issue but in suboptimal way. So we just need to have found optimal way. With best regards, > David > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > -- *Vitaly Chernooky | Senior Developer - Product Engineering and Development* GlobalLogic P *+380.44.4929695 ext.1136* M *+380.63.6011802* S cvv_2k www.globallogic.com http://www.globallogic.com/email_disclaimer.txt [-- Attachment #1.2: Type: text/html, Size: 3180 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 13:10 ` Vitaly Chernooky @ 2015-03-20 4:04 ` Marek Marczykowski-Górecki 2015-03-20 9:58 ` Vitaly Chernooky 2015-03-20 10:08 ` Vitaly Chernooky 0 siblings, 2 replies; 14+ messages in thread From: Marek Marczykowski-Górecki @ 2015-03-20 4:04 UTC (permalink / raw) To: Vitaly Chernooky Cc: Iurii Konovalenko, Ian Campbell, xen-devel, Embedded-pv-devel, David Vrabel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 2149 bytes --] On Thu, Mar 19, 2015 at 03:10:49PM +0200, Vitaly Chernooky wrote: > David, > > On Thu, Mar 19, 2015 at 3:00 PM, David Vrabel <david.vrabel@citrix.com> > wrote: > > > On 19/03/15 12:10, Iurii Konovalenko wrote: > > > Hi, guys! > > > > > > When I read, that I am not alone and that issue depends on kernel > > > version, I decided to continue investigation. > > > And I found why our threads locks on read/write operations. > > > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > > > fdget() function was replaced by fdget_pos() - it is fdget() function > > > plus additional position mutex lock for files with FMODE_ATOMIC_POS > > > (files for inodes with S_IFREG flag set - regular nodes). As I thought > > > our xen files are not regular and nonseekable, I hoped this flag is > > > not set. But it is set. It is because our file system is created by > > > function simple_fill_super(), and inside it this flag is hardly set: > > > inode->i_mode = S_IFREG | files->mode; > > > So, as a fast hack I made a patch: just made copy of this function for > > > xen, which does not set this flag. It works for me. Could you please > > > check if it works for you. > > > > I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS > > in xenbus_file_open() ? > > > > Because it is not the root of issue. FMODE_ATOMIC_POS is just one of > results of bug. Iurii has fixed the root of issue but in suboptimal way. So > we just need to have found optimal way. I can just confirm that: 1. (unsurprisingly) the bug is still present in 4.0-rc4 2. both proposed fixes are effective I'm not sure if removing S_IFREG completely is a good idea, I guess there will be much more side effects... What about another idea: xenbus_file_open uses nonseekable_open - this looks like a good place to clear FMODE_ATOMIC_POS if present? It doesn't make sense to get a lock for position on nonseekable file, right? -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #1.2: Type: application/pgp-signature, Size: 473 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-20 4:04 ` Marek Marczykowski-Górecki @ 2015-03-20 9:58 ` Vitaly Chernooky 2015-03-20 10:08 ` Vitaly Chernooky 1 sibling, 0 replies; 14+ messages in thread From: Vitaly Chernooky @ 2015-03-20 9:58 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Iurii Konovalenko, Ian Campbell, xen-devel, Embedded-pv-devel, David Vrabel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 2844 bytes --] On Fri, Mar 20, 2015 at 6:04 AM, Marek Marczykowski-Górecki < marmarek@invisiblethingslab.com> wrote: > On Thu, Mar 19, 2015 at 03:10:49PM +0200, Vitaly Chernooky wrote: > > David, > > > > On Thu, Mar 19, 2015 at 3:00 PM, David Vrabel <david.vrabel@citrix.com> > > wrote: > > > > > On 19/03/15 12:10, Iurii Konovalenko wrote: > > > > Hi, guys! > > > > > > > > When I read, that I am not alone and that issue depends on kernel > > > > version, I decided to continue investigation. > > > > And I found why our threads locks on read/write operations. > > > > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > > > > fdget() function was replaced by fdget_pos() - it is fdget() function > > > > plus additional position mutex lock for files with FMODE_ATOMIC_POS > > > > (files for inodes with S_IFREG flag set - regular nodes). As I > thought > > > > our xen files are not regular and nonseekable, I hoped this flag is > > > > not set. But it is set. It is because our file system is created by > > > > function simple_fill_super(), and inside it this flag is hardly set: > > > > inode->i_mode = S_IFREG | files->mode; > > > > So, as a fast hack I made a patch: just made copy of this function > for > > > > xen, which does not set this flag. It works for me. Could you please > > > > check if it works for you. > > > > > > I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS > > > in xenbus_file_open() ? > > > > > > > Because it is not the root of issue. FMODE_ATOMIC_POS is just one of > > results of bug. Iurii has fixed the root of issue but in suboptimal way. > So > > we just need to have found optimal way. > > I can just confirm that: > 1. (unsurprisingly) the bug is still present in 4.0-rc4 > 2. both proposed fixes are effective > > I'm not sure if removing S_IFREG completely is a good idea, I guess > there will be much more side effects... > Hm... Usually nobody expect regular files to be nonseekable, but ... Yes, I am going to recheck it twice in the nearest future. > What about another idea: xenbus_file_open uses nonseekable_open - this > looks like a good place to clear FMODE_ATOMIC_POS if present? It is wrong place to clear FMODE_ATOMIC_POS but something in your idea is right. It > doesn't make sense to get a lock for position on nonseekable file, > right? > Usually :) > > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? > -- *Vitaly Chernooky | Senior Developer - Product Engineering and Development* GlobalLogic P *+380.44.4929695 ext.1136* M *+380.63.6011802* S cvv_2k www.globallogic.com http://www.globallogic.com/email_disclaimer.txt [-- Attachment #1.2: Type: text/html, Size: 4751 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-20 4:04 ` Marek Marczykowski-Górecki 2015-03-20 9:58 ` Vitaly Chernooky @ 2015-03-20 10:08 ` Vitaly Chernooky 2015-03-20 10:38 ` Vitaly Chernooky 2015-03-20 10:41 ` Marek Marczykowski-Górecki 1 sibling, 2 replies; 14+ messages in thread From: Vitaly Chernooky @ 2015-03-20 10:08 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Iurii Konovalenko, Ian Campbell, xen-devel, Embedded-pv-devel, David Vrabel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 2800 bytes --] On Fri, Mar 20, 2015 at 6:04 AM, Marek Marczykowski-Górecki < marmarek@invisiblethingslab.com> wrote: > On Thu, Mar 19, 2015 at 03:10:49PM +0200, Vitaly Chernooky wrote: > > David, > > > > On Thu, Mar 19, 2015 at 3:00 PM, David Vrabel <david.vrabel@citrix.com> > > wrote: > > > > > On 19/03/15 12:10, Iurii Konovalenko wrote: > > > > Hi, guys! > > > > > > > > When I read, that I am not alone and that issue depends on kernel > > > > version, I decided to continue investigation. > > > > And I found why our threads locks on read/write operations. > > > > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > > > > fdget() function was replaced by fdget_pos() - it is fdget() function > > > > plus additional position mutex lock for files with FMODE_ATOMIC_POS > > > > (files for inodes with S_IFREG flag set - regular nodes). As I > thought > > > > our xen files are not regular and nonseekable, I hoped this flag is > > > > not set. But it is set. It is because our file system is created by > > > > function simple_fill_super(), and inside it this flag is hardly set: > > > > inode->i_mode = S_IFREG | files->mode; > > > > So, as a fast hack I made a patch: just made copy of this function > for > > > > xen, which does not set this flag. It works for me. Could you please > > > > check if it works for you. > > > > > > I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS > > > in xenbus_file_open() ? > > > > > > > Because it is not the root of issue. FMODE_ATOMIC_POS is just one of > > results of bug. Iurii has fixed the root of issue but in suboptimal way. > So > > we just need to have found optimal way. > > I can just confirm that: > 1. (unsurprisingly) the bug is still present in 4.0-rc4 > 2. both proposed fixes are effective > > I'm not sure if removing S_IFREG completely is a good idea, I guess > there will be much more side effects... > What about another idea: xenbus_file_open uses nonseekable_open - this > looks like a good place to clear FMODE_ATOMIC_POS if present? It > doesn't make sense to get a lock for position on nonseekable file, > right? > The Open Group Base Specifications Issue 7 IEEE Std 1003.1, 2013 Edition requires from regular files to be seekable. But Linux kernel looks like Linus has own opinion on it :((( With best regards, > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? > -- *Vitaly Chernooky | Senior Developer - Product Engineering and Development* GlobalLogic P *+380.44.4929695 ext.1136* M *+380.63.6011802* S cvv_2k www.globallogic.com http://www.globallogic.com/email_disclaimer.txt [-- Attachment #1.2: Type: text/html, Size: 4341 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-20 10:08 ` Vitaly Chernooky @ 2015-03-20 10:38 ` Vitaly Chernooky 2015-03-20 10:46 ` David Vrabel 2015-03-20 10:41 ` Marek Marczykowski-Górecki 1 sibling, 1 reply; 14+ messages in thread From: Vitaly Chernooky @ 2015-03-20 10:38 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Iurii Konovalenko, Ian Campbell, xen-devel, Embedded-pv-devel, David Vrabel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 3396 bytes --] So I have finished my investigation and suggest to discuss the simple attaches patch. With best regards, On Fri, Mar 20, 2015 at 12:08 PM, Vitaly Chernooky < vitalii.chernookyi@globallogic.com> wrote: > > > On Fri, Mar 20, 2015 at 6:04 AM, Marek Marczykowski-Górecki < > marmarek@invisiblethingslab.com> wrote: > >> On Thu, Mar 19, 2015 at 03:10:49PM +0200, Vitaly Chernooky wrote: >> > David, >> > >> > On Thu, Mar 19, 2015 at 3:00 PM, David Vrabel <david.vrabel@citrix.com> >> > wrote: >> > >> > > On 19/03/15 12:10, Iurii Konovalenko wrote: >> > > > Hi, guys! >> > > > >> > > > When I read, that I am not alone and that issue depends on kernel >> > > > version, I decided to continue investigation. >> > > > And I found why our threads locks on read/write operations. >> > > > On Linux kernel 3.14+ syscalls of file read and write changed a bit: >> > > > fdget() function was replaced by fdget_pos() - it is fdget() >> function >> > > > plus additional position mutex lock for files with FMODE_ATOMIC_POS >> > > > (files for inodes with S_IFREG flag set - regular nodes). As I >> thought >> > > > our xen files are not regular and nonseekable, I hoped this flag is >> > > > not set. But it is set. It is because our file system is created by >> > > > function simple_fill_super(), and inside it this flag is hardly set: >> > > > inode->i_mode = S_IFREG | files->mode; >> > > > So, as a fast hack I made a patch: just made copy of this function >> for >> > > > xen, which does not set this flag. It works for me. Could you please >> > > > check if it works for you. >> > > >> > > I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS >> > > in xenbus_file_open() ? >> > > >> > >> > Because it is not the root of issue. FMODE_ATOMIC_POS is just one of >> > results of bug. Iurii has fixed the root of issue but in suboptimal >> way. So >> > we just need to have found optimal way. >> >> I can just confirm that: >> 1. (unsurprisingly) the bug is still present in 4.0-rc4 >> 2. both proposed fixes are effective >> >> I'm not sure if removing S_IFREG completely is a good idea, I guess >> there will be much more side effects... >> What about another idea: xenbus_file_open uses nonseekable_open - this >> looks like a good place to clear FMODE_ATOMIC_POS if present? It >> doesn't make sense to get a lock for position on nonseekable file, >> right? >> > > The Open Group Base Specifications Issue 7 IEEE Std 1003.1, 2013 Edition > requires from regular files to be seekable. But Linux kernel looks like > Linus has own opinion on it :((( > > With best regards, > > >> -- >> Best Regards, >> Marek Marczykowski-Górecki >> Invisible Things Lab >> A: Because it messes up the order in which people normally read text. >> Q: Why is top-posting such a bad thing? >> > > > > -- > *Vitaly Chernooky | Senior Developer - Product Engineering and Development* > GlobalLogic > P *+380.44.4929695 ext.1136 <%2B380.44.4929695%20ext.1136>* M *+380.63.6011802 > <%2B380.63.6011802>* S cvv_2k > www.globallogic.com > > http://www.globallogic.com/email_disclaimer.txt > -- *Vitaly Chernooky | Senior Developer - Product Engineering and Development* GlobalLogic P *+380.44.4929695 ext.1136* M *+380.63.6011802* S cvv_2k www.globallogic.com http://www.globallogic.com/email_disclaimer.txt [-- Attachment #1.2: Type: text/html, Size: 5783 bytes --] [-- Attachment #2: 0001-Fix-deadlock-on-regular-nonseekable-files.patch --] [-- Type: text/x-patch, Size: 916 bytes --] From f5f5922f389c0b6a2f0f912adf0c091ee97e0076 Mon Sep 17 00:00:00 2001 From: Vitaly Chernooky <vitaly.chernooky@globallogic.com> Date: Fri, 20 Mar 2015 12:26:37 +0200 Subject: [PATCH] Fix deadlock on regular nonseekable files It is actual for proc-like pseudo filesystems which mark their files as regular but nonseekable. Change-Id: I92f4da22a5835cc6b6396988c36b5906561ac741 Signed-off-by: Vitaly Chernooky <vitaly.chernooky@globallogic.com> --- fs/open.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/open.c b/fs/open.c index 2ed7325..b792784 100644 --- a/fs/open.c +++ b/fs/open.c @@ -1100,7 +1100,7 @@ EXPORT_SYMBOL(generic_file_open); */ int nonseekable_open(struct inode *inode, struct file *filp) { - filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE); + filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS); return 0; } -- 1.7.9.5 [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-20 10:38 ` Vitaly Chernooky @ 2015-03-20 10:46 ` David Vrabel 2015-03-20 11:39 ` Vitaly Chernooky 0 siblings, 1 reply; 14+ messages in thread From: David Vrabel @ 2015-03-20 10:46 UTC (permalink / raw) To: Vitaly Chernooky, Marek Marczykowski-Górecki Cc: Boris Ostrovsky, Embedded-pv-devel, Iurii Konovalenko, Ian Campbell, xen-devel On 20/03/15 10:38, Vitaly Chernooky wrote: > So I have finished my investigation and suggest to discuss the simple > attaches patch. Looks ok to me. But this needs to go via the filesystem maintainers, Cc'ing Linus on it. You should explain the deadlock in more detail in the commit message and mention that it fixes a regression and which commit introduced the regression. Another solution would be to replace the file with a symlink to /dev/xen/xenbus. David ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-20 10:46 ` David Vrabel @ 2015-03-20 11:39 ` Vitaly Chernooky 0 siblings, 0 replies; 14+ messages in thread From: Vitaly Chernooky @ 2015-03-20 11:39 UTC (permalink / raw) To: David Vrabel Cc: Iurii Konovalenko, Ian Campbell, Marek Marczykowski-Górecki, xen-devel, Embedded-pv-devel, Boris Ostrovsky David, On Fri, Mar 20, 2015 at 12:46 PM, David Vrabel <david.vrabel@citrix.com> wrote: > On 20/03/15 10:38, Vitaly Chernooky wrote: >> So I have finished my investigation and suggest to discuss the simple >> attaches patch. > > Looks ok to me. But this needs to go via the filesystem maintainers, > Cc'ing Linus on it. Just add Linus to cc? May be to continue mailing thread mentioned above will be better? WIth best regards, > You should explain the deadlock in more detail in > the commit message and mention that it fixes a regression and which > commit introduced the regression. Ok, > > Another solution would be to replace the file with a symlink to > /dev/xen/xenbus. > > David -- Vitaly Chernooky | Senior Developer - Product Engineering and Development GlobalLogic P +380.44.4929695 ext.1136 M +380.63.6011802 S cvv_2k www.globallogic.com http://www.globallogic.com/email_disclaimer.txt ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-20 10:08 ` Vitaly Chernooky 2015-03-20 10:38 ` Vitaly Chernooky @ 2015-03-20 10:41 ` Marek Marczykowski-Górecki 1 sibling, 0 replies; 14+ messages in thread From: Marek Marczykowski-Górecki @ 2015-03-20 10:41 UTC (permalink / raw) To: Vitaly Chernooky Cc: Iurii Konovalenko, Ian Campbell, xen-devel, Embedded-pv-devel, David Vrabel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 3031 bytes --] On Fri, Mar 20, 2015 at 12:08:48PM +0200, Vitaly Chernooky wrote: > On Fri, Mar 20, 2015 at 6:04 AM, Marek Marczykowski-Górecki < > marmarek@invisiblethingslab.com> wrote: > > > On Thu, Mar 19, 2015 at 03:10:49PM +0200, Vitaly Chernooky wrote: > > > David, > > > > > > On Thu, Mar 19, 2015 at 3:00 PM, David Vrabel <david.vrabel@citrix.com> > > > wrote: > > > > > > > On 19/03/15 12:10, Iurii Konovalenko wrote: > > > > > Hi, guys! > > > > > > > > > > When I read, that I am not alone and that issue depends on kernel > > > > > version, I decided to continue investigation. > > > > > And I found why our threads locks on read/write operations. > > > > > On Linux kernel 3.14+ syscalls of file read and write changed a bit: > > > > > fdget() function was replaced by fdget_pos() - it is fdget() function > > > > > plus additional position mutex lock for files with FMODE_ATOMIC_POS > > > > > (files for inodes with S_IFREG flag set - regular nodes). As I > > thought > > > > > our xen files are not regular and nonseekable, I hoped this flag is > > > > > not set. But it is set. It is because our file system is created by > > > > > function simple_fill_super(), and inside it this flag is hardly set: > > > > > inode->i_mode = S_IFREG | files->mode; > > > > > So, as a fast hack I made a patch: just made copy of this function > > for > > > > > xen, which does not set this flag. It works for me. Could you please > > > > > check if it works for you. > > > > > > > > I still can't get this to deadlock, but why not clear FMODE_ATOMIC_POS > > > > in xenbus_file_open() ? > > > > > > > > > > Because it is not the root of issue. FMODE_ATOMIC_POS is just one of > > > results of bug. Iurii has fixed the root of issue but in suboptimal way. > > So > > > we just need to have found optimal way. > > > > I can just confirm that: > > 1. (unsurprisingly) the bug is still present in 4.0-rc4 > > 2. both proposed fixes are effective > > > > I'm not sure if removing S_IFREG completely is a good idea, I guess > > there will be much more side effects... > > What about another idea: xenbus_file_open uses nonseekable_open - this > > looks like a good place to clear FMODE_ATOMIC_POS if present? It > > doesn't make sense to get a lock for position on nonseekable file, > > right? > > > > The Open Group Base Specifications Issue 7 IEEE Std 1003.1, 2013 Edition > requires from regular files to be seekable. But Linux kernel looks like > Linus has own opinion on it :((( Maybe the better idea would be to change filetype of xenbus (and others) to S_IFIFO or something like this (but keep the file type present, instead of removing it completely). Regarding the implementation, maybe simple_fill_super can be modified to not add S_IFREG if other file type is already present in files->mode? -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? [-- Attachment #1.2: Type: application/pgp-signature, Size: 473 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) 2015-03-19 1:19 Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) Marek Marczykowski-Górecki 2015-03-19 10:38 ` Ian Campbell @ 2015-03-19 11:10 ` David Vrabel 1 sibling, 0 replies; 14+ messages in thread From: David Vrabel @ 2015-03-19 11:10 UTC (permalink / raw) To: Marek Marczykowski-Górecki, xen-devel; +Cc: Boris Ostrovsky, David Vrabel On 19/03/15 01:19, Marek Marczykowski-Górecki wrote: > Hi, > > I've hit some deadlock in kernel xenstore client exposed via > /proc/xen/xenbus. Steps to reproduce are simple: > int main() { > struct xs_handle *xs; > xs = xs_open(0); > xs_watch(xs, "domid", "token"); > xs_read(xs, 0, "name", NULL); > return 0; > } This test program works for me on 4.0-rc4 in a dom0 and domU. I also don't see any xenbus changes in the kernel that could cause a regression like this. Perhaps you not running a vanilla kernel? David ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-03-20 11:39 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-03-19 1:19 Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier) Marek Marczykowski-Górecki 2015-03-19 10:38 ` Ian Campbell 2015-03-19 12:10 ` Iurii Konovalenko 2015-03-19 12:32 ` [Embedded-pv-devel] " Vitaly Chernooky 2015-03-19 13:00 ` David Vrabel 2015-03-19 13:10 ` Vitaly Chernooky 2015-03-20 4:04 ` Marek Marczykowski-Górecki 2015-03-20 9:58 ` Vitaly Chernooky 2015-03-20 10:08 ` Vitaly Chernooky 2015-03-20 10:38 ` Vitaly Chernooky 2015-03-20 10:46 ` David Vrabel 2015-03-20 11:39 ` Vitaly Chernooky 2015-03-20 10:41 ` Marek Marczykowski-Górecki 2015-03-19 11:10 ` David Vrabel
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.