* mutex
@ 2014-09-09 15:18 Amos Kong
  2014-09-09 15:23 ` RFC virtio-rng: fail to read sysfs of a busy device Amos Kong
  0 siblings, 1 reply; 5+ messages in thread
From: Amos Kong @ 2014-09-09 15:18 UTC
To: virtualization; +Cc: amit.shah, herbert, kvm

Hi Amit, Rusty

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1127062
steps:
- Read random data by 'dd if=/dev/hwrng of=/dev/null' in guest
- check sysfs files at the same time, 'cat /sys/class/misc/hw_random/rng_*'

Result: the cat process gets stuck; it returns only if we kill the dd process.

We have some static variables (e.g. current_rng, data_avail) in hw_random/core.c,
which are protected by rng_mutex. I tried to work around this issue by adding
udelay(100) after mutex_unlock() in rng_dev_read(). This gives
hwrng_attr_*_show() a chance to get the mutex.

This patch also contains some cleanup, moving some code out of mutex
protection.

Do you have any suggestions? Thanks.


diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index aa30a25..fa69020 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -194,6 +194,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 		}
 
 		mutex_unlock(&rng_mutex);
+		udelay(100);
 
 		if (need_resched())
 			schedule_timeout_interruptible(1);
@@ -233,10 +234,10 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
 	int err;
 	struct hwrng *rng;
 
+	err = -ENODEV;
 	err = mutex_lock_interruptible(&rng_mutex);
 	if (err)
 		return -ERESTARTSYS;
-	err = -ENODEV;
 	list_for_each_entry(rng, &rng_list, list) {
 		if (strcmp(rng->name, buf) == 0) {
 			if (rng == current_rng) {
@@ -270,8 +271,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
 		return -ERESTARTSYS;
 	if (current_rng)
 		name = current_rng->name;
-	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
 	mutex_unlock(&rng_mutex);
+	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
 
 	return ret;
 }
@@ -284,19 +285,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
 	ssize_t ret = 0;
 	struct hwrng *rng;
 
+	buf[0] = '\0';
 	err = mutex_lock_interruptible(&rng_mutex);
 	if (err)
 		return -ERESTARTSYS;
 
-	buf[0] = '\0';
 	list_for_each_entry(rng, &rng_list, list) {
 		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
 		ret += strlen(rng->name);
 		strncat(buf, " ", PAGE_SIZE - ret - 1);
 		ret++;
 	}
+	mutex_unlock(&rng_mutex);
 	strncat(buf, "\n", PAGE_SIZE - ret - 1);
 	ret++;
-	mutex_unlock(&rng_mutex);
 	return ret;
 }

* RFC virtio-rng: fail to read sysfs of a busy device
  2014-09-09 15:18 mutex Amos Kong
@ 2014-09-09 15:23 ` Amos Kong
  2014-09-10  5:52   ` Amit Shah
  0 siblings, 1 reply; 5+ messages in thread
From: Amos Kong @ 2014-09-09 15:23 UTC
To: virtualization; +Cc: amit.shah, herbert, kvm

(Resend to fix the subject)

Hi Amit, Rusty

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1127062
steps:
- Read random data by 'dd if=/dev/hwrng of=/dev/null' in guest
- check sysfs files at the same time, 'cat /sys/class/misc/hw_random/rng_*'

Result: the cat process gets stuck; it returns only if we kill the dd process.

We have some static variables (e.g. current_rng, data_avail) in hw_random/core.c,
which are protected by rng_mutex. I tried to work around this issue by adding
udelay(100) after mutex_unlock() in rng_dev_read(). This gives
hwrng_attr_*_show() a chance to get the mutex.

This patch also contains some cleanup, moving some code out of mutex
protection.

Do you have any suggestions? Thanks.


diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index aa30a25..fa69020 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -194,6 +194,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 		}
 
 		mutex_unlock(&rng_mutex);
+		udelay(100);
 
 		if (need_resched())
 			schedule_timeout_interruptible(1);
@@ -233,10 +234,10 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
 	int err;
 	struct hwrng *rng;
 
+	err = -ENODEV;
 	err = mutex_lock_interruptible(&rng_mutex);
 	if (err)
 		return -ERESTARTSYS;
-	err = -ENODEV;
 	list_for_each_entry(rng, &rng_list, list) {
 		if (strcmp(rng->name, buf) == 0) {
 			if (rng == current_rng) {
@@ -270,8 +271,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
 		return -ERESTARTSYS;
 	if (current_rng)
 		name = current_rng->name;
-	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
 	mutex_unlock(&rng_mutex);
+	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
 
 	return ret;
 }
@@ -284,19 +285,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
 	ssize_t ret = 0;
 	struct hwrng *rng;
 
+	buf[0] = '\0';
 	err = mutex_lock_interruptible(&rng_mutex);
 	if (err)
 		return -ERESTARTSYS;
 
-	buf[0] = '\0';
 	list_for_each_entry(rng, &rng_list, list) {
 		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
 		ret += strlen(rng->name);
 		strncat(buf, " ", PAGE_SIZE - ret - 1);
 		ret++;
 	}
+	mutex_unlock(&rng_mutex);
 	strncat(buf, "\n", PAGE_SIZE - ret - 1);
 	ret++;
-	mutex_unlock(&rng_mutex);
 	return ret;
 }

--
Amos.

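For readers who want to try the buffer-building loop from the hwrng_attr_available_show() hunk outside the kernel, here is a rough userspace sketch. It is not the driver code: BUF_SIZE stands in for PAGE_SIZE, build_available() and the array of names are hypothetical, and there is no locking at all. It only shows how the running length `ret` bounds each strncat() call, and it assumes the names fit in the buffer (if they did not, `BUF_SIZE - ret - 1` would wrap around).

```c
#include <assert.h>  /* used only by the self-checks described below */
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64  /* hypothetical stand-in for PAGE_SIZE */

/*
 * Append each name followed by a space, then a trailing newline,
 * tracking the number of characters written in ret.
 * Assumes the combined names fit in BUF_SIZE.
 */
size_t build_available(char *buf, const char *const *names, size_t count)
{
	size_t ret = 0;
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < count; i++) {
		strncat(buf, names[i], BUF_SIZE - ret - 1);
		ret += strlen(names[i]);
		strncat(buf, " ", BUF_SIZE - ret - 1);
		ret++;
	}
	strncat(buf, "\n", BUF_SIZE - ret - 1);
	ret++;

	return ret;
}
```

Called with the two made-up names "rng-a" and "rng-b", this fills the buffer with "rng-a rng-b \n" and returns 13.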
* Re: RFC virtio-rng: fail to read sysfs of a busy device
  2014-09-09 15:23 ` RFC virtio-rng: fail to read sysfs of a busy device Amos Kong
@ 2014-09-10  5:52   ` Amit Shah
  2014-09-10  6:49     ` Amos Kong
  0 siblings, 1 reply; 5+ messages in thread
From: Amit Shah @ 2014-09-10 5:52 UTC
To: Amos Kong; +Cc: herbert, kvm, virtualization

On (Tue) 09 Sep 2014 [23:23:07], Amos Kong wrote:
> (Resend to fix the subject)
>
> Hi Amit, Rusty
>
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1127062
> steps:
> - Read random data by 'dd if=/dev/hwrng of=/dev/null' in guest
> - check sysfs files at the same time, 'cat /sys/class/misc/hw_random/rng_*'
>
> Result: the cat process gets stuck; it returns only if we kill the dd process.

How common is it going to be to have a long-running 'dd' process on
/dev/hwrng?

Also, with the new khwrng thread, reading from /dev/hwrng isn't
required -- just use /dev/random?

(This doesn't mean we shouldn't fix the issue here...)

> We have some static variables (e.g. current_rng, data_avail) in hw_random/core.c,
> which are protected by rng_mutex. I tried to work around this issue by adding
> udelay(100) after mutex_unlock() in rng_dev_read(). This gives
> hwrng_attr_*_show() a chance to get the mutex.
>
> This patch also contains some cleanup, moving some code out of mutex
> protection.
>
> Do you have any suggestions? Thanks.
>
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index aa30a25..fa69020 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -194,6 +194,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  		}
>
>  		mutex_unlock(&rng_mutex);
> +		udelay(100);

We have a need_resched() right below.  Why doesn't that work?

>  		if (need_resched())
>  			schedule_timeout_interruptible(1);
> @@ -233,10 +234,10 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
>  	int err;
>  	struct hwrng *rng;

The following hunk doesn't work:

> +	err = -ENODEV;
>  	err = mutex_lock_interruptible(&rng_mutex);

err is being set to another value in the next line!

>  	if (err)
>  		return -ERESTARTSYS;
> -	err = -ENODEV;

And all usage of err below now won't have -ENODEV but some other value.

>  	list_for_each_entry(rng, &rng_list, list) {
>  		if (strcmp(rng->name, buf) == 0) {
>  			if (rng == current_rng) {
> @@ -270,8 +271,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
>  		return -ERESTARTSYS;
>  	if (current_rng)
>  		name = current_rng->name;
> -	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
>  	mutex_unlock(&rng_mutex);
> +	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);

This looks OK...

>
>  	return ret;
>  }
> @@ -284,19 +285,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
>  	ssize_t ret = 0;
>  	struct hwrng *rng;
>
> +	buf[0] = '\0';
>  	err = mutex_lock_interruptible(&rng_mutex);
>  	if (err)
>  		return -ERESTARTSYS;
>
> -	buf[0] = '\0';
>  	list_for_each_entry(rng, &rng_list, list) {
>  		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
>  		ret += strlen(rng->name);
>  		strncat(buf, " ", PAGE_SIZE - ret - 1);
>  		ret++;
>  	}
> +	mutex_unlock(&rng_mutex);
>  	strncat(buf, "\n", PAGE_SIZE - ret - 1);
>  	ret++;
> -	mutex_unlock(&rng_mutex);

But this isn't resulting in savings; the majority of the time is being
spent in the for loop, and that writes to the buffer.

BTW I don't expect strcat'ing to the buf in each of these scenarios is
a long operation, so this reworking doesn't strike me as something
we should pursue.

		Amit

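The problem Amit points out in the hwrng_attr_current_store() hunk can be reproduced in a few lines of userspace C. The following is only a sketch under stated assumptions: fake_lock() is a hypothetical stand-in for mutex_lock_interruptible() that always succeeds, the `found` flag replaces the walk over rng_list, and ERESTARTSYS is defined locally because it is a kernel-internal value. It is meant to show the effect of moving the `err = -ENODEV` assignment, not the real driver logic.

```c
#include <assert.h>  /* used only by the self-checks described below */
#include <errno.h>

/* Kernel-internal errno value; not exported to userspace, so defined here. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for mutex_lock_interruptible(): 0 means "locked". */
static int fake_lock(void)
{
	return 0;
}

/* Original ordering: -ENODEV is assigned after the lock call. */
int store_original(int found)
{
	int err;

	err = fake_lock();
	if (err)
		return -ERESTARTSYS;
	err = -ENODEV;		/* default result if no RNG name matches */
	if (found)
		err = 0;
	return err;
}

/* Ordering from the RFC patch: -ENODEV is assigned before the lock call. */
int store_patched(int found)
{
	int err;

	err = -ENODEV;
	err = fake_lock();	/* overwrites -ENODEV with 0 */
	if (err)
		return -ERESTARTSYS;
	if (found)
		err = 0;
	return err;		/* 0 even when nothing matched */
}
```

With no match, store_original(0) returns -ENODEV while store_patched(0) returns 0, so the patched ordering would report success for an unknown RNG name.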
* Re: RFC virtio-rng: fail to read sysfs of a busy device
  2014-09-10  5:52   ` Amit Shah
@ 2014-09-10  6:49     ` Amos Kong
  2014-09-10  7:32       ` Amos Kong
  0 siblings, 1 reply; 5+ messages in thread
From: Amos Kong @ 2014-09-10 6:49 UTC
To: Amit Shah; +Cc: herbert, kvm, virtualization

On Wed, Sep 10, 2014 at 11:22:12AM +0530, Amit Shah wrote:
> On (Tue) 09 Sep 2014 [23:23:07], Amos Kong wrote:
> > (Resend to fix the subject)
> >
> > Hi Amit, Rusty
> >
> > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1127062
> > steps:
> > - Read random data by 'dd if=/dev/hwrng of=/dev/null' in guest
> > - check sysfs files at the same time, 'cat /sys/class/misc/hw_random/rng_*'
> >
> > Result: the cat process gets stuck; it returns only if we kill the dd process.
>
> How common is it going to be to have a long-running 'dd' process on
> /dev/hwrng?

Not a common usage, but we have this strict testing.

> Also, with the new khwrng thread, reading from /dev/hwrng isn't
> required -- just use /dev/random?

Yes.

> (This doesn't mean we shouldn't fix the issue here...)

Completely agree :-)

> > We have some static variables (e.g. current_rng, data_avail) in hw_random/core.c,
> > which are protected by rng_mutex. I tried to work around this issue by adding
> > udelay(100) after mutex_unlock() in rng_dev_read(). This gives
> > hwrng_attr_*_show() a chance to get the mutex.
> >
> > This patch also contains some cleanup, moving some code out of mutex
> > protection.
> >
> > Do you have any suggestions? Thanks.
> >
> >
> > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > index aa30a25..fa69020 100644
> > --- a/drivers/char/hw_random/core.c
> > +++ b/drivers/char/hw_random/core.c
> > @@ -194,6 +194,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> >  		}
> >
> >  		mutex_unlock(&rng_mutex);
> > +		udelay(100);
>
> We have a need_resched() right below.  Why doesn't that work?

need_resched() is giving chance for userspace to

> >  		if (need_resched())

It never succeeds in my debugging.

If we remove this check and always call schedule_timeout_interruptible(1),
the problem also disappears.

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index aa30a25..263a370 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 
 		mutex_unlock(&rng_mutex);
 
-		if (need_resched())
-			schedule_timeout_interruptible(1);
+		schedule_timeout_interruptible(1);
 
 		if (signal_pending(current)) {
 			err = -ERESTARTSYS;

> >  			schedule_timeout_interruptible(1);
> > @@ -233,10 +234,10 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
> >  	int err;
> >  	struct hwrng *rng;
>
> The following hunk doesn't work:
>
> > +	err = -ENODEV;
> >  	err = mutex_lock_interruptible(&rng_mutex);
>
> err is being set to another value in the next line!
>
> >  	if (err)
> >  		return -ERESTARTSYS;
> > -	err = -ENODEV;
>
> And all usage of err below now won't have -ENODEV but some other value.

Oops!

> >  	list_for_each_entry(rng, &rng_list, list) {
> >  		if (strcmp(rng->name, buf) == 0) {
> >  			if (rng == current_rng) {
> > @@ -270,8 +271,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
> >  		return -ERESTARTSYS;
> >  	if (current_rng)
> >  		name = current_rng->name;
> > -	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
> >  	mutex_unlock(&rng_mutex);
> > +	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
>
> This looks OK...
>
> >
> >  	return ret;
> >  }
> > @@ -284,19 +285,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
> >  	ssize_t ret = 0;
> >  	struct hwrng *rng;
> >
> > +	buf[0] = '\0';
> >  	err = mutex_lock_interruptible(&rng_mutex);
> >  	if (err)
> >  		return -ERESTARTSYS;
> >
> > -	buf[0] = '\0';
> >  	list_for_each_entry(rng, &rng_list, list) {
> >  		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
> >  		ret += strlen(rng->name);
> >  		strncat(buf, " ", PAGE_SIZE - ret - 1);
> >  		ret++;
> >  	}
> > +	mutex_unlock(&rng_mutex);
> >  	strncat(buf, "\n", PAGE_SIZE - ret - 1);
> >  	ret++;
> > -	mutex_unlock(&rng_mutex);
>
> But this isn't resulting in savings; the majority of the time is being
> spent in the for loop, and that writes to the buffer.

Right

> BTW I don't expect strcat'ing to the buf in each of these scenarios is
> a long operation, so this reworking doesn't strike me as something
> we should pursue.
>
> 		Amit

--
Amos.

* Re: RFC virtio-rng: fail to read sysfs of a busy device
  2014-09-10  6:49     ` Amos Kong
@ 2014-09-10  7:32       ` Amos Kong
  0 siblings, 0 replies; 5+ messages in thread
From: Amos Kong @ 2014-09-10 7:32 UTC
To: Amit Shah; +Cc: herbert, kvm, virtualization

On Wed, Sep 10, 2014 at 02:49:38PM +0800, Amos Kong wrote:
> On Wed, Sep 10, 2014 at 11:22:12AM +0530, Amit Shah wrote:
> > On (Tue) 09 Sep 2014 [23:23:07], Amos Kong wrote:
> > > (Resend to fix the subject)
> > >
> > > Hi Amit, Rusty
> > >
> > > RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1127062
> > > steps:
> > > - Read random data by 'dd if=/dev/hwrng of=/dev/null' in guest
> > > - check sysfs files at the same time, 'cat /sys/class/misc/hw_random/rng_*'
> > >
> > > Result: the cat process gets stuck; it returns only if we kill the dd process.
> >
> > How common is it going to be to have a long-running 'dd' process on
> > /dev/hwrng?
>
> Not a common usage, but we have this strict testing.

For -smp 1:
  It's easy to reproduce with a slow backend (/dev/random). With a quick
  backend (/dev/urandom), cat returns most of the time, after some delay.

But for -smp 2:
  I didn't hit this problem even with the slow backend.

> > Also, with the new khwrng thread, reading from /dev/hwrng isn't
> > required -- just use /dev/random?
>
> Yes.
>
> > (This doesn't mean we shouldn't fix the issue here...)
>
> Completely agree :-)
>
> > > We have some static variables (e.g. current_rng, data_avail) in hw_random/core.c,
> > > which are protected by rng_mutex. I tried to work around this issue by adding
> > > udelay(100) after mutex_unlock() in rng_dev_read(). This gives
> > > hwrng_attr_*_show() a chance to get the mutex.
> > >
> > > This patch also contains some cleanup, moving some code out of mutex
> > > protection.
> > >
> > > Do you have any suggestions? Thanks.
> > >
> > >
> > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > index aa30a25..fa69020 100644
> > > --- a/drivers/char/hw_random/core.c
> > > +++ b/drivers/char/hw_random/core.c
> > > @@ -194,6 +194,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > >  		}
> > >
> > >  		mutex_unlock(&rng_mutex);
> > > +		udelay(100);
> >
> > We have a need_resched() right below.  Why doesn't that work?

[smp 1] Why does need_resched() always return zero? What's its original
purpose?

> > >  		if (need_resched())
>
> It never succeeds in my debugging.
>
> If we remove this check and always call schedule_timeout_interruptible(1),
> the problem also disappears.
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index aa30a25..263a370 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp,
> char __user *buf,
>
>  		mutex_unlock(&rng_mutex);
>
> -		if (need_resched())
> -			schedule_timeout_interruptible(1);
> +		schedule_timeout_interruptible(1);
>
>  		if (signal_pending(current)) {
>  			err = -ERESTARTSYS;
>
> > >  			schedule_timeout_interruptible(1);
> > > @@ -233,10 +234,10 @@ static ssize_t hwrng_attr_current_store(struct device *dev,
> > >  	int err;
> > >  	struct hwrng *rng;
> >
> > The following hunk doesn't work:
> >
> > > +	err = -ENODEV;
> > >  	err = mutex_lock_interruptible(&rng_mutex);
> >
> > err is being set to another value in the next line!
> >
> > >  	if (err)
> > >  		return -ERESTARTSYS;
> > > -	err = -ENODEV;
> >
> > And all usage of err below now won't have -ENODEV but some other value.
>
> Oops!
>
> > >  	list_for_each_entry(rng, &rng_list, list) {
> > >  		if (strcmp(rng->name, buf) == 0) {
> > >  			if (rng == current_rng) {
> > > @@ -270,8 +271,8 @@ static ssize_t hwrng_attr_current_show(struct device *dev,
> > >  		return -ERESTARTSYS;
> > >  	if (current_rng)
> > >  		name = current_rng->name;
> > > -	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
> > >  	mutex_unlock(&rng_mutex);
> > > +	ret = snprintf(buf, PAGE_SIZE, "%s\n", name);
> >
> > This looks OK...
> >
> > >
> > >  	return ret;
> > >  }
> > > @@ -284,19 +285,19 @@ static ssize_t hwrng_attr_available_show(struct device *dev,
> > >  	ssize_t ret = 0;
> > >  	struct hwrng *rng;
> > >
> > > +	buf[0] = '\0';
> > >  	err = mutex_lock_interruptible(&rng_mutex);
> > >  	if (err)
> > >  		return -ERESTARTSYS;
> > >
> > > -	buf[0] = '\0';
> > >  	list_for_each_entry(rng, &rng_list, list) {
> > >  		strncat(buf, rng->name, PAGE_SIZE - ret - 1);
> > >  		ret += strlen(rng->name);
> > >  		strncat(buf, " ", PAGE_SIZE - ret - 1);
> > >  		ret++;
> > >  	}
> > > +	mutex_unlock(&rng_mutex);
> > >  	strncat(buf, "\n", PAGE_SIZE - ret - 1);
> > >  	ret++;
> > > -	mutex_unlock(&rng_mutex);
> >
> > But this isn't resulting in savings; the majority of the time is being
> > spent in the for loop, and that writes to the buffer.
>
> Right
>
> > BTW I don't expect strcat'ing to the buf in each of these scenarios is
> > a long operation, so this reworking doesn't strike me as something
> > we should pursue.
> >
> > 		Amit
>
> --
> Amos.

--
Amos.