xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V5 0/2] xenbus: Fix S3 frontend resume when xenstored is not running
@ 2013-05-28 17:09 Aurelien Chartier
  2013-05-28 17:09 ` [PATCH V5 1/2] xenbus: save xenstore local status for later use Aurelien Chartier
  2013-05-28 17:09 ` [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running Aurelien Chartier
  0 siblings, 2 replies; 6+ messages in thread
From: Aurelien Chartier @ 2013-05-28 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Aurelien Chartier, konrad.wilk, ian.campbell, JBeulich,
	david.vrabel

Hi,

This patch series fixes the S3 resume of a domain running xenstored and a
frontend over xenbus (xen-netfront in my use case).

As device resume is happening before process resume, the xenbus frontend
resume is hanging if xenstored is not running, thus causing a deadlock.
This patch series is fixing that issue by deferring the xenbus frontend
resume when we are running xenstored in that same domain.

Aurelien Chartier (2):
  xenbus: save xenstore local status for later use
  xenbus: delay xenbus frontend resume if xenstored is not running

 drivers/xen/xenbus/xenbus_comms.h          |    1 +
 drivers/xen/xenbus/xenbus_probe.c          |   27 +++++++++-----------
 drivers/xen/xenbus/xenbus_probe.h          |    7 ++++++
 drivers/xen/xenbus/xenbus_probe_frontend.c |   37 +++++++++++++++++++++++++++-
 include/xen/xenbus.h                       |    1 +
 5 files changed, 57 insertions(+), 16 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH V5 1/2] xenbus: save xenstore local status for later use
  2013-05-28 17:09 [PATCH V5 0/2] xenbus: Fix S3 frontend resume when xenstored is not running Aurelien Chartier
@ 2013-05-28 17:09 ` Aurelien Chartier
  2013-05-28 17:09 ` [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running Aurelien Chartier
  1 sibling, 0 replies; 6+ messages in thread
From: Aurelien Chartier @ 2013-05-28 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Aurelien Chartier, konrad.wilk, ian.campbell, JBeulich,
	david.vrabel

Save the xenstore local status computed in xenbus_init. It can then be used
later to check if xenstored is running in this domain.

Signed-off-by: Aurelien Chartier <aurelien.chartier@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>

Changes in v4:
- Change variable name to xen_store_domain_type
---
 drivers/xen/xenbus/xenbus_comms.h |    1 +
 drivers/xen/xenbus/xenbus_probe.c |   27 ++++++++++++---------------
 drivers/xen/xenbus/xenbus_probe.h |    7 +++++++
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_comms.h b/drivers/xen/xenbus/xenbus_comms.h
index c8abd3b..e74f9c1 100644
--- a/drivers/xen/xenbus/xenbus_comms.h
+++ b/drivers/xen/xenbus/xenbus_comms.h
@@ -45,6 +45,7 @@ int xb_wait_for_data_to_read(void);
 int xs_input_avail(void);
 extern struct xenstore_domain_interface *xen_store_interface;
 extern int xen_store_evtchn;
+extern enum xenstore_init xen_store_domain_type;
 
 extern const struct file_operations xen_xenbus_fops;
 
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 3325884..56cfaaa 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -69,6 +69,9 @@ EXPORT_SYMBOL_GPL(xen_store_evtchn);
 struct xenstore_domain_interface *xen_store_interface;
 EXPORT_SYMBOL_GPL(xen_store_interface);
 
+enum xenstore_init xen_store_domain_type;
+EXPORT_SYMBOL_GPL(xen_store_domain_type);
+
 static unsigned long xen_store_mfn;
 
 static BLOCKING_NOTIFIER_HEAD(xenstore_chain);
@@ -719,17 +722,11 @@ static int __init xenstored_local_init(void)
 	return err;
 }
 
-enum xenstore_init {
-	UNKNOWN,
-	PV,
-	HVM,
-	LOCAL,
-};
 static int __init xenbus_init(void)
 {
 	int err = 0;
-	enum xenstore_init usage = UNKNOWN;
 	uint64_t v = 0;
+	xen_store_domain_type = XS_UNKNOWN;
 
 	if (!xen_domain())
 		return -ENODEV;
@@ -737,29 +734,29 @@ static int __init xenbus_init(void)
 	xenbus_ring_ops_init();
 
 	if (xen_pv_domain())
-		usage = PV;
+		xen_store_domain_type = XS_PV;
 	if (xen_hvm_domain())
-		usage = HVM;
+		xen_store_domain_type = XS_HVM;
 	if (xen_hvm_domain() && xen_initial_domain())
-		usage = LOCAL;
+		xen_store_domain_type = XS_LOCAL;
 	if (xen_pv_domain() && !xen_start_info->store_evtchn)
-		usage = LOCAL;
+		xen_store_domain_type = XS_LOCAL;
 	if (xen_pv_domain() && xen_start_info->store_evtchn)
 		xenstored_ready = 1;
 
-	switch (usage) {
-	case LOCAL:
+	switch (xen_store_domain_type) {
+	case XS_LOCAL:
 		err = xenstored_local_init();
 		if (err)
 			goto out_error;
 		xen_store_interface = mfn_to_virt(xen_store_mfn);
 		break;
-	case PV:
+	case XS_PV:
 		xen_store_evtchn = xen_start_info->store_evtchn;
 		xen_store_mfn = xen_start_info->store_mfn;
 		xen_store_interface = mfn_to_virt(xen_store_mfn);
 		break;
-	case HVM:
+	case XS_HVM:
 		err = hvm_get_parameter(HVM_PARAM_STORE_EVTCHN, &v);
 		if (err)
 			goto out_error;
diff --git a/drivers/xen/xenbus/xenbus_probe.h b/drivers/xen/xenbus/xenbus_probe.h
index bb4f92e..146f857 100644
--- a/drivers/xen/xenbus/xenbus_probe.h
+++ b/drivers/xen/xenbus/xenbus_probe.h
@@ -47,6 +47,13 @@ struct xen_bus_type {
 	struct bus_type bus;
 };
 
+enum xenstore_init {
+	XS_UNKNOWN,
+	XS_PV,
+	XS_HVM,
+	XS_LOCAL,
+};
+
 extern struct device_attribute xenbus_dev_attrs[];
 
 extern int xenbus_match(struct device *_dev, struct device_driver *_drv);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running
  2013-05-28 17:09 [PATCH V5 0/2] xenbus: Fix S3 frontend resume when xenstored is not running Aurelien Chartier
  2013-05-28 17:09 ` [PATCH V5 1/2] xenbus: save xenstore local status for later use Aurelien Chartier
@ 2013-05-28 17:09 ` Aurelien Chartier
  2013-05-29  7:03   ` Jan Beulich
  1 sibling, 1 reply; 6+ messages in thread
From: Aurelien Chartier @ 2013-05-28 17:09 UTC (permalink / raw)
  To: xen-devel
  Cc: Aurelien Chartier, konrad.wilk, ian.campbell, JBeulich,
	david.vrabel

If the xenbus frontend is located in a domain running xenstored, the device
resume is hanging because it is happening before the process resume. This
patch adds extra logic to the resume code to check if we are the domain
running xenstored and delay the resume if needed.

Signed-off-by: Aurelien Chartier <aurelien.chartier@citrix.com>

Changes in v2:
- Instead of bypassing the resume, process it in a workqueue
Changes in v3:
- Add a struct work in xenbus_device to avoid dynamic allocation
- Several small code fixes
Changes in v4:
- Use a dedicated workqueue
Changes in v5:
- Move create_workqueue error handling to xenbus_frontend_dev_resume
---
 drivers/xen/xenbus/xenbus_probe_frontend.c |   37 +++++++++++++++++++++++++++-
 include/xen/xenbus.h                       |    1 +
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xenbus/xenbus_probe_frontend.c b/drivers/xen/xenbus/xenbus_probe_frontend.c
index 3159a37..3374aee 100644
--- a/drivers/xen/xenbus/xenbus_probe_frontend.c
+++ b/drivers/xen/xenbus/xenbus_probe_frontend.c
@@ -29,6 +29,8 @@
 #include "xenbus_probe.h"
 
 
+static struct workqueue_struct *xenbus_frontend_wq;
+
 /* device/<type>/<id> => <type>-<id> */
 static int frontend_bus_id(char bus_id[XEN_BUS_ID_SIZE], const char *nodename)
 {
@@ -89,9 +91,40 @@ static void backend_changed(struct xenbus_watch *watch,
 	xenbus_otherend_changed(watch, vec, len, 1);
 }
 
+static void xenbus_frontend_delayed_resume(struct work_struct *w)
+{
+	struct xenbus_device *xdev = container_of(w, struct xenbus_device, work);
+
+	xenbus_dev_resume(&xdev->dev);
+}
+
+static int xenbus_frontend_dev_resume(struct device *dev)
+{
+	/* 
+	 * If xenstored is running in this domain, we cannot access the backend
+	 * state at the moment, so we need to defer xenbus_dev_resume
+	 */
+	if (xen_store_domain_type == XS_LOCAL) {
+		struct xenbus_device *xdev = to_xenbus_device(dev);
+
+		if (!xenbus_frontend_wq) {
+			pr_err("%s: no workqueue to process delayed resume\n",
+			       xdev->nodename);
+			return -EFAULT;
+		}
+
+		INIT_WORK(&xdev->work, xenbus_frontend_delayed_resume);
+		queue_work(xenbus_frontend_wq, &xdev->work);
+
+		return 0;
+	}
+
+	return xenbus_dev_resume(dev);
+}
+
 static const struct dev_pm_ops xenbus_pm_ops = {
 	.suspend	= xenbus_dev_suspend,
-	.resume		= xenbus_dev_resume,
+	.resume		= xenbus_frontend_dev_resume,
 	.freeze		= xenbus_dev_suspend,
 	.thaw		= xenbus_dev_cancel,
 	.restore	= xenbus_dev_resume,
@@ -440,6 +473,8 @@ static int __init xenbus_probe_frontend_init(void)
 
 	register_xenstore_notifier(&xenstore_notifier);
 
+	xenbus_frontend_wq = create_workqueue("xenbus_frontend");
+
 	return 0;
 }
 subsys_initcall(xenbus_probe_frontend_init);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 0a7515c..569c07f 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -70,6 +70,7 @@ struct xenbus_device {
 	struct device dev;
 	enum xenbus_state state;
 	struct completion down;
+	struct work_struct work;
 };
 
 static inline struct xenbus_device *to_xenbus_device(struct device *dev)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running
  2013-05-28 17:09 ` [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running Aurelien Chartier
@ 2013-05-29  7:03   ` Jan Beulich
  2013-05-29 15:44     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2013-05-29  7:03 UTC (permalink / raw)
  To: Aurelien Chartier; +Cc: xen-devel, david.vrabel, konrad.wilk, ian.campbell

>>> On 28.05.13 at 19:09, Aurelien Chartier <aurelien.chartier@citrix.com> wrote:
> If the xenbus frontend is located in a domain running xenstored, the device
> resume is hanging because it is happening before the process resume. This
> patch adds extra logic to the resume code to check if we are the domain
> running xenstored and delay the resume if needed.
> 
> Signed-off-by: Aurelien Chartier <aurelien.chartier@citrix.com>
> 
> Changes in v2:
> - Instead of bypassing the resume, process it in a workqueue
> Changes in v3:
> - Add a struct work in xenbus_device to avoid dynamic allocation
> - Several small code fixes
> Changes in v4:
> - Use a dedicated workqueue
> Changes in v5:
> - Move create_workqueue error handling to xenbus_frontend_dev_resume

Acked-by: Jan Beulich <jbeulich@suse.com>

Yet nevertheless I'm still seeing room for improvement:

> +static int xenbus_frontend_dev_resume(struct device *dev)
> +{
> +	/* 
> +	 * If xenstored is running in this domain, we cannot access the backend
> +	 * state at the moment, so we need to defer xenbus_dev_resume
> +	 */
> +	if (xen_store_domain_type == XS_LOCAL) {
> +		struct xenbus_device *xdev = to_xenbus_device(dev);
> +
> +		if (!xenbus_frontend_wq) {
> +			pr_err("%s: no workqueue to process delayed resume\n",
> +			       xdev->nodename);
> +			return -EFAULT;

Issuing a message here is fine, but I think you should also issue a
pr_warn() at initialization time.

> +		}
> +
> +		INIT_WORK(&xdev->work, xenbus_frontend_delayed_resume);

And I also think that this would better be done once at initialization
time too.

> +		queue_work(xenbus_frontend_wq, &xdev->work);
> +
> +		return 0;
> +	}

Jan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running
  2013-05-29  7:03   ` Jan Beulich
@ 2013-05-29 15:44     ` Konrad Rzeszutek Wilk
  2013-05-29 16:17       ` Aurelien Chartier
  0 siblings, 1 reply; 6+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-05-29 15:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Aurelien Chartier, xen-devel, david.vrabel, ian.campbell

On Wed, May 29, 2013 at 08:03:13AM +0100, Jan Beulich wrote:
> >>> On 28.05.13 at 19:09, Aurelien Chartier <aurelien.chartier@citrix.com> wrote:
> > If the xenbus frontend is located in a domain running xenstored, the device
> > resume is hanging because it is happening before the process resume. This
> > patch adds extra logic to the resume code to check if we are the domain
> > running xenstored and delay the resume if needed.
> > 
> > Signed-off-by: Aurelien Chartier <aurelien.chartier@citrix.com>
> > 
> > Changes in v2:
> > - Instead of bypassing the resume, process it in a workqueue
> > Changes in v3:
> > - Add a struct work in xenbus_device to avoid dynamic allocation
> > - Several small code fixes
> > Changes in v4:
> > - Use a dedicated workqueue
> > Changes in v5:
> > - Move create_workqueue error handling to xenbus_frontend_dev_resume
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> Yet nevertheless I'm still seeing room for improvement:
> 
> > +static int xenbus_frontend_dev_resume(struct device *dev)
> > +{
> > +	/* 
> > +	 * If xenstored is running in this domain, we cannot access the backend
> > +	 * state at the moment, so we need to defer xenbus_dev_resume
> > +	 */
> > +	if (xen_store_domain_type == XS_LOCAL) {
> > +		struct xenbus_device *xdev = to_xenbus_device(dev);
> > +
> > +		if (!xenbus_frontend_wq) {
> > +			pr_err("%s: no workqueue to process delayed resume\n",
> > +			       xdev->nodename);
> > +			return -EFAULT;
> 
> Issuing a message here is fine, but I think you should also issue a
> pr_warn() at initialization time.
> 
> > +		}
> > +
> > +		INIT_WORK(&xdev->work, xenbus_frontend_delayed_resume);
> 
> And I also think that this would better be done once at initialization
> time too.
> 
> > +		queue_work(xenbus_frontend_wq, &xdev->work);
> > +
> > +		return 0;
> > +	}
> 
> Jan

Lets do that as a follow up patch. Aurlien, I've taken your patches in.

> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running
  2013-05-29 15:44     ` Konrad Rzeszutek Wilk
@ 2013-05-29 16:17       ` Aurelien Chartier
  0 siblings, 0 replies; 6+ messages in thread
From: Aurelien Chartier @ 2013-05-29 16:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, david.vrabel, Jan Beulich, ian.campbell

On 29/05/13 16:44, Konrad Rzeszutek Wilk wrote:
> On Wed, May 29, 2013 at 08:03:13AM +0100, Jan Beulich wrote:
>>>>> On 28.05.13 at 19:09, Aurelien Chartier <aurelien.chartier@citrix.com> wrote:
>>> If the xenbus frontend is located in a domain running xenstored, the device
>>> resume is hanging because it is happening before the process resume. This
>>> patch adds extra logic to the resume code to check if we are the domain
>>> running xenstored and delay the resume if needed.
>>>
>>> Signed-off-by: Aurelien Chartier <aurelien.chartier@citrix.com>
>>>
>>> Changes in v2:
>>> - Instead of bypassing the resume, process it in a workqueue
>>> Changes in v3:
>>> - Add a struct work in xenbus_device to avoid dynamic allocation
>>> - Several small code fixes
>>> Changes in v4:
>>> - Use a dedicated workqueue
>>> Changes in v5:
>>> - Move create_workqueue error handling to xenbus_frontend_dev_resume
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>
>> Yet nevertheless I'm still seeing room for improvement:
>>
>>> +static int xenbus_frontend_dev_resume(struct device *dev)
>>> +{
>>> +	/* 
>>> +	 * If xenstored is running in this domain, we cannot access the backend
>>> +	 * state at the moment, so we need to defer xenbus_dev_resume
>>> +	 */
>>> +	if (xen_store_domain_type == XS_LOCAL) {
>>> +		struct xenbus_device *xdev = to_xenbus_device(dev);
>>> +
>>> +		if (!xenbus_frontend_wq) {
>>> +			pr_err("%s: no workqueue to process delayed resume\n",
>>> +			       xdev->nodename);
>>> +			return -EFAULT;
>> Issuing a message here is fine, but I think you should also issue a
>> pr_warn() at initialization time.
>>
>>> +		}
>>> +
>>> +		INIT_WORK(&xdev->work, xenbus_frontend_delayed_resume);
>> And I also think that this would better be done once at initialization
>> time too.
>>
>>> +		queue_work(xenbus_frontend_wq, &xdev->work);
>>> +
>>> +		return 0;
>>> +	}
>> Jan
> Lets do that as a follow up patch. Aurlien, I've taken your patches in.

Thanks Konrad. I'll send the follow up patch later on.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-05-29 16:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-28 17:09 [PATCH V5 0/2] xenbus: Fix S3 frontend resume when xenstored is not running Aurelien Chartier
2013-05-28 17:09 ` [PATCH V5 1/2] xenbus: save xenstore local status for later use Aurelien Chartier
2013-05-28 17:09 ` [PATCH V5 2/2] xenbus: delay xenbus frontend resume if xenstored is not running Aurelien Chartier
2013-05-29  7:03   ` Jan Beulich
2013-05-29 15:44     ` Konrad Rzeszutek Wilk
2013-05-29 16:17       ` Aurelien Chartier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).