All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Ujfalusi <peter.ujfalusi@ti.com>
To: Grant Likely <grant.likely@secretlab.ca>, <gregkh@linuxfoundation.org>
Cc: <broonie@kernel.org>, <lgirdwood@gmail.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [RESEND] drivercore: deferral race condition fix
Date: Wed, 9 Apr 2014 10:03:37 +0300	[thread overview]
Message-ID: <5344F0C9.5000503@ti.com> (raw)
In-Reply-To: <5343FB23.8090309@ti.com>

On 04/08/2014 04:35 PM, Peter Ujfalusi wrote:
> On 04/08/2014 03:43 PM, Grant Likely wrote:
>>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>>> index 06051767393f..80703de6e6ad 100644
>>> --- a/drivers/base/dd.c
>>> +++ b/drivers/base/dd.c
>>> @@ -53,6 +53,10 @@ static LIST_HEAD(deferred_probe_pending_list);
>>>  static LIST_HEAD(deferred_probe_active_list);
>>>  static struct workqueue_struct *deferred_wq;
>>>  
>>> +static atomic_t probe_count = ATOMIC_INIT(0);
>>> +static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>>> +static bool deferral_retry;
>>> +
>>>  /**
>>>   * deferred_probe_work_func() - Retry probing devices in the active list.
>>>   */
>>> @@ -141,6 +145,11 @@ static void driver_deferred_probe_trigger(void)
>>>  	if (!driver_deferred_probe_enable)
>>>  		return;
>>>  
>>> +	if (atomic_read(&probe_count) > 1)
>>> +		deferral_retry = true;
>>> +	else
>>> +		deferral_retry = false;
>>
>> A few comments:
>> - Really need to comment why these lines are being added.
>> - I think this hunk needs to be moved to realy_probe(). It
>>   doesn't make any sense when called via deferred_probe_initcall(), and
>>   it doesn't work in the device_bind_driver path because the probe_count
>>   is not incremented there. In fact, the device_bind_driver() path has
>>   the same race condition, but it is unlikely to be a problem in
>>   practice because device_bind_driver() is used very rarely and doesn't
>>   execute any driver code.
> 
> The reason why I have added the flagging to driver_deferred_probe_trigger()
> because this is the place where the deferred drivers will be kicked.
> When the drivers are loaded in order during the boot it is not really
> interesting for this situation. When a driver has been moved to deferred queue
> is the time we need to watch for the 'race' to handle.
> I did have this flagging first in really_probe() but as far as I recall it
> exhibited random misses.
> The driver_deferred_probe_trigger() will be called every time when a driver
> probed with success - from driver_bound(), right? So what we are doing is that
> we set the deferral_retry flag if we have more than one driver's probe in
> progress and see if when the last driver leaves it's probe we had another
> loaded with success.
> The probe_count will be decremented after the driver_bound() so if we had only
> the two racy driver as last, we will have the flag set.
> Hrm, probably it might be better for readability to move the deferral_retry
> flag code just before the driver_bound() call in really_probe(). Inthis way we
> will have these in one place.

Now that I had time to think about this again I think the really_probe() is a
wrong place for this failsafe mechanism.
At the end we need to look and handle the following case:
When a driver probed with success while other driver(s) still in their probe
(thus not present in the deferred lists) we need to flag this event in the
driver_deferred_probe_trigger() function, just before the
list_splice_tail_init() call - when the deferred list is prepared.
Basically we set a flag for later use, that we have prepared the deferred list
but there were drivers in-fligth which we do not yet know if they are going to
end up deferring.

We need to check this flag in driver_deferred_probe_add() function which adds
the driver to the deferred pending list in case it returned with -EPROBE_DEFER.
Here we check the flag and also check if this is the last driver known to us
probing (probe_count == 1).
If these conditions met, we call driver_deferred_probe_trigger() from here as
an automatic one shot try to see if the previously loaded driver had satisfied
the deferred last driver.

I need to move driver_deferred_probe_add() down a bit in the source file for
this and rename the flag I had to 'deferred_auto_retry' or something.
I think this is going to be more robust and also gives cleaner explanation
what this 'recovery' code meant to do.

> 
>> - The 'if' is unnecessary:
>> 	deferred_retry = (atomic_read(&probe_count) > 1);
>>
>>> +
>>>  	/*
>>>  	 * A successful probe means that all the devices in the pending list
>>>  	 * should be triggered to be reprobed.  Move all the deferred devices
>>> @@ -259,9 +268,6 @@ int device_bind_driver(struct device *dev)
>>>  }
>>>  EXPORT_SYMBOL_GPL(device_bind_driver);
>>>  
>>> -static atomic_t probe_count = ATOMIC_INIT(0);
>>> -static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>>> -
>>>  static int really_probe(struct device *dev, struct device_driver *drv)
>>>  {
>>>  	int ret = 0;
>>> @@ -310,6 +316,16 @@ probe_failed:
>>>  		/* Driver requested deferred probing */
>>>  		dev_info(dev, "Driver %s requests probe deferral\n", drv->name);
>>>  		driver_deferred_probe_add(dev);
>>> +		/*
>>> +		 * This is the last driver to load and asking to be deferred.
>>> +		 * If other driver(s) loaded while this driver was loading, we
>>> +		 * should try the deferred modules again to avoid missing
>>> +		 * dependency for this driver.
>>> +		 */
>>> +		if (atomic_read(&probe_count) == 1 && deferral_retry) {
>>> +			deferral_retry = false;
>>> +			driver_deferred_probe_trigger();
>>> +		}
>>
>> Testing the probe count probably isn't necessary. Clearing the flag
>> though is probably racy if there are two deferred drivers in flight.
> 
> I think it is a good thing to have to avoid kicking the deferred list all the
> time. If we still have 5 driver still probing we can just wait till the last
> is gone and just check if we need to do an 'emergency' kick to the deferred list.
> 
>> I would rather be happier if each probe could track on its own if there
>> had been any successful probes and then decide whether or not to trigger
>> again based on that, but when I played with it I found that it just
>> creates another race condition between calling really_probe() and
>> really_probe() grabbing a probe state footprint. Everything I tried made
>> things more complicated than less.
> 
> Yes, I also experimented with other ways but things got more fragile with even
> more corner cases to handle and understand...
> 
>> Go ahead and add my a-b when you
>> respin the patch.
>>
>> Acked-by: Grant Likely <grant.likely@linaro.org>
> 
> Thanks, I'll send the v2 tomorrow.
> 
>>
>>
>>>  	} else if (ret != -ENODEV && ret != -ENXIO) {
>>>  		/* driver matched but the probe failed */
>>>  		printk(KERN_WARNING
>>> -- 
>>> 1.9.1
>>>
>>
> 
> 


-- 
Péter

      reply	other threads:[~2014-04-09  7:03 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-03  7:12 [RESEND] drivercore: deferral race condition fix Peter Ujfalusi
2014-04-03  9:40 ` Mark Brown
2014-04-08 10:27   ` Grant Likely
2014-04-08 12:47     ` Grant Likely
2014-04-08 13:16       ` Peter Ujfalusi
2014-04-08 12:43 ` Grant Likely
2014-04-08 13:35   ` Peter Ujfalusi
2014-04-09  7:03     ` Peter Ujfalusi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5344F0C9.5000503@ti.com \
    --to=peter.ujfalusi@ti.com \
    --cc=broonie@kernel.org \
    --cc=grant.likely@secretlab.ca \
    --cc=gregkh@linuxfoundation.org \
    --cc=lgirdwood@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.