* about phosphor pid control package
@ 2019-05-08 13:45 Will Liang (梁永鉉)
  2019-05-08 16:36 ` Patrick Venture
  0 siblings, 1 reply; 8+ messages in thread
From: Will Liang (梁永鉉) @ 2019-05-08 13:45 UTC (permalink / raw)
  To: OpenBMC Maillist

Hi,

I have a question about getFailSafeMode().

Currently, only sensors defined as "temp" type are checked for failure.
I did not find any "fan" type sensors being checked for fan failure.
Our project needs to detect fan failures, so I want to add a check for "fan" type sensors.

Can I add one more "for" loop to updateSensors() in zone.cpp to check the fan sensors?

for (const auto& t : _thermalInputs)
{
    ........
}
for (const auto& t : _fanInputs)
{
    ........
}

BRs,
Will


* Re: about phosphor pid control package
  2019-05-08 13:45 about phosphor pid control package Will Liang (梁永鉉)
@ 2019-05-08 16:36 ` Patrick Venture
  2019-05-09  6:33   ` Will Liang (梁永鉉)
  0 siblings, 1 reply; 8+ messages in thread
From: Patrick Venture @ 2019-05-08 16:36 UTC (permalink / raw)
  To: Will Liang (梁永鉉); +Cc: OpenBMC Maillist

On Wed, May 8, 2019 at 6:46 AM Will Liang (梁永鉉) <Will.Liang@quantatw.com> wrote:
>
> Hi,
>
> I have a question about getFailSafeMode().
>
> Currently, only sensors that are defined as "temp" types can be checked for failure.
> I did not find any "fan" type sensors to check if the fan has failed.
> Our project need to check the fan fail so  I want to add another "fan" sensor type to check.



>
> Can I add one more "for loop" to check the fan sensor in updateSensors() function in zone.cpp??
>
> for (const auto& t : _thermalInputs)
> {
>     ........
> }
> for (const auto& t : _fanInputs)
> {
>     ........
> }

updateSensors is deliberately not talking to the fans because they're
not considered inputs into the thermal config, they're controlled
outputs -- the question I have is, what would you like to do if a fan
isn't responding?  failsafemode drives the fans to a specific
pre-defined speed to keep it from thermal issues.  If a fan is failing
to respond, one can't drive it -- perhaps one can drive the others to
some failsafe?

If so, the zone's failsafe state needs to be updated not in the
thermal sensor check but where the fans are checked (void
PIDZone::updateFanTelemetry(void))

>
> BRs,
> Will


* RE: about phosphor pid control package
  2019-05-08 16:36 ` Patrick Venture
@ 2019-05-09  6:33   ` Will Liang (梁永鉉)
  2019-05-09 14:43     ` Patrick Venture
  0 siblings, 1 reply; 8+ messages in thread
From: Will Liang (梁永鉉) @ 2019-05-09  6:33 UTC (permalink / raw)
  To: Patrick Venture; +Cc: OpenBMC Maillist

Hi,

> -----Original Message-----
> From: Patrick Venture [mailto:venture@google.com]
> Sent: Thursday, May 9, 2019 12:36 AM
> To: Will Liang (梁永鉉) <Will.Liang@quantatw.com>
> Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>
> Subject: Re: about phosphor pid control package
> 
> On Wed, May 8, 2019 at 6:46 AM Will Liang (梁永鉉)
> <Will.Liang@quantatw.com> wrote:
> >
> > Hi,
> >
> > I have a question about getFailSafeMode().
> >
> > Currently, only sensors that are defined as "temp" types can be checked for
> failure.
> > I did not find any "fan" type sensors to check if the fan has failed.
> > Our project need to check the fan fail so  I want to add another "fan" sensor
> type to check.
> 
> 
> 
> >
> > Can I add one more "for loop" to check the fan sensor in updateSensors()
> function in zone.cpp??
> >
> > for (const auto& t : _thermalInputs)
> > {
> >     ........
> > }
> > for (const auto& t : _fanInputs)
> > {
> >     ........
> > }
> 
> updateSensors is deliberately not talking to the fans because they're not
> considered inputs into the thermal config, they're controlled outputs -- the
> question I have is, what would you like to do if a fan isn't responding?
> failsafemode drives the fans to a specific pre-defined speed to keep it from
> thermal issues.  If a fan is failing to respond, one can't drive it -- perhaps one
> can drive the others to some failsafe?

If a fan fails, we need to enter failsafe mode to raise the duty cycle of the other fans.

> If so, one needs to update the failsafe for a zone outside of the thermal sensors,
> but rather where the fans are checked (void
> PIDZone::updateFanTelemetry(void))

I added the following code to the PIDZone::updateFanTelemetry(void) function to check for fan failure.
If a fan fails, the zone enters failsafe mode.
  if (sensor->getFailed())
  {
      _failSafeSensors.insert(f);
  }
  else
  {
      // Check if it's in there: remove it.
      auto kt = _failSafeSensors.find(f);
      if (kt != _failSafeSensors.end())
      {
          _failSafeSensors.erase(kt);
      }
  }

One more question: the code above can only detect the failure of a single fan.
Our project needs to detect dual-fan failures. Do you have any suggestions for checking for a dual-fan failure?
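
For illustration, the insert/erase bookkeeping in the snippet above can be written as a small self-contained sketch. FailSafeTracker and update() are hypothetical names used only here; the real PIDZone class in phosphor-pid-control does much more, and the sketch assumes sensors are identified by name.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the zone's failsafe bookkeeping.
// It is not the PIDZone class from phosphor-pid-control.
class FailSafeTracker
{
  public:
    // Record the latest health of one sensor (thermal input or fan tach).
    void update(const std::string& name, bool failed)
    {
        assert(!name.empty());
        if (failed)
        {
            _failSafeSensors.insert(name); // no-op if already present
        }
        else
        {
            _failSafeSensors.erase(name); // no-op if not present
        }
    }

    // The zone stays in failsafe mode while any tracked sensor is failed.
    bool getFailSafeMode() const
    {
        return !_failSafeSensors.empty();
    }

  private:
    std::set<std::string> _failSafeSensors;
};
```

Because std::set::erase(key) does nothing when the key is absent, the explicit find() before erase is optional. With this scheme a single failed tach is enough to keep the zone in failsafe mode.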

Will
> >
> > BRs,
> > Will


* Re: about phosphor pid control package
  2019-05-09  6:33   ` Will Liang (梁永鉉)
@ 2019-05-09 14:43     ` Patrick Venture
  2019-05-09 17:51       ` James Feist
  0 siblings, 1 reply; 8+ messages in thread
From: Patrick Venture @ 2019-05-09 14:43 UTC (permalink / raw)
  To: Will Liang (梁永鉉); +Cc: OpenBMC Maillist

On Wed, May 8, 2019 at 11:33 PM Will Liang (梁永鉉)
<Will.Liang@quantatw.com> wrote:
>
> Hi,
>
> > -----Original Message-----
> > From: Patrick Venture [mailto:venture@google.com]
> > Sent: Thursday, May 9, 2019 12:36 AM
> > To: Will Liang (梁永鉉) <Will.Liang@quantatw.com>
> > Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>
> > Subject: Re: about phosphor pid control package
> >
> > On Wed, May 8, 2019 at 6:46 AM Will Liang (梁永鉉)
> > <Will.Liang@quantatw.com> wrote:
> > >
> > > Hi,
> > >
> > > I have a question about getFailSafeMode().
> > >
> > > Currently, only sensors that are defined as "temp" types can be checked for
> > failure.
> > > I did not find any "fan" type sensors to check if the fan has failed.
> > > Our project need to check the fan fail so  I want to add another "fan" sensor
> > type to check.
> >
> >
> >
> > >
> > > Can I add one more "for loop" to check the fan sensor in updateSensors()
> > function in zone.cpp??
> > >
> > > for (const auto& t : _thermalInputs)
> > > {
> > >     ........
> > > }
> > > for (const auto& t : _fanInputs)
> > > {
> > >     ........
> > > }
> >
> > updateSensors is deliberately not talking to the fans because they're not
> > considered inputs into the thermal config, they're controlled outputs -- the
> > question I have is, what would you like to do if a fan isn't responding?
> > failsafemode drives the fans to a specific pre-defined speed to keep it from
> > thermal issues.  If a fan is failing to respond, one can't drive it -- perhaps one
> > can drive the others to some failsafe?
>
> If a fan fails, we need to enter the fail safe mode to increase the other fan duty.
>
> > If so, one needs to update the failsafe for a zone outside of the thermal sensors,
> > but rather where the fans are checked (void
> > PIDZone::updateFanTelemetry(void))
>
> I add following code into the PIDZone::updateFanTelemetry(void) function to check the fan fail.
> If the fan fails, it will enter fail safe mode.
>   if (sensor->getFailed())
>   {
>         failSafeSensors.insert(f);
>   }
>   else
>   {
>         // Check if it's in there: remove it.
>     auto kt = _failSafeSensors.find(f);
>     if (kt != _failSafeSensors.end())
>   {
>         failSafeSensors.erase(kt);
>   }
>
> But one more question I have is that the above code can only check if a single fan has failed.

> Our project needs to check for dual-fan failures. Do you have any suggestions for checking the failure of the dual-fan?

I'm not entirely certain what you mean.  You're saying a dual-fan is a
fan that has two outputs but one input?

>
> Will
> > >
> > > BRs,
> > > Will


* Re: about phosphor pid control package
  2019-05-09 14:43     ` Patrick Venture
@ 2019-05-09 17:51       ` James Feist
  2019-05-10  0:08         ` Will Liang (梁永鉉)
  2019-05-14 12:50         ` Will Liang (梁永鉉)
  0 siblings, 2 replies; 8+ messages in thread
From: James Feist @ 2019-05-09 17:51 UTC (permalink / raw)
  To: Patrick Venture, Will Liang (梁永鉉); +Cc: OpenBMC Maillist

On 5/9/19 7:43 AM, Patrick Venture wrote:
> On Wed, May 8, 2019 at 11:33 PM Will Liang (梁永鉉)
> <Will.Liang@quantatw.com> wrote:
>>
>> Hi,
>>
>>> -----Original Message-----
>>> From: Patrick Venture [mailto:venture@google.com]
>>> Sent: Thursday, May 9, 2019 12:36 AM
>>> To: Will Liang (梁永鉉) <Will.Liang@quantatw.com>
>>> Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>
>>> Subject: Re: about phosphor pid control package
>>>
>>> On Wed, May 8, 2019 at 6:46 AM Will Liang (梁永鉉)
>>> <Will.Liang@quantatw.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have a question about getFailSafeMode().
>>>>
>>>> Currently, only sensors that are defined as "temp" types can be checked for
>>> failure.
>>>> I did not find any "fan" type sensors to check if the fan has failed.
>>>> Our project need to check the fan fail so  I want to add another "fan" sensor
>>> type to check.
>>>
>>>
>>>
>>>>
>>>> Can I add one more "for loop" to check the fan sensor in updateSensors()
>>> function in zone.cpp??
>>>>
>>>> for (const auto& t : _thermalInputs)
>>>> {
>>>>      ........
>>>> }
>>>> for (const auto& t : _fanInputs)
>>>> {
>>>>      ........
>>>> }
>>>
>>> updateSensors is deliberately not talking to the fans because they're not
>>> considered inputs into the thermal config, they're controlled outputs -- the
>>> question I have is, what would you like to do if a fan isn't responding?
>>> failsafemode drives the fans to a specific pre-defined speed to keep it from
>>> thermal issues.  If a fan is failing to respond, one can't drive it -- perhaps one
>>> can drive the others to some failsafe?
>>
>> If a fan fails, we need to enter the fail safe mode to increase the other fan duty.
>>
>>> If so, one needs to update the failsafe for a zone outside of the thermal sensors,
>>> but rather where the fans are checked (void
>>> PIDZone::updateFanTelemetry(void))
>>
>> I add following code into the PIDZone::updateFanTelemetry(void) function to check the fan fail.
>> If the fan fails, it will enter fail safe mode.
>>    if (sensor->getFailed())
>>    {
>>          failSafeSensors.insert(f);
>>    }
>>    else
>>    {
>>          // Check if it's in there: remove it.
>>      auto kt = _failSafeSensors.find(f);
>>      if (kt != _failSafeSensors.end())
>>    {
>>          failSafeSensors.erase(kt);
>>    }
>>
>> But one more question I have is that the above code can only check if a single fan has failed.
> 
>> Our project needs to check for dual-fan failures. Do you have any suggestions for checking the failure of the dual-fan?
> 
> I'm not entirely certain what you mean.  You're saying a dual-fan is a
> fan that has two outputs but one input?

If this is what you mean, on our systems we simply have a tach sensor 
per tach in the fan, i.e. fan1a and fan1b. I think the above logic would 
work for this issue.

> 
>>
>> Will
>>>>
>>>> BRs,
>>>> Will


* RE: about phosphor pid control package
  2019-05-09 17:51       ` James Feist
@ 2019-05-10  0:08         ` Will Liang (梁永鉉)
  2019-05-14 12:50         ` Will Liang (梁永鉉)
  1 sibling, 0 replies; 8+ messages in thread
From: Will Liang (梁永鉉) @ 2019-05-10  0:08 UTC (permalink / raw)
  To: James Feist, Patrick Venture; +Cc: OpenBMC Maillist


> On 5/9/19 7:43 AM, Patrick Venture wrote:
> > On Wed, May 8, 2019 at 11:33 PM Will Liang (梁永鉉)
> > <Will.Liang@quantatw.com> wrote:
> >>
> >> Hi,
> >>
> >>> -----Original Message-----
> >>> From: Patrick Venture [mailto:venture@google.com]
> >>> Sent: Thursday, May 9, 2019 12:36 AM
> >>> To: Will Liang (梁永鉉) <Will.Liang@quantatw.com>
> >>> Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>
> >>> Subject: Re: about phosphor pid control package
> >>>
> >>> On Wed, May 8, 2019 at 6:46 AM Will Liang (梁永鉉)
> >>> <Will.Liang@quantatw.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I have a question about getFailSafeMode().
> >>>>
> >>>> Currently, only sensors that are defined as "temp" types can be
> >>>> checked for
> >>> failure.
> >>>> I did not find any "fan" type sensors to check if the fan has failed.
> >>>> Our project need to check the fan fail so  I want to add another
> >>>> "fan" sensor
> >>> type to check.
> >>>
> >>>
> >>>
> >>>>
> >>>> Can I add one more "for loop" to check the fan sensor in
> >>>> updateSensors()
> >>> function in zone.cpp??
> >>>>
> >>>> for (const auto& t : _thermalInputs) {
> >>>>      ........
> >>>> }
> >>>> for (const auto& t : _fanInputs)
> >>>> {
> >>>>      ........
> >>>> }
> >>>
> >>> updateSensors is deliberately not talking to the fans because
> >>> they're not considered inputs into the thermal config, they're
> >>> controlled outputs -- the question I have is, what would you like to do if a
> fan isn't responding?
> >>> failsafemode drives the fans to a specific pre-defined speed to keep
> >>> it from thermal issues.  If a fan is failing to respond, one can't
> >>> drive it -- perhaps one can drive the others to some failsafe?
> >>
> >> If a fan fails, we need to enter the fail safe mode to increase the other fan
> duty.
> >>
> >>> If so, one needs to update the failsafe for a zone outside of the
> >>> thermal sensors, but rather where the fans are checked (void
> >>> PIDZone::updateFanTelemetry(void))
> >>
> >> I add following code into the PIDZone::updateFanTelemetry(void) function
> to check the fan fail.
> >> If the fan fails, it will enter fail safe mode.
> >>    if (sensor->getFailed())
> >>    {
> >>          failSafeSensors.insert(f);
> >>    }
> >>    else
> >>    {
> >>          // Check if it's in there: remove it.
> >>      auto kt = _failSafeSensors.find(f);
> >>      if (kt != _failSafeSensors.end())
> >>    {
> >>          failSafeSensors.erase(kt);
> >>    }
> >>
> >> But one more question I have is that the above code can only check if a
> single fan has failed.
> >
> >> Our project needs to check for dual-fan failures. Do you have any
> suggestions for checking the failure of the dual-fan?
> >
> > I'm not entirely certain what you mean.  You're saying a dual-fan is a
> > fan that has two outputs but one input?
> 
> If this is what you mean, on our systems we simply have a tach sensor per tach
> in the fan, i.e. fan1a and fan1b. I think the above logic would work for this
> issue.

Sorry for the confusion caused by my unclear wording.
By "dual-fan" I mean a dual-rotor fan: two tachometer outputs and one PWM input.


> >
> >>
> >> Will
> >>>>
> >>>> BRs,
> >>>> Will


* RE: about phosphor pid control package
  2019-05-09 17:51       ` James Feist
  2019-05-10  0:08         ` Will Liang (梁永鉉)
@ 2019-05-14 12:50         ` Will Liang (梁永鉉)
  2019-05-14 16:01           ` James Feist
  1 sibling, 1 reply; 8+ messages in thread
From: Will Liang (梁永鉉) @ 2019-05-14 12:50 UTC (permalink / raw)
  To: James Feist, Patrick Venture; +Cc: OpenBMC Maillist



> -----Original Message-----
> From: Will Liang (梁永鉉)
> Sent: Friday, May 10, 2019 8:06 AM
> To: 'James Feist' <james.feist@linux.intel.com>; Patrick Venture
> <venture@google.com>
> Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>
> Subject: RE: about phosphor pid control package
> 
> 
> > On 5/9/19 7:43 AM, Patrick Venture wrote:
> > > On Wed, May 8, 2019 at 11:33 PM Will Liang (梁永鉉)
> > > <Will.Liang@quantatw.com> wrote:
> > >>
> > >> Hi,
> > >>
> > >>> -----Original Message-----
> > >>> From: Patrick Venture [mailto:venture@google.com]
> > >>> Sent: Thursday, May 9, 2019 12:36 AM
> > >>> To: Will Liang (梁永鉉) <Will.Liang@quantatw.com>
> > >>> Cc: OpenBMC Maillist <openbmc@lists.ozlabs.org>
> > >>> Subject: Re: about phosphor pid control package
> > >>>
> > >>> On Wed, May 8, 2019 at 6:46 AM Will Liang (梁永鉉)
> > >>> <Will.Liang@quantatw.com> wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> I have a question about getFailSafeMode().
> > >>>>
> > >>>> Currently, only sensors that are defined as "temp" types can be
> > >>>> checked for
> > >>> failure.
> > >>>> I did not find any "fan" type sensors to check if the fan has failed.
> > >>>> Our project need to check the fan fail so  I want to add another
> > >>>> "fan" sensor
> > >>> type to check.
> > >>>
> > >>>
> > >>>
> > >>>>
> > >>>> Can I add one more "for loop" to check the fan sensor in
> > >>>> updateSensors()
> > >>> function in zone.cpp??
> > >>>>
> > >>>> for (const auto& t : _thermalInputs) {
> > >>>>      ........
> > >>>> }
> > >>>> for (const auto& t : _fanInputs)
> > >>>> {
> > >>>>      ........
> > >>>> }
> > >>>
> > >>> updateSensors is deliberately not talking to the fans because
> > >>> they're not considered inputs into the thermal config, they're
> > >>> controlled outputs -- the question I have is, what would you like
> > >>> to do if a
> > fan isn't responding?
> > >>> failsafemode drives the fans to a specific pre-defined speed to
> > >>> keep it from thermal issues.  If a fan is failing to respond, one
> > >>> can't drive it -- perhaps one can drive the others to some failsafe?
> > >>
> > >> If a fan fails, we need to enter the fail safe mode to increase the
> > >> other fan
> > duty.
> > >>
> > >>> If so, one needs to update the failsafe for a zone outside of the
> > >>> thermal sensors, but rather where the fans are checked (void
> > >>> PIDZone::updateFanTelemetry(void))
> > >>
> > >> I add following code into the PIDZone::updateFanTelemetry(void)
> > >> function
> > to check the fan fail.
> > >> If the fan fails, it will enter fail safe mode.
> > >>    if (sensor->getFailed())
> > >>    {
> > >>          failSafeSensors.insert(f);
> > >>    }
> > >>    else
> > >>    {
> > >>          // Check if it's in there: remove it.
> > >>      auto kt = _failSafeSensors.find(f);
> > >>      if (kt != _failSafeSensors.end())
> > >>    {
> > >>          failSafeSensors.erase(kt);
> > >>    }
> > >>
> > >> But one more question I have is that the above code can only check
> > >> if a
> > single fan has failed.
> > >
> > >> Our project needs to check for dual-fan failures. Do you have any
> > suggestions for checking the failure of the dual-fan?
> > >
> > > I'm not entirely certain what you mean.  You're saying a dual-fan is
> > > a fan that has two outputs but one input?
> >
> > If this is what you mean, on our systems we simply have a tach sensor
> > per tach in the fan, i.e. fan1a and fan1b. I think the above logic
> > would work for this issue.
I think others may also need to check for a single-rotor failure, so I will push the above code to Gerrit.

> I'm so sorry to make you misunderstand because of my unclear expression.
> The "dual-fan" means "Dual rotor fan" , two tachometer output one PWM
> input.
In our architecture, a fan is only counted as failed when both fan1a and fan1b fail (a dual-rotor fan failure).

Do you have any suggestions or ideas?
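
As a rough sketch of that rule (a fan counts as failed only when all of its rotors have failed), assuming a hypothetical table that maps each physical fan to its tach sensor names. FanRotors, fanFailed() and zoneFailSafe() are illustrative names, not part of phosphor-pid-control.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical mapping from a physical fan to its rotor tach sensors,
// e.g. {"fan1", {"fan1a", "fan1b"}}.
using FanRotors = std::map<std::string, std::vector<std::string>>;

// A fan counts as failed only when every one of its rotors has failed.
bool fanFailed(const std::vector<std::string>& rotors,
               const std::map<std::string, bool>& rotorFailed)
{
    assert(!rotors.empty());
    for (const auto& r : rotors)
    {
        auto it = rotorFailed.find(r);
        // Assumption: a rotor with no reading yet is treated as healthy.
        if (it == rotorFailed.end() || !it->second)
        {
            return false;
        }
    }
    return true;
}

// Enter failsafe when at least one whole fan has failed.
bool zoneFailSafe(const FanRotors& fans,
                  const std::map<std::string, bool>& rotorFailed)
{
    for (const auto& entry : fans)
    {
        if (fanFailed(entry.second, rotorFailed))
        {
            return true;
        }
    }
    return false;
}
```

With fan1 mapped to fan1a and fan1b, a failure of fan1a alone leaves zoneFailSafe() false; it becomes true only once fan1b has failed as well.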

>
> > >
> > >>
> > >> Will
> > >>>>
> > >>>> BRs,
> > >>>> Will


* Re: about phosphor pid control package
  2019-05-14 12:50         ` Will Liang (梁永鉉)
@ 2019-05-14 16:01           ` James Feist
  0 siblings, 0 replies; 8+ messages in thread
From: James Feist @ 2019-05-14 16:01 UTC (permalink / raw)
  To: Will Liang (梁永鉉), Patrick Venture; +Cc: OpenBMC Maillist


>>>>> But one more question I have is that the above code can only check
>>>>> if a
>>> single fan has failed.
>>>>
>>>>> Our project needs to check for dual-fan failures. Do you have any
>>> suggestions for checking the failure of the dual-fan?
>>>>
>>>> I'm not entirely certain what you mean.  You're saying a dual-fan is
>>>> a fan that has two outputs but one input?
>>>
>>> If this is what you mean, on our systems we simply have a tach sensor
>>> per tach in the fan, i.e. fan1a and fan1b. I think the above logic
>>> would work for this issue.
> I think someone may also need to check for one rotor fan fail, I will push the above code into gerrit
> 
>> I'm so sorry to make you misunderstand because of my unclear expression.
>> The "dual-fan" means "Dual rotor fan" , two tachometer output one PWM
>> input.
> Our architecture is that both fan1a and fan1b fail (dual rotor fan failure) and then this situation is identified as one fan fail.
> 
> Would you have any suggestion or idea?

What we have done in the past is use fan redundancy failures for this
sort of thing. I would suggest monitoring the fan redundancy interface
and treating a loss of redundancy as the failure:
https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Control/FanRedundancy.interface.yaml
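
A minimal sketch of how a redundancy-based decision could look, assuming one redundancy group per dual-rotor fan with one allowed rotor failure. The state names and the evaluate() and shouldFailSafe() helpers are assumptions made for this sketch; the linked YAML defines what the real interface exposes, and a real daemon would read the state over D-Bus instead of computing it locally.

```cpp
#include <cstddef>

// Hypothetical redundancy states for this sketch; check the linked YAML
// for the names the real interface uses.
enum class Redundancy
{
    Full,     // no rotor in the group has failed
    Degraded, // some rotors failed, still within the allowed count
    Failed    // more rotors failed than the group tolerates
};

// Derive the group state from the number of failed rotors.
constexpr Redundancy evaluate(std::size_t failedRotors,
                              std::size_t allowedFailures)
{
    return failedRotors == 0                 ? Redundancy::Full
           : failedRotors <= allowedFailures ? Redundancy::Degraded
                                             : Redundancy::Failed;
}

// Only a lost redundancy group pushes the zone into failsafe mode.
constexpr bool shouldFailSafe(Redundancy state)
{
    return state == Redundancy::Failed;
}
```

For a group of fan1a and fan1b with one allowed failure, losing one rotor gives Degraded and no failsafe, while losing both gives Failed and triggers failsafe.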

-James

