linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ethan zhao <ethan.zhao@oracle.com>
To: Linda Knippers <linda.knippers@hp.com>
Cc: Kristen Carlson Accardi <kristen@linux.intel.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	dirk.j.brandewie@intel.com, viresh.kumar@linaro.org,
	corbet@lwn.net, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	ethan.kernel@gmail.com
Subject: Re: [PATCH 2/2 V7] intel_pstate: add kernel parameter to force loading on Sun X86 servers.
Date: Fri, 05 Dec 2014 13:52:16 +0800	[thread overview]
Message-ID: <54814810.6030302@oracle.com> (raw)
In-Reply-To: <54813B17.8010509@hp.com>

Linda,

On 2014/12/5 12:56, Linda Knippers wrote:
> Hi Ethan,
>
> On 12/4/2014 10:38 PM, ethan zhao wrote:
>> Linda,
>>
>> On 2014/12/5 7:03, Linda Knippers wrote:
>>> On 12/4/2014 5:38 PM, Kristen Carlson Accardi wrote:
>>>> On Thu, 04 Dec 2014 23:10:58 +0100
>>>> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
>>>>
>>>>> On Thursday, December 04, 2014 11:07:31 AM Ethan Zhao wrote:
>>>>>> To force loading on Oracle Sun X86 servers, provide one kernel command line
>>>>>> parameter
>>>>>>
>>>>>>     intel_pstate = ora_force
>>>>> I would suggest to change the name of the option to "oracle_force" or
>>>>> "sun_force"
>>>>> for clarity.
>>>>>
>>>>> Anyway, I need an ACK from Kristen if this patch is to be applied.
>>>>>
>>>>>> For those who be aware of the risk of no power capping capabily working and
>>>>>> try to get better performance with this driver.
>>>>>>
>>>>>> Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
>>>>>> ---
>>>>>>    v2: change to hardware vendor specific naming parameter.
>>>>>>    v4: refine code and doc.
>>>>>>    v5&v6: fix a typo in doc.
>>>>>>    v7: change enum PCC to PPC.
>>>>>>
>>>>>>    Documentation/kernel-parameters.txt | 5 +++++
>>>>>>    drivers/cpufreq/intel_pstate.c      | 6 +++++-
>>>>>>    2 files changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/Documentation/kernel-parameters.txt
>>>>>> b/Documentation/kernel-parameters.txt
>>>>>> index 479f332..7d0983e 100644
>>>>>> --- a/Documentation/kernel-parameters.txt
>>>>>> +++ b/Documentation/kernel-parameters.txt
>>>>>> @@ -1446,6 +1446,11 @@ bytes respectively. Such letter suffixes can also be
>>>>>> entirely omitted.
>>>>>>                   disable
>>>>>>                     Do not enable intel_pstate as the default
>>>>>>                     scaling driver for the supported processors
>>>>>> +               ora_force
>>>>>> +             Force loading intel_pstate on Oracle Sun Servers(X86).
>>>>>> +             only for those who be aware of the risk of no power capping
>>>>>> +             capability working and try to get better performance with this
>>>>>> +             driver.
>>>>> That is not sufficiently clear.  What does "risk of no power capping capability
>>>>> working" mean, in particular?
>>>>>
>>>>>>          intremap=    [X86-64, Intel-IOMMU]
>>>>>>                on    enable Interrupt Remapping (default)
>>>>>> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
>>>>>> index 1bb62ca..2654e13 100644
>>>>>> --- a/drivers/cpufreq/intel_pstate.c
>>>>>> +++ b/drivers/cpufreq/intel_pstate.c
>>>>>> @@ -866,6 +866,7 @@ static struct cpufreq_driver intel_pstate_driver = {
>>>>>>    };
>>>>>>      static int __initdata no_load;
>>>>>> +static unsigned int  ora_force;
>>>>>>      static int intel_pstate_msrs_not_valid(void)
>>>>>>    {
>>>>>> @@ -1003,7 +1004,8 @@ static bool intel_pstate_platform_pwr_mgmt_exists(void)
>>>>>>                case PSS:
>>>>>>                    return intel_pstate_no_acpi_pss();
>>>>>>                case PPC:
>>>>>> -                return intel_pstate_has_acpi_ppc();
>>>>>> +                return intel_pstate_has_acpi_ppc() &&
>>>>>> +                    (!ora_force);
>>>>>>                }
>>>>>>        }
>>>>>>    @@ -1078,6 +1080,8 @@ static int __init intel_pstate_setup(char *str)
>>>>>>          if (!strcmp(str, "disable"))
>>>>>>            no_load = 1;
>>>>>> +    if (!strcmp(str, "ora_force"))
>>>>>> +        ora_force = 1;
>>>>>>        return 0;
>>>>>>    }
>>>>>>    early_param("intel_pstate", intel_pstate_setup);
>>>>> And can anyone please remind me what was wrong with a "force" option that would
>>>>> work for everyone, not just Oracle/Sun?
>>>>>
>>>> That was my suggestion as well (i.e. a parameter to bypass the vendor
>>>> checks), but Linda didn't like it.  My personal opinion is that unless
>>>> it's generic, I don't really feel like having a force option solely for
>>>> oracle.  I'm not convinced you want this for production machines, and I
>>>> think for debug purposes I don't want a vendor specific param.
>>> I'd be happy with it if it somehow disabled what the platform is doing,
>>> but it doesn't.  I don't see the point of forcing intel_pstate if you
>>> can't force the platform to stop doing power management at the same time.
>>> Even if it's for test/debug purposes, I'm not sure what you're testing
>>> when you have dueling power management.
>>    Most of the power management functions is done by SP(service processor) on Sun
>> X86
>>    servers, the 'force' parameter is not supposed to disable whole platform
>> working I think,
>>    with intel_pstate,  it doesn't do CPU power capping issued via _PPC
>> notification. but all
>>   other rest parts of the power management still work. There is no scene as HP
>> proliant OS
>>   mode that OS could control everything(sorry, I don't know Proliant Architecture).
>>
>>   So at least, it doesn't make sense to Oracle Sun X86 servers, provide an OS
>> option to stop
>>   all PM functions even disable ACPI at all.
>>
>>   If the users could be aware of that the power capping doesn't work with CPUs.
>> they could
>>   load intel_pstate driver, though there may be faulty in SP . they still could
>> monitor and
>>   manage the power consumption of other parts in the server.
>>
>>   Perhaps this is what we would test/have tested with intel_pstate.
>>
>>   There is a public manual about PM command in Sun server SP may could help you
>> to understand
>>   the  difference.
>>   https://docs.oracle.com/cd/E19121-01/sf.x4150/820-6412-12/820-6412-12.pdf
> I've tried to put the pieces together so tell me if I've got this right.
   Sorry, I have a little trouble to express complex thing with English, 
or something I don't
  understand them thoroughly makes thing worse :>.
> Under normal circumstances, the Oracle platform wants the OS to do normal
> power management (p-state and c-state management) using the ACPI information
> that the firmware provides.

   OS does the action that selects the State with the information from 
ACPI and governor /users
>    The firwmare or SP will potentially change the
> ACPI information on the fly for things like power capping.  So normally, you
> would want the acpi_cpufreq driver.  If the intel_pstate driver is loaded,
> then that's going to disregard the ACPI information, uncluding the changes
> that the firmware or SP may make when power capping.
   Definitely right.
> There is no case where
> the firmware or SP will try to manage pstates or cstates itself. Is that right?
  Yup,
   I heard of some new CPUs could manage the states by itself ?
>
> If that's right, then I can see how the force option could make sense for
> your platforms.  Sorry it took me so long to get this part.
>
> HP platforms are different.  On our platforms, the platform is configurable
> and customers can choose to have the firmware manage p-states, in which case
> the pcc_cpufreq driver will be loaded to allow the OS to provide hints, or
> to have the OS provide manage p-states, in which case the intel-pstate
> driver will be loaded.  In our case, forcing the intel-pstate driver if
> the platform is configured to have the firmware manage the p-states means
> that both are trying to manage the power.  I don't think that ever makes
> sense.  If the admin wants intel-pstate, its easy to configure the platform
> through the iLO or the BIOS/UEFI setup so that the OS manages the p-states.
>
> If the force option only works if the platform exposes _PPC, then it would
> work with Oracle platforms and not work with HP platforms.  That gives us
> what we want and is also note necessarily vendor specific.  And the good
> news is that's actually how your recent patches work.
   So it is not necessary to bypass HP checking code and just rename the 
'force' to a generic name
   it should works right ?
>
> If the description said something about forcing the intel-pstate driver in
> place of the acpi_cpufreq driver, assuming the processor is supported by
> the intel-pstate driver, then I think we're good with a generic sounding
> boot option.  It should also be clear that someone using force would
> not get the intel-pstate driver on older processors.
   if the old CPU is not supported, the driver will report "ENODEV" error.
>    There's no way
> to force that.  And you could put in whatever warnings you want about
> other features, such as power capping, potentially being disabled if
> the force option is used.
>
> Does this make sense to everyone?  It finally does to me. :-)
   Thanks to your clarification.  it is crystal clear for both HP & 
Oracle X86 servers.
>
> -- ljk
>
>
>
>>> The description would need to be different too since I think on
>>> ProLiant, power capping can happen at any time, even if the
>>> system is in OS control mode and the intel_pstate driver is
>>> loaded.
>>    Does that mean only the CPU power capping not work ? If so, they work the same
>> way.
>>> Can anyone suggest a description for a force option that would
>>> make sense generically?
>>    the 'force' option means CPU power capping (frequency limited) not work to all,
>>    right ?
>>
>>   Thanks,
>>   Ethan
>>> -- ljk
>>>
>>>
>>>


  reply	other threads:[~2014-12-05  5:52 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-04  2:07 [PATCH 2/2 V7] intel_pstate: add kernel parameter to force loading on Sun X86 servers Ethan Zhao
2014-12-04  2:22 ` ethan zhao
2014-12-04 22:10 ` Rafael J. Wysocki
2014-12-04 22:38   ` Kristen Carlson Accardi
2014-12-04 23:03     ` Linda Knippers
2014-12-05  2:05       ` Rafael J. Wysocki
2014-12-05  3:50         ` Linda Knippers
2014-12-05  3:38       ` ethan zhao
2014-12-05  4:56         ` Linda Knippers
2014-12-05  5:52           ` ethan zhao [this message]
2014-12-05  2:23     ` ethan zhao
2014-12-05  2:14   ` ethan zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54814810.6030302@oracle.com \
    --to=ethan.zhao@oracle.com \
    --cc=corbet@lwn.net \
    --cc=dirk.j.brandewie@intel.com \
    --cc=ethan.kernel@gmail.com \
    --cc=kristen@linux.intel.com \
    --cc=linda.knippers@hp.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).