Linux Documentation

Linux Documentation
 help / color / mirror / Atom feed

* Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller
From: Doug Anderson @ 2018-03-08  5:19 UTC (permalink / raw)
  To: Sagar Dharia
  Cc: Jonathan Corbet, Andy Gross, David Brown, Rob Herring,
	Mark Rutland, Wolfram Sang, Greg Kroah-Hartman,
	Karthikeyan Ramasubramanian, linux-doc, linux-arm-msm, devicetree,
	linux-i2c, linux-serial, Jiri Slaby, evgreen, acourbot,
	Girish Mahadevan, swboyd
In-Reply-To: <03c5ed5b-5160-bdd5-82f9-4a8beab33ca8@codeaurora.org>

Hi,

On Wed, Mar 7, 2018 at 6:42 PM, Sagar Dharia <sdharia@codeaurora.org> wrote:
> Hi Doug,
> Thank you for reviewing the patch. I will take a stab at a few comments
> below. We will address most of the other comments in next version of I2C
> patch.
>>
>>
>>
>>> +
>>> +#define I2C_AUTO_SUSPEND_DELAY 250
>>
>>
>> Why 250 ms?  That seems like an eternity.  Is it really that expensive
>> to turn resources off and on?  I would sorta just expect clocks and
>> stuff to get turned off right after a transaction finished unless
>> another one was pending right behind it...
>>
> The response from RPMh to turn on/off shared resources also take quite a few
> msecs. The QUP serial bus block sits quite a few shared-NOCs away from the
> memory and runtime-PM is used a bandwidth vote/NOC vote for these NOCs from
> QUP to memory. Also the RPC between apps and RPMh can sometimes take longer
> depending on other tasks running on apps. This 250 msec avoids thrashing of
> these RPCs between apps and RPMh.
> If you plan to keep these NOCs on forever, then your are right: runtime-PM
> will be only used to turn on/off local clocks and we won't even need
> autosuspend. that's not true on products where this driver is currently
> deployed.

OK, fair enough.  I don't know how RPMh works well enough to argue.
It does seem odd that you'd want to design things such that it takes a
few msecs to pull it out of runtime suspend, especially for touch.


>>> +
>>> +static const struct geni_i2c_clk_fld geni_i2c_clk_map[] = {
>>> +       {KHz(100), 7, 10, 11, 26},
>>> +       {KHz(400), 2,  5, 12, 24},
>>> +       {KHz(1000), 1, 3,  9, 18},
>>
>>
>> So I guess this is all relying on an input serial clock of 19.2MHz?
>> Maybe document that?
>>
>> Assuming I'm understanding the math here, is it really OK for your
>> 100kHz and 1MHz mode to be running slightly fast?
>>
>> 19200. / 2 / 24
>>>>>
>>>>> 400.0
>>
>>
>> 19200. / 7 / 26
>>>>>
>>>>> 105.49450549450549
>>
>>
>> 19200. / 1 / 18
>>>>>
>>>>> 1066.6666666666667
>>
>>
>> It seems like you'd want the fastest clock that you can make that's
>> _less than_ the spec.
>>
>>
>> It would also be interesting to know if it's expected that boards
>> might need to tweak the t_high / t_low depending on their electrical
>> characteristics.  In the past I've had lots of requests from board
>> makers to tweak things because they've got a long trace, or a stronger
>> or weaker pull, or ...  If so we might later need to add some dts
>> properties like "i2c-scl-rising-time-ns" and make the math more
>> dynamic here, unless your hardware somehow automatically adjusts for
>> this type of thing...
>> These values are derived by our HW team to comply with the t-high and
>
> t-low specs of I2C. We have confirmed on scope that the frequency of SCL is
> indeed less than/equal to the spec. We have not come across slaves who have
> needed to tweak these things. We are open to adding these properties in dts
> if you have any such slaves not conforming due to board-layout of other
> reasons.

OK, I'm fine with leaving something like this till later if/when it
comes up.  Documenting a little bit more about how these timings work
seems like it would be nice, though, at least mentioning what the
source clock is...


>>>
>>> +       mode = msg->len > 32 ? GENI_SE_DMA : GENI_SE_FIFO;
>>
>>
>> DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
>> a lot by transferring i2c commands over DMA compared to a FIFO?
>> Enough to justify the code complexity and the set of bugs that will
>> show up?  I'm sure it will be a controversial assertion given that the
>> code's already written, but personally I'd be supportive of ripping
>> DMA mode out to simplify the driver.  I'd be curious if anyone else
>> agrees.  To me it seems like premature optimization.
>
>
> Yes, we have multiple clients (e.g. touch, NFC) using I2C for data transfers
> bigger than 32 bytes (some transfers are 100s of bytes). The fifo size is
> 32, so we can definitely avoid at least 1 interrupt when DMA mode is used
> with data size > 32.

Does that 1-2 interrupts make any real difference, though?  In theory
it really shouldn't affect the transfer rate.  We should be able to
service the interrupt plenty fast and if we were concerned we would
tweak the watermark code a little bit.  So I guess we're worried about
the extra CPU cycles (and power cost) to service those extra couple
interrupts?

In theory when touch data is coming in or NFC data is coming in then
we're probably not in a super low power state to begin with.  If it's
touch data we likely want to have the CPU boosted a bunch to respond
to the user quickly.  If we've got 8 cores available all of which can
run at 1GHz or more a few interrupts won't kill us.  NFC data is
probably not common enough that we need to optimize power/CPU
utilizatoin for that.


So while i can believe that you do save an interrupt or two, I still
am not convinced that those interrupts are worth a bunch of complex
code (and a whole second code path) to save.


...also note that above you said that coming out of runtime suspend
can take several msec.  That seems like it dwarfs any slight
difference in timing between a FIFO-based operation and DMA.


>>> +       geni_se_select_mode(&gi2c->se, mode);
>>> +       writel_relaxed(msg->len, gi2c->se.base + SE_I2C_RX_TRANS_LEN);
>>> +       geni_se_setup_m_cmd(&gi2c->se, I2C_READ, m_param);
>>> +       if (mode == GENI_SE_DMA) {
>>> +               rx_dma = geni_se_rx_dma_prep(&gi2c->se, msg->buf,
>>> msg->len);
>>
>>
>> Randomly I noticed a flag called "I2C_M_DMA_SAFE".  Do we need to
>> check this flag before using msg->buf for DMA?  ...or use
>> i2c_get_dma_safe_msg_buf()?
>>
>> ...btw: the relative lack of people doing this in the kernel is
>> further evidence of DMA not really being worth it for i2c busses.
>
> I cannot comment about other drivers here using or not using DMA since they
> may not be exercised with slaves like NFC?
>>>
>>> +       ret = pm_runtime_get_sync(gi2c->se.dev);
>>> +       if (ret < 0) {
>>> +               dev_err(gi2c->se.dev, "error turning SE resources:%d\n",
>>> ret);
>>> +               pm_runtime_put_noidle(gi2c->se.dev);
>>> +               /* Set device in suspended since resume failed */
>>> +               pm_runtime_set_suspended(gi2c->se.dev);
>>> +               return ret;
>>
>>
>> Wow, that's a cluster of arcane calls to handle a call that probably
>> will never fail (it just enables clocks and sets pinctrl).  Sigh.
>> ...but as far as I can tell the whole sequence is right.  You
>> definitely need a "put" after a failed get and it looks like
>> pm_runtime_set_suspended() has a special exception where it can be
>> called if you got a runtime error...
>
> We didn't have this in before either. But this condition is somewhat
> frequent if I2C transactions are called on cusp of exiting system suspend.
> (e.g. PMIC slave getting a wakeup-IRQ and trying to read from PMIC through
> I2C to read its status as to what caused that wake-up. At that time,
> get_sync doesn't really enable resources (kernel 4.9) since the runtime-pm
> ref-count isn't decremented. We run the risk of unclocked access if these
> arcane calls aren't present. You can go through runtime-pm documentation
> chapter 6 for more details.

Yeah, I certainly agree that the calls are needed if
pm_runtime_get_sync() and I'm not suggesting removing them.  Hence the
"as far as I can tell the whole sequence is right".

...but I'm actually kinda worried if you're saying that you actually
ran into this case.  Hopefully that got fixed and code no longer tries
to read from the PMIC at a bad time anymore?  That code should be
fixed not to be running so late in suspend.


>>> +       ret = devm_request_irq(&pdev->dev, gi2c->irq, geni_i2c_irq,
>>> +                              IRQF_TRIGGER_HIGH, "i2c_geni", gi2c);
>>> +       if (ret) {
>>> +               dev_err(&pdev->dev, "Request_irq failed:%d: err:%d\n",
>>> +                       gi2c->irq, ret);
>>> +               return ret;
>>> +       }
>>> +       disable_irq(gi2c->irq);
>>
>>
>> Can you explain the goal of the disable_irq() here.  Is it actually
>> needed for something or does it somehow save power?  From drivers I've
>> reviewed in the past this doesn't seem like a common thing to do, so
>> I'm curious what it's supposed to gain for you.  I'd be inclined to
>> just delete the whole disable/enable of the irq from this driver.
>
>
> Qualcomm's power team suggests we enable/disable unused IRQs. Otherwise they
> can block apps from entering some low-power mode (unless the interrupt is in
> some list?) I will confirm again with them and let you know.

Since this is weird (to me anyway), please document w/ a comment.


>>>
>>> +       /* Make sure no transactions are pending */
>>> +       ret = i2c_trylock_bus(&gi2c->adap, I2C_LOCK_SEGMENT);
>>> +       if (!ret) {
>>> +               dev_err(gi2c->se.dev, "late I2C transaction request\n");
>>> +               return -EBUSY;
>>> +       }
>>
>>
>> Does this happen?  How?
>>
>> Nothing about this code looks special for your hardware.  If this is
>> really needed, why is it not part of the i2c core since there's
>> nothing specific about your driver here?
>>
> There have been some clients that don't implement sys-suspend/resume
> callbacks (so i2c adapter has no clue they are done with their transactions)
> and this allows us to be flexible when they call I2C transactions extremely
> late.

Still feels like this belongs in the i2c core, not your driver.  Maybe
you should send a patch for the core and remove it from here?

...and also, it seems like any i2c clients that don't implement the
suspend/resume callbacks and try to do i2c transactions late in the
game need to be fixed.  It should be documented that this isn't a
valid thing for a driver to do and if we end up in this error case
then it's not an i2c issue but it's a bad driver somewhere.

>
>>
>>> +       if (!pm_runtime_status_suspended(device)) {
>>> +               geni_i2c_runtime_suspend(device);
>>> +               pm_runtime_disable(device);
>>> +               pm_runtime_set_suspended(device);
>>> +               pm_runtime_enable(device);
>>> +       }
>>
>>
>> Similar question.  Why do you need this special case code?  Are there
>> cases where we're all the way at suspend_noirq and yet we still
>> haven't runtime suspended?  Can you please document how we get into
>> this state?
>> This is when transaction happens less-than 250 msec of the
>
> system-suspend. PM-runtime has not gotten a chance to auto-suspend us since
> timer hasn't expired before system-suspend is attempted. These calls make
> sure that we truly turn off driver resources and make runtime-PM state
> consistent with the HW state. We can document this.

OK.  PM Runtime always gets me mixed up.  Seems really strange that it
wouldn't autosuspend all devices (regardless of timeout) at system
suspend time.

-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 00/17] Include linux trace docs to Sphinx TOC tree
From: Steven Rostedt @ 2018-03-08  3:41 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Du, Changbin, mingo, linux-doc, linux-kernel
In-Reply-To: <20180307104649.4db521a5@lwn.net>

On Wed, 7 Mar 2018 10:46:49 -0700
Jonathan Corbet <corbet@lwn.net> wrote:

> Anyway, with that, the patch series is applied.  Thanks for helping to
> improve the docs, and my apologies for taking so long to get to this.

Thanks for doing this Jon. I'm going to have to find some time to
revisit all the documentation in the tracing directory, because I'm
sure they have fallen out of date.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 00/17] Include linux trace docs to Sphinx TOC tree
From: Du, Changbin @ 2018-03-08  2:42 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Steven Rostedt, Du, Changbin, mingo, linux-doc, linux-kernel
In-Reply-To: <20180307104649.4db521a5@lwn.net>

On Wed, Mar 07, 2018 at 10:46:49AM -0700, Jonathan Corbet wrote:
> On Tue, 27 Feb 2018 17:43:37 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Tue, 27 Feb 2018 17:34:22 +0800
> > "Du, Changbin" <changbin.du@intel.com> wrote:
> >
> > > Ten days past, will you accept this serias? Thank you!
> > 
> > Currently I'm very overloaded with other code that needs to get done
> > ASAP, and I need to balance what is critical and what is not. I don't
> > have time to review this, as this isn't critical, and can wait.
> > 
> > If Jon can review it to make sure that it doesn't change the
> > readability of the text, then I'll trust his judgment. 
> 
> So I've spent a while working through the patches.  I think it's a
> well-done RST conversion carried out with a light hand; I do not believe
> there are any readability issues with the resulting text files.
> 
> I will note that the series adds some new build warnings:
> 
> > Documentation/trace/events.rst:45: WARNING: Inline emphasis start-string without end-string.
> > Documentation/trace/events.rst:49: WARNING: Inline emphasis start-string without end-string.
> > Documentation/trace/events.rst:193: WARNING: Inline emphasis start-string without end-string.
> > Documentation/trace/events.rst:114: WARNING: Unknown target name: "common".
> > Documentation/trace/ftrace.rst:2620: WARNING: Inline emphasis start-string without end-string.
> 
> These point to places where the resulting formatted docs are, in fact,
> incorrect.  I had to append the attached patch to the series to make those
> problems go away.  The warnings are there for a purpose!
> 
> Anyway, with that, the patch series is applied.  Thanks for helping to
> improve the docs, and my apologies for taking so long to get to this.
> 
> jon
> 
I am also appriciated for your review. And I am glade to see these docs can appear
in the new beautiful html documentation! Thnak you.

- changbin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller
From: Sagar Dharia @ 2018-03-08  2:42 UTC (permalink / raw)
  To: Doug Anderson, Jonathan Corbet, Andy Gross, David Brown,
	Rob Herring, Mark Rutland, Wolfram Sang, Greg Kroah-Hartman
  Cc: Karthikeyan Ramasubramanian, linux-doc, linux-arm-msm, devicetree,
	linux-i2c, linux-serial, Jiri Slaby, evgreen, acourbot,
	Girish Mahadevan, swboyd
In-Reply-To: <CAD=FV=WbeP8xpVi7cji9hsNyxEam2D5D+_21ur_MuUO8FK42OQ@mail.gmail.com>

Hi Doug,
Thank you for reviewing the patch. I will take a stab at a few comments 
below. We will address most of the other comments in next version of I2C 
patch.
> 
> 
>> +
>> +#define I2C_AUTO_SUSPEND_DELAY 250
> 
> Why 250 ms?  That seems like an eternity.  Is it really that expensive
> to turn resources off and on?  I would sorta just expect clocks and
> stuff to get turned off right after a transaction finished unless
> another one was pending right behind it...
> 
The response from RPMh to turn on/off shared resources also take quite a 
few msecs. The QUP serial bus block sits quite a few shared-NOCs away 
from the memory and runtime-PM is used a bandwidth vote/NOC vote for 
these NOCs from QUP to memory. Also the RPC between apps and RPMh can 
sometimes take longer depending on other tasks running on apps. This 250 
msec avoids thrashing of these RPCs between apps and RPMh.
If you plan to keep these NOCs on forever, then your are right: 
runtime-PM will be only used to turn on/off local clocks and we won't 
even need autosuspend. that's not true on products where this driver is 
currently deployed.
>> +
>> +static const struct geni_i2c_clk_fld geni_i2c_clk_map[] = {
>> +       {KHz(100), 7, 10, 11, 26},
>> +       {KHz(400), 2,  5, 12, 24},
>> +       {KHz(1000), 1, 3,  9, 18},
> 
> So I guess this is all relying on an input serial clock of 19.2MHz?
> Maybe document that?
> 
> Assuming I'm understanding the math here, is it really OK for your
> 100kHz and 1MHz mode to be running slightly fast?
> 
> 19200. / 2 / 24
>>>> 400.0
> 
> 19200. / 7 / 26
>>>> 105.49450549450549
> 
> 19200. / 1 / 18
>>>> 1066.6666666666667
> 
> It seems like you'd want the fastest clock that you can make that's
> _less than_ the spec.
> 
> 
> It would also be interesting to know if it's expected that boards
> might need to tweak the t_high / t_low depending on their electrical
> characteristics.  In the past I've had lots of requests from board
> makers to tweak things because they've got a long trace, or a stronger
> or weaker pull, or ...  If so we might later need to add some dts
> properties like "i2c-scl-rising-time-ns" and make the math more
> dynamic here, unless your hardware somehow automatically adjusts for
> this type of thing...
>These values are derived by our HW team to comply with the t-high and 
t-low specs of I2C. We have confirmed on scope that the frequency of SCL 
is indeed less than/equal to the spec. We have not come across slaves 
who have needed to tweak these things. We are open to adding these 
properties in dts if you have any such slaves not conforming due to 
board-layout of other reasons.
>> +       mode = msg->len > 32 ? GENI_SE_DMA : GENI_SE_FIFO;
> 
> DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
> a lot by transferring i2c commands over DMA compared to a FIFO?
> Enough to justify the code complexity and the set of bugs that will
> show up?  I'm sure it will be a controversial assertion given that the
> code's already written, but personally I'd be supportive of ripping
> DMA mode out to simplify the driver.  I'd be curious if anyone else
> agrees.  To me it seems like premature optimization.

Yes, we have multiple clients (e.g. touch, NFC) using I2C for data 
transfers bigger than 32 bytes (some transfers are 100s of bytes). The 
fifo size is 32, so we can definitely avoid at least 1 interrupt when 
DMA mode is used with data size > 32.
> 
> 
>> +       geni_se_select_mode(&gi2c->se, mode);
>> +       writel_relaxed(msg->len, gi2c->se.base + SE_I2C_RX_TRANS_LEN);
>> +       geni_se_setup_m_cmd(&gi2c->se, I2C_READ, m_param);
>> +       if (mode == GENI_SE_DMA) {
>> +               rx_dma = geni_se_rx_dma_prep(&gi2c->se, msg->buf, msg->len);
> 
> Randomly I noticed a flag called "I2C_M_DMA_SAFE".  Do we need to
> check this flag before using msg->buf for DMA?  ...or use
> i2c_get_dma_safe_msg_buf()?
> 
> ...btw: the relative lack of people doing this in the kernel is
> further evidence of DMA not really being worth it for i2c busses.
I cannot comment about other drivers here using or not using DMA since 
they may not be exercised with slaves like NFC?
>> +       ret = pm_runtime_get_sync(gi2c->se.dev);
>> +       if (ret < 0) {
>> +               dev_err(gi2c->se.dev, "error turning SE resources:%d\n", ret);
>> +               pm_runtime_put_noidle(gi2c->se.dev);
>> +               /* Set device in suspended since resume failed */
>> +               pm_runtime_set_suspended(gi2c->se.dev);
>> +               return ret;
> 
> Wow, that's a cluster of arcane calls to handle a call that probably
> will never fail (it just enables clocks and sets pinctrl).  Sigh.
> ...but as far as I can tell the whole sequence is right.  You
> definitely need a "put" after a failed get and it looks like
> pm_runtime_set_suspended() has a special exception where it can be
> called if you got a runtime error...
We didn't have this in before either. But this condition is somewhat 
frequent if I2C transactions are called on cusp of exiting system 
suspend. (e.g. PMIC slave getting a wakeup-IRQ and trying to read from 
PMIC through I2C to read its status as to what caused that wake-up. At 
that time, get_sync doesn't really enable resources (kernel 4.9) since 
the runtime-pm ref-count isn't decremented. We run the risk of unclocked 
access if these arcane calls aren't present. You can go through 
runtime-pm documentation chapter 6 for more details.

>> +       ret = devm_request_irq(&pdev->dev, gi2c->irq, geni_i2c_irq,
>> +                              IRQF_TRIGGER_HIGH, "i2c_geni", gi2c);
>> +       if (ret) {
>> +               dev_err(&pdev->dev, "Request_irq failed:%d: err:%d\n",
>> +                       gi2c->irq, ret);
>> +               return ret;
>> +       }
>> +       disable_irq(gi2c->irq);
> 
> Can you explain the goal of the disable_irq() here.  Is it actually
> needed for something or does it somehow save power?  From drivers I've
> reviewed in the past this doesn't seem like a common thing to do, so
> I'm curious what it's supposed to gain for you.  I'd be inclined to
> just delete the whole disable/enable of the irq from this driver.

Qualcomm's power team suggests we enable/disable unused IRQs. Otherwise 
they can block apps from entering some low-power mode (unless the 
interrupt is in some list?) I will confirm again with them and let you know.
>> +       /* Make sure no transactions are pending */
>> +       ret = i2c_trylock_bus(&gi2c->adap, I2C_LOCK_SEGMENT);
>> +       if (!ret) {
>> +               dev_err(gi2c->se.dev, "late I2C transaction request\n");
>> +               return -EBUSY;
>> +       }
> 
> Does this happen?  How?
> 
> Nothing about this code looks special for your hardware.  If this is
> really needed, why is it not part of the i2c core since there's
> nothing specific about your driver here?
> 
There have been some clients that don't implement sys-suspend/resume 
callbacks (so i2c adapter has no clue they are done with their 
transactions) and this allows us to be flexible when they call I2C 
transactions extremely late.

> 
>> +       if (!pm_runtime_status_suspended(device)) {
>> +               geni_i2c_runtime_suspend(device);
>> +               pm_runtime_disable(device);
>> +               pm_runtime_set_suspended(device);
>> +               pm_runtime_enable(device);
>> +       }
> 
> Similar question.  Why do you need this special case code?  Are there
> cases where we're all the way at suspend_noirq and yet we still
> haven't runtime suspended?  Can you please document how we get into
> this state?
> This is when transaction happens less-than 250 msec of the 
system-suspend. PM-runtime has not gotten a chance to auto-suspend us 
since timer hasn't expired before system-suspend is attempted. These 
calls make sure that we truly turn off driver resources and make 
runtime-PM state consistent with the HW state. We can document this.


Thanks
Sagar
-- 
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2] earlycon: Allow specifying a uartclk in options
From: Daniel Kurtz @ 2018-03-07 23:30 UTC (permalink / raw)
  To: Daniel Kurtz
  Cc: adurbin, Brian Norris, Jonathan Corbet, Greg Kroah-Hartman,
	jslaby, mingo, Thomas Gleixner, Christoffer Dall, Paul McKenney,
	marc.zyngier, frederic, dwmw, tom.saeger, zohar, alexander.levin,
	linux-doc, linux-kernel, linux-serial
In-Reply-To: <20180301184335.248378-1-djkurtz@chromium.org>

On Thu, Mar 1, 2018 at 11:43 AM Daniel Kurtz <djkurtz@chromium.org> wrote:

> Currently when an earlycon is registered, the uartclk is assumed to be
> BASE_BAUD * 16 = 1843200.  If a baud rate is specified in the earlycon
> options, then 8250_early's init_port will program the UART clock divider
> registers based on this assumed uartclk.

> However, not all uarts have a UART clock of 1843200.  For example, the
> 8250_dw uart in AMD's CZ/ST uses a fixed 48 MHz clock (as specified in
> cz_uart_desc in acpi_apd.c).  Thus, specifying a baud when using earlycon
> on such a device will result in incorrect divider values and a wrong UART
> clock.

> Fix this by extending the earlycon options parameter to allow
specification
> of a uartclk, like so:

>   earlycon=uart,mmio32,0xfedc6000,115200,48000000

> If none is specified, fall-back to prior behavior - 1843200.

> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>

This general approach is facing resistance, so trying another more targeted
approach to work around the "BASE_BAUD=115200" assumption in arch/x86:

https://patchwork.kernel.org/patch/10265587/
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] [v2] docs: clarify security-bugs disclosure policy
From: Kees Cook @ 2018-03-07 23:14 UTC (permalink / raw)
  To: Dave Hansen
  Cc: LKML, Dan Williams, Thomas Gleixner, Greg KH, Linus Torvalds,
	Alan Cox, Andrea Arcangeli, Andy Lutomirski, Tim Chen, Al Viro,
	Andrew Morton, linux-doc, Jonathan Corbet, Mark Rutland
In-Reply-To: <20180307214624.D4361772@viggo.jf.intel.com>

On Wed, Mar 7, 2018 at 1:46 PM, Dave Hansen <dave.hansen@linux.intel.com> wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> I think we need to soften the language a bit.  It might scare folks
> off, especially the:
>
>          We prefer to fully disclose the bug as soon as possible.
>
> which is not really the case.  Linus says:
>
>         It's not full disclosure, it's not coordinated disclosure,
>         and it's not "no disclosure".  It's more like just "timely
>         open fixes".
>
> I changed a bit of the wording in here, but mostly to remove the word
> "disclosure" since it seems to mean very specific things to people
> that we do not mean here.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Kees Cook <keescook@google.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-doc@vger.kernel.org
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Mark Rutland <mark.rutland@arm.com>
> ---
>
>  b/Documentation/admin-guide/security-bugs.rst |   24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff -puN Documentation/admin-guide/security-bugs.rst~embargo2 Documentation/admin-guide/security-bugs.rst
> --- a/Documentation/admin-guide/security-bugs.rst~embargo2      2018-03-07 13:23:49.390228208 -0800
> +++ b/Documentation/admin-guide/security-bugs.rst       2018-03-07 13:42:37.618225395 -0800
> @@ -29,18 +29,20 @@ made public.
>  Disclosure
>  ----------
>
> -The goal of the Linux kernel security team is to work with the
> -bug submitter to bug resolution as well as disclosure.  We prefer
> -to fully disclose the bug as soon as possible.  It is reasonable to
> -delay disclosure when the bug or the fix is not yet fully understood,
> -the solution is not well-tested or for vendor coordination.  However, we
> -expect these delays to be short, measurable in days, not weeks or months.
> -A disclosure date is negotiated by the security team working with the
> -bug submitter as well as vendors.  However, the kernel security team
> -holds the final say when setting a disclosure date.  The timeframe for
> -disclosure is from immediate (esp. if it's already publicly known)
> +The goal of the Linux kernel security team is to work with the bug
> +submitter to understand and fix the bug.  We prefer to publish the fix as
> +soon as possible, but try to avoid public discussion of the bug itself
> +and leave that to others.
> +
> +Publishing the fix may be delayed when the bug or the fix is not yet
> +fully understood, the solution is not well-tested or for vendor
> +coordination.  However, we expect these delays to be short, measurable in
> +days, not weeks or months.  A release date is negotiated by the security
> +team working with the bug submitter as well as vendors.  However, the
> +kernel security team holds the final say when setting a timeframe.  The
> +timeframe varies from immediate (esp. if it's already publicly known bug)

Nit: I think "a" is missing. I was expecting: "... already a publicly known ...

>  to a few weeks.  As a basic default policy, we expect report date to
> -disclosure date to be on the order of 7 days.
> +release date to be on the order of 7 days.

Otherwise, yeah, looks good.

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

-- 
Kees Cook
Pixel Security
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 2/8] [PATCH 2/8] Documentations: dt-bindings: Add a document of PECI adapter driver for Aspeed AST24xx/25xx SoCs
From: Pavel Machek @ 2018-03-07 22:11 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: joel, andrew, arnd, gregkh, jdelvare, linux, benh, andrew,
	linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc
In-Reply-To: <33bf6563-b220-7ff9-8b04-84e9bd781b3f@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

Hi!

> >Are these SoCs x86-based?
> 
> Yes, these are ARM SoCs. Please see Andrew's answer as well.

Understood, thanks.

> >>+	Read sampling point selection. The whole period of a bit time will be
> >>+	divided into 16 time frames. This value will determine which time frame
> >>+	this controller will sample PECI signal for data read back. Usually in
> >>+	the middle of a bit time is the best.
> >
> >English? "This value will determine when this controller"?
> >
> 
> Could I change it like below?:
> 
> "This value will determine in which time frame this controller samples PECI
> signal for data read back"

I guess... I'm not native speaker, I guess this could be improved some
more.

Best regards,
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* Re: [PATCH] [v2] docs: clarify security-bugs disclosure policy
From: Linus Torvalds @ 2018-03-07 21:53 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Linux Kernel Mailing List, Dan Williams, Thomas Gleixner,
	Greg Kroah-Hartman, One Thousand Gnomes, Andrea Arcangeli,
	Andrew Lutomirski, Kees Cook, Tim Chen, Al Viro, Andrew Morton,
	open list:DOCUMENTATION, Jonathan Corbet, Mark Rutland
In-Reply-To: <20180307214624.D4361772@viggo.jf.intel.com>

That patch looks fine to me, avoiding any terms that might be
misunderstood, and being pretty straightforward.

I'm guessing this will go through Jon?

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH] [v2] docs: clarify security-bugs disclosure policy
From: Dave Hansen @ 2018-03-07 21:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dave Hansen, dan.j.williams, tglx, gregkh, torvalds, gnomes,
	aarcange, luto, keescook, tim.c.chen, viro, akpm, linux-doc,
	corbet, mark.rutland


From: Dave Hansen <dave.hansen@linux.intel.com>

I think we need to soften the language a bit.  It might scare folks
off, especially the:

	 We prefer to fully disclose the bug as soon as possible.

which is not really the case.  Linus says:

	It's not full disclosure, it's not coordinated disclosure,
	and it's not "no disclosure".  It's more like just "timely
	open fixes".

I changed a bit of the wording in here, but mostly to remove the word
"disclosure" since it seems to mean very specific things to people
that we do not mean here.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
---

 b/Documentation/admin-guide/security-bugs.rst |   24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff -puN Documentation/admin-guide/security-bugs.rst~embargo2 Documentation/admin-guide/security-bugs.rst
--- a/Documentation/admin-guide/security-bugs.rst~embargo2	2018-03-07 13:23:49.390228208 -0800
+++ b/Documentation/admin-guide/security-bugs.rst	2018-03-07 13:42:37.618225395 -0800
@@ -29,18 +29,20 @@ made public.
 Disclosure
 ----------
 
-The goal of the Linux kernel security team is to work with the
-bug submitter to bug resolution as well as disclosure.  We prefer
-to fully disclose the bug as soon as possible.  It is reasonable to
-delay disclosure when the bug or the fix is not yet fully understood,
-the solution is not well-tested or for vendor coordination.  However, we
-expect these delays to be short, measurable in days, not weeks or months.
-A disclosure date is negotiated by the security team working with the
-bug submitter as well as vendors.  However, the kernel security team
-holds the final say when setting a disclosure date.  The timeframe for
-disclosure is from immediate (esp. if it's already publicly known)
+The goal of the Linux kernel security team is to work with the bug
+submitter to understand and fix the bug.  We prefer to publish the fix as
+soon as possible, but try to avoid public discussion of the bug itself
+and leave that to others.
+
+Publishing the fix may be delayed when the bug or the fix is not yet
+fully understood, the solution is not well-tested or for vendor
+coordination.  However, we expect these delays to be short, measurable in
+days, not weeks or months.  A release date is negotiated by the security
+team working with the bug submitter as well as vendors.  However, the
+kernel security team holds the final say when setting a timeframe.  The
+timeframe varies from immediate (esp. if it's already publicly known bug)
 to a few weeks.  As a basic default policy, we expect report date to
-disclosure date to be on the order of 7 days.
+release date to be on the order of 7 days.
 
 Coordination
 ------------
_
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 00/13] Remove metag architecture
From: Arnd Bergmann @ 2018-03-07 21:24 UTC (permalink / raw)
  To: James Hogan
  Cc: open list:METAG ARCHITECTURE, Linux Kernel Mailing List,
	Guenter Roeck, Jonathan Corbet, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Jason Cooper,
	Marc Zyngier, Daniel Lezcano, Greg Kroah-Hartman, Jiri Slaby,
	Linus Walleij, Wim Van Sebroeck, Mauro Carvalho Chehab,
	Mauro Carvalho Chehab, Wolfram Sang, open list:DOCUMENTATION,
	Linux-MM, linux-gpio, linux-watchdog, Linux Media Mailing List,
	linux-i2c
In-Reply-To: <20180221233825.10024-1-jhogan@kernel.org>

On Thu, Feb 22, 2018 at 12:38 AM, James Hogan <jhogan@kernel.org> wrote:
> These patches remove the metag architecture and tightly dependent
> drivers from the kernel. With the 4.16 kernel the ancient gcc 4.2.4
> based metag toolchain we have been using is hitting compiler bugs, so
> now seems a good time to drop it altogether.
>
> Quoting from patch 1:
>
> The earliest Meta architecture port of Linux I have a record of was an
> import of a Meta port of Linux v2.4.1 in February 2004, which was worked
> on significantly over the next few years by Graham Whaley, Will Newton,
> Matt Fleming, myself and others.
>
> Eventually the port was merged into mainline in v3.9 in March 2013, not
> long after Imagination Technologies bought MIPS Technologies and shifted
> its CPU focus over to the MIPS architecture.
>
> As a result, though the port was maintained for a while, kept on life
> support for a while longer, and useful for testing a few specific
> drivers for which I don't have ready access to the equivalent MIPS
> hardware, it is now essentially dead with no users.
>
> It is also stuck using an out-of-tree toolchain based on GCC 4.2.4 which
> is no longer maintained, now struggles to build modern kernels due to
> toolchain bugs, and doesn't itself build with a modern GCC. The latest
> buildroot port is still using an old uClibc snapshot which is no longer
> served, and the latest uClibc doesn't build with GCC 4.2.4.
>
> So lets call it a day and drop the Meta architecture port from the
> kernel. RIP Meta.

I've pulled it into my asm-generic tree now, which is also part of linux-next,
and followed up with patches removing frv, m32r, score, unicore32
and blackfin. I have not removed the device drivers yet, but I'm working
on that.

       Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] docs: clarify security-bugs disclosure policy
From: Linus Torvalds @ 2018-03-07 21:21 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Linux Kernel Mailing List, Thomas Gleixner, Greg Kroah-Hartman,
	One Thousand Gnomes, Andrea Arcangeli, Andrew Lutomirski,
	Kees Cook, Tim Chen, Dan Williams, Al Viro, Andrew Morton,
	open list:DOCUMENTATION, Jonathan Corbet, Mark Rutland
In-Reply-To: <20180306233140.268BD8E1@viggo.jf.intel.com>

On Tue, Mar 6, 2018 at 3:31 PM, Dave Hansen <dave.hansen@linux.intel.com> wrote:
>
> I think we need to soften the language a bit.  It might scare folks
> off, especially the:
>
>          We prefer to fully disclose the bug as soon as possible.
>
> which is not really the case.

Ack. What we do is definitely not full disclosure. In fact, we often
actively try to avoid disclosing details and leave that entirely to
others.

We disclose the *patches*, and the explanation of the patch, but not
necessarily anything else (ie no exploit code or even any exploit
discussion). We also don't explicitly disclose the discussion of the
patches or the report, although part of it mayt obviously become more
or less public for other reasons.

So we should probably avoid using a term that means something else to
a lot of people.

And for similar reasons, I don't think the fixed verbiage should use
"coordinated disclosure" either, like in your patch. That usually
means the kind of embargoes that the security list does not honor.

So I think it merits clarification, but maybe just specify the two
things relevant to our disclosure: the fact that the patch and
explanation for the patch gets made public (but not necessarily other
effects), and that the timeframe is very limited.

It's not full disclosure, it's not coordinated disclosure, and it's
not "no disclosure".

It's more like just "timely open fixes".

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [v3, 3/4] i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller
From: Doug Anderson @ 2018-03-07 21:16 UTC (permalink / raw)
  To: Jonathan Corbet, Andy Gross, David Brown, Rob Herring,
	Mark Rutland, Wolfram Sang, Greg Kroah-Hartman
  Cc: Karthikeyan Ramasubramanian, linux-doc, linux-arm-msm, devicetree,
	linux-i2c, linux-serial, Jiri Slaby, evgreen, acourbot,
	Sagar Dharia, Girish Mahadevan, swboyd
In-Reply-To: <1519781889-16117-4-git-send-email-kramasub@codeaurora.org>

Hi,

On Tue, Feb 27, 2018 at 5:38 PM, Karthikeyan Ramasubramanian
<kramasub@codeaurora.org> wrote:
> This bus driver supports the GENI based i2c hardware controller in the
> Qualcomm SOCs. The Qualcomm Generic Interface (GENI) is a programmable
> module supporting a wide range of serial interfaces including I2C. The
> driver supports FIFO mode and DMA mode of transfer and switches modes
> dynamically depending on the size of the transfer.
>
> Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
> Signed-off-by: Sagar Dharia <sdharia@codeaurora.org>
> Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
> ---
>  drivers/i2c/busses/Kconfig         |  11 +
>  drivers/i2c/busses/Makefile        |   1 +
>  drivers/i2c/busses/i2c-qcom-geni.c | 626 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 638 insertions(+)

I'm not an expert on geni (and, to be honest, I haven't read the main
geni patch yet).  ...but I figured I could at least add my $0.02 since
I've stared at i2c bus drivers a lot in the past.  Feel free to tell
me if I'm full or crap...


>  create mode 100644 drivers/i2c/busses/i2c-qcom-geni.c
>
> diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
> index e2954fb..1ddf5cd 100644
> --- a/drivers/i2c/busses/Kconfig
> +++ b/drivers/i2c/busses/Kconfig
> @@ -848,6 +848,17 @@ config I2C_PXA_SLAVE
>           is necessary for systems where the PXA may be a target on the
>           I2C bus.
>
> +config I2C_QCOM_GENI
> +       tristate "Qualcomm Technologies Inc.'s GENI based I2C controller"
> +       depends on ARCH_QCOM
> +       depends on QCOM_GENI_SE
> +       help
> +         If you say yes to this option, support will be included for the
> +         built-in I2C interface on the Qualcomm Technologies Inc.'s SoCs.

Kind of a generic description and this driver is only for new SoCs,
right?  Maybe make it a little more specific?


> +
> +         This driver can also be built as a module.  If so, the module
> +         will be called i2c-qcom-geni.
> +
>  config I2C_QUP
>         tristate "Qualcomm QUP based I2C controller"
>         depends on ARCH_QCOM
> diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
> index 2ce8576..201fce1 100644
> --- a/drivers/i2c/busses/Makefile
> +++ b/drivers/i2c/busses/Makefile
> @@ -84,6 +84,7 @@ obj-$(CONFIG_I2C_PNX)         += i2c-pnx.o
>  obj-$(CONFIG_I2C_PUV3)         += i2c-puv3.o
>  obj-$(CONFIG_I2C_PXA)          += i2c-pxa.o
>  obj-$(CONFIG_I2C_PXA_PCI)      += i2c-pxa-pci.o
> +obj-$(CONFIG_I2C_QCOM_GENI)    += i2c-qcom-geni.o
>  obj-$(CONFIG_I2C_QUP)          += i2c-qup.o
>  obj-$(CONFIG_I2C_RIIC)         += i2c-riic.o
>  obj-$(CONFIG_I2C_RK3X)         += i2c-rk3x.o
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> new file mode 100644
> index 0000000..e1e4268
> --- /dev/null
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -0,0 +1,626 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
> +
> +#include <linux/clk.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/err.h>
> +#include <linux/i2c.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/qcom-geni-se.h>
> +
> +#define SE_I2C_TX_TRANS_LEN            0x26c
> +#define SE_I2C_RX_TRANS_LEN            0x270
> +#define SE_I2C_SCL_COUNTERS            0x278
> +
> +#define SE_I2C_ERR  (M_CMD_OVERRUN_EN | M_ILLEGAL_CMD_EN | M_CMD_FAILURE_EN |\
> +                       M_GP_IRQ_1_EN | M_GP_IRQ_3_EN | M_GP_IRQ_4_EN)
> +#define SE_I2C_ABORT           BIT(1)
> +
> +/* M_CMD OP codes for I2C */
> +#define I2C_WRITE              0x1
> +#define I2C_READ               0x2
> +#define I2C_WRITE_READ         0x3
> +#define I2C_ADDR_ONLY          0x4
> +#define I2C_BUS_CLEAR          0x6
> +#define I2C_STOP_ON_BUS                0x7
> +/* M_CMD params for I2C */
> +#define PRE_CMD_DELAY          BIT(0)
> +#define TIMESTAMP_BEFORE       BIT(1)
> +#define STOP_STRETCH           BIT(2)
> +#define TIMESTAMP_AFTER                BIT(3)
> +#define POST_COMMAND_DELAY     BIT(4)
> +#define IGNORE_ADD_NACK                BIT(6)
> +#define READ_FINISHED_WITH_ACK BIT(7)
> +#define BYPASS_ADDR_PHASE      BIT(8)
> +#define SLV_ADDR_MSK           GENMASK(15, 9)
> +#define SLV_ADDR_SHFT          9
> +/* I2C SCL COUNTER fields */
> +#define HIGH_COUNTER_MSK       GENMASK(29, 20)
> +#define HIGH_COUNTER_SHFT      20
> +#define LOW_COUNTER_MSK                GENMASK(19, 10)
> +#define LOW_COUNTER_SHFT       10
> +#define CYCLE_COUNTER_MSK      GENMASK(9, 0)
> +
> +#define GP_IRQ0                        0
> +#define GP_IRQ1                        1
> +#define GP_IRQ2                        2
> +#define GP_IRQ3                        3
> +#define GP_IRQ4                        4
> +#define GP_IRQ5                        5
> +#define GENI_OVERRUN           6
> +#define GENI_ILLEGAL_CMD       7
> +#define GENI_ABORT_DONE                8
> +#define GENI_TIMEOUT           9

Above should be an enum; then use the enum type as the parameter to
geni_i2c_err() so it's obvious that "err" is not a normal linux error
code.


> +#define I2C_NACK               GP_IRQ1
> +#define I2C_BUS_PROTO          GP_IRQ3
> +#define I2C_ARB_LOST           GP_IRQ4

Get rid of definition of GP_IRQ1, 3, and 4 and just define I2C_NACK,
I2C_BUS_PROTO, and I2C_ARB_LOST directly.


> +#define DM_I2C_CB_ERR          ((BIT(GP_IRQ1) | BIT(GP_IRQ3) | BIT(GP_IRQ4)) \
> +                                                                       << 5)

Should these really be using "GP_IRQ1", "GP_IRQ3", and "GP_IRQ4".
Does this use of those numbers have anything to do with the other use
of them?  Seems like this should just be BIT(1) | BIT(3) | BIT(4).

Said another way: does bit 1 in this field coorespond to NACK, bit 3
correspond to BUS_PROTO, and bit 4 correspond to ARB_LOST?  If not
then I see no reason to try to tie them together.  If they do
correspond then use BIT(I2C_NACK), etc...


> +
> +#define I2C_AUTO_SUSPEND_DELAY 250

Why 250 ms?  That seems like an eternity.  Is it really that expensive
to turn resources off and on?  I would sorta just expect clocks and
stuff to get turned off right after a transaction finished unless
another one was pending right behind it...


> +#define KHz(freq)              (1000 * freq)

I probably wouldn't define KHz macro and just used numbers like 100000
like all the other i2c drivers, but I guess it's OK.  Should be all
caps, though?


> +#define PACKING_BYTES_PW       4
> +
> +struct geni_i2c_dev {
> +       struct geni_se se;
> +       u32 tx_wm;
> +       int irq;
> +       int err;
> +       struct i2c_adapter adap;
> +       struct completion done;
> +       struct i2c_msg *cur;
> +       int cur_wr;
> +       int cur_rd;
> +       u32 clk_freq_out;
> +       const struct geni_i2c_clk_fld *clk_fld;
> +};
> +
> +struct geni_i2c_err_log {
> +       int err;
> +       const char *msg;
> +};
> +
> +static struct geni_i2c_err_log gi2c_log[] = {

static const?


> +       [GP_IRQ0] = {-EINVAL, "Unknown I2C err GP_IRQ0"},
> +       [I2C_NACK] = {-ENOTCONN, "NACK: slv unresponsive, check its power/reset-ln"},

Longer than 80 characters; don't split the string, but you could still
wrap better.


> +       [GP_IRQ2] = {-EINVAL, "Unknown I2C err GP IRQ2"},
> +       [I2C_BUS_PROTO] = {-EPROTO, "Bus proto err, noisy/unepxected start/stop"},
> +       [I2C_ARB_LOST] = {-EBUSY, "Bus arbitration lost, clock line undriveable"},
> +       [GP_IRQ5] = {-EINVAL, "Unknown I2C err GP IRQ5"},
> +       [GENI_OVERRUN] = {-EIO, "Cmd overrun, check GENI cmd-state machine"},
> +       [GENI_ILLEGAL_CMD] = {-EILSEQ, "Illegal cmd, check GENI cmd-state machine"},
> +       [GENI_ABORT_DONE] = {-ETIMEDOUT, "Abort after timeout successful"},
> +       [GENI_TIMEOUT] = {-ETIMEDOUT, "I2C TXN timed out"},
> +};
> +
> +struct geni_i2c_clk_fld {
> +       u32     clk_freq_out;
> +       u8      clk_div;
> +       u8      t_high;
> +       u8      t_low;
> +       u8      t_cycle;
> +};
> +
> +static const struct geni_i2c_clk_fld geni_i2c_clk_map[] = {
> +       {KHz(100), 7, 10, 11, 26},
> +       {KHz(400), 2,  5, 12, 24},
> +       {KHz(1000), 1, 3,  9, 18},

So I guess this is all relying on an input serial clock of 19.2MHz?
Maybe document that?

Assuming I'm understanding the math here, is it really OK for your
100kHz and 1MHz mode to be running slightly fast?

19200. / 2 / 24
>>> 400.0

19200. / 7 / 26
>>> 105.49450549450549

19200. / 1 / 18
>>> 1066.6666666666667

It seems like you'd want the fastest clock that you can make that's
_less than_ the spec.


It would also be interesting to know if it's expected that boards
might need to tweak the t_high / t_low depending on their electrical
characteristics.  In the past I've had lots of requests from board
makers to tweak things because they've got a long trace, or a stronger
or weaker pull, or ...  If so we might later need to add some dts
properties like "i2c-scl-rising-time-ns" and make the math more
dynamic here, unless your hardware somehow automatically adjusts for
this type of thing...


> +};
> +
> +static int geni_i2c_clk_map_idx(struct geni_i2c_dev *gi2c)
> +{
> +       int i;
> +       const struct geni_i2c_clk_fld *itr = geni_i2c_clk_map;
> +
> +       for (i = 0; i < ARRAY_SIZE(geni_i2c_clk_map); i++, itr++) {
> +               if (itr->clk_freq_out == gi2c->clk_freq_out) {
> +                       gi2c->clk_fld = geni_i2c_clk_map + i;

Isn't "geni_i2c_clk_map + i" just "itr"?


> +                       return 0;
> +               }
> +       }
> +       return -EINVAL;
> +}
> +
> +static void qcom_geni_i2c_conf(struct geni_i2c_dev *gi2c)
> +{
> +       const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
> +       u32 val;
> +
> +       writel_relaxed(0, gi2c->se.base + SE_GENI_CLK_SEL);
> +
> +       val = (itr->clk_div << CLK_DIV_SHFT) | SER_CLK_EN;
> +       writel_relaxed(val, gi2c->se.base + GENI_SER_M_CLK_CFG);
> +
> +       val = itr->t_high << HIGH_COUNTER_SHFT;
> +       val |= itr->t_low << LOW_COUNTER_SHFT;
> +       val |= itr->t_cycle;
> +       writel_relaxed(val, gi2c->se.base + SE_I2C_SCL_COUNTERS);
> +       /*
> +        * Ensure later writes/reads to serial engine register block is
> +        * not reordered before this point.
> +        */
> +       mb();

This mb() is to make sure that later writes to "gi2c->se.base" are not
reordered to be above the ones in this function?  You don't need a
mb().  writel_relaxed() already enforces this.


> +}
> +
> +static void geni_i2c_err_misc(struct geni_i2c_dev *gi2c)
> +{
> +       u32 m_cmd = readl_relaxed(gi2c->se.base + SE_GENI_M_CMD0);
> +       u32 m_stat = readl_relaxed(gi2c->se.base + SE_GENI_M_IRQ_STATUS);
> +       u32 geni_s = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
> +       u32 geni_ios = readl_relaxed(gi2c->se.base + SE_GENI_IOS);
> +       u32 dma = readl_relaxed(gi2c->se.base + SE_GENI_DMA_MODE_EN);
> +       u32 rx_st, tx_st;
> +
> +       if (dma) {
> +               rx_st = readl_relaxed(gi2c->se.base + SE_DMA_RX_IRQ_STAT);
> +               tx_st = readl_relaxed(gi2c->se.base + SE_DMA_TX_IRQ_STAT);
> +       } else {
> +               rx_st = readl_relaxed(gi2c->se.base + SE_GENI_RX_FIFO_STATUS);
> +               tx_st = readl_relaxed(gi2c->se.base + SE_GENI_TX_FIFO_STATUS);
> +       }
> +       dev_dbg(gi2c->se.dev, "DMA:%d tx_stat:0x%x, rx_stat:0x%x, irq-stat:0x%x\n",
> +               dma, tx_st, rx_st, m_stat);
> +       dev_dbg(gi2c->se.dev, "m_cmd:0x%x, geni_status:0x%x, geni_ios:0x%x\n",
> +               m_cmd, geni_s, geni_ios);
> +}
> +
> +static void geni_i2c_err(struct geni_i2c_dev *gi2c, int err)
> +{
> +       gi2c->err = gi2c_log[err].err;

You should only set gi2c->err if it was 0 to start with.  You want
"err" to contain the first error, not the last one.  This is
especially important due to the comment elsewhere in this patch "if
this is err with done-bit not set, handle that through timeout".  You
don't want the timeout to clobber the true error.


On a separate note: I wonder if it makes sense to couch the rest of
this function in something that will compile to a no-op if DEBUG and
DYNAMIC_DEBUG aren't defined?  Then you can avoid including code for
all these readl calls.

> +       if (gi2c->cur)
> +               dev_dbg(gi2c->se.dev, "len:%d, slv-addr:0x%x, RD/WR:%d\n",
> +                       gi2c->cur->len, gi2c->cur->addr, gi2c->cur->flags);
> +       dev_dbg(gi2c->se.dev, "%s\n", gi2c_log[err].msg);
> +
> +       if (err != I2C_NACK && err != GENI_ABORT_DONE)
> +               geni_i2c_err_misc(gi2c);
> +}
> +
> +static irqreturn_t geni_i2c_irq(int irq, void *dev)
> +{
> +       struct geni_i2c_dev *gi2c = dev;
> +       int j;
> +       u32 m_stat = readl_relaxed(gi2c->se.base + SE_GENI_M_IRQ_STATUS);
> +       u32 rx_st = readl_relaxed(gi2c->se.base + SE_GENI_RX_FIFO_STATUS);
> +       u32 dm_tx_st = readl_relaxed(gi2c->se.base + SE_DMA_TX_IRQ_STAT);
> +       u32 dm_rx_st = readl_relaxed(gi2c->se.base + SE_DMA_RX_IRQ_STAT);
> +       u32 dma = readl_relaxed(gi2c->se.base + SE_GENI_DMA_MODE_EN);
> +       struct i2c_msg *cur = gi2c->cur;
> +
> +       if (!cur ||
> +           m_stat & (M_CMD_FAILURE_EN | M_CMD_ABORT_EN) ||
> +           dm_rx_st & (DM_I2C_CB_ERR)) {
> +               if (m_stat & M_GP_IRQ_1_EN)
> +                       geni_i2c_err(gi2c, I2C_NACK);
> +               if (m_stat & M_GP_IRQ_3_EN)
> +                       geni_i2c_err(gi2c, I2C_BUS_PROTO);
> +               if (m_stat & M_GP_IRQ_4_EN)
> +                       geni_i2c_err(gi2c, I2C_ARB_LOST);
> +               if (m_stat & M_CMD_OVERRUN_EN)
> +                       geni_i2c_err(gi2c, GENI_OVERRUN);
> +               if (m_stat & M_ILLEGAL_CMD_EN)
> +                       geni_i2c_err(gi2c, GENI_ILLEGAL_CMD);
> +               if (m_stat & M_CMD_ABORT_EN)
> +                       geni_i2c_err(gi2c, GENI_ABORT_DONE);
> +               if (m_stat & M_GP_IRQ_0_EN)
> +                       geni_i2c_err(gi2c, GP_IRQ0);
> +
> +               /* Disable the TX Watermark interrupt to stop TX */
> +               if (!dma)
> +                       writel_relaxed(0, gi2c->se.base +
> +                                          SE_GENI_TX_WATERMARK_REG);
> +               goto irqret;
> +       }
> +
> +       if (dma) {
> +               dev_dbg(gi2c->se.dev, "i2c dma tx:0x%x, dma rx:0x%x\n",
> +                       dm_tx_st, dm_rx_st);
> +               goto irqret;
> +       }
> +
> +       if (cur->flags & I2C_M_RD &&
> +           m_stat & (M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN)) {
> +               u32 rxcnt = rx_st & RX_FIFO_WC_MSK;
> +
> +               for (j = 0; j < rxcnt; j++) {
> +                       u32 val;
> +                       int p = 0;
> +
> +                       val = readl_relaxed(gi2c->se.base + SE_GENI_RX_FIFOn);
> +                       while (gi2c->cur_rd < cur->len && p < sizeof(val)) {
> +                               cur->buf[gi2c->cur_rd++] = val & 0xff;
> +                               val >>= 8;
> +                               p++;
> +                       }
> +                       if (gi2c->cur_rd == cur->len)
> +                               break;
> +               }
> +       } else if (!(cur->flags & I2C_M_RD) &&
> +                  m_stat & M_TX_FIFO_WATERMARK_EN) {
> +               for (j = 0; j < gi2c->tx_wm; j++) {
> +                       u32 temp;
> +                       u32 val = 0;
> +                       int p = 0;
> +
> +                       while (gi2c->cur_wr < cur->len && p < sizeof(val)) {
> +                               temp = (u32)cur->buf[gi2c->cur_wr++];

What is the (u32) cast doing here?


> +                               val |= (temp << (p * 8));

Get rid of extra parenthesis.


> +                               p++;
> +                       }
> +                       writel_relaxed(val, gi2c->se.base + SE_GENI_TX_FIFOn);
> +                       /* TX Complete, Disable the TX Watermark interrupt */
> +                       if (gi2c->cur_wr == cur->len) {
> +                               writel_relaxed(0, gi2c->se.base +
> +                                               SE_GENI_TX_WATERMARK_REG);
> +                               break;
> +                       }
> +               }
> +       }
> +irqret:
> +       if (m_stat)
> +               writel_relaxed(m_stat, gi2c->se.base + SE_GENI_M_IRQ_CLEAR);
> +
> +       if (dma) {
> +               if (dm_tx_st)
> +                       writel_relaxed(dm_tx_st, gi2c->se.base +
> +                                               SE_DMA_TX_IRQ_CLR);
> +               if (dm_rx_st)
> +                       writel_relaxed(dm_rx_st, gi2c->se.base +
> +                                               SE_DMA_RX_IRQ_CLR);
> +       }
> +       /* if this is err with done-bit not set, handle that through timeout. */
> +       if (m_stat & M_CMD_DONE_EN)
> +               complete(&gi2c->done);
> +       else if ((dm_tx_st & TX_DMA_DONE) || (dm_rx_st & RX_DMA_DONE))
> +               complete(&gi2c->done);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static void geni_i2c_abort_xfer(struct geni_i2c_dev *gi2c)
> +{
> +       u32 val;
> +       unsigned long timeout = HZ;

Rename to time_left?  ...and maybe use a #define for the init value?


> +
> +       geni_i2c_err(gi2c, GENI_TIMEOUT);
> +       gi2c->cur = NULL;

Don't you need a spinlock or something?  In most of the other cases
you could get away with no locking because the irq isn't happening at
the same time as other code that's mucking with stuff, but in the
timeout case we may be mucking with stuff at the same time as the irq.


> +       geni_se_abort_m_cmd(&gi2c->se);
> +       do {
> +               timeout = wait_for_completion_timeout(&gi2c->done, timeout);
> +               val = readl_relaxed(gi2c->se.base + SE_GENI_M_IRQ_STATUS);
> +       } while (!(val & M_CMD_ABORT_EN) && timeout);

Print an error if there was a timeout aborting?


> +}
> +
> +static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> +                               u32 m_param)
> +{
> +       dma_addr_t rx_dma;
> +       enum geni_se_xfer_mode mode;
> +       unsigned long timeout;
> +
> +       gi2c->cur = msg;
> +       mode = msg->len > 32 ? GENI_SE_DMA : GENI_SE_FIFO;

DMA is hard and i2c transfers > 32 bytes are rare.  Do we really gain
a lot by transferring i2c commands over DMA compared to a FIFO?
Enough to justify the code complexity and the set of bugs that will
show up?  I'm sure it will be a controversial assertion given that the
code's already written, but personally I'd be supportive of ripping
DMA mode out to simplify the driver.  I'd be curious if anyone else
agrees.  To me it seems like premature optimization.


> +       geni_se_select_mode(&gi2c->se, mode);
> +       writel_relaxed(msg->len, gi2c->se.base + SE_I2C_RX_TRANS_LEN);
> +       geni_se_setup_m_cmd(&gi2c->se, I2C_READ, m_param);
> +       if (mode == GENI_SE_DMA) {
> +               rx_dma = geni_se_rx_dma_prep(&gi2c->se, msg->buf, msg->len);

Randomly I noticed a flag called "I2C_M_DMA_SAFE".  Do we need to
check this flag before using msg->buf for DMA?  ...or use
i2c_get_dma_safe_msg_buf()?

...btw: the relative lack of people doing this in the kernel is
further evidence of DMA not really being worth it for i2c busses.


> +               if (!rx_dma) {
> +                       mode = GENI_SE_FIFO;
> +                       geni_se_select_mode(&gi2c->se, mode);
> +               }
> +       }
> +
> +       timeout = wait_for_completion_timeout(&gi2c->done, HZ);

Perhaps make a #define for the timeout instead of just hardcoding HZ (1 second).


> +       if (!timeout)

Can you rename "timeout" to "time_left"?  Otherwise this read like "if
there wasn't a timeout then abort".


> +               geni_i2c_abort_xfer(gi2c);
> +
> +       gi2c->cur_rd = 0;
> +       if (mode == GENI_SE_DMA) {
> +               if (gi2c->err) {
> +                       writel_relaxed(1, gi2c->se.base + SE_DMA_RX_FSM_RST);
> +                       wait_for_completion_timeout(&gi2c->done, HZ);

Worth printing an error if this one times out?  Seems like we'd be in
bad shape...

...also: to be paranoid do you need a re_init_completion before you
reset things?  In theory one could conceive of the concept that the
earlier completion timed out and then the DMA interrupt came right
after.  Now there will be a completion already on the books so your
wait will return instantly even though the reset hasn't been done.


> +               }
> +               geni_se_rx_dma_unprep(&gi2c->se, rx_dma, msg->len);
> +       }
> +       if (gi2c->err)
> +               dev_err(gi2c->se.dev, "i2c error :%d\n", gi2c->err);

OK, so I'm a bit baffled.  You've got all these tables in this driver
that give you nice/informative error messages.  Then those nice error
messages are just calling dev_dbg() and here you print out an arcane
linux error?

Also: seems like you wouldn't want to print errors for NACKs, right?
Otherwise i2cdetect is going to be spewing isn't it?


> +       return gi2c->err;
> +}
> +
> +static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> +                               u32 m_param)
> +{
> +       dma_addr_t tx_dma;
> +       enum geni_se_xfer_mode mode;
> +       unsigned long timeout;
> +
> +       gi2c->cur = msg;
> +       mode = msg->len > 32 ? GENI_SE_DMA : GENI_SE_FIFO;
> +       geni_se_select_mode(&gi2c->se, mode);
> +       writel_relaxed(msg->len, gi2c->se.base + SE_I2C_TX_TRANS_LEN);
> +       geni_se_setup_m_cmd(&gi2c->se, I2C_WRITE, m_param);
> +       if (mode == GENI_SE_DMA) {
> +               tx_dma = geni_se_tx_dma_prep(&gi2c->se, msg->buf, msg->len);
> +               if (!tx_dma) {
> +                       mode = GENI_SE_FIFO;
> +                       geni_se_select_mode(&gi2c->se, mode);
> +               }
> +       }
> +
> +       if (mode == GENI_SE_FIFO) /* Get FIFO IRQ */
> +               writel_relaxed(1, gi2c->se.base + SE_GENI_TX_WATERMARK_REG);
> +
> +       timeout = wait_for_completion_timeout(&gi2c->done, HZ);
> +       if (!timeout)
> +               geni_i2c_abort_xfer(gi2c);
> +
> +       gi2c->cur_wr = 0;
> +       if (mode == GENI_SE_DMA) {
> +               if (gi2c->err) {
> +                       writel_relaxed(1, gi2c->se.base + SE_DMA_TX_FSM_RST);
> +                       wait_for_completion_timeout(&gi2c->done, HZ);
> +               }
> +               geni_se_tx_dma_unprep(&gi2c->se, tx_dma, msg->len);
> +       }
> +       if (gi2c->err)
> +               dev_err(gi2c->se.dev, "i2c error :%d\n", gi2c->err);
> +       return gi2c->err;
> +}
> +
> +static int geni_i2c_xfer(struct i2c_adapter *adap,
> +                        struct i2c_msg msgs[],
> +                        int num)
> +{
> +       struct geni_i2c_dev *gi2c = i2c_get_adapdata(adap);
> +       int i, ret;
> +
> +       gi2c->err = 0;
> +       reinit_completion(&gi2c->done);
> +       ret = pm_runtime_get_sync(gi2c->se.dev);
> +       if (ret < 0) {
> +               dev_err(gi2c->se.dev, "error turning SE resources:%d\n", ret);
> +               pm_runtime_put_noidle(gi2c->se.dev);
> +               /* Set device in suspended since resume failed */
> +               pm_runtime_set_suspended(gi2c->se.dev);
> +               return ret;

Wow, that's a cluster of arcane calls to handle a call that probably
will never fail (it just enables clocks and sets pinctrl).  Sigh.
...but as far as I can tell the whole sequence is right.  You
definitely need a "put" after a failed get and it looks like
pm_runtime_set_suspended() has a special exception where it can be
called if you got a runtime error...


> +       }
> +
> +       qcom_geni_i2c_conf(gi2c);
> +       for (i = 0; i < num; i++) {
> +               u32 m_param = i < (num - 1) ? STOP_STRETCH : 0;
> +
> +               m_param |= ((msgs[i].addr << SLV_ADDR_SHFT) & SLV_ADDR_MSK);
> +
> +               if (msgs[i].flags & I2C_M_RD)
> +                       ret = geni_i2c_rx_one_msg(gi2c, &msgs[i], m_param);
> +               else
> +                       ret = geni_i2c_tx_one_msg(gi2c, &msgs[i], m_param);
> +
> +               if (ret) {
> +                       dev_err(gi2c->se.dev, "i2c error %d @ %d\n", ret, i);
> +                       break;
> +               }
> +       }
> +       if (ret == 0)
> +               ret = num;
> +
> +       pm_runtime_mark_last_busy(gi2c->se.dev);
> +       pm_runtime_put_autosuspend(gi2c->se.dev);
> +       gi2c->cur = NULL;
> +       gi2c->err = 0;
> +       return ret;
> +}
> +
> +static u32 geni_i2c_func(struct i2c_adapter *adap)
> +{
> +       return I2C_FUNC_I2C | (I2C_FUNC_SMBUS_EMUL & ~I2C_FUNC_SMBUS_QUICK);
> +}
> +
> +static const struct i2c_algorithm geni_i2c_algo = {
> +       .master_xfer    = geni_i2c_xfer,
> +       .functionality  = geni_i2c_func,
> +};
> +
> +static int geni_i2c_probe(struct platform_device *pdev)
> +{
> +       struct geni_i2c_dev *gi2c;
> +       struct resource *res;
> +       u32 proto, tx_depth;
> +       int ret;
> +
> +       gi2c = devm_kzalloc(&pdev->dev, sizeof(*gi2c), GFP_KERNEL);
> +       if (!gi2c)
> +               return -ENOMEM;
> +
> +       gi2c->se.dev = &pdev->dev;
> +       gi2c->se.wrapper = dev_get_drvdata(pdev->dev.parent);
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       gi2c->se.base = devm_ioremap_resource(&pdev->dev, res);
> +       if (IS_ERR(gi2c->se.base)) {
> +               ret = PTR_ERR(gi2c->se.base);
> +               dev_err(&pdev->dev, "Err IO Mapping register block %d\n", ret);

No need for error message with devm_ioremap_resource().


> +               return ret;
> +       }
> +
> +       gi2c->se.clk = devm_clk_get(&pdev->dev, "se");
> +       if (IS_ERR(gi2c->se.clk)) {
> +               ret = PTR_ERR(gi2c->se.clk);
> +               dev_err(&pdev->dev, "Err getting SE Core clk %d\n", ret);
> +               return ret;
> +       }
> +
> +       ret = device_property_read_u32(&pdev->dev, "clock-frequency",
> +                                                       &gi2c->clk_freq_out);
> +       if (ret) {
> +               dev_info(&pdev->dev,
> +                       "Bus frequency not specified, default to 400KHz.\n");
> +               gi2c->clk_freq_out = KHz(400);
> +       }

I feel like it should default to 100KHz.  i2c_parse_fw_timings()
defaults to this and to me the wording "New drivers almost always
should use the defaults" makes me feel this should be the defaults.

> +
> +       gi2c->irq = platform_get_irq(pdev, 0);
> +       if (gi2c->irq < 0) {
> +               dev_err(&pdev->dev, "IRQ error for i2c-geni\n");
> +               return gi2c->irq;
> +       }
> +
> +       ret = geni_i2c_clk_map_idx(gi2c);
> +       if (ret) {
> +               dev_err(&pdev->dev, "Invalid clk frequency %d KHz: %d\n",
> +                       gi2c->clk_freq_out, ret);

Need a divide by 1000 since your printout includes "KHz".  Also note
that the proper Si units is kHz not KHz, isn't it?


> +               return ret;
> +       }
> +
> +       gi2c->adap.algo = &geni_i2c_algo;
> +       init_completion(&gi2c->done);
> +       platform_set_drvdata(pdev, gi2c);
> +       ret = devm_request_irq(&pdev->dev, gi2c->irq, geni_i2c_irq,
> +                              IRQF_TRIGGER_HIGH, "i2c_geni", gi2c);
> +       if (ret) {
> +               dev_err(&pdev->dev, "Request_irq failed:%d: err:%d\n",
> +                       gi2c->irq, ret);
> +               return ret;
> +       }
> +       disable_irq(gi2c->irq);

Can you explain the goal of the disable_irq() here.  Is it actually
needed for something or does it somehow save power?  From drivers I've
reviewed in the past this doesn't seem like a common thing to do, so
I'm curious what it's supposed to gain for you.  I'd be inclined to
just delete the whole disable/enable of the irq from this driver.


> +       i2c_set_adapdata(&gi2c->adap, gi2c);
> +       gi2c->adap.dev.parent = &pdev->dev;
> +       gi2c->adap.dev.of_node = pdev->dev.of_node;
> +       strlcpy(gi2c->adap.name, "Geni-I2C", sizeof(gi2c->adap.name));
> +
> +       ret = geni_se_resources_on(&gi2c->se);
> +       if (ret) {
> +               dev_err(&pdev->dev, "Error turning on resources %d\n", ret);
> +               return ret;
> +       }
> +       proto = geni_se_read_proto(&gi2c->se);
> +       tx_depth = geni_se_get_tx_fifo_depth(&gi2c->se);
> +       if (unlikely(proto != GENI_SE_I2C)) {

Avoid compiler hints like unlikely() unless you're really truly
optimizing a tight inner loop.  Otherwise let the compiler do its job.


> +               dev_err(&pdev->dev, "Invalid proto %d\n", proto);
> +               geni_se_resources_off(&gi2c->se);
> +               return -ENXIO;
> +       }
> +       gi2c->tx_wm = tx_depth - 1;
> +       geni_se_init(&gi2c->se, gi2c->tx_wm, tx_depth);
> +       geni_se_config_packing(&gi2c->se, BITS_PER_BYTE, PACKING_BYTES_PW,
> +                                                       true, true, true);
> +       geni_se_resources_off(&gi2c->se);
> +       dev_dbg(&pdev->dev, "i2c fifo/se-dma mode. fifo depth:%d\n", tx_depth);
> +
> +       pm_runtime_set_suspended(gi2c->se.dev);
> +       pm_runtime_set_autosuspend_delay(gi2c->se.dev, I2C_AUTO_SUSPEND_DELAY);
> +       pm_runtime_use_autosuspend(gi2c->se.dev);
> +       pm_runtime_enable(gi2c->se.dev);
> +       i2c_add_adapter(&gi2c->adap);
> +
> +       dev_dbg(&pdev->dev, "I2C probed\n");

Is this really a useful dev_dbg()?  Just turn on initcall debugging...


> +       return 0;
> +}
> +
> +static int geni_i2c_remove(struct platform_device *pdev)
> +{
> +       struct geni_i2c_dev *gi2c = platform_get_drvdata(pdev);
> +
> +       pm_runtime_disable(gi2c->se.dev);

Is this right?  You don't want to disable PM Runtime transitions but
you actually want to force the adapter into suspended state, don't
you?  I don't see anything that does that...


> +       i2c_del_adapter(&gi2c->adap);
> +       return 0;
> +}
> +
> +static int geni_i2c_resume_noirq(struct device *device)
> +{
> +       return 0;
> +}

No need for a dummy function; just stick NULL in the structure, no?

> +
> +#ifdef CONFIG_PM
> +static int geni_i2c_runtime_suspend(struct device *dev)
> +{
> +       struct geni_i2c_dev *gi2c = dev_get_drvdata(dev);
> +
> +       disable_irq(gi2c->irq);
> +       geni_se_resources_off(&gi2c->se);
> +       return 0;
> +}
> +
> +static int geni_i2c_runtime_resume(struct device *dev)
> +{
> +       int ret;
> +       struct geni_i2c_dev *gi2c = dev_get_drvdata(dev);
> +
> +       ret = geni_se_resources_on(&gi2c->se);
> +       if (ret)
> +               return ret;
> +
> +       enable_irq(gi2c->irq);
> +       return 0;
> +}
> +
> +static int geni_i2c_suspend_noirq(struct device *device)
> +{
> +       struct geni_i2c_dev *gi2c = dev_get_drvdata(device);
> +       int ret;
> +
> +       /* Make sure no transactions are pending */
> +       ret = i2c_trylock_bus(&gi2c->adap, I2C_LOCK_SEGMENT);
> +       if (!ret) {
> +               dev_err(gi2c->se.dev, "late I2C transaction request\n");
> +               return -EBUSY;
> +       }

Does this happen?  How?

Nothing about this code looks special for your hardware.  If this is
really needed, why is it not part of the i2c core since there's
nothing specific about your driver here?


> +       if (!pm_runtime_status_suspended(device)) {
> +               geni_i2c_runtime_suspend(device);
> +               pm_runtime_disable(device);
> +               pm_runtime_set_suspended(device);
> +               pm_runtime_enable(device);
> +       }

Similar question.  Why do you need this special case code?  Are there
cases where we're all the way at suspend_noirq and yet we still
haven't runtime suspended?  Can you please document how we get into
this state?


> +       i2c_unlock_bus(&gi2c->adap, I2C_LOCK_SEGMENT);
> +       return 0;
> +}
> +#else
> +static int geni_i2c_runtime_suspend(struct device *dev)
> +{
> +       return 0;
> +}
> +
> +static int geni_i2c_runtime_resume(struct device *dev)
> +{
> +       return 0;
> +}
> +
> +static int geni_i2c_suspend_noirq(struct device *device)
> +{
> +       return 0;
> +}
> +#endif
> +
> +static const struct dev_pm_ops geni_i2c_pm_ops = {
> +       .suspend_noirq          = geni_i2c_suspend_noirq,
> +       .resume_noirq           = geni_i2c_resume_noirq,
> +       .runtime_suspend        = geni_i2c_runtime_suspend,
> +       .runtime_resume         = geni_i2c_runtime_resume,

Please use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS and SET_RUNTIME_PM_OPS.  Then
you can get rid of all the dummy functions.  AKA something like:

static const struct dev_pm_ops geni_i2c_pm_ops = {
  SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(geni_i2c_suspend_noirq, NULL)
  SET_RUNTIME_PM_OPS(geni_i2c_runtime_suspend, geni_i2c_runtime_resume, NULL)
};


> +};
> +
> +static const struct of_device_id geni_i2c_dt_match[] = {
> +       { .compatible = "qcom,geni-i2c" },
> +       {}
> +};
> +MODULE_DEVICE_TABLE(of, geni_i2c_dt_match);
> +
> +static struct platform_driver geni_i2c_driver = {
> +       .probe  = geni_i2c_probe,
> +       .remove = geni_i2c_remove,
> +       .driver = {
> +               .name = "geni_i2c",
> +               .pm = &geni_i2c_pm_ops,
> +               .of_match_table = geni_i2c_dt_match,
> +       },
> +};
> +
> +module_platform_driver(geni_i2c_driver);
> +
> +MODULE_DESCRIPTION("I2C Controller Driver for GENI based QUP cores");
> +MODULE_LICENSE("GPL v2");
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 1/8] [PATCH 1/8] drivers/peci: Add support for PECI bus driver core
From: Jae Hyun Yoo @ 2018-03-07 19:03 UTC (permalink / raw)
  To: Julia Cartwright
  Cc: joel, andrew, arnd, gregkh, jdelvare, linux, benh, andrew,
	linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc
In-Reply-To: <20180307031907.GD1924@kryptos.localdomain>

Hi Julia,

Thanks for sharing your time on reviewing it. Please see my inline answers.

Jae

On 3/6/2018 7:19 PM, Julia Cartwright wrote:
> On Wed, Feb 21, 2018 at 08:15:59AM -0800, Jae Hyun Yoo wrote:
>> This commit adds driver implementation for PECI bus into linux
>> driver framework.
>>
>> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
>> ---
> [..]
>> +static int peci_locked_xfer(struct peci_adapter *adapter,
>> +			    struct peci_xfer_msg *msg,
>> +			    bool do_retry,
>> +			    bool has_aw_fcs)
> 
> _locked generally means that this function is invoked with some critical
> lock held, what lock does the caller need to acquire before invoking
> this function?
> 

I intended to show that this function has a mutex locking inside for 
serialization of PECI data transactions from multiple callers, but as 
you commented out below, the mutex protection scope should be adjusted 
to make that covers the peci_scan_cmd_mask() function too. I'll rewrite 
the mutex protection scope then this function will be in the locked scope.

>> +{
>> +	ktime_t start, end;
>> +	s64 elapsed_ms;
>> +	int rc = 0;
>> +
>> +	if (!adapter->xfer) {
> 
> Is this really an optional feature of an adapter?  If this is not
> optional, then this check should be in place when the adapter is
> registered, not here.  (And it should WARN_ON(), because it's a driver
> developer error).
> 

I agree with you. I'll move this code into the peci_register_adapter() 
function.

>> +		dev_dbg(&adapter->dev, "PECI level transfers not supported\n");
>> +		return -ENODEV;
>> +	}
>> +
>> +	if (in_atomic() || irqs_disabled()) {
> 
> As Andrew mentioned, this is broken.
> 
> You don't even need a might_sleep().  The locking functions you use here
> will already include a might_sleep() w/ CONFIG_DEBUG_ATOMIC_SLEEP.
> 

Thanks for letting me know that. I'll drop that checking code and 
might_sleep() too.

>> +		rt_mutex_trylock(&adapter->bus_lock);
>> +		if (!rc)
>> +			return -EAGAIN; /* PECI activity is ongoing */
>> +	} else {
>> +		rt_mutex_lock(&adapter->bus_lock);
>> +	}
>> +
>> +	if (do_retry)
>> +		start = ktime_get();
>> +
>> +	do {
>> +		rc = adapter->xfer(adapter, msg);
>> +
>> +		if (!do_retry)
>> +			break;
>> +
>> +		/* Per the PECI spec, need to retry commands that return 0x8x */
>> +		if (!(!rc && ((msg->rx_buf[0] & DEV_PECI_CC_RETRY_ERR_MASK) ==
>> +			      DEV_PECI_CC_TIMEOUT)))
>> +			break;
> 
> This is pretty difficult to parse.  Can you split it into two different
> conditions?
> 

Sure. I'll split it out.

>> +
>> +		/* Set the retry bit to indicate a retry attempt */
>> +		msg->tx_buf[1] |= DEV_PECI_RETRY_BIT;
> 
> Are you sure this bit is to be set in the _second_ byte of tx_buf?
> 

Yes, I'm pretty sure. The first byte contains a PECI command value and 
the second byte contains 'HostID[7:1] & Retry[0]' value.

>> +
>> +		/* Recalculate the AW FCS if it has one */
>> +		if (has_aw_fcs)
>> +			msg->tx_buf[msg->tx_len - 1] = 0x80 ^
>> +						peci_aw_fcs((u8 *)msg,
>> +							    2 + msg->tx_len);
>> +
>> +		/* Retry for at least 250ms before returning an error */
>> +		end = ktime_get();
>> +		elapsed_ms = ktime_to_ms(ktime_sub(end, start));
>> +		if (elapsed_ms >= DEV_PECI_RETRY_TIME_MS) {
>> +			dev_dbg(&adapter->dev, "Timeout retrying xfer!\n");
>> +			break;
>> +		}
>> +	} while (true);
>> +
>> +	rt_mutex_unlock(&adapter->bus_lock);
>> +
>> +	return rc;
>> +}
>> +
>> +static int peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg)
>> +{
>> +	return peci_locked_xfer(adapter, msg, false, false);
>> +}
>> +
>> +static int peci_xfer_with_retries(struct peci_adapter *adapter,
>> +				  struct peci_xfer_msg *msg,
>> +				  bool has_aw_fcs)
>> +{
>> +	return peci_locked_xfer(adapter, msg, true, has_aw_fcs);
>> +}
>> +
>> +static int peci_scan_cmd_mask(struct peci_adapter *adapter)
>> +{
>> +	struct peci_xfer_msg msg;
>> +	u32 dib;
>> +	int rc = 0;
>> +
>> +	/* Update command mask just once */
>> +	if (adapter->cmd_mask & BIT(PECI_CMD_PING))
>> +		return 0;
>> +
>> +	msg.addr      = PECI_BASE_ADDR;
>> +	msg.tx_len    = GET_DIB_WR_LEN;
>> +	msg.rx_len    = GET_DIB_RD_LEN;
>> +	msg.tx_buf[0] = GET_DIB_PECI_CMD;
>> +
>> +	rc = peci_xfer(adapter, &msg);
>> +	if (rc < 0) {
>> +		dev_dbg(&adapter->dev, "PECI xfer error, rc : %d\n", rc);
>> +		return rc;
>> +	}
>> +
>> +	dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
>> +	      (msg.rx_buf[2] << 16) | (msg.rx_buf[3] << 24);
>> +
>> +	/* Check special case for Get DIB command */
>> +	if (dib == 0x00) {
>> +		dev_dbg(&adapter->dev, "DIB read as 0x00\n");
>> +		return -1;
>> +	}
>> +
>> +	if (!rc) {
> 
> You should change this to:
> 
> 	if (rc) {
> 		dev_dbg(&adapter->dev, "Error reading DIB, rc : %d\n", rc);
> 		return rc;
> 	}
> 
> And then leave the happy path below unindented.
> 

Agreed. That would be neater. Will rewrite it. Thanks!

>> +		/**
>> +		 * setting up the supporting commands based on minor rev#
>> +		 * see PECI Spec Table 3-1
>> +		 */
>> +		dib = (dib >> 8) & 0xF;
>> +
>> +		if (dib >= 0x1) {
>> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_PKG_CFG);
>> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_PKG_CFG);
>> +		}
>> +
>> +		if (dib >= 0x2)
>> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_IA_MSR);
>> +
>> +		if (dib >= 0x3) {
>> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG_LOCAL);
>> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG_LOCAL);
>> +		}
>> +
>> +		if (dib >= 0x4)
>> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG);
>> +
>> +		if (dib >= 0x5)
>> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG);
>> +
>> +		if (dib >= 0x6)
>> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_IA_MSR);
>> +
>> +		adapter->cmd_mask |= BIT(PECI_CMD_GET_TEMP);
>> +		adapter->cmd_mask |= BIT(PECI_CMD_GET_DIB);
>> +		adapter->cmd_mask |= BIT(PECI_CMD_PING);
> 
> These cmd_mask updates are not done with any locking in mind.  Is this
> intentional?  Or: is synchronization not necessary because this is
> always done during enumeration prior to exposing the adapter to users?
> 

Thanks for the pointing it out. This function should be done in a locked 
scope as you said. I'll adjust mutex protection scope to make that 
covers this function as well.

>> +	} else {
>> +		dev_dbg(&adapter->dev, "Error reading DIB, rc : %d\n", rc);
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>> +static int peci_cmd_support(struct peci_adapter *adapter, enum peci_cmd cmd)
>> +{
>> +	if (!(adapter->cmd_mask & BIT(PECI_CMD_PING)) &&
>> +	    peci_scan_cmd_mask(adapter) < 0) {
>> +		dev_dbg(&adapter->dev, "Failed to scan command mask\n");
>> +		return -EIO;
>> +	}
>> +
>> +	if (!(adapter->cmd_mask & BIT(cmd))) {
>> +		dev_dbg(&adapter->dev, "Command %d is not supported\n", cmd);
>> +		return -EINVAL;
>> +	}
> 
> It would be nicer if you did this check prior to dispatching to the
> various subfunctions (peci_ioctl_ping, peci_ioctl_get_dib, etc.).  In
> that way, these functions could just assume the adapter supports them.
> 

Agreed. I'll drop all individual calls from subfunctions and will call 
it from peci_command().

> [..]
>> +static int peci_register_adapter(struct peci_adapter *adapter)
>> +{
>> +	int res = -EINVAL;
>> +
>> +	/* Can't register until after driver model init */
>> +	if (WARN_ON(!is_registered)) {
> 
> Is this solving a problem you actually ran into?
> 

Generally, an adapter driver registration will be happened after the 
PECI bus registration because peci_init uses postcore_initcall, but in 
case of incorrect implementation of an adapter driver which uses
a preceding postcore_initcall or a core_initcall as its module init, 
then an adapter registration would be prior to bus registration. This 
code is an exceptional case handling for that to warn the incorrect 
adapter driver implementation.

> [.. skipped review due to fatigue ..]
> 
>> +++ b/include/linux/peci.h
>> @@ -0,0 +1,97 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2018 Intel Corporation
>> +
>> +#ifndef __LINUX_PECI_H
>> +#define __LINUX_PECI_H
>> +
>> +#include <linux/cdev.h>
>> +#include <linux/device.h>
>> +#include <linux/peci-ioctl.h>
>> +#include <linux/rtmutex.h>
>> +
>> +#define PECI_BUFFER_SIZE  32
>> +#define PECI_NAME_SIZE    32
>> +
>> +struct peci_xfer_msg {
>> +	u8	addr;
>> +	u8	tx_len;
>> +	u8	rx_len;
>> +	u8	tx_buf[PECI_BUFFER_SIZE];
>> +	u8	rx_buf[PECI_BUFFER_SIZE];
>> +} __attribute__((__packed__));
> 
> The packed attribute has historically caused gcc to emit atrocious code,
> as it seems to assume packed implies members might not be naturally
> aligned.  Seeing as you're only working with u8s in this case, though,
> this shouldn't be a problem.
> 

It should be a packed struct because it is also being used for CRC8 
calculation which is treating it as a contiguous byte array.

>> +struct peci_board_info {
>> +	char			type[PECI_NAME_SIZE];
>> +	u8			addr;	/* CPU client address */
>> +	struct device_node	*of_node;
>> +};
>> +
>> +struct peci_adapter {
>> +	struct module	*owner;
>> +	struct rt_mutex	bus_lock;
> 
> Why an rt_mutex, instead of a regular mutex.  Do you explicitly need PI
> in mainline?
> 

Currently this implementation has only a temperature monitoring sideband 
feature but other sideband features such as CPU error detection and 
crash dump will be implemented later, and those additional sideband 
features should have higher priority than the temperature monitoring 
feature so it is the reason why I used an rt_mutex.

>> +	struct device	dev;
>> +	struct cdev	cdev;
>> +	int		nr;
>> +	char		name[PECI_NAME_SIZE];
>> +	int		(*xfer)(struct peci_adapter *adapter,
>> +				struct peci_xfer_msg *msg);
>> +	uint		cmd_mask;
>> +};
>> +
>> +#define to_peci_adapter(d) container_of(d, struct peci_adapter, dev)
> 
> You can also do this with a static inline, which provides a marginally
> better error when screwed up.
> 

Agreed. That would be more helpful for debugging in debug build. I'll 
rewrite the macro to a static inline like below:

static inline struct peci_adapter *to_peci_adapter(void *d)
{
	return container_of(d, struct peci_adapter, dev);
}

>     Julia
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 2/2] docs: add Co-Developed-by docs
From: Dominik Brodowski @ 2018-03-07 18:07 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Tobin C. Harding, Andrew Morton, Greg Kroah-Hartman, Joe Perches,
	Randy Dunlap, Thomas Gleixner, linux-kernel, linux-doc
In-Reply-To: <20180307101614.6722b592@lwn.net>

On Wed, Mar 07, 2018 at 10:16:14AM -0700, Jonathan Corbet wrote:
> On Mon,  5 Mar 2018 14:58:21 +1100
> "Tobin C. Harding" <me@tobin.cc> wrote:
> 
> > When Co-Developed-by tag was added, docs were only added to
> > Documention/process/5.Posting.rst and were not added to
> > Documention/process/submitting-patches.rst
> > 
> > Add documentation to Documention/process/submitting-patches.rst
> 
> Someday we should really pull those documents together into a coherent
> whole so we don't have these weird synchronization issues.  But it's much
> easier to just apply this, so that's what I've done for now, with Willy's
> comma tweak.

In case you still have my patch to Documentation/process/5.Posting.rst
pending, we'll have a mismatch between Co-developed-by there (which has been
used a number of times in the past), and Co-Developed-by in
submitting-patches.rst and checkpatch and Co-developed-by (which was used
exactly once in Linus' tree). That should be coherent at least...

Thanks,
	Dominik
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 00/17] Include linux trace docs to Sphinx TOC tree
From: Jonathan Corbet @ 2018-03-07 17:46 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Du, Changbin, mingo, linux-doc, linux-kernel
In-Reply-To: <20180227174337.37ecb348@vmware.local.home>

On Tue, 27 Feb 2018 17:43:37 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 27 Feb 2018 17:34:22 +0800
> "Du, Changbin" <changbin.du@intel.com> wrote:
>
> > Ten days past, will you accept this serias? Thank you!
> 
> Currently I'm very overloaded with other code that needs to get done
> ASAP, and I need to balance what is critical and what is not. I don't
> have time to review this, as this isn't critical, and can wait.
> 
> If Jon can review it to make sure that it doesn't change the
> readability of the text, then I'll trust his judgment. 

So I've spent a while working through the patches.  I think it's a
well-done RST conversion carried out with a light hand; I do not believe
there are any readability issues with the resulting text files.

I will note that the series adds some new build warnings:

> Documentation/trace/events.rst:45: WARNING: Inline emphasis start-string without end-string.
> Documentation/trace/events.rst:49: WARNING: Inline emphasis start-string without end-string.
> Documentation/trace/events.rst:193: WARNING: Inline emphasis start-string without end-string.
> Documentation/trace/events.rst:114: WARNING: Unknown target name: "common".
> Documentation/trace/ftrace.rst:2620: WARNING: Inline emphasis start-string without end-string.

These point to places where the resulting formatted docs are, in fact,
incorrect.  I had to append the attached patch to the series to make those
problems go away.  The warnings are there for a purpose!

Anyway, with that, the patch series is applied.  Thanks for helping to
improve the docs, and my apologies for taking so long to get to this.

jon

From 6234c7bd8c14508fb76c0a4d6f01eb81c8ce9cbf Mon Sep 17 00:00:00 2001
From: Jonathan Corbet <corbet@lwn.net>
Date: Wed, 7 Mar 2018 10:44:08 -0700
Subject: [PATCH] docs: ftrace: fix a few formatting issues

Make sure that literal * characters are not interpreted as emphasis
markers.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 Documentation/trace/events.rst | 10 +++++-----
 Documentation/trace/ftrace.rst |  8 ++++----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 27bfd06ae29d..bdf1963ba6ba 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -42,7 +42,7 @@ To disable all events, echo an empty line to the set_event file::
 
 	# echo > /sys/kernel/debug/tracing/set_event
 
-To enable all events, echo '*:*' or '*:' to the set_event file::
+To enable all events, echo ``*:*`` or ``*:`` to the set_event file::
 
 	# echo *:* > /sys/kernel/debug/tracing/set_event
 
@@ -50,7 +50,7 @@ The events are organized into subsystems, such as ext4, irq, sched,
 etc., and a full event name looks like this: <subsystem>:<event>.  The
 subsystem name is optional, but it is displayed in the available_events
 file.  All of the events in a subsystem can be specified via the syntax
-"<subsystem>:*"; for example, to enable all irq events, you can use the
+``<subsystem>:*``; for example, to enable all irq events, you can use the
 command::
 
 	# echo 'irq:*' > /sys/kernel/debug/tracing/set_event
@@ -111,8 +111,8 @@ It also displays the format string that will be used to print the
 event in text mode, along with the event name and ID used for
 profiling.
 
-Every event has a set of 'common' fields associated with it; these are
-the fields prefixed with 'common_'.  The other fields vary between
+Every event has a set of ``common`` fields associated with it; these are
+the fields prefixed with ``common_``.  The other fields vary between
 events and correspond to the fields defined in the TRACE_EVENT
 definition for that event.
 
@@ -190,7 +190,7 @@ And for string fields they are:
 
 ==, !=, ~
 
-The glob (~) accepts a wild card character (*,?) and character classes
+The glob (~) accepts a wild card character (\*,?) and character classes
 ([). For example::
 
   prev_comm ~ "*sh"
diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index 636aa9bf5674..0bc33ad4a3f9 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -2615,13 +2615,13 @@ To see which functions are being traced, you can cat the file:
 
 Perhaps this is not enough. The filters also allow glob(7) matching.
 
-  <match>*
+  ``<match>*``
 	will match functions that begin with <match>
-  *<match>
+  ``*<match>``
 	will match functions that end with <match>
-  *<match>*
+  ``*<match>*``
 	will match functions that have <match> in it
-  <match1>*<match2>
+  ``<match1>*<match2>``
 	will match functions that begin with <match1> and end with <match2>
 
 .. note::
-- 
2.14.3


--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH v3 15/15] selinux: delay sid population for rootfs till init is complete
From: Victor Kamensky @ 2018-03-07 17:26 UTC (permalink / raw)
  To: Rob Landley
  Cc: Stephen Smalley, Taras Kondratiuk, H. Peter Anvin, Al Viro,
	Arnd Bergmann, Mimi Zohar, Jonathan Corbet, James McMechan,
	initramfs, linux-doc, linux-kernel, linux-security-module,
	xe-linux-external, Paul Moore, Eric Paris
In-Reply-To: <d23913a5-2a18-4d2a-60ae-bfa1ba8aaa0c@landley.net>



On Wed, 7 Mar 2018, Rob Landley wrote:

> On 02/20/2018 12:56 PM, Stephen Smalley wrote:
>> On Fri, 2018-02-16 at 20:33 +0000, Taras Kondratiuk wrote:
>>> From: Victor Kamensky <kamensky@cisco.com>
>>>
>>> With initramfs cpio format that supports extended attributes
>>> we need to skip sid population on sys_lsetxattr call from
>>> initramfs for rootfs if security server is not initialized yet.
>>>
>>> Otherwise callback in selinux_inode_post_setxattr will try to
>>> translate give security.selinux label into sid context and since
>>> security server is not available yet inode will receive default
>>> sid (typically kernel_t). Note that in the same time proper
>>> label will be stored in inode xattrs. Later, since inode sid
>>> would be already populated system will never look back at
>>> actual xattrs. But if we skip sid population for rootfs and
>>> we have policy that direct use of xattrs for rootfs, proper
>>> sid will be filled in from extended attributes one node is
>>> accessed and server is initialized.
>>>
>>> Note new DELAYAFTERINIT_MNT super block flag is introduced
>>> to only mark rootfs for such behavior. For other types of
>>> tmpfs original logic is still used.
>>
>> (cc selinux maintainers)
>>
>> Wondering if we shouldn't just do this always, for all filesystem
>> types.  Also, I think this should likely also be done in
>> selinux_inode_setsecurity() for consistency.

Sorry, I did not have time to try out Stephen's suggestion,
especially given that core initramfs xattrs acceptance and dicussion
looks a bit stalled, and for my use case it is dependency before
SELinux changes.

I will look for both suggestion this week. Hope to see initramfs
xattrs patch series review going again.

> I don't understand what selinux thinks it's doing here.
>
> Initramfs is special because it's populated early, ideally early enough drivers
> can load their firmware out of it. This is guaranteed to be before any processes
> have launched, before any other filesystems have been mounted. I'm surprised
> selinux is trying to do anything this early because A) what is there for it to
> do, B) where did it get a ruleset?
>
> This isn't really a mount flag, this is a "the selinux subsystem isn't
> functionally initialized yet" flag. We haven't launched init. In a modular
> system the module probably isn't loaded. There are no processes, and the only
> files anywhere are the ones we're in the process of extracting. What's there
> fore selinux to do?
>
> When a filesystem is mounted, none of these cached selinux "we already looked at
> the xattrs" inode fields are populated yet, correct? It can figure that out when
> something accesses the file and do it then, so the point is _not_ doing this now
> and thus not cacheing the wrong info. That's what the mount flag is doing,
> telling selinux "not yet". So why does selinux not already _know_ "not yet"?
>
> Why doesn't load_policy flush the cache of the old default contexts? What
> happens if you mount an ext2 root and then init reads a dozen files before it
> gets to the load_policy?

I need to check whether security context caching happens on all
file operations, or when setxattr is executed. If latter, 
setxattr operation before policy load may not be very common use case.

Also note there is a second SELinux related patch and 
corresponding Stephen's comment: if SELinux is enabled
in kernel, but policy is not loaded yet, setxattr for security.selinux
extended attribute will go for check to SELinux LSM callback, it will
be denied. My other patch was relaxing above for "rootfs" only,
i.e covering initramfs xattrs case. Stephen's point was that
maybe it needs to be relaxed for
all cases if policy not loaded yet. I need some time to look
at the code and think about what can go wrong, if rule relaxed
for all cases.

> Do those doesn't files have bad default contexts forever now?
>
> Where does the selinux ruleset come from during the cpio extract?

Yes, in our use case SELinux policy file 
(/etc/selinux/*/policy/policy.*) comes from cpio initramfs itself.
So it is chicken and egg problem.

> Was it
> hardwired into the driver? It certainly didn't come out of a file, and it wasn't
> a process that loaded it. Why is selinux trying to evaluate and cache the
> security context of files before it has any rules?

Note for ext2 case there is no setxattr first as we have in initramfs
xattrs case, so extended attributes values are read from backing
persitent storage as they were put there before, and there would not
be discrepency what is cached in security context inode data scructure
and real "security.selinux" extended attribute value in file system.

Thanks,
Victor

> (It has xattr annotations,
> but they have no _meaning_ without rules...?
>
> Confused,
>
> Rob
>
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 2/2] docs: add Co-Developed-by docs
From: Jonathan Corbet @ 2018-03-07 17:16 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Andrew Morton, Greg Kroah-Hartman, Joe Perches, Randy Dunlap,
	Dominik Brodowski, Thomas Gleixner, linux-kernel, linux-doc
In-Reply-To: <1520222301-11874-3-git-send-email-me@tobin.cc>

On Mon,  5 Mar 2018 14:58:21 +1100
"Tobin C. Harding" <me@tobin.cc> wrote:

> When Co-Developed-by tag was added, docs were only added to
> Documention/process/5.Posting.rst and were not added to
> Documention/process/submitting-patches.rst
> 
> Add documentation to Documention/process/submitting-patches.rst

Someday we should really pull those documents together into a coherent
whole so we don't have these weird synchronization issues.  But it's much
easier to just apply this, so that's what I've done for now, with Willy's
comma tweak.

I'll leave the checkpatch part to Joe.

Thanks,

jon
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3] Documentation/sphinx: Fix Directive import error
From: Jonathan Corbet @ 2018-03-07 17:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Takashi Iwai, Jani Nikula, Jiri Slaby, linux-doc, linux-kernel
In-Reply-To: <20180302184014.GG31400@bombadil.infradead.org>

On Fri, 2 Mar 2018 10:40:14 -0800
Matthew Wilcox <willy@infradead.org> wrote:

> Sphinx 1.7 removed sphinx.util.compat.Directive so people
> who have upgraded cannot build the documentation.  Switch to
> docutils.parsers.rst.Directive which has been available since
> docutils 0.5 released in 2009.
> 
> Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1083694
> Co-developed-by: Takashi Iwai <tiwai@suse.de>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

FWIW, I've *finally* gotten a chance to test this with a few sphinx
versions; it works just fine, of course.  Will get it upstream shortly.

Thanks,

jon
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3 15/15] selinux: delay sid population for rootfs till init is complete
From: Rob Landley @ 2018-03-07 16:51 UTC (permalink / raw)
  To: Stephen Smalley, Taras Kondratiuk, H. Peter Anvin, Al Viro,
	Arnd Bergmann, Mimi Zohar, Jonathan Corbet, James McMechan
  Cc: initramfs, Victor Kamensky, linux-doc, linux-kernel,
	linux-security-module, xe-linux-external, Paul Moore, Eric Paris
In-Reply-To: <1519152994.14218.15.camel@tycho.nsa.gov>

On 02/20/2018 12:56 PM, Stephen Smalley wrote:
> On Fri, 2018-02-16 at 20:33 +0000, Taras Kondratiuk wrote:
>> From: Victor Kamensky <kamensky@cisco.com>
>>
>> With initramfs cpio format that supports extended attributes
>> we need to skip sid population on sys_lsetxattr call from
>> initramfs for rootfs if security server is not initialized yet.
>>
>> Otherwise callback in selinux_inode_post_setxattr will try to
>> translate give security.selinux label into sid context and since
>> security server is not available yet inode will receive default
>> sid (typically kernel_t). Note that in the same time proper
>> label will be stored in inode xattrs. Later, since inode sid
>> would be already populated system will never look back at
>> actual xattrs. But if we skip sid population for rootfs and
>> we have policy that direct use of xattrs for rootfs, proper
>> sid will be filled in from extended attributes one node is
>> accessed and server is initialized.
>>
>> Note new DELAYAFTERINIT_MNT super block flag is introduced
>> to only mark rootfs for such behavior. For other types of
>> tmpfs original logic is still used.
> 
> (cc selinux maintainers)
> 
> Wondering if we shouldn't just do this always, for all filesystem
> types.  Also, I think this should likely also be done in
> selinux_inode_setsecurity() for consistency.

I don't understand what selinux thinks it's doing here.

Initramfs is special because it's populated early, ideally early enough drivers
can load their firmware out of it. This is guaranteed to be before any processes
have launched, before any other filesystems have been mounted. I'm surprised
selinux is trying to do anything this early because A) what is there for it to
do, B) where did it get a ruleset?

This isn't really a mount flag, this is a "the selinux subsystem isn't
functionally initialized yet" flag. We haven't launched init. In a modular
system the module probably isn't loaded. There are no processes, and the only
files anywhere are the ones we're in the process of extracting. What's there
fore selinux to do?

When a filesystem is mounted, none of these cached selinux "we already looked at
the xattrs" inode fields are populated yet, correct? It can figure that out when
something accesses the file and do it then, so the point is _not_ doing this now
and thus not cacheing the wrong info. That's what the mount flag is doing,
telling selinux "not yet". So why does selinux not already _know_ "not yet"?

Why doesn't load_policy flush the cache of the old default contexts? What
happens if you mount an ext2 root and then init reads a dozen files before it
gets to the load_policy? Do those doesn't files have bad default contexts
forever now?

Where does the selinux ruleset come from during the cpio extract? Was it
hardwired into the driver? It certainly didn't come out of a file, and it wasn't
a process that loaded it. Why is selinux trying to evaluate and cache the
security context of files before it has any rules? (It has xattr annotations,
but they have no _meaning_ without rules...?

Confused,

Rob
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] HID: ntrig: document sysfs interface
From: Jiri Kosina @ 2018-03-07 14:46 UTC (permalink / raw)
  To: Aishwarya Pant
  Cc: Benjamin Tissoires, linux-input, linux-kernel, Jonathan Corbet,
	Greg KH, Julia Lawall, linux-doc
In-Reply-To: <20180302053017.GA23123@mordor.localdomain>

On Fri, 2 Mar 2018, Aishwarya Pant wrote:

> Add sysfs documentation for N-Trig touchscreens under Documentation/ABI.
> Descriptions have been collected from code comments.

Applied, thanks.

-- 
Jiri Kosina
SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 05/10] hwmon: generic-pwm-tachometer: Add generic PWM based tachometer
From: Guenter Roeck @ 2018-03-07 14:20 UTC (permalink / raw)
  To: Rajkumar Rampelli, Mikko Perttunen, robh+dt, mark.rutland,
	thierry.reding, jonathanh, jdelvare, corbet, catalin.marinas,
	will.deacon, kstewart, gregkh, pombredanne, mmaddireddy,
	mperttunen, arnd, timur, andy.gross, xuwei5, elder, heiko, krzk,
	ard.biesheuvel
  Cc: devicetree, linux-kernel, linux-pwm, linux-tegra, linux-hwmon,
	linux-doc, linux-arm-kernel, ldewangan
In-Reply-To: <952cb50e-d9a0-ae5e-d9f6-d1ce268f0384@nvidia.com>

On 03/07/2018 01:47 AM, Rajkumar Rampelli wrote:
> 
> 
> On Wednesday 28 February 2018 07:59 PM, Guenter Roeck wrote:
>> On 02/27/2018 11:03 PM, Mikko Perttunen wrote:
>>> On 02/28/2018 08:12 AM, Rajkumar Rampelli wrote:
>>>>
>>>> On Wednesday 28 February 2018 11:28 AM, Guenter Roeck wrote:
>>>>> On 02/27/2018 09:38 PM, Rajkumar Rampelli wrote:
>>>>>>
>>>>>> On Wednesday 21 February 2018 08:20 PM, Guenter Roeck wrote:
>>>>>>> On 02/20/2018 10:58 PM, Rajkumar Rampelli wrote:
>>>>>>>> Add generic PWM based tachometer driver via HWMON interface
>>>>>>>> to report the RPM of motor. This drivers get the period/duty
>>>>>>>> cycle from PWM IP which captures the motor PWM output.
>>>>>>>>
>>>>>>>> This driver implements a simple interface for monitoring the speed of
>>>>>>>> a fan and exposes it in roatations per minute (RPM) to the user space
>>>>>>>> by using the hwmon's sysfs interface
>>>>>>>>
>>>>>>>> Signed-off-by: Rajkumar Rampelli <rrajk@nvidia.com>
>>>>>>>> ---
>>>>>>>>   Documentation/hwmon/generic-pwm-tachometer |  17 +++++
>>>>>>>>   drivers/hwmon/Kconfig                      |  10 +++
>>>>>>>>   drivers/hwmon/Makefile                     |   1 +
>>>>>>>>   drivers/hwmon/generic-pwm-tachometer.c     | 112 +++++++++++++++++++++++++++++
>>>>>>>>   4 files changed, 140 insertions(+)
>>>>>>>>   create mode 100644 Documentation/hwmon/generic-pwm-tachometer
>>>>>>>>   create mode 100644 drivers/hwmon/generic-pwm-tachometer.c
>>>>>>>>
>>>>>>>> diff --git a/Documentation/hwmon/generic-pwm-tachometer b/Documentation/hwmon/generic-pwm-tachometer
>>>>>>>> new file mode 100644
>>>>>>>> index 0000000..e0713ee
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/Documentation/hwmon/generic-pwm-tachometer
>>>>>>>> @@ -0,0 +1,17 @@
>>>>>>>> +Kernel driver generic-pwm-tachometer
>>>>>>>> +====================================
>>>>>>>> +
>>>>>>>> +This driver enables the use of a PWM module to monitor a fan. It uses the
>>>>>>>> +generic PWM interface and can be used on SoCs as along as the SoC supports
>>>>>>>> +Tachometer controller that moniors the Fan speed in periods.
>>>>>>>> +
>>>>>>>> +Author: Rajkumar Rampelli <rrajk@nvidia.com>
>>>>>>>> +
>>>>>>>> +Description
>>>>>>>> +-----------
>>>>>>>> +
>>>>>>>> +The driver implements a simple interface for monitoring the Fan speed using
>>>>>>>> +PWM module and Tachometer controller. It requests period value through PWM
>>>>>>>> +capture interface to Tachometer and measures the Rotations per minute using
>>>>>>>> +received period value. It exposes the Fan speed in RPM to the user space by
>>>>>>>> +using the hwmon's sysfs interface.
>>>>>>>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>>>>>>>> index ef23553..8912dcb 100644
>>>>>>>> --- a/drivers/hwmon/Kconfig
>>>>>>>> +++ b/drivers/hwmon/Kconfig
>>>>>>>> @@ -1878,6 +1878,16 @@ config SENSORS_XGENE
>>>>>>>>         If you say yes here you get support for the temperature
>>>>>>>>         and power sensors for APM X-Gene SoC.
>>>>>>>>   +config GENERIC_PWM_TACHOMETER
>>>>>>>> +    tristate "Generic PWM based tachometer driver"
>>>>>>>> +    depends on PWM
>>>>>>>> +    help
>>>>>>>> +      Enables a driver to use PWM signal from motor to use
>>>>>>>> +      for measuring the motor speed. The RPM is captured by
>>>>>>>> +      PWM modules which has PWM capture capability and this
>>>>>>>> +      drivers reads the captured data from PWM IP to convert
>>>>>>>> +      it to speed in RPM.
>>>>>>>> +
>>>>>>>>   if ACPI
>>>>>>>>     comment "ACPI drivers"
>>>>>>>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>>>>>>>> index f814b4a..9dcc374 100644
>>>>>>>> --- a/drivers/hwmon/Makefile
>>>>>>>> +++ b/drivers/hwmon/Makefile
>>>>>>>> @@ -175,6 +175,7 @@ obj-$(CONFIG_SENSORS_WM8350)    += wm8350-hwmon.o
>>>>>>>>   obj-$(CONFIG_SENSORS_XGENE)    += xgene-hwmon.o
>>>>>>>>     obj-$(CONFIG_PMBUS)        += pmbus/
>>>>>>>> +obj-$(CONFIG_GENERIC_PWM_TACHOMETER) += generic-pwm-tachometer.o
>>>>>>>>     ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
>>>>>>>>   diff --git a/drivers/hwmon/generic-pwm-tachometer.c b/drivers/hwmon/generic-pwm-tachometer.c
>>>>>>>> new file mode 100644
>>>>>>>> index 0000000..9354d43
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/drivers/hwmon/generic-pwm-tachometer.c
>>>>>>>> @@ -0,0 +1,112 @@
>>>>>>>> +/*
>>>>>>>> + * Copyright (c) 2017-2018, NVIDIA CORPORATION.  All rights reserved.
>>>>>>>> + *
>>>>>>>> + * This program is free software; you can redistribute it and/or modify it
>>>>>>>> + * under the terms and conditions of the GNU General Public License,
>>>>>>>> + * version 2, as published by the Free Software Foundation.
>>>>>>>> + *
>>>>>>>> + * This program is distributed in the hope it will be useful, but WITHOUT
>>>>>>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>>>>>>>> + * more details.
>>>>>>>> + *
>>>>>>>> + */
>>>>>>>> +
>>>>>>>> +#include <linux/module.h>
>>>>>>>> +#include <linux/platform_device.h>
>>>>>>>> +#include <linux/err.h>
>>>>>>>> +#include <linux/pwm.h>
>>>>>>>> +#include <linux/hwmon.h>
>>>>>>>> +#include <linux/hwmon-sysfs.h>
>>>>>>>> +
>>>>>>>> +struct pwm_hwmon_tach {
>>>>>>>> +    struct device        *dev;
>>>>>>>> +    struct pwm_device    *pwm;
>>>>>>>> +    struct device        *hwmon;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +static ssize_t show_rpm(struct device *dev, struct device_attribute *attr,
>>>>>>>> +            char *buf)
>>>>>>>> +{
>>>>>>>> +    struct pwm_hwmon_tach *ptt = dev_get_drvdata(dev);
>>>>>>>> +    struct pwm_device *pwm = ptt->pwm;
>>>>>>>> +    struct pwm_capture result;
>>>>>>>> +    int err;
>>>>>>>> +    unsigned int rpm = 0;
>>>>>>>> +
>>>>>>>> +    err = pwm_capture(pwm, &result, 0);
>>>>>>>> +    if (err < 0) {
>>>>>>>> +        dev_err(ptt->dev, "Failed to capture PWM: %d\n", err);
>>>>>>>> +        return err;
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>> +    if (result.period)
>>>>>>>> +        rpm = DIV_ROUND_CLOSEST_ULL(60ULL * NSEC_PER_SEC,
>>>>>>>> +                        result.period);
>>>>>>>> +
>>>>>>>> +    return sprintf(buf, "%u\n", rpm);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static SENSOR_DEVICE_ATTR(rpm, 0444, show_rpm, NULL, 0);
>>>>>>>> +
>>>>>>>> +static struct attribute *pwm_tach_attrs[] = {
>>>>>>>> +    &sensor_dev_attr_rpm.dev_attr.attr,
>>>>>>>> +    NULL,
>>>>>>>> +};
>>>>>>>
>>>>>>> "rpm" is not a standard hwmon sysfs attribute. If you don't provide
>>>>>>> a single standard hwmon sysfs attribute, having a hwmon driver is pointless.
>>>>>> Guenter Roeck,
>>>>>> I will define a new hwmon sysfs attribute node called "hwmon_tachometer_attributes" in hwmon.h like below and update the same in tachometer hwmon driver. Is it fine ?
>>>>>> enum hwmon_tachometer_attributes {
>>>>>
>>>>> Are you kidding me ?
>>>>>
>>>>> Guenter
>>>> Sorry, I just wanted to confirm whether my understanding is correct or not before implementing it actually.
>>>> Or, shall I add this attribute as a part of fan attributes with "hwmon_fan_rpm" ? or any other way to do it ? I need your inputs in fixing this.
>>>
>>> I think he wants the attribute to be named according to the properties in this document: https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface. I guess the attribute would then map to fan1_input, though I'm not sure if that's 100% correct either since this could technically be attached to something other than a fan. But I would think in practice that's not a big concern.
>>>
>>> Guenter,
>>> Please correct me as well if I'm wrong.
>>>
>> You are absolutely correct.
>>
>> While I am not opposed to ABI changes, the merits of those would need to be
>> discussed on the mailing list. But replacing "fan1_input" with "rpm" is
>> not an acceptable ABI change, even if it may measure something that turns
>> but isn't a fan.
>>
>> If this _is_ in fact supposed to be used for something else but fans, we
>> would have to discuss what that might be, and if hwmon is the appropriate
>> subsystem to measure and report it. This does to some degree lead back to
>> my concern of having the "fan" part of this patch series in the pwm core.
>> I am still not sure if that makes sense.
>>
>> Thanks,
>> Guenter
> I am planning to add tachometer support in pwm-fan.c driver (drivers/hwmon/) instead of adding new generic-pwm-tachometer.c driver. Measuring RPM value will be done in pwm-fan driver itself using pwm capture feature and will add new sysfs attributes under this driver to report rpm value of fan.

There is an existing attribute to report the RPM of fans. It is called fan[1..n]_input.

"replacing "fan1_input" with "rpm" is not an acceptable ABI change"

Preemptive NACK.

Guenter

>>
>>> Thank you,
>>> Mikko
>>>
>>>>
>>>> -- 
>>>> To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH V2 5/7] thermal/drivers/cpu_cooling: Add idle cooling device documentation
From: Daniel Lezcano @ 2018-03-07 11:42 UTC (permalink / raw)
  To: Pavel Machek
  Cc: edubezval, kevin.wangtao, leo.yan, vincent.guittot, amit.kachhap,
	linux-kernel, javi.merino, rui.zhang, daniel.thompson, linux-pm,
	Jonathan Corbet, open list:DOCUMENTATION
In-Reply-To: <20180306231906.GB28911@amd>

On 07/03/2018 00:19, Pavel Machek wrote:
> Hi!

Hi Pavel,

thanks for taking the time to review the documentation.

>> --- /dev/null
>> +++ b/Documentation/thermal/cpu-idle-cooling.txt
>> @@ -0,0 +1,165 @@
>> +
>> +Situation:
>> +----------
>> +
> 
> Can we have some real header here? Also if this is .rst, maybe it
> should be marked so?

Ok, I will fix it.

>> +Under certain circumstances, the SoC reaches a temperature exceeding
>> +the allocated power budget or the maximum temperature limit. The
> 
> I don't understand. Power budget is in W, temperature is in
> kelvin. Temperature can't exceed power budget AFAICT.

Yes, it is badly worded. Is the following better ?

"
Under certain circumstances a SoC can reach the maximum temperature
limit or is unable to stabilize the temperature around a temperature
control.

When the SoC has to stabilize the temperature, the kernel can act on a
cooling device to mitigate the dissipated power.

When the maximum temperature is reached and to prevent a catastrophic
situation a radical decision must be taken to reduce the temperature
under the critical threshold, that impacts the performance.

"

>> +former must be mitigated to stabilize the SoC temperature around the
>> +temperature control using the defined cooling devices, the latter
> 
> later?
> 
>> +catastrophic situation where a radical decision must be taken to
>> +reduce the temperature under the critical threshold, that can impact
>> +the performances.
> 
> performance.
> 
>> +Another situation is when the silicon reaches a certain temperature
>> +which continues to increase even if the dynamic leakage is reduced to
>> +its minimum by clock gating the component. The runaway phenomena will
>> +continue with the static leakage and only powering down the component,
>> +thus dropping the dynamic and static leakage will allow the component
>> +to cool down. This situation is critical.
> 
> Critical here, critical there. I have trouble following
> it. Theoretically hardware should protect itself, because you don't
> want kernel bug to damage your CPU?

There are several levels of protection. The first level is mitigating
the temperature from the kernel, then in the temperature sensor a reset
line will trigger the reboot of the CPUs. Usually it is a register where
you write the maximum temperature, from the driver itself. I never tried
to write 1000°C in this register and see if I can burn the board.

I know some boards have another level of thermal protection in the
hardware itself and some other don't.

In any case, from a kernel point of view, it is a critical situation as
we are about to hard reboot the system and in this case it is preferable
to drop drastically the performance but give the opportunity to the
system to run in a degraded mode.

>> +Last but not least, the system can ask for a specific power budget but
>> +because of the OPP density, we can only choose an OPP with a power
>> +budget lower than the requested one and underuse the CPU, thus losing
>> +performances. In other words, one OPP under uses the CPU with a
> 
> performance.
> 
>> +lesser than the power budget and the next OPP exceed the power budget,
>> +an intermediate OPP could have been used if it were present.
> 
> was.
> 
>> +Solutions:
>> +----------
>> +
>> +If we can remove the static and the dynamic leakage for a specific
>> +duration in a controlled period, the SoC temperature will
>> +decrease. Acting at the idle state duration or the idle cycle
> 
> "should" decrease? If you are in bad environment..

No, it will decrease in any case because of the static leakage drop. The
bad environment will impact the speed of this decrease.

>> +The Operating Performance Point (OPP) density has a great influence on
>> +the control precision of cpufreq, however different vendors have a
>> +plethora of OPP density, and some have large power gap between OPPs,
>> +that will result in loss of performance during thermal control and
>> +loss of power in other scenes.
> 
> scene seems to be wrong word here.

yes, 'scenario' will be better :)

>> +At a specific OPP, we can assume injecting idle cycle on all CPUs,
> 
> Extra comma?
> 
>> +Idle Injection:
>> +---------------
>> +
>> +The base concept of the idle injection is to force the CPU to go to an
>> +idle state for a specified time each control cycle, it provides
>> +another way to control CPU power and heat in addition to
>> +cpufreq. Ideally, if all CPUs of a cluster inject idle synchronously,
>> +this cluster can get into the deepest idle state and achieve minimum
>> +power consumption, but that will also increase system response latency
>> +if we inject less than cpuidle latency.
> 
> I don't understand last sentence.

Is it better ?

"Ideally, if all CPUs, belonging to the same cluster, inject their idle
cycle synchronously, the cluster can reach its power down state with a
minimum power consumption and static leakage drop. However, these idle
cycles injection will add extra latencies as the CPUs will have to
wakeup from a deep sleep state."


>> +The mitigation begins with a maximum period value which decrease
> 
> decreases?
>   
>> +more cooling effect is requested. When the period duration is equal
>> to
>> +the idle duration, then we are in a situation the platform can’t
>> +dissipate the heat enough and the mitigation fails. In this case
> 
> fast enough?
> 
>> +situation is considered critical and there is nothing to do. The idle
> 
> Nothing to do? Maybe power the system down?

Nothing to do == the mitigation can't handle the situation, it reached
its limit. We can't do better.

Solution: add an emergency thermal shutdown (which is an orthogonal
feature to be added to the thermal framework).

Sidenote: it is a very unlikely case, as we are idle most of the time
when the heat is hard to dissipate. I tested this with a proto-SoC with
an interesting thermal behavior (temperature jumps insanely high),
running at full blast and bad heat dissipation, the mitigation never
reached the limit.

>> +The idle injection duration value must comply with the constraints:
>> +
>> +- It is lesser or equal to the latency we tolerate when the mitigation
> 
> less ... than the latency
> 
>> +Minimum period
>> +--------------
>> +
>> +The idle injection duration being fixed, it is obvious the minimum
>> +period can’t be lesser than that, otherwise we will be scheduling the
> 
> less.
> 
>> +Practically, if the running power is lesses than the targeted power,
> 
> less.
> 
>> +However, in this demonstration we ignore three aspects:
>> +
>> + * The static leakage is not defined here, we can introduce it in the
>> +   equation but assuming it will be zero most of the time as it is
> 
> , but?
> 
> Best regards,

Thanks!


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v16 00/13] support "task_isolation" mode
From: Yury Norov @ 2018-03-07 10:07 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Steven Rostedt, Ingo Molnar, Peter Zijlstra, Andrew Morton,
	Rik van Riel, Tejun Heo, Frederic Weisbecker, Thomas Gleixner,
	Paul E. McKenney, Christoph Lameter, Viresh Kumar,
	Catalin Marinas, Will Deacon, Andy Lutomirski, Daniel Lezcano,
	Francis Giraldeau, linux-mm, linux-doc, linux-api, linux-kernel,
	Prasun Kapoor, Narayana Prasad Athreya, Alex Belits,
	Chandrakala Chavva
In-Reply-To: <1509728692-10460-1-git-send-email-cmetcalf@mellanox.com>

Hi Chris,

(CC Cavium people)

Thanks for your series.

On Fri, Nov 03, 2017 at 01:04:39PM -0400, Chris Metcalf wrote:
> Here, finally, is a new spin of the task isolation work (v16), with
> changes based on the issues that were raised at last year's Linux
> Plumbers Conference and in the email discussion that followed.
> 
> This version of the patch series cleans up a number of areas that were
> a little dodgy in the previous patch series.
> 
> - It no longer loops in the final code that prepares to return to
>   userspace; instead, it sets things up in the prctl() and then
>   validates when preparing to return to userspace, adjusting the
>   syscall return value to -EAGAIN at that point if something doesn't
>   line up quite correctly.
> 
> - We no longer support the NOSIG mode that let you freely call into
>   the kernel multiple times while in task isolation.  This was always
>   a little odd, since you really should be in sufficient control of
>   task isolation code that you can explicitly stop isolation with a
>   "prctl(PR_TASK_ISOLATION, 0)" before using the kernel for anything
>   else.  It also made the semantics of migrating the task confusing.
>   More importantly, removing that support means that the only path
>   that sets up task isolation is the return from prctl(), which allows
>   us to make the simplification above.
> 
> - We no longer try to signal the task isolation process from a remote
>   core when we detect that we are about to violate its isolation.
>   Instead, we just print a message there (and optionally dump stack),
>   and rely on the eventual interrupt on the core itself to trigger the
>   signal.  We are always in a safe context to generate a signal when
>   we enter the kernel, unlike when we are deep in a call stack sending
>   an SMP IPI or whatever.
> 
> - We notice the case of an unstable scheduler clock and return
>   EINVAL rather than spinning forever with EAGAIN (suggestion from
>   Francis Giraldeau).
> 
> - The prctl() call requires zeros for arg3/4/5 (suggestion from
>   Eugene Syromiatnikov).
> 
> The kernel internal isolation API is also now cleaner, and I have
> included kerneldoc APIs for all the interfaces so that it should be
> easier to port it to additional architectures; in fact looking at
> include/linux/isolation.h is a good place to start understanding the
> overall patch set.
> 
> I removed Catalin's Reviewed-by for arm64, and Christoph's Tested-by
> for x86, since this version is sufficiently different to merit
> re-review and re-testing.
> 
> Note that this is not rebased on top of Frederic's recent housekeeping
> patch series, although it is largely orthogonal right now.  After
> Frederic's patch series lands, task isolation is enabled with
> "isolcpus=nohz,domain,CPUS".  We could add some shorthand for that
> ("isolcpus=full,CPUS"?) or just use it as-is.
> 
> The previous (v15) patch series is here:
> 
> https://lkml.kernel.org/r/1471382376-5443-1-git-send-email-cmetcalf@mellanox.com
> 
> This patch series is available at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git dataplane
> 
> Some folks raised some good points at the LPC discussion and then in
> email discussions that followed.  Rather than trying to respond to
> everyone in a flurry of emails, I'll answer some questions here:
> 
> 
> Why not just instrument user_exit() to raise the isolation-lost signal?
> 
> Andy pointed me in this direction.  The advantage is that you catch
> *everything*, by definition.  There is a hook that can do this in the
> current patch set, but you have to #define DEBUG_TASK_ISOLATION
> manually to take advantage of it, because as written it has two issues:
> 
> 1. You can't actually exit the kernel with prctl(PR_TASK_ISOLATION,0)
>    because the user_exit hook kills you first.
> 2. You lose the ability to get much better diagnostics by waiting
>    until you are further into kernel entry and know what you're doing.
> 
> You could work around #2 in several ways, but #1 is harder.  I looked
> at x86 for a while, and although you could imagine this, you really
> want to generate a lost-isolation signal on any syscall that isn't
> that exact prctl(), and it's awkward to try to do all of that checking
> before user_exit().  Since in any case we do want to have the more
> specific probes at the various kernel entry points where we generate
> the diagnostics, I felt like it wasn't the right approach to enable
> as a compilation-time default.  I'm open to discussion on this though!
> 
> 
> Can't we do all the exit-to-userspace work with irqs disabled?
> 
> In fact, it turns out that you can do lru_add_drain() with irqs
> disabled, so that's what we're doing in the patch series now.
> 
> However, it doesn't seem possible to do the synchronous cancellation of
> the vmstat deferred work with irqs disabled, though if there's a way,
> it would be a little cleaner to do that; Christoph?  We can certainly
> update the statistics with interrupts disabled via
> refresh_cpu_vm_stats(false), but that's not sufficient.  For now, I
> just issue the cancellation during sys_prctl() call, and then if it
> isn't synchronized by the time we are exiting to userspace, we just
> jam in an EAGAIN and let userspace retry.  In practice, this doesn't
> seem to ever happen.
> 
> 
> What about using a per-cpu flag to stop doing new deferred work?
> 
> Andy also suggested we could structure the code to have the prctl()
> set a per-cpu flag to stop adding new future work (e.g. vmstat per-cpu
> data, or lru page cache).  Then, we could flush those structures right
> from the sys_prctl() call, and when we were returning to user space,
> we'd be confident that there wasn't going to be any new work added.
> 
> With the current set of things that we are disabling for task
> isolation, though, it didn't seem necessary.  Quiescing the vmstat
> shepherd seems like it is generally pretty safe since we will likely
> be able to sync up the per-cpu cache and kill the deferred work with
> high probability, with no expectation that additional work will show
> up.  And since we can flush the LRU page cache with interrupts
> disabled, that turns out not to be an issue either.
> 
> I could imagine that if we have to deal with some new kind of deferred
> work, we might find the per-cpu flag becomes a good solution, but for
> now we don't have a good use case for that approach.
> 
> 
> How about stopping the dyn tick?
> 
> Right now we try to stop it on return to userspace, but if we can't,
> we just return EAGAIN to userspace.  In practice, what I see is that
> usually the tick stops immediately, but occasionally it doesn't; in
> this case I've always seen that nr_running is >1, presumably with some
> temporary kernel worker threads, and the user code just needs to call
> prctl() until those threads are done.  We could structure things with
> a completion that we wait for, which is set by the timer code when it
> finally does stop the tick, but this may be overkill, particularly
> since we'll only be running this prctl() loop from userspace on cores
> where we have no other useful work that we're trying to run anyway.
> 
> 
> What about TLB flushing?
> 
> We talked about this at Plumbers and some of the email discussion also
> was about TLB flushing.  I haven't tried to add it to this patch set,
> because I really want to avoid scope creep; in any case, I think I
> managed to convince Andy that he was going to work on it himself. :)
> Paul McKenney already contributed some framework for such a patch, in
> commit b8c17e6664c4 ("rcu: Maintain special bits at bottom of
> ->dynticks counter").
> 
> What about that d*mn 1 Hz clock?
> 
> It's still there, so this code still requires some further work before
> it can actually get a process into long-term task isolation (without
> the obvious one-line kernel hack).  Frederic suggested a while ago
> forcing updates on cpustats was required as the last gating factor; do
> we think that is still true?  Christoph was working on this at one
> point - any progress from your point of view?

I tested your series on ThunderX 2 machine. When I run 10 giga-ticks test,
it always passed. If I run for more, the test exits like this:

# time ./isolation 1000
/sys devices: OK (using task isolation cpu 100)
prctl unaffinitized: OK
prctl on cpu 0: OK
==> hello, world
test_off: OK
Received signal 11 successfully
test_segv: OK
test_fault: OK
test_fault (SIGUSR1): OK
test_syscall: OK
test_syscall (SIGUSR1): OK
test_schedule: OK
test_schedule (SIGUSR1): OK
testing task isolation jitter for 1000000000000 ticks
ERROR: Program unexpectedly entered kernel.
INFO: loop times:
  1 cycles (count: 128606844716)
  2 cycles (count: 31558352292)
  3 cycles (count: 4)
  29 cycles (count: 437)
  30 cycles (count: 1874)
  31 cycles (count: 221)
  57 cycles (count: 4)
  58 cycles (count: 6)
  59 cycles (count: 1)

real  15m58.643s
user  15m58.626s
sys   0m0.012s

I pass task_isolation_debug to boot parameters:
# cat /proc/cmdline 
BOOT_IMAGE=/boot/Image-isol nohz_full=100-110 isolcpus=100-110 task_isolation_debug root=UUID=75b9dd5b-58d8-4a50-8868-004cb7c1f25f ro text

But dmesg looks empty...

I investigate it, but at now I have no ideas what happens. Have you seen
that before?

Anyway, we are going to include your test in our scenario, with some
modifications. I've added --dry-run option to make it easier to
demonstrate the effect of isolation on jitter. If you like it, feel
free to use this change.

Tested-by: Yury Norov <ynorov@caviumnetworks.com>

From 5c5823c1fc8441ab1486287679de77b8cea5154d Mon Sep 17 00:00:00 2001
From: Yury Norov <ynorov@caviumnetworks.com>
Date: Wed, 7 Mar 2018 02:41:08 +0300
Subject: [PATCH] isolation test: --dry-run mode

Add dry-run mode for better understanding the effect of isolation.
Also, make sanity checks on waitticks.

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 tools/testing/selftests/task_isolation/isolation.c | 47 +++++++++++++++++-----
 1 file changed, 36 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/task_isolation/isolation.c b/tools/testing/selftests/task_isolation/isolation.c
index 9c0b49619b40..e90a6c85fe2a 100644
--- a/tools/testing/selftests/task_isolation/isolation.c
+++ b/tools/testing/selftests/task_isolation/isolation.c
@@ -53,6 +53,7 @@
 #include <unistd.h>
 #include <stdlib.h>
 #include <fcntl.h>
+#include <limits.h>
 #include <assert.h>
 #include <string.h>
 #include <errno.h>
@@ -500,7 +501,7 @@ static void jitter_handler(int sig)
 	exit(exit_status);
 }
 
-static void test_jitter(unsigned long waitticks)
+static void test_jitter(unsigned long waitticks, int flags)
 {
 	u_int64_t start, last, elapsed;
 	int rc;
@@ -513,7 +514,8 @@ static void test_jitter(unsigned long waitticks)
 	rc = mlockall(MCL_CURRENT);
 	assert(rc == 0);
 
-	set_task_isolation(PR_TASK_ISOLATION_ENABLE |
+	if (flags & PR_TASK_ISOLATION_ENABLE)
+		set_task_isolation(PR_TASK_ISOLATION_ENABLE |
 			   PR_TASK_ISOLATION_SET_SIG(SIGUSR1));
 
 	last = start = get_cycle_count();
@@ -537,26 +539,49 @@ static void test_jitter(unsigned long waitticks)
 	} while (elapsed < waitticks);
 
 	jitter_test_complete = true;
-	prctl_isolation(0);
+
+	if (flags & PR_TASK_ISOLATION_ENABLE)
+		prctl_isolation(0);
+
 	jitter_summarize();
 }
 
 int main(int argc, char **argv)
 {
 	/* How many billion ticks to wait after running the other tests? */
-	unsigned long waitticks;
+	long waitticks = 10;
+	int flags = PR_TASK_ISOLATION_ENABLE;
 	char buf[100];
 	char *result, *end;
 	FILE *f;
 
-	if (argc == 1)
-		waitticks = 10;
-	else if (argc == 2)
-		waitticks = strtol(argv[1], NULL, 10);
-	else {
-		printf("syntax: isolation [gigaticks]\n");
+	errno = 0;
+
+	switch (argc) {
+	case 1:
+		break;
+	case 2:
+		if (strcmp(argv[1], "--dry-run") == 0)
+			flags = 0;
+		else
+			waitticks = strtol(argv[1], NULL, 10);
+		break;
+	case 3:
+		if (strcmp(argv[1], "--dry-run") == 0)
+			flags = 0;
+
+		waitticks = strtol(argv[2], NULL, 10);
+		break;
+	default:
+		printf("syntax: isolation [--dry-run] [gigaticks]\n");
 		ksft_exit_fail();
 	}
+
+	if (errno || waitticks <= 0 || waitticks > (LONG_MAX / 1000000000)) {
+		printf("Gigaticks: bad number.\n");
+		ksft_exit_fail();
+	}
+
 	waitticks *= 1000000000;
 
 	/* Get a core from the /sys nohz_full device. */
@@ -637,7 +662,7 @@ int main(int argc, char **argv)
 		return exit_status;
 	}
 
-	test_jitter(waitticks);
+	test_jitter(waitticks, flags);
 
 	return exit_status;
 }
-- 
2.14.1

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH 05/10] hwmon: generic-pwm-tachometer: Add generic PWM based tachometer
From: Rajkumar Rampelli @ 2018-03-07  9:47 UTC (permalink / raw)
  To: Guenter Roeck, Mikko Perttunen, robh+dt, mark.rutland,
	thierry.reding, jonathanh, jdelvare, corbet, catalin.marinas,
	will.deacon, kstewart, gregkh, pombredanne, mmaddireddy,
	mperttunen, arnd, timur, andy.gross, xuwei5, elder, heiko, krzk,
	ard.biesheuvel
  Cc: devicetree, linux-kernel, linux-pwm, linux-tegra, linux-hwmon,
	linux-doc, linux-arm-kernel, ldewangan
In-Reply-To: <62ba7f67-5a99-51ab-1214-eb68ebb7e642@roeck-us.net>



On Wednesday 28 February 2018 07:59 PM, Guenter Roeck wrote:
> On 02/27/2018 11:03 PM, Mikko Perttunen wrote:
>> On 02/28/2018 08:12 AM, Rajkumar Rampelli wrote:
>>>
>>> On Wednesday 28 February 2018 11:28 AM, Guenter Roeck wrote:
>>>> On 02/27/2018 09:38 PM, Rajkumar Rampelli wrote:
>>>>>
>>>>> On Wednesday 21 February 2018 08:20 PM, Guenter Roeck wrote:
>>>>>> On 02/20/2018 10:58 PM, Rajkumar Rampelli wrote:
>>>>>>> Add generic PWM based tachometer driver via HWMON interface
>>>>>>> to report the RPM of motor. This drivers get the period/duty
>>>>>>> cycle from PWM IP which captures the motor PWM output.
>>>>>>>
>>>>>>> This driver implements a simple interface for monitoring the 
>>>>>>> speed of
>>>>>>> a fan and exposes it in roatations per minute (RPM) to the user 
>>>>>>> space
>>>>>>> by using the hwmon's sysfs interface
>>>>>>>
>>>>>>> Signed-off-by: Rajkumar Rampelli <rrajk@nvidia.com>
>>>>>>> ---
>>>>>>>   Documentation/hwmon/generic-pwm-tachometer |  17 +++++
>>>>>>>   drivers/hwmon/Kconfig                      |  10 +++
>>>>>>>   drivers/hwmon/Makefile                     |   1 +
>>>>>>>   drivers/hwmon/generic-pwm-tachometer.c     | 112 
>>>>>>> +++++++++++++++++++++++++++++
>>>>>>>   4 files changed, 140 insertions(+)
>>>>>>>   create mode 100644 Documentation/hwmon/generic-pwm-tachometer
>>>>>>>   create mode 100644 drivers/hwmon/generic-pwm-tachometer.c
>>>>>>>
>>>>>>> diff --git a/Documentation/hwmon/generic-pwm-tachometer 
>>>>>>> b/Documentation/hwmon/generic-pwm-tachometer
>>>>>>> new file mode 100644
>>>>>>> index 0000000..e0713ee
>>>>>>> --- /dev/null
>>>>>>> +++ b/Documentation/hwmon/generic-pwm-tachometer
>>>>>>> @@ -0,0 +1,17 @@
>>>>>>> +Kernel driver generic-pwm-tachometer
>>>>>>> +====================================
>>>>>>> +
>>>>>>> +This driver enables the use of a PWM module to monitor a fan. 
>>>>>>> It uses the
>>>>>>> +generic PWM interface and can be used on SoCs as along as the 
>>>>>>> SoC supports
>>>>>>> +Tachometer controller that moniors the Fan speed in periods.
>>>>>>> +
>>>>>>> +Author: Rajkumar Rampelli <rrajk@nvidia.com>
>>>>>>> +
>>>>>>> +Description
>>>>>>> +-----------
>>>>>>> +
>>>>>>> +The driver implements a simple interface for monitoring the Fan 
>>>>>>> speed using
>>>>>>> +PWM module and Tachometer controller. It requests period value 
>>>>>>> through PWM
>>>>>>> +capture interface to Tachometer and measures the Rotations per 
>>>>>>> minute using
>>>>>>> +received period value. It exposes the Fan speed in RPM to the 
>>>>>>> user space by
>>>>>>> +using the hwmon's sysfs interface.
>>>>>>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
>>>>>>> index ef23553..8912dcb 100644
>>>>>>> --- a/drivers/hwmon/Kconfig
>>>>>>> +++ b/drivers/hwmon/Kconfig
>>>>>>> @@ -1878,6 +1878,16 @@ config SENSORS_XGENE
>>>>>>>         If you say yes here you get support for the temperature
>>>>>>>         and power sensors for APM X-Gene SoC.
>>>>>>>   +config GENERIC_PWM_TACHOMETER
>>>>>>> +    tristate "Generic PWM based tachometer driver"
>>>>>>> +    depends on PWM
>>>>>>> +    help
>>>>>>> +      Enables a driver to use PWM signal from motor to use
>>>>>>> +      for measuring the motor speed. The RPM is captured by
>>>>>>> +      PWM modules which has PWM capture capability and this
>>>>>>> +      drivers reads the captured data from PWM IP to convert
>>>>>>> +      it to speed in RPM.
>>>>>>> +
>>>>>>>   if ACPI
>>>>>>>     comment "ACPI drivers"
>>>>>>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
>>>>>>> index f814b4a..9dcc374 100644
>>>>>>> --- a/drivers/hwmon/Makefile
>>>>>>> +++ b/drivers/hwmon/Makefile
>>>>>>> @@ -175,6 +175,7 @@ obj-$(CONFIG_SENSORS_WM8350)    += 
>>>>>>> wm8350-hwmon.o
>>>>>>>   obj-$(CONFIG_SENSORS_XGENE)    += xgene-hwmon.o
>>>>>>>     obj-$(CONFIG_PMBUS)        += pmbus/
>>>>>>> +obj-$(CONFIG_GENERIC_PWM_TACHOMETER) += generic-pwm-tachometer.o
>>>>>>>     ccflags-$(CONFIG_HWMON_DEBUG_CHIP) := -DDEBUG
>>>>>>>   diff --git a/drivers/hwmon/generic-pwm-tachometer.c 
>>>>>>> b/drivers/hwmon/generic-pwm-tachometer.c
>>>>>>> new file mode 100644
>>>>>>> index 0000000..9354d43
>>>>>>> --- /dev/null
>>>>>>> +++ b/drivers/hwmon/generic-pwm-tachometer.c
>>>>>>> @@ -0,0 +1,112 @@
>>>>>>> +/*
>>>>>>> + * Copyright (c) 2017-2018, NVIDIA CORPORATION.  All rights 
>>>>>>> reserved.
>>>>>>> + *
>>>>>>> + * This program is free software; you can redistribute it 
>>>>>>> and/or modify it
>>>>>>> + * under the terms and conditions of the GNU General Public 
>>>>>>> License,
>>>>>>> + * version 2, as published by the Free Software Foundation.
>>>>>>> + *
>>>>>>> + * This program is distributed in the hope it will be useful, 
>>>>>>> but WITHOUT
>>>>>>> + * ANY WARRANTY; without even the implied warranty of 
>>>>>>> MERCHANTABILITY or
>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General 
>>>>>>> Public License for
>>>>>>> + * more details.
>>>>>>> + *
>>>>>>> + */
>>>>>>> +
>>>>>>> +#include <linux/module.h>
>>>>>>> +#include <linux/platform_device.h>
>>>>>>> +#include <linux/err.h>
>>>>>>> +#include <linux/pwm.h>
>>>>>>> +#include <linux/hwmon.h>
>>>>>>> +#include <linux/hwmon-sysfs.h>
>>>>>>> +
>>>>>>> +struct pwm_hwmon_tach {
>>>>>>> +    struct device        *dev;
>>>>>>> +    struct pwm_device    *pwm;
>>>>>>> +    struct device        *hwmon;
>>>>>>> +};
>>>>>>> +
>>>>>>> +static ssize_t show_rpm(struct device *dev, struct 
>>>>>>> device_attribute *attr,
>>>>>>> +            char *buf)
>>>>>>> +{
>>>>>>> +    struct pwm_hwmon_tach *ptt = dev_get_drvdata(dev);
>>>>>>> +    struct pwm_device *pwm = ptt->pwm;
>>>>>>> +    struct pwm_capture result;
>>>>>>> +    int err;
>>>>>>> +    unsigned int rpm = 0;
>>>>>>> +
>>>>>>> +    err = pwm_capture(pwm, &result, 0);
>>>>>>> +    if (err < 0) {
>>>>>>> +        dev_err(ptt->dev, "Failed to capture PWM: %d\n", err);
>>>>>>> +        return err;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (result.period)
>>>>>>> +        rpm = DIV_ROUND_CLOSEST_ULL(60ULL * NSEC_PER_SEC,
>>>>>>> +                        result.period);
>>>>>>> +
>>>>>>> +    return sprintf(buf, "%u\n", rpm);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static SENSOR_DEVICE_ATTR(rpm, 0444, show_rpm, NULL, 0);
>>>>>>> +
>>>>>>> +static struct attribute *pwm_tach_attrs[] = {
>>>>>>> +    &sensor_dev_attr_rpm.dev_attr.attr,
>>>>>>> +    NULL,
>>>>>>> +};
>>>>>>
>>>>>> "rpm" is not a standard hwmon sysfs attribute. If you don't provide
>>>>>> a single standard hwmon sysfs attribute, having a hwmon driver is 
>>>>>> pointless.
>>>>> Guenter Roeck,
>>>>> I will define a new hwmon sysfs attribute node called 
>>>>> "hwmon_tachometer_attributes" in hwmon.h like below and update the 
>>>>> same in tachometer hwmon driver. Is it fine ?
>>>>> enum hwmon_tachometer_attributes {
>>>>
>>>> Are you kidding me ?
>>>>
>>>> Guenter
>>> Sorry, I just wanted to confirm whether my understanding is correct 
>>> or not before implementing it actually.
>>> Or, shall I add this attribute as a part of fan attributes with 
>>> "hwmon_fan_rpm" ? or any other way to do it ? I need your inputs in 
>>> fixing this.
>>
>> I think he wants the attribute to be named according to the 
>> properties in this document: 
>> https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface. I 
>> guess the attribute would then map to fan1_input, though I'm not sure 
>> if that's 100% correct either since this could technically be 
>> attached to something other than a fan. But I would think in practice 
>> that's not a big concern.
>>
>> Guenter,
>> Please correct me as well if I'm wrong.
>>
> You are absolutely correct.
>
> While I am not opposed to ABI changes, the merits of those would need 
> to be
> discussed on the mailing list. But replacing "fan1_input" with "rpm" is
> not an acceptable ABI change, even if it may measure something that turns
> but isn't a fan.
>
> If this _is_ in fact supposed to be used for something else but fans, we
> would have to discuss what that might be, and if hwmon is the appropriate
> subsystem to measure and report it. This does to some degree lead back to
> my concern of having the "fan" part of this patch series in the pwm core.
> I am still not sure if that makes sense.
>
> Thanks,
> Guenter
I am planning to add tachometer support in pwm-fan.c driver 
(drivers/hwmon/) instead of adding new generic-pwm-tachometer.c driver. 
Measuring RPM value will be done in pwm-fan driver itself using pwm 
capture feature and will add new sysfs attributes under this driver to 
report rpm value of fan.
>
>> Thank you,
>> Mikko
>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe 
>>> linux-tegra" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe 
>> linux-hwmon" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v2 1/8] [PATCH 1/8] drivers/peci: Add support for PECI bus driver core
From: Julia Cartwright @ 2018-03-07  3:19 UTC (permalink / raw)
  To: Jae Hyun Yoo
  Cc: joel, andrew, arnd, gregkh, jdelvare, linux, benh, andrew,
	linux-kernel, linux-doc, devicetree, linux-hwmon,
	linux-arm-kernel, openbmc
In-Reply-To: <20180221161606.32247-2-jae.hyun.yoo@linux.intel.com>

On Wed, Feb 21, 2018 at 08:15:59AM -0800, Jae Hyun Yoo wrote:
> This commit adds driver implementation for PECI bus into linux
> driver framework.
> 
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
> ---
[..]
> +static int peci_locked_xfer(struct peci_adapter *adapter,
> +			    struct peci_xfer_msg *msg,
> +			    bool do_retry,
> +			    bool has_aw_fcs)

_locked generally means that this function is invoked with some critical
lock held, what lock does the caller need to acquire before invoking
this function?

> +{
> +	ktime_t start, end;
> +	s64 elapsed_ms;
> +	int rc = 0;
> +
> +	if (!adapter->xfer) {

Is this really an optional feature of an adapter?  If this is not
optional, then this check should be in place when the adapter is
registered, not here.  (And it should WARN_ON(), because it's a driver
developer error).

> +		dev_dbg(&adapter->dev, "PECI level transfers not supported\n");
> +		return -ENODEV;
> +	}
> +
> +	if (in_atomic() || irqs_disabled()) {

As Andrew mentioned, this is broken.

You don't even need a might_sleep().  The locking functions you use here
will already include a might_sleep() w/ CONFIG_DEBUG_ATOMIC_SLEEP.

> +		rt_mutex_trylock(&adapter->bus_lock);
> +		if (!rc)
> +			return -EAGAIN; /* PECI activity is ongoing */
> +	} else {
> +		rt_mutex_lock(&adapter->bus_lock);
> +	}
> +
> +	if (do_retry)
> +		start = ktime_get();
> +
> +	do {
> +		rc = adapter->xfer(adapter, msg);
> +
> +		if (!do_retry)
> +			break;
> +
> +		/* Per the PECI spec, need to retry commands that return 0x8x */
> +		if (!(!rc && ((msg->rx_buf[0] & DEV_PECI_CC_RETRY_ERR_MASK) ==
> +			      DEV_PECI_CC_TIMEOUT)))
> +			break;

This is pretty difficult to parse.  Can you split it into two different
conditions?

> +
> +		/* Set the retry bit to indicate a retry attempt */
> +		msg->tx_buf[1] |= DEV_PECI_RETRY_BIT;

Are you sure this bit is to be set in the _second_ byte of tx_buf?

> +
> +		/* Recalculate the AW FCS if it has one */
> +		if (has_aw_fcs)
> +			msg->tx_buf[msg->tx_len - 1] = 0x80 ^
> +						peci_aw_fcs((u8 *)msg,
> +							    2 + msg->tx_len);
> +
> +		/* Retry for at least 250ms before returning an error */
> +		end = ktime_get();
> +		elapsed_ms = ktime_to_ms(ktime_sub(end, start));
> +		if (elapsed_ms >= DEV_PECI_RETRY_TIME_MS) {
> +			dev_dbg(&adapter->dev, "Timeout retrying xfer!\n");
> +			break;
> +		}
> +	} while (true);
> +
> +	rt_mutex_unlock(&adapter->bus_lock);
> +
> +	return rc;
> +}
> +
> +static int peci_xfer(struct peci_adapter *adapter, struct peci_xfer_msg *msg)
> +{
> +	return peci_locked_xfer(adapter, msg, false, false);
> +}
> +
> +static int peci_xfer_with_retries(struct peci_adapter *adapter,
> +				  struct peci_xfer_msg *msg,
> +				  bool has_aw_fcs)
> +{
> +	return peci_locked_xfer(adapter, msg, true, has_aw_fcs);
> +}
> +
> +static int peci_scan_cmd_mask(struct peci_adapter *adapter)
> +{
> +	struct peci_xfer_msg msg;
> +	u32 dib;
> +	int rc = 0;
> +
> +	/* Update command mask just once */
> +	if (adapter->cmd_mask & BIT(PECI_CMD_PING))
> +		return 0;
> +
> +	msg.addr      = PECI_BASE_ADDR;
> +	msg.tx_len    = GET_DIB_WR_LEN;
> +	msg.rx_len    = GET_DIB_RD_LEN;
> +	msg.tx_buf[0] = GET_DIB_PECI_CMD;
> +
> +	rc = peci_xfer(adapter, &msg);
> +	if (rc < 0) {
> +		dev_dbg(&adapter->dev, "PECI xfer error, rc : %d\n", rc);
> +		return rc;
> +	}
> +
> +	dib = msg.rx_buf[0] | (msg.rx_buf[1] << 8) |
> +	      (msg.rx_buf[2] << 16) | (msg.rx_buf[3] << 24);
> +
> +	/* Check special case for Get DIB command */
> +	if (dib == 0x00) {
> +		dev_dbg(&adapter->dev, "DIB read as 0x00\n");
> +		return -1;
> +	}
> +
> +	if (!rc) {

You should change this to:

	if (rc) {
		dev_dbg(&adapter->dev, "Error reading DIB, rc : %d\n", rc);
		return rc;
	}

And then leave the happy path below unindented.

> +		/**
> +		 * setting up the supporting commands based on minor rev#
> +		 * see PECI Spec Table 3-1
> +		 */
> +		dib = (dib >> 8) & 0xF;
> +
> +		if (dib >= 0x1) {
> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_PKG_CFG);
> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_PKG_CFG);
> +		}
> +
> +		if (dib >= 0x2)
> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_IA_MSR);
> +
> +		if (dib >= 0x3) {
> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG_LOCAL);
> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG_LOCAL);
> +		}
> +
> +		if (dib >= 0x4)
> +			adapter->cmd_mask |= BIT(PECI_CMD_RD_PCI_CFG);
> +
> +		if (dib >= 0x5)
> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_PCI_CFG);
> +
> +		if (dib >= 0x6)
> +			adapter->cmd_mask |= BIT(PECI_CMD_WR_IA_MSR);
> +
> +		adapter->cmd_mask |= BIT(PECI_CMD_GET_TEMP);
> +		adapter->cmd_mask |= BIT(PECI_CMD_GET_DIB);
> +		adapter->cmd_mask |= BIT(PECI_CMD_PING);

These cmd_mask updates are not done with any locking in mind.  Is this
intentional?  Or: is synchronization not necessary because this is
always done during enumeration prior to exposing the adapter to users?

> +	} else {
> +		dev_dbg(&adapter->dev, "Error reading DIB, rc : %d\n", rc);
> +	}
> +
> +	return rc;
> +}
> +
> +static int peci_cmd_support(struct peci_adapter *adapter, enum peci_cmd cmd)
> +{
> +	if (!(adapter->cmd_mask & BIT(PECI_CMD_PING)) &&
> +	    peci_scan_cmd_mask(adapter) < 0) {
> +		dev_dbg(&adapter->dev, "Failed to scan command mask\n");
> +		return -EIO;
> +	}
> +
> +	if (!(adapter->cmd_mask & BIT(cmd))) {
> +		dev_dbg(&adapter->dev, "Command %d is not supported\n", cmd);
> +		return -EINVAL;
> +	}

It would be nicer if you did this check prior to dispatching to the
various subfunctions (peci_ioctl_ping, peci_ioctl_get_dib, etc.).  In
that way, these functions could just assume the adapter supports them.

[..]
> +static int peci_register_adapter(struct peci_adapter *adapter)
> +{
> +	int res = -EINVAL;
> +
> +	/* Can't register until after driver model init */
> +	if (WARN_ON(!is_registered)) {

Is this solving a problem you actually ran into?

[.. skipped review due to fatigue ..]

> +++ b/include/linux/peci.h
> @@ -0,0 +1,97 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2018 Intel Corporation
> +
> +#ifndef __LINUX_PECI_H
> +#define __LINUX_PECI_H
> +
> +#include <linux/cdev.h>
> +#include <linux/device.h>
> +#include <linux/peci-ioctl.h>
> +#include <linux/rtmutex.h>
> +
> +#define PECI_BUFFER_SIZE  32
> +#define PECI_NAME_SIZE    32
> +
> +struct peci_xfer_msg {
> +	u8	addr;
> +	u8	tx_len;
> +	u8	rx_len;
> +	u8	tx_buf[PECI_BUFFER_SIZE];
> +	u8	rx_buf[PECI_BUFFER_SIZE];
> +} __attribute__((__packed__));

The packed attribute has historically caused gcc to emit atrocious code,
as it seems to assume packed implies members might not be naturally
aligned.  Seeing as you're only working with u8s in this case, though,
this shouldn't be a problem.

> +struct peci_board_info {
> +	char			type[PECI_NAME_SIZE];
> +	u8			addr;	/* CPU client address */
> +	struct device_node	*of_node;
> +};
> +
> +struct peci_adapter {
> +	struct module	*owner;
> +	struct rt_mutex	bus_lock;

Why an rt_mutex, instead of a regular mutex.  Do you explicitly need PI
in mainline?

> +	struct device	dev;
> +	struct cdev	cdev;
> +	int		nr;
> +	char		name[PECI_NAME_SIZE];
> +	int		(*xfer)(struct peci_adapter *adapter,
> +				struct peci_xfer_msg *msg);
> +	uint		cmd_mask;
> +};
> +
> +#define to_peci_adapter(d) container_of(d, struct peci_adapter, dev)

You can also do this with a static inline, which provides a marginally
better error when screwed up.

   Julia

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox