linux-i2c.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Felipe Balbi <balbi@ti.com>
To: Alexander Kochetkov <al.kochet@gmail.com>
Cc: balbi@ti.com, linux-i2c@vger.kernel.org,
	linux-omap@vger.kernel.org, Tony Lindgren <tony@atomide.com>,
	Wolfram Sang <wsa@the-dreams.de>
Subject: Re: [PATCH 2/2] i2c: omap: fix "Too much work in one IRQ" irq handling
Date: Sun, 16 Nov 2014 09:45:58 -0600	[thread overview]
Message-ID: <20141116154558.GA7422@saruman> (raw)
In-Reply-To: <A47891AC-7453-43F0-9196-5133A1525C68@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6495 bytes --]

Hi,

On Sat, Nov 15, 2014 at 08:42:03AM +0300, Alexander Kochetkov wrote:
> Hello again.
> 
> > (please, never top-post)
> Sorry.
> 
> Sorry for the inaccurate presentation of ideas. I am not a native
> English speaker.

neither am I ;-)

> First about patches:
> [PATCH 1/2] and [PATCH 2/2] - intended to solve two independent problems.
> They were sent as series, In the future, try not to do so, In order
> not to mislead.

sending several fixes as a series is not a problem, a problem would be
to make fixes depend on new features or cleanups.

> > How could I ever call omap_i2c_complete_cmd() with 'err' set as 0 if I
> > had either a NACK or Arbitration Lost ?
> 
> [PATCH 1/2] - fix AL, NACK handling and does not apply to [PATCH 2/2].
> It not cause of problem solved of [PATCH 2/2].

still, how could we ever have that situation ? We break out of the loop
as soon as an error is encountered.

> > \bright, this is the same as it was before. If count reaches 100 we will
> > omap_i2c_complete_cmd().
> *No*
> 
> During refactoring submitted a series of patches was made the mistake.
> This error alters the behavior of the interrupt handler, if it ends
> with the message "Too much work in one IRQ".
>
> Error in the commit 66b9298878742f08cb6e79b7c7d5632d782fd1e1. 
> All subsequent commits were correct and translate this error further.
> 
> In the parent commit 3312d25e1abdc41be8a75a1b2c3ccaa39a14ed99
> (the commit before 66b9298878742f08cb6e79b7c7d5632d782fd1e1) in case
> count reaches 100, loop breaks *without* calling omap_i2c_complete_cmd().
> 
> In commit 66b9298878742f08cb6e79b7c7d5632d782fd1e1, in case
> count reaches 100, loop breaks and omap_i2c_complete_cmd() *get called*.

aaa, now I see what you're talking about. I'll review that code on
Monday. Let me see if that was intentional or I missed something.

> To see that, you need to open two versions of a file i2c-omap.c, side
> by side.
> 
> i2c-omap.c version corresponding to parent commit:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-omap.c?id=3312d25e1abdc41be8a75a1b2c3ccaa39a14ed99#n823
> 
> i2c-omap.c version corresponding next commit:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-omap.c?id=66b9298878742f08cb6e79b7c7d5632d782fd1e1#n823
> 
> According to wikipedia: "Code refactoring is the process of
> restructuring existing computer code – changing the factoring –
> without changing its external behavior.'
> 
> And commit's (66b9298878742f08cb6e79b7c7d5632d782fd1e1) message states what:
> i2c: omap: switch over to do {} while loop
> this will make sure that we execute at least once.
> No functional changes otherwise.
> 
> But functional change present!
> It's call of omap_i2c_complete_cmd()!
> 
> The next question: What affects this change?
> If count reaches 100 and loop breaks and no error occupy during loop processing,
> and no error would be set in err_cmd, omap_i2c_complete_cmd will set 0 as
> result of transfer and wakeup omap_i2c_xfer. In other words, current i2c transfer
> will be aborted with incorrect (success) status set. But, does transfer completed in real?
> Do buf data was sent to i2c slave device or received from it in real?
> Upper layer code would thing, that data was sent successfully.
> 
> How to see that bug in real live? Just add extra delayes into isr thread.
> I did it unintentionally inserting dev_warn into isr thread.

right, apparently it is a bug, but it's very difficult to reproduce
considering we break out of IRQ handler as soon as something gets
transferred. The possibility of that count reaching 100 is very minor. A
bug nevertheless.

> BTW. I made more testing and provide more logs to demonstrate affected
> changes.
> 
> 
> > which deadlock are you talking about ? How do you trigger it ? Where are
> > the lockup debug traces ?
> > 
> > Well, I tried to debug I2C ISR hang issue (thats another question I
> > want to discuss later) using output to console. I places few dev_warn
> > 
> > if you found a bug with the ISR, why discuss it later ? How can I
> > understand if you found a real bug without proper logs ?
> 
> 
> I put in order my thoughts and describe. It's next problem, not
> related to patch1/patch2.
> 
> In general, the problem (the 3rd one) is that linux either hang or
> segfault in the i2c-omap driver if another master on the i2c bus
> (multimaster environment),

multimaster is not supported by this driver, so you can't call that a
bug. It's a missing feature, it needs to be implemented. Nobody has ever
had access to a multimaster environment where to develop it, so it was
never done.

> submit write request to General Call Address. In that case ISR could
> not correctly handle RDR (or XRDY, ARDY, or that ever). Thats becase
> i2c-omap driver doesn't correctly handle i2c-controller's slave mode.

right, Linux doesn't support being the slave. That's also a missing
feature, not a bug.

> But controller enter slave mode after each master transfer. Yes, AAS
> and GC interrupts masked, but i2c-controller still sends RDR (or that
> ever events) when it detect slave access from another master on the
> bus.
> 
> I have two safe solutions of the (3rd) problem in the mind:
> - keep interrupts disabled between i2c-master access (I think about implementing
> i2c_omap_interrupt_enable_clr/i2c_omap_interrupt_enable_set helpers and putting
> them in the right places)
> - keep controller disabled between i2c-master access (keep EN-bit of CON register 0)

send a patch and we'll see, but keep in mind if you want to support
multimaster, you need to implement it as per documentation.

> That solutions also help with races between isr and xfer_msg.

what races ? If you found races, that's another problem which should be
fixed separately from implementing a new feature.

> What do you think about that?
> 
> How to reproduce 3-rd problem:
> In order to ease reproduce, you should disable i2c controller from
> entering suspend mode:
> echo on > /sys/devices/platform/omap/omap_i2c.2/power/control
> 
> And then, using another i2c-master, connected to the same (i2c.2) bus,
> initiate I2C write transfer to Generall Call Address, than linux ether
> hang or isr thread segfault :)

right, it has never been implemented, what would you expect ? ;-)

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2014-11-16 15:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-15  1:20 [PATCH 1/2] i2c: omap: fix NACK and Arbitration Lost irq handling Alexander Kochetkov
     [not found] ` <1416014452-6712-1-git-send-email-al.kochet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-15  1:20   ` [PATCH 2/2] i2c: omap: fix "Too much work in one IRQ" " Alexander Kochetkov
     [not found]     ` <1416014452-6712-2-git-send-email-al.kochet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-15  1:48       ` Felipe Balbi
2014-11-15  2:37         ` Alexander Kochetkov
2014-11-15  3:47           ` Felipe Balbi
2014-11-15  3:53             ` Felipe Balbi
2014-11-15  5:42               ` Alexander Kochetkov
2014-11-16 15:45                 ` Felipe Balbi [this message]
2014-11-17 14:41                   ` Wolfram Sang
2014-11-18 16:00                     ` Felipe Balbi
2014-11-18 16:12                       ` Wolfram Sang
2014-11-20 16:38                         ` Alexander Kochetkov
     [not found]                           ` <2DED62C3-7C54-49E0-A39B-F68D5DAC66B1-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-20 16:43                             ` Wolfram Sang
2014-11-18 16:31                       ` Alexander Kochetkov
     [not found]                         ` <5D39428D-F359-4F04-8ACC-D607011B88B9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-18 16:38                           ` Felipe Balbi
2014-11-15  1:43   ` [PATCH 1/2] i2c: omap: fix NACK and Arbitration Lost " Felipe Balbi
     [not found]     ` <2159E044-9130-410D-905B-B941408DCDCD@gmail.com>
     [not found]       ` <2159E044-9130-410D-905B-B941408DCDCD-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-15  2:48         ` Alexander Kochetkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141116154558.GA7422@saruman \
    --to=balbi@ti.com \
    --cc=al.kochet@gmail.com \
    --cc=linux-i2c@vger.kernel.org \
    --cc=linux-omap@vger.kernel.org \
    --cc=tony@atomide.com \
    --cc=wsa@the-dreams.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).