From: Felipe Balbi <balbi@ti.com>
To: Alexander Kochetkov <al.kochet@gmail.com>
Cc: balbi@ti.com, linux-i2c@vger.kernel.org,
linux-omap@vger.kernel.org, Tony Lindgren <tony@atomide.com>,
Wolfram Sang <wsa@the-dreams.de>
Subject: Re: [PATCH 2/2] i2c: omap: fix "Too much work in one IRQ" irq handling
Date: Sun, 16 Nov 2014 09:45:58 -0600 [thread overview]
Message-ID: <20141116154558.GA7422@saruman> (raw)
In-Reply-To: <A47891AC-7453-43F0-9196-5133A1525C68@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 6495 bytes --]
Hi,
On Sat, Nov 15, 2014 at 08:42:03AM +0300, Alexander Kochetkov wrote:
> Hello again.
>
> > (please, never top-post)
> Sorry.
>
> Sorry for the inaccurate presentation of ideas. I am not a native
> English speaker.
neither am I ;-)
> First about patches:
> [PATCH 1/2] and [PATCH 2/2] - intended to solve two independent problems.
> They were sent as series, In the future, try not to do so, In order
> not to mislead.
sending several fixes as a series is not a problem, a problem would be
to make fixes depend on new features or cleanups.
> > How could I ever call omap_i2c_complete_cmd() with 'err' set as 0 if I
> > had either a NACK or Arbitration Lost ?
>
> [PATCH 1/2] - fix AL, NACK handling and does not apply to [PATCH 2/2].
> It not cause of problem solved of [PATCH 2/2].
still, how could we ever have that situation ? We break out of the loop
as soon as an error is encountered.
> > \bright, this is the same as it was before. If count reaches 100 we will
> > omap_i2c_complete_cmd().
> *No*
>
> During refactoring submitted a series of patches was made the mistake.
> This error alters the behavior of the interrupt handler, if it ends
> with the message "Too much work in one IRQ".
>
> Error in the commit 66b9298878742f08cb6e79b7c7d5632d782fd1e1.
> All subsequent commits were correct and translate this error further.
>
> In the parent commit 3312d25e1abdc41be8a75a1b2c3ccaa39a14ed99
> (the commit before 66b9298878742f08cb6e79b7c7d5632d782fd1e1) in case
> count reaches 100, loop breaks *without* calling omap_i2c_complete_cmd().
>
> In commit 66b9298878742f08cb6e79b7c7d5632d782fd1e1, in case
> count reaches 100, loop breaks and omap_i2c_complete_cmd() *get called*.
aaa, now I see what you're talking about. I'll review that code on
Monday. Let me see if that was intentional or I missed something.
> To see that, you need to open two versions of a file i2c-omap.c, side
> by side.
>
> i2c-omap.c version corresponding to parent commit:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-omap.c?id=3312d25e1abdc41be8a75a1b2c3ccaa39a14ed99#n823
>
> i2c-omap.c version corresponding next commit:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-omap.c?id=66b9298878742f08cb6e79b7c7d5632d782fd1e1#n823
>
> According to wikipedia: "Code refactoring is the process of
> restructuring existing computer code – changing the factoring –
> without changing its external behavior.'
>
> And commit's (66b9298878742f08cb6e79b7c7d5632d782fd1e1) message states what:
> i2c: omap: switch over to do {} while loop
> this will make sure that we execute at least once.
> No functional changes otherwise.
>
> But functional change present!
> It's call of omap_i2c_complete_cmd()!
>
> The next question: What affects this change?
> If count reaches 100 and loop breaks and no error occupy during loop processing,
> and no error would be set in err_cmd, omap_i2c_complete_cmd will set 0 as
> result of transfer and wakeup omap_i2c_xfer. In other words, current i2c transfer
> will be aborted with incorrect (success) status set. But, does transfer completed in real?
> Do buf data was sent to i2c slave device or received from it in real?
> Upper layer code would thing, that data was sent successfully.
>
> How to see that bug in real live? Just add extra delayes into isr thread.
> I did it unintentionally inserting dev_warn into isr thread.
right, apparently it is a bug, but it's very difficult to reproduce
considering we break out of IRQ handler as soon as something gets
transferred. The possibility of that count reaching 100 is very minor. A
bug nevertheless.
> BTW. I made more testing and provide more logs to demonstrate affected
> changes.
>
>
> > which deadlock are you talking about ? How do you trigger it ? Where are
> > the lockup debug traces ?
> >
> > Well, I tried to debug I2C ISR hang issue (thats another question I
> > want to discuss later) using output to console. I places few dev_warn
> >
> > if you found a bug with the ISR, why discuss it later ? How can I
> > understand if you found a real bug without proper logs ?
>
>
> I put in order my thoughts and describe. It's next problem, not
> related to patch1/patch2.
>
> In general, the problem (the 3rd one) is that linux either hang or
> segfault in the i2c-omap driver if another master on the i2c bus
> (multimaster environment),
multimaster is not supported by this driver, so you can't call that a
bug. It's a missing feature, it needs to be implemented. Nobody has ever
had access to a multimaster environment where to develop it, so it was
never done.
> submit write request to General Call Address. In that case ISR could
> not correctly handle RDR (or XRDY, ARDY, or that ever). Thats becase
> i2c-omap driver doesn't correctly handle i2c-controller's slave mode.
right, Linux doesn't support being the slave. That's also a missing
feature, not a bug.
> But controller enter slave mode after each master transfer. Yes, AAS
> and GC interrupts masked, but i2c-controller still sends RDR (or that
> ever events) when it detect slave access from another master on the
> bus.
>
> I have two safe solutions of the (3rd) problem in the mind:
> - keep interrupts disabled between i2c-master access (I think about implementing
> i2c_omap_interrupt_enable_clr/i2c_omap_interrupt_enable_set helpers and putting
> them in the right places)
> - keep controller disabled between i2c-master access (keep EN-bit of CON register 0)
send a patch and we'll see, but keep in mind if you want to support
multimaster, you need to implement it as per documentation.
> That solutions also help with races between isr and xfer_msg.
what races ? If you found races, that's another problem which should be
fixed separately from implementing a new feature.
> What do you think about that?
>
> How to reproduce 3-rd problem:
> In order to ease reproduce, you should disable i2c controller from
> entering suspend mode:
> echo on > /sys/devices/platform/omap/omap_i2c.2/power/control
>
> And then, using another i2c-master, connected to the same (i2c.2) bus,
> initiate I2C write transfer to Generall Call Address, than linux ether
> hang or isr thread segfault :)
right, it has never been implemented, what would you expect ? ;-)
--
balbi
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
next prev parent reply other threads:[~2014-11-16 15:45 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-15 1:20 [PATCH 1/2] i2c: omap: fix NACK and Arbitration Lost irq handling Alexander Kochetkov
[not found] ` <1416014452-6712-1-git-send-email-al.kochet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-15 1:20 ` [PATCH 2/2] i2c: omap: fix "Too much work in one IRQ" " Alexander Kochetkov
[not found] ` <1416014452-6712-2-git-send-email-al.kochet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-15 1:48 ` Felipe Balbi
2014-11-15 2:37 ` Alexander Kochetkov
2014-11-15 3:47 ` Felipe Balbi
2014-11-15 3:53 ` Felipe Balbi
2014-11-15 5:42 ` Alexander Kochetkov
2014-11-16 15:45 ` Felipe Balbi [this message]
2014-11-17 14:41 ` Wolfram Sang
2014-11-18 16:00 ` Felipe Balbi
2014-11-18 16:12 ` Wolfram Sang
2014-11-20 16:38 ` Alexander Kochetkov
[not found] ` <2DED62C3-7C54-49E0-A39B-F68D5DAC66B1-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-20 16:43 ` Wolfram Sang
2014-11-18 16:31 ` Alexander Kochetkov
[not found] ` <5D39428D-F359-4F04-8ACC-D607011B88B9-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-18 16:38 ` Felipe Balbi
2014-11-15 1:43 ` [PATCH 1/2] i2c: omap: fix NACK and Arbitration Lost " Felipe Balbi
[not found] ` <2159E044-9130-410D-905B-B941408DCDCD@gmail.com>
[not found] ` <2159E044-9130-410D-905B-B941408DCDCD-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-11-15 2:48 ` Alexander Kochetkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141116154558.GA7422@saruman \
--to=balbi@ti.com \
--cc=al.kochet@gmail.com \
--cc=linux-i2c@vger.kernel.org \
--cc=linux-omap@vger.kernel.org \
--cc=tony@atomide.com \
--cc=wsa@the-dreams.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).