linux-i2c.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jean Delvare <khali-PUYAD+kWke1g9hUCZPvPmw@public.gmane.org>
To: Andrew Worsley <amworsley-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: ben-linux-elnMNo+KYs3YtjvyW6yDsg@public.gmane.org,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Peter Korsgaard <jacmet-OfajU3CKLf1/SzgSGea1oA@public.gmane.org>
Subject: Re: i2c-ocores timeout bug
Date: Thu, 16 Jun 2011 18:52:09 +0200	[thread overview]
Message-ID: <20110616185209.08ef41ed@endymion.delvare> (raw)
In-Reply-To: <BANLkTikEESnV9+hc=VFg=qFGOXeg4o0sOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Hi Andrew,

On Wed, 15 Jun 2011 08:33:45 +1000, Andrew Worsley wrote:
> Hi, I have hit upon a bug in this driver in the 2.6.32 which caused
> memory corrupt and crash in my kernel. It appears to be still present
> in 3.0-rc3 so as you are listed as the current  kernel i2c maintainers
> which covers i2c-ocores I thought you could comment on it and the
> proposed fix I enclose below.

The same MAINTAINERS file also says that Peter Korsgaard is maintaining
the i2c-ocores driver, so he would be an even better recipient for your
request. Cc'd. Peter, can you please review the patch? Thanks.

> 
> Basically the i2c-ocores xfer function, kicks off an interrupt driven
> array of i2c_msg transfers and waits 1second for it to complete and
> just
> returns -ETIMEDOUT if it times out. The interrupt driven transfer is
> *not* stopped on a time out and continues referencing the i2c_msg
> structure. In my
> case an i2c read the interrupt handler keeps adding bytes to the
> structure. Unfortunately after the time out the structure was released
> and it's length field was changed to a large number and the interrupt
> driver kept happily writing bytes way past the end of the original
> buffer until the kernel crashed or locked up!
> 
> Below is a fix that works for me in 2.6.32 which removes the i2c_msg
> buffer from the interrupt handler and changes the handler to check for
> it's removal.
> 
> diff --git a/drivers/i2c/busses/i2c-ocores.c b/drivers/i2c/busses/i2c-ocores.c
> index 8dee246..21e57a0 100644
> --- a/drivers/i2c/busses/i2c-ocores.c
> +++ b/drivers/i2c/busses/i2c-ocores.c
> @@ -137,6 +137,12 @@ static void ocores_process(struct ocores_i2c *i2c)
>                 wake_up(&i2c->wait);
>                 return;
>         }
> +       if (msg == 0) {
> +       /* Caller has with drawn the request, buffer no longer available */
> +           i2c->state = STATE_ERROR;
> +           oc_setreg(i2c, OCI2C_CMD, OCI2C_CMD_STOP);
> +           return;
> +       }
> 
>         /* error? */
>         if (stat & OCI2C_STAT_ARBLOST) {
> @@ -223,8 +229,13 @@ static int ocores_xfer(struct i2c_adapter *adap,
> struct i2c_msg *msgs, int num)
>         if (wait_event_timeout(i2c->wait, (i2c->state == STATE_ERROR) ||
>                                (i2c->state == STATE_DONE), HZ))
>                 return (i2c->state == STATE_DONE) ? num : -EIO;
> -       else
> +       else {
> +               printk(KERN_NOTICE
> +               "ocores_xfer() i2c transfer to %d-%#x timed out\n",
> +                       i2c_adapter_id(adap), msgs->addr);
> +               i2c->msg = 0; /* remove the caller's request which
> will be re-used */
>                 return -ETIMEDOUT;
> +       }
>  }
> 
>  static void ocores_init(struct ocores_i2c *i2c)
> 
> Andrew
> 
> Some more details of the failure -
> 
> Here's the user space command that causes the lock up:
> 
>    i2cdump -y 4 0x50 i
> 
> Note: With out the i, it performs one byte transfers and is fine. It
> seems my i2c-ocores i2c transfers go *very* slowly and in order to get
> the above command to not time out I had to bump up the i2c-ocores time
> out value to 3s. I suspect that's not necessary for normal situation
> which is why this bug may not have been noticed before. Also most
> other transactions are usually one byte device probes which don't get
> a response. In my case the interrupts appear to be slow - taking some
> 15 bytes over one second and there is plenty more data if the
> interrupt driver wants to keep reading.
> 
> Here's the transfer function and you can see how it just waits for
> 1sec and then fails the request with out stopping the interrupt driver
> from continuing to reference the request data:
> 
>  static int ocores_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
> {
> 	struct ocores_i2c *i2c = i2c_get_adapdata(adap);
> 
> 	i2c->msg = msgs;
> 	i2c->pos = 0;
> 	i2c->nmsgs = num;
> 	i2c->state = STATE_START;
> 
> 	oc_setreg(i2c, OCI2C_DATA,
> 			(i2c->msg->addr << 1) |
> 			((i2c->msg->flags & I2C_M_RD) ? 1:0));
> 
> 	oc_setreg(i2c, OCI2C_CMD, OCI2C_CMD_START);
> 
> 	if (wait_event_timeout(i2c->wait, (i2c->state == STATE_ERROR) ||
> 			       (i2c->state == STATE_DONE), HZ))
> 		return (i2c->state == STATE_DONE) ? num : -EIO;
> 	else
> 		return -ETIMEDOUT;
> }
> 
> My kernel would crash with various faults including:  "Thread overran
> stack, or stack corrupted".
> I added in some debugging into the i2c-ocores module and traced things
> down to apparently a corruption of the len field of the i2c transfer
> i2c_msg structure:
> 
> The above i2cdump command generates an i2c_msg read transfer for 32
> bytes of data. At the 16th byte of read data the length field is
> over-written with 49844 :
> 
> Read finished 1 xfers left
> ocores_process() state=1 status=0x41
> ocores_process() Read addr=0x50 len=32
> ocores_process() state=3 status=0x41
> Read [0] = 0x6 len=32
> ocores_process() state=3 status=0x41
> Read [1] = 0x40 len=32
> ocores_process() state=3 status=0x41
> Read [2] = 0x55 len=32
> ocores_process() state=3 status=0x41
> Read [3] = 0x0 len=32
> ocores_process() state=3 status=0x41
> Read [4] = 0xfb len=32
> ocores_process() state=3 status=0x41
> Read [5] = 0x0 len=32
> ocores_process() state=3 status=0x41
> Read [6] = 0x50 len=32
> ocores_process() state=3 status=0x41
> Read [7] = 0x0 len=32
> ocores_process() state=3 status=0x41
> Read [8] = 0x0 len=32
> ocores_process() state=3 status=0x41
> Read [9] = 0x0 len=32
> ocores_process() state=3 status=0x41
> Read [10] = 0xcf len=32
> ocores_process() state=3 status=0x41
> Read [11] = 0xde len=32
> ocores_process() state=3 status=0x41
> Read [12] = 0xe6 len=32
> ocores_process() state=3 status=0x41
> Read [13] = 0xdc len=32
> ocores_process() state=3 status=0x41
> Read [14] = 0xea len=32
> ocores_process() state=3 status=0x41
> Read [15] = 0xff len=49844
> ocores_process() state=3 status=0x41
> Read [16] = 0x96 len=49844
> ocores_process() state=3 status=0x41
> ....
> 
>         It happily continues past the end of the buffer at 32 ...
> 
> Read [28] = 0x18 len=49844
> ocores_process() state=3 status=0x41
> Read [29] = 0xa5 len=49844
> ocores_process() state=3 status=0x41
> Read [30] = 0x31 len=49844
> ocores_process() state=3 status=0x41
> Read [31] = 0x27 len=49844
> ocores_process() state=3 status=0x41
> Read [32] = 0x1f len=49844
> ocores_process() state=3 status=0x41
> Read [33] = 0x7 len=49844
> ocores_process() state=3 status=0x41
> Read [34] = 0x3d len=49844
> ocores_process() state=3 status=0x41
> Read [35] = 0xe6 len=49844
> ocores_process() state=3 status=0x41
> Read [36] = 0x0 len=49844
> ocores_process() state=3 status=0x41
> Read [37] = 0x64 len=49844
> .....
> 
>    It then continues till it stomps on something and gets some sort of fault....
> 
> Read [220] = 0x8 len=58644
> ocores_process() state=3 status=0x41
> Read [221] = 0x60 len=58644
> ocores_process() state=3 status=0x41
> Read [222] = 0x43 len=58644
> ocores_process() state=3 status=0x41
> Read [223] = 0x49 len=58644
> ocores_process() state=3 status=0x41
> Read [224] = 0x0 len=58644
> PANIC: double fault, gdt at c1400000 [255 bytes]
> double fault, tss at c1404300
> eip = c032b5a3, esp = c03fa000
> eax = c03fa020, ebx = df16e4e0, ecx = 0000007b, edx = 00000009
> esi = de545580, edi = c0119427
> Read [28] = 0x18 len=49844

-- 
Jean Delvare

  parent reply	other threads:[~2011-06-16 16:52 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-14 22:33 i2c-ocores timeout bug Andrew Worsley
     [not found] ` <BANLkTikEESnV9+hc=VFg=qFGOXeg4o0sOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-06-16 16:52   ` Jean Delvare [this message]
     [not found]     ` <20110616185209.08ef41ed-R0o5gVi9kd7kN2dkZ6Wm7A@public.gmane.org>
2011-07-03 20:31       ` Peter Korsgaard
     [not found]         ` <87pqlr14z0.fsf-uXGAPMMVk8amE9MCos8gUmSdvHPH+/yF@public.gmane.org>
2011-07-03 23:45           ` Andrew Worsley
2011-06-29 22:55   ` Andrew Worsley
     [not found]     ` <BANLkTino1RYuDDjcXZYdJ1LyeSHz6h=ACw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-06-30  7:10       ` Peter Korsgaard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110616185209.08ef41ed@endymion.delvare \
    --to=khali-puyad+kwke1g9huczpvpmw@public.gmane.org \
    --cc=amworsley-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=ben-linux-elnMNo+KYs3YtjvyW6yDsg@public.gmane.org \
    --cc=jacmet-OfajU3CKLf1/SzgSGea1oA@public.gmane.org \
    --cc=linux-i2c-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).