From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sławomir Skowron
Subject: Re: RadosGW problems with copy in s3
Date: Mon, 5 Mar 2012 22:21:44 +0100
Message-ID: <-8255131364718062163@unknownmsgid>
References:
Mime-Version: 1.0 (1.0)
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
Received: from mail-wi0-f174.google.com ([209.85.212.174]:60564 "EHLO mail-wi0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751693Ab2CEVVq convert rfc822-to-8bit (ORCPT ); Mon, 5 Mar 2012 16:21:46 -0500
Received: by wibhm2 with SMTP id hm2so2212286wib.19 for ; Mon, 05 Mar 2012 13:21:45 -0800 (PST)
In-Reply-To:
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: Yehuda Sadeh Weinraub
Cc: "ceph-devel@vger.kernel.org"

On 5 mar 2012, at 19:59, Yehuda Sadeh Weinraub wrote:

> On Mon, Mar 5, 2012 at 2:23 AM, Sławomir Skowron
> wrote:
>> 2012/3/1 Sławomir Skowron :
>>> 2012/2/29 Yehuda Sadeh Weinraub :
>>>> On Wed, Feb 29, 2012 at 5:06 AM, Sławomir Skowron
>>>> wrote:
>>>>>
>>>>> Ok, it's intentional.
>>>>>
>>>>> We are checking meta info about files, then, checking md5 of file
>>>>> content. In parallel, updating object that have change, and then
>>>>> archiving this objects in another key, and last thing is deleting
>>>>> objects that expires.
>>>>>
>>>>> This happens over and over, because, this site is changing many times.
>>>>>
>>>>> Now i don't have any idea, how to workaround this problem, without
>>>>> shutdown this app :(
>>>>
>>>> I looked at your osd log again, and there are other things that don't
>>>> look right. I'll also need you to turn on 'debug osd = 20' and 'debug
>>>> filestore = 20'.
>>>
>>> osd.24 almost 10 minutes of log in debug, as above in attachment.
>>>
>>>> Other than that, I just pushed a workaround that might improve things.
>>>> It's on the wip-rgw-atomic-no-retry branch on github (based on
>>>> 0.42.2), so you might want to give it a spin and let us know whether
>>>> it actually improved things.
>>>
>>> Ok i will try, and let you know soon.
>>
>> Unfortunately, no improvement after upgrade for this version.
>>
> It looks like an issue with updating the bucket index, but I'm having
> trouble confirming it, as the log provided (of osd.24) doesn't contain
> any relevant operations. If you could provide a log from the relevant
> osd it may be very helpful.
>
> You can find the relevant osd by looking at an operation that took too
> long, and look for a request like the following:
>
> 2012-02-28 20:20:10.944859 7fb1affb7700 -- 10.177.64.6:0/1020439 -->
> 10.177.64.4:6839/7954 -- osd_op(client.65007.0:587 .dir.3 [call
> rgw.bucket_prepare_op] 7.ccb26a35) v4 -- ?+0 0xf25270 con 0xbcd1c0
>
> It would be easiest looking for the reply to that request as it will
> contain the osd id (search for a line that contains osd_op_reply and
> the client.65007.0:587 request id).
>
> In the mean time, I created issue #2139 for a probable culprit. Having
> the relevant logs will allow us to verify whether you're hitting that
> or another issue.
>
> Thanks,
> Yehuda

OK. Because of the time difference between us, I will try to find this
in the morning at work. If the logs turn out not to be verbose enough,
I will start all OSDs in debug mode, as you wrote earlier, and then
reproduce the problem.

I will try to send the logs as soon as possible.

Regards

Slawomir Skowron
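
For reference, the log search described in the quoted message (find the line that contains both osd_op_reply and the request id, here client.65007.0:587) could be scripted roughly as below. This is only a minimal sketch: the function name find_replies is made up for illustration, and the sample lines are hypothetical placeholders, not real Ceph output. It makes no assumption about the layout of a reply line beyond the two substrings mentioned in the message, so reading the osd id off the matching line is still a manual step.

```python
import re
from typing import Iterable, List


def find_replies(lines: Iterable[str], request_id: str) -> List[str]:
    """Return log lines that contain both 'osd_op_reply' and the request id.

    The request id must not be followed by another digit, so that
    'client.65007.0:587' does not also match 'client.65007.0:5870'.
    """
    id_pattern = re.compile(re.escape(request_id) + r"(?!\d)")
    return [
        line.rstrip("\n")
        for line in lines
        if "osd_op_reply" in line and id_pattern.search(line)
    ]


# Hypothetical sample lines (placeholders, NOT real Ceph log output).
sample_log = [
    "... osd_op(client.65007.0:587 .dir.3 [call rgw.bucket_prepare_op] ...) ...",
    "... osd_op_reply(client.65007.0:587 ...) ...",
    "... osd_op_reply(client.65007.0:5870 ...) ...",
]

matches = find_replies(sample_log, "client.65007.0:587")
for line in matches:
    print(line)
```

For a real log, the same function can be given an open file object instead of the list, for example `with open(path) as f: find_replies(f, "client.65007.0:587")`, where path is whatever log file is being inspected.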