From mboxrd@z Thu Jan 1 00:00:00 1970
From: Guido Winkelmann
Subject: Random blocks when accessing rbd images
Date: Thu, 15 Dec 2011 16:07:55 +0100
Message-ID: <1404301.on6okQVZ04@pc10>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7Bit
Return-path:
Received: from unknownsite.de ([62.48.69.106]:40956 "EHLO hartes-hannover.de" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751245Ab1LOPIC (ORCPT );
	Thu, 15 Dec 2011 10:08:02 -0500
Received: from pc10.localnet (pc10.asys-h.de [193.98.1.90])
	by hartes-hannover.de (Postfix) with ESMTPSA id 21EC910C866
	for ; Thu, 15 Dec 2011 16:08:00 +0100 (CET)
Sender: ceph-devel-owner@vger.kernel.org
List-ID:
To: ceph-devel@vger.kernel.org

Hi,

I've got a small ceph cluster with one mon, one mds and two osds (all on the
same machine, for now) that I want to use as a block and file storage backend
for qemu machine virtualisation.

I found that read access to some of the rbd images, or to parts of some of
them, sometimes blocks indefinitely. This usually happens after the image has
been sitting around untouched for a while, for example overnight. The effect
is that virtual machines trying to access their disks, as well as rbd commands
like "rbd cp", just hang indefinitely.

I found that these blocks can usually be "fixed" by restarting one of the
osds.

The last time this happened, ceph -s reported one of the osds to be in state
"active+clean+scrubbing". (I'm afraid I don't have the complete output from
ceph -s anymore.)

Does anybody have any idea what could be going wrong here?

	Guido