From: Loic Dachary
Subject: Stable releases preparation temporarily stalled
Date: Wed, 6 Jan 2016 15:30:46 +0100
Message-ID: <568D2516.2030706@dachary.org>
To: Abhishek L, Abhishek Varshney, Nathan Cutler
Cc: Ceph Development

Hi,

The stable releases (hammer, infernalis) did not make progress in the
past few weeks because we can't run tests. Before xmas the following
happened:

* the sepia lab was migrated and we discovered the OpenStack teuthology
  backend can't run without it (that was a problem for a few days only)
* there are OpenStack-specific failures in each teuthology suite and it
  is non-trivial to separate them from genuine backport errors
* the make check bot went down (it was partially running on my private
  hardware)

If we just wait, I'm not sure when we will be able to resume our work
because:

* the sepia lab is back but has less horsepower than it did
* not all of us have access to the sepia lab
* the make check bot is being worked on by the infrastructure team but
  it is low priority and it may take weeks before it's back online
* the ceph-qa-suite errors that are OpenStack-specific are low priority
  and may never be fixed

I think we should rely on the sepia lab for testing for the foreseeable
future and wait for the make check bot to be back. Tests will take a
long time to run, but we've been able to work with a one-week delay
before, so it's not a blocker.

Although fixing OpenStack-specific errors would allow us to use the
teuthology OpenStack backend (I will fix the last error left in the
rados suite), it is unrealistic to set that as a requirement to run
tests: we have neither the workforce nor the skills to do that.
Hopefully, some time in the future, Ceph developers will use
ceph-qa-suite on OpenStack as part of the development workflow. But
right now running ceph-qa-suite on OpenStack is outside of the
development workflow and in a state of continuous regression, which is
inconvenient for us because we need something stable to compare the
runs from the integration branch against.

Fixing the make check bot is a two-part problem. Each failed run must be
looked at to chase false negatives (continuous integration with false
negatives is a plague), which I did in the past year on a daily basis
and I'm happy to keep doing. Before the xmas break the bot running at
jenkins.ceph.com reported over 90% false negatives, primarily because it
was trying to run on unsupported operating systems, and it was stopped
until this is fixed. It also appears that the machine running the bot is
not re-imaged after each test, meaning a buggy run may taint all future
tests and create a continuous flow of false negatives. Addressing these
two issues requires knowing or learning about the Ceph jenkins setup and
slave provisioning. This is probably a few days of work, which is why
the infrastructure team can't resolve it immediately.

If you have alternative creative ideas on how to improve the current
situation, please speak up :-)

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre