[osuosl-openpower] MAINTENANCE: OpenStack cluster Ceph migration: April 9, 2018 (possibly additional days)
Lance Albertson
lance at osuosl.org
Tue Apr 10 16:05:56 UTC 2018
Hi all,
I thought I would send you an update on our progress for the migration to
Ceph storage. Yesterday we migrated 144 VMs and 7.7TB of data. We have 46
VMs left to move including all of the cinder volumes and 80 glance images
which I'm hoping to continue working on today. I expect that we'll need to
do some tuning after the migration is over so please bare with us.
If you would like to see if your VM has been migrated, you can check this
page [1] to see if the UUID of your VM has been moved. I'll be updating it
throughout the day.
Thanks-
[1] https://goo.gl/jAq1QJ
On Thu, Mar 29, 2018 at 4:46 PM, Lance Albertson <lance at osuosl.org> wrote:
> Service(s) affected:
>
> All VMs hosted on the OpenPOWER OpenStack cluster will be offline for
> approximately 5 minutes to 2 hours during each VM migration to Ceph. The
> outages will only occur when we take a VM down for a migration. All running
> VMs should remain online without any issue until we proceed with the
> migration.
>
> Outage Window(s):
>
> Start: Mon, Apr 9, 10:00AM PDT (Mon Apr 9 1700 UTC)
> End: Mon, Apr 9, 5:00PM PDT (Tue Apr 10 0000 UTC)
>
> I doubt we'll be able to finish the migration in one day, so the following
> windows will be used as needed:
>
> Start: Tue, Apr 10, 9:00AM PDT (Tue Apr 10 1600 UTC)
> End: Tue, Apr 10, 5:00PM PDT (Wed Apr 11 0000 UTC)
>
> Start: Wed, Apr 11, 9:00AM PDT (Wed Apr 11 1600 UTC)
> End: Wed, Apr 11, 5:00PM PDT (Thu Apr 12 0000 UTC)
>
> Start: Thu, Apr 12, 9:00AM PDT (Thu Apr 12 1600 UTC)
> End: Thu, Apr 12, 5:00PM PDT (Fri Apr 13 0000 UTC)
>
> Start: Fri, Apr 13, 9:00AM PDT (Fri Apr 13 1600 UTC)
> End: Fri, Apr 13, 5:00PM PDT (Sat Apr 14 0000 UTC)
>
> Reason for outage:
>
> We are in the process of migrating the storage backend of the cluster
> from local storage to using Ceph as a backend. The migration to Ceph should
> improve I/O bandwidth and capacity and also provide more flexibility for
> server maintenance, since we can do live migrations of VMs. Thanks to
> a donation from IBM, we have a new five-node Ceph cluster with 292TB of
> capacity, including SSDs for journal caching.
>
> We completed the first phase of this migration back in mid-March and now
> we're ready for the next phase of the migration. In this next phase, we're
> going to switch the OpenStack cluster over to using the new Ceph cluster
> for storage. The switch itself should not cause any outages as any running
> VMs should remain running on local storage. However, any VM that is rebooted
> from the OpenStack API will fail to start, since it will be expecting a Ceph
> volume for the disk. Any VM that is created after the switch will
> automatically be deployed on Ceph.
>
> The migration requires converting storage for the following OpenStack
> services to Ceph:
>
> - VM disks
> - Volumes (cinder)
> - Image (glance) files
>
> For the vast majority of VMs, the process should be very simple: we will
> shut down the VM, copy the disk image over to Ceph using qemu-img, and
> start the VM back up. For the few VMs that boot from a cinder volume, the
> process is a little more complicated and may take more time, but the end
> result is the same. If your VM has a cinder volume attached to it, we will
> migrate both at the same time.
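> The simple per-VM path above might look roughly like the shell sketch
> below. To be clear, the UUID, disk path, and "vms" RBD pool name are
> hypothetical assumptions for illustration, not taken from the actual gist;
> the commands are echoed rather than executed:
>
> ```shell
> # Hypothetical sketch of a single VM move to Ceph. The UUID, local disk
> # path, and RBD pool name below are illustrative assumptions.
> UUID="11111111-2222-3333-4444-555555555555"   # hypothetical VM UUID
> SRC="/var/lib/nova/instances/${UUID}/disk"    # typical nova local disk path
> DST="rbd:vms/${UUID}_disk"                    # RBD target image for the VM
>
> # Stop the VM, convert its local disk image into the Ceph pool, then
> # start it back up (echoed here instead of run):
> echo "openstack server stop ${UUID}"
> echo "qemu-img convert -O raw ${SRC} ${DST}"
> echo "openstack server start ${UUID}"
> ```
>
> When qemu-img is built with rbd support it can write directly into the
> Ceph pool like this, which avoids staging an intermediate copy of the disk.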
>
> Here is the order in which I'll be doing the migrations:
>
> - VMs with no cinder volumes
> - VMs with cinder volumes attached
> - VMs with a cinder boot volume
>
> I expect most VM migrations will take only 5-20 minutes, though VMs with
> a lot of storage may have longer downtimes. If you wish to schedule a
> specific time for your migration, please let me know ASAP. Closer to the
> migration I will provide a spreadsheet showing the planned order of moves,
> and I'll update it in real time as the moves are completed.
>
> If you're at all interested in the specifics of how I'm doing this
> migration, you're free to look at this gist [1] I made for myself to keep
> track of all the commands. If you have any questions or concerns please let
> me know.
>
> Thanks!
>
> [1] https://gist.github.com/ramereth/5e11018570f8cd8aa7e707643a4bbf4b
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>
--
Lance Albertson
Director
Oregon State University | Open Source Lab