[osuosl-openpower] MAINTENANCE: OpenStack cluster server moves: Mar 8, 9 & 12

Lance Albertson lance at osuosl.org
Thu Mar 8 19:15:19 UTC 2018


The move for openpower1 is complete. All VMs that were on that hypervisor
should be booting up or already back online. Please let us know if you have
an issue with one of your VMs. We'll be moving openpower2 later this
afternoon as planned.

Thanks-

On Tue, Mar 6, 2018 at 2:36 PM, Lance Albertson <lance at osuosl.org> wrote:

> Service(s) affected:
>
> All VMs hosted on the OpenPOWER OpenStack cluster will be offline for
> approximately 2-4 hours during each server move. In addition, any VMs which
> have block storage attached to the affected nodes will have an outage.
>
> For a list of affected VMs per hypervisor node, please see the following
> spreadsheet, which includes the UUID for each instance as it stands today.
> You can see what UUID your VM has by looking at the
> /run/cloud-init/.instance-id file on your VM. In addition, if you're using
> a block storage (Cinder) volume, I have a sheet which shows the mappings by
> UUID to the host.
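
A minimal sketch for reading that file from inside a VM, assuming a
cloud-init based image where /run/cloud-init/.instance-id exists. The
function name read_instance_id is made up for illustration and is not part
of cloud-init.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the announcement; only present on cloud-init based images.
INSTANCE_ID_FILE = Path("/run/cloud-init/.instance-id")


def read_instance_id(path: Path = INSTANCE_ID_FILE) -> Optional[str]:
    """Return the instance ID stored in `path`, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        # File missing or unreadable, e.g. the image does not use cloud-init.
        return None


if __name__ == "__main__":
    instance_id = read_instance_id()
    print(instance_id or f"could not read {INSTANCE_ID_FILE}")
```

The printed value can then be compared against the UUID column in the
spreadsheet to find which hypervisor hosts the VM.
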
> OpenStack Cluster Server Moves
> <https://docs.google.com/a/osuosl.org/spreadsheets/d/15D3VE13chSn0jmGWpf5wsPsin6ex0B3I6FTwS74T5uY/edit?usp=drive_web>
>
> Outage Windows:
>
> openpower1
> Start:  Thu, Mar 8, 9:00AM PST (Thu Mar 8 1700 UTC)
> End:    Thu, Mar 8, 11:00AM PST (Thu Mar 8 1900 UTC)
>
> openpower2
> Start:  Thu, Mar 8, 3:00PM PST (Thu Mar 8 2300 UTC)
> End:    Thu, Mar 8, 5:00PM PST (Fri Mar 9 0100 UTC)
>
> openpower3
> Start:  Fri, Mar 9, 8:30AM PST (Fri Mar 9 1630 UTC)
> End:    Fri, Mar 9, 10:30AM PST (Fri Mar 9 1830 UTC)
>
> openpower5
> Start:  Fri, Mar 9, 1:00PM PST (Fri Mar 9 2100 UTC)
> End:    Fri, Mar 9, 3:00PM PST (Fri Mar 9 2300 UTC)
>
> openpower6 (note DST change for us)
> Start:  Mon, Mar 12, 1:00PM PDT (Mon Mar 12 2000 UTC)
> End:    Mon, Mar 12, 3:00PM PDT (Mon Mar 12 2200 UTC)
>
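
For anyone double-checking the local-to-UTC conversions in the schedule,
here is a small sketch. It assumes fixed offsets of UTC-8 for PST and UTC-7
for PDT; the helper name to_utc is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for the two abbreviations used in the schedule (assumed).
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def to_utc(local: datetime) -> datetime:
    """Convert a timezone-aware local time to UTC."""
    return local.astimezone(timezone.utc)


# Start times taken from the schedule: one before the DST change, one after.
starts = {
    "openpower1": datetime(2018, 3, 8, 9, 0, tzinfo=PST),
    "openpower6": datetime(2018, 3, 12, 13, 0, tzinfo=PDT),
}

for name, start in starts.items():
    print(name, to_utc(start).strftime("%a %b %d %H%M UTC"))
```

Using explicit offsets keeps the DST change visible: the same 1:00PM local
start maps to a UTC time one hour earlier once PDT is in effect.
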
> Reason for outage:
>
> We are in the process of migrating the storage backend of the cluster
> from local storage to Ceph. The migration should improve I/O bandwidth
> and capacity and also give us more flexibility with server maintenance,
> since we can live-migrate VMs. Thanks to a donation from IBM, we have a
> new five-node Ceph cluster with 292TB of capacity, including SSDs for
> journal caching. In addition, because of the move to Ceph, we're going to
> be upgrading the networking layer from 1Gbps to 40Gbps, thanks to several
> donations from Mellanox. Since we're going to be incurring an outage for
> the server move, we wanted to do a few other items at the same time to
> reduce additional outage times.
>
> The first phase of this migration includes the following (which this
> outage covers):
>
> 1. Moving each compute server to a different rack closer to a Mellanox 40G
> switch
> 2. Installing and configuring a Mellanox 40G NIC
> 3. Upgrading the system firmware (which includes Meltdown/Spectre fixes)
> 4. Switching over to a 4.14 mainline kernel on the host to provide better
> feature support on ppc64le (also provides fixes for Meltdown/Spectre)
>
> We have five compute nodes and we're planning on doing two server moves a
> day starting on Thursday of this week. We're going to need to bring the
> nodes up and down several times, so we'll be disabling the OpenStack
> services on those nodes until the process is complete.
>
> The second phase of the migration will happen in a few weeks and should
> only have per-VM impact while we migrate each VM over to the new Ceph
> cluster. I'll send a separate announcement once we're ready for that.
>
> If you have any questions or concerns please let me know directly via
> email or IRC.
>
> Thanks!
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>



-- 
Lance Albertson
Director
Oregon State University | Open Source Lab