[opencompute-hosting-announce] Unplanned Power Event

Lance Albertson lance at osuosl.org
Fri Aug 6 04:29:44 UTC 2021


FYI: It looks like we had another power event that impacted our primary
data center along with our OpenCompute hosts in another data center. I'm
checking to see what might be down, but this time it does not seem nearly
as widespread. I don't think any of the OSL managed services were
affected.

Please let me know if you do have any issues.

Thanks-

On Tue, Aug 3, 2021 at 3:26 PM Lance Albertson <lance at osuosl.org> wrote:

> I received an update on the issues we had in the primary data center. It
> appears that there was a battery cell problem on one of the UPSes. Prior
> to the outage, OSU issued a Purchase Order for battery replacements and is
> waiting for them to arrive to schedule the installation. The projected
> arrival date for the batteries is September 10th. When they arrive, we
> will schedule the install as a priority.
> In the meantime, this may happen again; however, I did fix a few systems
> that had issues related to how their power was configured.
>
> If you have any questions or concerns please let me know.
>
> Thank you!
>
> On Sun, Aug 1, 2021 at 12:28 PM Lance Albertson <lance at osuosl.org> wrote:
>
>> I got word that this outage was more of a campus-wide event, which also
>> impacted the OpenCompute hosts. I went through those hosts and ensured
>> they are back online, but let me know if I missed anything.
>>
>> OSU will be sending in a tech in a few days to see why the UPS didn't
>> fail over properly in our primary data center, which caused the power
>> event. I'm also going to spot check a few hosts' power when I go in on
>> Tuesday to ensure power is split properly between the power feeds. If you
>> had any hosts with dual power that went down, please let me know ASAP so
>> I can add them to the list of hosts to check.
>>
>> Thanks for your patience!
>>
>> On Sun, Aug 1, 2021 at 8:15 AM Lance Albertson <lance at osuosl.org> wrote:
>>
>>> It seems we had an unplanned power event in our primary data center
>>> early this morning at 3:03 AM PDT (10:03 UTC) that affected one of the
>>> two power feeds. Virtually every system that has a dual power supply
>>> should have remained online. The exception is a group of systems located
>>> in a row that is fed only by that power feed, which includes:
>>>
>>> - power8-aix
>>> - pieta.debian.org
>>> - gcc2-power8
>>> - All Buildbot/RTEMS systems
>>> - gcc113
>>> - gcc114
>>> - gcc115
>>> - gcc116
>>> - gcc117
>>> - gcc118
>>>
>>> I believe every system that we monitor should be back online, but there
>>> might be others we aren't monitoring that are still down. If that's the
>>> case, please send an email to support and we'll take a look at it as soon
>>> as possible.
>>>
>>> I'm still waiting to hear back about what happened and why it happened
>>> and will pass that information along once I learn more.
>>>
>>> Thanks for your patience.
>>>
>>> --
>>> Lance Albertson
>>> Director
>>> Oregon State University | Open Source Lab
>>>
>>
>>
>> --
>> Lance Albertson
>> Director
>> Oregon State University | Open Source Lab
>>
>
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>


-- 
Lance Albertson
Director
Oregon State University | Open Source Lab