
ICS Server Update and Reboot Schedule

The following ICS servers will be updated and rebooted at the times noted below:

Hostname Service Maint Window Begin Maint Window Duration
puppet4 3/22/2021-0900 2 hours
nagios 3/22/2021-0900 2 hours
digphenotypes 3/22/2021-0900 2 hours
puppet6-2 3/22/2021-0900 2 hours
colin-v1 3/22/2021-0900 2 hours
N/A 3/22/2021-0900 2 hours
ftp0 3/22/2021-0900 2 hours
mda1 3/22/2021-0900 2 hours
rt4-1 3/22/2021-0900 2 hours
drew-v2 3/22/2021-0900 2 hours
drew-v4 3/22/2021-0900 2 hours
rt4-2 3/22/2021-0900 2 hours
vpn001 3/22/2021-0900 2 hours
imap0 3/22/2021-0900 2 hours
vault0 3/22/2021-0900 2 hours
ipa2 3/22/2021-0900 2 hours
logstash0_elk 3/22/2021-0900 2 hours
metrics_grafana 3/22/2021-0900 2 hours
control 3/22/2021-1700 2 hours
emp7-1 3/22/2021-1700 2 hours
sge-sm0 3/22/2021-1700 2 hours
imap1 3/22/2021-1700 2 hours
mailman-mta0 3/22/2021-1700 2 hours
data_collection_prometheus 3/22/2021-1700 2 hours
SGE-Master 3/22/2021-1700 2 hours
puppet master 3/22/2021-1700 2 hours
ganglia 3/22/2021-1700 2 hours
loghost0 3/22/2021-1700 2 hours
emp7-2 3/22/2021-1700 2 hours

Hostname Service Maint Window Begin Maint Window Duration
jo-grant haproxy 3/22/2021-0900 2 hours
peter-capaldi and its VMs web services 3/22/2021-0900 2 hours
liz-shaw haproxy 3/22/2021-1000 2 hours
matt-smith and its VMs web services 3/22/2021-1300 2 hours
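
The window entries above combine a start stamp such as 3/22/2021-0900 with a duration in hours. As a minimal sketch for working out when a window ends, the snippet below assumes the stamp is month/day/year followed by a 24-hour local time; the function name `parse_window` is hypothetical and not part of any ICS tooling.

```python
from datetime import datetime, timedelta


def parse_window(begin: str, duration_hours: int):
    """Return (start, end) for a stamp like '3/22/2021-0900'.

    Assumes M/D/YYYY-HHMM in 24-hour local time (an assumption,
    not something the schedule states explicitly).
    """
    start = datetime.strptime(begin, "%m/%d/%Y-%H%M")
    return start, start + timedelta(hours=duration_hours)


start, end = parse_window("3/22/2021-0900", 2)
print(start.strftime("%Y-%m-%d %H:%M"), "to", end.strftime("%H:%M"))
# prints "2021-03-22 09:00 to 11:00"
```

Under that reading, a two-hour window that begins at 1700 ends at 1900 the same day.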

Kubernetes Cluster Upgrade

  • Downtime Notice: #78583: Upgrade Kubernetes Clusters
  • Date: March 22nd, 2021
  • Duration: All week, beginning at 0600 and lasting approximately 8 hours

IMPACT: The Kubernetes clusters are built with redundant nodes, so overall service should not be disrupted. Some users may notice pods or instances disconnecting and restarting as they migrate from one node to another.

Engineering Tower Generator ATS Repair on Jan 30th

  • Downtime Notice: #78155: ICS Data Center Power Event: Engineering Tower Generator ATS repair
  • Date: January 30th, 2021
  • Duration: Beginning at 0600 and lasting approximately 8 hours

Campus facilities is scheduling an electrical power shutdown for the ICS building on 1/30/2021 in order to repair a failed generator automatic transfer switch (ATS).

Network: Power will be temporarily reconfigured to make sure all ICS network equipment remains powered during this period.

Computing: We do not anticipate downtime on any ICS computing during this period.

Environment: One of three 20-ton air conditioners will be unavailable during this period. Portable cooling units will be used to supplement the remaining air conditioners. If the portable cooling units prove unable to maintain a reasonable operating temperature, we will begin shutting down non-critical infrastructure, followed by the hosts generating the greatest amount of heat (e.g. GPU clusters).

OIT: OIT will be mirroring our efforts to make sure that their equipment in the building is available during this period.

Although we will do our best to avoid downtime during this period, there are many factors that can intersect unexpectedly to cause an outage. Please email us if you are concerned that this work will cause a hardship for your research, and we will work with campus facilities to do our best to accommodate you.

Last Quarter

projects/maint-winter-2021.1615855873.txt.gz · Last modified: 2021/03/15 17:51 by dutran
CC Attribution-Noncommercial-Share Alike 4.0 International