Drew University: Computing and Network Services
 
 

Replacing Our Main Storage Array

CNS maintains the main disk array used by most of our critical systems on campus. That array is on a three-year replacement cycle, and we're in the process of replacing our HP StorageWorks EVA 3000 with a new EVA 6000. To read about our servers and storage in more detail, please go to the Systems and Networking section of CNS Live. That section will be updated to reflect the new configuration once the array replacement is finished.

The real challenge of replacing the disk array every three years is the data migration. There is more data each time, and it is not just a matter of getting the information across; critical configuration information has to be duplicated as well.

We lease the arrays, so the old array has to be returned to HP within a certain period of time. We plan on finishing the migration well ahead of that. As long as we're relying on both arrays, the systems are not as reliable as our normal configuration. For servers and storage, the only campuswide single point of failure we normally have is the storage array itself. Until we can disconnect the old array and give each server a redundant path to the new hardware, however, the storage networking switches (fibre channel), patch cables, etc. are all potential points of failure as well. Returning the systems to their normal level of fault tolerance is a high priority.
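The difference between the normal configuration and the migration period can be illustrated with a small sketch. It treats each route from a server to its storage as a list of components, and reports the components that sit on every available route; losing any one of those cuts the server off. The component names below are hypothetical and are not taken from our actual hardware inventory.

```python
def single_points_of_failure(paths):
    """Return the components that appear on every path from a server to its storage.

    `paths` must list every usable route; a component found on all of them
    is a single point of failure for that server.
    """
    if not paths:
        return set()
    common = set(paths[0])
    for path in paths[1:]:
        common &= set(path)
    return common


# Hypothetical component names, for illustration only.
# Normal configuration: two independent routes to the same array.
normal = [
    ["cable-a", "switch-a", "array"],
    ["cable-b", "switch-b", "array"],
]

# During the migration: only one route to the new array is available.
during_migration = [
    ["cable-a", "switch-a", "array"],
]

normal_spof = single_points_of_failure(normal)
migration_spof = single_points_of_failure(during_migration)
```

With two routes, only the array itself is shared by both. With a single route, the cable and the switch join the list, which is why restoring the redundant paths matters.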

We will notify the community of downtime that is significant either in duration (a single service that will be unavailable for a longer period of time) or in scope (a brief disruption that will have a broad impact). There are too many services to send out an announcement for each. Services that are more limited in scope and will suffer only a brief disruption will be on a posted schedule, but we do not plan to make announcements for those.

A quick list is available below. We also have a status page that we will be updating as we migrate services.

Causeway and Chalmers

Our two NetWare clusters hold the bulk of the data on the SAN array. Causeway is the name for Computing and Network Services' centralized networked file and print services system. Its most visible services are the network drives (F:, G:, etc.) you see on your computer, and it also provides the bulk of our network printing. Chalmers is our NetWare 6.5 cluster for the Novell GroupWise email system. We have a four-server cluster, with each server running a "post office" and other services.

We are finishing final testing on a method of copying the data from the old array to the new one without the extensive downtime that a typical backup/restore technique would require. We will be "mirroring" the data between the arrays using a feature built into the file system we're using.
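As a rough illustration of why mirroring avoids a long outage, here is a toy model of the idea. It is a conceptual sketch only, not the file system's actual mechanism: the class, its methods, and the block layout are all hypothetical. The point it shows is that writes keep landing on both copies while the background copy catches up, so the service stays available until the mirror is broken.

```python
class MirroredVolume:
    """Toy model of a volume being mirrored from an old array to a new one."""

    def __init__(self, old_blocks):
        self.old = dict(old_blocks)  # block number -> data on the old array
        self.new = {}                # copy being built on the new array
        self.mirrored = True

    def write(self, block, data):
        # While the mirror is active, every write goes to both arrays.
        if self.mirrored:
            self.old[block] = data
        self.new[block] = data

    def sync_step(self):
        """Copy one block that is not yet on the new array.

        Returns False when nothing is left to copy.
        """
        if not self.mirrored:
            return False
        for block, data in self.old.items():
            if block not in self.new:
                self.new[block] = data
                return True
        return False

    def break_mirror(self):
        # Only safe once the new array holds everything the old one does.
        if self.old != self.new:
            raise RuntimeError("mirror is not fully synchronized")
        self.mirrored = False
        self.old = None  # the old array is no longer part of the volume


vol = MirroredVolume({0: "a", 1: "b", 2: "c"})
vol.sync_step()        # background copy starts
vol.write(1, "B")      # users keep working during the copy
while vol.sync_step():
    pass               # copy the remaining blocks
vol.break_mirror()     # leave only the copy on the new array
```

After the mirror is broken, the new array holds the update made during the copy, and the old array can be disconnected.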

Other servers

While the servers above hold the bulk of our data, there are a lot of other servers (physical and virtual) we're relying on for critical functions - a two-node database server cluster, our campus DNS and DHCP servers (critical for network operations), the Drew web site, AIMS, access to the Library catalog and resources and Blackboard, just to name a few. They all have to have their data moved.

Virtual servers supporting some applications have already been shifted to the new array. The AdAstra, Adobe Connect Enterprise Server (formerly Macromedia Breeze), BlackBerry Enterprise Server, Cisco Secure ACS (used by the Administrative Computing VPN), Raiser's Edge, software license metering, and Supportworks servers were relocated during off hours.

In Brief:

  • File/Print Cluster (Causeway) - One short period (under a half hour) of complete downtime will be required to move the "cluster services partition". Once that has been recreated on the new disk array, we can resume normal operations and "mirror" the data partitions. Once all the data is duplicated, we will break the mirror, leaving only the copy on the new array. We will announce the schedule for the complete shutdown shortly.
  • GroupWise Cluster - One short period (under a half hour) of complete downtime will be required to move the "cluster services partition". Once that has been recreated on the new disk array, we can resume normal operations and "mirror" the data partitions. Once all the data is duplicated, we will break the mirror, leaving only the copy on the new array. We will announce the schedule for the complete shutdown shortly.
  • Library - This is now powered by a pair of virtual servers. We need about ten minutes each to move them to the new array. Once that's done, we will need an additional period of about an hour and a half to copy the existing catalog data to a new volume, followed by a drive letter swap. The schedule will be determined in consultation with the Library.
  • Database cluster (Microsoft SQL Server / MySQL) - We're still working on the details of this move. There isn't as clear a set of well-documented procedures for moving the quorum disk (roughly the Windows equivalent of the NetWare cluster services partition).
  • AIMS (Disney) - We expect about twenty minutes of downtime in order to move the virtual server to the new array. The schedule will be determined in consultation with Administrative Computing.
  • Blackboard - Again, we need twenty minutes or so of time to move the virtual server. The schedule will be determined in consultation with ITS.
  • Web (Ektron) - We have a procedure, also used during software upgrades, to move the virtual server running the Ektron database with no interruption for the main web site.
  • Web (depts.drew.edu, users.drew.edu, courses.drew.edu) - This virtual server, and another used for some web applications, will be moved during an early-morning shutdown. We expect 5 to 10 minutes per server.
  • Web (load balancer) - Our load balancing "appliance" for web traffic is actually a virtual server, and it will need to be moved. The procedure we use for this will require a 30-second disruption in all web services. We will notify the campus before this happens.
  • Other application servers - Some other small virtual servers providing specific services still need to be moved. They will be moved off-hours, and we expect disruption to be minimal.

 

 

 
 
 
Copyright © 2003-2009, Drew University
Page last updated: 13 June 2007