As SharePoint moves into the Data Center as a formal service offering, the concern about disaster recovery arises. Specifically, as the business integrates SharePoint technologies into its day-to-day tasks, its importance to the business will increase. Once this happens, there will be an expectation that it is always on and performing consistently.
So where do you start? Meet with the Data Center staff to understand the base services available to you. Also, keep your Service Level Agreement (SLA) by your side as a reference for decision support. In order to have a SharePoint service that is resilient to disaster, your design must take advantage of existing Data Center services and the staff's experience (this will get you 80% of the way there). For example, ask the Data Center staff which disaster recovery services are available to you and what approach they have taken with other applications.
As you go through the design process, consider focusing on the following areas:
- A distributed design – Generally companies start with a medium farm for each major continent. This is driven by WAN constraints, performance, staffing, regional laws, etc.
- Rebuild procedures – The exact (tested and proven) steps for building the farms and restoring the databases and index.
- Media storage – If the data center is gone, where do you obtain the media? Off-site storage facilities?
- Off-site facilities – Where does everyone convene to begin the rebuild process? Do you have phones? Security access? Can your staff get to the location quickly?
- Hardware procurement – Where do you get new hardware? Perhaps you have standby hardware arrangements with your vendor?
- Network – Does your network provider have service in the area of your off-site facilities? What’s required to get the site online?
- Lost staff – Not a pleasant thought, but who else knows how to rebuild your farms?
- Communication plan – How do you inform people? How do you rally the troops? What are their roles?
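One way to keep these areas visible during planning is to track them as a simple checklist. The sketch below is only an illustration, not a prescribed tool: the `PlanArea` fields (`owner`, `documented`, `tested`) and the `gaps` helper are assumptions made up for this example, and the area names are taken from the list above.

```python
from dataclasses import dataclass


@dataclass
class PlanArea:
    """One focus area of the disaster recovery plan (hypothetical structure)."""
    name: str
    owner: str = ""            # assumed field: who is accountable for this area
    documented: bool = False   # assumed field: a written procedure exists
    tested: bool = False       # assumed field: the procedure has been rehearsed


# Area names mirror the focus areas listed above.
AREAS = [
    "Distributed design",
    "Rebuild procedures",
    "Media storage",
    "Off-site facilities",
    "Hardware procurement",
    "Network",
    "Lost staff",
    "Communication plan",
]


def new_plan():
    """Start with every area unowned, undocumented, and untested."""
    return {name: PlanArea(name) for name in AREAS}


def gaps(plan):
    """Return the names of areas missing an owner, documentation, or a test."""
    return [
        area.name
        for area in plan.values()
        if not (area.owner and area.documented and area.tested)
    ]


plan = new_plan()
plan["Rebuild procedures"] = PlanArea(
    "Rebuild procedures", owner="farm admin", documented=True, tested=True
)
print(gaps(plan))  # lists every area still needing attention
```

Reviewing the output of `gaps` alongside the SLA makes it harder for an area such as off-site facilities or the communication plan to be quietly dropped.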
As you think through the process of a design, ask yourself the following questions to help validate the design:
- What happens if I lose a Data Center? The complete data center is gone. All the tools and facilities are lost.
- What happens if I lose the farm? The farm is gone but the data center is still available and therefore all the tools for recovery are as well.
- What can vendors provide from a product and services perspective to help me reduce time to recovery?
For the most part I see clients focusing on SQL Server backup and restore for their Disaster Recovery plan – which is only part of the picture, right? If my job were on the line, I’d approach Disaster Recovery from a holistic standpoint, and if management chose to strip the plan down I would at least have my vision published and a paper trail for when things go south some day.
So Disaster Recovery is complex, huh? As you work through your plan you will quickly begin asking yourself questions such as: Does SharePoint justify this sort of investment? Is it really a priority? Would a managed service be a better fit?