Job Title: Disaster Recovery Specialist
Overview: The Disaster Recovery Specialist is responsible for analyzing and understanding disaster recovery (DR) needs, recommending DR strategies, and implementing DR plans. This role involves conducting end-to-end DR drills/tests, developing disaster recovery documentation (including DR plans and runbooks), and creating policies, processes, and procedures to ensure the confidentiality, availability, and integrity of the organization’s systems. The role requires a combination of technical expertise, project management, and system analysis skills.
Key Responsibilities:
- Develop disaster recovery plans, procedures, and policies for the restoration of critical business applications.
- Validate all DR replications for servers and applications.
- Assess and analyze DR gaps (recovery point objective (RPO), recovery time objective (RTO), replication deficiencies) and contribute to the design and implementation of disaster recovery plan (DRP) solutions.
- Collaborate with the IT department and other key personnel to assess the impact of hardware or software changes on disaster recovery outcomes.
- Support technology and infrastructure assessments as part of the business impact analysis process.
- Operate within a complex environment, managing globally distributed components that depend on various systems and end users (both internal and external).
- Create and maintain DR documentation, including DR maps.
- Provide regular DR status reports to the Global Corporate DR Manager.
- Ensure all disaster recovery plans are reviewed and updated regularly.
- Establish and maintain detailed DR communication and command/control plans through a change management process.
- Work with IT technical staff to ensure that disaster recovery solutions are adequate, in place, maintained, and tested as part of the regular operational lifecycle.
- Design, understand, and carry out all testing needed for successful DR execution.
- Schedule and lead DR exercises.
- Continuously improve and optimize existing DR solutions and services.
- Support DR activation and lead drills from preparation to closure.
Required Qualifications:
- Experience: 4-8 years in Disaster Recovery, Infrastructure, or Data Center roles is an advantage.
- Education & Certifications: A bachelor’s degree in Computer Science, Information Systems, Information Security, or a related IT field is required. CBCP or equivalent certification is an advantage. PMP certification is an advantage.
Technical and Managerial Skills:
- Strong understanding of on-premises and cloud infrastructure disaster recovery solutions, processes, and best practices. – MUST
- Solid knowledge of IT infrastructure. – MUST
- Familiarity with Windows, Linux (Red Hat), and UNIX (AIX) platforms. – Required
- Experience with DRaaS (Disaster Recovery as a Service). – Required
- Understanding of virtual infrastructure and cloud technologies (e.g., VMware, AWS, Microsoft Azure). – Required
- AWS and Azure foundational knowledge is an advantage.
- Familiarity with Jenkins, GitLab, Artifactory, and Nexus is a plus.
- Project management skills are an advantage.
- Knowledge of high-availability technologies, including servers, networking components, database/storage, and associated replication technologies, is an advantage.
- Proficiency in Windows (Active Directory, Exchange, Security, NFS) and Linux/Unix solutions is an advantage.
Additional Skills:
- High level of proficiency in English (mandatory).
- Strong stakeholder management skills (mandatory).
- Advanced PC skills, including MS Office (Word, Excel, PowerPoint) (mandatory).
- Familiarity with business continuity program lifecycle plans and deliverables (e.g., risk assessments, BIAs, continuity planning) is an advantage.
Willingness to Travel: Infrequent travel and off-hours on-call support as needed (mandatory).
Personality Traits:
- Self-starter with a driven, assertive, and positive attitude.
- Effective problem-solving skills.
- Excellent attention to detail.
- Ability to perform under pressure and in stressful situations.
- Capable of efficiently coordinating work with business partners in remote locations.
- Ability to work collaboratively within a global team.
- Capable of handling multiple tasks simultaneously.
- Excellent organizational skills and ability to interact with individuals at all levels of the organization.
- Strong oral and written communication skills to effectively convey plans, exercises, and activities.