Senior Incident Response Manager
Q2 seeks an Incident Response Manager who will report up through the Q2 Integrated Operations Center (IOC) organization, directly to the Director of IT Operations. This person will be responsible for management oversight of incident response, outage communication, and root cause analysis of major incidents impairing service delivery across multiple digital channels.
We are looking for a self-starter with strong incident management experience who can quickly assess incident priority, engage resources, and drive to quick resolution. This person will also manage a small team of Tools Administrators responsible for administering a $1M annual portfolio of monitoring tools used for early detection and root cause analysis before, during, and after an event. Building and maintaining strong internal customer relationships is essential, as this position will work with Q2’s Customer Experience Manager Team, Operational Teams, Executives, and other internal staff and system integrators on a daily basis.
RESPONSIBILITIES:
- Develop an Incident Response Program targeted at improved internal and external communication during an event or outage, and recommend opportunities for process improvement
- Provide root cause analysis, create metrics and management dashboards, and oversee administration of monitoring tools and the communication process
- Ensure that technical issues affecting Digital Channel Service Delivery are responded to and that normal service operations are restored as quickly as possible
- Identify persistent or recurring problems and recommend creative solutions
- Maintain escalation and contact lists for mission-critical assets
- Monitor applications for most efficient operation, identifying fault conditions as well as opportunities for further optimization
- Review and revise incident management processes, policies and escalation procedures on a regular basis to drive efficiencies and effectiveness in responding to issues
- Ensure incidents are escalated and facilitated to enable efficient and timely service restoration
- Communicate with all levels of management regarding P1 incidents
- Facilitate the restoration of service and ensure the correct technical staff is working on an incident
- Escalate issues during resolution, and facilitate and support lessons-learned reviews
- Drive root cause analysis (RCA) with technology partners after incident resolution, and facilitate RCA reviews with internal and external stakeholders
- Manage the Tools Team, which has administrative responsibility for monitoring systems
- Complete ad-hoc and ongoing projects on an as-needed basis
EXPERIENCE AND KNOWLEDGE:
- Bachelor's or equivalent degree in Computer Engineering, Information Systems, Management, or a related field
- 8-10 years of experience, with at least 3-5 years related to incident management and program management
- Knowledge of release management desired
- Experience with Java, Node.js, MSSQL, mobile tools, Linux, API monitoring tools, and servers is a plus
- Incident Management Certification preferred, ITIL or equivalent
- Self-driven and effective in communication
- Good interpersonal and intrapersonal skills
- Experience in working with large, strategic clients needed
- Exceptional verbal and written communication skills a must
- Ability to clearly articulate the impact of an incident and to interpret the mitigation or remediation steps that prevent future incidents is essential
- Ability to build good relationships across various teams while driving detailed explanations of what happened before and after an event is critical to your success
At Q2, our goal is to be a diverse and inclusive workforce that fosters mutual respect for our employees and the communities we serve. Q2 is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.