Job Detail
Job ID 57375
Job Description
OpenText is a global leader in information management solutions, integrating AI to power innovation, improve efficiency, and transform the way digital knowledge workers operate.
Role Overview:
As a Lead Site Reliability Engineer, you will enhance the availability, performance, and stability of OpenText services while automating repetitive operational tasks. You’ll act as the technical custodian of production and staging environments, ensuring smooth collaboration between development, operations, and customer-facing teams.
Key Responsibilities:
- Manage incidents in line with SLAs, ensuring high-quality, timely resolution.
- Maintain and support production/staging applications, acting as a technical liaison across teams.
- Automate operational processes using scripting tools (PowerShell, Bash, Python, etc.).
- Administer and troubleshoot IIS, Tomcat, Apache, and .NET-based applications.
- Support cloud infrastructure (AWS, GCP) and Kubernetes deployments.
- Implement Infrastructure-as-Code using Ansible, Terraform, and GitOps workflows.
- Monitor systems with tools like New Relic, Dynatrace, Prometheus, Nagios, or Zabbix.
- Manage databases (Oracle, PostgreSQL, MariaDB, Cassandra) and API gateway and authorization tooling (Apigee, OAuth 2.0).
- Participate in 24/7 on-call rotation and shift schedules.
Qualifications & Skills:
- Proficiency in scripting (PowerShell, Shell, Bash, Perl, Python).
- Strong knowledge of Windows OS; Linux experience preferred.
- Hands-on cloud experience (AWS, GCP).
- Strong troubleshooting skills for distributed, high-throughput web apps.
- Familiarity with ITIL processes (certification is a plus).
- Excellent problem-solving, communication, and cross-functional collaboration skills.