Senior Site Reliability Engineer (SRE) Jobs | EPAM Systems

Job Overview

Company

EPAM Systems

Location

Work from home

Job Description

EPAM is a leading global provider of digital platform engineering and development services.

We are committed to having a positive impact on our customers, our employees, and our communities.

We embrace a dynamic and inclusive culture.

Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow.

No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are seeking a talented and experienced Senior Site Reliability Engineer (SRE) to join our dynamic team.

As a Senior SRE, you will play a critical role in designing, developing, and maintaining highly reliable systems and processes to ensure optimal performance and scalability of applications and infrastructure across diverse environments.

Responsibilities

Build and containerize applications and deploy them using open-source container management tools such as Docker or Podman
Design and maintain Kubernetes resource manifests, deploying them into clusters on platforms like AKS or GKE
Configure and deploy Prometheus agents to monitor infrastructure and application behavior and raise alerts when necessary
Create and manage continuous deployment pipelines using tools like Helm and Argo CD
Optimize observability by implementing monitoring, logging, and tracing solutions
Maintain and manage CI/CD processes within Azure DevOps or similar environments
Develop and implement solutions on cloud platforms, leveraging expertise in at least one provider (e.g., Microsoft Azure, GCP, AWS)
Troubleshoot infrastructure and application issues, using logs and traces to isolate the relevant events

Requirements

At least 3 years of programming experience, preferably in Go
Hands-on experience with at least one scripting language (e.g., Bash or Python)
Proficiency with Kubernetes, with at least 3 years of hands-on experience
Fundamental knowledge of observability tools, with a focus on Prometheus or similar monitoring platforms
Skills in configuring and managing CI/CD pipelines using Azure DevOps, or tools like Helm and Argo CD for GitOps-style continuous deployment
Background in cloud platforms with competency in at least one provider (e.g., Microsoft Azure, Google Cloud, AWS)
Flexibility to use open-source tools like Docker or Podman to containerize applications and manage their runtime environments effectively

Nice to have

Familiarity with multiple cloud providers, including AWS and GCP alongside Azure
Expertise in GitOps packaging and deployment tools like Argo CD and Helm
Understanding of service meshes like Istio for Kubernetes-based microservices architectures
Competency in infrastructure-as-code tools such as Terraform
Background in software development with experience across multiple domains

We offer

International projects with top brands
Work with global teams of highly skilled, diverse peers
Employee financial programs
Paid time off and sick leave
Upskilling, reskilling and certification courses
Unlimited access to the LinkedIn Learning library and 22,000+ courses
Global career opportunities
Volunteer and community involvement opportunities
EPAM Employee Groups
Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Engineering, Information Technology, and Business Development

Industries

Software Development, IT Services and IT Consulting, and Nanotechnology Research

Job Details:
https://mx.expertini.com/job/senior-site-reliability-engineer-sre-workfromhome-epam-systems-74-2251748/

Category:
Networks and systems (Redes y sistemas)