
Urgent! Site Reliability Engineer Position in WorkFromHome - Gilder Search Group

Site Reliability Engineer



Job description

Prediktive - LATAM, United States

We are looking for a Site Reliability Engineer based in Latin America to work on a long-term project for one of our clients, a software development company based in Los Angeles, CA.

Our client’s platform is a mobile-first CMMS, EAM & IIoT suite of solutions that helps teams streamline work orders, track assets, and schedule preventive maintenance, all in one place.

Responsibilities

  • Configure and operate monitoring, logging and tracing tools.

  • Work with developers to improve application logging, with a focus on problem detection.

  • Implement modern tooling or replace existing tooling as needed.

  • Build dashboards, alerts, and automation workflows; define and track reliability metrics.

  • Monitor system performance and reliability, and implement improvements as needed.

  • Collaborate with software engineering teams to design and implement reliable systems.

  • Write and maintain robust automation tasks for infrastructure and development processes.

  • Participate in a 24/7 on-call rotation for alerts and incidents, and engage in root cause analysis and post-mortem meetings as needed.

  • Implement and manage security and compliance best practices across infrastructure and pipelines.

  • Manage and optimize AWS EKS Kubernetes clusters for deployment, scaling, and operation of containerized applications.

  • Design, build, and maintain scalable customer-facing infrastructure on AWS using Terraform.

  • Collaborate with developers and QA teams to streamline code deployment and testing workflows.

  • Collaborate with Database Administrators to gather requirements and implement client-side configuration of database connections.

  • Support Linux-based environments across development, staging, and production.

Qualifications

  • Advanced level of English.
  • 5+ years of professional experience in Site Reliability Engineering or a related role.

  • 4+ years of experience scripting with Bash or Shell.

  • 3+ years working with Linux-based systems and troubleshooting system-level issues.

  • 3+ years of experience with CI/CD pipelines, preferably using GitHub Actions.

  • 2+ years working with Kubernetes (monitoring, deployment, scaling and networking).

  • Expertise in SRE concepts, including SLI/SLOs and Golden Signals.

  • Understanding of AWS services (EC2, EKS, S3, IAM, CloudWatch, etc.).

  • Familiarity with Docker and container registries.

  • Strong problem-solving skills and the ability to work independently and collaboratively.

Bonus Points

  • Bachelor’s degree in Computer Science, Systems Engineering, or a related field.
  • Experience with Helm charts for Kubernetes deployments.

  • Knowledge of networking fundamentals and security best practices.

What we offer

  • Long-term positions
  • Compensation in USD
  • Paid time off
  • Cool clients and products
  • Work with great engineers


Posted: Monday, September 22, 2025
Job # 1165



Required Skill Profession

Networks and Systems





    Unlock Your Site Reliability Potential: Insight & Career Growth Guide