Site Reliability Engineer (SRE)

Tel Aviv, Israel

Full Time

Engineering

Mid Level

We are looking for a highly experienced Site Reliability Engineer (SRE) or DevOps Engineer to join our passionate Engineering team. In this critical role, you will be instrumental in ensuring the reliability, performance, and scalability of our core infrastructure. You will work with the latest cloud technologies, focusing on automating and optimizing our continuous integration and continuous deployment pipelines and on managing our Kubernetes environment. The role requires working from the office three times a week.

Qualifications

5+ years of experience in a DevOps or SRE role.
Experience designing and operating large-scale distributed systems.
Deep understanding of SRE principles and practices (SLOs/SLIs, Error Budgets, Toil reduction).
Working knowledge of Kubernetes cluster administration (preferably EKS), including Helm and GitOps.
Scripting and automation skills (Shell, Python, etc.)
Experience using a broad range of AWS technologies (EC2, S3, VPC, Lambda, IAM, CloudWatch, etc.)
Proven record of build automation and CI/CD pipelines (including GitHub Actions, ArgoCD, FluxCD).
Experience working with monitoring frameworks like Grafana, Datadog, Prometheus, and ELK.
Experience with cloud-managed database services (e.g., AWS RDS, Redis, DynamoDB).
Knowledge of DNS, Load Balancing, SSL, TCP/IP, networking, and security
Experience provisioning infrastructure using IaC tools such as Terraform, Serverless Framework, CloudFormation, and Pulumi.
Experience with DB administration and maintenance
Outstanding interpersonal communication skills

What You’ll Do

Analyze and determine integration needs.
Automate infrastructure and application deployment on AWS.
Identify manual processes that can be automated.
Maintain and improve our cloud infrastructure.
Continuously maintain and improve our CI/CD pipelines.
Design, implement, and maintain scalable and highly available infrastructure systems, focusing on reliability and performance.
Develop and implement robust monitoring, alerting, and logging solutions to proactively identify and resolve potential system issues.
Conduct blameless post-mortems for critical incidents, driving continuous improvement in system resilience.
Participate in capacity planning and performance tuning to ensure the platform can handle current and future load.
Required: be available on call to respond to and resolve critical infrastructure issues outside regular business hours, including weekends.

LinearB Values:

Put the Customer First
Take Ownership
One Team
Show Product Expertise
Be Data Driven
Reach for the Next Level
Listen Curiously & Speak Courageously

LinearB is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected veteran status, age, or any other characteristic protected by law.

