About this job
Job category
IT
Job type
Full time
Location
Bucharest, Romania
Company size
Mature [ 50+ employees ]
Job Description
Alchemy seeks a Site Reliability Engineer to build and maintain tools and infrastructure, automating operational tasks. The role requires strong experience in operations, networking, and software development.
Responsibilities
- Design, build, and refactor major software components that improve the availability, resilience, performance and efficiency of our system.
- Take part in our on-call rotation and respond to infrastructure incidents in accordance with our policy.
- Proactively address bugs and bottlenecks in our infrastructure.
- Define and choose the most suitable SLIs and SLOs in accordance with our system needs.
- Choose the right tools for different problems and adapt to our ever-changing specifications and growth.
- Improve our Incident Management process by reducing and fixing noisy alerts and reducing MTTD and MTTR, and support other team members in this area.
- Identify and address design bottlenecks in our infrastructure.
- Mentor new hires and onboard them to our tools and infrastructure.
- Address code complexity and efficiency issues while continuously fixing software bugs.
- Support and guide other team members with code-related problems, and participate in and provide effective code reviews.
Requirements
- Experience writing efficient code in one or more programming languages (e.g. Python, Golang, Java, Rust).
- Experience developing software applications and tools from scratch, with a clear structure, reusable code patterns and guidance, so that other team members can use and extend them.
- Past experience designing and managing the lifecycle of complex systems while taking into account multiple factors such as costs, systems performance, scalability, resilience and disaster recovery.
- Expertise in all aspects of operating Linux-based systems with focus on troubleshooting, configuration and monitoring.
- Past experience managing large-scale infrastructures running on bare metal, public and private cloud (e.g. AWS, GCP, Azure) and container-based infrastructure (Kubernetes, OpenShift, Docker, etc.).
- In-depth knowledge of protocols across the stack, such as HTTP, DNS, DHCP and routing protocols.
- Experience using programming languages and automation tools to reduce toil and automate repetitive tasks.
- Past experience with Infrastructure as Code (IaC) tools such as Terraform or Pulumi, and Configuration Management tools (e.g. Ansible, Puppet, Chef).
- Experience with one or more CI/CD solutions (e.g. Jenkins, ArgoCD, GitLab pipelines, Spinnaker, Harness) is a must.
- Experience implementing monitoring and logging solutions for infrastructure and applications.
- Must have experience with monitoring and logging tools such as Prometheus, Thanos, Splunk, Grafana, Graphite, Loki, etc.
- Past experience leading a team is a big plus.
- Great communication skills and the ability to express ideas to other team members effectively.
Benefits
- Attractive salary package
- Opportunity to work with the latest cloud and blockchain technologies
- Flexible time away
- Private Medical Insurance
- Start-up environment: internal off-site hackathons, access to company-rented hacker house during summer