Senior Staff Engineer - Operations Automation (AWS/Alibaba)

OKX
Lead the design and implementation of an enterprise-level operations automation platform in a multi-cloud environment, integrating AWS and Alibaba Cloud resources. Build a standardized, intelligent operational framework that enhances efficiency and reliability.

Overview

Department

IT

Job type

Full time

Compensation

Salary not specified

Location

Singapore, Southeast Asia

Requirements

  • Proficient in at least one programming language (Python/Go/Java), with experience in large-scale operations platform development and familiarity with microservices architecture (Spring Cloud/Dubbo) and full-stack technologies.
  • Deep understanding of AWS and Alibaba Cloud core service APIs, cloud-native technologies (Serverless, K8s Operator), and DevOps toolchains (Ansible, Prometheus).
  • Skilled in automated testing frameworks to ensure platform stability.
  • 5+ years of DevOps/operations development experience, with a proven record of designing and deploying enterprise-level automation platforms (e.g., CMDB, operations middleware).
  • Hands-on experience with AWS/Alibaba Cloud hybrid cloud automation tools, including cross-cloud resource synchronization and federated authentication (e.g., Alibaba Cloud RAM SSO, AWS IAM Identity Center).
  • Product-oriented mindset, capable of designing user-friendly and efficient operational features.
  • Strong cross-team collaboration skills to drive adoption of automation platforms across development, operations, and security teams.
  • AWS Certified DevOps Engineer or Alibaba Cloud ACP/ACE (DevOps track) certifications preferred.
  • Bachelor’s degree or higher in Computer Science, Software Engineering, or related fields.

Responsibilities

  • Automation Platform Design & Development
      • Lead the architecture design of a cross-cloud (AWS/Alibaba Cloud) operations automation platform, covering core modules such as resource orchestration, monitoring/alerting, self-healing, and cost optimization.
      • Develop unified operational APIs and a visual console, integrating AWS SDK/Boto3 and Alibaba Cloud OpenAPI/SDK to standardize cross-cloud resource operations.
  • Toolchain Integration & Optimization
      • Build end-to-end resource lifecycle management using IaC tools (Terraform, AWS CloudFormation, Alibaba Cloud ROS), enabling one-click environment provisioning and teardown.
      • Integrate CI/CD pipelines (GitLab, cloud-native toolchains) to automate application deployment, configuration changes, and database migrations.
  • Intelligent Operations Capability Development
      • Design an automated operations rule engine, leveraging AI/ML (e.g., anomaly detection, root cause analysis) for predictive fault resolution (e.g., AWS Lambda + CloudWatch event-triggered remediation).
      • Build a knowledge base system to document SOPs and enable automated execution (e.g., Alibaba Cloud OOS).
  • Multi-Cloud Coordination & Standardization
      • Design a unified operations model across AWS and Alibaba Cloud, abstracting common interfaces to address multi-cloud differences (e.g., aligning ECS and EC2 instance management strategies).
      • Establish operational standards and drive configuration standardization/automated validation across dev, test, and production environments.
  • Security & Compliance Governance
      • Embed security baseline checks to automatically scan cloud configurations (e.g., security group rules, IAM policies, Alibaba Cloud RAM permissions) and generate compliance reports.
      • Automate approval workflows for sensitive operations (e.g., Alibaba Cloud ActionTrail and AWS CloudTrail log-triggered approval tickets).
  • Cost Optimization Framework
      • Develop resource utilization analysis tools, leveraging AWS Cost Explorer and Alibaba Cloud Cost Management APIs to generate automated optimization recommendations (e.g., idle resource cleanup, scaling policy tuning).
      • Design FinOps automation solutions for budget alerts, cost allocation, and multi-dimensional cost visualization.

Benefits

  • Competitive total compensation package
  • Learning and development (L&D) programs and education subsidies to support employees' growth
  • Various team building programs and company events
  • Wellness and meal allowances
  • Comprehensive healthcare schemes for employees and dependants