Department / Position | Site Reliability Engineer |
---|---|
Job category | |
Industry | |
Work location | |
Job description |
■Mission
Our mission is to provide customers with the most reliable financial infrastructure to process card transactions and offer a range of financial services that support businesses. As a Senior Site Reliability Engineer for our fintech product, you will play a pivotal role in ensuring our systems are robust, resilient, and scalable. You will lead efforts to improve our service uptime and performance, enabling our customers to grow their businesses with confidence.
■Responsibilities
As a Senior Site Reliability Engineer, your primary responsibilities will include, but are not limited to, the following.
・Service Level Indicators (SLIs) & Metrics: Break down key Service Level Indicators (SLIs) into specific, measurable metrics that reflect the health and performance of our financial services infrastructure.
・Managing Monitoring and Alerting Systems: Design, build, and maintain advanced monitoring and alerting systems to promptly detect and address issues before they impact our customers. This includes leveraging a variety of tools and technologies to gain insights into system performance and reliability.
・Service Level Objectives (SLOs): Work closely with engineering teams to establish and support Service Level Objectives (SLOs) that align with our mission to provide reliable services. Provide guidance and support to engineers in achieving these SLOs through effective monitoring, alerting, and incident response strategies.
・Incident Management and Response: Lead and participate in the incident response process, including post-mortem analysis and implementing preventive measures to minimize recurrence.
・Continuous Improvement: Continuously evaluate and improve our infrastructure and processes to enhance reliability, scalability, and efficiency. Foster a culture of innovation and excellence within the team.
・Cross-Functional Collaboration: Foster a collaborative environment by working closely with Infrastructure/Platform Teams, Application Developers, and the Information Security Team to ensure system architectures and deployments are optimized for security, reliability, and scalability. Coordinate with these teams to implement best practices for infrastructure management, application development, and security. Drive the integration of SRE principles into the broader engineering culture, facilitating knowledge sharing and joint problem-solving efforts to enhance overall system performance and security posture.
・Team Leadership and Mentorship: Oversee a team of junior SREs and other technical staff, providing mentorship, guidance, and support to ensure professional growth and achievement of team objectives. |
Qualifications |
【Required (MUST)】
・A minimum of 5 years of experience in a Site Reliability Engineering role or similar, with at least 2 years in a leadership position.
・Deep understanding of SLIs, SLOs, and SLAs and their importance in maintaining high service reliability and performance.
・Proficiency in designing, building, and maintaining monitoring and alerting systems using tools like Prometheus, Grafana, ELK stack, etc.
・Experience with cloud services (e.g., AWS, Google Cloud Platform, Azure) and container orchestration technologies (e.g., Kubernetes, Docker).
・Strong knowledge of infrastructure as code (IaC) practices and tools (e.g., Terraform, Ansible).
・Excellent problem-solving skills, with the ability to lead root cause analysis and implement strategic solutions.
・Strong leadership and communication skills, with the ability to mentor junior team members and collaborate with cross-functional teams.
【Preferred (WANT)】
・Experience working in a start-up or fast-growing company.
・Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
・You have a strong interest in and knowledge of Cloud Native technologies.
・You are true to data and can make radical and innovative decisions based on research and practical testing.
・You have a strong focus on our users’ success, and their trust in our service as a whole.
・You have experience building quality solutions iteratively with a team.
・You are passionate about technical learning, and sharing your learnings with others.
・You are comfortable and excited about tackling ambiguous challenges and shaping our services.
・You are comfortable writing documentation in English, and speaking in English both informally and during formal presentations. |
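Highlights | Founded within the last 5 years; in-house services/products; startup; 120 or more days off per year; track record of maternity and childcare leave being taken; stock option plan; track record of women in management; top-class market share; track record of managers in their 20s; revenue up 10% or more for 2 consecutive years; every Saturday and Sunday off; flextime |
Remote work | Allowed. Even when "Allowed" is shown, conditions vary by posting (for example, "work from home only" or "for a limited period only"). |
Passive smoking measures | No smoking indoors |
Last updated | 2024/04/01 |
Job ID | 3415422 |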