Gramedia as Senior DevOps Engineer
May 2024 - Present
- Enhanced system availability and performance by developing a Static Management System utilizing Cache and CDN, resulting in improved user experience.
- Increased availability and performance while reducing costs by 10% by implementing a Proxy Image Processing and Transformation system with Cache and CDN.
- Modernized and optimized infrastructure through the implementation of a three-layer architecture, significantly improving system efficiency.
- Implemented Infrastructure as Code using Terraform, streamlining deployment processes and ensuring consistent infrastructure management.
- Transitioned from traditional infrastructure to containerized solutions using Kubernetes, enhancing scalability and operational efficiency.
- Developed a centralized dashboard monitoring solution with Grafana and Prometheus, accelerating the incident tracking process and improving response times.
- Simplified backup strategies by utilizing Tags, ensuring efficient data management and recovery processes.
- Maintained consistency and clarity in all technical documentation, avoiding industry jargon for better understanding across teams.
Jakarta, Indonesia
Lion Parcel as Site Reliability Engineer (SRE)
Jan 2021 - May 2024
- Provided technical support and assistance to developers, addressing their problems and needs to keep development processes running smoothly.
- Used JIRA for project management and issue tracking, ensuring efficient collaboration and task management.
- Developed and maintained an automated CI/CD pipeline using Jenkins, enabling seamless code deployment for every release.
- Created and maintained deployment processes for mobile apps on Android and iOS platforms, ensuring efficient and reliable distribution.
- Provisioned infrastructure, servers, and services using Terraform, enabling scalable and consistent deployment and management.
- Set up monitoring, tracing, and logging tools such as ELK, Grafana, and Datadog, ensuring comprehensive visibility into system performance and issues.
- Implemented monitoring alerts for services, databases, and logs, reducing the occurrence of errors in production environments.
- Managed and monitored a Kubernetes-based container cluster, maintaining high availability with 99.9% uptime.
- Actively managed, improved, and monitored cloud infrastructure services on AWS and GCP, including backups, patches, and scaling.
- Administered GitHub repositories and permissions, including branching and tagging, ensuring efficient version control and collaboration.
- Developed and maintained scripts to automate tasks, improving efficiency and reducing manual effort.
- Integrated automated testing into the CI/CD pipeline, ensuring the quality and reliability of software releases.
- Scaled Jenkins agents in Kubernetes to increase the number of executors, optimizing build and deployment processes for Go and Node.js projects.
- Implemented a single-dashboard monitoring solution using Prometheus, Thanos, and Grafana, providing a centralized view of system metrics.
- Contributed to the migration from GCP to AWS, minimizing downtime and ensuring a smooth transition.
- Imported existing infrastructure into Terraform and implemented GitOps for managing and automating IaC.
Jakarta, Indonesia
BuangDisini as DevOps / Cloud Engineer
May 2022 - Nov 2022
- Provisioned and managed infrastructure using Terraform and Ansible, ensuring efficient and reliable deployment of resources.
- Created CI/CD automation for code deployment using GitHub Actions, streamlining the development and release process.
- Implemented container management and monitoring using Portainer, enhancing visibility and control over containerized applications.
- Integrated various tools and systems, improving collaboration and efficiency across the development and operations teams.
- Conducted troubleshooting and root cause analysis, swiftly identifying and resolving issues to minimize downtime and optimize system performance.
- Maintained clear and concise documentation of infrastructure configurations and processes, facilitating ease of understanding and knowledge transfer.
Remote