How the ilert Team Achieved a Seamless Migration from Community MySQL to AWS RDS Aurora with Minimal Customer Impact

Roman Frey
October 24, 2024
As our customer base and data volume grew over the years, scaling our database infrastructure became imperative. Our goal was an active-active database architecture that would provide regional independence and consistent service quality worldwide. Here’s an in-depth look at how our team migrated our production data to AWS RDS Aurora and kept customer impact to a minimum during the transition.

Understanding the Challenge

Facing limitations with our existing Community MySQL setup, we needed a scalable, high-availability solution that could handle our increasing load and improve global data access. Our aim was to implement an active-active configuration with AWS RDS Aurora to facilitate regional independence and enhance global service delivery.

Step 1: Strategic Pre-Migration Planning

Preparation was the first key step. Our team examined the existing database system in detail, mapping out all dependencies and specifications. We defined all infrastructure components in Terraform, which made the AWS setup smoother and kept our environments consistent, a crucial factor in reducing errors during migration.

We also evaluated our existing database configuration and query workload. This analysis helped us define the compute and storage specifications for our Aurora setup.
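As an illustration of the kind of measurement that feeds such sizing, here is a minimal sketch that totals data and index size per schema on the source MySQL server. The host name is a placeholder, credentials are assumed to come from a client config file such as ~/.my.cnf, and the function names are hypothetical. It is not the tooling we used, only one way to get a baseline.

```sh
#!/bin/sh
# Sketch: estimate how much data would move from the source MySQL server.
set -eu

SOURCE_HOST="${SOURCE_HOST:-mysql.example.internal}"  # placeholder host name

# Print "schema<TAB>bytes" for every non-system schema (data + indexes).
fetch_schema_sizes() {
  mysql --host="$SOURCE_HOST" --batch --skip-column-names --execute="
    SELECT table_schema, SUM(data_length + index_length)
    FROM information_schema.tables
    WHERE table_schema NOT IN
      ('mysql', 'sys', 'information_schema', 'performance_schema')
    GROUP BY table_schema"
}

# Sum the byte column from stdin and print the total in GiB, one decimal.
total_gib() {
  awk -F'\t' '{ total += $2 }
    END { printf "%.1f\n", total / (1024 * 1024 * 1024) }'
}
```

Running `fetch_schema_sizes | total_gib` prints a single total. The sizes reported by information_schema are estimates for InnoDB tables, so treat the result as a rough baseline rather than an exact figure.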

Step 2: Configuring AWS RDS Aurora and Read-Only Services

We set up the AWS RDS Aurora cluster, ensuring it met all our specifications for high performance and reliability. To ensure a smooth switch, we set up read-only services connected to an Aurora read replica. This step was vital as it allowed our services to continue operating on a read-only basis without any disruption when the main database was temporarily unavailable.

In parallel, we configured NGINX Ingress inside our Kubernetes clusters, which played a pivotal role in managing traffic during the migration. By defining specific rules in NGINX Ingress, we directed traffic between our normal and read-only service instances, maintaining service availability even while the main database was temporarily unavailable.

Here is an example for the service-a ingress rule:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app: service-a
    ingress-class: nginx
  name: service-a
  namespace: default
spec:
  ingressClassName: nginx
  rules:
  - host: '*.ilert.com'
    http:
      paths:
      - backend:
          service:
            name: service-a
            port:
              number: 9999
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - '*.ilert.com'

And here is an example script to manage the traffic between read-only and normal instances:

# Move traffic to read-only instances
kubectl patch service service-a -p '{"spec":{"selector": {"app": "service-a-readonly"}}}'

# Move traffic back to write instances
kubectl patch service service-a -p '{"spec":{"selector": {"app": "service-a"}}}'
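The two commands above can be wrapped so that each switch is verified after it is applied. The following is a minimal sketch, not our production script: the function names and the KUBECTL override are hypothetical, and it assumes the read-only pods carry the label app=service-a-readonly as in the patch above.

```sh
#!/bin/sh
# Sketch: repoint service-a between normal and read-only pods, then verify.
set -eu

SERVICE="${SERVICE:-service-a}"   # Service name from the Ingress example
KUBECTL="${KUBECTL:-kubectl}"     # overridable, e.g. for a dry-run wrapper

# Map a mode to the pod label value the Service should select.
selector_for() {
  case "$1" in
    readonly) echo "${SERVICE}-readonly" ;;
    write)    echo "${SERVICE}" ;;
    *)        echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Build the patch body that repoints the Service selector.
selector_patch() {
  patch_app="$(selector_for "$1")" || return 1
  printf '{"spec":{"selector":{"app":"%s"}}}' "$patch_app"
}

# Apply the patch, then read the selector back to confirm the change.
switch_traffic() {
  want="$(selector_for "$1")" || return 1
  "$KUBECTL" patch service "$SERVICE" -p "$(selector_patch "$1")"
  have="$("$KUBECTL" get service "$SERVICE" -o jsonpath='{.spec.selector.app}')"
  if [ "$have" != "$want" ]; then
    echo "selector is '$have', expected '$want'" >&2
    return 1
  fi
  echo "service/$SERVICE now selects app=$have"
}
```

Calling `switch_traffic readonly` before the cutover and `switch_traffic write` afterwards mirrors the two manual commands. Reading the selector back catches a patch that was applied to the wrong namespace or context before traffic is assumed to have moved.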

Step 3: Timing and Executing the Migration

To minimize customer impact, we scheduled the migration during our lowest-traffic period. The main database was in read-only mode for only 4 minutes. During this window, our applications were served by the read-only services connected to the Aurora replica, so data remained continuously available for reads.

At the same time, we ran the final synchronization of the remaining data from the MySQL database to the Aurora database. Once it completed, the Aurora cluster was promoted to handle both read and write operations.
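A promotion like this is only safe once the target has applied every change from the source. The sketch below shows one way to wait for that point, under the assumption that Aurora receives changes through MySQL binlog replication. This post does not specify the synchronization mechanism, so treat the approach, the host name, and the function names as hypothetical.

```sh
#!/bin/sh
# Sketch: block until the replica reports zero lag behind the source MySQL.
# Assumption: binlog replication into Aurora; credentials come from ~/.my.cnf.
set -eu

AURORA_HOST="${AURORA_HOST:-aurora.example.internal}"  # placeholder host name
POLL_SECONDS="${POLL_SECONDS:-5}"
# On MySQL 5.7-compatible engines, set STATUS_SQL='SHOW SLAVE STATUS\G'.
[ -n "${STATUS_SQL:-}" ] || STATUS_SQL='SHOW REPLICA STATUS\G'

# Run the status statement against the replica.
fetch_status() {
  mysql --host="$AURORA_HOST" --execute="$STATUS_SQL"
}

# Extract the lag value from status output on stdin.
# Accepts Seconds_Behind_Source (8.0 naming) and Seconds_Behind_Master (5.7).
replica_lag() {
  awk -F': *' '/Seconds_Behind_(Source|Master)/ { print $2; found = 1 }
    END { if (!found) exit 1 }'
}

# Poll until lag is 0; fail if replication is not running (lag is NULL).
wait_for_sync() {
  while :; do
    lag="$(fetch_status | replica_lag)" || return 1
    case "$lag" in
      0)    echo "replica caught up"; return 0 ;;
      NULL) echo "replication is not running" >&2; return 1 ;;
      *)    echo "replica lag: ${lag}s" >&2; sleep "$POLL_SECONDS" ;;
    esac
  done
}
```

A lag of zero is only meaningful after writes to the source have stopped, which is what the read-only window provides. Only then would `wait_for_sync` be a reasonable gate before promoting the target.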

Step 4: Switching Over with Minimal Disruption

Following the successful synchronization and promotion of the Aurora cluster, we switched live traffic from the read-only instances back to the normal service instances, now pointing to the newly promoted Aurora cluster. We handled this switch carefully through updated NGINX Ingress rules, which redirected all traffic to the new Aurora setup.

Step 5: Monitoring and Optimization Post-Migration

Post-migration, our team engaged in meticulous monitoring to ensure the system was functioning as expected. We paid close attention to performance metrics such as query efficiency, CPU usage, and storage utilization. Continuous optimizations were applied to ensure that our queries were fully leveraging Aurora’s advanced capabilities.
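As one example of such a check, the sketch below pulls recent CPU utilization for an Aurora cluster from CloudWatch with the AWS CLI and flags the peak against a threshold. The cluster identifier, the 80% limit, and the function names are placeholders, not values from our setup. The start time uses GNU date syntax.

```sh
#!/bin/sh
# Sketch: flag high CPU on an Aurora cluster after cutover.
set -eu

CLUSTER_ID="${CLUSTER_ID:-example-aurora-cluster}"  # placeholder identifier
CPU_LIMIT="${CPU_LIMIT:-80}"                        # placeholder threshold, percent

# Fetch per-minute average CPU for the last 15 minutes (GNU date syntax).
fetch_cpu() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS \
    --metric-name CPUUtilization \
    --dimensions "Name=DBClusterIdentifier,Value=${CLUSTER_ID}" \
    --start-time "$(date -u -d '15 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 60 \
    --statistics Average \
    --query 'Datapoints[].Average' \
    --output text
}

# Print the largest of the whitespace-separated numbers on stdin.
max_value() {
  awk '{ for (i = 1; i <= NF; i++) if (!seen++ || $i > max) max = $i }
    END { if (seen) print max; else exit 1 }'
}

# Compare the peak datapoint with the threshold and report the result.
check_cpu() {
  peak="$(fetch_cpu | max_value)" || { echo "no datapoints returned" >&2; return 1; }
  if awk -v p="$peak" -v l="$CPU_LIMIT" 'BEGIN { exit !(p > l) }'; then
    echo "WARN peak CPU ${peak}% exceeds ${CPU_LIMIT}%"
  else
    echo "OK peak CPU ${peak}%"
  fi
}
```

Running `check_cpu` on a schedule during the first hours after cutover gives a quick signal. A CloudWatch alarm on the same metric is the more durable option once thresholds are known.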

Conclusion

Migrating to AWS RDS Aurora with just a 4-minute read-only window exemplifies our team's commitment to operational excellence and minimal customer impact. Our detailed preparation, use of sophisticated tools like Terraform, and strategic execution enabled us to not only enhance our database performance but also prepare our global infrastructure to better serve our customers through an innovative active-active setup.

Today, we are successfully operating our Aurora Database Cluster in an active-active configuration across two independent regions, spanning six availability zones. This configuration not only boosts performance and ensures higher availability but also reduces latency for our global customer base.

Looking ahead, we are planning to scale our operations even further, enhancing our infrastructure's resilience and efficiency. Our journey with AWS Aurora is a testament to our ongoing commitment to leveraging cutting-edge technology to deliver the best possible service to our customers.

ilert is a validated partner of AWS and has Amazon RDS Ready and Qualified Software achievements. ilert provides out-of-the-box integrations with various Amazon services that are aimed at monitoring your systems and alerting your team when anomalies are detected. Go to the cloud and achieve operational excellence with ilert and AWS.