Clubs PSMT - AS400 slow performance and AWS tunnels down on 2024-10-10 – Incident details

AS400 slow performance and AWS tunnels down on 2024-10-10

Resolved
Operational
Started about 1 year ago
Lasted about 3 hours

Affected

POS Guatemala

Degraded performance from 11:51 AM to 2:39 PM

6307 - Escuintla

Degraded performance from 11:51 AM to 2:39 PM

6301 - Mira Flores

Degraded performance from 11:51 AM to 2:39 PM

6303 - Pradera

Degraded performance from 11:51 AM to 2:39 PM

6305 - San Cristobal

Degraded performance from 11:51 AM to 2:39 PM

6304 - Fraijanes

Degraded performance from 11:51 AM to 2:39 PM

Updates
  • Resolved

    The incident has been successfully resolved and is now closed.

    Brief description of the problem:

    There was a service disruption affecting certain resources within the VNet, leading to connectivity issues and failures in communication between services hosted in Azure.

    Actual state:

    • The affected services are now operational, and network connectivity between the impacted resources has been restored.

    • All related VNet configurations have been reverted to their previous, working state to resolve the immediate issue.

    Root Cause:

    The problem was triggered by a policy change applied to the Azure Virtual Networks (VNets). The new policy inadvertently modified network security settings, causing unexpected connectivity restrictions. As a result, some services were unable to communicate across subnets or with external resources. The change also affected route tables and network security group (NSG) rules, disrupting internal traffic flow within the VNet.

    The issue was resolved by identifying and rolling back the policy change, restoring the original VNet configurations. The Azure team is investigating the need for additional safeguards to prevent such policy impacts in the future.
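
    As an illustration only of the kind of safeguard mentioned above, the sketch below compares a "before" and "after" snapshot of security rules and reports what a change added, removed, or modified. The rule names, fields, and the `diff_rules` helper are hypothetical and are not taken from the actual environment or from any Azure API; they are assumptions chosen to keep the example self-contained.

```python
from typing import Dict, List

Rule = Dict[str, str]


def diff_rules(before: List[Rule], after: List[Rule]) -> Dict[str, List[str]]:
    """Compare two rule snapshots, matching rules by their 'name' field."""
    before_by_name = {rule["name"]: rule for rule in before}
    after_by_name = {rule["name"]: rule for rule in after}

    added = sorted(after_by_name.keys() - before_by_name.keys())
    removed = sorted(before_by_name.keys() - after_by_name.keys())
    changed = sorted(
        name
        for name in before_by_name.keys() & after_by_name.keys()
        if before_by_name[name] != after_by_name[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken before and after a policy change.
before = [
    {"name": "allow-pos-to-backend", "access": "Allow", "port": "443"},
    {"name": "allow-vpn-inbound", "access": "Allow", "port": "500"},
]
after = [
    {"name": "allow-pos-to-backend", "access": "Deny", "port": "443"},
    {"name": "deny-all-outbound", "access": "Deny", "port": "*"},
]

report = diff_rules(before, after)
print(report)
```

    A non-empty report for a change that was not expected to touch security rules could be used as a signal to pause the rollout and review it before it reaches production networks.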

    Our systems are back to normal operation. Thank you for your understanding and support throughout this process.

  • Monitoring

    The system is stable, and we are actively monitoring the situation. We are watching closely for any further impact and will take corrective measures if needed. Further updates to follow. Thank you for your patience and understanding.

  • Update

    The issue has been identified, and our team is still working on resolving it. We appreciate your patience as we continue working towards a resolution. Further updates will follow. Thank you for your understanding.

  • Identified

    The issue has been identified and our team is working on resolving it. We appreciate your patience as we work towards a resolution. Further updates to follow. Thank you for your understanding.

  • Investigating

    We are currently investigating an incident that is affecting our systems and impacting communications. Our team is actively working to identify and resolve the issue. Further updates to follow. Thank you for your patience.