Maintaining Virtual Circuits
Learn about planned maintenance of FastConnect virtual circuits.
Oracle performs regular maintenance on the routers dedicated for use with FastConnect virtual circuits. This maintenance lets Oracle enhance overall device operational stability by replacing faulty hardware, applying patches, and more. These maintenance activities are crucial for service improvement. Each maintenance task is planned carefully and scheduled in advance to minimize any impact on services. This article outlines what happens during FastConnect maintenance and what steps to take to minimize service outages because of planned or unplanned maintenance.
We recommend that you always configure a primary and redundant secondary connection to Oracle Cloud Infrastructure. The redundant secondary connection can either be a FastConnect private virtual circuit or an IPSec connection. When an IPSec connection is being used as the secondary path, ensure that the Site-to-Site VPN IPSec tunnels are configured to use BGP Routing. While using such connections, Oracle Cloud always prioritizes FastConnect over IPSec tunnels using the AS Path prepend mechanism.
Establish primary and secondary connections on different physical devices to offer reliable connectivity from on-premises to OCI resources. While creating a secondary FastConnect connection with FastConnect Direct, use the "specify router proximity" option from the OCI Console to make the secondary connection land on a different physical device. For Partner connections, work with your FastConnect partner to provision a secondary virtual circuit on a redundant partner cross-connect. This helps you have uninterrupted connectivity during any planned or unplanned events. For information on redundancy practices, see the Connectivity redundancy guide (PDF).
High Availability in Virtual Circuits
High availability in virtual circuits is achieved through redundant connections between OCI and the on-premises network. Implementing high availability keeps connectivity intact during outages and planned activities. If you're using the Oracle partner connectivity model, Oracle handles the redundancy of the physical connections between the partner and Oracle, and the redundancy of routers in the FastConnect locations. You're expected to handle the redundancy of the physical connection between the on-premises network and the Oracle partner. For other FastConnect connectivity models, such as third-party provider and colocation, you're responsible for redundancy between the FastConnect routers and your own edge devices. To achieve it, configure redundant virtual circuits that use the different physical FastConnect routers that Oracle provides in every region and FastConnect POP location.
The following network topologies show redundant virtual circuits used in the Oracle partner scenario, Third-party provider or colocation with Oracle scenario, and IPSec VPN as a backup for FastConnect.
For more information, review FastConnect Redundancy Best Practices.
Maintenance Events and Notifications
Planned maintenance for FastConnect services is carefully scheduled to focus on one FastConnect router at a time, so that connectivity over redundant virtual circuits is uninterrupted during maintenance. As long as you have redundant circuits on diverse paths, at least one path is always available.
During maintenance, Oracle sends the RFC 8326 "BGP GRACEFUL SHUTDOWN 65535:0" community to CPE edge devices along with AS path prepending. If the CPE device honors this community, it sets the local preference for the affected routes to zero so that the path going under maintenance is no longer preferred. The AS path change is made by prepending the Oracle AS 31898 to the BGP routes advertised from OCI to the CPE. Sending the community together with AS path prepending ensures that traffic shifts gracefully to the redundant path before the maintenance activity.
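The effect of AS path prepending on path selection can be illustrated with a minimal Python sketch. It models a heavily simplified BGP decision (highest local preference, then shortest AS path) and isn't a router implementation. The `Route` class, the path labels, the default local preference of 100, and the prepend count of 2 are assumptions made for illustration; only the AS number 31898 comes from this article.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple

ORACLE_ASN = 31898  # Oracle's AS number, as stated in this article


@dataclass(frozen=True)
class Route:
    path_name: str             # hypothetical label for a virtual circuit
    as_path: Tuple[int, ...]   # AS numbers, nearest neighbor first
    local_pref: int = 100      # assumed default; vendors may differ


def prepend(route: Route, asn: int, times: int) -> Route:
    """Return a copy of the route with `asn` prepended `times` times."""
    return replace(route, as_path=(asn,) * times + route.as_path)


def best_path(routes: Iterable[Route]) -> Route:
    """Simplified selection: highest local preference, then shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


primary = Route("primary", (ORACLE_ASN,))
secondary = Route("secondary", (ORACLE_ASN,))

# The prepend count of 2 is illustrative only, not Oracle's actual value.
primary_in_maintenance = prepend(primary, ORACLE_ASN, times=2)

chosen = best_path([primary_in_maintenance, secondary])
print(chosen.path_name)  # prints "secondary"
```

Because the prepended route has a longer AS path, the CPE in this model prefers the secondary path without any change to local preference.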
Ensure that any on-premises devices in the path are set up to acknowledge the AS path prepend or BGP Graceful Shutdown Community message. Also, validate that redundancy is configured to shift the traffic to an alternate path, in case the primary path is de-preferred. Lastly, where applicable, check with your service provider to confirm they allow AS Path prepending or BGP Community messages on the connection they manage to your network.
If the network doesn't allow the preceding actions, you're likely to experience asymmetric routing and packet drops during maintenance activities.
Setting the local preference to zero on the CPE device after receiving the graceful shutdown community might be vendor specific. Validate with equipment vendors that the CPE device has this feature. If not, configure an inbound routing policy to set the local preference on the CPE to zero, based on receiving the graceful shutdown community message from OCI.
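The logic of such an inbound routing policy can be sketched in Python as follows. This models the decision only and isn't vendor configuration syntax: the `ReceivedRoute` class, the neighbor labels, the example prefix, and the default local preference of 100 are assumptions. The community value 65535:0 is the graceful shutdown community named earlier in this article.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet

GRACEFUL_SHUTDOWN = "65535:0"  # graceful shutdown community (RFC 8326)


@dataclass(frozen=True)
class ReceivedRoute:
    prefix: str
    neighbor: str                             # hypothetical session label
    communities: FrozenSet[str] = frozenset()
    local_pref: int = 100                     # assumed default value


def inbound_policy(route: ReceivedRoute) -> ReceivedRoute:
    """Set local preference to 0 when the graceful shutdown community is present."""
    if GRACEFUL_SHUTDOWN in route.communities:
        return replace(route, local_pref=0)
    return route


# The same prefix learned over two sessions; only the primary carries the community.
via_primary = inbound_policy(
    ReceivedRoute("10.0.0.0/16", "primary", frozenset({GRACEFUL_SHUTDOWN}))
)
via_secondary = inbound_policy(ReceivedRoute("10.0.0.0/16", "secondary"))

# Higher local preference wins, so traffic moves to the secondary session.
preferred = max([via_primary, via_secondary], key=lambda r: r.local_pref)
print(preferred.neighbor)  # prints "secondary"
```

On a real CPE the equivalent is typically a route map or policy statement that matches the community and sets the local preference. Check the vendor documentation for the exact syntax.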
OCI routers support AS path prepending when it's also supported on CPE devices. Asymmetric routing is possible if traffic shifting on the CPE and OCI internal routers doesn't happen at the same time, because of a delay in shifting traffic. To eliminate such issues, we recommend that you enable support for asymmetric routing in CPE devices.
When planned maintenance is scheduled, you're notified at least 14 days before the maintenance window through Console Announcements and, if you're subscribed to email notifications, by email. Email notification contacts are added and managed by the Service Administrator. You're notified of service outages and security incidents using the same mechanisms.
Verifying Virtual Circuit Failover
When you first provision redundant connections, validate that they're working correctly before you place them into production. Repeat the validation regularly (for example, every 6 months or every year) or before scheduled maintenance windows, because changes made after the initial failover test can break failover. If you test only when you first provision the redundant connectivity, you risk discovering that failover doesn't work during an actual outage, which might be too late. Also, remember to validate that failing back to the primary works.
The failover validation process has two stages, both of which occur during OCI router maintenance:
- De-prefer the primary path using local preference and AS path prepend, then verify that traffic shifts to the secondary path. The Connectivity redundancy guide (PDF) explains how the AS path prepend and local preference settings can be used to prioritize a particular path. This is the main failover test to perform, because OCI runs the same path de-prefer process during the maintenance window before the BGP session shutdown.
- Shut down the primary BGP session between on-premises and OCI networks. To shut down the BGP session, deactivate the virtual circuit from the OCI Console. This forces traffic to flow through the secondary connection.
You can bring the primary path back up by reverting the changes, and then check that traffic is forwarded over the primary path again. We recommend testing failover using both of the preceding methods to ensure the failover mechanism works smoothly.
For Oracle partner connectivity models, if you have several virtual circuits, you can validate failover using the previously mentioned methods. If you have only one virtual circuit, you can't test failover, because the redundancy exists only between the Oracle FastConnect router and the provider.
If the on-premises network uses stateful firewalls, you're prone to issues during failover, so it's even more important to ensure traffic failover happens as expected.
Traffic statistics can be monitored on the OCI Console. The bits received and bits sent metrics only increment on the current active path. You can monitor the health, capacity, and performance of a FastConnect connection by using metrics, alarms, and notifications. For more information, see Monitoring and Notifications.
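During a failover test, one way to decide which path is carrying traffic is to compare two readings of the traffic counters for each virtual circuit. The following Python sketch assumes you have already collected cumulative bit counts per circuit by whatever means you use to read the metrics. The function name, the circuit labels, and the sample numbers are hypothetical and don't come from any OCI API.

```python
from typing import Dict, List


def active_paths(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    """Return the circuits whose bit counter increased between two readings."""
    return sorted(
        name
        for name, earlier in before.items()
        if after.get(name, earlier) > earlier
    )


# Made-up readings taken before and after de-preferring the primary path.
reading_1 = {"primary": 1_000_000, "secondary": 50_000}
reading_2 = {"primary": 1_000_000, "secondary": 900_000}

print(active_paths(reading_1, reading_2))  # prints "['secondary']"
```

If both circuits appear in the result while one path is supposed to be de-preferred, traffic might be flowing asymmetrically, which is worth investigating before the maintenance window.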
Frequently Asked Questions
- What impact can redundant virtual circuits have?
- When you use redundant virtual circuits, you're less likely to experience disruptions during maintenance. If the BGP peers support AS path prepending and GSHUT (graceful shutdown), the traffic convergence process is faster and smoother. Traffic switches to the secondary path after BGP reconvergence, and we don't expect any noticeable impact in such scenarios. Always follow the guidance described earlier in this article.
- What happens if I don't have redundant virtual circuits?
- If you rely on a single virtual circuit you might experience traffic interruptions during the maintenance window outlined in the notification. We recommend following the FastConnect Redundancy Best Practices to implement redundant connections. For immediate redundancy, configuring an IPSec connection as an alternative path can help mitigate disruptions, keeping in mind the bandwidth limitation of Site-to-Site VPN.
- What is the expected impact duration for non-redundant connections?
- The actual maintenance doesn't take the entire time specified in the notice, but the impact can occur at any point within the specified maintenance window. Prepare and plan based on the start and end times provided.
- Can I request a modification in AS path prepending to prepend more than 3 times during maintenance or change the GSHUT configuration?
- No. The procedures for AS path prepending and GSHUT are standardized globally in OCI and can't be altered to accommodate individual requests.
- How can I quickly verify virtual circuit redundancy?
- You can verify redundancy through the OCI Console. Non-redundant virtual circuits trigger a notification on the Console that indicates the lack of redundancy: "This FastConnect virtual circuit has a single point of failure. Provision a redundant connection to avoid potential outages." To confirm:
- Open the virtual circuit's details page in the OCI Console.
- Look at the BGP section of the details. The "Logical Device" field displays the device name hosting the virtual circuit. For redundancy, the logical devices for each virtual circuit must be different.
- Both virtual circuits land on the same physical device. Is it possible to perform maintenance on one virtual circuit at a time?
- Maintenance is performed at the physical device level, not the virtual circuit level. Maintenance on one device impacts all associated virtual circuits simultaneously. We recommend you implement redundant connections landing on different physical devices to avoid such scenarios. Follow the FastConnect Redundancy Best Practices.
- Why was a maintenance window canceled or rescheduled?
- Maintenance can be postponed if issues are identified that could lead to unforeseen disruptions. The priority is to complete maintenance with minimal impact to you. Rescheduling ensures all blockers are resolved before proceeding.
- Can I request that maintenance be canceled or rescheduled?
- We notify all affected customers 14 days before the maintenance to give you enough time to prepare and manage your network. These notifications are sent to all customers who have virtual circuits running on the same device. We can't accommodate individual customer requests based on time or date preferences.
- Can I request that the maintenance notification is sent again?
- No. Oracle sends emails in an automated format based on compartment IDs and virtual circuit OCIDs, and the automation sends them to the notification email address associated with that compartment. The customer notification team can't send individual emails; these notifications are sent only in batches.
- I got several notifications with different dates. How can I confirm when the maintenance is scheduled?
- If you have several notifications because of rescheduling, consider the latest notification the definitive source of truth. Always plan based on the updated time frames provided in the most recent notification.
- What redundancy do I need to provision when using an Oracle Partner to provide Layer 3 connectivity to OCI?
- Oracle ensures that its partners offering Layer 3 connectivity already have physical redundancy provisioned on their behalf. You're responsible for provisioning your preferred redundancy to the Oracle Partner. If you're using a region with a single FastConnect location and also want location diversity, consider provisioning a second virtual circuit to a nearby region. See BGP Session to Oracle Partner for more details.
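The redundancy check described in the FAQ (comparing the "Logical Device" field of each virtual circuit) can be sketched in Python once you have noted the values from the Console. The function names, circuit names, and device names below are hypothetical placeholders. The sketch only compares strings and doesn't call any OCI API.

```python
from collections import defaultdict
from typing import Dict, List


def shared_devices(logical_device_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each logical device that hosts more than one circuit to those circuits."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for circuit, device in logical_device_of.items():
        groups[device].append(circuit)
    return {dev: vcs for dev, vcs in groups.items() if len(vcs) > 1}


def is_device_redundant(logical_device_of: Dict[str, str]) -> bool:
    """True when there are at least two circuits and no two share a device."""
    return len(logical_device_of) >= 2 and not shared_devices(logical_device_of)


# Values copied by hand from each circuit's details page (placeholders here).
circuits = {"vc-primary": "device-a", "vc-secondary": "device-a"}

print(is_device_redundant(circuits))  # prints "False"
print(shared_devices(circuits))       # prints "{'device-a': ['vc-primary', 'vc-secondary']}"
```

A result of `False` means that maintenance on the shared device would affect every circuit listed for it at the same time.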