Achieving Seamless East–West Traffic Inspection with ACI Multi-Pod


DC Consultant | DCACI | CCNP Security | PCNSA | CDE PAM

I recently worked on an architecture that combined Cisco ACI Multi-Pod with Palo Alto firewall clustering and strict high-availability requirements.

What makes this design particularly valuable is that it is not just an architectural exercise: it is aligned with how Cisco ACI is designed and supported to behave in Multi-Pod environments when integrating third-party firewalls.

The customer environment is a two-pod ACI Multi-Pod setup. The main requirement was to enforce East–West traffic inspection through firewalls using ACI Policy-Based Redirect (PBR) in a one-arm design.

Each pod must provide N+1 redundancy: if a firewall fails, the remaining devices in the same pod must take over its traffic with no throughput penalty. Under normal conditions, intra-pod traffic should be handled by the local pod's firewalls, while inter-pod traffic should be inspected by the firewalls in Pod1. In the event of a pod-level disaster, all traffic must automatically fail over to the opposite pod without manual intervention, without session loss, and without throughput degradation.
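The steering requirement above can be restated as a small decision function. This is only a sketch of the requirement, not of how ACI implements it: the function name `inspection_pod`, the pod numbering, and the `pod_up` dictionary are all hypothetical.

```python
def inspection_pod(src_pod, dst_pod, pod_up):
    """Return the pod whose firewalls should inspect a flow.

    Hypothetical model of the requirement: intra-pod traffic stays
    local, inter-pod traffic prefers Pod1, and a pod-level failure
    moves everything to the opposite pod.
    """
    preferred = src_pod if src_pod == dst_pod else 1
    if pod_up[preferred]:
        return preferred
    opposite = 2 if preferred == 1 else 1
    if pod_up[opposite]:
        return opposite
    raise RuntimeError("no pod is available for inspection")


both_up = {1: True, 2: True}
pod1_lost = {1: False, 2: True}

intra_pod2 = inspection_pod(2, 2, both_up)      # stays in Pod2
inter_pod = inspection_pod(1, 2, both_up)       # inspected in Pod1
inter_pod_dr = inspection_pod(1, 2, pod1_lost)  # fails over to Pod2
```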

To meet these requirements and allow for future throughput growth, the design uses four firewalls per pod (two HA pairs per pod). Each HA pair operates in active/active mode, and all eight firewalls form an extended inter-pod active/active cluster using the HA4 interfaces.

This type of design raises a few important questions:

1. How do we avoid asymmetric traffic?

Because the cluster is active/active, ACI performs ECMP on redirected traffic, spreading flows across the firewalls by hash. To keep each session stateful and on the same firewall throughout its lifetime, ACI provides symmetric PBR, which sends both directions of a session to the same PBR destination.
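To illustrate the property that symmetric PBR provides, here is a minimal sketch of a direction-independent hash. It is not ACI's actual hashing algorithm, which runs in the fabric hardware; the function `pick_firewall`, the firewall names, and the IP addresses are made up for the example.

```python
import hashlib


def pick_firewall(src_ip, dst_ip, protocol, firewalls):
    """Pick a firewall so that A->B and B->A give the same answer."""
    # Sorting the two endpoints removes the direction from the key.
    first, second = sorted((src_ip, dst_ip))
    key = f"{first}|{second}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(firewalls)
    return firewalls[index]


firewalls = ["fw1", "fw2", "fw3", "fw4"]
forward = pick_firewall("10.1.1.10", "10.2.2.20", 6, firewalls)
reverse = pick_firewall("10.2.2.20", "10.1.1.10", 6, firewalls)
```

Whatever firewall is chosen, `forward` and `reverse` are always equal, so the firewall sees both halves of the session and can keep its state table consistent.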

2. How do we prevent session disruption when a firewall fails within the same pod?

When a firewall goes down, sessions handled by that device must be transparently taken over by the remaining firewalls in the pod, without impacting existing sessions on healthy devices. Normally, losing a cluster member would trigger a re-hash across all remaining members, potentially disrupting active flows. ACI solves this with resilient hashing, which ensures that only the sessions belonging to the failed firewall are redistributed, while all other sessions remain untouched.
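The idea behind resilient hashing can be sketched with a fixed bucket table: flows hash to buckets, buckets map to firewalls, and a failure rewrites only the failed firewall's buckets. This is a simplified model, not ACI's implementation; the bucket count, the helper names, and the way a replacement firewall is chosen are assumptions made for illustration.

```python
import hashlib

NUM_BUCKETS = 64  # illustrative size, not an ACI value


def bucket_of(flow):
    """Map a flow identifier to a fixed bucket number."""
    digest = hashlib.sha256(flow.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def build_table(nodes):
    """Assign buckets to nodes round-robin."""
    return [nodes[i % len(nodes)] for i in range(NUM_BUCKETS)]


def fail_node(table, failed):
    """Reassign only the failed node's buckets; leave the rest as-is."""
    survivors = sorted(set(table) - {failed})
    if not survivors:
        raise RuntimeError("no surviving nodes")
    new_table = list(table)
    reassigned = 0
    for i, owner in enumerate(table):
        if owner == failed:
            new_table[i] = survivors[reassigned % len(survivors)]
            reassigned += 1
    return new_table


nodes = ["fw1", "fw2", "fw3", "fw4"]
flows = [f"flow-{n}" for n in range(1000)]

before = build_table(nodes)
after_failure = fail_node(before, "fw2")

# Flows whose firewall changed as a result of the failure.
moved = [f for f in flows if before[bucket_of(f)] != after_failure[bucket_of(f)]]
```

Every flow in `moved` was previously on `fw2`; flows on the three healthy firewalls keep their original mapping, which is the behavior the paragraph above describes.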

3. Can we always prefer local pod firewalls and use the remote pod only as a last resort?

Each firewall exposes a loopback interface used for path monitoring and failover decisions. As long as at least one firewall in each HA pair is operational, traffic remains within the local pod.

At this point, a technical limitation of ACI PBR policy logic becomes visible: if an entire HA pair goes down, ACI’s PBR Backup Policy is triggered, redirecting traffic to loopbacks in the opposite pod based on priority. Even if the remaining HA pair in the local pod could theoretically handle all traffic, part of the traffic will still be redirected to the remote pod. This happens because the Redirect Health Policy in ACI becomes degraded once both firewalls in an HA pair are lost (two loopbacks are down), causing the Primary PBR policy to switch to the Backup PBR policy.
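The behavior described above can be modeled in a few lines. This sketch only mirrors the logic as explained in this article (a pair's share of traffic stays local while one member is up, and moves to a backup destination when both are down); it does not reproduce ACI's internal health-group evaluation. The function `resolve_destinations` and all firewall names are hypothetical.

```python
def resolve_destinations(primary_pairs, backup_pairs, is_up):
    """Return, per primary HA pair, the loopbacks that receive its traffic.

    backup_pairs is ordered by priority (first entry is used first).
    """
    available_backups = [
        [fw for fw in pair if is_up[fw]] for pair in backup_pairs
    ]
    available_backups = [pair for pair in available_backups if pair]

    resolved = []
    for pair in primary_pairs:
        alive = [fw for fw in pair if is_up[fw]]
        if alive:
            # At least one member is up: traffic stays in the local pod.
            resolved.append(alive)
        elif available_backups:
            # Whole pair is down: its share moves to a backup pair,
            # even if other local pairs are still healthy.
            resolved.append(available_backups.pop(0))
    return resolved


POD1 = [("p1-fw1", "p1-fw2"), ("p1-fw3", "p1-fw4")]
POD2 = [("p2-fw1", "p2-fw2"), ("p2-fw3", "p2-fw4")]


def state(*down):
    """Build an up/down map with the named firewalls marked down."""
    names = [fw for pair in POD1 + POD2 for fw in pair]
    return {fw: fw not in down for fw in names}


one_down = resolve_destinations(POD1, POD2, state("p1-fw1"))
pair_down = resolve_destinations(POD1, POD2, state("p1-fw1", "p1-fw2"))
```

With a single firewall down, `one_down` contains only Pod1 loopbacks. With the whole first pair down, `pair_down` contains a Pod2 pair alongside the healthy Pod1 pair, which is the partial redirection to the remote pod noted above.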

Final Thoughts

Features such as symmetric PBR, resilient hashing, and PBR Backup Policies are mature, documented mechanisms that have been validated by Cisco for service insertion at scale. When combined with firewall platforms that support active/active clustering and state synchronization, the result is a solution that behaves predictably not only during steady state, but also under partial failures and pod-level events.

This is an important distinction: the solution works not because of custom logic or non-standard tuning, but because it leverages supported ACI behaviors exactly as intended in Multi-Pod designs.

Overall, this approach delivers stateful, high-throughput traffic inspection with automatic failover across pods, without manual intervention and without compromising session continuity.

In environments where scale and resilience are mandatory, this model avoids the typical trade-offs of two alternatives: active/passive designs, where the added throughput of an HA pair cannot be fully utilized, and isolated firewall clusters, where sessions cannot be synchronized across firewalls in both pods.