The last three posts focused on understanding how to calculate our TCAM requirements. Now that we understand how TCAM is allocated, it’s time to create class and policy maps for QoS classification and marking. I won’t display the whole set of ACLs and the QoS policy creation in this post (queueing on the 9k will get its own post), but I will demonstrate how we may have to steal space from one or more TCAM regions in order to fulfill our requirements.
I’ve created my IPv4 and IPv6 ACLs and the required class and policy maps.
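To give a sense of the shape of that configuration, here is a minimal sketch (the ACL and class-map names are hypothetical placeholders; IN-UNTRUSTED is the policy name applied to an interface later in this post):

```
! Hypothetical, trimmed-down classification example
ip access-list ACL-V4-VOICE
  permit udp any any range 16384 32767
class-map type qos match-any CM-VOICE
  match access-group name ACL-V4-VOICE
policy-map type qos IN-UNTRUSTED
  class CM-VOICE
    set dscp 46
```

When I applied the service-policy to a vPC interface, I was presented with an unfortunate message: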
```
2017 Feb 13 18:18:52 switch-9k-1 %$ VDC-1 %$ %ACLQOS-SLOT1-2-ACLQOS_OOTR: Tcam resource exhausted: Ingress L2 QOS [ing-l2-qos]
```
What happened? Remember from the first post that our default TCAM allocation for the ing-l2-qos region was rather small:
```
switch-9k-1# show hardware access-list tcam region
                                NAT ACL[nat] size =    0
                     Ingress PACL [ing-ifacl] size =    0
                                 VACL [vacl] size =    0
                      Ingress RACL [ing-racl] size = 1536
                    Ingress RBACL [ing-rbacl] size =    0
                 Ingress L2 QOS [ing-l2-qos] size =  512
       Ingress L3/VLAN QOS [ing-l3-vlan-qos] size =  512
                        Ingress SUP [ing-sup] size =  512
 Ingress L2 SPAN filter [ing-l2-span-filter] size =  256
 Ingress L3 SPAN filter [ing-l3-span-filter] size =  256
                    Ingress FSTAT [ing-fstat] size =    0
                                 span [span] size =  512
                     Egress RACL [egr-racl] size =  512
                       Egress SUP [egr-sup] size =  256
            Ingress Redirect [ing-redirect] size =    0
```
And now it becomes obvious why we need to invest the time to determine our QoS TCAM requirements. I calculated that I would need 511 IPv4 and 518 IPv6 entries, for a total of 1029 TCAM entries, which exceeds the 512-entry default allocation for the ing-l2-qos region.
In my topology, L3 will be handled by a pair of upstream Catalyst 6k switches. Policy maps will be attached directly to interfaces and not to VLANs or SVIs, so ing-l2-qos is the region which needs to be expanded. Based on the topology and the default allocations, I will reduce the allocation for the Ingress RACL region by 1024 and add that to the ing-l2-qos region. If needed, I will apply ACLs on the upstream switch SVIs to restrict access to hosts connected to the Nexus.
Note: per Cisco’s documentation for Nexus 9k TCAM Carving, it is critical that when re-allocating memory between TCAM regions, a full slice must be removed from one region in order to actually increase the space of another. For example, the ing-racl region is cut into 3 slices of 512 entries each. To take space from this region, we must decrease it in multiples of 512; reducing it by 256 would not fully de-allocate a memory slice for use elsewhere. If only 256 additional entries were required for our ing-l2-qos region, that space could be taken from the ing-l3-vlan-qos region or a SPAN region, as those use slices of size 256.
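For instance, a hypothetical carve for that smaller case (not the one performed below) could move a single 256-entry slice like this:

```
switch-9k-1(config)# hardware access-list tcam region ing-l3-vlan-qos 256
switch-9k-1(config)# hardware access-list tcam region ing-l2-qos 768
```

The same save-and-reload warning would apply.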
Back to business – we need to increase the ing-l2-qos region to support at least 1029 TCAM entries, and we’ll take it from the ing-racl region:
```
switch-9k-1(config)# hardware access-list tcam region ing-racl 512
Warning: Please save config and reload the system for the configuration to take effect
switch-9k-1(config)# hardware access-list tcam region ing-l2-qos 1536
Warning: Please save config and reload the system for the configuration to take effect
switch-9k-1(config)# exit
```
Save the config and reload; on NX-OS, that sequence looks like this:
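```
switch-9k-1# copy running-config startup-config
switch-9k-1# reload
```

Once the switch is back up, we should see the updated allocations: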
```
switch-9k-1# show hardware access-list tcam region
                                NAT ACL[nat] size =    0
                     Ingress PACL [ing-ifacl] size =    0
                                 VACL [vacl] size =    0
                      Ingress RACL [ing-racl] size =  512
                    Ingress RBACL [ing-rbacl] size =    0
                 Ingress L2 QOS [ing-l2-qos] size = 1536
```
The policy map can now be applied to the interface, and we can see the utilization:
```
switch-9k-1# show hardware access-list resource utilization | b 0x1
...
-----------------------------------------------
                      Used    Free    Percent
                                      Utilization
------------------------------------------------
Ingress L2 QOS        1030    506     67.06
Ingress L2 QOS IPv4    511            33.27
Ingress L2 QOS IPv6    518            33.72
Ingress L2 QOS MAC       1             0.07
```
Great – there are still 506 entries available, so we have room to update the policy as business requirements change. Now I’ll continue configuring the switch by applying this service policy to another vPC connected to an untrusted host, and…
```
2017 Feb 13 19:08:03 switch-9k-1 %$ VDC-1 %$ %ACLQOS-SLOT1-2-ACLQOS_OOTR: Tcam resource exhausted: Ingress L2 QOS [ing-l2-qos]
```
Why? I’m just applying the same policy! What happened to the TCAM? Statistics. Each instantiation of the policy map will consume the full number of TCAM entries in order to maintain statistics for each entry on each port. Per Cisco, “By default QoS policy applied on multiple interfaces does not share the label since statistics are enabled by default. In order to share the label for the same QoS policy applied on multiple interfaces, you have to configure the QoS policy with the no-stats option…”
Translation? We don’t have a ton of TCAM to play with, and a policy with a large number of entries can’t be instantiated on more than one port without exhausting the region. Just append no-stats when applying the service policy to an interface, and this issue is avoided:
```
interface Port-Channel1101
  service-policy type qos input IN-UNTRUSTED no-stats
```
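The trade-off follows directly from the Cisco quote above: since per-entry statistics are what force each interface to consume its own label, no-stats means those QoS statistics are no longer maintained for this policy.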
I hope this series has provided some real-world examples of TCAM resource utilization and tools to monitor and manage this limited resource on the 9k. As a recap:
- Remember that TCAM operates on masks. If multiple port matches are required, check whether they fall within a contiguous mask and use a range to reduce the number of TCAM entries (see the sketch after this list). The same is true for source / destination IP addresses.
- Calculate your approximate (or exact) TCAM utilization and carve enough space in the appropriate region to avoid TCAM exhaustion errors.
- The newer Nexus 9k switches (e.g. the 93180YC-EX) have different TCAM regions than those documented for first-generation 9k devices. For example, there is no dedicated FEX L2 QOS region; this is now part of the ing-l2-qos region (assuming you tie the policy map directly to a FEX interface).
- The above statement may not be true if you are running the Enterprise Services / L3 feature license. I only have access to switches licensed for L2 operation!
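To make the mask point above concrete, here is a hypothetical sketch (the ACL names and ports are made up, and this assumes range expansion behaves as described in the first bullet). Eight discrete port matches consume eight TCAM entries, while the same eight ports sit under one contiguous mask (16384 with mask 0xFFF8) and can be matched by a single range entry:

```
! Hypothetical: eight discrete matches, eight TCAM entries
ip access-list ACL-RTP-EXPANDED
  permit udp any any eq 16384
  permit udp any any eq 16385
  permit udp any any eq 16386
  permit udp any any eq 16387
  permit udp any any eq 16388
  permit udp any any eq 16389
  permit udp any any eq 16390
  permit udp any any eq 16391

! Hypothetical: same ports, one mask-aligned range, one TCAM entry
ip access-list ACL-RTP-RANGE
  permit udp any any range 16384 16391
```

Note that a range that doesn’t align to a mask boundary (say, 16380 to 16387) would be expanded into multiple value/mask pairs, so alignment matters.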
In the next series of posts, I’ll expand on Carole Warner Reece’s excellent “QoS on the Nexus 5000/2000” series and note the differences in QoS configuration between the Nexus 5k and the Nexus 9k.