Routing Protocols over VPLS
A customer of mine was reading an interesting article at NetworkWorld that focuses on VPLS and running a Layer-3 routing protocol over this new WAN access technology. If you are not familiar with VPLS or how it functions, for the sake of this discussion let’s just say that it provides Layer-2 connectivity across a WAN, so that sites can peer at Layer 2 just as they would on an Ethernet-based LAN. For a more techie explanation, please visit the VPLS page at Wikipedia. Anyway, my customer forwarded me the link to the article with this statement attached:
“The attached URL gives an example of what the perception is with running a routing protocol over a VPLS network. Essentially, I want to create a Best Practices / Guidelines document for deploying OSPF/EIGRP over VPLS. Because VPLS has a multipoint behavior, I want to look at both scalability and best practices in deployment.”
Well, this got me thinking. I’ve configured VPLS on the 7600s quite a few times before; however, in all those instances I focused on the Service Provider side of the configuration, with complete disregard for the implications for the customer’s design, configuration and scalability. So I figured I would send a couple of emails to a few internal mailing lists to see if we (Cisco) had already created such a document that I could provide to them, so that they wouldn’t have to re-invent the wheel if we’ve already started to produce tires. Needless to say, my email drew quite a few responses: everyone from Distinguished SEs, TMEs, SEs and SMEs to CSEs responded.
Based on that feedback, I have come to the following conclusion. In essence, the VPLS WAN looks very much like an Ethernet LAN from the customer perspective, so much so that when running a Layer-3 routing protocol over the VPLS network, almost all of the rules associated with running that IGP in a LAN environment remain the same in the VPLS WAN environment. So, let’s assume the best case, which means the VPLS network is stable 100% of the time. Then it looks like one big LAN, and how many OSPF or EIGRP neighbors would we recommend on a single LAN? 50 maybe?
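To put rough numbers on the "one big LAN" question, here is a minimal Python sketch of how adjacency counts grow with the number of routers on a single broadcast segment. It assumes every CE router sits on the same VPLS segment, that EIGRP peers with every other router on it, and that OSPF treats it as a broadcast network where routers become fully adjacent only with the DR and BDR. The function names are mine, purely for illustration.

```python
def eigrp_adjacencies(n: int) -> int:
    """Full mesh: every router on the segment peers with every other router."""
    return n * (n - 1) // 2


def ospf_full_adjacencies(n: int) -> int:
    """OSPF broadcast segment: each DROther goes Full with the DR and BDR only,
    plus the single DR-to-BDR adjacency, which works out to 2n - 3 for n >= 2."""
    if n < 2:
        return 0
    return 2 * n - 3


def neighbors_per_router(n: int) -> int:
    """Hellos are still exchanged with everyone, so each router sees n - 1 neighbors."""
    return max(n - 1, 0)


if __name__ == "__main__":
    print(f"{'routers':>8} {'per-router':>11} {'EIGRP adj':>10} {'OSPF full adj':>14}")
    for n in (5, 10, 50, 100):
        print(f"{n:>8} {neighbors_per_router(n):>11} "
              f"{eigrp_adjacencies(n):>10} {ospf_full_adjacencies(n):>14}")
```

At 50 routers that is 1,225 EIGRP adjacencies across the segment versus 97 full OSPF adjacencies, although in both cases every router still tracks 49 neighbors. The arithmetic says nothing about where the practical limit is; it only shows why the neighbor count deserves attention.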
Then let’s assume the worst case (this is a WAN after all): you have some instability at an OSPF DR site and you get some flip-flopping of the DR and BDR. Or what if the VPLS domain is partitioned and some sites can’t reach other sites, or you have a bidirectional connectivity issue on a PW (pseudowire), etc.? A lot can go wrong in the WAN. How would you troubleshoot a weird IGP issue? Much like a bad port on a switch, it’s hard!
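One way to reason about these failure modes is to write down which site hears hellos from which other site and look for asymmetry. The Python sketch below is only an illustration under my own assumptions: the `hears` set is hypothetical data you would have to collect yourself (from each router's neighbor table, for example), and the site names are made up. It flags one-way links and groups sites into islands that still have two-way reachability.

```python
from collections import defaultdict


def one_way_links(hears):
    """hears holds (listener, speaker) pairs: listener receives hellos from speaker.
    A pair with no reverse entry points at a one-way forwarding problem."""
    return sorted({tuple(sorted(p)) for p in hears if (p[1], p[0]) not in hears})


def islands(sites, hears):
    """Group sites into sets that can still reach each other over two-way links."""
    adj = defaultdict(set)
    for a, b in hears:
        if (b, a) in hears:          # count only links that work in both directions
            adj[a].add(b)
            adj[b].add(a)
    seen, groups = set(), []
    for site in sorted(sites):
        if site in seen:
            continue
        stack, group = [site], set()
        while stack:                 # depth-first walk of one island
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(adj[current] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Hypothetical observations: A and B see each other, C and D see each other,
    # and A hears C but C never hears A.
    sites = {"A", "B", "C", "D"}
    hears = {("A", "B"), ("B", "A"), ("C", "D"), ("D", "C"), ("A", "C")}
    print("one-way links:", one_way_links(hears))
    print("islands:", islands(sites, hears))
```

With the sample data this reports a one-way link between A and C and two islands, A/B and C/D. The hard part in real life is collecting that picture from every site in the first place, which is exactly why this kind of problem is painful to chase in a WAN.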
So my take, backed by the feedback I received, is to go by the KISS principle (Keep It Simple, Stupid): VPLS is for small domains (under 10 sites or so) where maybe there is some network design advantage to having an L2 network (between data centers, for example), but for anything more than that, go with L3VPNs. And BTW, I’ve heard of end customers buying SP VPLS and then quickly realizing it’s not as easy as it sounds, and to top it off, a loop in the WAN will melt your network (remember, it’s just like a LAN)…

Comment by neteng on 19 February 2008:
Great article Joe! Things seem to get interesting when you ‘emulate’ certain technologies over other technologies. The behavior is often unpredictable as the scale increases. This piece really hammers that point home.
Phillip