
I'm trying to extend a Layer 2 network (VLANs) over a Layer 3 network using VXLAN tunnels. I set up a lab with two VMs where:

  • I created a VXLAN tunnel between the two main interfaces of the VMs
  • I created two VLAN sub-interfaces under the second interface of each machine
  • I linked each VLAN sub-interface with the VXLAN interface in a separate bridge on each machine
  • I assigned an IP to every bridge (192.168.100.1/24, 192.168.100.2/24 and 192.168.100.3/24, 192.168.100.4/24). Now when I try to ping from one bridge to another (with the same VLAN tag), it doesn't work:
[root@Asguard ~]# ping 192.168.100.1 -I 192.168.100.3
PING 192.168.100.1 (192.168.100.1) from 192.168.100.3 : 56(84) bytes of data.
From 192.168.100.3 icmp_seq=10 Destination Host Unreachable
ping: sendmsg: No route to host
From 192.168.100.3 icmp_seq=11 Destination Host Unreachable
From 192.168.100.3 icmp_seq=12 Destination Host Unreachable
From 192.168.100.3 icmp_seq=14 Destination Host Unreachable
From 192.168.100.3 icmp_seq=15 Destination Host Unreachable
From 192.168.100.3 icmp_seq=16 Destination Host Unreachable
From 192.168.100.3 icmp_seq=17 Destination Host Unreachable

This is the script I run on each VM.

VM1:

#!/bin/bash

# Bridge and interface setup
ip link add br10 type bridge
ip link add br20 type bridge

ip link set br10 up
ip link set br20 up

# VLAN 10 on bridge br10
ip link add link enp0s9 name enp0s9.10 type vlan id 10
ip link set enp0s9.10 master br10
ip link set enp0s9.10 up

# VLAN 20 on bridge br20
ip link add link enp0s9 name enp0s9.20 type vlan id 20
ip link set enp0s9.20 master br20
ip link set enp0s9.20 up

# VXLAN on both bridges
ip link set vxlan1000 master br10
ip link set vxlan1000 up

#ip link add vxlan1000_2 type vxlan id 1000 dev enp0s3 remote 10.1.25.235 dstport 4789
ip link set vxlan1000 master br20
ip link set vxlan1000 up

ip addr add 192.168.100.1/24 dev br10
ip addr add 192.168.100.2/24 dev br20

VM2:

#!/bin/bash

# Bridge and interface setup
ip link add br11 type bridge
ip link add br22 type bridge

ip link set br11 up
ip link set br22 up

# VLAN 10 on bridge br11
ip link add link enp0s8 name enp0s8.10 type vlan id 10
ip link set enp0s8.10 master br11
ip link set enp0s8.10 up

# VLAN 20 on bridge br22
ip link add link enp0s8 name enp0s8.20 type vlan id 20
ip link set enp0s8.20 master br22
ip link set enp0s8.20 up

# VXLAN on both bridges
ip link add vxlan1001 type vxlan id 1000 dev enp0s3 remote 10.1.25.31 dstport 4789
ip link set vxlan1001 master br11
ip link set vxlan1001 up

#ip link add vxlan1000_2 type vxlan id 1000 dev enp0s3 remote 10.1.25.235 dstport 4789
ip link set vxlan1001 master br22
ip link set vxlan1001 up

ip addr add 192.168.100.3/24 dev br11
ip addr add 192.168.100.4/24 dev br22

I want to know how Linux handles the tagging and the encapsulation, and how they work together to make the VXLAN extension work.

UPDATE (responding to the answer below)

Thank you for your answer. I apologize if my initial question didn’t fully convey my intentions, so I’ll clarify what I aimed to achieve in my initial setup:

  1. I set up two VMs to simulate Linux routers.
  2. Each VM has two physical interfaces:

The first interface on each VM (enp0s3), in a Layer 2 network, is used to simulate a public IP and serve as the overlay tunnel interface for VXLAN.

On enp0s3, I created a VTEP (VXLAN interface for encapsulation and decapsulation of traffic), assuming we’re in a Layer 3 network.

  3. The second physical interface on each VM (e.g., enp0s8 or enp0s9) serves as the router's trunk port, where I can connect to internal VLANs (or directly to a switch trunk port).
  4. I then created VLAN sub-interfaces on these interfaces (enp0s8 and enp0s9), assigning two VLANs per VM.
  5. In each VM, I enslaved the VXLAN interface and the VLAN sub-interfaces to the same bridge (let's say br0), hoping that tagged traffic would:

Pass through the bridge, keeping the tag intact.

Reach the VXLAN interface attached to the bridge, where it would be encapsulated with the tag preserved.

Traverse the tunnel to the second VTEP, which is enslaved to the bridge with the VLANs, then be decapsulated and routed to the appropriate VLAN based on its tag.
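Put as commands, the plan described above amounts to something like the following sketch (the interface names, VNI, and peer address are illustrative placeholders, not taken verbatim from my lab):

```shell
# Sketch of the original single-bridge plan (names/addresses illustrative).
# As the answer below explains, tags do not actually survive this path:
# a VLAN sub-interface strips the tag before the bridge sees the frame.

# VTEP on the "public" interface
ip link add vxlan1000 type vxlan id 1000 dev enp0s3 remote 10.1.25.31 dstport 4789

# VLAN sub-interfaces on the trunk interface
ip link add link enp0s8 name enp0s8.10 type vlan id 10
ip link add link enp0s8 name enp0s8.20 type vlan id 20

# Everything enslaved to one bridge
ip link add br0 type bridge
ip link set enp0s8.10 master br0
ip link set enp0s8.20 master br0
ip link set vxlan1000 master br0
ip link set br0 up
```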

Moving Forward with a New Setup:

I’ll try a new setup based on your recommendations:

I’ll enable VLAN filtering for VLAN traffic segregation.

From your suggestion:

> So what you should do to use a VXLAN link as the trunk instead? Obviously, on vm1, you should "un-enslave" host0 from the bridge (note that you also need to "move" the IP address assignment, I mean like 192.168.181.111/24, let it be static or DHCP, from the bridge back to host0), then create the VXLAN that connects to vm2, and enslave the VXLAN to the bridge. And on vm2, you should remove the VLAN interfaces created on host0, and after creating the VXLAN that connects to vm1, you create VLAN interfaces on it instead in similar manner.

If I understood correctly, I can create VLANs directly on the VXLAN interface itself. I’ll experiment with this approach.
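For instance, creating the VLAN sub-interfaces directly on the VTEP might look like this sketch (the VNI, device names, and addresses here are assumptions for illustration, not tested config):

```shell
# Sketch: the VXLAN device itself acts as the trunk, and VLAN
# sub-interfaces are created directly on it (names/addresses illustrative).
ip link add vxlan1000 type vxlan id 1000 dev enp0s3 remote 10.1.25.31 dstport 4789
ip link set vxlan1000 up

ip link add link vxlan1000 name vxlan1000.10 type vlan id 10
ip link add link vxlan1000 name vxlan1000.20 type vlan id 20
ip addr add 10.180.10.1/24 dev vxlan1000.10
ip addr add 10.180.20.1/24 dev vxlan1000.20
ip link set vxlan1000.10 up
ip link set vxlan1000.20 up
```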

P.S. I hope this helps clarify my goals for this lab. I’ll explore alternative methods based on your suggestions and see how they work, thanks.

GStaim

1 Answer


So I tried to speculate what you actually want, and have come up with some hints that may or may not help you.

First of all, I assume that you want to simulate a switch with / connected to two VLANs on one of the VMs, and probably with a VXLAN that is expected to serve as a trunk. The quickest / simplest way is to create a bridge and add VLAN interfaces on it:

[tom@vm1 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: host0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master bridge state UP group default qlen 1000
    link/ether 86:96:5a:d1:62:7e brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::8496:5aff:fed1:627e/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever
3: bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 86:96:5a:d1:62:7e brd ff:ff:ff:ff:ff:ff
    inet 192.168.181.111/24 scope global bridge
       valid_lft forever preferred_lft forever
    inet6 fe80::8496:5aff:fed1:627e/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever
4: bridge.10@bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 86:96:5a:d1:62:7e brd ff:ff:ff:ff:ff:ff
    inet 10.180.10.1/24 scope global bridge.10
       valid_lft forever preferred_lft forever
    inet6 fe80::8496:5aff:fed1:627e/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever
5: bridge.20@bridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 86:96:5a:d1:62:7e brd ff:ff:ff:ff:ff:ff
    inet 10.180.20.1/24 scope global bridge.20
       valid_lft forever preferred_lft forever
    inet6 fe80::8496:5aff:fed1:627e/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever
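A set of commands that could produce a layout like the one above might be (a sketch; the exact commands used are not shown in the output):

```shell
# Sketch: a bridge with VLAN interfaces on top of it; host0 is enslaved
# and acts as the (implicit) trunk port. Addresses match the output above.
ip link add bridge type bridge
ip link set host0 master bridge
ip link add link bridge name bridge.10 type vlan id 10
ip link add link bridge name bridge.20 type vlan id 20
ip addr add 192.168.181.111/24 dev bridge
ip addr add 10.180.10.1/24 dev bridge.10
ip addr add 10.180.20.1/24 dev bridge.20
ip link set host0 up
ip link set bridge up
ip link set bridge.10 up
ip link set bridge.20 up
```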

Note that unfortunately this could give many people the wrong idea. Adding VLAN interfaces on bridge slaves (such as tap interfaces or physical Ethernet interfaces that have their master set to a respective bridge), and enslaving the VLAN interfaces (instead of their "links") to the bridge, is NOT the way to create access ports on the bridge.

If you want to e.g. simulate VLAN segregation on a bridge that is connected to a bunch of VMs, like putting the VMs into different VLANs, what you need to do is enable vlan_filtering on the bridge and set the vlan-id and so on for the bridge slaves (again, such as tap interfaces) with the bridge vlan command. I am not going to dive into the details here, since it's not directly relevant to implementing the proof of concept in question.
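For completeness, the vlan_filtering approach would look roughly like this sketch (the tap names here are placeholders):

```shell
# Sketch: a VLAN-aware bridge with two access ports (tap names illustrative).
ip link set bridge type bridge vlan_filtering 1
# tap0 becomes an untagged access port in VLAN 10
bridge vlan add dev tap0 vid 10 pvid untagged
bridge vlan del dev tap0 vid 1
# tap1 becomes an untagged access port in VLAN 20
bridge vlan add dev tap1 vid 20 pvid untagged
bridge vlan del dev tap1 vid 1
```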

As you can see, an Ethernet interface, host0, is also enslaved to the bridge. Because I don't have vlan_filtering enabled here, and I haven't messed with the vlan-id configurations, by default it is effectively considered a trunk port (while the VLAN interfaces added on the bridge are considered access ports).

The interface is connected to a bridge on the VM host (well I'm really using systemd-nspawn here), with another VM connected to the same bridge:

[tom@vm2 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: host0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:57:3f:5c:e1:de brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.181.222/24 scope global host0
       valid_lft forever preferred_lft forever
    inet6 fe80::2c57:3fff:fe5c:e1de/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever
3: host0.10@host0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:57:3f:5c:e1:de brd ff:ff:ff:ff:ff:ff
    inet 10.180.10.2/24 scope global host0.10
       valid_lft forever preferred_lft forever
    inet6 fe80::2c57:3fff:fe5c:e1de/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever
4: host0.20@host0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:57:3f:5c:e1:de brd ff:ff:ff:ff:ff:ff
    inet 10.180.20.2/24 scope global host0.20
       valid_lft forever preferred_lft forever
    inet6 fe80::2c57:3fff:fe5c:e1de/64 scope link proto kernel_ll 
       valid_lft forever preferred_lft forever

As you can see, the trunk is "split" using multiple VLAN interfaces here. It might be worth mentioning that, if for example you need separate bridges / physical network segments to "join" the two VLANs respectively, instead of one "VLAN-aware" "sub-segment" (like one bridge for both VLANs), it would then make sense to enslave the VLAN interfaces to the respective bridges on this end. (NOTE: It seems that this was your original plan? Splitting trunk interfaces on both ends and attaching the VLAN interfaces to different bridges that are not VLAN-aware?)

The setup shown above is close to what you want (well, what I have guessed). When you ping 10.180.10.1 from vm2 (for the first time), you will only see the untagged ARP request (which is broadcast traffic) from vm2 entering vm1 via bridge.10, but not bridge.20, if you e.g. tcpdump. And when you ping 10.180.20.1, you'll see that untagged ARP request entering vm1 via bridge.20 only.

In other words, traffic received from a VLAN interface will be untagged, and traffic sent through one will be tagged. (That's why you only want to enslave one to a bridge if the bridge is NOT supposed to be VLAN-aware.)

Yet it is only close, not quite what you want, because there is no VXLAN!

The thing is, when your VMs are in the same L2 network segment, the VXLAN is obviously unnecessary. With that said, it doesn't mean that you cannot "consider" the L2 link (with the host0s on its ends) an L3-only one and go ahead anyway.

So what should you do to use a VXLAN link as the trunk instead? Obviously, on vm1, you should "un-enslave" host0 from the bridge (note that you also need to "move" the IP address assignment, I mean like 192.168.181.111/24, let it be static or DHCP, from the bridge back to host0), then create the VXLAN that connects to vm2, and enslave the VXLAN to the bridge. And on vm2, you should remove the VLAN interfaces created on host0, and after creating the VXLAN that connects to vm1, you create VLAN interfaces on it instead in a similar manner.
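Spelled out as commands, that change could look like the following sketch (the VNI is illustrative, and the peer addresses are assumed from the 192.168.181.0/24 link shown above):

```shell
# On vm1 (sketch): detach host0, move the IP back to it, and use a VXLAN
# as the bridge's trunk instead.
ip link set host0 nomaster
ip addr add 192.168.181.111/24 dev host0
ip link add vxlan100 type vxlan id 100 dev host0 remote 192.168.181.222 dstport 4789
ip link set vxlan100 master bridge
ip link set vxlan100 up

# On vm2 (sketch): drop the VLAN interfaces on host0 and recreate them
# on the VXLAN device instead.
ip link del host0.10
ip link del host0.20
ip link add vxlan100 type vxlan id 100 dev host0 remote 192.168.181.111 dstport 4789
ip link set vxlan100 up
ip link add link vxlan100 name vxlan100.10 type vlan id 10
ip link add link vxlan100 name vxlan100.20 type vlan id 20
ip addr add 10.180.10.2/24 dev vxlan100.10
ip addr add 10.180.20.2/24 dev vxlan100.20
ip link set vxlan100.10 up
ip link set vxlan100.20 up
```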

By un-enslaving host0 on vm1, vm1 would effectively be connected to two network segments (I'm considering everything related to the bridge as one segment here). My point is, only when the link you "run the VXLAN over" is not attached to the "VLAN/bridge segment" does the VXLAN make sense (and can it be used to "extend" the "VLAN/bridge segment"). Also, the host0 link is only responsible for transmitting the VXLAN-encapsulated traffic. Therefore, it would not make sense to have VLAN interfaces on the host0 on vm2, because it belongs to a different scope / context from the one the VLANs belong to. (NOTE: It seems that this was what you misunderstood / got wrong in your original plan?)

P.S. The bridge used on vm1 isn't strictly necessary for a proof of concept. You can even just run a VXLAN over the L2 (but considered L3-only) link that connects the two VMs, and then create VLAN interfaces on the VXLAN interfaces on both ends. But at the same time, you may use a bridge on both VMs as well. The reason I used one (on one of them) is purely to show you the proper way to quickly create / simulate access ports on a bridge, that is, you should NOT create VLAN interfaces on the trunk interface and enslave them to one single bridge that is not VLAN-aware. (But indeed you haven't been doing that anyway.)

Tom Yan