Using VRRP protocol on Mikrotik routers

Virtual Router Redundancy Protocol (VRRP) is a computer networking protocol that builds one virtual router from two or more physical routers on the same network. This mechanism protects us from the failure of a single device, an Ethernet port on the device, a network cable or a network switch. The protocol is described in RFC 3768 (https://tools.ietf.org/rfc/rfc3768.txt).

VRRP creates the virtual router as an additional virtual (logical) interface on top of a single physical port on the router. One VRRP interface can be bound to either the LAN or the WAN port, not to both at once. We can, however, create more than one virtual router inside the same physical device, which means that we can create separate VRRP interfaces over the LAN and WAN ports.
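A minimal sketch of two VRRP interfaces on one router, one over the LAN port and one over the WAN port. The interface names vrrp-lan and vrrp-wan and the VRID values are assumptions made only for illustration; ether1 and ether5 are the LAN and WAN ports used later in this article.

```
# Hypothetical names and VRIDs, shown only to illustrate two virtual routers on one device
/interface vrrp add interface=ether1 name=vrrp-lan vrid=1
/interface vrrp add interface=ether5 name=vrrp-wan vrid=2
```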

VRRP does not transfer or synchronize the device configuration between two Mikrotik devices. That must be done either manually or with scripts.

In most cases we will use two devices. Both routers must be configured correctly and must work properly as stand-alone routers before we proceed with the VRRP configuration.

What all this looks like

We will demonstrate a situation where the ISP provides us with just one public IP address. Therefore, we need to connect the devices as follows:

[Figure: VRRP connection diagram]

Settings in our example

In this example we will use IP addresses in the range 192.168.88.xxx/24 on the LAN side. The WAN side will use addresses from the 198.51.100.24/30 network. We will leave both routers using a DHCP client on the ether1 port. Port ether1 is the LAN port and port ether5 is the WAN port. For this demonstration I used two Routerboard RB150 units with RouterOS 5.26.

As this is a proof of concept (POC), we will use an admin account without a password for the FTP service on both routers. The FTP service is also enabled on both routers without any restrictions. In real life, we should never expose an FTP service like this or use an admin account without a password.

Protocol protection

We didn’t use protocol protection in this demonstration. However, when we configure real routers, we should use it. VRRP offers the following protection options:

  • none
  • simple
  • ah

With the option none we are not using any protection. This is useful only for demonstrations or testing.

The option simple expects a password, which must be set up on all routers. With this option, the password is sent as clear text over the network. Even so, it still protects against accidental misconfiguration, such as an unrelated router joining the virtual router by mistake.

The option ah uses the IP Authentication Header to authenticate VRRP packets, so the password is not sent over the network as clear text. Note that this authenticates the packets rather than encrypting them. It gives us a higher level of security, as both the configuration and the password are protected at the same time. We should use this option whenever we have little control over the environment or we need a higher level of security inside the network.
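As a rough sketch, protection could be switched on through the authentication and password properties of the VRRP interface. The interface name vrrp1 matches the example later in this article; the password value is only a placeholder, and the same settings would have to be applied on every router in the group.

```
# Assumption: the VRRP interface is named vrrp1, as in the example later in this article.
# "ExamplePassw0rd" is a placeholder; choose your own value.
/interface vrrp set [find name=vrrp1] authentication=ah password=ExamplePassw0rd
```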

 

Protocol roles

In a VRRP configuration, one router is the Master and all others are Slaves (VRRP itself calls them Backup routers). If there is more than one Slave router, each must have a priority that determines the order in which they take over the Master role. The takeover process is automated.
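For illustration, a group of three routers could be ordered as follows. The priority values 150 and 100 for the second and third router are assumptions chosen only to show the ordering; among the routers that are still running, the one with the highest priority becomes the Master. Each command is meant to be run on a different router.

```
# Router 1 (preferred Master)
/interface vrrp add interface=ether1 name=vrrp1 priority=254
# Router 2 (first to take over); 150 is an illustrative value
/interface vrrp add interface=ether1 name=vrrp1 priority=150
# Router 3 (takes over only if routers 1 and 2 are both down)
/interface vrrp add interface=ether1 name=vrrp1 priority=100
```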

We have the option to always keep one router as the Master and the second as the Slave. The Master router is used for as long as it works; the Slave is used only while the Master is down. The second mode of operation is to leave the Master role on the router that currently holds it. That means that if the first router goes down, the second becomes the Master and will not hand the role back, even when the first router becomes operational again.

For this demonstration we will use just the basic settings, without additional security and with automatic return of the Master role to the first router. We can turn this option off if we don’t want the roles to switch between the routers, for example in the middle of a working day or while we’re using VPN tunnels.

We control this behavior through the option preemption-mode. Automatic takeover of the Master role by a higher-priority router is disabled when we set its value to no.
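A minimal sketch of turning the automatic return off, assuming the VRRP interface already exists under the name vrrp1 (the name used later in this article). The setting would normally be applied on every router in the group.

```
# Keep the Master role on whichever router currently holds it
/interface vrrp set [find name=vrrp1] preemption-mode=no
```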

We’ll use the command line interface to avoid mistakes in the configuration. All these steps can also be performed through the Webfig or Winbox interfaces.

Settings for router #1

We should create a new virtual interface of type VRRP over the LAN port and assign the highest priority, 254, to it. The default value is 100.

/interface vrrp add interface=ether1 name=vrrp1 priority=254

The new interface appears in the system. Now we should assign an IP address to this interface. This IP address must be from the same subnet that is used by all other computers and network devices. The address must have the network mask /32 (255.255.255.255).

/ip address add address=192.168.88.11/32 interface=vrrp1

Now we should add two scripts. These utility scripts fully automate the solution.

The first script checks the WAN link. If the WAN link fails, for example because of a problem with the port or the network cable, the router disables its LAN port. That action initiates the VRRP failover sequence: the second router can no longer find the first router on the network, concludes that it is down and takes over the role of the VRRP Master router.

I didn’t build in a check of the full link, for example for the case when the ISP side is down. In such cases a VRRP failover is pointless, as neither router can establish an Internet connection.

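Check_WAN_link script

# Take the LAN port down while the WAN port has no link, so that the
# second router stops seeing router #1 and takes over the Master role.
:local WanPortUp [/interface ethernet get [find name=ether5] running]
:local LanPortDisabled [/interface ethernet get [find name=ether1] disabled]

:if ($WanPortUp) do={
    # WAN link is up: re-enable the LAN port if it was disabled earlier
    :if ($LanPortDisabled) do={
        /interface ethernet set [find name=ether1] disabled=no
    }
} else={
    # WAN link is down: disable the LAN port to trigger the VRRP failover
    :if (!$LanPortDisabled) do={
        /interface ethernet set [find name=ether1] disabled=yes
    }
}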

The second script exports the configuration, either the whole of it or just a part, such as the firewall rules. The router exports the configuration to a file. We can then pick up that file and transfer it to the second router. When the second router takes over the Master role, it loads that configuration as its own.

Export_FW_cfg script

/ip firewall filter export file=fw_filter
/ip firewall nat export file=fw_nat
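The scheduler entries that follow refer to the scripts by name, so each script body has to be stored under /system script first. A minimal sketch for the export script, assuming it is entered from the command line as a single quoted string (the \r\n sequence separates the two commands); the same could be done by pasting the script body in Winbox or Webfig.

```
# Store the export script under the name used by the scheduler
/system script add name=Export_FW_cfg source="/ip firewall filter export file=fw_filter\r\n/ip firewall nat export file=fw_nat"
```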

Now we should add scheduled tasks for these scripts. The interval is purely arbitrary. We could export the configuration once per day. As we have this script, we can also run it manually whenever we make a significant update to the configuration.

In our example we will check the WAN link every 5 seconds and export the configuration every 5 minutes.

/system scheduler
add disabled=no interval=5s name=Check_WAN_link on-event=Check_WAN_link
add disabled=no interval=5m name=Export_FW_cfg on-event=Export_FW_cfg

Settings for router #2

The router itself must be configured so that it can act as the only router in the network. We should test this before we continue with further setup. If router #2 can’t handle the network alone, it can’t handle a VRRP failover either.

We should add a VRRP interface in the same way as on the first router. The only difference is that we leave the priority at 100.

On the secondary router we should also provide two scripts for the events On-Master and On-Backup. These scripts correspond to the events when the Master role is taken and released.

/interface vrrp add arp=enabled interface=ether1 name=vrrp1 on-backup="" on-master=""

The IP address of the VRRP interface must be the same as on router #1.

/ip address add address=192.168.88.11/32 disabled=no interface=vrrp1

In our scenario, we have a single public IP address on the WAN port (ether5 in this example). That IP address can’t be active on both routers at the same time. Therefore, only the active router will use it.

Furthermore, our primary router has some firewall rules. Those rules must be applied in the same way on both devices. We should avoid manual configuration and potential mismatches between the routers. Therefore, we import the firewall settings from the files that were transferred earlier. We will see the transfer script later.

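On-Master script:

# Enable the WAN port, then replace the local firewall rules
# with the rules from the files fetched from router #1
/interface ethernet set [find name=ether5] disabled=no
/ip firewall filter remove [find]
/ip firewall nat remove [find]
/import file=fw_filter.rsc
/import file=fw_nat.rsc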

When we need to release the Master role and hand it back to the first router, we should clean up some settings. We will delete all firewall rules and disable the WAN port.

On-Backup script:

# Disable the WAN port and remove the firewall rules loaded by the On-Master script
/interface ethernet set [find name=ether5] disabled=yes
/ip firewall filter remove [find]
/ip firewall nat remove [find]
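One way to attach the two scripts above to the VRRP interface is sketched here. It assumes they were stored under /system script with the hypothetical names On_Master and On_Backup; the on-master and on-backup properties then simply run the stored scripts.

```
# Assumption: the scripts above were saved as On_Master and On_Backup
/interface vrrp set [find name=vrrp1] on-master="/system script run On_Master" on-backup="/system script run On_Backup"
```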

As mentioned, we need an automated script to transfer the configuration files from router #1 to router #2. In our example these are just the firewall configuration files, but we can transfer any other configuration files too. We’ll use the /tool fetch command, with a separate command for every file we need to transfer.

In our example we are using the admin account without a password. We should never leave the admin account without a password.

DL_cfg script:

# Fetch the exported firewall files over FTP (admin with an empty password, POC only)
/tool fetch mode=ftp address=192.168.88.252 user=admin password="" src-path=fw_filter.rsc dst-path=fw_filter.rsc
/tool fetch mode=ftp address=192.168.88.252 user=admin password="" src-path=fw_nat.rsc dst-path=fw_nat.rsc

Now we should add a scheduled task for this script. The execution time and recurrence period are arbitrary. We can execute it once per day or every hour, depending on how many changes we make. We always have the option to run the script manually and fetch the latest versions of the files immediately.

In our demonstration we will set the interval to 5 minutes.

/system scheduler
add disabled=no interval=5m name=DL_cfg on-event=DL_cfg
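To fetch the files immediately instead of waiting for the next scheduled run, the script can be started by hand. This sketch assumes the script was stored under /system script with the name DL_cfg, the name the scheduler entry refers to.

```
# Run the transfer script once, right now (on router #2)
/system script run DL_cfg
```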

Assigning the new gateway IP address to network devices

We assume that there is a DHCP server in the network. We should edit its configuration and change the IP address of the default gateway for the network to the new IP address of the virtual router.
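If the DHCP server happens to run on a Mikrotik device, the change could look like the sketch below. This is an assumption; on any other DHCP server the same change is made in its own configuration. The network 192.168.88.0/24 and the gateway 192.168.88.11 are the values used in this example. Clients pick up the new gateway when they renew their lease.

```
# Assumption: a Mikrotik DHCP server that serves the 192.168.88.0/24 network
/ip dhcp-server network set [find address="192.168.88.0/24"] gateway=192.168.88.11
```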

 

Testing

We will perform the following tests to check that VRRP works correctly. Every step must succeed before we continue.

The first test is normal operation. During this test router 1 is working and router 2 is in stand-by mode:

  • VRRP interface on router 1 is in status RM (Running Master)
  • VRRP interface on router 2 is in status B (Backup)
  • The scheduled task must execute the appropriate script and create the necessary .rsc files on router 1
  • The scheduled task must execute the appropriate script and transfer the necessary .rsc files from router 1 to router 2
  • We should try to connect from a workstation to the Internet or to a server outside our network
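The checks in this list can be made from the command line, for example with the print commands sketched below. They only display the current state; which columns and flags appear may differ between RouterOS versions.

```
# On each router: show the VRRP interface and its status flags
/interface vrrp print
# On router 1 and router 2: confirm that the .rsc files exist
/file print
# Check that the scheduled tasks are present and have been running
/system scheduler print
```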

Now we can perform the LAN failure test. During this test, router 1 fails on the LAN port. Router 2 must take over the Master role and execute all necessary scripts:

  • We unplug the network cable from the LAN port on router 1
  • The VRRP interface on router 2 must discover that router 1 is unavailable and take over the Master role (its VRRP interface is in the RM status)
  • All changes and scripts must be loaded properly into the firewall (and other) configuration
  • The WAN port on router 2 must be switched to enabled and connected to the ISP
  • We again try to access some server outside our network. Access must work; we should be able to open a Web page, for example
  • We plug the cable back into the LAN port on router 1
  • We check that router 1 takes back the Master role
  • On router 2 the WAN port must be disabled and all firewall rules must be deleted

The third test covers a WAN link failure. We simulate a cable or port failure on the WAN side of the link. The second router must become the VRRP Master and execute all scripts for loading the configuration:

  • We unplug the network cable from the WAN port on router 1
  • The automated script discovers that there is a problem with the WAN port and disables the LAN port on router 1
  • Router 2 must again take over the function of the VRRP Master router
  • We plug the cable back into the WAN port on router 1
  • The script detects that the WAN port is up and running and enables the LAN port on router 1. Consequently, router 2 hands the Master role back to router 1 and performs the same tasks to disable its WAN port and delete the configuration.

What we can see during the test

We will ping our test ISP router on its “public” IP address 198.51.100.25/30. Our routers have the IP address 198.51.100.26/30 on the WAN port. We will use outgoing NAT (masquerade) on the port ether5.

At some moment, when we have a stable ping, we will unplug one of the network cables. We should test the ether1 port first, then ether5. Either disconnection will cause a VRRP failover. Keep in mind that our script checks the WAN port every 5 seconds.

Here is the result when we disconnect the LAN port:

C:\>ping 198.51.100.25 -t

Pinging 198.51.100.25 with 32 bytes of data:
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Request timed out.
Request timed out.
Reply from 198.51.100.25: bytes=32 time=5ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63

As we can see, we lost just two ping replies, and after that the router continues to work. From the user’s perspective, there was practically no problem on the link.

Now we will repeat the test with the port ether5 disconnected:

C:\>ping 198.51.100.25 -t

Pinging 198.51.100.25 with 32 bytes of data:
Reply from 198.51.100.25: bytes=32 time=2ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=11ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63
Reply from 198.51.100.25: bytes=32 time=1ms TTL=63

Switching between routers is almost invisible. We had one slower ping response and that’s all.

2 thoughts on “Using VRRP protocol on Mikrotik routers”

  1. I can’t believe this, because the WAN ports of your two routers have the same IP address, 198.51.100.26/30. This can’t work: one switch and two devices with the same IP. How did you get this working?

    • Hi,

      In essence, this is a cluster technology.

      Like all other clusters, we’re using two or more devices here to build one virtual super device. That’s the key.

      Now, every Mikrotik is a separate device. Additionally, every network device must have a unique MAC and IP address. We will configure them as completely independent devices.

      The trick is in the part where we configure the virtual VRRP adapter. This adapter is software based and linked to a specific physical port (e.g. ether4). Every device must have it defined. Moreover, all adapters will have the same IP address. However, the control software will enable only one adapter at a time.

      This is the reason why we’re using the Master role.

      In case the Master device fails, the first Slave will take over its role. Using the power of scripts, we can maintain the same configuration across all devices, and the clients will not notice this switch.
