Syed Jahanzaib Personal Blog to Share Knowledge !

February 19, 2018

High CPU load when PPPoE sessions disconnects in Mikrotik

Filed under: Uncategorized — Syed Jahanzaib / Pinochio~:) @ 4:46 PM

stress


Disclaimer:

Every Network is different , so one solution cannot be applied to all. Therefore try to understand logic & create your own solution as per your network scenario. Just dont follow copy paste.

If anybody here thinks I am an expert on this stuff, I am NOT certified in anything Mikrotik/Cisco/Linux or Windows. However I have worked with some core networks and I read & research & try stuff all of the time. So I am not speaking/posting about stuff I am formerly trained in, I pretty much go with experience and what I have learned on my own. And , If I don’t know something then I read & learn all about it.

So , please don’t hold me/my-postings to be always 100 percent correct. I make mistakes just like everybody else. However – I do my best, learn from my mistakes and always try to help others

 


Scenario-1:

We are using Mikrotik CCR as PPPOE/NAS. We are using public ip routing setup so each user is assigned public ip via pppoe profile.

Scenario-2:

We are using single Mikrotik CCR as PPPOE/NAS. We have local dsl service therefore NATTING is also done on the same router.


Problem:

When we have network outages like light failure in any particular area , in LOG we see many PPPoE sessions disconnects with ‘peer not responding‘ messages. Exactly at this moments, our NAS CPU usage reaches to almost 100% , which results in router stops passing any kind of traffic. This can continue for a minute or so on.

As showed in the image below …

pppoe high cpu usage

If you are using Masquarade /NAT on the router, that is the problem. When using Masquarade, RouterOS has to do full connection tracking recalculation on EACH interface connect/disconnect.

So if you have lots of PPP session connecting/disconnecting, connection tracking will constantly be recalculated which can cause high CPU usage. When interfaces connect/disconnect, in combination with NAT, it gives you high CPU usage.


Solution OR Possible Workarounds :

First read this

Separating NATTING from ROUTING in Mikrotik

https://aacable.wordpress.com/2018/03/27/separating-natting-from-routing-in-mikrotik/

  • If you have private ip users with natting, Stop using Masquarade on same router that have a lot of dynamic interfaces. Just DO NOT use NAT on any router that have high number of connecting/disconnecting interfaces. Place an additional router connected with your PPPoE NAS, and route NAT there.
    Example: Add another router & perform all natting on that router by sending marked traffic from private ip series to that nat router. Setup routing between the PPPoE NAS and the NAT router.
  • IF all of your clients are on public IP , you can simply Turn Off connection tracking completely. This is the simplest approach.But beware that turning of CT will disable all NATTING / marking traffic as well.
    Note: You can exempt your specific public pool from connection tracking as well.

ct

 

  • Any device that is CORE device or Gateway on your network, It should be assigned to perform one job only. Try not to mix multiple functions in one device. This will save you from later headache of troubleshooting.

Please read this …

Features affected by connection tracking

  • NAT
  • firewall:
    • connection-bytes
    • connection-mark
    • connection-type
    • connection-state
    • connection-limit
    • connection-rate
    • layer7-protocol
    • p2p
    • new-connection-mark
    • tarpit
  • p2p matching in simple queues

So if you will turn OFF the connection tracking, above features will stop working.


– Code Snippet:

Some working example of excluding your public pool from connection tracking

  • First make sure Connection Tracking is set to AUTO
/ip firewall connection tracking set enabled=auto

 

  • Then make a address list which should have your users ip pool so that we can use this list as an Object in multiple rules later.
/ip firewall address-list
add address=1.1.1.0/24 list=public_pool
#add address=2.1.1.0/24 list=public_pool

 

  • Now create rule to turn off connection tracking from our public ip users witht the RAW table
/ip firewall raw
add action=notrack chain=prerouting src-address-list=public_pool

That’s it!



Some Tips for General Router Management

  • Turn off all non essential services that are not actually being used or needed. Services place an additional CPU load on any system. Example, you can move your DHCP role to cisco switches for better response , also for intervlan routing it is highly recommended., Also if your ROS is acting as DNS as well, then move DNS role to dedicated dns server like BIND etc. This will free up some resources from the core system
  • Use 10-gig network cards instead of 1-gig / Use 1-gig network cards instead of 100 meg
  • Disable STP if it is not needed. Now this is highly debatable part I know 🙂

Regard's
Syed Jahanzaib ~
Advertisements

FREERADIUS WITH MIKROTIK – Part #12 – Happy Hours !

Filed under: freeradius — Tags: , — Syed Jahanzaib / Pinochio~:) @ 11:40 AM

fre2

happy hours

FREERADIUS WITH MIKROTIK – Part #1 – General Tip’s Click here to read more on FR tutorials …


* SCENARIO:

We have a full working freeradius based billing system which have multiple Quota base services assigned to users. All users accounting is stored in radacct table as usual, but we have a separate column named qt_used in users table which is updates via radacct trigger when accounting updates arrives from the NAS. This column is checked by an external script every 10 minutes & if it finds qt_used value above then qt_total then it simply change user group to EXPIRED group & disconnects users from NAS by sending COA / POD.

* TASK:

Our bandwidth remains empty in night therefore we want to provide some extra benefits to quota base users by BYPASSING quota limit & counting in specific late night hours.

Example: There should be no QUOTA counting for all quota base users from 00:00 (midnight) till 8am (morning).

* Possible SOLUTION !

Like always, this may not be best optimal solution to achieve the task, but still it may work. Be sure took for other better approach rather then this sol.

Following is our USERS table

mysql> describe users;
+-------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+-------------------+-----------------------------+
| id | int(10) | NO | PRI | NULL | auto_increment |
| username | varchar(128) | NO | UNI | NULL | |
| password | varchar(32) | NO | | NULL | |
| firstname | text | NO | | NULL | |
| lastname | text | NO | | NULL | |
| email | text | NO | | NULL | |
| mobile | text | NO | | NULL | |
| cnic | text | NO | | NULL | |
| srvname | text | NO | | NULL | |
| srvid | int(3) | NO | | NULL | |
| expiration | date | YES | | NULL | |
| mac | varchar(30) | NO | | NULL | |
| bwpkg | varchar(256) | NO | | NULL | |
| is_enabled | int(1) | NO | | NULL | |
| is_days_expired | int(1) | NO | | NULL | |
| is_qt_expired | int(1) | NO | | NULL | |
| is_uptime_expired | int(1) | NO | | NULL | |
| qt_used | varchar(20) | NO | | NULL | |
| uptime_used | varchar(32) | NO | | NULL | |
| owner | text | NO | | NULL | |
| vlanid | varchar(32) | NO | | NULL | |
| createdon | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------------------+--------------+------+-----+-------------------+-----------------------------+
22 rows in set (0.00 sec)

& following is our services table

mysql> describe services;
+--------------------+--------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+-------------------+-----------------------------+
| srvid | int(10) | NO | PRI | NULL | auto_increment |
| srvname | varchar(128) | NO | MUL | NULL | |
| descr | varchar(128) | YES | | other | |
| enabled | varchar(1) | NO | | NULL | |
| expdays | varchar(3) | NO | | NULL | |
| dlimit | varchar(30) | NO | | NULL | |
| ulimit | varchar(30) | NO | | NULL | |
| qt_enabled | tinyint(1) | NO | | NULL | |
| tot_qt | int(128) | NO | | NULL | |
| free_quota_enabled | tinyint(4) | NO | | NULL | |
| free_qt_start_time | time | NO | | NULL | |
| free_qt_end_time | time | NO | | NULL | |
| ippool | varchar(30) | NO | | NULL | |
| createdon | datetime | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+--------------------+--------------+------+-----+-------------------+-----------------------------+
14 rows in set (0.00 sec)

* MYSQL TRIGGER on RADACCT table for AFTER UPDATE ACTION. [ This is our magical code 😛 ]

Ok, now we will create a TRIGGER on radacct table for AFTER UPDATE action.
[apply it in mysql shell on radius table]

BEGIN
SET @ctime = (select current_time());
SET @srvid = (select srvid from users where username =New.Username);
SET @st = (select free_qt_start_time from services where srvid=@srvid);
SET @et = (select free_qt_end_time from services where srvid=@srvid);
if (@ctime > @st)
OR (@ctime < @et)
then
UPDATE users set qt_used = qt_used + NEW.acctoutputoctets WHERE username = New.username;
# below section ELSE is not required, I just added it for ON THE FLY troubleshooting]
else
insert into log (data,msg) values ('zaib','happy_hours_time_not_matched');
END IF;
END
# Syed Jahanzaib / aacable at hotmail dot com

Current time is 11:00{am}, inspect the mysql.log & you will find following (quota will not update because time doesnt matches)

2018-02-19T06:18:26.797015Z 51 Query UPDATE radacct SET framedipaddress = '192.168.50.255', acctsessiontime = '1841', acctinputoctets = '0' > 32 | '122655', acctoutputoctets = '0' > 32 | '145069' WHERE acctsessionid = '8100000e' AND username = 'zaib' AND nasipaddress = '101.11.11.253'
2018-02-19T16:18:26.797819Z 51 Query SET @ctime = (select current_time())
2018-02-19T16:18:26.797889Z 51 Query SET @srvid = (select srvid from users where username =New.Username)
2018-02-19T16:18:26.798009Z 51 Query SET @st = (select free_qt_start_time from services where srvid=@srvid)
2018-02-19T16:18:26.798092Z 51 Query SET @et = (select free_qt_end_time from services where srvid=@srvid)
2018-02-19T16:18:26.798165Z 51 Query insert into log (data,msg) values ('zaib','happy_hours_time_not_matched)

.

& if the radacct updates came in between happy hours then we will observe following (Quota will update accordingly)

2018-02-19T06:19:31.818111Z 98 Query UPDATE radacct SET framedipaddress = '192.168.50.255', acctsessiontime = '1906', acctinputoctets = '0' << 32 | '130274', acctoutputoctets = '0' << 32 | '161901' WHERE acctsessionid = '8100000e' AND username = 'zaib' AND nasipaddress = '101.11.11.253'
2018-02-19T06:19:31.818500Z 98 Query SET @ctime = (select current_time())
2018-02-19T06:19:31.818574Z 98 Query SET @srvid = (select srvid from users where username =New.Username)
2018-02-19T06:19:31.818680Z 98 Query SET @st = (select free_qt_start_time from services where srvid=@srvid)
2018-02-19T06:19:31.818758Z 98 Query SET @et = (select free_qt_end_time from services where srvid=@srvid)
2018-02-19T06:19:31.818833Z 98 Query UPDATE users set qt_used = qt_used + NEW.acctoutputoctets WHERE username = New.username

 


Regard's
Syed Jahanzaib ~

.

%d bloggers like this: