Disclaimer:
Every network is different, so one solution cannot be applied to all. Try to understand the logic and create your own solution to suit your network scenario. Don't just copy and paste.
If anybody here thinks I am an expert on this stuff: I am NOT certified in anything Mikrotik/Cisco/Linux or Windows. However, I have worked with some core networks, and I read, research and try things all the time. So I am not posting about things I am formally trained in; I mostly go with experience and what I have learned on my own. And if I don’t know something, I read and learn all about it.
So please don’t hold me or my postings to be 100 percent correct all the time. I make mistakes just like everybody else. However, I do my best, learn from my mistakes and always try to help others.
Scenario-1:
We are using a Mikrotik CCR as PPPoE NAS. It is a public IP routing setup, so each user is assigned a public IP via the PPPoE profile.
Scenario-2:
We are using a single Mikrotik CCR as PPPoE NAS. We have a local DSL service, so NAT is also done on the same router.
Problem:
When we have network outages, such as a power failure in a particular area, the LOG shows many PPPoE sessions disconnecting with ‘peer not responding‘ messages. At exactly these moments, our NAS CPU usage reaches almost 100%, and the router stops passing any kind of traffic. This can continue for a minute or so.
As shown in the image below …
If you are using Masquerade/NAT on the router, that is the problem. When using Masquerade, RouterOS has to do a full connection tracking recalculation on EACH interface connect/disconnect.
So if you have lots of PPP sessions connecting and disconnecting, connection tracking is constantly recalculated, and that is what causes the high CPU usage.
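To check whether a router matches this pattern, a few read-only commands can be run during or right after an outage. This is a minimal sketch assuming RouterOS v6 syntax; the log filter string is based on the message quoted above, and the exact wording may differ between versions.

```
# Count the currently active PPP sessions
/ppp active print count-only
# Show log entries for sessions that were dropped as not responding
/log print where message~"not responding"
# List any masquerade rules present on this router
/ip firewall nat print where action=masquerade
# Show which processes are consuming CPU right now
/tool profile
```

If the profiler shows firewall or networking near the top while sessions are flapping and masquerade rules are present, the explanation above is the likely cause.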
Solution or Possible Workarounds:
First read this
Separating NATTING from ROUTING in Mikrotik
https://aacable.wordpress.com/2018/03/27/separating-natting-from-routing-in-mikrotik/
- If you have private IP users with NAT, stop using Masquerade on the same router that has a lot of dynamic interfaces. Just DO NOT use NAT on any router that has a high number of connecting/disconnecting interfaces. Place an additional router connected to your PPPoE NAS, and route the NAT traffic there.
Example: Add another router and perform all NAT on it by sending marked traffic from the private IP series to that NAT router. Set up routing between the PPPoE NAS and the NAT router.
- If all of your clients are on public IPs, you can simply turn off connection tracking completely. This is the simplest approach. But beware that turning off CT will also disable all NAT and every connection-based feature, such as connection marking.
Note: You can exempt your specific public pool from connection tracking as well.
- Any device that is a CORE device or gateway on your network should be assigned one job only. Try not to mix multiple functions in one device. This will save you troubleshooting headaches later.
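The NAT separation described above can be sketched roughly as follows. This is only an illustration in RouterOS v6 syntax, not a drop-in config: the private range 10.0.0.0/8, the link addresses 192.168.255.1 (NAS) and 192.168.255.2 (NAT router), the interface name ether1-wan and the public address 203.0.113.10 are all assumptions made for the example.

```
# --- On the PPPoE NAS ---
# List of private (non-public) user addresses; the range is an assumption
/ip firewall address-list
add address=10.0.0.0/8 list=private_pool
# Give traffic from private users its own routing mark
/ip firewall mangle
add chain=prerouting src-address-list=private_pool action=mark-routing new-routing-mark=to_nat passthrough=no
# Send the marked traffic to the NAT router
/ip route
add dst-address=0.0.0.0/0 gateway=192.168.255.2 routing-mark=to_nat

# --- On the NAT router ---
# Return route to the private users behind the NAS
/ip route
add dst-address=10.0.0.0/8 gateway=192.168.255.1
# NAT on a static WAN interface, so PPPoE session flaps do not touch this rule
/ip firewall nat
add chain=srcnat src-address=10.0.0.0/8 out-interface=ether1-wan action=src-nat to-addresses=203.0.113.10
```

Public IP users keep following the main routing table on the NAS, so only the private range takes the detour through the NAT router.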
Please read this …
Features affected by connection tracking
- NAT
- firewall:
  - connection-bytes
  - connection-mark
  - connection-type
  - connection-state
  - connection-limit
  - connection-rate
  - layer7-protocol
  - p2p
  - new-connection-mark
  - tarpit
- p2p matching in simple queues
So if you turn OFF connection tracking, the above features will stop working.
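Before turning connection tracking off, it may help to review what on the router currently depends on it. A minimal read-only sketch, assuming RouterOS v6 menu paths:

```
# Current connection tracking mode (auto / yes / no)
/ip firewall connection tracking print
# Number of connections being tracked right now
/ip firewall connection print count-only
# Review rules for the matchers and actions listed above
/ip firewall nat print
/ip firewall mangle print
/ip firewall filter print
```

Any rule that uses one of the listed features needs to be moved to another router or rewritten before tracking is disabled.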
– Code Snippet:
A working example of excluding your public pool from connection tracking:
- First make sure Connection Tracking is set to AUTO
/ip firewall connection tracking set enabled=auto
- Then make an address list containing your users' IP pool, so that we can use this list as an object in multiple rules later.
/ip firewall address-list
add address=1.1.1.0/24 list=public_pool
#add address=2.1.1.0/24 list=public_pool
- Now create rules in the RAW table to turn off connection tracking for our public IP users
/ip firewall raw
add action=notrack chain=prerouting src-address-list=public_pool
add action=notrack chain=prerouting dst-address-list=public_pool
That’s it!
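To confirm that the notrack rules are actually matching traffic, the counters can be checked afterwards. A small sketch, again assuming RouterOS v6:

```
# Packet and byte counters per RAW rule; they should increase for the notrack rules
/ip firewall raw print stats
# The tracked connection count should drop once the public pool is exempted
/ip firewall connection print count-only
```

One thing to watch, offered as a hedged note: if existing filter rules accept only established and related connections, untracked packets may be dropped by a later rule. Recent v6 default configurations accept `connection-state=established,related,untracked` for this reason, so a similar accept rule may be needed here.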
Some Tips for General Router Management
- Turn off all non-essential services that are not actually being used or needed. Services place an additional CPU load on any system. For example, you can move the DHCP role to Cisco switches for better response; using the switches for inter-VLAN routing is also highly recommended. If your ROS is acting as a DNS server as well, move the DNS role to a dedicated DNS server such as BIND. This will free up some resources on the core system.
- Use 10-gig network cards instead of 1-gig, or 1-gig network cards instead of 100-meg
- Disable STP if it is not needed. This is a highly debatable point, I know 🙂
- Use dynamic queues; they are spread across multiple cores
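A few of these tips can be sketched as RouterOS v6 commands. The choice of services to disable, the profile name pppoe-users and the 10M/10M rate are assumptions for illustration only; adjust them to what your network actually uses.

```
# List services, then disable the ones that are not used
# (keep at least one management service such as winbox or ssh enabled)
/ip service print
/ip service disable telnet,ftp,www,api,api-ssl
# Stop answering client DNS queries once a dedicated DNS server is in place
/ip dns set allow-remote-requests=no
# Disable the bandwidth test server if it is not needed
/tool bandwidth-server set enabled=no
# A rate limit on the PPP profile creates a dynamic queue per session
/ppp profile set [find name="pppoe-users"] rate-limit=10M/10M
# Check how the load is spread across cores
/system resource cpu print
```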
Regards, Syed Jahanzaib ~
AOA .
Brother, when I create rules in raw with notrack, the service shuts down. Any solution?
Comment by udasschand — February 20, 2018 @ 4:44 AM
Dear Jahanzaib bhai…
I prefer to use the ROS Bugfix version, because I have seen this type of issue with the Current ROS version…
Comment by kashifzai86 — February 20, 2018 @ 10:04 AM
use version – 6.36.2
Comment by KAMAL SK — April 16, 2018 @ 7:27 PM
I wouldn’t recommend going backward. Also, the router will not downgrade itself below its factory-shipped version.
What will you do if your router ships with 6.4x as the factory version? Forcing a downgrade via other methods is not recommended.
Comment by Syed Jahanzaib / Pinochio~:) — April 25, 2018 @ 8:11 AM
Dear Jahanzaib,
We are facing a problem on a MikroTik CCR1036-12G-4S where CPU utilization stays at 100%. We have 68 simple queues and have configured link bonding to increase throughput. But whenever traffic reaches 1.5 Gbps, CPU utilization hits 100%. I am really surprised by this, because we only have 1.5 Gbps of traffic now and it will grow to 2.5 Gbps soon. Please suggest whether this is caused by link bonding or by the high number of queues. According to the datasheet, the CCR1036-12G-4S has high throughput.
Regards,
Comment by xpertarm — April 7, 2019 @ 5:40 PM
There are several points you need to look into:
1) If you are using NAT, move the NAT to another router and disable connection tracking on this one. Problem solved.
2) Otherwise, try acquiring a CCR1072 (if budget allows) or an x86 box with a 10G card and get rid of bonding; that will solve the issue as well. CPU speed doesn't matter much; it is the cache and the newer-generation CPUs that matter.
3) Otherwise, check your configuration; there must be something incorrectly configured.
In short, 10G is the better choice.
Comment by Syed Jahanzaib / Pinochio~:) — April 8, 2019 @ 9:41 AM
Thanks a lot for your suggestion. We are not doing NAT, and we will disable connection tracking. We have only configured queues and link bonding. Bonding is configured on 6 interfaces.
Comment by xpertarm — April 8, 2019 @ 1:32 PM
Why is link bonding needed? I have 6 PTCL GPON links of 250 Mbps each. Did I use bonding for these? I need the reason for using bonding. I know this is not an answer to your query.
Comment by kashifzai86 — April 9, 2019 @ 12:26 PM
What if you only have 1G per port and still need to pass more than 1G of traffic? If 10G is not available, then ultimately you have to take the bonding route to get more bandwidth, although I have seen Mikrotik have some CPU issues with bonding.
Comment by Syed Jahanzaib / Pinochio~:) — April 10, 2019 @ 9:47 AM
Mr. Zaib
I need to ask which CCR I should configure the PPPoE WAN links on: the main CCR or the NAT CCR? I have 1 fiber media link and 4 PPPoE GPON links.
Comment by kashifzai86 — April 10, 2019 @ 9:26 AM
You should configure PPPoE on one CCR (preferably the one with higher specs) and NAT on the other CCR.
Comment by Syed Jahanzaib / Pinochio~:) — April 10, 2019 @ 9:45 AM
Jhanzaib bhai
I have both with the same specs (i.e. CCR1036)… What I understand is that the 4 WAN links will be configured with Per Connection Classifier (both addresses, 0 to 3) on whichever CCR has connection tracking ON, because if I turn connection tracking off, PCC load balancing will not work…
In that case, will the 4 PPPoE WAN links also be masqueraded on that same router? Then the issue stays right where it was…
Is that how it will be?
Comment by kashifzai86 — April 10, 2019 @ 9:59 AM
– You should turn on connection tracking on the NAT router where load balancing is configured.
– On the main CCR where the PPPoE users are connected, you can turn OFF connection tracking. The CPU spike occurs when dynamic interfaces connect/disconnect while CT is enabled. This will not happen on the PPPoE server, as CT will be disabled there.
Comment by Syed Jahanzaib / Pinochio~:) — April 12, 2019 @ 10:08 AM
Zaib Asalam Aliqum !!!
Since connection-state stops working once connection tracking is turned off, should I remove connection-state from the accept rules in “Input” and “Forward”?
Will it work then?
Kashif Khan
Comment by kashif khan — February 17, 2020 @ 8:21 PM