February 9, 2026

Handling Stale PPPoE Sessions in MikroTik + FreeRADIUS

Filed under: Mikrotik Related — Tags: AAA Systems, freeradius, FreeRADIUS unlang, ISP Billing, ISP Operations, mikrotik, MikroTik PPPoE, Network Authentication, pppoe, RADIUS Accounting, Simultaneous-Use, Stale Sessions — Syed Jahanzaib / Pinochio~:) @ 4:32 PM

Handling Stale PPPoE Sessions in MikroTik + FreeRADIUS

(Exact File Locations, unlag Placement, SQL Ownership, and Cron Responsibilities)

Author: Syed Jahanzaib ~A Humble Human being! nothing else
Platform: aacable.wordpress.com
Category: Corporate Offices / DHCP-DNS Engineering
Audience: Systems Administrators, IT Support, NOC Teams, Network Architects

Disclaimer & Note on Writing Style

Every network environment is unique. A solution that works effectively in one infrastructure may require modification in another. Readers are strongly encouraged to understand the underlying concepts and adapt the guidance according to their own architecture, operational policies, and risk tolerance.

Blind copy-paste implementation without proper validation, testing, and change management is never recommended , especially in production environments. Always ensure proper backups and risk assessment before applying any configuration.

The content shared here is based on hands-on experience from real-world deployments, ISP environments, lab testing, and continuous learning. While I strive for technical accuracy, no technical implementation is entirely free from the possibility of error. Constructive discussion and alternative approaches are always welcome.

Due to professional commitments, it is not always feasible to publish highly detailed or multi-part write-ups. The technical logic and implementation details are written based on my own practical experience. AI tools such as ChatGPT are used only to refine grammar, structure, and presentation — not to generate the core technical concepts.

This blog is not intended for client acquisition or follower growth. It exists solely to share practical knowledge and real-world experience with the community.

Thank you for your understanding and continued support.

Introduction

In PPPoE deployments built on MikroTik NAS and FreeRADIUS with an SQL backend, stale accounting sessions are an unavoidable operational reality. These sessions typically arise when accounting stop packets are never received due to NAS reboots, power failures, access link disruptions, or transport issues between the NAS and the RADIUS server.

When left unverified, such stale records directly impact Simultaneous-Use enforcement, resulting in legitimate users being blocked with “user already online” conditions despite having no active session. Addressing this problem requires more than periodic record deletion; it demands a controlled and verifiable method to determine whether a session is truly active.

This article presents a production-safe approach to PPPoE session verification by leveraging accounting activity, SQL-based validation, unlang logic, and scheduled cleanup processes. The objective is to ensure accurate session state, reliable Simultaneous-Use enforcement, and clean accounting data without compromising audit integrity or service availability.

Why this happens (root cause)

MikroTik sends Accounting-Start
Every 5 minutes it sends Interim-Update
FreeRADIUS inserts a row in radacct with acctstoptime = NULL
NAS reboots / loses connectivity
Accounting-Stop is never sent
FreeRADIUS still thinks the session is active
Simultaneous-Use = 1 blocks re-login

👉 RADIUS cannot guess that the user is gone unless you tell it how.

What ISPs do in real production

There are three layers usually combined:

1️⃣ Interim-Update timeout validation (MOST IMPORTANT)

This is the primary verification mechanism.

Logic

If a session has not sent an Interim-Update for X minutes, it is considered dead, even if acctstoptime is NULL.

Why it works

A live session must send interim updates
If NAS rebooted, updates stop
This is proof, not assumption

Typical ISP rule

If last interim update > 2 × Acct-Interim-Interval → session is dead

Example:

Interim interval = 5 minutes
Timeout = 10–15 minutes

SQL condition (verification)

SELECT username, nasipaddress, acctsessionid
FROM radacct
WHERE acctstoptime IS NULL
AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 15;

If this condition is true → session is NOT alive.

Proper cleanup (controlled)

UPDATE radacct
SET acctstoptime = acctupdatetime,
    acctterminatecause = 'NAS-Reboot'
WHERE acctstoptime IS NULL
AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 15;

✅ This is production-safe
❌ This is NOT blind deletion
✔ This is based on verifiable inactivity

2️⃣ NAS reachability check (secondary verification)

Many ISPs also verify NAS status before cleanup.

Typical checks

Ping NAS IP
SNMP availability
RouterOS API availability

Logic

IF NAS unreachable
AND session has no interim updates
THEN mark sessions as dead

This avoids killing sessions when:

SQL delayed
Temporary RADIUS lag

Example logic (pseudo)

if ping NAS == fail:

cleanup stale sessions for that NAS

This is often implemented in cron scripts.

3️⃣ MikroTik-side protections (VERY IMPORTANT)

a) Proper Acct-Interim-Interval

On MikroTik:

/radius
set accounting=yes interim-update=5m

Never leave it default or disabled.

b) Send Stop on reboot (best-effort)

RouterOS already tries, but you can improve reliability:

Use stable storage
Avoid power cuts without UPS

Still, RADIUS must assume stops can be lost.

4️⃣ What ISPs DO NOT do (bad practices)

❌ Delete all acctstoptime IS NULL blindly
❌ Disable Simultaneous-Use
❌ Allow multiple sessions to avoid complaints
❌ Manual cleanup by support staff

These cause:

Abuse
Multiple logins
Incorrect billing
Legal/audit issues

5️⃣ Recommended production-grade approach (summary)

Layer	Purpose	Mandatory
Interim timeout check	Session verification	✅ YES
NAS reachability check	Extra safety	⚠ Recommended
Controlled SQL cleanup	Session release	✅ YES
Simultaneous-Use=1	Abuse prevention	✅ YES

6️⃣ Industry-standard wording (how ISPs define it)

“A session is considered active only if periodic accounting updates are received within the defined timeout window. Absence of updates implies session termination due to NAS failure or connectivity loss.”

This is exactly how large ISPs justify cleanup during audits.

Final Answer to core question!

How do ISPs verify if user is actually not live on NAS?

✅ By absence of Interim-Update packets within a defined time window, optionally combined with NAS reachability checks.

There is no other reliable method in RADIUS.

This article documents a fully production-aligned design, explicitly mapping:

Which logic runs inside FreeRADIUS
Which logic runs via SQL
Which logic runs via cron
Exact file names and locations for every decision

No unnamed logic. No invisible automation.

Environment Assumptions

FreeRADIUS 3.x
MySQL / MariaDB backend
MikroTik PPPoE NAS
Interim-Update = 5 minutes
Simultaneous-Use = 1

Session State Model (Conceptual)

State	Meaning	Where enforced
ACTIVE	Interim updates arriving	FreeRADIUS SQL
STALE	Interim missing, NAS alive	Cron
CLOSED	Verified termination	FreeRADIUS unlang / Cron

Database Layer (Schema Ownership)

Location

Database: radiusdb
Table : radacct

Schema Extension (run once)

Concept/Logic/Purpose: Extends the radacct table to explicitly track session state and last known MikroTik session timers. This enables deterministic handling of stale, resumed, and terminated sessions without relying on assumptions.

ALTER TABLE radacct
ADD COLUMN session_state ENUM('ACTIVE','STALE','CLOSED') DEFAULT 'ACTIVE',
ADD COLUMN last_acct_session_time INT DEFAULT 0,
ADD INDEX idx_state_update (session_state, acctupdatetime),
ADD INDEX idx_nas_user (nasipaddress, username);

Responsibility:
✔ Stores session truth
✔ No logic, only state

SQL Query Ownership (queries.conf)

Concept/Logic/Purpose: Centralizes all accounting-related SQL logic in a single, predictable location managed by FreeRADIUS. This separation ensures that runtime logic (unlang) and data manipulation (SQL) remain clean and auditable.

File Location

/etc/freeradius/mods-config/sql/main/mysql/queries.conf

All SQL below lives only in this file.

Accounting-Start Query

Used by: FreeRADIUS accounting {}
Concept/Logic/Purpose: Creates a new accounting record when a PPPoE session is first established.
This marks the authoritative beginning of a user session and initializes all tracking fields.

accounting_start_query = "
INSERT INTO radacct (
  acctsessionid, acctuniqueid, username, nasipaddress,
  acctstarttime, acctupdatetime, acctstoptime,
  acctsessiontime, last_acct_session_time, session_state
) VALUES (
  '%{Acct-Session-Id}', '%{Acct-Unique-Session-Id}',
  '%{User-Name}', '%{NAS-IP-Address}',
  NOW(), NOW(), NULL,
  '%{Acct-Session-Time}', '%{Acct-Session-Time}', 'ACTIVE'
)"

Interim-Update Query

Used by: FreeRADIUS accounting {}
Concept/Logic/Purpose: Periodically refreshes session counters and timestamps to confirm that the user is still actively connected. It also safely resumes sessions after temporary NAS–RADIUS connectivity loss.

accounting_update_query = "
UPDATE radacct
SET acctupdatetime = NOW(),
    acctsessiontime = '%{Acct-Session-Time}',
    last_acct_session_time = '%{Acct-Session-Time}',
    session_state = 'ACTIVE'
WHERE acctsessionid = '%{Acct-Session-Id}'
AND nasipaddress = '%{NAS-IP-Address}'
AND acctstoptime IS NULL
"

Accounting-Stop Query

Used by: FreeRADIUS accounting {}
Concept/Logic/Purpose: Close session normally. Closes a session only when FreeRADIUS receives an explicit stop notification from the NAS. This represents a verified and intentional session termination.

accounting_stop_query = "
UPDATE radacct
SET acctstoptime = NOW(),
    acctterminatecause = '%{Acct-Terminate-Cause}',
    session_state = 'CLOSED'
WHERE acctsessionid = '%{Acct-Session-Id}'
AND nasipaddress = '%{NAS-IP-Address}'
AND acctstoptime IS NULL
"

NAS Reboot Cleanup Query

Used by: FreeRADIUS unlang only
Concept/Logic/Purpose: Close sessions after verified NAS reboot. Force-closes all open sessions for a NAS only after a reboot has been positively detected. This prevents ghost sessions while preserving accounting accuracy.

nas_reboot_cleanup_query = "
UPDATE radacct
SET acctstoptime = acctupdatetime,
    acctterminatecause = 'NAS-Reboot',
    session_state = 'CLOSED'
WHERE nasipaddress = '%{NAS-IP-Address}'
AND acctstoptime IS NULL
"

Last Session-Time Lookup Query

Used by: FreeRADIUS unlang only
Concept/Logic/Purpose: Detect MikroTik reboot Retrieves the previously stored Acct-Session-Time to detect timer regression.This is the primary mechanism used to infer MikroTik NAS reboots reliably.

nas_last_session_time_query = "
SELECT last_acct_session_time
FROM radacct
WHERE acctsessionid = '%{Acct-Session-Id}'
AND nasipaddress = '%{NAS-IP-Address}'
AND acctstoptime IS NULL
LIMIT 1
"

Simultaneous-Use Count Query

Used by: SQL authorize stage
Concept/Logic/Purpose: Enforce single login. Counts only ACTIVE sessions to enforce single-login policies correctly. STALE sessions are excluded to avoid false user lockouts during temporary failures.

simul_count_query = "
SELECT COUNT(*)
FROM radacct
WHERE username = '%{User-Name}'
AND acctstoptime IS NULL
AND session_state = 'ACTIVE'
AND nasipaddress != '%{NAS-IP-Address}'
"

FreeRADIUS unlang Logic (Runtime Decisions)

Implements real-time decision making based on live accounting packets. This layer handles session creation, resumption, and NAS reboot detection without relying on timers.

Once stale sessions are identified at the database level, FreeRADIUS must use this verified state during authorization. This is where unlang becomes critical — it allows dynamic decision-making based on real-time session validity rather than raw record existence.

File Location

/etc/freeradius/sites-enabled/default

Section

server default → accounting { }

Responsibility

✔ Real-time decisions
✔ Packet-driven logic
❌ No time-based cleanup

Complete accounting{} block

# /etc/freeradius/sites-enabled/default
accounting {
  if (Acct-Status-Type == Start) {
    sql
    ok
  }
  if (Acct-Status-Type == Interim-Update) {
    update request {
      Tmp-Integer-0 := "%{sql:nas_last_session_time_query}"
    }
    if (&Tmp-Integer-0 && (&Acct-Session-Time < &Tmp-Integer-0)) {
      radiusd::log_warn("NAS reboot detected %{NAS-IP-Address}")
      sql:nas_reboot_cleanup_query
    }
    sql
    ok
  }
  if (Acct-Status-Type == Stop) {
    sql
    ok
  }
}

Cron Layer (Time-Based Maintenance Only)

Practical Example

Assume the NAS is configured to send Interim-Updates every 5 minutes.

• Session start time: 10:00
• Last Interim-Update received: 10:25
• Current time: 10:45

Since no updates were received for 20 minutes (4× the interim interval), the session can be confidently classified as stale and excluded from active session counts.

NOTE: CRON intervals

In production ISP environments, such cleanup jobs are typically executed every 5 to 15 minutes. Running them too frequently can increase database load, while long intervals delay user recovery. The exact timing should be aligned with the configured Interim-Update interval.

Cron Script #1 — Mark STALE Sessions

File:

/usr/local/sbin/radius/mark-stale-sessions.sh
Concept/Logic/Purpose: Periodically identifies sessions that stopped sending Interim-Updates but may still be valid. Sessions are marked STALE instead of being disconnected, allowing safe recovery.
Does NOT disconnect users

#!/bin/bash
mysql -u radius -p'PASSWORD' radiusdb <<EOF
UPDATE radacct
SET session_state = 'STALE'
WHERE acctstoptime IS NULL
AND session_state = 'ACTIVE'
AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 15;
EOF

Note: This query intentionally avoids deleting records blindly. Instead, it relies on time-based verification to determine whether a session has genuinely stopped sending updates. This approach prevents accidental cleanup of slow or temporarily delayed sessions and ensures billing and audit accuracy.

Crontab Entry:

*/5 * * * * /usr/local/sbin/radius/mark-stale-sessions.sh

Cron Script #2 — Cleanup Lost Sessions

Concept/Logic/Purpose: Final cleanup for dead accounting. Performs conservative, time-based cleanup of sessions that will never return.This protects database integrity without interfering with active or recoverable sessions.

File:

/usr/local/sbin/radius/cleanup-lost-sessions.sh

#!/bin/bash
mysql -u radius -p'PASSWORD' radiusdb <<EOF
UPDATE radacct
SET acctstoptime = acctupdatetime,
    acctterminatecause = 'Lost-Accounting',
    session_state = 'CLOSED'
WHERE acctstoptime IS NULL
AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 120;
EOF

Crontab Entry:

0 * * * * /usr/local/sbin/radius/cleanup-lost-sessions.sh

What Runs Where (Zero Ambiguity)

Logic	Location
Start / Stop	sites-enabled/default → accounting {}
Interim resume	sites-enabled/default → accounting {}
NAS reboot detection	sites-enabled/default → accounting {}
STALE marking	mark-stale-sessions.sh
Final cleanup	cleanup-lost-sessions.sh
Simultaneous-Use	queries.conf → simul_count_query

What MUST NOT Exist

❌ Anonymous cron entries
❌ Logic without file ownership
❌ Session deletion without cause
❌ Time-based reboot assumptions

Session Resume After Temporary Accounting Outage

In our implementation:

Sessions are marked as STALE, not closed
acctstoptime remains NULL
When connectivity restores and the same Acct-Session-Id sends Interim-Updates again:
- FreeRADIUS updates the existing record
- session_state is set back to ACTIVE
- The STALE condition is effectively cleared
- Accounting continues normally

This is exactly the correct behavior.

Step-by-step (what really happens in your setup)

1️⃣ Interim updates stop (temporary NAS↔RADIUS issue)

Cron runs:
- mark-stale-sessions.sh
Result in radacct:

acctstoptime = NULL

session_state = STALE

👉 Session is not closed, only flagged as stale.

2️⃣ NAS–RADIUS connectivity is restored

MikroTik resumes sending:

Acct-Status-Type = Interim-Update

Acct-Session-Id = SAME

3️⃣ FreeRADIUS processes Interim-Update

Inside:

/etc/freeradius/sites-enabled/default

→ accounting { }

Flow:

nas_last_session_time_query finds the existing row
(acctstoptime IS NULL ✔)
NAS reboot check:
- Acct-Session-Time has not reset
- No reboot detected ✔
accounting_update_query executes:

UPDATE radacct
SET
acctupdatetime = NOW(),
acctsessiontime = ...,
last_acct_session_time = ...,
session_state = 'ACTIVE'
WHERE acctsessionid = ?
AND acctstoptime IS NULL;

4️⃣ Result in database

Field	Value
acctstoptime	NULL
session_state	ACTIVE
counters	updated
accounting	continuous

👉 The STALE flag is removed automatically by the Interim-Update.

What does NOT happen (important)

❌ No new row is created
❌ No duplicate session
❌ No accounting reset
❌ No Simultaneous-Use false block

Why this works (core principle)

FreeRADIUS can safely resume a session only if acctstoptime was never set.

OUR design respects this rule.

That is the entire reason STALE exists as a state.

One-line confirmation…

Yes, in this design sessions are marked as STALE (not closed), and when the same accounting session resumes, FreeRADIUS continues updating the existing record and automatically restores the session to ACTIVE.

One operational warning

If a session is ever closed (acctstoptime set), it can never be resumed — only restarted.

🛡️ Audit Justification Points

Audit Rationale: This session model is designed to ensure that PPPoE session closures are only recorded when there is explicit evidence of termination, either through an accounting stop, verified NAS reboot, or prolonged accounting silence beyond operational thresholds. Temporary outages do not constitute termination, preserving billing integrity and avoiding false positive disconnects.

To make this useful for Enterprise/ISP managers who need to justify these changes to auditors, add a section titled “Audit & Compliance: Why This Matters”.

Audit & Compliance Justification Implementing automated stale session handling is not just an operational fix; it is a data integrity requirement.

AAA Data Integrity (Authentication, Authorization, Accounting): Auditors require that Accounting logs accurately reflect user usage. Leaving stale sessions “open” (with AcctStopTime IS NULL) falsifies usage duration records, leading to incorrect billing disputes and “ghost” data consumption logs.

Revenue Assurance: For prepaid or quota-based ISPs, stale sessions prevent the system from calculating the final session volume. By forcing a closure based on the last known Interim-Update, we ensure that the billable data matches the actual network activity, preventing revenue leakage or customer overcharging.

Traceability & Non-Repudiation: Our customized SQL queries introduce a acctterminatecause = 'NAS-Reboot' flag. This provides a distinct audit trail, differentiating between a user logging off (User-Request) and a system correction (System-Cleanup), which is critical for forensic analysis during network outages.*

When this logic does NOT apply

NAS firmware that does not send Interim-Update reliably
Cases where NAS sends stale Interim-Updates after long outages
Networks with asymmetric paths and intermittent packet loss

In such cases you may need longer STALE thresholds or secondary reachability checks (SNMP/ICMP).

📌 NOC Operational Expectation

STALE threshold: 15 minutes
Final cleanup: 120 minutes
Alerts for NAS reboot events should be integrated into monitoring (Syslog/SNMP)
Any unexpected growth in STALE counts must be investigated

⃣General View for Non-Technical Reader

In simple terms, sessions are only ended when there is verified evidence of exit. Temporary network issues are handled without affecting service continuity, ensuring users don’t lose sessions or get billed incorrectly.

Important Caveats

• If Interim-Updates are disabled or misconfigured on the NAS, this method will not work correctly.
• Database latency or replication delay must be considered in large deployments.
• Multi-NAS environments should ensure session verification is NAS-aware to avoid false positives.

Final Operational Principle

Stale PPPoE sessions are not a database anomaly but a natural consequence of real-world network behavior. Treating them as such requires session verification based on accounting activity rather than the mere presence of an open record.

By relying on Interim-Update freshness, SQL-based validation, and unlang-driven authorization logic, ISPs can accurately distinguish between active and defunct sessions. This method ensures Simultaneous-Use enforcement remains fair, prevents unnecessary customer lockouts, and preserves accounting accuracy for billing and audit purposes.

When implemented correctly and aligned with NAS interim update intervals, this approach provides a scalable and production-ready solution for managing PPPoE session state in MikroTik and FreeRADIUS environments.

© Syed Jahanzaib — aacable.wordpress.com

Implementing Daily Quota-Based Speed Throttling in FreeRADIUS 3.2.x (Production-Grade, Without sqlcounter)

Filed under: freeradius, Mikrotik Related — Tags: Bandwidth Throttling, freeradius, isp, linux, mikrotik, mysql, Quota Management — Syed Jahanzaib / Pinochio~:) @ 10:21 AM

Author: Syed Jahanzaib ~A Humble Human being! nothing else
Platform: aacable.wordpress.com
Category: Corporate Offices / DHCP-DNS Engineering
Audience: Systems Administrators, IT Support, NOC Teams, Network Architects

Disclaimer & Note on Writing Style

Blind copy-paste implementation without proper validation, testing, and change management is never recommended — especially in production environments. Always ensure proper backups and risk assessment before applying any configuration.

This blog is not intended for client acquisition or follower growth. It exists solely to share practical knowledge and real-world experience with the community.

Thank you for your understanding and continued support.

Implementing Daily Quota‑Based Speed Throttling in FreeRADIUS 3.2.x

Production‑Grade Design (Without sqlcounter)

Overview

In many ISP and enterprise environments, not all subscribers should be treated equally. A common requirement is:

Allow full speed (e.g., 10 Mbps) up to a daily quota (e.g., 100 GB/day)
Automatically downgrade speed (e.g., to 5 Mbps) once the quota is exceeded
Restore full speed automatically the next day

This article documents a real, production‑tested implementation using:

FreeRADIUS (tested on 3.2.7)
MySQL backend
MikroTik NAS with CoA support
Cron‑based enforcement
Explicit SQL logic (no magic counters)
.my.cnf security

⚠️ Important: This design intentionally does NOT use sqlcounter. The reasoning is explained later.
Also we can adapt this logic for Huawei / Cisco & other NASES too… 😉

What This Design Supports

Multiple service profiles (e.g., 5M, 10M, 20M)
Per‑service daily quota limits
Automatic speed downgrade after quota exhaustion
Real‑time enforcement for already‑connected users via CoA (We send only explicit bandwidth-update CoA attributes, not disconnect requests.)
Automatic speed restoration on the next day
No infinite CoA loops (Avoid sending CoA too frequently because the NAS might be overloaded otherwise
Safe to run every few minutes in production

This approach scales cleanly and is widely used in real ISP deployments.

High‑Level Architecture

Responsibilities are cleanly separated:

Component	Responsibility
FreeRADIUS	Authentication & policy decision
MySQL	Usage data, services, enforcement state
Cron script	Quota detection & CoA triggering
CoA	Force re‑authorization / speed change
State table	Prevent repeated enforcement

Why this matters

Easier debugging
Predictable behavior
Safe upgrades
No hidden logic in authentication path

Design Principles (Read This First)

Before touching configuration, understand these realities:

Authorization happens once (at login)
Quota is crossed later (during accounting)
FreeRADIUS does not automatically re‑authorize sessions
CoA is mandatory for real‑time speed changes
Repeated CoA without state tracking causes loops

To solve this correctly, we introduce a quota state table.

Database Tables Used

Table	Purpose
radcheck	Authentication only
radacct	Usage accounting
service_profiles	Speed + quota rules
user_services	User → service mapping
user_quota_state	Enforcement memory (critical)

This is exactly how large ISP RADIUS systems are structured.

Mental Model (Clean Separation)

Large systems scale by separating responsibility — not by collapsing everything into one table.

For this design we use three custom tables.

1️⃣ service_profiles — Rules Table

Purpose

Defines how a service behaves.

Controls

Normal speed (e.g., 10M/10M)
Throttled speed (e.g., 5M/5M)
Daily quota (GB)
Whether quota enforcement is enabled

One‑Line Summary

service_profiles defines HOW a service behaves.

Table Creation

CREATE TABLE service_profiles (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50),
normal_rate VARCHAR(50),
throttled_rate VARCHAR(50),
daily_quota_gb INT,
quota_enabled TINYINT(1)
);

Example Entry

INSERT INTO service_profiles
(name, normal_rate, throttled_rate, daily_quota_gb, quota_enabled)
VALUES
('10mb', '10M/10M', '5M/5M', 100, 1);

2️⃣ user_services — User → Service Mapping

Purpose

Maps which user is on which service plan.

One‑Line Summary

user_services defines WHICH rules apply to WHICH user.

Table Creation

CREATE TABLE user_services (
username VARCHAR(64) PRIMARY KEY,
service_name VARCHAR(50)
);

This keeps radcheck clean and avoids overloading authentication tables.

3️⃣ user_quota_state — Enforcement Memory (Critical)

Purpose

Remembers who has already been throttled today.

It prevents infinite CoA loops.

One‑Line Summary

user_quota_state answers one question: “Has quota enforcement already been applied for this user today?”

Table Creation

CREATE TABLE user_quota_state (
username VARCHAR(64) NOT NULL,
service_name VARCHAR(50) NOT NULL,
quota_date DATE NOT NULL,
is_throttled TINYINT(1) DEFAULT 0,
PRIMARY KEY (username, quota_date)
);

What This Table IS

A daily flag
A loop‑prevention mechanism
A CoA control switch

What This Table IS NOT

❌ Billing record
❌ Usage log
❌ Authorization source
❌ Permanent status

It does not control access. It only prevents repeated enforcement.

Lifecycle of a User

Login → Normal speed
Quota exceeded → Script detects breach
CoA sent → Speed reduced
State recorded → No further CoA today
New day → State cleared → Full speed restored

Authorization Logic (Speed Assignment)

During authorization, FreeRADIUS decides which speed profile applies.

Logic

On login (or re-auth):

If user is not throttled today → Normal speed
If user is throttled today → Throttled speed

🧠 One Sentence Mental Model

Authorization decides speed, accounting measures usage, CoA enforces change, one flag prevents repetition.

1️⃣ Is putting this logic in sites-enabled/default efficient?

Short answer

Yes, if done correctly — and no, it will NOT overload authorization in your use-case.

Why this is safe

In FreeRADIUS, authorize {} is executed anyway for every Access-Request.
You are not adding a new phase, you are only adding conditional logic.

The real cost is:

SQL queries
String expansion
Unlang branching

These are cheap compared to:

TLS
PAP/CHAP/MSCHAP
SQL auth itself
Accounting writes

What would be inefficient (and what you should avoid)

❌ Multiple SQL queries per request
❌ Repeating the same SQL lookup again and again
❌ Logic applied to every NAS when not required

✅ Production-safe optimization pattern (Recommended)

Move logic into a policy
Cache service data in control
Proper way to detect NAS type and send correct attributes (MikroTik vs others)
Best Practice: Use nas table (already supported by FreeRADIUS)
Use NAS-Type in unlang

FreeRADIUS authorize {} Configuration

Edit:

nano /etc/freeradius/sites-enabled/default

authorize {
preprocess
filter_username
suffix
# 1) Load credentials
sql
# 2) Determine user's service
update control {
Service-Name := "%{sql:SELECT service_name FROM user_services WHERE username='%{User-Name}' LIMIT 1}"
}
# 3) Quota‑based speed decision
if ("%{sql:SELECT 1 FROM user_quota_state WHERE username='%{User-Name}' AND quota_date=CURDATE() LIMIT 1}" == "1") {
update reply {
Mikrotik-Rate-Limit := "%{sql:SELECT throttled_rate FROM service_profiles WHERE name='%{control:Service-Name}'}"
}
}
else {
update reply {
Mikrotik-Rate-Limit := "%{sql:SELECT normal_rate FROM service_profiles WHERE name='%{control:Service-Name}'}"
}
}
pap
chap
mschap
digest
eap {
ok = return
}
expiration
logintime
}

Result

Normal login → Normal speed
Over quota → Throttled speed
CoA re‑auth → Correct speed
Next day → Full speed restored

Real‑Time Enforcement Using CoA (qc.sh)

Mental Model

SQL detects quota breach → CoA enforces change → State table prevents repetition
Authorization alone is not enough because users may stay connected for days.

Selection Criteria (Critical)
The script must select only:

Active sessions
Services with quota_enabled = 1
Users who exceeded daily quota
Users not already throttled today

Production‑Ready qc.sh (quota check every 5 minutes)

#!/bin/bash
while read user nas ip secret; do
echo "Applying quota throttle for user=$user on NAS=$nas"
echo "User-Name=$user, Framed-IP-Address=$ip" | \
radclient -x "$nas" coa "$secret"
mysql <<EOF
INSERT IGNORE INTO user_quota_state (username, quota_date)
VALUES ('$user', CURDATE());
EOF
done < <(
mysql -N <<EOF
SELECT
r.username,
r.nasipaddress,
r.framedipaddress,
n.secret
FROM radacct r
JOIN user_services u ON r.username = u.username
JOIN service_profiles s ON u.service_name = s.name
JOIN nas n ON n.nasname = r.nasipaddress
WHERE r.acctstoptime IS NULL
AND s.quota_enabled = 1
AND NOT EXISTS (
SELECT 1
FROM user_quota_state q
WHERE q.username = r.username
AND q.quota_date = CURDATE()
)
GROUP BY r.username, r.nasipaddress, r.framedipaddress, n.secret
HAVING
SUM(r.acctinputoctets + r.acctoutputoctets)
> (MAX(s.daily_quota_gb) * 1024 * 1024 * 1024);
EOF
)

Safe to run every 5–10 minutes. Cron it to run every 5 minutes.

Why This Does NOT Loop

Once a user is throttled:

is_throttled = 1 is recorded
Script excludes that user
No repeated CoA packets
No NAS overload

This is idempotent enforcement, which is mandatory in ISP systems.

Daily Quota Reset & restore already connected users to there original speed – Script

Restoring Original Speed for Already Connected Users at Midnight

✅ The correct restore logic (mirror of qc.sh)

The restore script must:

Target only users who were throttled
Target only active sessions
Send CoA only
NOT touch accounting
NOT disconnect

That means:

Same CoA
Different condition

The Problem

At midnight, we reset daily quota (qc_reset.sh) state by running:

This correctly unmarks users as “quota exceeded”.

However:

❗ Users who are already connected and throttled will remain throttled
until their session is re-authorized.

Because:

FreeRADIUS only applies speed on authentication / re-auth
Clearing the table alone does not push a change to active sessions

So we need a way to force already connected users to re-apply policy.

Change bandwidth of ALREADY CONNECTED users using CoA, both when quota is exceeded and when quota is reset

The Correct Solution (Industry Practice)

✔ Use CoA (Change of Authorization) at midnight

to force policy re-evaluation for affected users.

This is exactly how ISPs do it.

Design Principle (Important)

We do NOT want to:

Disconnect users
Restart NAS
Restart FreeRADIUS
Touch radacct

Example of qc_restore.sh

#!/bin/bash
#set -x
while read user nas rate secret; do
echo "Restoring user=$user to $rate"
# I have disabled this for test, you may enable it , syedjahanzaib
# echo "User-Name=$user, Mikrotik-Rate-Limit=$rate" | \
# radclient -q -c 1 "$nas:3799" coa "$secret"
done < <(
mysql radius -N <<EOF
SELECT
r.username,
r.nasipaddress,
s.normal_rate,
n.secret
FROM user_quota_state q
JOIN radacct r ON r.username = q.username
JOIN user_services u ON u.username = r.username
JOIN service_profiles s ON s.name = u.service_name
JOIN nas n ON n.nasname = r.nasipaddress
WHERE q.quota_date < CURDATE()
AND r.acctstoptime IS NULL;
EOF
)

NOTE:
WHERE q.quota_date < CURDATE()
This is intentional: only past-quota users restored after the new day begins

Why You MUST NOT Rely Only on FreeRADIUS Logic??

You might ask:
“Why not let FreeRADIUS re-evaluate on every packet?”
Because:

Authorization happens once
Accounting happens later
Usage thresholds are crossed after auth

➡ CoA + state table is the only correct solution

qc_reset.sh

DELETE FROM user_quota_state WHERE quota_date < CURDATE();

Cron Job (Run this script every night at 00:05 (12:05 am)

At midnight:

quota_reset.sh deletes state
qc_restore.sh sends CoA
NAS re-asks RADIUS
RADIUS sees NO quota state
RADIUS replies with normal_rate
NAS updates bandwidth live

✔ Same session
✔ Same IP
✔ Same user
✔ Only bandwidth changes

🕛 Correct cron order

Script	Purpose	Frequency
`qc.sh`	Detect quota exceeded & throttle	Every 5–10 min
`quota_reset.sh`	Reset daily enforcement state	Once per day
`qc_restore.sh`	Restore normal speed via CoA	Once per day

🧠 Final Mental Model (This is the key)

CoA is used in BOTH directions
Throttle AND restore.

🟢 There is NO DISCONNECT Anywhere

If you wanted to disconnect, you would see:

You are not doing that.

So your system is:

Correct
Clean
ISP-grade
Non-disruptive

Final sanity checklist

Item	Status
Throttle works	✅
Restore logic correct	✅
No disconnects	✅
Explicit bandwidth CoA	✅
Daily lifecycle clean	✅
Script behavior predictable	✅

One-line takeaway (this explains everything)

Restore script does nothing during the same day — and that is exactly correct.

You’ve reached the correct end-state.

Data Consistency Lesson (Important)

Problem: Enforcement query returned no rows even though users were active.
Root Cause:

nasipaddress did not match nas.nasname

Fix:

Ensure nasname contains the NAS IP address, not only a hostname

This alignment is critical when dynamically pulling NAS secrets.

Why We Did NOT Use sqlcounter

This was a deliberate engineering decision.
Problems with sqlcounter

Runs inside authentication path
Executes on every login / re‑auth
Awkward speed downgrade logic
No native state memory
Fragile across upgrades

Correct Reality

Authorization happens once
Usage thresholds are crossed later
CoA + state tracking is mandatory

🔹 Simulate Quota Usage (Insert Fake Active Session)

When testing quota logic, you may not have a real online user yet.
In that case, insert a fake but valid active session into radacct.

✅ Example: Insert Active Session for User `zaib`

Why these fields matter

Column	Reason
`acctsessionid`	Must be unique
`username`	Must match `user_services`
`nasipaddress`	Must match `nas.nasname`
`framedipaddress`	Used for session identification
`acctstarttime`	Marks session start
`acctstoptime = NULL`	Marks session as active
`acctinputoctets`	Simulates data usage
`acctoutputoctets`	Optional (can be 0)

🔹 Verify Inserted Session

Expected result:

(≈120 GB)

🔹 Now Run QC (Quota and Throttle Mode Check)

Expected behavior

Script detects quota exceeded
Sends bandwidth-only CoA
Inserts entry into user_quota_state
User remains connected
Speed changes to throttled_rate

⚠️ Important Cleanup After Testing

After simulation, remove the fake session to avoid confusion:

🔹 One-Line Reminder

radacct must contain an active (acctstoptime = NULL) session for quota logic to trigger.

Final Architecture Summary

Component	Role
radacct	Usage accounting
service_profiles	Speed & quota rules
user_services	User mapping
user_quota_state	Enforcement memory
qc.sh	Detection & CoA
quota_reset.sh	Daily reset
FreeRADIUS	Policy decision

Securing MySQL Credentials (.my.cnf)

⚠️ Use a .my.cnf file in production (don’t hardcode passwords). this is a security + ops best practice, so let’s do it correctly and simply.

Location

/root/.my.cnf

Content

[client]
user=radius
password=YOUR_DB_PASSWORD
host=localhost
database=radius

🔒 VERY IMPORTANT: Set strict permissions

MySQL will ignore the file if permissions are weak.

chmod 600 /root/.my.cnf
chown root:root /root/.my.cnf

Verify:

ls -l /root/.my.cnf

Expected:

-rw------- 1 root root ... /root/.my.cnf

Result

❌ No passwords in scripts ❌ No exposure via ps or history

✅ How this changes your scripts (Clean & Safe)

❌ Old (bad practice)

mysql radius -u radius -pPASSWORD -e "DELETE FROM user_quota_state ..."

✅ New (correct)

mysql -e "DELETE FROM user_quota_state WHERE quota_date < CURDATE();"

No username
No password
No exposure in ps, history, or logs

🧠 How MySQL finds .my.cnf (important to know)

When you run mysql, it checks in this order:

/etc/my.cnf
/etc/mysql/my.cnf
~/.my.cnf ← this one
Command-line options

So /root/.my.cnf is automatically used.

Conclusion

By avoiding sqlcounter and using explicit SQL + CoA + state tracking, we achieve:

Reliable daily quota enforcement
Correct speed throttling
Clean FreeRADIUS configuration
Predictable behavior
Production‑grade stability

This reflects how real ISP systems are built, not lab examples.

Performance Tuning & Scaling Tips (Production ISPs)

This solution is already efficient by design, but the following optimizations ensure stable performance at scale (thousands to tens of thousands of users).

1️⃣ Indexing: The Single Biggest Performance Win

Your scripts heavily query radacct and user_quota_state.
Without proper indexes, MySQL will scan entire tables.

✅ Required Indexes

-- radacct: active session lookup
CREATE INDEX idx_radacct_active
ON radacct (username, acctstoptime);
CREATE INDEX idx_radacct_nas
ON radacct (nasipaddress);
-- radacct: usage calculation
CREATE INDEX idx_radacct_usage
ON radacct (username, acctstarttime);
-- user_quota_state: daily state check
CREATE UNIQUE INDEX idx_quota_user_day
ON user_quota_state (username, quota_date);
-- user_services: user lookup
CREATE INDEX idx_user_services_user
ON user_services (username);
-- service_profiles: service lookup
CREATE INDEX idx_service_profiles_name
ON service_profiles (name);

📌 These indexes reduce query time from seconds to milliseconds.

2️⃣ Limit `radacct` Growth (Very Important)

radacct grows fast in ISP environments.

Best Practices

Enable Interim-Update (5–10 minutes)
Periodically archive old records:

Or move to an archive table.

3️⃣ Cron Frequency: Don’t Overdo It

Recommended schedule

Script	Frequency
`qc.sh`	Every 5–10 minutes
`quota_reset.sh`	Once per day
`quota_restore.sh`	Once per day

Running throttle every minute does not improve accuracy, only DB load.

5️⃣ Reduce CoA Traffic (Selective Targeting)

Your design sends CoA only when needed:

✔ User exceeded quota
✔ User not already throttled
✔ User still online

This is far better than sending CoA to all users.

6️⃣ Avoid `sqlcounter` for High-Scale Quota Logic

Why your design scales better:

Aspect	sqlcounter	This Design
Runs on auth	Yes	No
DB queries	Per login	Scheduled
CoA support	Limited	Native
Loop protection	No	Yes
Debuggability	Hard	Easy

For large ISPs, offloading quota logic from auth path is critical.

7️⃣ MySQL Tuning (Basic but Important)

Minimum recommended settings for radius DB:

[mysqld]
innodb_buffer_pool_size = 2G
innodb_log_file_size = 512M
max_connections = 300</div>

Adjust according to RAM and load.

8️⃣ Use `.my.cnf` (Already Done)

Avoid command-line passwords.

Benefits:

No leaks in ps
Cleaner scripts
Safer cron jobs

9️⃣ Logging Without Killing Performance

Avoid excessive debug logging.

Recommended approach

Log actions, not every query
Rotate logs weekly

🔟 Test at Scale (Before Production)

Before enabling in production:

Simulate 100–500 fake users
Run throttle mode
Observe:
- MySQL CPU
- Disk I/O
- CoA response time

If MySQL is slow → indexes are missing.

Final Performance Rule (Remember This)

Quota logic must run OUTSIDE the authentication path.

Your design already follows this rule — which is why it scales.

One-Line Takeaway

Indexes + selective CoA + cron-based logic = ISP-grade performance.

Syed Jahanzaib

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28

Syed Jahanzaib – سید جہانزیب – Personal Blog to Share Knowledge !

February 9, 2026

Search My Blog . . .

Blog Stats

Categories

Meta

Email Subscription

Archives

Recent Posts

Pages

Top Clicks

Top Posts

RSS

Syed Jahanzaib FB Link

Syed Jahanzaib – سید جہانزیب – Personal Blog to Share Knowledge !

February 9, 2026

Handling Stale PPPoE Sessions in MikroTik + FreeRADIUS

Handling Stale PPPoE Sessions in MikroTik + FreeRADIUS

Disclaimer & Note on Writing Style

Introduction

What ISPs do in real production

How do ISPs verify if user is actually not live on NAS?

Database Layer (Schema Ownership)

FreeRADIUS unlang Logic (Runtime Decisions)

Cron Layer (Time-Based Maintenance Only)

Session Resume After Temporary Accounting Outage

🛡️ Audit Justification Points

When this logic does NOT apply

📌 NOC Operational Expectation

Important Caveats

Final Operational Principle

Share this:

Implementing Daily Quota-Based Speed Throttling in FreeRADIUS 3.2.x (Production-Grade, Without sqlcounter)

Disclaimer & Note on Writing Style

Implementing Daily Quota‑Based Speed Throttling in FreeRADIUS 3.2.x

Overview

What This Design Supports

High‑Level Architecture

Why this matters

Design Principles (Read This First)

Database Tables Used

Mental Model (Clean Separation)

1️⃣ service_profiles — Rules Table

Purpose

Controls

One‑Line Summary

Table Creation

Example Entry

2️⃣ user_services — User → Service Mapping

Purpose

One‑Line Summary

Table Creation

3️⃣ user_quota_state — Enforcement Memory (Critical)

Purpose

One‑Line Summary

Table Creation

What This Table IS

What This Table IS NOT

Lifecycle of a User

Authorization Logic (Speed Assignment)

Logic

FreeRADIUS authorize {} Configuration

Result

Real‑Time Enforcement Using CoA (qc.sh)

Mental Model

Production‑Ready qc.sh (quota check every 5 minutes)

Daily Quota Reset & restore already connected users to there original speed – Script

Restoring Original Speed for Already Connected Users at Midnight

✅ The correct restore logic (mirror of qc.sh)

The Problem

The Correct Solution (Industry Practice)

✔ Use CoA (Change of Authorization) at midnight

Design Principle (Important)

Example of qc_restore.sh

qc_reset.sh

Cron Job (Run this script every night at 00:05 (12:05 am)

🕛 Correct cron order

🧠 Final Mental Model (This is the key)

🟢 There is NO DISCONNECT Anywhere

Final sanity checklist

One-line takeaway (this explains everything)

Data Consistency Lesson (Important)

Why We Did NOT Use sqlcounter

Correct Reality

🔹 Simulate Quota Usage (Insert Fake Active Session)

✅ Example: Insert Active Session for User zaib

Why these fields matter

🔹 Verify Inserted Session

🔹 Now Run QC (Quota and Throttle Mode Check)

Expected behavior

⚠️ Important Cleanup After Testing

🔹 One-Line Reminder

Final Architecture Summary

Securing MySQL Credentials (.my.cnf)

✅ Example: Insert Active Session for User `zaib`

2️⃣ Limit `radacct` Growth (Very Important)

6️⃣ Avoid `sqlcounter` for High-Scale Quota Logic

8️⃣ Use `.my.cnf` (Already Done)