Handling Stale PPPoE Sessions in MikroTik + FreeRADIUS
(Exact File Locations, unlag Placement, SQL Ownership, and Cron Responsibilities)
- Author: Syed Jahanzaib ~A Humble Human being! nothing else
- Platform: aacable.wordpress.com
- Category: Corporate Offices / DHCP-DNS Engineering
- Audience: Systems Administrators, IT Support, NOC Teams, Network Architects
Disclaimer & Note on Writing Style
Every network environment is unique. A solution that works effectively in one infrastructure may require modification in another. Readers are strongly encouraged to understand the underlying concepts and adapt the guidance according to their own architecture, operational policies, and risk tolerance.
Blind copy-paste implementation without proper validation, testing, and change management is never recommended , especially in production environments. Always ensure proper backups and risk assessment before applying any configuration.
The content shared here is based on hands-on experience from real-world deployments, ISP environments, lab testing, and continuous learning. While I strive for technical accuracy, no technical implementation is entirely free from the possibility of error. Constructive discussion and alternative approaches are always welcome.
Due to professional commitments, it is not always feasible to publish highly detailed or multi-part write-ups. The technical logic and implementation details are written based on my own practical experience. AI tools such as ChatGPT are used only to refine grammar, structure, and presentation — not to generate the core technical concepts.
This blog is not intended for client acquisition or follower growth. It exists solely to share practical knowledge and real-world experience with the community.
Thank you for your understanding and continued support.
Introduction
In PPPoE deployments built on MikroTik NAS and FreeRADIUS with an SQL backend, stale accounting sessions are an unavoidable operational reality. These sessions typically arise when accounting stop packets are never received due to NAS reboots, power failures, access link disruptions, or transport issues between the NAS and the RADIUS server.
When left unverified, such stale records directly impact Simultaneous-Use enforcement, resulting in legitimate users being blocked with “user already online” conditions despite having no active session. Addressing this problem requires more than periodic record deletion; it demands a controlled and verifiable method to determine whether a session is truly active.
This article presents a production-safe approach to PPPoE session verification by leveraging accounting activity, SQL-based validation, unlang logic, and scheduled cleanup processes. The objective is to ensure accurate session state, reliable Simultaneous-Use enforcement, and clean accounting data without compromising audit integrity or service availability.
Why this happens (root cause)
- MikroTik sends Accounting-Start
- Every 5 minutes it sends Interim-Update
- FreeRADIUS inserts a row in radacct with acctstoptime = NULL
- NAS reboots / loses connectivity
- Accounting-Stop is never sent
- FreeRADIUS still thinks the session is active
- Simultaneous-Use = 1 blocks re-login
👉 RADIUS cannot guess that the user is gone unless you tell it how.
What ISPs do in real production
There are three layers usually combined:
1️⃣ Interim-Update timeout validation (MOST IMPORTANT)
This is the primary verification mechanism.
Logic
If a session has not sent an Interim-Update for X minutes, it is considered dead, even if acctstoptime is NULL.
Why it works
- A live session must send interim updates
- If NAS rebooted, updates stop
- This is proof, not assumption
Typical ISP rule
If last interim update > 2 × Acct-Interim-Interval → session is dead
Example:
- Interim interval = 5 minutes
- Timeout = 10–15 minutes
SQL condition (verification)
SELECT username, nasipaddress, acctsessionid FROM radacct WHERE acctstoptime IS NULL AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 15;
If this condition is true → session is NOT alive.
Proper cleanup (controlled)
UPDATE radacct SET acctstoptime = acctupdatetime, acctterminatecause = 'NAS-Reboot' WHERE acctstoptime IS NULL AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 15;
✅ This is production-safe
❌ This is NOT blind deletion
✔ This is based on verifiable inactivity
2️⃣ NAS reachability check (secondary verification)
Many ISPs also verify NAS status before cleanup.
Typical checks
- Ping NAS IP
- SNMP availability
- RouterOS API availability
Logic
- IF NAS unreachable
- AND session has no interim updates
- THEN mark sessions as dead
This avoids killing sessions when:
- SQL delayed
- Temporary RADIUS lag
Example logic (pseudo)
if ping NAS == fail:
cleanup stale sessions for that NAS
This is often implemented in cron scripts.
3️⃣ MikroTik-side protections (VERY IMPORTANT)
- a) Proper Acct-Interim-Interval
On MikroTik:
/radius set accounting=yes interim-update=5m
Never leave it default or disabled.
- b) Send Stop on reboot (best-effort)
RouterOS already tries, but you can improve reliability:
- Use stable storage
- Avoid power cuts without UPS
Still, RADIUS must assume stops can be lost.
4️⃣ What ISPs DO NOT do (bad practices)
❌ Delete all acctstoptime IS NULL blindly
❌ Disable Simultaneous-Use
❌ Allow multiple sessions to avoid complaints
❌ Manual cleanup by support staff
These cause:
- Abuse
- Multiple logins
- Incorrect billing
- Legal/audit issues
5️⃣ Recommended production-grade approach (summary)
| Layer | Purpose | Mandatory |
| Interim timeout check | Session verification | ✅ YES |
| NAS reachability check | Extra safety | ⚠ Recommended |
| Controlled SQL cleanup | Session release | ✅ YES |
| Simultaneous-Use=1 | Abuse prevention | ✅ YES |
6️⃣ Industry-standard wording (how ISPs define it)
“A session is considered active only if periodic accounting updates are received within the defined timeout window. Absence of updates implies session termination due to NAS failure or connectivity loss.”
This is exactly how large ISPs justify cleanup during audits.
Final Answer to core question!
How do ISPs verify if user is actually not live on NAS?
✅ By absence of Interim-Update packets within a defined time window, optionally combined with NAS reachability checks.
There is no other reliable method in RADIUS.
This article documents a fully production-aligned design, explicitly mapping:
- Which logic runs inside FreeRADIUS
- Which logic runs via SQL
- Which logic runs via cron
- Exact file names and locations for every decision
No unnamed logic. No invisible automation.
Environment Assumptions
- FreeRADIUS 3.x
- MySQL / MariaDB backend
- MikroTik PPPoE NAS
- Interim-Update = 5 minutes
- Simultaneous-Use = 1
Session State Model (Conceptual)
| State | Meaning | Where enforced |
| ACTIVE | Interim updates arriving | FreeRADIUS SQL |
| STALE | Interim missing, NAS alive | Cron |
| CLOSED | Verified termination | FreeRADIUS unlang / Cron |
Database Layer (Schema Ownership)
Location
- Database: radiusdb
- Table : radacct
Schema Extension (run once)
Concept/Logic/Purpose: Extends the radacct table to explicitly track session state and last known MikroTik session timers. This enables deterministic handling of stale, resumed, and terminated sessions without relying on assumptions.
ALTER TABLE radacct ADD COLUMN session_state ENUM('ACTIVE','STALE','CLOSED') DEFAULT 'ACTIVE', ADD COLUMN last_acct_session_time INT DEFAULT 0, ADD INDEX idx_state_update (session_state, acctupdatetime), ADD INDEX idx_nas_user (nasipaddress, username);
Responsibility:
✔ Stores session truth
✔ No logic, only state
SQL Query Ownership (queries.conf)
- Concept/Logic/Purpose: Centralizes all accounting-related SQL logic in a single, predictable location managed by FreeRADIUS. This separation ensures that runtime logic (unlang) and data manipulation (SQL) remain clean and auditable.
File Location
- /etc/freeradius/mods-config/sql/main/mysql/queries.conf
All SQL below lives only in this file.
Accounting-Start Query
- Used by: FreeRADIUS accounting {}
- Concept/Logic/Purpose: Creates a new accounting record when a PPPoE session is first established.
This marks the authoritative beginning of a user session and initializes all tracking fields.
accounting_start_query = " INSERT INTO radacct ( acctsessionid, acctuniqueid, username, nasipaddress, acctstarttime, acctupdatetime, acctstoptime, acctsessiontime, last_acct_session_time, session_state ) VALUES ( '%{Acct-Session-Id}', '%{Acct-Unique-Session-Id}', '%{User-Name}', '%{NAS-IP-Address}', NOW(), NOW(), NULL, '%{Acct-Session-Time}', '%{Acct-Session-Time}', 'ACTIVE' )"
Interim-Update Query
- Used by: FreeRADIUS accounting {}
- Concept/Logic/Purpose: Periodically refreshes session counters and timestamps to confirm that the user is still actively connected. It also safely resumes sessions after temporary NAS–RADIUS connectivity loss.
accounting_update_query = " UPDATE radacct SET acctupdatetime = NOW(), acctsessiontime = '%{Acct-Session-Time}', last_acct_session_time = '%{Acct-Session-Time}', session_state = 'ACTIVE' WHERE acctsessionid = '%{Acct-Session-Id}' AND nasipaddress = '%{NAS-IP-Address}' AND acctstoptime IS NULL "
Accounting-Stop Query
- Used by: FreeRADIUS accounting {}
- Concept/Logic/Purpose: Close session normally. Closes a session only when FreeRADIUS receives an explicit stop notification from the NAS. This represents a verified and intentional session termination.
accounting_stop_query = " UPDATE radacct SET acctstoptime = NOW(), acctterminatecause = '%{Acct-Terminate-Cause}', session_state = 'CLOSED' WHERE acctsessionid = '%{Acct-Session-Id}' AND nasipaddress = '%{NAS-IP-Address}' AND acctstoptime IS NULL "
NAS Reboot Cleanup Query
- Used by: FreeRADIUS unlang only
- Concept/Logic/Purpose: Close sessions after verified NAS reboot. Force-closes all open sessions for a NAS only after a reboot has been positively detected. This prevents ghost sessions while preserving accounting accuracy.
nas_reboot_cleanup_query = " UPDATE radacct SET acctstoptime = acctupdatetime, acctterminatecause = 'NAS-Reboot', session_state = 'CLOSED' WHERE nasipaddress = '%{NAS-IP-Address}' AND acctstoptime IS NULL "
Last Session-Time Lookup Query
- Used by: FreeRADIUS unlang only
- Concept/Logic/Purpose: Detect MikroTik reboot Retrieves the previously stored
Acct-Session-Timeto detect timer regression.This is the primary mechanism used to infer MikroTik NAS reboots reliably.
nas_last_session_time_query = " SELECT last_acct_session_time FROM radacct WHERE acctsessionid = '%{Acct-Session-Id}' AND nasipaddress = '%{NAS-IP-Address}' AND acctstoptime IS NULL LIMIT 1 "
Simultaneous-Use Count Query
- Used by: SQL authorize stage
- Concept/Logic/Purpose: Enforce single login. Counts only ACTIVE sessions to enforce single-login policies correctly. STALE sessions are excluded to avoid false user lockouts during temporary failures.
simul_count_query = " SELECT COUNT(*) FROM radacct WHERE username = '%{User-Name}' AND acctstoptime IS NULL AND session_state = 'ACTIVE' AND nasipaddress != '%{NAS-IP-Address}' "
FreeRADIUS unlang Logic (Runtime Decisions)
Implements real-time decision making based on live accounting packets. This layer handles session creation, resumption, and NAS reboot detection without relying on timers.
Once stale sessions are identified at the database level, FreeRADIUS must use this verified state during authorization. This is where unlang becomes critical — it allows dynamic decision-making based on real-time session validity rather than raw record existence.
File Location
- /etc/freeradius/sites-enabled/default
Section
- server default → accounting { }
Responsibility
- ✔ Real-time decisions
- ✔ Packet-driven logic
- ❌ No time-based cleanup
Complete accounting{} block
# /etc/freeradius/sites-enabled/default accounting { if (Acct-Status-Type == Start) { sql ok } if (Acct-Status-Type == Interim-Update) { update request { Tmp-Integer-0 := "%{sql:nas_last_session_time_query}" } if (&Tmp-Integer-0 && (&Acct-Session-Time < &Tmp-Integer-0)) { radiusd::log_warn("NAS reboot detected %{NAS-IP-Address}") sql:nas_reboot_cleanup_query } sql ok } if (Acct-Status-Type == Stop) { sql ok } }
Cron Layer (Time-Based Maintenance Only)
Practical Example
Assume the NAS is configured to send Interim-Updates every 5 minutes.
• Session start time: 10:00
• Last Interim-Update received: 10:25
• Current time: 10:45
Since no updates were received for 20 minutes (4× the interim interval), the session can be confidently classified as stale and excluded from active session counts.
NOTE: CRON intervals
In production ISP environments, such cleanup jobs are typically executed every 5 to 15 minutes. Running them too frequently can increase database load, while long intervals delay user recovery. The exact timing should be aligned with the configured Interim-Update interval.
Cron Script #1 — Mark STALE Sessions
File:
- /usr/local/sbin/radius/mark-stale-sessions.sh
- Concept/Logic/Purpose: Periodically identifies sessions that stopped sending Interim-Updates but may still be valid. Sessions are marked STALE instead of being disconnected, allowing safe recovery.
- Does NOT disconnect users
#!/bin/bash mysql -u radius -p'PASSWORD' radiusdb <<EOF UPDATE radacct SET session_state = 'STALE' WHERE acctstoptime IS NULL AND session_state = 'ACTIVE' AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 15; EOF
Note: This query intentionally avoids deleting records blindly. Instead, it relies on time-based verification to determine whether a session has genuinely stopped sending updates. This approach prevents accidental cleanup of slow or temporarily delayed sessions and ensures billing and audit accuracy.
Crontab Entry:
- */5 * * * * /usr/local/sbin/radius/mark-stale-sessions.sh
Cron Script #2 — Cleanup Lost Sessions
Concept/Logic/Purpose: Final cleanup for dead accounting. Performs conservative, time-based cleanup of sessions that will never return.This protects database integrity without interfering with active or recoverable sessions.
File:
- /usr/local/sbin/radius/cleanup-lost-sessions.sh
#!/bin/bash mysql -u radius -p'PASSWORD' radiusdb <<EOF UPDATE radacct SET acctstoptime = acctupdatetime, acctterminatecause = 'Lost-Accounting', session_state = 'CLOSED' WHERE acctstoptime IS NULL AND TIMESTAMPDIFF(MINUTE, acctupdatetime, NOW()) > 120; EOF
Crontab Entry:
- 0 * * * * /usr/local/sbin/radius/cleanup-lost-sessions.sh
What Runs Where (Zero Ambiguity)
| Logic | Location |
| Start / Stop | sites-enabled/default → accounting {} |
| Interim resume | sites-enabled/default → accounting {} |
| NAS reboot detection | sites-enabled/default → accounting {} |
| STALE marking | mark-stale-sessions.sh |
| Final cleanup | cleanup-lost-sessions.sh |
| Simultaneous-Use | queries.conf → simul_count_query |
What MUST NOT Exist
❌ Anonymous cron entries
❌ Logic without file ownership
❌ Session deletion without cause
❌ Time-based reboot assumptions
Session Resume After Temporary Accounting Outage
In our implementation:
- Sessions are marked as STALE, not closed
- acctstoptime remains NULL
- When connectivity restores and the same Acct-Session-Id sends Interim-Updates again:
- FreeRADIUS updates the existing record
- session_state is set back to ACTIVE
- The STALE condition is effectively cleared
- Accounting continues normally
This is exactly the correct behavior.
Step-by-step (what really happens in your setup)
1️⃣ Interim updates stop (temporary NAS↔RADIUS issue)
- Cron runs:
- mark-stale-sessions.sh
- Result in radacct:
acctstoptime = NULL
session_state = STALE
👉 Session is not closed, only flagged as stale.
2️⃣ NAS–RADIUS connectivity is restored
- MikroTik resumes sending:
Acct-Status-Type = Interim-Update
Acct-Session-Id = SAME
3️⃣ FreeRADIUS processes Interim-Update
Inside:
- /etc/freeradius/sites-enabled/default
→ accounting { }
Flow:
- nas_last_session_time_query finds the existing row
(acctstoptime IS NULL ✔) - NAS reboot check:
- Acct-Session-Time has not reset
- No reboot detected ✔
- accounting_update_query executes:
UPDATE radacct SET acctupdatetime = NOW(), acctsessiontime = ..., last_acct_session_time = ..., session_state = 'ACTIVE' WHERE acctsessionid = ? AND acctstoptime IS NULL;
4️⃣ Result in database
| Field | Value |
| acctstoptime | NULL |
| session_state | ACTIVE |
| counters | updated |
| accounting | continuous |
👉 The STALE flag is removed automatically by the Interim-Update.
What does NOT happen (important)
❌ No new row is created
❌ No duplicate session
❌ No accounting reset
❌ No Simultaneous-Use false block
Why this works (core principle)
FreeRADIUS can safely resume a session only if acctstoptime was never set.
OUR design respects this rule.
That is the entire reason STALE exists as a state.
One-line confirmation…
Yes, in this design sessions are marked as STALE (not closed), and when the same accounting session resumes, FreeRADIUS continues updating the existing record and automatically restores the session to ACTIVE.
One operational warning
- If a session is ever closed (acctstoptime set), it can never be resumed — only restarted.
🛡️ Audit Justification Points
Audit Rationale: This session model is designed to ensure that PPPoE session closures are only recorded when there is explicit evidence of termination, either through an accounting stop, verified NAS reboot, or prolonged accounting silence beyond operational thresholds. Temporary outages do not constitute termination, preserving billing integrity and avoiding false positive disconnects.
To make this useful for Enterprise/ISP managers who need to justify these changes to auditors, add a section titled “Audit & Compliance: Why This Matters”.
Audit & Compliance Justification Implementing automated stale session handling is not just an operational fix; it is a data integrity requirement.
AAA Data Integrity (Authentication, Authorization, Accounting): Auditors require that Accounting logs accurately reflect user usage. Leaving stale sessions “open” (with
AcctStopTime IS NULL) falsifies usage duration records, leading to incorrect billing disputes and “ghost” data consumption logs.Revenue Assurance: For prepaid or quota-based ISPs, stale sessions prevent the system from calculating the final session volume. By forcing a closure based on the last known
Interim-Update, we ensure that the billable data matches the actual network activity, preventing revenue leakage or customer overcharging.Traceability & Non-Repudiation: Our customized SQL queries introduce a
acctterminatecause = 'NAS-Reboot'flag. This provides a distinct audit trail, differentiating between a user logging off (User-Request) and a system correction (System-Cleanup), which is critical for forensic analysis during network outages.*
When this logic does NOT apply
- NAS firmware that does not send Interim-Update reliably
- Cases where NAS sends stale Interim-Updates after long outages
- Networks with asymmetric paths and intermittent packet loss
In such cases you may need longer STALE thresholds or secondary reachability checks (SNMP/ICMP).
📌 NOC Operational Expectation
- STALE threshold: 15 minutes
- Final cleanup: 120 minutes
- Alerts for NAS reboot events should be integrated into monitoring (Syslog/SNMP)
- Any unexpected growth in STALE counts must be investigated
⃣General View for Non-Technical Reader
In simple terms, sessions are only ended when there is verified evidence of exit. Temporary network issues are handled without affecting service continuity, ensuring users don’t lose sessions or get billed incorrectly.
Important Caveats
• If Interim-Updates are disabled or misconfigured on the NAS, this method will not work correctly.
• Database latency or replication delay must be considered in large deployments.
• Multi-NAS environments should ensure session verification is NAS-aware to avoid false positives.
Final Operational Principle
Stale PPPoE sessions are not a database anomaly but a natural consequence of real-world network behavior. Treating them as such requires session verification based on accounting activity rather than the mere presence of an open record.
By relying on Interim-Update freshness, SQL-based validation, and unlang-driven authorization logic, ISPs can accurately distinguish between active and defunct sessions. This method ensures Simultaneous-Use enforcement remains fair, prevents unnecessary customer lockouts, and preserves accounting accuracy for billing and audit purposes.
When implemented correctly and aligned with NAS interim update intervals, this approach provides a scalable and production-ready solution for managing PPPoE session state in MikroTik and FreeRADIUS environments.
© Syed Jahanzaib — aacable.wordpress.com

