
Filesystem Analysis & Disk Cleanup

Filesystem Analysis & Disk Cleanup provides deep visibility into what is consuming disk space across your fleet and gives you safe, auditable tools to reclaim it. The system combines a parallel-scanning agent worker pool, resumable scan state, and a preview-before-execute cleanup workflow to help administrators identify and resolve disk space issues at scale — without risking accidental data loss.

The feature operates in two phases: analysis (scan the filesystem and build a snapshot of what is consuming space) and cleanup (preview safe candidates, then execute deletion of approved items). Both phases are available through the REST API, the dashboard UI, and the AI assistant.


Disk space exhaustion is one of the most common and disruptive issues in managed fleets. A single device running out of disk can cause application crashes, failed updates, corrupted databases, and user complaints. Across a fleet of thousands of devices, the problem multiplies — temp files accumulate silently, browser caches grow unchecked, old downloads pile up, and log files go unrotated.

Filesystem Analysis addresses this by:

  • Scanning deeply with a parallel worker pool that can traverse millions of filesystem entries efficiently.
  • Resuming interrupted scans so that long-running analysis on large volumes is not lost if the agent restarts or the connection drops.
  • Categorizing disk usage into actionable groups: largest files, largest directories, temp file accumulation, old downloads, unrotated logs, trash usage, and duplicate candidates.
  • Providing safe cleanup with a strict preview-then-execute workflow that only targets known-safe categories.

A filesystem scan is initiated by sending a POST request to the scan endpoint with a target path and optional configuration. The API queues a filesystem_analysis command to the agent via WebSocket (preferred for immediate dispatch) or the command queue.

The agent-side scanner uses a worker pool to parallelize directory traversal. Multiple goroutines walk the filesystem concurrently, each processing a subtree and reporting results back to a coordinator. This is critical for scanning large volumes — a serial scan of a 2TB disk with millions of files can take over an hour, while a parallelized scan with 8-16 workers can complete in minutes.

The system supports three scan strategies that control how much of the filesystem is traversed:

| Strategy | Behavior | When to Use |
| --- | --- | --- |
| `baseline` | Full recursive scan from the specified path. Scans every directory up to `maxDepth`. | First scan on a device, or when disk usage has changed significantly. |
| `incremental` | Scans only the "hot directories" that changed significantly since the last baseline. | Routine follow-up scans to detect new accumulation without rescanning the entire volume. |
| `auto` | The API automatically selects `baseline` or `incremental` based on scan state. | Default. Recommended for most use cases. |

When using auto strategy, the API evaluates the following conditions to determine the scan mode:

  1. If the scan targets a non-root path (not C:\ or /), a baseline scan is always used.

  2. If a previous scan was interrupted and checkpoint data exists, a baseline scan resumes from the checkpoint.

  3. If no baseline has ever been completed for this device, a baseline scan is started.

  4. If the current disk usage percentage has changed by more than 3% since the last baseline, a baseline scan is triggered.

  5. If hot directories exist from the previous scan, an incremental scan targets only those directories.
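The rule order above can be sketched as a single selection function. This is an illustrative Go sketch, not the API's actual code; the `scanState` field names are assumptions modeled on the scan-state fields described in this document:

```go
package main

import (
	"fmt"
	"math"
)

// scanState models the per-device state the auto rules consult.
// Field names are illustrative, not the actual schema.
type scanState struct {
	hasCheckpoint       bool
	baselineCompleted   bool
	lastDiskUsedPercent float64
	hotDirectories      []string
}

// pickScanMode applies the five auto-strategy rules in order.
func pickScanMode(path string, st scanState, diskUsedPercent float64) string {
	isRoot := path == `C:\` || path == "/"
	switch {
	case !isRoot:
		return "baseline" // 1: non-root targets always get a baseline
	case st.hasCheckpoint:
		return "baseline" // 2: resume the interrupted baseline
	case !st.baselineCompleted:
		return "baseline" // 3: no baseline has ever completed
	case math.Abs(diskUsedPercent-st.lastDiskUsedPercent) > 3:
		return "baseline" // 4: usage drifted more than 3%
	case len(st.hotDirectories) > 0:
		return "incremental" // 5: rescan only the hot directories
	default:
		return "baseline" // fallback when nothing is incremental-eligible (assumption)
	}
}

func main() {
	st := scanState{
		baselineCompleted:   true,
		lastDiskUsedPercent: 70,
		hotDirectories:      []string{`C:\Users\admin\Downloads`},
	}
	fmt.Println(pickScanMode(`C:\`, st, 71))      // incremental (within 3% of baseline)
	fmt.Println(pickScanMode(`C:\`, st, 76))      // baseline (usage drifted 6%)
	fmt.Println(pickScanMode(`C:\Users`, st, 71)) // baseline (non-root path)
}
```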

Long-running baseline scans store checkpoint data in the device_filesystem_scan_state table. The checkpoint records which directories have been scanned and which are still pending. If a scan is interrupted (agent restart, network drop, timeout), the next scan with auto or baseline strategy will automatically resume from the checkpoint rather than starting over.

The scan state table tracks:

| Field | Type | Description |
| --- | --- | --- |
| `deviceId` | UUID | Primary key. One state record per device. |
| `lastRunMode` | string | `baseline` or `incremental`. |
| `lastBaselineCompletedAt` | timestamp | When the last full baseline scan finished. |
| `lastDiskUsedPercent` | real | Disk usage percentage at the time of the last baseline. Used to decide when a full rescan is needed. |
| `checkpoint` | JSONB | Pending directories and depth information for resume. |
| `aggregate` | JSONB | Accumulated scan results from completed subtrees. |
| `hotDirectories` | JSONB | Directories identified as high-churn for incremental scans (up to 24). |
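To make the resume mechanics concrete, here is a sketch of how checkpoint data might round-trip through the JSONB column. The field layout is an assumption for illustration only; the actual checkpoint format is internal to the system:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkpoint is one plausible shape for the checkpoint JSONB column:
// which directories were finished and which still need scanning.
// This layout is an assumption, not the documented schema.
type checkpoint struct {
	CompletedDirs []string       `json:"completedDirs"`
	PendingDirs   []string       `json:"pendingDirs"`
	Depths        map[string]int `json:"depths"`
}

// resumeWork decodes a stored checkpoint and returns the directories a
// resumed baseline scan should start from.
func resumeWork(raw []byte) ([]string, error) {
	var cp checkpoint
	if err := json.Unmarshal(raw, &cp); err != nil {
		return nil, err
	}
	return cp.PendingDirs, nil
}

func main() {
	// Simulate an interrupted scan: C:\Users finished, two dirs pending.
	cp := checkpoint{
		CompletedDirs: []string{`C:\Users`},
		PendingDirs:   []string{`C:\Windows`, `C:\ProgramData`},
		Depths:        map[string]int{`C:\Windows`: 1, `C:\ProgramData`: 1},
	}
	raw, _ := json.Marshal(cp) // what would be persisted in the checkpoint column
	pending, _ := resumeWork(raw)
	fmt.Println(pending) // [C:\Windows C:\ProgramData]
}
```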

The scan endpoint accepts the following parameters:

```http
POST /api/v1/devices/:deviceId/filesystem/scan
Content-Type: application/json

{
  "path": "C:\\",
  "strategy": "auto",
  "maxDepth": 32,
  "topFiles": 50,
  "topDirs": 30,
  "maxEntries": 5000000,
  "workers": 8,
  "timeoutSeconds": 300,
  "followSymlinks": false
}
```
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `path` | string | | Absolute path to scan. Required. Max 2048 characters. |
| `strategy` | enum | `auto` | Scan strategy: `auto`, `baseline`, or `incremental`. |
| `maxDepth` | number | | Maximum directory depth to traverse (1-64). |
| `topFiles` | number | | Number of largest files to return (1-500). |
| `topDirs` | number | | Number of largest directories to return (1-200). |
| `maxEntries` | number | | Maximum filesystem entries to scan (1,000 to 25,000,000). |
| `workers` | number | | Number of parallel scan workers (1-32). |
| `timeoutSeconds` | number | 300 (baseline) / 120 (incremental) | Scan timeout in seconds (5-900). |
| `followSymlinks` | boolean | | Whether to follow symbolic links during traversal. |

The response returns a 202 Accepted with the command ID and scan mode:

```json
{
  "success": true,
  "data": {
    "commandId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "status": "queued",
    "createdAt": "2026-02-20T14:30:00.000Z",
    "scanMode": "baseline",
    "strategy": "auto"
  }
}
```

When a scan completes, the results are saved as a filesystem snapshot in the device_filesystem_snapshots table. Snapshots are immutable records of the filesystem state at a point in time.

| Field | Type | Description |
| --- | --- | --- |
| `id` | UUID | Unique snapshot identifier. |
| `deviceId` | UUID | The device that was scanned. |
| `capturedAt` | timestamp | When the snapshot was created. |
| `trigger` | enum | `on_demand` (user-initiated) or `threshold` (automatic). |
| `partial` | boolean | `true` if the scan was interrupted or incomplete. |
| `summary` | JSONB | Aggregate statistics (files scanned, dirs scanned, bytes scanned, max depth, permission denied count). |
| `largestFiles` | JSONB | Top largest files by size (up to 50). |
| `largestDirs` | JSONB | Top largest directories by size (up to 30). |
| `tempAccumulation` | JSONB | Temporary file accumulation by category (bytes per category). |
| `oldDownloads` | JSONB | Files in download directories that have not been accessed recently (up to 200). |
| `unrotatedLogs` | JSONB | Log files that have grown large without rotation (up to 200). |
| `trashUsage` | JSONB | Files in trash/recycle bin (up to 16). |
| `duplicateCandidates` | JSONB | Files that appear to be duplicates based on size and name (up to 200). |
| `cleanupCandidates` | JSONB | All items eligible for safe cleanup (up to 1,000). |
| `errors` | JSONB | Errors encountered during scanning (permission denied, etc., up to 200). |
```http
GET /api/v1/devices/:deviceId/filesystem
```

Returns the most recent snapshot for the device. The response restructures the snapshot data for readability:

```json
{
  "data": {
    "id": "snapshot-uuid",
    "deviceId": "device-uuid",
    "capturedAt": "2026-02-20T14:35:00.000Z",
    "trigger": "on_demand",
    "partial": false,
    "scanMode": "baseline",
    "path": "C:\\",
    "summary": {
      "filesScanned": 1250000,
      "dirsScanned": 85000,
      "bytesScanned": 450000000000,
      "maxDepthReached": 24,
      "permissionDeniedCount": 12
    },
    "topLargestFiles": [...],
    "topLargestDirectories": [...],
    "tempAccumulation": [
      { "category": "browser_cache", "bytes": 2500000000 },
      { "category": "temp_files", "bytes": 1800000000 }
    ],
    "oldDownloads": [...],
    "unrotatedLogs": [...],
    "trashUsage": [...],
    "duplicateCandidates": [...],
    "cleanupCandidates": [...],
    "errors": []
  }
}
```

When incremental scans run after a baseline, the results are merged with the existing snapshot data. The merge logic:

  • Accumulates summary counts (files scanned, dirs scanned, bytes scanned, permission denied count).
  • Takes the maximum for max depth reached.
  • Deduplicates largest files and directories by path, keeping the entry with the larger size.
  • Merges temp accumulation by category, summing byte counts.
  • Deduplicates cleanup candidates, old downloads, unrotated logs, and trash entries by path.
  • Concatenates error lists up to a cap of 200.

Disk cleanup follows a strict two-phase workflow: preview first, then execute. This ensures that no files are deleted without explicit review.

Only items in the following categories are eligible for automated cleanup:

| Category | Description | Examples |
| --- | --- | --- |
| `temp_files` | Operating system and application temporary files. | `%TEMP%`, `/tmp`, app-specific temp directories. |
| `browser_cache` | Web browser cache files. | Chrome, Firefox, Edge, Safari cache directories. |
| `package_cache` | Package manager caches. | npm, pip, Homebrew, Chocolatey, NuGet cache directories. |
| `trash` | Files in the trash or recycle bin. | Windows Recycle Bin, macOS Trash, Linux trash directories. |

The preview builds a cleanup plan from the latest filesystem snapshot without touching any files on the device.

```http
POST /api/v1/devices/:deviceId/filesystem/cleanup-preview
Content-Type: application/json

{
  "categories": ["temp_files", "browser_cache"]
}
```
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `categories` | array | No | Filter to specific categories. If omitted, all safe categories are included. Max 10 entries. |

The response includes the preview plan and creates a cleanup run record with status previewed:

```json
{
  "success": true,
  "data": {
    "cleanupRunId": "run-uuid",
    "snapshotId": "snapshot-uuid",
    "estimatedBytes": 4300000000,
    "candidateCount": 156,
    "categories": [
      { "category": "temp_files", "count": 89, "estimatedBytes": 1800000000 },
      { "category": "browser_cache", "count": 67, "estimatedBytes": 2500000000 }
    ],
    "candidates": [
      {
        "path": "C:\\Users\\admin\\AppData\\Local\\Temp\\old_installer.exe",
        "category": "temp_files",
        "sizeBytes": 450000000,
        "safe": true,
        "reason": "Temporary file older than 30 days",
        "modifiedAt": "2026-01-15T10:00:00.000Z"
      }
    ]
  }
}
```

After reviewing the preview, submit the specific paths you want to delete. Only paths that appear in the latest preview as safe candidates are accepted.

```http
POST /api/v1/devices/:deviceId/filesystem/cleanup-execute
Content-Type: application/json

{
  "paths": [
    "C:\\Users\\admin\\AppData\\Local\\Temp\\old_installer.exe",
    "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Cache"
  ]
}
```
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `paths` | array | Yes | Paths to delete. Must appear among the latest preview's safe candidates. Min 1, max 200 entries. Max 4096 characters per path. |

The API validates each path against the current snapshot’s safe cleanup candidates. Paths not present in the candidate list are silently excluded. For each valid path, a file_delete command is dispatched to the agent with recursive: true.
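The server-side validation step can be sketched as a simple intersection of the requested paths with the snapshot's safe candidates. This is an illustrative Go sketch; the type and function names are assumptions:

```go
package main

import "fmt"

// candidate mirrors one cleanup candidate from the latest snapshot
// (illustrative shape).
type candidate struct {
	Path string
	Safe bool
	Size int64
}

// filterExecutable keeps only the requested paths that appear in the
// snapshot's candidate list with Safe == true. Anything else (unknown
// paths, or candidates outside the safe categories) is silently
// dropped, matching the behavior described above.
func filterExecutable(requested []string, candidates []candidate) []candidate {
	byPath := make(map[string]candidate, len(candidates))
	for _, c := range candidates {
		if c.Safe {
			byPath[c.Path] = c
		}
	}
	out := []candidate{}
	for _, p := range requested {
		if c, ok := byPath[p]; ok {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	cands := []candidate{
		{Path: `C:\Temp\old_installer.exe`, Safe: true, Size: 450_000_000},
		{Path: `C:\Users\admin\Downloads\report.xlsx`, Safe: false, Size: 90_000_000},
	}
	ok := filterExecutable([]string{
		`C:\Temp\old_installer.exe`,            // valid safe candidate
		`C:\Users\admin\Downloads\report.xlsx`, // not safe: excluded
		`C:\Windows\System32\kernel32.dll`,     // never a candidate: excluded
	}, cands)
	fmt.Println(len(ok)) // 1
}
```

Only the intersection is dispatched to the agent, which is why manually constructed paths never reach deletion.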

The response reports per-path results:

```json
{
  "success": true,
  "data": {
    "cleanupRunId": "run-uuid",
    "status": "executed",
    "bytesReclaimed": 4100000000,
    "selectedCount": 2,
    "failedCount": 0,
    "actions": [
      {
        "path": "C:\\Users\\admin\\AppData\\Local\\Temp\\old_installer.exe",
        "category": "temp_files",
        "sizeBytes": 450000000,
        "status": "completed"
      },
      {
        "path": "C:\\Users\\admin\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Cache",
        "category": "browser_cache",
        "sizeBytes": 3650000000,
        "status": "completed"
      }
    ]
  }
}
```

If all cleanup actions fail, the run status is failed. If at least one succeeds, the status is executed.


The AI assistant provides two tools for filesystem analysis and cleanup.

A read-only tool that retrieves the latest filesystem snapshot for a device and explains what is consuming disk space. It can optionally trigger a fresh scan.

| Parameter | Type | Description |
| --- | --- | --- |
| `deviceId` | string (UUID) | The device to analyze. Required. |
| `refresh` | boolean | If `true`, trigger a new scan before returning results. |
| `path` | string | Root path to scan (default: OS root). |
| `maxDepth` | number | Maximum scan depth (1-64). |
| `topFiles` | number | Number of largest files to return (1-500). |
| `topDirs` | number | Number of largest directories to return (1-200). |
| `maxEntries` | number | Maximum filesystem entries to scan (1,000-25,000,000). |
| `workers` | number | Number of parallel scan workers (1-32). |
| `timeoutSeconds` | number | Scan timeout (5-900 seconds). |
| `maxCandidates` | number | Maximum cleanup candidates to return (1-200). |

The AI uses this tool to answer questions like “What is using disk space on this device?”, “Why is the C: drive full?”, and “Show me the largest files on device X.”

disk_cleanup (Tier 1 preview / Tier 3 execute)


A dual-tier tool. Preview mode is Tier 1 (auto-executed, read-only). Execute mode is Tier 3 (requires human approval before deleting files).

| Parameter | Type | Description |
| --- | --- | --- |
| `deviceId` | string (UUID) | The device to clean up. Required. |
| `action` | enum | `preview` or `execute`. Required. |
| `categories` | array | Filter cleanup to specific categories. Optional. |
| `paths` | array | Specific paths to delete (required for `execute`). Min 1, max 200. |
| `maxCandidates` | number | Maximum candidates to return in preview (1-200). |

The filesystem scanner identifies several categories of reclaimable space. Here is what each category covers and where the scanner looks.

temp_files:

  • %TEMP% (user temp directory)
  • C:\Windows\Temp (system temp directory)
  • C:\Windows\Prefetch (prefetch cache)
  • Application-specific temp directories

browser_cache (cache directories for all major browsers):

  • Chrome: User Data/Default/Cache, Code Cache, Service Worker
  • Firefox: cache2 directory in profile
  • Edge: Same structure as Chrome (Chromium-based)
  • Safari: ~/Library/Caches/com.apple.Safari

package_cache:

  • npm: ~/.npm/_cacache
  • pip: ~/.cache/pip (Linux/macOS) or %LOCALAPPDATA%\pip\cache (Windows)
  • Homebrew: ~/Library/Caches/Homebrew
  • Chocolatey: C:\ProgramData\chocolatey\cache
  • NuGet: ~/.nuget/packages

trash:

  • Windows: C:\$Recycle.Bin per-user directories
  • macOS: ~/.Trash
  • Linux: ~/.local/share/Trash

Additional Detected Items (Not Auto-Cleaned)


The scanner also detects the following, which appear in the snapshot but are not eligible for automated cleanup:

  • Old downloads: Files in download directories older than a configurable threshold.
  • Unrotated logs: Log files that have grown beyond expected sizes.
  • Duplicate candidates: Files with matching sizes and names in different locations.
  • Large files: The top N largest individual files on the volume.

These items require manual review and are surfaced in the snapshot for informational purposes.
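The duplicate-candidate heuristic (matching size and name) can be sketched as a grouping pass. This is an illustrative Go sketch under assumed names; note it only flags candidates and does not hash file contents, which is one reason these items need manual review:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// file is a path/size pair from the scan results (illustrative shape).
type file struct {
	Path string
	Size int64
}

// duplicateCandidates groups files by (size, base name) and keeps only
// groups with more than one member.
func duplicateCandidates(files []file) map[string][]string {
	groups := map[string][]string{}
	for _, f := range files {
		key := fmt.Sprintf("%d:%s", f.Size, filepath.Base(f.Path))
		groups[key] = append(groups[key], f.Path)
	}
	for key, paths := range groups {
		if len(paths) < 2 {
			delete(groups, key) // unique (size, name): not a duplicate candidate
		}
	}
	return groups
}

func main() {
	dupes := duplicateCandidates([]file{
		{"/home/u/Documents/budget.xlsx", 52340},
		{"/home/u/Desktop/budget.xlsx", 52340}, // same size + name elsewhere
		{"/home/u/Documents/notes.txt", 1200},  // unique: ignored
	})
	fmt.Println(len(dupes)) // 1
}
```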


Table: device_filesystem_snapshots

| Column | Type | Description |
| --- | --- | --- |
| `id` | UUID | Primary key. |
| `device_id` | UUID | Foreign key to `devices`. |
| `captured_at` | timestamp | When the snapshot was created. |
| `trigger` | enum | `on_demand` or `threshold`. |
| `partial` | boolean | Whether the scan was incomplete. |
| `summary` | JSONB | Aggregate scan statistics. |
| `largest_files` | JSONB | Top largest files. |
| `largest_dirs` | JSONB | Top largest directories. |
| `temp_accumulation` | JSONB | Temp file accumulation by category. |
| `old_downloads` | JSONB | Old files in download directories. |
| `unrotated_logs` | JSONB | Large unrotated log files. |
| `trash_usage` | JSONB | Trash/recycle bin contents. |
| `duplicate_candidates` | JSONB | Potential duplicate files. |
| `cleanup_candidates` | JSONB | All safe cleanup candidates. |
| `errors` | JSONB | Scan errors. |
| `raw_payload` | JSONB | Complete raw agent response. |

Indexed on (device_id, captured_at) for efficient latest-snapshot queries.

Table: device_filesystem_cleanup_runs

| Column | Type | Description |
| --- | --- | --- |
| `id` | UUID | Primary key. |
| `device_id` | UUID | Foreign key to `devices`. |
| `requested_by` | UUID | Foreign key to `users` (who initiated the cleanup). |
| `requested_at` | timestamp | When the cleanup was requested. |
| `approved_at` | timestamp | When the cleanup was approved for execution (null for preview-only runs). |
| `plan` | JSONB | The cleanup plan (snapshot ID, categories, preview data). |
| `executed_actions` | JSONB | Per-path execution results. |
| `bytes_reclaimed` | bigint | Total bytes reclaimed by the cleanup. |
| `status` | enum | `previewed`, `executed`, or `failed`. |
| `error` | text | Error message if the run failed. |

Indexed on (device_id, requested_at) for history queries.

Table: device_filesystem_scan_state

| Column | Type | Description |
| --- | --- | --- |
| `device_id` | UUID | Primary key. Foreign key to `devices`. One row per device. |
| `last_run_mode` | text | `baseline` or `incremental`. |
| `last_baseline_completed_at` | timestamp | When the last full baseline finished. |
| `last_disk_used_percent` | real | Disk usage at last baseline (for delta detection). |
| `checkpoint` | JSONB | Pending directories for scan resume. |
| `aggregate` | JSONB | Accumulated partial results. |
| `hot_directories` | JSONB | High-churn directories for incremental scans. |

All filesystem endpoints are prefixed with /api/v1/devices/:deviceId. Replace :deviceId with a valid device UUID.

| Method | Path | Description | Permission |
| --- | --- | --- | --- |
| GET | `/devices/:id/filesystem` | Get latest filesystem analysis snapshot | `devices.read` |
| POST | `/devices/:id/filesystem/scan` | Trigger a filesystem scan | `devices.execute` |
| POST | `/devices/:id/filesystem/cleanup-preview` | Preview safe cleanup candidates | `devices.read` |
| POST | `/devices/:id/filesystem/cleanup-execute` | Execute cleanup on approved paths | `devices.execute` |

"No filesystem analysis available yet" (404)

No snapshot exists for this device. The device has not been scanned yet. Trigger a scan with POST /devices/:id/filesystem/scan and wait for it to complete before querying the snapshot.

"No filesystem snapshot available. Run a scan first." (404 on cleanup-preview)

The cleanup preview requires an existing snapshot to build the candidate list from. Run a filesystem scan first, then retry the preview.

Large filesystem scans can exceed the default command timeout. Try:

  • Increasing timeoutSeconds (up to 900 seconds / 15 minutes).
  • Reducing maxEntries to limit the scan scope.
  • Reducing maxDepth to avoid deeply nested directory trees.
  • Scanning a subdirectory instead of the root path.

The scan uses resumable state, so a timed-out baseline scan will resume from its checkpoint on the next attempt.

The preview filters candidates by two criteria: the safe flag must be true AND the category must be one of the four safe categories (temp_files, browser_cache, package_cache, trash). If the snapshot contains cleanup candidates but the preview returns none, the candidates may be in non-safe categories (e.g., old downloads, unrotated logs) that require manual review.

Also verify that the categories filter in your request matches the categories present in the snapshot’s candidates.

"No valid cleanup paths selected from latest previewable candidates" (400)

The paths submitted to the execute endpoint do not match any current safe cleanup candidates. This can happen if:

  • A new scan ran between the preview and execute, changing the candidate list.
  • The paths were manually constructed rather than copied from a preview response.
  • The candidates’ safe flag was false (items outside safe categories are excluded).

Re-run the preview and use paths from the fresh response.

The cleanup execute response reports per-path status. Individual paths can fail if:

  • The file was already deleted between preview and execute.
  • The agent does not have permission to delete the file (e.g., a file locked by a running process).
  • The path is on a read-only filesystem.

The overall run status is executed if at least one path succeeded, or failed if every path failed. The bytesReclaimed field reflects only the successfully deleted paths.

Incremental scans only traverse hot directories identified by the previous baseline. If no hot directories were detected (the disk is relatively stable), the incremental scan has little to scan. This is expected behavior. Run a baseline scan to get a comprehensive view.

Scan shows "partial: true" in the snapshot

The scan was interrupted before completing. This can be caused by agent restart, network disconnection, or timeout. The partial snapshot contains valid data for the portions that were scanned. Run another scan to resume from the checkpoint — the system will merge results with the partial data.

The agent process may not have permission to read certain directories (e.g., other users’ home directories, system-protected paths). These are logged in the snapshot’s errors array with the path and error type. Running the agent as root/SYSTEM reduces permission errors but is not always necessary — the most valuable disk usage data is typically in user-accessible directories.