Scheduling and Jobs
Collection plans run on top of NetBox's JobRunner framework. Each plan
gets its own CollectionJobRunner (defined in
netbox_facts/jobs.py).
Manual runs
From the plan detail page, click Run (or POST to
/api/plugins/facts/collectionplans/<id>/run/). The view calls
CollectionPlan.enqueue_collection_job(request) which:
- Refuses to enqueue if status is already queued or working (OperationNotSupported -> HTTP 409).
- Picks the user (run_as if the requester is a superuser and the field is set, otherwise the requester).
- Calls CollectionJobRunner.enqueue() with the plan as instance and queue_name=plan.priority (one of high, default, low).
enqueue() injects the plugin's job_timeout (default 1800s) when the
caller does not pass one, then sets the plan's status to queued.
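
The checks above can be sketched in plain Python. This is a simplified stand-in, not the plugin's code: the `Plan` and `User` dataclasses, the initial status value `"new"`, and the returned dictionary are assumptions made for illustration. Only the guard, the user selection, and the timeout fallback follow the description.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_JOB_TIMEOUT = 1800  # documented plugin default, in seconds


class OperationNotSupported(Exception):
    """Stand-in for the exception that the view reports as HTTP 409."""


@dataclass
class User:
    name: str
    is_superuser: bool = False


@dataclass
class Plan:
    status: str = "new"        # hypothetical initial status
    priority: str = "default"  # one of "high", "default", "low"
    run_as: Optional[User] = None


def enqueue_collection_job(
    plan: Plan, requester: User, job_timeout: Optional[int] = None
) -> dict:
    # 1. Refuse to enqueue a plan that is already queued or running.
    if plan.status in ("queued", "working"):
        raise OperationNotSupported("plan is already queued or working")

    # 2. Only a superuser may run the job as the plan's run_as user.
    user = plan.run_as if (requester.is_superuser and plan.run_as) else requester

    # 3. Fall back to the plugin-wide timeout, then mark the plan queued.
    timeout = DEFAULT_JOB_TIMEOUT if job_timeout is None else job_timeout
    plan.status = "queued"
    return {"user": user.name, "queue_name": plan.priority, "job_timeout": timeout}
```

A second call for the same plan raises `OperationNotSupported` until the plan leaves the queued or working state.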
Recurring runs
Set the plan's interval (in minutes). The post_save signal
handle_collection_job_change calls
CollectionJobRunner.enqueue_once(instance=plan, interval=plan.interval, ...),
which is the upstream NetBox idiom that ensures only one scheduled job
exists per plan.
Unsetting the interval, or disabling the plan, deletes any pending scheduled job for that plan.
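
To make the "only one scheduled job per plan" behaviour concrete, here is a toy in-memory model. It is not NetBox's implementation: `ToyScheduler`, `on_plan_saved`, and the `enabled` flag are hypothetical names, and the real handler works with `core.Job` rows rather than a dictionary.

```python
from typing import Dict, Optional


class ToyScheduler:
    """In-memory stand-in for the table of pending scheduled jobs."""

    def __init__(self) -> None:
        self.pending: Dict[int, int] = {}  # plan id -> interval in minutes
        self.created = 0                   # number of jobs ever scheduled

    def enqueue_once(self, plan_id: int, interval: int) -> None:
        # An identical pending job is left alone.
        if self.pending.get(plan_id) == interval:
            return
        # Otherwise the stale job (if any) is replaced by a new one.
        self.pending[plan_id] = interval
        self.created += 1

    def cancel(self, plan_id: int) -> None:
        self.pending.pop(plan_id, None)


def on_plan_saved(
    scheduler: ToyScheduler, plan_id: int, interval: Optional[int], enabled: bool
) -> None:
    """Approximates what the post_save handler is described as doing."""
    if enabled and interval:
        scheduler.enqueue_once(plan_id, interval)
    else:
        scheduler.cancel(plan_id)
```

Saving a plan twice with the same interval leaves a single pending job, while changing the interval replaces it and clearing the interval removes it.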
Queue priorities
| Priority | RQ queue |
|---|---|
| high | high |
| default | default |
| low | low |
Queues map directly. Configure RQ workers to drain higher-priority queues first if you mix interactive and bulk plans.
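
An RQ worker checks its queues in the order they are listed when it starts, so listing `high` first gives strict priority. The toy below only illustrates that draining order; it does not use RQ, and the job names are made up.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


def next_job(queues: Dict[str, Deque[str]], order: List[str]) -> Optional[str]:
    """Return the next job, preferring the earliest non-empty queue in `order`."""
    for name in order:
        if queues[name]:
            return queues[name].popleft()
    return None


queues = {
    "high": deque(["interactive-plan"]),
    "default": deque(["nightly-plan"]),
    "low": deque(["bulk-plan-a", "bulk-plan-b"]),
}
order = ["high", "default", "low"]  # the order a worker would be started with

drained = []
while (job := next_job(queues, order)) is not None:
    drained.append(job)
```

With this order, `drained` lists the interactive plan first and the bulk plans last.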
Job lifecycle
CollectionJobRunner.run():
- Loads the CollectionPlan from self.job.object_id.
- Calls plan.run(request=request), which constructs a NapalmCollector and iterates devices.
- In a finally, copies the in-memory log onto self.job.data["log"] so the job results page can display it (even on failure).
- Links the most recent FactsReport for the plan to this job (if not already linked).
CollectionPlan.run() updates plan status:
- Set working at the start.
- Set completed and last_run = now on success.
- Set failed and re-raise on exception.
NapalmCollector.execute() updates report status:
- Applied (apply mode) or Pending (detect-only) on clean finish.
- Failed and error_message on uncaught exception.
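
The interplay between the plan status transitions and the runner's `finally` block is sketched below. The `Plan` class, its `log` attribute, and the `collect` callable are hypothetical stand-ins for the plugin's models and for `NapalmCollector.execute()`; report linking is left out.

```python
from datetime import datetime, timezone
from typing import Callable, List, Optional


class Plan:
    def __init__(self, collect: Callable[[List[str]], None]) -> None:
        self.status = "queued"
        self.last_run: Optional[datetime] = None
        self.log: List[str] = []   # in-memory log (hypothetical attribute)
        self._collect = collect    # stand-in for the collector

    def run(self) -> None:
        self.status = "working"
        try:
            self._collect(self.log)
        except Exception:
            self.status = "failed"
            raise  # re-raise so the caller sees the failure
        self.status = "completed"
        self.last_run = datetime.now(timezone.utc)


def run_job(plan: Plan, job_data: dict) -> None:
    """Rough shape of the runner: the log is saved whether or not run() raises."""
    try:
        plan.run()
    finally:
        job_data["log"] = list(plan.log)
```

Because the copy happens in `finally`, a run that fails halfway still leaves its partial log in `job_data["log"]`.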
Stalled detection
If a plan is working but no live Job exists (e.g. a worker crashed),
CollectionPlan.check_stalled() (called from __init__) flips its
status to stalled. This is a hint to operators: stalled plans can be
re-run safely.
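
The rule reduces to a single condition. The function below is a minimal sketch of it: the real method lives on the model and queries for live jobs, whereas here the `has_live_job` flag is passed in as an assumption.

```python
def check_stalled(status: str, has_live_job: bool) -> str:
    """Return the status a plan should have after the stall check."""
    if status == "working" and not has_live_job:
        return "stalled"
    return status
```

Only the combination of `working` and no live job changes the status; every other state passes through untouched.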
Job timeouts
Two limits cap how long a run can take:
- napalm_timeout (plugin-wide, default 60s) is the per-RPC NAPALM timeout. It is injected into optional_args["timeout"] before connecting.
- job_timeout (plugin-wide, default 1800s) is the RQ-level cap on the whole job. Long plans (many devices) may need this raised in PLUGINS_CONFIG.
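
Raising the job-level cap might look like the fragment below in the NetBox configuration file. The plugin key `netbox_facts` is an assumption based on the package name, and 3600 is only an illustrative value.

```python
# NetBox configuration fragment; values are illustrative only.
PLUGINS_CONFIG = {
    "netbox_facts": {          # assumed plugin key, taken from the package name
        "napalm_timeout": 60,  # seconds per NAPALM RPC (documented default)
        "job_timeout": 3600,   # seconds for the whole RQ job, raised from 1800
    },
}
```

Keep `job_timeout` comfortably above the time a full pass over all devices is expected to take, since it caps the whole run rather than a single RPC.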
Inspecting jobs
Each plan's Jobs tab lists every core.Job it has produced. Each job
links to its FactsReport and shows the per-device log written into
Job.data["log"].