Why I Split Recon Into Three Jenkins Jobs

TL;DR

One giant job looks simple until it breaks.
Separate jobs make retries, logs, and state much easier to handle.
The handoff between jobs should be files, not hidden assumptions.

The problem with the single-job version

My first instinct was the usual one: just run everything in sequence and call it a pipeline. Discovery feeds probing, probing feeds scanning, and the whole thing ships one summary at the end.

That version works for a while. Then reality shows up:

one stage times out and hides useful output from the earlier stage
reruns repeat expensive work
logs turn into a wall of mixed concerns
small changes become risky because everything is tightly coupled

The core problem is that the job has no boundaries. When something goes wrong, you lose both clarity and control.

What changed after the split

Once I separated the workflow into three Jenkins jobs, the pipeline became easier to operate almost immediately.

Discovery became a state publisher

The discovery job stopped trying to be clever. Its sole responsibility became producing the latest host inventory and publishing that state for the next stage.

That meant I could evaluate discovery quality on its own:

did the host list look sane?
did the diff behave correctly?
did the output belong to scope?
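Those three checks can be sketched in a few lines of Python. This is only an illustration, not the code from the reference repo: the helper names `in_scope` and `diff_hosts`, the sample hostnames, and the suffix-based scope rule are all assumptions.

```python
from typing import Dict, List, Sequence


def in_scope(host: str, scope_domains: Sequence[str]) -> bool:
    """True if host is a scope domain or a subdomain of one (assumed scope rule)."""
    host = host.strip().lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in scope_domains)


def diff_hosts(previous: Sequence[str], current: Sequence[str]) -> Dict[str, List[str]]:
    """Compare two host inventories and report what appeared and what vanished."""
    prev, curr = set(previous), set(current)
    return {"added": sorted(curr - prev), "removed": sorted(prev - curr)}


# Hypothetical inventories from yesterday's and today's discovery runs.
previous = ["a.example.com", "b.example.com"]
current = ["b.example.com", "c.example.com", "evil.example.org"]
scope = ["example.com"]

scoped = [h for h in current if in_scope(h, scope)]
out_of_scope = [h for h in current if not in_scope(h, scope)]
delta = diff_hosts(previous, scoped)

print("out of scope:", out_of_scope)
print("delta:", delta)
```

Matching on `"." + domain` rather than a bare suffix keeps a lookalike such as `notexample.com` from passing the scope check.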

Probing became a change detector

The HTTP stage stopped being just a liveness test. It became the place where I decide what changed enough to matter.

That gave me a much more useful mental model:

discovery finds possible assets
probing tells me what is alive
the diff tells me what deserves attention
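As a rough sketch of that model, the probing delta can be computed by comparing two snapshots of live-service state. The snapshot shape (URL mapped to a small fingerprint of status and title) and the function name `service_delta` are assumptions made for this example; the real fingerprint could include whatever fields matter to you.

```python
from typing import Dict, List

Fingerprint = Dict[str, object]


def service_delta(
    previous: Dict[str, Fingerprint], current: Dict[str, Fingerprint]
) -> Dict[str, List[str]]:
    """Classify services as new, changed, or gone between two probing runs."""
    new = sorted(url for url in current if url not in previous)
    gone = sorted(url for url in previous if url not in current)
    changed = sorted(
        url for url in current if url in previous and current[url] != previous[url]
    )
    return {"new": new, "changed": changed, "gone": gone}


# Hypothetical state from the previous and the current probing run.
previous_state = {
    "https://a.example.com": {"status": 200, "title": "Home"},
    "https://b.example.com": {"status": 200, "title": "Login"},
}
current_state = {
    "https://b.example.com": {"status": 403, "title": "Login"},
    "https://c.example.com": {"status": 200, "title": "Admin"},
}

probe_delta = service_delta(previous_state, current_state)
print(probe_delta)
```

A service that is alive but unchanged lands in none of the three buckets, which is exactly what keeps it out of the next stage.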

Scanning became selective by design

The scan stage became lighter once I stopped feeding it everything and passed it only the services that probing flagged as new or changed. That reduced noise and made the results easier to review.

The lesson here is simple: selective scanning is not an optimization you add later. It is a structural decision that improves the pipeline from the start.
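One way to make that structural decision concrete is to derive the scan input from the probing delta and skip the scan entirely when there is nothing to look at. The delta shape (`new`, `changed`, `gone` keys) and the name `build_scan_input` are assumptions for this sketch, not the layout used in the reference repo.

```python
from typing import Dict, List


def build_scan_input(delta: Dict[str, List[str]]) -> List[str]:
    """Select only new or changed services; services that are gone are never scanned."""
    targets = set(delta.get("new", [])) | set(delta.get("changed", []))
    return sorted(targets)


# Hypothetical delta produced by the probing stage.
example_delta = {
    "new": ["https://c.example.com"],
    "changed": ["https://b.example.com"],
    "gone": ["https://a.example.com"],
}

scan_targets = build_scan_input(example_delta)
if scan_targets:
    print("scan:", scan_targets)
else:
    print("nothing new or changed, skipping scan")
```

An empty target list is a valid result here: a run with no changes should cost nothing in the scan stage.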

The file handoff matters

The jobs communicate through explicit files rather than implicit Jenkins state. That decision matters because it keeps the pipeline debuggable.

If a run looks wrong, I can inspect:

the latest discovered host list
the current live-service state
the delta generated in the last probing run
the scan input used for the last pass

That is far better than relying on console output alone.
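A minimal sketch of such a handoff, assuming JSON files in a shared state directory: the file name `hosts.json`, the directory layout, and the helpers `publish` and `load` are hypothetical. Writing to a temporary file and renaming it means a downstream job never reads a half-written file.

```python
import json
import os
import tempfile
from pathlib import Path


def publish(path: Path, payload) -> None:
    """Write payload as JSON, then atomically move it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2, sort_keys=True)
    os.replace(tmp_name, path)  # atomic when source and target share a filesystem


def load(path: Path, default):
    """Read a handoff file, or return default if the upstream job has not run yet."""
    if not path.exists():
        return default
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


# Demonstration in a throwaway directory; a real job would use a shared workspace path.
with tempfile.TemporaryDirectory() as tmp:
    state_dir = Path(tmp) / "state"
    publish(state_dir / "hosts.json", ["b.example.com", "c.example.com"])
    hosts_roundtrip = load(state_dir / "hosts.json", default=[])
    missing = load(state_dir / "scan_input.json", default=[])

print(hosts_roundtrip, missing)
```

Because the files are plain JSON, inspecting a suspicious run is a matter of opening them, with no Jenkins console digging required.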

Why this model scales better

A split pipeline stays operable as the host count grows, because each stage can run on its own schedule and scope:

discovery can stay daily
probing can run in smaller slices
scanning can trigger only on new or changed services

This is the difference between a pipeline that merely runs and a pipeline that remains usable.

Takeaway

If you are building recon automation in Jenkins, do not start from the question "How do I cram all this into one job?" Start from the question "Where do I want the boundaries?" Clear job boundaries are what make the rest of the system maintainable.

Public reference repo

The sanitized reference repository for this series is here: jenkins-recon. It contains the public-facing example Jenkins jobs and handoff files behind this write-up.