TL;DR

  • Split recon into three jobs with one responsibility each.
  • Keep the handoff between jobs file-based and easy to inspect.
  • Prefer steady daily coverage over one giant noisy scan.
  • Design the pipeline so one slow step does not block the others.

Why I built it this way

I did not want a single oversized recon job that mixes enumeration, HTTP probing, and template scanning in one run. That kind of setup is hard to reason about, hard to restart cleanly, and annoying to debug when the output starts getting messy.

The better model was a pipeline with three stages:

  1. Discover candidate hosts.
  2. Probe them and detect what changed.
  3. Scan only what is worth scanning.

That division sounds obvious in hindsight, but it changes the whole operating model. Each stage has a clear input, a clear output, and a smaller failure surface.

The three jobs

1) Discovery

The first job collects hostnames for in-scope targets and writes a clean current-state list. What matters is not the raw count but that the output is normalized enough (consistent case, no duplicates, no blank entries) that later jobs can trust it.

Typical questions this job answers:

  • What hosts exist right now?
  • Which ones are new since the last run?
  • Which ones disappeared?
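A minimal sketch of how those questions could be answered from two host lists. The function names and the normalization rules (lowercasing, stripping a trailing dot, skipping comment lines) are illustrative assumptions, not the layout used in the reference repo.

```python
def normalize_hosts(lines):
    """Lowercase, trim, drop trailing dots, blanks, comments, and duplicates."""
    hosts = set()
    for line in lines:
        host = line.strip().lower().rstrip(".")
        if host and not host.startswith("#"):
            hosts.add(host)
    return hosts


def diff_hosts(previous, current):
    """Return (new, removed) host lists, each sorted for stable output."""
    return sorted(current - previous), sorted(previous - current)


# Hypothetical data: yesterday's list versus today's raw enumeration output.
previous = normalize_hosts(["api.example.com", "old.example.com"])
current = normalize_hosts(
    ["API.example.com.", "  shop.example.com ", "", "api.example.com"]
)

new, removed = diff_hosts(previous, current)
print(new)      # ['shop.example.com']
print(removed)  # ['old.example.com']
```

Sorting the results keeps the written state file stable between runs, which makes it easier to inspect by hand.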

2) Probing

The second job takes a slice of the current host list and checks which services are actually live. This is where I care about status code, page title, and lightweight technology hints.

This job is also where the pipeline becomes useful instead of merely busy. A live-state diff gives me three meaningful categories:

  • New URLs
  • Changed URLs
  • Removed URLs

That is much better than re-reading the same full service list every day.
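The three categories can be sketched as a comparison of two probe snapshots keyed by URL. The record fields (status, title, tech) mirror what the probing job cares about, but the exact shape of the snapshot and the function name are assumptions made for illustration.

```python
def diff_probe_state(previous, current):
    """Bucket URLs into new, changed, and removed between two snapshots.

    Each snapshot maps a URL to a record such as
    {"status": 200, "title": "Home", "tech": ["nginx"]}.
    """
    new = sorted(url for url in current if url not in previous)
    removed = sorted(url for url in previous if url not in current)
    changed = sorted(
        url for url in current
        if url in previous and current[url] != previous[url]
    )
    return {"new": new, "changed": changed, "removed": removed}


# Hypothetical snapshots from two consecutive runs.
previous = {
    "https://api.example.com": {"status": 200, "title": "API", "tech": ["nginx"]},
    "https://old.example.com": {"status": 200, "title": "Legacy", "tech": []},
}
current = {
    "https://api.example.com": {"status": 403, "title": "API", "tech": ["nginx"]},
    "https://shop.example.com": {"status": 200, "title": "Shop", "tech": ["php"]},
}

diff = diff_probe_state(previous, current)
print(diff["new"])      # ['https://shop.example.com']
print(diff["changed"])  # ['https://api.example.com']
print(diff["removed"])  # ['https://old.example.com']
```

Comparing whole records means any change in status code, title, or technology hints marks the URL as changed. A stricter or looser rule is a design choice for the job owner.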

3) Scanning

The third job runs focused checks against the URLs that matter. The key decision here was to stop treating all live URLs as equally interesting. If a host is unchanged and already scanned, I do not need to keep spending the same effort on it on every run.
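One way to express that selection rule, as a sketch only: scan a URL if the probe diff marked it new or changed, or if there is no record of it ever being scanned. The function name, the diff shape, and the idea of a stored set of scanned URLs are assumptions for illustration.

```python
def select_scan_targets(live_urls, diff, scanned_urls):
    """Pick live URLs that are new, changed, or have never been scanned."""
    touched = set(diff["new"]) | set(diff["changed"])
    return sorted(
        url for url in live_urls
        if url in touched or url not in scanned_urls
    )


# Hypothetical inputs: current live URLs, the probe diff, and scan history.
live_urls = [
    "https://api.example.com",
    "https://docs.example.com",
    "https://shop.example.com",
    "https://www.example.com",
]
diff = {
    "new": ["https://shop.example.com"],
    "changed": ["https://api.example.com"],
    "removed": [],
}
scanned_urls = {"https://api.example.com", "https://www.example.com"}

targets = select_scan_targets(live_urls, diff, scanned_urls)
print(targets)
# ['https://api.example.com', 'https://docs.example.com', 'https://shop.example.com']
```

Here www.example.com is skipped because it is unchanged and already scanned, while docs.example.com is included because it has no scan history yet.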

Why Jenkins still works well here

Jenkins is not trendy, but it is still good at this specific kind of automation:

  • scheduled jobs
  • parameterized runs
  • artifact retention
  • visible logs
  • clear handoffs between stages

I do not need a complex orchestrator for a pipeline like this. I need something boring, inspectable, and easy to repair at 2 a.m.

Design choices that mattered more than the tooling

The useful decisions were not about brands or products. They were about control points:

  • each job writes durable state
  • each diff is explicit
  • each stage can run again without manual cleanup
  • the scan stage consumes only the subset that changed

That combination keeps the pipeline understandable over time.
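A sketch of what "durable state" and "run again without manual cleanup" can look like in practice, assuming the state is stored as JSON: write to a temporary file in the same directory, then swap it into place with os.replace so an interrupted run never leaves a half-written state file. The file name and helper names are hypothetical.

```python
import json
import os
import tempfile


def write_state(path, state):
    """Write state as JSON via a temp file, then atomically replace the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, sort_keys=True)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file so a failed run leaves nothing behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_state(path, default):
    """Return the stored state, or the default on a first run."""
    if not os.path.exists(path):
        return default
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage with a throwaway directory; a real job would use its workspace.
with tempfile.TemporaryDirectory() as workdir:
    state_file = os.path.join(workdir, "hosts.json")  # hypothetical file name
    first_run = read_state(state_file, default={"hosts": []})
    write_state(state_file, {"hosts": ["api.example.com"]})
    second_run = read_state(state_file, default={"hosts": []})
    leftover = sorted(os.listdir(workdir))

print(first_run)   # {'hosts': []}
print(second_run)  # {'hosts': ['api.example.com']}
print(leftover)    # ['hosts.json']
```

Because the temp file lives in the same directory as the target, the replace stays on one filesystem, which is what makes the swap atomic.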

What I would do again

I would absolutely keep the three-job split. It gives me cleaner logs, simpler recovery, and far less noise to read through in the results. If you are building a recon pipeline in Jenkins, that separation is the first thing I would recommend.

Public reference repo

The sanitized reference repository for this series is here: jenkins-recon. It contains the public-facing example files, job layouts, and state-file patterns that sit behind these posts.