Chunking HTTP Probes for Large Host Lists

TL;DR

Do not probe thousands of hosts in one burst unless you really need to.
Chunking spreads load, limits the cost of a failed run, and keeps each run small enough to inspect.
A simple pointer file is enough to rotate through the full list over time.

The failure mode of full-list probing

Probing a large host list in one run sounds efficient, but it creates several problems at once:

long runtime
uneven network load
harder retries
noisier output when only a small percentage of hosts changed

I wanted the probing stage to finish reliably and to produce results I could actually review. That pushed me toward chunking.

The chunk model

Instead of probing the full resolved list every run, I take one slice of it per scheduled execution.

The basic pattern is:

1. Read the current offset from a pointer file.
2. Select the next block of hosts.
3. Probe that block.
4. Advance the pointer.
5. Wrap to the beginning when the end is reached.

That is not sophisticated, but it is exactly the kind of boring mechanism that survives real use.
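Here is a minimal Python sketch of that rotation. It is an illustration, not the code from the reference repo. The file layout (one host per line, a pointer file holding a single integer) and the names read_offset, next_chunk, run_once, and probe are all assumptions made for the example.

```python
from pathlib import Path


def read_offset(pointer_file: Path) -> int:
    """Return the saved offset, or 0 if the file is missing or not a number."""
    try:
        return int(pointer_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0


def next_chunk(hosts: list[str], offset: int, chunk_size: int) -> tuple[list[str], int]:
    """Return the block starting at `offset` and the offset for the next run."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not hosts:
        return [], 0
    if offset < 0 or offset >= len(hosts):
        offset = 0  # the list shrank or the pointer is stale: start over
    chunk = hosts[offset:offset + chunk_size]
    new_offset = offset + len(chunk)
    if new_offset >= len(hosts):
        new_offset = 0  # wrap to the beginning once the end is reached
    return chunk, new_offset


def run_once(hosts_file: Path, pointer_file: Path, chunk_size: int, probe) -> list[str]:
    """Probe one chunk and advance the pointer.

    `probe` is a hypothetical callable that takes a list of hosts.
    """
    hosts = [line.strip() for line in hosts_file.read_text().splitlines() if line.strip()]
    chunk, new_offset = next_chunk(hosts, read_offset(pointer_file), chunk_size)
    probe(chunk)
    # The pointer is written only after probing returns, so a run that
    # raises leaves the offset unchanged and the same chunk is retried.
    pointer_file.write_text(f"{new_offset}\n")
    return chunk
```

In this sketch the last chunk of a rotation may be shorter than chunk_size. That keeps each rotation aligned to the start of the list, at the cost of one smaller run.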

Why chunking helped

Runtime became predictable

A smaller probe set means each run finishes in a narrow time range. That matters because overlapping scheduled runs are one of the easiest ways to make a Jenkins job flaky.

Diffs became easier to reason about

When one run only touches a portion of the host set, the resulting changes are easier to inspect. I am looking at a small, fresh set of signals instead of a blended dump of everything.

Failures became cheaper

If a run fails, I lose one chunk, not the entire day of probing. Recovery is simpler because the blast radius is smaller.

The tradeoff

Chunking gives up instant full coverage. That is the cost. Each host is probed once per rotation, so a specific host may wait roughly list size divided by chunk size runs before it is probed again.
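The delay is easy to put a number on. The following Python sketch computes how many runs one full rotation takes and how long a host can wait between probes. The function names and the figures in the note below are hypothetical, chosen only to show the arithmetic.

```python
import math


def runs_per_rotation(total_hosts: int, chunk_size: int) -> int:
    """Number of scheduled runs needed to touch every host once."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(total_hosts / chunk_size)


def revisit_delay_hours(total_hosts: int, chunk_size: int, interval_minutes: float) -> float:
    """Time between two probes of the same host, in hours."""
    return runs_per_rotation(total_hosts, chunk_size) * interval_minutes / 60
```

For example, 12,000 hosts in chunks of 500 on a 30-minute schedule gives 24 runs per rotation, so each host is revisited about every 12 hours.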

For my use case, that tradeoff was acceptable because the goals were:

steady monitoring
controlled load
clean deltas

If I needed rapid full visibility after a major change, I could still run an explicit full pass.

Picking a chunk size

Chunk size is not just a performance number. It is an operating decision.

Too small:

coverage rotates too slowly
useful changes take longer to surface

Too large:

runtime drifts upward
logs get noisy again
retries become expensive

The right size is the one that your environment can process comfortably with room for occasional slow hosts and network variance.
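One way to turn that into a starting number is to work backward from the schedule interval. The Python sketch below assumes a simple model that is not taken from the pipeline itself: probes run in fixed-size concurrent waves, every host can take up to the full timeout, and a fraction of the interval is reserved as headroom. All names and parameters are hypothetical.

```python
import math


def worst_case_runtime_seconds(chunk_size: int, timeout_seconds: float, concurrency: int) -> float:
    """Upper bound on runtime if every host in the chunk hits the timeout."""
    waves = math.ceil(chunk_size / concurrency)
    return waves * timeout_seconds


def max_chunk_size(interval_seconds: float, timeout_seconds: float,
                   concurrency: int, headroom: float = 0.5) -> int:
    """Largest chunk whose worst-case runtime fits in the usable part of the interval.

    `headroom` is the fraction of the interval left unused for slow hosts
    and network variance.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = interval_seconds * (1 - headroom)
    waves = int(usable // timeout_seconds)
    return waves * concurrency
```

With a 30-minute interval, a 10-second timeout, 20 concurrent probes, and half the interval kept as headroom, this model allows 1800 hosts per chunk. Treat the result as a ceiling to tune downward from, since it says nothing about log noise or retry cost.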

Why I like the pointer-file approach

I did not need a database or a queue. A tiny state file was enough because the workflow is sequential and the logic is easy to audit.

That fits the whole philosophy of the pipeline: simple state, visible transitions, and no magic hidden in the scheduler.
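The one thing a plain state file can get wrong is a half-written value if the job is killed mid-write. A hedged sketch of a safer write in Python follows: it writes to a temporary file in the same directory and then renames it over the pointer file with os.replace. The function name save_offset and the single-integer file format are assumptions for illustration.

```python
import os
import tempfile
from pathlib import Path


def save_offset(pointer_file: Path, offset: int) -> None:
    """Write the offset so readers see either the old value or the new one."""
    # The temp file lives in the same directory so the final rename stays
    # on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(pointer_file.parent), prefix=pointer_file.name + "."
    )
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f"{offset}\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, pointer_file)
    except BaseException:
        # Do not leave stray temp files next to the pointer file.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

The file stays human-readable, so the current position can still be checked or reset by hand with a text editor.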

Takeaway

Chunking was one of the highest-value changes in the whole setup. It lowered operational pain without making the design more complicated. For recurring HTTP reconnaissance, that is a good trade every time.

Public reference repo

The sanitized reference repository for this series is here: jenkins-recon. It contains the public-facing example chunk rotation logic and file-based state handling behind this write-up.