RootNode configuration options

A RootNode describes the work that a reap is to carry out.

Description

RootNodes are units of work for a reaper. The terminology comes from the original Harvest codebase, where a RootNode was literally that: a single point rooting the tree of URLs to traverse. In Harvest-NG a RootNode is rather more flexible: it is a list of URLs that make up a starting workload, along with control filters that specify how these URLs may be processed and what to do with any additional URLs that result.

Other sections

The following configuration sections are nested inside this one:

  • Postfilters - Postfilters are run after objects are fetched and summarised, to determine if they should be stored
  • Prefilters - Prefilters are run before objects are fetched, to determine if they should be fetched
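
The way a RootNode's URL list and its two kinds of filter fit together can be sketched in a few lines of Python. This is only a conceptual model, not Harvest-NG's implementation or its configuration syntax: the function process_rootnode, the fetch callback, and the filter signatures are all hypothetical names invented for illustration. It also assumes that prefilters are applied to newly discovered URLs as well as the starting ones, and that links are followed even from objects the postfilters reject; the real reaper may behave differently on both points.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical types for illustration; none of these names come from Harvest-NG.
Prefilter = Callable[[str], bool]            # URL -> should it be fetched?
Postfilter = Callable[[str, dict], bool]     # (URL, summary) -> should it be stored?
Fetcher = Callable[[str], Tuple[dict, List[str]]]  # URL -> (summary, discovered URLs)


def process_rootnode(
    urls: Iterable[str],
    prefilters: List[Prefilter],
    postfilters: List[Postfilter],
    fetch: Fetcher,
) -> Dict[str, dict]:
    """Work through a starting list of URLs, returning the stored summaries."""
    queue = deque(urls)   # the starting workload
    seen = set()          # URLs already considered, to avoid loops
    stored = {}

    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)

        # Prefilters run before the fetch and can veto it.
        if not all(accept(url) for accept in prefilters):
            continue

        summary, discovered = fetch(url)

        # Postfilters run after fetching and summarising and can veto storage.
        if all(accept(url, summary) for accept in postfilters):
            stored[url] = summary

        # Assumption: discovered URLs rejoin the workload and are
        # prefiltered like any other URL.
        queue.extend(discovered)

    return stored


# A tiny in-memory "web" standing in for real network fetches.
fake_site = {
    "http://example.org/": (
        {"title": "Home"},
        ["http://example.org/a", "http://other.example/"],
    ),
    "http://example.org/a": ({"title": ""}, ["http://example.org/"]),
}

stored = process_rootnode(
    ["http://example.org/"],
    prefilters=[lambda url: url.startswith("http://example.org/")],
    postfilters=[lambda url, summary: bool(summary["title"])],
    fetch=lambda url: fake_site[url],
)
```

In this run the prefilter keeps the traversal on example.org, so http://other.example/ is never fetched, and the postfilter discards the page with an empty title, leaving only the home page in stored.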

Configuration Options

Url
Either a single URL or a list of URLs for this reaper to gather from.