Harvest::Controller - interface to the spider controller

DESCRIPTION
The

METHODS

$overseer=new Controller($database,$delay,$noims);
Create a new Controller. There should be only one instance of the controller class per spider.
The controller class by itself will do nothing until rootnodes are entered
using the

$overseer->add($rootnode);
Add a new rootnode to the list being fetched by the spider. $rootnode should be an object of type Harvest::Controller::RootNode.

$root->more
Returns TRUE if there are more objects left to fetch.

$obj=$root->next
Returns the next Harvest::Object to fetch. If no objects are available at the current time, it instead returns a time in seconds to sleep until more objects should become available.
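
Since next can hand back either an object or a sleep time, the caller has to tell the two apart. A minimal sketch of one way to do so, assuming the object is a blessed reference and the sleep time is a plain number (neither is stated explicitly here). The helper classify_next is hypothetical and not part of Harvest::Controller.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed looks_like_number);

# Hypothetical helper: decide what a value returned by next() represents.
# Assumption: objects are blessed references, sleep times are plain numbers.
sub classify_next {
    my ($value) = @_;
    return ('object',  $value) if blessed($value);
    return ('sleep',   $value) if defined $value && looks_like_number($value);
    return ('unknown', $value);
}

# Stand-in values for illustration only; no real Harvest::Object is built.
my $fake_obj = bless { url => 'http://example.com/' }, 'Harvest::Object';
my ($kind_a) = classify_next($fake_obj);
my ($kind_b) = classify_next(30);
print "$kind_a $kind_b\n";
```

A caller would sleep for the returned number of seconds in the 'sleep' case and call next again afterwards.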

$root->done($obj)
Should be called with the results of running a Harvest::Reaper fetch operation on the object. This method runs the appropriate post-summarising filters, extracts any URL references contained in the object, filters them and adds them to the workload, and finally adds the object to the database passed to the constructor of this instance of the Controller.
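
The more, next and done calls are meant to be used together in a fetch loop. The following is a sketch of that loop only, not the module's own driver code. StubController, StubObject and fetch are hypothetical stand-ins so that the example runs without the Harvest modules. The real fetch would be a Harvest::Reaper operation, whose call is not shown here.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical stand-in that mimics only the more/next/done calling
# convention described above. It is not the real controller.
{
    package StubController;
    sub new {
        my ($class, @urls) = @_;
        return bless { queue => [@urls], stored => [], waited => 0 }, $class;
    }
    sub more { my ($self) = @_; return @{ $self->{queue} } ? 1 : 0 }
    sub next {
        my ($self) = @_;
        # Pretend nothing is ready on the first call: return a delay in seconds.
        return 0 unless $self->{waited}++;
        return bless { url => shift @{ $self->{queue} } }, 'StubObject';
    }
    sub done {
        my ($self, $obj) = @_;
        push @{ $self->{stored} }, $obj->{url};
        return;
    }
}

# Placeholder for the fetch step; it only marks the object as fetched.
sub fetch { my ($obj) = @_; $obj->{fetched} = 1; return $obj }

my $root = StubController->new('http://example.com/a', 'http://example.com/b');

while ($root->more) {
    my $next = $root->next;
    if (!blessed($next)) {       # a plain number: seconds to wait
        sleep $next;
        next;
    }
    $root->done(fetch($next));   # hand the fetched object back
}

print join(',', @{ $root->{stored} }), "\n";
```

The loop relies on more to decide when to stop, and treats any unblessed return value from next as a delay. Both choices are assumptions drawn from the method descriptions above.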