Name
SERV_QUEUE_TOP — Retrieve target website and store within Virtuoso
Synopsis
WS.WS.
SERV_QUEUE_TOP
(
|
in target varchar , |
in WebDAV_collection varchar , | |
in update integer , | |
in debug integer , | |
in function_hook varchar , | |
in
data
any
) ; |
Description
Web Robot site retrieval can be performed with the WS.WS.SERV_QUEUE_TOP PL function integrated in to the Virtuoso server.
To run multiple walking robots all you simply need to do is kick them off from separate ODBC/SQL connections and all robots will walk together without overlapping.
From a VSP interface, after calling the retrieval function you may call http_flush to keep running tasks in the server and allowing the user agent to continue with other tasks.
Parameters
target
URI to target site.
WebDAV_collection
Local WebDAV collection to copy the content to.
update
Flag to set updatable, can be 1 or 0 for on or off respectably.
debug
Debug flag, must be set to 0
function_hook.
Fully qualified PL function hook name. If not supplied or NULL then the default function will be used.
data
application dependent data, usually an array, is passed to the PL function hook to perform next queue entry extraction. In our example we use an array with names of non-desired sites.
Examples
Example 24.370. Retrieve External Sites
WS.WS.SERV_QUEUE_TOP ( 'www.foo.com', 'sites/www_foo_com', 0, 0, 'DB.DBA.my_hook', vector ('www.skip.me','www.bar.com') );