16.10.8. Cartridge Architecture
Extractor Cartridges
An Extractor Cartridge processes a Resource of a given format, extracting RDF according to rules appropriate to that format. External data does not come into play; only the content of the Resource fed to the Sponger is used.
Supported Standard Non-RDF Data Formats
These Cartridges handle open formats: typically community-developed, openly documented, and freely licensed data structures.
Table 16.12.
Supported Vendor-specific Non-RDF Data Formats
These Cartridges handle closed formats: typically proprietary, sometimes undocumented, and possibly licensed to no one except the format originator. Because many of these Cartridges required reverse-engineering of the data format in question, data may sometimes not be parsed as desired or expected.
Table 16.13.
Meta Cartridges
A Meta Cartridge submits a Resource to a third-party Web Service for processing. Returned RDF supplements the RDF generated by Extractor and other Meta Cartridges. Locally generated RDF may also be submitted to the third-party services, instead of or in addition to the original Resource itself.
Default Sponger behavior is for all installed Meta Cartridges to be brought to bear on all submitted Resources:
Table 16.14.
Meta Cartridge Usage via REST Request
Description.vsp underlies the /about/html/ page, and accepts the parameters described below.
Table 16.15.
Parameter | Value | Description | Example |
---|---|---|---|
@Lookup@ | | The type of lookup. | |
@Lookup@ | No value | When no value is given (i.e., @Lookup@= ), everything works as if the parameter were not present. The "Lookup" name is chosen to distinguish between parameters belonging to the URL being processed and parameters for the Sponger. | Refresh the graph with all current cartridges, of either type |
@Lookup@ | 0 | NLP meta only | Execute only NLP meta extraction |
@Lookup@ | -2 | Keywords-based only | Execute only keywords-based meta extraction |
@Lookup@ | x,y... | A list of meta cartridges to be executed, by their unique IDs. The ID column can be found in Conductor -> Linked Data -> Sponger -> Meta Cartridges. | Execute only CNET (ID=19) and NYT: The TimesTags (ID=22) meta cartridges |
refresh | 0, 1, 2, etc. | Usage: cache invalidation. When set to 1 or a larger number (n), adds get:refresh "n" (an explicit refresh interval in seconds) as a directive to the Sponger. A refresh of zero ("0") seconds will make a new graph on the next lookup with the '@Lookup@' parameter value. | Refresh the graph with all current cartridges |
refresh | clean | Usage: overwriting. The 'clean' value explicitly clears the graph, i.e., it causes the Sponger to drop the cache even if it is marked as in progress. Thus, if the cache for a fetched network resource is for some reason left in an inconsistent state (for example, by a shutdown during fetching), 'clean' is required, as it does not check the cache state. Note: use with caution, as other threads may be performing a Network Resource Fetch at the same time. | |
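
The parameters in the table above can be combined into a request URL. The following is a minimal sketch, not the Sponger's own client code: the host `example.com` and the helper name `sponger_url` are hypothetical, and the choice of `&` as separator when the Resource URL already carries a query string is an assumption. Only the `/about/html/` path and the `@Lookup@` and `refresh` parameter names come from the text.

```python
# Hypothetical host; substitute the Virtuoso instance that hosts the Sponger.
SPONGER_BASE = "http://example.com/about/html/"


def sponger_url(resource, lookup=None, refresh=None, base=SPONGER_BASE):
    """Build an /about/html/ request URL with optional Sponger parameters.

    lookup  -- None (parameter omitted), "" (no value), 0, -2,
               or a list/tuple of Meta Cartridge IDs
    refresh -- None (parameter omitted), a number of seconds, or "clean"
    """
    params = []
    if lookup is not None:
        # A list of IDs becomes the comma-separated "x,y..." form.
        if isinstance(lookup, (list, tuple)):
            lookup = ",".join(str(i) for i in lookup)
        params.append(f"@Lookup@={lookup}")
    if refresh is not None:
        params.append(f"refresh={refresh}")

    url = base + resource
    if params:
        # Assumption: append with "&" if the Resource URL already has a query.
        separator = "&" if "?" in resource else "?"
        url += separator + "&".join(params)
    return url


# Execute only CNET (ID=19) and NYT: The TimesTags (ID=22), forcing a new graph.
print(sponger_url("http://www.news.com", lookup=[19, 22], refresh=0))
```

Passing `refresh="clean"` yields the overwriting form described in the last table row; as noted there, it should be used with caution.
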
Meta Cartridges Parametrized Examples
All examples in the table below start from the same Resource, http://www.news.com, and submit it to the Sponger for processing with the single listed Meta Cartridge.
It can be informative to start by seeing what the results would be with no Meta Cartridges at all.
If you have a lot of time to spare, you may want to see what the results would be with all Meta Cartridges combined. Because this request must wait for each of the listed services to respond, it may take quite some time to return.
Table 16.16.
Cartridge | URL Pattern | Example |
---|---|---|
Alchemy | @Lookup@=8&refresh=0 | cURL example |
Amazon Search for products | @Lookup@=13&refresh=0 | cURL example |
BBC | @Lookup@=1665&refresh=0 | cURL example |
BestBuy Search for products | @Lookup@=14&refresh=0 | cURL example |
Bing | @Lookup@=11&refresh=0 | cURL example |
Bit.ly | @Lookup@=915&refresh=0 | cURL example |
CNET | @Lookup@=19&refresh=0 | cURL example |
Crunchbase | @Lookup@=839&refresh=0 | cURL example |
Dapper | @Lookup@=243&refresh=0 | cURL example |
DBpedia | @Lookup@=26&refresh=0 | cURL example |
Delicious Meta | @Lookup@=23&refresh=0 | cURL example |
Discogs | @Lookup@=840&refresh=0 | cURL example |
Document Links | @Lookup@=34&refresh=0 | cURL example |
eBay | @Lookup@=18&refresh=0 | cURL example |
Evri Meta | @Lookup@=3966&refresh=0 | cURL example |
Flickr Search for photos | @Lookup@=16&refresh=0 | cURL example |
Freebase NYTC | @Lookup@=5&refresh=0 | cURL example |
Freebase NYTCF | @Lookup@=4&refresh=0 | cURL example |
Geonames Meta | @Lookup@=24&refresh=0 | cURL example |
Geopoints | @Lookup@=3731&refresh=0 | cURL example |
Get Glue Meta | @Lookup@=25&refresh=0 | cURL example |
Google Search | @Lookup@=1382&refresh=0 | cURL example |
Google Social Graph | @Lookup@=30&refresh=0 | cURL example |
Guardian | @Lookup@=28&refresh=0 | cURL example |
Hoovers | @Lookup@=2&refresh=0 | cURL example |
Journalisted | @Lookup@=3174&refresh=0 | cURL example |
Local Search | @Lookup@=15&refresh=0 | cURL example |
LOD | @Lookup@=21&refresh=0 | cURL example |
MIME Type | @Lookup@=1029&refresh=0 | cURL example |
New York Times | @Lookup@=22&refresh=0 | cURL example |
NPR Meta | @Lookup@=29&refresh=0 | cURL example |
NYT: The Article Search | @Lookup@=9&refresh=0 | cURL example |
NYT: The TimesTags | @Lookup@=22&refresh=0 | cURL example |
OpenCalais | @Lookup@=1&refresh=0 | cURL example |
Oreilly Search for products | @Lookup@=17&refresh=0 | cURL example |
RapLeaf | @Lookup@=2745&refresh=0 | cURL example |
SameAs.org | @Lookup@=3257&refresh=0 | cURL example |
Sindice | @Lookup@=12&refresh=0 | cURL example |
Technorati | @Lookup@=27&refresh=0 | cURL example |
Tesco | @Lookup@=31&refresh=0 | cURL example |
TrueKnowledge | @Lookup@=3967&refresh=0 | cURL example |
| | @Lookup@=4020&refresh=0 | cURL example |
uClassify | @Lookup@=3086&refresh=0 | cURL example |
UMBEL | @Lookup@=6&refresh=0 | cURL example |
Ustream | @Lookup@=3902&refresh=0 | cURL example |
Virtuoso Faceted Web Service | @Lookup@=21&refresh=0 | cURL example |
voID Statistics | @Lookup@=35&refresh=0 | cURL example |
whoisi? | @Lookup@=3052&refresh=0 | cURL example |
World Bank | @Lookup@=3&refresh=0 | cURL example |
XRD | @Lookup@=3650&refresh=0 | cURL example |
Yahoo BOSS | @Lookup@=10&refresh=0 | cURL example |
Yahoo Geocode | @Lookup@=2855&refresh=0 | cURL example |
Yelp Search for business | @Lookup@=20&refresh=0 | cURL example |
Zemanta | @Lookup@=7&refresh=0 | cURL example |
Zillow | @Lookup@=32&refresh=0 | cURL example |
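
As a rough illustration of how the URL patterns in this table might be used, the sketch below builds the request URL for a single Meta Cartridge and wraps the fetch in a function with a generous timeout, since third-party services can be slow to respond. The host `example.com`, the helper names, and the `Accept: text/html` header are assumptions; the Resource and the cartridge IDs are copied from the table and should be confirmed against the ID column in Conductor.

```python
from urllib.request import Request, urlopen

# Hypothetical host; the /about/html/ path and the Resource come from the text.
BASE = "http://example.com/about/html/"
RESOURCE = "http://www.news.com"

# A few Meta Cartridge IDs copied from the table above.
CARTRIDGE_IDS = {
    "OpenCalais": 1,
    "Zemanta": 7,
    "Alchemy": 8,
    "CNET": 19,
    "DBpedia": 26,
}


def example_url(cartridge):
    """Return the URL that runs only the named Meta Cartridge on RESOURCE."""
    return f"{BASE}{RESOURCE}?@Lookup@={CARTRIDGE_IDS[cartridge]}&refresh=0"


def fetch(cartridge, timeout=60):
    """Send the request and return (HTTP status, response body as bytes)."""
    request = Request(example_url(cartridge), headers={"Accept": "text/html"})
    with urlopen(request, timeout=timeout) as response:
        return response.status, response.read()


# Print the URLs without sending any requests.
for name in CARTRIDGE_IDS:
    print(name, example_url(name))
```

Calling `fetch("CNET")` against a real Sponger host would perform the lookup; the loop above only prints the URLs.
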