16.16. RDF Replication

Tables of the RDF storage, such as DB.DBA.RDF_QUAD and DB.DBA.RDF_OBJ, cannot be replicated in the usual way, because their content is cached in memory in special ways and synchronized with values held outside these tables, such as the current values of special sequence objects.

Moreover, the same IRI may have different internal IRI_IDs on different boxes, because the assigned IDs vary when new IRIs appear in the data in a different order. Similarly, the IDs of RDF literals, datatypes and languages will differ, which blocks any attempt at one-to-one replication between RDF storages.

However, a special asynchronous RDF replication makes it possible to configure a "publisher" Virtuoso instance to keep a log of changes in some RDF graphs and to subscribe other Virtuoso instances that replay all these changes.

Configuration functions are quite straightforward.

The RDF graphs to replicate are all the members of the <http://www.openlinksw.com/schemas/virtrdf#rdf_repl_graph_group> graph group. That group can be filled with graphs like any other graph group, but it is better to take advantage of the proper security check made by DB.DBA.RDF_REPL_GRAPH_INS(), which inserts a graph into the group, and DB.DBA.RDF_REPL_GRAPH_DEL(), which removes a graph from the group.

Only publicly readable graphs can be replicated; an error is signalled otherwise. It is better to learn about a security issue as early as possible.
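As a minimal sketch of the group maintenance described above, the calls below add a graph to the replication group and later remove it. The graph IRI is a hypothetical placeholder, and passing the graph IRI as a single string argument is an assumption; check the function reference for the exact signatures.

```sql
-- Hypothetical graph IRI, used for illustration only.
-- Assumption: the function takes the graph IRI as its only argument.
-- An error is expected here if the graph is not publicly readable.
DB.DBA.RDF_REPL_GRAPH_INS ('http://example.com/graphs/products');

-- Later, stop replicating the same graph by removing it from the group.
DB.DBA.RDF_REPL_GRAPH_DEL ('http://example.com/graphs/products');
```

Using these functions instead of editing the graph group directly means the security check runs at the moment the graph is added.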

The DB.DBA.RDF_REPL_START() function starts the RDF replication at the publishing side. It creates a replication "publication" named '__rdf_repl' and makes a log file '__rdf_repl.log' to record changes in replicated graphs. If the replication has already been started then an error is signalled; passing the value 1 for the parameter "quiet" eliminates the error, so the redundant call has no effect at all. While the replication is enabled, the value of the registry variable 'DB.DBA.RDF_REPL' indicates the moment when the replication was started.

The DB.DBA.RDF_REPL_START() function performs a security check before it starts the replication.
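A short sketch of starting the publisher side, under the assumption that "quiet" is the first positional parameter of DB.DBA.RDF_REPL_START(). The registry variable is read with the built-in registry_get() function.

```sql
-- Assumption: "quiet" is the first positional parameter.
-- With quiet = 1, calling this when replication is already running
-- signals no error and changes nothing.
DB.DBA.RDF_REPL_START (1);

-- While replication is enabled, this registry variable holds
-- the moment at which it was started.
SELECT registry_get ('DB.DBA.RDF_REPL');
```

A quiet start of this kind is convenient in setup scripts that may be run more than once against the same publisher.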

The DB.DBA.RDF_REPL_STOP() function stops the RDF replication at the publishing side. It calls repl_unpublish() but does not empty the replication graph group, so the list of graphs selected for replication is kept.

Replication is asynchronous, and the order of insertion and removal operations at the subscriber's side may not match the order at the publisher. As a result, it is not recommended to create several subscriptions that write the changes of several publishers into one common graph. A client-side application can force the synchronization by calling DB.DBA.RDF_REPL_SYNC(), which acts like repl_sync() but for an RDF subscription. DB.DBA.RDF_REPL_SYNC() not only initiates the synchronization but also waits for the end of the subscription, to guarantee that the total effect of INSERT and DELETE operations is correct even if these operations were made in an order that differs from the original one.
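A hedged sketch of forcing synchronization from the subscriber side follows. The argument list (publisher server name, user name, password) is an assumption made by analogy with repl_sync(), and all three values are hypothetical placeholders; consult the function reference before relying on it.

```sql
-- Assumed argument order: publisher server name, user, password.
-- All values are placeholders, not real names or credentials.
-- The call is expected to return only after pending changes are replayed.
DB.DBA.RDF_REPL_SYNC ('publisher_server', 'repl_user', 'repl_password');
```

Because the call waits for replay to finish, queries issued after it should see the combined effect of the publisher's INSERT and DELETE operations, regardless of the order in which they arrived.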
