6.3. Virtuoso Cluster Operation

Abstract

These sections apply to Virtuoso as of version 6.0.

Clustering primarily offers greatly increased scalability for large databases without requiring application changes. The database is divided over a number of servers, all of which provide transparent access to the same data.

6.3.1. General

Virtuoso can be run in cluster mode where one logical database is served by a collection of server processes spread over a cluster of machines.

The cluster's composition is declared in a cluster.ini file, which must be present in the starting directory of each server in the cluster. This file declares the hosts and listening ports of all processes in the cluster, and identifies which process is the local one and which is the master.

A cluster has a single master process which is the only one allowed to run DDL operations and which is responsible for distributed deadlock resolution. In all other respects, all server processes of the cluster are interchangeable.

The set of processes declared in the cluster.ini files is called the physical cluster.

Each cluster server process has its own database and log files and is solely responsible for these. All configuration fields in virtuoso.ini and related files apply only to the process that owns the file, and clustering does not change their meaning.

Specifically, the SQL client, HTTP, and other listening ports of each process are declared and used as usual. In addition, each cluster server process has a cluster listening port that is used for cluster communications. Only other processes of the same physical cluster may connect to this port. The cluster listener ports of all processes are declared in cluster.ini, and all processes must specify the same information.

cluster.ini fields

Below is a sample cluster.ini file declaring a physical cluster of 4 processes.

[Cluster]
Threads = 100
ThisHost = Host1
Master = Host1
ReqBatchSize = 100
BatchesPerRPC = 4
BatchBufferBytes = 20000
LocalOnly = 2

Host1 = box1:2222
Host2 = box2:2223
Host3 = box3:2224
Host4 = box4:2225

Host1-1 = box1-1:12222
Host2-1 = box2-1:12223
Host3-1 = box3-1:12224
Host4-1 = box4-1:12225

The lines Host1 ... Host4 declare the listening ports of each process. The line ThisHost = Host1 declares that this process is Host1, and hence has its cluster listener at box1:2222. The names box1 - box4 and box1-1 - box4-1 are machine names that must be resolvable in the local context. IP numbers can also be used.

Mentioning a host several times declares additional interfaces for the host. Any of these interfaces may be used for cluster connections to the Virtuoso server at that host. Thus Host1 gives the first interface, Host1-1 the second, and so on. This is useful since servers most often have multiple network interfaces, and Virtuoso balances the traffic among them if multiple interfaces are provided. Each host listens at all the host:port addresses declared for it, and other hosts decide which interface to use based on load.
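The naming scheme described above (Host1 for the first interface, Host1-1 for the second) can be illustrated with a small script that groups the declared interfaces by host. This is only a sketch: it assumes cluster.ini can be read as an ordinary INI file with Python's standard configparser module, and load_cluster is a hypothetical helper name, not part of Virtuoso.

```python
import configparser
from collections import defaultdict

# A shortened copy of the sample cluster.ini shown above.
SAMPLE = """\
[Cluster]
Threads = 100
ThisHost = Host1
Master = Host1

Host1 = box1:2222
Host2 = box2:2223
Host3 = box3:2224
Host4 = box4:2225

Host1-1 = box1-1:12222
Host2-1 = box2-1:12223
Host3-1 = box3-1:12224
Host4-1 = box4-1:12225
"""


def load_cluster(text):
    """Return (this_host, master, interfaces) parsed from cluster.ini text.

    Hypothetical helper for illustration only.
    """
    # Only '=' is treated as a delimiter so that 'box1:2222' stays one value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the original key case (Host1, ThisHost)
    parser.read_string(text)
    section = parser["Cluster"]

    interfaces = defaultdict(list)
    for key, value in section.items():
        if not key.startswith("Host"):
            continue  # skips Threads, ThisHost, Master, ...
        host = key.split("-", 1)[0]         # 'Host1-1' -> 'Host1'
        name, port = value.rsplit(":", 1)   # 'box1-1:12222' -> name, port
        interfaces[host].append((name, int(port)))
    return section["ThisHost"], section["Master"], dict(interfaces)


this_host, master, interfaces = load_cluster(SAMPLE)
print(this_host, master)
print(interfaces["Host1"])
```

For the sample file, the list printed for Host1 contains both box1:2222 and box1-1:12222, which are the two interfaces other hosts may choose between.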

The Threads line gives the maximum number of threads that will be created for serving requests from other hosts of the cluster. This is in addition to any threads reserved in other ini files.

The other fields should be left at the values shown.
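Since every process must declare the same cluster information, it can help to compare the cluster.ini files of two nodes before starting them. The following is a minimal sketch under two assumptions: that the file parses as ordinary INI with Python's configparser, and that ThisHost is the only field expected to differ between nodes. The function names are hypothetical and not part of Virtuoso.

```python
import configparser


def cluster_fields(text):
    """Return the [Cluster] section of cluster.ini text as a plain dict."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the original key case
    parser.read_string(text)
    return dict(parser["Cluster"])


def differing_fields(reference_text, other_text, per_node=("ThisHost",)):
    """List the fields whose values differ between two cluster.ini files.

    Fields named in per_node are ignored, on the assumption that they
    are expected to differ from one node to the next.
    """
    ref = cluster_fields(reference_text)
    other = cluster_fields(other_text)
    keys = (set(ref) | set(other)) - set(per_node)
    return sorted(k for k in keys if ref.get(k) != other.get(k))


HOST1 = """\
[Cluster]
ThisHost = Host1
Master = Host1
Host1 = box1:2222
Host2 = box2:2223
"""

# Same declarations, seen from the second node.
HOST2_OK = HOST1.replace("ThisHost = Host1", "ThisHost = Host2")
# A node whose file disagrees about the listener port of Host2.
HOST2_BAD = HOST2_OK.replace("box2:2223", "box2:2229")

print(differing_fields(HOST1, HOST2_OK))   # []
print(differing_fields(HOST1, HOST2_BAD))  # ['Host2']
```

An empty list means the two files agree on everything except the local host designation.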

Prefix	IRI
n2	http://docs.openlinksw.com/virtuoso/clusteroperation/
schema	http://schema.org/
n5	http://creativecommons.org/licenses/by/4.0/
rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
n4	http://www.openlinksw.com/#
xsdh	http://www.w3.org/2001/XMLSchema#

Subject	Predicate	Object
n2:	rdf:type	schema:TechArticle
n2:	schema:name	6.3. Virtuoso Cluster Operation
n2:	schema:copyrightHolder	_:vb78704
n2:	schema:datePublished	2016-09-09 16:16:54
n2:	schema:description	This chapter describes setting up and operating Virtuoso on a cluster of computers. The section on Virtuoso cluster programming documents the SQL extensions specific to cluster application development. These sections apply to Virtuoso as of version 6.0. Clustering primarily offers greatly increased scalability for large databases without requiring application changes. The database is divided over a number of servers, of which all provide transparent access to the same data.
n2:	schema:headline	6.3. Virtuoso Cluster Operation
n2:	schema:keywords	OpenLink,Virtuoso,database,RDBMS,relational,SQL,RDF,triple store,linked data,linked open data,Big Data
n2:	schema:license	n5:deed.en_US
n2:	schema:publisher	_:vb78703
n2:	schema:url	n2:
_:vb78703	rdf:type	schema:Organization
_:vb78703	schema:name	OpenLink Software
_:vb78703	schema:url	n4:this
_:vb78704	rdf:type	schema:Organization
_:vb78704	schema:name	OpenLink Software
_:vb78704	schema:url	n4:this
