<?xml version="1.0" encoding="ISO-8859-1" ?>
<!--ATOM based XML document generated By OpenLink Virtuoso-->
<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
<atom:id>http://docs.openlinksw.com/virtuoso/clusterprogramming.html</atom:id>
<atom:title>Virtuoso Cluster Programming</atom:title>
<atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogramming.html" type="text/html" rel="alternate" />
<atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogramming.html" type="application/atom+xml" rel="self" />
<atom:subtitle>OpenLink Virtuoso Universal Server: Documentation</atom:subtitle>
 <atom:author>
  <atom:name>virtuoso.docs@openlinksw.com</atom:name>
  <atom:email>virtuoso.docs@openlinksw.com</atom:email>
  </atom:author>
<atom:updated>2008-06-18T13:00:12Z</atom:updated>
<atom:generator>OpenLink Software Documentation Team</atom:generator>
<atom:logo>http://docs.openlinksw.com/virtuoso/../images/misc/logo.jpg</atom:logo>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingsqlexmod.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingsqlexmod.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Cluster SQL Execution Model</atom:title>
  <atom:content type="html">This section explains the basics of how SQL queries work on clustered Virtuoso. Query optimization for a cluster is similar to query optimization for a single process. The main issues of optimization have to do with join order, index choice and join type. Still, the performance characteristics of a distributed memory cluster are radically different from those of a single process database. Namely, the cost of a network round trip between nodes, even if these are only different processes on a shared memory multiprocessor, is between 5 and 50 single row lookups from a big table, supposing the row being sought is in memory. The 5x factor applies within the same machine; the 50x factor applies over 1 Gbit Ethernet.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingseqidenreg.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingseqidenreg.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Sequences, Identity and Registry</atom:title>
  <atom:content type="html">Sequences and identity columns have a cluster-wide scope. Thus, an identity column can be used as a primary key and partitioning column, and the system guarantees that there will be no duplicates. Sequence numbers are signed 64-bit integers. The sequence numbers are locally ascending on each node. When a cluster node first requests a sequence number, it is assigned a block of numbers from which it will assign subsequent numbers. Thus, two nodes will allocate from different ranges. The global order is not necessarily ascending, but the numbers stay unique.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingsqlopt.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingsqlopt.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>SQL Options</atom:title>
  <atom:content type="html">For purposes of debugging or writing stored procedures that are specifically meant to work with local data only, it is useful to disable cluster functionality. This is done with the NO CLUSTER table option. It can be used in the table option clause of a table in a FROM clause or in an UPDATE or DELETE statement. Especially when writing procedures to be called with DAQ, see below, it is necessary to ensure that the procedures will not access data outside of the host running them.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingcallproc.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingcallproc.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Calling Procedures in Cluster</atom:title>
  <atom:content type="html">Normally, all interprocess communication in the cluster is transparent. In special cases, the developer may wish to execute a given procedure on a given host of the cluster. This is typically the case when there is affinity between data and logic. A regular stored procedure or trigger is executed on the host where it is invoked. With the distributed async queue (DAQ) system one can execute procedures on specified remote hosts. Procedures invoked over DAQ are restricted to dealing with data that is held on the host where they execute. Generic procedures or triggers may use any data from anywhere.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingpartfunc.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingpartfunc.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Partition Functions</atom:title>
  <atom:content type="html">Given a key and a set of values, the partition function can determine which cluster nodes hold the value. The table name is the case-sensitive full name of a table as it appears in SYS_KEYS. The key_name is the case-sensitive name of the index. The values are key part values in the index order. The is_update argument, if non-zero, specifies that if the value is stored in multiple places, all are to be returned; otherwise just one is picked at random, preferring the local one if there is a local copy of the partition. The returned value is a list of node numbers, corresponding to the Host&lt;n&gt; entries in the cluster.ini file.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingdpipe.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingdpipe.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Distributed Pipe</atom:title>
  <atom:content type="html">A distributed pipe is a single construct that can be used for map-reduce and stream transformation. It is a further development of the DAQ. A dpipe is an object which accepts a series of input rows and generates an equal number of output rows. It may or may not preserve order and it may or may not be transactional. The input row of a dpipe is a tuple of values. Each element of the tuple has a corresponding transformation. The transformation is expressed as a partitioned SQL function, basically a function callable by daq_call, with arguments specifying the partition where it is to be run. The output row is formed by gathering together the transformation results of each element of the input tuple. Conceptually, this is like a map operation, like running several DAQs, one for each column of the dpipe. A transformation function does not always need to produce a value. It may also produce a second partitioned function call with new arguments, which will be partitioned and scheduled by the dpipe. Since the second function is independently partitioned, this may be used for implementing a reduce phase. This phase may then return a value and/or further functions to be called.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingclandrdf.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingclandrdf.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Cluster and RDF</atom:title>
  <atom:content type="html">The RDF tables are partitioned by default on any fresh clustered database. Thus RDF operations are not affected by clustering. For RDF loading, use the single-threaded load functions DB.DBA.RDF_LOAD_RDFXML and DB.DBA.TTLP. These should essentially always be run in row autocommit mode and without logging. Therefore, call log_enable (2) on the connection before invoking these functions. Running these functions in the default transactional mode will load within the current transaction. This will cause widespread locking and will run out of rollback space after some millions of triples. The default mode has strict transactional semantics, which are generally not relevant in RDF applications.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingvirtdbandrepl.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingvirtdbandrepl.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Cluster, Virtual Database and Replication</atom:title>
  <atom:content type="html">Clustering is independent of the virtual database and of the transactional and snapshot replication mechanisms in Virtuoso. Transactional replication is not supported with clustering. Snapshot replication will work. Virtual database operations work the same way as in a single-process Virtuoso database. All operations on remote tables are done by the cluster node running the SQL statement. For purposes of symmetry, it is desirable to have all the remote data sources defined for all server processes so that they can be used interchangeably.</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogramminglimalpha.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogramminglimalpha.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Limitations of Alpha 6.0</atom:title>
  <atom:content type="html">DDL SQL</atom:content>
 </atom:entry>
 <atom:entry>
  <atom:id>http://docs.openlinksw.com/virtuoso/clusterprogrammingtrbsht.html</atom:id>
  <atom:author>
    <atom:name>virtuoso.docs@openlinksw.com</atom:name>
    <atom:email>virtuoso.docs@openlinksw.com</atom:email>
   </atom:author>
  <atom:link href="http://docs.openlinksw.com/virtuoso/clusterprogrammingtrbsht.html" type="text/html" rel="alternate" />
  <atom:published>2008-06-18T13:00:12Z</atom:published>
  <atom:title>Troubleshooting</atom:title>
  <atom:content type="html">If an operation seems to hang, see the output of status (). Check for the presence of the following conditions:</atom:content>
 </atom:entry>
</atom:feed>