1.4.1. What is the storage cost per triple?

This depends on the index scheme. With two indices, assuming the graph will always be specified in queries, the cost is 31 bytes per triple.

With four indices, supporting queries where the graph can be left unspecified (i.e., triples from any graph will be considered in query evaluation), the cost is 39 bytes per triple. These numbers were measured with the LUBM validation data set of 121K triples, with no full-text index on literals.
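As a rough sizing aid, these per-triple figures can be scaled linearly to a target triple count. The sketch below does exactly that; the constants are the measured values quoted above, the 100M-triple data set is a hypothetical example, and real results will vary with the shape of the data.

    # Rough index-size estimate from the measured per-triple costs above.
    # Assumes linear scaling; actual figures depend on the data.

    BYTES_PER_TRIPLE_2_INDEX = 31   # graph always specified in queries
    BYTES_PER_TRIPLE_4_INDEX = 39   # graph may be left unspecified

    def index_size_gb(triple_count, bytes_per_triple):
        """Estimated size of the triple indices in gigabytes (2^30 bytes)."""
        return triple_count * bytes_per_triple / 2**30

    n = 100_000_000  # hypothetical data set of 100M triples
    print(f"2 indices: {index_size_gb(n, BYTES_PER_TRIPLE_2_INDEX):.1f} GB")
    print(f"4 indices: {index_size_gb(n, BYTES_PER_TRIPLE_4_INDEX):.1f} GB")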

With four indices and a full-text index on all literals, the Billion Triples Challenge data set, 1115M triples, occupies about 120 GB of database pages. The database file size is larger due to space held in reserve and other factors. 120 GB is the number to use when assessing the RAM-to-disk ratio, i.e., how much RAM the system ought to have in order to provide good response. This data set is a heterogeneous collection including social network data, conversations harvested from the Web, DBpedia, Freebase, etc., with relatively numerous and long text literals.
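For reference, the Billion Triples Challenge figure works out to roughly 110 to 120 bytes per triple once the full-text index is included. The sketch below shows the arithmetic and how the 120 GB working set translates into RAM sizes; the target in-memory fractions are hypothetical illustration values, not a recommendation from this documentation.

    # Working-set and RAM estimate from the BTC figures quoted above.
    TRIPLES = 1_115_000_000           # Billion Triples Challenge data set
    DB_PAGES_BYTES = 120 * 2**30      # ~120 GB of database pages

    bytes_per_triple = DB_PAGES_BYTES / TRIPLES
    print(f"~{bytes_per_triple:.0f} bytes per triple incl. full-text index")

    # Hypothetical targets: keep a given fraction of the pages in memory.
    for fraction in (0.25, 0.5, 1.0):
        ram_gb = DB_PAGES_BYTES * fraction / 2**30
        print(f"{int(fraction * 100)}% of working set in RAM: {ram_gb:.0f} GB")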

These numbers do not include any database page stream compression such as gzip. Such compression does not save RAM, because cached pages must be kept uncompressed, but it does cut disk usage roughly in half.