16.1.3.RDF_QUAD and other tables

The main tables of the default RDF storage system are:

create table DB.DBA.RDF_QUAD (
  G IRI_ID,
  S IRI_ID,
  P IRI_ID,
  O any,
  primary key (G,S,P,O) );
create bitmap index RDF_QUAD_OGPS on DB.DBA.RDF_QUAD (O, G, P, S);

Each triple (more correctly, each quad) is represented by one row in RDF_QUAD. The columns represent the graph, subject, predicate and object. The IRI_ID type columns reference RDF_IRI, which translates the internal id to the external name of the IRI. The O column is of type ANY. If the O value is a non-string SQL scalar, such as a number or date or IRI, it is stored in its native binary representation. If it is a "very short" string (20 characters or less), it is also stored "as is". Long strings and RDF literal with non-default type or language are stored as RDF_BOX values. Instance of rdf_box consists of data type, language, the content (or beginning characters of a long content) and a possible reference to RDF_OBJ if the object is too long to be held in-line in this table or should be outlined for free-text indexing.

create table DB.DBA.RDF_PREFIX (
  RP_NAME varchar primary key,
  RP_ID int not null unique );
create table DB.DBA.RDF_IRI (
  RI_NAME varchar primary key,
  RI_ID IRI_ID not null unique );

These two tables store a mapping between internal IRI id's and their external string form. A memory-resident cache contains recently used IRIs to reduce access to this table. Function id_to_iri (in id IRI_ID) returns the IRI by its ID. Function iri_to_id (in iri varchar, in may_create_new_id) returns an IRI_ID for given string; if the string is not used before as an IRI then either NULL is returned or a new ID is allocated, depending on the second argument.

create table DB.DBA.RDF_OBJ (
  RO_ID integer primary key,
  RO_VAL varchar,
  RO_LONG long varchar,
  RO_DIGEST any
)
create index RO_VAL on DB.DBA.RDF_OBJ (RO_VAL)
create index RO_DIGEST on DB.DBA.RDF_OBJ (RO_DIGEST)
;

When an O value of RDF_QUAD is longer than a certain limit or should be free-text indexed, the value is stored in this table. Depending on the length of the value, it goes into the varchar or the long varchar column. The RO_ID is contained in rdf_box object that is stored in the O column. Still, the truncated value of O can be used for determining equality and range matching, even if < and > of closely matching values need to look at the real string in RDF_OBJ. When RO_LONG is used to store very long value, RO_VAL contains a simple checksum of the value, to accelerate search for identical values when the table is populated by new values.

create table DB.DBA.RDF_DATATYPE (
    RDT_IID IRI_ID not null primary key,
    RDT_TWOBYTE integer not null unique,
    RDT_QNAME varchar );

The XML Schema data type of a typed string O represented as 2 bytes in the O varchar value. This table maps this into the broader IRI space where the type URI is given an IRI number.

create table DB.DBA.RDF_LANGUAGE (
    RL_ID varchar not null primary key,
    RL_TWOBYTE integer not null unique );

The varchar representation of a O which is a string with language has a two byte field for language. This table maps the short integer language id to the real language name such as 'en', 'en-US' or 'x-any'.

Note that unlike datatype names, language names are not URIs.

A short integer value can be used in both RDF_DATATYPE and RDF_LANGUAGE tables for two different purposes. E.g. an integer 257 is for 'unspecified datatype' as well as for 'unspecified language'.