www.openlinksw.com
docs.openlinksw.com

Book Home

Contents
Preface

RDF Database and SPARQL

Overview
Data Representation
RDF and SPARQL API and SQL
SPARUL -- an Update Language For RDF Graphs
RDF Insert Methods in Virtuoso
Virtuoso Sponger
Dereferencable IRIs and RDF Linked Data
RDF Views -- Mapping Relational Data to RDF
Introduction Rationale Quad Map Patterns, Value and IRI Classes Simple Mapping Example -- Northwind RDF View Configuring RDF Storages Translation Of SPARQL Triple Patterns To Quad Map Patterns Describing Source Relational Tables Function-Based IRI Classes Connection Variabes in IRI Classes Lookup Optimization -- BIJECTION and RETURNS Options Join Optimization -- Declaring IRI Subclasses RDF Metadata Maintenance and Recovery
RDF Inference in Virtuoso
Using Full Text Search in SPARQL
Virtuoso SPARQL Query Service
Business Intelligence Extensions for SPARQL
Debugging SPARQL queries
Virtuoso RDF Performance Tuning
RDF Store Benchmarks
SPARQL Implementation Details
Native RDF Storage Providers

15.8. RDF Views -- Mapping Relational Data to RDF

RDF Views map relational data into RDF and allow customizing RDF representation of locally stored RDF data. To let SPARQL clients access relational data as well as physical RDF graphs in a single query, we introduce a declarative Meta Schema Language for mapping SQL Data to RDF Ontologies. As a result, all types of clients can efficiently access all data stored on the server. The mapping functionality dynamically generates RDF Data Sets for popular ontologies such as SIOC, SKOS, FOAF, and ATOM/OWL without disruption to the existing database infrastructure of Web 1.0 or Web 2.0 solutions. RDF views are also suitable for declaring custom representation for RDF triples, e.g. property tables, where one row holds many single-valued properties.

15.8.1. Introduction

The Virtuoso RDF Views meta schema is a built-in feature of Virtuoso's SPARQL to SQL translator. It recognizes triple patterns that refer to graphs for which an alternate representation is declared and translates these into SQL accordingly. The main purpose of this is evaluating SPARQL queries against existing relational databases. There exists previous work from many parties for rendering relational data as RDF and opening it to SPARQL access. We can mention D2RQ, SPASQL, Squirrel RDF, DBLP and others. The Virtuoso effort differs from these mainly in the following:


15.8.2. Rationale

Since most of the data that is of likely use for the emerging semantic web is stored in relational databases, the argument for exposing this to SPARQL access is clear. We note that historically, SQL access to relational data has essentially never been given to the public outside of the organization. If programmatic access to corporate IS has been available to partners or the public, it has been through dynamic web pages or more recently web services. There are reasons of performance, security, maintainability and so forth for this.

The culture of the emerging semantic web is however taking a different turn. Since RDF and OWL offer a mergeable and queryable model for heterogeneous data, it is more meaningful and maintainable to expose selected data for outside query than it would be with SQL. Advances in hardware make this also less of a performance issue than it would have been in the client-server database era.

In the context of Virtuoso, since Virtuoso is originally a virtual/federated database, incorporating SPARQL to relational mapping is an evident extension of the product's mission as a multi-protocol, multi-platform connector between information systems.


15.8.3. Quad Map Patterns, Value and IRI Classes

In the simplest sense, any relational schema can be rendered into RDF by converting all primary keys and foreign keys into IRI's, assigning a predicate IRI to each column, and an rdf:type predicate for each row linking it to a RDF class IRI corresponding to the table. Then a triple with the primary key IRI as subject, the column IRI as predicate and the column's value as object is considered to exist for each column that is neither part of a primary or foreign key.

Strictly equating a subject value to a row and each column to a predicate is often good but is too restrictive for the general case.

Thus in the most common case the RDF meta schema should consist of independent transformations; the domain of each transformation is a result-set of some SQL SELECT statement and range is a set of triples. The SELECT that produce the domain is quite simple: it does not use aggregate functions, joins and sorting, only inner joins and WHERE conditions. There is no need to support outer joins in the RDF meta schema because NULLs are usually bad inputs for functions that produce IRIs. In the rare cases when NULLs are OK for functions, outer joins can be encapsulated in SQL views. The range of mapping can be described by a SPARQL triple pattern: a pattern field is a variable if it depends on table columns, otherwise it is a constant. Values of variables in the pattern may have additional restrictions on datatypes, when datatypes of columns are known.

This common case of an RDF meta schema is implemented in Virtuoso, with one adjustment. Virtuoso stores quads, not triples, using the graph field (G) to indicate that a triple belongs to some particular application or resource. A SPARQL query may use quads from different graphs without large difference between G and the other three fields of a quad. E.g., variable ?g in expression GRAPH ?g {...} can be unbound. SPARQL has special syntax for "graph group patterns" that is convenient for sets of triple patterns with a common graph, but it also has shorthands for common subject and predicate, so the difference is no more than in syntax. There is only one feature that is specific for graphs but not for other fields: the SPARQL compiler can create restrictions on graphs according to FROM and FROM NAMED clauses.

Virtuoso RDF Views should offer the same flexibility with the graphs as SPARQL addressing physical triples. A transformation cannot always be identified by the graph used for ranges because graph may be composed from SQL data. The key element of the meta schema is a "quad map pattern". A simple quad map pattern fully defines one particular transformation from one set of relational columns into triples that match one SPARQL graph pattern. The main part of quad map pattern is four declarations of "quad map values", each declaration specifies how to calculate the value of the corresponding triple field from the SQL data. The pattern also lists boolean SQL expressions that should be used to filter out unwanted rows of source data (and to join multiple tables if source columns belong to different tables). There are also quad map patterns that group together similar quad patterns but do not specify any real transformation or even prevent unwanted transformations from being used, they are described in "Grouping Map Patterns" below.

Quad map values refer to schema elements of two further types: "IRI classes" and "literal classes".

15.8.3.1. IRI Classes

An IRI class declares that a column or set of columns gets converted into a IRI in a certain way. The conversion of this sort can be declared revertible (bijection) so an IRI can be parsed into original SQL values; this is useful when some equality of an IRI constant and a calculated IRI can be replaced with an equality of a parse result of a constant and an SQL column that is index criteria or simply faster. In addition, the SPARQL optimizer will eliminate redundant conversions if one IRI class is explicitly declared as a subclass of another. The most flexible declaration for conversion consists of specifying functions that assemble and disassemble from IRI into its constituent parts. This is overkill for typical conversions so it is possible to specify only one sprintf-style format string such that sprintf() SQL function will print an IRI using this format and sprintf_inverse() will be able to parse it back.

The use of sprintf_inverse() assumes that the format does not contain fragments like '%s%s' that make it impossible to separate parts of IRI from each other.

In the following, we shall map the Virtuoso users and user roles system tables into the SIOC ontology.

create iri class oplsioc:user_iri "http://myhost/sys/user?id=%d"
  (in uid integer not null) .
create iri class oplsioc:group_iri "http://myhost/sys/group?id=%d"
  (in gid integer not null) .
create iri class oplsioc:membership_iri
  "http://myhost/sys/membership?super=%d&sub=%d"
  (in super integer not null, in sub integer not null) .
create iri class oplsioc:dav_iri "http://myhost%s"
  (in path varchar) .

These IRI classes are used for mapping data from the DB.DBA.SYS_USERS and DB.DBA.SYS_ROLE_GRANTS system tables that are defined in Virtuoso as follows:

create table DB.DBA.SYS_USERS (
  U_ID                integer not null unique,
  U_NAME              char (128) not null primary key,
  U_IS_ROLE           integer default 0,
  U_FULL_NAME         char (128),
  U_E_MAIL            char (128) default ",
  U_ACCOUNT_DISABLED  integer default 1,
  U_DAV_ENABLE        integer default 0,
  U_SQL_ENABLE        integer default 1,
  U_HOME              varchar (128),
. . .
 );

Single record in DB.DBA.SYS_USERS corresponds to a plain user or a group (role). Users and roles are collectively named "grantees". Thus a role may be granted to another role or to a user account. A role grant may be direct (explicit) or assigned by recursion.

create table SYS_ROLE_GRANTS (
  GI_SUPER   integer,
  GI_SUB     integer,
  GI_DIRECT  integer default 1,
. . .
  primary key (GI_SUPER, GI_SUB, GI_DIRECT));

One IRI class usually corresponds to one ontology class, because similar things are usually called similarly. One may wish to use identifiers of ontology classes as identifiers of related IRI classes, to not remember double number of names, e.g. create IRI class mybank:XpressXfer for subjects that will have rdf:type property mybank:XpressXfer made by mapping. That is technically possible but proven to become inconvenient and misleading as application evolves. While RDF types tend to persist, IRI classes may change over time or same subject may get more than one name via more than one IRI class, say, for exports to different systems. It is found to be more convenient to compose names of IRI classes by adding some common prefixes or suffixes to RDF classes (or to table names), say, write create IRI class mybank:XpressXfer_iri.


15.8.3.2. Literal Classes

A "literal class" declares that a column or set of columns gets converted into a literal instead of an IRI. More precisely, the result of conversion can be IRI_ID so it represents an IRI, but in current version of Virtuoso this is supported only for some internal built-in literal classes, not for classes declared by the user. So for user-defined literal class the result of the conversion is an RDF literal even if it is a string representation of a valid IRI.

In any case, a literal class can be used only in quad map values of O fields, because Virtuoso does not support literal values as subjects.

A special case of literal class is the identity class that converts a value from varchar column into an untyped literal and value from column of any other SQL datatype into a typed literal with type from XMLSchema set, i.e. xsd:integer, xsd:dateTime and so on. Columns of types ANY and IRI_ID are not supported.

The SPARQL optimizer knows that RDF literal types are pairwise disjoint so literal classes that produce literals of different types are known to be pairwise disjoint. The optimizer will replace a join on two disjoint literal classes with an empty statement, to simplify the resulting query.


15.8.3.3. Simple Quad Map Patterns

The following declaration of quad map pattern is self-explanatory. The line for object uses identity literal class so there's no need to specify its name.

graph      <http://myhost/sys>
subject    oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID)
predicate  foaf:email
object     DB.DBA.SYS_USERS.U_E_MAIL

The description language also supports SPARQL-style notation that contains less keywords and eliminates duplicate graphs, subjects and predicates. The following add two patterns with constant graph IRI <http://myhost/sys> and subjects are made from column DB.DBA.SYS_USERS.U_ID by oplsioc:user_iri.

graph <http://myhost/sys>
  {
    oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID)
      a sioc:user ;
      oplsioc:name DB.DBA.SYS_USERS.U_FULL_NAME .
  }

15.8.3.4. Assigning Names To Quad Map Patterns

In real applications, quad map patterns should be named, for schema manipulation and keeping debug info readable. Thus it is much better to rewrite the previous example as

create virtrdf:SysUsers as graph <http://myhost/sys>
  {
    oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID)
      a sioc:user
          as virtrdf:SysUserType-User;
      oplsioc:name DB.DBA.SYS_USERS.U_FULL_NAME
          as virtrdf:SysUsersFullName .
  }

Using these names, one may later write, say, drop quad map virtrdf:SysUserType-User.

One name, virtrdf:DefaultQuadMap is reserved. It is an internal quad map pattern used to access "native-form" quads from DB.DBA.RDF_QUAD:

create virtrdf:DefaultQuadMap as
graph rdfdf:default-iid-nonblank (DB.DBA.RDF_QUAD.G)
subject rdfdf:default-iid (DB.DBA.RDF_QUAD.S)
predicate rdfdf:default-iid-nonblank (DB.DBA.RDF_QUAD.P)
object rdfdf:default (DB.DBA.RDF_QUAD.O)

IRI classes from rdfdf:... namespace are also reserved.


15.8.3.5. Grouping Map Patterns

The previous example actually contains three map patterns, not two. The name virtrdf:SysUsers refers to a "group map pattern" that does not define any real transformation of relational data into RDF but helps organize quad map patterns into a tree. Group may contain both quad map patterns and other groups. A group can be manipulated as a whole, e.g. drop quad map virtrdf:SysUsers will remove all three map patterns.



15.8.4. Simple Mapping Example -- Northwind RDF View

Here is example of the basic Northwind RDF Views deployment. The sequence of operations is very common for adding SPARQL access to existing application.

There exist few important questions to answer. Who should have access to data behind RDF View? Should someone have access to other sorts of RDF data but not to the new View? What are applications that should be interoperable with the new RDF data source? Are there any applications that produce similar data but that data sould be kept apart from data made by view? How to ensure that deployment the view will not cause problems for other applications?

First of all, we decide whether the default web service endpoint should have access to the data in question. If it should then we have to grant SELECT priviledges to the account "SPARQL" that is used for the default endpoint; if it should not but some custom edpoint should then grant to the owner account of that account. Granting access is less trivial that it is usual. On one hand, those who can make SQL SELECT statements on application's tables can also make SPARQL queries on RDF View over that tables, because it makes SQL inside. On the other hand, those who do not intend to query that data at all may get unexpected "permission denied" errors on queries that worked fine before adding an RDF View. If SPARQL compiler can not prove that the query can not access data from the view then it will generate SQL code that will access tables behind the view. In some cases permission problems should be resolved by creating RDF View in a separate RDF storage. In this example, data are public:

use DB;

GRANT SELECT ON "Demo"."demo"."Products" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Suppliers" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Shippers" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Categories" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Customers" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Employees" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Orders" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Order_Details" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Countries" TO "SPARQL";
GRANT SELECT ON "Demo"."demo"."Provinces" TO "SPARQL";

Interoperability is the next question. The example is not interoperable with anything so in can provide data of any form, a real application will probably use some ontology from external source. Sometimes data should be converted from internal application's representation to something different (such as metric to imperial or ATT country code to two-character country id); sometimes composed IRIs should follow special rules; function-based IRI classes may help in that cases. As this is the first example, only plain format-string-based IRI classes are used.

We should also ensure that data generated by the new view will not be accidentally mixed with other data of the database. For that purpose the example will use a unique graph name that includes both application name and host name. In addition, the script will drop declarations that might remain from a previous run of the same script. The script is executed many times during the development so erasing previous version is worth writing. It will report an error if there's nothing to erase but it's better than unpredictable errors due to writing new declarations over existing ones.

Note:

Making graph name unique for every host is not needed if the application is supposed to be "local" and nobody will access more than one installation of the application. If this is the case, use some fixed graph IRI, not necessarily starting with hostname at all; this is much more convenient for querying because you don't have to calculate the graph name in each query. With fixed graph in use, it is still possible to clone the RDF View to map to a unique graph as soon as the application become "public" and requires merging data from many installations.

SPARQL drop quad map graph iri("http://^{URIQADefaultHost}^/Northwind")
;
SPARQL drop quad map virtrdf:NorthwindDemo
;
Note:

The ^{URIQADefaultHost}^ macro is replaced with the value of DefaultHost parameter of [URIQA] section of configuration file. The IRI is written as iri("http://^{URIQADefaultHost}^/Northwind"), not as <http://^{URIQADefaultHost}^/Northwind> because macro of this sort works only inside SPARQL string values.

Now it's safe to create IRI classes needed for the view. If these classes are used only in the view we define then it is safe to create all of them in a single statement. If some of them are used across multiple declarations then errors may occur. it is impossible to redefine an IRI class that is in use; the compiler will try to avoid reporting errors by checking whether the new declaration is identical to the existing one and by trying garbage collection in hope that the IRI class is used only in garbage, but errors may occur anyway. Thus it is better to declare "shared" IRI classes by individual statements and group together only "private" IRI classes of a view. If a "class redefinition" error occurs in the middle of a group then "undefined class" errors may occur after because the processing of the group was interrupted before rest of group was not executed. When in trouble, try DB.DBA.RDF_AUDIT_METADATA procedure.

SPARQL
create iri class northwind:Category "http://^{URIQADefaultHost}^/Northwind/Category/%d#this" (in category_id integer not null) .
create iri class northwind:Shipper "http://^{URIQADefaultHost}^/Northwind/Shipper/%d#this" (in shipper_id integer not null) .
create iri class northwind:Supplier "http://^{URIQADefaultHost}^/Northwind/Supplier/%d#this" (in supplier_id integer not null) .
create iri class northwind:Product   "http://^{URIQADefaultHost}^/Northwind/Product/%d#this" (in product_id integer not null) .
create iri class northwind:Customer "http://^{URIQADefaultHost}^/Northwind/Customer/%U#this" (in customer_id varchar not null) .
create iri class northwind:Employee "http://^{URIQADefaultHost}^/Northwind/Employee/%U%U%d#this" (in employee_firstname varchar not null, in employee_lastname varchar not null, in employee_id integer not null) .
create iri class northwind:Order "http://^{URIQADefaultHost}^/Northwind/Order/%d#this" (in order_id integer not null) .
create iri class northwind:CustomerContact "http://^{URIQADefaultHost}^/Northwind/CustomerContact/%U#this" (in customer_id varchar not null) .
create iri class northwind:OrderLine "http://^{URIQADefaultHost}^/Northwind/OrderLine/%d/%d#this" (in order_id integer not null, in product_id integer not null) .
create iri class northwind:Province "http://^{URIQADefaultHost}^/Northwind/Province/%U/%U#this" (in country_name varchar not null, in province_name varchar not null) .
create iri class northwind:Country "http://^{URIQADefaultHost}^/Northwind/Country/%U#this" (in country_name varchar not null) .
create iri class northwind:Flag "http://^{URIQADefaultHost}^%U#this" (in flag_path varchar not null) .
create iri class northwind:dbpedia_iri "http://dbpedia.org/resource/%U" (in uname varchar not null) .
create iri class northwind:EmployeePhoto "http://^{URIQADefaultHost}^/DAV/VAD/demo/sql/EMP%d#this" (in emp_id varchar not null) .
create iri class northwind:CategoryPhoto "http://^{URIQADefaultHost}^/DAV/VAD/demo/sql/CAT%d#this" (in category_id varchar not null) .
;

One IRI class per subject type; format strings begin with same host but different directory names so this will let the compiler to guess the type of subject by the text of IRI. Most of declarations are bijections and may get option (bijection) hint but these format strings are so simple that the compiler may understant it by itself. (northwind:Employee is not a bijection because sprintf_inverse will be unable to split the tail of IRI string and find the boundary between first and last name.)

The final operation is extending the default quad storage with new tree of quad map patterns.

SPARQL
prefix northwind: <http://demo.openlinksw.com/schemas/northwind#>
prefix oplsioc: <http://www.openlinksw.com/schemas/oplsioc#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix sioc: <http://rdfs.org/sioc/ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>

alter quad storage virtrdf:DefaultQuadStorage
from Demo.demo.Products as products
from Demo.demo.Suppliers as suppliers
from Demo.demo.Shippers as shippers
from Demo.demo.Categories as categories
from Demo.demo.Customers as customers
from Demo.demo.Employees as employees
from Demo.demo.Orders as orders
from Demo.demo.Order_Details as order_lines
from Demo.demo.Countries as countries
from Demo.demo.Provinces as provinces
where (^{suppliers.}^.Country = ^{countries.}^.Name)
where (^{customers.}^.Country = ^{countries.}^.Name)
where (^{employees.}^.Country = ^{countries.}^.Name)
where (^{orders.}^.ShipCountry = ^{countries.}^.Name)
{
        create virtrdf:NorthwindDemo as graph iri ("http://^{URIQADefaultHost}^/Northwind") option (exclusive)
        {
                northwind:CustomerContact (customers.CustomerID)
                        a foaf:Person
                                as virtrdf:CustomerContact-foaf_Person .

                northwind:CustomerContact (customers.CustomerID)
                        a northwind:CustomerContact
                                as virtrdf:CustomerContact-CustomerContact;
                        foaf:name customers.ContactName
                                as virtrdf:CustomerContact-contact_name ;
                        foaf:phone customers.Phone
                                as virtrdf:CustomerContact-foaf_phone ;
                        northwind:is_contact_at northwind:Customer (customers.CustomerID)
                                as virtrdf:CustomerContact-is_contact_at ;
                        northwind:country northwind:Country (customers.Country)
                                as virtrdf:CustomerContact-country ;
                        rdfs:isDefinedBy northwind:customercontact_iri (customers.CustomerID) ;
                        rdfs:isDefinedBy northwind:CustomerContact (customers.CustomerID) .

                northwind:Country (customers.Country)
                        northwind:is_country_of
                northwind:CustomerContact (customers.CustomerID) as virtrdf:CustomerContact-is_country_of .

                northwind:Product (products.ProductID)
                        a northwind:Product
                                as virtrdf:Product-ProductID ;
                        northwind:has_category northwind:Category (products.CategoryID)
                                as virtrdf:Product-product_has_category ;
                        northwind:has_supplier northwind:Supplier (products.SupplierID)
                                as virtrdf:Product-product_has_supplier ;
                        northwind:productName products.ProductName
                                as virtrdf:Product-name_of_product ;
                        northwind:quantityPerUnit products.QuantityPerUnit
                                as virtrdf:Product-quantity_per_unit ;
                        northwind:unitPrice products.UnitPrice
                                as virtrdf:Product-unit_price ;
                        northwind:unitsInStock products.UnitsInStock
                                as virtrdf:Product-units_in_stock ;
                        northwind:unitsOnOrder products.UnitsOnOrder
                                as virtrdf:Product-units_on_order ;
                        northwind:reorderLevel products.ReorderLevel
                                as virtrdf:Product-reorder_level ;
                        northwind:discontinued products.Discontinued
                                as virtrdf:Product-discontinued ;
                        rdfs:isDefinedBy northwind:product_iri (products.ProductID) ;
                        rdfs:isDefinedBy northwind:Product (products.ProductID) .

                northwind:Category (products.CategoryID)
                        northwind:category_of northwind:Product (products.ProductID) as virtrdf:Product-category_of .

                northwind:Supplier (products.SupplierID)
                        northwind:supplier_of northwind:Product (products.ProductID) as virtrdf:Product-supplier_of .

                northwind:Supplier (suppliers.SupplierID)
                        a northwind:Supplier
                                as virtrdf:Supplier-SupplierID ;
                        northwind:companyName suppliers.CompanyName
                                as virtrdf:Supplier-company_name ;
                        northwind:contactName suppliers.ContactName
                                as virtrdf:Supplier-contact_name ;
                        northwind:contactTitle suppliers.ContactTitle
                                as virtrdf:Supplier-contact_title ;
                        northwind:address suppliers.Address
                                as virtrdf:Supplier-address ;
                        northwind:city suppliers.City
                                as virtrdf:Supplier-city ;
                        northwind:dbpedia_city northwind:dbpedia_iri(suppliers.City)
                                as virtrdf:Supplier-dbpediacity ;
                        northwind:region suppliers.Region
                                as virtrdf:Supplier-region ;
                        northwind:postalCode suppliers.PostalCode
                                as virtrdf:Supplier-postal_code ;
                        northwind:country northwind:Country(suppliers.Country)
                                as virtrdf:Supplier-country ;
                        northwind:phone suppliers.Phone
                                as virtrdf:Supplier-phone ;
                        northwind:fax suppliers.Fax
                                as virtrdf:Supplier-fax ;
                        northwind:homePage suppliers.HomePage
                                as virtrdf:Supplier-home_page ;
                        rdfs:isDefinedBy northwind:supplier_iri (suppliers.SupplierID) ;
                        rdfs:isDefinedBy northwind:Supplier (suppliers.SupplierID) .

                northwind:Country (suppliers.Country)
                        northwind:is_country_of
                northwind:Supplier (suppliers.SupplierID) as virtrdf:Supplier-is_country_of .

                northwind:Category (categories.CategoryID)
                        a northwind:Category
                                as virtrdf:Category-CategoryID ;
                        northwind:categoryName categories.CategoryName
                                as virtrdf:Category-home_page ;
                        northwind:description categories.Description
                                as virtrdf:Category-description ;
                        foaf:img northwind:CategoryPhoto(categories.CategoryID)
                                as virtrdf:Category-categories.CategoryPhoto ;
                        rdfs:isDefinedBy northwind:category_iri (categories.CategoryID) ;
                        rdfs:isDefinedBy northwind:Category (categories.CategoryID) .

                northwind:CategoryPhoto(categories.CategoryID)
                        a northwind:CategoryPhoto
                                as virtrdf:Category-categories.CategoryPhotoID ;
                        rdfs:isDefinedBy northwind:categoryphoto_iri (categories.CategoryID) ;
                        rdfs:isDefinedBy northwind:CategoryPhoto(categories.CategoryID) .

                northwind:Shipper (shippers.ShipperID)
                        a northwind:Shipper
                                as virtrdf:Shipper-ShipperID ;
                        northwind:companyName shippers.CompanyName
                                as virtrdf:Shipper-company_name ;
                        northwind:phone shippers.Phone
                                as virtrdf:Shipper-phone ;
                        rdfs:isDefinedBy northwind:shipper_iri (shippers.ShipperID) ;
                        rdfs:isDefinedBy northwind:Shipper (shippers.ShipperID) .

                northwind:Customer (customers.CustomerID)
                        a  northwind:Customer
                                as virtrdf:Customer-CustomerID2 ;
                        a  foaf:Organization
                                as virtrdf:Customer-CustomerID ;
                        foaf:name customers.CompanyName
                                as virtrdf:Customer-foaf_name ;
                        northwind:companyName customers.CompanyName
                                as virtrdf:Customer-company_name ;
                        northwind:has_contact northwind:CustomerContact (customers.CustomerID)
                                as virtrdf:Customer-contact ;
                        northwind:country northwind:Country (customers.Country)
                                as virtrdf:Customer-country ;
                        northwind:contactName customers.ContactName
                                as virtrdf:Customer-contact_name ;
                        northwind:contactTitle customers.ContactTitle
                                as virtrdf:Customer-contact_title ;
                        northwind:address customers.Address
                                as virtrdf:Customer-address ;
                        northwind:city customers.City
                                as virtrdf:Customer-city ;
                        northwind:dbpedia_city northwind:dbpedia_iri(customers.City)
                                as virtrdf:Customer-dbpediacity ;
                        northwind:region customers.Region
                                as virtrdf:Customer-region ;
                        northwind:PostalCode customers.PostalCode
                                as virtrdf:Customer-postal_code ;
                        foaf:phone customers.Phone
                                as virtrdf:Customer-foaf_phone ;
                        northwind:phone customers.Phone
                                as virtrdf:Customer-phone ;
                        northwind:fax customers.Fax
                                as virtrdf:Customer-fax ;
                        rdfs:isDefinedBy northwind:customer_iri (customers.CustomerID) ;
                        rdfs:isDefinedBy northwind:Customer (customers.CustomerID) .

                northwind:Country (customers.Country)
                        northwind:is_country_of
                northwind:Customer (customers.CustomerID) as virtrdf:Customer-is_country_of .

                northwind:Employee (employees.FirstName, employees.LastName, employees.EmployeeID)
                        a northwind:Employee
                                as virtrdf:Employee-EmployeeID2 ;
                        a foaf:Person
                                as virtrdf:Employee-EmployeeID ;
                        foaf:surname employees.LastName
                                as virtrdf:Employee-foaf_last_name ;
                        northwind:lastName employees.LastName
                                as virtrdf:Employee-last_name ;
                        foaf:firstName employees.FirstName
                                as virtrdf:Employee-foaf_first_name ;
                        northwind:firstName employees.FirstName
                                as virtrdf:Employee-first_name ;
                        foaf:title employees.Title
                                as virtrdf:Employee-title ;
                        northwind:titleOfCourtesy employees.TitleOfCourtesy
                                as virtrdf:Employee-title_of_courtesy ;
                        foaf:birthday employees.BirthDate
                                as virtrdf:Employee-foaf_birth_date ;
                        northwind:birthday employees.BirthDate
                                as virtrdf:Employee-birth_date ;
                        northwind:hireDate employees.HireDate
                                as virtrdf:Employee-hire_date ;
                        northwind:address employees.Address
                                as virtrdf:Employee-address ;
                        northwind:city employees.City
                                as virtrdf:Employee-city ;
                        northwind:dbpedia_city northwind:dbpedia_iri(employees.City)
                                as virtrdf:Employee-dbpediacity ;
                        northwind:region employees.Region
                                as virtrdf:Employee-region ;
                        northwind:postalCode employees.PostalCode
                                as virtrdf:Employee-postal_code ;
                        northwind:country northwind:Country(employees.Country)
                                as virtrdf:Employee-country ;
                        foaf:phone employees.HomePhone
                                as virtrdf:Employee-home_phone ;
                        northwind:extension employees.Extension
                                as virtrdf:Employee-extension ;
                        northwind:notes employees.Notes
                                as virtrdf:Employee-notes ;
                        northwind:reportsTo northwind:Employee(employees.FirstName, employees.LastName, employees.ReportsTo) where (^{employees.}^.ReportsTo = ^{employees.}^.EmployeeID)
                                as virtrdf:Employee-reports_to ;
                        foaf:img northwind:EmployeePhoto(employees.EmployeeID)
                                as virtrdf:Employee-employees.EmployeePhoto ;
                        rdfs:isDefinedBy northwind:employee_iri (employees.EmployeeID) ;
                        rdfs:isDefinedBy northwind:Employee (employees.FirstName, employees.LastName, employees.EmployeeID) .

                northwind:EmployeePhoto(employees.EmployeeID)
                        a northwind:EmployeePhoto
                                as virtrdf:Employee-employees.EmployeePhotoId ;
                        rdfs:isDefinedBy northwind:employeephoto_iri (employees.EmployeeID) ;
                        rdfs:isDefinedBy northwind:EmployeePhoto (employees.EmployeeID) .

                northwind:Employee (employees.FirstName, employees.LastName, orders.EmployeeID)
                        northwind:is_salesrep_of
                northwind:Order (orders.OrderID) where (^{orders.}^.EmployeeID = ^{employees.}^.EmployeeID) as virtrdf:Order-is_salesrep_of .

                northwind:Country (employees.Country)
                        northwind:is_country_of
                northwind:Employee (employees.FirstName, employees.LastName, employees.EmployeeID) as virtrdf:Employee-is_country_of .

                northwind:Order (orders.OrderID)
                        a northwind:Order
                                as virtrdf:Order-Order ;
                        northwind:has_customer northwind:Customer (orders.CustomerID)
                                as virtrdf:Order-order_has_customer ;
                        northwind:has_salesrep northwind:Employee (employees.FirstName, employees.LastName, orders.EmployeeID) where (^{orders.}^.EmployeeID = ^{employees.}^.EmployeeID)
                                as virtrdf:Customer-has_salesrep ;
                        northwind:has_employee northwind:Employee (employees.FirstName, employees.LastName, orders.EmployeeID) where (^{orders.}^.EmployeeID = ^{employees.}^.EmployeeID)
                                as virtrdf:Order-order_has_employee ;
                        northwind:orderDate orders.OrderDate
                                as virtrdf:Order-order_date ;
                        northwind:requiredDate orders.RequiredDate
                                as virtrdf:Order-required_date ;
                        northwind:shippedDate orders.ShippedDate
                                as virtrdf:Order-shipped_date ;
                        northwind:order_ship_via northwind:Shipper (orders.ShipVia)
                                as virtrdf:Order-order_ship_via ;
                        northwind:freight orders.Freight
                                as virtrdf:Order-freight ;
                        northwind:shipName orders.ShipName
                                as virtrdf:Order-ship_name ;
                        northwind:shipAddress orders.ShipAddress
                                as virtrdf:Order-ship_address ;
                        northwind:shipCity orders.ShipCity
                                as virtrdf:Order-ship_city ;
                        northwind:dbpedia_shipCity northwind:dbpedia_iri(orders.ShipCity)
                                as virtrdf:Order-dbpediaship_city ;
                        northwind:shipRegion orders.ShipRegion
                                as virtrdf:Order-ship_region ;
                        northwind:shipPostal_code orders.ShipPostalCode
                                as virtrdf:Order-ship_postal_code ;
                        northwind:shipCountry northwind:Country(orders.ShipCountry)
                                as virtrdf:ship_country ;
                        rdfs:isDefinedBy northwind:order_iri (orders.OrderID) ;
                        rdfs:isDefinedBy northwind:Order (orders.OrderID) .

                northwind:Country (orders.ShipCountry)
                        northwind:is_ship_country_of
                northwind:Order (orders.OrderID) as virtrdf:Order-is_country_of .

                northwind:Customer (orders.CustomerID)
                        northwind:has_order northwind:Order (orders.OrderID) as virtrdf:Order-has_order .

                northwind:Shipper (orders.ShipVia)
                        northwind:ship_order northwind:Order (orders.OrderID) as virtrdf:Order-ship_order .

                northwind:OrderLine (order_lines.OrderID, order_lines.ProductID)
                        a northwind:OrderLine
                                as virtrdf:OrderLine-OrderLines ;
                        northwind:has_order_id northwind:Order (order_lines.OrderID)
                                as virtrdf:order_lines_has_order_id ;
                        northwind:has_product_id northwind:Product (order_lines.ProductID)
                                as virtrdf:order_lines_has_product_id ;
                        northwind:unitPrice order_lines.UnitPrice
                                as virtrdf:OrderLine-unit_price ;
                        northwind:quantity order_lines.Quantity
                                as virtrdf:OrderLine-quantity ;
                        northwind:discount order_lines.Discount
                                as virtrdf:OrderLine-discount ;
                        rdfs:isDefinedBy northwind:orderline_iri (order_lines.OrderID, order_lines.ProductID) ;
                        rdfs:isDefinedBy northwind:OrderLine (order_lines.OrderID, order_lines.ProductID) .

                northwind:Order (orders.OrderID)
                        northwind:is_order_of
                northwind:OrderLine (order_lines.OrderID, order_lines.ProductID) where (^{orders.}^.OrderID = ^{order_lines.}^.OrderID) as virtrdf:Order-is_order_of .

                northwind:Product (products.ProductID)
                        northwind:is_product_of
                northwind:OrderLine (order_lines.OrderID, order_lines.ProductID) where (^{products.}^.ProductID = ^{order_lines.}^.ProductID) as virtrdf:Product-is_product_of .

                northwind:Country (countries.Name)
                        a northwind:Country
                                as virtrdf:Country-Type2 ;
                        a wgs:SpatialThing
                                as virtrdf:Country-Type ;
                        owl:sameAs northwind:dbpedia_iri (countries.Name) ;
                        northwind:name countries.Name
                                as virtrdf:Country-Name ;
                        northwind:code countries.Code
                                as virtrdf:Country-Code ;
                        northwind:smallFlagDAVResourceName countries.SmallFlagDAVResourceName
                                as virtrdf:Country-SmallFlagDAVResourceName ;
                        northwind:largeFlagDAVResourceName countries.LargeFlagDAVResourceName
                                as virtrdf:Country-LargeFlagDAVResourceName ;
                        northwind:smallFlagDAVResourceURI northwind:Flag(countries.SmallFlagDAVResourceURI)
                                as virtrdf:Country-SmallFlagDAVResourceURI ;
                        northwind:largeFlagDAVResourceURI northwind:Flag(countries.LargeFlagDAVResourceURI)
                                as virtrdf:Country-LargeFlagDAVResourceURI ;
                        wgs:lat countries.Lat
                                as virtrdf:Country-Lat ;
                        wgs:long countries.Lng
                                as virtrdf:Country-Lng ;
                        rdfs:isDefinedBy northwind:country_iri (countries.Name) ;
                        rdfs:isDefinedBy northwind:Country (countries.Name) .

                northwind:Country (countries.Name)
                        northwind:has_province
                northwind:Province (provinces.CountryCode, provinces.Province) where (^{provinces.}^.CountryCode = ^{countries.}^.Code) as virtrdf:Country-has_province .

                northwind:Province (provinces.CountryCode, provinces.Province)
                        a northwind:Province
                                as virtrdf:Province-Provinces ;
                        northwind:has_country_code provinces.CountryCode
                                as virtrdf:has_country_code ;
                        northwind:provinceName provinces.Province
                                as virtrdf:Province-ProvinceName ;
                        rdfs:isDefinedBy northwind:province_iri (provinces.CountryCode, provinces.Province) ;
                        rdfs:isDefinedBy northwind:Province (provinces.CountryCode, provinces.Province) .

                northwind:Province (provinces.CountryCode, provinces.Province)
                        northwind:is_province_of
                northwind:Country (countries.Name) where  (^{countries.}^.Code = ^{provinces.}^.CountryCode) as virtrdf:Province-country_of .
        }.
}.
;

The created RDF View is sufficient for querying relational data via SPARQL but not for accessing data by dereferencing IRIs of subjects. Making IRIs dereferenceable requires configuring HTTP server; that is explained in second part of the example.


15.8.5. Configuring RDF Storages

"Quad Storage" is a named set of quad map patterns. The declaration define input:storage storage-name states that a SPARQL query will be executed using only quad patterns of the given quad storage. Declarations of IRI classes, literal classes and quad patterns are shared between all quad storages of an RDF meta schema but every quad storage contains only a subset of all available quad patterns. Two quad storages are always defined:

Three statements for manipulating storages are

A map pattern can be created only as a part of create quad storage or alter quad storage statement, so initially it is used by exactly one storage. It can be imported to some other storage using directive create map-id using storage source-storage. E.g., declarations of many storages create virtrdf:DefaultQuadMap using storage virtrdf:DefaultQuadStorage.

Only a "top-level" quad map pattern (standalone or a whole group with descendants) can be imported, member of a group can not. The import directive also can not be a part of some group declaration.

The directive drop quad map map-name removes a map from one storage when it appears inside alter quad storage statement. Otherwise it removes the map from all storages. There exists garbage collection for quad map patterns, so any unused map is immediately deleted. A group is deleted with all its descendants.


15.8.6. Translation Of SPARQL Triple Patterns To Quad Map Patterns

When a SPARQL query is compiled into SQL using a quad storage, every triple pattern should become a subquery that retrieves data from relational tables. This subquery is an UNION ALL of joins generated from appropriate quad map patterns. The complete SQL query is composed from these basic subqueries. Thus the first operation of the SQL generation for a triple pattern is searching for quad map patterns that may in principle produce triples that match the triple pattern.

The more restrictions contained in the triple pattern the fewer quad map patterns will be used. A triple pattern graph ?g { ?s ?p ?o } is common enough to invoke all data transformations of the storage. A triple pattern graph <g> { ?s <p> <o> } will usually intersect with the range of only one quad map. Sometimes it is possible to prove that the storage can not contain any data that matches the given triple pattern, hence zero number of members of UNION ALL will result in constantly empty result-set.

The search for quad maps for a given pair of triple pattern and quad map storage is quite simple. The storage is treated as a tree of map patterns where quad map patterns are leafs, grouping patterns are inner nodes and the whole storage is also treated as a grouping pattern that specify no fields and contains all top-level map patterns of the storage.

The tree is traversed from the root, left to right, non-leaf vertex are checked before their children. The check of a vertex consists of up to four field checks, for G, S, P and O. Every field check compares the field definition in the vertex and the corresponding field in the triple pattern, G and G, S and S and so on. Note that a non-leaf vertex defines less than four of its fields, e.g., the root vertex does not define any of its fields and top-level graph map { ... } defines only graph. Checks are performed only for defined fields and return one of three values: "failed", "passed", "full match", according to the following rules:

Table: 15.8.6.1. Matching Triple Field and Vertex Field
Field of vertex Field in triple pattern Result
constant same constant full match
constant different constant failed
constant variable of same type passed
constant variable of different type failed
quad map value constant of same type full match
quad map value constant of different type failed
quad map value of type X variable, X or subtype of X full match
quad map value of type X variable, supertype of X passed
quad map value of type X variable, type does not intersect with X failed

If any of the checks fails, the vertex and all its children are excluded from the rest of processing. Otherwise, if all four fields are defined for the quad map pattern, the map is added to the list of matching map patterns. The difference between "passed" and "full match" is significant only if the map is declared with option (exclusive) If all performed checks return "full match" and option (exclusive) is set then the traverse of the tree is stopped as soon as all children of the vertex are traversed. The most typical use of this option is when the application developer is sure that all triples of a graph belong to his application and they come from his own quad map patterns, not from DB.DBA.RDF_QUAD. This is to prevent the SPARQL compiler from generating redundant subqueries accessing DB.DBA.RDF_QUAD. The declaration may look like

create quad storage <mystorage>
  {
    graph <mygraph> option (exclusive) { . . . }
    create virtrdf:DefaultQuadMap
      using storage virtrdf:DefaultQuadStorage .
  }

Exclusive patterns make the order of declarations important, because an exclusive declaration may "throw a shadow" on declarations after it. Consider a database that have a special table RDF_TYPE that caches all RDF types of all subjects in all graphs. Consider two declarations: all triples from graph <http://myhost/sys> and all triples with rdf:type predicate, both exclusive:

graph <http://myhost/sys> option (exclusive)
  {
    . . . # mapping of DB.DBA.SYS_USERS as in previous examples.
  }
graph rdfdf:default-iid-nonblank (DB.DBA.RDF_TYPE.G)
subject rdfdf:default-iid (DB.DBA.RDF_TYPE.S)
predicate rdf:type
object rdfdf:default (DB.DBA.RDF_TYPE.O)
option (exclusive)

The order of these declarations dictates that triple pattern

graph <http://myhost/sys> {?s rdf:type ?o}

is compiled using only quad map patterns of the graph declaration, ignoring second declaration (and of course ignoring default mapping rule, if any). An explicit option (order N) at the end of quad map pattern will tweak the priority. By default, order will grow from 1000 for the first declaration in the statement to 1999 for the last, explicit configuration is especially useful to make order persistent to alter storage statements.

The option (exclusive) trick is ugly, low-level and prone to cause compilation errors after altering storage declarations. When misused, it is as bad as "red cut" in PROLOG, but one must use this trick to build scalable storages.

The option (exclusive) helps the SPARQL compiler to prepare better SQL queries, but sometimes it is "too exclusive". For instance, if a grouping quad map pattern specify only quad map value for graph and no other fields then making it exclusive prohibits the use of all declarations of the storage after that one. Sometimes it is better to notify compiler that quads made by the given quad map pattern are supposed to be different from all quads made by declarations listed after the given one.

Consider an application that exports users' personal data as graphs whose IRIs looks like http://www.example.com/DAV/home/username/RDF/personal/; the application makes a query and a triple pattern is proven to be restrictive enough to filter out all quads that are not similar to quads generated by the given quad map pattern (say, the graph is constant http://www.example.com/DAV/home/JohnSmith/RDF/personal/). The application do not hope to find any quads that match the pattern but made by other applications, because graphs named like in the pattern are supposed to be solely for this single purpose; if, say, DB.DBA.RDF_QUAD occasionally contains some quads with graph equal to http://www.example.com/DAV/home/JohnSmith/RDF/personal/ then they can be ignored.

Under this circumstances, the quad map pattern may have option (soft exclusive). That grants a permission to the compiler to ignore rest of storage as soon as it is proven that the triple pattern can not access quads that does not match the pattern. So if that is proven then the pattern is exclusive and it makes the query faster; when unsure, the complier work like there is no option at all.

Note:

The option (exclusive) can be used as a security measure, option (soft exclusive) can not. Say, if an financial application exports its data as a single graph http://www.example.com/front-office/cash/ using exclusive then the query that explicitly refers to that graph will never access any quads written by the attacker into DB.DBA.RDF_QUAD using same graph IRI. The use of soft exclusive gives no such protection. From the compiler's perspective, the option (soft exclusive) is a hint that may be ignored, not an unambiguous order.

There is one exception from the rules described above. This exception is for virtrdf:DefaultQuadStorage only. If a graph variable of a quad map pattern is not bound and no source graph specified by FROM clauses then quad maps for specific constant graphs are ignored. In other words, if a default quad storage contains quad maps for specific graphs then the query in that storage should explicitly specify the graph in order to use a map for graph. This rule will not work if the default quad map is removed from the virtrdf:DefaultQuadStorage. This rule relates to the default storage itself, not to the containing patterns; copying some or all patterns into other storage will not reproduce there this special effect.


15.8.7. Describing Source Relational Tables

Quad map patterns of an application usually share a common set of source tables and quad map values of one pattern usually share either a single table or very small number of joined tables. Join and filtering conditions are also usually repeated in different patterns. It is not necessary to type table descriptions multiple times, they are declare once in the beginning of storage declaration statement and shared between all quad map declarations inside the statement. Names of aliases can be used instead of table names in quad map values.

from DB.DBA.SYS_USERS as user where (^{user.}^.U_IS_ROLE = 0)
from DB.DBA.SYS_USERS as group where (^{group.}^.U_IS_ROLE = 1)
from DB.DBA.SYS_USERS as account
from user as active_user
  where (^{active_user.}^.U_ACCOUNT_DISABLED = 0)
from DB.DBA.SYS_ROLE_GRANTS as grant
  where (^{grant.}^.GI_SUPER = ^{account.}^.U_ID)
  where (^{grant.}^.GI_SUB = ^{group.}^.U_ID)
  where (^{grant.}^.GI_SUPER = ^{user.}^.U_ID)

This declares five distinct aliases for two distinct tables, and six filtering conditions. Every condition is an SQL expression with placeholders where a reference to the table should be printed. The SPARQL compiler will not try to parse texts of these expressions (except dummy search for placeholders), so any logical expressions are acceptable. When a quad map pattern declaration refers to some aliases, the WHERE clause of the generated SQL code will contain a conjunction of all distinct texts of "relevant" conditions. A condition is relevant if every alias inside the condition is used in some quad map value of the map pattern, either directly or via clause like from user as active_user. (user is a "base alias" for active_user).

Consider a group of four declarations.

graph <http://myhost/sys>
  {
    oplsioc:user_iri (active_user.U_ID)
        a oplsioc:active-user .
    oplsioc:membership_iri (grant.GI_SUPER, grant.GI_SUB).
        oplsioc:is_direct
            grant.GI_DIRECT ;
        oplsioc:member-e-mail
            active_user.U_E_MAIL
               where (^{active_user.}^.U_E_MAIL like 'mailto:%').
    ldap:account-ref (account.U_NAME)
        ldap:belongs-to
            ldap:account-ref (group.U_NAME) option (using grant).
  }

The first declaration will extend <http://myhost/sys> graph with one imaginary triples { user a oplsioc:active-user } for every account record that is not a role and not disabled. The second declaration deals with membership records. A membership is a pair of a grantee ("super") and a granted role ("sub") stored as a row in DB.DBA.SYS_ROLE_GRANTS).

The second declaration states that every membership has oplsioc:is_direct property with value from GI_DIRECT column of that table (roles may be granted to other roles and users, so permissions are "direct" or "recursive").

The third declaration declares oplsioc:member-e-mail property of memberships. The value is a literal string from DB.DBA.SYS_USERS.U_E_MAIL, if the grantee is active (not disabled) and is not a role and its e-mail address starts with 'mailto:'. The join between DB.DBA.SYS_ROLE_GRANTS and DB.DBA.SYS_USERS is made by equality (GI_SUPER = U_ID) because the alias active_user in the declaration "inherits" all conditions specified for user. In addition, the SPARQL compiler will add one more condition to check if the U_E_MAIL is not null because the NULL value is not a valid object and it knows that U_E_MAIL is not declared as NOT NULL.

The last declaration contains an option clause. As usual, this indicates that the basic functionality is good for many tasks but not for all. In this declaration, the ldap:belongs-to property establishes a relation between grantee (subject) and a granted role (object). Both subject and object IRIs are based on account name, DB.DBA.SYS_USERS.U_NAME, so the quad map pattern contains two references to different aliases of DB.DBA.SYS_USERS but no alias for DB.DBA.SYS_ROLE_GRANTS. Hence the declaration could produce a triple for every row of the Cartesian product of the DB.DBA.SYS_USERS. To fix the problem, option (using alias-name) tells the compiler to process the alias-name as if it's used in some quad map value of the pattern.

It is an error to use an alias only in where clause of the quad map pattern but neither in values or in option (using alias-name). To detect more typos, an alias used in quad map values can not appear in option (using alias-name) clause.


15.8.8. Function-Based IRI Classes

Most of IRI classes can be declared by a sprintf format string, but sophisticated cases may require calculations, not only printing the string. create IRI class using function allows the application transform relational values to IRIs by any custom routines.

Let us extend the previous example about users and groups by a new class for grantees. Both users and groups are grantees and we have defined two IRI classes for them. Classes oplsioc:user_iri and oplsioc:group_iri work fine for quad maps of U_ID if and only if the value of U_IS_ROLE is accordingly restricted to FALSE or TRUE, otherwise one may occasionally generate, say, user IRI for a group. To create and parse IRIs that correspond to any U_IDs, two functions should be created:

create function DB.DBA.GRANTEE_URI (in id integer)
returns varchar
{
  declare isrole integer;
  isrole := coalesce ((select top 1 U_IS_ROLE
      from DB.DBA.SYS_USERS where U_ID = id ) );
  if (isrole is null)
    return NULL;
  else if (isrole)
    return sprintf ('http://%s/sys/group?id=%d', id);
  else
    return sprintf ('http://%s/sys/user?id=%d', id);
};
create function DB.DBA.GRANTEE_URI_INVERSE (in id_iri varchar)
returns integer
{
  declare parts any;
  parts := sprintf_inverse (id_iri,
      'http://myhost/sys/user?id=%d', 1 );
  if (parts is not null)
    {
      if (exists (select top 1 1 from DB.DBA.SYS_USERS
          where U_ID = parts[0] and not U_IS_ROLE ) )
        return parts[0];
    }
  parts := sprintf_inverse (id_iri,
      'http://myhost/sys/group?id=%d', 1 );
  if (parts is not null)
    {
      if (exists (select top 1 1 from DB.DBA.SYS_USERS
          where U_ID = parts[0] and U_IS_ROLE ) )
        return parts[0];
    }
  return NULL;
};

These functions may be more useful if the SPARQL web service endpoint is allowed to use them:

grant execute on DB.DBA.GRANTEE_URI to "SPARQL";
grant execute on DB.DBA.GRANTEE_URI_INVERSE to "SPARQL";

The next declaration creates an IRI class based on these two functions:

create iri class oplsioc:grantee_iri using
  function DB.DBA.GRANTEE_URI (in id integer)
    returns varchar,
  function DB.DBA.GRANTEE_URI_INVERSE (in id_iri varchar)
    returns integer .

In common case, IRI class declaration contains an N-array function that composes IRIs and N inverse functions that gets an IRI as an argument and extracts the Nth SQL value. IRI composing function should silently return NULL on incorrect arguments instead of error signal. Inverse functions should return NULL if the argument has an incorrect type or value.

It is possible to specify only composing function without any of inverse functions. However option (bijection) can not be used in that case, obviously.


15.8.9. Connection Variabes in IRI Classes

Writing function-based IRI class is overkill when the IRI can in principle be made by a sprintf_iri but the format should contain some context-specific data, such as host name used for the dynamic renaming of local IRIs. Format strings offer a special syntax for that cases. %{varname}U acts as %U but the function sprintf will take the value from client connection variable varname, not from list of arguments. Similarly, sprintf_inverse will not return fragment that match to %{varname}U in the vector of other fragments; instead it will get the value from connection environment and ensure that it matches the fragment of input; mismatch between printed and actual value of variable will means that the whole string do not match the format.

SPARQL optimizer knows about this formatting feature and sometimes it makes more deductions from occurence of %{varname}U than from occurence of plain %U, so this notation may be used in option ( returns ...) when appropriate. Of course, the optimizer has no access to the actual value of conection variabe because it may vary from run to run or may change between the compilation and the run, but the value is supposed to be persistent during any single query run so %{myvariable}U in one place is equal to %{myvariable}U in other.

Connection variables are set by connection_set and some of them have default values that are used if not overridden by application:

It is inconvenient to write different format strings for different cases. Two most common policies are different host names for default HTTP port of a publicly available service and different non-default ports for one or more host names of an intranet instalation; these two approaches are almost never used in a mix. So declaration of IRI classes may use shorthand ^{DynamicLocalFormat}^ in format strings that is expanded either to http://%{WSHost}U or to http://%{WSHostName}U:%{WSHostPort}U/..., depending on absence or presence of port number in the value of DefaultHost parameter of URIQA section of configuration file.

Note:

^{DynamicLocalFormat}^ is for IRI class declarations only and is not expanded in any other place, so it is useful sometimes to create an IRI class with empty argument list in order to get "almost constant" IRIs calculated without writing special procedures.


15.8.10. Lookup Optimization -- BIJECTION and RETURNS Options

There is one subtle problem with IRI class declarations. To get benefit from a relational index, SPARQL optimizer should compose equality between table column and some known SQL value, not between return value of IRI class and a known composed IRI. In addition, redundant calculations of IRIs takes time. To enable this optimization, an IRI class declaration should end with option (bijection) clause. For some simple format strings the compier may recognize the bijection automatically but an explicit declaration is always a good idea.

The SPARQL compiler may produce big amounts of SQL code when the query contains equality of two calculated IRIs and these IRIs may come from many different IRI classes. It is possible to provide hints that will let the compiler check if two IRI classes form disjoint sets of possible IRI values. The more disjoint sets are found the less possible combinations remain so the resulting SQL query will contain fewer unions of joins. The SPARQL compiler can prove some properties of sprintf format strings. E.g., it can prove that set of all strings printed by "http://example.com/item%d" and the set of strings printed by "http://example.com/item%d/" are disjoint. It can prove some more complicated statements about unions and intersections of sets of strings. The IRI or literal class declaration may contain option (returns ...) clause that will specify one or more sprintf patterns that cover the set of generated values. Consider a better version of IRI class declaration listed above:

create iri class oplsioc:grantee_iri using
  function DB.DBA.GRANTEE_URI (in id integer)
    returns varchar,
  function DB.DBA.GRANTEE_URI_INVERSE (in id_iri varchar)
    returns integer
  option ( bijection,
    returns "http://myhost/sys/group?id=%d"
    union   "http://myhost/sys/user?id=%d" ) .

It is very important to keep IRI classes easily distinguishable by the text of IRI string and easy to parse.

In some cases option (returns ...) can be used for IRI classes that are declared using sprintf format, but actual data have more specific format. Consider a literal class declaration that is used to output strings and the application knows that all these strings are ISBN numbers:

create literal class example:isbn_ref "%s" (in isbn varchar not null)
  option ( bijection, returns "%u-%u-%u-%u" union "%u-%u-%u-X" )

Sometimes interoperability restrictions will force you to violate these rules but please try to follow them as often as possible.


15.8.11. Join Optimization -- Declaring IRI Subclasses

Additional problem appears when the equality is between two IRIs of two different IRI classes. Even if both of them are bijections, the compiler does not know if these IRI classes behave identically on the intersection of their domains. To let the optimizer know this fact, one IRI class can be explicitly declared as a subclass of another:

make oplsioc:user_iri subclass of oplsioc:grantee_iri .
make oplsioc:group_iri subclass of oplsioc:grantee_iri .

The SPARQL compiler can not check the validity of a subclass declaration. The developer should carefully test functions to ensure that transformations are really subclasses, as well as to ensure that functions of an IRI class declarations are really inverse to each other.

When declaring that a table's primary key is converted into a IRI according to one IRI class, one usually declares that all foreign keys referring to this class also get converted into an IRI as per this same class, or subclass of same class.

Subclasses can be declared for literal classes as well as for IRI classes, but this case is rare. The reason is that most of literals are made by identity literal classes that are disjoint to each other even if values may be equal in SQL sense, such as "2" of type xsd:integer and "2.0" of type xsd:double.


15.8.12. RDF Metadata Maintenance and Recovery

This section refers to checking and backing up RDF view and storage declarations only. The checks and backup/restore do not affect physical quads, relational schema or tables or data therein. For general backup and restore, see server administration. To detect and fix automatically most popular sorts of RDF metadata corruption use DB.DBA.RDF_AUDIT_METADATA. It is also possible to backup RDF data by DB.DBA.RDF_BACKUP_METADATA and restore the saved state later by using DB.DBA.RDF_RESTORE_METADATA. It is convenient to make a backup before any modification of quad storages, quad map patterns or IRI classes, especially during debugging new RDF Views.

Note:

In SQL, adding a new view can not break anything. This is because SQL lacks the ability of querying "everything" so data sources are always specified. This is not true for SPARQL, so please treat any metadata manipulation as potentially destructive operation. If an RDF storage is supposed to be used by more than one application then these applications should be tested together, not one after other, and they should be installed/upgraded on live database in the very same order as they were installed/upgraded on istrumental machine during testing. Always remember that these applications share RDF tables so they may interfere.