16.2.11. Transitivity in SPARQL

Virtuoso SPARQL allows access to Virtuoso's SQL transitivity extension. Read the SQL section for a definition of the options.

The SPARQL syntax is slightly different from the SQL, although the option names and meanings are the same.

In SPARQL, the transitive options occur after a subquery enclosed in braces:

The below produces all the IRI's that are the same as <http://dbpedia.org/resource/New_York>.

SPARQL
SELECT ?syn
WHERE
  {
    {
      SELECT ?x ?syn
      WHERE
        {
          { ?x owl:sameAs ?syn }
          UNION
          { ?syn owl:sameAs ?x }
        }
    }
    OPTION ( TRANSITIVE, t_in (?x), t_out (?syn), t_distinct, t_min (0) )
    FILTER (?x = <http://dbpedia.org/resource/New_York>) .
  }

In this case, we provide a binding for ?x in the filter outside of the transitive subquery. The subquery therefore is made to run from in to out. The same effect would be accomplished if we bound ?syn and SELECT ?x, the designations of in and out are arbitrary and for transitive steps that can be evaluated equally well in both directions this makes no difference.

The transitive subquery in the above is

{SELECT ?syn
 WHERE
  {
    { SELECT ?x ?syn
      WHERE
       {
         { ?x owl:sameAs ?syn }
         UNION
         { ?syn owl:sameAs ?x}
       }
    } OPTION (TRANSITIVE, t_in (?x), t_out (?syn), t_distinct, t_min (0) )
  }
} .

Leaving out the option would just look for one step of owl:sameAs. Making it transitive will apply the subquery to all bindings it produces until all are visited at least once (the t_distinct modifier).

If the transitive step consists of a single triple pattern, there is a shorthand:

  <alice> foaf:knows ?friend option (transitive t_min (1))

will bind ?friend to all directly and indirectly found foaf:known individuals. If t_min had been 0, Malice> would have also been in the generated bindings.

The syntax is

  option (transitive transitivity_option[,...])

  transitivity_option ::=  t_in (<variable_list>)
  | t_out (<variable_list>)
  | t_distinct
  | t_shortest_only
  | t_no_cycles
  | t_cycles_only
  | t_min (INTNUM)
  | t_max (INTNUM)
  | t_end_flag (<variable>)
  | t_step (<variiable_or_step>)
  | t_direction INTNUM

  variable_list ::= <variable> [,...]

  variable_or_step ::= <variable> | path_id' | 'step_no'

Unlike SQL, variable names are used instead of column numbers. Otherwise all the options have the same meaning.

Some examples of the use of transitivity are:

Collection of Transitivity Option Demo Queries for SPARQL

Example for finding out what graphs contain owl:sameAs for "New York"

To find out what graphs contain owl:sameAs for Dan York, we do

   SELECT ?g ?x count (*) as ?count
   WHERE {
           {
             SELECT ?x ?alias ?g
             WHERE {
                     {
                       GRAPH ?g {?x owl:sameAs ?alias }
                     }
             UNION
                     {
                      GRAPH ?g {?alias owl:sameAs ?x}
                     }
                   }
           }
           OPTION ( TRANSITIVE,
                    t_in (?x),
                    t_out (?alias),
                    t_distinct,
                    t_min (1)) .
           FILTER (?x = <http://dbpedia.org/resource/New_York> ) .
         }

Here we select all paths that start with the initial URI and pass through one or more sameAs statements. Each step produces a result of the transitive subquery. The graph where the sameAs triple was found is returned and used as the grouping column. In this way we see how many times each graph is used. Note that graphs are counted many times since the graphs containing immediate sameAs statements are counted for paths of length 1, then again as steps on paths that reach to their aliases and so on.

Example for query that takes all the people known by Tim Berners-Lee, to a depth between 1 and 4 applications of the subquery

This query takes all the people known by kidehen, to a depth between 1 and 4 applications of the subquery. It then sorts them by the distance and the descending count of connections of each found connection. This is equivalent to the default connections list shown by LinkedIn.

  SPARQL
  SELECT ?o ?dist ((SELECT COUNT (*) WHERE {?o foaf:knows ?xx}))
  WHERE
    {
      {
        SELECT ?s ?o
        WHERE
          {
            ?s foaf:knows ?o
          }
      } OPTION ( TRANSITIVE,
                 t_distinct,
                 t_in(?s),
                 t_out(?o),
                 t_min (1),
                 t_max (4),
                 t_step ('step_no') as ?dist ) .
      FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>)
    }
  ORDER BY ?dist DESC 3
  LIMIT 50

Example for query that takes all the people known by Tim Berners-Lee, to a depth between 2 and 4 applications of the subquery

This query takes all the people known by kidehen, to a depth between 2 and 4 applications of the subquery. It then sorts them by the distance and the descending count of connections of each found connection. This is equivalent to the default connections list shown by LinkedIn.

  SPARQL
  SELECT ?o ?dist ((SELECT COUNT (*) WHERE {?o foaf:knows ?xx}))
  WHERE
    {
      {
        SELECT ?s ?o
        WHERE
          {
            ?s foaf:knows ?o
          }
      } OPTION ( TRANSITIVE,
                 t_distinct,
                 t_in(?s),
                 t_out(?o),
                 t_min (2),
                 t_max (4),
                 t_step ('step_no') as ?dist) .
      FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>)
    }
  ORDER BY ?dist DESC 3
  LIMIT 50

Example for finding how two people know each other and what graphs are involved in the connection

To find how two people know each other and what graphs are involved in the connection, we do:

  SPARQL
  SELECT ?link ?g ?step ?path
  WHERE
    {
      {
        SELECT ?s ?o ?g
        WHERE
          {
            graph ?g {?s foaf:knows ?o }
          }
      } OPTION ( TRANSITIVE,
                 t_distinct,
                 t_in(?s),
                 t_out(?o),
                 t_no_cycles,
                 T_shortest_only,
                 t_step (?s) as ?link,
                 t_step ('path_id') as ?path,
                 t_step ('step_no') as ?step,
                 t_direction 3) .
      FILTER (?s= <http://www.w3.org/People/Berners-Lee/card#i>
      && ?o = <http://www.advogato.org/person/mparaz/foaf.rdf#me>)
    }
    LIMIT 20

This query binds both the t_in and t_out variables. The ?g is left as a free variable. Also, specifying ?s and the system defined constants step_no and path_id as with t_step, we get for each transitive step a row of results with the intermediate binding of ?s, the count of steps from the initial ?s and a distinct identifier for the individual path, since there can be many distinct paths that link the ?s and ?o specified in the filter.

See the SQL transitive option section for details on the meaning of step_no and path_id.

Example for TBox Subsumption

Subsumption Demo Using Transitivity Clause

Yago Class Hierarchy (TBox) Subsumption

AlphaReceptors

# all subjects with IRI: <http://dbpedia.org/class/yago/AlphaReceptor105609111>,
# that are sub-classes of anything (hence ?y)
# without restrictions on tree levels
SELECT ?y
FROM <http://dbpedia.org/resource/classes/yago#>
WHERE
  {
    {
      SELECT *
      WHERE
        {
          ?x rdfs:subClassOf ?y .
        }
    }
    OPTION (TRANSITIVE, t_distinct, t_in (?x), t_out (?y) ) .
    FILTER (?x = <http://dbpedia.org/class/yago/AlphaReceptor105609111>)
  }

Example for Receptors

SELECT ?x
FROM <http://dbpedia.org/resource/classes/yago#>
WHERE
  {
    {
      SELECT *
      WHERE
        {
          ?x rdfs:subClassOf ?y .
        }
    } OPTION (transitive, t_distinct, t_in (?x), t_out (?y) ) .
  FILTER (?y = <http://dbpedia.org/class/yago/Receptor105608868>)
}

Inference Rule example using transitive properties from SKOS vocabulary

The following example demostrates the steps how to retrieve the skos ontology, add triples for skos:broaderTransitiveinto the graph, define inference rule, and at the and execute sparql query with inference rule and transitivity option. The queries were executed against the LOD instance (http://lod.openlinksw.com):

Make the Context graph, assuming you don't want to load entire SKOS vocabulary into our Quad Store:

SQL>SPARQL
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
INSERT INTO GRAPH <urn:rules.skos> { skos:broader rdfs:subPropertyOf skos:broaderTransitive .
                                     skos:narrower rdfs:subPropertyOf skos:narrowerTransitive };

OR Load entire SKOS ontology into Quad Store via iSQL interface (commandline or HTML based Conductor):

SQL>DB.DBA.RDF_LOAD_RDFXML (http_get ('http://www.w3.org/2009/08/skos-reference/skos-owl1-dl.rdf'), 'no', 'urn:rules.skos');
Done.

Make Context Rule:

SQL>rdfs_rule_set ('skos-trans', 'urn:rules.skos');
Done.

Go to SPARQL endpoint, for ex. http://lod.openlinksw.com/sparql

Use inference rule pragma to set context rule for SPARQL query, i.e:

SPARQL
DEFINE input:inference "skos-trans"
PREFIX p: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX category: <http://dbpedia.org/resource/Category:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX geo: <http://www.georss.org/georss/>

SELECT DISTINCT ?m ?n ?p ?d
WHERE
  {
    ?m rdfs:label ?n.
    ?m skos:subject ?c.
    ?c skos:broaderTransitive category:Churches_in_Paris OPTION (TRANSITIVE) .
    ?m p:abstract ?d.
    ?m geo:point ?p
    FILTER ( lang(?n) = "fr" )
    FILTER ( lang(?d) = "fr" )
  }

You will get 22 rows returned from the query. Note that for comparison, if the option (transitive) is ommitted, then only 2 rows will be returned in our example query:

Figure 16.42. Transitive option

Inference Rule example using transitive properties from SKOS vocabulary: Variant II

This example shows how to find entities that are subcategories of Protestant Churches, no deeper than 3 levels within the concept scheme hierarchy, filtered by a specific subcategory. It demonstrates use of inference rules, sub-queries, and filter to obtain entities associated with category: Protestant_churches combined with the use of the transitivitve closure, sets to a maximum of 3 steps down a SKOS based concept scheme hierarchy:

Make sure the inference rule "skos-trans" is created as described in the previous example
Go to SPARQL endpoint, for ex. http://lod.openlinksw.com/sparql

Use inference rule pragma to set context rule for SPARQL query, i.e:

DEFINE input:inference "skos-trans"
PREFIX p: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX category: <http://dbpedia.org/resource/Category:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX geo: <http://www.georss.org/georss/>

SELECT DISTINCT ?c AS ?skos_broader
       ?trans AS ?skos_narrower
       ?dist AS ?skos_level
       ?m ?n ?p AS ?geo_point
WHERE
  {
    {
      SELECT ?c  ?m ?n ?p ?trans ?dist
      WHERE
        {
  	      ?m rdfs:label ?n.
  	      ?m skos:subject ?c.
  	      ?c skos:broaderTransitive category:Protestant_churches .
  	      ?c skos:broaderTransitive ?trans
                OPTION ( TRANSITIVE,
                         t_distinct,
                         t_in (?c),
                         t_out (?trans),
                         t_max (3),
                         t_step ( 'step_no' ) as ?dist ) .
  	      ?m p:abstract ?d.
  	      ?m geo:point ?p
  	      FILTER ( lang(?n) = "en" )
  	      FILTER ( lang(?d) = "en" )
        }
    }
    FILTER ( ?trans = <http://dbpedia.org/resource/Category:Churches_in_London> )
  }
ORDER BY ASC (?dist)

You will get 22 rows returned from the query.

Figure 16.43. Transitive option

Prev	Up	Next
16.2.10. SPARQL DESCRIBE	Home	16.2.12. Supported SPARQL-BI "define" pragmas

Prefix	IRI
n3	http://docs.openlinksw.com/virtuoso/rdfsparqlimplementatiotrans/
schema	http://schema.org/
n5	http://creativecommons.org/licenses/by/4.0/
rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
n4	http://www.openlinksw.com/#
xsdh	http://www.w3.org/2001/XMLSchema#

Prefix	URI
xmlns:n3	http://docs.openlinksw.com/virtuoso/rdfsparqlimplementatiotrans/
xmlns:schema	http://schema.org/
xmlns:n5	http://creativecommons.org/licenses/by/4.0/
xmlns:rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
xmlns:n4	http://www.openlinksw.com/#
xmlns:xsdh	http://www.w3.org/2001/XMLSchema#

Subject	Predicate	Object
n3:	rdf:type	schema:APIReference
n3:	rdf:type	schema:TechArticle
n3:	schema:name	16.2.11.ÃÂ Transitivity in SPARQL
n3:	schema:copyrightHolder	_:vb81416
n3:	schema:datePublished	2016-09-09 16:16:54
n3:	schema:headline	16.2.11.ÃÂ Transitivity in SPARQL
n3:	schema:keywords	OpenLink,Virtuoso,database,RDBMS,relational,SQL,RDF,triple store,linked data,linked open data,Big Data,SPARQL
n3:	schema:license	n5:deed.en_US
n3:	schema:publisher	_:vb81415
n3:	schema:url	n3:
_:vb81415	rdf:type	schema:Organization
_:vb81415	schema:name	OpenLink Software
_:vb81415	schema:url	n4:this
_:vb81416	rdf:type	schema:Organization
_:vb81416	schema:name	OpenLink Software
_:vb81416	schema:url	n4:this