15.3.6.XQUERY and XML view

Similarly to XPATH, XQUERY may also be used as a query language for XML views. Virtuoso offers a special case of FLWR expression for this purpose. It is possible to use the xmlview() function in FOR clause expressions for querying XML views. This function is similar to document () in the sense that it sets the source of the path to be the logical root of the referenced XML view. The XML view must be a constant known at compile time. A SQL query against the appropriate tables of the XML view is internally constructed and evaluated at run time, producing XML fragments from the selected rows. At no point will non-selected parts of the evaluation of the XML view be physically created. The path expression following from xmlview() may contain filters involving XQuery variables bound in the scope of the path expression, thus allowing joining SQL data to XQuery variable values.

The XQUERY string

for $cat in xmlview("cat")/category return {$cat}

in the query

select xquery_eval(' {for $cat in xmlview("cat")/category return <q>{$cat}</q>}',
                     xtree_doc('<dummy_tag/>'));

is equivalent to the XPATH query string

XPATH [__view "cat"]/category

described above.

The expression xmlview("viewname")/path is not a valid top level SQL statement, but may be used by xquery_eval() function. The path statement is translated into SQL query in the context of the "viewname" (i.e. the necessary table names are taken from "viewname" XML view), so that only the desired relational data will be queried. Functionality of this kind of SQL queries is similar to functionality of the SQL fetch statement, i.e. such a query provides iteration over the result set of a cursor. The query is executed once for each row in the query expression's result set. Thus using the CREATE XML VIEW statement and a XQUERY FOR clause expression with xmlview() function allows you to query the database and to return the results in the form of an XML document and to avoid redundant data access. This kind of queries also allows computing joins between two or more documents and restructuring data.

[Note] Note

The <dummy_tag/> tag is not used and it is necessary only as an arbitrary argument for xtree_doc() functions.

Example15.8.Example 1

create xml view "product" as
{
  "Demo"."demo"."Products" p as "product"
      ("ProductID", "ProductName" as "product_name","UnitPrice" as "price",
      "SupplierID","CategoryID")
    {
      "Demo"."demo"."Suppliers" s as "supplier" ("SupplierID","CompanyName")
        on (s."SupplierID" = p."SupplierID")
       ,
      "Demo"."demo"."Categories" c as "category" ("Description")
        on (c."CategoryID" = p."CategoryID")

    }
}

This declares a two level hierarchy with a product node for each product and a supplier child node and a category child node of the product.

The following query will return the XML document in which each category node will contain all suppliers supplying products of the given category.

select xquery_eval('
  <document>
    {
      for $cat in xmlview("cat")/category
      return (
        <category description={$cat/@description}>
          {
            distinct (
              for $prod in xmlview("product")/product
              where $cat/@CategoryID=$prod/category/@CategoryID
              return $prod/supplier )
          }
        </category> )
    }
  </document>
  ',
  xtree_doc('<dummy_tag/>')
  );

returns the XML document in which each category node contains all suppliers supplying products of the given category.

Example15.9.Example 2

Let a document named suppliers.xml contains supplier elements; each supplier element in turn contains supplier_id and supplier_name subelements. The following query

select xquery_eval('
  <supplier_product>
    {
      for $supp in document("suppliers.xml")/supplier
      return (
        <supplier>{$supp/supplier_name }
          <product_name>
            {
              for $prod in xmlview("product")/product
              where string($supp/supplier_id)=$prod/supplier/@SupplierID
              return string($prod/@product_name)
            }
          </product_name>
        </supplier>)
    }
  </supplier_product>',
  xtree_doc('<dummy_tag/>')
  );

returns the XML document that contains supplier elements; each supplier element in turn contains supplier_name and product_name elements.


The previous query and the following one show that it is possible to use variables in a XPATH expression following xmlview() functions.

Example15.10.Example 3

The query

select xquery_eval('
  <document>
    {
      distinct(
        let $ex:= "Ex%"
        for $prod in xmlview("product")//supplier[@CompanyName like $ex]
        return <supp_id>{$prod/@SupplierID}</supp_id>)
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

is equivalent to

select xquery_eval('
  <document>
    {
      distinct(
        for $prod in xmlview("product")//supplier [@CompanyName like "Ex%"]
        return <supp_id>{$prod/@SupplierID}</supp_id>)
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

and selects all suppliers having attribute "CompanyName" starting with "Ex".


Optimization in the queries with xmlview() function

At least two methods may be used to accelerate the execution of queries with xmlview() function. The first method assumes that a path statement following xmlview() function should contain maximum conditions to reduce the result set. For example, the query

select xquery_eval('<w>
  {
    for $prod in xmlview("product")/product[@ProductID="1"]
    return <q>{$prod}</q>
  }</w>',
  xtree_doc('<dummy_tag/>'));

will be executed faster than

select xquery_eval('<w>
  {
    for $prod in xmlview("product")/product where $prod[@ProductID="1"]
    return <q>{$prod}</q>
  }</w>',
  xtree_doc('<dummy_tag/>'));

due to the SQL query produced from the path expression 'product[@ProductID="1"]' reduces the result set in comparison with the SQL query produced from the path expression 'product' in the second query.

If we execute a join of two (or more) XML views (or XML document and XML view), i.e. the query consists of the nested loops, the second method proposes to carry out a piece of query of the nested loop which is independent of the outer loop outside the outer loop and uses LET clause for it. For example, the query

select xquery_eval('<document>
  {
    let $prod_set:=(for $prod in xmlview("product")/product return $prod)
    for $cat in xmlview("cat")/category
    return (<category description={$cat/@description}$gt;
      {distinct(for $prod in $prod_set
                where $cat/@CategoryID=$prod/category/@CategoryID
                return $prod/supplier)}</category>)}</document>',
   xtree_doc('<dummy_tag/>'));

is equivalent to the query in the example 1, but it is about 5 times faster than original one. This method is especially useful for full joins. In this case we do not have a full join and this query may be optimized without the use of temporary result sets if the 'where' clause is replaced with proper filter:

select xquery_eval('<document>
  {
    for $cat in xmlview("cat")/category
    let $catID := $cat/@CategoryID
    return (<category description={$cat/@description}$gt;
      {distinct(
        for $supp in xmlview("product")/product[category/@CategoryID=$catID]/supplier
        return $supp)
      }</category>)
  }
  </document<',
  xtree_doc('<dummy_tag/>'));

This variant requires no memory for storing $prod_set and it never fetches redundant fields from "Demo"."demo"."Products" table but it heavily needs index for "Demo"."demo"."Products" on "CategoryID" field. If such index is built the last variant is about 10 times faster than the query in example 1. Similarly, the query in example 2 may be optimized as follows:

select xquery_eval('
  <supplier_product>
   {
     for $supp in document("suppliers.xml")/supplier
     let $supp_id:=string($supp/@supplier_id)
     return (
      <supplier>{$supp/supplier_name}
        <product_name>
          {
            for $prod in xmlview("product")/product[supplier/@SupplierID=$supp_id]
            return string($prod/@product_name)
          }
        </product_name>
      </supplier>)
   }
  </supplier_product>',
  xtree_doc('<dummy_tag/>')
  );

and it speeds up the operation by more than 15 times.

Restrictions in XPATH expressions following the xmlview() function

Virtuoso does not support certain kinds of XPATH expressions applied to the xmlview() function.

1. A path expression must not contain any functions, because it is impossible to translate most of the functions to SQL queries.

2. A path expression must not contain numeric XQUERY variables in the arithmetic expressions. Let a document named products.xml contains product elements; each product element has numeric attribute ProductPrice. A run time error occurs if the following query would be used

select xquery_eval('
  <document>
   {
     for $prod_doc in document("products.xml")/products/product/@ProductPrice
       for $prod_view in xmlview("product")/product[@price>$prod_doc+1]
     return <q>{$prod_doc, $prod_view/@product_name}</q>
   }
   </document>',
   xtree_doc('<dummy_tag/>'));

because the type of $prod_doc is considered as string. As it is mentioned in the previous restriction the using a function in a path expression (e.g. xmlview("product")/product[@price>number($prod_doc)+1]) is not allowed. The correct query is as follows:

select xquery_eval('
  <document>
    {
      for $prod_doc in document("products.xml")/products/product/@ProductPrice
        let $prod_doc2:=number($prod_doc)
        for $prod_view in xmlview("product")/product[@price>$prod_doc2+1]
      return <q>{$prod_doc, $prod_view/@product_name}</q>
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

3. A path expression must not contain XQUERY variables with the following paths. The following query will not be executed

select xquery_eval('
  <document>
    {
      for $cat in xmlview("cat")/category,
          $prod in xmlview("product")/product[@CategoryID=$cat/@CategoryID]
      return <q>{$prod/supplier}</q>
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

The correct query may be given as

select xquery_eval('
  <document>
    {
      for $cat in xmlview("cat")/category
      let $cat_id:=$cat/@CategoryID
      for $prod in xmlview("product")/product[@CategoryID=$cat_id]
      return <q>{$prod/supplier}</q>
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

4. Virtuoso does not support a selection of n-th element in a path expression. The following query will not be executed

select xquery_eval('
  <document>
    {
      for $cat in xmlview("cat")//product[1] return <q>{$cat}</q>
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

5. Virtuoso does not support a dereference (=>) in a path expression.

6. It is not recommended to use the long varchar, long varbinary and long nvarchar data types with the logical and boolean operations in a filter of the path expression. For example, the execution of the following query

select xquery_eval('
  <document>
    {
      for $prod in
        xmlview("product")/product[@SupplierID<5]/category[@Description like "Sw%"]
      return <q>{$prod}</q>
    }
  </document>',
  xtree_doc('<dummy_tag/>'));

may return an error, because the field "Description" has LONG VARCHAR type in the table "demo"."Categories".