20.2.4. Free Text Index Examples

composite (1, 2) = composite (1, 2)  is true
composite (1, 2) < composite (1, 3) is true.
composite (1, 0) > composite (1) is true.

composite_ref (composite (1, 2), 1) = 2

Example of a Composite Application Specific Document ID

The below code creates a table for news articles and defines a text index with a composite document id.

create table article(author_name varchar(255),
		     description long varchar,
		     pub_date datetime,
		     id integer,
		     title varchar(255),
		     dtid any not null,
		     primary key(id));

create text index on article (description)
	with key dtid clustered with (pub_date, author_name);

	Note:
	The dtid must be unique, non null and must be a composite since it is used as a custom text index id.

Next we define a mapping between dates and integers. These will be minutes consecutive of 1990-1-1 0:00. With 525600 minutes per 365 day year we will not run out of values for a long time, the positive integer limit being 2**31.

create procedure date2short (in dt datetime)
{
  return (1440 * datediff ('day', {d'1990-1-1'}, dt)
	+ hour (dt) * 60 + minute (dt));
}

create procedure short2date (in n integer)
{
  return (dateadd ('minute', mod (n, 1440),
	dateadd ('day', n / 1440, {d'1990-1-1'})));
}

Now we can insert an article:

insert into article (id, drtid, description, pub_date)
	values (1, composite (date2short ({dt '2001-1-15 12:44'}), 1),
		'sample news article', {dt '2001-1-15 12:44'});

	Note:
	The composite is the date2short of the datetime and the id. The dtid must be specified and cannot be generated by a trigger, since the free text index related triggers must have access to the value.

The text index will be in synchronous mode by default so we can now query the data:

select id from article where contains (description, 'sample');
select id from article where contains (description, 'sample', descending);

The first query will return the oldest hits first, the second the newest first. Note that inserting in ascending order of the document ID is incomparably more efficient than in descending order. There is no great speed difference between reading in ascending or descending order.

If no application specific ID were specified the order would reflect the insertion order. Note that in this example articles do not have to be received in publication order, although insertions will naturally tend to follow this.

Now since the ID has an application semantic, we can use it for filtering based on date:

Consider:

select id from article where contains (description, 'sample', descending,
	start_id, composite (date2short ({dt'2001-1-5'})));

Since the search goes in descending order of id and starts at an id beginning with the numeric value corresponding to 2001-1-5 0:00, we may only get hits where the id date component is less than this, newest first.

There is no time penalty for the start_id option. This is therefore incomparably faster than the query:

select id from article where contains (description, 'sample', descending)
and pub_date < {dt '2001-1-5'};

	See Also:
	The reference section for contains for a definition of these options.

Prefix	IRI
n2	http://docs.openlinksw.com/virtuoso/fttexamples/
schema	http://schema.org/
n5	http://creativecommons.org/licenses/by/4.0/
rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
n4	http://www.openlinksw.com/#
xsdh	http://www.w3.org/2001/XMLSchema#

Prefix	URI
xmlns:n2	http://docs.openlinksw.com/virtuoso/fttexamples/
xmlns:schema	http://schema.org/
xmlns:n5	http://creativecommons.org/licenses/by/4.0/
xmlns:rdf	http://www.w3.org/1999/02/22-rdf-syntax-ns#
xmlns:n4	http://www.openlinksw.com/#
xmlns:xsdh	http://www.w3.org/2001/XMLSchema#

Prefix

URI

xmlns:n2

http://docs.openlinksw.com/virtuoso/fttexamples/

xmlns:schema

http://schema.org/

xmlns:n5

http://creativecommons.org/licenses/by/4.0/

xmlns:rdf

http://www.w3.org/1999/02/22-rdf-syntax-ns#

xmlns:n4

http://www.openlinksw.com/#

xmlns:xsdh

http://www.w3.org/2001/XMLSchema#

Subject	Predicate	Object
n2:	rdf:type	schema:TechArticle
n2:	rdf:type	schema:APIReference
n2:	schema:name	20.2.4.ÃÂ Free Text Index Examples
n2:	schema:copyrightHolder	_:vb80654
n2:	schema:datePublished	2016-09-09 16:16:54
n2:	schema:headline	20.2.4.ÃÂ Free Text Index Examples
n2:	schema:keywords	OpenLink,Virtuoso,database,RDBMS,relational,SQL,RDF,triple store,linked data,linked open data,Big Data
n2:	schema:license	n5:deed.en_US
n2:	schema:publisher	_:vb80653
n2:	schema:url	n2:
_:vb80653	rdf:type	schema:Organization
_:vb80653	schema:name	OpenLink Software
_:vb80653	schema:url	n4:this
_:vb80654	rdf:type	schema:Organization
_:vb80654	schema:name	OpenLink Software
_:vb80654	schema:url	n4:this

Subject

Predicate

Object

n2:

rdf:type

schema:TechArticle

n2:

rdf:type

schema:APIReference

n2:

schema:name

20.2.4.ÃÂ Free Text Index Examples

n2:

schema:copyrightHolder

_:vb80654

n2:

schema:datePublished

2016-09-09 16:16:54

n2:

schema:headline

20.2.4.ÃÂ Free Text Index Examples

n2:

schema:keywords

OpenLink,Virtuoso,database,RDBMS,relational,SQL,RDF,triple store,linked data,linked open data,Big Data

n2:

schema:license

n5:deed.en_US

n2:

schema:publisher

_:vb80653

n2:

schema:url

n2:

_:vb80653

rdf:type

schema:Organization

_:vb80653

schema:name

OpenLink Software

_:vb80653

schema:url

n4:this

_:vb80654

rdf:type

schema:Organization

_:vb80654

schema:name

OpenLink Software

_:vb80654

schema:url

n4:this

Prev	Up	Next
20.2.3. The composite Data Type	Home	20.2.5. Pre-processing and Extending the Content Being Indexed

20.2.4. Free Text Index Examples

Namespace Prefixes

Statements

Namespace Prefixes

Statements