Name
AP_BUILD_MATCH_LIST - Returns a report of all occurrences of phrases from the specified sets in the text.
Synopsis
AP_BUILD_MATCH_LIST (
    in phrase_set_ids vector of integers,
    in source_UTF8_text varchar not null,
    in lang_name varchar not null,
    in source_text_is_html integer,
    in report_flags integer );
Description
Forms a report that lists all occurrences of phrases from the specified sets in the text.
The report describes "phrase hits", i.e. occurrences of annotation phrases in the text, using "arrows" that point to specific fragments in the text, such as words of found phrases or HTML tags.
The structure of the report is complicated because it must satisfy conflicting requirements. It must be compact, to provide reasonable performance and scalability, so common data is not repeated, which saves memory. It must also be complete enough that the application does not have to read omitted data from system tables, which saves time.
All objects of one type are listed as items of a single vector, and the whole report consists of several such vectors. An item in one vector may refer to an item in another vector by its index, without storing a local copy.
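The index-based layout described above can be illustrated with a small, self-contained Python sketch. It does not reproduce the actual report format of AP_BUILD_MATCH_LIST: the vector names ("sets", "phrases", "hits"), the tuple layouts and the build_report helper are all hypothetical and chosen only to show how items in one vector can refer to items in another by index instead of copying them.

```python
# Hypothetical illustration only: this is NOT the real AP_BUILD_MATCH_LIST
# report layout. It shows the general idea of a report made of several
# vectors whose items refer to each other by index.

def build_report(text, phrase_sets):
    """Build a toy report from a dict {set name: [phrases]}.

    sets    -- vector of set names
    phrases -- vector of (phrase text, index into sets)
    hits    -- vector of (index into phrases, start offset, end offset)
    """
    sets, phrases, hits = [], [], []
    for set_name, set_phrases in phrase_sets.items():
        set_idx = len(sets)
        sets.append(set_name)
        for phrase in set_phrases:
            phrase_idx = len(phrases)
            phrases.append((phrase, set_idx))
            # Record every occurrence; the hit stores only the phrase index,
            # not a copy of the phrase or of its set.
            start = text.find(phrase)
            while start != -1:
                hits.append((phrase_idx, start, start + len(phrase)))
                start = text.find(phrase, start + 1)
    return {"sets": sets, "phrases": phrases, "hits": hits}


report = build_report(
    "London is the capital; London is large.",
    {"cities": ["London"], "roles": ["capital"]},
)

# Resolve each hit by following indexes from one vector to the next.
for phrase_idx, start, end in report["hits"]:
    phrase, set_idx = report["phrases"][phrase_idx]
    print(report["sets"][set_idx], phrase, start, end)
```

In this sketch "London" occurs twice, but its text and its set name are stored only once; each hit carries just an index and two offsets, which is the kind of saving the compact layout is meant to provide.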
A detailed description of the report structure can be found here.
Parameters
phrase_set_ids
Vector of numeric identifiers of the phrase sets to use. The sets may belong to different phrase classes, but any phrase set whose language differs from the value of the lang_name argument is silently ignored.
source_UTF8_text
The text to search, either plain text or HTML.
lang_name
Name of the language of the text. Phrase sets in any other language are silently ignored.
source_text_is_html
0 for plain text, 1 for standard-compliant HTML or 2 for "dirty" HTML
report_flags
Report flags.