{ "ns" : "http://zorba.io/modules/full-text", "description" : " This module provides an XQuery API to full-text functions.\n For general information about this implementation of the\n XQuery and XPath Full Text 1.0 specification\n as well as instructions for building and installing a thesaurus,\n see the Full Text Thesaurus documentation.\n
\n Most functions in this module take a language as a parameter\n using the\n xs:language
\n XML schema data type.\n
stem()
functions return the\n stem\n of a word.\n The stem of a word itself, however, is not guaranteed to be a word.\n It is best to consider a stem as an opaque byte sequence.\n All that is guaranteed about a stem is that,\n for a given word,\n the stem of that word will always be the same byte sequence.\n Hence,\n you should never compare the result of one of the stem()
\n functions against a non-stemmed string,\n for example:\n \n if ( ft:stem( \"apples\" ) eq \"apple\" ) ** WRONG **\n\n Instead do:\n
\n if ( ft:stem( \"apples\" ) eq ft:stem( \"apple\" ) ) ** CORRECT **\n\n
thesaurus-lookup()
functions have \"levels\"\n and \"relationship\" parameters.\n The values for these are implementation-defined.\n The default implementation uses the\n WordNet lexical database,\n version 3.0.\n \n In WordNet,\n the number of \"levels\" between two phrases\n is how many hierarchical meanings apart they are.\n For example,\n \"canary\" is 5 levels away from \"vertebrate\"\n (canary > finch > oscine > passerine > bird > vertebrate).\n
\n When using the WordNet implementation,\n all of the relationships (and their abbreviations)\n specified by\n ISO 2788\n and\n ANSI/NISO Z39.19-2005\n with the exceptions of \"HN\" (history note)\n and \"X SN\" (see scope note for) are supported.\n These relationships are:\n
Rel. | \nMeaning | \nWordNet Rel. | \n
---|---|---|
BT | \nbroader term | \nhypernym | \n
BTG | \nbroader term generic | \nhypernym | \n
BTI | \nbroader term instance | \ninstance hypernym | \n
BTP | \nbroader term partitive | \npart meronym | \n
NT | \nnarrower term | \nhyponym | \n
NTG | \nnarrower term generic | \nhyponym | \n
NTI | \nnarrower term instance | \ninstance hyponym | \n
NTP | \nnarrower term partitive | \npart holonym | \n
RT | \nrelated term | \nalso see | \n
SN | \nscope note | \nn/a | \n
TT | \ntop term | \nhypernym | \n
UF | \nnon-preferred term | \nn/a | \n
USE | \npreferred term | \nn/a | \n
Relationship | \nMeaning | \n
---|---|
also see | \n\n A word that is related to another,\n e.g., for \"varnished\" (furniture)\n one should also see \"finished.\"\n | \n
antonym | \n\n A word opposite in meaning to another,\n e.g., \"light\" is an antonym for \"heavy.\"\n | \n
attribute | \n\n A noun for which adjectives express values,\n e.g., \"weight\" is an attribute\n for which the adjectives \"light\" and \"heavy\"\n express values.\n | \n
cause | \n\n A verb that causes another,\n e.g., \"show\" is a cause of \"see.\"\n | \n
derivationally related form | \n\n A word that is derived from a root word,\n e.g., \"metric\" is a derivationally related form of \"meter.\"\n | \n
derived from adjective | \n\n An adverb that is derived from an adjective,\n e.g., \"correctly\" is derived from the adjective \"correct.\"\n | \n
entailment | \n\n A verb that presupposes another,\n e.g., \"snoring\" entails \"sleeping.\"\n | \n
hypernym | \n\n A word with a broad meaning that more specific words fall under,\n e.g., \"meal\" is a hypernym of \"breakfast.\"\n | \n
hyponym | \n\n A word of more specific meaning than a general term applicable to it,\n e.g., \"breakfast\" is a hyponym of \"meal.\"\n | \n
instance hypernym | \n\n A word that denotes a category of some specific instance,\n e.g., \"author\" is an instance hypernym of \"Asimov.\"\n | \n
instance hyponym | \n\n A term that denotes a specific instance of some general category,\n e.g., \"Asimov\" is an instance hyponym of \"author.\"\n | \n
member holonym | \n\n A word that denotes a collection of individuals,\n e.g., \"faculty\" is a member holonym of \"professor.\"\n | \n
member meronym | \n\n A word that denotes a member of a larger group,\n e.g., a \"person\" is a member meronym of a \"crowd.\"\n | \n
part holonym | \n\n A word that denotes a larger whole comprised of some part,\n e.g., \"car\" is a part holonym of \"engine.\"\n | \n
part meronym | \n\n A word that denotes a part of a larger whole,\n e.g., an \"engine\" is a part meronym of a \"car.\"\n | \n
participle of verb | \n\n An adjective that is the participle of some verb,\n e.g., \"breaking\" is the participle of the verb \"break.\"\n | \n
pertainym | \n\n An adjective that classifies its noun,\n e.g., \"musical\" is a pertainym in \"musical instrument.\"\n | \n
similar to | \n\n Similar, though not necessarily interchangeable, adjectives.\n For example, \"shiny\" is similar to \"bright\",\n but they have subtle differences.\n | \n
substance holonym | \n\n A word that denotes a larger whole containing some constituent\n substance, e.g., \"bread\" is a substance holonym of \"flour.\"\n | \n
substance meronym | \n\n A word that denotes a constituent substance of some larger whole,\n e.g., \"flour\" is a substance meronym of \"bread.\"\n | \n
verb group | \n\n A verb that is a member of a group of similar verbs,\n e.g., \"live\" is in the verb group\n of \"dwell\", \"live\", \"inhabit\", etc.\n | \n
Gets the current compare options.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ ], "returns" : { "type" : "object()", "description" : "said compare options." }, "errors" : [ ] }, { "arity" : 0, "name" : "current-lang", "qname" : "ft:current-lang", "signature" : "() as xs:language external", "description" : " Gets the current\n language:\n either the language specified by the\ndeclare ft-option using\n language
\n statement (if any)\n or the one returned by ft:host-lang()
(if none).\n", "summary" : "Gets the current\n language :\n either the language specified by the\n declare ft-option using \n language \n statement (if any)\n or the one returned by ft:host-lang() (if none).
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ ], "returns" : { "type" : "xs:language", "description" : "said language." }, "errors" : [ ] }, { "arity" : 0, "name" : "host-lang", "qname" : "ft:host-lang", "signature" : "() as xs:language external", "description" : " Gets the host's current\n language.\n The \"host\" is the computer on which the software is running.\n The host's current language is obtained as follows:\nsetlocale
(3) returns non-null,\n the language corresponding to that locale is used.\n LANG
environment variable is set,\n that language is used.\n GetLocaleInfo()
\n function is used.\n Gets the host's current\n language .
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ ], "returns" : { "type" : "xs:language", "description" : "said language." }, "errors" : [ ] }, { "arity" : 1, "name" : "is-stem-lang-supported", "qname" : "ft:is-stem-lang-supported", "signature" : "($lang as xs:language) as xs:boolean external", "description" : " Checks whether the given\n language\n is supported for stemming.\n", "summary" : "Checks whether the given\n language \n is supported for stemming.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "lang", "type" : "xs:language", "occurence" : null, "description" : "true
only if the language is supported." }, "errors" : [ ] }, { "arity" : 1, "name" : "is-stop-word-lang-supported", "qname" : "ft:is-stop-word-lang-supported", "signature" : "($lang as xs:language) as xs:boolean external", "description" : " Checks whether the given\n language\n is supported for stop words.\n", "summary" : "Checks whether the given\n language \n is supported for stop words.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "lang", "type" : "xs:language", "occurence" : null, "description" : "true
only if the language is supported." }, "errors" : [ ] }, { "arity" : 1, "name" : "is-stop-word", "qname" : "ft:is-stop-word", "signature" : "($word as xs:string) as xs:boolean external", "description" : " Checks whether the given word is a stop-word.\n", "summary" : "Checks whether the given word is a stop-word.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "word", "type" : "xs:string", "occurence" : null, "description" : "ft:current-lang()
.true
only if $word
is a stop-word." }, "errors" : [ "ft:current-lang()
is not supported.Checks whether the given word is a stop-word.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "word", "type" : "xs:string", "occurence" : null, "description" : "$word
.true
only if $word
is a stop-word." }, "errors" : [ "$lang
is not supported.Checks whether the given\n language \n is supported for look-up using the default thesaurus.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "lang", "type" : "xs:language", "occurence" : null, "description" : "true
only if the language is supported." }, "errors" : [ ] }, { "arity" : 2, "name" : "is-thesaurus-lang-supported", "qname" : "ft:is-thesaurus-lang-supported", "signature" : "($uri as xs:string, $lang as xs:language) as xs:boolean external", "description" : " Checks whether the given\n language\n is supported for look-up using the thesaurus specified by the given URI.\n", "summary" : "Checks whether the given\n language \n is supported for look-up using the thesaurus specified by the given URI.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "uri", "type" : "xs:string", "occurence" : null, "description" : "true
only if the language is supported." }, "errors" : [ "$uri
refers to a thesaurus that is not found in the statically known thesauri.Checks whether the given\n language \n is supported for tokenization.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "lang", "type" : "xs:language", "occurence" : null, "description" : "true
only if the language is supported." }, "errors" : [ ] }, { "arity" : 1, "name" : "stem", "qname" : "ft:stem", "signature" : "($word as xs:string) as xs:string external", "description" : " Stems the given word.\n", "summary" : "Stems the given word.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "word", "type" : "xs:string", "occurence" : null, "description" : "ft:current-lang()
.$word
." }, "errors" : [ "ft:current-lang()
is not supported.Stems the given word.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "word", "type" : "xs:string", "occurence" : null, "description" : "$word
.$word
." }, "errors" : [ "$lang
is not supported.Strips all diacritical marks from all characters.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "string", "type" : "xs:string", "occurence" : null, "description" : "$string
with diacritical marks stripped." }, "errors" : [ ] }, { "arity" : 1, "name" : "thesaurus-lookup", "qname" : "ft:thesaurus-lookup", "signature" : "($phrase as xs:string) as xs:string* external", "description" : " Looks up the given phrase in the default thesaurus.\n", "summary" : "Looks up the given phrase in the default thesaurus.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "phrase", "type" : "xs:string", "occurence" : null, "description" : "ft:current-lang()
.$phrase
is found in the thesaurus or the empty sequence if not." }, "errors" : [ "ft:current-lang()
is not supported.Looks up the given phrase in a thesaurus.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "uri", "type" : "xs:string", "occurence" : null, "description" : "ft:current-lang()
.$phrase
is found in the thesaurus or the empty sequence if not." }, "errors" : [ "ft:current-lang()
is unsupported.$uri
refers to a thesaurus that is not found in the statically known thesauri.Looks up the given phrase in the thesaurus specified by the given URI.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "uri", "type" : "xs:string", "occurence" : null, "description" : "$phrase
.$phrase
is found in the thesaurus or the empty sequence if not." }, "errors" : [ "$lang
is not supported.$uri
refers to a thesaurus that is not found in the statically known thesauri.Looks up the given phrase in a thesaurus.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "uri", "type" : "xs:string", "occurence" : null, "description" : "$phrase
.$phrase
.$phrase
is found in the thesaurus or the empty sequence if not." }, "errors" : [ "$uri
refers to a thesaurus that is not found in the statically known thesauri.$lang
is not supported.Looks up the given phrase in a thesaurus.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "uri", "type" : "xs:string", "occurence" : null, "description" : "$phrase
.$phrase
.$phrase
is found in the thesaurus or the empty sequence if not." }, "errors" : [ "$level-least
or $level-most
is either negative or too large.$uri
refers to a thesaurus that is not found in the statically known thesauri.$lang
is not supported.Tokenizes the given node and all of its descendants.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "node", "type" : "node()", "occurence" : null, "description" : "ft:current-lang()
.ft:current-lang()
is not supported.Tokenizes the given node and all of its descendants.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "node", "type" : "node()", "occurence" : null, "description" : "$node
.$lang
is not supported.$includes
(and all of its\n descendants) but excluding $excludes
(and all of its\n descendants), if any.\n", "summary" : "Tokenizes the set of nodes comprising $includes (and all of its\n descendants) but excluding $excludes (and all of its\n descendants), if any.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "includes", "type" : "node()", "occurence" : null, "description" : "ft:current-lang()
.ft:current-lang()
is not supported.$includes
(and all of its\n descendants) but excluding $excludes
(and all of its\n descendants), if any.\n", "summary" : "Tokenizes the set of nodes comprising $includes (and all of its\n descendants) but excluding $excludes (and all of its\n descendants), if any.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "includes", "type" : "node()", "occurence" : null, "description" : "$lang
is not supported.Tokenizes the given string.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "string", "type" : "xs:string", "occurence" : null, "description" : "ft:current-lang()
.ft:current-lang()
is not supported.Tokenizes the given string.
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "string", "type" : "xs:string", "occurence" : null, "description" : "$string
.$lang
is not supported.ft:current-lang()
.\n", "summary" : "Gets properties of the tokenizer for the\n language \n returned by ft:current-lang() .
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ ], "returns" : { "type" : "object()", "description" : "said properties." }, "errors" : [ "ft:current-lang()
is not supported.Gets properties of the tokenizer for the given\n language .
", "annotation_str" : "", "annotations" : [ ], "updating" : false, "parameters" : [ { "name" : "lang", "type" : "xs:language", "occurence" : null, "description" : "$lang
is not supported. tokenization specifically.xs:language
.\n" }, { "name" : "ft:LANG-DE", "type" : "xs:language", "description" : " Predeclared constant for the German\n xs:language
.\n" }, { "name" : "ft:LANG-EN", "type" : "xs:language", "description" : " Predeclared constant for the English\n xs:language
.\n" }, { "name" : "ft:LANG-ES", "type" : "xs:language", "description" : " Predeclared constant for the Spanish\n xs:language
.\n" }, { "name" : "ft:LANG-FI", "type" : "xs:language", "description" : " Predeclared constant for the Finnish\n xs:language
.\n" }, { "name" : "ft:LANG-FR", "type" : "xs:language", "description" : " Predeclared constant for the French\n xs:language
.\n" }, { "name" : "ft:LANG-HU", "type" : "xs:language", "description" : " Predeclared constant for the Hungarian\n xs:language
.\n" }, { "name" : "ft:LANG-IT", "type" : "xs:language", "description" : " Predeclared constant for the Italian\n xs:language
.\n" }, { "name" : "ft:LANG-NL", "type" : "xs:language", "description" : " Predeclared constant for the Dutch\n xs:language
.\n" }, { "name" : "ft:LANG-NO", "type" : "xs:language", "description" : " Predeclared constant for the Norwegian\n xs:language
.\n" }, { "name" : "ft:LANG-PT", "type" : "xs:language", "description" : " Predeclared constant for the Portuguese\n xs:language
.\n" }, { "name" : "ft:LANG-RO", "type" : "xs:language", "description" : " Predeclared constant for the Romanian\n xs:language
.\n" }, { "name" : "ft:LANG-RU", "type" : "xs:language", "description" : " Predeclared constant for the Russian\n xs:language
.\n" }, { "name" : "ft:LANG-SV", "type" : "xs:language", "description" : " Predeclared constant for the Swedish\n xs:language
.\n" }, { "name" : "ft:LANG-TR", "type" : "xs:language", "description" : " Predeclared constant for the Turkish\n xs:language
.\n" } ] }