| 1. | Built-in Extension Functions |
| 2. | Implementing Extension Functions |
4XPath supports user-defined extension functions as specified by the XPath (and XSLT) Recommendations. It comes with a library of convenient extension functions for a range of purposes. These are listed here.
All built-in extension functions have the namespace URI 'http://xmlns.4suite.org/ext'.
node-set(rtf)
Convert a result-tree fragment to a node-set.
Parameters
rtf of type result tree fragment
A result tree fragment, such as one generated by the body of an XSLT variable or parameter.
Return Value
node set
A node set consisting of all of the top-level nodes in the result tree fragment, including all recursive tree elements.
match(pattern, arg)
Match a Python regular expression against a string.
Parameters
pattern of type string
A string representing a regular expression, as used by the Python re module.
arg of type string
A string to be matched against the pattern.
Return Value
boolean
true if the string matches, otherwise false.
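As a rough illustration of these semantics, the sketch below mirrors the call in plain Python. It assumes the function behaves like re.match, which anchors the pattern at the start of the string; the text above does not say whether 4XPath uses re.match or re.search. The name xpath_match is hypothetical and used only here.

```python
import re

def xpath_match(pattern, arg):
    """Hypothetical stand-in for match(pattern, arg); assumes re.match semantics."""
    return re.match(pattern, arg) is not None

# The pattern matches at the start of the first string only.
print(xpath_match(r"\d{4}-\d{2}", "2001-07-15"))
print(xpath_match(r"\d{4}", "year 2001"))
```

Under the re.match assumption, a pattern that should match anywhere in the string needs a leading ".*".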
escape-url(url)
Escape illegal characters in a URL.
Parameters
url of type string
The URL to be escaped.
Return Value
string
The URL with all illegal characters escaped according to RFC 1738.
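The effect of this kind of escaping can be sketched with the Python standard library. This is only an approximation: urllib.parse.quote follows the newer RFC 3986 rules rather than RFC 1738, the set of characters left unescaped is an assumption, and escape_url and the example.com URL are hypothetical.

```python
from urllib.parse import quote

def escape_url(url):
    """Hypothetical approximation of escape-url(url)."""
    # Keep the characters that delimit URL components; percent-encode the rest.
    return quote(url, safe=":/?&=#")

print(escape_url("http://example.com/my file.html?name=a b"))
```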
iso-time()
Get the current time in ISO 8601 format.
Parameters
None
Return Value
string
The current time in the format YYYY-MM-DD HH:MM:SS.
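A string in the stated format can be produced in plain Python as shown below. The sketch assumes local time; the text above does not say whether iso-time() reports local time or UTC. The name iso_time is hypothetical.

```python
import time

def iso_time(now=None):
    """Hypothetical stand-in for iso-time(); assumes local time."""
    # time.localtime(None) uses the current time.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))

print(iso_time())
```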
evaluate(expr)
Evaluate an XPath expression at XSLT run time.
Parameters
expr of type string
The XPath expression to be parsed and evaluated using the current context.
Return Value
boolean, number, string, or node set
The result of evaluating the expression.
distinct(nodeset)
Eliminate duplicates from a node set according to the string value of each node.
Parameters
nodeset of type node set
The node set to be processed.
Return Value
node set
A node set from which all duplicates have been removed. Two nodes in a node set are considered duplicates if their string values are equal. The last node in each group of duplicates is the one retained in the final list, and the order of the node set may be disrupted.
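The "last node wins" rule can be sketched with a dictionary keyed on string value. This is an illustration of the behaviour described above, not the 4XPath implementation; nodes are modelled as hypothetical (id, text) tuples, and the names distinct and string_value are assumptions.

```python
def distinct(nodes, string_value):
    """Keep one node per string value; a later duplicate replaces an earlier one."""
    last_by_value = {}
    for node in nodes:
        last_by_value[string_value(node)] = node
    return list(last_by_value.values())

# Hypothetical nodes: (id, text) pairs, where the text is the string value.
nodes = [(1, "a"), (2, "b"), (3, "a")]
result = distinct(nodes, string_value=lambda node: node[1])
print(sorted(result))
```

Node 1 is dropped because node 3 has the same string value and appears later. The result is sorted before printing because no particular order is promised.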
split(arg, delim)
Split a string into a node set of text nodes.
Parameters
arg of type string
The string to be split.
delim of type string
The delimiter by which the string is to be split.
Return Value
node set
A node set of text nodes, each of which represents a segment of the split string.
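In plain Python the same operation looks like the sketch below, with text nodes modelled as ordinary strings. Whether 4XPath keeps empty segments between adjacent delimiters is not stated above, so the sketch only uses input without them; the name split_string is hypothetical.

```python
def split_string(arg, delim):
    """Hypothetical stand-in for split(arg, delim); text nodes are plain strings here."""
    return arg.split(delim)

print(split_string("2001-07-15", "-"))
```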
range(lo, hi)
Generate a node set of text nodes containing numbers ascending from the low value to the high value.
Parameters
lo of type number
The starting point for the sequence of numbers.
hi of type number
The ending point for the sequence of numbers.
Return Value
node set
A node set of text nodes, each of which represents a number, running from the low value to the high value in increments of one.
if(cond, v1, v2)
Select from two values based on a condition.
Parameters
cond of type boolean
The condition to be checked.
v1 of type boolean, number, string, or node set
The first choice.
v2 of type boolean, number, string, or node set
The second choice.
Return Value
boolean, number, string, or node set
The first value if the condition is true, otherwise the second value.
find(outer, inner)
Return the index of a substring within a string.
Parameters
outer of type string
The string to be searched.
inner of type string
The substring to seek.
Return Value
number
The zero-based index at which the inner string is first located within the outer string, or -1 if the inner string is not found.
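The return convention described here (zero-based index, -1 when absent) is the same one used by Python's str.find, so the behaviour can be previewed in plain Python. The name find_substring is hypothetical, and returning a float to mimic an XPath number is an assumption.

```python
def find_substring(outer, inner):
    """Hypothetical stand-in for find(outer, inner)."""
    # XPath numbers are floating point, so the index is returned as a float.
    return float(outer.find(inner))

print(find_substring("extension", "ten"))
print(find_substring("extension", "xyz"))
```

Note that this differs from most core XPath string functions, which count positions from 1.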
To define your own extension functions, write equivalent Python functions. The module in which they are defined must have a global dictionary named "ExtFunctions" that maps function names to function objects. Each function name is a tuple of two strings: the first is the namespace URI of the function, and the second is its local name.
Note that if you are using the extension function from within 4XSLT, the namespace URI must be a valid, identifying (but not necessarily addressable) URI, and in particular, it cannot be an empty string. If you are using the extension function directly from 4XPath, the namespace URI can be the empty string.
Finally, modules containing any extension functions used must be made known to the processor in one of two ways. (1) They are listed in the environment variable "EXTMODULES", which is a colon-separated list of module names. (2) They are registered with 4XPath using the xml.xpath.Util.RegisterExtensionFunctions() function, which takes a list of module names. In either case, all extension modules must be on the "PYTHONPATH".
For example:
#demo.py
import time
from xml.xpath import Conversions

def GetCurrentTime(context):
    '''available in XPath as get-curent-time()'''
    return time.asctime(time.localtime())

def HashContextName(context, maxkey):
    '''
    available in XPath as hash-context-name(maxkey),
    where maxkey is a numeric expression
    '''
    #It is a good idea to use the appropriate core function to coerce
    #arguments to the expected type
    maxkey = Conversions.NumberValue(maxkey)
    #Sum the character codes of the node name to get a numeric key
    key = reduce(lambda a, b: a + ord(b), context.node.nodeName, 0)
    return key % maxkey

ExtFunctions = {
    ('http://spam.com', 'get-curent-time'): GetCurrentTime,
    ('http://spam.com', 'hash-context-name'): HashContextName
}
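The registration rules described above can be sketched in plain Python. This is not the 4XPath loader, only a minimal illustration of how a colon-separated EXTMODULES value and per-module ExtFunctions dictionaries could be combined. The helper names modules_from_environment and load_extension_functions, the stand-in module demo_ext, and its "answer" function are all hypothetical.

```python
import importlib
import os
import sys
import types

def modules_from_environment(environ=os.environ):
    """Split the colon-separated EXTMODULES value into module names."""
    value = environ.get("EXTMODULES", "")
    return [name for name in value.split(":") if name]

def load_extension_functions(module_names):
    """Merge the ExtFunctions dictionaries of the named modules."""
    functions = {}
    for name in module_names:
        module = importlib.import_module(name)
        functions.update(getattr(module, "ExtFunctions", {}))
    return functions

# Stand-in for an importable extension module (hypothetical).
demo_ext = types.ModuleType("demo_ext")
demo_ext.ExtFunctions = {("http://spam.com", "answer"): lambda context: 42}
sys.modules["demo_ext"] = demo_ext

names = modules_from_environment({"EXTMODULES": "demo_ext"})
table = load_extension_functions(names)
print(table[("http://spam.com", "answer")](None))
```

Keying the merged table on (namespace URI, local name) tuples means that two modules can define the same local name without clashing, as long as their namespaces differ.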
In order to use these functions, be sure that "demo" (the module name) is in the EXTMODULES environment variable, or that you call xml.xpath.Util.RegisterExtensionFunctions(). If you are using them directly from 4XPath, however, you need to do one more thing: you need to set up a prefix that maps to the namespace of the functions you've defined ("http://spam.com", in this case).
You can do this by setting the "processorNss" attribute on the context you pass to the appropriate XPath method. For instance:
import sys
from xml.dom import ext
from xml.dom.ext.reader import Sax2
from xml.xpath import Evaluate, Util
from xml.xpath.Context import Context

try:
    doc = Sax2.FromXmlFile('myfile.xml', validate=0)
#SAXParseException is caught first because it is the more specific exception
except Sax2.saxlib.SAXParseException, msg:
    print "SAXParseException caught:", msg
    sys.exit(1)
except Sax2.saxlib.SAXException, msg:
    print "SAXException caught:", msg
    sys.exit(1)

Util.IndexDocument(doc)
#Map the "ext" prefix to the namespace of the extension functions
context = Context(doc, 1, 1, processorNss={'ext': 'http://spam.com'})
#Pass the context so that the prefix mapping is used during evaluation
result = Evaluate("/transaction[@timestamp=ext:get-curent-time()]", context=context)
Util.FreeDocumentIndex(doc)
ext.ReleaseNode(doc)
Note that you might choose to use the empty string as the namespace of your extension functions. If so, you don't need to specify the processorNss context attribute, but you should watch out for clashes with other extension function names, including those in the built-in library. Again, if you plan to use an extension function from within XSLT, it must have a non-empty namespace URI.
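The role of the prefix mapping can be sketched as a two-step lookup: the prefix is first translated to a namespace URI, and the (URI, local name) pair is then looked up in the function table. This is an illustration under stated assumptions, not the actual 4XPath lookup code; resolve_function and the two "hello" functions are hypothetical.

```python
def resolve_function(qname, processor_nss, ext_functions):
    """Look up a possibly prefixed function name in an ExtFunctions-style table."""
    prefix, sep, local = qname.rpartition(":")
    # An unprefixed name is looked up in the empty namespace.
    uri = processor_nss[prefix] if sep else ""
    return ext_functions[(uri, local)]

ext_functions = {
    ("http://spam.com", "hello"): lambda context: "namespaced",
    ("", "hello"): lambda context: "unqualified",
}
processor_nss = {"ext": "http://spam.com"}

print(resolve_function("ext:hello", processor_nss, ext_functions)(None))
print(resolve_function("hello", processor_nss, ext_functions)(None))
```

The unqualified lookup needs no prefix mapping at all, which is why the empty namespace is convenient, and also why every module using it shares a single pool of names.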