.NET and XPath
So I’m working on this XPath presentation for my team at work. I was trying to hack up a sample using some of the more interesting XPath functions, like string-join. PHP’s DOMXPath throws a fit when I use this function, so I cracked open MSDN and saw that XPathNavigator in the 2.0 framework claims to support “the XQuery 1.0 and XPath 2.0 Data Model[s].” Nifty, huh? Especially since string-join is defined in those specs. (Note that this table claims it is available in XPath 1.0. Apparently nobody bothered to check the XPath 1.0 specification, which does not mention it at all.)
PHP’s implementation must be broken then. Off I go and code a Winforms project that I can use to run my example. Right? Yeah, right…
For the sake of simplicity, I coded a small CLI program that will run an XPath query against an empty document:
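using System;
using System.Xml;
using System.Xml.XPath;

public class XPathCLI {
    public static void Main(string[] args) {
        // An empty document is enough here: the queries only exercise XPath functions.
        XmlDocument doc = new XmlDocument();
        XPathNavigator nav = doc.CreateNavigator();
        // Evaluate the first command-line argument as an XPath expression and print the result.
        Console.WriteLine(nav.Evaluate(args[0]).ToString());
    }
}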
Now let’s make sure it’s working:
$ ./XPathCLI.exe 'concat("hello ", "world")'
hello world
Looks good. Now let’s try the examples listed under string-join:
$ ./XPathCLI.exe "string-join({'Now', 'is', 'the', 'time', '...'}, \" \")"
Unhandled Exception: System.Xml.XPath.XPathException: invalid token: '{'
at Mono.Xml.XPath.Tokenizer.ParseToken () [0x00000]
at Mono.Xml.XPath.Tokenizer.advance () [0x00000]
at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]
$ ./XPathCLI.exe "string-join({abra, cadabra}, \"\")"
Unhandled Exception: System.Xml.XPath.XPathException: invalid token: '{'
at Mono.Xml.XPath.Tokenizer.ParseToken () [0x00000]
at Mono.Xml.XPath.Tokenizer.advance () [0x00000]
at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]
$ ./XPathCLI.exe 'string-join((), "separator")'
Unhandled Exception: System.Xml.XPath.XPathException: Error during parse of string-join((), "separator") ---> Mono.Xml.XPath.yyParser.yyException: irrecoverable syntax error
at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]
--- End of inner exception stack trace ---
at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]
at System.Xml.XPath.XPathExpression.Compile (System.String xpath, IXmlNamespaceResolver nsmgr, IStaticXsltContext ctx) [0x00000]
at System.Xml.XPath.XPathExpression.Compile (System.String xpath) [0x00000]
at System.Xml.XPath.XPathNavigator.Compile (System.String xpath) [0x00000]
at System.Xml.XPath.XPathNavigator.Evaluate (System.String xpath) [0x00000]
at XPathCLI.Main (System.String[] args) [0x00000]
Ok, that didn’t go too well. Apparently Mono doesn’t like some of the syntax. Let’s use a node selecting expression instead:
$ ./XPathCLI.exe 'string-join(//something, "separator")'
Unhandled Exception: System.Xml.XPath.XPathException: function string-join not found
at System.Xml.XPath.ExprFunctionCall.Evaluate (System.Xml.XPath.BaseIterator iter) [0x00000]
at System.Xml.XPath.CompiledExpression.Evaluate (System.Xml.XPath.BaseIterator iter) [0x00000]
Uh… ok. Let’s start over on MS.NET. It must be a Mono bug, right?
>XPathCLI.exe "string-join({'Now', 'is' 'the', 'time', '...'}, \" \")"
Unhandled Exception: System.Xml.XPath.XPathException: 'string-join({'Now', 'is''the', 'time', '...'}, " ")' has an invalid token.
at MS.Internal.Xml.XPath.XPathScanner.NextLex()
at MS.Internal.Xml.XPath.XPathParser.ParseMethod(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParsePrimaryExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseFilterExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParsePathExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseUnionExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseUnaryExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseMultiplicativeExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseAdditiveExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseRelationalExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseEqualityExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseAndExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseOrExpr(AstNode qyInput)
at MS.Internal.Xml.XPath.XPathParser.ParseXPathExpresion(String xpathExpresion)
at System.Xml.XPath.XPathExpression.Compile(String xpath, IXmlNamespaceResolver nsResolver)
at System.Xml.XPath.XPathNavigator.Evaluate(String xpath)
at XPathCLI.Main(String[] args)
Let’s jump straight to the one that made it past Mono’s parser to crash in the evaluator:
>XPathCLI.exe "string-join(//something, \"separator\")"
Unhandled Exception: System.Xml.XPath.XPathException: Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.
at MS.Internal.Xml.XPath.CompiledXpathExpr.get_QueryTree()
at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr, XPathNodeIterator context)
at System.Xml.XPath.XPathNavigator.Evaluate(String xpath)
at XPathCLI.Main(String[] args)
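That last message hints that MS.NET treats an unknown function name as a user-defined one and wants an XsltContext to resolve it. Here is a rough sketch of what supplying one might look like. It is not XPath 2.0 support, just an XPath 1.0 extension function that happens to be called string-join. The class names StringJoinFunction, StringJoinContext, and XPathSketch are made up for illustration, and I am assuming the engine hands an unprefixed unknown function to ResolveFunction with an empty prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical extension function: joins the string values of a node-set.
// It only handles a node-set first argument, not XPath 2.0 sequences.
public class StringJoinFunction : IXsltContextFunction {
    public int Minargs { get { return 2; } }
    public int Maxargs { get { return 2; } }
    public XPathResultType ReturnType { get { return XPathResultType.String; } }
    public XPathResultType[] ArgTypes {
        get { return new XPathResultType[] { XPathResultType.NodeSet, XPathResultType.String }; }
    }

    public object Invoke(XsltContext xsltContext, object[] args, XPathNavigator docContext) {
        // Throws InvalidCastException if the first argument is not a node-set.
        XPathNodeIterator nodes = (XPathNodeIterator)args[0];
        string separator = Convert.ToString(args[1]);
        List<string> parts = new List<string>();
        while (nodes.MoveNext()) {
            parts.Add(nodes.Current.Value);
        }
        return string.Join(separator, parts.ToArray());
    }
}

// Hypothetical context that resolves only the unprefixed name "string-join".
public class StringJoinContext : XsltContext {
    public override IXsltContextFunction ResolveFunction(string prefix, string name, XPathResultType[] argTypes) {
        if (string.IsNullOrEmpty(prefix) && name == "string-join") {
            return new StringJoinFunction();
        }
        return null; // anything else stays unresolved
    }
    public override IXsltContextVariable ResolveVariable(string prefix, string name) { return null; }
    public override bool PreserveWhitespace(XPathNavigator node) { return true; }
    public override int CompareDocument(string baseUri, string nextbaseUri) { return 0; }
    public override bool Whitespace { get { return true; } }
}

public static class XPathSketch {
    // Loads the XML, compiles the expression, attaches the custom context,
    // and returns the result converted to a string.
    public static string Evaluate(string xml, string xpath) {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XPathNavigator nav = doc.CreateNavigator();
        XPathExpression expr = XPathExpression.Compile(xpath);
        expr.SetContext(new StringJoinContext());
        return Convert.ToString(nav.Evaluate(expr));
    }
}
```

Calling XPathSketch.Evaluate from Main with a small document and the //something query from above should get past the "XsltContext needed" error on MS.NET. Whether Mono consults the context for unprefixed names in the same way is something I have not checked. The {...} and () argument forms would still fail, since those die in the parser before any function lookup happens.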
From this we can make a few conclusions:
- Mono, MS.NET, and PHP do not support XPath 2.0. I cannot find any PHP documentation that claims a specific version of XPath support, but, as noted in the intro paragraph, MSDN claims XPath 2.0 support and MS.NET does not deliver. (Mono may be following the MS.NET implementation instead of the spec, so whether this is a Mono bug or not is debatable.)
- Mono, MS.NET, and PHP do not support the {...} construct, which is present in the XPath 2.0 “Precedence Order” section but not actually defined elsewhere. This construct is not present at all in the XPath 1.0 specification. Whether this is a specification or implementation defect is left as an open question.
- Mono, MS.NET, and PHP do not implement the string-join function defined in at least XPath 2.0.
And from those conclusions we can draw a few more.
- Nobody gives a whip about following the XPath specification.
- The XPath specification is broken. Or confusing. Or (more likely) both.
The real question, then, is do people intentionally not implement the XPath 2.0 specification because they don’t want to, or because parts of it make no sense? It seems odd to me that an implementation would support concat and not string-join, especially since they are defined right next to each other.
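In the meantime, the practical way around a missing string-join is to select the nodes with plain XPath 1.0 and do the joining in the host language. A minimal sketch, assuming a node-selecting expression like the //something used earlier (HostSideJoin and JoinValues are names I made up for the example):

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class HostSideJoin {
    // Runs an XPath 1.0 node-selecting expression and joins the string
    // values of the selected nodes with the given separator.
    public static string JoinValues(XPathNavigator nav, string xpath, string separator) {
        List<string> values = new List<string>();
        XPathNodeIterator it = nav.Select(xpath);
        while (it.MoveNext()) {
            values.Add(it.Current.Value);
        }
        return string.Join(separator, values.ToArray());
    }
}
```

This only uses XPathNavigator.Select and string.Join, so it should behave the same on Mono and MS.NET. It obviously does nothing for expressions that need string-join in the middle of a larger query.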
In any case, if you’re not implementing all of it, don’t claim that you do. Incorrect documentation is worse than no documentation.
July 1st, 2008 at 1:33 pm
The fact is very simple: XPath 2.0 is not supported by any of them. Anything that is only in 2.0 must be regarded as an invalid expression. So, every implementation mentioned here is correct.
The MSDN documentation does not say that it supports XPath 2.0 or XQuery 1.0 (especially functions and operators).
The XQuery 1.0 / XPath 2.0 F&O *working draft* is of course wrong. You cannot find the corresponding part in the recommendation.
And yes, I don’t like XQuery and XPath 2.0, which are part of the pro-xsd family.
July 4th, 2008 at 1:05 pm
What does this mean then?
“The XPathNavigator class in the System.Xml.XPath namespace is an abstract class which defines a cursor model for navigating and editing XML information items as instances of the XQuery 1.0 and XPath 2.0 Data Model.”