XPath: What about the between operator?

XPath 1.0 has the following and preceding axes, which select everything after or before a given node in document order (excluding its descendants and ancestors, respectively). But what if you want all the elements between two elements? I didn’t find any explicit construct for this in the XPath 1.0 specification.

One way to do this is to use the following axis to get all the elements after the first element, use the preceding axis to get all the elements before the second element, and then take the intersection of the two sets. Sounds like a good idea, but how do you do this using XPath only, with no procedural code? So I searched for an intersection operator and, to my delight, found one in the XPath 2.0 specification, which provides intersect along with except and union operators.

But the tools I am working with only support XPath 1.0, so I was back to finding a way to do this in XPath 1.0. After a bit of experimentation, I came up with the following strategy.

Take a simple XML document like

<a><b/><c/><b/><b/><d/><b/><b/></a>

The goal is to get all the b elements between c and d.

  • Using //b, you get 5 <b> elements
  • Using //c/following::b, you get 4 <b> elements
  • Using //d/preceding::b, you get 3 <b> elements
  • Using //c/following::b[following::d], you get 2 <b> elements, the end goal!
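
The four counts above can be cross-checked outside an XPath engine. This is only a rough sketch under stated assumptions: Python's standard xml.etree.ElementTree does not implement the following and preceding axes, so the helpers position(), following() and preceding() below are hypothetical stand-ins that walk the elements in document order. The last lines also compute the intersection of the two sets procedurally, to show it matches the single-expression result.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b/><c/><b/><b/><d/><b/><b/></a>")
order = list(root.iter())  # every element, in document order

def position(node):
    # Index of `node` in document order (identity comparison).
    return next(i for i, e in enumerate(order) if e is node)

def following(node):
    # Stand-in for following::*: later elements, minus descendants.
    inside = {id(e) for e in node.iter()}
    return [e for e in order[position(node) + 1:] if id(e) not in inside]

def preceding(node):
    # Stand-in for preceding::*: earlier elements, minus ancestors.
    return [e for e in order[:position(node)]
            if all(x is not node for x in e.iter())]

c = root.find("c")
d = root.find("d")

all_b = root.findall(".//b")                          # //b
after_c = [e for e in following(c) if e.tag == "b"]   # //c/following::b
before_d = [e for e in preceding(d) if e.tag == "b"]  # //d/preceding::b
# //c/following::b[following::d]
between = [e for e in after_c if any(x.tag == "d" for x in following(e))]
# Procedural intersection of the two sets, for comparison.
intersection = [e for e in after_c if any(e is p for p in before_d)]

print(len(all_b), len(after_c), len(before_d), len(between))  # 5 4 3 2
print(all(x is y for x, y in zip(between, intersection)))     # True
```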

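Technically, the above solution is not 100% correct: //c matches every c and [following::d] is satisfied by any later d, so with multiple c and d elements it also returns b elements that lie outside a single c-to-d pair. It’s good enough for my use case, though. The general idea is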

first-element-axis/following::desired-elements[following::second-element]

to get the behavior of a between axis. I would be curious to hear any other solutions. And BTW, you can verify all this with the online XPath evaluator, a nice tool for experimenting with XPath expressions.
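
The caveat about multiple c and d elements is easy to see on a made-up document with two c-to-d pairs and a stray b between them. The sketch below is an assumption, not XPath: it uses Python's standard library and relies on every element being a sibling, so "following" simply means "later in the child list". The names loose_between() and strict_between() are hypothetical; the second is one possible tighter rule, not something XPath 1.0 gives you directly.

```python
import xml.etree.ElementTree as ET

# Made-up document: b at index 3 sits between the first d and the second c.
root = ET.fromstring("<a><c/><b/><d/><b/><c/><b/><d/></a>")
tags = [e.tag for e in root]  # all elements are siblings here

def loose_between(i):
    # Mirrors //c/following::b[following::d] for this flat document:
    # some c earlier and some d later.
    return tags[i] == "b" and "c" in tags[:i] and "d" in tags[i + 1:]

def strict_between(i):
    # Tighter rule: the nearest earlier marker is a c
    # and the nearest later marker is a d.
    before = [t for t in tags[:i] if t in ("c", "d")]
    after = [t for t in tags[i + 1:] if t in ("c", "d")]
    return (tags[i] == "b" and bool(before) and before[-1] == "c"
            and bool(after) and after[0] == "d")

loose = [i for i in range(len(tags)) if loose_between(i)]
strict = [i for i in range(len(tags)) if strict_between(i)]
print(loose)   # [1, 3, 5]
print(strict)  # [1, 5]
```

The b at index 3 has a c somewhere before it and a d somewhere after it, so the loose rule picks it up even though it is not inside either pair.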

Filed under XML, XPath

4 responses to “XPath: What about the between operator?”

  1. mark p.

    try //b[preceding::c and following::d]

  2. mark p.

    BTW: The online xpath evaluator you refer to does not appear to return the correct results, at least, for the ancestor axis.

    Try //b/ancestor::*

    The correct result should be <a> but the program returns the whole document including the <c> and <d> nodes which are clearly not ancestor nodes of <b>.

  3. S

    Hi Mark, I like your expression //b[preceding::c and following::d].

    The result for //b/ancestor::* is correct. It is just a single element, the a element, and the evaluator displays the entire markup of a, which is the entire document.

  4. mark p.

    Thanks. That was an OOPS. Up too late thinking about something else.
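
Both expressions from the comments can be checked against the sample document with a short sketch. As an assumption, it emulates the axes with Python's standard library rather than running real XPath: the preceding/following test relies on all elements being siblings, and the ancestor walk uses a hypothetical child-to-parent map named parent.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b/><c/><b/><b/><d/><b/><b/></a>")
tags = [e.tag for e in root]  # all elements are siblings here

# //b[preceding::c and following::d], emulated for this flat document.
marks = [i for i, t in enumerate(tags)
         if t == "b" and "c" in tags[:i] and "d" in tags[i + 1:]]

# //b/ancestor::*: walk a child -> parent map upward from every b,
# keeping each ancestor once.
parent = {child: p for p in root.iter() for child in p}
ancestors = []
for b in root.iter("b"):
    node = parent.get(b)
    while node is not None:
        if all(node is not seen for seen in ancestors):
            ancestors.append(node)
        node = parent.get(node)

print(marks)                       # [2, 3]
print([e.tag for e in ancestors])  # ['a']
# Serializing the single ancestor prints the whole document's markup.
print(ET.tostring(ancestors[0], encoding="unicode"))
```

The only ancestor is a, but printing that one element shows all of its children too, which is why the evaluator's output looks like the full document.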
